letter from the editor
kenneth j. varnum
information technology and libraries | september 2018
https://doi.org/10.6017/ital.v37i3.10747

this september 2018 issue of ital continues our celebration of the journal’s 50th anniversary with a column by former editorial board member mark dehmlow, who highlights the technological changes beginning to stir the library world in the 1980s. the seeds of change planted in the 1970s are germinating, but the explosive growth of the 1990s is still a few years away.

in addition to peer-reviewed articles on recommender systems, big data processing and storage, finding vendor accessibility documentation, using gis to find specific books on a shelf, and a recommender system for archival manuscripts, we are also publishing the student paper by the winner of this year’s ex libris/lita student writing award, “the open access citation advantage: does it exist and what does it mean for libraries?”, by colby lewis at the university of michigan school of information. this insightful paper impressed the competition’s judges (as ital’s editor, i was one of them) and i am very pleased to include ms. lewis’ work here.

this issue also marks my fourth as editor. with one year under my belt i am finding a rhythm for the publication process and starting to see the increased flow of articles from outside traditional academic library spaces that i wrote about in december 2017.

as always, if you have an idea for a potential ital article, please do get in touch. we on the editorial board look forward to working with you.

sincerely, kenneth j.
varnum, editor
varnum@umich.edu
september 2018
http://www.ala.org/news/member-news/2018/04/colby-lewis-wins-2018-litaex-libris-student-writing-award

editor’s comments
bob gerrity
information technology and libraries | december 2012

bob gerrity (r.gerrity@uq.edu.au) is university librarian, university of queensland, australia.
http://ejournals.bc.edu/ojs/index.php/ital/issue/view/312

past and present converge with the december 2012 issue of information technology and libraries (ital), as we also publish online the first volume of ital’s predecessor, the journal of library automation (jola), originally published in print in 1968.

the first volume of jola offers a fascinating glimpse into early days of library automation, when many things were different, such as the size (big) and capacity (small) of computer hardware, and many things were the same (e.g., richard johnson’s description of the book catalog project at stanford, where “the major achievement of the preliminary systems design was to establish a meaningful dialogue between the librarian and systems and computer personnel”). plus ça change, plus c’est la même chose.

there are articles by luminaries in the field: richard de gennaro describes approaches to developing an automation program in a large research library; frederick kilgour, from the ohio college library center (now oclc), analyzes catalog-card production costs at columbia, harvard, and yale in the mid 1960s (8.8 to 9.8 cents per completed card); and henriette avram from the library of congress describes the successful use of the cobol programming language to manipulate marc ii records.

the december 2012 issue marks the completion of ital’s first year as an e-only, open-access publication.
while we don’t have readership statistics for the previous print journal to compare with, download statistics for the e-version appear healthy, with more than 30,000 full-text article downloads for 2012 content so far this year, plus more than 10,000 downloads for content from previous years. based on the download statistics, the topics of most interest to today’s ital readers are discovery systems, web-based research guides, digital preservation, and digital copyright.

this month’s issue takes some of these themes further, with articles that examine the usability of autocompletion features in library search interfaces (ward, hahn, and feist), reveal patterns of student use of library computers (thompson), propose a cloud-based digital library storage solution (sosa-sosa), and summarize attributes of open standard file formats (park, oh). happy reading.

editor’s comments
bob gerrity
information technology and libraries | september 2013

this month’s issue

we have an eclectic mix of content in this issue of information technology and libraries.

lita president cindi trainor provides highlights of the recent lita forum in louisville and planned lita events for the upcoming ala midwinter meeting in philadelphia, including the lita town meeting, the always-popular top tech trends panel, and the association’s popular “networking event” on sunday evening.

ital editorial board member jerome yavarkosky describes the significant benefits that immersive technologies can offer higher education. the advent of massive open online courses (moocs) would seem to present an ideal framework for the development of immersive library services to support learners who may otherwise lack access to quality library resources and services.
responsive web design is the topic of a timely article by hannah gascho rempel and laurie m. bridges, who examine what tasks library users actually carry out on a library mobile website and how this has informed oregon state university libraries’ adoption of a responsive design approach for their website.

piotr praczyk, javier nogueras-iso, and salvatore mele present a method for automatically extracting and processing graphical content from scholarly articles in pdf format in the field of high-energy physics. the method offers potential for enhancing access and search services and bridging the semantic gap between textual and graphical content.

elizabeth thorne wallington describes the use of mapping and geographic information systems (gis) to study the relationship between public library locations in the st. louis area and the socioeconomic attributes of the populations they serve. the paper raises interesting questions about how libraries are geographically distributed and whether they truly provide universal and equal access.

vadim gureyev and nikolai mazov present a method for using bibliometric analysis of the publication output of two research institutes as a collection-development tool, to identify journals most important for researchers at the institutes.

bob gerrity (r.gerrity@uq.edu.au) is university librarian, university of queensland, australia.

letter from the editor: the core question
kenneth j.
varnum information technology and libraries | march 2020 https://doi.org/10.6017/ital.v39i1.12137 as i write this, the members of the association for library collections and technical services (alcts), the library leadership and management association (llama), and lita are voting to merge into a new consolidated division, core: leadership, infrastructure, futures. this merger is essential to the continuing activities that we library technologists rely on. the lita board has indicated that if the merger does not go through, lita will be forced to dissolve over the coming year. the merger will enrich lita members’ opportunities. lita has long focused on the library technology practitioner. that has been our core competency, born of a time when technology was the new thing in libraries. we technologists know—the entire information profession knows— that technology is no longer an addition to a library, but is the way society operates, for a huge portion of our work and life. core reflects this evolutionary change. similar evolutions have taken place in technical services and collections development areas; those functions have been forever changed by the wave of technologies that we have implemented over the past half century. core brings together the practitioners and technologies that make libraries run, and combines them with the library leadership areas that many of us aspire to, or end up taking on, as our careers develop. when i joined lita over a decade ago, i was myself moving from “doer” to “manager.” now that my role is largely project and personnel management, the skills and conversations i seek for personal growth are often found in other parts of ala, and beyond. yet, the focus—the core, if i may—of what i do is still in the center of the venn diagram of technology, people, and data. i voted to support core and hope that all of you who belong to lita, alcts, and/or llama, will do the same. sincerely, kenneth j. 
varnum, editor
varnum@umich.edu
march 2020
https://core.ala.org/

public libraries leading the way
libraryvpn: a new tool to protect patron privacy
chuck mcandrew
information technology and libraries | june 2020
https://doi.org/10.6017/ital.v39i2.12391

chuck mcandrew (chuck.mcandrew@leblibrary.com) is information technology librarian, lebanon (nh) public libraries.

due to increased public awareness of online surveillance, a rise in massive data breaches, and spikes in identity theft, there is a high demand for privacy-enhancing services. vpn (virtual private network) services are a proven way to protect online security and privacy. vpns’ effectiveness and ease of use have led to a boom in vpn service providers globally. vpns protect privacy and security by offering an encrypted tunnel from the user’s device to the vpn provider. vpns ensure that no one who is on the same network as the user can learn anything about their traffic except that they are connecting to a vpn. this prevents surveillance of data from any source, including commercial snooping such as your isp trying to monetize your browsing habits by selling your data, malicious snooping such as a fake wifi hotspot in an airport hoping to steal your data, or government-level surveillance that can target political activists and reporters in repressive countries.

some people might ask why we need a vpn as https becomes more ubiquitous and provides end-to-end encryption for your web traffic. https will encrypt the content that goes over the network, but metadata such as the site you are connecting to, how long you are there, and where you go next is all unprotected. additionally, some very important network protocols, such as dns, are unencrypted and anyone can see them. a vpn eliminates all of those issues.

however, there are two major problems with current vpn offerings. first, all reliable vpn solutions require a paid subscription.
this puts them out of reach of economically vulnerable populations who often have no access to the internet in their homes. in order to access online services, they may rely on public internet connections such as those provided by restaurants, coffee shops, and libraries. using publicly accessible networks without the security benefits of a vpn puts people’s security and privacy at great risk. this risk could be eliminated by providing free access to a high-quality vpn service.

the second problem is that using a vpn requires people to place their trust in whatever vpn company they use. some (especially free solutions) have proven not to be worthy of that trust by containing malware or leaking and even outright selling customer data. companies that abuse customer data are taking advantage of vulnerable populations who are unable to afford more expensive solutions or who do not have the knowledge to protect themselves.

together, these two problems create a situation where security and privacy are available only to those who can afford them and have the knowledge to protect themselves. libraries are ideally positioned to help with this situation. libraries work to provide privacy and security to people every day. this can mean teaching classes, making privacy resources available, and even advocating for privacy-friendly laws.

https://www.forbes.com/sites/forbestechcouncil/2018/07/10/the-future-of-the-vpn-market/#5b08fd8e2e4d
https://research.csiro.au/ng/wp-content/uploads/sites/106/2016/08/paper-1.pdf

libraries are also located in almost every community in the united states and enjoy a high level of trust from the public. librarians can be thought of as being a physical vpn. people who come into libraries know that what they read and information that they seek out will be protected by the library.
in fact, libraries have helped to get laws protecting the library records of patrons in all 50 states of the usa. people know that when a library offers a service to their community it isn’t because they want to sell their information or show them advertisements. with libraries, our patrons are not the product.

libraries also already provide many online services to all members of their community, regardless of financial circumstances. examples include access to online databases, language learning software, and online access to periodicals such as the new york times or consumer reports. many of these services would cost too much for patrons to pay for individually. by pooling their resources, communities are able to make more services available to all of their citizens.

to help address the above issues, the lebanon public libraries, in partnership with the westchester (new york) library system, the leap encryption access project (https://leap.se/), and tj lamanna (emerging technology librarian from cherry hill public library and library freedom institute graduate) started the libraryvpn project. this project will allow libraries to offer a vpn to their patrons. patrons will be able to download the libraryvpn application on a device of their choosing and connect to their library’s vpn server from wherever they are.

libraryvpn was first conceived a number of years ago, but the real start of the project was when it received an imls national leadership grant (lg-36-19-0071-19) in 2019. this grant was to develop integrations between leap’s existing vpn solution and integrated library systems using sip2, which will allow library patrons to sign in to libraryvpn using their library card. this grant also included development of a windows client (mac and linux clients already existed) and alpha testing at the lebanon public libraries and westchester library system.
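the sip2 sign-in described above can be sketched roughly as follows. this is a minimal illustration under stated assumptions, not libraryvpn's actual code: the function names are hypothetical, the choice of the patron status request (message 23) is a guess at one way a library card could be checked, and the field layout follows the published sip2 message format as commonly documented, so it should be verified against your ils vendor's sip2 documentation. a real client would also open a tcp connection to the ils and normally send a login message first; both are omitted here.

```python
from datetime import datetime


def sip2_checksum(message: str) -> str:
    """Return the 4-hex-digit SIP2 checksum: the two's complement of the
    low 16 bits of the sum of all characters up to and including 'AZ'."""
    total = sum(message.encode("ascii"))
    return format(-total & 0xFFFF, "04X")


def patron_status_request(institution, barcode, pin,
                          terminal_password="", seq=0, now=None):
    """Build a SIP2 Patron Status Request (message 23). Hypothetical helper."""
    now = now or datetime.now()
    # 18-character timestamp: YYYYMMDD, four blanks (local time), HHMMSS.
    stamp = now.strftime("%Y%m%d") + "    " + now.strftime("%H%M%S")
    body = (
        "23" + "000" + stamp            # "000" = language unspecified
        + "AO" + institution + "|"      # institution id
        + "AA" + barcode + "|"          # patron identifier (library card)
        + "AC" + terminal_password + "|"
        + "AD" + pin + "|"              # patron password / PIN
        + "AY" + str(seq % 10) + "AZ"   # sequence number, checksum tag
    )
    return body + sip2_checksum(body) + "\r"


def parse_patron_status_response(raw: str) -> dict:
    """Parse the variable-length fields of a Patron Status Response (24)."""
    raw = raw.rstrip("\r")
    if not raw.startswith("24"):
        raise ValueError("not a patron status response")
    header_len = 2 + 14 + 3 + 18  # code, patron status, language, timestamp
    fields = {}
    for chunk in raw[header_len:].split("|"):
        # Skip empty chunks and the AY/AZ error-detection trailer.
        if len(chunk) >= 2 and not chunk.startswith("AY"):
            fields[chunk[:2]] = chunk[2:]
    return fields


def card_is_valid(fields: dict) -> bool:
    """BL = valid patron, CQ = valid patron password; both must be 'Y'."""
    return fields.get("BL") == "Y" and fields.get("CQ") == "Y"
```

in a design like this, the vpn service would only need a yes/no answer from the ils, so it would not have to keep its own copy of patron records.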
we are currently working on moving into the testing phase of the software, and planning phase two of this project. phase two of libraryvpn will involve expanding our testing to up to 12 libraries and conducting end-user testing with patrons and library staff. we have submitted an application for imls funding for phase two and are actively looking for libraries that are excited about protecting patron privacy and would like to help us beta test this software. if you work for a library that would be interested in participating, you can reach us via email at libraryvpn@riseup.net or @libraryvpn on twitter. if you would like to help out with this project in another way, we would love to have more help. please reach out.

we currently are thinking about three deployment models for libraries in phase two. first would be an on-premises deployment. this would be for larger library systems with their own servers and it staff. libraryvpn is free and open source software and can be deployed by anyone. since it uses sip2 to connect to your ils, it should work with any ils that supports the sip2 protocol. this deployment model has the advantage of not requiring any hosting fees but does require the library system to have staff that can deploy and manage public facing services. drawbacks to this approach would include higher bandwidth use and dealing with abuse complaints. phase two testing should give us better data about how much of an issue this will be, but our experience hosting a tor exit node at the lebanon public libraries suggests that it won’t be too bad to deal with.

our second deployment model would be cloud hosting. if a library has it staff who can deploy services to the cloud, they could host their own libraryvpn service without needing their own hardware.
however, when deploying to the cloud, there will be ongoing costs for running the servers and bandwidth used. figuring out how much bandwidth an average user will consume is part of the data we are hoping to get from our phase two testing so we can offer guidelines to libraries who choose to deploy their own libraryvpn service.

finally, we are looking at a hosted version of libraryvpn. we anticipate that smaller systems that do not have dedicated servers or it staff will be interested in this option. in this case, there would be ongoing hosting and support costs, but managing the service would not be any more complicated than subscribing to any other service the library hosts for their patrons.

libraryvpn is a new project that is pushing library services outside of the library to where the library is. we want to make sure that all of our patrons are protected, not just those with the financial ability and technical know-how to get their own vpn service. as librarians, we understand that privacy and intellectual freedom are joined, and we want to maximize both. as the american library association’s code of ethics says, “we protect each library user's right to privacy and confidentiality.”
http://www.ala.org/tools/ethics

president’s message: 50 years
andromeda yelton
information technology and libraries | september 2017

fifty years. lita was voted into existence (as isad, the information science and automation division) in detroit at midwinter 1966. therefore we have just completed our first fifty years, a fact celebrated (thanks to our 50th anniversary task force) with a slide show and cake at annual in chicago.

it’s truly humbling to take office upon this milestone. looking back, some of the true giants of library technology have held this office.
in 1971-72, jesse shera, who in his wide-ranging career challenged librarians to think deeply about the epistemological and sociological dimensions of librarianship; ala makes several awards in his name today. in 1973-74 and again in 1974-75, frederick kilgour, the founding director of oclc, who also has an eponymous award. in 1975-76, henriette avram, the mother of marc, herself.

moreover, thanks to the work of countless lita volunteers, much of this history is available open access. i strongly recommend reading http://www.ala.org/lita/about/history/ for an overview of the remarkable people and key issues across our history. you can also read papers by avram and kilgour, among many others, in the archives of this very publication.

in fact, reading the ital archives is deeply engaging. it turns out library technology has changed a bit in 50 years! (i trust that isn’t a shock to you.) the first articles (in what was then the journal of library automation) are all about instituting first-time computer systems to automate traditional library functions such as acquisitions, cataloging, and finance. the following passage caught my eye:

“a functioning technical processing system in a two-year community college library utilizes a model 2201 friden flexowriter with punch card control and tab card reading units, an ibm 026 key punch, and an ibm 1440 computer, with two tape and two disc drives, to produce all acquisitions and catalog files based primarily on a single typing at the time of initiating an order” (“an integrated computer based technical processing system in a small college library”, jack w. scott; https://doi.org/10.6017/ital.v1i3.2931).

how many of us are still using punch cards today? and, indeed, how many of us are automating libraries for the first time? the topics discussed among lita members today are far more wide-ranging: user experience, privacy, accessibility.
they’re more likely to be about assessing and improving existing systems than creating new ones, and more likely to center on patron-facing technologies.

and yet, with a few substitutions (say, “raspberry pi” for “friden flexowriter”), the blockquote above would not be out of place today. then as now, lita members were doing something exciting, yet deeply practical, that cleverly repurposes new technology to make library experiences better for both patrons and staff.

our job descriptions have changed enormously in fifty years; in fact, the lita board charged a task force to develop lita member personas, so that we can better understand whom we serve, and work to align our publications, online education, conference programming, and committee work toward your needs. (you can see an overview of the task force’s stellar work on litablog: http://litablog.org/2017/03/who-are-lita-members-lita-personas/.) at the same time, the spirit of pragmatic creativity that runs throughout the first issues of the journal of library automation continues to animate lita members today. i’m looking forward to seeing where we go in our next fifty years.

andromeda yelton (andromeda.yelton@gmail.com) is lita president 2017-18 and owner/consultant of small beautiful useful llc.
https://doi.org/10.6017/ital.v36i3.10086

journal of library automation vol. 7/1 march 1974
book reviews

computer systems in the library: a handbook for managers and designers. by stanley j. swihart and beryl f. hefley. a wiley-becker & hayes series book. los angeles: melville publishing company, 1973. 388p.

once every year or two, either in england or the united states, a book appears attempting to explain computer systems to librarians. this book, computer systems in the library, is the most recent of the introductory texts. it starts off with a chapter entitled "why automate?" which skims very lightly and uncritically over the often-repeated reasons for using computers.
in this instance, money is included as a reason to automate, for we are told that "when properly planned, unit operating costs are normally reduced when a function is automated." automation's impact on the library's research and development budget is not discussed.

the book then proceeds to the six chapters which occupy the bulk of the book. they cover the automation of six major library functions: catalog publication, circulation, acquisitions, cataloging, catalog reference services, and serials. each chapter consists of a description of one or two apparently existing automated systems, with a complete discussion of how the system functions, what files are involved, the data in each file, coding and formats used in the files, and reproductions of various output products from each file. unfortunately, we are not told where each of these systems exists, and the systems often appear to use techniques that are suitable only for very small libraries. for example, in the circulation system that is described, a packet of prepunched book cards is to be carried in the book; each time the book is charged or discharged one of the cards is removed, with the last card serving as a signal to create a new deck of cards. little mention is made of the data collection terminals that are so commonly used in automated circulation systems, with the result that the description is very closely linked to a single system, with little opportunity for the reader to compare various methods or techniques of information handling.

the latter part of the book addresses itself to some general problems, including the interlibrary sharing of data and programs; the planning, implementation, and control of automation projects; and brief discussions of input and output problems, the protection of records, and some considerations in choosing hardware.
three appendixes offer a 2,500-word exclusion list for kwic indexes, a set of model keypunching rules for a corporate library, and a thirty-three-item bibliography in which the majority of works listed were published between 1964 and 1968.

a major weakness of the book seems to be its lack of critical focus. library automation problems are treated as being not particularly difficult; in fact, "the authors can see no serious or major disadvantages to automation in libraries. the situation," we are told, "can be compared with the disadvantages of using typewriters or telephones." this reviewer finds it difficult to know what sort of audience these words, and the entire book, are addressed to. though subtitled "a handbook for managers and designers," it would be an inexperienced manager indeed who needed to be told that "in its mode of operation, a keypunch is quite similar to a typewriter. a key must be struck for each character . . . ," or that "the catalog master file may be stored on magnetic tape reels or on magnetic disks." the experienced librarian, on the other hand, will not be pleased to learn that "many libraries with computer systems have given up the library of congress [filing] system for mc/mac and have placed mac in order between mab and mad, and mc between mb and md." nor will anyone associated with libraries be pleased to discover that "computer centers not only can, but frequently do, lose information. from time to time complete files are erased. there is almost no way to ensure that information will not be inadvertently erased."

the librarian who is already involved in automated systems will not need this book; the librarian who wishes to learn about automation and the systems analyst who needs to understand library systems will do well to read other sources in addition to this one.

peter simmons
university of british columbia

the metropolitan library. edited by ralph w. conant and kathleen molz. cambridge, mass.: m.i.t. press, 1972. 333p. $10.00.
the editors describe this book as a sequel to the important public library and the city (1965), also published by m.i.t. press. the focus again is on the concerns of metropolitan public librarians, combining the viewpoints of specialists from library and social science disciplines.

of the eighteen papers included, only three, by john trebbel, john bystrom, and kathleen molz, concentrate on the implications of present and future technology on public library service. their papers offer a general, if hard-nosed, approach to the need for specific research into the economic, behavioral, professional, and technological barriers impeding the advent of the automated millennium. micrographics, reprography, computers, facsimile transmission, telecommunications hardware, and technology are considered essential components of information transfer with which libraries must become compatible, and comfortable. the imperative need for and conduct of long-range research in telecommunications is outlined by bystrom, including aspects of research necessary for both a national telecommunications network linking all types of libraries and the local use of community cablevision by individual library outlets.

the three authors devote considerable head-shaking to the chilling reality of financing technological adaptations and innovations in libraries, the "snake in eden" according to trebbel. governments, specifically national governments, are cited as the logical sources of the enormous sums required for automated library and information services of whatever kind.

molz warns repeatedly and forcefully that libraries, while not discarding the book, must change their priorities. continued dependence on print as the prime information transfer medium is insupportable. the public library must adapt to a multimedia world.
none of the foregoing is new to information scientists or specialists in automation, but as concerned participants in the knowledge business they should find these papers of general interest.

lois m. bewley
university of british columbia

letter from the editor
kenneth j. varnum
information technology and libraries | december 2017
https://doi.org/10.6017/ital.v36i4.10237

i am excited to have been appointed editor of information technology and libraries as the journal enters its 50th year. originally published as the journal of library automation, ital has a long history of tracking the rapid-fire changes in technology as it relates to libraries. much as it has over the past 50 years, technology will continue to change not just the way libraries offer services to their communities, but the way we conceptualize what it is we do. if past is prologue, i have no doubt the next decades will continue to amaze, probably in ways even the most adventurous trend-forecaster won’t get quite right. in the context of the rapid change in how we do our work, what we do will remain the same: collecting, preserving, and providing access to the information and artefacts of our culture, whatever that may be.

i would like ital to grow and expand, while keeping its core essence the same. that core is high-quality, relevant, and informative articles, reviewed by our peers, and made available to the world. but i think there is more we can do for lita and the library technology profession by expanding the scope and impact of the journal through seeking and soliciting articles from a wider range of librarians, adding more case studies to the research articles that are at the journal’s core, and being more rapidly responsive to the evolving technology landscape in front of us.

to that end, i invite you to think broadly about researching, documenting, and describing the technology-related work you do so that others can learn about it.
i welcome questions about how your project might fit into ital, and look forward to working with you.

i’d like to close by extending my thanks to bob gerrity, who served as ital’s editor for the past 6 years and stewarded the journal’s transition to an open access publication. i am grateful for his service to ital, lita, and the profession.

sincerely,
kenneth j. varnum
editor
varnum@umich.edu

information technology and libraries | june 2011

because this is a family program and because we are all polite people, i can’t really use the term i want to here. let’s just say that i am an operating system [insert term here for someone who is highly promiscuous]. i simply love to install and play around with various operating systems, primarily free operating systems (oses), primarily linux distributions. and the more exotic, the better, even though i always dutifully return home at the end of the evening to my beautiful and beloved ubuntu.

in the past year or two i can recall installing (and in some cases actually using) the following: gentoo, mint, fedora, debian, moonos, knoppix, damn small linux, easypeasy, ubuntu netbook remix, xubuntu, opensuse, netbsd, sabayon, simplymepis, centos, geexbox, and reactos. (aside from stock ubuntu and all things canonical, the one i keep a constant eye on is moonos [http://www.moonos.org/], a stunningly beautiful and eminently usable ubuntu-based remix by a young artist and programmer in cambodia, chanrithy thim.)

in the old days i would have rustled up an old, sloughed-off pc to use as an experimental “server” upon which i would unleash each of these oses, one at a time. but those were the old days, and these are the new days.
my boss kindly bought me a big honkin’ windows-based workstation about a year and a half ago, a box with plenty of processing power and memory (can you even buy a new workstation these days that’s not incredibly powerful, and incredibly inexpensive?), so my need for hardware above and beyond what i use in my daily life is mitigated. specifically, it’s mitigated through use of virtual machines.

i have long used virtualbox (http://www.virtualbox.org/) to create virtual machines (vms), lopped-off hunks of ram and disk space to be used for the installation of a completely different os. with virtualbox, you first describe the specifications of the vm you’d like to create: how much of the host’s ram to provide, how large a virtual hard disk, boot order, access to host cd drives, usb devices, etc. you click a button to create it, then you install an os onto it, the “guest” os, in the usual way. (well, not exactly the usual way; it’s actually easier to install an os here because you can boot directly from a cd image, or iso file, negating the need to mess with anything so distasteful and old-fashioned and outré as an actual, physical cd-rom.)

in my experience, you can create a new vm in mere seconds; then it’s all a matter of how difficult the os is to install, and the linux distributions are becoming easier and easier to install as the months plow on. at any rate, as far as your new os is concerned, it is being installed on bare metal. virtual? real? for most intents and purposes the guest os knows no difference.

in the titillatingly dangerous and virus-ridden cyberworld in which we live, i’ll not mention the prophylactic uses of vms because, again, this is a family program and we’re all polite people.
suffice it to say, the typical network connection of a vm is nat’ed behind the nic of the host machine, so, as far as active network-based attacks are concerned, your guest vm is at least as secure as its host, even more so because it sits in its own private network space. avoiding software-based viruses and trojans inside your vm? let’s just say that the wisdom passed down the cybergenerations still holds: when it rains, you wear a raincoat, if you see what i’m saying. aside from enabling, even promoting, my shameless os promiscuity, how are vms useful in an actual work setting? for one, as a longtime windows guy, if i need to install and test something that is *nix-only, i don’t need a separate box with which to do so. (and vice versa too for all you unix-weaned ladies and gentlemen who find the need to test something on a rocker from redmond.) if there is a software dependency on a particular os, a particular version of a particular os, or even if the configuration of what i’m trying to test is so peculiar i just don’t want to attempt to mix it in with an existing, stable vm, i can easily and painlessly whip up a new instance of the required os and let it fly. and deleting all this when i’m done is easily accomplished within the virtualbox gui. using a virtual machine facilitates the easy exploration of new operating systems and new applications, and moving toward using virtual machines is similar to when i first started using a digital camera. you are free to click click click with no further expense accrued. you don’t like what you’ve done? blow it away and begin anew. all this vm business has spread, at my home institution, from workstation to data center. i now run both a development and test server on vms physically sitting on a massive production server in our data center, the kind of machine that when switched on causes a brown-out in the tri-state area.
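whipping up a vm need not go through the gui. as a rough sketch only, the describe, create, install, and delete cycle described above can be written down as a list of `VBoxManage` commands, the command-line tool that ships with virtualbox. everything specific here is an assumption for illustration: the vm name, the ram and disk sizes, the iso path, and the helper names `build_vm_commands` and `build_delete_command` are hypothetical, and subcommand names and flags have changed between virtualbox releases (older versions use `createhd` where newer ones use `createmedium`), so check `VBoxManage --help` before running anything. the script only builds and prints the commands; it does not execute them.

```python
from typing import List


def build_vm_commands(name: str, ram_mb: int, disk_mb: int, iso_path: str) -> List[List[str]]:
    """Build (but do not run) VBoxManage commands that describe and create a VM."""
    disk_file = f"{name}.vdi"  # hypothetical file name for the virtual hard disk
    return [
        # Register an empty VM definition.
        ["VBoxManage", "createvm", "--name", name, "--register"],
        # Describe it: RAM from the host, boot order, NAT networking.
        ["VBoxManage", "modifyvm", name, "--memory", str(ram_mb),
         "--boot1", "dvd", "--boot2", "disk", "--nic1", "nat"],
        # Create the virtual hard disk (size in megabytes).
        ["VBoxManage", "createmedium", "disk", "--filename", disk_file,
         "--size", str(disk_mb)],
        # Add a storage controller, then attach the disk and the installer ISO.
        ["VBoxManage", "storagectl", name, "--name", "SATA", "--add", "sata"],
        ["VBoxManage", "storageattach", name, "--storagectl", "SATA",
         "--port", "0", "--device", "0", "--type", "hdd", "--medium", disk_file],
        ["VBoxManage", "storageattach", name, "--storagectl", "SATA",
         "--port", "1", "--device", "0", "--type", "dvddrive", "--medium", iso_path],
    ]


def build_delete_command(name: str) -> List[str]:
    """Build the command that unregisters a VM and deletes its files."""
    return ["VBoxManage", "unregistervm", name, "--delete"]


if __name__ == "__main__":
    # Example values only: a 2 GB, 20 GB-disk guest booted from a local ISO.
    for command in build_vm_commands("test-vm", 2048, 20480, "ubuntu.iso"):
        print(" ".join(command))
    print(" ".join(build_delete_command("test-vm")))
```

printing rather than executing keeps the sketch safe to try on a machine without virtualbox installed; handing each list to `subprocess.run` would be the obvious next step once the flags have been checked against the installed version.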
this is a very efficient way to do things though because when i needed access to my own server, our system administrator merely whipped up a vm for me to use. to me, real or virtual, it was all the same; to the system administrator, it greatly simplified operations. and i may joke about the loud clank of the host server’s power switch and subsequent dimming of the lights, but doing things this way has been shown to be more energy efficient than running a server farm in which each server sucks in enough juice to quench the thirst of its redundant power supplies. (they’re redundant, they repeat themselves; they’re redundant, they repeat themselves, so you don’t want too many of them around slurping up the wattage, slurping up the wattage . . . )

virtual machines: zero-cost playgrounds for the promiscuous, and energy-efficient, staff-saving tools for system operations. what’s not to like? throw dual monitors into the mix (one for the host os; one for the guest), and it’s just like being there.

editorial board thoughts: just like being there, or how i learned to stop coveting bare metal and learned to love my vm
mark cyzyk
mark cyzyk (mcyzyk@jhu.edu) is the scholarly communication architect in the sheridan libraries, johns hopkins university, baltimore, maryland.

public libraries leading the way: on educating patrons on privacy and maximizing library resources
t.j. lamanna
information technology and libraries | september 2019 4
t.j. lamanna (professionalirritant@riseup.net) is an adult services librarian, cherry hill public library.

abstract
libraries are one of our most valuable institutions. they cater to people of all demographics and provide services to patrons they wouldn’t be able to get anywhere else.
the list of services libraries provide is extensive and comprehensive, although unfortunately, there are significant gaps in what our services can offer, particularly those regarding technology advancement and patron privacy. though library classes on educating patrons’ privacy protection are a valiant effort, we can do so much more and lead the way, maybe not for the privacy industry but for our communities and patrons. creating a strong foundational knowledge will help patrons leverage these new skills in their day to day lives as well as help them educate their families about common privacy issues. in this column, we’ll explore some of the ways libraries can utilize their current resources as well as provide ideas on how we can maximize their effectiveness and roll new technologies into their operations. though many libraries have policies on how they deal with patron privacy, unfortunately some policies aren’t very strong and oftentimes staff isn’t trained in the details of these policies. fortunately, for libraries who don’t have these necessary policies, there are some, such as the san jose public library, that offer their own as a framework.1 those that do have a strong comprehensive policy must make sure they are enforcing and regularly updating it to comply with new technologies being released. it’s a daunting task, but as article vii of the library bill of rights says, “all people, regardless of origin, age, background, or views, possess a right to privacy and confidentiality in their library use. libraries should advocate for, educate about, and protect people’s privacy, safeguarding all library use data, including personally identifiable information.”2 this means we have a responsibility to our patrons to do everything in our power to protect them and teach them to protect themselves. this requires a concerted effort not just for technology and it librarians, but for all library workers. 
a privacy policy means little if those on the front lines are either unaware of the policy or unsure how it is to be implemented. therefore, all library staff should both understand the fundamental reasons behind library privacy policies and be trained in maintaining them. libraries may consider implementing this training during staff development days or offering independent training sessions as needed. since the introduction of the patriot act, libraries have stopped collecting patrons’ reading habits, but many integrated library systems (ils) snag massive amounts of patron information we are unaware of. i’ve been administering our ils for over two years and i just found another space where items are being unnecessarily retained that i didn’t notice before. an instance such as this calls for limiting personally identifiable information (pii) to what is strictly necessary. in limiting the pii gathered in the first place, library staff should consider the following questions: what information do libraries really need to collect to offer library cards or programming? does your library really need patrons’ date of birth or gender? probably not. if not, you shouldn’t be collecting it, and if you do collect it, make sure you anonymize the data. using metrics is vital to how libraries function, receive funding, and schedule programming. you can still use the information, but it should not be connected to a patron in any way. after educating staff, we can educate patrons on developing better and safer practices regarding personal privacy and security in their daily lives. practical examples range from teaching patrons how to create strong passwords and back up sensitive files to explaining how malware works and what the “cloud” actually is. this is a start, but it goes far beyond that.
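to make the earlier point about metrics concrete: one way to keep using attendance numbers without tying them to a patron is to reduce each record to a coarse category at the moment of counting and to discard everything else. the following is a minimal sketch under stated assumptions, not a recommended standard. the record fields (`name`, `card_number`, `age`), the ten-year bands, and the rule of suppressing any band with fewer than five people are all hypothetical choices made for illustration, and a library that decides not to collect age at all would not need any of it.

```python
from collections import Counter
from typing import Dict, Iterable, Mapping


def age_band(age: int, width: int = 10) -> str:
    """Map an exact age to a coarse band such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def attendance_by_age_band(
    records: Iterable[Mapping[str, object]], min_count: int = 5
) -> Dict[str, int]:
    """Count attendees per age band.

    Names and card numbers are never read, so they cannot leak into the
    result. Bands with fewer than `min_count` people are dropped, because
    a very small group can identify individuals even without names.
    """
    counts = Counter(age_band(int(record["age"])) for record in records)
    return {band: n for band, n in counts.items() if n >= min_count}


if __name__ == "__main__":
    # Made-up sign-in sheet for one program.
    sign_in = [
        {"name": "patron a", "card_number": "0001", "age": 34},
        {"name": "patron b", "card_number": "0002", "age": 36},
        {"name": "patron c", "card_number": "0003", "age": 71},
    ]
    # With only three attendees, every band falls below the threshold.
    print(attendance_by_age_band(sign_in))
```

the design choice worth noting is that the function returns only counts; the original sign-in records still have to be destroyed separately for the metric to be truly disconnected from patrons.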
i’ve served many patrons who, even after taking courses on the subject, are overwhelmed by the security measures needed to protect themselves. this isn’t necessarily a sign that our classes are ineffective, but it does imply that new tactics are needed. let’s look at a few examples. another version of pii that we often overlook are security measures such as closed-circuit television (cctv) or security/police officers in our buildings.3 they often are either forgotten or outside the purview of the library itself. as the college of policing states, “cctv is more effective when directed at reducing theft of and from vehicles, while it has no impact on levels of violent crime.”4 while there are justifications for bringing this technology into the library, they should only be set up where needed, taking great care not to point them at patron or staff computers. if cctv is needed, make sure to follow local retention laws and remove the footage as soon as its time has expired. this idea applies to all collected information. there is no reason to archive data beyond the date they can be destroyed as it puts the library and its patrons in a compromised position. law enforcement in the library is a tough thing to argue against in our current political climate. but studies have shown that police presence does little to deter crime and may actually disproportionately impact marginalized communities.5 consider the purpose of law enforcement personnel and if their presence is actually necessary to the proper functioning of your library. in the event that you should have law enforcement come in with a subpoena that requires you to turn over your patron data, it’s important to have a canary warning that can be removed so your patrons understand what has happened.6 another way libraries can lead the way in protecting patron privacy both inside and outside the library is by supporting legislation that bans facial recognition software. 
this type of technology is becoming ubiquitous, but places have already started pushing back and libraries can be the epicenter of this movement. it’s already been banned in oakland,7 san francisco8 (one of the homes of this technology), as well as somerville, massachusetts, with groups like the massachusetts library association unanimously putting out a moratorium on facial surveillance, which is the practice of recording one’s face to create user profiles.9 there are other states that are working down this path and it’s overwhelmingly heartening to see libraries step up and in front of something they know would damage our communities. we ought to be activists, standing on the front lines and showing our patrons our deepest commitment to them. surely there are greater strides we can make, such as revising wifi policies. wifi is one of the most used services libraries offer and many libraries don’t use it to its full potential. for instance, some libraries turn off their wifi when the building is closed, severely limiting patrons’ usage. it’s a service we pay for and there is no reason it shouldn’t be available at all times. your it service should make sure the wifi is secure (it should be, whether it’s available at all hours or not). unlimited access to wifi becomes invaluable to users who need it for emergencies including completing work or accessing important online services when the library is closed. while we do have limited bandwidth and it services must actively maintain wifi security, libraries should make sure it’s available to the public as often as possible. now that we’ve covered using bandwidth when we aren’t open, let’s talk about libraries with excess bandwidth. no resource should go unused in the library. we have a limited budget and we should make sure every penny is used to serve our communities.
one fantastic use of excess bandwidth, especially during closed hours, would be to set up a relay in your library for tor, an anonymity network that allows people to surf the internet with extra security and privacy in mind. it’s quite easy to set up and you can limit how much bandwidth it uses so you aren’t shorting anyone in your library. it’s a service used by groups such as journalists or activists who want to make positive change in the world and need a safe place to do so. some are concerned that the tor network is used for malicious intent but the tor project, the organization that runs the network, constantly works to ensure nothing like that is taking place. also, anything illicit you can find on the tor network is available on the regular internet including places like facebook or craigslist, so the stigma of the network should be taken in context. the tor project routinely monitors the network and searches out illegal material (there are no hired killers on the tor network). given all this, you could help the network greatly by just partitioning a small amount of your bandwidth. libraries have the unique ability to be transformative. unlike other non-profits or organizations, we have the ability to pivot. we can both change directions as needed and pave the way for our communities as leaders in the movement toward patron privacy. i leave you with a quote from hardt and negri: “…we share common dreams of a better future.”10 that should be our motto.

endnotes
1 “our privacy policy,” san jose public library, accessed august 15, 2019, https://www.sjpl.org/privacy/our-privacy-policy.
2 “library bill of rights,” american library association, last modified january 19, 2019, http://www.ala.org/advocacy/intfreedom/librarybill.
3 “importance of cctv in libraries for better security,” accessed august 14, 2019, https://www.researchgate.net/publication/315098570_importance_of_cctv_in_libraries_for_better_security.
4 “effects of cctv on crime,” college of policing, accessed august 14, 2019, http://library.college.police.uk/docs/what-works/what-works-briefing-effects-of-cctv-2013.pdf.
5 “do police officers in school really make them safer?” accessed august 14, 2019, https://www.npr.org/2018/03/08/591753884/do-police-officers-in-schools-really-make-them-safer.
6 “canary warning,” wikipedia, https://en.wikipedia.org/wiki/warrant_canary.
https://doi.org/10.6017/ital.v38i3.11571
7 sarah ravani, “oakland bans use of facial recognition technology, citing bias concerns,” san francisco chronicle, july 17, 2019, https://www.sfchronicle.com/bayarea/article/oakland-bans-use-of-facial-recognition-14101253.php.
8 kate conger, richard fausset, and serge f. kovaleski, “san francisco bans facial recognition technology,” new york times, may 14, 2019, https://www.nytimes.com/2019/05/14/us/facial-recognition-ban-san-francisco.html.
9 sarah wu, “somerville city council passes facial recognition ban,” boston globe, june 27, 2019, https://www.bostonglobe.com/metro/2019/06/27/somerville-city-council-passes-facial-recognition-ban/sfaqq7mg3dgulxonbhscyk/story.html.
10 michael hardt and antonio negri, multitude: war and democracy in the age of empire (new york: the penguin press, 2009), p. 128.

nated, volume-oriented, resource-sharing electronic ordering process. for information relative to bisac transmission formats or bisac membership, write to: book industry systems advisory committee, 160 fifth ave., suite 604, new york, ny 10010. for input to bisac purchase order formats, write to: j. k. long, chairman, bisac p.o. subcommittee, c/o oclc, inc., 6565 frantz rd., dublin, oh 43017. (mr. long is also the library or network representative on the isbn advisory council.) for input to the ansi z39 p.o. transmission formats, write to: mr. e. muro, chairman, subcommittee u, c/o baker & taylor co., 6 kirby ave., somerville, nj 08876. for problems with the isbn and san, write to: mr. emory i koltay, international standard book numbering agency, 1180 avenue of the americas, new york, ny 20036.

microcomputer backup to online circulation
sheila intner: emory university, atlanta, georgia.

our primary objective in purchasing microcomputer systems for the great neck library was to provide a better alternative to paper and pencil checkouts when our minicomputer-based clsi libs 100 automated circulation system was down.
two difficult and lengthy downtime periods occurring shortly after going online convinced the administration that public service should not be jeopardized because of system failure. after investigation of the backup systems vended by computer translation, inc.,1 two of them were purchased in november 1980. computer translation, inc. (cti) sells a turnkey backup system based on an apple ii plus microcomputer, with two mini-disk drives using 5¼-inch floppy diskettes, a tv monitor, and a switching system connecting the apple to the libs 100 console and terminals. software designed to interface with the clsi system is part of the package. the backup collects and stores data for check-ins and checkouts and then dumps them into the database by simulating a terminal when the mini-main-frame is operational again. this requires dedicating a terminal to this process until complete. it can also be used alone as a portable unit for circulation purposes, or with any of the many applesoft packages available, or with an applesoft program of the user's own design. our initial experience in great neck was with a borrowed demonstration system, set up by a sympathetic cti representative on the spur of the moment in tandem with and connected to the main library checkout station's crt laser terminal after several days of downtime. the circulation staff cheered as the familiar prompts appeared on both screens. they used the clsi equipment which they were accustomed to operating and the computer room staff learned to operate the cti system. the ease with which the apple could be transported to different locations in the building and the immediate relief it gave wherever it was connected, sometimes one checkout station, sometimes another, led us to put off deciding on a permanent installation at first. we thought it might be more advantageous to keep it on a rolling cart and use it wherever a terminal was down, or wherever the traffic appeared to be heaviest.
we continued in this manner for a while even after both of our own apple systems were delivered. it soon became apparent that the apple and its accompaniments, especially the switching system with its dangling cables, were a nuisance at the checkout counter. people with piles of books or records tended to nudge it dangerously close to the edge or jiggle its connections loose. the circulation staff didn't like waiting until someone from the computer room could be spared to bring up the system, secure the connections, and turn on the apple. also, although the apple is a very reliable instrument which has given us negligible downtime, bumpy rides over various floors, carpets, lintels, and textured tiles occasionally loosened its chips and rendered it, too, inoperative. cti representatives were called in to make a more permanent installation for the apple in our computer room, a simple operation requiring some additional cable.

298 journal of library automation vol. 14/4 december 1981

selection of the terminals to be attached as alternate backup or dumping sites was not so easy, however. the choice of the primary backup site was not a problem, since one of the two checkout stations flanking the main door was fairly obvious. but the second terminal which would be preempted for dumping was a more difficult decision. dumping sessions vary in length depending on the number of records to be processed and the activity on the rest of the libs 100 system. in our library, we find it takes about an hour to dump 100 to 150 transactions. this appears to be slower than average and may well be due to the extremely high level of system activity. thus, dumping 1,000 transactions would take roughly seven to ten hours, a full working day. we had been online for such a short time that great backlogs of patron and material data entry from new registrants and unconverted books had developed and were a high priority item.
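the dumping estimate above is easy to check with a little arithmetic. the sketch below simply divides a backlog by the slowest and fastest rates reported in the text (100 and 150 transactions an hour); the function name `dump_hours` is a hypothetical label, and the calculation assumes the rate stays constant for the whole session, which the text suggests it may not when system activity is high.

```python
from typing import Tuple


def dump_hours(transactions: int, rate_low: int = 100, rate_high: int = 150) -> Tuple[float, float]:
    """Return (best_case_hours, worst_case_hours) for dumping a backlog.

    rate_low and rate_high are transactions per hour, taken from the
    range reported for the great neck installation.
    """
    return transactions / rate_high, transactions / rate_low


if __name__ == "__main__":
    best, worst = dump_hours(1000)
    print(f"{best:.1f} to {worst:.1f} hours")  # prints "6.7 to 10.0 hours"
```

at either end of the range a 1,000-transaction backlog ties up the dumping terminal for most or all of a working day, which is why the choice of that terminal mattered so much.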
neither the circulation department, which was handling registrations, nor technical services, which was handling materials, felt they could afford to lose much terminal time for dumping. thus, the reference department's information desk terminal was reluctantly chosen as the alternate terminal on the grounds that they only did inquiries for materials which borrowers could locate by means of searching the catalog and making trips to the shelves. if necessary, information desk personnel could step across the aisle to the circulation department and use a terminal there. the permanent installation was set up in this way for one backup system, while the other one remained mobile in the event we wanted to use it at one of our three branches. only the switching box and cables were really unmovable. the apple, drives, and monitor could still be disconnected and moved about at will. experience over the last few months with this arrangement demonstrated that, all things considered, it is unwise to attach two public service terminals to one apple, in spite of the pressure it puts on behind-the-scenes operations to lose terminal time in the event of an extensive dump. the reaction of the public to being told a terminal that usually helped them was inoperative has been so negative it outweighed the delays in data entry. therefore, a change in the current configuration will soon be made. meanwhile, we realized the second backup system was not being used to greatest advantage. when the libs 100 was down, the next most pressing demand after main library checkouts were checkouts at the largest branch, located near the railroad station. we were collecting about thirty transactions an hour or less at other locations in the main building while the station branch staff were writing down twice that amount or more and explaining to their public that the computer was down.
it seemed important to pursue the possibility of connecting one of the station branch's terminals to the second apple while keeping the apple itself in the computer room in the main library. not only was there even less space in the branch for another piece of hardware on their counter, but staff training and hardware control presented a greater problem since many more part-time people were employed there. cti worked on the problem for about two months, resolving it through the addition of a modem to the basic configuration. in this new installation, which we did ourselves with phone assistance from cti, and which has been operational for three weeks as of this writing, the dedicated phone line connector for the branch terminal is removed from its port on the libs 100 console and inserted into one of the switching box connectors. the apple is turned on as usual and the crt laser terminal at station branch appears to operate normally. in fact, it operates so closely to its usual libs 100 mode that staff members forget they are not online with the libs and call up to find out why inquiries don't work. we are still experiencing a significant amount of downtime with our libs 100. some of this is attributable to our relatively full storage, requiring us to perform housekeeping routines frequently, but the rest is a result of system failure. now, however, because of the apples, this causes far less anguish in the circulation department. when the libs 100 goes down, the permanently connected backups are switched on in the computer room by their staff and circulation clerks continue checking materials out on their regular clsi equipment in the main library and station branch. on days when housekeeping chores are scheduled, the console operator's job includes turning on the apples so we can begin serving the public when the doors open at 9:00 a.m. unless downtime persists for more than a day, no other routines are done except checkouts. 
under some circumstances, certain materials might be checked in on the apple, but it is not desirable to do this for newer materials on which holds may have been placed. when the libs 100 is online again, the checkout station is switched back to normal mode and the apple takes over the information desk's port for dumping, rendering that terminal inoperative. dumping continues around the clock until all transactions have been processed from both apples. normal activities proceed at all other terminals. diskettes are dumped in chronological order. as the dumping process operates, a file of transactions eliciting error or exception messages from the libs 100 is created on the apple diskette. this file is available for attention at a later time for manual entry into the database. the chief asset of the dumping process is the accuracy achieved by automatic inputting. when we used paper and pencil, not only was the original writing time consuming, but manual data entry was difficult because of illegible handwriting, inaccurate transcription of the numbers, inaccurate inputting into the database, and lack of available personnel for the job. the cti system resolves all of these difficulties, but a price is paid in the loss of the dumping terminal's services. the public may be less disturbed if a terminal in a nonpublic area is used. but to the department involved, access to the database is a central part of their work and its loss severely limits their output. in fact, dependence on the automated circulation system by all departments in the library has been swift and universal even though we originally assumed the terminals outside the circulation department would be used sparingly. plans are being made to store personnel records in machine-readable form on diskettes. other developments are being put on a back burner until we have less frequent need for the apples as backups.
however, levels, great neck library's youth department, has several apples of its own on which budding "computerniks" practice their art. for them there are few limits to possible applications, perhaps only the outermost boundaries of imagination.

reference
1. joseph covino and sheila intner, "an informal survey of the cti computer backup system," journal of library automation 14:108-10 (june 1981).

computer-to-computer communication in the acquisition process
sandra k. paul: skp associates, new york city.

in the 1970s, we entered the period of computer-to-computer communication; we now appear to have reached the second stage of development. today more than seventy publishers are equipped to receive computer tape orders and input them directly to their order fulfillment systems; twenty-six publishers can produce computer invoices and credits for their customers; six are capable of sending monthly updating information about titles, prices, publication dates, and books declared out of print. all of this, however, is based on a system through which computer tapes are sent from buyer to seller and back via the united states mail. the next step, computer-to-terminal or computer-to-computer communication, is just around the corner.

historical perspective
how did this happen? it started in september 1974 when dewitt c. ("bud") baker, newly appointed president of the baker & taylor company, envisioned the savings his company could find if their customers provided the international standard book number (isbn) on their orders. he also believed that the volume of paper created by the computer was expensive and time-consuming for publishers to handle.

editorial | truitt 87

life out of balance. those who saw it will surely recall the 1982 film that juxtaposed images of stunning natural beauty with scenes of humankind’s intrusion into the environment, all set to a score by philip glass.
the title is a hopi word meaning “life out of balance,” “crazy life,” “life in turmoil,” “life disintegrating,” or “a state of life that calls for another way of living.” while the film, as i recall, relied mainly on images of urban landscapes, mines, power lines, etc., to make its point about our impact on the world around us, it did include as well images that had a technological focus, even if the pre–pc technology exemplars shown may seem somewhat quaint thirty years later.1 the sense that one is living in unbalanced, crazy, or tumultuous times is nothing new. indeed, i think it’s fair to say that most of us—our eyes and perspectives firmly and narrowly riveted to the here and now—tend to believe that our own specific time is one of uniquely rapid and disorienting change. but just as there have been, and will be, periods of rapid technological change, social upheaval, etc.—“been there, done that, got the t-shirt,” to recall the memorably pithy, if now slightly oh-so-aughts, slogan—so too have there been reactions to the conditions that characterized those times. a couple of very different but still pertinent examples come to mind. in the second half of the nineteenth century, a reaction against the social conservatism and shoddy, mass-produced goods of the victorian era began in england. inspired by writer and designer william morris, the arts and crafts movement emphasized simplicity, hand-made (as opposed to factory-made) objects, and social reform. by the turn of the century, the movement had migrated to the united states—memo to self: who were the leading lights of the movement in canada?—finding expression in the “mission-style” furniture of gustav stickley, the elegant art pottery of rookwood, marblehead, and teco, and the social activism of elbert hubbard’s roycrofters. 
fast-forward another half-century to the mid-1960s and the counter-culture of that time, itself a reaction to the racism, sexism, militarism, and social regimentation of the preceding decade. for a brief period, experimentation with “alternative lifestyles,” resistance to the vietnam war, and agitation for social, racial, and sexual change flourished. whatever one’s views about, say, the flower children, civil rights demonstrations, or the wisdom of u.s. involvement in vietnam, it’s well-nigh impossible to argue that the society that emerged from that time was not fundamentally different from the one that preceded it. that both of these “movements” ultimately were subsumed into the larger whole from which they sprang is only partly the issue. and my aim is not to romanticize either of these times, even as i confess to more than a passing interest in and sympathy for both. rather, my point is that their roots lay in a reaction to excesses—social, cultural, economic, political, even technological—that marked their times. they were the result of what might be termed “life out of balance.” in turn, their result, viewed through a longer lens, was a new balance, incorporating elements of the status quo ante and critical pieces from the movements themselves. thesis —> antithesis —> synthesis. we find ourselves in such unbalanced times again today. even without resort to over-hyped adjectives such as “transformational,” it is fair to say that we are in uncertain times. in libraries, budgets, staffing levels, and gate counts are in decline. the formats and means of information delivery are rapidly changing. debates rage over whether we are merely in the business of delivering “information” or of preserving, describing, and imparting learning and knowledge. 
perhaps worst of all, as our role in the society of which we are a part changes into something we cannot yet clearly see, we fear “irrelevance.” what will happen when everyone around us comes to believe that “everything [at least, everything that’s important] is on the web” and that libraries and librarians no longer have a raison d’être? for much of the past decade and a half—some among us might argue even longer—we’ve reacted by taking the rat-in-the-wheel approach. to remain “relevant,” we’ve adopted practically every new fad or technology that came along, endlessly spinning the wheel faster and faster, adopting the tokens of society around us in the hope that by so doing we would stanch the bleeding of money, staff, patrons, and our own morale. as i’ve observed in this space previously,2 we’ve added banks of über-connected computers, clearing away book stacks to design technology-focused creative services and collaborative spaces around them. we’ve treated books almost as smut, to be hidden away in “plain-brown-wrapper” compact storage facilities. we’ve reduced staffing, in the process outsourcing some services and automating others so that they become depersonalized, the library equivalent of a bank automated teller machine. we’ve forsaken collection building, preferring instead to rent access to resources we don’t own and to cede digitization control of those resources that we ostensibly do own. where does it end? in a former job, i used to joke that my director’s vision of the library would not be fully realized until no one but the director and the library’s system administrator were left on staff and nothing but a giant super-server remained of the library. it seemed only black humor then. today it’s just black. marc truitt marc truitt (marc.truitt@ualberta.ca) is associate university librarian, bibliographic and information technology services, university of alberta libraries, edmonton, alberta, canada, and editor of ital. 
editorial: koyaanisqatsi 88 information technology and libraries | september 2011
more importantly, where has all this wheel spinning gotten us, other than continued decline and yet more hand-wringing and anguish about irrelevance? it’s time to recognize that we are living in a state of koyaanisqatsi (life out of balance). and it’s up to us to do something new about it by creating a new balance. here are a few perhaps out-of-the-box ideas that i think could help with establishing that balance. spoiler alert: some of these may seem just a bit retro. i can’t help it: my formative library years predate the chicxulub asteroid impact. anyway, here goes:
■■ cease worrying so about “relevance.” instead, identify our niche: design services and collections that are “right” and uniquely ours, rather than pale reflections of fads that others can do better and that will eventually pass. we are not google. we are not starbucks. we know that we cannot hope to beat these sorts of outfits at their games; perhaps less obvious is that we should be extremely wary of even partnering with them. their agenda is not ours, and in any conflict between agendas, theirs is likely to prevail. we must identify something unique at which we excel.
■■ find comfort in our own skins. too many of us, i sense, are at some level uneasy with calling ourselves “librarians.” perhaps this is so because so many of us came to the profession by this or that circuitous route, that is, that we intended to be something else and wound up as librarians. get over it and wear the sensible shoes proudly.
■■ stop trying to run away from or hide books. they are, after all, perceived as our brand. is that such a bad thing?
■■ quit designing core services and tools that are based on the assumption that our patrons are all lazy imbeciles who will otherwise flee to google. the evidence suggests that those folks so inclined are already doing it anyway; why not instead aim at the segment that cares about provision of quality content and services—in collections, face-to-face instruction, and metadata? people can detect our arrogance and condescension on this point and will respond accordingly, either by being insulted and alienated or by acting as we depict them.
■■ begin thinking about how to design and deliver services that are less reliant on technology. technology has become, to borrow from marx, the opiate of libraries and librarians; we rely on it to the exclusion of nontechnological approaches, even when the latter are available to us. technology has become an end in itself, rather than a means to an end.
■■ libraries are perceived by many as safe harbors and refuges from any number of storms. they are places of rest—not only of physical rest, but of emotional and intellectual rest. they are places of the imagination. play to these strengths. those seeking to reimagine library spaces as refuges could hardly do better than to look to jasper fforde’s magical bookworld in the thursday next series for inspiration.3 stuffy academics and special libraries take note: library magic is not something restricted to children’s rooms in public libraries. walk through the glorious spaces of yale’s sterling memorial library or visit the reading room at the university of alberta’s rutherford library—known to the present generation of students as the “harry potter room,” for its evocation of the hogwarts school’s great hall—and then tell me that magic does not abound in such places. it’s present in all of our libraries, if we but have eyes to see and hearts to feel.
■■ the library was once a place for the individual. to contemplate. to do research. to know the peace and serenity of being alone. in recent years, as we’ve moved toward service models that emphasize collaboration and groups, i think we’ve lost track of those who do not visit us to socialize or work in groups. we need to reclaim them by devoting as much attention to services and spaces aimed at those seeking aloneness as we do at those seeking togetherness.
the preceding list will probably brand me in the minds of some readers as anti-technology. i am not. after spending the greater part of my career working in library it, i still can be amazed at what is possible. “golly? we can do that?” but i firmly believe that library technology is not an end in itself. it is a tool, a service, whose purpose is to facilitate the delivery of knowledge, learning, and information that our collections and staff embody. nothing more. that world view may make me seem old fashioned; if such be the case, count me proudly guilty. in the end, though, i come back to the question of balance. there was a certain balance in and about libraries that prevailed before the most recent waves of technological change began washing over libraries a couple of decades ago. those waves disrupted but did not destroy the old balance. instead, they’ve left us out of balance, in a state of koyaanisqatsi. it’s time to find a new equilibrium, one that respects and celebrates the strengths of our traditional services and collections while incorporating the best that new technologies have to offer. it’s time to synthesize the two into something better than either. it’s time for balance.
references and notes
1. wikipedia, “koyaanisqatsi,” http://en.wikipedia.org/wiki/koyaanisqatsi (accessed july 12, 2011). ital readers in the united states can view the entire film online at http://www.youtube.com/watch?v=sps6c9u7ras. sadly, the rest of us must borrow or rent a copy.
2. marc truitt, “no more silver bullets, please,” information technology & libraries 29, no. 2 (june 2010), http://www.ala.org/ala/mgrps/divs/lita/publications/ital/292010/2902jun/editorial.cfm (accessed july 13, 2011).
3. begin with fforde’s the eyre affair (2001) and proceed from there. if you are a librarian and are not quickly enchanted, you probably should consider a career change very soon! thank you, michele n!
president’s message continued from page 86
we give to the organization. the lita assessment and research committee recently surveyed membership to find out why people belong to lita; this is an important step in helping lita provide programming, etc., that will be most beneficial to its users, but the decision on whether to be a lita member, i believe, is more personal and doesn’t rest on the fact that a particular drupal class is offered or that a particular speaker is a member of the top tech trends panel. it is based on the overall experience that you have as a member, the many little things. i knew in just a few minutes of attending my first lita open house 12 years ago that i had found my ala home in lita. i wish that everyone could have such a positive experience being a member of lita. if your experience is less than positive, how can it be more so? what are we doing right? what could we do differently? please let me or another officer know, and/or volunteer to become more involved and create a more valuable experience for yourself and others.
communications
creating and managing a repository of past exam papers
mariya maistrovskaya and rachel wang
information technology and libraries | march 2020
https://doi.org/10.6017/ital.v39i1.11837
mariya maistrovskaya (mariya.maistrovskaya@utoronto.ca) is digital publishing librarian, university of toronto. rachel wang (rachel.wang@utoronto.ca) is application programmer analyst, university of toronto.
abstract
exam period can be a stressful time for students, and having examples of past papers to help prepare for the tests can be extremely helpful. it is possible that past exams are already shared on your campus—by professors in their specific courses, via student unions or groups, or between individual students.
in this article, we will go over the workflows and infrastructure to support the systematic collection, provision of access to, and repository management of past exam papers. we will discuss platform-agnostic considerations of opt-in versus opt-out submission, access restriction, discovery, retention schedules, and more. finally, we will share the university of toronto setup, including a dedicated instance of dspace, batch metadata creation and ingest scripts, and our submission and retention workflows that take into account the varying needs of stakeholders across our three campuses.
background
the university of toronto (u of t) is the largest academic institution in canada. it spans three campuses and serves more than 90,000 students through its 700 undergraduate and 200 graduate programs.1 the university of toronto structure is the product of its rich history and is thus largely decentralized. as a result, the management of undergraduate exams is carried out individually by each major faculty at the downtown (st. george) campus, and centrally at the university of toronto mississauga (utm) campus and the university of toronto scarborough (utsc) campus.
the faculty of arts and science (fas) at the st. george campus has traditionally made exams from its departments available to students. in the pre-internet era, students were able to consult print and bound exams in departmental and college libraries’ reference collections. with the rise of online technologies, the fas registrar’s office seized the opportunity to make access to past exams more equitable for students and worked with the university of toronto libraries (utl) information technology services (its) to digitize and make exams available online. they were initially shared electronically via the gopher protocol and later via docutek eres, one of the first available course e-reserves systems.
after the utl became an early adopter of the dspace (https://duraspace.org/dspace/) open source platform for its institutional repository in 2003, the utl its created a separate instance of dspace to serve as a repository of old exams. the repository makes the last three years of exams from the fas, utm, and utsc available online in pdf. about 5,500 exam papers are available to students with u of t login at any given time. discussed below are some of the considerations in establishing and maintaining a repository of old exams on campus, along with practical recommendations and shared workflows from the utl.
considerations in establishing a repository of old exams
if you are looking to establish a repository of old exams, these are some of the considerations to take into account when planning a new service or evaluating an existing one.
the source of old exams
depending on the level of centralization on your campus, exams may be administered by individual academic departments or submitted by instructors/admins into a single location and managed centrally. the stakeholders involved in this process may include the office of the registrar, campus it, departmental admins or libraries, etc. establishing a relationship with such stakeholders is key in getting access to the files. when arranging to receive electronic files, consider whether they could be accompanied by existing metadata. alternatively, if the university archives or records management already receive copies of campus exams, you may be able to obtain them there. print versions will need to be digitized for online access—later in this article we will share metadata creation strategies in this scenario.
it is also possible that exams may be collected in less formal ways, for example, via exam drives by student unions and groups. the utl works closely with the fas registrar’s office to receive a batch of exams annually. the utl receives a copy of print fas exams that get digitized by the its staff. the utl also receives exams from two u of t campuses, utm and utsc, that arrive in electronic format via the campus libraries. the u of t engineering society and the faculty of law each maintain their individual exam repositories, and the arts and science student union maintains a bank of term tests donated by students.
content hosting and management
one of the key questions to answer is which campus department or unit will be responsible for hosting the exams, managing content collection, processing and uploads, and providing technical and user support. these responsibilities may be within the purview of a single unit or may be shared between stakeholders. here are some examples of the tasks to consider:
1. collecting exams from faculty or receiving them from a central location
2. managing restrictions (exams that will not be made available online)
3. digitizing exams received in print
4. creating metadata or converting metadata received with the files
5. uploading exams to the online repository
6. removing exams from the online repository
7. providing technical support and maintenance (e.g., platform upgrades, troubleshooting)
8. providing user support (e.g., assistance with locating exams)
at u of t, tasks 1–2 are taken care of by registrar offices at fas and utm and by the library at utsc. tasks 3–8 are performed centrally by the utl its, with the exception of digitization services for exams received from the utm and utsc campuses. further details and considerations related to the content management system and processing pipelines are outlined in the “infrastructure and workflows” section below.
collection scope
depending on the sources of your exams, you may need to establish the scope rules for what gets included in the collection. for example:
• will you only include final exams? will term tests also be included?
• will solutions be posted with the exams?
• will additional materials, such as course syllabi, also be included?
at the utl, only final exams are included in the repository, and no answers are supplied.
exam retention
making old exams available online is always a balancing act between the interests of students who want to have access to past test questions and the interests of instructors who may have a limited pool of questions to draw from or who may teach different course content over time and want to ensure that the questions continue to be relevant. at the utl, in consultation with campus partners, the balance was achieved by only posting the three most recent years of exams in the repository. as soon as a new batch is received, the utl removes a batch of exams more than three years old.
opt-in versus opt-out approach
where exam collection is driven centrally by a registrar’s office, for example, that office may require that all past exams be made available to students. as with the retention considerations, the needs of instructors who draw questions from a limited pool can be accommodated via opt-outs, individual exam restrictions, and ad hoc take-down requests. an alternative approach to exam collection would be an opt-in model where faculty choose to submit exam questions on their own schedule. at the utl, the fas and the utm campus both operate under the opt-out model. the utl receives all exam questions in regular batches unless they have been restricted by instructors’ requests. occasional withdrawal requests from instructors require an approval from the registrar’s office.
conversely, the utsc campus operates under the opt-in model where individual departments submit their exams to the library. while this model provides the most flexibility, the volume of exams received from this campus is consequently relatively small.
repository access
when making old exams available online, one of the things to consider is who will have access to them. will the exams only be available to students of the respective academic department, or to all students, or to the general public? will access be possible on campus as well as off campus? if the decision is made to restrict access, is there an existing authorization infrastructure in place that the repository could take advantage of, such as an institutional single sign-on or library’s proxy access? at the utl, access to the old exams repository is provided through ezproxy in the same fashion as subscription resources made available via the library.
discoverability and promotion
how will students find out about the exams available in the repository? will the repository be advertised via the library’s website, promoted by course instructors, or linked with the other course materials? considering the challenge of promoting a resource like this along with a variety of other library resources, it will be preferable to make it known to students via the same channels through which they receive other course information. for many institutions this would be via their learning management system or their course information system. at u of t, the old exams repository is linked from the library website. previously, the link was embedded in the university’s learning management system course template. with a recent transition to a new learning management engine, such exposure is yet to be reestablished.
infrastructure and workflows
minimum cms requirements
a repository of old exams does not require a specific content management system (cms) or an off-the-shelf platform. your institution may already have all the components in place to make it happen. here are the minimum requirements you will want to see in such a system:
• file upload by staff (preferably in batch)
• file download by end users
• basic descriptive metadata
• search / browse interface
• access control / authentication (if you choose to restrict access)
the utl uses a stand-alone instance of dspace for its old exams repository. dspace is open-source software for digital repositories used across the globe primarily in academic institutions. the utl chose this platform since it was already running an instance of dspace for its institutional repository (ir) and had the infrastructure and expertise on site. however, this is not a solution we would recommend to an institution with no existing dspace experience. while dspace is an open source platform, maintaining it locally requires significant staff expertise that may not be warranted considering that a collection of exams would only use a fraction of its robust functionality. if you do consider using dspace, a hosted solution may be preferable in a situation when local it resources and expertise are limited.
distributing past exams via an existing digital repository
an institution that already maintains a digital repository may consider adding exams as a collection to the existing infrastructure. when choosing to do so it is important to consider whether the exams use case may be different from your ir use case, and whether the new collection will fit in the existing mission and policies. differences may include the following:
• access level. ir missions tend to revolve around providing openly accessible materials, whereas exams may need to be restricted. will your repository allow selective access restrictions to the exams collection?
• longevity. ir materials are usually intended to be kept long-term, whereas exams may be on a retention schedule. for that reason, it also does not make sense to assign permanent identifiers to exams as many repositories do for their other materials.
• file types and metadata. unlike a variety of research outputs and metadata usually captured in an ir, exams would have uniform metadata and object type. this makes them suitable for batch transformations and uploads.
batch metadata creation options
because of the uniform object type, exams are well suited to batch processing, transformations, and uploads. at utl, metadata is created from the filenames of scanned pdf files by a python script.2 the script breaks up the filename into dublin core metadata fields based on the pattern shown in figure 1. see figure 2 for a snippet of the script populating dublin core metadata fields.
figure 1. file-naming pattern for metadata creation at utl.
figure 2. a screenshot of the utl script generating dublin core metadata from filenames.
once metadata is generated, a second python script (figure 3) packages the pdf and metadata file into a dspace simple archive (dsa), which is the format that dspace accepts for batch ingests.
figure 3. a screenshot of the utl script packaging a pdf and metadata into a dspace simple archive.
the dspace simple archive (dsa) then gets batch uploaded into the respective campus and exam-period collections (figure 4) using the dspace native batch import functionality. figure 5 shows what an individual exam record looks like in the repository. after a new batch is uploaded, collections older than three years are removed from the repository.
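the filename-to-metadata and packaging steps described above can be sketched roughly as follows. this is a minimal illustration and not the utl's actual script: the file-naming pattern `<course>_<year>_<session>.pdf`, the choice of dublin core fields, and the function names `filename_to_dc`, `dc_to_xml`, and `write_dsa_item` are all assumptions made for the example. the item layout (a folder holding `dublin_core.xml`, a `contents` file, and the pdf) follows the dspace simple archive format.

```python
import re
import shutil
from pathlib import Path
from xml.etree import ElementTree as ET

# Hypothetical naming pattern, e.g. "mat137_2019_december.pdf".
# The real UTL pattern (figure 1 in the article) may differ.
FILENAME_RE = re.compile(
    r"^(?P<course>[a-z]{3}\d{3})_(?P<year>\d{4})_(?P<session>[a-z]+)\.pdf$",
    re.IGNORECASE,
)


def filename_to_dc(filename):
    """Split a filename into (element, qualifier, value) Dublin Core triples."""
    match = FILENAME_RE.match(filename)
    if match is None:
        raise ValueError(f"filename does not follow the expected pattern: {filename}")
    course = match["course"].upper()
    year = match["year"]
    session = match["session"].capitalize()
    # The field choices below are illustrative assumptions.
    return [
        ("title", "none", f"{course} final exam, {session} {year}"),
        ("date", "issued", year),
        ("subject", "none", course),
    ]


def dc_to_xml(fields):
    """Serialize the triples as a DSpace Simple Archive dublin_core.xml document."""
    root = ET.Element("dublin_core")
    for element, qualifier, value in fields:
        node = ET.SubElement(
            root, "dcvalue", {"element": element, "qualifier": qualifier}
        )
        node.text = value
    return ET.tostring(root, encoding="unicode")


def write_dsa_item(pdf_path, item_dir):
    """Create one archive item folder: dublin_core.xml, contents, and the PDF."""
    pdf_path = Path(pdf_path)
    item_dir = Path(item_dir)
    item_dir.mkdir(parents=True, exist_ok=True)
    xml = dc_to_xml(filename_to_dc(pdf_path.name))
    (item_dir / "dublin_core.xml").write_text(xml, encoding="utf-8")
    # "contents" lists the bitstream files of the item, one per line.
    (item_dir / "contents").write_text(pdf_path.name + "\n", encoding="utf-8")
    shutil.copy2(pdf_path, item_dir / pdf_path.name)
    return item_dir
```

each item folder written this way would sit inside one batch directory that is then handed to the dspace batch import; the exact import command depends on the dspace version and local configuration, so it is left out here.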
the utl’s exams processing scripts are openly available in github under an apache license 2.0 (https://github.com/utlib/dspace-exams-ingest-scripts/).
figure 4. a screenshot of collections in the utl’s old exams repository.
figure 5. a screenshot of a record in the utl’s old exams repository.
conclusion
having access to examples of past exam questions can be extremely helpful to students in preparing for upcoming tests. it is possible that old exams are already being shared on your campus in official or unofficial ways, in print or electronically. facilitating online sharing of electronic copies means that all students, on and off campus, will have equitable access to these valuable resources. we hope that the considerations and workflows outlined in this article will help institutions establish such services locally.
acknowledgements
the authors would like to acknowledge the utl librarians and staff who contributed to the setup and maintenance of the old exams repository over the years: marlene van ballegooie, metadata technologies manager, who operated the filename-to-dublin core metadata crosswalk; sean xiao zhao, former applications programmer analyst, who converted it into python; and sian meikle, associate chief librarian for digital strategies and technology, who was at the inception of the original exam-sharing service and provided valuable historical context and feedback on this article.
endnotes
1 university of toronto, “quick facts,” accessed november 4, 2019, https://www.utoronto.ca/about-u-of-t/quick-facts.
2 university of toronto libraries, “exam metadata generation and ingest for dspace,” github repository, last modified september 20, 2019, https://github.com/utlib/dspace-exams-ingest-scripts/.
public libraries leading the way
the first 500 mistakes you will make while streaming on twitch.tv
chris markman, kasper kimura, and molly wallner
information technology and libraries | september 2022
https://doi.org/10.6017/ital.v41i3.15475
chris markman (chris.markman@cityofpaloalto.org) is senior librarian, palo alto city library. kasper kimura (kasper.tsutomu@gmail.com) is methodist youth fellowship high school director, wesley united methodist church. molly wallner (molly.wallner@cityofpaloalto.org) is senior librarian, palo alto city library. © 2022.
introduction
three librarians at the palo alto city library embarked on an epic virtual event journey in 2020. this is our story. twitch.tv is the most popular video game streaming platform on the internet right now, but that does not mean it is the easiest for content creators to use or navigate. while the mistakes were many, you do not have to repeat them. in short, lessons learned over the past two years fell under four distinct categories, many of them interrelated or compounding one another:
• physical space limitations and challenges migrating studio setups during various phases of the covid-19 pandemic.
• complex decisions making audio and video equipment purchases.
• our own familiarity with videogame streaming platforms and specialized software.
• converting our in-person event policies and codes of conduct for virtual events.
mistakes 001–135: picking the right time, place, and software
we can say confidently that mistake #1 in your 500-mistake journey is pretending the library will strike gold with its first ever stream and achieve instant online success. we chose minecraft as our first videogame featured on twitch.tv. the cold reality is that real-world streamers who host thousands of viewers at one time are not building the interpersonal connections you are likely aiming for as a librarian.
the second biggest mistake you’re likely to make while setting up a stream is picking the wrong location. over the course of two years, in response to different levels of building access, we ended up moving our ad-hoc studio location a total of four times. each location posed its own challenges, and we learned more about what worked with every move. your streaming space should not only be distraction free, but also easy to adjust as needed, because your setup will change over time.
picking the right av equipment for your stream is a gigantic topic, and the subject of infinite support forum threads and online discussions. the correct answer also largely depends on whether you plan to stick with console game streaming, or pc, or some mixture of both. we can summarize by saying that to start off, you do not need the very best studio gear, and in fact, this thinking can lead to an artificial barrier that might result in more “tech debt” than necessary. you will end up spending a considerable amount of time troubleshooting strange quirks that were not there the last time you streamed, or with each new equipment purchase/upgrade.
mailto:chris.markman@cityofpaloalto.org mailto:kasper.tsutomu@gmail.com mailto:molly.wallner@cityofpaloalto.org information technology and libraries september 2022 the first 500 mistakes you will make while streaming on twitch.tv | markman, kimura, and wallner 2 mistakes 136–223: moderation tools and volunteers we have had to block a few bots, as well as tactfully defuse some loose-cannon streamsurfers by maintaining aggressive kindness in answer to their sarcastic questioning. overall, our moderating world has not been rocked in a way we weren’t prepared for, due to our thoughtfully crafted and transparent policy that was adapted from our patron code of conduct, trained teen volunteer moderators, and clear communication as a team. mistakes 224–301: the finer points of twitch.tv in addition to having had little experience playing video games in general, our stream host also had no experience with streaming. by design, kasper went into our first stream with only two guidelines for interacting with twitch viewers: don’t stop talking and be friendly. no one wants to watch someone silently play a game badly; it’s not engaging and it’s not fun. another part of using twitch that we did not account for until we were in the middle of the first stream is that the chat runs on a delay. this makes sense from a moderating point of view; you want to be able to catch inappropriate or spam comments. however, in terms of holding a conversation with the chat, it became a mental challenge to hold multiple threads of conversation at a time—all while playing—and all while narrating what’s happening on screen, and as people were typing to respond to what was just said or done. this process can be very overwhelming for twitch.tv hosts.1 imagine driving a car on the highway while also watching a movie of yourself, and then simultaneously holding a conversation with ten or more people in the back seat of this car at the same time. 
they’re not commenting on what you’re currently doing though, instead they’re making jokes about the on-ramp or stop light two miles back. it’s not impossible to juggle these tasks simultaneously, but as the host, it does require practice. mistakes 302–389: art is a process, just like the inevitable bugs you will find in your setup every time you change anything heed our warning! you can find a mountain of well-meaning online advice and tutorials about the best possible streaming setup and content strategy: much of this is outdated or aimed at a very specific subset of gamers. there is a cottage industry of media consultants and youtuber personalities that review hardware and share tips and tricks advice. your information literacy skills should not go to waste here! always consider the source. stream decks and keyboard shortcuts: what the twitch.tv pros get right if we could go back in time, there is one element to our stream setup that could have been integrated sooner, and that’s the stream deck by elgato (https://www.elgato.com/en/streamdeck). this extra desktop keypad is literally a game changer for usability—it is the peanut butter that smooths over all the ux cracks created by open broadcast studio (obs) and the chaos of chat interactions already discussed. this small hardware upgrade also makes onboarding new stream hosts much easier because there is no need to memorize keyboard shortcuts: the buttons on the stream deck can be customized to do exactly what they say they do (like mute audio, change screen layouts, or stop and start streaming). mistakes 390–499: do androids dream of electric animal crossing dream codes? what does twitch.tv outreach look like? we used social media to connect with other organizations doing similar work, such as the lgbtq+ youth space in san jose. 
we had worked with this group before the pandemic on some pride programs for teens at the library, and so in 2020, when we saw on their instagram that they had a minecraft server open to the local community, our team eagerly jumped on this opportunity to collaborate with them. we had a minecraft stream; they had a minecraft server. could the stars be any more aligned? after some planning, one of the server mods joined us for a stream and gave us a tour of their server, which ended up being one of our most popular streams to date.

conclusion: and what did we learn from all this?

the final mistake (#500) is giving up. over the past two years we have hosted over 50 streams at https://www.twitch.tv/paloaltolibrary and can say confidently that not only was each virtual event unique, but our streams also improved over time. we encourage more librarians to test out this mode of online outreach and practice their iterative design skills. video game streaming is not only fun for both the audience and hosts, but also a great way to connect with “extremely online” patrons of all ages.

endnotes

1 to illustrate this problem in more detail: consider the events of our very first stream, in which kasper’s dog saw a postal employee through the window while live on camera and reacted accordingly. this was one of the many reasons why moving our center of operations from the living room to the library was an upgrade.
reproduced with permission of the copyright owner. further reproduction prohibited without permission.

china academic library and information system: an academic library consortium in china

longji dai, ling chen, and hongyang zhang

information technology and libraries; jun 2000; 19, 2; proquest pg. 66

since its inception in 1998, china academic library and information system (calis) has become the most important academic library consortium in china. calis is centrally funded and organized in a tiered structure. it currently consists of thirteen management or information centers and seventy member libraries serving about 700,000 students. after more than a year of development in information infrastructure, a calis resource-sharing network is gradually taking shape.

like their counterparts in other countries, academic libraries in china are facing such thorny problems as shrinking budgets, growing patron demands, and rising costs for purchasing books and subscribing to periodicals. it has thus become increasingly difficult for a single library to serve its patrons to their satisfaction. under these circumstances, the idea of resource sharing among academic libraries was born. library consortia provide an organizational form for libraries to share their resources.
the georgia library learning online (galileo), the virtual library of virginia (viva), and ohiolink are among the well-known library consortia in the united states.1 traditionally, the primary purpose of establishing a library consortium is to share physical resources such as books and periodicals among members. more recently, however, advances in computer, information, and telecommunication technologies have dramatically revolutionized the way in which information is acquired, stored, accessed, and transferred. sharing electronic resources has rapidly become another important goal for library consortia.

what is calis?

in may 1998, as one of the two public service systems in "project 211," the china academic library and information system (calis) project was approved by the state development and planning commission of china after a two-year feasibility study by experts from academic libraries across the country. calis is a nationwide academic library consortium. funded primarily by the chinese government, it is intended to serve multiple resource-sharing functions among the participating libraries, including online searching, interlibrary loan, document delivery, and coordinated purchasing and cataloguing, by digitizing resources and developing an information service network.

longji dai is director, peking university library, and deputy director, calis administrative center; ling chen is deputy director, calis administrative center; and hongyang zhang is deputy director, reference department, peking university library.

structure and management of calis

a library consortium is an alliance formed by member libraries on a voluntary basis to facilitate resource sharing in pursuit of common interests. whether a consortium can operate successfully depends in large part on how it is managed. calis differs from library consortia in the united states in that it is a national network.
it resembles multistate consortia in the united states with respect to geographic distribution of member libraries, but it is like tightly knit or even centrally funded statewide ones in terms of management.2 the calis members are distributed in twenty-seven provinces, cities, and autonomous regions in china, making an entirely centralized management difficult. after surveying some of the major library consortia in the united states, europe, and russia, calis adopted an organizational mode characterized by a combination of both centralized and localized management-that is, a three-tiered structure (figure 1). in order to improve the management efficiency and maximize the sharing of various resources including funds, calis has established a coordination and management network comprising one national administrative center (which also serves as the north regional center), five national information centers (see table 1) and seven regional information centers (see table 2). the thirteen centers are maintained by full-time staff members provided by the libraries in which these centers are located. the national administrative center (located in peking university)-overseen by officials from the concerned office at the ministry of education and the presidents of peking and tsinghua universities and advised by an advisory committee consisting of experts from major member libraries-is responsible for the construction and management of calis, makes policies and regulations, and prepares resource-sharing agreements. the center has an office handling routine management needs and several specialized work groups overseeing calis' national projects, such as those for the development of databases for union catalogues, current chinese periodicals, and calis' service software. 
under the guidance of the national administrative center, five national information centers are each responsible for building and maintaining an information system in one of five general areas (humanities, social science, and science; engineering and technology; agriculture and forestry; medicine; and national defense) in coordination with regional centers and member libraries. the host libraries where these centers are located possess relatively abundant collections in their respective areas. these centers, which are intended to be information bases that cover all major disciplines of science, are responsible for importing databases for sharing and constructing resource-sharing networks among member libraries and for providing searching and document delivery services to member libraries.

depending on their location, academic libraries in china are divided into eight groups, with each forming a regional library consortium. each regional consortium is overseen by a regional management center, except for the consortium in the north, which is directly managed by the national management center. the regional centers not only participate in nationwide projects in coordination with the national centers and other regional centers, but they also are responsible for promoting cooperation among libraries in their particular regions. all the centers are located in member universities and staffed by the host universities. the concerned vice president or library director of a host university is in charge of the associated center.

figure 1. the three-tiered structure of calis: 5 national information centers, 8 regional information centers, 70 member libraries
the regional centers also are assisted by regional coordination committees and advisory committees of provincial and municipal officials in charge of education; university presidents; library directors; and senior librarians in the concerned regions. these committees serve a coordinating role in the regions.

table 1. five national information centers

areas of specialization                    location
humanities, social science, and science    peking university, beijing
engineering and technology                 tsinghua university, beijing
agriculture and forestry                   china agricultural university, beijing
medicine                                   beijing medical university, beijing
national defense                           haerbin industrial university, haerbin, heilongjiang

table 2. regional information centers and areas of their jurisdiction

name                                 location     areas of jurisdiction
national administrative center       beijing      beijing, tianjin, hebei, shanxi, and inner mongolia
southeast (south) regional center    shanghai     shanghai, zhejiang, fujian, and jiangxi
southeast (north) regional center    nanjing      jiangsu, anhui, and shandong
central regional center              wuhan        hubei, hunan, and henan
south regional center                guangzhou    guangdong, hainan, and guangxi
southwest regional center            chengdu      sichuan, chongqing, yunnan, and guizhou
northwest regional center            xi'an        shaanxi, gansu, ningxia, and xinjiang
northeast regional center            jilin        jilin, liaoning, and heilongjiang

funding

the development and operation of calis has been funded in large part by the chinese government. the sources of funding for calis at the present time are as follows:
• government grants. much of the funds for the calis project during the first phase of construction came from the government.
because of the demonstrated benefits of the ongoing project, it is expected that the government will provide funds for the second phase of calis construction. these government funds have been used in the purchase of software and hardware for the calis centers and commercial databases, development of service software and databases, training of staff members, etc.
• local matching funds. according to prior agreements, a province or city that desires to have a regional center is required to provide funds in supplementation to the government funds for the construction of its local center.
• member library funds. these funds, primarily derived from the university budgets, have been used to purchase electronic resources and cover the expenses incurred from the use of the calis service software platforms.

although calis is currently funded by the government, the future expansion and operation of the system is expected to rely in large part on other sources of funds. the funding needs for calis may be met by operating the system in a commercial mode.

principles for cooperation among members

the successful operation of a library consortium clearly depends on good working relationships among members and between members and the consortium. at calis, all members are required to adhere to a set of principles (see below) in dealing with these relationships. it is based on these principles, known as the calis principles for cooperation among members, that calis policies and rules are made.
• the common interests of calis are above those of individual member libraries.
• member libraries should not cooperate at the expense of the interests of others.
• calis provides services to member libraries for no profit.
• member libraries are all equal and enjoy the same privileges.
• larger member libraries are obliged to make more contributions.

what has been achieved?
when it was first established, calis had sixty-one member libraries from major universities participating in "project 211." later, as many other major universities were interested in joining the alliance, the number of calis members has climbed to seventy. at the present, calis serves about 700,000 students.

construction of calis is a long-term, strategic undertaking. the system provides service functions as they become available and is constantly being improved in the process. in the first phase (1998 to 2000) of the project, calis successfully started the following information-sharing functions in its member libraries:
• primary and secondary data searching;
• interlibrary borrowing and lending;
• document delivery;
• coordinated purchasing; and
• online cataloguing.

the following tasks have been completed:
• purchase of computer hardware (e.g., sun e~s00);
• construction of a cernet- or internet-based information-sharing network connecting academic libraries across the country; and
• group purchase of databases, such as umi, ebsco, ei village, inspec, elsevier, and web of science, that are shared among member libraries either directly online or indirectly through requested service/document delivery.

calis also has completed development of a number of databases, including:
• union catalogues. these databases currently contain 500,000 bibliographic records of the chinese and western language books and periodicals in all member libraries.
• dissertation abstracts and conference proceedings. these databases now contain abstracts of doctoral dissertations (12,000 bibliographic records) and proceedings of national and international conferences (8,000 records) collected from more than thirty member libraries. the databases are expected to have 40,000 records in total by the end of 2000.
• current chinese periodicals.
these databases (5,000 titles, 1.5 million bibliographic records) contain contents and indexes of current chinese periodicals from about thirty member libraries.
• key disciplines databases. calis has sponsored the development of twenty-five discipline-specific databases by member libraries. each of these databases contains about 50,000 to 100,000 records.

the first three classes of databases are prepared in the usmarc, unimarc, or ccfc format for the ease of use by patrons and cataloguing staff and in data exchange. clients from member libraries may perform a web-based search of the above databases, most of which contain secondary documents and abstracts, and access calis online resources using browsers.

development of software platforms includes the following:
• cooperative online cataloguing systems. the systems include z39.50 protocol-based search and uploading servers and terminal software platforms for cataloguing staff. acquisition and cataloguing staff in each member library may participate in cooperative online cataloguing using the terminal software platforms on their local system. the systems have been used for the development and operation of the union catalogue databases.
• systems for database development. these systems can be used in the development of shared databases containing secondary data information in usmarc, unimarc, ccfc, or dublin core format. the systems for database development in the usmarc, unimarc, or ccfc formats are equipped with a search server based on the z39.50 protocol to permit use by cataloguing staff and for data exchange.
• an interlibrary loan system. the system, developed based on the iso 10160/10161 protocol, consists of ill protocol machines and client terminals. these systems, located in member libraries, are interconnected to form a calis interlibrary loan network.
primary document delivery software based on the ftp protocol also has been developed for the delivery of scanned documents between libraries.
• an opac system. the system has both web/z39.50 and web/ill gateways. patrons may visit the system using common browsers, search all calis databases, and send search results directly to the calis interlibrary loan service. patrons also may access an ill server through web/ill, tracking the status of submitted interlibrary loan requests, inquiring about fees, and so on.

the databases that are centrally located and those that are distributed at various locations as well as service platforms in member libraries form a calis information service network.

future considerations

in a period of just over a year, considerable progress has been made in forming a nationwide resource-sharing library consortium in china. however, because member libraries vary in size, available funds, staff quality, and automation level, calis has yet to realize its potential. there are a number of problems that remain to be solved. for example, the calis union catalogue databases do not work well on some of the old automation systems in member libraries and the calis service platforms are incompatible with a dozen automation systems currently in use; as a result, the union catalogues cannot tell the real-time circulation status in all member libraries, affecting interlibrary loan service. in addition, primary resources are not sufficiently abundant. therefore, the extent to which resources are shared among member libraries remains quite limited.

in the next phase of development, calis will improve service systems (including hardware and software platforms) and the distribution of shared databases. at the same time, calis will develop more electronic resource databases and be actively involved in the research and development of digital libraries, expanding the scale and extent of resource sharing.

references

1. barbara a.
winters, "access and ownership in the 21st century: development of virtual collection in consortial settings," in electronic resources and consortia (taiwan: science and technology information center, 1999), 163-80; katherine a. perry, "viva (the virtual library of virginia): virtual management of information," in electronic resources and consortia (taiwan: science and technology information center, 1999), 93-114; delmus e. williams, "living in a cooperative world: meeting local needs through ohiolink," in electronic resources and consortia, ching-chin chen, ed. (taiwan: science and technology information center, 1999), 137-61.
2. jordan m. scepanski, "collaborating on new missions: library consortia and the future of academic libraries," in proceedings of the international conference on new missions of academic libraries in the 21st century, duan xiaoqing and he zhaohui, eds. (peking: peking univ. pr., 1998), 271-75.

catalog records retrieved by personal author using derived search keys

alan l. landgraf and frederick g. kilgour: the ohio college library center

this investigation shows that search keys derived from personal author names possess a sufficient degree of distinctness to be employed in an efficient computerized interactive index to a file of marc ii catalog records having 167,745 personal author entries.

previous papers in this series and experience at the ohio college library center have established that truncated derived search keys are efficient for retrieval of entries by name-title and title from large on-line computerized files of catalog records.1-4 experiments reported in the earlier papers were " ...
based on the assumption that each key had a probable use equal to all other keys."5 however, guthrie and slifko have shown that random selection of entries, rather than keys, yields results closer to actual experience but with a higher number of entries per reply.6 for example, they found on retrieving from a file of 857,725 records using a 4,5 (four characters of main entry, five characters of title) key that when the basis of the search was random keys there was one entry per reply 81.3 percent of the time, but when the basis was random records, there was one entry per reply 55.7 percent of the time.

this paper presents the results of experimentation with search keys to be used in constructing an author index to a large file of on-line catalog records. an interactive environment is assumed, with the interrogator employing a remote terminal. a companion paper describes the findings of an investigation into retrieval efficiency of search keys derived from corporate author names.7

materials and methods

the investigation employed a marc ii file containing approximately 200,000 monographic records from which a computer program extracted 167,745 personal-name keys. the program extracted these keys from main entry, series statement, added entry, and series added entry fields. the basic key structure consisted of sixteen characters: the first eight from the surname, the first seven from the forename, and the first character from the middle name (8,7,1). if the surname and forename contained fewer characters than the key segment to be derived, the segment was left-justified and padded out with blanks. if there was no middle name or middle initial, a blank was used.

another program derived shorter keys from the 8,7,1 structure ranging from 3,0 to 5,2,1. next, a sort program arranged the shorter keys in alphabetical order. a statistics collection program then processed the alphabetical file. this program counted the number of distinct keys, built a frequency distribution of names per distinct key and cumulative frequency distributions of names per distinct key in percentile groups.

104 journal of library automation vol. 6/2 june 1973

[fig. 1. number of names retrieved 90, 99, and 99.5 percent of the time for different key structures; columns: no. of characters extracted from the surname (3 to 6)]

results

figure 1 presents the findings at three levels of likelihood for retrieving n or fewer names when a variety of search key combinations were employed ranging from three to six characters from the surname, zero to three characters from the first name, and with or without the middle initial. table 1 is an extraction from figure 1 and contains the number of names retrieved at a level of 90 percent likelihood for the various search keys employed.

table 1. number of names retrieved with 90 percent likelihood

no. of characters    key structure    no. of names retrieved
3                    3,0              (>200)
4                    4,0              (>200)
4                    3,1              (>200)
5                    5,0              (>200)
5                    3,2              26
5                    4,1              25
5                    3,1,1            16
6                    6,0              171
6                    5,1              18
6                    3,3              17
6                    4,2              12
6                    3,2,1            8
6                    4,1,1            8
7                    6,1              16
7                    5,2              9
7                    5,1,1            6
7                    3,3,1            5
7                    4,2,1            5

figure 2 has the same structure as figure 1 but contains the degree of distinctness as percentages, $\dfrac{\text{no. of distinct keys}}{\text{no. of entries}} \times 100$ percent. table 2 records distinctness arranged by number of characters per key. figure 3 is a graphical representation of the degrees of distinctness of the various keys. in this figure, different types of lines connect points representing key structures that contain an equal number of characters.
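the key derivation and the statistics pass described under materials and methods can be sketched in a few lines of python. this is only an illustration under stated assumptions, not the original programs: the function names, the lowercase normalization, and the four sample names are hypothetical, and the likelihood figure is computed on the assumption that every distinct key is equally likely to be searched.

```python
from collections import Counter


def derive_key(surname, forename="", middle="", structure=(8, 7, 1)):
    """Truncate each name part to its segment width, left-justified and blank-padded."""
    parts = (surname, forename, middle)
    segments = []
    # zip stops at the shorter sequence, so a two-part structure such as (3, 1)
    # simply ignores the middle name.
    for text, width in zip(parts, structure):
        segments.append(text.strip().lower()[:width].ljust(width))
    return "".join(segments)


def names_per_key(keys):
    """Frequency distribution: names sharing a key -> number of distinct keys."""
    return Counter(Counter(keys).values())


def distinctness(keys):
    """Distinct keys as a percentage of all entries."""
    return 100.0 * len(set(keys)) / len(keys)


def names_at_likelihood(keys, likelihood):
    """Smallest n such that a distinct key yields n or fewer names at least
    `likelihood` of the time (assumption: every key is equally likely)."""
    dist = names_per_key(keys)
    total_keys = sum(dist.values())
    covered = 0
    for n in sorted(dist):
        covered += dist[n]
        if covered / total_keys >= likelihood:
            return n
    return max(dist)


# hypothetical sample names, not taken from the marc ii file
names = [
    ("kilgour", "frederick", "g"),
    ("kilgore", "frank", ""),
    ("landgraf", "alan", "l"),
    ("long", "philip", "l"),
]
keys_3_1 = [derive_key(s, f, m, structure=(3, 1)) for s, f, m in names]
keys_3_1_1 = [derive_key(s, f, m, structure=(3, 1, 1)) for s, f, m in names]
print(keys_3_1, distinctness(keys_3_1))
print(keys_3_1_1, distinctness(keys_3_1_1))
```

in this toy sample the 3,1 structure cannot separate the first two names, while adding the middle initial (3,1,1) does, which is the kind of trade-off between key length and distinctness that the tables summarize.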
the bottom line in table 1 may be read as saying that 90 percent of the time a 4,2,1 key will retrieve five or fewer names from a file of 167,745 personal name keys. the bottom line of table 2 states that from the same file the 4,2,1 key yields a single name 64.1 percent of the time.

discussion

this experiment has shown the degree of distinctness (that is to say, the number of distinct keys divided by the total number of entries from which all keys were derived) to be a useful tool in determining what key structures may be efficiently used. as seen by comparing figure 1 with figure 2 and table 1 with table 2, there is a high degree of correlation between distinctness and the likelihood of retrieving a certain number of names 90,

[figure; axis: no. of characters extracted from the surname]

100; but some 2,2,2,2,2 keys corresponded to more than 500 file entries. a typical crt display terminal can accommodate only ten or fewer entries per screen. therefore, if the average number of entries per reply is desired to be ten or fewer, it is necessary either to ignore entries with high multiplicity or to adopt a different scheme of storing and retrieving such items, in which case the mathematical result would be the same as ignoring high-frequency items. the average number of entries per reply was computed for five different values of m (19, 29, 39, 49, and 59); the results of these computations are in table 2, which reveals that if keys in the file are allowed a maximum recurrence of 39 entries per key, it would be possible to have keys in the main index for about 75 percent of total records, while entries for only 142 high frequency keys would have to be shunted to a secondary index. in this case, the average number of entries per reply would be about eight.
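the effect of capping key multiplicity at m can be illustrated with a short python sketch. it assumes, as the paper does for table 3, that every entry in the file is equally likely to be sought; whether the averages in table 2 were computed in exactly this way is an assumption here. the function names and the key counts are hypothetical and do not come from the file studied.

```python
from collections import Counter


def split_index(key_counts, max_entries):
    """Keep keys with at most max_entries entries in the main index;
    shunt the remaining high-frequency keys to a secondary index."""
    main = {k: n for k, n in key_counts.items() if n <= max_entries}
    secondary = {k: n for k, n in key_counts.items() if n > max_entries}
    return main, secondary


def reply_size_probabilities(key_counts):
    """P(i) = i * f_i / sum_j (j * f_j), where f_i is the number of keys that
    occur exactly i times (assumption: each entry is equally likely to be sought)."""
    total_entries = sum(key_counts.values())
    keys_of_size = Counter(key_counts.values())  # f_i
    return {i: i * f / total_entries for i, f in sorted(keys_of_size.items())}


def average_entries_per_reply(key_counts):
    """Expected reply size under the same equal-access assumption."""
    return sum(i * p for i, p in reply_size_probabilities(key_counts).items())


# hypothetical counts of entries per derived key
key_counts = {"key_a": 1, "key_b": 1, "key_c": 2, "key_d": 4, "key_e": 60}
main, secondary = split_index(key_counts, max_entries=39)
coverage = sum(main.values()) / sum(key_counts.values())
print(sorted(secondary), round(coverage, 3))
print(reply_size_probabilities(main), average_entries_per_reply(main))
```

raising the cap moves more entries into the main index but lengthens the average reply, which is the trade-off the five values of m explore.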
table 3 gives the probability of number of entries per reply for the index file consisting of 50,854 (out of a total of 68,169) records with the maximum frequency of any key in the file being 39. for preparing this table the assumption is made that each entry in the file has an equal probability of being accessed. thus the probability of obtaining i entries per reply is given by:

$$P(i) = \frac{i\, f_i}{\sum_{j=1}^{m} j\, f_j}$$

where $f_i$ is the frequency of keys occurring exactly i number of times in the index file and m is the maximum frequency of any key in the file (here 39). an inspection of this table shows that 87.7 percent of the time there would be 20 or fewer entries per reply. this represents two screensful of information on a typical crt display.

160 journal of library automation vol. 6/3 september 1973

table 3. probability of number of entries per reply for an index file using 2,2,2,2,2 key.

number of entries    frequency    probability percentage    cumulative probability percentage
1                    14820        29.1                      29.1
2                    2893         11.4                      40.5
3                    1276         7.5                       48.0
4                    726          5.7                       53.7
5                    427          4.2                       57.9
6                    312          3.7                       61.6
7                    248          3.4                       65.0
8                    195          3.1                       68.1
9                    150          2.6                       70.7
10                   120          2.4                       73.1
11                   78           1.7                       74.8
12                   88           2.1                       76.9
13                   56           1.4                       78.3
14                   71           1.9                       80.2
15                   62           1.9                       82.1
16                   48           1.5                       83.6
17                   41           1.3                       84.9
18                   28           1.0                       85.9
19                   24           0.9                       86.8
20                   22           0.9                       87.7
21                   18           0.7                       88.4
22                   16           0.7                       89.1
23                   23           1.1                       90.2
24                   25           1.1                       91.3
25                   13           0.7                       92.0
26                   9            0.4                       92.4
27                   12           0.7                       93.1
28                   18           1.0                       94.1
29                   10           0.5                       94.6
30                   11           0.7                       95.3
31                   11           0.7                       96.0
32                   13           0.8                       96.8
33                   6            0.4                       97.2
34                   9            0.6                       97.8
35                   7            0.4                       98.2
36                   6            0.5                       98.7
37                   11           0.8                       99.5
38                   5            0.3                       99.8
39                   2            0.2                       100.0

corporate author entry records/landgraf, et al. 161

conclusion

a file containing only those entries for which the frequencies of 2,2,2,2,2 search keys is 39 or fewer would produce 20 or fewer entries per reply approximately 88 percent of the time, but such a file excludes 142 high frequency keys for 17,315 of a total of 68,169 entries. therefore, a special technique for handling corporate-entry derived keys of high multiplicity is desirable.

references

1. a. l. landgraf and f. g.
kilgour, "catalog records retrieved by personal author using derived search keys," journal of library automation 6:103-8 (june 1973).
2. f. g. kilgour, p. l. long, and e. b. leiderman, "retrieval of bibliographic entries from a name-title catalog by use of truncated search keys," proceedings of the american society for information science 7:79-82 (1970).
3. f. g. kilgour, p. l. long, e. b. leiderman, and a. l. landgraf, "title-only entries retrieved by the use of truncated search keys," journal of library automation 4:207-10 (dec. 1971).
4. p. l. long and f. g. kilgour, "a truncated search key title index," journal of library automation 5:17-20 (march 1972).
5. kilgour, long, leiderman, "retrieval of bibliographic entries."
6. g. d. guthrie and s. d. slifko, "analysis of search key retrieval on a large bibliographic file," journal of library automation 5:96-100 (june 1972).
7. landgraf and kilgour, "catalog records retrieved."

grant project information via a shared data base

justine roberts: the library, university of california, san francisco

a quarterly keyword index to campus grant projects is provided by the health science library at the university of california, san francisco, using a data base created and maintained by the campus' contracts & grants office. the index is printed in kwoc format, using the chief investigator's name as the key to a section of project summaries. a third section is also included, listing the summaries under the name of the sponsoring department.

introduction

communication channels between the computer center and the library at the university of california, san francisco are open despite the "normal" and accompanying library use traumas of an all-purpose university computing center. thus the library's chief administrator received an immediate, if unexpected, response to her statement of campus need for subject access to information about local research and training projects.
as she summarized it, the information need is expressed as "who is doing what, where, with what amount of funds?" such queries about campus work come to the health science library regularly, but had often remained unanswered because of inadequate published sources and the lack of easily accessible local sources. neither the campus contracts and grants (c&g) office, nor any other campus unit, had files organized to allow inquiry by subject or by department name, nor were any departments staffed to provide a general information service of this nature. previous investigation by the library had revealed the fiscal infeasibility of extracting citations of publications on campus research projects from a commercially available data base. the latest locally compiled directory of campus work was eleven years old.

response by the computer center director to the library statement was a suggestion for a three-department cooperative project between the library, the computer center, and the c&g office to produce a quarterly index of the machine-readable administrative file of the latter department. this accession-ordered file is comprised of 1441-character records which provide for 42 data elements needed by the c&g office to monitor the progress and fiscal status of all extramurally funded projects and proposals (figure 1).

[fig. 1. dump of contracts & grants office master tape record]

original specifications for these project records had in fact included a gesture toward information retrieval in the form of a 5-digit "discipline" code; this code had quickly become null when problems of maintenance and interpretation revealed themselves. however, the regular monthly entry and updating of other descriptive data was already twelve months underway at the time the cooperative project was suggested.

library index

the original proposal for a library index to the c&g file was production of a standard kwic (keyword-in-context) index to project titles, to be based on use of an ibm share library program developed by computer center staff.
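a minimal sketch of what a kwic (keyword-in-context) index over project titles involves. it is written in python and is not the ibm share library program mentioned above; the stop-word list, the sample titles, and the function name `kwic_lines` are assumptions made only for illustration.

```python
# Illustrative KWIC (keyword-in-context) index; not the IBM SHARE program.
# STOP_WORDS and the sample titles below are hypothetical.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}


def kwic_lines(titles, width=30):
    """Return (keyword, line) pairs, one per significant word in each title.

    Each line shows the keyword in a fixed column, with the words that
    precede and follow it in the title kept as context on either side.
    """
    entries = []
    for title in titles:
        words = title.lower().split()
        for pos, word in enumerate(words):
            if word in STOP_WORDS:
                continue
            left = " ".join(words[:pos])[-width:]
            right = " ".join(words[pos + 1:])[:width]
            line = f"{left:>{width}}  {word}  {right}".rstrip()
            entries.append((word, line))
    # Sort alphabetically by keyword so that related titles sit together.
    entries.sort(key=lambda entry: entry[0])
    return entries


if __name__ == "__main__":
    sample = ["studies of bile pigment metabolism",
              "inborn errors of metabolism"]
    for keyword, line in kwic_lines(sample):
        print(line)
```

each title appears once per significant word, so the index grows with title length; that growth is one practical reason to weigh kwic output against other index formats.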
after review of available library programs and output, the product was finally specified to be a kwoc (keyword-out-of-context) index to project titles, using the chief investigator's name as key to a second "bibliographic" section of project summaries (figures 2, 3). a third section was added to list project summaries indexed by campus department name (figure 4).

the c&g file included 12 elements which the library considered to be of general campus interest. these elements, comprising the project summaries, are: (1) project title; (2) chief investigator's name; (3) award status (i.e., funded or proposed); (4) project site (i.e., campus or affiliated institution); (5) project type (e.g., training, basic research, applied research); (6) grant number; (7) total project duration to date; (8) award period; (9) award amount; (10) granting agency name; (11) campus department; and (12) school. items 10, 11, and 12 of this list exist as numeric codes in the records, with decoding tables comprising a separate file at the beginning of the tape volume. these items, together with items 3, 4, and 5, are coded on input but are not uniformly edited by the regular c&g update program.

[fig. 2. keyword subject section]

[fig. 3. chief investigator section]

program requirements

necessary program functions included the selection of active grant records and the editing, decoding, and formatting of selected data elements for printing, and the extraction and sorting of index terms. these functions were divided between a main routine coded by library staff, an indexing subroutine and print program written by a computer center staff member, and an ibm utility sort. local programs were written in pl/1, using the newly installed pl/1 optimizing compiler, and provided several tests of the compiler's capacities. only projects currently in "award" status, or which have outstanding proposals during the preceding twenty months, are selected for printing; this is approximately two-thirds of the file at this time.
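the selection rule and the keyword-out-of-context lookup described above can be sketched as follows. this is python, not the pl/1 routines the project used; the field names (`status`, `proposal_date`, `investigator`), the 30-day approximation of a month, the stop-word list, and the sample records are all assumptions made for illustration, since the actual c&g record layout is not given here.

```python
from datetime import date, timedelta

# Hypothetical stop-word list; the project's own list is not given.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}


def is_selected(record, today, months=20):
    """Keep funded awards, and proposals dated within the last `months` months."""
    if record["status"] == "award":
        return True
    cutoff = today - timedelta(days=30 * months)  # rough month length (assumption)
    return record["status"] == "proposal" and record["proposal_date"] >= cutoff


def kwoc_index(records):
    """Map each title keyword to (full title, chief investigator) pairs.

    In a KWOC index the keyword is printed apart from the title, and the
    investigator name serves as the key into the project summaries section.
    """
    index = {}
    for rec in records:
        for word in set(rec["title"].lower().split()) - STOP_WORDS:
            index.setdefault(word, []).append((rec["title"], rec["investigator"]))
    return dict(sorted(index.items()))


if __name__ == "__main__":
    records = [
        {"title": "inborn errors of metabolism", "investigator": "roe, r",
         "status": "award", "proposal_date": None},
        {"title": "bile pigment metabolism", "investigator": "doe, j",
         "status": "proposal", "proposal_date": date(1973, 1, 15)},
    ]
    active = [r for r in records if is_selected(r, today=date(1973, 9, 1))]
    for keyword, hits in kwoc_index(active).items():
        print(keyword, hits)
```

keeping selection separate from indexing mirrors the division of labor described in the text, where a main routine chooses records and a subroutine extracts index terms.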
these conditions are tested on various data fields in the c&g record, and data from selected records are reformatted into a partial print line and passed to the keyword extraction

improved delivery/herling, et al. 279

[fig. 3. utility curves for the timeliness of library materials delivery. axes: utility versus time (days); curves for supplies, mending & binding, newly processed contract materials, reciprocal returns, photo duplicated materials, and interlibrary loans.]

engineering techniques to the solution of management and systems problems, generally but not necessarily with the use of the computer.

the operations research approach requires a valid unit of measurement. if an existing system is to be evaluated for comparison with alternative systems other than subjectively, some quantitative basis must be derived. we believe that one of the most important products of this project was the development of a measure of effectiveness, or "objective function." this measure was a composite of numerical values (weights) assigned to the types of materials to be delivered, as shown in table 1; the frequency of delivery within a week; timeliness value (utility), as shown in figures 3 and 4; and the number of units to be delivered. to illustrate: ten interlibrary loan items (a weight of .135) delivered in less than one day (a utility of 1) have an effectiveness value of 13.5. on the other hand, ten items delivered in five days (a utility of .5) have an effectiveness value of 0.5 x .135 x 10 = 6.75. a system designed to accomplish the latter would have 50 percent less effectiveness than a system that accomplished the former. no librarian needs to be told that it is generally more important to deliver interlibrary loans promptly than it is to deliver supplies. but in order to use operations research methods, quantitative values, as we have said, are required.
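the effectiveness arithmetic in the illustration above can be sketched in python as the product of weight, utility, and number of units. the function name and the `scale` parameter are assumptions: multiplying the quoted figures directly (0.135 x 1 x 10) gives 1.35 rather than 13.5, so the values printed in the text imply an additional factor of ten whose origin is not stated in this passage. the ratio between the two cases, 50 percent, does not depend on that factor.

```python
def effectiveness(weight, utility, units, scale=1.0):
    """Effectiveness of one delivery: weight x utility x units.

    `scale` is a hypothetical extra multiplier, included only so the
    figures quoted in the text can be reproduced if desired.
    """
    return scale * weight * utility * units


if __name__ == "__main__":
    # Interlibrary loans: weight 0.135, ten items.
    same_day = effectiveness(0.135, 1.0, 10)  # delivered in under one day
    five_day = effectiveness(0.135, 0.5, 10)  # delivered in five days
    print(round(same_day, 3), round(five_day, 3))
    print("relative effectiveness:", five_day / same_day)
```

summing this quantity over every material type and delivery gives a single figure for a whole system, which is how the measure is used later in the study.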
280 journal of library automation vol. 7/4 december 1974

[fig. 4. utility curves for the timeliness of library materials delivery. axes: utility versus time (days); curves for gifts, intralibrary bulk shipments, newly processed intralibrary materials, correspondence, and intralibrary loans.]

the values shown in table 1 and the sensitivity to timeliness of delivery, i.e., figures 3 and 4, were established by the use of a technique known as the delphi method. our application of the method in this project has been described elsewhere.4 essentially the method seeks out a consensus from a panel of knowledgeable people, in this case experts from academic, school, special, and public libraries, and a trustee. the methodology has three characteristics: anonymity, controlled feedback, and statistical group response.

anonymity is used to minimize the impacts of dominant individuals in the panel. this is achieved by eliciting separate and individual responses to previously prepared questions. in this case, the responses were made in writing on preprinted forms.

controlled feedback reduces the variance in parameter estimates. after the first and all remaining rounds, the results of the previous round are fed back to the panel in a summarized form showing the vote distribution along with various justifications for votes after the second round. since the panel is asked to reevaluate their position based on the feedback provided, but with no particular attempt to arrive at unanimity, the spread of votes will usually be much smaller after several rounds than during the earlier rounds. this is known as statistical group response. in each case consensus was reached within five rounds.

in addition to the need for evaluating system effectiveness in relation to service, there is the need to relate effectiveness to costs. systems description i provided the data on all fixed and variable costs of the existing system.
because of the prevailing use in libraries of line accounting, all the associated costs were not readily available; hence present costs were probably underestimated. for purposes of computer processing, cost per minute of driving and cost per mile of truck operation were identified.

[fig. 5. weekly mileage versus time for ccpl trucks. weekly series from 8/17 to 11/16 for trucks hdq, a, b, c, and d.]

software

to repeat: the general approach was to study the characteristics of the existing system, then design an improved system. using the elements described above, a computer program was written to emulate the system, introducing, however, the measure of effectiveness to make it possible to establish values representing the existing level of performance. entered in the program were
1. the nodes,
2. demand and frequency of delivery at each node,
3. geographic coordinates of each node,
4. unit costs, and
5. weights and utilities for each type of material.

the program was run to compute, for each driver, the costs, distances traveled, volume delivered, time utilization, the effectiveness as discussed earlier, and then the cost/effectiveness ratios. figures 5 through 10 show the hard data inputs to the program. table 2 depicts a sample of the statistical analysis performed on the hard data.

table 2. statistical analysis of the cpl driver collection cards, summer schedule 8/17/70-9/4/70
column headings: daily mileage; number of stops; delivery (number of telescopes, number of packages, number of bindery boxes, number of audiovisuals); pickup (number of telescopes, number of packages, number of bindery boxes, number of audiovisuals)
mean: 48.13 14.33 18.00 13.11 0.11 5.89 0.11 16.22
variance: 187.27 9.50 58.50 98.61 0.11 57.94 51.11 0.11
standard deviation: 13.68 3.08 7.65 9.93 0.33 7.61 7.15 0.33

[fig. 6. number of stops per week versus time for ccpl trucks.]

four sets of computer runs were made, first using data for the same week for all drivers, then data for several weeks for different drivers. total effectiveness of the existing system, as measured by the sum of the products of the importance of each material type (weight), their timeliness values (utilities), and total amounts of materials delivered, ranged from 8,110 to 9,950. costs ranged from $3,801 to $3,934 per week.

a second program incorporating the tools of operations research known as simulation and optimization was then used to design an improved system. this program (simopt) included a routing algorithm (set of instructions to the computer) to determine the best routes for each of the drivers on a daily basis. figure 11 describes the basic logic of this program.

the procedures to operate the methodology require the following steps (see figure 11):
1. based on the library hierarchy, contractual arrangements, or any extraneous but agreed-upon reasons, the librarians assign frequencies of delivery to each group of nodes or individual node.
2. using the maps (figure 2) and other information, librarians group nodes and assign them to a driver along with the frequency as derived from step 1 above. this constitutes the input necessary for a computer production run.
3. in a production run, the computer calculates:
   a. the best route for each driver day by day;
   b. the effectiveness of the route;
   c. the cost of the route cumulative by day for one week;
   d. the distance traveled by each driver;
   e. the time spent working by each driver; and
   f. capacity, time, and/or distance constraint violations.
4. if results of step 3 are not satisfactory or a better variant is synthesized, librarians can iterate through steps 1 or 2.

[fig. 7. number of telescopes delivered per week versus time for ccpl trucks.]

[fig. 8. number of packages delivered per week versus time for ccpl trucks.]

in order to maintain the information basis of the procedure, the following input must be updated for computer files.

ad hoc basis
• node changes
  - new nodes
  - nodes to be dropped
  - changes of location
  - changes of hierarchical status and category
• changes in vehicle capacity
• changes in cost parameters

periodic (intermediate range) basis
• evaluate demands by season, by node
  (once every two or three years or ad hoc if major shifts have been established)
• evaluate driver time data (as above)

periodic (long range) basis
• reevaluate the material types
• reestablish sensitivity curves

maintaining the same frequency of delivery as used earlier, but with routes generated by the computer, results showed a potential cost reduction of 5 percent and an increased effectiveness of 37,930, or 400 to 500 percent improvement.

the simulation-optimization program also has the capability of processing changes in the elements of the system. effects of two types of changes were tested:
1. configurations which included an increase of frequency of delivery to daily delivery for most libraries and twice-daily delivery to some; and
2. configurations which included one or two trucks dedicated to transshipment delivery among key distribution centers.

effectiveness again increased 400 to 500 percent; costs, however, also increased, between 3 and 39 percent.

[fig. 9. number of bindery boxes delivered per week versus time for ccpl trucks.]

discussion

essentially, these results provided the means by which cleveland libraries could maintain the existing delivery system at a slight reduction in cost, but with a four- to fivefold increase in effectiveness; or could improve the frequency of delivery at a known increase in cost and the four- to fivefold improved effectiveness. at the same time, a realistic basis for evaluating bids from commercial delivery services was made available, should this alternative be explored. last, but by no means least, a method for the analysis and/or design of a delivery system that could be used by other library networks was developed.
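the routing algorithm inside simopt is not described in this article, so the following python sketch substitutes a plain nearest-neighbour heuristic to show the kind of calculation a production run performs: a route for one driver, its distance, and a cost/effectiveness ratio. the node coordinates, the cost-per-mile figure, and all function names are hypothetical.

```python
import math


def nearest_neighbour_route(depot, nodes):
    """Greedy route: from the depot, repeatedly visit the closest unvisited node.

    This is a stand-in heuristic, not the SIMOPT routing algorithm.
    `nodes` maps a node name to its (x, y) coordinates.
    """
    route, here, remaining = [], depot, dict(nodes)
    while remaining:
        name = min(remaining, key=lambda n: math.dist(here, remaining[n]))
        here = remaining.pop(name)
        route.append(name)
    return route


def route_distance(depot, nodes, route):
    """Total distance of depot -> stops in route order -> depot."""
    points = [depot] + [nodes[name] for name in route] + [depot]
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def cost_effectiveness(distance, cost_per_mile, effectiveness):
    """Cost per unit of effectiveness; lower is better."""
    return distance * cost_per_mile / effectiveness


if __name__ == "__main__":
    depot = (0.0, 0.0)
    nodes = {"a": (0.0, 3.0), "b": (0.0, 1.0), "c": (4.0, 1.0)}  # made-up coordinates
    route = nearest_neighbour_route(depot, nodes)
    miles = route_distance(depot, nodes, route)
    print(route, round(miles, 2), cost_effectiveness(miles, 0.5, 5.0))
```

a greedy heuristic of this kind gives no guarantee of the best route, which is consistent with the need for human adjustment that the authors report.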
no study, as is true of most human endeavors, is perfect; ours is no exception. the original intent of the proposal, to study the entire distribution system, and especially its reference network aspects, was narrowed to the delivery subsystem because of inadequate funding. underestimation of the complexity of the problem, which mandated the expenditure of more time than was anticipated on data collection and systems description, caused a limitation on the time that could be devoted to study of the delivery subsystem. we could not, as we had intended, consider the question of optimum truck size or alternative types of vehicles; hypothetically, a combination of motorcycles and large trucks would produce a more cost-effective system. acceptance of the location of facilities such as garages as fixed was a further limiting factor: their relocation might have a significant effect. finally, the method of approach in concert with the realities of library budgets ruled out the design of an ideal system unrelated to the existing system.

[fig. 10. number of audiovisuals delivered per week versus time for ccpl trucks.]

enough has been written recently to denigrate the usefulness of the computer in library applications. nevertheless, we must acknowledge that a greater amount of human intervention than anticipated was employed as a corrective in the generation of computer-produced routes and must also be used for their implementation. consider: each of 700 geographical locations is a potential successor in a route to any other of the remaining 699. to process these for computer routing would require obtaining nearly 500,000 pairs of geographical coordinates, their keypunching, and verifying.
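the figure of nearly 500,000 follows from counting ordered pairs of distinct locations, n x (n - 1), which for 700 locations is 489,300. a short python check (the function name is illustrative):

```python
def ordered_pairs(n):
    """Number of (origin, destination) pairs among n distinct locations."""
    return n * (n - 1)


if __name__ == "__main__":
    print(ordered_pairs(700))  # 700 * 699
```

because the count grows roughly with the square of n, restricting the computer to small groups of contiguous nodes reduces the number of pairs sharply.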
by human selection from a map, reasonable sets of contiguous nodes were fed into the computer; the pairs of geographical coordinates were thus reduced to the not unmanageable number of 2,500 to 6,400 pairs. further, once computer routes have been generated, human intervention is required to adjust these to road and traffic patterns that the computer cannot know. this does not imply that the multitude of calculations that need be performed in a study such as this could have ever been attempted without the computer.

[fig. 11. a general methodology for the simulation-optimization. flowchart boxes include: unit & variable costs; weights & utilities values; hierarchy; node location coordinates; demand; season or date; policy rules; select potential node list for driver; select frequency; select schedule; routing subroutine; compute objective and cost; check constraints; next day or driver; resources available.]

conclusion

despite its imperfections, the project discussed here has convinced us that the approach and methodology are of value to the library community, not only in application to library delivery systems but also in application to a multitude of library service problems, particularly those involving several libraries or library systems. because of changes in top administrative positions within the key library systems, however, the results of this study are still awaiting implementation.

references

1. library of congress information bulletin 31:a72 (june 9, 1972).
2. a related study relatively limited in scope is j. c. hsiao and f. j. heinritz, "optimum distribution of centrally processed material: multiple routing solutions utilizing the lock-set method of sequential programming," library resources & technical services 13:537-44 (fall 1969).
3. full documentation of the project is available in the following: an operations research study and design of an optimal distribution network for selected public, academic, and special libraries in greater cleveland: technical report (cleveland, ohio: the task force, lsca title iii distribution project, 1972); systems description i (cleveland, ohio: the task force, lsca title iii distribution project, 1972). these are available on loan through the state library of ohio.
4. a. reisman, g. kaminski, s. srinivasan, j. herling, and m. g. fancher, "timeliness of library material delivery: a set of priorities," socio-economic planning sciences 6:145-52 (1972).
communications

creating and deploying usb port covers at hudson county community college

lotta sanchez and john delooper

information technology and libraries | september 2019 91

lotta sanchez (lsanchez@hccc.edu) is library associate – technology, hudson county community college. john delooper (john.delooper@lehman.cuny.edu) is web services – online learning librarian, lehman college, city university of new york.

abstract

in 2016, hudson county (nj) community college (hccc) deployed several wireless keyboards and mice with its imac computers. shortly after deployment, library staff found that each device’s required usb receiver (a.k.a. dongle) would disappear frequently. as a result, hccc library staff developed and deployed 3d printed port covers to enclose these dongles. this, for a time, proved very successful in preventing the issue. this article will discuss the development of these port covers, their deployment, and what worked and did not work about the project.

introduction

3d printing was invented in the 1980s but remained a niche product until emerging as a mainstream technology beginning in 2009.
it has been speculated that the growth in popularity was due to several factors, most notably the expiration of patents on technologies such as fused deposition modeling.1 the expiration of this patent led to the emergence of several new companies such as makerbot, which developed and released lower priced 3d printers in an effort to popularize 3d printing.2 nevertheless, early 3d printers were still more expensive than most individual consumers could afford. as with laser printers in the 1980s, many libraries combined their role in the growing makerspace movement with their community purchasing power to bring this new technology to libraries across the united states.3 libraries thus became focal points in the nascent consumer 3d printing movement, frequently providing both training and access to supplies and equipment. as the popularity of 3d printing grew, new communities of 3d printing users emerged and began to design and share artwork and practical objects created with 3d printing technology, often via communities like thingiverse and shapeways. 3d printing at hudson county community college in august 2014, the hudson county (nj) community college (hccc) library moved into a larger facility, nearly doubling its square footage. at this time, many libraries were beginning to open makerspaces, which are facilities for collaboration where the “emphasis is on creating with technology,” and hccc saw an opportunity to join this movement.4 given the results of student feedback surveys, and the observed popularity of 3d printing in public libraries, hccc librarians sought to purchase a 3d printer as a signature technology for the new makerspace. to support the new makerspace, the library’s staff implemented a series of workshops to teach students how to use the 3d printer and create their own projects. 
in addition, when the makerspace was not in use, the library’s administration allowed staff to experiment with the 3d printer, as well as all technologies housed in the makerspace, to allow them to better understand and promote these tools.

creating and deploying usb port covers | sanchez and delooper 92 https://doi.org/10.6017/ital.v38i3.11007

about hudson county community college

as per its 2017-18 hccc factbook, hccc is an urban institution “offering courses and classes in a wide variety of disciplines and studies in one of the most densely populated and ethnically diverse areas of the united states.”5 as of fall 2017, hccc’s full-time-equivalent student population is 7,712, and includes students representing “more than 90 nationalities.” many of these students hail from outside of the united states, “nearly 58 percent of whom speak a language other than english in their homes.” hccc’s demographics also skew young, with students ages 20 through 29 comprising approximately 52 percent of enrolled students. more recently, hccc has also increased its enrollment of high school students, as “the number of students under the age of 18, who are mostly enrolled through hccc’s various high school initiative programs, has more than quadrupled over the past five years.” as with many other community colleges, hccc’s student body includes approximately a 6:4 ratio of female to male students.

the mac usb dilemma

as part of the move to a new facility, the library purchased several new technologies, including dell pcs and apple imac computers (macs). the dell pcs came with wireless keyboards and mice, and in march 2016, the macs were switched to wireless keyboards and mice as well because their original keyboards and mice began to break down and needed replacement.
students reported to library staff that the wireless keyboards and mice were a good investment, as they made it easier to move keyboards for better collaboration and for ease of storing backpacks and textbooks on desks. on both the dell pcs and the macs, the wireless keyboards and mice required the use of a small usb receiver, known as a dongle, to connect to the computer. as the wireless keyboards were installed, several library staff members raised concerns that wireless keyboards and mice would be tempting targets for theft by patrons.

surprisingly, theft of keyboards and mice did not come to pass. since deployment, library staff reported no incidents of theft of any keyboards or mice. however, an unexpected type of theft soon emerged. library employees noticed that on the imacs, the type-a usb dongles, which were needed for the computers to receive input from the keyboards and mice, started disappearing. staff observed that this seemed to be a problem only among the library’s 18 macs, not its 57 dell computers, which also had wireless keyboards and wireless receivers. anecdotal observation suggested that this phenomenon emerged due to the dell’s dark color scheme, which obscured each computer’s usb ports, and rendered the dongles inconspicuous. in contrast, the imacs had sleek aluminum finishes, on which the dongles were more visible, and seemed to be perceived by students as flash drives (see figures 1-5).

figure 1. hccc imac (back with dongle shown).

figure 2. hccc dell pc (back with dongle shown).

figure 3. hccc imac with usb dongle closeup.

figure 4. hccc pc with usb dongle closeup.

figure 5. comparison of mac and pc usb ports.
these perceptions were confirmed as students started visiting service points with dongles from the macs and turning them in to library staff as “lost flash drives.” as there was frequently a lag between a dongle’s initial disappearance and its return to staff, students began to report frustration that they would try to use a mac and find that the mouse and keyboard could not communicate with the computer. this would cause them to assume that the mac was broken, and library staff would respond by taking the computer out of service until a tech could examine it, often several hours or even days later depending on staffing.

during the first semester that these keyboards and mice were deployed, the library found that almost every usb receiver was lost or stolen. this resulted in over $300 of unplanned expenses. in addition, library staff spent dozens of hours inspecting the imacs after students reported non-functioning keyboards, determining what issue was occurring, ordering replacement parts, and connecting new dongles, a process also referred to as “pairing.”

to address this problem, hccc’s director of library technology sought solutions from the library’s technology staff. at a staff meeting in the spring 2016 semester, most of the members of the technology unit suggested that the library address the disappearing dongle issue by purchasing new wired keyboards and mice. the director of the technology unit felt that this was a premature solution to the issue, as he and the library administration preferred a solution that allowed the library to continue to use the wireless keyboards and mice, which were both costly and requested by the institution’s student community. during this meeting, the idea of finding port covers for the dongles arose, and one of the library’s technology associates suggested using the library’s 3d printer to create a cover that would insert into one type-a usb port and cover the dongle in the adjacent slot.
the library’s technology director asked her to create a prototype, and the technology associate began work on creating this port cover.

methodology

to create the 3d-printed port cover, the technology associate began with an online search of the 3d-printing community thingiverse, looking to see if any other 3d port covers already existed. she hoped to find an existing port cover that was both functional and easy to manufacture (in other words, quick to print), since the library’s makerbot replicator often took hours to print intricate designs and frequently jammed due to an extruder design flaw that was common to fifth generation replicator printers.6 a thingiverse search found several varieties of port covers, but each was designed solely to occupy a port in order to prevent dust or corrosion, not to cover or hide dongles or other peripherals. since none of the existing designs adequately met the library’s needs, the technology associate created her own design using tinkercad, a web-based computer aided design (cad) program (see figure 6).

figure 6. picture of port cover design in tinkercad.

since each of hccc’s imac computers contained four type-a ports, students would often attach other peripherals such as phone charge/sync cables or flash drives. therefore, the port cover needed to be small enough to allow room for peripherals. the technology associate thus designed a cover that would not hinder students who wanted to insert their usb flash drives or other devices, as is depicted in figures 7-10.

figure 7. closeup of port cover.
figure 7. alternate angle of port cover.
figure 8. picture of dongle with port cover installed.
figure 9. port cover allowed space to utilize other usb ports for flash drives, etc.
she then exported the tinkercad design as an stl (stereolithography) file, and printed prototypes on the makerbot replicator using pla filament. finding that her initial measurements did not quite fit, she adjusted the models one millimeter at a time, and reprinted them until the fit was secure and the dongles were covered. at this point, she printed enough covers for each mac, along with a few spares in case covers broke or wore out during normal operation.

results

at the beginning of the fall 2016 semester, the port covers were deployed to each of the library’s macs. during that semester, the technology associate monitored the effectiveness of the port covers. by the end of the semester, four port covers had disappeared, along with one dongle. at the beginning of the spring 2017 semester, the missing dongle was replaced, and replacement port covers were printed and deployed to the machines from which the port covers disappeared. again, the success of the port cover installation project was monitored. during this period, four port covers disappeared, along with two dongles. after the spring semester, the technology associate conferred with the director of library technology, and they decided that given the relatively low cost of the 3d-printer filament used to print the port covers, and the greatly reduced receiver theft rate, this was an acceptable loss. they therefore decided to continue utilizing the port covers. but, in the fall 2017 semester, five covers and each of their corresponding dongles disappeared. then, during the spring 2018 semester, all of the port covers disappeared at least once, as did the associated dongles. in total, 20 dongles were lost during that semester.
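the semester-by-semester counts above can be put on a common footing by dividing dongle losses by the number of macs. the following is only a minimal sketch of that arithmetic, not part of the original study: the names `MAC_COUNT`, `LOSSES`, `dongle_loss_rate`, and `summarize` are hypothetical, and the spring 2018 cover count is left as `None` because the text gives no exact figure for it.

```python
# illustrative tally of the losses reported in the results section.
# all names here are hypothetical; only the counts come from the article.
MAC_COUNT = 18  # number of library macs, as stated in the article

# (semester, port covers lost, dongles lost); None marks a count the
# article does not state exactly.
LOSSES = [
    ("fall 2016", 4, 1),
    ("spring 2017", 4, 2),
    ("fall 2017", 5, 5),
    ("spring 2018", None, 20),
]


def dongle_loss_rate(dongles_lost, machines=MAC_COUNT):
    """Return the number of dongles lost per machine in one semester."""
    return dongles_lost / machines


def summarize(losses=LOSSES):
    """Return (semester, dongles lost, rounded per-mac rate) rows."""
    return [
        (semester, dongles, round(dongle_loss_rate(dongles), 2))
        for semester, _covers, dongles in losses
    ]


if __name__ == "__main__":
    for semester, dongles, rate in summarize():
        print(f"{semester}: {dongles} dongles lost ({rate} per mac)")
```

a rate above 1, as in spring 2018, simply reflects that some replacement dongles were lost as well, so more dongles disappeared than there were machines.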
the director of library technology and the technology associate conferred once again, and decided that due to this increase in theft, and a concurrent change in the college’s purchasing process, the library would abandon the 3d-printed port cover experiment.

analysis

after two seemingly successful semesters, library staff were proud of the changes that resulted from deploying the port cover. yet given the recurrence of the theft pattern in subsequent semesters, they started to worry that printing new port covers was not a sustainable practice. to that end, the technology associate considered several theories as to what would cause the port covers to disappear. for instance, research by keizer, lindenberg, and steg found that acts of social disorder (such as graffiti or litter) will spread if not stopped promptly.7 under this framework, it could be suggested that the library was too slow to respond to missing covers, and thus permitted the loss of the dongles due to insufficient action or maintenance. this theory seems plausible: following an enrollment decline that began in fall 2016, a hiring freeze was instituted, so as staff members left the institution, few of their positions were refilled. indeed, as of fall 2018, hccc’s staff is 75 percent part-time and part-timers are subject to renewal or dismissal every six months. in addition, many library employees are student workers, who often leave at graduation, and other part-time staff tend to leave the library for full-time employment at rates that may exceed those at institutions with more permanent staff. with limited staff resources, many of the library’s employees noted anecdotally that they were not able to give as much attention to preventative maintenance on library computers as they had in prior semesters. therefore, they did not have time to proactively monitor equipment such as port covers and dongles. it is also possible that a novelty factor was at play.
perhaps when the covers were first deployed, the brightly colored filaments stood out on the aluminum computers, making students more likely to notice them and alter their behaviors accordingly. if this was the case, new students who began their coursework in subsequent semesters would not have known that port covers were an additional piece that had been added to the library’s computers in response to prior issues. following this speculation, the library’s patrons who removed port covers in fall 2017 and spring 2018 might have thought they were removing damaged or nonfunctional flash drives, much like the students who brought what they believed were lost flash drives to library staff during the spring 2016 semester. finally, the difference in semesters could also have been due to random chance, in which case no staff action could have affected the rate at which port covers disappeared.

conclusion and future research

unsure which of these explanations was most likely, the technology associate had planned to learn from the sudden resurgence in thefts in several ways. she planned to experiment with adding signage about the importance of dongles and the usage of port covers, and to interview student mac users to find out their perceptions about the port covers, as well as possible ideas and student-generated suggestions to prevent future thefts. she also considered designing and experimenting with printing more elaborate port covers to see if increased visibility or an elaborate shape would change theft rates.

however, a complication arose during the 2018 and 2019 fiscal years. during this time, the college’s finance office changed its purchasing procedures. first, they eliminated the library’s technology budget, centralizing all technology purchases in a “pool,” whose total budget was uncertain.
to make purchases from this pool, departments had to write a detailed needs justification and obtain approvals from four high-level executives, in addition to the preexisting procedure of obtaining quotes and getting department head and vice president approval. while the library was eventually able to obtain funds from this process, navigating the pool process typically took about six months per purchase, which meant that, in effect, replacement dongles had to come from existing supplies. in addition, the supplies budget line, which was greatly reduced due to the enrollment decline, also came under increased scrutiny, and the purchasing department began to refuse to approve the purchase of batteries. while many of the mac keyboards were solar powered, and thus did not require batteries, all of their wireless mice, along with the wireless keyboards and mice on the windows pcs, required the use of either aa or aaa batteries. as battery supplies dwindled, the purchasing department did eventually agree to allow the purchase of more batteries, on the condition that the library begin going through the pool process to purchase wired keyboards and mice. in the meantime, the technology associate continues to monitor wireless dongles, reprint port covers, and swap in wired keyboards from the library’s spare parts inventory for wireless ones as dongles disappear.

the creation of 3d-printed port covers was successful at preventing equipment loss at hccc for only two semesters before failing to fulfill that purpose. library staff speculated about the cause of this change but were unable to determine it with certainty before budgetary changes caused the end of the 3d-printed port cover experiment. nevertheless, this project proved valuable in helping the library learn more about 3d-printing technology and experiment with its practical uses in the library environment.
endnotes

1 filemon schoffer, “how expiring patents are ushering in the next generation of 3d printing,” techcrunch (blog), june 5, 2016, http://social.techcrunch.com/2016/05/15/how-expiring-patents-are-ushering-in-the-next-generation-of-3d-printing/.

2 christopher mims, “3d printing will explode in 2014, thanks to the expiration of key patents,” quartz (blog), july 21, 2013, https://qz.com/106483/3d-printing-will-explode-in-2014-thanks-to-the-expiration-of-key-patents/.

3 jason griffey, “absolutely fab-ulous,” library technology reports 48, no. 3 (april 2012): 21–24, https://journals.ala.org/index.php/ltr/article/view/4794.

4 caitlin bagley, “what is a makerspace? creativity in the library,” ala techsource, december 20, 2012, http://www.ala.org/tools/article/ala-techsource/what-makerspace-creativity-library. united for libraries, american library association office for information technology policy, and public library association, “progress in the making: an introduction to 3d printing and public policy,” september 2014, http://www.ala.org/advocacy/sites/ala.org.advocacy/files/content/advleg/pp/hometip-3d_printing_tipsheet_version_9_final.pdf.

5 hudson county community college, “fact book 2017-2018,” 2018, https://www.hccc.edu/uploadedfiles/pages/explore_hccc/visiting_hccc(1)/factbook-%20final%20web%20version.pdf.

6 adi robertson, “makerbot is replacing its most ill-fated 3d printing product,” the verge (blog), january 4, 2016, https://www.theverge.com/2016/1/4/10677740/new-makerbot-smart-extruder-plus-3d-printer-ces-2016.

7 kees keizer, siegwart lindenberg, and linda steg, “the spreading of disorder,” science 322, no. 5908 (2008): 1681–85.

reproduced with permission of the copyright owner. further reproduction prohibited without permission.

electronic library for scientific journals: consortium project in brazil
rosaly favero krzyzanowski; taruhn, rosane
information technology and libraries; jun 2000; 19, 2; proquest pg. 61

making information available for the acquisition and transmission of human knowledge is the focal point of this paper, which describes the creation of a consortium for the university and research institute libraries in the state of sao paulo, brazil. through sharing and cooperation, the project will facilitate information access and minimize acquisition costs of international scientific periodicals, consequently increasing user satisfaction. to underscore the advantages of this procedure, the objectives, management, and implementation stages of the project are detailed, as submitted to the research support foundation of the state of sao paulo (fapesp).

production, organization, and acquisition of knowledge

in 1851, predicting the imminent growth in information, which in fact exploded in volume one hundred years later, joseph henry of the smithsonian institution voiced his opinion that the progress of mankind is based on research, study, and investigation, which generate wisdom, knowledge or, simply, information. he stated that for practically every item of interest there is some record of knowledge pertinent to it, "and unless this mass of information be properly arranged, and the means furnished by which its content may be ascertained, literature as well as science will be overwhelmed by their own unwieldy bulk. the pile will begin to totter under its own weight, and all the additions we may heap upon it will tend to add to the extension of the base, without increasing the elevation and dignity of the edifice."1

at the threshold of the twenty-first century, these words become more self-evident by the day. there are enormous archives of knowledge from which people extract parts, allowing them to advance and progress in science, technology, and the humanities. until some decades back, recovery from these archives was essentially a manual task consisting of written work and organization.
today's technologies provide auxiliary tools to transmit this knowledge. although information is a cultural and social asset, it now is purchased at high prices. making these enormous archives available in a clear and organized manner by using the proper technology is currently the greatest challenge for all those involved in knowledge management: the production, organization, and transmission of information.

the advent and implications of electronic publications

outstanding among the major contributions of the industrial era are the evolution and growth of information publishing and printing facilities that use tools to record, store, and distribute information. in the last ten years, the first steps were taken toward the storage and reproduction of sounds and images in new multimedia formats.

technological advances also have brought new possibilities in accessing and disseminating information. electronic publishing has been particularly effective in accelerating access and contributing to the generation of additional knowledge; consequently, an exponential increase in data has taken place, most notably in the second half of the twentieth century. current journals numbered about 10,000 at the beginning of the century; by the year 2000 the number had reached an estimated 1 million.2

as a result, specialized literature has been warning about a possible crisis in the traditional system of scientific publications on paper. in addition to the difficulty of financing the publication of these works, the prices of subscriptions to scientific periodicals on paper have been rising every year. at times, this makes it impracticable to update collections in all libraries, which interferes substantially with development. on the other hand, access to electronic scientific publications via internet is proving to be an alternative for maintaining these collections at lower cost.
it also provides greater agility in publishing and distributing the periodical, and in the end user's access to the information. for this reason, it is important that institutions that wish to support and promote research developed by their scientific communities facilitate access to these publications on electronic media.

to paraphrase line, we can say that although publishers are still uncertain as to all the aspects of transmitting information electronically, because authors and institutions will be increasingly able to distribute their works on the web without the direct involvement of publishers, there is an escalation in the number of electronic publications issued by scientific publishers.3

rosaly favero krzyzanowski is technical director of the integrated library system of the university of sao paulo-sibi/usp, brazil. rosane taruhn is director of the development and maintenance of holdings service of the technical department of the university of sao paulo-sibi/usp, brazil.

figure 1. infrastructure resources for consortium formation

line also says that one of the reasons for the growth in the number of electronic publications is "that it is technically possible to make them [journals] accessible in this way, and in fact easy and cheap, since nearly all text goes through a digital version on the way to publication. secondly, journal publishers believe that electronic versions provide a second market in addition to that for their printed versions, or at least in an expanded market, since many users will be the same."4
it is important to point out that the scientific periodical, be it paper or electronic, must ensure market value and academic community receptivity, have a staff qualified for scientific publishing, be consistent in publishing release dates, comply with international standards, and use established distribution and sales mechanisms.5

line goes further: "electronic publication as an 'extra' to printed publication has few added costs of journal publication other than those of printing, and publishers are not going to want to make less money from electronic journals than they do from printed ones. while printed journals once acquired can be used and reused without extra cost, each access to an electronic article has to be paid for. and although the costs of storage and binding may be saved, these are offset by the costs of printing out."6 he then notes that this technology demands an active equipment and telecommunication infrastructure. another point he addresses is the need for users to master the search strategies required to efficiently recover information, thus reducing the time spent and costs.

in turn, saunders points out that, depending on the contracts made with the publishers or their agents:

libraries, through their development, formation, and maintenance policies, should be receptive to this transition by accommodating the different means of communication to the different user needs and striving for a new balance. these policies should certainly stress the cooperation and sharing of remote access to the information demanded.
budget estimates should, therefore, foresee, in addition to the subscriptions to electronic titles with complete texts, other possible items like licensing rates for multi-user remote access and the right to copy articles on electronic media to paper, depending on the contracts made with the publishers or their agents.7

electronic publication consortiums

catering to mutual interests by setting up a library consortium to select, acquire, maintain, and preserve electronic information is one means of reducing or sharing costs as well as expanding the universe of information available to users and ensuring a successful outcome. resources (physical, human, financial, and electronic) are combined for the common good; in this case, the consortium, as shown in figure 1, which was extracted and adapted from an oclc institute seminar.8

the consortium presupposes invigoration of cooperative activities among member libraries by promoting the central administration of electronic publication databases as part of a shared library system visible to all and replete with access facilities.

in addition to putting in place simplified, reciprocal lending programs and spurring the cooperative development of collections and their storing, the consortium has the objective of implementing information distribution by electronic means, provided that copyright and fair use rights are complied with.9 on the other hand, "the research library community is committed to working with publishers and database producers to develop model agreements that deploy licenses that do not contract around fair use or other copyright provisions. in this way, one seeks to insure the library practices being disseminated, especially interlibrary lending."10

experience shows that acquiring publications through consortia has brought great benefits and has equally favored different size institutions that would not be able to afford single subscriptions, whether on paper or in electronic format.
north american and european universities have been opting for this type of alliance to augment investment cost-benefit. important examples of these consortia currently operative are:

• washington research library consortium, washington, d.c., www.wric.org;
• university system of georgia, galileo project, www.galileo.peachnet.edu;
• committee on institutional cooperation, michigan, www.cedar.cic.net/cic; and
• ohio library and information network, ohiolink, www.ohiolink.edu.

the electronic consortium in the state of sao paulo

considering that brazilian institutions also are being affected by the high cost of maintaining periodical collections and that alternative means of distributing this information are available, the model used abroad has proved appropriate for developing the international scientific publications electronic library in the state of sao paulo. the location has a favorable information infrastructure available, particularly that of the electronic network of the academic network of sao paulo (ansp), thanks to the support of the research support foundation of the state of sao paulo (fapesp).11 growing user demand for direct, convenient access to information in the state of sao paulo also was a factor in location choice.

the final decision was to compose the consortium of five sao paulo state universities, universidade de sao paulo (usp), universidade estadual paulista (unesp), universidade de campinas (unicamp), universidade federal de sao carlos (ufscar), and universidade federal de sao paulo (unifesp), as well as the latin american and caribbean center for health science information (bireme). the consortium's goal was to make available to the member institutions' entire scientific community (10,492 faculty and researchers) rapid access to the complete, updated texts of the elsevier science scientific journals.
this publishing house, an umbrella for north holland, pergamon press, butterworth-heinemann, and excerpta medica, presently publishes electronic versions of its journals.

selection of the member institutions that would serve as a pilot group for this project was based on prior experience with the cooperative work in preparing the unibibli collective catalog cd-rom, which, using bireme/opas/oms technology, consolidates the collections of these three universities. the project was initially funded by the fapesp; since its fourth edition the cd-rom has been published through funds provided by the universities themselves, by means of a signed agreement.

moreover, choice of elsevier science, which would be justified solely by its premier ranking in the global publishing market, also is due to the fact that consortium member institutions maintain subscriptions to a great number (606) of this publishing house's titles on paper. already fully available on electronic media, these titles are components of a representative collection initiating the building of the international scientific publications electronic library in the state of sao paulo. furthermore, the majority of the titles are studied on the institute for scientific information's web of science site, which has been at the disposal of researchers and libraries in the state of sao paulo since 1998.

consortium objectives

the consortium was formed to contribute to the development of research through the acquisition of electronic publications for the state of sao paulo's scientific community.
using the ansp network, in addition to augmenting and speeding up access to current scientific information in all the member institutions, will:

• increase the cost-benefit per subscription;
• promote the rational use of funds;
• ensure continuous subscription to these periodicals;
• increase the universe of publications available to users through collection sharing;
• guarantee local storage of the information acquired and thus ensure the collection's maintenance and its continual use by present and future researchers; and
• develop the technical capabilities of the personnel of the state of sao paulo institutions in operating and using electronic publication databases.

initially, the project will not interfere in the current process of acquiring periodicals on paper and in distributing collections in member institutions. however, as electronic collection utilization becomes predominant, duplicate subscriptions to paper may be eliminated so as to allow new subscriptions to be available to the consortium at no additional cost.

implementation of the electronic library for international scientific publications

implementation of this project includes the following stages already achieved:

• constitution of the consortium by the six member institutions; and
• set up of an administrative board.

the following stages are in progress:

• purchase of hardware (central server) and a software manager; and
• estimate for the installation of the operational system.

and the following stages are planned:

• training for qualified personnel and maintenance of the infrastructure built up;
• acquisition and implementation of the electronic library on the central server; and
• permanent utilization assessment.

figure 2. reference database and full-text interconnectivity to optimize information access. diagram labels: bireme server; fapesp server; users in consortia institutions. universe: web of science, 8,000 titles; ccc (current contents connect), 9,000 titles; scielo (scientific electronic library online), 100 titles; international scientific periodical electronic library, 606 titles.

the pilot project proposes that the central server, for storage and availability of electronic scientific periodical collections on the ansp network, be located at fapesp in order to facilitate development of an electronic bank. in the future, the bank should, in addition to the collection in mind for the project, include international collections of other publishing houses: the scielo collection of brazilian scientific journals (fapesp/bireme project) as well as the web of science and current contents connect reference databases (see figure 2).

consortium management

the electronic library will be administered by the consortium's administrative board, made up of a general coordinator, an operations coordinator, and directors and coordinators of the library systems and central libraries of member institutions as well as consultants recommended by fapesp. the administrative board shall be in charge of the implementation, operation, dissemination, and assessment of electronic library utilization. it also is charged with supervising qualified personnel training in order to guarantee the success of the project.

an agreement was signed specifying the consortium's objective, its constitution, the manner in which it shall be executed, and the obligations of consortium members.
shortly, a contract to use elsevier science electronic publications shall be signed by fapesp and by the provider. the agreement's documents and use license were drawn up in compliance with the principles for licensing electronic resources recommended by the american library association, published in final version at the 1997 american library association annual conference.12

recovery system and information use evaluation

research on electronic media suggests that use of a single software program that offers different strategies and forms of interacting for searching the collections requires an evaluation of the efficiency of individual search strategies. this evaluation is critical for preparation of guidelines that orient the choice of systems and proper training programs.13

for the electronic library, the challenge of measuring not only the amount of file use but also the efficacy and efficiency of its information access systems and training for its users is an imperative task. in the project described, evaluation shall be made by indicators that demonstrate use of the electronic library and of the collections on paper, per journal title, subject researched, user institution, number of accesses per day, and user satisfaction regarding service provided (interface, response time, text copies), among other factors to be studied.

final remarks

the way in which electronic media are read by the users is a code far beyond the written, because sound and image are being added increasingly. in this first generation of electronic publications, fapesp supported the availability of web of science and of scielo and the creation of the international scientific publications electronic library in the state of sao paulo.
the possible introduction of current contents connect will trigger an extraordinary leap in research development, facilitating access to scientific information and the acquisition and transmission of human knowledge as well as enhancing the cooperative and sharing enterprise of member libraries.

references and notes

1. annual report of the board of regents of the smithsonian institution ... during the year 1851 (washington, d.c., 1852), 22.
2. leo wieers, "a vision of the library of the future," in developing the library of the future: the tilburg experience, h. geleijnse and c. grootaers, eds. (tilburg, the netherlands: tilburg univ., 1994), 1-11.
3. m. b. line, "the case for retaining printed lis journals," ifla journal 24, no. 1 (oct./nov. 1998): 15-19.
4. ibid.
5. r. f. krzyzanowski, "administração de revistas científicas," in reunião anual da sociedade de pesquisa odontológica, aguas de sao pedro, 14, 1997. (lecture)
6. line, "the case for retaining printed lis journals."
7. l. m. saunders, "transforming acquisitions to support virtual libraries," information technology and libraries 14, no. 1 (mar. 1995): 41-46.
8. oclc institute, oclc institute seminar: information technology trends for the global library community, 1997, ohio (dublin, ohio: oclc institute/the andrew w. mellon foundation/fundação getúlio vargas/bibliodata library network, 1997).
9. a definition of fair use is the "legal use of information: permission to reproduce texts for the purposes of teaching, study, commentary or other specific social purposes." found in j. s. d. o'connor, "intellectual property: an association of research libraries statement of principles." accessed july 28, 1999, http://arl.cni.org/scomm/copyright/principles.html.
10. statement of current perspective and preferred practices for the selection and purchase of electronic information. icolc statement on electronic information. accessed july 2, 1998, www.library.yale.edu/consortia/statement.html.
11. r.
f. krzyzanowski and others, biblioteca eletrônica de publicações científicas internacionais para as universidades e institutos de pesquisa do estado de sao paulo. sao paulo, 1998 (project presented to fapesp-fundação de amparo a pesquisa do estado de sao paulo).
12. b. e. c. schottlaender, "the development of national principles to guide librarians in licensing electronic resources," library acquisitions-practice and theory 22, no. 1 (spring 1998): 49-54.
13. w. s. lang and m. grigsby, "statistics for measuring the efficiency of electronic information retrieval," journal of the american society for information science 47, no. 2 (feb. 1996): 159-66.

electronic library for scientific journals | krzyzanowski and taruhn 65

simon fraser university computer produced map catalogue

brian phillips: head social sciences librarian and gary rogers: programmer-analyst, computer centre, simon fraser university, burnaby, british columbia

an ibm 360/50 computer and magnetic tape are used in a new university library to produce a map catalogue by area and up to six subjects for each map. cataloguing is by non-professional staff using the library of congress "g" schedule. author, title, and publisher are in variable length fields, and codes are seldom used for input or interpretation. machine searches by area, subjects, author, publisher, scale, projection, date and language can be carried out.

simon fraser university in burnaby, british columbia, opened in september 1965 to 2,500 students. the library's book collection was small and the map collection yet to be started. to-day there are 6,000 students, approximately 350,000 volumes and 25,000 sheet maps. when graduate work was offered in geography the map collection had to be expanded rapidly. only a small staff was available and it was essential that any map catalogue be largely maintained by trained nonprofessional assistants.
the circulation, acquisitions, and serials systems were automated and there was of course no sacred 3"x5" card file to be replaced. an ibm 1401 (now a 360/50) was in the library and the university librarian encouraged experiment. some form of automated book catalogue was clearly indicated and work began in 1966 to develop one.

106 journal of library automation vol. 2/3 september, 1969

automated or semi-automated methods for cataloguing and producing map lists have been in use for over twenty years. very little, however, appeared in print on the subject until the 1960's. since that time there have been a number of articles on proposed systems and experimental projects, though only a few describe operating systems. the u.s. army map service library has used punched cards since 1945 (1). at the time of investigation, this system was not fully automated, making use only of electric accounting machines rather than a computer. other automated catalogues, such as those for the san juan project (2) and for mcmaster university (3), restricted the amount of information possible by using only one punched card. these systems required codes and tables for both input and interpretation.

the literature revealed other approaches, and several, such as indexing by co-ordinates (4), or using a hierarchical classification, were considered. in the former, each sheet is indexed by its centroid in latitude and longitude. this provides complete control by location; but all requests, and of course indexing, must be expressed by centroid and the extent of the search area indicated in miles. a hierarchical system, such as that suggested by donahue and hedges (5) or that used by mcmaster university, permits a detailed subdivision of area and/or subject. there is, however, no agreement on standards, with each library developing a classification to meet its own needs.
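the co-ordinate approach mentioned above, in which a request is a centroid plus a search extent in miles, can be illustrated with a short sketch. this is not taken from any of the cited systems: the haversine formula, the earth radius constant, the function names and the sample sheets are all assumptions chosen for illustration.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius; an assumed constant


def great_circle_miles(lat1, lon1, lat2, lon2):
    """Haversine distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def sheets_within(sheets, lat, lon, radius_miles):
    """Return the sheets whose centroid lies within radius_miles of (lat, lon)."""
    return [s for s in sheets
            if great_circle_miles(lat, lon, s["lat"], s["lon"]) <= radius_miles]


# Hypothetical sheets, each indexed only by its centroid.
sheets = [
    {"id": "burnaby", "lat": 49.28, "lon": -122.92},
    {"id": "victoria", "lat": 48.43, "lon": -123.37},
    {"id": "calgary", "lat": 51.05, "lon": -114.07},
]
nearby = sheets_within(sheets, 49.25, -123.10, 100)
```

the sketch also shows the limitation the text points out: every request has to be phrased as a point and a radius, so a user who thinks in place names needs a gazetteer before searching.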
visits were made to the university of california at santa cruz and illinois state university at normal, illinois, to see two systems that were being automated. both had used the universally recognized classification of the library of congress. the california system was first outlined by carlos hagen (6) in a proposal to automate the map library at the los angeles campus, and implemented with some changes at santa cruz by stanley stevens. william easton at illinois state has described his work in cataloguing the collection there (7). the use of codes in both cases meant a number of revisions, as new projections, publishers or other information were required. because of format, library staff must be called upon to interpret much of the information.

materials and method

in the simon fraser map catalogue the library of congress classification "g" schedule (8) is adopted for computer use. in it each major natural or political unit is assigned a block of four numbers. the schedule starts with the world and hemispheres, then sweeps through north and south america, to europe, asia, africa, and finally oceania. adjacent areas are thus grouped together numerically. the classification similarly groups related subjects. a single letter is used for the broad subject and an alphanumeric code for subdivisions.

in an automated system each area name must have a unique number if it is to appear in the printout under that name. to this end it has been necessary to make variations in the library of congress "g" schedule. indo-china (g8010-g8014), for example, must be split to provide separate numbers for laos, cambodia and vietnam. the subject classification must also be divided to provide an alphanumeric code for each subject that is grouped under one general number in the schedule.

as is common in map libraries, the main entry is area rather than author, which is of secondary importance. the author (engraver, cartographer, etc.)
is entered on the coding form and appears in the description. in the imprint, publisher is given first, followed by place of publication. these three elements are in variable length fields.

information from the maps is entered on a coding sheet (figure 1) by a library assistant. difficult sheets are entered by a librarian who checks all sheets. as indicated on the flow chart (figure 2), the coding sheets are sent to the library keypunching section, where a deck is made for each record. the number of cards for any particular map depends upon the quantity of information required to describe it. the cards are then sent to the computing centre, where they are written onto magnetic tape and used to update the current master files.

a preliminary survey determined the average length of a map record to be 350 characters, while the maximum approached the region of 700. in order to maximize the use of tape space, it was decided that four of the fields would be variable in length. these are: 1) main entry (area and title) (215 characters maximum); 2) publisher (129 characters maximum); 3) author (129 characters maximum); and 4) notes (215 characters maximum). access to these data elements is made possible by storing their character counts in fields preceding the variable portion of the record.

two master files are kept and updated each time a run is required. these are the area master (by l.c. class number) and the subject master. the area master contains all maps and is used to produce the classified list and the alphabetical list. the subject master contains only those maps which have been assigned an l.c. subject code. if a map has more than one subject it appears on both the list and the tape file as many times as it has subjects.

changes and deletions are entered into the system along with additions. status codes signal the three: a = addition, c = change, d = deletion. change and deletion records are complete decks.
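the variable-length layout described above, with character counts stored ahead of the variable portion of the record, can be sketched as follows. the field names and maximum lengths come from the text; the three-digit count width, the contents of the fixed portion and the function names are assumptions made for illustration, not the original pl/i layout.

```python
# Variable fields and their maximum lengths, as given in the text.
VARIABLE_FIELDS = (
    ("main_entry", 215),
    ("publisher", 129),
    ("author", 129),
    ("notes", 215),
)
COUNT_WIDTH = 3  # assumed: each character count is stored as three digits


def pack_record(fixed, **fields):
    """Build one record: fixed portion, then the four counts, then the text."""
    counts, texts = [], []
    for name, limit in VARIABLE_FIELDS:
        text = fields.get(name, "")
        if len(text) > limit:
            raise ValueError(f"{name} exceeds {limit} characters")
        counts.append(str(len(text)).zfill(COUNT_WIDTH))
        texts.append(text)
    return fixed + "".join(counts) + "".join(texts)


def unpack_record(record, fixed_len):
    """Split a record back into its fixed portion and its variable fields."""
    fixed = record[:fixed_len]
    pos = fixed_len
    lengths = []
    for _ in VARIABLE_FIELDS:
        lengths.append(int(record[pos:pos + COUNT_WIDTH]))
        pos += COUNT_WIDTH
    fields = {}
    for (name, _limit), length in zip(VARIABLE_FIELDS, lengths):
        fields[name] = record[pos:pos + length]
        pos += length
    return fixed, fields


# Hypothetical fixed portion: status code, six-digit i.d. number, area number.
fixed = "A000123G5047"
record = pack_record(
    fixed,
    main_entry="antigua, west indies (antigua island)",
    publisher="great britain. directorate of overseas surveys, london",
    notes="set of 2 maps",
)
```

because the counts sit at a known offset, a reader can jump straight to any variable field without scanning for delimiters, and an empty field (here the author) costs only its count.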
the records are changed or deleted by comparing the call number on the area master and the call number and subject code on the subject master. call number and subject code are the only fields that cannot be thus changed. their change is accomplished by replacing the old record with the revised record.

as the only unique identifying number for each punched card would be the call number (maximum of 24 spaces), a six-digit i.d. number is assigned. it is repeated for each of the five decks. the maximum number of cards used in any deck is four (main entry and notes), though up to ninety-nine could be used if necessary.

fig. 1. map coding sheet.

fig. 2. work flow chart.

fig. 3. layout for area and subject masters and update tapes. fixed fields include status (a = addition, c = change, d = deletion), identification, area number, sub area, major subject, date of publication, sheet number, copy, scale, projection, language, form, size code, location, and the title, publisher, author and notes lengths. variable length section: title (max = 215), publisher (max = 129), author (max = 129), notes (max = 215). record size = max. 780; block size = max. 2340.

the equipment used is all ibm. the cards are punched on an 029 keypunch and verified on an 059. a computer model 360/50 is now being used, though equipment of this capacity is not necessary. during the development of the project an ibm 1401 and later a 360/40 were used. printing is done on an ibm 1403 at 1100 lines per minute.

the programmes and their functions

the following nine programmes, which were originally written in autocoder, are now in pl/i. this is a relatively new high-level language for the ibm 360 system. to have maximum efficiency from this language large core storage is necessary, though it can be used, with restrictions, on a 32k core storage computer. with the use of other programming languages the system could run efficiently on any computer.

lm001: this programme puts the card decks (from keypunching) onto tape in card image.
lm005: this programme creates and explodes each group of records on the card image tape with the same identification number to produce a subject update tape and an area update tape (figure 3). at the same time, each record is edited; if an error is found, the record is rejected. in order for a record to be valid, the following conditions must exist:
1) numerical identification number.
2) valid card type, i.e. (1, 2, 3, 4, 5) (see figure 1).
3) no duplicate cards for the same map.
4) card codes successive.
5) area being 'g' followed by four numeric digits.
6) numerical date.
7) if scale absent, 'z' (not printed).
8) general information card and title card for each map.
lm010: in this programme, the area master is updated with the area tape. an error message is printed and the record rejected if an addition already exists on the master file or if a change or deletion has no corresponding record on the master file. also the area number is checked against a table to see if it is valid. if it is invalid, an error message is printed but the record will appear in the master file.
lm015: this programme lists the alphabetical geographical master.
lm025: this programme lists the area master geographically.
lm030: this programme updates the subject master. a message is printed and the record rejected if an addition already exists on the master file or if a change or deletion has no corresponding record on the master file. also, the subject code is checked against a table to see if it is valid. if the subject code is invalid, an error message is printed but the record will appear in the master file.
lm035: this programme lists the subject master. at the same time, a tape is produced with each subject heading and the page number it appears on.
lm036: this programme lists a table of contents for the master subject list.
lm037: this programme lists an index for the master subject list.

the catalogue

the catalogue is a book catalogue produced in three sections, unburst and top-bound in loose-leaf binders. the first is the classified or shelf list section, which brings maps of adjacent areas together. within each l.c. number or area, maps are by subject code. in l.c. order, general maps are followed by those with subject emphasis, then by those showing political divisions, ending with cities. area names and numbers are in bold type. all pages are numbered and there is a table of contents giving area name and l.c. equivalent. there is also a list of subjects with code numbers.
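the update rules given for lm010 in the programme descriptions can be sketched in a few lines. this is a modern illustration, not the original pl/i: the dictionary keyed by call number, the `Update` class, the message wording and the sample call numbers are all assumptions. only the three status codes and the reject or warn behaviour come from the text.

```python
from dataclasses import dataclass


@dataclass
class Update:
    status: str       # 'A' addition, 'C' change, 'D' deletion
    call_number: str  # key used to match records on the area master
    area: str         # l.c. area number, e.g. 'G5047'
    text: str = ""    # remainder of the record


def update_area_master(master, updates, valid_areas):
    """Apply updates to master (a dict keyed by call number); return messages."""
    messages = []
    for u in updates:
        on_file = u.call_number in master
        if u.status == "A" and on_file:
            messages.append(f"rejected {u.call_number}: addition already on master")
            continue
        if u.status in ("C", "D") and not on_file:
            messages.append(f"rejected {u.call_number}: no matching record on master")
            continue
        if u.status not in ("A", "C", "D"):
            messages.append(f"rejected {u.call_number}: unknown status {u.status!r}")
            continue
        if u.status == "D":
            del master[u.call_number]
            continue
        if u.area not in valid_areas:
            # Reported, but the record still goes onto the master file.
            messages.append(f"warning {u.call_number}: area {u.area} not in table")
        master[u.call_number] = u
    return messages


# Hypothetical run: one good addition, a duplicate, an unmatched deletion,
# and an addition whose area number is missing from the table.
master = {}
messages = update_area_master(
    master,
    [
        Update("A", "G5047 1959", "G5047"),
        Update("A", "G5047 1959", "G5047"),
        Update("D", "G8481 J80 1959", "G8481"),
        Update("A", "G9999 1960", "G9999"),
    ],
    valid_areas={"G5047", "G8481"},
)
```

the same shape would serve for lm030 if the key were the call number together with the subject code, since the text says the subject master is matched on both.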
section two is the same list in alphabetical order by area name (figure 4). section three (figure 5) is the subject listing. maps are arranged by l.c. subject code rather than alphabetically, which gives the advantage of grouping related subjects together. within each group maps are in class number (i.e. area) order. in the format for this section the l.c. alphanumeric code is given first, with the subject name in bold type for major groups and in regular type for the subdivisions. an alphabetical index of subjects refers the user to the page where his subject begins.

antigua ... g5047-g5050
g5047 1959 antigua, west indies (antigua island). 1:25,000: transverse mercator. great britain. directorate of overseas surveys, london, 1962. set of 2 maps.
fig. 4. alphabetical area list.

j80 industrial agricultural products
g8481 j80 1959 rhodesia and nyasaland tobacco (tobacco production ... rhodesia and nyasaland). 1:3 million. rhodesia and nyasaland. director of federal surveys, salisbury, 1961. federal atlas map no 20.
k forests and forestry
k10 forestry in general
fig. 5. classified subject list.

the call number in all three lists includes only the major subject of the map, but since a map may cover several, up to five additional or "minor" ones may be included when cataloguing. a single sheet may therefore appear under several headings in the subject section. this method is also used to catalogue a single sheet containing several separate maps.

evaluation

although some modifications may yet be made to the system, the catalogue has proven highly successful and possesses a number of advantages over existing manual and automated systems. its clear format and lack of symbols make it easy to use. it is issued each trimester in three copies, of which one is kept in the library. one copy is sent to the geography department, and one to the history department.
the work form is simple enough to be used by skilled non-professional help and as all punched cards are verified there are fewer errors than with card typing. some errors do occur, but in almost all instances the record is automatically rejected and corrections made. filing errors are non-existent.

few codes are needed for input and only the l.c. number, form and location are not readily understood in the printed catalogue. although language, scale, projection and subject are entered in code or short form, they appear in full on the lists. the codes used for form, language, and projection are very simple and reference to them is seldom necessary. main entry, author, and imprint are in variable length fields, allowing complete information to be given without codes or abbreviations. as main entry is the area name and imprint is by publisher rather than place, a gazetteer-index, and a list by publisher, as well as an author list or index, can be produced when required.

although not envisaged as an important element, the provision of a punched card for notes has been most valuable. if a map is withdrawn from a journal, or has an accompanying brochure filed elsewhere, this is stated. any further explanation necessary for an understanding of the map is also given.

since all elements on the first card are in fixed fields it is possible to obtain lists on demand by subject, date, scale, language, projection, etc. although the extent of the simon fraser collection makes this impractical now, its potential for preparing bibliographies and machine searching is apparent.

an analysis shows that initial costs were not excessive. the programming time of two months was the largest single item at $2,400.00. computer time and forms to produce the three listings totalled $110.00. the projected cost based on the present size of the collection is $280.00 per year, a figure which will increase as the collection grows.
keypunching and verifying time is approximately 2½ minutes per map. while this is of course a cost factor, it is done at slack time by the cataloguing department, whose operators are paid from $360.00 to $400.00 per month. in a manual system, an additional clerk at $3,564.00 would have been needed to type and file cards, and furniture for the cards would also have been required.

the disadvantages are now more evident than upon receipt of the first lists in june 1968. use of the classified section has been slight except by the library staff and it will be issued only once each year. the alphabetical area section is the most heavily used, but the arrangement of entries by l.c. code under each area is confusing. as the number of maps increases from one page to many the user finds it increasingly difficult to locate a thematic map. the third section, by subject, helps overcome this problem, but here again the list is by l.c. code, not alphabetical within each subject.

topographic series and sets were catalogued with one entry so the number of records was considerably less than the 25,000 sheets in the collection. archival and facsimile maps acquired since the system was designed have presented problems. the librarian and library assistant were new to map work; consequently the number of errors was high, and corrections and patching-up were time consuming and therefore costly.

conclusion

despite the less than perfect product, however, the results are worthwhile. first time users experience some difficulty with the classified arrangements but only a simple explanation is needed, and thereafter students are able to identify and locate most maps with little reference to staff. the geography department, and to a lesser extent the history department, do make use of their copies of the catalogue. telephone enquiries for holdings are minimal and some faculty have asked that they be given their own copies.
the simon fraser system is not expensive to operate, the catalogue could be issued more frequently at little extra cost, and the system uses a widely accepted classification scheme that is updated periodically. the programmes employed could be adapted by other libraries with few, if any, modifications and the system could be run on any computer.

there will be more sophisticated map catalogues, such as that of the library of congress, using marc ii format, and others which will take greater advantage of computer capabilities. extensive and costly research, however, will be needed to develop these systems. the simon fraser system is operating now, was developed in a very short time, and has had a successful first year of use.

references

1. murphy, mary: "will automation work for maps?" special libraries, 54 (november 1963), 563-567.
2. thomas, kenneth a., sr.: "the san juan island project: cataloguing maps by mechanized techniques," special libraries association, geography and map division. bulletin, 54 (december 1963), 8-12.
3. donkin, kate; goodchild, michael: "an automated approach to map utility," the cartographer, 4 (june 1967), 39-45.
4. stallings, david lloyd: automated map reference retrieval. a thesis submitted in partial fulfilment of the requirements for the degree of master of arts. (seattle, university of washington, 1966), p. 71.
5. donahue, joseph c.; hedges, charles p.: "cares: a proposed cartographic retrieval system," american documentation institute. proceedings, (1964), 137-140.
6. hagen, carlos b.: "an information retrieval system for maps," unesco. library bulletin, 20 (january-february 1966), 30-35.
7. easton, william w.: "automating the illinois state university map library," special libraries association, geography and map division. bulletin, 61 (march 1967), 3-9.
8. united states. library of congress.
subject cataloging division: classification, class g: geography, anthropology, folklore, manners and customs; 3rd ed. (washington: superintendent of documents, 1954), p. 502.
president’s message
thomas dowling
information technologies and libraries | september 2015 doi: 10.6017/ital.v34i3.8966 1

fall has arrived, faster than expected (as it always does). it seems like ala annual just wrapped up in san francisco, but we're already well underway with the coming year's activities.

the national forum 2015 will be here before you know it. in a fall season crowded with good technology conferences, lita forum consistently proves its value as a small, engaging, and focused meeting. technologists, strategists, and front-line librarians come together to discuss the tools they make and use to provide cutting edge library services. in addition to great lita programming, this year we're working with colleagues from llama (the library leadership and management association) to provide a set of programs focused on the natural cooperation of management and technologies in libraries. there are great preconferences on makerspaces and web analytics, keynote addresses, over 50 concurrent sessions, and a lot of networking opportunities. click on over to litaforum.org, and i hope to see you in minneapolis, november 12-15.

not too long after forum, we'll be in boston for midwinter, and then annual in orlando. the program planning committee is already at work selecting the best programs for annual. next summer is also the start of lita’s 50th anniversary celebrations!

of course, not everything we do involves travel and in-person meetings. lita’s fall schedule of webinars includes sessions on patron privacy, creative commons, personal digital archiving, and a second iteration of top technologies every librarian needs to know.

on the staff side, we are happy to say that jenny levine has started as lita’s new executive director. jenny comes to us from ala’s it and telecommunications services department, where she is still putting in some time bringing a new version of ala connect online. jenny and the governing board are already working together virtually: we are about to select our emerging leaders for the year and are working on an exercise to set divisional priorities, with an eye toward drafting a new strategic plan. the board will hold two online meetings this fall. as always, these are open meetings, so if you’re interested in your association’s governance, you’re welcome to sit in. watch the board’s area in connect for details, or look for upcoming posts to lita-l and litablog.org. and if you need to contact the board, you can reach us at http://www.ala.org/lita/about/board/contact.

i hope to meet as many lita members as possible this year, at one of the upcoming in-person meetings or online, or just drop me a line on connect. it’s going to be a great year for lita.

thomas dowling (dowlintp@wfu.edu) is lita president 2015-16 and director of technologies, z. smith reynolds library, wake forest university, winston-salem, north carolina.

editorial and technological workflow tools to promote website quality | morton-owens 91 emily g.
morton-owens
editorial and technological workflow tools to promote website quality

everard and galletta performed an experimental study with 232 university students to discover whether website flaws affected perception of site quality and trust. their three types of flaws were incompleteness, language errors (such as spelling mistakes), and poor style in terms of “ambiance and aesthetics,” including readable formatting of text. they discovered that subjects’ perception of flaws influenced their judgment of a site being high-quality and trustworthy. further, they found that the first perceived error had a greater negative impact than additional problems did, and they described website users as “quite critical, negative, and unforgiving.”5

briggs et al. did two studies of users’ likelihood of accepting advice presented on a website. of the three factors they considered—credibility, personalization, and predictability—credibility was the most influential in predicting whether users would accept or reject the advice. “it is clear,” they report, “that the look and feel of a web site is paramount in first attracting the attention of a user and signaling the trustworthiness of the site. the site should be . . . free of errors and clutter.”6

though none of these studies focuses on libraries or academic websites and though they use various metrics of trustworthiness, together they point to the importance of quality. text quality and functional usability should be important to library website managers. libraries ask users to entrust them to choose resources, answer questions, and provide research advice, so projecting competence and trustworthiness is essential.

it is a challenge to balance the concern for quality with the desire to update the website frequently and with librarians’ workloads. this paper describes a solution implemented in drupal that promotes participation while maintaining quality.
the editorial system described draws on the author’s prior experience working in book publishing at penguin and random house, showing how a system that ensures quality in print publishing can be adjusted to fit the needs of websites.

■■ setting

editing

most people think of editing in terms of improving the correctness of a document: fixing spelling or punctuation errors, fact-checking, and so forth. these factors are probably the most salient ones in the sense that they are

editor’s note: this paper is adapted from a presentation given at the 2010 lita forum.

library websites are an increasingly visible representation of the library as an institution, which makes website quality an important way to communicate competence and trustworthiness to users. a website editorial workflow is one way to enforce a process and ensure quality. in a workflow, users receive roles, like author or editor, and content travels through various stages in which grammar, spelling, tone, and format are checked. one library used a workflow system to involve librarians in the creation of content. this system, implemented in drupal, an open-source content management system, solved problems of coordination, quality, and comprehensiveness that existed on the library’s earlier, static website.

today, libraries can treat their websites as a significant point of user contact and as a way of compensating for decreases in traditional measures of library use, like gate counts and circulation.1 websites offer more than just a gateway to journals; librarians also can consider instructional or explanatory webpages as a type of public service interaction.2 as users flock to the web to access electronic resources and services, a library’s website becomes an increasingly prominent representation of the library. at the new york university health sciences libraries (nyuhsl), for example, statistics for the 2009–10 academic year showed 580,980 in-person visits for all five locations combined.
by comparison, the website received 986,922 visits. in other words, the libraries received 70 percent more website visits than in-person visits. many libraries conduct usability testing to determine whether their websites meet the functional needs of their users. a concern related to usability is quality: users form an impression of the library partly based on how it presents itself via the website. as several studies outside the library arena have shown, users’ experience of a website leads them to attribute characteristics of competence and trustworthiness to the sponsoring organization. tseng and fogg, discussing non-web computer systems, present “surface credibility” as one of the types of credibility affecting users. they suggest that “small computer errors have disproportionately large effects on perceptions of credibility.”3 in another paper by fogg et al., “amateurism” is one of seven factors in a study of website credibility. the authors recommend that “organizations that care about credibility should be ever vigilant—and perhaps obsessive—to avoid small glitches in their websites. . . . even one typographical error or a single broken link is damaging.”4 emily g. morton-owens (emily.morton-owens@med.nyu.edu) is web services librarian, new york university health sciences libraries, new york. 92 information technology and libraries | september 2011 happens when a page moves from one state to another. the very simple workflow in figure 1 shows two roles (author and editor) and three states (draft, approval, and published). there are two transitions with permissions attached to them. only the author can decide when he or she is done working and make the transition from draft to approval. only the editor can decide when the page is ready and make the transition from approval to published. (in these figures, dotted borders indicate states in which the content is not visible to the public.) 
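as a rough illustration, the two-role, three-state workflow of figure 1 can be sketched as a small transition table in python. this is a minimal sketch under stated assumptions, not nyuhsl's drupal configuration: the names `Page`, `TRANSITIONS`, `HIDDEN_STATES`, and `advance` are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass

# Transition table for the two-role, three-state workflow:
# (from_state, to_state) -> the one role allowed to make that transition.
TRANSITIONS = {
    ("draft", "approval"): "author",
    ("approval", "published"): "editor",
}

# States in which content is not visible to the public
# (the dotted-border states in the figures).
HIDDEN_STATES = {"draft", "approval"}


@dataclass
class Page:
    title: str
    state: str = "draft"

    def is_public(self) -> bool:
        return self.state not in HIDDEN_STATES


def advance(page: Page, to_state: str, role: str) -> None:
    """Move a page to a new state if the given role is permitted to do so."""
    allowed_role = TRANSITIONS.get((page.state, to_state))
    if allowed_role is None:
        raise ValueError(f"no transition from {page.state!r} to {to_state!r}")
    if role != allowed_role:
        raise PermissionError(
            f"only the {allowed_role} may move a page to {to_state!r}"
        )
    page.state = to_state


page = Page("library hours")
advance(page, "approval", role="author")   # the author decides the draft is done
advance(page, "published", role="editor")  # the editor decides it is ready
print(page.state, page.is_public())
```

keeping the permissions in one table, rather than scattering checks through the code, mirrors the advice to map out states, roles, and transitions before building anything.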
a book publishing workflow involves perhaps a dozen steps in which the manuscript passes between the author, his or her agent, and various editorial staff. a year can pass between receiving the manuscript and publishing the book. the reason for that careful, conservative process is that it is very difficult to fix a book once thousands of copies have been printed in hardcover. by contrast, consider a newspaper: a new version appears every day and contains corrections from previous editions. a newspaper workflow is hardly going to take a full year. a website is even more flexible than a newspaper because it can be fixed or improved at any time. the kind of multistep process used for books and newspapers is effective, but not practical for websites. a website should have a workflow for editorial quality control, but it should be proportional to the format in terms of the number of steps, the length of the process, and the number of people involved. alternate workflow models this paper focuses on a contributor/editor model in which multiple authors create material that is vetted by a central authority: the editor. other models could be implemented with much the same tools. for example, in a peer-review system as is used for academic journals, there is a reviewer role, and an article could have states like “published,” “under review,” “conditionally accepted,” and so forth. most noticeable when neglected. editors, however, have several other important roles. for example, they select what will be published. in book publishing, that involves rejecting the vast majority of material that is submitted. in many professional contexts, however, it means soliciting contributions and encouraging authors. either way, the editor has a role in deciding what topics are relevant and what authors should be involved. additionally, editors are often involved in presenting their products to audiences. 
in book publishing, that can mean weighing in on jacket designs or soliciting blurbs from popular authors. on websites, it might mean choosing templates or fonts. editors want to make materials attractive and accessible to the right audience. together, correctness, choice, and presentation are the main concerns of an editor and together contribute to quality.

each of these ideas can be considered in light of library websites. correctness means offering information that is current and free of errors, contradictions, and confusing omissions. it also means representing the organization well by having text that is well written and appropriate for the audience. writing for the web is a special skill; people reading from screens have a tendency to skim, so text should be edited to be concise and preferably organized into short chunks with “visible structure.”7 there is also good guidance available about using meaningful link words, action phrases, and “layering” to limit the amount of information presented at once.8 of course, correctness also means avoiding the kind of obvious spelling and grammar mistakes that users find so detrimental.

choice probably will not involve rejecting submissions to the website. instead, in a library context it could mean identifying information that should appear on the website and writing or soliciting content to answer that need.

presentation may or may not have a marketing aspect. a public library’s website may advertise events and emphasize community participation. as an academic medical library, nyuhsl has in some sense a captive audience, but it is still important to communicate to users that librarians understand their unique and high-level information needs and are qualified to partner with them.

workflow

a workflow is a way to assign responsibility for achieving the goals of correctness, choice, and presentation. it breaks the process down into steps that ensure the appropriate people review the material.
it also leaves a paper trail that allows participants to see the history and status of material. workflow can alleviate the coordination problems that prevent a website from exhibiting the quality it should. a workflow is composed of states, roles, and transitions. pages have states (like “draft” or “published”) and users have roles (like “contributor” or “editor”). a transition

figure 1. very basic workflow

effect was on the quality of the website, which contained mistakes and confusing information.

■■ methods

nyuhsl workflow and solutions

to resolve its web management issues, nyuhsl chose to work with the drupal content management system (cms). the ability to set up workflow and inventory content by date, subject, or author was a leading reason for that decision. other reasons included usability of the backend for librarians, theming options, the scripting language the cms uses (php), and drupal’s popularity with other libraries and other nyu departments.9 nyuhsl’s drupal environment has four main user roles:

1. anonymous: these are visitors to the nyuhsl site who are not logged in (i.e., library users). they have no permissions to edit or manage content. they have no editorial responsibilities.
2. library staff: this group includes all the staff content authors. their role is to notice what content library users need and to contribute it. staff have been encouraged to view website contributions as something casual—more akin to writing an e-mail than writing a journal article.
3. marketing team: this five-member group checks content that will appear on the homepage. their mandate is to make sure that the content is accurate about library services and resources and represents the library well. its members include both librarians and staff with relevant experience.
4.
administrators: there are three site admins; they have the most permissions because they also build the site and make changes to how it works. two of the three admins have copyediting experience from prior jobs, so they are responsible for content approvals. they copyedit for spelling, grammar, and readability. admins also check for malformed html created by the wysiwyg (what you see is what you get) interface provided for authors, and they use their knowledge of other material on the site to look out for potential conflicts or add relevant links.

returning to the themes of correctness, choice, and presentation, it could be said that librarian authors are responsible for choice (deciding what to post), the marketing team is responsible for choice and presentation, and the administrators are responsible for all three.

an important thing to understand is that each person in a role has the same permissions, and any one of

in an upvoting system like reddit (http://reddit.com), content is published by default, any user has the ability to upvote (i.e., approve) a piece of content, and the criterion for being featured on the front page is the number of approvals. in a moderation system, any user can submit content and the default behavior is for the moderator to approve anything that is not outright offensive. the moderator never edits, just chooses the state “approved” or the state “denied.” moderation is often used to manage comments.

another model, not considered here, is to create separate “staging” and “production” websites. content and features are piloted on the staging site before being pushed to the live site. (nyuhsl’s workflow occurs all on the live site.) still, even in a staging/production system the workflow is implicit in choosing someone who has the permission and responsibility to push the staging site to the production site.

problems at nyuhsl

in 2007, the web services librarian position at nyuhsl had been open for nearly a year.
librarians who needed to post material to the website approached the head of library systems or the “sysadmin.” both of them could post pages, but they did not proofread. pages that became live on the website stayed: they were never systematically checked. if a librarian or user noticed a problem with a page, it was not clear who had the correct information or was responsible for fixing it. often, pages that were found to be out-of-date would be delinked from other pages but were left on the server and thus findable via search engines or bookmarks. because only a few people had ftp access to the server, but authored little content, the usernames shown on the server were useless for determining who was responsible for a page. similarly, timestamps on the server were misleading; someone might fix one link on a page without reviewing the rest of it, so the page could have a recent timestamp but be full of outdated information.

even after a new web services librarian started in 2007, problems remained. the new librarian took over sole responsibility for posting content, which made the responsibility clearer but created a bottleneck, for example, if she went on vacation. furthermore, in a library with five locations and about sixty full-time employees, it was hard for one person to do justice to all the libraries’ activities. if a page required editing, there was no way to keep track of whose turn it was to work on the document. there also was no automatic notification when a page was published. this made it possible for content to go astray and be forgotten.

these problems added up to frustration for would-be content authors, a time drain for systems staff, and less time to create new content and sites. the most significant

at the top of the homepage. their appearance should not be delayed, so any staff author can publish one. class sessions are specific dates, times, and locations that a class is being offered.
these posts are assembled from prewritten text, so there is no way to introduce errors and no reason to route them through an approval step.

figure 2 illustrates the main steps of the three cases. the names of the states are shown with arrows indicating which role can make each transition. unlabeled arrows mean that any staff member can perform that step.

figure 3 shows how, at each approval step, content can be sent back to the author (with comments) for revision. although this happens rarely, it is important to have a way to communicate with the author in a way that is traceable by the workflow.

figure 4 illustrates the concept of retirement. nyuhsl needed a way to hide content from library users and search engines, but it is dangerous to allow library staff to delete content. also, old content is sometimes useful to refer to or can even be republished if the need arises. any library staff user can retire content if they recognize it as no longer relevant or appropriate. additionally, library staff can resurrect retired content by resetting it to the draft state. that is, they cannot directly publish retired content (because they do not have permission to publish), but they can put it back on the path to being published by saving it as a draft, editing, and resubmitting for approval.

figure 5 shows that library staff do not really need to understand the details of workflow. for any new content, they only have two options: keep the content in the draft state or move it on to whatever next step is available. all

them can perform an action. the five marketing team members do not vote on the content, nor do they all have to approve it; instead, any one of them who happens to be at his or her workstation when the notification arrives is sufficient to perform the marketing team duty. also, the marketing team members and administrators do not “self-approve”—no matter how good an editor someone may be, he or she is rarely good at editing his or her own work.
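the two approval rules just described (any single member of a role is sufficient, and nobody approves his or her own work) can be expressed as one small check. the following python sketch is illustrative only: the role membership and user names are invented placeholders, and `can_approve` is a hypothetical helper, not a drupal function.

```python
# Hypothetical role membership; the names are placeholders, not real staff.
ROLE_MEMBERS = {
    "marketing team": {"ana", "ben", "chris", "dana", "eli"},
    "administrators": {"ana", "fay", "gus"},
}


def can_approve(user: str, role: str, author: str) -> bool:
    """Return True if `user` may perform the approval step owned by `role`.

    Any single member of the role is sufficient (there is no voting),
    but a member may not approve content that he or she wrote.
    """
    return user in ROLE_MEMBERS.get(role, set()) and user != author


print(can_approve("ben", "marketing team", author="ana"))  # a different team member
print(can_approve("ana", "marketing team", author="ana"))  # attempted self-approval
print(can_approve("zoe", "marketing team", author="ana"))  # not on the team
```

because every role in this sketch has more than one member, the self-approval rule never leaves content without an eligible approver.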
nyuhsl’s workflow considers three cases:

1. most types of content are reviewed by one of the administrators before going live.
2. content types that appear on the homepage (i.e., at higher visibility) are reviewed by a member of the marketing team before being reviewed by an administrator.
3. two types of content do not go through any workflow. alerts are urgent messages that appear in red

figure 2. approval steps

figure 3. returning contents for edits

figure 4. retirement

this may sound like a large volume of e-mail, but it does not appear to bother library staff. the subject line of every e-mail generated by the system is prefaced with “[hsl site]” for easy filtering. also, every e-mail is signed with “love, the nyuhsl website.” this started as a joke during testing but was retained because staff liked it so much. one described it as giving the site a “warm, fuzzy feeling.”

drupal modules

nyuhsl developers used a number of different drupal modules to achieve the desired workflow functionality. a simple system could be achieved using fewer modules; the book using drupal offers a good walkthrough of workflow, actions, and trigger.10 of course, it also would be possible to implement these ideas in another cms or in a homegrown system. this list does not describe how to configure each module because the features are constantly evolving; more information is available on the drupal website.11 the drupal modules used include:

■■ workflow
■■ actions
■■ trigger
■■ token
■■ module grants
■■ wysiwyg, imce, imce wysiwyg api bridge
■■ node expire
■■ taxonomy role
■■ ldap integration
■■ rules

■■ results

participation

figure 6 shows the number of page revisions per person from july 14, 2009, to november 4, 2010.
since many pages are static and were created only once, but need to be updated regularly, a page creation and a page update count equally in this accounting, which was drawn from the node_revisions table in drupal. it gives a general sense of content-related activity. a reasonable number of staff have logged in, including all of the librarians and a number of staff in key positions (such as branch managers). the black bars represent the administrators of the website. it is clear that the workflow system, while broadening participation, has hardly diffused primary responsibility of managing the website. the web services librarian and web manager have by far the most page edits, as they both write new content and edit content written by all other users. of the other options are hidden because staff do not have permission to perform them. the status of content in the workflow can be checked by clicking on the workflow tab of each page, but it also is tracked by notification e-mails. when the content enters a state requiring an approval, each person in that approving role gets an e-mail letting them know something needs their attention. the e-mail includes a link directly to the editing page. for example, if a librarian writes a blog post and changes its state from “draft” to “ready for marketing approval,” he or she gets a confirmation e-mail that the post is in the marketing approval queue. the marketing team members each get an e-mail asking them to approve the post; only one needs to do so. once someone has performed that approval, the marketing team members receive an e-mail letting them know that no further action is required. now the content is in the “ready for approval” state and the author gets another e-mail notification. the administrators get a notification with a link to edit the post. once an administrator gives the post final approval, the author gets an e-mail indicating that the post is now live. 
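the notification sequence for a homepage blog post can be summarized as a lookup from the state a post has just entered to the e-mails that go out. the python sketch below is an assumption-laden illustration, not the nyuhsl configuration: the function name `notifications` and the message wording are hypothetical, while the state names and the “[hsl site]” subject prefix come from the paper.

```python
# Subject prefix that the paper says every system-generated e-mail carries.
SUBJECT_PREFIX = "[hsl site]"

# For each state a post can enter: who is e-mailed, in order, and about what.
# The message wording is invented for illustration.
MESSAGES = {
    "ready for marketing approval": [
        ("author", "your post is in the marketing approval queue"),
        ("marketing team", "please approve this post"),
    ],
    "ready for approval": [
        ("marketing team", "no further action is required"),
        ("author", "your post is ready for approval"),
        ("administrators", "please edit and approve this post"),
    ],
    "live": [
        ("author", "your post is now live"),
    ],
}


def notifications(new_state: str, title: str) -> list[tuple[str, str]]:
    """Return (recipient, subject) pairs for content entering `new_state`."""
    return [
        (recipient, f"{SUBJECT_PREFIX} {text}: {title}")
        for recipient, text in MESSAGES.get(new_state, [])
    ]


for state in ("ready for marketing approval", "ready for approval", "live"):
    for recipient, subject in notifications(state, "flu shot clinic"):
        print(f"{state} -> {recipient}: {subject}")
```

states with no entry in the table, such as “draft,” simply generate no mail.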
the nyuhsl website workflow system also includes reminders. each piece of content in the system has an author (authorship can be reassigned, so it is not necessarily the person who originally created the page). the author receives an e-mail every four months reminding him or her to check the content, revise it if necessary, and re-save it so that it gets a new timestamp. if the author does not do so, he or she will continue to get reminders until the task is complete. also, the site administrators can refer to a list of content that is out of date and can follow up in person if needed. note that reminders only apply to static content types like pages and faqs, not to blog posts or event announcements, which are not expected to have permanent relevance.

figure 5. workflow choices for library staff users

check the status by clicking on the workflow tab. this eliminates the discouraging mystery of having content get lost on the way to being published.

■■ identifying “problem” content: the node expire module has been modified to send e-mail reminders about stale content; as a result, this “problem”

figure 7 shows the distribution of content updates once the web team members have been removed. it is clear that a small number of heroic contributors are responsible for the bulk of new content and updates, with other users logging on sporadically to address specific needs or problems.

how editorial workflow addresses nyuhsl’s problems

different aspects of the nyuhsl editorial workflow address different website problems that existed before the move to a cms. together, the workflow features create a clearly defined track that marches contributed content along a path to publication while always making the history and status of that content clear.

■■ keeping track of who wrote what when: this information is collected by the core drupal software and visible on administrative pages.
(drupal also can be customized to display or sort this information in more convenient ways.)
■■ preventing mistakes and inconsistencies: this requires a human editor, but drupal can be used to formalize that role, assign it to specific people, and ensure nothing gets published without being reviewed by an editor.
■■ bottlenecks: nyuhsl eliminated bottlenecks that stranded content waiting for one person to post it by creating roles with multiple members, any one of whom can advance content to the next state. there is no step in the system that can be performed by only one person.
■■ knowledge: the issue of having too much going on in the library for one person to report on was addressed by making it easier for more people to contribute. drupal encourages this through its usability (especially a wysiwyg editor), and workflow makes it safe by controlling how the contributions are posted.
■■ “lost” content: when staff contribute content, they get e-mail notifications about its status and also can

figure 6. number of revisions by user. each user is indicated by their employee type rather than by name.

figure 7. number of revisions by user, minus web team. each user is indicated by their employee type rather than by name.

places web content in the context of other communication methods, like e-mail marketing, press releases, and social media.12 in her view, it is not enough to consider a website on its own; it has to be part of a complete strategy for communicating with an organization’s audience. libraries embarking on a website redesign would benefit from contemplating this larger array of strategic issues in addition to the nitty-gritty of creating a process to ensure quality.

■■ conclusions

nyuhsl differs from other libraries in its size, status as an academic medical library, level of it staffing, and other ways.
some aspects of nyuhsl’s experience implementing editorial workflow will, however, likely be applicable to other libraries.

it does not necessarily make sense to assign editorial responsibility to it staff; instead, there may be someone on staff who has editorial or journalistic experience and could serve as the content approver. many universities offer short copyediting courses, and a prospective website editor could attend such a course.

implementing a workflow system, especially in drupal, requires a lot of detailed configuration. developers should make sure the workflow concept is clearly mapped out in terms of states, roles, and transitions before attempting to build anything. workflow can seem complicated to users too, so developers should endeavor to hide as much as possible from nonadministrators.

small mistakes in drupal settings and permissions can cause confusing failures in the workflow system. for example, a user may find himself or herself unable to advance a blog post from “draft” to “ready for approval,” or a state change from “ready for approval” to “live” may not actually cause the content to be published. it would save time in the long run to thoroughly test all the possibilities with volunteers who play each role before the site is in active use.

finally, when the workflow is in place, the website’s managers may find themselves doing less writing and fewer content updates. they have a new role, though: to curate the site and support staff who use the new tools.

the concept of editing is not yet consistently applied to websites unless the site represents an organization that already relies on editors (like a newspaper)—but it is gaining recognition as a best practice. if the website is the most readily available public face of an institution, it should receive editorial attention just as a brochure or fundraising letter would.
workflow is one way that libraries can promote a higher level of quality and perceived competence and reliability through their website presence. content is usually addressed by library staff without the administrators/editors doing anything at all. the administrators also can access a page that lists all the content that has been marked as “expired” so they know with whom to follow up. ■■ outdated content: some content may be outdated and undesirable to show the public or be indexed by search engines, but be useful to librarians. it also is not safe to allow staff to delete content, as they may do so by accident. these issues are addressed by the notion of “retiring” content, which hides content by unpublishing it but does not delete it from the system. ■■ future work the workflow system sets up an environment that achieves nyuhsl’s goals, structurally speaking, but social (nontechnology) considerations prevent it from living up to its full potential. not all of the librarians contribute regularly. this is partly because they are busy, and writing web content is not one of their job requirements. another reason is that some staff are more comfortable using the system than others, a phenomenon that reinforces itself as the expert users spend more time creating content and become even more expert. a third cause is that not all librarians may perceive that they have something useful to say. reluctant contributors have no external motivation to increase their involvement. it would be helpful to formalize the role of librarians as content contributors. there is presently no librarian at nyuhsl whose job description includes writing content for the website; even the web services librarian is charged only with “coordinating, designing, and maintaining” sites. ideally, every librarian job description would include working with users and would mention writing website content as an important forum for that. 
that said, it is not clear what metric could be used to judge the contributions fairly. it also is important to continue to emphasize the value of content contributions so that librarians are motivated and feel recognized. even librarians whose specialties are not outreach-oriented (e.g., systems librarians) have expert knowledge that could be shared in, say, a short article on how to set up rss feeds. workflow is part of a group of concerns being called “content strategy.” this concept, which has grown in popularity since 2008, includes editorial quality alongside issues like branding/messaging, search engine optimization, and information architecture. a content strategist would be concerned with why content is meaningful in addition to how it is managed. in her brief, useful book on the topic, kristina halvorson 98 information technology and libraries | september 2011 5. andrea everard and dennis f. galletta, “how presentation flaws affect perceived site quality, trust, and intention to purchase from an online store,” journal of management information systems 22 (2005–6): 79. 6. pamela briggs et al., “trust in online advice,” social science computer review 20 (2002): 330. 7. patrick j. lynch and sarah horton “online style,” web style guide, 3rd ed., http://webstyleguide.com/wsg3/9-editorial-style/3-online-style.html (accessed dec. 1, 2010). 8. janice (ginny) redish, letting go of the words: writing web content that works (san francisco: morgan kaufman, 2007). 9. emily g. morton-owens, karen l. hanson, and ian walls, “implementing open-source software for three core library functions: a stage-by-stage comparison,” journal of electronic resources in medical libraries 8 (2011): 1–14. 10. angela byron et al., using drupal (sebastopol, calif.: o’reilly, 2008). 11. all drupal modules can be found via http://drupal.org/ project/modules. 12. kristina halvorson, content strategy for the web (berkeley, calif.: new riders, 2010). 
■■ acknowledgments
thank you to jamie graham, karen hanson, dorothy moore, and vikram yelanadu.
references
1. charles martell, “the absent user: physical use of academic library collections and services continues to decline 1995–2006,” journal of academic librarianship 34 (2008): 400–407.
2. jeanie m. welch, “who says we’re not busy? library web page usage as a measure of public service activity,” reference services review 33 (2005): 371–79.
3. b. j. fogg and hsiang tseng, “the elements of computer credibility” (paper presented at chi ’99, pittsburgh, pennsylvania, may 15–20, 1999): 82.
4. b. j. fogg et al., “what makes web sites credible? a report on a large quantitative study” (paper presented at sigchi ’01, seattle, washington, mar. 31–apr. 4, 2001): 67–68.

what technology skills do developers need? a text analysis of job listings in library and information science (lis) from jobs.code4lib.org
monica maceli
information technology and libraries | september 2015 8
abstract
technology plays an indisputably vital role in library and information science (lis) work; this rapidly moving landscape can create challenges for practitioners and educators seeking to keep pace with such change. in pursuit of building our understanding of currently sought technology competencies in developer-oriented positions within lis, this paper reports the results of a text analysis of a large collection of job listings culled from the code4lib jobs website. beginning more than a decade ago as a popular mailing list covering the intersection of technology and library work, the code4lib organization's current offerings include a website that collects and organizes lis-related technology job listings.
the results of the text analysis of this dataset suggest the currently vital technology skills and concepts that existing and aspiring practitioners may target in their continuing education as developers.

monica maceli (mmaceli@pratt.edu) is assistant professor, school of information and library science, pratt institute, new york. doi: 10.6017/ital.v34i3.5893

introduction
for those seeking employment in a technology-intensive position within library and information science (lis), the number and variation of technology skills required can be daunting. the need to understand common technology job requirements is relevant to current students positioning themselves to begin a career within lis, those currently in the field who wish to enhance their technology skills, and lis educators. the aim of this short paper is to highlight the skills and combinations of skills currently sought by lis employers in north america through textual analysis of job listings. previous research in this area explored job listings through various perspectives, from categorizing titles to interviewing employers;1,2 the approach taken in this study contributes a new perspective to this ongoing and highly necessary work. this research report seeks a further understanding of the following research questions:
• what are the most common job titles and skills sought in technology-focused lis positions?
• what technology skills are sought in combination?
• what implications do these findings have for aspiring and current lis practitioners interested in developer positions?
as detailed in the following research method section, this study addresses these questions through textual analysis of relevant job listings from a novel dataset: the job listings from the code4lib jobs website (http://jobs.code4lib.org/). code4lib began more than a decade ago as an electronic discussion list for topics around the intersection of libraries and technology.3 over time, the code4lib organization expanded to an annual conference in the united states, the code4lib journal, and most relevant to this work, an associated jobs website that highlights jobs culled from both the discussion list and other job-related sources. figure 1 illustrates the home page of the code4lib jobs website; the page presents job listings and associated tags, with the tags facilitating navigation and viewing of other related positions. users may also view positions geographically or by employer.

figure 1. homepage of the code4lib jobs website, displaying most recently posted jobs and the associated tags.4

in addition to the visible user interface for job exploration, the website consists of software to gather the job listings from a variety of sources. the website incorporates jobs posted to the code4lib discussion list, american library association, canadian library association, australian library and information association, highered jobs, digital koans, idealist, and archivesgig. this broad incoming set of jobs provides a wide look into new technology-related postings. new job listings are automatically added to a queue to be assessed and tagged by human curators before posting.
this allows manual intervention, in which a curator assesses whether the job is relevant to technology in the library domain and validates the job listing information and metadata (see figure 2). curating is done on a volunteer basis, and curators are asked to assess whether the position is relevant to the code4lib community and whether it is unique, and to ensure that it has an associated employer, set of tags, and descriptive text. combining both software processes and human intervention in the job assessment results in the ability to gather a large number of jobs of high relevance to the code4lib community. as mentioned earlier, code4lib’s origins are in the area of software development and design as applied in lis contexts. these foci mean that most jobs identified as relevant for inclusion in the code4lib jobs dataset are oriented toward developer activities. the code4lib jobs website therefore provides a useful and novel dataset within which to understand current employment opportunities relating to the intersection between technology (particularly developer work) and the lis field.

figure 2. code4lib job curators interface where job data is validated and tags assigned.5

research method
to analyze the job listing data in greater depth, a textual analysis was conducted using the r statistical package, exploring job titles and descriptions.6 first, the job listing data from the most recent complete year (2014) were exported from the database backend of the code4lib jobs website; this dataset contained 1,135 positions in total.
the dataset included the job titles, descriptions, location and employer information, as well as tags associated with the various positions. the text was then cleaned to remove any markup tags or special characters that remained from the scraping of listings. finally, the tm (text mining) package in r was used to calculate term frequencies and correlations, generate plots, and cluster terms across both job titles and descriptions.7

results
job title analysis
of the full set of 1,135 positions, 30 percent were titled as a librarian position; popular specialties included systems librarian and various digital collections and curation-oriented librarian titles. figures 3 and 4 detail the most common terms used in position titles across librarian and nonlibrarian positions.

figure 3. most common terms used in librarian position titles.
[figure 3 data, "top title terms, librarian positions," as recovered in order from the chart labels: librarian (345), digital (89), systems (63), services (59), metadata (34), data (29), technologies (25), university (25), technology (23), web (21), electronic (20), resources (20), assistant (18), information (18), emerging (16), scholarship (14), collections (13), library (13), management (13), initiatives (12), sciences (12), cataloging (11), projects (11), research (11), professor (10)]

figure 4. most common terms used in nonlibrarian position titles.

the most popular job title terms were then clustered using ward’s agglomerative hierarchical method (dendrogram in figure 5).
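the cleaning and term-counting steps described in the research method can be illustrated with a small sketch. the study itself used the tm package in r; the python below is only an analogous, simplified illustration, and the function names and any sample titles passed to it are hypothetical rather than taken from the code4lib dataset.

```python
import re
from collections import Counter
from html import unescape

TAG_RE = re.compile(r"<[^>]+>")            # leftover markup tags
NON_WORD_RE = re.compile(r"[^a-z0-9\s]")   # special characters and punctuation


def clean(text):
    """decode html entities, strip tags and special characters, and tokenize."""
    text = unescape(text)
    text = TAG_RE.sub(" ", text)
    text = NON_WORD_RE.sub(" ", text.lower())
    return text.split()


def title_term_counts(titles):
    """count how often each term occurs across all job titles."""
    counts = Counter()
    for title in titles:
        counts.update(clean(title))
    return counts


def share_with_term(titles, term):
    """fraction of titles that contain the given term after cleaning."""
    if not titles:
        return 0.0
    hits = sum(1 for title in titles if term in clean(title))
    return hits / len(titles)
```

calling `title_term_counts(titles).most_common(25)` on a list of titles would give a ranking of the kind charted in figures 3 and 4, and `share_with_term(titles, "librarian")` corresponds to the share of positions titled as librarian.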
agglomerative hierarchical clustering, of which ward’s method is a widely used variant, begins with single-item clusters, then repeatedly identifies and joins the most similar clusters until, in the final stage, a single cluster containing all items is formed. commonly used in text analysis, this allows the investigator to explore datasets in which the number of clusters is not known before the analysis. the dendrograms generated (e.g., figure 5) allow for visual identification and interpretation of closely related terms representing various common positions, e.g., digital librarian, software engineer, collections management, etc. given that job titles in listings may include extraneous or infrequent words, such as the organization name, the cluster analysis can provide an additional view into common job titles across the full dataset in a more generalized fashion.

[figure 4 data, "top title terms, non-librarian positions," as recovered in order from the chart labels: digital (182), developer (141), library (116), manager (90), specialist (86), software (68), web (65), archivist (59), services (59), technology (59), engineer (55), director (52), data (49), systems (49), analyst (40), coordinator (40), information (40), senior (40), metadata (38), administrator (35), lead (34), project (34), head (33), programmer (32), research (24)]

figure 5. cluster dendrogram of terms used in job titles generated using ward's agglomerative hierarchical method.

tag analysis
as described earlier, the code4lib jobs website allows curators to validate and tag jobs before listing. the word cloud in figure 6 displays the most common tags associated with positions, with xml being the most popular tag (178 occurrences). figure 7 contains the raw frequency counts of common tags observed.
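returning to the clustering of job title terms, the agglomerative procedure (start with single-item clusters, then repeatedly join the two closest) can be sketched in a few lines. this is a minimal, unoptimized illustration of ward's minimum-variance criterion in plain python, not the r code used in the study; the article does not say which ward variant or which term vectors were used, so the input format here (one numeric vector per term) is an assumption.

```python
def ward_cost(cluster_a, cluster_b):
    """increase in within-cluster sum of squares if the two clusters merge."""
    size_a, centroid_a = cluster_a
    size_b, centroid_b = cluster_b
    dist_sq = sum((x - y) ** 2 for x, y in zip(centroid_a, centroid_b))
    return size_a * size_b / (size_a + size_b) * dist_sq


def ward_cluster(vectors):
    """cluster labelled vectors; return the merges in the order they happen.

    vectors: dict mapping a label (e.g. a title term) to a list of floats.
    returns a list of (labels_a, labels_b, cost) tuples, one per merge.
    """
    # every item starts as its own cluster: (size, centroid)
    clusters = {(label,): (1, list(vec)) for label, vec in vectors.items()}
    merges = []
    while len(clusters) > 1:
        keys = list(clusters)
        best = None
        # find the pair whose merge increases total variance the least
        for i in range(len(keys)):
            for j in range(i + 1, len(keys)):
                cost = ward_cost(clusters[keys[i]], clusters[keys[j]])
                if best is None or cost < best[0]:
                    best = (cost, keys[i], keys[j])
        cost, key_a, key_b = best
        size_a, centroid_a = clusters.pop(key_a)
        size_b, centroid_b = clusters.pop(key_b)
        size = size_a + size_b
        # the merged centroid is the size-weighted mean of the two centroids
        centroid = [(size_a * x + size_b * y) / size
                    for x, y in zip(centroid_a, centroid_b)]
        clusters[key_a + key_b] = (size, centroid)
        merges.append((key_a, key_b, cost))
    return merges
```

the list of merges, read from first to last, is the information a dendrogram such as figure 5 displays: early, low-cost merges are the tightly related terms, and the final merge joins everything into one cluster.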
figure 6. word cloud of most frequent tags associated with job listings by curators.

figure 7. frequency of commonly occurring tags (frequency of fifty occurrences or more) in the 2014 job listings.
[figure 7 data, "frequency of tags, 2014 job listings," as recovered in order from the chart labels: xml (178), javascript (155), php (152), metadata (142), html (125), archive (119), cascading style sheets (114), python (106), integrated library system (101), java (99), mysql (90), dublin core (90), marc standards (89), encoded archival description (89), ruby (86), drupal (82), project management (79), sql (78), metadata object description standard (70), data management (70), gnu/linux (69), digital preservation (69), perl (66), digital library (63), xsl transformations (62), resource description and access (54), digital repository (53), world wide web (51), management (51), dspace (50), mets (50)]

job description analysis
the job description text was then analyzed to explore commonly co-occurring technology-related terms, focusing on frequent skills required by employers. figures 8, 9, and 10 plot term correlations and interconnectedness. terms with correlation coefficients of 0.3 or higher were chosen for plotting; this commonly used threshold broadly included terms whose positive relationship strength ranged from moderate to strong. plots were created to express correlations around the top five terms identified from the tags: xml, javascript, php, metadata, and html (frequencies in figure 7). any number of terms and frequencies can be plotted from such a dataset; to orient the findings closely around the job listing text, a focus on the top terms was chosen.
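the correlation threshold described above can be made concrete with a short sketch. it assumes each job description has already been tokenized into a list of terms, represents every term by its count in each description, and keeps the terms whose pearson correlation with a target term is 0.3 or higher. this is only an approximation in python of the kind of term-association query the study ran in r; the function names are hypothetical and the 0.3 default simply mirrors the threshold reported in the article.

```python
from math import sqrt


def pearson(xs, ys):
    """pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a term with constant counts carries no correlation signal
    return cov / sqrt(var_x * var_y)


def term_vector(term, documents):
    """number of times the term occurs in each tokenized document."""
    return [doc.count(term) for doc in documents]


def associated_terms(target, documents, threshold=0.3):
    """terms whose correlation with the target term meets the threshold."""
    vocabulary = sorted({term for doc in documents for term in doc})
    target_vec = term_vector(target, documents)
    result = {}
    for term in vocabulary:
        if term == target:
            continue
        r = pearson(target_vec, term_vector(term, documents))
        if r >= threshold:
            result[term] = round(r, 2)
    return result
```

running `associated_terms("xml", documents)` over the cleaned descriptions would return the set of terms that a plot such as figure 8 draws around "xml"; raising the threshold keeps only the stronger relationships.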
these plots illustrate the broader set of skills related to these vital competencies represented in the job listings.

figure 8. job listing terms correlated with “xml” (most popular tag).

figure 9. job listing terms correlated with “javascript” (second most popular tag), including “php” and “html” (third and fifth most popular tags, respectively).

figure 10. job listing terms correlated with “metadata” (fourth most popular tag).

finally, a series of general plots was created to visualize the broad set of skills necessary in fulfilling the positions of interest to the code4lib community. as detailed in the title analysis (figures 3 and 4), apart from the generic term librarian, the two most common terms across all job titles were digital and developer. correlation plots were created to detail the specific skills and requirements commonly sought in positions using such terms. figure 11 illustrates the terms correlated with the general term of developer, while figure 12 displays terms correlated with digital. the implications of these findings will be discussed further in the following discussion section.

figure 11. job listing terms correlated with “developer.”

figure 12. job listing terms correlated with “digital.”

discussion
taken as a whole, the job listing dataset covered a dramatic range of positions, from highly technical (e.g., senior-level software engineer or web developer) to managerial and leadership roles (e.g., director or department head roles centered on digital services or emerging technologies). these findings support the suggestions of earlier research,8 which advocated for lis graduate programs to build their offerings not just in technology skills but also in technology management and decision-making. however, the code4lib jobs dataset is a one-dimensional view into the employment process and is focused largely on the developer perspective. additional contextual information, including whether suitable candidates were easily identified and whether the position was successfully filled, would provide a more complete view of the employment process. prior research has indicated that many technology-related positions in lis are in fact difficult to fill with lis graduates.9 while lis graduate programs have made great strides in increasing the number of courses and topics covered that address technology, these improvements may not benefit those already in the field or wishing to shift towards a more technology-focused position. in the common tags and terms analysis, experience with specific lis applications was relatively infrequently required, with the drupal content management system a notable exception.
more generalizable programming languages or concepts (e.g., python, relational databases, and xml) were favored; as with technology positions outside of the lis domain, employers likely seek those with the ability to flexibly apply their skills across various tools and platforms. this may also relate to the above challenges in filling such positions with lis graduates, with the goal of opening up the position to a larger technologist applicant base. common web technologies popular in the open-source software often favored by lis organizations continued to dominate, with a clear preference for candidates well versed in html, css, javascript, and php. relating to these skills, web development and design practices were often intertwined, with positions requesting both developer-oriented skillsets and interface design (e.g., figure 7). technologies supporting modern web application development and workflow management were evident as well, e.g., common requirements for experience with versioning systems such as git, popular javascript libraries, and development frameworks. also striking was the richness of the terms correlated with metadata (figure 10), including mention of growing areas of expertise, such as linked data. interestingly, the general correlation plots expressing the common terms sought around “digital” and “developer” positions were quite varied. while the developer plot (figure 11 above) provided a richly technical view into common technologies broadly applied in web and software development, the terms correlated around digital were notably less technical (figure 12 above).
while there was a clear focus on digital preservation activities and common standards in this area, mention of terms such as “grant” indicated that these positions likely have a broad role. the term digital was frequently observed in librarian job titles, so these roles may be tasked with both technical and administrative work. finally, there are inherent difficulties in capturing all jobs relating to technology use in the lis domain that introduce limitations into this study. while the incoming job feeds attempt to broadly capture recent job posts, it is possible that jobs are missed or overlooked by the job curators. given the lack of one centralized job-posting source regardless of the field, this is a common challenge to research work attempting to assess every job posting. and as mentioned above, there is also a lack of corresponding data as to whether these jobs are successfully filled and what candidate backgrounds are ultimately chosen (i.e., from within or outside of lis).

conclusion
this assessment of the in-demand technology skills provides students, educators, and information professionals with useful direction in pursuing technology education or strengthening their existing skills. there are myriad technology skills, tools, and concepts in today’s information environments. reorienting the pursuit of knowledge in this area around current employer requirements can be useful in professional development, new course creation, and course revision.
the constellations of correlated skills presented above (figures 8–12) and popular job tags (figure 7) describe key areas of technology competencies in the diverse areas of expertise presently needed, from web design and development to metadata and digital collection management. in addition to the results presented in this paper, the code4lib job website provides a continuously current view into recent jobs and related tags; this data can help those in the lis field orient professional and curricular development toward real employer needs.

acknowledgements
the author would like to thank ed summers of the maryland institute for technology in the humanities for generously providing the jobs.code4lib.org dataset for analysis.

references
1. janie m. mathews and harold pardue, “the presence of it skill sets in librarian position announcements,” college & research libraries 70, no. 3 (2009): 250–57, http://dx.doi.org/10.5860/crl.70.3.250.
2. vandana singh and bharat mehra, “strengths and weaknesses of the information technology curriculum in library and information science graduate programs,” journal of librarianship & information science 45, no. 3 (2013): 219–31, http://dx.doi.org/10.1177/0961000612448206.
3. “about,” code4lib, accessed january 6, 2014, http://jobs.code4lib.org/about/.
4. “code4lib jobs: all jobs,” code4lib jobs, accessed january 12, 2015, http://jobs.code4lib.org/.
5. “code4lib jobs: curate,” code4lib jobs, accessed january 17, 2015, http://jobs.code4lib.org/curate/.
6. r core team, r: the r project for statistical computing, 2014, http://www.r-project.org/.
7. ingo feinerer and kurt hornik, “tm: text mining package,” 2014, http://cran.r-project.org/package=tm.
8. meredith g. farkas, “training librarians for the future: integrating technology into lis education,” in information tomorrow: reflections on technology and the future of public & academic libraries, edited by rachel singer gordon, 193–201 (medford, nj: information today, 2007).
9. mathews and pardue, “the presence of it skill sets in librarian position announcements.”

editorial | truitt 3
within the last few months, two provocative books have been published that take different approaches to the question of how we learn in the always-on, always-connected electronic environment of “screens.” while neither is specifically directed at librarians, i think both deserve to be read and discussed widely in our community.
■■ the shallows
the first, the shallows: what the internet is doing to our brains (norton, 2010), by nicholas carr, is an expanded version of his article “is google making us stupid?” published in the july/august 2008 issue of atlantic monthly and discussed in this space soon after.1 carr’s arguments in the shallows will be familiar to those who read his earlier article, but they are more thoroughly developed in his book and worth summarizing here. carr’s thesis is that use of connective technology (the internet and the web) is leading to a remapping of cognitive reading and thinking skills, and a “shallowing” of these mental faculties: over the last few years i’ve had an uncomfortable sense that someone, or something, has been tinkering with my brain, remapping the neural circuitry, reprogramming the memory. . . . i’m not thinking the way i used to think. i feel it most strongly when i’m reading. i used to find it easy to immerse myself in a book or a lengthy article. . . . that’s rarely the case anymore.
(5) the problem, as carr goes on to describe at some length, chronicling in detail the results of years of neurological investigations, is that the brain is “plastic.” “virtually all of our neural circuits—whether they’re involved in feeling, seeing, hearing, moving, thinking, learning, perceiving, or remembering—are subject to change.” and one of the things that is changing them the most drastically today is our growing reliance on digital information. the paradox is that as we repeat an activity—surfing the web and clicking on links, rather than engaging with linear texts, for example—chemically induced synapses cause us to want to continue the new activity, strengthening those links (34). this quality of plastic neural circuits that can be remapped, when combined with the “ecosystem of interruption technologies” of the internet and the web (e.g., in-text hyperlinks, e-mail and rss alerts, text messaging, twitter, multiple widgets, etc.) is resulting in what carr argues is a growing inability or unwillingness to engage with and reflect deeply upon extended text (91).2 as carr puts it, the linear, literary mind . . . [that has] been the imaginative mind of the renaissance, the rational mind of the enlightenment, the inventive mind of the industrial revolution, even the subversive mind of modernism . . . may soon be yesterday’s mind. (10) there is much more. carr offers pointed critiques of major internet players and the roles they play in facilitating and exploiting the remapping of our neural circuits. google, whose “profits are tied directly to the velocity of people’s information intake,” is to carr “in the business of distraction” (156–57). the google book initiative “shouldn’t be confused with the libraries we’ve known until now. it’s not a library of books. it’s a library of snippets. . . . the strip-mining of ‘relevant content’ replaces the slow excavation of meaning” (166). ultimately, for carr, it’s about who is controlling whom. 
while the internet may permit us to better perform some functions (search, for example), “it poses a threat to our integrity as human beings . . . we program our computers and thereafter they program us” (214). put another way, “the computer screen bulldozes our doubts with its bounties and conveniences. it is so much our servant that it would seem churlish to notice that it is also our master” (4).
■■ hamlet’s blackberry
perhaps less familiar than carr’s work is william powers’ hamlet’s blackberry: a practical philosophy for building a good life in the digital age (harpercollins, 2010). powers, a writer whose work has appeared in the washington post, the new york times, the new republic, and elsewhere, describes the influence of digital technology (or “screens,” to use his shorthand)3 and connectedness on our lives: in the last few decades, we’ve found a powerful new way to pursue more busyness: digital technology. computers and smart phones are often pitched as solutions to our stressful, overextended lives. . . . but at the same time, they link us more tightly to all the sources of our busyness. our screens are conduits for everything that keeps us hopping—mandatory and optional, worthwhile and silly. . . .

marc truitt
editorial: “the air is full of people”
marc truitt (marc.truitt@ualberta.ca) is associate university librarian, bibliographic and information technology services, university of alberta libraries, edmonton, alberta, canada, and editor of ital.
4 information technology and libraries | march 2011

if not yet a general consensus, that people are coming to experience and understand these costs. finally, they also make the point that things need not continue on their present course. i can imagine that if we in libraries take carr and powers seriously, there might be significant implications for service models and collections practices. both books have been reviewed in all the usual mainstream places.
remarkably though, to me (and excluding a scant few discussion list threads such as that on web4lib several years ago), i’ve seen no discussion in the usual professional venues of their implications where libraries are concerned. perhaps i’m simply not reading the “right” weblogs or discussion lists. i’m not under the illusion that libraries or librarians can by themselves alter our rush toward the “shallows.” still, given our eagerness to discuss how we extend the reach of “screens” in libraries (whether in the form of learning commons, wireless access, mobile-friendly websites, clearing stacks of “tree-books” in favor of e-books, etc.), would it not be reasonable to think that we should show as much concern about the consequences of such activities, and even some interest in providing possible remedial alternatives? one of my favorite library spaces in college was the linonia and brothers reading room in yale’s sterling memorial library (see a photo of the reading room at http://images.library.yale.edu/madid/oneitem.aspx?id=1772930). its dark oak paneling, built-in bookshelves, overstuffed leather easy chairs, cozy alcoves, toasty, footwarming steam radiators, and stained-glass windows overlooking a quiet courtyard represented the epitome of the nineteenth-century “gentleman’s library” and encouraged the sort of deep reading and contemplation that are becoming so rare in our institutions today. i spent many hours there, reading, thinking, dreaming, and yes, catnapping too. i haven’t visited the “l&b” in years; i hope it is still the way i so fondly recall it. over the past few years, as we’ve considered the various aspects of the library-as-space question, we’ve created all manner of collaborative, group-focused, überconnected learning spaces. we’ve also created book-free spaces (to say nothing of book-free “libraries”), food-friendly spaces, quiet and cell-phone-free spaces, and a host of others of which i’m sure i haven’t thought.
so, in an attempt to get us thinking about what carr ’s and powers’ books might mean for libraries, here’s a crazy idea to start us off: how about a screen-free space for deep reading and contemplation? it should be very low-tech: no mobiles, no laptops, no desktops, no networks, no clickety-clack of keys, no chimes of incoming e-mail and tweets, no unearthly glow of monitors. no food, drink, or group-study areas, either. just a quiet, inviting, comfortable space for individual reading and the goal is no longer to be “in touch” but to erase the possibility of ever being out of touch. to merge, to live simultaneously with everyone, sharing every moment, every perception, thought, and action via our screens. even the places where we used to go to get away from the crowd and the burdens it imposes on us are now connected. the simple act of going out for a walk is completely different today from what it was fifteen years ago. whether you’re walking down a big-city street or in the woods outside a country town, if you’re carrying a mobile device with you, the global crowd comes along. . . . the air is full of people. (14–15) drawing inspiration and analogy from a list of philosophers and other historical and literary figures beginning with plato and ending with mcluhan, powers describes seven practical approaches, tools, and techniques for disconnecting from our screen-driven life: ■■ seek physical distance (plato) ■■ seek intellectual and emotional distance (seneca) ■■ hope for devices that might allow us to customize our degree of connectedness (gutenberg) ■■ consider older, low-tech tools as alternatives where possible (shakespeare via hamlet) ■■ create positive rituals (ben franklin) ■■ create a “walden zone” refuge (thoreau) ■■ be aware of and take personal control from technology by being aware of that technology (mcluhan) powers then reviews how he and his family used these techniques to regain the sense of control and depth they felt they’d lost to screens. 
in the past several months, i’ve tried a couple myself. i no longer carry a blackberry unless i’m traveling out of town. i avoid e-mail and the internet completely on saturdays (my “internet sabbath”). the effect of these two small and easily achieved changes has been little short of liberating, providing space to think and reflect without the distraction of always-on connectedness. walking my lab seamus has become a special pleasure! ■■ bringing libraries into the picture so, what do carr’s and powers’ theses mean for libraries, and what do they mean in particular for those of us who provide technology solutions for libraries? they remind us that there is a very real human cost to the technology of screens and always-on connectedness that have become our stock-in-trade in recent years. as well, they provide convincing evidence that there is a growing awareness, editorial | truitt 5 references and notes 1. carr’s atlantic monthly article appeared in volume 301 (july/aug. 2008) and can be found at http://www.theatlantic . c o m / m a g a z i n e / a rc h i v e / 2 0 0 8 / 0 7 / i s g o o g l e m a k i n g u s -stupid/6868/ (accessed jan. 14, 2011); my ital column on the topic is at http://www.ala.org/ala/mgrps/divs/lita/ ital/272008/2703sep/editorial_pdf.cfm (accessed jan. 14, 2011). 2. the term “ecosystem of interruption technologies” belongs to cory doctorow. 3. powers uses the term “screens” to describe “the connective digital devices that have been widely adopted in the last two decades, including desktop and notebook computers, mobile phones, e-readers, and tablets” (1). thought. would some of our patrons adopt it? i’m willing to bet that they would. do we not owe them the same commitment to service that we’ve worked so hard to provide to those who wish to be collaborative and “always-on”? absolutely. no, we can’t change the world or stop the march of the screens. 
but perhaps, as with powers’ “walden zone,” we can start by providing a close-at-hand safe harbor for those of our patrons seeking refuge from the “always-on” world of screens. 261 marc ii and cobol henriette d. avram and julius r. droz: information systems office, library of congress, washington, d. c. a description of the machine processing of marc ii records using cobol for an application on the library of congress system 360/ 30. emphasis is on the manipulation by cobol of highly complex variable length marc records containing variable length fields. since the implementation of the marc ii format by the library of congress for the marc distribution service, some potential users of machine readable data have expressed doubts that marc formatted records could be effectively manipulated by programs in the cobol language. griffin ( 1) expressed his concern by stating, "users will r~quire programmers skilled in languages other than fortran or cobol to take advantage of marc records." during the design phases of the marc ii format, the information systems office staff concluded that the capability of cobol should be accommodated. the relationship between the format and cobol could not be tested until a data base was established. the data base is now an accomplished fact. the purpose of this article is to report that cobol can be and is being used with marc ii data. application the science and technology division of the library of congress had a requirement from the u.s. army terrestrial science center to produce three reports (by subject heading, by author, and by corporate author) for ultimate photo-reduction and reproduction of the periodical literature in the library's collection dealing with cold region research. the bibliographic data was processed through the marc system and the resultant tape was in the library of congress' marc processing format. 
it is at this point in the marc distribution service that the marc processing format is converted to the marc ii communications format for distribution to subscribers. rather than create a completely simulated environment within which to test the use of cobol, it was decided to integrate the analysis of the cobol language with the task at hand, i.e., the programming effort required to produce the necessary reports from a magnetic tape file of marc records.

additional criteria for this investigation were established. stress was placed on the development of cobol programming techniques utilizing the smallest possible amount of computer core storage, thereby establishing the capability for potential users with a minimum hardware configuration. furthermore, since cobol compilers vary in the language power that they provide with the size and make of equipment, the subset of cobol language used conformed with the basic level of cobol that is being standardized for acceptance in the adp world by the united states of america standards institute (usasi) subcommittee x3.4 (common programming languages).

journal of library automation vol. 1/4 december, 1968

marc ii communications format and marc processing format compared

before a description of what was done and how, a comparison of the marc ii communications format and the marc processing format, with a brief statement of their differences, is in order. the marc ii communications format (2) is schematically represented in figure 1.

[ leader | record directory | control fields | variable fields ]
fig. 1. marc ii communications format.

the communications format may be recorded on either seven-level or nine-level tape and the term "byte" in the following discussion refers to either a six-bit or eight-bit character. the marc processing format is schematically represented in figure 2.

[ leader | communications field | record control field | fixed fields | record directory | variable fields ]
fig. 2. library of congress marc processing format.
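as a rough modern illustration of the communications-format layout in figure 1, the sketch below assembles a toy record and then walks it back: leader, record directory, then the fields the directory points to. it is written in python rather than the cobol of the article, and it is a sketch under stated assumptions, not the library of congress routine. the field widths (24-character leader with the base address in positions 12-16, 12-character directory entries of tag, length, and starting position) are the ones listed in table 1 of this article; the two terminator code values, the helper names, and the sample tags and data are assumptions made for the example.

```python
# sketch only: python stands in for the article's cobol.
FIELD_TERMINATOR = "\x1e"   # assumed code value, not given in the article
RECORD_TERMINATOR = "\x1d"  # assumed code value, not given in the article
LEADER_LENGTH = 24          # communications-format leader (table 1)
ENTRY_LENGTH = 12           # tag (3) + length (4) + starting position (5)


def build_record(fields):
    """Assemble a toy communications-format record from (tag, data) pairs."""
    directory, body = "", ""
    for tag, data in fields:
        field = data + FIELD_TERMINATOR
        # starting position is counted from the first character of the data area
        directory += f"{tag}{len(field):04d}{len(body):05d}"
        body += field
    # the last variable field ends with the end-of-record code instead
    body = body[:-1] + RECORD_TERMINATOR
    base_address = LEADER_LENGTH + len(directory) + 1  # +1: directory terminator
    record_length = base_address + len(body)
    # leader: record length (5), seven characters left blank in this toy
    # (status, type, level, blanks, two counts), base address (5), blanks (7)
    leader = f"{record_length:05d}" + " " * 7 + f"{base_address:05d}" + " " * 7
    return leader + directory + FIELD_TERMINATOR + body


def read_directory(record):
    """Return the base address and a list of (tag, length, start) entries."""
    base_address = int(record[12:17])
    # integer division absorbs the one-character directory terminator
    entry_count = (base_address - LEADER_LENGTH) // ENTRY_LENGTH
    entries = []
    for i in range(entry_count):
        start = LEADER_LENGTH + i * ENTRY_LENGTH
        entry = record[start:start + ENTRY_LENGTH]
        entries.append((entry[0:3], int(entry[3:7]), int(entry[7:12])))
    return base_address, entries


def get_field(record, tag):
    """Return the data of the first field with this tag, without its terminator."""
    base_address, entries = read_directory(record)
    for entry_tag, length, start in entries:
        if entry_tag == tag:
            begin = base_address + start
            return record[begin:begin + length - 1]
    return None


# hypothetical tags and data, for illustration only
record = build_record([("001", "   68012345"),
                       ("245", "10$acold regions$bresearch")])
base_address, entries = read_directory(record)
```

with two directory entries the base address comes out as 24 + 2 x 12 + 1 = 49, and dividing (49 - 24) by 12 recovers the entry count of 2, so `get_field(record, "245")` returns the second field's data.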
marc records at the library of congress are contained on magnetic tape in the form of undefined records. for system 360 purposes, these are unblocked, variable length records without the two four-byte block length and record length fields. in the processing format, fields may be recorded in binary, or packed decimal, hexadecimal or ebcdic characters dependent on the characteristics of the field and the machine processing and storage efficiency required. therefore, in the processing format the term "byte" does not necessarily refer to a character but rather describes a unit of eight bits. a brief description of the fields of both formats and a gross comparison of their differences is shown in table 1.

table 1. comparison of marc ii communications format and marc processing format

leader

communications format: the leader is fixed in length for all records and contains 24 bytes (characters).
  logical record length       5 bytes
  record status               1 byte
  type of record              1 byte
  bibliographic level         1 byte
  blanks                      2 bytes
  indicator count             1 byte
  subfield code count         1 byte
  base address of data        5 bytes
  blanks                      7 bytes

processing format: the leader is fixed in length for all records and contains 12 bytes.
  logical record length       2 bytes
  date and status             4 bytes
  blank                       1 byte
  type of record              1 byte
  bibliographic level         1 byte
  blanks                      3 bytes

communications field (processing format only)

the communications field is fixed in length for all records and contains 12 bytes.
  record directory location   2 bytes
  directory entry count       2 bytes
  record source               1 byte
  record destination          1 byte
  in process type             1 byte
  in process status           1 byte
  blanks                      4 bytes

record control field (processing format only)

the record control field is fixed in length and contains 14 bytes. in the format for monographs, the library of congress catalog card number is recorded in this field.

fixed fields (processing format only)

the fixed fields are fixed in length for all records and contain 54 bytes.

record directory

communications format: the record directory is made up of a variable number of fixed length entries (12 bytes each) which contains the identification tag, the length and the starting character position in the record of each variable field. the record directory ends with a field terminator code.
  tag                           3 bytes
  length                        4 bytes
  starting character position   5 bytes

processing format: the record directory is made up of a variable number of fixed length entries (12 bytes each). the record directory ends with a field terminator code.
  tag                           3 bytes
  sequence number               1 byte
  blanks                        3 bytes
  action code                   1 byte
  length                        2 bytes
  starting character position   2 bytes

control fields

communications format: the control fields contain alphameric data elements and are recorded like variable fields although many have a fixed length. these fields end with a field terminator code. each control field is identified by a 3-byte numeric tag in the record directory and these tags are not repeated in the logical record. in the marc format for monographs, the library of congress catalog card number and the fixed fields are recorded as control fields.

processing format: (record control field and fixed fields)

variable fields

communications format: the variable fields are made up of variable length alphameric data, and all fields end with a field terminator code except the last variable field in the logical record which ends with an end of record code. each variable field is identified by a 3-byte numeric tag in the record directory, and tags may be repeated as required in the logical record. each variable field begins with a constant number of indicators which provide descriptive information about that field. the field contains data elements which may be separated by subfield codes to identify the data element. a subfield code is composed of a delimiter and a lower case alphabetic character. a variable field for monographs can be described as follows:

  ii$a data $b data ft
  where i = indicator
        $a and $b = subfield code
        ft = field terminator
        $ = delimiter

processing format: the variable fields are made up of variable length alphameric data, and all fields end with a field terminator code except the last variable field in the logical record which ends with an end of record code. each variable field is identified by a 3-byte numeric tag in the record directory and tags may be repeated as required in the logical record. each variable field begins with the number of indicators required for that field and the number of lower case alphabetic characters which are a part of the subfield code. the indicators and alphabetic characters are each followed by a delimiter. the data elements of the field are separated by delimiters only. a variable field can be described as follows:

  i1 i2 ... in $ a1 a2 ... an $ data $ data $ data ft
  where i = indicator
        a = alphabetic character
        ft = field terminator
        $ = delimiter

it should be noted that in the communications format those fields that are fixed in length (library of congress catalog card number and the fixed fields) for a particular form of material, e.g., monographs, are recorded as variable fields. this guarantees that the same format structure is able to be used to represent all forms of material, e.g., serials, maps, music, etc., where the contents of these fields may vary in length and meaning.

the marc ii communications format was designed for the exchange of bibliographic information recorded on magnetic tape to be manipulable by the recipient's computer regardless of its characteristics, i.e., word machines, character machines, third-generation and second-generation computers (3). the marc processing format was designed for the library of congress system 360.

development

the use of the cobol programming language provided a straightforward language approach to the task. the use of natural language provided the program with its own documentation.
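the communications-format variable field pattern given in table 1 (ii$a data $b data ft) can be taken apart with a few lines of code. the sketch below is a python illustration rather than anything from the article: it assumes "$" as the printed delimiter, a field terminator code value that the article does not state, and a hypothetical helper name and sample field. the indicator count defaults to two, as in the monograph pattern, though the leader carries the actual count.

```python
DELIMITER = "$"            # the article prints the delimiter as "$"
FIELD_TERMINATOR = "\x1e"  # assumed code value, not given in the article


def split_variable_field(field, indicator_count=2):
    """Split 'ii$a data $b data ft' into indicators and (code, data) pairs."""
    if field.endswith(FIELD_TERMINATOR):
        field = field[:-1]
    indicators = field[:indicator_count]
    # everything after the indicators is a run of delimiter + code + data
    pieces = field[indicator_count:].split(DELIMITER)[1:]
    subfields = [(piece[0], piece[1:].strip()) for piece in pieces if piece]
    return indicators, subfields


# hypothetical field content, for illustration only
indicators, subfields = split_variable_field(
    "10$acold regions $bresearch" + FIELD_TERMINATOR
)
```

here `indicators` holds the two leading indicator characters and `subfields` holds one pair per subfield code, in the order the codes occur in the field.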
in order to relate the program to its use on the specific computer located at the library of congress, only the following were required in cobol:

    environment division.
    configuration section.
    source-computer. ibm-360 e30.
    object-computer. ibm-360 e30.

the marc input file was treated as containing "undefined" records. in the marc file, records actually vary in length up to 2048 characters. in cobol, the marc file and records, along with the option card file and printed output file were described as:

    input-output section.
    file-control.
        select marc-file assign to 'sys012' utility 2400 units
            reserve no alternate areas.
        select option-card assign to 'sys013' unit-record 2540r
            reserve no alternate areas.
        select index-report assign to 'sys014' unit-record 1403.
    data division.
    file section.
    fd  marc-file
        data record is marc-record
        label records are standard
        recording mode is u.
    01  marc-record.
        02 marc-byte           picture x occurs 2048 times.
    fd  option-card
        data record is option-record
        label records are omitted
        recording mode is f.
    01  option-record          picture x(80).
    fd  index-report
        data record is index-record
        label records are omitted
        recording mode is f.
    01  index-record.
        02 carriage-control    picture x.
        02 filler              picture x(130).

the next step was to manipulate the fields of the marc processing format which would provide access to the variable field information.
these fields included:

1) the first 92 bytes of each record containing fixed-formatted items (leader, communication field, record control field, fixed fields);
2) a 12-byte directory entry that had been established for each variable field in the record containing the identification tag, the tag sequence number, the length of the field that this directory entry corresponded to, and the starting position of that field relative to the first position of the marc record; and
3) the number of directory entries contained in each record and carried in the fixed portion of the record.

the specific programming approach taken was to move the fixed fields to a work area, calculate the locations of the directory entries and their corresponding fields, extract the desired field from the record and place it in a work area for the appropriate processing. this technique was used to overcome the word boundary alignment considerations in the system 360, i.e., the use of binary arithmetic for certain data fields in the processing format. since in some cases character-by-character scanning of the record data would be required, the programming routines were modularized to provide for economic core storage utilization and simple accessing. the three cobol modules (which could be used repetitively) were developed as follows, preceded by supporting work areas and subscripts:

work areas and subscripts

    working-storage section.
    01  fixed-marc.
        02 length              picture 99 usage computational.
        02 directory-count     picture 99 usage computational.
    01  marc-fixed redefines fixed-marc.
        02 fixed               picture x occurs 92 times.
    01  hold-directory.
        02 d-tag               picture x(3).
        02 filler              picture x(5).
        02 d-length            picture 99 usage computational.
        02 d-address           picture 99 usage computational.
    01  directory-hold redefines hold-directory.
        02 d-hold              picture x occurs 12 times.
    01  subscripts usage computational.
        02 dsub                picture 9(4) value zeros.
        02 tsub                picture 9(4) value zeros.
    01  hold-areas.
        02 hold-tag            picture x(3) value spaces.
        02 hold-data.
           03 data-hold        picture x occurs 1000 times.
        02 persw               picture 9 value zero.
    01  options.
        02 option-tag1         picture x(3).
        02 option-tag2         picture x(3).
        02 option-tag3         picture x(3).
        02 other-options       picture x(71).

continued to end of work areas and subscripts.

modules

    mod-1.
        note sub-routine to move fixed items from record into work area.
             enter with dsub and tsub subscripts set to zero.
             exit with fixed items, addressable by data-names, in work
             area labelled 'fixed-marc'.
    move-fixed.
        add 1 to dsub. add 1 to tsub.
        move marc-byte (dsub) to fixed (tsub).

    mod-2.
        note sub-routine to find a desired identification tag in the
             record directory. search must be controlled by the given
             number of entries in the record directory.
             enter with dsub set to 92 and persw set to zero. search tag
             must be stored in 'hold-tag'.
             exit with 12 character directory entry in work area
             labelled 'hold-directory'.
    sift-directory.
        if persw equals 1 go to sift-exit.
        move zero to tsub.
        perform directory-move 12 times.
        if d-tag equals hold-tag move 1 to persw.
    sift-exit.
        exit.
    directory-move.
        add 1 to dsub. add 1 to tsub.
        move marc-byte (dsub) to d-hold (tsub).

    mod-3.
        note sub-routine to find desired record field and move it to a
             work area for processing.
             enter with desired directory entry in work area labelled
             'hold-directory'.
             exit with desired field in work area labelled 'hold-data'.
    move-data.
        move zeros to tsub.
        move spaces to hold-data.
        move d-address to dsub.
        perform move-a d-length times.
    move-a.
        add 1 to dsub. add 1 to tsub.
        move marc-byte (dsub) to data-hold (tsub).

once access to any field in the record was accomplished by the use of sub-routine modules, what remained was to establish the "mainline coding" and to apply the specifications and logic for the reports that were to be produced.
with this objective, the following cobol statements were written:

    procedure division.
    housekeeping.
        open input marc-file option-card output index-report.
        read option-card into options at end go to a.
    a.
        read marc-file record at end go to endjob.
        move zeros to dsub tsub.
        perform move-fixed 92 times.                                **
        move option-tag1 to hold-tag. perform b.
        move option-tag2 to hold-tag. perform b.
        move option-tag3 to hold-tag. perform b.
        go to a.
    b.
        move zeros to persw.
        move 92 to dsub.                                            **
        perform sift-directory thru sift-exit directory-count times.
        perform move-data.
        (at this point, the field is formatted for output and printed.)
    endjob.
        close marc-file option-card index-report.
        stop run.

results

figure 3 is a sample of the output of the program. the character-by-character scan, implicit in the programming required of a variable length field, also gave rise to extremely flexible data report formatting. all the data printed is "floating" in nature, restricted only by the user's print format requirements which were introduced by option card in this case. table 2 shows the amount of core occupied during processing, and table 3 lists man weeks expended in producing the program.

table 2. computer core usage

    program sections                     bytes occupied
    software i/o routines                 1,155
    cobol compiler sub-routines           2,071
    program i/o buffers and constants     4,110
    object instructions                   3,944
    total                                11,280

[figure: sample page of the printed subject-heading report, listing titles and accession numbers under headings such as "cores" (for example, "analysis of ice cores from byrd station 23-3113").]
fig. 3. output of cobol language program using marc ii data.

table 3. manpower expenditure

    activity                             man weeks
    analysis and programming             1
    debugging and checkout               2
    total                                3

since the processing time of a print program is usually a function of the speed of the printer, no accurate internal processing times were recorded. however, there was no noticeable time difference between this program and other marc print programs written at the library of congress in assembly language.

communication format processing

the aforementioned techniques are equally adaptable for use with the marc ii communications format (3) with the following changes in format conventions:

1) the communication format has a 24-character leader rather than 92 characters of fixed length items in the processing format. in the program, under the "working-storage section", the group item labelled "fixed-marc" would have to be redefined to conform with the 24-character leader. the cobol statements that are noted with "**" would require a change of their value from "92" to "24".
2) the communication format has no total count of entries in the record directory. a calculation would have to be made to arrive at the total count and that figure stored in a new hold area labelled "directory-count". the base address of the data in the communication format is not relative to the first position of the record as defined in the processing format, but to the first position of the first variable field. this base address is carried in the record leader, and is available for the calculation required for the directory entry count, (base address - 24) / 12.

in the program, after the record directory had been searched and the proper entry placed in the work area, the "move-data" sub-routine would move the appropriate field to the work area for processing with the one alteration noted below with an asterisk.

    move-data.
        move zeros to tsub.
        move spaces to hold-data.
        move d-address to dsub.
        add base-address to dsub.                                   *
        perform move-a d-length times.
    move-a.
        add 1 to dsub. add 1 to tsub.
        move marc-byte (dsub) to data-hold (tsub).

programming techniques naturally are dependent on the processing required and the format characteristics at the individual institution. if the marc ii communications format were to be manipulated in the form in which it is received (each byte equal to a character with a 24-character leader followed by 12-character directory entries) an alternate approach to that suggested above could be to work in the record area and not move data to a work area.

conclusion

the only marc ii data available to users up to the writing of this article (october 1968) has been the marc ii test tape released by the library of congress in august 1968.
therefore, it is probable that most people expressing doubts about the use of cobol with marc records have done so without the experience of actually using the language. we now have this experience at the library of congress. cobol was successfully used for the computer processing of marc records. the complexity of the record did not detract from ease in programming. although the programs written were for a report function, the data accessing modules of cobol nevertheless can be used for many other functions. file maintenance and retrieval algorithms could be defined and programmed in cobol with facility equal to that in programming the subject function.

references

1. griffin, hillis: "automation of technical processes in libraries," in annual review of information science and technology, edited by carlos a. cuadra (chicago: encyclopaedia britannica) 3 (1968), 241-262.
2. u.s. library of congress, information systems office: subscriber's guide to the marc distribution service (washington, d.c.: library of congress, 1968).
3. avram, henriette d.; knapp, john f.; rather, lucia j.: the marc ii format: a communications format for bibliographic data (washington, d.c.: library of congress, 1968), pp. 1, 2, 10.

usa standard for a format for bibliographic information interchange on magnetic tape

the chairman of the united states of america standards institute, sectional committee z39, library work and documentation, has approved publication of the following draft "usa standard for a format for bibliographic information interchange on magnetic tape" to hasten availability of this fundamental contribution to bibliographic standardization. two important implementations follow the standard.
part b of appendix i is "preliminary guidelines for the library of congress, national library of medicine, and national agricultural library implementation of the proposed american standard for a format for bibliographic information interchange on magnetic tape as applied to records representing monographic materials in textual printed form (books)," more succinctly known as marc ii. part c is a committee working paper entitled "preliminary committee on scientific and technical information (cosati) guidelines for implementation of the usa standard."

0. introduction

0.1 this introduction is not part of the proposed standard but is included to facilitate its use.

0.2 this standard defines a format which is intended for the interchange of bibliographic records on magnetic tape. it has not been designed as a record format for retention within the files of any specific organization. nor has it been the intent of the subcommittee to define the content of individual records. rather it has attempted to describe a generalized structure which can be used to transmit between systems records describing all forms of material capable of bibliographic descriptions, as well as related records, such as authority records for authors and subject headings.

journal of library automation vol. 2/2 june, 1969

0.3 in designing the format the subcommittee has tried to achieve the goals listed below. it recognizes, however, that the goals were not completely compatible and that various trade-offs were required.
(a) hospitality to all kinds of bibliographic information should be provided;
(b) hardware independence: a format which can be used with a variety of digital computers should be defined;
(c) uniformity of structure: the structure of all machine records should be basically identical and include such control information as may be required to specify unique characteristics.
for any given class of records the components of the format may have specific meanings and unique characteristics; (d) data manipulation-the methods of recording and iden. tifying data should provide for maximum manipulability · ·leading to ease of conversion -to other f.ormats for various uses. · 0.4 the standard· includes the concept that the bibliographic unit may .be described independently or in relation to other bibliographic units. many relationships exist, including: the hierarchical, in which the bibliographic unit contains, or is contained in, another bibliographic unit, e.g., a monograph in a series; the equivalent, e.g., a work and its translation; and the sequential, e.g., a serial which appeared under a succession of titles. the standard provides for bibliographic records which describe one or more related bibliographic units, and provides for coding the relationships among them. appendix ii describes a proposed method for implementing this concept. 0.5 preliminro·y guidelines for implementing the standard by two different groups of users are provided in appendix i. these guidelines are not part of the standard but . are included to illustrate the use of the format. 0.6 explanatory material which is not part of the standard but which will assist in its interpretation or implementation appears in brackets. 0.7 the appendices accompanying this standard are not part of the standard. · · · · 0.8 the development of this standard was made possible partially by support received from the national science foundation and the council on library . resources. personnel of . the us asi committee z39 at the time the committee approved the standard were dr. jerrold orne, chairman; mr. james wood, vicechairman; and mr. harold oatfield, secretary. usa standard for a format for bibliographic information interchange 55 the subcommittee on machine input records, which is directly responsible for this standard, had the following personnel: mrs. henriette d. 
avram, chairman, assistant coordinator of information systems, information systems office, library of congress, washington, d.c. 20540
mrs. pauline a. atherton, school of library science, syracuse university, 308 carnegie library, syracuse, new york 13210
mr. arthur r. blum, american institute of physics, 335 east 45th street, new york, new york 10017
mr. lawrence f. buckland, president, inforonics, inc., 806 massachusetts avenue, cambridge, massachusetts 02139
miss ann t. curran, inforonics, inc., 806 massachusetts avenue, cambridge, massachusetts 02139
mr. kay d. guiles, information systems office, library of congress, washington, d.c. 20540
mr. frederick g. kilgour, director, ohio college library center, 1314 kinnear road, columbus, ohio 43212
mr. abraham i. lebowitz, assistant to the director, national agricultural library, u.s. department of agriculture, washington, d.c. 20250
mrs. phyllis b. steckler, r. r. bowker company, 1180 avenue of the americas, new york, new york 10036
1. glossary
it has been considered unnecessary to define terms in common use. terms which have a special meaning in the standard or which might be ambiguous are defined below.
base address of data. a data element whose value is equal to the character position of the character following the field terminator of the directory, where the specified origin is the first character of the leader. [example: if the directory contains two (2) entries, the first character position of data will be 49, and therefore the base address of data equals 49.]
basic character. a character occurring in columns 2, 3, 6 or 7 of the standard code as defined in usas x3.4-1967 code for information interchange, p. 6. [the basic character set is included as part of the illustration on page 82 of appendix i, columns 2, 3, 6 and 7.]
bibliographic information interchange format. a format for the exchange, rather than the local processing, of bibliographic records.
(the terms "bibliographic information interchange format," "information interchange format," and "interchange format" are used interchangeably in this standard.)
bibliographic level. a data element which, in conjunction with the data element "type-of-record," specifies the characteristics and describes the components of the bibliographic record. [see appendix i for an illustration of an application of this data element.]
bibliographic record. a collection of fields, including a leader, directory, and bibliographic data, describing one or more bibliographic units treated as one logical entity.
bibliographic unit. a defined body of recorded information and the artifact on which it is recorded, e.g., a book, chapter of a book, map, cuneiform tablet, digital magnetic tape file, song (sheet music), and song (phonograph record). a bibliographic unit may be part of a larger bibliographic unit (e.g., the chapter as part of a book, which in turn is part of a series). [it is assumed that the originators of bibliographic information and/or bibliographic descriptions follow a set of rules or guidelines which define, for the originating source, what is to be treated as a bibliographic unit.] a single author or subject heading authority record is also a bibliographic unit.
character. see internal character.
communications format. see bibliographic information interchange format.
control field. a variable field which supplies parameters which may be required in the processing of the bibliographic record.
control number. an alphanumeric symbol uniquely associated with a bibliographic record assigned by the organization creating the bibliographic record.
data element. a defined unit of information within a system.
data element identifier. a code consisting of one or more basic characters used to identify individual data elements within a variable field.
if and when data element identifiers are used, each occurrence must be immediately preceded by a delimiter, and each data element identifier must immediately precede the data element it identifies. the length (in characters) of the data element identifier must be uniform for each field of a given record. [in effect, a delimiter and data element identifier are combined to form a symbol used to initiate and identify data elements within a variable field. the use of the concept of data element identifiers is optional and provides a means of explicitly identifying data elements, even though in some instances there may be a redundancy of identification (e.g., if a variable field consists of only one data element, presumably the tag alone would provide sufficient identification).]
data field. a variable field containing bibliographic or other data not intended to supply parameters for the processing of the bibliographic record.
delimiter. a character which serves as an initiator, a separator, or a terminator of individual data elements within a variable field. [whether a delimiter is used to initiate, to separate, or to terminate, is dependent upon a specific system.]
delimiter (or delimiter plus data element identifier) count. a data element whose value is the length (in characters) of the delimiter (or, if data element identifiers are used, the length (in characters) of the delimiter plus data element identifier) used within the record.
directory. an index to the location of the variable fields (control and data) within a bibliographic record. the directory consists of entries.
entry. a fixed field within the directory which contains information about a variable field.
entry map. a data element which is used to indicate the structure of the entries in the directory.
external character. a graphic symbol which may be represented by one or a series of two or more internal characters. [the external character "space" is always represented by an internal character.]
field.
a defined character string which may contain one or more data elements. see also control field; delimiter; entry; fixed field; indicator; variable field.
field terminator (ft). a character used to terminate a variable field within a bibliographic record. the last variable field is terminated by a record terminator and not a field terminator.
file. a set of related records denoted by a single name.
fixed field. one in which every occurrence of the field has a length of the same fixed value regardless of changes in the contents of the field from occurrence to occurrence.
format. see structure.
ft. see field terminator.
indicator. a data element associated with a data field which supplies additional information about the associated data field.
indicator count. a data element whose value is the length (in characters) of the indicator(s) which appears as the first data element in each variable data field. the length (in characters) of the indicator(s) must be uniform for each field of a given record. (a length of zero (0) is permitted.)
information interchange format. see bibliographic information interchange format.
interchange format. see bibliographic information interchange format.
internal character. a pattern of bits of a predetermined length (depending on the system) treated as a meaningful unit. (the terms "internal character" and "character" are used interchangeably in this standard.)
leader. a fixed field which occurs at the beginning of each bibliographic record which provides parameters for the processing of the record.
padding character. a character used to fill areas in fixed fields which contain no data. [see paragraph a.2.1.4 of appendix i.]
primary bibliographic unit. that bibliographic unit whose physical and bibliographic characteristics determine the type-of-record and bibliographic level.
record. see bibliographic record.
record length.
a data element whose value is equal to the length (in characters) of the bibliographic record including the record terminator.
record terminator (rt). a character used to terminate each record.
rt. see record terminator.
status. a data element which indicates the relation of the bibliographic record to a file (e.g., new, updated, etc.).
structure. the framework of fixed and variable fields within the bibliographic record.
subrecord. a group of fields within a bibliographic record which may be treated as a logical entity. [when a bibliographic record describes more than one bibliographic unit, the descriptions of individual bibliographic units may be treated as subrecords.]
tag. a series of characters used to specify the name or label of an associated variable field.
type-of-record. a data element which, in association with the data element "bibliographic level," indicates the form of the bibliographic description provided for the primary bibliographic unit. [it is assumed that the person providing the bibliographic description, on the basis of predefined criteria, will determine the treatment a given item is to receive; i.e., whether the item is to be treated as a book, a journal article, a map, a picture, an abstract, a bibliographical footnote, etc. if a given item consists of parts which, if they occurred independently, would be accorded different bibliographic descriptions, the choice of treatment selected is assumed to be the most appropriate. frequently occurring combinations may be accorded their own treatments, e.g., collections of drawings with accompanying text. for each established form of bibliographic description, there will be a record format whose components are defined by the "type-of-record" data element. among these components are the length of the fixed fields, the tagging scheme employed, and the definition of the data elements.
if the interchange format is used for the interchange of records of a type for which "bibliographic description" is not a parameter, e.g., authority records, this data element may be redefined. see appendix i for an illustration of an application of this data element.]
variable field. one in which the length of an occurrence of the field is determined by the length (in characters) required to contain the data stored in that occurrence. the length may vary from one occurrence to the next.
2. purpose and scope
2.1 this standard defines a format for the interchange of bibliographic and related [authority files, subject heading lists, etc.] records.
2.2 this standard does not define a record format for retention within the files of any specific organization.
2.3 this standard does not necessarily define the content of individual records. it does describe a generalized structure which can be used for the interchange of records describing various forms of bibliographic material.
2.4 this standard assumes the utilization of the following usasi standards and proposed standards:
(a) usas x3.22-1967 recorded magnetic tape for information interchange (800 cpi, nrzi)
(b) usas x3.4-1967 code for information interchange
(c) proposed standard x3.2/552 magnetic tape labels and file structure
3. bibliographic information interchange format
3.1 schematic representation
the interchange format is schematically represented below:
leader | directory | ft | control number field | ft | other control fields (if present) | ft | data field | ft | ... | data field | rt
3.2 leader
3.2.1 schematic representation
the leader is schematically represented below:
character positions 0-4: record length
character position 5: status
character position 6: type of record
character position 7: bibliographic level
character positions 8-9: reserved for future use
character position 10: indicator count
character position 11: delimiter (or delimiter plus data element identifier) count
character positions 12-16: base address of data
character positions 17-19: reserved for use by user systems
character positions 20-23: entry map
3.2.2 record length
the record length is a 5-digit decimal number equal to the bibliographic record length. this number will include its own five characters and the record terminator. the record length will always be present in character positions 0-4 of the record. in the interchange format the bibliographic record has a maximum length of 99,999 characters.
3.2.3 status
a data element in character position 5 consisting of 1 basic character.•
3.2.4 type-of-record
a data element in character position 6 consisting of 1 basic character.•
3.2.5 bibliographic level
a data element in character position 7 consisting of 1 basic character.•
3.2.6 indicator count
a data element in character position 10 consisting of 1 decimal digit equal to the length (in characters) of the indicator(s) which appears as the first data element of each variable data field. if indicators are not used, this field is set to zero (0). (see 3.4.2.1)
• see appendix i for an illustration of an application of this data element.
3.2.7 delimiter (or delimiter plus data element identifier) count
a data element in character position 11 consisting of 1 decimal digit equal to the length (in characters) of the delimiter (or, if data element identifiers are used, the length (in characters) of the delimiter plus data element identifier) used within the record. if a delimiter is not used, this field is set to zero (0). if a delimiter alone (i.e., without data element identifiers) is used, this field is set to one (1).
3.2.8 base address of data
a data element in character positions 12-16 consisting of 5 decimal digits and equal to the combined length (in characters) of the leader and directory (including the field terminator at the end of the directory).
3.2.9 entry map (see 3.3.1 for the description of entries.)
structure of each entry in the directory:
tag | length of field | starting character position
entry map:
m | n | 0 | 0
m = length (in characters) of the "length of field" portion of each entry in the directory
n = length (in characters) of the "starting character position" portion of each entry in the directory
0 = undefined; available for future use
the entry map is a data element in character positions 20-23 consisting of 4 decimal digits. each decimal digit recorded corresponds sequentially to each portion of the entry, except for the portion allotted to the tag. character position 20 in the entry map indicates the length (in characters) of the "length of field" portion of each entry in the directory; character position 21 indicates the length (in characters) of the "starting character position" portion of each entry. if one of these does not occur, the relevant character position in the entry map is set to zero. character positions 22 and 23 are undefined and are available for future use.
[since bibliographic data is usually variable in length, the structure of an entry in the directory will usually follow the pattern "tag, length of field, starting character position." the inclusion of an entry map provides flexibility for those users who wish to structure the entry in the directory differently, either by including (in addition to tag, length of field, and starting character position) other data elements not defined in this standard or by excluding those that have been defined. however, any restructuring of the entry by a user will have to be done within the general limitations imposed by the standard (see 3.3.1). the use of the entry map can be illustrated as follows:
(1) an entry map set to 4500 would define the characteristics of a directory in which each entry consisted of a 3-digit tag (not expressed in the entry map), a 4-digit length of field, and a 5-digit starting character position.
(2) an entry map set to 0500 would define the characteristics of a directory in which each entry consisted of a 3-digit tag, no length of field data element, and a 5-digit starting character position. see appendix i for an illustration of an actual application of the concept of an entry map.]
3.3 directory
the directory consists of a series of fixed fields (hereinafter referred to as entries). the directory ends with a field terminator. the directory must contain at least one entry for each subsequent variable field (control and data). [in the case of very long fields additional entries may be required. see 3.3.1.3.]
3.3.1 entries
each entry consists of 12 characters. each entry must contain, at the very least, a tag and length of field, or a tag and starting character position, and must correspond, unambiguously, to a specific variable length data or control field. the tag, length of field, and starting character position must, whenever they occur, be in that sequence.
3.3.1.1 tag
the tag is a data element consisting of 3 basic characters.
3.3.1.2 tags for control fields
tags 001-009 are reserved for control fields as shown:
001 control number
002 reserved for subrecord directory, if any•
003 reserved for subrecord relationship, if any•
004-009 reserved for use by user systems
3.3.1.3 length of field
the length of field in the entry is the length (in characters) of the variable field to which it corresponds. the length of field includes the indicator(s) and field terminator. it is expressed as a decimal number. if the length of a variable field exceeds the maximum length expressible as a decimal number in the length of field portion of the entry, two or more entries (called a "subset" for the purposes of this explanation) will be used to define the location and extent of such a field.
since all the entries in the subset of entries reference the same variable field, they will contain the same tag. the length of field in each entry of the subset, except the last entry in the subset, will be set to 0 to indicate that the length of field is equal to the maximum length expressible and that there is additional information for the same field in the next entry in the record directory. the length of field for all entries in the subset subsequent to the first will refer to the length (in characters) of the overflow data. [this convention cannot be followed if the structure of the entry does not contain a length of field.]
3.3.1.4 starting character position
the starting character position is the character position of the first character in the variable field (which may be an indicator or data; see 3.4.2) referenced by the entry. it is given relative to the base address of data (i.e., the first character of the first variable field following the directory is numbered 0).
3.3.2 sequence of entries
the entries in the directory may be recorded in any sequence (i.e., they need not be in the same sequence as the corresponding variable fields) except that the entry for tags 001-009 must always be first and in ascending numeric sequence. [note that specific systems may use the sequence of entries in the directory to convey semantic information.]
• appendix ii illustrates a possible method of handling subrecords within a bibliographic record. this is not part of the standard.
3.4 variable fields
3.4.1 general
following the leader and directory, the bibliographic record consists of variable fields. (although the directory is technically a variable field, the following paragraphs do not apply to it.)
3.4.2 structure of variable fields
each variable field consists of indicator(s) (if used), a delimiter (if used), a data element identifier (if used), data, and a field terminator, as shown. control fields do not contain indicators, delimiters, or data element identifiers.
indicator(s)* | delimiter* | data element identifier* | data | field terminator
* except control fields
3.4.2.1 indicator
the indicator is the first data element in each variable field. the length (in characters) of the indicator(s), which may be 0 (i.e., no indicator is present), is recorded in the indicator count in the leader. all variable fields, except control fields, in the same record have the same length (in characters) for an indicator(s).
3.4.3 sequence of variable fields
the variable fields, except for the control fields associated with tags 001-009, need not occur in the same sequence as the corresponding directory entries. the control fields which occur must be first and in ascending numeric sequence.
3.4.4 control fields
the variable fields associated with tags 001-009 are control fields. control fields do not contain indicators, delimiters, or data element identifiers.
3.4.4.1 control number field
this field contains the control number, consisting of basic characters. this field must always occur once, and only once, in each bibliographic record, and must immediately follow the directory.
3.5 variable data fields
3.5.1 general
the remainder of the bibliographic record consists of variable data fields. there are no restrictions on the number, length, or content of the variable data fields other than those already stated or implied (e.g., those based on the limitations of the total record length).
3.5.2 multiple data elements
multiple data elements within fields may be fixed or variable and may be identified by position, by the use of a delimiter alone, or by the use of a delimiter plus data element identifier(s) as the case may be.
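the leader, entry map, and directory rules in sections 3.2 through 3.4 can be illustrated with a minimal python sketch. it is not part of the standard. it assumes an entry map of 4500, no indicators or delimiters, and no overflow "subset" entries; the terminator characters (hex 1e and 1d), the status, type-of-record, and bibliographic level codes, the tag 245, and the function names are hypothetical choices made only for this illustration. with two directory entries the computed base address of data is 49, which agrees with the example given in the glossary.

```python
FT = "\x1e"       # assumed field terminator character (not fixed by the text above)
RT = "\x1d"       # assumed record terminator character (not fixed by the text above)
LEADER_LEN = 24   # leader occupies character positions 0-23
TAG_LEN = 3       # section 3.3.1.1: a tag is 3 basic characters
ENTRY_LEN = 12    # section 3.3.1: each entry consists of 12 characters


def build_record(fields):
    """Assemble a record from (tag, value) pairs using entry map 4500."""
    directory, data = "", ""
    for tag, value in fields:
        body = value + FT
        # entry = tag, 4-digit length of field, 5-digit starting character position
        directory += f"{tag}{len(body):04d}{len(data):05d}"
        data += body
    directory += FT                 # the directory ends with a field terminator
    data = data[:-1] + RT           # the last variable field ends with RT, not FT
    base = LEADER_LEN + len(directory)
    total = base + len(data)
    leader = (
        f"{total:05d}"              # 0-4   record length
        + "n" + "a" + "m"           # 5-7   status, type-of-record, level (hypothetical codes)
        + "  "                      # 8-9   reserved for future use
        + "0" + "0"                 # 10-11 indicator count, delimiter count
        + f"{base:05d}"             # 12-16 base address of data
        + "   "                     # 17-19 reserved for use by user systems
        + "4500"                    # 20-23 entry map
    )
    return leader + directory + data


def parse_leader(record):
    """Read the fixed leader positions described in section 3.2."""
    leader = record[:LEADER_LEN]
    return {
        "record_length": int(leader[0:5]),
        "status": leader[5],
        "type_of_record": leader[6],
        "bibliographic_level": leader[7],
        "indicator_count": int(leader[10]),
        "delimiter_count": int(leader[11]),
        "base_address": int(leader[12:17]),
        "entry_map": leader[20:24],
    }


def parse_directory(record, leader):
    """Split the directory into (tag, length of field, start) triples."""
    m = int(leader["entry_map"][0])   # width of the length-of-field portion
    n = int(leader["entry_map"][1])   # width of the starting-position portion
    directory = record[LEADER_LEN:leader["base_address"] - 1]  # drop the closing FT
    entries = []
    for i in range(0, len(directory), ENTRY_LEN):
        entry = directory[i:i + ENTRY_LEN]
        tag = entry[:TAG_LEN]
        length = int(entry[TAG_LEN:TAG_LEN + m]) if m else None
        start = int(entry[TAG_LEN + m:TAG_LEN + m + n]) if n else None
        entries.append((tag, length, start))
    return entries


def read_fields(record):
    """Return the parsed leader and a list of (tag, field text) pairs."""
    leader = parse_leader(record)
    base = leader["base_address"]
    fields = []
    for tag, length, start in parse_directory(record, leader):
        if length is None or start is None:
            raise ValueError("this sketch needs both length and starting position")
        raw = record[base + start:base + start + length]
        fields.append((tag, raw[:-1]))   # strip the trailing FT or RT
    return leader, fields


record = build_record([("001", "ctl0001"), ("245", "a sample title")])
leader, fields = read_fields(record)
print(leader["base_address"])   # 49
print(fields)
```

the starting character positions are stored relative to the base address of data, so a reader needs only the leader and directory to locate any field without scanning for terminators.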
lessons learned: a primo usability study kelsey brett, ashley lierman, and cherie turner information technology and libraries | march 2016 7 abstract the university of houston libraries implemented primo as the primary search option on the library website in may 2014. in may 2015, the libraries released a redesigned interface to improve user experience with the tool. the libraries took a user-centered approach to redesigning the primo interface, conducting a “think-aloud” usability test to gather user feedback and identify needed improvements. this article describes the method and findings from the usability study, the changes that were made to the primo interface as a result, and implications for discovery-system vendor relations and library instruction. introduction index-based discovery systems have become commonplace in academic libraries over the past several years, and academic libraries have invested a great deal of time and money into implementing them. frequently, discovery platforms serve as the primary access point to library resources, and in some libraries they have even replaced traditional online public access catalogs. because of the prominence of these systems in academic libraries and the important function that they serve, libraries have a vested interest in presenting users with a positive and seamless experience while using a discovery system to find and access library information. libraries commonly conduct user testing on their discovery systems, make local customizations when possible, and sometimes even change products to present the most user-friendly experience possible. university of houston libraries has adopted new discovery technologies as they became available in an effort to provide simplified discovery and access to library resources. as a first step, the libraries implemented innovative interface’s encore, a federated search tool, in 2007. 
when index-based discovery systems became available, the libraries saw them as a way to provide an improved and intuitive search experience. in 2010, the libraries implemented serials solutions’ summon. after three years and a thorough process of evaluating priorities and investigating alternatives, the libraries made the decision to move to ex libris’ primo, which was done in may of 2014. the libraries’ intention was to continually assess and customize primo to improve functionality and user experience. the libraries conducted research and performed user testing, and in may 2015 a redesigned primo search results page was released. one of the activities that informed the primo redesign was a “think-aloud” usability test that required users to complete a set of two tasks using primo. this article will present the method and results of the testing as well as the customizations that were made to the discovery system as a result. it will also discuss some broader implications for library discovery and its effect on information literacy instruction.
kelsey brett (krbrett@ua.edu) is discovery systems librarian, ashley lierman (arlierman@uh.edu) is instructional design librarian, and cherie turner (ckturner2@uh.edu) is chemical sciences librarian, university of houston libraries, houston, texas.
lessons learned: a primo usability study | brett, lierman, and turner doi: 10.6017/ital.v35i1.8965 8
literature review
there is a substantial body of literature discussing usability testing of discovery systems. in the interest of brevity, we will focus solely on studies and overviews involving primo implementations, from which several patterns have emerged.
multiple studies have indicated that users’ responses to the system are generally positive; even in testing of very early versions by a development partner users responded positively overall.1 interestingly, some studies found that in many cases users rated primo positively in post-testing surveys even when their task completion rate in the testing had been low.2 multiple studies also found evidence that, although users may struggle with primo initially, the system is learnable over time. comeaux found that the time it took users to use facets or locate resources decreased significantly with each task they performed,3 while other studies saw the use of facets per task increase for each user over the course of the testing.4 user reactions to facets and other post-limiting functions in primo were divided. in one of the earliest studies, sadeh found that users responded positively to facets,5 and some authors found users came to use them heavily while searching,6 while others found that facets were generally underused.7 multiple studies found that users tended to repeat their searches with slightly different terms rather than use post-limiting options.8 thomsett-scott and reese, in a survey of the literature on discovery tools, reported evidence of a trend that users reacted more positively to post-limiting in earlier discovery studies,9 while the broader literature shows more negative reactions in more recent studies. this could indicate that shifts in the software, user expectations, or both may have decreased users’ interest in these options. a few specific types of usability problems seem common across tests of primo and other discovery systems. across a large number of studies, it has been found that users—especially undergraduate students—struggle to understand library and academic terminology used in discovery. 
some terminology changes were made after users had difficulty in the earliest usability tests of primo,10 but users continued to struggle with terms like hold and recall in item records.11 users also failed to understand the labels of limiters,12 and they also failed to recognize the internal names of repositories and collections.13 literature reviews on discovery systems have found terminology to be a common stumbling block for searchers across a wide number of individual studies.14 similarly, users often struggle to understand the scope of options available to them when searching and the holdings information in item records. users failed in multiple tests to distinguish between the article level and the journal level,15 could not interpret bibliographic information sufficiently to determine that they had found the desired item,16 and chose incorrect options for scoping their searches.17 many studies found that users were unable to distinguish between multiple editions of a held item when all item types or editions were listed in the record.18 in other cases, users had difficulty interpreting locations and holdings information for physical items.19 among the needs and desires expressed by and for primo users in the literature, two in particular stand out. first, many users expressed a desire for more advanced search options; some wanted more complexity in certain facets and the ability to search within results,20 while other users simply wanted an advanced search option to be available.21 secondly, a large number of studies indicated that instruction on primo or other discovery systems was needed for users to search effectively.
in some cases this was the conclusion of the researchers conducting the study,22 while in other cases users themselves either suggested or requested instruction on the system.23 it is also worth noting that it has been questioned whether usability testing as a whole is a sufficient mechanism for evaluating discovery-system functionality. prommann and zhang found that usability testing has focused almost exclusively on the technical functioning of the software and not adequately revealed the ability of discovery systems like primo to successfully complete users’ desired tasks.24 they proposed hierarchical task analysis (hta) as an alternative, to examine users’ most frequent desires and the capacity of discovery systems to meet them. prommann and zhang acknowledged, however, that as hta is completed by an expert on the system rather than by an actual user, some of the valuable information derived from usability testing (including terms and functions that users do not understand, however well-designed) is lost in the process; they concluded that a combination of the two systems of testing is ideal to retain the best of both. background at the university of houston libraries, the resource discovery systems department (rds) is responsible for the maintenance and development of primo. however, it is important to rds to gather feedback and foster buy-in from stakeholders in the library before making changes to the system. to that end, rds works with two committees to assess the system and make recommendations for its improvement. the discovery usability group and the discovery advisory group include members from public services, technical services, and systems; each member brings a unique perspective on discovery. the discovery usability group is charged with assessing the discovery system through a variety of methods including usability testing, focus groups, and user interviews. 
the discovery advisory group reviews results of user testing and makes recommendations for improvement. all changes to the discovery system are reviewed by the groups before they are released for public use. in fall 2014, several months after the primo implementation, the discovery usability group conducted a focus group with student workers from the library’s information desk (a dual reference and circulation desk) to solicit feedback about the functionality of primo and suggestions for its improvement. in the meantime, the discovery advisory group was testing primo and evaluating primo sites at peer and aspirational institutions. the groups used the information collected through the focus group and research on primo to make recommendations for improvement. rds has access to a primo development sandbox, and many of the recommended changes were made in the sandbox environment and reviewed by the two groups prior to public release. changes to the search box can be seen in figure 1. rarely used tabs were replaced with a dropdown menu to the right of the search box to allow users to limit to “everything,” “books+,” or “digital library.” to increase visibility, links to “advanced search” and “browse search” were made larger and more spacing was added.
figure 1. search box in live site (above) and development sandbox (below) at time of testing
changes were also made to create a cleaner and less cluttered search results page (see figure 2). more white space was added, and the links (or tabs) to “view online,” “request,” “details,” etc., were redesigned and renamed for clarity. for example, the “view online” link was renamed to “preview online” because it opens a box within the search results page that displays the item. the groups believed “preview online” more accurately represents what the link does.
information technology and libraries | march 2016 11

figure 2. search results in live site (above) and development sandbox (below) at time of testing

the facets were also redesigned to look cleaner and larger to attract users’ attention (see figure 3).

figure 3. facets in live site and development sandbox at time of testing

both groups were happy with the changes to the primo development sandbox but wanted to test the effect of the changes on user search behavior before updating the live site. the discovery usability group conducted a usability test within the development sandbox. the goal of the test was to find out if users could effectively complete common research tasks using primo. with that goal in mind, the group developed a usability test and conducted it during the spring semester of 2015.

methodology
the discovery usability group developed a usability test using a “think-aloud” methodology, where users were asked to verbalize their thought process as they completed research tasks through primo. four tasks were designed to mirror tasks that users are likely to complete for class assignments or for general research. to minimize the testing time, each participant completed two tasks, with the facilitators alternating between two sets of tasks from one participant to the next.

test 1
task 1: you are trying to find an article that was cited in a paper you read recently. you have the following citation: clapp, e., & edwards, l. (2013). expanding our vision for the arts in education. harvard educational review, 83(1), 5–14. please find this article using onesearch [the public-facing name given to the libraries’ primo implementation].
task 2: you are doing a research project on the effects of video games on early childhood development.
find a peer-reviewed article on this topic, using onesearch.

test 2
task 1: recently your friend recommended the book the lighthouse by p. d. james. use onesearch to find out if you can check this book out from the library.
task 2: you are writing a paper about the drug cartels’ influence on mexico’s relationship with the united states. find a newspaper article on this topic, using onesearch.

two facilitators set up a table with a laptop at the front entrance of the library. they alternated between the facilitator and note-taker roles. another group member took on the role of “caller” and recruited library patrons to participate in the study. the caller set up a table visible to those passing by with library-branded t-shirts and umbrellas to incentivize participation. the caller explained what would be expected of the potential participant and went over the informed-consent document. after signing the form, the participant performed two tasks. after the test the participant received a library t-shirt or umbrella, and snacks.

the facilitators used morae usability software to record the screen and audio of each test. participants were asked for permission to record their sessions, but could opt out. during the three-hour testing period, fifteen library patrons participated in the study, and fourteen sessions were recorded. of the fifteen participants, thirteen were undergraduate students (four freshmen, one sophomore, seven juniors, and two seniors), one was a graduate student, and one was a postbaccalaureate student. the majority of the participants were from the sciences, along with two students from the college of business and two from the school of communications. there were no participants from the humanities.

the facilitators took notes on a rubric (see table 1) that simplified the processes of coding and reviewing the recordings.
after the usability testing, the facilitators reviewed the notes and recordings, coded them for common themes and breakdowns, and prepared a report of their findings and design recommendations. the facilitators sent the report, along with audio and screen recordings, to the discovery advisory group, who reviewed them along with rds. the discovery advisory group made additional design recommendations, and rds used the information and recommendations to implement additional customizations to the primo development sandbox.

preliminary questions
  ask: what is your affiliation with the university of houston? year? major?
  ask: how often do you use the library website? for what purpose(s)?
task 1
  describe the steps the participant took to complete the task
  s/u
  ask: how did you feel about this task? what was simple? what was difficult?
  ask: is there anything that would make completing this task easier?
task 2
  describe the steps the participant took to complete the task
  s/u
  ask: how did you feel about this task? what was simple? what was difficult?
  ask: is there anything that would make completing this task easier?
follow-up question
  ask: what can we do to improve the overall experience using onesearch?

table 1. task completion rubric for test 1

results

test 1, task 1
you are trying to find an article that was cited in a paper you read recently. you have the following citation: clapp, e., & edwards, l. (2013). expanding our vision for the arts in education. harvard educational review, 83(1), 5–14. please find this article using onesearch.

participant | time on task | task completion
1 | 1m 54s | y
2 | 4m 13s | y
3 | 1m 26s | y
4 | 1m 17s | y
5 | 1m 26s | y (required assistance)
6 | 1m 43s | y
7 | 1m 27s | y
8 | 1m 5s | y

table 2. results for test 1, task 1

all eight participants successfully completed this task, although sophistication and efficiency varied between participants.
some searched by the authors’ last names, which was not specific enough to return the item in question. four participants attempted to use advanced search or the drop-down menu to the right of the search box to pre-filter their results. two participants viewed the options in the drop-down menu, which were “everything,” “books+,” and “digital library,” and left it on the default “everything” search. when prompted, the participants explained that they were expecting the drop-down to contain title and/or author limiters. similarly, participants expected an author limiter in the advanced search.

the citation format seemed to confuse participants, and they tended to search for the piece of information that was listed first (the authors) rather than the most distinctive piece of information (the title). if the first search did not return the correct item in the first few results, the participant would modify their search by searching for a different element of the citation or adding another element of the citation to the initial search until the item they were looking for appeared as one of the first few results. participant 5 thought they had successfully completed the task, but the facilitator had to point out that the item they chose did not match the citation exactly, and on the second try they found the correct item.

participant 2 worked on the task for more than four minutes, significantly longer than the other seven participants. they immediately navigated to advanced search and filled out several fields in the advanced search form with the elements of the citation. if the search did not return their item, they added more elements until they finally found it. simply searching the title in the citation would have returned the item as the first search result.
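as a rough illustration of how the times in table 2 can be summarized, the following python sketch converts each recorded time to seconds and computes the median and mean. the times are copied from table 2; the function name `to_seconds` and the variable names are hypothetical and are not part of the study. because participant 2 took far longer than the others, the median is probably the more representative figure.

```python
from statistics import mean, median

# Times copied from table 2 (test 1, task 1), keyed by participant number.
TIMES = {
    1: "1m 54s", 2: "4m 13s", 3: "1m 26s", 4: "1m 17s",
    5: "1m 26s", 6: "1m 43s", 7: "1m 27s", 8: "1m 5s",
}


def to_seconds(text):
    """Convert a string such as '1m 54s' or '56s' to whole seconds.

    Hypothetical helper for illustration only.
    """
    total = 0
    for part in text.split():
        if part.endswith("m"):
            total += 60 * int(part[:-1])
        elif part.endswith("s"):
            total += int(part[:-1])
        else:
            raise ValueError(f"unrecognized time component: {part!r}")
    return total


seconds = {p: to_seconds(t) for p, t in TIMES.items()}

median_all = median(seconds.values())
mean_all = mean(seconds.values())
# Mean with participant 2 (the slowest session) left out.
mean_without_p2 = mean(s for p, s in seconds.items() if p != 2)

print(f"median: {median_all} s")
print(f"mean (all eight): {mean_all:.1f} s")
print(f"mean (without participant 2): {mean_without_p2:.1f} s")
```

with these eight values the median is 86.5 seconds, while the mean with participant 2 included is about 109 seconds.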
filling out the advanced search form with all of the information from the citation does not necessarily increase a user’s chances of finding the item in a discovery system, though it might do so when searching in an online catalog or subject database.

the discovery advisory and usability groups made two recommendations to address some of the identified issues: include an author search option in the advanced search, and add an “articles+” option to the drop-down menu on the basic search. rds implemented both recommendations. the discovery usability group identified confusion around citations as a common breakdown during this task. the groups recommended providing instructional information about searching for known items to address this breakdown; however, rds is still working on an effective method to provide this information in a simple and visible way.

test 1, task 2
you are doing a research project on the effects of video games on early childhood development. find a peer-reviewed article on this topic, using onesearch.

participant | time on task | task completion
1 | 3m 44s | y
2 | 2m 21s | y
3 | 5m 23s | y (required assistance)
4 | 2m 5s | y
5 | 3m 32s | y
6 | 2m 45s | y
7 | 3m 8s | y
8 | 3m 1s | y (required assistance)

table 3. results for test 1, task 2

all eight participants successfully found an article on this topic, but were less successful in determining whether the article was peer-reviewed. only one participant used the “peer-reviewed journals” facet without being prompted. three users noticed the “[peer-reviewed journal]” note in the record information for search results, and used it to determine if the article was peer-reviewed. one participant went to the full-text of an article, and said it “seemed” like it was peer-reviewed and considered the task complete. the resource type facets were more heavily used during this task than the “peer-reviewed journals” facet, despite its being promoted to the top of the list of facets.
two participants used the “articles” facet, and two participants used the “reviews” facet, thinking it limited to peer-reviewed articles. participants 3 and 8 needed help from the facilitator to determine whether a source was peer-reviewed. there was an overall misunderstanding of what peer-reviewed means, which affected participants’ confidence in completing the task.

the design recommendations based on this task included changing the “peer-reviewed journals” facet to “peer-reviewed articles” or simply, “peer-reviewed.” rds changed the facet to “peer-reviewed articles” to help alleviate confusion. additionally, the groups recommended emphasizing the “[peer-reviewed journal]” designations within the search results and providing a method for limiting to peer-reviewed materials before conducting a search. customization limitations of the system have so far prevented rds from implementing these design recommendations. a way to address the breakdowns caused by misunderstanding terminology also has yet to be identified. it was disheartening that participants did not use the “peer-reviewed journals” facet despite its being purposefully emphasized on the search results page.

test 2, task 1
recently your friend recommended the book the lighthouse by p. d. james. use onesearch to find out if you can check this book out from the library.

participant | time on task | task completion
1 | 1m 7s | y
2 | 56s | y
3 | no recording | y
4 | 2m 21s | y
5 | 1m 8s | y
6 | 2m 14s | y
7 | 1m 15s | y

table 4. results for test 2, task 1

all seven participants were able to find this book using primo, but had difficulty in determining what to do once they found it. for this task every participant searched by title and found the book as the first search result. four users limited to “books+” before searching using the drop-down menu, while the other three remained in the default “everything” search.
only one participant used the locations tab within the search results to determine availability; the others clicked the title and went to the item’s catalog record. all participants were able to determine that the book was available in the library, but there was an overall lack of understanding about how to use the information in the catalog to check out a book. participant 1 said that they would write down the call number, take it to the information desk, and ask how to find it, which was the most sophisticated response of all seven participants. participant 4 spent nearly two minutes clicking through links in the opac expecting to find a “check out” button and only stopped when the facilitator stepped in.

a recommended design change based on this task was to have call numbers in primo and the online catalog link to a stacks guide or map. this is a feature that may be developed in the future, but technical limitations prevented rds from implementing it in time for the release of the redesigned search interface. like the previous tasks, some of the breakdowns occurred because of a lack of understanding of library services. users easily figured out that there was a copy of the book in the library, but had little sense of what to do next. none of the participants successfully located the stacks guide or the request feature that would put the item on hold for them. steps should be taken to direct users to these features more effectively.

test 2, task 2
you are writing a paper about the drug cartels’ influence on mexico’s relationship with the united states. find a newspaper article on this topic, using onesearch.

participant | time on task | task completion
1 | 4m 45s | y (required assistance)
2 | 59s | y
3 | no recording | n
4 | 7m 47s | y
5 | 2m 52s | y
6 | 1m 33s | y
7 | 1m 30s | y

table 5. results for test 2, task 2

this task was difficult for participants.
two users limited their search initially to “digital library” using the drop-down menu, thinking it would be a place to find newspaper articles; their searches returned zero results. only two users used the “newspaper articles” facet without being prompted, and users did not seem to readily distinguish newspaper articles as a resource type. participants did not notice the resource type icons without being prompted. several participants needed to be reminded that the task was to find a newspaper article, and not any other type of article. with guidance, most participants were able to complete the task. participant 4 remained on the task for almost eight minutes because of their dissatisfaction with the relevancy of the results to the prompt. interestingly, they found the “newspaper articles” facet and reapplied it after each modified search, suggesting that they learned to use system features as they went.

one of the recommendations based on this task was to remove “digital library” as an option in the drop-down menu on the basic search. it was evident that “digital library” did not have the same meaning to end users as it does to internal users. this recommendation was easily implemented. another recommendation was to emphasize the resource type icons within the search results, but we have not determined a way to do so effectively. one suggestion from the discovery usability group was to exclude newspaper articles from the search results as a default, but no consensus was reached on this issue.

limitations
the discovery usability group identified limitations to the usability test that should be noted. testing was done in a high-traffic portion of the library’s lobby, which is used as study space by a broad range of students. participants were recruited from this study space, and we chose not to screen participants. the fifteen participants in the study did not constitute a representative sample.
almost all participants were undergraduate students, and no humanities majors participated. the outcomes might have been different if our participants had included more experienced researchers or students from a broader range of disciplines. by adding screening questions or choosing a more neutral location, we would have limited the number of participants who could complete our testing.

another limitation was that the participants started the usability test within the primo interface. because primo is integrated into the libraries’ website, users would typically begin searching the system from within the library homepage. the goals of the study required testing of our primo development sandbox, which was not yet available to the public, and therefore could not be accessed in the same way. this gave participants some additional options from the initial search pages that are not usually available through the main search interface. while testing an active version of the interface would be preferable, one of our goals was to understand how our modifications affected user behavior, so testing the unmodified version was not an acceptable substitute. additionally, the usability study presented tasks out of context and did not replicate a true user-searching experience. despite the limitations, we learned valuable lessons from the participants in this study.

discussion
users successfully completed the tasks in this usability study. unfortunately, they did not take advantage of many of the features that can make such tasks easier, particularly facets. this was especially apparent when we asked users to find a peer-reviewed journal article (test 1, task 2). primo has a facet that will limit a search to only peer-reviewed journal articles, and only one out of eight participants used this facet during this task.
participants appreciated the pre-search filtering options, and requested more of them (such as an author search), while post-search facets were underutilized. similarly, participants almost uniformly ignored the links, or tabs, within the search results, which would provide users with more information, a preview of the full-text, and additional features such as an email function. users bypassed these options and clicked on the title instead. the discovery usability group theorized that users clicked on the title of the item because that behavior would be successful in a more familiar search interface like google. the team customized the configuration so that a title click would open either the full-text of electronic items or the catalog record for physical items to accommodate users’ instinctive search behaviors. the tabs, though a prominent feature of the discovery system, have proved to have little value for users. throughout the implementation of discovery systems in academic libraries, both research studies and anecdotal evidence have suggested that users do not find end-user features like facets valuable; however, discovery system vendors have made no apparent attempt to reimagine the possibilities for search refinements. indeed, most of the findings in this study will present few surprises to anyone familiar with the discovery usability literature, which is itself concerning. as our literature review has shown, many of the same general usability issues have repeated throughout studies of primo since 2008, and most are very similar to usability issues in other, competitor discovery systems. this raises some concerns about the pace of innovation in the discovery field, and whether discovery vendors are genuinely taking into account the research findings about the needs of our users as they refine their products. 
in a recent article, david nelson and linda turney identified many issues with discovery facets in their current form that may be barriers to usage, particularly labeling and library jargon; we join them in urging vendors and libraries to collaborate more closely for deep analysis of actual facet usage by users, and to address those factors that have negatively affected facets’ value.25 during our usability study, a common barrier to the successful completion of a task was not the technology itself but a lack of understanding of the task. participants had difficulty deciphering a citation, which may have led to their tendency to search for a journal article by author and not by title. many participants struggled with using call numbers, and how to find and check out books in the library. peer review also proved to be a difficult or unfamiliar concept for many; when looking for peer-reviewed articles, some participants clicked on the “reviews” facet, which limited their searches to an inappropriate resource type. additionally, participants did not differentiate between journal articles and newspaper articles, which may indicate a broader inability to differentiate between scholarly and nonscholarly resources. this effect may be exaggerated by the high percentage of science students who participated, as these students may not have frequent need for newspaper articles. all of these challenges, however, are indicative of a deeper problem with terminology. regardless of how simple it is to limit a search to peer-reviewed articles, a user who does not understand what peer review means cannot complete the task with confidence or certainty. librarians struggle with presenting understandable language and avoiding library terminology; as we discovered, academic language, like “peer-reviewed” and “citation,” presents a similar problem. these are not issues that can be resolved with a technological solution.
rather, we join previous authors in suggesting that instruction may be a reasonable way to address many usability issues in primo. from our findings and from those in the wider literature, we conclude that general instruction in information literacy is a prerequisite for effective use of this or any research tool, particularly for undergraduates. nichols et al. “recommend studying how to effectively provide instruction on primo searching and results interpretation,”26 but instruction on the use of a single tool is of limited utility to students in their academic lives. instead, libraries could bolster information literacy instruction on key concepts around the production and storage of information, scholarly communications, and differences in information types. teaching these concepts effectively should help to alleviate the most common user issues, including understanding terminology and different types of information, as well as helping students to understand key elements of research in general. this is a particularly important point to note for librarians working as advocates for information literacy instruction, especially in cases where administrators or faculty may feel that more advanced tools, like discovery systems, should make instruction obsolete.

conclusion
several changes were made to the primo interface in response to breakdowns identified during the usability study. resource discovery systems first implemented the changes to the primo development sandbox. after the discovery usability and advisory groups agreed on the changes, they were made available on the live site (see figure 4). the redesigned search results page became available to the general public between the spring and summer academic sessions of 2015. in addition to the changes that were made because of the usability study, rds made changes to the look and feel to make the search results interface more aesthetically pleasing and more in line with the university of houston brand.
figure 4. primo interface before usability testing (live site)
figure 5. primo interface during usability testing (development sandbox)
figure 6. primo interface after usability testing (live site)

many larger assertions of this study, encompassing implications for instruction and our needs from discovery vendors, will require further study to address. the authors intend to continue to investigate these issues as additional usability testing is conducted and to use the data to support future vendor relations and instructional curriculum development discussions.

references
1. tamar sadeh, “user experience in the library: a case study,” new library world 109, no. 1/2 (2008): 7–24, doi:10.1108/03074800810845976.
2. aaron nichols et al., “kicking the tires: a usability study of the primo discovery tool,” journal of web librarianship 8, no. 2 (2014): 172–95, doi:10.1080/19322909.2014.903133; scott hanrath and miloche kottman, “use and usability of a discovery tool in an academic library,” journal of web librarianship 9, no. 1 (2015): 1–21, doi:10.1080/19322909.2014.983259.
3. david j. comeaux, “usability testing of a web-scale discovery system at an academic library,” college & undergraduate libraries 19, no. 2–4 (2012): 189–206, doi:10.1080/10691316.2012.695671.
4. kylie jarrett, “findit@ flinders: user experiences of the primo discovery search solution,” australian academic & research libraries 43, no. 4 (2012): 278–300; nichols et al., “kicking the tires.”
5. sadeh, “user experience in the library.”
6.
jarrett, “findit@ flinders”; nichols et al., “kicking the tires.”
7. xi niu, tao zhang, and hsin-liang chen, “study of user search activities with two discovery tools at an academic library,” libraries faculty and staff scholarship and research 30, no. 5 (2014), doi:10.1080/10447318.2013.873281; hanrath and kottman, “use and usability of a discovery tool in an academic library.”
8. rice majors, “comparative user experiences of next-generation catalogue interfaces,” library trends 61, no. 1 (2012): 186–207, doi:10.1353/lib.2012.0029; niu, zhang, and chen, “study of user search activities with two discovery tools at an academic library.”
9. beth thomsett-scott and patricia e. reese, “academic libraries and discovery tools: a survey of the literature,” college & undergraduate libraries 19, no. 2–4 (2012): 123–43, doi:10.1080/10691316.2012.697009.
10. sadeh, “user experience in the library.”
11. comeaux, “usability testing of a web-scale discovery system at an academic library.”
12. jessica mahoney and susan leach-murray, “implementation of a discovery layer: the franklin college experience,” college & undergraduate libraries 19, no. 2–4 (2012): 327–43, doi:10.1080/10691316.2012.693435.
13. joy marie perrin et al., “usability testing for greater impact: a primo case study,” information technology & libraries 33, no. 4 (2014): 57–67.
14. majors, “comparative user experiences of next-generation catalogue interfaces”; thomsett-scott and reese, “academic libraries and discovery tools.”
15. jarrett, “findit@ flinders”; mahoney and leach-murray, “implementation of a discovery layer.”
16. jarrett, “findit@ flinders”; mahoney and leach-murray, “implementation of a discovery layer”; nichols et al., “kicking the tires.”
17. jarrett, “findit@ flinders”; mahoney and leach-murray, “implementation of a discovery layer”; perrin et al., “usability testing for greater impact: a primo case study.”
18.
jarrett, “findit@ flinders”; nichols et al., “kicking the tires”; hanrath and kottman, “use and usability of a discovery tool in an academic library”; majors, “comparative user experiences of next-generation catalogue interfaces.”
19. comeaux, “usability testing of a web-scale discovery system at an academic library”; thomsett-scott and reese, “academic libraries and discovery tools.”
20. jarrett, “findit@ flinders.”
21. mahoney and leach-murray, “implementation of a discovery layer”; perrin et al., “usability testing for greater impact.”
22. mahoney and leach-murray, “implementation of a discovery layer”; nichols et al., “kicking the tires”; niu, zhang, and chen, “study of user search activities with two discovery tools at an academic library.”
23. thomsett-scott and reese, “academic libraries and discovery tools.”
24. tao zhang and merlen prommann, “applying hierarchical task analysis method to discovery layer evaluation,” information technology & libraries 34, no. 1 (2015): 77–105, doi:10.6017/ital.v34i1.5600.
25. david nelson and linda turney, “what’s in a word? rethinking facet headings in a discovery service,” information technology & libraries 34, no. 2 (2015): 76–91, doi:10.6017/ital.v34i2.5629.
26. nichols et al., “kicking the tires,” 184.

trope or trap? roleplaying narratives and length in instructional video
amanda s. clossen
information technology and libraries | march 2018 27

amanda s. clossen (asc17@psu.edu) is learning design librarian, pennsylvania state university.

abstract
a concern that librarians face when creating video is whether users will actually watch the video they are directed to.
this is a significant issue when it comes to how-to and other point-of-need videos. how should a video be designed to ensure maximum student interest and engagement? many of the basic skills demonstrated in how-to videos are crucial for success in research but are not always directly connected to a class. whether a video is selected for inclusion by an instructor or viewed after it is noticed by a student depends on how viewable the video is perceived to be. this article will discuss the results of a survey of more than thirteen hundred respondents. this survey was designed to establish the broad preferences of the viewers of instructional how-to videos, specifically focusing on the question of whether the length and presence of a role-playing narrative enhances or detracts from the viewer experience, depending on demographic. literature review length since the seminal 2010 study by bowles-terry, hensley, and hinchliffe established emerging best practices for pace, length, content, look and feel, and video versus text, a variety of works compiling best practices for video have been created.1 the very successful library minute videos from arizona state university resulted in a collection of how-tos and best practices by rachel perry.2 these included tips on addressing an audience, planning, content, length, frugality, and experimentation. in 2014 coastal carolina nursing students were surveyed for their preferences in video, resulting in another set of best practices. these focused on video length, speaking pace, zoom functionality, and use of callouts.3 martin and martin’s extensive 2015 review covers content, compatibility, accessibility, and audio.4 the recommended length listed in these best practices varies widely. 
thirty seconds to a minute is recommended by bowles-terry, hensley, and hinchliffe, while perry recommends no longer than ninety seconds.5 the coastal carolina study and seminole state review recommend no longer than three minutes.6 nearly all the articles reviewed stress that complicated concepts should be broken into more easily comprehensible chunks to avoid overwhelming student cognitive load.

trope or trap? role-playing narratives and length in instructional video | clossen 28 https://doi.org/10.6017/ital.v37i1.10046

narrative roleplay scenario
the typical roleplay involves a hypothetical student who needs some sort of assistance and is helped through the process using library resources. often there is also a hypothetical guide, who can be a librarian, friend, or professor. these hypothetical situations are recorded in a variety of ways: from live-action video recordings, to screencast voice-overs, to text. the efficacy of such tools in library video has been explored little, if at all. devine, quinn, and aguilar’s 2014 study explores the usage and effectiveness of micro- and macro-narratives in resident information literacy instruction,7 but this instructional scenario is clearly very different from how-to instructional videos.

the interplay between student interest and such narratives is addressed by emotional interest theory, which states that adding unrelated but interesting material increases attention by energizing the learner. these unrelated pieces of engaging material are known as seductive details. this “highly interesting and entertaining information . . .
is only tangentially related to the topic but is irrelevant to the author’s intended theme.”8 exploration of this concept through experimental study has indicated that seductive details are detrimental to learning.9 some evidence indicates that learners are more likely to remember these details than the important content itself thanks to cognitive load issues.10 however, there have also been cases where seductive details have improved recall.11 in their 2015 study, park, flowerday, and brünken argue that the format and presentation of seductive details have varying effect on learning processes and that they can be used to positive effect.12 in this paper, the seductive details to be studied are those of the roleplay narrative used to frame instruction in how-to videos. methods survey design the survey was designed to explore three questions: • does the length of the video affect a user’s willingness to watch it? • do users prefer videos that are pure instruction or those that use a roleplay narrative to deliver content? • does the demographic of the viewer affect a video’s viewability? the survey was revised in collaboration with a survey design and statistical specialist at the penn state library’s data learning center. the completed survey was then entered into qualtrics for implementation. implementation implementation and subject-gathering was done through a survey-research sampling company that provided both a wide demographic and rapid data collection. this was sponsored by an institutional grant. subjects from a variety of institution types and geographic locations were solicited via email invitation to complete a survey that explored their perspectives on instructional videos. information technology and libraries | march 2018 29 the twenty-question survey was focused on respondents of a traditional college age. implementation resulted in 1,305 responses out of 1,528 surveys. 
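as a quick check on the figures just reported, the completion rate can be worked out directly. this is only an illustrative calculation: the two counts come from the text, while the function name and the rounding to one decimal place are assumptions made for the example.

```python
# Counts reported in the text: 1,305 responses out of 1,528 surveys.
RESPONSES = 1305
SURVEYS = 1528


def response_rate(completed: int, distributed: int) -> float:
    """Return the share of distributed surveys that were completed."""
    if distributed <= 0:
        raise ValueError("distributed must be a positive number of surveys")
    return completed / distributed


rate = response_rate(RESPONSES, SURVEYS)
print(f"response rate: {rate:.1%}")  # prints "response rate: 85.4%"
```

on these figures, roughly 85 percent of the surveys sent out were completed.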
after implementation, results were compiled and analyzed by a statistical expert at the institutional data center. nearly all the analyses to follow are simple cross-tabulations of respondent choices, because correlations between demographics and preference were minor according to a multivariate analysis of variance (manova) test. results and discussion demographics the survey, which was limited to a traditionally college-aged population (eighteen to twenty-four), produced a nearly 1:1 gender distribution (figure 1). figure 1. age and gender distribution. the survey had around 64 percent student participants, 77 percent of these attending school full time. of those full-time students, 60 percent were resident students, and only 9 percent were solely online students. unemployed participants were more likely to be full-time resident students whereas online students were more likely to be employed full-time. (see figures 2 and 3.) figure 2. employment and student status distribution. figure 3. resident versus online status distribution. information and video confidence the distribution of confidence in information-seeking ability hovered around 90 percent. however, at most, only half of respondents had any familiarity with google scholar (see figure 4). this tells us several things, the most important being that what librarians consider appropriate confidence in information-seeking is very different from what the college-aged layperson considers appropriate. this supports colón-aguirre and fleming-may’s 2012 study that indicates that students are likely to use free online websites that require the least effort for their research.13 figure 4. information-seeking confidence. video length length of a video does play a role for most.
about 70 percent of participants indicated that they are either more likely to watch a video with a timestamp or will rarely watch unless the time is indicated (see figure 5). timestamp is easily provided by most video players. the mean maximum time for college-age participants’ willingness to watch was about four and a half minutes. the median was approximately three minutes. in general, shorter appears better: three to four minutes is around the maximum length that most eighteen to twenty-nine year olds are willing to watch. this contradicts all the referenced best practices but those proffered by baker, who described thirty to ninety seconds as ideal video viewing time. her study found that 41 percent of her students preferred videos that were one to three minutes long, but 24 percent preferred three to five minutes. because of this, she recommends videos that are three minutes or less.14 figure 5. perspective on viewing time. instructions versus roleplay the bulk of the survey was questions related to two videos. both videos were under three minutes long and were produced using techsmith’s camtasia screencast software. the screencast video simply explained how to complete a research task: searching google scholar for an article addressing a theme in shakespeare’s romeo and juliet. viewers were guided through the process of finding articles on this topic by a single narrator. no dramatized roleplay situation was presented. the narrative video guided the participants through a hypothetical situation dramatized by two actors. the scenario was a common one: a student procrastinating on a paper and asking her roommate for assistance at the last minute. the roommate guided the student through use of google scholar, completing the same tasks as the screencast video.
participants watched both videos and answered a series of questions on their reactions. number of views was tracked on the media player, verifying that both videos were viewed. screencasts while watching the screencast video, most participants found that the narrator was trustworthy and that they were learning. only 15 percent felt the video needed an example scenario. though there were mixed experiences as to the length of the video, the timing of the video seemed on point, as only 11.6 percent strongly believed that the video took too long and 7.5 percent strongly felt that it went too quickly. (see figure 6.) figure 6. screencast reactions. when asked an open-ended question about what struck them the most in the screencast video, respondents most frequently stated that they found it to be informative and interesting, or at least neutral. however, a variety of responses were observed, both negative and positive, or even contradictory. it is worth noting that within this open-ended format, dislike of the narrator’s voice was independently cited as one of the top three issues. this stresses the importance of coherent and pleasant narration, as it is something that viewers will likely notice. figure 7. open-ended questions: screencast. narrative while watching the narrative video, participants found that they could relate to the characters or scenario and found that they were learning as much as they were when watching the screencast (see figure 8). however, there were mixed responses regarding video length and credibility of the narrator. when compared across demographics, employed respondents and students were more likely to agree that they could relate to the scenario than unemployed respondents and nonstudents.
male and employed respondents were more likely to think that the video went too fast than female and unemployed respondents. when asked an open-ended question on what most struck them about the narrative video, respondents most often stated that they found it to be boring and long, though a good number also indicated it was interesting and informative (see figure 9). just as with the screencast video, a variety of responses, both negative and positive, were observed, some even conflicting. figure 8. narrative reactions. figure 9. open-ended questions: narrative. in addition, 13.5 percent of respondents were unsatisfied with the content of the video. screencast versus narrative the screencast video tended to be preferred by respondents, with higher average scores in content, engagement, learning value, and narrator trustworthiness. in contrast, respondents also thought that the screencast video moved too quickly compared to the narrative video. additionally, participants were more impatient during the narrative video (see figure 10). figure 10. screencast versus narrative. to observe differences between the screencast and narrative videos with regard to respondent reactions within specific population demographics, a manova test was performed. this test revealed that none of the p-values were significant (at α = .05), indicating no significant relationship between student status, employment status, and reaction to each video.
a more liberal interpretation of the data from this analysis might conclude that differences in impatience across student status were possibly significant (α = .10), with students being more likely to exhibit a smaller difference in impatience for the two video styles. the preferences for screencast over narrative video did not change when the data were split by demographic. statistics for differences in screencast and narrative (n=1305): score defined as 1 = “not very much” to 5 = “very much”, with difference = screencast score – narrative score. red rows indicate higher scores for the narrative video. conclusions it is impossible to please everyone all the time, or at least that is what the survey results suggest. there are several takeaways from this study: • video length matters, especially as a consideration before the video is viewed. timestamps should be included in video creation, or it is highly likely that the video will not be viewed. the video player is key here, as some video players include video length, while others do not. videos that exceed four minutes are unlikely to be viewed unless they are required. • voice quality in narration matters. although preference in type of voice inevitably varies, the actor’s voice is noticed over production value. it is important that the narrator speaks evenly and clearly. • for brief how-to videos, there is a small preference for screencast instructional videos over a narrative roleplay scenario. the results of the survey indicate that roleplay videos should be well produced, brief, and high quality. however, what constitutes high quality is not very well established.15 • finally, screencast videos should include an example scenario, however brief, to ground the viewer in the task. suggestions for further study next steps for research might include a more refined survey focusing on the results of this study.
of equal value would be a series of focus groups that are given both a screencast and narrative video and asked to discuss their preferences. though a wide variety of students were surveyed, limits of this dataset prevented the exploration of specific correlations among students attending different institution types or among those pursuing different majors. further research addressing the differences among these student bodies would be a welcome addition to the literature. references 1 melissa bowles-terry, merinda kaye hensley, and lisa janicke hinchliffe, “best practices for online video tutorials in academic libraries: a study of student preferences and understanding,” communications in information literacy 4, no. 1 (january 1, 2010): 17–28. 2 anali maughan perry, “lights, camera, action! how to produce a library minute,” college & research libraries news 72, no. 5 (2011): 278–83. 3 ariana baker, “students’ preferences regarding four characteristics of information literacy screencasts,” journal of library & information services in distance learning 8, no. 1–2 (january 2, 2014): 67–80, https://doi.org/10.1080/1533290x.2014.916247. 4 nichole a. martin and ross martin, “would you watch it? creating effective and engaging video tutorials,” journal of library & information services in distance learning 9, no. 1–2 (january 2, 2015): 40–56, https://doi.org/10.1080/1533290x.2014.946345. 5 bowles-terry, hensley, and hinchliffe, “best practices,” 23; perry, “lights, camera, action!,” 282. 6 baker, “students’ preferences,” 76; martin and martin, “would you watch it?,” 48. 7 jaclyn r. devine, todd quinn, and paulita aguilar, “teaching and transforming through stories: an exploration of macro- and micro-narratives as teaching tools,” reference librarian 55, no. 4 (october 2, 2014): 273–88, https://doi.org/10.1080/02763877.2014.939537. 8 shannon f.
harp and richard e. mayer, “the role of interest in learning from scientific text and illustrations: on the distinction between emotional interest and cognitive interest,” journal of educational psychology 89, no. 1 (1997): 92–102, https://doi.org/10.1037/0022-0663.89.1.92. 9 suzanne hidi and valerie anderson, “situational interest and its impact on reading and expository writing,” in the role of interest in learning and development, ed. by k. ann renninger (hillsdale, nj: l. erlbaum associates, 1992), 213–14. 10 babette park et al., “does cognitive load moderate the seductive details effect? a multimedia study,” in “current research topics in cognitive load theory,” special issue, computers in human behavior 27, no. 1 (january 1, 2011): 5–10, https://doi.org/10.1016/j.chb.2010.05.006. 11 annette towler et al., “the seductive details effect in technology-delivered instruction,” performance improvement quarterly 21, no. 2 (january 1, 2008): 65–86, https://doi.org/10.1002/piq.20023. 12 babette park, terri flowerday, and roland brünken, “cognitive and affective effects of seductive details in multimedia learning,” computers in human behavior 44 (march 1, 2015): 267–78, https://doi.org/10.1016/j.chb.2014.10.061. 13 mónica colón-aguirre and rachel a. fleming-may, “‘you just type in what you are looking for’: undergraduates’ use of library resources vs. wikipedia,” journal of academic librarianship 38, no. 6 (november 1, 2012): 391–99, https://doi.org/10.1016/j.acalib.2012.09.013. 14 baker, “students’ preferences,” 76. 15 towler et al., “the seductive details,” 71.
reproduced with permission of the copyright owner. further reproduction prohibited without permission.

using a native xml database for encoded archival description search and retrieval
cornish, alan
information technology and libraries; dec 2004; 23, 4; proquest pg. 181

communications

the northwest digital archives (nwda) is a national endowment for the humanities-funded effort by fifteen institutions in the pacific northwest to create a finding-aids repository. approximately 2,300 finding aids that follow the encoded archival description (ead) standard are being contributed to a union catalog by academic and archival institutions in idaho, montana, oregon, and washington. this paper provides some information on the ead standard and on search and retrieval issues for ead xml documents. it describes native xml technology and the issues that were considered in the selection of a native xml database, ixiasoft's textml, to support the nwda project.
pitti, one of the founders of the ead standard, noted the primary motivation behind the creation of ead: "to provide a tool to help mitigate the fact that the geographic distribution of collections severely limits the ability of researchers, educators, and others to locate and use primary sources."1 pitti expanded on this need for ead in a 1999 d-lib article:

the logical components of archival description and their relations to one another need to be accurately identified in a machine-readable form to support sophisticated indexing, navigation, and display that provide thorough and accurate access to, and description and control of, archival materials.2

in a more recent publication, pitti and duff noted a key advantage offered by ead that relates to the focus of this article, the development of an ead union catalog:

ead makes it possible to provide union access to detailed archival descriptions and resources in repositories distributed throughout the world. . . . libraries and archives will be able to easily share information about complementary records and collections, and to "virtually" integrate collections related by provenance, but dispersed geographically or administratively.3

in a 2001 american archivist article, roth examined ead history and deployment methods used up to the 2001 time period. importantly, two of the most prominent delivery systems described by roth, dynatext (a server-side solution) and panorama (a client-side solution), were, by 2003, obsolete products for ead delivery. this is indicative of the rapid pace of change in ead deployment, in part due to the migration from sgml to xml technologies. roth described survey results obtained on ead deployment that underscore the recognized need at that time for a "cost-effective server-side xml delivery system."
the lack of such a solution motivated institutions to choose html as a delivery method for ead finding aids.4

articles like roth's that describe specific ead search-and-retrieval implementation options are in short supply. one such option, the university of michigan dlxs xpat software, is employed for the search and retrieval of ead and other metadata in the university of illinois at urbana-champaign (uiuc) cultural heritage repository.5 another option, harvesting ead records into machine-readable cataloging (marc) to establish search and retrieval access in an integrated library system, was described by fleck and seadle in a 2002 coalition for networked information task force briefing. using an xml harvester product created by innovative interfaces, marc records are generated based upon marc encoding analogs included in the ead markup and loaded into an innovative interfaces innopac system.6 this product has been used to create access to ead finding aids in the catalog for michigan state university's vincent voice library.

in a 2001 article, gilliland-swetland recommended several desirable features for an ead search-and-retrieval system. she emphasized the challenge of ead search and retrieval by noting the nature of finding aids themselves:

archivists have historically been materials-centric rather than user-centric in their descriptive practices, resulting in the finding aid assuming a form quite unlike the concise bibliographic description with name and subject access most users are accustomed to using in other information systems such as library catalogs, abstracts, and indexes.7

without describing specific software tools, gilliland-swetland argued for a user-centric approach to the search and retrieval of finding aids by examining the needs of specific user communities such as genealogists, k-12 teachers, and historians.8

several initiatives similar to the nwda effort are described in the professional literature.
the online archive of california (oac), which was founded in the mid-1990s, is a consortium of california special-collections repositories. a number of key consortium functions are centralized, including "monitoring to ensure consistency of ead encoding across all oac finding aids" according to agreed-upon best practices, a critical need in the creation of a union catalog.9 brown and schottlaender also describe the integration of the oac into the california digital library, which enables linkages between ead finding aids and digitized copies of original materials.10

finally, one important development area is the possibility of integrating ead documents into open archives initiative (oai) services in order to enhance resource discovery. a 2002 paper written by prom and habing, both of whom work with the uiuc cultural heritage repository, explored the possibility of mapping ead to oai, the latter of which is based upon the fifteen-element dublin core metadata set (unqualified). while noting, "we do not propose that the full capabilities of ead finding aids could be subsumed by oai," prom and habing suggested that it is possible to map the top-level and component portions of ead into oai, resulting in multiple oai records from a single ead finding aid. in this scenario, a single oai record is created from the collection-level information and multiple records from component-level information in an ead document.11

alan cornish (cornish@wsu.edu) is systems librarian, washington state university libraries, pullman.

evaluation of ead search and retrieval products

in order to identify a software solution for supporting a union catalog of ead finding aids, the consortium conducted a product evaluation.
the strengths and weaknesses of the native xml technology employed by the consortium can be best understood by looking at alternative xml products and product categories. table 1 shows the products considered during an evaluation period that consisted of both product research and actual trials.

in approaching the evaluation, the consortium and its union-catalog host institution, the washington state university libraries, had several specific needs in mind. first, the licensing and support costs for the product needed to fit within the consortium's budget. second, the search-and-retrieval software had to support several basic functions: keyword searching across all union-catalog finding aids; specific field searching based upon elements or attributes in the ead document; an ability to customize the look and feel of the interface and search-results screens; and the ability to display search term(s) in the context of the finding aid.

as noted in the table, three of the evaluated products are native xml databases. cyrenne provides a definition of native xml as a database with these features:
• the xml document is stored intact: "the xml document is preserved as a separate, unique entity in its entirety."
• "schema independence," that is, "any well-formed xml document can be stored and queried."
• the query language is xml-based: "native xml database vendors typically use a query language designed specifically for xml" as opposed to sql.12

of the three native xml products, only the licensing costs of ixiasoft's textml and the open-source xindice software fell within the available project funding. both packages were extensively tested, with textml proving superior at handling the large (sometimes in the mb-size range) and structurally complex ead documents created by consortium members. one key strength of textml that met an nwda consortium need involved field searching.
in textml, it is possible to map a search field to one or more xpath statements, enabling the creation of search fields based upon the precise use of an element or attribute in ead documents. the importance of this capability is shown with the ead element, which can appear at the collection level and at the subordinate component level in a document. with textml, using its limited xpath support, it is possible to reference a specific, contextual use of the element.

in addition to the native xml solutions, several other product types were considered. an xml query engine, verity ultraseek, was tested and produced good results when used for the search and retrieval of consortium documents.13 ultraseek can be used to search discrete xml files, supports the creation of custom interfaces for the search-and-retrieval system, and has strong documentation. probably the most obvious limitation in this xml query-engine product concerned the creation of search fields. to contrast ultraseek with a native xml solution: ultraseek 5.0 (used during the product trial) lacked xpath support. instead, it required a unique element-attribute combination for the creation of a database search field. returning to the example, contextual uses of the element could not be indexed without recoding consortium documents to create a unique element-attribute combination on which to index.

an xml-enabled database, dlxs xpat, has been successfully used in several ead projects, including oac. one disadvantage of this product is that it requires a unix operating system for the server. additionally, xpat, as a supporting toolset for digital-library collection building, provides functionality that duplicates other media tools at the host institution (specifically, oclc/dimema contentdm).
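the contextual field mapping described above for the native xml products can be sketched in a few lines. this is only an illustration under stated assumptions: it uses python's standard xml.etree.elementtree module rather than textml (whose own api is not shown in this article), and the element name unittitle, the sample finding aid, and the field names collection_title and component_title are all hypothetical choices made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified EAD-like finding aid (not a consortium document).
SAMPLE = """
<ead>
  <archdesc level="collection">
    <did><unittitle>Smith family papers</unittitle></did>
    <dsc>
      <c01 level="series">
        <did><unittitle>Correspondence</unittitle></did>
      </c01>
      <c01 level="series">
        <did><unittitle>Photographs</unittitle></did>
      </c01>
    </dsc>
  </archdesc>
</ead>
"""

# Each search field maps to one or more path expressions, so the same element
# name can feed different fields depending on where it occurs in the document.
FIELD_PATHS = {
    "collection_title": ["./archdesc/did/unittitle"],
    "component_title": ["./archdesc/dsc/c01/did/unittitle"],
}


def build_field_index(xml_text, field_paths):
    """Return {field name: [text values]} for every path mapped to that field."""
    root = ET.fromstring(xml_text)
    index = {}
    for field, paths in field_paths.items():
        values = []
        for path in paths:
            values.extend((el.text or "").strip() for el in root.findall(path))
        index[field] = values
    return index


index = build_field_index(SAMPLE, FIELD_PATHS)
print(index["collection_title"])  # ['Smith family papers']
print(index["component_title"])   # ['Correspondence', 'Photographs']
```

a query engine that can key a field only on an element-attribute combination would have to treat all three titles alike unless the documents were recoded, which is the limitation noted for ultraseek.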
the use of a relational database management system (rdbms) to establish search and retrieval for ead xml documents was considered as well. the advantage to this approach is that it would enable the use of coding techniques built up through other web-based media delivery projects at the host institution. the most obvious negative issue is the need to map xml elements or attributes to tables and fields in an rdbms, which, as cyrenne notes, "is often expensive and will most likely result in the loss of some data such as processing instructions, and comments as well as the notion of element and attribute ordering."14 the use of native xml avoids the task of exploding xml data into the table and field structures of an rdbms.

table 1. nwda project: evaluated search and retrieval products

product              vendor                  product category                       license
mysql/php            n/a                     relational database management system  open source
tamino xml server    software ag             native xml database                    commercial
textml               ixiasoft                native xml database                    commercial
ultraseek            verity                  xml query engine                       commercial
xindice              n/a                     native xml database                    open source
xml harvester        innovative interfaces   integrated library system              commercial
xpat                 dlxs                    xml-enabled database                   commercial

finally, another approach considered was the use of an integrated library system product. this was a realistic option for nwda because consortium member institutions had decided to include marc encoding analogs for selected elements in union-catalog finding aids. innovative interfaces produces an xml harvester that can be used to generate marc records from ead finding aids that include marc encoding analogs. for this project, a local (or self-contained) catalog could have been created and populated with marc records containing metadata for the ead documents, including a url for online access.
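the harvesting idea described above, generating marc-style fields from elements that carry marc encoding analogs, can be sketched as follows. this is a minimal illustration, not the innovative interfaces harvester: the sample markup, the analog values, and the function name are assumptions, and the output is a list of plain tuples rather than real marc records.

```python
import xml.etree.ElementTree as ET

# Hypothetical EAD-like fragment; the encodinganalog values are illustrative only.
SAMPLE = """
<ead>
  <archdesc level="collection">
    <did>
      <unittitle encodinganalog="245$a">Smith family papers</unittitle>
      <origination><persname encodinganalog="100">Smith, Jane</persname></origination>
      <physdesc>4 linear feet</physdesc>
    </did>
  </archdesc>
</ead>
"""


def harvest_marc_like_fields(xml_text):
    """Collect (tag, subfield, text) for every element with an encodinganalog."""
    root = ET.fromstring(xml_text)
    fields = []
    for el in root.iter():
        analog = el.get("encodinganalog")
        if analog is None:
            continue  # elements without an analog contribute nothing to the record
        tag, _, subfield = analog.partition("$")
        text = " ".join("".join(el.itertext()).split())  # normalize whitespace
        fields.append((tag, subfield or None, text))
    return fields


fields = harvest_marc_like_fields(SAMPLE)
for field in fields:
    print(field)
# ('245', 'a', 'Smith family papers')
# ('100', None, 'Smith, Jane')
```

the physdesc text never reaches the harvested record because it has no analog, which illustrates why searching in such a catalog is limited to the extracted metadata rather than the full finding aid.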
this approach offers important strengths and weaknesses. on the positive side, it is a relatively easy method for enabling search-and-retrieval access to ead finding aids. in contrast to the interface coding requirements for textml, the xml harvester provided an almost turnkey approach to xml search and retrieval. on the negative side, two factors stood out during the evaluation. first, it would be difficult to fully customize search-and-retrieval interfaces as needed for the project. second, using the xml harvester, there is no ability to display search terms in the context of the finding aid. search and retrieval is based upon the metadata extracted from the finding aid using the marc analogs. in michigan state's voice library implementation of this solution, the finding aid is an external resource with no highlighting of search terms.

strengths and weaknesses of the textml approach

each project has its own specific needs; thus, there is no correct approach to establishing search and retrieval for ead xml documents. in taking the needs and resources of the nwda consortium into account, ixiasoft's textml, a native xml product, provided the best fit and was licensed for use. the use of textml enables the creation of customized interfaces for an xml database (or document base, using the textml terminology) and provides support for keyword and field searching of consortium documents. the qualified xpath support in textml enables search fields to be built upon precise element or attribute combinations within ead documents.

the existence of a major finding-aids internet site employing textml was a factor in the project's selection of the software. the access to archives (a2a) site, accessible from url www.a2a.pro.gov.uk/, provides an excellent model for a publicly searchable finding-aid site.
the a2a site supports keyword searching and searching by archival facility; provides multiple views of search results (a summary records screen, search terms in context, and the full record); highlights search term(s) in the displayed finding aid; and supports the presentation of large finding-aid documents. while a2a uses general international standard archival description, or isad(g), as opposed to ead for its description standard, the similarities between the two standards make the a2a site a valuable example for development.15

one weakness of textml is the implementation model supported by ixiasoft, which assumes significant local development of the application or web interface. the relationship between software capabilities and local development was considered with each of the products listed in table 1. as noted, the innovative interfaces solution was the most straightforward approach, assuming the existence of the marc analogs in ead markup, but provided the least flexibility in terms of customization and establishing a true linkage between the search system and the actual document. in contrast, while ixiasoft makes available a base set of active server pages using visual basic script (asp/vbscript) code for textml application development and provides very good training and support services, the responsibility for that development rests with the local site. for the nwda consortium, this development, using the code base, has been manageable. the current state of interface development for the nwda project can be reviewed at http://nwda.wsulibs.wsu.edu/project_info/.
conclusion

in selecting an ead search-and-retrieval system, one important question for the consortium was, which software solution had the best prospects for migration in the future? because of the inherent strengths of native xml technology in comparison to the other product categories listed in table 1, a native xml database appeared to be the best approach, and textml provided the best combination of licensing costs, software capabilities, and support. it is important to note that the distinctions between native xml databases and databases that support xml through extensions (xml-enabled databases) may become more difficult to discern over time, in part due to the existing expertise and investments in rdbms technologies.16 nevertheless, capabilities central to native xml, such as the use of an xml-based query language, are integral to the success of such hybrid systems.

references and notes

1. daniel pitti, "encoded archival description: the development of an encoding standard for archival finding aids," the american archivist 60, no. 3 (summer 1997): 269.
2. daniel pitti, "encoded archival description: an introduction and overview," d-lib magazine 5, no. 11 (nov. 1999). accessed nov. 2, 2004, www.dlib.org/dlib/november99/11pitti.html.
3. daniel v. pitti and wendy m. duff (eds.), "introduction," in encoded archival description on the internet (binghamton, n.y.: haworth, 2001), 3.
4. james m. roth, "serving up ead: an exploratory study on the deployment and utilization of encoded archival description finding aids," the american archivist 64, no. 2 (fall/winter 2001): 226.
5. sarah l. shreeves et al., "harvesting cultural heritage metadata using the oai protocol," library hi tech 21, no. 2 (2003): 161.
6.
nancy fleck and michael seadle, "ead harvesting for the national gallery of the spoken word" (paper presented at the coalition for networked information fall 2002 task force meeting, san antonio, tex., dec. 2002). accessed nov. 2, 2004, www.cni.org/tfms/2002b.fall/handouts/h-ead-fleckseadle.doc.
7. anne j. gilliland-swetland, "popularizing the finding aid: exploiting ead to enhance online discovery and retrieval," in encoded archival description on the internet (binghamton, n.y.: haworth, 2001), 207.
8. ibid., 210-14.
9. charlotte b. brown and brian e. c. schottlaender, "the online archive of california: a consortial approach to encoded archival description," in encoded archival description on the internet (binghamton, n.y.: haworth, 2001), 99.
10. ibid., 103-5. oac available at: www.oac.cdlib.org/. accessed nov. 2, 2004.
11. christopher j. prom and thomas habing, "using the open archives initiative protocols with ead," in proceedings of the second acm/ieee-cs joint conference on digital libraries (portland, ore., july 2002). accessed nov. 2, 2004, http://dli.grainger.uiuc.edu/publications/jcdl2002/pl4prom.pdf.
12. marc cyrenne, "going native: when should you use a native xml database?" aim e-doc magazine 16, no. 6 (nov./dec. 2002), 16. accessed nov. 2, 2004, www.edocmagazine.com/article_new.asp?id=25421.
13. product category decisions based upon definitions and classifications available from: ronald bourret, "xml database products." accessed nov. 2, 2004, www.rpbourret.com/xml/xmldatabaseprods.htm.
14. cyrenne, "going native," 18.
15. bill stockting, "ead in a2a," microsoft powerpoint presentation. accessed nov. 2, 2004, www.agad.archiwa.gov.pl/ead/stocking.ppt.
16. uwe hohenstein, "supporting xml in oracle9i," in akmal b.
chaudhri, awais rashid, and roberto zicari (eds.), xml data management: native xml and xml-enabled database systems (boston: addison-wesley, 2003), 123-4.

184 information technology and libraries | december 2004

using gis to measure in-library book-use behavior

jingfeng xia

this article is an attempt to develop geographic information systems (gis) technology into an analytical tool for examining the relationships between the height of the bookshelves and the behavior of library readers in utilizing books within a library. the tool would contain a database to store book-use information and some gis maps to represent bookshelves. upon analyzing the data stored in the database, different frequencies of book use across bookshelf layers are displayed on the maps. the tool would provide a wonderful means of visualization through which analysts can quickly realize the spatial distribution of books used in a library. this article reveals that readers tend to pull books out of the bookshelf layers that are easily reachable by human eyes and hands, and thus opens some issues for librarians to reconsider the management of library collections.

several years ago, when working as a library assistant reshelving books in a university library, the author noted that the majority of books used inside the library were from the mid-range layers of bookshelves. that is, by proportion, few books pulled out by library readers were from the top or bottom layers. books on the layers that were easily reachable by readers were frequently utilized. such a book-use distribution pattern made the job of reshelving books easy, but created some inquiries: how could book locations influence the choices of readers in selecting books?
if this was not an isolated observation, it must have exposed an interesting

benign neglect: developing life rafts for digital content | deridder 71

in accordance with the current best practices.8 for those producers of content who are not able to meet the requirements of ingest, or who do not have access to an oais archive provider, what are the options? with the recent downturn in the economy, the availability of staff and the funding for the support of digital libraries has no doubt left many collections at risk of abandonment. is there a method for preparation of content for long-term storage that is within the reach of existing staff with few technical skills? if the content cannot get to the safe harbor of a trusted digital library, is it consigned to extinction? or are there steps we can take to mitigate the potential loss?

the oais model incorporates six functional entities: ingest, data management, administration, preservation planning, archival storage, and access.9 of these six, only archival storage is primary; all the others are useless without the actual content. and if the content cannot be accessed in some form, the storage of it may also be useless. therefore the minimal components that must be met are those of archival storage and some form of access. the lowest cost and simplest option for archival storage currently available is the distribution of multiple copies dispersed across a geographical area, preferably on different platforms, as recommended by the current lockss initiative,10 which focuses on bit-level preservation.11 private lockss network models (such as the alabama digital preservation network)12 are the lowest-cost implementation, requiring only hardware, membership in lockss, and a small amount of time and technical expertise.
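to make the notion of bit-level preservation across multiple copies concrete, the sketch below compares sha-256 checksums of several copies of one file and reports any copy that disagrees with the majority. it is only an illustration of the general idea; it is not the lockss polling protocol, and the function names and the simple majority rule are assumptions made for the example.

```python
import hashlib
from collections import Counter


def sha256_of(path, chunk_size=65536):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_damaged_copies(copies):
    """Return the paths whose checksum differs from the most common one.

    Assumption: most copies are intact, so the most common digest is
    treated as correct. With an even split there is no real majority,
    and the result should not be trusted.
    """
    digests = {str(path): sha256_of(path) for path in copies}
    majority_digest, _count = Counter(digests.values()).most_common(1)[0]
    return [path for path, digest in digests.items() if digest != majority_digest]
```

with three or more copies held at different sites, a single silently corrupted copy stands out and can be replaced from one of the others, which is the practical payoff of keeping many copies.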
reduction of the six functional entities to only two negates the need in contrast, other leaders of the digital preservation movement have been stating for years that benign neglect is not a workable solution for digital materials. eric van de velde, director of caltech’s library information technology group, stated that the “digital archive must be actively managed.”3 tom cramer of stanford university agrees: “benign neglect doesn’t work for digital objects. preservation requires active, managed care.”4 the digital preservation europe website argues that benign neglect of digital content “is almost a guarantee that it will be inaccessible in the future.”5 abby smith goes so far as to say that “neglect of digital data is a death sentence.”6 arguments to support this statement are primarily those of media or data carrier storage fragility and obsolescence of hardware, software, and format. however, the impact of these arguments can be reduced to a manageable nightmare. by removing as much as possible of the intermediate systems, storing open-source code for the software and operating system needed for access to the digitized content, and locating archival content directly on the file system itself, we reduce the problems to primarily that of format obsolescence. this approach will enable us to forge ahead in the face of our lack of resources and our rather desperate need for rapid, cheap, and pragmatic solutions. current long-term preservation archives operating within the open archival information system (oais) model assume that producers can meet the requirements of ingest.7 however, the amount of content that needs to be deposited into archives and the expanding variety of formats and genres that are unsupported, are overwhelming the ability of depositors to prepare content for preservation. 
andrea goethals of harvard proposed that we revisit assumptions of producer ability to prepare content for deposit benign neglect: developing life rafts for digital content i n his keynote speech at the archiving 2009 conference in arlington, virginia, clifford lynch called for the development of a benign neglect model for digital preservation, one in which as much content as possible is stored in whatever manner available in hopes of there someday being enough resources to more properly preserve it. this is an acknowledgment of current resource limitations relative to the burgeoning quantities of digital content that need to be preserved. we need low cost, scalable methods to store and preserve materials. over the past few years, a tremendous amount of time and energy has, sensibly, been devoted to developing standards and methods for best practices. however, a short survey of some of the leading efforts clarifies for even the casual observer that implementation of the proposed standards is beyond many of those who are creating or hosting digital content, particularly because of restrictions on acceptable formats, requirements for extensive metadata in specific xml encodings, need for programmers for implementation, costs for participation, or simply a lack of a clear set of steps for the uninitiated to follow (examples include: planets, premis, dcc, caspar, irods, sound directions, hathitrust).1 the deluge of digital content coupled with the lack of funding for digital preservation and exacerbated by the expanding variety of formats, makes the application of extensive standards and extraordinary techniques beyond the reach of the majority. given the current circumstances, lynch says, either we can seek perfection and store very little, or we can be sloppy and preserve more, discarding what is simply intractable.2 communications jody l. deridder (jlderidder@ua.edu) is head, digital services, university of alabama. jody l. 
deridder 72 information technology and libraries | june 2011 during digitization is that developing digital libraries usually have a highly chaotic disorganization of files, directory structures, and metadata that impede digital preservation readiness.19 if the archival digital files cannot be easily and readily associated with the metadata that provides their context, and if the files themselves are not organized in a fashion that makes their relationships transparent, reconstruction of delivery at some future point is seriously in question. underfunded cultural heritage institutions need clear specifications for file organization and preparation that they are capable of meeting without programming staff or extensive time commitments. particularly in the current economic downturn, few institutions have the technical skills to create mets wrappers to clarify file relationships.20 one potential solution is to use the organization of files in the file system itself to communicate clearly to future archivists how the files relate to one another. at the university of alabama, we have adopted a standardized file naming system that organizes content by the holding institution and type, collection, item, and then sequence of delivery (see figure 1). the file names are echoed in the file system: top level directories match the holding institution number sequence, secondary level directory names match the assigned collection number sequence, and so forth. 
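the relationship between file names and directories described here can be sketched in a few lines. the identifier layout used below (four underscore-separated segments for institution, collection, item, and sequence) and the sample file name are hypothetical stand-ins, since figure 1 is not reproduced in this text; the sketch shows only how a name that encodes the hierarchy also determines where the file lives.

```python
from pathlib import PurePosixPath

# Hypothetical segment order; the real scheme is defined in figure 1.
LEVELS = ("institution", "collection", "item", "sequence")


def parse_name(file_name):
    """Split a file name such as 'u0001_0000002_0000003_0004.tif' into levels."""
    stem = file_name.rsplit(".", 1)[0]
    parts = stem.split("_")
    if len(parts) != len(LEVELS):
        raise ValueError("expected %d segments in %r" % (len(LEVELS), file_name))
    return dict(zip(LEVELS, parts))


def archive_path(file_name):
    """Derive the storage location from the file name alone."""
    fields = parse_name(file_name)
    return PurePosixPath(
        fields["institution"], fields["collection"], fields["item"], file_name
    )


if __name__ == "__main__":
    print(archive_path("u0001_0000002_0000003_0004.tif"))
```

because the path is computed from the name, a file that has been separated from its directory can still be put back in place, and no database is needed to record where anything belongs.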
metadata and documentation are stored at whatever level in the file system corresponds to the files to which they apply, and these text and xml files have file names that also correspond to the files to which they apply, which assists further in identification (see figure 2).21 by both naming and ordering the files according to the same system, and bypassing the need for databases, complex metadata schemes and software, we leverage the simplicity of the file system to bring order to chaos and to enable our content to be easily reconstructed by future systems. take and manage the content is still uncertain. the relay principle states that a preservation system should support its own migration. preserving any type of digital information requires preserving the information’s context so that it can be interpreted correctly. this seems to indicate that both the intellectual context and the logical context need to be provided. context may include provenance information to verify authenticity, integrity, and interpretation;17 it may include structural information about the organization of the digital files and how they relate to one another; and it should certainly include documentation about why this content is important, for whom, and how it may be used (including access restrictions). because the cost of continued migration of content is very high, a method of mitigating that cost is to allow content to become obsolete but to support sufficient metadata and contextual information to be able to resurrect full access and use at some future time—the resurrection principle. to be able to resurrect obsolete materials, it would be advisable to store the content with open-source software that can render it, an opensource operating system that can support the software, and separate plain-text instructions for how to reconstruct delivery. 
in addition, underlying assumptions of the storage device itself need to be made explicit if possible (type of file system partition, supported length of file names, character encodings, inode information locations, etc.). some of the need for this form of preservation may be diminished through such efforts as the planets timecapsule deposit.18 this consortium has gathered the supporting software and information necessary to access current common types of digital files (such as pdf), for long-term storage in swiss fort knox. one of the drawbacks to gathering and storing content developed for a tremendous amount of metadata collection. where the focus has been on what is the best metadata to collect, the question becomes: what is the minimal metadata and contextual information needed? the following is an attempt to begin this conversation in the hope that debate will clarify and distill the absolutely necessary and specific requirements to enable long-term access with the lowest possible barrier to implementation. if we consider the purpose of preservation to be solely that of ensuring long-term access, it is possible to selectively identify information for inclusion. the recent proposal by the researchers of the national geospatial digital archive (ngda) may help to direct our focus. they have defined three architectural design principles that are necessary to preserve content over time: the fallback principle, the relay principle, and the resurrection principle.13 in the event that the system itself is no longer functional, then a preservation system should support some form of hand-off of its content—the fallback principle. this can be met by involvement in lockss, as specified above. 
lacking the ability to support even this, current creators and hosts of digital content may be at the mercy of political or private support for ingest into trusted digital repositories.14 the recently developed bagit file package format includes valuable information to ensure uncorrupted transfer for incorporation into such an archive.15 each base directory containing digital files is considered a bag, and the contents can be any types of files in any organization or naming convention; the software tags the content (or payload) with checksums and manifest, and bundles it into a single archive file for transfer and storage. an easily usable tool to create these manifests has already been developed to assist underfunded cultural heritage organizations in preparing content for a hosting institution or government infrastructure willing to preserve the content.16 the gap of who would communications | deridder 73benign neglect: developing life rafts for digital content | deridder 73 clifford lynch pointed out, funding cutbacks at the sub-federal level are destroying access and preservation of government records; corporate records are winding up in the trash; news is lost daily; and personal and cultural heritage materials are disappearing as we speak.24 it is valuable and necessary to determine best practices and to seek to employ them to retain as much of the cultural and historical record as possible, and in an ideal world, these practices would be applied to all valuable digital content. but in the practical and largely resource-constrained world of most libraries and other cultural institutions, this is not feasible. the scale of content creation, the variety and geographic dispersal of materials, and the cost of preparation and support makes it impossible for this level of attention to be applied to the bulk of what must be saved. 
for our cultural memory from this period to survive, we need to communicate simple, clear, scalable, inexpensive options to digital holders and creators. references 1. planets consortium, planets preservation and long-term access through networked services, http:// www.planets-project.eu/ (accessed mar. 29, 2011); library of congress, premis (preservation metadata maintenance activity), http://www.loc.gov/standards/premis/ (accessed mar. 29, 2011); dcc (digital curation centre), http:// www.dcc.ac.uk/ (accessed mar. 29, 2011); caspar (cultural, artistic, and scientific knowledge for preservation, access, and retrieval), http://www.casparpreserves .eu/ (accessed mar. 29, 2011); irods (integrated rule-oriented data system), h t t p s : / / w w w. i ro d s . o rg / i n d e x . p h p / irods:data_grids,_digital_libraries,_ persistent_archives,_and_real-time_ data_systems (accessed mar. 29, 2011); mike casey and bruce gordon, sound directions: best practices for audio preservation, http://www.dlib.indiana .edu/projects/sounddirections/papers present/sd_bp_07.pdf (accessed june 14, 2010); hathitrust: a shared digital online delivery of cached derivatives and metadata, as well as webcrawlerenabled content to expand accessibility. this model of online delivery will enable low cost, scalable development of digital libraries by simply ordering content within the archival storage location. providing simple, clear, accessible methods of preparing content for preservation, of duplicating archival treasures in lockss, and of web accessibility without excessive cost or deep web database storage of content, will enable underfunded cultural heritage institutions to help ensure that their content will continue to survive the current preservation challenges. 
as david seaman pointed out, the more a digital item is used, the more it is copied and handled, the more it will be preserved.23 focusing on archival storage (via lockss) and accessibility of content fulfills the two most primary oais functional capabilities and provides a life raft option for those who are not currently able to surmount the forbidding tsunami of requirements being drafted as best practices for preservation. the importance of offering feasible options for the continued support of the long tail of digitized content cannot be overstated. while the heavily funded centers may be able to preserve much of the content under their purview, this is only a small fraction of the valuable digitized material currently facing dissolution in the black hole of our cultural memory. as while no programmers are needed to organize content into such a clear, consistent, and standardized order, we are developing scripts that will assist others who seek to follow this path. these scripts not only order the content, they also create lockss manifests at each level of the content, down to the collection level, so that the archived material is ready for lockss pickup. a standardized lockss plugin for this method is available. to assist in providing access without a storage database, we are also developing an open-source web delivery system (acumen),22 which dynamically collects content from this protected archival storage arrangement (or from webaccessible directories) and provides figure 1. university of alabama libraries digital file naming scheme (©2009. used with permission.) figure 2. university of alabama libraries metadata organization (©2009. used with permission.) 74 information technology and libraries | june 2011 .org/documents/domain-range/index .shtml#provenancestatement (accessed july 18, 2009). 18. planets consortium, planets time capsule—a showcase for digital preservation, http://www.ifs.tuwien .ac.at/dp/timecapsule/ (accessed june 14, 2010). 19. 
martin halbert, katherine skinner, and gail mcmillan, “avoiding the calfpath: digital reservation readiness for growing collections and distributed preservation networks,” archiving 2009 (may 2009): 6. 20. library of congress, metadata encoding and transmission standard (mets), http://www.loc.gov/standards/ mets. 21. jody l. deridder, “from confusion and chaos to clarity and hope,” in digitization in the real world: lessons learned from small to medium-sized digitization projects, ed. kwong bor ng and jason kucsma, (metropolitan new york library council, n.y., 2010). 22. tonio loewald and jody deridder, “metadata in, library out. a simple, robust digital library system,” code4lib journal 10 (2010), http://journal.code4lib .org/articles/3107 (accessed aug. 29, 2010). 23. david seaman “the dlf today” (keynote presentation, 2004 symposium on open access and digital preservation, atlanta, ga.), paraphrased by eric lease morgan in musings on information and librarianship, http://infomotions.com/ m u s i n g s / o p e n a c c e s s s y m p o s i u m / (accessed aug. 9, 2009). 24. lynch, challenges and opportunities. 9. consultative committee for space data systems, reference model. 10. stanford university et al., lots of copies keep stuff safe (lockss), http:// www.lockss.org/lockss/home (accessed mar. 29, 2011). 11. david s. rosenthal et al., “requirements for digital preservation systems: a bottom-up approach,” d-lib magazine 11 (nov. 2005): 11, http:// w w w. d l i b . o r g / d l i b / n o v e m b e r 0 5 / rosenthal/11rosenthal.html (accessed june 14, 2010). 12. alabama digital preservation network (adpnet), http://www.adpn .org/ (accessed mar. 29, 2011). 13. greg janée, “preserving geospatial data: the national geospatial digital archive’s approach,” archiving 2009 (may 2009): 6. 14. 
research libraries group/oclc, trusted digital repositories: attributes and responsibilities, http://www .oclc.org/programs/ourwork/past/ trustedrep/repositories.pdf (accessed july 17, 2009). 15. andy boyko et al., the bagit file packaging format (0.96) (ndiipp content transfer project), http://www.digital preservation.gov/library/resources/ tools/docs/bagitspec.pdf (accessed july 18, 2009). 16. library of congress, bagit library, http://www.digitalpreservation.gov/ partners/resources/tools/index.html#b (accessed june 14, 2010). 17. andy powell, pete johnston, and thomas baker, “domains and ranges for dcmi properties: definition of the dcmi term provenance,” http://dublincore repository, http://www.hathitrust.org/ (accessed mar. 29, 2011). 2. clifford lynch, challenges and opportunities for digital stewardship in the era of hope and crisis (keynote speech, is&t archiving 2009 conference, arlington, va., may 2009). 3. jane deitrich, e-journals: do-ityourself publishing, http://eands .caltech.edu/articles/e%20journals/ ejournals5.html (accessed aug. 9, 2009). 4. tom cramer, quoted in art pasquinelli, “digital libraries and repositories: issues and trends” (sun microsystems presentation at the summit bibliotheken, universitäsbibliothek kassel, 18–19, mar. 2009), slide 12, http:// de.sun.com/sunnews/events/2009/ bibsummit/pdf/2-art-pasquinelli.pdf (accessed july 12, 2009). 5. digital preservation europe, what is digital preservation? http://www.digi talpreservationeurope.eu/what-is-digi tal-preservation/ (accessed june 14, 2010). 6. abby smith, “preservation,” in susan schreibman, ray siemens, john unsworth, eds., a companion to digital humanities (oxford: blackwell, 2004), http://www.digitalhumanities.org/com panion/ (accessed june 14, 2010). 7. consultative committee for space data systems, reference model for an open archival system (oais), ccsds 650.0-b-1 blue book, jan. 2002, http://public.ccsds .org/publications/archive/650x0b1.pdf (accessed june 14, 2010). 8. 
andrea goethals, “meeting the preservation demand responsibly = lowering the ingest bar?” archiving 2009 (may 2009): 6. president’s message cindi trainor information technologies and libraries | december 2013 1 hi, litans! forum 2013 i'm excited that 2014 is almost here. last month saw a very successful forum in louisville, in my home state of kentucky. there were 243 people in attendance, and about half of those were firsttime attendees. it's also typical of our yearly conference that there are a large number of attendees from the surrounding area; this is one of the reasons that it travels around the country. louisville's forum was the last of a few in the "middle" of the country--these included st. louis, atlanta, and columbus. next year, forum will move back out west, to albuquerque, nm. the theme for next year's conference will be "transformation: from node to network." see the lita blog (http://litablog.org/2013/11/call-for-proposals-2014-lita-forum/) for the call for proposals for concurrent sessions, poster sessions, and pre-conference workshops. goals of the organization at the board meeting in the fall, we took a stab at updating lita's major goal areas. the strategic plan had not been updated since 2010, so we felt it was time to update the goal areas, at least for the short term. the goals that we agreed upon will carry us through annual conference 2015 and will give us time to mount a more complete planning process in the meantime. they are: • collaboration & networking: foster collaboration and encourage networking among our members and beyond so the full potential of technologies in libraries can be realized. • education & sharing of expertise: offer education, publications, and events to inspire and enable members to improve technology integration within their libraries. 
• advocacy: advocate for meaningful legislation, policies, and standards that positively impact on the current and future capabilities of libraries that promote equitable access to information and technology. • infrastructure: improve lita’s organizational capacity to serve, educate, and create community for its members. midwinter activities in other governance news, the board will have an online meeting in january 2014, prior to cindi trainor (cindiann@gmail.com) is lita president 2013-14 and community specialist & trainer for springshare, llc. http://litablog.org/2013/11/call-for-proposals-2014-lita-forum/ mailto:cindiann@gmail.com president’s message | trainor 2 midwinter conference. our one-hour meeting will be spent asking and answering questions of those who typically submit written reports for board meetings: the vice-president, the president, and the executive director. as always, look to ala connect for these documents, which are posted publicly. we welcome your comments, as well as your attendance at any of our open meetings. our midwinter meeting schedule is: • the week of january 13 online meeting, time and date tba • saturday, january 25, 1:30 4:30 p.m. pcc 107a • monday, january 27, 1:30 4:30 p.m. pcc 115a as always, midwinter will also hold a lita happy hour (sunday, 6-8 pm, location tba), the top tech trends panel (sunday, 10:30 a.m., pcc 204a), and our annual membership meeting, the lita town meeting (monday 8:30 a.m., pcc 120c). we look forward to seeing you, in philadelphia or virtually. make sure to check the midwinter scheduler (http://alamw14.ala.org/scheduler) for all the details, including the forthcoming happy hours location. it's the best party^h^h^h^h^h networking event at midwinter! i would be remiss if i did not mention lita's committees and igs and their midwinter meetings. many will be meeting saturday morning at 10:30 a.m. (pcc 113abc)--so you can tablehop if you like. expressing interest at midwinter is a great way to get involved. 
can't make it to philadelphia? no problem! fill out the online form to volunteer for a committee, or check out the connect groups of our interest groups. some of the igs meet virtually before midwinter; some committees and igs also invite virtual participation at midwinter itself. join us!

http://alamw14.ala.org/scheduler

reproduced with permission of the copyright owner. further reproduction prohibited without permission.

consortia building: a handshake and a smile, island style
cutright, patricia j
information technology and libraries; jun 2000; 19, 2; proquest pg. 90

consortia building: a handshake and a smile, island style

patricia j. cutright

in the evaluation of consortia and what constitutes these entities the discussion runs the gamut. from small, loosely knit groups who are interested in cooperation for the sake of improving services to large membership-driven organizations addressing multiple interests, all recognize the benefits of partnerships. the federated states of micronesia are located in the western pacific ocean and cover 3.2 million square miles. throughout this scattering of small islands exists an enthusiastic library community of staff and users that have changed the outlook of libraries since 1991. motivated by the collaborative efforts of this group, a project has unfolded over the past year that will further enhance library services through staff training and education while utilizing innovative technology. in assessing the library needs of the region this group crafted the document "the federated states of micronesia library services plan, 1999-2003," which coalesces the concepts, goals, and priorities put forward by a broad-based contingency of librarians. the compilation of the plan and its implementation demonstrate an understanding of the issues and exhibit the ingenuity, creativity, and willingness to solve problems on a grand scale addressing the needs of all libraries in this vast pacific region.
the basic philosophy inherent in librarianship is the concept of sharing. the dissemination of information through material exchange and interlibrary communication has enriched societies for centuries. there are few institutions other than libraries that are better equipped or suited for such cooperation and collaborative endeavors. with service as the lifeblood that runs through its inky veins, the library has the potential to be the driving force in any community toward partnerships that afford mutual benefit for all.

the examination of the literature exposes a wide range of perceptions as to the definition of what is a consortium. the term "consortia" conjures up impressions that span the spectrum from highly organized, membership-driven groups to loosely knit cadres focusing on improving services to their patrons however they can make it happen. in kopp's paper "library consortia and information technology: the past, the present, the promise" he presents information from a study conducted by ruth patrick on academic library consortia. in that study she identified four general types of consortia:
• large consortia concerned primarily with computerized large-scale technical processing;
• small consortia concerned with user services and everyday problems;
• limited-purpose consortia cooperating with respect to limited special subject areas;
• limited-purpose consortia concerned primarily with interlibrary loan or reference network operations.1

with this distinction in mind, this paper will focus on the second category typifying a small, less structured organization.

patricia j. cutright (cutright@eou.edu) is library director of the pierce library at eastern oregon university.
90 information technology and libraries | june 2000
while on a visiting assistantship in the federated states of micronesia (fsm), i worked with a partnership of libraries that believe that, for cooperation to succeed, results for the patron must be the goal, not equity between libraries or some magical balance between resources lent by one library and resources received from another library.2 unified efforts to provide service to the patron are the key. the libraries on a small, remote island situated in the western pacific ocean exhibit this grassroots effort that defines the true meaning of consortia, demonstrating collaboration, cooperation, and partnerships. it is a multitype library cooperative that not only encompasses interaction among libraries but also between agencies as well as governments. the librarians on the island of pohnpei, micronesia, and all the islands throughout the federated states of micronesia have embraced this consortial attitude while achieving much through these collaborative efforts:
• the joint work done on crafting the library services plan, 1999-2003 for the libraries throughout the federated states of micronesia
• initiating successful grant-writing efforts which target national goals and priorities
• implementing a collaborative library automation project which is designed to evolve into a national union catalog
• the implementation of a viable resource-sharing and document delivery service for the nation

background and socioeconomic overview
micronesia, a name meaning "tiny islands," comprises some 2,200 volcanic and coral islands spread throughout 3.2 million square miles of pacific ocean.
lying west of hawaii, east of the philippines, south of japan and north of australia, the total land mass of all these tropical islands is fewer than 1,200 square miles with a population base estimated at no more than 111,500.3 it is a region unique in its location, but nonetheless still plagued with all the problems associated with any geographically remote, economically depressed area found anywhere in the united states or elsewhere in the world. the federated states of micronesia is a small-island, developing nation that is aligned with the united states through a compact of free association, making it eligible for many u.s. federal programs. the economic base is centered around fisheries and marine-related industries, tourism, agriculture, and small-scale manufacturing. the average per capita income in 1996 was $1,657 for the four states of the fsm: kosrae, pohnpei, yap, and chuuk. thirteen major languages exist in the country, with english as the primary second language. the 607 different islands, atolls, and islets dot an immense expanse of ocean; this geographic condition presents challenges in implementing and enhancing library services and technology.4 despite the extreme geographic and economic conditions, the college of micronesia-fsm national campus, in collaboration with the librarians throughout the states, has been successful in implementing nationwide projects. these endeavors have resulted in technical infrastructure and the foundation for information technology instruction supported through awards from the u.s. department of education, the title iii program, and the national science foundation.

collaboration: building bridges that cross the oceans
the libraries in micronesia have shown an ongoing commitment to librarianship and cooperation since the establishment of the pacific islands association of libraries and archives (piala) in 1991.
the organization is a micronesia-based regional association committed to fostering awareness and encouraging cooperation and resource sharing among libraries, archives, museums, and related institutions. piala was formed to address the needs of pacific islands librarians and archivists, with a special focus on micronesia; it is responsible for the common-thread cohesiveness shared by the librarians over the past eight years. the organization has grown to become an effective champion of the needs of libraries and librarians in the pacific region.5 when piala was established, the most pressing areas of concern within the region were development of resource-sharing tools and networks among the libraries, archives, museums, and related institutions of the pacific islands. the development of continuing education programs and the promotion of technology and telecommunications applications throughout the region were areas targeted for attention. those concerns have changed little since the group's inception. building upon that original premise, in january 1999 a group of interested parties from throughout the federated states of micronesia met to draft a document they envisioned would lay the groundwork for library planning over the next five years. this strategic plan encompasses all library activity: services, staffing, and the impact technology will have on libraries in the region. the document, "the federated states of micronesia library services plan, 1999-2003," coalesces the concepts, goals, and priorities put forward by a broad-based contingent. in this meeting, the group addressed basic issues of library and museum service, barriers and solutions to improve service delivery, and additional funding and training resources for libraries and museums.6 the compilation of the plan crafted at the gathering demonstrated a thorough understanding of the issues that face the librarians of the vast region.
it exhibits the ingenuity, creativity, and willingness to problem-solve on a grand scale in a way that addresses the needs of all libraries in the pacific region. the goals set forward by the writing session group illustrate the concerns impacting library populations throughout the fsm. the fsm has now established six major goals to carry out its responsibilities and to address the need for overall improvement in, and delivery of, library services:
1. establish or enhance electronic linkages between and among libraries, archives, and museums in the fsm.
2. enhance basic services delivery and promote improvement of infrastructure and facilities.
3. develop and deliver training programs for library staff and users of the libraries.
4. promote public education and awareness of libraries as information systems and sources for lifelong learning.
5. develop local and nationwide partnerships for the establishment and enhancement of libraries, museums, and archives.
6. improve quality of information access for all segments of the fsm population and extend access to information to underserved segments of the population.

priorities
the following are general priorities for the fsm library services plan. the priorities represent needs for overall improvement of the libraries, museums, and archives. the priorities are based on the fact that currently libraries, museums, and archives development is in its infancy in the fsm. specific priorities will change from year to year as programs are developed.
1. establishment of new libraries and enhancement of existing library facilities to increase accessibility of all fsm citizens to library resources and services. outer islands and remote areas generally have no access to libraries or information sources.
new facilities or mechanisms need to be established to provide access to information resources for the public. existing public and school library facilities often lack adequate staffing, climate control, and electrical connections needed to meet the needs of the community. existing public and school libraries also need to improve their facilities and services delivery to meet the needs of disabled individuals and other special populations.
2. provide training and professional development for library operation and use of new information technologies. a survey held during the writing session indicated that public and school library staff do not currently possess the skills needed to effectively provide assistance in the use of new information technologies. well-designed training programs with mechanisms for follow-up technical assistance and support need to be developed and implemented.
3. promote collaboration and cooperation among libraries, museums, and archives for sharing of holdings and technical ability. limited holdings, financial capacity, and human resources are major barriers to improving library services. collaboration and cooperation are needed among libraries, museums, and archives to maximize scarce resources.
4. develop recommended standards and guidelines for library services in the fsm. the ability to share resources and information could be significantly increased by development and implementation of recommended standards and guidelines for library services. standardization could assist with sharing of holdings and holdings information, increase availability of technical assistance, and provide guidance as new libraries and library services are set up.
5. increase access to electronic information sources. existing public and school libraries have limited or no access to electronic linkages including basic services such as e-mail and connections to the internet.
the priority need is to establish basic electronic linkages for all libraries, followed by extending access to electronic information to all users.7

shifting into action
with the drafting of this five-year plan, the librarians stated emphatically the need and desire to move ahead with haste and determination. as the plan was conceptualized and documented, a small cadre of librarians from the college of micronesia-fsm national campus, the public library, and high school library crafted two successful grant proposals which addressed:
• a cooperative library automation project which is designed to evolve into a national union catalog (goal 1; priorities 3, 5);
• the installation of internet services that would link the college of micronesia-fsm campuses, the public library, and high school library (goals 1, 2, 6; priorities 1, 2, 3, 5);
• the development and delivery of training programs for library staff and users of the libraries (goals 3, 4, 6; priority 2); and
• the implementation of a viable resource-sharing and document delivery service for the nation (goals 1, 2, 5, 6; priorities 3, 4, 5).

over the past year the awarding of grant funds has shifted the library community into high gear with the design and implementation of project activities that will fulfill the targeted needs.

the automation project and internet connectivity
a collaborative request submitted by the bailey olter high school (bohs) library and the pohnpei public library provided the funding necessary to computerize the manual card catalog system at bohs and upgrade the dated automated library system at pohnpei public library. since the college of micronesia-fsm campuses are automated, it was important for the high school library and the public library to install like systems to achieve a networkable automated system, facilitating the development of a union catalog for all the libraries' holdings.
this migration to an automated system promoted cooperation and resource sharing for the island libraries, opening a wealth of information for all island residents. the project entailed purchasing a turnkey cataloging and circulation system that will facilitate the cataloging and processing of new acquisitions for each library as well as the conversion of approximately five thousand volumes of material already owned by the public and high school libraries. through internet connectivity, which was integral to the project, the system would also serve as public access to the many holdings of the libraries for students, faculty, and town patrons through a union catalog to be established in the future. the development and delivery of training programs for library staff and users is linked to the implementation of a viable resource-sharing and document delivery service for the nation. as stated earlier, the librarians of the federated states of micronesia accepted the challenge facing them in ramping up for the twenty-first century. their prior experience laid the groundwork necessary to implement the training programs necessary to bring the library community the knowledge and skills needed. a survey administered during the writing session indicated that few public and school librarians have significant training in or use of electronic linkages or information technologies, nor are they actively using such technologies at present. of the fourteen public and school librarians in the four states of micronesia, none hold a master's degree from an accredited library school or library media specialist certification. an exception is the library staff at the com-fsm national campus, where two-thirds of the librarians hold professional credentials.
significant effort is needed on a sustained basis for effective training in the understanding and use of information systems throughout the nation. where training has occurred, it has often been of an infrequent, short variety with little support for ensuring implementation at the work site. additionally, often there are no formal systems for getting answers to questions when problems do arise. in addressing the information needs for this population it is apparent that education is the key component for continued improvement of library services. this concern is evident in a paper by daniel barron, where it is stated that only 54 percent of librarians and 19 percent of staff in libraries serving communities considered to be rural (i.e., 25,000 people or fewer) have an ala-mls.8 and dowlin poses even more perplexing questions: "how can a staff with such an educational deficit be expected to accomplish all that will be demanded to enable their libraries to go beyond being a warehouse of popular reading materials? how can we expect them to change from pointers and retrievers to organizers and facilitators?"9 micronesia is no different than any other state or country in wanting its population to have access to qualified staff, current resources, and services. it recognizes that the libraries are inadequately staffed and that many others have staff who are seriously undereducated to meet the expanded information needs of the people in their communities. if these libraries are to seize the opportunities suggested by the developing positive view, develop services to support this view, and market such a view to a wider range of citizens in their communities, they must invest in the intellectual capital of their staffs. in order to carry out this charge, the following activities were designed to address the educational and training needs of the librarians in the fsm.
as outlined in a recently funded institute of museum and library services (imls) national leadership grant, preparation has begun with the following activities, which will address the staffing and technology concerns described in fsm libraries:
1. recruit and hire an outreach services librarian to survey training needs, coordinate and plan training, and deliver or arrange for needed training.
2. develop a skills profile for all library, museum, and archival staff positions.
3. identify training contact or coordinator for each state.
4. develop and provide periodic updates to operational manuals for school and public libraries, museums, and archives.
5. recruit local students and assist them in seeking out scholarships for professional training off island.
6. design and implement programs to provide continuous training and on-site support in new technological developments and information systems (provided on-site and virtually).
7. establish a summer training institute offering training based on needs as determined by the outreach services librarian in collaboration with state coordinators and recruiting on- and off-island expertise as instructors.
8. design and develop programs for orientation and training of users of information systems (provided on-site and virtually).
9. develop and implement a "train the trainer" program, which will have representation from all four states, that will ensure continuity and sustainability of the project for the years to come.10

the primary requisite to initiating this project is the recruitment and hiring of the outreach services librarian who will then begin the activities as listed. a beginning cadre of librarians gleaned from the summer institute will become the trainers of the future, perpetuating a learning environment enhanced with advanced technology. breakthroughs in distance education, aided with advances in telecommunications, will significantly impact this project.
on-site training will be imperative for the initial cadre of summer institute attendees to provide sound teaching skills and a firm understanding of the material at hand. follow-up training will be presented on each island by the trainer either on location or virtually with available technology. products such as web course in a box, webct, or nicenet will be analyzed for appropriate utilization as teaching tools. these products will take advantage of newly established internet connections on each island and, more importantly, will provide the interactive element that distinguishes this learning methodology from the "talking head" or traditional correspondence course approach. a web site designed for this project will provide valuable information and connectivity for not only the pacific library community but anyone worldwide who may be interested in innovative methods of serving remote populations. using computer conferencing and virtual communities technology, a video conferencing system such as 8 x 8 technologies will be used, which will allow face-to-face interaction with trainer and student in an intra-island situation (interisland telephone rates are too expensive for regular use as a teaching tool). to enhance the learning experience and information retrieval component for these librarians and the population they serve, the project also incorporates implementation of a viable resource-sharing, document delivery system capitalizing on a shared union catalog and using a service such as research libraries group's ariel product. with library budgets reflecting the critical economic climate of the nation, it becomes even more crucial for collaborative collection development and resource sharing to satisfy the needs of the library user.
to maintain cost-effective communication and build a sense of community among the librarians, the messaging software icq has been installed on all participant hardware and utilized for group meetings, question and answer, and general correspondence. since icq operates as part of the internet, this package allows low-cost communication with maximum benefit in connecting the group. this technology will also be used as the primary mechanism for communication with an outside advisor who will provide expertise in the area of outreach services for rural populations. the realm of outreach services in libraries has always presented unique challenges that can now benefit greatly from current and emerging technologies. the definition of "outreach" is truly a matter of perspective, with the more traditional sense relating to a specific library servicing its own user or patron. but current practice regards "outreach" as a mere extension of services to all users whether they be a registered patron or colleague or peer. micronesia is a country where the proverbial phrase "the haves and the have-nots" is amplified. the recent (and ongoing) installation of internet services in the region has made possible many basic changes, but there still exists the reality that some of the sites for services proposed have nothing more than a common analog line and rudimentary services. as an example of the realities that exist, only 38 percent of the approximately 180 public schools in the fsm have access to reliable sources of electricity. another challenge for these libraries is the climate and environment, which has a significant impact on library facilities, equipment, and holdings. 
the fsm lies in the tropics, with temperatures ranging daily from 85 to 95 degrees with humidity normally 85 percent or higher.11 the high salt content in the ocean air wreaks havoc upon electrical equipment, and the favorable environs inside a library often entice everything from termites in the wooden bookcases to nesting ants in keyboards. from these examples it is apparent that the problems that trouble these libraries are not going to be solved with the magic bullet of technology. this reality constitutes the need for varying strategies and different approaches to address the training requirements of the library staff.

summary
the fsm library group, in particular the pohnpeian librarians, have accomplished much in the past year. the motivating factor for the flurry of activity that enveloped the libraries on pohnpei was the collaborative writing session in january 1999. a week-long "meeting of the minds" from libraries throughout micronesia produced the blueprint that will map the future of libraries and library service for years to come. these librarians stated their primary issues in delivering library services and came to a consensus on activities needed to address the issues. the "federated states of micronesia library services plan, 1999-2003" was crafted as a working document, a strategic plan for improving library services in the pacific region, and a commitment to achievement through collaboration. while in micronesia i observed the impact that the unification of ideas can have on the citizens of a community. in my fourteen-year tenure at eastern oregon university i have been exposed to the benefits of "consortium attitude" that come from cooperation and partnerships. time and again the university demonstrates the positive effects of what is referred to as "politics of entanglement."
shepard describes the overriding philosophy that has been the recipe for success:

the politics are really quite simple. we maintain an intricate pattern of relationships, any one of which might seem inconsequential. yet there is strength in the whole that is largely unaffected if a single relationship wanes. rather than mindlessly guarding turf, we seek to involve larger outside entities and in the ensnaring, to turn potential competitors into helpful partners.12

just as eastern oregon university has discovered, the libraries of the federated states of micronesia are learning the merits of entanglement.

references and notes
1. james j. kopp, "library consortia and information technology: the past, the present, the promise," information technology and libraries 17 (mar. 1998): 7-12.
2. jan ison, "rural public libraries in multi-type library cooperatives," library trends 44 (summer 1995): 29-52.
3. pacific islands association of libraries and archives, www.uog.edu/rfk/piala.html, accessed june 6, 2000.
4. division of education, department of health, education and social affairs, federated states of micronesia, "federated states of micronesia, library services plan 1999-2003" (march 3, 1999): 2.
5. pacific islands association of libraries and archives, www.uog.edu/rfk/piala.html, accessed june 6, 2000.
6. division of education and others, "library services plan," 4.
7. ibid, 6.
8. daniel d. barron, "staffing rural public libraries: the need to invest in intellectual capital," library trends 44 (summer 1995): 77-88.
9. k. e. dowlin, "the neographic library: a 30-year perspective on public libraries," in libraries and the future: essays on the library in the twenty-first century, f. w. lancaster, ed. (new york: haworth pr., 1993).
10. patricia j. cutright and jean thoulag, college of micronesia-fsm national campus, "institute of museums and library services, national leadership grant" (mar. 19, 1999).
11. division of education and others, "library services plan," 2.
12. w. bruce shepard, "spinning interinstitutional webs," aahe bulletin 49 (feb. 1997): 3-6.
applying topic modeling for automated creation of descriptive metadata for digital collections
article
monika glowacka-musial
information technology and libraries | june 2022
https://doi.org/10.6017/ital.v41i2.13799

monika glowacka-musial (monikagm@nmsu.edu) is assistant professor/metadata librarian, new mexico state university library. © 2022.

abstract
creation of descriptive metadata for digital objects tends to be a laborious process. specifically, subject analysis that seeks to classify the intellectual content of digitized documents typically requires considerable time and effort to determine subject headings that best represent the substance of these documents. this project examines the use of topic modeling to streamline the workflow for assigning subject headings to the digital collection of new mexico state university news releases issued between 1958 and 2020. the optimization of the workflow enables timely scholarly access to unique primary source documentation.

introduction
digital scholarship relies on digital collections and data. in the influential book digital_humanities, anne burdick and her associates affirm that humanistic knowledge production depends on collection building and curation.1 access to historical documents and data resources is essential for the development of new research questions and methodologies.2 this project utilizes topic modeling to support building a digital collection of institutional news releases. it is one of the initiatives to apply digital technologies in library workflows.

new mexico state university news releases
in response to a growing scholarly and public interest in original university press announcements, the digitization of past nmsu print news releases was approved in september 2013.
sixty years of news releases starting from the late 1950s to the present were to be included. one of the arguments presented in justification of the project was that these institutional news briefs have a truly unique historical value. researchers view university press announcements as anchors in the history of nmsu and the region, particularly for dating events and initiatives. they also find official communications essential for studying the way the news was framed by participants and the university administration. historically, the relationships between the university and the local media had always been a major concern of college administrators: how to respect the freedom of the press, while ensuring responsible and factual journalism, and how to build an effective partnership that would benefit both sides?3 to address these questions, the administration early on established the college’s information services that have issued news releases about campus events, programs, and developments in the college’s research, teaching, and service. these formal news reports representing the perspective of the university have been regularly distributed to local and worldwide media for many decades. this collection has become one of the most popular primary sources documenting a history of the southwestern educational institution. since the beginning of the digitization project, thousands of press releases had been scanned, described, and added to the digital collection. currently, the collection features press releases issued by the university between 1958 and 1974. there is still a lot to be done. the most time-consuming element in the process is adding metadata, including library of congress subject headings, to individual news releases.
with decreasing personnel, dwindling library resources, and competing work priorities, the progress on the project has slowed substantially. its revitalization requires a fresh, problem-solving approach that would allow for a significant reduction of time that catalogers spend on metadata creation. in search of a viable solution, topic modeling, a computational tool for classifying large collections of texts, was put to the test and generated promising results. the following sections describe the tools, data, and process created for this experiment in some detail.

topic modeling and its applications
topic modeling (tm) is one of the methodologies used in natural language processing (nlp). it was specifically designed for text mining and discovering hidden patterns in huge collections of documents, images, and networks.4 according to practitioners, topic modeling is best viewed as a statistical tool for text exploration and open-ended discovery.5 it has been used extensively in computer science, genetics, marketing, political science, journalism, and digital humanities for the last two decades.
a growing literature on topic modeling applications provides clear evidence of its viability.6 examples of tm applications in digital social sciences and humanities include finding geographic themes from gps-associated documents on social media platforms such as flickr and twitter,7 selecting news articles on opposition to euro currency from financial times data,8 identifying paragraphs on epistemological concerns in english and german novels,9 tracking research trends in different disciplines,10 and revealing dominant themes in newspapers,11 governance literature,12 and wikipedia entries.13 topic modeling was applied in addition to text mining to enhance access to large digital collections by providing minimal description and enriching metadata, including subject headings.14 also, the possibility of using topic modeling to determine the subject headings for books on project gutenberg was explored.15

topic modeling in a nutshell
topic models help to identify the contents of document collections. topic modeling is a process of discovering clusters of words that best represent a set of topics. figure 1 shows the basic idea behind topic modeling. a large collection of text documents (the scrolls on top) consists of thousands of words (shown symbolically at the bottom). the algorithm looks for the most frequent words that tend to occur in proximity and clusters them together. each cluster, referred to as a topic, has a set of words with their probabilities of belonging to a given topic. each document in the collection displays a set of combined topics to different degrees. here, documents are seen as mixtures of topics, and topics are seen as mixtures of words.16 topics also provide context to words. documents that have similar combinations of topics tend to be related. as a result, a large collection of text documents can be represented by a limited set of topics (as presented by icons in the middle of the figure).
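the idea that documents are mixtures of topics and topics are mixtures of words can be illustrated with a minimal sketch. everything in it is an assumption made for illustration: the two topic names, the word lists, the two documents, and all probabilities are invented and do not come from the news release collection. the sketch only shows how the probability of a word in a document follows from the two mixtures, and how a dominant topic can be read off a document.

```python
# Hypothetical toy values; none of them come from the actual collection.

# Each topic is a probability distribution over words (a "mixture of words").
topic_word = {
    "research": {"grant": 0.5, "laboratory": 0.3, "student": 0.2},
    "athletics": {"game": 0.6, "student": 0.3, "grant": 0.1},
}

# Each document is a probability distribution over topics (a "mixture of topics").
doc_topic = {
    "release_a": {"research": 0.9, "athletics": 0.1},
    "release_b": {"research": 0.2, "athletics": 0.8},
}


def word_probability(doc, word):
    """P(word | doc) = sum over topics of P(topic | doc) * P(word | topic)."""
    return sum(
        weight * topic_word[topic].get(word, 0.0)
        for topic, weight in doc_topic[doc].items()
    )


def dominant_topic(doc):
    """Return the topic with the largest share in the document."""
    return max(doc_topic[doc], key=doc_topic[doc].get)


if __name__ == "__main__":
    for doc in doc_topic:
        print(doc, dominant_topic(doc), round(word_probability(doc, "grant"), 3))
```

in this toy setting the word "grant" is more probable in the document dominated by the invented "research" topic than in the one dominated by "athletics", which is the sense in which topics give context to words.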
information technology and libraries june 2022 applying topic modeling for automated creation of descriptive metadata | glowacka-musial 3

figure 1. basic idea behind topic modeling.

topics and subject headings combined

the original purpose of topic modeling, as formulated by david blei and his associates in 2003, was to make large collections of texts more approachable for scholars by organizing texts automatically based on latent topics.17 these hidden topics can be discovered, measured, and consequently used by scholars to navigate the collection. the purpose of assigning subject headings is to identify “aboutness,” or simply subject concepts, covered by the intellectual content of a given work, and then to collocate related works.18 since topic models and subject headings have a similar purpose, although very different methodology and scale, we decided to combine them and make topic models a prerequisite for assigning subject headings. in such a scenario, the computer deals with text collections whose scale is beyond human reading capacity, and catalogers then fine-tune the results generated by the algorithm. the following methods section shows the successive stages of the semiautomated assignment of subject headings to documents.

methods overview

for topic modeling, we used the latent dirichlet allocation (lda) algorithm.19 lda takes a document-term matrix, with rows corresponding to documents and columns corresponding to terms (words), and, based on semirandom exploration, finds optimal probabilities of topics in documents (called gammas) and probabilities of terms in topics (called betas). after lda generates a set of topics that best represent the collection of news releases, each topic is associated with several subject headings that were previously assigned to news releases by catalogers. for a new news release, lda finds the set of most representative topics.
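the document-term matrix that lda takes as input can be illustrated with a short sketch. the python code below is not the project's code: the two titles and the tiny stop-word list are hypothetical, the function names are invented for this example, and stemming is left out to keep the example short.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not the list used in the project).
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to"}


def tokenize(title):
    """Lowercase a title, split it into alphabetic tokens, and drop stop words."""
    return [tok for tok in re.findall(r"[a-z]+", title.lower()) if tok not in STOP_WORDS]


def document_term_matrix(titles):
    """Return (vocabulary, matrix): one row per title, one column per term."""
    token_lists = [tokenize(title) for title in titles]
    vocabulary = sorted({tok for toks in token_lists for tok in toks})
    column = {term: j for j, term in enumerate(vocabulary)}
    matrix = [[0] * len(vocabulary) for _ in titles]
    for i, toks in enumerate(token_lists):
        for term, count in Counter(toks).items():
            matrix[i][column[term]] = count
    return vocabulary, matrix


# Hypothetical titles, made up for this example.
titles = [
    "students rehearse for the spring concert",
    "concert tickets on sale to students",
]
vocabulary, dtm = document_term_matrix(titles)
print(vocabulary)
for row in dtm:
    print(row)
```

each row of the matrix counts how often each vocabulary term occurs in one title; a topic model is then fitted to these counts rather than to the raw text.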
subject headings associated with the dominant topics are combined into a list of subject candidates presented to a cataloger. in the last step, a cataloger uses this short list of subject candidates to select subject headings for the news release.

training data

training data used in this project consists of over 6,000 news releases (from 1958 to 1967) annotated with metadata. only two metadata properties, titles and subject headings, were considered. created by catalogers, both properties reflect the content of news releases accurately, although mistakes may sometimes happen. the values from the titles field were converted into a document-term matrix that, in turn, became an input for the algorithm. texts produced by ocr on original news releases were not included in the analysis due to their poor quality.

detailed steps of the proposed method:

1. topic modeling on training data:
   a. run standard preprocessing of training text data, including tokenization, stop-word removal, and stemming.
   b. run topic modeling (lda), where each document from the training data set is assigned a set of topics (subsets of words), each one with a measurable contribution to the document.
2. assignment of subject headings to topics.20 for each topic:
   a. select a number of documents with the highest probability (gamma) for the topic. we used 400.
   b. gather the subject headings assigned to the documents selected in 2.a and arrange them by decreasing frequency (freq) of occurrence in the set.
3. assignment of subject headings to a new document:
   a. assign to the new document gammas (probabilities) of topics using the lda model trained in 1.b.
   b. for each topic in turn, calculate the weight of each associated subject heading in the document as the product of its frequency in the topic (freq) and the probability of the topic (gamma) in the document; for subject headings that occur in several topics, sum their weights across topics.
   c. list the candidate subject headings from 3.b in descending order of their weights in the document.

implementation

there is a growing number of tools used for topic modeling.21 for this project, we used the r programming language, which has many packages for data preprocessing and topic modeling (tm).22 the r packages used for this project are listed below:

• topicmodels with functions: lda() producing topic models, posterior() for assigning topics to test documents by pretrained models, and perplexity() for perplexity calculation 23
• tidytext with tidying functions that allow for rearranging and exploring data as well as for interpreting the models
• textstem for preprocessing data, including stemming and lemmatization
• tidyr, dplyr, and stringr for data and string manipulation
• ggplot2 for data visualizations

the code related to topic modeling was mostly reused from the datacamp class on topic modeling.24 occasionally, the data.table data structure was used instead of data.frame. in addition to standard stop-words, custom stop-words including initials, names of weekdays, and dates were removed from the corpus using the anti_join() function. for finding topics in test documents with a pretrained model, the posterior() function from the r package topicmodels was used.25 the extra step needed before using posterior() was to align the new document with the document-term matrix used for training the lda model.26

results

for assessing the method’s performance, we adopted the idea of recall.
in this specific context, recall is defined as the fraction of original subject headings (i.e., those assigned to a document manually by a cataloger) that are present on the list of candidate subject headings produced by the method. the average recall is estimated using a leave-one-out setting.27 once a single test document is set aside, the lda model is trained on the remaining documents and recall is calculated for the test document using the list of candidate subject headings produced by the method. then, recall is averaged over a set of test documents. this approach produces an estimate of the method’s performance on a new document.

figure 2. average recall as a function of the size of the list of subject heading candidates.

figure 2 shows the dependence of average recall on the length of the list of candidate subjects produced by the method. recall is averaged over 1,500 randomly selected test documents. the dashed line represents chance-level performance, i.e., the performance expected if the method produced a random subset of all subject headings available in the data. on a list of 100 suggested subject headings, the recall is on average above 0.6, and for a list of 500 candidate subject headings, above 0.8. although the average recall stays noticeably below 1 (a value of 1 would mean perfect performance), it is still considerably above the chance level. the results presented in figure 2 were produced by the lda model trained with 16 topics. one of the parameters affecting the method’s performance is the number of topics used by the lda model. for finding the number of topics corresponding to the highest recall, an overall measure of recall across different lengths of the subject candidate list was defined as the cumulative recall for the first 100 subject candidates.
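the candidate ranking from step 3 of the method and the recall measure defined above can be sketched together. the python code below is a minimal illustration under stated assumptions, not the project's r code: freq is treated as a relative frequency, the two topics, their headings, and the document's gammas are invented, and the function names candidate_subjects and recall_at are hypothetical.

```python
from collections import defaultdict


def candidate_subjects(doc_gammas, topic_subject_freq):
    """Rank subject headings for one document (steps 3.b and 3.c).

    doc_gammas: probability of each topic in the document.
    topic_subject_freq: one dict per topic mapping heading -> freq
        (assumed here to be a relative frequency among the topic's top documents).
    """
    weights = defaultdict(float)
    for gamma, subject_freq in zip(doc_gammas, topic_subject_freq):
        for heading, freq in subject_freq.items():
            # weight = freq * gamma, summed over topics that share the heading
            weights[heading] += freq * gamma
    # Descending weight; ties broken alphabetically for a stable order.
    return sorted(weights.items(), key=lambda item: (-item[1], item[0]))


def recall_at(candidates, original_headings, list_size):
    """Fraction of the cataloger's headings found among the first list_size candidates."""
    original = set(original_headings)
    if not original:
        return 0.0  # convention chosen for this sketch when no headings were assigned
    shortlist = {heading for heading, _ in candidates[:list_size]}
    return len(original & shortlist) / len(original)


# Invented example: two topics and one new document.
topic_subject_freq = [
    {"theater": 0.5, "students": 0.2},
    {"concerts": 0.6, "students": 0.1},
]
doc_gammas = [0.75, 0.25]

ranked = candidate_subjects(doc_gammas, topic_subject_freq)
for heading, weight in ranked:
    print(f"{heading}: {weight:.3f}")
print(recall_at(ranked, ["theater", "concerts"], list_size=2))
```

averaging recall_at over many held-out documents, each scored with a model trained without that document, would correspond to the leave-one-out estimate described above; that loop is omitted here because it requires fitting a topic model.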
we assumed that 100 is a likely size of candidate lists that catalogers would be willing to go through. figure 3 shows the cumulative recall for different numbers of topics, based on which 16 was chosen as the optimum. interestingly, this corresponds well with the dependence of perplexity on the number of topics (fig. 4). the perplexity, a measure of the model’s surprise at the data, shows how well the model fits the data: a smaller number means a better fit, i.e., a better topic model.28

figure 3. cumulative recall as a function of number of topics in the lda model.

figure 4. perplexity of the lda model as a function of number of topics in the lda model.

to give a better idea of the method’s performance, figure 5 shows the distribution of recall for individual test documents, for a list of 100 subject headings. since most documents in the training data have just a few subject headings, there is only a small set of discrete values possible for recall for individual documents. the distribution is wide, with a fraction of documents with no subject heading present on the proposed list (recall = 0) but also with a bigger fraction of documents fully covered by the list (recall = 1).

figure 5. distribution of recall across 1,500 test documents, for 100 subject candidates (for 16 topics).

the following examples show the sets of subject headings selected by the algorithm that include subject headings (in bold blue) chosen originally by catalogers.
example 1

title of news release: “‘romeo and juliet’ play part of campus celebration for 400th anniversary of shakespeare's birth”

subjects | weights
new mexico state university. playmakers | 0.280
theater | 0.143
students | 0.080
academic achievement | 0.080
theater--production and direction | 0.075
high school students | 0.052
competitions | 0.048
new mexico state university. college of engineering | 0.042
plays | 0.041
debates and debating | 0.038
new mexico state university. aggie forensic festival | 0.036
zohn, hershel | 0.034
shakespeare, william, 1564-1616. a midsummer night's dream | 0.034
forensics (public speaking) | 0.034
frisch, max, 1911-1991. firebugs | 0.027
tickets | 0.027
theater rehearsals | 0.027
new mexico state university. college of agriculture and home economics | 0.022
shakespeare, william, 1564-1616. romeo and juliet | 0.020
frisch, max, 1911-1991 | 0.020
performances | 0.020
garcia lorca, federico, 1898-1936. casa de bernarda alba. english | 0.020
molière, 1622-1673. bourgeois gentilhomme. english | 0.020
anniversaries | 0.014
new mexico state university. college of teacher education | 0.012

example 2

title of caption to photo: “locals barbara gerhard, donna herron, lillian jean taylor rehearse for upcoming concert”

subjects | weights
concerts | 0.123
new mexico state university. university-civic symphony orchestra | 0.085
institution. playmakers | 0.077
united states. air force rotc | 0.073
united states. army. reserve officers' training corps | 0.062
military cadets | 0.058
award presentations | 0.054
theater | 0.039
award winners | 0.038
scholarships | 0.035
music | 0.035
musicians | 0.031
awards | 0.027
new mexico state university. department of military science | 0.023
theater--production and direction | 0.021
kennecott copper corporation | 0.019
students | 0.019
glowacki, john | 0.019
new mexico state university symphonic band | 0.015
new mexico state university. university-community chorus | 0.015
lynch, daniel | 0.015
drath, jan | 0.015
performances | 0.015
military art and science | 0.012
united states. army--inspection | 0.012

discussion

the major advantage of the method described above is that it reduces the long list of library of congress subject headings that catalogers need to consult before assigning subject headings to news releases. it is important to note that this method produces only subject headings that are already present in the training data. the list of available subject headings can be expanded by periodic updates of the training data to include all entries in the catalog, assuming catalogers will add, where needed, subjects not yet present in the data set.

in this project we utilized metadata from just two fields: titles and subject headings. although documents’ titles are supposed to compactly represent the content of documents, we expect that the presented approach would give better results if the full text (ocr) were analyzed. in this project, the limiting factors were both the quality of print copies and the robustness of available ocr tools.

in some cases, subject annotations are imperfect, depending on the skills and experience of catalogers. that also affects the performance of our method, which relies on the quality of subject assignments. on the other hand, there are cases when the method suggests subjects that fit the content of news releases but were not selected by catalogers. this indicates that the method can also be used to refine the existing annotations.

conclusion

we propose a way to streamline the workflow of metadata creation for university news releases by applying topic modeling. first, we use this digital technology to identify topics in a large collection of text documents. then, we associate the discovered topics with sets of subject headings.
finally, we assign to a new document those subject headings that are associated with the document’s most dominant topics. the proposed method facilitates the process of document annotation. it produces short lists of candidate subject headings that account for a significant part of the original labeling performed by catalogers. this approach can be applied to support annotation of any large digital collection of text documents.

one of the advantages of applying topic modeling is that it produces numeric representations of text documents. these numeric representations can be used by advanced analytical methodologies, including machine learning, for numerous practical purposes in library workflows, such as text categorization, collocation of similar materials, enhancing metadata for digital collections, and finding trends in government literature. in addition, librarians’ mastery of digital methodologies may open new avenues of collaboration between them and digital scholars across university campuses. as johnson and dehmlow argue, “... digital humanities represent a clear opportunity for libraries to offer significant value to the academy, not only in the areas of tool and consultations, but also in collaborative expertise that supports workflows for librarians and scholars alike.”29 digital technologies are best learned in hands-on practice. if librarians are to contribute to the development of digital scholarship, then they need to learn how to apply new technologies to their own work. and since both librarians and humanists work with texts, they might have much to offer each other.

correction

on november 21, 2022, the urls to references 24 and 26 were updated at the author’s request to avoid user login.

endnotes

1 anne burdick et al., digital_humanities (cambridge, massachusetts: the mit press, 2012), 32–33.
2 thomas g.
padilla, “collections as data implications for enclosure,” acrl news 79, no. 6 (2018), https://crln.acrl.org/index.php/crlnews/article/view/17003/18751; rachel wittmann, anna neatrour, rebekah cummings, and jeremy myntti, “from digital library to open datasets: embracing a ‘collections as data’ framework,” information technology and libraries 38, no. 4 (december 2019), https://doi.org/10.6017/ital.v38i4.11101.
3 gerald w. thomas, academic ecosystem: issues emerging in a university environment (gerald w. thomas, 1998), 159–64.
4 david m. blei, andrew ng, and michael jordan, “latent dirichlet allocation,” journal of machine learning research 3, no. 1 (2003); david m. blei, “topic modeling and digital humanities,” journal of digital humanities 2, no. 1 (winter 2012), http://journalofdigitalhumanities.org/2-1/topic-modeling-and-digital-humanities-by-david-m-blei/.
5 megan r. brett, “topic modeling: a basic introduction,” journal of digital humanities 2, no. 1 (winter 2012), http://journalofdigitalhumanities.org/2-1/topic-modeling-a-basic-introduction-by-megan-r-brett/; jordan boyd-graber, yuening hu, and david mimno, “applications of topic models,” foundations and trends® in information retrieval 11, no. 2–3 (2017): 143–296.
6 boyd-graber, hu, and mimno, “applications of topic models,” foundations and trends® in information retrieval 11, no. 2–3 (2017): 143–296; rania albalawi, tet hin yeap, and morad benyoucef, “using topic modeling methods for short-text data: a comparative analysis,” frontiers in artificial intelligence 3 (2020): 42, https://doi.org/10.3389/frai.2020.00042; hamed jelodar, yongli wang, chi yuan, and xia feng, “latent dirichlet allocation (lda) and topic modeling: models, applications, a survey,” (2017), https://www.ccs.neu.edu/home/vip/teach/dmcourse/5_topicmodel_summ/notes_slides/lda_survey_1711.04305.pdf.
7 zhijun yin et al., “geographical topic discovery and comparison,” in www: proceedings of the 20th international conference on the world wide web (2011), https://doi.org/10.1145/1963405.1963443.
8 david andrzejewski and david buttler, “latent topic feedback for information retrieval,” in kdd '11: proceedings of the 17th acm sigkdd international conference on knowledge discovery and data mining (2011), https://dl.acm.org/doi/10.1145/2020408.2020503.
9 matt erlin, “topic modeling, epistemology, and the english and german novel,” cultural analytics 1, no. 1 (may 1, 2017), https://doi.org/10.22148/16.014.
10 cassidy r. sugimoto et al., “the shifting sands of disciplinary development: analyzing north american library and information science dissertations using latent dirichlet allocation,” journal of the american society for information science and technology 62, no.
1 (january 2011), https://doi.org/10.1002/asi.21435; david mimno, “computational historiography: data mining in a century of classics journals,” journal on computing and cultural heritage 5, no. 1 (april 2012): 3:1–3:19; andrew j. torget and jon christensen, “mapping texts: visualizing american historical newspapers,” journal of digital humanities 1, no. 3 (summer 2012), http://journalofdigitalhumanities.org/1-3/mapping-texts-project-by-andrew-torget-and-jon-christensen/; andrew goldstone and ted underwood, “the quiet transformations of literary studies: what thirteen thousand scholars could tell us,” new literary history 45 (2014): 359–84; carlos g. figuerola, francisco javier garcia marco, and maria pinto, “mapping the evolution of library and information science (1978–2014) using topic modeling on lisa,” scientometrics 112 (2017): 1507–35, https://doi.org/10.1007/s11192-017-2432-9; jung sun oh and ok nam park, “topics and trends in metadata research,” journal of information science theory and practice 6, no. 4 (2018): 39–53; manika lamba and margam madhusudhan, “metadata tagging of library and information science theses: shodhganga (2013–2017),” paper presented at etd 2018: beyond the boundaries of rims and oceans globalizing knowledge with etds, national central library, taipei, taiwan, https://doi.org/10.5281/zenodo.1475795; manika lamba and margam madhusudhan, “author-topic modeling of desidoc journal of library and information technology (2008–2017), india,” library philosophy and practice (2019): 2593, https://digitalcommons.unl.edu/libphilprac/2593.
11 david j. newman and sharon block, “probabilistic topic decomposition of an eighteenth-century american newspaper,” journal of the american society for information science and technology 57, no. 6 (april 1, 2006): 753–67; robert k.
nelson, “mining the dispatch,” last modified november 2020, https://dsl.richmond.edu/dispatch/about; tze-i yang, andrew torget, and rada mihalcea, “topic modeling on historical newspapers,” in latech '11: proceedings of the 5th acl-hlt workshop on language technology for cultural heritage, social sciences, and humanities (2011), https://dl.acm.org/doi/10.5555/2107636.2107649; carina jacobi, wouter van atteveldt, and kasper welbers, “quantitative analysis of large amounts of journalistic texts using topic modelling,” digital journalism 4, no. 1 (2015), https://doi.org/10.1080/21670811.2015.1093271.
12 jonathan o. cain, “using topic modeling to enhance access to library digital collections,” journal of web librarianship 10, no. 3 (2016): 210–25, https://doi.org/10.1080/19322909.2016.1193455; alexandra lesnikowski et al., “frontiers in data analytics for adaptation research: topic modeling,” wires climate change 10, no. 3 (2019): e576, https://doi.org/10.1002/wcc.576.
13 tiziano piccardi and robert west, “crosslingual topic modeling with wikipda,” in proceedings of the web conference 2021 (www ’21), april 19–23, 2021, ljubljana, slovenia (acm, new york), https://doi.org/10.1145/3442381.3449805.
14 cain, “using topic modeling to enhance access to library digital collections,” 210–25; a. krowne and m. halbert, “an initial evaluation of automated organization for digital library browsing,” in jcdl '05: proceedings of the 5th acm/ieee-cs joint conference on digital libraries, (june 7–11, 2005): 246–255; david newman, kat hagedorn, and chaitanya chemudugunta, “subject metadata enrichment using statistical topic models,” paper presented at acm ieee joint conference on digital libraries jcdl’07, vancouver, bc, june 17–22, 2007.
15 craig boman, “an exploration of machine learning in libraries,” ala library technology report 55, no. 1 (january 2019): 21–25.
16 julia silge and david robinson, text mining with r: a tidy approach (sebastopol, california: o’reilly media, inc., 2017), 90.
17 blei, ng, and jordan, “latent dirichlet allocation.”
18 arlene g. taylor, introduction to cataloging and classification, 10th ed. (westport, connecticut: libraries unlimited, 2006), 19–20, 301–14; arlene g. taylor and daniel n. joudrey, the organization of information, 3rd ed. (westport, connecticut: libraries unlimited, 2009), 303–28.
19 blei, ng, and jordan, “latent dirichlet allocation.”
20 silge and robinson, text mining with r, 149.
21 albalawi, yeap, and benyoucef, “using topic modeling methods for short-text data,” 42.
22 the r project for statistical computing, https://www.r-project.org/.
23 bettina grün and kurt hornik, “topicmodels: an r package for fitting topic models,” journal of statistical software 40, no. 13 (2011): 1–30, https://doi.org/10.18637/jss.v040.i13.
24 topic modeling in r (datacamp), https://www.datacamp.com/courses/topic-modeling-in-r.
25 grün and hornik, “topicmodels.”
26 topic modeling in r (datacamp), chap. 3, https://www.datacamp.com/courses/topic-modeling-in-r.
27 christopher m. bishop, pattern recognition and machine learning (new york, ny: springer science + business media, 2006), 32–33.
28 blei, ng, and jordan, “latent dirichlet allocation.”
29 daniel johnson and mark dehmlow, “digital exhibits to digital humanities: expanding the digital libraries portfolio,” in new top technologies every librarian needs to know, ed. kenneth j. varnum, (chicago: ala neal-schuman, 2019), 124.

librarians and technology skill acquisition: issues and perspectives | farney 141
click analytics: visualizing website use data | farney 141
tutorial
tabatha a.
farney

librarians who create website content should have access to website usage statistics to measure their webpages’ effectiveness and refine the pages as necessary.3 with web analytics, libraries can increase the effectiveness of their websites, and as marshall breeding has observed, libraries can regularly use website statistics to determine how new webpage content is actually being used and make revisions to the content based on this information.4 several recent studies used google analytics to collect and report website usage statistics to measure website effectiveness and improve their usability.5 while web analytics are useful in a website redesign process, several studies concluded that web usage statistics should not be the sole source of information used to evaluate a website. these studies recommend using click data in conjunction with other website usability testing methods.6

background

a lack of research on the use of click analytics in libraries motivated the web services librarian to explore their potential by directly implementing them on the library’s website. she found that there are several click analytics products available and each has its own unique functionality. however, many are commercially produced and expensive. with limited funding, the web services librarian selected google analytics’ in-page analytics, clickheat, and crazy egg because they are either free or inexpensive. each tool was evaluated on the library’s website for over a six-month period.

because google analytics cannot discern between the same link repeated in multiple places on a webpage. furthermore, she wanted to use website use data to determine the areas of high and low usage on the library’s homepage, and use this information to justify her webpage reorganization decisions.
although this data can be found in a google analytics report, the web services librarian found it difficult to easily identify the necessary information within the massive amount of data the reports contain. the web services librarian opted to use click analytics, also known as click density analysis or site overlay, a subset of web analytics that reveals where users click on a webpage.1 a click analytics report produces a visual representation of what and where visitors are clicking on an individual webpage by overlaying the click data on top of the webpage that is being tested. rather than wading through the data, libraries can quickly identify what content users are clicking by using a click analytics report.

the web services librarian tested several click analytics products while reassessing the library’s homepage. during this process she discovered that each click analytics tool had different functionalities that impacted their usefulness to the library. this paper introduces and evaluates three click analytics tools, google analytics’ in-page analytics, clickheat, and crazy egg, in the context of redesigning the library’s homepage and discusses the benefits and drawbacks of each.

literature review

library literature indicates that libraries are actively engaged in interpreting website usage data for a variety of purposes. laura b. cohen’s study encourages libraries to use their website usage data to enhance their understanding of how visitors access and use library websites.2 jeanie m. welch further recommends that all

click analytics: visualizing website use data

editor’s note: this paper is adapted from a presentation given at the 2010 lita forum

click analytics is a powerful technique that displays what and where users are clicking on a webpage, helping libraries to easily identify areas of high and low usage on a page without having to decipher website use data sets.
click analytics is a subset of web analytics, but there is little research that discusses its potential uses for libraries. this paper introduces three click analytics tools, google analytics’ in-page analytics, clickheat, and crazy egg, and evaluates their usefulness in the context of redesigning a library’s homepage.

web analytics tools, such as google analytics, assist libraries in interpreting their website usage statistics by formatting that data into reports and charts. the web services librarian at the kraemer family library at the university of colorado, colorado springs wanted to use website use data to reassess the library’s homepage, which was crowded with redundant links. for example, all the links in the site’s dropdown navigation were repeated at the bottom of the homepage to make the links more noticeable to the user, but it unintentionally made the page long. to determine which links the web services librarian would recommend for removal, she needed to compare the use or clicks the repetitive links received. at the time, the library relied solely on google analytics to interpret website use data. however, this practice proved insufficient

tabatha a. farney (tfarney@uccs.edu) is web services librarian, kraemer family library, university of colorado, colorado springs, colorado.

142 information technology and libraries | september 2011

libraries, outbound links include library catalogs or subscription databases. additional javascript tags must be added to each outbound link for google analytics to track that data.9 once google analytics recognizes the outbound links, their click data will be available in the in-page analytics report.

visitors to that page, and outbound destinations, links that navigate visitors away from that webpage. the inbound sources and outbound destinations reports can track outbound links, which are links that have a different domain or url address from the website tracked within google analytics.
for

in-page analytics

google analytics is a popular, comprehensive web analytics tool that contains a click analytics feature called in-page analytics (formerly site overlay) that visually displays click data by overlaying that information on the current webpage (see figure 1). site overlay was used during the library’s redesign process; however, it was replaced by in-page analytics in october 2010.7 the web services librarian reassessed the library’s homepage using in-page analytics, and found that the current tool resolved some of site overlay’s shortcomings. site overlay is no longer accessible in google analytics, so this paper will discuss in-page analytics.

essentially, in-page analytics is an updated version of the site overlay (see figure 2). in addition to visually representing click data on a webpage, in-page analytics contains new features including the ability to easily segment data. web analytics expert avinash kaushik stresses the importance of segmenting website use data because it breaks down the aggregated data into specific data sets that represent more defined groups of users.8 rather than studying the total number of clicks a link received, an in-page analytics report can segment the data into specific groups of users, such as mobile device users. in-page analytics provides several default segments, but custom segments can also be applied, allowing libraries to further filter the data that is constructive to them.

in-page analytics also displays a complementing overview report of statistics located in a side panel next to the typical site overlay view. this overview report extracts useful data from other reports generated in google analytics without having to leave the in-page analytics report screen. the report includes the webpage’s inbound sources, also called top referrals, which are links from other webpages leading

figure 1. screenshot of google analytics’ defunct site overlay

figure 2.
screenshot of google analytics’ in-page analytics click analytics: visualizing website use data | farney 143 services librarian uses a screen capture tool, such as the firefox add-on screengrab,13 to collect and archive the in-page analytics reports, but the process is clunky and results in the loss of the ability to segment the data. clickheat labsmedia’s clickheat is an open source heat mapping tool that visually displays the clicks on a webpage using color to indicate the number of clicks an area receives. similar to in-page analytics, a clickheat heat map displays the current webpage and overlays that page with click data (see figure 3). instead of listing percentages or actual numbers of clicks, the heat map represents clicks using color. the warmer the color, such as yellows, oranges, or reds, the more clicks that area receives; the absence of color implies little to no click activity. each heat map has an indicator that outlines the number of clicks a color represents. a heat map clearly displays the heavily used and underused sections on a webpage, making it easy for people with little experience in website usage statistics to interpret the data. however, a heat map is not about exact numbers, but rather general areas of usage. for exact numbers, a traditional, comprehensive web analytics tool is required. clickheat can stand alone or be integrated into other web analytic tools.14 to have a more comprehensive web analytics product, the web services librarian opted to use the clickheat plugin for piwik, a free, open source web analytics tool that seeks to be an alternative to google analytics.15 by itself piwik has no click analytics feature, so clickheat is a useful plugin. both piwik and clickheat require access to a web server for installation and knowledge of php and mysql to configure them.
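the heat map idea described above, in which warmer colors mark areas with more clicks and the absence of color marks little or no activity, can be sketched as follows. this is a minimal python illustration and not clickheat's actual implementation (clickheat itself is written in php): the function names, the 50-pixel cell size, and the three color thresholds are all assumptions made for the example.

```python
from collections import Counter


def bin_clicks(clicks, cell_size=50):
    """Count clicks per grid cell.

    clicks is an iterable of (x, y) pixel positions; cell_size is a
    hypothetical grid resolution in pixels.
    """
    counts = Counter()
    for x, y in clicks:
        counts[(x // cell_size, y // cell_size)] += 1
    return counts


def color_for(count, max_count):
    """Map a cell's click count to a coarse color name.

    Returns None (no color) for cells without clicks. The thresholds
    are illustrative assumptions, not values taken from any tool.
    """
    if count == 0 or max_count == 0:
        return None
    share = count / max_count
    if share > 0.66:
        return "red"
    if share > 0.33:
        return "orange"
    return "yellow"


# Hypothetical click positions recorded on a page.
clicks = [(10, 10), (20, 30), (40, 45), (120, 10)]
counts = bin_clicks(clicks)
busiest = max(counts.values())
heat = {cell: color_for(n, busiest) for cell, n in counts.items()}
```

because the clicks are grouped into cells before being colored, the result shows general areas of use rather than exact per-link counts, which is the trade-off the text describes.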
because the kraemer family library does not maintain its own web servers, the pages, but it is time consuming and may not be worth the effort since the data are indirectly available.11 a major drawback to in-page analytics is that it does not discern between the same links listed in multiple places on a webpage. instead it tracks redundant links as one link, making it impossible to distinguish which repeated link received more use on the library’s homepage. similarly, the library’s homepage uses icons to help draw attention to certain links. these icons are linked images next to their counterpart text link. since the icon and text link share the same url, in-page analytics cannot reveal which is receiving more clicks. in-page analytics is useless for comparing repetitive links on a webpage, but google reports that they are working on adding this capability.12 as stated earlier, in-page analytics lays the click data over the current webpage in real-time, which can be both useful and limiting. using the current webpage allows libraries to navigate through their site while staying within the in-page analytics report. libraries can follow in the tracks of website users to learn how they interact with the site’s content and navigation. the downside is that it is difficult to compare a new version of a webpage with an older version since it only displays the current webpage. for example, the web services librarian could not accurately compare the use data between the old homepage and the revised homepage within the in-page analytics report because the newly redesigned homepage replaced the old page. comparing different versions of a webpage could help determine whether the new revisions improved the page or not. an archive or export feature would remedy this problem, but in-page analytics does not have this capacity. 
additionally, an export function would improve the ability to share this report with other librarians without having them login to the google analytics website. currently, the web evaluation of in-page analytics in-page analytics’ advanced segmenting ability far exceeds the old site overlay functionality. segmenting click data at the link level helps web managers to see how groups of users are navigating through a website. for example, in-page analytics can monitor the links mobile users are clicking, allowing web managers to track how that group of users are navigating through a website. this data could be used in designing a mobile version of a site. in-page analytics integrates a site overlay report and an overview report that contains selected web use statistics for an individual webpage. although the overview report is not in visual context with the site overlay view, it combines the necessary data to determine how a webpage is being accessed and used. this assists in identifying possible flaws in a website’s navigation, layout, or content. it also has the potential to clarify misleading website statistics. for instance, google analytics top exit pages report indicates the library’s homepage is the top exit page for the site. exit pages are the last page a visitor views before leaving the site.10 having a high exit rate could imply visitors were leaving the library’s site from the homepage and potentially missing a majority of the library’s online resources. using in-page analytics, it was apparent the library’s homepage had a high number of exits because many visitors clicked on outbound links, such as the library catalog, that navigated visitors away from the library’s website. rather than finding a potential problem, in-page analytics indicated that the homepage’s layout successfully led visitors to a desired point of information. 
while the data from the outbound links is available in the data overview report, it is not displayed within the site overlay view. it is possible to work around this problem by creating internal redirect 144 information technology and libraries | september 2011 the precise number of clicks is available in traditional web analytics reports. installing and configuring clickheat is a potential drawback for some libraries that do not have access to the necessary technology or staff to maintain it. even with access to a web server and knowledgeable staff, the web services librarian still experienced glitches implementing clickheat. she could not add clickheat to any high trafficked webpage because it created a slight, but noticeable, lag in response time to any page it was added. the cause was an out-of-box configuration setting that had to be fixed by the campus’ information technology department.17 another concern for libraries is that clickheat is continuously being developed with new versions or patches released periodically.18 like any locally installed software, libraries must plan for continuing maintenance of clickheat to keep it current. just as with in-page analytics, clickheat has no export or archive function. this impedes the web main navigation on the homepage and opted to use links prominently displayed within the homepage’s content. this indicated that either the users did not notice the main navigation dropdown menus or that they chose to ignore them. further usability testing of the main navigation is necessary to better understand why users do not utilize it. clickheat is most useful when combined with a comprehensive web analytics tool, such as piwik. since clickheat only collects data where visitors are clicking, it does not track other web analytics metrics, which limits its ability to segment the click data. currently, clickheat only segments clicks by browser type or screen resolution. 
additional segmenting ability would enhance this tool’s usefulness. for example, the ability to segment clicks from new visitors and returning visitors may reveal how visitors learn to use the library’s homepage. furthermore, the heat map report does not provide the actual number of clicks on individual links or content areas since heat maps generalize click patterns. web services librarian worked with the campus’ information technology department to install piwik with the clickheat plugin on a campus web server. once installed, piwik and clickheat generate javascript tags that must be added to every page that website use data will be tracked. although piwik and clickheat can be integrated, the tools work separately so two javascript tags must be added to a webpage to track click data in piwik as well as in clickheat. only the pages that contain the clickheat tracking script will generate heat maps that are then stored within the local piwik interface. evaluation of clickheat in-page analytics only tracks links or items that perform some sort of action, such as playing a flash video,16 but clickheat tracks clicks on internal links, outbound links, and even nonlinked objects, such as images. hence, clickheat is able to track clicks on the entire webpage. tracking non-linked objects was unexpectedly useful in identifying potential flaws in a webpage’s design. for instance, within a week of beta testing the library’s redesigned homepage, it was evident that users clicked on the graphics that were positioned closely to text links. the images were intended to draw the user’s attention to the text link, but instead users clicked on the graphic itself expecting it to be a link. to alleviate possible user frustration, the web services librarian added links to the graphics to take visitors to the same destinations as their companion text links. 
clickheat treats every link or image as its own separate component, so it has the ability to compare the same link listed in multiple places on the same page. unlike in-page analytics, clickheat was particularly helpful in analyzing which redundant links received more use on the homepage. in addition, the heat map also revealed that users ignored the site’s figure 3. screenshot of clickheat’s heat map report clicks that area has received with the brighter colors representing the higher percentage of clicks. the plus signs can be expanded to show the total number of clicks an item has received, and this number can be easily filtered into eleven predefined allowing crazy egg to differentiate between the same link or image listed multiple times on a webpage. crazy egg displays this data in color-coded plus signs which are located next to the link or graphic it represents. the color is based on the percentage of services librarian’s ability to share the heat maps and compare different versions of a webpage. again, the web services librarian manually archives the heat maps using a screen capture tool, but the process is not the perfect solution. crazy egg crazy egg is a commercial, hosted click analytics tool selected for this project primarily for its advanced click tracking functionality. it is a fee-based service that requires a monthly subscription. there are several subscription packages based on the number of visits and “snapshots.” snapshots are webpages that are tracked by crazy egg. the kraemer family library subscribes to the standard package that allows up to twenty snapshots at one time with a combined total of 25,000 visits a month.
to help manage how those visits are distributed, each tracked page can be assigned a specific number of visits or time period so that one webpage does not use all the visits early in the month. once a snapshot reaches its target number of visits or its allocated time period, it automatically stops tracking clicks and archives that snapshot within the crazy egg website.19 the snapshots convert the click data into three different click analytic reports: heat map, site overlay, and something called “confetti view.” crazy egg’s heat map report is comparable to clickheat’s heat map; they both use intensity of colors to show high areas of clicks on a webpage (see figure 4). crazy egg’s site overlay is similar to in-page analytics in that they both display the number of clicks a link receives (see figure 5). unlike in-page analytics, crazy egg tracks all clicks including outbound links as well as nonlinked content, such as graphics, if it has received multiple clicks. every clicked link and graphic is treated as its own separate entity, figure 4. screenshot of crazy egg’s heat map report figure 5. screenshot of crazy egg’s site overlay report 146 information technology and libraries | september 2011 to decide which redundant links to remove from the homepage. the confetti view report was useful for studying clicks on the entire webpage. segmenting this data allowed the web services librarian to identify click patterns on the webpage from a specific group. for example, the report revealed that mobile device users would scroll horizontally on the homepage to click on content, but rarely vertically. she also focused on the time to click segment, which reports how long it took a visitor to click on something, in the confetti view to identify links or areas that took users over half a minute to click. 
both segments provided interesting information, but further usability testing is necessary to better understand why mobile users preferred not to scroll vertically or why it took users longer to click on certain links. crazy egg also has the ability to archive its snapshots within its profile. this is useful for comparing different versions of a webpage to discover if the modifications were an improvement or not. one goal for the library’s homepage redesign was to shorten the page so users did not have to scroll evaluation of crazy egg crazy egg combines the capabilities of in-page analytics and clickheat in one tool and expands on their abilities. it is not a comprehensive web analytics tool like google analytics or piwik, but rather is designed to specifically track where users are clicking. crazy egg’s heat map report is comparable to the one freely available in clickheat; however, its site overlay and confetti view reports are more sophisticated than what is currently available for free. the web services librarian found crazy egg to be a worthwhile investment during the library’s homepage redesign because it provided additional context to show how users were interacting with the library’s website. the site overlay facilitated the ability to compare the same link listed in multiple locations on the library’s homepage. not only could the web services librarian see how many clicks the links received, but she could also segment and compare that data to learn which links users were finding faster and which links new visitors or returning visitors preferred. this data helped her segments that include day of week, browser type, and top referring websites. custom segments may be applied if they are set up within the crazy egg profile. the confetti view report displays every click the snapshot recorded and overlays those clicks as colored dots on the snapshot as shown in figure 6. the color of the dot corresponds to a specific segment value.
the confetti view report uses the same default segmented values used in the site overlay report but here they can be further filtered into defined values for that segment. for example, the confetti view can segment the clicks by window width and then further filter the data to display only the clicks from visitors with window widths under 1000 pixels to see if users with smaller screen resolutions are scrolling down long webpages to click on content. this information is hard to glean from crazy egg’s site overlay report because it focuses on the individual link or graphic. the confetti view report focuses on clicks at the webpage level, allowing libraries to view usage trends on a webpage. crazy egg is a hosted service like google analytics, which means all the data are stored on crazy egg’s web servers and accessed through its website. implementing crazy egg on a webpage is a two-step process requiring the web manager to first set up the snapshot within the crazy egg profile and then add the tracking javascript tags to the webpage it will track. once the javascript tags are in place, crazy egg takes a picture of the current webpage and stores that as the snapshot on which to overlay the click data reports. since it uses a “snapshot” of the webpage, the website manager needs to retake a snapshot of the webpage if there are any changes to it. retaking the snapshot requires only a click of a button to automatically stop the old snapshot and regenerate a new one based on the current webpage without having to change the javascript tags. figure 6. screenshot of crazy egg’s confetti view report website. next, she will explore ways to automate the process of sharing website use data to make this information more accessible to other interested librarians.
by sharing this information, the web services librarian hopes to promote informed decision making for the library’s web content and design. references 1. avinash kaushik, web analytics 2.0: the art of online accountability and science of customer centricity (indianapolis: wiley, 2010): 81–83. 2. laura b. cohen, “a two-tiered model for analyzing library website usage statistics, part 2: log file analysis,” portal: libraries & the academy 3, no. 3 (2003): 523–24. 3. jeanie m. welch, “who says we’re not busy? library web page usage as a measure of public service activity,” reference services review 33, no. 4 (2005): 377–78. 4. marshall breeding, “an analytical approach to assessing the effectiveness of web-based resources,” computers in libraries 28, no. 1 (2008): 20–22. 5. julie arendt and cassie wagner, “beyond description: converting web site statistics into concrete site improvement ideas,” journal of web librarianship 4, no. 1 (january 2010): 37–54; steven j. turner, “websites statistics 2.0: using google analytics to measure library website effectiveness,” technical services quarterly 27, no. 3 (2010): 261–278; wei fang and marjorie e. crawford, “measuring law library catalog web site usability: a web analytic approach,” journal of web librarianship 2, no. 2–3 (2008): 287–306. 6. arendt and wagner, “beyond description,” 51–52; andrea wiggins, “data-driven design: using web analytics to validate heuristics,” bulletin of the american society for information science and technology 33, no. 5 (2007): 20–21; elizabeth l. black, “web analytics: a picture of the academic library web site user,” journal of web librarianship 3, no. 1 (2009): 12–13. 7. trevor claiborne, “introducing in-page analytics: visual context for your analytics data,” google analytics blog, oct. 15, 2010, http://analytics.blogspot .com/2010/10/introducing-in-page-ana tracking abilities, however, all provide a distinct picture of how visitors use a webpage.
by using all of them, the web services librarian was able to clearly identify and recommend the links for removal. in addition, she identified other potential usability concerns, such as visitors clicking on nonlinked graphics rather than the link itself. a major bonus of using click analytics tools is their ability to create easy-to-understand reports that instantly display where visitors are clicking on a webpage. no previous knowledge of web analytics is required to understand these reports. the web services librarian found it simple to present and discuss click analytics reports with other librarians with little to no background in web analytics. this helped increase the transparency of why links were targeted for removal from the homepage. as useful as click analytics tools are, they cannot determine why users click on a link, only where they have clicked. click analytics tools simply visualize website usage statistics. as elizabeth black reports, these “statistics are a trail left by the user, but they do not explain the motivations behind the behavior.”20 she concludes that additional usability studies are required to better understand users and their interactions on a website.21 libraries can use the click analytics reports to identify a problem on a webpage, but further usability testing will explain why there is a problem and help library web managers fix the issue and prevent repeating the mistake in the future. the web services librarian incorporated the use of in-page analytics, clickheat, and crazy egg in her web analytics practices since these tools continue to be useful to test the usage of new content added to a webpage. furthermore, she finds that click analytics’ straightforward reports prompted her to share website use data more often with fellow librarians to assist in other decision-making processes for the library’s down too much to get to needed links.
by comparing the old homepage and the new homepage confetti reports in crazy egg, it was instantly apparent that the new homepage had significantly fewer clicks on its bottom half than the old version. furthermore, comparing the different versions using the time to click segment in the site overlay showed that placing the link more prominently on the webpage decreased the overall time it took users to click on it. crazy egg’s main drawback is that archived pages that are no longer tracking click data count toward the overall number of snapshots that can be tracked at one time. if libraries regularly retest a webpage, they will easily reach the maximum number of snapshots their subscription permits in a relatively short period. once a crazy egg subscription is cancelled, data stored in the account is no longer accessible. this increases the importance of regularly exporting data. crazy egg is designed to export the heat map and confetti view reports. the direct export function takes a snapshot of the current report as it is displayed, and automatically converts that image into a pdf. exporting the heat map is fairly simple because the report is a single image, but exporting all the content in the confetti view report is more difficult because the report is based on segments of click data. each segment type would have to be exported in a separate pdf report to retain all of the content. in addition, there is no export option for the site overlay report so there is not an easy method to manage that information outside of crazy egg. even if libraries are actively exporting reports from crazy egg, data loss is inevitable. summary and conclusions closely examining in-page analytics, clickheat, and crazy egg reveals that each tool has different levels of click (2009): 81–84. 17. clickheat performance and optimization, labsmedia, http://www.labsmedia.com/clickheat/156894.html (accessed feb. 7, 2011). 18.
clickheat, sourceforge, http://sourceforge.net/projects/clickheat/files/ (accessed feb. 7, 2011). 19. crazy egg, http://www.crazyegg.com/ (accessed mar. 25, 2011). 20. black, “web analytics,” 12. 21. ibid., 12–13. 13. screengrab, firefox add-ons, https://addons.mozilla.org/en-us/firefox/addon/1146/ (accessed feb. 7, 2011). 14. clickheat, labsmedia, http://www.labsmedia.com/clickheat/index.html (accessed feb. 7, 2011). 15. piwik, http://piwik.org/ (accessed feb. 7, 2011). 16. paul betty, “assessing homegrown library collections: using google analytics to track use of screencasts and flash-based learning objects,” journal of electronic resources librarianship 21, no. 1 lytics-visual.html (accessed feb. 7, 2011). 8. kaushik, web analytics 2.0, 88. 9. turner, “websites statistics 2.0,” 272–73. 10. kaushik, web analytics 2.0, 53–55. 11. site overlay not displaying outbound links, google analytics help forum, http://www.google.com/support/forum/p/google+analytics/thread?tid=39dc323262740612&hl=en (accessed feb. 7, 2011). 12. claiborne, “introducing in-page analytics.” president’s message: focus on information ethics aimee fifarek information technologies and libraries | december 2016 1 just a few weeks ago we held yet another successful lita forum1, this time in fort worth, tx. tight travel budgets and time constraints mean that only a few hundred people get to attend forum each year, but that is one of the things that make it a great conference. because of its size you have a realistic chance of meeting everyone there, whether it’s at game night, one of the many networking dinners, or just during hallway chitchat after a session. and the sessions really do give you something to talk about. this year i couldn’t help but notice a theme. among all the talk about makerspace technologies, analytics, and specific software platforms, the one bubble that kept rising to the surface was information ethics.
why are you doing what you are doing with the information you have, and should you really be doing it? have you stopped to think what impact collecting, posting, sharing that information is going to have on the world around you? in a post-election environment replete with talk of fake news and other forms of deliberate misinformation, lita forum presenters seem to have tapped into the zeitgeist. tara robertson, in her closing keynote2, talked about the harm digitizing analog materials can do when what is depicted is sensitive to individuals and communities. waldo jaquith of us open data talked about how a government decision to limit options on a birth certificate to either “white” or “colored” effectively wiped the native population out of political existence in virginia. and sam kome from claremont colleges talked about how well-meaning librarians can facilitate privacy invasion merely by collecting operational statistics3. there were many other examples brought out by forum speakers but these in particular emphasized the serious consequences the use of data – intentional or not – can have on people. i think it is time for librarians4 to get more vocal about information ethics and the role we play in educating the population about humane information use. our profession has always been forward thinking about information literacy and is traditionally known for helping our communities make judgements about the information they consume. but we have not done enough to declare our expertise in the information economy, to stand up and say “we’re librarians – this is what we do.” now, more than ever, people need the skills to think critically about the information they are consuming via all kinds of media, understand the consequences of allowing algorithms to shape their information universe, and make quality judgments about trading their personal information for goods and services.
to quote from unesco: aimee fifarek (aimee.fifarek@phoenix.gov) is lita president 2016-17 and deputy director for customer support, it and digital initiatives at phoenix public library, phoenix, az. president’s message | fifarek https://doi.org/10.6017/ital.v35i4.9602 2 changes brought about by the rapid development of information and communication technologies (ict) not only open tremendous opportunities to humankind but also pose unprecedented ethical challenges. ensuring that information society is based upon principles of mutual respect and the observance of human rights is one of the major ethical challenges of the 21st century.5 i challenge all librarians to make a commitment to propagating information ethics, both personally and professionally. make an effort to get out of your social media echo chamber6 and engage with uncomfortable ideas. when you see biased information being shared consider it a “teachable moment” and highlight the spin or present more neutral information. and if your library is not actively making information literacy and information ethics part of its programming and instruction, then do what you can to change it. offer to be on a panel, create a curriculum, or host a program that includes key concepts relating to information “ownership, access, privacy, security, and community”7. the focus of the libraries transform campaign this year is all about our expertise: “because the best search engine in the library is the librarian”8 it’s our time to shine. references 1. http://forum.lita.org/home/ 2. http://forum.lita.org/speakers/tara-robertson/ 3. http://forum.lita.org/sessions/patron-activity-monitoring-and-privacy-protection/ 4. as always, when i use the term “librarian” my intention is to include any person who works in a library and is skilled in information and library science, not to limit the reference to those who hold a library degree. 5. http://en.unesco.org/themes/ethics-information 6. 
https://www.wnyc.org/story/buzzfeed-echo-chamber-online-news-politics/ 7. https://en.wikipedia.org/wiki/information_ethics 8. http://www.ilovelibraries.org/librariestransform/ laneconnex | ketchell et al. 31 laneconnex: an integrated biomedical digital library interface debra s. ketchell, ryan max steinberg, charles yates, and heidi a. heilemann this paper describes one approach to creating a search application that unlocks heterogeneous content stores and incorporates integrative functionality of web search engines. laneconnex is a search interface that identifies journals, books, databases, calculators, bioinformatics tools, help information, and search hits from more than three hundred full-text heterogeneous clinical and bioresearch sources. the user interface is a simple query box. results are ranked by relevance with options for filtering by content type or expanding to the next most likely set. the system is built using component-oriented programming design. the underlying architecture is built on apache cocoon, java servlets, xml/xslt, sql, and javascript. the system has proven reliable in production, reduced user time spent finding information on the site, and maximized the institutional investment in licensed resources. most biomedical libraries separate searching for resources held locally from external database searching, requiring clinicians and researchers to know which interface to use to find a specific type of information. google, amazon, and other web search engines have shaped user behavior and expectations.1 users expect a simple query box with results returned from a broad array of content ranked or categorized appropriately with direct links to content, whether it is an html page, a pdf document, a streaming video, or an image. biomedical libraries have transitioned to digital journals and reference sources, adopted openurl link resolvers, and created institutional repositories.
however, students, clinicians, and researchers are hindered from maximizing this content because of proprietary and heterogeneous systems. a strategic challenge for biomedical libraries is to create a unified search for a broad spectrum of licensed, open-access, and institutional content. background studies show that students and researchers will use the search path of least cognitive resistance.2 ease and speed are the most important factors for using a particular search engine. a university of california report found that academic users want one search tool to cover a wide information universe, multiple formats, full-text availability to move seamlessly to the item itself, intelligent assistance and spelling correction, results sorted in order of relevance, help navigating large retrievals by logical subsetting and customization, and seamless access anytime, anywhere.3 studies of clinicians in the patient-care environment have documented that effort is the most important factor in whether a patient-care question is pursued.4 for researchers, finding and using the best bioinformatics tool is an elusive problem.5 in 2005, the lane medical library and knowledge management center (lane) at the stanford university medical center provided access to an expansive array of licensed, institutional, and open-access digital content in support of research, patient care, and education. like most of its peers, lane users were required to use scores of different interfaces to search external databases and find digital resources. we created a local metasearch application for clinical reference content, but it did not integrate result sets from disparate resources. a review of federated-search software in the marketplace found that products were either slow or they limited retrieval when faced with a broad spectrum of biomedical content. we decided to build on our existing application architecture to create a fast and unified interface.
a detailed analysis of lane website-usage logs was conducted before embarking on the creation of the new search application. key points of user failure in the existing search options were spelling errors that could easily be corrected to avoid zero results; lack of sufficient intuitive options to move forward from a zero-results search or change topics without backtracking; lack of use of existing genre or role searches; confusion about when to use the resource, openurl resolver, or pubmed search to find a known item; and results that were cognitively difficult to navigate. studies of web search-engine query logs and the pubmed search log concurred with our usage-log analysis: a single-term search is the most common, with three words maximum entered by typical users.6 a pubmed study found that 22 percent of user queries were for known items rather than for a general subject, confirming our own log analysis findings that the majority of searches were for a particular source item.7 search-term analysis revealed that many of our users were entering partial article citations (e.g., author, date) in any query box expecting that article databases would be searched concurrently with the resource database. our displayed results were sorted alphabetically, and each version of an item was displayed separately. for the user, this meant a cluttered list with redundant title information that increased their cognitive effort to find meaningful items. overall, users were confronted with too many choices upfront and too few options after retrieving results.

debra s. ketchell (debra.ketchell@gmail.com) is the former associate dean for knowledge management and library director; ryan max steinberg (ryan.max.steinberg@stanford.edu) is the knowledge integration programmer/architect; charles yates (charles.yates@stanford.edu) is the systems software developer; and heidi a. heilemann (heidi.heilemann@stanford.edu) is the former director for research & instruction and current associate dean for knowledge management and library director at the lane medical library & knowledge management center, information resources & technology, stanford university school of medicine, stanford, california.

32 information technology and libraries | march 2009

focus groups of faculty and students were conducted in 2005. attendees wanted local information integrated into the proposed single search. local information included content such as how-to information, expertise, seminars, grand rounds, core lab resources, drug formulary, patient handouts, and clinical calculators. most of this content is restricted to the stanford user population. users consistently described their need for a simple search interface that was fast and customized to the stanford environment. in late 2005, we embarked on a project to design a search application that would both address existing points of failure in the current system and meet the expressed need for a comprehensive discovery-and-finding tool as described in focus groups. the result is an application called laneconnex.

design objectives

the overall goal of laneconnex is to create a simple, fast search across multiple licensed, open-access, and special-object local knowledge sources that depackages and reaggregates information on the basis of stanford institutional roles. the content of lane’s digital collection includes forty-five hundred journal titles and forty-two thousand other digital resources, including video lectures, executable software, patient handouts, bioinformatics tools, and a significant store of digitized historical materials as a result of the google books program. media types include html pages, pdf documents, jpeg images, mp3 audio files, mpeg4 videos, and executable applications.
more than three hundred reference titles have been licensed specifically for clinicians at the point of care (e.g., uptodate, emedicine, stat-ref, and micromedex clinical evidence). clinicians wanted their results to reflect subcomponents of a package (e.g., results from the micromedex patient handouts). other clinical content is institutionally managed (e.g., institutional formulary, lab test database, or patient handouts). more than 175 biomedical research tools have been licensed or selected from open-access content. the needs of biomedical researchers include molecular biology tools and software, biomedical literature databases, citation analysis, chemical and engineering databases, expertise-finding tools, laboratory tools and supplies, institutional-research resources, and upcoming seminars.

the specific objectives of the search application are the following:

- the user interface should be fast, simple, and intuitive, with embedded suggestions for improving search results (e.g., did you mean? didn’t find it? have you tried?).
- search results from disparate local and external systems should be integrated into a single display based on popular search-engine models familiar to the target population.
- the query-retrieval and results display should be separated and reusable to allow customization by role or domain and future expansion into other institutional tools.
- resource results should be ranked by relevance and filtered by genre.
- metasearch results should be presented as hit counts and filtered by category for speed and breadth. results should be reusable for specific views by role.
- finding a known article or journal should be streamlined, linking directly to the item or a “get item” option.
- the most popular search options (pubmed, google, and lane journals) should be ubiquitous.
- alternative pathways should be dynamic and interactive at the point of need to avoid backtracking and dead ends.
- user behavior should be tracked by search term, resource used, and user location to help the library make informed decisions about licensing, metadata, and missing content.
- off-the-shelf software should be used when available or appropriate, with development focused on search integration.
- the application should be built upon existing metadata-creation systems and trusted web-development technologies.

based on these objectives, we designed an application that is an extension of existing systems and technologies. resources are acquired and metadata are provided using the voyager integrated library system (ils). the sfx openurl link resolver provides full-text article access and expands the title search beyond biomedicine to all online journals at stanford. ezproxy provides seamless off-campus access. webtrends provides usage tracking. movable type is used to create faq and help information. a locally developed metasearch application provides a cross search with hit results from more than three hundred external and internal full-text sources. the technologies used to build laneconnex and integrate all of these systems include extensible stylesheet language transformations (xslt), java, javascript, the apache cocoon project, and oracle.

systems description

architecture

laneconnex is built on a principle of separation of concerns. the lane content owner can directly change the inclusion of search results, how they are displayed, and additional path-finding information. application programmers use java, javascript, xslt, and structured query language (sql) to create components that generate and modify the search results. the merger of content design and search results occurs “just in time” in the user’s browser. we use component-oriented programming design whereby services provided within the application are defined by simple contracts.
in laneconnex, these components (called “transformers”) consume xml information and, after transforming it in some way, pass it on to some other component. a particular contract can be fulfilled in different ways for different purposes. this component architecture allows for easy extension of the underlying apache cocoon application. if laneconnex needs to transform some xml data in a way that is not possible with built-in cocoon transformers, it is a simple matter to create a software component that does what is needed and fulfills the transformer contract.

apache cocoon is the underlying architecture for laneconnex, as illustrated in figure 1. this java servlet is an xml-publishing engine that is built upon a component framework and uses a pipeline-processing model. a declarative language uses pattern matching to associate sets of processing components with particular request urls. content can come from a variety of sources. we use content from the local file system, network file system, http, and a relational database. the xslt language is used extensively in the pipelines and gives fine control of individual parts of the documents being processed. the end of processing is usually an xhtml document but can be any common mime type. we use cocoon to separate areas of concern so things like content, look and feel, and processing can all be managed as separate entities by different groups of people with little effect on another area. this separation of concerns is manifested by template documents that contain most of the html content common to all pages and are then combined with content documents within a processing pipeline. the declarative nature of the sitemap language and xslt facilitates rapid development with no need to redeploy the entire application to make changes in its behavior.
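the transformer contract described above can be illustrated with a small sketch. this is not cocoon code (cocoon transformers are java components that process sax events); it is a simplified python analogy in which each transformer is a function that takes an xml element and returns one. the element names (`results`, `resource`), the `hits` attribute, and both transformer functions are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Iterable

# The "contract": a transformer consumes an XML element and returns one.
Transformer = Callable[[ET.Element], ET.Element]


def mark_empty(doc: ET.Element) -> ET.Element:
    """Hypothetical transformer: flag resources that returned zero hits."""
    for res in doc.iter("resource"):
        if int(res.get("hits", "0")) == 0:
            res.set("class", "empty")
    return doc


def sort_by_hits(doc: ET.Element) -> ET.Element:
    """Hypothetical transformer: order child resources by descending hit count."""
    doc[:] = sorted(doc, key=lambda r: int(r.get("hits", "0")), reverse=True)
    return doc


def run_pipeline(doc: ET.Element, transformers: Iterable[Transformer]) -> ET.Element:
    """Pass the document through each transformer in turn."""
    for transform in transformers:
        doc = transform(doc)
    return doc


if __name__ == "__main__":
    source = ET.fromstring(
        '<results><resource id="a" hits="0"/><resource id="b" hits="7"/></results>'
    )
    out = run_pipeline(source, [mark_empty, sort_by_hits])
    print(ET.tostring(out, encoding="unicode"))
```

because every step honors the same contract, a new step can be added to the list, or an existing one swapped out, without touching the others; this is the property the text attributes to the component architecture.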
the laneconnex search is composed of several components integrated into a query-and-results interface: oracle resource metadata, full-text metasearch application, movable type blogging software, “did you mean?” spell checker, ezproxy remote access, and webtrends tracking.

full-text metasearch

integration of results from lane’s metasearch application illustrates cocoon’s many strengths. when a user searches laneconnex, cocoon sends his or her query to the metasearch application, which then dispatches the request to multiple external, full-text search engines and content stores. some examples of these external resources are uptodate, access medicine, micromedex, pubmed, and md consult. the metasearch application interacts with these external resources through jakarta commons http clients. responses from external resources are turned into w3c document object model (dom) objects, and xpath expressions are used to resolve hit counts from the dom objects. as result counts are returned, they are added to an xml-based result list and returned to cocoon.

the power of cocoon becomes evident as the xml-based metasearch result list is combined with a separate display template. this template-based approach affords content curators the ability to directly add, group, and describe metasearch resources using the language and look that is most meaningful to their specific user communities. for example, there are currently eight metasearch templates curated by an informationist in partnership with a target community. curating these templates requires little to no assistance from programmers.

in lane’s 2005 interface, a user’s request was sent to the metasearch application, and the application waited five seconds before responding to give external resources a chance to return a result. hit counts in the user interface included a link to refresh and retrieve more results from external resources that had not yet responded.
usability studies showed this to be a significant user barrier, since the refresh link was rarely clicked. the initial five-second delay also gave users the impression that the site was slow. the laneconnex application makes heavy use of javascript to solve this problem. after a user makes her initial request, javascript is used to poll the metasearch application (through cocoon) on the user’s behalf, popping in result counts as external resources respond. this adds a level of interactivity previously unavailable and makes the metasearch piece of laneconnex much more successful than its previous version.

resource metadata

laneconnex replaces the catalog as the primary discovery interface. metadata describing locally owned and licensed resources (journals, databases, books, videos, images, calculators, and software applications) are stored in the library’s current system of record, an instance of the voyager ils. laneconnex makes no attempt to replace voyager’s strengths as an application for the selection, acquisition, description, and management of access to library resources. it does, however, replace voyager’s discovery interface. to this end, metadata for about eight thousand digital resources is extracted from voyager’s oracle database, converted into marcxml, processed with xslt, and stored in a simple relational database (six tables and twenty-nine attributes) to support fast retrieval speed and tight control over search syntax. this extraction process occurs nightly, with incremental updates every five minutes. the oracle text search engine provides functionality anticipated by our internet-minded users. key features are speed and relevance-ranked results. a highly refined results ranking ensures that the logical title appears in the first few results. a user’s query is parsed for wildcard, boolean, proximity, and phrase operators, and then translated into an sql query.
results are then transformed into a display version.

related services

laneconnex compares a user’s query terms against a dictionary. each query is sent to a cocoon spell-checking component that returns suggestions where appropriate. this component currently uses the simple object access protocol (soap)–based spelling service from google. google was chosen over the national center for biotechnology information (ncbi) spelling service because of the breadth of terms entered by users; however, cocoon’s component-oriented architecture would make it trivial to change spell checkers in the future.

figure 1. laneconnex architecture.

each query is also compared against stanford’s openurl link resolver (findit@stanford). client-side javascript makes a cocoon-mediated query of findit@stanford. using xslt, findit@stanford responses are turned into javascript object notation (json) objects and popped into the interface as appropriate. although the vast majority of laneconnex searches result in zero findit@stanford results, the convenience of searching all of lane’s systems in a single, unified interface far outweighs the effort of implementation.

a commercial analytics tool called webtrends is used to collect web statistics for making data-centric decisions about interface changes. webtrends uses client-side javascript to track specific user click events. libraries need to track both on-site clicks (e.g., the user clicked on “clinical portal” from the home page) and off-site clicks (e.g., the user clicked on “yamada’s gastroenterology” after doing a search for “ibs”). to facilitate off-site click capture, webtrends requires every external link to include a snippet of javascript. requiring content creators to input this code by hand would be error prone and tedious. laneconnex automatically supplies this code for every class of link (search or static).
this specialized webtrends method provides lane with data to inform both interface design and licensing decisions.

results

laneconnex version 1.0 was released to the stanford biomedical community in july 2006. the current application can be experienced at http://lane.stanford.edu. the production version has proven reliable over two years. incremental user focus groups have been employed to improve the interface as issues arose. a series of vignettes will be used to illustrate how the current version of laneconnex meets the design objectives from the user’s perspective.

- user query: “science.” a graduate student is looking for the journal science. the laneconnex results are listed in relevance order (see figure 2). single-word titles are given a higher weight in the ranking algorithm to ensure they are displayed in the first five results. results from local metadata are displayed by uniform title. for example, lane has three instances of the journal science, and each version is linked to the appropriate external store. brief notes provide critical information for particular resources. for example, restricted local patient education documents and video seminars note that the “sunetid login” is required.
- user query: “new yokrer.” a faculty member is looking for an article in the new yorker for a class reading assignment. he makes a typing error, which invokes the “did you mean?” function (see figure 3). he clicks on the correct spelling. no results are found in the resource search, but a simultaneous search of the link-resolver database finds an instance of this title licensed for the campus and displays a clickable link for the user.
- user query: “pathway analysis.” a postdoc is looking for information on how to share an ingenuity pathway. figure 4 illustrates the integration of the locally created lane faqs. faqs comprise a broad spectrum of help and how-to information as described by our focus groups. help text is created in the movable type blog software and made searchable through the laneconnex application. the movable type interface lowers the barrier to html content creation by any staff member. more complex answers include embedded images and videos to enable the user to see exactly how to do a particular procedure. cocoon allows for the syndication of subsets of this faq content back into static html pages, where it can be displayed either as category-specific lists or as the text for scroll-over help for a link. having a single store of help information ensures the content is updated once for all instances.
- user query: “uterine cancer kapp.” a resident is looking for a known article. laneconnex simultaneously searches pubmed to increase the likelihood of user success (see figure 5). clicking on the pubmed tab retrieves the results in the native interface; however, the user sees the pubmed@stanford version, which includes embedded links to the article based on our openurl link resolver. the ability to retrieve results from bibliographic databases with article resolution included ensures that our biomedical community always uses the correct url for maximum full-text article access. user testing in 2007 found that adding the three most frequently used sources (pubmed, google, and lane catalog) into our one-box laneconnex search was a significant time saver. it addresses the expectation on the part of our users that they could search for an article or a journal title in a single search box without first selecting a database.
- user query: “serotonin pulmonary hypertension.” a medical student is looking for the correlation of two topics. clicking on the “clinical” tab, the student sees the results of the clinical metasearch in figure 6. metasearch results are deep searches of sources within licensed packages (e.g., textbooks in md consult or a specific database in micromedex), local content (e.g., stanford’s lab-test database), and open-access content (e.g., ncbi databases). pubmed results are tailored strategies tiered by evidence. for example, the evidence-summaries strategy retrieves results from twelve clinical-evidence resources (e.g., buj, clinical evidence, and cochrane systematic reviews) that link to the full-text licensed by stanford.

figure 2. laneconnex resource search results. resource results are ranked by relevance. single-word titles are given a higher weight in the ranking algorithm to ensure they are displayed in the first five results. uniform titles are used to co-locate versions (e.g., the three instances of science from different producers). journal titles are linked to their respective impact factor page in the isi web of knowledge. digital formats that require special players or restrictions are indicated. the metadata searched for ejournals, databases, ebooks, biotools, video, and medcalcs are lane’s digital resources extracted from the integrated library system into a searchable oracle database. the first “all” tab is the combined results of these genres and the lane site help and information.

figure 3. laneconnex related services search enhancements. laneconnex includes a spell checker to avoid a common failure in user searches. ajax services allow the inclusion of search results from other sources for common zero-results failures. for example, the stanford link resolver database is simultaneously searched to ensure online journals outside the scope of biomedicine are presented as a linked result for the user.

figure 4. example of integration of local content stores. help information is managed in movable type and integrated into laneconnex search results.

an example of the bioresearch metasearch is shown in figure 7.
content selected for this audience includes literature databases, funding sources, patents, structures, clinical trials, protocols, and stanford expertise integrated with gene, protein, and phenotype tools.

user testing revealed that many users did not click on the “clinical” tab. the clinical metasearch was originally developed for the clinical portal page and focused on clinicians in practice; however, the results needed to be exposed more directly as part of the laneconnex search. figure 8 illustrates the “have you tried?” feature that displays a few relevant clinical-content sources without requiring the user to select the “clinical” tab. this feature is managed by the smartsearch component of the laneconnex system. smartsearch sends the user’s query terms to pubmed, extracts a subset of articles associated with those terms, extracts the mesh headings for those articles, and computes the frequency of headings in the articles to determine the most likely mesh terms associated with the user’s query terms. these mesh terms are mapped to mesh terms associated with each metasearch resource. preliminary evaluation indicates that the clinical content is now being discovered by more users.

figure 5. example of integration of popular search engines into laneconnex results. three of the most popular searches based on usage analysis are included at the top level. pubmed and google are mapped to lane’s link resolver to retrieve the full article.

creating or editing metasearch templates is a curator-driven task. programming is only required to add new sources to the metasearch engine. a curator may choose from more than three hundred sources to create a discipline-based layout using general templates. names, categories, and other description information are all at the curator’s discretion.
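the smartsearch steps described earlier (count mesh headings across a sample of articles, keep the most frequent, and match them against the headings assigned to each metasearch resource) can be sketched as follows. this is a minimal illustration under stated assumptions, not the laneconnex implementation: the pubmed lookup is replaced by an in-memory list of headings per article, and the function names, the overlap-based scoring, and the sample resources are all hypothetical.

```python
from collections import Counter
from typing import Dict, List


def top_mesh_terms(article_headings: List[List[str]], n: int = 3) -> List[str]:
    """Return the n headings that occur in the most articles.

    Each inner list holds the MeSH headings of one retrieved article.
    Ties are broken alphabetically so the result is deterministic.
    """
    counts = Counter(h for headings in article_headings for h in set(headings))
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:n]]


def suggest_resources(terms: List[str], resource_mesh: Dict[str, List[str]]) -> List[str]:
    """Rank resources by how many of the query's likely MeSH terms they share."""
    wanted = set(terms)
    scored = [(len(wanted & set(mesh)), name) for name, mesh in resource_mesh.items()]
    return [name for score, name in sorted(scored, key=lambda s: (-s[0], s[1])) if score > 0]


if __name__ == "__main__":
    # Stand-in for the headings of articles retrieved from PubMed for a query.
    sample = [
        ["Hypertension, Pulmonary", "Serotonin"],
        ["Hypertension, Pulmonary", "Child"],
        ["Hypertension, Pulmonary", "Serotonin", "Rats"],
    ]
    # Hypothetical curator-assigned headings for metasearch resources.
    resources = {
        "cardiology text": ["Hypertension, Pulmonary", "Heart Failure"],
        "oncology text": ["Neoplasms"],
        "pharmacology text": ["Serotonin", "Hypertension, Pulmonary"],
    }
    likely = top_mesh_terms(sample, n=2)
    print(likely)
    print(suggest_resources(likely, resources))
```

counting each heading once per article keeps a single heavily indexed article from dominating the ranking; whether smartsearch counts this way is not stated in the text.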
while developing new subspecialty templates, we discovered that clinicians were confused by the difference in layout of their specialty portal and their metasearch results (e.g., the cardiology portal used the generic clinical metasearch). to address this issue, we devised an approach that merges a portal and metasearch into a single entity as illustrated in figure 9. a combination of the component-oriented architecture of laneconnex and javascript makes the integration of metasearch results into a new template patterned after a portal easy to implement. this strategy will enable the creation of templates contextually appropriate to knowledge requests originating from electronic medical-record systems in the future.

direct user feedback and usage statistics confirm that search is now the dominant mode of navigation. the amount of time each user spends on the website has dropped since the release of version 1.0. we speculate that the integrated search helps our users find relevant information more efficiently. focus groups with students are uniformly positive. graduate students like the ability to find digital articles using a single search box. medical students like the clinical metasearch as an easy way to look up new topics in texts and customized pubmed searches. bioengineering students like the ability to easily look up patient care–related topics. pediatrics residents and attendings have championed the development of their portal and metasearch focused on their patient population. medical educators have commented on their ability to focus on the best information sources.

discussion

a review of websites in 2007 found that most biomedical libraries had separate search interfaces for their digital resources, library catalog, and external databases. biomedical libraries are implementing metasearch software to cross search proprietary databases.
the university of california, davis is using the metalib software to federate searching multiple bibliographic databases.8 the university of southern california and florida state university are using webfeat software to search clinical textbooks.9 the health sciences library system at the university of pittsburgh is using vivisimo to search clinical textbooks and bioresearch tools.10 academic libraries are introducing new “resource shopping” applications, such as the endeca project at north carolina state university, the summa project at the university of aarhus, and the vufind project at villanova university.11 these systems offer a single query box, faceted results, spell checking, recommendations based on user input, and asynchronous javascript and xml (ajax) for live status information.

figure 6. integration of metasearch results into laneconnex. results from two general, role-based metasearches (bioresearch and clinical) are included in the laneconnex interface. the first image shows a clinician searching laneconnex for serotonin pulmonary hypertension. selecting the clinical tab presents the clinical content metasearch display (second image), and selecting a title places the user deep inside the source (third image).

we believe our approach is a practical integration for our biomedical community that bridges finding a resource and finding a specific item through a metasearch of multiple databases. the laneconnex application searches across digital resources and external data stores simultaneously and presents results in a unified display. the limitation to our approach is that the metasearch returns only hit counts rather than previews of the specific content. standardization of results from external systems, particularly receipt of xml results, remains a challenge. federated search engines do integrate at this level, but are usually slow or limit the number of results.
true integration awaits health level seven (hl7) clinical decision support standards and the national information standards organization (niso) metasearch initiative for query and retrieval of specific content.12

speed and ease of use are primary objectives of laneconnex. ranking and categorization of results has been very successful in the eyes of the user community. the integration of metasearch results has been particularly successful with our pediatric specialty portal and search. however, general user understanding of how the clinical and biomedical tabs relate to the genre tabs in laneconnex has been problematic. we reviewed web engines and found a similar challenge in presenting disparate format results (e.g., video or image search results) or lists of hits from different systems (e.g., ncbi’s entrez search results).13 we are continuing to develop our new specialty portal-and-search model and our smartsearch term-mapping component to further integrate results.

conclusion

laneconnex is an effective and open-ended search infrastructure for integrating local resource metadata and full-text content used by clinicians and biomedical researchers. its effectiveness comes from the recognition that users prefer a single query box with relevance or categorically organized results that lead them to the most likely answer to a question or prospects in their exploration. the application is based on separation of concerns and is easily extensible. new resources are constantly emerging, and it is important that libraries take full advantage of existing and forthcoming content that is tailored to their user population regardless of the source. the next major step in the ongoing development of laneconnex is becoming an invisible backend application to bring content directly into the user’s workflow.

figure 7. example of a bioresearch metasearch.

figure 8. the smartsearch component embeds a set of the metasearch results into the laneconnex interface as “have you tried?” clickable links. these links are the equivalent of selecting the title from a clinical metasearch result. the example search for atypical malignant rhabdoid tumor (a rare childhood cancer) invokes oncology and pediatric textbook results. these texts and pubmed provide quick access for a medical student or resident on the pediatric ward.

figure 9. example of a clinical specialty portal with integrated metasearch. clinical portal pages are organized so metasearch hit counts can display next to content links if a user executes a search. this approach removes the dissonance clinicians felt existed between separate portal page and metasearch results in version 1.0.

acknowledgements

the authors would like to acknowledge the contributions of the entire laneconnex technical team, in particular pam murnane, olya gary, dick miller, rick zwies, and rikke ogawa for their design contributions, philip constantinou for his architecture contribution, and alain boussard for his systems development contributions.

references

1. denise t. covey, “the need to improve remote access to online library resources: filling the gap between commercial vendor and academic user practice,” portal: libraries and the academy 3 no. 4 (2003): 577–99; norbert lossau, “search engine technology and digital libraries,” d-lib magazine 10 no. 6 (2004), www.dlib.org/dlib/june04/lossau/06lossau.html (accessed mar. 1, 2008); oclc, “college students’ perception of libraries and information resource,” www.oclc.org/reports/perceptionscollege.htm (accessed mar. 1, 2008); and jim henderson, “google scholar: a source for clinicians,” canadian medical association journal 172 no. 12 (2005).
2.
covey, “the need to improve remote access to online library resources”; lossau, “search engine technology and digital libraries”; oclc, “college students’ perception of libraries and information resource.”
3. jane lee, “uc health sciences metasearch exploration. part 1: graduate student focus group findings,” uc health sciences metasearch team, www.cdlib.org/inside/assess/evaluation_activities/docs/2006/draft_gradreport_march2006.pdf (accessed mar. 1, 2008).
4. karen k. grandage, david c. slawson, and allen f. shaughnessy, “when less is more: a practical approach to searching for evidence-based answers,” journal of the medical library association 90 no. 3 (2002): 298–304.
5. nicola cannata, emanuela merelli, and russ b. altman, “time to organize the bioinformatics resourceome,” plos computational biology 1 no. 7 (2005): e76.
6. craig silverstein et al., “analysis of a very large web search engine query log,” www.cs.ucsb.edu/~almeroth/classes/tech-soc/2005-winter/papers/analysis.pdf (accessed mar. 1, 2008); anne aula, “query formulation in web information search,” www.cs.uta.fi/~aula/questionnaire.pdf (accessed mar. 1, 2008); jorge r. herskovic, len y. tanaka, william hersh, and elmer v. bernstam, “a day in the life of pubmed: analysis of a typical day’s query log,” journal of the american medical informatics association 14 no. 2 (2007): 212–20.
7. herskovic, “a day in the life of pubmed.”
8. davis libraries university of california, “quicksearch,” http://mysearchspace.lib.ucdavis.edu/ (accessed mar. 1, 2008).
9. eileen eandi, “health sciences multi-ebook search,” norris medical library newsletter (spring 2006), norris medical library, university of southern california, www.usc.edu/hsc/nml/lib-information/newsletters.html (accessed mar. 1, 2008); maguire medical library, florida state university, “webfeat clinical book search,” http://med.fsu.edu/library/tutorials/webfeat2_viewlet_swf.html (accessed mar. 1, 2008).
10. jill e. foust, philip bergen, gretchen l.
maxeiner, and peter n. pawlowski, “improving e-book access via a librarydeveloped full-text search tool,” journal of the medical library association 95 no. 1 (2007): 40–45. 11. north carolina state university libraries, “endeca at the ncsu libraries,” www.lib.ncsu.edu/endeca (accessed mar. 1, 2008); hans lund, hans lauridsen, and jens hofman hansen, “summa—integrated search,” www.statsbiblioteket.dk/ publ/summaenglish.pdf (accessed mar. 1, 2008); falvey memorial library, villanova university, “vufind,” www.vufind.org (accessed mar. 1, 2008). 12. see the health level seven (hl7) clinical decision support working committee activities, in particular the infobutton standard proposal at www.hl7.org/special/committees/dss/ index.cfm and the niso metasearch initiative documentation at www.niso.org/workrooms/mi (accessed mar 1, 2008). 13. national center for biotechnology information (ncbi) entrez cross-database search, www.ncbi.nlm.nih.gov/entrez (accessed mar. 1, 2008). acrl 5 alcts 15 lita cover 2, cover 3 jaunter cover 4 index to advertisers 94 journal of library automation vol. 2/ 2 june, 1969 appendix ii a proposed utilization of the subrecord directory and subrecord relationship fields in the proposed american standard for a format for bibliographic information interchange on magnetic tape the following appendix is not part of the proposed standard but is included to illustrate a method for handling subrecords within a bibliographic record. 1. subrecord directory a subrecord directory will be used when a bibliographic record consists of more than one subrecord. the subrecord directory, when present, will contain entries comprising a directory of the directory entries. 0 1.1 entries in subrecord directory tag id the format of each entry in the subrecord directory is as shown: 1 tag length starting character type-ofbibliographic position record level 2 3 6 7 11 1.1.1 tag the tag is associated with a complete subrecord. 
1.1.1.1 tag id
the tag id is a data element consisting of one basic character used to differentiate tags for multiple subrecords describing bibliographic units which have the same type-of-record and bibliographic level codes (see 3.2.4 and 3.2.5).

1.1.1.2 type-of-record and bibliographic level
when the subrecord does not describe a primary bibliographic unit, the type-of-record and bibliographic level are assigned as though it did.

1.1.2 length of subrecord directory
the length of that portion of the directory associated with the subrecord. (this value is always a multiple of twelve and, when divided by twelve, equals the number of entries associated with the subrecord.)

1.1.3 starting character position
the starting character position is that of the first entry in the directory which pertains to the subrecord. it is a five-digit decimal number relative to the first character of the bibliographic record.

2. subrecord relationship field
a subrecord relationship field will be present if, and only if, a subrecord directory is present. the subrecord relationship field, when present, shall contain fixed fields which are used to indicate the relationships of subrecords to each other.

2.1 relationship fields
the format of each relationship field is as shown (numbers are character positions):

| tag | relationship indicator | tag | tag  |
| 0-2 | 3-5                    | 6-8 | 9-11 |

if a subrecord bears the same relationship to two subrecords, they may both be shown; otherwise the second tag field is padded with the padding character. there is no limit to the number of relationship fields or to the number of relationships which may involve a specific subfield. the relationship fields may be used to develop, and define tags for, hierarchies of subfields. one bibliographic record may contain no more than 64 subrecords with the same type-of-record and bibliographic level.

2.1.1 tag
the tags in the subrecord relationship field have the same format as those in the subrecord directory.
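the twelve-character directory entry described in section 1.1 can be sketched in code. this is only an illustration, not part of the proposed standard: the character positions (tag id at 0, type-of-record and bibliographic level at 1-2, length at 3-6, starting character position at 7-11) are read from the entry diagram, and the names `SubrecordEntry` and `parse_subrecord_directory` are hypothetical.

```python
from dataclasses import dataclass

ENTRY_LEN = 12  # assumed fixed width of one subrecord directory entry


@dataclass
class SubrecordEntry:
    tag_id: str                 # position 0: one basic character
    record_type_and_level: str  # positions 1-2: type-of-record, bibliographic level
    directory_length: int       # positions 3-6: length of the directory portion
    start: int                  # positions 7-11: five-digit starting position


def parse_subrecord_directory(field: str) -> list:
    """Split a subrecord directory into fixed-width entries (hypothetical helper)."""
    if len(field) % ENTRY_LEN != 0:
        raise ValueError("directory length must be a multiple of twelve")
    entries = []
    for offset in range(0, len(field), ENTRY_LEN):
        raw = field[offset:offset + ENTRY_LEN]
        entries.append(
            SubrecordEntry(
                tag_id=raw[0],
                record_type_and_level=raw[1:3],
                directory_length=int(raw[3:7]),
                start=int(raw[7:12]),
            )
        )
    return entries


def entries_for_subrecord(entry: SubrecordEntry) -> int:
    """Per 1.1.2, the length divided by twelve is the number of directory entries."""
    return entry.directory_length // ENTRY_LEN
```

for an entry such as `Aam003600024` (made-up values), the sketch would report a directory portion of 36 characters, that is, three directory entries, beginning at character position 24 of the bibliographic record.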
2.1.2 relationship indicator
a relationship indicator is a data element consisting of three (3) basic characters used to indicate the relationship between subfields.

letter from the editor
december 2021
kenneth j. varnum
information technology and libraries | december 2021
https://doi.org/10.6017/ital.v40i4.14291

another year is nearly in the books. it has not been an easy year for many of us; perhaps not as truly chaotic and frightening as 2020 was, but still a year filled with uncertainty and rising concerns about the path of the pandemic. as we turn to a new year in our calendars, i wish all our readers health, peace, and a continued spirit of adaptation as we begin 2022.

our public libraries leading the way column, “how covid affected our python class at the worcester public library” by melody friendenthal, is a follow up to her 2020 column on moving a library course on the python programming language from in-person to online for the worcester (ma) public library.

our peer-reviewed content this month showcases topics including: digital library innovations; virtual reality; diversity, equity, and inclusion (dei); and library hackathons.

1. stateful library analysis and migration system (slam): etl system for performing digital library migrations / adrian-tudor panescu, teodora-elena grosu, and vasile manta
2. black, white, and grey: the wicked problem of virtual reality in libraries / gillian d. ellern and laura cruz
3. bridging the gap: using linked data to improve discoverability and diversity in digital collections / jason boczar, bonita pollock, xiying mi, and amanda yeslibas
4. developing a minimalist multilingual full-text digital library solution for disconnected remote library partners / todd digby
5. diversity, equity & inclusion statements on academic library websites: an analysis of content, communication, and messaging / eric ely
6. a 21st century technical infrastructure for digital preservation / nathan tallman
7. hackathons and libraries: the evolving landscape 2014-2020 / meris mandernach longmeier

kenneth j. varnum, editor
varnum@umich.edu
december 2021

articles

are ivy league library website homepages accessible?
wenfan yang, bin zhao, yan quan liu, and arlene bielefield
information technology and libraries | june 2020
https://doi.org/10.6017/ital.v39i2.11577

wenfan yang (youngwf@126.com) is a master’s student in the school of management, tianjin university of technology, china. bin zhao (andy.zh@126.com) is professor in school of management, tianjin university of technology, china. yan quan liu (liuy1@southernct.edu) is professor in information and library science at southern connecticut university and special hired professor of tianjin university of technology. arlene bielefield (bielefielda1@southernct.edu) is professor in information and library science at southern connecticut university. copyright © 2020.

abstract
as a doorway for users seeking information, library websites should be accessible to all, including those who are visually or physically impaired and those with reading or learning disabilities. in conjunction with an earlier study, this paper presents a comparative evaluation of ivy league university library homepages with regard to the americans with disabilities act (ada) mandates. data results from wave and achecker evaluations indicate that although the error of missing form labels still occurs in these websites, other known accessibility errors and issues have been significantly improved from five years ago.
introduction an academic library is “a library that is an integral part of a college, university, or other institution of postsecondary education, administered to meet the information and research needs of its students, faculty, and staff.”1 people living with physical disabilities face barriers whenever they enter a library. many blind and visually impaired persons need assistance when visiting a library to do research. in such cases, searching the collection catalog, periodical indexes, and other bibliographic references are frequently conducted by a librarian or the person accompanying that individual to the library. thus, professionals in these institutions can advance the use of academic libraries for the visually impaired, physically disabled, hearing impaired, and people with learning disabilities. library websites are libraries’ virtual front doors for all users pursuing information from libraries. fichter stated that the power of the website is in its popularization.2 access by everyone regardless of disability is an essential reason for its popularization. whether users are students, parents, senior citizens, or elected officials navigating the library website to find resources, or sign up for computer courses at the library, the website can be either a liberating or a limiting experience.3 according to the web accessibility initiative (https://www.w3.org/wai/), website accessibility means that people with disabilities can use the websites. more specifically, website accessibility means that people with disabilities can perceive, understand, navigate, and interact with websites and that they can contribute to the websites. incorporating accessibility into website design enables people with disabilities to enjoy the benefits of websites to the same extent as anyone else in their community. 
this study evaluated the current state of the accessibility of the library websites of the american ivy league universities using guidelines established by the americans with disabilities act (ada) for those who are visually or physically impaired or who have reading or learning disabilities. section 508 of the rehabilitation act and the web content accessibility guidelines (wcag) from the world wide web consortium (w3c) provide guidelines for website developers which define what makes a website accessible to those with physical, sensory, or cognitive disabilities. since a broad array of disabilities are recognized under the ada, websites seeking to be compliant with the ada should use the act’s technical criteria for website design. this study used two common accessibility evaluation tools—wave and achecker—for both section 508 and the wcag version 2.0 level aa.

among universities in the united states, the eight ivy league universities—brown, columbia, cornell, dartmouth, harvard, princeton, university of pennsylvania, and yale—all have a long and distinguished history, strict academic requirements, high-quality teaching, and high-caliber students. because of their good reputations, they are expected to lead by example, not only in terms of academic philosophy and campus atmosphere, but also in the accessibility of their various websites. of course, any library website, whether that of an urban public library or a university library, should be accessible to everyone. hopefully, this study of their accessibility can enlighten other universities on how to better develop and maintain library websites so that individuals with disabilities can enjoy the same level of accessibility to academic knowledge as everyone else.
literature review in 1999, schmetzke reported that emerging awareness about the need for accessible website design had not yet manifested itself in the actual design of library websites. for example, at the fourteen four-year campuses within the university of wisconsin system, only 13 percent of the libraries’ top-level pages (homepages plus the next layer of library pages linked to them) were free of accessibility problems.4 has this situation changed in the last twenty years? to answer this question, a number of authors have suggested various methods for evaluating software/hardware for accessibility and usability.5 included in the process of compiling data is “involving the user at each step of the design process. involvement typically takes the form of an interview and observation of the user engaged with the software/hardware.”6 providenti & zai conducted a study in 2007 focused on providing an update on the implementation of website accessibility guidelines of kentucky academic library websites. they tested the academic library homepages of bachelor-degree granting institutions in kentucky for accessibility compliance using watchfire’s webxact accessibility tester and the w3c’s html validator. the results showed that from 2003 to 2007, the number of library homepages complying with basic accessibility guidelines was increasing.7 billingham conducted research on edith cowan university (ecu) library websites. the websites were tested twice, in october 2012 and june 2013, using automated testing tools such as code validators and color analysis programs, resulting in findings that 11 percent of the guidelines for wcag 2.0 level a to level aa were passed in their first test. additionally, there was a small increase in the percentage of wcag 2.0 guidelines passed by all pages tested in the second test. 8 while quite a few research studies focus on library website accessibility rather than the university websites, the conclusions diverge. 
tatiana & jeremy (2014) tested 509 webpages at a large public university in the northeastern united states using wave (http://wave.webaim.org) and cynthia says (http://www.cynthiasays.com). the results indicated that 51 percent of those webpages passed automated website accessibility tests for section 508 compliance with cynthia says. however, when using wave for wcag priority 1 compliance, which is a more rigorous evaluation level, only 35 percent passed the test.9

maatta smith reported that not one of the websites of 127 us members of the urban library council (ulc) was without errors or alerts, with the average number of errors being 27.10 such results were similar to those of liu.11,12 they also found that about half (58 of 127) of the urban public libraries provided no information specifically for individuals with disabilities. of the 127 websites, some were confusing because they used a variety of verbiage to suggest services for individuals with disabilities. sixty-six of them provided some information about services within the library for individuals with disabilities. the depth of the information varied, but in all instances contact information was included for additional assistance.

liu, bielefield, and mckay examined 122 library homepages of ulc members and reported on three main aspects. first, only seven of them presented as error free when tested for compliance with the 508 standards. the highest percentage of errors occurred in accessibility sections 508(a) and 508(n). second, the number of issues was dependent on the population served. that means libraries serving larger populations tend to have more issues with accessibility than those serving smaller ones.
third, the most common errors were missing label and contrast errors, while the highest number of alerts was related to the device-dependent event handler, which means that a keyboard or mouse is a necessary piece of equipment to initiate a desired transaction.12

although they were interested in overall website accessibility, theofanos and redish focused their research on the visually impaired website user. the authors investigated and revealed six reasons to bridge the gap between accessibility and usability. the six reasons were:

1. disabilities affect more people than you may think. worldwide, 750 million people have a disability, and three of every ten families are touched by a disability. in the united states, one in five have some kind of disability, and one in ten have a severe disability. that’s approximately 54 million americans.
2. it is good business. according to the president’s committee on the employment of people with disabilities, the discretionary income of people with disabilities is $175 billion.
3. the number of people with disabilities and income to spend is likely to increase. the likelihood of having a disability increases with age, and the overall population is aging.
4. the website plays an important role and has significant benefits for people with disabilities.
5. improving accessibility enhances usability for all users.
6. it is morally the right thing to do.13

lazar, dudley-sponaugle, and greenidge validated that most blind users were just as impatient as most sighted users. they want to get the information they need as quickly as possible. they don’t want to listen to every word on the page just as sighted users do not read every word.14 similarly, foley found that using automated validation tools did not ensure complete accessibility.
students with low vision found many of the pages hard to use even though they were validated.15

outcomes of all the research revealed that most university library websites have developed a policy on website accessibility, but the policies of most universities had deficiencies.16 library staff must be better informed and trained to understand the tools available to users, and when reviewing web pages, the audiences of all kinds must be considered.17

research design and methods

this study, as a continuing effort from an earlier study on urban library websites, made use of content analysis methodology to examine the website accessibility of the university libraries against the americans with disabilities act (ada), with a focus on those with visual or cognitive disabilities.18 under the ada, people with disabilities are guaranteed access to all postsecondary programs and services.

the evaluation of accessibility focuses on the main pages of these university library websites, as shown in table 1, because these homepages considerably demonstrate the institution’s best effort or, at least, most recent redesigns. it was the intent of the authors of this research to reveal the current status of the ivy league library homepages’ accessibility and the importance that ivy league universities attach to the accessibility of their websites.

commonly recognized website evaluators (wave, achecker, and cynthia says), along with other online tools, evaluate a website's accessibility by checking its html and xml code. wave and achecker were selected for this study for the robustness of their evaluation based on w3c guidelines, comprehensiveness of evaluation reporting, and ready availability to any institution or individual conducting website evaluations. wave is a web evaluation tool that was utilized to check websites against section 508 standards and wcag 2.0 guidelines.
this assessment was conducted by entering a uniform resource locator (url), or website address, in the search box. the evaluation tool provided a summary of errors, alerts, features, structural elements, and html5 and aria. achecker is a tool to check single html page content for conformance with accessibility standards to ensure the content can be accessed by everyone. it produces a report of all accessibility problems for the selected guidelines by three types of problems: known problems, likely problems and potential problems. both wave and achecker help website developers make their website content more accessible.

data from different periods were compared to show statistically whether enough attention was paid to accessibility issues by the ivy league university systems. the study team collected the first data set in february 2014, using wave for section 508. in 2018, achecker accessibility checker was used for both section 508 and wcag 2.0 aa.

the access board published new requirements for information and communication technology covered by section 508 of the rehabilitation act (https://www.access-board.gov/guidelines-and-standards/communications-and-it/about-the-ict-refresh) on january 18, 2017. the latest wcag 2.0 guidelines were updated on september 5, 2013 (https://www.w3.org/tr/wcag2ict/). while the wave development team indicated that they have updated the indicators in wave regarding wcag 2.0, the current indicators regarding section 508 refer to the previous technical standards for section 508, not the updated 2017 ones. according to achecker.ca, the versions of the section 508 standards and wcag 2.0 aa guidelines used were published on march 12, 2004 and june 19, 2006, respectively, with neither being the latest versions.
this study centered on three research questions:

1. are the library websites of the eight ivy league universities ada compliant?
2. are there easily identified issues that present barriers to access for the visually impaired on the ivy league university library homepages?
3. what should ivy league libraries do to achieve ada compliance and to maintain it?

table 1. investigated websites of ivy league university libraries.

| library                       | website address                   |
| brown university library      | https://library.brown.edu         |
| columbia university libraries | http://library.columbia.edu       |
| cornell university library    | https://www.library.cornell.edu   |
| dartmouth library             | https://www.library.dartmouth.edu |
| harvard library               | http://library.harvard.edu        |
| princeton university library  | http://library.princeton.edu      |
| penn libraries                | http://www.library.upenn.edu      |
| yale university library       | https://web.library.yale.edu      |

results & discussion

all five evaluation categories employed by wave for section 508 standards, as shown in figure 1, were examined, with a more in-depth review of the homepage of the university of pennsylvania library. similar results in numbers of the five categories are presented in the library homepages of brown university, columbia university, and cornell university. interestingly, wave indicates more errors and alerts on the homepage of yale university.

figure 1. wave results for section 508 standards.

in order to determine the accuracy of the results, the team also used achecker to reevaluate these homepages in the year 2018. known problems, as a category in achecker, are as serious as errors in wave.
they have been identified with certainty as accessibility barriers by the website evaluators and need to be fixed. likely problems are problems that could be barriers which require a human to decide whether there is a need to fix them. achecker cannot identify potential problems and requires a human to confirm if identified problems need remediation.

figure 2 shows the numbers for each category as detected by achecker on june 18, 2018, on the eight ivy league university libraries’ homepages. the library homepage of the university of pennsylvania was found to contain the most, which was the same as the result from wave. however, among the seven remaining libraries’ homepages, the homepage of harvard university library displayed the same number of problems as the university of pennsylvania detected by achecker.

figure 2. achecker results for section 508 standards.

there was significant improvement between 2014 and 2018

the results of this research from wave for section 508 standards signify a significant shift in the accessibility of these websites between the years of 2014 and 2018. among the five wave detection categories in the eight library homepages, the total of errors and alerts decreased during this period. for instance, the total number of errors was 36 in 2014 decreasing to 11 in 2018, and the number of alerts decreased from 141 to 14. figure 3 shows the number of errors in each library homepage, and figure 4 shows the number of alerts. they all show a downward trend from 2014 to 2018. but features, structural elements and html/aria were all on the rise when comparing the two years’ data sets. the green sections in table 2 indicate a decrease of the numbers in three categories from 2014 to 2018, and the yellow sections indicate an increase in numbers.
these data results revealed that errors and alerts, the most common problems related to access, had been better controlled during these years, while others might still remain unchanged.

figure 3. change of errors from 2014 to 2018.

figure 4. change of alerts from 2014 to 2018.

table 2. changes of features, structural elements, and html/aria between 2014 and 2018.

| library                       | features 2014 | features 2018 | structural elements 2014 | structural elements 2018 | html/aria 2014 | html/aria 2018 |
| total                         | 108 | 191 | 184 | 233 | 24 | 89 |
| brown university library      | 13  | 15  | 6   | 13  | 0  | 1  |
| columbia university libraries | 12  | 13  | 23  | 14  | 17 | 0  |
| cornell university library    | 5   | 6   | 20  | 18  | 0  | 4  |
| dartmouth library             | 10  | 8   | 15  | 27  | 0  | 23 |
| harvard library               | 20  | 20  | 14  | 24  | 0  | 4  |
| princeton university library  | 15  | 31  | 45  | 24  | 0  | 3  |
| penn libraries                | 12  | 90  | 29  | 104 | 7  | 50 |
| yale university library       | 21  | 8   | 32  | 9   | 0  | 4  |

missing form labels were the top error against the ada

the data used in the analysis below were all the test data collected in 2018. all errors appearing in data results were collected and analyzed. figure 5 shows the number of errors that were identified based on the specific requirements contained in section 508 of the rehabilitation act as evaluated by wave.

figure 5. occurrences of specific error per specific 508 standards.

the term error refers to accessibility errors that need to be fixed. missing form label was the highest frequency error type shown. only two types of errors occurred in ivy league university libraries’ homepages, and these errors did not appear on every homepage: some homepages had several errors while others had none. for example, linked image missing alternative text occurred twice on the library homepage of harvard university. table 3 shows the distribution of errors in eight homepages.

table 3. distribution of errors in eight homepages.

| library                       | missing form label | linked image missing alternative text |
| brown university library      |   |   |
| columbia university libraries | 1 |   |
| cornell university library    |   |   |
| dartmouth library             | 3 |   |
| harvard library               |   | 2 |
| princeton university library  |   |   |
| penn libraries                | 1 |   |
| yale university library       | 4 |   |

missing form label is listed in section 508 (n) and means there is a form control without a corresponding label. this is important because if a form control does not have a properly associated text label, the function or purpose of that form control may not be presented to screen reader users. linked image missing alternative text occurred only in the harvard library homepage among the eight ivy league university libraries’ homepages. it indicates that an image without alternative text results in an empty link. if an image is within a link that does not provide alternative text, a screen reader has no content to present to the user regarding the function of the link.

these website accessibility issues may be easy fixes and considered minor to some; however, if they are not detected, they are major barriers for persons living with low vision or blindness. as a result, users are left at a disadvantage because they are lacking critical information to successfully fulfill their needs. examples of such error icons in wave are displayed in figures 6 and 7.

figure 6. missing form label icon from yale university library homepage.

figure 7. linked image missing alternative text icon from harvard library homepage.

a total of eleven errors were located on the homepages of the eight ivy league libraries; figure 8 shows the number of errors that occurred in each library homepage. the average number of errors for each homepage was 1.375. yale university library homepage had the most errors with a total of four.
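the two error types described above can be illustrated with a short check. the following is a minimal sketch using only python's standard `html.parser`; it is a simplified heuristic, not the algorithm wave or achecker actually uses, and the names `TwoErrorScan` and `scan` are hypothetical. it assumes a control counts as labelled when a `label for` matches its id, when it sits inside a `label`, or when it carries `aria-label`, `aria-labelledby`, or `title`.

```python
from html.parser import HTMLParser

# input types that do not take a visible text label (assumption of this sketch)
UNLABELED_INPUT_TYPES = {"hidden", "submit", "button", "reset", "image"}


class TwoErrorScan(HTMLParser):
    """Counts form controls without labels and links whose only content is an image without alt text."""

    def __init__(self):
        super().__init__()
        self.label_targets = set()   # ids referenced by <label for="...">
        self.controls = []           # (id, already_labelled) per form control
        self.label_depth = 0         # > 0 while inside a <label> element
        self.open_links = []         # one state dict per currently open <a>
        self.linked_image_missing_alt = 0

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "label":
            self.label_depth += 1
            if attr.get("for"):
                self.label_targets.add(attr["for"])
        elif tag == "a":
            self.open_links.append({"has_text": False, "bare_image": False})
        elif tag == "img":
            if self.open_links:
                if (attr.get("alt") or "").strip():
                    self.open_links[-1]["has_text"] = True
                else:
                    self.open_links[-1]["bare_image"] = True
        elif tag in ("select", "textarea") or (
            tag == "input"
            and (attr.get("type") or "text").lower() not in UNLABELED_INPUT_TYPES
        ):
            named = any(attr.get(k) for k in ("aria-label", "aria-labelledby", "title"))
            self.controls.append((attr.get("id"), named or self.label_depth > 0))

    def handle_data(self, data):
        if self.open_links and data.strip():
            self.open_links[-1]["has_text"] = True

    def handle_endtag(self, tag):
        if tag == "label" and self.label_depth:
            self.label_depth -= 1
        elif tag == "a" and self.open_links:
            link = self.open_links.pop()
            if link["bare_image"] and not link["has_text"]:
                self.linked_image_missing_alt += 1


def scan(html):
    """Return counts of the two error types for one HTML document."""
    parser = TwoErrorScan()
    parser.feed(html)
    parser.close()
    # labels may appear after their control, so resolve "for" targets at the end
    missing = sum(
        1
        for control_id, labelled in parser.controls
        if not labelled and control_id not in parser.label_targets
    )
    return {
        "missing_form_label": missing,
        "linked_image_missing_alt": parser.linked_image_missing_alt,
    }
```

calling `scan` on the saved html of a homepage returns the two counts. as the literature review notes, automated checks of this kind do not by themselves ensure accessibility, so the counts are a starting point for manual review.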
library homepages of brown university, cornell university and princeton university performed best with zero errors.

figure 8. the total of errors in ivy league libraries’ homepages.

six alerts appear among ada requirements

the issues that alerts identify are also significant for website accessibility. figure 9 shows there are six different kinds of alerts that were identified based on the specific requirements contained in section 508 of the rehabilitation act.

figure 9. occurrences of specific alert per specific 508 standards.

the noscript element was the most encountered alert issue. alerts that wave reports need close scrutiny, because they likely represent an end-user accessibility issue. the noscript element is related to the 508 (l) requirement and means a noscript element is present; its content is presented only when javascript is disabled. almost all users of screen readers and other assistive technologies have javascript enabled, so noscript cannot be used to provide an accessible version of inaccessible scripted content. skipped heading level ranked second in number. the importance of headings is in their provision of document structure and facilitation of keyboard navigation for users of assistive technology. these users may be confused or they may experience difficulty navigating when heading levels are skipped. examples of the icons wave displays for these alerts are shown in figures 10 and 11.

figure 10. noscript element icon from cornell university library homepage.

figure 11. skipped heading level icon from dartmouth library homepage.

a total of fourteen alert problems were detected. figure 12 illustrates the number of alerts that occurred on each library homepage.
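the skipped heading level alert described above lends itself to a small illustration. this is a hedged sketch, not wave's implementation: it assumes a skip is any heading that is more than one level deeper than the heading before it (for example an h1 followed directly by an h3), and the names `HeadingLevels` and `skipped_heading_levels` are hypothetical.

```python
from html.parser import HTMLParser


class HeadingLevels(HTMLParser):
    """Records the level of every h1-h6 element in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self.levels.append(int(tag[1]))


def skipped_heading_levels(html):
    """Return (previous_level, current_level) pairs where a level was skipped."""
    parser = HeadingLevels()
    parser.feed(html)
    parser.close()
    skips = []
    for previous, current in zip(parser.levels, parser.levels[1:]):
        # moving back up the outline (e.g. h4 -> h2) is not a skip
        if current > previous + 1:
            skips.append((previous, current))
    return skips
```

each returned pair marks a place where a screen reader user navigating by headings would encounter a gap in the outline.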
on average, there were 1.75 alerts per homepage across the eight websites. the library homepages of yale university and the university of pennsylvania had the most alerts with 4 on each site. only the brown university library’s homepage had zero alerts.

figure 12. the total of alerts in ivy league libraries’ homepages.

linked image with alternative text was the most frequently found feature issue

features, as a category of issues, indicate conditions of accessibility that probably need to be improved and usually require further verification and manual fixing. for example, if a feature is detected on a website, it means that further manual verification is required to confirm its accessibility. figure 13 shows the number of features that were identified, based on the specific requirement contained in section 508 of the rehabilitation act.

figure 13. occurrences of specific features per specific 508 standards.

linked image with alternative text, which is a 508 (a) requirement, was shown to be the most encountered features issue. this means that an alternative text should be presented for an image that is within a link. by including appropriate alternative text on an image within a link, the function and purpose of the link and the content of the image are available to screen reader users even when images are unavailable. another frequently occurring feature was form label, which means a form label is present and associated with a form control. a properly associated form label is presented to a screen reader user when the form control is accessed. these evaluation steps were the same ones used for errors and alerts. example icons of features evaluated by wave are displayed as figures 14 and 15.

figure 14. linked image with alternative text icon from brown university library homepage.
figure 15. form label icon from penn libraries homepage.
this study also ranked the number of features that were detected by wave in the eight ivy league library homepages. figure 16 displays the number of features that occurred on each library homepage. in total there were 191 features detected by wave in the eight ivy league university libraries’ homepages. the homepage of the university of pennsylvania library was found to have 90 features, by far the most of all the libraries. no library was entirely free of features according to the wave measurement using section 508 standards.
figure 16. the total of features in ivy league libraries’ homepages.
table 4a. comparison between wave & achecker section 508 standards on brown and columbia’s library homepages.
section 508   brown university            columbia university
standard      wave          achecker      wave          achecker
              april  june   april  june   april  june   april  june
total         33     29     47     47     28     29     79     83
a             9      9      9      9      12     13     12     14
b             -      -      -      -      -      -      -      -
c             -      -      14     14     -      -      26     28
d             -      -      8      8      -      -      14     14
e             -      -      -      -      -      -      -      -
f             -      -      -      -      -      -      -      -
g             -      -      -      -      -      -      -      -
h             -      -      -      -      -      -      -      -
i             -      -      -      -      -      -      -      -
j             -      -      8      8      -      -      14     14
k             -      -      -      -      -      -      -      -
l             -      -      6      6      -      -      12     12
m             -      -      -      -      -      -      -      -
n             1      1      1      1      1      1      -      -
o             23     19     1      1      15     15     1      1
p             -      -      -      -      -      -      -      -
(- = no issues reported)
table 4b. comparison between wave & achecker section 508 standards on cornell and dartmouth’s library homepages.
section 508   cornell university          dartmouth college
standard      wave          achecker      wave          achecker
              april  june   april  june   april  june   april  june
total         30     29     107    106    59     68     65     67
a             2      2      2      2      8      8      10     11
b             -      -      -      -      -      -      -      -
c             -      -      36     36     -      -      22     23
d             -      -      32     32     -      -      9      9
e             -      -      -      -      -      -      -      -
f             -      -      -      -      -      -      -      -
g             -      -      -      -      -      -      -      -
h             -      -      -      -      -      -      -      -
i             -      -      -      -      -      -      -      -
j             -      -      33     32     -      -      9      9
k             -      -      -      -      -      -      -      -
l             -      -      3      3      -      -      7      7
m             -      -      -      -      -      -      -      -
n             7      7      -      -      23     29     8      8
o             21     20     1      1      28     31     -      -
p             -      -      -      -      -      -      -      -
(- = no issues reported)
table 4c. comparison between wave & achecker section 508 standards on harvard and princeton’s library homepages.
section 508   harvard university          princeton university
standard      wave          achecker      wave          achecker
              april  june   april  june   april  june   april  june
total         51     51     139    139    57     61     74     74
a             20     20     29     29     25     25     20     20
b             -      -      -      -      -      -      -      -
c             -      -      43     43     -      -      32     32
d             -      -      32     32     -      -      10     10
e             -      -      -      -      -      -      -      -
f             -      -      -      -      -      -      -      -
g             -      -      -      -      -      -      -      -
h             -      -      -      -      -      -      -      -
i             -      -      -      -      -      -      -      -
j             -      -      34     34     -      -      10     10
k             -      -      -      -      -      -      -      -
l             -      -      -      -      -      -      1      1
m             -      -      -      -      -      -      -      -
n             5      5      -      -      3      7      -      -
o             26     26     1      1      29     29     1      1
p             -      -      -      -      -      -      -      -
(- = no issues reported)
table 4d. comparison between wave & achecker section 508 standards on pennsylvania and yale’s library homepages.
section 508   university of pennsylvania  yale university
standard      wave          achecker      wave          achecker
              april  june   april  june   april  june   april  june
total         253    249    129    139    28     29     84     85
a             40     37     14     19     6      7      4      5
b             -      -      -      -      -      -      -      -
c             -      -      82     87     -      -      28     28
d             -      -      11     11     -      -      21     21
e             -      -      -      -      -      -      -      -
f             -      -      -      -      -      -      -      -
g             -      -      -      -      -      -      1      1
h             -      -      -      -      -      -      -      -
i             -      -      -      -      -      -      -      -
j             -      -      11     11     -      -      21     21
k             -      -      -      -      -      -      -      -
l             1      1      9      9      3      3      4      4
m             3      2      -      -      -      -      -      -
n             103    104    1      1      8      8      4      4
o             106    105    1      1      11     11     1      1
p             -      -      -      -      -      -      -      -
(- = no issues reported)
a few 508 standards deviate from comparison between two evaluators
to determine whether the wave tool missed some specific requirements in section 508, the authors examined the eight university homepages with both wave and achecker, running the two tools on each site at the same time in april and again in june 2019. section 508 contains sixteen requirements, lettered a to p. tables 4a–4d show the issues found for each of these requirements on the eight universities’ homepages. during our examination, neither wave nor achecker found any issue for the seven requirements listed below (b, e, f, h, i, k, and p); requirement g was likewise clear except for one issue that achecker reported on the yale library homepage:
b. equivalent alternatives for any multimedia presentation shall be synchronized with the presentation;
e. redundant text links shall be provided for each active region of a server-side image map;
f. client-side image maps shall be provided instead of server-side image maps except where the regions cannot be defined with an available geometric shape;
h.
markup shall be used to associate data cells and header cells for data tables that have two or more logical levels of row or column headers;
i. frames shall be titled with text that facilitates frame identification and navigation;
k. a text-only page, with equivalent information or functionality, shall be provided to make a website comply with the provisions of this part, when compliance cannot be accomplished in any other way. the content of the text-only page shall be updated whenever the primary page changes;
p. when a timed response is required, the user shall be alerted and given sufficient time to indicate more time is required.
the results tabulated in tables 4a–4d indicate that these seven section 508 requirements are perhaps not problematic for these websites.
conclusions
based on the results, this study determined that the eight ivy league universities’ homepages exhibited some issues with accessibility for people with disabilities. considerable effort is necessary to ensure their websites are ready to meet the challenges and future needs of web accessibility. users with visual impairments can navigate a website only when it is designed to be accessible with assistive technology. while each institution presented both general and comprehensive coverage of services for users with disabilities, it would have been more practical and efficient if specific links were posted on the homepage. according to the american foundation for the blind (https://www.afb.org), “usability” is a way of describing how easy a website is to understand and use. accessibility refers to how easily a website can be used, understood, and accessed by people with disabilities.
this study has concluded that expertise and specialized training and skill are still needed in this area.
principles of accessible website design must be introduced and taught, underscoring that design matters for people with disabilities just as it does in the physical environment. as highlighted earlier through the evaluation tool wave, most of the problems detected can be fixed with the solutions the tool provides. frequent review is critical, and websites should be assessed at least yearly for accessibility compliance. there is much to be done if accessibility is to be realized for everyone.
limitations
the authors recognize that this study, using free website accessibility testing tools, has certain limitations. as wave notes on its help page, the aim for website developers is not to eliminate every flagged item (only errors must be fixed) but to determine whether a website is accessible. at the time of writing, neither wave nor achecker had been updated with the latest general wcag 2.1 aa rules. since wcag 2.1 is expected to provide new guidelines for making websites even more accessible, more careful and comprehensive studies against the wcag 2.1 aa rules could further assist university library professionals and their website developers in providing accessible websites to those with disabilities. moreover, while it is effective to conduct these machine-generated evaluations, it is equally important that researchers check the issues manually, applying human analysis to determine the major issues with content.
endnotes
1 joan m. reitz, odlis: online dictionary for library and information science (westport, ct: libraries unlimited, 2004), 1–2.
2 darlene fichter, “making your website accessible,” online searcher 37, no. 4 (2013): 73–76.
3 fichter, “making your website accessible,” 74.
4 axel schmetzke, web page accessibility on university of wisconsin campuses: a comparative study (stevens point, wi, 2019).
5 jeffrey rubin and dana chisnell, handbook of usability testing: how to plan, design, and conduct effective tests (idaho: wiley, 2008), 6–11.
6 alan foley, “exploring the design, development and use of websites through accessibility and usability studies,” journal of educational multimedia and hypermedia 20, no. 4 (2011): 361–85, http://www.editlib.org/p/37621/.
7 michael providenti and robert zai iii, “web accessibility at kentucky’s academic libraries,” library hi tech 25, no. 4 (2007): 478–93, https://doi.org/10.1108/07378830710840446.
8 lisa billingham, “improving academic library website accessibility for people with disabilities,” library management 35, no. 8/9 (2014): 565–81, https://doi.org/10.1108/lm-11-2013-0107.
9 tatiana i. solovieva and jeremy m. bock, “monitoring for accessibility and university websites: meeting the needs of people with disabilities,” journal of postsecondary education and disability 27, no. 2 (2014): 113–27, http://search.proquest.com/docview/1651856804?accountid=9744.
10 stephanie l. maatta smith, “web accessibility assessment of urban public library websites,” public library quarterly 33, no. 3 (2014): 187–204, https://doi.org/10.1080/01616846.2014.937207.
11 yan quan liu, arlene bielefeld, and peter mckay, “are urban public libraries’ websites accessible to americans with disabilities?,” universal access in the information society 18, no. 1 (2019): 191–206, https://doi.org/10.1007/s10209-017-0571-7.
12 liu, bielefeld, and mckay, “are urban public library websites accessible.”
13 mary frances theofanos and j. redish, “bridging the gap: between accessibility and usability,” interactions 10, no. 6 (2003): 36–51, https://doi.org/10.1145/947226.947227.
14 jonathan lazar, a. dudley-sponaugle, and k. d.
greenidge, “improving web accessibility: a study of webmaster perceptions,” computers in human behavior 20, no. 2 (2004): 269–88, https://doi.org/10.1016/j.chb.2003.10.018.
15 foley, “exploring the design,” 365.
16 david a. bradbard, cara peters, and yoana caneva, “web accessibility policies at land-grant universities,” internet & higher education 13, no. 4 (2010): 258–66, https://doi.org/10.1016/j.iheduc.2010.05.007.
17 mary cassner, charlene maxey-harris, and toni anaya, “differently able: a review of academic library websites for people with disabilities,” behavioral & social sciences librarian 30, no. 1 (2011): 33–51, https://doi.org/10.1080/01639269.2011.548722.
18 liu, bielefeld, and mckay, “are urban public library websites accessible,” 195.
testing information literacy in digital environments | katz 3
despite coming of age with the internet and other technology, many college students lack the information and communication technology (ict) literacy skills necessary to navigate, evaluate, and use the overabundance of information available today.
this paper describes the development and early administrations of ets’s iskills assessment, an internet-based assessment of information literacy skills that arise in the context of technology. from the earliest stages to the present, the library community has been directly involved in the design, development, review, field trials, and administration to ensure the assessment and scores are valid, reliable, authentic, and useful.
technology is the portal through which we interact with information, but there is growing belief that people’s ability to handle information (to solve problems and think critically about information) tells us more about their future success than does their knowledge of specific hardware or software. these skills, known as information and communications technology (ict) literacy, comprise a twenty-first-century form of literacy in which researching and communicating information via digital environments are as important as reading and writing were in earlier centuries (partnership for 21st century skills 2003).
although today’s knowledge society challenges students with overabundant information of often dubious quality, higher education has recognized that the solution cannot be limited to improving technology instruction. instead, there is an increasingly urgent need for students to have stronger information literacy skills, to “be able to recognize when information is needed and have the ability to locate, evaluate, and use effectively the needed information” (american library association 1989), and to apply those skills in the context of technology.
regional accreditation agencies have integrated information literacy into their standards and requirements (for example, middle states commission on higher education 2003; western association of schools and colleges 2001), and several colleges have begun campuswide initiatives to improve the information literacy of their students (for example, the california state university 2006; university of central florida 2006). however, a key challenge to designing and implementing effective information literacy instruction is the development of reliable and valid assessments. without effective assessment, it is difficult to know if instructional programs are paying off, that is, whether students’ information literacy skills are improving.
ict literacy skills are an issue of national and international concern as well. in january 2001, educational testing service (ets) convened an international ict literacy panel to study the growing importance of existing and emerging information and communication technologies and their relationship to literacy. the results of the panel’s deliberations over fifteen months highlighted the growing importance of ict literacy in academia, the workplace, and society. the panel called for assessments that will make it possible to determine to what extent young adults have obtained the combination of technical and cognitive skills needed to be productive members of an information-rich, technology-based society (international ict literacy panel 2002).
this article describes ets’s iskills assessment (formerly “ict literacy assessment”), an internet-based assessment of information literacy skills that arise in the context of technology. from the earliest stages to the present, the library community has been directly involved in the design, development, review, field trials, and administration to ensure the assessment and scores are valid, reliable, authentic, and useful.
■ motivated by the library community
although the results of the international ict literacy panel provided recommendations and a framework for an assessment, the inspiration for the current iskills assessment came more directly from the higher education and library community. for many years, faculty and administrators at the california state university (csu) had been investigating issues of information literacy on their campuses. as part of their systemwide information competence initiative that began in 1995, researchers at csu undertook a massive ethnographic study to observe students’ research skills. the results suggested a great many shortcomings in students’ information literacy skills, which confirmed librarian and classroom faculty anecdotal reports. however, clearly such a massive data collection and analysis effort would be unfeasible for documenting the information literacy skills of students throughout the csu system (dunn 2002). gordon smith and the late ilene rockman, both of the csu chancellor’s office, discussed with ets the idea of developing an assessment of ict literacy that could support csu’s information competence initiative as well as similar initiatives throughout the higher education community.
irvin r. katz (ikatz@ets.org) is senior research scientist in the research and development division at educational testing service.
testing information literacy in digital environments: ets’s iskills assessment
information technology and libraries | september 2007
■ national higher education ict literacy initiative
in august 2003, ets established the national higher education ict literacy initiative, a consortium of seven colleges and universities that recognized the need for an ict literacy assessment targeted at higher education. representatives of these institutions collaborated with ets staff to design and develop the iskills assessment.
the consortium built upon the work of the international panel to explicate the nature of ict literacy in higher education. over the ensuing months, representatives of consortium institutions served as subject-matter experts for the assessment design and scoring implementation.
the development of the assessment followed a process known as evidence-centered design (mislevy, steinberg, and almond 2003), a systematic approach to the design of assessments that focuses on the evidence (student performance and products) of proficiencies as the basis for constructing assessment tasks. through the evidence-centered design process, ets staff (psychometricians, cognitive psychologists, and test developers) and subject-matter experts (librarians and faculty) designed the assessment by considering first the purpose of the assessment and by defining the construct, that is, the knowledge and skills to be assessed. these decisions drove discussions of the types of behaviors, or performance indicators, to serve as evidence of student proficiency. finally, simulation-based tasks designed around authentic scenarios were crafted to elicit from students the critical performance indicators. katz et al. (2004) and brasley (2006) provide a detailed account of this design and development process, illustrating the critical role played by librarians and other faculty from higher education.
■ ict literacy = information literacy + digital environments
consortium members agreed with the conclusions of the international ict literacy panel that ict literacy must be defined as more than technology literacy. college students who grew up with the internet (the “net generation”) might be impressively technologically literate, more accepting of new technology, and more technically facile than their parents and instructors (oblinger and oblinger 2005).
however, anecdotally and in small-scale studies, there is increasing evidence that students do not use technology effectively when they conduct research or communicate (rockman 2004). many educators believe that students today are less information savvy than earlier generations despite having powerful information tools at their disposal (breivik 2005).
ict literacy must bridge the ideas of information literacy and technology literacy. to do so, ict literacy draws out the technology-related components of information literacy as specified in the often-cited standards of the association of college and research libraries (acrl) (american library association 1989), focusing on how students locate, organize, and communicate information within digital environments (katz 2005). this confluence of information and technology directly reflects the “new illiteracy” concerns of educators: students quickly adopt new technology, but do not similarly acquire skills for being critical consumers and ethical producers of information (rockman 2002). students need training and practice in ict literacy skills, whether through general education or within discipline coursework (rockman 2004).
the definition of ict literacy adopted by the consortium members reflects this view of ict literacy as information literacy needed to function in a technological society:
ict literacy is the ability to appropriately use digital technology, communication tools, and/or networks to solve information problems in order to function in an information society. this includes having the ability to use technology as a tool to research, organize, and communicate information and having a fundamental understanding of the ethical/legal issues surrounding accessing and using information (katz et al. 2004, 7).
consortium members further refined this definition, identifying seven performance areas (see figure 1).
these areas mirror the acrl standards and other related standards, but focus on elements that were judged most central to being sufficiently information literate to meet the challenges posed by technology.
■ ets’s iskills assessment
ets’s iskills assessment is an internet-delivered assessment that measures students’ abilities to research, organize, and communicate information using technology. the assessment focuses on the cognitive problem-solving and critical-thinking skills associated with using technology to handle information. as such, scoring algorithms target cognitive decision-making rather than technical competencies. the assessment measures ict literacy through the seven performance areas identified by consortium members, which represent important problem-solving and critical-thinking aspects of ict literacy skill (see figure 1). assessment administration takes approximately seventy-five minutes, divided into two sections lasting thirty-five and forty minutes, respectively.
figure 1. components of ict literacy
define: understand and articulate the scope of an information problem in order to facilitate the electronic search for information, such as by:
■ distinguishing a clear, concise, and topical research question from poorly framed questions, such as ones that are overly broad or do not otherwise fulfill the information need;
■ asking questions of a “professor” that help disambiguate a vague research assignment; and
■ conducting effective preliminary information searches to help frame a research statement.
access: collect and/or retrieve information in digital environments. information sources might be web pages, databases, discussion groups, e-mail, or online descriptions of print media.
tasks include:
■ generating and combining search terms (keywords) to satisfy the requirements of a particular research task;
■ efficiently browsing one or more resources to locate pertinent information; and
■ deciding what types of resources might yield the most useful information for a particular need.
evaluate: judge whether information satisfies an information problem by determining authority, bias, timeliness, relevance, and other aspects of materials. tasks include:
■ judging the relative usefulness of provided web pages and online journal articles;
■ evaluating whether a database contains appropriately current and pertinent information; and
■ deciding the extent to which a collection of resources sufficiently covers a research area.
manage: organize information to help you or others find it later, such as by:
■ categorizing e-mails into appropriate folders based on a critical view of the e-mails’ contents;
■ arranging personnel information into an organizational chart; and
■ sorting files, e-mails, or database returns to clarify clusters of related information.
integrate: interpret and represent information, such as by using digital tools to synthesize, summarize, compare, and contrast information from multiple sources while:
■ comparing advertisements, e-mails, or web sites from competing vendors by summarizing information into a table;
■ summarizing and synthesizing information from a variety of types of sources according to specific criteria in order to compare information and make a decision; and
■ re-representing results from an academic or sports tournament into a spreadsheet to clarify standings and decide the need for playoffs.
create: adapt, apply, design, or construct information in digital environments, such as by:
■ editing and formatting a document according to a set of editorial specifications;
■ creating a presentation slide to support a position on a controversial topic; and
■ creating a data display to clarify the relationship between academic and economic variables.
communicate: disseminate information tailored to a particular audience in an effective digital format, such as by:
■ formatting a document to make it more useful to a particular group;
■ transforming an e-mail into a succinct presentation to meet an audience’s needs;
■ selecting and organizing slides for distinct presentations to different audiences; and
■ designing a flyer to advertise to a distinct group of users.
© 2007 educational testing service. all rights reserved.
during this time, students respond to fifteen interactive, performance-based tasks. each interactive task presents a real-world scenario, such as a class or work assignment, that frames the information problem. students solve information-handling tasks in the context of simulated software (for example, e-mail, web browser, library database) having the look and feel of typical applications. there are fourteen three- to five-minute tasks and one fifteen-minute task. each three- to five-minute task targets a single performance area, while the fifteen-minute task comprises a more complex problem-solving scenario that targets multiple performance areas. the simpler tasks contribute to the overall reliability of the assessment, while the more complex task focuses on the richer aspects of ict literacy performance.
in the assessment, a student might encounter a scenario that requires him or her to access information from a database using a search engine (see figure 2).
the results are tracked and strategies scored based on how he or she searches for information, such as key words chosen, search strategies refined, and how well the information returned meets the needs of the task.
the assessment tasks each contain mechanisms to keep students from pursuing unproductive actions in the simulated environment. for example, in an internet browsing task, when the student clicks on an incorrect link, he might be told that the link is not needed for the current task. this message cues the student to try an alternative approach while still noting for scoring purposes that the student made a misstep. in a similar way, the student who fails to find useful (or any) journal articles in her database search might receive an instant message from a “teammate” providing her with a set of journal articles to be evaluated. these mechanisms potentially keep students from becoming frustrated (for example, via a fruitless search) while providing the opportunity for the students to demonstrate other aspects of their skills (for example, evaluation skills).
the scoring for the iskills assessment is completely automated. unlike a multiple-choice question, each simulation-based task provides many opportunities to collect information about a student and allows for alternative paths leading to a solution. scored responses are produced for each part of a task, and a student’s overall score on the test accumulates the individual scored responses across all assessment tasks.
the assessment differs from existing measures in several ways. as a large-scale measure, it was designed to be administered and scored across units of an institution or across institutions. as a simulation-based assessment, the tasks go beyond what is possible in multiple-choice format, providing students with the look and feel of interactive digital environments along with tasks that elicit higher-order critical-thinking and problem-solving skills.
as a scenario-based assessment, students become engaged in the world of the tasks, and the task scenarios describe the types of assignments students should be seeing in their ict literacy instruction as well as examples of workplace and personal information problems.
figure 2. in the iskills assessment, students demonstrate their skills at handling information through interaction with simulated software. in this example task, students develop a search query as part of a research assignment on earthquakes.
■ two levels of assessments
the iskills assessment is offered at two levels: core and advanced. the core level was designed to assess readiness for the ict literacy demands of college. it is targeted at high school seniors and first-year college students. the advanced level was designed to assess readiness for the ict literacy challenges in transitioning to higher-level college coursework, such as moving from sophomore to junior year or transferring from a two-year to a four-year institution. the advanced level targets students in their second or third year of post-secondary study.
the key difference between the core and advanced levels is in the difficulty of the assessment tasks. tasks in the core level are designed to be easier; examinees are presented with fewer options, the scenarios are more straightforward, and the reasoning needed for each step in a task is simpler. an advanced task might require an individual to infer the search terms needed from a general description of an information need; the corresponding core task would state the information need more explicitly. in a task of evaluating web sites, the core level might present a web site with many clues that it is not authoritative (a “.com” url, unprofessional look, content that directly describes the authors as students).
the corresponding advanced task would present fewer cues of the web site’s origin (for example, a professional look, but careful reading reveals the web site is by students).
■ score reports for individuals and institutions
both levels of the assessment feature online delivery of score reports for individuals and for institutions. the individual score report is intended to help guide students in their learning of ict literacy skills, aiding identification of students who might need additional ict literacy instruction. the report includes an overall ict literacy score, a percentile score, and individualized feedback on the student’s performance (see figure 3). the percentile compares students to a reference group of students who took the test in early 2006 and who fall within the target population for the assessment level (core or advanced). as more data are collected from a greater number of institutions, these reference groups will be updated and, ideally, approach nationally representative norms. score reports are available online to students, usually within one week.
high schools, colleges, and universities receive score reports that aggregate results from the test-takers at their institution. the purpose of the reports is to provide an overview of the students in comparison with a reference group. these reports are available to institutions online after at least fifty students have taken either the core or advanced level test, that is, when there are sufficient numbers to allow reporting of reliable scores. figure 4 shows a graph from one type of institutional report. users have the option to specify the reference group (for example, all students, all students at a four-year institution) and the subset of test-takers to compare to that group (for example, freshmen, students taking the test within a particular timeframe).
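the two reporting ideas just described, a percentile relative to a reference group and an institutional report released only once at least fifty students have been tested, can be sketched in a few lines of python. this is an illustration under stated assumptions rather than ets’s reporting method: the article does not give the percentile formula, so the sketch uses one common definition (the share of reference scores strictly below the student’s score), and the function names and the sample numbers in the usage note are hypothetical.

```python
from statistics import mean

# Threshold stated in the text: institutional reports require at least fifty test-takers.
MIN_TEST_TAKERS = 50


def percentile_rank(score, reference_scores):
    """Percentage of reference-group scores strictly below `score`.

    One common definition of a percentile rank; the definition actually
    used in the score reports is not given in the article (assumption).
    """
    if not reference_scores:
        raise ValueError("reference group is empty")
    below = sum(1 for s in reference_scores if s < score)
    return 100.0 * below / len(reference_scores)


def institutional_report(institution_scores, reference_scores):
    """Aggregate an institution's scores against a reference group.

    Returns None until enough students have tested to report reliably.
    """
    if len(institution_scores) < MIN_TEST_TAKERS:
        return None
    return {
        "test_takers": len(institution_scores),
        "institution_mean": mean(institution_scores),
        "reference_mean": mean(reference_scores),
    }
```

for instance, with made-up scores, `percentile_rank(3, [1, 2, 3, 4])` places a student above half of a four-person reference group, and `institutional_report` returns `None` for an institution with only forty-nine test-takers.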
a second report summarizes the performance feedback of the individual reports, providing percentages of students who received the highest score on each aspect of performance (each of the fourteen short tasks is scored on two or three different elements). finally, institutions can conduct their own analyses by downloading the data of their test-takers, which include each student’s responses to the background questions, iskills score, and responses to institution-specified questions.
figure 3. first page of a sample score report for an individual. the subsequent pages contain additional performance feedback.
figure 4. sample portion of an institutional score report: comparison between a user-specified reference group and data from the user’s institution.
■ testing the test
a variety of specialists contributed to the development of ets’s iskills assessment: librarians, classroom faculty, education administrators, assessment specialists, researchers, user-interface and graphic designers, and systems developers. the team’s combined goal was to produce a valid, reliable, authentic assessment of ict literacy skills. before the iskills assessment produced official scores for test-takers, these specialists (both ets and ict literacy experts) subjected the assessment to a variety of review procedures at many stages of development. these reviews ranged from weekly teleconferences with consortium members during the initial development of assessment tasks (january–july 2004), to small-scale usability studies in which ets staff observed individual students completing assessment tasks (or mockups of assessment tasks), to field trials that mirrored actual test delivery.
the usability studies investigated students’ comprehension of the tasks and testing environment as well as the ease of use of the simulated software in the assessment tasks. the field trials provided opportunities to collect performance data and test the automated scoring algorithms. in some cases, ets staff fine-tuned the scoring algorithms (or developed alternatives) when the scores produced were not psychometrically sound, such as when one element of students’ scores was inconsistent with their overall performance.

through these reviews and field trials, the iskills assessment evolved to its current form, targeting and reporting the performance of individuals who complete the seventy-five-minute assessment. in some cases, feedback from experts and field trial participants led to significant changes. for example, the iskills assessment began in 2005 as a two-hour assessment (at that time called the ict literacy assessment) that reported scores only to institutions on the aggregated performance of their participating students. some students entering higher education found the 2005 assessment excessively difficult, which led to the creation of the easier core level assessment.

table 1 outlines the participation volumes for the field trials and test administrations. during each field trial, as well as during the institutional administration, feedback was collected from students on their experience with the test via a brief exit survey. table 2 summarizes some results of the exit survey. student reactions to the test were reasonably consistent: most students enjoyed taking the test and found the tasks realistic. in written comments, students taking the institutional assessment found the experience rewarding but exhausting, and thought the amount of reading excessive. student feedback directly influenced the design of the core and advanced level assessments, including the shorter test-taking time and lighter reading load compared with the institutional assessment.

table 1. chronology of field trials and test administrations

date | administration | approximate no. of students | approximate no. of participating institutions
july–september 2004 | field trials for institutional assessment | 1,000 | 40
january–april 2005 | institutional assessment | 5,000 | 30
may 2005 | field trials for alternative individual assessment structures | 400 | 25
november 2005 | field trials for advanced level individual assessment | 700 | 25
january–may 2006 | advanced level individual assessment | 2,000 | 25
february 2006 | field trials for core level individual assessment | 700 | 30
april–may 2006 | core level individual assessment | 4,500 | 45
august–december 2006 | core level: continuous administration | 2,100 | 20
august–december 2006 | advanced level: continuous administration | 1,400 | 10

note: items in bold represent “live” test administrations in which score reports were issued to institutions, students, or both.

testing information literacy in digital environments | katz

as shown in table 1 (bolded rows), test administrations in 2005 and early 2006 occurred within set time frames. beginning in august 2006, the core and advanced level assessments switched to continuous testing: instead of a specific testing window, institutions create testing sessions to suit the convenience of their resources and students. the tests are still administered in a proctored lab environment, however, to preserve the integrity of the scores.

■ student performance

almost 6,400 students at sixty-three institutions participated during the first administrations of the core and advanced level iskills assessments between january and may 2006. (some institutions administered both the core and advanced level assessments.) test-takers consisted of 1,016 high-school students, 753 community college students, and 4,585 four-year college and university students.
institutions selected students to participate based on their assessment goals. some chose to test students enrolled in a particular course, some recruited a random sample, and some issued an open invitation and offered gift certificates or other incentives. because the sample of students is not representative of all united states institutions nor all higher education students, these results do not necessarily generalize to the greater population of college-age students and should therefore be interpreted with caution. even so, the preliminary results reveal interesting trends in the ict literacy skills of participating students.

table 2. student feedback from the institutional assessment and individual assessments’ field trials (% agreeing)

statement | institutional assessment (n=4,898) | advanced level field trials (n=736) | core level field trials (n=648)
i enjoyed taking this test. | 61 | 59 | 67
this test was appropriately challenging. | 90 | 90 | 86
i have never taken a test like this one before. | 90 | 90 | 89
to perform well on this test requires thinking skills as well as technical skills. | 95 | 93 | 94
i found the overall testing interface easy to use (even if the tasks themselves might have been difficult). | 83 | 82 | 85
my performance on this test accurately reflects my ability to solve problems using computers and the internet. | 63 | 56 | 67
i didn’t take this test very seriously. | 25 | 25 | 23
the tasks reflect activities i have done at school, work, or home. | 79 | 77 | 78
the software tools were unrealistic. | n/a | 21 | 24

overall, students performed poorly on both the core and advanced level, achieving only about half of the possible points on the tests. informally, the data suggest that students generally do not consider the needs of an audience when communicating information. for example, they do not appear to recognize the value of tailoring material to an audience.
regarding the ethical use of information, students tend not to check the “fair use” policies of information on the assessment’s simulated web sites. unless the usage policy (for example, copyright information) is very obvious, students appeared to assume that they may use information obtained online.

on the positive side, test-takers appeared to recognize that .edu and .gov sites are less likely to contain biased material than .com sites. eighty percent of test-takers correctly completed an organizational chart based on e-mailed personnel information. most test-takers correctly categorized e-mails and files into folders. and when presented with an unclear assignment, 70 percent of test-takers selected the best question to help clarify the assignment.

during a task in which students evaluated a set of web sites:
■ only 52 percent judged the objectivity of the sites correctly;
■ sixty-five percent judged the authority correctly;
■ seventy-two percent judged the timeliness correctly; and
■ overall, only 49 percent of test-takers uniquely identified the one web site that met all criteria.

when selecting a research statement for a class assignment:
■ only 44 percent identified a statement that captured the demands of the assignment;
■ forty-eight percent picked a reasonable but too broad statement; and
■ eight percent picked statements that did not address the assignment.
when asked to narrow an overly broad search:
■ only 35 percent selected the correct revision; and
■ thirty-five percent selected a revision that only marginally narrowed the search results.

other results suggest that these students’ ict literacy needs further development:
■ in a web search task, only 40 percent entered multiple search terms to narrow the results;
■ when constructing a presentation slide designed to persuade, 12 percent used only those points directly related to the argument;
■ only a few test-takers accurately adapted existing material for a new audience; and
■ when searching a large database, only 50 percent of test-takers used a strategy that minimized irrelevant results.

■ validity evidence

the goal of the iskills assessment is to measure the ict literacy skills of students: higher scores on the assessment should reflect stronger skills. evidence for this validity argument has been gathered since the earliest stages of assessment design, beginning in august 2003. these documentation and research efforts, conducted at ets and at participating institutions, include:
■ the estimated reliability of iskills assessment scores is .88 (cronbach alpha), which is a measure of test score consistency across various administrations. this level of reliability is comparable to that of many other respected content-based assessments, such as the advanced placement exams.
■ as outlined earlier, the evidence-centered design approach ensures a direct connection between experts’ view of the domain (in this case, ict literacy), evidence of student performance, design of the tasks, and the means for scoring the assessment (katz et al. 2004). through the continued involvement of the library community in the form of the ict literacy national advisory committee and development committees, the assessment maintains the endorsement of its content by appropriate subject-matter experts.
■ in november 2005, a panel of experts (librarians and faculty representing high schools, community colleges, and four-year institutions from across the united states) reviewed the task content and scoring for the core level iskills assessment. after investigating each of the thirty tasks and their scoring in detail, the panelists strongly endorsed twenty-six of the tasks. four tasks received less strong endorsement and were subsequently revised according to the committee’s recommendations.
■ students’ self-assessments of their ict literacy skills align with their scores on the iskills assessment (katz and macklin 2006). the self-assessment measures were gathered via a survey administered before the 2005 assessment. interestingly, although students’ confidence in their ict literacy skills aligned with their iskills scores, iskills scores did not correlate with the frequency with which students reported performing ict literacy activities. this result supports librarians’ claims that mere frequency of use does not translate to good ict literacy skills, and points to the need for ict literacy instruction (oblinger and hawkins 2006; rockman 2002).
■ several other validity studies are ongoing, both at ets and at collaborating institutions. these studies include using the iskills assessment in pre-post evaluations of educational interventions, detailed comparisons of student performance on the assessment and on more real-world ict literacy tasks, and comparisons of iskills assessment scores and scores from writing portfolios.

■ national ict literacy standards and setting cut scores

in october 2006, the national forum on information literacy, an advocacy group for information literacy policy (http://www.infolit.org/), announced the formation of the national ict literacy policy council.
the policy council (composed of representatives from key policy-making, information-literacy advocacy, education, and workforce groups) has the charter to draft ict literacy standards that outline what students should know and be able to do at different points in their academic careers. beginning in 2007, the council will first review existing standards documents to draft descriptions for different levels of performance (for example, minimal ict literacy, proficient ict literacy), creating a framework for the national ict literacy standards. separate performance levels will be defined for the corresponding target population for the core and advanced assessments. these performance-level descriptions will be reviewed by other groups representing key stakeholders, such as business leaders, healthcare educators, and the library community.

the council also will recruit experts in ict literacy and information-literacy instruction to review the iskills assessment and recommend cut scores corresponding to the performance levels for the core and advanced assessments. (a cut score represents the minimum assessment score needed to classify a student at a given performance level.) the standards-based cut scores are intended to help educators determine which students meet the ict literacy standards and which may need additional instruction or remediation. the council will review these recommended cut scores and modify or accept them as appropriately reflecting national ict literacy standards.

■ conclusions

ets’s iskills assessment is the first nationally available measure of ict literacy that reflects the richness of that area through simulation-based assessment. owing to the 2005 and 2006 testing of more than ten thousand students, there is now evidence consistent with anecdotal reports of students’ difficulty with ict literacy despite their technical prowess.
the results reflect poor ict literacy performance not only by students within one institution, but across the participating sixty-three high schools, community colleges, and four-year colleges and universities. the iskills assessment answers the call of the 2001 international ict literacy panel and should inform ict literacy instruction to strengthen these critical twenty-first-century skills for college students and all members of society.

■ acknowledgments

i thank karen bogan, dan eignor, terry egan, and david williamson for their comments on earlier drafts of this article. the work described in this article represents contributions by the entire iskills team at educational testing service and the iskills national advisory committee.

works cited

american library association. 1989. presidential committee on information literacy: final report. chicago: ala. available online at http://www.ala.org/acrl/legalis.html (accessed june 13, 2007).
brasley, s. s. 2006. building and using a tool to assess info and tech literacy. computers in libraries 26, no. 5: 6–7, 43–48.
breivik, p. s. 2005. 21st century learning and information literacy. change 37, no. 2: 20–27.
dunn, k. 2002. assessing information literacy skills in the california state university: a progress report. journal of academic librarianship 28, no. 1/2: 26–36.
international ict literacy panel. 2002. digital transformation: a framework for ict literacy. princeton, n.j.: educational testing service. available online at http://www.ets.org/media/tests/information_and_communication_technology_literacy/ictreport.pdf (accessed june 13, 2007).
katz, i. r. 2005. beyond technical competence: literacy in information and communication technology. educational technology magazine 45, no. 6: 144–47.
katz, i. r., and a. macklin. 2006. information and communication technology (ict) literacy: integration and assessment in higher education. in proceedings of the 4th international conference on education and information systems, technologies, and applications, f. malpica, a. tremante, and f. welsch, eds. caracas, venezuela: international institute of informatics and systemics.
katz, i. r., et al. 2004. assessing information and communications technology literacy for higher education. paper presented at the annual meeting of the international association for educational assessment, philadelphia, pa.
middle states commission on higher education. 2003. developing research and communication skills: guidelines for information literacy in the curriculum. philadelphia: middle states commission on higher education.
mislevy, r. j., l. s. steinberg, and r. g. almond. 2003. on the structure of educational assessments. measurement: interdisciplinary research and perspectives 1: 3–67.
oblinger, d. g., and b. l. hawkins. 2006. the myth about student competency. educause review 41, no. 2: 12–13.
oblinger, d. g., and j. l. oblinger, eds. 2005. educating the net generation. washington, d.c.: educause, http://www.educause.edu/educatingthenetgen (accessed dec. 29, 2006).
partnership for 21st century skills. 2003. learning for the 21st century: a report and mile guide for 21st century skills. washington, d.c.: partnership for 21st century skills.
rockman, i. f. 2002. strengthening connections between information literacy, general education, and assessment efforts. library trends 51, no. 2: 185–98.
rockman, i. f. 2004. introduction: the importance of information literacy. in integrating information literacy into the higher education curriculum: practical models for transformation. i. f. rockman and associates, eds. san francisco: jossey-bass.
the california state university. 2006. information competence initiative web site. http://calstate.edu/ls/infocomp.shtml (accessed june 4, 2006).
university of central florida. 2006. information fluency initiative web site. http://www.if.ucf.edu/ (accessed june 4, 2006).
western association of schools and colleges. 2001. handbook of accreditation. alameda, calif.: western association of schools and colleges. available online at http://www.wascsenior.org/wasc/doc_lib/2001%20handbook.pdf (accessed dec. 22, 2006).

tails wagging dogs

a funny thing happened on the way to the form. in the past decade, many libraries believed they were developing or using automated systems to produce catalog cards, or order slips, or circulation control records. the trauma of aacr2 implementation has helped many to realize belatedly that they have, in fact, been building data bases. libraries must relate their own machine-readable records to each other in a new way as they face new applications. further methods of relating and using records from different libraries, and even different networks, are becoming necessities in our increasingly interdependent world.

a narrow view of the process of creating records has often resulted in introduction of nonstandard practices that provide the required immediate result, but create garbage in the data base. in effect, letting the tails wag the dogs. for many years, john kountz and the tesla (technical standards for library automation) committee addressed this issue forcefully, but were as voices in the wilderness.

the problems created are the problems of success. the expectations libraries have developed have outstripped their practices. many libraries are only now seriously addressing the practices they have used to create data bases that already contain hundreds of thousands of records. precisely because of its success, the oclc system is a useful case in point. in general, oclc has adhered closely to marc standards. in call number and holding fields, national standards have been late forthcoming, and libraries have often improvised.
meeting the procrustean needs of catalog cards has ofttimes blinded libraries to the long-term effects of their practices. multiple subfield codes to force call number "stacking" and omission of periods from lc call numbers are two examples of card-driven practice. not following recommended oclc practice of fully updating the record at each use has created archive tapes requiring significant manual effort to properly reflect library holdings. variant branch cataloging practices create dilemmas. some malpractices have resulted from attempts to beat pricing algorithms. some, like retaining extraneous fields or accepting default options when they are incorrect, merely reflect laziness or shortsighted procedures.

while implementing systems in the present, libraries must keep a weather eye to the future. what new requirements will future systems place on records being created today?

brian aveney

the role of the library in the digital economy
serhii zharinov
information technology and libraries | december 2020
https://doi.org/10.6017/ital.v39i4.12457
serhii zharinov (serhii.zharinov@gmail.com) is researcher, state scientific and technical library of ukraine. © 2020.

abstract

the gradual transition to a digital economy requires all business entities to adapt to new environmental conditions through digital transformation. these tasks are especially relevant for scientific libraries, as digital technologies are changing the main subject field of their activities: the processes of creating, storing, and disseminating information. in order to find directions for the transformation of scientific libraries and determine their role in the digital economy, a study of the features of digital transformation and the experience of the digital transformation of foreign libraries was conducted.
management of research data, which is implemented through the creation of current research information systems (cris) was found to be one of the most promising areas of the digital transformation of libraries. the problem area of this direction and ways of engaging libraries in it have been also analyzed in the work. introduction the transition to a digital economy contributes to the even greater penetration of digital technologies into our lives and the emergence of new conditions of competition and trends in organizations’ development. big data, machine learning, and artificial intelligence are becoming common tools implemented by the pioneers of digital transformation in their activities.1 significant changes in the main functions of libraries, storage and dissemination of information caused by the development of digital technologies, affect the operational activities of libraries, user and partners’ requests to the library, and ways to meet them. in the process of adapting to these changes, the role of libraries in the digital economy is changing. this study is designed to find current areas of library development and to determine the role of the library in the digital economy. achieving this goal requires study of the “digital economy” concept and the peculiarities of the digital transformation of organizations in order to better understand the role of the library in it; research on the development of libraries and determine what best fits the new role of the library in the digital economy; identification of obstacles to the development of this area and ways to engage libraries in it. 
the concept of the “digital economy”

the transition to an information society and digital economy will gradually change all industries, and all companies must change accordingly.2 taking advantage of the digital economy is the main driving force of innovation, competitiveness, and economic development of the country.3 the transition to a digital economy is not instant but occurs over many years. the topic emerged starting at the end of the twentieth century, but in recent years has experienced rapid growth. in the web of science (wos) citation database, publications with this term in the title began to be published in 1996 (figure 1).

figure 1. the number of publications in the wos citation database for the query “digital economy.”

one of the first books devoted entirely to the study of the digital economy concept is the work of don tapscott, published in 1996.
in this book, the author understands the digital economy as an economy in which the use of digital computing technologies in economic activity becomes its dominant component.4 thomas mesenbourg, an american statistician and economist, identified in 2000 the three main components of the digital economy: e-business, e-commerce, and e-business infrastructure.5 a number of works on the development of indicators to assess the state of the digital economy, in particular, the work of philip barbet and nathalie coutinet, are based on the analysis of these components.6 alnoor bhimani, in his 2003 paper, “digitization and accounting change,” defined the digital economy as “the digital interrelationships and dependencies between emerging communication and information technologies, data transfers along predefined channels and emerging platforms, and related contingencies within and across institutional and organizational entities.”7 bo carlsson’s 2004 article described the digital economy as a dynamic state of the economy characterized by the constant emergence of new activities based on the use of the internet and new forms of communication between different authors of ideas, whose communication allows them to generate new activities.8 in 2009, john hand gave the meaning of the digital economy as the new design or use of information and communication technologies that help transform the lives of people, society, or business.9 ciocoiu carmen nadia, in her 2011 article, explained the digital economy as a state of the economy where knowledge and networking begin to play a more important role than capital in a postindustrial society due to technology.10 in a 2014 article, kit lesya defined the digital economy as an element of the network economy, characterized by the transformation of all spheres of the economy
by transferring information resources and knowledge to a computer platform for further use.11 ukrainian scientists mykhailo voinarenko and larysa skorobohata, in a study of network tools in 2015, gave the following definition of the digital economy: “the digital economy, unlike the internet economy, assumes that all economic processes (except for the production of goods) take place independently of the real world. goods and services do not have a physical medium but are ‘electronic.’”12 yurii pivovarov, director of the ukrainian association for innovation development (uaid), gives the following definition: “digital economy is any activity related to information technology. and in this case, it is important to separate the terms: digital economy and it sphere. after all, it is not about the development of it companies, but about the consumption of services or goods they provide—online commerce, e-government, etc.—using digital information technology.”13

taking into account the above, in this study, the digital economy is defined as a digital infrastructure that encompasses all business entities and their activities. the transition to the digital economy is the process of creating conditions for the digital transformation of organizations, the creation of digital infrastructure, and the gradual involvement of various economic entities and certain sectors of the economy in the digital infrastructure.

one of the first practical and political manifestations of the transition to the digital economy was the european commission’s index of digital economy and society (desi), first published in 2014. the main components of the index are communications, human capital, internet use, digital integration, and digital public services.
among european countries in 2019, there is significant progress in the digitalization of business and in the interaction of society with the state.14 for ukraine, the first step towards the digital economy was the digital economy and development concept of ukraine, which defines the understanding of the digital economy and the direction and principles of the transition to it.15 thus, for active representatives of the public sector, this concept is a signal that the development of structures and organizations should be based not on improving operational efficiency, but on transformation in accordance with the requirements of industry 4.0. confirmation of the seriousness of the ukrainian government’s intentions in this direction is the creation of the ministry of digital transformation in 2019 and the digitization of the latest public services through online services.16

one of the priority challenges that needs to be solved at the stage of transition to the digital economy is the development of skills in working with digital technologies in the entire population. this is relevant not only for ukraine, but also for the european union. in europe, a third of the active workforce does not have basic skills in working with digital technologies; in ukraine, 15.1 percent of ukrainians do not have digital skills, and the share of the working population with below-average digital skills is 37.9 percent.17 part of the solution to this challenge in ukraine is entrusted to the “digital education” project, implemented by the ministry of digital transformation (osvita.diia.gov.ua), which, through mini-series the ministry has created for different target audiences, is meant to build digital literacy in the population of ukraine.
features of digital transformation

developed digital skills in the population make the digital transformation of organizations not just a competitive advantage, but a prerequisite for their survival. thus, the more the target audience becomes accustomed to the benefits of the digital economy, the more actively the organization must adapt to new requirements and customer needs and to the new competitive environment.

digital transformation of the organization is a complex process that is not limited to the implementation of software in the company’s activities or automation of certain components of production. it includes changes to all elements of the company, including methods of manufacturing and customer service, the organization’s strategy and business model, approaches, and management methods. according to a study by mckinsey, the integration of new technologies into a company's operations can reduce profits in 45 percent of cases.18 therefore, it is extremely important to have a comprehensive approach to digital transformation, understanding the changes being implemented, choosing the method of their implementation, and gradually involving all structural units and business processes in the process of transformation.

the boston consulting group study identified six factors necessary for the effective use of the benefits of modern technologies:19
• connectivity of analytical data;
• integration of technologies and automation;
• analysis of results and application of conclusions;
• strategic partnership;
• competent specialists in all departments; and
• flexible structure and culture.
mckinsey consultants draw attention to the low percentage of successful digital transformation practices and, based on the successful experience of 83 companies, formulate five categories of recommendations that can contribute to successful digitalization:20
• involvement of leaders experienced in digitalization;
• development of digital staff skills;
• creating conditions for the use of digital skills by staff;
• digitization of tools and working procedures of the company; and
• establishing digital communication and ensuring the availability of information.

experts at the institute of digital transformation identify four main stages of digital transformation in the company:21
1. research, analysis and understanding of customer experience.
2. involvement of the team in the process of digital transformation and implementation of corporate culture, which contributes to this process.
3. building an effective operating model based on modern systems.
4. transformation of the business model of the organization.

the “integrated model of digital transformation” study identifies as one of the key factors of successful digital transformation a focus on priority digital projects, the development and implementation of which should be handled by specific organizational teams.
the authors identify three main functional activities for digital transformation teams, the implementation of which provides a gradual comprehensive renewal of the company, namely: the creation and implementation of digital strategy, digital activity management, digitization of operational activities.22 in their study, ukrainian scientists natalia kraus, oleksandr holoborodko, and kateryna kraus determine that the general pattern for all digital economy projects is their focus on a specific consumer and comprehensive use of available information about the latter and the conditions of project effectiveness.23 initially, the project is pre-tested on a small scale, and only after obtaining satisfactory results from the testing of new principles of activity in a narrow target audience is the project scaled to a wider range of potential users. all this reduces the risks associated with digital transformation. eliminating unnecessary changes and false hypotheses on a small scale allows to avoid overspending at the stage of a comprehensive transformation of the entire enterprise. therefore, the process of effective digital transformation should begin with the involvement of experienced leaders in the field of digital transformation, analysis of the weaknesses of the organization, and building of a plan for its comprehensive transformation, which is divided into individual projects implemented by individual qualified teams with a gradual increase in the volume of these projects, while confirming their effectiveness on a small scale. the process of digital transformation should be accompanied by constant training of employees in digital skills. the goal of digital transformation is to build an efficient, high-profile company that can quickly adapt to new environmental conditions, which is achieved through the introduction of digital technologies and new methods and tools of organization management. 
directions of library development in the digital economy
based on the study of the digital economy concept and the peculiarities of digital transformation, a review of library development in the digital economy was conducted to find the library’s place in digital infrastructure and potential projects that can be implemented in an individual library as part of its comprehensive transformation plan. the main task is to determine the new role of the library in the digital economy and the areas that best meet it. the search for directions in the development of the library in response to the spread of digital technology began at the end of the last century. one of the first concepts to reflect the impact of the internet on the library sector is the concept of the digital library, published in 1999.24 in 2006, the concept of “library 2.0” emerged, which is based on the use of web 2.0 technologies: dynamic sites, users as data authors, open-source software, api interfaces, and the immediate propagation of data added to one database to partner databases.25 the spread of social networks and mobile technologies, and their successful use in library practice, has led to the formation of the concept of “library 3.0.”26 the development of open source, cloud service, big data, augmented reality, context-aware, and other technologies has influenced library activities, which is reflected in the “library 4.0.”27 researchers, scholars, and the professional community continued to develop the concepts of the modern library, drawing on the experience of implementing changes in library activities and taking into account the development of other areas, and in 2020 articles began to appear which described the concept of “library 5.0,” based on a personalized approach to students, support of each student during the whole period of study, development of skills necessary for learning and
a set of other supporting actions integrated into the educational process.28 in determining the current role of the library in the digital economy, it is necessary to pay attention to a study by denis solovianenko, who identifies research and educational infrastructure as one of the key elements of scientific libraries of the twenty-first century.29 olga stepanenko considers libraries as part of the information and communication infrastructure, the development of which is one of the main tasks of the transformation of the socioeconomic environment in accordance with the needs of the digital economy, which ensures high efficiency of stakeholders the pace of digitalization of the state economy, which occurs through the development of its constituent elements.30 the importance of traditional library services replacing digital infrastructure, based on the example of the moravian library, is proved in a study by michal indrak and lenka pokorna, published in april 2020.31 projects that contribute to the library’s adaptation to the conditions of the digital economy, implemented in the environment of public libraries, include: digitization of library collections (including historical heritage) and the creation of a database of full-text documents; providing free access to the internet via library computers and wi-fi; organization of online customer service and development of services that do not require a physical presence in the library; and organization of events for the development of users’ digital skills and ability to work with information.32 under such conditions, the role of the librarian as a specialist in the field of information changes from being a custodian to being an intermediary, a distributor.33 the main objectives of library activity in the digital economy become overcoming the digital divide, dissemination of knowledge about modern technologies and innovations, assistance in their use by the community, and development of digital skills in all users of the
library.34 an example of the digital public library is the digital north library project in canada, which resulted in the creation of the inuvialuit digital library (https://inuvialuitdigitallibrary.ca). the project lasted four years, bringing together researchers from different universities and the community in the region, who together digitized cultural heritage documents and created metadata. the library now has more than 5,200 digital resources collected in 49 catalogues. the implementation of this project provides access to library services and information to a significant number of people living in remote areas of northern canada and unable to visit libraries (https://sites.google.com/ualberta.ca/dln/home?authuser=0, https://inuvialuitdigitallibrary.ca).35 other representatives of modern digital libraries, one of the main tasks of which is the preservation of cultural heritage and the spread of national culture, are the british library (https://www.bl.uk), the hispanic digital library—biblioteca nacional de españa (http://www.bne.es), gallica digital library in france (https://gallica.bnf.fr), the german digital library—deutsche digitale bibliothek (https://www.deutsche-digitale-bibliothek.de), and the european library (https://www.europeana.eu). another direction was the development of analytical skills in information retrieval. 
academic libraries, drawing on their competencies in information retrieval and information technology to refine the results of the analysis, were able to better identify trends in academia and expand cooperation with teachers to update their curricula.36 libraries become active participants in the process of teaching, learning, and assessment of acquired knowledge in educational institutions. t. o. kolesnikova, in her research of models of library development, substantiates the expediency of creating information intelligence centers for the implementation of the latest scientific advances in training and production processes, the involvement of libraries in the activities of higher educational establishments in the educational process, and the creation of centralized repositories as directions of development for university libraries of ukraine.37 one of the advantages of the development and dissemination of digital technologies is the possibility of forming individual curricula for students. involvement of university libraries in this area is one of the new areas of their activities in the digital economy.38 one of the important areas of operation for departmental and scientific-technical libraries that contribute to increasing the innovative potential of the country is activity in the area of intellectual property.
consulting services in the field of intellectual property, information support for scientists, creation of electronic patent information databases in the public domain, and other related services are important components of libraries in many countries.39 another important component of libraries’ transformation is the deepening of their role in scientific communication, the expansion of the use of information technology in order to integrate scientific information into a single network, and the creation and management of the information technology infrastructure of science.40 the presence of libraries on social networks has become an important component of their digital transformation. on the one hand, libraries have thus created another source of information dissemination and expanded the number of service delivery channels, for the implementation of which they have developed online training videos and interactive help services.41 on the other hand, social networks have become a marketing tool to engage the audience in the digital fund of the library and its online services.
an additional important component of the presence of libraries in social networks was the establishment of contacts and exchange of ideas with other professional organizations, which contributed to the further expansion of the network of library partners.42 another area of activity that libraries take on in the digital economy is the management of research data, which is confirmed by the significant number of publications on this topic in professional scientific and research journals for 2017–18.43 joining this area allows libraries to become part of the scientific digital information and communication infrastructure, the creation of which is one of the main tasks of digital transformation on the way to the digital economy.44 the development of this area contributes to the digitalization of the scientific and information sphere; the systematization and structuring of all scientific research data has a positive effect on the effectiveness of research and on the level of scientific novelty of the results of intellectual activity. the ukrainian institute of the future with the digital agency of ukraine consider digital transformation as the integration of modern digital technologies into all spheres of business. the introduction of modern technologies (artificial intelligence, blockchain, cobots, digital twins, iiot platforms and others) in the production process will lead to the transition to industry 4.0. according to their forecasts, the key competence in industry 4.0 should be data processing and analytics.45 research information is an integral part of this competence, so the development of this area is one of the most promising for the library in the digital economy. the tools used in the management of research data are called current research information systems, abbreviated as cris. in ukraine, there is no such system connected to the international community.
46 the change of the library’s role from a repository of information to its manager, the alignment of the functions and tasks of a cris with the key requirements of the digital economy, and the advantages of such systems, together with the fact that they are still not used in ukraine, make this area extremely relevant for research and a promising area of work of scientific libraries, so we’ll consider it more thoroughly.
problems in research data management
the global experience of research information management shows several problems in the process of research data management. some of them are related to the processes of workflow organization, control, and reporting. this is due to the use of several poorly coordinated systems to organize the work of scientists. data sets from different systems without metadata are very difficult to combine into a single system, and it is almost impossible to automate the process. all this is manifested in the lack of information security of the decision-making process in the field of science, both at the state level and at the level of individual structures. this situation can lead to wrong management decisions, to overspending on similar, duplicate projects, and to higher costs of recruiting scientists with relevant experience and finding the equipment needed for research. cris, which began to appear in europe in the 1990s, are designed to overcome these shortcomings and promote the effective organization of scientific work. such systems are now widespread throughout the world, with a total of about five hundred, which are mainly concentrated in europe and india. however, there is currently no research information management system in ukraine that meets international standards and integrates with international scientific databases. this omission slows down ukraine’s integration into the international scientific community.
the solution to this problem may be the creation of the national electronic scientific information system uris (ukrainian research information system).47 the development of this system is an initiative of the ministry of education and science of ukraine. it is based on combining data from ukrainian scientific institutions with data from crossref and other organizations, as well as ensuring integration with other international cris systems through the use of the cerif standard. future developers of the system face a number of challenges, both specific and already studied by foreign scientists. a significant number of studies in this area are designed to overcome the problem of lack of access to research data, as well as to solve problems of data standardization and openness. in the global experience, the directions of collection processes management and development of structured data sets, their distribution on a commercial basis, and also ways of receiving the advantage of providing them in open access are investigated. the mechanisms of financing these processes are studied, in particular, the effective ways of attracting patronage funds are analyzed. the possibilities of licensing the received data sets and their distribution, approaches and tools that can be the most effective for the library are determined. in particular, alice wise describes the experience of settling some legal aspects by clarifying the use of the site in the license agreement, which covers the conditions of access to information and search in it, while maintaining a certain level of anonymity.48 the problem of data consistency is related to the lack of uniform standards for information retention, which would relate to the format of the data, the metadata itself, the methods of their generation and use.
thus, the use of different standards and formats in repositories and archives leads to problems with data consistency for researchers, which, in turn, affects the quality of service delivery and makes it impossible to use multiple data sets.49 another important problem for the dissemination of research data is the lack of tools and components in libraries and in repositories of higher educational establishments and scientific institutions. it is worth developing the infrastructure so that at the end of the projects, in addition to the research results, the scientists publish the research data they used and generated. this approach will be convenient both for authors (in case they need to reuse the research data) and for other scientists (because they will have access to data that can be used in their own research).50 the development of the necessary tools is quite relevant, especially because researcher-practitioners are in favor of sharing the data they create with other researchers and the licensed use of other people’s datasets in conducting their own research, according to international surveys.51 another reason for the low prevalence of research data is that datasets have less of an impact on a researcher’s reputation and rating than publications.52 this is partly due to the lack of citation tracking infrastructure for datasets, in contrast to publications of research results, and the lack of standards for storing and publishing data. prestigious scientific journals have been struggling with this problem for several years.
for example, the american economic review requires authors whose articles contain empirical work, modelling, or experimental work to provide information about research data in sufficient detail for replication.53 nature and science require authors to preserve research data and provide them at the request of the editors of the journals.54 one of the reasons for the underdeveloped infrastructure in research data management is the weak policy of disseminating free access to this data, as a result of which even a small part of usable scientific data remains closed by license agreements and cannot be used by other scientists.55 open science initiatives related to publications have been operating in the scientific field for a long time, but their dissemination to research data remains insufficient. the development of the uris system will provide management of scientific information, solve the problems highlighted in the studies cited above, promote the efficient use of funds, simplify the process of finding data for conducting research, and discipline research, and therefore will have a positive impact on the entire economy of ukraine.
library and research information management
the involvement of libraries in the development process for scientific information management systems will be an important future direction of their work. such systems, which could include all the necessary information about scientific research, will contribute to the renewal and development of the library sphere of ukraine and will promote the transition of the state to a digital economy. the creation of the uris system is designed to provide access to research data generated by both ukrainian and foreign scientists.
such a system can ensure the development of cooperation in the field of research, intensification of knowledge exchange, and interaction through the open exchange of scientific data and integration of ukrainian scientific infrastructure into the world scientific and information space. according to surveys conducted by the international organizations eurocris and oclc, of the 172 respondents working in the field of research information management, 83 percent said that libraries play an important role in the development of open science, copyright, and the deposit of research results. the share of libraries that play a major role in this direction was 90 percent. almost 68 percent of respondents noted the significant contribution of libraries in filling in the metadata needed to correctly identify the work of researchers in various databases; 60 percent noted the important role of libraries in verifying the correctness of metadata filling by researchers, and almost 49 percent of respondents assess the role of libraries as the main one in the management of research data (figure 4). figure 4. the proportion of organizations among 172 users of cris-systems that assess the role of libraries in the management of research information as basic or supporting.56 at the same time, the activity of libraries in the direction of assistance in information management of scientific research can take various forms, which should be adopted by scientific libraries of ukraine; some of these forms will be useful to public libraries that can become science ambassadors in their communities. based on the experience of foreign libraries, we have identified areas of activity in which the library can join the management of research information. 
[figure 4: bar chart; horizontal axis from 0% to 90%; categories: financial support for rim; project management; maintaining or servicing technical operations; impact assessment and reporting; strategic development, management and planning; creating internal reports for departments; system configuration; outreach and communication; initiating rim adaption; research data management; metadata validation workflows; metadata entry; training and support; open access, copyright and deposit]
one of the main directions for libraries that cooperate with cris users or are themselves the organizers of such systems is the introduction and support of open science. historically, libraries support open science because they provide access to scientific papers, but they can further expand their activities. using open data resources and promoting them among the scientific community, involving scientific users in disseminating their own research results on the principles of open science, supporting users in disseminating their publications, creating conditions for increasing the citation of scientific papers, tracking information about user publications, and creating and supporting public profiles of scientists in scientific and professional resources and scientific social networks will all help researchers to engage more actively in open science and to take advantage of this area. the analysis of the world experience shows that in the activity of scientific libraries there is a significant intensification of support for the strategic goals of the structures that finance their activities and to which they are subordinated. libraries are moving away from the usual customer service and expanding their activities through the use of their own assets and the introduction of new modern tools.
such libraries try to promote the development of parent structures and build modern competencies to better meet the needs and goals of these institutions. by introducing and implementing various tools for the development of management, libraries synchronize their strategy with the strategy of the parent structure to achieve a synergistic effect. the next important direction of library development is their socialization. wanting to get rid of the antiquated understanding of the word library, many of them conduct campaigns aimed at changing the image of the library in the imagination of users, communities, and society. an important component of this step is to build relationships with the target audience, creating user communities around the library, whose members are not only its users but also supporters, friends, and promoters. building relationships with members of the scientific community allows libraries to reduce resistance to change as a result of the introduction of scientific information management systems and to influence users positively so that they introduce new tools into their usual activities, receive benefits, and become an active part of the scientific space structuring process. recently, work with metadata has undergone some changes. the need for identification and structuring of data in the world scientific space means that metadata are now filled in not only by libraries but also by other organizations that produce, issue, and publish scientific results and scientific literature. scientists are beginning to make more active use of modern standards in the field of information in order to promote their own work. libraries, in turn, take on the role of consultant or contractor with many years of experience working with metadata and sufficient knowledge in this area.
on the other hand, filling in metadata by users frees up the time of librarians and creates conditions for them to perform other functions, such as information management and the creation of automated data collection and management systems integrated with scientific databases, both ukrainian and international. another area of research information management is the direct management of this process. thus, cris are developed and implemented with the contribution of scientific libraries in different countries of the world. this allows libraries to combine disparate data obtained from different sources, compile scientific reports, evaluate the effectiveness of scientific activities of the institution, create profiles of scientific institutions and scientists, develop research networks, etc. scientists and students can find the results of scientific research, and look for partners and sources of funding for research. research managers have access to up-to-date scientific information, which allows them to more accurately assess the productivity and influence of individual scientists, research groups and institutions. business representatives get access to up-to-date information on promising scientific developments, and the public gets a way to monitor whether research is conducted effectively.
conclusions
ukraine is on the path to a digital economy, characterized by the penetration of new technologies in all areas of human activity, simplification of access to information, goods and services, blurring the geographical boundaries of companies, increasing the share of automated and robotic production units, and strengthening the role of the creation and use of databases. these changes affect all sectors of the economy, and all organizations, without exception, need to adapt accordingly.
rapid response to relevant changes helps to increase competitiveness both at the level of individual organizations and at the level of the state economy. adaptation to the conditions of the digital economy occurs through digital transformation, a complex process that requires a review of all business processes of the organization and radically changes its business model. the digital transformation of the organization takes place through the involvement of management, which is competent in digitization, updating management methods, developing digital skills, establishing efficient production and services, implementing digital tools and building digital communication, implementing individual development projects, and adapting to new user needs. the digital transformation of the economy occurs through the transformation of its individual sectors, creating conditions for the transformation of their representatives. one of the first steps in the process of transition to the digital economy is the establishment of digital information and communication infrastructure. libraries are representatives of the information sphere, which were the main operators of information in the analogue era. significant changes in the subject area of their activities require the search for a new role for libraries. modern projects and directions of library development are integral elements of transformation to the conditions of the digital economy. completing this complex implementation will allow libraries to update their management methods, the range of services, and the channels of their provision; change fixed assets through their digitization, structuring the data and creating metadata; affect approaches to communication with users and cooperation with both domestic and international partners; change the functions and positioning of the library; and become effective information operator-managers.
in the digital economy, the role of the library is changing from passively collecting and storing information to actively managing it. one of the areas of development that most comprehensively meets this role is the management of research data, which is implemented through the creation of cris systems. thus, the main asset of libraries is a digital, structured database, which is automatically and regularly updated, the main focus of which is to support the decision-making process. the library becomes an assistant in conducting research, finding funding, partners, fixed assets and information; a partner in the strategic management of both scientific organizations and the state at the level of committees and ministries. the development of this area in ukraine requires solving a number of technical, administrative, and managerial questions that are relevant not only in ukraine, but also around the world. in particular, libraries need to address the issue of data integration and consistency, its accessibility and openness, copyright, and personal data issues. solving the problems of creation and operation of cris systems in ukraine is a promising area for future research.
endnotes
1 andriy dobrynin, konstantin chernykh, vasyl kupriyanovsky, pavlo kupriyanovsky and serhiy sinyagov, “tsifrovaya ekonomika—razlichnyie puti k effektivnomu primeneniyu tehnologiy (bim, plm, cad, iot, smart city, big data i drugie),” international journal of open information technologies 4, no. 1 (2016): 4–10, https://cyberleninka.ru/article/n/tsifrovaya-ekonomika-razlichnye-puti-k-effektivnomu-primeneniyu-tehnologiy-bim-plm-cad-iot-smart-city-big-data-i-drugie.
2 jurgen meffert, volodymyr kulagin, and alexander suharevskiy, digital @ scale: nastolnaya kniga po tsifrovizatsii biznesa (moscow: alpina, 2019).
3 victoria apalkova, “kontseptsiia rozvytku tsyfrovoi ekonomiky v yevrosoiuzi ta perspektyvy ukrainy,” visnyk dnipropetrovskoho universytetu. seriia «menedzhment innovatsii» 23, no. 4 (2015): 9–18, http://nbuv.gov.ua/ujrn/vdumi_2015_23_4_4.
4 don tapscott, the digital economy: promise and peril in the age of networked intelligence (new york: mcgraw-hill, 1996).
5 thomas l. mesenbourg, measuring the digital economy (washington, dc: bureau of the census, 2001).
6 philippe barbet and nathalie coutinet, “measuring the digital economy: state-of-the-art developments and future prospects,” communications & strategies, no. 42 (2001): 153, http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.576.1856&rep=rep1&type=pdf.
7 alnoor bhimani, “digitization and accounting change,” in management accounting in the digital economy, edited by alnoor bhimani, 1–12 (london: oxford university press, 2003), https://doi.org/10.1093/0199260389.003.0001.
8 bo carlsson, “the digital economy: what is new and what is not?,” structural change and economic dynamics 15, no. 3 (september 2004): 245–64, https://doi.org/10.1016/j.strueco.2004.02.001.
9 john hand, “building digital economy—the research councils programme and the vision,” lecture notes of the institute for computer sciences, social informatics and telecommunications engineering 16 (2009): 3, https://doi.org/10.1007/978-3-642-11284-3_1.
10 carmen nadia ciocoiu, “integration digital economy and green economy: opportunities for sustainable development,” theoretical and empirical researches in urban management 6, no. 1 (2011): 33–43, https://www.researchgate.net/publication/227346561.
11 lesya zenoviivna kit, “evoliutsiia merezhevoi ekonomiky,” visnyk khmelnytskoho natsionalnoho universytetu, ekonomichni nauky, no. 3 (2014): 187–94, http://nbuv.gov.ua/ujrn/vchnu_ekon_2014_3%282%29__42.
12 mykhailo voinarenko and larissa skorobohata, “merezhevi instrumenty kapitalizatsii informatsiino-intelektualnoho potentsialu ta innovatsii,” visnyk khmelnytskoho natsionalnoho universytetu, ekonomichni nauky, no. 3 (2015): 18–24, http://elar.khnu.km.ua/jspui/handle/123456789/4259.
13 yurii pivovarov, “ukraina perehodut na ‘cifrovu economic,’ sccho ce oznachae,” edited by miroslav liskovuch, ukrinform (january 21, 2020), https://www.ukrinform.ua/rubric-society/2385945-ukraina-perehodit-na-cifrovu-ekonomiku-so-ce-oznacae.html.
14 european commission, “digital economy and society index,” brussels, belgium, https://ec.europa.eu/commission/news/digital-economy-and-society-index-2019-jun-11_en.
15 kabinet ministriv ukrainu, “pro skhvalennia kontseptsii rozvytku tsyfrovoi ekonomiky ta suspilstva ukrainy na 2018–2020 roky ta zatverdzhennia planu zakhodiv shchodo yii realizatsii,” (kyiv: 2018), https://zakon.rada.gov.ua/laws/show/67-2018-%d1%80. 16 kabinet ministriv ukrainu, “pytannia ministerstva tsyfrovoi transformatsii,” (kyiv: 2019), https://zakon.rada.gov.ua/laws/show/856-2019-%d0%bf. 17 piatuy, “biblioteky stanut pershymy oflain-khabamy: mintsyfry zapustyt kursy z tsyfrovoi osvity,” https://www.5.ua/suspilstvo/biblioteky-stanut-pershymy-oflain-khabamy-mintsyfryzapustyt-kursy-z-tsyfrovoi-osvity-206206.html. 18 jacques bughin, jonathan deaki, and barbara o’beirne, “digital transformation: improving the odds of success,” mckinsey & company, https://www.mckinsey.com/businessfunctions/mckinsey-digital/our-insights/digital-transformation-improving-the-odds-ofsuccess. 19 domynyk fyld, shylpa patel, and henry leon, “kak dostich tsifrovoy zrelosti,” the boston consulting group inc. (2018), https://www.thinkwithgoogle.com/_qs/documents/5685/ru_adwords_marketing___sales_89 1609_mastering_digital_marketing_maturity.pdf. 20 hortense de la boutetière, alberto montagner, and angelika reich, “unlocking success in digital transformations,” mckinsey & company, https://www.mckinsey.com/businessfunctions/organization/our-insights/unlocking-success-in-digital-transformations. 21 top lea, “tsyfrova transformatsiia biznesu: navishcho vona potribna i shche 14 pytan,” businessviews, https://businessviews.com.ua/ru/business/id/cifrova-transformacijabiznesu-navischo-vona-potribna-i-sche-14-pitan-2046. 22 vasily kupriyanovsky, andrey dobrynin, sergey sinyagov, and dmitry namiot, “tselostnaya model transformatsii v tsifrovoy ekonomike—kak stat tsifrovyimi liderami,” international journal of open information technologies 5, no. 
1 (2017): 26–33, http://nbuv.gov.ua/ujrn/vchnu_ekon_2014_3%282%29__42 http://elar.khnu.km.ua/jspui/handle/123456789/4259 https://www.ukrinform.ua/rubric-society/2385945-ukraina-perehodit-na-cifrovu-ekonomiku-so-ce-oznacae.html https://www.ukrinform.ua/rubric-society/2385945-ukraina-perehodit-na-cifrovu-ekonomiku-so-ce-oznacae.html https://ec.europa.eu/commission/news/digital-economy-and-society-index-2019-jun-11_en https://zakon.rada.gov.ua/laws/show/67-2018-%d1%80 https://zakon.rada.gov.ua/laws/show/856-2019-%d0%bf https://www.5.ua/suspilstvo/biblioteky-stanut-pershymy-oflain-khabamy-mintsyfry-zapustyt-kursy-z-tsyfrovoi-osvity-206206.html https://www.5.ua/suspilstvo/biblioteky-stanut-pershymy-oflain-khabamy-mintsyfry-zapustyt-kursy-z-tsyfrovoi-osvity-206206.html https://www.mckinsey.com/business-functions/mckinsey-digital/our-insights/digital-transformation-improving-the-odds-of-success https://www.mckinsey.com/business-functions/mckinsey-digital/our-insights/digital-transformation-improving-the-odds-of-success https://www.mckinsey.com/business-functions/mckinsey-digital/our-insights/digital-transformation-improving-the-odds-of-success https://www.thinkwithgoogle.com/_qs/documents/5685/ru_adwords_marketing___sales_891609_mastering_digital_marketing_maturity.pdf https://www.thinkwithgoogle.com/_qs/documents/5685/ru_adwords_marketing___sales_891609_mastering_digital_marketing_maturity.pdf https://www.mckinsey.com/business-functions/organization/our-insights/unlocking-success-in-digital-transformations https://www.mckinsey.com/business-functions/organization/our-insights/unlocking-success-in-digital-transformations https://businessviews.com.ua/ru/business/id/cifrova-transformacija-biznesu-navischo-vona-potribna-i-sche-14-pitan-2046 https://businessviews.com.ua/ru/business/id/cifrova-transformacija-biznesu-navischo-vona-potribna-i-sche-14-pitan-2046 information technology and libraries december 2020 the role of the library in the digital economy | zharinov 15 
https://cyberleninka.ru/article/n/tselostnaya-model-transformatsii-v-tsifrovoy-ekonomikekak-stat-tsifrovymi-liderami. 23 nataliia kraus, alexander holoborodko, and kateryna kraus, “tsyfrova ekonomika: trendy ta perspektyvy avanhardnoho kharakteru rozvytku,” efektyvna ekonomika no. 1 (2018): 1–7, http://www.economy.nayka.com.ua/pdf/1_2018/8.pdf. 24 david bawden and ian rowlands, “digital libraries: assumptions and concepts,” international journal of libraries and information studies (libri), no. 49 (1999): 181–91, https://doi.org/10.1515/libr.1999.49.4.181. 25 jack m. maness, “library 2.0: the next generation of web-based library services,” logos 13, no. 3 (2006): 139–45, https://doi.org/10.2959/logo.2006.17.3.139. 26 woody evans, building library 3.0: issues in creating a culture of participation (oxford: chandos publishing, 2009). 27 younghee noh, “imagining library 4.0: creating a model for future libraries,” the journal of academic librarianship 41, no. 6 (november 2015): 786–97, https://doi.org/10.1016/j.acalib.2015.08.020. 28 helle guldberg et al., “library 5.0,” septentrio conference series, uit the arctic university of norway, no. 3 (2020), https://doi.org/10.7557/5.5378. 29 denys solovianenko, “akademichni biblioteky u novomu sotsiotekhnichnomu vymiri. chastyna chetverta. suchasnyi riven dyskursu akademichnoho bibliotekoznavstva ta postup e-nauky,” bibliotechnyi visnyk no.1 (2011): 8–24, http://journals.uran.ua/bv/article/view/2011.1.02. 30 olga petrivna stepanenko, “perspektyvni napriamy tsyfrovoi transformatsii v konteksti rozbudovy tsyfrovoi ekonomiky,” in modeliuvannia ta informatsiini systemy v ekonomitsi : zb. nauk. pr., edited by v. k. halitsyn, (kyiv: kneu, 2017), 120–31, https://ir.kneu.edu.ua/bitstream/handle/2010/23788/120131.pdf?sequence=1&isallowed=y. 
31 michal indrák and lenka pokorná, “analysis of digital transformation of services in a research library,” global knowledge, memory and communication (2020), https://doi.org/10.1108/gkmc-09-2019-0118. 32 irina sergeevna koroleva, “biblioteka—optimalnaya model vzaimodeystviya s polzovatelyami v usloviyah tsifrovoy ekonomiki,” informatsionno-bibliotechnyie sistemyi, resursyi i tehnologii no. 1 (2020): 57–64, https://doi.org/10.20913/2618-7515-2020-1-57-64. 33 james currall and michael moss, “we are archivists, but are we ok?”, records management journal 18, no. 1 (2008): 69–91, https://doi.org/10.1108/09565690810858532. 34 kirralie houghton, marcus foth and evonne miller, “the local library across the digital and physical city: opportunities for economic development,” commonwealth journal of local governance no. 15 (2014): 39–60, https://doi.org/10.5130/cjlg.v0i0.4062. https://cyberleninka.ru/article/n/tselostnaya-model-transformatsii-v-tsifrovoy-ekonomike-kak-stat-tsifrovymi-liderami https://cyberleninka.ru/article/n/tselostnaya-model-transformatsii-v-tsifrovoy-ekonomike-kak-stat-tsifrovymi-liderami http://www.economy.nayka.com.ua/pdf/1_2018/8.pdf https://doi.org/10.1515/libr.1999.49.4.181 https://doi.org/10.2959/logo.2006.17.3.139 https://doi.org/10.1016/j.acalib.2015.08.020 https://doi.org/10.7557/5.5378 http://journals.uran.ua/bv/article/view/2011.1.02 https://ir.kneu.edu.ua/bitstream/handle/2010/23788/120-131.pdf?sequence=1&isallowed=y https://ir.kneu.edu.ua/bitstream/handle/2010/23788/120-131.pdf?sequence=1&isallowed=y https://doi.org/10.1108/gkmc-09-2019-0118 https://doi.org/10.20913/2618-7515-2020-1-57-64 https://doi.org/10.1108/09565690810858532 https://doi.org/10.5130/cjlg.v0i0.4062 information technology and libraries december 2020 the role of the library in the digital economy | zharinov 16 35 sharon farnel and ali shiri, “community-driven knowledge organization for cultural heritage digital libraries: the case of the inuvialuit settlement region,” 
advances in classification research online no. 1 (2019): 9–12, https://doi.org/10.7152/acro.v29i1.15453. 36 elizabeth tait, konstantina martzoukou, and peter reid, “libraries for the future: the role of it utilities in the transformation of academic libraries,” palgrave communications no. 2 (2016): 1–9, https://doi.org/10.1057/palcomms.2016.70. 37 tatiana alexandrovna kolesnykova, “suchasna biblioteka vnz: modeli rozvytku v umovakh informatyzatsii,” bibliotekoznavstvo. dokumentoznavstvo. informolohiia no. 4 (2009): 57–62, http://nbuv.gov.ua/ujrn/bdi_2009_4_10. 38 ekaterina kudrina and karina ivina, “digital environment as a new challenge for the university library,”bulletin of kemerovo state university. series: humanities and social sciences 2, no. 10 (2019): 126–34, https://doi.org/10.21603/2542-1840-2019-3-2-126-134. 39 anna kochetkova, “tsyfrovi biblioteky yak oznaka xxi stolittia,” svitohliad no. 6 (2009): 68–73, https://www.mao.kiev.ua/biblio/jscans/svitogliad/svit-2009-20-6/svit-2009-20-6-68kochetkova.pdf. 40 victoria alexandrovna kopanieva, “naukova biblioteka: vid e-katalohu do e-nauky,” bibliotekoznavstvo. dokumentoznavstvo. informolohiia no. 6 (2016): 4–10, http://nbuv.gov.ua/ujrn/bdi_2016_3_3. 41 christy r. stevens, “reference reviewed and re-envisioned: revamping librarian and deskcentric services with libstars and libanswers,” the journal of academic librarianship 39, no. 2 (march 2013): 202–14, https://doi.org/10.1016/j.acalib.2012.11.006. 42 samuel kai-wah chu and helen s du, “social networking tools for academic libraries,” journal of librarianship and information science 45, no. 1 (february 17, 2012): 64–75, https://doi.org/10.1177/0961000611434361. 43 acrl research planning and review committee, “2018 top trends in academic libraries a review of the trends and issues affecting academic libraries in higher education,” c&rl news 79, no.6 (2018): 286–300. https://doi.org/10.5860/crln.79.6.286. 
44 currall and moss, “we are archivists, but are we ok?”, 69–91, https://doi.org/10.1108/09565690810858532. 45 valerii fishchuk et al., “ukraina 2030e— kraina z rozvynutoiu tsyfrovoiu ekonomikoiu,” ukrainskyi instytut maibutnoho, 2018, https://strategy.uifuture.org/kraina-z-rozvinutoyucifrovoyu-ekonomikoyu.html. 46 eurocris, “search the directory of research information system (dris),” https://dspacecris.eurocris.org/cris/explore/dris. 47 mon, “mon zapustylo novyi poshukovyi servis dlia naukovtsiv—vin bezkoshtovnyi ta bazuietsia na vidkrytykh danykh z usoho svituю,” https://mon.gov.ua/ua/news/mon https://doi.org/10.7152/acro.v29i1.15453 https://doi.org/10.1057/palcomms.2016.70 http://nbuv.gov.ua/ujrn/bdi_2009_4_10 https://doi.org/10.21603/2542-1840-2019-3-2-126-134 https://www.mao.kiev.ua/biblio/jscans/svitogliad/svit-2009-20-6/svit-2009-20-6-68-kochetkova.pdf https://www.mao.kiev.ua/biblio/jscans/svitogliad/svit-2009-20-6/svit-2009-20-6-68-kochetkova.pdf http://nbuv.gov.ua/ujrn/bdi_2016_3_3 https://doi.org/10.1016/j.acalib.2012.11.006 https://doi.org/10.1177/0961000611434361 https://doi.org/10.5860/crln.79.6.286 https://doi.org/10.1108/09565690810858532 https://strategy.uifuture.org/kraina-z-rozvinutoyu-cifrovoyu-ekonomikoyu.html https://strategy.uifuture.org/kraina-z-rozvinutoyu-cifrovoyu-ekonomikoyu.html https://dspacecris.eurocris.org/cris/explore/dris https://mon.gov.ua/ua/news/mon-zapustilo-novij-poshukovij-servis-dlya-naukovciv-vin-bezkoshtovnij-ta-bazuyetsya-na-vidkritih-danih-z-usogo-svitu information technology and libraries december 2020 the role of the library in the digital economy | zharinov 17 zapustilo-novij-poshukovij-servis-dlya-naukovciv-vin-bezkoshtovnij-ta-bazuyetsya-navidkritih-danih-z-usogo-svitu. 48 nancy herther et al., “text and data mining contracts: the issues and needs,” proceedings of the charleston library conference, 2016, https://doi.org/10.5703/1288284316233. 
49 karen hogenboom and michele hayslett, “pioneers in the wild west: managing data collections.” portal: libraries and the academy 17, no. 2 (2017): 295–319, https://doi.org/10.1353/pla.2017.0018. 50 philip young et al., “library support for text and data mining,” a report for the university libraries at virginia tech, 2017, http://bit.ly/2fccowu. 51 carol tenopir et al., “data sharing by scientists: practices and perceptions,” plos one 6 (2011), no. 6, https://doi.org/10.1371/journal.pone.0021101. 52 filip kruse and jesper boserup thestrup, “research libraries’ new role in research data management, current trends and visions in denmark,” liber quarterly 23, no.4 (2014): 310– 35, https://doi.org/10.18352/lq.9173. 53 american economic review, “data and code.” aer guidelines for accepted articles. instructions for preparation of accepted manuscripts, 2020, https://www.aeaweb.org/journals/aer/submissions/accepted-articles/styleguide#iic. 54 “data access and retention.” the publication ethics and malpractice statement, (new york: marsland press, 2019), http://www.sciencepub.net/marslandfile/ethics.pdf. 55 patricia cleary et al., “text mining 101: what you should know,” the serials librarian 72, no.1-4 (may 2017): 156–59, https://doi.org/10.1080/0361526x.2017.1320876. 56 rebecca bryant et al., practices and patterns in research information management findings from a global survey (dublin: oclc research, 2018), https://doi.org/10.25333/bgfg-d241. 
editorial board thoughts: metadata training in canadian library technician programs
sharon farnel
information technologies and libraries | december 2016 3

the core metadata team at my institution is small but effective. in addition to me as coordinator, the team includes two librarians and two full-time metadata assistants. our metadata assistant positions are considered to be similar, in some ways, to other senior assistant positions within the organization that require, or at least prefer, that individuals have a library technician diploma. however, neither of our metadata assistants has such a diploma. their credentials, in fact, are quite different.

in part, this difference is driven by the nature of the work that our metadata assistants do. they work regularly with different metadata standards such as mods, dc, and ddi in addition to marc. they perform operations on large batches of metadata using languages such as xslt or r.
this is quite different in many ways from the work of their colleagues who work with the ils, many of whom do have a library technician diploma.

as we prepare for an upcoming short-term leave of one of our team members, i have been thinking a great deal about the work our metadata assistants do and whether we could find an individual who came through a library technician program with the skills and knowledge we need a replacement to have. i have also been reminded of conversations with recently graduated library technicians who felt that their programs had given them too little exposure to metadata standards, practices, and tools beyond rda and marc.

this got me thinking about the presence or absence of metadata courses in library technician programs in canada. i reached out to two colleagues from macewan university, norene erickson and lisa shamchuk, who are doing in-depth research into library technician education in canada. they kindly provided me with a list of canadian institutions that offer a library technician program so i could investigate further.

now, i must begin with two caveats. first, this is very much a surface-level scan rather than an in-depth examination, although it is the first step in what i hope will be a longer-term investigation. second, although several francophone institutions in canada offer library technician programs, i did not review their programs; i was concerned that my lack of fluency in the french language could lead to inadvertent misrepresentations.

sharon farnel (sharon.farnel@ualberta.ca), a member of the ital editorial board, is metadata coordinator, university of alberta libraries, edmonton, alberta.
editorial board thoughts | farnel https://doi.org/10.6017/ital.v35i4.9601 4

canadian institutions offering a library technician program (by province) are:

alberta
● macewan university (http://www.macewan.ca/wcm/schoolsfaculties/business/programs/libraryandinformationtechnology/)
● southern alberta institute of technology (http://www.sait.ca/programs-and-courses/fulltime-studies/diplomas/library-information-technology)

british columbia
● langara college (http://langara.ca/programs-and-courses/programs/library-informationtechnology/)
● university of the fraser valley (http://www.ufv.ca/programs/libit/)

manitoba
● red river college (http://me.rrc.mb.ca/catalogue/programinfo.aspx?progcode=libifdp&regioncode=wpg)

nova scotia
● nova scotia community college (http://www.nscc.ca/learning_programs/programs/plandescr.aspx?prg=lbtn&pln=libinftech)

ontario
● algonquin college (http://www.algonquincollege.com/healthandcommunity/program/library-andinformation-technician/)
● conestoga college (https://www.conestogac.on.ca/parttime/library-and-informationtechnician)
● confederation college (http://www.confederationcollege.ca/program/library-andinformation-technician)
● durham college (http://www.durhamcollege.ca/programs/library-and-informationtechnician)
● seneca college (http://www.senecacollege.ca/fulltime/lit.html)
● mohawk college (http://www.mohawkcollege.ca/ce/programs/community-services-andsupport/library-and-information-technician-diploma-800)

quebec
● john abbott college (http://www.johnabbott.qc.ca/academics/careerprograms/information-library-technologies/)

saskatchewan
● saskatchewan polytechnic (http://saskpolytech.ca/programs-andcourses/programs/library-and-information-technology.aspx)

my method was quite simple. using the program websites listed above, i reviewed the course listings looking for ‘metadata’ either in the title or in the description when it was available.
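the scan described above can be sketched in code. the following python example is only an illustration, not the procedure used for this column, which was a manual review of the program websites. it assumes each program's course listing has already been transcribed into (title, description) pairs; the function name `classify_program`, the category labels, and the sample listings are all hypothetical. treating a title match as stronger evidence than a description-only match is likewise a simplifying assumption.

```python
from collections import Counter


def classify_program(courses):
    """Classify one program by where 'metadata' appears in its course listing.

    courses: list of (title, description) pairs; description is None when the
    program website does not publish one.
    Returns 'title', 'description', or 'none' (hypothetical labels).
    """
    in_title = False
    in_description = False
    for title, description in courses:
        if "metadata" in title.lower():
            in_title = True
        if description is not None and "metadata" in description.lower():
            in_description = True
    if in_title:
        return "title"
    if in_description:
        return "description"
    return "none"


# made-up listings for three fictional programs (not the real course data)
sample_programs = {
    "program a": [("introduction to metadata", None)],
    "program b": [("cataloguing iii", "covers metadata for digital collections")],
    "program c": [("reference services", "answering patron questions")],
}

# count how many programs fall into each category
tally = Counter(classify_program(c) for c in sample_programs.values())
print(dict(tally))
```

a keyword match like this only flags candidates; a person still has to read each listing to judge whether metadata is the focus of a course or merely one topic among several.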
of the fourteen (14) programs examined, nine (9) had no course with metadata in the title or description. two (2) programs had courses where metadata was listed as part of the content but not the focus: langara college, as part of “special topics: creating and managing digital collections,” and seneca college, as part of “cataloguing iii,” which has a partial focus on metadata for digital collections. three (3) of the programs had a course with metadata in the title or description; all are a variation on “introduction to metadata and metadata applications.” (importantly, the three institutions in question, conestoga college, confederation college, and mohawk college, are all connected and share courses online.)

so, what do these very preliminary and impressionistic findings tell us? it seems that there is little opportunity for students enrolled in library technician programs in canada to be exposed to the metadata standards, practices, and tools that are increasingly necessary for positions involved in work with digital collections, research data management, digital preservation, and the like.

admittedly, no program can include courses on all potentially relevant topics. in addition, formal course work is only one aspect of the training and education that can prepare graduates for their careers; practica, work placements, and other more informal activities during a program are crucial, as are the skills and knowledge that can only be developed once hired and on the job. nevertheless, based on the investigation above, one would be justified in asking whether we are disadvantaging students by not working to incorporate additional coursework focused on metadata standards, application, and tools, as well as on basic skills in manipulation of metadata in large batches.

scripting languages or equivalent combination of education and experience.
master’s desirable.” i edited our statement to more clearly allow a combination of factors that would show sufficient preparation: “bachelor’s degree and a minimum of 3-5 years of experience, or an equivalent combination of education and experience, are required; a master’s degree is preferred,” followed by a separate description of technical skills needed. this increased the number and quality of our applications, so i’ll remain on the lookout for opportunities to represent what we want to require more faithfully and with an open mind.

meanwhile, on the other side of the table, students and recent grads are uncertain how to demonstrate their skills. first, they’re wondering how to show clearly enough that they meet requirements like “three years of work experience” or “experience with user testing” so that their application is seriously considered. second, they ask about possibilities to formalize skills. recently, i’ve gotten questions about a certificate program in ux and whether there is any formal certification to be a systems librarian. surveying the past experience of my own network (with very diverse paths into technology jobs, ranging from undergraduate or second master’s degrees to learning scripting as a technical services librarian to pre-mls work experience) doesn’t suggest any standard method for substantiating technical knowledge.

once again, the truth of the situation may be that libraries will welcome a broad range of possible experience, but the postings don’t necessarily signal that. some advice from the tech industry about how to be more inviting to candidates applies to libraries too; for example, avoiding “rockstar”/“ninja” descriptions, emphasizing the problem space over years of experience,1 and designing interview processes that encourage discussion rather than “gotcha” technical tasks.
at penn libraries, for example, we’ve been asking developer candidates to spend a few hours at most on a take-home coding assignment, rather than doing whiteboard coding on the spot. this gives us concrete code to discuss in a far more realistic and relaxed context.

while expressing requirements better may help applicants see more clearly whether they should respond to a posting, this is only a small part of the question of preparing new mls grads for library technology jobs. the new grads who are seeking guidance on substantiating their skills are the ones who are confident they possess them. others have a sense that they should increase their comfort with technology but are not sure how to do it, especially when they’ve just completed a whole new degree and may not have the time or resources to pursue additional training.

even if we make efforts to narrow the gap between employers and jobseekers, much remains to be discussed regarding the challenge of readying students with different interests and preparation for library employment. library school provides a relatively brief window to instill in students the fundamentals and values of the profession, and it can’t be repurposed as a coding academy. there remains a need to discuss how to help students interested in technology learn and demonstrate competencies rather than teaching them rapidly shifting specific technologies.

references
1. erin kissane, “job listings that don’t alienate,” https://storify.com/kissane/job-listings-thatdon-t-alienate.

as i approach the end of my tenure as ital editor, i reflect on the many lita members who have not submitted articles for possible publication in our journal. i am especially mindful of the smaller number who have promised or hinted or implied that they intended to or might submit articles.
admittedly, some of them may have done so because i asked them, and their replies to me were the polite ones that one expects of the honorable members of the library and information technology association of the american library association. librarians are as individuals almost all or almost always polite in their professional discourse.

pondering these potential authors, particularly the smaller number, i conjured a mental picture of a fictional, male, potential ital author. i don’t know why my fictional potential author was male: it may be because more males than females are members of that group; it may be because i’m a male; or it may be unconscious sexism. i’m not very self-analytic.

my mental picture of this fictional male potential author saw him driving home from his place of employment after having an after-work half gallon of rum when, into the picture, a rattlesnake crawled on to the seat of his car and bit him on the scrotum.

lucky him: he was, after all, a figment of my imagination. (any resemblance between my fictional author and a real potential author is purely coincidental.) lucky me: we all know that such an incident is not unthinkable in library land. lucky lita: it is unlikely that any member will cancel his or her membership or any subscriber, his, her, or its subscription because the technical term “scrotum” found its way into my editorial.

ital is, after all, a technology journal, and members and readers ought to be offended if our journal abjures technical terminology. likewise they should be offended if our articles discuss library technology issues misusing technical terms or concepts, or confusing technical issues with policy issues, or stating technology problems or issues in the title or abstract or introduction then omitting any mention of said problems until the final paragraph(s). ital referees are quite diligent in questioning authors when they think terminology has been used loosely.
their close readings of manuscripts have caught more than one author mislabeling policies related to the uses of information technologies as if the policies were themselves technical conundrums. most commonly, they have required authors who state major theses or technology problems at the beginnings of their manuscripts, then all but ignore these until the final paragraphs, to rewrite sections of their manuscripts to emphasize the often interesting questions raised at the outset.

what, pray tell, is the editor trying to communicate to readers? two things, primarily.

first, i have been following with interest the several heated discussions that have taken place on lita-l for the past number of months. sometimes, the idea of the traditional quarterly scholarly/professional journal in a field changing so rapidly may seem almost quaint. a typical ital article is five months old when it is published. a typical discussion thread on lita-l happens in “real time” and lasts two days at most. a small number of participants raise and “solve” an issue in less than a half dozen posts. a few times, however, a question asked or a comment posted by a lita member has led to a flurry of irrelevant postings, or, possibly worse, sustained bombing runs from at least two opposing camps that have left some members begging to be removed from the list until the all clear signal has been sounded.

i’ve read all of these, and i could not help but wonder, what if ital accepted manuscripts as short as lita-l postings? what would our referees do? i suspect, for our readers’ sakes, most would be rejected. authors whose manuscripts are rejected receive the comments made by the referees and me explaining why we cannot accept their submissions. the most frequent reason is that they are out of scope, irrelevant to the purposes of lita.
when someone posts a technology question to lita-l that generates responses advising the questioner that implementing the technology in question is bad policy, the responses are, from an editor’s point of view, out of scope. how many lita members have authority, real authority, to set policy for their libraries? a second “popular” reason for rejections is that the manuscripts pose “false” problems that may be technological but that are not technologies within the “control” of libraries. these are out of scope in a different manner. third, some manuscripts do not pass the “so what” test. some days i wish that lita-l responders would referee, honestly, their own responses for their relevance to the questions or issues or so-whatness and to the membership.

second, and more importantly to me, lita members, whether or not your bodies include the part that we all have come to know and defend, do you have the “-” to send your ital editor a manuscript to be chewed upon not by rattlesnakes but by the skilled professionals who are your ital editorial board members and referees? i hope (and do i dare beg again?) so. your journal will not suffer quaintness unless you make it so.

editorial: the virtues of deliberation
john webb
john webb (jwebb@wsu.edu) is a librarian emeritus, washington state university, and editor of information technology and libraries.

static vs. dynamic tutorials: applying usability principles to evaluate online point-of-need instruction
benjamin turner, caroline fuchs, and anthony todman
information technology and libraries | december 2015 30

abstract
this study had a two-fold purpose. one was to discover through the implementation of usability testing which mode of tutorial was more effective: screencasts containing audio/video directions (dynamic) or text-and-image tutorials (static). the other was to determine if online point-of-need tutorials were effective in helping undergraduate students use library resources.
to this end, the authors conducted two rounds of usability tests consisting of three groups each, in which participants were asked to complete a database-searching task after viewing a text-and-image tutorial, an audio/video tutorial, or no tutorial. the authors found that web usability testing was a useful tutorial-testing tool and that participants learned most effectively from text-and-image tutorials: in both rounds, participants who viewed them completed tasks more accurately and more quickly than those who received audio/video instruction or no instruction.

introduction
the provision of library instruction online has become increasingly important, given that more than one third of higher education students now take at least some of their courses online and that the number of students enrolling in online courses continues to increase more rapidly than the number of students in higher education as a whole.1 academic library websites reflect the growth of online education. by 1998, online versions of journals had become ubiquitous.2 in contrast, electronic books were adopted more slowly in academic libraries, but their use has grown steadily and significantly in recent years. between 2010 and 2011, for example, the average number of electronic books available at academic libraries in the united states increased by 93 percent.3

benjamin turner (turnerb@stjohns.edu) is associate professor and instructional librarian, caroline fuchs (fuchsc@stjohns.edu) is associate professor and outreach librarian, and anthony todman (todmana@stjohns.edu) is associate professor and reference and government documents librarian, st. john’s university libraries, new york, new york.

static vs.
dynamic tutorials | turner, fuchs, and todman | doi: 10.6017/ital.v34i4.5831 31

with the increasing availability of library content online, many users bypass the “brick and mortar” library and go directly to its website.4 remote access to library collections has advantages in terms of convenience, which further underscores the importance of making library websites as intuitive as possible while offering quality instruction at point-of-need. a recent survey of 264 academic library websites found that 64 percent offered some form of online tutorials.5 the relative effectiveness of different types of tutorials in providing online, point-of-need library instruction is therefore an important consideration for library professionals.

this study had a two-fold purpose. one was to discover through the implementation of usability testing which mode of tutorial was more effective: screencasts containing visual and audio directions (dynamic) or text-and-image tutorials (static). the other was to determine if online point-of-need tutorials were effective in helping undergraduate students use library resources. for the purpose of this study, researchers were interested less in the long-term effects of these tutorials on student research than in point-of-need instruction for database use.

st. john’s university
st. john’s university is a private, coeducational roman catholic university, founded in 1870 by the vincentian community. the university has three residential campuses within new york city and an academic center in oakdale, new york, as well as international campuses in rome, italy, and paris, france. the university comprises six schools and colleges: st. john’s college of liberal arts and sciences; the school of education; the peter j. tobin college of business; college of pharmacy and health sciences; college of professional studies; and the school of law. there is a strong focus on online learning.
special academic programs include accelerated three-year bachelor’s degrees, five-year bachelor’s/master’s degrees in the graduate schools, a six-year bachelor’s/jd from the school of law, and a six-year pharmd program. in fall 2013, total student enrollment was 20,729, with 15,773 registered undergraduates and 1,364 international students. during the 2012–13 academic year, 97 percent of undergraduate students received financial aid in the form of scholarships, loans, grants, and college work/study initiatives. the student body was 56 percent female and 44 percent male, representing 47 states and 116 countries. the diversity of student population is noted by the fact that 47 percent identified themselves as black, hispanic, asian, native hawaiian/pacific islander, american indian, alaska native, or multiracial. st. john’s university has a library presence at four campuses: queens, staten island, manhattan, and rome, italy. in addition to traditional or in-person interaction, both information technology and libraries | december 2015 32 online and distance learning are integral parts of the library-tutorial and instruction environment. undergraduate students receive a laptop computer at no cost, and the entire campus is wireless accessible. full-time faculty members receive laptop computers as well. the university libraries has 24/7 access to electronic resources, both on and off campus. the libraries’ portal is located at http://www.stjohns.edu/libraries. an online catalog can be found at http://stjohns.waldo.kohalibrary.com. wireless computing and printing are available at the four campus library sites as well as in other areas across campus. library reference and research assistance services are delivered in-person or electronically. library reserve services are accessible in either print or electronic formats. interlibrary loan has both domestic and international borrowing and lending via the illiad software platform. 
when the main queens campus library is not open for service, a 24/7 quiet study area is available for current students within the library space. library instructional services take place in formal classes that are requested by faculty, as well as library faculty-initiated workshops held in either the libraries’ computerized classrooms or at other on-campus locations. there is no mandated information literacy session. during june 2012–may 2013, 333 instruction classes were offered to 4,435 students. literature review the library literature on online library tutorials can be divided into five subcategories: early development of online instructional tutorials, library website usability testing, evaluation of online information-literacy instruction tutorials, best practices for the creation of library tutorials, and the best mediums for the creation of library tutorials. early development of online instruction tutorials the need to evaluate and assess the usefulness of online instructional tutorials is not new. although not explicitly related to today’s environment, tobin and kesselman’s work contains an early history detailing the design of internet-based information pages and their use in the library information environment.6 they also included the early guidelines of the association of college and research libraries (acrl), the international federation of library associations (ifla), and the american library association (ala). a study by dewald conducted around the same time evaluated twenty library tutorials according to the then-current best practices in library instruction, and concluded that “online tutorials cannot completely substitute for the human connection in learning”7 and should be designed specifically to support students’ academic work.
further, it was noted that tutorials should teach concepts, rather than mechanics, and incorporate active learning where possible.8 in a separate article, dewald argued that the web made possible new, creative ways of teaching library skills, through features such as linked tables of contents, and the provision of immediate feedback through cgi scripts. users also were able to open secondary windows to practice the skills they learned as they moved through tutorials. she further concluded that effective instructional content should not be text heavy, but rather include images and interactive features.9 another early study of online tutorials discussed the development of a self-paced web tutorial at seneca college in toronto, called “library research success,” which was designed to teach subject-specific and general research skills to first-year business majors. the creation of the tutorial was first requested by seneca college’s school of business management, which collaborated with seneca college library, the school’s centre for new technology, and centre for professional development in completing the project. the tutorial was a success, with overwhelmingly positive feedback from students and faculty members.10 despite such successful examples, a common concern expressed in early studies was that online tutorials would not be as effective as face-to-face instruction. 
one article compared and evaluated library skills instruction methods for first-year students at deakin university.11 another compared computer-assisted instruction (cai) without personal librarian interaction with more traditional library instruction incorporated into an english classroom setting, and concluded that cai, while useful, was not a good substitute for face-to-face instruction.12 library website usability testing as concern about the need to evaluate online library tutorials grew at the onset of the twenty-first century, articles on library website usability testing began to appear more frequently. in one study, the authors noted that they would not have identified problems with their website had they not done usability testing: “testers’ observations and the comments of the students participating in the test were invaluable in revealing where and why the site failed and helped evaluators to identify and prioritize the gross usability problems to be addressed.”13 librarians aiming to examine their patrons’ ability to independently navigate their library’s webpage to fulfill key research needs conducted similar studies. at western michigan university (wmu), librarians investigated how researchers navigated the wmu library website in order to find three things: the title of a magazine article on affirmative action, the title of a journal article on endangered species, and a recent newspaper article about the senate race in new york state.
they successfully used the data gathered to identify problems with their website and to establish goals and priorities in clarifying language and navigation on their site.14 more recently, researchers conducted a usability study with the aim of showing how librarians could build websites to better compete with nonlibrary search sites such as google, which would allow greater personalization by the individual user and more seamless integration into learning management systems.15 other researchers have studied the readability of content on academic library websites. in one such study, lim used a combination of readability formulas and focus groups to evaluate twenty-one academic library websites that serve significant numbers of academically underprepared students and/or students who speak english as a second language. lim concluded that the majority of information literacy content on library pages had poor readability, and that the lack of well-designed and well-written information literacy content could undermine its effectiveness in serving users.16 kruger, ray, and knight employed a usability study to evaluate student knowledge of their library’s web resources. the study produced mixed results, with most students able to navigate to the library’s website and the opac, but large numbers unable to perform basic research tasks such as finding a journal article. the authors noted that such information would allow them to modify library instruction accordingly.17 another study focused on the use of language as it relates to awareness of relevant databases. at bowling green university library, staff members attempted to learn more about how users find and select databases through the library website’s electronic resources management system (erm).
on the basis of their study, the authors recommended that librarians focus on promoting brand awareness of relevant databases among students in their subject disciplines by providing better database descriptions on the library webpages and by collaborating with subject faculty members.18 evaluation of online information-literacy instruction tutorials librarians at wayne state university conducted an assessment of their revamped information literacy tutorial, known as “re:search.”19 they distributed a multiple-choice knowledge questionnaire, based on donald kirkpatrick’s evaluating training programs: the four levels,20 to seventy-two students participating in their 2010 wayne state federal trio student support service summer residential program. they concluded that their study highlighted some flaws in their tutorials, including navigational problems. as a result, they would consider partnering with wsu faculty in the future to develop better modules. one curious comment by the authors in their introduction warrants further discussion about assumptions made by librarians regarding student research skills: “the internet has bolstered student confidence levels in their research abilities, increasing the demand for point-of-need instruction. students are accustomed to online learning, not only because of the shift in higher education to online coursework, but also because they have been learning online through youtube, social networking, and other websites.”21 at purdue university, librarians evaluated the success of their seven-module online tutorial through the distribution of a post-test survey.
these researchers found that the feedback received was essential for planning future versions of online instruction at their institution.22 a report from zayed university (united arab emirates) outlined an evaluation of infoasis, the university’s online information literacy tutorial, testing 4,000 female students with limited library proficiency and remedial english aptitudes.23 best practices for the creation of library tutorials other researchers developed guidelines and best practices for future planning and implementation. bowles-terry, hensley, and hinchliffe at the university of illinois conducted interviews to investigate the usability, findability, and instructional effectiveness of online video tutorials. although the tutorials were shorter than three minutes, students found them too lengthy and would have preferred the option to skip ahead to pertinent sections. other participants found the tutorials too slow, while some preferred to read rather than watch and listen. on the basis of their study, the authors recommended a set of best practices for creating library video tutorials, covering pace, length, content, look and feel, video versus text, findability, and interest in using video tutorials.24 at regis university library, librarians created online interactive animated tutorials and incorporated google analytics for use statistics and tutorial assessment, from which they developed a list of tips and suggestions for tutorial development. these included suggestions regarding technical aspects such as screen resolution and accessibility. of some significance is that the data from the analytics suggest that the tutorials are being used both within and outside the university.
most useful here is the “best practices for creating and managing animated tutorials” found in the article’s appendix.25 best mediums for the creation of library tutorials other authors have explored the need to accommodate different learning styles in library tutorials rather than relying too heavily on text to convey information.26 at the university of leeds in the united kingdom, an information literacy tutorial was planned and created to support online distance learners in the geography postgraduate program. using articulate presenter, the authors created a tutorial that covered the same material that would be taught in a face-to-face session, and which incorporated visual, auditory, and textual elements. these researchers concluded that the online tutorial was supplemental and did not eliminate the need for face-to-face instruction.27 to reach different types of learners, many librarians have begun to use adobe flash (formerly macromedia flash) to create multimodal online information literacy tutorials. authors who use flash note that learning how to use the software correctly represents a significant investment in time and effort.28 another study, conducted via a suny albany web design class, focused on the effect of teaching with web-based tutorials in addition to or instead of face-to-face interaction. the authors of this study pointed out that self-paced instruction, lab time, office hours, and email exchange were all factors affecting the web-based multimedia (wbmm) flash tutorials that were incorporated into instruction.29 rather than focusing purely on the content of online library instruction tutorials, some studies considered and evaluated the various tutorial-creating software tools. blevins and elton conducted a case study at the william e.
laupus health sciences library at east carolina university, which set out “to determine the best practices for creating and delivering online database instruction tutorials for optimal information accessibility.”30 they produced “identical” tutorials using microsoft’s powerpoint, sonic foundry’s mediasite, and techsmith’s camtasia software. they chose to include powerpoint because “previous research has shown that online students prefer powerpoint presentations to video lectures.”31 their testing results indicated that participants found specific tutorial features to be most effective: video (33.3 percent), mouse movements (57.1 percent), instructor presence (28.6 percent), audio instruction only (28.6 percent), and interaction (28.6 percent). they concluded that camtasia tutorials provided optimal results for short sessions such as database instruction and that mediasite was more appropriate for instruction requiring video and audio of the instructor plus screenshots. however, they also determined that powerpoint tutorials were an acceptable solution if cost were an important factor.32 in a separate study at florida atlantic university, researchers described the process of designing and creating library tutorials using the screencasting software camtasia. in addition to the creation of the tutorials themselves, the authors described how the project entailed the development of policies and guidelines for the creation of library tutorials, as well as training of librarians in using camtasia software.33 this study provides another good example of the time investment involved in the creation of multimedia tutorials.

while the professional literature thus shows that flash-based tutorial software is popular among librarians, and the desire to accommodate students with different learning styles is a laudable goal, at least one study suggests that the time and money involved in the creation of multimedia tutorials could be better spent in other ways. a university of illinois urbana-champaign study found that students with different learning styles performed better after using tutorials made with a combination of text and screenshots than after using tutorials created with camtasia software.34

method

usability testing for the evaluation of tutorials

to compare dynamic audio/video tutorials with text-and-image tutorials, the researchers employed usability testing, which is “watching people use something that you have created, with the intention of making it easier to use, or proving that it is easy to use.”35 usability testing requires relatively small numbers of participants to provide meaningful results, and it does not require selection from a representative sample population.36

participants

group                          number of participants
control group 1                5
text-and-image group 1         5
dynamic audio/video group 1    5
group 1 total                  15
control group 2                5
text-and-image group 2         5
dynamic audio/video group 2    5
group 2 total                  15
total participants             30

table 1. breakdown of participants

thirty freshmen at st. john’s university participated in this study. while usability-testing experts do not place a great deal of importance on recruiting participants from a specific target audience, the researchers wanted to choose users who were less likely to have had significant experience with university library database searching, since prior knowledge could make it harder to determine the effectiveness of the tutorials. they therefore chose freshmen as the participants in the study.
they did not collect data on any other variables such as age, gender, ethnicity/culture, or any other demographic information. participants were recruited through the st. john’s central portal, which is the main channel of internal communication at st. john’s university, and through which mass emails can be sent to a targeted population of students. the email to students provided a registration link to a google form, which asked students to provide their name, year of study, time availability preference, and contact information. freshmen were selected from the response list. as an incentive for participation, the student participants became eligible for a kindle fire tablet for each of the two rounds of the study. prior to beginning the study, the authors consulted st. john’s university’s office of institutional research, which oversees all research at the university and provides approval for the study of human subjects. since this study focused on tutorials rather than the participants themselves, the authors were granted a waiver for the study. tests usability testing typically involves having participants complete a task or tasks in front of an observer. for this study, the authors designed two tasks that required participants to find articles in the academic search premier ebsco database (asp ebsco). the first task, given to all participants in the first round of tests, was relatively simple, and consisted of three components: finding an article about climate change, ensuring that it was published in the journal lancet, and downloading a copy of the citation for that article in mla format from the database. participants who attempted the first task were labeled “group 1” (see appendix i). the second task was given to all participants in the second round of tests and was more complex, comprising five components.
participants were asked to find an article about the deepwater horizon spill from a peer-reviewed journal published after 2011 that included color photographs. as with the first task, these participants were also required to download a copy of the citation for the article in mla format from the database. participants who attempted the second task were labeled “group 2” (see appendix ii). group 1 and group 2 were divided into three subgroups each. the first subgroup was the control group and received no instruction. the second subgroup was given access to the dynamic audio/visual tutorial (see appendix iii). the third subgroup was given access to the static text-and-image tutorial instruction (see appendixes iv and v). each subgroup consisted of five unique participants. each participant was scheduled for a specific fifteen-minute time slot. tests were conducted in a small meeting room in the library, with one participant at a time working with the facilitator. as the participants entered the meeting room, the facilitator greeted them and confirmed their identities. participants were provided with an information sheet (see appendix vi), which told them that the session would be recorded, that the researchers were concerned with testing the library-instruction tutorials, not the participants themselves, and that the tests were confidential and anonymous. participants were also told that they could end the test at any time for any reason. additionally, the facilitator read aloud the points-of-information sheet. participants were invited to ask questions or voice concerns. for both rounds of tests, participants had use of a laptop computer with a browser window open to the asp ebsco home page. for those who received instruction, a second browser window was open to either the dynamic or the static tutorial. for members of the control group, no tutorial was available.
those who received instruction were allowed to return to the tutorial at any point they wished. using adobe connect software, the testing activities, tutorials, participants’ attempt(s) at task(s), participants’ computer screen, and any conversation between the participants and the facilitators were simultaneously recorded and broadcast to a separate room, where the two other researchers observed, listened and took notes. the participants were asked to verbally describe the steps they were taking, as per the “think aloud” protocol that is essential to usability testing. recorded sessions were then available for later review by the research team. on completing the task, participants who received either the text-and-image or dynamic audio/video tutorial were asked to complete a short questionnaire giving feedback on the instruction received (see appendix vii). participants who received no instruction were not asked to provide feedback. tutorials the researchers created four tutorials for this study. two were flash-based dynamic audio/video tutorials created using techsmith’s jing software. the static text-and-image tutorials were created using microsoft word, which was then converted into a pdf document. the dynamic and static tutorials mirrored each other in terms of content, and were designed with the specific goal of helping participants complete the tasks successfully, though in both cases there was some variation between the tutorials and the tasks. the tutorials received by group 1, for instance, showed participants how to find articles about the occupy wall street movement, limiting the search to “published in the new york times,” and how to download the citation in mla format. the tutorials for group 2 showed participants how to find articles about climate change that included color photographs, limiting the search to peer-reviewed journals that were published after 2011. 
discussion

the results of the usability study revealed two things: participants benefited from library instruction, through which they evidently acquired new skills; and participants benefited more from static text-and-image tutorials than from the dynamic audio/video tutorials. in both rounds of tests, the participants who received the text-and-image tutorials performed the tasks more effectively than did members of the control group or those who viewed the dynamic tutorials.

group 1

for the first round of tests, members of the control group spent longer on the task and made more mistakes than those who received either the dynamic or the static tutorial (see table 2). for example, one participant in the control group was unable to download the mla citation, and another in the control group ventured outside the asp ebsco database platform to find the correct citation format. when members of the control group did succeed, they did so without a clear search strategy, as evidenced by their use of natural language instead of boolean connectors. (asp ebsco uses boolean connectors by default, and natural language is usually ineffective.) another participant reached several dead ends in the search before finally succeeding. while most of the control group participants were at least partially successful in completing the task, it is reasonable to suspect that they would have given up in frustration in a non-test situation, and would have benefited from point-of-need instruction.

                          control 1  control 2  control 3  control 4  control 5
relevant article          y          y          y          y          y
lancet                    y          y          y          y          y
mla citation              y          n          y          y          y
time on task (minutes)    8:28       2:49       6:30       2:41       1:42
average time on task: 4:26 mins.

table 2. task completion success and time, control group 1

the participants who received the static text-and-image tutorial performed the best, completing the task with the highest speed and with the greatest accuracy (see table 3).
all five of the participants in this group managed to find appropriate articles and to download the citation in mla format, though several had difficulty with the final task. all were able to navigate to the “cite” feature effectively, but all participants chose to click on the “mla” link rather than simply copy the citation. clearer directions in the tutorial might alleviate this problem.

                          t&i 1      t&i 2      t&i 3      t&i 4      t&i 5
relevant article          y          y          y          y          y
lancet                    y          y          y          y          y
mla citation              y          n          y          y          y
time on task (minutes)    2:01       3:00       2:21       2:40       3:15
average time on task: 2:39 mins.

table 3. task completion success and time, text and image tutorial, group 1

participants who received the dynamic video tutorial were more successful than those in the control group, but spent significantly longer on task than did those who received the static tutorial (see table 4). interestingly, two of the participants searched for “climate change” as the “subject term” in asp ebsco, even though the tutorial did not instruct them to do so. (su subject term is one of the options in the drop-down menu in asp ebsco, which otherwise searches citation and abstract by default.) while “climate change” is a commonly accepted scientific term, and the searches produced relevant search results, it is not generally advisable to begin a search with controlled vocabulary terms.

                          video 1    video 2    video 3    video 4    video 5
relevant article          y          y          y          y          y
lancet                    y          y          y          y          y
mla citation              y          y          y          y          y
time on task (minutes)    4:34       3:17       3:17       3:07       3:28
average time on task: 3:32 mins.

table 4. task completion success and time, dynamic a/v tutorial, group 1

figure 1. average time on task in minutes, group 1

figure 2. successful task completion: group 1

group 2

the advantages of text-and-image instruction were more pronounced in the second round of tests, which involved a more complex task (see figure 3).
as in the first round of tests, the participants in the control group had the lowest number of satisfactory task completions, and spent the greatest amount of time on task. although most of the participants in control group 2 had at least partial success in completing the task, most did so through trial and error, and showed a general lack of understanding of database terminology and functions. one participant, for example, attempted to use “peer-review” and “color photographs” as search terms. another attempted to search for “deepwater horizon” as a journal title. only two of the participants completed all components of the task successfully. two others partially completed the task: one found a suitable article with color photographs, but published in the nation, which is not peer-reviewed. one user failed to complete any part of the task and gave up in frustration (see table 5).

                          control 1  control 2  control 3  control 4  control 5
relevant article          y          n          y          y          y
peer-reviewed             y          n          y          n          y
publication date          y          n          n          y          y
color photos              y          n          y          y          y
mla citation              y          n          y          n          y
time on task (minutes)    1:51       7:39       2:54       9:16       7:55
average time on task: 5:55 mins.

table 5. task completion and success and time, control group 2

in contrast, participants who received the text-and-image tutorial enjoyed the most success in round 2. three of the five participants who received the static tutorial completed all components of the task successfully. errors committed by the two others were related to publication date. participants in this group also completed the task more rapidly than those from the other two groups.
                          t&i 1      t&i 2      t&i 3      t&i 4      t&i 5
relevant article          y          y          y          y          y
peer-reviewed             y          y          y          y          y
publication date          y          y          n          n          y
color photos              y          y          y          y          y
mla citation              y          y          y          n          y
time on task (minutes)    6:33       2:46       3:00       4:50       3:24
average time on task: 4:06 mins.

table 6. task completion and success and time, text and image tutorial group 2

as in group 1, however, all but one of the participants who received the text-and-image tutorial first attempted to download the mla citation by clicking on the “mla” link, rather than simply copying the text. two of the participants referred back to the tutorials after they had begun the task, which was permissible according to the facilitator’s instructions. this suggests that the text-and-image tutorials are suitable for quick reference and allow users to access needed information at a glance.

                          video 1    video 2    video 3    video 4    video 5
relevant article          y          y          y          y          y
peer-reviewed             y          y          y          y          n
publication date          n          n          n          y          n
color photos              y          y          n          y          y
mla citation              y          y          n          y          y
time on task (minutes)    4:13       5:39       6:33       3:59       4:40
average time on task: 4:57 mins.

table 7. task completion and success and time, a/v tutorial group 2

among the five participants who received the dynamic audio/visual tutorial, only one completed all five components of the test successfully. one was unable to locate the citation feature, while another failed to limit to peer-reviewed articles. four of the participants limited the publication date from 2011 to the present instead of 2012 to the present. all participants correctly used the publication limiter. although given the option, none chose to return to the dynamic tutorial after starting the task. this might be because of the length of the tutorial (more than three minutes) and the difficulty in navigating to specific sections.
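the average time on task reported under each table can be recomputed from the individual times. the following is a minimal sketch in python using the group 1 times from tables 2, 3, and 4; the function and variable names are illustrative only, and dropping fractional seconds (rather than rounding) is an assumption, adopted here because it reproduces the averages reported for group 1.

```python
def to_seconds(mmss: str) -> int:
    """Convert a 'm:ss' time-on-task string to whole seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def to_mmss(total_seconds: int) -> str:
    """Format whole seconds back into 'm:ss'."""
    return f"{total_seconds // 60}:{total_seconds % 60:02d}"


def average_time_on_task(times: list[str]) -> str:
    """Mean time on task, with fractional seconds dropped (an assumption)."""
    total = sum(to_seconds(t) for t in times)
    return to_mmss(total // len(times))


# Individual times copied from tables 2, 3, and 4 (group 1).
GROUP_1_TIMES = {
    "control": ["8:28", "2:49", "6:30", "2:41", "1:42"],
    "text-and-image": ["2:01", "3:00", "2:21", "2:40", "3:15"],
    "dynamic audio/video": ["4:34", "3:17", "3:17", "3:07", "3:28"],
}

for condition, times in GROUP_1_TIMES.items():
    print(f"{condition}: {average_time_on_task(times)}")
```

working in whole seconds avoids treating "2:49" as the decimal 2.49, which would distort the mean.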
as noted above, participants in all groups tended to make errors related to publication date, which may have stemmed from the wording of the task itself rather than from misunderstanding the functionality of the database. the task required participants to find articles published after 2011, but many found articles published from 2011 onward. clearer wording of the task probably would have alleviated this problem.

figure 3. average time on task in minutes, group 2

figure 4. successful task completion, group 2

tutorial feedback

after completing the task, participants were asked to provide anonymous, written feedback on the instruction they received. (members of the control groups were not asked to provide feedback because the purpose of the study was to compare different types of library tutorials.) participants were asked ten questions, eight of which were on a likert scale and two of which were open-ended. although the feedback for both the static and dynamic tutorials was generally positive, the text-and-image tutorials received higher combined scores than the audio/visual tutorials on the likert scale questions (see figures 5 and 6). participants’ written feedback on the text-and-image tutorials was generally more positive than for the video tutorials. commenting on the text-and-image tutorial, one participant remarked that it was a “great resource” while another said that it was “very easy to use. will become really helpful when put into full effect.” another observed that the tutorial “was pretty precise.” not all the comments on the text-and-image tutorials were positive, however. more than one participant noted that the images used in the tutorials were blurry.
one even suggested that “more animations to the text would make it much more open to people with different learning styles.” the feedback on the video tutorials was generally positive, with comments such as “very straightforward,” “helpful,” “easy to follow,” and “i would use this for school assignments.” however, a common complaint about the dynamic tutorials was that the audio was not very clear. (this may be because of the quality of the microphone used for the recordings.) other participants seemed to criticize the layout of the database itself, saying that a bigger size of words would have made it easier to follow. another complained that the dynamic tutorial was too simple, and that it should cover more advanced and in-depth topics.

figure 5. tutorial feedback likert score averages group 1 (scale 0–5; series: text tutorial group #1, video tutorial group #1)

figure 6. tutorial feedback likert score averages group 2

conclusion

this study suggests that library users benefit from online library instruction at point-of-need, and that text-and-image tutorials are more effective than dynamic audio/visual tutorials for its provision. librarians should not assume that instructional tutorials must use flash or other video technology, especially given the learning curve, time, and financial commitments involved in creating video tutorial software. although the researchers in this study used the free software jing, learning to use it effectively was still a significant investment in time. more importantly, it is evident that the participants learned more and were more satisfied with text-and-image tutorials, which were more easily navigated than dynamic audio/video tutorials and which allowed users to more easily review tutorial content than did dynamic audio/video tutorials.
this study corroborates the findings of mestre, who found that text-and-image tutorials were more effective than audio/video tutorials in teaching library skills.37 it also lends credence to the work of bowles-terry, hensley, and hinchliffe, who found that users preferred tutorials that allowed them to read quickly and navigate to pertinent sections rather than watch and listen.38 as lim suggests, it is important to create instructional material that is clearly written.39 this study further suggests that regardless of the technology used, librarians should focus on creating content that is relevant and helpful to our user population.

again, it is worth noting that the control group, without the aid of point-of-need instructional materials, achieved some success in completing the tasks. it is possible that the members of the control group gained important knowledge simply by being told about asp ebsco and that there was enough implied information in the tasks themselves to provide basic information about the content and functionalities of the database. this suggests that databases like asp ebsco are intuitive enough that people can learn how to use them independently. the higher number of serious errors, and the greater length of time members of the control group spent on tasks, however, show that efforts to raise student awareness of databases and library resources should be coupled with point-of-need instruction.

although the usability tests generally went smoothly, researchers did encounter occasional difficulties with audio between the testing room and the observation room when it became difficult to hear what the participant was saying as he or she was completing the task. fortunately, the researchers kept recordings of each test, which allowed them to review those where the audio quality was less than optimal.
to save time and run the tests more efficiently, however, the researchers recommend purchasing a high-quality microphone like those used for teleconferences.

furthermore, this study shows the broader value of usability testing of library instructional material. although participants who received the text-and-image tutorials performed better than either of the other two groups, the tests helped researchers identify two problems with the tutorials: users found the images blurry and often misinterpreted how to download citations in mla format. such information gleaned from the user’s perspective would be valuable in creating future library online point-of-need instructional tutorials.

references

1. i. elaine allen and jeff seaman, “grade change: tracking online education in the united states, 2013 | the sloan consortium,” sloanconsortium.org, 2013, sloanconsortium.org/publications/survey/grade-change-2013.
2. m. walter, “as online journals advance, new challenges emerge,” seybold report on internet publishing 3, no. 1 (1998).
3. rebecca miller, “dramatic growth,” library journal 136, no. 17 (october 15, 2011): 32, www.thedigitalshift.com/2011/10/ebooks/dramatic-growth-ljs-second-annual-ebook-survey.
4. megan von isenburg, “undergraduate student use of the physical and virtual library varies according to academic discipline,” evidence based library & information practice 5, no. 1 (april 2010): 130.
5. sharon q. yang and min chou, “promoting and teaching information literacy on the internet: surveying the web sites of 264 academic libraries in north america,” journal of web librarianship 8, no. 1 (2014): 88–104, doi: 10.1080/19322909.2014.855586.
6. tess tobin and martin kesselman, “evaluation of web-based library instruction programs,” www.eric.ed.gov/ericwebportal/contentdelivery/servlet/ericservlet?accno=ed441454.
7. nancy h.
dewald, “transporting good library instruction practices into the web environment: an analysis of online tutorials,” journal of academic librarianship 25, no. 1 (january 1999): 26–31.
8. ibid.
9. nancy h. dewald, “web-based library instruction: what is good pedagogy?,” information technology & libraries 18, no. 1 (march 1999): 26–31.
10. kelly a. donaldson, “library research success: designing an online tutorial to teach information literacy skills to first-year students,” internet & higher education 2, no. 4 (january 2, 1999): 237–51, doi: 10.1016/s1096-7516(00)00025-7.
11. marion churkovich and christine oughtred, “can an online tutorial pass the test for library instruction? an evaluation and comparison of library skills instruction methods for first year students at deakin university,” australian academic research libraries 33, no. 1 (march 2002): 25–38.
12. stephanie michel, “what do they really think? assessing student and faculty perspectives of a web-based tutorial to library research,” college & research libraries 62, no. 4 (july 2001): 317–32.
13. brenda battleson, austin booth, and jane weintrop, “usability testing of an academic library web site: a case study,” journal of academic librarianship 27, no. 3 (may 2001): 194.
14. barbara j. cockrell and elaine anderson jayne, “how do i find an article? insights from a web usability study,” journal of academic librarianship 28, no. 3 (may 2002): 122–32, doi: 10.1016/s0099-1333(02)00279-3.
15.
brian detlor and vivian lewis, “academic library web sites: current practice and future directions,” journal of academic librarianship 32, no. 3 (may 2006): 251–58, doi: 10.1016/j.acalib.2006.02.007.
16. adriene lim, “the readability of information literacy content on academic library web sites,” journal of academic librarianship 36, no. 4 (july 2010): 296–303, doi: 10.1016/j.acalib.2010.05.003.
17. janice krueger, ron l. ray, and lorrie knight, “applying web usability techniques to assess student awareness of library web resources,” journal of academic librarianship 30, no. 4 (july 2004): 285–93, doi: 10.1016/j.acalib.2004.04.002.
18. amy fry and linda rich, “usability testing for e-resource discovery: how students find and choose e-resources using library web sites,” journal of academic librarianship 37, no. 5 (september 2011): 386–401, doi: 10.1016/j.acalib.2011.06.003.
19. rebeca befus and katrina byrne, “redesigned with them in mind: evaluating an online library information literacy tutorial,” urban library journal 17, no. 1 (spring 2011): 1–26.
20. donald l. kirkpatrick, evaluating training programs: the four levels (san francisco: berrett-koehler; publishers group west [distributor], 1994).
21. rebeca befus and katrina byrne, “redesigned with them in mind: evaluating an online library information literacy tutorial,” urban library journal 17, no. 1 (spring 2011): 1–26.
22. sharon a. weiner et al., “biology and nursing students’ perceptions of a web-based information literacy tutorial,” communications in information literacy 5, no. 2 (september 2011): 187–201.
23.
janet martin, jane birks, and fiona hunt, “designing for users: online information literacy in the middle east,” portal: libraries & the academy 10, no. 1 (january 2010): 57–73.
24. melissa bowles-terry, merinda kaye hensley, and lisa janicke hinchliffe, “best practices for online video tutorials in academic libraries: a study of student preferences and understanding,” communications in information literacy 4, no. 1 (march 2010): 17–28.
25. paul betty, “creation, management, and assessment of library screencasts: the regis libraries animated tutorials project,” part of a special issue on the proceedings of the thirteenth off-campus library services conference, part 1 48, no. 3/4 (october 2008): 295–315, doi: 10.1080/01930820802289342.
26. lori s. mestre, “matching up learning styles with learning objects: what’s effective?,” journal of library administration 50, no. 7/8 (december 2010): 808–29, doi: 10.1080/01930826.2010.488975.
27. sara l. thornes, “creating an online tutorial to support information literacy and academic skills development,” journal of information literacy 6, no. 1 (june 2012): 81–95.
28. richard d. jones and simon bains, “using macromedia flash to create online information skills materials at edinburgh university library,” electronic library & information systems 37, no. 4 (december 2003): 242–50, www.era.lib.ed.ac.uk/handle/1842/248.
29. thomas p. mackey and jinwon ho, “exploring the relationships between web usability and students’ perceived learning in web-based multimedia (wbmm) tutorials,” computers & education 50, no. 1 (january 2008): 386–409.
30. amy blevins and c. w. elton, “an evaluation of three tutorial-creating software programs: camtasia, powerpoint, and mediasite,” journal of electronic resources in medical libraries 6, no. 1 (march 2009): 1–7, doi: 10.1080/15424060802705095.
31. ibid., 2.
32. ibid.
33. alyse ergood, kristy padron, and lauri rebar, “making library screencast tutorials: factors and processes,” internet reference services quarterly 17, no. 2 (april 2012): 95–107, doi: 10.1080/10875301.2012.725705.
34. lori s. mestre, “student preference for tutorial design: a usability study,” reference services review 40, no. 2 (may 2012): 258–76, http://dx.doi.org/10.1108/00907321211228318.
35. steve krug, rocket surgery made easy: the do-it-yourself guide to finding and fixing usability problems (berkeley, ca: new riders, 2010), 13.
36. jakob nielsen, “why you only need to test with 5 users,” nielsen norman group, march 19, 2000, www.nngroup.com/articles/why-you-only-need-to-test-with-5-users.
37. mestre, “student preference for tutorial design,” 258.
38. bowles-terry, hensley, and hinchliffe, “best practices for online video tutorials in academic libraries,” 22.
39. lim, “the readability of information literacy content on academic library web sites,” 302.

appendix i. task 1

in academic search premier (ebsco), find an article about climate change, published in lancet. then copy a citation to the article in mla format.

appendix ii. task 2

complete the following task using academic search premier (ebsco). take as long as you need. remember also to “think out loud” through the process.
a) find an article about deepwater horizon oil spill published in a peer-reviewed journal after 2011, which includes color photographs.
b) after you find an article, copy its citation in mla format.

appendix iii.
dynamic audio/video tutorials

group 1 (basic): http://screencast.com/t/5uln4h8xr
group 2 (advanced): http://screencast.com/t/c9kzkgofx6

appendix iv. text-and-image tutorial 1

appendix v. text-and-image tutorial 2

appendix vi. information sheet

st. john’s university libraries web site usability study information sheet

thank you for participating in the sj libraries’ usability study! before beginning the test, please read the following:
• the computer screen, your voice, and the voice of the facilitator will be recorded.
• the results of this study may be published in an article, but no identifying information will be included in the article.
• your participation in this study is totally confidential.
• you may stop participating in the study at any time, and for any reason.

appendix vii. tutorial questionnaire

thank you for participating in the st. john’s university libraries’ tutorial usability study. please take a few moments to answer this brief survey. please refer to the following scale when answering the questionnaire, and circle the correct response.

1 = no, not at all
2 = not likely
3 = neutral (not sure, maybe)
4 = likely
5 = yes, absolutely

1. the tutorial was easy to follow. 1 2 3 4 5
2. i felt comfortable using the tutorial. 1 2 3 4 5
3. the graphics on the tutorial were easy to use. 1 2 3 4 5
4. the language/text on the tutorial was easy to understand. 1 2 3 4 5
5. i would use stj libraries’ tutorials on my own in the future. 1 2 3 4 5
6.
i would recommend the stj libraries’ tutorials to my friends. 1 2 3 4 5
7. i was able to complete the tasks with ease. 1 2 3 4 5
8. i would be able to repeat the task now without the aid of the tutorial. 1 2 3 4 5
9. what changes would you make to the tutorial? additional comments and suggestions?

usability testing for greater impact: a primo case study

joy marie perrin, melanie clark, esther de-leon, lynne edgar

information technology and libraries | december 2014 57

abstract

this case study focuses on a usability test conducted by four librarians at texas tech university (ttu). eight students were asked to complete a series of tasks using onesearch, the ttu libraries’ implementation of the primo discovery tool. based on the test, the team identified three major usability problems, as well as potential solutions. these problems typify the difficulties patrons face while using library search tools, but they have a variety of simple solutions.

introduction

the texas tech university libraries’ usability taskforce was created to inform, facilitate, and promote usability initiatives for services supporting teaching, learning, and research. the team’s first assignment was to study the libraries’ new implementation of primo, a discovery tool by ex libris, which is capable of simultaneously searching all library resources. primo, branded onesearch for public use at the ttu libraries, was initially implemented with no further customization.
library administration charged the team to evaluate the primo interface as set up, determine whether the tool served patrons in an intuitive way, identify problem areas, and share possible improvements with the library systems group. the issues the team encountered, problems found, and lessons learned along the way are relevant across all library usability efforts and may assist other organizations in developing better searching tools.

the purpose of this study was to evaluate how well onesearch served library patrons and to identify ways it could be improved before it replaced the existing library search tools. the information collected about the website navigation and searching practices of ttu students was also expected to assist instruction librarians in teaching students how to use onesearch.

joy marie perrin (joy.m.perrin@ttu.edu) is assistant librarian, digital resources unit, melanie clark (melanie.clark@ttu.edu) is associate librarian, architecture library, esther de-leon (esther.de-leon@ttu.edu) is assistant librarian for electronic resources, lynne edgar (lynne.edgar@ttu.edu) is assistant librarian, library systems office, texas tech university library, lubbock, texas.

method

the usability study comprised two components to evaluate both onesearch use and patron thoughts, comments, and observations. the first component was a series of seven tasks that participants completed using onesearch while the team observed. each participant was guided through the process by a facilitator who would prompt the participant when he or she got stuck.
the rest of the team observed both the participant’s screen movements and audiovisual footage of the participant’s facial reactions from another room, with the help of techsmith’s morae usability software. while morae made the observation process easier, the same results could have been achieved through simple screen-capture software, a video camera, or simple note taking by the facilitator. in addition to the observation, the team used retrospective recall, asking the patrons to think through their choices after the tasks were done and explain their process.1

the second component to the study was the system usability scale (sus), a standard survey used to evaluate systems based on self-reported user experience.2 participants completed the sus survey after finishing the tasks.

for the first component, the tasks were developed to cover seven types of materials a patron might find using the search tool:

1. you are looking for a work called “operations management” by roberta russell. find out if the library has this book and if you can check it out. if the library has it, where is it located?
2. you are not on campus, but you want to read a book about human resources management. see if there are any books available online and if you find them, try to read one of them.
3. find the database jstor and open it.
4. you need to find a full-text online article about customer service training. try to find an article and view it.
5. you want to see a picture of someone from the 1977 volume of the texas tech yearbook (la ventana). locate the yearbook and then open the first page of 1977.
6. find dr. rebecca k. worley’s ttu thesis the house of yes: a director’s approach and find the abstract of the thesis.
7. you need to find a picture of frank lloyd wright’s fallingwater. find the picture and access it.

the order of the tasks was important to test the learnability of the system and minimize user frustration. the first task was a simple book search, allowing participants to familiarize themselves with the tool. the difficulty of the searches increased over the next three tasks. the method of “start easy, finish hard” is recommended in the cuep workshop to test memorability and learnability of the website.3 however, the team varied this model by designing the last three tasks to be similar to the first two. these tasks each requested a different material type, but all the materials could be found with a simple search identical to that of the first task. this design showed whether participants learned how to use the system and remembered the process. participants struggling with the last tasks would indicate a severe usability problem.

the team timed participants’ performances with each task and noted each error or problem encountered. each task was labeled as either a success or failure for each participant depending on whether he or she completed the task, had to be guided to complete it, or gave up.

participants

eight patrons participated in the study. from a demographic profile that participants filled out prior to completing the tasks, the team identified three expert users, three intermediate users, and two novices.
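For readers unfamiliar with the system usability scale used as the second component of the method, the sketch below shows the conventional SUS scoring rule: ten items rated 1 to 5, odd-numbered items contributing (rating - 1), even-numbered items contributing (5 - rating), and the sum multiplied by 2.5 to give a 0 to 100 score. The article does not publish individual responses, so the ratings here are hypothetical and the function names are illustrative only; this is not the team's own tooling.

```python
from statistics import mean


def sus_score(ratings):
    """Return the 0-100 SUS score for one respondent's ten 1-5 ratings."""
    if len(ratings) != 10 or any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("expected ten integer ratings between 1 and 5")
    total = 0
    for item_number, rating in enumerate(ratings, start=1):
        if item_number % 2 == 1:
            # Odd items are positively worded: higher agreement is better.
            total += rating - 1
        else:
            # Even items are negatively worded: lower agreement is better.
            total += 5 - rating
    return total * 2.5


def study_sus(all_ratings):
    """Average the per-respondent SUS scores into one study-level score."""
    return mean(sus_score(r) for r in all_ratings)


# Hypothetical responses from two participants (not data from the study).
example_responses = [
    [4, 2, 4, 1, 4, 2, 5, 2, 4, 2],
    [5, 1, 5, 1, 5, 1, 5, 1, 5, 1],
]

if __name__ == "__main__":
    for ratings in example_responses:
        print(sus_score(ratings))
    print(study_sus(example_responses))
```

A study-level score is simply the mean of the individual scores, which is how a single number such as the one reported for onesearch would be obtained from eight completed surveys.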
student classification, how frequently the participant used the library website, and the ways the participant used the library website all factored into the user status.

results

system usability scale score

the system usability scale “grades” a website or system by how usable patrons perceive the system to be, resulting in a single number score. a score above 80.3 is in the top 10 percent and is considered to be an a. a score of 68 is average, with anything under 68 below average.4 onesearch received a sus score of 78.25 from the eight participants of the study. this is comparable to a b+, indicating that overall, the implementation of onesearch was successful, at least in terms of how students perceived it after using it. to identify specific problems with the interface, the team looked at three factors of the participants’ performances on the tasks.

average time spent on each task

table 1 shows the statistics of the seven tasks. as expected, according to the average completion time, participants spent more time on the first task. it may be inferred that they were acquainting themselves with the system. the fourth task, which the team expected to be the most difficult, proved to have the longest completion time. from the completion time alone, the team was unable to determine if task 4 was problematic or not. while participants started their search with the onesearch interface, they had to wait for a separate integrated system, such as a citation linker, to retrieve any found articles. this added to the task completion time.

| task | average completion time (min) | completed with ease | completed with difficulty | average error rate |
|---|---|---|---|---|
| 1. find a book | 1.74 | 50% | 50% | 1 |
| 2. find an e-book | 0.97 | 87.5% | 12.5% | 0.63 |
| 3. open a database | 1.11 | 37.5% | 62.5% | 3.25 |
| 4. find an article | 2.92 | 37.5% | 50% (1 did not complete the task) | 3.63 |
| 5. find a digital collection item | 1.12 | 62.5% | 37.5% | 0.75 |
| 6. find a thesis | 0.87 | 87.5% | 12.5% | 0.38 |
| 7. find an image | 0.70 | 87.5% | 12.5% | 0.38 |

table 1. task results

a more telling observation was that the time on tasks, excluding task 4, diminished between task 1 and task 7, suggesting that participants had no trouble learning and remembering how to use the system.

error rate for each task

the error rate proved to be the most accurate indicator of usability problems with each task. each time a participant chose a wrong path, faced an impasse, or had to be guided by the facilitator, the event was labeled as an error. in the ideal scenario, the participants would make no errors; therefore, the more errors observed, the greater indication of a usability problem.

as seen in table 1, although the first task took an average of 1.74 minutes, the eight participants tended to make only one mistake while trying to find a book. tasks 2, 5, 6, and 7 had an average error rate below 1. because the error rate seems to decline from task 1 to task 7 (excluding tasks 3 and 4), the team inferred that users were able to learn how to use the system quite easily.

tasks 3 and 4 seemed to cause problems. both of these tasks had an average error rate above 3.
this indicated to the team that, since the sus score for the entire system was good, they needed to identify why the database search and article search were problematic.

success rate for each task

table 1 also shows the three ways that tasks were tagged during the study. the participants completed the task with ease, completed the task with difficulty, or failed to complete the task. as the team expected, the first task was divided between those who completed it with ease and those who completed it with difficulty. this was expected when the participants were learning the system. task 2 shows a marked improvement: 87.5 percent of the participants found an e-book with ease. tasks 3 and 4, however, show up again as problematic. one of the participants failed to complete task 4. after tasks 3 and 4, the participants successfully regrouped, with 87.5 percent completing the last two tasks with ease.

average completion time

| user level | task 1 | task 2 | task 3 | task 4 | task 5 | task 6 | task 7 |
|---|---|---|---|---|---|---|---|
| novice | 1.41 | 0.62 | 0.93 | 2.35 | 0.57 | 0.72 | 0.67 |
| intermediate | 2.28 | 1.13 | 1.57 | 1.81 | 1.27 | 0.86 | 0.37 |
| expert | 1.44 | 1.04 | 0.77 | 4.42 | 1.35 | 0.98 | 1.06 |

average error rate

| user level | task 1 | task 2 | task 3 | task 4 | task 5 | task 6 | task 7 |
|---|---|---|---|---|---|---|---|
| novice | 0 | 0 | 3.5 | 4.5 | 0.5 | 1 | 0.5 |
| intermediate | 2.33 | 0.33 | 3.67 | 1.67 | 0.67 | 0.33 | 0.33 |
| expert | 0.33 | 1.33 | 2.67 | 5 | 1 | 0 | 0.33 |

table 2. task results by experience level

results by experience level

based on the team’s observation, novices used the simplest approach to each task.
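As a consistency check on table 2, the per-level error rates can be pooled back into an overall average by weighting each level by its group size (two novices, three intermediate users, and three experts, as reported in the participants section) and compared with the overall error rates in table 1. The sketch below is illustrative arithmetic only; the variable and function names are not from the article, and small differences are expected because the published figures are rounded to two decimals.

```python
# Group sizes as reported in the participants section.
GROUP_SIZES = {"novice": 2, "intermediate": 3, "expert": 3}

# Average error rate per task (tasks 1-7) by user level, from table 2.
ERROR_RATE_BY_LEVEL = {
    "novice": [0, 0, 3.5, 4.5, 0.5, 1, 0.5],
    "intermediate": [2.33, 0.33, 3.67, 1.67, 0.67, 0.33, 0.33],
    "expert": [0.33, 1.33, 2.67, 5, 1, 0, 0.33],
}

# Overall average error rate per task (tasks 1-7), from table 1.
TABLE_1_ERROR_RATE = [1, 0.63, 3.25, 3.63, 0.75, 0.38, 0.38]


def pooled_average(by_level, sizes):
    """Weight each level's per-task average by its group size and pool them."""
    participants = sum(sizes.values())
    task_count = len(next(iter(by_level.values())))
    return [
        sum(by_level[level][task] * sizes[level] for level in sizes) / participants
        for task in range(task_count)
    ]


pooled = pooled_average(ERROR_RATE_BY_LEVEL, GROUP_SIZES)

if __name__ == "__main__":
    for task, (value, reported) in enumerate(zip(pooled, TABLE_1_ERROR_RATE), start=1):
        print(f"task {task}: pooled {value:.2f}, table 1 reports {reported}")
```

The pooled values agree with table 1 to within rounding, which suggests the two tables describe the same eight sessions. It also shows how much a single small group can move a per-level average, since each novice accounts for half of the novice figure.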
the intermediate and expert users sometimes extended their task completion time by performing more complex searches. almost without exception, the first thing the more experienced users did for each task was to look at the dropdown menus or go to “advanced search,” even when a general search would provide sufficient results. as shown in table 2, this approach lengthened their completion time and increased their error rates on the first two tasks, but the novice error rates increased dramatically on the most difficult tasks (3 and 4), surpassing those of the intermediate and expert users. the tendency of more experienced users to perform complex searches did not negatively affect their success in completing each task.

figure 1. onesearch home page

problems identified

figure 1 is a screen shot of the onesearch interface used for the study. as shown, there are three tabs for different searches. the first tab, “texas tech libraries,” searches the print collections, institutional repository, digital collections, special collections, and e-books. the second tab, “articles by subject,” is a federated database search that can be narrowed by subject. to find a specific database, users had to click on the a–z link either in the upper right corner or at the lower left under the image.

the test indicated that users faced the most trouble when searching for articles and opening specific databases. the team identified three problems based on the difficulties the participants experienced.
all of these problems were related to visual design and clarity rather than the basic functionality of onesearch. identifying problems, however, does not address how to best resolve those problems. observing participants using the interface, as well as researching current web standards, may lead to educated guesses as to possible solutions. knowing the limitations of how the onesearch interface design could be changed, the team decided to offer as many possible solutions as could be identified. this was to highlight that there was no single way to fix the problems, but many different ways the problem could be solved. the team put the options in order of their expected effectiveness on the basis of what was observed during the test.

problem 1: individual databases are difficult to find

while analyzing the task completion statistics and video footage, it became clear that the way to access the databases was not visible to users. as shown in figure 1, the “databases a–z” link was at the very bottom of the page in a location not immediately noticeable. in addition, the formatting on the link did not look like an obvious link. even if users saw it, they might not realize it was something they could click on.

solution 1: if possible, make “find databases” a search scope in the dropdown menu. the participants were more willing to go to the menu than to look around the page. this meant that the best place for users to be able to search databases was by selecting them on the dropdown menu. this solution was not possible in onesearch, however.

solution 2: make “find databases” a fourth tab.
participants were also more likely to look through the tabs than they were to see the database link at the bottom of the page.

solution 3: move “find databases” or “databases a–z” to a different part of the page. figure 2 is a mock-up of different ways this option could be implemented. this was to highlight that as long as the link was put higher on the page, in a more visible place, users would be more likely to find it.

solution 4: make “databases a–z” bigger or more eye-catching by changing the color. this would increase the likelihood of it being seen.

information technology and libraries | december 2014 63

figure 2. suggested locations for databases a–z

problem 2: dropdown limiters are misleading

in figure 1, three search limiters can be seen below the search box. the ones shown in figure 1 are “all items,” “that contain my query words,” and “anywhere in the record.” the first box provides a way to limit the results by type, such as “books” or “ejournals.” the second box offers a choice between a general search for the query words, an exact phrase search, and a search for fields that start with the query words. the last box offers a way to specify which field the query is searching. some of the participants ran unsuccessful searches during the test, and then erroneously tried to get results by “limiting” the items in their already faulty search. for example, a few students went to the “articles by subject” tab and searched for an image (task 7). when no images came up, they went to the limiters and chose the format “images.” this resulted in zero results returned because there were no images in the “articles by subject” tab.
again, the team gave three different options in order of their perceived effectiveness.

solution 1: on the “texas tech libraries” tab, remove “articles” from the dropdown limiter. the reason behind this was that there were no articles in the basic search, so it should not be an option.

solution 2: on the “texas tech libraries” tab, remove “databases” from the dropdown limiter (unless “find databases” can work as a search scope).

solution 3: on the “articles by subject” tab, remove “images” and “journals” from the dropdown limiter.

however, most participants did expect all of the content to be in the main dropdown menu and did not tend to use the tabs or limiters. the team offered a mock-up of the best possible scenario in which the only options to search were in the main dropdown (figure 3). this would have been the most intuitive way for patrons to search, but it was also the most technically complex to implement.

figure 3. dropdown menu suggestion

problem 3: “articles by subject” tab is not visible to users

the “articles by subject” tab, which allows users to choose a federated database search by subject, was not visible to participants. the text was smaller than other surrounding text and not immediately recognizable as a tab. one of the reasons that the tabs were removed from figure 3 is that they were not easy for users to see. the team recommended three options.

solution 1: make the tabs more visually identifiable by designing them more like traditional website tabs.

solution 2: enlarge the tab text so that the tabs are more visible.
figure 4 shows a mock-up with enlarged, more noticeable text.

solution 3: add an “articles by subject” description to the “what am i searching?” text. the team noticed that the explanatory text on the right side of the page did not include a description of the “articles by subject” tab.

figure 4. enlarged tabs

other minor problems

in addition to the three major problems, the team noted smaller issues. one of the problems was that some of the titles of the dropdown search scopes were not in terminology that the students understood. for example, one of the scopes is “thinktech,” the name of texas tech’s institutional repository. since this name doesn’t indicate what the scope actually searches (mainly theses and dissertations), users didn’t know what was in “thinktech” unless they read the explanatory text on the right. the team recommended changing the scope name to something more descriptive, such as “theses, dissertations, faculty research.”

another small usability problem was how difficult it was to see the indication of an incorrect search. the “did you mean?” spelling suggestion on the search result page was very small, smaller than the notification that there were no results. participants who made simple spelling errors didn’t realize they had failed because of this simple error and assumed there were no results.

discussion

the team submitted these findings to the library systems group in two different ways: a written report and a presentation including the above mock-ups, charts, and video footage from the usability study.
the video clips allowed the team to both illustrate the problems and show the systems group the sources of those problems. the presentation with more multimedia content made a much greater impact than the written report and resulted in the systems group better understanding the problems and how to fix them. visual aids are an effective way to report results because of how crucial visibility is in usability. one example of how the presentation was more effective is that the team played a game with the systems group by showing figure 2 without the “databases a–z” links circled. the systems group was asked to raise their hands when they first saw the link. after everyone raised their hands, the next slide showed all the locations where the “databases a–z” link could be found. most of the group had not realized there was more than one link, illustrating that some positions are more noticeable than others and that users tend not to linger on a page. this was more effective than a dry report stating this fact.

implementing usability findings is often more difficult than identifying them, particularly when usability testing is conducted on a “finished” system. not all the problems the team identified could be addressed, but some problems were fixed quickly and easily, such as including the link “find databases” at the top of the page (see problem 1, solution 3 above). what the usability study did was allow the systems group to understand how patrons view their tool and how they are likely to work with it.
conclusion

these small changes made the system more usable for patrons, which is what usability testing is all about. it is less about making a system conform to a single way of doing things than finding small ways that the system can be made easier to use. changing the name of the search scopes or changing the position of a link is a relatively small investment of time and resources that yields great benefits for patrons in making the system easier to use.

one of the most interesting observations from this study was that most of the users wanted all their search options in one place. they preferred one dropdown menu to handle all their needs. for future development on these kinds of systems, this kind of preference should be kept in mind. the majority of patrons might be happier with a tool with less capability and simpler options rather than a complex tool with many different ways to approach their search.

references

1. brian still and m. betz, a study guide for the certified user experience professional (cuep) workshop (lubbock: texas tech university, 2011), 61.
2. jeff sauro, “measuring usability with the system usability scale (sus),” measuring usability, february 2, 2011, http://www.measuringusability.com/sus.php.
3. still and betz, a study guide, 67.
4. sauro, “measuring usability with the system usability scale (sus).”

reproduced with permission of the copyright owner. further reproduction prohibited without permission.

editorial: i inhaled
helmer, john f
information technology and libraries; jun 2000; 19, 2; proquest pg. 59

this editorial introduces the third special issue of information technology and libraries dedicated to library consortia, and the second primarily aimed at surveying consortial activities outside the united states.1 the concept of a special consortial issue began in 1997 as an outgrowth of a sporadic and wide-ranging discussion with jim kopp, editor of ital 1996-98. at the time, jim and i were involved in the creation and maturation of the orbis consortium in oregon and washington. jim was a member and later chair of the governing council and i was chief volunteer staff person and was finding myself increasingly absorbed by consortial work. our discussions lasted more than a year and were sustained by many e-mail messages and several enjoyable conversations over bottles of nut brown ale.

in the mid-1990s it seemed obvious that we were witnessing the beginning of a renaissance in library consortia. consortia had been around for many years but now established groups were showing renewed vigor and new groups seemed to be forming every day. why was this happening? what were all these consortia doing? jim and i discussed these questions and speculated on future roles for library consortia and their impact on member libraries. library consortia seemed an ideal topic for a special issue of ital.

my initial goal as guest editor of ital was to take a snapshot of a variety of consortia and begin to better understand the implications of the explosive growth we were witnessing. while assembling the march 1998 issue i soon realized that consortia were all over the map, both figuratively and literally. a small amount of study revealed a tremendous variety of consortia and a truly worldwide distribution. although american consortia were starting to receive attention in the professional literature, a great deal of important work was occurring abroad.
this realization gave rise to the september 1999 issue and the present issue dedicated to consortia from around the world.

in addition to six articles from the united states, these three special issues of ital include contributions from south africa, canada, israel, spain, australia, brazil, china, italy, micronesia, and the united kingdom. taken together these groups represent a dizzying array of organizing principles, membership models, governance structures, and funding models. although most are geographically defined, the type of library they serve also defines many. virtually all license electronic resources for their membership but many offer a wide variety of other services including shared catalogs, union catalogs, patron-initiated borrowing systems, authentication systems, cooperative collection development, digitizing, instruction, preservation, courier systems, and shared human resources.

each consortium is formed by unique political and cultural circumstances, but a few themes are common to all. it is clear that the technology of the web, the increasing importance of electronic resources, and advances in resource-sharing systems have created new opportunities for consortia. beyond these technological and economic motivations, i believe that in consortia we see the librarian's instinct for collaboration being brought to bear at a time of great uncertainty and rapid change. librarians often forget that as a profession we collaborate and cooperate with an ease seldom seen in other endeavors. there is safety in numbers and in uncertain times it helps to confer with others, spread risk over a larger group, and speak with a collective voice. library consortia fulfill these functions very well and their future continues to look bright.

as i conclude my duties as guest editor i would like to thank jim kopp for sparking my interest in this project and for several years of stimulating conversation.
special thanks are due to managing editors ann jones and judith carter as well as the helpful and professional staff at ala production services. obstacles of language and time differences make composing and editing a publication such as this unusually challenging. the quality and cohesiveness of these issues of ital are due in large measure to the efforts of these individuals.

john f. helmer (jhelmer@darkwing.uoregon.edu) is executive director, orbis library consortium.

production: ala production services (troy d. linker, kevin heubusch; ellie barta-moran, angela hanshaw, and karen sheets), american library association, 50 e. huron st., chicago, il 60611. publication of material in information technology and libraries does not constitute official endorsement by lita or the ala.

abstracted in computer & information systems, computing reviews, information science abstracts, library & information science abstracts, referativnyi zhurnal, nauchnaya i tekhnicheskaya informatsiya, otdyelnyi vypusk, and science abstracts publications. indexed in compumath citation index, computer contents, computer literature index, current contents/health services administration, current contents/social behavioral sciences, current index to journals in education, education, library literature, magazine index, newsearch, and social sciences citation index. microfilm copies available to subscribers from university microfilms, ann arbor, michigan.

the paper used in this publication meets the minimum requirements of american national standard for information sciences-permanence of paper for printed library materials, ansi z39.48-1992.

copyright ©2000 american library association. all material in this journal subject to copyright by ala may be photocopied for the noncommercial purpose of scientific or educational advancement granted by sections 107 and 108 of the copyright revision act of 1976. for other reprinting, photocopying, or translating, address requests to the ala office of rights and permissions.

in inhaling the spore, the editorial introduction to the first special consortial issue, i compared a librarian's involvement in consortia to the cameroonian stink ant's inhalation of a contagious spore. the effect of this spore is featured in mr. wilson's cabinet of wonder, lawrence weschler's remarkable history of the museum of jurassic technology.2 weschler explains that, once inhaled, the spore lodges in the brain and "immediately begins to grow, quickly fomenting bizarre behavioral changes in its ant host." although the concept of a consortial spore is somewhat extreme (or "icky" according to my nine-year-old daughter) the editorial was an accurate reflection of my own sense of being inexorably drawn into a consortium, drawn not so much against my will but as a willing crazed participant. at the time i was nominally working for the university of oregon library system and vainly trying to keep consortial work in perspective.

by the time of my second editorial, epidemiology of the consortial spore, i was exploring consortia around the world but still laboring under the illusion that i could keep my own consortium at arm's length. i must have failed since, as of this writing, i have left my position at the uo and now serve as the executive director of the orbis library consortium. like the cameroonian stink ant, i have inhaled the spore and am now happily laboring under its influence.

references and notes

1. see ital 17, no. 1 (mar. 1998) and ital 18, no. 3 (sept. 1999).
2. lawrence weschler, mr. wilson's cabinet of wonder (new york: vintage books, 1995).
the museum of jurassic technology (www.mjt.org) is located in culver city, calif. see www.mjt.org/exhibits/stinkant.html for more on the cameroonian stink ant.

president’s message: making an impact in the time that is given to us
rachel vacek
information technology and libraries | june 2015 3

in an early chapter in the fellowship of the ring, by j.r.r. tolkien, frodo laments having found the one ring and gandalf tries to console him by saying, “all we have to decide is what to do with the time that is given us.” this is one of my favorite quotes in the lord of the rings series because it inspires us to rise to the occasion and perform to the best of our abilities. it also implies that we have a purpose to fulfill within a predetermined time period.

although my term in office is three years, i’m only lita president for one year. to set a vision and goals, establish a sense of urgency, generate buy-in, engage and empower the membership, implement sustainable changes, and remain positive and focused – all within one year while holding a full-time job – is challenging to say the least.

i’ve been very fortunate during my almost eight-year tenure at the university of houston libraries to participate in numerous professional development opportunities, lead change, and make a difference. personal and professional growth has always been very important to me, and being in an environment that encourages me to become a better librarian, technologist, manager, and leader is not only helpful for my career, but also extremely rewarding on an intellectual level. lita has benefited from that training.
in today’s library technology landscape, one of the many skills leaders need to possess is the ability to effect change. as lita president, i have put many changes in motion and am happy with what i have accomplished, and proud of our board and the members who volunteer to lead and effect change.

as i reflect over the past year, it’s fair to say that lita, despite some financial challenges, has had numerous successes and remains a thriving organization. three areas – membership, education, and publications – bring in the most revenue for lita. of those, membership is the largest money generator. however, membership has been on a decline, a trend that’s been seen across ala for the past decade. in response, the board, committees, interest groups, and many individuals have been focused on improving the member experience to retain current members and attract potential ones. with all the changes to the organization and leadership, lita is on the road to becoming profitable again and will remain one of ala’s most impactful divisions.

rachel vacek (revacek@uh.edu) is lita president 2014-15 and head of web services, university libraries, university of houston, houston, texas.

president’s message | vacek doi: 10.6017/ital.v34i2.8804 4

the board has taken numerous steps to stabilize or reverse the decline in revenues that has resulted from a steady reduction in overall membership. at ala annual 2014, the financial advisory committee was established to respond to recommendations from the financial strategies task force, adjusting the budget to make a number of improvements while planning for larger, more substantial changes.
in  fall  2014  we  took  steps  to  improve  our  communications  by  establishing  the  communications  &   marketing  committee  and  appointing  a  social  media  manager  and  a  blog  manager.  the  blog  and   social  media  have  seen  a  steady  upward  trajectory  of  engagement  with  over  27,000  blog  views   since  september  2014  and  over  13,300  followers  on  twitter.  these  efforts  help  recruit  and  retain   members,  advertise  our  online  education  and  programming,  and  increase  attendance  at   conferences.       over  the  past  year,  nine  workshops  and  two  web  courses  were  offered,  many  of  which  sold  out   thanks  to  new  marketing  approaches.  the  forum  remains  popular  and  has  stellar  programming   and  keynote  speakers.  programs  and  workshops  at  ala  conferences  are  stronger  than  ever  and   continue  to  be  well  attended.  publications  also  remain  strong.  although  only  three  lita  guides   were  published  this  year,  partially  due  to  a  change  in  publishers,  there  are  many  more  in  the   pipeline.       finally,  the  search  for  a  new  executive  director  is  underway,  and  with  a  new  leader  comes  fresh   ideas  and  perspectives.  i  am  excited  about  lita’s  future.  the  incoming  board,  along  with  a  new   executive  director,  has  an  opportunity  to  make  national  and  lasting  impact  as  well  as  collaborate   with  outstanding  librarians  and  staff  in  this  division  and  across  ala.  lita’s  challenges  and   successes  are  shared  amongst  a  dedicated  team  of  volunteers,  and  together  we’ve  made  significant   changes.  i  believe  that  lita  members  will  continue  to  rise  to  the  occasion  and  make  incredible   things  happen  with  “the  time  that  is  given  us.”  lita  is  an  amazing  organization  because  of  its   members  and  their  passion  and  dedication.  i  couldn’t  be  prouder.  
it has been an honor and a privilege to serve as your president.

articles

integrated technologies of blockchain and biometrics based on wireless sensor network for library management
meng-hsuan fu
information technology and libraries | september 2020
https://doi.org/10.6017/ital.v39i3.11883

meng-hsuan fu (msfu@mail.shu.edu.tw) is assistant professor, shih hsin university (taiwan). © 2020.

abstract

the internet of things (iot) is built on a strong internet infrastructure and many wireless sensor devices. presently, radio frequency identification embedded (rfid-embedded) smart cards are ubiquitous, used for many things including student id cards, transportation cards, bank cards, prepaid cards, and citizenship cards. one example of places that require smart cards is libraries. each library, such as a university library, city library, local library, or community library, has its own card and the user must bring the appropriate card to enter a library and borrow material. however, it is inconvenient to bring various cards to access different libraries. wireless infrastructure has been well developed and iot devices are connected through this infrastructure. moreover, the development of biometric identification technologies has continued to advance. blockchain methodologies have been successfully adopted in various fields. this paper proposes the blockmetrics library based on integrated technologies using blockchain and finger-vein biometrics, which are adopted into a library collection management and access control system. the library collection is managed by image recognition, rfid, and wireless sensor technologies. in addition, a biometric system is connected to a library collection control system, enabling the borrowing procedure to consist of only two steps.
first, the user adopts a biometric recognition device for user authentication and then performs a collection scan with the rfid devices. all the records are recorded in a personal borrowing blockchain, which is a peer-to-peer transfer system and permanent data storage. in addition, the user can check the status of his collection across various libraries in his personal borrowing blockchain. the blockmetrics library is based on an integration of technologies that include blockchain, biometrics, and wireless sensor technologies to improve the smart library.

introduction

the internet of things (iot) connects individual objects together through their unique address or tag, which are based on the sensor devices and wireless network infrastructure. presently, “smart living” (a term that includes concepts such as the smart home, smart city, smart university, smart government, and smart transportation) is based on the iot, which plays a key role in achieving a convenient and secure living environment. gartner, a data analytics company that presents the top ten strategic technology trends for the next year at the end of each year, listed blockchain as one of the top ten in 2017, 2018, 2019, and 2020.1 the fact that blockchain has been proposed as one of the top strategic technology trends for four consecutive years represents its sustained interest among technology experts and developers. in a blockchain, a block is the basic storage unit where data is saved and protected with cryptography and complex algorithms. the technology of peer-to-peer transfer is adopted when data or information is exchanged without the need for a third party. in other words, data is transferred directly from node to node or user to user thanks to the decentralized nature of the blockchain.
in addition, blockchain is authorized and maintained by all nodes in the same blockchain network. each node has an equal right (also known as equal weight) to access the blockchain and authorize new transactions. thus, all transactions are published and broadcast to all nodes and content cannot be altered by a single user or node or by a minority of them. additionally, transaction content is secured by cryptography and complex secure algorithms. therefore, transactions occur and are preserved under a fully secure and private network. in practice, the blockchain has been applied to various fields including finance, medicine, academia, and logistics. the blockchain has also been adopted for personal transaction records for its privacy and security properties and because it offers immutable and permanent data storage. in this research, blockchain technologies are adopted to store the records of collections borrowed from various libraries in a personal borrowing blockchain.

table 1. definition of key terms of blockchain

key term | definition
blockchain | a blockchain comprises many blocks. it has the characteristics of security, decentralization, immutability, distributed ledgers, transparent logs, and irreversible data storage.
block | a block is the basic unit in a blockchain. each block consists of a block header with nonce, previous block hash, timestamp, and merkle root, and many transactions in a block body.
nonce | the counter of the algorithm; the hash value changes whenever the nonce is modified.
merkle root | a secure hash algorithm (sha) is used in the merkle root to transform data into a meaningless hash value.
transaction | each transaction is composed of address, hash, index, and timestamp. all transactions are stored in blocks permanently.
hash | a secure hash algorithm (sha) transforms input data into meaningless output data, called a hash, which consists of letters and digits, in order to protect data content during transmission.
biometrics | using human physical characteristics including finger vein, iris, voice, and facial features for recognition.
sensor network | a sensor is a small and portable node with a data record function and power source. a sensor network is composed of many sensors based on a communication infrastructure.
iot | the internet of things (iot) is a system that connects sensors and devices together under an internet environment. presently, many iot applications have been adopted, such as smart home, health care, and smart transportation.

although the iot and wireless networks have been well developed, people still own many different rfid-embedded cards, such as public transportation cards, credit cards, student cards, medical cards, identification cards, membership cards, or library cards. an rfid-embedded card is issued for each place or purpose, requiring the user to bring the appropriate cards to access the corresponding functions. in this study, the library is used as the objective to which blockchain technologies are applied because currently each library has its own library card for entering the library and borrowing material. this implies that users may have to carry several library cards to access each of a university library, community library, and district library on the same day. here, biometrics can be adopted to solve the problem of having to carry many access control cards and managing various borrowing policies. in this study, the blockmetrics library is designed based on the technologies of blockchain and biometrics within the environment of a wireless sensor network with iot devices. here, borrowing records are transferred and stored through blockchain technologies, automatic library access control is managed by biometric identification, and the borrowing and returning of library materials are achieved under a wireless sensor network with iot devices to create a convenient, efficient, and secure library
here, borrowing records are transferred and stored through blockchain technologies, automatic library access control is managed by biometric identification and the borrowing and returning of library materials are achieved under a wireless sensor network with iot devices to create a convenient, efficient, and secure library environment. the key terms of blockchain and its related terms, biometrics, sensor network and iot applied in this research are defined in table 1. related works blockchain technology nakamoto has presented bitcoin as a peer-to-peer electronic system that uses blockchain technologies, which include smart contracts, cryptography, decentralization, and consensus in proof of work. because this electronic system is based on cryptography, a trusted third party is not required in the payment mechanism. additionally, peer-to-peer technology and a timestamp server are adopted and a block is given a hash in serial order. this procedure solves the problem of double-spending during payment.2 in addition, proof of work is used in a decentralized system for authentication by most nodes in the blockchain network. each node has equal rights to compete to receive a block and each node can vote to authenticate a new block. 3 košt’ál et al. define proof-of-work (pow) as an asymmetric method with complex calculations where its difficulty is adjusted by the problem-solving duration.4 however, pow has drawbacks, such as high power consumption and the fact that some users can control the blockchain if their shares of users in the same blockchain network reach 51 percent.5 despite the possible presence of malicious parties, the information in a blockchain is difficult to modify because of the distributed ledger methodology in which each node has the same copy of ledger, making it difficult for a single or minority node to change or destroy the stored data. 6 a block is a basic unit in a blockchain. in other words, a blockchain is composed of blocks that are connected. 
one of the blockchain technologies is the distributed ledger, in which a ledger can be distributed to each node all over the world.7 each block is composed of a block size field, a block header, and a block body. the block size field is 4 bytes. the block header is a combination of the version, previous block hash, merkle root, timestamp, difficulty target, and nonce, and is 80 bytes in total. the block body carries the transactions; in the bitcoin block format, the 1 to 9 bytes listed for the body are the transaction counter that precedes the variable-length transactions (see table 2).

information technology and libraries september 2020 integrated technologies of blockchain and biometrics | meng-hsuan fu 4

table 2. block element.
element | size | field
block size | 4 bytes | block size
block header (80 bytes) | 4 bytes | version
block header (80 bytes) | 32 bytes | previous block hash
block header (80 bytes) | 32 bytes | merkle root
block header (80 bytes) | 4 bytes | timestamp
block header (80 bytes) | 4 bytes | difficulty target
block header (80 bytes) | 4 bytes | nonce
block body | 1-9 bytes | transaction counter, followed by the transactions

blockchain technologies have been applied to various fields including finance, art, hygiene, healthcare, and academic certificates. for example, in healthcare, user medical records are stored in a blockchain so users can check their health conditions and share them with their family members in advance. in business, blockchain is adopted in supply chain management for monitoring activities during goods production. in the academic field, certificates are permanently saved in a blockchain where users can retrieve them from their mobile devices and show them upon request.8 because blockchain is protected by cryptography and offers privacy, reliability, and decentralization, an increasing number of applications are beginning to adopt it. as an application for a library system, a blockchain, in combination with biometrics within a wireless infrastructure, can be adopted for personal borrowing records.

library borrowing management

each library has its own regulations.
for example, the national central library (ncl) in taiwan has created reader service directions in which the general principles and rules for library card application, reader access to library materials, reference services, request for library materials, violations, and supplementary provisions are clearly stated. according to the ncl reader service directions, citizens are required to present their national id card and foreigners are asked to present their passport to apply for a library card. users are allowed to access the library when they have a valid library card. those who have library cards but have forgotten to bring them can apply for a temporary library card to enter the library, but this is limited to three times.9 this rule is only specific to the ncl. other libraries in the same country have their own regulations. another example is the taipei public library. citizens can apply for a library card using their citizen’s id card, passport, educational certificate, or residence permit. a taipei citizen can apply for a family library card using their family certificate. users can borrow material and return the material to all the libraries in taipei city. however, these policies are only applicable to users who hold library cards issued by libraries in taipei city.10 as for university libraries, each library also has its own regulations. for instance, shih hsin university (shu) issues its own library card to access its library. alumni are requested to present their id cards and photo to apply for a library card in person. the number of items and their loan periods are clearly stated in the rules set by the shu library.11 again, the regulations are individually set by each university library. 
biometrics

rfid-embedded smart cards such as student cards, transportation cards, and bank cards are widely used; however, they can be stolen, lost, or forgotten at home. biometrics is becoming more widely used in access-control systems for homes, offices, buildings, government facilities, and libraries. for these systems, the fingerprint is one of the most commonly used biometrics. users place their finger on a read device, usually a touch panel. this method ensures a unique identity, is easy to use and widely accepted, boasts a high scan speed, and is difficult to falsify. however, its effectiveness is influenced by the age of the user and the presence of moisture, wounds, dust, or particles on the finger, and touch devices raise hygiene concerns. face recognition has been used in various applications such as unlocking smart devices, performing security checks and community surveillance, and maintaining home security. this method of biometric identification is convenient, widely accepted, difficult to falsify, and can be applied without the awareness of the person. however, limitations to face recognition include errors that can occur due to lighting, facial expression, and cosmetics. also, privacy is an issue in face recognition because it may take place at a distance without the user’s consent. another form of biometric identification uses the iris as an inner biometric indicator because the iris is unique for each person. nevertheless, this method is also prone to errors that can be caused by bad lighting and the possible presence of diseases such as diabetes or glaucoma.
devices used for iris recognition are expensive, and thus rarely adopted in biometrics.12 speech recognition is used for equipment control such as a smart device switch, however, it can be affected by noise, physical conditions of the user, or weather. vein recognition using finger or palm veins is becoming more prevalent as a form of biometric identification for banks or access control but can be limited by the possible presence of bruises or a lower temperature. however, vein recognition ensures a unique identity, is easy to use, convenient, accurate, and widely accepted, thus, many businesses are adopting vein recognition for various usages. to summarize, biometric identification is convenient, reduces the error rate in recognition, and is difficult to falsify. therefore, biometric identification is suitable for access control. blockmetrics library the blockmetrics library is based on the integration of blockchain and biometric technologies in a wireless sensor network with iot devices. figure 1 shows the blockmetrics library architecture with its bottom-up structure consisting of five layers: hardware, software, internet, transfer and security, and application. all components are sequentially described in detail from the bottom to the top layers. in the hardware layer, sensor nodes are physically located on library collection shelves, entrance gates, and relevant equipment to be further connected with the upper layers. rfid tags are attached to each item in the library, including books, audio resources, and videos. tag information is read and transferred by rfid readers. the biometric devices used in this study include fingerprint readers, palm and finger-vein identifiers, and face or iris recognition devices for biometric authentication when users enter libraries or borrow collections. all images including action images, collection images, and surveillance images are recorded with cameras. 
the ground surveillance, library collection recognition, image processing, and user identification are handled by graphics processing units. touch panels are used for typing or searching for information and there is a particular process for user registration. for general input and output of information, i/o devices include speakers, microphones, keyboards, and monitors. the entrance gate is connected to biometric devices and recognition systems for automatic access control. microprocessors and servers, which make up the core of the hardware, handle all the functions that run in the operating system. data and programs are run and securely saved on a large memory drive. data transmission occurs through the wireless data collector, and data collection and transfer in the library are based on a wireless environment. in the blockmetrics library, a library collection database is used to store and maintain all the library material information in a local library for backup usage, and a blockchain is used to record personal borrowing and returning history.

figure 1. blockmetrics library architecture

in the software layer, open-source office software (such as writer, impress, or calc provided by libreoffice) is used to record library collections and handle library affairs. biometric recognition identifies a user’s biological features collected from biometric devices such as a finger-vein recognition device, which is adopted in this research. all images and videos, including ground and entrance surveillance and library material borrowing and returning, are recorded by cameras and processed with video-processing software.
the data of the images, videos, personal information, and library collections is managed and saved through an image and file management system as well as a database management system. the software programs associated with creating, modifying, and maintaining library processes are written with open-source programming codes, in particular, python and r, which were scored as two of the top ten programming languages by ieee spectrum in 2019.13 there are various functions saved as packages that are free to download and can be modified and reproduced into a customized program for specific purposes. the hardware houses the cpu, which runs the general library operations, and software programs are maintained by the operating system and management information system. library stock management, collection search, personal registration, and other library-related functions can be designed and developed through app inventor. each borrow and return record is connected with the personal identification recognized from the finger-vein reader and recognition system and is saved as a transaction. the transactions that take place in a specific period are saved in a block that can be connected to another block to form a personal borrowing blockchain. the internet technology layer is built upon the hardware and software structure. the main purpose of internet technologies is to connect the equipment and devices with the internet, in which the internet plays the role of an intermediary for all devices communicating and cooperating together in the library. in the internet technology layer, bluetooth connects devices such as earphones, speakers, and audio guidance within short distances. files are exchanged between smart devices, including smartphones, tablets, or ipads, through near-field communication, which is a contact-less sensor device. rfid is adopted for collection borrowing and returning services in the library. 
because fiber-optic cables have been ubiquitously planted within infrastructure with the development of smart cities, most libraries have also been built with them. users or vehicles are more easily and more accurately located by the global positioning system (gps), which also assists with image recognition when a material is taken from the shelves. sensors transfer the sensing data to the relative devices for recording or processing under the infrastructure of the wireless sensor network. the library is currently built with a wi-fi environment, but li-fi is one of the future trends that involve creating a wireless environment just with light. mobile devices operate under wireless communications and most countries provide 4g and 4g+ with some supporting even 5g. the internet technology layer is the tools provider for intercommunication among devices. data security and transmission reliability are extremely important issues when various equipment is linked together and connected to the internet. the user interface is the bridge between the user and the devices. in other words, the user gives commands to the devices or software through a user interface or app. biometric recognition devices, rfid readers, and entrance control equipment are connected via the internet in this study. the devices send the information to the corresponding devices for specific purposes in a specified order. collected data such as private user identification are secured by cryptography utilized in blockchain technology. the finger-vein identification used as personal identification is combined with the borrow and return records, stored as transactions, and secured under a secure hash algorithm before being saved into a blockchain. all data and personal identification are transferred under the corresponding secure methodologies. 
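how a finger-vein identification and a borrow or return record might be combined into a hashed transaction and then grouped into linked blocks can be sketched as follows. this is a minimal illustration, not the system described in this study: the `Transaction` fields, the `vein_template_id` string, and the json encoding are hypothetical, and sha-256 stands in for whichever secure hash algorithm a real deployment would choose.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


def digest(value: str) -> str:
    """SHA-256 hex digest of a text value."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class Transaction:
    user_id_digest: str  # hash of the (hypothetical) finger-vein template ID, not the raw ID
    item_tag: str        # RFID tag of the borrowed or returned item
    action: str          # "borrow" or "return"
    timestamp: int       # seconds since the epoch


def make_transaction(vein_template_id: str, item_tag: str, action: str, timestamp: int) -> Transaction:
    """Combine the personal identification with a borrow/return record."""
    if action not in ("borrow", "return"):
        raise ValueError(f"unknown action: {action}")
    return Transaction(digest(vein_template_id), item_tag, action, timestamp)


def seal_block(transactions: list, prev_hash: str) -> dict:
    """Group the transactions of one period into a block linked to the previous block."""
    body = [asdict(t) for t in transactions]
    payload = json.dumps({"prev_hash": prev_hash, "transactions": body}, sort_keys=True)
    return {"prev_hash": prev_hash, "transactions": body, "hash": digest(payload)}


# One user borrows an item and later returns it; each period is sealed into its own block.
t1 = make_transaction("vein-0001", "RFID-42", "borrow", 1600000000)
t2 = make_transaction("vein-0001", "RFID-42", "return", 1600600000)
b1 = seal_block([t1], "0" * 64)
b2 = seal_block([t2], b1["hash"])
```

both transactions carry the same digest, so the records stay linked to one person without storing the raw identifier in the block. a plain hash of a short identifier can be guessed by brute force, so a production design would need a salted or keyed construction; that choice is outside this sketch.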
in the blockmetrics library, self-check-in and self-checkout rely mostly on rfid technology and finger-vein biometrics. the borrow and return records are stored in a personal borrowing blockchain, where records are saved in the blockchains of user and library, biometrics system, and library servers. entrance control is automatically managed by finger-vein recognition. library stock management is based mainly on image assistance and rfid technology. new user registration requires only a few identification questions and the extraction of finger-vein characteristics. the blockmetrics library has no circulation desk and offers an automated borrow and return mechanism through a single sign-in at the entrance and exit. the five layers in the blockmetrics library architecture communicate with each other such that operations are inseparably related. the blockmetrics library scenario is described in the next section.

scenario

in this section, the scenarios for registration, entry, and material borrowing and returning in the blockmetrics library are described in detail. in figures 2 and 3, the user side indicates the actual user actions and is represented as solid lines, and the background side shows the operations that run behind the scenes and is indicated as dotted lines. in figure 2, when a new user comes into the blockmetrics library, the registration procedure starts with a biometric pattern extraction and recognition of the user. finger-vein authentication is selected as the personal biometrics for entrance and material borrowing. on the user side, registration is completed in only two steps. the first is finger-vein extraction and the second is simply providing personal information. the biometric recognition data is processed and stored in the appropriate database, which is linked to the personal identification management system.
personal information is secured through the cryptography used in blockchain technology, thus, all information is securely stored. the registration procedure is performed only once at the first entry, and afterward all registered users can enter the blockmetrics library using finger-vein authentication. biometric recognition proceeds with a biometrics database that verifies user identity followed by verification results sent to the entrance control management for entrance guarding. the entrance is automatically controlled because the results from the biometric recognition step are sent as the rules for entrance control. users will be permitted to enter the library when they pass the biometrics recognition step. users do not have to bring their library cards to enter or borrow material, increasing convenience and decreasing identity infringement when library cards get lost. figure 3 shows the scenario of library material borrowing and returning. on the user side, library material borrowing consists of four simple steps performed by the user: 1) retrieving items, 2) authenticating with the finger-vein recognition device, 3) placing items on the rfid reader, and 4) exiting the library. when the user removes a book from the shelf, an infrared detector is triggered, recognizing that a book was removed from the shelf. then, image recognition identifies the specific book and the book’s status is marked as charged to the user in the stock database. if the user wants to leave the library without borrowing anything, the user just scans their finger with the finger-vein device to open the entrance gate. if the user wants to borrow library materials such as books or videos, the borrowing procedure is quickly completed after finger-vein scanning and placing all material including books and videos under the rfid read area. in the background, the user’s recognition results from the finger-vein scan are saved in the biometrics database, which is connected to the blockchain. 
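the entry and borrowing steps above can be sketched as a small in-memory model. everything here is a hypothetical stand-in: the class name, the status strings, and the use of a plain set for the biometrics database are assumptions for illustration, and real finger-vein matching is replaced by a simple membership check.

```python
class BlockmetricsDesk:
    """Toy stand-in for the biometrics database, entrance gate, and stock database."""

    def __init__(self, registered_vein_ids, item_tags):
        self.registered = set(registered_vein_ids)             # enrolled users
        self.status = {tag: "on_shelf" for tag in item_tags}   # stock database
        self.loans = []                                        # (vein_id, tag) pairs

    def authenticate(self, vein_id: str) -> bool:
        """Stands in for finger-vein recognition against the biometrics database."""
        return vein_id in self.registered

    def gate_opens(self, vein_id: str) -> bool:
        """The recognition result is used directly as the entrance-control rule."""
        return self.authenticate(vein_id)

    def checkout(self, vein_id: str, tags_on_reader: list) -> bool:
        """Authenticate, then read every tag in the RFID area and update stock status."""
        if not self.authenticate(vein_id):
            return False
        unavailable = [t for t in tags_on_reader if self.status.get(t) != "on_shelf"]
        if unavailable:
            raise ValueError(f"items not available for loan: {unavailable}")
        for tag in tags_on_reader:
            self.status[tag] = "on_loan"
            self.loans.append((vein_id, tag))
        return True


desk = BlockmetricsDesk({"vein-0001"}, ["RFID-1", "RFID-2", "RFID-3"])
borrowed = desk.checkout("vein-0001", ["RFID-1", "RFID-2"])  # registered user, two items
rejected = desk.checkout("vein-9999", ["RFID-3"])            # unknown finger vein
```

the sketch checks all items before changing any status, so a failed checkout leaves the stock database untouched; that all-or-nothing behavior is a design choice of this example, not something the scenario specifies.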
when the library materials are placed together in the rfid read area, all the tags are read at once while the materials’ statuses in the database are updated. user information and material borrowing information are linked and saved as transactions that are stored in the personal borrowing blockchain.

figure 2. blockmetrics library scenario: registration and entry

figure 3. blockmetrics library scenario: borrowing and returning

to return library material, the user only needs to put the library materials in the specific area with the rfid reader and the return procedure is completed. the rfid tags of returned materials are read and recorded and their status in the stock database is updated. personal borrow and return records are saved as transactions and stored in the personal borrowing blockchain as well.

limitations

some biometric technologies, such as facial recognition or fingerprint recognition systems, have been adopted by some libraries. these tools have increased the efficiency of accessing and borrowing procedures. however, all the records include some personal information (e.g., fingerprint, historical borrowing records, log of library access, etc.) that is still stored in the individual libraries’ databases. the blockchain model may not be suitable for all current library systems because the database design of each library is unknown. at present, library classification systems are set by each library individually. therefore, integrating library information among national or international libraries will be a huge task.
thus, how to establish general regulations for all libraries to develop and manage library information will need additional research. after that, the information management system should be designed and built by collecting diverse comments from all library managers. the work should be completed by interdisciplinary experts in library management, information engineering, biometrics system design, and data management. the cost may include manager committees, collection coding design, system development, hardware layout, and related training plans. also, there may be unpredictable privacy issues that do not become known until the system is in practical operation. lastly, some users need an adaptation period while new technologies are implemented, the duration of which can depend on how smooth the interface design is, whether the system is easy and clear to use, and what benefits the technologies bring to users’ lives. in summary, the limitations are: 1) integrating library information such as stock data and serial numbers, 2) establishing general regulations, 3) creating a consistent library management system, 4) the cost of this system, 5) the potential for privacy breaches, and 6) library patron resistance or reluctance to use the technology.

conclusion

in this research, the blockmetrics library is designed under a wireless sensor network infrastructure combined with blockchain, biometric, iot, and rfid technologies. the library access control system is based on finger-vein biometric recognition, in which users can register with their finger-vein information through biometric devices and input personal information via various i/o devices. thus, automatic and secure library access control is achieved through biometric recognition.
additionally, image recognition, gps, and rfid are adopted in the library collection management, providing a simplified way to borrow and return library material. blockchain technologies are utilized to record the personal borrowing history of collections from various libraries into a personal borrowing blockchain where records are permanently stored. users can clearly understand their borrowing status through their own blockchain and manage their borrowing information through an application. to summarize, users can enter the library with finger-vein recognition instead of a specific library card. then, if they would like to check out library material, they can retrieve the items, pass them through the rfid reader, scan their finger vein, and go. the blockmetrics library is designed for convenience and security, which are achieved by combining a wireless sensor network with the integration of blockchain and biometric technologies. this method eliminates the inconvenience of having to bring many library cards, increases the efficiency of collection borrowing procedures, and simplifies the management of collection borrowing from different libraries. adoption of these biometric technologies is still in its early stages. some libraries have begun using different tools, but few libraries have adopted all of them. these tools simplify both accessing and borrowing procedures, but all the records are still stored in a particular library’s database for private access only. the development of the blockmetrics library will help to integrate biometric technologies and blockchain under the infrastructure of a wireless sensor network in order to maintain library access recognition, library collections, library users, and borrowing records across libraries, raising user convenience and satisfaction, library management efficiency, and library security. in the near future, the library transaction formula in a blockchain will be developed for collection borrowing storage.
the library collection serial numbers will be considered in the information management system as well.

endnotes

1 gartner, “smarter with gartner, gartner top 10 strategic technology trends for 2020,” https://www.gartner.com/smarterwithgartner/gartner-top-10-strategic-technology-trends-for-2020/; gartner, “smarter with gartner, gartner top 10 strategic technology trends for 2019,” https://www.gartner.com/smarterwithgartner/gartner-top-10-strategic-technology-trends-for-2019/; gartner, “smarter with gartner, gartner top 10 strategic technology trends for 2018,” https://www.gartner.com/smarterwithgartner/gartner-top-10-strategic-technology-trends-for-2018/; gartner, “smarter with gartner, gartner top 10 strategic technology trends for 2017,” https://www.gartner.com/smarterwithgartner/gartners-top-10-technology-trends-2017/.
2 satoshi nakamoto, “bitcoin: a peer-to-peer electronic cash system” (2009), https://bitcoin.org/bitcoin.pdf.
3 david shrier, weige wu, and alex pentland, “blockchain & infrastructure (identity, data security),” connection science & engineering, massachusetts institute of technology, 2016.
4 kristián košt’ál et al., “on transition between pow and pos,” international symposium elmar (2018).
5 thomas p. keenan, “alice in blockchains: surprising security pitfalls in pow and pos blockchain systems,” 15th annual conference on privacy, security and trust (2017); takeshi ogawa, hayato kima, and noriharu miyaho, “proposal of proof-of-lucky-id (pol) to solve the problems of pow and pos,” ieee international conference on internet of things and ieee green computing and communications and ieee cyber, physical and social computing and ieee smart data (2018).
6 quoc khanh nguyen and quang vang dang, “blockchain technology for the advancement of the future,” 4th international conference on green technology and sustainable development (2018); nir kshetri and jeffrey voas, “blockchain in developing countries,” it professional 20, no. 2 (2018): 11-14.
7 shangping wang, yinglong zhang, and yaling zhang, “a blockchain-based framework for data sharing with fine-grained access control in decentralized storage systems,” ieee access 6 (2018): 38437-38450.
8 pinyaphat tasatanattakool and chian techapanupreeda, “blockchain: challenges and applications,” 2018 international conference on information networking (icoin) (2018), https://doi.org/10.1109/icoin.2018.8343163; abderahman rejeb, john g. keogh, and horst treiblmaier, “leveraging the internet of things and blockchain technology in supply chain management,” future internet 11, no. 7 (2019): 161; stanislaw p. stawicki, michael s. firstenberg, and thomas j. papadimos, “what’s new in academic medicine? blockchain technology in health-care: bigger, better, fairer, faster, and leaner,” international journal of academic medicine 4, no. 1 (2018): 1-11; guang chen et al., “exploring blockchain technology and its potential applications for education,” smart learning environments 5, no. 1 (2018), https://doi.org/10.1186/s40561-017-0050-x; asma khatoon, “a blockchain-based smart contract system for healthcare management,” electronics 9, no. 1 (2020): 94.
9 national central library, “national central library reader service directions,” november 11, 2016, https://enwww.ncl.edu.tw/content_26.html.
10 taipei public library, “regulation of circulation services,” june 13, 2018, https://english.tpml.gov.taipei/cp.aspx?n=af5cca6fc258864e.
11 shih hsin university library, “library regulations, access to shu libraries,” http://lib.shu.edu.tw/e_orders_enter.htm; shih hsin university library, “library regulations, borrowing policies,” accessed september 25, 2019, http://lib.shu.edu.tw/e_orders_borrows.htm.
12 sudhinder singh chowhan and ganeshchandra shinde, “iris biometrics recognition application in security management,” 2008 congress on image and signal processing.
13 stephen cass, “the top programming languages 2019,” ieee spectrum (2019), https://spectrum.ieee.org/computing/software/the-top-programming-languages-2019.

editorial board thoughts: a considerable technology asset that has little to do with technology
mark dehmlow
information technology and libraries | march 2014 4

for this issue’s editorial, i thought i would set aside the trendy topics like discovery, the cloud, and open . . . well, everything (source, data, science), and instead focus on an area that i think has more long-term implications for technologists and libraries. for technologists in libraries, probably any industry really, i believe our most important challenges aren’t technical at all. for the average “techie,” even if an issue is complex, it is often finite and ultimately traceable to a root cause: the programmer left off a semi-colon in a line of code, the support person forgot to plug in the network cable, or the systems administrator had a server choke after a critical kernel error. debugging people issues, on the other hand, is much less reductive.
people are nothing but variables who respond to conflict with emotion and can become entrenched in their perspectives (right or wrong). at a minimum, people are unpredictable. the skill set to navigate people and personalities requires patience, flexibility, seeing the importance of the relationship through the 1s and 0s, and often developing mutual trust. working with technology benefits from one’s intelligence (iq), but working with people requires a deeper connection to perception, self-awareness, body language, and emotions, all parts of emotional intelligence (eq). eq is relevant to all areas of life and work, but i think particularly relevant to technology workers. of particular importance are eq traits related to emotional regulation, self-awareness, and the ability to pick up social cues. my primary reasoning for this is that technology (1) is fairly opaque to people outside of technology areas and (2) is driving so much of the rapid change we are experiencing in libraries. it units in traditional organizations have a significant challenge because many root issues in technology are not well understood, and change is uncomfortable for most, so it is easy to resent technology for being such a strong catalyst for change. as a result, it is becoming more incumbent upon us in technology to not only instantiate change in our organizations but also to help manage that change through clear communication, clear expectation setting, defining reasonable timeframes that accommodate individuals’ needs to adapt to change, a commitment to shift behavior through influence, and just plain old really good listening. i would like to issue a bit of a challenge to technology managers as you are making hiring decisions.
if you want the best possible working relationships with other functional areas in the library, especially traditional areas, spend time evaluating candidates for soft skills like a relaxed demeanor; patience; clear, but not condescending, communication; and a personal commitment to serving others. these skills are very hard to teach. they can be developed if one is committed to developing them, but more often than not, they are innate. if a candidate has those traits as a base but also has an aptitude for understanding technology, that individual will likely be the kind of employee people will want to keep, certainly much more so than someone who has incredible technical skill but little social intelligence. for those who are interested in developing their eq, there are many tools available: a million management books on team building, servant leadership, influencing coworkers, providing excellent service, etc. personally, i have found that developing a better sense of self-awareness is one of the best ways to increase one’s eq. tests such as the myers-briggs type indicator,1 the strategic leadership type indicator,2 and the disc,3 which categorize your personality and work-style traits, can be very effective tools for understanding how you approach your work and how your work style may affect your peers. combined with a willingness to flex your style based on the personalities of your coworkers, these can be very powerful tools for influencing outcomes. most importantly, i have found putting the importance of the relationship above the task or goal can make a remarkable difference in cultivating trust and collaboration.

mark dehmlow (mdehmlow@nd.edu), a member of lita and the ital editorial board, is director, information technology program, hesburgh libraries, university of notre dame, south bend, indiana.
self-awareness and flexible approaches have the opportunity to improve not only internal relationships between technology and traditional functional areas of the library, but also relationships between techies and end users. we are using technology in many new creative ways to support end users, meaning techies are more and more likely to have direct contact with users. in many ways, our reputation as a committed service profession will be affected by our tech staff’s ability to interact well with end users, and ultimately, i believe the proportion of our tech staff that have a high eq could be one of the strongest predictors of long-term success for technology teams in libraries.

references

1. “my mbti personality type,” the myers briggs foundation, http://www.myersbriggs.org/my-mbti-personality-type/mbti-basics/.
2. “strategic leadership type indicator - leader’s self assessment,” hrd press, http://www.hrdpress.com/slti.
3. “remember that boss who you just couldn’t get through to? we know why…and we can help,” everything disc, http://www.everythingdisc.com/disc-personality-assessment-about.aspx.

this article examines the linguistic structure of folksonomy tags collected over a thirty-day period from the daily tag logs of del.icio.us, furl, and technorati. the tags were evaluated against the national information standards organization (niso) guidelines for the construction of controlled vocabularies. the results indicate that the tags correspond closely to the niso guidelines pertaining to types of concepts expressed, the predominance of single terms and nouns, and the use of recognized spelling.
problem areas pertain to the inconsistent use of count nouns and the incidence of ambiguous tags in the form of homographs, abbreviations, and acronyms. with the addition of guidelines for the construction of unambiguous tags and links to useful external reference sources, folksonomies could serve as a powerful, flexible tool for increasing the user-friendliness and interactivity of public library catalogs, and also may be useful for encouraging other activities, such as informal online communities of readers and user-driven readers’ advisory services.

one of the most daunting challenges of information management in the digital world is the ability to keep, or refind, relevant information; bookmarking is one of the most popular methods for storing relevant web information for reaccess and reuse (bruce, jones, and dumais 2004). the rising popularity of social bookmark managers, such as del.icio.us, addresses these concerns by allowing users to organize their bookmarks by assigning tags that reflect directly their own vocabulary and needs. the collection of user-assigned tags is referred to commonly as a folksonomy.

in recent years, significant developments have occurred in the creation of customizable user features in public library catalogs. these features offer clients the opportunity to customize their own library web pages and to store items of interest to them, such as book lists. client participation in these interfaces, however, is largely reactive; clients can select items from the catalog, but they have little ability to organize and categorize these items in a way that reflects their own needs and language.

digital document repositories, such as library catalogs, normally index the subject of their contents via keywords or subject headings.
traditionally, such indexing either is performed by an authority, such as a librarian or a professional indexer, or is derived from the authors of the documents; in contrast, collaborative tagging, or folksonomy, allows anyone to freely attach keywords or tags to content. dempsey (2003) and ketchell (2000) recommend that clients be allowed to annotate resources of interest and to share these annotations with other clients with similar interests. folksonomies can thus make significant contributions to public library catalogs by enabling clients to create and organize their own personal information spaces in the catalog. clients find items of interest (items in the library catalog, citations from external databases, external web pages, and so on) and store, maintain, and organize them in the catalog using their own tags.

in order to more fully understand these applications, it is important to examine how folksonomies are structured and used, and the extent to which they reflect user needs not found in existing lists of subject headings. the purpose of this proposed research is thus to examine the structure and scope of folksonomies. how are the tags that constitute the folksonomies structured? to what extent does this structure reflect and differ from the norms used in the construction of controlled vocabularies, such as library of congress subject headings? what are the strengths and weaknesses of folksonomies (for example, reflection of user needs, ambiguous headings, redundant headings, and so forth)? this article will examine a selection of tags obtained from three folksonomy sites, del.icio.us (referred to henceforth as delicious), furl, and technorati, over a thirty-day period. the structure of these tags will be examined and evaluated against section 6 of the niso guidelines for the construction of controlled vocabularies (niso 2005), which looks specifically at the choice and form of terms.
the structure and form of folksonomy tags: the road to the public library catalog
louise f. spiteri
information technology and libraries | september 2007
louise f. spiteri (louise.spiteri@dal.ca) is associate professor at the school of information management, dalhousie university, halifax, nova scotia, canada. this research was funded by the oclc/alise library and information science research grant program.

■ definitions of folksonomies

folksonomies have been described as “user-created metadata . . . grassroots community classification of digital assets” (mathes 2004). wikipedia (2006) describes a folksonomy as “an internet-based information retrieval methodology consisting of collaboratively generated, open-ended labels that categorize content such as web pages, online photographs, and web links.” the concept of collaboration is attributed commonly to folksonomies (bateman, brooks, and mccalla 2006; cattuto, loreto, and pietronero 2006; fichter 2006; golder and huberman 2006; mathes 2004; quintarelli 2005; udell 2004). thomas vander wal, who coined the term folksonomy, argues, however, that “the definition of folksonomy has become completely unglued from anything i recognize. . . . it is not collaborative . . . it is the result of personal free tagging of information and objects (anything with a url) for one’s own retrieval. the tagging is done in a social environment (shared and open to others). the act of tagging is done by the person consuming the information” (vanderwal.net 2005).

it may be more accurate, therefore, to say that folksonomies are created in an environment where, although people may not actively collaborate in their creation and assignation of tags, they may certainly access and use tags assigned by others. folksonomies thus enable the use of shared tags.
folksonomies are used primarily in social bookmark­ ing sites, such as delicious (http://del.icio.us/) and furl (http://www.furl.net/), which allow users to add sites they like to their personal collections of links, to organize and categorize these sites by adding their own terms, or tags, and to share this collection with other people with the same interests. the tags are used to collocate bookmarks within a user’s collection and bookmarks across the entire system, so, for example, the page http://del.icio.us/tag/blogging will show all bookmarks that are tagged with blogging by any member of the delicious site. ■ benefits of folksonomies quintarelli (2005) and fichter (2006) suggest that folk­ sonomies reflect the movement of people away from authoritative, hierarchical taxonomic schemes that reflect an external viewpoint and order that may not necessarily reflect users’ ways of thinking. “in a social distributed environment, sharing one’s own tags makes for innova­ tive ways to map meaning and let relationships naturally emerge” (quintarelli 2005). vander wal (2006) adds that “the value in this external tagging is derived from people using their own vocabulary and adding explicit mean­ ing, which may come from inferred understanding of the information/object.” an attractive feature of folksonomies is their inclusive­ ness; they reflect the vocabulary of the users, regardless of viewpoint, background, bias, and so forth. folksonomies may thus be perceived to be a democratic system where everyone has the opportunity to contribute and share tags (kroski 2006). the development of folksonomies may reflect also the difficulty and expense of applying con­ trolled taxonomies to the web: building, maintaining, and enforcing a sound, controlled vocabulary is often simply too expensive in terms of development time and of the steep learning curve needed by the user of the system to learn the classification scheme (fichter 2006; kroski 2006; quintarelli 2005; shirky 2004). 
a further limitation of taxonomies is that they may become outdated easily. new concepts or products may emerge that are not yet included in the taxonomy; in comparison, folksonomies easily accommodate such new concepts (fichter 2006; mitchell 2005; wu, zubair, and maly 2006). shirky (2004) points out that the advantage of folksonomies is not that they are better than controlled vocabularies, but that they are better than nothing.

folksonomies follow desire lines, which are expressions of the direct information needs of the user (kroski 2006; mathes 2004; merholz 2004). these desire lines also may reflect the needs of communities of interest: taggers who use the same set of tags have formed a group and can seek each other out using simple search techniques. “tagging provides users an easy, yet powerful method to express themselves within a community” (szekely and torres 2005).

■ weaknesses of folksonomies

folksonomies share the problems inherent to all uncontrolled vocabularies, such as ambiguity, polysemy, synonymy, and basic level variation (fichter 2006; golder and huberman 2006; guy and tonkin 2006; mathes 2004). the terms in a folksonomy may have inherent ambiguity as different users apply terms to documents in different ways. the polysemous tag port could refer to a sweet fortified wine, a porthole, a place for loading and unloading ships, the left-hand side of a ship or aircraft, or a channel endpoint in a communications system. folksonomies do not include guidelines for use or scope notes. folksonomies provide for no synonym control; the terms mac, macintosh, and apple, for example, are all used to describe apple macintosh computers. similarly, both singular and plural forms of terms appear (for example, flower and flowers), thus creating a number of redundant headings.
the problem with basic level variation is that related terms that describe an item vary along a continuum of specificity ranging from very general to very specific, so, for example, documents tagged perl and javascript may be too specific for some users, while a document tagged programming may be too general for others. folksonomies provide no formal guidelines for the choice and form of tags, such as the use of compound headings, punctuation, word order, and so forth; for example, should one use the tag vegan cooking or cooking, vegan? guy and tonkin (2006) provide some general suggestions for tag selection best practices, such as the use of plural rather than singular forms, the use of underscore to join terms in a multiterm concept (for example, open_source), following conventions established by others, and adding synonyms. these suggestions are rather too vague to be of much use, however; for example, under what circumstances should singular forms be used (such as noncount nouns), and how should synonyms be linked?

■ applications of folksonomies

other than social bookmarking sites, folksonomies are used in commercial shopping sites, such as amazon (http://www.amazon.com/), where clients tag items of interest; these tags can be accessed by people with similar interests. platial (http://www.platial.com/splash) is used to tag personal collections of maps. examples of the use of folksonomies for intranets include ibm’s social bookmarking application dogear, which allows people to bookmark pages within their intranet (http://domino.watson.ibm.com/cambridge/research.nsf/99751d8eb5a20c1f852568db004efc90/1c181ee5fbcf59fb852570fc0052ad75?opendocument), and scuttle (http://sourceforge.net/projects/scuttle/), an open-source bookmarking project that can be hosted on web servers for free. penntags (http://tags.library.upenn.edu/) is a social bookmarking service offered by the university of pennsylvania library to its community members. steve museum is a project that is investigating the incorporation of folksonomies into museum catalogs (trant and wyman 2006). another potential application of folksonomies is to public library catalogs, where users can organize and tag items of interest in user-specific folders; users could then decide whether or not to post the tags publicly (spiteri 2006).

■ analyses of folksonomies

analysis of the structure, or composition, of tags has thus far been limited; there has been more emphasis placed upon the co-occurrence of tags and their frequency of use. cattuto, loreto, and pietronero (2006) applied a stochastic model of user behavior to investigate the statistical properties of tag co-occurrence; their results suggest that users of collaborative tagging systems share universal behaviors. michlmayr (2005) compared tags assigned to a set of delicious bookmarks to the dmoz (http://www.dmoz.org/) taxonomy, which is designed by a community of volunteers. the study concluded that there were few instances of overlap between the two sets of terms. mathes (2004) provides an interesting analysis of the strengths and limitations of the structure of delicious and flickr, but does not provide an explanation of the methodology used to derive his observations; it is not clear, for example, for how long he studied these two sites, how many tags he examined, what elements he was looking for, or what evaluative criteria he applied.

golder and huberman (2006) conducted an analysis of the structure of collaborative tagging systems, looking at user activity and kinds and frequencies of tags. specifically, golder and huberman looked at what tags delicious members assigned and how many bookmarks they assigned to each tag.
this study identified a number of functions tags perform for bookmarks, including identifying the:
■ subject of the item;
■ format of the item (for example, blog);
■ ownership of the item; and
■ characteristics of the item (for example, funny).

while the golder and huberman study provides an important look at tag use, their study is limited in that they examined only one site for a period of four days; their results are an excellent first step in the analysis of tag use, but the narrow focus of their population and sample size means that their observations are not easily generalized. furthermore, this study focuses more on how bookmarks are associated with tags (for example, how many bookmarks are assigned per tag and by whom) than on the structural composition of the tags themselves.

guy and tonkin (2006) collected a random sampling of tags from delicious and flickr to see whether “popular objections to folksonomic tagging are based on fact.” the authors do not explain, however, over what period the tags were acquired (for example, over a one-day period, over a month), nor do they provide any evaluative criteria. the tags were entered into aspell, an open source spell checker, from which the authors concluded that 40 percent of flickr and 28 percent of delicious tags were either misspelled, encoded in a manner not understood by aspell, or consisted of compound words of two or more words. tags did not follow convention in such areas as the use of case or singular versus plural forms. while this study certainly focuses upon the structure of the tags, the bases for the authors’ conclusions are problematic. it is not clear that the use of a spell checker is a sufficient measure of quality. does the spell checker allow for cultural variations in spelling (for example, labor or labour)? how well-recognized and comprehensive is the source vocabulary for this spell checker?
furthermore, if a tag does not exist in the spell checker, does this necessarily mean that the tag is incorrect? tags may include several neologisms, such as podcasting, that may not yet exist in conventional dictionaries but are well-recognized in a particular domain. the authors do not mention whether they took into account the correct use of the singular form of such tags as noncountable nouns (for example, air) or tags that describe disciplines or emotions (for example, history and love). if a named entity (person or organization) was not recognized by aspell, does this mean that the tag was classified as incorrect? lastly, the authors seem to imply that compound words of two or more words are necessarily incorrect, which may not be the case (for example, open source software).

the pitfalls of folksonomies have been well-documented; what is missing is an in-depth analysis of the linguistic structure of tags against an established benchmark. while popular opinion suggests that folksonomies suffer from ambiguous and inconsistent structure, the actual extent of these problems is not yet clear; furthermore, analyses conducted so far have not established clear benchmarks of quality pertaining to good tag structure. although there are no guidelines for the construction of tags, recognized guidelines do exist for the construction of terms that are used in taxonomies. although these guidelines discuss the elucidation of inter-term relationships (hierarchical, associative, and equivalent), which does not apply to the flat space of folksonomies, they contain sections pertaining to the choice and formation of concept terms that may, in fact, have relevance for the construction of tags.

■ methodology

selection of folksonomy sites

tags were chosen from three popular folksonomy sites: delicious, furl, and technorati (http://www.technorati.com/).
delicious and furl function as bookmarking sites, while technorati enables people to search for and organize blogs. these sites were chosen because they provide daily logs of the most popular tags that have been assigned by their members on a given day. the daily tag logs from each of the sites were acquired over a thirty­day period (february 1–march 2, 2006). the daily tags for each site were entered into an excel spreadsheet. a list of unique tags for each site was compiled after the thirty­day period; unique refers to the single instance of a tag. some of the tags were used only once during the thirty­day period, while others, such as travel, occurred several times, so travel appears only once in the list of unique tags. variations of the same tag—for example, car or cars, cheney or dick cheney—were considered to constitute two unique tags. only english­language tags were accumulated. the analysis of the tag structure in the three lists was conducted by applying the niso guidelines for thesaurus construction, which are the most current set of recognized guidelines for the: contents, display, construction . . . of controlled vocabu­ laries. this standard focuses on controlled vocabularies that are used for the representation of content objects in knowledge organization systems including lists, syn­ onym rings, taxonomies, and thesauri (niso 2005, 1). while folksonomies are not controlled vocabularies, they are lists of terms used to describe content, which means that the niso guidelines could work well as a benchmark against which to examine how folksonomy tags are structured as well as the extent to which this structure reflects the widely accepted norm for controlled vocabu­ laries. 
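the compilation of unique tag lists described above can be sketched in a few lines of python. this is only an illustrative sketch, not the procedure actually used in the study (which relied on an excel spreadsheet): the function name unique_tags, the sample logs, and the trimming and case folding of tags are all assumptions introduced here. as in the study, variant forms such as car and cars are kept as separate unique tags.

```python
from collections import Counter


def unique_tags(daily_logs):
    """Return (sorted unique tags, per-tag occurrence counts) for a list of daily logs.

    Each daily log is a list of tag strings. Trimming whitespace and folding
    case are assumptions of this sketch; no stemming is applied, so "car" and
    "cars" remain two distinct unique tags.
    """
    counts = Counter()
    for day in daily_logs:
        for tag in day:
            counts[tag.strip().lower()] += 1
    return sorted(counts), counts


# Hypothetical three-day sample, not data from the study.
sample_logs = [
    ["travel", "car", "blogs"],
    ["travel", "cars"],
    ["Travel ", "cheney", "dick cheney"],
]

tags, counts = unique_tags(sample_logs)
print(tags)              # each tag is listed once, however often it occurred
print(counts["travel"])  # number of daily-log occurrences of "travel"
```

keeping the occurrence counts alongside the unique list makes it possible to distinguish tags that appeared once from those, such as travel, that recurred over the period.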
section 6 of the guidelines (term choice, scope, and form) was applied to the tags, specifically the following elements (see appendix a for the expanded list): 6.3 term choice 6.4 grammatical form of terms 6.5 nouns 6.6 selecting the preferred form only those elements in section 6 that were found to apply to the lists of unique tags are included in appendix a. for each site, the section 6 elements were applied to each unique tag; for example, it was noted whether a tag consists of one or more terms, whether the tag is a noun, adjective, or adverb, and so on. the frequency of occur­ rence of the section 6 elements was noted for each site and then compared across the three sites in order to determine the existence of any patterns in tag structure and the extent to which these patterns reflect current practice in the design of controlled vocabularies. definition and disambiguation of tags the meanings of the tags were determined based upon (1) the context of their use; and (2) their definition in three external sources, namely merriam webster online dic­ tionary (http://www.m­w.com/); google (http://www. google.com/); and wikipedia (http://www.wikipedia. org/). merriam­webster was used specifically to define all tags other than those that constitute unique entities (for example, named people, places, organizations, or products) and to determine the various meanings of tags that are homographs (for example, art or web). the actual concept represented by homographs was determined by examin­ ing the sites or blogs to which the tag was assigned. merriam­webster also was used to determine the grammatical form of a tag; for example, noun, verbal noun, adjective, or adverb. determining verbal nouns proved to be complicated, especially given that niso relies only on examples to illustrate such nouns. 
some tags could serve as both verbal and simple nouns; for example, the tag clipping could describe the activity to clip or an item that has been clipped, such as a newspaper clipping. similarly, does skiing refer to an activity, or the sport? if the dictionary defined a tag as an activity, the tag was classified as a verbal noun. in the case of tags that were defined as both verbal nouns and simple nouns, the context in which the tag was used determined the final classification.

the dictionary also was used to determine the type of concept represented by a tag. the niso guidelines do not define any of the seven types of concepts outlined in section 6.3.2; they provide only a short list of examples for each type. if the term represented by the tag was defined as an activity, property, material, event, discipline or field of study, or unit of measurement, it was classified as such unless the context of the tag suggested otherwise. if none of these six types was defined in the dictionary, the default value of thing was assigned to the tag. these definitions were then compared to the context in which the tag was used. in the case of the tag art, for example, an examination of the sites associated with this tag indicated that it refers to art objects, rather than the discipline, so it was classified as a thing.

merriam-webster was used to determine whether a tag constitutes a recognized term in standard english (both united states and united kingdom variants); for example, the tag blogs is a recognized term in the dictionary, while podcasting is not. niso does not provide a clear definition of slang, neologism, or jargon, other than to say that they are nonstandard terms not generally found in dictionaries. is the term podcasting, for example, an instance of slang, jargon, or neologism? at what point does jargon become a neologism?
because of the difficulty of distinguishing among these three categories, it was decided to use the broader category nonstandard terms to cover tags that (1) could not be found in the dictionary; or (2) are designated as vulgar or slang in the dictionary.

google and wikipedia were used to define the meanings of tags that constitute unique entities. wikipedia also was used to distinguish the various meanings of tags that constitute abbreviations or acronyms via its disambiguation pages; for example, the tag nfl is given eight possible meanings. in this case, the tag nfl is used to refer specifically to the national football league, so the tag is a homograph, noun, and unique entity.

■ tagging conventions and guidelines of the folksonomy sites

delicious

delicious defines tags as “one-word descriptors that you can assign to your bookmarks. . . . they’re a little bit like keywords but non-hierarchical. you can assign as many tags to a bookmark as you like and easily rename or delete them later. tagging can be a lot easier and more flexible than fitting your information into preconceived categories or folders” (del.icio.us 2006a). the delicious help page for tags encourages people to “enter as many tags as you would like, each separated by a space” in the tag field. the help page explains briefly that two lists of tags may appear under the entry form used to enter a bookmark. the first list consists of popular tags assigned by other people to the bookmark in question, while the second consists of recommended tags, which contains a combination of tags that have been assigned by the client in question as well as other users (del.icio.us 2006b). it is not clear how the two lists differ in that they both contain tags assigned by other people to the bookmark at hand.
the only tangible guideline provided about how tags should be structured is the sentence “your only limitation on tags is that they must not include spaces.” delicious thus addresses only indirectly the fact that it does not allow multiterm tags; the examples provided suggest ways in which compound terms can be expressed; for example, san-francisco, sanfrancisco, san.francisco (del.icio.us 2006b). punctuation thus appears to be allowed in the construction of tags, which is confirmed by the suggestion that asterisks may be used to rate bookmarks: “a tag of * might mean an ok link, *** is pretty good, and a bookmark tagged ***** is awesome” (del.icio.us 2006b). it is thus possible that tags may not consist of recognizable terms, even though asterisks are neither searchable nor indicative of content.

furl

the furl web site uses the term topics rather than tags, but provides no guidelines or instructions for how to construct these topics. furl mentions only that when entering a bookmark, “a small window will pop up. it should have the title and url of the page you are looking at. enter any additional details (i.e., topic, rating, comments) and click save” (furl 2006). furl provides all users with a list of default topics to which one can add at will. furl provides no guidelines as to whether single or multiword topics may be used; it is only by trial and error that the user discovers that the latter are, in fact, allowed.

technorati

in its tags help page, technorati encourages users to “think of a tag as a simple category name. people can categorize their posts, photos, and links with any tag that makes sense” (technorati 2006). a tag may be “anything, but it should be descriptive. please only use tags that are relevant to the post” (technorati 2006). technorati tags are embedded into individual blogs via the link rel="tag"; for example: global warming.
the tag will appear as simply global warming. no other guidelines are provided about how tags should be constructed. as can be seen, the three folksonomy sites provide very few guidelines or conventions for how tags should be constructed. users are not pointed to the common problems that exist in uncontrolled vocabulary, such as ambiguous headings, homographs, synonyms, spelling variations, and so forth, nor are suggestions made as to the preferred form of tags, such as nouns, plural forms, or the distinction between count nouns (for example, dogs) and mass nouns (for example, air). given this lack of guidance, it is not unreasonable to assume that the tags acquired from these sites will vary considerably in form and structure. ■ findings unless stated otherwise, the number of tags per folk­ sonomy site is 76 for delicious, 208 for furl, and 229 for technorati. homographs the niso guidelines recommend that homographs— terms with identical spellings but different meanings— should be avoided as far as possible in the selection of terms. homographs constitute 22 percent of delicious tags, 12 percent of furl tags, and 20 percent of technorati tags. unique entities constitute a significant proportion of the homographs in all three sites, with 71 percent in delicious, 43 percent in furl, and 55 percent in technorati. the most frequently occurring homographs across the three sites consist predominantly of computer­related terms, such as ajax and css. single-word versus multiword terms the niso guidelines recommend that terms should represent a single concept expressed by a single or mul­ tiword term, as needed. single­term tags constitute 93 percent of delicious tags, 76 percent of furl tags, and 80 percent of technorati tags. the preponderance of single tags in delicious may reflect the fact that it does not allow for the use of spaces between the different elements of the same tag; for example, open source. 
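the single-term percentages reported above come down to a simple count over each list of unique tags. the following python sketch assumes that a tag counts as a single term when it contains no internal whitespace; that rule, the function name single_term_share, and the sample tags are assumptions for illustration only and are not taken from the study. such a rule would also miss compound terms that are run together or joined by punctuation, as delicious requires, so it should be read as a rough approximation of the manual analysis rather than a substitute for it.

```python
def single_term_share(tags):
    """Percentage of tags consisting of one whitespace-delimited term."""
    if not tags:
        raise ValueError("tags must not be empty")
    single = sum(1 for tag in tags if len(tag.split()) == 1)
    return 100.0 * single / len(tags)


# Hypothetical sample, not data from the study.
sample_tags = ["travel", "open source", "blogs", "global warming", "css"]
print(single_term_share(sample_tags))
```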
types of concepts

niso provides a list of seven types of concepts that may be represented by terms; while this list is not exhaustive, it represents the most frequently occurring types of concept. table 1 shows the percentage of tags that correspond to each of the seven types of concepts. tags that represent things are clearly predominant in the three sites, with activities and properties forming a distant second and third in importance. none of the tags represent events or measures, and only a fraction of the technorati tags represent materials. the niso guidelines provide no indication of the expected distribution of the types of concepts, so it is difficult to determine to what extent the three folksonomy sites are consistent with other lists of descriptors. none of the tags fell outside the scope of the seven types of concepts.

unique entities

unique entities may represent the names of people, places, organizations, products, and specific events (niso 2005). unique entities constitute 22 percent of delicious tags, 14 percent of furl tags, and 49 percent of technorati tags. there is no consistency in the percentage of unique entities: technorati has more than twice the percentage that delicious has, and three and a half times the percentage that furl has. computer-related products constitute 100 percent of the unique entities in delicious, 63 percent in furl, and 38 percent in technorati. the remainder of the unique entities in furl and technorati represent places, people, and corporate bodies. the unique entities in technorati are closely related to developments in current news events, an occurrence that is likely due to the site’s focus on blogs rather than web sites. as will be discussed in a subsequent section, the unique entities constitute a significant proportion of the tags that represent ambiguous acronyms or abbreviated terms, such as ajax or psp.

table 1.
concepts represented by the tags

                 delicious (%)   furl (%)   technorati (%)
things                76            82          90.0
materials              0             0           0.4
activities            12            10           4.0
events                 0             0           0.0
properties             8             6           4.0
disciplines            4             3           1.0
measures               0             0           0.0

grammatical forms of terms

the niso standards recommend the use of the following grammatical forms of terms:
■ nouns and noun phrases
■ verbal nouns
■ noun phrases
■ premodified noun phrases
■ postmodified noun phrases
■ adjectives
■ adverbs

table 2 shows the distribution of the grammatical forms of tags. if all the types of nouns are combined, then 95 percent of delicious tags, 94 percent of furl tags, and 97 percent of technorati tags constitute types of nouns. the grammatical structure of the tags in the three folksonomy sites thus reflects very closely the niso recommendations that tags consist of mainly nouns, with the added proviso that adjectives and adverbs be kept to a minimum. none of the folksonomy sites used adverbs as tags, and the number of adjectives was very small, forming an average total of 5 percent of the tags.

nouns (plural and singular forms)

niso divides nouns into two categories: count nouns (how many?), and noncount, or mass nouns (how much?). niso recommends that count nouns appear in the plural form and mass nouns in the singular form. niso specifies other types of nouns that appear typically in the singular form:
■ abstract concepts
■ beliefs; for example, judaism, taoism
■ activities; for example, digestion, distribution
■ emotions; for example, anger, envy, love, pity
■ properties; for example, conductivity, silence
■ disciplines; for example, chemistry, astronomy
■ unique entities

table 3 shows the distribution of the singular and plural forms of noun tags. the term singular nouns was used to collocate all the types of non-plural nouns.
table 3 represents the number of tags that constitute count nouns; this does not mean, however, that the tags appeared correctly in the plural form. of the count nouns, 36 percent of delicious tags, 62 percent of furl tags, and 34 percent of technorati tags appeared correctly in the plural form. it should be noted that although table 1 indicates that properties constitute 8 percent of delicious, 6 percent of furl, and 4 percent of technorati tags, most of these tags are adjectives, and thus are not counted in table 3. the niso guidelines do not suggest the typical distribution of count versus singular nouns, but table 3 indicates that at least among the three folksonomy sites, singular nouns form the bulk of the tags.

table 2. grammatical form of tags

                              delicious (%)   furl (%)   technorati (%)
nouns                         88              71         86
verbal nouns                  5               6          4
premodified noun phrases      1               15         4
postmodified noun phrases     0               2          3
adjectives                    6               6          3
adverbs                       0               0          0

table 3. count and noncount noun tags

                     delicious (%)   furl (%)   technorati (%)
count nouns          18              35         23
noncount nouns       77              59         74
  mass nouns         36              32         19
  activities         12              10         4
  properties         3               0          1
  disciplines        4               3          1
  unique             22              14         49
total                95              94         97

20 information technology and libraries | september 2007

spelling

the niso guidelines divide the spelling of terms into two sections: warrant and authority. with respect to warrant, niso recommends that “the most widely accepted spelling of words, based on warrant, should be adopted,” with cross-references made between variant spellings of terms. as far as authority is concerned, spelling should follow the practice of well-established dictionaries or glossaries. while spelling refers normally to whole words, i included in this analysis acronyms and abbreviations used to denote unique entities, such as countries or product names, as there are recognized spellings of such acronyms and abbreviations.
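before turning to the spelling results, the percentages in table 3 can be cross-checked with a few lines of arithmetic. the sketch below is an illustration only and is not part of the original study: it transcribes the table 3 figures and assumes that mass nouns, activities, properties, disciplines, and unique entities are the sub-categories of noncount nouns, and that count plus noncount nouns give the total. the names `TABLE_3`, `NONCOUNT_PARTS`, and `check_site` are hypothetical.

```python
# Percentages transcribed from table 3 (count and noncount noun tags).
TABLE_3 = {
    "delicious": {"count": 18, "noncount": 77, "mass": 36, "activities": 12,
                  "properties": 3, "disciplines": 4, "unique": 22, "total": 95},
    "furl": {"count": 35, "noncount": 59, "mass": 32, "activities": 10,
             "properties": 0, "disciplines": 3, "unique": 14, "total": 94},
    "technorati": {"count": 23, "noncount": 74, "mass": 19, "activities": 4,
                   "properties": 1, "disciplines": 1, "unique": 49, "total": 97},
}

# Assumption: these five rows are the breakdown of the noncount row.
NONCOUNT_PARTS = ("mass", "activities", "properties", "disciplines", "unique")


def check_site(row):
    """Check that the sub-categories and the two main rows add up."""
    parts_sum = sum(row[name] for name in NONCOUNT_PARTS)
    return {
        "parts_match_noncount": parts_sum == row["noncount"],
        "rows_match_total": row["count"] + row["noncount"] == row["total"],
    }


results = {site: check_site(row) for site, row in TABLE_3.items()}
for site, outcome in results.items():
    print(site, outcome)
```

under these assumptions both checks hold for all three sites, which supports reading the five rows beneath noncount nouns as its breakdown.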
table 4 shows the tags from the three sites that do not conform to recognized spelling; the terms in italics show the accepted spelling. very few tags fail to conform to spelling warrant: 4 percent of the delicious tags, 3 percent of the furl tags, and 2 percent of the technorati tags. two of the nonrecognized spellings in delicious are likely due to the difficulty of creating compound tags in this site, as was discussed earlier. the remainder of the tags conformed to recognized spellings as found in the three reference sources consulted. the findings suggest that tags are spelled consistently and in keeping with recognized warrant across the three folksonomy sites.

because of the international nature of the three folksonomy sites, no default english spelling was assumed. table 5 shows those tags whose spellings reflect regional variations. none of the three folksonomy sites featured lexical variants of any one tag. as the three sites are united states–based, the preponderance of american spelling is not surprising. what is surprising, however, is that technorati features only the british variants in the total of tags examined in this study. it should be pointed out that the two lexical variants of these terms do appear in the three folksonomy sites; the two variants simply did not appear in the daily logs examined. no system to enable cross-referencing (for example, humour use or see humor) exists in any of the three folksonomy sites, nor is cross-referencing discussed in the help logs of the sites.

abbreviations, initialisms, and acronyms

niso recommends that the full form of terms should be used. abbreviations or acronyms should be used only when they are so well established that the full form of the term is rarely used. cross-references should be made between the full and abbreviated forms of the terms.
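the kind of cross-referencing that niso recommends, and that none of the three sites offers, can be sketched as a pair of lookup tables. the example below is a minimal python illustration under stated assumptions, not a description of any site’s software: the class `CrossReferences` and its methods are hypothetical, and the choice of humor as the preferred term is arbitrary.

```python
from dataclasses import dataclass, field


@dataclass
class CrossReferences:
    """Hypothetical USE / USED FOR table for tag variants."""
    use: dict = field(default_factory=dict)       # variant -> preferred term
    used_for: dict = field(default_factory=dict)  # preferred term -> variants

    def add(self, variant, preferred):
        """Record 'variant USE preferred' and the reciprocal reference."""
        self.use[variant] = preferred
        self.used_for.setdefault(preferred, set()).add(variant)

    def resolve(self, tag):
        """Return the preferred form of a tag, or the tag itself."""
        return self.use.get(tag, tag)

    def expand(self, tag):
        """Return every form a search for this tag could also match."""
        preferred = self.resolve(tag)
        return {preferred} | self.used_for.get(preferred, set())


xref = CrossReferences()
xref.add("humour", "humor")  # assumption: humor chosen as preferred form

print(xref.resolve("humour"))
print(sorted(xref.expand("humor")))
```

with such a table, a search on either spelling could be expanded to both forms, so that items tagged humour and items tagged humor would be retrieved together.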
abbreviations and acronyms constitute 22 percent of delicious tags, 16 percent of furl tags, and 19 percent of technorati tags. the majority of these abbreviations and acronyms pertain to unique entities, such as product names (for example, flash, mac, and nfl). in the case of delicious and furl, none of the abbreviated tags is referred to also by its full form. four of the abbreviated technorati tags have full-form equivalents:

■ cheney/dick cheney
■ ie/internet explorer
■ sheehan/cindy sheehan
■ uae/united arab emirates

abbreviations and acronyms play a significant role in the ambiguity of the tags from the three sites; they represent 71 percent of the abbreviated delicious tags, 45 percent of the abbreviated furl tags, and 73 percent of the abbreviated technorati tags. furl and technorati are very similar in the proportion of abbreviated tags used, but delicious is significantly higher. the delicious tags are focused more heavily upon computer-related products, which may explain why there are so many more abbreviated tags, as many of these products are often referred to by these shorter terms; for example, css, flash, apple, and so on.

table 4. tags that do not conform to spelling warrant

delicious (n=76)            furl (n=208)                          technorati (n=229)
howto (how to)              hollywood bday (hollywood birthday)   met-art pics (metropolitan art pictures)
opensource (open source)    med-books (medical books)             superbowl (super bowl)
toread (to read)            oralsex (oral sex)                    web-20 (web2.0)

table 5. tags that reflect regional spelling variations

delicious (n=76); furl (n=208); technorati (n=229)
humor (u.s. spelling); humor (u.s. spelling); favourite (british spelling); jewelry (u.s. spelling); humour (british spelling)

neologisms, slang, and jargon

the niso guidelines explain that neologisms, slang, and jargon terms are generally not included in standard dictionaries and should be used only when there is no other widely accepted alternative. nonstandard tags do not constitute a particularly large proportion of the total number of tags per site; they account for 3 percent of the delicious tags, 10 percent of the furl tags, and 6 percent of the technorati tags. the nonstandard tags refer almost exclusively to either computer- or sex-related concepts, such as podcast, wiki, and camsex.

nonalphabetic characters

this section of the niso guidelines deals with the use of capital letters and nonalphabetic characters. capitalization was not examined in the three folksonomy sites, as none of them are case sensitive; delicious and furl, for example, post tags in lower case, regardless of whether the user has assigned upper or lower case, while technorati shows capital letters only if they are assigned by the users themselves. the niso guidelines state that nonalphabetic characters, such as hyphens, apostrophes (unless used for the possessive case), symbols, and punctuation marks, should not be used because they cause filing and searching problems. table 6 shows the occurrence of nonalphabetic characters in the three folksonomy sites.

a very small proportion of the tags in the three folksonomy sites contains nonalphabetic characters, namely 1 percent of the delicious tags, and 3 percent of the furl and technorati tags. as was discussed previously, the delicious help screens may encourage people to use nonalphabetic characters to construct compound tags; in spite of this, however, such characters are not, in fact, used very frequently.
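the character classes named in the niso guidelines can be detected mechanically. the following sketch is only an illustration of how such a count might be automated, not the procedure used in the study: the mapping `CHARACTER_CLASSES` and the function `nonalphabetic_characters` are hypothetical, the sample tags are taken from table 6, and it is an assumption here that digits and spaces are not flagged.

```python
# Character classes named in the niso guidelines and in table 6.
# The mapping and the function name are hypothetical.
CHARACTER_CLASSES = {
    "-": "hyphen",
    "'": "apostrophe",
    "\u2019": "apostrophe",  # typographic apostrophe
    "_": "underscore",
    ".": "full stop",
    "/": "forward slash",
    "+": "plus sign",
}


def nonalphabetic_characters(tag):
    """List the class of every character that is not a letter, digit,
    or space (assumption: digits and spaces are left unflagged)."""
    found = []
    for ch in tag:
        if ch.isalnum() or ch == " ":
            continue
        found.append(CHARACTER_CLASSES.get(ch, "other"))
    return found


sample_tags = ["safari_export", "jcr+", "/africa", "web-2.0", "podcast"]
report = {tag: nonalphabetic_characters(tag) for tag in sample_tags}
for tag, classes in report.items():
    print(f"{tag}: {classes or 'none'}")
```

dividing the number of flagged tags by the number of tags examined per site would give percentages of the kind reported above.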
table 6. nonalphabetic characters

                 delicious (n=76)   furl (n=208)                      technorati (n=229)
hyphens          (none)             hollywood b-day; urlproject       consumercredit; web2.0
apostrophes      (none)             mom’s medical (possessive)        valentine’s day (possessive)
underscore       safari_export      blogger_life                      (none)
full stop        (none)             web 2.0 (part of product name)    web-2.0 (part of product name)
forward slash    (none)             (none)                            /africa
+ sign           (none)             jcr+                              (none)

it should be noted that the terms in table 6 were all searched, with punctuation intact, in their respective sites; in all three cases, the search engines retrieved the tags and their associated blogs or web sites, which suggests that nonalphabetic characters may not negatively impact searching.

■ discussion and recommendations

the tags examined from the three folksonomy sites correspond closely to a number of the niso guidelines pertaining to the structure of terms, namely in the types of concepts expressed by the tags, the predominance of single tags, the predominance of nouns, the use of recognized spelling, and the use of primarily alphabetic characters. potential problem areas in the structure of the tags pertain to the inconsistent use of the singular and plural form of count nouns, the difficulty with creating multiterm tags in delicious, and the incidence of ambiguous tags in the form of homographs and unqualified abbreviations or acronyms.

as has been seen, a significant proportion of tags that represent count nouns appears incorrectly in the singular form. because many search engines do not deploy default truncation, the use of the singular or plural form could affect retrieval; a search for the tag computer in delicious, for example, retrieved 208,409 hits, while one for computers retrieved 91,205 hits. some of the results from the two searches overlapped, but only where both the singular and plural forms of the tag coexisted. it would thus be useful for the help features of the folksonomy sites to explain the difference between count and noncount nouns and to discuss the impact of the form of the noun upon retrieval.

while all three sites conform to the niso recommendation that single terms be used whenever possible, some concepts cannot be expressed in this fashion, and thus folksonomy sites should accommodate the use of multiterm tags. furl and technorati allow for their use, but make no mention of this feature in their help screens, which means that such tags may be constructed inconsistently (for example, by the insertion of punctuation) where a simple space between the tags will suffice. as has been seen, delicious does not allow directly for the construction of multiterm tags, and in its instructions it actually promotes inconsistency in how various punctuation devices may be used to conflate two or three separate tags, once again to the detriment of retrieval, as is shown below:

opensource: 103,476 hits
open_source: 91,205 hits
open.source: 26,494 hits

delicious should consider allowing for the insertion of spaces between the composite words of a compound tag; without this facility, users may be unaware of how to create compound tags. alternatively, delicious should recommend the use of only one punctuation symbol to conflate terms, such as the underscore. furl and technorati should explain clearly that compound tags may be formed by the simple convention of placing a space between the terms.

ambiguous headings constitute the most problematic area in the construction of the tags; these headings take the form of homographs and abbreviations or acronyms. in the case of computer-related product names, it may be safe to assume that in the context of an online environment the meaning of these product names is relatively self-evident.
in the case of the tag yahoo, for example, none of the sites or blogs associated with this tag pertained to “a member of a race of brutes in swift’s gulliver’s travels who have the form and all the vices of humans, or a boorish, crass, or stupid person” (merriam-webster 2007), but referred consistently to the internet service provider and search engine. on the other hand, the tag ajax was used to refer to asynchronous javascript and xml technology as well as to a number of mainly european soccer teams. given the international audience of these folksonomy sites, it may be unwise to assume that the meanings of these homographs are self-evident.

library of congress subject headings often uses parenthetical qualifiers to clarify the meaning of terms, as in python (computer program language), even though this goes against niso recommendations. it is unlikely, however, that such use of parentheses will be effective in the folksonomy sites. a search for opera (browser), for example, will likely imply an underlying boolean “and” operator, which detracts from the purpose and value of the parenthetical qualifier; this was confirmed in a furl search, where the terms opera and browser appeared either immediately adjacent to each other or within the same document.

the application of the section of the niso guidelines pertaining to abbreviations and acronyms is particularly difficult, as it is important to balance the use of abbreviated forms of concepts that are so well known that the full version is hardly used against the risk of creating ambiguous tags. the fact that abbreviated forms appear so prominently in the daily logs of the three folksonomy sites suggests that these abbreviated forms are, in fact, very well established.
at face value, therefore, many of the abbreviated tags are ambiguous because they can refer to different concepts, but it is questionable whether such tags as css, flash, apple, and rss, for example, are, in fact, ambiguous to the users of the sites. the use of the full forms for these tags seems cumbersome, as these concepts are hardly ever referred to in their full form. it could possibly be argued, in fact, that in some cases, the full forms may not be familiar; i may know to what concept rss refers, for example, without knowing the specific words represented by the letters r, s, s.

the possible ambiguity of abbreviated forms is compounded by the fact that none of the three folksonomy sites allows for cross-references between equivalent terms, which is a standard feature of most controlled vocabularies, for example:

nfl use national football league
national football league used for nfl

the help screens of the three sites do not address the notion of ambiguity in the construction of tags: they do not draw people’s attention to the inherent ambiguity of abbreviated forms that may represent more than one concept. the sites also fail to address the fact that abbreviated forms (or any tag, for that matter) may be culturally based, so that while the meaning of nfl may be obvious to north american users, this may not be the case for people who live in other geographic areas. it may be useful for the folksonomy sites to add direct links to an online dictionary and to wikipedia, and to encourage people to use these sites to determine whether their chosen tags may have more than one application or meaning; i had not realized, for example, that rss could represent twenty-three different concepts until i used wikipedia and was led to a disambiguation page. access to these external sources may help users decide which full version of the abbreviation to use in the case of ambiguity.
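the suggestion that users check whether a chosen tag has more than one meaning can be pictured as a lookup against a sense inventory. the sketch below is a hedged illustration rather than a feature of any of the three sites: the dictionary `SENSES` is a small hypothetical inventory built from the yahoo and ajax examples discussed in this section, and the function `ambiguous_tags` is likewise hypothetical; a real inventory might be drawn from a dictionary or an encyclopedia disambiguation page.

```python
# Hypothetical sense inventory, built from examples in this section.
SENSES = {
    "ajax": ["asynchronous javascript and xml", "soccer team"],
    "yahoo": ["internet service provider and search engine",
              "boorish, crass, or stupid person"],
    "podcast": ["podcast"],
}


def ambiguous_tags(tags, senses=SENSES):
    """Return {tag: senses} for every tag with more than one known sense."""
    flagged = {}
    for tag in tags:
        known = senses.get(tag.lower(), [])
        if len(known) > 1:
            flagged[tag] = known
    return flagged


flagged = ambiguous_tags(["ajax", "podcast", "yahoo", "css"])
for tag, known in flagged.items():
    print(f"{tag} may mean: " + "; ".join(known))
```

a tag that is absent from the inventory, such as css in this run, passes unflagged, so a check of this kind is only as complete as the reference source behind it.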
the examination of the structure of the tags pointed to some deficiencies in section 6 of the niso guidelines, specifically its occasional lack of sufficient definition or explanation of some of its recommendations. the guidelines list seven types of concepts that are typically represented by controlled vocabulary terms, but rely only upon a few examples to define the meaning and scope of these concepts. the guidelines thus provide no mechanism by which the creators of terms can assess consistently the types of concepts represented. how, for example, is a discipline to be determined? does the term business represent a discipline if it is a subject area that is taught formally in a post-secondary institution, for example? is it necessary for a discipline to be recognized as such among a majority of educational institutions? in its examples for events, niso lists holidays and revolutions. it is unclear, however, what level of specificity applies to this concept; would christmas, for example, be considered an event or a unique entity/proper noun (which is listed separately from types of concepts)? it is only later in the guidelines, under the examples provided for unique entities (for example, fourth of july), that one may assume that a named event should be considered a unique entity. verbal nouns also are difficult to determine based only upon the niso examples, and once again no guidelines are provided to determine whether a noun represents an activity or a thing, or possibly both; for example, skiing or clipping.

the lack of clear definitions in niso also appeared in the section pertaining to slang, neologisms, and jargon, which are considered to be nonstandard terms that do not generally appear in dictionaries. as was discussed previously, it is not clear at what point a jargon term or a slang term becomes a neologism.
all of the slang tags found in the three sites (for example, babe) appeared in merriam-webster, which may serve to make this niso section even more ambiguous.

■ conclusion

the most notable suggested weaknesses of folksonomies are their potential for ambiguity, polysemy, synonymy, and basic level variation as well as the lack of consistent guidelines for the choice and form of tags. the examination of the tags of the three folksonomy sites in light of the niso guidelines suggests that ambiguity and polysemy (such as homographs) are indeed problems in the structure of the folksonomy tags, although homographs and ambiguous tags each constitute fewer than one-quarter of the tags in each of the three folksonomy sites. in other words, although ambiguity and polysemy are certainly problematic areas, most of the tags in each of the three sites are unambiguous in their meaning and thus conform to niso recommendations.

the help screens of the three folksonomy sites provide few tangible guidelines for (1) the construction of tags, which affects the construction of multiterm tags; and (2) the clear distinction between the singular and plural forms of count versus noncount nouns. as has been shown, the use of the singular or plural forms of terms, as well as the use of punctuation to form multiterm tags, affects search results. a large proportion of the tags in all three sites consists of single terms, which mitigates the impact on retrieval, but the inconsistent use of the singular and plural forms of nouns is indeed significant and thus may have a marked effect upon retrieval. synonymy and basic level variation were not examined in this study, but are certainly worthy of further exploration.

in other areas, the tags conform closely to the niso guidelines for the choice and form of controlled vocabularies. the tags represent mostly nouns, with very few unqualified adjectives or adverbs.
the tags represent the types of concepts recommended by niso and conform well to recognized standards of spelling. most of the tags conform to standard usage; there are few instances of nonstandard usage, such as slang or jargon. in short, the structure of the tags in all three sites is well within the standards established and recognized for the construction of controlled vocabularies.

should library catalogs decide to incorporate folksonomies, they should consider creating clearly written recommendations for the choice and form of tags that could include the following areas:

■ the difference between count and noncount nouns, as well as an explanation of how the use of the singular and plural forms affects retrieval.
■ one standard way in which to construct multiterm tags; for example, the insertion of a space between the component terms, or the use of an underscore between the terms.
■ a link to a recognized online dictionary and to wikipedia to enable users to determine the meanings of terms, to disambiguate amongst homographs, and to determine if the full form would be preferable to the abbreviated form. an explanation of the impact of ambiguous tags and homographs upon retrieval would be useful.
■ an acceptable use policy that would cover areas of potential concern, such as the use of potentially offensive tags, overly graphic tags, and so forth. although such terms were not the focus of this study, their presence was certainly evident in some cases, and would need to be considered in an environment that includes clients of all ages.

with the use of such expanded guidelines and links to useful external reference sources, folksonomies could serve as a very powerful and flexible tool for increasing the user-friendliness and interactivity of public library catalogs, and also may be useful for encouraging other activities, such as informal online communities of readers and user-driven readers’ advisory services.

works cited

bateman, s., c. brooks, and g.
mccalla. 2006. collaborative tagging approaches for ontological metadata in adaptive e-learning systems. http://www.win.tue.nl/sw-el/2006/camera-ready/02-bateman_brooks_mccalla_swel2006_final.pdf (accessed jan. 11, 2007).
bruce, h., w. jones, and s. dumais. 2004. keeping and re-finding information on the web: what do people do and what do they need? seattle: information school. http://kftf.ischool.washington.edu/re-finding_information_on_the_web3.pdf (accessed jan. 11, 2007).
cattuto, c., v. loreto, and l. pietronero. 2006. collaborative tagging and semiotic dynamics. http://arxiv.org/ps_cache/cs/pdf/0605/0605015.pdf (accessed jan. 11, 2007).
del.icio.us. 2006a. del.ico.us/about. http://del.icio.us/about/ (accessed jan. 11, 2007).
del.icio.us. 2006b. del.ico.us/help/tags. http://del.icio.us/help/tags (accessed jan. 11, 2007).
dempsey, l. 2003. the recombinant library: portals and people. journal of library administration 39, no. 4: 103–36.
fichter, d. 2006. intranet applications for tagging and folksonomies. online 30, no. 3: 43–45.
furl. 2006. how to save a page in furl. http://www.furl.net/howtosave.jsp (accessed jan. 11, 2007).
golder, s. a., and b. a. huberman. 2006. usage patterns of collaborative tagging systems. journal of information science 32, no. 2: 198–208.
guy, m., and e. tonkin. 2006. tidying up tags? d-lib magazine 12, no. 1. http://www.dlib.org/dlib/jan.06/guy/01guy.html (accessed jan. 11, 2007).
ketchell, d. s. 2000. too many channels: making sense out of portals and personalization. information technology and libraries 19, no. 4: 175–79.
kroski, e. 2006. the hive mind: folksonomies and user-based tagging. http://infotangle.blogsome.com/2005/12/07/the-hive-mind-folksonomies-and-user-based-tagging/ (accessed jan. 11, 2007).
mathes, a. 2004. folksonomies - cooperative classification and communication through shared metadata.
http://www.adammathes.com/academic/computer-mediated-communication/folksonomies.html (accessed jan. 11, 2007).
merholz, p. 2004. ethnoclassification and vernacular vocabularies. http://www.peterme.com/archives/000387.html (accessed jan. 11, 2007).
merriam-webster. 2007. yahoo. http://www.m-w.com/ (accessed jan. 11, 2007).
michlmayr, e. 2005. a case study on emergent semantics in communities. http://wit.tuwien.ac.at/people/michlmayr/publications/michlmayr_casestudy_on_emergentsemantics_final.pdf (accessed jan. 11, 2007).
mitchell, r. l. 2005. tag teams wrestle with web content. computerworld 38, no. 16: 31.
niso. 2005. guidelines for the construction, format, and management of monolingual controlled vocabularies. ansi/niso z39.19-2005. bethesda, md.: national information standards organization. http://www.niso.org/standards/resources/z39-19-2005.pdf (accessed jan. 11, 2007).
quintarelli, e. 2005. folksonomies: power to the people. http://www.iskoi.org/doc/folksonomies.htm (accessed jan. 11, 2007).
shirky, c. 2004. folksonomy. http://www.corante.com/many/archives/2004/08/25/folksonomy.php (accessed jan. 11, 2007).
spiteri, l. f. 2006. the use of folksonomies in public library catalogues. the serials librarian 51, no. 2: 75–89.
szekely, b., and e. torres. 2005. ranking bookmarks and bistros: intelligent community and folksonomy development. http://torrez.us/archives/2005/07/13/tagrank.pdf (accessed jan. 11, 2007).
technorati. 2006. technorati help:tags. http://www.technorati.com/help/tags.html (accessed jan. 11, 2007).
trant, j., and b. wyman. 2006. investigating social tagging and folksonomy in art museums with steve.museum. http://www.archimuse.com/research/www2006-tagging-steve.pdf (accessed jan. 11, 2007).
udell, j. 2004. collaborative knowledge gardening. http://www.infoworld.com/article/04/08/20/34opstrategic_1.html (accessed jan. 11, 2007).
vander wal, t. 2006. understanding folksonomy: tagging that works.
http://s3.amazonaws.com/2006presentations/dconstruct/tagging_in_rw.pdf (accessed jan. 11, 2007).
vanderwal.net. 2005. folksonomy definition and wikipedia. http://www.vanderwal.net/random/entrysel.php?blog=1750 (accessed jan. 11, 2007).
wikipedia. 2006. folksonomy. http://en.wikipedia.org/wiki/folksonomy (accessed jan. 11, 2007).
wu, h., m. zubair, and k. maly. 2006. harvesting social knowledge from folksonomies. http://delivery.acm.org/10.1145/1150000/1149962/p111-wu.pdf (accessed jan. 11, 2007).

appendix a: list of niso elements

6.3 term form
6.3.1 single word vs. multiword terms
6.3.2 types of concepts
    terms for things and their physical parts
    terms for materials
    terms for activities or processes
    terms for events or occurrences
    terms for properties or states
    terms for disciplines or subject fields
    terms for units of measurement
6.3.3 unique entities
6.4 grammatical forms of terms
6.4.1 nouns and noun phrases
6.4.1.1 verbal nouns
6.4.1.2 noun phrases
6.4.1.2.1 premodified noun phrases
6.4.1.2.2 postmodified noun phrases
6.4.2 adjectives
6.4.3 adverbs
6.5 nouns
6.5.1 count nouns
6.5.2 mass nouns
6.5.3 other types of singular nouns
6.5.3.1 abstract concepts
6.5.3.2 unique entities
6.6.2 spelling
6.6.2.1 spelling—warrant
6.6.2.2 spelling—authorities
6.6.3 abbreviations, initialisms, and acronyms
6.6.3.1 preference for abbreviation
6.6.3.2 preference for full form
6.6.3.2.1 general use
6.6.3.2.2 ambiguity
6.6.4 neologisms, slang, and jargon
6.7.1 capitalization and nonalphabetic characters

reproduced with permission of the copyright owner. further reproduction prohibited without permission.

graphical table of contents for library collections: the application ... herrero-solana, victor; félix moya-anegón; guerrero-bote, vicente; zapico-alonso, felipe. information technology and libraries; mar 2006; 25, 1; proquest education journals pg. 43

president’s message: the year in review—open everything
colleen cuddy
information technologies and libraries | june 2012 1

as i sit down to write my last president’s column a variety of topics are running through my mind. but as i focus on just one word to sum up the year, “open” rises to the top of the list. for truly it was a year of all things open. my presidential theme is open data/open science and i am looking forward to hearing tony hey and clifford lynch speak at the lita president’s program later this month on this topic. dr. lynch is also the recipient of this year’s lita/library hi tech award for outstanding communication in library and information technology, cosponsored by emerald group publishing limited. the prestigious frederick g. kilgour award for research in library and information technology, co-sponsored by oclc, is being given to g. sayeed choudhury this year. dr. choudhury is a longtime proponent of open data and the award recognizes his leadership in the field of data curation through the national science foundation supported data conservancy project.

as you well know, ital is now an open-access journal. open access continues to be a hot topic, and rightly so. my last column was devoted to the subject of open access, but i do want to remind librarians to advocate for open access in the coming year. please keep up the fight!
in addition to seeing our journal to its new platform, the publications committee has also been busy with a few new lita guides, one of which, “getting started with gis,” by eva dodsworth, provides some guidance on harnessing data sets to work with geospatial technology. ms. dodsworth will be conducting an online course on this topic in august and the education committee has many new courses in the pipeline.

internally lita has been working towards a more open and transparent governance structure. the board has been relentless in making sure that all of its meetings are open, from in-person meetings at conferences to our monthly phone meetings to conversations on ala connect. we have been streaming our board meetings live and now will archive the recordings for a limited amount of time. this move has not been without challenges as board members and the lita office struggled to build open communication with each other and the membership. sometimes the challenges were ideological or legal, and sometimes the very technology that we embrace has caused problems, but i think it is safe to say that lita leadership is working towards a common goal of a transparent structure with open communication channels.

colleen cuddy (colleen.cuddy@med.cornell.edu) is lita president 2011-12 and director of the samuel j. wood library and c. v. starr biomedical information center at weill cornell medical college, new york, new york.

president’s message | cuddy 2

we opened up communication channels to get feedback on what our membership would like most when zoe stewart-marshall, incoming president, hosted a town hall meeting at the ala midwinter meeting that focused on member feedback. i know that she is working hard to address membership needs during her presidency.

as a medical librarian i often travel in circles outside of ala and when my medical colleagues learned that i was lita president they were really impressed.
lita is a well-known and well-respected brand in the library community. talking to my non-lita colleagues reinforced the value that lita brings to the entire profession, particularly through our programming, education, and the way in which we share and exchange information in open forums such as the lita blog and listserv. (of course i hope that we have gained some new members through this outreach!) clearly we are doing many things right and we should not lose sight of what is great about lita as we work on addressing areas that need improvement.

one thing that is consistently great about lita is its annual sponsorship of ala emerging leaders. this year we sponsored two lita members who were part of the 2012 ala emerging leaders cohort: jodie gambill and tasha keagan. both were assigned to a team working on a lita project that asked for a recommendation and plan for the implementation of a lita experts profile system. the team was responsible for identifying the software to employ and creating an implementation plan with ontology recommendations. the team has identified vivo (an open-source, semantic-web application) as the software for the project and will present its findings and implementation plan to the lita board and the ala community at the ala annual conference. the team did an outstanding job on this project and completed the deliverable on time, with very little guidance from lita leadership, a sure sign of leadership!

yet, i was often reminded that as we embrace our upcoming leaders, we should not forget that leadership occurs on all levels. one message that i heard throughout my presidency is that lita should do more for mid-career librarians, and this sentiment is shared by members of other organizations in which i am active. this is a challenge that lita leadership is poised to take on as it balances its services to membership.
as i now count eighteen occurrences of the word “open” in this column i believe i have made my point and it is time to sign off. although i am finishing up my duties as lita president, i am not saying goodbye. i look forward to my new role as past-president, particularly in hosting the 2012 lita national forum in columbus, ohio (october 4-7): new world of data: discover. connect. remix. the national forum planning committee led by susan sharpless smith has done an outstanding job putting together an excellent meeting. the committee has lined up interesting speakers such as eric hellman, ben shneiderman, and sarah houghton, and thoughtfully evaluated many paper and poster submissions. i am sure we will all learn quite a bit from our colleagues as we attend sessions and network. i will be hosting a dinner and i hope to see some of you there as i enjoy what i hope will be a more relaxed role as past-president. it has been an honor to serve you and i look forward to working with lita in the years to come!

technical communications

announcements

new cola chairman

brian aveney, of the richard abel co., has been elected chairman of the cola discussion group, effective january 1974. prior to his present position with the design group at richard abel, mr. aveney was head of the systems office at the university of pennsylvania libraries. the cola discussion group traditionally meets on the sunday afternoon preceding each ala conference. meetings are open, and all are invited to attend.

and a book review editor

a member of the university of british columbia graduate school of library science faculty, peter simmons, has been appointed book review editor of the journal of library automation. mr. simmons is the author of the "library automation" chapter in the annual review of information science and technology, volume 8, the most recent of his publications. authors and publishers are requested to send relevant literature to mr.
simmons at the graduate school of library science, university of british columbia, vancouver, british columbia, for review.

missing issues?

the rapid publication sequence of the 1972 and 1973 volumes of the journal of library automation has created problems for some isad members and subscribers. if your address changed during 1973, or if your ala membership suffered any quirk, you are especially likely to have missed one or more of the issues due you. if this is the case, please write to the membership and subscription records department of the american library association, 50 e. huron st., chicago, il 60611. indicate which issues you are missing, and every attempt will be made to forward them to you as quickly as possible.

new eric clearinghouse

stanford university's school of education has been awarded a one-year contract by the national institute of education (nie) to operate the newly-formed eric clearinghouse on information resources under the direction of dr. richard clark. the new clearinghouse will be part of the stanford center for research and development in teaching. the clearinghouse on information resources is the result of a merger of two previous clearinghouses: the one on media and technology formerly located at the stanford center for research and development in teaching, and the one on library and information sciences formerly located at the american society for information science in washington, d.c. the new clearinghouse is responsible for collecting information concerning print and nonprint learning resources, including those traditionally provided by school and community libraries and those provided by the growing number of technology-based media centers. the clearinghouse collects and processes noncopyright documents on the management, operation, and use of libraries, the technology to improve their operation, and the education, training, and professional activities of librarians and information specialists.
in addition, the clearinghouse is collecting material on educational media such as television, computers, films, radio, and microforms, as well as techniques which are an outgrowth of technology: systems analysis, individualized instruction, and microteaching.

library automation activities: international

computerized system at the james cook university of north queensland library.

the system design phase of an integrated acquisitions/cataloging system for the library at the james cook university of north queensland has been completed by a firm of computer consultants, ian oliver and associates, and programming has commenced.

history

the system, known as catalist, is a batch system to be operated on the university's central computer, a pdp-10. it will be programmed in fortran and macro, the assembly language of the pdp-10.

description

the system will cover all aspects of cataloging/acquisitions procedures for all library material apart from serials including: (a) production of orders, followups, reports (b) budget control (c) fund accounting (d) routing slips (e) accessions lists (f) in-process and catalog supplements (author/title and added entry) and subject catalog supplement shelflist and supplement (g) catalogs (author/title and subject) (h) union catalog cards. some features of the system include the maintenance of average book price in all subject areas. these are continually updated by the system to reflect the current fluctuations in the trade. this information will be used together with machine-based arrival predictions to control the budget and fund allocations. marc data will be used as much as possible, with records for individual items being supplied from external sources on request. the in-process catalogues, which will contain items on order, items arrived, and items cataloged since the previous edition of the catalog, will contain added entries for all material where such information is available.
the catalogs will be produced on com. roll film will be used for public catalogs and fiche for in-house use. data for the national union catalogue will be submitted on minimally-formatted computer-produced cards. for further information contact ms. c. e. kenchington, systems librarian, post office, james cook university of north queensland, australia 4811.

journal of library automation vol. 7/1 march 1974

technical exchanges

editor's note: the two following articles, prepared by the library of congress and the council on library resources, respectively, have been distributed through various lc publications. due to the importance of the two documents, however, and to the fact that they may not have reached the entire library community, it seemed therefore appropriate to publish the papers again in journal of library automation.

sharing machine-readable bibliographic data: a progress report on a series of meetings sponsored by the council on library resources

beginning in december 1972 and continuing since that date, the council on library resources has convened a series of meetings of representatives of several organizations to discuss the implications of bibliographic data bases being built around the country and the possibilities of sharing these resources. although the deliberations are not yet completed, the council, as well as all participants in the meetings, felt that it was timely to make the progress to date known to the community. since publication in the open literature implies a long waiting period between completion of a paper and the actual publication date, it was decided that this paper should be written and distributed as expeditiously as possible.
since the library of congress has vehicles for dissemination of information in its marc distribution service, information bulletin, and cataloging service bulletin, lc was asked to assume the responsibility for the preparation of a paper to be distributed via the above mentioned channels as well as sending copies to relevant associations. the institutions participating in the deliberations have been included as an appendix to this paper.

the bibliographic data bases under consideration at individual institutions contain both marc records from lc as well as records locally encoded and transcribed. these local records represent: (1) titles in languages not yet within the scope of marc; (2) titles in languages cataloged by lc prior to the onset of the marc service; (3) titles not cataloged by lc; and (4) titles cataloged by lc and recataloged when the lc record cannot be found locally. the first two categories, in many instances, are being encoded and transcribed by institutions using lc data as the source, i.e., proofsheets, nuc records, and catalog cards. these are referred to for the remainder of this paper as lc source data and the third and fourth categories as original cataloging.

all participants agreed that the structure of the format for the interchange of bibliographic data would be marc but several participants questioned if a subset of lc marc could not be established for interchange for all transcribing libraries other than lc. [1, 2] although lc had reported its survey regarding levels of completeness of marc records and the conclusions reached by the recon working task force, namely, "to satisfy the needs of diverse installations and applications, records for general distribution should be in the full marc format," it appeared worthwhile to once more make a survey to see if agreement could be reached on a subset of data elements. [3] the survey included only those institutions participating in the clr meetings.
the result of the survey again demonstrated that considered collectively, institutions need the complete marc set of data elements. the decision was made that the lc marc format was to be the basis of the further deliberations of the participants. attention was then turned to any additional elements of the format or modifications to present elements that may be required in order to interchange bibliographic data among institutions. all concerned recognized that although networks of libraries, in the true sense, still do not exist today, much has been learned since the development of the marc format in 1968. certain ground rules were established and are given below: 1. the material under consideration is to be limited to monographs. 2. the medium considered for the transmission of data is magnetic tape. 3. data recorded at one institution and transmitted to another in machine-readable form is not to be retransmitted by the receiving institution as part of the receiving institution's data base to still another institution. [4] 4. any additions or changes required to the marc format for "networking" arrangements are not to substantially impact lc procedures. 5. any additions or changes required to the marc format for "networking" arrangements are not to substantially affect marc users.

long discussions took place concerning modifications to lc source data by a transcribing library and the complexity involved in transmitting information as to which particular data elements were modified. ground rule 6 was established stating that if any change is made to the bibliographic content of a record copied from an lc source document (other than the lc call number), the transcribing library would be considered the cataloging source, i.e., the machine-readable record would no longer be considered an lc cataloging record. any errors detected in lc marc records are to be reported to lc for correction.
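ground rule 6 can be read as a small decision rule, and a sketch may make its consequence easier to see. the python below is only an illustration under stated assumptions: the record layout, the field names, the function `transcribe`, and the symbols "DLC" and "XYZ" are all hypothetical and do not come from the paper or from the marc format itself.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Record:
    """hypothetical, much-simplified stand-in for a machine-readable record."""
    cataloging_source: str  # who is credited with the cataloging
    title: str              # stands in for all bibliographic content
    lc_call_number: str


def transcribe(lc_record: Record, library: str, **changes) -> Record:
    """apply a transcribing library's changes to a record copied from lc source data.

    sketch of ground rule 6: a change to anything other than the lc call
    number makes the transcribing library the cataloging source.
    """
    updated = replace(lc_record, **changes)
    bibliographic_change = any(
        getattr(lc_record, name) != value
        for name, value in changes.items()
        if name != "lc_call_number"
    )
    if bibliographic_change:
        updated = replace(updated, cataloging_source=library)
    return updated


# placeholder symbols: "DLC" for lc, "XYZ" for a transcribing library
original = Record(cataloging_source="DLC", title="a title", lc_call_number="Z699 .A1")
call_number_only = transcribe(original, "XYZ", lc_call_number="Z699 .B2")
title_changed = transcribe(original, "XYZ", title="a corrected title")
```

under these assumptions `call_number_only` is still credited to lc, while `title_changed` is credited to the transcribing library, so the record's lc origin is no longer visible.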
a subcommittee was formed to study what marc format additions and modifications were required. the subcommittee met on several occasions and made the following proposals to the parent committee: 1. fixed field position 39 and variable field 040, cataloging source, should be expanded to include information defining the cataloging library, i.e., the library responsible for the cataloging of the item, and the transcribing library, i.e., the library actually doing the input keying of the cataloging data. 2. lc should include the lc card number in field 010 as well as in field 001. when the lc card number is known by an agency transcribing cataloging data, field 001 should contain that agency's control number and field 010 should contain the lc card number. 3. variable field 050 should not be used for any call number other than the lc call number. transcribing agencies should always put the lc call number in this field if known. 4. a new variable field 059, contributed classification, should be defined to allow agencies other than lc to record classification numbers such as lc classification, dewey, national agricultural library classification, etc., with indicators assigned to provide the information as to what classification system was recorded and whether the cataloging or transcribing agency provided this data. 5. variable field 090, local call number, should follow the same indicator system as defined in field 059. (090 contains the actual call number used by either the cataloging or transcribing library while 059 would contain additional classification numbers assigned by the cataloging or transcribing library.) 6. lc would assume the responsibility of distributing any agreed upon additions or modifications as either an addendum to or a new edition of books: a marc format.

discussions following the presentation of these proposals indicated concern regarding three principal areas: 1.
the modifications of any data element in an lc source document other than the addition of a local call number dictated that the institution performing the modification of the record assume the position of the cataloging source. this resulted in the possibility that a large number of records would undergo minor changes and consequently the knowledge that the record was actually an lc record would be lost. this loss was considered a critical problem. 2. the creation of a marc record implied that each fixed field and all content designators should be present if applicable for any one record. during the lc recon project, it was recognized that certain fixed fields could not be coded explicitly because the basic premise in the recon effort was the encoding of existing cataloging records without inspecting the book. consequently, the value of certain fixed fields, such as the one indicating the presence or absence of an index in the work, could not be known. participants felt that a "fill" character was needed to describe to the recipient of machine-readable cataloging data that a particular fixed field, tag, indicator, or subfield code could not be properly encoded due to uncertainty. the "fill" character will be a character in the present library character set but one not used for any purpose up to this time. 3. although networking is not clearly defined at this time, participants felt that the marc format should have the capability to include location symbols to satisfy any future requirement to transmit this information in order to expedite the sharing of library resources.

majority opinion indicated there was a need to guarantee the recognition of an lc source record, that a "fill" character could serve a useful function, and that a method of transmitting location symbols was required.
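the "fill" character idea can be illustrated with a short sketch. the python below is not taken from the paper: the vertical bar used as the fill character, the two-field layout, and the function names are assumptions made for illustration only, since the paper says merely that the character would be one in the library character set not yet used for any purpose.

```python
FILL = "|"  # assumed fill character; the paper does not name the one to be chosen

# hypothetical layout of fixed-length data elements: (name, width)
LAYOUT = [("index_present", 1), ("language", 3)]


def encode_fixed_fields(values, layout=LAYOUT):
    """build the fixed-length string; a value of None means 'could not be determined'."""
    parts = []
    for name, width in layout:
        value = values.get(name)
        if value is None:
            # the transcriber could not inspect the book, so signal uncertainty
            parts.append(FILL * width)
        elif len(value) != width:
            raise ValueError(f"{name!r} must be exactly {width} character(s)")
        else:
            parts.append(value)
    return "".join(parts)


def unknown_fields(encoded, layout=LAYOUT):
    """list the data elements that the sender marked as not encodable."""
    names, position = [], 0
    for name, width in layout:
        if encoded[position:position + width] == FILL * width:
            names.append(name)
        position += width
    return names


# a recon-style record: language is known, presence of an index is not
encoded = encode_fixed_fields({"index_present": None, "language": "eng"})
missing = unknown_fields(encoded)
```

the point of the design is that the recipient can tell "not known" apart from a legitimate coded value such as "no index," which a blank or a default code could not convey.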
three position papers were written on the topics outlined above giving the rationale for the requirement and describing a proposed methodology for implementation. these papers were reviewed at a meeting of the participants and are presently undergoing modification taking into account recommendations made. the revised papers are to be distributed prior to the next meeting in january 1974. following this meeting, another paper will be prepared for publication which will include a definitive account of the modifications and additions recommended for the marc format as well as describing the rationale for the additions and modifications. at that time the proposals will be submitted to the library community for its review and acceptance. if the additions and changes are approved by the marbi [5] committee of the american library association, lc will proceed to amend or rewrite the publication books: a marc format.

however, the points elaborated below deserve emphasis toward the understanding of the issues described in this paper. 1. the meetings were concerned with a national exchange of data, not international. 2. the additions and modifications recommended for the marc format, with one exception, affect organizations other than the library of congress exchanging machine-readable cataloging data. except for distributing records with the lc card number in field 010 as well as 001, the marc format at lc will remain intact. 3. lc will investigate the use of the fill character in its own records, both retrospective and current, and for records representing all types of materials.

henriette d. avram
marc development office
library of congress

references

1. the marc format has been adopted as both a national and international format by ansi and iso respectively.
2. subset in this context includes both the data content of the record (fixed and variable fields) and content designators (tags, indicators, and subfield codes).
3.
recon working task force, "levels of machine-readable records," in its national aspects of creating and using marc/recon records (washington, d.c.: library of congress, 1973), p. 4-6.
4. this rule did not extend to a subscriber to the lc marc service duplicating an lc tape for another institution. one can readily see the chaos that would result if institution a sent its records to institutions b and c, b then selected all or part of a's records for inclusion in its data base, and then transmitted its records to a and c. the result of the multitransmission of the same records, modified or not, would create useless duplication and confusion.
5. rtsd/isad/rasd representation in machine-readable form of bibliographic information committee.

appendix 1

list of organizations participating in the clr sponsored meetings

library of congress
national agricultural library
national library of medicine
national serials data program
new england library information network
new york public library
the ohio college library center
stanford university libraries
university of chicago libraries
washington state library
university of western ontario library

a composite effort to build an on-line national serials data base (a paper for presentation at the arl midwinter meeting, chicago, 19 january 1974)

an urgent requirement exists for a concerted effort to create a comprehensive national serials data base in machine-readable form. neither the national serials data program nor the marc serials distribution service, at their current rate of data base building, will solve the problem quickly enough. because of the absence of a sufficient effort at the national level, several concerted efforts by other groups are under way to construct serials data bases.
these institutions have been holding in abeyance the development of their automated serials systems, some for several years, waiting for sufficient development at the national level to provide a base and guidance for the development of their individual and regional systems. this has not been forthcoming, and local pressures from their users, their administrators, and their own developing systems are forcing these librarians to act without waiting for the national effort. these efforts are exemplified by the work of one group of librarians, described below.

what has now come to be known as the "ad hoc discussion group on serials" had its beginnings in an informal meeting during the american library association's conference in las vegas last june. you will also hear this discussion group referred to as the "toronto group." this is because its prime mover has been richard anable of york university, toronto, and because the first formal meeting occurred in that city. the expenses of the toronto and subsequent meetings have been borne by the council on library resources, and council staff have been involved in each meeting. a fuller exposition of the origins, purposes, and plans of the toronto group has been written by mr. anable for the journal of library automation. it appeared in the december 1973 issue. quoting from anable: "at the meeting [in las vegas] there was a great deal of concern expressed about: 1. the lack of communication among the generators of machine-readable serials files. 2. the incompatibility of format and/or bibliographic data among existing files. 3. the apparent confusion about the existing and proposed bibliographic description and format 'standards'." end of quote.

the toronto group agreed that something could and should be done about these problems.
if nothing else, better communications among those libraries and systems creating machine-readable files would allow each to enhance its own systems development by taking advantage of what others were doing. as the discussions progressed, several points of consensus emerged. among them were: 1. the marc serials distribution service of the library of congress and the national serials data program together were not building a national serials data base in machine-readable form fast enough to satisfy the requirements of developing library systems. this systems development was, in several places, at the point where it could no longer wait on serials data base development at the national level as long as progress remained at the current rate. 2. the marc serials format developed at lc offered the only hope for machine format capability. every system represented planned to use it. for the purpose of building a composite data base outside lc, the marc serials format would probably require minor modification, principally by extension. these extensions could and should be added on so as to do no violence to software already developed to handle marc serials. 3. there existed some difference between the lc marc serials format and that used by the national serials data program. these differences arose from several circumstances. for example, the marc serials format predated the international serials data system (isds), the national serials data program, and the key title concept. when these three came along, the requirement existed that the nsdp abide by the conventions of the isds. since the key title is not yet a cataloging title, but is the title to which the international standard serial number is assigned, it is natural that the approach to serial record creation by nsdp should be different from that of a library cataloging serials by conventional methods.
a working group under the auspices of the ifla cataloguing secretariat has devised an international standard bibliographic description for serials. the working group's recommendations are to be distributed for trial, discussion, and recommendation for change in february. when the isbd(s) is accepted into cataloging practice, some of the differences in marc usage and nsdp procedure will disappear. others will still remain and they must be reconciled. we cannot continue with two serial records, both of which claim to be national in purpose but which are incompatible with each other. a good exposition of the differences in these serials records from the point of view of the marc development office is in an article by mrs. josephine pulsifer in the december 1973 issue of the journal of library automation. 4. major canadian libraries are active in cooperative work on serials and these two national efforts should be coordinated.

several other circumstances bear on the problem. for example, the national serials data program is a national commitment of the three national libraries. in addition to the funding from the three national libraries, there are excellent chances that the nsdp will receive funds from other sources to expedite its activities. the nsdp is responsible for the issn and key title and for relationships with the international serials data system. ultimately, the issn and key title will be of great importance to serials handling in all libraries. for all of these reasons it is imperative that the activities of the nsdp be channeled into the comprehensive data base building effort described in this paper.

when it was realized at the council on library resources that the toronto group was serious and that a data base building effort would result, it was obvious that this had enormous significance for the library of congress and other library systems because the result would be a de facto national serials data base.
accordingly, a paper was prepared and sent to lc, urging that an effort be made in washington to coordinate the efforts of the marc serials distribution service, the national serials data program, and this external effort. in addition, it was felt that lc should take a hard look at its own several serials processing flows and attempt to reconcile them better with each other and with the external effort. to do this, lc was urged to do a brief study of lc serials systems, using lc staff and one person from clr. lc agreed and the study is now very nearly complete. the written guidance given the study group members was quite specific. they were to study all serials flow at lc and make their recommendations based on what lc should be doing, rather than being constrained by what lc is doing. the overall objectives of the study were to aim for the creation of serials records as near the source as possible and one-time conversion of each record to machine-readable form to serve multiple uses. specifically to be examined were the serials processing flows of the copyright office, the order division, the serial record division, new serials titles, and the national serials data program.

while all of this was going forward, the toronto group had some more meetings. oclc was tentatively selected as the site for the data base building effort. it is understood by everyone that this is a temporary solution; eventually a national-level effort must be mounted which will provide a post-edit capability to bring the composite data base up to nationally acceptable standards. a permanent update capability is also required. this permanent activity, hopefully, will be based at the library of congress. oclc was chosen as the interim site for several reasons, but especially for its proven capability to produce network software and support which will work. within a very short time oclc will have on-line serials cataloging and input capability which will extend to some two hundred libraries.
no other system is nearly so far advanced. the toronto group has assured itself that the data record oclc intends to use is adequate and is now working on the conventions required to insure consistency in input and content, to include some recommendations for minor additions to the marc serials format. during their deliberations, the toronto group realized that, to be effective, their efforts needed formal sponsorship, and discussions to this end were begun. initially, several agencies were considered to be candidates for this management role. various considerations quickly narrowed the list down to the library of congress, the association of research libraries, and the council on library resources, and representatives of these three met to discuss the matter further. during the discussions, clr was asked to assume the interim management responsibility until a permanent arrangement could be worked out. clr was selected because, as an operating foundation under the tax laws, it can act expeditiously in matters of this kind. clr can also deal with all kinds of libraries and has no vested interest in any particular course of action. meanwhile, certain institutions in the toronto group had indicated that they were ready to pledge $10,000 among themselves for the specific purpose of hiring mr. anable as a consultant to continue his coordinating activities. the group asked clr to act as agent to collect and disburse these funds. clr is ready to assume the initial responsibility for the management of this cooperative data base building effort, if that is the will of the leadership in the library community. clr is prepared to commit one staff member full time to the project who is well versed in the machine handling of marc serials records. this is mr. george parsons, and other staff members will assist as appropriate. mr. anable has agreed to act as a consultant to help coordinate these activities. 
clr would aim for the most complete, accurate, and consistent serial record in the lc marc serials format which can be had under the circumstances. during the effort, clr will act as the point of contact between oclc and the participating libraries, assisting in negotiating contracts and other agreements as required. the composite data base will be made available to all other libraries at the least possible cost for copying. initially at least, the costs of this effort will have to be shared by the participating libraries, since no additional funds are presently available. the goal is to build 100,000 serial records the first year, another 100,000 the second year, and design and implement the permanent mechanism the third year, while file-building continues.

as the project gets under way, it will work like this: a set of detailed written guidelines for establishing the record and creating the input will be promulgated, and agreement to abide by them will be a prerequisite to participation. selected libraries with known excellence in serial records will be asked to participate; others may request participation. those selected who already have or can arrange for terminals on the oclc system will participate on line. this is the preferred method, but it may be possible to permit record creation off line, such records to be added to the data base in a batch mode. it is very difficult to merge serial files from different sources in this way, so an attempt will be made to find a large serials data base in machine-readable form for use as a starting point. this file would be read into the oclc system. a participating library wishing to enter a record would first search to see whether it existed in the initial data base. if a record is found, it would be updated insofar as this is possible, within the standards chosen for the system.
it may be further updated by other participants, still within the system standards, but at some point update on a record in the system will reach a point of diminishing returns and the record will remain static until a post-edit at the national level can be performed. these records will be for use as their recipients see fit, but their prime purpose is to support the development of automated serials systems while eliminating duplication of effort. details of how to flag these records in the oclc data base as they are being created by this effort will be worked out, as will be the relationship between this effort and the rest of oclc activities. clr will, from time to time, report progress to the community. it would be the hope of clr that the toronto group will continue to assist in the technical and detailed aspects of the project. in addition, and after consultation with the appropriate people, an advisory group will be appointed to advise clr in this effort.

lawrence livingston
council on library resources

input

to the editor:

re: file conversion using optical scanning at berkeley and the university of minnesota discussed by stephen silberstein, jola technical communications, december 1973.

it is rewarding to find someone who has actually read in detail one's published work (grosch, a. n. "computer-based subject authority files at the university of minnesota libraries"). i generally agree with mr. silberstein's observations regarding the use of optical scanning for library file conversion. however, several points were raised by mr. silberstein on which i feel further comment is needed. perhaps in my article i should have cautioned the reader that when developing procedure and programs for the cdc 915 page reader, there is a great variance in these machines depending upon: 1. how early a serial number unit, i.e., vintage of machine, 2. what version of the software system grasp is being used, 3.
what degree of machine maintenance is performed, and 4. what kinds of other customers are using the scanner. it was our misfortune to have a cdc 915 page reader that had many peculiarities about it which could or would not be resolved by a maintenance engineer. in addition it was not heavily used and what use it did receive was mostly nonrepetitive conversion jobs dealing mostly with mailing address file creation and freight billing. in our initial testing we tried to use various stock bond paper and had various reading difficulties. in talking with others who had used this particular machine we found that the choice of paper stock was critical on this scanner. i might add that we did not actually use $400 worth of paper on this as i sold half of the stock we had ordered to another user locally who was going to use this device. it might be worth mentioning that we had a failure of a potentially large conversion project reported to us. this project tried to use this equipment but could not create a suitable input format because of a specific uncorrected peculiarity of not being able to read lines of greater than six inches without repeated rejects. we were aware of this from our experience, which is why we kept our line short using the ro to terminate reading of the line at the last character position. also our input was double spaced, not single spaced as you seem to infer in your comments. with this particular device we also found that the format recognition line was easily lost, necessitating greater time spent in re-running the job. therefore, even though this was a great commission of sin on our part according to mr. silberstein, i then must plead guilty to using expedient methods to turn a bad situation into an acceptable one. i might also point out that this solution had been employed at various times by some past users we contacted.
in fact, i have later found out that occasionally such a technique has been resorted to in one of our other local user installations on a much newer machine. i do not wish to imply that our conversion achieved maximum through-put but that in any case it was a cost effective way to proceed. with a small file conversion such as this one, which is to be done on a one-shot basis, it seemed foolish to me to spend much time optimizing; it seemed better to find a way that worked as our difficulties were encountered. if this had to be a continuing job we would have had to get a better maintained scanner and invest more time and money into the project. i take the view that we wish to couple modest human costs with modest projects and reserve for greater projects of a continuing nature more optimized procedures. i agree that file cleansing is undoubtedly the most costly operation but i cannot say by just what amount since my responsibilities did not include such work. this was later performed by our technical services department. our general point in writing about this project was to convey our broad experiences using this technique on a subject authority system as we had not seen such use reported in the literature previously. i would hope your comments and mine here serve to illustrate that one's systems problems must be solved in light of the conditions and not always according to what we term the best theory or practice. to this end i hope others will profit from both of our comments.

audrey n. grosch
university of minnesota libraries

editorial: ala and our carbon footprint
marc truitt

obligatory disclaimer: before proceeding, i want to state very clearly that, as with anything i write in this space that is not explicitly attributed to someone other than myself, the reflections that follow are my own thoughts and views.
they in no way are intended to represent the views either official or personal of lita or ala officials or employees.

while i am writing these lines just a week or so after the end of the american library association (ala) midwinter meeting, by the time you see them the ala annual conference in chicago will be just days away. i’ve been reflecting (stewing?) for some time now about the question of ala conferences: why do i attend, and what do i get from these gatherings? is the vendor/exhibitor “tail” wagging the ala/attendee “dog”? is attendance responsible in a time of straitened budgets? and, most recently, what is the environmental cost of attendance? for the moment, i’d like to consider only one of these.

we all know that flying is, from an environmental perspective, enormously wasteful and destructive. yet, for attendance at ala and most other professional conferences, air travel is the only practical means, unless either one is fortunate enough to live in the area or ala holds the event in a place such as new york, chicago, philadelphia, or washington, each of which can boast credible commuter rail service. sadly, in most other places trains are really not an option; how many of us can imagine being able to take a long-distance amtrak train to an ala conference?

so i wondered what it costs the environment for all of us to go to an ala conference. the following admittedly broad-side-of-barn figures for the recently completed midwinter meeting in denver are real eye-openers (you may not like my assumptions, but we have to assume some things, and after all, i’m only trying to get an order-of-magnitude number):

a. number of paid attendees at midwinter meeting 2009: 9,850 (note 1)
b. “fudge” figure for those who didn’t fly (local attendees or those close enough to use other means of transport): 1,000
c. total number of attendees who flew (a-b): 8,850
d. average round-trip flight to denver (in metric tons of co2 produced): .3635 (note 2)
e. total metric tons of co2 (the “carbon footprint”) for all attendees who flew to denver (c x d): 3,217

i’m guessing this is a conservative number; still, the total “carbon footprint” of all who flew to the midwinter meeting was more than 3,000 metric tons of co2 (note 3). that seems to me to be a giant’s footprint indeed for what we are told is primarily a “business meeting.” and this, of course, represents only that portion of the footprint that one identifies with air travel . . . enumerating the actual footprint would require taking into account many other sources of waste, with the resulting total being far larger.

is it just me, or does this seem to be an extravagance these days? given that the vast majority of our “business meetings” can be transacted through video conference, teleconference, e-mail, or similar technological means, how do we continue to justify the indulgence of attending such conferences as the planet warms to temperature levels not observed in thousands of years?

at a minimum, i would suggest that it’s high time we, individually or as a profession, began to think hard about compensating for our excess by purchasing carbon credits. i personally think of them as “bleeding heart environmentalism,” that is, little more than a means for us “haves” to assuage our guilt about our profligate ways. but even offset payments would be better than nothing. the obvious way to handle this would be for ala to add a modest ($5–10) surcharge to the meeting registration fee, with the resulting proceeds dedicated to an approved beneficiary.

let’s see . . . my “carbon footprint” for flying to midwinter meeting 2009 is .38 metric tons. i can purchase an “offset” for about $5 and apply it to any of several worthy causes shown on the carbonfootprint.com website. ah, i feel better already . . .

. . . or not.
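the arithmetic behind items a through e can be retraced in a few lines. what follows is only a sketch in python, not the calculation as the author performed it: the per-city values are those listed in note 2, the attendee and non-flyer counts are those given above, and the names used (round_trip_tons, average_trip_tons, total_footprint) are invented here for illustration.

```python
# illustrative sketch only; variable and function names are hypothetical.
# per-city round-trip emissions to denver, in metric tons of co2 (note 2).
round_trip_tons = {
    "atlanta": 0.40, "boston": 0.58, "chicago": 0.30, "dallas": 0.22,
    "houston": 0.29, "los angeles": 0.27, "miami": 0.57,
    "minneapolis": 0.23, "new york-jfk": 0.54, "philadelphia": 0.52,
    "phoenix": 0.19, "pittsburgh": 0.43, "salt lake city": 0.23,
    "san diego": 0.27, "san francisco": 0.31, "seattle": 0.34,
    "washington, d.c.": 0.49,
}

paid_attendees = 9850  # item a
non_flyers = 1000      # item b, the editorial's "fudge" figure


def average_trip_tons(trips):
    """Unweighted mean of the per-city round-trip figures (item d)."""
    return sum(trips.values()) / len(trips)


def total_footprint(attendees, stay_on_ground, tons_per_trip):
    """Flyers (item c) multiplied by the average trip (item e)."""
    return (attendees - stay_on_ground) * tons_per_trip


avg = average_trip_tons(round_trip_tons)
total = total_footprint(paid_attendees, non_flyers, avg)
print(f"average trip: {avg:.4f} metric tons")
print(f"total for flyers: {total:,.0f} metric tons")
```

the mean is unweighted, as in note 2, so each city counts equally regardless of how many attendees actually came from it; that is one reason the result is best read as an order-of-magnitude figure.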
marc truitt (marc.truitt@ualberta.ca) is associate director, bibliographic and information technology services, university of alberta libraries, edmonton, alberta, canada, and editor of ital.
information technology and libraries | june 2009

more midwinter meeting fallout

one of the more interesting sessions i attended at the midwinter meeting was a sleeper bearing the title “redefining technical services workflows with oclc.” led by karen calhoun, oclc’s vice president of worldcat and metadata services, a panel that included robin fradenburgh of the university of texas and my university of alberta colleagues kathy carter and sharon marshall described several innovative oclc services aimed at “improv[ing] efficiency and enhanc[ing] access to library materials” (note 4). calhoun’s overview, “reinventing technical services,” nicely summarized many of the issues facing technical services (ts) operations today, including declining staff counts and the desire by library administrators to reclaim for patron use the space currently occupied by ts operations. she then reviewed recent studies about our patrons’ changing preferences for research tools, i.e., the question that has often been cast as “google versus the catalog.” precisely how workflow and organizational efficiencies (whether or not they come from oclc) in ts can alter our users’ research habits is a bit beyond me, but i’ll leave it to you to decide. the presentations are available to view at http://www.oclc.org/us/en/multimedia/2009/ala_mw_redefining_technical_services.htm; do listen to the presentations and decide for yourself.
in any case, calhoun’s talk, and an earlier comment made by a colleague and long-time friend of mine, got me to thinking again about “the catalog.” my friend, when asked at another program held just before the midwinter meeting, had said that the ts efficiency she would like most to institute would be “to stop cataloguing new (trade) books.” instead, we should put our limited cataloging resources where they might best be used, that is, in making rare and unique local resources discoverable.

whoa!, i thought at the time. how might we do this?

as calhoun talked about our users’ preference for discovery outside of the catalog, my mind wandered back to my friend’s comment. worldcat local? probably not, since it would still involve “cataloging” books, and doesn’t seem likely to be any more appealing to the google- and amazon-focused user than are our opacs already. but what about amazon? i can envision a “catalog” search that begins at amazon’s already metadata-rich site, enhanced with links to local holdings of all the things listed there: amazoncat local, if you will.

blue-skying a bit more, i can imagine amazon’s business model for offering this kind of service. not only would there be even more eyeballs on its site than there are now, but a library considering such a service might offer in return that some or all of its acquisitions be sourced to amazon. conceivably, amazon could even offer a shelf-ready service, in which it provided the materials already barcoded, marked, and ready to park on our shelves. hmmm . . . open the box, shelve the already-in-the-“catalog” books, and pay the invoice. sounds pretty simple, no?

things are rarely that simple, and i know that. there would be complexities aplenty, but who knows? am i serious? i make this proposal because i come from a background that respects and values the work of catalogers and other ts staff.
part of me wants the idea to be tried and found wanting, so that some of those who argue that library cataloging is “dead” might then come to a different view. but, either way, what we’d need would be a sizable institution willing to try it and see. who wants to be the pilot site? amazoncat local, anyone?

references and notes

1. library journal.com, “with economy sputtering, ala midwinter attendance dips sharply,” www.libraryjournal.com/index.asp?layout=talkbackcommentsfull&talk_back_header_id=6582196&articleid=ca6632569#129349 (accessed feb. 5, 2009). according to libraryjournal.com, the count on saturday, january 24, was 9,850, including 7,689 registrants (of whom 498 were on-site registrants) and 2,161 exhibitors.

2. i used the carbon footprint calculator at www.carbonfootprint.com/calculator.aspx (accessed feb. 5, 2009) to compute the co2 footprint in metric tons for one round-trip flight between denver and each of the following cities: atlanta (.40), boston (.58), chicago (.30), dallas (.22), houston (.29), los angeles (.27), miami (.57), minneapolis (.23), new york–jfk (.54), philadelphia (.52), phoenix (.19), pittsburgh (.43), salt lake city (.23), san diego (.27), san francisco (.31), seattle (.34), and washington, d.c. (.49). i then averaged these for an “average trip” production of .3635 metric tons.

3. according to wikipedia, one metric ton equals 2,204.6226 lbs. or 1.102 u.s. tons. thus, 3,217 metric tons equals approximately 3,545 u.s. tons. wikipedia, “tonne,” http://en.wikipedia.org/wiki/tonne (accessed feb. 5, 2009).

4. oclc, redefining technical services workflows with oclc, www.oclc.org/us/en/multimedia/2009/ala_mw_redefining_technical_services.htm (accessed feb. 25, 2009).

editorial board thoughts column
getting to yes: stakeholder buy-in for implementing emerging technologies in your library
ida joiner
information technology and libraries | september 2018

ida a.
joiner (ida.joiner@gmail.com), a member of lita and the ital editorial board, is the librarian at the universal academy school in irving, texas. she is the author of “emerging library technologies: it’s not just for geeks” (elsevier, august 2018).

if you have ever wanted to implement new technologies in your library or resource center (such as drones, robotics, artificial intelligence, augmented, virtual, or mixed reality, 3d printing, wearable technology, and others) and presented your suggestions to your stakeholders (board members, directors, managers, and other decision makers) only to be rejected with “there isn’t enough money in the budget,” “no one is going to use the technology,” or “we like things the way that they are,” then this column is for you.

i am very passionate about emerging technologies, how they are and will be used in libraries and resource centers, and how librarians will be able to assist those who will be affected by these technologies. i recently published a book introducing emerging technologies in libraries. i came up with suggestions on how doing your research (including the questions below and those on the accompanying checklist) will prepare you to meet with your stakeholders and improve the likelihood of your emerging technology proposal being approved.

1. who are your stakeholders, and have you included them early in the process? determine who your stakeholders are, what their areas of expertise are, and how they can support your emerging technology projects. the most critical piece to getting your stakeholders on board to support your technology initiatives is addressing the question “what’s in it for them?” this will get their attention and increase your odds of getting them to say “yes” to your technology initiatives.

2. what are the costs? research what your costs will be and create a budget.
find innovative ways to fund your initiatives by researching grants, forming strategic partnerships with others who might be interested in working with you, and locating other funding opportunities.

3. what are the risks? identify any potential risks so that you are prepared to discuss how you will mitigate them when you meet with your stakeholders. some potential risks that you might want to address are budget cost overruns, staffing issues such as a key person resigning or going on maternity or sick leave, and patrons trying to use the technology for criminal means, which policies can be put in place to deter.

https://www.elsevier.com/books/emerging-library-technologies/joiner/978-0-08-102253-5
https://doi.org/10.6017/ital.v37i3.10746

4. what are the timeline and key milestones? address the timeline for when you want or need to implement these technologies. have you planned for key milestones and possible delays such as funds not being available? you need to have a detailed timeline, from your first kickoff meeting with your initiative’s team, to your stakeholder meeting where you present your proposal, to getting signoff on the project.

5. what training will you offer? perform a needs assessment to determine who will need to be trained, what training you will offer, what your training costs will be, and who will pay for them. once you have all of this in place, you will select the trainer(s) and the training model (such as “train the trainer”) that you will use.

6. how will you market your technology initiatives? will you rely on social media to market your technology initiatives? will you collaborate with your marketing department for developing your message through press releases, websites, blogs, e-newsletters, flyers, and other media outlets?
you will need to meet with your marketing and publications experts to plan how you will market your emerging technology initiatives, along with your costs and who will pay them.

7. who is your audience and how can you engage them? this is one of the most important areas to address in your proposal to present to your stakeholders. without our patrons, there is no library. you will need to determine who your audience is and how you can utilize the emerging technologies to assist them. are they k to 12 students, adults who will be displaced by these technologies, technology novices who want to learn more about these technologies, or university faculty and/or students who want to use the technology for their projects? you can address all of these potential audiences in your proposal to your stakeholders.

these are just a few tips on how to get stakeholder buy-in for implementing emerging technologies in your library. feel free to share some of your own successes in getting stakeholders on board to implement emerging technologies in your library or resource center.

emerging technology stakeholder buy-in questionnaire

i have included questions below that you should address when you are considering getting your stakeholders on board to implement new emerging technologies in your library. if you address all of these, you have a very good chance of getting your stakeholders on board to support your initiatives.

1. what technologies do you want to implement in your library/resource center and why do you want them?
2. who are your stakeholders and what are their backgrounds?
3. why should your stakeholders support your technology initiatives?
4. what is your budget for your new technology initiatives?
5. what training is needed to support these initiatives, who will provide the training, what are the costs, and who will pay for the training?
6. how will you market these technology initiatives, what are the costs, and who will pay for them?
7. did you perform a cost-benefit analysis for these technology initiatives?
8. are there legal fees? if so, what are they, and who will pay for them?
9. what are the risks?
10. what are the returns on the investment (roi)?
11. what strategic partnerships can you establish?
12. what is your timeline for implementing these technology initiatives?

articles

50 years of ital/jla: a bibliometric study of its major influences, themes, and interdisciplinarity
brady lund
information technology and libraries | june 2019

brady lund (blund2@g.emporia.edu) is a phd student at emporia state university’s school of library and information management.

abstract

over five decades, information technology and libraries (and its predecessor, the journal of library automation) has influenced research and practice in library and information science technology. from its inception on, the journal has been consistently ranked as one of the superior publications in the profession and a trendsetter for all types of librarians and researchers. this research examines ital using a citation analysis of all 878 peer-reviewed feature articles published over the journal’s 51 volumes. impactful authors, articles, publications, and themes from the journal’s history are identified. the findings of this study provide insight into the history of ital and potential topics of interest to ital authors and readership.

introduction

fifty-one years have passed since the first publication of the journal of library automation (jla), the precursor to information technology and libraries (ital), in 1968: 51 volumes, 204 issues, and 878 feature articles. information technology and its use within libraries have evolved dramatically in the time since the first volume, as has the content of the journal itself.
given the interdisciplinary nature of library and information science (lis) and ital, and the celebration of this momentous achievement, an examination of the journal’s evolution, based on the authors, publishers, and works that have influenced its content, seems apropos. the following article presents a comprehensive study of all 7,575 references listed for the 878 articles (~8.6 refs/article average) published over ital’s fifty years, identifying the authors and publishers whose work has been cited the most in the journal and the major themes in the cited publications, and evaluating the interdisciplinarity of references in ital publications. this study not only frames the history of the ital journal, but demonstrates an evolution of the journal that suggests new paths for future inquiry.

conceptual framework

a major influence for the organization and methodology of this paper is imad al-sabbagh’s 1987 dissertation from florida state university’s school of library and information studies, the evolution of the interdisciplinarity of information science: a bibliometric study.1 in this study, al-sabbagh sought to examine the interdisciplinary influences on the burgeoning field of information science by examining the references of the journal of the american society for information science (jasis), today known as the journal of the association for information science and technology (jasist). in al-sabbagh’s study, a sample of ten percent of jasis references was selected for examination.2 the references were sorted into disciplines based on the definitions supplied by dewey decimal classification categories, with the count in each category compared to the total number of sampled references to derive percentages (e.g., if 150 references of 1,000 total jasis references examined belonged to the category of library science, then 15 percent of references belonged to library science, and so on for all disciplines).
the present study deviates slightly from al-sabbagh’s in that it does not use a sampling method. instead, all 878 articles published in jla/ital and their 7,575 references will be examined. the categories for disciplines, instead of being based on dewey decimal, will be based on definitions derived from encyclopedia britannica, and will include new disciplines that were not used in al-sabbagh’s original analysis, such as information systems and instructional design.3 additionally, the major authors, publishers, and articles cited throughout jla/ital’s history will be identified; this was not done in al-sabbagh’s study, but will likely provide additional beneficial information for researchers and potential contributors to ital.

ital is an ideal publication to study using al-sabbagh’s methodology, in that it is affiliated with librarianship and library science but, due to its content, is also closely associated with the disciplines of information science, computer science, information systems, instructional design, psychology, and many others. ital is likely one of the more interdisciplinary journals to still fall within the category of “library science.” in fact, as part of al-sabbagh’s 1987 study, he distributed a survey to several prominent information science researchers, asking them to name journals relevant to information science (this method was used to determine that jasis was the most representative journal for the discipline of information science). on the list of 31 journals compiled from the respondents’ rankings, ital ranked as the seventh most representative journal for information science, above datamation, scientometrics, jelis, and library hi-tech.4 this shows that, for a long time, ital has been considered an important journal not just in library science, but in information science and likely beyond.
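the percentage calculation described above, together with the tallying of most-cited authors and publishers, amounts to counting labels. the following is a minimal python sketch under stated assumptions, not the procedure the study itself used: the function names discipline_shares and top_cited are hypothetical, and the sample data simply mirrors the 150-of-1,000 illustration, with “other” as a placeholder label.

```python
from collections import Counter


def discipline_shares(disciplines):
    """Return each discipline's share of all references, in percent."""
    counts = Counter(disciplines)
    total = sum(counts.values())
    return {name: 100 * n / total for name, n in counts.items()}


def top_cited(values, n=10):
    """Return the n most frequent values (authors, publishers, ...)."""
    return Counter(values).most_common(n)


# hypothetical sample: 150 of 1,000 references labeled library science;
# "other" stands in for every remaining discipline.
sample = ["library science"] * 150 + ["other"] * 850

shares = discipline_shares(sample)
print(shares["library science"])  # 15.0
print(top_cited(sample, n=1))
```

the same two functions apply whether the labels come from a ten percent sample, as in al-sabbagh’s study, or from the full set of references, as here; only the input list changes.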
key terminology

while the findings of this study are pertinent to the ital reader, some of the terminology used throughout the study may be unfamiliar. to acclimate the reader to the terminology used in this study, brief definitions for key concepts are provided below.

bibliometrics. “bibliometrics” is the statistical study of properties of documents.5 the present study constitutes a “citation analysis,” a type of bibliometric analysis that examines the citations in a document and what they can reveal about said document.

cited publications. “cited publications” are the references (“publications”) listed at the end of a journal article.6 the purpose of al-sabbagh’s study (and the present study) is to analyze these cited publications to determine what disciplines influenced the research published in a specific journal. this bibliometric analysis methodology is distinct from those that examine the influence of a specific journal on a discipline (i.e., the present study looks at what disciplines influenced ital, not what disciplines are influenced by ital).

discipline. in this study, the term “discipline” is used liberally to refer to any area of study that is presently or was historically offered at an institution of higher education (sociology, anthropology, education, etc.). in this study, library science and information science are considered as distinct disciplines (as was the case with al-sabbagh’s study).7 as discussed in the methodology section, the names and definitions of disciplines are all derived from the encyclopedia britannica.
https://doi.org/10.6017/ital.v38i2.10875

literature review

the type of citation analysis used by al-sabbagh, which is the basis of the current study, is used frequently to examine the interdisciplinarity of library and information science and specific lis journals, as noted by huang and chang.8 tsay used a similar methodology to al-sabbagh to examine cited publications in the 1980, 1985, 1990, 1995, 2000, and 2004 volumes of jasist. in this study, the researcher found that about one-half of the citations in jasist came from the field of lis.9 butler examined lis dissertations using a similar approach, finding that about 50 percent of the cited publications in the dissertations originated in lis, with education, computer science, and health science following in the second, third, and fourth positions.10 chikate and paul and chen and liang conducted similar studies of dissertations in india and taiwan.11 each study found different degrees of interdisciplinarity, possibly indicating a fluctuation within the discipline of lis based on the publication type, country of origin, and other characteristics of the publications used in each study.

several researchers have used these methods recently to examine library and information science journals, such as chinese librarianship,12 pakistan journal of library and information science,13 library philosophy and practice,14 and the journal of library and information technology.15 these studies are more common for journals published outside of the united states, but there is no reason why the methodology would not hold true for a u.s.-based journal like ital.

recently, publications in a wide array of fields have used similar methodologies as al-sabbagh to evaluate interdisciplinarity in a discipline.
ramos-rodriguez and ruiz-navarro (2004) examined reference trends in the journal of strategic management.16 fernandez-alles and ramos-rodriguez (2009) conducted a bibliometric analysis to identify those publications most frequently cited in the journal human resource management.17 crawley-low (2006) used a similar methodology to identify the core (most frequently cited) journals in veterinary medicine from the american journal of veterinary research.18 these studies show a growing interest in the use of citation analysis to present new information about a publication to potential authors, editors, and readers.

jarvelin and vakkari (1993) noted trends in lis from 1965 to 1985 based on an examination of cited publications in lis journals. the authors noted a trend in interest in the topic of information storage and retrieval, with a de-emphasis on classification and indexing and a strengthened emphasis on information systems and retrieval.19 this study deviated from al-sabbagh and related studies of interdisciplinarity, though it employed a similar methodology, in that it examined trends or subtopics within the discipline of lis. though it is not a primary focus of the present study, the use of subtopics to further divide the discipline of library science and examine what aspects (management, technology, cataloging, reference) of the discipline are the focus of cited publications is incorporated in several tables in the results section.

methods

all references from the 878 articles published in the jla/ital journals (n=7,575) were transcribed to an excel spreadsheet for analysis (this spreadsheet can be found as a supplemental file [https://ejournals.bc.edu/index.php/ital/article/view/10875/9469]). the spreadsheet includes separate columns for primary author, title, publisher, and discipline of each reference. the list of disciplines with their definitions, derived from encyclopedia britannica, is displayed in table 1 below.
information technology and libraries | june 2019 21

table 1. definitions of disciplines used for this study.

discipline | definition
library science | the principles and practices of library operation and administration, and their study.
information science | the discipline that deals with the processes of storing and transferring information.
information systems | the study of the integrated set of components for collecting, storing, and processing data and for providing information, knowledge, and digital products.
computer science | the study of computers, including their design (architecture) and their uses for computations, data processing, and systems control.
engineering | the application of science to the optimum conversion of the resources of nature to the uses of humankind.
instructional design | the systematic development of instructional specifications using learning and instructional theory to ensure the quality of instruction.
education | the discipline that is concerned with methods of teaching and learning in schools or school-like environments as opposed to various nonformal and informal means of socialization.
government | resources produced within the political system by which a country or community is administered and regulated.
sociology | a social science that studies human societies, their interactions, and the processes that preserve and change them.
popular | newspaper, magazine, media reports that do not fit better within another category.
philosophy | the rational, abstract, and methodical consideration of reality as a whole or of fundamental dimensions of human existence and experience.
psychology | the scientific discipline that studies mental states and processes and behaviour in humans and other animals.
corporate | business, corporate, private organization publications that do not fit better within another category.
archival science | the study of the repository for an organized body of records produced or received by a public, semipublic, institutional, or business entity in the transaction of its affairs and preserved by it or its successors.
management | the study of the process of dealing with or controlling things or people.
linguistics | the scientific study of language.
literature | the art of creation of a written work.
law | the discipline and profession concerned with the customs, practices, and rules of conduct of a community that are recognized as binding by the community.
mathematics | the science of structure, order, and relation that has evolved from elemental practices of counting, measuring, and describing the shapes of objects (also includes statistics).
health science | study of humans, the extent of an individual’s continuing physical, emotional, mental, and social ability to cope with his or her environment.
communication science | the study of the exchange of meanings between individuals through a common system of symbols.
geography | the study of the diverse environments, places, and spaces of earth’s surface and their interactions.
physics | the science that deals with the structure of matter and the interactions between the fundamental constituents of the observable universe.
art/design | the study of the nature of art, including such concepts as interpretation, representation and expression, and form.
economics | the social science that seeks to analyze and describe the production, distribution, and consumption of wealth.
biology | the study of living things and their vital processes.
museum studies | the study of institutions dedicated to preserving and interpreting the primary tangible evidence of humankind and the environment.
music | the art concerned with combining vocal or instrumental sounds for beauty of form or emotional expression, usually according to cultural standards of rhythm, melody, and, in most western music, harmony.
chemistry | the science that deals with the properties, composition, and structure of substances (defined as elements and compounds), the transformations they undergo, and the energy that is released or absorbed during these processes.
science and technology studies | the study, from a philosophical perspective, of the elements of scientific inquiry.
journalism | the collection, preparation, and distribution of news and related commentary and feature materials through such print and electronic media as newspapers, magazines, books, blogs, webcasts, podcasts, social networking and social media sites, and e-mail as well as through radio, motion pictures, and television.
anthropology | the study of human beings in aspects ranging from the biology and evolutionary history of homo sapiens to the features of society and culture that decisively distinguish humans from other animal species.

50 years of ital/jla | lund 22 https://doi.org/10.6017/ital.v38i2.10875

to determine the discipline in which a cited publication would be classified, the researcher used the cited publication’s title, abstract, and journal to select the most appropriate discipline from the table. in those cases where a source could not be easily identified as falling within one specific discipline, the researcher conferred with additional reviewers (professional librarians) to determine the best fit.

several analyses of this data were conducted to explore various aspects of jla/ital’s publication history. for the complete data of the publication’s 51 volumes, the top ten most referenced authors, articles, publishers (journals/publishing houses/organizations/websites), and disciplines were identified with the aid of excel’s functions. the same was done separately for both the jla’s 14 volumes and ital’s 37 volumes.
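the top-ten tallies described above were produced in excel; purely as an illustration, the same counting can be sketched in python. the record layout (a dict with "volume", "author", and "discipline" keys), the function names, and the sample rows are all assumptions made for this sketch and are not taken from the study's spreadsheet. the only detail drawn from the text is the split of volumes 1-14 (jla) from volumes 15-51 (ital).

```python
from collections import Counter

JLA_LAST_VOLUME = 14  # the study treats volumes 1-14 as jla and 15-51 as ital


def era(volume):
    """Return 'jla' or 'ital' for a volume number between 1 and 51."""
    return "jla" if volume <= JLA_LAST_VOLUME else "ital"


def top_n(records, field, n=10, only_era=None):
    """Rank the n most frequent values of `field` among cited-publication records.

    If only_era is 'jla' or 'ital', only records from that era are counted.
    """
    counts = Counter(
        r[field]
        for r in records
        if only_era is None or era(r["volume"]) == only_era
    )
    return counts.most_common(n)


# hypothetical records, invented only to show the shape of the data
records = [
    {"volume": 1, "author": "kilgour", "discipline": "library science"},
    {"volume": 2, "author": "avram", "discipline": "library science"},
    {"volume": 3, "author": "kilgour", "discipline": "information science"},
    {"volume": 30, "author": "breeding", "discipline": "library science"},
    {"volume": 31, "author": "breeding", "discipline": "computer science"},
]

print(top_n(records, "author", only_era="jla"))
print(top_n(records, "discipline", n=1))
```

calling `top_n` once without `only_era` and once per era mirrors the three passes described in the text (all 51 volumes, jla only, ital only).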
this allowed for a comparison of the journal before and after the 1982 rebranding. the 51 volumes of jla/ital were also divided into the five decades of the journal’s history: 1968-77, 1978-87, 1988-97, 1998-2007, and 2008-18 (the last spanning eleven volumes instead of ten). for each of these decades, the top ten authors, publishers, and disciplines were identified. for each of the three categories, a table was created to show the top ten of each decade side by side.

lastly, the titles of the 7,575 cited publications in jla/ital articles were examined using content analysis to identify major concepts and themes that appear to have influenced jla/ital articles. nvivo content analysis software was used for this analysis. titles were fed from the excel spreadsheet into nvivo, and the word frequency tools were used to identify the most frequently used terms and “generalizations,” or types or themes of statements in the titles.20

results

table 2 displays the top ten most-cited authors, articles, publishers/publications, and disciplines throughout ital’s fifty-year history. among the authors, four of the top six are associated with two institutions: the library of congress and oclc. the list includes four corporate or nonprofit organizations, three academics (associated with institutions of higher education), two women and four men. of the top ten articles, eight were published before 1973; three were published in jla/ital, and five appeared in journals versus five in other (non-journal) publications. of the top ten publishers, seven are journals; five of the publishers are directly associated with library science. within the disciplines, lis represents 60 percent of the total. there are 31 total disciplines represented throughout the 51 volumes, a greater number of disciplines than identified in al-sabbagh’s study of jasist. table 3 displays the results for jla.
jla emerged at the same time as the machine-readable catalog (marc) and oclc, and this is evident in the authors, articles, and publishers cited in the journal. during this phase of the journal’s history, the top three authors (fred kilgour, the library of congress, and henriette avram) dominated the citations. these three authors were cited more than the next seven combined (143 to 101). the cited publications during this period reflected a focus on systems, corporate, and government publications.

results for the 37 volumes of ital are displayed in table 4. during this period, marshall breeding emerged as one of the biggest influences on information technology and libraries. all but two of the top articles (larson and bizer) were written before 1985. while six publishers were the same as with jla, three of these six (library of congress, association for computing machinery, and college and research libraries) changed position in the top ten. the disciplines of information systems, psychology, education, and instructional design rose, while government, corporate, management, linguistics, and electrical engineering dropped; library science, information science, and computer science remained at the top.

table 2. overall most cited.

rank | top ten authors (affiliation) | top ten articles | top ten publishers | top ten disciplines | top ten disciplines with percentages
1 | u.s. library of congress | american library association. (1967). anglo-american cataloging rules. chicago, il: american library association. | ital/jla | library science - technology | library science 44%
2 | fred g. kilgour (oclc) | avram, h. d. (1968). the marc ii format: a communications format for bibliographic data. washington, dc: library of congress. | asist | information science | information science 16%
3 | henriette d. avram (library of congress) | ruecking jr, f. h. (1968). bibliographic retrieval from bibliographic input; the hypothesis and construction of a test. information technology and libraries, 1(4), 227-238. | association for computing machinery | library science - cataloging | computer science 8%
4 | american library association | kilgour, f. g., leiderman, e. b., & long, p. l. (1971). retrieval of bibliographic entries from a name-title catalog by use of truncated search keys. ohio college library center. | college and research libraries | computer science | information systems 7%
5 | ibm: international business machines | kilgour, f. g. (1968). retrieval of single entries from a computerized library catalog file. proceedings of the american society for information science, 5, 133-136. | library of congress | information systems | government 3%
6 | ohio college library center/online computer library center (oclc) | long, p. l., & kilgour, f. (1972). a truncated search key title index. information technology and libraries, 5(1), 17-20. | american library association | library science - general | instructional design 3%
7 | marshall breeding (vanderbilt university/independent) | hildreth, c. r. (1982). online public access catalogs: the user interface. oclc online computer library center, incorporated. | library resources and technical services | government | corporate 2%
8 | jakob nielsen (independent) | nugent, w. r. (1968). compression word coding techniques for information retrieval. information technology and libraries, 1(4), 250-260. | library hi-tech | library science - administration | education 2%
9 | karen markey (university of michigan) | curwen, a. g. (1990). international standard bibliographic description. in standards for the international exchange of bibliographic information: papers presented at a course held at the school of library, archive and information studies, university college london (pp. 3-18). | library journal | instructional design | psychology 2%
10 | walt crawford (research libraries group/independent) | fasana, p. j. (1963). automating cataloging functions in conventional libraries (no. isl-9028-37). lexington, ma: itek corp information sciences lab. | oclc | library science - academic | sociology 2%

table 3. jla most cited.

rank | top ten authors (affiliation) | top ten articles | top ten publishers | top ten disciplines | top ten disciplines with percentages
1 | fred g. kilgour (oclc) | avram, h. d. (1968). the marc ii format: a communications format for bibliographic data. washington, dc: library of congress. | journal of library automation | library science - technology | library science 58%
2 | u.s. library of congress | american library association. (1967). anglo-american cataloging rules. chicago, il: american library association. | asist | information science | information science 14%
3 | henriette d. avram (library of congress) | ruecking jr, f. h. (1968). bibliographic retrieval from bibliographic input; the hypothesis and construction of a test. journal of library automation, 1(4), 227-238. | library of congress | library science - cataloging | computer science 6%
4 | ibm: international business machines | kilgour, f. g., leiderman, e. b., & long, p. l. (1971). retrieval of bibliographic entries from a name-title catalog by use of truncated search keys. ohio college library center. | library resources and technical services | library science - general | government 5%
5 | american library association | long, p. l., & kilgour, f. (1972). a truncated search key title index. journal of library automation, 5(1), 17-20. | ibm | computer science | corporate 5%
6 | william r. nugent (inforonics, inc.) | kilgour, f. g. (1968). retrieval of single entries from a computerized library catalog file. proceedings of the american society for information science, 5, 133-136. | american library association | government | information systems 4%
7 | paul j. fasana (columbia university) | livingston, l. g. (1973). international standard bibliographic description for serials. library resources and technical services, 17(3), 293-298. | association for computing machinery | corporate | management 2%
8 | philip l. long (oclc) | fasana, p. j. (1963). automating cataloging functions in conventional libraries (no. isl-9028-37). lexington, ma: itek corp information sciences lab. | university of illinois press | information systems | linguistics 1%
9 | martha e. williams (university of illinois) | nugent, w. r. (1968). compression word coding techniques for information retrieval. journal of library automation, 1(4), 250-260. | college and research libraries | library science - academic | electrical engineering 1%
10 | university of california | avram, h. d. (1970). the recon pilot project: a progress report. journal of library automation, 3(2), 102-114. | special libraries | library science - special | psychology 1%

table 4. ital most cited.

rank | top ten authors (affiliation) | top ten articles | top ten publishers | top ten disciplines | top ten disciplines with percentages
1 | u.s. library of congress | american library association. (1967). anglo-american cataloging rules. chicago, il: american library association. | information technology and libraries | library science - technology | library science 41%
2 | american library association | hildreth, c. r. (1982). online public access catalogs: the user interface. oclc online computer library center, incorporated. | asist | information science | information science 16%
3 | marshall breeding (vanderbilt university/independent) | markey, k. (1984). subject searching in library catalogs. oclc online computer library center. | association for computing machinery | library science - cataloging | computer science 9%
4 | jakob nielsen (independent) | malinconico, s. m. (1979). bibliographic data base organization and authority file control. wilson library bulletin, 54(1), 36-45. | college and research libraries | computer science | information systems 7%
5 | karen markey (university of michigan) | matthews, j. r., lawrence, g. s., & ferguson, d. (1983). using online catalogs: a nationwide survey. neal-schuman publishers, inc. | library hi-tech | information systems | instructional design 3%
6 | oclc | bizer, c., heath, t., & berners-lee, t. (2011). linked data: the story so far. in semantic services, interoperability and web applications: emerging concepts (pp. 205-227). igi global. | american library association | instructional design | government 2%
7 | walt crawford (research libraries group/independent) | tolle, j. e. (1983). current utilization of online catalogs: transaction log analysis. volume i of three volumes. final report. | ohio college library center | library science - administration | education 2%
8 | clifford a. lynch (university of california/coalition for networked information) | larson, r. r. (1991). the decline of subject searching: long-term trends and patterns of index use in an online catalog. journal of the american society for information science, 42(3), 197-215. | journal of academic librarianship | library science - general | sociology 2%
9 | charles r. hildreth (read ltd.) | markey, k. (1983). online catalog use: results of surveys and focus group interviews in several libraries. volume ii of three volumes. final report. | library journal | library science - academic | psychology 2%
10 | j. r. matthews (san jose state university/independent) | ludy, l. e., & logan, s. j. (1982). integrating authority control in an online catalog. american society for information science meeting, 19, 176-178. | library of congress | government | management 2%

the top ten authors of each decade are shown in table 5. for the first two decades, fred kilgour was a dominant influence, receiving 15 more citations than the next closest author (the library of congress). in the third decade, kilgour dropped entirely from the top ten and was supplanted at the top spot by karen markey, professor at the university of michigan. during the fourth decade, in the wake of cipa and the u.s.
patriot act, the library of congress rose to the top spot, and john bertot and paul jaeger, who wrote extensively on these topics and their legal, social, and administrative implications, rose up the list. web resources, such as google, also began to emerge in the fourth decade. in the final decade, breeding, who writes on library systems as well as information technology in general, rose to the top spot. tim berners-lee, one of the pioneers of the internet and linked data, took the second spot. jakob nielsen, known for his contributions to usability testing, appears in the top three of the rankings for both the fourth and fifth decades. only the library of congress and american library association appear in the top ten list for all five decades.

table 5. top ten authors of each decade.

rank | 1968-77 | 1978-87 | 1988-97 | 1998-2007 | 2008-18
1 | fred g. kilgour (oclc) | fred g. kilgour (oclc) | karen markey (university of michigan) | u.s. library of congress | marshall breeding (vanderbilt university/independent)
2 | u.s. library of congress | richard de gennaro (harvard university/pennsylvania university) | u.s. library of congress | jakob nielsen (independent) | tim berners-lee (w3 consortium/university of oxford/massachusetts institute of technology)
3 | henriette d. avram (library of congress) | henriette d. avram (library of congress) | clifford a. lynch (university of california/coalition for networked information) | john c. bertot (university of maryland) | jakob nielsen (independent)
4 | ibm: international business machines | ibm: international business machines | michael k. buckland (university of california) | oclc | u.s. library of congress
5 | american library association | s. michael malinconico (new york public library/university of alabama) | american library association | paul t. jaeger (university of maryland) | american library association
6 | paul j. fasana (columbia university) | u.s. library of congress | christine l. borgman (university of california-los angeles) | walt crawford (research libraries group/independent) | national information standards organization
7 | william r. nugent (inforonics, inc.) | frederick w. lancaster (university of illinois) | charles r. hildreth (read ltd) | american library association | u.s. government
8 | university of california | allen b. veaner (stanford university/university of california) | joseph r. matthews (san jose state university/independent) | roy tennant (university of california/oclc) | john c. bertot (university of maryland)
9 | kenneth j. bierman (oklahoma state university/university of nevada-las vegas) | alan l. landgraf (oclc) | walt crawford (research libraries group/independent) | google | oclc
10 | robert m. hayes (university of california-los angeles) | american library association | lois m. chan (university of kentucky) | thomas b. hickey (oclc) | jung-ran park (drexel university)

jla/ital appears as the most cited publisher in all decades except the fourth, as shown in table 6. during that decade, acm and jasist rose above ital, and library journal and websites (websites are considered in this study as a collective group) emerged on the list. library journal was a frequently used source for bertot and jaeger, who authored several ital articles during this period. there were also more articles about the internet, digital libraries, google and google scholar, and the future of libraries during the fourth decade. jasist appears in the top four of every decade but has declined in the fifth decade of ital. oclc, ibm, college and research libraries, cataloging and classification quarterly, journal of academic librarianship, library resources and technical services, and library hi-tech all appear in multiple decades of this list.

table 6. top ten publishers of cited articles for each decade.

rank | 1968-77 | 1978-87 | 1988-97 | 1998-2007 | 2008-18
1 | jla | jla/ital | ital | association for computing machinery | ital
2 | library of congress | jasist | jasist | jasist | library hi-tech
3 | jasist | library journal | college and research libraries | ital | association for computing machinery
4 | library resources and technical services | oclc | american library association | college and research libraries | jasist
5 | ibm | university of illinois press | library resources and technical services | american library association | journal of academic librarianship
6 | american library association | library of congress | oclc | library journal | college and research libraries
7 | special libraries | library resources and technical services | library of congress | journal of academic librarianship | computers in libraries
8 | college and research libraries | american library association | library hi-tech | general websites | d-lib
9 | association for computing machinery | prentice-hall | journal of academic librarianship | library hi-tech | cataloging and classification quarterly
10 | university of illinois press | ibm | cataloging and classification quarterly | oclc | ieee

as shown in table 7, library science and information science maintained the first and second positions for every decade of jla/ital’s publication, while computer science and information systems jockeyed for the third and fourth positions every decade except the first (when government reports had a major impact on the journal). government and corporate (ibm particularly) were important in the first three decades but were replaced by instructional design and education in the final two decades. sociology appears in four of five decades, while psychology appears in three of five.
in the first two decades, electrical engineering (as it applied to the design of computer systems) rounded out the top ten; law emerged in decade four, following cipa and the patriot act; in the final decade, with the discussion about encoded archival description in ital, archival science rose to the tenth spot.

table 7. top ten disciplines of each decade (library science subcategories combined).

rank | 1968-77 | 1978-87 | 1988-97 | 1998-2007 | 2008-18
1 | library science | library science | library science | library science | library science
2 | information science | information science | information science | information science | information science
3 | computer science | information systems | computer science | computer science | information systems
4 | government | computer science | information systems | information systems | computer science
5 | corporate | corporate | government | instructional design | instructional design
6 | information systems | government | philosophy | education | psychology
7 | management | management | sociology | corporate | government
8 | linguistics | sociology | literature | sociology | education
9 | electrical engineering | psychology | psychology | philosophy | sociology
10 | chemistry | electrical engineering | education | law | archival science

table 8 compares all disciplines (including subcategories of library science) in the first decade of jla/ital and the fifth decade. compared to the first decade, the fifth decade saw greater diversification of subtopics under library science, which led to “information science” supplanting “library science - technology” in the top spot. instructional design and archival science emerged from disciplines not discussed in the first decade to become some of the most important disciplines of the fifth decade. the library science subtopics of accessibility and teaching grew significantly as the roles of the librarian evolved.

table 8. first ten years vs. last eleven years disciplines (with subcategories of library science).

rank | 1968-77 | 2008-18
1 | library science - technology | information science
2 | information science | library science - technology
3 | library science - cataloging | information systems
4 | library science - general | computer science
5 | computer science | library science - cataloging
6 | government | instructional design
7 | corporate | library science - accessibility
8 | library science - academic | library science - academic
9 | information systems | library science - reference
10 | library science - special | library science - administration
11 | management | psychology
12 | linguistics | government
13 | electrical engineering | library science - general
14 | library science - medical | education
15 | popular | popular
16 | library science - reference | library science - teaching
17 | chemistry | sociology
18 | physics | archival science
19 | engineering - general | management
20 | psychology | law
21 | mathematics | corporate
22 | library science - local | mathematics
23 | communication science | philosophy
24 | health science | literature
25 | library science - accessibility | linguistics
26 | library science - school | physics
27 | philosophy | health science
28 | library science - administration | geography
29 | journalism | electrical engineering
30 | government | library science - medical
31 | music | biology
32 | education | art/design
33 | literature | museum studies
34 | | economics
35 | | communication science
36 | | engineering - general
37 | | journalism
38 | | library science - special
39 | | chemistry
40 | | science and technology studies
41 | | library science - school
42 | | library science - local
43 | | anthropology

table 9 shows the ten biggest themes and most frequently used terms throughout jla/ital’s 51 volumes. library is the most common theme and term. the library catalog, and the associated concept of the integrated library system (ils), influence the second and third themes. “online” is an interesting theme/term for the different ways in which it was used throughout the history of the journal.
in the early years, “online” was used to refer to the retrieval of computerized bibliographic information; in later years, “online” came to refer almost exclusively to the use of the world wide web. rounding out the top ten terms are several that are associated with the study of information science: data, bibliography, and retrieval.

finally, table 10 depicts the top ten themes for each of ital’s five decades. libraries remained at the top for all decades; the second spot, however, shifted dramatically. in the first decade, with marc being a major topic of discussion, “system” and “catalog” rose to the top. in decades two and three, with the melding of the disciplines of library science and information science, “information” rose to the top. in the final two decades, the world wide web was influential on the ital discourse. users, usability, and accessibility remain important themes throughout the history of the journal.

table 9. major themes and term frequency in titles of cited publications (all 51 volumes).

rank | themes | terms
1 | library | library
2 | catalog | information
3 | system | online
4 | information | system
5 | online | web
6 | usability | catalog
7 | web | digital
8 | search | data
9 | computer | bibliography
10 | digital | retrieval

table 10. major themes in titles of cited publications (by decade).

rank | 1968-77 | 1978-87 | 1988-97 | 1998-2007 | 2008-18
1 | library | library | library | library | library
2 | system | information | information | web | web
3 | catalog | system | catalog | information | digital
4 | information | catalog | web | digital | information
5 | online | online | system | usability | usability
6 | usability | web | digital | users | data
7 | web | usability | online | catalog | users
8 | search | digital | usability | search | accessibility
9 | computer | users | users | accessibility | studies
10 | digital | search | accessibility | data | academic

discussion

one of the major benefits of a bibliometric study/citation analysis is the production of a set of themes, disciplines, seminal sources, influences, and influencers that may benefit potential authors in determining whether their manuscript is suitable for publication in a specific discipline or journal.21 the results of this study demonstrate that ital is undoubtedly a library science journal, but that it invites a high level of interdisciplinarity and has experienced a growing impact from the disciplines of information science, computer science, and information systems (which combined presently comprise about 30 percent of total ital references). throughout the journal’s history, there has been an emphasis on library systems, particularly systems for library cataloging. recently, however, there has also been an emphasis on technology, law, and the library as well as instructional technology, teaching, and the library. ital authors take the majority of their citations/ideas from other ital articles, jasist, acm, and other library technology (library hi-tech, d-lib) and academic librarianship (college and research libraries, journal of academic librarianship) journals. some of the major authors to read to familiarize oneself with the history and themes of the ital publication include fred kilgour, henriette avram, karen markey, and marshall breeding.
these are some findings that potential ital authors may find of practical use while crafting their research and writing.

with ital having a sustained role as a leading publication in library and information science, this study may have some generalizable findings for the discipline. in 2015, richard van noorden produced an interactive chart of the interdisciplinarity of a variety of disciplines, based on data from web of science and the national science foundation.22 if ital is considered representative of a sub-discipline called “library and information science - technology,” it can be compared to the interdisciplinarity of the disciplines listed in van noorden’s study. in the last decade of ital, 45.4 percent of citations came from outside of lis. compared to van noorden’s findings, only 42 of 144 (29 percent) “fields” (or “disciplines,” as they have been called in this study) have a higher proportion of references to outside disciplines. this lis-tech sub-discipline would have a level of interdisciplinarity comparable to the fields of oceanography, botany, philosophy, history, and psychology, and on par with the average for all social sciences.23 this shows that the discipline certainly has its own proprietary knowledge base to build upon, but also values the contributions of knowledge from other disciplines.

though it is not necessarily the purpose of this study to examine the influence of ital on other journals and within the discipline of lis, some of this information can be gathered rather easily from google scholar (by searching for the journal and comparing the number of citations for each article, as displayed by scholar) and is worth sharing. table 11 shows the top ten most-cited articles published over the history of jla/ital, with mcclure’s 1994 article “network literacy: a role for libraries,” receiving the most references of any article published in the journal.
three ital articles have been cited by articles which themselves have over 1,000 citations, including one article (2007’s “checking out facebook.com”) that has been cited by an article which itself has been cited over 10,000 times. fifty-seven ital articles have at least 57 citations, giving the journal an h-index24 of 57.

table 11. citations of ital articles in outside journals.

rank | journal citation | number of citations
1 | mcclure, c. r. (1994). network literacy: a role for libraries? information technology and libraries, 13(2), 115-26. | 447
2 | charnigo, l., & barnett-ellis, p. (2007). checking out facebook.com: the impact of a digital trend on academic libraries. information technology and libraries, 26(1), 23-34. | 391
3 | antelman, k., lynema, e., & pace, a. k. (2006). toward a 21st century library catalog. information technology and libraries, 25(3), 128-39. | 267
4 | spiteri, l. f. (2007). the structure and form of folksonomy tags: the road to the public library catalog. information technology and libraries, 26(3), 13-25. | 260
5 | katz, i. r. (2007). testing information literacy in digital environments: ets's iskills assessment. information technology and libraries, 26(3), 3-12. | 226
6 | jeng, j. (2005). what is usability in the context of the digital library and how can it be measured. information technology and libraries, 24(2), 47-56. | 196
7 | lankes, r. d., silverstein, j., & nicholson, s. (2007). participatory networks: the library as conversation. information technology and libraries, 26(4), 17-33. | 189
8 | dickstein, r., & mills, v. (2000). usability testing at the university of arizona library: how to let the users in on the design. information technology and libraries, 19(3), 144-51. | 188
9 | schaffner, a. c. (1994). the future of scientific journals: lessons from the past. information technology and libraries, 13(4), 239-47. | 177
10 | kopp, j. j. (1998). library consortia and information technology: the past, the present, the promise. information technology and libraries, 17(1), 7. | 166

limitations of this study

there were a couple of potential limitations to this study. this bibliometric analysis was conducted in the “old-fashioned” way, using excel and hand-typing out all 7,575 cited publications. this was deemed the most effective way to collect the data, based on the availability of the ital journal, but did take a great deal of time. to save time in recording data, only the first author for each cited publication was listed and no publication dates were collected, nor were abstracts retained and analyzed (which may provide additional compelling details about the content of these cited publications). greater validity for the assignment of disciplines to cited publications may be achieved by having a large team of researchers for analysis, or using multiple researchers for all citations, not just those that the first researcher deems questionable.25 as with a content analysis, independent review of the data, followed by comparison and reconciliation of the coding, is likely to provide the most consistent and accurate results.

conclusion

fifty-one volumes of the journal of library automation/information technology and libraries have been published, over which time library technology has evolved from the early marc era, a time in which the exceptional library would have perhaps a single computer for “online retrieval,” to the internet age, characterized by personal computing, library management systems, and technology-aided instruction. as time has passed, many of the major influences on the journal have changed, yet the journal has remained one of the most influential for library and information science technology. increased interdisciplinarity in cited publications and emerging work in information law and education offer new directions as the journal enters its sixth decade.
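the h-index reported in the discussion (fifty-seven articles with at least 57 citations each) follows the standard definition: the largest number h such that h articles each have at least h citations. a minimal python sketch of that calculation is given below. the function name is an assumption made for illustration, and the sample input is only the ten counts listed in table 11, not the journal's full citation record, so it does not reproduce the reported value of 57.

```python
def h_index(citation_counts):
    """Return the largest h such that at least h items have h or more citations."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the article at this rank still has enough citations
        else:
            break  # counts only get smaller from here on
    return h


# the ten citation counts from table 11 (an incomplete sample of the journal)
table_11_counts = [447, 391, 267, 260, 226, 196, 189, 188, 177, 166]
print(h_index(table_11_counts))  # limited to 10 because only ten articles are supplied
```

with the complete list of per-article citation counts gathered from google scholar, the same function would be expected to return the journal-level figure.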
endnotes

1 imad al-sabbagh, “evolution of the interdisciplinarity of information science: a bibliometric study” (phd diss., florida state university, 1987).
2 ibid.
3 encyclopedia britannica, https://www.britannica.com/ (accessed sept. 13, 2018).
4 al-sabbagh, “evolution of the interdisciplinarity.”
5 melissa k. mcburney and pamela l. novak, “what is bibliometrics and why should you care?” professional communication conference, ieee (2002): 108-14, https://doi.org/10.1109/ipcc.2002.1049094.
6 lutz bornmann and rudiger mutz, “growth rates of modern science,” journal of the association for information science and technology 66, no. 11 (2015): 2,215-22, https://doi.org/10.1002/asi.23329.
7 al-sabbagh, “evolution of the interdisciplinarity.”
8 mu-hsuan huang and yu-wei chang, “a study of interdisciplinarity in information science: using direct citation and co-authorship analysis,” journal of information science 37, no. 4 (2011): 369-78, https://doi.org/10.1177/0165551511407141.
9 ming-yueh tsay, “journal bibliometric analysis: a case study on the jasist,” malaysian journal of library & information science 13, no. 2 (2008): 121-39, http://ejum.fsktm.um.edu.my/article/663.pdf.
10 lois buttlar, “information sources in library and information science doctoral research,” library & information science research 21, no. 2 (1999): 227-45, https://doi.org/10.1016/s0740-8188(99)00005-5.
11 r.v. chikate and s.k. patil, “citation analysis of theses in library and information science submitted to university of pune: a pilot study,” library philosophy and practice 222 (2008); kuang-hua chen and chiung-fang liang, “disciplinary interflow of library and information science in taiwan,” journal of library and information studies 2, no. 2 (2004): 31-55.
12 akhtar hussain and nishat fatima, “a bibliometric analysis of the ‘chinese librarianship: an international electronic journal,’ 2006-2010,” chinese librarianship 31, no. 1 (2011): 1-14, http://www.iclc.us/cliej/cl31hf.pdf.
13 nosheen fatima warraich and sajjad ahmad, “pakistan journal of library and information science: a bibliometric analysis,” pakistan journal of information management and libraries 12, no. 1 (2011): 1-7, http://eprints.rclis.org/25600/.
14 s. thanuskodi, “bibliometric analysis of the journal library philosophy and practice from 2005-2009,” library philosophy and practice 437 (2010): 1-6, https://digitalcommons.unl.edu/libphilprac/437/.
15 manoj kumar and a.l. moorthy, “bibliometric analysis of desidoc journal of library and information technology during 2001-2010,” desidoc journal of library and information technology 31, no. 3 (2011): 203-08.
16 antonio ramos-rodriguez and jose ruiz-navarro, “changes in the intellectual structure of strategic management research: a bibliographic study of the strategic management journal, 1980-2000,” strategic management journal 25, no. 10 (2004): 981-1,004, https://doi.org/10.1002/smj.397.
17 mariluz fernandez-alles and antonio ramos-rodriguez, “intellectual structure of human resources management research: a bibliometric analysis of the journal human resource management, 1985-2005,” jasist 60, no. 1 (2009): 161, https://doi.org/10.1002/asi.20947.
18 jill crawley-low, “bibliometric analysis of the american journal of veterinary research to produce a list of core veterinary medicine journals,” jmla 94, no. 4 (2006): 430-34.
19 kalervo jarvelin and pertti vakkari, “the evolution of library and information science 1965-1985: a content analysis of journal articles,” information processing and management 29, no. 1 (1993): 129-44, https://doi.org/10.1016/0306-4573(93)90028-c.
20 r. barry lewis, “nvivo and atlas.ti 5.0: a comparative review of two popular qualitative data-analysis programs,” field methods 16, no. 4 (2004): 439-69, https://doi.org/10.1177/1525822x04269174.
21 thad van leeuwen, “the application of bibliometric analyses in the evaluation of social science research: who benefits from it, and why it is still feasible,” scientometrics 66, no. 1 (2006): 133-54, https://doi.org/10.1007/s11192-006-0010-7.
22 richard van noorden, “interdisciplinarity research by the numbers,” nature 525, no. 7569 (2015): 306-07, https://doi.org/10.1038/525306a.
23 ibid, 306.
24 lutz bornmann and hans-dieter daniel, “what do we know about the h index,” journal of the american society for information science and technology 58, no. 9 (2007): 1,381-85, https://doi.org/10.1002/asi.20609.
25 linda c. smith, “citation analysis,” library trends 30, no. 1 (1981): 83-106.

reference rot in the repository: a case study of electronic theses and dissertations (etds) in an academic library

mia massicotte and kathleen botter

information technology and libraries | march 2017

abstract

this study examines etds deposited during the period 2011-2015 in an institutional repository, to determine the degree to which the documents suffer from reference rot, that is, linkrot plus content drift. the authors converted and examined 664 doctoral dissertations in total, extracting 11,437 links, finding overall that 77% of links were active, and 23% exhibited linkrot. a stratified random sample of 49 etds was drawn, yielding 990 active links, which were then checked for content drift based on mementos found in the wayback machine. mementos were found for 77% of links, and approximately half of the sampled links, 492 of 990, exhibited content drift. the results serve to emphasize not only the necessity of broader awareness of this problem, but also to stimulate action on the preservation front.
introduction

a significant proportion of material in institutional repositories consists of electronic theses and dissertations (etds), providing academic librarians with a rich testbed for deepening our understanding of new paradigms in scholarly publishing and their implications for long-term digital preservation. while academic libraries have long collected and preserved hard copy theses and dissertations of the parent institution, the shift to mandatory electronic deposit of this material has conferred new obligations and curatorial functions not previously incorporated into library workflows. by highlighting etds as a susceptible collection deserving of specific preservation actions, we draw attention to some unique responsibilities for libraries housing university-produced content, particularly as scholarly information continues its shift away from commercial production and distribution channels. as teper and kraemer point out in their discussion of etd program goals, “without preservation, long-term access is impossible; without long-term access, preservation is meaningless.”1

mia massicotte (mia.massicotte@concordia.ca) is systems librarian, concordia university library, montreal, quebec, canada. kathleen botter (kathleen.botter@concordia.ca) is systems librarian, concordia university library, montreal, quebec, canada.

a case study of electronic theses and dissertations (etds) in an academic library | massicotte and botter | https://doi.org/10.6017/ital.v36i1.9598

what is reference rot, and why study it?

in addition to linkrot (where a link sends the user to a webpage which is no longer available), there are webpages that remain available but whose contents have changed over time, a condition known as content drift. this dual phenomenon of linkrot plus content drift has been characterized as reference rot by the hiberlink project team,2 and has important implications for digital preservation.
since theses and dissertations are original works born digital by virtue of mandatory deposit programs, a university’s etd program is effectively a digital publishing initiative, accompanied by a new universe of responsibility for its digital preservation. due to the specialized nature of graduate-level research, etds frequently include links to resources on the open web, for example, personal blogs, project websites, and commercial entities. digital object identifiers (dois), useful in the context of published literature, do not apply to urls on the free web, which are doi-indifferent. open web links also fall outside the scope of preservation initiatives such as lockss (lots of copies keep stuff safe)3 which aim to safeguard the published literature. with increasing frequency, researchers are citing newer forms of scholarship, which do not readily fall under the rubric of published literature. moreover, since thesis preparation is conducted over a period of time typically measured in years, links cited therein are likely to be more vulnerable to linkrot and content drift by the time of manuscript submission. yet despite the surfeit of anecdotal daily evidence that urls vanish and result in dead links, phillips, alemneh, and ayala point out that “by and large academic libraries are not capturing and maintaining collections of web resources that provide context and historical reference points to the modern theses and dissertations held in their collections.”4 since an etd comprises a unique form of scholarly output produced by universities, and simultaneously satisfies the parent institution's degree-granting apparatus, as well as reflecting its academic stature on the international stage, the presence of reference rot in this body of literature is of particular concern and worthy of immediate attention. smoking guns there has been no shortage of evidence reporting on the linkrot phenomena over the last two decades. 
koehler, whose initial study on linkrot appeared in jasis in 1999, periodically revisited, analyzed, and reported on the same set of 360 urls collected in his original study.5,6,7 in 2015, upon the twenty-year benchmark of the original data collection, oguz and koehler reported in jasis that only 2 of the original links remained active.8 a number of foundational studies, including casserly and bird,9 spinellis,10 sellitto,11 falagas, karveli, and tritsaroli,12 and wagner et al.13 have reported on linkrot occurring in professional literature. sanderson, phillips, and van de sompel provide a table of 17 well-known linkrot studies, comparing overall benchmarks, and supplying a succinct summary of the scope of each study.14 linkrot also gained further important exposure with the harvard law school study by zittrain, albert, and lessig, which found that 70% of references in three harvard law journals, and 49.9% of urls in the supreme court opinions examined, no longer pointed to their originally cited sources.15

members of the hiberlink project, which set out to examine “a vast corpus of online scholarly publication in order to assess what links still work as intended and what web content has been successfully archived using text mining and information extracting tools” have been pivotal in making the case for reference rot.16 hiberlink demonstrated that failure to link to cited sources was due not only to linkrot, but also to web page content which changed over time.17 a new dimension of the digital preservation universe was thrown into sharp relief with a follow-up study by klein et al. (2014), which examined one million web references extracted from 3.5 million science, technology, and medicine (stm) articles published in elsevier, pubmed central, and arxiv, between the years 1997 and 2012.
the study concluded that one in five articles suffers from reference rot.18 though the study focused on stm articles, its authors drew attention to theses and dissertations as a susceptible class of material. analyzing the same set of links extracted from this large stm corpus, jones et al. (2016) recently reported that 75% of referenced open web pages demonstrated changes in content.19 etds — a susceptible collection the digital preservation part of institutionally mandated etd deposit has yet to have its dots fully connected to the rest of the diagram. after four years of research into academic institutions’ etd programs, halbert, skinner, and schultz reported that close to 75% of respondents surveyed had no preservation plan for their etd collections.20 despite the prevalence of linkrot studies, linkrot in etds has not been subjected to similar scrutiny, and the implications of disappearance of content is underappreciated. while mandatory deposit programs have become relatively commonplace, focus has largely remained on policy and implementation aspects, metadata quality, interoperability and conformance to standards.21,22 there are few studies which focus on institutional repository link content. the study conducted by sanderson, philips, and van de sompel (2011) was a large-scale examination of two repositories.23 400,144 papers deposited in arxiv, and 3,595 papers in the university of north texas (unt) digital library repository were studied, and more than 160,000 urls examined. links were analyzed for persistence and the availability of mementos, that is, whether prior versions of the page existed in a public web archive, such as the internet archive's wayback machine. for 72% of unt urls, either mementos were available, or the resource still existed at its original location, or both. 
although 54% (9,880) were available in one or more international web archives, 28% (5,073) of unt's etd links were found to no longer exist, nor had they been archived by the international archival community.

phillips, alemneh, and ayala looked at overall general patterns and trends of url references in repository etds, examining 4,335 etds between the years 1999-2012 in the unt repository.24 the team analyzed 26,683 unique urls in 2,713 etds containing one or more links, finding an overall average of 10.58 unique urls per etd with one or more links. the unt team provided a breakdown of domain and subdomain occurrence frequency, and indicated areas of future investigation into content-based url linking patterns of etds.

etd link decay was studied by sife and bernard, who performed a citation analysis on urls in 83 theses published between 2007 and 2011 at tanzania's sokoine national agricultural library.25 15,468 citations were examined, 9.6% (1,487) of which were open web citations. urls were considered active if found at the original location, or available after a url redirect. the authors manually tested urls over a period of seven days to record their accessibility, noting the error messages and domains of inaccessible urls, and analyzing the types of errors encountered. the authors calculated that it took only 2.5 years for half of the web citations to disappear.

at the etd2014 conference,26 an important study of 7,500 etds in 5 u.s. universities was presented.
of 6,400 etds defended between 2003 and 2010, approximately 18% of open web link content was confirmed as lost, and a further 34% at risk of loss, that is, live links which lacked an archived copy.27 though the results of that particular study have not been formally published, it was briefly summarized in a session held at the 38th uksg annual conference in glasgow, scotland in march 2015, an account of which was subsequently published by burnhill, mewissen, and wincewicz in insights.28 given the scarcity of published literature on link content as found in etds, this present study which examines reference rot in etds in an academic institutional repository is unique, draws attention to an important digital collection which is vulnerable to loss, and highlights need for action. background and context concordia university is a comprehensive university located in montreal, with a student population of 43,903 full-time equivalents in 2015, of which 7,835 were graduate students. 27 phd programs were offered in 2015,29 and 43 programs at the masters level. faculties of arts and science, engineering and computer science, fine arts, and business have a thesis requirement, and produce upwards of 350 masters and 150 phd dissertations annually. the broad disciplines, and the departmental clusters used in this study are shown in table 1. prior to the thesis deposit mandate, concordia university library housed hard copy versions of theses and dissertations in the collection. in 2009, the library launched spectrum, concordia’s eprints institutional repository, playing a leadership role in spectrum's implementation and policy development, and providing training and support to the school of graduate studies regarding submission and management of theses for deposit. following a successful pilot project, the graduate studies office ceased accepting paper manuscripts, and mandated electronic deposit of all theses and dissertations into spectrum as of spring 2011. 
table 1. summary of departmental clusters used in this study

discipline | departments
arts | applied linguistics; communication; economics; educational technology; history; hist and phil of religion; humanities; philosophy; sociology; political science; psychology; religion
business* | decision sciences and mis; finance; management; marketing
engineering** | building engineering; civil engineering; computer science; comp sci & software eng; electrical and comp eng; industrial engineering; info systems security; mechanical engineering
fine arts | art education; art history; film and moving image studies; industrial design; fine arts; performing arts
science | biology; chemistry; mathematics; physics; exercise science

* john molson school of business
** engineering & computer science

methodology

we concentrated on phd dissertations (henceforth etds) in spectrum in order to limit the scope of the project; master's theses were excluded. a 5-year period was chosen, beginning with the first semester of mandatory deposit, spring 2011, through fall 2015, yielding a total of 720 etds. since concordia etds are released for publication immediately following convocation, the university's official convocation dates were used to identify the set of documents to be downloaded and examined.

we proceeded in phases: first downloading etds from spectrum and converting to a text format that could be examined for patterns; then extracting links from each and testing programmatically
our methodology for link extraction was similar to those described by klein et al.,30 and zhou, tobin, and grover.31 during the dissertation download stage, 36 etds with embargoed content were encountered and eliminated. etds were then converted from existing pdf/a format to xml. a further 20 documents failed to convert due to nonstandard or complex formatting which resulted in unreadable, garbled characters. these documents resisted multiple conversion attempts, and since they could not be mined, had to be eliminated. a final total of 664 etds were successfully converted using three different tools: 97% (644) were converted using pdftohtml,32 the remaining 3% by either givemetext (14)33 or adobe acrobat (3). a spot check of documents was sufficient evidence that many links occurred throughout the text body. since we intended to extract urls to the open web, we wanted to err on the side of detecting more links, rather than easily-identifiable well-formed urls. links were mined from the body of the text in a manner similar to the study carried out at unt.34 we wanted a regular expression which would catch as many urls as possible, expecting to manually clean the link output before further processing. we tested multiple regular expressions35 against a small sample of our converted etds and compared the results. we selected one which seemed well-suited for our purpose, as it was liberal in detecting links throughout the text, was able to extract links which contained obvious omissions and problems — for example, those that lacked http:// prefixes — but also caught non-obvious errors, such as ellipses in long urls. we considered how deduplication of extracted links might affect the outcome, and opted to count each link as an individual instance. 
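the liberal extraction step described above can be illustrated with a short sketch. the regular expression actually chosen by the authors is not given in the text, so the pattern below is a hypothetical stand-in that only shows the general idea: accept urls with or without an http:// prefix, trim trailing punctuation, and keep every occurrence rather than deduplicating. the names `URL_PATTERN`, `TRAILING`, and `extract_links` are illustrative assumptions.

```python
import re

# Hypothetical liberal pattern (not the authors' expression): anything that
# starts with http://, https://, or www. and runs until whitespace or a quote.
URL_PATTERN = re.compile(r"""(?:https?://|www\.)[^\s<>"']+""", re.IGNORECASE)

# Punctuation that usually belongs to the sentence rather than the url.
# This is a heuristic and can clip urls that legitimately end in ")".
TRAILING = ".,;:)]}"


def extract_links(text):
    """Return every url-like string in text, in order, duplicates included."""
    links = []
    for match in URL_PATTERN.finditer(text):
        url = match.group(0).rstrip(TRAILING)
        if not url.lower().startswith(("http://", "https://")):
            url = "http://" + url  # supply the missing scheme
        links.append(url)
    return links


sample_text = (
    "see http://example.org/a, and (www.example.com/b). "
    "the first source, http://example.org/a. is cited twice."
)
for link in extract_links(sample_text):
    print(link)
```

a pattern this permissive will also produce false hits, which is consistent with the expectation stated above that the link output would be cleaned manually before further processing.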
manual cleanup included catching urls that broke across new lines, identifying false hits such as titles containing colons and dois, and adding escape encoding characters for "&" and "%" in order to generate a clean url for use in the next step of the process.

methodology: linkrot collection

a script programmatically used the curl command line tool to visit each link and fetch the http response code in return.36 an output listing was produced for each doctoral dissertation, comprising the original urls, the final urls, and the http response codes. link output for each of the converted 664 etds was collected from december 2015 to january 2016, with the fall 2015 semester checked in march 2016. 76% (504 of 664) of etds contained one or more links, the highest number of links (5,946) falling into the arts group. 24% (160 of 664) of etds contained no links. for the 5-year period, the broad discipline breakdown of documents examined, the number of etds with links, and the number of links extracted are shown in table 2. converted etds by publication year, broken out by broad disciplines, are shown in figure 1.

discipline | number of phd etds in spectrum | etds converted* | contain no links | contain links | number of links extracted
arts | 210 | 195 | 31 | 164 | 5,946
business | 45 | 43 | 12 | 31 | 210
engineering | 351 | 326 | 82 | 244 | 3,259
fine arts | 28 | 25 | 2 | 23 | 1,728
science | 86 | 75 | 33 | 42 | 294
total | 720 | 664 | 160 | 504 | 11,437

table 2. 5-year period, 2011-2015, summary of documents examined and links extracted
* 56 documents in total eliminated (36 embargoed; plus 20 which failed to convert).

figure 1. converted etds by publication year and broad discipline

the 11,437 links extracted were checked for linkrot, with each link accessed and its http response code recorded.
77% (8,834 of 11,437) of links returned an active 2xx http response code. 23% (2,603) of links could not be reached, returning a response code other than in the 2xx range. this includes 102 links in the 3xx range which failed to reach a destination after 50 redirects and were considered linkrot. numbers of links, total link response, and link response by year broken down by broad discipline are shown in figure 2, with accompanying data provided in table 3 and discussed in the findings section.

figure 2. link http response codes, by broad discipline and year

discipline | http response code | 2011 | 2012 | 2013 | 2014 | 2015 | total | % active & rotten**
arts | 2xx | 691 | 864 | 800 | 1,108 | 1,093 | 4,556 | 77%
arts | all other* | 320 | 428 | 131 | 293 | 218 | 1,390 | 23%
business | 2xx | 14 | 52 | 17 | 22 | 50 | 155 | 74%
business | all other | 9 | 19 | 5 | 9 | 13 | 55 | 26%
engineering | 2xx | 302 | 702 | 638 | 482 | 404 | 2,528 | 78%
engineering | all other | 134 | 172 | 180 | 196 | 49 | 731 | 22%
fine arts | 2xx | 165 | 143 | 504 | 467 | 94 | 1,373 | 79%
fine arts | all other | 74 | 56 | 118 | 98 | 9 | 355 | 21%
science | 2xx | 77 | 34 | 58 | 39 | 14 | 222 | 76%
science | all other | 25 | 23 | 10 | 11 | 3 | 72 | 24%
subtotal | 2xx | 1,249 | 1,795 | 2,017 | 2,118 | 1,655 | 8,834 | 77% active
subtotal | all other | 562 | 698 | 444 | 607 | 292 | 2,603 | 23% rotten
% rotten | | 31% | 28% | 18% | 22% | 15% | 23% |
total | | 1,811 | 2,493 | 2,461 | 2,725 | 1,947 | 11,437 | 100%

table 3. breakdown by year and discipline showing active (2xx) and rotten (all others) response codes
* all other = 0, 1xx, 3xx (unresolved after 50 redirects), 4xx and 5xx response codes combined
** active and rotten rates based on total links per discipline

methodology: content drift

for the content drift phase, we wanted to sample documents from each of the five disciplines. etds which did not contain any links were excluded from the sample. using only documents with one or more active links, a stratified random sample of 10% was drawn for a final sample of 49 etds containing a total of 990 links.
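the 10% stratified draw described above can be sketched as follows. this is a minimal illustration and not the authors' procedure: the rounding rule (round half up within each discipline), the seed, and the placeholder etd identifiers are all assumptions. the pool sizes are the per-discipline counts of etds with active links reported in table 4.

```python
import random

# ETDs with at least one active (2xx) link, per discipline, as given in table 4.
POOL_SIZES = {
    "arts": 156,
    "business": 28,
    "engineering": 235,
    "fine arts": 23,
    "science": 40,
}


def stratum_sample_size(pool_size, percent=10):
    """Percent of the pool, rounded half up with integer arithmetic (assumed rule)."""
    return (pool_size * percent + 50) // 100


def stratified_sample(pools, percent=10, seed=0):
    """Draw a simple random sample of the given percent from every stratum."""
    rng = random.Random(seed)  # fixed seed only so the sketch is repeatable
    return {
        name: rng.sample(ids, stratum_sample_size(len(ids), percent))
        for name, ids in pools.items()
    }


# Placeholder identifiers standing in for real ETD records.
pools = {
    name: [f"{name}-{i}" for i in range(1, n + 1)]
    for name, n in POOL_SIZES.items()
}
sample = stratified_sample(pools)
sizes = {name: len(chosen) for name, chosen in sample.items()}
print(sizes, sum(sizes.values()))
```

under the assumed rounding rule the per-discipline sample sizes come to 16, 3, 24, 2, and 4, for 49 etds in total, which agrees with the sample sizes reported in table 4.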
a snippet of text surrounding each link was then also extracted from each etd, along with any "date accessed" or "date viewed" information if present. each link was manually visited, assessed for content drift, and observations recorded. the breakdown of the content drift sample is shown in table 4.

discipline | etds with links | etds with active links (2xx) | etds sampled for content drift* | number of links extracted for sample
arts | 164 | 156 | 16 | 668
business | 31 | 28 | 3 | 12
engineering | 244 | 235 | 24 | 154
fine arts | 23 | 23 | 2 | 136
science | 42 | 40 | 4 | 20
total | 504 | 482 | 49 | 990

table 4. breakdown of sample pool of etds for content drift analysis
* 10% sample drawn from each discipline’s pool of etds; only etds with urls relevant for content drift assessment.

visited links were benchmarked against the existence of a memento, an archived snapshot of that page located in the wayback machine.37 since the university sets a strict thesis submission deadline of 3 months prior to convocation, mementos dated before the submission deadline were sought. based on the occurrences of "date accessed" and discursive information found in the snippets, we arrived at the supposition that links were likely to have been checked the closer the student approached final stages of manuscript preparation, although this is not verifiable. we set ourselves a soft window for locating an archived snapshot using a date 6 months prior to the convocation date as the benchmark; that is, for each semester's deadline date, an additional 3 months was added, arriving at a 6-months-prior-to-publication marker.

since programmatic analysis of 990 links required time, expertise, and resources not available to us, we approached the problem heuristically. assuming that online consultations are not linear, active links occurring multiple times in a document were given equal weight.
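the date arithmetic behind the soft marker described above is small enough to show directly. the sketch below is an illustration under stated assumptions: the helper names are invented here, the convocation date is hypothetical, and clamping the day to the end of a shorter month is a choice made for this example rather than something specified in the study.

```python
import calendar
from datetime import date


def months_before(d, months):
    """Step back a whole number of calendar months, clamping the day if needed."""
    index = d.year * 12 + (d.month - 1) - months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


def submission_deadline(convocation):
    """Thesis submission deadline: 3 months prior to convocation."""
    return months_before(convocation, 3)


def soft_marker(convocation):
    """Benchmark for locating a memento: the deadline plus a further 3 months back."""
    return months_before(submission_deadline(convocation), 3)


convocation = date(2015, 6, 10)  # hypothetical convocation date
print(submission_deadline(convocation), soft_marker(convocation))
```

computing the marker as two three-month steps mirrors the wording of the text (deadline first, then an additional three months); for most dates this is the same as stepping back six months at once.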
each link was manually checked in the wayback machine using "date viewed" if provided; if no date was provided (the majority of cases), wayback was checked to see if an archived version existed as close to our 6 month soft marker as possible. if a memento was not found within a month earlier/later than the soft marker, then the nearest neighboring older memento was selected, if one existed. the original url, the date the url was visited, and whether a snapshot was located in wayback were recorded. all links were checked during july-august 2016. if the initial web browser failed to access a page, a second and sometimes third browser was tried, using safari, chrome, and internet explorer (ie) in that order. unsuccessful attempts to reach wayback were rechecked in september. the question of whether, and to what degree, content drift had occurred was assessed, and is discussed in the next section.

findings and discussion

linkrot findings

of 664 etds examined for linkrot, 77% of links tested returned an active http response code in the 2xx range, roughly three-quarters overall. numbers of links by broad discipline varied greatly, as shown in figure 2 (healthy links in green, linkrot shown in red). linkrot rates ranged from 21% in fine arts, to 26% in business, as seen in the last column of table 3. it should be noted that 2xx response codes are also returned for pages that disguise themselves as active links. for example, a url returns an active status code when a domain has been parked (e.g. purchased to reserve the space), or when a customized 404-page-not-found is encountered. since we had no mechanism in place to treat false positives, these were flagged during the linkrot phase as candidates for subsequent content drift analysis. 23% (2,603 of 11,437) of all links returned a response code outside the 2xx range and were considered linkrot, roughly one-quarter.
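the active/rotten split used in these findings reduces to a simple rule on the recorded response code. the sketch below is a minimal illustration of that rule and makes no network requests; it assumes the final response codes have already been collected, with 0 standing for an empty response and a 3xx code meaning the link never resolved within the redirect limit. the function names are assumptions, and the totals used in the example are the published figures from table 3.

```python
def code_category(code):
    """Group a recorded response code into 0, 1xx, 2xx, 3xx, 4xx, or 5xx."""
    if code == 0:
        return "0"  # no response recorded, e.g. the request timed out
    return f"{code // 100}xx"


def link_status(code):
    """Active if the final recorded code is in the 2xx range, otherwise rotten."""
    return "active" if 200 <= code <= 299 else "rotten"


def percent(part, whole):
    """Whole-number percentage, rounded to the nearest integer."""
    return round(100 * part / whole)


# Published totals from table 3.
active_links, rotten_links = 8834, 2603
total_links = active_links + rotten_links
print(total_links, percent(active_links, total_links), percent(rotten_links, total_links))
```

as noted above, a 2xx code does not guarantee that the cited content is still present, since parked domains and custom error pages also return 2xx; a rule like this one therefore gives an upper bound on the share of healthy links.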
response codes in the 4xx range alone, including 404-page-not-found errors, comprised 17% (1,916 of 11,437) of all links. table 5 shows the breakdown of the total number of links that were visited in the spring of 2016 for linkrot determination.

http response code category | meaning of http response code* | number of links | percent of total links (%)
0 | empty response** | 507 | 4%
1xx | informational | 2 | 0%
2xx | successful | 8,834 | 77%
3xx | redirection† | 102 | 1%
4xx | client error | 1,916 | 17%
5xx | server error | 76 | 1%
total | | 11,437 | 100%

table 5. breakdown of http response codes received
* we used http protocol definitions at http://www.w3.org/protocols/rfc2616/rfc2616-sec10.html
** unofficial http response code due to request timing out
† failure to resolve after 50 redirects

http responses ranged from a high of 85% active in 2015, to a low of 69% active in 2011, the oldest publication year. to put it differently, the most recent year exhibited a linkrot rate of 15%. consistent with other studies, linkrot manifests itself quickly after publication and increases over time, as indicated by percentages shown in figure 2.

content drift findings

of the 990 links visited to check for the presence of content drift, 764 (400 + 364), or 77%, had a wayback memento, compared with 226 (92 + 134), or 23%, which did not. slightly more than half of links with mementos, 52% (400 of 764), demonstrated some level of content drift when the memento was compared to the current active link, while 48% (364 of 764) with mementos did not exhibit content drift. the presence of content drift by discipline, with/without mementos showing numbers of links tested, appears in table 6.
discipline | number of links tested | drift detected: memento found | drift detected: memento not found | drift detected: total | no drift: memento found | no drift: memento not found | no drift: total
arts | 668 | 261 | 60 | 321 | 254 | 93 | 347
business | 12 | 5 | 0 | 5 | 4 | 3 | 7
engineering | 154 | 74 | 10 | 84 | 55 | 15 | 70
fine arts | 136 | 55 | 22 | 77 | 38 | 21 | 59
science | 20 | 5 | 0 | 5 | 13 | 2 | 15
total | 990 | 400 | 92 | 492 | 364 | 134 | 498

table 6. presence of content drift by discipline, with/without mementos

for links that had no memento in wayback, content drift assessment was based on the presence of an observable date in the current active link, including copyright, and/or other details which positively correlated against our extracted snippet information. for example, some links retrieved a .pdf or other static file which correlated with the snippet, there being no reason to conclude its content had undergone change since publication, despite the lack of a memento. snippets were also used in cases where a robots.txt file at the target url had prevented wayback from creating a memento. occasional examination of the dissertation text was conducted to validate information extracted in the snippet. the 23% (226) which lacked mementos remain at significant risk and will fall prey to further drift as time passes.

as seen in table 7, of 492 urls manifesting content drift, 11% (54 of 492) were completely lost, linking to web domains that had been sold or were currently up for sale, and webpages replaced or removed. 9% (42 of 492) of web pages exhibited major change such that there was little correlation with snippets, or where website overhauls made assessment difficult, but not impossible. 36% (179 of 492) of web links exhibited minor drift, primarily pages that differed somewhat from a memento in visual appearance, such as header and footer differences, changes in background theme or style, or changes in navigation or search functionality which did not represent a high degree of impairment.
7% (34 of 492) linked to continually updating websites, such as wikipedia and news organizations, and 7% (35 of 492) were customized 404-page-not-found pages, distinctive enough to warrant separate categories. a full 30% (148 of 492) exhibited a multiplicity of changes of uncertain nature which we grouped together, such as pages where graphic or audio components had been removed or could not be retrieved, broken javascript that impeded access, browser failure, and mementos not accessible after repeated attempts, all indicative of a range of issues affecting the quality of web archives and hence preservation.38 the types of content drift encountered, broken down by broad discipline, numbers of links, and percentage, are shown in table 7.

type of content drift | arts | business | engineering | fine arts | science | total | % of type
lost | 45 | 0 | 3 | 6 | 0 | 54 | 11%
major but findable | 22 | 0 | 9 | 9 | 2 | 42 | 9%
minor, redesigned but recognizable | 128 | 2 | 30 | 17 | 2 | 179 | 36%
ongoing updating website | 25 | 3 | 5 | 0 | 1 | 34 | 7%
custom 404 | 23 | 0 | 4 | 8 | 0 | 35 | 7%
other | 78 | 0 | 33 | 37 | 0 | 148 | 30%
total | 321 | 5 | 84 | 77 | 5 | 492 | 100%

table 7. types of content drift encountered, number of links by broad discipline

though difficulties encountered during content drift assessment made further extrapolation problematic, the presence of reference rot was confirmed. our 10% stratified random sample examined 990 active links, finding that roughly half (492 of 990) manifested some degree of content drift. for 364 links, or 36% overall, a benchmark memento was found and no content drift detected. although many content drift changes can arguably be characterized as minor, it is not possible to ascertain where the content drift scale tips irremediably for any particular reader. what can be said with certainty is that 11% of the links exhibiting content drift, links which did not exhibit linkrot and were quite live and accessible, fell into a small but unsettling group where the context of the cited web source is irrevocably lost.
of the 498 links which did not exhibit any evidence of content drift, 134 (27%, roughly one-quarter) have no memento archived and continue to remain at high risk. a focused and deeper analysis of active links which might lead to a typology of content drift types would be a possible area of future study, though even the well-resourced study by jones et al., which utilized a strict "ground truth" for comparing textual mementos over time, points out that classifying links would certainly be challenging.39 a larger sample size might also allow closer analysis of disciplinary differences, which may lead to a better understanding of these types of content drift variations.

conclusion

reference rot in the form of linkrot and content drift was observed in etds in spectrum, our institutional repository, and this confirmation should give pause for those charged with stewardship of etd collections. theses and dissertations have long been viewed as material which contributes overall to academic scholarly output, and carry unique status within the academy. in august 2016, opendoar registered 1,600 institutional repositories with etds,40 up from 1,100 institutions as reported in 2012 by grey literature specialist schoepfel.41 academic libraries have, in large part, facilitated the transition from paper to etd with widespread adoption of institutional repository deposit programs, and along with that adoption comes a range of long-term preservation issues.
yet as ohio state’s strategic digital initiatives working group pointed out, “even in digital library communities, preservation all too often stands in for or is used interchangeably with byte level backup of content.”42 for long-term access, focus can productively be shifted to offset the immediate threat of incompleteness and inadequate capture.43 not much has changed since hedstrom wrote back in 1997: “with few exceptions, digital library research has focussed on architectures and systems for information organization and retrieval, presentation and visualization, and administration of intellectual property rights … the critical role of digital libraries and archives in ensuring the future accessibility of information with enduring value has taken a back seat to enhancing access to current and actively used materials.”44 our understanding and discussion of digital preservation must be broadened, and attention turned to this key area of responsibility in the preservation life-cycle. the authors maintain that etd content and link preservation is an editorial, not individual, imperative. encouraging individual authors to perform their own archiving is doomed to fall short of even reasonable expectations. measures such as perma, a distributed, redundant method of capturing and archiving web site content as part of the citation process, must be pro-actively sought and built into library, and hence repository, workflows.45 browser plugins and automated solutions which use the memento protocol for capturing and archiving web site content as part of the citation process do exist,46 but naturally have to be implemented before they can take effect. either way, efforts to operationalize existing mechanisms which are designed to reduce future loss would be extremely productive. responsibility for ensuring not only current, but continuing future access to etd content rests with those who maintain the curatorial function of the repository.
academic librarians have assumed a prominent and de facto role as curators, facilitating the role of university publication and emphasizing its break away from previous ties with commercial entities. we collectively bear greater responsibility for this body of scholarly work, and need to move forward from a position of benign neglect to one of informed curation and pro-active preservation of an important collection of scholarly output which is at risk.

references

1. thomas h. teper and beth kraemer, “long-term retention of electronic theses and dissertations,” college & research libraries 63, no. 1 (january 1, 2002), 64, https://doi.org/10.5860/crl.63.1.61.
2. the term “reference rot” was introduced by the hiberlink team. “hiberlink – about,” accessed march 31, 2016, http://hiberlink.org/about.html.
3. lockss: lots of copies keep stuff safe, accessed december 6, 2016, http://www.lockss.org/about/what-is-lockss/.
4. mark edward phillips, daniel gelaw alemneh, and brenda reyes ayala, “analysis of url references in etds: a case study at the university of north texas,” library management 35, no. 4/5 (june 3, 2014), 294, https://doi.org/10.1108/lm-08-2013-0073.
5. wallace koehler, “an analysis of web page and web site constancy and permanence,” journal of the american society for information science 50, no. 2 (january 1, 1999): 162–80, https://doi.org/10.1002/(sici)1097-4571(1999)50:2<162::aid-asi7>3.0.co;2-b.
6. wallace koehler, “web page change and persistence—a four-year longitudinal study,” journal of the american society for information science & technology 53, no. 2 (january 15, 2002): 162–71, http://doi.org/10.1002/asi.10018.
7. wallace koehler, "a longitudinal study of web pages continued: a consideration of document persistence." information research 9, no. 2 (2004): 9-2, http://www.informationr.net/ir/92/paper174.html.
8. fatih oguz and wallace koehler, “url decay at year 20: a research note,” journal of the association for information science and technology 67, no. 2 (february 1, 2016): 477–79, https://doi.org/10.1002/asi.23561.
9. mary f. casserly and james bird, “web citation availability: analysis and implications for scholarship,” college and research libraries 64, no. 4 (july 2003): 300–317, http://crl.acrl.org/content/64/4/300.full.pdf.
10. diomidis spinellis, “the decay and failures of web references,” communications of the acm 46, no. 1 (january 2003): 71–77, https://doi.org/10.1145/602421.602422.
11. carmine sellitto, “a study of missing web-cites in scholarly articles: towards an evaluation framework,” journal of information science 30, no. 6 (december 1, 2004): 484–95, https://doi.org/10.1177/0165551504047822.
12. matthew e. falagas, efthymia a. karveli, and vassiliki i. tritsaroli, “the risk of using the internet as reference resource: a comparative study,” international journal of medical informatics 77, no. 4 (april 2008): 280–86, https://doi.org/10.1016/j.ijmedinf.2007.07.001.
13. cassie wagner et al., “disappearing act: decay of uniform resource locators in health care management journals,” journal of the medical library association 97, no. 2 (april 2009): 122–30, https://doi.org/10.3163/1536-5050.97.2.009.
14. robert sanderson, mark phillips, and herbert van de sompel, “analyzing the persistence of referenced web resources with memento,” arxiv:1105.3459 [cs], may 17, 2011, http://arxiv.org/abs/1105.3459.
15. jonathan zittrain, kendra albert, and lawrence lessig, “perma: scoping and addressing the problem of link and reference rot in legal citations,” legal information management 14, no. 2 (june 2014): 88–99, https://doi.org/10.1017/s1472669614000255.
16. “hiberlink about,” accessed march 31, 2016, http://hiberlink.org/about.html.
17. “hiberlink our research,” accessed march 31, 2016, http://hiberlink.org/research.html.
18. martin klein, herbert van de sompel, robert sanderson, harihar shankar, lyudmila balakireva, ke zhou, richard tobin. “scholarly context not found: one in five articles suffers from reference rot,” plos one 9, no. 12 (december 26, 2014), https://doi.org/10.1371/journal.pone.0115253.
19. shawn m. jones, herbert van de sompel, harihar shankar, martin klein, richard tobin, claire grover. “scholarly context adrift: three out of four uri references lead to changed content,” plos one 11, no. 12 (december 2, 2016): e0167475, https://doi.org/10.1371/journal.pone.0167475.
20. martin halbert, katherine skinner, and matt schultz, “preserving electronic theses and dissertations: findings of the lifecycle management for etds project,” text, (august 6, 2015), 2, http://educopia.org/presentations/preserving-electronic-theses-anddissertations-findings-lifecycle-management-etds.
21. for a recent overview, see sarah potvin and santi thompson, “an analysis of evolving metadata influences, standards, and practices in electronic theses and dissertations,” library resources & technical services 60, no. 2 (march 31, 2016): 99–114, https://doi.org/10.5860/lrts.60n2.99.
22. joy m. perrin, heidi m. winkler, and le yang, “digital preservation challenges with an etd collection — a case study at texas tech university,” the journal of academic librarianship 41, no. 1 (january 2015): 98–104, https://doi.org/10.1016/j.acalib.2014.11.002.
23. sanderson, phillips, and van de sompel, “analyzing the persistence of referenced web resources with memento,” http://arxiv.org/abs/1105.3459.
24. phillips, alemneh, and ayala, "analysis of url references," https://doi.org/10.1108/lm-08-2013-0073.
25. alfred s. sife and ronald bernard, “persistence and decay of web citations used in theses and dissertations available at the sokoine national agricultural library, tanzania,” international journal of education and development using information and communication technology 9, no. 2 (2013): 85–94, http://eric.ed.gov/?id=ej1071354.
26. “etd2014 — university of leicester,” university of leicester, accessed january 27, 2016, http://www2.le.ac.uk/library/downloads/etd2014.
27. edina, university of edinburgh, “reference rot: threat and remedy,” (education, 04:54:38 utc), http://www.slideshare.net/edinadocumentationofficer/reference-rot-and-linkeddata-threat-and-remedy.
28. peter burnhill, muriel mewissen, and richard wincewicz, “reference rot in scholarly statement: threat and remedy,” insights the uksg journal 28, no. 2 (july 7, 2015): 55–61, https://doi.org/10.1629/uksg.237.
29. concordia university university graduate programs, accessed april 7, 2016, http://www.concordia.ca/academics/graduate.html.
30. klein et al., "scholarly context not found," https://doi.org/10.1371/journal.pone.0115253.
31. ke zhou, richard tobin, and claire grover, “extraction and analysis of referenced web links in large-scale scholarly articles,” in proceedings of the 14th acm/ieee-cs joint conference on digital libraries, jcdl ’14 (piscataway, nj, usa: ieee press, 2014), 451–452, http://dl.acm.org/citation.cfm?id=2740769.2740863.
32. pdftohtml v0.38 win32, meshko (mikhail kruk), http://pdftohtml.sourceforge.net/ accessed september 20, 2015. (actual download is at http://sourceforge.net/projects/pdftohtml/).
33. give me text! open knowledge international, accessed october 26, 2015-march 7, 2016, http://givemetext.okfnlabs.org/.
34. phillips, alemneh, and ayala, "analysis of url references," https://doi.org/10.1108/lm-08-2013-0073.
35. “in search of the perfect url validation regex,” accessed december 7, 2015, https://mathiasbynens.be/demo/url-regex. we selected "@gruber v2" for our extraction.
36. curl v7.45.0, "command line tool and library for transferring data with urls," accessed october 18, 2015, http://curl.haxx.se/.
37. we have used the term "memento" in lowercase to denote a snapshot souvenir page, to distinguish from an automated service utilizing the memento protocol.
38. for a good overview of the types of problems, see michael l. nelson, scott g. ainsworth, justin f. brunelle, mat kelly, hany salaheldeen and michele weigle, "assessing the quality of web archives" 1 vol., computer science presentations, book 8 (old dominion university. odu digital commons, 2014). http://digitalcommons.odu.edu/computerscience_presentations/8.
39. shawn m. jones, et al. “scholarly context adrift," https://doi.org/10.1371/journal.pone.0167475.
40. opendoar search of institutional repositories with theses at http://www.opendoar.org/find.php, accessed august 26, 2016.
41. joachim schöpfel, "adding value to electronic theses and dissertations in institutional repositories." d-lib magazine 19, no. 3 (2013): 1. https://doi.org/10.1045/march2013schopfel.
42. strategic digital initiatives working group. implementation of a modern digital library at the ohio state university. (apr 2014). https://library.osu.edu/documents/sdiwg/sdiwg_white_paper.pdf. (published).
43. tim gollins. “parsimonious preservation: preventing pointless processes! (the small simple steps that take digital preservation a long way forward),” in online information proceedings uk national archives, 2009. available at http://www.nationalarchives.gov.uk/documents/information-management/parsimoniouspreservation.pdf.
44. margaret hedstrom, "digital preservation: a time bomb for digital libraries." computers and the humanities 31, no. 3 (1997): 189-202. https://doi.org/10.1023/a:1000676723815.
45. zittrain, albert, and lessig, "perma," https://doi.org/10.1017/s1472669614000255.
46. herbert van de sompel, michael l. nelson, robert sanderson, lyudmila l. balakireva, scott ainsworth, and harihar shankar, “memento: time travel for the web,” arxiv:0911.1112 [cs], november 5, 2009, http://arxiv.org/abs/0911.1112.

on their regular clsi equipment in the main library and station branch. on days when housekeeping chores are scheduled, the console operator's job includes turning on the apples so we can begin serving the public when the doors open at 9:00 a.m. unless downtime persists for more than a day, no other routines are done except checkouts. under some circumstances, certain materials might be checked in on the apple, but it is not desirable to do this for newer materials on which holds may have been placed. when the libs 100 is online again, the checkout station is switched back to normal mode and the apple takes over the information desk's port for dumping, rendering that terminal inoperative. dumping continues around the clock until all transactions have been processed from both apples. normal activities proceed at all other terminals. diskettes are dumped in chronological order. as the dumping process operates, a file of transactions eliciting error or exception messages from the libs 100 is created on the apple diskette. this file is available for attention at a later time for manual entry into the database. the chief asset of the dumping process is the accuracy achieved by automatic inputting. when we used paper and pencil, not only was the original writing time consuming, but manual data entry was difficult because of illegible handwriting, inaccurate transcription of the numbers, inaccurate inputting into the database, and lack of available personnel for the job. the cti system resolves all of these difficulties, but a price is paid in the loss of the dumping terminal's services.
the public may be less disturbed if a terminal in a nonpublic area is used. but to the department involved, access to the database is a central part of their work and its loss severely limits their output. in fact, dependence on the automated circulation system by all departments in the library has been swift and universal even though we originally assumed the terminals outside the circulation department would be used sparingly. plans are being made to store personnel records in machine-readable form on diskettes. other developments are being put on a back burner until we have less frequent need for the apples as backups. however, levels, great neck library's youth department, has several apples of its own on which budding "computerniks" practice their art. for them there are few limits to possible applications, perhaps only the outermost boundaries of imagination.

reference

1. joseph covino and sheila intner, "an informal survey of the cti computer backup system," journal of library automation 14:108-10 (june 1981).

computer-to-computer communication in the acquisition process

sandra k. paul: skp associates, new york city.

in the 1970s, we entered the period of computer-to-computer communication; we now appear to have reached the second stage of development. today more than seventy publishers are equipped to receive computer tape orders and input them directly to their order fulfillment systems; twenty-six publishers can produce computer invoices and credits for their customers; six are capable of sending monthly updating information about titles, prices, publication dates, and books declared out of print. all of this, however, is based on a system through which computer tapes are sent from buyer to seller and back via the united states mail. the next step, computer-to-terminal or computer-to-computer communication, is just around the corner.

historical perspective

how did this happen? it started in september 1974 when dewitt c.
("bud") baker, newly appointed president of the baker & taylor company, envisioned the savings his company could find if their customers provided the international standard book number (isbn) on their orders. he also believed that the volume of paper created by the computer was expensive and time-consuming for publishers to handle.

journal of library automation vol. 14/4 december 1981

always a visionary, he believed that computers communicating directly with each other would not only save time and money, but would prevent human errors introduced by research clerks or keypunchers. he invited publishers, booksellers, librarians, wholesalers, representatives of school systems, and others to a full-day meeting at the w. r. grace building. this diverse group of individuals discussed the isbn: what it was, what it might do. by the end of the day, the group defined two areas in which efforts might bear fruit. one was educational: publishers needed to be told the importance of printing valid isbns on their books, in their promotional materials, in their advertising, on any other source of ordering information, and on the invoices and packing lists they send to their customers. wholesalers, librarians, and booksellers needed to be shown the efficiency their use of isbns on orders introduces to the fulfillment process at publishing houses and wholesaler offices. these functions were assigned to an isbn publicity committee, chaired by franklyn ("lee") rodgers of scribner book company. the second function was the design of computer-to-computer formats for orders and invoices that would be keyed to the use of isbn as title identifier and would be industry-wide in scope. this function was undertaken by an isbn data transmission committee, chaired by david wolverton, then of brodart.
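what makes an isbn "valid" in the sense used above can be shown concretely. the isbn of this period is a ten-character number whose last character is a check character computed modulo 11, with "x" standing for the value 10. the following python sketch illustrates that standard check; it is offered as an illustration, not as anything the committees distributed, and the function name isbn10_is_valid is an assumption introduced here.

```python
def isbn10_is_valid(isbn: str) -> bool:
    """Return True if `isbn` passes the ten-character ISBN check.

    Hyphens and spaces are ignored. The characters are weighted
    10, 9, ..., 1 from left to right, and the weighted sum must be
    divisible by 11. An 'X' is allowed only in the last position,
    where it stands for the value 10.
    """
    chars = [c for c in isbn.upper() if c not in "- "]
    if len(chars) != 10:
        return False

    total = 0
    for position, ch in enumerate(chars):
        if ch == "X" and position == 9:
            value = 10
        elif ch in "0123456789":
            value = int(ch)
        else:
            return False
        total += (10 - position) * value
    return total % 11 == 0


if __name__ == "__main__":
    for candidate in ("0-306-40615-2", "0-306-40615-3"):
        print(candidate, isbn10_is_valid(candidate))
```

a check of this kind catches any single mistyped character and any transposition of two adjacent characters, which is the class of error the text attributes to research clerks and keypunchers.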
the isbn publicity committee produced a booklet and posters, distributed them at all major conventions, made press releases available, and prepared articles for inclusion in the newsletters of all of the major industry associations. the committee surveyed the use of isbns by publishers and published a list of in-house contacts for isbns. the committee's program was a success!

format development

the isbn data transmission committee had a more difficult task. the first question they faced was one of basic approach. immediately they decided to proceed with a format for orders rather than invoices. next, they reviewed the level to which the format would be directed. believing it was more appropriate to "crawl before they walked," they decided to develop a format that could be generated on computer tape, which would be mailed from buyer to seller through the united states mail. once there had been experience with that format, work would begin on direct computer-to-computer communication formats and protocols. the final decision related to form. the majority of people volunteering their time to work on this committee came from the major book publishing houses. additional members included two major bookstore chains, b. dalton and waldenbooks; the new york public library (nypl); and three major wholesalers, brodart, baker & taylor, and the ingram book company. representatives from nypl, r. r. bowker company, and the library wholesaling organizations were familiar with american national standard z39.2, bibliographic information interchange on magnetic tape. they felt that this standard, which is the basis for the marc tapes, should also become the basis for an order to be sent on magnetic tape. the majority of the committee, however, was unfamiliar not only with z39.2, but also with the concept of programming for variable length records and/or fields.
these data-processing managers argued that the format basically would consist of sending a quantity and the isbn for each title ordered, not the sending of bibliographic records as such. after review of a strong and well thought out letter from michael malinconico supporting the use of z39.2, the majority held to their decision and the subcommittee, chaired by tom brady, then of baker & taylor, was assigned responsibility for developing the first computer-to-computer order format. it is a fixed length field and record format. debate continued throughout its development. each publisher hoped to have to do minimal programming in order to interface the new format and the input requirements of his or her specific order fulfillment system. provision was made for minimal bibliographic information if an isbn was unknown. polling of the members resulted in decisions to limit author and title to thirty characters, for instance. shortly before the format was approved by the committee, dick fontaine, then sales manager of b. dalton (now president), and dick lieberman, sales manager of random house, decided that they would begin sending tapes in the mail as of january 1975. once the format had been approved, other publishers joined the group. orders were sent from baker & taylor and brodart to random house, john wiley, prentice hall, and doubleday. b. dalton continued to send random house tape orders in a slightly different version of the format. by the end of 1976, it appeared that the task of the isbn publicity committee had become publicizing the order format, rather than the isbn as such. the two committees decided to merge in march 1977, selecting the name book industry systems advisory committee (bisac), as much because it was a pronounceable acronym as for any other reason.
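the idea of a fixed length field and record format can be illustrated with a short python sketch. this is not the bisac order format itself, whose layout is not given in this communication. only the four fields and the thirty-character limit on author and title come from the text; the field order, the five-digit quantity width, the ten-character isbn width, and the names FIELDS, OrderLine, pack_order_line, and unpack_order_line are assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical layout: (field name, width in characters).
# Only the 30-character author and title limits come from the text.
FIELDS = (("quantity", 5), ("isbn", 10), ("author", 30), ("title", 30))
RECORD_LENGTH = sum(width for _, width in FIELDS)


@dataclass
class OrderLine:
    quantity: int
    isbn: str = ""    # may be blank when the ISBN is unknown
    author: str = ""  # minimal bibliographic information
    title: str = ""


def pack_order_line(line: OrderLine) -> str:
    """Render one order line as a record of exactly RECORD_LENGTH characters."""
    if not 0 <= line.quantity <= 99999:
        raise ValueError("quantity does not fit in five digits")
    if len(line.isbn) > 10:
        raise ValueError("isbn is longer than ten characters")
    values = {
        "quantity": f"{line.quantity:05d}",
        "isbn": line.isbn,
        "author": line.author,
        "title": line.title,
    }
    # Truncate overlong text and pad short text with trailing spaces.
    return "".join(values[name][:width].ljust(width) for name, width in FIELDS)


def unpack_order_line(record: str) -> OrderLine:
    """Split a fixed-length record back into its fields by position."""
    if len(record) != RECORD_LENGTH:
        raise ValueError(f"expected {RECORD_LENGTH} characters, got {len(record)}")
    values = {}
    offset = 0
    for name, width in FIELDS:
        values[name] = record[offset:offset + width].rstrip()
        offset += width
    return OrderLine(
        quantity=int(values["quantity"]),
        isbn=values["isbn"],
        author=values["author"],
        title=values["title"],
    )
```

the sketch shows the trade-off the committee debated: every record has the same length and every field sits at a known position, so the receiving program needs very little logic, but any author or title longer than its field is cut off.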
development of an invoice format, originally considered of immediate importance by waldenbooks, had been shelved after that company lost interest and the individual chairing the subcommittee working on the format left the field. the next step became reports of experience with the order format. fields left open for individual use came under review for standardized coding; procedures were developed for the marking of information on the outside of the tapes and paperwork to accompany it; some publishers refused to accept tapes without isbns, while others pondered the procedure to separate titles without isbn from those with and then merge the two for discounting purposes. with all of its inadequacies, the format was working. publishers reported saving up to one week of time in processing tape orders. random house analyzed returns for wrong title or wrong edition one year after they began receiving tape orders from dalton, baker & taylor, and brodart. they found an extraordinary 47 percent decrease in that type of misshipment to those three customers. with a few years' experience under their belt, bisac decided to revise the format to address the inadequacies and problems members had found with it. in addition, the r. r. bowker company and oclc, inc., had just announced their intentions of developing acquisition systems which would place them in the role of "order forwarders." they would prepare orders for other organizations and transmit those orders, on tape or directly online, to the vendor of the organization's choice. this forced bisac to include a field for an "order placer" in addition to the traditional "bill-to" and "ship-to" customer name and address. the revision was approved in february 1977 and called "format #2." (it is this version of the format which has been programmed by publishers and wholesalers noted in the introductory paragraph). bisac members began pressuring b.
dalton to convert from their original "pre-format 1" format to format #2. in analyzing the cost of doing so, dalton also considered the potential saving they would have if invoices from publishers were received on tape. although they have never made those figures public, the potential was so great that jim nermyr, then their vice president of data processing, agreed to chair a subcommittee to develop an invoice format, paralleling the order format as closely as possible. once that was approved, jim made "selling trips" to new york and elsewhere, convincing over forty publishers to program for the invoice format in return for receiving orders on tape in format #2 from dalton. finally, bisac members began expressing concern about misinformation on orders and invoices. typically, when two organizations agreed to communicate using the standardized formats, they would exchange tapes of titles, descriptions, and the appropriate isbn for each. they would produce error reports and the purchasers would bring their computer records in line with the publisher's. however, once price changes occurred, books were made out of print, or publication dates changed for not-yet-published titles, orders carried erroneous information. to resolve this, a subcommittee, chaired by andrew uszak of r. r. bowker company, set about developing what has come to be known as the "title update format." this format allows a publisher to send a monthly tape of all titles on file indicating those fields that have changed since the prior month, or simply to send the isbn and changed field information. six publishers are now sending information in this format to the ingram book company on a monthly basis; others will be doing so shortly. (the format is also the basis for information college textbook publishers send to update the monthly aap microfiche service.)
most recent changes

at its may 1981 meeting, bisac approved minor modifications to its order and invoice formats. these changes included increasing the zip code field to nine digits and specifying a seven-digit field for the standard address number. the invoice format was modified to accommodate its use for sending credits as well as invoices. these new formats were released in august 1981 and are "titled" order format #3 and invoice format #2. we guess that it will take at least a year before the bulk of those organizations now sending tapes in order format #2 and invoice format #1 program for the revisions. during 1980, bisac members began expressing concern that the "crawl before you walk" philosophy had stopped in the crawl stage. baker & taylor and brodart, in particular, expressed concern that the generation of tapes was expensive and using the mails introduced such delays that orders were filled more promptly when phoned into publishers than when sent in the bisac format. in addition, oclc, which had programmed the order format into their new acquisition system, agreed that sending this information on tape would be far less effective than transmitting it online to the major vendors. in 1980, bisac established a subcommittee, chaired by jim long of oclc, inc., to develop an alternative version of the order format. this version, with variable length fields and records, is intended for use in a communication mode between the mainframes of two computers. we expect there will be deep consideration and long debate on this proposed version at bisac meetings in the next few months, with passage expected in 1982. finally, bisac brought its formats to the attention of american national standards committee z39 (ansc z39) when format #2 was completed. at its may 1981 meeting, the committee decided to ask z39 to officially begin work on formalizing both the order format #3 and the new variable-length order format alternative as american national standards.
the z39 program committee and executive council agreed; ernest muro of baker & taylor is chairing the z39 subcommittee charged with this task. as bisac activities became more widely known, this ad hoc committee strained the resources of its volunteer officers in answering requests for information, for copies of the formats, and in preparing and disseminating the minutes of their meetings to an ever-increasing number of interested individuals and organizations. in 1980, bisac approached the book industry study group, inc. (bisg) with the suggestion that they become a permanent committee of that research organization, whose membership also included publishers, librarians, booksellers, and wholesalers, as well as book manufacturers. the bisg agreed and today supports bisac activities through its offices. at the end of this communication is the address which can be used to request any of the formats from the bisg office.

the future

automation is here and here to stay. individual libraries that once considered it impossible to imagine being able to afford a computer now have several, and soon will have more, acquisition systems available to them through independent vendors and through the national bibliographic utilities. wholesalers are gaining computer sophistication, as are publishing houses. as an industry, we are lucky that those volunteer data-processing types who formed bisac and kept it alive were each willing to make the compromises necessary to provide us with a standardized industry-wide format. other industries have not been so lucky, with major vendors using their dollar-volume clout to demand that their customers accept orders in their own, unique formats. however, the need for standardization is known. in 1979 the american national standards institute approved the creation of a new committee, ansc x12, business data interchange. this committee is charged with developing a national
standard format for transmission of orders, invoices, and other transactions related to the sale of merchandise, and the payment for that sale through electronic funds transfer. bisac and ansc z39 have been carefully reviewing the progress this new committee is making. it appears that the formats that result from their efforts will use variable length fields and be sufficiently general in nature to fit the needs of librarians, booksellers, wholesalers, and publishers, along with those involved in the sale and purchase of all other types of commodities. the traditions and laws of this country preclude any organization from "forcing" a library to make use of these standardized formats. however, the cost savings, the guarantee of accuracy of the record received, and the speed with which the order reaches the fulfillment center suggest that these formats will increase in use in the future. we also anticipate that more and more use of the formats will be in an online transmission mode, rather than in the form of computer tapes in the mail. as the volume of transmissions grows, we expect that some day messages from purchasers will be forced into queues to reach the more popular suppliers. to the extent that the major wholesalers provide terminals to their customers and/or facilities to accommodate a large number of transmissions, their queues may be minimal. however, it will be interesting to discover how individual publishers will cope with this situation. readers who are interested in receiving copies of order format #3, invoice format #2, or title update format #1 should write to: book industry study group, inc., 160 fifth ave., new york, ny 10010. there is no charge for these formats. if all three are requested and first class mail is requested, postal costs are billed to the recipient. those interested in active participation on bisac should send a letter to the organization stating that request.
finally, those interested in receiving copies of the minutes of the bimonthly meetings held between september and may should send a request, accompanied by a check for thirty-five dollars, to the bisg address.

hungry for answers in the life sciences? no matter how unusual your search question, the new 1981 biosis search guide can help you find the right answers. in response to a request for references on "people eating insects as food," an information scientist searched biosis previews online: "... i wanted a file with a broad subject coverage and wide range of search options. the first step i took in my search strategy was to use the free-text facility for the word 'edible'. then i used the keywords insect, insecta, and insects, which i identified from the biosis search guide. the guide also led me to several biosystematic codes ..." from a prize-winning biosis search tournament essay. for your online searches of biological and medical research, you need the right words ... the right concepts ... the right codes ... you'll find them all in the: 1981 biosis search guide. the price? just $75.00. for further information, or to place your order, contact: biosis customer services, 2100 arch street, philadelphia, pa 19103. 800-523-4806 or 215-568-4016.

public libraries leading the way
community-driven programming: offering coding and robotics classes in your library
mary carrier
information technology and libraries | june 2023
https://doi.org/10.6017/ital.v42i2.16619
mary carrier (mcarrier@mvls.info) is technology & growth specialist, mohawk valley library system. © 2023.

abstract

mary carrier serves as the technology & growth specialist for the four counties of the mohawk valley library system in schenectady, new york.
prior to this position, mary dedicated over 15 years to teaching digital literacy and technology trends at the clifton park-halfmoon public library, a suburban public library that has over 40,000 registered patrons and 1,500 visitors per day. the community has a strong presence in youth and family programs and is a popular place for teens and children to learn, play, and create. in 2015, she began offering coding and stem classes to children and teens at the library and in the community as outreach programs. mary will share her expertise in technology programming for children and teens and the importance of planning, preparing, and testing curriculum for coding and robotics classes. introduction when you get unsolicited input from the community, it is worth the effort to investigate. rather than say “that will never work here,” take the time to listen. is this the voice of one or of many? at the clifton park-halfmoon public library located south of saratoga springs and north of albany, new york, our 55,000-square-foot library serves two communities with a combined population of 64,575 and hundreds of visitors from the greater capital region. as a digital services trainer my focus was geared toward teaching computer classes to older adults. my pace was slow, steady, and empathic as i translated this world of technology into digestible “bytes” for everyday use. i could easily relate to the foreign language these tools presented since i did not grow up with technology. this community of learners grew steadily between 2007 and 2014, particularly when more and more people were buying e-readers, tablets, and smartphones. i found my niche, i was making a difference, and i was comfortably coasting. i was fortunate to be in a community that used new technology and valued learning. after all, they had free access to a variety of computer classes and one-on-one assistance at their local library. 
the digital literacy foundation grew and was strong in the community after finding the support needed; however, over the years, i was getting antsy as i reached a plateau. my new challenge was passed on to me in a loose net, rather than on a silver platter. a cub scout leader approached me as his son was “aging out” of scouts, that transitional period between fifth and sixth grade. he recommended the library start a coding club, and he was willing to help. coding was an area i knew very little about as a technology discipline, and working with children would be a new challenge for me, but i was ready to jump right in. what started out as a few weeks of experiential learning turned into seven years of coding and robotics classes for third grade through high school students, with the addition of programs for kindergarten through second graders in later years. the process to initiate a coding club started in 2015 with research and fact finding. i gathered information about computer science and coding, including the benefits of learning computer science at a young age, methods to teach these skills, and websites and resources to help with the development of the curriculum. with data in hand regarding resources, benefits, and age recommendations, my supervisor and i sat down with the parent and brainstormed the best way to approach offering a coding club that would be beneficial to the community and manageable from our perspective. we decided to offer a six-week series and called the club code crew. fifth through eighth graders were invited to register for this two-hour after-school program with a goal to create projects in scratch (https://scratch.mit.edu/), which uses block programming. each team of two to three students worked on a project to highlight the benefits of using the library.
teams created interactive stories about finding a favorite book, checking out materials, and highlighting the library’s programs, such as story time. at the end of the six weeks, each team presented their project to an audience of family members and staff. the goal was to learn how to code in scratch, create a final product to promote the library, collaborate, and have fun. code crew was just the beginning. from there, our library offered a variety of series and one-time classes in scratch, scratch jr., python, css/html, and javascript. here’s what i learned along the way… planning your program set your goal to offer coding curriculum to children, teens, and/or adults. some of our main goals were to emphasize the importance of computer science, to allow children the opportunity to create versus consume computer games, and to promote collaboration and confidence. research what programs and resources are already available and where in the community. check with local schools from elementary to high school, continuing education programs, youth organizations such as the ymca, and summer camps in your local community. if there are other programs currently offered, who are the participants in the program? are there restrictions or limitations? for example, is coding offered only to accelerated students or available to all children? is it restrictive due to participation fees? this research will indicate if there is a local offering. a free program at the library is more inclusive. design based on what you investigated, would this work in your community? in your design, who is your audience, what skills will they need, and what is the recommended age based on your review? what is your capacity for a class, or class size? consider the number of computers or laptops and the number of students that is manageable based on the age of your audience. who will run the class? do you have staff that are interested and available, or do you need to outsource this program? 
how much do you need to budget? we were fortunate that the parent with the original idea had a programming background. he started as a volunteer and quickly became a contractor as we added more and more classes.

investigate program content

reviewing coding websites, curriculum, or lesson plans can help with successful preparation. generally, these three websites are popular, well-known coding websites. more specifically, they are my top picks because they appeal to different learning styles:
• scratch coding: https://scratch.mit.edu/
• code.org: https://code.org/
• google cs first: https://csfirst.withgoogle.com/s/en/home

there are tutorials that include written instructions, video instruction, and exercises for different skill levels from beginner to advanced for students. these tutorials all include beginner block programming that uses a drag-and-drop method to click blocks of code together. children are guided through colorful, themed tutorials in logical order for learning. by creating a user account, students can save all their projects and progression for another class or to continue at home. each website has different benefits. i started with scratch (https://scratch.mit.edu/) as it is the most widely used (created in 2007), has an unforgettable logo (scratch the cat), and provides helpful tutorials to use in class. after learning the basics of moving scratch, adding loops to repeat actions, and changing the backgrounds and sprites (characters), children are ready to make their first pong game. animating sprites to talk to each other and change scenes is the next fun basic element to try. participants love seeing the conversations develop and sharing their stories with each other.
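
as a rough text-based analogue of the blocks children combine for a first pong game (a repeat loop, a move step, and an "if on edge, bounce" check), here is a minimal python sketch. it is an illustration only, not part of the library's curriculum or of scratch itself; the function name bounce and its parameters are hypothetical.

```python
def bounce(position, velocity, width, steps):
    """Move a ball along a line and reverse it at either edge, Pong-style."""
    path = []
    for _ in range(steps):                      # like a "repeat" block
        position += velocity                    # like a "move" block
        if position <= 0 or position >= width:  # like "if on edge, bounce"
            velocity = -velocity
        path.append(position)
    return path


# Ball starts at the left edge of a 6-unit-wide court and moves 2 units per step.
print(bounce(position=0, velocity=2, width=6, steps=6))  # [2, 4, 6, 4, 2, 0]
```

the same three ideas (repeat, move, check a condition) carry over when students later switch from blocks to a text language.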
we used code.org (https://code.org/) to help promote the benefits of computer science by showing memorable videos in class that have celebrities, sports figures, and social media creators expressing their positive experience of learning to code. code.org promotes an hour of code challenge every year during computer science education week, which is the first week of december. kids love this website because it uses popular and current themes, such as characters from frozen, star wars, and minecraft. google cs (computer science) first (https://csfirst.withgoogle.com/s/en/home) launched in 2013 and is a website used for classroom-style collaboration. as an instructor, you can choose to use google classroom to encourage sharing and collaboration among participants, and to review an individual’s progress. the exercises and videos are incredible and support learning that continues to build skills with exercises defined as beginner, intermediate, and advanced. it uses scratch block programming with easy-to-follow lessons to use with a class or for self-paced learning.

design and test

design the curriculum and lesson plan for the class. think project-based. plan an introductory exercise that builds to a project. do a test run. bring in a few volunteer students and run through the lesson plan or run a pilot program and adjust based on the results. consider skill level and attention span. how will you accommodate this? i learned to be prepared with additional projects to offer students who finish early. give them time to explore. ask students to help each other and stick to completing the project. the project should be fun and easy enough to get them excited about coding.
assess and adjust

coding is not for everyone. assess what went well and what you may want to change. don’t take it personally if students didn’t like it or don’t return. some children and teens feel overwhelmed when they feel their skill levels are below others. encourage them to watch the video tutorials and start with the basics, like google cs first step-by-step video tutorials.

figure 1. google cs first lessons.
figure 2. scratch for cs first.

after introducing and running a coding class for a while, you will have a built-in following and enthusiasm in your community. a natural progression is to start working with robots. the coding that has been successfully run on a computer screen can now be applied to an object. children have learned the general concepts of block programming, which is directly applicable to the coding used to control robots. loops, conditionals, and various commands can be programmed to bring the robot to life. robots love to dance, do obstacle courses, race, and “dress up.” three types of robots are used for classes at the clifton park-halfmoon public library. each of these robots is compatible with block programming such as scratch and object-oriented programming languages such as python.

figure 3. robots used for classes.

finch robots

we purchased the finch robots (https://www.birdbraintechnologies.com/) first because they were reasonably priced, below $100. they came with a long cable that tethers to a laptop, and each is coded through a free download of scratch 2.0. newer finch models are available and use a bluetooth connection. there are video tutorials and lesson plans available on the birdbrain technologies website. these can be used for grades k-12.

dot & dash robots

there are apps, challenge cards, and curricula available on the wonder workshop website (https://www.makewonder.com/). these robots are best for grades k-8.
makeblock – mbot

makeblock includes activities and lesson plans for pre-k-12 (https://education.makeblock.com/). one of the most important components of working with robots is to research what is needed to run the robot. typically, there is an app or download required. whether you use a laptop, chromebook, or tablet, planning and preparation are key. we purchased six kindle fire tablets to use with the mbots and found that the blockly app needed was not available as a download. i was able to run a hack to allow me to download the google play store app, but with a little more research, this could have been avoided. give yourself time before offering a program to test, test, test. and even then, you will adjust and learn along the way.

figure 4. teams working together to set up an obstacle course for the finch robot.

practical tips
• start with tutorials.
• use a pilot group of kids/teens.
• don’t be afraid. you don’t have to be an expert.
• encourage the class to help each other and to present their projects.
• utilize students as classroom helpers as they gain confidence.
• continuously assess and adjust your plan.
• growth takes time; keep building the program, and move forward as long as there is interest.

the library community’s strong presence and support for youth and family programs made this programming endeavor successful. as interest in the program grew, i hired two part-time librarians and a parent as contractors. the community response allowed us to develop curricula for k-2, 3-5, and 6-adult.

figure 5. chart shows the breakdown of attendance by programming topic.
scratch was targeted for third to fifth graders, ready-set-code for kindergarten to second graders, and python programming for fifth graders to adult. total attendance for children and teens for coding and robotics programs:
• 2016: 877
• 2017: 1,069
• 2018: 1,244
• 2019: 921

and this reach extended beyond the library walls. outreach was offered through afterschool enrichment programs and participation in our annual district-wide science and health discovery night, a showcase of science, technology, engineering, math, and health exhibits and interactive demos for all ages. volunteer exhibitors from a variety of companies and organizations participate and draw an audience of more than 3,000. we also allowed six out of sixteen of our finch robots to be borrowed by teachers and community members for at-home learning and play. our success was worth the effort. there is nothing better than when you hear the squeals of delight, see the pride in the creator, and witness collaboration.

information technology and libraries | december 2007
editorial: farewell and thank you
john webb

this issue of information technology and libraries (ital), december 2007, marks the end of my term as editor. it has been an honor and a privilege to serve the lita membership and ital readership for the past three years. it has been one of the highlights of my professional career. editing a quarterly print journal in the field of information technology is an interesting experience.
my deadlines for the submission of copy for an issue are approximately three and a half months prior to the beginning of the month in which the issue is published; for example, my deadline for the submission of this issue to ala production services was august 15. therefore, most articles that can appear in an issue were accepted in final form at least five months before they were published. some are older; one was a baby at only four months old. when one considers the rate of change in information technologies today, one understands the need for blogs, wikis, lists, and other forms of professional discourse in our field. what role does ital play in this rapidly changing environment? for one, unlike these newer forms, it is double-blind refereed. published articles run a peer review gauntlet. this is an important distinction, not least to the many lita members who work for academic institutions. it may be crass to state it so baldly, but publication in ital can help one earn tenure, an old-fashioned fact of life. it is indexed or abstracted in nineteen published sources, not all of them in english. many of its articles appear in various digital repositories and archives, and these also are harvested or indexed or both. in addition, its articles are cataloged in worldcat local. many of lita’s most prominent members, your distinguished peers, have published articles in ital. the journal also serves as a source for the wider dissemination of sponsored research, a requirement of most grants. and you can read it on the bus or at the beach (heaven forbid!), in the brightest sunlight, or with a flashlight under the covers (though there are no reports of this ever having been observed). i am amazed at how quickly these three years have passed, though that may be at least as much a function of my advanced age as of the fun and pleasure i have had as editor. certainly, these past three years have hosted some notable landmarks in our history.
lita and ital both celebrated their fortieth anniversaries. sadly, the death of one of lita’s founders and ital’s first editor, frederick g. kilgour, on july 31, 2006, at age ninety-two, was a landmark in the passing of an era. oclc and rlg’s merger, which fred lived to witness, was a landmark of a different sort, one of maturity, we hope. ital is now an electronic as well as a print journal. this conversion has had some rough passages, but i trust these will have been ironed out by the time you read this. when i became editor, i had a number of goals for the journal, which i stated in my first editorial in march 2005. reading that editorial today, i realize that we successfully accomplished the concrete ones that were most important to me then: increasing the number of articles from library and i-school faculty; increasing the number that result from sponsored research; increasing the number that describe any relevant research or cutting-edge advancements; increasing the number of articles with multiple authors; and finding a model for electronic publication of the journal. the accomplishment of the most abstract and ambitious goal, “to make ital a destination journal of excellence for both readers and authors,” only you, the readers and authors, can judge. i thank mary taylor, lita executive director, and her staff for all of the support they provided to me during my term. i owe a debt that i can never repay to all of the staff of ala production services who worked with me these past three years. their patience with my sometimes bumbling ways was award-winning. thank all of you. the lita presidents and other officers and board members were unfailingly supportive, and i thank you all. in the lita organizational structure, the ital editor and the editorial board report to the lita publications committee, and the editor is a member of that body. i thank all of the chairs and other members of that committee for their support.
once more, and sadly for the last time, i thank all of the members of the ital editorial board who served during my term for their service and guidance. they perform more than their share of refereeing, but more importantly, as i have written before, they are the junkyard dogs who have kept me under control and prevented my acting on my worst instincts. i say again, you, the lita membership and ital readership, owe them more than you can ever guess. trust me. to marc truitt, ital managing editor and the incoming ital editor for the 2008–2010 volume years, i must say, “thank you, thank you, thank you!” marc and the ala production services staff were responsible for the form, fit, and finish of the journal issues you received in the mail, held in your hands, and read under the covers. finally, most of all, thank you authors whose articles, communications, and tutorials i have had the privilege to publish, and you whose articles have been accepted and await publication.

john webb (jwebb@wsu.edu) is a librarian emeritus, washington state university, and editor of information technology and libraries.

not only is this the end of my term as editor, but i also have retired. from now on, my only role in the field of library and information technology will be as a user. those of you who have seen the movie the graduate probably remember the early scene when benjamin, the dustin hoffman character, receives the single word of advice regarding his future: “plastics.” (i don’t know if that scene is in the novel from which the movie was adapted.) my single word of advice to those of you too young or too ambitious to retire from our field is: “handhelds.” i am surprised that my treo is more valuable to me now in retirement than it was when i was working. (i’m not surprised that my ipod video is, nor that word thinks that treo and ipod are misspellings.)
i just wish that more of the web was as easily accessible on my treo as are google maps and almost all of yahoo!. handhelds. trust me. guest editorial clifford lynch information technology and libraries | march 2012 3 congratulations lita and information technology and libraries. since the early days of the internet, i’ve been continually struck by the incredible opportunities that it offers organizations concerned with the creation, organization, and dissemination of knowledge to advance their core missions in new and more effective ways. libraries and librarians were consistently early and aggressive in recognizing, seizing, and advocating for these opportunities, though they’ve faced—and continue to face—enormous obstacles ranging from copyright laws to the amazing inertia of academic traditions in scholarly communication. yet the library profession has been slow to open up access to the publications of its own professional societies, to take advantage of the greater reach and impact that such policies can offer. making these changes is not easy: there are real financial implications that suddenly seem very serious when you are a member of a board of directors, charged with a fiduciary duty to your association, and you have to push through plans to realign its finances, organizational mission, and goals in the new world of networked information. so, as a long-time lita member, i find it a great pleasure to see lita finally reach this milestone with information technology and libraries (ital) moving to fully open-access electronic distribution, and i congratulate the lita leadership for the persistence and courage to make this happen. it’s a decision that will, i believe, make the journal much more visible, and a more attractive venue for authors; it will also make it easier to use in educational settings, and to further the interactions between librarians, information scientists, computer scientists, and members of other disciplines. 
on a broader ala-wide level, ital now joins acrl’s college & research libraries as part of the american library association’s portfolio of open-access journals. supporting ital as an open-access journal is a very good reason indeed to be a member of lita. clifford lynch (clifford@cni.org) is executive director, coalition for networked information.

public libraries leading the way
the democratization of artificial intelligence: one library’s approach
thomas finley
information technology and libraries | march 2019
thomas finley (tfinley@friscotexas.gov) is adult services manager, frisco public library.

chances are that before you read this article, you probably checked your email, used a mapping app to find your way, or typed a search term online. without your even perceiving it, artificial intelligence (ai) has already helped you to accomplish something today. email spam filters use variants of ai to help cut down on harmful or useless emails in your inbox.1 with ai doing the fact-crunching, mapping apps quickly preview the best route based on a myriad of factors. search engine companies like google have been using ai to suggest or produce results faster, and for longer than anyone outside of the company really knew until recently.2 according to a recent study by northeastern university and gallup, 85% of americans are already using ai products.3 the true revelation behind these recent technological developments may not be the fact that ai is already embedded into the fabric of our modern lives. the real surprise might just be the sudden ubiquitous availability (and approachability) of ai tools for all.
as google’s former chief scientist of ai and machine learning, fei-fei li, said in 2017, “the next step for ai must be democratization, lowering the barriers of entry, and making it available to the largest possible community of developers, users and enterprises.”4 this sounds a lot like most public libraries’ mission statements. as with other important workforce development efforts, libraries are uniquely placed to participate in this new revolution as key platforms for the discovery and dissemination of emerging tech knowledge. at the frisco public library (https://www.friscolibrary.com), we saw this ai trend surfacing, we see ai as a critical future job skill, and we investigated ways to introduce our patrons into this space. as such, the frisco public library has leveraged readily available technology in a cost-effective way that has engaged community interest. our efforts are also replicable and scalable in terms of multi-nodal experiences both at home and in classroom-based learning.

some basic definitions

let’s take a few steps back to give some broad definitions and boundaries to the scope of ai. according to the oxford english dictionary, artificial intelligence is “the capacity of computers or other machines to exhibit or simulate intelligent behavior.”5 in the literature, you will find a further distinction between general ai, narrow ai, and something called machine learning.6 general ai is something that begins to look like science fiction: an artificial intelligence that learns how to learn, then is able to generalize what it has learned and apply that knowledge to a different case. in advanced examples of general ai, scientists are thinking of not putting a specific problem in front of a general ai program to solve; rather, they are giving it an entire dataset so the program itself can choose what problems it should work on.
this removes the limited point of view of whoever programs the program.7 narrow ai is easier to understand because it is what we interact with the most in our day-to-day lives. it is what powers those little speed-ups that help us do things faster every day: search through our emails to help us avoid spam, translate speech to text when we dictate a message on a smartphone, or help to parallel park a car at the touch of a button. narrow ai accomplishes a specific task extremely fast and accurately, and thus becomes an extension and multiplier of our own human productivity. a lot of these narrow ai activities are based in a type of artificial intelligence called machine learning (ml). ml is a set of very complex processes that can review large sets of information; create and train models based on this data; make predictions of what will happen next; and then refine that data for better future results.8 machine learning is the focus of our efforts at the frisco public library for two main reasons: 1) it is what has been made available through free tools such as google’s open ai resources; and 2) it makes ai attainable in a library setting.

our approach: makerspaces for everyone, at home

the frisco public library has had four years of success with circulating makerspace technology in reasonably priced, hard-shell waterproof boxes with foam inserts. each kit is cataloged, rfid tagged, security tagged, and sealed with zip ties to enable self-checkouts (zip ties can be easily cut open at home, but prevent items from disappearing in the library). these cases are easy to handle and can take some abuse while protecting their contents. this is important because we circulate about 20 different kinds of robotics kits, no-soldering circuitry kits, 3d scanning kits, programming kits, and internet of things kits.
most kits contain the theme item with quick start guides, instruction booklets, and a book to inspire advanced learning. we call these maker kits, and we have about 150 total. in our community, they are wildly popular and have circulated more than 4,000 times since their introduction in january 2016.9

aiy: artificial intelligence kits for everyone

in 2017, google released their maker-focused aiy voice project kit (where aiy is a catchy substitute for do-it-yourself with artificial intelligence yourself). the kit consists of several components that pair a raspberry pi (an entry-level computer) with a small speaker, housed in a cardboard box with a button prominently placed on top.10 the result is a stripped-down version of an amazon echo or google home device, essentially a smart speaker. although the aiy voice kit is not necessarily initially set up to play music, it is designed to take voice commands like the other products on the market. with a minimum of python coding expertise, aiy kits enable mass participation in artificial intelligence. there isn’t even any soldering required to put this kit together! this is 100% in line with fei-fei li’s (google’s former chief scientist for ai and ml) remarks about the need to democratize ai. google has since released another kit called aiy vision that uses similar components paired with a camera. more information on the kits can be found at https://aiyprojects.withgoogle.com/.

frisco public library’s artificial intelligence maker kits

based on our previous experience with other maker kits, we made a few modifications to the original google design that most librarians with access to a 3d printer can accomplish. the original aiy voice kit uses a punch-out cardboard box to fold and envelop the device.
apart from being an extremely cost-effective way of making a box, it also seems like there is delicious irony (and message) in contrasting cardboard, a cheap, widely available material, with the advanced tech of ai. durability being our priority, we knew we needed to upgrade this aspect of google’s original design. our maker librarian, adam lamprecht, quickly found a shared design file uploaded to the website www.thingiverse.com, which he modified to better suit our needs (see figure 1).11

figure 1. ai maker kits with 3d printed aiy voice device.

we then printed these in a variety of colors on our 3d printers and modified the grid-patterned foam inserts to make room for the device and a few other items (see figure 2). we are currently circulating 21 of these kits without major incident.

figure 2. interior view of the kits.

library instruction: python as a window onto artificial intelligence

our basic artificial intelligence classes have been key in the introduction of this technology to the public. we reserve 10 kits for a class and pair them with classroom laptops for ease of use. the class provides a short introduction to the technology and then walks participants through a basic voice recognition coding challenge. all of this is accomplished in python. python is great for beginning coders because it is easier to learn than other programming languages, takes less time to write lines of code, and can telescope up into a very large number of projects and applications.12 in fact, according to neal ford, director and software architect at thoughtworks, python “is very good at solving bigger kinds of problems.”13 so with python, a beginning learner has a programming language that continues to be useful beyond the classroom and into the world of work or school.
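as a rough illustration of the kind of voice recognition coding challenge described above, the sketch below maps a phrase that has already been transcribed to text onto a reply. it is not the library’s actual class material and it does not call the aiy kit’s own python library: the names `COMMANDS`, `normalize`, and `choose_reply`, the command phrases, and the replies are all hypothetical, and the speech-to-text step is assumed to happen elsewhere.

```python
# Hypothetical sketch: speech-to-text is assumed to have already produced `heard`.
# Nothing here calls the AIY kit's library; names and phrases are illustrative only.

COMMANDS = {
    "hello": "hello! how can i help?",
    "what is your name": "i am a voice kit from the library.",
    "goodbye": "goodbye!",
}


def normalize(heard: str) -> str:
    """Lowercase the transcript and drop punctuation so matching is forgiving."""
    kept = [ch for ch in heard.lower() if ch.isalnum() or ch.isspace()]
    return " ".join("".join(kept).split())


def choose_reply(heard: str) -> str:
    """Return the reply for the first known command phrase found in the transcript."""
    text = normalize(heard)
    for phrase, reply in COMMANDS.items():
        if phrase in text:
            return reply
    return "sorry, i did not understand that."


if __name__ == "__main__":
    for sample in ["Hello!", "What is your name?", "Play some jazz"]:
        print(sample, "->", choose_reply(sample))
```

a beginner can extend the exercise by adding entries to the dictionary, which keeps the matching logic untouched while the list of recognized commands grows.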
python provides another important advantage: “python provides the front-end method of how to hook into google’s open ai,” states tech writer serdar yegulalp.14 it is this combination of a free, accessible coding language with the powerful (and also free) resources of google’s open ai that truly lowers the barrier to entry for anyone interested in a hands-on experience with artificial intelligence.

lessons learned

the ai maker kits are, by far, our most complicated circulating kits. we are hearing back from patrons that the kits are right on the mark. our users get it: they see the power in getting access to these ai tools (utilizing python) and, by all accounts thus far, are happy with their results. there has been a gap, however, between what library staff expected and what an ai kit can reasonably accomplish. adam lamprecht reports, “staff members had the expectation that perhaps with this kit, a rookie coder was going to be able to jump directly into developing deep learning neural networks (a very advanced subset of artificial intelligence) and so we definitely benefited from ongoing discussions of those broad ai terms and expectations.”15

google’s aiy voice is a good start, but there is plenty of room to give our ai classes more depth. aiy vision is the next logical step and would allow us to enter the world of basic image recognition. our approach does rely on one company’s platform, but there are now more platforms for exploring ai. one of these is amazon’s machine learning offering on aws (amazon web services). these services have recently been opened up to a wider audience, and amazon is now offering everyone the same online courses it uses to train its own engineers.16 the aws ml resources are currently behind paywalls, but access to the training alone could be powerful for the right learner.
there are even interesting developments for younger learners in ai with robotics. anki (www.anki.com) is a consumer robotics company that uses ai to enliven its products. they released vector in 2018: a seemingly simple toy that responds to its environment and simple commands with the aid of ai. with the release of their software development kit, the company is allowing others under the hood of their robots, which potentially means an entry point for autonomous (or semi-autonomous) robotic vehicle technology powered by ai.

what is clear is that the world of ai is already upon us. public libraries are well positioned to help meet the challenge of developing the workforce of the near and far future, with ai classes being a vital tool. the doorway to artificial intelligence is now open; the only question that remains is this: do you step through it?

references

1 cade metz, “google says its ai catches 99.9 percent of gmail spam,” wired, july 9, 2015, https://www.wired.com/2015/07/google-says-ai-catches-99-9-percent-gmail-spam/.
2 jack clark, “google turning its lucrative web search over to ai machines,” bloomberg business, october 26, 2015, https://www.bloomberg.com/news/articles/2015-10-26/google-turning-its-lucrative-web-search-over-to-ai-machines.
3 rj reinhart, “most americans already using artificial intelligence products,” gallup, march 6, 2018, https://news.gallup.com/poll/228497/americans-already-using-artificial-intelligence-products.aspx.
4 scot petersen, “google joins chorus of cloud companies promising to democratize ai,” eweek, march 10, 2017, ebscohost academic search complete.
5 “artificial intelligence, n,” oed online, december 2018, oxford university, accessed march 1, 2019.
6 bernard marr, “what is the difference between artificial intelligence and machine learning?,” forbes, december 6, 2016, https://www.forbes.com/sites/bernardmarr/2016/12/06/what-is-the-difference-between-artificial-intelligence-and-machine-learning/#6d40eeec2742.
7 lex fridman, “juergen schmidhuber: godel machines, meta-learning, and lstms,” mit ai podcast, december 22, 2018.
8 serdar yegulalp, “what is tensorflow? the machine learning library explained,” infoworld, june 6, 2018, https://www.infoworld.com/article/3278008/tensorflow/what-is-tensorflow-the-machine-learning-library-explained.html.
9 frisco public library, “unpublished maker kit statistics 2016-2019,” 2019.
10 “aiy projects: voice kit,” google, accessed december 15, 2018, https://aiyprojects.withgoogle.com/voice/.
11 adam lamprecht, “google aiy voice box,” thingiverse, accessed february 14, 2019, https://www.thingiverse.com/thing:3247685.
12 elena ruchko, “why learn python? here are 8 data-driven reasons,” dbader.org, accessed february 14, 2019, https://dbader.org/blog/why-learn-python.
13 christina cardoza, “the python programming language grows in popularity,” sd times, june 15, 2017, https://sdtimes.com/artificial-intelligence/python-programming-language-grows-popularity/.
14 yegulalp, “what is tensorflow? the machine learning library explained.”
15 adam lamprecht, email message to the author, february 15, 2019.
16 mallory locklear, “amazon opens up its internal machine learning training to everyone,” engadget, november 26, 2018, https://www.engadget.com/2018/11/26/amazon-opens-internal-machine-learning-training/.
articles

use of language-learning apps as a tool for foreign language acquisition by academic libraries employees

kathia ibacache

information technology and libraries | september 2019 22

kathia ibacache (kathia.ibacache@colorado.edu) is the romance languages librarian at the university of colorado boulder.

abstract

language-learning apps are becoming prominent tools for self-learners. this article investigates whether librarians and employees of academic libraries have used them and whether the content of these language-learning apps supports foreign language knowledge needed to fulfill library-related tasks. the research is based on a survey sent to librarians and employees of the university libraries of the university of colorado boulder (ucb), two professional library organizations, and randomly selected employees of 74 university libraries around the united states. the results reveal that librarians and employees of academic libraries have used language-learning apps. however, there is an unmet need for language-learning apps that cover broader content including reading comprehension and other foreign language skills suitable for academic library work.

introduction

the age of social media and the advances in mobile technologies have changed the manner in which we connect, socialize, and learn. as humans are curious and adaptive beings, the moment mobile technologies provided apps to learn a foreign language, it was natural that self-regulated learners would immerse themselves in them. language-learning apps’ practical nature, as an informal educational tool, may attract self-learners such as librarians and employees of academic libraries to utilize this technology to advance foreign language knowledge usable in the workplace.
the academic library employs a wide spectrum of specialists, from employees offering research consultations, reference help, and instruction, to others specialized in cataloging, archival, acquisition, and user experience, among others. regardless of the library work, employees utilizing a foreign language possess an appealing skill, as knowing a foreign language heightens the desirability of employees and strengthens their job performance. in many instances, librarians and employees of academic libraries may be required to have reading knowledge of a foreign language. therefore, for these employees, acquiring knowledge of a foreign language might be paramount to delivering optimal job performance.

this study aims to answer the following questions: 1) are librarians and employees of academic libraries using language-learning apps to support foreign language needs in their workplace? and 2) are language-learning apps addressing the needs of librarians and employees of academic libraries? for purposes of this article, mobile language apps are those accessed through a website, and apps downloaded onto portable smartphones, tablets, desktops, and laptops.

use of language-learning apps | ibacache 23 https://doi.org/10.6017/ital.v38i3.11077

background

mobile-assisted language learning (mall) has a user-centered essence that resonates with users in the age of social media. librarians and employees of academic libraries needing a foreign language to fulfill work responsibilities are a target group that can benefit from using language-learning apps. these apps provide a multifaceted capability that offers time and space flexibility and adaptability that facilitates the changeable environment favored by self-learners. kukulska-hulme states that it is customary to have access to learning resources through mobile devices.1 in the case of those individuals working in academic libraries, language-learning apps may present an opportunity to pursue a foreign language accommodating their self-learning style, time availability, space, and choice of device.

considering the features of language-learning apps, some have a more personal quality where the device interacts with one user, while other apps emulate social media characteristics, connecting a wide array of users. for instance, users learning a language through the hello talk app can communicate with native speakers all around the world. through this app, language learners can send voice notes and corrections to faulty grammar, and use the built-in translator feature. therefore, language-learning apps may not only provide self-learners a vehicle to communicate remotely, but also to interact using basic conversational skills in a given language. in the case of those working in academic libraries, this human connectedness among users may not be as relevant as the interactive nature of the device, its mobility, the convenience of the virtual learning, and the flexibility of the mobile technology.

kukulska-hulme notes that the ubiquity of mobile learning is affecting the manner in which one learns.2 although there is abundant literature referring to mobile language technologies and their usefulness in students’ language learning at different school levels, including higher education, scholarship regarding the use of language-learning apps by professionals is scarce.3 broadbent refers to self-regulated learners as those who plan their learning through goals and activities.4 the author concurs that to engage in organized language learning through a language-learning app, one should have some level of organizational learning or, at a minimum, enough motivation to engage in self-learning.
in this context, some scholars believe that the level of self-management of learning will determine the level of learning success.5 moreover, learners who possess significant personal learning initiative (pli) have the foundation to accomplish learning outcomes and overcome difficulties.6 pli may be one factor affecting learners’ motivation to learn a language in a virtual environment and away from the formal classroom setting. this learning initiative may play a significant role in the learning process, as it may influence the level of engagement and positive learning outcome.

in terms of learning outcomes, language software developers may also play a role by adapting and broadening content based on learning styles and considering the elements that would provide a meaningful user experience. in this sense, bachore conveys that there is a need to address language-learning styles when using mobile devices.7 bachore also notes that as interest in mobile language learning increases, so do the different ways in which mobile devices are used to implement language learning and instruction.8 similarly, louhab refers to context dimensions as the parameters in mobile learning that consider learners’ individuality in terms of where the learning takes place, individual personal qualities and learning needs, and the features of their mobile device.9 bradley also suggests that learning is part of a dialogue between the learners and their devices as part of a sociocultural context where thinking and learning occur.10 in addition,
bradley infers that users are considered when creating learning activities and when improving them.11 for these reasons, some researchers address the need to focus on accessibility and developing content designed for different types of users, including differently abled learners.12 furthermore, adaptation according to the learner’s style may be considered a pivotal quality of language-learning apps as software developers try to bridge the gap between formal instruction and a learner-oriented mobile learning platform. undoubtedly, the technological gap, which includes the cost of the device, interactivity, screen size, and input capabilities, among others, matters when implementing language learning supported by mobile technologies.

however, learning style is only one aspect of the equation. a learner’s need is another. for example, the needs of a learner who seeks to acquire familiarity with a foreign language because of an upcoming vacation may be substantially distinct from the needs of professionals such as academic librarians, who may need reading, writing, or even speaking proficiency in a given language. a user-centered approach in language-learning software design may advance the adequacy of these apps, connecting them with a much wider set of learning needs.

when referring to mobile apps for language learning, godwin-jones asserts that while the capability of devices is relevant, software development is paramount to the educational process.13 therefore, language-learning software developers may consider creating learning activities that target basic foreign language-learning needs and more tailored ones suitable for people who require different content.
kukulska-hulme refers to “design for learning” as creating structured activities for language learning.14 although language-learning apps appear to rely on learning activities built on basic foreign language learning needs, these apps should rely more on learners’ evaluative insights to advance software development that meets the specific needs of learners. although mobile technologies as a general concept will continue to evolve, their mobile nature will likely continue focusing on user experience, satisfying those who prefer the freedom of informal learning.

methodology

instrument

the author used a 26-question qualtrics survey approved by the institutional review board at the university of colorado boulder (ucb). the survey was open for eight weeks and received 199 total responses. however, the number of responses to each question varied depending on the question. the data collected was both quantitative and qualitative in nature, seeking to capture respondents’ perspectives and measurable data that could be used for statistics. the survey consisted of twelve general questions for all respondents who reported working in an academic library, then branched into either nine questions for respondents who had used a language-learning app or five questions for those who had not. the respondents answered via text fields, standard single and multiple-choice questions, and a single-answer likert matrix table. qualtrics provided a statistical report, which the author used to analyze the data and create the figures.

participants

the survey was distributed through an email to librarians and employees of ucb’s university libraries.
the author also identified 74 university libraries in the united states from a list of members of the association of research libraries, and distributed the survey via email to ten randomly selected library employees from each of these libraries.15 the recipients included catalogers, subject specialists, archivists, and others working in metadata, acquisition, reference, and circulation. in addition, the survey was distributed to the listservs of two library organizations: the seminar on the acquisition of latin american library materials (salalm) and reforma, the national association to promote library and information services to latinos and the spanish speaking. these organizations were chosen due to their connection with foreign languages.

results

use of foreign language at work

of the respondents, 172 identified as employees of academic libraries (66 percent). of these, a significant percentage reported using a foreign language in their library work. the respondents belonged to different generational groups. however, most of the respondents were in the age groups of 30-39 and 40-49 years old. the respondents performed a variety of duties within the categories presented. due to incomplete survey results, varying numbers of responses were collected for each question. therefore, of 110 respondents, 82 identified their gender as female. in addition, of 105 respondents, 62 percent reported being subject specialists, 56 percent worked in reference, 54 percent identified as instruction librarians, 30 percent worked in cataloging and metadata, 30 percent worked in acquisition, 10 percent worked in circulation, 2 percent worked in archiving, and 23 percent reported doing “other” types of library work.

figure 1. age of respondents (n=109).

figure 2. foreign language skills respondents used at work (multiple responses allowed, n=106).
figure 1 values: 20-29 years old, 9.17%; 30-39 years old, 29.36%; 40-49 years old, 30.28%; 50-59 years old, 12.84%; 60 years or older, 18.35%.

figure 2 values: reading, 102; writing, 65; speaking, 49; listening, 49.

as shown in figure 2, respondents used different foreign language skills at work. however, reading was used with significantly more frequency. when asked, “how often do you use a foreign language at work?” 38 respondents out of 105 used it daily, 29 used it weekly, and 21 used it monthly.

in addition, table 1 shows that a large percentage of respondents noted that knowing a foreign language helped them with collection development tasks and reference services. however, the respondents who chose “other” stated in a text field that knowing a foreign language helped them with translation tasks, building management, creating a welcoming environment, attending to foreign guests, communicating with vendors, researching, processing, and having a broader perspective of the world emphasizing empathy. these respondents also expressed that knowing a foreign language helped them to work with materials in other languages and on digital humanities projects, and to offer library tours and outreach to the community.

type of librarian work          expressed benefit (%)
collection development          61.5
reference                       57.6
communication                   56.7
instruction                     41.3
cataloging and metadata         41.3
acquisition                     40.3
other                           19.2

table 1. types of librarian work benefiting from knowledge of a foreign language (multiple responses allowed, n=104).

figure 3. languages respondents studied using an app (multiple responses allowed, n=51).

as shown in figure 3, spanish was the most prominent language studied. french and portuguese were each studied by 13 of the 51 respondents.
additionally, respondents stated in the text field “other” that they have also used these apps to study english, mandarin, arabic, malay, hebrew, swahili, korean, navajo, turkish, russian, greek, polish, welsh, indonesian, thai, and tamil. regardless, apps were not the sole means for language acquisition. some respondents specified using books, news articles, pimsleur cds, television shows, internet radio, conversations with family members and native speakers, formal instruction, websites, dictionaries, online tutorials, audio tapes, online laboratories, flashcards, podcasts, movies, and youtube videos.

figure 3 values: spanish, 22; french, 13; portuguese, 13; german, 9; italian, 8; japanese, 5; other, 26.

over a third of 49 respondents used a language-learning app for 30 hours or more, and less than a quarter used it for 11-30 hours. concerning the device preferred to access the apps, most of the respondents utilized a smartphone (63.27 percent), followed by a laptop (16.33 percent), and a tablet (14.29 percent).

table 2 shows the elements of language-learning apps that 48 respondents found more satisfactory. they selected “learning in own time and space” as the most desired element, followed by “vocabulary” and “translation exercises.” participants were less captivated by “pronunciation capability” (29.1 percent) and “dictionary function” (16.6 percent).

element of a language-learning app              percentage finding satisfactory (%)
learning in own time and space                  64.5
vocabulary                                      56.2
translation exercises                           56.2
making mistakes without feeling embarrassed     54.1
responsive touch screen                         52
self-testing                                    52
reading and writing exercises                   43.7
game-like features                              37.5
voice recognition capability                    37.5
comfortable text entry                          37.5
grammar and verb conjugation exercises          35.4
pronunciation capability                        29.1
dictionary function                             16.6

table 2. most satisfactory aspects with language-learning apps (multiple responses allowed, n=48).

figure 4. most unsatisfactory elements of language-learning apps (n=30).

figure 4 values: content, 13; flexibility/interface, 10; grammar, 5; payment, 2.

conversely, 30 respondents described unsatisfactory elements on the survey. these elements were grouped into the categories shown in figure 4. the elements were: payment restrictions, lack of grammatical explanations, monocentric content focused on travel, vocabulary-centric content (although opinions were varied on this issue), and poor interface. respondents also mentioned a lack of flexibility that inhibited learners from reviewing earlier lessons or moving forward as desired, unfriendly interfaces, and limited scope. other respondents alluded to technical issues with keyboard diacritics, non-intuitive software, and repetitive exercises. while these elements relate to the language apps themselves, one respondent mentioned missing human interaction and another reported the lack of a system to prompt learners to be accountable for their own learning process.

figure 5. reasons participants had not used a language-learning app (multiple responses allowed, n=53).

figure 5 shows that time restriction (i.e., availability of time to use the app) was the most prevalent reason why respondents had not used a language-learning app. however, a larger percentage of respondents answered “other” to expand on the reason they had not tried this technology. the explanations provided included: missing competent content for work; already having sufficient proficiency; preferring books, dictionaries, google translate, and podcasts; lacking interest; and having different priorities. similarly, when asked whether they would use a language-learning app if given an opportunity, a large percentage of 52 respondents answered “maybe” (65.38 percent).
however, when 51 respondents answered the question “what elements facilitated your language learning?,” 66.6 percent responded they preferred having an instructor, 54.9 percent liked being part of a classroom, and 41.1 percent liked language activities with classmates.

discussion

library employee use of language-learning apps

the data revealed that a large number of respondents used a foreign language in their library work, reporting that reading and writing were the most needed skills. however, only about half of the respondents had used a language-learning app. therefore, there appears to be interest in language-learning apps, but use is not widespread at this time. overall, respondents felt language-learning apps did not offer a curriculum that supported foreign language enhancement for the workplace, especially the academic library one. this factor may be one reason why respondents stopped using the apps and why this technology was not utilized more extensively.

figure 5 values: other, 54.71%; lack of time, 37.73%; prefer traditional setting, 32.07%; screen too small, 1.88%.

interestingly, the majority of the respondents were in their thirties and forties. one may surmise that young millennials in their twenties would be more inclined to use language-learning apps. however, the data showed a slight lead by respondents in their forties. this information may corroborate the author’s inference that generational distinctions among employees of academic libraries do not limit the ability to seek and even prefer learning another language through apps. moreover, a pew research center study showed that generations older than millennials have welcomed technology, and gen xers even had a 10 percent lead over millennials in the ownership of tablets.16

referring to the device used to interact with the language app, most respondents preferred a smartphone. only a smaller fraction of respondents preferred a tablet, laptop, or desktop.
this data may attest to the mobility of language-learning apps preferred by self-learners and the notion that language learning may happen outside the classroom setting. however, while smartphones provide ubiquity and a sense of independence, so can tablets. therefore, what is it about smartphones that ignites preference from a user experience perspective? is it their ability to make calls, portability, fast processors, wi-fi signal, or cellular connectivity that makes a difference? since tablets can also be considered portable, and their larger screen and web surfing capabilities are desirable assets, is it the “when and where” that determines the device?

while not all respondents reported using an app to learn a language, those who did expressed satisfaction with learning in their own space and time and with translation exercises. nevertheless, it is striking that few respondents deemed important the ability of the software to help learners with the phonetic aspect of the language. this diminished interest in pronunciation may be connected with the type of language learning needed in the academic library profession. as respondents indicated, language-learning apps tend to focus on conversational skills rather than reading and text comprehension. in addition to those respondents who used an app to learn a new language, one respondent reported reinforcing skills in a language already acquired.

a compelling matter to consider is the frequency with which respondents utilize a foreign language in their work. about a third of the respondents used a foreign language at work on a daily basis, and approximately a quarter used it weekly. this finding reveals that foreign language plays a significant role in academic library work. since the respondents fulfilled different responsibilities in their library work, one may deduce that foreign language is utilized in a variety of settings other than strictly desk tasks.
in fact, as stated before, respondents reported using foreign language for multiple tasks including communicating with vendors and foreign guests as well as providing a welcoming environment, among others.

even though 59 respondents stated that knowing a foreign language helped them with communication, respondents appeared to be more concerned with reading comprehension and vocabulary. it is likely reading comprehension was ranked high in the level of importance since library jobs that require foreign language knowledge tend to utilize reading comprehension skills widely. nonetheless, the author wonders whether subject specialists utilize more skills related to listening and communication in a foreign language, especially those librarians who provide instruction. therefore, it is curious that they did not prioritize these skills. perhaps this topic could be the subject of future research. notwithstanding these results, language-learning apps appear to center on content that improves listening and basic communication instead of reading comprehension. therefore, the question remains as to whether mobile language apps have enough capabilities to provide a relevant learning experience to librarians and staff working in academic libraries.

are language-learning apps responding to the language needs of employees working in academic libraries?

the survey results indicate that language-learning apps are not sufficiently meeting respondents’ foreign language needs. qualitative data showed that there may be several elements affecting the compatibility of language-learning apps with the needs of employees working in academic libraries. however, the findings were not conclusive due to the limited number of responses.
when respondents were asked to identify the unsatisfactory elements in these apps, 65.9 percent of 47 respondents found an issue with language-learning apps, but 23 percent of those respondents answered “none.” according to respondents, the main problems with apps were the lack of content and scope that was suitable for employees of academic libraries, flexibility, and grammar. perhaps mobile language-app developers speculate that some learners still use a formal classroom setting for foreign language acquisition, and therefore leave more advanced curriculum for that setting. it is also possible that developers deem more dominant a market that centers on travel and basic conversation; this may explain why these apps do not address foreign language needs at the professional level. finally, these academic library employees appear to perceive that there is a need for these apps to explore and offer a curriculum and learning activities that benefit those seeking deeper knowledge of a language.

conclusion

mobile language learning has changed the approach to language acquisition. its mobility, portability, and ubiquity have established a manner of instruction that provides a sense of freedom and self-management that suits self-learners. moreover, as app technology has progressed, features have been added to devices that facilitate a more meaningful user experience with language-learning apps. employees of academic libraries who have used foreign language-learning apps are cognizant of language-learning activities that support their foreign language needs for work, such as reading comprehension and vocabulary. however, language-learning apps appear to market conversational needs, providing exercises that focus on travel more than lessons that center on reading comprehension and deeper areas of language knowledge. this indicates a lack of language-learning content that would be more appropriate for those working in academic libraries.
finally, academic library employees who require a foreign language in their work are a target group that may benefit from mobile language learning. presently, this target group feels language-learning apps are too basic to cover professional, broader needs. therefore, as language-learning app developers consider service to wider groups of people, it would be beneficial for these apps to expand their lesson structure and content to address the needs of academic library professionals.

endnotes

1 agnes kukulska-hulme, “will mobile learning change language learning?” recall 21, no. 2 (2009): 157, https://doi.org/10.1017/s0958344009000202.
2 ibid., 158.
3 see florence martin and jeffrey ertzberger, “here and now mobile learning: an experimental study on the use of mobile technology,” computers & education 68 (2013): 76-85, https://doi.org/10.1016/j.compedu.2013.04.021; houston heflin, jennifer shewmaker, and jessica nguyen, “impact of mobile technology on student attitudes, engagement, and learning,” computers & education 107 (2017): 91-99, https://doi.org/10.1016/j.compedu.2017.01.006; yoon jung kim, “the effects of mobile-assisted language learning (mall) on korean college students’ english-listening performance and english-listening anxiety,” studies in linguistics, no. 48 (2018): 277-98, https://doi.org/10.15242/heaig.h1217424; jack burston, “the reality of mall: still on the fringes,” calico journal 31, no. 1 (2014): 103-25, https://www.jstor.org/stable/calicojournal.31.1.103.
4 jaclyn broadbent, “comparing online and blended learner’s self-regulated learning strategies and academic performance,” internet and higher education 33 (2017): 24, https://doi.org/10.1016/j.iheduc.2017.01.004.
5 rui-ting huang and chung-long yu, “exploring the impact of self-management of learning and personal learning initiative on mobile language learning: a moderated mediation model,” australian journal of education technology 35, no. 3 (2019): 118, https://doi.org/10.14742/ajet.4188.
6 ibid., 121.
7 mebratu mulato bachore, “language through mobile technologies: an opportunity for language learners and teachers,” journal of education and practice 6, no. 31 (2015): 51, https://files.eric.ed.gov/fulltext/ej1083417.pdf.
8 ibid., 50.
9 fatima ezzahraa louhab, ayoub bahnasse, and mohamed talea, “considering mobile device constraints and context-awareness in adaptive mobile learning for flipped classroom,” education and information technologies 23, no. 6 (2018): 2608, https://doi.org/10.1007/s10639-018-9733-3.
10 linda bradley, “the mobile language learner: use of technology in language learning,” journal of universal computer science 21, no. 10 (2015): 1270, http://jucs.org/jucs_21_10/the_mobile_language_learner/jucs_21_10_1269_1282_bradley.pdf.
11 ibid.
12 tanya elias, “universal instructional design principles for mobile learning,” the international review of research in open and distance learning 12, no. 2 (2011): 149, https://doi.org/10.19173/irrodl.v12i2.965.
13 robert godwin-jones, “emerging technologies: mobile apps for language learning,” language learning & technology 15, no. 2 (2011): 3, http://dx.doi.org/10125/44244.
14 kukulska-hulme, 158.
15 “membership: list of arl members,” association of research libraries, accessed april 5, 2019, https://www.arl.org/membership/list-of-arl-members.
16 jingjing jiang, “millennials stand out for their technology use,” pew research center (2018), https://www.pewresearch.org/fact-tank/2018/05/02/millennials-stand-out-for-their-technology-use-but-older-generations-also-embrace-digital-life/.

digitization of text documents using pdf/a
yan han and xueheng wan
information technology and libraries | march 2018 52
https://doi.org/10.6017/ital.v37i1.9878

yan han (yhan@email.arizona.edu) is full librarian, the university of arizona libraries, and xueheng wan (wanxueheng@email.arizona.edu) is a student, department of computer science, university of arizona.
abstract

the purpose of this article is to demonstrate a practical use case of pdf/a for digitization of text documents following fadgi’s recommendation of using pdf/a as a preferred digitization file format. the authors demonstrate how to convert and combine tiffs with associated metadata into a single pdf/a-2b file for a document. using real-life examples and open source software, the authors show readers how to convert tiff images, extract associated metadata and international color consortium (icc) profiles, and validate the result against the newly released pdf/a validator. the generated pdf/a file is a self-contained and self-described container that accommodates all the data from digitization of textual materials, including page-level metadata and icc profiles. providing theoretical analysis and empirical examples, the authors show that pdf/a has many advantages over the traditionally preferred file format, tiff/jpeg2000, for digitization of text documents.

background

pdf has been primarily used as a file delivery format across many platforms in almost every device since its initial release in 1993. pdf/a was designed to address concerns about long-term preservation of pdf files, but there has been little research and few implementations of this file format. since the first standard (iso 19005 pdf/a-1) was published in 2005, some articles have discussed the pdf/a family of standards, relevant information, and how to implement pdf/a for born-digital documents.1 interest in the pdf and pdf/a standards has grown since both the us library of congress and the national archives and records administration (nara) joined the pdf association in 2017. nara joined the pdf association because pdf files are used as electronic documents in every government and business agency.
as explained in a blog post, the library of congress joined the pdf association because of the benefits to libraries, including participating in developing pdf standards, promoting best-practice use of pdf, and access to the global expertise in pdf technology.2

few articles, if any, have been published about using this file format for preservation of digitized content. yan han published a related article in 2015 about theoretical research on using pdf/a for text documents.3 in this article, han discussed the shortcomings of the widely used tiff and jpeg2000 as master preservation file formats and proposed using the then-emerging pdf/a as the preferred file format for digitization of text documents. han further analyzed the requirements of digitization of text documents and discussed the advantages of pdf/a over tiff and jpeg2000. these benefits include platform independence, smaller file size, better compression algorithms, and metadata encoding. in addition, the file format reduces workload and simplifies post-digitization processing such as quality control, adding and updating missing pages, and creating new metadata and ocr data for discovery and digital preservation. as a result, pdf/a can be used in every phase of a digital object in an open archival information system (oais), for example as a submission information package (sip), archive information package (aip), and dissemination information package (dip). in summary, a pdf/a file can be a structured, self-contained, and self-described container allowing a simpler one-to-one relationship between an original physical document and its digital surrogate.
in september 2016, the federal agencies digital guidelines initiative (fadgi) released its latest guidelines for digitization related to raster images: technical guidelines for digitizing cultural heritage materials.4 regarded as the de facto best practice for digitization, these guidelines provide guidance to federal agencies and have been used in many cultural heritage institutions. both the pdf association and the authors welcomed the recognition of pdf/a as the preferred master file format for digitization of text documents such as unbound documents, bound volumes, and newspapers.5

goals and tasks

since han has previously provided theoretical methods of coding raster images, metadata, and related information in pdf/a, the goals of this article are threefold:
1. present real-life experience of converting tiffs/jpeg2000s to pdf/a and back, along with image metadata
2. test open source libraries to create and manipulate images, image metadata, and pdf/a
3. validate generated pdf/as with the first legitimate validator for pdf/a validation

the tasks included the following:
● convert all the master files in tiffs/jpeg2000 from digitization of text documents into single pdf/a files losslessly. one document, one pdf/a file.
● evaluate and extract metadata from each tiff/jpeg2000 image and encode it along with its image when creating the corresponding pdf/a file.
● demonstrate the runtimes of the above tasks for feasibility evaluation.
● validate the pdf/a files against the newly released open source pdf/a validator verapdf.
● extract each digital image from the pdf/a file back to its original master image files along with associated metadata.
● verify the extracted image files in the back-and-forth conversion process against the original master image files.

choices of pdf/a standards and conformance level

this article demonstrates using pdf/a-2b as a self-contained self-describing file format.
currently, there are three related pdf/a standards (pdf/a-1, pdf/a-2, and pdf/a-3). pdf/a-1 defines two conformance levels (a and b); pdf/a-2 and pdf/a-3 define three (a, b, and u). the reasons for choosing pdf/a-2 (instead of pdf/a-1 or pdf/a-3) are the following:
● pdf/a-1 is based on pdf 1.4. in this standard, images coded in pdf/a-1 cannot use jpeg2000 compression (named in pdf/a as jpxdecode). one can still convert tiffs to pdf/a-1 using other lossless compression methods such as lzw. however, the space-saving benefits of jpeg2000 compression over other methods would not be utilized.
● pdf/a-2 and pdf/a-3 are based on pdf 1.7. one significant feature of pdf 1.7 is that it supports jpeg2000 compression, which saves 40–60 percent of space for raster images compared to uncompressed tiffs.
● pdf/a-3 has one major feature that pdf/a-2 does not have, which is to allow arbitrary files to be embedded within the pdf file. in this case, there is no file to be embedded.

the authors chose conformance level b for simplicity.
● b is basic conformance, which requires only necessary components (e.g., all fonts embedded in the pdf) for reproduction of a document’s visual appearance.
● a is accessible conformance, which means b conformance level plus additional accessibility (structural and semantic features such as document structure). one can add tags to convert pdf/a-2b to pdf/a-2a.
● u represents a conformance level with the additional requirement that all text in the document have unicode equivalents.

this article does not cover any post-processing of additional manual or computational features such as adding ocr text to the generated pdf/a files. these features do not help faithfully capture the look and feel of original pages in digitization, and they can be added or updated later without any loss of information. in addition, ocr results rely on the availability of ocr engines for the document’s language, and results can vary between different ocr engines over time.
ocr technology is getting better and will produce better results in the future. for example, current ocr technology for english gives very reliable (more than 90 percent) accuracy. in comparison, traditional chinese manuscripts and pashto/persian give unacceptably low accuracy (less than 60 percent). cutting-edge ocr engines have started to utilize artificial intelligence networks, and the authors believe that a breakthrough will happen soon.

data source

the university of arizona libraries (ual) and afghanistan center at kabul university (acku) have been partnering to digitize and preserve acku’s permanent collection held in kabul. this collaborative project created the largest afghan digital repository in the world. currently the afghan digital repository (http://www.afghandata.org) contains more than fifteen thousand titles and 1.6 million pages of documents. digitization of these text documents follows the previous version of the fadgi guideline, which recommended scanning each page of a text document into a separate tiff file as the master file. these tiffs were organized by directories in a file system, where each directory represents a corresponding document and contains all the scanned pages of that title. an example of the directory structure can be found in han’s article.

pdf/a and image manipulation tools

there are a few open source and proprietary pdf software development kits (sdks). adobe pdf library and foxit sdk are the most well-known commercial tools to manipulate pdfs. to show readers that they can manipulate and generate pdf/a documents themselves, open source software, rather than commercial tools, was used. currently, only a very limited number of open source pdf sdks are available, including itext and pdfbox.
itext was chosen because it has good documentation and provides a well-built set of apis to support almost all the pdf and pdf/a features. itext was initially written by bruno lowagie (who was in the iso pdf standard working group) in 1998 as an in-house project; lowagie later started up his own company, itext, and published itext in action with many code examples.6 moreover, itext has java and c# coding options with good code documentation. it is worth mentioning that itext has different versions. the authors used itext 5.5.10 and 5.4.4. using an older version in our implementation generated a noncompliant pdf/a file because it was not aligned with the pdf/a standard.7

for image processing, there were a few popular open source options, including imagemagick and gimp. imagemagick was chosen because of its popularity, stability, and cross-platform implementation. our implementation identified one issue with imagemagick: the current version (7.0.4) could not retrieve all the metadata from tiff files, as it did not extract certain information such as the image file directory and color profile. these metadata are critical because they are part of the original data from digitization. unfortunately, the authors observed that some image editors were unable to preserve all the metadata from the image files during the conversion process. hart and de vries used case studies to show the vulnerability of metadata, demonstrating that metadata elements in a digital object can be lost or corrupted by use or by conversion of a file to another format. they suggested that action is needed to ensure proper metadata creation and preservation, so that all types of metadata are captured and preserved to achieve the most authentic, consistent, and complete digital preservation for future use.8

metadata extraction tools and color profiles

as we digitize physical documents and manipulate images, color management is important.
the goal of color management is to obtain a controlled conversion between the color representations of various devices such as image scanners, digital cameras, and monitors. a color profile is a set of data that controls input and output of a color space. the international color consortium (icc) standards and profiles were created to bring various manufacturers together because embedding color profiles into images is one of the most important color management solutions. image formats such as tiff and jpeg2000 and document formats such as pdf may contain embedded color profiles.

the authors identified a few open source tools to extract tiff metadata, including exiftool, exiv2, and tiffinfo. exiftool is an open source tool for reading, writing, and manipulating metadata of media files. exiv2 is another free metadata tool supporting different image formats. the tiffinfo program is widely used on the linux platform, but it has not been updated for at least ten years. our implementations showed that exiftool was the one that most easily extracted the full icc profiles and other metadata from tiff and jpeg2000 files. imagemagick and other image processing software were examined in van der knijff’s article discussing jpeg2000 for long-term preservation.9 he found that icc profiles were lost in imagemagick. our implementation has shown that the current version of imagemagick has fixed this issue. a metadata sample can be found in appendix a.

implementation

converting and ordering tiffs into a single pdf/a-2 file

when ordering and combining all individual tiffs of a document into a single pdf/a-2b file, the authors intended to preserve all information from the tiffs, including raster image data streams and metadata stored in each tiff’s header.
the raster image data streams are the main images reflecting the original look and feel of these pages, while the metadata (including technical and administrative metadata such as bitspersample, datetime, and make/model/software) tells us important digitization and provenance information. both are critical for delivery and digital preservation. the tiff images were first converted to jpeg2000 with lossless compression using the open source imagemagick software. our tests of imagemagick demonstrated that it can handle different color profiles and will convert images correctly if the original tiff comes with a color profile. this gave us confidence that past concerns about jpeg2000 and imagemagick had been resolved. these images were then properly sorted into their original order and combined into a single pdf/a-2 file. an alternative is to directly code the tiff’s image data stream into a pdf/a file, but this approach would miss one benefit of pdf/a-2: tremendous file size reduction with jpeg2000. the following is the pseudocode of ordering and combining all the tiffs in a text document into a single pdf/a-2 file.
    createPdfA2(Queue tiffList) {
        create an empty queue xmlQ;
        create an empty queue jp2Q;
        /* tiffList is a pre-sorted queue based on the original page order */
        /* convert each tiff to jpeg2000 losslessly, then add each jpeg2000 and its metadata into a queue */
        while (tiffList is not empty) {
            String tiffFilePath = tiffList.dequeue();
            String xmlFilePath = tiff metadata extracted using exiftool;
            xmlQ.enqueue(xmlFilePath);
            String jp2FilePath = jpeg2000 file location from tiff converted by imagemagick;
            jp2Q.enqueue(jp2FilePath);
        }
        /* convert each image's metadata to xmp, add each jpeg2000 and its metadata into the pdf/a-2 file based on its original order */
        Document pdf2b = new Document();
        /* create pdf/a-2b conformance level */
        PdfAWriter writer = PdfAWriter.getInstance(pdf2b, new FileOutputStream(pdfaFilePath), PdfAConformanceLevel.PDF_A_2B);
        writer.createXmpMetadata(); // create root xmp
        pdf2b.open();
        while (jp2Q is not empty) {
            Image jp2 = Image.getInstance(jp2Q.dequeue());
            Rectangle size = new Rectangle(jp2.getWidth(), jp2.getHeight()); // pdf page size setting
            pdf2b.setPageSize(size);
            pdf2b.newPage(); // create a new page for a new image
            byte[] byteArr = xmpManipulation(xmlQ.dequeue()); // convert original metadata based on the xmp standard
            writer.setPageXmpMetadata(byteArr);
            pdf2b.add(jp2);
        }
        pdf2b.close();
    }

converting pdf/a-2 files back to tiffs and jpeg2000s

to ensure that we can extract raster images from the newly created pdf/a-2 file, the authors also wrote code to convert a pdf/a-2 file back to the original tiff or jpeg2000 format. this implementation was a reverse process of the above operation. once the reverse conversion process was completed, the authors verified that the image files created from the pdf/a-2 file were the same as before the conversion to pdf/a-2. note that we generated md5 checksums to verify image data streams.
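the checksum comparison described above can be sketched in a few lines of java. this is a minimal illustration rather than the authors' code: the class name ChecksumCheck and its methods are hypothetical, and the sketch assumes the caller has already obtained the image data stream of each file as bytes (how that stream is separated from the file's metadata is outside its scope).

```java
import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper (not from the article): compares two image data
// streams by their MD5 digests.
final class ChecksumCheck {

    /** Returns the MD5 digest of the given bytes as lowercase hex. */
    static String md5Hex(byte[] data) {
        return toHex(newMd5().digest(data));
    }

    /** Streams data through MD5 so large images need not fit in memory. */
    static String md5Hex(InputStream in) throws IOException {
        MessageDigest md = newMd5();
        byte[] buffer = new byte[8192];
        int read;
        while ((read = in.read(buffer)) != -1) {
            md.update(buffer, 0, read);
        }
        return toHex(md.digest());
    }

    /** True when both byte sequences have the same MD5 digest. */
    static boolean sameDigest(byte[] original, byte[] roundTripped) {
        return md5Hex(original).equals(md5Hex(roundTripped));
    }

    private static MessageDigest newMd5() {
        try {
            return MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required to be present on every Java platform.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    private static String toHex(byte[] digest) {
        StringBuilder sb = new StringBuilder(digest.length * 2);
        for (byte b : digest) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }
}
```

md5 is adequate here because the goal is to detect accidental differences introduced by the round trip, not to resist deliberate tampering.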
the image data streams are the same, but the location of metadata can vary because of inconsistent tiff tags used over the years. when converting one tiff to another tiff, imagemagick applies its own implementation of metadata tags. the code can be found in appendix b.

pdf/a validation

pdf/a is one of the most recognized digital preservation formats, specially designed for long-term preservation and access. however, no commonly accepted pdf/a validator was available in the past, although several commercial and open source pdf preflight and validation engines (e.g., acrobat) were available. validating a pdf/a against the pdf/a standards is a challenging task for a few reasons, including the complexity of the pdf and pdf/a formats. the pdf association and the open preservation foundation recognized the need and started a project to develop an open source pdf/a validator and build a maintenance community. their result, verapdf, is an open source validator designed for all pdf/a parts and conformance levels. verapdf was released in january 2017 with the goal of becoming the commonly accepted pdf/a validator.10 our generated pdf/as have been validated with verapdf 1.4 and adobe acrobat pro dc preflight. both products validated the pdf/a-2b files as fully compliant. our implementations showed that verapdf 1.4 verified more cases than acrobat dc preflight. figure 1 shows a pdf file structure and its metadata.

figure 1. a pdf object tree with root-level metadata.

runtime and conclusion

the time complexity of our code is o(n log n) because of the sorting algorithms used. tiffs were first converted to jpeg2000. when jpeg2000 images are added to a pdf/a-2 file, no further image manipulation is required because the generated pdf/a-2 uses jpeg2000 directly (in other words, it uses the jpxdecode filter).
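the sorting step that dominates the complexity analysis can be illustrated with a short java sketch that builds the pre-sorted queue of page files. it is an assumption-laden example, not the authors' implementation: the class PageOrder is hypothetical, and it assumes that each file name ends with its page number before the extension (for example page_2.tif), which the article does not specify.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Queue;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not from the article): orders page files by the
// page number embedded in their names, so page_2 sorts before page_10.
final class PageOrder {

    // Last run of digits in the file name stem.
    private static final Pattern LAST_NUMBER = Pattern.compile("(\\d+)(?=\\D*$)");

    /** Page number taken from the name without its extension, or -1 if none. */
    static long pageNumber(String fileName) {
        int dot = fileName.lastIndexOf('.');
        // Drop the extension first so the "2" in ".jp2" is not mistaken for a page number.
        String stem = dot > 0 ? fileName.substring(0, dot) : fileName;
        Matcher m = LAST_NUMBER.matcher(stem);
        return m.find() ? Long.parseLong(m.group(1)) : -1L;
    }

    /** Returns the file names as a queue in page order; the sort is O(n log n). */
    static Queue<String> toOrderedQueue(List<String> fileNames) {
        List<String> sorted = new ArrayList<>(fileNames);
        Comparator<String> byPage = Comparator.comparingLong(PageOrder::pageNumber);
        // Fall back to plain string order when two names carry the same number.
        sorted.sort(byPage.thenComparing(Comparator.naturalOrder()));
        return new ArrayDeque<>(sorted);
    }
}
```

sorting on the parsed number rather than on the raw string avoids the common problem of page_10 being placed before page_2.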
tables 1 and 2 show the performance comparison on our computer hardware and software environment (intel core i7-2600 cpu@3.4ghz, 8gb ddr3 ram, 3tb 7200-rpm 64mb-cache hard disk running ubuntu 16.10).

table 1. runtimes of converting grayscale tiffs to jpeg2000s and to pdf/a-2b

no. of files    total file size (mb)    image conversion runtime, tiffs to jp2s (seconds)    total runtime, tiffs to jp2s to a single pdf/a-2b (seconds)
1               9.1                     3.61                                                 3.98
10              91.1                    35.63                                                36.71
20              182.2                   71.83                                                73.98
50              455.5                   179.06                                               184.63
100             910.9                   358.3                                                370.91

table 2. runtimes of converting color tiffs to jpeg2000s and to pdf/a-2b

no. of files    total file size (mb)    image conversion runtime, tiffs to jp2s (seconds)    total runtime, tiffs to jp2s to a single pdf/a-2b (seconds)
1               27.3                    14.80                                                14.94
10              273                     150.51                                               151.55
20              546                     289.95                                               293.21
50              1,415                   741.89                                               749.75
100             2,730                   1490.49                                              1509.23

the results show that (a) the majority of the runtime is spent in converting a tiff to a jpeg2000 using imagemagick: more than 95 percent in every test except the single grayscale file, where it was about 91 percent (see figure 2); (b) the runtime of converting a tiff grows roughly in proportion to the file’s size (see figure 2); (c) in comparison, the runtime of converting a color tiff is significantly higher than that of converting a grayscale tiff (see figure 2); and (d) it is feasible in terms of time and resources to convert existing master images of digital document collections to pdf/a-2b. for example, the runtime of converting 1 tb of color tiffs will be 552,831 seconds (about 153.6 hours, or 6.4 days) using the above hardware. the authors have already processed more than 600,000 tiffs using this method.

the authors conclude that pdf/a, the newly preferred master file format for digitization of text documents, gives institutions advantages over tiff/jpeg2000.
the above implementation demonstrates the ease, the reasonable runtime, and the availability of open source software to perform such conversions. from both the theoretical analysis and the empirical evidence, the authors show that pdf/a has advantages over the traditional preferred file format tiff for digitization of text documents. following best practice, a pdf/a file can be a self-contained and self-described container that accommodates all the data from digitization of textual materials, including page-level metadata and icc profiles.

summary

the goal of this article is to present empirical evidence for using pdf/a for digitization of text documents. the authors evaluated and used multiple open source software programs for processing raster images, extracting image metadata, and generating pdf/a files. these pdf/a files were validated using the up-to-date pdf/a validators verapdf and acrobat preflight. the authors also calculated the time complexity of the program and measured the total runtime in multiple testing cases. most of the runtime was spent on image conversions from tiff to jpeg2000. the creation of the pdf/a-2b file with associated page-level metadata generally accounted for less than 5 percent of the total runtime. runtime of conversion of a color tiff was much higher than that of a grayscale one. our theoretical analysis and empirical examples show that using pdf/a-2 presents many advantages over the traditional preferred file format (tiff/jpeg2000) for digitization of text documents.

figure 2. file size, grayscale and color tiffs and runtime ratio.
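the 1 tb estimate and the share of runtime spent on image conversion can both be reproduced from the 100-file row of table 2 with simple arithmetic. the java sketch below assumes that runtime scales linearly with total file size and treats 1 tb as 10^6 mb; the class RuntimeEstimate and its method names are illustrative only.

```java
// Hypothetical helper (not from the article): reproduces the runtime
// arithmetic from measured batch figures.
final class RuntimeEstimate {

    /** Extrapolates a measured runtime linearly to a target data volume. */
    static double estimateSeconds(double measuredSeconds, double measuredMb, double targetMb) {
        double secondsPerMb = measuredSeconds / measuredMb;
        return secondsPerMb * targetMb;
    }

    /** Fraction of the total runtime spent on TIFF-to-JPEG2000 conversion. */
    static double conversionShare(double conversionSeconds, double totalSeconds) {
        return conversionSeconds / totalSeconds;
    }

    public static void main(String[] args) {
        // Table 2, 100 color TIFFs: 2,730 MB, 1490.49 s conversion, 1509.23 s total.
        double seconds = estimateSeconds(1509.23, 2730.0, 1_000_000.0);
        System.out.printf("1 TB of color TIFFs: %.0f s (%.1f hours, %.2f days)%n",
                seconds, seconds / 3600.0, seconds / 86400.0);
        System.out.printf("conversion share: %.1f%%%n",
                100.0 * conversionShare(1490.49, 1509.23));
    }
}
```

the extrapolation comes to roughly 552,800 seconds, in line with the figure quoted in the runtime discussion; real collections with mixed page sizes or slower storage may deviate from this linear estimate.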
appendix a: sample tiff metadata with icc header

8 3400 4680 8 8 8 uncompressed rgb (binary data 41025 bytes, use -b option to extract) 3 1 (binary data 28079 bytes, use -b option to extract) 400 400 chunky appl 2.2.0 display device profile rgb xyz 2006:02:02 02:20:00 acsp apple computer inc. not embedded, independent none reflective, glossy, positive, color perceptual 0.9642 1 0.82491 epso 0 epson srgb 0.43607 0.22249 0.01392 0.38515 0.71687 0.09708 0.14307 0.06061 0.7141 0.95045 1 1.08905 copyright (c) seiko epson corporation 2000 2006. all rights reserved. (binary data 8204 bytes, use -b option to extract) (binary data 8204 bytes, use -b option to extract) (binary data 8204 bytes, use -b option to extract) 0 0 0

appendix b: sample code to convert pdf/a-2 back to jpeg2000s

    import com.itextpdf.text.exceptions.UnsupportedPdfException;
    import com.itextpdf.text.pdf.PRStream;
    import com.itextpdf.text.pdf.PdfName;
    import com.itextpdf.text.pdf.PdfObject;
    import com.itextpdf.text.pdf.PdfReader;
    import java.io.FileOutputStream;
    import java.io.IOException;

    /* assumption: the pdf/a-2b file was specifically generated from image objects
       converted from tiff images with jpxdecode along with page-level metadata */
    public static void parse(String src, String dest) throws IOException {
        PdfReader reader = new PdfReader(src);
        PdfObject obj;
        int counter = 0;
        // walk every object in the cross-reference table and keep only streams
        for (int i = 1; i < reader.getXrefSize(); i++) {
            obj = reader.getPdfObject(i);
            if (obj != null && obj.isStream()) {
                PRStream stream = (PRStream) obj;
                byte[] b;
                try {
                    b = PdfReader.getStreamBytes(stream);
                } catch (UnsupportedPdfException e) {
                    // filters itext cannot decode (such as jpxdecode) are written out raw
                    b = PdfReader.getStreamBytesRaw(stream);
                }
                PdfObject pdfSubtype = stream.get(PdfName.SUBTYPE);
                FileOutputStream fos = null;
                if (pdfSubtype != null && pdfSubtype.toString().equals(PdfName.XML.toString())) {
                    fos = new FileOutputStream(dest + "_xml/" + counter + ".xml");
                    System.out.println("page metadata extracted!");
                }
                if (pdfSubtype != null && pdfSubtype.toString().equals(PdfName.IMAGE.toString())) {
                    counter++;
                    fos = new FileOutputStream(dest + "_jp2/" + counter + ".jp2");
                }
                if (fos != null) {
                    fos.write(b);
                    fos.flush();
                    fos.close();
                    System.out.println("jpeg2000s conversion from pdf completed!");
                }
            }
        }
        reader.close();
    }
    /* then use imagemagick library to convert jpeg2000s to tiffs */

references

1 pdf-tools.com and pdf association, “pdf/a—the standard for long-term archiving,” version 2.4, white paper, may 20, 2009, http://www.pdf-tools.com/public/downloads/whitepapers/whitepaper-pdfa.pdf; duff johnson, “white paper: how to implement pdf/a,” talking pdf, august 24, 2010, https://talkingpdf.org/white-paper-how-to-implement-pdfa/; alexandra oettler, “pdf/a in a nutshell 2.0: pdf for long-term archiving,” association for digital standards, 2013, https://www.pdfa.org/wp-content/until2016_uploads/2013/05/pdfa_in_a_nutshell_211.pdf; library of congress, “pdf/a, pdf for long-term preservation,” last modified july 27, 2017, https://www.loc.gov/preservation/digital/formats/fdd/fdd000318.shtml.
2 library of congress, “the time and place for pdf: an interview with duff johnson of the pdf association,” the signal (blog), december 12, 2017, https://blogs.loc.gov/thesignal/2017/12/the-time-and-place-for-pdf-an-interview-with-duff-johnson-of-the-pdf-association/.
3 yan han, “beyond tiff and jpeg2000: pdf/a as an oais submission information package container,” library hi tech 33, no. 3 (2015): 409–23, https://doi.org/10.1108/lht-06-2015-0068.
4 federal agencies digital guidelines initiative, technical guidelines for digitizing cultural heritage materials (washington, dc: federal agencies digital guidelines initiative, 2016), http://www.digitizationguidelines.gov/guidelines/fadgi%20federal%20%20agencies%20digital%20guidelines%20initiative-2016%20final_rev1.pdf.
5 duff johnson, “us federal agencies approve pdf/a,” pdf association, september 2, 2016, http://www.pdfa.org/new/us-federal-agencies-approve-pdfa/.
6 bruno lowagie, itext in action, 2nd ed. (stamford, ct: manning, 2010).
7 “itext 5.4.4,” itext, last modified september 16, 2013, http://itextpdf.com/changelog/544.
8 timothy robert hart and denise de vries, “metadata provenance and vulnerability,” information technology and libraries 36, no. 4 (2017), https://doi.org/10.6017/ital.v36i4.10146.
9 johan van der knijff, “jpeg 2000 for long-term preservation: jp2 as a preservation format,” d-lib magazine 17, no. 5/6 (2011), https://doi.org/10.1045/may2011-vanderknijff.
10 pdf association, “how verapdf does pdf/a validation,” 2016, http://www.pdfa.org/how-verapdf-does-pdfa-validation/.
information technology and libraries | december 2011

cataloging theory in search of graph theory and other ivory towers
object: cultural heritage resource description networks

ronald j. murray and barbara b. tillett

this paper summarizes a research program that focuses on how catalogers, other cultural heritage information workers, web/semantic web technologists, and the general public understand, explain, and manage resource description tasks by creating, counting, measuring, classifying, and otherwise arranging descriptions of cultural heritage resources within the bibliographic universe and beyond it. a significant effort is made to update the nineteenth-century mathematical and scientific ideas present in traditional cataloging theory to their twentieth- and twenty-first-century counterparts. there are two key elements in this approach: (1) a technique for diagrammatically depicting and manipulating large quantities of individual and grouped bibliographic entities and the relationships between them, and (2) the creation of resource description exemplars (problem–solution sets) that are intended to play theoretical, pedagogical, and it system design roles.

to the reader: this paper presents a major re-visioning of cataloging theory, introducing along the way a technique for depicting diagrammatically large quantities of bibliographic entities and the relationships between them. as many details of the diagrams cannot be reproduced in regularly sized print publications, the reader is invited to follow the links provided in the endnotes to pdf versions of the figures.

cataloging—the systematic arrangement of resources through their descriptions that is practiced by libraries, archives, and museums (i.e., cultural heritage institutions) and other parties1—can be placed in an advanced, twenty-first-century context by updating its preexisting scientific and mathematical ideas with their more contemporary versions. rather than directing our attention to implementation-oriented details such as metadata formats, database designs, and communications protocols, as do technologists pursuing bottom-up web and semantic web initiatives, in this paper we will define a complementary, top-down approach. this top-down approach focuses on how catalogers, other cultural heritage information workers, web/semantic web technologists, and the general public have understood, explained, and managed their resource description tasks by creating, counting, measuring, classifying, and otherwise arranging descriptions of cultural heritage resources within and beyond the bibliographic universe.

we go on to prescribe what enlargements of cataloging theory and practice are required such that catalogers and other interested parties can describe pages from unique, ancient codices as readily as they might describe information elements and patterns on the web. we will be enhancing cataloging theory with concepts from communications theory, history of science, graph theory, computer science, and from the hybrid field of anthropology and mathematics called ethnomathematics.
employing this strategy benefits two groups:

■■ workers in the cultural heritage realm, who will acquire a broadened perspective on their resource description activities, who will be better prepared to handle new forms of creative expressions as they appear, and who will be able to shape the development of information systems that support more sophisticated types of resource descriptions and ways of exploring those descriptions. to build a better library system (perhaps an n-dimensional, n-connected system?), one needs better theories about the library collections and the people or groups who manage and use them.
■■ the full spectrum of people who draw on cultural heritage resources: scholars, creatives (novelists, poets, visual artists, musicians, and so on), professional and technical workers, students, and other people or groups pursuing specific or general, long- or short-term interests, entertainment, etc.

to apply a multidisciplinary perspective to the processes by which resource description data (linked or otherwise) are created and used is not an ivory tower exercise. our approach draws lessons from the debates on why, what, and how to describe physical phenomena that were conducted by physicists, engineers, software developers (and their historian and philosopher of science observers) during the evolution of high-energy physics. during that time, intensive debates raged over theory and observational/experimental data, the roles of theorists, experimenters, and instrument builders, instrumentation, and hardware/software system design.2 accommodating the resulting scientific approaches to description, collaboration, and publishing has required the creation of information technologies that have had and continue to have world-shaking effects.

ronald j. murray (rmur@loc.gov) is a digital conversion specialist in the preservation reformatting division, and barbara b. tillett (btil@loc.gov) is the chief of the policy and standards division at the library of congress.

these physics research facilities and their supporting academic institutions are the same ones whose scientific subcultures (theory, experiment, and instrument building) generated the data creation, management, analysis, and publication requirements that resulted in the creation of the web. in response to this development, we have come to believe that cultural heritage resource description (i.e., the process of identifying and describing phenomena in the bibliographic universe as opposed to the physical one) must now be as open to the concepts and practices of those twenty-first-century physics subcultures as it had been to the natural sciences during the nineteenth century.3 we have consequently undertaken an intensive study of the scientific subcultures that generate scientific data and have identified four principles on which to base a more general approach to cultural heritage resource description:

1. observations
2. complementarity
3. graphs
4. exemplars

the cultural heritage resource description theory to follow proposes a more articulated view of the complex, collaborative process of making available—through their descriptions—socially relevant cultural heritage resources at a global scale. we will demonstrate that a broader understanding of this resource description process (along with the ability to create improved implementations of it) requires integrating ideas from other fields of study, reaching beyond it system design to embrace larger issues.

■■ cataloging as observation

as stated in the oxford english dictionary, an observation is:

the action or an act of observing scientifically; esp. the careful watching and noting of an object or phenomenon in regard to its cause or effect, or of objects or phenomena in regard to their mutual relations (contrasted with experiment). also: a measurement or other piece of information so obtained; an experimental result.4

following the scientific community’s lead in striving to describe the physical universe through observations, we adapted the concept of an observation into the bibliographic universe and assert that cataloging is a process of making observations on resources. human or computational observers following institutional business rules (i.e., the terms, facts, definitions, and action assertions that represent constraints on an enterprise and on the things of interest to the enterprise)5 create resource descriptions—accounts or representations of a person, object, or event being drawn on by a person, group, institution, and so on, in pursuit of its interests.

given this definition, a person (or a computation) operating from a business rules–generated institutional or personal point of view, and executing specified procedures (or algorithms) to do so, is an integral component of a resource description process (see figure 1). this process involves identifying a resource’s textual, graphical, acoustic, or other features and then classifying, making quality and fitness for purpose judgments, etc., on the resource. knowing which institutional or individual points of view are being employed is essential when parties possessing multiple views on those resources describe cultural heritage resources. how multiple resource descriptions derived from multiple points of view are to be related to one another becomes a key theoretical issue with significant practical consequences.

figure 1. a resource description modeled as a business rule-constrained account of a person, object, or event

■■ niels bohr’s complementarity principle and the library

in 1927, the physicist niels bohr offered a radical explanation for seemingly contradictory observations of physical phenomena confounding physicists at that time.6 according to bohr, creating descriptions of nature is the primary task of the physicist:

it is wrong to think that the task of physics is to find out how nature is. physics concerns what we can say about nature.7

descriptions that appear contradictory or incomparable may in fact be signaling deep limitations in language. bohr’s complementarity principle states that a complete description of atomic-level phenomena requires descriptions of both wave and particle properties. this is generally understood to mean that in the normal language that physicists use to communicate experimental results, the wholeness of nature is accessible only through the embrace of complementary, contradictory, and paradoxical descriptions of it. later in his career, bohr vigorously affirmed his belief that the complementarity principle was not limited to quantum physics:

in general philosophical perspective, it is significant that, as regards analysis and synthesis in other fields of knowledge, we are confronted with situations reminding us of the situation in quantum physics. thus, the integrity of living organisms, and the characteristics of conscious individuals, and most of human cultures, present features of wholeness, the account of which implies a typically complementary mode of description. . . . we are not dealing with more or less vague analogies, but with clear examples of logical relations which, in different contexts, are met with in wider fields.8

within a library, there are many things catalogers, conservators, and preservation scientists—each with their distinctive skills, points of view, and business rules—can observe and say about cultural heritage resources.9 much of what these specialists say and do strongly affects library users’ ability to discover, access, and use library resources in their original or surrogate forms. while observations made by these specialists from different perspectives may lead to descriptions that must be accepted as valid for those specialists, a fuller appreciation of these descriptions calls for the integration of those multiple perspectives into a well-articulated, accessible whole.

reflecting the perspectives of the library of congress directorates in which we work, the acquisitions and bibliographic access (aba) directorate and the preservation directorate, we assert that the most fundamental complementary views on cultural heritage resources involve describing a library’s resources in terms of their availability (from an acquisitions perspective), in terms of their information content (from a cataloging perspective), and in terms of their physical properties (from a preservation perspective). for example, in the normal languages used to communicate their results, preservation directorate conservators narrate their condition assessments and record simple physical measurements of library-managed objects—while at the same time preservation scientists in another section bring instrumentation to acquire optical and chemical data from submitted materials and from reference collections of physical and digital media. even though these assessments and measurements may not be comprehended by or made accessible to most library users, the information gathered possesses a critical logical relationship to bibliographic and other descriptions of those same resources. key decisions regarding a library resource’s fitness for purpose, its reformatting, and its long-term preservation must take into consideration that resource’s physical characteristics.

having things to say about cultural heritage resources—and having many “voices” with which to say them—presents the problem of creating a well-articulated context for library-generated resource descriptions as well as those from other sources. these contextualization issues must be addressed theoretically before implementation-level thinking, and the demands of contextualization require visualization tools to complement the narratives common to catalogers, scholars, and other users. this is where mathematics and ethnomathematics make their entrance.

ethnomathematics is the study of the mathematical practices of specific cultural groups over the course of their daily lives and as they deal with familiar and novel problems.10 an ethnomathematical perspective on cultural heritage resource description directs one’s attention to the existence of simple and complex resource descriptions, the patterns of descriptions that have been created, and the representation of these patterns when they are interpreted as expressions of mathematical ideas. a key advantage of operating from an ethnomathematical perspective is becoming aware that mathematical ideas can be observed within a culture (namely the people and institutions who play key roles in observing the bibliographic universe) before their having been identified and treated formally by western-style mathematicians.

■■ resource description as graph creation

relationships between cultural heritage resource descriptions can be represented as conceptually engaging and flexible systems of connections mathematicians call graphs. a full appreciation of two key mathematical ideas underlying the evolution of cataloging—putting things into groups and defining relationships between things and groups of things—was only possible after the founding, naming, and expansion of graph theory, which is a field of mathematics that emerged in the 1850s, and the eventual acceptance around 1900 of set theory, a field founded amid intense controversy in 1874. between the emergence of formal mathematical treatments of those ideas by mathematicians and their actual exploitation by cataloging theorists—or by anyone capable of considering library resource description and organization problems from a mathematical perspective—lay a gulf of more than one hundred years.11

it remained for scholars in the library world to begin addressing the issue. tillett’s 1987 work on bibliographic relationships and svenonius’s 2000 definition of bibliographic entities in set-theoretic terms identified those mathematical ideas in cataloging theory and developed them formally.12 then in 2009, we were able to employ graph theory (expressed in set-theoretical terms and in its highly informative graphical representation) as part of a broader historical and cultural analysis.13 cataloging theory had by 2009 haltingly embraced a new view on how resources in libraries have been described and arranged via their descriptions—an activity that in principle stretches back to catalogs created for the library of alexandria14—and how these structured resource descriptions have evolved over time, irrespective of implementation.

murray’s investigation into this issue revealed that the increasingly formalized and refined rules that guided anglo-american catalogers had, by 1876, specified sophisticated systems of cross-references (i.e., connections between bibliographic descriptions of works, authors, and subjects)—systems whose properties were not yet the subject of formal mathematical treatment by mathematicians of the time.15 murray also found that library resource description structures—when teased out of their book and card and digital catalog implementations and treated as graphs—are arguably more sophisticated than those being explored in the world wide web consortium’s (w3c) library linked data initiative.16

implementation-oriented substitutes for graph theory

cataloging theory has been both helped and hindered by the use of information technology (it) techniques like entity-relationship modeling (e-r, first used extensively by tillett in 1987 to identify bibliographic relationships in cataloging records) and object-oriented (oo) modeling.17 e-r and oo modeling may be used effectively to create information systems that are based on an inventory of “things of interest” and the relationships that exist between them. unfortunately, the things of interest in cultural heritage institutions keep changing and may require redefinition, aggregation, disaggregation, and re-aggregation. e-r and oo modeling as usually practiced are not designed to manage the degree and kind of changes that take place under those circumstances.

when trying to figure out what is “out there” in the bibliographic universe, we assert that focus should first be placed on identifying and describing the things of interest, what relationships exist between them, and what processes are involved in the creation, etc., of resource descriptions. having accomplished this, attention can then be safely paid to defining and managing information deemed essential to the enterprise, that is, undertaking it system analysis and design. but when an it-centric modeling technique becomes the bed on which the resource description theory itself is constructed, the resulting theory will be driven in a direction that is strongly influenced by the modeling technique. what is required instead is theory-based guidance of systems development, alongside theory testing and improvement through application use.

if software development is not constrained by a tacit or explicit resource description theory or practice, graph or other data structures familiar to the historically less well-informed, those favored by an institution’s system designers and developers, or those familiar to and favored by implementation-oriented communities may be invoked inappropriately.18 given graph theory’s potentially overwhelming mathematical power—as evidenced by its many applications in the physical sciences, engineering, and computer science—investigations into graph theory and its history require close attention both to the history and evolving needs of the cultural heritage community.19

the unnecessary constraint on resource description theory formation occasioned by the use of e-r or oo modeling can be removed by dispensing with it system analysis tools and expressing resource description concepts in graph-theoretical terms. with this step, the very general elements (i.e., entities and relationships) that characterize e-r models and the more implementation-oriented ones in oo models are replaced by more mathematically flexible, theory-relevant elements expressed in graph-theoretical terms. the result is a “graph-friendly” theory of cultural heritage resource description, which can borrow from other fields (e.g., ethnomathematics, history of science) to improve its descriptive and predictive power, guide it system design and use, and, in response to users’ experiences with functioning systems, result in improved theories and information systems.

graph theory in a cultural heritage context

ever since the nineteenth-century foundation of graph theory (though scholars regularly date its origins from euler’s 1736 paper)20 and its move from the backwaters of recreational mathematics to full field status by 1936, graph theory has concerned itself with the properties of systems of connections—nowadays regularly expressed as the mathematical objects called sets.21 in addition to their set notational form, graphs also are depicted and manipulated in diagrammatic form as dots/labeled nodes linked by labeled or unlabeled, simple or arrowed lines. for example, the graph x, consisting of one set of nodes labeled a, b, c, d, e, and f and one set of edges labeled ab, bd, de, ef, and fc, can be depicted in set notation as x = {{a b c d e f}, {ab bd de ef fc}} and can be depicted diagrammatically as in figure 2.

when graphs are defined to represent different types of nodes and relationships, it becomes possible to create and discuss structures that can support cultural heritage resource description theory and application building. the following diagrams depict simple resource description

of the resources they describe. figure 4’s diagrammatic simplicity becomes problematic when large quantities of resources are to be described, when the number and kinds of relationships recorded grow large, and when more comprehensive but less-detailed views of bibliographic relationships are desired. to address these problems in a comprehensive fashion, we examined similar complex description scenarios in the sciences and borrowed another idea from the physics community—paper tool creation and use.
■■ paper tools: graph-aware diagram creation

paper tools are collections of symbolic elements (diagrams, characters, etc.), whose construction and manipulation are subject to specified rules and constraints.23 berzelian chemical notation (e.g., c6h12o6) and—more prominently—feynman diagrams like those in figure 5 are familiar examples of paper tool creation and use.24 creating a paper tool resource diagram requires that the rules for creating resource descriptions be reflected in diagram elements, properties of diagram elements, and drawing rules that define how diagram/symbolic elements are connected to one another (e.g., the formula c6h12o6 specifies six atoms of carbon, twelve of hydrogen, and six of oxygen). the detailed bibliographic information in figure 4 is progressively schematized in a

graphs that are based on real-world bibliographic descriptions. nodes in the graphs represent text, numbers, or dates and relationships that can be nondirectional (as a simple line), unidirectional (as single arrowed lines) or bidirectional (as a double arrowed line). the all-in-one resource description graph in figure 3 can be divided and connected according to the kinds of relationships that have been defined for cultural heritage resources. this is the point where institutional, group, and individual ways of describing resources shape the initial structure of the graph. once constructed, graph structures like this and their diagrammatic representations are then interpreted in terms of a tacit or explicit resource description theory. in the case of graphs constructed according to ifla’s functional requirements for bibliographic records (frbr) standard,22 figure 3 can be subdivided into four frbr sub-graphs, yielding figure 4. the four diagrams depict the initial graph of cataloging data as four complementary frbr wemi (w–work, e–expression, m–manifestation, and i–item) graphs.
note that the item graph contains the call numbers (used here to identify the location of the copy) of three physical copies of the novel. this use of call numbers is qualitatively different from the values found in the manifestation graph in that resource descriptions in this graph apply to the entire population of physical copies printed by the publisher. the descriptions contained in figure 4’s frbr subgraphs reproduce bibliographic characteristics found useful by catalogers, scholars, other educationally oriented end users, and to varying extents the public in general. once created, resource description graphs and subgraphs (in mathematical notation or in simple diagrams like figure 4) can proliferate and link in multiple and complex ways—in parallel with or independently

figure 3. library of congress catalog data for thomas pynchon’s novel gravity’s rainbow, represented as an all-in-one graph labeled c

figure 2. a diagrammatic representation of graph x

6 graph is now represented explicitly by a black dot in a ring in the more schematic paper tool version. resource descriptions are then represented in fixed colors and positions relative to the resource/ring: the work-level resource description is represented by a blue box, expression by a green box, manifestation by a yellow box, and item by a red box. depicting one aspect of the frbr

way that reflects frbr definitions of bibliographic things of interest and their relevant relationships. as a first step, the four wemi descriptions in figure 4 are given a common identity by linking them to a c node, as in figure 6. the diagram is then further schematized such that frbr description types and relationships are represented by appropriate graphical elements connected to other elements.
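the division of an all-in-one graph into four wemi subgraphs that share a common c node can be sketched in a few lines. the python below is a hedged illustration, not the authors' procedure: the triple layout, the property names, and the function `split_wemi` are assumptions made here, and the publisher and call-number values are placeholders rather than real catalog data.

```python
# The four FRBR description levels, ordered from most abstract to most concrete.
WEMI = ("work", "expression", "manifestation", "item")

# Hypothetical description statements for one resource, tagged by FRBR level.
# Title and author come from the text; every other value is a placeholder.
descriptions = [
    ("work", "title", "gravity's rainbow"),
    ("work", "creator", "thomas pynchon"),
    ("expression", "language", "english"),
    ("manifestation", "publisher", "placeholder-publisher"),
    ("item", "call_number", "placeholder-call-no-1"),
    ("item", "call_number", "placeholder-call-no-2"),
    ("item", "call_number", "placeholder-call-no-3"),
]


def split_wemi(resource, descriptions):
    """Divide tagged descriptions into four subgraphs sharing one resource node.

    Each subgraph is a set of labeled edges (resource, property, value), so
    all four remain linked through the common resource node.
    """
    subgraphs = {level: set() for level in WEMI}
    for level, prop, value in descriptions:
        subgraphs[level].add((resource, prop, value))
    return subgraphs


subgraphs = split_wemi("c", descriptions)
for level in WEMI:
    print(level, sorted(subgraphs[level]))
```

keeping the resource node in every edge is one simple way to give the four descriptions a common identity; the three item edges mirror the three physical copies mentioned in the text.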
the result shows how a frbr paper tool makes it much easier to construct and examine complex large-scale properties of resource and resource description structures (like figure 7, right side) without being distracted by textual and linkage details. the resource described (but not shown) by the figure

figure 4. the all-in-one graph in figure 3, separated into four frbr work (top-left), expression (top-right), manifestation (bottom-left), and item (bottom-right) graphs

figure 5. feynman diagrams of elementary particle interactions

figure 6. a frbr resource description graph

expressions. the work products of scholars—especially those creations that are dense with quotations, citations, and other types of direct and derived textual and graphical reference within and beyond themselves—are excellent environments for paper tool explorations and more generally, for testing of exemplars—solutions to the potentially complex problem of describing cultural heritage resources.

■■ exemplars

the fourth principle in our cultural heritage resource description theory involves exemplar identification and analysis. according to the historian of science thomas s. kuhn, exemplars are sets of concrete problems and solutions encountered during one’s education, training, and work. in the sciences, exemplar-based problem finding and solving involves mastery of relevant models, builds knowledge bases, and hones problem-solving skills. every student in a field would be expected to demonstrate mastery by learning and using their field’s exemplars.
change within a scientific field is manifest by the need to modify old or create new exemplars as new problems appear and must be solved.26 a cultural heritage resource description theorist would, in addition to identifying and developing exemplars from real bibliographic data and other sources, want to speculate about possible resource/description configurations that call for changes in existing information technologies. to the theorist, it would be as important to find out what can’t be done with frbr and other resource description models at library, archive, museum, and internet scales, as it is to be able to explain routine item cataloging and tagging activities. discovering system limitations is better done in advance by simulating uncommon or challenging circumstances than by having problems appear later in production systems. model graphically, the descriptions closest to the black dot resource/slot are the most concrete and those furthest away the most abstract. (readers wishing to interpret frbr paper tool diagrams without reference to color values should note the strict ordering of wemi elements: w–e–m–i–resource/ring or resource/ring–i–m–e–w.) finally, to minimize element use when pairs of wemi boxes touch, the appropriate frbr linking relationship for the relevant pair of descriptions (as explicitly shown in the expanded graph) is implied but not shown. with appropriate diagramming conventions, the process of creating and exploring resource description complexes addresses combined issues of cataloging theory and institutional policy—and results in an ability to make better-informed judgments/computations about resource descriptions and their referenced resources. as a result, resource description graphs are readily created and transformed to serve theoretical—and with greater experience in thinking and programming along graph-friendly lines, practical—ends. 
one example of transformability would arise when exploring the implications of removing redundant portions of related resource descriptions as more copies of the same work are brought to the bibliographic universe. the frbr paper tool elements and the more articulated resource description graphs in figure 8 both depict the consequences of a practical act: combining resource descriptions for two copies of the same edition of the novel gravity’s rainbow.25 the top-most frbr diagram and its magnified section depict how the graph would look with a single item-level description, the call number for one physical copy. the bottom-most frbr diagram and its magnified section depict the graph with two item-level descriptions, the call numbers for two physical copies. a frbr paper tool’s flexibility is useful for exploring potentially complex bibliographic relationships created or uncovered by scholars—parties whose expertise lies in identifying, interrelating, and discussing creative concepts and influences across a full range of communicative

figure 7. a frbr paper tool diagram element (left) and the less schematic frbr resource description graph it depicts (right)

drawing diagrams. use case diagrams are secondary in use case work.28

as products of and guides for theory making, resource description exemplars have different origins and audiences than those for use cases. while use cases and exemplars offer perspectives that can support information system design, exemplars were originally introduced as theoretical entities by kuhn to explain how theories and theory-committed communities can crystallize around problem-solution sets, how these sets also can serve as pedagogical tools, and why and when problem-solution sets get displaced by new ones.
the proposed process of cultural heritage exemplar creation and use, followed by modification or replacement in the face of changes in the bibliographic universe, draws on kuhn’s and historian of science david kaiser’s interest in how work gets done in the sciences, in addition to their rejection of paradigms as eerie self-directing processes.29

exemplars are not use cases

use cases are a software modeling technique employed by the w3c library linked data incubator group (lld xg) in support of requirements specification.27 kuhn-style exemplars are definitely not to be confused with use cases, which are requirements-gathering documents that contribute to software engineering projects. there is a wikipedia definition of a use case that describes its properties: a use case in software engineering and systems engineering, is a description of steps or actions between a user (or “actor”) and a software system which leads the user towards something useful. the user or actor might be a person or something more abstract, such as an external software system or manual process. . . . use cases are mostly text documents, and use case modeling is primarily an act of writing text and not

figure 8. frbr paper tool diagram elements and the frbr resource description graphs they depict

■■ a webpage and its underlying, globally distributed, multimedia resource network, as it changes over time.

such exemplars can be presented diagrammatically through the use of paper tools.
this use of diagrams in support of conceptualization and information system design is deliberately patterned after professional data modeling theory and practice.31 paper tool–supported analyses of a nineteenth-century american novel (exemplar 1) and of eighteenth-century french poems drawn from state archives (exemplar 2) will be presented to illustrate how information system design and pedagogy can be informed by exemplary scholarly research and publication, combined with narrativized diagrammatic representations of bibliographic and other relationships in traditional and digital media. exemplar 1. from moby-dick to mash-ups—a print publication history and multimedia mash-up problem document the publication history of print copies of a literary work, identifying editorially driven content transfer across print editions along with content selection and transformation in support of multimedia resource creation. solution the solution to this descriptive problem relies heavily on placing resource descriptions into groups and then defining relationships within and across those groups— i.e., on graph creation. after locating a checklist that documented the publication history of the novel and after identifying key components of a moby-dick and orson welles–themed multimedia resource appropriation and transformation network, murray used the frbr paper tool along with additional connection rules to create a resource description diagram (rdd) that represented g. thomas tanselle’s documentation of the printing history (from 1851 to 1976) of herman melville’s epic novel, moby-dick.32 the resulting diagram provides a high-level view of a large set of printed materials—depicting concepts such as a creative work, the expression of the work in a particular mode of languaging (i.e., speech, sign, image), and more concrete concepts such as publications. 
to reduce displayed complexity, sets of frbr diagram elements were collapsed into green shaded squares representing entire editions/printings, yielding figure 9.33 the vertical axis represents the year of publication, starting with the 1851 printings at the top. connected squares the resulting network of connections in figure 9 can be interpreted in publishing terms. one line or two or more lines descending downwards from a printing’s green in addition, resource description structures specified in an exemplar can and should represent a more abstract treatment of a resource description and not just data or data structures engaged by end users. exemplars on hand and others to come cultural heritage resource description exemplars have been created over time as solutions to problems of resource description and later made available for use, study, mastery, and improvement. while not necessarily bound to a particular information technology, such as papyrus, parchment, index cards, database records, or rdf aggregations, resource description exemplars have historically provided descriptive solutions of physical resources whose physical and intellectual structure had originally been innovative solutions to describing, for example, ■■ a manuscript (individual and related multiples, published but host to history, imaginary, etc.); ■■ a monograph in one edition (individual and related multiples); ■■ a monograph in multiple editions (individual and related multiples); and ■■ a publication in multiple media, created sequentially or simultaneously. with the advent of electronic and then digital communications media, more complex resource description problem-solution sets have been called for as a response to enduringly or recently more sophisticated creative/ editorial decision-making and to more flexible print and digital information technology production capabilities. 
the most challenging problem-solution sets involve the assembly and cross-referencing of several multipart—and possibly multimedia—creative or editorially constructed works, such as the following:
■■ a work published as a monograph, but which has been reprinted and reedited; translated into numerous languages; supplemented by illustrations from multiple artists; excerpted and adapted as plays, an opera, comic books, and cartoon series; multimedia mash-ups; and has been directly quoted in paintings and other graphic arts productions, and has been the subject of dissertations, monographs, journal articles, etc.
■■ a continuing publication (individual and related multiple publications, special editions, name, publisher, editorial policy changes, etc.).
■■ a monograph whose main content is composed nearly entirely of excerpts from other print publications.30
■■ a library-hosted multimedia resource and its associated resource description network.

by paper tool diagram creation, analysis, and subsequent action, namely,
■■ connecting the squares (i.e., assigning at least one relationship to a printing) ensures access based on the relationship assigned; and
■■ parties located around the globe can examine a given connected or disconnected resource description network and develop strategies for enhancing its usefulness.

the wealth of descriptive information available in the moby-dick exemplar illustrates how previous and future collaborative efforts between cultural heritage institutions and other parties have already generated resource descriptions that possess a network structure alongside their content.
with a more graph-friendly and collaborative implementation, melville scholars, scholarly organizations,34 and enthusiasts could more effectively examine, discuss, and through their actions enhance the moby-dick resource description network’s documentary, scholarly, and educational value. in its original form, the moby-dick resource description diagram (and the exemplar it partially documents) only depicted full-length publications of melville’s work. as a test of the frbr paper tool’s ability to accommodate both traditional and modern creative expressions in individual and aggregate form—while continuing to serve theoretical, practical, and educational ends—murray added a resource description network for orson whales,35

square are interpreted to mean that the printing gave rise to one or more additional printings, which may occur in the same or later years. two or more lines converging on a green square from above indicate that the printing was created by combining texts from multiple prior printings—an editorial/creative technique similar to that used to construct the mash-ups published on the web.

connecting unconnected squares

tanselle’s checklist did not specify predecessor or successor relationships for each post–1851 printing. this often unavoidable, incomplete status is depicted in figure 9 as green squares that are
■■ not linked to any squares above them, i.e., to earlier printings; and/or
■■ not linked to any squares below them, i.e., to later printings; or
■■ connected islands, without a link to the larger structure.

recognizing the extent of moby-dick printing disconnectedness in tanselle’s checklist and developing a strategy for dealing with it only by analyzing tanselle’s checklist would be extremely difficult. in contrast, the disconnectedness of the moby-dick resource description network, and its implications for search-based discovery based on following the depicted relationships, is readily discernible in figure 9.
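the three kinds of disconnectedness listed above can also be computed once printings and their relationships are held as a graph. the python below is a minimal sketch under stated assumptions: the printing identifiers, the years after 1851, and the edges are invented for illustration and do not reproduce tanselle's checklist, and an "island" is taken here to mean any connected group that does not contain the first printing.

```python
from collections import deque

# Hypothetical printings (identifier -> year) and "gave rise to" edges.
printings = {"p1": 1851, "p2": 1892, "p3": 1930, "p4": 1950, "p5": 1952, "p6": 1967}
gave_rise_to = {("p1", "p2"), ("p2", "p3"), ("p4", "p5")}


def no_predecessor(nodes, edges):
    """Printings with no line arriving from an earlier printing."""
    return set(nodes) - {later for _, later in edges}


def no_successor(nodes, edges):
    """Printings with no line leading to a later printing."""
    return set(nodes) - {earlier for earlier, _ in edges}


def components(nodes, edges):
    """Connected groups of printings, ignoring edge direction."""
    adj = {n: set() for n in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, groups = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        group, queue = set(), deque([start])
        while queue:
            n = queue.popleft()
            if n not in group:
                group.add(n)
                queue.extend(adj[n] - group)
        seen |= group
        groups.append(group)
    return groups


first = min(printings, key=printings.get)
islands = [g for g in components(printings, gave_rise_to) if first not in g]
print(sorted(no_predecessor(printings, gave_rise_to)))
print(sorted(no_successor(printings, gave_rise_to)))
print(islands)
```

in this toy data the first printing itself has no predecessor, so a real analysis would probably exclude the 1851 printings from that list before treating the remainder as gaps to be filled.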
the ease with which the disconnected condition can be assessed also hints at benefits to be gained by collaborative resource description supported

figure 9. a moby-dick resource description diagram, depicting relationships between printings made between 1851–1976 (greatly reduced scale)

darnton’s book can stand on its own as an exemplar for historical method, with the diagram providing additional diagrammatic support.

solution 2

darnton’s analysis treated each poem found in the archives as an individual creative work,38 enabling the use of the frbr paper tool (as a bookkeeping device this time) instead of a tool designed to aggregate and describe archival materials. the resulting diagram is a more articulated frbr paper tool depiction of darnton’s poetry communication network, a section of which appears as figure 11. the depiction of the poetry communication network shown in figure 11 is composed of:
■■ tan squares that depict individuals (clerks, professors, priests, students, etc.) who read, discussed, copied, and passed along the poems.
■■ diagram elements that depict poetry written on scraps of paper (treated as resources) that were in police custody, were admitted to having existed by suspects, or assumed to have existed by the police. if one’s theory and business rules permit it, paper tool drawing conventions can depict descriptions of lost and nonexistent but nonetheless describable resources.
■■ arrowed lines that represent relationships between a poem and the individuals who owned copies, those who created or received copies of the poem, etc.39 with darnton’s monograph to provide background information regarding the historical personages involved, relationships between the works and the people, document selection from archival fonds, and the point of view of the scholar, the resulting problem-solution set can: ■■ serve as enhanced documentation for darnton-style communication network analysis and discussion. ■■ serve as an exemplar for catalogers, scholars, and alex itin’s moby-dick-themed multimedia mash-up, to the print media diagram. the four-minute long orson whales multimedia mashup contains hundreds of hand-painted page images from the novel, excerpts from the led zeppelin song “moby dick,” parts of two vocal performances by the actor orson welles, and a video clip from welles’s motion picture citizen kane. the result is shown in figure 10.36 the leftmost group of descriptions in figure 10 depicts various releases of led zeppelin’s “moby dick.” the central group depicts the sources of two orson welles audio dialogues after they had been ripped (i.e., digitized from physical media) and made available online. the grouping on the right depicts the orson whales mash-up itself and collections of digital images of painted pages created from two printed copies of the novel. exemplar 2. poetry and the police—archival content identification and critical analysis problem examine archival collections and select, describe, and document ownership and other relationships of a set of documents (poems) alleged to have circulated within a loosely defined social group. solution 1 in his 2010 work, poetry and the police: communication networks in eighteenth-century paris, historian robert darnton studied a 1749 paris police investigation into the transmission of poems highly critical of the french king, louis xv. 
after combing state archives for police reports, finding and identifying scraps of paper once held as evidence, and collecting other archival materials, darnton was able to construct a poetry communication network diagram,37 which, along with his narrative account, identified a number of parties who owned, copied, and transmitted six of the scandalous poems and placed their activities in a political, social, and literary context.

figure 10. a resource description diagram of alex itin’s moby-dick multimedia work, depicting the resources and their frbr descriptions.

with all of the adaptations and excerpts extant within a specified bibliographic universe (such as the cataloging records that appear in oclc’s worldcat bibliographic database). resource description diagrams, created from real-world or theoretically motivated considerations, would then provide a diagrammatic means for depicting the precise and flexible underlying mathematical ideas that, heretofore unrecognized but nonetheless systematically employed, serve resource description ends. if the structure of a well-motivated and constructed resource description diagram subsequently makes data representation and management requirements that a given information system cannot accommodate, cataloging theorists and information technologists alike will then know of that system’s limitations, will work together on mitigating them, and will embark on improving system capabilities.
■■ cataloging theory, tool-making, education, and practice this modernized resource description theory offers new and enhanced roles and benefits for cultural heritage personnel as well as for the scholars, students, and those members of the general public who require support not just for searching, but also for collecting, reading, writing, collaborating, monitoring, etc.40 information systems that others who seek similar solutions to their problems with identifying, describing, depicting, and discussing as individual works documents ordinarily bundled within hierarchically structured archival fonds at multiple locations. ■■ a paper tool into a power tool there are limits to what can be done with a hand-drawn frbr paper tool. while murray was able to depict largescale bibliographic relationships that probably had not been observed before, he was forced to stop work on the moby-dick diagram because much of the useful information available could not fit into a static, hand-drawn diagram. we think that automated assistance in creating resource description diagrams from bibliographic records is required. with that capability available, cataloging theorists and parties with scholarly and pedagogical interests could interactively and efficiently explore how scholars and sophisticated readers describe significant quantities of analog and digital resources. it would then be possible and extremely useful to be able to initiate a scholarly discussion or begin a lecture by saying, “given a moby-dick resource description network . . . ” and then proceed to argue or teach from a diagram depicting all known printings of moby-dick—along figure 11. 
a section of darnton’s poetry communication network

the value of non-euclidean geometry lies in its ability to liberate us from preconceived ideas in preparation for the time when exploration of physical laws might demand some geometry other than the euclidean.41

taking riemann to heart, we assert that the value of describing cultural heritage resources as observations organized into graphs and of enhancing and supplementing the resource description exemplars that have evolved over time and circumstance rests in opportunities for liberating the cultural heritage community from preconceived ideas about resource description structures and from longstanding points of view on those resources. having achieved such a goal, the cultural heritage community would then be ready when the demand came for resource description structures that must be more flexible and powerful than the traditional ones. given the unprecedented development of the web and the promise of bottom-up semantic web initiatives, we think that the time for the cultural heritage community’s liberation is at hand.

■■ acknowledgments

the authors wish to thank beacher wiggins and dianne van der reyden, directors of the library of congress acquisitions and bibliographic access directorate and the preservation directorate, respectively, for supporting the authors’ efforts to explore and renew the scientific and mathematical foundations of cultural heritage resource description. thanks also to marcia ascher, david hay, robert darnton, daniel huson, and mark ragan, whose scholarship informed our own; and to joanne o’brienlevin for her critical eye and for editorial advice.

references and notes

1. oed online, “catalogue, n.” http://www.oed.com/viewdictionaryentry/entry/28711 (accessed aug. 10, 2011).
2. peter galison, “part ii: building data,” in image & logic: a material culture of microphysics (chicago: univ. of chicago pr., 2003): 370–431.
3.
gordon mcquat, “cataloguing power: delineating ‘competent naturalists’ and the meaning of species in the british museum,” british journal for the history of science 34, no. 1 (mar. 2001): 1–28. exclusive control of classification schemes and of the records that named and described its specimens are said to have contributed to the success of the british museum’s institutional mission in the nineteenth century. as a division of the british museum, the british library appears to have incorporated classification concepts (hierarchical structuring) from its parent and elaborated on the museum’s strategies for cataloging species. 4. oed online, “observation, n.” http://www.oed.com/ viewdictionaryentry/entry/129883 (accessed july 8, 2011). couple modern, high-level understandings about how cultural heritage resources can be described, organized, and explored with data models that support linking within and across multiple points of view will be able to support those requirements. the complementarity of cosmological and quantum-level views cataloging theory formation and practice—two areas of activity that did not interest many outside of cultural heritage institutions—can now be understood as a much more comprehensive multilayered activity that is approachable from at least two distinct points of view. the approach presented in this paper represents a cosmological-level view on the bibliographic universe. this treatment of existing or imaginable large-scale configurations of cultural heritage resource descriptions serves as a complement to the quantum-level view of resource description, as characterized by it-related specificities such as character sets, identifiers, rdf triples, triplestores, etc. activities at the quantum level—the domain of semantic web technologists and others—yield powerful and relatively unconstrained information management systems. 
in the absence of cosmological-level inspiration or guidance, these systems have not necessarily been tested against nontrivial, challenging cultural heritage resource description scenarios like those documented in the above two exemplars. applying both views to the bibliographic universe would clearly be beneficial for all institutional and individual parties involved. if a model for multilevel, multidisciplinary effort is required, the history of physics supplies one, illuminated as it is by mutually influential interactions of cosmological and quantum-level theories, practices, and pedagogy. workers in cultural heritage institutions and technologists pursuing w3c initiatives would do well to reflect on the result.
■■ ready for the future—and creating the future
to explore the cultural, scientific, and mathematical ideas underlying cultural heritage resource description, to identify, study, and teach with exemplars, and to exploit the theoretical reach and bookkeeping capability of paper tool–like techniques is to pay homage to the cultural heritage community’s 170+ year-old talent for pragmatic, implementation-oriented thinking, while at the same time pointing out a rich set of possibilities for enhanced service to society. the cultural heritage community can draw inspiration from geometrician bernhard riemann’s own justification for his version of thinking outside of the box called euclidean geometry:
cataloging theory in search of graph theory and other ivory towers | murray and tillett 183
18. the prospects for creating graph-theoretical functions that operate on resource description networks are extremely promising. for example, combinatorica (an implementation of graph theory concepts created for the computer mathematics application mathematica) is composed of more than 450 functions.
were cultural heritage resource description networks to be defined using this application’s graph-friendly data format, significant quantities of combinatorica functions would be available for theoretical and applied uses; sriram pemmaraju and steven skiena, computational discrete mathematics: combinatorics and graph theory with mathematica (new york: cambridge univ. pr., 2003).
19. dénes könig, theory of finite and infinite graphs, trans. richard mccoart (boston: birkhäuser, 1990); fred buckley and marty lewinter, a friendly introduction to graph theory (upper saddle river, n.j.: pearson, 2003); oystein ore and robin wilson, graphs and their uses (washington d.c.: mathematical association of america, 1990).
20. leonhard euler, “solutio problematis ad geometriam situs pertinentis,” commentarii academiae scientiarum imperialis petropolitanae no. 8 (1736): 128–40.
21. “set theory, branch of mathematics that deals with the properties of well-defined collections of objects, which may or may not be of a mathematical nature, such as numbers or functions. the theory is less valuable in direct application to ordinary experience than as a basis for precise and adaptable terminology for the definition of complex and sophisticated mathematical concepts.” quoted from encyclopædia britannica online, “set theory,” oct. 2010, http://www.britannica.com/ebchecked/topic/536159/set-theory (accessed oct. 27, 2010).
22. ifla study group on the functional requirements for bibliographic records, functional requirements for bibliographic records: final report (munich: k.g. saur, 1998). this document is downloadable as a pdf from http://www.ifla.org/vii/s13/frbr/frbr.pdf or as an html page at http://www.ifla.org/vii/s13/frbr/frbr.htm.
23. ursula klein, ed., experiments, models, paper tools: cultures of organic chemistry in the nineteenth century (stanford, calif.: stanford univ.
pr., 2003); klein, ed., tools and modes of representation in the laboratory sciences (boston: kluwer, 2001); david kaiser, drawing theories apart: the dispersion of feynman diagrams in postwar physics (chicago: univ. of chicago pr., 2005).
24. for more examples and a general description of feynman diagrams, see http://www2.slac.stanford.edu/vvc/theory/feynman.html.
25. an enlarged version of this diagram may be found online. ronald j. murray and barbara b. tillett, “frbr paper tool diagram elements and the frbr resource description graphs they depict,” aug. 2011, http://arizona.openrepository.com/arizona/bitstream/10150/139769/2/fig%208%20frbr%20paper%20tool%20elements%20and%20graphs.pdf. other informative illustrations also are available. murray and tillett, “resource description diagram supplement to ‘cataloging theory in search of graph theory and other ivory towers. object: cultural heritage resource description networks,’” aug. 2011, http://hdl.handle.net/10150/139769.
26. thomas s. kuhn, the structure of scientific revolutions, 2nd ed. (chicago: univ. of chicago pr., 1970).
27. daniel vila suero, “use case report,” world wide web consortium, june 27, 2011, http://www.w3.org/2005/incubator/lld/wiki/usecasereport.
5. david c. hay, uml and data modeling: a vade mecum for modern times (bradley beach, n.j.: technics pr., forthcoming 2011): 124–25. some scholars argue that decisions as to what the things of interest are and the categories they belong to are influenced by social and political factors. geoffrey c. bowker and susan leigh star, sorting things out: classification and its consequences (cambridge, mass.: mit pr., 1999).
6. gerald holton, “the roots of complementarity,” daedalus 117, no. 3 (1988): 151–97, http://www.jstor.org/stable/20023980 (accessed feb. 24, 2011).
7. niels bohr, quoted in aage petersen, “the philosophy of niels bohr,” bulletin of the atomic scientists 19, no. 7 (sept. 1963): 12.
8.
niels bohr, “quantum physics and philosophy: causality and complementarity,” in essays 1958–1962 on atomic physics and human knowledge (woodbridge, conn.: ox bow, 1997): 7.
9. for cataloging theorists, the description of cultural heritage things of interest yields groups of statements that occupy different levels of abstraction. upon regarding a certain physical object, a marketer describes product features, a linguist enumerates utterances, a scholar perceives a work with known or inferred relationships to other works, and so on.
10. marcia ascher, ethnomathematics: a multicultural view of mathematical ideas (pacific grove, calif.: brooks/cole, 1991); ascher, mathematics elsewhere: an exploration of ideas across cultures (princeton: princeton univ. pr., 2002).
11. a timeline of events, people, and so on that have had or should have had an impact on describing cultural heritage resources is available online. seven fields or subfields are represented in the timeline and keyed by color: library & information science; mathematics; ethnomathematics; physical sciences; biological sciences; computer science; and arts & literature. ronald j. murray, “the library organization problem,” dipity.com, aug. 2011, http://www.dipity.com/rmur/library-organization-problem/ or http://www.dipity.com/rmur/library-organization-problem/?mode=fs (fullscreen view).
12. barbara ann barnett tillett, “bibliographic relationships: toward a conceptual structure of bibliographic information used in cataloging” (phd diss., university of california, los angeles, 1987); elaine svenonius, the intellectual foundation of information organization (cambridge, mass.: mit pr., 2000): 32–51. svenonius’s definition is opposed to database implementations that permitted boolean operations on records at retrieval time.
13. ronald j. murray, “the graph-theoretical library,” slideshare.net, july 5 2011, http://www.slideshare.net/ronmurray/-the-graph-theoretical-library.
14. francis j.
witty, “the pinakes of callimachus,” library quarterly 28, no. 1–4 (1958): 132–36.
15. ronald j. murray, “re-imagining the bibliographic universe: frbr, physics, and the world wide web,” slideshare.net, oct. 22 2010, http://www.slideshare.net/ronmurray/frbrphysics-and-the-world-wide-web-revised.
16. for an overview of the technology-driven library linked data initiative, see http://linkeddata.org/faq. murray’s analyses of cultural heritage resource descriptions may be explored in a series of slideshows at http://www.slideshare.net/ronmurray/.
17. pat riva, martin doerr, and maja žumer, “frbroo: enabling a common view of information from memory institutions,” international cataloging & bibliographic control 38, no. 2 (june 2009): 30–34.
36. the multimedia mash-up in figure 10 was linked to the much larger moby-dick structure depicted in figure 9. the combination of the two yields figure 10a, which is too detailed for printout but which can be downloaded for inspection as the following pdf file: ronald j. murray and barbara b. tillett, “transfer and transformation of content across cultural heritage resources: a moby-dick resource description network covering full-length printings from 1851–1976*,” july 2011, http://arizona.openrepository.com/arizona/bitstream/10150/136270/4/fig%2010a%20orson%20whales%20in%20moby%20dick%20context.pdf. in the figure, two print publications have been expanded to reveal their own similar mash-up structure.
37. robert darnton, poetry and the police: communication networks in eighteenth-century paris (cambridge, mass.: belknap pr. of harvard univ. pr., 2010): 16.
38. ronald j. murray in a discussion with robert darnton, sept. 20, 2010. darnton considered the poems retrieved from the archives as distinct intellectual creations, which permitted the use of frbr diagram elements for the analysis.
otherwise, a paper tool with diagram elements based on the archival descriptive standard isad(g) would have been used. committee on descriptive standards, isad(g): general international standard archival description (stockholm, sweden, 1999– ).
39. the complete poetry communication diagram may be viewed at http://arizona.openrepository.com/arizona/bitstream/10150/136270/6/fig%2011%20poetry%20communication%20network.pdf.
40. carole l. palmer, lauren c. teffeau, and carrie m. pittman, scholarly information practices for the online environment: themes from the literature and implications for library science development (dublin, ohio: oclc research, 2009), http://www.oclc.org/programs/publications/reports/200902.pdf (accessed july 15, 2011).
41. g. f. b. riemann, quoted in marvin j. greenberg, euclidean and non-euclidean geometry: development and history (new york: freeman, 2008): 371.
28. wikipedia.org, “use case,” june 13, 2011, http://en.wikipedia.org/wiki/use_case.
29. kaiser, drawing theories, 385–86.
30. prime examples being jacques derrida’s typographically complex 1974 work glas (univ. of nebraska pr.), and reality hunger: a manifesto (vintage), david shields’s 2011 textual mash-up on the topic of originality, authenticity, and mash-ups in general.
31. graeme simsion, data modeling: theory and practice (bradley beach, n.j.: technics, 2007): 333.
32. herman melville, moby-dick (new york: harper & brothers; london: richard bentley, 1851). moby-dick edition publication history excerpted from g. thomas tanselle, checklist of editions of moby-dick 1851–1976. issued on the occasion of an exhibition at the newberry library commemorating the 125th anniversary of its original publication (evanston, ill.: northwestern univ. pr.; chicago: newberry library, 1976).
33. ronald j. murray, “from moby-dick to mash-ups: thinking about bibliographic networks,” slideshare.net, apr.
2011, http://www.slideshare.net/ronmurray/from-mobydick-to-mashups-revised. the moby-dick resource description diagram was presented to the american library association committee on cataloging: description and access at the ala annual conference, washington d.c., july 2010.
34. the life and works of herman melville, melville.org, july 25, 2000, http://melville.org.
35. the new york artist alex itin describes his creation: “it is more or less a birthday gift to myself. i’ve been drawing it on every page of moby dick (using two books to get both sides of each page) for months. the soundtrack is built from searching ‘moby dick’ on youtube (i was looking for orson’s preacher from the the [sic] john huston film) . . . you find tons of led zep [sic] and drummers doing bonzo and a little orson . . . makes for a nice melville in the end. cinqo [sic] de mayo i turn forty. ahhhhhhh the french champagne.” quoted from alex itin, “orson whales,” youtube, jan. 2011, http://www.youtube.com/watch?v=2_3-gem6o_g.
article
personalization of search results representation of a digital library
ljubomir paskali, lidija ivanovic, georgia kapitsaki, dragan ivanovic, bojana dimic surla, and dusan surla
information technology and libraries | march 2021 https://doi.org/10.6017/ital.v40i1.12647
ljubomir paskali (ljubomir.paskali@gmail.com) is a phd student, university of novi sad, serbia. lidija ivanovic (lidija.ivanovic@uns.ac.rs) is assistant professor, university of novi sad, serbia, and corresponding author. georgia kapitsaki (gkapi@cs.ucy.ac.cy) is associate professor, university of cyprus, cyprus. dragan ivanovic (dragan.ivanovic@uns.ac.rs) is full professor, university of novi sad, serbia. bojana dimic surla (bdimicsurla@raf.edu.rs) is full professor, union university, serbia. dusan surla (surla@uns.ac.rs) is professor emeritus, university of novi sad, serbia. © 2021.
abstract
the process of discovering appropriate resources in digital libraries within universities is important, as it can have a big effect on whether retrieved works are useful to the requester. the aim of the research presented in this paper is to improve the user experience with the digital library of the university of novi sad dissertations (phd uns) through the personalization of search results representation. there are three groups of phd uns digital library users: users from the academic community, users outside the academic community, and librarians who are in charge of entering dissertation data. different types of textual and visual representations were analyzed, and the representations that needed to be implemented for the groups of users of the phd uns digital library were selected. after implementing these representations and putting them into operation in april 2017, the user interface was extended with functionality that allows users to select their desired style for representing search results, using an additional module for storing message logs. the stored messages represent an explicit change in the results representation by individual users. using these message logs and the elk technology stack, we analyzed user behavior patterns depending on the type of query, type of device, and search mode. the analysis has shown that the majority of users of the phd uns system prefer the textual style of representation to the visual. some users have changed the style of results representation several times, and it is assumed that different types of information require a different representation style. also, it has been established that the most frequent change to the visual results representation occurs after users perform a query which shows all the dissertations from a certain time period and which is taken from the advanced search mode; however, there is no correlation between this change and the client device used.
introduction
in order to place their current work within a framework of previous methods or identify research gaps, researchers often need to identify and study previous research. discovering information on the web is not always a trivial task. many systems allow scholars to search for research papers, dissertations, and other technical reports, providing at the same time relevant recommendations to users based on their areas of interest or previous searches. although web search engines are considered a superior solution to more specialized digital library systems, these specialized systems may provide more benefits in specific conditions, e.g., when searching for dissertations in specific languages, or by affiliated countries or institutions.1 nowadays, digital libraries are widely used by diverse communities of users for diverse purposes.2 xie and colleagues conducted an analysis in 2018 to compare similarities and differences in perceptions of the importance of different digital library evaluation criteria by heterogeneous stakeholders in academic settings.3 specifically, they surveyed three groups of stakeholders (scholars, librarians, and digital library users), and through their analysis of the survey’s responses, they identified differences in opinions not only between user expectations and the digital library practice but also between what is desirable and what is possible in the academic environment. regardless of whether more general (i.e., web search engines) or more specific (i.e., digital libraries) systems are used, the presentation of search results to users is important.
it significantly affects how they perceive the system and may reduce or increase the chances of using the system and the frequency of use, as usability is an important aspect in any system. the visualization of search results may be adapted to user needs. presenting results either in a textual or an alternative format depends on how users respond to the alternative presentations of the system and this process is followed in different domains, such as recommender systems.4 this can form part of a personalized context-aware system that considers users’ environment, history, and interaction with the system in order to act proactively and adapt the presentation of search results to each user. based on the gaps identified above, the focus for this study is the improvement of the user experience with the phd uns digital library through the personalization of the offered service for system users. it is necessary to provide the user with a choice between textual and visual search results representation according to user preferences and needs. this study reveals a new way of presenting search results that has been analyzed, designed, and implemented by the authors. more concretely, presentations of the bibliographic metadata in a standardized citation style (harvard style) and bibliographic formats (marc 21, dublin core, etd-ms) have been analyzed and implemented. word cloud format is widely used in different systems and this representation has been implemented in the phd uns system. finally, it is necessary to determine if the initial search results representation should be stored in the history of users’ queries, device types, and search mode. users have the ability to provide their feedback on the visualization of the search results, therefore indicating if they prefer a textual or new visual results representation (by changing search results representation style). the feedback received is used to adapt the results representation based on the user preferences. 
this component represents the first step towards a completely personalized system, in which different contextual parameters will be used for providing a personalized context-based user experience. at this point, user feedback is used for personalization, search results representation, and subsequent system use. a preliminary version with preliminary results regarding the word cloud component is described by kapitsaki and ivanovic.5 with respect to this previous work, we present the evolution of personalization in the phd uns system and a more thorough evaluation that allows us to perform statistical analysis and draw more general conclusions. accordingly, the motivation for this research is the personalization of the search results representation of a digital library, and the research questions to which this research should provide answers have been identified. we discuss our results based on these questions:
1. rq1: what are the users’ profiles of the phd uns digital library?
2. rq2: how could search results best be presented to different users within phd uns’s digital library collections?
3. rq3: can the search results representation with phd uns’s digital library depend on the history of users’ queries, device types, and search mode?
related work
dosird uns
the dosird uns project (http://dosird.uns.ac.rs/) was launched in 2009 with the aim to develop software infrastructure for the research domain of the university of novi sad (uns). the cris uns system (www.cris.uns.ac.rs) is the first result of this project. this system represents the information system of the research domain of the uns. the development of the system started with the beginning of the project in 2009 and is still active.
the digital library of theses and dissertations (phd uns), which is the topic of this paper, is integrated within cris uns. the complete cris uns system was developed in accordance with the recommendations of the eurocris (www.eurocris.org) non-profit organization. systems which contain the published scientific results were analyzed and, on the basis of these analyses, a set of metadata describing the scientific and research result in cris uns was created.6 a paper by ivanović et al. described the cerif-compatible data model based on the marc 21 format, which maps part of the cerif data model to the marc 21 format data model.7 the marc 21 format is a standardized format for storing bibliographic data. cris uns has been built on this model. the system architecture and implementation are described in previous publications.8 the development of the digital ph.d. dissertations library (phd uns) began in 2010. in december 2012, the senate of the university of novi sad approved the commissioning of a public service for the search of a digital library of dissertations defended at the university (https://cris.uns.ac.rs/searchdissertations.jsf). phd uns has been implemented with the following characteristics:
• the digital library of e-theses is integrated into the information system of the scientific research activity of the university of novi sad (cris uns).
• the digital library is cerif compatible, that is, it can exchange metadata with cerif-compatible systems of scientific research activity.
• e-theses are described by a set of metadata which includes all the metadata prescribed by the dublin core and the etd-ms metadata format, that is, the system can exchange the data in dublin core or etd-ms format via the oai-pmh protocol.
• the digital e-thesis library has a data model and architecture that can be easily integrated with a bibliographic system based on the marc 21 bibliographic format.
• the user interface allows a user to enter the thesis and dissertation data without knowing the standardized metadata formats on which the digital library is built.
the integration of phd uns within cris uns involved the following four steps:
1. the cris uns data model has been extended with entities and properties for describing phd theses in accordance with cerif, dublin core, and etd-ms data models.9
2. the cris uns software architecture and user interface have been extended in order to support basic functionality of cataloguing theses.10
3. theses’ metadata have been imported from the previous source.11
4. the web page for searching among the collection of theses has been implemented.12
searching personalization
finding and analyzing the scientific results described in papers, theses, and dissertations is an important part of research activities in the scientific community. therefore, the use and development of tools and bibliographic systems which enable advanced search is becoming increasingly common. the personalization of search results can include automatic recommendations to users.13 moreover, part of search personalization refers to the personalization of results representation. as with popular web search engines like google, the way the search results are represented is very important to users. the way the results are represented can affect user perception of the system and the frequency of its use.
the results can be presented to users in formats other than textual in order to improve the user experience in search tools, as well as to improve access to finding information and in recommendation systems.14 ferran and colleagues described browsing and searching personalization systems for digital libraries.15 their approach is based on the use of ontologies for describing the relationships between elements that determine the functionalities of the desired personalization system. those elements include the user’s profile, including navigational history and the user preferences, as well as the information collected from the navigational behavior of the digital library users. such a personalization system can improve digital library users’ experience. sebrechts and colleagues presented a controlled comparison of text, 2d, and 3d approaches to a set of typical information seeking tasks on a collection of 100 top ranked documents retrieved from a much larger document set.16 the conducted experiments included 15 participants. the study revealed that although visualization can assist the reduction of the mental workload for interpreting the results, these reductions and their acceptance depend on an appropriate mapping among the interface, the task, and the user. in relation to the above, our approach lies in the area of 2d display of information (see the visual results representation section later in this article), but instead of focusing on basic text information we have adopted newer approaches found in word clouds. bowers and card analyzed visualization in the framework of database search.17 soliman et al. presented an approach for the clustering of search engine results that relies on the semantics of the retrieved documents.18 the approach takes into consideration both lexical and semantic similarities among documents and applies an activation spreading technique in order to generate clusters based on semantic properties.
nguyen and zhang proposed a model for web search visualization, where physical location, spatial distance, color, and movement of graphical objects are used to represent the degree of relevance between a query and relevant web pages, thereby taking into account the context of users’ subjects of interest.19 a word or tag cloud is a visual representation of word content commonly used to represent content in different environments.20 several past works have introduced various algorithms for tag selection or new ways of creating word clouds.21 tag clouds have been used in pubcloud for the summarization of results from queries over the pubmed database of biomedical literature.22 pubcloud responds to queries of this database with tag clouds generated from words extracted from the abstracts returned by the query. the authors found that descriptive information is conveyed to users more effectively this way, but that the discovery of relations between concepts is rendered less effective.
context awareness
context awareness is a part of many systems in various domains, where the application or system functionality adapts to the context of use, such as in mobile computing or pervasive computing applications.23 the first definition of context was given by abowd and colleagues, who defined context “as any information that is relevant to the user, to the system and to any interaction between the user and the system.”24 applications utilize context data in order to provide context-aware services to users. context information, such as user location or user preferences, is used to adapt the application functionality or presentation to a specific user.
mobile computing and pervasive computing offer the necessary information from mobile device sensors and in users’ environments for context-aware application provision.25 fink and kobsa claimed that personalization may adapt various features in order to address the specific needs of each individual.26 many systems utilize users’ search history in order to offer personalized search in the framework of the web information retrieval systems. yoganarasimhan found that personalization based on short-term history or within-session behavior is less valuable than long-term or across-session personalization.27 behnert and lewandowski analyzed the application of web search engines ranking approaches on digital libraries, and they argued for a user-centric view on ranking, taking into account that ranking should be for the benefit of the user, and user preferences may vary across different contexts.28 frias-martinez and colleagues defined an approach to constructing personalized digital libraries. adaptive digital libraries automatically learn user preferences and goals and personalize their interaction using this information.29 based on previous work, frias-martinez and colleagues developed a personalized digital library to suit the needs of different cognitive styles.30
contribution of our work
we share similarities with previous work in terms of the techniques used, such as the harvard reference style, standardized bibliographic formats (marc 21, dublin core, etd-ms), and word clouds, which have been used in other systems for representation of search results in order to improve the user experience. however, in contrast to previous work, we apply these techniques in a specific context for a serbian digital library, allowing automatic adaptation of the representation of search results based on a user’s type, history, and reaction. those search results representations are implemented and integrated within a real system (phd uns) and are tested with real users’ feedback.
that users’ feedback is analyzed using elk stack technologies, making our main conclusions useful for similar systems and future research on personalization in digital libraries.
methodology
the main requirements for implementation of the phd uns digital library were that the system be compatible for integration with other systems of scientific research activity, support data exchange in different standardized formats, and provide representation of the results to users of different categories and profiles (researchers, scientists, librarians, users from outside the academic community, etc.). for these reasons, existing formats for representation of references, bibliographic metadata formats, as well as techniques for visual representation of textual publications, were analyzed. the format adopted for the representation of references was harvard style, implemented with a freemarker template. freemarker (https://freemarker.apache.org/) is an open-source template engine for java that assists in separating the web user interface from the main system functionality, following the mvc (model view controller) pattern. the analysis established that it was necessary to implement the search results representation in three bibliographic formats: marc 21, dublin core, and etd-ms. for each of these formats, appropriate mappers/serializers were made which transform the data from a database into the xml (extensible markup language) representation of the previously mentioned bibliographic formats. for visual representation, word cloud style was adopted. a component for generating word cloud images was implemented and integrated into the phd uns digital library.
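the mapper/serializer step described above can be illustrated with a minimal sketch. it is written in python rather than the java used by the phd uns system, and it is not the authors’ implementation: the record dictionary, its field names, and the function name to_dublin_core are assumptions made for illustration. only the two namespace uris (dublin core elements 1.1 and the oai_dc container used with oai-pmh) are standard values.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for simple Dublin Core and the oai_dc container.
DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"

# Hypothetical single-valued fields of a thesis record (illustrative only).
SINGLE_VALUED_FIELDS = ("title", "creator", "date", "publisher", "language", "type")


def to_dublin_core(record: dict) -> str:
    """Serialize a thesis record (a plain dict, assumed shape) to oai_dc XML."""
    ET.register_namespace("dc", DC_NS)
    ET.register_namespace("oai_dc", OAI_DC_NS)
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    # Emit one dc:* element per populated single-valued field.
    for field in SINGLE_VALUED_FIELDS:
        value = record.get(field)
        if value:
            ET.SubElement(root, f"{{{DC_NS}}}{field}").text = value
    # Keywords are repeatable, so each one becomes its own dc:subject element.
    for keyword in record.get("subjects", []):
        ET.SubElement(root, f"{{{DC_NS}}}subject").text = keyword
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Made-up record, not taken from the phd uns collection.
    print(to_dublin_core({
        "title": "a sample dissertation",
        "creator": "doe, jane",
        "date": "2015",
        "subjects": ["digital libraries", "personalization"],
    }))
```

a separate mapper of the same shape per format (marc 21, dublin core, etd-ms) keeps the stored record independent of any one output format, which appears to be the design intent described in the text.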
when presented with search results, users can choose a representation format, which is then stored and used as the preferred format for that user in future search results representations. a logging system was implemented to assist with this; this module is invoked when the user changes the default data view mode. received messages are preprocessed to support their analysis and to obtain a more accurate evaluation. the aim of the analysis of the messages on the work of the phd uns system is to obtain the desired statistics:

• distribution of used representation styles
• top queries executed before changing into the textual style, as well as into the visual style
• distribution of devices used before changing into the textual style, as well as into the visual style
• distribution of search modes before changing into the textual style, as well as into the visual style

these statistics are analyzed to determine the user behavior patterns depending on the type of search (basic or advanced), the search device used, the executed query, etc. based on the established patterns it is possible to determine the representation style for future searches of new users. the results of the analysis are graphically represented using elk stack technologies and are presented below in the evaluation section.

the methodological approach is shown in figure 1. the rest of this paper is organized in accordance with the methodology steps shown in figure 1.

figure 1. the methodology of the study presented in this paper.
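the statistics listed in the methodology amount to simple tallies over the preprocessed log records. the following python sketch illustrates the idea only; the study itself uses elk stack technologies, and the record keys (`new_style`, `query`, `device`, `search_mode`) and the sample records are hypothetical names and data introduced for illustration.

```python
from collections import Counter


def style_statistics(records):
    """Tally style changes and the context (query, device, mode) preceding them.

    Each record is a dict with the hypothetical keys
    'new_style', 'query', 'device', and 'search_mode'.
    """
    styles = Counter(r["new_style"] for r in records)
    by_style = {}
    for r in records:
        ctx = by_style.setdefault(
            r["new_style"],
            {"queries": Counter(), "devices": Counter(), "modes": Counter()},
        )
        ctx["queries"][r["query"]] += 1
        ctx["devices"][r["device"]] += 1
        ctx["modes"][r["search_mode"]] += 1
    return styles, by_style


# Invented sample records, not data from the study.
records = [
    {"new_style": "wordcloud", "query": "internet", "device": "computer", "search_mode": "basic"},
    {"new_style": "textual", "query": "internet", "device": "mobile", "search_mode": "basic"},
    {"new_style": "textual", "query": "marc", "device": "computer", "search_mode": "advanced"},
]
styles, by_style = style_statistics(records)
```

here `styles` gives the distribution of representation styles, and `by_style[style]` holds the query, device, and search mode counts observed before a change into that style.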
analysis of search results’ representation styles

textual representation

for the needs of the phd uns library, three types of search results’ textual representation for three types of phd uns users were analyzed:

reference representation. a citation style is defined as a set of rules for citing sources in academic writing and those rules prescribe style for in-text citations, as well as style for reference representation in the references’ list. this textual approach is intended for users from the academic community: researchers, teaching staff of universities, and phd students. since the majority of phd uns users belong to this group, this is the default representation in a textual representation of the search results. this group of users is familiar with this type of representation. from this type of representation, the users can easily recognize basic data of interest and can use this representation for citing and referencing the dissertations which have been retrieved as a result of executing a query. taking into account that there are currently many different styles for representing references and citations (e.g., apa, mla, harvard, vancouver, and chicago, among others), and that they might change in the future with the emergence of new trends in science (for example, the emergence of open science and the need to cite data sets, not just publications), it is necessary to create a scalable component for representing the results in the form of a reference. based on this analysis, we decided that the architecture of this component will be based on freemarker, which makes the introduction of new templates for the output format easier, and that the first freemarker template should be created for harvard style.

structured representation.
this textual approach is intended for users outside the academic community who want to search the digital library. this representation shows only the data from the digital library database, in a legible format rendered in the web browser.

bibliographic formats representation. this textual approach is intended for librarians who are in charge of entering and maintaining data in the digital library. in addition to one central library of the university of novi sad, there are libraries in every department within the university. most of these libraries use the bisis library system, which is based on marc format. therefore, it can be concluded that the majority of the librarians who enter data into the phd uns library are familiar with the marc format. librarians can use the representation of metadata about dissertations in the marc 21 format to check if all of the information about a dissertation is entered correctly.

• marc 21 bibliographic format supports not only descriptions of theses and dissertations but also other published scientific results, such as a paper published in a journal, a monograph, a paper published in conference proceedings, etc. there are several examples where theses and dissertations are described using the marc 21 format in the bibliographic information systems of some universities.31
• dublin core (http://dublincore.org) is the most commonly used format for data exchange between different information systems, and data are exported in this format via the oai-pmh protocol from the phd uns system into a network of digital libraries, such as dart-europe, oatd, and nardus. the dublin core xml schema is available online at www.openarchives.org/oai/2.0/oai_dc.xsd.
the representation in dublin core format can be used by librarians to check if the metadata will be correctly exported to the previously mentioned aggregation systems.
• electronic theses and dissertations metadata standard (etd-ms) (www.ndltd.org/standards/metadata) is an extension of the dublin core format with new features/properties. the standard defines a set of metadata that is used to describe a master’s thesis or a doctoral dissertation. the metadata of this standard describe the author, his/her paper, and the context in which this paper has been created in a way that will be useful not only to the researcher, but also to the librarian and/or the technical staff in charge of maintaining the paper in electronic form. this format is used within the ndltd worldwide network of digital theses and dissertations and in this format the data is exported via the oai-pmh protocol from the phd uns system to this network. the xml schema of the etd-ms format is available online at www.ndltd.org/standards/metadata/etdms/1.0/etdms.xsd. the representation in the etd-ms format can be used by librarians to check if the metadata will be correctly exported to the ndltd network (http://union.ndltd.org/).

visual representation

a word cloud is a visual representation of textual content, with the importance of each word indicated with a different font size and/or color. word clouds are often used in digital libraries to represent textual content.32 as noted earlier, word clouds are used in different environments and are a popular way to represent web results by summarizing the content of documents and other sources of information. we adopted a word cloud approach for visual representation of the user search results in the phd uns library.
various tools for generating word clouds are available, such as the tool offered by jin.33 based on the characteristics of available tools, we decided to use the kumo library, available in the java programming language (https://github.com/kennycason/kumo), which allows easier integration within the phd uns digital library.

implementation details

this section presents the implementation of the textual and visual search results representation, as well as the implementation of the search results personalization.

textual results representation

based on the analysis presented in the previous section, we decided to implement the following functionality in order to enhance the phd uns digital library: a structured representation for users outside the academic community, a representation in the form of references for scholars, and a representation of bibliographic and library formats for librarians in charge of data in the phd uns digital library.

reference representation. figure 2 depicts the architecture of the module for generating phd dissertations’ representations in the form of references for scholars.

figure 2. architecture of module for generating reference representation.

the model of the reference generator component is shown as the class diagram in figure 3. this component can be used to generate a textual representation of all publications from the data model component (figure 4) in the chosen reference style (freemarker template, see listing 1). the central class is templaterunner which includes the necessary operations to generate reports.
the templateholder represents the template container and has operations for adding new templates and selecting a template for generating a report. the template class is the model of the template for one reference style and one publication type. the component architecture described in the class diagram of figure 3 is independent of the number of templates, whereas adding a new template to the component requires creating a new instance of the template class. as in the cris uns system, these instances of the template class are implemented as freemarker templates, so adding one does not require recompiling the source code.

[figure 2 components: freemarker template, reference generator, data model, phd uns database, reference representation]

figure 3. architecture of the component for generating template.

<#macro nameinitial name>
<@compress>
<#t><#if (name?length > 1)>
<#t><#if (name?upper_case?starts_with("LJ") || name?upper_case?starts_with("NJ"))>, ${name?substring(0,2)?upper_case}.
<#t><#else>, ${name?substring(0,1)?upper_case}.
<#t></#if>
<#t><#elseif name?length == 1>, ${name?upper_case}.
<#t></#if>
</@compress>
</#macro>
<#t>${author.name.lastname?upper_case}
<#t><@nameinitial author.name.firstname/>
<#t><#if publicationyear??> (${publicationyear})</#if> ${sometitle!""}. (${localizedstudytype}), ${institution.somename}

listing 1. harvard-style freemarker template

structured representation. the simplified version of the bibliographic records data model that is used in the cris uns system is shown in figure 4. the cris uns system also contains other publication entities, such as monograph, journal paper, etc. the phd uns digital library is integrated into the cris uns system and uses the entities shown in figure 4.
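the name-initial rule in listing 1 can also be expressed outside a template. the python sketch below mirrors that rule for illustration only; it is not the phd uns implementation (which uses freemarker), and the function names, parameter names, and sample values are hypothetical.

```python
def name_initial(first_name: str) -> str:
    """Return ', X.' for a first name, keeping the digraphs LJ and NJ whole."""
    if not first_name:
        return ""
    upper = first_name.upper()
    # Names starting with the digraphs Lj or Nj keep two letters as the initial.
    if len(first_name) > 1 and upper.startswith(("LJ", "NJ")):
        return f", {upper[:2]}."
    return f", {upper[:1]}."


def harvard_reference(last_name, first_name, year, title, study_type, institution):
    """Assemble a reference in the order used by listing 1 (spacing is approximate)."""
    ref = last_name.upper() + name_initial(first_name)
    if year is not None:  # the year is optional, as in the template
        ref += f" ({year})"
    return f"{ref} {title}. ({study_type}), {institution}"


# Invented sample values, not a record from the digital library.
example = harvard_reference(
    "Petrovic", "Ljubica", 2015, "Sample title", "PhD thesis", "University of Novi Sad"
)
```

keeping the digraph rule in one small function reflects the design choice described above: supporting a new reference style would mean adding another formatter, leaving the rest of the component unchanged.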
[figure 3 class diagram: templaterunner (getrepresentation(record rec[], int referencestyle): string; getrectype(record rec): int; makeonereference(record rec, template template): string; organizerecords(criteria criteria): void), associated 1..1 with templatesholder (gettemplate(int recordtype, int referencestyle): int; addtemplate(template t): void), which holds 1..* template (pubtype: int; referencestyle: int; getdata(): void; formatdata(): void)]

figure 4. data model.

a structured representation for users outside the academic community is implemented with the help of the tohtmlrepresentation method contained in the thesis class. this method forms a structured representation of the dissertation with the help of html markup. when storing a dissertation in a database, html representation is generated and stored in lucene indexes for faster representation of search results.

bibliographic formats representation. after analyzing the bibliographic and library formats (see the methodology section above) we concluded that we should implement the search results representation in marc 21, dublin core, and etd-ms bibliographic formats for the needs of librarians who are in charge of entering data in the phd uns library.
the representation of these formats is implemented in a similar way as the structured representation for users outside the academic community, with the help of the following methods: the tomarc21representation, todublincorerepresentation, and toetdmsrepresentation in the thesis class (figure 4). these methods generate an xml representation of these formats. when storing a dissertation in a database, these xml representations are stored in lucene indexes for faster retrieval of search results.

[figure 4 data model: thesis (tohtmlrepresentation(), tomarc21representation(), todublincorerepresentation(), toetdmsrepresentation(), each returning string); publication {abstract} (title: string; publicationyear: int); institution (name: string; address: string); author (firstname, lastname, address, email: string); associations: publications, authors, supervisors, researchers, affiliations, defendedat, defendboardmembers]

screenshots of user interface. figure 5 presents the textual search results representation. the basic representation contains the metadata of the dissertation presented as a harvard-style reference. this is the basic representation because the researchers from the academic environment are the most common users of the phd uns library. additional metadata are shown by pressing the button located next to the reference, which displays the data structured for the needs of users outside the academic community (see fig. 6).

figure 5. results representation in a textual format.

in addition, the representation of the dissertation metadata is also available to library users in marc 21, dublin core, and etd-ms formats (see fig. 6).
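the serializer methods described above turn a thesis record into xml. as a rough illustration of what a dublin core serialization might look like, the python sketch below builds an oai_dc record with the standard library. it is not the phd uns code (which is written in java); the dictionary keys, the choice of which fields map to which dublin core elements, and the sample record are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Namespaces used by the oai_dc schema and the Dublin Core element set.
OAI_DC = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("oai_dc", OAI_DC)
ET.register_namespace("dc", DC)


def to_dublin_core(thesis: dict) -> str:
    """Serialize a thesis dict (hypothetical keys) as an oai_dc XML string."""
    root = ET.Element(f"{{{OAI_DC}}}dc")

    def add(tag, value):
        ET.SubElement(root, f"{{{DC}}}{tag}").text = str(value)

    add("title", thesis["title"])
    for author in thesis["authors"]:
        add("creator", f'{author["last_name"]}, {author["first_name"]}')
    add("date", thesis["publication_year"])
    add("publisher", thesis["institution"])
    return ET.tostring(root, encoding="unicode")


# Invented sample record for illustration.
sample_thesis = {
    "title": "Sample title",
    "authors": [{"first_name": "Ana", "last_name": "Ilic"}],
    "publication_year": 2015,
    "institution": "University of Novi Sad",
}
dc_xml = to_dublin_core(sample_thesis)
```

generating the string once and storing it alongside the record, as the text describes for the lucene indexes, avoids repeating the serialization for every search result.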
figure 6. structured and bibliographic formats representation.

visual results representation

this section describes the implementation of the graphical (visual) search results representation. the graphical representation is realized using a word cloud to represent the content of a dissertation.

word cloud generator component. the word cloud generator component forms a new part of the phd uns digital library. the aim of this new component is to present the user search results in a word cloud representation (see fig. 7).

figure 7. word cloud generator steps.

the word cloud component was implemented in java. the component accepts the pdf file of the dissertation as input and generates an image (png file) as output. it parses the textual content of the file and performs a preprocessing of the text. the result of the preprocessing is a list of pairs containing the original version of each word from the text and its stem. the details of the tool utilized for this preprocessing step can also be found in the existing publication.34 the tool then calculates the top frequencies of words in the text, generates the word cloud, and creates an image file. as mentioned above, the kumo library was used for the implementation (https://github.com/kennycason/kumo). kumo is open-source software released under the mit license. the source code has been extended to accommodate the needs of the phd uns digital library.

integration into the phd uns system.
the word cloud generator component described in the previous section has been integrated into the phd uns digital library application and was put into operation in april 2017, although some necessary adaptations have been performed since then and have all been integrated as well. because word cloud generation is a lengthy, computationally intensive process, it is invoked in the indexing phase and the generated image is stored as supplementary material to a phd dissertation in the server file system. figure 8 presents a unified modeling language (uml) activity diagram which describes the process of adding a new dissertation to the phd uns digital library. the activity “generate word cloud image” is highlighted in red and represents invoking the execution of the word cloud component. moreover, the activity “create lucene index” includes the same steps for text preprocessing as the steps described in the word cloud generator component (see fig. 7).

figure 8. adding a new dissertation into the phd uns system.

[figure 8 activities (swimlanes: user, phd uns): upload dissertation, store dissertation, generate word cloud image, create lucene index, store image in file system]

the search results representation in the form of a word cloud is enabled via the user interface page for representing the search results of the phd uns digital library (see fig. 9).

figure 9. results of the search of the phd uns system in a word cloud format.

personalization of representation

this section describes the implementation related to personalization of the search results representation.
the user can select the desired style of representation, and the representation history of the results is recorded in order to personalize the results representation and customize the user’s profile and information needs. the initial search results representation style for users who search for dissertations in the phd uns system for the very first time is selected at random from two options:

• result representation in a textual format
• result representation as word cloud images

after analyzing the logs from changing the results representation (see next section), this random selection could be replaced with a choice that depends on the context: queries, devices, types of searches, etc.

the parts of the page which show how the results are presented in the textual and word cloud representations are shown in figures 5 and 9, respectively. users can change the representation style from the page. in this way, users give feedback and indicate their preference for visualization of the results, which is used in future results representations for that user.

evaluation

collecting user feedback

if a digital library user changes the style of results representation, the relevant message about the change of the representation style together with the user metadata is recorded using log4j. this process is shown in red in the activity diagram in figure 10.

figure 10. the process of executing queries and giving feedback on the representation style.

listing 2 is an example of a recorded message about the change of the representation style containing user metadata from the phd uns system. information such as the time and territorial determinant of a web client, the agent used, and the representation style is also recorded.
the representation style is stored on the user browser in the form of cookies and represents the basic style for representing results in future searches of dissertations in the phd uns system. by analyzing the messages about the change of the representation style, we evaluate the results of our approach and examine how the users respond to the new style of representation.

[figure 10 activities (swimlanes: user, phd uns): define query, execute query, display results as a harvard style reference, load and display word cloud image, select representation style, log change of representation style; decisions: representation style is textual (yes/no), change representation style (yes/no)]

[info] 22.08.2017. 16:07:33 (searchdissertationsmanagedbean:setrepresentationstyle)
date and time: tue aug 22 16:07:33 cest 2017|
miliseconds: 1503410853455| +
session id: 2a4ce66932d0c3c8db97098dff956074|
userid: 150341083728649|
ip address: 188.2.29.239|
location: city: belgrade, postal code: null, regionname: null (region: 00), countryname: serbia (country code: rs), latitude: 44.818604, longitude: 20.468094|
user agent (device): mozilla/5.0 (windows nt 10.0; wow64) applewebkit/537.36 (khtml, like gecko) chrome/60.0.3112.78 safari/537.36 opr/47.0.2631.55|
new representation style: wordcloud

listing 2. example of a message about the change of the representation style

preprocessing users’ feedback

as already indicated, each change in the representation style of the search results causes the creation of an appropriate message (see listing 2). in order to better understand the context of use and the reason for changing the representation style, these messages are preprocessed and supplemented with information on the type of search and the given query which preceded the change in the representation style (highlighted in yellow in listing 3).
by analyzing additional information, we can understand which context of usage and user actions preceded the change of the representation style. additional information is obtained from the received queries for the phd uns system and is mapped by using a unique user session identifier. an example of a message after preprocessing is shown in listing 3.

[info] 22.08.2017. 16:07:33 (searchdissertationsmanagedbean:setrepresentationstyle)
date and time: tue aug 22 16:07:33 cest 2017|
miliseconds: 1503410853455| +
session id: 2a4ce66932d0c3c8db97098dff956074|
userid: 150341083728649|
ip address: 188.2.29.239|
location: city: belgrade, postal code: null, regionname: null (region: 00), countryname: serbia (country code: rs), latitude: 44.818604, longitude: 20.468094|
user agent (device): mozilla/5.0 (windows nt 10.0; wow64) applewebkit/537.36 (khtml, like gecko) chrome/60.0.3112.78 safari/537.36 opr/47.0.2631.55|
new representation style: wordcloud|
query: internet|
searching mode: basic

listing 3. example of a message about the change of the representation style after preprocessing

analysis of user feedback

messages such as the one in listing 3, with additional information about contextual use, are suitable for further analysis using elk stack technologies. messages in a given format are collected from logs of the phd uns system using the logstash grok filter. this filter is used for parsing, statistical analysis based on field values, data filtering, and advanced search using multiple filters. the parsed messages are forwarded to the elasticsearch component of the elk stack. the grok pattern definition, which represents the rules and instructions for parsing messages, is located in the configuration files that are passed as a parameter when running the tool. an example of a configuration file is shown in listing 4.
input {
  file {
    path => "/config-dir/logs-style-formatted/*.log"
    start_position => beginning
  }
}
filter {
  grok {
    break_on_match => false
    match => { "message" => "%{LOGLEVEL:loglevel}" }
    match => { "message" => "date and time: %{DAY:logday} %{MONTH:logmonth} %{MONTHDAY:logmonthday} %{NUMBER:loghour}:%{NUMBER:logminute}:%{NUMBER:logsecond} %{WORD:logtimezone} %{YEAR:logyear}" }
    match => { "message" => "userid: %{NUMBER:userid}\|" }
    match => { "message" => "city: %{DATA:city}," }
    match => { "message" => "countryname: %{DATA:country} \(" }
    match => { "message" => "user agent \(device\): %{DATA:useragent}\|" }
    match => { "message" => "new representation style: %{DATA:newstyle}\|" }
    match => { "message" => "query: %{DATA:query}\|" }
    match => { "message" => "searching mode: %{DATA:searchingmode}\|" }
  }
}
output {
  elasticsearch {
    hosts => ["elasticsearch:9200"]
  }
}

listing 4. an example of a grok pattern used to analyze the message about the change of the representation

the analysis of the messages about the work of the phd uns system is presented in this section. the results are represented using the kibana graph component of the elk stack. this component is used for visualization and data exploration, analysis of logs at specified time intervals, and real-time monitoring of applications.

the word cloud generating component was put into operation in april 2017. log messages were analyzed from then until the end of 2019. in total, there were 17,474 analyzed messages about changing the style of search results representation. in these messages, the style was changed into a textual representation 16,032 times, while it was changed into a visual representation style in the form of a word cloud image 1,442 times. thus, most of the users of the phd uns system changed the representation style to textual rather than visual format.
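the grok rules in listing 4 can be mirrored with ordinary regular expressions, which may help when checking a pattern against a single message. the python sketch below is only an illustration and is not part of the elk pipeline; the function name is hypothetical, the sample message is shortened from listing 3, and appending a trailing "|" so that the last field matches is an assumption made here.

```python
import re

# One regular expression per field, following the literals in listing 4.
FIELD_PATTERNS = {
    "userid": r"userid: (\d+)\|",
    "city": r"city: (.*?),",
    "country": r"countryname: (.*?) \(",
    "useragent": r"user agent \(device\): (.*?)\|",
    "newstyle": r"new representation style: (.*?)\|",
    "query": r"query: (.*?)\|",
    "searchingmode": r"searching mode: (.*?)\|",
}


def parse_message(message: str) -> dict:
    """Extract the fields that are present; missing fields are simply omitted."""
    text = message.strip() + "|"  # assumption: terminate the final field
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = re.search(pattern, text)
        if match:
            fields[name] = match.group(1)
    return fields


# Shortened version of the message in listing 3.
sample = (
    "userid: 150341083728649| ip address: 188.2.29.239| "
    "location: city: belgrade, postal code: null, regionname: null (region: 00), "
    "countryname: serbia (country code: rs), latitude: 44.818604, longitude: 20.468094| "
    "user agent (device): mozilla/5.0 (windows nt 10.0; wow64)| "
    "new representation style: wordcloud| query: internet| searching mode: basic"
)
fields = parse_message(sample)
```

once each message is reduced to such fields, the reported totals follow from simple counting: 16,032 of the 17,474 changes (about 91.7%) were into the textual style and 1,442 (about 8.3%) into the word cloud.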
this tells us that the majority of users are more familiar with the textual style of representing search results in interaction with scientific systems. based on this analysis, it can be concluded that the random selection of the representation style of the results is not a good choice.

we also analyzed the client devices used when changing representation style (textual and visual). computers were used considerably more frequently than mobile devices. devices with higher-resolution screens are more suitable for presenting search results in different formats. the distribution of the change in the representation style of the search results is similar for computers and mobile devices, so based on the device alone we cannot conclude which representation style is more suitable for the user.

the queries that were submitted by users before changing the style of representation were also analyzed; in other words, which of the queries and results representations initiated the change into the other style of representation. figure 11 and figure 12 show the most commonly executed queries before changing the style of representation into the textual and visual format, respectively. some of the queries shown in figure 11 represent the names of faculties of the university, such as

• fakultet tehnick nauka (department of technical sciences)
• filosofsk fakultet (department of philosophy)

figure 11. top queries before changing into the textual style

the most commonly executed queries before changing into the visual representation style are shown in figure 12.
some queries shown in this figure represent scientific fields, such as

• doktor medicinskih nauka (doctor of medical science)
• doktor geografskih nauka (doctor of geographical science)

figure 12. top queries before changing into the visual representation style.

based on figures 11 and 12, we can conclude that the queries users submitted before the change in the style of representing the results are of a general type, that is, they are queries by faculty or by scientific field. these types of queries give long lists of results. for queries over longer periods of time where the representation of all dissertations defended in a certain period is required, users changed the representation style into visual.

the search mode used before the change into the textual style is shown in figure 13, while the search mode used before the change into the visual representation style is shown in figure 14.

figure 13. search mode before changing into the textual style.

figure 14. search mode before changing into the visual style.

by analyzing figures 13 and 14, we conclude that most queries preceding the change of the representation style are submitted from the basic search mode (labeled basic on the figures), which is the default search mode. we also notice that the share of the advanced search mode is higher among changes into the visual style than among changes into the textual style. this is consistent with the analysis following figures 11 and 12, because in the advanced search mode users make queries for a time range that gives long lists of results. also, we notice that some users have changed the style of results representation several times, so it is assumed that different types of information require a different representation style.
there has been no decrease or increase in the number of users since the introduction of the word cloud generating component, which indicates that the introduction of the new component has not affected the frequency of the system use significantly.

conclusion

this paper describes one improvement on user experience performed for the users of the phd uns digital library. this improvement was implemented through the personalization of the search results representation which was put into operation in april 2017. users of the phd uns digital library are using desktop and laptop computers considerably more than mobile devices (rq1). moreover, besides specific exploratory queries, the users are raising general queries by scientific fields, faculties, or in the time range.

the phd uns digital library has three user groups: those from the academic community, those from outside the academic community, and librarians in charge of entering the dissertation data. for these three groups of users, the following textual search results representations (rq2) have been selected and implemented: harvard-style representation of the dissertation in the form of references for users from the academic community; html structured results representation for users outside the academic community; and marc 21, dublin core, etd-ms bibliographic records for the library users. for the visual representation, word cloud presentation based on the complete text from the pdf file of the dissertation has been selected and implemented.

it is possible to select the desired search results representation which initiates storing of the messages about the representation style of the results, client device used, time, etc.
this message is joined with the preceding query message to analyze the patterns of system usage and establish a correlation between the change of the representation style and the type of query, device, and search mode (rq3). based on the conducted analysis, we reached the following conclusions:

• a significantly larger number of users of the phd uns system use the textual representation style rather than the visual representation. this tells us that a larger number of users is more familiar with the textual style of representing search results in interaction with scientific systems and that a random selection of the representation style of the results used since april 2017 was not a good choice for the first-time user. because of this observation, the initial selection of the representation style for the first-time user was changed to the textual search results representation (rq3).
• some users changed the representation style of the results several times and it is assumed that different types of information require a different representation style. based on this, we can conclude that the possibility of personalizing the search results representation is a useful functionality that contributes to the improvement of the phd uns system and the user experience.
• it has been established that the change into the visual results representation most often follows a query which shows all the dissertations from a certain time period, issued from the advanced search mode, but there is no correlation between this change and the device being used. based on this, it can be concluded that in certain cases, for queries which show long lists of results, it is clearer to see the results in the visual mode (rq3).
it is necessary to collect more data and carry out additional analysis in order to establish the correlation precisely, or to determine precisely for which queries and for which types of users this applies, so that the system could automatically change the style of representation in certain cases. directions for future research and application development include the following. it is planned to collect and analyze additional messages about the work of the digital library in order to further enhance the user experience. also, it is necessary to follow the trends of the results representation due to the change of standardized reference styles, bibliographic formats, technologies and hardware devices, and it is further necessary to coordinate the results representation with these trends. differences between the behavior of the different user groups will also be examined further.
endnotes
1 j. brophy and d. bawden, “is google enough? comparison of an internet search engine with academic library resources,” aslib proceedings 57, no. 6 (2005): 498–512, https://doi.org/10.1108/00012530510634235.
2 a. f. smeaton and j. callan, “personalisation and recommender systems in digital libraries,” international journal on digital libraries 5, no. 4 (2005): 299–308, https://doi.org/10.1007/s00799-004-0100-1.
3 iris xie, soohyung joo, and krystyna k. matusiak, “multifaceted evaluation criteria of digital libraries in academic settings: similarities and differences from different stakeholders,” the journal of academic librarianship 44, no. 6 (2018): 854–63, https://doi.org/10.1016/j.acalib.2018.09.002.
4 theodora nanou, george lekakos, and konstantinos fouskas, “the effects of recommendations’ presentation on persuasion and satisfaction in a movie recommender system,” multimedia systems 16, no. 4–5 (august 2010): 219–30, https://doi.org/10.1007/s00530-010-0190-0.
5 georgia kapitsaki and dragan ivanović, “representation with word clouds at the phd uns digital library,” computer science & information technology 21 (2017), https://doi.org/10.5121/csit.2017.71102.
6 dragan ivanović, “software systems for increasing availability of scientific-research outputs,” novi sad journal of mathematics – ns jom 42, no. 1 (2012): 37–48.
7 dragan ivanović, dušan surla, and zora konjović, “cerif compatible data model based on marc 21 format,” the electronic library 29, no. 1 (2011): 52–70, https://doi.org/10.1108/02640471111111433.
8 dragan ivanović, “a scientific-research activities information system” (phd thesis, university of novi sad, 2010); d. ivanović, g. milosavljević, b. milosavljević, and d. surla, “a cerif-compatible research management system based on the marc 21 format,” program: electronic library and information systems 44, no. 1 (2010): 229–51; dragan ivanović and branko milosavljević, “software architecture of system of bibliographic data,” in proceedings of the xxi conference on applied mathematics prim 2009, 85–94.
9 lidija ivanović, dragan ivanović, and dušan surla, “a data model of theses and dissertations compatible with cerif, dublin core and edt-ms,” online information review 36, no. 4: 548–67, https://doi.org/10.1108/14684521211254068.
10 lidija ivanović, dragan ivanović, and dušan surla, “integration of a research management system and an oai-pmh compatible etds repository at the university of novi sad, republic of serbia,” library resources & technical services 56, no. 2: 104–12, https://doi.org/10.5860/lrts.56n2.104.
11 lidija ivanović and dušan surla, “a software module for import of theses and dissertations to criss,” in proceedings of the cris 2012 conference (prague, june 6–9, 2012): 313–22.
12 lidija ivanović, “search of catalogues of theses and dissertations,” novi sad journal of mathematics – ns jom 43, no. 1 (2013): 155–65; lidija ivanović, dragan ivanović, dušan surla, and zora konjović, “user interface of web application for searching phd dissertations of the university of novi sad,” in proceedings of the intelligent systems and informatics (sisy), 2013 ieee 11th international symposium: 117–22.
13 joel azzopardi, dragan ivanović, and georgia kapitsaki, “comparison of collaborative and content-based automatic recommendation approaches in a digital library of serbian phd dissertations,” in proceedings of the international keystone conference 2016: 100–11, https://doi.org/10.1007/978-3-319-53640-8_9.
14 dragan ivanović and georgia kapitsaki, “personalisation of keyword-based search on structured data sources,” in proceedings of the 1st international keystone conference (ikc 2015).
15 núria ferran, enric mor, and julià minguillón, “towards personalization in digital libraries through ontologies,” library management 26, no. 4/5 (2005): 206–17, https://doi.org/10.1108/01435120510596062.
16 mark m. sebrechts et al., “visualization of search results: a comparative evaluation of text, 2d, and 3d interfaces,” in proceedings of the 22nd annual international acm sigir conference on research and development in information retrieval (sigir '99): 3–10, https://doi.org/10.1145/312624.312634.
17 frank h. bowers and stuart k. card, “method and apparatus for visualization of database search results,” u.s. patent no. 5,546,529.
18 sara saad soliman, maged f. el-sayed, and yasser f. hassan, “semantic clustering of search engine results,” the scientific world journal (2015), https://doi.org/10.1155/2015/931258.
19 tien nguyen and jin zhang, “a novel visualization model for web search results,” ieee transactions on visualization and computer graphics 12, no. 5 (2006), https://doi.org/10.1109/tvcg.2006.111.
20 daniel scanfeld, vanessa scanfeld, and elaine l. larson, “dissemination of health information through social networks: twitter and antibiotics,” american journal of infection control 38, no. 3 (2010): 182–88, https://doi.org/10.1016/j.ajic.2009.11.004.
21 carmel mcnaught and paul lam, “using wordle as a supplementary research tool,” the qualitative report 15, no. 3 (2010): 630; weiwei cui et al., “context preserving dynamic word cloud visualization,” in ieee pacific visualization symposium (pacificvis) (2010): 121–28, https://doi.org/10.1109/pacificvis.2010.5429600; yusef hassan-montero and víctor herrero-solana, “improving tag-clouds as visual information retrieval interfaces,” in proceedings of the international conference on multidisciplinary information sciences and technologies (2006): 25–28.
22 byron kuo, thomas hentrich, benjamin good, and mark wilkinson, “tag clouds for summarizing web search results,” in proceedings of the 16th international conference on world wide web (www '07) (2007): 1203–04, https://doi.org/10.1145/1242572.1242766.
23 jong-yi hong, eui-ho suh, and sung-jin kim, “context-aware systems: a literature review and classification,” expert systems with applications 36, no. 4 (2009): 8509–22, https://doi.org/10.1016/j.eswa.2008.10.071; georgia m. kapitsaki, george n. prezerakos, nikolaos d. tselikas, and iakovos s. venieris, “context-aware service engineering: a survey,” journal of systems and software 82, no. 8 (2009): 1285–97, https://doi.org/10.1016/j.jss.2009.02.026.
24 gregory d. abowd et al., “towards a better understanding of context and context-awareness,” in handheld and ubiquitous computing (springer: berlin/heidelberg, 1999): 304–07, https://doi.org/10.1007/3-540-48157-5_29.
25 mika raento, antti oulasvirta, renaud petit, and hannu toivonen, “contextphone: a prototyping platform for context-aware mobile applications,” ieee pervasive computing 4, no. 2 (2005): 51–59, https://doi.org/10.1109/mprv.2005.29.
26 josef fink and alfred kobsa, “a review and analysis of commercial user modeling servers for personalisation on the world wide web,” user modeling and user-adapted interaction 10, no. 2 (2000): 209–49, https://doi.org/10.1023/a:1026597308943.
27 hema yoganarasimhan, “search personalization using machine learning,” management science 66, no. 3 (2020): 1045–70, https://doi.org/10.1287/mnsc.2018.3255.
28 cristiane behnert and dirk lewandowski, “ranking search results in library information systems—considering ranking approaches adapted from web search engines,” the journal of academic librarianship 41, no. 6 (2015): 725–35, https://doi.org/10.1016/j.acalib.2015.07.010.
29 enrique frias-martinez, george magoulas, chen sherry, and robert macredie, “automated user modeling for personalised digital libraries,” international journal of information management 26, no. 3 (2006): 234–48, https://doi.org/10.1016/j.ijinfomgt.2006.02.006.
30 enrique frias-martinez, chen sherry, and liu xiaohui, “evaluation of a personalised digital library based on cognitive styles: adaptivity vs. adaptability,” international journal of information management 29, no. 1 (2009): 48–56, https://doi.org/10.1016/j.ijinfomgt.2008.01.012.
31 magda el-sherbini and george klim, “metadata and cataloging practices,” the electronic library 22, no. 3 (2004): 238–48, https://doi.org/10.1108/02640470410541633; shawn averkamp and joanna lee, “repurposing proquest metadata for batch ingesting etds into an institutional repository” (university libraries staff publications, 2009): 38; sai deng and terry reese, “customized mapping and metadata transfer from dspace to oclc to improve etd workflow,” new library world 110, no. 5/6 (2009): 249–64, https://doi.org/10.1108/03074800910954271.
32 steffen lohmann, jürgen ziegler, and lena tetzlaff, “comparison of tag cloud layouts: task-related performance and visual exploration,” in proceedings of the ifip conference on human-computer interaction (2009): 392–404, https://doi.org/10.1007/978-3-642-03655-2_43.
33 yuping jin, “development of word cloud generator software based on python,” procedia engineering 174 (2017): 788–92, https://doi.org/10.1016/j.proeng.2017.01.223.
34 joel azzopardi, dragan ivanović, and georgia kapitsaki, “comparison of collaborative and content-based automatic recommendation approaches in a digital library of serbian phd dissertations,” in proceedings of the international keystone conference 2016: 100–11, https://doi.org/10.1007/978-3-319-53640-8_9.

articles
library-authored web content and the need for content strategy
courtney mcdonald and heidi burkhardt
information technology and libraries | september 2019 8
courtney mcdonald (crmcdonald@colorado.edu) is learner experience & engagement librarian and associate professor, university of colorado at boulder. heidi burkhardt (heidisb@umich.edu) is web project manager & content strategist, university of michigan.
abstract
increasingly sophisticated content management systems (cms) allow librarians to publish content via the web and within the private domain of institutional learning management systems.
“libraries as publishers” may bring to mind roles in scholarly communication and open scholarship, but the authors argue that libraries’ self-publishing dates to the first “pathfinder” handout and continues today via commonly used, feature-rich applications such as wordpress, drupal, libguides, and canvas. although this technology can reduce costly development overhead, it also poses significant challenges. these tools can inadvertently be used to create more noise than signal, potentially alienating the very audiences we hope to reach. no cms can, by itself, address the fact that authoring, editing, and publishing quality content is both a situated expertise and a significant, ongoing demand on staff time. this article will review library use of cms applications, outline challenges inherent in their use, and discuss the advantages of embracing content strategy.
introduction
we tend to look at content management as a digital concept, but it’s been around for as long as content. for as long as humans have been creating content, we’ve been searching for solutions to manage it. the library of alexandria (300 bc to about ad 273) was an early attempt at managing content. it preserved content in the form of papyrus scrolls and codices, and presumably controlled access to them. librarians were the first content managers.1 (emphasis added)
content is, and has always been, central to the mission of libraries. content is physical, digital, acquired, purchased, leased, subscribed, and created. “libraries as publishers” may bring to mind roles in scholarly communication and open scholarship, but the authors argue that libraries’ self-publishing dates to the first mimeographed ‘pathfinder’ handout and continues today via commonly used, feature-rich web content management systems (cmss). libraries use these cmss to support research, teaching, and learning in a variety of day-to-day operations.
the sophisticated and complex infrastructure surrounding web-based library content has evolved from the singular, independently hosted and managed “library website” into a “library web ecosystem” comprised of multiple platforms, including integrated library systems, institutional repositories, cmss, and others. multiple cms applications, whether open-source (e.g., wordpress, drupal), institutionally supported (e.g., canvas, blackboard) or library-specific (e.g., springshare’s libguides), are employed by most libraries to power the library’s website and research guides, as well as to make their collections, in any and all formats, discoverable and accessible.
https://doi.org/10.6017/ital.v38i3.11015
library staff at all levels create and publish content through these cms platforms, an activity that is critical to our users discovering what we offer and accomplishing their goals. the cms removes technical bottlenecks and enables subject matter experts to publish content without coding expertise or direct access to a server. this disintermediation has many benefits, enabling librarians to share and interact directly with their communities, and reducing costly development overhead.
as with any powerful technology that’s simple to use, effectively implementing a cms is not without pitfalls. through these tools, we can inadvertently create more noise than signal, potentially alienating the very audiences we hope to reach. further, effective management of content and workflows across and among so many platforms is not trivial. distributing web content creation among many authors can quickly lead to numerous challenges requiring expert attention. governance strategies for library-authored web content are rarely addressed in the library literature.
this article will review library use of cms applications, outline challenges inherent in their use, and discuss the advantages of embracing content strategy as a framework for library-authored web content governance.
content management systems: a definition
any conversation on this topic is complicated by the fact that there is both misunderstanding and disagreement regarding the definition of a content management system. in their survey of 149 libraries covering day-to-day website management, including staffing, infrastructure, and organizational structures, bundza et al. observed “[w]hen reviewing the diverse systems mentioned, it is obvious that people defined cmss very broadly.”2 connell surveyed over 600 libraries regarding their use of cmss, defined as “website management tools through which the appearance and formatting is managed separately from content, so that authors can easily add content regardless of web authoring skills.”3 a few respondents “indicated their cms was dreamweaver or adobe contribute” and another “self-identified as a non-cms user but then listed drupal as their web management tool.”4 while the authors find the survey definition itself slightly ambiguous (likely in the service of clarity for survey respondents), we also believe that these responses may hint at an underlying and widespread lack of clarity regarding the technology itself.
an early report on potential library use of content management systems by browning and lowndes in 2001 opined that “a cms is not really a product or a technology. it is a catch-all term that covers a wide set of processes that will underpin the ‘next generation’ large-scale website.”5 while technological developments over the last twenty years reveal some limitations to this early characterization, we believe it is fundamentally sound to define the cms primarily through its functions.
fulton defined a cms as “an application that enables the shared creation, editing, publishing, and management of digital content under strict administrative parameters.”6 the authors concur with barker’s (2016) similarly task-based definition: “a content management system (cms) is a software package that provides some level of automation for the tasks required to effectively manage content . . . usually server-based, multi-user . . . [and] interact[ing] with content stored in a repository.”7 browning & lowndes defined the key tasks, or functions, of the cms as encompassing four major categories: authoring, workflow, storage, and publishing.8 barker (2016) also outlined “the big four” of content management as: enterprise content management (e.g., intranets), digital asset management (dam), records management, and web content management (wcm), with wcm defined as “the management of content primarily intended for mass delivery via a website. wcm excels at separating content from presentation and publishing to multiple channels.”9 for the purpose of clarity within the scope of this article, our discussion will primarily focus on content management systems as they are used for wcm, acknowledging that some principles may apply in varying degrees to other categories.
the cms and library websites
the library literature reveals that, generally speaking, libraries began the transition from telnet and gopher catalog interfaces to launching websites in the 1990s.10 case studies of library websites from this period through the mid-2000s report library website pages increasing at a rapid rate, in some cases doubling or tripling on a yearly basis.11 a comment from dallis and ryner in regard to their own case study provides a sense of what might be considered typical during this period: “the management of the site was decentralized, and it grew to an estimated 8,000 pages over a period of five years.”12 this proliferation, in turn, spurred focused interest in content management. “web content management (wcm) as a branch of content management (cm) gained importance during the web explosion in the mid-1990s.”13
as early as 2001 there were published laments regarding the state of library websites:
institutions are struggling to maintain their web sites. out of date material, poor control over design and navigation, a lack of authority control and the constriction of the webmaster (or even web team) bottleneck will be familiar to many in the he/fe [higher education / further education] sector. the pre-millennial web has been characterized by highly manual approaches to maintenance; the successful and sustainable post-millennial web will have significant automation. one vehicle by which this can be achieved is the cms.14
mach wrote:
the special concerns of web maintenance have only multiplied with the increased size and complexity of many library web sites. not only does the single webmaster model no longer work for most libraries, but the static html page is also in jeopardy. many overworked web librarians dream about the instant content updates possible with database-driven site or content management software. but while these technical solutions save staff time, they demand a fair amount of compromise.15
in 2010, fulton noted, “at one time, all institutions [mentioned in her literature review] could effectively manage their sites outside of a cms.
however, changing standards combined with uncontrollable growth patterns persuaded them to take steps to prevent prolonged chaos.”16
changing technology, accessibility, and literacy
throughout the early 2000s, advances in consumer technology and in web development (e.g., css, html 5, bootstrap) together with the need to comply with web-accessibility standards resulted in a gradual move from static, hand-coded sites to other solutions. in 2005, yu stated, “today’s content management solution is either a sophisticated software-based system or a database-driven application.”17 after a detailed explanation of the cumbersome process of managing and updating a static site using microsoft’s frontpage, kane and hegarty noted, “the opportunity to migrate the site to a content management system provided a golden opportunity . . . to bring the code into line with best practice.”18
this transition also coincided with the growth of viable cms options, particularly open-source tools. black stated in 2011: “in the past few years, the field of open-source cmss has increased, making it more likely that a library will find a viable cms in the existing marketplace that will meet the organization’s needs.”19 in 2013, comeaux and schmetzke replicated an earlier study of library websites’ accessibility, reviewing the homepages of library websites at 56 institutions offering ala-accredited library and information science programs using bobby, an automated web-accessibility checker.
they found that cms-powered library websites had a higher average of approved pages and a lower average of errors per page than those not powered by a cms.20 in a 2017 study, comeaux manually reviewed 37 academic library websites (members of the association of southeastern research libraries), and found that approximately three-quarters of cms-driven sites were responsive, as compared to only one-quarter of sites without a cms.21 accessibility also manifests itself on the web in other ways. it is important to consider what we know about literacy and how people read online. the ability to write using plain language, in addition to other essential techniques for effective web writing, is an important aspect of accessibility that must be addressed in tandem with compliance with industry standards such as the web content accessibility guidelines (wcag, https://www.w3.org/tr/wcag20/). a summary of recent results for the program for the international assessment of adult competencies (piaac, https://nces.ed.gov/surveys/piaac/) survey, administered to us adults, reported “the majority of people may struggle to read through a ‘simple’ bullet-point list of rules . . . nearly 62% of our population might not be able to read a graph or calculate the cost of shoes reliably.”22 blakiston succinctly observed: “on the web, scanning and skimming is the default.”23 these trends have led to an increasing push to adopt “plain language” by governmental agencies and others.24 skaggs stated, “adopt plain language throughout your website. 
plain language focuses on understanding and writing for the user’s goals, making content easily scannable for the user, and writing in easy to understand sentences.”25
library websites and the challenges of a distributed environment
in 2011, black pointed out one of the chief advantages to using a cms: “cmss support a distributed content model by separating the content from the presentation and giving the content provider an easy to use interface for adding content.”26 empowerment to focus on special expertise is noted as another benefit: “chief among the efficiencies gained in using a cms is the simple act of giving content authors the tools they need to create webpages and, most importantly, to do so without requiring the technical knowledge that used to be a part of webpage development. designers can design, writers can write, editors can edit, and technology folks can manage the cms and support its users.”27 browning and lowndes agreed: “the concept of ‘self-service authoring’, whereby staff do not need special skills to edit the content for which they are responsible, can be regarded as a major step towards acceptance of the web as a medium for communication by non-web specialists. providing this is the key advantage of a cms.”28
librarians quickly found, however, that while the adoption of a cms could empower more subject matter experts to participate in web content development and address technical issues such as responsive design and compliance with accessibility standards, the transition to a distributed model of content creation, oversight, and maintenance resulted in larger organizational ramifications. in 2006, approximately a decade following libraries’ general move to the web and at an early stage for cms adoption, guenther cautioned: “a cms is only a tool.
purchasing the very best cms with every bell and whistle available will be a useless exercise without a solid plan to guide people and processes around its use.”29 this same article went on to observe:
what makes using a cms a tremendous advantage is exactly what makes it a potential nightmare. a cms can make website development really easy; that's the good part. the bad part is, it makes webpage development really easy. one of the first issues you encounter is having to suddenly support a lot more content authors posting a lot more content. what once was an environment with limited activity can become a web development environment requiring considerably more oversight and technical support. having more hands stirring the pot, so to speak, is wrought with all kinds of challenges.30
untenable growth
this model of distributed content creation, in which authorship is undertaken by numerous parties across the organization, generally results in a rapidly increasing quantity of content without necessarily guaranteeing consistent quality. a review of the literature reveals that, more commonly, a distributed model leads to a lack of consistency and focus in library web content’s structure and execution. some papers underscore the problematic quality of the highly individualized nature of the content: “the sheer mass of [libraries’] public web presence has reached the point where maintenance is a problem. often the webpages grew out of the personal interests of staff members, who have since left for other jobs for other responsibilities or simply retired.”31 blakiston stated, “for a number of years, librarians were motivated to create more web content.
it was assumed that adding more content was a service for library users, and it was also seen as a way to improve their web skills and demonstrate their fluency with technology.”32 similarly, chapman and demsky described how the university of michigan library website grew “in an organic fashion” and noted, “[a]s in many places, the library’s longstanding attitude toward the web was that more was more and that there was really no harm in letting the website develop however individual units and librarians thought best.”33
other papers described “authority and decision-making issues . . . differing opinions, turf struggles or a lack of communication . . . a shortage of time and motivation, general inertia, and resistance to change on the part of content authors.”34 iglesias noted, “some librarians will always be more comfortable creating webpages from scratch, fearing a loss of control. the library as a whole must decide if the core responsibility of librarians is to create content or to create websites.”35 newton and riggs stated, “this approach to content appears to be at odds with the role of librarians as leaders in information management practices and in supporting users to find, filter and critically evaluate information.”36
in her article “editorial and technological workflow tools to promote website quality,” morton-owens discussed several studies measuring the severe impact of even small flaws (such as typographical errors) on users’ judgements of a website’s credibility, and, by extension, of the organization’s credibility: “users’ experience of a website leads them to attribute characteristics of competence and trustworthiness to the sponsoring organization.”37
a. paula wilson, citing mcconnell and middleton, summarized the potential pitfalls inherent in a distributed model in which empowerment of content creators overshadows a unified vision, strategy, and approach to library-wide content management:
a decentralized model without the use of guidelines, standards or templates will eventually fail. the website may experience inconsistency in presentation and navigation, outdated and incorrect information, and gaps in content, and its webpages may be noncompliant in usability and accessibility design so much so that users cannot find information.38
inconsistent voice and lack of organizational unity
in addition to such compounding factors and in contrast to journalistic practice, “libraries lack an editorial culture where content production and management is viewed as a collective rather than a personal effort.”39 morton-owens noted: “the concept of editing is not yet consistently applied to websites unless the site represents an organization that already relies on editors (like a newspaper)—but it is gaining recognition as a best practice. if the website is the most readily available public face of an institution, it should receive editorial attention just as a brochure or fundraising letter would.”40
in an environment with distributed authorship lacking a strong and consistent editorial culture, an organization's “voice” can quickly deteriorate. in web writing, voice is often defined as personality. blakiston stated: “the written content you provide plays an essential role in defining your library as an organization.”41 young went further, aligning voice with values, and arguing “[a]ny item of content that your library creates—an faq, a policy page, or a facebook post—should be conveyed in the voice of your library and should communicate the values of your library.
a combined expression of content and values defines the voice of your organization.”42 in their 2006 article “cms/cms: content management system/change management strategies,” goodwin et al. explored organizational challenges: the effort of developing a unified web presence reveals where the organization itself lacks unity . . . effective use of a content management system requires an organized and comprehensive consolidation of library resources, which emphasizes the need for a different organizational model and culture—one that promotes thinking about the library as a whole, sharing and collaboration.43 fulton built on this concept: “disunity in the library’s web interface could signify disunity within the institution. on the other hand, a harmonious web presence suggests an institution that works well together.”44 young drew an inherent connection between a strongly unified organizational identity and a consistent and coherent “content strategy”: while libraries in general can draw on decades or centuries of cultural identity, each individual library may wish to convey a unique set of attributes that are appropriate for unique contexts. in this way, the element of “organizational values” inherent to content strategy signals a larger visioning project for determining the mission, vision, and values of your library. if these elements are already in place, then the work of content strategy can easily be adapted to fit existing values statements. otherwise, content strategy and organizational values can develop as a joint initiative.45

library websites and content strategy

content strategy is an emerging discipline that brings together concepts from user experience design, information architecture, marketing, and technical writing.
content strategy encompasses activities related to creating, updating, and managing content that is intentional, useful, usable, well-structured, easily found, and easily understood, all while supporting an organization’s strategic goals.46 browning and lowndes recognized as early as 2002 that strategy would be required as the variety of communication channels for libraries increased: “as local information systems integrate and become more pervasive, self-service authoring extends to the concept of ‘write once, re-use anywhere’, in which the web is treated as just another communication channel along with email, word processor files and presentations, etc.”47 more than a decade later, in the introductory column to a 2013 themed issue of information outlook focused on content strategy, hales stated: content strategy is a field for which information professionals and librarians are ideally suited, by virtue of both their education and temperament. content, after all, is another word for information, and librarians and information professionals have been developing strategies for acquiring, managing, and sharing information for centuries. today, however, information is available to more people in more forms and through more channels than ever before, making content strategies a necessity for organizations rather than an afterthought.48 jones and farrington posited a common refrain for stating the importance of content strategy for librarianship: “library website content must be viewed in much the same way as a physical collection” and the “library website, to apply s. r.
ranganathan’s fifth law, is a growing organism and must be treated as such, especially with the complexity of web content.”49 claire rasmussen drew connections between ranganathan’s laws and content strategy in a blog post, pointing out that web content represents an additional set of responsibilities to be managed: “for hundreds of years, librarians have been the primary caretakers of the content corpus. but somebody needs to care for the content that never makes it into a library’s collections, too.”50 blakiston & mayden provided a helpful overview of content strategy and its application in libraries in their article “how we hired a content strategist (and why you should too),” finding many points of connection between skill sets essential to content strategy and those commonly possessed by librarians: librarians who have worked in public services may have the needed skills to ask good questions and find out what users need . . . professionals doing this kind of work came from backgrounds including communications, english and library science . . . desirable qualifications for . . . content strategist[s] . . . [include] strategic planning, web skills and project management.51 the circumstances that motivated them to propose and eventually hire a dedicated content strategist at the university of arizona libraries hearken back to the discussion earlier in this article regarding the increasing complexity of web librarianship: “the web product manager had independently coordinated all user research and content strategy work.
the idea of both managing [a major web redesign project] and leading these other important areas was not realistic.”52 datig also pointed to increasing day-to-day responsibilities when advocating for the importance of content strategy for librarians with outreach and marketing responsibilities: “lack of time, and a desire for that time to be well spent, is a huge concern for all librarians involved in library outreach and marketing . . . content strategy is an important and overlooked aspect of maintaining an effective and vital library outreach program.”53 hackett reflected on her role as web content strategist in a blog post after a recent website migration, noting: “moving forward with a content strategy . . . will ensure that university libraries’ website is useful, usable, and discoverable—now and in the future.”54 yet, while the need for strategy is hard to dispute and librarians are theoretically well suited for web content strategy work, blakiston & mayden noted that explicit organizational support for content strategy in libraries remained limited: “despite the growing popularity of content strategy as a discipline, only a handful of libraries had hired staff dedicated to this role at the time we proposed adding a content strategist to our staff.”55

conclusion

this article has traced the history of library adoption of web content management systems, the evolution of those systems, and the corresponding challenges as libraries have attempted to manage increasingly prolific content creation workflows across multiple, divergent cms platforms.

what is the library website, anyway?

while some variation would be expected from institution to institution, largely missing from the conversation is agreement on the purpose and aim of the library website writ large. this lack of definition, together with the technological and growth-related issues already discussed, has doubtless contributed to the confusion.
after all, how would we know if we are “building it right” if we are not sure what we are meant to be building in the first place? in response to this ambiguity, the following definition was proposed: the library website is an integrated representation of the library, providing continuously updated content and tools to engage with the academic mission of the college/university. it is constructed and maintained for the benefit of the user. value is placed on consumption of content by the user rather than production of content by staff.56

effective management of library web content requires dedicated resources and clear authority

inconsistent processes, disconnects between units, varying constituent goals, and vague or ineffective wcm governance structures are recurrent themes throughout the literature. as cms applications have enabled broader access to web publishing, models of library web management have moved away from workflows structured around strictly technical tasks and permissions, and have instead migrated toward consensus-based, revolving committee structures. while greater involvement of subject matter experts has been noted as a positive earlier in this article, other challenges have also been acknowledged. mcdonald, haines, and cohen stated: “in the context of web design and governance, consensus is a blocker to nimble, standards-based, user-focused action.”57

library website as an integrated representation of the organization

as previously discussed, web content governance issues often signal a lack of coordination, or even of unity, across an organization.
demsky stated, “we won’t be fully successful until we see it as our website” (emphasis added).58 internal documentation from the university of michigan library emphasized the value of “publicly represent[ing] ourselves as one library,” and stated: the more people are provided with clear communication that shows our offerings and unique items are part of the . . . library—rather than confuse users by making primary attribution to a sub-library, collection, or service point—the more people will recognize and understand the library's tremendous, overall value.59

content strategy and the case for library-authored content

no cms can, by itself, address the fact that authoring, editing, and publishing quality content is both a situated expertise and a significant, ongoing demand on staff time. each platform, resource, or database brings its own visual style, terminology, tone and functionality. they are all parts of the library experience, which in turn is one part of the student, research or teaching experience. an understanding of content strategy is critical if staff are to see the connections between their own content and the rest of the content delivered by the organization.60 libraries must proactively embrace and employ best practices in content strategy and in writing for the web to effectively address considerations of literacy and to present a consistent voice for the organization. these practices position libraries to fully realize the promise of content management systems through embracing an ethos of library-authored content. the authors define library-authored content as collectively owned and authored content that represents the organization as a whole.
library-authored content is:
• collaboratively planned, written, and edited with participation of both subject matter experts and domain experts (i.e., library staff with expertise in content strategy, web librarianship);
• carefully drafted to optimize for clarity within the context of the end-user;
• current, reviewed on a recurrent schedule, and regularly updated;
• consistent across the ecosystem of cms applications and other platforms, including print materials and social media;
• compliant with industry standards (including but not limited to those related to accessibility), and with relevant internal brand standards; and
• centrally managed as the primary responsibility of one or more domain experts.

in order for libraries to meet the ever-increasing demands on our resources to produce timely, user-centered content that advances our missions for supporting teaching, research, and learning, a cultural shift toward a more collective, collaborative model of web content management and governance is necessary. content strategy provides a flexible, adaptable framework for libraries to more efficiently and effectively leverage the power of multiple cms platforms, to present engaging on-point content, and to provide appropriate, scaffolded support for researchers at all levels — with a team of one or a team of many.

endnotes

1 deane barker, “what web content management is (and isn’t),” in web content management (o’reilly media, inc., 2016), sec. what web content management is (and isn’t), https://learning.oreilly.com/library/view/web-content-management/9781491908112/. 2 maira bundza, patricia fravel vander meer, and maria a. perez-stable, “work of the web weavers: web development in academic libraries,” journal of web librarianship 3, no. 3 (september 15, 2009): 252, https://doi.org/10.1080/19322900903113233.
3 ruth sara connell, “content management systems: trends in academic libraries,” information technology and libraries 32, no. 2 (june 10, 2013): 43, https://doi.org/10.6017/ital.v32i2.4632. 4 connell, 46. 5 paul browning and mike lowndes, “jisc techwatch report: content management systems,” 2001, 3, http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.15.9100. 6 camilla fulton, “library perspectives on web content management systems,” first monday 15, no. 8 (july 15, 2010): sec. review of literature, https://doi.org/10.5210/fm.v15i8.2631. 7 barker, “what web content management is (and isn’t),” sec. what is a content management system? 8 browning and lowndes, “jisc techwatch report,” 4. within a diagram outlining the major functions within the content life-cycle, they include the steps ‘review’, ‘archive’ and ‘dispose’ steps which, in the experience and observations of the authors, are often overlooked in general library web practice. 9 barker, sec. types of content management systems. 10 laura b. cohen, matthew m. calsada, and frederick j. 
jeziorkowski, “scratchpad: a quality management tool for library web sites,” content and workflow management for library websites: case studies, 2005, 102–26, https://doi.org/10.4018/978-1-59140-533-7.ch005; diane dallis and doug ryner, “indiana university bloomington libraries presents organization to the users and power to the people: a solution in web content management,” content and workflow management for library websites: case studies, 2005, 80–101, https://doi.org/10.4018/978-1-59140-533-7.ch004; stephen sottong, “database-driven web pages using only javascript: active client pages,” content and workflow management for library websites: case studies, 2005, 167–85, https://doi.org/10.4018/978-1-59140-533-7.ch008; ray bailey and tom kmetz, “migrating a library’s web site to a commercial cms within a campus‐wide implementation,” library hi tech 24, no. 1 (january 1, 2006): 102–14, https://doi.org/10.1108/07378830610652130; juan carlos rodriguez and andy osburn, “developing a distributed web publishing system at csu sacramento library: a case study of coordinated decentralization,” content and workflow management for library websites: case studies, 2005, 51–79, https://doi.org/10.4018/978-1-59140-533-7.ch003; barbara a. blummer, “a literature review of academic library web page studies,” journal of web librarianship 1, no. 1 (june 21, 2007): 45–64, https://doi.org/10.1300/j502v01n01_04; robert slater, “the library web site: collaborative content creation and management,” journal of web librarianship 2, no.
4 (december 2008): 567–77, https://doi.org/10.1080/19322900802473928; rebecca blakiston, “developing a content strategy for an academic library website,” journal of electronic resources librarianship 25, no. 3 (july 2013): 175–91, https://doi.org/10.1080/1941126x.2013.813295; suzanne chapman and ian demsky, “taming the kudzu: an academic library’s experience with web content strategy,” in cutting-edge research in developing the library of the future, ed. bradford lee eden (lanham, md: rowman & littlefield, 2015). 11 cohen, calsada, and jeziorkowski, “scratchpad,” 11; rodriguez and osburn, “developing a distributed web publishing system at csu sacramento library,” 76–77; slater, “the library web site,” 57. 12 dallis and ryner, “indiana university bloomington libraries presents organization to the users and power to the people,” 82. 13 holly yu, ed., content and workflow management for library web sites: case studies (hershey, pa: igi global, 2005), vi. 14 browning and lowndes, “jisc techwatch report,” 5. 15 michelle mach, “website maintenance workflow at a medium-sized university library,” content and workflow management for library websites: case studies, 2005, 128, https://doi.org/10.4018/978-1-59140-533-7.ch006. 16 fulton, “library perspectives on web content management systems,” sec. review of literature. 17 yu, content and workflow management for library web sites, 2. 18 nora hegarty and david kane, “new web site, new opportunities: enforcing standards compliance within a content management system,” library hi tech 25, no. 2 (june 19, 2007): 278, https://doi.org/10.1108/07378830710755027. 19 elizabeth l. black, “selecting a web content management system for an academic library website,” information technology and libraries 30, no. 4 (december 1, 2011): 186, https://doi.org/10.6017/ital.v30i4.1869. 
20 dave comeaux and axel schmetzke, “accessibility of academic library web sites in north america: current status and trends (2002‐2012),” library hi tech 31, no. 1 (march 1, 2013): 27, https://doi.org/10.1108/07378831311303903. 21 david j. comeaux, “web design trends in academic libraries — a longitudinal study,” journal of web librarianship 11, no. 1 (january 2, 2017): 12, https://doi.org/10.1080/19322909.2016.1230031. 22 meredith larson, “even if you’re trying, you’re probably not writing for the average american,” federal communicators network (blog), october 9, 2018, https://fedcommnetwork.org/2018/10/09/even-if-youre-trying-youre-probably-not-writing-for-the-average-american/. 23 rebecca blakiston, writing effectively in print and on the web—a practical guide for librarians (rowman & littlefield, 2017), 110. 24 national adult literacy agency, “plain english around the world,” simply put, 2015, http://www.simplyput.ie/plain-english-around-the-world; plain language action and information network, general services administration, united states government, “home | plainlanguage.gov,” plainlanguage.gov, accessed february 1, 2019, https://www.plainlanguage.gov/. 25 danielle skaggs, “my website reads at an eighth grade level: why plain language benefits your users (and you),” journal of library & information services in distance learning, 2016, 2, https://doi.org/10.1080/1533290x.2016.1226581.
26 black, “selecting a web content management system for an academic library website,” 185. 27 kim guenther, “content management systems as ‘silver bullets,’” online 30, no. 4 (2006): 55. 28 paul browning and mike lowndes, “content management systems: who needs them?,” ariadne, no. 30 (2002): sec. the issue, http://www.ariadne.ac.uk/issue30/techwatch. 29 guenther, “content management systems as ‘silver bullets,’” 54. 30 guenther, 56. 31 michael seadle, “content management systems,” library hi tech 24, no. 1 (january 1, 2006): 5, https://doi.org/10.1108/07378830610652068. 32 blakiston, “developing a content strategy for an academic library website,” 176. 33 chapman and demsky, “taming the kudzu,” 25. 34 bundza, meer, and perez-stable, “work of the web weavers,” 256. 35 edward iglesias, “winning the peace: an approach to consensus building when implementing a content management system,” in content management systems in libraries: case studies, ed. bradford lee eden (scarecrow press, 2008), 177. 36 kristy newton and michelle riggs, “everybody’s talking but who’s listening? hearing the user’s voice above the noise, with content strategy and design thinking,” vala2016 conference, january 1, 2016, 1, https://ro.uow.edu.au/asdpapers/536. 37 emily g. morton-owens, “editorial and technological workflow tools to promote website quality,” information technology and libraries 30, no.
3 (september 2, 2011): 91, https://doi.org/10.6017/ital.v30i3.1764. 38 a. paula wilson, library web sites: creating online collections and services (chicago: american library association, 2004), 4. 39 chapman and demsky, “taming the kudzu,” 35. 40 morton-owens, “editorial and technological workflow tools to promote website quality,” 97. 41 blakiston, writing effectively in print and on the web — a practical guide for librarians, 6. 42 scott w. h. young, “principle 1: create shareable content,” library technology reports 52, no. 8 (november 18, 2016): 11–12. 43 susan goodwin et al., “cms/cms: content management system/change management strategies,” library hi tech 24, no. 1 (january 2006): 55–56, https://doi.org/10.1108/07378830610652103. 44 fulton, “library perspectives on web content management systems,” sec. discussion. 45 young, “chapter 1. principle 1,” 12. 46 kristina halvorson, “understanding the discipline of web content strategy,” bulletin of the american society for information science & technology 37, no. 2 (january 2011): 23–25, https://doi.org/10.1002/bult.2011.1720370208; anne haines, “web content strategy: what is it, and why should i care?,” inula notes 27, no. 2 (december 18, 2015): 11–15, https://scholarworks.iu.edu/journals/index.php/inula/article/view/20672/26734; u.s. department of health & human services, “content strategy basics,” usability.gov, january 24, 2016, https://www.usability.gov/what-and-why/content-strategy.html. 47 browning and lowndes, “content management systems,” sec. the issue. 48 stuart hales, “providing content strategy services,” information outlook (online); alexandria 17, no. 6 (december 2013): 8. 49 kyle m. l. jones and polly-alida farrington, “wordpress as library cms,” american libraries; chicago 42, no. 5/6 (june 2011): 34. 
50 claire rasmussen, “do it like a librarian: ranganathan for content strategists « brain traffic blog,” braintraffic blog, june 13, 2012, https://web.archive.org/web/20120613173955/http://blog.braintraffic.com/2012/06/do-it-like-a-librarian-ranganathan-for-content-strategists/. 51 rebecca blakiston and shoshana mayden, “how we hired a content strategist (and why you should too),” journal of web librarianship 9, no. 4 (2015): 196, https://doi.org/10.1080/19322909.2015.1105730. 52 blakiston and mayden, 197. 53 ilka datig, “revitalizing library websites and social media with content strategy: tools and recommendations,” journal of electronic resources librarianship 30, no. 2 (2018): 63–64, https://doi.org/10.1080/1941126x.2018.1465511. 54 karen hackett, “what is a web content strategist?,” library news, october 17, 2016, https://sites.psu.edu/librarynews/2016/10/17/whats-a-web-content-strategist/. 55 blakiston and mayden, “how we hired a content strategist (and why you should too),” 196. 56 courtney mcdonald, anne haines, and rachael cohen, “from consensus to expertise: rethinking library web governance,” acrl techconnect (blog), november 2, 2015, https://acrl.ala.org/techconnect/post/from-consensus-to-expertise-rethinking-library-web-governance/. 57 mcdonald, haines, and cohen.
58 ian demsky, “lessons from my first year as web content strategist,” library tech talk (blog), august 7, 2014, https://www.lib.umich.edu/blogs/library-tech-talk/lessons-my-first-year-web-content-strategist. 59 university of michigan library, “editorial style and best practices,” january 23, 2019, sec. library branding. 60 newton and riggs, “everybody’s talking but who’s listening?,” 12.

reproduced with permission of the copyright owner. further reproduction prohibited without permission.

is this a geolibrary? a case of the idaho geospatial data center. jankowska, maria anna; jankowski, piotr. information technology and libraries; mar 2000; 19, 1; proquest pg. 4

is this a geolibrary?
a case of the idaho geospatial data center
maria anna jankowska and piotr jankowski

the article presents the idaho geospatial data center (igdc), a digital library of public-domain geographic data for the state of idaho. the design and implementation of igdc are introduced as part of the larger context of a geolibrary model. the article presents methodology and tools used to build igdc with the focus on a geolibrary map browser. the use of igdc is evaluated from the perspective of access and demand for geographic data. finally, the article offers recommendations for future development of geospatial data centers.

maria anna jankowska (majanko@uidaho.edu) is associate network resources librarian, university of idaho library, and piotr jankowski (piotrj@uidaho.edu) is associate professor, department of geography, university of idaho, moscow, idaho.

in the era of integrated transnational economies, demand for fast and easy access to information has become one of the great challenges faced by the traditional repositories of information: libraries. globalization and the growth of market-based economies have brought about, faster than ever before, acquisition and dissemination of data, and the increasing demand for open access to information, unrestricted by time and location. these demands are mobilizing libraries to adopt digital information technologies and create new methods of cataloging, storing, and disseminating information in digital formats.

libraries encounter new challenges constantly. participation in the global information infrastructure requires them to support public demand for new information services, to help society in the process of self-education, and to promote the internet as a tool for sharing information. these tasks are becoming easier to accomplish thanks to the growing number of digital libraries.

since 1994, when the digital library initiative originated as part of the national information infrastructure program, the internet has accommodated many digital libraries with spatial data content. for example, the electronic environmental library project at the university of california, berkeley (http://elib.cs.berkeley.edu/) provides botanical and geographic data; the university of michigan digital library teaching and learning project (www.si.umich.edu/umdl/) focuses on earth and space sciences; carnegie mellon's informedia digital video library (www.informedia.cs.cmu.edu) distributes digital video, audio, and images with text; and the alexandria digital library at santa barbara (http://alexandria.sdc.ucsb.edu/) provides geographically referenced information.

the alexandria digital library is of special interest in this article because it implements a model of a geolibrary. a geolibrary stores georeferenced information searchable by geographic location in addition to traditional searching methods such as by author, title, and subject.

the purpose of this article is to present the idaho geospatial data center (igdc) in the larger context of a geolibrary model. igdc is a digital library of public-domain geographic and statistical data for the state of idaho. the article discusses methodology and tools used to build igdc and contrasts its capabilities with a geolibrary model. the usage of igdc is evaluated from the perspective of access and demand for geographic data. finally, the article offers recommendations for future development of geospatial data centers.

geographic information systems for public services

terms such as digital, electronic, virtual, or image libraries have existed long enough to inspire diverse interpretations.
the broad definition by covi and king concentrates on the main objective of digital libraries, which is the collection of electronic resources and services for the delivery of materials in different formats.1 the common motivation for initiatives leading to the development of digital libraries is to allow conventional libraries to move beyond their traditional roles of gathering, selecting, organizing, accessing, and preserving information. digital libraries provide new tools allowing their users not only to access the existing data but also to create new information. the creation of new information using the existing data sources is essential to the very idea of the digital library. since the information in a digital library exists in virtual form, it can be manipulated instantaneously by computer-based information processing tools. this is not possible using traditional information media (e.g., paper, microfilm) where the information must first be transferred from non-digital into digital format. since late 1994, when the u.s. national science foundation founded the alexandria digital library project, the number of internet sites devoted to spatially referenced information has grown dramatically. today, it would require a serious expenditure of time and effort to visit all geographic data sites created by state agencies, universities, and commercial organizations. in 1997 karl musser wrote, "there are now more than 140 sites featuring interactive maps, most of which have been created in the last two years."2 this incredible boom in publishing spatial data is possible thanks to geographic information system (gis) technology and data development efforts brought about by the rapidly increasing use of gis.
this new technology provides its users with capabilities to automate, search, query, manage, and analyze geographic data using the methods of spatial analysis supported by data visualization.

traditionally, geographic data were presented on maps considered as public assets. according to a norwegian survey, the aggregate benefit accrued from using maps was three times the total cost of their production, even though maps provided only static information.3 today, the conventional distribution of geographic data on printed maps has become less efficient than distributing them in digital format through wide area data networks. this happened largely due to gis's ability to separate data storage from data presentation. as a result, data can be presented in a dynamic way, according to users' needs. gis is often termed a "data mixing system" because it can process data from different sources and formats such as vector-format maps with full topological and attribute information, digital images of scanned maps and photos, satellite data, video data, text data, tabular data, and databases.4 all of these data types provide a rich informational infrastructure about locations and properties of entities and phenomena distributed in terrestrial and subterrestrial space.

the definition of gis changes according to the discipline using it. gis can be used as a map-making machine, a 3-d visualization tool, and as an analytical, planning, collaboration, and business information management tool. today, it is hard to find a planning agency, city engineering department, or utility company (not to mention individual internet users) that has not used digital maps. this is why the number of users seeking spatial data in digital format has increased so dramatically. for gis users, data discovery can be the most time-consuming part of using the technology.5 as a result, libraries are faced with the growing demand for services that help discover, retrieve, and manipulate spatial data.
the web greatly improved the availability and accessibility of spatial data and, at the same time, stimulated public interest in using geographic information. the continuing migration to popular operating systems (i.e., the microsoft windows family) and the adoption of their common functionality has brought gis software to many desktops. tools such as arcview gis from environmental systems research institute, inc. (esri, www.esri.com) or mapinfo from mapinfo corporation (mapinfo, www.mapinfo.com) have become popular gis desktop systems. new software tools such as arcexplorer, released by esri, are focused on making gis more accessible, simpler, and available for use by the public. by taking advantage of the popularity of the web, attempts are being made to gain a wider acceptance of gis.

in the wake of the simplification of gis tools and improved access to spatial data, a new exciting area of gis use has recently emerged: public participation gis.6 public participation gis by definition is a pluralistic, inclusive, and nondiscriminatory tool that focuses on the possibility of reducing the marginalization of societies by means of introducing geographic information operable on a local level.7 it promotes an understanding of spatial problems by those who are most likely to be affected by the implementation of problem solutions, and encourages transfer of control and knowledge to these parties. this approach leads to a broader use of gis tools and spatial data and creates new challenges for libraries storing and serving geographic data in digital formats. broadening the use of data and gis tools requires attention to data access. traditional libraries have often fulfilled the crucial role of being an impartial information provider for all parties involved in public decision-making processes. will they be capable of serving society in this capacity in the digital age?
geolibrary as a repository of georeferenced information

according to brandon plewe, the user of spatial data can choose among seven types of distributed geographic information services available on the internet.8 they range from raw data download, through static map display, metadata search, dynamic map browsing, data processing, web-based gis query and analysis, to net-savvy gis software. yet another important new category of geographic data service that can be added to this list is the geolibrary. goodchild defines a geolibrary as a library filled with georeferenced information where the primary basis of representation and retrieval are spatial footprints that determine the location by geographic coordinates. "the footprints can be precise, when they refer to areas with precise boundaries, or they can be fuzzy when the limits of the area are unclear."9 according to buttenfield, "the value of a geolibrary is that catalogs and other indexing tools can be used to attach explicit locational information to implicit or fuzzy requests, and once accomplished, can provide links to specific books, maps, photographs, and other materials."10 a geolibrary is distinguished from a traditional library in being fully electronic, with digital tools to access digital catalogs and indexes. it is anticipated that most of the information is archived in digital form. the value of a geolibrary is that it can be more than a traditional, physical library in electronic form.11

is this a geolibrary? | jankowska and jankowski

since its introduction, the concept of a geolibrary has been synonymous with the alexandria digital library (adl) project.
adl was once defined as the internet-based archive providing comprehensive browsing and retrieval services for maps, images, and spatial information.12 a more recent definition characterizes adl as a geolibrary where a primary attribute of collection objects is their location on earth, represented by geographic footprints. a footprint is the latitude and longitude values that represent a point, a bounding box, a linear feature, or a complete polygonal boundary.13 according to goodchild (1998), a geolibrary's components include:

• the browser: a specialized software application running on the user's computer and providing access to the geolibrary via a computer network.
• the basemap: a geographic frame of reference for the browser's searches. a basemap provides the image of an area corresponding to the geographical extent of the geolibrary collection. for a worldwide collection this would be the image of the earth. for a statewide collection this could be the image of a state. the basemap may be potentially large, in which case it is more advantageous to include it in the browser than to download it from a geolibrary server each time a geolibrary is accessed.
• the gazetteer: the index that links place names to a map. the gazetteer allows geographic searches by place name instead of by area.
• server catalogs: collection catalogs maintained on distributed computer servers. the servers can be accessed over a network with the browser, utilizing basic server-client architecture.

the value of a geolibrary lies in providing open access to a multitude of information with geographic footprints regardless of the storage media. because all information in a digital library is stored using the same digital medium, traditional problems of physical storage, accessibility, portability, and concurrent use (e.g., many patrons wanting to view the one and only copy of a map) do not exist.
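the interplay of footprints, gazetteer, and server catalog described above can be illustrated with a minimal python sketch. it is not taken from adl or any other geolibrary implementation: the `Footprint` class, the `GAZETTEER` and `CATALOG` dictionaries, the `search_by_place` function, and all coordinates are hypothetical, and only the bounding-box kind of footprint is modeled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Footprint:
    """Bounding-box footprint in decimal degrees (hypothetical structure)."""
    west: float
    south: float
    east: float
    north: float

    def intersects(self, other: "Footprint") -> bool:
        # Two boxes overlap unless one lies entirely to one side of the other.
        return not (
            self.east < other.west or other.east < self.west
            or self.north < other.south or other.north < self.south
        )


# Toy gazetteer: place name -> footprint. Coordinates are rough, made-up values.
GAZETTEER = {
    "moscow": Footprint(-117.05, 46.70, -116.95, 46.77),
    "boise": Footprint(-116.30, 43.55, -116.10, 43.70),
}

# Toy server catalog: item title -> footprint of the area the item covers.
CATALOG = {
    "latah county road map": Footprint(-117.04, 46.40, -116.30, 47.00),
    "ada county aerial photo": Footprint(-116.50, 43.40, -116.00, 43.80),
}


def search_by_place(name: str) -> list[str]:
    """Resolve a place name through the gazetteer, then return the titles
    of catalog items whose footprints intersect that place."""
    area = GAZETTEER.get(name.lower())
    if area is None:
        return []  # unknown place name: nothing to search for
    return sorted(title for title, fp in CATALOG.items() if fp.intersects(area))


print(search_by_place("Moscow"))
```

the design point is that the place name is only an entry point: retrieval itself is done on coordinates, so any item whose footprint overlaps the requested area is found even if its title never mentions the place.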
idaho geospatial data center

in 1996, inspired by the adl project, a team of geographers, geologists, and librarians started to work on a digital library of public-domain geographic data for the state of idaho. the main goal of the project was the development of a geographic digital data repository accessible through a flexible browsing tool. the project was funded by a grant from the idaho board of education's technology incentive program. the project resulted in the creation of the idaho geospatial data center (igdc, http://geolibrary.uidaho.edu). the first in the state of idaho, this digital library comprises a database containing geospatial datasets and geolibrary software that facilitates access, browsing, and retrieval of data in popular gis data formats including digital line graph (dlg), digital raster graphics (drg), usgs digital elevation model (dem), and u.s. bureau of census tiger boundary files for the state of idaho. the site also provides an interactive visual analysis of selected demographic/economic data for idaho counties. additionally, the site provides interactive links to other idaho and national spatial data repositories.

the key component of the library is the geolibrary software. the name "geolibrary" is not synonymous with the model of geolibrary defined by goodchild (1998). it was rather adopted as a reference to a geolibrary browser, one of the components of the geolibrary. the geolibrary browser (gl) supports online retrieval of spatial information related to the state of idaho. it was implemented using microsoft visual basic 5.0/6.0 and esri mapobjects technology. the software allows users to query an area of interest using a search based on map selection, as well as selection by area name (based on the usgs 7.5-minute quad naming convention). queries return gis data including dems, dlgs, drgs, and tiger files.
queries are intended both for professionals seeking gis-format data and nonprofessionals seeking topographic reference maps in the drg format.

the interface of gl consists of three panels resembling the microsoft outlook user interface. our intent in designing the interface was to have panels that would be used in the following order. first, the map panel is used to explore the geographic coverage of the geolibrary and to select the area of interest. next, the query panel is used to execute a query, and finally the results panel allows the user to analyze results and to download spatial data. users can use a shortcut to go directly to the query panel and type their query. both approaches result in the output being displayed as the list of files available for download from participating servers.

the map panel (figure 1) includes a navigable map of idaho, a vertical command toolbar, and a map finder tool. the command toolbar allows the user to zoom in, zoom out, pan the map, identify by name the entities visible on the map canvas, and select a geographic area of interest. geographic entity name identification was implemented as a dynamic feature whereby the displayed entity name changes as the user moves the mouse over the map. spatial selection provides a tool to select a rectangular area of interest directly on the map canvas. the map finder provides additional means to simplify the exploration of the map: the user can select a county or a quad name and zoom in on the selected geographic unit.

figure 1. map panel. the vertical toolbar provides zooming, panning, as well as labeling and simple feature querying capabilities. the map finder allows finding and selecting an area by county or usgs quad name. the screen copy here presents the selection of latah county in idaho.

the query panel (figure 2) allows the user to perform a query, based either on the selection made on the map or a new selection using one of the available query tools (figure 3). in the latter case, the user can enter geographic coordinates (in decimal degrees) defining the area of interest. this approach is equivalent to selecting a rectangular area directly on the map, and will return all data files that spatially intersect with the selected area. optionally, the user can handpick quads of interest from the list. finally, a name can be entered to execute a more flexible query. for instance, the search containing the word "moscow" returns spatial data related to three quads containing "moscow" within their names. the query is executed when the user presses the query button. after the results are received, the application automatically switches to the results panel.

figure 2. query panel. the interface was set to query spatial selection from the map panel.

figure 3. query panel. the query is based on the selection of usgs quads. optionally, the user can enter geographic coordinates of the area or a text to search.

the results panel shows the outcome of the query and includes important information about the data files: their size, type, projection, scale, the name of the server providing the data, as well as the access path (figure 4). based on this information, the user has the option of manually connecting to the server, using the ftp protocol, and retrieving the selected files. a much more convenient approach, however, is to rely on gl software to automatically retrieve the files through the software interface. as an option, the result of the query can also be exported to a plain html document that contains links to all listed files. this feature can be very useful when the user has selected multiple files and has slow or limited-time internet access. this way the user can open the saved list of files in a web browser and download individual files as needed, without having to download all the files at once and tie up the internet connection for a long period of time.

the results panel provides a flexible way to review and organize the outcomes of queries before commencing the download. one can sort files by name, size, scale, projection, and server name. this feature may be useful if the user decides to retrieve data of only one type (e.g., dems), of one scale, or when the user prefers to connect only to a specific server. in addition, individual records as well as entire file types can be deselected to prevent files from being downloaded. the user can also remove selected files to scale down the set of data in the list.

one of the most important assets of the gl browser is that all of the user activities described up to this point, with the exception of file download, take place entirely on the client side without any network traffic. in fact, area/file selection as well as queries do not require an active internet connection. map exploration is based on vector-format maps contained in gl software and queries are run against the local database. such an approach limits bandwidth consumption and unnecessary network traffic. an internet connection is only necessary to perform retrieval of selected files.

the vulnerability of the client-side approach to data query is a potentially outdated local database. in order to prevent this problem, the gl is equipped with a database synchronization mechanism that allows users to keep up with the server database updates. the client-side database, contained in gl software, which mirrors the schema of the server database, can be synchronized automatically or by the user's request.
in either case, the gl client contacts the server-side database synchronizer and handles all necessary processes. since the synchronization is limited to database record updates, the network traffic is kept low, making gl suitable for limited internet connections.

igdc is an open solution. new local datasets can be added or removed, making the collection easily adaptable to different geographical areas. in addition, datasets can physically reside on multiple servers, taking full advantage of the internet's distributed nature.

evaluation of igdc use

geospatial information is among the most common public information needs; almost 80 percent of all information is geographic in nature. published research reflecting those needs and the role of libraries in resolving them is not extensive. the efforts of federal, state, and local agencies collecting digital geospatial data and the growth of gis created an interest in the role of libraries as repositories of geospatial data.14 the main obstacle to providing access to digital spatial information is its complexity. this is why a user-friendly interface is critical for presenting spatially referenced information.15 the igdc has been a first attempt at creating a user-friendly interface in the form of a map-based data browser allowing the users to access and retrieve geographic datasets about idaho.

in order to track and evaluate the use of geospatial data, webtrends software was installed on the igdc server. the webtrends software produces customized web log statistics and allows tracking information on traffic and datasets dissemination. during a one-year timeframe the number of successful hits was more than twenty-five thousand. almost 40 percent of users came from the .com domain, 35 percent were .net domain users, 15 percent were .org, and 10 percent were .edu users (figure 5). tracking the geographic origin of users by state, the biggest number of users came from virginia, followed by washington, california, ohio, and idaho. the high number of users from virginia can be explained by the linking of the igdc site to one of the most popular geospatial data sites in the country, the united states geological survey (usgs) site. eighty-four percent of user sessions were from the united states; the rest originated from sweden, canada, and germany. the average number of hits per day on weekdays was around one hundred. the most popular retrievable data were digital raster graphics (drg), which present scanned images of usgs standard series topographic maps at 1:24,000 scale. digital elevation models (dem) and digital line graphs (dlg) were less popular. the tiger boundary files for the state of idaho were in small demand. the popularity of drg-format maps and the fact that most of the users accessed igdc via the usgs web site make plausible a speculation that most of the users were non-gis specialists interested in general reference geographic information about idaho including topography and basic land use information.

since the opening of igdc for public use (april 1998), the geolibrary map browser was downloaded 1,352 times. the software proved to be relatively easy to use by the public. out of forty-four bug reports/user questions submitted to igdc, most were concerned with filling out the software registration form and not with software failure. the igdc project spurred an interest in geographic information among students, faculty, and librarians at the university of idaho. in a direct response to this interest, the university of idaho library installed a new dedicated computer at the reference desk with geolibrary software to access, view, and retrieve igdc data.

figure 4. the results panel. results of a query can be sorted; individual items can be removed from the list or can be deselected to prevent them from being downloaded.

conclusion

idaho geospatial data center is the first geospatial digital library for the state of idaho. it does not fulfill all requirements of a geolibrary model proposed by goodchild and others. the igdc has only two components of the geolibrary model; they are the geolibrary map browser and the basemap. the main difference between the geolibrary map browser and a web-based browser solution adopted by other spatial repositories is a client-side solution to geospatial data query and selection. spatial data query is done locally on the user's machine, using the library database schema contained in the geolibrary map browser. this saves time by eliminating client-server communication delays during data searches, gives the user an experience of almost instantaneous response to queries, and reduces the network communication to the data download time. in comparison with the geolibrary model, igdc is missing the gazetteer. this component can help improve the ease of user navigation through a geospatial data collection. the other useful component includes online mapping and spatial data visualization services.
the idea of such services is to provide the user with a simple-to-operate mapping tool for visualizing and exploring the results of user-run queries. one such service, currently under implementation at igdc, includes thematic mapping of economic and demographic variables for idaho using descartes software.16 descartes is a knowledge-based system supporting users in the design and utilization of thematic maps. the knowledge base incorporates domain-independent visualization rules determining which map presentation technique to employ in response to the user selection of variables. an intelligent map generator such as descartes can enhance the utility of a geolibrary by providing tools to transform georeferenced data into information.

figure 5. distribution of igdc users in percent by origin domain (web domain categories: .com, .net, .org, .edu).

references and notes

1. l. covi and r. king, "organizational dimensions of effective digital library use: closed rational and open natural systems models," journal of the american society for information science 47, no. 9 (1996): 697.
2. k. musser, "interactive mapping on the world wide web." (1997) accessed march 6, 2000, www.min.net/-boggan/mapping/thesis.htm.
3. t. bernhardsen, geographic information systems (arendal, norway: viak it and norwegian mapping authority, 1992), 2.
4. ibid., 4.
5. j. stone, "stocking your gis data library," issues in science and technology librarianship (winter 1999). accessed march 6, 2000, www.library.ucsb.edu/istl/99-winter/articlel.html.
6. p. schroeder, "gis in public participation settings." (1997) accessed june 2, 1999, www.spatial.maine.edu/ucgis/testproc/schroeder/ucgisdft.htm.
7. w. j. craig and others, "empowerment, marginalization, and public participation gis," report of a specialist meeting held under the auspices of the varenius project, santa barbara, california, oct. 15-17, 1998, ncgia, uc santa barbara.
8. b. plewe, gis online: information retrieval, mapping, and the internet (santa fe, n.m.: on word pr., 1997), 71-91.
9. m. f. goodchild, "the geolibrary," in innovations in gis 5: selected papers from the fifth national conference on gis research uk (gisruk), ed. s. carver (london: taylor and francis, 1998), 59. accessed march 6, 2000, www.geog.ucsb.edu/-good/geolibrary.html.
10. b. p. buttenfield, "making the case for distributed geolibraries." (1998) accessed march 6, 2000, www.nap.edu/html/geolibraries/app_b.html.
11. ibid.
12. m. rock, "monitoring user navigation through the alexandria digital library," (master's thesis abstract, 1998). accessed march 6, 2000, http://greenwich.colorado.edu/projects/rockm.htm.
13. l. l. hill and others, "geographic names: the implementation of a gazetteer in a georeferenced digital library," d-lib magazine 5, no. 1 (1999). accessed march 6, 2000, www.dlib.org/dlib/january99/hill/0lhill.html.
14. m. gluck and others, "public librarians' views of the public's geospatial information needs," library quarterly 66, no. 4 (1996): 409.
15. b. p. buttenfield, "user evaluation for the alexandria digital library project." (1995) accessed march 6, 2000, http://edfu.lis.uiuc.edu/allerton/95/s2/buttenfield.html.
16. g. andrienko and others, "thematic mapping in the internet: exploring census data with descartes," in proceedings of telegeo '99, first international workshop on telegeoprocessing, lyon, may 6-7, r. laurini, ed. (seiten, france: claude bernard univ. of lyon, 1999), 138-45.

library computerization in the united kingdom

frederick g.
kilgour: director, ohio college library center, columbus, ohio

library automation in the united kingdom has evolved rapidly in the past three years. imaginative, innovative development has produced novel techniques, some of which have yet to be put into practice in the united states. of greatest importance is the growing cadre of highly effective librarians engaged in development.

when the brasenose conference in oxford convened in june 1966, only two operational library computerization projects from the united kingdom were represented: w. r. maidment, britain's pioneer in library computerization, had introduced his bookform catalog at the london borough of camden library in april 1965 (1); and m. v. line and his colleagues at the university library, newcastle-upon-tyne, had introduced an automated acquisitions system just a year later (2). during the three years following the summer of 1966, british librarians moved rapidly into computerization and have made novel contributions which their american colleagues would do well to adopt. in the spring of 1969 there were more than a couple of dozen major applications operating routinely with perhaps another score being actively developed.

the most striking development in the united kingdom is computerization in public libraries, whose librarians are considerably more active than their colleagues in the united states; at least nine public libraries have computerization projects that are operational or under active development, and as already mentioned, it was a public library that led the way.

the sources for this paper are published literature and an all-too-brief visit to the united kingdom in april 1969 to see and hear of those activities not yet reported. the principal literature source is program: news of computers in libraries, now in its third volume. r. t.
kimber, of the queen's university school of library studies at belfast, edits program, which he first published as a gratis newsletter in march 1966. kimber has published the only reviews of library computerization that have contained adequate information on activities in the united kingdom; the first appeared in program (3), and the second, an expansion of the first, is in his recently published book (4). program became an immediate success, and beginning with the first issue of volume 2 in april 1968, it became available on a subscription basis. a year later, aslib assumed its publication, with kimber still as editor, and program will undoubtedly continue to be the major source of published information about library computerization in the united kingdom.

information & library science abstracts, formerly library science abstracts, is the one other major source of published information about british library automation. it abstracts articles appearing in other journals and report literature as well.

most library computerization in the united kingdom has been a genuine advance of technology, in that computerization has introduced new methods of producing existing products or products that had existed in the past, such as bookform catalogs. to be sure, relatively more british libraries than united states libraries have maintained catalogs in bookform, but the pioneer project at camden produced a bookform catalog to take the place of card catalogs. the time has come, however, when it is fruitful to think of products with new characteristics or of entirely new products unknown to libraries heretofore. british librarians have already begun to think in these terms. one example (and others will be reported later in this paper) is the pioneering w. r. maidment, who feels that the problem of application of computers to produce existing products has been solved intellectually.
maidment is giving serious thought to development of management information techniques, and automatic collection of data to be used by librarians, sociologists and others as a data base for research. such research could produce many findings, including knowledge of effectiveness of formal education programs as revealed by subsequent public library usage, as well as better understanding of the social dynamics of public libraries within their communities.

journal of library automation vol. 2/3 september, 1969

catalogs

although users searching by subject may find more material by going directly to classified shelves than by any other subject access (5,6), the library catalog is nevertheless a major and indispensable tool for making books available to users. taken together, descriptive cataloging, subject indexing, and subject classification constitute the bridge over which the user must travel to obtain books from a library. in libraries that are user-oriented it can be expected that the greatest gain will be achieved by computerization of the cataloging process. moreover, acquisitions activities, as well as circulation procedures, are essentially based on, and must be interlocked with, cataloging products.

it is, therefore, of much interest that the first routine british computerization was of the catalog at the camden borough library (1). impetus for this event occurred several years earlier when the london metropolitan boroughs of hampstead, holborn and st. pancras were combined to become the borough of camden. the problem thereby generated was how to combine catalogs of three public library systems so that users of the new system could take advantage of the increased number of books available to them. maidment decided to cope first with the future, and introduced a bookform union catalog in 1965 listing new acquisitions in all camden libraries and giving their locations.
of course, users have consulted both the bookform catalog and older card catalogs, but with the passage of each year, the card catalogs become less useful.

h. k. gordon bearman, who directs the west sussex county library from its lovely new headquarters building in the charming little city of chichester, is another imaginative pioneering public librarian. bearman has keenly evaluated potential contribution of computerization to public libraries, and has amusingly assessed the opposition of some to such advances (7). the west sussex county library possesses more than a score of branches, for which bearman has introduced a computerized bookform union catalog (8). in april 1969 this computerized catalog contained nearly 23,000 entries.

the library at the university of essex produces computerized accession lists, departmental catalogs, and special listings for its science books (9). at least four libraries are putting out computerized alphabetical subject indexes to their classification schemes or to their classified catalogs: the library of the atomic weapons research establishment (awre) at aldermaston; the city university library (10), london, formerly the northampton technical college; the loughborough university of technology (11); and the dorset county library (12), which may be the first library in the united kingdom to use a computer, for it issued a computerized catalog of sets of plays in 1964.

one of the most exciting cataloging computerization projects in the united kingdom is the british national bibliography marc project under the extraordinarily skillful leadership of r. e. coward (13,14). the bnb marc record is entirely compatible with marc ii, and coward has introduced worthwhile improvements to it. for example, he uses indicator positions to record the number of initial characters to be omitted when an entry possessing an initial article is to be sorted alphabetically.
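the effect of recording a count of initial characters to omit can be shown with a small python sketch. it does not reproduce the bnb marc record layout: each entry is simply paired with a skip count by hand, and the `filing_key` function and the sample titles are assumptions made for illustration.

```python
def filing_key(title: str, skip: int) -> str:
    """Return the sort key for an entry: the title with its first `skip`
    characters (an initial article plus the following space) left out."""
    return title[skip:].casefold()


# Each entry carries the number of initial characters to ignore when filing.
entries = [
    ("the origin of species", 4),  # skip "the "
    ("a tale of two cities", 2),   # skip "a "
    ("middlemarch", 0),            # no initial article
    ("an essay on man", 3),        # skip "an "
]

# Sort on the filing key, but keep the full title for display.
sorted_titles = [
    title for title, skip in sorted(entries, key=lambda e: filing_key(*e))
]

for title in sorted_titles:
    print(title)
```

because the count is stored with the record rather than derived from a list of articles at sort time, the sorting program needs no language-specific knowledge, and a title that merely begins with a word resembling an article can be given a count of zero.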
in april 1966, the british national bibliography was using its marc records in its process for production of cards for sale to british libraries. bnb intends to use the same records for production of the british national bibliography. in addition, bnb is fostering a pilot project, quite like the marc pilot project, among a score of british libraries. f. h. ayres (15) has published perceptive suggestions for use of bnb marc tapes for selections, acquisitions, and cataloging. although coward was able to take full advantage of work done at the library of congress, it is enormously to his credit that he did take that advantage, and that he has moved so far ahead so rapidly. since british book production somewhat exceeds american, coward has doubled the size of the pool of machine readable cataloging records available at the present time.

the bodleian library at oxford will be an important early user of bnb marc tapes. robert shackleton, who became bodley's librarian shortly before the brasenose conference, has worked wonders at that ancient and honorable institution, and his principal wonder is peter brown, who became keeper of the catalogues late in 1966. brown is one of the few members of classical librarianship who has trained himself in depth in the programming and operation of computers. oxford possesses no fewer than 129 separate libraries acquiring current imprints and has no instrument that remotely resembles a union catalog. hence, each user must guess which library out of 129 is most likely to have the book he wishes to use, a guessing game of which oxonians notoriously tire. brown has developed a system for bookform catalog production which will place a union catalog of oxford's holdings in each of its libraries.

conversion

the bodleian is also the scene of the most ambitious of retrospective conversion projects.
involving 1,250,000 entries, it is by far the largest conversion project in either the united states or the united kingdom. entries being converted constitute the bodley's so-called "1920 catalogue," which includes the bodley's holdings for imprints of 1920 and earlier. for some years the manuscript bookform slip catalog that houses these entries has been in advancing stages of deterioration, and indeed since 1930, entries have been revised in anticipation of printing the catalog. to reprint the catalog would require keyboarding the entries to prepare manuscript copy for the printer, who in turn would keyboard the entries again in setting type. there would be only one product from this process, namely a printed catalog. bodleian officials wisely decided to do a single keyboarding that would convert the entries to machine readable form from which a multiplicity of products could be had, including a printed catalog.

brown has worked out details of schedules and procedures whereby conversion will take place during the next five years. a contractor employing optical character recognition techniques performs actual conversion, but the contractor does not edit, code, or proofread the entries, although he is responsible for accurate conversion. brown has skillfully developed techniques to diminish the number of keystrokes required in conversion, and what with labor costs being lower in the united kingdom than in the united states, the contractual cost of 4.17 pence per record is certainly low enough to attract work from outside the united kingdom. the most significant part of this operation is, however, the identification by computer program of the individual elements of information in the text.
this puts into practice the concepts of john jolliffe of the british museum on the conversion of catalog data (16); it was jolliffe who programmed oxford's kdf 9 computer to convert the text coming on tapes from the contractor to true machine records that are compatible with the marc ii format. despite the fact that these entries contain no subject heading tracings, they will constitute the first major source of retrospective machine readable cataloging records.

the west sussex county library in chichester and the university library at newcastle-on-tyne have already converted their catalogs to machine readable form, the former having done somewhat less, and the latter somewhat more, than 200,000 entries. at chichester, former library employees did the job on a piece-work basis; at newcastle the computer laboratory employed a special group (17).

the large number of records produced by these conversion projects forces urgent consideration of files designed to house huge numbers of entries. approaches to solutions of this problem have begun at the level of individual records or of file design as a whole. nigel s. m. cox, at newcastle, one of britain's most widely known library computer people and co-author of the best-selling the computer and the library (it has been translated even into japanese), has developed a generalized file-handling system (18) based on individual records. cox has demonstrated that his system is hospitable to demographic records as well as bibliographic records. his file handling will surely play a role in future library computerization.

circulation

britain's first computerized circulation system went into operation in october 1966 (19) at the university of southampton. books contain eighty-column punched book cards which are passed through a friden collectadata together with a machine readable borrower's identification card. punched paper tape is produced that is input to the computer system.
the principal output is a nightly listing of charges having records in abbreviated form, with a print-out of the complete records being produced once a week. the southampton circulation system works well, and obviously the staff finds it easy to use. borrowers also enjoy the system; when the collectadata is down, as it occasionally is, circulation volume also goes down, for borrowers avoid filling out charge cards manually.

f. h. ayres and his colleagues at the awre library at aldermaston are a productive group in research and development. aldermaston has a partially computerized circulation system wherein the computer segment of the system maintains control features of the circulation record file, but the master record is maintained manually (20).

it is understood that the library of the atomic energy research establishment at harwell is developing an on-line circulation system, but the west sussex county library is the only british library to have on-line access to a circulation file (8,21). the circulation system in chichester is both experimental and operational. the punch-paper-tape reading devices at the circulation desk and in the discharge room were specially designed by elliott computers for experimental application for library purposes. however, it appears that the experimental period is ending, and that a new production model is about to be marketed by automated library systems ltd. the experimental equipment at chichester was to be replaced during summer, 1969, and six further installations introduced at the major regional branch libraries during the next two years.

the on-line circulation records are housed on an ibm 2321 data cell in the computer centre in an adjacent county council building. there is an ibm 2740 terminal in the library from which special inquiries are put to the file.
for example, overdue notices are sent out by computer using the same records to which inquiries can be made, but there are sometimes lag periods, particularly over weekends, so that an overdue notice may be sent on a book already returned. when the borrower reports that he has already returned the book, the file is queried from the terminal. processing of these special and time-consuming tasks is thereby greatly facilitated. on-line circulation files are a rarity, and the west sussex county library and the county computer centre are to be congratulated on their achievement.

acquisitions

the already mentioned acquisition system at newcastle (2,22) has been in continuous and successful operation for over three years. although the system does not handle large numbers of orders, there being only slightly more than a thousand active orders and four thousand inactive orders in the file at any one time, there is no reason to think that it could not cope with a larger volume. output from the computer consists of purchase orders, the order file, claim notices, a fund commitment register, and an occasional list of orders by dealers.

the city university library has computerized its book fund accounting (23), its general library accounts, and its inventory of over 350 categories of furniture and equipment (24). the last procedure is unique.

amcos (aldermaston mechanized cataloging and order system) appears to be the british pioneer integrated acquisitions and cataloging system (25). the ibm 870 document writing system originally used for output became overburdened after it had produced the second bookform title catalog with classed subject and author indexes. a title listing is employed in the main catalog because the aldermaston group found in a separate study (26) that users as they approached the catalog possessed less than seventy-five percent accurate author information, while their information about titles was over ninety percent correct.
serials

the university of liverpool library (27) and the city university library (28) produce periodicals holding lists by computer. at liverpool the list is restricted to scientific serials but contains 7,600 entries of holdings in 28 libraries, not all of which are university libraries. with each entry are holding information, the name of the library or libraries possessing the title, and the call number in each library. similarly, the city university library list contains holdings information and frequency of appearance for each title. the computer program at city university also puts out a list of titles for which issues may be expected during the coming month as well as of all titles having irregular issues. however, this procedure for checking in issues did not prove to be wholly satisfactory and is not currently in use.

the library of the atomic energy research establishment at harwell also puts out a union holdings list for the several sections of the library (29). in addition, the harwell programs, which run on an ibm 360/65 and are written in fortran iv, produce for review annual lists of current subscriptions taken by each library; they also produce annual lists of periodicals by subscription agencies supplying the periodicals.

dews (30,31) has described computer production of the union list of periodicals in institute of education libraries. this union list first appeared about 1950, was republished annually, then biennially, as the magnitude of effort to revise it increased. both the manipulation and typesetting programs employ the newcastle file handling system.

assessment

the most gratifying development in library computerization in the united kingdom during the last three years has been the rapid expansion of numbers of individuals who have made themselves competent in the field.
among the british participants at the brasenose conference were barely a half-dozen who had had first-hand experience in library computerization. the group has increased considerably more than tenfold and has brought the quality of british library computerization to a level surpassed by none. continuing advances depend on the calibre of those advancing; the competence of the present cadre assures exciting future developments.

perhaps the most distinguishing characteristic of library computerization in the united kingdom as compared with that in north america is the relatively larger role played by public libraries. indeed, it was the public libraries at dorset and camden that first used computers. american public librarians would do well to follow the lead of their british confreres. in general, americans can learn from british imagination and accomplishment, can learn of exquisite refinements and major achievements.

british librarians, particularly of large british libraries, have not been a notoriously chummy group. it is, therefore, interesting to observe computerization bringing them together. the new style in solving problems made possible by the computer has suddenly made it clear that libraries heretofore deemed to have nothing in common now seem surprisingly alike. for example, bookform union catalogs at the camden and west sussex public libraries and at the oxford libraries can now be seen to be essentially the same solution to the same problem.

although library computerization in the united kingdom is but half the age of that in the united states, the quality if not the quantity of british research, development, and operation has rapidly pulled abreast of, and in some areas surpassed, american activities.

references

1. maidment, w. r.: "the computer catalogue in camden," library world, 07 (aug. 1965), 40.
2. line, m.
v.: "automation of acquisition records and routine in the university library, newcastle upon tyne," program, 1 (june 1966), 1-4.
3. kimber, r. t.: "computer applications in the fields of library housekeeping and information processing," program, 1 (july 1967), 5-25.
4. kimber, r. t.: automation in libraries (oxford, pergamon press, 1968), pp. 118-133.
5. bundy, mary lee: "metropolitan public library use," wilson library bulletin (may 1967), 950-961.
6. raisig, l. miles; smith, meredith; cuff, renata; kilgour, frederick g.: "how biomedical investigators use library books," bulletin of the medical library association, 54 (april 1966), 104-107.
7. bearman, h. k. gordon: "automation and librarianship-the computer era," proceedings of the public libraries conference brighton, 1968, pp. 50-54.
8. bearman, h. k. gordon: "library computerisation in west sussex," program, 2 (july 1968), 53-58.
9. sommerlad, m. j.: "development of a machine-readable catalogue at the university of essex," program, 1 (oct. 1967), 1-3.
10. cowburn, l. m.; enright, b. j.: "computerized u. d. c. subject index in the city university library," program, 1 (jan. 1968), 1-5.
11. evans, a. j.; wall, r. a.: "library mechanization projects at loughborough university of technology," program, 1 (july 1967), 1-4.
12. carter, kenneth: "dorset county library: computers and cataloguing," program, 2 (july 1968), 59-67.
13. bnb marc documentation service publications, nos. 1 and 2 (london, council of the british national bibliography, ltd., 1968).
14. coward, r. e.: "the united kingdom marc record service," in cox, nigel s. m.; grose, michael w.: organization and handling of bibliographic records by computer (hamden, conn., archon books, 1967), pp. 105-115.
15. ayres, f. h.: "making the most of marc; its use for selection, acquisitions and cataloguing," program, 3 (april 1969), 30-37.
16. jolliffe, j.
w.: "the tactics of converting a catalogue to machine-readable form," journal of documentation, 24 (sept. 1968), 149-158.
17. university of newcastle upon tyne: catalogue computerisation project (september, 1968).
18. cox, nigel s. m.; dews, j. d.: "the newcastle file handling system," in op. cit. (note 13), pp. 1-20.
19. woods, r. g.: "use of an ict 1907 computer in southampton university library, report no. 3," program, 2 (april 1968), 30-33.
20. ayres, f. h.; cayless, c. f.; german, janice a.: "some applications of mechanization in a large special library," journal of documentation, 23 (march 1967), 34-44.
21. kimber, r. t.: "an operational computerised circulation system with on-line interrogation capability," program, 2 (oct. 1968), 75-80.
22. grose, m. w.; jones, b.: "the newcastle university library order system," in op. cit. (note 13), pp. 158-167.
23. stevenson, c. l.; cooper, j. a.: "a computerised accounts system at the city university library," program, 2 (april 1968), 15-29.
24. enright, b. j.; cooper, j. a.: "the housekeeping of housekeeping; a library furniture and equipment inventory program," program, 2 (jan. 1969), 125-134.
25. ayres, f. h.; german, janice; loukes, n.; searle, r. h.: amcos (aldermaston mechanised cataloguing and ordering system). part 1, planning for the ibm 870 system; part 2, stage one operational. nos. 67/11, 68/10, aug. 1967, nov. 1968.
26. ayres, f. h.; german, janice; loukes, n.; searle, r. h.: "author versus title: a comparative survey of the accuracy of the information which the user brings to the library catalogue," journal of documentation, 24 (dec. 1968), 266-272.
27. cheeseman, f.: "university of liverpool finding list of scientific medical and technical periodicals," program, 1 (april 1967), 1-4.
28. enright, b. j.: "an experimental periodicals checking list," program, 1 (oct. 1967), 4-11.
29. bishop, s. m.: "periodical records on punched cards at aere library, harwell," program, 3 (april 1969), 11-18.
30. dews, j. d.: "the union list of periodicals in institute of education libraries," in op. cit. (note 13), pp. 22-29.
31. dews, j. d.; smethurst, j. m.: the institute of education union list of periodicals processing system (newcastle upon tyne, oriel press, 1969).

article

exploring final project trends utilizing nuclear knowledge taxonomy: an approach using text mining

faizhal arif santosa

information technology and libraries | march 2023
https://doi.org/10.6017/ital.v42i1.15603

faizhal arif santosa (faizhalarif@gmail.com) is academic librarian, polytechnic institute of nuclear technology, national research and innovation agency. © 2022.

abstract

the taxonomy of the national nuclear energy agency of indonesia (batan) organizes nuclear competence fields into six categories. the polytechnic institute of nuclear technology, as an institution of nuclear education, faces a challenge in organizing student publications according to the fields in the batan taxonomy, especially in the library. the goal of this research is to determine the most efficient automatic document classification model using text mining to categorize student final project documents in indonesian and to monitor the development of the nuclear field in each category. the knn algorithm is used to classify documents, and the best model is identified by comparing cosine similarity, correlation similarity, and dice similarity, each combined with two vector creation methods, binary term occurrence and tf-idf. a total of 99 documents labeled as reference data were obtained from the batan repository, and 563 unlabeled final project documents were prepared for prediction. several text mining techniques were applied, namely stemming, stop word filtering, n-grams, and filtering by token length. the best model was cosine-binary with k = 4, which reached an accuracy of 97 percent; knn works better with binary term occurrence than with tf-idf on indonesian language documents.
engineering of nuclear devices and facilities is the most popular field among students, while management is the least preferred. however, isotopes and radiation is the most prominent field in nuclear technochemistry. text mining can assist librarians in grouping documents based on specific criteria. it also makes it possible to observe the evolution of each existing category based on the increase in documents, and similar methods can be applied in other circumstances. because of the curriculum and courses given, the growth of each discipline of nuclear science differs from one study program to another.

introduction

the national nuclear energy agency of indonesia (batan), now known as the research organization for nuclear energy (ortn) of the national research and innovation agency (brin), in 2018 issued a decision regarding batan's six competencies: isotopes and radiation (ir), nuclear fuel cycle and advanced materials (nfcam), engineering of nuclear devices and facilities (endf), nuclear reactor (nr), nuclear and radiation safety and security (nrss), and management (mgt). these areas of focus are also known as batan's knowledge taxonomy, which is used to support nuclear knowledge management (nkm) and the grouping of explicit knowledge in repositories.1

the polytechnic institute of nuclear technology (pint), which was under the auspices of batan and is now in one of the directorates of brin, can also utilize batan's knowledge taxonomy to classify students' final assignments. every year the pint library accepts final assignments from students who have graduated from three study programs, namely nuclear technochemistry, electronics instrumentation, and electromechanics.
over the past six years (2017 to 2022), 563 final assignments in indonesian were collected and needed to be classified into batan's knowledge taxonomy in order to see the document growth of each existing competency. however, it is quite time consuming for librarians to assign individual documents to the most appropriate taxonomy term. experts can be involved to determine the right group, but this increases the working time needed to complete each document. the obstacle arises because librarians do not have in-depth and detailed knowledge of the nuclear field, so grouping errors are likely to occur.

in this study, the author classified the collection of final project documents owned by the pint library based on batan's knowledge taxonomy. the author used text mining tools, choosing the k-nearest neighbors (knn) algorithm for this study. similar research has also focused on automatic document classification of certain subjects,2 which in this case is the subject of nuclear engineering. the hope is that users will find it easier to explore knowledge according to their area of interest through taxonomy grouping based on explicit knowledge,3 in this case, pint students' final project documents. finding the trend of research conducted by students on each subject is also one of the goals of this research.

literature review

text mining in libraries

the increasing number of publications makes it a challenge to classify documents and to track the growth and trends of a topic. document classification is quite time consuming, so automating it with text mining is very necessary.4 the application and utilization of text mining itself is very broad. several studies have demonstrated the usefulness of text mining in libraries. pong et al.
from city university of hong kong conducted research to facilitate the classification process using machine learning.5 their study aimed to streamline document categorization through a system called the web-based automatic document classification system (wadcs). it claimed to be the first comprehensive study of automatic document classification against a widely used scheme, the library of congress classification (lcc), utilizing knn and naive bayes (nb). the research indicates that the machine-learning algorithms used can be applied by libraries for document classification.

wagstaff and liu utilized text mining to perform automatic classification to help make decisions in selecting candidate documents for weeding.6 this study used data from wesleyan university from 2011 to 2014 to predict which documents were eligible for weeding and which would be kept. five classifier models, namely knn, naive bayes, decision tree, random forest, and support vector machines (svm), were used to compare their performance. while this process may not replace librarians, it can help librarians make better decisions and reduce their workload significantly.

lamba and madhusudhan applied text mining to extract important topics published in the desidoc journal of library and information technology over a period of 38 years.7 the latent dirichlet allocation (lda) method used in this study is able to find topics within a collection of documents so that one can see how these topics develop over time. because lda is an algorithm for finding topics from groups of words that appear together, the authors suggest that this study be expanded by utilizing articles that have been labeled using supervised classification.
knn classifier

various studies have tried to find the most appropriate method of grouping a collection of documents. the knn and svm algorithms were used as comparative methods in the document classification study.8 however, there is no definite standard for the methods used in text mining.9 choosing the right technique in each phase of document classification can improve the performance of the text classifier, so experts generally make adjustments to existing methods to get better results.10

kim and choi compared knn, maximum entropy model (mem), and svm to classify japanese patent documents by focusing on the structure of patents.11 instead of comparing the entire text, specific components named semantic elements, such as purpose, background, and application fields, are compared with those of the training documents. these semantically grouped components are the basis for patent categorization. in addition, the strategy uses cross-references between two semantic fields, which help to determine the intentions of patent writers when these are uncertain or hidden. this strategy works better with knn than with mem or svm, and svm does not perform well on large data sets. however, research conducted by alhaj et al. on arabic documents showed that svm can outperform knn when a stemming strategy is implemented.12 meanwhile, through the approach to the relationship between unstructured text documents, the study conducted by mona et al.
was able to increase the performance of knn combined with tf-idf by 5 percent.13

the knn algorithm is one of the popular classifiers; it categorizes new data based on its similarity to the data around it, where the number of neighbors considered is determined by the specified “k” value.14 this method is believed to be able to group documents effectively because it is not limited by vector size.15 wagstaff and liu noted that one of the weaknesses of knn is the long processing time when faced with large datasets, but knn as a classifier is easy to apply.16 in terms of measurement, previous experiments showed that knn was not suitable when used with euclidean distance.17 generally, similarity measures such as cosine, jaccard, and dice are used in the knn classifier.18

one of the problems in text classification is the number of attributes or dimensions: many irrelevant attributes in the data set keep the classifier from performing optimally.19 for this reason, a technique is needed to increase effectiveness and reduce overly large dimensions through the selection or weighting of features or terms.20 examples are within-document tf, the popular tf-idf weighting (which measures how important a word is in a corpus),21 and binary representation, which records the absence or presence of a concept in a document22 as 0 or 1.23

aims of the study

university libraries have a vital role in managing internal publications to support the education ecosystem. in connection with the role of the pint in supporting nkm and nuclear development, it is necessary to apply technology to help suggest the appropriate class for a document. in addition, in order to see scientific developments, experts generally conduct bibliometric studies, which are limited to the title and abstract fields.
text mining provides an opportunity to dig deeper. instead of just the title and abstract, this study used the full text of the final project collection. the trend of a subject will be seen from the growth and percentage of existing documents. so, the objectives of this study are to

• explore the best knn model to be applied to classify the final project;
• know the development of nuclear subjects based on batan's knowledge taxonomy; and
• know the development of nuclear subjects from each study program at the pint.

methods

a total of 99 documents were taken from the batan repository and manually labeled as reference data. this study was conducted using rapidminer studio software. the first document processing step converts all words into lower case and divides the text into a collection of tokens. tokens are then filtered by length; in this case, the author applied a minimum of 3 characters and a maximum of 25 characters. stop word filtering was also applied to eliminate common words (e.g., “and,” “the,” and “are”), thereby reducing the vector size. both english and indonesian stop words were used, because the abstract sections are in english while the documents themselves are in indonesian. the collection of words from haryalesmana was chosen as the stop word list for indonesian.24

the stemming technique is applied to reduce dimensions, which is useful for improving the function of the classification system,25 by reducing word forms to their base word,26 e.g., water, waters, watered, and watering into water. this analysis applies wicaksana's data for indonesian stemming.27 some words cannot be separated from other words because together they form a meaning, e.g., nondestructive testing, biological radiation effects, structural chemical analysis, and water-cooled reactors.
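the preprocessing steps described so far (lower-casing, tokenizing, filtering by length, removing stop words, and stemming) can be sketched in a few lines of python. this is a minimal illustration and not the rapidminer process used in the study: the stop word set and the stem dictionary below are tiny stand-ins assumed for the example, whereas the study used full english and indonesian stop word lists and an indonesian stem dictionary. splitting on runs of the letters a to z is likewise an assumption.

```python
import re

# tiny stand-in word lists, assumed for illustration only
STOP_WORDS = {"and", "the", "are", "yang", "dan", "untuk"}
STEM_DICT = {"waters": "water", "watered": "water", "watering": "water"}

MIN_LEN, MAX_LEN = 3, 25  # token length limits reported in the text


def preprocess(text):
    """lower-case, tokenize, filter by length, drop stop words, then stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    tokens = [t for t in tokens if MIN_LEN <= len(t) <= MAX_LEN]
    tokens = [t for t in tokens if t not in STOP_WORDS]
    # dictionary lookup stands in for a real stemmer
    return [STEM_DICT.get(t, t) for t in tokens]


tokens = preprocess("The waters are watered and the reactor is watering")
print(tokens)
```

in this example the two-letter word is removed by the length filter, the stop words are dropped, and the three inflected forms collapse to a single term, so four tokens remain out of nine.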
to keep such compound terms intact, n-grams can be used to identify multi-word expressions that carry a meaning of their own, so that they are not broken into unrelated single words.28 an n-gram records a sequence of “n” consecutive words.29 in this study, the n-gram length was set to three words.

figure 1. nuclear taxonomy classification framework.

vector creation in this study used tf-idf and binary term occurrence, which were then compared to determine the best performance. in the knn method, the value of “k” must be set manually, so values from 2 to 10 were tested with the weighted vote option activated, which weighs the contributions of the surrounding neighbors. weighted voting assigns a weight to each neighbor depending on its distance from the unknown item.30 the measure type was set to numerical measures, and cosine similarity, correlation similarity, and dice similarity were tested. to measure performance, the author used cross validation with 10 folds. using this set of procedures, documents from the batan repository were classified, and the procedure that achieved the highest accuracy was adopted as the model. this model was applied to the 563 unlabeled final project documents so that each document received a label according to batan's knowledge taxonomy.

results

the experiment was carried out 54 times to determine the best knn performance among the proposed approaches, namely cosine-binary, correlation-binary, dice-binary, cosine–tf-idf, correlation–tf-idf, and dice–tf-idf, utilizing cross validation. with tf-idf vectors, cosine was the most accurate, reaching 81.89 percent with seven neighbors, while dice reached its lowest point with four neighbors.
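the classification procedure set out in the methods (n-grams, binary term occurrence vectors, cosine similarity, and a weighted vote among k neighbors) can be sketched as follows. this is an illustrative python reimplementation under stated assumptions, not the rapidminer operators used in the study: terms are joined with an underscore, n-grams of one to three words are generated, each neighbor's vote is weighted by its similarity, and the confidence is taken as the winning class's share of the total vote. the toy documents and labels are made up for the example, and cross validation is not shown.

```python
import math
from collections import defaultdict


def binary_vector(tokens, max_n=3):
    """binary term occurrence: the set of all 1- to max_n-grams present."""
    terms = set()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            terms.add("_".join(tokens[i:i + n]))
    return terms


def cosine(a, b):
    """cosine similarity of two binary vectors stored as sets of terms."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def knn_predict(query, labeled, k=4):
    """similarity-weighted vote among the k most similar labeled documents."""
    nearest = sorted(labeled, key=lambda d: cosine(query, d[0]), reverse=True)[:k]
    votes = defaultdict(float)
    for vector, label in nearest:
        votes[label] += cosine(query, vector)
    total = sum(votes.values())
    best = max(votes, key=votes.get)
    confidence = votes[best] / total if total else 0.0
    return best, confidence


# toy reference data (hypothetical token lists with taxonomy labels)
labeled = [
    (binary_vector(["reactor", "core", "coolant"]), "nr"),
    (binary_vector(["reactor", "fuel", "core"]), "nr"),
    (binary_vector(["radiation", "dose", "safety"]), "nrss"),
    (binary_vector(["isotope", "tracer", "radiation"]), "ir"),
]

query = binary_vector(["reactor", "core", "design"])
label, confidence = knn_predict(query, labeled, k=4)
print(label, confidence)
```

with binary vectors held as sets, the dot product is simply the number of shared terms, which keeps the similarity computation short. confidence values close to 0.5 arise in this scheme when two classes receive almost equal weighted votes.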
in contrast to correlation and dice, cosine also performed well with binary vectors. cosine on four neighbors had the best performance, with a 97 percent accuracy rate. the lowest accuracy occurred when the number of selected neighbors was two, and accuracy declined across all numerical measures beyond nine neighbors. the cosine-binary method with four neighbors was therefore selected as the classification model for unlabeled documents. the experiment found that this method misclassified three documents (for details of the confusion matrix, see appendix a). document 7 should have been assigned to nfcam, which received a confidence value of 0.49921, but it was predicted as nrss with a confidence value of 0.50079. documents 86 and 93, which should have been assigned to endf, were also misclassified: document 93 was predicted as nrss with a confidence value of 0.50126, and document 86 was predicted as nr with a value of 0.49936.

figure 2. a comparison of the accuracy levels in the knn method.

this study used 563 unlabeled documents spanning six years. there were 34 fewer documents in 2021 than in 2020, a significant drop (see table 1). the number of documents then climbed again in 2022, reaching 98. rapidminer’s labeling process ran into memory problems at the process documents stage. because memory was not sufficient to process all documents at once, the documents were split into three runs (2017–2018, 2019–2020, and 2021–2022). the results were then exported as tabular data for further study. the yearly evolution of each nuclear subject in the final project reports is shown in fig. 3.
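the per-class precision and recall reported in appendix a follow directly from the confusion matrix counts, and the three misclassified documents account for every off-diagonal entry. the short sketch below recomputes those figures from the counts given in the appendix; the variable and function names are chosen for this example only.

```python
LABELS = ["nfcam", "ir", "nrss", "mgt", "nr", "endf"]

# rows are predicted classes, columns are true classes, as in appendix a
MATRIX = [
    [13, 0, 0, 0, 0, 0],   # pred. nfcam
    [0, 18, 0, 0, 0, 0],   # pred. ir
    [1, 0, 20, 0, 0, 1],   # pred. nrss
    [0, 0, 0, 19, 0, 0],   # pred. mgt
    [0, 0, 0, 0, 13, 1],   # pred. nr
    [0, 0, 0, 0, 0, 13],   # pred. endf
]


def precision(i):
    """correct predictions for class i divided by all predictions of class i."""
    return MATRIX[i][i] / sum(MATRIX[i])


def recall(j):
    """correct predictions for class j divided by all true members of class j."""
    return MATRIX[j][j] / sum(row[j] for row in MATRIX)


def micro_accuracy():
    """all correct predictions divided by all documents."""
    correct = sum(MATRIX[i][i] for i in range(len(LABELS)))
    total = sum(sum(row) for row in MATRIX)
    return correct / total


for idx, name in enumerate(LABELS):
    print(f"{name}: precision {precision(idx):.2%}, recall {recall(idx):.2%}")
print(f"micro average: {micro_accuracy():.2%}")
```

the micro average is 96 correct out of 99 documents; the 97.00 percent headline figure in the appendix is the accuracy averaged over the ten folds, which is why the two numbers differ slightly.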
during the test period, 282 documents (50.09%) of the total were predicted as endf, followed by ir with 95 documents (16.87%) and nfcam with 69 documents (12.26%). nr and nrss differed very little: nr had 47 documents (8.35%) while nrss had 45 documents (7.99%). mgt was the subject with the fewest documents, with a total of 25 (4.44%) from 2017 to 2022.

table 1. the pint’s final project documents growth from 2017 to 2022

study program                2017  2018  2019  2020  2021  2022  grand total
electromechanics               35    34    43    35    24    41          212
electronics instrumentation    27    34    38    38    22    28          187
nuclear technochemistry        31    31    26    27    20    29          164
grand total                    93    99   107   100    66    98          563

see appendix b for more information on the confidence value of each predicted document. of the 212 final project reports in the electromechanics study program, 63.68 percent (135 documents) were predicted as endf, followed by nfcam with 17.92 percent (38 documents), nrss with 8.96 percent (19 documents), and nr with 5.19 percent (11 documents). ir and mgt had the fewest documents predicted, with 2.83 percent (6 documents) and 1.42 percent (3 documents), respectively. every year, endf was the most predicted subject in this study program (see fig. 4).

figure 3. nuclear subject development by percentage each year.

figure 4. nuclear subject development in electromechanics by % each year.

the final project reports in electronics instrumentation, which comprised 187 documents, were predicted into five subjects. endf was predicted for 141 documents (75.40%), nrss for 24 documents (12.83%), and nr for 14 documents (7.49%).
furthermore, only 7 documents (3.74%) were predicted as mgt and 1 document (0.53%) as ir. nfcam, on the other hand, was not predicted for any of the electronics instrumentation documents (see fig. 5). final processing was performed on the collection of nuclear technochemistry documents. of its 164 documents, 53.66 percent (88 documents) were predicted as ir, 18.90 percent (31 documents) as nfcam, 13.41 percent (22 documents) as nr, 9.15 percent (15 documents) as mgt, 3.66 percent (6 documents) as endf, and the remaining 1.22 percent (2 documents) as nrss. the most popular subject varied from year to year (see fig. 6), unlike in electromechanics and electronics instrumentation, where endf was the most popular subject.

figure 5. nuclear subject development in electronics instrumentation by % each year.

figure 6. nuclear subject development in nuclear technochemistry by % each year.

discussion

the study found that implementing knn with cosine similarity, binary term occurrence vectors, and k=4 produced the highest accuracy, 97 percent. in general, this strategy outperformed the others at every number of neighbors examined; it was matched on only one occasion, at k=9 by correlation similarity. when compared with tf-idf, the results likewise indicated that binary term occurrence consistently performed well. tf-idf was only able to achieve its highest accuracy of 81.89 percent when k was 7 using correlation similarity.
cosine similarity also appeared to work well with both vector types, binary and tf-idf (although with 2, 5, and 10 neighbors tf-idf was not optimal), compared with correlation similarity and dice similarity. cosine similarity evaluates the similarity of documents, and a high similarity score indicates that the documents are quite similar.31

nuclear field growth

in general, aside from the endf field, which is steady and increasing, the other subjects fluctuate from year to year. for the past six years, endf has been the most popular subject among students. endf reached its highest percentage in 2022, with 59 documents predicted on this subject. students preferred engineering final projects on mechanics and structures, electromechanics, control systems, nuclear instrumentation, or nuclear facility process technology. research by wang et al. also suggests that modeling and simulation is currently a popular research topic in nuclear power.32

the average confidence value of the endf documents was 0.6499916, with a median of 0.7490455. the two documents with the lowest confidence in endf were documents 233 and 597. document 233 had a confidence value of 0.25105 and was also predicted in three other subject areas (nrss, nr, mgt) with close values. likewise, document 597 was predicted as endf with a confidence value of 0.25156, which was higher than its values for nrss, nfcam, and ir, but not by a significant margin. both documents can be investigated further and evaluated directly by a librarian in order to assign a more precise field. the majority of the final project reports predicted as endf have confidence values around 0.50, and some even higher at 0.75.
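one way a librarian might pick out documents such as 233 and 597 for manual review is to flag any prediction whose top confidence is weak or barely ahead of the runner-up. the sketch below is a hypothetical helper, not part of the study: both thresholds are assumptions, and apart from the 0.25105 value quoted above for document 233, the per-subject values in the first example are invented to show “close values.”

```python
def needs_review(confidences, min_top=0.5, min_margin=0.05):
    """return True when the top label is weak or barely ahead of the runner-up.

    confidences maps subject -> confidence value for one document.
    min_top and min_margin are assumed thresholds for illustration.
    """
    ranked = sorted(confidences.values(), reverse=True)
    top = ranked[0]
    runner_up = ranked[1] if len(ranked) > 1 else 0.0
    return top < min_top or (top - runner_up) < min_margin


# endf value from the text; the other three values are hypothetical
doc_233 = {"endf": 0.25105, "nrss": 0.25000, "nr": 0.24950, "mgt": 0.24945}

# a document predicted with full confidence in a single subject
doc_certain = {"endf": 1.0}

# document 7 from the labeled set, using the two values reported in the results
doc_7 = {"nrss": 0.50079, "nfcam": 0.49921}

flags = [needs_review(d) for d in (doc_233, doc_certain, doc_7)]
```

a margin test catches cases like document 7, where the winning subject clears 0.5 but leads the runner-up by less than 0.002.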
this study also reveals that 11 documents in the endf category have a confidence value of 1. a total of 239 endf documents were also linked to the nrss field with lower confidence values. this relationship suggests that students’ nuclear engineering work tends to be related to the nrss discipline. though it differs significantly from endf, ir is becoming a prominent field. the number of ir final project reports grew in 2017–2018, shrank from 2019 to 2021, and then increased again in 2022. in comparison to other fields, ir has the highest minimum confidence score, 0.4987, with many documents lying in the 0.5 to 0.75 range. meanwhile, 26 documents predicted as ir have a confidence value of 1. nfcam appears frequently as a secondary prediction for ir documents, with a lower level of confidence. there are 54 such documents, indicating research that involves isotopes and radiation in nuclear materials, nuclear excavations, radioactive waste, structures, or advanced materials. nfcam shows the opposite trend to endf. after increasing in 2019, this subject declined over the next three years, with only two documents classified in this subject by 2022. students still show little interest in nuclear minerals, nuclear fuel, radioactive waste, structural materials, and advanced materials. six documents predicted in this field have confidence values of 1, while many more have confidence values between 0.50 and 0.75. ir also tends to appear as a secondary prediction for nfcam documents. nr and nrss also had ups and downs. twenty-five of the 47 documents predicted as nr were also predicted with a lower value in the nrss field. this demonstrates that students explored the relationship between reactor research and safety and security in various documents. meanwhile, only eight of the 45 nrss documents are unrelated to the endf field.
this demonstrates that students who study nuclear safety and security tend to apply engineering to address situations involving nuclear safety and security. documents in these two fields are usually concentrated around the 0.5 confidence value in both nr and nrss. mgt is one of the least studied topics among students. topics such as human resources, organization, management, program planning, auditing, quality systems, informatics utilization, and cooperation are commonly associated with the mgt field. mgt increased in 2020, although it was the field with the fewest documents in the other years (2017 to 2019 and 2021 to 2022). in terms of confidence value, 21 mgt documents have a value greater than 0.5, with eight documents at 1. with 10 documents, endf is the field that most often co-occurs with mgt.

progression in each study program

although all of them fall within the purview of nuclear science, the growth of each nuclear field differs by study program, depending on the curriculum. students are influenced by knowledge, and more specifically by the process of learning and comprehending (whether theoretical or more practical).33 endf is still the most popular field in the electromechanics and electronics instrumentation study programs. these two study programs offer courses in endf topic areas such as mechanical, civil and architectural, electromechanical, electrical, control systems, and radiation detection for nuclear devices. furthermore, the electronics instrumentation study program offers courses on nuclear electronics, signal processing techniques, and practical work on interface and data acquisition techniques, all of which are part of the endf nuclear instrumentation group. apart from endf, the fields of nfcam and nrss have been present in electromechanics throughout the six years.
mgt is currently a less appealing topic, with no final project reports relating to it in the most recent three years. in electronics instrumentation, the absent field is nfcam. the prediction results show that no document was predicted as nfcam. only 10 documents intersect with nfcam, and with low confidence values ranging from 0.247 to 0.251. nuclear minerals, nuclear fuel, structural materials and advanced materials, and radioactive waste are not studied in depth in this study program, which explains why nfcam is not predicted in electronics instrumentation. in contrast to the other study programs, ir is the most predicted field among the final project reports in nuclear technochemistry. nuclear technochemistry accounts for 88 of the 95 documents predicted as ir. this study program includes ir specializations such as the use of isotopes and radiation in agriculture, health, and industry. radioisotope production is another specialization, focused on the creation of isotopes and radiation sources, which explains why ir is so popular among nuclear technochemistry students. the nfcam field was not present in 2022, despite having been the topic of several students’ studies throughout the preceding five years. the endf and mgt fields have been present only in the last three years; no documents were predicted in these fields in the previous three years.

conclusion

the trend of research activities carried out by students varies from one study program to the next, although all are within the scope of the nuclear field. for example, endf is quite popular among electromechanics and electronics instrumentation students but not among nuclear technochemistry students, for whom endf only appeared three years ago and the number of documents is still modest. however, endf deserves attention.
the results for nuclear technochemistry students, whose coursework includes radiochemistry, show that the ir field is aligned with their studies and of interest to them. the low proportion of documents in certain categories, e.g., mgt, points to an opportunity to investigate these fields further. this study demonstrates an opportunity to use text mining to assist librarians in performing automatic document classification based on specific subjects. the best model in this study combines knn with cosine similarity and binary term occurrence. the model can help improve the quality of decisions so that documents are categorized accurately and efficiently. to determine a more specific classification, librarians should pay close attention to documents that have a low confidence value and intersect with other subjects. this study is limited to the knn method and documents from the batan repository, as well as final project documents from pint students. larger-scale testing could be conducted, for instance, in the international atomic energy agency’s (iaea) nuclear repository, known as the international nuclear information system (inis) repository, or in other databases that involve the complexity of categorizing documents across many languages.

data accessibility

datasets and data analysis code for rapidminer have been uploaded to the rin dataverse: https://hdl.handle.net/20.500.12690/rin/asrgvo.
data visualization can be accessed through tableau public: https://public.tableau.com/app/profile/faizhal.arif/viz/finalprojecttrendsutilizingnuclearknowledgetaxonomy/story1

appendix a: confusion matrix of 10-fold cross validation

accuracy: 97.00% +/- 4.83% (micro average: 96.97%)

                true nfcam  true ir  true nrss  true mgt  true nr  true endf  class precision
pred. nfcam             13        0          0         0        0          0          100.00%
pred. ir                 0       18          0         0        0          0          100.00%
pred. nrss               1        0         20         0        0          1           90.91%
pred. mgt                0        0          0        19        0          0          100.00%
pred. nr                 0        0          0         0       13          1           92.86%
pred. endf               0        0          0         0        0         13          100.00%
class recall        92.86%  100.00%    100.00%   100.00%  100.00%     86.67%

appendix b: the confidence value of each field

[figure: one panel per field (endf, ir, mgt, nfcam, nr, nrss)]

endnotes

1 budi prasetyo and anggiana rohandi yusuf, “pengelolaan pengetahuan eksplisit berbasis teknologi informasi di batan,” in prosiding seminar nasional sdm teknologi nuklir (seminar nasional sdm teknologi nuklir, yogyakarta: sekolah tinggi teknologi nuklir, 2018), 126–32, https://inis.iaea.org/collection/nclcollectionstore/_public/50/062/50062856.pdf?r=1.
2 joanna yi-hang pong et al., “a comparative study of two automatic document classification methods in a library setting,” journal of information science 34, no. 2 (april 2008): 213–30, https://doi.org/10.1177/0165551507082592.
3 prasetyo and yusuf, “pengelolaan pengetahuan eksplisit.”
4 jae-ho kim and key-sun choi, “patent document categorization based on semantic structural information,” information processing & management 43, no. 5 (september 2007): 1200–15, https://doi.org/10.1016/j.ipm.2007.02.002; pong et al., “a comparative study”; khusbu thakur and vinit kumar, “application of text mining techniques on scholarly research articles: methods and tools,” new review of academic librarianship (may 12, 2021): 1–25, https://doi.org/10.1080/13614533.2021.1918190.
5 pong et al., “a comparative study.”
6 kiri l. wagstaff and geoffrey z. liu, “automated classification to improve the efficiency of weeding library collections,” the journal of academic librarianship 44, no. 2 (march 2018): 238–47, https://doi.org/10.1016/j.acalib.2018.02.001.
7 manika lamba and margam madhusudhan, “mapping of topics in desidoc journal of library and information technology, india: a study,” scientometrics 120, no. 2 (august 2019): 477–505, https://doi.org/10.1007/s11192-019-03137-5.
8 fábio figueiredo et al., “word co-occurrence features for text classification,” information systems 36, no. 5 (july 2011): 843–58, https://doi.org/10.1016/j.is.2011.02.002; yen-hsien lee et al., “use of a domain-specific ontology to support automated document categorization at the concept level: method development and evaluation,” expert systems with applications 174 (july 2021): 114681, https://doi.org/10.1016/j.eswa.2021.114681; yousif a. alhaj et al., “a study of the effects of stemming strategies on arabic document classification,” ieee access 7 (2019): 32664–71, https://doi.org/10.1109/access.2019.2903331.
9 david antons et al., “the application of text mining methods in innovation research: current state, evolution patterns, and development priorities,” r&d management 50, no. 3 (june 2020): 329–51, https://doi.org/10.1111/radm.12408; muhammad arshad et al., “next generation data analytics: text mining in library practice and research,” library philosophy and practice (2020): 1–12.
10 mowafy mona, rezk amira, and hazem m. el-bakry, “an efficient classification model for unstructured text document,” american journal of computer science and information technology 06, no. 01 (2018), https://doi.org/10.21767/2349-3917.100016.
11 kim and choi, “patent document categorization.”
12 alhaj et al., “a study of the effects of stemming strategies.”
13 mona, amira, and el-bakry, “an efficient classification model.”
14 thakur and kumar, “application of text mining techniques.”
15 kim and choi, “patent document categorization.”
16 wagstaff and liu, “automated classification.”
17 najat ali, daniel neagu, and paul trundle, “evaluation of k-nearest neighbour classifier performance for heterogeneous data sets,” sn applied sciences 1, no. 12 (december 2019): 1559, https://doi.org/10.1007/s42452-019-1356-9.
18 roiss alhutaish and nazlia omar, “arabic text classification using k-nearest neighbour algorithm,” the international arab journal of information technology 12, no. 2 (2015): 190–95.
19 mona, amira, and el-bakry, “an efficient classification model.”
20 guozhong feng et al., “a probabilistic model derived term weighting scheme for text classification,” pattern recognition letters 110 (july 2018): 23–29, https://doi.org/10.1016/j.patrec.2018.03.003.
21 snezhana sulova et al., “using text mining to classify research papers,” in 17th international multidisciplinary scientific geoconference sgem 2017, vol. 17, international multidisciplinary scientific geoconference-sgem (17th international multidisciplinary scientific geoconference sgem, sofia: surveying geology & mining ecology management (sgem), 2017), 647–54, https://doi.org/10.5593/sgem2017/21/s07.083.
22 lee et al., “use of a domain-specific ontology.”
23 man lan et al., “supervised and traditional term weighting methods for automatic text categorization,” ieee transactions on pattern analysis and machine intelligence 31, no. 4 (april 2009): 721–35, https://doi.org/10.1109/tpami.2008.110.
24 devid haryalesmana, “masdevid/id-stopwords,” 2019, https://github.com/masdevid/id-stopwords.
25 alhaj et al., “a study of the effects of stemming strategies.”
26 pong et al., “a comparative study.”
27 ananta pandu wicaksana, “nolimitid/nolimit-kamus,” 2015, https://github.com/nolimitid/nolimit-kamus.
28 antons et al., “the application of text mining methods.”
29 kanish shah et al., “a comparative analysis of logistic regression, random forest and knn models for the text classification,” augmented human research 5, no. 1 (december 2020): 12, https://doi.org/10.1007/s41133-020-00032-0.
30 judit tamas and zsolt toth, “classification-based symbolic indoor positioning over the miskolc iis data-set,” journal of location based services 12, no. 1 (january 2, 2018): 2–18, https://doi.org/10.1080/17489725.2018.1455992.
31 hanan aljuaid et al., “important citation identification using sentiment analysis of in-text citations,” telematics and informatics 56 (january 2021): 101492, https://doi.org/10.1016/j.tele.2020.101492.
32 qiang wang, rongrong li, and gang he, “research status of nuclear power: a review,” renewable and sustainable energy reviews 90 (july 2018): 90–96, https://doi.org/10.1016/j.rser.2018.03.044.
33 ronald barnett, “knowing and becoming in the higher education curriculum,” studies in higher education 34, no. 4 (june 2009): 429–40, https://doi.org/10.1080/03075070902771978.

communication

navigation design and library terminology: findings from a user-centered usability study on a library website

isabel vargas ochoa

information technology and libraries | december 2020
https://doi.org/10.6017/ital.v39i4.12123

isabel vargas ochoa (ivargas2@csustan.edu) is web services librarian, california state university, stanislaus. © 2020.

abstract

the university library at california state university, stanislaus is not only undergoing a library building renovation, but a website redesign as well.
the library conducted a user-centered usability study to collect data in order to best lead the library website “renovation.” a prototype was created to assess an audience-based navigation design, homepage content framework, and heading terminology. the usability study consisted of 38 student participants. it was determined that a topic-based navigation design will be implemented instead of an audience-based navigation, a search-all search box will be integrated, and the headings and menu links will be modified to avoid ambiguous library terminology. further research on different navigation and content designs, and usability design approaches, will be explored for future studies.

introduction

the university library at california state university, stanislaus is currently undergoing a much anticipated and necessary redesign of the library website. website redesigns are a crucial part of website maintenance, needed to keep pace with modern technology and meet accessibility standards. “if librarians are expected to be excellent communicators at the reference desk and in the classroom, then the library website should complement the work of a librarian.”1 in this case, a library website prototype was created using a springshare llc product, libguides cms, as the testing subject for our user-centered usability study. the usability study was completed with 38 student participants belonging to different academic years and areas of study. the library website prototype tested was designed using a user-based design framework and an audience-based navigation. this study found issues reported by users related to navigation design and ambiguous library terminology. an audience-based navigation was chosen in order to organize and group the information and services offered and make them accessible for users.
however, an audience-based navigation will directly affect users and their search behaviors.2 the prototype, like the current library website, did not have a search-all search box during the study. a catalog search box was used to test whether the catalog alone was enough for student participants to find information. this also forced the participants to use the menu navigation.

literature review

the design and approach of usability studies, preference for types of search boxes, navigation design, and library terminology evolve over time in parallel with technology changes. most recent usability studies use screen and audio recording tools as opposed to written observation notes. participants in recent studies are also more accustomed to navigating websites than participants in usability studies twenty years ago. regardless, it’s crucial to compare the results from previous usability studies to analyze differences and similarities. different types of usability studies include user-centered usability studies and heuristic usability studies. this study chose a user-centered approach because of the library's desire to collect data and feedback from student users. the way in which the usability study is presented is also important to the approach. website usability studies are meant to test the website, although participants may unconsciously believe they are being tested. in tidal’s library website case study (2012), researchers assured the participants that “the web site was being tested and not the participants themselves.”3 this unconscious belief may also affect the data collected from the participants and “influence user behavior, including number of times students might attempt to find a resource or complete a given task.”4 the features tested were the navigation design and homepage elements.
the navigation design in the prototype was developed to test an audience-based navigation design (see figure 1). an audience-based navigation design organizes the navigation content by audience type.5 that is to say, the user will begin their search by identifying themselves first. although this design can organize content in a more efficient manner, especially for organizations that have specific, known audiences, critics argue that this design forces users to identify themselves before searching for information, thus taking them out of their task mindset.6 for this usability study, i wanted to test this navigation design and compare the results to our current navigation design, which is a topic-based navigation design. a topic-based navigation design presents topics as navigation content.7 this design is our current library website navigation design (see figure 2).

figure 1. screenshot of the audience-based navigation design developed for the library website prototype.

figure 2. screenshot of the current content-based navigation design in the library website.

designing the navigation and homepage also means choosing accessible terms that are relevant to all users. unfortunately, over the course of many decades, library terminology has been a hindrance for student users. terms such as “catalog,” “reference,” and “research guides” are still difficult for users to understand. as conrad states (2019), “students are not predisposed to think of a ‘research guide’ as a useful tool to help them get started.”8 a research guide isn’t necessarily a self-explanatory term. in many ways, the phrase is ambiguous. augustine’s case study in 2002 reported similar difficulties.
students’ “lack of understanding and awareness of library resources impacted their ability more than the organization of the site did.”9 it’s unsettling to know that our own terminology has been deterring users from accessing library resources for decades. librarians use library terminology to such an extent that it’s part of our everyday language, but what is common knowledge to us may be completely alien to our very own audience. not only should libraries be aware of confusing library terms, but content should also not overwhelm the user with an abundance of information. most students who visit the library are looking for something specific and easy to find. it’s important for librarians to condense the information on guides or website pages so as not to frustrate users or drive them to search elsewhere, such as google. “students scan . . . rather than [read] material.”10 this has also been observed in our crazy egg statistics. heatmaps of our website’s pages show that users are not scrolling to the bottom of the pages. this also applies to the use of large images, or unnecessary flashy or colorful content that covers most of the desktop or mobile screen. these images should be reduced in size so that users can find information swiftly. for this reason, any large design element on the homepage should also be included in menu links, in case large flashy content is ignored.11 the search box is another fundamental element i analyzed. in this case study, our search box was the catalog search box for ex libris primo. if a page, particularly the homepage, has two search boxes (search-all and catalog search), the user can be confused. search boxes are primarily placed at the center of the page. depending on how these search boxes are labeled and identified, users may not know which one to use. students approach library search boxes as if searching google.12 in our case, neither the current website nor the prototype has a general search-all box.
we have a catalog search box placed on the top center of the homepage for both sites. if we were to add a general search-all box, it would be placed away from the catalog search box and preferably in the header where it is visible in all pages.

methodology

the usability study was conducted by the author, the web services librarian at california state university, stanislaus, who also worked with a computer science instructor in order to recruit participants. not only is the university library redesigning its website, but the university library building is also undergoing a physical renovation. due to this project, the library has relocated to the library annex, a collection of modular buildings providing library services to the campus community. the usability study was conducted in a quiet study room in one of these modular sites. i reserved this study area and borrowed eight laptops for the sessions. the usability study employed two different methods to get students to participate. the first was an extra credit incentive, arranged in collaboration with the computer science instructor. this instructor was teaching a course on human-centered design for websites. she offered her students an extra credit incentive, since several of her learning objectives centered on website design and usability studies. the second approach was an informal one. this approach was promoted by scouting students who were already at the library annex during the scheduled usability study sessions. this enabled students to participate without having to sign up or remember to participate. the students were recruited in person during the usability session and through flyers posted in study rooms on the days of the study. an incentive of snacks for students to take home was also included. i created questions and seven tasks to be handed out to the participants during the study.
the tasks were created to test the navigation design of the main menu and content on the homepage. i also added a task to test the research skills of the student. after these tasks, students were asked to rate the ease of access, answer questions about their experience navigating the prototype, and provide feedback. all students were given the same tasks; however, students taking the human-centered design course were also given specific web design questions for feedback (see appendices a and b). the tasks were piloted before the study with three library student workers who provided feedback on how to better word the tasks for students. the following are the final seven tasks used for the usability study:

1. find research help on citing legal documents (a california statute) in apa style citation.
2. find the library hours during spring break.
3. find information on the library study spaces hours and location.
4. you’re a student at the stan state campus and you need to request a book from turlock to be sent to stockton. fill out the request-a-book form.
5. you are a graduate student and you need to submit your thesis online. fill out the thesis submission form.
6. for your history class, you need to find information on the university’s history in the university archives and special collections. find information on the university archives and special collections.
7. find any article on salmon migration in portland, oregon. you need to print it, email it to yourself, and you also need the article cited.

the usability study sessions took place from 11am to 2pm on february 10, 12, and 14, 2020. these days and times were chosen because the snack incentive would attract students during lunch hour and i wanted to accommodate the start and end times of the human-centered design course on mondays, wednesdays, and fridays. the total time it took for students to complete the 7 tasks averaged 15 minutes. in total, there were 38 student participants.
each student’s experience was recorded anonymously. i asked students to provide their academic year and major. participants included freshmen (5), sophomores (2), juniors (12), seniors (17), graduate students (1), and unknown (1). areas of study included computer science (16), criminal justice (2), business (2), psychology (3), communications (1), sociology (1), english (3), nursing (1), spanish (1), biology (3), geology (1), history (2), math (1), gender studies (1), and undeclared (1). the subject tested was the library website prototype created and executed using a springshare llc product, libguides cms. the tools i used were eight laptops and a screen recording tool, snagit. snagit is a recording tool made accessible through a campus subscription. the laptops were borrowed from the library for the duration of the sessions. during the session, students navigated and completed the tasks on their own with no direct interference, including no direct observations. i planned to create a space where my presence didn’t directly influence or intimidate their experience with the website. my findings were based solely on their written responses and screen recordings. because students had to sign in to the laptop using their campus student id, i also explained to them that their screen-recorded video would not be linked to their identity. i did, however, occasionally walk around the tables in the room in case a student was navigating the current website or using a separate site to complete the tasks. once the students completed the tasks and answered the questions, i collected the handouts and the screen-capture videos by copying them to a flash drive.

limitations

during the usability study session, there were two technical issues that hindered the initial process.
on the first day, there were difficulties accessing the campus wi-fi in the room as well as difficulties accessing the snagit video recording application. this limitation affected some of the students’ experiences and feedback. these issues were resolved and not present on the second and third day of the study.

results and observations

the results and observations collected from this study mirror results from the studies conducted by azadbakht and swanson.13 i found that students searched the catalog search box for library collections, citations, and other library terms they didn’t understand, even though it was a catalog search box with the keywords “find articles, books, and other materials” labeled in the search bar. another finding was that the navigation design can detrimentally affect a user’s experience with the website. mixed reviews were received from utilizing the audience-based navigation design. the study also found that students are adept at finding research materials. for example, most students knew how to search, find, print, email, and cite an article. students in general are also familiar with book requests, ill accounts, and filling out book request webforms. this indicates that, in terms of utilizing library services, students are well aware of how to find, request, and acquire resources using the website on their own. what was most difficult for students was interpreting library terminology. this was explicitly shown in their attempts to complete tasks 1 and 6: finding how to cite a legal document in apa style and finding information on special collections and the university archives. the following results and observations are divided into three categories: written responses, video recording observations, and data collected. data was collected based on observations from the video recording and the written responses. data was then input into eight separate charts.
written responses observations

comments from both non-human-centered website design students and human-centered design students included mixed reviews on the navigation layout, an overall positive outlook on the page layout design, suggestions to add a search-all “search bar,” and frustrations with tasks 1 and 6.

video recording observations

the ex libris primo search box was constantly mistaken for a search-all search box. this occurred during students’ search for tasks 1 and 6: citation help and university archives, respectively. students also used the research guides search box in libguides as a search-all search box. students found the citation style guides easily because of this feature; however, on the proposed new website, it was difficult to find citation help. students were also using research guides to complete other tasks, such as task 6. a search bar for the entire website was continuously mentioned as a solution by student participants. tasks 2 and 3, regarding library hours and study spaces, were easily completed. tasks 4 and 5 were also easily accessible. after completing task 4 (book request form) it was easier for participants to complete task 5 (thesis submission form) because both tasks required students to search the top main navigation menu. to complete task 4, several students immediately signed in to their ill account or logged in to primo for csu+, which was expected, as signing in to these accounts is an alternate way to request a book. an additional observation regarding task 4 is that the confusion surrounding the library term “call number” was resolved by adding an image reference pointing to the call number in the catalog. the call number image reference was opened several times for assistance. most students completed task 7 (find a research article) but not all students used the catalog search box on the homepage to complete it.
several students searched the top main navigation and clicked on the “research help” link. others utilized research guides and the research guides search box on the homepage. one notable observation concerned some computer science students. most computer science students were quicker to give up on a task than non-computer science students. some computer science students did not scroll down when browsing pages. these students failed to complete several tasks because they didn’t scroll down the page after being on the page for less than ten seconds.

data collected

figure 3. ease of navigation (overall).

figure 3 illustrates the overall ease of navigation rating from all student participants. students were asked to rate the ease of access of the website (see appendices a and b). other than the keywords “ease of navigation (1 difficult; 10 easy)” students were given the freedom to define what “easy” and “difficult” meant to them individually. the mean for the ease of access rating for all student participants was 7.7. the lowest rating of ease of access was 3 and the highest rating of ease of access was 10.

figure 4. ease of navigation (computer science major).

figure 4 illustrates the ease of access rating by the student participants based on whether the student was a computer science major or not a computer science major. the lowest ease of access ratings were from computer science majors. overall, non-computer science majors had higher ease of access ratings than computer science majors.

figure 5. ease of navigation (human-centered design).

figure 5 illustrates the ease of access rating by the student participants based on whether the student was taking the human-centered design course.
the human-centered design students’ learning outcomes include website user-interface design and an assignment on how to create a usability study. similar to patterns found in figure 4, human-centered design students had lower ease of access ratings.

figure 6. tasks – status of completion.

figure 6 illustrates whether a task was completed or not. a task was counted as completed if the student found the page(s) that provided the solution to the task. it was determined that a student did not complete the task if the student was unable to find the page(s) that provided the solution to the task. “not applicable” was recorded if the student did not use the website prototype (e.g., followed a link that led elsewhere or opted to use google search instead). most students completed tasks 2, 3, 4, 5, and 7. the task with the most “did not complete” results was task 1, which 64 percent of student participants did not complete. task 6 had a neutral completion rate of 63 percent. 86 percent of students completed tasks 2 and 4, and 90 percent of students completed tasks 3, 5, and 7. it is evident that task 1 was a difficult task to complete, regardless of the student’s area of study. task 1 required students to find apa legal citation help. the terms “apa legal citation” confused users. likewise, for task 6 (special collections), students did not understand what “collections” referred to or where to search them.

figure 7. tasks – number of clicks (complete).

figure 7 illustrates how many clicks it required students to complete the task. the clicks were separated into three categories: 1-2 clicks, 3-5 clicks, and more than 6 clicks. this figure only illustrates data collected from tasks that were completed.
the number of clicks began at the website prototype’s homepage or from the main menu navigation found in the website prototype’s header, when it was evident that the student was starting a new task. tasks 2 and 3 were completed in 1-2 clicks, whereas tasks 1, 4, 5, 6, and 7 required an average of 3-5 clicks. based on experience helping students find articles at the librarian’s research help desk, task 7 (find research articles) had been expected to require 6+ clicks. task 1 may have a pattern of needing a high number of clicks to complete because it was generally a difficult task to complete.

figure 8. tasks – number of clicks (did not complete).

figure 8 illustrates how many clicks a student participant made before they decided to skip the task or believed they had completed the task. this figure only illustrates data from tasks that were not completed. the clicks were separated into three categories: 1-2 clicks, 3-5 clicks, and more than 6 clicks. the number of clicks began at the website prototype’s homepage or from the main menu navigation found in the website prototype’s header, when it was evident that the student was starting a new task. tasks 1 and 6 show the clearest patterns in this figure. task 1 (citation help) shows that students generally skipped the task after more than 6 clicks. task 6 (special collections) was generally skipped after 3-5 clicks or after more than 6 clicks.

figure 9 illustrates the duration to complete each task. the duration was separated into three categories: 0-1 minutes, 1-3 minutes, or more than 3 minutes. this figure only illustrates data for tasks that were completed. the duration began when the student started a new task. this was determined when it was observed that the students started to use the main menu navigation, or if the student directed their screen back to the website prototype’s homepage.
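the click and duration categories used in these figures amount to a simple binning of raw observations. a minimal python sketch of that binning follows; the function names, the sample observations, and the treatment of boundary values (exactly 6 clicks counted in the top category, exactly 1 or 3 minutes counted in the lower category, every attempt assumed to involve at least one click) are assumptions made for illustration, not details reported in the study.

```python
from collections import Counter


def click_category(clicks):
    """Return the reporting category for a raw click count."""
    if clicks <= 2:
        return "1-2 clicks"
    if clicks <= 5:
        return "3-5 clicks"
    return "6+ clicks"  # assumption: exactly 6 clicks falls in the top category


def duration_category(seconds):
    """Return the reporting category for a task duration in seconds."""
    if seconds <= 60:
        return "0-1 minutes"
    if seconds <= 180:
        return "1-3 minutes"
    return "more than 3 minutes"


def tally(observations):
    """Count observations per (task, click category, duration category)."""
    counts = Counter()
    for task, clicks, seconds in observations:
        counts[(task, click_category(clicks), duration_category(seconds))] += 1
    return counts


# hypothetical observations: (task number, clicks, seconds); not study data
sample = [(2, 1, 40), (2, 2, 55), (1, 7, 200), (6, 4, 150)]
result = tally(sample)
```

a tally of this kind, kept separately for completed and uncompleted tasks, would yield the counts plotted in each chart.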
there are parallels between the number of clicks and duration of tasks. for tasks 2, 3, and 5, the duration to complete the task was less than 1 minute. task 5 was a task similar to task 4 (both are forms, linked once on the website), but the duration for task 5 may have averaged lower than the duration of task 4 because task 5 came after task 4. having completed a form before task 5 may have influenced the student’s behavior on searching for forms. tasks 1, 6, and 7 averaged 1-3 minutes to complete.

figure 9. tasks – question duration (complete).

figure 10. tasks – question duration (did not complete).

figure 10 illustrates the duration of each task that wasn’t completed. the duration was separated into three categories: 0-1 minutes, 1-3 minutes, or more than 3 minutes. this figure only illustrates data for tasks that were not completed. the duration began when the student started a new task. this was determined when it was observed that the students started to use the main menu navigation, or if the student directed their screen back to the website prototype’s homepage. similar to the observations for figure 7, there are parallels between the number of clicks and duration of tasks. for task 1, the time before students skipped the task varied; however, most students who didn’t complete the task skipped it after more than 3 minutes of trying to complete it. for task 6, the average duration before skipping the task was 1-3 minutes.

conclusion and recommendations

this study was designed primarily to test the user-centered study approach and the navigational redesign of the library website. the results, however, provided the library with a variety of outcomes.
based on suggestions and comments on the website prototype navigation design, menus, and page content, there are several elements that will be integrated to help lead the redesign of the library’s website. students found that the navigation design of the website was clear and simple, but also required “getting used to.” because of this, and based on the navigation design literature, it is recommended to design a menu navigation that is topic-based as opposed to audience-based. our findings also highlighted the effects of the use of library terms. to make menu links exceptionally user-friendly, it is recommended to utilize clear and common terminology. student participants also voiced that a search-all search box for the website was necessary. this will enable users to access information efficiently. library website developers should also map more than one link to a specific page, especially if the only link to the page is on an image or slideshow. the user-centered usability approach for this case study worked well both in collaboration with campus faculty and as an informal recruitment. it provided relevant and much needed data and feedback for the university library. in terms of future usability studies, a heuristic approach may be effective. a heuristic study approach will enable moderators to gather feedback and analysis from library web development experts.14 moreover, the usability study could be conducted over a semester-long period and include focus groups to acquire consistent feedback.15 overall, website usability studies are evolving and require constant improvements and research.

appendix a

major: ___________
year (freshman, sophomore, etc.): ______________
link to site: url
please do not use url

please complete the following situations. for some of these, you don’t need to actually submit/send, but pretend as if you are.

1.
find research help on citing legal documents (a california statute) in apa style citation.
2. find the library hours during spring break.
3. find information on the library study spaces hours and location.
4. you’re a student at the stan state campus and you need to request a book from turlock to be sent to stockton. fill out the request-a-book form.
5. you are a graduate student and you need to submit your thesis online. fill out the thesis submission form.
6. for your history class, you need to find information on the university’s history in the university archives and special collections. find information on the university archives and special collections.
7. find any article on salmon migration in portland, oregon. you need to print it, e-mail it to yourself, and you also need the article cited.

complete the following questions.

1. rate the ease of access of the website (1 = really difficult to navigate, 10 = easy to navigate)
1 2 3 4 5 6 7 8 9 10
2. did you ever feel frustrated or confused? if so, during what question?
3. do you think the website provides enough information to answer the above questions? why or why not?

appendix b

cs 3500
major: ___________
year (freshman, sophomore, etc.): ______________
link to site: url
please do not use url

please complete the following situations. for some of these, you don’t need to actually submit/send, but pretend as if you are.

1. find research help on citing legal documents (a california statute) in apa style citation.
2. find the library hours during spring break.
3. find information on the library study spaces hours and location.
4. you’re a student at the stan state campus and you need to request a book from turlock to be sent to stockton. fill out the request-a-book form.
5. you are a graduate student and you need to submit your thesis online. fill out the thesis submission form.
6.
for your history class, you need to find information on the university’s history in the university archives and special collections. find information on the university archives and special collections.
7. find any article on salmon migration in portland, oregon. you need to print it, e-mail it to yourself, and you also need the article cited.

then, complete the following questions.

1. rate the ease of access of the website (1 = really difficult to navigate, 10 = easy to navigate)
1 2 3 4 5 6 7 8 9 10
2. what did you think of the overall web design?
3. what would you change about the design? please be specific.
4. what did you like about the design? please be specific.

endnotes

1 mark aaron polger, “student preferences in library website vocabulary,” library philosophy and practice, no. 1 (june 2011): 81, https://digitalcommons.unl.edu/libphilprac/618/.
2 jakob nielsen, “is navigation useful?,” nn/g nielsen norman group, https://www.nngroup.com/articles/is-navigation-useful/.
3 junior tidal, “creating a user-centered library homepage: a case study,” oclc systems & services: international digital library perspectives 28, no. 2 (may 2012): 95, https://doi.org/10.1108/10650751211236631.
4 suzanna conrad and christy stevens, “‘am i on the library website?’: a libguides usability study,” information technology and libraries (online) 38, no. 3 (september 2019): 73, https://doi.org/10.6017/ital.v38i3.10977.
5 eric rogers, “designing a web-based desktop that's easy to navigate,” computers in libraries 20, no. 4 (april 2000): 36, proquest.
6 katie sherwin, “audience-based navigation: 5 reasons to avoid it,” nn/g nielsen norman group, https://www.nngroup.com/articles/audience-based-navigation/.
7 rogers, “designing a web-based desktop that's easy to navigate,” 36.
8 conrad, “‘am i on the library website?’: a libguides usability study,” 71.
9 susan augustine and courtney greene, “discovering how students search a library web site: a usability case study,” college & research libraries 63, no. 4 (july 2002): 358, https://doi.org/10.5860/crl.63.4.354.
10 conrad, “‘am i on the library website?’: a libguides usability study,” 70.
11 kate a. pittsley and sara memmott, “improving independent student navigation of complex educational web sites: an analysis of two navigation design changes in libguides,” information technology and libraries 31, no. 3 (september 2012): 54, https://doi.org/10.6017/ital.v31i3.1880.
12 elena azadbakht, john blair, and lisa jones, “everyone's invited: a website usability study involving multiple library stakeholders,” information technology and libraries 36, no. 4 (december 2017): 43, https://doi.org/10.6017/ital.v36i4.9959.
13 azadbakht, “everyone's invited,” 43; troy a. swanson and jeremy green, “why we are not google: lessons from a library web site usability study,” the journal of academic librarianship 37, no. 3 (february 2011): 226, https://doi.org/10.1016/j.acalib.2011.02.014.
14 laura manzari and jeremiah trinidad-christensen, “user-centered design of a web site for library and information science students: heuristic evaluation and usability testing,” information technology and libraries 25, no. 3 (september 2006): 164, https://doi.org/10.6017/ital.v25i3.3348.
15 tidal, “creating a user-centered library homepage: a case study,” 97.
book reviews

indiana seminar on information networks (isin). proceedings. compiled by donald p. hammer and gary c. lewis. west lafayette, indiana: purdue university libraries, 1972. 91 p. (available at no charge from the extension division, indiana state library, 140 north senate avenue, indianapolis, indiana 46204 as long as the supply lasts).

the indiana seminar on information networks (october 26-28, 1971) was an attempt to introduce indiana librarians to the benefits (and presumably problems) of library networking. papers included in the proceedings are introduction to networks (maryann duggan), library of congress marc & recon (lucia j. rather), nelinet (ronald f. miller), an on-line interlibrary circulation and bibliographic searching demonstration (gary c. lewis and donald p. hammer), ohio college library center (frederick g. kilgour), user response to the facts (facsimile transmission system) network (lynn r. hard), indiana twx network discussion (margaret d. egan & abbie d. heitger), and how does the network serve the researcher? (irwin h. pizer). as with any collection of written papers or oral presentations, the quality is mixed. the papers are introductory in nature, the pizer article being the exception.
the majority report "case studies" of particular automated operations and/or networks (marc & recon, nelinet, oclc, facts). the facts article is the most interesting of these "case studies" because it moves beyond simply reporting "how we done it good" into an evaluation of why the network did not succeed (the network did not meet a real and/or consciously recognized need of the libraries it was proposing to serve) and emphasizes the importance of careful planning. any would-be network planner should read this article; there are many lessons to be learned. although the collected papers have all of the disadvantages usually associated with a collection of oral presentations (material is loosely organized and lacks continuity, introductory and oversimplified, repetitive, and out of date), they are a valuable addition to the growing body of literature dealing with networks both from the idealized conceptual view and, perhaps more importantly, from the practical reality view of existing networks.

kenneth j. bierman
systems librarian
virginia polytechnic institute

computers and systems; an introduction for librarians, by john eyre and peter tonks. hamden, connecticut, linnet books (shoe string press), 1971. 127 p. $5.75. isbn: 0-208-01073-4.

at last an inexpensive introductory text specifically written for librarians and library students! not since n. s. m. cox's the computer and the library have we had such a short, easy to read, yet comprehensive, description of the essentials. complementing the text are twenty-nine figures illustrating everything from batch and real-time processors, disc drives, program process, and systems flowcharts to data elements, formats, and input procedures, marc ii records on magnetic tapes, and sample pages from a computer-produced author catalog. the text reads like a well-organized glossary and treats the subjects of library use of computers and systems analysis in a way at once simple and informative.
the authors had tested the material with students in courses at the school of librarianship of the polytechnic of north london. thanks to the british-american cooperation surrounding marc efforts, this book will be as useful in our library school classes as it is in theirs. the index deserves a special note because it was compiled after the style of precis developed by the british national bibliography. it is a facet analysis of the text featuring access to "activity:thing:type:aspect" in a prescribed permuted order. although there is not much emphasis in such a text on subject access or information retrieval, this is not entirely overlooked and this index serves as an excellent example of what could be done by computer. truly an excellent introduction to computers and systems analysis for librarians! a two-page bibliography contains suggestions for further reading on the topic or for an expanded reading of various applications of computers in libraries.

pauline atherton
school of library science
syracuse university

isis: integrated scientific information system; a general description of an approach to computerised bibliographical control, by william schieber. geneva: international labour office, 1971. 115p. $1.50.

this document is a well-written description of the computerized library system developed at the international labour office. planning and development for the system began in 1963. it has been implemented and is now in operation within the central library and documentation branch of the ilo. the isis bibliographic control system is a large file system for storing, processing, and retrieving bibliographic information. the ilo data base consists of some 45,000 records of books, periodical articles, and other documents. each record consists of conventional bibliographic data (with less detailed definitions than marc data, however) plus an abstract.
in form, the abstract appears to be written in natural language, but all descriptor words used in the abstract are taken from a controlled vocabulary and, in fact, provide subject indexing. on-line terminals are used for ide searches. the search system allows searches by subject descriptors, language, and date of publication. sequential formulation of the search allows control of the number of responses to a desirable size. records are also indexed on various data fields, such as author and title. display of records and browsing are handled on line, but printing of lists or bibliographies is handled through subsequent batch printing jobs. regularly scheduled outputs of the system include printed catalogs, indexes, and authority lists. two other systems have been developed at the ilo using some programs and files of the bibliographic control system. one is for controlling loans of library books, the other is for serials data and includes a subsystem for routing library periodicals. these three major systems are described in some detail in this report. a fourth section deals with system monitoring and control. costs are discussed here. the isis system is an interesting and unique one even though the system is geared primarily to a special library environment. it is evident that much careful thought and attention to detail went into the system design and development. the integrated use of programs and files as described here and the details of some design elements make this a useful document. the report itself is well done. describing a complex system for a varied audience is a difficult task. the author, william d. schieber, has put together an excellent example of a systems report document.

charles t. payne
systems development office
university of chicago library

title derivative indexing techniques: a comparative study, by hilda feinberg. metuchen, n.j.: the scarecrow press, 1973. x+297p.; index and bibliography.
this book is primarily a survey of key word indexes, with some discussion of issues in indexing. the survey is quite good, but already out of date. the discussion is unfortunate.

the survey covers a wide range of computer-based article title key word indexes, including extreme cases such as permuterm. sample pages are included for fifty-six indexes, and thirteen lists of excluded words ("stopwords") are given. reproduction of samples is generally excellent, and this portion is valuable in showing the virtues and defects of various approaches to key word indexing. since this survey, at least three major libraries have begun publication of key word indexes to serial titles, a type of index with different problems which is likely to be more common in the future.

journal of library automation vol. 6/1 march 1973

the discussion suffers from a lack of focus. there are no clear standards for key word indexes or the traditional tools they complement or replace, and studies of user preference and convenience have been limited and inconclusive. it is difficult to say what makes a key word index more or less workable, and this book seems to cloud the issues even more. ms. feinberg makes some questionable and unsupported assumptions about what users think, want, and need, and a number of recommendations which are at best only applicable to indexes of article titles in scientific fields. take three major recommendations: plural and singular forms should be interfiled, synonyms and similar words should be interfiled, and foreign titles should be translated. the university of california (berkeley) library found "college," "university," "company" and "papers" to be good exclusion words, while "colleges," "universities," "companies," and "paper" are good subject words. synonym control increases homonym problems, makes for longer (and thus more difficult to use) lists, and entails difficult decisions as to what constitute true synonyms.
translation raises the question of whether a user should be guided to a publication he may not be able to read. in sum, these and similar decisions should depend much more on the field of study and user population than on this type of general treatment.

there are other problems reflecting deficiencies in the areas of technical background, understanding of typography, and appreciation of some reasons for key word indexing. ms. feinberg comes out strongly in favor of "title enrichment": adding artificial titles to improve indexing. this, however, adds cost and time to the key word approach, and subtracts from its clear advantages. a large section is devoted to an experimental study of different indexing programs, with the result that different programs produce different indexes. generally, the discussion detracts from the survey.

finally, the title chosen seems unfortunate. "key word indexing" may not be an ideal term, but it is fairly well known; must we introduce yet another vague, polysyllabic phrase, "title derivative indexing"?

walt crawford
university of california
berkeley

accountability: systems planning in education. leon lessinger & associates. creta d. sabine, editor. homewood, ill.: etc publications, 1973. 242 pages.

"accountability" has become a rallying cry in many educational circles of late: for the public in its demand for visible results for educational dollars, and for educators as they attempt to define and defend new programs. this well-sequenced collection of nine papers on this subject addresses the problem of accountability at all levels of the educational enterprise. first is a conceptualization of systems-planning through an explanation of the systems approach, cost effectiveness, and cost analysis.
next are specific methods of systems-planning at the classroom, community college, university, and state

{spatial clusters of operators}
3: intervaltree tx ← intervaltree()
4: intervaltree ty ← intervaltree()
5: map parent ← map()
6: for all operation op ∈ input_operations do
7:   rectangle boundary ← extendbymargins(op.boundary)
8:   repeat
9:     operationset int_opsx ← tx.getintersectingops(boundary)
10:    operationset int_opsy ← ty.getintersectingops(boundary)
11:    operationset int_ops ← int_opsx ∩ int_opsy
12:    for all operation int_op ∈ int_ops do
13:      rectangle bd ← tx[int_op] × ty[int_op]
14:      boundary ← smallestenclosing(bd, boundary)
15:      parent[int_op] ← op
16:      tx.remove(int_op); ty.remove(int_op)
17:    end for
18:  until int_ops = ∅
19:  tx.add(boundary, op); ty.add(boundary, op)
20: end for
21: map results ← map()
22: for all operation op ∈ input_operations do
23:   operation root_op ← getroot(parent, op)
24:   rectangle rec ← tx[root_op] × ty[root_op]
25:   if not results.has_key(rec) then
26:     results[rec] ← list()
27:   end if
28:   results[rec].add(op)
29: end for
30: return results

automatic extraction of figures from scientific publications in high-energy physics | praczyk, nogueras-iso, and mele 32

the clustering of operations is based on the relation of their rectangles being close to each other. definition 1 formalizes the notion of being close, making it useful for the algorithm.

definition 1: two rectangles are considered to be located close to each other if they intersect after their boundaries are expanded in every direction by a margin.

the value by which rectangles should be extended is a parameter of the algorithm and might be different in various situations. to detect whether rectangles are close to each other, we needed a data structure allowing the storage of a set of rectangles. this data structure was required to allow retrieving all stored rectangles that intersect a given one.
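as a rough illustration of definition 1 and algorithm 1, the following python sketch clusters axis-aligned rectangles. it is a simplification under stated assumptions, not the authors' implementation: a plain dictionary scanned linearly stands in for the two interval trees tx and ty, rectangles are (x0, y0, x1, y1) tuples, and the names `expand`, `rects_intersect`, `get_root`, and `cluster` are hypothetical.

```python
from typing import Dict, Hashable, List, Tuple

Rect = Tuple[float, float, float, float]  # (x0, y0, x1, y1), edges parallel to the axes


def expand(r: Rect, margin: float) -> Rect:
    """Extend a rectangle by `margin` in every direction (Definition 1)."""
    return (r[0] - margin, r[1] - margin, r[2] + margin, r[3] + margin)


def rects_intersect(a: Rect, b: Rect) -> bool:
    """Two axis-aligned rectangles intersect when both 1-D projections overlap."""
    x_overlap = a[0] <= b[2] and b[0] <= a[2]
    y_overlap = a[1] <= b[3] and b[1] <= a[3]
    return x_overlap and y_overlap


def enclosing(a: Rect, b: Rect) -> Rect:
    """Smallest rectangle containing both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def get_root(parent: Dict[Hashable, Hashable], op: Hashable) -> Hashable:
    """Find the representative of op's cluster, compressing the path on the way."""
    root = op
    while root in parent:
        root = parent[root]
    while op in parent:  # path compression: point every visited node at the root
        nxt = parent[op]
        parent[op] = root
        op = nxt
    return root


def cluster(boundaries: Dict[Hashable, Rect], margin: float) -> Dict[Rect, List[Hashable]]:
    """Group operations whose (expanded) rectangles are close to each other."""
    active: Dict[Hashable, Rect] = {}  # representative -> cluster boundary (assumed stand-in for tx, ty)
    parent: Dict[Hashable, Hashable] = {}
    for op, rect in boundaries.items():
        boundary = expand(rect, margin)
        while True:  # repeat until no stored cluster intersects the growing boundary
            hits = [o for o, r in active.items() if rects_intersect(r, boundary)]
            if not hits:
                break
            for o in hits:
                boundary = enclosing(active.pop(o), boundary)
                parent[o] = op
        active[op] = boundary
    results: Dict[Rect, List[Hashable]] = {}
    for op in boundaries:
        results.setdefault(active[get_root(parent, op)], []).append(op)
    return results
```

the linear scan makes each lookup proportional to the number of stored clusters; the interval trees of algorithm 1 exist precisely to avoid that cost on pages with many operators.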
we have constructed the necessary structure using an important observation about the operation result areas. in our model all bounding rectangles have their edges parallel to the edges of the reference canvas on which the output of the operators is rendered. this allowed us to reduce our problem from the case of 2-dimensional rectangles to the case of 1-dimensional intervals. we can assume that the edges of the rectangular canvas define the coordinate system. it is easy to prove that two rectangles with edges parallel to the axes of the coordinate system intersect if and only if their projections onto both axes intersect. the projection of a rectangle onto an axis is always an interval.

the observation made above has allowed us to build the required 2-dimensional data structure from two 1-dimensional data structures, each of which stores a number of intervals and, for a given interval, returns the set of intersecting ones. such a 1-dimensional data structure has been provided by interval-trees.20 every interval inside the tree has an arbitrary object assigned to it, which in this case is a representation of the pdf operator. this object can be treated as an identifier of the interval. the data structure also implements a dictionary interface, mapping objects to actual intervals.

at the beginning, the algorithm initializes two empty interval trees representing projections on the x and y axes, respectively. those trees store the projections of the largest areas calculated so far rather than those of particular operators. each cluster is represented by the most recently discovered operation belonging to it. during the algorithm execution, each operator from the input set is considered only once. the order of processing is not important. the processing of a single operator proceeds as follows (the interior of the outermost “for all” loop of the algorithm).
1. the boundary of the operation is extended by the width of margins.
the spatial data structure described earlier is utilized to retrieve boundaries of all already detected clusters (lines 9–10).
2. the forest of trees representing clusters is updated. the currently processed operation is added without a parent. roots of all trees representing intersecting clusters (retrieved in the previous step) are attached as children of the new operation.
3. the boundary of the processed operation is extended to become the smallest rectangle containing all boundaries of intersecting clusters and the original boundary. finally, all intersecting clusters are removed from the spatial data structure.
4. lines 9–17 of the algorithm are repeated as long as there exist areas intersecting the current boundary. in some special cases, more than one iteration may be necessary.
5. finally, the calculated boundary is inserted into the spatial data structure as the boundary of a new cluster. the currently processed operation is designated to represent the cluster and so is remembered as the representative of the cluster.

information technology and libraries | december 2013 33

after processing all available operations, the post-processing phase begins. all the trees are transformed into lists. the resulting data structure is a dictionary having boundaries of detected clusters as keys and lists of the operations belonging to them as values. this is achieved in lines 21–29. during the process of retrieving the cluster to which a given operation belongs, we use a technique called path compression, known from the union-find data structure.21

filtering of clusters

graphical areas detected by a simple clustering usually do not directly correspond to figures. the main reason for this is that figures may contain not only graphics, but also portions of text. moreover, not all graphics present in the document must be part of a figure.
for instance, common graphical elements not belonging to a figure include logos of institutions and text separators like lines and boxes; various parts of mathematical formulas usually include graphical operations; and in the case of slides from presentations, the graphical layout should not be considered part of a figure. the above shows that the clustering algorithm described earlier is not sufficient for the purpose of figure detection and yields a result set wider than expected. in order to take into account the aforementioned characteristics, pre-calculated graphical areas are subject to further refinement. this part of the processing is highly domain-dependent as it is based on properties of scientific publications in a particular domain, in this case publications of hep. in the course of the refinement process, previously computed clusters can be completely discarded, extended with new elements, or some of their parts might be removed. in this subsection we discuss the heuristics applied for rejecting and splitting clusters of graphical operators.

there are two main reasons for rejecting a cluster. the first is that its size is too small compared to the page size. the second is that the aspect ratio of the figure candidate lies outside a desired interval of values. the first heuristic is designed to remove small graphical elements appearing for example inside mathematical formulas, but also small logos and other decorations. the second one discards text separators and different parts of mathematical equations, such as a line separating the numerator from the denominator inside a fraction. the thresholds used for filtering are provided as configurable properties of the algorithm and their values are assigned experimentally in a way that maximises the accuracy of figure detection.
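the two rejection heuristics above can be sketched in a few lines of python. this is only an illustration: the function name `keep_cluster` and the default threshold values are hypothetical, since the text says only that the thresholds are configurable and tuned experimentally.

```python
from typing import Tuple

Rect = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


def keep_cluster(
    rect: Rect,
    page_width: float,
    page_height: float,
    min_area_fraction: float = 0.01,  # hypothetical threshold
    min_aspect: float = 0.1,          # hypothetical threshold
    max_aspect: float = 10.0,         # hypothetical threshold
) -> bool:
    """Return False for clusters that are too small or too elongated."""
    width = rect[2] - rect[0]
    height = rect[3] - rect[1]
    if width <= 0 or height <= 0:
        return False
    # Heuristic 1: size relative to the page (removes small logos, formula parts).
    if (width * height) / (page_width * page_height) < min_area_fraction:
        return False
    # Heuristic 2: aspect ratio must lie in the desired interval
    # (removes separators and fraction bars).
    return min_aspect <= width / height <= max_aspect
```

for example, on a 600 by 800 page a 300 by 200 plot is kept, while a 40 by 2 fraction bar fails the size test and a 500 by 20 separator fails the aspect-ratio test.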
additionally, the analysis of the order of operations forming the content stream of a pdf document may help to split clusters that were incorrectly joined by algorithm 1. parts of the stream corresponding to logical parts of the document usually form a consistent subsequence. this observation allows the construction of a method of splitting elements incorrectly clustered together. we can assign content streams not only to entire pdf documents or pages, but also to every cluster of operations. the clustering algorithm presented in algorithm 1 returns a set of areas with a list of operations assigned to each of them. the content stream of a cluster consists of all operations from such a set ordered in the same manner as in the original content stream of the pdf document. the usage of the original content stream allows us to define a distance in the content stream as follows:

definition 2: if $o_1$ and $o_2$ are two operations appearing in the content stream of the pdf document, by the distance between these operations we understand the number of textual and graphical operations appearing after the first of them and before the second of them.

to detect situations when a figure candidate contains unnecessary parts, the content stream of a figure candidate is read from the first to the last operation. for every two subsequent operations, the distance between them in the sense of the original content stream is calculated. if the value is larger than a given threshold, the content stream is split into two parts, which become separate figure candidates. for both candidates, a new boundary is calculated. this heuristic is especially important in the case of less formal publications such as slides from presentations at conferences. presentation slides tend to have a certain number of graphics appearing on every page and not carrying any meaning. simple geometrical clustering would connect elements of page style with the rest of the document content.
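a minimal python sketch of the splitting heuristic based on definition 2 follows. it assumes that each operation of a figure candidate is identified by its index in the original content stream, counting only textual and graphical operations, so that the distance between two operations at indices i < j is j - i - 1. the function name and the threshold value in the usage note are hypothetical, and the sketch splits at every large gap rather than once.

```python
from typing import List


def split_by_stream_distance(positions: List[int], threshold: int) -> List[List[int]]:
    """Split a candidate's operations wherever the content-stream gap exceeds `threshold`.

    `positions` are indices into the original content stream (assumed to count
    only textual and graphical operations).
    """
    parts: List[List[int]] = []
    current: List[int] = []
    for pos in sorted(positions):
        # Distance in the sense of Definition 2: operations strictly between the two.
        if current and pos - current[-1] - 1 > threshold:
            parts.append(current)
            current = []
        current.append(pos)
    if current:
        parts.append(current)
    return parts
```

with a threshold of 10, a candidate built from stream positions 3, 4, 5, 40, and 41 is split into two candidates, [3, 4, 5] and [40, 41]; a new boundary would then be computed for each part.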
measuring the distance in the content stream and defining a threshold on the distance facilitates the distinction between the layout and the rest of the page. this technique also might be useful to automatically extract the template used for a presentation, although this transcends the scope of this publication.

clustering of textual operators

the same algorithm that clusters graphical elements can cluster parts of text. detecting larger logically consistent parts of text is important because they should be treated as single entities during subsequent processing. this comprises, for example, inclusion inside a figure candidate (e.g., captions of axes, parts of a legend) and classification of a text paragraph as a figure caption.

inclusion of text parts

the next step in figure extraction involves the inclusion of lost text parts inside figure candidates. at the stage of operations clustering, only the operations of the same type (graphical or textual) were considered. the results of those initial steps become the input to the clustering algorithm that will detect relations between previously detected entities. by doing this, we move one level farther in the process of abstracting from operations. we start from basic meaningless operations. later we detect parts of graphics and text, and finally we are able to see the relations between both.

not all clusters detected at this stage are interesting because some might consist solely of text areas. only those results that include at least one graphical cluster may be subsequently considered figure candidates. another round of heuristics marks unnecessary intermediate results as deleted. applied methods are very similar to those described in “filtering of clusters” (above), only thresholds deciding on the rejections must change because we operate on geometrically much larger entities.
also the way of application is different—candidates rejected at this stage can be later restored to the status of a figure. instead of permanently removing, heuristics of this stage only mark figure candidates as rejected. this happens in the case of the candidates having incorrect aspect ratio, incorrect sizes or consisting only of horizontal lines (which is usually the case with mathematical formulas but also tables). in addition to using the aforementioned heuristics, having clusters consisting of a mixture of textual and graphical operations allows the application of new heuristics. during the next phase, we analyze the type of operations rather than their relative location. in some cases, steps described earlier might detect objects that should not be considered a figure, such as text surrounded by a frame. this situation can be recognized by the calculation of a ratio between the number of graphical and textual operations in the content stream of a figure candidate. in our approach we have defined a threshold that indicates which figure candidates should be rejected because they contain too few graphics. this allows the removal of, for instance, blocks of text decorated with graphics for aesthetic reasons. the ratio between numbers of graphical and textual operations is smaller for tables than for figures, so extending the heuristic with an additional threshold could improve the table–figure distinction. another heuristic analyzes ratio between the total area of graphical operations and the area of the entire figure candidate. subsequently, we mark as deleted the figure candidates containing horizontal lines as the only graphical operations. these candidates describe tables or mathematical formulas that have survived previous steps of the algorithm. tables can be reverted to the status of figure candidates in later stages of processing. figure candidates that survive all the phases of filtering are finally considered to be figures. 
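the operation-type heuristics of this stage can be sketched as follows. the sketch is illustrative only: the `Candidate` record, the function name, and the default ratio threshold are hypothetical. it does reflect one detail stated above, namely that candidates are marked as rejected rather than removed, so that a later stage can restore them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    n_graphical: int                       # graphical operations in the candidate
    n_textual: int                         # textual operations in the candidate
    only_horizontal_lines: bool = False    # all graphics are horizontal lines
    rejected_because: Optional[str] = None # None means "still a figure candidate"


def apply_type_heuristics(c: Candidate, min_ratio: float = 0.2) -> Candidate:
    """Mark (not delete) candidates that look like framed text, tables, or formulas."""
    # Ratio of graphical to textual operations, written without a division so
    # that candidates with no text at all pass the test.
    if c.n_graphical < min_ratio * c.n_textual:
        c.rejected_because = "too few graphical operations"
    elif c.only_horizontal_lines:
        c.rejected_because = "only horizontal lines (table or formula)"
    return c
```

keeping the reason on the record, instead of dropping the candidate, is what allows a rejected table to be reverted to a figure candidate later, for instance when a caption is matched to it.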
figure 2 shows a fragment of a publication page with indicated text areas and final figure candidates detected by the algorithm.

figure 2. a fragment of the pdf page with boxes around every detected text area and each figure candidate. dashed rectangles indicate figure candidates. solid rectangles indicate text areas.

detection and matching of captions

the input of the part of the algorithm responsible for detecting figure captions consists of previously determined figures and all text clusters. the observation of scientific publications shows that, typically, captions of figures start with a figure identifier (for instance, see the grammar for figure captions proposed by bathia, lahiri, and mitra).22 the identifier usually starts with a word describing a figure type and is followed by a number or some other unique identifier. in more complex documents, the figure number might have a hierarchical structure reflecting, for example, the chapter number. the set of possible figure types is very limited. in the case of hep publications, the most common ones include the words “figure” and “plot” and different variations of their spelling and abbreviation.

during the first step of the caption detection, all text clusters from the publication page are tested for the possibility of being a caption. this consists of matching the beginning of the text contained in a textual cluster with a regular expression determining what is a figure caption. the role of the regular expression is to select strings starting with one of the predefined words, followed by an identifier or the beginning of a sentence. the identifier is subsequently extracted and included in the metadata of a caption. the caption detection has to be designed to reject paragraphs of the type “figure 1 presents results of (. . .)”.
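a possible shape of such a caption test is sketched below in python. the exact regular expression used by the authors is not given, so the pattern, the accepted prefixes, and the function name `caption_identifier` are assumptions; only numeric, optionally hierarchical identifiers such as 3 or 2.1 are handled. running-text sentences are rejected here by refusing a lowercase letter directly after the identifier.

```python
import re
from typing import Optional

# Hypothetical pattern: a figure-type word (case-insensitive) followed by a
# numeric, possibly hierarchical identifier such as "3" or "2.1".
CAPTION_START = re.compile(r"^\s*(?i:figure|fig\.?|plot)\s*(\d+(?:\.\d+)*)")


def caption_identifier(text: str) -> Optional[str]:
    """Return the figure identifier if `text` looks like a caption, else None."""
    match = CAPTION_START.match(text)
    if match is None:
        return None
    # Skip separators after the identifier, then reject running text such as
    # "Figure 1 presents results of ...", which continues in lowercase.
    rest = text[match.end():].lstrip(" \t:.-")
    if rest and rest[0].islower():
        return None
    return match.group(1)
```

under these assumptions, "Figure 1: Energy spectrum" and "Fig. 2.1 Invariant mass" are accepted with identifiers 1 and 2.1, while "Figure 1 presents results of the fit" is rejected.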
to achieve this, we reject the possibility of having any lowercase text after the figure identifier. having the set of all the captions, we start searching for corresponding figures. all previous steps of the algorithm take into account the division of a page into text columns (see “detection of the page layout” below). when matching captions with figure candidates, we do not take into account the page layout. matching between figure candidates and captions happens at every document page separately. we consider every detected caption once, starting with those located at the top of the page and moving down toward the end. for every caption we search figure candidates lying nearby. first we search above the caption and, in the case of failure, we move below the caption. we take into account all figure candidates, including those rejected by heuristics. in the case of finding multiple figure candidates corresponding to a caption, we merge them into a single figure, treating previous candidates as subfigures of a larger figure. we also include small portions of text and graphics previously rejected from figure candidates that lie between figure and caption and between different parts of a figure. these parts of text usually contain identifiers of the subfigures. the amount of unclustered content that can be included in a figure is a parameter of the extraction algorithm and is expressed as a percentage of the height of the document page. it might happen that captions are located in a completely different location, but this case is rare and tends to appear in older publications. the distance from the figure is calculated based on the page geometry. the captions should not be too distant from the figure. generation of the output the choice of the format in which data should be saved at the output of the extraction process should take into account further requirements. 
the most obvious use case of displaying figures to end users in response to text-based search queries does not yield very sophisticated constraints. a simple raster graphic annotated with captions and possibly some extracted portions of metadata would be sufficient. unfortunately, the process of generating raster representations of figures might lose many important pieces of information that could be used in the future for an automatic analysis.

to store as much data as possible, apart from storing the extracted figures in a raster format (e.g., png), we also decided to preserve their original vector character. vector graphics formats, similarly to pdf documents, contain information about graphical primitives. primitives can be organized in larger logical entities. sometimes rendering of different primitives leads to a modification of the same pixel of the resulting image. such a situation might happen, for example, when circles are used to draw data points lying nearby on the same plot. to avoid such issues, we convert figures into scalable vector graphics (svg) format.23

on the implementation level, the extraction of the vector representation of a figure proceeds in a manner similar to regular rendering of a pdf document. the interpreter preserves the same elements of the state and allows their modification by transformation operations. a virtual canvas is created for every detected figure. the content stream of the document is processed and all the transformation operations are executed, modifying the interpreter’s state. the textual and graphical operators are also interpreted, but they affect only the appropriate canvas of the figure to which the operation belongs. if a particular operation does not belong to any figure, no canvas is affected.
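the routing of operations to per-figure canvases can be sketched as below. this is a toy model under several assumptions: operations are plain dictionaries with hypothetical keys (`figure`, `kind`, coordinates, `text`), transformation operations and the interpreter state are left out entirely, and each canvas simply records one svg primitive per operation, with text written in a generic font.

```python
from collections import defaultdict
from typing import Dict, Iterable, List
from xml.sax.saxutils import escape


def primitive(op: dict) -> str:
    """Translate one (already transformed) operation into an SVG primitive."""
    if op["kind"] == "line":
        return '<line x1="{x1}" y1="{y1}" x2="{x2}" y2="{y2}" stroke="black"/>'.format(**op)
    if op["kind"] == "text":
        # A generic font replaces the embedded PDF font (assumed simplification).
        return '<text x="{}" y="{}" font-family="sans-serif">{}</text>'.format(
            op["x"], op["y"], escape(op["text"])
        )
    raise ValueError("unsupported operation kind: {!r}".format(op["kind"]))


def render(stream: Iterable[dict]) -> Dict[int, str]:
    """Build one SVG document per figure from a content stream."""
    canvases: Dict[int, List[str]] = defaultdict(list)
    for op in stream:
        figure = op.get("figure")
        if figure is None:  # operation outside every figure: no canvas is affected
            continue
        canvases[figure].append(primitive(op))
    return {
        figure: '<svg xmlns="http://www.w3.org/2000/svg">\n' + "\n".join(parts) + "\n</svg>"
        for figure, parts in canvases.items()
    }
```

a real interpreter would also track the current transformation matrix and graphics state while reading the stream; only the dispatch step is shown here.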
the behaviour of graphical canvases used during the svg generation is different from the case of raster rendering. instead of creating graphical output, every operation is transformed into a corresponding primitive and saved within an svg file. pdf was designed in such a manner that the number of external dependencies of a file is minimized. this design decision led to the inclusion of the majority of fonts in the document itself. it would be possible to embed font glyphs in the svg file and use them to render strings. however, for the sake of simplicity, we decided to omit font definitions in the svg output. a text representation is extracted from every text operation, and the operation is replaced by a svg text primitive with a standard font value. this simplification affects what the output looks like, but the amount of formatting information that is lost is minimal. moreover, this does not pose a problem because vector representations are intended to be used during automatic analysis of figures rather than for displaying purposes. a possible extension of the presented method could involve embedding complete information about used glyphs. finally, the generation of the output is completed with some metadata elements. an exhaustive categorization of the metadata that can be compiled for figures could be the customization of the one proposed by liu et al. for table metadata.24 in the case of figures, the following categories could be distinguished: (1) environment/geography metadata (information of the document where the figure is located); (2) affiliated metadata (e.g., captions, references, or footnotes); (3) layout metadata (information about the original visualization of the figure); (4) content data; and (5) figure type metadata. for the moment, we compile only environment/geography metadata and affiliated metadata. 
the geography/environment metadata consists of the document title, the document authors, the document date (creation and publication), and the exact location of a figure inside a publication (page and boundary). most of these elements are provided by simply referencing the original publication in the inspire repository. the affiliated metadata consists of the text caption and the exact location of the caption in the publication (page and boundary). in the future, metadata from other categories will be annotated for each figure.

detection of the page layout

figure 3. sample page layouts that might appear in a scientific publication. the black color indicates areas where content is present.

in this section we discuss how to detect the page layout, an issue which has been omitted in the main description of the extraction algorithm, but which is essential for an efficient detection of figures. figure 3 depicts several possibilities of organising content on the page. as mentioned in previous sections, the method of clustering operations based on their geometrical position may fail in the case of documents having a complex page layout. the content appearing in different columns should never be considered as belonging to the same figure. this cannot be assured without enforcing additional constraints during the clustering phase. to address this difficulty, we enhanced the figure extractor with a pre-processing phase of detecting the page layout. being able to identify how the document page is divided into columns enables us to execute the clustering within every column separately.

what a page layout is seems intuitively obvious, but to provide a method of calculating one we need a more formal definition, which we provide below. by the layout of a page, we understand a particular division of a page into areas called columns. each area is a union of disjoint rectangles.
the division of a page into areas must satisfy a set of conditions summarized in definition 3.

definition 3: let $P$ be a rectangle representing the page. the set $D$ containing subareas of a page is called a page division if and only if

$$\bigcup_{Q \in D} Q = P$$
$$\forall x, y \in D,\ x \neq y:\ x \cap y = \emptyset$$
$$\forall Q \in D:\ Q \neq \emptyset$$
$$\forall Q \in D\ \exists R = \{x : x \text{ is a rectangle},\ \forall y \in R \setminus \{x\}:\ y \cap x = \emptyset\}:\ Q = \bigcup_{x \in R} x$$

every element of a division is called a page area. to be considered a page layout, borders of areas from the division must not intersect the content of the page.

definition 3 does not guarantee that the layout is unique. a single page might be assigned different divisions satisfying the definition. additionally, not all valid page layouts are interesting from the point of view of figure detection. the segmentation algorithm calculates one such division, imposing additional constraints on the detected areas. the layout-calculation procedure utilizes the notion of separators, introduced by definition 4.

definition 4: a vertical (or horizontal) line inside a page or on its borders is called a separator if its horizontal (vertical) distance from the page content is larger than a given constant value.

the algorithm consists of two stages. first, the vertical separators of a sufficient length are detected and used to divide the page into disjoint rectangular areas. each area is delimited by two vertical lines, each of which forms a consistent interval inside one of the detected vertical separators. at this stage, horizontal separators are completely ignored. figure 4 shows a fragment of a publication page processed by the first stage of the layout detection. the upper horizontal edge of one of the areas lies too close to two text lines.
with the constant of definition 4 chosen to be sufficiently large, this edge would not be a horizontal separator and thus the generated division of the page would require additional processing to become a valid page layout. the second stage of the algorithm transforms the previously detected rectangles into a valid page layout by splitting rectangles into smaller parts and by joining appropriate rectangles to form a single area.

figure 4. example of intermediate layout-detection results requiring refinement.

algorithm 2 shows the pseudo-code of the detection of vertical separators. the input of the algorithm consists of the image of the publication page. the output is a list of vertical separators aggregated by their x-coordinates. every element of this list consists of two elements: an integer indicating the x-coordinate and the list of y-coordinates describing the separators. the first element of this list indicates the y-coordinate of the beginning of the first separator. the second element is the y-coordinate of the end of the same separator. the third and fourth elements describe the second separator and the same mechanism is used for the remaining separators (if they exist).

the algorithm proceeds according to the sweeping principle known from computational geometry.25 the algorithm reads the publication page starting from the left. for every x-coordinate value, a set of corresponding vertical separators is detected (lines 9–18). vertical separators are searched for as consistent sequences of blank points. a point is considered blank if all the points in its horizontal surrounding of the radius defined by the constant from definition 4 are of the background colour. not all blank vertical lines can be considered separators. short, empty spaces usually delimit lines of text or different small units of the content. in line 11 we test detected vertical separators for being long enough.
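the per-column scan just described can be sketched in python as follows. the page representation (a list of rows of booleans, true where ink is present) and the names `is_blank` and `blank_runs` are assumptions made for this sketch; the length test mirrors the comparison in line 11 of algorithm 2, and the cross-column maximisation of lines 19–31 is not covered.

```python
from typing import List, Sequence, Tuple


def is_blank(image: Sequence[Sequence[bool]], x: int, y: int, radius: int) -> bool:
    """A point is blank if every point within `radius` horizontally is background.

    `image[y][x]` is assumed to be True where the page has content (ink).
    """
    row = image[y]
    lo = max(0, x - radius)
    hi = min(len(row) - 1, x + radius)
    return not any(row[i] for i in range(lo, hi + 1))


def blank_runs(column: Sequence[bool], min_height: int) -> List[Tuple[int, int]]:
    """Return (start_y, end_y) for every sufficiently long run of blank points.

    `column[y]` is True when the point at height y is blank.
    """
    runs: List[Tuple[int, int]] = []
    start = None
    # A trailing non-blank sentinel closes a run that reaches the page bottom.
    for y, blank in enumerate(list(column) + [False]):
        if blank:
            if start is None:
                start = y
        elif start is not None:
            if y - start - 1 > min_height:  # same comparison as line 11
                runs.append((start, y - 1))
            start = None
    return runs
```

for one x-coordinate, the column passed to `blank_runs` would be built as `[is_blank(image, x, y, radius) for y in range(len(image))]`.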
if a separator has been detected in a particular column of a publication page, the adjacent columns also tend to contain similar separators. lines 19–31 of the algorithm are responsible for selecting the longest candidate among the adjacent columns of the page. the maximization is performed across a set of adjacent columns for which at least one separator exists.

algorithm 2. detecting vertical separators.

1: input: the page image
2: output: vertical separators of the input page
3: list separators ← ∅
4: int max_weight ← 0
5: boolean maximising ← false
6: for all x ∈ {minx … maxx} do
7:   emptyb ← 0, current_eval ← 0
8:   empty_areas ← list()
9:   for all y ∈ {0 … page_height} do
10:    if point at (x, y) is not blank then
11:      if y − emptyb − 1 > heightmin then
12:        empty_areas.append(emptyb)
13:        empty_areas.append(y = page_height ? y : y − 1)
14:        current_eval ← current_eval + y − emptyb
15:      end if
16:      emptyb ← y + 1
17:    end if
18:  end for
     {we have already processed the entire column. now we are comparing with adjacent already processed columns}
19:  if max_weight < current_eval then
20:    max_weight ← current_eval
21:    max_separators ← empty_areas
22:    maxx ← x
23:  end if
24:  if maximising then
25:    if empty_areas = ∅ then
26:      separators.add()
27:      maximising ← false, max_weight ← 0
28:    end if
29:  else
30:    maximising ← (empty_areas ≠ ∅)
31:  end if
32: end for
33: return separators

the detected separators are used to create the preliminary division of the page, similar to the one from the example of figure 4. as with the previous step, separators are considered one by one in the order of increasing x-coordinate. at every moment of the execution, the algorithm maintains a division of the page into rectangles. this division corresponds only to the already detected vertical separators. updating the previously considered division is facilitated by processing separators in a particular well-defined order.

before presenting the final outcome, the algorithm must refine the previously calculated division. this happens in the second phase of the execution. all the horizontal borders of the division are then moved along adjacent vertical separators until they become horizontal separators in the sense of definition 4. typically, moving the horizontal borders results in dividing already existing rectangles into smaller ones. if such a situation happens, both newly created parts are assigned to different page layout areas. sometimes, when moving separators is not possible, different areas are combined together, forming a larger one.

tuning and testing

the extraction algorithm described here has been implemented in java and tested on a random set of scientific articles coming from the inspire repository. the testing procedure has been used to evaluate the quality of the method, and also allowed us to tune the parameters of the algorithm to maximize the outcomes.

preparation of the testing set

to prepare the testing set, we randomly selected 207 documents stored in inspire. in total, these documents consisted of 3,728 pages which contained 1,697 figures altogether. the records have been selected according to a uniform probability distribution across the entire record space. this way, we have created a collection that is representative of the entire inspire repository, including historical entries.
currently, inspire consists of: 1,140 records describing publications written before 1950; 4,695 between 1950 and 1960; 32,379 between 1960 and 1970; 108,525 between 1970 and 1980; 167,240 between 1980 and 1990; 251,133 between 1990 and 2000; and 333,864 in the first decade of the twenty-first century. in total, up to july 2012, inspire manages 952,026 records. it can be seen that the rate of growth has increased with time and most of the inspire documents come from the last decade.

the results on such a testing set should accurately estimate the efficiency of extraction for existing documents but not necessarily for new documents being ingested into inspire. this is because inspire contains entries describing old articles that were created using obsolete technologies or scanned and encoded in pdf. the extraction algorithm is optimized for born-digital objects. to test the hypothesis that the extractor provides better results for newer papers, the testing set has been split into several subsets. the first set consists of publications published before 1980. the rest of the testing set has been split into subsets corresponding to decades of publication.

to simplify the counting of correct figure detections and to provide a more reliable execution and measurement environment, every testing document has been split into multiple pdf documents, each consisting of a single page. subsequently, every single-page document has been manually annotated with the number of figures appearing inside.

execution of the tests

the efficient execution of the testing was possible thanks to a special script executing the plots extractor on every single page separately and then computing the total number of successes and failures.
the script allows the execution of tests in a distributed heterogeneous environment and allows dynamic connection and disconnection of computing nodes. in the case of a software failure, the extraction request is resubmitted to a different computation node, which avoids problems related to the configuration of a worker node rather than to the algorithm implementation itself.

during the preparation of the testing set, we manually annotated all the expected extraction results. subsequently, the script compared these metadata with the output of the extractor. using aggregated numbers from all extracted pages allowed us to calculate efficiency measures of the extraction algorithm. as quality measures, we used recall and precision.26 their definitions are included in the following equations:

\[
\text{recall} = \frac{\#\,\text{correctly extracted figures}}{\#\,\text{figures present in the test set}}
\]

\[
\text{precision} = \frac{\#\,\text{correctly extracted figures}}{\#\,\text{extracted figures}}
\]

at every place where we needed a single comparable quality measure rather than two semi-independent numbers, we have used a harmonic average of the precision and the recall.27

table 1 summarizes the results obtained during the test execution for every subset of our testing set. figure 5 shows the dependency of recall and precision on the time of publication. the extractor parameters used in this test execution were chosen based on intuition and a small number of manually triggered trials. in the next section we describe an automatic tuning procedure we have used to find better algorithm arguments.

                                          –1980   1980–90   1990–2000   2000–10   2010–12
number of existent figures                  114        60         170       783       570
number of correctly detected figures         59        53         164       703       489
number of incorrectly detected figures       26        78          65        40        73
total number of pages                        85       136         760     1,919       828
number of correctly processed pages          20        44         712     1,816       743

table 1. results of the test execution.

figure 5. recall and precision as functions of decade of the date of the publication.

it can be seen that, as expected, the efficiency increases with the increasing time of publication. the total recall and precision for all samples since 1990, which constitute a majority of the inspire corpus, were both 88 percent.

precision and recall based on the correctly detected figures do not give a full picture of the algorithm efficiency because the extraction has been executed on a number of pages not containing any figures. the correctly extracted pages not having any figures do not appear in the recall and precision statistics because in their case the expected and detected numbers of figures are both equal to 0.

besides recall and precision, figure 5 also depicts the fraction of pages that have been extracted correctly. taking into account the samples since 1990, 3,271 pages out of 3,507 have been detected completely correctly, which makes a 93 percent success rate counted by number of pages. as can be seen, this measure is higher than both the precision and the recall.

the analysis of the extractor results in the case of failure shows that in many cases, even if results are not completely correct, they are not far from the expectation. there are different reasons for the algorithm failing. some of them may result from a non-optimal choice of algorithm parameters, others from the document layout being too far from the assumed one. in some rare cases, even manual inspection of the document does not allow an obvious identification of figures.

the automatic tuning of parameters

in the previous section we have shown the results obtained by executing the extraction algorithm on a sample set.
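as an illustration of how the quality measures from the previous section can be recomputed from the per-decade counts of table 1, the following python sketch aggregates the three subsets since 1990. it rests on one assumption that the text does not state explicitly: the number of extracted figures is taken to be the sum of correctly and incorrectly detected figures. the function names are hypothetical.

```python
def recall(correct: int, present: int) -> float:
    """Fraction of the figures present in the test set that were found."""
    return correct / present


def precision(correct: int, incorrect: int) -> float:
    """Fraction of extracted figures that are correct.

    Assumption: extracted = correctly detected + incorrectly detected.
    """
    return correct / (correct + incorrect)


def harmonic_mean(p: float, r: float) -> float:
    """Single comparable measure combining precision and recall."""
    return 2 * p * r / (p + r)


# Columns 1990-2000, 2000-10 and 2010-12 of table 1.
present = sum([170, 783, 570])    # existent figures
correct = sum([164, 703, 489])    # correctly detected figures
incorrect = sum([65, 40, 73])     # incorrectly detected figures
pages = sum([760, 1919, 828])     # total number of pages
pages_ok = sum([712, 1816, 743])  # correctly processed pages

r = recall(correct, present)
p = precision(correct, incorrect)
f = harmonic_mean(p, r)
page_rate = pages_ok / pages

print(f"recall={r:.3f} precision={p:.3f} f={f:.3f} pages={page_rate:.3f}")
```

the page-level totals reproduce the 3,271 out of 3,507 pages quoted in the text; the exact recall and precision values depend on the assumption about the number of extracted figures.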
during this execution we used extractor arguments that seemed to be the most correct based on our observation but also on other research (typical sizes of figures, margin sizes, etc.).28 this way of configuring the algorithm was useful during development, but it is not likely to yield the best possible results. to find better parameters, we have implemented a method of automatic tuning. the metrics described in the previous section provide a good method of measuring the efficiency of the algorithm running with given parameters.

the choice of optimal parameters can depend on the choice of documents on which the extraction is to be performed. the way in which the testing set was selected allowed us to use it as representative of hep publications. to tune the algorithm, we used as a reference a subset of the testing set described in the previous step. the subset consisted of all entries created after 1990. this allowed us to minimize the presence of scanned documents, which, by design, cannot be correctly processed by our method.

the adjustment of parameters was performed by a dedicated script, which executed the extraction using various parameter values and read the results. the script was configured with a list of tuneable parameters together with their types and ranges of allowed values. additionally, the script knew the believed best value, which was the one used in the previous testing.

to decrease the complexity of training, we have made several assumptions about the parameters. these assumptions are only an approximation of the real nature of the parameters, but practice has shown that they are good enough to permit the optimization:

• we assume that the precision and recall are continuous with respect to the parameters. this allows us to assume that the efficiency of the algorithm for parameter values close to a given one will be close.
the optimization has proceeded by sampling the parametric space in a number of points and executing tests using the selected points as parameter values. having n parameters to optimize and dividing the space of every parameter into m regions leads to the execution of $m^n$ tests. the execution of every test is a time-consuming operation due to the size of the training set.

• we assume that parameters are independent of each other. this means that we can divide the problem of finding an optimal solution in the n-dimensional space of n configuration arguments into finding n solutions in 1-dimensional subspaces. such an assumption seems to be intuitive and considerably reduces the number of necessary tests from $O(m^n)$ to $O(m \cdot n)$, where m is the number of samples taken from a single dimension.

in our tests, the parametric space has been divided into 10 equal intervals in every direction. in addition to checking the extraction quality in those points, we have executed one test for the so-far best argument. in order to increase the level of fine-tuning of the algorithm, each test has been re-executed in the region where the chances of finding a good solution were considered the highest. this was a region centred around the highest result and having a radius of 10 percent of the parameter space.

figure 6 and figure 7 show the dependency of the recall and the precision on an algorithm parameter. the parameter depicted in figure 6 indicates what minimal aspect ratio the figure candidate must have in order to be considered a correct figure. it can be seen that tuning this heuristic increases the efficiency of the extraction. moreover, the dependency of recall and precision on the parameter is monotonic, which is the most compatible with the chosen optimization method. the parameter of figure 7 specifies which fraction of the area of the entire figure candidate has to be occupied by graphical operations.
this parameter has a lower influence on the extraction efficiency. such a situation can happen when more than one heuristic influences the same aspect of the extraction. this contradicts the assumption of parameter independence, but we have decided to use the present model for simplicity.

figure 6. effect of the minimal aspect ratio on precision and recall.

figure 7. effect on the precision and recall of the area fraction occupied by graphical operations.

after executing the optimization algorithm, we have managed to achieve a recall of 94.11 percent and a precision of 96.6 percent, which is a considerable improvement compared to previous results of 88 percent.

conclusions and future work

this work has presented a method for extracting figures from scientific publications in a machine-readable format, which is the main step toward the development of services enabling access and search of images stored in scientific digital libraries. in recent years, figures have been gaining increasing attention in the digital libraries community. however, little has been done to decipher the semantics of these graphical representations and to bridge the semantic gap between content that can be understood by machines and content that is managed by digital libraries. extracting figures and storing them in a uniform and machine-readable format constitutes the first step towards the extraction and the description of the internal semantics of figures. storing semantically described and indexed figures would open completely new possibilities of accessing the data and discovering connections between different types of publishing artefacts and different resources describing related knowledge.29

our method of detecting fragments of pdf documents that correspond to figures is based on a series of observations of the character of publications.
however, tests have shown that additional work is needed to improve the correctness of the detection. also, the performance should be re-evaluated after we have a large set of correctly annotated figures, confirmed by users of our system.

the heuristics used by the algorithm are based on a number of numeric parameters that we have tried to optimize using automatic techniques. the tuning procedure has made several arbitrary assumptions on the nature of the dependency between parameters and extraction results. a future approach to the parameter optimization, requiring much more processing, could involve the execution of a genetic algorithm that would treat the parameters as gene samples.30 this could potentially allow the discovery of a better parameter set because a smaller set of assumptions would be imposed on the parameters. a vector of algorithm parameters could play the role of a gene, and random mutations could be introduced to previously considered and subsequently crossed genes. the evaluation and selection of surviving genes could be performed using the metrics described previously.

another approach to improving the quality of the tuning could involve extending the present algorithm by a discovery of mutually dependent parameters and the usage of special techniques (relaxing the assumptions) to fine-tune in subspaces spanned by these parameters.

all of our experiments have been performed using a corpus of publications from hep. the usage of the extraction algorithm on a different corpus would require tuning the parameters for the specific domain of application. for the area of hep, we can also consider preparing several sets of execution parameters varying by decade of document publication or by other easy-to-determine characteristics. subsequently, we could decide which extraction method to run, based on those metrics.
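for comparison with the alternatives above, the one-dimensional tuning that relies on the independence assumption can be sketched in python as follows. this is only an illustration under stated assumptions, not the authors' script: `score` stands for any quality measure (for example the harmonic average of precision and recall obtained from a full test run), and the names `tune_independently`, `ranges`, `believed_best`, `samples` and `radius` are hypothetical.

```python
from typing import Callable, Dict, Tuple

Params = Dict[str, float]


def tune_independently(
    score: Callable[[Params], float],
    ranges: Dict[str, Tuple[float, float]],
    believed_best: Params,
    samples: int = 10,
    radius: float = 0.1,
) -> Params:
    """Tune every parameter on its own, keeping the others at their
    believed-best values (the independence assumption).

    Needs on the order of samples * len(ranges) evaluations of score
    instead of samples ** len(ranges) for a full grid.
    """
    tuned = dict(believed_best)
    for name, (low, high) in ranges.items():

        def evaluate(value: float) -> float:
            return score({**believed_best, name: value})

        # Coarse pass: equal intervals plus the so-far best argument.
        step = (high - low) / samples
        coarse = [low + i * step for i in range(samples + 1)]
        coarse.append(believed_best[name])
        best = max(coarse, key=evaluate)

        # Fine pass: region centred on the best coarse result, with a
        # radius given as a fraction of the parameter range.
        span = radius * (high - low)
        fine_low, fine_high = max(low, best - span), min(high, best + span)
        fine_step = (fine_high - fine_low) / samples
        fine = [fine_low + i * fine_step for i in range(samples + 1)]
        tuned[name] = max(fine + [best], key=evaluate)
    return tuned
```

in practice every call to `score` would be a complete run of the extractor over the reference subset, which is why reducing the number of evaluations matters.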
in addition to a better tuning of the existing heuristics, there are improvements that can be made at the level of the algorithm. for example, we could extend the process of clustering text parts. in the current implementation, the margins by which textual operations are extended during the clustering process are fixed as algorithm parameters. this approach proved to be robust in most cases. in fact, distances between text lines tend to differ depending on the currently utilized style. every text portion tends to have one style that dominates. an improved version of the text-clustering algorithm could use local rather than global properties of the content. this would not only allow correct handling of entire documents written using different text styles, but also help to manage cases of single paragraphs differing from the rest of the content.

another important, not-yet-implemented improvement related to figure metadata is the automatic extraction of figure references from the text content. important information about figure content might be stored in the surroundings of the place where the publication text refers to a figure.

furthermore, the metadata could be extended by the usage of some type of classifier that would assign a graphics type to the extracted result. currently, we are only distinguishing between tables and figures based on simple heuristics involving the number and type of graphical areas and the text inside the detected caption. in the future, we could distinguish line plots from photos, histograms, and so on. such a classifier could be implemented using artificial intelligence techniques such as support vector machines.31

finally, partial results of the figures extraction algorithm might be useful in performing other pdf analyses:

• the usage of clustered text areas could allow a better interpretation and indexing of textual content stored in digital libraries with full-text access.
clusters of text tend to describe logical parts like paragraphs, section and chapter titles, etc. a simple extension of the current schema could allow the extraction of the predominant formatting style of the text encoded in a page area. text parts written in different styles could be indexed in a different manner, giving, for instance, more importance to segments written with a larger font.

• we mentioned that the algorithm detects not only figures, but also tables. a heuristic is used to distinguish tables from different types of figures. our present effort concentrates on the correct treatment of figures, but a useful extension could allow extraction of different types of entities. for instance, another type of content ubiquitous in hep documents is mathematical formulas. thus, in addition to figures, it would be important to extract tables and formulas in a structured format allowing further processing. the internal architecture of the implemented prototype of the figure extractor allows easy implementation of extension modules that can compute other properties of pdf documents.

acknowledgements

this work has been partially supported by cern, and the spanish government through the project tin2012-37826-c02-01.

references

1. saurabh kataria, “on utilization of information extracted from graph images in digital documents,” bulletin of ieee technical committee on digital libraries 4, no. 2 (2008), http://www.ieee-tcdl.org/bulletin/v4n2/kataria/kataria.html.
2. marti a. hearst et al., “exploring the efficacy of caption search for bioscience journal search interfaces,” proceedings of the workshop on bio nlp 2007: biological, translational and clinical language processing: 73–80, http://dl.acm.org/citation.cfm?id=1572406.
3. lisa johnston, “web reviews: see the science: scitech image databases,” sci-tech news 65, no.
3 (2011), http://jdc.jefferson.edu/scitechnews/vol65/iss3/11.
4. annette holtkamp et al., “inspire: realizing the dream of a global digital library in high-energy physics,” 3rd workshop conference: towards a digital mathematics library, paris, france (july 2010): 83–92.
5. piotr praczyk et al., “integrating scholarly publications and research data—preparing for open science, a case study from high-energy physics with special emphasis on (meta)data models,” metadata and semantics research—ccis 343 (2012): 146–57.
6. piotr praczyk et al., “a storage model for supporting figures and other artefacts in scientific libraries: the case study of invenio,” 4th workshop on very large digital libraries (vldl 2011), berlin, germany (2011).
7. “sciverse science direct: image search,” elsevier, http://www.info.sciverse.com/sciencedirect/using/searching-linking/image.
8. guenther eichhorn, “trends in scientific publishing at springer,” in future professional communication in astronomy ii (new york: springer, 2011), doi: 10.1007/978-1-4419-8369-5_5.
9. william brouwer et al., “segregating and extracting overlapping data points in two-dimensional plots,” proceedings of the 8th acm/ieee-cs joint conference on digital libraries (jcdl 2008), new york: 276–79.
10. saurabh kataria et al., “automatic extraction of data points and text blocks from 2-dimensional plots in digital documents,” proceedings of the 23rd aaai conference on artificial intelligence, chicago (2008): 1169–74.
11. saurabh kataria, “on utilization of information extracted from graph images in digital documents,” bulletin of ieee technical committee on digital libraries 4, no. 2 (2008), http://www.ieee-tcdl.org/bulletin/v4n2/kataria/kataria.html.
12.
ying liu et al., “tableseer: automatic table metadata extraction and searching in digital libraries,” proceedings of the 7th acm/ieee-cs joint conference on digital libraries (jcdl’07), vancouver (2007): 91–100.
13. william s. cleveland, “graphs in scientific publications,” american statistician 38, no. 4 (1984): 261–69, doi: 10.1080/00031305.1984.10483223.
14. hui chao and jian fan, “layout and content extraction for pdf documents,” document analysis systems vi, lecture notes in computer science 3163 (2004): 213–24.
15. at every moment of the execution of a postscript program, the interpreter maintains many variables. some of them encode current positions within the rendering canvas. such positions are used to locate the subsequent character or to define the starting point of the subsequent graphical primitive.
16. transformation matrices are encoded inside the interpreter’s state. if an operator requires arguments indicating coordinates, these matrices are used to translate the provided coordinates to the coordinate system of the canvas.
17. graphical operators are those that trigger the rendering of a graphical primitive.
18. textual operations are the pdf instructions that cause the rendering of the text. textual operations receive the string representation of the desired text and use the current font, which is saved in the interpreter’s state.
19. operations that do not produce any visible output, but solely modify the interpreter’s state.
20. herbert edelsbrunner and hermann a. maurer, “on the intersection of orthogonal objects,” information processing letters 13, nos. 4, 5 (1981): 177–81.
21. thomas h. cormen, charles e. leiserson, and ronald l.
rivest, introduction to algorithms (cambridge: mit electrical engineering and computer science series, 1990).
22. sumit bhatia, shibamouli lahiri, and prasenjit mitra, “generating synopses for document-element search,” proceedings of the 18th acm conference on information and knowledge management, new york (2009): 2003–6, doi: 10.1145/1645953.1646287.
23. jon ferraiolo, ed., “scalable vector graphics (svg) 1.0 specification,” w3c recommendation 01 september 2001, http://www.w3.org/tr/svg10/.
24. liu et al., “tableseer.”
25. cormen, leiserson, and rivest, introduction to algorithms.
26. ricardo a. baeza-yates and berthier ribeiro-neto, modern information retrieval (boston: addison-wesley, 1999).
27. ibid.
28. cleveland, “graphs in scientific publications.”
29. praczyk et al., “a storage model for supporting figures and other artefacts in scientific libraries.”
30. stuart russell and peter norvig, artificial intelligence: a modern approach (third edition) (prentice hall, 2009).
31. sergios theodoridis and konstantinos koutroumbas, pattern recognition (third edition) (boston: academic press, 2006).

108 information technology and libraries | june 2006

tutorial

writing your first scholarly article: a guide for budding authors in librarianship

scott nicholson

this series of questions and answers is designed to help you take the first steps toward the successful production of a scholarly article in librarianship.
you may find yourself in a library position that requires writing or you may have just decided that you are ready to share your findings, experiences, and knowledge with the current and future generations of librarians. while following the guidelines listed here will not guarantee that you will be successful, these steps will take you closer to discovering the thrill of seeing your name in print and making a difference in the field. what should i write about? perhaps you already have an idea based upon your experiences and expertise, or perhaps you aren’t sure which of those ideas you should write about. the best way to start writing is to read other articles! many scholarly articles end with a future research section that outlines other projects and questions that the article suggests. it is useful to contact the author of a piece that holds a future research seed to ensure that the author has not already taken on that challenge. sometimes, the original author may be interested in collaborating with you to explore that next question. how do i start? scholarship is an iterative process, in that works that you produce are bricks in an ever-rising wall. your brick will build upon the works of others and, once published, others will build upon your work. because of this, it is essential to begin with a review of related literature. search in bibliographic and citation databases as well as web search tools to see if others have done similar projects to your own. the advantage of finding related literature is that you can learn from the mistakes of others and avoid duplicating works (unless your plan is to replicate the work of others). starting with the work of others allows you to place your brick on the wall. if you do not explicitly discuss how your scholarship relates to the scholarship of others, only those having familiarity with the literature will be able to understand how your work fits in with that of previous authors. 
in addition, it’s easier to build upon your work if those who read it have a better idea of the scholarly landscape in which your work lives. as you go out and discover literature, it is crucial to keep citation information about each item. much of what you will cite will be book chapters or articles in journals, and you will save yourself time and trouble later if you make a printed copy of source items and record bibliographic information on that copy. recording the title of the work, the full names (including middle initials) of authors and editors, page range, volume, issue, date, publisher and place of publication, url and date accessed, and any other bibliographic information at the time of collection will save you headaches later when you have to create your references list. as different journals have different citation requirements, having all of this information allows you the flexibility of adapting to different styles. one type of scholarship produced by libraries is the “how our library did something well” article. while a case study of your library can be an appropriate area of discussion, it is critical to position these pieces within the scholarship of the field. this allows readers to better understand how applicable your findings are to their own libraries. the concept illustrates the difference between the practice of librarianship and library science. library science is the study of librarianship and includes the generalization of library practice in one setting to other settings. before starting your writing, talk about your idea with your colleagues, which will help you refine your ideas. it will also generate some excitement and publicity about your work, which can help inspire you to continue in the writing process. colleagues can help you consider different places where similar works may already exist and might even open your eyes to similar work in another discipline. 
you may find a colleague who wants to coauthor the piece with you, which can make the project easier to complete and richer through the collaborative process.

another important early step is to consider the journals you would like to be published in. many times, it can be fruitful to publish in the journal that has published works that are in your literature review. considering the journal at this point will allow you to correctly focus the scope, length, and style of your article to the requirements of your desired journal. your article should match the length and tone of other articles in that journal. most journals provide instructions to authors in each issue or on the web; the information page for ital authors is at www.ala.org/ala/lita/litapublications/ital/informationauthors.htm.

scott nicholson (srnichol@syr.edu) is an assistant professor in the school of information studies, syracuse university, new york.

how can i find funding for my research?

some projects can’t be easily done in your spare time and require resources for surveys, statistical analysis, travel, or other research costs. you will find that successful requests for funding start with a literature review and a research plan. developing these before requesting funding will make your request for funding much stronger, as you will be able to demonstrate how your work will sit within a larger context of scholarship.

you will need to develop a budget for your funding request. this budget will come together more easily if you have planned out your research. it may be useful or even required for you to develop a set of outcomes for your project and how you will be assessing those outcomes (find more information on outcome-based evaluation through the imls web site at www.imls.gov/grants/current/crnt_obe.htm). developing this plan will give you a more concrete idea of what resources you will need and when, as well as how you can use the results of your work.

resources for research may come from the inside, such as the library or the parent organization of the library, or from an external source, such as a granting body or a corporate donor. in choosing an organization to approach, you should consider who would most benefit from the research, as the request for funding should focus on the benefit to the granting body. many libraries and schools do have small pots of money available for research that will benefit that institution and that, many times, go untapped due to a lack of interest.

granting organizations put out formal calls for grant proposals. these can result in a grant that would carry some prestige but would require a detailed formal application that can take months of writing and waiting. another approach is to work with a corporate or nonprofit organization that gives grants. if your organization has a development office, this office may be able to help connect you with a potential supporter of your work.

how do i actually do the research?

just as the most critical part of a dissertation is the proposal, a good research plan will make your research process run smoothly. before you start the research, write the literature review and the research plan as part of an article. it can be useful to create tables and charts with dummy data that will show how you plan to present results. doing this allows you to notice gaps in your data-collection plan well before you start that process. in many research projects, you only have a single chance to collect data; therefore, it’s important to plan out the process before you begin.

how do i start writing the paper?

the best way to start the writing process is to just write. don’t worry about coming up with a title; the title will develop as the work develops.
you can skip over the abstract and introduction; these can be much easier to write after the main body of the article is complete. if you’ve followed the advice in this paper, then you’ve already written a literature review and perhaps a research plan; these make a good starting point for your article. one way to develop the body of the article is to develop an outline of headings and subheadings. starting with this type of outline forces you to think through your entire article and can help you identify holes in your preparation. once you have the outline completed, you can then fill in the outline by adding text to the headings and subheadings. this approach will keep your thinking organized in a way typically used in scholarly writing. scholarly writing is different from creative writing. many librarians with a humanities background face some challenges in transitioning to a different writing style. scholarly writing is terse; strunk and white’s the elements of style (2000) focuses on succinct writing and can help you refresh your writing skills.1 if you are having difficulty finding the time to write, it can be useful to set a small quota of writing that you will do every day. a quota such as four paragraphs a day is a reasonable amount to fit into even a busy day, but it will result in the completion of your first draft in only a few weeks. i’m finished with my first complete draft! now what? while you will be excited with the completion of the draft, it’s not appropriate to send that off to a journal just yet. take a few days off and let your mind settle from the writing, then go back and reread your article carefully. examine each sentence for a subject and a verb, and remove unneeded words, phrases, sentences, paragraphs, or even pages. try to tighten and clean your writing by replacing figures of speech with statements that actually say what you mean in that situation and removing unneeded references to first- and second-person pronouns.
working through the entire article in this way greatly improves your writing and reduces the review and editing time needed for the article. after this, have several colleagues read your work. some of these might be people with whom you shared your original ideas, and others may be new to the concepts. it can be useful to have members of different departments and with different backgrounds read the piece. ask them if they can read your work by a specific date, as this type of review work is easy to put off when work gets busy. these colleagues may be people who work in your institution or may be people you have met online. if you know nobody who would be appropriate, consider putting out a request for assistance on a library discussion list focused on your research topic. dealing with the comments from others requires you to set aside your defenses. you did spend a lot of time on this work and it can be easy to slip into a defensive mode. attempt to read their comments from an objective viewpoint. remember, these people are spending their time to help you, and a comment you disagree with at first blush may make more sense if you consider the question “why would someone say this about my work?” putting yourself into the reader’s shoes can aid you in the creation of a piece that speaks to many audiences. 110 information technology and libraries | june 2006 what goes on when i submit my work? at this point, your readers have looked at the piece, and you have made corrections on it. now you’re ready to submit your work. follow the directions of the target journal, including length, citation format, and method of submission. if submission is made by e-mail, it would be appropriate to send a follow-up e-mail a few days after submission to ensure the work was received; it can be very frustrating to realize, after a month of waiting, that the editor never got the work.
once you have submitted your work, the editor will briefly review it to ensure it is an appropriate submission for the journal. if it is appropriate, then the editor will pass the article on to one or more reviewers; if not, you will receive a note fairly quickly letting you know that you should pick another journal. if the reviewing process is “blind,” then you will not know who your reviewers are, but they may know your identity. if the process is “double-blind,” neither reviewer nor author will know the identity of the other. the reviewers will read the article and then submit comments and a recommendation to the editor. the editor will collect comments from all of the reviewers and put them together, and send those comments to you. this will always take longer than you would prefer; in reality, it will usually take two to six months, depending upon the journal. after a few months, it would be appropriate for you to contact the editor and ask about the progress on the article and when you should expect comments. do not expect to have your article accepted on the first pass. the common responses are: ■ reject. at this point, you can read the comments provided, make changes, and submit it to another journal. ■ revise and resubmit. the journal is not making a commitment to you, but they are willing to take another look if you are willing to make changes. this is a common response for first submissions. ■ accept with major changes. the journal is interested in publishing the article, but it will require reworking. ■ accept with minor changes. you will be presented with a series of small changes. some of these might be required and others might be your choice. ■ accept. the article is through the reviewing process and is on to the next stage. this is an iterative process. you will most likely go through several cycles of this before your article is accepted, and staying dedicated to the process is key to its success. 
it can be disheartening to have made three rounds of changes only to face another round of small changes. ideally, each set of requested changes should be smaller (and take less time) until you reach the acceptance level. do not submit your work to multiple journals at the same time. if you choose to withdraw your work from one journal and submit it to another, let the editor know that you are doing this (assuming they have not rejected your work). my article has been accepted. when will it come out? once your article is accepted, it will be sent into a copyediting process. the copy editor will contact you with more questions that focus more on writing and citation flaws than on content. after making more corrections, you will receive a proof to review (usually with a very tight deadline). this proof will be what comes out in the journal, so check important things like your name, institutions, and contact information carefully. the journal will usually come out several months after you see this final proof. the process from acceptance to publication can take from six months to two years (or more), depending on how much of a publication queue the journal has. the editor should be able to give you an estimate as to when the article will come out after full acceptance. can i put a copy of my article online? it depends upon the copyright agreement that you sign. many publishers will allow you to put a copy of your article on a local or institutional web site with an appropriate citation. some allow you to put up a preprint, which would be the version after copyediting but not the final proof version. if the copyright agreement doesn’t say anything about this, then ask the editor of the journal about the policy of authors mounting their own articles on a web site. conclusion writing an article and getting it published is akin to having a child. 
your child will have a life of its own, and others may notice this new piece of knowledge and build upon it to improve their own library services or even make their own works. it is a way to make a difference that goes far beyond the walls of your own library, to extend your professional network, and to engage other scholars in the continued development of the knowledge base of our field. reference 1. w. strunk jr. and e. b. white, the elements of style (boston: allyn & bacon, 2000). for more information: w. crawford, first have something to say: writing for the library profession (chicago: ala, 2003). r. gordon, the librarian’s guide to writing for publication (lanham, md.: scarecrow, 2004). l. hinchliffe and j. dorner, eds., how to get published in lis journals: a practical guide (san diego: elsevier, 2003), www.elsevier.com/framework_librarians/libraryconnect/lcpamphlet2.pdf (accessed feb. 8, 2006). participatory networks | lankes, silverstein, and nicholson 17 the goal of the technology brief is to familiarize library decision-makers with the opportunities and challenges of participatory networks. in order to accomplish this goal the brief is divided into four sections (excluding an overview and a detailed statement of goal): ■ a conceptual framework for understanding and evaluating participatory networks; ■ a discussion of key concepts and technologies in participatory networks drawn primarily from web 2.0 and library 2.0; ■ a merging of the conceptual framework with the technological discussion to present a roadmap for library systems development; and ■ a set of recommendations to foster greater discussion and action on the topic of participatory networks and, more broadly, participatory librarianship. this summary will highlight the discussions in each of these four topics. for consistency, the section numbers and titles from the full brief are used.
knowledge is created through conversation. libraries are in the knowledge business. therefore, libraries are in the conversation business. some of those conversations span millennia, while others only span a few seconds. some of these conversations happen in real time. in some conversations, there is a broadcast of ideas from one author to multiple audiences. some conversations are sparked by a book, a video, or a web page. some of these conversations are as trivial as directing someone to the bathroom. other conversations center on the foundations of ourselves and our humanity. it may be odd to start a technology brief with such seemingly abstract comments. yet, without this firm, if theoretical, footing, the advent of web 2.0, social networking, library 2.0, and participatory networks seems a clutter of new terminology, tools, and acronyms. in fact, as will be discussed, without this conceptual footing, many library functions can seem disconnected, and the field that serves lawyers, doctors, single mothers, and eight-year-olds (among others) fragmented. the scale of this technology brief is limited; it is to present library decision-makers with the opportunities and challenges of participatory networks. it is only a single piece of a much larger puzzle that seeks to present a cohesive framework for libraries. this framework not only will fit tools such as blogs and wikis into their offerings (where appropriate), but also will show how a more participatory, conversational approach to libraries in general can help libraries better integrate current and future functions. think of this document as an overview or introduction to participatory librarianship. readers will find plenty of examples and definitions of web 2.0 and social networking later in this article. however, to jump right into the technology without a larger framework invites the rightful skepticism of a library organization that feels constantly buffeted by new technological advances.
in any environment with no larger conceptual foundation against which to measure the importance of an advance in technology or practice, selection of any one technology or practice is nearly arbitrary. without a framework, the field becomes open to the influence of personalities and trendy technology. therefore, it is vital to ground any technological, social, or policy conversation in a larger, rooted concept. as susser said, “to practice without theory is to sail an uncharted sea; theory without practice is not to set sail at all.”1 for this paper, the chart will be conversation theory. the core of this article is in four sections: ■ a conceptual framework for understanding and evaluating participatory networks; ■ a discussion of key concepts and technologies in participatory networks drawn primarily from web 2.0 and library 2.0; ■ a merging of the conceptual framework with the technological discussion to present a sort of roadmap for library systems development; and ■ a set of recommendations to foster greater discussion and action on the topic of participatory networks and, more broadly, participatory librarianship. it is recommended that the reader follow this order to get the big picture; however, the second section should be a useful primer on the language and concepts of participatory networks. participatory networks: the library as conversation r. david lankes, joanne silverstein, and scott nicholson 18 information technology and libraries | december 2007 r. david lankes (jdlankes@iis.syr.edu) is director and associate professor, joanne silverstein (jlsilver@iis.syr.edu) is research professor, and scott nicholson (scott@scottnicholson.com) is associate professor at the information institute of syracuse, (n.y.) syracuse university’s school of information studies. ■ library as a facilitator of conversation let us return to the concept that knowledge is created through conversation. this notion stretches back to socrates and the socratic method. however, the specific foundation for this statement comes from conversation theory, a means of explaining cognition and how people learn.2 it is not the purpose of this article to provide a detailed description of conversation theory, a task already admirably accomplished by pask. rather, let us use the theory as a structure upon which to hang an exploration of participatory networking and, more broadly, participatory librarianship.
the core of conversation theory is simple: people learn through conversation. different communities have different standards for conversations, from the scientific community’s rigorous formalisms, to the religious community’s embedded meaning in scripture, to the sometimes impenetrable dialect of teens. the point remains, however, that different actors establish meaning through determining common definitions and building upon shared concepts. the library has been a place where we facilitate conversations, though often implicitly. the concept of learning through conversation is evidenced in libraries in such large initiatives as information literacy and teaching critical thinking skills (using such meta-cognitive approaches as self-questioning), and in the smaller events of book groups, reference interviews, and speaker series. library activities such as building collections of artifacts (the tangible products of conversation) inform scholars’ research through a formal conversation process where ideas are supported with evidence and methods. similarly, preservation efforts, perhaps of wax cylinders with spoken word content or of ancient maps that embody an ongoing dialogue about the shape and nature of the physical world, seek to save, or at least document, important conversations. common use of the word “conversation” is completely in accordance with the use of the term in conversation theory.
the term is, however, more specifically defined as an act of communication and agreement between a set of agents. so, a conversation can be between two people, two organizations, two countries, or even within an individual. how can a conversation take place within an individual? educators and school librarians may be familiar with the term “metacognition,” or the act of reflecting on one’s learning.3 yet, even the most casual reader will be familiar with the concept of debating oneself (“if i go right, i’ll get there faster, but if i go left i can stop by jim’s . . .”). the point is that a conversation is with at least two agents trying to come to an understanding. also note that those two agents can change over time. so, while socrates and plato are dead, the conversation they started about the nature of knowledge and the world is carried forward by new generations of thinkers: same conversation, different agents. people converse, organizations converse, states converse, societies converse. the requirements, in the terms of conversation theory, are two cognitive systems seeking agreement. the results of these conversations, what pask would call “cognitive entanglements,” are books, videos, and artifacts that either document, expand, or result from conversations.4 so, while one cannot converse with a book, that book certainly can be a starting point for many conversations within the reader and within a larger community. if the theory is that conversation creates knowledge, the library community has added a corollary: the best knowledge comes from an optimal information environment, one in which the most diverse and complete information is available to the conversant(s). library ethics show an implicit understanding of this corollary in the advocacy of intellectual freedom and unfettered access.
libraries seek to create rich environments for knowledge and have taken the stance that they are not in the job of arbitrating the conversations that occur or the appropriateness of the information used to inform those conversations. as will be discussed later, this belief in openness of conversations will have some far-reaching implications for the library collection and is an ideal that can never truly be met. for now, the reader may take away that conversation theory is very much in line with current and past library practice, and it also shows a clear trajectory for the future. this viewpoint’s value is not just theoretical; it has real consequences and uses. for example, much of library evaluation has been based on numeric counts of tangible outputs: books circulated, collection size, reference transactions, and so on. yet this quantitative approach has been frustrating to many who feel they are counting outcomes but not getting at true impact of library service. librarians may ask themselves, “which numbers are important . . . and why?” if libraries focused on conversations, there might be some clarity and cohesion between statistics and other outcomes. suddenly, the number of reference questions can be linked to items cataloged or to circulation numbers . . . they are all markers of the scope and scale of conversations within the library context. this approach might enable the library community to better identify important conversations and demonstrate direct contributions to these conversations across functions. for example, a school district identifies early literacy as important. there is a discussion about public policy options, new programs, and school goals to achieve greater literacy in k–5. the library should be able to track two streams in this conversation.
the first is the one libraries are accustomed to counting; that is, the library’s contribution to k–5 literacy (participation in book talks, children’s events, circulation of children’s books, reference questions, and so on). but the library also can document and demonstrate how it furthered the conversation about children’s literacy in general. it could show the resources provided to community officials. it could show the literacy pathfinders that were created. the point of this example is that the library is both participant in the conversation (what we do to promote early literacy) and facilitator of conversation (what we do to promote public discourse). the theoretical discussion leads us to a discussion about the second topic of this technology brief: pragmatic aspects of the knowledge as conversation approach, or a participatory approach, as it will be called. as new technologies are developed and deployed in the current environment of limited resources, there must be some means of evaluating their utility. a technology’s utility is appropriately measured against a given library’s mission, which is, in turn, developed to respond to the needs of the community that library serves. first, however, let us identify some of the new technologies and describe them briefly. ■ participatory networking, social networks, and web 2.0 let us now move from the theoretical to the operational. the impetus behind this article is the relatively recent emergence of a new group of internet services and capabilities. suddenly, terms such as wiki, blog, mashup, web 2.0, and biblioblogosphere have become commonplace. as with any new wave of technological creation, these terms can seem ambiguous. they also come wrapped in varying amounts of hype. they may all, however, be grouped under the phenomenon of participatory networking.
while we now have a conceptual framework to evaluate these technologies that support participatory networking (for example, do they further conversations), we still need to know the basics of the terminology and technologies. this section outlines key concepts in the pragmatics of participatory networking. the section after this one will join the theoretical and operational to outline key challenges and opportunities for the library world. we begin with web 2.0. web 2.0 much of what we call participatory networking, at least the technological foundation of it, stems from developments in web 2.0.5 as with many buzzwords, the exact definition of web 2.0 is not clear. it is more an aggregation of concepts that range from software development (loosely coupled application programming interfaces [apis] and the ease of incorporating features across platforms) to abstractions (the user is the content). what pervades the web 2.0 approach is the notion that internet services are increasingly facilitators of conversations. the following sections describe some of the characteristics of web 2.0. web 2.0 characteristic: social networks a core concept of web 2.0 is that people are the content of sites; that is, a site is not populated with information for users to consume. instead, services are provided to individual users for them to build networks of friends and other groups (professional, recreational, and so on). the content of a site, then, comprises user-provided information that attracts new members of an ever-expanding network. examples include: ■ flickr. flickr (www.flickr.com) provides users with free web space to upload images and create photo albums. users then can share these photos with friends or with the public at large. flickr facilitates the creation of shared photo galleries around themes and places. ■ the cheshire public library.
the teen book blog (http://cpltbb.wordpress.com) at the cheshire public library offers book reviews created only by the students who use the library. ■ memorial hall library. the memorial hall library in andover, massachusetts, offers podcasts of poetry contests in which the content is created by students (www.mhl.org/teens/audio/index.htm). ■ libraries in myspace. myspace searches show that there are myspace sites for hundreds of individual libraries and scores of library groups. alexandrian public library (apl), for example, has established a site at myspace (www.myspace.com/teensatapl). this practice is growing among public libraries and is an attempt to reach out to users in their preferred online environments. in this venue, the more friends a library’s myspace site has, the more successful it may be considered. as of this writing, apl had seventy-five friends and fifteen comments. the brooklyn college library had 2,195 friends and 270 comments. web 2.0 characteristic: wisdom of crowds there has been some research into the quality of mass decision-making.6 that research shows how remarkably accurate groups are in their judgments. web 2.0 pools large groups of users to comment on decisions. this aggregation of input is facilitated by the ready availability of social networking sites. certainly, this approach of community organization and verification of knowledge also has its detractors. many, for example, question the wisdom seen in some entries of wikipedia. yet, recent articles have compared this mass editing process favorably to traditional sources of information, such as the encyclopedia britannica.7 examples include: ■ ebay. ebay has perhaps the most studied and copied community policing and reputation systems. all buyers and sellers can be rated. the aggregation of many users’ experiences creates a feedback score that is equivalent to a group credibility rating (see figure 1).
these kinds of group feedback systems can now be seen in most major internet retailers. ■ librarything. librarything.com makes book recommendations based on the collective intelligence of all users of the site. the greater the pool of collective intelligence, the more information available to the user for decision-making. ■ the diary project. the diary project library (www.diaryproject.com) is a non-profit organization that encourages teens to write about their day-to-day experiences growing up. the goal of this site is to encourage communication among teens of all cultures and backgrounds, provide peer-to-peer support, stimulate discussion, and generate feedback that can help ease some of the concerns teens encounter along the way and let them know that they are not alone. to that end, the site comprises thousands of entries in twenty-four categories. because of the great number of entries, most youth can find helpful materials. web 2.0 characteristic: loosely coupled apis an api provides a set of instructions (messages) that a programmer can use to communicate between applications. apis allow programmers to incorporate one piece of software they may not be able to directly manipulate (code) into another. for example, google maps has made a public api that allows web page designers to include satellite images into their web pages with little more than a latitude and longitude.8 apis vary in their ease of integration. loosely coupled apis allow for very easy integration using high-level scripting languages such as javascript.9 examples include: ■ google maps. google maps displays street or satellite maps showing markers on specific locations provided by an external source with simple sets of longitudes and latitudes. it becomes extremely easy to create geographic information systems with little knowledge of gis principles. ■ flickr.
flickr provides easy means to integrate hosted images into other web pages or applications (as with a google map that shows images taken at a specific location). ■ youtube. youtube (www.youtube.com) provides users with the capability to upload and comment upon video on the internet. it also allows for easy integration of the videos into other web pages and blogs. with a simple line of html code, anyone can access streaming video for their content. web 2.0 characteristic: mashups mashups are combinations of apis and data that result in new information resources and services.10 this ease of incorporation has led to an assumption of a “right to remix.” in the world of open source software and the creative commons, the right to remix refers to a growing expectation among internet users that they are not limited by the interfaces and uses presented to them by a single organization. examples include: ■ chicagocrime.org. an often-cited example of a mashup is chicagocrime.org, which uses google maps to plot crime data for the city of chicago. users can now see exactly which street corner had the most murders. figure 2 shows a marker at the location of every homicide in chicago from november 2, 2005, to august 2, 2006. ■ book burro. book burro (http://bookburro.org/about.html) “is a web 2.0 extension for firefox and flock. when it senses you are looking at a page that contains a book, it will overlay a small panel which when opened lists prices at online bookstores such as amazon, buy, half (and many more) and whether the book is available at your library.” ■ library lookup. the mit library lookup greasemonkey script for firefox (http://libraries.mit.edu/help/lookup.html) searches mit’s barton catalog from an amazon book screen. web 2.0 characteristic: permanent betas the concept of a permanent beta is, in part, a realization that no software is ever truly complete so long as the user community is still commenting upon it.
for example, google does not release services from beta until a service has achieved a sufficient user base, no matter how fixed the underlying source code is.11 permanent beta also is a design strategy. large applications are broken into smaller constituent parts that can be manipulated separately. this allows large applications to be continually developed by a more diverse and distributed community (as in open source). figure 1. a seller’s profile shows a potential buyer the ebay community’s current estimation of a seller’s credibility. examples include: ■ google labs. google has a site named “google labs” (http://labs.google.com) that puts out company-generated tools and services. in fact, part of a google employee’s work time is dedicated to creating the resources and tools through personal projects and exploration. these tools and services remain a part of the “lab” until they are finished and have sufficient user bases. projects (see figure 3) range from the simple (google suggest, which provides a dropdown box of possible search queries as you begin to type your search terms) to the extensive (google maps, which started as a google lab project). ■ mit libraries. the mit libraries are experimenting with new technologies to help make access to information easier. the tools below are offered to the public with an appeal for feedback and additional tools, and there is a permanent address designed just to collect feedback from the beta-phase tools, which include: ■ the new humanities virtual browsery, which highlights new books and incorporates an rss feed, the ability to comment on books, links to book reviews, availability information, and links to other books by the same author. ■ the libx - mit edition (http://libraries.mit.edu/help/libx.html), which is a firefox toolbar that allows users to search the barton catalog, vera, google scholar, the sfx fulltext finder, and other search tools; it embeds links to mit-only resources in amazon, barnes & noble, google scholar, and nyt book reviews. ■ the dewey research advisor business and economics q&a (http://libraries.mit.edu/help/dra.html), which provides starting points for specific research questions in the fields of business, management, and economics.
web 2.0 characteristic: software gets better the more people use it an increasing number of web 2.0 sites emphasize social networks, where these services gain value only as they gain users. malcolm gladwell recounts this principle and the work of kevin kelly with an earlier telecommunications network, the network of fax machines connected to the phone system: “the first fax machine ever made . . . cost about $2,000 at retail. but it was worth nothing because there was no other fax machine for it to communicate with. the second fax machine made the first fax machine more valuable, and the third fax made the first two more valuable, and so on. . . . when you buy a fax machine, then, what you are really buying is access to the entire fax network, which is infinitely more valuable than the machine itself.”12 with social networking sites, and all sites that seek to capitalize on user input (reviews, annotations, profiles, etc.), the true value of each site is defined by the number of people it can bring together. a classic example of this characteristic is amazon. amazon sells books and other merchandise, but, in reality, amazon is very much about the marketing of information. amazon gains tremendous value by allowing its users to review and rate items. the more people use amazon and the more they comment, the more visibility these active users gain and the more credibility markers they take on.
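the fax-machine passage above can be made concrete with a small counting sketch. it assumes, purely for illustration, that a network's usefulness grows with the number of distinct pairs of machines that could call each other; the source gives no formula, and the function name possible_links is hypothetical.

```python
def possible_links(machines: int) -> int:
    """Count distinct pairs among `machines` devices (an illustrative assumption)."""
    if machines < 0:
        raise ValueError("machines must be non-negative")
    # Each machine can reach every other one; halve so each pair is counted once.
    return machines * (machines - 1) // 2


for n in (1, 2, 3, 10, 100):
    print(f"{n:>3} machines -> {possible_links(n):>4} possible links")
```

under this assumption a single machine has no one to call, which matches the passage's point that the first fax was worth nothing on its own, and each added machine creates more new pairs than the one before it.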
web 2.0 characteristic: folksonomies

a folksonomy is a classification system created in a bottom-up fashion with no central coordination. this differs from the deductive approach of such classification systems as the dewey decimal system, where the world of ideas is broken into ten nominal classes.13 it also differs from other means of developing classifications where some central authority determines if a term should be included. in a folksonomy, the members of a group simply attach terms (or tags) to items (such as photos or blog postings), and the aggregate of these terms is seen as the classification. what emerges is a classification scheme that prioritizes common usage (the most-used tags) over semantic clarity (if most people use “car,” but some use “cars,” they are seen as different terms, and the tag “automobile” has no real relationship within the aggregate classification).

figure 2: screenshot of chicagocrime.org

examples include:

■ penntags. penntags (http://tags.library.upenn.edu/help) is a social bookmarking tool for locating, organizing, and sharing one’s favorite online resources. members of the penn community can collect and maintain urls, links to journal articles, and records in franklin, the online catalog, and vcat, the online video catalog. once resources are compiled, users can organize them by assigning tags (free-text keywords) or by grouping them into projects according to specific preferences. penntags also can be used collaboratively, as it acts as a repository of the varied interests and academic pursuits of the penn community, and a user can find topics and other users related to his or her own favorite online resources.

■ hillsdale teen library. the hillsdale teen library (www.flickr.com/photos/hillsdalelibraryteens) uses flickr to post pictures of events at the hillsdale teen library (figure 4).
the resulting tag view is represented in figure 5. these tags allow users to easily retrieve the images in which they are interested.

there are more characteristics of web 2.0, but these give some overall concepts.

core new technologies: ajax and web services

as we have just discussed, web 2.0 is little more than a set of related concepts, albeit with a lot of value currently attached to these concepts. these concepts are supported by two underlying technologies that have facilitated web 2.0 development and brought a substantially new (and improved) user experience to the web. the first is ajax, which allows a more desktop-like experience for users. the second is the advent of web services. these technologies are not necessary for web 2.0 concepts, but they have made web 2.0 sites much more compelling.

ajax

ajax stands for asynchronous javascript and xml.14 it is a set of existing web technologies brought together. at the most basic, ajax allows a browser (the part the user interacts with) and a server (where the data resides) to send data back and forth without needing to refresh the entire web page being worked on. think about the web sites you work with. you click on a link, the browser freezes and waits for the data, then draws it on the screen. early versions of such sites as mapquest would show a map. if you wanted to zoom into the map, you would press a zoom icon and wait while the new map and the rest of the web page were redrawn. compare this to google maps, where you click in the middle of a map and drag left or right and the map moves dynamically. we are used to this kind of interaction in desktop applications. click and drag has become second nature on the desktop, and ajax is making it second nature on the web, too.

another ajax advantage is that it is open and requires only light programming skills. javascript on the client and almost any server-side scripting language (such as active server pages or php) are easily accessible languages.
this fact allows for both fast development and easier integration with existing systems. as an example, it should now be easier to bring more interactive web interfaces to existing online catalogs.

web services

web services allow for software-to-software interactions on the web.15 using web protocols and xml, applications exchange queries and information in order to facilitate the larger functioning of a system. one example would be a system that uses an isbn to query multiple online catalogs and commercial vendors for availability (and price) of a book. this simple process might be part of a much larger library catalog that shows users a book and its availability. the point is that, unlike federated search systems such as z39.50, web services are small. they also tend to be lightweight (that is, limited in what they do), and are aggregated for greater functionality. this is the technological basis for the loosely coupled apis discussed previously.

figure 3: screenshot of current google lab projects

library 2.0

library 2.0 is a somewhat diffuse concept. walt crawford, in his extended essay “library 2.0 and ‘library 2.0,’” found sixty-two different (and often contradictory) views and seven distinct definitions of library 2.0.16 it is no wonder that people are confused. however, it is natural for emerging ideas and groups to function in an environment of high ambiguity. for use in this technology brief, the authors see library 2.0 as an attempt to apply web 2.0 concepts (and some longstanding beliefs for greater community involvement) to the purpose of the library. in the words of ormsby, “the purpose of a library is not to . . . showcase new gadgetry . . .
; rather, it is to make possible that instant of insight when all the facts come together in the shape of new knowledge.”17 in the case of library 2.0, the new gadgetry discussed in the previous section comprises a group of software applications. how the applications are used will determine whether they support ormsby’s “instant of insight.”

many libraries and librarians already are pursuing this goal. some, for instance, are using blogs to reach other librarians, their own users (on their own web sites), and potential users (using myspace and other online communities). they are using wikis to deliver reports, teach information literacy, and serve as repositories. one has developed an api that allows wordpress posts to be directly integrated into a library catalog.

clearly, the internet and newer tools that empower users seem to be aligned with the library mission. after all, librarians blogging and allowing the catalog to be mashed up can be seen as an extension of current information services. but this abundance of new applications poses a challenge. given the speed with which new tools are invented, librarians may find it difficult to create strategies that include all the desired services that they make possible. for every new application that becomes available, library administrators must decide whether it can serve the library, how to use it, and how to find additional resources to manage it (for example, “now we can do this. but why should we?”).

this problem stems from focusing excessively on the technology. librarians should instead focus on the phenomena made possible by the technology. the most important of these phenomena is that the library invites participation. as chad and miller state:

library 2.0 facilitates and encourages a culture of participation, drawing upon the perspectives and contributions of library staff, technology partners and the wider community.
library 2.0 means harnessing this type of participation so that libraries can benefit from increasingly rich collaborative cataloguing efforts, such as including contributions from partner libraries as well as adding rich enhancements, such as book jackets or movie files, to records from publishers and others. library 2.0 is about encouraging and enabling a library’s community of users to participate, contributing their own views on resources they have used and new ones to which they might wish access. with library 2.0, a library will continue to develop and deploy the rich descriptive standards of the domain, whilst embracing more participative approaches that encourage interaction with and the formation of communities of interest.18

the carte blanche statement that users participating in the library is “good,” however, is insufficient. library administrators must ask, “what is the ultimate goal?”

figure 4: hillsdale teen library

figure 5: hillsdale teen library flickr site

in summary, current initiatives in the library world to bring the tools of web 2.0 to the service of library 2.0 are exciting and innovative, and, more to the point, they are supportive of the library’s purpose. they may, however, incur costs, such as monitoring blogs and wikis, creating content, and corresponding with users, that stretch already inadequate resources even further. ultimately, the value of library 2.0 concepts requires us to answer some important questions: will they be used to further knowledge, or will they simply create more work for librarians? what does the next version of library 2.0 look like? is its mission the same, and only the tools different? what makes the library different from myspace? simply a legacy? should we incorporate new services into the current library offerings?
how do we, as facilitators of conversations, point the way to the next generation of library? it is hoped that some of the concepts in participatory librarianship may answer these questions and help further the innovations of the library 2.0 community.

participatory networks

the authors use the phrase “participatory networking” to encompass the concept of using web 2.0 principles and technologies to implement a conversational model within a community (a library, a peer group, the general public, and so on). why not simply adopt social networking, web 2.0, or library 2.0 for that matter? let us examine each term’s limitations:

■ social networking: social network sites such as myspace and facebook have certainly captured public attention. they also have proven very popular. in their short life spans, these sites have garnered an immense audience (myspace has been ranked one of the top destination sites on the web) and drawn much attention from the press.19 some of that attention, however, has been very negative. myspace, for example, has been typified as a refuge for pedophiles and online predators. even the television show saturday night live has parodied the site for the ease with which users can create false personas and engage in risky online behaviors.20 to say you are starting a social networking site in your library may draw enthusiastic support, vehement opposition (“social networking experiment in my library?!”), or simply confused looks. add to the potential negative connotations the ambiguity of the term. is a blog a social networking site? is flickr? to compound this confusion, the academic domain of social network theory predates myspace by about a decade.

■ web 2.0: ambiguity also dogs the web 2.0 world. for some, it is technology (blogs, ajax, web services, and so on). for others, it is simply a buzzword for the current crop of internet sites that survived the burst of the dot-com bubble.
in any case, web 2.0 certainly implies more than just the inclusion of users in systems.

■ library 2.0: as stated before, the term library 2.0 is a vague term used by some as a goad to the library community. further, this term limits the discussion of user-inclusive web services to the library world. while this brief focuses on the library community, it also sees the library community as a potential leader in a much broader field.

so, ultimately, the authors propose “participatory networking” as a positive term and concept that libraries can use and promote without the confusion and limitations of previous language. the phrase “participatory network” also has a history of prior use that can be built upon. it represents systems of exchange and integration and has long been used in discussions of policy, art, and government.21 the phrase also has been used to describe online communities that exchange and integrate information.

■ libraries as participatory conversations

so where are we? we started with the abstract statement that knowledge is created through conversation. we then looked at the current landscape of technologies that can facilitate these conversations and showed examples of how libraries, other industries, and individuals are using these technologies. in this section we combine the larger framework with the technologies to see how libraries can incorporate participatory networks to further their knowledge mission.

participatory librarianship in action

let us look specifically at how participatory networks can be used in the library’s role as facilitator of knowledge through conversation. an obvious example is libraries hosting blogs and wikis for their communities, creating virtual meeting spaces for individuals and groups. indeed, these are increasingly useful functions for libraries. they meet a perceived need in the community and can generate excitement both within the library and in the community.
the idea of creating online sites for individuals and organizations makes sense for a library, although it is not without difficulties (see the section on challenges and opportunities). libraries also could use freely available (and increasingly easy to implement) open source software to create library versions of wikipedia (with or without enhanced editorial processes). another way for libraries to offer these services would be through a cooperative or other third-party vendor. such a service easily can be seen as a knowledge management activity, capturing and providing local expertise while linking this expertise to that produced at other libraries.

another reason for libraries to engage in participatory networking is that one library can more easily collaborate with other libraries in richer dialogues. we currently have systems that connect our online catalogs and share resources through interlibrary loan. these conduits exist and can be used for the transferal of richer data, as has been proved through collaborative virtual reference systems. in our current systems, as in traditional library practice, when users are referred to other libraries, they are sent out and not brought back. in a participatory library setting, libraries would facilitate a conversation between the user, the community of the local library, and then, through the developed conduits, other libraries and their communities. the end result would be a seamless web of libraries where the user can ignore the intricacies of the library’s organizational structure and boundaries, and in which the libraries are using the best local resources to meet local needs.
bringing libraries seamlessly together to participate in conversations with a single user has another significant advantage: the library would make it easy for users to join the conversation regardless of where they are, through the presentation of a single façade. there is, for example, only one google, one amazon, and one wikipedia. why should users have to search from among thousands of libraries to find the conversations they want? participatory networking will be most effective when libraries work together, when the whole is greater than its parts.

we currently see elements of the participatory library in the oclc open worldcat project. for example, users searching google may come across a listing provided by oclc. after selecting the entry for the book, the user can then jump to his or her own local library’s information about the book. users do not have to know which library to visit to find a book near them. extending this concept to conversations, one goal of these participatory networks is to make it easier for users to enter a conversation with the library without having to work to discover their own specific entry points.

however, ensuring this effective seamless access to the library will require more than simply adding elements of participatory networking around the library’s edges. adding services such as blogs and wikis may be seen merely as adjunct to current library offerings. as with any technological advance, scarce resources must be weighed against a desire to incorporate new services. do we expand the collection, improve the web site, or offer blogs to students? a better approach for making these kinds of decisions is to look at the needs of the community served in context with the commonly accepted, core tasks of a library, and see how they can be recast (and enhanced) as conversational, or participatory, tools. in point of fact, every service, patron, and access point is a starting point for a conversation.
let’s start with the catalog. if the catalog is a conversation, it is decidedly formal and, more importantly, one way. think of today’s catalog as the educational equivalent of a college lecture. a formal system is used to serve up a series of presentations on a given topic (selected by the user). the presentations are rigid in their construction (marc, aacr2, and so on). they follow an abstract model (relevance scores, sometimes alphabetical listings), and provide minimal opportunities to the receiver of the information to provide feedback or input. they provide no constructive means for the user to improve or shape the conversation. even recent advances in catalog functions (dynamic, graphical visualizations; faceted searching; simple search boxes; links to non-collection resources) do little more than make the presentation of information more varied. they are still not truly interactive because they do not allow user participation; they do not allow for conversation.

to highlight the one-way nature of the catalog, ask a simple question: what happens when the user doesn’t find something? do we assume that the information is there, but that the user is simply incapable of finding it (in which case the catalog presents search tips, refers the patron to an expert librarian who is capable, or offers more information literacy instruction)? do we assume that the information does not exist (refer the patron to interlibrary loan, pass him or her on to a broader search engine)? do we assume that the catalog itself is limited (refer the user to online databases, or other finding aids)? what if we assume that the catalog is just the current place where a user is involved in an ongoing conversation? what would that look like?

how can such a traditionally rigid system (in concept, more than in any one feature set) be made more participatory?
what if the user, finding no relevant information in the catalog, adds either the information or a placeholder for someone else to fill in the missing information? possibly the user adds information from his or her expertise. however, assuming that most people go to a catalog because they don’t have the information, perhaps the user instead begins a process for adding the information. the user might ask a question using a virtual reference service; at the end of the transaction, the user then has the option to add the question, along with the answer and associated materials, to the catalog. or perhaps the user simply leaves the query in the catalog for other patrons to answer, requesting to be notified when a response is posted. in that case, when a new user does a catalog search and runs across the question, he or she can provide an answer. that answer might be a textual entry (or an image, sound, or video), or simply a new query that directs the original questioner or new patrons to existing information in the catalog (user-created see also entries in the catalog).

the catalog also can associate conversations with any data point. for example, a user pulls up the record for a book she or he feels might be relevant to an information need she or he is having. this process starts a conversation between that user and the library, its users, and authors of associated works. the user can see comments and ratings associated with this book from not only users of this library, but users of other libraries. also associated is a list of related works and the full audio of a lecture by the author. the user also might be directed to an in-person or online book group that is reading that book. the point is that the catalog facilitates a conversation as opposed to simply presenting what it “knows” about a topic and then stepping out of the process.
the catalog, then, does not simply present information, but instead helps users construct knowledge by allowing the user to participate in a conversation.

there are other means of improving (and linking) systems in a conversational library. take the implicit link between the catalog and circulation. of course, these systems have always been linked in that items found in the catalog can be checked out, and checked-out items have their status reflected in the catalog. but this kind of state information is a pretty meager offering. imagine using circulation data to improve the actual functionality of the catalog. take the example of a user who is searching the catalog for fictional books on magic. currently, a relevance score between an item’s metadata and the query is computed and then all the items are ranked in a retrieval set. this relevance score can be computed in many ways, but is usually based on the number of times a keyword appears in the record and the placement of that keyword in the metadata record (giving preference to terms appearing in certain marc fields, such as titles). what is missing is the actual, real-world circulation of an item. wouldn’t it make sense, given such an abstract query, to present the user with harry potter first (but not exclusively)? what if we added circulation data to our relevance rankings: how many times has this item been checked out? it turns out that using a simple statistic is amazingly powerful. it is akin to google’s pagerank algorithm, which ranks the most linked-to sites higher in the results. also, for those worried that users would be flooded with only popular materials, studies show that while these algorithms do change the very top ranked material, the effect quickly fades so that the user can still easily find other materials.

another consideration for adjusting a search is to allow the user to tweak the algorithms used to retrieve works. in the example above, a user could turn off the popularity feature.
the user also could toggle switches for currency, authority, and other facets of relevancy rankings.

the conversational model requires us to rethink the catalog as a dynamic system with data of varying levels of currency and, frankly, quality, coming into and out of the system. in a conversational catalog, there is no reason that some data can’t exist in the catalog for limited durations (from years to seconds). records of well-groomed physical collections may be a core and durable collection in the catalog, but that is only one of many types of information that could exist in the catalog space. furthermore, even this core data can be annotated and linked to (and from) more transient media. so, the user might see a review from a blog as part of a catalog record on one day, but when she or he pulls the record up again in a few days, that review might be absent, the blog writer having withdrawn the comment. this is akin to weeding the collection; however, it would happen in a more dynamic fashion than occurs with the content on library shelves.

the conversational model also can be used in other areas of the library. what do we digitize? what do we select? what programs do we offer? what do we preserve? the empowered user can participate in answering all of these questions but does not replace the expert librarian; rather, the user contributes additional and diverse information and commentary.

in fact, the catalog scenario just proposed already assumes that the library catalog does more than store metadata. in order for the scenario to work, the catalog must store questions, answers, video, and audio; in essence, the catalog must be expanded and integrated with other library systems so that a final participatory library system can present a coherent view of the library to patrons. the next section lays out a sort of roadmap for these enhancements and mergers.
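the circulation-weighted ranking described earlier in this section, together with the switch that lets a user turn the popularity feature off, can be sketched in a few lines. this is only an illustration under stated assumptions, not a description of any existing catalog: the record fields, the double weight for title hits, and the logarithmic damping of checkout counts are all hypothetical choices made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Record:
    title: str
    subjects: str
    checkouts: int  # hypothetical lifetime circulation count


def keyword_score(record, query):
    """Count query-term hits; a title hit counts double (assumed weighting)."""
    title_words = record.title.lower().split()
    subject_words = record.subjects.lower().split()
    return sum(
        2.0 * title_words.count(term) + subject_words.count(term)
        for term in query.lower().split()
    )


def rank(records, query, use_popularity=True):
    """Return the records that match the query, best first."""
    scored = []
    for record in records:
        score = keyword_score(record, query)
        if score == 0:
            continue  # no keyword match, so the item is not retrieved
        if use_popularity:
            # Damp the boost with a logarithm (an assumption) so that
            # heavily circulated items rise without swamping everything else.
            score *= 1.0 + math.log1p(record.checkouts)
        scored.append((score, record))
    # sort() is stable, so ties keep their original catalog order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [record for _, record in scored]


# Made-up sample data for illustration only.
records = [
    Record("a history of magic", "magic history", 12),
    Record("harry potter and the sorcerer's stone", "magic fiction wizards", 950),
    Record("stage magic for beginners", "magic conjuring", 40),
]

if __name__ == "__main__":
    for record in rank(records, "magic fiction"):
        print(record.title)
```

with the popularity boost on, the heavily circulated title moves to the top of the list for the query "magic fiction"; calling `rank(records, "magic fiction", use_popularity=False)` falls back to the keyword score alone. switches for currency or authority could be added the same way, as further optional multipliers.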
framework for integration of participatory librarianship

as has been noted, participatory networks and libraries as conversations are not brand new concepts sprung from the head of zeus. instead, they are means to integrate past and current innovations and create a viable plan forward. figure 6 provides a sort of road map of how the library might make the transition from current systems to a truly participatory system. it includes current systems, systems under development (such as federated searching), and new concepts (such as the participatory library). it seeks to capture current momentum and push the field forward to a larger view instead of getting bogged down in the intricacies of any one development activity.

along the left side of the graph are current library systems. while the terminology may differ from library to library, nearly every system can be found on today’s library web sites. by showing the systems together, the problems of user confusion and library management burden become obvious. users must often navigate these systems based on their needs, and often with little help. should they search the catalogs first, or the databases? isn’t the catalog really just another database? which database do they choose? in our attempts to serve the users better by creating a rich set of resources and services, we have instead complicated their information-seeking lives. as one librarian puts it, “don’t give me one more system i, or my patrons, have to deal with.”

from the array of systems on the left side, we can see that libraries have not been doing themselves any favors either. we are maintaining many systems, therefore making the calls for yet more systems not only impractical but unwise. the answer is to integrate systems, combining the best of each while discarding the complexity of the whole. the library world is in the midst of doing just that.
this section seeks to highlight promising developments in integrating library systems well beyond the library catalog and to highlight not only an ideal endpoint, but also how this ideal system is truly participatory.

merging reference and community involvement

the functional area furthest along in the integration of participatory librarianship is reference; as reference is most readily recognizable as a conversation, this comes as no surprise. over the last decade, reference services have gone online and have led to shared reference services. more importantly, reference done online creates artifacts of reference conversations: electronic files that can be cleaned of personal information and placed in a knowledge base and used as a resource for other users. a new development in reference is the reference blog, in which multiple librarians and other users can be part of a question-answering community with conversations that can live on beyond a single transaction.

another functional area of libraries that is already involved with participatory librarianship is community involvement. for decades, public libraries have supported local community groups through meeting spaces. some libraries now are hosting web spaces for local groups. as libraries incorporate participatory technologies into their offerings, they can create virtual places such as discussion forums, wikis, and blogs for these community groups to use. if there are standards for these discussion areas, then groups from different communities also could easily participate in shared boards; this makes sense for groups such as weight watchers or alcoholics anonymous that have local branches and national involvement. in an academic setting, these groups can be student, faculty, or staff organizations or courses.
in addition to reference and hosted community conversations, the library has been actively creating digital collections of materials (either through digitization, leasing service from content providers, or capturing the library’s born-digital items). parallel to the digital collection building of library materials is an active attempt to create institutional repositories of faculty papers, teacher lesson plans, organizational documentation, and the like. these services are participatory systems in which collections come from users’ contributions, and they may evolve into digital repositories that include both user- and librarian-created artifacts.

these different conversations can be archived into a single repository, and, if properly planned, the reference conversations can live alongside, and eventually be intermingled with, the community conversations and the digital repository (which, after all, though formal, is a community conversation) in a community repository. community repositories allow librarians to be more easily involved in the conversations of the community and capture important artifacts of these conversations for later use.

merging library metadata into an enhanced catalog

participatory librarianship can be supported by another functional area of the library: collections. traditionally, the collection comprises books, magazines, and other information resources paid for by the library. electronic resources, such as databases that are leased instead of purchased, make up a large portion of library expenditures. more recently, web-based resources (external feeds and sites) have been selected and added to the virtual collection.

several kinds of finding aids are used to locate these information resources. the catalog and databases both contain descriptions of resources and searching interfaces. in order to improve access, libraries include records for databases within the catalog.
conversely, federated searching tools combine the records from different databases and could allow the retrieval of both books and articles by combining records from the traditional catalog and databases into one tool. if community-created resources are part of the catalog, then these resources also would be findable alongside other traditional library resources.

figure 6: road map of how the library might make the transition from current systems to a truly participatory system.

the tools for describing information resources also can be participatory. in traditional librarianship, the librarians provide metadata that patrons then use to make selections. by examining this use data, recommender systems can be created to help users locate new materials. in participatory networking, patrons will be encouraged to add comments about items. if standards are used for these comments, then they can be shared among libraries to create larger pools of recommendations. as these comments are analyzed, they can be combined with usage databases to create stronger recommender systems to present patrons with additional choices based upon what is being explored.

the end result is an enhanced catalog that allows users and libraries to find information regardless of which system the information resides in. however, the enhanced catalog is still just that, a catalog. it contains surrogates of digital information and is managed separately from the artifacts themselves. in the case of physical items, this may be all the library systems can manage, but in the case of digital content, there is one more step that needs to be taken. namely, the artificial barrier between catalog (defined as inventory control system) and content (housed in the community repository) must come down.
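the idea of building recommendations from use data and then strengthening them with patron-contributed ratings can be illustrated with a small sketch. everything here is an assumption made for the example: the anonymized borrowing histories, the simple co-use count, the neutral default rating of 3.0, and the function names are hypothetical and do not describe any particular library system.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_co_use(histories):
    """Count how often each pair of items appears in the same patron history."""
    co_use = defaultdict(Counter)
    for items in histories:
        for a, b in combinations(sorted(set(items)), 2):
            co_use[a][b] += 1
            co_use[b][a] += 1
    return co_use


def recommend(co_use, item, ratings=None, top_n=3):
    """Suggest the items most often used alongside `item`.

    `ratings` is an optional mapping of item -> mean patron rating (1 to 5);
    unrated items get a neutral 3.0 (an assumption of this sketch).
    """
    ratings = ratings or {}
    neighbours = co_use.get(item, {})
    scores = {
        other: count * ratings.get(other, 3.0)
        for other, count in neighbours.items()
    }
    # Highest score first; ties are broken alphabetically for repeatability.
    return sorted(scores, key=lambda other: (-scores[other], other))[:top_n]


# Made-up, anonymized borrowing histories for illustration only.
histories = [
    ["A", "B", "C"],
    ["A", "B"],
    ["B", "D"],
    ["A", "C"],
]
co_use = build_co_use(histories)

if __name__ == "__main__":
    print(recommend(co_use, "A"))
    print(recommend(co_use, "A", ratings={"B": 2.0, "C": 5.0}))
```

use data alone treats items B and C as equally good companions for item A; once patron ratings are supplied, the better-rated item is listed first. if comments and ratings followed a shared standard, the `histories` and `ratings` inputs could be pooled across libraries in the way the text suggests.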
building the participatory library at this point in the evolution of distributed systems into a truly integrated library system, the participatory library, we have two large collections: one of resources, and one of information about the resources. the first collection of digital content, the community repository, is built by the library and its users collaboratively. the second collection, the enhanced catalog, includes metadata, both formal and user­created (such as ratings, commentary, use data, and the like). both the community repository and the enriched catalog are participatory. yet to realize the dream of a seamless system of functionality (seamless to the user and the library), these two systems must be merged, allow­ ing users to find resources and, much more importantly, conversations. furthermore, the users must be able to add to metadata (such as tags to catalog records) and content (such as articles, postings to a wiki, or personal images). the result may be conceived of as a single integrated infor­ mation resource, which, for the purposes of this conversa­ tion, is called the participatory library. users may access the participatory library directly through the library or as a series of services in google, myspace, or their own home pages. the point is that the access to the library takes place at the point of conversa­ tion, not at the point the user realizes he or she needs information from the library. conversations and preservation the conversation model highlights the need for preserva­ tion. aside from simply providing systems that facilitate conversation, libraries serve as the vital community memory. conversations construct knowledge, but some­ one must remember what has already been said and know how to access that dialog. scientific conversations, for example, are built on previous conversations (theories, studies, methods, results, and hypotheses). capturing conversations and playing them back at the right time is essential. 
this might mean the preservation of artifacts (maps, transcripts, blueprints, photographs), but also it means the increasingly important tasks of capturing the digital dialogs. this highlights the need for institutional repositories (that will later be integrated seamlessly with other library systems, as previously discussed). specifically, web sites, lectures, courseware, and articles must be kept. further, they must be kept in true conversa­ tional repositories that capture the artifacts (the papers), the methods (data, instruments, policy documents), and the process (meeting notes, conversations, presentations, web sites, electronic discussions). they must be kept in information structures that make them readily available as conversations; in other words, users must be able to search for materials and reconstruct a conversation in its entirety from one fragment. being where the conversation is imagine the conversations that are going on in your local library as you read this. imagine the physicist chatting with the gardener, and the trustee talking with the volunteer who is reading the latest best­seller. what knowledge can be gleaned from these novel interac­ tions? can you measure it? can you enhance it? can you capture it? can you recall it when it would be precisely what a user needs? note also that these conversations do not belong solely to the library. the library is only part of the con­ versation. faced with the daunting variety of resources available on the web, many organizations try to become the single point of entry into it. remember that conversa­ tions are varied in their mode, places, and players, and, more importantly, that they are intensely personal. this means that participants need to have ownership in them, and often in their locations as well. this also means that the library, as facilitator, needs to be varied in its modes and access points. 
in many cases, it is better to either create a personal space in which users may converse, or, increasingly, to be part of someone else’s space. what we can learn from web 2.0’s mashups is that smaller sets of limited (but easy to access) functionalities lead to greater incorporation of tools into people’s lives. in the chicagocrime–google maps mashup, combining maps from google and chicago crime statistics, it was important for the host of the site to brand the space and shape the interface for his conversation on crime. can your library functions be as easily incorporated into these types of conversations? can a user search your catalog and present the results on his or her web site? the point is that libraries need to be proactive in a new way. instead of the mantra, “be where the user is,” we need to “be where the conversation is.” it is not enough to be at the users’ desktops; you need to be in their e-mail program, in their myspace pages, in their instant messaging lists, and in their rss feed readers. all of these examples point to a significant mental shift that librarians will need to make in moving from delivering information from a centralized location to delivering information in a decentralized manner where the conversations of users are taking place. the catalog example presented earlier is an example of a centralized place for conversations. what if, instead of only being in a catalog, the same data were split into smaller components and embedded in the user’s browser and e-mail programs? just as google’s mail system embeds advertising based upon the content of a message, the library could provide links to its resources based upon what a user is working on.
by disaggregating the information within its system, the library can deliver just what is needed to a user, provide connections into mashups, and live in the space of the user instead of forcing the user to come to the space of the library. challenges and opportunities there is clearly a host of challenges in incorporating par­ ticipatory networks and a participatory model into the library. this is to be expected when we are dealing with something as fundamental as knowledge and as personal as conversations. we consider four major challenges that must be met by libraries before they can truly get into the business of participatory librarianship. technical there is a rich suite of participatory networking software that libraries can incorporate into their daily operations. implementing a blog, a wiki, or rss feeds these days is not a hard task, and they can easily be used to deliver information about library services and conversations to the user’s space. furthermore, these systems are often tested in very large­scale environments and are, in some cases, the same tools used in large participatory network­ ing sites such as wikipedia and blogger. some of these packages are commercial, but others are open source software. open source software is cheaper, easier to adapt, and, in some cases, more advanced. the downside to open source is that it requires a considerable amount of technical knowledge by the library (but not as much as one might think) and does not come with a technical support hotline. the largest technological impediment, however, may be the currently installed base of software within librar­ ies. integrated library systems have a long history and include a broad range of library functions. legacy code and near monolithic systems have restricted the easy exchange of a diverse set of information. were these sys­ tems written today, they would use modular code and loosely coupled apis and allow customers much more interface customizability. 
these changes may come to integrated library systems (as customers are demanding them), but it may take years. several libraries are currently attempting to pick apart these integrated systems themselves. often, libraries go to the underlying databases that hold the library metadata or create their own data structures, such as the university of pennsylvania data farm project.22 once components of this system are exposed, the catalog simply becomes another database that can be federated into new and unified interfaces. however, such integration requires a great deal of technological expertise. there is an opportunity for integrated library system vendors or large consortial groups such as oclc to move quickly into this space. in the meantime, there is an opportunity for the larger library community. this technology brief was created in response to a perceived need. whether evidenced in the library 2.0 community or in conversations at lita, libraries are now interested in incorporating new web technologies into their offerings and operations. the technologies under consideration here present platforms for experimentation. rather than setting up thousands of separate experiments, however, the library community should create a participatory network of its own. the technology certainly exists to create a test bed for libraries to set up various combinations of communication technologies (blogs, tagging, wikis), to test new web services against pooled data (catalog data, metadata repositories, and large-scale data sets), and even to incorporate new services into the current library offerings (rss feeds, for example). by combining resources (money, time, expertise) in a single, large-scale test bed, libraries not only can get greater impact for their investments, but can directly experience life as a connected conversation. these connections, if built at the ground level, will then make it easier for the library to come into existence.
terminology can be clarified, claims tested, and best practices collaboratively developed, greatly accelerating innovation and dissemination.

operational

in addition to being in the conversation business, libraries are in the infrastructure business. one of the most powerful aspects of a library is its ability not only to develop a collection of some type of information, but to maintain it over time. sometimes infrastructure can be problematic (as in the case of legacy systems), but more often than not it provides a stable foundation from which to operate. there are many conversations going on that need infrastructure but have none (or little). think of the opportunities in your community for using the web to facilitate a conversation. it might be a researcher wanting to disseminate the results of his or her latest study. it might be a community organization seeking funding. it might be a business trying to manage its basic operational knowledge. the point is that such individuals and community organizations are not in the infrastructure business and could use a partner who is. imagine a local organization coming to the library and, within a few minutes, setting up a web site with an rss feed, a blog, and bulletin boards. the library facilitates, but does not own, that individual’s or organization’s conversation. it does form a strong partnership, however, that can be leveraged into resources and support. the true power of participatory networking in libraries is not to give every librarian a blog; it is in giving every community member a blog (and making the librarian a part of the community). in addition, the library can play the role of connecting these conversations to other users when appropriate.
participatory libraries allow the concept of com­ munity center (intellectual center, service center, media center, information center, meeting center) to be extended to the web. many public libraries have no problem providing meeting space to local non­profits. why not provide web meeting space in the form of a web site or web conferencing? many academic libraries attempt to capture the scholarly output of their faculties, why not help generate the output with research data stores? the answers to these questions inevitably come back to time and money. however, there is nothing in this brief that says such services have to be free. in fact, the best part­ nerships are formed when all partners are invested in the process. the true problem is that libraries have no idea of how to charge for such services. faculty would be glad to write library support into grants (in the form of web site creation and hosting), but need a dollar figure to include and how long each task will take. many libraries aren’t used to positioning their services on a per item basis, and this makes it difficult to build partnerships. sometimes it is not a lack of money, but a lack of structure to take in money that is the problem. policy as always, it is policy that presents the greatest challenges. the idea of opening the library functions to a greater set of inputs is rife with potential pitfalls. how can libraries use the technologies and concepts of facebook and myspace without being plagued by their problems? how can users truly be made part of the collection without the library being liable for all of their actions? the answers may lie in a seemingly obscure concept: identity management. conversations can range in their mode, topic, and duration. they also can vary in the conversants. the library needs to know a conversant’s status to determine policy (for example, we can only disclose this information to this person), and requires a unique identifier, such as a library card, to uphold it. 
in traditional libraries, that is the extent of identity management. in a participatory model, distinctions among identities become complex and graduated, and require us to consider a new approach. this new model, of patrons adding information directly to library systems, is not as radical as it may first appear. we have become very used to the idea of roles and tiered levels of authority in many other settings. most modern computer systems allow for some gradation in user abilities (and responsibilities). online communities have even introduced merit systems, by which continual high-quality contributions to a site earn greater power in the site. think about amazon, wikipedia, even ebay; as users contribute more to the community, they gain status and recognition. from participants to editors, from readers to writers, these organizations have seen membership as a sliding scale of trust, and libraries need to adopt this approach in all of their basic systems. we currently do, to a degree, in the form of librarians, paraprofessionals, and other staff. yet even these distinctions tend to be rigid and often class-based, with high walls (such as a master’s degree) between the strata. some of this is imposed by outside organizations (civil service requirements, tenure track, and so on), but a great deal is there by inertia of the field. skillful use of identity management will help libraries avoid the baggage of myspace and facebook. as users gain greater access, greater responsibility, and greater autonomy, libraries need to be more certain of their identities. that is, for a user to do more requires the library to know more. knowing about a user may involve traditional identity verification or tracking an activity trail, whereby intentions can be judged in relation to actions.
these concepts may be expressed as, “the more we know you, the more control you can have in valuable services such as blogging, or the catalog.” the concepts are illustrated in blogger and livejournal, both of which require some level of identity information. in another example, to join livejournal you must be invited, thus the community confers identity. the common theme is that verifying (and building) identity is community-based. the difference between the library and myspace is that the library works in an established community with traditional norms of identity, whereas myspace is seeking to create a community (where identity is more defined by social connections than actions). both the library and the services mentioned above, however, base their functions and services on identity.

ethical

as knowledge is developed through conversation, and libraries facilitate this process, libraries have a powerful impact on the knowledge generated. can librarians interfere with and shape conversations? absolutely. should we? we can’t help it. our collections, our reference work, our mere presence will influence conversations. the question is, in what ways? by dedicating a library mission to directly align with the needs of a finite community, we are accepting the biases, norms, and priorities of the community. while a library may seek to expand or change the community, it does so from within. when internet filtering became a requirement for federal internet funding, public and school libraries could not simply quit, or ignore the fact, because they are agents of their communities. school libraries had to accept filtering with federal funding because their parent organizations, the schools, accepted filtering.23 we see, from this example, that libraries may shift from facilitating conversations to becoming active conversants, but they are always doing both.
thus, the question is not whether the library shapes conversations, but which ones, and how actively? these questions are hardly new to the underlying principles of librarianship. and nothing in the participa­ tory model seeks to change those underlying principles. the participatory model does, however, highlight the fact that those principles shape conversations and have an impact on the community. ■ recommendations the overall recommendation of this article is that librar­ ies must be active participants in the ongoing conversa­ tions about participatory networking. they must do so through action, by modeling appropriate and innovative use of technologies. this must be done at the core of the library, not on the periphery. rather than just adding blogs and photosharing, libraries should adopt the princi­ ples of participation in existing core library technologies, such as the catalog. anything less simply adds stress and stretches scarce resources even further. to complement this broad recommendation, the authors make two specific proposals: expand and deepen the discussion and understanding of participatory net­ works and participatory librarianship, and create a par­ ticipatory library test bed to give librarians needed participatory skills and sustain a standing research agenda in participatory librarianship. as stated in the outset of this document, what you are reading is limited. while it certainly contains the kernel and essence of participatory networks (systems to allow users to be truly part of services) and participatory librar­ ianship (the role of librarianship as facilitators and actors in conversations in general), the focus was on technology and technology changes. already, the ideas contained in this document have been part of an active conversation. the first draft of this document was made available for public comment via a wiki, e­mail, and bulletin boards, and concepts herein presented at conferences and lec­ tures. 
however, there is now a need to broaden the scope and scale of the conversation. the theoretical founda­ tions of participatory librarianship need to be rigorously presented. the nontechnical components of the ideas (and the marriage of nontechnical to technical) need to be explored. there are curricular implications: how do we prepare participatory librarians? the nature and form of the library and participatory systems need to be dis­ cussed and examined in theoretical, experimental, and operational contexts. in order to do this, the authors propose a series of con­ versations to engage the ideas. these conversations, both in person and virtual, need to be within the profession and across disciplines and industries. the deeper conversa­ tions need to be documented in a series of publications that expand this document for academics and practitioners. the authors feel, however, that the first proposal must be grounded in action. to complement the more abstract exploration of participatory networks and participatory librarianship, there must be an active playground where conversants can experience firsthand the technologies discussed, and then actively shape the tools of partici­ pation. this is the test bed. this test bed would imple­ ment a participatory network of libraries, and provide a common technology platform to host blogs, wikis, discussion boards, rss aggregators, and the like. these shared technologies would be used to experiment with new technologies and to provide real services to librar­ ies. thus, libraries could not only read about blogging applications, they could try them and even roll them out to their community members. as libraries start new com­ munity initiatives, they could rapidly add wikis and rss feeds hosted at the shared test bed. the test bed would also make all software available to the libraries so they could locally implement technologies that have proven themselves. 
the test bed would provide the open source software and consulting support to implement features locally. the test bed also would develop new metrics and means of evaluating participatory library services for the use of planners and policy makers. a major deliverable of the test bed, however, would be to model innovations in integrated library systems (ils). the test bed would work with libraries and ils vendors to pilot new technologies and specify new standards to accelerate ils modernization. the point of the test bed is not to create new ilss, but to make it easy to incorporate innovative technologies into vendor and open source ilss. the location and support model of the test bed are open for the library community to determine. certainly, it could be placed in existing library associations or organizations. however, it would require the host to be seen as neutral in ils issues, and to be capable of supporting a diverse infrastructure over time. the host organization also would need to be a nimble organization, able to identify new technical opportunities and implement them quickly. one model that might work is establishing a pooled fund from interested libraries. this pooled fund would support an open source technology infrastructure and a small team of researchers and developers. the team’s activities would be overseen by an advisory panel drawn from contributing members. such a model spreads this investment out into experimentation across a broad collaboration and should, ultimately, save libraries time and money. as a result, the time and money that individual libraries might spend on isolated or disconnected experiments can be invested in a common effort with greater return. libraries have a chance not only to improve service to their local communities, but to advance the field of participatory networks.
with their principles, dedication to service, and unique knowledge of infrastructure, libraries are poised not simply to respond to new technologies, but to drive them. by tying technological implementa­ tion, development, and improvement to the mission of facilitating conversations across fields, libraries can gain invaluable visibility and resources. impact and leadership, however, come from a firm and conceptual understanding of libraries’ roles in their communities. the assertion that libraries are an indis­ pensable part of knowledge generation in all sectors pro­ vides a powerful argument to an expanded function of libraries. eventually, blogs, wikis, rss, and ajax all will fade in the continuously dynamic internet environment. however, the concept of participatory networks and con­ versations is durable. ■ acknowledgements the authors would like to thank the following people and groups: ken lavender, for his editing prowess. the doctoral students of ist 800 for providing input on conversation theory: johanna birkland, john d’ignazio, keisuke inoue, jonathan jackson, todd marshall, jeffrey owens, katie parker, david pimentel, michael scialdone, jaime snyder, sarah webb. the students of ist 676 for their tremendous input and for their exploration of the related concept of massive scale librarianship: marcia alden, charles bush, janet chemotti, janet feathers, gabrielle gosselin, ana guimaraes, colleen halpin, katie hayduke, agnes imecs, jennifer kilbury, min­chun ku, todd mccall, virginia payne, joseph ryan, jean van doren, susan yoo. those who commented on the draft, including karen scheider, walt crawford and john buschman, and kathleen de la peña mccook. lita for giving us a forum for feedback. carrie lowe, rick weingarten, and mark bard of ala’s oitp for their feedback and support. the institute staff, including lisa pawlewicz, joan laskowski, and christian o’brien, for logistical support. references and notes 1. cited in p. hardiker and m. 
baker, “towards social theory for social work,” handbook of theory for practice teachers in social work, j. lishman, ed. (london: jessica kingsley, 1991).
2. g. pask, conversation theory: applications in education and epistemology (new york: elsevier, 1976).
3. linda h. bertland, “an overview of research in metacognition: implications for information skills instruction,” school library media quarterly 15 (winter 1986): 96–99.
4. pask, conversation theory, 92.
5. tim o’reilly, “what is web 2.0: design patterns and business models for the next generation of software,” o’reilly, www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html (accessed feb. 1, 2007).
6. j. surowiecki, the wisdom of crowds (new york: doubleday, 2004).
7. “wiki’s wild world: researchers should read wikipedia cautiously and amend it enthusiastically,” nature 438, no. 890 (dec. 2005): 890; www.nature.com/nature/journal/v438/n7070/full/438890a.html (accessed feb. 1, 2007).
8. google, “google maps api,” www.google.com/apis/maps (accessed feb. 1, 2007).
9. “java script tutorial,” w3 schools, www.w3schools.com/js/default.asp (accessed feb. 1, 2007).
10. while the terms in web 2.0 are a bit ambiguous, many people confuse the term “mashup” with “remix.” mashups combine data and functions (such as mapping), whereas remixes reuse and combine content only. so combining a song with a piece of video to create a “new” music video would be a remix. mapping all of your videos on a map using youtube to store the videos and google maps to plot them geographically would be a mashup.
11. for example, gmail is a very widely used, web-based email service, but it is still considered “beta” by google.
12. malcolm gladwell, the tipping point: how little things can make a big difference (boston: back bay books, 2000), 272.
13. oclc, “introduction to dewey decimal classification,” www.oclc.org/dewey/versions/ddc22print/intro.pdf (accessed feb. 1, 2007).
14.
“ajax (programming),” wikipedia, http://en.wikipedia.org/wiki/ajax_(programming) (accessed feb. 1, 2007).
15. “web services activity,” w3c, www.w3.org/2002/ws (accessed feb. 1, 2007).
16. walt crawford, “library 2.0 and ‘library 2.0.’” cites & insights 6, no. 2 (2006), http://citesandinsights.info/civ6i2.pdf (accessed dec. 13, 2007).
17. eric ormsby, “the battle of the book: the research library today,” the new criterion (oct. 2001): 8.
18. ken chad and paul miller, “do libraries matter? the rise of library 2.0: a white paper,” version 1.0, 2005, www.talis.com/downloads/white_papers/dolibrariesmatter.pdf (accessed feb. 1, 2007).
19. slashdot, “myspace #1 us destination last week,” http://slashdot.org/articles/06/07/12/0016211.shtml (accessed feb. 1, 2007); pete williams, “myspace, facebook attract online predators,” msnbc, www.msnbc.msn.com/id/11165576 (accessed feb. 1, 2007); “the myspace generation,” businessweek, dec. 12, 2005, www.businessweek.com/magazine/content/05_50/b3963001.htm (accessed feb. 1, 2007).
20. saturday night live, “sketch: myspace seminar,” nbc, www.nbc.com/saturday_night_live/segments/9166.shtml (accessed feb. 1, 2007).
21. c. stohl and g. cheney, “participatory processes/paradoxical practices,” management communication quarterly 14, no. 3 (2001): 349–407.
22. j. zucca, “traces in the clickstream: early work on a management information repository at the university of pennsylvania,” information technology and libraries 22, no. 4 (2003): 175–78.
23. to be more precise, public and school libraries that accept e-rate funding.
editorial board thoughts: developing relentless collaborations and powerful partnerships
mark dehmlow
information technologies and libraries | june 2017 3

with the performance and fiscal year wrapping up, it seemed like a good time to reflect on what change initiatives we have engaged in over the past few years that have strengthened the organizational effectiveness of the it department in our library. my thoughts almost immediately drifted to our focus on collaboration. early in my career, it was the profession-wide culture of cross-institutional collaboration that convinced me that becoming a librarian would be the right career move. i am certain that the impetus to collaborate stems from our professional service commitment, a values-based system that at its core believes that the success of all helps the collective do their jobs better in the name of service to our patrons. and yet, over the years, i have heard stories of and observed firsthand internal competitions for resources, vilification of library it as siloed and opaque factions, and library it departments that have had strained relationships with their institution’s central it organizations. as a part of our senior leadership team for the hesburgh libraries, two of my core professional interests are organizational effectiveness and staff satisfaction, especially in the face of a rapidly changing technology landscape, competition for talent in the it sector where it is hard to contend with commercial salaries, and the slow rate of attrition at the university. retaining talented it staff requires creating a work culture that is better than that of the commercial sector: one that values work/life balance, innovation and experimentation, and teamwork and camaraderie, and that has a clear sense of strategic priority.
to build these latter two qualities into our work culture, we have strategically emphasized durable internal and external coalitions with a tenacious sense of partnership. true collaboration reinforces a collective sense of goals, allows for maximal efficiency, discourages unnecessary or destructive competition, and opens the door to the coveted but seldom realized ability to “stop doing” through partnering with other units on campus that share a sense of priority around particular services. creating sustainable and significant internal collaboration requires etching it into the culture of the organization. making it a part of the organization’s dna has to be prioritized and modeled by senior leadership, and it begins with advancing shared goals over singular agendas. in our senior leadership team, we have committed to each other as our primary team. we may advocate for staff and initiatives in our own verticals, but our drive is to be holistic stewards for the libraries, not just our functional departments. we give as much weight, if not more, to the objectives of the collective senior leadership team, which also helps in clarifying priorities. our executive leadership models cooperation, cross-divisional problem solving, and collective strategic initiative planning. using this model, decisions get made more quickly, enhancing our ability to accomplish things on time with a high level of quality, and with a considerable level of satisfaction for our staff and faculty. the it department is viewed less as a black box where decisions about what to work on are made behind the curtains and more as a group of talented staff who help our organization accomplish its priorities.

mark dehmlow (mdehmlow@nd.edu), a member of the ital editorial board, is director of library information technology, hesburgh libraries, university of notre dame, south bend, in.
https://doi.org/10.6017/ital.v36i2.10044
when our it department needs to advocate for support and timely completion of work from individuals in other departments, the other senior managers help get their units mobilized. we see ourselves as part of the community and the community embraces us as part of them. historically, it has been tempting to view it as somewhat separate, a part of the production line, but in an age where every operation in the library is affected by technology, our workflows need to be more integrated and team based. the problems we are working on are more cross-disciplinary and require a plurality of expertise to solve. libraries are increasingly becoming an interconnected and interdependent ecosystem that requires thinking holistically about problems and a relentless commitment to building coalitions to drive our services. it may seem obvious that this would be a more effective way to work, yet i have spoken with many people at organizations where there is a clear culture of departmental objective separation and competition for resources. i have long appreciated the work environment at notre dame, in part because we strive to be an organization whose culture is guided by our core institutional values: accountability, integrity, excellence in leadership, excellence in mission, and teamwork. these values drive not only our internal collaborations, but also the way in which different departments on campus work with each other. we have had a long-standing, positive relationship with our central office of information technologies (oit), one that has been tremendously cooperative, but for many years has lacked interconnections at a variety of levels and a clear collaborative and strategic focus.
in the last 5 years, our organizations have shifted their focus: the oit from emphasizing centralized, administrative, enterprise computing to decentralized, academic, enterprise computing, and the libraries from doing everything in house to leveraging existing services for standardized needs and focusing our staff’s time on initiatives where they can create the most value. in part, we developed an in-house it department because we had service expectations that weren’t a priority at the time for the oit. but during our strategic transitions, we have extended our working relationships at every level throughout our organizations, from our staff in the trenches to our managers and senior leaders. my focus as the director for library it over the past few years has been to look at ways we can enhance our capacity through partnerships. to that end, there are several interrelated initiatives that we have begun to engage in with the oit:
 1. embedding an oit presence in the libraries,
 2. shifting support for common it services to the oit, and 
 3. consolidating our customer communication through their service portal servicenow. 
the first step in this new collaboration with the oit was letting go of the past and revisiting where the oit and the libraries have strategic overlaps that may not have been aligned before. as two service organizations on campus with a deep concern for supporting the academic endeavor, it was easy to find strategic alignment with each other. for the libraries, we often get questions at our service points about how to change passwords or install printer drivers, needs that are part of the central it service portfolio. for the oit, the libraries are a major campus hub where hordes of students and faculty conduct research and work on assignments, particularly after classes when many of the business units leave the university for the day. working closely with the libraries’ director for teaching, research, and user services and the oit’s senior director for user services, we began developing a collaboration grounded in our common desire to support end users, which resulted in creating an oit outpost in the libraries. while there are many libraries that have this kind of collaboration, this was a revolutionary step for us. this collaboration opened the door for us to begin a discussion about common technology services that we have been supporting internally: printing and general lab computing. for us, it is important that these services function well for our end users, but they are not services that require library expertise to accomplish. the oit supports these services for much of campus, and as long as we have aligned service-level expectations that are practical and committed to excellence, the oit can handle that function much more efficiently and we can use our staff expertise to support other, emerging services that are core to the libraries. we are also working closely with the oit to leverage their it service portal, servicenow, as the libraries’ service portal.
given that our service portfolio is much broader than strictly it services, moving in this direction for us required a willingness from the oit to think outside of the box and allow us to customize the system to meet our service needs. it has required some reciprocation from the libraries as well. the servicenow platform is more expensive than others we could license, its functionality will require effort from our staff to customize, and it is requiring us to change workflows, especially in the public services areas. integrating our customer communication into this platform, though, will create a better user experience for our patrons by supporting a common interface they are experienced with, and it will allow us to more easily transfer both staff and patron general it questions to the oit. beginning to work in truly collaborative ways requires shifting the narrative around our relationships from a client/provider model to one of a coalition. redefining these relationships as partnerships puts both parties on equal footing around the planning table where everyone has an equal stake in the objectives and outcomes. such partnerships don’t come effortlessly; they require libraries to ardently become more visible on campus, to articulate the complementary value that we can contribute to campus initiatives, and to proactively request to join initiatives that we haven’t participated in before. it also takes reaching out and helping campus partners see how we can collectively create value using our unique talents to successfully support the campus community. and lastly, it takes engaging a more holistic view of the university and the way we steward its resources; sometimes that will mean allocating more resources for the common good versus taking the narrower view that we should only consider our own context when adopting solutions.
but in the end, if we are willing to think about our role at the university in that broader context and build powerful partnerships, we will collectively be able to serve our end users better.

a file storage service on a cloud computing environment for digital libraries
victor jesús sosa-sosa and emigdio m. hernandez-ramirez
information technology and libraries | december 2012

abstract

the growing need for digital libraries to manage large amounts of data requires storage infrastructure that libraries can deploy quickly and economically. cloud computing is a new model that allows the provision of information technology (it) resources on demand, lowering management complexity. this paper introduces a file-storage service that is implemented on a private/hybrid cloud-computing environment and is based on open-source software. the authors evaluated performance and resource consumption using several levels of data availability and fault tolerance. this service can be taken as a reference guide for it staff wanting to build a modest cloud storage infrastructure.

introduction

the information technology (it) revolution has led to the digitization of every kind of information.1 digital libraries are appearing as one more step toward easy access to information spread throughout a variety of media. the digital storage of data facilitates information retrieval, allowing a new wave of services and web applications that take advantage of the huge amount of data available.2 the challenges of preserving and sharing data stored on digital media are significant compared to the print world, in which data “stored” on paper can still be read centuries or millennia later. in contrast, only ten years ago, floppy disks were a major storage medium for digital data, but now the vast majority of computers no longer support this type of device. in today’s environment, selecting a good data repository is important to ensure that data are preserved and accessible.
likewise, defining the storage requirements for digital libraries has become a big challenge. in this context, it staff, those responsible for predicting what storage resources will be needed in the medium term, often face the following scenarios:
• predictions of storage requirements turn out to be below real needs, resulting in resource deficits.
• predictions of storage requirements turn out to be above real needs, resulting in expenditure and administration overhead for resources that end up not being used.
in these situations, considering only an efficient strategy to store documents is not enough.3 the acquisition of storage services that implement an elastic concept (i.e., storage capacity that can be increased or reduced on demand, with relatively low acquisition and management costs) becomes attractive. cloud computing is a current trend that considers the internet as a platform providing on-demand computing and software as a service to anyone, anywhere, and at any time. digital libraries naturally should be connected to cloud computing to obtain mutual benefits and enhance both perspectives.4 in this model, storage resources are provisioned on demand and are paid for according to consumption. services deployment in a cloud-computing environment can be implemented in three ways: private, public, or hybrid. in the private option, infrastructure is operated solely for a single organization; most of the time, it requires a strong initial investment because the organization must purchase a large amount of storage resources and pay for the administration costs. the public cloud is the most traditional version of cloud computing.

victor jesús sosa-sosa (vjsosa@tamps.cinvestav.mx) is professor and researcher at the information technology laboratory at cinvestav, campus tamaulipas, mexico. emigdio m. hernandez-ramirez (emhr1983@gmail.com) is software developer, svam international, ciudad victoria, mexico.
in this model, infrastructure belongs to an external organization where costs are a function of the resources used. these costs include administration. finally, the hybrid model contains a mixture of private and public. a cloud-computing environment is mainly supported by technologies such as virtualization and service-oriented architectures. a cloud environment provides omnipresence and facilitates deployment of file-storage services. this means that users can access their files via the internet from anywhere and without requiring the installation of a special application. the user only needs a web browser. data availability, scalability, elastic service, and pay-per-use are attractive characteristics found in the cloud service model. virtualization plays an important role in cloud computing. with this technology, it is possible to have facilities such as multiple execution environments, sandboxing, server consolidation, use of multiple operating systems, and software migration, among others. besides virtualization technologies, emerging tools that allow the creation of cloud-computing environments also support this type of computing model, providing dynamic instantiation and release of virtual machines and software migration. currently, it is possible to find several examples of public cloud storage, such as amazon s3 (http://aws.amazon.com/en/s3), rackspace (http://www.rackspace.com/cloud/public/files), and google storage (https://developers.google.com/storage), each of which provides high availability, fault tolerance, and services and administration at low cost. for organizations that do not want to use a third-party environment to store their data, private cloud services may offer a better option, although the cost is higher. in this case, a hybrid cloud model could be an affordable solution. organizations or individual users can store sensitive or frequently used information in the private infrastructure and less sensitive data in the public cloud.
the development of a prototype of a file-storage service implemented on a private and hybrid cloud environment using mainly free and open-source software (foss) helped us to analyze the behavior of different replication techniques. we paid special attention to the cost of the system implementation, system efficiency, resource consumption, and different levels of data privacy and availability that can be achieved by each type of system.

infrastructure description

the aim of this prototyping project was to design and implement a scalable and elastic distributed storage architecture in a cloud-computing environment using free, well-known, open-source tools. this architecture represents a feasible option that digital libraries can adopt to solve financial and technical challenges when building a cloud-computing environment. the architecture combines private and public clouds by creating a hybrid cloud environment. for this purpose, we evaluated tools such as kvm and xen, which are useful for creating virtual machines (vm).5 open nebula (http://opennebula.org), eucalyptus (http://www.eucalyptus.com), and openstack (http://www.openstack.org) are good, free options for managing a cloud environment. we selected open nebula for this prototype. commodity hard drives have a relatively high failure rate, hence our main motivation to evaluate different replication mechanisms, providing several levels of data availability and fault tolerance. figure 1(a) shows the core components of our storage architecture (the private cloud), and figure 1(b) shows a distributed storage web application named distributed storage on the cloud (disoc), used as a proof of concept. the private cloud also has an interface to access a public cloud, thus creating a hybrid environment.

figure 1.
main components of the cloud storage architecture

the core components and modules of the architecture are the following:
• virtual machine (vm). we evaluated different open-source tools, such as kvm and xen, for the creation of virtual machines.6 some performance tests were done, and kvm showed a slightly higher performance than xen. we selected kvm as the main virtual machine manager (vmm) for the proposed architecture. vmms also are called hypervisors. each vm has a linux operating system that is optimized to work in virtual environments and requires a minimum consumption of disk space. the vm also includes an apache web server, a php module, and some basic tools that were used to build the disoc web application. every vm is able to transparently access a pool of disks through a special data access module, which we called dam. more details about dam follow.
• virtual machine manager module (vmmm). this has the function of dynamic instantiation and de-instantiation of virtual machines depending on the current load on the infrastructure.
• data access module (dam). all of the virtual disk space required by every vm was obtained through the data access module interface (dam-i). dam-i allows vms to access disk space by calling dam, which provides transparent access to the different disks that are part of the storage infrastructure. dam allocates and retrieves files stored throughout multiple file servers.
• load balancer module (lbm). this distributes the load among different vms instantiated on the physical servers that make up the private cloud.
• load manager (lm). this monitors the load that can occur in the private cloud.
• distributed storage on the cloud (disoc). this is a web-based file-storage system that is used as a proof of concept and was implemented based on the proposed architecture.
replication techniques

high availability is one of the important features offered in a storage service deployed in the cloud. the use of replication techniques has been the most useful proposal to achieve this feature. dam is the component that provides different levels of data availability. it currently includes the following replication policies: no-replication, total-replication, mirroring, and ida-based replication.
• no-replication. this replication policy represents the data availability method with the lowest level of fault tolerance. in this method, only the original version of a file is stored in the disk pool. it follows a round-robin allocation policy whereby load assignment is made based on a circularly linked list, taking into account disk availability. this policy prevents all files from being allocated to the same server, providing a minimal fault tolerance in case of a server failure.
• mirroring. this replication technique is a simple way to ensure higher availability without high resource consumption. in this replication, every time a file is stored in a disk, the dam creates a copy and places it on a different disk.
• total-replication. this represents the highest data availability approach. in this technique, a copy of the file is stored on all of the file servers available. total-replication also requires the highest consumption of resources.
• ida-based replication. to provide higher data availability with less impact on the consumption of resources, an alternative approach based on information-dispersal techniques can be used.
the information dispersal algorithm (ida) is an example of this strategy.7 when a file (of size |f|) is required to be stored using the ida, the file is partitioned into n fragments of size |f|/m, where m [...]

[...]

journal of library automation vol. 6/1 march 1973

$P(b, c) = \sum_{k=c+1}^{\infty} e^{-ab} \frac{(ab)^k}{k!}$ (10)

for selected values of b, beyer's tables of the poisson distribution have been used to compute $P(b, c)$ and to determine the smallest value of c for which $P(b, c) \le 0.01$.29 the results are shown in table 1 for the instance in which a = 1. a similar table has been computed by buchholz30 for the instance in which c = b and a ranges from 0.1 to 1.2. as is apparent from table 1, an increase in the value of b allows use of a smaller ratio c/b and hence permits more economical use of storage. with b = 64 the allowed value of c/b is 1.33 and hence c may be chosen equal to 85. the reduction in access time that results from structuring the file so that each bucket contains both index and content entries is, of course, effected at the expense of additional storage costs. for example, if c/b = 1.33 then the space allocated for storage of content entries is 33 percent greater than if content entries are stored in a separate file. relaxation of the condition $P(b, c) \le 0.01$ allows a reduction in c/b, but the increased number of bucket overflows will cause additional disk accesses to be required.

table 1. values of b, c, and c/b for which $P(b, c) \le 0.01$ when a = 1.

    b      c     c/b
    1      5    5.00
    2      6    3.00
    3      8    2.66
    4     10    2.50
    5     11    2.20
    6     13    2.17
    7     14    2.00
    8     15    1.88
    9     17    1.89
   10     18    1.80
   11     19    1.73
   12     20    1.67
   13     22    1.69
   14     23    1.64
   15     24    1.60
   16     25    1.56
   17     27    1.59
   18     28    1.55
   19     29    1.53
   20     30    1.50
   60     80    1.33
  100    125    1.25

treatment of bucket overflows

when a new key is found to map into a bucket whose content section is full then some means must be found to provide space in some other bucket.
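as a numerical check on equation 10 and table 1, the following python sketch evaluates the poisson tail directly instead of reading it from printed tables. it assumes that equation 10 is the tail of a poisson distribution with mean ab and that a table entry is the smallest c whose overflow probability does not exceed 0.01. the function names are illustrative and do not come from the paper.

```python
import math


def overflow_probability(b: int, c: int, a: float = 1.0) -> float:
    """Poisson tail P(b, c): chance that more than c keys map to one bucket.

    Assumes the number of keys per bucket is Poisson with mean a * b.
    """
    mean = a * b
    term = math.exp(-mean)          # k = 0 term of the Poisson distribution
    cumulative = term
    for k in range(1, c + 1):
        term *= mean / k            # builds mean**k / k! without big factorials
        cumulative += term
    return max(0.0, 1.0 - cumulative)


def smallest_content_capacity(b: int, a: float = 1.0, limit: float = 0.01) -> int:
    """Smallest c with P(b, c) <= limit (assumed reading of the table criterion)."""
    c = 0
    while overflow_probability(b, c, a) > limit:
        c += 1
    return c


if __name__ == "__main__":
    for b in (2, 3, 64):
        c = smallest_content_capacity(b)
        print(b, c, round(c / b, 2))
```

relaxing the limit, or lowering the load factor a, can be explored by changing the keyword arguments.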
the particular procedure that should be used depends on the extent to which the entire set of buckets contain unfilled portions. suppose that many buckets are almost full and that the number c of allowed content entries is less than 127. the entire hash file may then be expanded with the same index sections but with longer content sections. if many buckets are almost full and c = 127 then the entire file may be expanded in such manner that each bucket is replaced by a pair of buckets that contain the same number b of allowable index entries, but whose number $c_1$ of allowable content entries is chosen to ensure that $P(b, c_1) \le 0.01$. such doubling of buckets also doubles the number of index entries but it does not double the storage required for the entire file. each key k that corresponds to an entry in the original bucket is associated with an entry in the first, or second, of the new buckets according as the leading bit of either its index address $I(k)$ or its minor $M(k)$ is equal to 0 or 1. the effect is to shift one bit from $I(k)$ or $M(k)$ into the bucket address $B(k)$. this method is based on a suggestion of morris.31

file structure for an on-line catalog / dimsdale

suppose that few buckets are almost full. then a suitable means of determination of an unfilled bucket for storage of the minor is through use of some overflow algorithm that determines a sequence of bucket numbers $\beta_0(k)$, $\beta_1(k)$, $\beta_2(k)$, etc., corresponding to any given full bucket $\beta_0(k)$. suppose there are $N_b$ buckets. a quadratic residue algorithm

$\beta_j(k) = [\beta_0(k) + a j + b j^2] \bmod N_b$ (11)

has been considered by maurer and by bell for use with in-core hash tables, but it suffers from the disadvantage that the existence of a full bucket $\beta_0(k)$ will divert entries into the particular buckets $\beta_1(k)$, $\beta_2(k)$, etc.
and hence cause them to fill more rapidly than other buckets which may contain fewer entries.32, 33 it is believed that a more desirable form of the quadratic residue algorithm is

$\beta_j(k) = \{\beta_0(k) + f_j[I(k)]\} \bmod N_b$ (12)

where $f_j$ is a suitably chosen function. letting $\beta_j(k)$ depend, through $f_j$, on both j and $I(k)$, instead of on j alone, allows reduction of the tendency to fill a particular set of buckets. to prevent a tendency to overflow particular buckets it is also desirable for the overflow algorithm to produce bucket numbers that are uniformly distributed among all possible bucket numbers. among the more promising forms to be chosen for the $f_j[I(k)]$ are the following:

$f_j[I(k)] = I'(k)\, j$ (13a)

where $j = 1, 2, \ldots, N_b - 1$, and $I'(k)$ denotes $I(k)$ if $I(k)$ is odd, but denotes $I(k) + 1$ if $I(k)$ is even. since $N_b$ is a power of 2 such choice of $I'(k)$ ensures that $I'(k)$ and $N_b$ have no common factors, and hence that $\beta_j(k)$ steps through the sequence $\beta_0(k)$, $\beta_1(k)$, etc. covering every bucket in the file.

$f_j[I(k)] = I'(k)\, j^2$ (13b)

where $j = 1, 2, \ldots, \lceil \sqrt{N_b} \rceil - 1$, and $\lceil\ \rceil$ means "the least integer greater than or equal to."

$f_j[I(k)] = r_j[I'(k)]$ (13c)

where $j = 1, 2, \ldots, N_b$, and $r_j[I'(k)]$ denotes a number output by a pseudorandom number generator of the form suggested by morris34 with an initial input of $I'(k)$ instead of 1.

it may be remarked that use of equation 13a requires the least number of machine instructions, and the least cpu time per step, but it has a strong tendency to cluster the $\beta_j(k)$ immediately after the $\beta_0(k)$ and hence it is likely to be the least effective of the three methods. use of equation 13b produces less clustering, but the sequence does not include all buckets of the file. use of equation 13c requires the largest number of instructions and cpu time per step, but the $\beta_j(k)$ are less likely to cluster and they are uniformly distributed among all possible buckets.
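the coverage claim made for equation 13a can be illustrated with a short python sketch. it assumes the overflow sequence is (home bucket + i'(k) j) mod n_b with i'(k) forced odd, as the text describes; the function names and the sample values are hypothetical.

```python
def odd_step(index_address: int) -> int:
    """I'(k): the index address if it is odd, otherwise the next integer."""
    return index_address if index_address % 2 == 1 else index_address + 1


def overflow_sequence(home_bucket: int, index_address: int, n_buckets: int) -> list[int]:
    """Overflow buckets for j = 1 .. n_buckets - 1 using f_j = I'(k) * j.

    n_buckets is assumed to be a power of 2, so an odd step shares no
    factor with it and every other bucket is visited exactly once.
    """
    step = odd_step(index_address)
    return [(home_bucket + step * j) % n_buckets for j in range(1, n_buckets)]


if __name__ == "__main__":
    # Hypothetical example: 16 buckets, home bucket 5, index address 6.
    sequence = overflow_sequence(5, 6, 16)
    print(sequence)
    print(len(set(sequence)))  # number of distinct overflow buckets visited
```

using the even index address directly as the step would revisit buckets, which is why the step is made odd.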
thus equation 13c produces shorter chains of overflow buckets and hence requires fewer disk accesses. if a new key k maps into a full bucket $\beta_0(k)$ then the following procedure is used to determine the bucket into which the minor of k is to be inserted:
(i) the chain of pointers from the $I(k)$th entry of the bucket $\beta_0(k)$ is followed, possibly through overflow buckets given by equation 12, in order to locate the terminal entry of the chain. suppose this terminal entry is within a bucket $\beta_j(k)$.
(ii) if there is available space in bucket $\beta_j(k)$ then the minor $M(k)$ is entered and chained as described previously.
(iii) if bucket $\beta_j(k)$ is full, but there is space in $\beta_{j+1}(k)$, then the minor $M(k)$ is entered into $\beta_{j+1}(k)$ and chained as described previously.
(iv) if buckets $\beta_j(k)$ and $\beta_{j+1}(k)$ are both full, and bucket $\beta_{j+1}(k)$ contains at least one nonempty index entry $I(k')$ whose chained content entries are all contained within $\beta_{j+1}(k)$, then the minor $M(k)$ is stored according to the following displacement algorithm: the terminal member of the chain from $I(k')$ is displaced to an overflow bucket $\beta_r(k')$ determined by use of equation 12, except that if both $\beta_r(k')$ and $\beta_{r+1}(k')$ are full then a further bucket is determined by use of the displacement algorithm. the minor $M(k)$ is substituted for the displaced entry in bucket $\beta_{j+1}(k)$ and is chained appropriately.
(v) if application of step (iv) leads to a bucket $\beta_{j+1}(k)$, or $\beta_{r+1}(k)$, that contains no nonempty index entry whose chained content entries are all contained within it, then the entire hash file must be expanded by use of one of the procedures described at the beginning of the present section.
it should be emphasized that, although step (iv) is necessary for completeness, the probability of its use is very low.
with a probability of less than 0.01 for a bucket overflow, the probability of use of step (iv) is less than $(0.01)^3$.

search phase and problem of mismatch

in the previous sections the structure of the hash index file has been discussed with emphasis on details of its creation and update. during search of the catalog files by use of the inverted index, each search key is processed by the following search algorithm:
step 1: the search key k is transformed by the hashing function into a virtual hash address $B(k)$, $I(k)$, $M(k)$.
step 2: the bucket $\beta(k)$ is read into core.
step 3: the index entry specified by $I(k)$ is examined. if it is empty then the search key is not present in the data base. if it is not empty then step 4 is performed.
step 4: the overflow bit of the index entry specified by $I(k)$ is examined. if it is equal to 1 then step 5 is performed. if it is equal to 0 then step 6 is performed.
step 5: the overflow algorithm is used to determine the address of the required overflow bucket which is then read into core, and step 6 is executed.
step 6: the minor of each entry in the chain of content entries is compared to the minor of the search key's virtual hash address until either a match is found or the chain is exhausted. whenever the chain leads to an overflow bucket then step 5 is performed.
step 7: if a match is found for $M(k)$ then the collision bit of the entry is examined. if it is equal to 0 then step 9 is performed. if it is equal to 1 then step 8 is performed.
step 8: the dictionary entry that corresponds to each content entry in the virtual address collision is read into core and compared to the search key k. if no match is found then the search key is not present in the index.
step 9: this step is included because there is a small probability that a misspelled search key, or one not present in the hash file, may be transformed into the same virtual address as some key already included in the file.
the step consists of reading the corresponding dictionary entry into core and comparing it with the search key. for reasons discussed later in the present section it is desirable to omit this step.

it should be noted that in most instances the search algorithm will not require execution of steps 5 and 8. in fact, with the hash index files designed as described in the previous sections, the probability of execution of step 5 is about 0.01 and the probability of execution of step 8 is about $2^{-16}$. consequently, if step 9 is also omitted the number of disk accesses required to find the index entry corresponding to a search key is approximately 1.01. the mismatch problem, which gives rise to step 9 of the search algorithm, is less serious than might be expected. suppose the hash function distributes the transformed keys uniformly over all hash addresses. the probability that a new, or misspelled, key maps into an existing entry is given by

$P_c = N / V$ (14)

the probability that a search leads to a mismatch is therefore

$P_m = P_s N / V$ (15)

where $P_s$ is the probability that the search key is misspelled or not in the hash table. thus, for a hash table of $N = 2^{16} = 65{,}536$ title words and $V = 2^{31}$, an assumption of $P_s = 0.1$ leads to $P_m = 3 \times 10^{-6}$. because $P_m$ is extremely small, and because each execution of step 9 requires up to two disk accesses, it is desirable to omit this step. if experience shows that particular new or misspelled search keys occur frequently, and cause mismatches, they may themselves be entered into the hash index file. in fact, some degree of automatic spelling correction may be provided if some common misspellings are included in the hash files and chained to the content entries that correspond to the correctly spelled keys. correct, but alternative, spellings of search keys may also be treated in the same manner.
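the two estimates in this passage are simple enough to check by direct arithmetic, as in the python sketch below. it takes the number of virtual addresses to be 2 to the power 31, an assumption chosen because it reproduces the quoted mismatch probability, and it assumes that steps 5 and 8 each cost one extra disk access when they occur. the function and variable names are illustrative.

```python
def mismatch_probability(n_keys: int, n_virtual_addresses: int, p_bad_key: float) -> float:
    """Equation 15: P_m = P_s * N / V for uniformly distributed hash addresses."""
    return p_bad_key * n_keys / n_virtual_addresses


# Values quoted in the text; V = 2**31 is an assumed reading of the source.
p_m = mismatch_probability(n_keys=2**16, n_virtual_addresses=2**31, p_bad_key=0.1)

# One access for the home bucket, plus the (rare) extra accesses of
# step 5 (overflow bucket) and step 8 (dictionary entry); step 9 omitted.
p_step5 = 0.01
p_step8 = 2**-16
expected_accesses = 1 + p_step5 + p_step8

if __name__ == "__main__":
    print(f"P_m = {p_m:.2e}")
    print(f"expected disk accesses = {expected_accesses:.4f}")
```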
size of hash file for title words

suppose the document collection contains T different titles that comprise a total of W words of which there are N different words. let $w = W/T$ denote the average number of words in each title. reid and heaps35 have reported word counts on the 57,800 titles included on the marc tapes between march 1969 and may 1970 and have noted that

$w = 5.5$ (16)

$\log_{10} N = 0.6 \log_{10} W + 1.2$ (17)

examination of other data bases has led to the conclusion that $\log N$ is likely to be a linear function of $\log W$ over the range $0 \le W \le 10^6$. for a library of one million titles the equations 16 and 17 may therefore be used to predict that when $T = 10^6$ then

$W \approx 5.5 \times 10^6$ and $N = 1.8 \times 10^5$. (18)

it follows from equation 6 that if a = 1 the number of bits required in the major is

$r = 18$. (19)

according to equation 7, in order to reduce the frequency f of collisions at virtual addresses to $2^{-16}$ the number of bits required in the entire virtual address is

$v = \lceil \log_2 (1.8 \times 10^5) + 16 - 1 \rceil = 33$. (20)

consequently, the number of bits in the minor is

$m = v - r = 15$. (21)

however, with such a choice of r then $R = 2^{18}$ and the value of the load factor is, in fact,

$a = N / R = 0.7$ (22)

it follows from equation 4 that the expected total number p of collisions at virtual addresses is equal to approximately 2. it may be further noted that murray36 has derived the following approximation for the probability that the number of collisions at virtual hash addresses lies within the range a to d:

$P(a, d) = \sum_{i=a}^{d} e^{-p} \frac{p^i}{i!} \quad (0 \le i \le \lfloor N/2 \rfloor)$ (23)

where $\lfloor\ \rfloor$ means "greatest integer less than or equal to." when p = 2 the equation gives a value of 0.9998 for the probability that the total number of collisions lies between 0 and 8. thus the above choice of r, v, and m leads to a title word hash table file with excellent virtual address collision properties.
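the sizing steps above can be traced with a brief python sketch. equations 6 and 7 are not reproduced in this section, so the forms used here are assumptions inferred from the worked numbers: the major holds the ceiling of log2(n/a) bits, and the virtual address holds the ceiling of log2(n) + 16 - 1 bits. equation 23 is treated as a plain poisson sum. all names are illustrative.

```python
import math


def address_bits(n_keys: float, load_factor: float = 1.0, collision_exponent: int = 16):
    """Return (r, v, m, actual_load) for a hash file holding n_keys keys.

    r: bits in the major, ceil(log2(N / a))          (assumed form of equation 6)
    v: bits in the virtual address, as in equation 20
    m: bits in the minor, v - r                      (equation 21)
    """
    r = math.ceil(math.log2(n_keys / load_factor))
    v = math.ceil(math.log2(n_keys) + collision_exponent - 1)
    m = v - r
    actual_load = n_keys / 2**r                      # equation 22
    return r, v, m, actual_load


def collisions_within(p: float, low: int, high: int) -> float:
    """Equation 23 read as a Poisson sum: chance of low..high collisions."""
    return sum(math.exp(-p) * p**i / math.factorial(i) for i in range(low, high + 1))


if __name__ == "__main__":
    print(address_bits(1.8e5))            # title words
    print(round(collisions_within(2, 0, 8), 4))
```

the same function applies to the call-number and author files discussed later by passing their key counts.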
use of equation 10 with b = 64 and a = 0.7 leads to the result that the probability of bucket overflow may be reduced to 0.01 by choosing c = 62. in view of the above value of m it proves convenient to allocate 10 bytes of storage for each content entry. each entry consists of a 2-byte portion to contain the 15-bit minor preceded by a collision bit, a 1-byte portion to contain a 7-bit chain pointer preceded by an overflow bit, a 3-byte dictionary pointer, and a 4-byte pointer to an inverted index. the 64 one-byte index entries, the 62 ten-byte content entries, and 4 one-byte counters, constitute buckets of length 688 bytes. the entire hash file consists of R entries, and hence $R/b = 2^{12}$ buckets. its storage requirement is therefore for $2^{12} \times 688 = 2.82 \times 10^6$ bytes. it may be remarked that nine 688-byte buckets may be stored unblocked in one track of an ibm 2316 disk pack, and that the entire hash file occupies 11.38 percent of the disk pack. when the disk and channel are idle the average time to access such a bucket is the sum of the average seek time, the average rotational delay, and the record transmission time. for storage on an ibm 2314 disk drive the average bucket access time is therefore 60 + 12.5 + 2.8 = 75.3 milliseconds. the average access time for a sequence of accesses could be reduced by suitable scheduling.

size of hash file for lc call numbers

for a library of one million titles the number N of call numbers is $10^6$. if a = 1 and $f = 2^{-16}$ it follows from equations 6, 7, 9, and 4 that

$r = 20,\ v = 35,\ m = 15,\ p = 16$. (24)

with such a choice of r the load factor is approximately equal to 1. equation 23 gives a probability of 0.9998 that the total number of virtual address collisions lies between 0 and 34. use of equation 10 with b = 64 and a = 1.0 shows that the probability of bucket overflow may be reduced to 0.01 by choosing c = 85.
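the title-word storage figures quoted above (688-byte buckets, about 2.82 million bytes, 11.38 percent of a disk pack, 75.3 milliseconds per access) follow from a few multiplications, sketched here in python. the figure of 4,000 usable tracks per ibm 2316 pack is an assumption that is consistent with the percentages quoted in the text; the function names are illustrative.

```python
def bucket_length(index_entries: int, content_entries: int,
                  content_entry_bytes: int, counters: int = 4) -> int:
    """Bytes per bucket: one-byte index entries, content entries, one-byte counters."""
    return index_entries + content_entries * content_entry_bytes + counters


def hash_file_bytes(major_bits: int, index_entries: int, bucket_bytes: int) -> int:
    """Total bytes: R = 2**r index entries spread over R / b buckets."""
    n_buckets = 2**major_bits // index_entries
    return n_buckets * bucket_bytes


TRACKS_PER_PACK = 4000        # assumed capacity of an IBM 2316 disk pack
BUCKETS_PER_TRACK = 9         # quoted for 688-byte buckets

title_bucket = bucket_length(index_entries=64, content_entries=62, content_entry_bytes=10)
title_bytes = hash_file_bytes(major_bits=18, index_entries=64, bucket_bytes=title_bucket)
title_pack_share = (2**18 // 64) / BUCKETS_PER_TRACK / TRACKS_PER_PACK

# Average access time: seek + rotational delay + transmission, in milliseconds.
access_ms = 60 + 12.5 + 2.8

if __name__ == "__main__":
    print(title_bucket, title_bytes, f"{title_pack_share:.2%}", access_ms)
```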
the content entries for lc call numbers may be arranged as for title words except that the 4-byte pointer to an inverted index is replaced by a 3-byte pointer to the compressed catalog file. the bucket length is therefore 64 + 85 × 9 + 4 = 833 bytes. the storage requirement for the hash file is $(2^{20}/2^{6}) \times 833 = 13.65 \times 10^6$ bytes, which may be stored in 2184 tracks, or 54.6 percent, of an ibm 2316 disk pack. the average time to access a bucket is 60 + 12.5 + 3.3 = 75.8 milliseconds.

size of hash file for author names

in the present section the term "author" will be used to include personal names, corporate names, editors, compilers, composers, translators, and so forth. it will be assumed that for personal names only surnames are entered into the author dictionary. a search query that includes specification of authors with initials is first processed as if initials were omitted, and the resulting retrieved catalog entries are then scanned sequentially to eliminate any entries whose authors do not have the required initials. it will also be supposed that each word of a corporate name is entered separately into the author dictionary, and that the inverted index contains an entry for each term.

in the absence of reliable statistics regarding the distributions of author surnames, words within corporate names, and so forth, the following assumptions have been made in order to estimate the size of the author dictionary and hash file for a library of one million titles:

(i) personal author names contain $2 \times 10^5$ different surnames of average length 7 characters.
(ii) the corporate author names include $4 \times 10^4$ different words of average length 6 characters.
(iii) the author names include $1.6 \times 10^4$ different acronyms such as ibm, aslib, and so forth; their average length is 4 characters.

it is thus supposed that $N = 2.56 \times 10^5$ entries are required in the author hash files.
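the bucket and file sizes quoted for title words and lc call numbers, and the author term count, follow from simple arithmetic. the sketch below retraces it; the function names are hypothetical, and the layout assumed is the one described in the text (one-byte index entries, fixed-size content entries, four one-byte counters).

```python
def bucket_bytes(index_entries, content_entries, content_entry_bytes, counters=4):
    """Bucket length: one-byte index entries + content entries + one-byte counters."""
    return index_entries + content_entries * content_entry_bytes + counters


def hash_file_bytes(major_bits, index_entries, bucket_length):
    """The file holds 2**r entries, b per bucket, so it has 2**r / b buckets."""
    buckets = 2 ** major_bits // index_entries
    return buckets * bucket_length


title_bucket = bucket_bytes(64, 62, 10)             # 10-byte content entries
call_bucket = bucket_bytes(64, 85, 9)               # 9-byte content entries
title_file = hash_file_bytes(18, 64, title_bucket)  # about 2.82e6 bytes
call_file = hash_file_bytes(20, 64, call_bucket)    # about 13.65e6 bytes

# assumptions (i) to (iii): surnames + corporate-name words + acronyms
author_terms = 200_000 + 40_000 + 16_000

print(title_bucket, call_bucket, title_file, call_file, author_terms)
```

the shorter content entry for call numbers (9 bytes rather than 10) is more than offset by the larger number of content entries per bucket and the fourfold increase in the number of buckets.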
calculations similar to those of the previous section show that

(25) $r = 18$, $v = 33$, $m = 15$, $p = 4$, $\alpha = 1.0$.

equation 23 gives a probability of 0.9999 that the total number of virtual address collisions lies between 0 and 13. the probability of bucket overflow may be reduced to 0.01 by choosing $c = 85$. content entries of 10 bytes may be arranged as previously described for title words. hence each bucket requires 918 bytes of storage. the storage requirement for the hash file is $(2^{18}/2^{6}) \times 918 = 3.76 \times 10^6$ bytes, which may be stored in 586 tracks, or 14.6 percent, of an ibm 2316 disk pack. the average time to access a bucket is 76.1 milliseconds.

structure of dictionary files

the structure of the dictionary files for title words and author names is as described by thiel and heaps.37, 38 each dictionary file contains up to 128 directories, each of which points to up to 128 term strings that may each contain space for storage of 128 terms of equal length. thus each dictionary file contains up to $2^{21}$ different terms.

the dictionary pointers in the hash files are essentially the codes stored instead of alphanumeric terms in the catalog file. the most frequent 127 title words are assigned dictionary pointers of the form

(26) 10000000 10000000 1xxxxxxx
                       ($p_t$)

and do not have corresponding entries in the inverted index file. the last byte forms the code used to represent the title word within the compressed catalog file. the next most frequent 16,384 title words are assigned dictionary pointers of the form

(27) 00000000 1xxxxxxx 1xxxxxxx

or

(28) 10000000 0xxxxxxx 1xxxxxxx

according as there is, or is not, a corresponding entry in the inverted index. the last 2 bytes are used as codes in the compressed catalog file.
the remaining title words are assigned dictionary pointers of the form

(29) 0xxxxxxx 0xxxxxxx 1xxxxxxx
     ($p_d$)  ($p_s$)  ($p_t$)

they all have corresponding entries in the inverted index file, and the 3 bytes are used as codes in the catalog file.

the reason that terms coded in the form 26 or 28 do not have corresponding entries in the inverted index file is that very frequently occurring terms form very inefficient search keys. also, previous results suggest that omission of corresponding entries in the inverted index allows its size to be reduced by about 50 percent.39, 40

the codes of type $p_t$, $(p_s, p_t)$, and $(p_d, p_s, p_t)$ are used respectively for approximately 50 percent, 45 percent, and 5 percent of the title words. the average length of the coded title words in the compressed catalog file is therefore 1.55 bytes.

associated with each dictionary file there is a directory of length 512 bytes whose entries point to the beginnings of term strings within the dictionary file and also indicate the lengths of the terms. within the hash table file a dictionary pointer of the form $p_d, p_s, p_t$ points to the $p_t$-th term of the $p_s$-th term string in the dictionary associated with the $p_d$-th directory. there is a single directory associated with each set of pointers of type $p_t$ and $(p_s, p_t)$.

the average length of the $1.8 \times 10^5$ different title words is 7.6 characters, and hence the entire set of term strings requires $1.8 \times 10^5 \times 7.6 = 1.37 \times 10^6$ bytes for storage of title words. since twelve directories occupying 12 × 512 = 6144 bytes will be required, and since some term strings will contain unfilled portions, the storage requirement of the dictionary file will be slightly larger. if the title word dictionary is stored on disk in 1,000-byte records then the storage requirement is 238 tracks, or 5.95 percent, of an ibm 2316 disk pack.
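the four pointer forms can be told apart from the high-order bits of the first two bytes alone, and the 1.55-byte average follows from the 50/45/5 percent split. the sketch below illustrates both. it assumes the patterns 26 to 29 read as plain 0/1 bit masks with "x" marking free bits; `classify_pointer` is a hypothetical helper, not part of the system described.

```python
def classify_pointer(b0, b1, b2):
    """Return (code_length_in_bytes, has_inverted_index_entry) for a 3-byte pointer."""
    if not b2 & 0x80:
        raise ValueError("the last byte of every pointer form has its high bit set")
    high0, high1 = bool(b0 & 0x80), bool(b1 & 0x80)
    if high0 and high1:
        return 1, False   # form 26: one-byte code, no inverted index entry
    if high1:
        return 2, True    # form 27: two-byte code, inverted index entry
    if high0:
        return 2, False   # form 28: two-byte code, no inverted index entry
    return 3, True        # form 29: three-byte code, inverted index entry


# share of title-word occurrences coded in 1, 2, and 3 bytes (from the text)
code_mix = {1: 0.50, 2: 0.45, 3: 0.05}
average_code_bytes = sum(length * share for length, share in code_mix.items())

term_string_bytes = 1.8e5 * 7.6   # distinct title words times average length
directory_bytes = 12 * 512        # twelve 512-byte directories

print(classify_pointer(0x80, 0x80, 0x85), round(average_code_bytes, 2))
```

because the high bit of the last byte is always set, a decoder reading the compressed catalog file can also use that bit to find where each code ends, which is what allows codes of one, two, and three bytes to be mixed.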
the assumptions made previously regarding author names imply an author dictionary size of $1.70 \times 10^6$ bytes and sixteen directories whose total storage requirements are 16 × 512 = 8,192 bytes. using an ibm 2316 disk pack the storage requirement is for 286 tracks, or 7.15 percent.

on completion of a search through use of the inverted index file there results a set of sequence numbers that indicate the position of the relevant items in the compressed catalog file. before such items are displayed to a user of the system, each term must be decoded through access to the directory and dictionary to which it points. the time required to decode a catalog item depends on how the directories and dictionaries are partitioned between disk and core memory. several partitioning schemes for title words have been analysed, and the results are summarized in table 2.

in the calculations used to obtain table 2 it is assumed that title words occur with the frequencies listed by kucera and francis.41 it is supposed that both the directory and term strings corresponding to codes of form $p_t$ are stored in a single physical record, that every other directory is contained wholly within a physical record, and that each dictionary term may be located by a single access to a term string. any required cpu time is regarded as insignificant compared to the time needed for file accesses.

from the results shown in table 2 it appears that the best partition between core and disk is probably that which gives an average decode time of 42 milliseconds while requiring a dedicated 1501 bytes of core memory. this results when core is used to store both the directories and term strings for terms that correspond to pointers of type $p_t$, and the directories only for terms that correspond to pointers of type $(p_s, p_t)$.
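since cpu time is neglected, the decode times in table 2 are essentially the average number of file accesses multiplied by the time for one access. the sketch below checks this against the six table rows. the per-access time of 76.5 milliseconds is an assumption, extrapolated from the 75.3 to 76.1 millisecond bucket access times quoted earlier for somewhat shorter records; it is not stated in the text.

```python
ACCESS_MS = 76.5  # assumed average time to read one 1,000-byte record

# (average accesses per title word, decode time in ms) as listed in table 2
table_2 = [
    (1.50, 115),
    (1.01, 77),
    (0.55, 42),
    (0.50, 39),
    (0.49, 38),
    (0.44, 34),
]


def decode_ms(average_accesses, access_ms=ACCESS_MS):
    """Decode time when file accesses dominate and cpu time is ignored."""
    return average_accesses * access_ms


differences = [abs(decode_ms(accesses) - listed) for accesses, listed in table_2]
print(max(differences) < 1.0)
```

under that assumption every estimate falls within one millisecond of the listed value, so the choice among partitions reduces to trading dedicated core memory against the expected number of disk accesses.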
compressed catalog file

since the title word codes stored in the compressed catalog file have an average length of 1.55 bytes, whereas uncoded title words and their delimiting spaces have an average length of 6.5 characters, the compressed title fields occupy only 24 percent of the storage required for uncompressed words. uncoded author names and their delimiting spaces have an average length of 7.6 characters and are coded to occupy not more than 3 bytes; hence coding of author names effects an average compression factor of less than 3/7.6 = 40 percent. for lc call numbers the compression factor is less than 30 percent. clearly, subject headings, publisher names, and series statements may be coded with even more effective compression factors.

the saving in space through compression of the catalog file may be translated into a cost saving as follows. if there are an average of 5.5 words in each title then one million titles include $5.5 \times 10^6$ title words and delimiting spaces which, if stored in the catalog file in uncoded form, would require $3.63 \times 10^7$ bytes.42 when stored in coded form the requirement is for $8.54 \times 10^6$ bytes. charges for disk space vary considerably with different computing facilities. at the university of alberta users of the ibm 360 model 67 are charged a monthly rate of $0.50 for each 4,096 bytes of disk storage. thus, for title words alone the advantage of storing the catalog file in compressed form is to allow the monthly storage cost to be reduced from $4,440 to $950.

concluding remarks

the results reported in the present paper indicate that a satisfactory structure for a catalog file may be designed to use the concept of virtual hash addressing and storage of terms in compressed form. access and decoding times may be reduced to acceptable amounts.

it may prove advantageous to arrange the items in the catalog file in the order of their call numbers.
this will tend to reduce the number of disk accesses needed to retrieve catalog items in response to queries since it will tend to group relevant items. however, the benefits should be weighed against the additional expense required to maintain and update the ordered file.

the present paper has omitted discussion of the form of the query language or the search algorithm that operates on the elements of the inverted index. a formal definition of one form of query language has been discussed by dimsdale.43 details of a search algorithm and structure of a compressed form of inverted index have been discussed by thiel and heaps.44

it may be noted that each content entry in the hash table file has 4 bytes reserved for a pointer to a bit string of the inverted index. whenever the bit string is less than 4 bytes in length it is stored in the content section and no pointer is required. storage of such bit strings within the content entries significantly reduces the storage requirements of the inverted index and also reduces the number of required disk accesses in the search phase of the program.

table 2. average time to decode a title word of the compressed catalog file.

core-resident       core-resident        average     average decode   dedicated
directories         term strings         number of   time             core memory
                                         accesses    (milliseconds)   (bytes)
none                none                 1.50        115              0
p_t                 p_t                  1.01        77               989
p_t, (p_s, p_t)     p_t                  0.55        42               1501
all                 p_t                  0.50        39               7133
p_t, (p_s, p_t)     p_t, (p_s, p_t)°     0.49        38               2474
all                 p_t, (p_s, p_t)°     0.44        34               8106

(p_s, p_t)° signifies the 128 most frequent of the codes (p_s, p_t).

acknowledgment

the authors wish to express their appreciation to the national research council of canada for their support of the present investigation.

references

1. d. lefkovitz and r. v. powers, "a list-structured chemical information retrieval system," in g. schecter, ed., information retrieval (washington, d.c.: thompson book co., 1967), p.109-29.
2. p. r.
weinberg, "a time sharing chemical information retrieval system" (doctoral thesis, univ. of pennsylvania, 1969).
3. r. m. curtice, "experimental retrieval systems studies. report no. 1. magnetic tape and disc file organization for retrieval" (master's thesis, lehigh univ., 1966).
4. d. lefkovitz, file structures for on-line systems (new york: spartan books, 1969).
5. i. b. holbrook, "a threaded-file retrieval system," journal of the american society for information science 21:40-48 (jan.-feb. 1970).
6. g. g. dodd, "elements of data management systems," computer surveys 1:117-33 (june 1969).
7. j. w. rettenmayer, "file ordering and retrieval cost," information storage and retrieval 8:19-93 (april 1972).
8. r. t. divett, "design of a file structure for a total system computer program for medical libraries and programming of the book citation module" (doctoral thesis, univ. of utah, 1968).
9. h. p. burnaugh, "the bold (bibliographic on-line display) system," in g. schecter, ed., information retrieval (washington, d.c.: thompson book co., 1967), p.53-66.
10. lefkovitz, powers, "a list-structured chemical information," p.109-29.
11. lefkovitz, file structures for on-line systems, p.141.
12. ibid., p.177.
13. f. g. kilgour, "concept of an on-line computerized catalog," journal of library automation 3:1-11 (march 1970).
14. j. l. cunningham, w. d. schieber, and r. m. shoffner, a study of the organization and search of bibliographic holdings records in on-line computer systems: phase i (berkeley: univ. of california, 1969).
15. r. s. marcus, p. kugel, and r. l. kusik, "an experimental computer stored, augmented catalog of professional literature," in proceedings of the 1969 spring joint computer conference (montvale: afips press, 1969), p.461-73.
16. j. w. henderson and j. a. rosenthal, eds., library catalogs: their preservation and maintenance by photographic and automated techniques; m.i.t.
report 14 (cambridge, mass.: m.i.t. press, 1968).
17. i. a. warheit, "file organization of library records," journal of library automation 2:20-30 (march 1969).
18. r. morris, "scatter storage techniques," communications of the acm 11:38-44 (jan. 1968).
19. d. m. murray, "a scatter storage scheme for dictionary lookups," journal of library automation 3:173-201 (sept. 1970).
20. w. buchholz, "file organization and addressing," ibm systems journal 2:86-111 (june 1963).
21. p. l. long, k. b. l. rastogi, j. e. rush, and j. a. wyckoff, "large on-line files of bibliographic data: an efficient design and a mathematical predictor of retrieval behavior," in information processing 71 (north holland publishing company, 1972), p.473-78.
22. buchholz, "file organization," p.102-3.
23. w. p. reising, "note on random addressing techniques," ibm systems journal 2:112-16 (june 1963).
24. murray, "a scatter storage scheme," p.178.
25. ibid., p.181.
26. g. schay and w. g. spruth, "analysis of a file addressing method," communications of the acm 5:459-62 (august 1962).
27. m. tainter, "addressing for random-access storage with multiple bucket capacities," journal of the acm 10:307-15 (july 1963).
28. reising, "note on random addressing," p.112-16.
29. w. h. beyer, handbook of tables for probability and statistics (cleveland: the chemical rubber company, 1966).
30. buchholz, "file organization," p.99.
31. morris, "scatter storage," p.42.
32. w. d. maurer, "an improved hash code for scatter storage," communications of the acm 11:35-38 (jan. 1968).
33. j. r. bell, "the quadratic quotient method: a hash code eliminating secondary clustering," communications of the acm 13:107-9 (feb. 1970).
34. morris, "scatter storage," p.40.
35. w. d. reid and h. s.
heaps, "compression of data for library automation," in canadian association of college and university libraries: automation in libraries 1971 (ottawa: canadian library association, 1971), p.2.1-2.21.
36. murray, "a scatter storage scheme," p.183.
37. l. h. thiel and h. s. heaps, "program design for retrospective searches on large data bases," information storage and retrieval 8:1-20 (jan. 1972).
38. h. s. heaps, "storage analysis of a compression coding for document data bases," infor 10:47-61 (feb. 1972).
39. thiel and heaps, "program design," p.15-16.
40. reid and heaps, "compression of data," p.2.1-2.21.
41. h. kucera and w. n. francis, computational analysis of present-day american english (providence: brown university press, 1967).
42. reid and heaps, "compression of data," p.2.4.
43. j. j. dimsdale, "application of on-line computer systems to library automation" (master's thesis, univ. of alberta, 1971), p.50-68.
44. thiel and heaps, "program design," p.1-20.

information technology and libraries | march 2008

touchable online braille generator

wooseob jeong

a prototype of a touchable online braille generator has been developed for the visually impaired or blind using force feedback technology, which has been used in video games for years. without expensive devices, this prototype allows blind people to access information on the web by touching output braille displays with a force feedback mouse. the data collected from user studies conducted with blind participants has provided valuable information about the optimal conditions for the use of the prototype. the end product of this research will enable visually impaired people to enjoy information on the web more freely.

the united states has made some attempts to nationally address information access for those with disabilities.
section 508 of the rehabilitation act (www.section508.gov) requires federal agencies to make their electronic information accessible to people with disabilities, mainly those who are visually impaired. the library of congress launched a web-braille service (www.loc.gov/nls/) for the blind in 1998, which continues today. with the upsurge in information stored on the internet, the importance of these issues cannot be overemphasized.

many products have been developed to help the visually impaired use technology. several braille output and input devices are available, such as the braille notetaker (www.artictech.com) and voice synthesizers for screen readers like jaws (www.freedomscientific.com/fs_products/software_jaws.asp). while these products are mainly for textual information, recent developments put more focus on graphical displays. the american national institute of standards and technology proposed a "pins" down imaging system for the blind (www.nist.gov/public_affairs/factsheet/visualdisplay.htm). uniplan in japan and kgs america (www.kgs-america.com/dvs.htm) have produced other products based on similar ideas. software like the duxbury braille translator (www.duxburysystems.com) can translate plain text into braille output, which can then be used for embossed printing. however, such products are fairly expensive, ranging from hundreds to several thousands of dollars in addition to the cost of computers.

fortunately, there is a potentially promising solution. based on the technology used in prior research, it is possible to develop an online braille generator.1 the braille could then be read either by touching the screen with a fingertip sensor or through the use of a force feedback mouse similar to the type used in some video games.2 this application has several advantages over existing devices. first, it does not require expensive special devices, only a $20 mouse, which is readily available.
also, the technology is available as long as there is access to the internet. another advantage is that this technology utilizes the existing braille skills of visually impaired people. the same technology can be used for producing image displays as well, allowing for the creation of a virtual museum for the blind where they can touch objects that are displayed alongside their braille descriptions.

literature review

force feedback has been studied under the name of haptic perception. haptic perception involves sensing the movement and position of joints, limbs, and fingers through kinesthesia and proprioception, and sensing information through the skin's tactility.3 haptic output can be achieved through several techniques, including pneumatic, vibrotactile, electrotactile, and electromechanical stimulation.4 this study examines only vibrotactile haptic output methods because vibrotactile stimulation is easily created, manipulated, and delivered. it is also easily perceived by users through the use of commonly available software and devices.

researchers have begun to develop various haptic input/output devices and software, such as massachusetts institute of technology's (mit) frequently used phantom haptic interface.5 along with these developments, a number of studies have tried to apply haptic displays to real-world computing, including a force feedback braille system,6 force feedback virtual reality modeling language (vrml),7 a force feedback x window system,8 and gis.9

haptic studies have only recently become more mainstream, and there are few extensive studies with real subjects. gillespie and others developed the "virtual teacher," a device for manual skill learning, which they tested with 24 participants and found that most profited from the "force feedback teacher."10 langrana and others used the rutgers master ii, a dexterous, portable master for virtual reality simulations for force feedback using four fingers.
in their experiment of tumor detection in virtual livers with 32 subjects, the experimental group with force feedback training performed slightly better than the control group.11 this may mean that either the training methods need improvement or that the task did not require extensive training. colwell and others confirmed that a haptic interface (impulse engine 3000) has considerable potential for blind computer users through their three-dimensional objects experiment with 22 subjects.12 jeong tested ordering tasks in auditory and haptic displays with 23 subjects and found that subjects performed better with haptic-only displays than with auditory-only displays or with auditory/haptic combination displays.13

communications

wooseob jeong (wjj8612@uwm.edu) is associate professor at the school of information studies, university of wisconsin–milwaukee.

several studies already attempted to apply force feedback technology to assist blind people's computing. ramstein conducted a pilot study to apply haptics to braille.14 yu and brewster compared the use of force feedback in multimodal virtual reality and printed medium in visualization for the blind.15 tzovaras and others tried to implement a virtual reality interface with force feedback for blind people.16 ramloll and others studied the use of haptic line graphs with sound for blind students.17 emery and others tested a multimodal haptic interface with 29 older adults to find that all participants performed well under auditory-haptic bimodal feedback.18 jacko and others tested a multimodal interface with 29 normal vision older adults and 30 visually impaired older adults, finding that in some cases, nonvisual feedback forms (including auditory or haptic feedback) demonstrated significant performance gains over the visual feedback form.19 s.
jeong and others proposed an interactive system that combines an immersive virtual environment with a human-scale haptic interface.20

when conducting user studies with the visually impaired, it is necessary to separate the completely blind from the partially sighted. in spite of the different characteristics of these two groups, the literature on visually impaired people typically does not distinguish between them. this distinction is especially important if the legally blind or those with low vision are included in the definition of visually impaired. the challenges to the partially sighted are different from those of the totally blind, demanding different assistance and considerations. in fact, the completely blind represent a small portion of the visually impaired population. according to an advisor in wisconsin's division of vocational rehabilitation, less than 5 percent of her advisees are totally blind and require very specialized attention quite different from partially sighted people.

purpose of study

the purpose of this study is to explore the feasibility of using force feedback technology to facilitate blind people's access to text information on the web. both quantitative and qualitative data were collected to identify the optimal conditions under which the prototype can best serve the blind.

significance of study

public libraries in the u.s., primarily through their main libraries, are providing special services for the visually impaired. currently, the core service is the provision of audiobooks. as digital libraries prevail, services for the blind should be online as well, with the library of congress's web-braille service as one of the leading examples. however, such services require the use of an expensive braille output device. upon refinement, this prototype would significantly improve the experience of the visually impaired using online services.
this prototype can be easily expanded to support graphical displays without any additional devices, making the use of touchable picture books possible for blind users in libraries.

prototype development

force feedback technology has been used for many years in video games. its use has expanded to other areas such as surgical operations and dangerous mechanical processes. this technology was previously applied to gis to solve the problem of ambiguous multicolor displays for multi-variable thematic maps.21 the same technique was used for this project.

the online braille generator translates text on the web into a braille display, letting the user feel the braille dots with a vibrating mouse. the prototype interface was developed using immersion studio (www.immersion.com), javascript, perl/cgi, and active server pages (asp). logitech's ifeel mouse, inexpensive at a cost of $20, was used for force feedback output (figure 1). the interface has an input text box, which can be filled with any plain text. once it is submitted, the text is instantly translated into braille (figures 2 and 3). when the user moves the mouse over each dot on the screen, it vibrates with a given force. while users explore the screen with the vibrating mouse, force feedback dots provide a tactile effect similar to braille displays. in future projects, the manual conversion programs will be upgraded to automatic conversion programs with which any texts on the web can be grabbed by their urls and converted into a touchable format for the blind.

figure 1. experimental setting

participants

to make this prototype more usable, user studies were conducted in milwaukee, wisconsin, with 21 participants who are completely blind and read braille.
the small sample size (due to the relatively small percentage of visually impaired people who are completely blind and can read braille) is comparable to or larger than those found in other research on the blind. the participants came from various age groups: teens (3), twenties (6), thirties (2), fifties (5), and sixties (5). they included 9 females and 12 males. nineteen of the 21 were born blind. participants were recruited at several sites, including the university of wisconsin–milwaukee student accessibility center, public libraries with centers for the blind, and nonprofit organizations for the physically impaired. vision teachers in local school districts were also contacted. participants provided valuable information about the optimal conditions for the use of the prototype. this information will eventually lead to force feedback displays that enable visually impaired people to access the vast amount of information on the web without expensive devices.

experimental procedure

experiments were conducted in a number of settings, including in the organization's offices, at the participants' homes, and at the site of a regional annual meeting for the blind. each session lasted no more than 60 minutes. participants were asked to try different interfaces of force feedback braille outputs with various dot sizes and magnitudes of force. they used a tactile mouse on a notebook computer; after exploring every option, they were asked to select the most comfortable settings for their sense of touch, including what size the dots should be, how strong the force should be, what kind of force feedback should be used (vibration or friction), and their general opinions of the prototype (see figure 4). interviews accompanied the experiments so that both quantitative and qualitative data could be collected. interviews were transcribed for qualitative data analysis.

result

even though there were only 21 study participants, a number of issues were clearly identified.
it is encouraging to see that all of the participants could identify braille characters using the force feedback mouse with the guidance of the researcher. all the participants agreed that this prototype would be useful with training. the participants preferred the largest dot size (30 pixels in diameter) and the strongest force possible for maximum perception of the force feedback effect. however, the prototype was less attractive to the participants than the currently dominant voice synthesizer software. at least two participants mentioned that their current braille pads fulfill their needs. it seems that they are not motivated to invest their time and effort in a new device.

figure 2. touchable braille input screen

figure 3. touchable braille output screen

figure 4. inexpensive force feedback mouse

when a potential graphical display application was introduced at the end of a session, the participants became more receptive. at this time there is no practical solution for the visually impaired to feel graphics on computers. experimental devices are available, but they are either quite expensive or still in the research phase. the blind participants also suggested that this graphical prototype could be used for geometry and geography easily and effectively.

discussion

blind people's navigation by mouse

because blind people do not use a mouse for computing, using the force feedback mouse itself was a challenge for the study participants. a sighted person uses a mouse with both hand and eye, moving the mouse while watching the mouse cursor on the screen. for the blind it is difficult to identify the mouse's position. the direction of movement and the distance between two points are difficult to grasp. due to the lack of guidance, the blind encounter difficulties in moving the mouse in a straight line. these issues hinder the effectiveness of force feedback displays for the blind.
however, this issue does not only affect the blind. some sighted people, especially older adults, cannot move a mouse easily. one possible solution may be to develop guardrails to help blind people to differentiate relevant areas of the screen from irrelevant ones.

due to their inexperience in using a mouse, the participants held the mouse too firmly to move it or to feel the force feedback. the only participant to use the mouse successfully was a college student who is a music major with 15 years of piano playing experience. this implies that a significant learning session will be required to allow blind people to use the mouse freely.

ignorance or suppression of graphical information need

even though the participants were more excited about the potential graphical displays, blind people's graphical information needs are limited. it is possible that their graphical information needs are ignored or suppressed based on their lifetime experiences. they tend to resort to braille and, more recently, voice synthesizers instead of graphical displays. this finding suggests the importance of studying the real information needs of the blind or visually impaired rather than the sighted researchers' expectations of those needs.

more research needed with sound

because the blind already use sound, particularly voice synthesizers, more sound applications should be researched. for example, audio games have the potential to help blind children learn some skills in the same way that video games teach certain skills to sighted children. audio games also provide a broader research area for future studies.

conclusion

numerous devices have been developed to improve blind or visually impaired people's access to information, including information on the internet. however, such devices are quite expensive or limited in flexibility and mainly work in text-only environments.
there is no suitable graphic display for the blind, except the expensive and bulky pin-based external devices found at the laboratory level. this new prototype uses established force feedback technology with a minimal cost to existing pcs. it functions for both text and graphics. the final products derived from this study can be used for many purposes nationally and internationally. information on the web can be delivered to the visually impaired without expensive devices. this touchable braille also lets deaf-blind people, who cannot use screen reader software, access information on the web, and it can help people learn braille. the application of this force feedback prototype to image displays has exciting and enormous potential because currently there is no practical, usable method for the blind to access images. for example, blind children are still using handmade 3-d picture books that are labor-intensive and time-consuming to produce. with this prototype, children’s books can be delivered easily to blind children, who will touch the books’ images via the force feedback mouse. maps of local, state, national, or international interest can be delivered to the blind as well. this prototype will help to add yet another sense, touch, to already blossoming visual and auditory digital libraries. through force feedback technology, new multimodal digital libraries will be accessible to the world.

acknowledgement

this research was supported by a diversity research grant from the american library association in 2005.

52 information technology and libraries | march 2008

references and notes

1. wooseob jeong and myke gluck, “multimodal geographic information systems: adding haptic and auditory display,” journal of the american society for information science and technology 54, no. 3 (2003): 229–242. 2.
wooseob jeong, “touchable online braille generator,” in proceedings of the 7th international acm sigaccess conference on computers and accessibility (new york: acm press, 2005), 188–189. 3. jack m. loomis and susan j. lederman, “tactual perception,” in handbook of perception and human performance, ed. k. r. boff, l. kaufman and j. p. thomas (new york: john wiley & sons, 1986), vol. 2, chap. 31, 1–41. 4. r. dan jacobson, robert kitchen, and reginald golledge, “multimodal virtual reality for presenting geographic information,” in virtual reality in geography, ed. p. fisher and d. unwin (new york: taylor & francis, 2000), 382–400. 5. j. kenneth salisbury and mandayam a. srinivasan, “phantom-based haptic interaction with virtual objects,” ieee computer graphics and applications 17, no. 5 (1997): 6–10. 6. christopher ramstein, “combining haptic and braille technologies: design issues and pilot study,” in proceedings of the 2nd annual acm conference on assistive technologies (new york: acm press, 1996), 37–44. 7. a. hardwick, s. furner, and j. rush, “tactile access for blind people to virtual reality on the world wide web,” iee colloquium on developments in tactile displays 1997, no. 012: 9/1–9/3. 8. timothy miller and robert zeleznik, “the design of 3d haptic widgets,” in proceedings of the 1999 symposium on interactive 3d graphics (new york: acm press, 1999), 97–102. 9. r. dan jacobson, “geographic visualization with little or no sight: an interactive gis for visually impaired people” (paper submitted to aag-gis specialty group student paper competition). 10. r. brent gillespie and others, “the virtual teacher,” in proceedings of asme dynamic systems and control division (new york: asme, 1998), vol. 2, 171–78. 11. noshira a. langrana and others, “human performance using virtual reality tumor palpation simulation,” computer & graphics 21, no. 4 (1997): 451–458. 12. c.
colwell and others, “haptic virtual reality for blind computer users,” in proceedings of the third annual acm conference on assistive technologies (new york: acm press, 1998), 92–99. 13. wooseob jeong, “exploratory user study of haptic and auditory display for multimodal geographic information systems,” in chi’01 extended abstracts on human factors in computing systems (new york: acm press, 2001), 73–74. 14. ramstein, “combining haptic and braille technologies.” 15. wai yu and stephen brewster, “multimodal technologies: multimodal virtual reality versus printed medium in visualization for blind people,” in proceedings of the 5th international acm conference on assistive technologies (new york: acm press, 2002), 57–64. 16. d. tzovaras and others, “multimodal technologies: design and implementation of virtual environments training of the visually impaired,” in proceedings of the 5th international acm conference on assistive technologies (new york: acm press, 2002), 41–48. 17. r. ramloll and others, “constructing sonified haptic line graphs for the blind student: first steps,” in proceedings of the 4th international acm conference on assistive technologies (new york: acm press, 2000), 17–25. 18. v. kathlene emery and others, “toward achieving universal usability for older adults through multimodal feedback,” in proceedings of the 2003 conference on universal usability (new york: acm press, 2003), 46–53. 19. julie a. jacko and others, “older adults and visual impairment: what do exposure times and accuracy tell us about performance gains associated with multimodal feedback?” in proceedings of the sigchi conference on human factors in computing systems (new york: acm press, 2003), 33–40. 20. seongzoo jeong, naoki hashimoto, and sato makoto, “a novel interaction system with force feedback between real and virtual humans,” in proceedings of the 2004 acm sigchi international conference on advances in computer entertainment technology (new york: acm press, 2004), 61–66. 21. 
jeong and gluck, “multimodal geographic information systems”; and wooseob jeong, “multimodal trivariate thematic maps with auditory and haptic display” (paper contributed to asist 2005, charlotte, north carolina, october 28–november 2, 2005).

that was then, this is now: replacing the mobile-optimized site with responsive design

hannah gascho rempel and laurie bridges

information technology and libraries | december 2013 8

abstract

as mobile technologies continue to evolve, libraries seek sustainable ways to keep up with these changes and to best serve our users. previous library mobile usability research has examined tasks users predict they might be likely to perform, but little is known about what users actually do on a mobile-optimized library site. this research used a combination of survey method and web analytics to examine what tasks users actually carry out on a library mobile site. the results indicate that users perform an array of passive and active tasks and do not want content choices to be limited on mobile devices. responsive design is described as a long-term solution for addressing both designers’ and users’ needs.

introduction

technology is in a constant state of flux. as librarians well know, emerging technology can become outdated in a few short years. in 2010 blackberry phones were at their peak, but now blackberry devices account for a little more than 5 percent of the market share and android holds the top spot, with approximately 52 percent of the market share.1 as smartphone use and design have continued to proliferate and advance, users have become accustomed to quicker load times for webpages and are now well-acquainted with how to navigate the web from their phones. as the mobile phone market changes and evolves, usability experts continuously update, test, and revise standards for the mobile web. at oregon state university (osu) we recently set out to improve our mobile site.
this required updating our knowledge about patron use of library mobile sites, both through reviewing the literature and by conducting our own primary research. what we found surprised us and challenged us to reconsider what we had previously assumed about patrons’ mobile habits. in this article we will describe past research on library mobile website usability, our own research on how our mobile library site is used, and why we ended up deciding to use responsive design as the guiding principle for our redesign.

hannah gascho rempel (hannah.rempel@oregonstate.edu) is science librarian and graduate student services coordinator, oregon state university, corvallis, oregon. laurie bridges (laurie.bridges@oregonstate.edu), a lita member, is instruction and emerging services librarian, oregon state university, corvallis, oregon.

background: that was then

in early 2010 we co-authored an article with our then-programmer about the mobile landscape in libraries. we were among the first to propose that libraries should develop separate websites and catalog interfaces optimized for mobile devices.2 based on the widespread implementation of mobile-optimized library websites since then, it appears this proposal was both relevant and timely. our 2010 recommendations were based on usability studies, library reports, and technology trends from 2007 through 2009.
research and literature at the time pointed to the need for considering the mobile context, for example, the “attention span” of mobile users as they search for information on the go.3 we noted the advice of jakob nielsen, who indicated that “if mobile use is important to your internet strategy, it’s smart to build a dedicated mobile site.” although that particular webpage is no longer available, the essence of nielsen’s thinking can be found in a mobile usability update posted on september 26, 2011, which states, “a dedicated mobile site is a must,” in the introductory paragraph.4 the iphone was released in the united states in late 2007, and in 2008 the proliferation of dedicated mobile sites began. in december 2008, our mobile team focused on developing a site for the two most popular device types at the time, “smartphones” and “feature phones.” the differences between the two types of phones were numerous and there were drawbacks to the feature phones. however, we felt it was important to have a site that rendered well on feature phones because at that time feature phones dominated the market with only 28 percent of mobile phone users in the united states owning a smartphone.5 our initial site design in 2008 and 2009 focused on our primary users, members of the university community. the first phase of our mobile website, released in march 2009, included static pages like library hours, contact information, frequently asked questions, and directions.6 the second phase, released in september 2009, included a mobile catalog interface (designed in-house), a staff directory, and a computer availability map. in february 2010 the site averaged one hundred unique users a day. mobile site analytics showed that the most viewed pages were computer availability, catalog, and hours. background: this is now recent research and case studies show a shift in the mobile context. 
a comprehensive study by alan aldrich in 2010 examined the mobile websites of large research universities and their libraries in the united states and canada. aldrich notes, “users seem to want access to information just as if they were using a fully web-capable desktop or laptop computer.” aldrich ponders the possibility that patron expectations and desires may be evolving as smartphones begin to dominate the mobile landscape.7 in 2011, jakob nielsen noted that because most people do not use the web on their feature phones and most companies do not support feature phones in their web design process, he would no longer be testing feature phones in his usability studies.8 our experience matches nielsen’s on this point; approximately 10 percent of daily users accessing our library’s mobile site came from feature phones in 2010; in 2012, that number had dwindled to less than 1 percent. in 2012, nielsen began considering the proliferation of mobile devices beyond phones, when he wrote, “high-end sites will need 3 mobile designs to target phones, mid-sized tablets, and big tablets.”9 note the slight change in the phrase “dedicated mobile site” (from the 2009 mobile usability update) to “mobile designs.” nielsen goes on to suggest responsive design, which we will address in more detail later in this article, as a solution to this design problem. as we prepared for another redesign of our mobile site, we knew we needed a current snapshot of users and how they actually use the mobile version of the library’s website before starting our redesign. this may sound like a simple decision; however, it is a step that the library literature has not documented well. researchers have focused on what library users predict they might need rather than analyzing their actual behaviors.
mobile site user experiences mobile library website development has been influenced by other, nonlibrary mobile sites that have placed heavy emphasis on developing sites for users who are on the go.10 earlier studies examining general mobile browsing and searching habits found that mobile users’ most popular activities were reading news, weather, or sports articles, looking for information using search engines, and checking email.11 when libraries used these studies to inform mobile site development, the result was a streamlined version of the full library site. it is instructive to consider studies like those done by coursaris and kim, who performed a meta-analysis of more than one hundred mobile usability studies. they demonstrated both the extreme breadth of mobile usability research, which has examined everything from how users perform tasks on their mobile devices while walking on a treadmill to how users navigate mobile maps, to mobile restaurant selection, as well as the niche-specific nature of many of these studies, the results of which may not be transferable to other contexts.12 when looking at mobile usability studies specifically within the higher education sector, a field more closely related to libraries, research focuses primarily on the use of mobile phones for enhancing student learning through specific activities or sites.13 less research examines how mobile portal or university homepages are used. one exception is the iowa course online (icon) mobile device survey, which was administered in 2010 and asked students what aspects of icon (a course management system) they used on their mobile devices.14 students were frequent users of this site; three-quarters of respondents used the site at least 1–3 times per week. the top three selected tasks were grades, “content” (a category including pdfs and microsoft word documents), and schedules. 
as defined by kaikkonen, users’ top tasks included a mix of “passive” content that requires no additional interaction (e.g., grades, weather) and “active” content, which requires further searching, reading, or location-specific information from the user (e.g., searching in web browsers, mapping, or looking through pdfs).15 the combination of passive and active content more closely matches the types of tasks potentially required by mobile library website users. research on library mobile site use or usability primarily has focused on users’ speculations about what they might like from a mobile library site before the site was constructed, or on a handful of users’ experiences navigating an existing site while accomplishing researcher-assigned tasks.16 no known research exists that demonstrates what tasks users actually perform on an existing mobile library website in real-time. for example, focus groups conducted in 2009 at kent state university library before the creation of their mobile site led to the conclusion that “students want the site for ‘quick research’ not to ‘sit down and write a term paper on my phone,’” and that students did not want as many research choices, such as all of the databases the library subscribes to, made available on their mobile device (or perhaps even on the full website).17 a survey at the open university in the united kingdom given prior to deployment of the mobile site extrapolated that students would want to access library hours, a map of the library, contact information, the library catalog and their borrowing record from their mobile device.18 in addition, a survey of the student body at utah state university in 2011 intended to help inform their mobile site design found that students might want to access the mobile catalog, retrieve articles and reserve study rooms.19 these results helped determine the future
development of these academic library mobile websites and were interpreted to demonstrate that particular tasks might be better suited for the mobile environment. however, they also demonstrate that student users might want to engage in a variety of both active and passive tasks, such as searching the library catalog and checking the library’s hours. as part of a growing recognition that it is time to reevaluate mobile library sites, bohyun kim reviewed eight library mobile sites and presented her analysis at the american library association annual conference in 2012.20 kim compared screenshots of 2010 and 2012 homepages and found a greater emphasis on search in the 2012 versions. kim also highlighted constraints and assumptions that are no longer true in the mobile environment, such as mobile devices’ slow networks, a focus on information on the go, and mobile sites with fewer features and content. kim’s analysis signals a shift in how libraries, as well as the broader mobile environment, are envisioning the content they provide on their mobile sites. because websites designed for the mobile context are still relatively new, an important part of this shift in the design of libraries’ mobile sites should include an investigation into what types of tasks users are actually performing on these sites. using web analytics software, we are able to learn which mobile webpages are the most visited on osu’s mobile site, and we know the path users took from the first hit to when they exited the site. however, what we do not know is what the user’s intention is in visiting the site, what types of searches they enter, or if these users are able to accomplish their search goals. the objectives of this study were to gather a list of tasks users attempt to accomplish when visiting the library mobile site and to understand the difficulties users encounter when they try to access information on osu’s mobile site.
a more in-depth understanding of how our mobile website is used will help us to provide an improved interface, especially as we work on a site redesign.

methods

this study used an online survey instrument to gain a better understanding of what tasks mobile site visitors are trying to perform when they visit our library’s mobile site, to discover whether they were able to accomplish their task, and to collect any other general impressions, suggestions, or feedback these users had about our mobile library site. this survey (approved by osu’s institutional review board) was available on the qualtrics survey platform for twelve weeks from november 2012 to january 2013. the survey was accessible via a link on the mobile version of our library’s homepage, meaning that only users who used the mobile-optimized version of our website had access to the survey. the survey was open to anyone who used the library’s mobile website, not just osu affiliates. the survey was completed by 115 participants. a $2 gift certificate to a coffee shop was distributed to participants upon completion of the survey. because mobile site use can cover a complex range of scenarios, and because we were more interested in learning about what this range of scenarios was for our mobile site, we did not use a closed-task scenario with preset tasks. we asked real users who were currently browsing the mobile site to choose from a list of tasks that best described what they were searching for on the library’s mobile site (there were also open-ended questions and options). if they were looking for a book or planned to conduct research on a topic, we used display logic in our survey to further probe their answer and ask about the parameters of their search. if they indicated they had previously used the mobile site to search for books or other research materials, we used the same method used with the book search to ask if they were able to find these materials.
if they were looking for articles, we then asked if they read the articles on their mobile device. in addition, we asked if there was anything they wished they could do on the site and for any other general feedback on the mobile site. finally, we collected some demographic information about the participants: their osu affiliation and the frequency with which they use the library’s mobile site. the survey data was analyzed using qualtrics’ cross-tab functionality and microsoft excel to observe trends and potential differences by user groups. open-ended responses were examined for common themes. to help provide some counterbalance to our survey data, a combination of urchin and google analytics statistics was analyzed for two of the months the survey was available. urchin statistics were gathered for the mobile version of the website, and google analytics statistics of our drupal-based pages were gathered for mobile users of the full version of the library’s website. we tabulated average daily visitors, specific page views, and the type of browser used with these analytical tools.

findings

online survey: closed-ended questions

an advantage of administering a survey versus simply using web analytics to assess a mobile site is that more granular information, such as demographics, can be gathered. of the 115 online survey respondents, 74 identified themselves as undergraduate students, 19 were graduate students, 8 were faculty members, 3 were community members, 2 were alumni, 2 were staff members, and 1 chose the “other” field and self-identified as a parent (see figure 1). not all respondents answered every question. because undergraduates make up the overwhelming majority of the campus body, it makes sense that 64 percent of respondents identified themselves as undergraduates.
however, the demographic responses also illustrate that multiple user groups access the library’s mobile site.

figure 1. demographic distribution of survey respondents (n = 109). chart question: what is your affiliation?

the survey participants were asked how often they had previously used the library’s mobile site. sixty-nine respondents (60 percent) were accessing the site for the first time. no respondents used the mobile site daily, 1 respondent visited 2–3 times per week, 5 respondents (4 percent) visited once a week, 11 respondents (9.5 percent) visited 2–3 times per month, 7 respondents (6 percent) visited once a month, and 16 (14 percent) visited less than once a month (see figure 2). the majority of respondents had not previously used the library’s mobile site. one possible reason for this is that the data was collected primarily during fall term, a time when there are many new students on campus. an alternative explanation is that people who use the mobile site often are highly task-oriented and did not want to be distracted by taking a survey. the fact that the majority of respondents had not previously used the library’s mobile site affected responses to later questions in the survey, which asked participants to remember previous experiences and satisfaction with the site.

figure 2. frequency with which survey respondents used the library’s mobile site (n = 109). chart question: how often do you use the library’s mobile site?

one of our main research goals was to determine our users’ intention for visiting the library’s mobile site. respondents could choose as many items as were applicable from a list of reasons for visiting the library’s mobile site; as a result, the data is presented as the percent of total responses. respondents could also choose “something else” and enter additional reasons for visiting the mobile site.
these “other” reasons were grouped and the groupings are reported in the list of reasons for visiting the site. the top reason respondents visited the site was to view the library’s hours (47 percent). the next two most frequent reasons for visiting the site were research related, with 25 percent intending to look for a book and 21 percent intending to do research on a topic. the fourth and fifth most common choices were associated with using the library building, with 13 percent looking for study room reservations and 10 percent looking for the availability of computers in the library. because the library’s current mobile site has been optimized for tasks perceived to be most important for mobile users, and because some of the features available via the full-site version of our ils (integrated library system) are not available via the library’s mobile site, not all of the tasks respondents wanted to accomplish on the mobile site were actually available on the mobile site. these items include study room reservations (13 percent of responses); “my account” features, such as the ability to check due dates, make renewals, and place holds (6 percent of responses); and interlibrary loan (1 percent of responses). finally, some features that had been considered ideal for the mobile context because of their location-sensitive, time-saving, or hedonic functionality were rarely selected as a reason for visiting the mobile site: looking for directions (1 percent of responses), finding a quick way to contact a librarian with a question (2 percent of responses), and viewing the webcam for the coffee shop line (3 percent of responses) (see figure 3).

figure 3. respondents’ reasons for visiting the library’s mobile site by percent of responses. respondents could choose more than one response. chart question: what are you searching for on the library’s mobile site?
chart values (percent of responses): library hours 47%; a book 25%; research on a topic 21%; study room reservations 13%; computer availability 10%; my account 6%; staff directory 4%; library coffee shop webcam 3%; course reserves 2%; a way to contact a librarian with a question 2%; academic calendar 1%; availability of other technology 1%; directions 1%; interlibrary loan 1%; jobs 1%.

to determine if different user groups approach the library’s mobile site differently, we compared the reasons for visiting the mobile site across user groups. when looking at the top five reasons respondents visited the mobile site, only a few differences appeared based on user group (because of the small sample size, a statistical analysis determining significance cannot be done, but results may indicate avenues for future research). graduate students were somewhat more likely to visit the library’s mobile site to look for research on a topic, as well as to look for study room reservations. however, undergraduate students were more likely than graduate students to be interested in the availability of computers in the library (see table 1).

what are you searching for during this visit to the library mobile site?
user group          library hours   a book   research on a topic   study room reservations   computer availability   n
undergraduate       48.7            21.6     14.9                  14.9                      12.2                    74
graduate student    47.4            31.6     31.6                  21.1                      5.3                     19
faculty member      12.5            25       37.5                  0                         0                       8
community member    100             33.3     33.3                  0                         0                       3
staff member        0               0        50                    0                         0                       2
alumni              50              50       0                     0                         0                       2
other               0               0        0                     0                         0                       1

table 1. percentage of respondents’ reasons for visiting the library’s mobile site by user group for the top five most-selected tasks. (respondents could choose more than one response.)

because the online survey was available over a twelve-week period, which included portions of fall term, winter break, and winter term, we could look at a breakdown of use by time of term. specifically, we wanted to see if there was a difference in use during the middle of the term versus finals week and intersession. during the latter period, we anticipated users would not be using the library’s mobile site for research purposes; however, when comparing these two different usage periods for the top five reasons respondents visited the site, we found respondents’ tasks tended to be quite similar regardless of whether or not the term was in session. the only two differences were (1) during intersession respondents tended to be more likely to search for a book using the mobile site and (2) during the term, more respondents were looking for a way to make study room reservations. while the high number of respondents looking for library hours dominates these results, it does appear that respondents were still using the mobile site to conduct research-related tasks even during intersession (see figure 4).

figure 4. percentage of respondents’ reasons for visiting the library’s mobile site by during term vs. intersession for the top five most-selected tasks. respondents could choose more than one response.
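the percentages in table 1 are shares of each user group that selected a given task, so a row can add up to more than 100 when respondents choose several tasks. a minimal python sketch of that kind of cross-tab follows. the function name `percent_by_group` and the sample responses are hypothetical, chosen only for illustration; they are not the study’s data and do not reproduce the qualtrics cross-tab feature.

```python
from collections import Counter, defaultdict


def percent_by_group(responses):
    """Return {group: {task: percent of that group's respondents who chose it}}.

    `responses` is a list of (group, set_of_selected_tasks) pairs, one per
    respondent. Because each respondent may select several tasks, the
    percentages within a group can sum to more than 100.
    """
    # Number of respondents in each group (the "n" column of a cross-tab).
    group_sizes = Counter(group for group, _ in responses)

    # How many respondents in each group selected each task.
    selections = defaultdict(Counter)
    for group, tasks in responses:
        selections[group].update(tasks)

    return {
        group: {
            task: round(100 * count / group_sizes[group], 1)
            for task, count in selections[group].items()
        }
        for group in group_sizes
    }


# Hypothetical multi-select responses (NOT the survey data).
responses = [
    ("undergraduate", {"library hours", "a book"}),
    ("undergraduate", {"library hours"}),
    ("undergraduate", {"computer availability"}),
    ("graduate student", {"research on a topic", "study room reservations"}),
    ("graduate student", {"library hours"}),
]

table = percent_by_group(responses)
for group, row in table.items():
    print(group, row)
```

dividing by the number of respondents in the group, rather than by the number of selections, is what allows small groups such as the three community members in table 1 to show 100 percent for a single task.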
online survey: open-ended questions

as described earlier, the second and third most frequently cited reasons for visiting the library’s mobile site were research related: looking for a book and doing research on a topic. survey participants indicating they had come to the site looking for a book were prompted to enter the search they intended to use. of the twenty-eight respondents who indicated they were looking for a book, twenty-five provided the search words they planned to use (see table 2). respondents reported a wide range of search types, from known-item titles or authors such as moby dick or ian fleming, to broad topic areas such as women studies, to focused keywords like “high performance computing.” all of these searches fit into the active task category of mobile use.

figure 4 chart values (percent of responses), chart question: what are you looking for on the mobile library site (by time of term)?
during term: computer availability 7%; study room reservations 13%; research on a topic 15%; a book 15%; library hours 33%.
finals week/intersession: computer availability 6%; study room reservations 5%; research on a topic 14%; a book 22%; library hours 34%.

accelerated c++
autism spectrum in children
beauty pageant
beer and circus
course reserves
four fish
googlr [sic]
ian fleming
moby dick
oregon taxes
pomerania
seafood quality
semiconductor
sir thomas malory
the power of now
high performance computing
hope is an imperative
i wanted to look up textbooks for my winter term classes
look up reference number to see if i can find out online if reserve is available
name of the book and author if i know it
title
vin #
web design
women studies
writing the successful thesis and dissertation

table 2. respondents’ book searches on their mobile device.

if a survey participant indicated they had come to the site to conduct research on a topic, we prompted them to provide more detailed information about their search.
of the twenty-three respondents who indicated they were conducting research on a topic, twenty provided the search words they planned to use (see table 3). it is apparent from the responses that at least five respondents misunderstood our question: instead of entering keywords for a search, they entered the databases or search engines they planned to use to conduct their research, such as 1search (our library’s iteration of serials solutions’ summon), eric, psycinfo, and perhaps google, although it is possible that the respondent who entered google intended to conduct research on the google company itself. the remaining search terms reflect in-depth concepts that would either retrieve many results (e.g., procurement and contract processes) or, while retrieving fewer results, would give the researcher more than a single simple document to consult (e.g., ethnobotany oregon). as with the book searches, these research topics represent more active tasks that move the mobile user beyond the earlier perceptions of mobile use that predicted tasks centered on quick, entertaining, or location-specific information.21

unit 731 social justice taxes ecological anthropology shared governance college ncaa history world war i 1search ebsco medline search eric search procurement and contract processes seafood quality e-journals to search for a specific subscription psychinfo ethnobotany oregon google dye properties and peak wavelengths databases in search of company info for applebee’s thesis writing beauty pageants the science teacher journal

table 3. respondents’ topic searches for nonbook research sources.

web analytics results

in addition to collecting survey responses, we monitored our web analytics during the survey period to see how the site usage matched with our survey respondents’ stated activities.
the mobile version of the site averaged 124 daily visitors between november 5, 2012, and january 5, 2013. the top three pages viewed were the computer availability map (37 percent), the mobile homepage (25 percent), and the research page (3 percent). the mobile homepage also displays the library hours. because mobile users do not use only mobile-optimized sites, we also looked at mobile use of the drupal-designed pages on the full version of our website. of the top twenty pages viewed, eleven were content pages and nine were navigational pages. based on page views, the top three content pages were study rooms (8 percent), hours (5 percent), and research databases (5 percent). the top three navigational pages based on page views were the homepage (40 percent), which also displays the library hours; the find it page (8 percent), which lists links to content like databases, the catalog, and ejournals; and the in the library page (7 percent), which provides links to information about things a patron might use in the library, such as study rooms or computers. the web analytics do not exactly mirror the tasks reported by the survey respondents and reflect an even greater emphasis on practical tasks like using a computer or a study room in the library.

in our research a disconnect appeared between what users actually do on the mobile site, according to urchin and google analytics, and what they would like to do on the site, based on information gathered from the participants of our online survey. it became apparent from our survey that our participants were not only attempting simple searches appropriate for a stripped-down mobile-optimized site but were also attempting active tasks, like conducting more complex searches we formerly would have expected them to do only on the full site.
we needed a website that no longer restricted the activities our users could do based on our outdated assumptions of their use of our website.

responsive design

as a solution to our problem of how to provide a consistent, nonrestricted experience to all of our users regardless of how they were accessing our site, we turned to the concept of responsive design. responsive design was conceived as recently as 2010,22 but adoption is growing rapidly because it offers a more scalable solution for designers, allowing them to move away from designing different websites for every platform and instead design sites that scale differently in different contexts (for example, an iphone vs. an ipad). responsive design is a more dynamic strategy, requiring, as web designer ethan marcotte states, “fluid grids, flexible images, and media queries.”23 however, marcotte goes on to argue that responsive design “requires a different way of thinking. rather than quarantining our content into disparate, device-specific experiences, we can use media queries to progressively enhance our work within different viewing contexts.” the following images illustrate the reflow of a responsive design layout for desktop and tablet views.

figure 5. desktop view of responsive osu libraries webpage.

figure 6. tablet view of responsive osu libraries webpage.

this summer, our web designer redesigned both the full site and mobile site using responsive design. at osu, we have decided that we will no longer have a separate mobile site. instead, drupal responsive design modules and themes allow our site to be viewed optimally, independent of screen size, as “one web.”24 using a responsive design allowed us to choose a one-column layout for our mobile site and a three-column layout for our full site.
using drupal modules and themes, the three columns from the full site reflow into one column on the mobile device. menu bars collapse, numerous static pictures become one image box with rotating pictures, and paragraphs become simply linkable titles. these design decisions allow users to perform both active and passive tasks depending on their needs, regardless of context.

responsive design provides clear advantages for designers; for example, they no longer need to maintain separate versions of low-use pages, such as a directions page. in addition, while responsively designed websites are not automatically accessible, it is simpler for designers to create a single iteration of a site that meets accessibility guidelines and can then scale appropriately to other contexts. there are also advantages for the user. responsive design ensures that users will encounter a predictable interface and experience across all of the platforms from which they access the library’s website. moreover, in our local context, we have had a policy of not providing links to websites not optimized for mobile devices, such as databases, from our mobile site. as we switched to a responsively designed site, we moved away from this policy, thereby providing more research choices to our mobile users.

there are also some drawbacks to using responsive design in our context. some might consider linking out to resources not optimized for mobile devices a drawback rather than an advantage. in addition, our redesign only involved applying responsive principles to our library sites that are developed in drupal. sites such as the study room reservation system, my account, and the library’s catalog draw upon other vendors’ tools and therefore are not under our design control. access to these sites will no longer be circumscribed for mobile users, but it may involve less intuitive navigation depending on their device.
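the reflow described above, from three columns on the full site to one column on a mobile device, is normally expressed with css media queries. as a rough illustration of the selection logic only, here is a minimal python sketch. the breakpoint value and the function name are hypothetical; the article does not state the breakpoints or the drupal theme settings used on the osu libraries site.

```python
# Hypothetical breakpoint in CSS pixels; not taken from the article.
FULL_LAYOUT_MIN_WIDTH_PX = 768


def columns_for_width(viewport_width_px: int) -> int:
    """Return the number of layout columns for a given viewport width.

    Mirrors a single min-width media query: viewports at or above the
    breakpoint get the three-column full layout, and narrower viewports
    fall back to the one-column mobile layout.
    """
    if viewport_width_px >= FULL_LAYOUT_MIN_WIDTH_PX:
        return 3
    return 1


for width in (320, 768, 1280):
    print(width, columns_for_width(width))
```

the point of the sketch is that one set of content serves every device and only the layout rule changes, which is what allows a single site to replace separate full and mobile versions.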
conclusion

our goal in this study was to gain a more in-depth understanding of how our mobile website is used, in order to guide a redesign of our mobile interface. after examining current trends in web design and development and analyzing the data we collected, which showed not only what our users currently do on our mobile site but also what they intend to do and what gaps they perceive in our service, we chose to integrate our full and mobile sites into “one web” through the use of responsive design. gathering qualitative information from our mobile site users has provided us with more realistic tasks that we can use in further usability testing. finally, we are again reminded of the continual evolution of our users’ needs and the expanding possibilities that are available as information becomes increasingly mobile.

references

1. jeff clabaugh, “blackberry u.s. market share falls to 5.4 percent; google’s android remains on top,” washington business journal, april 4, 2013, http://www.bizjournals.com/washington/news/2013/04/04/blackberry-us-market-share-falls-to.html.
2. laurie bridges, hannah gascho rempel, and kimberly griggs, “making the case for a mobile library web site,” reference services review 38, no. 2 (2010): 309–20, doi: 10.1108/00907321011045061.
3. anne kaikkonen, “full or tailored mobile web—where and how do people browse on their mobiles?” in proceedings of the international conference on mobile technology, applications, and systems, mobility ’08 (new york: acm, 2008), 28:1–28:8, doi: 10.1145/1506270.1506307.
4. jakob nielsen, “mobile usability update (jakob nielsen’s alertbox),” september 26, 2011, http://www.useit.com/alertbox/mobile-usability.html.
5. “feature phones comprise overwhelming majority of mobile phone sales in q2 2009,” npd group, 2009, https://www.npd.com/wps/portal/npd/us/news/press-releases/pr_090819/.
6. kim griggs, laurie m. bridges, and hannah gascho rempel, “library/mobile: tips on designing and developing mobile web sites,” code4lib journal 8 (november 11, 2009), http://journal.code4lib.org/articles/2055.
7. alan aldrich, “universities and libraries move to the mobile web,” educause review online, june 24, 2010, http://www.educause.edu/ero/article/universities-and-libraries-move-mobile-web.
8. nielsen, “mobile usability update (jakob nielsen’s alertbox).”
9. jakob nielsen, “mobile site vs. full site (jakob nielsen's alertbox),” april 10, 2012, http://www.useit.com/alertbox/mobile-vs-full-sites.html.
10. keren mills, m-libraries: information use on the move (cambridge, uk: arcadia programme, 2009), http://www.dspace.cam.ac.uk/handle/1810/221923; bridges, rempel, and griggs, “making the case for a mobile library web site.”
11. anne kaikkonen, “full or tailored mobile web?”
12. constantinos k. coursaris and dan j. kim, “a meta-analytical review of empirical mobile usability studies,” journal of usability studies 6, no. 3 (may 2011): 117–71.
13. emrah baki basoglu and omur akdemir, “a comparison of undergraduate students’ english vocabulary learning: using mobile phones and flash cards,” turkish online journal of educational technology—tojet 9, no. 3 (july 1, 2010): 1–7; suzan duygu eristi et al., “the use of mobile technologies in multimedia-supported learning environments,” turkish online journal of distance education 12, no. 3 (july 1, 2011): 130–41; stephanie cobb et al., “using mobile phones to increase classroom interaction,” journal of educational multimedia and hypermedia 19, no. 2 (april 1, 2010): 147–57; shelley kinash, jeffrey brand, and trishita mathew, “challenging mobile learning discourse through research: student perceptions of ‘blackboard mobile learn’ and ‘ipads,’” australasian journal of educational technology 28, no. 4 (january 1, 2012): 639–55.
14. university of iowa, icon mobile device survey, n.d., https://icon.uiowa.edu/support/statistics/icon%20mobile%20device%20survey.pdf.
15. anne kaikkonen, “full or tailored mobile web?”
16. kimberly d. pendell and michael s. bowman, “usability study of a library’s mobile website: an example from portland state university,” information technology & libraries 31, no. 2 (2012): 45–62, doi: 10.6017/ital.v21i2.1913.
17. jamie seeholzer and joseph salem, “library on the go: a focus group study of the mobile web and the academic library,” college & research libraries 72, no. 1 (january 2011): 9–20.
18. mills, m-libraries.
19. angela dresselhaus and flora shrode, “mobile technologies & academics: do students use mobile technologies in their academic lives and are librarians ready to meet this challenge?” information technology & libraries 31, no. 2 (2012): 82–101, doi: 10.6017/ital.v31i2.2166.
20. bohyun kim, “it’s time to look at your library’s mobile website again!” (presented at the american library association annual conference, anaheim, ca, june 24, 2012), http://www.slideshare.net/bohyunkim/its-time-to-look-at-your-librarys-mobile-website-again.
21. bridges, rempel, and griggs, “making the case for a mobile library web site.”
22. ethan marcotte, “responsive web design,” a list apart, may 25, 2010, http://alistapart.com/article/responsive-web-design.
23. ibid.
24. jeff burnz, “responsive design,” drupal, february 14, 2013, http://drupal.org/node/1322126.

managing metadata for philatelic materials
megan ozeran
information technology and libraries | september 2017

abstract

stamp collectors frequently donate their stamps to cultural heritage institutions. as digitization becomes more prevalent for other kinds of materials, it is worth exploring how cultural heritage institutions are digitizing their philatelic materials. this paper begins with a review of the literature about the purpose of metadata, current metadata standards, and metadata that are relevant to philatelists. the paper then examines the digital philatelic collections of four large cultural heritage institutions, discussing the metadata standards and elements employed by these institutions. the paper concludes with a recommendation to create international standards that describe metadata management explicitly for philatelic materials.

introduction

postage stamps have existed since great britain introduced them in 1840 as a way to prepay postage.
historian and professor winthrop boggs (1955) points out that postage stamps have been collected by individuals since 1841, just a few months after the first stamps were issued (5). to describe this collection and research, the term philately was coined by a french stamp collector, georges herpin, who “combined two greek words philos (friend, amateur) and atelia (free, exempt from any charge or tax, franked)” (boggs 1955, 7). thus postage stamps and related materials, such as the envelopes to which they have been affixed, are considered philatelic materials. in the united states, numerous societies have formed around philately, such as the american philatelic society, the postal history society, the precancel stamp society, and the sacramento philatelic society (in northern california). the definitive united states authority on stamps and stamp collecting for nearly 150 years has been the scott postage stamp catalogue, which was first created by john walter scott in 1867 (boggs 1955, 6). the scott catalogue “lists nearly all the postage stamps issued by every country of the world” (american philatelic society 2016). philately is a massively popular hobby, and cultural heritage institutions have amassed large collections of postage stamps through collectors’ donations. in this paper, i will examine how cultural heritage institutions apply metadata to postage stamps in their digital collections. libraries, archives, and museums have obtained specialized collections of stamps over the decades, and they have used various ways to describe these collections, such as through creating finding aids. only recently have institutions begun to digitize their stamp collections and make the collections available for online review, as digitization in general has become more common in cultural heritage institutions. 
megan ozeran (megan.ozeran@gmail.com), a recent mlis degree graduate from san jose state university school of information, is winner of the 2017 lita/ex libris student writing award.

managing metadata for philatelic materials | ozeran | doi:10.6017/ital.v36i3.10022

problem statement

textual materials have received much attention with regard to digitization, including the creation and implementation of metadata standards and schemas. philatelic materials are not like textual materials, and are not even like photographic materials, which have also received some digitization attention. in fact, very little literature currently describes how metadata is or should be applied to philatelic materials, even though digital collections of these materials already exist. therefore, the goal of this paper is to examine exactly how metadata is applied to digital collections of philatelic materials. several related questions drove the research about this topic: as institutions digitize stamp collections, what metadata schema(s) are they using to do so? are current metadata standards and schemas appropriate for these collections, or have institutions created localized versions? what metadata elements are most crucial in describing philatelic materials to enhance access in a digital collection?

literature review

while there is abundant literature regarding the use of metadata for library, archives, and museum collections, there is a dearth of literature that specifically discusses the use of metadata for philatelic materials. indeed, there is no literature at all that analyzes best practices for philatelic metadata, despite the fact that several large institutions have already created digital stamp collections. even among the many metadata standards that have been created, very few specify metadata guidelines for philatelic collections.
it is clear that philatelic collections have not been highlighted in discussions over the last few decades about digitization, so best practices must be inferred based on the more general discussions that have taken place.

the purpose and quality of metadata

when considering why metadata is important to digital collections (of any type), it is crucial to remember, as david bade (2008) puts it, “users of the library do not need bibliographic records at all. . . . what they want is to find what they are looking for” (125). in other words, the descriptive metadata in a digital record is important only to the extent that it facilitates the discovery of materials that are useful to a researcher. as arms and arms (2004) point out, “most searching and browsing is done by the end users themselves. information discovery services can no longer assume that users are trained in the nuances of cataloging standards and complex search syntaxes” (236). echoing these sentiments, chan and zeng (2006) write, “users should not have to know or understand the methods used to describe and represent the contents of the digital collection” (under “introduction”). when creating digital records, then, institutions need to consider how the creation, display, and organization of metadata (especially within the search system) make it easier or more difficult for those end users to effectively search the digital collection.

how effective metadata is in facilitating user research is ultimately dependent upon the quality of that metadata. bade (2007) notes that information systems are essentially a way for an institution to communicate with researchers, and that this communication is only effective if metadata creators understand what the end users are looking for in the content and style of communication (3-4). thus, in somewhat circular fashion, metadata quality is dependent upon understanding how best to communicate with end users.
to help define discussions of metadata quality, bruce and hillmann (2004) suggest seven factors to consider: “completeness, accuracy, provenance, conformance to expectations, logical consistency and coherence, timeliness, and accessibility” (243). deciding how to prioritize one or several factors over the others will depend on the resources and goals of the institution, as well as the ultimate needs of the end users.

the state of standards

standards are created by various organizations to define the rules for applying metadata to certain materials in certain settings. standards generally describe a metadata schema, “a formal structure designed to identify the knowledge structure of a given discipline and to link that structure to the information of the discipline through the creation of an information system that will assist the identification, discovery and use of information within that discipline” (cc:da 2000, under “charge #3”). essentially, a metadata schema standard demonstrates how best to organize and identify materials to enhance discovery and use of those materials. such standards are helpful to catalogers and digitizers because they define rules for how to include content, how to represent content, and/or what the allowable content values are (chan and zeng 2006, under “metadata schema”).

unfortunately, very few current metadata standards even mention philatelic materials, despite their unique nature. the only standard that appears to do so with any real purpose is the canadian rules for archival description (rad), created by the bureau of canadian archivists in 1990 and revised in 2008. thirteen chapters comprise the first part of the rad, and these chapters describe the standards for a variety of media.
philatelic materials are given their own focus in chapter 12, which discusses general rules for philatelic description as well as specifics for each of nine areas of description: title and statement of responsibility, edition, issue data, dates of creation and publication, physical description, publisher’s series, archival description, note, and standard number. the rad therefore provides a decent set of guidelines for describing philatelic materials.

the encoded archival description tag library created by the society of american archivists (ead3, updated in 2015) mentions philatelic materials only in passing. there is no specific section discussing how to properly apply descriptive metadata to philatelic materials. the single mention of such materials in the entire ead3 documentation is in the discussion of the tag, where it is noted that “jurisdictional and denominational data for philatelic records” (257) may be recorded.

other standards don’t appear to mention philatelic materials at all, so implementers of those standards must extrapolate based on the general information provided. for example, describing archives: a content standard (dacs), also published by the society of american archivists (2013), does not discuss philatelic materials in any way. it does note, “different media of course require different rules to describe their particular characteristics…” (xvii), but the recommendations for specific content standards for different media listed in appendix b still leave out philately (141-142). institutions using dacs for philatelic materials need to determine how to localize the standard. although marc similarly does not include specific guidelines for philatelic materials, peter roberts (2007) suggests ways to effectively use it for cataloging philatelic materials.
for instance, in the marc 655 field he suggests using the getty art and architecture thesaurus terms to describe the form of the materials and the library of congress subject headings to describe the subjects (genres) of the materials (86-87). in similar ways, most standards could potentially be applied to philatelic materials if an institution were to provide additional local rules for how to best implement the standard.

the metadata that philatelists want

there are actually a good number of resources for determining what metadata is important to philatelic researchers. boggs (1955) suggests that a philatelist may want to “study the methods of production; the origin, selection, and the subject matter of designs; their relation to the social, political and economic history of the country of issue; the history of the postal service which issued them” (1-2). these few initial research suggestions can provide some insight into what metadata elements would be most useful in a digital record. david straight (1994) suggests the most basic crucial items are the date and country of issue for an item (75).

roberts (2007) provides significant background about philatelic materials and research, and indicates multiple metadata elements that will be helpful for researchers. he reiterates that dates are extremely useful and are often identified on the materials themselves; when specific dates are not visible, a stamp itself may provide evidence of an approximate year based on when the stamp was issued (75). he notes that many of the postal markings also “indicate the time and place of origin, route, destination, and mode of transportation” (78), which will also be of interest to philatelic researchers.
if any information is available about the original collector, dealer, or exhibitor of the stamp before it was acquired by a cultural heritage institution, this may also be of great interest to a researcher (81). roberts also suggests that the finding aids for philatelic collections are more crucial places for description than for specific item records, and that controlled vocabulary subject terms are important in these descriptions (86). because the scott postage stamp catalogue is the leading united states authority on stamps, it can also suggest the metadata elements that primarily concern philatelic researchers. each listing includes a unique scott number, paper color, variety (e.g., perforation differences), basic information, denomination, color of the stamp, year of issue, value used/unused, any changes in the basic set information, and the total value of the set (scott publishing co. 2014, 14a). the scott catalogue also describes a variety of additional components that researchers may be interested in, including the type of paper used, any watermarks, inks used, separation type, printing process used, luminescence, and gum condition (19a-25a). one additional interesting source for deciding what metadata is important to researchers (aside from directly surveying them, of course) is a piece of software that was created to help philatelists catalog their own private collections. stampmanage is available in united states and international versions, and it is largely based on the scott postage stamp catalogue in creating the full listing of stamps that may be available to a collector. 
it includes a wide variety of metadata elements for cataloging stamps, such as the scott number, country of origin, date of issue, location of issue, type of stamp, denomination, condition, color, brief description, presence and type of perforations, category, plate block size, mint sheet size, paper type, presence and type of watermark, gum type, and so forth (liberty street software 2016). as a product that is sold to stamp collectors, stampmanage is likely to have a confident grasp of all the metadata that could possibly be important to its customers.

this literature review helps create a holistic view of the issues faced by cultural heritage institutions with digitized stamp collections. although little progress has been made in the literature to describe how best to apply metadata to philatelic materials, there are ways that institutions can extrapolate guidelines from the literature that does exist.

methodology

to explore my research questions, i interviewed (over email) representatives of several large institutions with digitized stamp collections. the information provided by these institutions sheds light on the current state of metadata and metadata schemas for philatelic collections. note that there are other institutions with online collections of postage stamps that are not discussed in this paper (e.g., the swedish postal museum, https://digitaltmuseum.se/owners/s-pm). due to my own language limitations, this paper is limited to analysis of online collections that are described in english. additional research into institutions with non-english displays would support greater analysis of how cultural heritage institutions are currently creating and providing philatelic metadata.

results

smithsonian national postal museum

in the united states, the largest publicly accessible digital collection of philatelic materials is from the smithsonian national postal museum.
i discussed the metadata for this collection with elizabeth heydt, collections manager at the museum. ms. heydt stated that the stamps are primarily identified “by their country and their scott number” (e. heydt, pers. comm., october 5, 2016). for digital collections, the smithsonian national postal museum uses a gallery systems database called the museum system, which includes the getty art and architecture thesaurus as an embedded thesaurus. ms. heydt noted that aside from this embedded thesaurus, they “do not use any additional, formalized data standards such as the dublin core, mods,” or the like. of note, the museum system does allege compliance with “standards including spectrum, cco, cdwa, dacs, chin, lido, xmp, and other international standards” (gallery systems 2015, 4). the end user interface that pulls data from the museum system is called arago, which has “an internal structure that built on the scott catalogue system and some internal choices for grouping and classifying objects for the philatelic and the postal history collections.” users can search and browse the entire digital collection through arago, but ms. heydt did note that arago “is in stasis right now as we are in the planning stages for an updated version sometime in the near future.” based on an example record (http://arago.si.edu/record_145471_img_1.html), the descriptive metadata currently available for end users include a title, scott number, detailed description (including keywords), date of issue, medium, museum id (a unique identifier), and place of origin. digital images of the stamps are also included. a set of “breadcrumb” links at the top of the page also allow a user to browse each level of the digital collection, from an individual stamp record up to the entire museum collection as a whole. 
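the elements visible on the example arago record can be pictured as a simple record structure. the following python sketch is illustrative only: the class, field, and function names are hypothetical, the sample values are placeholders rather than data from a real record, and nothing here reflects how the museum system or arago actually stores data. it lists which elements of a record are empty (a basic completeness check) and builds a breadcrumb trail from a hierarchy of collection levels.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class StampRecord:
    """Descriptive elements named in the text for the example record."""
    title: str
    scott_number: Optional[str] = None
    description: Optional[str] = None
    date_of_issue: Optional[str] = None
    medium: Optional[str] = None
    museum_id: Optional[str] = None
    place_of_origin: Optional[str] = None


def missing_elements(record: StampRecord) -> List[str]:
    """Return the names of elements that are empty or not supplied."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


def breadcrumb(levels: List[str]) -> str:
    """Join hierarchy levels, broadest first, into a breadcrumb trail."""
    return " > ".join(levels)


# Placeholder values for illustration only; not taken from a real record.
record = StampRecord(
    title="example stamp",
    scott_number="0000",
    place_of_origin="example country",
)
print(missing_elements(record))
print(breadcrumb(["museum collection", "example group", record.title]))
```

a check of this kind makes it easy to see whether the items that straight (1994) calls most basic, the date and country of issue, are present before a record is published.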
library and archives canada

i discussed the library and archives canada (lac) online philatelic collection with james bone, archivist at the lac. he explained that the philatelic collection has had a complicated history:

our philatelic collection largely began with the dissolution of the national postal museum … in 1989 and the subsequent division and transfer of its collection to the canadian postal museum for artifacts/objects at the former canadian museum of civilization (now the canadian museum of history) and to the canadian postal archives at the former national archives (which was merged with the national library in the mid-2000s to create library and archives canada). as a side note, both the canadian postal museum and the canadian postal archives are themselves now defunct – although lac still acquires philatelic records and records related to philately and postal administration, these functions are no longer handled by a dedicated section but rather by archivists within our government records branch and our private records branch (the latter being me). (j. bone, pers. comm., october 11, 2016)

regarding the collection’s metadata, mr. bone confirmed that the archival records at the lac all conform to the rad standard (discussed in the literature review above), and that philatelic materials are all given “at least a minimum level of useful file level or item level description for philatelic records based on chapter 12 of rad,” the chapter that specifically discusses philatelic materials. unfortunately, to his knowledge, the online database for these records does not support a common metadata harvesting protocol such as oai-pmh that enables “external metadata harvesting or querying,” so the system is not searchable outside of the lac website. mr.
bone also pointed out that there are fields visible on the back end of the lac online database that are not visible to end users, and the most notable of these omissions is the scott number (the number assigned to every stamp by the scott catalogue). he wrote that it seemed “bizarre” to not have the scott number visible, “as that’s definitely an access point that i would expect philatelic researchers to use to narrow down a result set to the postage stamp issue of interest.” however, it appears this invisibility was a decision consciously made by the lac, based on mr. bone’s review of an internal lac standards document. based on an example record (http://collectionscanada.gc.ca/pam_archives/index.php?fuseaction=genitem.displayitem&lang=eng&rec_nbr=2184475) the following fields are available for end users to view: title, place of origin, denomination, date of issue, title of the collection of which it is a part, extent of item, language, access conditions, terms of use, mikan number (a unique identifier), itemlev number (deprecated), and any additional relevant information such as previous exhibitions of the physical item.

the postal museum

the postal museum in london is set to open its physical doors in 2017, but much of the collection is already available for browsing and searching online. stuart aitken, curator, philately, explained to me that the online collection uses the general international standard archival description, second edition, as the primary metadata schema, but the online collection also includes “non isad(g) fields for certain extra-specific data for our archive material, including philatelic material” (s. aitken, pers. comm., december 1, 2016).
based on my own review of the isad(g) standards document (international council on archives 1999) and an example record from the postal museum’s online collection (http://catalogue.postalmuseum.org/collections/getrecord/gb813_p_150_06_02_011_01_001#current), it appears nearly all the fields are based on the isad(g) standards. these fields include information such as date, level of description, extent of item, language, description, and conditions for access and reproduction. only the field for “philatelic number” appears to be extra. there may be additional non-isad(g) fields that are not included in the example record above, but are included in other records when the extra information is available and relevant. each digital record also allows end users to submit tags for help with identification and search. no tags were already submitted on the example record reviewed above, but this is likely because the online collection is still rather new. of note, digital records are created at each archival level, from the broadest collection category down to the individual item (similar to the smithsonian national postal museum collection). to provide an additional way to browse the collection, a sidebar in each digital record shows where it exists in the hierarchy of collections and provides links to each broader collection of which the current record is a part.

the british museum

i reached out to the folks at the british museum to discuss the application of metadata to their online records for postage stamps, but at the time of this writing i have not received any response. however, some information can be gleaned from examining the website. unlike the other institutions reviewed in this paper, the british museum’s online collection includes a wide variety of objects.
postage stamps are therefore identified in the online collection by specifying “postagestamp” in the “object type” field, which likely uses a controlled vocabulary. based on an example record (http://www.britishmuseum.org/research/collection_online/collection_object_details.aspx?objectid=1102502&partid=1&searchtext=postage+stamp&page=1), each record for a postage stamp lists the museum number (a unique identifier), denomination, description, date issued, country of origin, materials, dimensions, acquisition name and date, department, and registration number (which appears to be the same as the museum number). digital images of the stamps are occasionally included. the collection website notes that the british museum is “continuing every day to improve the information recorded in it [the digital collection] and changes are being fed through on a regular basis. in many cases it does not yet represent the best available knowledge about the objects” (trustees of the british museum 2016a, under “about these records”). therefore, end users are encouraged to read the information in any given record with care, and to provide feedback if they have any additional information or corrections about an object. the online collection also is offered in machine-readable format, via linked data and sparql, to encourage wider accessibility and use. the website advises,

the use of the w3c open data standard, rdf, allows the museum's collection data to join and relate to a growing body of linked data published by other organisations around the world interested in promoting accessibility and collaboration. the data has also been organised using the cidoc crm (conceptual reference model) crucial for harmonising
the cidoc crm represents british museum's data completely and, unlike other standards that fit data into a common set of data fields, all of the meaning contained in the museum's source data is retained. (trustees of the british museum 2016b) each digital object has rdf and html resources, as well as a sparql endpoint with an html user interface. discussion the information from the four institutions above provides a starting point for examining best practices for philatelic metadata. in the following discussion, i will review the information in light of the research questions: important metadata elements, the standards that were implemented, and whether the standards that currently exist have been sufficient. as explained in the literature review above, relevant metadata are crucial for enhancing end user research of digital records. this suggests that similarity of metadata across collections of the same type will improve users’ ability to conduct their research. unfortunately, there are only a few descriptive metadata fields used across all four of the institutions reviewed in this paper. these fields include a title (sometimes used very loosely), the date of issue, the place of issue, a description, and a unique identifier. these fields certainly seem to be the absolute minimum necessary for identifying (and searching for) a postage stamp, since they are among the fields discussed in the literature review as being important to philatelic researchers. other fields that are included in some but not all of the above collections, such as stamp denomination and access conditions, are nonetheless quite relevant to online collections of postage stamps. interestingly, although the scott catalogue is recognized as a premier stamp catalogue, only one institution (the smithsonian national postal museum) currently uses the scott identification number as part of the standard philatelic metadata. 
as noted above, the library and archives canada does include the scott number in the behind-the-scenes metadata, but it does not display the scott number to end users. the postal museum and the british museum don’t use the scott number at all. it appears that only the smithsonian believes the scott number is useful to end users, either for search or identification purposes. of the four institutions, it appears that only the british museum uses metadata standards that increase the accessibility of the online collection beyond its own website. the implementation of rdf for linked data creates an open collection that is machine-readable beyond the internal database used by the museum. the smithsonian national postal museum, library and archives canada, and the postal museum do not appear to use any similar metadata standard for data harvesting or transmission, which means that these collections can only be searched from within their respective websites. the most important thing to note in reviewing the online collections for these four institutions is the fact that each institution uses different standards to apply metadata in a different way. frankly, this is not a surprise. as discussed in the literature review above, although metadata standards exist for a variety of materials, philatelic materials are simply not considered. only the canadian rules for archival description explicitly include information about philatelic materials; accordingly, the library and archives canada utilizes these rules when creating its online records of postage stamps. no similar standard exists in the united states or internationally, leaving individual institutions with the task of deciding what generic metadata standard to use as a jumping off point, and then modifying it to meet local needs.
as described above, the smithsonian national postal museum uses the metadata schema that comes with their collection management software, and has created an end-user interface based off of internal metadata decisions. the postal museum based their metadata primarily off of isad(g), an international metadata standard with no specific suggestions for philatelic materials. i was unable to confirm the base metadata schema the british museum employs, although it is clear they use rdf to make the collection’s digital records more widely available. each institution appears to be using a different base metadata standard, essentially requiring them to reinvent the wheel upon deciding to digitize philatelic materials. this is what happens when there is no single, unified standard available for the type of material being described. conclusion as this paper has shown, metadata standards are sorely lacking when it comes to philatelic materials. other kinds of materials have received special considerations because more and more institutions decided it would be important to digitize them, so various groups came together to create standards that provide some guidance. it is time for this to happen for philatelic materials as well. there aren’t many cultural heritage institutions that currently manage digital collections of philatelic materials, so this is an opportunity for those who plan to digitize their collections to consider what has been done and what makes sense to pursue. it is clear that philatelic digitization is still nascent, but as with other kinds of materials, it is only likely that more and more institutions will attempt digitization projects. it is hoped that this paper can serve as a jumping off point for institutions to discuss the creation of international metadata standards specifically for philatelic materials. 
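as a purely illustrative aside, the five descriptive fields that the discussion found at all four institutions (title, date of issue, place of issue, description, and a unique identifier) can be sketched as a minimal record structure. this is a hypothetical sketch and not a proposed standard: the class name, field names, and helper function below are assumptions made for illustration and are not drawn from any of the institutions' systems.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class StampRecord:
    """hypothetical minimal record; all field names are illustrative assumptions."""

    # fields found in the records of all four institutions reviewed
    title: Optional[str] = None
    date_of_issue: Optional[str] = None
    place_of_issue: Optional[str] = None
    description: Optional[str] = None
    identifier: Optional[str] = None
    # fields found at some, but not all, of the institutions
    scott_number: Optional[str] = None
    denomination: Optional[str] = None
    access_conditions: Optional[str] = None


# the shared minimum identified in the discussion
CORE_FIELDS = ("title", "date_of_issue", "place_of_issue", "description", "identifier")


def missing_core_fields(record: StampRecord) -> List[str]:
    """return the names of shared core fields that are empty in this record."""
    return [name for name in CORE_FIELDS if not getattr(record, name)]


# usage: a partially described stamp (all values are made up)
example = StampRecord(title="example stamp", date_of_issue="1900", identifier="demo-0001")
print(missing_core_fields(example))
```

a check like this only flags gaps in the shared minimum; it says nothing about which additional fields (such as the scott number) a future philatelic standard ought to require.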
acknowledgements

many thanks are owed to the people who took time out of their very busy lives to respond to the unrefined inquiries of an mlis grad student: stuart aitken (curator, philately, the postal museum); james bone (archivist, private archives branch, library and archives canada); and elizabeth heydt (collections manager, smithsonian national postal museum). their expertise and responsiveness is immensely appreciated.

references

aape (american association of philatelic exhibitors). 2016a. “aape join/renew your membership.” http://www.aape.org/join_the_aape.asp.
–––––. 2016b. “exhibits online.” http://www.aape.org/join_the_aape.asp.
american philatelic society. 2016. “stamp catalogs: your guide to the hobby.” accessed december 8. http://stamps.org/how-to-read-a-catalog.
arms, caroline r., and william y. arms. 2004. “mixed content and mixed metadata: information discovery in a messy world.” in metadata in practice, edited by diane i. hillmann and elaine l. westbrooks, 223-37. chicago, il: ala editions.
bade, david. 2007. “structures, standards, and the people who make them meaningful.” paper presented at the 2nd meeting of the library of congress working group on the future of bibliographic control, chicago, il, may 9, 2007. https://www.loc.gov/bibliographicfuture/meetings/docs/bade-may9-2007.pdf.
bade, david. 2008. “the perfect bibliographic record: platonic ideal, rhetorical strategy or nonsense?” cataloging & classification quarterly 46 (1): 109-33. https://doi.org/10.1080/01639370802183081.
boggs, winthrop s. 1955. the foundations of philately. princeton, nj: d. van nostrand company.
bruce, thomas r., and diane i. hillmann. 2004. “the continuum of metadata quality: defining, expressing, exploiting.” in metadata in practice, edited by diane i. hillmann and elaine l. westbrooks, 238-56. chicago, il: ala editions.
bureau of canadian archivists. 2008.
rules for archival description. rev. ed. ottawa, canada: canadian council of archives. http://www.cdncouncilarchives.ca/archdesrules.html.
cc:da (american library association committee on cataloging: description and access). 2010. “task force on metadata: final report.” american library association. https://www.libraries.psu.edu/tas/jca/ccda/tf-meta6.html.
chan, lois m., and marcia l. zeng. 2006. “metadata interoperability and standardization – a study of methodology part i: achieving interoperability at the schema level.” d-lib magazine 12 (6). https://doi.org/10.1045/june2006-chan.
gallery systems. 2015. “tms: the museum system.” http://go.gallerysystems.com/abouttms.html.
international council on archives. 1999. isad(g): general international standard archival description. 2nd ed. stockholm, sweden: international council on archives. http://www.icacds.org.uk/eng/isad(g).pdf.
liberty street software. 2016. “stampmanage the best way to catalog your stamp collection.” http://www.libertystreet.com/stamp-collecting-software.htm.
roberts, peter j. 2007. “philatelic materials in archival collections: their appraisal, preservation, and description.” the american archivist 70 (1): 70-92. https://doi.org/10.17723/aarc.70.1.w3742751w5344275.
scott publishing co. 2014. scott 2015 standard postage stamp catalogue. vol. 3, countries of the world, g-i. sidney, oh: scott publishing co.
society of american archivists. 2013. describing archives: a content standard. 2nd ed. chicago, il: society of american archivists. http://files.archivists.org/pubs/dacs2e-2013_v0315.pdf.
society of american archivists. 2015. encoded archival description tag library, version ead3. chicago, il: society of american archivists. http://www2.archivists.org/sites/all/files/taglibrary-versionead3.pdf.
straight, david. 1994. “adding value to stamp and coin collections.” library journal 119 (10): 75-78. accessed december 8, 2016.
http://libaccess.sjlibrary.org/login?url=http://search.ebscohost.com/login.aspx?direct=true&db=ulh&an=9406157617&site=ehost-live&scope=site.
trustees of the british museum. 2016a. “about the collection database online.” accessed december 8. http://www.britishmuseum.org/research/collection_online/about_the_database.aspx.
–––––. 2016b. “british museum semantic web collection online.” accessed december 8. http://collection.britishmuseum.org/.

user authentication in the public area of academic libraries in north carolina
gillian (jill) d. ellern, robin hitch, and mark a. stoffan
information technology and libraries | june 2015 103

abstract

the clash of principles between protecting privacy and protecting security can create an impasse between libraries, campus it departments, and academic administration over authentication issues with the public area pcs in the library. this research takes an in-depth look at the state of authentication practices within a specific region (i.e., all the academic libraries in north carolina) in an attempt to create a profile of those libraries that choose to authenticate or not. the researchers reviewed an extensive amount of data to identify the factors involved with this decision.

introduction

concerns surrounding usability, administration, and privacy with user authentication on public computers are not new issues for librarians. however, in recent years there has been increasing pressure on all types of libraries to require authentication of public computers for a variety of reasons.
since the 9/11 tragedy, there has been increasing legislation such as the uniting and strengthening america by providing appropriate tools required to intercept and obstruct terrorism act of 2001 (usa patriot act) and communications assistance for law enforcement act (calea). in response, administrators and campus it staff have become increasingly concerned about allowing open access anywhere on their campuses. restrictive licensing agreements for specialized software and web resources are also making it necessary or attractive to limit access to particular academic subgroups and populations. permitting access to secured campus storage from these computers can make it necessary for libraries to think about the necessity of authentication. and finally, the general state of the economy has increased the user traffic to libraries, sometimes making it necessary to control the use of limited computer resources. authenticating can often make these changes easier to implement and can give the library more control over its it environment.

that being said, authentication comes at a price for librarians. authentication often creates ethical issues with regards to patron privacy, freedom of inquiry, increasing the complexity of using public area machines, and restricting the open access needs of public or guest users. requiring a patron to log into a computer can make it possible for organizations outside the library’s control

gillian (jill) d. ellern (ellern@email.wcu.edu) is systems librarian, robin hitch (rhitch@email.wcu.edu) is tech support analyst, and mark a.
stoffan (mstoffan@email.wcu.edu) is head, digital, access, and technology services, western carolina university, cullowhee, north carolina.

user authentication in the public library area of academic libraries in north carolina | 104 ellern, hitch, and stoffan doi: 10.6017/ital.v34i2.5770

to collect, review and use data of a patron’s searching habits or online behaviors. issues associated with managing patron logins can also create barriers for access as well as being time consuming and frustrating for both the patron and the library staff.1 while open, anonymous access does not completely protect against these issues, it can help to create an environment of free, private and open access similar to the longstanding situation with the book collection in most libraries.

the hunter library experience

while working on the implementation of a new campus-wide pay-for-print solution in 2009, librarians from the hunter library at western carolina university began to feel pressured by the campus it department to change its practice of allowing anonymous logins to all the computers in the public areas of the library. concerns about authenticating users on library public area machines had been building between these two units for several years. the resulting clash of principles between protecting privacy and protecting security came to a head over this project. the hunter library employees perceived that there needed to be more time for research and debate before implementing the mandate.
initially, there was great resistance from campus it staff to take the library’s concerns into account, but eventually a compromise was worked out that allowed the library to retain anonymous logins on its public computers. the confrontation led library staff to investigate the practices of other libraries, particularly within the university of north carolina (unc) system of which it is a member. it seemed a logical development to extend the initial research into the authentication practices throughout the state of north carolina.

the problem

one of the first questions asked by western carolina’s library administration of the systems department was what other libraries in the area were doing. in our case, the library director specifically asked how many of western carolina’s sister universities were authenticating and why. anecdotally, during this process, it seemed that many other university of north carolina system libraries reported being pressured to authenticate their public computers by organizations outside the library, most often the campus it department.

when the librarians at the hunter library began looking for research to support their position, that is, hard data and practical arguments that could be used to argue effectively against this change, helpful literature seemed to be lacking. some items were found, such as carlson, writing in the chronicle of higher education, who reported on the divide between access and security.
he confirmed that other librarians also have ambivalent feelings about authentication issues but that there was also growing understanding in libraries about the potential vulnerability of networks or misuse of their resources.2

it seemed that the speed at which authenticating computers in the public areas of libraries was happening across the country had not really allowed the literature on the subject to quite catch up. those studies that existed, such as spec kits, seemed to address the issue from the perspective of larger research libraries or else did not systematically assess other specific groups of libraries.3,4 there were questions in our minds about whether the current research that was found would describe the trends and unique situations of libraries located in rural areas or in other types of academic libraries. there seemed to be no current statewide or geographically defined analysis of authentication practices across various types of academic libraries in a specific state or region, nor were there any available studies creating a profile of libraries more likely to authenticate computers in their public areas. we questioned if the rural nature of our settings, our mission, or our geographic area in the south might reinforce or hurt our position with it. authentication status is not something that is mentioned in the ala directory nor is this kind of information often given on a library’s web site. we found that individuals usually need to call or visit the library directly if they want to know about a library’s authentication practices.
during the initial investigation, the need for this kind of information to support the library’s perspective became clear. this question led to the creation of this survey of authentication practices in a larger geographical area and across various kinds of academic libraries. the goals of this research were to determine some answers to the following questions:

• what is the current state of authentication practices in the public area of academic libraries in north carolina?
• what factors caused these libraries to make the decisions that they did in regards to authentication?
• could you predict whether an academic library would require users to authenticate?

literature review

a number of studies have discussed various other aspects of user authentication in libraries, including privacy and academic freedom concerns, guest access policies, differing views of privacy and access between library and campus it departments, and legislation impacting library operations. all are potential factors impacting decisions on authentication of patron accessible computers located in the public areas of the library.

privacy and academic freedom about the use of a library’s collection have long been major concerns for librarians even before information technology was introduced. the impact of 9/11 and the patriot act made the discussion of computers and network security, especially in the library environment, much more entwined. oblinger discussed online access concerns in the context of academic values, focusing on unique aspects of the academic mission.
she discussed the results of an educause/internet2 computer and network security task force invitational workshop that established a common set of principles as a starting point for discussion: civility and community, academic and intellectual freedom, privacy and confidentiality, equity of access to resources, fairness, and ethics. all of these principles, she argued, are integral to the environment of a university, and she concluded that security is a complex topic and that written, top-imposed policies alone will not adequately address all concerns.5 while not directly addressing the issues of the library’s public computer access in particular, she established a framework of values on how security issues relate to the university culture of freedom and openness.

dixon, in an article written for library administrators, discussed privacy practices for libraries within the context of the library profession’s ethical concerns. she highlights such documents as the code of ethics of the american library association6, the fair information practices adopted by the organization for economic cooperation and development7, and the niso best practices for designing web services in the library context8. she also reviews a variety of ways that patron data may be misused or compromised. she stated that all the ways that patron data can be stored or tracked by local networks, it departments, or internet service providers may not be fully understood by librarians.
while most librarians ardently maintain the privacy of patron circulation records, she points out that similar usage data on online activities may be collected without the librarians or their patrons being aware. dixon studied the current literature and maintained that libraries need to be closely involved in decisions about the collection and retention of patron usage data, especially when patron authentication and access is controlled by external agencies such as campus or city it departments, because of a tendency for security to prevail over privacy and free inquiry.9 this theme was of major importance to us in preparing the present study as it shows that we are not alone in these concerns.

carter focused on the balance between security and privacy and suggested several possible scenarios for addressing both areas. he emphasized librarian values involving privacy and intellectual freedom, contrasting the librarian’s focus on unrestricted access with the over-arching security concerns of computing professionals. he discussed several computer access policies in use at various institutions and possible approaches. these options include computer authentication (with associated privacy concerns), open access stations visually monitored from staffed desks, or routine purging of user logs at the end of each session. he also suggested librarians lobby state legislatures to have computer usage logs included in laws governing the confidentiality of library records.10

still and kassabian provided a good summary of internet access issues as they affected academic libraries from legal and ethical perspectives.
they suggested that librarians focus on public obligations, free speech and censorship, and potential for illegal activities occurring on library workstations. the issues highlighted in the article have increased in the 15 years since the article was written but it remains the best available overview.11 the arguments put forth in this article proved relevant for us in understanding the multitude of viewpoints regarding authentication even before 9/11.

in the post-9/11 era, essex discussed the usa patriot act and its implications for libraries and patron privacy. some of the 9/11 terrorists were reported to have made use of public library computers in the days before the attack. this has led to heightened concern about patron privacy among librarians. accurate assessment of its impact is difficult due to restrictions placed on libraries in even disclosing that they have been subjected to search.12 while not directly addressing authentication, the article highlights privacy issues surrounding library records of all types.

one of the arguments for not requiring authentication in the public area is the use of academic libraries by unaffiliated users. this is especially true in rural areas where an academic library might be some of the best-funded, comprehensive and accessible resources in a geographical area. even in urban areas, guest access by unaffiliated users is a growing issue for many academic libraries because of limited resources, software licensing problems and public access to campus infrastructure.
while most institutions have traditionally offered basic library services to unaffiliated patrons, the online environment has raised new problems. weber and lawrence provided one of the best studies of these issues. their work surveyed association of research libraries (arl) member libraries to determine the extent of mandatory logins to computer workstations and document how online access was provided to non-affiliated guest users. they concentrated their study questions on federal and canadian depository libraries that must provide some type of access to online government information, with or without authentication. less than half of respondents reported having any written policies governing open access on computers or guest access policies. of the 61 libraries responding to the survey, 32 required that affiliated users authenticate, and of these libraries 23 had a method for authenticating guest users.13 this article, which was published just as this study was testing and evaluating the survey instrument, proved to be very useful as we worked with our questions in qualtrics™ and dealt with the irb requirements.

courtney explored a half-century of changes in access policies for unaffiliated library users. viewing the situation from somewhat early in the shift from print to electronic resources, she foresaw the potential for significantly reduced access to library resources for non-affiliated patrons. these barriers would be created by access policy issues with computing infrastructure and licensing limitations by database vendors. this is especially true if a library’s licenses or policies did not specifically address use by unaffiliated users.
she concluded that decisions about guest access to online library resources should be made by librarians and not be handed over to vendors or campus computing staff.14 our study began as a result of this very issue, i.e., an outside entity (campus it) determining how access to library resources should be controlled, without input by librarians or library staff.

courtney also surveyed 814 academic libraries to assess their policies for access by unaffiliated users. she focused on all library services including building access, reference assistance, and borrowing privileges in addition to online access. many libraries were also cancelling print subscriptions in favor of online access and she questioned the impact this might have on use by unaffiliated users. while suggesting little correlation between decisions to cancel paper subscriptions and requiring authentication of computer workstations, she concluded that reduced access by unaffiliated users would be an unintended consequence of this change.15 this article proved valuable to us in framing our study, as it gave us some idea of what we might expect to find and provided some concepts to use when we formulated our survey.

best-nichols surveyed public use policies in 11 nc tax-supported academic libraries and asked similar questions to our own. this study was dated and didn’t address computer resources, but some of the same issues were addressed.16 public use and authentication policies have the potential to impact one another and how the library responds.
courtney called on librarians to conduct a carefully thought out discussion of user authentication because of the implications for public access and freedom of inquiry. while librarians are traditionally passionate about protecting patron privacy involving print resources, many are unaware of related concerns involving online authentication. she advocated for more education and open debate of the issues because of the potential gravity of leaving decision-making in the hands of database vendors or campus it departments. decisions regarding authentication and privacy impact library services and access, and therefore need to include input from librarians.17 as this study included a summary of the reasons for authentication as provided by surveyed libraries, it also gave us another reference point to use when comparing our results and highlighted the intellectual freedom issues that were often missing or glossed over in other studies.

barsun surveyed the web sites of the 100 association of research libraries member libraries to assess services to unaffiliated users in four areas: building access, circulation policies, interlibrary loan services, and access to online databases. sixty-one member libraries responded to requests for data. she explored the question of whether the policies governing these services would be found on a library’s web site. she perceived a possible disparity between increasing demand for services generated by members of the public who are discovering a library’s resources via online searching and the library’s ability or willingness to serve outside users.
while she did not address computer authentication issues directly, she did find that a significant percentage of academic library web sites were ambiguous about stating the availability of non-authenticated access to databases from onsite computers.18 this ambiguity could possibly be related to vague usage agreements with database vendors that do not clearly state whether non-affiliated users may obtain onsite access to these resources. in “secret shopper” visits done as part of our own research, we saw a disparity between what was stated on a library’s web site and the reality of access offered.

method

it seemed appropriate to start this project with a regional focus. none of the studies available looked at authentication geographically. because colleges and universities within a state are all subjected to the same economic, political and environmental factors, looking at the libraries within one state might help provide some continuity for creating a relevant profile of current practices. north carolina has a substantial number of academic libraries (114) with a wide variety of demographics. historically, the state supports a strong educational system with one of the first public university systems. together with the 17 universities within the university of north carolina system, the state has 59 public community colleges, 36 private colleges and universities, and 3 religious institutions. religious colleges are identified as those whose primary degree is in divinity or theology. (see chart 1.)

chart 1. survey participation by type of academic library.
work had been started to identify the authentication practices of other unc system libraries, so the researchers expanded the data to include the other academic libraries within the state. to create a list of each library’s pertinent information for this investigation, the researchers used the american library directory19, the nc state library’s online directories of libraries20, and visited each library’s web page to create a database. the researchers augmented each library’s data with information including the type of academic library (public, private, unc system and religious), current contact information on personnel who might be able to answer questions on authentication policies and practices in that library, current number of books, institutional enrollment figures, and the name and population of the city or town in which the library was located. the library’s responses to the survey were also tracked in the database, with spss and excel employed in evaluating the collected data.

a western carolina institutional review board (irb) “request for review of human subject research” was submitted and approved using the following statement: “we want to know the authentication situation for all the college libraries in north carolina.” the researchers discovered quickly that the definition of “authentication” would have to be explained to the review board and to many of the responding librarians who filled out the survey.
the research goal was further simplified with the explanation of authentication as “how do patrons identify themselves to get access to a computer in the public area of a library” because many librarians might not realize that what they do is “authentication”.

during the approval phase, there was some question about whether the researchers needed formal approval because much of the information could be collected by just visiting the libraries in person. the researchers saw no risk of potentially disclosing confidential data. however, it was decided that it was better to go through the approval process, since the survey asked the librarians whether they were being required to authenticate by outside entities. there might also be a need to do some follow-up calls and there was a plan to do site visits to the local libraries in order to test the data for accuracy.

the qualtrics™ online survey system was used to create the survey and collect the responses. contact information from the database was uploaded to the survey system with the irb approved introductory letter to each library contact person along with a link to the survey. the introductory letter described the goals of the project and included an invitation to participate as well as refusal language as required by the irb request. the same language was used in the follow up emails and phone calls.
the initial 16 surveys were administered to the unc system libraries in october – december 2010 as a test of the delivery and collection system on qualtrics™, with the rest of the libraries being sent the survey in mid-december 2010.

in the spring of 2011, the researchers followed the initial survey with a second letter and then with phone calls and emails. during the follow up calls, some librarians chose to answer the survey questions with the researcher filling it out over the phone. most filled out the survey themselves. the final surveys were completed in april 2011. because the status of authentication is volatile, this survey data and research represents a snapshot in time of the libraries’ authentication practices between october 2010 and april 2011. the researchers did see changes happening over the course of the surveying process and updated the data with anything collected in follow up contact in order to maintain the most current information about each library for the charts, graphs and presentations made from the data.

in fall 2011, the researchers did a “secret shopper” type expedition to the nearest academic libraries by visiting in person as a guest user. the main purpose of these visits was to check the data, take pictures of the library public areas, get firsthand experience with the variety of authentication practices, and talk to and thank the librarians who participated.

the survey

the survey asked 36 different questions using a variety of pull down lists, check boxes and fill in the blank questions. qualtrics™ allowed the survey to have seven branches, or skip logic, that asked further questions depending upon the answer given.
these branches allowed the survey software to skip particular sections or ask for additional information depending on the answers supplied. some libraries, especially those that didn’t authenticate or didn’t know specific details, might be asked as few as 14 questions while others received all 36. the setup of computers in the public area of libraries can be quite variable, especially if the library differentiates between student-only and guest/public use only workstations. the survey questions were grouped into seven basic areas: descriptive, authentication, student-only pcs, guest/public pcs, wireless access, incident reports, and computer activity logs.

the full survey is included as appendix a.

initial hypothesis

given the experience at the hunter library, we expected the following factors might influence a decision to authenticate. some of these basic assumptions did influence our selection of questions in the seven areas of the survey.
we expected to find:
• when the workstations were under the control of campus it, authentication would usually be required
• when the workstations were under the control of the library, authentication would probably not be required
• that factors such as population, enrollment, and book volume would play a role in decisions to authenticate
• that librarians would not be aware of what user information was being logged whether or not authentication was required
• that a library that had experienced incidents involving the computers in the public area would have authentication
• that authentication increased because of post-9/11 factors and legal interpretations that forced libraries to authenticate

survey questions, responses, and general findings

the data collected from this survey, especially from those libraries that did authenticate, produced over 200 data points for each library. below are those that resulted in answers to questions posed at the outset that particularly looked at overall authentication practices. further articles are planned to look at areas of inquiry with regard to other related practices in the public areas of academic libraries geographically.

there are 114 academic libraries in north carolina. as a result of the follow up emails and phone calls, this research survey got an exceptional 99.1% response rate (113 out of 114). once the appropriate librarians were contacted and understood the scope and purpose of this study, they were very cooperative and willing to fill out the survey.
those who were contacted via phone mentioned that the original email was overlooked or lost. only one library refused to participate in the study.

each library’s demographics were collected in a database by using directory and online information. the data was matched with the survey data provided by the respondents to produce more in-depth analysis and create a profile of each library.

how many libraries in north carolina are authenticating? (chart 2)

the survey asked: “is any type of authentication required or mandated for using any of the pcs in the library’s public area?” 66% (or 75) of libraries answered yes, that they required authentication to use the pcs. (see chart 2.)

chart 2.

are some types of libraries more likely to authenticate? (chart 3)

while each type of library had a different overall total as compared to the other types, chart 3 shows how the percentages of authentication hold for each type. three out of the four types of libraries authenticate more often than not. of the 58 community college libraries, 60% (or 35) require users to authenticate. seventy-eight percent (78%) of the 36 private college libraries authenticate and 11 of the 16 (or 69%) unc system libraries authenticate. only the religious college libraries more often don’t require users to authenticate (1 of the 3, or 33%, authenticate), although this is a very small population in the survey. however, percentagewise, community colleges are more likely to not require users to authenticate than private college libraries (40% vs. 22%), and the unc system libraries, which are public institutions, fall in the middle at 31%.
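the per-type figures above can be re-derived with a few lines of arithmetic. the following is a minimal sketch in python and is not part of the original study; the count of 28 authenticating private college libraries is an assumption inferred from the reported 78% of 36, and all names in the code are illustrative only.

```python
# counts of (authenticating, total responding) libraries by type, as reported
# in the text; the private-college count of 28 is an assumption inferred from
# "78% of the 36 private college libraries".
counts = {
    "community college": (35, 58),
    "private": (28, 36),  # assumption: round(0.78 * 36)
    "unc system": (11, 16),
    "religious": (1, 3),
}


def percent(part, whole):
    """return part/whole as a whole-number percentage."""
    return round(100 * part / whole)


# share of each library type that authenticates
rates = {kind: percent(yes, total) for kind, (yes, total) in counts.items()}

# overall totals across the four types
total_yes = sum(yes for yes, _ in counts.values())
total_all = sum(total for _, total in counts.values())

for kind, rate in rates.items():
    print(f"{kind}: {rate}% authenticate, {100 - rate}% do not")
print(f"overall: {total_yes} of {total_all} ({percent(total_yes, total_all)}%)")
```

under that assumption the four types sum to the 75 authenticating libraries and 113 respondents reported elsewhere in the article, which suggests the per-type and overall figures are mutually consistent.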
chart 3.

how many academic libraries were required to authenticate pcs in their public areas? (chart 4)

of the 75 libraries that required patrons to authenticate, when asked if “they were required to use this authentication”, 59 (52% of all 113 respondents) replied “yes”. putting these data points together shows that 16 (or 14% of all respondents) of the libraries authenticate even though they were not required to do so. some clues about why emerged from the next question and from the follow up phone calls.

chart 4.

why was authentication used?

libraries were asked, “do you know the reasons why authentication is being used?” if they answered “prevent misuse of resources” or “control the public’s use of these pcs” then an additional question was asked, “what led the library to control the use of pcs?” this question had two check boxes (“inability of students to use the resources due to overuse by the public” and “computer abuse”) and a third box to allow free text entry. a library could check more than one box.

of those 75 libraries that authenticated, 60% (or 45) checked “prevent misuse of resources” and 48% (or 36) cited “controlling the public’s use of these pcs” as the reasons for authenticating.

in normalizing the data from the two questions and the free text field, table 1 combines all answers to illustrate the number and percentages of each.

table 1.
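the percentages in this section rest on two different bases, which is easy to miss. the sketch below, in python and not part of the original study, recomputes them from the counts given in the text: the 52% and 14% figures match a base of all 113 respondents, while the reasons for authenticating are reported against the 75 authenticating libraries. variable names are illustrative only.

```python
respondents = 113     # all libraries that answered the survey
authenticating = 75   # libraries that require authentication

required = 59                             # answered "yes" to being required
not_required = authenticating - required  # authenticate without a mandate


def percent(part, whole):
    """return part/whole as a whole-number percentage."""
    return round(100 * part / whole)


# base: all respondents (matches the 52% and 14% quoted in the text)
print(percent(required, respondents), percent(not_required, respondents))

# base: authenticating libraries only (an alternative reading of the same counts)
print(percent(required, authenticating), percent(not_required, authenticating))

# reasons are reported against the 75 authenticating libraries
reasons = {
    "prevent misuse of resources": 45,
    "control the public's use of these pcs": 36,
}
reason_pct = {reason: percent(n, authenticating) for reason, n in reasons.items()}
print(reason_pct)
```

read against the authenticating libraries alone, roughly four in five were required to authenticate, so the choice of base changes how the same counts read.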
in the course of the follow up calls with those libraries that answered the survey over the phone, further insight was provided. one librarian said that their it department told them “authentication was the law and they had to do it”. another answered that they were “on the bus line and so the public used their resources more than they expected and so they had to”.

to give a better understanding of the scope and variety of these answers, here are some examples of the reasons cited in the free text space: “all it's idea to do this”, “best practices”, “caution”, “concerned they would be used for the wrong reasons”, “control”, “we found them misusing computer resources (porn, including child porn)”, “control over college students searching of inappropriate websites, such as porn/explicit sites”, “disruption”, “ease of distributing applications”, “fear of abuse on the part of legal”, “legal issues regarding internet access”, “making students accountable”, “monitor use”, “policy”, “security of campus network”, “security of machines after issues were raised at a conference”, and “time”.

who required that the libraries authenticate? (chart 5)

the survey asked, “what organization or group required or mandated the library to use authentication?” respondents were allowed to choose more than one of the 5 boxes. these choices included “the library itself,” “it or some unit within it,” “college or university administration,” “other” (with a text box to explain), and “not sure”. the results of this question are shown in chart 5.
the survey revealed that the decision was solely the library’s choice 25% of the time (or 28 libraries); 22% of the time the library was mandated or required to authenticate by it or some unit within it (or 25 libraries); and 4% of the time a library’s college or university administration required or mandated authentication (or 4 libraries). collaborative decisions in 14 libraries involved more than one organization. of the 39 libraries that were involved with the authentication decision (28 that made the decision by themselves and 11 that were part of a collaborative decision), 55% (or 16) authenticated even though they were not required to do it.

chart 5.

what type of authentication is used?

authentication in libraries can take many forms. the most common method for those libraries that authenticate was by using centralized or networked systems. almost sixty percent of the libraries used some form of this identified access (tables 2 and 3) with one library using some other independent system. twenty-five percent (or 19) of libraries that authenticate still use some form of paper sign-in sheets and 21% (or 16) use pre-set or temporary logins or guest cards. fifteen percent (or 11) use pc based sign-in or scheduling software and 8% (or 6) use the library system in some form for authentication. a few libraries indicated that they bypass their authentication systems for guests by either having staff log guests in or disabling the system on selected pcs.
we saw this during the “secret shopper” visits as well.

table 2.

do the forms of authentication used in libraries allow for user privacy?

when asked how they handle user privacy in authentication, of the 75 libraries that authenticate, 67% (or 50) use a form of authentication that can identify the user. in other words, most users do not have privacy when using public computers in an academic library because they are required to use some form of centralized or networked authentication. the options in table 3 were presented to the respondents as possible forms of privacy methods. thirty-five percent (or 26) of libraries indicated that they provide some form of privacy for their patrons. anonymous access accounted for 28% (or 21) of the libraries.

table 3.

are librarians aware of the computer logging activity going on in the public area? (table 4)

all 113 respondents were asked two questions about the computer logging activities of their libraries: “do you know what computer activity logs are kept” and “do you know how long computer activity logs are kept”. the second question was only asked if “unsure” was not checked. besides “unsure”, responses on the survey included “authentication logs (who logged in)”, “browsing history (kept on pc after reboot)”, “browsing history (kept in centralized log files)”, “scheduling logs (manual or software)”, “software use logs” and “other”. the respondents could select more than one answer. however, over half (52%) of the respondents were unsure if the library kept any computer logs at all.
authentication logs of who logged in were the most common, but those were kept in only 25% of the total libraries surveyed. a high percentage of libraries kept some kind of logs but most respondents were unsure how long those records were kept. of the various types of logs, respondents that use scheduling software were the most familiar with the length of time software logs were kept. in one case, a respondent mentioned that the manual sign-in sheets were never thrown out and that they had retained them for years.

table 4. log retention.

are past incidents factors in authenticating?

only three libraries reported breaches of privacy and all those libraries reported using authentication.

of the 75 libraries that do authenticate (chart 6, 3 bars on the right), 36 reported that they did have improper use of the pcs while 29 reported that they did not and 10 did not know. of the 38 libraries that do not authenticate (chart 6, 3 bars on the left), 23 reported that they had no improper use of the pcs while 13 stated that they did and 2 did not know. the overall known reports of improper use in the survey are higher when the library does authenticate and lower when the library doesn’t authenticate.
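because the two groups differ in size (75 vs. 38 libraries), the comparison is easier to read as rates than as raw counts. the following python sketch, which is not part of the original study, converts the counts reported above into percentages, both over all libraries in each group and over only those that knew whether improper use had occurred. names are illustrative only.

```python
# improper-use reports by authentication status, as given in the text
reports = {
    "authenticate": {"yes": 36, "no": 29, "unknown": 10},
    "do not authenticate": {"yes": 13, "no": 23, "unknown": 2},
}


def improper_use_rate(group, known_only=False):
    """percentage of libraries in a group reporting improper use.

    with known_only=True, libraries that did not know are left out of the base.
    """
    base = group["yes"] + group["no"]
    if not known_only:
        base += group["unknown"]
    return round(100 * group["yes"] / base)


# rate over every library in the group
rate_all = {name: improper_use_rate(g) for name, g in reports.items()}
# rate over libraries that gave a definite yes or no
rate_known = {name: improper_use_rate(g, known_only=True) for name, g in reports.items()}

for name in reports:
    print(f"{name}: {rate_all[name]}% of all, {rate_known[name]}% of those that knew")
```

on either base the authenticating libraries report improper use more often, which matches the direction stated in the text; the figures say nothing about whether authentication followed or preceded the incidents.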
what kind and for how long computer logs are kept (all 113 libraries)

computer activity logs                              number   % of total libraries   don't know how long data is kept (unsure)
unsure                                              59       52%                    100%
authentication logs (who logged in)                 28       25%                    60%
none                                                21       19%                    --
browsing history (kept in centralized log files)    14       12%                    86%
scheduling logs (manual or software)                10       9%                     70%
browsing history (kept on pc after reboot)          7        6%                     57%
software use logs                                   6        5%                     33%
library system                                      4        4%                     75%
other                                               2        2%                     --

chart 6.

when did libraries begin authenticating in their public areas?

of the 75 libraries that authenticate, only one implemented this more than ten years prior to the survey. 51 (or 67%) of the responding libraries began authenticating between 3 and 10 years ago. 10 libraries implemented authentication in the year before the survey. this is consistent with the growth of security concerns in the post 9/11 decade. (chart 7)

chart 7.

discussion

since the introduction of computer technology to libraries, library staff and patrons have used different levels of authentication depending upon the application. while remote access to commercial services such as oclc cataloging subsystems or vendor databases has always used some form of authorization, usually username and password, it has never been necessary or desirable for public access to the library’s catalog system to have any kind of authorization requirements.
most of the collections within an academic library have traditionally been housed in open access stacks where anyone can freely access material on the shelves. printed indexes and other tools that provide in-depth access to these collections have traditionally been open as well. today, most libraries still make their library catalog and even some bibliographic discovery tools open access and available over the web. this practice naturally extended to computer technology and other electronic reference tools until libraries began connecting them to the campus and public networks.

the principle of free and open access to the materials and resources of the library, within the library walls, has been a fundamental characteristic of most public and academic libraries. there is an ethical commitment of librarians to a user’s privacy and confidentiality that has deep roots based in the first and fourth amendments of the us constitution, state laws, and the code of ethics of the ala. article ii of the ala code states “we protect each library user's right to privacy and confidentiality with respect to information sought or received and resources consulted, borrowed, acquired or transmitted.” traditionally, library staff do not identify patrons who walk through the door; they don’t ask for identification when answering questions at the reference desk nor do they identify patrons reading a book or magazine in the public areas of a library.
schneider has emphasized that librarians have always valued user privacy and have been instrumental in the passing of many states’ library privacy laws.23 usually, it is only when materials are checked out to a patron that a user’s affiliation or authorization even gets questioned directly. frequently patrons can make use of materials within the library building with no record of what was accessed.

we are seeing these traditional principles of open access to materials change as they transition to electronic formats. it is becoming more common for patrons to have to authenticate before they can use what was once openly available. the data collected from this survey confirms this trend with 66% of the libraries using some form of authentication in their public area.

the widespread use of personally identifiable information is making it more difficult for librarians to protect the privacy and confidentiality of library users. although the writing was on the wall before 9/11 that some choices would have to be made with regard to privacy, no easy answer to the problem had yet been identified. librarians themselves are often uncertain about what information is collected and stored as evidenced by our data (chart 6). as more information becomes available only electronically, and because computers in the public areas are now used for much more than just accessing library catalog functions, it is becoming difficult to uphold the code of ethics and protect the privacy of users.
using authentication can also make it more difficult to use technology in the library. in order to authenticate, users may be required to start or restart a computer and/or log into or out of the computer. this can take time to do as well as require the user to remember to log off the computer when finished. users often have difficulty keeping track of their user information and may require increased assistance (table 5).

table 5.

library staff or scheduling software can be required to help library guests obtain access to computer equipment. north carolina, like other states, does have laws governing the confidentiality of library records. librarians have long dealt with this situation by keeping as little data as possible. for example, many library circulation systems do not store data beyond the current checkout. access logs that detail what resources a particular user has accessed would seem to fall under this legislation, although the wording in the law is vague.

information technology departments, legal counsel, and administrators, on the other hand, are often less concerned about privacy and intellectual freedom issues. more often their focus is on security, limiting access to those users affiliated with the institution, and monitoring use. being ready and able to provide data in response to subpoenas and court orders is often a priority. at western carolina university, illicit use of an unauthenticated computer in the student center led to an investigation by campus and county law enforcement.
this case is still used as justification for needing to authenticate and monitor campus computer use even though the incident occurred many years ago. being able to track an individual’s online activity is believed to increase security by ensuring adherence to institutional policies. authentication with individually assigned login credentials permits online activity to be traced to a specific account, whose owner can then be held accountable for the activity performed. librarians’ responses to the survey indicate that these issues play a role in a library’s decision to authenticate, as seen in the free-text responses in table 6.
tracking use through ip address, individual login, and transaction logs allows users to be scrutinized in case of illegal or illicit use of computer resources. in many cases, this action is justified as being required by auditors or law enforcement agencies, though information regarding this is scarce. the authors of this article are not aware of any laws or auditing requirements in north carolina that require detailed tracking of library computer use.
some libraries indicated that it departments were concerned about security of networks and/or computers. security can be undermined when generic accounts are used or when no authentication is required. by using individual logins, users can be restricted to specific network resources and can be monitored. when multiple computers use the same account for logging in or when the login credentials are posted on each computer, security can be compromised because use cannot be tracked to a specific user.
in some libraries, these security issues have trumped librarians’ concerns about intellectual freedom and privacy.
creating a profile as a result of these findings
given the number of characteristics collected about each library, it was assumed that some of the factors gathered might influence a decision to authenticate and make it possible to create a profile for prediction. the data was collected from libraries within a fixed geographic region. the externally collected data and the survey data were coded and put into spss™, and a number of statistical tests were performed to find which factors might be statistically significant. to further the geographical analysis, the data was also put into arcview™ to produce a map of north carolina with different colored pins for the academic libraries that authenticated and those that did not, to see if there was any pattern to the choice (map 1).
to more completely explore the possible role that geographic information might play in the decision to authenticate, the population of the city or town in which the institution was located, enrollment, book volume, number of pcs, and total number of library it staff (scaled variables), as well as ordinal variables such as “who controlled the setup of the pcs,” “do you differentiate between student and public pcs,” and “known incidents of privacy and misuse,” were also integrated into the analysis. using logistic regression techniques, the data collected could not predict whether an academic library would authenticate, although those that differentiate between student and public pcs did have a higher probability.
based on all our collected data and mapping, it is impossible to predict with any significance whether or not an academic library would authenticate.
so the short answer statistically is no. using all of the data collected, a statistically significant profile could not be created; however, the data did suggest some general tendencies.
map 1.
for those libraries that do authenticate, the average book volume is almost 400,000, the enrollment around 5,600, the population of the city where the institution is located is 94,000, the average number of pcs in the public area is 54, and the average number of library it staff is 1.8.
for those libraries that do not authenticate, the average book volume is about 163,000, the enrollment around 3,000, the city population is 53,000, the average number of pcs in the public area is about 39, and the average number of library it staff is 0.8.
libraries that authenticate tend to have statistically significant differences in book volume and the number of pcs in the public area, which have a t-test value of p<1. student enrollment was the most statistically significant factor in those that authenticated, with a t-test value of p<0.5. libraries that authenticate had many more students, more books, and a larger number of pcs in their public areas than libraries that didn’t authenticate.
those libraries that didn’t authenticate tended to be in smaller towns, more often had the pcs in their public areas set up by non-library it staff, and had fewer library it staff.
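the group comparison described above can be illustrated with a short sketch. the article does not say which form of the t-test was run in spss, so the python code below assumes welch’s unequal-variance version, and it computes only the t statistic and the approximate degrees of freedom; a p-value would still have to be read from a t distribution. the function name welch_t and the enrollment figures are hypothetical and invented for illustration; they are not the survey data.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for two independent samples."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each group (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t_stat = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    dof = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t_stat, dof


# Hypothetical enrollment figures, invented for illustration only (not survey data).
authenticating = [5200, 7400, 3100, 9800, 2500]
non_authenticating = [2900, 3300, 1800, 4100, 2700]

t_stat, dof = welch_t(authenticating, non_authenticating)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

a positive t statistic here means the first group has the larger mean. whether that difference counts as significant depends on the chosen threshold and on the degrees of freedom, which is why the sketch reports both values.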
sixty percent (60%) of the libraries that don’t authenticate had zero library it staff.
while it was assumed at the outset of this research that the campus department responsible for the setup of the workstations in the public area (the library or it) would be a factor in whether authentication was used in the library, the data does not support this assumption statistically.
ethical questions about authentication as a result of these findings
there are a variety of reasons why a library might choose to authenticate despite the ethical issues associated with it. the protection and management of it resources or the mission of the institution are two likely scenarios. a library, especially one with lots of use by unaffiliated users or guests, might choose to authenticate regardless of concerns in order to make sure its own users have preference for the pcs in the public area of the library. a private institution may choose to authenticate in order to limit access by members of the general public. of the 75 libraries that authenticate, 81% cited concerns about controlling use, overuse, and misuse. this study also found that in 25% of the total academic libraries, the library itself decided to authenticate without influence from external groups. this was a higher percentage than was expected. given librarians’ professional concerns about intellectual freedom and privacy, we were very surprised that so many libraries chose to authenticate on their own.
we suspected that many librarians might not have a full understanding of the privacy issues created when requiring individual logins.
based on this assumption, we expected that many of the librarians would not be fully aware of what user tracking data was being kept. examples include network authentication, tracking cookies, web browser history, and user sign-in sheets. the study found that librarians are often unsure of what data is being logged, with 51 (or 45%) of 113 libraries reporting this. only 19% reported knowing with certainty that no tracking data was kept. of those that did know that tracking data was being kept, most had no idea how long this data was retained.
conclusion
this study found that 66% (or 75) of the 113 surveyed north carolina academic libraries required some form of user authentication on their public computers. the researchers reviewed an extensive amount of data to identify the factors involved with this decision. these factors included individual demographics, such as city population, book volume, type of academic library, and enrollment. it was anticipated that by looking at a large pool of academic libraries within a specific region, a profile might emerge that would predict which libraries would choose to authenticate. even with comprehensive data about the 75 libraries that authenticated, a profile of a “typical” authenticated library could not be developed. the data did show two factors of any statistical significance (enrollment and book volume) in determining a library’s decision to authenticate. however, the decision to authenticate could not be predicted. each library’s decision to authenticate seems to be based on the unique situation of that library.
we expected to find that most libraries would authenticate due to pressure from external sources, such as campus it departments or administrators, or in response to incidents involving the computers in the public area. this study found that only 39% (or 44) of the libraries surveyed authenticated due to these factors, so our assumption was incorrect. surprisingly, we found that 25% (or 28) of the libraries chose to authenticate on their own. the need to control the use of their limited resources seemed to take precedence over any other factors, including user privacy. we did expect to see a rise in the number of libraries that authenticated in the aftermath of 9/11. this we found to be true. looking at the prior research that defines an actual percentage of authentication in academic libraries, no matter how limited in scope (for example, just the arl libraries, responding libraries, etc.), there does seem to be a strong trend for academic libraries to authenticate.
our results, with 66% (75) of the surveyed academic libraries having authentication, support the conclusion that there is a continued trend of authentication that has steadily expanded over the past decade. this has happened in spite of librarians’ traditional philosophy on access and academic freedom. libraries are seemingly relinquishing their ethical stance or have other priorities that make authentication an attractive solution to controlling use of limited or licensed resources. our survey results show that many librarians may not fully understand the privacy risks inherent in authentication.
slightly over half (52%) of the libraries reported that they did not know if any computer or network log files were being kept or for how long they were kept.
the issues surrounding academic freedom, access to information, and privacy in the face of security concerns continue to affect library users. academic libraries in smaller communities are often the only nearby source of scholarly materials. traditionally these resources have been made available to community members, high school students, and others who require materials beyond the scope of the resources of the public or school library. as pointed out, restrictive authentication policies may hamper the ability of these groups to access the information they need. however, the data showed very little consistency to support this idea with respect to authentication in small towns and communities throughout the state.
some of the surveyed academic libraries made a strong statement that they are not authenticating on their public area computers and have every intention of continuing this practice. these libraries are now in a distinct minority, and we expect their position will continually be challenged. for example, at western carolina university, we continue to employ open computers in the public areas of the library but are regularly pressed by our campus it department to implement authentication. we have so far been successful in resisting this pressure because of the commitment of our dean and librarians to preserving the privacy of our patrons.
further studies
as a follow-up to this study, we plan to contact the 35 libraries that did not authenticate to determine if they now require authentication or have plans to do so.
based on responses to this survey, we expect that many librarians are unaware of the degree to which authentication can undermine patron privacy. we suggest an in-depth study be conducted to determine the degree of understanding among librarians about potential privacy issues with authentication in the context of their longstanding professional position on academic freedom and patron confidentiality.
appendix a. survey questions
1. select the library you represent:
2. which library or library building are you reporting on?
• main library or the only library on campus
• medical library
• special library
• other
3. how many total pcs do you have in your library public area for the building you are reporting on?
4. how many library it or library systems staff does the library have?
5. does the library’s it/systems staff control the setup of these pcs in the library public area?
• yes
• shared with it (campus computing center)
• it (campus computing center)
• no (please specify who does control the setup of these pcs)
authentication
6. is any type of authentication required or mandated to use any of the pcs in the library’s public area?
7. were you required to use this authentication on any of the pcs in the library’s public area?
8. what organization or group required or mandated the library to use authentication on pcs in the library public area?
• the library itself
• it or some unit within it
• other (please explain)
• not sure
• college/university administration
9. do you know the reasons authentication is being used?
• mandated by parent institution or group
• prevent misuse of resources
• other (please specify)
• control the public’s use of these pcs
10. what led the library to control the use of pcs?
• inability of students to use the resource due to overuse by the public
• computer abuse
• other (please specify)
11. how are the users informed about the authentication policy?
• screen saver
• web page
• login or sign on screen
• training session or other presentation
• other (please specify)
12. what form of authentication do you use?
• manual paper sign-in sheets
• individual pc based sign-in or scheduling software
• centralized or networked authentication such as active directory, novell, or erp (enterprise resource planning) system with a college/university wide identifier
• pre-set or temporary authorization logins or guest cards handed out (please specify the length of time this is good for)
• other (please specify)
13. how does the library handle user privacy of authentication?
• anonymous access (each session is anonymous with repeat users not identified)
• identified access
• pseudonymous access with demographic identification (characteristics of users determined but not actually identified)
• pseudonymous access (repeat users identified but not the identity of a particular user)
14. when did you implement authentication of the pcs in the library public area?
• this year
• last year
• 3-5 years ago
• 5-10 years ago
• don’t know
student only pcs
15. do you differentiate between student only pcs and guest/public use pcs in the library public area?
17. how many pcs are designated as student only pcs in the library’s public area?
18. do you require authentication to access student only pcs in the library’s public area?
19. what does authentication provide on a student only pc once an affiliated person logs in?
• access to specialized software
• access to storage space
• printing
• internet access
• other (please specify)
20. once done with an authenticated session on a student only pc, how is authentication on a pc removed?
• user is required to log out
• user is timed out
• other (please specify)
21. what authentication issues have you seen in your library with student only pcs?
• id management issues from the user (e.g., forgetting passwords)
• id management issues from the network (e.g., updating changes in a timely fashion)
• timing out issues
• authentication system becomes unavailable
• other (please specify)
guest/public pcs
22. how many pcs are designated for guest or public use in the library’s public area?
23. describe the location of these guest/public use pcs.
• line-of-sight to library service desk
• all in one general area
• scattered throughout the library
• other (please specify)
• in several groups around the library
24. do you require authentication to access guest/public use pcs in the library’s public area?
25. what does authentication allow for guests or the public that log in?
• limited software
• control, limit or block web sites that can be accessed
• limited or different charge for printing
• timed or scheduled access
• internet access
• other (please specify)
• control, limit or block access to library resources (such as databases or other subscription based services)
26. are there different types of pcs in your library area? check those that apply.
• all pcs are the same
• some have different types of software (like browser only)
• some have time or scheduling limitations
• some have printing limitations
• some have specialized equipment attached (like scanners, microfiche readers, etc.)
• some control, limit or block web sites that can be accessed
• some control, limit or block access to library resources (such as databases or other subscription based services)
• other (please specify)
wireless access
27. do you have wireless access in your library public area?
28. do you require authentication to your wireless access in the library public area?
29. does the library have its own wireless policies different from the campus’s policy?
30. what methods are used to give guests or the public access to your wireless access? check those that apply.
• no access to guest or general public
• paperwork and/or signature required before access given
• limited access by time
• open access
• limited access by resource (such as internet access only)
• other
incident reports
31. has your library had any known incidents of breach of privacy that you know about?
32. has your library had any incidents of improper use of public pcs (such as cyber stalking, child pornography, terrorism, etc.)?
33. have these incidents required investigation or digital forensics work to be done?
34. who handled the work of investigation?
• library it or library systems staff
• it or campus computing center
• campus police
• other law enforcement
• unsure
• other (please specify)
computer activity logs
35. do you know what computer activity logs are kept? (if unsure, end, if not ask)
• authentication logs (who logged in)
• browsing history (kept on pc after reboot)
• browsing history (kept in centralized log files)
• scheduling logs (manual or software)
• software use logs
• none
• unsure
• other (please specify)
36. do you know how long computer activity logs are kept?
• 24 hours or less
• week
• month
• year
• unknown
references
1. pam dixon, “ethical risks and best practices,” journal of library administration 47, no. 3/4 (may 2008): 157.
2. scott carlson, “to use that library computer, please identify yourself,” chronicle of higher education, june 25, 2004, a39.
3. lori driscoll, library public access workstation authentication, spec kit 277 (washington, d.c.: association of research libraries, 2003).
4. martin cook and mark shelton, managing public computing, spec kit 302 (washington, d.c.: association of research libraries, 2007).
5. diana oblinger, “it security and academic values,” in computer and network security in higher education, ed. mark luker and rodney petersen (jossey-bass, 2003): 1-13.
6. code of ethics of the american library association, http://www.ala.org/advocacy/proethics/codeofethics/codeethics
7. fair information practices adopted by the organization for economic cooperation and development, http://www.oecd.org/sti/security-privacy
8. “niso best practices for designing web services in the library context,” niso rp-2006-01 (bethesda, md: national information standards organization, 2006).
9. dixon, “ethical issues implicit in library authentication and access management.”
10. howard carter, “misuse of library public access computers: balancing privacy, accountability, and security,” journal of library administration 36, no. 4 (april 2002): 29-48.
11. julie still and vibiana kassabian, “the mole’s dilemma: ethical aspects of public internet access in academic libraries,” internet reference services quarterly 4, no. 3 (january 1, 1999): 7-22.
12. don essex, “opposing the usa patriot act: the best alternative for american librarians,” public libraries 43, no. 6 (november 2004): 331-340.
13. lynne weber and peg lawrence, “authentication and access: accommodating public users in an academic world,” information technology & libraries 29, no. 3 (september 2010): 128-140.
14. nancy courtney, “barbarians at the gates: a half-century of unaffiliated users in academic libraries,” journal of academic librarianship 27, no. 6 (november 2001): 473.
15. nancy courtney, “unaffiliated users’ access to academic libraries: a survey,” the journal of academic librarianship 29, no. 1 (2003): 3-7.
16. barbara best-nichols, “community use of tax-supported academic libraries in north carolina: is unlimited access a right?” north carolina libraries 51 (fall 1993): 120-125.
17. nancy courtney, “authentication and library public access computers: a call for discussion,” college & research libraries news 65, no. 5 (may 2004): 269-277.
18. rita barsun, “library web pages and policies toward ‘outsiders’: is the information there?” public services quarterly 1, no. 4 (october 2003): 11-27.
19. american library directory: a classified list of libraries in the united states and canada, with personnel and statistical data, 62nd ed. (new york: information today, 2009).
20. http://statelibrary.ncdcr.gov/ld/aboutlibraries/nclibrarydirectory2011.pdf.
21. karen schneider, “so they won’t hate the wait: time control for workstations,” american libraries 29, no. 11 (1998): 64.
22. code of ethics of the american library association.
23. karen schneider, “privacy: the next challenge,” american libraries 30, no. 7 (1999): 98.
articles
from digital library to open datasets: embracing a “collections as data” framework
rachel wittmann, anna neatrour, rebekah cummings, and jeremy myntti
information technology and libraries | december 2019 49
rachel wittmann (rachel.wittmann@utah.edu) is digital curation librarian, university of utah. anna neatrour (anna.neatrour@utah.edu) is digital initiatives librarian, university of utah. rebekah cummings (rebekah.cummings@utah.edu) is digital matters librarian, university of utah. jeremy myntti (jeremy.myntti@utah.edu) is head of digital library services, university of utah.
abstract
this article discusses the burgeoning “collections as data” movement within the fields of digital libraries and digital humanities. faculty at the university of utah’s marriott library are developing a collections as data strategy by leveraging existing digital library and digital matters programs. by selecting various digital collections, small- and large-scale approaches to developing open datasets are explored. five case studies chronicling this strategy are reviewed, along with testing the datasets using various digital humanities methods, such as text mining, topic modeling, and gis (geographic information system).
introduction
for decades, academic research libraries have systematically digitized and managed online collections for the purpose of making cultural heritage objects available to a broader audience. making archival content discoverable and accessible online has been revolutionary for the democratization of scholarship, but the use of digitized collections has largely mimicked traditional use: researchers clicking through text, images, maps, or historical documents one at a time in search of deeper understanding.
“collections as data” is a growing movement to extend the research value of digital collections beyond traditional use and to give researchers more flexible access to our collections by facilitating access to the underlying data, thereby enabling digital humanities research.1 collections as data is predicated upon the convergence of two scholarly trends happening in parallel over the past several decades.2 first, as mentioned above, librarians and archivists have digitized a significant portion of their special collections, giving access to unique material that researchers previously had to travel across the country or globe to study. at the same time, an increasing number of humanist scholars have approached their research in new ways, employing computational methods such as text mining, topic modeling, gis (geographic information system), sentiment analysis, network graphs, data visualization, and virtual/augmented reality in their quest for meaning and understanding. gaining access to high-quality data is a key challenge of digital humanities work, since the objects of study in the humanities are frequently not as amenable to computational methods as data in the sciences and social sciences.3 typically, data in the sciences and social sciences is numerical in nature and collected in spreadsheets and databases with the intention that it will be computationally parsed, ideally as part of a reproducible and objective study. 
conversely, data (or, more commonly, “evidence” or “research assets”) in the humanities is text- or image-based and is created and collected with the intention of close reading or analysis by a researcher who brings their subjective expertise to bear on the object.4 even a relatively simple digital humanities method like identifying word frequency in a corpus of literature is predicated on access to plain text (.txt) files, high-quality optical character recognition (ocr), and the ability to bulk download the files without running afoul of copyright or technical barriers.
from digital library to open datasets | wittmann, neatrour, cummings, and myntti 50 https://doi.org/10.6017/ital.v38i4.11101
as “the santa barbara statement on collections as data” articulates, “with notable exceptions like the hathitrust research center, the national library of the netherlands data services & api’s, the library of congress’ chronicling america, and the british library, cultural heritage institutions have rarely built digital collections or designed access with the aim to support computational use.”5 by and large, digital humanists have not been well-served by library platforms or protocols. current methods for accessing collections data include contacting the library for direct access to the data or “scraping” data off library websites. recently funded efforts such as the institute of museum and library services’ (imls’s) always already computational and the andrew w. mellon foundation’s collections as data: part to whole seek to address this problem by setting standards and best practices for turning digital collections into datasets amenable to computational use and novel research methods.6 the university of utah j.
willard marriott library has a long-running digital library program and a burgeoning digital scholarship center, creating a moment of synergy for librarians in digital collections and digital scholarship to explore collaboration in teaching, outreach, and digital collection development. a shared goal between the digital library and digital scholarship teams is to develop collections as data of regional interest that could be used by researchers for visualization and computational exploration. this article will share our local approach to developing and piloting a collections as data strategy at our institution. relying upon best practices and principles from thomas padilla’s “on a collections as data imperative,” we transformed five library collections into datasets, made select data available through a public github repository, and tested the usability of the data with our own research questions, relying upon expertise and infrastructure from digital matters and the digital library at the marriott library.7
digital matters
in 2015, administration at the marriott library was approached by multiple colleges at the university of utah to explore the possibility of creating a collaborative space to enable digital scholarship. while digital scholarship was happening across campus in disparate and unfocused ways, there was no concerted effort to share resources, build community, or develop a multi-college digital scholarship center with a mission and identity. after an eighteen-month planning process, the digital matters pop-up space was launched as a four-college partnership between the college of humanities, college of fine arts, college of architecture + planning, and the marriott library. an anonymous $1 million donation in 2017 allowed the partner colleges to fund staffing and activity in the space for five years, including the hire of a digital matters director tasked with planning for long-term sustainability.
the development of digital matters brings new focus, infrastructure, and partners for digital humanities research to the university of utah and the marriott library. monthly workshops, speakers, and reading groups led by digital scholars from all four partner colleges have created a vibrant community with cross-disciplinary partnerships and unexpected synergies. close partnerships and ongoing dialogue have increased awareness among marriott faculty, particularly those working in and collaborating with digital matters, of the challenges facing digital humanists and the ways in which the library community is uniquely suited to meet those needs. for example, a university of utah researcher in the college of humanities developed “century of black mormons,” a community-based public history database of biographical information and primary source documents on black mormons baptized between 1830 and 1930.8 working closely with the digital initiatives librarian and various staff and faculty at the marriott library, they created an omeka s site that allows users to interact with the historical data using gis, timeline features, and basic webpage functionality.

information technology and libraries | december 2019

institution digital library

the university of utah has had a robust digital library program since 2000, including one of the first digital newspaper repositories, utah digital newspapers (udn, https://digitalnewspapers.org/). in 2016, the library developed its own digital asset management system using open-source systems such as solr, phalcon, and nginx after using contentdm for over fifteen years.9 this new system, solphal, has made it possible for us to implement a custom solution to manage and display a vast amount of digital content, not only for our library, but also for many partner institutions throughout the state of utah.
our main digital library server (https://collections.lib.utah.edu/) contains over 765,000 objects in nearly 700 collections, consisting of over 2.5 million files. solphal is also used to manage the udn, containing nearly 4 million newspaper pages and over 20 million articles. digital library projects are continually evolving as we redefine our digital collection development policies, ensuring that we are providing researchers and other users the digital content that they are seeking. with such a large amount of data available in the digital library, we can no longer view our digital library as a set of unique, yet siloed, collections, but more as a wealth of information documenting the history of the university, the state of utah, and the american west. we are also engaged in remediating legacy metadata across the repository in order to achieve greater standardization, which could support computational usage of digital library metadata in the future. with this in mind, we are working to strategically make new digital content available on a large scale that can help researchers discover this historical content within a collections as data mindset.

leveraging the existing digital library and digital matters programs, faculty at the marriott library are in the process of piloting a collections as data strategy. we selected digital collections with varying characteristics and used them to explore small- and large-scale approaches to developing datasets for humanities researchers. we then tested the datasets by employing various digital humanities methods such as text mining, topic modeling, and gis. the five case studies below chronicle our efforts to embrace a collections as data framework and extend the research value of our digital collections.

text mining mining texts

when developing the initial collections as data project, several factors were considered to identify the optimal material for this experiment.
selecting already digitized and described material in the university of utah digital library was ideal to avoid waiting periods required for new digitization projects. the marriott library special collections’ relationship with the american west center, an organization based at the university of utah with the mission of documenting the history of the american west, has produced an extensive collection of oral histories held in the audio visual archive which have typewritten transcripts yielding high-quality ocr. given the availability and readiness of this material, we built a selected corpus of mining-related oral histories, drawn from collections such as the uranium oral histories and carbon county oral histories. engaging in the entire process with a digital humanities framework, we scraped our own digital library repository as though we had no special access to the back end of the system, developing a greater understanding of the process and workflows needed to build a text corpus to support a research inquiry. in this way, we extended our skills so that we would be able to scrape any digital library system if this expertise was needed in the future. the extensive amount of text produced by the corpus of 230 combined oral histories provided ideal material for topic modeling. simply put, “topic modeling is an automatic way to examine the contents of a corpus of documents.”10 the output of these models is word clouds with varying sizes of words based on the number of co-occurrences within the corpus; larger words indicate more occurrences and smaller ones indicate fewer. each topic model then points to the most relevant documents within the corpus based on the co-occurrences of the words contained in that model. 
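the idea of topics built from word co-occurrence, each pointing back to its most relevant documents, can be sketched in a few lines of python. this is only an illustration under stated assumptions: it is not the mallet workflow used for the project, the four toy "interviews" are invented, and the function names (fit_lda, top_words, top_documents) and parameter values are hypothetical. it uses a small collapsed gibbs sampler for lda, which is practical only at toy scale.

```python
import random
from collections import Counter


def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA over tokenized documents."""
    rng = random.Random(seed)
    vocab_size = len({word for doc in docs for word in doc})
    doc_topic = [[0] * num_topics for _ in docs]         # tokens per topic, per document
    topic_word = [Counter() for _ in range(num_topics)]  # word counts per topic
    topic_total = [0] * num_topics                       # total tokens per topic
    # Start from a random topic assignment for every token.
    assignments = [[rng.randrange(num_topics) for _ in doc] for doc in docs]
    for d, doc in enumerate(docs):
        for word, k in zip(doc, assignments[d]):
            doc_topic[d][k] += 1
            topic_word[k][word] += 1
            topic_total[k] += 1
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                # Remove the token, then resample its topic from the
                # conditional distribution given all other assignments.
                old = assignments[d][i]
                doc_topic[d][old] -= 1
                topic_word[old][word] -= 1
                topic_total[old] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][word] + beta)
                    / (topic_total[t] + beta * vocab_size)
                    for t in range(num_topics)
                ]
                new = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = new
                doc_topic[d][new] += 1
                topic_word[new][word] += 1
                topic_total[new] += 1
    return doc_topic, topic_word


def top_words(topic_word, k, n=5):
    """Most frequent words currently assigned to topic k."""
    return [w for w, count in topic_word[k].most_common() if count > 0][:n]


def top_documents(doc_topic, k, n=5):
    """Indexes of the documents with the largest share of tokens in topic k."""
    def share(d):
        return doc_topic[d][k] / max(sum(doc_topic[d]), 1)
    return sorted(range(len(doc_topic)), key=share, reverse=True)[:n]


# Invented stand-ins for transcript text; real input would be the OCR corpus.
docs = [
    "coal mine carbon county miner shift".split(),
    "carbon county coal miner union strike".split(),
    "wine grapes cellar family wine press".split(),
    "family vineyard grapes wine harvest".split(),
]
doc_topic, topic_word = fit_lda(docs, num_topics=2, iterations=100)
for k in range(2):
    print(k, top_words(topic_word, k), top_documents(doc_topic, k, n=2))
```

the word counts per topic are what a word cloud would size its words by, and top_documents plays the role of each topic pointing to its most related interviews.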
in order to create these topic models from the corpus of oral histories, a workflow was developed with the expertise of the digital matters cohort, using mallet from an r script and the latent dirichlet allocation (lda) topic model developed by blei et al.11

figure 1. topic model from text mining the mining-related oral histories found in the university of utah’s digital library.

from the mining-related oral history corpus, twenty-six topic models were created. once generated, each topic model points to five interviews that are most related to the words in a particular model. in figure 1, the words carbon, county, country, and italian are the largest, because the interviews are about carbon county, utah. considering this geographical area of utah was the most ethnically diverse in the late 1800s due to the coal mining industry recruiting labor from abroad, including italy, these words are not surprising. as indicated by their prominence in the topic model, this set of words co-occurs most often in the interview set. we approached the process of topic modeling the oral histories as an exploration, but after reviewing the results, we discovered that many of the words which surfaced through this process pointed to deficiencies in the original descriptive metadata, highlighting new possibilities for access points and metadata remediation. homing in on the midsize words tended to uncover unique material that is not covered in descriptive metadata, as these words are often mentioned more than a handful of times and across multiple interviews. the largest words in the model are typically thematic to the interview and included in the descriptive metadata.
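a rough way to picture this check is to compare a topic's mid-weight words against the words already present in a record's descriptive metadata. the sketch below is a hypothetical illustration, not part of the project's workflow: the weights, the weight thresholds, the metadata string, and the function name missing_from_metadata are all invented for the example.

```python
import re


def missing_from_metadata(topic_terms, metadata_text, min_weight, max_weight):
    """Return mid-weight topic terms that never appear in the metadata text."""
    metadata_words = set(re.findall(r"[a-z]+", metadata_text.lower()))
    return sorted(
        term
        for term, weight in topic_terms.items()
        if min_weight <= weight <= max_weight and term not in metadata_words
    )


# Invented weights: large values stand for the biggest words in a word cloud.
topic_terms = {
    "carbon": 120,
    "county": 110,
    "italian": 95,
    "coal": 40,
    "wine": 18,
    "vineyard": 9,
}
# Invented stand-in for the subject and description fields of one record.
metadata_text = "Carbon County oral history; coal mines and mining; Italian Americans"

candidates = missing_from_metadata(topic_terms, metadata_text, min_weight=5, max_weight=50)
print(candidates)
```

terms returned by such a check are only candidates; a cataloger would still need to read the passages before adding an access point.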
for example, investigating the inclusion of “wine” in the topic model found in figure 1 revealed conversations about the winemaking process amongst the italian mining community in carbon county, utah. in a 1973 interview from the carbon county oral history project, mary nicolovo juliana discusses how her father, a miner, made wine at home.12 as the topic models are based on co-occurrences in the corpus, the model also pointed to another 1973 interview from the carbon county oral history project, with emile louise cances. cances, from a french immigrant mining family, discusses the vineyards her family had in france.13 for both of these oral histories, there was no reference to wine in the descriptive metadata. a researcher may miss this content because it isn’t included as an access point in metadata. thus, topic modeling allowed for the discoverability of potentially valuable topics that may be buried in hundreds of pages of content. as a result of this collections as data project, text mining the mining oral history texts to produce topic models, we are considering employing topic modeling when creating new descriptive metadata for similar collections. setting a precedent, the text files for this project are hosted on the growing marriott library collections as data github repository. after we developed this corpus, we discovered that a graduate student in the history department had developed a similar project, demonstrating the research value of oral histories combined with computational analysis.14

harold stanley sanders matchbooks collection

the harold stanley sanders matchbooks collection is an assortment of matchbooks that reflect many bygone establishments, predominantly from salt lake city, including restaurants, bars, hotels, and other businesses. when we assessed potential descriptive metadata for the collection, non-dublin core metadata proved essential for computational purposes.
with the digital project workflow now extending beyond publishing the collection in the digital library to publishing the collection data to the marriott library collections as data github repository, our assessment of metadata needs has evolved. as matchbooks function as small advertisements, they often incorporate a mix of graphic design, advertising slogans, and addresses of the establishment. the descriptive metadata was created first with the most relevant fields for computational analysis, including business name, type of business, transcription of text, notable graphics, colors of matchbooks, and street addresses. for collection mapping capabilities, street addresses were then geocoded using a google sheets add-on called geocode cells, which uses google’s geocoding api (see figure 2). this proved efficient for this collection, as other geocoding services required zip codes for street addresses, which were not present on the matchbooks. with latitude and longitude added to the metadata, the collection was then mapped using arcgis online (see figure 3).15 the extensive metadata, including geographic-coordinate data, is available on the library’s github repository for public use.

figure 2. a screenshot of google sheets add-on, geocode cells (https://chrome.google.com/webstore/detail/geocode-cells/pkocmaboheckpkcbnnlghnfccjjikmfc).

figure 3. a screenshot of harold stanley sanders matchbook collection map, made with arcgis online.

after the more computationally ready metadata was created, it was then massaged to fit library best practices and dublin core (dc) standards.
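as a minimal sketch of how one computationally oriented matchbook record might feed both outputs, the python below builds a simplified dublin core record and a geojson point from the same row. everything specific here is an assumption made for illustration: the field names, the example business, its coordinates, and the small subject lookup table are invented and do not reproduce the library's actual template or subject assignments.

```python
import json

# Hypothetical lookup from business type to a subject heading; real headings
# would be assigned through cataloger review, not a three-entry table.
SUBJECTS_BY_BUSINESS_TYPE = {
    "restaurant": "Restaurants",
    "hotel": "Hotels",
    "bar": "Bars (Drinking establishments)",
}


def to_dublin_core(row):
    """Collapse the granular fields into a few Dublin Core style elements."""
    description = "; ".join(part for part in (row["graphics"], row["slogan"]) if part)
    return {
        "title": row["business_name"],
        "subject": SUBJECTS_BY_BUSINESS_TYPE.get(row["business_type"], ""),
        "description": description,
    }


def to_geojson_feature(row):
    """Build a GeoJSON point; GeoJSON orders coordinates as [longitude, latitude]."""
    return {
        "type": "Feature",
        "geometry": {
            "type": "Point",
            "coordinates": [row["longitude"], row["latitude"]],
        },
        "properties": {
            "name": row["business_name"],
            "address": row["street_address"],
        },
    }


# Invented example row; none of these values come from the collection.
row = {
    "business_name": "Example Cafe",
    "business_type": "restaurant",
    "slogan": "good food",
    "graphics": "coffee cup",
    "street_address": "100 Example St",
    "latitude": 40.76,
    "longitude": -111.89,
}

dc_record = to_dublin_core(row)
feature = to_geojson_feature(row)
feature_collection = json.dumps({"type": "FeatureCollection", "features": [feature]})
print(dc_record)
print(feature_collection)
```

keeping the granular row as the source and deriving the dublin core view from it means the richer fields remain available for mapping and analysis.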
fitting the metadata to dublin core included deriving library of congress subject headings for dc subjects from business type and concatenating notable matchbook graphics and slogans for the dc description. while providing the extensive metadata is beneficial for computational experimentation, it adds time and labor to the lifespan of the project.

kennecott copper miner records

one aspect of our collections as data work at the university of utah moving forward is the need for long-term planning for resources that contain interesting information that could eventually be used for computational exploration, even if we don’t currently have the capacity to make the envisioned dataset available. the marriott library holds a variety of personnel records from the kennecott copper corporation, utah copper division. these handwritten index cards contain a variety of interesting demographic data about the workers who were employed by the company from 1900-19, such as name, employee id, date employed, address, dependents, age, weight, height, eyes, hair, gender, nationality, engaged by, last employer, education, occupation, department, pay rate, date leaving employment, and reason for leaving. not all the cards are filled out with the complete level of detail listed in the fields above; however, name, date employed, ethnicity, and notes about pay rates for each employee are usually included. developing a scanning and digitization procedure for creating digital surrogates of almost 40,000 employment records was fairly easy due to an existing partnership and reciprocal agreement with familysearch; however, developing a structure for making the digitized records available and providing full transcription is a long-term project. librarians used this project as an opportunity to think strategically about the limits of dublin core when developing a collections as data project from the start.
the digital library repository at the university of utah provides the ability to export collection level metadata as .tsv files. with this in mind, the collection metadata template was created with the aim of eventually being able to provide researchers with the granular information on the records. this required introducing a number of new, non-standard field labels to our repository. since we are not able to anticipate exactly how a researcher might interact with this collection in the future, our main priority was developing a metadata template that would accommodate full transcription for every data point on the card. twenty new fields in the template reflect the demographic data on the card, and ten are existing fields that map to our standard practices with dublin core fields. because we do not currently have the staffing in place to transcribe 40,000 records, we are implementing a phased approach of transcribing four basic fields, with fuller transcription to follow if we are able to secure additional funding.

figure 4. employment card for alli ebrahim, 1916.

figure 5. employment card for richard almond, 1917.

woman’s exponent

a stated goal for digital matters is to be a digital humanities space that is unique to utah and addresses issues of local significance such as public lands, water rights, air quality, indigenous peoples, and mormon history.16 when considering what digital scholarship projects to pursue in 2019, digital matters faculty became aware of the upcoming 150th anniversary of women in utah being the first to vote in the nation.
working with a local nonprofit, better days 2020, and colleagues at brigham young university (byu), digital matters faculty and staff decided to embark on a multimodal analysis of the 6,800-page run of the woman’s exponent, a utah women’s newspaper published between 1872 and 1914 primarily under the leadership of latter-day saint relief society president emmeline b. wells. in its time, the woman’s exponent was a passionate voice for women’s suffrage, education, and plural marriage, and chronicled the interests and daily lives of latter-day saint women. initially, we hoped to access the data through the brigham young university harold b. lee library, which digitized the exponent back in 2000. we quickly learned that ocr from nearly twenty years ago would not suffice for digital humanities research and considered different paths for rescanning the exponent. after accessing the original microfilm from byu, we leveraged existing structures for digitization. through an agreement that the marriott library has in place with a vendor for completing large-scale digitization of newspapers on microfilm for inclusion in the utah digital newspapers program, we were able to add the woman’s exponent to the existing project without securing a new contract for digitization. the vendor digitized the microfilm, created an index of each title, issue, date, and page, and extracted the full text through an ocr process. they then delivered 330 gb of data to us, including high-quality tiff and jpeg 2000 images, a pdf file for each page, and mets-alto xml files containing the metadata and ocr text.

acquiring data for the woman’s exponent project illuminated the challenges that digital humanists face when looking for clean data. our original assumption was that if something had already been scanned and put online, the data must exist somewhere.
we soon learned, when working with legacy digital scans, that the ocr might be insufficient or the original high-quality scans might be lost over the course of multiple system migrations. as librarians with existing structures in place for digitization, we had the content rescanned and delivered within a month. our digital humanities partners from outside of the library did not know this option was available and assumed our research team would have to scan 6,800 pages of newspaper content before we were able to start analyzing the data. this incongruity highlighted cultural differences between digital humanists with their learned self-reliance and librarians who are more comfortable and conversant looking to outside resources. indeed, our digital humanities colleagues seemed to believe that “doing it yourself” was part and parcel of digital humanities work. the woman’s exponent project is still in early phases, but now that we have secured the data, we are considering what digital humanities methods we can bring to bear on the corpus. with the 2020 150th anniversary of women’s suffrage in utah, we have considered a topic modeling project looking at themes around universal voting, slavery, and polygamy and tracking how the discussion around those topics evolved over the 42-year run of the paper. another potential project is building a social network graph of the women and men chronicled throughout the run of the paper. developing curriculum around women in utah history is of particular interest to the group as women are underrepresented in the current k-12 utah history curriculum. keeping in line with our commitment to collections as data, we have released the woman’s exponent as a .tsv file with ocr full-text data, which can be analyzed by researchers studying utah, mormon studies, the american west, or various other topics. 
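a first pass at tracking how often a term appears over the run of the paper could look like the sketch below, which reads a tab-separated file and reports occurrences per 1,000 words for each year. this is an illustration under assumptions, not a description of the released file: the column names "date" and "text", the date format, and the three sample rows are invented, and the function name term_rate_by_year is hypothetical.

```python
import csv
import io
import re
from collections import defaultdict


def term_rate_by_year(tsv_file, term):
    """Return {year: occurrences of term per 1,000 words} from a date/text TSV."""
    hits = defaultdict(int)
    words = defaultdict(int)
    # QUOTE_NONE keeps stray quotation marks in OCR text from confusing the parser.
    reader = csv.DictReader(tsv_file, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        year = row["date"][:4]  # assumes dates start with a four-digit year
        tokens = re.findall(r"[a-z']+", row["text"].lower())
        hits[year] += tokens.count(term)
        words[year] += len(tokens)
    return {year: 1000 * hits[year] / words[year] for year in sorted(words) if words[year]}


# Invented sample rows standing in for page-level OCR text.
sample = io.StringIO(
    "date\ttext\n"
    "1872-06-01\twomen of utah may vote and the vote matters\n"
    "1872-06-15\tnews of the relief society\n"
    "1913-01-01\tsuffrage and the vote\n"
)
rates = term_rate_by_year(sample, "vote")
for year, rate in rates.items():
    print(year, round(rate, 1))
```

normalizing by the number of words per year keeps years with more surviving pages from dominating the comparison; with a real file, the same function could be called with open(path, newline="", encoding="utf-8").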
collaborators have also developed a digital exhibit on the woman’s exponent which includes essays about a variety of topics as well as sections showcasing its potential for digital scholarship.17

obituary data

the utah digital newspapers (udn) program began in 2002 with the goal of making historical newspaper content from the state of utah freely available to the public for research purposes. between 2002 and 2019, over 4 million newspaper pages were digitized for udn. due to search limitations of the software system used for udn at the time, the data model for newspapers was made more granular, and included segmentation for articles, obituaries, advertisements, birth notices, etc. this article segmentation project ended in 2016 when it was determined that the high cost of segmentation was not sustainable with mass digitization of newspapers and that users were still able to find the content they were looking for on a full newspaper page. before the article segmentation project concluded, udn had accrued over 20 million articles, including 318,044 articles that were tagged as obituaries or death notices. in 2013, the marriott library partnered with familysearch to index the genealogical information that can be gleaned from these obituaries. the familysearch indexing (fsi) program crowdsourced the indexing of this data to thousands of volunteers worldwide. certain pieces of data, such as place names, were mapped to an existing controlled vocabulary and dates were entered in a standardized format to ensure that certain pieces of the data are machine actionable.18 after the obituaries were indexed by fsi in 2014, a copy of the data was given to the marriott library to use in udn. the indexed data included fields such as name of deceased, date of death, place of death, date of birth, birthplace, and relative names with relationships.
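indexed fields like these lend themselves to a relational layout, with one table for the deceased and one for named relatives. the sketch below is a hypothetical illustration only: the table and column names are invented, the sample people are fictional, and sqlite from python's standard library stands in for whatever database a library might actually choose.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE obituary (
        id INTEGER PRIMARY KEY,
        deceased_name TEXT,
        death_date TEXT,      -- standardized dates sort correctly as ISO strings
        death_place TEXT,
        birth_date TEXT,
        birth_place TEXT
    );
    CREATE TABLE relative (
        obituary_id INTEGER REFERENCES obituary(id),
        name TEXT,
        relationship TEXT
    );
    """
)

# Fictional sample rows, not taken from the indexed data.
conn.execute(
    "INSERT INTO obituary VALUES (?, ?, ?, ?, ?, ?)",
    (1, "Jane Example", "1915-03-02", "Price, Utah", "1850-01-01", "Example Town"),
)
conn.executemany(
    "INSERT INTO relative VALUES (?, ?, ?)",
    [(1, "John Example", "son"), (1, "Mary Example", "daughter")],
)

# Find every named relative of people who died in a given place.
rows = conn.execute(
    """
    SELECT o.deceased_name, r.name, r.relationship
    FROM obituary AS o
    JOIN relative AS r ON r.obituary_id = o.id
    WHERE o.death_place = ?
    ORDER BY r.name
    """,
    ("Price, Utah",),
).fetchall()
print(rows)
```

splitting relatives into their own table is what makes relationship queries possible; a single wide spreadsheet row per obituary cannot express a variable number of relatives cleanly.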
since this massive amount of data didn't easily fit within the udn metadata schema, it was stored for several years without the marriott library doing anything with the data. now that we are thinking about our digital collections as data, we are exploring ways that researchers could use this vast amount of data. the data was delivered to the library in large spreadsheets that are not easily usable in any spreadsheet software. we are exploring ingesting the data into a revised newspaper metadata schema within our digital asset management system or converting the data into a mysql database so it is possible to search and find relationships between pieces of data. working with a large dataset such as this can be challenging. the data from only two newspapers, including 1,038 obituaries, is a 25 mb file. the full database is over 10 gb of data. since this is a large amount of data, we are working through issues related to how we can distribute this data in a usable way in order for researchers to make use of the data. we are also looking at the possibility of having fsi index additional obituary data from udn, which will make the database continually expand.

conclusion

as the digital library community recognizes the need for computational-ready collections, the university of utah digital library has embraced this evolution with a strategic investment. implementing the collections as data github repository for computational users is a first step towards providing access to collections beyond the traditional digital library environment. while there may be improved ways to access this digital library data in the future, the github repository filled an immediate need. developing standardized metadata for computational use can often require more time from metadata librarians who are already busy with the regular work of describing new assets for the digital library.
developing additional workflows for metadata enhancement and bulk download can delay the process of making new collections available. in most cases, collections need to be evaluated individually to determine what type of resources can be invested in making them available for computational use. for a project needing additional transcription, like the kennecott mining records, crowdsourcing might seem like a potential avenue to pursue. however, the digital library collection managers have misgivings about the training and quality assurance involved in developing a new large-scale transcription project. combined with the desire to ensure that the people who are working on the project have adequate training and compensation for their labor, these misgivings led us to make the strategic decision to transcribe some of the initial access points to the collection now, and attempt full transcription at a later date pending additional funding. for the udn obituary data, leveraging an existing transcription program at no cost, with minimal supervision needed by librarians, worked well for surfacing additional genealogical data that can be released for researchers.

the collections as data challenge mirrors a perennial digital library conundrum: how much time and effort should librarians invest for unknown future users with unknown future needs? much like digitization and metadata creation, creating collections as data requires a level of educated guesswork as to what collections digital humanists will want to access, what metadata fields they will be interested in manipulating, and in what formats they will need their data. considering the limited resources of librarians, should we convert our digital collections into data in anticipation of use or convert our collections on demand? this “just in case” vs. “just in time” question is worthy of debate and will naturally be dependent on the resources and priorities of individual institutions.
with an increasing number of researchers experimenting with digital humanities methods, collections as data will be a standard consideration when working with new digitization projects at the university of utah. visualization possibilities outside of the digital-library environment will be regularly assessed. descriptive metadata practices beyond dublin core will be developed when beneficial to the computational and experimental use of the data by the public. integrating techniques like topic modeling into descriptive metadata workflows provides additional insight about the digital objects being described. while adding collections as data to existing digitization workflows will require an additional investment of time, developing these projects has also created new opportunities for collaboration both within the library and in developing expanded partnerships at the university of utah and other institutions in the mountain west. by leveraging our existing partnerships, we were able to create collections as data pilots organically by taking advantage of our current workflows and digitization procedures. while we have been successful in releasing smaller-scale collections as data projects, we still need to consider integration issues with our larger digital library program and experiment more with enabling access to large datasets. librarians engaged in producing curated datasets that evolve from unique special collection materials can extend the research value of the digital library and the collections that are unique to each institution. as we look towards the future, we see this work continuing and expanding as librarians engage more with digital humanities teaching and support.

acknowledgements

the authors would like to acknowledge dr.
elizabeth callaway, former digital matters postdoctoral fellow and current assistant professor in the department of english at the university of utah, for developing the topic modeling workflow used in the collections as data project, text mining mining texts. callaway’s expertise was invaluable in creating the scripts to enable distance reading of the text corpus, documenting this process, and training library staff.

references

1 thomas g. padilla, “collections as data: implications for enclosure,” college & research libraries news 79, no. 6 (june 2018): 296, https://crln.acrl.org/index.php/crlnews/article/view/17003/18751.

2 thomas padilla et al., “the santa barbara statement on collections as data (v1),” n.d., https://collectionsasdata.github.io/statementv1/.

3 christine l. borgman, “data scholarship in the humanities,” in big data, little data, no data: scholarship in the networked world (cambridge, ma: the mit press, 2015), 161–201.

4 miriam posner, “humanities data: a necessary contradiction,” miriam posner’s blog (blog), june 25, 2015, http://miriamposner.com/blog/humanities-data-a-necessary-contradiction/.

5 thomas padilla et al., “the santa barbara statement on collections as data (v1),” n.d., https://collectionsasdata.github.io/statementv1/.

6 thomas padilla, “always already computational,” always already computational: collections as data, 2018, https://collectionsasdata.github.io/; thomas padilla, “part to whole,” collections as data: part to whole, 2019, https://collectionsasdata.github.io/part2whole/.

7 “marriott library collections as data github repository,” april 16, 2019, https://github.com/marriott-library/collections-as-data.

8 “century of black mormons,” accessed april 25, 2019, http://centuryofblackmormons.org.

9 anna neatrour et al., “a clean sweep: the tools and processes of a successful metadata migration,” journal of web librarianship 11, no. 3-4 (october 2, 2017): 194-208, 111, https://doi.org/10.1080/19322909.2017.1360167.
10 anna l. neatrour, elizabeth callaway, and rebekah cummings, “kindles, card catalogs, and the future of libraries: a collaborative digital humanities project,” digital library perspectives 34, no. 3 (july 2018): 162–87, https://doi.org/10.1108/dlp-02-2018-0004.

11 david m. blei et al., “latent dirichlet allocation,” journal of machine learning research 3, no. 4/5 (may 15, 2003): 993–1022, http://search.ebscohost.com/login.aspx?direct=true&db=asn&an=12323372&site=ehost-live.

12 “mary nicolovo juliana, carbon county, utah, carbon county oral history project, no. 47, march 30 1973,” carbon county oral histories, accessed april 29, 2019, https://collections.lib.utah.edu/details?id=783960.

13 “mrs. emile louise cances, salt lake city, utah, carbon county oral history project, no. cc-25, february 24, 1973,” carbon county oral histories, accessed april 29, 2019, https://collections.lib.utah.edu/details?id=783899.

14 nate housley, “a distance reading of immigration in carbon county,” utah division of state history blog, 2019, https://history.utah.gov/a-distance-reading-of-immigration-in-carbon-county/.
15 “harold stanley sanders matchbooks collection,” accessed may 8, 2019, https://collections.lib.utah.edu/search?facet_setname_s=uum_hssm; “harold stanley sanders matchbooks collection map,” accessed may 8, 2019, https://mlibgisservices.maps.arcgis.com/apps/webappviewer/index.html?id=d16a5bc93b864fc0b9530af8e48c6c6f.

16 rebekah cummings, david roh, and elizabeth callaway, “organic and locally sourced: growing a digital humanities lab with an eye towards sustainability,” digital humanities quarterly, 2019.

17 “woman’s exponent data,” https://github.com/marriott-library/collections-as-data/tree/master/womansexponent; “woman’s exponent digital exhibit,” https://exhibits.lib.utah.edu/s/womanexponent/.

18 john herbert et al., “getting the crowd into obituaries: how a unique partnership combined the world’s largest obituary with the utah’s largest historic newspaper database,” in salt lake city, ut: international federation of library associations and institutions, 2014, https://www.ifla.org/files/assets/newspapers/slc/2014_ifla_slc_herbert_mynti_alexander_witkowski_-_getting_the_crowd_into_obituaries.pdf.
prospector: a multivendor, multitype, and multistate western union catalog
bush, carmel; garrison, william a; machovec, george; reed, helen i
information technology and libraries; jun 2000; 19, 2; proquest pg. 71

the prospector project represents a unique union catalog. the origin, goals, and design of the union catalog that uses the inn-reach system are presented.
challenges of the union catalog include the integration of records from libraries that do not use the innovative interfaces system and the development of best practices for participating libraries.

the prospector project is a union catalog of sixteen libraries in colorado and wyoming built around the inn-reach software from innovative interfaces, inc. (iii).1 in 1997, the colorado alliance of research libraries (the colorado alliance) and the university of northern colorado submitted a joint grant proposal to create a regional union catalog for many of the major academic and public libraries in the region. the project would allow users to view library holdings and circulation information with a single query of the central database. the union catalog also would allow patrons to request items from any of the participating libraries and have them delivered to a nearby local library. unlike many of the other union catalogs in the country, however, prospector has several distinguishing elements:

• it is multistate (colorado and wyoming).
• it is multisystem (incorporating systems from innovative interfaces and carl corporation; plans call for voyager from endeavor).
• it is multi-library-type (academic, public, and special libraries).

regional union catalogs representing the cataloged collections of libraries that are related by geography, subject, or library type have been extant for many years.
early leaders in the field spearheaded locally developed systems such as the university of california's melvyl system and the illinois library computer systems organization's (ilcso) illinet online system, which became operational in 1980.2 the commercial integrated library system market began to emerge in the late 1980s and the 1990s with such vendors as innovative interfaces and its work with ohiolink through its inn-reach union catalog product, and the carl system.3 many major vendors now have union catalog solutions for a single physical union catalog, although most have the requirement that participating libraries all use the same integrated library system. an alternative approach that is also becoming popular, because of the heterogeneous nature of the ils marketplace and the widespread implementation of z39.50, is for libraries to create virtual union catalogs through broadcast searching. this solution is available from many ils vendors as well as through organizations such as oclc and its webz software.

there is not a single "right" answer for whether regional catalog searching and document delivery is best accomplished through a physical or virtual union catalog. each solution has benefits and drawbacks that must be balanced against the mix of vendors, economics, politics, and technical issues within a state. prospector is somewhat unusual in that it does create a single physical union catalog but allows for the incorporation of other library systems, made possible through a published specification from innovative interfaces.

prospector history, funding, and project goals

colorado has a long history of resource sharing through a variety of programs, including use of the colorado library card statewide borrower's card and access to individual libraries' online catalogs through the access colorado library information network (aclin) and other regional catalogs.
the colorado alliance has taken a leadership role within the state in promoting cooperation among major academic and public libraries in the areas of automation, joint acquisitions, and other cooperative endeavors. existing online catalog software enabled patrons to easily search individual online catalogs, but searching several catalogs was a tedious task requiring many steps. it has long been a goal of the alliance to have a true union catalog of holdings for all member libraries. to forward this goal, in 1997 the colorado alliance of research libraries and the university of northern colorado jointly applied for and received a grant from the colorado technology grant and revolving loan program to establish the colorado unified catalog, a unified catalog of holdings for sixteen of the major academic, public, and special libraries in colorado.4 the university of wyoming was included in the project through separate funding. the grant of $640,000 was used to develop a union catalog that would support searching and patron borrowing from a single database. the colorado alliance and the university of northern colorado contributed an additional $189,500 of in-kind services to the unified catalog project.

carmel bush (cbush@manta.library.colostate.edu) is assistant dean for technical services at the colorado state university libraries, fort collins; william a. garrison (garrisow@spot.colorado.edu) is head of cataloging at the university of colorado at boulder (colo.) libraries; george machovec (gmachove@coalliance.org) is the associate director of the colorado alliance of research libraries, denver; and helen i. reed (hreed@unco.edu) is associate dean, university of northern colorado libraries, greeley.
additionally, the colorado alliance contributed $119,000 of in-kind funds to support purchase of distributed system software. the colorado unified catalog project, later named prospector, was based upon the inn-reach software developed by innovative interfaces, inc. it included all innovative interfaces sites in colorado as of december 1996 as well as the carl system sites that were members of the nonprofit colorado alliance of research libraries.5

the colorado unified catalog project had two major goals:

• the development of a global catalog database containing the library holdings of the largest public and academic libraries in the region; and
• the development of an automated borrowing system so that users at any of the participating libraries could easily request materials electronically from any other participating libraries.6

the union catalog would allow users to view library holdings and circulation information on titles with a single query of the global database. once titles were located, patrons could request available items and have them delivered to their home library.

the grant proposal identified four major goals and outcomes of the project: access, equity, connections, and content and training. by creating a global catalog, the colorado unified catalog project would provide students, faculty, staff, and patrons free and open access to the union catalog via the internet. patrons from all participating libraries would have equal access to the combined holdings of all sixteen participating libraries, thus greatly enhancing resources available to patrons without the necessity of travel across the state. connectivity was greatly enhanced by the installation of high-speed internet access in the colorado alliance office where the union catalog server was housed. the unified catalog project amassed, in one place, the complete cataloged collections of the major libraries in the region, creating a single, easy-to-use public interface.
training for the catalog would be conducted in each library so that it could be integrated into the standard training and reference services of each participating library.7

addressing statewide goals for libraries, the colorado unified catalog was designed to dovetail with an existing project in colorado called the access colorado library and information network (aclin) in several ways. the goal of aclin was to provide statewide searching of several hundred library catalogs in colorado through broadcast z39.50 searching. however, because of the large number of online library catalogs (too many z39.50 targets cause broadcast searching to be slow) and poor network infrastructure in some parts of the state, the creation of physical union catalogs, such as prospector, greatly enhanced the ability for a project such as aclin to be successful. as stated in the grant proposal, it will:

• make aclin more efficient since sixteen libraries will be grouped together and can be accessed via a single search, thus saving aclin users steps in searching;
• enhance aclin's document delivery plans since patrons can make requests themselves;
• offer both web and character interfaces for various levels of users;
• provide access via aclin's dial-in ports as well as via the internet; and
• support aclin's future developments based on a z39.50 environment.8

work on the development of the colorado unified catalog began in mid-1997. even while contract negotiations were underway in mid- to late 1997, groups were busy undertaking discussions on the design and structure of the unified catalog. work on development of profiling and system specifications continued through july 1998. this data was entered onto the server at the colorado alliance office and a test database was created in august 1998. testing was completed in november 1998 and the first records were loaded in december 1998.
the creation of the database for the first twelve libraries took seven months. during the database load the catalog was available for searching, although most participating libraries did not highlight the system in their local opacs. innovative interfaces, inc. conducted training on the actual patron requesting and circulation functions at three sites over the period from may through july 1999. as of january 2000 the catalog included more than 3.6 million unique bibliographic records of the twelve largest libraries in colorado (more than 6.6 million marc records have been contributed, which has resulted in 3.6 million unique records after de-duplication). with the database in place and opac and circulation training complete, prospector went "live" for patron-initiated requests in the first eight libraries on july 23, 1999. as of december 31, 1999, all twelve innovative sites were "live" in prospector.

the final programming for loading the records from carl-system sites will be completed in spring 2000. it is anticipated that carl-system library records will be loaded in late spring 2000 and will bring the database to more than five million unique marc records, with more than ten million item records. since the receipt of the grant, two participating libraries have selected endeavor as their online integrated system. contract negotiations are underway between innovative interfaces and the colorado alliance to come to an agreement on loading records for the endeavor libraries into prospector.

politics and marketing of prospector

planning and policy making are inherently political processes in which participants choose among goals and options in order to make decisions and to direct actions. for prospector the diverse makeup of multitype libraries and multisystems augured different perspectives on implementation from the outset.
nearly every department in member libraries would be affected by the project. to be successful in carrying out their charges, the task forces appointed to implement prospector had to address how these staff could influence the process and how local practices would be affected. the challenge was to engage staff in the process since the task force structure precluded representation from every member library. meeting this challenge would be vital to ensuring input and fostering buy-in and advocacy for prospector in member institutions. consequently, in addition to reviewing standards or best practices and focusing on the goals stipulated in the grant, obtaining factual knowledge about member practices and resources and encouraging communications served as key ingredients in planning and policy development.

general process

profiling prospector, a main charge for the cataloging/reference task force, illustrates the general process employed in planning and how key ingredients were applied to gain input and produce results. the first step involved the task force's review of the grant's aims for the unified catalog. with that framework as a basis, a planning process was outlined and shared with participants. the prospector web site detailed the specification development process, including the schedule and opportunities for input. next the task force surveyed participants for information on their systems: bibliographic indexing rules, types of indexes, characters indexed in phrase indexes, indexes on which authority control is performed, and suppliers of authority records. using this data, the task force identified the commonalities and differences to determine what to create in the unified catalog. members also consulted innovative interfaces and reviewed what previous inn-reach customers had established.
draft recommendations for indexing, indexes, record overlay, and record display specifications were then posted on the prospector web site and participants were requested to review and provide input. a notice in datalink: the alliance newsletter (www.coalliance.org/datalink) also referenced the site. at the same time, testing was performed using draft specifications in order to assess them and to check for other concerns that testing might reveal. because of the importance of the recommendations, an open forum was held to receive additional comments. following the forum, the task force members made final adjustments to the specifications. after the period for public comment ended, the specifications were submitted as recommendations to the prospector steering committee for approval. once approved, the specifications became official and were referenced in all site visits.

issues

because of the design of inn-reach, participants must make decisions about contribution of records, priorities for what record would serve as the master record, order of loading, indexing, indexes, and displays for the unified catalog. circulation functions require decisions about services for patron types, circulation statuses, loan periods, numbers of loans, renewals, recalls, checkouts, holds, overdues, fines, notices, pick-up locations, and billing.

in the case of prospector, expectations regarding what would be controversial met with a few surprises. for example, the master record, the bibliographic record from one participating library under which holdings of other libraries are displayed, is based upon encoding level and the library priority list. the latter determines whether an incoming record should replace an existing record of the same level; a record with a higher level will replace a lower one. based upon the data collected from libraries, a proposal categorized libraries into the following order: large, special, and "all others."
the order was further factored by a member library's application of authority control and participation in program for cooperative cataloging programs. the proposal drew minimal comment from libraries. pride of ownership was not an obstacle. everyone was committed to the fullest authorized form of the record.

how many loans an individual could request was the subject of early debate. there were concerns about discrepancies between local limits for borrowing and the possible setting of a higher number of loans on prospector. a corollary concern was that a high number might result in depleting a member library's collection. previous experience with borrowing by a subset of members shed light on the issue; there were no problems with loan limits. in fact, inn-reach supports "load leveling" across participating libraries randomly as well as by precedence tables, thus avoiding systematic checkout from one library only. members decided that they could always pass a request on to another owning library if necessary and monitor loans to determine if any abuses would develop. with these options, it then became possible to establish a forty checkout limit for individual patrons in prospector.

differences in cataloging practices engendered more discussion because of the potential for a policy that might affect local practice. in the course of comparing practices of institutions, the cataloging/reference task force identified multiple records for the same serial titles that reflected differences in forms of entry and multiple versions treated either in separate records or on the same record. there was wide variety in statements of holdings. these differences warranted gathering further information on holdings; multiple versions, especially those involving electronic versions; and successive/latest entry for cataloging.
the task force decided to hold a focus group on serials and invited staff in member libraries from serials, cataloging, and reference to attend. in the meantime, visits to participating libraries were instituted, the first of the roadshows, to discuss serials practices, their implications for overlays and displays, and options for handling them.

the focus group attracted a large attendance and proved useful in gathering information about practices and the concerns of participating libraries regarding serials. most libraries reported individual practices for recording holdings. although participants expressed a desire for consistency, attendees also shared that resources are not available to retroactively change them. instead attendees encouraged development of a best practice recommendation that would follow the niso standards for those libraries wishing to change practices. with the exception of electronic versions of serials, focus group participants had no problem with multiple formats in the same bibliographic record as long as it was clear to users. electronic versions prompted a lot of questions about what to do with 856 links to restricted access resources and about changes in software. it was clear that this issue would need further investigation by the task force.

the hottest area, successive or latest entry cataloging of serials, registered strong preferences by proponents. attendees did not welcome changing practice in either direction. instead there were questions asked about possible system changes and about the conduct of use studies to determine what problems might arise from latest entry records in the system. with the information gained from the focus group meeting, the task force assigned priority to the areas and pursued latest/successive entry as the top priority. already the task force had consulted innovative interfaces, inc.
and received a negative reply to possible changes to matching algorithms, loading programs, and record values that could deal with practices of participants because of the software structure. it was technically impossible for a latest entry and successive entry record to load separately given their match on the oclc number. the predominant use of successive entry and its status as the current national standard persuaded the task force initially to recommend coding latest entry in a special way so that the record for such an entry would not be the master record in the system unless it was unique. this interim measure led to the policy recommendation that successive entry serve as the standard for prospector. as a part of the recommendation, members are asked not to undertake retroactive conversion/recataloging projects to change existing latest entry records. the serials policy was debated right up to the meeting of the prospector board of directors. the approval by the board illustrates that controversial issues may require that leadership commit their libraries to policies.

marketing

marketing incorporates an overall strategy of identifying patrons' needs, designing products to meet those needs, implementing the products, and promoting and evaluating them. the twin goals of prospector are: (1) one-stop shopping and expanded access regardless of location, and (2) an automated borrowing system to facilitate fast delivery of materials that addressed problems experienced by patrons in searching and obtaining materials. the grant proposal outlined a plan for member libraries to meet these goals through inn-reach software and the cooperative efforts of participating members. with the implementation of the unified catalog and patron-initiated borrowing, the next pieces of the strategy, promotion and evaluation, come into play.

member libraries

commitment to a cooperative venture takes time and energy.
the support for prospector at the library director and dean level had to be translated to staff in member libraries whose efforts would be necessary to support the unified catalog and patron-initiated loans. staff members had to become acquainted with how prospector would benefit patrons and their work. hence internal promotion was a necessary component throughout planning and policy development and with implementation to users.

because of the numbers of staff in member libraries, no one method would assure awareness of developments for prospector. the approach involved the alliance's newsletter (datalink), a prospector web site, electronic discussion lists, e-mail, correspondence, phone calls, documentation, training sessions, and many site visits. the site visits facilitated interaction across institutional lines and were important for discussing critical issues at the local level. in arranging for site visits, it was important to clarify what the staff members wanted to discuss. a general update on prospector might be followed by other technical sessions such as preparing the library's database for load into the prospector system.

participants' questions emphasized the importance of sharing the plan for developing prospector and the basic concepts guiding the implementation planning and policy process as listed below. these concepts bore repeating because a staff member could have been hearing about prospector for the first time.

• decisions and directions are guided by data and input gathered from participants, standards/best practices, system capabilities, and the aims for prospector described in the grant.
• relatively few local practices are affected by participating in prospector.
• inclusiveness in record contributions would build prospector into a rich resource for users; however, participating libraries can exert control over contributions.
• global policies are developed for prospector only; local sites define their own local policies.
• assistance is available to participating libraries in coming up with solutions for special circumstances.
• prospector is not reinventing the wheel. although the multitype library and multisystem involvement would produce a new model of inn-reach, other inn-reach sites could serve as models.
• think globally but act locally. more than a catchphrase, this statement acknowledges the reality of individual library circumstances and the balancing of prospector goals to maximize access and use of resources by patrons.

patrons

the design of the pac, a promotional brochure, and individual library public relations efforts all served to promote prospector's availability to users. prospector provides access via telnet and the web. the impetus, however, was to examine member webpacs and create a prospector webpac that exemplified the best in menu design including caption descriptions, navigational aids, and consistency in display of elements among search screens. special attention was paid to providing example searches that would have appeal for the diversity of patrons served by the membership.

after mulling over several name possibilities, the alliance staff suggested the name prospector for the unified catalog, connoting the rich mining history of the rocky mountain area. this identity found its depiction in a classic picture of a gold miner supplied by the colorado historical society. representing the user, the miner is at the center, panning for gold, an apt image for users exploring the richness of resources from the unified catalog. the incorporation of the image as the logo on the web site and the catalog was followed by its adoption for the entire cooperative venture. name recognition spread quickly.

to facilitate promotion at member libraries, the alliance staff designed a brochure.
the design features a brief description of the unified catalog, a list of members, and information for patrons on how to connect, what's available on prospector, how to use the self-service borrowing, and how to view their circulation record. many libraries have web-mounted guides or paper handouts in their instructional service, using the alliance-designed brochure as a model.

finally, staff in member libraries exercised individual approaches to promote prospector to users. denison library describes and provides a link to prospector on its web list of databases and help guides. colorado state university libraries devoted the front page of its library newsletter to "hunting for hidden gold," the introduction of prospector. a special newsletter for auraria's history faculty highlighted prospector in its database news section. the university libraries of the university of colorado at boulder describes the unified catalog in its web site on its state services page. more introductions came from instructional classes held by every member library.

profile of participating libraries

prospector is unique since it is multistate, multitype, and multisystem. of the sixteen members (see appendix a), almost all are located along the front range of the rocky mountains extending from laramie, wyoming, southward to colorado springs, colorado. only fort lewis college is located on the western slope of the mountains. despite the distances, a network of courier service connects all members.

within the membership are eleven public and private academic libraries, three special libraries representing law and medicine, and two public libraries that serve almost one million registered patrons. twelve of the libraries operate innopac and are loaded into prospector. two libraries on the carl system are slated for loading in mid-2000. two other libraries are migrating to the voyager system by endeavor information systems in the summer of 2000.
hopes are to incorporate them into the system in 2001.

description of how inn-reach works

the inn-reach software is designed to provide a union catalog with one record per title with all of the libraries holding a title represented. after databases are loaded initially, the software automatically queues transactions that occur to bibliographic, item, order, or summary serial holdings records and sends those transactions up to the central catalog. staff in the local library have no extra work or steps to take to send transactions to the union catalog.

the union catalog uses a "master" record to maintain only one bibliographic record per title. the "owner" of the master record is determined by several factors. a bibliographic record with only one holding library automatically has that library as the owner of the master record. if more than one library holds a title, the system uses an algorithm to determine which record coming into the system has the highest encoding level. the library that has the record with the highest encoding level becomes the owner of the record, and its version of the record is displayed and indexed in the catalog. in addition, a table is created which has a list of the libraries in priority order for determining the master record if two or more matching records enter the system with the same encoding level.

for the prospector catalog, a survey was conducted of the participating institutions to determine which libraries might have the best or fullest records. questions in the survey included size of database, source of bibliographic records, participation in national projects (e.g., program for cooperative cataloging, oclc enhance), amount of authority work done and level of authority control in the local database, level of cataloging given to records, and type of institution.
the task force charged with designing the catalog examined these surveys and determined a priority order of the participating institutions for selecting bibliographic records.

the system also uses a set of match points each time a bibliographic record is added to the union catalog. whenever a match occurs, the system examines the encoding level of the incoming record and the library from which the record is coming to determine if a change in the master record is required. the existing record is overlaid by the incoming record if the master record holder is changed. the first check is done on the oclc record number. if there is a match on that, the system adds the holdings to the existing record. if there is no match on the oclc number, the system attempts to match on the isbn or issn in combination with the title in the 245 field. again, if a match occurs, the system adds the holdings to the existing record. if no match occurs, a new bibliographic record is added to the catalog.

in addition, each library that has a local innovative interfaces system has the ability to exclude bibliographic, item, order, or check-in records from being sent to the union catalog. suppression may occur in each of these record types. the library may also choose to send a record to the union catalog but exclude it from public display in the union catalog or to suppress a record from displaying in the public catalog both locally and centrally.

the inn-reach system has no central database maintenance module, though it does provide a staff mode in which to view records, to create lists, and to monitor transaction queues. the staff module that is available via a telnet connection allows authorized users to view those records that have been contributed to the union catalog but are not displayed to the public in the union catalog.
for example, a library may contribute its order records to the union catalog but choose to suppress those records from public display; however, authorized staff may view these records in the inn-reach staff mode or create lists for collection development purposes that include those order records.

circulation status of individual items and volumes also appears to the user. the prospector member libraries with local innovative interfaces systems also maintain a set of circulation or item status codes that display various messages to users of their individual public catalogs. the inn-reach system also has a set of circulation or item status codes. agreement was reached on what the status codes were to be in the central catalog, and each member library then had to map its local codes to the codes used in the central catalog to ensure proper message display in the union catalog. in some cases, the member libraries had to adjust local status codes.

indexes for the prospector catalog were determined during the profiling process. in general, there are more indexes in the union catalog than are available in the member libraries' local catalogs. indexes in prospector include author, author/title, library of congress subject headings, medical subject headings, library of congress children's subject headings, journal title, keyword, library of congress classification numbers, national library of medicine classification numbers, dewey decimal classification numbers, government documents numbers, oclc numbers, and special numbers (e.g., isbn, issn, music publisher numbers, etc.). the classification number indexes are derived using the classification numbers that appear in the defined marc tags for the various classification schemes in the bibliographic record and do not represent local call numbers. local call numbers are always stored at the item record level in the union catalog.
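the match-and-overlay behavior described earlier in this section can be illustrated with a short python sketch. it is not innovative interfaces' implementation: the class names, the numeric encoding_rank scale (higher meaning fuller cataloging), the library codes, and the case-insensitive title comparison are all assumptions made for illustration. the sketch checks the oclc number first, then the isbn or issn together with the 245 title, and replaces the master record only when the incoming record has a higher encoding level or, on a tie, comes from a library higher on the priority list.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class BibRecord:
    library: str                  # contributing library code (hypothetical)
    title: str                    # title taken from the 245 field
    encoding_rank: int            # assumed scale: higher = fuller cataloging
    oclc_number: Optional[str] = None
    standard_number: Optional[str] = None  # ISBN or ISSN


@dataclass
class MasterRecord:
    record: BibRecord             # the version displayed and indexed
    holdings: Set[str] = field(default_factory=set)


class UnionCatalog:
    def __init__(self, priority: List[str]):
        # priority lists library codes, most preferred first
        self.priority = {lib: pos for pos, lib in enumerate(priority)}
        self.masters: List[MasterRecord] = []

    def _find_match(self, incoming: BibRecord) -> Optional[MasterRecord]:
        # first match point: OCLC record number
        if incoming.oclc_number:
            for master in self.masters:
                if master.record.oclc_number == incoming.oclc_number:
                    return master
        # second match point: ISBN/ISSN in combination with the title
        if incoming.standard_number:
            for master in self.masters:
                same_number = master.record.standard_number == incoming.standard_number
                same_title = master.record.title.casefold() == incoming.title.casefold()
                if same_number and same_title:
                    return master
        return None

    def _outranks(self, incoming: BibRecord, current: BibRecord) -> bool:
        # a higher encoding level wins; the priority table only breaks ties
        if incoming.encoding_rank != current.encoding_rank:
            return incoming.encoding_rank > current.encoding_rank
        return self.priority[incoming.library] < self.priority[current.library]

    def add(self, incoming: BibRecord) -> MasterRecord:
        master = self._find_match(incoming)
        if master is None:
            # no match on any point: a new bibliographic record is created
            master = MasterRecord(record=incoming)
            self.masters.append(master)
        elif self._outranks(incoming, master.record):
            # the master record holder changes, so the record is overlaid
            master.record = incoming
        # holdings are attached whether or not an overlay happened
        master.holdings.add(incoming.library)
        return master


catalog = UnionCatalog(priority=["big", "law", "small"])
catalog.add(BibRecord("small", "Mining History", 3, "100", "0123456789"))
catalog.add(BibRecord("big", "Mining History", 3, "100", "0123456789"))
print(catalog.masters[0].record.library, sorted(catalog.masters[0].holdings))
```

a production system would also have to normalize titles and standard numbers before comparing them; the article does not describe how inn-reach does this, so the sketch leaves it out.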
it was decided that many local marc fields that are defined for local notes or local access would not transfer from the local catalog to the union catalog (e.g., 59x, 69x, 79x, 9xx) to avoid ambiguities and excessive heading conflicts. therefore, there may be access points or index entries in the local catalog that may not be available in the union catalog; the local catalog may still contain "richer" or "fuller" searching than the union catalog. the local catalog may have materials accessible in it as well that do not appear in the union catalog. patrons using a local catalog may transfer their searches up to prospector simply by clicking on a button in their local public catalogs and have the search automatically occur in the union catalog. patrons may access prospector directly either via the world wide web or via telnet. navigation between local catalogs and prospector as well as navigation within prospector has been designed to be clear and simple. patrons may also go from prospector either back to their local catalog or to the local catalogs of other member libraries. when a patron locates an item that he or she wishes to borrow from prospector, he or she may initiate the request for the item online. the borrowing and lending process is described below.

prospector member libraries have been asked to be as inclusive as possible in contributing bibliographic records to the union catalog. member libraries have been asked to contribute the following:
• items that users may borrow, including all monographic materials that circulate, and other material types as specified by individual institutions that are listed as available for circulation.
• items that users may not borrow but may use onsite, including reference materials, archival materials, rare books, and others as determined by individual institutions.
virtual items, such as electronic journals, which have ip limiting and authentication are included in this category.
• items that are owned virtually which have urls or ip addresses that are open and unrestricted include government publications and selected home pages as determined by the local institution.

bibliographic records that are contributed should have as full cataloging as possible for identification and retrieval. materials that are on reserve and other locally defined special materials (e.g., materials that have use restrictions placed upon them) may be excluded from prospector. the prospector union catalog will also include bibliographic and circulation information from libraries that do not use innovative interfaces as their local system vendor.

the integration of non-innovative libraries into inn-reach

one of the major efforts in the prospector project was to be able to incorporate bibliographic, item, summary serial holdings, and acquisitions records from other vendors with the inn-reach union catalog software. in 1997, when the grant was written, it was envisioned that the system would incorporate libraries using two ils vendors (innovative interfaces, inc. and carl corporation), two of the major vendors in colorado at the time. twelve libraries used innovative interfaces and four used the carl system (denver public library, regis university, colorado school of mines, and the university of wyoming). however, in late 1999, the colorado school of mines and the university of wyoming decided to migrate to the voyager system by endeavor information systems (this is occurring in 2000). both of these institutions have still expressed an interest in being part of prospector, so they will need to be integrated in 2001 after they are stable on their new system. the remaining carl sites will be fully integrated in 2000.
the integration of records that allows document requests from different vendors is being accomplished as follows:
• innovative interfaces, inc. has published a set of specifications for how bibliographic, item, summary serial holdings, and acquisitions order records should be formatted to be loaded into the union catalog.
• published specifications were also created for patron verification and for how document requests are to be transferred.
• the alliance office is developing the software to package usmarc bibliographic records, item records, summary serial holding records, and order records to transfer to prospector. work is also being done so that document requests may be relayed between the different systems using an intermediate unix server running an sql database with a web interface for circulation to ill staff.

because the carl and endeavor systems are built differently, the record updating may be done on a "batch" basis several times a day. patron verification, to determine if a carl or endeavor patron is in good standing before allowing a document request, will be done in realtime.

administrative and committee structures

under provisions of the grant, the dean of libraries at the university of northern colorado provides administrative management for the project while the colorado alliance of research libraries houses the server, maintains the union catalog software, provides network connectivity, develops the software to integrate the non-innovative sites into the union catalog, and provides ongoing system administration support for the project. a prospector steering committee comprised of deans and directors of three participating libraries provided general overview for the project during the initial stages.
to carry out the initial work of the project, two task forces were appointed with responsibility for detailed design and implementation of the system: the catalog/reference task force and the circulation/document delivery task force. the catalog/reference task force was charged with making all bibliographic and display decisions relating to the catalog. this included establishing the criteria for determining which institution's bibliographic record displays in the catalog, developing display and overlay hierarchies for bibliographic records coming into the system, and identifying marc fields that would be indexed and displayed in the catalog. membership on this task force included both public services and technical services personnel, but did not include representation from every participating library.9 the circulation/document delivery task force was charged with developing common circulation policies to be applied in the union catalog including loan periods, fines, renewals, holds, recalls, checkout limits, and patron blocks. the task force was also responsible for developing the precedence table for routing patron requests. the members of this task force represented each participating library, and several libraries had representation from both their circulation and interlibrary loan departments.10 these two task forces conducted meetings from july 1997 through december 1999. the stage was set for the task forces' work at a training session held by innovative interfaces, inc. on system operation and functionality. each group received direction on what policy issues needed to be determined to lay the groundwork for establishing the codes that drive system functionality. after the initial training, each task force met several times a month, often consulting with innovative interfaces, inc. and/or their local libraries as their planning and deliberations continued. communication was an important component during the development of the system.
soon after the grant was awarded, staff from the alliance office visited each participating library and met with library personnel to explain the overall goals of the project and how work would be conducted. as detailed development progressed, open forums were held in central locations to keep representatives of all libraries apprised of progress and to get feedback regarding specific policy issues. completed work from the task forces was mounted on the prospector web site. in addition, regular articles appeared in datalink, the alliance monthly newsletter. specific training sessions were conducted both by the task forces and by innovative interfaces. as the actual database loading process began, the catalog/reference task force conducted sessions at each prospector library. these sessions were twofold in purpose: to provide an opportunity for a general overview of how the database structure and indexing worked for all library personnel, and to train technical services personnel in how local coding of records impacted the display of their local records in the global catalog. in preparation for going live with patron requesting, innovative interfaces, inc. conducted pac searching and circulation training sessions at several central locations for frontline staff from all institutions. in addition, the circulation/document delivery task force held a central session for representatives from all libraries to discuss issues relating to the flow of materials among libraries. during system implementation, it became apparent that some permanent structure would be required for ongoing maintenance and development of the global catalog. in completion of their charges, each task force prepared a final report, which was submitted to the steering committee and to the prospector directors group. each task force recommended its own termination but outlined a structure to address ongoing issues.
as approved by the prospector directors group, the ongoing governance structure is multilayered with frontline operations groups, broader planning and policy-setting committees, an advisory committee, a directors group, and electronic discussion lists for communication. monitoring of the day-to-day work of the cataloging and circulation/document delivery operations is handled by frontline staff via e-mail, electronic discussion lists, and/or telephone. broader planning and policy issues are addressed through smaller, representative standing committees. the advisory committee and directors group operate at a policy level. the new structure includes:
• a catalog site liaison group comprised of one representative from each participating library and charged with serving as the point of contact for inquiries regarding catalog maintenance, access and record merging;
• a catalog/reference committee comprised of members selected from the participating libraries and charged with responsibility for all bibliographic and display issues relating to prospector. this includes monitoring details of the current implementation as well as addressing ongoing policy issues, recommending system enhancements, testing new system functionality, and training staff at new sites coming into the system;
• a document delivery site liaison group comprised of one or more representatives from each participating institution with responsibility to
serve as a point of contact for other prospector libraries that have inquiries concerning issues, lost books, courier delivery, or related topics;
• a circulation/document delivery committee comprised of representatives selected from the participating libraries and responsible for issues relating to the courier delivery service, circulation load-balancing, monitoring member compliance with circulation policies, recommending system enhancements, testing new system functionality, and the year-end reconciliation of lost book charges; and
• a prospector advisory committee comprised of twenty-four deans and directors from participating libraries to address issues requiring quick response relating to project specifications and operating rules.

the prospector directors group is comprised of the deans/directors of all participating libraries and is charged with making recommendations on high-level policy and admission of new participants. since prospector is a project of the nonprofit colorado alliance of research libraries consortium, all final high-level decisions and financial commitments are subject to the approval of the board of directors of the consortium. at present, five of the sixteen prospector libraries are not part of the formal consortium but participate in this one project. the newly formed committees will continue to address broad policy and operational issues such as the load-balancing tables for routing patron requests to owning libraries, will document best practices for local libraries to follow in implementing certain functionality within their local system to achieve maximal results in the central catalog, will identify enhancements to the system, and will test new release functionality.

borrowing and lending policies and specifications

as a prelude to its work, the circulation/document delivery task force examined borrowing and lending practices from other innovative interfaces
inn-reach sites and reviewed the borrowing policies for consortial borrowers that were developed and agreed to by a subset of alliance libraries (university of northern colorado, auraria library, and denver public library) several years ago. the first major duty of the task force was to establish circulation and document delivery policies that would govern those functions within the prospector system. these common circulation and document delivery policies were based on a series of assumptions:
• the task force policies apply to the unified catalog only; local sites define local policies;
• local workflow remains local purview;
• policies should be kept simple;
• circulating materials are commonly circulated materials, primarily books, at each site;
• the task force will work within the confines of the inn-reach system;
• if a patron is blocked locally, he or she will be blocked at the global level;
• for routing purposes, each institution (rather than branch) is the routing site; and
• local sites will determine when their items are declared lost.

the task force established a series of recommendations for policies that applied to the prospector system. the proposed policies were discussed within the local institutions as well as with various administrative groups. the final policies for prospector lending as adopted and implemented in the system are:
• loan period: twenty-one days
• renewals: one
• number of holds allowed: forty
• checkout limit: forty items
• recalls: none, except for academic library reserve collections
• lost book charge: $100, which is comprised of a $75 refundable lost book charge and a $25 nonrefundable processing fee
• libraries establish their own local rules for overdue fines on prospector materials.

key features of the inn-reach software that were emphasized with each local library during training sessions are:
• libraries have local control over what is loaned through the global catalog.
• libraries have local control over which of their patrons can borrow materials through the global catalog.
• if the local copy is checked out or missing, a copy may be requested through prospector.
• the system is sensitive to multivolume works and allows particular volumes to be selected.

the ongoing document delivery committee has developed a series of "best practices" that establish benchmark policies that each library is urged to adopt in the spirit of uniform cooperation among participating libraries. individual libraries, however, may choose not to adopt these practices.

system functionality

the actual steps for a patron to request an item within the prospector system are simple and self-explanatory. once a patron has identified an item they wish to order, the following steps take place:
• the user is prompted for institutional affiliation, name, and library card number.
• the system checks the local system to ensure that the patron is in good standing.
• the user selects a pick-up location from those offered by their home institution.
• the system forwards the patron request to an owning library with an available circulation status, doing load balancing among the libraries with available copies.

once the patron request is forwarded to a lending library, the request goes into the queue of requested items from that library. each library has established its own workflow for handling requests; however, that workflow must include interaction with the system to record the status of the request. once the item is located by the lending library, it is checked out to the requesting patron's "home" library and is sent, via courier, to that library. the "home" library then receives the item in the system and holds it pending pick-up by the patron.
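the request steps above (patron verification, then forwarding to an owning library with an available copy) can be sketched as follows. this is a minimal outline under stated assumptions, not the inn-reach logic: the function names, the "available" status label, and the exclusion of the patron's home library from the candidates are all hypothetical. the lender is picked at random, in line with the random load balancing without precedence tables that appendix b says is currently configured.

```python
import random

AVAILABLE = "available"  # hypothetical label for a copy that can be lent


def patron_in_good_standing(patron_id, blocked_patrons):
    # A patron blocked locally is also blocked at the global level.
    return patron_id not in blocked_patrons


def choose_lender(home_library, copy_status, rng=random):
    """Pick one owning library at random among those with an available copy."""
    candidates = sorted(
        library
        for library, status in copy_status.items()
        # Assumption: requests go to other institutions, not the home library.
        if status == AVAILABLE and library != home_library
    )
    return rng.choice(candidates) if candidates else None


def place_request(patron_id, home_library, copy_status, blocked_patrons, rng=random):
    """Return the lending library for the request, or None if it cannot be placed."""
    if not patron_in_good_standing(patron_id, blocked_patrons):
        return None
    return choose_lender(home_library, copy_status, rng)
```

a precedence table, once adopted, would replace the random choice with an ordered preference per requesting site.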
when the patron arrives to borrow that item, it is checked out to that patron's record according to the prospector loan rules. having a common set of loan rules for all prospector loans provides consistency for the patron. the patron may still have multiple due dates on items checked out at the same time depending on the loan rules for local checkouts. the system maintains statistics on several elements of the borrowing and lending processes. it tracks the total number of items borrowed and loaned and calculates the ratio of borrowing to lending per institution. in addition, it tracks the number of items cancelled and the reason why, the number of holds filled and cancelled, and several other groupings.

challenges and issues

with the building of prospector still underway and public access available only since late july 1999, prospector is doing a respectable volume of loans in its infancy. over ten thousand items were delivered during the first six months of operation. this number is expected to rise dramatically as the system grows and as local libraries promote the service. this auspicious start provides a sense of accomplishment tempered by recognition that there is more to do. some of the major challenges facing the project include:
• development is underway to integrate records for the carl system libraries into the central catalog and provide borrowing capabilities for their patrons.
• as member libraries choose other online system providers, ideally, these systems likewise need to be interfaced with the prospector system. coming to agreements with all vendors involved will require careful negotiation and wording of contracts. discussions are underway with innovative interfaces and endeavor information systems for merging endeavor libraries into inn-reach.
• monitoring of how the fiscal accounting for the first end-of-year reconciliation of lost books will work is planned.
• developing best practices and evaluating software enhancements for inn-reach are necessary. we need to determine how to handle electronic resources and multiple formats, and load records from commercial electronic resources, for example, netlibrary. we must improve matching within the system and make additional enhancements to the prospector web site.
• with growth of the system, full-time operations and management staff may be required.
• securing funding for the new ventures and new staffing will require development efforts or a sharing of costs by members. there is no state-based funding for ongoing maintenance and new product acquisition.
• with the increasing flow of materials between libraries, the courier delivery service must be monitored on an ongoing basis. the statewide courier service has been recently restructured and was contracted based on pre-prospector activity levels for interlibrary loan materials. with the ever-growing popularity of prospector, there will be a corresponding increase in volume for the courier. service levels need to be monitored closely to ensure that the speed of delivery is maintained and that the loss and incorrect routing rate is within acceptable limits.
• the balance of borrowing and lending will have financial impacts on some of the participating libraries. through a legislative allocation, the state library of colorado provides funding on a per transaction basis to libraries that are net lenders, or that loan more materials than they borrow. most libraries are considering the prospector transactions as equivalent to interlibrary loan transactions and counting them toward the payment for lending program. it is anticipated that the inclusion of prospector activity in the interlibrary loan borrowing and lending statistics will significantly alter the balance of payment for lending among the prospector libraries.
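the net-lender test behind the payment for lending program reduces to comparing items lent with items borrowed. a minimal sketch, using three pairs of totals from the appendix b fulfillments table; the function names are illustrative, and the ratio is taken as lending divided by borrowing, which is what the printed ratios in that table correspond to:

```python
def lending_ratio(lent, borrowed):
    """Items lent per item borrowed, rounded as in the appendix b table."""
    return round(lent / borrowed, 2)


def is_net_lender(lent, borrowed):
    # A net lender loans more materials than it borrows.
    return lent > borrowed


# (lent, borrowed) totals as printed in appendix b.
totals = {"cub": (3120, 1520), "ftl": (511, 946), "uccs": (900, 882)}

for site, (lent, borrowed) in totals.items():
    print(site, lending_ratio(lent, borrowed), is_net_lender(lent, borrowed))
# prints:
# cub 2.05 True
# ftl 0.54 False
# uccs 1.02 True
```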
already prospector has shown that it is changing behaviors. the cooperation between libraries has been impressive. in member libraries, staff are factoring prospector into their plans and realizing that keeping prospector operations staff informed of problems is a good habit. user searching and document delivery patterns are changing. margaret landrum, director at the fort lewis college library, predicts that prospector will have a dramatic effect on researchers in the geographic area. its start has given all members a share in that expectation.

the future and interesting spin-offs

union catalog projects often take on a "life of their own" far beyond what was originally envisioned. some of the future spin-offs may include:
• the addition of other research libraries in nearby states.
• collection overlap studies and improved coordination on acquisition and weeding projects between libraries.
• with the full implementation of the union catalog, there are opportunities for resource sharing at a broader level. the central catalog has the functionality to support bibliographic records for and access to "consortial" resources, thus enabling libraries to jointly purchase resources and provide centralized access to them.
• as database and online information providers develop new methodologies for access to their resources, there will be opportunities to easily link from either the local or central catalog to these online resources, a process which is cumbersome and/or impossible in the nonglobal environment. for instance, where databases are centrally mounted at the alliance office with shared ownership, the link to serial holdings feature is pointed to prospector, thus providing patron access to consortia-wide holdings.
• use of the system as a central repository for cataloged metadata for electronic resources on the web.
• encouraging innovative interfaces, inc.
to allow document requests that "fail" in the system to be forwarded to national ill subsystems or commercial document suppliers using national standards.

conclusion

prospector dramatically alters the bibliographic landscape in colorado, offering patrons easy access to the bibliographic wealth of the state. patrons will be easily able to move from a local catalog to this regional system and request materials. librarians will find the system useful for collection overlap studies, improved coordination on acquisitions and weeding projects, z39.50 links with other indexing/abstracting services for serials holdings information (e.g., ovid or silverplatter), and expedited book delivery. the high level of cooperation among the diverse participating libraries is exemplary. the incorporation of public and private universities, public libraries, and special libraries offers a model for cooperation.

references

1. anthony j. dedrick, "the colorado union catalog project," college and research libraries news 59, no. 10 (1998): 754-55; george machovec, "prospector: a regional union catalog," colorado libraries 25, no. 2 (1999): 43-45.
2. clifford a. lynch, "the next generation of public access information retrieval systems for research libraries: lessons from ten years of the melvyl system," information technology and libraries 11, no. 4 (1992): 405-15; bernie sloan, "testing common assumptions about resource sharing," information technology and libraries 17, no. 1 (1998): 18-29.
3. thomas dowling, "ohiolink-the ohio library and information network," library hi tech 15, no. 3/4 (1997): 136-39; lindy naj, "the carl system at the university of hawaii uhm library," library software review 12, no. 1 (1993): 5-11.
4. gary pitkin and george machovec, colorado union catalog. senate bill 96-197. technology grant and revolving loan program. excellence in learning through technology. december 1996.
grant proposal by the university of northern colorado and the colorado alliance of research libraries.
5. gary pitkin, colorado union catalog-prospector. final report. july 27, 1999.
6. machovec, "prospector: a regional union catalog."
7. ibid.
8. ibid.
9. prospector staff web site, www.coalliance.org/prospector.
10. ibid.

appendix a

general statistics about prospector:
• sixteen libraries (see below)
• twelve innovative interfaces sites (went live in fall 1999)
• two carl sites (to go live in 2000)
• two voyager endeavor sites (to be incorporated in 2001 pending final negotiations with both vendors)
• 3.6 million unique marc records as of january 2000, which are expected to grow to more than 5 million after the incorporation of the carl and endeavor sites.
• 9 million item records, which are expected to grow to more than 12 million after the incorporation of the carl and endeavor sites.
• currently 61 percent of the records in the system are held by only one library.
• greater than 1 million registered patrons are possible users. denver public library has over 500,000 patrons and jefferson county public library has over 300,000 patrons.
• prospector url for public use: http://prospector.coalliance.org
• prospector staff url, which includes policies, committee minutes, and profiling tables: www.coalliance.org/prospector

prospector libraries and web sites
auraria library: http://carbon.cudenver.edu/public/library
colorado college: http://www.coloradocollege.edu/library
colorado school of mines: http://www.mines.edu/academic/library
colorado state university: http://manta.library.colostate.edu
denver public library: http://www.denver.lib.co.us
fort lewis college: http://library.fortlewis.edu
jefferson county public library: http://www.jefferson.lib.co.us
regis university: http://www.regis.edu/lib/wlibhome.htm
university of colorado at boulder: http://www.libraries.colorado.edu
university of colorado/colorado springs: http://web.uccs.edu/library
university of colorado/health sciences: http://www.uchsc.edu/library/index.html
university of colorado/law library: http://www.colorado.edu/law/lawlib
university of denver: http://www.penlib.du.edu
university of denver/law library: http://www.law.du.edu/library
university of northern colorado: http://www.unco.edu/library
university of wyoming: http://www-lib.uwyo.edu

appendix b

early borrowing/lending data

the borrowing and lending patterns in prospector will be of interest to monitor because of the wide variety of participating libraries in the system. the incorporation of both academic and public libraries has the potential for different use patterns than those seen in more homogeneous academic union catalogs. the following data represents some of the very early borrowing and lending patterns in prospector. all of the libraries in the table went "live" in terms of borrowing and lending in late july or august 1999, with the exception of jefferson county public library, which went live in november 1999.
history with other similar projects has shown that use will dramatically grow as libraries and users gain familiarity with the service. the incorporation of denver public library in 2000 should provide significant impact on the service. at the present (and in the accompanying table), prospector has been configured to do random load balancing without the use of any precedence tables to force document requests to one site or another.

borrowing site (column order): aur ccc csu cul cub du dul ftl jcpl uccs uchsc unc
borrowing totals: 1879 930 2301 225 1520 1132 129 946 1775 882 364 2063

lending (owning) site, ratio, lending total, items lent by borrowing site (values in printed order):
aur 0.89 1667: 108 282 33 232 187 17 113 234 128 70 263
ccc 0.72 673: 114 109 11 96 57 66 89 53 10 68
csu 0.86 1985: 267 156 29 272 221 18 130 288 134 55 415
cul 0.55 123: 24 9 20 5 11 12 3 10 7 3 19
cub 2.05 3120: 396 231 590 26 260 21 246 420 233 56 641
du 2.07 2341: 361 153 464 42 315 20 163 279 131 69 344
dul 1.12 145: 27 7 14 27 15 25 3 11 6 4 6
ftl 0.54 511: 66 36 130 3 66 36 7 72 31 11 53
jcpl 0.54 962: 187 81 201 11 154 65 11 64 33 38 117
uccs 1.02 900: 170 65 148 12 130 65 5 3 137 15 90
uchsc 0.83 301: 63 5 49 5 26 31 3 5 32 36 46
unc 0.69 1422: 219 81 291 27 207 153 13 89 222 90 30

prospector fulfillments report, august 1999 through february 14, 2000

digital resource sharing and library consortia in italy
tommaso giordano
information technology and libraries; jun 2000; 19, 2

interlibrary cooperation in italy is a fairly recent and not very widespread practice. attention to the topic was aroused in the eighties with the italian library network project. more recently, under the impetus toward technological innovation, there has been renewed (and more pragmatic) interest in cooperation in all library sectors.
sharing electronic resources is the theme of greatest interest today in university libraries, where various initiatives are aimed at setting up consortia to purchase licenses and run digital products. a number of projects in hand are described, and emerging trends analyzed.

the state of progress and the details of implementation in various countries of initiatives to share digital information resources obviously depend (apart from current investment policies to develop the information society) on many factors of a historical, social, and cultural nature that have determined the evolution and consolidation of cooperation practices specific to each context. before going to the heart of the specific subject of this article, in order to foster an understanding of the environment in which the trends and problems that we shall be considering are set, i feel it best to give a quick (and necessarily summary) sketch of the library cooperation position in italy. the word "cooperation" became established in the language of italian librarians only toward the mid-'70s, when in the sector of public libraries (which were transferred in those years from central government to local authorities) the "territorial library systems" were taking shape: this was a form of cooperation provided for and encouraged by regional laws that brought together groups of small and medium-sized libraries, often around a system centre supplying shared services. a few years later, in the wake of the new information technologies and in line with ongoing trends in the most advanced countries, in italy, too, the term "cooperation" became increasingly associated with the concept of computerized library networks.
the decisive impulse in this direction came from a project of the national library service (sbn), the national network of italian libraries, then in a gestation stage, which also had the merit of speeding up the opening of the italian librarianship profession to experiences underway in the most advanced countries.1

tommaso giordano (giordano@datacomm.iue.it) is deputy director of the library at the european university institute, florence.

in the '80s, cooperation, together with automation, was the dominant theme at conferences and in italian professional literature. however, the heat of the debate had no satisfactory counterpart in terms of practical implementation, because of both resistance attributable to a noninnovative administrative culture and the polarization of the bulk of the investments around a single major project (the sbn network), the technical and organizational choices of which were shared by only part of the libraries, while others remained completely outside this programme. many librarians, while recognizing the progress over the last fifteen or twenty years (including the possibility of accessing the collective catalogue of sbn libraries through the internet), maintain that results obtained in the area of cooperation are well below expectations and the energy expended. i am touching here on one of the most sensitive, controversial points in the ongoing professional debate, which i do not wish to dwell on except to note the split that came in italian libraries following the vicissitudes of a project that ought, instead, to have united them and stimulated large-scale cooperation.2 i shall now seek to summarize the cooperation position in italy in relation to the subject of this article. very schematically (and arbitrarily) i have grouped the experiences i feel most significant under three heads: sbn network, territorial library systems, and sectoral cooperation.
sbn brings together some eight hundred large, medium-sized, and small libraries (national, local-authority, university, and research-institute). the programme, funded by the central government, supports cooperation in the following main sectors:
• hardware sharing,
• development and maintenance of library software packages,
• network administration,
• shared cataloguing, and
• interlibrary loans.
the sbn is a star network with its central node consisting of a database (the so-called "index") containing the collective catalogue of the participating libraries (currently some four million relevant bibliographic titles and 7.5 million locations). to the index are linked the thirty-seven local systems, single libraries or multiple libraries, that apply the computerized procedures developed by the sbn programme. thus the sbn is a closed network: only those libraries that agree to adopt the automation systems distributed by the central institute for the union catalogue (iccu), the central office coordinating the programme, take part.

reproduced with permission of the copyright owner. further reproduction prohibited without permission.

from the organizational viewpoint, the sbn can be regarded as a de facto consortium (i.e., not in the legal sense of the term), even if the management bodies, participation structures, and funding mechanisms differ considerably from consortia that have been set up in other countries. in fact, libraries join the sbn through an agreement among state, regions, and universities, and the governing bodies represent not the libraries but their parent institutions.
participating libraries receive the services free, and funding for developing the systems and network administration comes from the central government, which coordinates the technical level of the project through iccu.3 currently, ideas are moving toward evolving the sbn into an open network system and reorganizing its management bodies: if this provision becomes a reality, the sbn will have potential for taking on an important role in developing digital cooperation.

the territorial library systems, developed especially in the central and northern regions, consist of small groups of public libraries cooperating in one or more sectors of activity such as:
• sharing computer systems,
• cataloguing,
• centralized management of purchases,
• interlibrary loans, and
• professional training and other activities.
the library systems are based on conventions and formal or informal agreements between local institutions (the municipalities) and receive support from the provincial and regional administrations. in more recent years some systems (e.g., abano terme, in the veneto) have formed themselves into formal, legal consortia. the most advanced experiences in this sector (for example, the libraries in the valseriana, an industrial valley in lombardy, which have been operating on the basis of an informal consortium for some twenty years now) have reached a high level of efficiency comparable with the most developed european situations and may rightly be regarded as reference models for the organization of cooperation. however, given their limited size, they are unlikely to achieve economies of scale in the digital context unless they develop broader alliances. it is not unlikely that these consortia, given their capacity to work together, will in the near future develop broader forms of cooperation suited to tackling current technological challenges.
sectoral cooperation (cooperation by area of specialization) is meeting today with steadily increasing interest, though it did not fare very well in the past. among the rare initiatives embarked upon by university and research libraries in this direction, particular importance in our context attaches to the national coordination of architectural libraries (cnba), started some twenty years ago, which became an association in 1991. the cnba has various projects on its programme and can be regarded as an established reference point for cooperation among architectural libraries. we should also mention one of the "oldest" cooperation projects among research libraries: the italian periodicals catalogue promoted by the national research council (cnr), recently made available online by the university of bologna.4

to complete this sketch, at least a mention should be made of the participation of italian libraries in the european commission's technical programme in favor of libraries. this programme, which since 1991 has mobilized the world of libraries in the european union, not only favors and guides the introduction of new technologies into libraries in accordance with preset objectives, but also has the aim of encouraging cooperation among libraries in the various countries. the programme (the latest edition of which includes not just libraries but also archives and museums) has secured significant participation from many italian libraries. over and above the validity of the projects already carried out or under way (important as that is), this programme has been very valuable to italian libraries in terms of exchanges of experience and of opening up professional horizons, especially as regards cooperation practice.5

digital cooperation

recently, following the expansion of electronic publishing, university libraries have been displaying renewed interest in cooperation activities with particular reference to acquiring licenses and sharing electronic resources.
this movement is at present in full swing and is giving rise to manifold cooperation initiatives. to get an idea of the trends under way, one may leaf through a session on database networking in italian universities in the proceedings of the aib congress at genoa.6 on that occasion a group of universities presented a "draft proposal of agreement on access to electronic information." the document is divided into two parts, the first defining the purposes and object of university cooperation in the sphere of electronic information. the second part indicates operational objectives for cooperation in acquiring electronic information and proposes a model contract for purchasing licenses, to which member universities are to adhere. the content of this second part coincides with the recommendations and understandings signed by associations, consortia, and groups of libraries in other countries, and largely follows the indications and recommendations issued by the european bureau of library information and documentation associations (eblida), the organization that brings together the library associations of the various european countries; by the international coalition of library consortia (icolc); and by other library organizations.

digital resource-sharing and library consortia in italy | giordano 85

there is no point here in listing all initiatives under way in italian libraries, in part because most of them are only just started or in the experimental stage. i shall mention a few only to bring out the trends that seem, from my point of view, to be emerging.

development of digital collections

at the moment initiatives in this sector are much fewer and less substantial than in other industrialized countries.
among them the biblioteca telematica italiana stands out: in it, fourteen italian and two foreign universities digitize, archive, and put online works in italian. the project is based on a consortium, the italian interuniversity library center for the italian telematic library (cibit), supported by funds from the national research council (cnr) and made up of the fourteen italian and two foreign universities that have signed the agreement. technical support is provided by the cnr institute for computer linguistics, located in pisa.7 in this context we must also note, especially for the consequences it may have for the future growth of digital collections, an agreement between the national central library in florence and the publishers and authors associations aimed at accomplishing the national legal depository for electronic publishing project, which also provides for production of a section of the italian national bibliography to be called bni-documenti elettronici. the publishers who have signed the agreement undertake to supply a copy of their electronic products to the national central library in florence. the latter undertakes to guarantee conservation of the electronic products deposited, and to make them accessible to the public in accordance with the agreements reached.8

description of electronic resources

in this area the bulk of the initiatives are still in an embryonic stage. in the sector of periodicals index production (i.e., tocs), mention should be made of the economic social science periodicals (essper), a cooperation project on italian economics periodicals launched by the libero istituto universitario carlo cattaneo (castellanza, varese) to which some forty libraries are contributing.9 recently the project has been extended to italian legal journals.
essper is a cooperative programme based on an informal agreement among the libraries, each of which undertakes to supply in good time the tocs of the periodical titles it has undertaken to monitor. the programme does not benefit from any outside funds, being supported entirely by the participating libraries, which have recently been endeavouring to evolve into a more structured form of cooperation.

administration of electronic resources and licenses

in this sphere there have been numerous initiatives recently, particularly by university libraries. one may note, first, a certain activism by university data-processing consortia (big computing centres created at the start of the computer era to support applications in scientific and then university and library administration areas). the interuniversity consortium for automation (cilea) in milan, which has for some time been operating in the area of library systems and electronic information distribution (especially in the biomedical sector), has extended its activities by offering services to nonmembers of the consortium too. recently cilea, in connection with a broader programme (cdl, cilea digital library), has been negotiating with a number of major publishers the distribution of electronic journals and online bibliographic services on the basis of needs expressed by the libraries in the consortium. caspur (the university computing consortium in rome) is working on several projects, among them shared management of electronic resources on cd-rom in a network among five universities of the centre-south. caspur, too, has opened its services to libraries not in the consortium and is negotiating with a number of major publishers the licenses for establishing a mirror site for electronic periodicals.
the university of genoa, through csita, its computing services centre, has concluded an agreement with an italian distributor of electronic services to enable multisite license-sharing for biomedical databases by institutions operating on the territory of liguria. very recently the universities of florence, bologna, modena, genoa, and venice and the european university institute in florence have initiated a pilot project (cipe) for shared administration of electronic periodicals, and have begun negotiations with a number of publishers.

let us now seek to draw some conclusions from this initial, brief consideration of current initiatives:
• initiatives in the area of digital cooperation are coming mainly from the world of university and research-institute libraries.
• no projects are big enough to achieve economies of scale, with most initiatives in hand having a very limited number of partners and often being experimental in nature.
• projects under way do not provide for the formation of proper consortia, most likely because the legal form of the consortium is hard to set up in italy because of the burdens involved, especially the complexity and length of the decision-making processes needed to constitute such an organization.
• librarians prefer decentralized forms of cooperation, partly because, shaken by experiences of the past, they fear losing autonomy and efficiency and finding themselves caught up in the bureaucracy of centralized organizations. "however, there can also be a correlation between the amount of autonomy that the individual institution retains and the ability of the consortium to achieve goals as a group". this observation by allen and hirshon obviously holds for italy too.10 it is no coincidence, in fact, that university computing consortia, which have centralized staff and funds available, are able to carry out more incisive actions in this sector.
• except for the biblioteca telematica italiana, no initiatives seem to have been incentivized by ad hoc government programmes or funds.
• a part of the cooperation projects concerns sharing of databases on cd-roms. the traditional italian resistance to online materials would seem to be due partly to the still inadequate network infrastructures in our country; improvements in this sector might bring a quick turnaround here.
• some initiatives in hand have been inspired more by suppliers than by librarians: the risk is that cooperation serves to distribute a particular product rather than to enhance libraries' bargaining power. without wishing to deny anything to the suppliers, who today play an essential part in terms of professional information too, i feel that keeping the roles clearly separate may help to develop clear, upright and mutually advantageous cooperation.
• some major projects are being led by university computing consortia that have begun to take an interest in the library sector. the university computing consortia would indeed meet some of the requirements to play a first-rank role in this sphere if they can manage to bring themselves into their most natural position, i.e., to operate as agents of libraries rather than as distributors of services on behalf of the commercial suppliers. moreover, it ought to be clear that the computing consortia should act as partners with the library consortia and not as substitutes for them, otherwise the libraries risk limiting their autonomy of decision.
• some attention is turning toward university electronic publishing, though at the present stage it does not seem there are practical projects for cooperation in this area.
• finally, one has to note low initiative by libraries (compared with other countries) in developing content and in storing digital collections.

the analysis i have rapidly summarized here is the basis for an initiative which has in recent months been stimulating the debate on digital cooperation in italy. i am referring to the italian national forum on electronic information resources (infer), a coordination group initially promoted by the european university institute, the university of florence, and a number of universities in the centre-north, which is today extending beyond the sphere of university and research libraries. the forum's chief mission is to cooperate to promote efficient use of electronic information resources and facilitate access by the public. to this end it encourages libraries to set up consortia and other types of agreement on acquisition and management of electronic resources and access to them. infer's objectives can be summarized as follows:
• to act as a reference and linkage point and develop initiatives to promote activities and programmes in the area of library electronic resource sharing;
• to enhance awareness both at institutional political levels (ministries, universities, local authorities, etc.) and among librarians and end users;
• to facilitate dialogue and mutual collaboration between libraries and all others in the knowledge production and distribution chain, to help them all (authors, publishers, intermediaries, end users) to take advantage of the opportunities offered by the information society; and
• to maintain contacts with similar initiatives under way in other countries.
infer has immediately embarked on a rich programme of activities which is giving appreciable results especially in terms of raising awareness of the problem and coordinating initiatives in the area. we shall here briefly mention some of the actions in hand that seem to us most important.

dissemination of information.
infer has developed a web site where, as well as information on the forum's activities, important information and documents can be found relating to the consortia, the negotiations and licenses, and in general the digital resource-sharing programmes in italy and around the world.11 a discussion list for infer members has also been activated.

seminars and workshops. this activity is aimed at further exploration of themes of particular interest (e.g., legal aspects of license contracts, or programmes under way in other countries).

data collection. the two main programmes coming under this heading are: (a) monitoring of italian cooperation initiatives under way in the digital sector; and (b) collecting data on acquisitions of electronic information resources in university libraries. this information will enable the libraries to have a more exact picture of the situation, so as to assess their bargaining power and achieve the necessary support to adopt the most appropriate strategies.

indications and recommendations. as well as translating and distributing documents from the most important associations operating in this area (such as eblida, icolc, and ifla), infer is developing a model license for the italian consortia.

infer was set up in may 1999 and currently has some forty members, most of them representatives of university library systems, university computing consortia or research libraries, or university professors. one of infer's aspirations is to persuade decision-makers to develop a programme of incentives on a national scale for the creation of library consortia.
critical factors

the delay we note in shared management of electronic resources clearly owes much to the fact that cooperation is not well established, nor are the national structures that ought to have supported it. it would be all too easy and perhaps also more fun to attribute this situation to the so-called individualism of italians and to abandon inquiry into the structural limitations that may have determined it.

first of all, except in very few cases, libraries have no administrative autonomy, or only very little, and with hardly any decision-making powers. this factor favors interference in decision-making processes, complicates them, slows down procedures, and strips librarians of their responsibility. one of the reasons why the sbn has not managed to generate cooperation is to be sought in the mechanisms for joining and participating in the programme. in other words, many libraries have joined the sbn following decisions taken from above, at the political and administrative levels, and not on the basis of an autonomous, weighed assessment of attitudes, needs, and alternatives. these experiences have augmented libraries' reluctance to embark on centrally steered national programmes. on the other hand, the low administrative autonomy they have prevents them from implementing truly effective alternative solutions, i.e., ones able to realize economies of scale.

another factor is the administrative fragmentation of libraries. the big universities have fifty or so libraries each (often one per department). some universities have an office coordinating the libraries, but only in very few cases does this structure have the powers and the necessary support to coordinate; more often it acts as a mediation office with no real administrative powers.
in short, the result is that since (perhaps also because of a misunderstood sense of departmental autonomy) there is no decision-making centre for libraries in each university, decisional processes prove slow and cumbersome. clearly, all this brings many problems in establishing understandings and cooperative programmes with other libraries and weakens the universities in negotiating licenses. this position, while objectively favoring suppliers in the short term, in the long term risks facing them with difficulties given an increasingly impoverished, uncertain market because of the fragmentation and the limited capacity of possible purchasers.

another limit is the insufficient awareness, especially on the academic side, of the challenges of electronic information. in early 1999 the french daily le monde published an extensive feature on scientific publishing, showing how current publishing production mechanisms, while assuring a few big publishers of ample profit margins, are suffocating libraries and universities under the continuous rises in prices for scientific journals.12 the argument, immediately taken up by the spanish el país and other european newspapers, met with very little response in italy. clearly, in italy today, the conditions do not exist to embark on initiatives like the incisive open letter to publishers sent by the kommission des deutschen bibliotheksinstituts für erwerbung und bestandsentwicklung in germany, supported by similar swiss, austrian, and dutch organizations.13

the lack of an adequate national policy in the area of electronic information is probably the direct consequence of the problems i have just mentioned. in this context, however praiseworthy the initiatives, they tend in the absence of reference points and practical support to break up or fritter away.
under the ministry for universities there are no leadership or action bodies in the area of academic information, like the joint information systems committee in britain that stimulates programmes aimed at developing and utilizing information technologies in university and research libraries. these observations are also valid for the state libraries and public libraries, where the central (ministry for cultural affairs) and regional authorities could play a more effective part in promoting digital cooperation.

conclusions

the picture i have presented is not very rosy. however, it does reveal considerable elements of vitality and greater awareness of the problems emerging, starting with a few representatives of academic sectors who might be able to wield influence and bring about a turnaround. at the moment, the consortium movement to share electronic resources chiefly involves university libraries, but a few initiatives by public libraries are starting to appear, especially in the multimedia products sector. no specific lines of action are yet emerging at the level of the national authorities, especially the ministry for education and research and the ministry of cultural activities, on which the national libraries and many research libraries depend. it is likely that in the near future the entry of these agencies may be able to modify the current scenario and considerably influence the approach to cooperation. from this viewpoint, the impression is that a few consortium initiatives that have been flourishing in recent months on the part of both libraries and suppliers have the principal aim of proposing cooperation models to guide future choices. in conclusion, we are only at the outset, and the game is still waiting to be played.

references and notes

1. michel boisset, "l'organisation automatisée de la bibliothèque de l'institut universitaire européen de florence," bulletin des bibliothèques de france 24, no. 5 (1979): 231-39. for an overall picture of the debate, see: la cooperazione: il servizio bibliotecario nazionale: atti del 30th congresso dell'associazione italiana biblioteche, giardini naxos, november 21-24, 1982 (messina: università di messina, 1986).
2. tommaso giordano, "biblioteche tra conservazione e innovazione," in giornate lincee sulle biblioteche pubbliche statali, roma, january 21-22, 1993 (roma: accademia nazionale dei lincei, 1994): 57-65. for the most recent developments in the debate, see the articles by antonio scolari, "a proposito di sbn," giovanna mazzola merola, "lo studio sull'evoluzione del servizio bibliotecario nazionale," and claudio leombroni, "sbn un bilancio per il futuro," bollettino aib 37, no. 4 (1997): 437-66.
3. further information on sbn can be found at www.iccu.sbn.it/sbn.htm, accessed oct. 27, 1999, where the collective catalogue of participating libraries is also accessible.
4. catalogo italiano dei periodici (acnp), www.cib.unibo.it/cataloghi/infoacnp.htm, accessed sept. 19, 1999.
5. there is a considerable literature on the european commission's "libraries programme": for a summary of projects in the programme, see telematics for libraries: synopses of projects (luxembourg: office for official publications of european communities, 1998). updated information on the latest version of the programme can be found at www.echo.lu/digicult, accessed oct. 26, 1999. on italian participation in the programme see: "ministero per i beni culturali e ambientali, l'osservatorio dei programmi internazionali delle biblioteche 1995-1998" (roma: mbac, 1999).
6. associazione italiana biblioteche (aib), xliv congresso nazionale aib. genova, 1998: www.aib.it/aib/congr/co98univ.htm, accessed oct. 27, 1999.
7. more information about cibit can be found at www.ilc.pi.cnr.it/pesystem/19.htm, accessed may 19, 2000.
8. progetto eden: deposito legale editoria elettronica nazionale, www.bncf.firenze.sbn.it/progetti.html, accessed sept. 29, 1999.
9. more information about essper may be found at www.liuc.it/biblio/essper/default.htm, accessed may 19, 2000.
10. barbara mcfadden allen and arnold hirshon, "hanging together to avoid hanging separately: opportunities for academic libraries and consortia," information technology and libraries 17, no. 1 (1998): 37-44.
11. the infer web page can be found on the università di roma i site, www.uniroma1.it/infer, accessed may 19, 1999.
12. le monde, 22 jan. 1999: a whole page is devoted to this topic. see especially the article titled "les journaux scientifiques menacés par la concurrence d'internet." accessed feb. 4, 1999, www.lemonde.fr/nvtechno/branche/journo/index.html. the point was taken up again by el país, 27 jan. 1999; see the article titled "las revistas científicas, amenazadas por internet."
13. the letter, signed by werner reinhardt, dbi president, is available at www.ub.uni-siegen.de/pub/misc/offener_brief-engl.pdf, accessed feb. 4, 1999.

public libraries leading the way

harnessing the power of orcam

mary howard

information technology and libraries | september 2020
https://doi.org/10.6017/ital.v39i3.12637

mary howard (mhoward@sccl.lib.mi.us) is reference librarian, library for assistive media and talking books (lamtb), at the st. clair county library, port huron, michigan. © 2020.

library for assistive media and talking books (lamtb) services are located at the main branch of the st. clair county library system. lamtb facilitates resources and technologies for residents of all ages who have visual, physical, and/or reading limitations that prevent them from using traditional print materials.
operating out of port huron, michigan, we encounter many instances where we need to provide assistance above and beyond what a basic library may offer. we host talking book services, which provide free players, cassettes, braille titles, and downloads to users who are vision or mobility impaired. we also have a large, stationary kurzweil reading machine that converts print to speech, video-enhanced magnifiers, and large print books. we also provide home delivery service for patrons who are unable to travel to branches.

the library has been searching for a more technology-forward focus for our patrons. the state's talking books center in lansing set up an educational meeting at the library of michigan in 2018 to see a live demonstration of the orcam myeye reader. this was the innovation we were seeking, and i was thoroughly impressed with the compact and powerful design of the reader, the ease of use, and the stunningly accurate feedback provided by this ai reading assistive device. users are able to read with minimal setup and total control.

orcam readers are lightweight, easily maneuverable assistive technology devices for users who are blind, visually impaired, or have a reading disability, including children, adults, and the elderly. the device automatically reads any printed text: newspapers, money, books, menus, labels on consumer products, text on screens or smartphones, etc. the orcam reader will repeat back any text immediately and is fit for all ages and abilities. orcam works with english, spanish, and french and can identify money and other business and household items. it can be placed near either the left or right ear: it attaches to either the left or right temple of your glasses using a magnetic docking device. users can easily adjust the volume and speed of the read text. that a diverse group of users with different needs can each use the reader as they like is one of its more impressive features.
changing most settings is normally facilitated with just a finger swipe on the orcam device. the mission of orcam is to develop a "portable, wearable visual system for blind and visually impaired persons, via the use of artificial computer intelligence and augmented reality." by offering these devices to our sight, mobility, or otherwise impaired patrons we open up the world of literacy, discovery, and education. some of our users are not able to read in any other fashion, and the orcam provides a much-needed boost to their learning profile.

we secured a grant from the institute of museum and library services (imls) for the purchase of the readers (cfda 45.310). we also worked with orcam to get lower pricing for these units. normally they retail for $3,500 but we were able to move this to the lower price point of $3,000. we also were awarded a $22,106 improving access to information grant from the library of michigan to fund the entire purchase. without this funding stream we would not have been able to secure the orcam. however, if you have veterans in your service area, please contact the company: va health coverage is available for low-vision or legally blind veterans, who may qualify to receive an orcam device fully paid for by the va. please visit https://orcam.com/en/veterans for more information.

figure 1. close-up of the orcam device.

the grant was initially set to run from september 2019 to september 2020. we purchased six orcam readers for our library users and planned to rotate them among our twelve branches throughout this grant cycle. however, due to the pandemic and out of safety concerns for staff and visitors, our library was closed from march 23 to june 15 and we were only able to offer the readers to the public at six branches.
as of july 14, 2020, we are projecting that we may open to the public in september, but covid-19 issues could halt that. we have had to make arrangements with the grantor to extend the period for the usage of the orcam from september to december. this will make up for some of the lost time and open a path for the other six libraries to have their turn offering the orcam to their patrons.

the interesting aspect of this is that we now have to take our technology profile even further by offering remote training to prospective orcam users. thankfully, the design and rugged housing of the reader make it easy to clean and maintain, but social distancing can prove to be intrusive for training. to set up a user you need to be within a foot or two of them in order to get them used to how the orcam reads. there is a lot of directing involved and close contact between the user and the instructor. as a workaround, we will provide instruction at a distance, combining in-person and remote training. orcam also has a vast array of instructional videos that we will have cued up for users. we have had over 150 residents attend presentations, demonstrations, and talks on the orcam. i anticipate that this number will not be achieved for the second round; however, we may be more successful in our online presence since we can add the instruction to our youtube page, offer segments on facebook and other social media, and provide film clips for our webpage.

the situation has been difficult, but it has prompted lamtb services to think about how we should be working to provide better and more remote service to our users. since we cover over 800 square miles in the county, becoming more adaptable in serving our patrons has become a paramount area of work for the library.
the orcam will bring about a new way of remote training for our patrons, which will raise awareness of the reader and how it can be beneficial to users. the st. clair county library system would like to thank the institute of museum and library services for supporting this program. the views, findings, conclusions or recommendations expressed in this article do not necessarily represent those of the institute of museum and library services.

the minnesota union list of serials
audrey n. grosch: university of minnesota libraries

this paper describes development of a marc serials format union catalog of serials called the minnesota union list of serials (muls). the preliminary edition, published august 1972, contains over 37,000 main entries in 1,566 text pages produced through photocomposition in news gothic type font using the full marc character set. the total number of entries is over 59,000, including cross-references. conceptualization and scope of the system as well as its design, data conversion, computer and programming support, photocomposition, costs, and problems are discussed.

introduction

this paper has been prepared to inform the profession of the development of the minnesota union list of serials (muls), the data base of which represents significant differences from those of previously reported union lists. as one can see from figure 1, a muls preliminary edition sample page, muls is a full bibliographic union serials catalog. it uses the marc serials format as its formal structure. the preliminary edition contained 37,289 university of minnesota serial titles. the file now includes holdings of the minneapolis public library, eight private colleges, and ten minnesota state agency libraries including the minnesota historical society. augmentation of the data is continuing so that all minnesota academic, large public, and selected special libraries will be included in the coming two years.
several years ago, the university of minnesota began investigating the development of its own unified serials catalog as a first stage in the development of an automated serials management system. at that time many libraries in the state had developed their own serials lists, and regional consortia had created lists of their members' holdings. these resources, coupled with networking through the minitex (minnesota inter-library teletype exchange) program, made possible the muls initial development. the minitex program links together seventy-two libraries via teletype to the university of minnesota for rapid interchange of library materials. state supported academic institutions, public libraries, and private colleges in the local metropolitan area also participate in this program. in spring 1971, it became apparent that a union list had become a necessity if the expanding minitex program and the university were to provide maximum benefit for minnesota's library users.

journal of library automation vol. 6/3 september 1973

fig. 1. minnesota union list of serials. preliminary edition sample page.

our state's library environment features:
• one large academic research library, the university of minnesota;
• many smaller academic libraries in the 75,000-250,000 volume class;
• two large public library systems, the minneapolis public library and the st. paul public library;
• one private research library, the james jerome hill reference library, which serves as a nucleus for the metropolitan area private college library network called clic (cooperating libraries in consortium); and
• some library automation activities among these libraries, with the largest automation staff and activity at the university of minnesota.

the parallel developments of networking and systems design at the university made possible the proposal to the minitex program advisory board for funds to develop the system and publish the first union list. in summer 1971 this program received approval and work was begun in mid-august. on september 1, 1972, the preliminary edition of muls was published and distributed to participating university and minitex network members. following is a report of this work, its results and problems.

program scope

obviously, to create a system capable of eventually including library holdings state-wide and to convert such data requires definition of an initial and future scope. the initial scope was defined as:
• conversion of the university of minnesota libraries' actively received titles, departmental libraries' complete titles, and inactive titles in the libraries' periodical division.
• development of a batch input tape software system capable of supporting initial conversion, correction, and updating to produce the preliminary edition of muls.
the future scope would potentially include the augmentation of the muls data base with the following non-university of minnesota holdings:
a. eight metropolitan area private colleges in the clic network, with production of a clic union list for their members' use;
b. minneapolis public library serials and unique titles from other public libraries of over 50,000 volumes, with production of a public libraries union list;
c. holdings of all state agencies, which would include the minnesota historical society, state law library, state department of health, and legislative reference library, with production of a union list for their internal resource sharing;
d. state supported colleges' holdings;
e. university of minnesota inactive general collection serials, thereby completing access to the state's largest research library;
f. private college holdings outside of the metropolitan area clic institutions; and
g. selected special libraries' holdings.

at the moment of this writing we have the initial scope completed, are just completing a, b, and c, and have planned work on d and e for 1973. in view of this scope the initial muls magnetic tape system was based on the marc format to permit:
• publication of a photocomposed or line-printer-method full union list;
• publication of regional combination or individual library lists using an ibm 1403 line printer equipped with the ala graphic print train;
• storage of complete and verified information on each serial as known, together with the source of the cataloging data;
• extraction of the data via individual libraries to assist those wishing to develop automated serials management systems including check-in, claiming, binding, etc.;
• conversion of the file to other storage media such as disk;
• fulfillment of the smallest to the largest libraries' needs for bibliographic detail; and
• extension to a fully automated resource sharing system which would further improve the benefits of library cooperation.

with this picture of the program scope, the design factors, data conversion, computer system, programs, photocomposition, costs, and problems will be described below.

system design

the easiest way to look at the muls design is to gain an understanding of the muls marc record content as shown in table 1. this record is the basic unit which is entered, including all associated cross-references or added entries to be made. it in turn generates each of these secondary entries in the file. in this brief description we will assume the reader is familiar with the marc serials record as described in serials: a marc format: preliminary edition and its addendum no. 1 (refs. 1, 2). there are some differences between the muls format and the lc marc format, most importantly the addition of a sort field (tag 249) and the subfield arrangement for holding fields (tag 850). other variations have been indicated in table 1, which uses the same organization as that contained in the lc format description referred to above.

table 1. muls marc record content
a. leader
   1. logical record length: five characters
   2. record status = 1 for marc record; = 4 for added entry (aet) or cross-reference (xrf) entry
   3. legend
      a. type of record: not used (blank)
      b. bibliographic level = s
      c. two blank characters
   4. indicator count = 2
   5. subfield code count = 2
   6. base address of data = 5 characters
   7. sequence number = 7 characters
b. record directory
   1. variable field tag = 3 characters
   2. field length = 4 characters
   3. starting character position = 5 characters
c. control fields: 008 fixed length data elements
   1. date typed
   2. publication status
   5. country of publication code
   9. type of serial designator
   10. physical medium designator
   12. form of content
       a. type of material code
       b. nature of contents codes
   13. government publication indicator
   14. conference publication designator
   20. language code
   21. modified record designator
   22. cataloging source code
d. variable fields
   1. indicators
      in general we have not followed lc in the use of indicators. one exception is the use of filing indicator for the 100 and 200 series tags, which we implemented before seeing that this feature was provided in the addendum no. 1 to the lc format. therefore, the indicators except as above are both blank.
   2. subfield codes
      except for the holdings statements (tag 850) we have generally followed lc philosophy. for tag 850 we now precede the $a subfield with a $z subfield, suppressed on printing, which contains the 4 digit number identifying each specific holding library which is also found at the end of the 008 field.
   3. variable fields currently used
      010 lc card number
      022 standard serial number
      041 languages
      100 main entry: personal name
      110 main entry: corporate name
      111 main entry: conference or meeting
      200 title as it appears on piece
      245 full title
      249 sort key from 100 or 200 series tags stored in collating codes and limited to 120 characters
      250 edition statement
      260 imprint
      500 general note
      501 bound with note
      515 note for explanation of dates, volumes, etc.
      525 supplement note
      555 cumulative index note
      730 added entry
      850 holdings
      950 cross-reference tracing
note: we have followed lc numbering for the above data elements, and have substituted blanks on the tape record for those elements omitted. we have also expanded the 008 field to include a variable number of 4 character elements which contain the index number of each holdings location listed in the z subfield of tag 850.

fig. 2. master-file listing.

figure 2 shows a page from a master-file listing. note entry no. 2074000. this listing is formatted with the sequence number of the record appearing on the first line, followed by the bibliographic level and the remaining leader information. next the record directory entries are found for fields 008-950 as applicable. on the next line are the 008 fixed length data elements with the last four digits the holdings location index number which is the same as the suppressed $z subfield in the 850 field. then the variable fields are listed in numeric sequence. note the subfields as indicated by $z, $b, etc. the number to the left of each $a is the marc tag number. another departure from marc is to store the call number as a subfield of the holdings statement since it may vary among participating libraries.
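
as a rough illustration of the record directory layout given in table 1 (a 3-character tag, a 4-character field length, and a 5-character starting position), the following python sketch splits a directory string into its 12-character entries and checks that each field begins where the previous one ends. the class and function names are hypothetical, and the sample string is an assumption reconstructed from the directory of entry no. 2074000 in the master-file listing, with the starting position of tag 249 inferred from its neighbors. it is not the original cobol logic.

```python
from typing import List, NamedTuple


class DirectoryEntry(NamedTuple):
    tag: str      # 3-character variable field tag, e.g. "245"
    length: int   # 4-character field length
    start: int    # 5-character starting character position


ENTRY_WIDTH = 3 + 4 + 5  # widths taken from table 1, section b


def parse_directory(directory: str) -> List[DirectoryEntry]:
    """Split a fixed-width record directory into (tag, length, start) entries."""
    if len(directory) % ENTRY_WIDTH != 0:
        raise ValueError("directory length is not a multiple of 12 characters")
    entries = []
    for offset in range(0, len(directory), ENTRY_WIDTH):
        chunk = directory[offset:offset + ENTRY_WIDTH]
        entries.append(DirectoryEntry(chunk[0:3], int(chunk[3:7]), int(chunk[7:12])))
    return entries


def is_contiguous(entries: List[DirectoryEntry]) -> bool:
    """Check that each field begins where the previous field ends."""
    return all(a.start + a.length == b.start for a, b in zip(entries, entries[1:]))


# Hypothetical sample: four consecutive directory entries (tags 245, 249, 260, 500).
SAMPLE = "245003800113" "249008200151" "260001100233" "500044000244"

entries = parse_directory(SAMPLE)
for entry in entries:
    print(entry.tag, entry.length, entry.start)
print(is_contiguous(entries))
```

a contiguity check of this kind is one simple way to spot a damaged directory before any variable field is read.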
to contrast how the information is stored and how it appears when published, the same record is shown in the left column of figure 1. also, the next record shown is generated from an added entry tag 730 in this parent record. we have prepared a detailed coding manual which is followed by our coders; this document presents various examples of conditions and details the full system structural requirements. these changes in the format were made to simplify wherever possible, to provide for conditions which the original lc format did not cover, and to preserve the marc structure with full text. with the exception of subject headings, all bibliographic text is stored. other marc tags may be added to the system at any time. the initial system was tape-based, as our computer system at that time did not have uncommitted disk drives. also, we needed to gain some detailed knowledge of the file and record characteristics to most effectively design the disk-based system. this knowledge could be gained easily after some basic data were stored in the system. since programmer time was our most precious commodity, this phased approach was used to: (1) achieve enough support on the tape system to permit publication of the preliminary edition of muls while gathering file and data characteristics; and (2) bring into operation a disk-based system with completely automatic added-entry correction and generation, coupled with very flexible correction procedures.

data conversion

various methods of data conversion were investigated. two requirements seemed obvious in our system: compilation of data on a code sheet and efficient, accurate keyboarding. further, since the marc character set was being used, any potential device had to provide a minimal keying situation to accommodate this character set. compilation of data on a code sheet was necessary because multiple files in multiple locations would be checked to gather all of the information.
keyboarding had to be efficient as it was initially estimated that some 25 million characters would be entered before we were ready to publish the union list. the ibm model v record only magnetic tape selectric typewriter (mt/st) was chosen as offering the best approach for high volume, short duration use. three machines, each equipped with the special marc element and key buttons, were leased. typists easily corrected their discovered errors on these units. each typist followed detailed typing instructions and, after mastering the coding manual practices and procedures, was a trained coder. during july/august 1971 all training aids were prepared, forms designed, and staff recruited. the initial staff complement received their training during the last two weeks of august. during september the data gathering staff was brought to full strength and consisted of:
project director: 1 fte
editors (librarians-library assistants): 4 fte
senior clerk-typists: 6 fte
clerks (students): 12 fte
full-time equivalents are used as staff were in many cases part time or temporarily lent to the project. during the period august 1971-june 15, 1972, which comprised the total data preparation time for the preliminary edition, five librarians and thirty-five students actually were trained and participated in the project. it took about six weeks to bring most of the staff to an acceptable performance level. some students found the work too complex or detailed and voluntarily left the project. one clerk-typist did not gain sufficient proficiency to pass out of a trainee status and was terminated at the end of her probation period. thereafter, with a staff of this size, performance problems were minimal.

the data to be included in the preliminary edition comprised the university's
• currently received, centrally recorded serials (20,000 titles);
• inactive periodical division titles (8,000 titles);
• coordinate campus locations of the university (4,000 titles);
• complete departmental library titles excluding the bio-medical library (6,000 titles).
the bio-medical library was excluded due to its present mechanized serials system which would be used to produce a separate serials list, issued as volume 3 of muls to the university and the minitex participating libraries. this separate publication was necessary due to the short time in which the initial data were to be collected. however, the bio-medical library is now also being included in the body of the muls data base. these four categories of serials necessitated quite different approaches dependent upon the available check-in files, shelflists, or catalogs. for example, to capture data on the currently received, centrally recorded titles we photocopied the kardex drawers from the serial check-in file maintained in our headquarters library. these running titles were checked against the official card catalog in the library. if the title was found, the bibliographic information was transcribed, together with all kardex and catalog locations. if not, the kardex data were copied onto a code sheet for subsequent verification together with its listed location. about 5 percent of the time the photocopied sheet was illegible. these entries had to be transcribed from the check-in file, verified, and then passed on to the next step. when bibliographic data had been assembled on the code sheets they were edited in groups, each group accompanied by its photocopied sheet. corrections were entered by editors, the catalog or check-in file was rechecked as necessary, and then the sheets were sorted by holding location.
next all holdings information was procured from the remote location to make sure it was the most reliable information. finally, the sheets were returned to be rechecked and typed. "mopping up" occurred at each holding location to encode inactive titles and uncataloged serials. when a title could not be verified, the piece itself was used to develop the main entry, added entries, and other pertinent cataloging information. similar procedures were used on the inactive periodical division shelflist. departmental library locations involved the use of shelf-locator visible indexes and shelflists, coupled with check-in files and branch catalogs. coordinate campus locations outside the twin cities metropolitan area required the checking of title/holdings listings provided by these campus libraries. many entry problems resulted, because variant cataloging approaches were used in many of these libraries. typing and subsequent input were done as coding sheets became ready for keyboarding and were therefore in random order. over 40,000 individual records were typed, each averaging about 480 characters (approximately 18 million keystrokes). during the period february-june 15, 1972, when the complete file was proofread from the thirteen volume master-file listing, another 5 million keystrokes were required to delete, to reenter, and to correct entries and associated cross-references. our final keyboarding stroke count was exceedingly close to our original estimate of 25 million characters. the proofreading portion of the data conversion took twice as long as originally anticipated, causing a delay of two months in photocomposition scheduling. proofreading was completed on june 15, 1972, and on the following monday the photocomposition vendor received the final output tape. due to some format changes and continued systems problems the photocomposition output was not received until july 21.
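
the keystroke figures quoted in this section can be cross-checked with simple arithmetic. the sketch below is illustrative only: the variable names are assumptions, and the inputs are the rounded figures given in the text, so the results agree with the stated totals only approximately.

```python
# Rounded figures quoted in the text; the variable names are illustrative.
RECORDS_TYPED = 40_000                  # "over 40,000 individual records"
AVG_CHARS_PER_RECORD = 480              # "averaging about 480 characters"
STATED_INITIAL_KEYSTROKES = 18_000_000  # stated total for initial typing
CORRECTION_KEYSTROKES = 5_000_000       # stated total for proofreading corrections
ORIGINAL_ESTIMATE = 25_000_000          # estimate made before the project began

# Product of the rounded record count and the rounded average record length.
computed_initial = RECORDS_TYPED * AVG_CHARS_PER_RECORD

# Stated initial typing plus stated corrections, compared with the estimate.
stated_total = STATED_INITIAL_KEYSTROKES + CORRECTION_KEYSTROKES
share_of_estimate = stated_total / ORIGINAL_ESTIMATE

print(f"records x average length: {computed_initial:,} characters")
print(f"stated typing plus corrections: {stated_total:,} keystrokes")
print(f"share of the original estimate: {share_of_estimate:.0%}")
```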
printing and binding followed and on august 28 the preliminary edition, consisting of 1,566 text pages in two class a bound volumes, was ready for distribution.

computer system

two computer systems were used in muls production. one system was used to convert mt/st cassette tapes and involved initially an ibm 2495 cassette converter coupled to an ibm 360/20 system. this configuration was replaced by off-line tape conversion using a data action tape pooler and the same computer for code conversion and record blocking. two-hour to one-day service was provided by this service center, located in a local insurance company. the raw data tape resulting from the above process then required processing on the second computer system, an ibm 360/50 at the university of minnesota. all programs are written for the cobol f compiler and operate under os/mft using 1600 cpi magnetic tape. two 80k core partitions are required for the updating and printing programs. the ala graphic print train is used to print the file and control listings. figure 2 was printed with this character set.

programs

muls programs for the present tape system were conceived as two sets: (1) conversion, file creation, and updating; and (2) printing functions. the first set performs the following functions:
• identification and checking of fields for validity, tagging, and structure from the raw input tape;
• creation of marc-type main entries;
• creation of secondary entries generated from the added entry (tag 730) and cross-reference (tag 950) fields;
• creation of correction and deletion entries;
• sorting of main entries and the generated secondary entries in alphabetical sequence;
• sorting of correction and deletion entries in sequence number order;
• addition of new records;
• deletion of an old record;
• addition of a new variable field, including holdings statements;
• substitution of data in a variable field;
• deletion of a variable field;
• production of a transaction file reflecting changes to the data base; and
• generation of a new master tape, which can include resequencing the entire file and/or producing a work list of the file.
however, any change in a 100, 200, 730, or 950 tag requires deletion of the complete record with its secondary entries, and reentry of the record in its changed form. this is because a two-pass update would be required in the tape system to automatically correct secondary entries as well as to generate them. the second set of programs performs the following functions:
• printing of a formatted work list selectively by location or combination of locations, diacritical printing preceding the character to which it applies; and
• printing of a conventional union list format which closely duplicates the design of the photocomposed page in figure 1. selectivity by location or groups of locations is present and all diacritical characters are overprinted as in the photocomposed list.

photocomposition

the preliminary edition of muls, as shown in figure 1, was photocomposed by a twin cities firm using a harris fototronic crt composition system and an ibm 370/145 computer system. we chose the lowest bidder, which was fortunately a local firm. the bid required the vendor to program from our marc format master file tape an input tape for the photocomposer which would produce the specified format, using the marc character set in a font to be chosen from sample text pages. the vendor's bid included programming, composing, and procurement of several of the characters used by marc which were not in his current font repertoire. a test tape was provided to the vendor for his developmental use, together with documentation on the marc muls system.
after seeing the initial result of our specified format we were not pleased with it. the problem was compounded by the fact that:
• the vendor had not followed some of the suggestions;
• the vendor had made some unspecified changes;
• the program had injected some data errors and other unacceptable conditions; and
• the library, in its total lack of experience with this variable density form of display, had no idea of the real effect of its proposed format in getting efficient character density coupled with attractiveness.
each of the design problems was looked at in order to adjust character size, column length or width, continuation line placement, display form (bold, regular, oblique), and relative data element placement. four iterations were required to finally produce the format shown in figure 1. as a result, our photocomposition and printing costs were half what they would have been had the original format been used. style and readability also improved dramatically. the choice of type font was made by comparing sample pages in both serif and sans serif styles, including times roman and other well-known fonts. various library staff members were asked to vote on their preferred font. news gothic was an overwhelming favorite among both public and technical services oriented librarians. the photocomposition vendor had produced many catalogs and books using other special alphabets and characters, but had not previously done any catalog from a marc format tape. this background gave them a high degree of expertise in handling our special character requirements, but the lack of marc format experience added some developmental problems. except for superscripts, subscripts, and the underline, all marc characters have been needed to display the text. our advice to those considering catalog photocomposition is to request bids, as the price on this service has continued to drop.
the page price will depend upon the services performed. in our case the vendor handled all composition programming. one can estimate that at a minimum 40 to 50 percent of the page charge would be involved in this service. also, the size of the job will cause a variance in the price a vendor will quote: the larger the number of pages, the cheaper the cost per page. on a very large application it may be to the library's advantage, if resources permit, to train its own programmer to program the composing device. however, we feel that our needs were best served by contracting for this support, as our programming staff was limited and did not have any prior composing-machine experience.
costs
the expenditure to produce a computer-based serials catalog will vary depending upon salary and equipment rates and the conditions found in the library system. in the case of muls, condition of the files used ranged from disastrous to excellent, yet with only fragmentary information in each file. moreover, entry forms varied greatly among the many check-in, shelflist, and catalog files. therefore, data collection was much more expensive than it would have been had we keyboarded directly from one existing file of data.
to present some idea of costs for others planning similar activities, we have developed some average costing information from our expenditures. each main entry in muls costs $2.81 on average, figuring all known actual charges or subsidized costs. this main entry cost includes all associated secondary entries, of which about one is generated per 1.5 main entries. this $2.81 breaks down to approximately $1.00 for design, programming, and administrative costs; $1.40 for data conversion; and $.41 for photocomposition, final printing and binding.
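The breakdown just quoted can be checked with a few lines of arithmetic. The following is a minimal sketch in Python that sums the three reported components and expresses each as a share of the $2.81 average. The function and variable names are illustrative assumptions, and `project_cost` accepts any hypothetical number of main entries; none of this comes from the MULS programs themselves.

```python
from decimal import Decimal

# Per-main-entry cost components as reported in the text (US dollars).
COST_COMPONENTS = {
    "design, programming, administration": Decimal("1.00"),
    "data conversion": Decimal("1.40"),
    "photocomposition, printing, binding": Decimal("0.41"),
}


def cost_per_entry(components=COST_COMPONENTS):
    """Sum the components to get the average cost per main entry."""
    return sum(components.values(), Decimal("0"))


def cost_shares(components=COST_COMPONENTS):
    """Return each component as a percentage of the total, to 0.1 percent."""
    total = cost_per_entry(components)
    return {
        name: (value / total * 100).quantize(Decimal("0.1"))
        for name, value in components.items()
    }


def project_cost(main_entries, components=COST_COMPONENTS):
    """Estimate total cost for a hypothetical number of main entries."""
    return cost_per_entry(components) * main_entries


if __name__ == "__main__":
    print(f"cost per main entry: ${cost_per_entry()}")
    for name, share in cost_shares().items():
        print(f"{name}: {share}%")
```

On these figures data conversion accounts for roughly half of the per-entry cost, which is consistent with the remark above that collecting data from many inconsistent files was the expensive part of the work.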
let us look at some specific items which figure into this average cost per record to give the reader some idea of what is reasonable to expect in a project of this sort. a good example is conversion of mt/st cassette tapes to computer compatible magnetic tape, including code conversion and blocking of the records. our conversion cost varied from $.50 to $2.00 per cassette. this variance was caused by a change from online to off-line conversion and the problem of handling cassette tapes which did not have the proper stop code at their end. our actual billed average throughout the whole project was $.73 per cassette. if no tapes had been prepared without stop codes and if total off-line conversion had been used, our average would have been $.50 per cassette. a typical cassette tape averaged seventy-five new marc entries, so this was a very economical charge for this method.
another specific cost to examine is computer time. on our ibm 360/50 system, time is billed as time on/off the system and not according to some calculation of cpu/channel/storage/peripheral device usage. normally an internal university rate is a great deal cheaper than a commercial rate for the same equipment. however, the billing method used in our system has probably increased our costs for computer time over the cpu time method of billing, since the user is at the mercy of other jobs contending for the system at the same time, i.e., waiting for his processing turn. this has had a noticeable effect in our case; run times to update the file have varied from four to six hours of machine on/off time, almost independent of the number of transactions being processed.
photocomposition page rates over the last few years have been dropping as competition in this area has flourished. two years ago it was common to receive quotes of $6.00 per page or even higher.
most prices we received were under this figure; but at the time our contract was signed, our successful bidder, who also was our lowest bidder, quoted $2.60 per page. this included full programming support to convert our marc format tape for creation of the photocomposer input tape. today rates much lower than this can be found. moreover, rates under $1.00 per page can be obtained if the customer is able to create his own input or driver tape for the photocomposition device, making this method considerably more attractive for even low volume per-page printing. in the case of muls, one photocomposed page equals ten double column computer printed pages without photoreduction. photoreduction can cut computer output pages by about one-third, yet obviously not to the limit achieved through the photocomposition method. therefore, considerable printing costs can be saved, depending upon the number of copies of each page printed.
problems
the problems encountered during this project and in its present daily operation have been, for the most part, those commonly found in any large scale project. the large volume of data, less than ideal computer environment, condition of the original data, and large staff required to produce this effort all magnify many problems which seem unimportant in a small or short term project. in general these problems fall into the following categories: (1) data handling and bibliographic; (2) communications; (3) estimating; and (4) hardware or computer related problems.
data handling and bibliographic. those who create and use research library catalogs can appreciate the formidable physical problem in any data conversion activity. a half century or more of cataloging variations must be brought together; mistakes in the original data, differences in format of cards, and spelling or usage inconsistencies must be weeded out.
couple this situation with a new staff, large in number but containing few professionals. the result could be disastrous if proper decision-making and problem identification did not occur. not knowing the magnitude of these problems we decided on almost verbatim transcription of records but spelling out all abbreviated words in any filing field. when our first file listing appeared-some 40,000 main entries plus 30,000 secondary entries-we saw that the filing arrangement was very poor due mainly to spelling variants, failure to consistently follow instructions to spell out abbreviated terms (which somehow escaped editing), and different entry forms for the same body. transcription of data from the original source was very accurate but because of these problems in the original data our proofreading resulted in some change occurring in about 10,000 of these 70,000 records. the use of punctuation marks in main entries varied so much that some corporate entries were filing in five or six separate groups in the list, each separated perhaps by several pages. the great shocker was the arrangement under the united states, as some coders had copied exactly from the card without spelling out u.s. and inserting a period and space. about a dozen entries had failed to be caught by the editors and appeared as one block. then, to compound the problem, others spelled out united states but forgot to insert a period after it. moreover, very early in the project the typists incorrectly inserted 2 spaces after the period. in all, there were six forms to the u.s. entries alone, with only one being correct. this lesson taught us that no matter how well instructions and examples are prepared misunderstanding can result; and, of course, editors and others will not catch all possible errors. however, these major errors were eliminated before publication. 
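The filing problem described above, where one jurisdiction appears under several spellings and punctuation patterns and therefore files in separate blocks, is the kind of inconsistency a small normalization pass can surface before proofreading. The following is a minimal sketch in Python, not the MULS editing procedure. The variant headings are hypothetical examples rather than the six forms actually found, and the target form "United States. " followed by one space is inferred from the description above.

```python
import re

# Leading forms of the jurisdiction that should file together. The lookaheads
# keep unrelated headings such as "U.S.S.R." or "United Statesman" untouched.
# This pattern is an illustrative assumption, not the project's editing rules.
_US_PREFIX = re.compile(
    r"^\s*(?:u\.\s*s\.(?![a-z]\.)|united\s+states(?![a-z]))\s*\.?\s*",
    re.IGNORECASE,
)


def normalize_heading(heading: str) -> str:
    """Return the heading with a uniform 'United States. ' prefix."""
    match = _US_PREFIX.match(heading)
    if match is None:
        # Not a U.S. heading: only collapse runs of whitespace.
        return re.sub(r"\s+", " ", heading).strip()
    rest = re.sub(r"\s+", " ", heading[match.end():]).strip()
    return f"United States. {rest}" if rest else "United States."


def filing_groups(headings):
    """Group raw headings by normalized form to expose split filing blocks."""
    groups = {}
    for heading in headings:
        groups.setdefault(normalize_heading(heading), []).append(heading)
    return groups


if __name__ == "__main__":
    sample = [
        "U.S. Bureau of Mines",
        "U. S. Bureau of Mines",
        "United States Bureau of Mines",
        "United States.  Bureau of Mines",
    ]
    for key, raw_forms in filing_groups(sample).items():
        print(key, "<-", raw_forms)
```

A pass like this only flags candidates. As the experience above shows, variant forms still need an editor's decision before records are changed.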
with the large volume of data and limited funds our conversion process was quite streamlined, with most of the error checking occurring after the data were on tape and displayed in their proper relation to other records. few keyboarding errors occurred which were not caught at typing. the predominant errors resided in the nature of the original data, or in the lack of some piece of information from three or four different files which may have been checked in building the full record.
communications. in any large project effective communication is necessary to improve quality of work and progress toward completion of the scheduled task. frequently scheduled meetings of the staff were used to inform all project members of decisions, receive their suggestions and criticisms, and develop coordinated work assignments among the teams of each editor/librarian. all typing personnel were trained as coders and were periodically relieved of typing to code. this gave them an insight into detecting problems for referral to the professional staff, renewed their knowledge of proper format, and provided more variety in their work. all project members were capable of performing tasks of coding, control list checking, and proofreading. the most capable clerical staff also assisted the editors in editorial work. it was felt that our use of the team approach, unified training, frequent staff meetings, and very detailed written documentation served to channel communication and so minimized these problems once the first few months of the project had passed.
estimating. in most data conversion work accurate estimating is required on many matters. some estimates we made were very accurate, such as basic time and staff to complete initial coding, typing time and staff, and supplies needed. however, other estimates were not very accurate.
for example, the time to edit and correct the file once basic data collection was completed was double our original estimate and required more typing than anticipated. this caused the publication schedule to be delayed two months. difficulties at the computer center and at the photocomposition vendor caused another two months' delay, even though it is doubtful that our photocomposition firm would have been ready had we met our original estimate. our original target was publication not later than two months after the basic data collection period of six months, i.e., in eight months. however, on a project of this size, and with the addition of about 7,000 more titles than we had originally estimated, we did not feel that the fifty-four weeks really required was excessive. computer time was also difficult to estimate because of the time on/off the system. depending upon the nature of the other jobs on the computer, this time varied greatly, for updating runs were almost independent of the number of transactions. there is always room for improvement in estimating, and, obviously, we have learned many things from this experience to use in further work.
hardware/computer center. our largest problem was obtaining firm computer scheduling commitments on our campus ibm 360/50 computer, which serves the business functions of the university. all other campus computing facilities use control data equipment, which is six-bit character, word oriented. with the extended character set requirement and the availability of the ibm 360/50, which we were already using for other work with the ala graphic print train, it was natural for us to choose this system. current facilities are now satisfactory to permit our tape batch system operation and the development of our new disk-based batch system. tape pooling operations for the mt/st have caused some problems due to equipment changes at our vendor.
we have now switched to a new conversion source as our former vendor upgraded his data entry system to key-to-disk. the three mt/st typewriters we leased performed quite reliably, but one machine seemed to have more down time than the others. now that our typing load is down, we have cancelled two model v's and will maintain two machines. we are now choosing a new system for key input to cassette tape. on the new equipment we will do our proofreading and initial correction off-line, resulting in a further cost saving. this was not possible previously as our typing load required two-shift operation on all machines during the preliminary edition preparation time.
conclusion
a great amount of effort has been expended to achieve a unified serials data base to serve minnesota's libraries. it is our hope that this system can continue to be developed in as flexible a way as possible so that future needs can be supported through the system. the only limit to identifying the future needs to be met through access to this data base is the imagination of those involved in networking. of course, we would hope that one day our data could benefit the development of other similar programs in other states and, perhaps more importantly, help in achieving a true national serials data base.
acknowledgment
many staff members at the university and other institutions contributed their invaluable counsel as we have proceeded on the development of the system and the data base. the muls project staff particularly receives our deep gratitude for its yeoman effort. special commendation is due mr. don norris for systems design and principal programming support. mr. carl sandberg, who wrote all printer output programs, also contributed invaluable assistance to the project. the minitex program and university library administration receive our appreciation for placing their confidence in the systems division.
muls and its support system are truly a product resulting from the coordinated concern and interest of the aforementioned individuals and groups.
references
1. u.s. library of congress. information systems office. serials: a marc format. preliminary edition. washington, d.c.: library of congress, 1970 (l.c. 73-606842).
2. u.s. library of congress. marc development office. serials: a marc format. addendum no. 1. washington, d.c.: library of congress, june 1971.
generating collaborative systems for digital libraries | malizia, bottoni, and levialdi 171
from previous experience and from research in software engineering. wasted effort and poor interoperability can therefore ensue, raising the costs of dls and jeopardizing the fluidity of information assets in the future. in addition, there is a need for modeling services and data structures as highlighted in the “digital library reference model” proposed by the delos eu network of excellence (also called the “delos manifesto”);2 in fact, the distribution of dl services over digital networks, typically accessed through web browsers or dedicated clients, makes the whole theme of interaction between users important, for both individual usage and remote collaboration. designing and modeling such interactions call for considerations pertaining to the fields of human–computer interaction (hci) and computer-supported cooperative work (cscw). as an example, scenario-based or activity-based approaches developed in the hci area can be exploited in dl design.
to meet these needs we developed cradle (cooperative-relational approach to digital library environments),3 a metamodel-based digital library management system (dlms) supporting collaboration in the design, development, and use of dls, exploiting patterns emerging from previous projects.
the entities of the cradle metamodel allow the specification of collections, structures, services, and communities of users (called “societies” in cradle) and partially reflect the delos manifesto. the metamodel entities are based on existing dl taxonomies, such as those proposed by fox and marchionini,4 gonçalves et al.,5 or in the delos manifesto, so as to leverage available tools and knowledge. designers of dls can exploit the domain-specific visual language (dsvl) available in the cradle environment, where familiar entities extracted from the referred taxonomies are represented graphically, to model data structures, interfaces, and services offered to the final users. the visual model is then processed and transformed, exploiting suitable templates, toward a set of specific languages for describing interfaces and services. the results are finally transformed into platform-independent (java) code for specific dl applications. cradle supports the basic functionalities of a dl through interfaces and service templates for managing, browsing, searching, and updating. these can be further specialized to deploy advanced functionalities as defined by designers through the entities of the proposed visual
the design and development of a digital library involves different stakeholders, such as information architects, librarians, and domain experts, who need to agree on a common language to describe, discuss, and negotiate the services the library has to offer. to this end, high-level, language-neutral models have to be devised. metamodeling techniques favor the definition of domain-specific visual languages through which stakeholders can share their views and directly manipulate representations of the domain entities. this paper describes cradle (cooperative-relational approach to digital library environments), a metamodel-based framework and visual language for the definition of notions and services related to the development of digital libraries.
a collection of tools allows the automatic generation of several services, defined with the cradle visual language, and of the graphical user interfaces providing access to them for the final user. the effectiveness of the approach is illustrated by presenting digital libraries generated with cradle, while the cradle environment has been evaluated by using the cognitive dimensions framework.
digital libraries (dls) are rapidly becoming a preferred source for information and documentation. at both research and industry levels, dls are the most referenced sources, as testified by the popularity of google books, google video, ieee xplore, and the acm portal. nevertheless, no general model is uniformly accepted for such systems. only a few examples of modeling languages for developing dls are available,1 and there is a general lack of systems for designing and developing dls. this is even more unfortunate because different stakeholders are interested in the design and development of a dl, from information architects to librarians, software engineers, and experts of the specific domain served by the dl. these categories may have contrasting objectives and views when deploying a dl: librarians are able to deal with faceted categories of documents, taxonomies, and document classification; software engineers usually concentrate on services and code development; information architects favor effectiveness of retrieval; and domain experts are interested in directly referring to the content of interest without going through technical jargon. designers of dls are most often library technical staff with little to no formal training in software engineering, or computer scientists with little background in the research findings of hypertext information retrieval.
thus dl systems are usually built from scratch using specialized architectures that do not benefit
alessio malizia (alessio.malizia@uc3m.es) is associate professor, universidad carlos iii, department of informatics, madrid, spain; paolo bottoni (bottoni@di.uniroma1.it) is associate professor and s. levialdi (levialdi@di.uniroma1.it) is professor, “sapienza” university of rome, department of computer science, rome, italy.
alessio malizia, paolo bottoni, and s. levialdi
generating collaborative systems for digital libraries: a model-driven approach
172 information technology and libraries | december 2010
a formal foundation for digital libraries, called 5s, based on the concepts of streams, (data) structures, (resource) spaces, scenarios, and societies. while being evidence of a good modeling endeavor, the approach does not specify formally how to derive a system implementation from the model. the new generation of dl systems will be highly distributed, providing adaptive and interoperable behaviour by adjusting their structure dynamically, in order to act in dynamic environments (e.g., interfacing with the physical world).13 to manage such large and complex systems, a systematic engineering approach is required, typically one that includes modeling as an essential design activity where the availability of such domain-specific concepts as first-class elements in dl models will make application specification easier.14 while most of the disciplines related to dls (e.g., databases,15 information retrieval,16 and hypertext and multimedia17) have underlying formal models that have properly steered them, little is available to formalize dls per se. wang described the structure of a dl system as a domain-specific database together with a user interface for querying the records stored in the database.18 castelli et al.
present an approach involving multidimensional query languages for searching information in dl systems that is based on first-order logic.19 these works model metadata specifications and thus are the main examples of system formalization in dl environments. cognitive models for information retrieval, as used for example by oddy et al.,20 focus on users’ information-seeking behavior (i.e., formation, nature, and properties of a user’s information need) and on how information retrieval systems are used in operational environments. other approaches based on models and languages for describing the entities involved in a dl are the digital library definition language,21 the dspace data model22 (with the definitions of communities and workflow models), the metis workflow framework,23 and the fedora structoid approach.24 e/r approaches are frequently used for modeling database management system (dbms) applications,25 but as e/r diagrams only model the static structure of a dbms, they generally do not deal deeply with dynamic aspects. temporal extensions add dynamic aspects to the e/r approach, but most of them are not object-oriented.26 the advent of object-oriented technology calls for approaches and tools to information system design resulting in object-oriented systems. these considerations drove research toward modeling approaches as supported by uml.27 however, since the uml metamodel is not yet widespread in the dl community, we adopted the e/r formalism and complemented it with the specification of the dynamics made available through the user interface, as described by malizia et al.28 using the metamodel, we have defined a dsvl, including basic entities and
language. cradle is based on the entity-relationship (e/r) formalism, which is powerful and general enough to describe dl models and is supported by many tools as a metamodeling language.
moreover, we observed that users and designers involved in the dl environment, but not coming from a software engineering background, may not be familiar with advanced formalism like unified modeling language (uml), but they usually have basic notions on database management systems, where e/r is largely employed. ■■ literature review dls are complex information systems involving technologies and features from different areas, such as library and information systems, information retrieval, and hci. this interdisciplinary nature is well reflected in the various definitions of dls present in the literature. as far back as 1965, licklider envisaged collections of digital versions of scanned documents accessible via interconnected computers.6 more recently, levy and marshall described dls as sets of collections of documents, together with digital resources, accessible by users in a distributed context.7 to manage the amount of information stored in such systems, they proposed some sort of user-assisting software agent. other definitions include not only printed documents, but multimedia resources in general.8 however different the definitions may be, they all include the presence of collections of resources, their organization in structured repositories, and their availability to remote users through networks (as discussed by morgan).9 recent efforts toward standardization have been taken by public and private organizations. 
for example, a delphi study identified four main ingredients: an organized collection of resources, mechanisms for browsing and searching, a distributed networked environment, and a set of objectified services.10 the president’s information technology advisory committee (pitac) panel on digital libraries sees dls as the networked collections of digital text, documents, images, sounds, scientific data, and software that make up the core of today’s internet and of tomorrow’s universally accessible digital repositories of all human knowledge.11 when considering dls in the context of distributed dl environments, only a few papers have been produced, in contrast with the huge bibliography on dls in general. the dl group at the universidad de las américas puebla in mexico introduced the concept of personal and group spaces, relevant to the cscw domain, in the dl system context.12 users can share information stored in their personal spaces or share agents, thus allowing other users to perform the same search on the document collections in the dl. the cited text by gonçalves et al. gives
education as discussed by wattenberg or zia.33 in the nsdl program, a new generation of services has been developed that includes support for teaching and learning; this means also considering users’ activities or scenarios and not only information access. services for implementing personal content delivery and sharing, or managing digital resources and modeling collaboration, are examples of tools introduced during this program. the virtual reference desk (vrd) is emerging as an interactive service based on dls. with vrd, users can take advantage of domain experts’ knowledge and librarians’ experience to locate information. for example, the u.s.
library of congress ask a librarian service acts as a vrd for users who want help in searching information categories or to interact with expert librarians to search for a specific topic.34 the interactive and collaborative aspects of activities taking place within dls facilitate the development of user communities. social networking, work practices, and content sharing are all features that influence the technology and its use. following borgmann,35 lynch sees the future of dls not in broad services but in supporting and facilitating “customization by community,” i.e., services tailored for domain-specific work practices.36 we also examined the research agenda on system-oriented issues in dls and the delos manifesto.37 the agenda abstracts the dl life cycle, identifying five main areas, and proposes key research problems. in particular we tackle activities such as formal modeling of dls and their communities and developing frameworks coherent with such models. at the architectural level, one point of interest is to support heterogeneous and distributed systems, in particular networked dls and services.38 for interoperability, one of the issues is how to support and interoperate with different metadata models and standards to allow distributed cataloguing and indexing, as in the open archives initiative (oai).39 finally, we are interested in the service level of the research agenda and more precisely in web services and workflow management as crucial features when including communities and designing dls for use over networks and for sharing content.
as a result of this analysis, the cradle framework features the following:
■■ a visual language to help users and designers when visually modeling their specific dl (without knowing any technical detail apart from learning how to use a visual environment providing diagram representations of domain-specific elements)
■■ an environment integrating visual modeling and code generation instead of simply providing an integrated architecture that does not hide technical details
■■ interface generation for dealing with different users
relationships for modeling dl-related scenarios and activities. the need for the integration of multiple languages has also been indicated as a key aspect of the dsvl approach.29 in fact, complex domains like dls typically consist of multiple subdomains, each of which may require its own particular language. in the current implementation, the definition of dsvls exploits the metamodeling facilities of atom3, based on graph-grammars.30 atom3 has been typically used for simulation and model transformation, but we adopt it here as a tool for system generation.
■■ requirements for modeling digital libraries
we follow the delos manifesto by considering a dl as an organization (possibly virtual and distributed) for managing collections of digital documents (digital contents in general) and preserving their images on storage. a dl offers contextual services to communities of users, a certain quality of service, and the ability to apply specific policies. in cradle we leave the definition of quality of service to the service-oriented architecture standards we employ and partially model the applicable policy, but we focus here on crucial interactivity aspects needed to make dls usable by different communities of users.
in particular, we model interactive activities and services based on librarians’ experiences in face-to-face communication with users, or designing exchange and integration procedures for communicating between institutions and managing shared resources. while librarians are usually interested in modeling metadata across dls, software engineers aim at providing multiple tools for implementing services,31 such as indexing, querying, semantics,32 etc. therefore we provide a visual model useful for librarians and information architects to mimic the design phases they usually perform. moreover, by supporting component services, we help software engineers to specify and add services on demand to dl environments. to this end, we use a service component model. by sharing a common language, users from different categories can communicate to design a dl system while concentrating on their own tasks (services development and design for software engineers and dl design for librarians and information architects). users are modeled according to the delos manifesto as dl end-users (subdivided into content creators, content consumers, and librarians), dl designers (librarians and information architects), dl system administrators (typically librarians), and dl application developers (software engineers). several activities have been started on modeling domain specific dls. as an example, the u.s. 
national science digital library (nsdl) program promotes educational dls and services for basic and advanced science
■■ how that information is structured and organized (structural model)
■■ the behavior of the dl (service model) and the different societies of actors
■■ groups of services acting together to carry out the dl behavior (societal model)
figure 1 depicts the design approach supported by the cradle architecture, namely, modeling the society of actors and services interacting in the domain-specific scenarios and describing the documents and metadata structure included with the library by defining a visual model for all these entities. the dl is built using a collection of stock parts and configurable components that provide the infrastructure for the new dl. this infrastructure includes the classes of objects and relationships that make up the dl, and processing tools to create and load the actual library collection from raw documents, as well as services for searching, browsing, and collection maintenance. finally, the code generation module generates tailored dl services code stubs by composing and specializing components from the component pool.
initially, a dl designer is responsible for formalizing (starting from an analysis of the dl requirements and characteristics) a conceptual description of the dl using metamodel concepts. model specifications are then fed into a dl generator (written in python for atom3) to produce a dl tailored to specific platforms and requirements. after these design phases, cradle generates the code for the user interface and the parts of code corresponding to services and actors interacting in the described society.
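The generation step described above, in which a model specification is combined with templates from a component pool to produce service code stubs, can be illustrated with a minimal sketch. This is not the CRADLE generator: the model layout, the service kinds, the `COMPONENT_POOL` name, and the shape of the emitted Java stubs are all assumptions made for illustration, and the sketch uses only Python's standard `string.Template`.

```python
from string import Template

# Hypothetical component pool: one Java stub template per service kind.
# The service kinds and class layout are invented for illustration.
COMPONENT_POOL = {
    "search": Template(
        "public class ${name}Service {\n"
        "    // Searches the \"${collection}\" collection.\n"
        "    public java.util.List<String> search(String query) {\n"
        "        throw new UnsupportedOperationException(\"stub\");\n"
        "    }\n"
        "}\n"
    ),
    "browse": Template(
        "public class ${name}Service {\n"
        "    // Lists the documents of the \"${collection}\" collection.\n"
        "    public java.util.List<String> browse() {\n"
        "        throw new UnsupportedOperationException(\"stub\");\n"
        "    }\n"
        "}\n"
    ),
}


def generate_stubs(model):
    """Map each service in the model to a specialized code stub."""
    stubs = {}
    for service in model["services"]:
        kind = service["kind"]
        if kind not in COMPONENT_POOL:
            raise ValueError(f"no template for service kind: {kind}")
        source = COMPONENT_POOL[kind].substitute(
            name=service["name"], collection=service["collection"]
        )
        stubs[f"{service['name']}Service.java"] = source
    return stubs


if __name__ == "__main__":
    # A hypothetical model with two services over one collection.
    model = {
        "services": [
            {"kind": "search", "name": "ThesisSearch", "collection": "theses"},
            {"kind": "browse", "name": "ThesisBrowse", "collection": "theses"},
        ]
    }
    for filename, source in generate_stubs(model).items():
        print(f"--- {filename}\n{source}")
```

Keeping the templates in a pool separate from the model mirrors the division of labor described in the text: designers change the model, while software engineers add or refine components.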
a set of templates for code generation and designers ■■ flexible metadata definitions ■■ a set of interactive integrated tools for user activities with the generated dl system to sum up, cradle is a dlms aimed at supporting all the users involved in the development of a dl system and providing interfaces, data modeling, and services for user-driven generation of specific dls. although cradle does not yet satisfy all requirements for a generic dl system, it addresses issues focused on developing interactive dl systems, stressing interfaces and communication between users. nevertheless, we employed standards when possible to leave it open for further specification or enhancements from the dl user community. extensive use of xml-based languages allows us to change document information depending on implemented recognition algorithms so that expert users can easily model their dl by selecting the best recognition and indexing algorithms. cradle evolves from the jdan (java-based environment for document applications on networks) platform, which managed both document images and forms on the basis of a component architecture.40 jdan was based on xml technologies, and its modularity allowed its integration in service-based and grid-based scenarios. it supported template code generation and modeling, but it required the designer to write xml specifications and edit xml schema files in order to model the dl document types and services, thus requiring technical knowledge that should be avoided to let users concentrate on their specific domains. ■■ modeling digital library systems the cradle framework shows a unique combination of features: it is based on a formal model, exploits a set of domain-specific languages, and provides automatic code generation. 
moreover, fundamental roles are played by the concepts of society and collaboration.41 cradle generates code from tools built after modeling a dl (according to the rules defined by the proposed metamodel) and performs automatic transformation and mapping from model to code to generate software tools for a given dl model. the specification of a dl in cradle encompasses four complementary dimensions:

■■ multimedia information supported by the dl (collection model)

figure 1. cradle architecture

generating collaborative systems for digital libraries | malizia, bottoni, and levialdi 175

socioeconomic, and environment dimensions. we now show in detail the entities and relations in the derived metamodel, shown in figure 2.

actor entities

actors are the users of dls. actors interact with the dl through services (interfaces) that are (or can be) affected by the actors' preferences and messages (raised events). in the cradle metamodel, an actor is an entity with a behavior that may concurrently generate events. communications with other actors may occur synchronously or asynchronously. actors can relate through services to shape a digital community, i.e., the basis of a dl society. in fact, communities of students, readers, or librarians interact with and through dls, generally following predefined scenarios. as an example, societies can behave as query generator services (from the point of view of the library) and as teaching, learning, and working services (from the point of view of other humans and organizations). communication between actors within the same or different societies occurs through message exchange. to operate, societies need shared data structures and message protocols, enacted by sending structured sequences of queries and retrieving collections of results. the actor entity includes three attributes:

1. role identifies which role is played by the actor within the dl society.
examples of specific human roles include authors, publishers, editors, maintainers, developers, and the library staff. examples of nonhuman actors include computers, printers, telecommunication devices, software agents, and digital resources in general.
2. status is an enumeration of possible statuses for the actor:
   i. none (default value)
   ii. active (present in the model and actively generating events)
   iii. inactive (present in the model but not generating events)
   iv. sleeping (present in the model and awaiting a response to a raised event)
3. events describes a list of events that can be raised by the actor or received as a response message from a service. examples of events are borrow, reserve, return, etc. events triggered from digital resources include store, trash, and transfer. examples of response events are found, not found, updated, etc.

have been built for typical services of a dl environment. to improve acceptability and interoperability, cradle adopts standard specification sublanguages for representing dl concepts. most of the cradle model primitives are defined as xml elements, possibly enclosing other sublanguages to help define dl concepts. in more detail, mime types constitute the basis for encoding elements of a collection. the xml user interface language (xul)42 is used to represent appearance and visual interfaces, and xdoclet is used in the libgen code generation module, as shown in figure 1.43

■■ the cradle metamodel

in the cradle formalism, the specification of a dl includes a collection model describing the maintained multimedia documents, a structural model of information organization, a service model for the dl behavior, and a societal model describing the societies of actors and groups of services acting together to carry out the dl behavior. a society is an instance of the cradle model defined according to a specific collaboration framework in the dl domain.
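as a reading aid, the three actor attributes (role, status, and events) can be pictured as a small java class. this is only a sketch under stated assumptions, not code produced by cradle: the names Actor and ActorStatus, and the rule that raising an event puts an active actor to sleep until a response arrives, are our own illustration of the status values listed for the actor entity.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical names: the metamodel defines the attributes, not this API.
enum ActorStatus { NONE, ACTIVE, INACTIVE, SLEEPING }

class Actor {
    private final String role;                              // e.g. "student", "librarian"
    private ActorStatus status = ActorStatus.NONE;          // "none" is the default value
    private final List<String> events = new ArrayList<>();  // events raised or received

    Actor(String role, List<String> events) {
        this.role = role;
        this.events.addAll(events);
    }

    String getRole() { return role; }
    ActorStatus getStatus() { return status; }
    List<String> getEvents() { return Collections.unmodifiableList(events); }

    void activate() { status = ActorStatus.ACTIVE; }

    // Assumption: only an active actor can raise an event it declares;
    // it then sleeps while awaiting the response.
    boolean raise(String event) {
        if (status != ActorStatus.ACTIVE || !events.contains(event)) {
            return false;
        }
        status = ActorStatus.SLEEPING;
        return true;
    }

    // Assumption: a declared response message wakes a sleeping actor.
    void receive(String response) {
        if (status == ActorStatus.SLEEPING && events.contains(response)) {
            status = ActorStatus.ACTIVE;
        }
    }
}
```

for instance, a student actor declared with the events borrow and found would go from active to sleeping on raise("borrow") and back to active on receive("found").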
a society is the highest-level component of a dl and exists to serve the information needs of its actors and to describe its context of usage. hence a dl collects, preserves, and shares information artefacts for society members. the basic entities in cradle are derived from the categorization along the actors, activities, components, figure 2. the cradle metamodel with the e/r formalism 176 information technology and libraries | december 2010 a text document, including scientific articles and books, becomes a sequence of strings. the struct entity a struct is a structural element specifying a part of a whole. in dls, structures represent hypertexts, taxonomies, relationships between elements, or containment. for example, books can be structured logically into chapters, sections, subsections, and paragraphs, or physically into cover, pages, line groups (paragraphs), and lines. structures are represented as graphs, and the struct entity (a vertex) contains four attributes: 1. document is a pointer to the document entity the structure refers to. 2. id is a unique identifier for a structure element. 3. type takes three possible values: i. metadata denotes a content descriptor, for instance title, author, etc. ii. layout denotes the associated layout, e.g., left frame, columns, etc. iii. item indicates a generic structure element used for extending the model. 4. values is a list of values describing the element content, e.g., title, author, etc. actors interact with services in an event-driven way. services are connected via messages (send and reply) and can be sequential, concurrent, or task-related (when a service acts as a subtask of a macroservice). services perform operations (e.g., get, add, and del) on collections, producing collections of documents as results. struct elements are connected to each other as nodes of a graph representing metadata structures associated with documents. 
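the struct entity can be sketched in the same spirit: a vertex carrying the four attributes (document, id, type, values) plus links to other vertices. the class below is a hypothetical illustration, not part of cradle; in particular, storing edges as a list of children and the breadth-first helper are our assumptions about how such a graph might be held in memory.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical names; only the four attributes come from the metamodel.
enum StructType { METADATA, LAYOUT, ITEM }

class Struct {
    final String document;                              // label of the document referred to
    final String id;                                    // unique identifier of the element
    final StructType type;                              // metadata, layout, or item
    final List<String> values;                          // values describing the content
    final List<Struct> children = new ArrayList<>();    // assumed edge representation

    Struct(String document, String id, StructType type, String... values) {
        this.document = document;
        this.id = id;
        this.type = type;
        this.values = new ArrayList<>(Arrays.asList(values));
    }

    Struct addChild(Struct child) {
        children.add(child);
        return this;
    }

    // Lists the ids reachable from this vertex, level by level.
    // The visited set keeps the walk finite even if the graph has cycles.
    List<String> idsBreadthFirst() {
        List<String> ids = new ArrayList<>();
        Set<Struct> visited = new HashSet<>();
        Deque<Struct> queue = new ArrayDeque<>();
        queue.add(this);
        visited.add(this);
        while (!queue.isEmpty()) {
            Struct current = queue.remove();
            ids.add(current.id);
            for (Struct child : current.children) {
                if (visited.add(child)) {
                    queue.add(child);
                }
            }
        }
        return ids;
    }
}
```

a book's metadata could then be built as a root vertex with title and author children, mirroring the logical structuring into nested elements described above.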
the metamodel has been translated to a dsvl, associating symbols and icons with entities and relations (see “cradle language and tools” below). with respect to the six core concepts of the delos manifesto (content, user, functionality, quality, policy, and architecture), content can be modeled in cradle as collections and structs, user as actor, and functionality as service. the quality concept is not directly modeled in cradle, but for quality of service we support standard service architecture. policies can be partially modeled by services managing interaction between actors and collections, making it possible to apply standard access policies. from the architectural point of view, we follow the reference architecture of figure 1. ■■ cradle language and tools in this section we describe the selection of languages and tools of the cradle platform. to improve interoperability service entities services describe scenarios, activities, operations, and tasks that ultimately specify the functionalities of a dl, such as collecting, creating, disseminating, evaluating, organizing, personalizing, preserving, requesting, and selecting documents and providing services to humans concerned with fact-finding, learning, gathering, and exploring the content of a dl. all these activities can be described and implemented using scenarios and appear in the dl setting as a result of actors using services (thus societies). furthermore, these activities realize and shape relationships within and between societies, services, and structures. in the cradle metamodel, the service entity models what the system is required to do, in terms of actions and processes, to achieve a task. a detailed task analysis helps understand the current system and the information flow within it in order to design and allocate tasks appropriately. the service entity has four attributes: 1. name is a string representing a textual description of the service. 2. 
sync states whether communication is synchronous or asynchronous, modeled by values wait and nowait, respectively. 3. events is a list of messages that can trigger actions among services (tasks); for example, valid or notvalid in case of a parsing service. 4. responses contain a list of response messages that can reply to raised events; they are used as a communication mechanism by actors and services. the collection entity collections are sets of documents of arbitrary type (e.g., bits, characters, images, etc.) used to model static or dynamic content. in the static interpretation, a collection defines information content interpreted as a set of basic elements, often of the same type, such as plain text. examples of dynamic content include video delivered to a viewer, animated presentations, and so on. the attributes of collection are name and documents. name is a string, while documents is a list of pairs (documentname, documentlabel), the latter being a pointer to the document entity. the document entity documents are the basic elements in a dl and are modeled with attributes label and structure. label defines a textual string used by a collection entity to refer to the document. we can consider it as a document identifier, specifying a class or a type of document. structure defines the semantics and area of application of the document. for example, any textual representation can be seen as a string of characters, so that generating collaborative systems for digital libraries | malizia, bottoni, and levialdi 177 graphs. model manipulation can then be expressed via graph grammars also specified in atom3. the general process of automatic creation of cooperative dl environments for an application is shown in figure 3. initially, a designer formalizes a conceptual description of the dl using the cradle metamodel concepts. this phase is usually preceded by an analysis of requirements and interaction scenarios, as seen previously. 
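the service, collection, and document entities defined in the metamodel can likewise be pictured as plain java classes. the sketch below is illustrative only: the class names (Service, DlCollection, Document, Sync) are hypothetical, the get, add, and del methods stand for the collection operations named in the metamodel, and keeping documents in a map keyed by document name is our assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical names; only the attribute names come from the metamodel.
enum Sync { WAIT, NOWAIT }   // synchronous and asynchronous communication

class Document {
    final String label;       // identifier used by a collection to refer to the document
    final String structure;   // semantics and area of application
    Document(String label, String structure) {
        this.label = label;
        this.structure = structure;
    }
}

class DlCollection {
    final String name;
    // Pairs (documentName, document), kept in insertion order.
    private final Map<String, Document> documents = new LinkedHashMap<>();

    DlCollection(String name) { this.name = name; }

    void add(String documentName, Document document) { documents.put(documentName, document); }
    Document get(String documentName) { return documents.get(documentName); }
    boolean del(String documentName) { return documents.remove(documentName) != null; }
    int size() { return documents.size(); }
}

class Service {
    final String name;
    final Sync sync;
    final List<String> events;      // messages that can trigger actions
    final List<String> responses;   // messages that can reply to raised events

    Service(String name, Sync sync, List<String> events, List<String> responses) {
        this.name = name;
        this.sync = sync;
        this.events = new ArrayList<>(events);
        this.responses = new ArrayList<>(responses);
    }

    boolean isSynchronous() { return sync == Sync.WAIT; }
    boolean canHandle(String event) { return events.contains(event); }
}
```

a parsing service, for example, would be declared with the events valid and notvalid, as in the metamodel's own example.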
model specifications are then provided to a dl code generator (written in python within atom3) to produce dls tailored to specific platforms and requirements. these are built on a collection of templates of services and configurable components providing infrastructure for the new dl. the sketched infrastructure includes classes for objects (tasks), relationships making up the dl, and processing tools to upload the actual library collection from raw documents, as well as services for searching and browsing and for document collections maintenance. the cradle generator automatically generates different kinds of output for the cradle model of the cooperative dl environment, such as service and collection managers. collection managers define the logical schemata of the dl, which in cradle correspond to a set of mime types, xul and xdoclet specifications, representing digital objects, their component parts, and linking information. collection managers also store instances of their and collaboration, cradle makes extensive use of existing standard specification languages. most cradle outputs are defined with xml-based formats, able to enclose other specific languages. the basic languages and corresponding tools used in cradle are the following: ■■ mime type. multipurpose internet mail extensions (mime) constitute the basis for encoding documents in cradle, supporting several file formats and types of character encoding. mime was chosen because of wide availability of mime types, and standardisation of the approach. this makes it a natural choice for dls where different types of documents need to be managed (pdf, html, doc, etc.). moreover, mime standards for character encoding descriptions help keeping the cradle framework open and compliant with standards. ■■ xul. the xml user interface language (xul) is an xml-based markup language used to represent appearance and visual interfaces. 
xul is not a public standard yet, but it uses many existing standards and technologies, including dtd and rdf,44 which makes it easily readable for people with a background in web programming and design. the main benefit of xul is that it provides a simple definition of common user interface elements (widgets). this drastically reduces the software development effort required for visual interfaces. ■■ xdoclet. xdoclet is used for generating services from tagged-code fragments. it is an open-source code generation library which enables attribute-oriented programming for java via insertion of special tags.45 it includes a library of predefined tags, which simplify coding for various technologies, e.g., web services. the motivation for using xdoclet in the cradle framework is related to its approach for template code generation. designers can describe templates for each service (browse, query, and index) and the xdoclet generated code can be automatically transformed into the java code for managing the specified service. ■■ atom3. atom3 is a metamodeling system to model graphical formalisms. starting from a metaspecification (in e/r), atom3 generates a tool to process models described in the chosen formalism. models are internally represented using abstract syntax figure 3. cooperative dl generation process with cradle framework 178 information technology and libraries | december 2010 and (3) the metadata operations box. the right column manages visualization and multimedia information obtained from documents. the basic features provided with the ui templates are document loading, visualization, metadata organization, and management. the layout template, in the collection box, manages the visualization of the documents contained in a collection, while the visualization template works according to the data (mime) type specified by the document. 
actually, by selecting a document included in the collection, the corresponding data file is automatically uploaded and visualized in the ui. the metadata visualization in the code template reflects the metadata structure (a tree) represented by a struct, specifying the relationship between parent and child nodes. thus the xul template includes an area (the metadata box) for managing tree structures as described in the visual model of the dl. although the tree-like visualization has potential drawbacks if there are many metadata items, there should be no real concern with medium loads. the ui template also includes a box to perform operations on metadata, such as insert, delete, and edit. users can select a value in the metadata box and manipulate the presented values. figure 4 shows an example of a ui generated from a basic template. service templates to achieve automated code generation, we use xdoclet to specify parameters and service code generation according to such parameters. cradle can automatically annotate java files with name–value pairs, and xdoclet provides a syntax for parameter specification. code generation is classes and function as search engines for the system. services classes also are generated and are represented as attribute-oriented classes involving parts and features of entities. ■■ cradle platform the cradle platform is based on a model-driven approach for the design and automatic generation of code for dls. in particular, the dsvl for cradle has four diagram types (collection, structure, service, and actor) to describe the different aspects of a dl. in this section we describe the user interface (ui) and service templates used for generating the dl tools. in particular, the ui layout is mainly generated from the structured information provided by the document, struct, and collection entities. the ui events are managed by invoking the appropriate services according to the imported xul templates. 
at the service and communication levels, the xdoclet code is generated by the service and actor entities, exploiting their relationships. we also show how code generation works and the advanced platform features, such as automatic service discovery. at the end of the section a running example is shown, representing all the phases involved in using the cradle framework for generating the dl tools for a typical library scenario. user interface templates the generation of the ui is driven by the visual model designed by the cradle user. specifically, the model entities involved in this process are document, struct and collection (see figure 2) for the basic components and layout of the interfaces, while linked services are described in the appropriate templates. the code generation process takes place through transformations implemented as actions in the atom3 metamodel specification, where graph-grammar rules may have a condition that must be satisfied for the rule to be applied (preconditions), as well as actions to be performed when the rule is executed (postconditions). a transformation is described during the visual modeling phase in terms of conditions and corresponding actions (inserting xul language statements for the interface in the appropriate code template placeholders). the generated user interface is built on a set of xul template files that are automatically specialized on the basis of the attributes and relationships designed in the visual modeling phase. the layout template for the user interface is divided into two columns (see figure 4). the left column is made of three boxes: (1) the collection box (2) the metadata box, figure 4. an example of an automatically generated user interface. (a) document area; (b) collection box; (c) metadata box; (d) metadata operations box. 
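the rule mechanism described for user interface templates (a condition that must hold, followed by an action that inserts xul statements at a template placeholder) can be illustrated in a few lines of java. this is a loose sketch, not the atom3 graph-grammar machinery: the TemplateRule and MetadataNode classes and the placeholder comment are hypothetical, and labels are assumed to contain no characters that need xml escaping.

```java
import java.util.function.Predicate;

// Hypothetical stand-in for a struct node of the visual model.
class MetadataNode {
    final String label;
    MetadataNode(String label) { this.label = label; }
}

class TemplateRule {
    // Hypothetical marker showing where generated XUL is inserted.
    static final String PLACEHOLDER = "<!-- METADATA -->";

    private final Predicate<MetadataNode> precondition;

    TemplateRule(Predicate<MetadataNode> precondition) {
        this.precondition = precondition;
    }

    // If the precondition holds, inserts a XUL tree item for the node just
    // before the placeholder; otherwise returns the template unchanged.
    String apply(String template, MetadataNode node) {
        if (!precondition.test(node)) {
            return template;
        }
        String item = "<treeitem><treerow><treecell label=\"" + node.label
                + "\"/></treerow></treeitem>";
        return template.replace(PLACEHOLDER, item + PLACEHOLDER);
    }
}
```

keeping the placeholder in the output lets several rule applications append items one after another to the same metadata box.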
code generation is based on code templates. hence service templates are xdoclet templates for transforming xdoclet code fragments obtained from the modeled service entities. the basic xdoclet template manages messages between services, according to the event and response attributes described in “cradle language and tools” above. in fact, cradle generates a java application (a service) that needs to receive messages (event) and reply to them (response) as parameters for the service application. in xdoclet, these can be attached to the corresponding field by means of annotation tags, as in the following code segment:

    public class MsgArguments {
        . . .
        /**
         * @msg arguments.argname name="event" desc="event_string"
         */
        protected String eventMsg = null;
        /**
         * @msg arguments.argname name="response"
         *                        desc="response_string"
         */
        protected String responseMsg = null;
    }

each msg arguments.argname related to a field is called a field tag. each field tag can have multiple parameters, listed after the field tag. in the tag name msg arguments.argname, the prefix serves as the namespace of all tags for this particular xdoclet application, thus avoiding naming conflicts with other standard or customized xdoclet tags. fields are not the only entities that can be annotated: classes and functions can have tags too. xdoclet enables powerful code generation requiring little or no customization (depending on how much is provided by the template). the type of code to be generated using the parameters is defined by the corresponding xdoclet template. we have created template files composed of java code and special xdoclet instructions in the form of xml tags. these xdoclet instructions allow conditionals (if) and loops (for), thus providing us with expressive power close to a programming language. in the following example, we first create an array containing labels and other information for each argument:

    public class <XDtClass:className/>Impl extends <XDtClass:className/> {
        public static String[][] argumentNames = new String[][] {
            <XDtField:forAllFields>
            <XDtField:ifHasFieldTag tagName="msg arguments.argname">
            { "<XDtField:fieldName/>",
              "<XDtField:fieldTagValue tagName="msg arguments.argname" paramName="name"/>",
              "<XDtField:fieldTagValue tagName="msg arguments.argname" paramName="desc"/>" },
            </XDtField:ifHasFieldTag>
            </XDtField:forAllFields>
        };
    }

the first two lines declare a class named ClassNameImpl that extends the class ClassName. the xdoclet template tag XDtClass:className denotes the name of the class in the annotated java file. all standard xdoclet template tags have a namespace starting with “XDt.” the rest of the template uses XDtField:forAllFields to iterate through the fields. for each field with a tag named msg arguments.argname (checked using XDtField:ifHasFieldTag), it creates a subarray of strings using the values obtained from the field tag parameters. XDtField:fieldName gives the name of the field, while XDtField:fieldTagValue retrieves the value of a given field tag parameter. characters that are not part of an xdoclet template tag are copied directly into the generated code. the following code segment was generated by xdoclet using the annotated fields and the above template segment:

    public class MsgArgumentsImpl extends MsgArguments {
        public static String[][] argumentNames = new String[][] {
            { "eventMsg", "event", "event_string" },
            { "responseMsg", "response", "response_string" },
        };
    }

similarly, we generate the getter and setter methods for each field:

    <XDtField:forAllFields>
    <XDtField:ifHasFieldTag tagName="msg arguments.argname">
    public <XDtField:fieldType/> get<XDtField:fieldName/>() {
        return <XDtField:fieldName/>;
    }
    public void set<XDtField:fieldName/>(String value) {
        setValue("<XDtField:fieldName/>", value);
    }
    </XDtField:ifHasFieldTag>
    </XDtField:forAllFields>

this translates into the following generated code:

    public java.lang.String geteventMsg() {
        return eventMsg;
    }
    public void seteventMsg(String value) {
        setValue("eventMsg", value);
    }
    public java.lang.String getresponseMsg() {
        return responseMsg;
    }
    public void setresponseMsg(String value) {
        setValue("responseMsg", value);
    }

the same template is used for managing the name and sync attributes of service entities.

code generation, service discovery, and advanced features

a service or interface template only describes the solution to a particular design problem; it is not code. consequently, some users will find it difficult to make the leap from the template description to a particular implementation even though the template might include sample code. others, like software engineers, might have no trouble translating the template into code, but they still may find it a chore, especially when they have to do it repeatedly. the cradle visual design environment (based on atom3) helps alleviate these problems. from just a few pieces of information (the visual model), typically application-specific names for actors and services in a dl society along with choices for the design tradeoffs, the tool can create class declarations and definitions implementing the template. the ultimate goal of the modeling effort remains, however, the production of reliable and efficiently executable code. hence a code generation transformation produces interface (xul) and service (java code from xdoclet templates) code from the dl model. we have manually coded xul templates specifying the static setup of the gui, the various widgets and their layout.
this must be complemented with code generated from a dl model of the systems dynamics coded into services. while other approaches are possible,46 we employed the solution implemented within the atom3 environment according to its graph grammar modeling approach to code generation. cradle supports a flexible iterative process for visual design and code generation. in fact, a design change might require substantial reimplementation generating collaborative systems for digital libraries | malizia, bottoni, and levialdi 181 selecting one, the ui activates the metadata operations box—figure 6(d). the selected metadata node will then be presented in the lower (metadata operations) box, labeled “set metadata values,” replacing the default “none” value as shown in figure 6. after the metadata item is presented, the user can edit its value and save it by clicking on the “set value” button. the associated action saves the metadata information and causes its display in the intermediate box (tree-like structure), changing the visualization according to the new values. the code generation process for the do_search and front desk services is based on xdoclet templates. in particular, a message listener template is used to generate the java code for the front desk service. in fact, the front desk service is asynchronous and manages communications between actors. the actors classes are generated also by using the services templates since they have attributes, events, and messages, just like the services. the do_search service code is based on the producer and consumer templates, since it is synchronous by definition in the modeled scenario. a get method retrieving a collection of documents is implemented from the getter template. 
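the difference between the synchronous do_search service and the message-driven front desk service can be sketched as follows. this is a hand-written illustration, not output of the cradle templates: the class and interface names are hypothetical, the collection is reduced to a list of titles, and plain callbacks stand in for asynchronous messaging (no threads or message queue are involved).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback used to deliver response messages to actors.
interface ResponseListener {
    void onResponse(String message);
}

// Synchronous service: the caller waits for the result of get.
class DoSearchService {
    private final List<String> titles;   // stand-in for the document collection

    DoSearchService(List<String> titles) {
        this.titles = new ArrayList<>(titles);
    }

    // The "get" operation: returns the documents matching the request.
    List<String> get(String requestedTitle) {
        List<String> found = new ArrayList<>();
        for (String title : titles) {
            if (title.equalsIgnoreCase(requestedTitle)) {
                found.add(title);
            }
        }
        return found;
    }

    boolean isAvailable(String requestedTitle) {
        return !get(requestedTitle).isEmpty();
    }
}

// Message-listener style service: forwards the search result to its listeners.
class FrontDeskService {
    private final DoSearchService search;
    private final List<ResponseListener> listeners = new ArrayList<>();

    FrontDeskService(DoSearchService search) { this.search = search; }

    void addListener(ResponseListener listener) { listeners.add(listener); }

    // Handles a borrow request and propagates the response message.
    void onBorrowRequest(String title) {
        String message = "is_available=" + search.isAvailable(title);
        for (ResponseListener listener : listeners) {
            listener.onResponse(message);
        }
    }
}
```

in the library scenario, the librarian and student actors would play the role of listeners that receive the propagated response.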
the routine invoked by the transformation action for struct entities performs a breadth-first exploration of the metadata tree in the visual model and attaches the corresponding xul code for displaying the struct node in the correct position within the graph structure of the ui. collections, while a single rectangle connected to a collection represents a document entity; the circles linked to the document entity are the struct (metadata) entities. metadata entities are linked to the node relationships (organized as a tree) and linked to the document entity by a metadata linktype relationship. the search service is synchronous (sync attribute set to “wait”). it queries the document collection (get operation) looking for the requested document (using metadata information provided by the borrow request), and waits for the result of get (a collection of documents). based on this result, the service returns a boolean message “is_available,” which is then propagated as a response to the librarian and eventually to the student, as shown in figure 5. when the library designer has built the model, the transformation process can be run, executing the code generation actions associated with the entities and services represented in the model. the code generation process is based on template code snippets generated from the atom3 environment graph transformation engine, following the generative rules of the metamodel. we also use pre– and postconditions on application of transformation rules to have code generation depend on verification of some property. the generated ui is presented in figure 6. on the right side, the document area is presented according to the xul template. documents are managed according to their mime type: the pdf file of the example is loaded with the appropriate adobe acrobat reader plug-in. on the left column of the ui are three boxes, according to the xul template. 
the collection box—figure 6(b)— presents the list of documents contained in the collection specified by the documents attribute of the library collection entity, and allows users to interact with documents. after selecting a document by clicking on the list, it is presented in the document area—figure 6(a)—where it can be managed (edit, print, save, etc.). in the metadata box—figure 6(c)—the tree structure of the metadata is depicted according to the categorization modeled by the designer. the xul template contains all the basic layout and action features for managing a tree structure. the generated box contains the parent and child nodes according to the attributes specified in the corresponding struct elements. the user can click on the root for compacting or exploding the tree nodes; by figure 5. the library model, alias the model of the library society 182 information technology and libraries | december 2010 workflow system. the release collection maintains the image files in a permanent storage, while data is written to the target database or content management software, together with xml metadata snippets (e.g., to be stored in xml native dbms). a typical configuration would have the recognition service running on a server cluster, with many dataentry services running on different clients (web browsers directly support xul interfaces). whereas current document capture environments are proprietary and closed, the definition of an xml-based interchange format allows the suitable assembly of different component-based technologies in order to define a complex framework. the realization of the jdan dl system within the cradle framework can be considered as a preliminary step in the direction of a standard multimedia document managing platform with region segmentation and classification, thus aiming at automatic recognition of image database and batch acquisition of multiple multimedia documents types and formats. 
personal and collaborative spaces a personal space is a virtual area (within the dl society) that is modeled as being owned and maintained by a user including resources (document collections, services, etc.), or references to resources, which are relevant to a task, or set of tasks, the user needs to carry out in the dl. personal spaces may thus contain digital documents in multiple media, personal schedules, visualization tools, and user agents (shaped as services) entitled with various tasks. resources within personal spaces can be allocated ■■ designing and generating advanced collaborative dl systems in this section we show the use of cradle as an analytical tool helpful in comprehending specific dl phenomena, to present the complex interplays that occur between cradle components and dl concepts in a real dl application, and to illustrate the possibility of using cradle as a tool to design and generate advanced tools for dl development. modeling document images collections with cradle, the designer can provide the visual model of the dl society involved in document management and the remaining phases are automatically carried out by cradle modules and templates. we have provided the user with basic code templates for the recognition and indexing services, the data-entry plug-in, and archive release. the designer can thus simply translate the particular dl society into the corresponding visual model within the cradle visual modeling editor. as a proof of concept, figure 7 models the jdan architecture, introduced in “requirements for modeling digital libraries,” exploiting the cradle visual language. the recognition service performs the automatic document recognition and stores the corresponding document images, together with the extracted metadata in the archive collection. it interacts with the scanner actor, representing a machine or a human operator that scans paper documents. 
designers can choose their own segmentation method or algorithm; what is required to be compliant with the framework is to produce an xdoclet template. it stores the document images into the archive collection, with its different regions layout information according to the xml metadata schema provided by the designer. if there is at least one region marked as “not interpreted,” the dataentry service is invoked on the “not interpreted” regions. the data-entry service allows operators to evaluate the automatic classification performed by the system and edit the segmentation for indexing. operators can also edit the recognized regions with the classification engine (included in the recognition service) and adjust their values and sizes. the output of this phase is an xml description that will be imported in the indexing service for indexing (and eventually querying). the archive collection stores all of the basic information kept in jdan, such as text labels, while the indexing service, based on a multitier architecture, exploiting jboss 3.0, has access to them. this service is responsible for turning the data fragments in the archive collection into useful forms to be presented to the final users, e.g., a report or a query result. the final stage in the recognition process could be to release each document to a content management or figure 6. the ui generated by cradle transforming the library model in xul and xdoclet code generating collaborative systems for digital libraries | malizia, bottoni, and levialdi 183 and metadata, but also can share information with the various committees collaborating for certain tasks. ■■ evaluation in this section we evaluate the presented approach from three different perspectives: usability of the cradle notation, its expressiveness, and usability of the generated dls. 
usability of cradle notation

we have tested the usability of the cradle notation with the well-known cognitive dimensions framework for notations and visual language design.48 the dimensions are usually employed to evaluate the usability of a visual language or notation, or as heuristics to drive the design of innovative visual languages. the significant results are as follows.

abstraction gradient

an abstraction is a grouping of elements to be treated as one entity. in this sense, cradle is abstraction-tolerant. it provides entities for high-level abstractions of communication processes and services. these abstractions are intuitive, as they are visualized as the process they represent (services with events and responses), and easy to learn, as their configuration involves a few simple attributes. although cradle does not allow users to build new abstractions, the e/r formalism is powerful enough to provide basic abstraction levels.

closeness of mapping

cradle elements have been assigned icons that resemble their real-world counterparts (e.g., a collection is represented as a set of paper sheets). the elements that do not correspond to a physical object in the real world have icons borrowed from well-known notations (e.g., structs represented as graph nodes).

consistency

a notation is consistent if a user knowing some of its structure can infer most of the rest. in cradle, when two elements represent the same entity but can be used either as input or as output, their shape is the same but incorporates an incoming or an outgoing message to differentiate them. see, for example, the icons for services or those for graph nodes representing either a

according to the user’s role. for example, a conference chair would have access to conference-specific materials, visualization tools and interfaces to upload papers for review by a committee.
similarly, we denote a group space as a virtual area in which library users (the entire dl society) can meet to conduct collaborative activities synchronously or asynchronously. explicit group spaces are created dynamically by a designer or facilitator who becomes (or appoints) the owner of the space and defines who the participants will be. in addition to direct user-to-user communication, users should be able to access library materials and make annotations on them for every other group to see. ideally, users should be able to act (and carry dl materials with them) between personal and group spaces or among group spaces to which they belong. it may also be the case, however, that a given resource is referenced in several personal or group spaces. basic functionality required for personal spaces includes capabilities for viewing, launching, and monitoring library services, agents, and applications. like group spaces, personal spaces should provide users with the means to easily become aware of other users and resources that are present in a given group space at any time, as well as mechanisms to communicate with other users and make annotations on library resources. we employed this personal and group space paradigm in modeling a collaborative environment in the academic conferences domain, where a conference chair can have a personal view of the document collections (resources)

figure 7. the cradle model for the jdan framework

of “sapienza” university of rome (undergraduate students), shown in figure 5, and (2) an application employed in a records management project carried out as a collaboration between the computer science and the computer engineering departments of “sapienza” university, as shown in figure 7.
usability of the generated tools

environments for single-view languages generated with atom3 have been extensively used, mostly in an academic setting, in different areas like software and web engineering, modeling, and simulation; urban planning; etc. however, depending on the kind of domain, generating the results may take some time. for instance, the state reachability analysis in the dl example takes a few minutes; we are currently employing a version of atom3 that includes the petri-net formalism, with which we can test the reachability of service states.49 in general, from application experience, we note general agreement that automated support for syntactic consistency greatly simplifies the design of complex systems. finally, some users pointed out technical limitations of the current implementation, such as the fact that it is not possible to open several views at a time. altogether, we believe this work helps make the definition and maintenance of environments for dls more efficient and less tedious. our model-based approach must be contrasted with the programming-centric approach of most case tools, where the language and the code generation tools are hard-coded, so that whenever a modification has to be made (whether to the language or to the semantic domain) developers have to dive into the code.

■■ conclusions and future work

dls are complex information systems that integrate findings from disciplines such as hypertext, information retrieval, multimedia, databases, and hci. dl design is often a multidisciplinary effort, including library staff and computer scientists. wasted effort and poor interoperability can therefore ensue. examining the related bibliography, we noted a lack of tools or automatic systems for designing and developing cooperative dl systems. moreover, there is a need for modeling interactions between dls and users, such as scenario or activity-based approaches.
the cradle framework fills this gap by providing a model-driven approach for generating visual interaction tools for dls, supporting design and automatic generation of code for dls. in particular, we use a metamodel made of different diagram types (collection, structures, service, and

struct or an actor, with different colors.

diffuseness/terseness

a notation is diffuse when many elements are needed to express one concept. cradle is terse and not diffuse because each entity expresses a meaning on its own.

error-proneness

data flow visualization reduces the chance of errors at a first level of the specification. on the other hand, some mistakes can be introduced when specifying visual entities, since it is possible to express relations between source and target models that cannot generate semantically correct code. however, these mistakes should be considered “programming errors more than slips,” and may be detected through progressive evaluation.

hidden dependencies

a hidden dependency is a relation between two elements that is not visible. in cradle, relevant dependencies are represented as data flows via directed links.

progressive evaluation

each dl model can be tested as soon as it is defined, without having to wait until the whole model is finished. the visual interface for the dl can be generated with just one click, and services can be subsequently added to test their functionalities.

viscosity

cradle has a low viscosity because making small changes in a part of a specification does not imply lots of readjustments in the rest of it. one can change properties, events or responses and these changes will have only local effect. the only local changes that could imply performing further changes by hand are deleting entities or changing names; however, this would imply minimal changes (just removing or updating references to them) and would only affect a small set of subsequent elements in the same data flow.
visibility

a dl specification consists of a single set of diagrams fitting in one window. empirically, we have observed that this model usually involves no more than fifteen entities. different, independent cradle models can be shown simultaneously in different windows.

expressiveness of cradle

the paper has illustrated the expressiveness of cradle by defining different entities and relationships for different dl requisites. to this end, two different applications have been considered: (1) a basic example elaborated with the collaboration of the information science school

retrieval (reading, mass.: addison-wesley, 1999).
17. d. lucarella and a. zanzi, “a visual retrieval environment for hypermedia information systems,” acm transactions on information systems 14 (1996): 3–29.
18. b. wang, “a hybrid system approach for supporting digital libraries,” international journal on digital libraries 2 (1999): 91–110.
19. d. castelli, c. meghini, and p. pagano, “foundations of a multidimensional query language for digital libraries,” in proc. ecdl ’02, lncs 2458 (berlin: springer, 2002): 251–65.
20. r. n. oddy et al., eds., proc. joint acm/bcs symposium in information storage & retrieval (oxford: butterworths, 1981).
21. k. maly, m. zubair et al., “scalable digital libraries based on ncstrl/dienst,” in proc. ecdl ’00 (london: springer, 2000): 168–79.
22. r. tansley, m. bass and m. smith, “dspace as an open archival information system: current status and future directions,” proc. ecdl ’03, lncs 2769 (berlin: springer, 2003): 446–60.
23. k. m. anderson et al., “metis: lightweight, flexible, and web-based workflow services for digital libraries,” proc. 3rd acm/ieee-cs jcdl ’03 (los alamitos, calif.: ieee computer society, 2003): 98–109.
24. n. dushay, “localizing experience of digital content via structural metadata,” in proc. 2nd acm/ieee-cs jcdl ’02 (new york: acm, 2002): 244–52.
25.
m. gogolla et al., “integrating the er approach in an oo environment,” proc. er ’93 (berlin: springer, 1993): 376–89.
26. heidi gregersen and christian s. jensen, “temporal entity-relationship models—a survey,” ieee transactions on knowledge & data engineering 11 (1999): 464–97.
27. b. berkem, “aligning it with the changes using the goal-driven development for uml and mda,” journal of object technology 4 (2005): 49–65.
28. a. malizia, e. guerra, and j. de lara, “model-driven development of digital libraries: generating the user interface,” proc. mddaui ’06, http://sunsite.informatik.rwth-aachen.de/publications/ceur-ws/vol-214/ (accessed oct 18, 2010).
29. d. l. atkins et al., “mawl: a domain-specific language for form-based services,” ieee transactions on software engineering 25 (1999): 334–46.
30. j. de lara and h. vangheluwe, “atom3: a tool for multi-formalism and meta-modelling,” proc. fase ’02 (berlin: springer, 2002): 174–88.
31. j. m. morales-del-castillo et al., “a semantic model of selective dissemination of information for digital libraries,” journal of information technology & libraries 28 (2009): 21–30.
32. n. santos, f. c. a. campos, and r. m. m. braga, “digital libraries and ontology,” in handbook of research on digital libraries: design, development, and impact, ed. y.-l. theng et al. (hershey, pa.: idea group, 2008): 1:19.
33. f. wattenberg, “a national digital library for science, mathematics, engineering, and technology education,” d-lib magazine 3 no. 10 (1998), http://www.dlib.org/dlib/october98/wattenberg/10wattenberg.html (accessed oct 18, 2010); l. l. zia, “the nsf national science, technology, engineering, and mathematics education digital library (nsdl) program: new projects and a progress report,” d-lib magazine, 7, no. 11 (2002), http://www.dlib.org/dlib/november01/zia/11zia.html (accessed oct 18, 2010).
34. u.s. library of congress, ask a librarian, http://www.loc

society), which describe the different aspects of a dl.
we have built a code generator able to produce xul code from the design models for the dl user interface. moreover, we use template code generation integrating predefined components for the different services (xdoclet language) according to the model specification. extensions of cradle with behavioral diagrams and the addition of analysis and simulation capabilities are under study. these will exploit the new atom3 capabilities for describing multiview dsvls, to which this work directly contributed.

references

1. m. a. gonçalves and e. a. fox, “5sl: a language for declarative specification and generation of digital libraries,” proc. jcdl ’02 (new york: acm, 2002): 263–72.
2. l. candela et al., “setting the foundations of digital libraries: the delos manifesto,” d-lib magazine 13 (2007), http://www.dlib.org/dlib/march07/castelli/03castelli.html (accessed oct 18, 2010).
3. a. malizia et al., “a cooperative-relational approach to digital libraries,” proc. ecdl 2007, lncs 4675 (berlin: springer, 2007): 75–86.
4. e. a. fox and g. marchionini, “toward a worldwide digital library,” communications of the acm 41 (1998): 29–32.
5. m. a. gonçalves et al., “streams, structures, spaces, scenarios, societies (5s): a formal model for digital libraries,” acm transactions on information systems 22 (2004): 270–312.
6. j. c. r. licklider, libraries of the future (cambridge, mass.: mit pr., 1965).
7. d. m. levy and c. c. marshall, “going digital: a look at assumptions underlying digital libraries,” communications of the acm 38 (1995): 77–84.
8. r. reddy and i. wladawsky-berger, “digital libraries: universal access to human knowledge—a report to the president,” 2001, www.itrd.gov/pubs/pitac/pitac-dl-9feb01.pdf (accessed mar. 16, 2010).
9. e. l. morgan, “mylibrary: a digital library framework and toolkit,” journal of information technology & libraries 27 (2008): 12–24.
10. t. r. kochtanek and k. k.
hein, “delphi study of digital libraries,” information processing & management 35 (1999): 245–54.
11. s. e. howe et al., “the president’s information technology advisory committee’s february 2001 digital library report and its impact,” in proc. jcdl ’01 (new york: acm, 2001): 223–25.
12. n. reyes-farfan and j. a. sanchez, “personal spaces in the context of oa,” proc. jcdl ’03 (ieee computer society, 2003): 182–83.
13. m. wirsing, report on the eu/nsf strategic workshop on engineering software-intensive systems, 2004, http://www.ercim.eu/eu-nsf/sis.pdf (accessed oct 18, 2010).
14. s. kelly and j.-p. tolvanen, domain-specific modeling: enabling full code generation (hoboken, n.j.: wiley, 2008).
15. h. r. turtle and w. bruce croft, “evaluation of an inference network-based retrieval model,” acm transactions on information systems 9 (1991): 187–222.
16. r. a. baeza-yates and b. a. ribeiro-neto, modern information

.mozilla.org/en/xul (accessed mar. 16, 2010).
43. xdoclet, welcome! what is xdoclet? http://xdoclet.sourceforge.net/xdoclet/index.html (accessed mar. 16, 2010).
44. w3c, extensible markup language (xml) 1.0 (fifth edition), http://www.w3.org/tr/2008/rec-xml-20081126/ (accessed mar. 16, 2010); w3c, resource description framework (rdf), http://www.w3.org/rdf/ (accessed mar. 16, 2010).
45. h. wada and j. suzuki, “modeling turnpike frontend system: a model-driven development framework leveraging uml metamodeling and attribute-oriented programming,” proc. models ’05, lncs 3713 (berlin: springer, 2005): 584–600.
46. i. horrocks, constructing the user interface with statecharts (boston: addison-wesley, 1999).
47. universal description, discovery, and integration oasis standard, welcome to uddi xml.org, http://uddi.xml.org/ (accessed mar. 16, 2010).
48. t. r. g. green and m.
petre, “usability analysis of visual programming environments: a ‘cognitive dimensions framework,’” journal of visual languages & computing 7 (1996): 131–74.
49. j. de lara, e. guerra, and a. malizia, “model driven development of digital libraries—validation, analysis and formal code generation,” proc. 3rd webist ’07 (berlin: springer, 2008).

.gov/rr/askalib/ (accessed on mar. 16, 2010).
35. c. l. borgman, “what are digital libraries? competing visions,” information processing & management 25 (1999): 227–43.
36. c. lynch, “colliding with the real world: heresies and unexplored questions about audience, economics, and control of digital libraries,” in digital library use: social practice in design and evaluation, ed. a. p. bishop, n. a. van house, and b. buttenfield (cambridge, mass.: mit pr., 2003): 191–216.
37. y. ioannidis et al., “digital library information-technology infrastructure,” international journal of digital libraries 5 (2005): 266–74.
38. e. a. fox et al., “the networked digital library of theses and dissertations: changes in the university community,” journal of computing higher education 13 (2002): 3–24.
39. h. van de sompel and c. lagoze, “notes from the interoperability front: a progress report on the open archives initiative,” proc. 6th ecdl, 2002, lncs 2458 (berlin: springer 2002): 144–57.
40. f. de rosa et al., “jdan: a component architecture for digital libraries,” delos workshop: digital library architectures (padua, italy: edizioni libreria progetto, 2004): 151–62.
41. defined as a set of actors (users) playing roles and interacting with services.
42. mozilla developer center, xul, https://developer

engine of innovation: building the high performance catalog

will owen and sarah c.
michalak

information technology and libraries | june 2015 5

abstract

numerous studies have indicated that sophisticated web-based search engines have eclipsed the primary importance of the library catalog as the premier tool for researchers in higher education. we submit that the catalog remains central to the research process. through a series of strategic enhancements, the university of north carolina at chapel hill, in partnership with the other members of the triangle research libraries network (trln), has made the catalog a carrier of services in addition to bibliographic data, facilitating not simply discovery, but also delivery of the information researchers seek.

introduction

in 2005, an oclc research report documented what many librarians already knew—that the library webpage and catalog were no longer the first choice to begin a search for information. the report states,

the survey findings indicate that 84 percent of information searches begin with a search engine. library web sites were selected by just 1 percent of respondents as the source used to begin an information search. very little variability in preference exists across geographic regions or u.s. age groups. two percent of college students start their search at a library web site.1

in 2006 a report by karen calhoun, commissioned by the library of congress, asserted, “today a large and growing number of students and scholars routinely bypass library catalogs in favor of other discovery tools. . . . the catalog is in decline, its processes and structures are unsustainable, and change needs to be swift.”2

ithaka s+r has conducted national faculty surveys triennially since 2000. summarizing the 2000–2006 surveys, roger schonfeld and kevin guthrie stated, “when the findings from 2006 are compared with those from 2000 and 2003, it becomes evident that faculty perceive themselves as becoming decreasingly dependent on the library for their research and teaching needs.”3 furthermore, it was clear that the “library as gateway to scholarly information” was viewed as decreasingly important. the 2009 survey continued the trend with even fewer faculty seeing the gateway function as critical. these results occurred in a time when electronic resources were becoming increasingly important and large google-like search engines were rapidly gaining in use.4

will owen (owen@email.unc.edu) is associate university librarian for technical services and systems and sarah c. michalak (smichala@email.unc.edu) is university librarian and associate provost for university libraries, university of north carolina at chapel hill.

engine of innovation: building the high-performance catalog | owen and michalak doi: 10.6017/ital.v34i2.5702 6

these comments extend into the twenty-first century more than thirty years of concern about the utility of the library catalog. through the first half of this decade new observations emerged about patron perceptions of catalog usability. even after migration from the card to the online catalog was complete, the new tool represented primarily the traditionally cataloged holdings of a particular library. providing direct access to resources was not part of the catalog’s mission. manuscripts, finding aids, historical photography, and other special collections were not included in the traditional catalog.
journal articles could only be discovered through abstracting and indexing services. as these discovery tools began their migration to electronic formats, the centrality of the library’s bibliographic database was challenged.

the development of google and other sophisticated web-based search engines further eclipsed the library’s bibliographic database as the first and most important research tool. yet we submit that the catalog database remains a necessary fixture, continuing to provide access to each library’s particular holdings. while the catalog may never regain its pride of place as the starting point for all researchers, it still remains an indispensable tool for library users, even if it may be used only at a later stage in the research process.

at the university of north carolina at chapel hill, we have continued to invest in enhancing the utility of the catalog as a valued tool for research. librarians initially reasoned that researchers still want to find out what is available to them in their own campus library. gradually they began to see completely new possibilities. to that end, we have committed to a program that enhances discovery and delivery through the catalog. while most libraries have built a wide range of discovery tools into their home pages—adding links to databases of electronic resources, article databases, and google scholar—we have continued to enhance both the content to be found in the primary local bibliographic database and the services available to students and researchers via the interface to the catalog.
in our local consortium, the triangle research libraries network (trln), librarians have deployed the search and faceting services of endeca to enrich the discovery interfaces. we have gone beyond augmenting the catalog through the addition of marcive records for government documents, by including encoded archival description (ead) finding aids and selected (and ever-expanding) digital collections that are not easily discoverable through major search engines. we have similarly enhanced services related to the discovery and delivery of items listed in the bibliographic database, including not only common features like the ability to export citations in a variety of formats but also more extensive services such as document delivery, an auto-suggest feature that maximizes use of library of congress subject headings (lcsh), and the ability to submit cataloged items to be processed for reserve reading.

both students and faculty have embraced e-books, and in adding more than a million such titles to the unc-chapel hill catalog we continue to blend discovery and delivery, but now on a very large scale. coupling catalog records with a metadata service that provides book jackets, tables of contents, and content summaries, cataloging geographic information systems (gis) data sets, and adding live links to the finding aids for digitized archival and manuscript collections have further enhanced the blended discovery/delivery capacity of the catalog.
we have also leveraged the advantages of operating in a consortial environment by extending the discovery and delivery services among the members of trln to provide increased scope of discovery and shared processing of some classes of bibliographic records. trln comprises four institutions and content from all member libraries is discoverable in a combined catalog (http://search.trln.org). printed material requested through this combined catalog is often delivered between trln libraries within twenty-four hours.

at unc, our search logs show that use of the catalog increases as we add new capacity and content. these statistics demonstrate the catalog’s continuing relevance as a research tool that adds value above and beyond conventional search engines and general web-based information resources. in this article we will describe the most important enhancements to our catalog, include data from search logs to demonstrate usage changes resulting from these enhancements, and comment on potential future developments.

literature review

an extensive literature discusses the past and future of online catalogs, and many of these materials themselves include detailed literature reviews. in fact, there are so many studies, reviews, and editorials, it becomes clear that although the online catalog may be in decline, it remains a subject of lively interest to librarians. two important threads in this literature report on user-query studies and on other usability testing.
though there are many earlier studies, two relatively recent articles analyze search behavior and provide selective but helpful literature surveys.5

there are many efforts to define directions for the catalog that would make it more web-like, more google-like, and thus more often chosen for search, discovery, and access by library patrons. these articles aim to define the characteristics of the ideal catalog. charles hildreth provides a benchmark for these efforts by dividing the history of the online catalog into three generations. from his projections of a third generation grew the “next generation catalog”—really the current ideal. he called for improvement of the second-generation catalog through an enhanced user-system dialog, automatic correction of search-term spelling and format errors, automatic search aids, enriched subject metadata in the catalog record to improve search results, and the integration of periodical indexes in the catalog. as new technologies have made it possible to achieve these goals in new ways, much of what hildreth envisioned has been accomplished.6

second-generation catalogs, anchored firmly in integrated library systems, operated throughout most of the 1980s and the 1990s without significant improvement. by the mid-2000s the search for the “next-gen” catalog was in full swing, and there are numerous articles that articulate the components of an improved model.
the catalog crossed a generational line for good when the north carolina state university libraries (ncsu) launched a new catalog search engine and interface with endeca in january 2006. three ncsu authors published a thorough article describing key catalog improvements. their endeca-enhanced catalog fulfilled the most important criteria for a “next-gen” catalog: improved search and retrieval through “relevance-ranked results, new browse capabilities, and improved subject access.”7

librarians gradually concluded that the catalog need not be written off but would benefit from being enhanced and aligned with search engine capabilities and other web-like characteristics. catalogs should contain more information about titles, such as book jackets or reviews, than conventional bibliographic records offered. catalog search should be understandable and easy to use. additional relevant works should be presented to the user along with result sets. the experience should be interactive and participatory and provide access to a broad array of resources such as data and other nonbook content.8

karen markey, one of the most prolific online catalog authors and analysts, writes, “now that the era of mass digitization has begun, we have a second chance at redesigning the online library catalog, getting it right, coaxing back old users and attracting new ones.”9

marshall breeding predicted characteristics of the next-generation catalog. his list includes expanded scope of search, more modern interface techniques, such as a single point of entry, search result ranking, faceted navigation, and “did you mean . . .
?” capacity, as well as an expanded search universe that includes the full text of journal articles and an array of digitized resources.10

a concept that is less represented in the literature is that of envisioning the catalog as a framework for service, although the idea of the catalog designed to ensure customer self-service has been raised.11 michael j. bennett has studied the effect of catalog enhancements on circulation and interlibrary loan.12 service and the online catalog have a new meaning in morgan’s idea of “services against texts,” supporting “use and understand” in addition to the traditional “find and get.”13 lorcan dempsey commented on the catalog as an identifiable service and predicts new formulations for library services based on the network-level orientation of search and discovery.14 but the idea that the catalog has moved from a fixed, inward-focused tool to an engine for services—a locus to be invested with everything from unmediated circulation renewal and ordering delivery to the “did you mean” search aid—has yet to be addressed comprehensively in the literature.

enhancing the traditional catalog

one of the factors that complicates discussions of the continued relevance of the library catalog to research is the very imprecision of the term in common parlance, especially when the chief point of comparison to today’s ils-driven opacs is google or, more specifically, google scholar.
from   first-­‐year  writing  assignments  through  advanced  faculty  research,  many  of  the  resources  that  our   patrons  seek  are  published  in  the  periodical  literature,  and  the  library  catalog,  the  one  descended   from  the  cabinets  full  of  cards  that  occupied  prominent  real  estate  in  our  buildings,  has  never  been   an  effective  tool  for  identifying  relevant  periodical  literature.   this  situation  has  changed  in  recent  years  as  products  like  summon,  from  proquest,  and  ebsco   discovery  service  have  introduced  platforms  that  can  accommodate  electronic  article  indexing  as   well  as  marc  records  for  the  types  of  materials—books,  audio,  and  video—that  have  long  been   discovered  through  the  opac.  in  the  following  discussion  of  “catalog”  developments  and   enhancements,  we  focus  initially  not  on  these  integrated  solutions,  but  on  the  catalog  as  more   traditionally  defined.  however,  as  electronic  resources  become  an  ever-­‐greater  percentage  of   library  collections,  we  shall  see  a  convergence  of  these  two  streams  that  will  portend  significant   changes  in  the  nature  and  utility  of  the  catalog.   much  work  has  been  done  in  the  first  decade  of  the  twenty-­‐first  century  to  enhance  discovery   services  and,  as  noted  above,  north  carolina  state  university’s  introduction  of  their  endeca-­‐based   search  engine  and  interface  was  a  significant  game-­‐changer.  in  the  years  following  the   introduction  of  the  endeca  interface  at  ncsu,  the  triangle  research  libraries  network  invested  in   further  development  of  features  that  enhanced  the  utility  of  the  endeca  software  itself.   programmed  enhancements  to  the  interface  provided  additional  services  and  functionality.  in   some  cases,  these  enhancements  were  aimed  at  improving  discovery.  
in others, they allowed researchers to make new and better use of the data that they found or made it easier to obtain the documents that they discovered.

faceting and limiting retrieval results

perhaps the most immediately striking innovation in the endeca interface was the introduction of facets. the use of faceted browsing allowed users to parse the bibliographic record in new ways (and in more ways) than preceding catalogs had allowed. there were several fundamentally important ways faceting enhanced search and discovery.

the first of these was the formal recognition that keyword searching was the user’s default means of interacting with the catalog’s data. ncsu’s initial implementation allowed for searches using several indexes, including authors, titles, and subject headings, and this functionality remains in place to the present day. however, by default, searches returned records containing the search terms “anywhere” in the record. this behavior was more in line with user expectations in an information ecosystem dominated by google’s single search box.

the second was the significantly different manner in which multiple limits could be placed on an initial result set from such a keyword search. the concept of limiting was not a new one: certain facets worked in a manner consistent with traditional limits in prior search interfaces, allowing users to screen results by language, or date of publication, for example.

engine of innovation: building the high-performance catalog | owen and michalak | doi: 10.6017/ital.v34i2.5702

it was the ease and transparency with which multiple limits could be applied through faceting that was revolutionary.
a user who entered the keyword “java” in the search box was quickly able to discriminate between the programming language and the indonesian island. this could be achieved in multiple ways: by choosing between subjects (for example, “application software” vs. “history”) or clearly labeled lc classification categories (“q – science” vs. “d – history”). these limits, or facets, could be toggled on and off, independently and iteratively.

the third and highly significant difference resulted from how library of congress subject headings (lcsh) were parsed and indexed in the system. by making lcsh subdivisions independent elements of the subject-heading index in a keyword search, the endeca implementation unlocked a trove of metadata that had been painstakingly curated by catalogers for nearly a century. the user no longer needed to be familiar with the formal structure of subject headings; if the keywords appeared anywhere in the string, the subdivisions in which they were contained could be surfaced and used as facets to sharpen the focus of the search. this was revolutionary.

utilizing the power of new indexing structures

the liberation of bibliographic data from the structure of marc record indexes presaged yet another far-reaching alteration in the content of library catalogs. to this day, most commercial integrated library systems depend on marc as the fundamental record structure. in ncsu’s implementation, the multiple indexes built from that metadata created a new framework for information.
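the faceting behavior described in this section (a keyword search that matches anywhere in the record, followed by limits that can be switched on and off, with lcsh subdivisions treated as independent facet values) can be illustrated with a minimal python sketch. this is not endeca's implementation or api: the sample records, the field names (`lc_class`, `subjects`), the `--` subdivision separator, and every function name below are assumptions made for illustration only.

```python
from collections import Counter

# hypothetical sample records; the field names are illustrative only
RECORDS = [
    {"title": "java programming for beginners",
     "lc_class": "q - science",
     "subjects": ["java (computer program language)",
                  "application software -- development"]},
    {"title": "a history of java",
     "lc_class": "d - history",
     "subjects": ["java (indonesia) -- history"]},
    {"title": "java and modern indonesia",
     "lc_class": "d - history",
     "subjects": ["java (indonesia) -- politics and government",
                  "indonesia -- history"]},
]


def subject_elements(record):
    """Split each heading on '--' so every subdivision is its own element."""
    return {part.strip()
            for heading in record["subjects"]
            for part in heading.split("--")}


def facet_values(record, facet):
    """Return the set of values a record contributes to one facet."""
    if facet == "lc_class":
        return {record["lc_class"]}
    if facet == "subject":
        return subject_elements(record)
    raise ValueError(f"unknown facet: {facet}")


def keyword_search(records, term):
    """Match the term anywhere in the record (the default keyword search)."""
    term = term.lower()
    hits = []
    for record in records:
        text = " ".join([record["title"], record["lc_class"], *record["subjects"]])
        if term in text.lower():
            hits.append(record)
    return hits


def facet_counts(results, facet):
    """Count how many records in the result set carry each facet value."""
    counts = Counter()
    for record in results:
        counts.update(facet_values(record, facet))
    return counts


def apply_facets(results, active):
    """Keep only records that match every (facet, value) pair in `active`."""
    return [record for record in results
            if all(value in facet_values(record, facet)
                   for facet, value in active)]
```

in this sketch, `keyword_search(RECORDS, "java")` returns all three records, and `apply_facets` with `("lc_class", "q - science")` narrows them to the programming title. passing an empty set of active facets restores the full result set, which is the toggling behavior the text describes. because `subject_elements` splits headings on the subdivision separator, "history" becomes a facet value in its own right even though it never appears as a main heading.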
this change made possible the integration of non-marc data with marc data, allowing, for example, dublin core (dc) records to be incorporated into the universe of metadata to be indexed, searched, and retrieved. there was no need to crosswalk dc to marc: it sufficed to simply assign the dc elements to the appropriate endeca indexes. with this capacity to integrate rich collections of locally described digital resources, the scope of the traditional catalog was enlarged.

expanding scopes and banishing silos

at unc-chapel hill, we began this process of augmentation with selected collections of digital objects. these collections were housed in a contentdm repository we had been building for several years at the time of the library’s introduction of the endeca interface. image files, which had not been accessible through traditional catalogs, were among the first to be added. for example, we had been given a large collection of illustrated postcards featuring scenes of north carolina cities and towns. these postcards had been digitized and metadata describing the image and the town had been recorded. other collections of digitized historical photographs were also selected for inclusion in the catalog. these historical resources proved to be a boon to faculty teaching local history courses and, interestingly, to students working on digital projects for their classes. as class assignments came to include activities like creating maps enhanced by the addition of digital photographs or digitized newspaper clippings, the easy discovery of these formerly hidden collections enriched students’ learning experience.
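the earlier point that dublin core elements could be assigned directly to indexes, with no crosswalk to marc, can be sketched in a few lines of python. the index names, both mapping tables, and the function names are hypothetical and are not taken from the endeca or trln configuration. the only outside facts used are that `title`, `creator`, `subject`, `description`, and `coverage` are dublin core elements and that marc fields 100, 245, and 650 hold the main personal name, the title statement, and topical subjects.

```python
# hypothetical mappings from source fields to shared index names
DC_TO_INDEX = {
    "title": "title",
    "creator": "author",
    "subject": "subject",
    "coverage": "subject",     # place names become searchable as subjects
    "description": "keyword",
}
MARC_TO_INDEX = {
    "245": "title",
    "100": "author",
    "650": "subject",
}


def build_index_doc(source_fields, mapping):
    """Assign each source field's values to the index named in `mapping`."""
    doc = {}
    for field, values in source_fields.items():
        index = mapping.get(field)
        if index is None:
            continue  # unmapped fields are simply not indexed
        doc.setdefault(index, []).extend(values)
    return doc


def search(docs, index, term):
    """Return the documents whose chosen index contains the term."""
    term = term.lower()
    return [doc for doc in docs
            if any(term in value.lower() for value in doc.get(index, []))]
```

because both record types end up under the same index names, a single subject search can retrieve a digitized postcard described in dublin core alongside a book described in marc, and neither record has to be converted into the other's format.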
other special collection materials had been represented in the traditional catalog in somewhat limited fashion. the most common examples were manuscript collections. the processing of these collections had always resulted in the creation of finding aids, produced since the 1930s using index cards and typewriters. during the last years of the twentieth century, archivists began migrating many of these finding aids to the web using the ead format, presenting them as simple html pages. these finding aids were accessible through the catalog by means of generalized marc records that described the collections at a superficial level. however, once we attained the ability to integrate the contents of the finding aids themselves into the indexes underlying the new interface, this much richer trove of keyword-searchable data vastly increased the discoverability and use of these collections.

during this period, the library also undertook systematic digitization of many of these manuscript collections. whenever staff received a request for duplication of an item from a manuscript collection (formerly photocopies, but by then primarily digital copies), we digitized the entire folder in which that item was housed. we developed standards for naming these digital surrogates that associated the individual image with the finding aid. it then became a simple matter, involving the addition of a short javascript string to the head of the online finding aid, to dynamically link the digital objects to the finding aid itself.

other library collections likewise benefited from the new indexing structures.
some uncataloged materials traditionally had minimal bibliographic control provided by inventories that were built at the time of accession in desktop database applications; funding constraints meant that full cataloging of these materials (often rare books) remained elusive. the ability to take the data that we had and blend it into the catalog enhanced the discovery of these collections as well.

we also have an extensive collection of video resources, including commercial and educational films. the conventions for cataloging these materials, held over from the days of catalog cards, often did not match user expectations for search and discovery. there were limits to the number of added entries that catalogers would make for directors, actors, and others associated with a film. many records lacked the kind of genre descriptors that undergraduates were likely to use when seeking a film for an evening’s entertainment. to compensate for these limitations, staff who managed the collection had again developed local database applications that allowed for the inclusion of more extensive metadata and for categories such as country of origin or folksonomic genres that patrons frequently indicated were desirable access points. once again, the new indexing structures allowed us to incorporate this rich set of metadata into what looked like the traditional catalog.

each of the instances described above represents what we commonly call the destruction of silos. information about library collections that had been scattered in numerous locations—and not all of them online—was integrated into a single point of discovery.
it was our hope and intention that such integration would drive more users to the catalog as a discovery tool for the library’s diverse collections and not simply for the traditional monographic and serials collections that had been served by marc cataloging. usage logs indicate that the average number of searches conducted in the catalog rose from approximately 13,000 per day in 2009 to around 19,000 per day in 2013. it is impossible to tell with any certainty whether there was heavier use of the catalog simply because increasingly varied resources came to be represented in it, but we firmly believe that the experience of users who search for material in our catalog has become much richer as a result of these changes to its structure and content.

cooperation encouraging creativity

another way we were able to harness the power of endeca’s indexing scheme involved the shared loading of bibliographic records for electronic resources to which multiple trln libraries provided access. trln’s endeca indexes are built from the records of each member. each institution has a “pipeline” that feeds metadata into the combined trln index. duplicate records are rolled up into a single display via oclc control numbers whenever possible, and the bibliographic record is annotated with holdings statements for the appropriate libraries.
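the roll-up step described above can be sketched as follows. this is a simplified illustration in python, not the trln pipeline itself: the record layout (`institution`, `local_id`, `oclc`, `title`), the sample data, and the function name `roll_up` are all assumptions. the sketch groups records that share an oclc control number into one display entry with a combined holdings list, and leaves records without a number as separate entries, which is one plausible reading of "whenever possible."

```python
def roll_up(records):
    """Merge records that share an OCLC control number into one display entry."""
    merged = {}  # dicts preserve insertion order, so first-seen order is kept
    for record in records:
        oclc = record.get("oclc")
        if oclc:
            key = ("oclc", oclc)
        else:
            # no control number to match on: keep the record on its own
            key = ("local", record["institution"], record["local_id"])
        display = merged.setdefault(
            key, {"title": record["title"], "oclc": oclc, "holdings": []}
        )
        # annotate the merged record with each holding institution once
        if record["institution"] not in display["holdings"]:
            display["holdings"].append(record["institution"])
    return list(merged.values())


# hypothetical records fed in by two member pipelines
SAMPLE = [
    {"institution": "unc", "local_id": "b100", "oclc": "12345",
     "title": "early english books online"},
    {"institution": "ncsu", "local_id": "n200", "oclc": "12345",
     "title": "early english books online"},
    {"institution": "unc", "local_id": "b101", "oclc": None,
     "title": "north carolina postcards"},
]
```

with the sample data, `roll_up(SAMPLE)` yields two display entries: one for the shared title, held by both institutions, and one for the locally described collection that has no control number to match on.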
we quickly realized that where any of the four institutions shared electronic access to materials, it was redundant to load copies of each record into the local databases of each institution.15 instead, one institution could take responsibility for a set of records representing shared resources. examples of such material include electronic government documents with records provided by the marcive documents without shelves program, large sets like early english books online, and pbs videos streamed by the statewide services of nc live.

in practice, one institution takes responsibility for loading, editing, and performing authority control on a given set of records. (for example, unc, as the regional depository, manages the documents without shelves record set.) these records are loaded with a special flag indicating that they are part of the shared records program. this flag generates a holdings statement that reflects the availability of the electronic item at each institution. the individual holdings statements contain the institution-specific proxy server information to enable and expedite access.

in addition to this distributed model of record loading and maintenance, we were able to leverage oai-pmh feeds to add selected resources to the searchtrln database. all four institutions have access to the data made available by the inter-university consortium for political and social research (icpsr). as we do not license these resources or maintain them locally, and as records provided by icpsr can change over time, we developed a mechanism to harvest the metadata and push it through a pipeline directly into the searchtrln indexes.
none of the member libraries’ local databases house this metadata, but the records are made available to all nonetheless.

while we were engaged in implementing these enhancements, additional sources of potential enrichment of the catalog were appearing. in particular, vendors began providing indexing services for the vast quantities of electronic resources contained in aggregator databases. additionally, they made it possible for patrons to move seamlessly from the catalog to those electronic resources via openurl technologies. indeed, services like proquest’s summon or ebsco’s discovery service might be taken as another step toward challenging the catalog’s primacy as a discovery tool as they offered the prospect of making local catalog records just a fraction of a much larger universe of bibliographic information available in a single, keyword-searchable database.

it remains to be seen, therefore, whether continuing to load many kinds of marc records into the local database is an effective aid to discovery even with the multiple delimiting capabilities that endeca provides. what is certain, however, is that our approach to indexing resources of any kind has undergone a radical transformation over the past few years—a transformation that goes beyond the introduction of any of the particular changes we have discussed so far.

promoting a culture of innovation

one important way endeca has changed our libraries is that a culture of constant innovation has become the norm, rather than the exception, for our catalog interface and content.
once we were no longer subject to the customary cycle of submitting enhancement requests to an integrated library system vendor, hoping that fellow customers shared similar desires, and waiting for a response and, if we were lucky, implementation, we were able to take control of our aspirations. we had the future of the interface to our collections in our own hands, and within a few years of the introduction of endeca by ncsu, we were routinely adding new features to enhance its functionality.

one of the first of these enhancements was the addition of a “type-ahead” or “auto-suggest” option.16 inspired by google’s autocomplete feature, this service suggests phrases that might match the keywords a patron is typing into the search box. ben pennell, one of the chief programmers working on endeca enhancement at unc-chapel hill, built a solr index from the ils author, title, and subject indexes and from a log of recent searches. as a patron typed, a drop-down box appeared below the search box. the drop-down contained matching terms extracted from the solr index in a matter of seconds or less. for example, typing the letters “bein” into the box produced a list including “being john malkovich,” “nature—effects of human beings on,” “human beings,” and “bein, alex, 1903–1988.” the italicized letters in these examples are highlighted in a different color in the drop-down display. in the case of terms drawn directly from an index, the index name appears, also highlighted, on the right side of the box.
for example, the second and third terms in the examples above are tagged with the term “subject.” the last example is an “author.”

in allowing for the textual mining of lcsh, the initial implementation of faceting in the endeca catalog surfaced those headings for the patron by uniting keyword and controlled vocabularies in an unprecedented manner. there was a remarkable and almost immediate increase in the number of authority index searches entered into the system. at the end of the fall semester prior to the implementation of the auto-suggest feature, an average of around 1,400 subject searches were done in a week. approximately one month into the spring semester, that average had risen to around 4,000 subject searches per week. use of the author and title indexes also rose, although not quite as dramatically. in the perpetual tug-of-war between precision and recall, the balance had decidedly shifted.

another service that we provide, which is especially popular with students, is the ability to produce citations formatted in one of several commonly used bibliographic styles, including apa, mla, and chicago (both author-date and note-and-bibliography formats). this functionality, first introduced by ncsu and then jointly developed with unc over the years that followed, works in two ways. if a patron finds a monographic title in the catalog, simply clicking on a link labeled “cite” produces a properly formatted citation that can then be copied and pasted into a document.
the underlying technology also powers a “citation builder” function by which a patron can enter basic bibliographic information for a book, a chapter or essay, a newspaper or journal article, or a website into a form, click the “submit” button, and receive a citation in the desired format.

an additional example of innovation that falls somewhat outside the scope of the changes discussed above was the development of a system that allowed for the mapping of simplified chinese characters to their traditional counterparts. searching in non-roman character sets has always offered a host of challenges to library catalog users. the trln libraries have embraced the potential of endeca to reduce some of these challenges, particularly for chinese, through the development of better keyword searching strategies and the automatic translation of simplified to traditional characters.

since we had complete control over the endeca interface, it proved relatively simple to integrate document delivery services directly into the functionality of the catalog. rather than simply emailing a bibliographic citation or a call number to themselves, patrons could request the delivery of library materials directly to their campus addresses. once we had implemented this feature, we quickly moved to amplify its power. many catalogs offer a “shopping cart” service that allows patrons to compile lists of titles.
one variation on this concept that we believe is unique to our library is the ability for a professor to compile such a list of materials held by the libraries on campus and submit that list directly to the reserve reading department, where the books are pulled from the shelves and placed on course-reserve lists without the professor needing to visit any particular library branch. these new features, in combination with other service enhancements such as the delivery of physical documents to campus addresses from our on-campus libraries and our remote storage facility, have increased the usefulness of the catalog as well as our users’ satisfaction with the library. we believe that these changes have contributed to the ongoing vitality of the catalog and to its continued importance to our community.

in december 2012, the libraries adopted proquest’s summon to provide enhanced access to article literature and electronic resources more generally. at the start of the following fall semester, the libraries instituted another major change to our discovery and delivery services through a combined single-search box on our home page. this has fundamentally altered how patrons interact with our catalog and its associated resources. first, because we are now searching both the catalog and the summon index, the type-ahead feature that we had deployed to suggest index terms from our local database to users as they entered search strings no longer functions as an authority index search. we have returned to querying both databases through a simple keyword search.
second, in our implementation of the single search interface we have chosen to present the results from our local database and the retrievals from summon in two, side-by-side columns. this has the advantage of bringing article literature and other resources indexed by summon directly to the patron’s attention. as a result, more patrons interact directly with articles, as well as with books in major digital repositories like google books and hathitrust. this change has undoubtedly led patrons to make less in-depth use of the local catalog database, although it preserves much of the added functionality in terms of discovering our own digital collections as well as those resources whose cataloging we share with our trln partners. we believe that the ease of access to the resources indexed by summon complements the enhancements we have made to our local catalog.

conclusion and further directions

one might argue that the integration of electronic resources into the “catalog” actually shifts the paradigm more significantly than any previous enhancements. as the literature review indicates, much of the conversation about enriching library catalogs has centered on improving the means by which search and discovery are conducted. the reasonably direct linking to full text that is now possible has once again radically shifted that conversation, for the catalog has come to be seen not simply as a discovery platform based on metadata but as an integrated system for delivering the essential information resources for which users are searching.
once the catalog is understood to be a locus for delivering content in addition to discovering it, the local information ecosystem can be fundamentally altered. at unc-chapel hill we have engaged in a process whereby the catalog, central to the library’s web presence (given the prominence of the single search box on the home page), has become a hub from which many other services are delivered. the most obvious of these, perhaps, is a system for the delivery of physical documents that is analogous to the ability to retrieve the full text of electronic documents. if an information source is discovered that exists in the library only in physical form, enhancements to the display of the catalog record facilitate the receipt by the user of the print book or a scanned copy of an article from a bound journal in the stacks.

in 2013, ithaka s+r conducted a local unc faculty survey. the survey posed three questions related to the catalog.
in response to the question, “typically when you are conducting academic research, which of these four starting points do you use to begin locating information for your research?,” 41 percent chose “a specific electronic research resource/computer database.” nearly one-third (30 percent) chose “your online library catalog.”17

when asked, “when you try to locate a specific piece of secondary scholarly literature that you already know about but do not have in hand, how do you most often begin your process?,” 41 percent chose the library’s website or online catalog, and 40 percent chose “search on a specific scholarly database or search engine.” in response to the question, “how important is it that the library . . . serves as a starting point or ‘gateway’ for locating information for my research?,” 78 percent answered extremely important.

on several questions, ithaka provided the scores for an aggregation of unc’s peer libraries. for the first question (the starting point for locating information), 18 percent of national peers chose the online catalog compared to 30 percent at unc. on the importance of the library as gateway, 61 percent of national peers answered very important compared to the 78 percent at unc.

in 2014, the unc libraries were among a handful of academic research libraries that implemented a new ithaka student survey. though we don’t have national benchmarks, we can compare our own student and faculty responses.
among graduate students, 31 percent chose the online catalog as the starting point for their research, similar to the faculty.18 of the undergraduate students, 33 percent chose the library’s website, which provides access to the catalog through a single search box.19

a finding that approximately a third of students began their search on the unc library website was gratifying. oclc’s perceptions of libraries 2010 reported survey results regarding where people start their information searches. in 2005, 1 percent said they started on a library website; in 2010, not a single respondent indicated doing so.20

the gross disparity between the oclc reports and the ithaka surveys of our faculty and students requires some explanation. the libraries at the university of north carolina at chapel hill are proud of a long tradition of ardent and vocal support from the faculty, and we are not surprised to learn that students share their loyalty. for us, the recently completed ithaka surveys point out directions for further investigation into our patrons’ use of our catalog and why they feel it is so critical to their research.

anecdotal reports indicate that one of the most highly valued services that the libraries provide is delivery of physical materials to campus addresses. some faculty admit with a certain degree of diffidence that our services have made it almost unnecessary to set foot in our buildings; that is a trend that has also been echoed in conversations with our peers. yet the online presence of the library and its collections continues to be of significant importance—perhaps precisely because it offers an effective gateway to a wide range of materials and services.
we believe that the radical redesign of the online public access catalog initiated by north carolina state university in 2006 marked a sea change in interface design and discovery services for that venerable library service. without a doubt, continued innovation has enhanced discovery. however, we have come to realize that discovery is only one function that the online catalog can and should serve today. equally if not more important is the delivery of information to the patron’s home or office. the integration of discovery and delivery is what sets the “next-gen” catalog apart from its predecessors, and we must strive to keep that orientation in mind, not only as we continue to enhance the catalog and its services, but as we ponder the role of the library as place in the coming years. far from being in decline, the online catalog continues to be an “engine of innovation” (to borrow a phrase from holden thorp, former chancellor of unc-chapel hill) and a source of new challenges for our libraries and our profession.

references

1. cathy de rosa et al., perceptions of libraries and information resources: a report to the oclc membership (dublin, oh: oclc online computer library center, 2005), 1–17, https://www.oclc.org/en-us/reports/2005perceptions.html.
2. karen calhoun, the changing nature of the catalog and its integration with other discovery tools, final report, prepared for the library of congress (ithaca, ny: k. calhoun, 2006), 5, http://www.loc.gov/catdir/calhoun-report-final.pdf.
3. roger c. schonfeld and kevin m. guthrie, “the changing information services needs of faculty,” educause review 42, no. 4 (july/august 2007): 8, http://www.educause.edu/ero/article/changing-information-services-needs-faculty.
4. ross housewright and roger schonfeld, ithaka’s 2006 studies of key stakeholders in the digital transformation in higher education (new york: ithaka s+r, 2008), 6, http://www.sr.ithaka.org/sites/default/files/reports/ithakas_2006_studies_stakeholders_digital_transformation_higher_education.pdf.
5. xi niu and bradley m. hemminger, “beyond text querying and ranking list: how people are searching through faceted catalogs in two library environments,” proceedings of the american society for information science & technology 47, no. 1 (2010): 1–9, http://dx.doi.org/10.1002/meet.14504701294; and cory lown, tito sierra, and josh boyer, “how users search the library from a single search box,” college & research libraries 74, no. 3 (2013): 227–41, http://crl.acrl.org/content/74/3/227.full.pdf.
6. charles r. hildreth, “beyond boolean; designing the next generation of online catalogs,” library trends (spring 1987): 647–67, http://hdl.handle.net/2142/7500.
7. kristen antelman, emily lynema, and andrew k. pace, “toward a twenty-first century library catalog,” information technology and libraries 25, no. 3 (2006): 129, http://dx.doi.org/10.6017/ital.v25i3.3342.
8. karen coyle, “the library catalog: some possible futures,” journal of academic librarianship 33, no. 3 (2007): 415–16, http://dx.doi.org/10.1016/j.acalib.2007.03.001.
9. karen markey, “the online library catalog: paradise lost and paradise regained?” d-lib magazine 13, no. 1/2 (2007): 2, http://dx.doi.org/10.1045/january2007-markey.
10. marshall breeding, “next-gen library catalogs,” library technology reports (july/august 2007): 10–13.
11. jia mi and cathy weng, “revitalizing the library opac: interface, searching, and display challenges,” information technology and libraries 27, no. 1 (2008): 17–18, http://dx.doi.org/10.6017/ital.v27i1.3259.
12. michael j. bennett, “opac design enhancements and their effects on circulation and resource sharing within the library consortium environment,” information technology and libraries 26, no. 1 (2007): 36–46, http://dx.doi.org/10.6017/ital.v26i1.3287.
13. eric lease morgan, “use and understand; the inclusion of services against texts in library catalogs and discovery systems,” library hi tech (2012): 35–59, http://dx.doi.org/10.1108/07378831211213201.
14. lorcan dempsey, “thirteen ways of looking at libraries, discovery, and the catalog: scale, workflow, attention,” educause review online (december 10, 2012), http://www.educause.edu/ero/article/thirteen-ways-looking-libraries-discovery-and-catalog-scale-workflow-attention.
15. charles pennell, natalie sommerville, and derek a. rodriguez, “shared resources, shared records: letting go of local metadata hosting within a consortium environment,” library resources & technical services 57, no. 4 (2013): 227–38, http://journals.ala.org/lrts/article/view/5586.
16. benjamin pennell and jill sexton, “implementing a real-time suggestion service in a library discovery layer,” code4lib journal 10 (2010), http://journal.code4lib.org/articles/3022.
17. ithaka s+r, unc chapel hill faculty survey: report of findings (unpublished report to the university of north carolina at chapel hill, 2013), questions 20, 21, 33.
18. ithaka s+r, unc chapel hill graduate student survey: report of findings (unpublished report to the university of north carolina at chapel hill, 2014), 47.
19. ithaka s+r, unc chapel hill undergraduate student survey: report of findings (unpublished report to the university of north carolina at chapel hill, 2014), 39.
20. cathy de rosa et al., perceptions of libraries, 2010: context and community: a report to the oclc membership (dublin, oh: oclc online computer library center, 2011), 32, http://oclc.org/content/dam/oclc/reports/2010perceptions/2010perceptions_all.pdf.

“am i on the library website?”: a libguides usability study

suzanna conrad and christy stevens

information technology and libraries | september 2019

suzanna conrad (suzanna.conrad@csus.edu) is associate dean for digital technologies and resource management at california state university, sacramento. christy stevens (crstevens@sfsu.edu) is the associate university librarian at san francisco state university.

abstract

in spring 2015, the cal poly pomona university library conducted usability testing with ten student testers to establish recommendations and guide the migration process from libguides version 1 to version 2. this case study describes the results of the testing as well as raises additional questions regarding the general effectiveness of libguides, especially when students rely heavily on search to find library resources.

introduction

guides designed to help users with research have long been included among a suite of reference services offered by academic libraries, though terminology, formats, and mediums of delivery have evolved over the years.
print “pathfinders,” developed and popularized by the model library program of project intrex at mit in the 1970s, are the precursor to today’s online research guides, now a ubiquitous resource featured on academic library websites.1 pathfinders were designed to function as a “kind of map to the resources of the library,” helping “beginners who seek instruction in gathering the fundamental literature of a field new to them in every respect” find their way in a complex library environment.2 with the advent of the internet, pathfinders evolved into online “research guides,” which tend to be organized around subjects or courses. in the late 1990s and early 2000s, creating guides online required a level of technological expertise that many librarians did not possess, such as html-coding knowledge or the ability to use web development applications like adobe dreamweaver. as a result, many librarians could not create their own online guides and relied upon webmasters to upload and update content. the online guide landscape changed again in 2007 with the introduction of springshare’s libguides, a content management solution that quickly became a wildly popular library product.3 as of december 2018, 614,811 guides had been published by 181,896 librarians, at 4,743 institutions in 56 countries.4 the popularity of libguides is due in part to its removal of technological barriers to online guide creation, making it possible for those without web-design experience to create content. libguides is also a particularly attractive product for libraries constrained by campus or library web templates, affording librarians and library staff the freedom to design pages without requiring higher level permissions to websites. despite these advantages, in the absence of oversight, libguides sites can develop into microsites within the library’s larger web presence. 
inexperienced content creators can inadvertently develop guides that are difficult to use, lacking consistent templates and containing overwhelming amounts of information. as a result, libraries often find it useful to develop local standards and best practices in order to enhance the user experience.5

am i on the library website? | conrad and stevens | https://doi.org/10.6017/ital.v38i3.10977

like many academic libraries, the cal poly pomona university library uses the libguides platform to provide the campus community with course and subject guides. in 2015, librarians began discussing plans to migrate from libguides version one to the version two platform. these discussions led to broader conversations about libguides-related issues and concerns, some of which had arisen during website focus group sessions conducted in early 2015. the focus groups were designed to provide the library with a better understanding of students’ library website preferences. students reported frustration with search options on the library website as well as confusion regarding inconsistent headers. even though focus group questions were related to the library website, two participants commented on the library’s libguides as well. the library was using a modified version of the library website header for vendor-provided services, including libguides, so it was sometimes unclear to students when they had navigated to an external site.6 to complicate matters, the library also occasionally used libguides for other, non-research-related library pages, such as a page delineating the library’s hours, because of the ease of updating that the platform affords. one student, who had landed on the libguides page detailing the library’s hours, described feeling confused about where she was on the library website.
she explained that she had tried to use the search box on the libguides page to navigate away from the hours page, apparently unaware that it was only an internal libguides search. as a result, she did not receive any results for her query. the language the student used to describe the experience clearly revealed her disorientation and perplexity: “something popped up called libguides and then i put what i was looking for and that was nothing. it said no search found. i don’t even know what that was, so i just went to the main page.” another participant, who also tried to search for a research-related topic after landing on a libguides page, stated, “i tried putting my topics. i even tried refining my topic, but then it took me to the guide thing.” accustomed to using a search function to find information on a topic, this student did not interpret the research guide she had landed upon as a potentially useful tool that could help with her research. she expected that her search would produce search results in the form of a list of potentially relevant books or articles. the appearance instead of a research guide was misaligned with her intentions and expectations and therefore confusing to her.7 given both the libguides related issues that emerged during the library website focus groups and the library’s plan to migrate from libguides version one to version two in the near future, the library’s digital initiatives librarian and head of reference and instruction decided to conduct usability testing focused specifically on libguides. in addition to testing the usability of specific libguides features, such as navigational tabs and subtabs, we were also interested in determining whether some of the insights gleaned from the library website focus groups and from prior user surveys and usability testing regarding users’ web expectations, preferences, and behaviors were also relevant in the libguides environment. 
specifically, prior data had indicated users were unlikely to differentiate between the library’s website and vendor-provided content, such as libguides, libanswers, the library catalog, etc. findings also suggested that rather than intentionally selecting databases that were appropriate to their topics, students often resorted to searching in the first box they saw. this included searching for articles and books on their topics using search boxes that were not designed for that purpose, such as the database search box on the library’s a-z database page and the libguides site search tool for searching all guides. although many students did not search first (many attempted to browse to specific library services), if they were not immediately successful, they would then type terms from the usability testing task into the first available search box.8 finally, we were also aware that many of our current libguides contained elements that were inconsistent with website search and design best practices as well as misaligned with typical website users’ behaviors and expectations, as described by usability experts like jakob nielsen. as such, we wanted to test the usability of some of these potentially problematic elements to determine whether they negatively impacted the user experience in the libguides environment. if they did, we would have institution-specific data that we could leverage to develop local recommendations for libguides standards and best practices that would better meet students’ needs.

literature review

the growth of libguides

since springshare’s founding in 2007, libguides have been widely embraced by academic libraries.9 in 2011, ghaphery and white visited the websites of 99 arl university libraries in the united states and found that 68 percent used libguides as their research guides platform.
they also surveyed librarians from 188 institutions, 82 percent of which were college or university libraries, and found that 69 percent of respondents reported they used libguides.10 as of december 2018, springshare’s libguides community website indicated that 1,620 academic libraries in the united states and a total of 1,823 academic libraries around the world, not counting law and medical libraries, were using the libguides platform.11 libguides’ popularity is due in part to its user-friendly format, which eliminates most technical barriers to entry for would-be guide authors. for example, anderson and springs surveyed librarians at rutgers university and found they were more likely to update and use libguides than previous static subject guides that were located on the library website and maintained by the webmaster, to whom subject specialists submitted content and any needed changes.12 the majority of librarians reported that having direct access to the libguides system would increase how often they updated their guides. moreover, after implementing the new libguides system, 52 percent said they would update guides as needed, and 14 percent said they would update guides weekly; prior to implementation, only 36 percent stated they would update guides as needed, and none said they would do so weekly.

libguides usability testing and user studies

although much literature has been published on the usability of library websites,13 fewer studies have focused on research guides or libguides specifically. of these, several focused on navigation and layout issues.
for example, in their 2012 libguides navigation study, pittsley and memmott confirmed their initial hypothesis that the standard libguides navigation tabs located in a horizontal line near the top of each page can sometimes go unnoticed, a phenomenon referred to as “banner blindness.” as a result of their findings, librarians at their institution decided to increase the tab height in all libguides, and some librarians also chose to add content menus on the homepages of each of their guides. they moved additional elements from the header to the bottom of the guide under the theory that decreased complexity would contribute to increased tab navigation recognition.14

sonsteby and dejonghe examined the efficacy of libguides’ tabbed navigational interface as well as design issues that caused usability problems. they identified user preferences, such as users’ desire for a visible search box that behaved like a discovery tool, and design issues that frequently led to confusion, such as search boxes that searched for limited types of content, like journal titles. they also found that jargon confused users, and that guides containing too many tabs that were inconsistently labeled led to both confusion and the perception that guides were “cluttered” and “busy.”15

thorngate and hoden explored the effectiveness of libguides version two designs, specifically focusing on use of columns, navigation, and the integration of libguides into the library website. they found that two-column navigation is the most usable, users are more likely to notice left navigation over horizontal tabs, and students do not view libguides as a separate platform, expecting instead for it to live coherently within the library’s website.16

almeida and tidal employed a mixed methods approach to gather user feedback about libguides, including usage of “paper prototyping, advanced scribbling, task analysis, tap, and semi-structured interviews.”17 the researchers intended to “translate user design and learning modality preferences into executable design principles,” but found that no one layout filled all students’ needs or learning modalities.18

ouellette’s 2011 study differed from many libguides-focused articles in that rather than assigning participants usability tasks, it employed in-depth interviews with 11 students to explore how they used subject guides created on the libguides platform and the features they liked and disliked about them. like some of the aforementioned studies, ouellette found that students did not like horizontal tabbed navigation, preferring instead the more common left-side navigation that has become standard on the web. however, the study was also able to explore issues that many of the usability task-focused studies did not, including whether and how students use subject guides to accomplish their own research-related academic work. ouellette found that students “do not use subject guides, or at least not unless it is a last resort.”19 reasons provided for non-use included not knowing that they existed, preferring to search the open web, and not perceiving a need to use them, preferring instead to search for information rather than browsing a guide.20 such findings call into question the wisdom of expending time and resources on creating guides.
however, ouellette asserted that students were more likely to use research guides when they were stuck, when they were required to find information in a new discipline, or when their instructors explicitly suggested that they use them.21 nevertheless, most students who had used libguides reported that they had done so solely “to find the best database for locating journal articles.”22 indeed, ouellette found that the majority of “participants had only ever clicked on the tab leading to the database section of a guide,” a finding that was consistent with staley’s 2007 study, which found that databases are the most commonly used subject guide section.23 while ouellette concluded that libguides creators should therefore emphasize databases on their guides, both the more recent widespread library adoption of discovery systems that search across databases, in many cases making it unnecessary for students to select a specific database, as well as the common practice of aggregating relevant databases under disciplinary subject headings on library databases pages implicitly call into question the need for duplicating such information on library subject guides. if users can easily find such information elsewhere, these conclusions also cast doubt on the effectiveness of the entire libguides enterprise.

information retrieval behaviors: search and browse preferences

in 1997, usability expert jakob nielsen reported that more than half of web users are “search dominant,” meaning that they go directly to a search function when they arrive at a website rather than clicking links. in contrast, only a fifth of users are “link dominant,” preferring to navigate sites by clicking on links rather than searching.
the rest of the users employ mixed strategies, switching between searching and clicking on links in accordance with what appears to be the most promising strategy within the context of a specific page.24 while some researchers have questioned the prevalence of search dominance, nielsen’s mobile usability studies have indicated an even stronger tendency toward search dominance when users access websites on their mobile devices.25 moreover, by 2011, nielsen’s research had indicated that search dominance is a user behavior that gets stronger every year, and that “many users are so reliant on search that it’s undermining their problem-solving abilities.” specifically, nielsen found that users exhibited an increasing reluctance to experiment with different strategies to find the information they needed when their initial search strategy failed.26 nielsen attributes the search dominance phenomenon to two main user preferences. the first is that search allows users to “assert independence from websites’ attempt to direct how they use the web.”27 the second is that search functions as an “escape hatch when they are stuck in navigation. when they can’t find a reasonable place to go next, they often turn to the site’s search function.” nielsen developed a number of best practices based on these usability testing results, including that search should be made available from every page in a website, since it is not possible to predict when users will feel lost. additionally, given that users quickly scan sites for a box where they can type in words, search should be configured as a box and not a link, it should be located at the top of the page where users can easily spot it, and it should be wide enough to accommodate a typical number of search terms.28 nielsen’s usability studies have shed light not only on where search should be located but also on how search should function.
in 2005, nielsen reported that searchers “now have precise expectations for the behavior of search” and that “designs that invoke this mental model but work differently are confusing.”29 specifically, searchers’ “firm mental model” for how search should work includes “a box where they can type words, a button labeled ‘search’ that they click to run the search, [and] a list of top results that’s linear, prioritized, and appears on a new page.” moreover, nielsen found that searchers want all search boxes on all websites to function in the same way as typical search engines and that any deviation from this design causes usability issues. he specifically highlighted scoped searches as problematic, pointing out that searches that only cover a subsite are generally misleading to users, most of whom are unlikely to consider what the search box is actually searching.30 while there is much evidence to support nielsen’s claims about the prevalence of search dominance, other studies have suggested that users themselves are not necessarily always search or link dominant. rather, some websites lend themselves better to searching or exploring links, and users often adjust their behaviors accordingly.31 although we did not find studies that specifically discussed the search and browse preferences and behaviors of libguides users, we did find studies of library website use that suggested that though users often exhibit search-dominant tendencies, they also often rely on a mixed approach to library website navigation. for example, hess and hristova’s 2016 study of users’ searching and browsing tendencies explored how students access library tutorials and online learning objects.
specifically, they compared searching from a search box on the tutorials landing page, using a tag cloud under a search box, and browsing links.32 google analytics data revealed that students employed a mixed approach, equally relying upon both searching and clicking links to access the library’s tutorials.33 similarly, han and wolfram analyzed clickstream data from 1.3 million sessions in an image repository and determined that the two most common actions (86 percent of actions) were simple search and click actions.34 however, users in this study exhibited a tendency toward search dominance, conducting simple searches in 70 percent of the actions.35 niu, zhang, and chen presented a mixed methods study analyzing search transaction logs and conducting usability testing focused on comparing the discovery layers vufind and primo. browsing in the context of their study included browsing search results. they found that most search sessions were very brief, and students searched using two or three keywords.36 xie and joo tested how thirty-one participants went about finding items on a website, classifying their approaches into what they described as eight “search tactics,” including explorative methods, such as browsing.37 over 88 percent of users conducted at least one search query, and 75 percent employed “iterative exploration,” browsing and evaluating both internal and external links on the site “until they were satisfied or they quit.”38 only four of thirty-one, or 6.7 percent, did “whole site exploration,” a tactic which included browsing and evaluating most of the available information on a website, looking through every page on the site to find the desired information.39

method

this study addresses the following research questions:

1. when prompted to find a research guide, are students more likely to click links or type terms into a search box to find the guide?
2. are students more likely to successfully accomplish usability tasks directing them to find specific information on a libguide when using a guide with horizontal or vertical tabs?
3. how likely are students to click on subtabs?
4. how and to what extent does a one-, two-, or three-column content design layout affect students’ ability to find information on a libguide?
5. how and to what extent do students use embedded search boxes in libguides?
6. do students confuse screenshots of search boxes with functioning search tools?

in 2015, the university library had access to two versions of libguides: the live version one instance and a beta version two instance. in order to answer our research questions and make data-informed design decisions that would improve the usability of our libguides, we compared the usability of existing research guides in libguides version one to test sites on libguides version two. version two guides differed from version one guides in several ways. version two guides were better aligned with nielsen’s recommendations regarding search box placement and function. every libguide page included a header identical to the library website’s header, which contained a global search box that searched both library resources and the library’s website. the inclusion of a visible discovery tool in the header was consistent with usability recommendations in the literature40 as well as our own prior library website usability tests, which indicated many users preferred searching for resources over finding a path to them by clicking through a series of links. in mid-april 2015, ten students were scheduled to test libguides. each student attempted the same seven tasks, but five students tested the current version of libguides and five students tested version two.
the sessions were recorded using camtasia, and students completed usability tasks on a laptop that was hooked up to a large television monitor, allowing the two librarians who were in the room to observe how students navigated the library’s website and libguides platform. one librarian served as the moderator and the other managed the recording technology.41 although additional members of the web team were interested in viewing the test sessions, in order to avoid overwhelming the students, only two librarians sat in the sessions. the moderator read tasks aloud and students were instructed to think aloud while completing each task, narrating their thought processes and navigational decisions. students were recruited via a website usage and perceptions survey sent out the prior quarter, which included a question as to whether they would be interested in participating in usability testing. the students who received this survey were selected from a randomized sample provided by the university’s institutional research office. the sample included both lower division students in the first or second year of their studies and transfer students. students were also recruited in information literacy instruction sessions for lower-level english courses as well as in a credit-bearing information literacy course taught by librarians. survey respondents and students from the targeted classes who indicated that they would be interested in participating in usability testing were subsequently contacted via email. students with appropriate testing day availability were selected. students from the various colleges were represented, including engineering; business administration; letters, arts and social sciences; education and integrative studies; and hospitality management. all of the participants were undergraduates and most were lower division students.
we chose to focus on recruiting lower division students because we wanted to ensure that our guides were usable by students with the least amount of library experience; many lower division students are unaware of library services and may not have taken a library instruction session or a library information literacy course. however, while the goal was to recruit lower division students, scheduling difficulties, including three no-shows, led us to recruit students on the fly who were in the library, regardless of their lower division or upper division status.

task 1

in both rounds of usability testing, students were prompted to find a “research guide” to help them write a paper on climate change for a com 100 class. students started from the homepage of the library. two possible success routes included browsing to a featured links section on the homepage where a “research guides” link was listed (see figure 1) or searching via the top level “onesearch” discovery layer search box, displayed in figure 2, which delivered results, including articles from databases, books from the catalog, library website pages, and libguides pages, in a bento-box format. the purpose of this task was to determine if students browsed or searched to find research guides. we defined browsing as clicking on links, menus, or images to arrive at a result, whereas searching involved typing words and phrases into a search box.

figure 1. featured links section on library homepage.

figure 2. onesearch search box on library homepage.

task 2

task 2 was designed to compare the usability of libguides version one’s horizontal tab orientation with version two’s left navigation tab option. students were provided with a scenario in which they were asked to compare two public opinion polls on the topic of climate change for the same com 100 class.
we displayed the appropriate research guide for the students and instructed them to find a list of public opinion polls. the phrase “public opinion polls” appeared in the navigation of both versions of the guide. figure 3 displays the research guide with horizontal tab navigation and figure 4 with vertical, left tab navigation.

figure 3. horizontal tab navigation.

figure 4. left tab navigation.

task 3

in the third scenario, students were informed that their professor recommended that they use a library “research guide” to find articles for a research paper assignment in an apparel merchandising and management class. students were instructed to find the product development articles on the research guide. the phrase “product development” appeared as a subtab in both versions of the guide. this task was intended to test whether students navigated to subtabs in libguides. as shown in figure 5, the subtab located on the horizontal navigation menu appeared when scrolled over but was otherwise not immediately visible. in contrast, figure 6 shows how the navigation was automatically popped open on the left tab navigation menu so that subtabs were always visible, a newly available option in libguides version two.

figure 5. horizontal subtab options.

figure 6. left tab navigation with lower subtabs automatically open.

task 4

on the same apparel merchandising and management libguide, students were asked where they would go to find additional books on the topic of product development. the librarian who designed this libguide had included search widgets in separate boxes on the page that searched the catalog and the discovery layer “onesearch.” we were interested in seeing whether students would use the embedded search boxes to search for books.
this functionality was identical in both the version one and two instances of the guide, as shown in figure 7.

figure 7. embedded catalog search and embedded discovery layer search.

task 5

in the fifth scenario, students were told that they were designing an earthquake-resistant structure for a civil engineering class. as part of that process, they were required to review seismic load provisions. we asked them to locate the asce standards on seismic loads using a research guide we opened for them. the asce standard was located on the “codes & standards” page, which could be accessed by clicking on the “codes & standards” tab. the version one instance of the guide was two-columned, and a link to the asce seismic load standard was available in the second column on the right, per figure 8. the version two instance of the guide used a single, centered column, and the user had to scroll down the page to find the standard, per figure 9. we wanted to see if students noticed content in columns on the right, as many of our libguides featured books, articles, and other resources in columns on the right side of the page, or whether guides with content in a single central column were easier for students to use.

figure 8. two-column design with horizontal tabs.

figure 9. two-column design with left tab navigation.

task 6

because librarians sometimes included screenshots of search interfaces in their guides, we were interested in testing whether students mistook these images of search tools for actual search boxes. in task six, we opened a civil engineering libguide for students and told them to find an online handbook or reference source on the topic of finite element analysis. as shown in figure 10, a screenshot of a search box was accompanied by instructional text explaining how to find specific types of handbooks.
within this libguide, there were also screenshots of the onesearch discovery layer as well as a screenshot of a “findit” link resolver button.

figure 10. screenshots used for instruction.

task 7

the final task was designed to test whether it was more difficult for students to find content in a two- or three-column guide. students were instructed to do background research on motivation and classroom learning for a psychology course. they were told to find an “encyclopedic source” on this topic. within each version of the psychology libguide, there was a section called “useful books for background research.” as shown in figure 11, in the version one libguide, books useful for background research were displayed in the third column on the right side of the page. the version two libguide displayed those same books in the first column under the left navigation options, as shown in figure 12.

figure 11. books displayed in third column.

figure 12. two-column display with books in the left column.

results

searching vs. browsing to find libguides

understanding how students navigate and use libguides is important, but if they have difficulty finding the libguides from the library homepage, usability of the actual guides is moot. of the ten students tested, six students used the onesearch discovery layer located on the library’s homepage to search for a guide designed to help them write a paper on climate change for a com 100 class. frequently used search terms included “research guide,” “communication guides,” “climate change,” “climate change research guide,” “faculty guides,” and “com 100.” of these students, two used search as their only strategy, typing search queries into whichever search box they discovered.
neither of these students was successful at locating the correct guide. the remaining four students used mixed strategies; they started by searching and resorted to browsing after the search did not deliver exact results. two of these students were eventually successful in finding the specific research guide; two were not. of the six students who searched using the discovery layer, only one did not find the libguides landing page at all. in general, it seems that the task and student expectations during testing were not aligned with the way the guide was constructed. only one student went to the controversial topics guide because “climate change is a controversial topic.” one student thought the guide would be titled “climate change” and another thought there might be a subject librarian dedicated to climate change. students would search for keywords corresponding with their course and topic, but generally they did not make the leap to focus more broadly on controversial topics. only one student browsed directly to the “research guides” link on the homepage and found the guide under subject guides for “communication” on the first try. another student navigated to a “services and help” page from the main website navigation and found a group of libguides that were labeled “user guides,” designed specifically for new students, faculty, staff, and visitors; however, the student did not find any other libguides relevant to the task at hand. the remaining two students navigated to pages with siloed content; one student clicked the library catalog link on the library homepage and began searching using the keywords “climate change.” the other student clicked on the “databases” link. upon arriving at the databases a-z page, the student chose a subject area (science) and searched for the phrase “faculty guides” in the databases search box.
the student was unable to find the research guide because our libguides were not indexed in this search box; only database names were listed. only three out of ten students found the guide; the rest gave up. two of the successful participants employed mixed strategies that began with searching and included some browsing; the third student browsed directly to the guide without searching. testers in the libguides version one environment attempted the task an average of 3.8 times before achieving success or giving up compared to an average of 3.2 attempts per tester in version two testing. we defined an attempt as browsing or searching for a result until the student tried a different strategy or started over. for instance, if a student tried to browse to a guide and then chose to search after not finding what they were looking for, that constituted two attempts. testers in both rounds began on the same library website. one major difference between the two research guides landing pages was the search boxes; one was an internal libguides search box (version one) and one was a global onesearch box (version two). it is possible that testers in round two made fewer attempts because of the inclusion of the onesearch box. for those testing with the libguides search box in version one, three searched on the libguides landing page. from both rounds, eight of the students located the libguides landing page, regardless of whether or not they found the correct guide. the two students who did not find the correct guide did land in libguides, but they arrived at specific libguides pages that served other purposes (one found a onesearch help guide and the other landed on a new users’ guide). navigation, tabs, and layout navigation, tab options (including subtab usage), and layouts were evaluated in tasks two, three, five, and seven. 
as mentioned in the method section, the first group of five students who tested the interface used the version one libguides with horizontal navigation and hidden subtabs. the second round of five students used the version two libguides with left navigation and popped open subtabs. students in both rounds were able to find items in the main navigation (not including subtabs) at consistent rates, with those in the second round with left navigation completing all tasks faster than the first-round testers (38 seconds faster on average across the full set of tasks). in task two, students were asked to find public opinion polls, which they could access by clicking a “public opinion polls” link on the main navigation. in both rounds, regardless of horizontal or vertical navigation, nine of the students clicked on the tab for the polls. only one student testing on version two was unable to find the tab. students in version one testing with horizontal navigation attempted this task two times on average before successfully finding the tab; students testing on version two with vertical navigation attempted 1.4 times before finding the tab with the polls or giving up. when asked in task three to find articles on product development, which were included on a “product development” subtab under the primary “library databases amm” tab, nine out of ten students were unable to locate the subtab. in libguides version one, this subtab was only viewable after clicking the main “library databases amm” tab. in libguides version two, this subtab was popped open and immediately visible underneath the “library databases amm” tab. a version two tester was the only student who clicked on the “product development” subtab. students attempted this task 1.8 times in version one testing compared to 1.2 times for those testing version two.
it is worth noting that six of the students found product development articles by searching via other means (onesearch, databases, and other library website links); they just did not find the articles on the libguide shown. while they still successfully found resources, they did not find them on the guide we were testing. in task five, we asked students to find the asce standards on seismic loads on a specific guide. the version one guide used a two-column design while the version two guide with the same content utilized a single column for all content. while six students found the standards (three in round one and three in round two), only four of ten testers overall did so by browsing to the resource. three of the students who chose to browse were in round one and the fourth student was from round two. in version one testing with the two-column design, two students found the standards after making two attempts to browse the guide. both of these students used the libguides “search this guide” function to find the correct page for the standards using keywords “asce standards” and “asce.” the third successful student in this round used a mixed methods strategy of searching and browsing. she used the search terms “asce standards on seismic loads” and then searched for “seismic loads” twice in the same search box. she landed on the correct tab of the libguide, scrolling over the correct standard multiple times, but only found the standards after the sixth attempt. during version two testing, which included the one column design and global search box, only one student browsed to the standards on the libguide. this student scrolled up and down the main libguide page, clicked on the left navigation option for “find books,” then the left navigation option for “codes & standards” and scrolled down to find the correct item. four out of five version two testers bypassed browsing altogether, instead using the onesearch box on the page header to try to find the asce standards. 
two of those students found the specific asce standards that were featured on the libguide; the other two found asce standards, just not the specific item we intended for them to find. the four students who did not find the specific standards were equally distributed across both testing groups. on average, students attempted to complete the task 3.6 times in version one testing and 1.6 times in version two testing before either finding the resource or giving up. task seven asked students to find an encyclopedic source using a three-column design in version one and a two-column design in version two. the version one guide listed encyclopedias in the right-most column of a three-column layout and the version two guide included them under the left navigation in a two-column design. only three students found the encyclopedia mentioned in task seven, two of whom completed the task using version two’s two-column display. only one student was able to locate the encyclopedia in the third column in version one testing. the seven students who were unable to find the encyclopedia all attempted to search when they were unable to find the encyclopedia by browsing. six of these seven students searched for the keywords “motivation and classroom learning” and the seventh for “motivation and learning.” those who landed in onesearch (six out of seven) received many results and were unable to find encyclopedias. one student searched within libanswers for “encyclopedia” and found britannica. one student attempted to refine by facets, thinking that “encyclopedia” would be a facet similar to “book” or “article.” using search, especially onesearch, to attempt to find an encyclopedia was ultimately unsuccessful for the students. search terms students chose were far too general for them to complete the task successfully. students in version one testing attempted this task 2.4 times compared to 3.2 times for version two testers.
embedded search boxes & screenshots of search boxes embedded search boxes and screenshots of search boxes were tested in tasks four and six. the header used in version one libguides was limited, defaulting to searching within the guide, and the additional options on the dropdown menu next to the search box did not include a global “onesearch.” in version two guides, a onesearch box that searched most library resources (articles, books, library webpages, and libguides) was included. during task four, which asked students who were already on a specific guide how they would go about finding additional books on product development, version one testers were much more likely to use embedded search box widgets in the guide content. three of the five students in version one testing used the search widgets on the page to either search the catalog or search onesearch. the remaining two students in that round used a header search or browsed. one of these students used the libguides “search this guide” function in libguides and searched for “producte [sic] development books.” this student did not notice the typo in the search term and subsequently navigated out of libguides to the library website via the library link on the libguides header. the user then searched the catalog for “product development” and was able to locate books. a fifth student in the version one testing round did not use embedded search box widgets or the libguides search. she browsed through two guide pages and then gave up. in version two testing, three of five students used the global onesearch box to find the product development books. the remaining two students chose to search the millennium catalog linked from a “books and articles” tab on the main website header, finding books via that route. during testing of both versions, students tried an average of 1.5 times to complete the task before achieving success or acknowledging failure. 
nine out of ten testers found books on the topic of product development. the one tester who did not find the books attempted to complete the task one time; she found product development articles from the prior task and said she would click on the same links (for individual article titles) to find books. in task six, half of the ten students from both rounds attempted to click on screenshots of search boxes or unlinked “findit” buttons. a screenshot of the onesearch box and a knovel search box were embedded in the test engineering guide. two users in the version one testing and one tester in version two testing attempted to click on the onesearch screenshot. one student in version two testing attempted to click on the knovel search box screenshot. one student from version one testing tried to click on a “findit” button for the link resolver.

comparisons between rounds

we recorded how many attempts were needed to complete tasks in each round. in round one, which tested libguides version one, students took an average of 2.74 tries to complete the tasks. in round two, which focused on libguides version two, students took two tries to complete tasks. average attempts per task are displayed in figure 13. we also timed the rounds to see how many minutes it took students to complete all of the tasks. in the first round, it took 16:07 minutes on average and in the second round 15:29 minutes. this does not appear to constitute an important difference, but there was one tester in round two who narrated his experiences very explicitly and in great detail. his session lasted 23 minutes. if his testing is excluded, then round two had a shorter average of 13:30 minutes. despite the lower total time spent testing, task success was nearly equal between the two rounds. details on individual testing times per participant are in figure 14.
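the adjusted round two average can be checked with a short worked calculation. the sketch below is illustrative only: the helper names are hypothetical, the round size of five testers and the 15:29 mean come from the text, and the outlier session is assumed to have lasted exactly 23:00 because only a rounded figure is reported. under that assumption the adjusted mean comes out at 13:36, close to the reported 13:30; the small gap is plausibly due to rounding in the 23-minute figure.

```python
def mmss_to_seconds(text):
    """Convert a 'minutes:seconds' string such as '15:29' to seconds."""
    minutes, seconds = text.split(":")
    return int(minutes) * 60 + int(seconds)


def seconds_to_mmss(total_seconds):
    """Format a duration in seconds as 'minutes:seconds', rounded to the second."""
    total = round(total_seconds)
    return f"{total // 60}:{total % 60:02d}"


def mean_without(mean_seconds, n, excluded_seconds):
    """Recompute a mean after dropping one observation from a group of n."""
    return (mean_seconds * n - excluded_seconds) / (n - 1)


# Figures reported in the text: five testers in round two, mean 15:29.
round_two_mean = mmss_to_seconds("15:29")
# Assumption: the long session is treated as exactly 23:00.
outlier_session = mmss_to_seconds("23:00")

adjusted = mean_without(round_two_mean, 5, outlier_session)
print(seconds_to_mmss(adjusted))  # prints 13:36
```

the same helper can be reused for any round in which a single unusually long session is set aside, provided the per-round mean and group size are known.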
in round one, testers were successful at completing the task, whether they completed it in the manner we predicted or not, for 24 tasks. round two was slightly lower with 23 successfully completed tasks. success was, however, subjective. in task three, we wanted to test whether students found a list of articles on a libguide on a certain topic. nearly all of the students (nine out of ten) found articles on the topic, but only one of them found them via the method we had anticipated. other tasks produced similar results where the students found resources that technically fulfilled the task we had asked them to complete, even though they did not test the feature of the interface we were hoping. in these cases, we called this a success, as they had fulfilled the task as written.

figure 13. attempts per task for libguides v1 compared to libguides v2.

figure 14. total time per participant for libguides v1 compared to libguides v2.

discussion

there were several overarching themes that we discovered during the testing of libguides versions one and two. the first relates to nielsen’s conception of search dominance and its implications for finding guides as well as resources within guides. task one, which asked students to navigate to a relevant libguide from the library homepage, revealed that students were much more likely to search for a guide than to navigate to one by using links. although the library homepage in our study included a clearly demarcated “research guides” link, only one tester clicked on it. in contrast, six of ten of the students used search as their first and only strategy, and an additional two of ten first clicked on a link and then switched to search as their next strategy.
although our initial search-focused research question and related task looked specifically at how students navigate to guides, most of the other tasks provided additional insight into how students navigate within them as well. our findings are consistent with nielsen’s observation that search functions as an “escape hatch” when users get “stuck in navigation.”42 many students we tested used mixed strategies to find content, often resorting to searching for content when they were confused, lost, or impatient. while one student explicitly stated that search is a backup for when he cannot find something via browsing, search behaviors from many other students suggested that they were “search-dominant,” preferring searching over browsing both on library website pages and from within libguides. similar to nielsen’s studies on reliance on search engine results, students were unlikely to change their search strategies even if they were not receiving helpful results. students did not engage in what xie and joo referred to as “whole site exploration,” browsing and evaluating most of the available information on a website to accomplish the assigned tasks.43 while research guides are sometimes designed to function as linear pathways that lead students through the research process or as comprehensive resources that introduce students to a series of tools and resources, all of which could be useful in the research process, the students we tested did not approach guides in this way. rather than starting on the first tab and comprehensively exploring it tab by tab and content box by content box, students ignored most of the content on the page, searching instead to find the specific information they needed. our testers’ search behaviors were also consonant with nielsen’s observation that scoped searches are inconsistent with users’ mental models about how search should function.
nielsen found that search boxes that only cover a subsection of a site are generally confusing to users and negatively impact users’ ability to find what they are looking for on a site. in our study, several students used scoped search boxes both on library website pages and within libguides to find content that the search did not index. version two testers had access to a search box on every page that aligned with their global search expectations, and they frequently used it, so much so that their preference for search disrupted some of the usability questions we were trying to answer in our tasks. for example, users’ tendency to search instead of browse interfered with our ability to clearly discern whether it was easier for students to find content on pages with one-, two-, or three-column content designs (many students did not even attempt to find content in the columns). students’ global search expectations of search boxes also have implications for their ability to find libguides that they have been told exist or to discover the existence of libguides that might help them with their research. for example, students with search-dominant tendencies who attempt to use a library search tool that does not index libguides or the content within libguides will be unlikely to find them. while students did use search boxes embedded within libguides content areas, version two testers had access to a global search box located at the top right-hand side of every libguides page, and as a result, they were more likely to use the global search than the embedded search boxes. this behavior is consistent with nielsen’s assertion that for ease of use, search should consist of a box “at the top of the page, usually in the right-hand corner,” that is “wide enough to contain the typical query.”44 version two testers were quick to find and use the search box in the header that fit this description.
although students often used search boxes, and global ones in particular, to accomplish usability testing tasks, they were sometimes impeded by screenshots of search boxes and links. several students clicked on them thinking they were live, unable to immediately distinguish that they differed from the functional embedded search boxes that some of the guides also included. as nielsen observed, “users often move fast and furiously when they’re looking for search. as we’ve seen in recent studies, they typically scan the homepage looking for ‘the little box where i can type.’”45 librarians sometimes use screenshots of search boxes in an effort to provide helpful visuals to accompany instructional content (text) focusing on how to access and use a specific resource. because many students scan the page for a search box so that they can quickly find needed information rather than carefully reading material in the content boxes, it could be argued that these screenshots inadvertently confuse students and impede usability. another way to look at this issue, however, may be that guide content can be misaligned with user expectations and contexts. a user looking to search for articles on a topic who stumbles on a guide may have no reason to do anything other than look for a search box. in contrast, a user introduced to a guide in the context of a course who is asked to read through the content and explore three listed resources in preparation for a discussion to occur in the next class meeting will likely have a very different orientation to the guide and perception of its purpose and usefulness. students’ search behaviors also made us question the efficacy of linking to specific books or articles within a libguide.
in tasks three through seven, many of the students used onesearch or the library catalog to search for specific books or articles rather than referencing the guide where potentially useful resources were listed. for example, while trying to find the com 100 guide during task one, one student commented, “i never really look for stuff. i just go to the databases.” version two testers, who had access to a global search in the header of every libguides page, were even more likely to navigate away from the guides to find books or articles. while several studies in the literature had suggested that vertical tab navigation may be more usable than horizontal tab navigation, our study did not bear this out, as students in both rounds were able to find items on vertical and horizontal navigation menus at relatively consistent rates. similarly, one-, two-, and three-column content design did not appear to affect users’ abilities to find information and links on a page; however, users’ tendency to search rather than browse interfered with the relevant task’s intention of comparing the browsability of different content column designs, and therefore more targeted research on this question is needed. one student commented on the pointlessness of content in second columns, stating “nobody ever looks on the right side, i always look on the left cause everything’s usually on the left side. because you don’t read from right to left, it’s left to right.” he was, nevertheless, able to complete the task regardless of the multi-column design. subtab placement differed considerably between libguides versions one and two; version one subtabs were invisible to users unless they hovered over the main menu item on the horizontal menu, while version two allowed us to make subtabs immediately visible on the vertical menu, without any action needed by the user to uncover their existence.
given the subtabs’ visibility, we had anticipated that version two testers would be more likely to find and use subtabs, but this turned out not to be the case. only one out of ten students found the relevant subtab. although the successful tester was using libguides version two in which the subtab was visible, the fact that nine out of ten testers failed to see the subtab, regardless of whether it was immediately visible or not, suggests that subtab usage may not be an effective navigation strategy. results from all tasks also suggested that students might not understand what research guides are or how guides might help them with their research. like many libraries, the cal poly pomona university library did not refer to libguides by their product name on the library website, labeling them “research guides” instead in an effort to make their intended purpose clearer. testing revealed, however, that students are not predisposed to think of a “research guide” as a useful tool to help them get started on their research. one student said, “i’m not sure what the definition of a research guide is.” when prompted to think more about what it might be, the student guessed that it was a pamphlet with “something to help me guide the research.” the student did not offer any additional guesses about what specifically that help might look like. moreover, students’ tendency to resort to search itself can also be interpreted as evidence that they are confused about how guides are supposed to help them with research. instead of reading or skimming information on the guides, students used search as a strategy to attempt to complete the tasks an average of 70 percent of the time across both rounds. many of their searches navigated students away from the very guides that were designed to help them. 
the tendency to navigate away from guides was likely increased by the content included in the guides we tested, since many incorporated search boxes and links that pointed to external systems, such as the catalog, the discovery layer, libanswers, etc. however, many students’ first attempts to accomplish the tasks given them involved immediately navigating away from libguides. others navigated away shortly after an initial attempt or two to complete the task within the guide. all but one student navigated away from libguides to complete tasks; four did so more than five times. eight of ten students used onesearch in the header or from the library homepage; the other two used embedded onesearch boxes on the libguides. results also suggested that it might be easier for students to find guides that are explicitly associated with their courses, through either the guides’ titles or other searchable metadata, than to find and understand the relevance of general research guides. even though general research guides might be relevant to the subject matter of students’ courses, guides that explicitly reference a course or courses are easily discoverable and their relevance is more immediately obvious. for instance, the first task asked students to find a “research guide” to help them write a paper on climate change for a com 100 class. we wanted to see whether students would find the “controversial topics” research guide that was designed for com 100 and that included the course number in the guide’s metadata. mentioning the course number in the task seemed to make it more actionable as an assignment they might expect from a professor. when students searched for “com 100,” they were more likely to find the controversial topics guide; two of three students who found the guide searched using the course number.
if course numbers had not been included, they might not have found the guide, as searching for the course number brought up the correct guide as the one result. two additional students unsuccessfully attempted to find the guide by searching for “com100,” without a space. had the libguides search been more effective, or had librarians included both versions of the course code with and without a space, more students would likely have found the guide.

limitations

limitations of this study include weaknesses in both our usability tasks and the content of some of the libguides, which made it difficult to answer our research questions. we may have tested too many different features at once, which can be a pitfall of usability testing in general. some tasks, such as tasks five and seven, tested both navigation placement and column layouts. in task five, for instance, there were multiple factors that could have led to success or failure; did a student overlook the asce standards because of column layout or tab placement or was the layout moot because the search box was comprehensive enough to allow them to complete the task without browsing the guide’s content? similarly, task two tested a guide with seven tabs. it is not clear if the students who did not click on a tab missed it because of the placement of the navigation on the page or because the navigation contained too many options. weaknesses in the content of many of the libguides used in the study led to additional limitations. many of the libguides were text heavy and included jargon. one student even commented, “it’s a lot of words here, so i really don’t want to read them.” although we set out to test the usability of different navigation layouts and template designs, factors such as content overload or use of jargon could have influenced success or failure. the wording of task seven, for example, was particularly problematic and led to unclear results.
students were instructed to find an “encyclopedic source” in an attempt to see if they would click on books listed in a third column in version one testing compared to a left column in version two testing. the column header was titled “useful books for background research” and the box included encyclopedias. students appeared to struggle with the idea of what constituted an “encyclopedic source.” when one student was specifically asked what she thought the term meant, she responded, “not sure.” based on the results of this task, it was difficult to discern if the interface or the wording of the task resulted in task completion failures. the contrived nature of usability testing itself might also have affected our results. for example, one student exhibited a tendency to rush through tasks, a behavior that may have been due to experiencing content overload, anxiety over being observed during the testing process, time limitations of which we were unaware, etc. on the other hand, behavior that we perceived to be rushing might be consistent with the students’ normal approach to navigating websites. whatever the case, it is important to keep in mind that usability testing puts users on the spot because they are testing an interface in front of an audience. the usability testing context can therefore influence user behavior, including the number of times students might attempt to find a resource or complete a given task. some students might be impatient or uncomfortable with the process, resulting in attempts to complete the testing as quickly as possible, including giving up on tasks more quickly than they would in a more natural setting. conversely, other students might be more likely to expend more time and effort when performing in front of an audience than they would privately.
conclusion

usability testing was effective for revealing some of the difficulties students encounter when using our libguides and our website and for prompting reflection on the types of content they include, how that content is presented, and the contexts in which that content may or may not be useful to our students. analysis of the data from our study and a review of the literature within the context of existing political realities and constraints within our library led to our development of several data-informed recommendations for libguides creators, most of which were adopted.

one of the most important recommendations was that libguides should use the same header that is on the library’s main website, which includes a global search box. using the same header would not only provide a consistent look and feel but would also give users a global search box at the top of the page, aligned with their mental model of how search should function. our testing confirmed that many students prefer to use global search boxes to find information, either instead of browsing or in addition to browsing when they get stuck. while some librarians were not thrilled with what they viewed as the privileging or predominance of the discovery layer on their guides, preferring to direct students to specific databases instead of the onesearch, this recommendation was ultimately accepted due to the compelling nature of the usability data we were able to share. our recommendation that subtabs should be avoided was also accepted because of how compelling the data was: 90 percent of users failed to find links located on subtabs.

we also recommended that librarians should evaluate the importance of all content on their guides to minimize student confusion when browsing. while we acknowledged that there might be contexts when screenshots of search boxes would be useful, we encouraged librarians to think carefully about their use and to avoid them when possible.
additionally, librarians were encouraged to evaluate whether the content they were adding was of core importance to the libguide, reflecting on the degree to which it added value or possibly detracted from the libguide, perhaps by virtue of lack of relevance or content overload. content boxes consisting of suggested books on a general subject guide were used as an example, given the difficulty of providing useful book suggestions to students working on wildly different topics.

while results from our rounds of usability testing did not indicate that left-side vertical navigation was decidedly more usable than horizontal navigation at the top of the page, we nevertheless recommended that all guides should use left tab navigation, for consistency’s sake across guides, because left-side navigation has become standard on the web, and because other libguide studies have suggested that left-side navigation is easier to use than horizontal navigation, due to issues such as “banner blindness.”46 the librarians agreed, and a template was set up in the administrative console requiring that all public-facing libguides use left tab navigation. based on other usability studies in the literature as well, we also recommended that guides should include no more than seven main content tabs.47

although our study did not provide any actionable data about the relative usability of one-, two-, and three-column content designs, other articles in the literature had emphasized the importance of consistency and avoiding a busy look with too much content. in order to avoid both a busy look and having guides that looked decidedly different from each other due to inconsistent numbers of columns, we therefore recommended that all guides should utilize a two-column layout, with the left column reserved for navigation. all content should appear in a single main column.
however, future iterations of libguides usability testing should attempt to find ways to test whether limiting content to a single column is indeed more usable than dispersing it across two or more columns. the group voted on many of our recommendations, and several were simple to implement and oversee because they could translate into design decisions that could be set up as default, unchangeable options within the libguides administration module. other recommendations were more difficult to operationalize and enforce. for example, because our findings indicated that students attempted to search for course numbers to find a guide that they were told was relevant to their research for a specific class, another one of our recommendations to the librarians’ group was to include, as appropriate, various course numbers in their guides’ metadata in order to both make them more discoverable and appear more immediately relevant to students’ coursework. this recommendation is not one that a libguides administrator could enforce due to issues revolving around subject matter and curriculum knowledge. the issue of context, and specifically the connection between courses and guides that has the potential to underscore their relevance and purpose to students, also caused us to question the effectiveness of general subject guides in assisting students with their research. if students are more likely to understand the relevance and purpose of a libguide when it is explicitly connected to their specific class or assignment and less likely to make the connection between a general research guide and their coursework, then the creation and maintenance of general subject guides might not be worth the time and effort librarians invest in them. 
this question is made more pressing by studies in the literature that indicate both low usage and shallow use of guides, such as using them primarily to find a database.48 while this question did not lead to a specific recommendation to the librarians’ group, we have since reflected that the return on investment issue might be effectively addressed via closer collaboration with faculty in the disciplines. if research guides are more clearly aligned with specific research assignments in specific courses, and if faculty members instruct their students to consult library research guides and integrate libguides and other library resources into learning management systems, perhaps use and return on investment would improve. researchers like hess and hristova, for example, found that online tutorials that are required in specific courses show high usage.49 the connection between course integration and usage may hold true with libguides as well.

regardless, students’ frequent lack of understanding of what guides are designed to do and their tendency to navigate quickly away from them rather than exploring them suggest that reconceptualizing what guides are designed to do, and what needs they are designed to meet in what specific contexts, might prove to be a useful exercise. a guide designed as an instructional tool to teach specific concepts, topic generation processes, search strategies, citation practices, etc. within the context of a specific assignment for a specific course may well be immediately perceived as relevant to students in that course. such a guide discussed in the context of a class might also be perceived as more useful than guides consisting of lists of resources and tools, which are unlikely to be interpreted as helpful by students who stumble upon them while seeking research assistance on the library’s website.
as such, thinking about how and in what context students are likely to find guides, and how material might be presented so that guides are quickly perceived as a potentially relevant resource worth exploring might also prove useful. the importance of talking to users cannot be overemphasized; without collecting user feedback, whether through usability testing or another method, it is difficult to know how students perceive and use libguides or any other library online service. getting user input on navigation flow, template design, and search functionality can provide valuable details that can help libraries improve the usability of their online resources. it is also important to note that in our rapidly changing environment, users’ needs and preferences also change. as such, collecting and analyzing user feedback to inform user-centered design should be a fluid process, not a one-time effort. admittedly, it can sometimes be challenging to make collective design decisions, particularly when librarians have strong opinions grounded on their own personal experiences working with students that conflict with usability testing data. although it is necessary to incorporate user feedback into the design process, it is also important to be open to compromise in order to achieve stakeholder buy-in for some usability-informed changes. as with many library services, usage of libguides is contingent at least in part on awareness, as students are unlikely to use services of which they are unaware or are unlikely to discover due to the limitations of a library’s search tools. given the prevalence of search dominance among our users, we should not assume that simply placing a “research guides” link on a webpage will lead to usage. 
increased outreach, better integration with the content of specific courses and assignments, and a thorough review of libguides content by those creating the guides, with an eye toward the specific contexts in which they are likely to be used, taught, serendipitously discovered, etc., are necessary to ensure that the research guides librarians create are worth the time they invest in them. additional studies focusing on why students do or do not use specific types of research guides, the contexts in which they are most useful, how students use them, and the specific content in guides that students find most helpful are needed to determine whether and to what extent they are aligned with students’ information-seeking preferences, behaviors, and needs, as well as how they might be improved to increase their use and usefulness.

appendix 1: libguides usability testing tasks

purpose: seeing how students browse or search to get to research guides
task 1: you are writing a research paper on the topic of climate change for your com 100 class. your teacher told you that the library has a “research guide” that will help you write your paper. find the guide.
start: library homepage

purpose: testing tab orientation on top
task 2: you need to compare two public opinion polls on the topic of climate change for your com 100 class. find a list of public opinion polls on the research guide shown.
start: http://libguides.library.cpp.edu/controversialtopics or http://csupomona.beta.libguides.com/controversial-topics

purpose: testing subtabs
task 3: you are writing a research paper for your apparel merchandising & management class on the topic of product development. your teacher told you that the library has a “research guide” that includes a list of articles on product development. find the product development articles on this research guide.
start: http://libguides.library.cpp.edu/amm or http://csupomona.beta.libguides.com/amm

purpose: testing searching within the libguides pages
task 4: if you were going to look for additional books on the topic of product development, what would you do next?
start: http://libguides.library.cpp.edu/amm or http://csupomona.beta.libguides.com/amm

purpose: testing two-tab column design
task 5: you are designing an earthquake-resistant structure for your civil engineering course and need to review seismic load provisions. locate the asce standards on seismic loads. use the research guide we open for you.
start: http://libguides.library.cpp.edu/civil or http://csupomona.beta.libguides.com/civil-engineering

purpose: seeing if including screenshots of search boxes is problematic
task 6: your professor also asks you to find an online handbook or reference source on the topic of finite element analysis. locate an online handbook or reference source on this topic.
start: http://libguides.library.cpp.edu/civil or http://csupomona.beta.libguides.com/civil-engineering

purpose: seeing if three columns are noticeable
task 7: find resources that might be good for background research on motivation and classroom learning for a psychology course. find an encyclopedic source on this topic.
start: http://libguides.library.cpp.edu/psychology or http://csupomona.beta.libguides.com/psychology

references

1 william hemmig, “online pathfinders: toward an experience-centered model,” reference services review 33, no. 1 (february 2005): 67, https://dx.doi.org/10.1108/00907320510581397.
2 charles h. stevens, marie p. canfield, and jeffrey t. gardner, “library pathfinders: a new possibility for cooperative reference service,” college & research libraries 34, no. 1 (january 1973): 41, https://doi.org/10.5860/crl_34_01_40.
3 “about springshare,” springshare, accessed may 7, 2017, https://springshare.com/about.html.
4 “libguides community,” accessed december 4, 2018, https://community.libguides.com/?action=0.
5 see, for example, alisa c. gonzalez and theresa westbrock, “reaching out with libguides: establishing a working set of best practices,” journal of library administration 50, no. 5/6 (september 7, 2010): 638–56, https://doi.org/10.1080/01930826.2010.488941.
6 suzanna conrad and nathasha alvarez, “conversations with web site users: using focus groups to open discussion and improve user experience,” the journal of web librarianship 10, no. 2 (2016): 74, https://doi.org/10.1080/19322909.2016.1161572.
7 ibid., 74.
8 suzanna conrad and julie shen, “designing a user-centric web site for handheld devices: incorporating data-driven decision-making techniques with surveys and usability testing,” the journal of web librarianship 8, no. 4 (2014): 349–83, https://doi.org/10.1080/19322909.2014.969796.
9 “about springshare.”
10 jimmy ghaphery and erin white, “library use of web-based research guides,” information technology and libraries 31, no. 1 (2012): 21–31, https://doi.org/10.6017/ital.v31i1.1830.
11 “libguides community,” accessed december 4, 2018, https://community.libguides.com/?action=0&inst_type=1.
12 katie e. anderson and gene r. springs, “assessing librarian expectations before and after libguides implementation,” practical academic librarianship: the international journal of the sla academic division 6, no. 1 (2016): 19–38, https://journals.tdl.org/pal/index.php/pal/article/view/19.
13 examples include: troy a. swanson and jeremy green, “why we are not google: lessons from a library web site usability study,” the journal of academic librarianship 37, no. 3 (2011): 222–29, https://doi.org/10.1016/j.acalib.2011.02.014; judith z. emde, sara e. morris, and monica claassen-wilson, “testing an academic library website for usability with faculty and graduate students,” evidence based library and information practice 4, no. 4 (2009): 24–36, https://doi.org/10.18438/b8tk7q; heather jeffcoat king and catherine m. jannik, “redesigning for usability: information architecture and usability testing for georgia tech library’s website,” oclc systems & services 21, no. 3 (2005): 235–43, https://doi.org/10.1108/10650750510612425; danielle a. becker and lauren yannotta, “modeling a library website redesign process: developing a user-centered website through usability testing,” information technology and libraries 32, no. 1 (2013): 6–22, https://doi.org/10.6017/ital.v32i1.2311; darren chase, “the perfect storm: examining user experience and conducting a usability test to investigate a disruptive academic library web site redevelopment,” the journal of web librarianship 10, no. 1 (2016): 28–44, https://doi.org/10.1080/19322909.2015.1124740; andrew r. clark et al., “taking action on usability testing findings: simmons college library case study,” the serials librarian 71, no. 3–4 (2016): 186–96, https://doi.org/10.1080/0361526x.2016.1245170; anthony s. chow, michelle bridges, and patricia commander, “the website design and usability of us academic and public libraries: findings from a nationwide study,” reference & user services quarterly 53, no. 3 (2014): 253–65, https://journals.ala.org/index.php/rusq/article/view/3244/3427; gricel dominguez, sarah j. hammill, and ava iuliano brillat, “toward a usable academic library web site: a case study of tried and tested usability practices,” the journal of web librarianship 9, no. 2–3 (2015), https://doi.org/10.1080/19322909.2015.1076710; junior tidal, “one site to rule them all, redux: the second round of usability testing of a responsively designed web site,” the journal of web librarianship 11, no. 1 (2017): 16–34, https://doi.org/10.1080/19322909.2016.1243458.
14 kate a. pittsley and sara memmott, “improving independent student navigation of complex educational web sites: an analysis of two navigation design changes in libguides,” information technology and libraries 31, no. 3 (2012): 52–64, https://doi.org/10.6017/ital.v31i3.1880.
15 alec sonsteby and jennifer dejonghe, “usability testing, user-centered design, and libguides subject guides: a case study,” the journal of web librarianship 7, no. 1 (2013): 83–94, http://dx.doi.org/10.1080/19322909.2013.747366.
16 sarah thorngate and allison hoden, “exploratory usability testing of user interface options in libguides 2,” college & research libraries 78, no. 6 (2017), https://doi.org/10.5860/crl.78.6.844.
17 nora almeida and junior tidal, “mixed methods not mixed messages: improving libguides with student usability data,” evidence based library and information practice 12, no. 4 (2017): 66, https://academicworks.cuny.edu/ny_pubs/166/.
18 ibid., 63; 71.
19 dana ouellette, “subject guides in academic libraries: a user-centered study of uses and perceptions,” canadian journal of information and library science 35, no. 4 (december 2011): 436–51, https://doi.org/10.1353/ils.2011.0024.
20 ibid., 442.
21 ibid., 442–43.
22 ibid., 443.
23 ibid., 443; shannon m. staley, “academic subject guides: a case study of use at san jose state university,” college & research libraries 68, no. 2 (march 2007): 119–39, https://doi.org/10.5860/crl.68.2.119.
24 jakob nielsen, “search and you may find,” nielsen norman group, last modified july 15, 1997, https://www.nngroup.com/articles/search-and-you-may-find/.
25 jakob nielsen, “macintosh: 25 years,” nielsen norman group, last modified february 2, 2009, https://www.nngroup.com/articles/macintosh-25-years/; jakob nielsen and raluca budiu, mobile usability (berkeley: new riders, 2013), chap. 2, o’reilly.
26 jakob nielsen, “incompetent research skills curb users’ problem solving,” nielsen norman group, last modified april 11, 2011, https://www.nngroup.com/articles/incompetent-search-skills/.
27 jakob nielsen, “search: visible and simple,” nielsen norman group, last modified may 13, 2001, https://www.nngroup.com/articles/search-visible-and-simple/.
28 ibid.
29 jakob nielsen, “mental models for search are getting firmer,” nielsen norman group, last modified may 9, 2005, https://www.nngroup.com/articles/mental-models-for-search/.
30 ibid.
31 erik ojakaar and jared m. spool, getting them to what they want: eight best practices to get users to the content they want (and to content they didn’t know they wanted) (bradford, ma: uie reports: best practices series, 2001).
32 amanda nichols hess and mariela hristova, “to search or to browse: how users navigate a new interface for online library tutorials,” college & undergraduate libraries 23, no. 2 (2016): 173, https://doi.org/10.1080/10691316.2014.963274.
33 ibid., 176.
34 hyejung han and dietmar wolfram, “an exploration of search session patterns in an image-based digital library,” journal of information science 42, no. 4 (2016): 483, https://doi.org/10.1177/0165551515598952.
35 ibid., 487.
36 xi niu, tao zhang, and hsin-liang chen, “study of user search activities with two discovery tools at an academic library,” international journal of human-computer interaction 30 (2014): 431, https://doi.org/10.1080/10447318.2013.873281.
37 iris xie and soohyung joo, “tales from the field: search strategies applied in web searching,” future internet 2 (2010): 268–69, https://doi.org/10.3390/fi2030259.
38 ibid., 275; 267–68.
39 ibid., 268–69.
40 sonsteby and dejonghe, “usability testing, user-centered design,” 83–94.
41 we experienced technical difficulties when capturing screens and audio simultaneously in camtasia. the audio did not sync in real time with the testing and we had to correct sync issues after the fact. a full technical test of screen capture and recording technology might have resolved this issue.
42 nielsen, “search: visible and simple.”
43 nielsen, “search and you may find”; nielsen, “incompetent research skills”; xie and joo, “tales from the field,” 268–69.
44 nielsen, “search: visible and simple.”
45 ibid.
46 pittsley and memmott, “improving independent student navigation,” 52–64.
47 e.g., sonsteby and dejonghe, “usability testing, user-centered design,” 83–94.
48 ouellette, “subject guides in academic libraries,” 448; brenda reeb and susan gibbons, “students, librarians, and subject guides: improving a poor rate of return,” portal: libraries and the academy 4, no. 1 (january 22, 2004): 124, https://dx.doi.org/10.1353/pla.2004.0020; staley, “academic subject guides,” 119–39.
49 hess and hristova, “to search or to browse,” 174.
editorial | truitt 3

marc truitt
editorial

welcome to 2009! it has been unseasonably cold in edmonton, with daytime “highs” (i use the term loosely) averaging around -25°c (that’s -13°f, for those of you ital readers living in the states) for much of the last three weeks. factor in wind chill (a given on the canadian prairies), and you can easily subtract another 10°c. as a result, we’ve had more than a few days and nights where the adjusted temperature has been much closer to -40°, which is the same in either celsius or fahrenheit. while my boss and chief librarian is fond of saying that “real canadians don’t even button their shirts until it gets to minus forty,” i’ve yet to observe such a feat of derring-do by anyone at much less than twenty below. even your editor’s two labrador retrievers, who love cooler weather, are reluctant to go out in such cold, with the result that both humans and pets have all been coping with bouts of cabin fever since before christmas.

so, when is it “too cold” for a server room?

why, you may reasonably ask, am i belaboring ital readers with the details of our weather?
over the weekend we experienced near-simultaneous failures of both cooling systems in our primary server room (sr1), which meant that nearly all of our library it services, including our opac (which we host for a consortium of twenty area libraries), a separate opac for edmonton public library, our website, and access to licensed e-resources, e-mail, files, and print servers had to be shut down. temperature readings in the room soared from an average of 20–22°c (68–71.5°f) to as much as 37°c (98.6°f) before settling out at around 30°c (86°f). we spent much of the weekend and beginning of this week relocating servers to all manner of places while the cooling system gets fixed. i imagine that next we may move one into each staff person’s under-heated office, where they’ll be able to perform double duty as high-tech foot warmers! all of this happened, of course, while the temperature outside the building hovered between -20° and -25°c. this is not the first time we’ve experienced a failure of our cooling systems during extremely cold weather. last winter we suffered a series of problems with both the systems in sr1 and in our secondary room a few feet away. the issues we had then were not the same as those we’re living through now, but they occurred, as now, at the coldest time of the year. this seeming dichotomy of an overheated server environment in the depths of winter is not a matter of accident or coincidence; indeed, while it may seem counterintuitive, the fact is that many, if not all, of our cooling woes can be traced to the cold outside. the simple explanation is that extreme cold weather stresses and breaks things, including hvac systems. as we’ve tried to analyze this incident, it appears likely that our troubles began when the older of our two systems in sr1 developed a coolant leak at some point after its last preventive maintenance servicing in august. 
fall was mild here, and we didn’t see the onset of really severe cold weather until early to mid-december. since the older system is mainly intended for failover of the newer one, and since both systems last received routine service recently, it is possible that the leak could have developed at any time since, although my supposition is that it may be itself a result of the cold. in any case, all seemed well because the newer cooling system in sr1 was adequate to mask the failure of the older unit, until it suffered a controller board failure that took it offline last weekend. but, with the failure of the new system on saturday, all it services provided from this room had to be brought down. after a night spent trying to cool the room with fans and a portable cooling unit, we succeeded in bringing the two opacs and other core services back online by sunday, but the coolant leak in the old system was not repaired until midday monday. today is friday, and we’ve limped along all week on about 60 percent of the cooling normally required in sr1. we hope to have the parts to repair the newer cooling system early next week (fingers crossed!).

some interesting lessons have emerged from this incident, and while probably not many of you regularly deal with -30°c winters, i think them worth sharing in the hope that they are more generally applicable than our winter extremes are:

1. document your servers and the services that reside on them. we spent entirely too much time in the early hours of this event trying to relate servers and services. we in information technology (it) may think of shutting down or powering up servers “fred,” “wilma,” “betty,” and “barney,” but, in a crisis, what we generally should be thinking of is whether or not we can shut down e-mail, file-and-print services, or the integrated library system (ils) (and, if the latter, whether we shut down just the underlying database server or also the related staff and public services).
perhaps your servers have more obvious names than ours, in which case, count yourself fortunate. but ours are not so intuitively named (there is a perfectly good reason for this, by the way), and with distributed applications where the database may reside here, the application there, and the web front end yet somewhere else, i’d be surprised if your situation isn’t as complex as ours. and bear in mind that documentation of dependencies goes two ways: not only do you want to know that “barney” is hosting the ils’s oracle database, but you also want to know all of the servers that should be brought up for you to offer ils-related services.

marc truitt (marc.truitt@ualberta.ca) is associate director, bibliographic and information technology services, university of alberta libraries, edmonton, alberta, canada, and editor of ital.

4 information technology and libraries | march 2009

2. prioritize your services. if your cooling system (or other critical server-room utility) were suddenly only operating at 50 percent of your normal required capacity, how would you quickly decide which services to shut down and which to leave up? i wrote in this space recently that we’ve been thinking about prioritized services in the context of disaster recovery and business continuity, but this week’s incident tells me that we’re not really there yet. optimally, i think that any senior member of my on-call staff should be empowered in a given critical situation to bring down services on the basis of a predefined set of service priorities.

3. virtualize, virtualize, virtualize. if we are at all typical of large libraries in the association of research libraries (and i think we are), then it will come as no surprise that we seem to add new services with alarming frequency. i suspect that, as with most places, we tend to try and keep things simple at the server end by hosting new services on separate, dedicated servers.
the resulting proliferation of new servers has led to ever-greater strains on power, cooling, and network infrastructures in a facility that was significantly renovated less than two years ago. and i don’t see any near-term likelihood that this will change. we are, consequently, in the very early days of investigating virtualization technology as a means of reducing the number of physical boxes and making much better use of the resources, especially processor and ram, available to current-generation hardware. i’m hoping that someone among our readership is farther along this path than we and will consider submitting to ital a “how we done it” on virtualization in the library server room very soon!

4. sometimes low-tech solutions work . . . no one here has failed to observe the irony of an overheated server room when the temperature just steps away is 30° below. our first thought was how simple and elegant a solution it would be to install ducting, an intake fan, and a damper to the outside of the building. then, the next time our cooling failed in the depths of winter, voila!, we could solve the problem with a mere turn of the damper control.

5. . . . and sometimes they don’t. not quite, it seems. when asked, our university facilities experts told us that an even greater irony than the one we currently have would be the requirement for can$100,000 in equipment to heat that -30°c outside air to around freezing so that we wouldn’t freeze pipes and other indoor essentials if we were to adopt the “low-tech” approach and rely on mother nature. oh, well . . .

in memoriam

most of the snail mail i receive as editor consists of advertisements and press releases from various firms providing it and other services to libraries. but a few months ago a thin, hand-addressed envelope, postmarked pittsburgh with no return address, landed on my desk. inside were two slips of paper clipped from a recent issue of ital and taped together.
on one was my name and address; the other was a mailing label for jean a. guasco of pittsburgh, an ala life member and ital subscriber. beside her name, in red felt-tip pen, someone had written simply “deceased.” i wondered about this for some time. who was ms. guasco? where had she worked, and when? had she published or otherwise been active professionally? if she was a life member of ala, surely it would be easy to find out more. it turns out that such is not the case, the wonders of the internet notwithstanding. my obvious first stop, google, yielded little other than a brief notice of her death in a pittsburgh-area newspaper and an entry from a digitized september 1967 issue of special libraries that identified her committee assignment in the special libraries association and the fact that she was at the time the chief librarian at mcgraw-hill, then located in new york. as a result of checking worldcat, where i found a listing for her master’s thesis, i learned that she graduated from the now-closed school of library service at columbia university in 1953. if she published further, there was no mention of it on google. my subsequent searches under her name in the standard online lis indexes drew blanks. from there, the trail got even colder. mcgraw-hill long ago forsook new york for the wilds of ohio, and it seems that we as a profession have not been very good at retaining for posterity our directories of those in the field. a friend managed to find listings in both the 1982–83 and 1984–85 volumes of who’s who in special libraries, but all these did was confirm what i already knew: ms. guasco was an ala life member, who by then lived in pittsburgh. i’m guessing that she was then retired, since her death notice gave her age as eighty-six years. of her professional career before that, i’m sad that i must say i was able to learn no more.
communications returning classification to the catalog robert n. bland and mark a. stoffan

the concept of a classified catalog, or using classification as a form of subject access, has been almost forgotten by contemporary librarians. recent developments indicate that this is changing as libraries seek to enhance the capabilities of their online catalogs. the western north carolina library network (wncln) has developed a “classified browse” feature for its shared online catalog that makes use of library of congress classification. while this feature is not expected to replace keyword searching, it offers both novice and experienced library users another way of identifying relevant materials.

classification to modern librarians is almost exclusively a tool for organizing and arranging books (or other physical media) on shelves. the role of classification as a form of subject access to collections through the public catalog (the concept of the classified catalog) has been almost forgotten. from a review of the literature, it does not appear that any major u.s. library has supported a classified catalog since boston university libraries closed its classified catalog in 1973.1 to be sure, nearly all online catalogs nowadays have some form of what is called a “call number search” or a “shelf list browsing capability” that is based on classification, but this is a humble and little-used feature because it requires that a call number (or at least a call number stem) be known and entered by the user, when no verbal index to the classification is available online. this search methodology provides nothing in the way of a systematic and hierarchical arrangement and display of subject classes, complete with accompanying verbal descriptions, that the classified catalog seeks to accomplish.
but as karen markey put it in her recent review of classification and the online catalog, “to this day, the only way in which most end users experience classification online is through their online catalog’s shelf list browsing capability.”2 there are signs that this situation is changing. the recently released endeca-based catalog at north carolina state university libraries uses library of congress classification (lcc) in a prominent way to provide for browsing of the collection without need of the user entering any search terms at all.3 the lcc outline is presented on the main search entry screen with verbal captions describing the classes, allowing users to navigate through several layers of the outline to retrieve with a click of the mouse bibliographic records for materials assigned to those classes. in a converse way, the new online catalog being developed by the florida center for library automation uses lc classification as a kind of back end to keyword searching. following a keyword search, a user can limit the results set by confining it to a designated lcc range chosen again from an online display of the lcc outline.4 both of these catalogs use three levels of the lcc outlines from the most general single letter level classes (q for sciences, for example) through the two-letter classes for more specific subjects (qc for physics, qd for chemistry) to an even finer granularity with designated numeric ranges within the two-letter classes identifying specific subdisciplines, (qd241–qd441 for organic chemistry). 
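the three outline levels just described can be sketched in a few lines of python. this is only an illustration under stated assumptions: the tables hold just the example classes named in the text, and the names `NumericRange`, `LEVEL1`, `LEVEL2`, `LEVEL3`, and `outline_path` are hypothetical, not taken from the ncsu, fcla, or wncln software.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class NumericRange:
    """A level-3 outline entry: a numeric range within a class prefix."""
    prefix: str
    low: int
    high: int
    caption: str


# Hypothetical miniature outline holding only the examples from the text.
LEVEL1 = {"Q": "sciences"}
LEVEL2 = {"QC": "physics", "QD": "chemistry"}
LEVEL3 = [NumericRange("QD", 241, 441, "organic chemistry")]

# Class letters followed by the integer part of the class number.
CLASS_RE = re.compile(r"^([A-Z]{1,3})\s*(\d+)")


def outline_path(class_number: str) -> List[str]:
    """Return the outline captions, general to specific, for a class number."""
    match = CLASS_RE.match(class_number.strip().upper())
    if match is None:
        raise ValueError(f"not a recognizable class number: {class_number!r}")
    letters, number = match.group(1), int(match.group(2))

    path = []
    if letters[0] in LEVEL1:                      # level 1: single letter
        path.append(f"{letters[0]} {LEVEL1[letters[0]]}")
    if len(letters) >= 2 and letters[:2] in LEVEL2:  # level 2: two letters
        path.append(f"{letters[:2]} {LEVEL2[letters[:2]]}")
    for entry in LEVEL3:                          # level 3: numeric range
        if entry.prefix == letters and entry.low <= number <= entry.high:
            path.append(
                f"{entry.prefix}{entry.low}-{entry.prefix}{entry.high} {entry.caption}"
            )
            break
    return path
```

for example, `outline_path("QD271")` walks from the single-letter class through the two-letter class to the numeric range that contains 271, while a number outside every listed range stops at the two-letter level.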
robert n. bland (bland@unca.edu) is associate university librarian for technical services, university of north carolina at asheville. mark stoffan (mstoffan@fsu.edu) is associate director for library technology at florida state university, tallahassee. 56 information technology and libraries | september 2008 figure 1. level 1 of lc classification in wncln webpac

the western north carolina library network (wncln) has been experimenting with classification as a retrieval tool in the public catalog for some time,5 and it has just implemented the first version of what we call a classified catalog browse in our innovative millennium system.6 like the two catalogs just mentioned, the classified catalog browse is based on software that is external to the ils software and integrated with that software through linking and webpage designs. also, like the previously discussed catalogs, it is based on scanning and incorporating into the catalog the lcc outlines as published by the library of congress. the wncln catalog goes a step further, however, in bringing the entire lc classification online down to the individual class number level, at least that portion of the classification that is actually used in our catalog. this is done through extracting class numbers and associated subject headings from bibliographic and authority records in our catalog and building an online classification display with descriptive captions (a verbal index) from these bibliographic and authority records. the result is a hierarchical display (to continue the example from above) not only of qd241–qd441 for organic chemistry but within this, qd271 for chromatographic analysis, qd273 for organic electrochemistry, and so on.
the design of our interface presents this as a fourth level to which the user can “drill” down beginning with q for sciences, qd for chemistry, qd241–qd441 for organic chemistry, and finally qd271 for chromatographic analysis (figures 1–4). from this fourth level, the user can click an associated link to execute a search of the catalog by the class number in question using the call number search function of the ils (figure 5); a second link for that class number will present the same list of titles but sorted by “most popular” (i.e., the items that have been checked out most frequently) from a separate but linked external database (figure 6); a third link will search the catalog by the associated subject heading for the class (figure 7); and finally a fourth link will show other subject headings that have been used in the catalog with this specific class number (figure 8).

figure 2. level 2 of lc classification in wncln webpac figure 3. level 3 of lc classification in wncln webpac

what does having the lc classification online in our catalog accomplish for our users? part of the point of our project is to answer this very question. chan and others7 have theorized that incorporation of the classification system into the catalog as a retrieval tool can provide enhanced subject access that is not possible through standard alphabetical subject headings and keyword searching alone. early studies by markey and others at oclc seem to have confirmed this with an online version of the dewey decimal classification.8 since (as far as we know) the library of congress classification has not really been tested as an online retrieval tool in a live catalog up to now, our implementation will serve as a kind of test bed for this hypothesis. how actual users in fact exploit this feature is of course only something that experience will tell.
figure 4. level 4 of lc classification in wncln webpac figure 5. call number search display in wncln

a cursory look, however, would seem to indicate definite advantages to this approach. first of all, many studies indicate that two of the major sources of failure with subject retrieval in online systems are misspellings and poor choice of search terms by users. no matter how far we may try to go with keyword searching and relevance ranking, no online library retrieval system is likely to do much with “napolyan’s fites” when what the user is looking for are books on the military campaigns of the emperor napoleon. with the classification system and verbal index online most of these problems are eliminated, since users can navigate to a subject of choice without ever entering a search term. moreover, given the design of the verbal index based on library of congress subject headings, the user is led to actual subject headings used in the catalog, which should provide for precise retrieval beyond what is ordinarily possible with keywords even when entered correctly, and (importantly) a retrieval set that is always greater than zero. the infamous and frustrating problem of “no hits” is eliminated. secondly, the great attraction of the classified catalog approach is that it arranges subjects in a hierarchical fashion based on integral connections among the topics in a way that cannot be accommodated in an alphabetic subject approach because of the vagaries of spelling. the topics “violence,” “social conflict,” and “conflict management,” for example, obviously spread out in an alphabetical subject list, are collocated in the classified catalog under the class “hm1106–hm1171 interpersonal relations” (figure 9), allowing the user to find references to materials all in one place in the catalog just as the classification system arranges the books on these subjects all in one place on the library shelves.
alphabetical subject indexes, of course, attempt to ameliorate this problem by means of cross references, but there is clearly a limit to how far one can go with this approach. finally, the classified catalog provides an efficient way for collection development staff to review specific subject areas and to make better informed purchasing decisions regarding the collections. in the wncln design, the classes at the bottom level of the hierarchy are linked to the catalog by call number and subject headings, and each class carries an indication of the number of items assigned that class number. the classes are also linked to an external database that shows the frequency of circulation of items in the class as well as title and date of publication. a quick review of this list can inform a bibliographer of circulation rates as well as the currency of materials in the class. as mentioned, the captions that are displayed with the lcc hierarchy in the wncln catalog are extracted from subject headings and authority records present in our catalog. readers familiar with lc marc record services may wonder why we took this approach to building the verbal index rather than using the information available in the lc marc classification records. machine-readable records for lc classification are now available in marc format. these files include records for each individual class number with a corresponding verbal caption. while we did experiment with using these files, cost and complexity led us to go in another direction. the lc classification files are huge, containing hundreds of thousands of classification numbers that we do not now and probably never would use in our wncln catalog simply because we (unlike lc) have no materials on these subjects.
while these records could be filtered out by matching against lc class numbers that are found in our catalog and discarding non-matches, this would add yet another level of processing to an already complex process, as would handling the lc table subdivisions that are used in the lc schedules and that are separate from the standard class numbers. secondly, the lc marc classification files require a subscription costing several thousand dollars per year, as well as a substantial payment for the retrospective file needed to begin building the database of class numbers. on the other hand, extracting the verbal index from subject headings and authority records in our own catalog adds no cost to our processing. these headings and authority records are created and maintained, of course, as a standard part of the cataloging process, and accordingly only headings and authority records that match materials owned by our libraries are included. the description or caption that is finally assigned to a class number is determined by a computer program that analyzes both authority records and bibliographic records found in our catalog that are assigned the class number in question, with the subject heading that is used most frequently as a primary subject generally being the one selected as the caption for the class. these class numbers with associated subject headings are then processed by another program, which eventually builds html files representing the classification with links to the catalog and the external “most used” database as alluded to above. these standard html files, along with the files representing the first three levels of the lcc outline, are then loaded onto our web server to display the classification system online.

figure 6. most used titles display figure 7. subject search display in wncln figure 8. related subjects display in wncln figure 9. collocation of terms in the classified catalog

a second advantage of this approach is that using the actual subject heading as the caption or description for the class makes it possible to use that caption as a direct link to a subject search in the catalog, as shown in the illustration in figure 4. a disadvantage is that the captions from the lcc files are designed to retain the hierarchy that is represented in the printed schedules in a visual way by formatting and indenting. captions derived from subject headings do not retain this feature. we have tried to accommodate this in our display of the schedules by replicating the class number ranges from the outline in the appropriate place in the full display of the schedules, thereby building a hierarchy from these ranges as genus and the individual class numbers as species. this does not manage to retain the full hierarchy of the lc schedules as shown in the printed schedules or as represented in lc’s online classification web product, but it is, we hope, an adequate surrogate for the purpose intended. in fact, in most cases, the captions derived from the extracted subject and authority headings match quite nicely the captions included in the actual lcc schedules, as shown in a comparison from the psychology classification of the hierarchy as it appears in our classified catalog browse and as it appears online in lc’s classification web product (figures 10 and 11). what is missing in our representation of the classification is not so much the subject content of the classes but the notes and information about literary form that are included in the actual lcc schedules. thus, our lcc online is not a strict image of the lcc as it would appear in printed or electronic form based on the hierarchies and captions devised by the lc.
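the caption-selection rule described earlier (for each class number, take the subject heading used most frequently as a primary subject) can be sketched as follows. this is a simplified illustration, not the wncln program: it looks only at bibliographic records, it assumes the first heading listed on a record is the primary subject, and the function name `choose_captions` and the sample records are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

# One record = (class number, subject headings); by assumption the first
# heading in the list is the primary subject of the item.
Record = Tuple[str, List[str]]


def choose_captions(records: Iterable[Record]) -> Dict[str, str]:
    """Pick, per class number, the most frequent primary subject heading."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for class_number, headings in records:
        if headings:  # records without any subject heading are skipped
            counts[class_number][headings[0]] += 1
    # most_common(1) returns the single highest-count heading for the class
    return {cn: tally.most_common(1)[0][0] for cn, tally in counts.items()}


# Hypothetical sample data reusing the class numbers from the text.
sample = [
    ("QD271", ["chromatographic analysis", "chemistry, organic"]),
    ("QD271", ["chromatographic analysis"]),
    ("QD271", ["chemistry, analytic"]),
    ("QD273", ["organic electrochemistry"]),
]
captions = choose_captions(sample)
```

because the chosen caption is itself a subject heading found in the catalog, the same string can serve as the link text for a subject search, which is the advantage noted above.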
nor for that matter, despite our terminology, is it a true classified catalog, since only one classification (that used in the call number) is assigned to each item, whereas in a true classified catalog multiple classifications may be assigned to an item. it is nevertheless an online presentation of the lcc with links to our catalog that seeks to enhance subject access by exploiting the power of the classification system to organize materials by integral subject classes and to show relationships among subjects by a hierarchical arrangement of classes as genus, species, and subspecies. and, perhaps just as importantly, it is an implementation that requires no additional cataloging effort on the part of our staff, nor any additional costs for data or processing other than the investment we have made in development of the software and the small amount of time required weekly to update the files. we do not expect that the classified catalog browse will replace keyword or subject searching as the primary means of subject access to our collections. we do believe that it promises to be a powerful and effective complement to our standard ils searches that may improve subject searching for both the novice and the experienced user. references 1. margaret hindle hazen, “the closing of the classified catalog at boston university,” library resources and technical services 18 (1974): 221–26. 2. karen markey, joan s. mitchell, and diane vizine-goetz, “forty years of classification online: final chapter or future unlimited?” cataloging and classification quarterly 42 (2006): 1–63. 3. north carolina state university libraries, “ncsu libraries online catalog,” north carolina state university, www.lib.ncsu.edu/catalog (accessed mar. 23, 2007). 4.
florida center for library automation, “state university libraries of florida–endeca,” board of governors, state of florida, http://catalog.fcla.edu (accessed mar. 23, 2007). 5. the western north carolina library network is a consortium consisting of the libraries of appalachian state university, the university of north carolina at asheville, and western carolina university. 6. western north carolina library network, “library catalog,” western north carolina library network, http://wncln.wncln.org (accessed mar. 23, 2007). 7. lois mai chan, “library of congress classification as an online retrieval tool: potentials and limitations,” information technology and libraries 5 (1986): 181–92. 8. karen markey and anh demeyer, dewey decimal classification online project: evaluation of a library schedule and index integrated into the subject searching capabilities of an online catalog: final report to the council on library resources (dublin, ohio: oclc, 1986), report no. oclc/opr/rr-86/1.

figure 10. class captions in the wncln webpac figure 11. class captions in lc’s classification web

article ontology for the user-learner profile personalizes the search analysis of online learning resources: the case of thematic digital universities marilou kordahi information technology and libraries | june 2022 https://doi.org/10.6017/ital.v41i2.13601 marilou kordahi (marilou_kordahi@yahoo.fr) is assistant professor, faculty of business administration and management, saint-joseph university of beirut, and associate researcher, paragraph research laboratory, paris 8 university. © 2022. abstract we hope to contribute to the field of research in information technology and digital libraries by analyzing the connections between thematic digital universities and digital user-learner profiles.
thematic digital universities are similar to digital libraries, and focus on creating and indexing open educational resources, as well as improving learning in the information age. the digital user profile relates to the digital representation of a person’s identity and characteristics. in this paper we present the design of an ontology for the digital user-learner profile (ontoulp) and its application program. ontoulp is used to structure a user-learner’s digital profile. the application provides each user-learner with tailor-made analyses based on informational behaviors, needs, and preferences. we rely on an exploratory research approach and on methods of ontologies, user modeling, and semantic matching to design the ontoulp and its application program. any user-learner could use the ontoulp and its application program. introduction more online learning environments are supporting the creation and dissemination of quality open educational resources (oer) to facilitate change in the education sector, improve education, ensure lifelong learning, and reduce costs, among other motives.1 in 2002, the united nations educational, scientific and cultural organization (unesco) recommended the definition of oer as follows: “the open provision of educational resources, enabled by information and communication technologies, for consultation, use and adaptation by a community of users for non-commercial purposes.”2 the william and flora hewlett foundation defined oer as “freely licensed, remixable learning resources—[they] offer a promising solution to the perennial challenge of delivering high levels of student learning at lower cost.”3 in 2012, unesco noted that oer offer education stakeholders an opportunity to access textbooks and other learning contents to enhance their knowledge and professional experiences.4 education stakeholders may choose oer based on their informational needs, behaviors, and preferences.5 we hope to contribute to the field of research in information
technology and digital libraries by analyzing the connections between thematic digital universities and digital user-learner profiles. we are conducting a case study using the digital university engineering and technology.6 in the following we will explain these topics and the interest in the digital university engineering and technology. in 2003, the french ministry of higher education, research, and innovation initiated the creation of thematic digital universities to facilitate the integration and use of information and communication technologies for education in university teaching practices.7 in total, there are six thematic digital universities which are organized by broad disciplines: health sciences and sports, engineering sciences, environment and sustainable development, humanities, economics and management, as well as technical studies. thematic digital universities are similar to digital libraries, and focus on creating and indexing oer, as well as improving learning in the information age.8 although thematic digital libraries are mostly comprised of oer, they also develop complete training programs with some of these resources (e.g., massive open online courses, or moocs). they are partners with canal-u, the video library for higher education, as well as the french national platform for massive open online courses (fun-mooc). thematic digital universities are mostly created for learners and teachers, as they offer complementary educational resources to bachelor, masters, and doctoral programs.9 to date, learners and teachers have free access to most thematic digital universities and corresponding educational resources.
registration is not required; however, without registration neither the learner nor the teacher can analyze her/his search for oer based on informational behaviors, needs, and preferences.10 we will focus on the analysis of oer metadata records in the context of thematic digital universities. each oer in the repository holds a metadata record to precisely describe its specifications to the learner or teacher (e.g., the learning level, language, and topics). specifications are written according to the institute of electrical and electronics engineers (ieee) standards for learning object metadata (lom),11 lomfr, and suplomfr. lom provides an accurate descriptive schema of a learning object suitable for educational resources12 (e.g., the classification and identification of an educational resource). lomfr and suplomfr are currently applications of lom in the french educational community.13 the digital university engineering and technology attracted our attention because of the following characteristics: clear presentation of its objectives, regular information updates, priority for free access to oer and open data, 3,000 published educational resources, extensive documentation of oer indexing, interoperability of oer and metadata records, and an advanced search engine for oer. each metadata record describes precise information on the oer, including the main title, keywords, descriptive text, educational types (or resources), learning level, copyrights, knowledge domains, topics, authors, and publishers. it is processed and structured with xml language which is human-readable and machine-readable. digital user profiles relate to the digital representation of a person’s identity and characteristics.14 digital identity is the sum of digital traces (or “footprints”) relating to an individual or a community found on the web or in digital systems. 
digital traces correspond to the user’s profile, browsing history, and contribution actions.15 our focus is the learner who wishes to use the thematic digital universities for tailor-made analysis of retrieved information based on her/his needs and preferences. we offer the learner an option to register on these platforms to track behavior over time while searching for oer. analyses are based on criteria the learner has previously chosen to personalize this search. subsequently, we suggest using the term “digital user-learner profile.” we will do our best to respect the general data protection regulations when collecting information on the digital user-learner profile.16 the general data protection regulations are privacy laws drafted and passed by the european union that prohibit the processing, storage, or sharing of certain types of information about individuals without their knowledge and consent. the research questions are as follows: 1. in the context of thematic digital universities, how can a user-learner personalize the search for open educational resources according to her/his digital profile? 2. in this same context, what kinds of information can a user-learner analyze in a search for open educational resources according to her/his digital profile? the objectives of this article are to present the preliminary results of work in progress on the design of the ontology for the digital user-learner profile (ontoulp) and its application program, the personalized modeling system for the user-learner profile (psul).
we rely on the methods of ontology,17 user modeling,18 and semantic matching.19 the method of ontology is used to describe in a formal manner a set of concepts and objects which represent the meaning of an information system in a specific area and the relationships between these concepts and objects.20 the method of user modeling describes the process of designing and changing a user’s conceptual understanding. it is applied to customize and adjust systems to meet the user’s needs and preferences. the method of semantic matching is used to identify and relate a meaning concept (or class) to its homologous concept in tree-like schemas and to consider the concept’s position in these schemas (e.g., mapping a class in an ontology to homologous concepts in metadata records). this relationship can be a one-to-one concept or one-to-many concepts. the ontoulp is a first approach, and it will be used to structure a user-learner’s digital profile in the context of thematic digital universities. we design this ontology for three main reasons: to structure collected and generated information21 (e.g., structuring a user-learner’s learning preferences will enable the identification of learning behaviors and activities), to analyze collected and generated information22 (e.g., analyzing generated information by a user-learner may predict a search for oer), as well as to facilitate relationships between a user-learner and thematic digital universities23 (e.g., analyzing user-learner informational behaviors may improve oer creation and dissemination). the psul will be designed as an application program for the ontoulp. it will be used to provide each user-learner with tailor-made analyses based on informational behaviors, needs, and preferences. 
psul will include a secure database and web pages, namely those for registering and editing the user-learner profile and its dashboard.24 ontoulp and its application program will offer each registered user-learner an opportunity to analyze the search for oer according to informational behaviors and needs. ontoulp and psul could be implemented in the structure of information systems for educational and research institutions, documentation and information centers, and many others. we will fine-tune our analysis by relying on a case example, the thematic digital universities. this article comprises six sections. first, we will explain the exploratory research carried out in the context of thematic digital universities. second, we will present the main published works related to the subject of the article. third, we will explain the approach followed to design and write the ontoulp. fourth, we will discuss the creation of the psul application program. fifth, we will demonstrate the integration of the designed ontology and its application program into a mirror site to perform a technical test. finally, we will discuss the completed work before concluding the article. exploratory research approach this exploratory research is based on an analysis of the literature, a semistructured questionnaire, and in-depth documentary research. we check the consistency of collected information and identify the need to personalize the search for oer as well as make tailor-made analysis of information. methods used during the first 18 months of the covid-19 pandemic (november 2020–may 2021), we conducted qualitative research to deepen our comprehension of the practices of thematic digital universities. we collected and interpreted primary and secondary information.
primary information: we contacted the digital university association and their six thematic digital universities.25 because of their extensive expertise and robust knowledge in leading or managing thematic digital universities, directors and general secretaries were chosen to selfadminister an electronic semistructured questionnaire. we contacted seven individuals and received six responses. in this questionnaire, we asked about the following topics: the recent knowledge of thematic digital universities, conditions of access to oer, metadata records indexing as well as user-learner’s expectations. an example of the questionnaire is included in the appendix. secondary information: we analyzed a report by the french general inspectorate of the national education and research administration. we have also studied recently-published scientific articles by anne boyer (2011), deborah arnold (2018),26 and sihem zghidi and mokhtar ben henda (2020). the results and findings will be explained in the following paragraphs. results of information collection we have compared responses to the questionnaire and contents of published documents and articles. for the digital university in health sciences and sports, “resources are mostly accessible to learners from member universities, through an identification system based on the university email address.”27 only a few resources are open to the public. otherwise, according to comments gathered from the other four digital universities and digital university association, “thematic digital universities are part of global movements providing access to oer by promoting open access to knowledge.”28 they are an opportunity for learners to discover new disciplines and explore new areas.29 in fact, “the process for indexing metadata records meets standards for education, such as lom, lomfr and suplomfr.”30 at present, there is no feedback on the use of thematic digital universities platforms. 
in other words, “thematic digital universities have no information about learners who view oer, because there is no login and password. this is done on purpose to make them as open as possible.”31 these platforms are considered a means of self-training with quality assurance, as the documents have been produced and validated by higher education teachers. “thematic digital universities provide a certain flexibility allowing learners to work when and where they want.”32

findings

five thematic digital universities and the digital university association responded to the semistructured questionnaire. two thematic digital universities can track user-learners’ behaviors. these digital universities are related to the disciplines of health and sport in addition to technical studies. to date, four thematic digital universities cannot track user-learners’ interactions based on informational behaviors and preferences. ontoulp and its application program could be implemented in these four thematic digital universities, which are related to the disciplines of engineering sciences, environment and sustainable development, humanities, economics, and management.

literature review

to the best of our knowledge, published research addressing this subject in the context of thematic digital universities is limited.33 we analyze the most recent ontologies and user modeling systems that are close to our research objectives. the main works we use are those of bloom et al. (1984),34 smythe et al. (2001),35 green and panzer (2009),36 and kordahi (2020),37 in addition to kelly and belkin (2002). the work methods and field studies these researchers have developed are useful for designing the structure of the ontoulp and the model of its application program. in the following paragraphs, we explain these works and their relationships with this research article.
selection of recently published works

in 2020 and 2021, kordahi designed an ontology and a personalized dashboard for user-learners.38 the objectives of these works were to track individual searches for oer and compare them with a user-learner’s field of work. to design her ontology, kordahi relied on standardized ontologies and validated taxonomies that are used in online learning environments, namely the ims learner information profile (ims lip)39 and bloom’s taxonomy. the personalized dashboard was linked to the user-learner ontology. the dashboard was tested technically with its ontology in a digital library environment to examine its performance. kordahi used the methods of ontologies and semantic matching.

learner model

we are mostly interested in the learner model40 as it “is a model of the knowledge, difficulties and misconceptions of the individual [learner].”41 as students learn from the educational resources they find, the learner model is updated to display their current progress. the model can continue to tailor students’ interactions as they learn. there are several learner models, such as the ims lip.42 we examine the ims lip, which is based on a standardized data model describing a learner’s characteristics. it is mainly used to manage a student’s learning history in order to discover her/his learning opportunities. ims lip consists of 11 categories that gather learning information: “the identification, goals, qualifications and licenses, activity, interest, competency, accessibility, transcript, affiliation, security, and relationships.”43 this model has been successfully used by many renowned researchers (e.g., paquette 2010)44 to design a learner model and then adapt it to appropriate contexts.

ims lip’s reliability, accuracy, and flexibility match well with the motives of the ontoulp. we will use it to begin designing the structure of the ontoulp and adapt it to the thematic digital universities context.
we will also consider the ieee lom, lomfr, and suplomfr classification fields. this measure will be used to improve semantic matching between the ontoulp and oer metadata records.

taxonomy of educational objectives

we examine the user-learner’s educational objectives to meet informational needs and expectations.45 in each oer metadata record, educational objectives are defined based on bloom’s taxonomy (e.g., “understand the context and rules of scientific publication”46). bloom et al. developed a taxonomy of educational objectives to classify the statements of what teachers expected students to learn as a result of lessons and instruction. the researchers described a method for allowing students to achieve educational goals while carrying out exercises that use the resources of the environment. bloom et al. relied on in-depth qualitative studies to design and validate this taxonomy. bloom’s taxonomy contains the following six major categories related to the cognitive domain: knowledge, comprehension, application, analysis, synthesis, and evaluation. this taxonomy was revised in 2001 by lorin anderson et al.47 bloom’s taxonomy is still in use internationally, as in the works of kordahi.

integrating bloom’s taxonomy into the ontoulp will enhance the structure of a user-learner’s educational objectives. these educational objectives will be organized in six categories, allowing the user-learner to refine her/his informational goals. we will therefore create a mutual link between the user-learner’s educational objectives and the oer educational objectives.

knowledge domains

knowledge organization systems48 are seen as a valuable component for searching for oer.49 our research includes analyses of oer metadata records to establish relationships between their knowledge topics and the user-learner’s topics of interest.
in the thematic digital universities’ metadata records, a precise classification is reported for both knowledge topics and the dewey decimal classification (e.g., geographic information systems (526.028 5)).50 the dewey decimal classification and relative index 22nd edition,51 published in 2003 by the online computer library center,52 is used worldwide in digital libraries and by the thematic digital universities.53 in their work published in 2009, green and panzer developed an ontology to structure knowledge domains.54 this ontology recognizes two classes: dewey classes and knowledge topics.

we selected the dewey decimal classification for the ontoulp because the thematic digital universities are already using it. we will rely on green and panzer’s ontology to structure the knowledge domains in the ontoulp (e.g., the use of dewey classes and knowledge topics). we will establish relationships between the knowledge domains and the user-learner model, allowing the user-learner to choose the most appropriate learning topics.

user modeling system

the “user modeling system for personalized interaction and tailored retrieval” is useful for analyzing each user-learner’s informational needs and preferences.55 kelly and belkin’s system helps the user to track informational needs over time. it contains three classes of models and a set of interactions. the “general behavioral model” tracks information seeking and user behavior to determine informational needs. the “personal behavioral model” characterizes each user’s information search according to specific preferences and behaviors. the “topical models” are associated with concepts related to each user’s informational behaviors.

this model was developed by renowned researchers specialized in information retrieval and corresponds to the objectives of this research article.
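as an aside on the knowledge domains structure discussed earlier in this section, the pairing of dewey classes with knowledge topics can be sketched in a few lines of python. this is a minimal illustration and not the ontoulp itself: the function and dictionary names are hypothetical, the topic labels are limited to the examples given in this article (science for class 500, technology for class 600, management and public relations for division 650, manufacturing for division 670), and the split into main class and division follows the usual reading of a three-digit dewey number.

```python
# Hypothetical lookup tables, limited to examples given in the article.
KNOWLEDGE_TOPICS = {500: "science", 600: "technology"}
KNOWLEDGE_SUBTOPICS = {
    650: "management and public relations",
    670: "manufacturing",
}


def dewey_class_and_division(notation):
    """Return (main class, division) for a notation such as '526.028 5'."""
    # Keep only the three digits before the decimal point.
    digits = notation.replace(" ", "").split(".")[0]
    if len(digits) != 3 or not digits.isdigit():
        raise ValueError(f"not a Dewey notation: {notation!r}")
    number = int(digits)
    return (number // 100) * 100, (number // 10) * 10


def topics_for(notation):
    """Return (knowledge topic, subtopic); either may be None if unlisted."""
    main_class, division = dewey_class_and_division(notation)
    return KNOWLEDGE_TOPICS.get(main_class), KNOWLEDGE_SUBTOPICS.get(division)


print(dewey_class_and_division("526.028 5"))  # (500, 520)
print(topics_for("526.028 5"))                # ('science', None)
print(topics_for("670"))                      # ('technology', 'manufacturing')
```

a lookup of this kind is only a stand-in for the class and subclass relationships that an ontology expresses; it shows how a classification number in a metadata record could be related to a user-learner’s chosen topic.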
we will use the structure of kelly and belkin’s model (2002) to design the psul application program in the context of thematic digital universities. relationships between the psul and the ontoulp ontology will be established to carry out personalized analyses of oer search.

ontoulp ontology

ontoulp’s design is based on the works discussed in the previous section. it consists of two stages. we start by writing it. we then describe the ontology and emphasize the relationships between different entities.

writing the ontology

we write ontoulp with the protégé editor and use the hermit inference engine to check the consistency of classes and their relationships with objects. a first version of the ontology is saved in owl format, which is compliant with semantic web technologies.

ontoulp description

the ontology comprises five subsystems: user-learner, user-learner model, educational objectives, learning design, and knowledge domains. each subsystem is composed of classes that inherit the attributes of the subsystem on which they depend. for brevity, the figures show the hierarchical representation of these subsystems.

the user-learner subsystem contains all recorded private information on the digital user-learner profile. the classes personal information, identification sessions, and traces provide information about the user-learner’s behavior and search history for oer, e.g., the search duration for oer (see fig. 1).

the user-learner model subsystem is responsible for structuring collected information related to learning behaviors and needs, namely the classes identification, interest, learning level (or qualifications and licenses), personal preferences (or accessibility), activities, learning objectives (or goals), affiliation, and network of contacts (or relationships).
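the relationship between the user-learner model classes listed above and the 11 ims lip categories quoted in the literature review can be made explicit with a small python sketch. the pairings marked “(or ...)” come from the text; pairing activities with the ims lip category activity is an assumption based on the names, and the list of categories left unused is an inference from name matching rather than a statement made in this article. the identifiers are hypothetical.

```python
# The 11 IMS LIP categories, in the order quoted in the literature review.
IMS_LIP_CATEGORIES = [
    "identification", "goals", "qualifications and licenses", "activity",
    "interest", "competency", "accessibility", "transcript",
    "affiliation", "security", "relationships",
]

# User-learner model class -> IMS LIP category it adapts.
# "activities" -> "activity" is assumed from the names; the other
# renamed pairings are given in the text with "(or ...)".
USER_LEARNER_MODEL_TO_IMS_LIP = {
    "identification": "identification",
    "interest": "interest",
    "learning level": "qualifications and licenses",
    "personal preferences": "accessibility",
    "activities": "activity",
    "learning objectives": "goals",
    "affiliation": "affiliation",
    "network of contacts": "relationships",
}


def unused_ims_lip_categories():
    """List IMS LIP categories with no counterpart in the mapping above."""
    used = set(USER_LEARNER_MODEL_TO_IMS_LIP.values())
    return [c for c in IMS_LIP_CATEGORIES if c not in used]


print(len(USER_LEARNER_MODEL_TO_IMS_LIP))  # 8
print(unused_ims_lip_categories())  # ['competency', 'transcript', 'security']
```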
in the context of thematic digital universities, the resulting subsystem is composed of eight classes instead of eleven. the user-learner model subsystem conveys the structured information to the user-learner subsystem. figure 1 shows the structure of both subsystems, the user-learner and user-learner model.

figure 1. hierarchical representation of both subsystems, the user-learner and user-learner model.

the educational objectives subsystem includes cognitive objectives involved in the process of acquiring knowledge. we design their structure by adapting bloom’s taxonomy. the cognitive objectives class includes six interrelated subclasses: remember (or knowledge), understand (or comprehension), apply (or application), analyze (or analysis), synthesize (or synthesis), and evaluate (or evaluation). the cognitive objectives class is enhanced with the ieee lom, lomfr, and suplomfr classification fields, enabling user-learners to choose the objectives that best describe their needs and preferences, e.g., the class apply has the subclasses design and choose (see fig. 2).

figure 2. hierarchical representation of educational objectives and learning design subsystems.

the learning design subsystem is an adaptation of the ims learning design model to the context of thematic digital universities.56 the learning design subsystem has two main classes: the user-learner’s environment and learning activities. the environment class has the six thematic digital universities as subclasses. in general, information about the environment class comes from the thematic digital universities’ platforms (e.g., the viewed metadata records). the learning activities class has resources as a subclass.
the resources subclass is also enriched with the ieee lom, lomfr, and suplomfr classification fields to complete its structure and meet the user-learner’s needs and expectations. further, we have connected the learning activities with the cognitive objectives classes to ensure continuity between them (e.g., the subclass experimentation is associated with the subclass analyze). figure 2 illustrates the main structure of both subsystems, the educational objectives and learning design.

the knowledge domains subsystem contains the main class dewey decimal classification and the class contacts. this main class has two subclasses: dewey classes, with the corresponding divisions as subclasses, and knowledge topics, with the corresponding subtopics as subclasses (e.g., the science topic corresponds to dewey class 500, and the manufacturing subtopic corresponds to division 670).

figure 3. hierarchical representation of the subsystem knowledge domains.

the subclass knowledge topics is related to the subclass user-learner’s learning topics to improve informational behavior analyses. the class contacts is linked to the subclass user-learner’s network of contacts to analyze the strength or weakness of networks between the user-learner and oer publishers/authors (see fig. 1).

the subsystem knowledge domains can deal with questions that belong to different levels in the ontoulp. for example, which learning topics is the user-learner looking for? which network of contacts is the user-learner interested in? what are the activities related to the user-learner’s learning topics? which searched keywords relate to the user-learner’s learning topics?57 in figure 3, we show some of the subsystem’s elements.

personalized modeling system for the user-learner profile

the psul is based on the works discussed in the previous sections. it is written in php, javascript, and xml, computing languages for the web.
this new modeling system comprises three classes of models: the general behavioral, personal behavioral, and topical (see fig. 4).

the general behavioral model has two roles. it registers a user-learner’s digital profile in order to determine informational needs and preferences for oer. it also collects the informational behaviors of a user-learner while viewing oer metadata records for tailor-made analyses. the general behavioral model includes the ontology ontoulp as well as the user-learner registration and editing pages. the registration page contains relevant information about a user-learner, an option to accept or reject data collection, and a list of choices for behavioral analyses. once registered, the user-learner can modify her/his profile from the editing page. both pages are mapped to the ontoulp to populate criteria fields. the user-learner profile information is stored in a secure database (as described in the introduction).

the personal behavioral model is used to analyze information according to the registered digital user-learner profile and informational behaviors. it contains a set of queries to collect and tailor information for each user-learner. the sources of information are the general behavioral model and oer metadata records. this model is designed based on analyses of the general behavioral model. when a user-learner begins searching for oer, the general behavioral model provides the personal behavioral model with all profile information as well as the history of oer search. this information is transmitted to adjust the personalized user-learner profile. the user-learner profile changes as the personal behavioral model receives more information from the general behavioral model. informational interactions connect the personal behavioral model to the topical models.
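the hand-off from the general behavioral model to the personal behavioral model can be pictured with the following python sketch. it is an illustration under stated assumptions, not the psul code (which the article describes as written in php, javascript, and xml): the class and field names are hypothetical, the search history is reduced to a list of viewed knowledge topics, and the rule for suggesting a profile adjustment (a topic viewed at least twice that is absent from the profile) is invented for the example.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class GeneralBehavioralModel:
    """Registered profile plus collected viewing behavior (hypothetical fields)."""
    topics_of_interest: set
    consent_to_data_collection: bool
    viewed_record_topics: list = field(default_factory=list)

    def record_view(self, knowledge_topic):
        # Behavior is collected only if the user-learner accepted data collection.
        if self.consent_to_data_collection:
            self.viewed_record_topics.append(knowledge_topic)


@dataclass
class PersonalBehavioralModel:
    """Tailors information using what the general model passes on."""
    general: GeneralBehavioralModel

    def topic_view_counts(self):
        return Counter(self.general.viewed_record_topics)

    def suggested_topics(self, min_views=2):
        # Assumed rule: suggest topics viewed at least `min_views` times
        # that are not yet part of the registered profile.
        counts = self.topic_view_counts()
        return sorted(
            topic for topic, views in counts.items()
            if views >= min_views
            and topic not in self.general.topics_of_interest
        )


general = GeneralBehavioralModel({"technology"}, consent_to_data_collection=True)
for topic in ["technology", "science", "science", "management"]:
    general.record_view(topic)

personal = PersonalBehavioralModel(general)
print(personal.suggested_topics())  # ['science']
```

because the personal model reads from the general model each time it is queried, its output changes as more viewing history is collected, which mirrors the behavior described above.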
the topical models bring together all analyses of oer search for each user-learner.58 they are inferred from the personal behavioral model. informational interactions connect the topical models to the general behavioral model. for now, we have designed four topical models and present their outcomes in the user-learner dashboard page. this page may be used as a practical dashboard providing feedback to each user-learner, who can use these analyses to adjust the profile or the oer search.

topical model 1 is used to synthesize each user-learner’s search history and to suggest a profile adjustment. the suggested adjustment is based on analyses of user-learner behavioral trends.59

topical model 2 allows each user-learner to examine the list of knowledge topics that have caught her/his attention. it contains two separate lists describing viewed oer metadata records and matching them to the chosen topics of interest.

topical model 3 shows comparative analyses between the user-learner’s preference criteria and viewed metadata records. the user-learner can interact with this model by comparing the chosen topics of interest to the viewed knowledge topics. the user-learner can also compare the chosen learning activities to the viewed teaching pedagogies. the teaching pedagogies as well as the knowledge topics are extracted from oer metadata records (see fig. 5a).

topical model 4 highlights each user-learner’s interest based on the keyword search volume. the user-learner can interact with this model by studying the relationships between searched keywords and chosen topics of interest (see fig. 5a and fig. 5b).

figure 4 shows the diagram of psul as explained in this section.

figure 4. the psul diagram, based on kelly and belkin’s system (2002).60

ontoulp and its modeling system in the context of a thematic digital university

for now, ontoulp and its application program are implemented in the private digital university engineering and technology platform, which is hosted on a private server. we conducted a technical test mainly to assess ontoulp’s precision and performance. the digital university’s team sent us a complete archive of their oer metadata records. these oer metadata records are saved on the private server with the digital university engineering and technology platform. once a user-learner is registered on this platform, she/he can carry out actions through the psul. these actions include, for example, a search by keyword, personalization of the profile, tailor-made analysis of oer search, and visualization of analyses in the dashboard.

figure 5a. screenshot of a section of the dashboard. the bar chart shows comparative analyses between a user-learner’s topic of interest and knowledge topics. the knowledge topics are extracted from the viewed oer metadata records.

figure 5b. screenshot of a section of the dashboard. the pie chart highlights a user-learner’s interest based on keyword search volume. the bar chart shows comparative analyses between a user-learner’s learning activities and viewed teaching pedagogies. the keywords are extracted from the search. the teaching pedagogies are extracted from oer metadata records.

to keep the article concise, figures 5a and 5b show only brief results of a technical test.
in this example, the user-learner’s identity is fictitious, or the user-learner’s persona is a construct.61 in other words, the user-learner’s identity is not real; it was fabricated to conduct and complete the technical test. when registering, this user-learner selected the technology topic (dewey class 600) in addition to the management and public relations subtopic (dewey division 650). this user-learner also selected all topical models. during a viewing session, this user-learner chose to search for oer using a few keywords. the keywords were chosen according to the user-learner’s profile in order to continue the technical test.

discussion and conclusion

the ontology for the digital user-learner profile is a first approach based on the semantic web. it is designed for the personalization of interactions and the retrieval of tailored information. we have combined standardized and validated resources, such as the ims lip, bloom’s taxonomy, and the knowledge domains ontology, to support analyses of the user-learner’s search. we have discussed the design of a new application program prototype allowing a user-learner to analyze the search for oer according to her/his digital profile. psul provides automated real-time feedback based on the user-learner’s search history and the information she/he has entered about herself/himself. we have then demonstrated the integration of the ontoulp and psul into a mirror site to perform a technical test.

the ontology’s main characteristics are flexibility and adaptability. while designing ontoulp, we have reused or restructured resources to allow its use in other thematic digital universities and online learning environments, including digital libraries. another advantage of ontoulp is the application of several information processing techniques.
for example, a registered user-learner can self-assess her/his search for oer by keywords. she/he can also analyze the relevance of the search for oer through the psul.

we have successfully overcome three essential limitations. the first limitation concerns the literature on the subject (see literature review section). while contributing to the field of research in information technology and digital libraries, this work has also drawn on disciplines as diverse as education and the cognitive, social, and human sciences. the terminological definitions of disciplines, concepts, and even methods vary over decades or centuries, and among groups of researchers. we have made every effort to define the different terms correctly and to cite the corresponding researchers.

the second difficulty relates to the design of ontoulp. published works dealing with this topic are rare. we used an exploratory research approach and the published works of renowned international researchers to fine-tune our study (see the exploratory research approach and literature review sections). we then determined the classes and objects as well as the relationships between them.

the third constraint concerns designing the psul in line with the thematic digital universities’ policies and the general data protection regulation. in accordance with the regulation, we have made both registration with the thematic digital universities and the collection of information on the digital user-learner profile optional. thus, the user-learner will always have the possibility of registering on these platforms to obtain a tailor-made information analysis according to the digital profile.

as we conclude our work, we plan to focus our research and initiatives on the following areas. firstly, we will deepen our study of the ontoulp classes to increase their precision.
we will also examine the search personalization of oer based on the uses and practices of algorithms in the ontoulp.62 for example, by relying on a newer version of the ontology, we will identify the topics likely to interest a specific user-learner. we will implement this newer version in some thematic digital universities to perform technical tests.

secondly, we will conduct qualitative and quantitative studies to analyze participants’ behavior while using ontoulp and its application program in the context of thematic digital universities. for example, we will examine how many participants would choose to use the ontoulp and psul and how many would not (e.g., the usefulness of ontologies to participants). we will analyze the behavior of individuals with digital personae and make connections between their searches for oer.63 we will study their profiles, behaviors, and interests to ultimately suggest oer (e.g., the use of recommendation systems). we will also analyze how participants’ behavior and feedback may affect future findings. participants will be selected in advance to contribute to these studies.

thirdly, we will study the effects of ontoulp and psul practices on the thematic digital universities. this study will analyze the thematic digital universities’ search engines and user-learners’ needs. for example, exploratory research will allow us to better understand user-learners’ informational needs and expectations when using the oer search engines. we will analyze the design of oer search engines in light of these informational needs and expectations. we will then use these findings to suggest alternatives to the thematic digital universities to further improve these search engines.
acknowledgments

we thank the digital university association and thematic digital universities for their elaborate and enlightening explanations concerning the platforms. we thank the reviewers and claude baltz, emeritus professor in information and communication sciences at the paris 8 university, for carefully reviewing this article and for enriching it with their expert observations. thanks to mohammad hajj hussein, communication and it engineer, for his help programming the dashboard.

appendix: semistructured questionnaire example

email subject: digital university engineering and technology

dear sir, madam,

i am affiliated to the paragraph research laboratory at the paris 8 university (laboratoire de recherche paragraphe, université paris 8). i am writing to you to gather further information concerning the digital university engineering and technology. the objective of the semistructured questionnaire is to deepen my comprehension of the practices of digital university engineering and technology in order to write a research article and contribute to its improvement.

i would be grateful if you could answer the following questions:

• what are your responsibilities at the digital university engineering and technology?
• do the thematic digital universities as well as digital university engineering and technology provide “open” educational resources?
• are the educational resources accessible only to students enrolled in the training programs of partner universities?
• how is the access to educational resources made?
• do the educational resources follow document processing for their indexing?
• is the document processing specific to the thematic digital universities?
• what are the expectations of “users” searching for educational resources?
thank you in anticipation

sincerely yours,
marilou kordahi

endnotes

1 “cape town open education declaration: unlocking the promise of open educational resources,” 2007, http://www.capetowndeclaration.org/read-the-declaration.
2 unesco, “forum on the impact of open courseware for higher education in developing countries,” (2002): 24, http://unesdoc.unesco.org/images/0012/001285/128515e.pdf.
3 william and flora hewlett foundation, “open education,” accessed april 5, 2022, https://hewlett.org/strategy/open-education.
4 unesco, “2012 paris oer declaration,” 2012, http://www.unesco.org/new/fileadmin/multimedia/hq/ci/wpfd2009/english_declaration.html.
5 camille thomas, kimberly vardeman, and jingjing wu, “user experience testing in the open textbook adaptation workflow,” information technology and libraries journal 40, no. 1 (2021): 1–18, https://doi.org/10.6017/ital.v40i1.12039.
6 digital university engineering and technology, “open educational resources for engineering and technology,” accessed april 5, 2022, https://unit.eu.
7 jean delpech de saint guilhem, sonia dubourg-lavroff, and jean-yves de longueau, “thematic digital universities,” general inspectorate of the national education and research administration, 2016, https://www.enseignementsup-recherche.gouv.fr/cid104387/www.enseignementsup-recherche.gouv.fr/cid104387/les-universites-numeriques-thematiques.html.
8 asim ullah, shah khusro, and irfan ullah, “bibliographic classification in the digital age: current trends & future directions,” information technology and libraries 36, no. 3 (2017): 48–77, https://doi.org/10.6017/ital.v36i3.8930; anne boyer, “thematic digital universities: report,” sciences et technologies de l'information et de la communication pour l'éducation et la formation 18, no. 1 (2011): 39–52.
9 sihem zghidi and mokhtar ben henda, “open educational resources and open archives in the open access movement: an educational engineering and scientific research crossed analysis,” distances and mediations of knowledge 31 (2020), https://doi.org/10.4000/dms.5347.
10 diane kelly and nicholas j. belkin, “a user modeling system for personalized interaction and tailored retrieval in interactive ir,” proceedings of the american society for information science and technology 39, no. 1 (2002): 316–25, https://doi.org/10.1002/meet.1450390135.
11 ieee learning technology standards committee, “learning object metadata, final draft standard, 1484.12.1-2002,” http://ltsc.ieee.org/wg12.
12 gregory m. shreve and marcia lei zeng, “integrating resource metadata and domain markup in an nsdl collection,” in international conference on dublin core and metadata applications (2003): 223–29.
13 french standardization association, “description standard for the field of education in france – part 1: description of learning resources (nodefr-1), nf z76-041,” 2019.
14 arthur allison, james currall, michael moss, and susan stuart, “digital identity matters,” journal of the american society for information science and technology 56, no. 4 (2005): 364–72, https://doi.org/10.1002/asi.20112.
15 katalin feher, “digital identity and the online self: footprint strategies – an exploratory and comparative research study,” journal of information science 47, no. 2 (2021): 192–205, https://doi.org/10.1177/0165551519879702.
16 robyn caplan and danah boyd, “who controls the public sphere in an era of algorithms,” mediation, automation, power (2016), https://www.datasociety.net/pubs/ap/mediationautomationpower_2016.pdf.
17 thomas r. gruber, “a translation approach to portable ontology specifications,” knowledge acquisition 5, no. 2 (1993): 199–220, https://doi.org/10.1006/knac.1993.1008.
18 gerhard fischer, “user modeling in human–computer interaction,” user modeling and user-adapted interaction 11, no. 1 (2001): 65–86, https://doi.org/10.1023/a:1011145532042.
19 yannia kalfoglou and marco schorlemmer, “ontology mapping: the state of the art,” the knowledge engineering review 18, no. 1 (2003): 1–31, https://doi.org/10.1017/s0269888903000651.
20 tom gruber, “collective knowledge systems: where the social web meets the semantic web,” web semantics: science, services and agents on the world wide web 6, no. 1 (2008): 4–13, https://doi.org/10.1016/j.websem.2007.11.011.
21 peter ingwersen, “search procedures in the library – analysed from the cognitive point of view,” journal of documentation 38, no. 3 (1982): 165–97, https://doi.org/10.1108/eb026727.
22 tefko saracevic, amanda spink, and mei-mei wu, “users and intermediaries in information retrieval: what are they talking about?” in user modeling: proceedings of the sixth international conference (vienna: springer, 1997): 43–54.
23 núria ferran, enric mor, and julià minguillón, “towards personalization in digital libraries through ontologies,” library management 26, no. 4/5 (2005): 206–17, https://doi.org/10.1108/01435120510596062.
24 katrien verbert, erik duval, joris klerkx, sten govaerts, and josé luis santos, “learning analytics dashboard applications,” american behavioral scientist 57, no. 10 (2013): 1500–1509, https://doi.org/10.1177/0002764213479363.
25 digital university association, “open educational resources for all,” accessed april 5, 2022, https://univ-numerique.fr.
26 deborah arnold, “the french thematic digital universities – a 360° perspective on open and digital learning,” in european distance and e-learning network conference proceedings, no. 1 (2018): 370–78.
27 director of the digital university in health and sport messaged author, may 3, 2021.
28 director of the virtual university of environment and sustainable development messaged author, january 6, 2021.
29 director of the digital university in economics and management messaged author, december 8, 2020.
30 general secretary of the open university of the humanities messaged author, may 1, 2021.
31 member of digital university association messaged author, december 18, 2020.
32 director of the digital university engineering and technology messaged author, december 11, 2020.
33 laecio araujo costa, leandro manuel pereira sanches, ricardo josé rocha amorim, laís do nascimento salvador, and marlo vieira dos santos souza, “monitoring academic performance based on learning analytics and ontology: a systematic review,” informatics in education 19, no. 3 (2020): 361–97. 34 benjamin s. bloom, david r. krathwohl, and bertram b. masia, taxonomy of educational objectives: the classification of educational goals (new york: longman, 1984). 35 colin smythe, frank tansey, and robby robson, “ims learner information package. best practice & implementation guide,” ims global learning consortium, 2001. 36 rebecca green and michael panzer, “the ontological character of classes in the dewey decimal classification,” the library, (2009), https://www.ergonverlag.de/isko_ko/downloads/aiko_vol_12_2010_25.pdf 37 marilou kordahi, «le changement de l’apprentissage, l’ontologie du profil de l’utilisateurapprenant, » management des technologies organisationnelles, 10 (2020): 73–88. 38 marilou kordahi, “information literacy: ontology structures user-learner profile in online learning environment,” in seventh european conference on information literacy, (2021): 130, http://ecil2021.ilconf.org/wpcontent/uploads/sites/9/2021/09/ecil2021_book_of_abstracts_final_v3.pdf#page=149. 39 “ims learner information package accessibility for lip best practice and implementation guide,” ims global learning consortium, last revised june 18, 2003, https://www.imsglobal.org/accessibility/acclipv1p0/imsacclip_bestv1p0.html. 40 judy kay, “learner know thyself: student models to give learner control and responsibility,” in proceedings of international conference on computers in education (1997): 17–24. 
https://www.ergon-verlag.de/isko_ko/downloads/aiko_vol_12_2010_25.pdf https://www.ergon-verlag.de/isko_ko/downloads/aiko_vol_12_2010_25.pdf http://ecil2021.ilconf.org/wp-content/uploads/sites/9/2021/09/ecil2021_book_of_abstracts_final_v3.pdf#page=149 http://ecil2021.ilconf.org/wp-content/uploads/sites/9/2021/09/ecil2021_book_of_abstracts_final_v3.pdf#page=149 https://www.imsglobal.org/accessibility/acclipv1p0/imsacclip_bestv1p0.html information technology and libraries june 2022 ontology for the user-learner profile | kordahi 21 41 susan bull, “supporting learning with open learner models,” in proceedings of 4th hellenic conference with international participation information and communication technologies in education (2004): 47–61. 42 peter dolog and wolfgang nejdl, “challenges and benefits of the semantic web for user modelling,” in proceedings of the workshop on adaptive hypermedia and adaptive web-based systems (ah2003) at 12th international world wide web conference (2003). 43 ims global learning consortium, “ims learner information package accessibility for lip best practice and implementation guide,” para. 2. 44 gilbert paquette, “ontology-based educational modelling-making ims-ld visual,” technology, instruction, cognition & learning 7, no. 3–4 (2010): 263–93. 45 john seely brown and richard p. adler, “open education, the long tail, and learning 2.0,” educause review 43, no. 1 (2008): 16–20. 46 open university of the humanities, “how to write and publish a scientific article,” accessed on april 5, 2022, https://uoh.fr/front/noticefr/?uuid=6a063dd7-3a02-482a-9857934501f7c82d. 47 lorin w. anderson, david r. krathwohl, peter w. airiasian, kathleen a. cruikshank, richard e. mayer, paul r. pintrich, james raths, and merlin c. wittrock. a taxonomy for learning, teaching and assessing: a revision of bloom’s taxonomy of educational objectives (new york: longman publishing group, 2001). 
48 birger hjørland, “theories are knowledge organizing systems (kos).” knowledge organization 42, no. 2 (2017): 113–28, https://doi.org/10.5771/0943-7444-2015-2-113. 49 walter moreira and daniel martínez-ávila, “concept relationships in knowledge organization systems: elements for analysis and common research among fields,” cataloging & classification quarterly 56, no. 1 (2018): 19–39, https://doi.org/10.1080/01639374.2017.1357157. 50 wayne a. wiegand, “the ‘amherst method’: the origins of the dewey decimal classification scheme.” libraries & culture 33, no. 2 (1998): 175–94. 51 melvil dewey, dewey decimal classification and relative index, ed. joan s. mitchell, julianne beall, giles martin, and winton e. matthews, 22nd ed., (dublin, ohio: oclc, 2003). 52 joan s. mitchell, “ddc 22: dewey in the world, the world in dewey,” advances in knowledge organization 9 (2004): 139–45. 53 hamid saeed and abdus sattar chaudhry, “using dewey decimal classification scheme (ddc) for building taxonomies for knowledge organisation,” journal of documentation 58, no. 5 (2002): 575–83. 54 rebecca green and michael panzer, “the interplay of big data, worldcat, and dewey,” advances in classification research online 24, no. 1 (2013): 51–58. https://doi.org/10.5771/0943-7444-2015-2-113 https://doi.org/10.1080/01639374.2017.1357157 information technology and libraries june 2022 ontology for the user-learner profile | kordahi 22 55 kelly and belkin, “a user modeling system,” 319. 56 rob koper and colin tattersall, eds., learning design: a handbook on modelling and delivering networked education and training (heidelberg: springer science and business media, 2005). 57 david beer, “envisioning the power of data analytics,” information, communication & society 21, no. 3 (2018): 465–79, https://doi.org/10.1080/1369118x.2017.1289232. 
58 charles lang, george siemens, alyssa wise, and dragan gasevic, eds., handbook of learning analytics (society for learning analytics and research, 2017), https://doi.org/10.18608/hla17. 59 joris klerkx, katrien verbert, and erik duval, “learning analytics dashboards,” in handbook of learning analytics, ed. charles lang, george siemens, alyssa wise, and dragan gasevic, (society for learning analytics and research, 2017), https://doi.org/10.18608/hla17, 143–50. 60 kelly and belkin, “a user modeling system,” 319. 61 roger clarke, “the digital persona and its application to data surveillance,” the information society 10, no. 2 (1994): 77–92, https://doi.org/10.1080/01972243.1994.9960160. 62 ahu sieg, bamshad mobasher, and robin burke, “web search personalization with ontological user profiles,” in proceedings of the sixteenth acm conference on conference on information and knowledge management (2007): 525–34, https://doi.org/10.1145/1321440.1321515. 63 roger clarke, “persona missing, feared drowned: the digital persona concept, two decades later,” information technology & people 27, no. 2 (2014): 182–207, https://doi.org/10.1108/itp-04-2013-0073. 
information technology and libraries | march 2009

mathew j. miles and scott j. bergstrom

classification of library resources by subject on the library website: is there an optimal number of subject labels?

the number of labels used to organize resources by subject varies greatly among library websites. some librarians choose very short lists of labels while others choose much longer lists. we conducted a study with 120 students and staff to try to answer the following question: what is the effect of the number of labels in a list on response time to research questions? we found that response time increases gradually as the number of items in the list grows, until the list size reaches approximately fifty items. at that point, response time increases significantly. no association between response time and relevance was found.

it is clear that academic librarians face a daunting task drawing users to their library’s web presence.
“nearly three-quarters (73%) of college students say they use the internet more than the library, while only 9% said they use the library more than the internet for information searching.”1 improving the usability of library websites should therefore be a primary concern for librarians. one feature common to most library websites is a list of resources organized by subject. libraries seem to use similar subject labels in their categorization of resources. however, the number of subject labels varies greatly. some use as few as five subject labels while others use more than one hundred. in this study we address the following question: what is the effect of the number of subject labels in a list on response times to research questions?

■ literature review

mcgillis and toms conducted a performance test in which users were asked to find a database by navigating through a library website. they found that participants “had difficulties in choosing from the categories on the home page and, subsequently, in figuring out which database to select.”2 a review of relevant research literature yielded a number of theses and dissertations in which the authors compared the usability of different library websites. jeng in particular analyzed a great deal of the published usability testing of digital libraries. the following are some of the points she summarized that were highly relevant to our study:

  ■ user “lostness”: users did not understand the structure of the digital library.
  ■ ambiguity of terminology: problems with wording accounted for 36 percent of usability problems.
  ■ finding periodical articles and subject-specific databases was a challenge for users.3

a significant body of research not specific to libraries provides a useful context for the present research.
miller’s landmark study regarding the capacity of human short-term memory showed as a rule that the span of immediate memory is about 7 ± 2 items.4 sometimes this finding is misapplied to suggest that menus with more than nine subject labels should never be used on a webpage. subsequent research has shown that “chunking,” which is the process of organizing items into “a collection of elements having strong associations with one another, but weak associations with elements within other chunks,”5 allows human short-term memory to handle a far larger set of items at a time.

larson and czerwinski provide important insights into menuing structures. for example, increasing the depth (the number of levels) of a menu harms search performance on the web. they also state that “as you increase breadth and/or depth, reaction time, error rates, and perceived complexity will all increase.”6 however, they concluded that a “medium condition of breadth and depth outperformed the broadest, shallow web structure overall.”7 this finding is somewhat contrary to a previous study by snowberry, parkinson, and sisson, who found that when testing structures of 2⁶, 4³, 8², and 64¹ (2⁶ means two menu items per level, six levels deep), the 64¹ structure grouped into categories proved to be advantageous in both speed and accuracy.8 larson and czerwinski recommended that “as a general principle, the depth of a tree structure should be minimized by providing broad menus of up to eight or nine items each.”9 zaphiris also corroborated that previous findings concerning the depth and breadth of tree structures held true for the web: the deeper the tree structure, the slower the user performance.10 he also found that response times for expandable menus are on average 50 percent longer than for sequential menus.11 both the research and current practices are clear concerning the efficacy of hierarchical menu structures. thus it was not a focus of our research.
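the breadth-by-depth notation in the snowberry, parkinson, and sisson comparison can be made concrete with a short python sketch. it is only an illustration of the arithmetic: the function names are hypothetical, and the "labels displayed" figure assumes a user sees one full menu at every level on the way to an end item, which is a simplification not taken from the cited studies.

```python
# Menu structures from the comparison, written as (items per level, levels).
# A structure written b^d offers b choices at each of d levels.
structures = [(2, 6), (4, 3), (8, 2), (64, 1)]


def leaf_count(breadth: int, depth: int) -> int:
    """Number of end items reachable in a full menu tree of this shape."""
    return breadth ** depth


def labels_displayed(breadth: int, depth: int) -> int:
    """Labels shown along one path to an end item (one menu per level).

    Assumption for illustration: every level shows exactly `breadth` labels.
    """
    return breadth * depth


for breadth, depth in structures:
    print(
        f"{breadth}^{depth}: {leaf_count(breadth, depth)} end items, "
        f"{depth} selections, {labels_displayed(breadth, depth)} labels displayed"
    )
```

all four structures lead to the same number of end items, so the comparison isolates the trade-off between the number of selections a user must make and the number of labels scanned at each step.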
the focus instead was on a single-level menu and how the number and characteristics of subject labels would affect search response times.

■ background

in preparation for this study, library subject lists were collected from a set of thirty library websites in the united states, canada, and the united kingdom. we selected twelve lists from these websites that were representative of the entire group and that varied in size from small to large. to render some of these lists more usable, we made slight modifications. there were many similarities between label names.

mathew j. miles (milesm@byui.edu) is systems librarian and scott j. bergstrom (bergstroms@byui.edu) is director of institutional research at brigham young university–idaho in rexburg.

■ research design

participants were randomly assigned to one of twelve experimental groups. each experimental group was shown one of the twelve lists selected for use in this study. roughly 90 percent of the participants were students. the remaining 10 percent of the participants were full-time employees who worked in these same departments. the twelve lists ranged in number of labels from five to seventy-two:

  ■ group a: 5 subject labels
  ■ group b: 9 subject labels
  ■ group c: 9 subject labels
  ■ group d: 23 subject labels
  ■ group e: 6 subject labels
  ■ group f: 7 subject labels
  ■ group g: 12 subject labels
  ■ group h: 9 subject labels
  ■ group i: 35 subject labels
  ■ group j: 28 subject labels
  ■ group k: 49 subject labels
  ■ group l: 72 subject labels

each participant was asked to select a subject label from a list in response to eleven different research questions. the questions are listed below:

1. which category would most likely have information about modern graphical design?
2. which category would most likely have information about the aztec empire of ancient mexico?
3. which category would most likely have information about the effects of standardized testing on high school classroom teaching?
4. which category would most likely have information on skateboarding?
5. which category would most likely have information on repetitive stress injuries?
6. which category would most likely have information about the french revolution?
7. which category would most likely have information concerning walmart’s marketing strategy?
8. which category would most likely have information on the reintroduction of wolves into yellowstone park?
9. which category would most likely have information about the effects of increased use of nuclear power on the price of natural gas?
10. which category would most likely have information on the electoral college?
11. which category would most likely have information on the philosopher immanuel kant?

the questions were designed to represent a variety of subject areas that library patrons might pursue. each subject list was printed on a white sheet of paper in alphabetical order in a single column, or in double columns when needed. we did not attempt to test the subject lists in the context of any web design. we were more interested in observing the effect of the number of labels in a list on response time independent of any web design. each participant was asked the same eleven questions in the same order. the order of questions was fixed because we were not interested in testing for the effect of order and wanted a uniform treatment that would not introduce extraneous variance into the results. for each question, the participant was asked to select a label from the subject list under which they would expect to find a resource that would best provide information to answer the question. participants were also instructed to select only a single label, even if they could think of more than one label as a possible answer.
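the random assignment step in the research design could be carried out along the following lines. this is a minimal sketch, not the authors' procedure: the article does not say how participants were randomized or how large each group was, so the equal group sizes, the participant numbering, and the function name `assign_groups` are all assumptions made for illustration.

```python
import random


def assign_groups(participant_ids, group_labels, seed=None):
    """Shuffle participants, then deal them round-robin into the groups.

    Assumption: groups are kept as equal in size as possible. The article
    reports 120 participants and twelve groups but not the group sizes.
    """
    rng = random.Random(seed)
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    assignment = {label: [] for label in group_labels}
    for index, participant in enumerate(shuffled):
        label = group_labels[index % len(group_labels)]
        assignment[label].append(participant)
    return assignment


# Hypothetical participant numbers 1..120 and the twelve group letters a..l.
groups = assign_groups(range(1, 121), list("abcdefghijkl"), seed=2009)
for label, members in groups.items():
    print(label, len(members))
```

fixing the seed makes the assignment reproducible, which helps when the allocation has to be documented alongside the results.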
participants were encouraged to ask for clarification if they did not fully understand the question being asked. recording of response times did not begin until clarification of the question had been given. response times were recorded unbeknownst to the participant. if the participant was simply unable to make a selection, that was also recorded. two people administered the exercise. one recorded response times; the other asked the questions and recorded label selections.

relevance rankings were calculated for each possible combination of question and label within a subject list. for example, if a subject list consisted of five labels, for each question there were five possible answers. two library professionals (one with humanities expertise, the other with sciences expertise) assigned a relevance ranking to every possible combination of question and labels within a subject list. the rankings were then averaged for each question–label combination.

■ results

the analysis of the data was undertaken to determine whether the average response times of participants, adjusted by the different levels of relevance in the subject list labels that prevailed for a given question, were significantly different across the different lists. in other words, would the response times of participants using a particular list, for whom the labels in the list were highly relevant to the question, be different from those of participants using the other lists for whom the labels in the list were also highly relevant to the question? a separate univariate general linear model analysis was conducted for each of the eleven questions. the analyses were conducted separately because each question represented a unique search domain. the univariate general linear model provided a technique for testing whether the average response times associated with the different lists were significantly different from each other.
this technique also allowed for the inclusion of a covariate (relevance of the subject list labels to the question) to determine whether response times at an equivalent level of relevance were different across lists. in the analysis model, the dependent variable was response time, defined as the time needed to select a subject list label. the covariate was relevance, defined as the perceived match between a label and the question. for example, a label of “economics” would be assessed as highly relevant to the question, what is the current unemployment rate? the same label would be assessed as not relevant for the question, what are the names of four moons of saturn? the main factor in the model was the actual list being presented to the participant. there were twelve lists used in this study. the statistical model can be summarized as follows:

response time = list + relevance + (list × relevance) + error

the general linear model required that the following conditions be met: first, data must come from a random sample from a normal population. second, the variances within each of the groupings must be the same (i.e., they have homoscedasticity). an examination of whether these assumptions were met revealed problems both with normality and with homoscedasticity. a common technique, logarithmic transformation, was employed to resolve these problems. accordingly, response-time data were all converted to common logarithms. an examination of assumptions with the transformed data showed that all questions but three met the required conditions. the three questions (5, 6, and 7) were excluded from subsequent analysis.

figure 1. the overall average of average search times for the eight questions for all experimental groups (i.e., lists)
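the effect of converting response times to common logarithms can be illustrated with a small python sketch. the numbers below are invented for the example and are not data from the study, and the ratio of largest to smallest group variance is used only as a rough indicator of unequal variances; the article does not say which diagnostic the authors applied.

```python
import math
import statistics

# Hypothetical response times in seconds for two lists (NOT study data).
# The long list has a heavy right tail, as timing data often does.
response_times = {
    "short list": [2.1, 2.8, 3.0, 3.9, 4.4, 6.5],
    "long list": [4.0, 6.2, 9.5, 12.1, 20.3, 41.0],
}


def common_log(values):
    """Convert each response time to its base-10 (common) logarithm."""
    return [math.log10(value) for value in values]


def variance_ratio(groups):
    """Largest sample variance divided by the smallest across groups."""
    variances = [statistics.variance(group) for group in groups]
    return max(variances) / min(variances)


raw_ratio = variance_ratio(response_times.values())
log_ratio = variance_ratio(common_log(times) for times in response_times.values())

print(f"variance ratio, raw seconds: {raw_ratio:.1f}")
print(f"variance ratio, log10 scale: {log_ratio:.1f}")
```

with right-skewed timing data of this kind, the ratio on the log scale is much closer to 1 than the ratio on the raw scale, which is the sense in which the transformation brings the groups closer to equal variances.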
■ conclusions

the series of graphs in the appendix shows the average response times, adjusted for relevance, for eight of the eleven questions for all twelve lists (i.e., experimental groups). three of the eleven questions were excluded from the analysis because of heteroscedasticity. an inspection of these graphs shows no consistent pattern in response time as the number of items in the lists increases. essentially, this means that, for any given level of relevance, the number of items in the list does not affect response time significantly. it seems that for a single question, characteristics of the categories themselves are more important than the quantity of categories in the list. the response times using a subject list with twenty-eight labels are similar to the response times using a list of six labels. a statistical comparison of the mean response time for each group with that of each of the other groups for each of the questions largely confirms this. very few of the comparisons were statistically significant. the spikes and valleys of the graphs in the appendix are generally not significantly different.

however, when the average response times for each list are combined into an overall average across all eight questions, a somewhat clearer picture emerges (see figure 1). response times increase gradually as the number of items in the list increases until the list size reaches approximately fifty items. at that point, response time increases significantly. no association was found between response time and relevance. a fast response time did not necessarily yield a relevant response, nor did a slow response time yield an irrelevant response.

■ observations

we observed that there were two basic patterns exhibited when participants made selections.
the first pattern was the quick selection—participants easily made a selection after performing an initial scan of the available labels. nevertheless, a quick selection did not always mean a relevant selection. the second pattern was the delayed selection. if participants were unable to make a selection after the initial scan of items, they would hesitate as they struggled to determine how the question might be reclassified to make one of the labels fit. we did not have access to a high-tech lab, so we were unable to track eye movement, but it appeared that the participants began scanning up and down the list of available items in an attempt to make a selection. the delayed selection seemed to be a combination of two problems: first, none of the available labels seemed to fit. second, the delay in scanning increased as the list grew larger. it’s possible that once the list becomes large enough, scanning begins to slow the selection process. a delayed selection did not necessarily yield an irrelevant selection. the label names themselves did not seem to be a significant factor affecting user performance. we did test three lists, each with nine items and each having different labels, and response times were similar for the three lists. a future study might compare a more extensive number of lists with the same number of items with different labels to see if label names have an effect on response time. this is a particular challenge to librarians in classifying the digital library, since they must come up with a few labels to classify all possible subjects. creating eleven questions to span a broad range of subjects is also a possible weakness of the study. we had to throw out three questions that violated the assumptions of the statistical model. we tried our best to select questions that would represent the broad subject areas of science, arts, and general interest. we also attempted to vary the difficulty of the questions. 
a different set of questions may yield different results.

references

1. steve jones, the internet goes to college, ed. mary madden (washington, d.c.: pew internet and american life project, 2002): 3, www.pewinternet.org/pdfs/pip_college_report.pdf (accessed mar. 20, 2007).
2. louise mcgillis and elaine g. toms, “usability of the academic library web site: implications for design,” college & research libraries 62, no. 4 (2001): 361.
3. judy h. jeng, “usability of the digital library: an evaluation model” (phd diss., rutgers university, new brunswick, new jersey): 38–42.
4. george a. miller, “the magical number seven plus or minus two: some limits on our capacity for processing information,” psychological review 63, no. 2 (1956): 81–97.
5. fernand gobet et al., “chunking mechanisms in human learning,” trends in cognitive sciences 5, no. 6 (2001): 236–43.
6. kevin larson and mary czerwinski, “web page design: implications of memory, structure and scent for information retrieval” (los angeles: acm/addison-wesley, 1998): 25, http://doi.acm.org/10.1145/274644.274649 (accessed nov. 1, 2007).
7. ibid.
8. kathleen snowberry, mary parkinson, and norwood sisson, “computer display menus,” ergonomics 26, no. 7 (1983): 705.
9. larson and czerwinski, “web page design,” 26.
10. panayiotis g. zaphiris, “depth vs. breadth in the arrangement of web links,” www.soi.city.ac.uk/~zaphiri/papers/hfes.pdf (accessed nov. 1, 2007).
11. panayiotis g. zaphiris, ben shneiderman, and kent l. norman, “expandable indexes versus sequential menus for searching hierarchies on the world wide web,” http://citeseer.ist.psu.edu/rd/0%2c443461%2c1%2c0.25%2cdownload/http://coblitz.codeen.org:3125/citeseer.ist.psu.edu/cache/papers/cs/22119/http:zszzszagrino.orgzszpzaphirizszpaperszszexpandableindexes.pdf/zaphiris99expandable.pdf (accessed nov. 1, 2007).

appendix.
response times by question by group

[figure: eight graphs of response time by group, one each for questions 1, 2, 3, 4, 8, 9, 10, and 11; groups are shown in order of list size: a (5 items), e (6), f (7), b (9), c (9), h (9), g (12), d (23), j (28), i (35), k (49), l (72)]

do you believe in magic? exploring the conceptualization of augmented reality and its implications for the user in the field of library and information science

elizabeth zak

information technology and libraries | december 2014

abstract

augmented reality (ar) technology has implications for the ways that the field of library and information science (lis) serves users and organizes information. through content analysis, the author examined how ar is conceptualized within a sample of lis literature from the library, information science and technology abstracts (lista) database and google blogs postings. the author also examined whether radical change theory (rct) and the digital-age principles of interactivity, connectivity, and access are present in the discussion of this technology. the analysis of data led to the identification of 14 categories comprising 132 total codes across sources within the data set. the analysis indicates that the conceptualization of ar, while inconsistent, suggests expectations that the technology will enhance the user experience. this can lead to future examinations of user behavior, response, and observation of technologies like ar.

introduction

it seems an understatement to say digital technology is changing quickly.
cell phones are like small computers in our pockets; we have access to far greater computing resources in “the cloud” than we did just five years ago, and computers are processing at speeds once only a fantasy. this digital revolution includes the continued development of augmented reality (ar) applications. at its simplest, ar is a blending of the physical environment with digital elements.

as with many of the latest technologies, the development of ar is interdisciplinary. professionals in the fields of computer science, psychology, and philosophy seem to direct the discussion on the development and application of ar technology, as evidenced by the volume of literature retrieved when searching these subject databases for articles pertaining to ar. the field of library and information science (lis) seems largely absent from the conversation.

while elements of ar, such as global positioning systems (gps), quick response (qr) codes, and virtual reality, are not uncommon in lis literature, rarely are these topics defined as ar. information theory, information behavior, knowledge management, information architecture, and digital literacy (to name only a few) are key areas of study within lis, which can be central in developing and exploring ar.

elizabeth zak (ezak@dom.edu) is adjunct instructor at dominican university, river forest, illinois.

the focus on and the definitions of the user within lis provide a much different perspective on the human aspect of engaging with digital information than is found within computer science.
literature on human-computer interaction focuses more on the user as a piece of the system, with a shift only recently toward acknowledging this misdirected focus.1,2,3,4,5,6,7,8 libraries and information agencies have the tools and skills (with regard to user interaction with and use of information) to help answer questions relating to how the conceptualization of a technology like ar influences the use of the technology.

augmented reality (ar) defined

what is ar, really? ronald azuma, a pioneer and innovator in the research and creation of ar applications, describes ar as a supplement to reality.9 it combines the real and the virtual, aligning the virtual with the real environment.10 ar is part of a mixed reality continuum, and “the technology depends on our ability to make coherent space from sensory information.”11 this coherent space is dependent on several variables, one of which is ar’s real-time interactivity. ar applications cannot experience any delays in response time; if the real and virtual are misaligned, it impedes the sense of reality. ar needs to happen at the same speed as real life, with virtual actions coinciding with human actions, and variable across users.

some also view ar as a new media experience, adding to the growing list of digital literacies. “while the pure technology provides a set of features that can be exploited, the features of the new technology will develop into one or more particular forms within a particular historical and cultural setting.”12 this includes “remediating” existing media, such as film or stage productions, with ar components.
as a result, ar is contextual and reliant on each personal experience, but it also borrows from earlier forms of media within those contexts and experiences. for this reason, it is important to examine just how those within lis are constructing ar as a concept. the rapid pace at which ar is evolving and gaining in popularity suggests that those within the field of lis will need to be aware of new applications for the technology as more and more users may come to expect access to and knowledge of this technology.

much of the literature also references ar as being an enhancement. but what, exactly, is ar enhancing? the answer includes our senses and perceptions,13 software,14 our emotions and feelings,15 and question-answering programs.16 authors use the term enhancement with no explanation of how it is defined in each piece of literature. some assume more is always better. missing is the voice of the user or consumer, whom authors refer to as a subject or wearer. the irony is that many of these users are the ones creating, building, and populating these ar worlds. i propose that the field of lis is in a prime position to address the missing piece of the research and discourse of ar.

many of the technological enhancements authors describe in the ar literature connect to the development of knowledge, one of the main goals of computing.17 there are a variety of settings in which some expect ar to create new routes to knowledge.
within a library setting, ar can improve library instruction,18 provide information retrieval about shelved books through recognition applications,19 reconstruct and restore artifacts,20 and deliver services at point of need through qr codes.21 others view ar as a technological breakthrough with extreme potential in medical fields, such as non-invasive surgical visualization, which can display the organs through sensors placed on the body, helping medical practitioners and students to understand internal body functions.22

ar is also said to perform some of its “enhancements” in the classroom.23 augmented books are growing in number and are expected to enhance collaboration and interactivity between students. some allow more than one person to explore the same content at the same time or outside of the classroom through mobile technology; others give students the ability to rotate, tilt, and manipulate viewing angles of various objects.24

ar technology within art education is said to have a positive impact on student motivation.25 in a 2010 study, reference librarians at the university of manitoba were given smartphones and asked to create innovative projects. the result was a public art project using social networks, gps software, and ar technology that allowed users to interact with the art pieces through qr codes.26 again and again, interactivity, connectivity, and mobility of ar applications are highlighted as efficient and motivating factors in education and learning.27,28,29,30

organizations are also testing ar applications for general public use.
museums and galleries contain ar virtual displays of artifacts and historic scenes, personalizing interactive experiences and providing multiple perspectives of events and artifacts.31 the natural history museum in london, for example, created the attenborough studio in 2009 for live events and the viewing of ar-enhanced films.32 ar tours are also offered on historic places and spaces. augmented reality is already in use as a tour guide application,33 to supplement paper maps,34 and even to reconstruct damaged historic sites.35 ar is essentially a mutable form of displaying what are typically static objects and ideas.

the use of ar is integrative. it can saturate real landscapes, places, and spaces with virtual characteristics; it can add to or hide objects in the environment.36 manipulation of the environment promotes “immersive experiences,”37 multifaceted for different people interacting with ar.38 virtual experiences are expected to enhance the real-life experiences of the user. spaces can become layered, or scaffolded, not only through depth perception or ar software, but also through the user’s contextual life experiences.39

with growing widespread use, it is imperative for those in the field of lis to understand this technology and how it is used. how will these applications be archived? will institutions and organizations collaborate with users in creating these applications? will users expect librarians and other knowledge workers to help them understand and use the technology? will libraries and other institutions use the technology, and if so, in what capacity?
these are just some of the questions lis professionals should be asking to ensure they are meeting user needs in relation to technology and the access of information through technology. this study provides a base for considering these questions, and the effects of ar, by situating the concept and expectations of ar within the field and aligning the conceptualization with radical change theory.

the user within ar and lis

the use of ar can bring with it an altering of emotional and psychological experiences. pederson argues ar applications should be “human-centric.”40 because human beings “instrumentalize” technology such as ar, the technology itself should accommodate the human needs for ar tools.41 for ar applications to be successful, they must display the human characteristics encompassing reactions to one another and the environment.42 applications blurring the lines between the real and the virtual, and engaging the person in ways that play on perception, are capable of changing the real-life perceptions and expectations of the user. ar can be either a form of escapism for brief moments or an escape from what we know as real life for good.43

what much of the research assumes is the desire for ar applications. lacking are surveys of user desire for ar, or examination of potential negative consequences. a user focus is central to lis and the values of librarianship. within all these potential uses, what is not being discussed is how creators of ar applications organize, categorize, and choose the information contained within these applications.
much of the discussion surrounding the implications of ar use is left to those in fields of philosophy and psychology, dealing in abstraction. the research and marketing of ar applications promote passive acceptance of ar technologies.44 there comes a point when ar researchers must ask, to what extent are potential users aware of and desirous of ar applications?

the notion of people as users is quite similar to the contextualization of human as subject in this scientific literature, different from the notion of user in lis. in computer-science literature, users are those reacting to ar and providing ar researchers with content for ar modification as evaluators of the technology. within computer-science literature, there is much talk of the user’s satisfaction with tested ar applications;45 there is also much talk of the user’s position within the ar frame/environment.46 similar is the discussion of the user and the physical space of the ar application, and how the user responds to the ar functions.47 user and subject are terms used interchangeably in the scientific literature. these terms are largely undefined, indicating little regard for the role of the human outside of objectification as a tool working in the ar environment.

within the field of lis, the notion of the user takes on different characteristics: users are clearly patrons, human beings, for whom lis provides a service.
kuhlthau focuses on user-centric treatment of information services with her information search process model.48 chatman’s theories of information poverty and life in the round both center on the information-seeking behaviors of ordinary people with everyday needs.49 library anxiety, as developed by mellon, pivots on the emotions and psychological responses of information-seekers, or users.50 there are many more theories and models demonstrating this user-centric focus in lis. case points to dervin’s 1976 article “strategies for dealing with human information needs: information or communication?” as an exemplar of the shift in thinking about the role and needs of the user in lis.51 dervin pointed to assumptions in research on information-seeking, such as more information as better, and relevant information existing for every need. it is important to note whether there are assumptions now being made in lis literature with regard to user interaction with ar and how best to understand user-centered design of ar applications.

bowler et al. define user-centered design as “[reflecting] the user, typically from a cognitive, affective or behavioral point of view, as well as the social, organizational, and cultural contexts in which users function.”52 these points of view (the cognitive, social, behavioral, and the like) all synchronize within ar applications. bowler et al. state, “with the increased use of digital, networked information tools in daily practice and the emergence of the digital library and archive, it is impossible to separate the service from the system.
in this context, understanding the user becomes more critical than ever.”53 with the advent of ar and its dependence on user interaction, it is imperative to continue to address the role of the user.

o’brien and toms further the discussion by trying to define user engagement with technology. the authors define engagement as a process composed of three states: a point of engagement, a period of engagement, and then finally disengagement.54 attention to user needs and behavior as an individualized process is evident across lis literature. shu suggests user engagement in website design as a means of strengthening user relationships with organizations through a study of web 2.0 [interactive internet applications].55 idoughi, seffah, and kolski recommend integrating “personae,” or perceived personality types and characteristics, into user-design to address challenges in creating software offering highly personalized services.56 pang and schauder take a community-based approach to systems design, particularly in libraries and museums, and encourage system designers to draw on the study of relationships and interaction within different communities as a means to gain insight into more user-centric design methods.57

a user-centric focus has extended itself to catalog design58 and information-retrieval systems design.59 it has been applied as a learning approach to organizational culture within libraries.60 scholarly research within the library is said also to have benefitted because a user-centric focus informs how different types of users interact with information within the library.61 through an analysis of user language, user-centricity is also applied as a means to identify strategies for
creating language tools for web searching.62 such studies are representative of the ways a user-centric paradigm proliferates within lis.

these models and studies highlight the importance of the user perspective and how the user engages with information at various levels. radical change theory (rct), developed by eliza dresang in 1999, goes one step further and adds another element to the user-information interaction: the user’s interaction with other users and the response to the interactive applications and digital technologies in which information is becoming embedded. this element is mirrored in such applications as twitter and foursquare, which allow digital overlays of information to see if there are individuals “near you” also logged on to these networking applications.

the highly complex ar applications call for a highly complex view of human interaction with those applications. radical change theory helps to explicate the characteristics of human expectations and interactions with information in digital formats. coupled with an examination of ar discourse within lis literature, rct helps clarify how people, or users, are approaching the changing information landscape.

research approach

this study focuses on the user-centric paradigm prevalent in lis, seeking to understand the relationship between the conceptualization of an emerging technology and the role of the user. the research questions guiding this study are the following:

• how is augmented reality (ar) conceptualized in lis?
• what is the role of the user in relation to ar as it is conceptualized in lis?
the aim is to understand how a specifically lis user-centric focus can apply to the conceptualization of ar and its use within libraries and cultural heritage institutions.

the model for this study is clement and levine’s study, which examined how pre-1978 dissertations were published and what the concept of copyright was for those dissertations. the unit of sampling in their study was the written message, “defined as a complete statement or series of statements with a distinct start and end.” each message under investigation required author-specified semantic concepts. the researchers then selected recording units, what they describe as “explicit assertions” pertaining to the publication of dissertations. the authors delineated explicit assertions as taking several forms, from a phrase within a sentence to a multipage argument.63

for the purpose of this study, i investigated written messages contained in the journals and blog posts i chose as well as links to other webpages or blog posts contained in the initial data set. the semantic concept is the term augmented reality. the recording unit is any explicit assertion made regarding augmented reality, using the same range of form as clement and levine.64

in this study, i allied content analysis with radical change theory (rct). rct focuses on the characteristics of interactivity, connectivity, and access, which are all dependent on measures of human connections through various forms of media. the characteristics of these principles are as follows:

• interactivity refers to dynamic, user-initiated, nonlinear, nonsequential, complex information behavior, and representation.
• connectivity refers to the sense of community or construction of social worlds that emerge from changing perspectives and expanded associations.
• access refers to the breaking of long-standing information barriers, bringing entrée to a wide diversity of opinion and opportunity.65

content analysis of lis literature related to ar through the rct framework allowed me to find connections between the conceptualization of a technology and how the conceptualization functions within a given academic community. during the data analysis, i assessed whether the characteristics of interactivity, connectivity, and access are present in the descriptors of ar. understanding how ar is brought into the discourse of lis in lis literature helps describe the ways an emerging technology is developed as a concept and as a tool through an examination of how researchers and practitioners perceive the need for and use of ar.

data collected from searches of google blogs and the library, information science and technology abstracts with full text (lista) database aided in understanding the conceptualization of ar and the role of the user in relation to the conceptualization. all the searches took place over three months, december 2012 to february 2013, and all searches centered on the search term “augmented reality” (the term was enclosed in quotation marks in the search box).

through purposeful criterion sampling, all search results including the term “augmented reality” are included in the initial list of data for analysis. i chose the google blogs search engine because of its popularity, familiarity, ease of use, and variety of viewpoints.
blogs are an important source of data because they continue to increase in popularity across disciplines, including lis, and serve as a way for people to communicate with one another and exchange information.66 the lista database is easy to access; provided free to libraries; familiar to students, faculty, and professionals in the field of lis; and covers a broad spectrum of general and specialized journals.

together, search results comprise both academic and popular, or mainstream, sources.

google blogs

i first gathered data from searches of google blogs. i conducted two separate searches of the search term “augmented reality.” the first was in december 2012 and the second was in february 2013. the first search yielded 373 results and the second yielded 376. i used the advanced search function to limit the first search to blog postings between june 2012 and december 2012. the second search was limited to june 2012 to february 2013. blog postings excluded from the final body of data for analysis include foreign-language entries, duplicate items, video-only postings, and advertisements, resulting in a final data set of 300 postings.

library, information science and technology abstracts with full text (lista)

after completing my search of google blogs, i gathered data from lista database searches. i searched the database once a month for three months, from december 2012 to february 2013. i divided each monthly search into three search types: first by author-supplied keyword, second by subject terms, and third by all-text, resulting in nine searches. i did this to determine whether the results would differ across each search specification.
i then cross-referenced these lists and removed duplicate and foreign-language results to compile one complete data set of 160 articles.

recording units from the articles, blog, or social media postings i collected from these searches include explicit assertions concerning ar, which i then coded. for the purposes of this exploratory study, i developed codes inductively rather than approaching the data with a predetermined set of codes, as inductive codes arise from the interpretation of the coded data.67 an example of an assertion in the data set is the following, taken from an article in my pilot study:

“ar is a very efficient technology for both higher education such as universities and colleges. students in both schools can improve their knowledge and skills, especially on complex theories or mechanisms of systems or machinery.”68

i categorized these assertions according to the themes or codes embedded in them. for instance, in the assertion above, the notion that ar is efficient and capable of improving or adding to knowledge and skills could be categorized within similar assertions about the value or purpose of ar. after i organized the codes and categories, i determined which, if any, of these codes coincide with the rct digital-age principles of interactivity, connectivity, and access.

i developed 132 codes. a further reduction of codes is not feasible given the myriad ways ar is defined across sources in the data set, which underscores the lack of consensus on just how ar is truly understood. i went through the codes and grouped them according to similarities and overarching themes labeled as categories.
the 132 codes make up 14 categories, listed in table 1.

how is ar conceptualized in lis?

the lista database includes more than 560 journals from lis and related information-science fields such as communication and museum studies. of the 77 lista sources included in the data set for this study, 46 sources are peer-reviewed; 31 are not. of those journals not peer-reviewed, all of them focus specifically on issues in the media, technology, education or libraries. based on my analysis, the categories or overarching themes most prominent in the lista data set are “ar as a new direction,” “ar as informational,” and “ar as an enhancement.”

taken together, these categories suggest ar can deliver information and the user can interact with information through an enhanced experience, which is a new direction in technology. individually, these categories are loaded with implications based on the codes each category encompasses. the category of ar as a new direction itself includes twenty-five codes. the codes include assertions of “ar as a new normal, providing opportunities” for those willing to implement the technology because of its “potential, versatile” (yet debatable) range of uses from the business sector to education.
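the collection and coding steps described in the research approach can be sketched in code. the following is a minimal illustration, not the procedure actually used in this study: the record fields ("title", "language"), the title-based duplicate check, and all function names are assumptions made for the example, and the sample codes are only a small illustrative subset of the 132.

```python
from collections import defaultdict

# principles of radical change theory named in the study
RCT_PRINCIPLES = {"interactivity", "connectivity", "access"}


def merge_results(result_lists):
    """Combine several search-result lists into one de-duplicated list.

    Each result is a dict with hypothetical keys "title" and "language".
    Matching duplicates on a normalized title is an assumption; the study
    does not state how duplicates were identified.
    """
    seen = set()
    merged = []
    for results in result_lists:
        for item in results:
            # normalize case and whitespace before comparing titles
            key = " ".join(item["title"].lower().split())
            if item["language"] != "en" or key in seen:
                continue  # skip foreign-language and duplicate items
            seen.add(key)
            merged.append(item)
    return merged


def group_codes(assignments):
    """Group (code, category) pairs into a category -> [codes] mapping."""
    categories = defaultdict(list)
    for code, category in assignments:
        categories[category].append(code)
    return dict(categories)


def rct_matches(categories):
    """List, per category, the codes that name an RCT principle."""
    matches = {}
    for category, codes in categories.items():
        hits = [code for code in codes if code in RCT_PRINCIPLES]
        if hits:
            matches[category] = hits
    return matches


# illustrative subset of codes and categories; not the full code set
SAMPLE_ASSIGNMENTS = [
    ("delivering information", "informational"),
    ("superimposes information", "informational"),
    ("enhance reality", "an enhancement"),
    ("interactivity", "an enhancement"),
    ("connectivity", "an enhancement"),
]

if __name__ == "__main__":
    grouped = group_codes(SAMPLE_ASSIGNMENTS)
    flagged = rct_matches(grouped)
    for category, codes in grouped.items():
        print(category, len(codes), flagged.get(category, []))
```

keeping the grouping separate from the rct check mirrors the order described in the study: codes are first organized into categories, and only afterward compared against the principles of interactivity, connectivity, and access.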
• a new direction: a new world; new direction; new normal; relevant; unstoppable; important; raising expectations; popular; trend; versatile; has potential; needs further development; under-utilized; unfamiliar; promising; searching; opportunity for leadership; provides opportunity; background; debatable; familiar; skepticism & understanding; entertaining over practical; ill-defined; redefined; valuable
• obtainable: access*; questionable access; democratization; relative ease of use; simplicity; requiring knowledge
• characterized by empty descriptors: amazing; cool; exciting; fun; unique; awkward; unpredictable; entrancing; phenomenon; magic
• evoking legal questions: beginning litigation; marketing disputes; privacy concerns
• acting upon reality: blurs line between real & virtual; distinction between real & virtual; layered reality; overlay worlds; integrated; embellishment; improves reality; bringing to life; used to simulate; unifying; crossing boundaries; intelligent; transferring intelligence; making meaning; multimedia display; generates media; visual as better than textual; an experience
• a modifier: catalyst for change; changing network structure; eliminates objects; potential for eliminating people; transformative; revolutionary; problem solver; restorative; capable of injury; safety; disruptive; innovative; influential; powerful
• confrontational: challenging perceptions; challenging; therapeutic; utilizes mobile devices; wearable
• informational: changing definitions of personal information; delivering information; defining relevant information; helps gather information; presenting location-based information; speed of information delivery; superimposes information; providing services to user
• conditioning the environment: control environment; creating environment; increases view of the environment; contextual; omnipotent presence
• used as a tool: discovery tool; educational tool; marketing tool; utility for library operations; utility for library operations; library instruction aid; mobile learning; reading aid; increases motivation; promote libraries; prompts action
• impacting economics: economic barriers; economic growth; low cost; reducing costs; business model; retail; measurable; niche market; gimmick
• progressive: eighth mass medium; occurs in phases; part of a continuum
• involving imagination: envisioning; science fiction; the future
• an enhancement: enhance communication; enhance user experience; enhance reality; enhancing learning and training; enhancing the library; enriching; engaging; interactivity*; building relationships; collaboration; connectivity*

table 1. codes grouped by category. * denotes principles of radical change theory

while the technology still needs development in terms of user awareness of the technology, a clear definition of the technology, and an understanding of its full range of uses, some view it as a trend that raises the bar of technological expectations.
when viewed as a growing trend, ar is creating a new world or platform for information delivery.

furthermore, in the lista sources ar is also viewed as informational. this category comprises eight codes, all of which refer to the capability of ar to deliver, gather, define, present, and superimpose information rapidly. in this context for lis, ar is another format for providing the user with information tailored to specific user needs. the ways ar can provide users with information are nestled in the view of ar as an enhancement.

within this category, ar is described as an enhancement of reality, communication, experiences, and learning. under this category there is little ar will not enhance. the enhancement of the user experience is directly tied to the informational quality of ar. this enhancement rests on the engaging interactive qualities of ar fostering relationships and collaboration through the property of connectivity. under this definition, the digital information ar displays or “creates” is the enhancement of the experience.

sources in the data set also suggest ar enhances the learning experience through the digital images or objects presented to the user in conjunction with the “original materials.” the connection of ar to the internet and sources therein also gives users the ability to connect and collaborate with one another. the enhanced experiences provided by way of ar foster connections between users as well as librarians and the creators of ar applications themselves.
authors within the data set expect connecting with others and building relationships will make user experiences much richer in terms of how the user interacts with information and in what way information is presented to the user.

the google blogs search was not limited to lis-specific blogs, as the search function is limited in definable search parameters. of the 300 blog posts in the data set, only four actually include the term library or refer to ar applications within libraries. one blog post alludes to archiving but does not explicitly mention a library setting. these blog postings code for versatility, utility, interactivity, discovery tool, an experience, a library instruction aid, promoting and enhancing the library, access, and providing services to users. while not all of these codes are included in the three dominant categories coding for the lista sources, they do reflect the utilitarian quality of ar as a provider of information at its most basic. for example, ar is in one source a versatile tool enhancing aspects of the user’s library experience, a view shared with the aforementioned lista excerpts, from the ways services are provided for the user to the level of interactivity the patron has with relevant information within the library, such as finding specific book locations or accessing information about the services the library offers.

archiving information displayed through ar applications poses a challenge and is a necessary consideration for those interested in archiving information. truly, ar has the potential to change archiving platforms and access to those platforms.
the archiving of information presented via ar is a concern for those implementing ar for a variety of purposes. presenting even library hours or wayfinding information for a user through ar also raises the question of how information will be accessed, for how long, and in what form it will exist once it is no longer needed, updated, or changed.

the coded data suggests ar is conceptualized as a new development in digital technology worth paying attention to, at least for now, but should also be approached with some caution; users or those looking to implement ar should be careful to understand the functionality and implications of using ar prior to adopting the technology. as reflected in the sources, books and physical spaces are potential areas for ar application and in some cases are already overlaid with ar and are enhancing user experiences. as a whole, ar is conceptualized as a technology with great potential to change the way users interact with information because of its versatility, mobility, and direct interaction with the user’s immediate environment.

what is the role of the user in relation to ar as it is conceptualized in lis?

the user is the foundation of ar, giving the technology its functionality or prompting action. whether conceptualized as a new direction, an information source or provider, or an enhancement, ar is essentially static if there is no user prompting the ar application to “act.” without action on the user’s part, the information stored within ar applications is inert. the goal of ar is to present information in digital form within the context of the user’s surroundings, environment, or reality.
codes describing ar as a new direction, new world, or new normal solidify the idea that ar is a new technological development poised to redefine not only the information landscape but also the ways users interact with technology and information. furthermore, references to ar as a seemingly unstoppable popular trend point to the perceived usefulness and importance of ar in the life of the user, as it is seen to raise expectations in terms of how users access information. one blog source writes, “you may have heard about augmented reality before. if you haven’t, you’ll be hearing a lot about it from now on, with the smartphone and tablet revolution now in full-swing.”69 the “revolution” surrounding smartphones and tablets alludes to the increase in their use and sales, making these devices staples in everyday life. this idea is parallel to user expectations in terms of online resources, as many users rely on information accessed through the internet.70,71,72,73 the implication is once ar applications gain more widespread use, the user will come to expect access to a wide range of information through those ar applications.

because ar is still met with skepticism, and for some is still in need of further development, the technology provides opportunities for librarians and their staff as those implementing and providing ar-based services to forge new paths in ar application, becoming users of the technology “behind-the-scenes.” the following excerpt from a lista source highlights this view:

technological advances are beginning to fundamentally change the way that library users interact with digital information, and it is therefore essential that librarians become engaged with the relevant technology and leverage their role as teachers in order to help ensure their continued relevance in the lives of clients in the twenty-first century.74

within this statement, librarians are learning and teaching about new technology. maintaining “relevance in the lives of clients” suggests as the technology grows, implementers of the technology need to understand the technology. as reflected in this excerpt, on the other end is the user (or client) of the ar application after it is created and implemented. the user has the opportunity not only to access and experience information in new ways but also to help build and contribute to the creation and streamlining of the ar application through his or her response to the application’s functionality.

the user of ar is also central to the view of ar as informational. in a simplistic way, ar is dormant and incapable of providing, delivering, searching for, superimposing, or really doing much of anything with information without user actions. in most cases with ar, users must have some type of mobile device to prompt what is often described as an ar experience. information provided via ar applications is unlike the physical format of a book, newspaper, magazine, or other object.
these  physical  objects  exist  in  libraries  and  the  like,  residing  on  shelves  and  taking  up  physical   space  regardless  of  the  location  or  presence  of  a  user.  by  contrast,  the  information  embedded   within  an  ar  application  is  only  viewable  and  unlocked  when  the  user  prompts  action  through  the   viewfinder  of  a  smartphone,  tablet,  or  other  mobile  device.  the  technology  is  described  as  a   catchall  or  endless  repository  for  interactive  information  for  the  user,  at  the  user’s  fingertips.   likewise,  the  conceptualization  of  ar  as  any  kind  of  enhancement  is  also  dependent  on  the  user,   though  this  assertion  is  implicit  in  many  of  the  coded  passages.  without  user  interaction   prompting  the  ar  application  to  overlay  information  onto  the  real  environment,  ar  is  incapable  of   enhancing  experiences,  libraries,  communication,  or  reality  itself.  enhancement,  or  improvement,   suggests  something  or  someone  is  acted  on  in  a  way  that  is  beneficial,  intensified,  or  embedded   with  a  stronger  sense  of  value.   much  like  the  informational  quality  of  ar,  without  the  use  of  a  mobile  device  handled  by  the  user,   ar  is  dormant  and  nonexistent,  unable  to  enhance  the  environment  or  other  aspect  of  physical  life.   without  a  user  within  a  specific  context,  there  is  not  much  for  ar  to  enhance  because  it  exists  as   merely  a  “marker,”  tucking  away  the  desired  information,  awaiting  a  user  to  come  along  and  point   a  mobile  device  in  its  direction.  
as blogger david meyer put it, ar “requires the active participation of the consumer—you do not by default wander around with your phone held out in front of you.”75

this implicit role of the user is buried under language suggesting ar is actually capable of acting on the user and the user’s reality, giving ar a sense of agency. for example, in terms of enhancement, while the user is the “activator” of the ar experience, ar is often described as making, creating, challenging, improving, producing, delivering, solving problems, and even bringing some object to life, thus prompting this “enhancement.” such verbs give ar an active quality, as if it is itself alive and present in creating or prompting change. this is best exemplified by the categories of ar as acting upon reality, ar as a modifier, and ar as conditioning the environment.

the perceived quality of ar as acting on reality is what often drives the idea of ar as a catalyst for change, conditioning the environment by either controlling the present environment or creating a new one. the transformative characteristic of ar reshapes and redefines the user experience precisely because of the ways ar inserts digital objects into the real-world environment. for this reason, ar is often described through invoking imagination and with words like cool, amazing, fun, and exciting, or other empty descriptors; ar truly enhances user interaction with the surrounding world with what is perceived as a “wow” factor, so much so that codes in this study even reflect legal and economic implications of the technology.
but, while ar is a catalyst for change and acts on the environment and reality, those changes and actions are only seen through the viewfinder of a mobile device ultimately in the hands of the user.

radical change theory

dresang founded rct on what she identified as the three digital-age principles of interactivity, connectivity, and access, which describe how “the digital environment has influenced some nondigitized media to take on digital principles.” essentially, dresang’s theory is an attempt to explain information resources and behaviors. within the digital-age principles, the user takes on the role of initiator. while flexible in allowing user initiation, the digital applications are inert without the user; a range of information is unavailable without user action to put the digital environment into motion. the digital environment within ar, as described by sources in the data set, is an overlay of the digital onto the real world. rct suggests the “digital environment extends far beyond the digital resources themselves.” the extension beyond the digital resources is evident in ar as it combines the real and physical with the virtual.76

based on these principles and the idea that the digital extends beyond the resources themselves, the conceptualization of ar reflects the characteristics of rct in explaining information behavior and representations in the ar “environment.” interactivity, connectivity, and access are essential parts of ar. when each principle is examined in conjunction with the coded data, a picture appears of ar as an exemplar of rct.

interactivity and connectivity emerged as coded assertions within the data.
these codes fall under the category of ar as an enhancement. as stated, ar is expected by many of the voices within my data set to influence or enhance almost every aspect of our lives, and part of this enhancement is due to the properties of interactivity and connectivity exhibited by ar.

as defined by dresang, interactivity refers to dynamic, user-controlled, nonlinear, nonsequential information behavior and representation.77 speaking directly to the idea of user-controlled, nonlinear, and nonsequential information behavior and representation is the sense of agency that ar is expected to give users in terms of how they view and interact with the digital overlays of information.

the control the user has on the ar experience and the information with which he or she comes in contact stems from the functionality of ar often relying on tracking a user’s location. presenting a user with information based on location is not only user-controlled but nonlinear and nonsequential, since the application has no predetermined set of boundaries the way a physical map might have. people do not typically travel through the day in a linear fashion with a predetermined sequence of action. as one blogger notes, ar is “helping to erase that line between your real life and how you interact with the web.”78

ar is further tied to the principle of interactivity as defined within rct precisely because of its mobility and formatting, namely, because of mobile devices.
a  source  within  the  lista  data  set   includes  the  following  assertion:     the  ar  paradigm  opens  innovative  interaction  facilities  to  users:  human  natural     familiarity  with  the  physical  environment  and  physical  objects  defines  the  basic     principles  for  exchanging  data  between  the  virtual  and  the  real  world,  thus  allowing     gestures,  body  language,  movement,  gaze  and  physical  awareness  to  trigger  events  in  the     ar  space.79     gestures,  body  language,  movement,  gaze,  and  physical  awareness  are  all  unpredictable  actions.   for  these  actions  to  “trigger  events  in  the  ar  space,”  there  must  certainly  be  a  high  degree  of   nonlinear  and  nonsequential  information  behavior  and  representations  taking  place;  it  is  highly   unlikely  user  gestures  and  the  like  could  exist  on  a  linear  continuum  of  action.   further,  within  rct,  connectivity  refers  to  the  sense  of  community  or  construction  of  social   worlds  emerging  from  changing  perspectives  and  expanded  associations  in  the  world  and  in   resources.  in  terms  of  the  coded  data  in  this  study,  the  code  of  connectivity  reflects  this  idea  of   creating  community  and  social  worlds  through  the  capability  of  ar  to  connect  users  to  various   forms  of  social  media,  to  one  another,  and  to  various  resources.  users  make  connections  through   games,  location  awareness,  and  applications  allowing  for  the  sharing  of  information  from  user  to   user.  social  networking  figures  prominently  in  ar  technology.  users  can  upload  overlays  of  digital   information  captured  on  a  smartphone  to  various  social  networking  sites.  additionally,  the  ability   of  ar  to  overlay  digital  information  onto  the  real  world,  and  the  customizable  experiences  this   creates,  aids  in  connectivity.  
ar creates a virtual world wherein users can engage with one another across applications; while physically in different places, they can share experiences or even simple conversations. connectivity through ar is possible in all aspects of life as many sources in the data set allude to working, playing, and learning.

to interact and connect with one another, users must have access to the ar space. the principle of access within rct is defined as the breaking of long-standing information barriers, allowing exposure and access to a wide range of differing perspectives and opportunities. the concept of access (both in terms of what is and is not accessible) does appear in the coded data. note that access and accessibility within this study refer to the opportunity or right to use a system or service, not to access and accessibility as the terms are used within discourse pertaining to disabilities.

the use of ar in conjunction with smartphones is a basic way of interpreting access as it relates to rct and ar. the mobility of smartphones and tablets allows users to access ar across a variety of locations, and each user’s smartphone or tablet is uniquely tailored to the user’s interests and desires through differing collections of applications and software. smartphones and tablets are comparatively low-cost options for accessing and sharing access to the internet and ar applications.
their use is on the rise because they are often much cheaper to obtain than computers.80,81,82 the fusion of ar with mobile devices suggests an opportunity for accessing information in real time in any place through these technologies working in tandem with one another. by its very nature, ar offers access to “differing perspectives and opportunities” as it presents the user with information in atypical formats in places and spaces once static, or lacking in digital overlays.

ar is not bounded by physical location; rather, it depends on and varies with your physical location. the idea that users can access ar wherever there is an internet connection means the only real barrier to accessing ar is the same barrier existing regarding a web connection for the user, something at least one source within the lista data set alludes to in terms of the digital divide, cautioning that ar should be used in conjunction with traditional formats of information instead of in lieu of them. for example, libraries should not replace traditional signage with information only embedded within ar applications. however, as more and more organizations, institutions, and businesses use ar and provide users with access rather than relying on the user to conjure his or her own internet connection, more barriers to ar will fall.

however, it is important to note that not all ar applications actually do require internet access. mobile device applications can work off of markers and triggers in the physical environment, not web-based anchors. for example, a cookbook can include a marker next to a recipe that when scanned displays an image of the finished dish.
access to ar applications without a web connection opens a wealth of information to individuals who do not have access to the web. not all ar applications link to web-based information, and this widens the pool of users engaging with information through those ar applications. moreover, relative ease of use and low initial cost to create ar applications also allow users to become content creators, as exemplified by an ar application allowing an artist to create virtual graffiti in public spaces. users are able to display and access information otherwise invisible.

it is important to note the differing views suggesting ar does and does not require internet access to function. such a discrepancy points further to ar as a loosely defined concept. from a technical standpoint, ar does not require an internet connection, but the internet does serve as a repository for information, and many see ar as providing a bridge to that information, be it a company wanting to give consumers access to product lines or a library creating an ar application linking to web-based databases and services. in terms of rct, the principle of access does take into account digital information as able to both provide and inhibit access, as some users may not have access to the often costly hardware allowing access to digitally formatted information. what is important to highlight about the principle of access is the focus on the range of voices and the increased array of digital information available, which, in the case of this study, ar provides.
any object associated in some way with a monetary cost or technical savvy always has the potential to leave some users in the dark.

interactivity, connectivity, and access are present in the conceptualization of ar within the lis literature as well as the popular media blogs. the user is central to both rct and this conceptualization of ar as a new technology with the potential to change the way users interact with information and with one another. the goal of ar is to present information in digital form within the context of the user’s surroundings, environment, or reality. rct is a theory seeking to understand changes in information behavior and representations, and ar is an exemplar of the myriad changes, or the evolution, of the digital-information environment.

implications for theory

this study has several implications for theory within lis. these include the extension of rct or creation of new theories born out of the digital age, the understanding that a user-centric focus is essential to theory within the digital age, and the realization that ar opens new areas of research in what is considered an enhancement of information. while this study sought to test rct in relation to the conceptualization of ar, it also provides a framework for future studies. the results of this study also suggest ar opens more areas to explore within the field of lis to create new theories or to add to rct as a theoretical framework to better understand information behavior and representations in the digital-information environment.
when dresang initially formulated rct, she focused on youth information-seeking behaviors.83 few scholars have used the theory, but those who have explore education,84 literacy,85 and communication and writing86 as related to changing technologies. these previous studies, as well as this study, highlight the importance of this theory in examining the effect of the digital-information landscape on information-seeking and user understanding of and reaction to digital information. rct is viable beyond a focus on youth information seeking and is highly relevant to today’s world.

rct developed to understand how the digital age influences traditional and new media. ar is itself often described as an environment, a digital environment, which is precisely the focus of rct. if ar is in fact a “new normal,” as some describe, our information landscape is moving in a direction where interactivity, connectivity, access, and the role of the user are central to any discussion of how information is organized, distributed, formatted, and presented.

researchers can begin by adapting traditional lis theories to the digital age. for example, wilson revised his oft-cited original general model of information-seeking behavior in an attempt to understand the totality of information behavior by linking theory to action and understanding what prompts and hinders the need to search for information.87 researchers can begin to reevaluate the model to determine whether it helps to explain information behavior within the ar environment, or whether aspects of the model can be further developed or revised.
wilson’s   model’s  focus  on  human  behavior  parallels  the  focus  on  human  behavior  within  the  ar   environment.   other  theories  within  lis  can  also  be  adapted  to  the  ar  environment,  such  as  erdelez’s   information  encountering  (ie).  erdelez’s  theory  focuses  on  a  “memorable  experience  of   unexpected  discovery  of  useful  or  interesting  information”  situated  within  three  elements:   characteristics  of  the  information  user,  characteristics  of  the  information  environment,  and   characteristics  of  the  encountered  information.  erdelez  further  describes  categories  of   information  users:  superencounterers,  encounterers,  occasional  encounterers,  and   nonencounterers.  while  erdelez  has  since  taken  the  web  and  the  internet  into  account  as   information  environments,  this  theory  could  further  be  remodeled  to  include  the  ar   environment.88   wilson’s  model  and  erdelez’s  theory  are  just  two  examples  of  theories  within  lis  lending   themselves  to  further  exploration  of  the  user  of  ar  within  lis.  bates’  model  of  information  search   and  retrieval,  known  as  berrypicking,  which  centers  on  the  changing  nature  of  the  search  query   through  the  search  process,  can  also  be  amended  to  include  information  search  and  retrieval   within  the  ar  environment.89  bates’  model  suggests  as  users  seek  and  find  information,  the   information  search  shifts  from  source  to  source.  the  berrypicking  model  can  also  be  updated  or   expounded  in  response  to  ar  because  of  ar’s  multidimensional  display  of  information—a   relatively  new  phenomenon  for  the  average  user—to  understand  whether  the  same  shift  in   information  queries  occurs  and  what  new  paths  to  information  users  are  taking  within  the  ar   environment.   
this study suggests a user-centric focus is essential to any theories in lis developed within or extended to the digital age. the user is vital to making ar technology functional, as demonstrated in the conceptualization of ar in this study. the personalization, individualization, and mobility of digital technology like ar suggest theories related to information behavior within this environment must account for user interaction. information is no longer contained within static formats. geotagging or geospatial awareness and social networking are prime examples of the reliance of digital technology on user interaction. without addressing the role of the user in the functionality of digital technology in any context, theoretical frameworks attempting to address information seeking and behavior in the digital age will be limited.

additionally, should ar prove to be a new direction in accessing, organizing, delivering, and obtaining information, it further opens new areas of theoretical research. since ar is considered an enhancement of the information experience, it is incumbent on researchers to determine just how “enhancement” is defined in the context of ar and how that translates to the user experience. researchers can strive to apply and understand the concept of an enhancement in relation not only to the enhancement of information but also to the experience of accessing, organizing, delivering, and obtaining information. the digital-age principles outlined by dresang in rct are just one example of how to understand the impact of the digital age on the user’s interaction with information and in what ways the digital age creates enhancements.
by itself, the study of ar technology raises more questions than it answers. the study of ar technology could lead to more diverse theoretical frameworks seeking to answer not only practical questions but also those more philosophical in nature, working toward an understanding of how the digital-information environment influences everyday life as it evolves and changes at a rapid pace.

implications for practice

this study can inform several aspects of practice. i expound on three possibilities: the clear definition of technologies like ar to create an awareness and understanding of those technologies, development of best practices, and the need for a focus on user collaboration in the design and functionality of ar and similar technologies. the implications for practice concern both the user and the provider of information services.

this study provides perspective, or a starting point, from which the field of lis can begin to analyze the use and implementation of ar technology. by taking a step back to understand the current conceptualization of ar, practitioners within lis can begin to seek consensus, identify best practices, maintain an awareness of how the technology is used, and think realistically about what factors contribute to successful implementation of the technology in a given institution. as identified in the study, ar is seen as a new direction. it is important for those within the field to understand this perspective and to go on to identify what a new direction in information gathering, organization, and seeking implies for the field as a whole and for users.
as a field, lis can begin to have a broader discussion on what exactly ar can provide and how it can benefit user services. such a discussion can help practitioners make sense of how this technology can work with traditional sources of information. ar can be integrated with the traditional rather than act as a replacement for the traditional. this broader discussion can lead to a consensus on how best to define ar as a tool and concept. within this study, it is evident ar is described in myriad ways, so it is important to reflect on those descriptions, understand what the issues are surrounding the technology, and collaboratively seek and identify best practices.

furthermore, by identifying best practices, practitioners can begin to pinpoint what applications of ar are successful within an institution and for users, and why those applications are successful for specific purposes. in doing so, practitioners can build ar applications around the needs and mission of the institution rather than simply flock to use a new technology. it is therefore critical for practitioners to think realistically about ar implementation. adopting such a technology will only be beneficial once practitioners in the library understand its full impact. experience with technology and programming, knowledge of ar functionality, versatility, and cost are important factors to consider when contemplating how an institution can benefit from ar, if at all.

similarly, publishers are using ar to supplement traditional printed books. educators are using ar books in the classroom and supplementing traditional course instruction with these books.
such  books  allow  for  3d  rendering  of  models  for  study,  such  as  planets,  molecular  structures,  and   various  other  objects.  those  within  lis  have  a  strong  connection  to  the  field  of  education,  and  ar   books  may  become  a  part  of  the  library  collection  as  they  become  more  popular  among  educators.   practitioners  in  lis  through  collaboration  with  educators  will  then  need  to  be  aware  of  these   books,  their  functionality,  and  how  to  help  users  access  the  content  lying  dormant  until  “activated”   by  a  smartphone  or  tablet.  this  raises  the  question  as  to  whether  smartphones,  tablets  or  other   devices  that  can  scan  the  environment  will  become  commonplace  in  the  library  to  provide  full   access  to  users.     user  collaboration  also  becomes  central  to  understanding  the  implications  of  ar  on  practice.  user   collaboration  in  design  is  important  because  ar  technology  is  largely  dependent  on  user  context.   as  the  data  suggests,  ar  is  considered  an  enhancement—of  the  environment,  of  information,  and   of  the  user  experience.  prior  to  implementation,  it  is  critical  to  understand  how  ar  enhances  the   user  experience  and  what  the  perception  is  among  users.  user  surveys  can  lead  to  tailored  ar   applications  for  a  given  library  or  cultural  institution  community  should  there  be  a  need  or  desire   for  ar  applications  identified  among  users.  coupled  with  the  idea  of  user  collaboration  in  design  is   also  the  need  to  reevaluate  the  physical  spaces  of  libraries  and  similar  institutions.  
because ar creates an overlay of digital information on the physical environment, it will be necessary for practitioners to identify what areas of the library or institution lend themselves to digital overlays, what types of information users are accessing through ar applications, and whether the library space is configured to allow for navigating space via ar.
practitioners can also begin to survey the role of rct in understanding user information-seeking behavior. by acknowledging this theory as an outline of our digital-information environment, practitioners can be mindful of user expectations and behaviors as they differ from traditional information representations and methods of information retrieval. as ar creates an environment or experience for the user, it is important for practitioners within the field to understand how this technology is moving forward and what effect it has on the sea change occurring in user acquisition of information. rct is a framework providing practitioners the lens through which to make sense of the sea change and predict what might be on the horizon.
limitations and future application
this study is just one step in the process of understanding how new technology is conceptualized and what effect that conceptualization has on implementation. should ar continue to grow in popularity, this study can serve as a model for future research seeking to understand concepts misinterpreted, misunderstood, or undergoing concrete development. utilizing the explicit assertion as a unit of analysis coupled with rct can aid in investigations of other digital technologies, both in terms of implementation and end use.
the data set and the time period during which it was searched are two limitations of this study. further studies can focus on a wider range of sources not limited to one database or blog search type and extend over a longer period of time. these limitations have potentially excluded other voices, perspectives, and definitions of ar, and the time element may exclude new applications or uses of ar currently being implemented.
limitations in data analysis also exist. content analysis is one of many research methods researchers can employ to explore this topic. ethnographic research and user interviews can lead to a deeper understanding of how users perceive ar and how they seek information or behave within the ar environment. such qualitative studies can provide insight into the role of the user, which is lacking in this study. moreover, this researcher’s own admitted bias against the steadfast use of digital technologies prior to in-depth understanding is what prompted the qualitative inquiry guiding the study. quantitative methods can also be used to track the popularity or perceptions of ar through closed-ended questionnaires or surveys of both users and practitioners in the field of lis. citation tracking could further reveal in what subfields of lis the conversation surrounding ar is taking place, and may also uncover whether any one researcher or group of researchers is leading the conversation.
future studies can examine and expand on the results of this study.
rather than focusing on conceptualization only, researchers can study which professional fields dominate the conversation surrounding ar and what areas of popular culture dominate the conversation or influence understanding of ar. similarly, other studies can address the specificity of each source making explicit assertions about this kind of technology. being qualitative in nature, this study is limited because it does not examine quantitative changes in the number of articles or blog posts alluding to ar over an extended period of time. such studies might explain why ar is progressing as it is, and may identify potential problems or differences in the influence of these perspectives on the use of ar.
the study of ar also widens the spectrum of user studies. augmented reality opens a whole new area of user interaction with information extending beyond the screen. with the advent of products like google glass and applications overlaying digital information at the click of a button in an endless array of contexts and environments, ar brings information-seeking further into a world of instability and unpredictability. the complex nature of individual people is now being coupled with a highly individualized complex technology.
the functionality of ar prompts the need for archival studies related to this technology. the mobile aspect of ar, the highly personalized content, and the intangible quality of the information stored within ar applications highlight the need for an examination of how such information can actually exist within an archive and be made accessible, or whether such information even should exist within an archive.
such a question for those within lis also suggests the need for a realistic perspective on technology like ar: the next step, or reaction to such technology, is often unrecognizable and unidentifiable until the concept itself is dissected and each part is interpreted and understood.
conclusion
in this study, i used content analysis to explore the conceptualization of ar technology within the field of lis. the model for this study is the work of clement and levine and their use of the explicit assertion as a unit of analysis.90 i coded and examined explicit assertions pertaining to ar in lis literature and google blogs to determine how the concept of ar is understood. analysis shows ar is most prominently conceptualized as a new direction in technology and media consumption acting on reality and as enhancing reality and interaction with information.
ar is basically a technology allowing for digital information to be superimposed on the real world. but beyond that, it is a technology changing the way users interact with information, and it has the potential to continue changing how we literally see information. the data set suggests those within lis conceptualize ar as a new development in digital technology worth paying attention to, at least for now, but one that should be approached with some caution and fully understood prior to implementation in case its popularity and growth are fleeting. as reflected in the data set, books and physical spaces are potential areas for ar application, and in some cases are already overlaid with ar and enhancing user experiences.
as a whole, ar is conceptualized as a technology with great potential to change the way users interact with information because of its versatility, mobility, and direct interaction with the user’s immediate environment.
within this conceptualization, the user is central to igniting the functionality of ar. whether conceptualized as a new direction, an information source or provider, or an enhancement, ar is essentially static if there is no user prompting the ar application to “act.” the goal of ar is to present information in digital form within the context of the user’s surroundings, environment, or reality. as a field dedicated to user services in regard to information-seeking, lis must understand the potential impact this technology has had or will have on everyday life.
rct is a theoretical framework aiding in the exploration of the potential impact of ar. born out of a desire to understand the influence of the digital age on the traditional or “analog” media with which we engage, rct is one of few theories resting entirely on the characteristics driving our digital-information environment, outlined specifically as interactivity, connectivity, and access. utilizing this theory as a lens for future research regarding digital information is a natural next step in theory exploration and development.
together, ar and rct accentuate the evolution of how we consume and display information. from storytelling to printed pages to electronic devices, our engagement with information will never be the same again.
as we move forward, it is important to continue to ask new questions, seek new explanations, and try to formulate the most appropriate answers for the contexts in which we all deal with information, be it gathering, organizing, seeking, or understanding. this study is one piece in a puzzle, and it prompts more questions than it provides answers. ar can and should be studied from every aspect of the field of lis, if it is in fact a new direction toward our new normal.
references
1. nathan crilly, “the design stance in user-system interaction,” design issues 27, no. 4 (2011): 16–29.
2. pelle ehn, “the end of the user—the computer as a thing,” in end-user development, ed. y. dittrich, m. burnett, a. morch and d. redmiles (berlin, germany: springer, 2009), 8–8.
3. daniel fallman, “the new good: exploring the potential of philosophy of technology to contribute to human-computer interaction.” paper presented at the sigchi conference on human factors in computing systems, vancouver, british columbia, may 2011.
4. bruce m. hanington, “relevant and rigorous: human-centered research and design education,” design issues 26, no. 3 (2010): 18–26.
5. manuel imaz and david benyon, designing with blends: conceptual foundations of human-computer interaction and software engineering (cambridge, ma: mit press, 2007).
6. laura manzari and jeremiah trinidad-christensen, “user-centered design of a web site for library and information science students: heuristic evaluation and usability testing,” information technology & libraries 25, no. 3 (2006): 163–69.
7. yoram moses and marcia k.
shamo, “a knowledge-based treatment of human-automation systems” (2013), http://arxiv.org/abs/1307.2191.
8. marc steen, “human-centered design as a fragile encounter,” design issues 28, no. 1 (2012): 72–80.
9. ronald t. azuma, “a survey of augmented reality,” presence: teleoperators and virtual environments 6, no. 4 (1997): 355.
10. antti aaltonen and juha lehikoinen, “exploring augmented reality visualizations,” proceedings of the avi ’06: proceedings of the working conference on advanced visual interfaces (2006): 453–56, http://portal.acm.org/citation.cfm?id=1133357.
11. peter anders, “designing mixed reality: perception, projects and practice,” technoetic arts: a journal of speculative research 6, no. 1 (2008): 19–29, http://dx.doi.org/10.1386/tear.6.1.19_1.
12. blair macintyre, jay david bolter, emmanuel moreno, and brendan hanigan, “augmented reality as a new media experience,” proceedings of the ieee and acm international symposium on augmented reality (new york, new york: 2001), 197–206, http://dx.doi.org/10.1109/isar.2001.970538.
13. aaltonen and lehikoinen, “exploring augmented reality visualizations.”
14. benjamin avery et al., “evaluation of user satisfaction and learnability for outdoor augmented reality gaming,” user interfaces 2006: proceedings of the seventh australasian user interface conference—volume 50 (darlinghurst, australia: australian computer society, 2006), 17–24.
15. oliver bimber, l.
miguel encarnacao, and dieter schmalstieg, “the virtual showcase as a new platform for augmented reality digital storytelling,” proceedings of the egve ’03: proceedings of the workshop on virtual environments (new york: acm, 2003), 87–95, http://portal.acm.org/citation.cfm?id=769964.
16. push singh, barbara barry and h. liu, “teaching machines about everyday life,” bt technology journal 22, no. 4 (2004): 211–26.
17. alan m. turing, “computing machinery and intelligence,” creative computing 6, no. 1 (1950): 44–53.
18. chih-ming chen and yen-nung tsai, “interactive augmented reality system for enhancing library instruction in elementary schools,” computers and education 59, no. 2 (2012): 638–52.
19. david chen et al., “mobile augmented reality for books on a shelf.” paper presented at 2011 ieee international conference on multimedia and expo (icme), barcelona, spain, july 2011, http://dx.doi.org/10.1109/icme.2011.6012171.
20. giovanni saggio and davide borra, “augmented reality for restoration/reconstruction of artefacts with artistic or historical value” (informally published manuscript, university of rome, italy, 2012), http://tainguyenso.vnu.edu.vn/jspui/handle/123456789/29953.
21. andre walsh, “qr codes—using mobile phones to deliver library instruction and help at the point of need,” journal of information literacy 4, no. 1 (2010): 55–63.
22. azuma, “a survey of augmented reality.”
23. telmo zarraonandia et al., “augmented lectures around the corner?” british journal of educational technology 42, no. 4 (2011): e76–e78.
24. mark billinghurst and andreas duenser, “augmented reality in the classroom,” computer 45, no. 7 (2012): 56–63.
25.
angela di serio, maria blanca ibanez, and carlos delgado kloos, “impact of an augmented reality system on students’ motivation for a visual art course,” computers and education 68 (2013): 586–96.
26. liv valmestad, “q(a)r(t) code public art project: a convergence of media and mobile technology,” art documentation: journal of the art libraries society of north america 30, no. 2 (2011): 70–73.
27. claudio kirner et al., “design of a cognitive artifact based on augmented reality to support multiple learning approaches,” proceedings of world conference on educational multimedia, hypermedia and telecommunications (denver, co: june 2006).
28. deborah lee, “the 2011 horizon report: emerging technologies,” mississippi libraries 75, no. 1 (2012): 7–8.
29. george margetis et al., “augmented interaction with physical books in an ambient intelligence learning environment,” multimedia tools and applications 67, no. 2 (2013): 473–95, http://dx.doi.org/10.1007/s11042-011-0976-x.
30. stefaan ternier and fred de vries, “mobile augmented reality in higher education.” paper presented at the learning in context ’12 workshop, brussels, belgium, march 2012, http://hdl.handle.net/1820/4219.
31. bimber, encarnacao, and schmalstieg, “the virtual showcase as a new platform for augmented reality digital storytelling.”
32. alisa barry et al., “augmented reality in a public space: the natural history museum, london,” computer 45, no. 7 (2012): 42–47, http://doi.ieeecomputersociety.org/10.1109/mc.2012.106.
33. david marimon et al., “mobiar: tourist experiences through mobile augmented reality.” paper presented at the networked and electronic media summit, barcelona, spain, 2012.
34.
ann morrison et al., “collaborative use of mobile augmented reality with paper maps,” computers and graphics 35, no. 4 (2011): 789–99.
35. yetao huang et al., “iterative design of augmented reality device in yuanmingyuan for public use,” vrcai ’11: proceedings of the 10th international conference on virtual reality continuum and its applications in industry, hong kong, 2011 (new york: acm, 2011), http://dx.doi.org/10.1145/2087756.2087847.
36. azuma, “a survey of augmented reality.”
37. selim balcisoy and daniel thalmann, “interaction between real and virtual humans in augmented reality.” paper presented at computer animation, geneva, switzerland, 1997, http://portal.acm.org/citation.cfm?id=791510.
38. enylton machado coelho, blair macintyre, and simon j. julier, “supporting interaction in augmented reality in the presence of uncertain spatial knowledge.” paper presented at the eighteenth annual acm symposium on user interface software and technology, seattle, wa, october 23–27, 2005, http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.91.4041.
39. brian x. chen, “if you’re not seeing data, you’re not seeing,” wired (blog), august 25, 2009, http://www.wired.com/gadgetlab/2009/08/augmented-reality/.
40. isabel pedersen, “a semiotics of human actions for wearable augmented reality interfaces,” semiotica 155, no. 1–4 (2005): 183–200.
41. ibid.
42. balcisoy and thalmann, “interaction between real and virtual humans in augmented reality.”
43. ray kurzweil, “robots ’r’ us,” popular science 269, no. 3 (2006): 54–57.
44. chen, “if you’re not seeing data, you’re not seeing.”
45.
avery et al., “evaluation of user satisfaction and learnability for outdoor augmented reality gaming.”
46. gerhard reitmayr and dieter schmalstieg, “location based applications for mobile augmented reality.” paper presented at the fourth australian user interface conference on user interfaces, adelaide, australia, 2003.
47. coelho, macintyre and julier, “supporting interaction in augmented reality in the presence of uncertain spatial knowledge.”
48. carol c. kuhlthau, seeking meaning: a process approach to library and information services (westport, ct: libraries unlimited, 2004).
49. crystal fulton, “chatman’s life in the round,” in theories of information behavior, ed. karen e. fisher, sanda erdelez, and lynne mckechnie (medford, nj: information today, 2009), 79–82.
50. constance a. mellon, “library anxiety: a grounded theory and its development,” college & research libraries 47 (1986): 160–65.
51. donald o. case, looking for information: a survey of research on information seeking, needs and behavior (new york: academic press, 2008).
52. leanne bowler et al., “issues in user-centered design in lis,” library trends 59, no. 4 (2011): 721–52.
53. ibid.
54. heather l. o’brien and elaine g. toms, “what is user engagement? a conceptual framework for defining user engagement with technology,” journal of the american society for information science & technology 59, no. 6 (2008): 938–55.
55. liu shu, “engaging users: the future of academic library web sites,” college & research libraries 69, no. 1 (2008): 6–27.
56.
djilali idoughi, ahmed seffah, and christophe kolski, “adding user experience into the interactive service design loop: a persona-based approach,” behaviour & information technology 31, no. 3 (2012): 287–303.
57. natalie pang and don schauder, “the culture of information systems in knowledge-creating contexts: the role of user-centred design,” informing science 10 (january 2007): 203–35.
58. michelle l. young, annette bailey, and leslie o’brien, “designing a user-centric catalog interface: a discussion of the implementation process for virginia tech’s integrated library system,” virginia libraries 53, no. 4 (2007): 11–15.
59. shawn r. wolfe and yi zhang, “user-centric multi-criteria information retrieval” (paper presented at the 32nd international acm sigir conference on research and development in information retrieval, boston, july 19–23, 2009).
60. mary m. somerville and mary nino, “collaborative co-design: a user-centric approach for advancement of organizational learning,” performance measurement & metrics 8, no. 3 (2007): 180–88.
61. tamar sadeh, “user-centric solutions for scholarly research in the library,” liber quarterly: the journal of european research libraries 17, no. 1–4 (2007): 253–68.
62. bettina berendt and anett kralisch, “a user-centric approach to identifying best deployment strategies for language tools: the impact of content and access language on web user behaviour and attitudes,” information retrieval 12, no. 3 (2009): 380–99.
63. gail clement and melissa levine, “copyright and publication status of pre-1978 dissertations: a content analysis approach,” portal: libraries and the academy 11, no. 3 (2011): 813–29, http://dx.doi.org/10.1353/pla.2011.0032.
64. ibid.
65. eliza t. dresang, “radical change,” in theories of information behavior, ed. karen e. fisher, sanda erdelez, and lynne mckechnie (medford, nj: information today, 2009), 298–302.
66. grace m. jackson-brown, “content analysis study of librarian blogs: professional development and other uses,” first monday 18, no. 2 (2013): 2.
67. dahlia k. remler and gregg g. van ryzin, research methods in practice: strategies for description and causation (los angeles: sage, 2011).
68. kangdon lee, “augmented reality in education and training,” techtrends: linking research and practice to improve learning 56, no. 2 (2012): 13–21.
69. yfs magazine, “interview: gravity jack ceo, luke richey talks industry leadership, augmented reality and why cash isn’t king,” yfsmagazine.com (blog), december 19, 2012, http://yfsentrepreneur.com/2012/12/19/interview-gravity-jack-ceo-luke-richey-talks-industry-leadership-augmented-reality-and-why-cash-isnt-king/.
70. carol pitts diedrichs, “discovery and delivery: making it work for users,” serials librarian 56, no. 1–4 (2009): 79–93.
71. baker evans, “the ubiquity of mobile devices in universities—usage and expectations,” serials 24 (2011): s11–s16.
72. andrew j. flanagin and miriam j. metzger, “perceptions of internet information credibility,” journalism and mass communication quarterly 77, no. 3 (2000): 515–40.
73. ronald m. solorzano, “adding value at the desk: how technology and user expectations are changing reference work,” reference librarian 54, no. 2 (2013): 89–102, http://dx.doi.org/10.1080/02763877.2013.755398.
74.
robin canuel, chad crichton, and maria savova, “tablets as powerful tools for university research,” library technology reports 48, no. 8 (2012): 35–41.
75. david meyer, “telefonica bets on augmented reality with aurasma tie-in,” gigaom (blog), september 17, 2012, http://gigaom.com/2012/09/17/telefonica-bets-on-augmented-reality-with-aurasma-tie-in.
76. eliza t. dresang, “the information-seeking behavior of youth in the digital environment,” library trends 54, no. 2 (2005): 178–96.
77. ibid.
78. dave rodgerson, “experiments in augmented reality hint at its potential for retailers,” future of retail alliance (blog), october 5, 2012, http://www.joinfora.com/experiments-in-augmented-reality-hint-at-its-potential-for-retailers/.
79. wolfgang narzt et al., “augmented reality navigation systems,” universal access in the information society 4, no. 3 (2005): 177–87.
80. mito akiyoshi and hiroshi ono, “the diffusion of mobile internet in japan,” information society 24, no. 5 (2008): 292–303, http://dx.doi.org/10.1080/01972240802356067.
81. jeffrey james, “institutional and societal innovations in information technology for developing countries,” information development 28, no. 3 (2012): 183–88.
82. andromeda yelton, “dispatches from the field. bridging the digital gap,” american libraries 43, no. 1/2 (2012): 30, http://www.americanlibrariesmagazine.org/article/bridging-digital-divide-mobile-services.
83. dresang, “the information-seeking behavior of youth in the digital environment.”
84. marta j.
abele, “responses to radical change: children’s books by preservice teachers” (doctoral dissertation, capella university, minneapolis, minnesota, 2003).
85. jacqueline n. glasgow, “radical change in young adult literature informs the multigenre paper,” the english journal 92, no. 2 (2002): 41–51, http://www.jstor.org/stable/822225.
86. sylvia pantaleo, “readers and writers as intertexts: exploring the intertextualities in student writing,” australian journal of language and literacy 29, no. 2 (2006): 163–81, http://search.informit.com.au/documentsummary;dn=157093987891049;res=ielhss.
87. tom d. wilson, “information behavior: an interdisciplinary perspective,” information processing & management 33, no. 4 (1997): 551–72.
88. sanda erdelez, “information encountering: it’s more than just bumping into information,” bulletin of the american society for information science & technology 25, no. 3 (1999): 25–29, http://www.asis.org/bulletin/feb-99/erdelez.html.
89. marcia j. bates, “the design of browsing and berrypicking techniques for the online search interface,” online review 13, no. 5 (1989): 407–24.
90. clement and levine, “copyright and publication status of pre-1978 dissertations.”
perceived quality of whatsapp reference service: a quantitative study from user perspectives
yan guo, apple hiu ching lam, dickson k. w. chiu, and kevin k. w. ho
information technology and libraries | september 2022
https://doi.org/10.6017/ital.v41i3.14325
yan guo (kguo@connect.hku.hk) is msc(lim) graduate, faculty of education, the university of hong kong. apple hiu ching lam (applelamwork@gmail.com) is edd candidate/msc(lim) graduate, faculty of education, the university of hong kong. dickson k. w.
chiu (dicksonchiu@ieee.org) is lecturer, faculty of education, the university of hong kong. kevin k. w. ho (ho.kevin.ge@u.tsukuba.ac.jp) is professor of management information systems, graduate school of business sciences, humanities and social sciences, university of tsukuba. © 2022.
abstract
academic libraries are experiencing significant changes and making efforts to deliver their services in the digital environment. libraries are transforming from places for reading into extensions of the classroom and learning spaces. due to the globalized digital environment and intense competition, libraries are trying to improve their service quality through various evaluations. as reference service is crucial to users, this study explores user satisfaction with the reference service offered through whatsapp, a social media instant messenger, at a major university in hong kong and discusses the correlation between the satisfaction rating and three variables. suggestions and recommendations are offered for future improvements. the study also sheds light on the usage of reference services through instant messaging in other academic libraries.
introduction
due to the advancement of new technologies and mobile devices, library resources and services are more accessible.1 apart from independent searching strategies, interactions between librarians and users, referred to as reference services, have become an effective method of solving user problems.2 according to the reference and user services association (rusa), reference services include creating, managing, and assessing reference transactions and activities.
3 with increasing user needs, reference services have become an essential part of library services and commonplace in academic libraries.4 further, technology development requires reference librarians to possess updated skills, willingness, and interest to deal with user inquiries.5 recently, due to the covid-19 pandemic, users have increasingly utilized virtual reference services instead of face-to-face modes to obtain the information required for their academic studies.6 some libraries have employed different virtual tools, for example, instant messaging services, to provide reference services to their users. one of the most popular global instant messaging services is whatsapp.7 according to the digital 2022: hong kong report, the most-used social media platform among internet users aged 16 to 64 in hong kong was whatsapp (84.3%), followed by facebook (83.7%), instagram (65.6%), wechat (55.2%), and facebook messenger (50.4%).8 the popularity of whatsapp in hong kong accordingly increases whatsapp reference service usage in academic libraries. the qualitative study by tsang and chiu identified whatsapp as one of the most commonly used and relatively preferred reference services of an academic library in hong kong.9
many studies have investigated reference service quality with measurements such as satisfaction rating, perceived gaps in reference services’ ability to meet user expectations, and other information-seeking behaviors.
however, few studies focus on instant messaging reference services compared to traditional services, except for a notable recent qualitative study by tsang and chiu.10 therefore, this research aims to quantitatively evaluate user satisfaction with the use of whatsapp for reference service at a major university library in hong kong through three dimensions: affect of service (as), information control (ic), and library as place (lp), which are detailed in the research purpose section. the results can help librarians better understand the effectiveness of applying whatsapp and other instant messaging to improve reference service quality. by expanding technology-based services, libraries can become more competitive in the digital era and provide better user experiences in the future. thus, this study addresses the following four research questions (rqs):
rq1. what is the users’ awareness level of the library’s reference services?
rq2. how do users evaluate the whatsapp application in the library?
rq3. what are the relationships between user satisfaction and the three service dimensions as, ic, and lp?
rq4. how can academic libraries increase user satisfaction with whatsapp reference services?
literature review
in the late 1800s, library leaders started to pay attention to the importance of reference services.11 since then, reference services have also caught the public’s attention and were introduced into public libraries.
reference services can assist readers in solving problems through various interactions between users and staff.12 currently, the library is not merely a repository of collections, and librarians can provide more help, particularly fulfilling users’ various information needs rather than just offering directions or physical locations of books.13 nowadays, librarians strive to solve various user problems and inquiries with their professional skills and information literacy.14 at first, in-person and telephone were the most common ways for reference services. however, with the increasing number of remote users and ubiquitous internet connectivity, face-to-face reference and asynchronous emails can no longer satisfy users’ needs.15 thus, libraries increasingly explore collaborative software and mobile applications such as instant messaging, online chatting, video sessions, and other methods to serve users, referred to collectively as virtual reference.16 virtual reference occurs electronically in real time, where users may interact with librarians through smartphones, computers, or other devices without physical presence.17 as libraries began to use the internet, several case studies investigated instant messaging reference services in academic libraries.18 at the same time, librarians and researchers began to investigate reference service quality with designated measurements. 
various indicators can help measure user satisfaction levels, such as accuracy, communication skills, user satisfaction, instruction, and user’s willingness to return.19 although these indicators were originally developed for physical reference services, most principles and methods can still be applied to virtual reference services, as instant messaging has become one of the most frequently used channels.20 some studies have confirmed the effectiveness of instant messaging for reference services relative to more traditional means, such as phone and email.21

as one of the most popular social media chatting software platforms, whatsapp has become a powerful tool for connecting librarians and users. a primary difference from traditional phone-based reference services is that whatsapp can share texts, images, documents, and videos (and their links) at a low cost.22 whatsapp can run as a mobile application on smartphones or, as whatsapp web, as a web page in desktop browsers. whatsapp web users are required to use their mobile phone to scan the qr (quick response) code on the computer browser (https://web.whatsapp.com/) for authentication before use. as the functionality of whatsapp web is similar to whatsapp, users can easily adapt to whatsapp web on desktop computers.
as of march 2020, the number of active whatsapp users has globally increased to approximately 2 billion and is still growing steadily.23 whatsapp, by april 2021, had become the most popular messaging application based on the number of monthly active users, compared with other popular messaging applications.24 studies also indicate that students may use whatsapp for two to three hours daily.25 although the essential chat functions of whatsapp are similar to other instant messaging services such as facebook messenger and wechat, whatsapp and wechat have been more popular for hongkongers and mainland chinese, respectively.26 surprisingly, howard et al. studied students’ habits of using social media platforms at purdue university in the us and revealed that respondents rarely use whatsapp in their daily lives, indicating that residents in different regions may have different social media platform preferences.27 recently, odu and omini have demonstrated a significant relationship between using whatsapp and library service satisfaction from the student’s perspective.28 some studies also stressed that many students welcome whatsapp as an effective reference service platform.29 however, friday et al. 
pointed out that some librarians might not be trained and equipped with proper and up-to-date skills in using social media tools to provide library services effectively and efficiently.30 further, aina, babalola, and oduwole argued that hurdles such as instructional policies, lack of time, and heavy workloads might cause difficulties in using these tools to provide library services.31 as for evaluation, mohd azmi, noorhidawati, and yanti idaya aspura applied rusa’s guidelines for the behavioral performance of reference and information service providers to evaluate the perceived importance versus actual practices of whatsapp reference service from librarians’ perspectives.32 they suggested that although librarians expressed their awareness of the importance of rusa guidelines, they would not fully comply with the guidelines because of time and other constraints. yet, few studies deal with the satisfaction with whatsapp reference services of academic libraries from user perspectives.

research purpose

regarding whatsapp and library services, a few studies focused on finding the relationship between whatsapp and service usage, user attitudes toward whatsapp applications, and the difficulties of using whatsapp, particularly for reference services.33 though mohd azmi, noorhidawati, and yanti idaya aspura evaluated librarians’ behavioral performance in providing whatsapp reference service, it was from librarians’ perspectives rather than users’.34 thus, we studied user satisfaction with the whatsapp reference service offered by academic libraries by adapting tsui’s instrument to develop the survey framework.35 tsui employed three key indicators, i.e., affect of service (as), information control (ic), and library as place (lp), in libqual+, an online library assessment tool developed by the association of research libraries
(arl).36 as measures “empathy, responsiveness, assurance, and reliability of library employees,”37 such as librarian-user interactions concerning the librarians’ knowledge in inquiry responses and the level of reference service provided.38 ic measures “how users want to interact with the modern library and include scope, timeliness and convenience, ease of navigation, modern equipment, and self-reliance,”39 such as library resource availability and accessibility from user perspectives.40 lp measures “the usefulness of space, the symbolic value of the library, and the library as a refuge for work or study,”41 such as the availability of adequate facilities and an appropriate physical environment from user perspectives.42 the application of these indicators will be further discussed in the methodology section.

methodology

this study chose a major academic library in hong kong with a long track record of technological advancement. a reference desk is situated near the library’s main entrance for traditional services. the library’s web page shows a clear ask a librarian column with diversified methods for reference services, including email, telephone, whatsapp, and other electronic means of accessing the reference services. notably, the whatsapp reference service is operated in the same way as other channels and is available monday to friday from 9 am to 5 pm (except on public holidays). the library promises an inquiry response in no more than four hours. the mission and vision of such reference services are to
• help locate information resources;
• assist in searching strategies and research;
• deal with queries about the use of facilities and services; and
• equip users with information literacy.

the library has developed a whatsapp business account with a mobile phone in the whatsapp business application and uses the whatsapp web function to handle user inquiries on desktop computers.
one to two library assistants support the whatsapp reference service on shift seamlessly from 9 am to 5 pm, including the lunch break, and a professional librarian reviews whatsapp inquiry records weekly.

this study used a survey administered through google forms, both online and offline, to collect user perceptions about the library’s whatsapp reference service. online methods for collecting survey responses included email, facebook, wechat, and whatsapp, and offline methods included site delivery at the library entrance and posting the survey qr code on public notice boards. no incentives were provided for the voluntary data collection. respondents were mainly undergraduate and postgraduate students, representing a general user view of the whatsapp reference service. microsoft excel and ibm spss statistics were used to analyze the data, including bivariate correlation for investigating the relationships between whatsapp satisfaction and the three variables based on tsui’s study, as, ic, and lp.43 among these indicators, as focuses on whether whatsapp is easy to use and supportive; ic evaluates the response speed, accuracy, and accessibility of the whatsapp reference service; and lp measures the staff attitude and whether whatsapp helps encourage librarian-user communication. the survey also includes demographic information, reference services usage, and user satisfaction with the whatsapp reference service. participants were asked to evaluate the quality of whatsapp reference service from these three dimensions through five-point likert scales in the satisfaction rating part. finally, the survey asked for the overall satisfaction and other useful comments about the reference service.
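
the bivariate correlation step described above can be sketched as follows. this is only an illustrative sketch, not the authors’ spss procedure: the function names `construct_score` and `pearson_r` and all response values are hypothetical, and the only assumptions taken from the text are that each construct is scored as the average of its five-point likert items and then correlated with overall satisfaction.

```python
from math import sqrt
from statistics import mean


def construct_score(item_ratings):
    """Average one respondent's 1-5 Likert ratings for a single construct."""
    return mean(item_ratings)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: five respondents, three IC items each (not survey data).
ic_items = [[4, 4, 3], [2, 3, 2], [5, 4, 5], [3, 3, 3], [1, 2, 2]]
satisfaction = [4, 2, 5, 3, 1]  # hypothetical overall satisfaction ratings

ic_scores = [construct_score(row) for row in ic_items]
r = pearson_r(ic_scores, satisfaction)
print(f"correlation between IC score and satisfaction: {r:.3f}")
```

a coefficient near +1 or −1 indicates a strong linear relationship between the construct score and satisfaction, while a value near 0 indicates little linear association; a significance test would still be needed before drawing conclusions.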
data analysis

demographic information

as the main analysis of this study is regression analysis, a check on the minimum number of participants needed for analysis was performed. as explained later in this paper, the regression involved six predictors of satisfaction. assuming a medium effect size and a statistical power of 0.8, an online a-priori sample size calculator for multiple regression (https://www.danielsoper.com/statcalc/calculator.aspx?id=1) gave a minimum sample size of 97. the data collection yielded 131 completed responses, of which 66% came from master’s students and the rest from undergraduates. respondents had diversified academic backgrounds, including education (26.0%), science (14.5%), business and economics (13.0%), engineering (12.2%), liberal arts (10.7%), social science (9.9%), architecture (9.2%), and legal studies (4.6%). for the time spent on instant messaging such as whatsapp and wechat, 39% spent more than three to five hours every day, while one-fifth of them would spend one to two hours. 22% of respondents spend five hours or above, and only a small portion of them (19%) would spend less than an hour. in summary, most respondents would spend at least one hour on instant messaging daily.

usage of reference service

table 1 summarizes respondents’ usage of reference services with a five-point likert scale (1 = never; 5 = always). as shown, walk-in and email are the most common methods to use the reference service, while whatsapp is the least frequent. when it comes to the purposes of using reference services (see table 2), databases and e-resources and identifying information sources are the two most common purposes for respondents, followed by service and facility and research assistance.

table 1. usage frequency of reference service through different methods (n = 131)
methods | walk-in | email | phone | whatsapp
mean score | 3.23 | 3.24 | 2.53 | 2.32
note: 1 = never; 5 = always

table 2. purposes of using reference service (n = 131)
purposes | mean score
service and facility | 3.10
database and e-resources | 3.36
identify information sources | 3.28
research assistance (individual/group) | 3.08
other | 2.31
note: 1 = never; 5 = always

when asked about their preferred way to use reference services, more than half of the respondents said they would use email (67.9%), followed by walk-in (59.5%), whatsapp (23.7%), and phone (12.2%). most respondents considered walk-in, in-person reference, the traditional method, the most effective reference method because users could receive instant help from librarians, especially for urgent and complex problems. however, results indicated that despite a gap in users seeking reference services help by instant messaging and email, this gap is smaller than that for face-to-face and telephone.44

the user ratings for reference services through different methods were compared using anova. the result is significant, f(3, 520) = 30.52, p < 0.01. walk-in (m = 4.18, sd = 0.71) is the most satisfying method. post-hoc tests showed that the ratings of email (m = 3.78, sd = 0.60) and phone (m = 3.73, sd = 0.83) were statistically indistinguishable from each other and lower than walk-in, while both were rated higher than whatsapp (m = 3.27, sd = 0.90).

apart from these ratings, respondents were also asked to leave a few comments and suggestions for the reference service. notably, most respondents showed a positive attitude to the whatsapp reference service while suggesting some improvements.
for example, one respondent requested “longer office hours for whatsapp.” at the time of this research, the whatsapp reference service hours were monday to friday from 9 am to 5 pm, while in-person reference service hours were monday to friday, 8:30 am to 7 pm, and saturdays from 8:30 am to 7 pm. therefore, the library should extend the whatsapp service hours to provide more flexible service time, aligning with the findings of tsang and chiu.45 further, a respondent suggested that librarians should “respond to email more efficiently.” for this issue, whatsapp could serve to expand user access to reference services instead of emails.

users’ satisfaction with whatsapp reference service

prior research reported that as, ic, and lp influenced user satisfaction. this study adapted the instrument developed in tsui’s prior research (see appendix) to collect data to investigate these relationships.46 as the cronbach’s alpha values for all three constructs are higher than 0.7, it is acceptable to use the average value of these items for our data analysis.47 table 3 shows the analysis of whether respondents’ academic level would affect as, ic, lp, and overall user satisfaction with whatsapp using anova. results indicated that academic level affected as but not the other factors and satisfaction. further, multiple regression results indicated that ic and lp affected whatsapp satisfaction. table 4 tabulates our findings.

table 3. anova results
construct | overall | undergraduate (n = 45) | master’s student (n = 86) | f-value
affect of service (as) | 3.380 | 3.200 | 3.474 | 5.712 *
information control (ic) | 3.202 | 3.162 | 3.223 | 0.286
library as place (lp) | 3.645 | 3.550 | 3.695 | 0.273
whatsapp satisfaction (sat) | 3.275 | 3.200 | 3.314 | 0.495
notes: *** p < 0.001; ** p < 0.01; * p < 0.05.
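
anova is used twice in this part of the analysis. as a rough plausibility check, a one-way anova f statistic can be recomputed from summary statistics alone. the sketch below does so for the four method ratings reported earlier (walk-in, email, phone, whatsapp), assuming that all 131 respondents rated every method and that the four sets of ratings are treated as independent groups, which is what the reported degrees of freedom (3, 520) suggest. the function name `anova_from_summary` is hypothetical, and because the published means and sds are rounded, the result can only approximate the reported f = 30.52.

```python
def anova_from_summary(groups):
    """One-way ANOVA from per-group (n, mean, sd) triples.

    Returns (F, df_between, df_within). Assumes independent groups and
    sample standard deviations (n - 1 denominator).
    """
    k = len(groups)
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * m for n, m, _ in groups) / total_n

    # Between-group and within-group sums of squares.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)

    df_between = k - 1
    df_within = total_n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# (n, mean, sd) for walk-in, email, phone, and WhatsApp as reported in the text;
# n = 131 per method is an assumption based on the reported degrees of freedom.
method_ratings = [
    (131, 4.18, 0.71),
    (131, 3.78, 0.60),
    (131, 3.73, 0.83),
    (131, 3.27, 0.90),
]

f_value, df_b, df_w = anova_from_summary(method_ratings)
print(f"F({df_b}, {df_w}) = {f_value:.2f}")
```

with the rounded inputs, the recomputed f lands within about one unit of the reported value. the same function cannot be applied to table 3, because the table gives group means but not standard deviations.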
as shown in table 4, ic and lp have significant positive impacts on users’ satisfaction with using whatsapp for reference services. however, considering the academic level (undergraduate = 0; master’s student = 1) in our regression model (i.e., interaction effect), the following effects are notable. first, as does not affect user satisfaction with using whatsapp for reference services for undergraduates but positively affects master’s respondents. second, ic positively impacts satisfaction for both undergraduate and master’s respondents, of which the difference between these two respondent groups is statistically insignificant. lastly, even though lp also positively impacts satisfaction for both groups, the effect is higher for undergraduates than for master’s respondents. the different learning needs of the groups may explain such differences, as shown in table 5.48 the group-specific values in table 5 appear to follow from table 4 when non-significant coefficients are treated as zero; for example, the lp effect for master’s respondents is 0.9366 − 0.5721 = 0.3645.

table 4. regression analysis
independent variables | main effect: coefficient | main effect: t-value | interaction effect: coefficient | interaction effect: t-value
affect of service (as) | 0.0933 | 0.7556 | −0.3503 | −1.744
information control (ic) | 0.6718 | 6.736 *** | 0.7624 | 4.178 ***
library as place (lp) | 0.6092 | 6.143 *** | 0.9366 | 6.335 ***
as × academic | | | 0.7817 | 3.076 ***
ic × academic | | | −0.1741 | −0.8701
lp × academic | | | −0.5721 | −0.3178 ***
intercept | −1.412 | −3.748 *** | −1.423 | −3.876 ***
r2 (adj.) | 0.5444 | | 0.5742 |
f-value | 52.78 *** | | 30.21 *** |
notes: *** p < 0.001; ** p < 0.01; * p < 0.05

table 5. impacts of as, ic, and lp on different student groups
construct | undergraduate | master’s student
affect of service (as) | not significant | 0.7817
information control (ic) | 0.7624 | 0.7624
library as place (lp) | 0.9366 | 0.3645

discussions and recommendations

subdivision of the whatsapp reference service into specialist subjects

our findings indicated that as had the strongest correlation with whatsapp satisfaction for master’s students, while the as part had the lowest satisfaction with undergraduate students. this reflected that respondents who are undergraduates could not receive adequate supportive help from librarians through whatsapp, aligning with the findings of tsang and chiu.49 a possible reason is that the number of whatsapp reference librarians with specialist subject knowledge was small. yet, one general reference whatsapp number on the library website is insufficient compared to other methods, as the library website shows seven telephone numbers of branch libraries to serve different patrons. through different numbers, users could easily find the required experts accordingly. the whatsapp reference service had only a single number probably because users mostly ask basic and general questions.
such a process would cost professionals too much time and energy to deal with.50 to further enhance the service, it is necessary to reform the operational policies and add a few more whatsapp accounts, for instance, creating a whatsapp business account for each branch library (a total of six branch libraries) or for each school serving users in different disciplines to connect to corresponding subject librarians via specialized whatsapp accounts.51 this approach can separate users of the general inquiry number, which deals with quick and straightforward information inquiries, from those with domain-specific inquiries.52 further, the general inquiry whatsapp service should be extended to cater to various students’ needs, possibly up to a 24-hour service.53 to meet the resulting staffing needs, student helpers, interns, and volunteers can serve on shifts on saturdays, sundays, and even public holidays.54 more users may seek troubleshooting services during the holidays, especially long holidays, and recently, under the covid-19 pandemic and its associated isolation requirements.55

more staff training

due to the whatsapp reference service features, the skills required for online and face-to-face conversations are different, e.g., it is difficult to convey emotions like facial expressions and body language online.56 further, due to the limited interactions between librarians and users and the lack of visual and audio cues through the whatsapp reference service, librarians can hardly identify user needs in a short time.57 therefore, librarians may need further professional training for such scenarios, particularly in answering questions quickly and precisely in real-time chat, because users tend to be more impatient during a chat engagement.58 in addition, unlike face-to-face inquiry, some complex issues often cannot be adequately explained through whatsapp.
therefore, librarians should make appropriate referrals if some problems cannot be solved through whatsapp. reference services through video-based platforms such as zoom can also help.59 regular training could offer librarians updated information on using the tool and refresh the skills used in responding to the whatsapp reference service among various staff members, i.e., librarians, library assistants, student helpers, volunteers, and interns. if the library staff does not acquire well-developed skills and competencies in texting, comprehension, and communication specialized in instant messaging services, they cannot efficiently and effectively understand the inquiries and search, locate, explain, and convey the appropriate information resources to users on the asynchronous whatsapp reference service in a shorter response time.60

establishment of whatsapp reference service guidelines

whether the whatsapp reference service increases the capability to deal with user problems still depends on consistently favorable reference behaviors.61 mohd azmi, noorhidawati, and yanti idaya aspura pointed out that users need timely responses and friendly online contacts from librarians, though librarians might not completely follow the rusa guidelines due to human resources constraints.62 therefore, libraries should establish easy-to-follow, concise, and whatsapp-tailored guidelines for appropriately conducting whatsapp reference services, especially because such skills differ from face-to-face services, as discussed above.63 the studied library has developed a simple series of internal work procedures for using whatsapp web, including how to open and close whatsapp web and how on-duty staff should handle inquiries.
to enhance and standardize the whatsapp reference service, the library should develop guidelines by offering some polite, brief, and interactive text templates for answering inquiries, such as “i am (name) (job title). what may i help you with, dear user?”, as well as answers to frequently asked questions. progress reporting messages should be sent to users to keep them informed of the search status.64 the relationship between librarians and users can thus be enhanced by creating a consistent, friendly, and warm atmosphere, using informal conversation and emojis, and incorporating whatsapp’s features to engage users and establish continued service use.65 further, such guidelines can save time and energy in training new staff and provide the basis for the future development of artificial intelligence aids such as chatbots.66

promotion for the whatsapp service

most respondents conveyed a positive attitude, considering whatsapp a convenient way to access the reference service, which is in line with the studies by ansari and tripathi and sudharani and nagaraju.67 however, it is still not the most frequently used and preferred method in the library. one reason is that users still need help with physical materials and ask for the answers face-to-face.68 however, this is not the only reason, as many studies showed that library promotional efforts are often weak.69 in addition to the traditional promotional materials such as leaflets and contact cards with whatsapp numbers, the library can also broaden the promotion through mass emails and social media such as facebook, instagram, linkedin, twitter, and signal.70 in the information era, social media is an effective and efficient channel for reaching the target audience and disseminating information in an accessible way.71 the library should reform the webpage of the whatsapp reference service to further attract users.
for instance, displaying some sample whatsapp chat screenshots of librarian-user interactions on the library’s website can increase the attractiveness of the service as images can graphically represent the application’s ease of use for library reference help.72

conclusion

the study has investigated user satisfaction with the whatsapp reference service in a major academic library in hong kong and explored the correlations between whatsapp satisfaction and three quality dimensions: as, ic, and lp. the survey revealed various opinions toward using reference services and preferred methods, including inconsistencies between users’ frequently used methods and preferred methods. moreover, by analyzing the correlation between whatsapp satisfaction and the three variables, results showed that users emphasized the whatsapp reference service. the results have led to some practical suggestions for improvement: subdividing the whatsapp reference service with subject specialists, providing more staff training, establishing staff guidelines and policies, and increasing whatsapp service promotion.

limitations and future research

there are still some limitations to the study. firstly, the survey only collected a limited number of complete responses, which may not represent all users’ views. additionally, the perceptions of both library staff and users should be considered. secondly, the research evaluation design with three dimensions can be extended to measure other quality and effects. thirdly, as whatsapp is just one application among various emerging instant-messaging tools, further studies should cover other instant messaging platforms for similar and different purposes.
for instance, as the studied university comprises a significant student population from mainland china, wechat could be investigated for its possibility and effectiveness as a whatsapp alternative for providing reference services and promotion to chinese students.73

appendix: key survey items

item | mean | sd

affect of service (ease of use, supportive) (as) (cronbach’s alpha α = 0.799)
as1. there is a clear introduction teaching library users about how to use the whatsapp function. | 3.18 | 0.87
as2. the directories are easy to understand. | 3.16 | 0.85
as3. reference service through whatsapp is easy to use. | 3.69 | 0.94
as4. i can receive instant help from a librarian through whatsapp. | 3.55 | 0.78
as5. i can request service anytime, anywhere. | 3.31 | 0.93

information control (response speed, accuracy, accessible) (ic) (cronbach’s alpha α = 0.707)
ic1. response of inquiry is reliable. | 3.66 | 0.74
ic2. whatsapp application makes reference services easily accessible for users. | 3.25 | 0.95
ic3. response of inquiry is accurate. | 3.87 | 0.66
ic4. using whatsapp to gain access to reference services can meet my needs. | 3.73 | 0.95
ic5. the quality of response obtained through whatsapp is inferior to walk-in (r). | 2.60 | 1.26
ic6. the quality of response obtained through whatsapp is inferior to email (r). | 2.66 | 1.17
ic7. the quality of response obtained through whatsapp is inferior to phone (r). | 2.63 | 0.98

library as place (staff attitude, encourage communication) (lp) (cronbach’s alpha α = 0.843)
lp1. reference staff is friendly or pleasant. | 4.08 | 0.76
lp2. using whatsapp to contact a librarian is convenient. | 3.90 | 0.78
lp3. whatsapp application in reference service increases my productivity in using online library services. | 3.53 | 0.99
lp4. it provides an efficient channel to communicate with librarians. | 3.87 | 0.89
lp5. i request more reference services after i know about the whatsapp channel. | 3.28 | 0.86

note: ic5, ic6, and ic7 are reverse-coded items.

endnotes

1 karen hiu tung yip, patrick lo, kevin k. w. ho, and dickson k. w. chiu, “adoption of mobile library apps as learning tools in higher education: a tale between hong kong and japan,” online information review 45, no. 2 (2020): 389–405, https://doi.org/10.1108/oir-07-2020-0287; ken yiu kwan fan, patrick lo, kevin k. w. ho, stuart so, dickson k. w. chiu, and eddie h. t. ko, “exploring the mobile learning needs amongst performing arts students,” information discovery and delivery 48, no. 2 (2020): 103–12, https://doi.org/10.1108/idd-12-2019-0085; vanessa hiu ying chan, dickson k. w. chiu, and kevin k. w. ho, “mediating effects on the relationship between perceived service quality and public library app loyalty during the covid-19 era,” journal of retailing and consumer services 67 (2022): 102960, https://doi.org/10.1016/j.jretconser.2022.102960.
2 samuel s. green, “personal relations between librarians and readers,” library journal 1, no. 2 (1876): 74–81.
3 “measuring and assessing reference services and resources: a guide,” reference and user services association, accessed july 25, 2021, http://www.ala.org/rusa/sections/rss/rsssection/rsscomm/evaluationofref/measrefguide.
4 angel lok yi tsang and dickson k. w. chiu, “effectiveness of virtual reference services in academic libraries: a qualitative study based on the 5e learning model,” the journal of academic librarianship 48, no. 4 (2022): 102533; kun zhang and peixin lu, “what are the key indicators for evaluating the service satisfaction of wechat official accounts in chinese academic libraries?,” library hi tech (2022), ahead of print, https://doi.org/10.1108/lht-07-2021-0218; yifei zhang, patrick lo, stuart so, and dickson k. w.
chiu, “relating library user education to business students’ information needs and learning practices: a comparative study,” reference services review 48, no. 4 (2020): 537–58, https://doi.org/10.1108/rsr-12-2019-0084.
5 andrew chean yang yew, dickson k. w. chiu, yuriko nakamura, and king kwan li, “a quantitative review of lis programs accredited by ala and cilip under contemporary technology advancement,” library hi tech (2022), ahead of print, https://doi.org/10.1108/lht-12-2021-0442; james friday, oluchi chidozie, and lauretta ngozi chukwuma, “social media and library services: a case of covid-19 pandemic era,” international journal of research and review 7, no. 10 (2020): 230–37, https://www.ijrrjournal.com/ijrr_vol.7_issue.10_oct2020/abstract_ijrr0031.html.
6 ruth sara connell, lisa c. wallis, and david comeaux, “the impact of covid-19 on the use of academic library resources,” information technology and libraries 40, no. 2 (2021): 1–20, https://doi.org/10.6017/ital.v40i2.12629.
7 “digital 2022: global overview report,” we are social and hootsuite, accessed april 30, 2022, https://wearesocial.com/hk/blog/2022/01/digital-2022/; tsang and chiu, “effectiveness of virtual reference services.”
8 “digital 2022—hong kong,” we are social and hootsuite, accessed april 30, 2022, https://wearesocial.com/hk/blog/2022/01/digital-2022/.
9 tsang and chiu, “effectiveness of virtual reference services.”
10 tsang and chiu, “effectiveness of virtual reference services.”
11 green, “personal relations.”
12 green, “personal relations.”
13 p. sankar and e. s. kavitha, “ask librarian to whatsapp librarian: reengineering of traditional library services,” international journal of information sources and services 3, no. 2 (march–april 2016): 35–40, https://www.researchgate.net/profile/drkavitha-es/publication/304466788_ask_librarian_to_whatsapp_librarian_reengineering_of_traditional_library_services/links/5770958c08ae10de639c0ca3/ask-librarian-to-whatsapp-librarian-reengineering-of-traditional-library-services.pdf; spear wing sze wong and dickson k. w. chiu, “re-examining the value of remote academic library storage in the mobile digital age: a comparative study,” portal 23, no. 1 (2023), in press; tin nok leung, dickson k. w. chiu, kevin k. w. ho, and canon k. l.
luk, “user perceptions, academic library usage and social capital: a correlation analysis under covid-19 after library renovation,” library hi tech 40, no. 2 (2021): 304–22, https://doi.org/10.1108/lht-04-2021-0122. 14 syeda hina batool, ata ur rehman, and imran sulehri, “the current situation of information literacy education and curriculum design in pakistan: a discovery using delphi method,” library hi tech (2021): ahead of print, https://doi.org/10.1108/lht-02-2021-0056; yew et al., “quantitative review of lis programs.” 15 tsang and chiu, “effectiveness of virtual reference services”; zhang and lu, “what are the key indicators.” 16 james ogom odu and emmanuel ubi omini, “mobile phone applications and the utilization of library services in the university of calabar library, calabar, nigeria,” global journal of educational research 16, no. 2 (2017): 111–19, https://doi.org/10.4314/gjedr.v16i2.5. 17 “guidelines for behavioral performance of reference and information service providers,” american library association, june 2004, http://www.ala.org/template.cfm?section=home&template=/contentmanagement/contentdisplay.cfm&contentid=26937. 18 marianne foley, “instant messaging reference in an academic library: a case study,” college & research libraries 63, no.
1 (2002): 36–45, https://doi.org/10.5860/crl.63.1.36; tsang and chiu, “effectiveness of virtual reference services.” 19 chun-wai tsui, “a study on service quality gap in remote service delivery with mobile devices among academic libraries in hong kong” (master’s thesis, the university of hong kong, 2015), https://doi.org/10.5353/th_b5611574; leung et al., “user perceptions”; zhang and lu, “what are the key indicators.” 20 tsang and chiu, “effectiveness of virtual reference services.” 21 charlotte clements, “implementing instant messaging in four university libraries,” library hi tech 27, no. 3 (2009): 393–402, https://doi.org/10.1108/07378830910988522. 22 gunnan dong et al., “relationships between research supervisors and students from coursework-based master’s degrees: information usage under social media,” information discovery and delivery 49, no. 4 (2021): 319–27, https://doi.org/10.1108/idd-08-2020-0100; tsang and chiu, “effectiveness of virtual reference services.” 23 “number of monthly active whatsapp users worldwide 2013–2020,” statista research department, accessed july 26, 2021, https://www.statista.com/statistics/260819/number-of-monthly-active-whatsapp-users/. 24 “most popular global mobile messaging apps 2021,” statista research department, accessed july 26, 2021, https://www.statista.com/statistics/258749/most-popular-global-mobile-messenger-apps/. 25 mohd shoaib ansari and aditya tripathi, “use of whatsapp for effective delivery of library and information services,” desidoc journal of library & information technology 37, no. 5 (2017): 360–65, https://doi.org/10.14429/djlit.37.5.11090; y. sudharani and k. nagaraju, “whatsapp usage among the students of svu college of engineering, tirupathi,” journal of advances in library and information science 5, no. 4 (2016): 325–29, https://jalis.in/pdf/5-4/nagaraju.pdf. 26 jianhua xu, qi kang, zhiqiang song, and christopher peter clarke, “applications of mobile social media: wechat among academic libraries in china,” the journal of academic librarianship 41, no.
1 (2015): 21–30, https://doi.org/10.1016/j.acalib.2014.10.012; tsang and chiu, “effectiveness of virtual reference services”; “digital 2022—hong kong,” 54; zhang and lu, “what are the key indicators.” 27 heather howard, sarah huber, lisa carter, and elizabeth moore, “academic libraries on social media: finding the students and the information they want,” information technology and libraries 37, no. 1 (2018): 8–18, https://doi.org/10.6017/ital.v37i1.10160. 28 odu and omini, “mobile phone applications.” 29 ansari and tripathi, “use of whatsapp”; sudharani and nagaraju, “whatsapp usage.” 30 friday, chidozie, and chukwuma, “social media and library services.” 31 adebowale japhet aina, yemisi tomilola babalola, and adebambo adewale oduwole, “use of web 2.0 tools and services by the library professionals in lagos state tertiary institution libraries: a study,” world digital libraries – an international journal 12, no. 1 (2019): 1–17, https://content.iospress.com/articles/world-digital-libraries-an-international-journal/wdl12101. 32 nor azilawati mohd azmi, a. noorhidawati, and m. k. yanti idaya aspura, “librarians’ behavioral performance on chat reference service in academic libraries: perceived importance vs actual practices,” malaysian journal of library & information science 22, no. 3 (2017): 19–33, https://doi.org/10.22452/mjlis.vol22no3.2.
33 aina, babalola, and oduwole, “use of web 2.0 tools and services”; ansari and tripathi, “use of whatsapp”; friday, chidozie, and chukwuma, “social media and library services”; odu and omini, “mobile phone applications”; sudharani and nagaraju, “whatsapp usage.” 34 mohd azmi, noorhidawati, and yanti idaya aspura, “librarians’ behavioral performance.” 35 tsui, “a study on service quality gap.” 36 “what is libqual+®?,” libqual+, accessed may 1, 2022, https://www.libqual.org/home; tsui, “a study on service quality gap.” 37 jessica kayongo and sherri jones, “faculty perception of information control using libqual+™ indicators,” the journal of academic librarianship 34, no. 2 (2008): 131, https://doi.org/10.1016/j.acalib.2007.12.002. 38 rachael kwai fun ip and christian wagner, “libqual+® as a predictor of library success: extracting new meaning through structured equation modeling,” the journal of academic librarianship 46, no.
2 (2020): 102102, https://doi.org/10.1016/j.acalib.2019.102102; selena killick, anne van weerden, and fransje van weerden, “using libqual+® to identify commonalities in customer satisfaction: the secret to success?” performance measurement and metrics 15, no. 1/2 (2014): 23–31, https://doi.org/10.1108/pmm-04-2014-0012. 39 kayongo and jones, “faculty perception of information control,” 131. 40 ip and wagner, “libqual+® as a predictor”; killick, van weerden, and van weerden, “using libqual+® to identify commonalities.” 41 kayongo and jones, “faculty perception of information control,” 131. 42 ip and wagner, “libqual+® as a predictor”; killick, van weerden, and van weerden, “using libqual+® to identify commonalities.” 43 tsui, “a study on service quality gap.” 44 anabel quan-haase, “instant messaging on campus: use and integration in university students' everyday communication,” the information society 24, no. 2 (2008): 105–15, https://doi.org/10.1080/01972240701883955. 45 tsang and chiu, “effectiveness of virtual reference services.” 46 tsui, “a study on service quality gap.” 47 robert a. peterson, “a meta-analysis of cronbach's coefficient alpha,” journal of consumer research 21, no. 2 (1994): 381–91, https://doi.org/10.1086/209405. 48 ka po lau, dickson k. w. chiu, kevin k. w. ho, patrick lo, and eric w. k. see-to, “educational usage of mobile devices: differences between postgraduate and undergraduate students,” the journal of academic librarianship 43, no. 3 (2017): 201–8, https://doi.org/10.1016/j.acalib.2017.03.004.
49 tsang and chiu, “effectiveness of virtual reference services.” 50 aina, babalola, and oduwole, “use of web 2.0 tools and services.” 51 aina, babalola, and oduwole, “use of web 2.0 tools and services.” 52 leung et al., “user perceptions”; zhang et al., “relating library user education.” 53 maggie ka yin chan, dickson k. w. chiu, and ernest tak hei lam, “effectiveness of overnight learning commons: a comparative study,” the journal of academic librarianship 46, no. 6 (2020): 102253, https://doi.org/10.1016/j.acalib.2020.102253; tsang and chiu, “effectiveness of virtual reference services.” 54 wesley wing hong cheng, ernest tak hei lam, and dickson k. w. chiu, “social media as a platform in academic library marketing: a comparative study,” the journal of academic librarianship 46, no. 5 (2020): 102188, https://doi.org/10.1016/j.acalib.2020.102188. 55 parker fruehan and diana hellyar, “expanding and improving our library's virtual chat service: discovering best practices when demand increases,” information technology and libraries 40, no. 3 (2021): 1–9, https://doi.org/10.6017/ital.v40i3.13117; pui yik yu, ernest tak hei lam, and dickson k. w. chiu, “operation management of academic libraries in hong kong under covid-19,” library hi tech, (2022), ahead of print, https://doi.org/10.1108/lht-10-2021-0342.
56 friday, chidozie, and chukwuma, “social media and library services.” 57 mohd azmi, noorhidawati, and yanti idaya aspura, “librarians’ behavioral performance.” 58 aina, babalola, and oduwole, “use of web 2.0 tools and services”; friday, chidozie, and chukwuma, “social media and library services.” 59 yu, lam, and chiu, “operation management.” 60 tsang and chiu, “effectiveness of virtual reference services.” 61 kirsti nilsen, “the library visit study: user experiences at the virtual reference desk,” information research 9, no. 2 (2004), paper 171, http://informationr.net/ir/9-2/paper171.html. 62 mohd azmi, noorhidawati, and yanti idaya aspura, “librarians’ behavioral performance.” 63 tsang and chiu, “effectiveness of virtual reference services.” 64 mohd azmi, noorhidawati, and yanti idaya aspura, “librarians’ behavioral performance.” 65 tsang and chiu, “effectiveness of virtual reference services.” 66 dessy harisanty et al., “leaders, practitioners and scientists’ awareness of artificial intelligence in libraries: a pilot study,” library hi tech, (2022), ahead of print, https://doi.org/10.1108/lht-10-2021-0356. 67 ansari and tripathi, “use of whatsapp”; sudharani and nagaraju, “whatsapp usage.” 68 leung et al., “user perceptions”; tsang and chiu, “effectiveness of virtual reference services.” 69 ernest tak hei lam, cheuk hang au, and dickson k. w.
chiu, “analyzing the use of facebook among university libraries in hong kong,” the journal of academic librarianship 45, no. 3 (2019): 175–83, https://doi.org/10.1016/j.acalib.2019.02.007; foley, “instant messaging reference”; tsang and chiu, “effectiveness of virtual reference services.” 70 lam, au, and chiu, “analyzing the use of facebook”; tsang and chiu, “effectiveness of virtual reference services.” 71 cheng, lam, and chiu, “social media as a platform.” 72 tsang and chiu, “effectiveness of virtual reference services.” 73 apple hiu ching lam, kevin k. w. ho, and dickson k. w. chiu, “instagram for student learning and library promotions? a quantitative study using the 5e instructional model,” aslib journal of information management, (2022), in press, https://doi.org/10.1108/ajim-12-2021-0389.

communications

marc format simplification

d. kaye capen: university of alabama, university.

this is a summary of a paper that considers the feasibility, benefits, disadvantages, and consequences of simplifying the marc formats for bibliographic records.1 the original paper was commissioned in june 1981 by the arl task force on bibliographic control as one facet in exploring the perceived high costs of cataloging and adhering to marc formats in arl libraries.
the conclusions and recommendations, however, are entirely those of the author, and the opinions and judgments stated here result from a wide-ranging canvass of technical services people, computer people, and/or library administrators. because the marc format has so many uses, the paper is divided into five perspectives from which the marc format can be viewed: history, standards, and codes; present purposes; library operations; computer operations; and online catalogs. the library of congress has already begun a review of the marc format and has distributed a draft document.2 the general thrust of that review is a close examination of the marc format in an attempt to lay the foundation on which revised marc formats can firmly stand, particularly in regard to content designation (tags, indicators, and subfield codes used to identify and characterize the data explicitly). as that review deals with the very specific, this paper aims to paint with broad strokes a picture of today's marc in its many relationships, benefits, and costs, and of the impact on the whole of any change to a part.

perspective: marc history, standards, and codes

relationships

the original marc format document established conventions for encoding data for monographs. though it was understood that early applications were going to relate to the production of catalog cards, the marc designers looked ahead to an increasing emphasis on data retrieval applications. other design considerations included, for example, the necessity of providing for complex computer filing, allowance for a variety of data processing equipment, and an attempt to provide for some analytical work (more specific description of contents notes or other types of analysis).
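to make "content designation" concrete, the following is a minimal python sketch of how a tag, two indicators, and coded subfields might be pulled out of one variable field. the class and function names are hypothetical, the sample title data is invented for illustration, and the sketch assumes the conventional subfield delimiter character (hex 1f); it is not a description of any particular system's implementation.

```python
from dataclasses import dataclass, field

SUBFIELD_DELIMITER = "\x1f"  # conventional delimiter that precedes each subfield code


@dataclass
class VariableField:
    """Hypothetical in-memory form of one content-designated field."""
    tag: str                      # three-character tag, e.g. "245" (title statement)
    indicators: str               # the two indicator characters
    subfields: list[tuple[str, str]] = field(default_factory=list)


def parse_variable_field(tag: str, raw: str) -> VariableField:
    """Split raw field data into indicators and (code, value) subfields."""
    indicators, body = raw[:2], raw[2:]
    subfields = []
    for chunk in body.split(SUBFIELD_DELIMITER):
        if chunk:  # the piece before the first delimiter is empty
            subfields.append((chunk[0], chunk[1:]))
    return VariableField(tag, indicators, subfields)


# Invented sample data: indicators "10", subfield a (title), subfield c (responsibility).
raw_title = "10" + "\x1faMARC format simplification /" + "\x1fcD. Kaye Capen."
title = parse_variable_field("245", raw_title)
```

every element of the description is labeled explicitly, which is the property the later sections weigh against the cost of learning and maintaining the coding.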
later the single marc ii format was transformed into a series of formats, and as time passed, those formats became inextricably tied to other developments at the national and international levels: the international standard bibliographic descriptions, the anglo-american cataloguing rules, 2d ed., unimarc, the national level bibliographic records, and the national and international communications standards; e.g., ansi z39.2-1979 and iso 2709.

benefits

the benefits of the marc formats and other standards and codes have been substantial both philosophically and pragmatically. the sharing of cataloging records through the computer-based, online networks has been shown in a variety of cost studies to have contained the rate of rise of per-unit cost. a further benefit of the marc formats is the momentum their creation gave to the steady movement toward standardization, which can benefit individual libraries in a number of ways: first, bibliographic information can be exchanged among libraries and countries. second, in recent years we have moved steadily toward creating an environment in which the library of congress would become one of many authoritative libraries, thus enhancing the shareability of records.

costs

the early costs of the development and implementation of the marc formats were borne by lc (aided by council on library resources funds). lc continues to bear most of the costs of marc formats, such as new marbi proposals, duplication and distribution of documentation, and so forth. direct investment of library dollars came through the purchase of the marc tapes and the development of systems to receive, process, and output data in marc formats.

impact of change

throughout the years of its use, the marc format content designation and content rules have been augmented or modified. in the beginning, however, databases were small and changes could be absorbed more readily.
the number and complexity of the formats have increased, as have the interrelationships of the marc formats with other standards and codes, resulting in a present environment in which the impact of change is felt more strongly.

perspective: present relationships and constraints

relationships

today's close interrelationships between the marc formats and other codes and standards affect both library and computer operations. though, for example, the general international standard bibliographic description was implemented by the library community prior to the adoption of aacr2, the second edition of the rules has firmly incorporated the isbds. when this format description system is combined with the machine-based marc formats, some isbd information will be supplied by humans and some generated by programmed machine manipulations. as a second example, in the last couple of years, the library of congress has spearheaded the development of national level bibliographic record(s), which define the specific data elements that should be included by any organization creating cataloging records which may also be shared with other organizations or be acceptable for contribution to a national database. as the logical idea of a national database comes to fruition, it is necessary for the marc format to provide for greater specificity in the coding of originating library, modifying library, and so forth.

benefits

the benefits of the use of the marc format continue to lie in the ease with which bibliographic information can be shared and the concomitant beneficial impact on cost control. in addition, the marc format supports a host of other standards and codes, and the benefit from these relationships has been consistency in and fostering of standards development.
in the bibliographic arena, the more that standards are developed (locally, regionally, nationally, and internationally), the more we will be able to transmit and share bibliographic data, thus controlling the costs of original cataloging. on the other hand, we also "pay" when we standardize.

cost

the two costs associated with increased standardization are the additional time, and thus cost, required to meet standards, and the increased expense of maintaining local practices, which may often be idiosyncratic. in relation to the latter, while many local idiosyncrasies are often unnecessary and counterproductive, there are generally some which have become an integral part of a large catalog database or upon which a major procedural activity is based. but, to benefit from compliance with standards, increasingly we will move away from local practices. in terms of the time required to adhere to the marc format, it is possible to continue to utilize the format (or participate in systems that use it) and yet control the amount of complexity with which one has to deal. both aacr2 and national level bibliographic record documents allow for "levels of description" which provide for more or less description; and various online networks allow, in a similar manner, for limited input standards.

journal of library automation vol. 14/4 december 1981

as we view the array of standards and codes which together make up today's bibliographic scene, we can see that each of the separate elements is consistent within itself, is understandable, and accounts for only a portion of the costs associated with the cataloging process. the combination of elements, however, begins an accretion of complexity that for most requires an effort of organization and education in order to control work flow and meet standards.

impact of change

because the marc format is closely interwoven with a number of national and international codes and standards, changes to the format would have implications far beyond the local library. at the very least, discussions would have to involve a host of individuals and groups, all at different stages of development and implementation based upon the present marc format.

perspective: library operations

relationships

in the library-operations perspective, any operations related to the marc format have to be viewed as only one of many elements which must be interfaced with daily work flow. let us look, for example, at the amount of time which might be expended in a typical large academic library by cataloging personnel in training and ongoing work activities required in marc-related operations. in those libraries which obtain access to cataloging databases as members of networks, contact with the marc format is filtered through the standards, requirements, marc implementation design, documentation, and other related training facilities of the network. libraries which maintain their own databases do the same kind of filtering, though staff may have somewhat more control of the user cordiality of the interface. the shared networking environment, however, generally seems to imply more standards and requirements because of the attempt to guarantee as much "shareability" as possible. libraries participating in oclc, for example, must train staff in the following codes: aacr1; aacr2; standard subject heading codes; standard classification codes; oclc/marc formats for each type of material being cataloged; oclc bibliographic input standards; oclc level i and level k input standards; oclc systems users guides; in some instances, input standards documents for regional or special-interest cooperatives; local library interpretations, procedures, and standards.
any close review of the time library staff expend in the use of these tools for either training or ongoing operations reveals that marc per se requires only a limited proportion of a typical library staff person's day. while training may be intensive at either the beginning of a person's job or at the beginning of work with a new type/format of material, this portion of the cataloging unit cost is small.

benefits, costs

in the cataloging activity, the benefits from the use of the marc formats are at least three: first, the marc format as part of an online cataloging system permits the machine production of catalog cards at a major savings over manual production. second, access to a shared cataloging database permits the use of "clerical" catalogers at an estimated unit cost saving per book of twenty dollars when compared to "original" cataloging.3 third, depending upon the information available in the cataloging record, the time required for decision making during the cataloging process can be decreased significantly.

impact of change

it was the general consensus of the technical services people i contacted that simplification of the formats through the consistent assignment of tags would make training and introduction to new formats somewhat easier, but that any savings of time would probably be trivial. there was no consensus that either simplification or shortening would result in any significant time or cost savings. to a certain extent, the use of the very specific marc formats has made the descriptive cataloging process (and the training to undertake it) clearer, in that the logical relationships and description of the data elements are so clearly exposed through the assignment of tags and other codes. also, once initial familiarity with the format(s) is achieved, ongoing use becomes second nature.
it is also possible for cataloging staff to control the complexity with which they will deal through the use of less than "full," but still nationally acceptable, levels of cataloging and, hence, marc coding. finally, most technical services people believe that cataloging and maintenance activities in libraries have always been complex, requiring long and detailed procedures and intricate work flow. while membership in networks requires new skills and knowledge, it is the sum of the whole rather than the difficulty of any single portion which affects unit costs today. changing the marc format through either simplification or shortening would have only a slight effect on the total technical services operation and costs.

perspective: the computer operations environment

relationships

in looking at computer operations, there are at least two major subdivisions: operations that serve only one client (e.g., a library system serving itself) or operations that serve many clients (e.g., rlin or blackwell/north america). the constraints differ for each operation and are further complicated by whether or not the computer operation must be able to produce as well as accept bibliographic records in a marc format. each computer facility, for example, can have distinct operating software depending upon the type and mix of computing equipment used. in addition, each computing facility translates the marc-formatted records into an internal processing format which may differ extensively from marc. too, further tailoring may be done for batch processing as opposed to online operations, and computer operations which serve a single user may not have to re-create records in the marc format and may even more radically redesign the marc-formatted records for internal use. as changes to the marc format occur over the years, each computer system will write additional software to incorporate those changes into the then existing system.
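as a rough illustration of the translation step described above, the python sketch below reads a record laid out in the iso 2709 / ansi z39.2 manner (a 24-character leader, a directory of 12-character entries, then the field data) into a simple tag-to-values mapping. the mapping is a hypothetical stand-in for an "internal processing format"; real facilities used their own designs. `build_record` and `to_internal` are illustrative names, and the sample record is invented.

```python
FIELD_TERMINATOR = b"\x1e"
RECORD_TERMINATOR = b"\x1d"


def build_record(fields: list[tuple[str, bytes]]) -> bytes:
    """Assemble a minimal communications-format record: leader, directory, data."""
    directory = b""
    data = b""
    for tag, value in fields:
        value += FIELD_TERMINATOR
        # Directory entry: tag (3), field length (4), starting position (5).
        directory += f"{tag}{len(value):04d}{len(data):05d}".encode("ascii")
        data += value
    directory += FIELD_TERMINATOR
    base = 24 + len(directory)        # base address of data
    total = base + len(data) + 1      # +1 for the record terminator
    # Leader: record length (0-4) and base address (12-16) are the parts used here.
    leader = f"{total:05d}nam  22{base:05d}   4500".encode("ascii")
    return leader + directory + data + RECORD_TERMINATOR


def to_internal(record: bytes) -> dict[str, list[bytes]]:
    """Translate a record into a hypothetical internal form: tag -> field values."""
    base = int(record[12:17])
    directory = record[24:base - 1]   # drop the directory's own terminator
    internal: dict[str, list[bytes]] = {}
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]
        tag = entry[0:3].decode("ascii")
        length = int(entry[3:7])
        start = int(entry[7:12])
        # Slice out the field, excluding its trailing field terminator.
        value = record[base + start: base + start + length - 1]
        internal.setdefault(tag, []).append(value)
    return internal


record = build_record([
    ("001", b"demo-0001"),
    ("245", b"10\x1faMARC format simplification"),
])
internal = to_internal(record)
```

a system that only receives records needs just the second function; one that must also output records in the communications format needs the reverse path as well, which is the distinction the paragraph above draws between single-client and multi-client operations.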
in some instances, it may be too difficult to attempt to convert old databases to reflect changes in marc coding, and there will then exist an "old" database and a "new" database for that particular marc field or subfield. since changes have occurred in many fields, most databases are an amalgam of new and old interpretations of marc coding (this is true in relation to cataloging codes, too), and original internal software design may reflect the same type of patchwork quilt. operating these computer systems is complicated, in addition, by the fact that a wide range of user library needs and desires must be accommodated. indeed, a report prepared by hank epstein for the conference to explore machine-readable bibliographic interchange (cembi) revealed, after an exhaustive review of the use of marc data elements, that there was no data element not used by someone!4

benefits

benefits that accrue to computing operations as a result of the marc format include the use of what was called "a pretty decent general communications format," which facilitates communications, card/com production, and online information retrieval. as a communications format it is as coherent as any other structure for carrying bibliographic data. because the format allows for a very specific level of detail in description, computing operations can supply a variety of products to fill a variety of needs.

costs

while specific cost information was not available for inclusion in this paper, discussion does reveal some widely held generalizations. first, the marc format does not seem to be any more complex or costly to use than other variable-field communications formats. beginning programmers are generally introduced first to the internal communications format of their particular computing system, and when they come to the marc tags, rapidly become familiar with the coding through experience.
indeed, if the programmers know the structure of and have a specification for the format, they can work with that format even though they may be unfamiliar with it from the users' point of view. thus, the format itself, and training in its use, does not seem to be significantly costly. second, every change in the marc format requires some programming effort and may or may not require concomitant changes in the database. the consensus of the computer people with whom i spoke was that the sophistication and specificity of the marc formats are a good thing, but the inconsistencies among formats are problematic. the benefits of consistency can be important, but to justify changes financially, the major changes should be done at one time. indeed, most individuals doubted whether there was sufficient capital in these straitened times to implement consistently a major marc format change, and this is from the perspective of both the operations serving one and those serving many users.

impact of change

without a philosophical and practical framework (or benchmark) against which to compare the benefits and costs of alternative solutions to marc format maintenance issues, and without a better and more comprehensive description of the requirements of the internal processing formats of the computer operations, it is difficult to assess clearly the costs and benefits of marc format changes. it does seem to be the case presently that, once established, computer operations can deal with the complexity and specificity of the marc format without undue ongoing financial investment. the strength of the marc format for computer operations lies in its specificity. for the batch processing environment especially, the marc format is a reasonably efficient format and one that facilitates development. its inefficiencies are not drastic and its specificity buys valuable flexibility.
severe cuts or major simplifications would be a mistake since discontinuing specificity is a one-way street-once it is gone, it cannot be retrieved. the ability of the machine to assist in editing is weakened by the loss of specificity and it then becomes more difficult to edit out poor data. simplification through consistency, rather than shortening, would produce the most beneficial impact-though it must be done carefully to be cost beneficial. perspective: online catalogs relationship the major difficulties facing us when we attempt to discuss the relationship of the marc format to online catalogs is that, first, we know so little about how people think when they use our card catalogs; and, second, we have so little experience with how those thought and use patterns might change when the online catalog replaces the card catalog. another aspect of online library system development is the combination of subsystems such as acquisitions, serials control, or authority control with the online catalog and the implications of such a combination for system design, the internal processing format, and compatibility with the marc format. the index design of most large online catalogs or information retrieval systems today relies upon precoordinated search keys in order to facilitate the large sorting activities that have to occur. the second indicator in the 700 field, for example, is designed for the purpose of formulating search keys, filing added entries or for selecting alternative secondary added entries. this type of specificity is necessary for both card production and online retrieval. taken together, all of these considerations make most systems and library technical people hesitate to recommend any major changes to the marc format at this time. benefits at this time, therefore, in terms of information retrieval, there does not seem to be any major force toward either simplifying or shortening the marc format to facilitate retrieval. 
this becomes an even more cogent sentiment when we consider that major development efforts have already been begun in the areas of online catalog access and information retrieval. delays in these development efforts now caused by ........ changes in the marc formats could be enormously wasteful of the time and effort already invested, and could postpone urgently needed implementation of new, easily maintainable online systems. costs there is no firm cost data to guide us in considering the impact of marc format changes in the information retrieval environment. generally accepted assumptions are, however, that because of our lack of knowledge and experience in this area, it is simply too risky and potentially costly to experiment. impact of change overall, without more experience in this area, it is the general opinion that the fullest level of descriptive specificity of the marc format might be required to design and implement online catalogs/information retrieval systems which can be responsive to the needs of a variety of users and levels of information. interaction with other subsystems and formats is also incomplete, thus clouding our vision of the impact of change over the breadth of the library community. summary and conclusions the original purpose of the marc format is still a cogent and necessary one-that of allowing for a great variety of individual library needs for products, practices, and policies via a standardizing communications format. both catalog card production and online retrieval necessitate the same level of specificity, though particular tags, indicators, and subfield codes may vary. as we look toward a variety of authoritative cataloging sources the marc format, in addition to a specific coding of bibliographic information, might also have to specify descriptions of cataloging actions so that the greatest degree of "shareability" might exist. 
some of this related authority-type information will be carried either as part of the marc format or in some manner as linked records. the computer operations that utilize the marc formats exist under the constraints of a variety of internal processing formats and design constraints. for each internal processing system, however, the specificity of the marc format offers flexibility and efficiency for a number of different processes and products. taken by itself, the marc format is no more difficult to work with than any other standard or technique for both librarians and computer people. while it might be useful for librarians to implement training aids such as online documentation, access to library manuals (particularly that of the library of congress), and so forth, the benefits of aids such as these are trivial since the coding can be learned rather quickly through experience. for computing people, on the other hand, changes in the formats can be very expensive and disruptive. there is general agreement, moreover, that over the long term we must be able to maintain the marc format in response to experience with retrieval and other theoretical and technical advances. the main thrust of maintenance in the computing realm is consistency across formats, but approaching this type of simplification requires a number of preliminary steps if it is to be implemented effectively. we need to develop a vocabulary for jointly discussing the elements of the problem. in addition, a major review needs to be undertaken of the internal processing formats and design constraints of the major computer operations, both to serve as a benchmark for measuring the impact of format changes and as a guideline to help newly developing systems avoid mistakes in the development of new computer operations.
someone needs to be thinking about and designing the ultimate, comprehensive marc format-not to be implemented, but to serve as a springboard for discussion and for consideration of system design. we need to establish limitations on what we will handle with the marc formats and where we will begin to rely on underlying formats instead. the development of a comprehensive marc conceptualization would also provide a protocol for undertaking the improvement of marc and would serve as a benchmark against which local systems could be compared. at the very least, the steps described here would facilitate the consideration and implementation of making the formats consistent across types of material a goal which is seen by all to be highly desirable. 292 journal of library automation vol. 14/4 december 1981 we need a format which is consistent, easily maintainable without being uncontrollably disruptive, and responsive to changing needs which are likely to accelerate as we gain experience with online systems. rather than recommending or supporting the implementation of specific changes to the marc format, it is essential that the library community begin to establish the framework and benchmarks necessary to maintain the marc formats over the long term as well as to guide short-term considerations. arl and others can play an important role in undertaking and encouraging a broader approach to this pressing problem. such an approach will not only reduce the risk of decision making, but will also assist in the development of the cost/benefit data needed to enhance consideration of format changes. references 1. d. kaye capen, simplification of the marc format: feasibility, benefits, disadvantages, consequences (washington, d.c.: association of research libraries, 1981), 22p. 2. "principles of marc format content designation,'" draft (washington, d.c.: library of congress, 1981), 66p. 3. ichikot. morita and d. 
kaye capen, "a cost analysis of the ohio college library center on-line shared cataloging system in the ohio state university libraries," library resources & technical services 21:286302 (summer 1977). 4. council on library resources bibliographic interchange committee, bibliographic interchange report, no. i (washington, d.c.: the council, 1981). comparing fiche and film: a test of speed terence crowley: division of library science, san jose state university, san jose, california. introduction for more than a decade librarians have been responding to budget pressures by altering the format of their library catalogs from labor-intensive card formats to computer-produced book and microformats. studies at bath, 1 toronto, 2 texas, 3 eugene, 4 los angeles, 5 and berkeley, 6 have compared the forms of catalogs in a variety of ways ranging from broad-scale user surveys to circumscribed estimates of the speed of searching and the incidence of queuing. the american library association published a state-of-the-art reporf as well as a guide to commercial computer-output microfilm (com) catalogs pragmatically subtitled how to choose; when to buy. 8 in general, com catalogs are shown to be more economical and faster to produce and to keep current, to require less space, and to be suitable for distribution to multiple locations. primary disadvantages cited are hardware malfunctions, increased need for patron instruction, user resistance (particularly due to eyestrain), and some machine queuing. the most common types of library com catalogs today are motorized reel microfilm and microfiche, each with advantages and disadvantages. microfilm offers filesequence integrity and thus is less subject to user abuse, i.e., theft, misfiling, and damage; in motorized readers with "captive" reels it is said to be easier to use. 
disadvantages include substantially greater initial cost for motorized readers; limits on the capacity of captive reels, necessitating multiple units for large files; inexact indexing in the most widespread commercial reader; and eyestrain resulting from high-speed film movement. microfiche offers a more nearly random retrieval, much less expensive and more versatile readers, and unlimited file size. conversely, the file integrity of fiche is lower and the need for patron assistance in use of machines is said to be greater than for self-contained motorized film readers. the problem one of the important considerations not fully researched is that of speed of searching. the toronto study included a self-timed "look-up" test of thirty-two items "not in alphabetical order" given to thirty-six volunteers, of whom thirty finished the test. the researchers found the results "inconclusive" but noted that seven of the ten librarians found film searching the fastest method. "average" time reported for searching in card catalogs was 37.3 min

improving independent student navigation of complex educational web sites: an analysis of two navigation design changes in libguides kate a. pittsley and sara memmott information technology and libraries | september 2012 52 abstract can the navigation of complex research websites be improved so that users more often find their way without intermediation or instruction? librarians at eastern michigan university discovered both anecdotally and by looking at patterns in usage statistics that some students were not recognizing navigational elements on web-based research guides, and so were not always accessing secondary pages of the guides. in this study, two types of navigation improvements were applied to separate sets of online guides. usage patterns from before and after the changes were analyzed.
both sets of experimental guides showed an increase in use of secondary guide pages after the changes were applied whereas a comparison group with no navigation changes showed no significant change in usage patterns. in this case, both duplicate menu links and improvements to tab design appeared to improve independent student navigation of complex research sites. introduction anecdotal evidence led librarians at eastern michigan university (emu) to investigate possible navigation issues related to the libguides platform. anecdotal evidence included (1) incidents of emu librarians not immediately recognizing the tab navigation when looking at implementations of the libguides platform on other university sites during the initial purchase evaluation, (2) multiple encounters with students at the reference desk who did not notice the tab navigation, and (3) a specific case involving use of a guide with an online course. the case investigation started with a complaint from a professor that graduate students in her online course were suddenly using far fewer resources than students in the same course during previous semesters. the students in that semester’s section relied heavily—often solely— on one database, while most students during previous semesters had used multiple research sources. this course has always relied on a research guide prepared by the liaison librarian, the selection of resources provided had not changed significantly between the semesters, and the assignment had not changed. furthermore, the same professor taught the course and did not alter her recommendation to the students to use the resources on the research guide. what had changed between the semesters was the platform used to present research guides. the library had just migrated from a simple one-page format for research guides to the more flexible multipage format offered by the libguides platform. only a few resources were listed on the first kate a. 
pittsley (kpittsle@emich.edu) is an assistant professor and business information librarian and sara memmott (smemmott@emich.edu) is an instructor and emerging technologies librarian at eastern michigan university, ypsilanti, michigan. improving independent student navigation of complex educational websites | pittsley and memmott 53 libguides page of the guide used for the course. only one of these resources was a subscription database, and that database was the one that current students were using to the exclusion of many other useful sources. after speaking with the professor, the liaison librarian also worked one-on-one with a student in the course. the student confirmed that she had not noticed the tab navigation and so was unaware of the numerous resources offered on subsequent pages. the professor then sent a message to all students in the course explaining the tab navigation. subsequently the professor reported that students in the course used a much wider range of sources in assignments. statistical evidence of the problem a look at statistics on guide use for fall 2010 showed that on almost all guides the first pages of guides were the most heavily used. as the usual entry point, it wasn’t surprising that the first pages would receive the most use; however, on many multipage guides, the difference in use between the first page and all secondary pages was dramatic. that users missed the tab navigation and so did not realize additional guide pages existed seemed like a possible explanation for this usage pattern. librarians felt strongly that most users should be able to navigate guides without direct instruction in their use, and they were concerned by the evidence that indicated problems with the guide navigation. was there something that could be done to improve independent student navigation in libguides? two types of design changes to navigation were considered. to test the changes, each navigation change was applied to separate sets of guides. 
usage patterns were then compared for those guides before and after changes were made. the investigators also looked at usage patterns over the same period for a comparison group to which no navigation changes had been made. literature review navigation in libguides and pathfinders the authors reviewed numerous articles related to libguides or pathfinders generally, but found few that mention navigation issues. they then turned to studies of website navigation in general. in an early article on the transition to web-based library guides, cooper noted that “computer screens do not allow viewers to visualize as much information simultaneously as do print guides, and consequently the need for uncomplicated, easily understood design is even greater.”1 four university libraries’ usability studies of the libguides platform specifically address navigation issues. university of michigan librarians dubicki et al. found that “tabs are recognizable and meaningful—users understood the function of the tabs.”2 the michigan study then focused on the use of meaningful language for tab labels. however, at the latrobe university library (australia), corbin and karasmanis found a consistent pattern of students not recognizing the navigation tabs, and so recommended providing additional navigation links elsewhere on the page.3 at the university of washington, hungerford et al. found students did not immediately recognize the tab navigation: information technology and libraries | se ptember 2012 54 during testing it was observed that users frequently did not notice a guide’s tabs right away as a navigational option. users’ eyes were drawn to the top middle of the page first and would focus on content there, especially if there was actionable content, such as links to other pages or resources.4 the solution at the university of washington was to require that all guides have a main page navigation area (libguides “box”) with a menu of links to the tabbed pages. 
after a usability study, mit libraries also recommended use of a duplicate navigation menu on the first page, stating in mit libraries staff guidelines for creating libguides to “make sure to link to the tabs somewhere on the main page” as “users don’t always see the tabs, so providing alternate navigation helps.”5 navigation palmer mentions navigation as one of the factors most significantly associated with website success as measured by user satisfaction, likelihood to use a site again, and use frequency.6 however, effective navigation may be difficult to achieve. nielsen found in numerous studies that “users look straight at the content and ignore the navigation areas when they scan a new page.”7 in a presentation on the top ten mistakes in web design, human–computer interaction scholar tullis included “awkward or confusing navigation.”8 the following review of the literature on website navigation design is limited to studies of navigation models that use browsing via menus, tabs, and menu bars. the navigation problem seen in libguides is far from unique. usability studies for other information-rich websites demonstrate similar problems with users not recognizing navigation tabs or menu bars similar to those used in libguides. in 2001, mcgillis and toms investigated the usability of a library website with a horizontal navigation bar at the top of the page, a design similar to the single row of libguides tabs. this study found that users either did not see the navigation bar or did not realize it could be clicked.9 in multiple usability studies, u.s. census bureau researchers found similar problems with navigation bars on government websites. in 2009, olmsted-hawala et al. reported that study participants did not use the top-navigation bar on the census bureau’s business and industry website.10 the next year, chen et al. 
again reported problems with top-navigation bar use on the governments division public website, explaining that the “top-navigation bar blends into the header, leading participants to skip over the tabs and move directly to the main content. this is a recurring issue the usability laboratory has identified with many web sites.”11 one possible explanation for user neglect of tabs and navigation bars may be a phenomenon termed “banner blindness.” as early as 1999, benway provided in-depth analysis of this problem. in his thesis, he uses the word “banner” not just for banner ads, but also for banners that consist of horizontal graphic buttons similar to the libguides tab design. benway’s experiments show that an attempt to make important items visually prominent may have the opposite effect— that “the visual distinctiveness may actually make important items seem unimportant.” benway follows with two recommendations: (1) that “any method that is created to make something stand out should be carefully tested with users who are specifically looking for that content to ensure that it does not cause banner blindness,” and (2) that “any item visually distinguished on a page should be duplicated within a collection of links or other navigation areas of the page. that way, if searchers ignore the large salient item, they can still find what they need through basic navigation.”12 improving independent student navigation of complex educational websites | pittsley and memmott 55 in 2005, tullis cited multiple studies that showed that users found information faster or more effectively by using a simple table of contents than by using other navigation forms, including tabbased navigation.13 yet in 2011, nicolson et al. found that “participants rarely used table of contents; and often appeared not to notice them.”14 yelinek et al. 
pointed to a practical problem in using content menus on libguides pages: since libguides pages can be copied or mirrored on other guides, guide authors must be cognizant that such menus could cause problems with incorrect or confusing navigational links on copied or mirrored pages.15 success can also depend on the location of navigational elements, although researchers disagree on effects of location. in addition, user expectations of where to look for navigation elements may change over time along with changes in web conventions. in 2001, bernard studied user expectations as to where common web functions would be located on the screen layout. he found that “most participants expected the links to web pages within a website to be almost exclusively located in the upper-left side of a web page, which conforms to the current convention of placing links on [the] left side.”16 in 2004, pratt et al. found that users were equally effective using horizontal or vertical navigation menus, but when given a choice more users chose to use vertical navigation.17 also in 2004, mccarthy et al. 
performed an eye-tracking study, which showed faster search times when sites conformed to the expected left navigation menu and a user bias toward searching the middle of the screen; but it also found that the initial effect of menu position diminished with repeated use of a site.18 nonetheless, jones found that by 2006 most corporate webpages used “horizontally aligned primary navigation using buttons, tabs, or other formatted text.”19 in 2008, cooke found that users looked equally at left, top, and center menus; however, when “a visually prominent navigation menu populated the center of the web page, participants were more likely to direct their search in this location.”20 wroblewski describes how tab navigation was first popularized by amazon.21 burrell and sodan investigated user preferences for six navigation styles and found that users clearly preferred tab navigation “because it is most easily understood and learned.”22 in the often-cited web design manual don’t make me think, krug also recommends tabs: “tabs are one of the very few cases where using a physical metaphor in a user interface actually works.”23 krug recommends that tabs be carefully designed to resemble file folder tabs. they should “create the visual illusion that the active tab is in front of the other tabs . . . the active tab needs to be a different color or contrasting shade [than the other tabs] and it has to physically connect with the space below it. this is what makes the active tab ‘pop’ to the front.”24 an often-cited u.s. department of health and human services manual on research-based web design addresses principles of good tab design, stating that tabs should be located near the top of the page and should “look like clickable versions of real-world tabs. 
real-world tabs are those that resemble the ones found in a file drawer.”25 nielsen provides similar guidelines for tab design, which include that the selected tab should be highlighted, the current tab should be connected to the content area (just like a physical tab), and that one should use only one row of tabs.26 more recently, cronin highlighted examples of good tab design that effectively use elements such as rounded tab corners, space between tabs, and an obvious design for the active tab that visually connects the tab to the area beneath it.27 christie also provides best practices for tab design that include consistent use of only one row of tabs, use of a prominent color for the active tab and a single information technology and libraries | se ptember 2012 56 background color for unselected tabs, changing the font color on the active tab, and use of rounded corners to enhance the file-folder-tab metaphor.28 two articles mention that the complexity of a site can be a factor in navigation success. mccarthy et al. found that search times are significantly affected by site complexity and recommended finding ways to balance the provision of numerous user options with simplifying the site so that users can find their way.29 little specifically suggests reducing the amount of extraneous information on libguides pages in her article, which applies cognitive load theory to use of library research guides.30 in sum, effective navigation is difficult to achieve. however, navigation design can be improved by considering the purpose of the site, user expectations, common conventions, best practices, the possibility that intuitive ideas for design may not perform as expected (e.g., banner blindness), the site’s complexity, and more. research question and method could design changes improve independent student use of libguides tab navigation? 
the literature reviewed above suggested two likely design changes to test: adding navigation links in the body of the page and improving the tab design. testing these design changes on selected guides would allow the emu library to assess the impact before implementing changes on all library research guides. for this experiment, each type of navigation change was applied to separate subsets of guides; a subset of similar guides was selected as a comparison group; and usage patterns were analyzed for similar periods before and after changes were made. navigation design changes were made to fourteen subject guides related to business. the business subject guides were divided into two experimental groups of seven guides. in group a, a table of contents box with navigation links was added to the front page of each guide, and in group b, the navigation tabs were altered in appearance. no navigation changes were made to comparison group c. class-specific guides were excluded from the experiment, as in many cases the business librarian would have instructed students in the use of tabs on class guides. changes were made at the beginning of the winter 2011 semester so that an entire semester’s data could be collected and compared to the previous semester’s usage patterns. the design for group a was similar to the university of washington implementation of a “what’s in the guide” box on guide homepages that repeated the tab navigation links.31 for guides in group a, a table of contents box was placed on the guide homepages. it contained a simple list of links to the secondary pages of the guides, using the same labels as on the navigation tabs. the table of contents box used a larger font size than other body text. it was also given an outline color that contrasted with the outline color of the other boxes and matched the navigation tab color, as a visual cue that this box served a different function (navigation) from the other boxes on the page.
the table of contents box was placed alongside other content on the guide homepages so users could still see the most relevant resources immediately. figure 1 shows a guide containing a table of contents box.

figure 1. group a guide with content menu box labeled “guide sections”

the design change for group b focused on the navigation tabs. libguides tabs exhibit some of the properties of good tab design, such as allowing for rounded corners and contrasting colors for the selected tabs. other aspects are not ideal, such as the line that separates the active tab from the page body.32 in the emu library’s initial libguides implementation, the option for tabs with rounded corners was used to resemble the design of manila file folders and increase the association with the file-folder metaphor. possibilities for further design adaptation on the experimental guides were somewhat limited because these changes needed to be applied to the tabs of just a selected set of guides. the investigators theorized that increasing the height of the tabs might make them more closely resemble paper file folder tabs. increasing the height would also increase the area of the tabs, and the larger size might make the tabs more noticeable. this option was simple to implement on the guides in group b by adding html break tags, <br>
, to the tab text. taller tabs also provided more room for text on the tabs. tabs in libguides will expand in width to fit the text label used, and if the tabs on a guide require more space on the page, they will be displayed in multiple rows. multiple rows of tabs are visually confusing and break the tabs metaphor, decreasing their usefulness for navigation.33 the emu library’s best practices for research guides already encouraged limiting tabs to one row. adding height to tabs allowed for clearer text labels on some guides without expanding the tab display beyond a single row. figure 2 shows a guide containing the altered taller tabs. information technology and libraries | se ptember 2012 58 figure 2. group b guide with tabs redesigned to look more like file folder tabs while variations in content and usage of library guides did not allow for a true control group, other social science subject guides were selected as a comparison group. social science subject guides were excluded from the comparison group if they had very low guide usage during the fall 2010 semester (fewer than thirty uses), or if they had fewer than three tabs, making them structurally dissimilar to the business guides. this left a group of sixteen comparison guides. no changes were made to the navigation design of these guides during the test period. the business guides—which the authors had permission to experiment with—tend to be longer and have more pages than other guides. on average, the experimental guides had more pages per guide than the comparison guides; guides in groups a and b averaged nine pages per guide, and comparison guides averaged five pages per guide. guides with more pages will tend to have a higher percentage of hits on secondary pages because there are more pages available to users. 
however, the authors intended to measure the change in usage patterns with each guide measured against itself in different periods, and the number of pages in each guide did not change from semester to semester. data collection and results libguides provides monthly usage statistics that include the total hits on each guide and the number of hits on each page of a guide. use of secondary pages of the guides was measured by calculating the proportion of hits to each guide that occurred on secondary pages. data for the fall 2010 semester (september through december 2010) was used to measure usage patterns before navigation changes were made to the experimental guides. data for the winter 2011 semester (january through april 2011) was used to measure usage patterns after navigation changes were made. each would represent a full semester’s use at similar enrollment levels with many of the same courses and assignments. usage patterns for the comparison guides were also examined for these periods. improving independent student navigation of complex educational websites | pittsley and memmott 59 as shown in figures 3 and 4, in both group a and group b, the percentage of hits on secondary pages increased in five guides and decreased in two guides. figure 3. group a: change in secondary page usage with content menus added for winter 2011 figure 4. group b: change in secondary page usage with new tab design for winter 2011 both groups of experimental guides showed an increase in use of secondary guide pages after the design changes were made. the median usage score was calculated for each group. group a, with the added menu links, showed an increase of 10.3 points in the median percentage of guide hits on secondary pages. group b, with redesigned tabs, showed an increase of 10.4 points in the median percentage of guide hits on secondary pages. 
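the usage measure described above can be sketched in a few lines of python. this is only an illustration under stated assumptions, not the authors' procedure: the function names and the hit counts are hypothetical, and the sketch assumes each guide's monthly statistics have already been summed into one list of per-page hits with the first page at index 0.

```python
from statistics import median


def secondary_share(page_hits):
    """Return the fraction of a guide's hits that fell on secondary pages.

    page_hits holds hits per page, with the guide's first page at index 0.
    """
    total = sum(page_hits)
    if total == 0:
        raise ValueError("guide has no recorded hits")
    return (total - page_hits[0]) / total


def median_share(guides):
    """Median secondary-page share across a group of guides, as a percentage."""
    return 100 * median(secondary_share(hits) for hits in guides)


# Hypothetical hit counts for three guides (first page, then secondary pages).
fall = [[60, 20, 20], [50, 30, 20], [80, 10, 10]]
winter = [[50, 25, 25], [40, 30, 30], [70, 20, 10]]

# Difference between the two medians, expressed in percentage points.
change_in_points = median_share(winter) - median_share(fall)
print(f"median change: {change_in_points:.1f} percentage points")
```

because each guide is compared with itself across semesters, the differing page counts between groups affect the level of the share but not the direction of the change.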
within the comparison guides, the proportion of hits on secondary pages did not change significantly from fall 2010 to winter 2011. table 1 shows the median percentage of guide hits on secondary pages before and after navigation design changes.

               group a:            group b:           group c:
               menu links added    tabs redesigned    comparison group
fall 2010      39.1%               50.5%              37.7%
winter 2011    49.4%               60.9%              37.4%

table 1. median percentage of guide hits on secondary pages

the box plot in figure 5 graphically illustrates the range of the usage of secondary pages in each group of guides and the changes from fall 2010 to winter 2011, showing the minimum, maximum, and median scores, as well as the range of each quartile.

figure 5. distribution of percentage of guide hits on secondary pages. this figure demonstrates the change in usage pattern for groups a and b and the lack of change in usage pattern for comparison group c.

averages for the percentage change in secondary tab use were also computed for the combined experimental groups and the comparison group.

change in secondary tab use

group           n     mean       std. deviation    std. error mean
experimental    14     .07871    .097840           .026149
comparison      16    -.02550    .145977           .036494

table 2. average change in secondary tab use from fall 2010 to winter 2011, comparing all experimental guides (groups a & b) with all comparison (group c) guides.

when comparing all experimental guides and all comparison guides, the change in use of secondary pages was found to be statistically significant.
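
the comparison in table 2 can be checked from its summary statistics alone. the sketch below assumes a two-sample t test with pooled variance (equal variances assumed); the article does not say which variant was run, so that choice is an assumption, as are the function names. the two-tailed p value is obtained by numerically integrating the student’s t density with simpson’s rule so that only the standard library is needed.

```python
import math


def pooled_t(n1, mean1, sd1, n2, mean2, sd2):
    """Two-sample t statistic and degrees of freedom, pooled variance assumed."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error, df


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t_stat, df, steps=2000):
    """Two-tailed p value via Simpson's rule on [0, |t|]; steps must be even."""
    upper = abs(t_stat)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * t_pdf(k * h, df)
    area = total * h / 3          # probability mass between 0 and |t|
    return 1 - 2 * area           # mass in both tails


# Values copied from table 2 (experimental vs. comparison guides).
t_stat, df = pooled_t(14, 0.07871, 0.097840, 16, -0.02550, 0.145977)
p_value = two_tailed_p(t_stat, df)
print(f"t = {t_stat:.3f}, df = {df}, two-tailed p = {p_value:.3f}")
```

under the pooled-variance assumption, the result should land close to the p value the authors report and below the .05 threshold.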
the average change in use of secondary pages for all experimental guides (groups a and b) was .07871, and the average for all comparison guides (group c) was -.02550. a t test showed that this difference was significant at the p < .05 level (p = .032).

study limitations

in some (possibly many) cases, the first page of the guide provides all necessary sources and advice for an assignment. we measured actual use of secondary pages, but were unable to measure recognition of navigation elements where the student did not use the secondary pages because they had no need for additional resources. because it wasn’t possible to control use of the guides during the periods studied, it is possible that factors other than the design changes contributed to the pattern of hits. though subject guides rather than class guides were used to limit the influence of instruction in the use of guides, it wasn’t possible to determine with certainty if other faculty members instructed a significant number of students in the use of particular guides during the periods examined. the comparison guides were slightly dissimilar in that they had fewer pages than the experimental guides; however, the number of pages on a guide did not correlate with a change in percentage of hits on secondary pages from one semester to the next.

application of findings

when presented with the study results, the full library faculty at emu expressed interest in using both design changes across all library research guides. the change to tab design, which is easiest to implement, has been made to all subject guides. some librarians also chose to add content menus to selected guides. since the complexity of research guides is also a factor in successful navigation,35 a recent libguides enhancement was used to move elements from the header area to the bottom of the guides. the elements moved out of the header included the date of last update, guide url, print option, and rss updates.
the investigators hypothesize that the reduced complexity of the header may help in recognizing the tab navigation. although convinced that the experimental changes made a difference to independent student navigation in research guides, the authors hope to find further ways to strengthen independent navigation. vendor design changes to enhance the tab metaphor, such as creating a more visible connection between the active tab and page, might also improve navigation.36

conclusion

designing navigation for complex sites, such as library research guides, is likely to be an ongoing challenge. this study suggests that thoughtful design changes can improve navigation. in this case, both duplicate menu links and improvements to tab design improved independent student navigation of complex research sites.

references and notes

1. eric a. cooper, “library guides on the web: traditional tenets and internal issues,” computers in libraries 17, no. 9 (1997): 52.
2. barbara dubicki beaton et al., libguides usability task force guerrilla testing (ann arbor: university of michigan, 2009), http://www.lib.umich.edu/content/libguides-guerilla-testing.
3. jenny corbin and sharon karasmanis, health sciences information literacy modules usability testing report (bundoora, australia: la trobe university library, 2009), http://arrow.latrobe.edu.au:8080/vital/access/handleresolver/1959.9/80852.
4. rachel hungerford, lauren ray, christine tawatao, and jennifer ward, libguides usability testing: customizing a product to work for your users (seattle: university of washington libraries, 2010), 6, http://hdl.handle.net/1773/17101.
5. mit libraries, research guides (libguides) usability results (cambridge, ma: mit libraries, 2008), http://libstaff.mit.edu/usability/2008/libguides-summary.html; mit libraries, guidelines for staff libguides (cambridge, ma: mit libraries, 2011), http://libguides.mit.edu/staff-guidelines.
6. jonathan w.
palmer, “web site usability, design, and performance metrics,” information systems research 13, no. 2 (2002): 151–67, doi:10.1287/isre.13.2.151.88.
7. jakob nielsen, “is navigation useful?,” jakob nielsen’s alertbox, http://www.useit.com/alertbox/20000109.html.
8. thomas s. tullis, “web-based presentation of information: the top ten mistakes and why they are mistakes,” in hci international 2005 conference: 11th international conference on human-computer interaction, 22–27 july 2005, caesars palace, las vegas, nevada usa (mahwah nj: lawrence erlbaum associates, 2005), doi:10.1.1.107.9769.
9. louise mcgillis and elaine g. toms, “usability of the academic library web site: implications for design,” college & research libraries 62, no. 4 (2001): 355–67, http://crl.acrl.org/content/62/4/355.short.
10. erica olmsted-hawala et al., usability evaluation of the business and industry web site, survey methodology #2009–15, (washington, dc: statistical research division, u.s. census bureau, 2009), http://www.census.gov/srd/papers/pdf/ssm2009–15.pdf.
11. jennifer chen et al., usability evaluation of the governments division public web site, survey methodology #2010–02, (washington, dc: u.s. census bureau, usability laboratory, 2010), 19, http://www.census.gov/srd/papers/pdf/ssm2010-02.pdf.
12. jan panero benway, “banner blindness: what searching users notice and do not notice on the world wide web” (phd diss., rice university, 1999), 75, http://hdl.handle.net/1911/19353.
13. tullis, “web-based presentation of information.”
14. donald j.
nicolson et al., “combining concurrent and sequential methods to examine the usability and readability of websites with information about medicines,” journal of mixed methods research 5, no. 1 (2011): 25–51, doi:10.1177/1558689810385694.
15. kathryn yelinek et al., “using libguides for an information literacy tutorial 2.0,” college & research libraries news 71, no. 7 (july): 352–55, http://crln.acrl.org/content/71/7/352.short.
16. michael l. bernard, “developing schemas for the location of common web objects,” proceedings of the human factors and ergonomics society annual meeting 45, no. 15 (october 1, 2001): 1162, doi:10.1177/154193120104501502.
17. jean a. pratt, robert j. mills, and yongseog kim, “the effects of navigational orientation and user experience on user task efficiency and frustration levels,” journal of computer information systems 44, no. 4 (2004): 93–100.
18. john d. mccarthy, m. angela sasse, and jens riegelsberger, “could i have the menu please? an eye tracking study of design conventions,” people and computers 17, no. 1 (2004): 401–14.
19. scott l. jones, “evolution of corporate homepages: 1996 to 2006,” journal of business communication 44, no. 3 (2007): 236–57, doi:10.1177/0021943607301348.
20. lynne cooke, “how do users search web home pages?” technical communication 55, no. 2 (2008): 185.
21. luke wroblewski, “the history of amazon’s tab navigation,” lukew ideation + design, may 7, 2007, http://www.lukew.com/ff/entry.asp?178. after addition of numerous product categories made tabs impractical, amazon now relies on a left-side navigation menu.
22. a. burrell and a. c. sodan, “web interface navigation design: which style of navigation-link menus do users prefer?” in 22nd international conference on data engineering workshops, april 2006. proceedings (washington, d.c.: ieee computer society, 2006), 42–42, doi:10.1109/icdew.2006.163.
23. steve krug, don’t make me think! a common sense approach to web usability, 2nd ed.
(berkeley: new riders, 2006), 79.
24. ibid., 82.
25. u.s. department of health and human services, “navigation,” in research-based web design & usability guidelines (washington, dc: u.s. department of health and human services, 2006), 8, http://www.usability.gov/pdfs/chapter7.pdf.
26. jakob nielsen, “tabs, used right,” jakob nielsen’s alertbox, http://www.useit.com/alertbox/tabs.html.
27. matt cronin, “showcase of well-designed tabbed navigation,” smashing magazine, april 6, 2009, http://www.smashingmagazine.com/2009/04/06/showcase-of-well-designed-tabbed-navigation.
28. alex christie, “usability best practice, part 1—tab navigation,” tamar, january 13, 2010, http://blog.tamar.com/2010/01/usability-best-practice-part-1-tab-navigation.
29. mccarthy, sasse, and riegelsberger, “could i have the menu please?”
30. jennifer j. little, “cognitive load theory and library research guides,” internet reference services quarterly 15, no. 1 (2010): 52–63, doi:10.1080/10875300903530199.
31. hungerford et al., libguides usability testing.
32. christie, “usability best practice”; nielsen, “tabs, used right”; krug, don’t make me think; cronin, “showcase of well-designed tabbed navigation.”
33. christie, “usability best practice”; nielsen, “tabs, used right.”
34. eva d. vaughan, statistics: tools for understanding data in the behavioral sciences (upper saddle river, nj: prentice hall, 1998), 66.
35. mccarthy, sasse, and riegelsberger, “could i have the menu please?”
36. springshare, the libguides vendor, has been amenable to customer feedback and open to suggestions for platform improvements.
autocomplete as a research tool: a study on providing search suggestions

david ward, jim hahn, and kirsten feist

information technology and libraries | december 2012 6

abstract

as the library website and its online searching tools become the primary “branch” many users visit for their research, methods for providing automated, context-sensitive research assistance need to be developed to guide unmediated searching toward the most relevant results. this study examines one such method, the use of autocompletion in search interfaces, by conducting usability tests on its use in typical academic research scenarios. the study reports notable findings on user preference for autocomplete features and suggests best practices for their implementation.

introduction

autocompletion, a searching feature that offers suggestions for search terms as a user types text in a search box (see figure 1), has become ubiquitous on both larger search engines and smaller, individual sites. debuting as the “google suggest” feature in 2004,1 autocomplete has made inroads into the library realm through inclusion in vendor search interfaces, including the most recent proquest interface and ebsco products. as this feature expands its presence in the library realm, it is important to understand how patrons include it in their workflow and the implications for library site design as well as for reference, instruction, and other library services.

an analysis of search logs from our library federated searching tool reveals both common errors in how search queries are entered and patterns in the use of library search tools.
for example, spelling suggestions are offered for more than 29 percent of all searches, and more than half (51 percent) of all searches appear to be for known items.2 additionally, punctuation such as commas and a variety of correct and incorrect uses of boolean operators are prevalent. these patterns suggest that providing some form of guidance in keyword selection at the point of search-term entry could improve the accuracy of composing searches and subsequently the relevance of search results.

this study investigates student use of an autocompletion implementation on the initial search entry box for a library’s primary federated searching feature. through usability studies, the authors analyzed how and when students use autocompletion as part of typical library research, asked the students to assess the value and role of autocompletion in the research process, and noted any drawbacks of implementing the feature. additionally, the study sought to analyze how implementing autocompletion on the front end of a search affected providing search suggestions on the back end (search result pages).

david ward (dh-ward@illinois.edu) is reference services librarian, jim hahn (jimhahn@illinois.edu) is orientation services and environments librarian, undergraduate library, university of illinois at urbana-champaign. kirsten feist (kmfeist@uh.edu) is library instruction fellow, m.d. anderson library, university of houston.

figure 1. autocomplete implementation

literature review

autocomplete as a plug-in has become ubiquitous on site searches large and small. research on autocomplete includes a variety of technical terms that refer to systems using this architecture. examples include real time query expansion (rtqe), interactive query expansion, search-as-you-type (sayt), query completion, type-ahead search, auto-suggest, and suggestive searching/search suggestions.
the principal research concerns for autocomplete include issues related to both back-end architecture and assessments of user satisfaction and systems for specific implementations. nandi and jagadish present a detailed system architecture model for their implementation of autocomplete, which highlights many of the concerns and desirable features of constructing an index that the autocomplete will query against.3 they note in particular that the quality of suggestions presented to the user must be high to compensate for the user interface distraction of having suggestions appear as a user types. this concern is echoed by hanmin et al. in their analysis of how the results offered by their autocomplete implementation met user expectations.4 their findings emphasize configuring systems to display only keywords that bring about successful searches, noting “precision [of suggested terms] is closely related with satisfaction.” an additional analysis of their implementation also noted that suggesting search facets (or “entity types”) is a way to enhance autocomplete implementations and aid users in selecting suitable keywords for their search.5

wu also suggests using facets to help group suggestions by type, which improves comprehension of a list of possible keyword combinations.6 in defining important design characteristics for autocomplete implementations, wu advocates building in a tolerance for misplaced keywords as a critical component. chaudhuri and kaushik examine possible algorithms to use in building this type of tolerance into search systems.
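
one generic way to build in tolerance for spelling errors is to rank candidate terms by the edit distance between what the user has typed so far and the closest prefix of each term. the sketch below illustrates that idea only; it is not the algorithm of chaudhuri and kaushik or of wu, and the term list, function names, and distance threshold are hypothetical.

```python
def prefix_edit_distance(query, candidate):
    """Smallest Levenshtein distance between query and any prefix of candidate."""
    # previous[j] holds the distance between the query read so far and candidate[:j].
    previous = list(range(len(candidate) + 1))
    for i, q_char in enumerate(query, start=1):
        current = [i]
        for j, c_char in enumerate(candidate, start=1):
            cost = 0 if q_char == c_char else 1
            current.append(min(
                previous[j] + 1,         # drop q_char from the query
                current[j - 1] + 1,      # add c_char to the query
                previous[j - 1] + cost,  # match or substitute
            ))
        previous = current
    # The best-matching prefix may end anywhere in the candidate.
    return min(previous)


def tolerant_suggestions(query, terms, max_distance=1):
    """Terms whose best prefix is within max_distance edits, closest first."""
    scored = [(prefix_edit_distance(query.lower(), term.lower()), term)
              for term in terms]
    return [term for distance, term in sorted(scored) if distance <= max_distance]


# Hypothetical index of suggestion terms.
TERMS = ["chromatography", "chronology", "cryptography"]
print(tolerant_suggestions("cromat", TERMS))
```

comparing against prefixes rather than whole terms matters because the user is still typing: a partial, misspelled entry should match the start of a long term without being penalized for the letters not yet entered.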
misplaced keywords include typing terms in the wrong field (e.g., an author name in a title field), as well as spelling and word order errors.7 systems that are tolerant in this manner “should enumerate all the possible interpretations and then sort them according to their possibilities,” a specification wu refers to as “interpret-as-you-type.”8 additionally, both wu and nandi and jagadish specify fast response time (or synchronization speed) as a key usability feature in autocomplete interfaces, with nandi and jagadish indicating 100ms as a maximum.9,10 speed also is a concern in mobile applications, which is part of the reason paek et al. recommend autocomplete as part of mobile search interfaces, in which reducing keystrokes is a key usability feature.11

on the usability end, white and marchionini12 assess best practices for implementation of search-term-suggestion systems and users’ perceptions of the quality of suggestions and search results retrieved. they find that offering keyword suggestions before the first set of results has been displayed generated more use of the suggestions than displaying them as part of a results page, even though the same terms were displayed in both cases. providing suggestions at this initial stage also led to better-quality initial queries, particularly in cases where users may have little knowledge of the topic for which they are searching. the researchers also warn that, while presenting “query expansion terms before searchers have seen any search results has the potential to speed up their searching . . . it can also lead them down incorrect search paths.”13

method

usability study

we conducted two rounds of usability testing on a version of university of illinois at urbana-champaign’s undergraduate library website that contained a search box for the library’s federated/broadcast search tool with autocomplete built in.
the testing followed nielsen’s guidelines, using a minimum of five students for each round, with iterative changes to the interface made between rounds based on feedback from the first group.14 we conducted the initial round in summer 2011 with five library undergraduate student workers. the second round was conducted in september 2011 and included eight current undergraduate students with no affiliation to the library. by design, this method does not allow us to state definitive trends for all autocomplete implementations. it is not a statistically significant method by quantitative standards; rather, it gives us a rich set of qualitative data about the particular implementation (easy search) and specific interface (undergrad library homepage) being studied.

the study’s questions were approved by the campus institutional review board (irb), and each participant signed an irb waiver before participating. students for the september round were recruited via advertisements on the website and flyers in the library. gift certificates to a local coffee shop provided the incentive for the study.

the procedure for each interview focused on two steps (see appendix). first, each participant was asked to use the search tool to perform a series of common research tasks, including three queries for known item searches (locating a specific book, journal, and movie), and two searches that asked the student to recall and describe a current or previous semester’s subject-based search, then use the search interface to find materials on that topic. participants were asked to follow a speak-aloud protocol, dictating the decision-making process they went through as they conducted their search, including noting why they made each choice that they made along the way. researchers observed and took notes, including transcribing user comments and noting mouse movements, clicks, and other choices made during the searches.
because part of the hypothesis of the study was that the autocomplete feature would be used as an aid for spelling search queries correctly, titles with possibly challenging spelling were chosen for the known item searches. participants were not told about or instructed in the use of autocomplete; rather, it was left to each of them to discover it and individually decide whether to use it during each of the searches they conducted as a part of the study.

in the second part of the interview, researchers asked students questions about their use (or lack thereof) of the autocomplete feature during the initial set of task-based questions. this set of questions focused on identifying when students felt the autocomplete feature was helpful as part of the search process, why they used it when they did, and why they did not use it in other cases. students also were asked more general questions about ways to improve the implementation of the feature. in the second round of testing (with students from the general campus populace), an additional set of questions was asked to gather student demographic information and to have the participants assess the quality of the choices the autocomplete feature presented them with. these questions were based in part on the work of white and marchionini, who had study participants conduct a similar quality analysis.15

autocomplete implementation

the autocomplete feature was written in javascript and based on the jquery autocomplete plugin (http://code.google.com/p/jquery-autocomplete/). autocomplete plugins generally pull results either from a set of previous searches on a site or from a set of known products and pages within a site. for the study, the initial dataset used was a list of thousands of previous searches using the library’s easy search federated search tool. however, this data proved to be extremely messy and slow to search.
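
to make the local-data approach concrete, the sketch below draws suggestions from a log of previous searches by prefix match and ranks them by how often each query was entered. it is an illustration only, written in python rather than the javascript used in the study; the log contents and function names are hypothetical and do not reflect the easy search data or the plugin’s internals.

```python
from collections import Counter


def build_counts(previous_searches):
    """Count normalized (trimmed, lowercased) queries from a search log."""
    return Counter(q.strip().lower() for q in previous_searches if q.strip())


def suggest(prefix, counts, limit=5):
    """Most frequent logged queries that start with the typed prefix."""
    prefix = prefix.strip().lower()
    if not prefix:
        return []
    matches = [(query, n) for query, n in counts.items() if query.startswith(prefix)]
    # Most frequent first; ties broken alphabetically for stable output.
    matches.sort(key=lambda item: (-item[1], item[0]))
    return [query for query, _ in matches[:limit]]


# Hypothetical log of previous searches.
LOG = [
    "climate change", "Climate Change", "climate policy",
    "journal of chromatography", "clinical trials", "climate change ",
]
COUNTS = build_counts(LOG)
print(suggest("cli", COUNTS))
```

a linear scan like this is acceptable for a small list, but it takes whatever is in the log at face value, so misspellings and pasted citations are suggested just as readily as clean queries.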
in particular, a high number of problematic searches were in the data, including entire citations pasted in, misspelled words, and long natural-language strings. constructing an algorithm to clean up and make sense of these difficult queries would have required too much time and overhead, so we investigated other sources. researchers looked at autocomplete apis for both bing (http://api.bing.com/osjson.aspx?query=test) and google (the suggest toolbar api: http://google.com/complete/search?output=toolbar&q=test). both worked well and produced similar relevant results for the test searches. significantly, the search algorithms behind each of these apis were able to process the search query into far more meaningful and relevant results than what was achieved through the test implementation using local data. these algorithms also corrected misspelled words entered by users by presenting correctly spelled results in the dropdown list. we ultimately chose the google api on the basis of its xml output.

findings

the study’s findings were consistent across both rounds of usability testing. notable themes include using autocomplete to correct spelling on known-item searches (specific titles, authors, etc.), to build student confidence with an unfamiliar topic, to speed up the search process, to focus broad searches, and to augment search-term vocabulary. the study also details important student perceptions about autocomplete that can guide the implementation process in both library systems and instructional scenarios. these student perceptions include themes of autocomplete’s popularity, desire for local resource suggestions, various cosmetic page changes, and user perception of the value of autocomplete to their peers.

spelling

“it definitely helps with spelling,” said one student, responding to a prompt of how they would explain the autocomplete feature to friends.
correcting search-term spelling is a key way in which students chose to make use of the autocomplete feature. for known-item searches, all eight students in the second round of testing selected suggestions from autocomplete at least two times out of the three searches conducted. of those eight students, four (50 percent) used autocomplete every time (three out of three opportunities), and four (50 percent) used it 67 percent of the time (two out of three opportunities). of this latter group, who selected autocomplete suggestions in only two of the three opportunities presented, three did in fact refer to the dropdown selections when typing their inquiries, but did not actively select these suggestions from the dropdown all three times. in choosing to use autocomplete for spelling correction, one student noted that autocomplete was helpful “if you have an idea of a word but not how it’s spelled.”

it is interesting to note, with regard to clicking on the correct spellings, that students do not always realize they are choosing a different spelling than what they had started typing. an example is the search for journal of chromatography, which some students started spelling as “journal of chormo,” then picked the correct spelling (starting “chroma”) from the list, without apparently realizing it was different.

this is an important theme: if a student does not have an accurate spelling from which to begin, the search might fail, or the student will assume the library does not have any information on the chosen topic. this is particularly true in many current library catalog interfaces, which do not provide spelling suggestions on their search result pages.

locating known items

another significant use of the autocomplete feature was in cases where students were looking for a specific item but had only a partial citation.
in one case, a student used autocomplete to find a specific course text by typing in the general topic (e.g., “africa”) and then an author’s name that the course instructor had recommended. the google implementation did an excellent job of combining these pieces of information into a list of actual book titles from which to choose. this finding also echoes those of white and marchionini, who note that autocomplete “improved the quality of initial queries for both known item and exploratory tasks.”16 the study also found this to be an important finding because overall, students are looking for valid starting points in their research (see “confidence” below), and autocomplete was found to be one way to support finding instructor-approved items in the library. this echoes findings from project information literacy, which shows students typically turn to instructor-sanctioned materials first when beginning research.17 this use case typically arises when an instructor suggests an author or seminal text on a research topic to a student, often with an incomplete or inaccurate title. one participant also mentioned that they wanted the autocomplete feature to suggest primary or respected authors based on the topic they entered.

confidence

“[autocomplete is] an assurance that it [the research topic] is out there . . . you’re not the first person to look for it.” (student participant)

there were multiple themes related to the concept of user confidence discovered in the study. first, some participants noted that when they see the suggestions provided by autocomplete it verifies that what they are searching is “real,” validating their research idea and giving them the sense that others have been successful previously searching for their topic. when students were asked the source of the autocomplete suggestions, most thought that results were generated based on previous user searches.
their response to this particular question highlighted the notion of “popularity ranking,” in that many were confident that the suggestions presented were a result of popular local queries. in addition, one participant thought that results generated were based on synonyms of the word they typed, while another believed that the results generated were included only if the text typed matched descriptions of materials or topics currently present in the library’s databases. some students did indicate the similarity of search results to google suggestions, but they did not make an exact connection between the two. this assumption that the terms are vetted seems to lend authority to the suggestions themselves and parallels the research of jung et al., who investigated satisfaction based on the connection between user expectations on selecting an autocomplete keyword and results.18 the benefit of autocomplete-provided suggestions in this context was noted even in cases when participants did not explicitly select items from the autocomplete list.

students’ confidence in their own knowledge of a topic also factored into when they used autocomplete. participants reported that if they knew a topic well (particularly if the topic chosen was one that they had previously completed a paper on), it was faster to just type it in without choosing a suggestion from the autocomplete suggestion list. one participant also noted that common topics (e.g., “someone’s name and biography”) would also be cases in which they would not use the suggestions.

after the first round of usability testing, a question was added to the post-test assessment asking students to rate their confidence as a researcher on a five-point scale. all participants in the second round rated themselves as a four or five out of five.
while this confirms findings on student confidence from studies like project information literacy, this assessment question ultimately had no correlation to actual use of autocomplete suggestions during the subject-based research phase of the study. rather, confidence in the topic itself seemed to be the defining factor in use.

speed

the study also showed that speed is a factor in deciding when to use autocomplete functionality. specifically, autocomplete should be implemented in a way that is not perceived as slowing down the search process. this includes displaying results in a way that is easily ignored if students want to type in an entire search phrase themselves, and presenting search suggestions so that they are easy to read and quick to select. autocomplete is perceived as a time-saver when clicking on an item will shorten the amount of typing students need to do. however, some students will ignore autocomplete altogether; they do this when they know what they want, and they feel that speed is compromised if they need to stop and look at the suggestions when they already know what they want to search. in the study, different participants would often cite speed as a reason for both selecting and not selecting an item for the same question, particularly with the known-item searches. this finding indicates that a successful implementation should include both a speedy response (as noted above in nandi and jagadish’s research on delivering suggestions within 100ms, paek et al.’s research on reducing keystrokes, and white and marchionini’s finding that providing suggested words was “a real time-saver”),19 as well as an interface which does not force users to select an item to proceed, or obscure the typing of a search query.
focusing topics

“it helps to complete a thought.”
“[autocomplete is] extra brainstorming, but from the computer.”
— participant responses

the above quotes indicate the use of autocomplete as a tool for query formulation and search term identification, a function closely related to the association of college and research libraries (acrl) information literacy standard two, which includes competencies for selecting appropriate search keywords and controlled vocabulary related to a topic.20 these quotes also parallel a similar finding from white and marchionini,21 who had a user comment that autocomplete “offered words (paths) to go down that i might not have thought of on my own.” the use of autocomplete for scoping and refining a topic also parallels elements of the reference interview, specifically the open and closed questions typically asked to help a student define what aspects of a topic they are interested in researching. this finding has many exciting implications for how elements and best practices from both classroom instruction and reference methodologies can be injected directly into search interfaces, to aid students who may not consult with a librarian directly during the course of their research.

autocomplete was used at a lower rate, and in different ways, for subject searching compared to known-item searching. three out of eight participants (38 percent) from the second round of testing did not use autocomplete at all for subject-based searching (zero of two opportunities). five out of eight participants (62 percent) used autocomplete on one of two search opportunities (50 percent). no participants used autocomplete on both of the search opportunities. the stage of research a student was in helped to indicate where and how autocomplete could be useful in topic formulation and search-term selection for subject searches.
participants indicated that they would use autocomplete for narrowing ideas if they were at a later stage in a paper, when they knew more about what they wanted or needed specifics on their topic. however, early in a paper, some participants indicated they just wanted broad information and did not want to narrow possible results too early. this finding also supports previous research from project information literacy, which describes student desire to learn the “big-picture context” as a key function in the early part of the research process.22 at this topic-focusing stage, some participants told us that the search suggestions reminded them of topics that were discussed in class. further, the study showed that autocomplete suggests aspects of topics to students that they had not previously considered, and one participant indicated that she might change her topic if she saw something interesting from the list of suggestions, particularly something she had not thought of yet.

interface implementation

though students who opted to utilize the autocomplete feature were generally satisfied with the results generated, some students recommended increasing the number of autocomplete suggestions in the dropdown menu to increase the probability of finding their desired topic or known item or to potentially lead to other related topics to narrow their search. in addition, students recommended increasing the width of the autocomplete text box, as its present proportions are insufficient for displaying longer suggestions without text wrapping. some students also noted that increasing the height of the dropdown menu containing the autocomplete suggestions might help reduce the necessity to scroll through the results and may help to draw user attention to all results for those who elect not to use the scroll bar. beyond the suggested improvements for the functionality of the autocomplete feature, students also noted a few cosmetic changes they would like to see implemented.
in particular, students would prefer to see larger text and a better use of fonts and font colors when using autocomplete. one student noted that if different fonts and colors were used in this feature, the results generated might stand out more and better attract users, or better draw users’ attention to the recommended search terms.

perceived value to peers

most students who participated in the study stated that they would recommend that their fellow classmates utilize the autocomplete feature for two primary purposes: known-item searches and locating alternative options for research topics. one student noted that she would recommend using this feature to search keywords “easily and efficiently,” while another student indicated that the feature helps to link to other related keywords. this finding also revealed that users were not intimidated by the feature and did not see it as a distraction from the search process, an initial researcher concern.

conclusion and future directions

implementation implications

implementing autocomplete functionality that accounts for the observed research tendencies and preferences of users makes for a compelling search experience. participant selection of autocomplete suggestions varied between the types of searches studied. spelling correction was the one universally acknowledged use. for subject-based searching, confidence in the topic searched and the stage of research emerged as indicators of the likelihood of autocomplete suggestions being taken. the use and effectiveness of providing subject suggestions requires further study, however. students expect suggestions to produce usable results within a library’s collections, so the source of the suggestions should incorporate known, viable subject taxonomies to maximize benefits and not lead students down false search paths.
there is an ongoing need to investigate possible search-term dictionaries outside of google, such as lists of library holdings, journal titles, article titles, and controlled vocabulary from key library databases. the “brainstorming” aspect of autocomplete for subject searching is an intriguing benefit that should be more fully explored and supported. in combination with these findings, participants’ positive responses to some of the assessment questions (including first impressions of autocomplete and willingness to recommend it to friends) indicate that autocomplete is a viable tool to incorporate site-wide into library search interfaces.

instruction implications

traditional academic library instruction tends to focus on thinking of all possible search terms, synonyms, and alternative phrasing before the onset of actual searching and engagement with research interfaces. this process is later refined in the classroom by examining controlled vocabulary within a set of search results. however, observations from this study (as well as researcher experience with users at the reference desk) indicate that students in real-world situations often skip this step and rely on a more trial-and-error method for choosing search terms, beginning with one concept or phrasing rather than creating a list of options that they try sequentially. the implication for classroom practice is that instruction on search-term formulation should include a review of autocomplete suggestions as well as practical methods for integrating these suggestions into the research process. this is particularly important as vendor databases
proper instruction in its use can help advance acrl information literacy goals and provide a practical, context-sensitive way to explain how a varied vocabulary is important for achieving relevant results in a research setting.23 reference implications as with classroom instruction, traditional reference practice emphasizes a prescriptive path for research that involves analyzing which aspects of a topic or alternate vocabulary will be most relevant to a search before search-term entry. open and closed questioning techniques encourage users to think about different facets of their topic, such as time period, location, and type of information (e.g., statistics) that might be relevant. an enhanced implementation of autocomplete can incorporate these best practices from the reference interview into the list of suggestions to aid unmediated searching. one way this might be incorporated is through presenting faceted results that change on the basis of user selection of the type and nature of information they are looking for, such as a time period, format, or subject. for broadcast and federated searching interfaces, this could extend into the results users are then presented with, specifically attempting to use items or databases on the basis of suggestions made during the search entry phase, rather than presenting users with a multitude of options for users to make sense of, some of which may be irrelevant to the actual information need. finally, the findings on use of autocomplete also have implications for search-results pages. many of the common uses (e.g., spelling suggestions and additional search-term suggestion) also should be standard on results pages. this, too, is a common feature of commercial interfaces. bing, for example, includes a related searches feature (on the left of a standard results page), that suggests context-specific search terms based on the query. this feature is also part of their api (http://www.bing.com/developers/s/apibasics.html). 
providing these reference-without-a-librarian features is essential both in establishing user confidence in library research tools and in developing research skills and an understanding of the information literacy concepts necessary to becoming better researchers. our autocomplete use findings draw attention to user needs and library support across search processes; specifically, autocomplete functionality offers support while forming search queries and can improve the results of user searching. for this reason, we recommend that autocomplete functionality be investigated for implementation across all library interfaces and websites to provide unified support for user searches. the benefits that can be realized from autocomplete can be maximized by consulting with reference and instruction personnel on the benefits noted above and collaboratively devising best practices for integrating autocomplete results into search-strategy formulation and classroom-teaching workflows.

references

1. “autocomplete—web search help,” google, support.google.com/websearch/bin/answer.py?hl=en&answer=106230 (accessed february 7, 2012).
2. william mischo, internal use study, unpublished, 2011.
3. arnab nandi and h. v. jagadish, “assisted querying using instant-response interfaces,” in proceedings of the 2007 acm sigmod international conference on management of data (new york: acm, 2007), 1156–58, doi: 10.1145/1247480.1247640.
4. hanmin jung et al., “comparative evaluation of reliabilities on semantic search functions: auto-complete and entity-centric unified search,” in proceedings of the 5th international conference on active media technology (berlin, heidelberg: springer-verlag, 2009), 104–13, doi: 10.1007/978-3-642-04875-3_15.
5.
hanmin jung et al., “auto-complete for improving reliability on semantic web service framework,” in proceedings of the symposium on human interface 2009 on human interface and the management of information. information and interaction. part ii: held as part of hci international 2009 (berlin, heidelberg: springer-verlag, 2009), 36–44, doi: 10.1007/978-3-642-02559-4_5.
6. hao wu, “search-as-you-type in forms: leveraging the usability and the functionality of search paradigm in relational databases,” vldb 2010, 36th international conference on very large data bases, september 13–17, 2010, singapore, p. 36–41, www.vldb2010.org/proceedings/files/vldb_2010_workshop/phd_workshop_2010/phd%20workshop/content/p7.pdf (accessed february 7, 2012).
7. surajit chaudhuri and raghav kaushik, “extending autocompletion to tolerate errors,” in proceedings of the 35th sigmod international conference on management of data (new york: acm, 2009), 707–18, doi: 10.1145/1559845.1559919.
8. wu, “search-as-you-type in forms,” 38.
9. wu, “search-as-you-type in forms.”
10. ibid.
11. tim paek, bongshin lee, and bo thiesson, “designing phrase builder: a mobile real-time query expansion interface,” in proceedings of the 11th international conference on human-computer interaction with mobile devices and services (new york: acm, 2009), 7:1–7:10, doi: 10.1145/1613858.1613868.
12. ryen w. white and gary marchionini, “examining the effectiveness of real-time query expansion,” information processing and management 43, no. 3 (2007): 685–704, doi: 10.1016/j.ipm.2006.06.005.
13. white and marchionini, “examining the effectiveness of real-time query expansion,” 701.
14.
jakob nielsen, “why you only need to test with 5 users,” jakob nielsen’s alertbox (blog), march 19, 2000, www.useit.com/alertbox/20000319.html (accessed february 7, 2012). see also walter apai, “interview with web usability guru, jakob nielsen,” webdesigner depot (blog), september 28, 2009, www.webdesignerdepot.com/2009/09/interview-with-web-usability-guru-jakob-nielsen/ (accessed february 7, 2012).
15. white and marchionini, “examining the effectiveness of real-time query expansion.”
16. ibid.
17. alison j. head and michael b. eisenberg, “lessons learned: how college students seek information in the digital age,” project information literacy progress report, december 1, 2009, projectinfolit.org/pdfs/pil_fall2009_finalv_yr1_12_2009v2.pdf (accessed february 7, 2012).
18. jung et al., “comparative evaluation of reliabilities on semantic search functions.”
19. jung et al., “comparative evaluation of reliabilities on semantic search functions”; paek, lee, and thiesson, “designing phrase builder”; white and marchionini, “examining the effectiveness of real-time query expansion.”
20. association of college and research libraries (acrl), “information literacy competency standards for higher education,” http://www.ala.org/acrl/standards/informationliteracycompetency (accessed february 7, 2012).
21. white and marchionini, “examining the effectiveness of real-time query expansion.”
22. head and eisenberg, “lessons learned.”
23. association of college and research libraries (acrl), “information literacy competency standards for higher education.”

appendix.
questions

task-based questions
1. does the library have a copy of “the epic of gilgamesh?”
2. does the library own the movie “battleship potempkin?”
3. does the library own the journal/article “journal of chromatography?”
4. for this part, we would like you to imagine you are doing research for a recent paper, either one you have already completed or one you are currently working on.
   a. what is this paper about? (what is your research question?)
   b. what class is it for?
   c. search for an article on yyy
5. same as 4, but different class/topic, and search for a book on yyy

autocomplete-specific questions
1. what is your first impression of the autocomplete feature?
2. have you seen this feature before?
   a. if so where have you used it?
3. why did you/did you not use the suggested words? (words in the dropdown)
4. where do you think the suggestions are coming from? or, how are they being chosen?
5. when would you use this?
6. when would you not use it?
7. how can it be improved?
8. overall, what do you like/not like about this option?
9. would you suggest this feature to a friend?
10. if you were to explain this feature to a friend how might you explain it to them?

assessment and demographic questions

autocomplete feature
1. [known item] rate the quality/appropriateness of each of the first five autocomplete dropdown suggestions for your search: (5 point scale) 1—poor quality/not appropriate; 2—low quality; 3—acceptable; 4—good quality; 5—high quality/very appropriate
2. [subject/topic search] rate the quality/appropriateness of each of the first five autocomplete dropdown suggestions for your search: (5 point scale) 1—poor quality/not appropriate; 2—low quality; 3—acceptable; 4—good quality; 5—high quality/very appropriate
3.
please indicate how strongly you agree or disagree with the following statement: “the autocomplete feature is useful for narrowing down a research topic.” (5 point scale): 1—strongly disagree; 2—disagree; 3—undecided; 4—agree; 5—strongly agree

demographics
1. please indicate your current class status
   a. freshman
   b. sophomore
   c. junior
   d. senior
2. what is your declared or anticipated major?
3. have you had a librarian come talk to one of your classes or give an instruction session in one of your classes? if yes, which class(es)?
4. please rate your overall confidence level when beginning research for classes that require library resources for a paper or assignment. (5 point scale): 1—no confidence; 2—low confidence; 3—reasonable confidence; 4—high confidence; 5—very high confidence
5. what factors influence your confidence level when beginning research for classes that require library resources for a paper or assignment?

this study focuses on the adoption and use of wireless technology by medium-sized academic libraries, based on responses from eighty-eight institutions. results indicate that wireless networks are already available in many medium-sized academic libraries and that respondents from these institutions feel this technology is beneficial.

wireless networking offers a way to meet the needs of an increasingly mobile, tech-savvy student population. while many research libraries offer wireless access to their patrons, academic libraries serving smaller populations must heavily weigh both the potential benefits and disadvantages of this new technology. will wireless networks become essential components of the modern academic library, or is this new technology just a passing fad?
prompted by plans to implement a wireless network at the houston cole library (hcl) (jacksonville state university’s [jsu’s] library), which serves a student enrollment close to ten thousand, this study was conducted to gather information about whether libraries similar in size and mission to hcl have adopted wireless technology. the study also sought to find out what, if any, problems other libraries have encountered with wireless networks and how successful they have perceived those networks to be. other questions addressed include level of technical support offered, planning, type of equipment used to access the network, and patron-use levels.

review of literature

a review of the literature on wireless networks revealed a number of articles on wireless networks and checkout programs for laptop computers at large research institutions. seventy percent of major research libraries surveyed by kwon and soules in 2003 offered some degree of wireless access to their networks.1 no articles, however, specifically addressed the use of wireless networks in medium-sized academic libraries. many articles can also be found on wireless-network use in medical libraries and other institutions. library instruction using wireless classrooms and laptops has been another subject of inquiry as well. breeding wrote that there are a number of successful uses for wireless technology in libraries, and a wireless local area network (wlan) can be a natural extension of existing networks. he added that since it is sometimes difficult to install wiring in library buildings, wireless is more cost effective.2 a yearly survey conducted by the campus computing project found that the number of schools planning for and deploying wireless networks rose dramatically from 2002 to 2003.
“for example, the portion of campuses reporting strategic plans for wireless networks rose to 45.5 percent in fall 2003, up from 34.7 percent in 2002 and 24.3 percent in 2001.”3 the use of wireless access in academia is expected to keep growing. according to a summary of a study conducted by the educause center for applied research (ecar), the higher-education community will keep investing in the technology infrastructure, and institutions will continue to refine and update networks. the move toward wireless access “represents a user-centered shift, providing students and faculty with greater access than ever before.”4 in an article on ubiquitous computing, drew provides a straightforward look at how wlans work, security issues, planning, and the uses and ramifications of wireless technology in libraries. he suggests, “perhaps one of the most important reasons for implementing wireless networking across an entire campus or in a library is the highly mobile lifestyle of students and faculty.” the use of wireless will only increase with the advent of new portable devices, he added. wireless networking is the best and least expensive way for students, faculty, and staff to take their office with them wherever they go.5 the circulation of laptop computers is a frequent topic in the available literature. the 2003 study by kwon and soules primarily focused on laptop-lending services in academic-research libraries. fifty percent of the institutions that responded to their survey provided laptops for checkout. the majority indicated moderate-to-high use of laptop services. positive user response and improved “public reputation, image, and relations” were the greatest advantages reported with laptop circulation. the major disadvantages associated with these services were related to labor and cost.6 a study of laptop checkout service at the mildred f. sawyer library at suffolk university in boston revealed that laptop usage was popular during the fall semester of 1999. 
students checked out the computers to work on group projects. a laptop area was set aside on one library floor to provide wired internet access for eight users. however, students wanted to use the laptops anywhere, not in one designated place. the wired laptop areas were not popular, dugan wrote, adding that “few students used the wired area and the wires were repeatedly stolen or intentionally broken.” an interim phase involved providing wireless network cards for checkout to encourage patrons to use their own laptops, and, when a wireless network was put into place in the fall of 2000, demand exceeded the number of available laptops for checkout.7

wireless networks in medium-sized academic libraries: a national survey
paula barnett-ellis and laurie charnigo
information technology and libraries | march 2005
paula barnett-ellis (pbarnett@jsucc.jsu.edu) is health and sciences librarian, and laurie charnigo (charnigo@jsucc.jsu.edu) is education librarian at houston cole library, jacksonville state university, alabama.

method

a survey (see appendix) was designed to find out how many libraries similar in size and mission to hcl have adopted wireless networks, the experiences they have encountered in offering wireless access, and, most importantly, whether they felt the investment in wireless technology has been worth the effort.8 the national center for education statistics’ academic library peer comparison tool, a database composed of statistical information on libraries throughout the united states, was used to select institutions for this study.
a search on this database retrieved eighty-eight academic libraries that met two criteria: full-time enrollments of between five thousand and ten thousand, and classification by the carnegie classification of higher education as master’s colleges and universities i.9 the survey was administered to those thought most likely to be responsible for systems in the library; they were selected from staff listings on library web sites (library systems administrator, information technology [it] staff). if such a person could not be identified, the survey was sent to the head of library systems or to the library director. the survey was divided into the following sections: implementation of wireless network, planning and installation stages, user services, technical problems, and benefits specific to use of network. surveys were mailed out in march 2004. an internet address was provided in the cover letter if participants wished to take the survey online rather than return it by mail. an e-mail reminder with a link to the online survey was sent out three weeks after the initial survey was mailed. all letters and e-mails were personalized, and a self-addressed stamped envelope and a ballpoint pen with the jsu logo were included with the mail surveys. in the e-mail reminder, the authors offered to share the results of the project with anyone who was interested, and received several enthusiastic responses.

results

a total of fifty-three completed surveys were returned, resulting in a response rate of 60 percent. the overwhelming majority (85 percent) responded that their library offered wireless-network access. even if the thirty-five surveys that were not returned had reported that wireless networks were not available, more than 50 percent of the eighty-eight libraries surveyed would still have offered wireless networks. survey results also pointed to the newness of the technology. only four of the fifty-three institutions have had wireless networks for more than three years.
the majority (73 percent) have implemented wireless networks just within the last two years. when asked to identify the major reasons for offering wireless networks to their patrons, the three responses most chosen were: (1) to provide greater access to users; (2) the flexibility of a network unfettered by the limitations of tedious wiring; and (3) to keep up with technological innovation (see table 1). the least significant factors in the decision to implement wireless networks were cost; use by library faculty and staff; to aid in bibliographic instruction; and use for carrying out technical services (taking inventory). somewhat to the authors’ surprise, wireless use in bibliographic instruction was not high on the list of reasons for installing a wireless network, identified by only 9 percent of respondents. the benefits of wireless for library instruction were stressed in the literature by mathias and heser and patton.10 in addition to obtaining an instrument for gauging how many libraries similar in scope and size to hcl have implemented wireless networks and why they chose to do so, questions on the survey were also designed to gather information on planning and implementation, user services, technical problems, and perceived benefits.

planning and implementation

although tolson mentions that some schools have used committees composed of faculty, staff, and students to look into the adoption of wireless technology, responses from this survey indicated that the majority (60 percent) of the libraries did not form committees specifically for the planning of their wireless networks.11 in addition, 49 percent of the libraries took fewer than six months to plan for implementation of a network, 37 percent required six months to one year, and 15 percent reported more than one to two years. actual time spent on installation and configuration of wireless networks was relatively short, 98 percent indicating less than one year (see table 2 for specific times).
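
the response-rate figures reported in the results can be checked with a short calculation. the python sketch below assumes that the 85 percent figure was computed over the fifty-three returned surveys; the function name `worst_case_share` is hypothetical and introduced only for illustration.

```python
def worst_case_share(surveyed, returned, share_of_returned):
    """Share of all surveyed libraries with wireless access, assuming every
    library that did not respond has none (the most pessimistic reading)."""
    with_wireless = round(returned * share_of_returned)
    return with_wireless / surveyed


surveyed, returned = 88, 53  # figures reported in the method and results

response_rate = returned / surveyed
not_returned = surveyed - returned
floor_share = worst_case_share(surveyed, returned, 0.85)

print(f"response rate: {response_rate:.0%}")
print(f"surveys not returned: {not_returned}")
print(f"worst-case share with wireless: {floor_share:.0%}")
```

under these assumptions the worst-case share comes out at roughly 51 percent, which is consistent with the statement that more than 50 percent of the libraries would still have offered wireless networks.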
one of the most important issues to consider when planning to implement a wireless network is extent of coverage—where wireless access will be available. survey responses revealed varying degrees of wireless coverage among institutions. twenty percent had campus-wide access, 55 percent had some level of coverage throughout the entire library, 37 percent provided a limited range of coverage outside the building, and 20 percent offered access only in certain areas within the library. according to a bulletin published by ecar, institutions vary in their approaches to networking depending on enrollment. smaller colleges and universities with fewer than ten thousand students are “more likely to implement campuswide wireless networks from the start. larger institutions are more likely to implement wireless technology in specific buildings, consistent with a desire to move forward at a modest pace, as resources and comfort with the technology grow.”12 questions on the survey also queried respondents about the popularity of spaces in the library where users access the library’s wireless network. answers revealed that the most popular areas for wireless access are study carrels, tables, and study rooms. nineteen percent indicated that accessing wireless networks in the stacks is popular. of particular concern to hcl, a thirteen-story building, was how the environment of the library would accommodate a wireless network. a thorough site survey is important to locate the best spots within the library to install access points and to determine whether there are architectural barriers in the building that might interfere with access. the majority of survey respondents indicated that the site survey conducted in their library for a wireless network was carried out by their academic institution’s it staff (59 percent). while library staff conducted 35 percent of site surveys, only 17 percent were conducted by outside companies. 
user services

an issue to be addressed by libraries deciding to go wireless is whether laptop computers should also be provided for checkout in the library. after all, it might be hard to justify the usefulness of a wireless network if users do not have access to laptops or other hardware with wireless capabilities. while one individual reported working at a “laptop university” in which campuswide wireless networking exists and all students are required to own laptops, not all college students will have that luxury. in order to provide more equal access to students, checking out laptops has become an increasingly common service in academic libraries. seventy percent of this survey’s respondents whose institutions offered wireless access also made laptops available for checkout. comments made throughout the survey seemed to imply that while checking out laptops to patrons is an invaluable complement to offering wireless access, librarians should be prepared for a myriad of hassles that accompany laptop checkout. wear and tear of laptops, massive battery use, cost of laptops, and maintenance were some of the biggest problems reported. one participant, whose institution decided to stop offering laptops for checkout to patrons in the library, wrote, “it required too much staff time to maintain and we decided the money was better spent elsewhere. the college now encourages students to purchase a laptop [instead of] a full-sized pc.” one participant worried that the rising use of laptops in his library would lead to the obsolescence of its more than one hundred wired desktops, writing, “our desktops are very popular and we think having them is one of the reasons our gate count has increased in recent years. what happens when everyone has a laptop?” the number of laptops checked out in the libraries varied. the majority of libraries had purchased between one and thirty laptops available for checkout (see table 3).
three institutions had more than forty laptops available for checkout. one library could boast that it had sixty laptops available for checkout with twelve pagers to notify students waiting in line to use laptops. when asked about the use of laptops in libraries, 46 percent observed moderate use, while 32 percent reported heavy use of laptops. only 3 percent indicated that they hardly ever noticed use of laptops in the library. for those students who chose to bring their own laptop to access the library’s wireless network, half of the institutions surveyed required students to purchase their own network-interface cards for their laptops, while 19 percent allowed students to check them out from the library. in addition to laptops, personal digital assistants (pdas) were listed by 37 percent of respondents as devices that may access wireless networks. one librarian indicated that cell phones could access the wireless network in his library. fifty-six percent of respondents indicated that users are able to print to a central printer in the library from their wireless device.

table 1. main reasons for implementing a wireless network in absolute numbers and percentages

reasons for implementing a wireless network | total number of responses | percent of responses out of total number
provide greater access to users | 36 | 67
flexibility (no wires, ease in setting up) | 29 | 54
to keep up with or provide technological innovation | 28 | 52
campuswide initiative | 21 | 39
requests expressed by users | 16 | 30
provide greater online access due to shortage of computers-per-user in the library | 15 | 28
other | 7 | 13
offer network access outside the library building | 6 | 11
aid in bibliographic instruction | 5 | 9
for use by library faculty and staff | 5 | 9
low cost | 5 | 9
to carry out technical services (such as inventory) | 4 | 7
an important consideration for implementing a wireless network is how users will authenticate. an authentication protocol is defined by the microsoft encyclopedia of networking as “any protocol used for validating the identity of a user to determine whether to grant the user access to resources over a network.”13 authentication methods listed by the institutions surveyed varied greatly and the authors could not identify all of them. methods mentioned were lightweight directory access protocol (ldap), virtual private network (vpn), media access control (mac) addresses, bluesocket, remote authentication dial-in user service (radius), pluggable graphical identification and authentication (pgina), protected extensible authentication protocol (peap), and e-mail logins. out of the thirty-nine responses to this question, seven individuals indicated that they do not require any type of authentication at present. although some individuals noted that they are planning to enable some type of authentication in the future, one participant suggested that there were ethical issues involved in requiring users to authenticate. this person argued that “anonymous access to information is valued” and praised his institution’s current policy of allowing “anyone who can find the network” to use it. a concern about offering wireless network access in the library is how library staff will be prepared to handle the flood of technical questions that are likely to ensue. the level of technical support offered to users varied among the institutions surveyed. more than half of the respondents indicated that users receive help specifically from it staff or from the campus computer center. thirty-nine percent indicated that users received help from the reference desk, while 19 percent indicated that users received help from circulation staff. thirty-three percent of the responding institutions offered technical help from a web site, while 7 percent indicated that they did not offer any type of technical support to users.
technical problems

the technical problems most often encountered with wireless networks centered on architectural barriers that cause blackouts or slow spots where wireless access fails. this confirms the importance of carrying out thorough site surveys and testing prior to installation of access points. site surveys may be carried out by companies specially equipped and trained to determine where access points should be installed, the most appropriate type of antennae (directional or omnidirectional), and how many access points are needed to provide the greatest amount of coverage. configuration of the network was the second most highly reported problem associated with installing wireless networks, seeming to suggest the need for librarians to coordinate their efforts and rely on the knowledge provided by the it coordinator (or similar type of personnel) within their institution. lack of technical support available to users, slow speed, and authentication were also indicated as technical problems most encountered (see table 4). integrating the wireless network with the existing wired network was the least-mentioned problem associated with wireless networks.

table 2. total length of time taken to completely configure and install the wireless network

time to install and configure wireless network | total number of responses | percent of responses out of total number
less than one month | 12 | 28
one to two months | 11 | 26
more than two months to four months | 10 | 23
more than four months to six months | 4 | 9
more than six months to one year | 5 | 12
more than one year | 1 | 2

table 3. total number of laptops available for checkout in the library

total laptops available for checkout | total number of responses | percent of responses out of total number
one to five | 8 | 26
six to ten | 5 | 16
eleven to fifteen | 1 | 3
sixteen to twenty | 5 | 16
twenty-one to thirty | 8 | 26
thirty-one to forty | 1 | 3
more than forty | 3 | 10
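the percentages reported alongside the response counts can be checked directly. the following is a minimal sketch in python using the counts from table 2; the function name `percent_of_total` is illustrative and not part of the original study, and the sketch assumes that each percentage is the count divided by the sum of the counts in the table, rounded to the nearest whole number.

```python
def percent_of_total(counts):
    """Return each count as a whole-number percentage of the sum of all counts.

    Assumption for illustration: the table reports one answer per respondent,
    so the sum of the counts is the number of responses to the question.
    """
    total = sum(counts.values())
    return {label: round(100 * n / total) for label, n in counts.items()}


# Response counts transcribed from table 2 (time to install and configure the network).
table_2_counts = {
    "less than one month": 12,
    "one to two months": 11,
    "more than two months to four months": 10,
    "more than four months to six months": 4,
    "more than six months to one year": 5,
    "more than one year": 1,
}

table_2_percents = percent_of_total(table_2_counts)
for label, pct in table_2_percents.items():
    print(f"{label}: {pct} percent")
```

this check applies only to single-answer questions. for “check all that apply” questions, such as those behind tables 1 and 4, the denominator would be the number of respondents rather than the sum of the counts.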
although security problems, particularly concerning wired equivalent privacy (wep) vulnerabilities, have been pointed out as one of the major drawbacks of a wireless network, the majority of respondents had not as yet experienced security problems. although one participant wrote, “don’t be too casual about the security risks,” another individual wrote, “talk to your networking department,” adding that many such departments are overly worried about security.

perceived benefits

respondents reported that the number-one benefit of offering wireless access was user satisfaction. giving patrons the ability to use their laptops anywhere in the library and do multiple tasks from one machine is simply becoming what more and more users expect. the second-largest benefit revolved around flexibility and ease of use due to the lack of wires. thirty-five percent indicated that allowing students to roam the stacks while accessing the network was a significant benefit. although a few studies have suggested the promise of wireless networks for aiding bibliographic instruction, only 9 percent of respondents indicated this as a benefit of wireless technology. use of wireless technology for instruction, it might be recalled, was not a significant factor noted by respondents in the decision to implement a wireless network. likewise, use of this type of network to carry out technical services (such as inventory) was also low on the scale of benefits. seventy-three percent of respondents claimed that the benefits of wireless networks have thus far been worth the cost. while 70 percent indicated moderate to heavy use of the wireless network, 27 percent reported low usage. when asked what advice they would give to others considering adopting wireless networks in their libraries, the overwhelming majority of responses were positive, recommending that hcl take the plunge. as one individual wrote, “offer it and they will come.
it has really increased the usage of our library.” other individuals noted that it is simply necessary to offer wireless access to keep up with technological innovation, and that students expect it. the most significant warning, however, revolved around checkout and maintenance of laptops, which, from the results of this survey, seems to be both a big advantage and a headache. several individuals echoed the importance of doing site surveys to test bandwidth limitations and access. one particularly energized participant, using multiple exclamations for emphasis, shared a plethora of advice. “throttle connection speeds! allow only http access! block ports and unnecessary protocols! secure your network and disallow unauthenticated users! use access control lists! establish policies that describe wi-fi [wireless fidelity] risks and liabilities on your part!” useful advice on wireless-access implementation gleaned from this survey fell under the following categories:

• be aware of slower speed
• create a policy and guide for users
• do it because more users are going wireless, it is necessary to keep up with technological innovation, and because students love it
• provide plenty of access points
• install access points in appropriate places
• ensure continuous connectivity by allowing overlap between access points
• purchase battery chargers and heavy-duty laptops with extended warranties
• get support from it staff for planning and maintenance
• offering wireless will increase library usage
• perform or have an expert perform a careful site survey and do lots of testing to locate dead or slow spots in the library due to architectural barriers
• enable some type of authorization
• be aware of security concerns
• although the majority of participants’ networks (70 percent) support 802.11b (which allows for throughput up to 11 megabits per second), a few participants suggest using the 802.11g standard (up to 54 megabits per second) because it is “the fastest” and “backwards compatible to 802.11b”

table 4. technical problems encountered

problems encountered | total number of responses | percent of responses out of total number
architectural barriers | 15 | 28
configuration problems | 12 | 22
not enough technical help available to users when needed | 10 | 19
slow speed | 10 | 19
authentication problems | 10 | 19
blackouts | 6 | 11
problems installing drivers | 6 | 11
security problems | 6 | 11
difficulty signing on | 6 | 11
problems with operating systems | 5 | 9
other | 3 | 6
problems integrating the wireless network with an existing wired network | 2 | 4

conclusion

though wireless networking is a relatively new technology, this study found that a surprisingly large number of medium-sized academic libraries are already offering wireless access. not only are they offering wireless access, but they are also providing patrons with laptops for checkout in the library. although actual use of the network by patrons was not determined through survey responses (as individuals were only asked about their observations of network use), the comments and answers were overwhelmingly positive and enthusiastic about this new technology. problems that have been encountered with wireless networks largely revolve around configuration, slow speed, and laptop checkout. although much of the literature focuses on security issues that accompany wireless networking, few individuals reported problems with security. college and university students, like the rest of society, are becoming increasingly mobile. more often, they want access to library networks and the internet wherever they happen to be studying or working on group projects, not merely in computer labs or designated study areas. the majority of the libraries in this study are accommodating these students’ needs by offering wireless access.
according to breeding, wireless networking is a rapidly growing niche in the networking world, and mobile computer users will become a larger and larger part of any library’s clientele.14 to encourage patrons to continue visiting them, academic libraries, large and small, should attempt to meet the demand for wireless access if at all possible. references and notes 1. myoung-ja lee kwon and aline soules, laptop computer services: spec kit 275 (washington, d.c.: association of research libraries office of leadership and management services, 2003), 11. 2. marshall breeding, “the benefits of wireless technologies,” information today 19, no. 3 (mar. 2002): 42–43. 3. kenneth c. green, “the campus computing project.” accessed mar. 3, 2004, www.campuscomputing.net/. 4. educause center for applied research, “respondent summary: wireless networking in higher education in the u.s. and canada.” accessed dec. 4, 2003, www.educause.edu/ ir/library/pdf/ecar_so/ers/ers0202/ekf0202.pdf. 5. wilfred drew, “wireless networks: new meaning to ubiquitous computing,” journal of academic librarianship 29, no. 2 (mar. 2003): 102–106. 6. kwon and soules, laptop computer services, 11, 15–17. 7. robert e. dugan, “managing laptops and the wireless networks at the mildred f. sawyer library,” journal of academic librarianship 27, no. 4 (jul. 2001): 295–98. 8. questions on the survey did not distinguish as to whether wireless network installations were initiated by it or library personnel. 9. national center for education statistics, “compare academic libraries.” accessed mar. 10, 2004, http://nces.ed.gov/ surveys/libraries/academicpeer/. 10. molly susan mathias and steven heser, “mobilize your instruction program with wireless technology,” computers in libraries 22, no.3 (mar. 2002): 24–30; janice k. patton, “wireless computing in the library: a successful model at st. louis community college,” community & junior-college libraries 10, no. 3 (mar. 2001): 11–16. 11. 
stephanie diane tolson, “wireless laptops and local area networks.” accessed dec. 11, 2003, www.thejournal.com/ magazine/vault/articleprintversion.cfm?aid=3536. 12. raymond boggs and paul arabasz, “research bulletin: the move to wireless networking in higher education.” accessed dec. 4, 2003, www.educause.edu/ir/library/pdf/erb0207.pdf. 13. mitch tulloch, microsoft encyclopedia of networking (redmond, wash.: microsoft pr., 2002), 122. 14. marshall breeding, “a hard look at wireless networks,” library journal 127, no. 12 (summer 2002): 14–17. 1. has a wireless network been implemented in your library? __yes __no 2. if your library has not adopted wireless networking, are you currently planning or seriously considering it for the near future? __yes (please skip to question 4) __no (please fill out questions 2 and 3 only) 3. what are your primary concerns about implementing a wireless network? check all that apply. __the technology is still new __unsure of its benefits __no need for one __questions regarding security __cost __would not be able to provide technical support that might be needed __funds must primarily support other types of technology at the moment __have not noticed many users with laptops in the library __slow speed of wireless networks __other 4. how long has a wireless network been implemented in your library? __fewer than 6 months __6 months to 1 year __more than 1 to 2 years __more than 2 to 3 years __more than 3 years 5. what were the main reasons for implementing a wireless network? check all that apply. 
__provide greater access to users __campuswide initiative __offer network access outside the library building __provide greater online access due to shortage of computers per user in the library __flexibility (no wires, ease in setting up) __requests expressed by users __low cost __to keep up with or provide technological innovation __to carry out technical services (such as inventory) __aid in bibliographic instruction __for use by library faculty and staff __other 6. please describe the coverage of your network. check all that apply. __campuswide __library building and limited range outside the library building __inside the library (all areas) __select areas within the library 7. what areas of the library are most popularly used for access to the wireless network? check all that apply. __reference and computer media center areas __in the stacks __librarians and staff offices __carrels, tables, reading or study rooms __area outside the library building 8. please list standards your wireless network supports. check all that apply. __802.11b __802.11a __802.11g __bluetooth __other planning and installation 1. was a committee established to plan the implementation and service of the wireless network? __yes __no 2. how long did it take to plan for implementation of the wireless network? __fewer than 6 months __6 months to 1 year __more than 1 to 2 years __more than 2 years 3. how long did it take to install and configure the network? __less than a month __1 to 2 months __more than 2 to 4 months __more than 4 to 6 months __more than 6 months to 1 year __more than 1 year 4. who performed the site survey? check all that apply. __an outside company or contractor appendix. 
survey: implementation of wireless networks __institution’s own information technology coordinator or computer staff __library staff with technical expertise __no site survey was conducted 5. if the site surveyor was an outside company or contractor, please list their company name and whether you would recommend them. __________ user services 1. how are users authenticated? __________ 2. does the library check out laptops to users (for either wired or wireless use)? __yes __no 3. if laptops are available for checkout, do they have wireless capability? __yes __no 4. how many laptops do you have for checkout? __one to five __six to ten __eleven to fifteen __sixteen to twenty __twenty-one to thirty __thirty-one to forty __more than forty 5. how would you describe use of laptops in your library on the average day? __heavy (very noticeable use of laptops) __moderate use of laptops __low use of laptops __not sure __hardly even notice laptops are used 6. how do users obtain wireless cards for the network? check all that apply. __check out from library __purchase from library __purchase from the campus computer center __must purchase on their own 7. if the library checks out wireless cards, how many were purchased for checkout?
__one to five __six to ten __eleven to fifteen __sixteen to twenty __twenty-one to twenty-five __twenty-six to thirty __more than thirty 8. what type of technical support does the library provide to users? check all that apply. __help from reference or help desk __help from the information technology staff or campus computer center __circulation staff __other library staff __from a web site __no technical help is provided to users 9. has the library created a policy for the use of wireless networks? __yes __no 10. are users able to print from the wireless network in the library? __yes __no 11. which of the following may access the wireless network? check all that apply. __laptops __desktop computers __pdas __cell phones __other technical problems 1. what technical problems have you or your users encountered? check all that apply. __blackouts __architectural barriers __slow speed __problems integrating the wireless network with an existing wired network __configuration problems __security problems __authentication problems __problems with operating systems __difficulty signing on __not enough technical help available to users when needed __problems installing drivers __other 2. have you experienced security problems with the network? check all that apply. __have not experienced any security problems __problems with unauthorized people accessing the internet through the wireless network __problems with restricted parts of the network being accessed by unauthorized users __other 3. how were security problems resolved? benefits of use of network 1. what have been the biggest benefits of wireless technology? check all that apply. __user satisfaction __increased access to the internet and online sources __flexibility and ease due to lack of wires __has improved technical services (use for library functions) __has aided in bibliographic instruction __provides access beyond the library building __allows students to roam the stacks while accessing the network __other 2. 
how would you describe current usage of the network? __heavy __moderate __low 3. in your opinion, has this technology been worth the benefit-cost ratio thus far? __yes __no __not sure 4. what advice would you give to librarians considering this technology?

(editorial continued from page 3) design and implementation of complex systems to serve our users. writing about that should not be solitary either. i hope to publish think-pieces from leaders in our field. i hope to publish more articles on the management of information technologies. i hope to increase the number of manuscripts that provide retrospectives. libraries have always been users of information technologies, often early adopters of leading-edge technologies that later become commonplace. we should, upon occasion, remember and reflect upon our development as an information-technology profession. i hope to work with the editorial board, the lita publications committee, and the lita board to find a way, and soon, to facilitate the electronic publication of articles without endangering, but in fact enhancing, the absolutely essential financial contribution that the journal provides to the association. in short, i want to make ital a destination journal of excellence for both readers and authors, and in doing so reaffirm the importance of lita as a professional division of ala. to accomplish my goals, i need more than an excellent editorial board, more than first-class referees to provide quality control, and more than the support of the lita officers. i need all lita members to be prospective authors, prospective referees, and prospective literary agents acting on behalf of our profession to continue the almost forty-year tradition begun by fred kilgour and his colleagues, who were our predecessors in volume 1, number 1, march 1968, of our journal. reference 1. walt crawford, first have something to say: writing for the library profession (chicago: ala, 2003).
measuring journal linking success from a discovery service

kenyon stuart, ken varnum, and judith ahronheim

information technology and libraries | march 2015 52

abstract

online linking to full text via third-party link-resolution services, such as serials solutions 360 link or ex libris’ sfx, has become a popular method of access for users in academic libraries. this article describes several attempts made over the course of the past three years at the university of michigan to gather data on linkage failure: the method used, the limiting factors, the changes made in methods, an analysis of the data collected, and a report of steps taken locally because of the studies. it is hoped that the experiences at one institution may be applicable more broadly and, perhaps, produce a stronger data-driven effort at improving linking services.

introduction

online linking via vended services has become a popular method of access to full text for users in academic libraries. but not all user transactions result in access to the desired full text. maintaining information that allows the user to reach full text is a shared responsibility among assorted vendors, publishers, aggregators, local catalogers, and electronic access specialists. the collection of information used in getting to full text can be thought of as a supply chain. to maintain this chain, libraries need to enhance the basic information about the contents of each vendor package (a collection of journals bundled for sale to libraries) with added details about local licenses and holdings.
these added details need to be maintained over time. since links, platforms, contracts, and subscriptions change frequently, this can be a time-consuming process. when links are unsuccessfully constructed within each system, considerable troubleshooting of a very complex process is required to determine where the problem lies. because so much of the transaction is invisible to the user, linking services have come to be taken for granted by the community, and performance expectations are very high. failure to reach full text reflects poorly on the institutions that offer the links, so there is considerable interest for and value to the institution in improving performance.

kenyon stuart (kstuart@umich.edu) is senior information resources specialist, ken varnum (kvarnum@umich.edu) is web systems manager, and judith ahronheim (jaheim@umich.edu) is head, electronic resource access unit, university of michigan library, ann arbor, michigan.

improving the success rate for users can best be achieved by acquiring a solid understanding of the nature and frequency of problems that inhibit full-text retrieval. while anecdotal data and handling of individual complaints can provide incremental improvement, larger improvement resulting from systematic changes requires more substantial data, data that characterizes the extent of linking failure and the categories of situations that inhibit it.

literature review

openurl link resolvers are “tool[s] that helps library users connect to their institutions’ electronic resources.
the data that drives such a tool is stored in a knowledge base.”1 since the codification of the openurl as an ansi/niso standard in 2004,2 openurl has become, in a sense, the glue that holds the infrastructure of traditional library research together, connecting citations and full text. it is well recognized that link resolution is an imperfect science. understanding what and how openurls fail is a time-consuming and labor-intensive process, typically conducted through analysis of log files recording attempts by users to access a full-text item via openurl.

research has been conducted from the perspective of openurl providers, showing which metadata elements encoded in an openurl were most common and most significant in leading to an appropriate full-text version of the article being cited. in 2010, chandler, wiley, and leblanc reported on a systematic approach they devised, as part of a mellon grant, to review the outbound openurls from l’année philologique.3 they began with an analysis of the metadata elements included in each openurl and compared this to the standard. they found that elements critical to the delivery of a full-text item, such as the article’s starting page, were never included in the openurls generated by l’année philologique.4 their work led to the creation of the improving openurls through analytics (iota) working group within the national information standards organization (niso).

iota, in turn, was focused on improving openurl link quality at the provider end.
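to make the role of individual metadata elements concrete, the following is a minimal sketch in python of an openurl in the key/encoded-value form of the 2004 standard, together with a check for missing elements. the resolver address, the sample citation, and the choice of which keys count as “critical” are assumptions made for illustration; they are not taken from the studies cited above.

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical resolver address, used only for illustration.
RESOLVER_BASE = "https://resolver.example.edu/openurl"

# Assumption for illustration: elements treated here as critical for
# article-level linking. The cited study names the starting page as one such element.
CRITICAL_KEYS = ("rft.issn", "rft.volume", "rft.spage", "rft.date")


def build_openurl(citation):
    """Build a journal-article OpenURL from a dict of citation fields, skipping empty ones."""
    params = {
        "url_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
        "rft.genre": "article",
    }
    params.update({f"rft.{key}": value for key, value in citation.items() if value})
    return f"{RESOLVER_BASE}?{urlencode(params)}"


def missing_critical_keys(openurl):
    """Return the critical keys that are absent from the OpenURL's query string."""
    query = parse_qs(urlparse(openurl).query)
    return [key for key in CRITICAL_KEYS if key not in query]


# Made-up citation with no starting page, mirroring the kind of gap described above.
sample_citation = {
    "issn": "1234-5678",
    "jtitle": "example journal",
    "atitle": "an example article",
    "volume": "12",
    "issue": "3",
    "date": "2010",
    "spage": "",
}

sample_url = build_openurl(sample_citation)
print(missing_critical_keys(sample_url))
```

a log analysis along these lines would run the same check over every outbound openurl recorded for a provider and count how often each element is absent.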
“the quality of the data in the link resolver knowledge base itself is outside the scope of iota; this is being addressed through the niso kbart initiative.”5,6 where iota provided tools to content providers for improving their outbound openurls, kbart provided tools to knowledge base and linking tool providers for improving their data. pesch, in a study to validate the iota process, discovered that well-formed openurls were generally successful; however:

the quality of the openurl links is just part of the equation. setting the proper expectations for end users also need to be taken into consideration. librarians can help by educating their users about what is expected behavior for a link resolver and end user frustrations can also be reduced if librarians take advantage of the features most content providers offer to control when openurl links display and what the links say. where possible the link text should indicate to the user what they will get when they click it.7

missing from the standards-based work described above is the role of the openurl middleman, the library. price and trainor describe a method for reviewing openurl data and identifying the root causes of failures.8 through testing of actual openurls in each of their systems, they arrived at a series of steps that could be taken by other libraries to proactively raise openurl resolution success rates.
several specific recommendations include “optimize top 100 most requested journals” and “optimize top ten full text target providers.”9 that is, make sure that openurls leading to content from the most frequently used journals and content sources are tested and are functioning correctly. chen describes a similar analysis of broken link reports derived from bradley university library’s sfx implementation over four years, with a summary of the common reasons links failed.10 similarly, o’neill conducted a small usability study whose recommendations included providing “a system of support accessible from the page where users experience difficulty,”11 although her recommendations focused on inline, context-appropriate help rather than error-reporting mechanisms.

not found in the literature are systematic approaches that a library can take to proactively collect problem reports and manage the knowledge base accordingly.

method

we have taken a two-pronged approach to improving link resolution quality, with each prong relying on a different kind of input. the first uses problem reports submitted by users of our summon™-powered article discovery tool, articlesplus.12 the second focuses on the most commonly accessed full-text titles in our environment, based on reports from 360 link. we have developed this dual approach in the expectation that we will catch more problems on lesser-used full-text sources through the first approach, and problems whose resolution will benefit the most individuals through the second.

user reports

the university of michigan library uses summon as the primary article discovery tool.
when a user completes a search and clicks the “mget it” button (see figure 1), the user is directed to the actual document through one of two mechanisms (mget it is our local brand for the entire full-text delivery process):

1. access to the full-text article through a summon index-enhanced direct link. (some of summon’s full-text content providers contribute a url to summon for direct access to the full text. this is known as index-enhanced direct linking [direct linking].)
2. access to the full-text article through the university library’s link resolver, 360 link. at this point, one of two things can happen:
   a. the university library has configured a number of full-text sources as “direct to full text” links. when a citation leads to one of these sources, the user is directed to the article, or as close to it as the content provider’s site allows: sometimes to an issue table of contents, sometimes to a list of items in the appropriate volume, and, rarely, to the journal’s front page. the last outcome is rare in our environment because the university library prefers full-text links that get closer to the article and has configured 360 link for that outcome.
   b. for those full-text sources that do not have direct-to-article links, 360 link is configured to provide a range of possible delivery mechanisms, including journal-, volume-, or issue-level entry points, document-delivery options (for cases where the library does not license any full-text online sources), the library catalog (for identifying print holdings for a journal), and so on.
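the branching among mechanisms 1, 2a, and 2b can be sketched as a small decision function. this is an illustrative sketch in python only: the names `Delivery` and `classify_delivery` and the two boolean inputs are hypothetical and do not correspond to any summon or 360 link api.

```python
from enum import Enum


class Delivery(Enum):
    SUMMON_DIRECT = "1"     # index-enhanced direct link supplied through summon
    RESOLVER_DIRECT = "2a"  # 360 link source configured as "direct to full text"
    RESOLVER_MENU = "2b"    # 360 link offers a range of delivery options


def classify_delivery(has_direct_link, source_is_direct_to_full_text):
    """Return which of the three delivery paths described in the text applies.

    has_direct_link: the content provider contributed a url for direct access.
    source_is_direct_to_full_text: the library configured this source in the
    link resolver as a "direct to full text" link.
    """
    if has_direct_link:
        return Delivery.SUMMON_DIRECT
    if source_is_direct_to_full_text:
        return Delivery.RESOLVER_DIRECT
    return Delivery.RESOLVER_MENU


print(classify_delivery(False, True).value)
```

keeping the three outcomes as distinct values makes it possible to count them separately when reviewing link-resolver logs or problem reports.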
from the user perspective, mechanisms 1 and 2a are essentially identical. in both cases, a click on the mget it icon takes the user to the full text in a new browser window. if the link does not lead to the correct article for any reason, there is no way in the new window for the library to collect that information. users may consider item 2b results a failure because the article is not immediately visible, even if the article is actually available in full text after two or more subsequent clicks. because of this user perception, we interpreted 2b results as “failures.”

figure 1. sample citation from articlesplus

in an attempt to understand this type of problem, following the advice given by o’neill and chen, we provide a problem-reporting link in the articlesplus search-results interface each time the full-text icon appears (see the right side of figure 1). when the user clicks this problem-reporting link, they are taken to a qualtrics survey form that asks for several basic pieces of information from the user but also captures the citation information for the article the user was trying to reach (see figure 2).

figure 2. survey questionnaire for reporting linking problems

this survey instrument asks the user to characterize the type of delivery failure with one of four common problems, along with an “other” text field:

• there was no article
• i got the wrong article
• i ended up at a page on the journal's web site, but not the article
• i was asked to log in to the publisher's site
• something else happened (please explain):

the form also asks for any additional comments and requires that the user provide an email address so that library staff can contact the user with a resolution (often including a functioning full-text link) or ask for more information.

in addition to the information requested from the user, hidden fields on this form capture the summon record id for the article, the ip address of the user’s computer (to help us identify if the problem could be related to our ezproxy configuration), a time and date stamp of the report’s submission, and the brand and version of web browser being used.

the results of the form are sent by email to the university library’s ask a librarian service, the library’s online reference desk. ask a librarian staff make sure that the problem is not associated with the individual user’s account (that they are entitled to get full text, that they were accessing the item from on campus or via the proxy server or vpn, etc.). when user-centric problems are ruled out, the problem is passed on to the library’s electronic access unit in technical services for further analysis and resolution.
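the contents of a submitted report, as described above, can be pictured as a single record. the sketch below is an assumption-laden illustration, not the qualtrics form definition: the class names, field names, and the `next_handler` helper are all hypothetical, and the example values are placeholders.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class FailureType(Enum):
    """The five choices offered on the survey form."""
    NO_ARTICLE = "there was no article"
    WRONG_ARTICLE = "i got the wrong article"
    JOURNAL_PAGE = "i ended up at a page on the journal's web site, but not the article"
    LOGIN_REQUIRED = "i was asked to log in to the publisher's site"
    OTHER = "something else happened"


@dataclass
class ProblemReport:
    # Supplied by the user
    failure_type: FailureType
    email: str
    comments: str
    # Captured by hidden form fields
    summon_record_id: str
    ip_address: str
    submitted_at: datetime
    browser: str


def next_handler(user_centric_problem: bool) -> str:
    """Hypothetical triage rule: reports go on to technical services
    only after account/proxy problems have been ruled out."""
    return "ask a librarian" if user_centric_problem else "electronic access unit"


report = ProblemReport(
    failure_type=FailureType.WRONG_ARTICLE,
    email="patron@example.edu",          # placeholder address
    comments="",
    summon_record_id="hypothetical-id",  # placeholder, not a real Summon ID
    ip_address="192.0.2.10",             # documentation-only address range
    submitted_at=datetime(2013, 1, 15, 10, 30),
    browser="Firefox 18",
)
print(report.failure_type.value, "->", next_handler(user_centric_problem=False))
```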
random sampling

user-reported problems give only one picture of issues in the linking process. we were concerned that user reports might not be the complete story. we wanted to ensure that our samples represented the full range of patron experiences, not just those of the patrons who reported problems. so, to get a different perspective, we instituted a series of random-sample tests using logs of document requests from the link resolver, 360 link.

2011 linking review

our first large-scale review of linking from articlesplus was conducted in 2011. this first approach was based on a log of the summon records that had appeared in patron searches and for which our link resolver link had been clicked. for this test we chose a slice of the log covering the period from january 30 to february 12, 2011. this period was chosen because it was well into the academic term and before spring break, so it would provide a representative sample of the searches people had performed. the resulting slice contained 13,161 records. for each record the log contained the summon id of the record. we used this to remove duplicate records from the log to ensure we were not testing linking for the same record more than once, leaving us with a spreadsheet of 10,497 records, one record per row. from the remaining records we chose a sample of 685 records using a random number generator tool, research randomizer (http://www.randomizer.org/form.htm), to produce a random, nonduplicating list of 685 numbers with values from 1 to 10,497. each of the 685 numbers produced was matched to the corresponding row in the spreadsheet starting with the first record listed in the spreadsheet.
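the deduplicate-then-sample procedure just described can be sketched in a few lines. this is a minimal illustration that swaps python's standard `random.sample` in for the research randomizer web tool; the function name `draw_sample` and the synthetic log are assumptions made for the example, not the log format we received.

```python
import random


def draw_sample(summon_ids, sample_size, seed=None):
    """Remove duplicate Summon IDs, then pick `sample_size` distinct row numbers.

    Row numbers are 1-based, matching "row N of the spreadsheet".
    Returns a list of (row_number, summon_id) pairs.
    """
    # dict.fromkeys keeps the first occurrence of each ID and preserves log order
    unique_ids = list(dict.fromkeys(summon_ids))
    rng = random.Random(seed)
    # random.sample draws without replacement, so no row number repeats
    row_numbers = sorted(rng.sample(range(1, len(unique_ids) + 1), sample_size))
    return [(row, unique_ids[row - 1]) for row in row_numbers]


# Synthetic log: 100 clicks spread over 40 distinct (made-up) record IDs.
log = [f"id-{n % 40}" for n in range(100)]
sample = draw_sample(log, sample_size=10, seed=1)
print(len(sample), "records sampled from", len(set(log)), "unique records")
```

with the 2011 figures, the same call would take the 13,161 logged ids, reduce them to 10,497 unique records, and draw 685 row numbers between 1 and 10,497.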
for each record we collected the data in figure 3.

1. the summon id of the record.
2. the raw openurl provided with the record.
3. a version of the openurl that may have been locally edited to put dates in a standard format.
4. the final url provided to the user for linking to the resource. this would usually be the openurl from #3 containing the metadata used by the link resolver to build its full-text links. currently it is an intermediary url provided by the summon api. this url may lead to an openurl or to a direct link to the resource in the summon record.
5. the classification of the link in the summon record. this was either “full text online” or “citation-only.”
6. the date the link resolver link was clicked.
7. the page of the summon search results on which the link resolver link was found.
8. the position within the page of search results where the link resolver link was located.
9. the search query that produced the search results.

figure 3. data points collected

the results from this review were somewhat disappointing, with only 69.5% of the citations tested leading directly to full text. at the time direct linking did not yet exist, so “direct to full text” linking was only available through the 1-click feature of 360 link. the 1-click feature attempts to lead patrons directly to the full text of a resource without first going through the 360 link menu. 1-click was used for 579 or 84.5% of the citations tested with 15.3% leading to the 360 link menu.
of the citations that used 1-click, 476 or 82.2% led directly to full text, so when 1-click was used it was rather successful. links for about 30.5% of the citations led either to a failed attempt to reach full text through 1-click or directly to the 360 link menu. the 2011 review included looking at the full-text links that 360 link indicated should lead directly to the full text as opposed to the journal, volume, or issue level. when we reviewed all of the “direct to full text” links generated by 360 link, not only the ones used by 1-click, we found a variety of reasons why those links did not succeed in leading to the full text. the top five reasons found for linking failures are the following:

1. incomplete target collection
2. incorrect syntax in the article/chapter link generated by 360 link
3. incorrect metadata in the summon openurl
4. article not individually indexed
5. target error in targeturl translation

collectively, these reasons were associated with the failure of 71.5% of the “direct to full text” links. as we will show later, these problems were also noted in our most recent review of linking quality.

move to quarterly testing

after this review in 2011, we decided to perform quarterly testing of the linking so we would have current data on the quality of linking. this would give us information on the effectiveness of any changes we and proquest had made independently to improve the linking. we could see where linking problems found in previous testing had been resolved and where new ones might exist.
however, we needed to change how we created our sample. while the data gathered in 2011 provided much insight into the workings of 360 link, testing the 685 records produced 2,210 full-text links. gathering the data for such a large number of links required two months of part-time effort by two staff members as well as an additional month of part-time effort by one staff member for analysis. this would not be workable for quarterly testing. as an alternative we decided to test two records from each of the 100 serials most accessed through the link resolver. this gave us a sample we could test and analyze within a quarter based on serials that our patrons were using. we felt that we could gather data for such a sample within three to four weeks instead of two months. the list was generated using the “click-through statistics by title and issn (journal title)” usage report generated through the proquest client center administration gui. we searched for each serial title within summon using the serial’s issn or the serial’s title when the issn was not available.

we ordered the results by date, with the newest records first. we wanted an article within the first two to three pages of results so we would have a recent article, but not one so new it was not yet available through the resources that provide access to the serial. then we reordered the results to show the oldest records first and chose an article from the first or second page of results. our goal was to choose an article at random from the second or third page while ignoring the actual content of the article so as not to introduce a selection bias by publisher or journal.
another area where our sample was not truly random involved supplement issues of journals. one problem we found with the samples collected was that they contained few items from supplemental issues of journals. linking to articles in supplements is particularly difficult because of the different ways supplement information is represented among different databases. to attempt to capture linking information in this case we added records for articles in supplemental issues. those records were chosen from journals found in earlier testing to contain supplemental issues. we searched summon for articles within those supplemental issues and selected one or two to add to our sample.

one notable change was the introduction of direct linking in our summon implementation between the reviews for the first and second quarters of 2012. proquest developed direct linking to improve linking to resources (including but not limited to full text of articles) through summon. instead of using an openurl, which must be sent to a link resolver, direct linking uses information received from the providers of the records in summon to create links directly to resources through those providers. ideally, since these links use information from those providers, direct linking would not have the problems found with openurl linking through a link resolver such as 360 link. not all links from summon use direct linking, and as a result we had to take into account the possibility that any link we clicked could use either openurl linking or direct linking.
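to make the contrast concrete, the sketch below builds an openurl in the key/encoded-value style of z39.88-2004 and then applies a simple check for whether a clicked link is an openurl or something else. the resolver address, the citation values, and the function names are hypothetical, and the `url_ver` test is only a heuristic assumed for illustration; it is not how summon or 360 link labels its links.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical resolver address; a real 360 Link base URL is institution-specific.
RESOLVER_BASE = "https://resolver.example.edu/openurl"


def build_openurl(issn, volume, issue, start_page, year, article_title):
    """Encode citation metadata as an OpenURL query for a link resolver."""
    params = {
        "url_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
        "rft.genre": "article",
        "rft.issn": issn,
        "rft.volume": volume,
        "rft.issue": issue,
        "rft.spage": start_page,
        "rft.date": year,
        "rft.atitle": article_title,
    }
    return f"{RESOLVER_BASE}?{urlencode(params)}"


def linking_type(url):
    """Heuristic (an assumption): a link carrying url_ver=Z39.88-2004 is an OpenURL."""
    query = parse_qs(urlsplit(url).query)
    return "openurl" if query.get("url_ver") == ["Z39.88-2004"] else "direct"


# Made-up citation values, used only to exercise the functions.
openurl = build_openurl("1234-5678", "12", "3", "45", "2012", "an example article")
print(linking_type(openurl))
print(linking_type("https://publisher.example.com/article/42"))
```

because the resolver must reconstruct the target link from these metadata fields, an error in any one of them (for example the start page or date) can send the patron to the wrong place, which is the class of problem direct linking is meant to avoid.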
current review: back to random sampling

while the above sampling method resulted in useful data, we also found it had some limitations. when we performed the review for the second quarter of 2012, we found a significant increase in the effectiveness of 360 link since the first quarter 2012 review. this is further described in the findings section of this article. we were able to trace some of this improvement to changes proquest had made to 360 link and to the openurls produced from summon. however, we were unable to fully trace the cause of the improvement and were unable to determine if this was real improvement that would be persistent. to resolve these problems, we have returned to using a random sample in our latest review, but with a change in methods.

current review: determining the sample size

we wanted to perform a review that would be statistically relevant and could help us determine if any changes in linking quality were persistent and not just a one-time event. instead of testing a single sample each quarter we decided to test a sample each month over a period of months. one concern with this was the sample size, as we wanted a sample that would be statistically valid but not so large we could not test it within a single month. we determined that a sample size of 300 would be sufficient to determine if any month-to-month changes represent a real change. however, in previous testing we had learned that because of re-indexing of the summon records, summon ids that were valid when a patron performed a search might no longer be valid by the time of our testing.
we wanted a sample of 300 still-valid records, so we needed to start from a larger random sample. we decided to test 600 records each month to determine if the summon ids were still valid.

current review: methods

when generating each month’s sample we used the same method as in 2011. we asked our web systems group for the logs of full-text requests from the library’s summon interface for the period of november 2012–february 2013 (note 13). we processed each month’s log file within two months of the user interactions. to generate the 600-record sample, after removing records with duplicate summon ids, we used a random number generator tool, research randomizer, to produce a random, nonduplicating list of 600 numbers with values from 1 to the number of unique records. each of the 600 numbers produced was matched to the corresponding row in the spreadsheet of records with unique summon ids. once the 600 records were tested and we had a subset with valid summon ids, we generated a list of 300 random, nonduplicating numbers with values from 1 to the number of records with valid summon ids. each of the 300 numbers produced was matched to the corresponding row in a spreadsheet of the subset of records with valid summon ids. this gave us the 300-record sample for analysis.

testing was performed by two people, a permanent staff member and a student hired to assist with the testing. the staff member was already familiar with the data gathering and recording procedure and trained the student on this procedure.
the student was introduced to the library’s summon implementation and shown how to recognize and understand the different possible linking types: summon direct linking, 360 link using 1-click, and 360 link leading to the 360 link menu. once this background was provided, the student was introduced to the procedure for gathering and recording data. the student was given suggestions on how to find the article on the target site if the link did not lead directly to the article and how to perform some basic analysis to determine why the link did not function as expected. the permanent staff member reviewed the analysis of the links that did not lead to full text and applied a code to describe the reason for the failure.

based on our 2011 testing, we expected to see one of two general results in the current round.

1. 360 link would attempt to connect directly to the article because we activated the 1-click feature of 360 link when we implemented the link resolver. with 1-click, 360 link attempts to lead patrons directly to the full text of a resource without first having to go through the link resolver menu. even with 1-click active we provide patrons a link leading to the full 360 link menu, which may have other options for leading to the full text as well as links to search for the journal or book in our catalog.
2. the link from summon would lead directly to the 360 link menu.

once direct linking was implemented after we began this round, a third result became possible (a direct link from summon to the full text).

for each record we collected the data shown in figure 4.
1. date the link from the summon record was tested.
2. the url of the summon record.
3. *the openurl generated by clicking the link from summon. this was the url in the address bar of the page to which the link led. this is not available when direct linking is used.
4. the issn of the serial or isbn of the book.
5. the doi of the article/book chapter if it was available.
6. the citation for the article as shown in the 360 link menu or in the summon record if direct linking was used.
7. *each package (collection of journals bundled together in the knowledgebase) for which 360 link produced an electronic link for that citation.
8. *the order in the list of electronic resources in which the package in #7 appeared in the 360 link menu.
9. *the linking level assigned to the link by 360 link. this level indicates how close to the article the link should lead the patron, with article-level or chapter-level links ideally taking the patron directly to the article/book chapter. the linking levels recorded in our testing, starting with the closest to full text, were article/book chapter, issue, volume, journal/book, and database.
10. *for article-level links, the url that 360 link used to attempt to connect to the article.
11. for all full-text links in the 360 link menu, the url to which the links led. this was the link displayed in the browser address bar.
12. a code assigned to that link describing the results.
13. a note indicating if full text was available on the site to which the link led. this was only an indicator of whether or not full text could be accessed on that site, not an indicator of the success of 1-click/direct linking or the article-level link.
14. a note if this was the link used by 1-click.
15. a note if direct linking was used.
16. a note if the link was for a citation where 1-click was not used and clicking the link in summon led directly to the 360 link menu.
17. notes providing more detail for the results described by #12. this included error messages, search strings shown on the target site, and any unusual behavior. the notes also included conclusions reached regarding the cause(s) of any problems.

* collected only if the link resolver was used.

figure 4. data collected from sample

each link was categorized based on whether it led to the full text. then the links that failed were further categorized on the basis of the reason for failure (see figure 5 for failure categories).

1. incorrect metadata in the summon openurl.
2. incomplete metadata in the summon openurl.
3. difference in the metadata between summon and the target. in this case we were unable to determine which site had the correct metadata.
4. inaccurate data in the knowledgebase. this includes incorrect url and incorrect issn/isbn.
5. incorrect coverage in the knowledgebase.
6. link resolver insufficiency. the link resolver has not been configured to provide deep linking. this may be something that we could configure or something that would require changes in 360 link.
7. incorrect syntax in the article/chapter link generated by 360 link.
8. target site does not appear to support linking to article/chapter level.
9. article not individually indexed. this often happens with conference abstracts and book reviews, which are combined in a single section.
10. translation error of the “targeturl” by target site.
11. incomplete target collection. site is missing full text for items that should be available on the site.
12. incorrect metadata on the target site.
13. citation-only record in summon. summon indicates only the citation is available, so access to full text is not expected.
14. error indicating cookie could not be downloaded from target site. this sometimes happened with 1-click, but the same link would work from the 360 link menu.
15. item does not appear to have a doi. the 360 link menu may provide an option to search for the doi. sometimes these searches fail and we are unable to find a doi for the item.
16. miscellaneous. results that do not fall into the other categories. generally used for links in packages for which 360 link only provides journal/book-level linking, such as directory of open access journals (doaj).
17. unknown. the link failed with no identifiable cause.

figure 5. list of failure categories assigned

user-reported problems

in march 2012, we began recording the number of full-text clicks in articlesplus search results (using google analytics events). for each month, we calculated the number of problems reported per 1,000 searches and per 1,000 full-text clicks.
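the normalization is simple arithmetic, sketched here for clarity. the function name is hypothetical and the monthly counts are placeholders chosen only to show the calculation; they are not figures from our logs.

```python
def reports_per_thousand(problem_reports: int, events: int) -> float:
    """Problem reports per 1,000 events (searches or full-text clicks)."""
    if events <= 0:
        raise ValueError("events must be a positive count")
    return 1000 * problem_reports / events


# Placeholder counts for one month, NOT data from the study.
reports, searches, full_text_clicks = 12, 48_000, 8_000
print(reports_per_thousand(reports, searches))          # 0.25 reports per 1,000 searches
print(reports_per_thousand(reports, full_text_clicks))  # 1.5 reports per 1,000 full-text clicks
```

normalizing by activity in this way keeps a busy month from looking worse than a quiet one simply because more people were searching.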
graphed over time, the number of problem reports in both categories shows an overall decline. see figures 6 and 7.

figure 6. problems reported per 1,000 articlesplus searches (june 2011–april 2014)

figure 7. problems reported per 1,000 articlesplus full-text clicks (march 2012–april 2014)

our active work to update the summon and 360 link knowledge bases began in june 2011. the change to summon direct linking happened on february 27, 2012, at a time when we were particularly dissatisfied with the volume of problems reported. we felt the poor quality of openurl resolution was a strong argument in favor of activating summon direct linking. we believe this change led to a noticeable improvement in the number of problems reported per 1,000 searches (see figure 6). we do not have data for clicks on the full-text links in our articlesplus interface prior to march 2012, but do know that reports per 1,000 full-text clicks have been on the decline as well (see figure 7).

findings

summary of random-sample testing of link success

in early 2013 we tested linking from articlesplus to gather data on the effectiveness of the linking and to attempt to determine if there were any month-to-month changes in the effectiveness that could indicate persistent changes in linking quality. in this section we will review the data collected from the four samples used in this testing. we will discuss the different paths to full text, direct linking vs. openurl linking through 360 link, and their relative effectiveness.
we will also discuss the reasons we found for links to not lead to full text.

paths to full-text access

as shown below (see table 1), most of the records tested in summon used direct linking to attempt to reach the full text. the percentage varied with each sample tested, ranging from 61% to 70%. the remaining records used 360 link to attempt to reach the full text. most of the time when 360 link was used, 1-click was also used to reach the full text. between direct linking and 1-click, about 93% to 94% of the time an attempt was made to lead users directly to the full text of the article without first going through the 360 link menu.

                          sample 1        sample 2        sample 3        sample 4
                          november 2012   december 2012   january 2013    january 2013
direct linking            205  68.3%      210  70.0%      184  61.3%      190  63.3%
360 link/1-click           77  25.7%       70  23.3%       98  32.7%       87  29.0%
360 link/360 link menu     18   6.0%       20   6.7%       18   6.0%       23   7.7%
total                     300             300             300             300

table 1. type of linking

attempts to reach the full text through direct linking and 1-click were rather successful. in the testing, we were able to reach full text through those methods from 79% to about 84% of the time (see table 2). the remaining cases were situations where direct linking/1-click did not lead directly to the full text or we reached the 360 link menu.
                          sample 1        sample 2        sample 3        sample 4
                          november 2012   december 2012   january 2013    january 2013
direct linking            197  65.7%      204  68.0%      173  57.7%      185  61.7%
360 link/1-click           45  15.0%       47  15.7%       64  21.3%       55  18.3%
total out of 300          242  80.7%      251  83.7%      237  79.0%      240  80.0%

table 2. percentage of citations leading directly to full text

table 3 contains the same data but adjusted to remove results that summon correctly indicated were citation-only. instead of calculating the percentages based on the full 300-citation samples, they are calculated based on the sample minus the citation-only records. the last row shows the number of records excluded from the full samples.

                          sample 1        sample 2        sample 3        sample 4
                          november 2012   december 2012   january 2013    january 2013
direct linking            197  65.9%      204  69.2%      173  59.0%      185  62.5%
360 link/1-click           45  15.1%       47  15.9%       64  21.8%       55  18.6%
total                     242  80.9%      251  85.1%      237  80.9%      240  81.1%
records excluded            1               5               7               4

table 3. percentage of citations leading directly to full text, excluding citation-only results

link failures with summon direct linking and 360 link 1-click

the next two tables show the results of linking for records that used direct linking and the citations that used 1-click through 360 link. records that used direct linking were more likely to lead testers to full text than 360 link with 1-click. for the four samples, direct linking led to full text more than 90% of the time while 1-click led to full text from about 58% to about 67% of the time.
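these per-path success rates follow directly from the counts already given: the number of attempts per path in table 1 and the number of successes per path in table 2. the short calculation below reproduces them; the variable and function names are ours for illustration only.

```python
# Counts per sample (samples 1-4), taken from table 1 (attempts) and table 2 (successes).
attempts = {
    "direct linking": [205, 210, 184, 190],
    "360 link/1-click": [77, 70, 98, 87],
}
successes = {
    "direct linking": [197, 204, 173, 185],
    "360 link/1-click": [45, 47, 64, 55],
}


def success_rates(attempts, successes):
    """Percentage of attempts on each path that reached full text, per sample."""
    return {
        path: [round(100 * s / a, 1) for s, a in zip(successes[path], attempts[path])]
        for path in attempts
    }


rates = success_rates(attempts, successes)
for path, values in rates.items():
    print(path, values)
# direct linking [96.1, 97.1, 94.0, 97.4]
# 360 link/1-click [58.4, 67.1, 65.3, 63.2]
```

note that the denominators differ from those in table 2: here each path is measured against its own attempts, not against the full 300-record sample.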
for those records using direct linking where direct linking did not lead directly to the text, the result was usually a page that did not have a link to full text (see table 4).

                                                             sample 1      sample 2      sample 3      sample 4
                                                             nov. 2012     dec. 2012     jan. 2013     jan. 2013
                                                             n = 205       n = 210       n = 184       n = 190
full text/page with full-text link                           197  96.1%    204  97.1%    173  94.0%    185  97.4%
abstract/citation only                                         6   2.9%      5   2.4%      6   3.3%      5   2.6%
unable to access full text through available full-text link    1   0.5%      1   0.5%      3   1.6%      0   0.0%
error and no full-text link on target                          1   0.5%      0   0.0%      0   0.0%      0   0.0%
listing of volumes/issues                                      0   0.0%      0   0.0%      1   0.5%      0   0.0%
wrong article                                                  0   0.0%      0   0.0%      1   0.5%      0   0.0%
minor results (note 14)                                        0   0.0%      0   0.0%      0   0.0%      0   0.0%

table 4. results with direct linking

for 360 link with 1-click, the results that did not lead to full text were more varied (see table 5). the top reasons for failure included the link leading to an error indicating the article was not available even though full text for the article was available on the site, the link leading to a list of search results, and the link leading to the table of contents for the journal issue or book. in the last case, most of those results were book chapters where 360 link only generated a link to the main page for the book instead of a link to the chapter.

                                                             sample 1      sample 2      sample 3      sample 4
                                                             nov. 2012     dec. 2012     jan. 2013     jan. 2013
                                                             n = 77        n = 70        n = 98        n = 87
full text/page with full-text link                            45  58.4%     47  67.1%     64  65.3%     55  63.2%
table of contents                                             12  15.6%      6   8.6%     10  10.2%      6   6.9%
error but full text available                                  6   7.8%     11  15.7%     10  10.2%     18  20.7%
results list                                                   6   7.8%      2   2.9%     10  10.2%      4   4.6%
error and no full-text link on target                          6   7.8%      1   1.4%      2   2.0%      2   2.3%
wrong article                                                  1   1.3%      1   1.4%      1   1.0%      2   2.3%
other                                                          1   1.3%      0   0.0%      0   0.0%      0   0.0%
abstract/citation only                                         0   0.0%      0   0.0%      1   1.0%      0   0.0%
unable to access full text through available full-text link    0   0.0%      1   1.4%      0   0.0%      0   0.0%
search box                                                     0   0.0%      1   1.4%      0   0.0%      0   0.0%
minor results (note 15)                                        0   0.0%      0   0.0%      0   0.0%      0   0.0%

table 5. results with 360 link: citations using 1-click

link analysis for all 360 link clicks

unlike the above tables, which show the results on a citation basis, the table below shows the results for all links produced by 360 link (see table 6). this includes the following:

1. links used for 1-click.
2. links in the 360 link menu that were not used for 1-click when 360 link attempted to link to full text using 1-click.
3. links in the 360 link menu where clicking the link in summon led directly to the 360 link menu instead of using 1-click.

                                                                     sample 1      sample 2      sample 3      sample 4
                                                                     nov. 2012     dec. 2012     jan. 2013     jan. 2013
                                                                     n = 167       n = 158       n = 184       n = 172
full text/page with full-text link                                    81  48.5%     84  53.2%    103  56.0%     87  50.6%
abstract/citation only                                                 0   0.0%      0   0.0%      1   0.5%      0   0.0%
unable to access full text through available full-text link            0   0.0%      1   0.6%      0   0.0%      1   0.6%
error but full text available                                          9   5.4%     14   8.9%     17   9.2%     23  13.4%
error and full text not accessible through full-text link on target    1   0.6%      0   0.0%      0   0.0%      0   0.0%
error and no full-text link on target                                 10   6.0%      1   0.6%      6   3.3%      5   2.9%
failed to find doi through link in 360 link menu                       3   1.8%      5   3.2%      5   2.7%      8   4.7%
main journal page                                                     22  13.2%     24  15.2%     17   9.2%     15   8.7%
other                                                                  2   1.2%      0   0.0%      1   0.5%      2   1.2%
360 link menu with no full-text links                                  0   0.0%      2   1.3%      3   1.6%      3   1.7%
results list                                                           9   5.4%      4   2.5%     10   5.4%      3   1.7%
search box                                                             6   3.6%      7   4.4%      5   2.7%      8   4.7%
table of contents                                                     12   7.2%      6   3.8%     10   5.4%      9   5.2%
listing of volumes/issues                                              9   5.4%      9   5.7%      5   2.7%      6   3.5%
wrong article                                                          3   1.8%      1   0.6%      1   0.5%      2   1.2%

table 6. results with 360 link: all links produced by 360 link

in addition to recording what happened, we attempted to determine why links failed to reach full text. even though direct linking is very effective, it is not 100% effective in linking to full text. when excluding records that indicated that only the citation, not full text, would be available through summon, most of the problems were due to incorrect information in summon (see table 7).
either  the  link  produced  by  summon  was  incorrectly  leading  to  an  error  or  an  abstract  when       information  technology  and  libraries  |  march  2015   70   full  text  was  available  on  the  target  site  or  summon  incorrectly  indicated  access  to  full  text  was   available.       sample  1   nov.  2012   n  =  8   sample  2   dec.  2012   n  =  6   sample  3   jan.  2013   n  =  11   sample  4   jan.  2013   n  =  5   citation-­‐only  record  in  summon   1   12.5%   3   50.0%   4   36.4%   1   20.0%   incomplete  target  collection   1   12.5%   0   0.0%   1   9.1%   1   20.0%   incorrect  coverage  in  knowledgebase   0   0.0%   0   0.0%   2   18.2%   0   0.0%   summon  has  incorrect  link   3   37.5%   1   16.7%   2   18.2%   2   40.0%   summon  incorrectly  indicating  available  access   to  full  text   3   37.5%   2   33.3%   2   18.2%   1   20.0%   table  7.  reasons  for  linking  failure  to  link  to  full  text  through  direct  linking   table  8  shows  the  reasons  links  generated  by  360  link  and  used  for  1-­‐click  did  not  lead  to  full  text.   most  of  the  failures  were  caused  by  three  general  problems:   1. incorrect  metadata  in  summon   2. incorrect  syntax  in  the  article/chapter  link  generated  by  360  link   3. target  site  does  not  support  linking  to  the  article/chapter  level           measuring  journal  linking  success  from  a  discovery  service  |  stuart,  varnum,  and   ahronheim   71     sample  1   nov.  2012   n  =  32   sample  2   dec.  2012   n  =  23   sample  3   jan.  2013   n  =  34   sample  4   jan.  
2013   n  =  32   incorrect  metadata  in  the  summon  openurl   2   6.3%   4   17.4%   3   8.8%   4   12.5%   incomplete  metadata  in  the  summon   openurl   0   0.0%   2   8.7%   0   0.0%   0   0.0%   difference  in  metadata  between  summon  and   the  target   1   3.1%   5   21.7%   0   0.0%   2   6.3%   inaccurate  data  in  knowledgebase   0   0.0%   0   0.0%   0   0.0%   1   3.1%   incorrect  coverage  in  knowledgebase   0   0.0%   0   4.3%   0   0.0%   0   0.0%   link  resolver  insufficiency   2   6.3%   0   0.0%   1   2.9%   0   0.0%   incorrect  syntax  in  the  article/chapter  link   generated  by  360  link   6   18.8%   3   13.0%   10   29.4%   7   21.9%   target  site  does  not  support  linking  to   article/chapter  level   11   34.3%   4   17.4%   5   14.7%   6   18.8%   article  not  individually  indexed   0   0.0%   1   4.3%   3   8.8%   5   15.6%   target  error  in  targeturl  translation   0   0.0%   0   0.0%   5   14.7%   3   9.4%   incomplete  target  collection   8   25.0%   1   4.3%   1   2.9%   3   9.4%   incorrect  metadata  on  the  target  site   0   0.0%   1   4.3%   0   0.0%   1   3.1%   citation-­‐only  record  in  summon   0   0.0%   0   0.0%   0   0.0%   0   0.0%   cookie   2   6.3%   0   0.0%   0   0.0%   0   0.0%   item  does  not  appear  to  have  a  doi   0   0.0%   0   0.0%   0   0.0%   0   0.0%   miscellaneous   0   0.0%   0   0.0%   4   0.0%   0   0.0%   unknown   0   0.0%   1   4.3%   2   5.9%   0   0.0%   table  8.  reasons  for  linking  failure  to  link  to  full  text  through  1-­‐click       information  technology  and  libraries  |  march  2015   72   broadening  our  view  of  360  link  to  include  all  links  generated  by  360  link  during  the  testing,  not   only  the  ones  used  by  1-­‐click  (see  table  9),  we  see  more  causes  of  failure  than  with  1-­‐click.  most  of   the  failures  were  caused  by  five  general  problems:   1. incorrect  metadata  in  summon.   2. 
link  resolver  insufficiency.  we  mostly  used  this  classification  when  360  link  only  provided   links  to  the  main  journal  page  or  database  page  instead  of  links  to  the  article  and  we   thought  it  might  have  been  possible  to  generate  a  link  to  the  article.  sometimes  this  was   due  to  configuration  changes  that  we  could  have  made  and  sometimes  it  was  because  360   link  would  only  create  article  links  if  particular  metadata  was  available  even  if  other   sufficient  identifying  metadata  was  available.   3. incorrect  syntax  in  the  article/chapter  link  generated  by  360  link.   4. target  site  does  not  support  linking  to  the  article/chapter  level.   5. miscellaneous.  most  of  the  links  that  fell  in  this  category  were  ones  that  were  intended  to   go  the  main  journal  page  by  design.  these  were  for  journals  that  are  not  in  vendor-­‐specific   packages  in  the  knowledgebase  but  in  large  general  packages  with  many  journals  on   different  platforms.  because  there  is  no  common  linking  syntax,  article-­‐level  linking  is  not   possible.  this  includes  packages  containing  open  source  titles  such  as  directory  of  open   access  journals  (doaj)  and  packages  of  subscription  titles  that  are  not  listed  in  vendor-­‐ specific  packages  in  the  knowledgebase.             measuring  journal  linking  success  from  a  discovery  service  |  stuart,  varnum,  and   ahronheim   73     sample  1   nov.  2012   n  =  86   sample  2   dec.  2012   n  =  74   sample  3   jan.  2013   n  =  81   sample  4   jan.  
2013   n  =  89   incorrect  metadata  in  the  summon  openurl   9   10.5%   5   6.8%   4   4.9%   8   9.0%   incomplete  metadata  in  the  summon   openurl   0   0.0%   2   2.7%   1   1.2%   3   3.4%   difference  in  metadata  between  summon  and   the  target   1   1.2%   6   8.1%   2   2.5%   2   2.2%   inaccurate  data  in  knowledgebase   0   0.0%   0   0.0%   1   1.2%   5   5.6%   incorrect  coverage  in  knowledgebase   3   3.5%   1   1.4%   2   2.5%   1   1.1%   link  resolver  insufficiency   20   23.3%   15   20.3%   9   11.1%   8   9.0%   incorrect  syntax  in  the  article/chapter  link   generated  by  360  link   7   8.1%   3   4.1%   10   12.3%   11   12.4%   target  site  does  not  support  linking  to   article/chapter  level   17   19.8%   6   8.1%   9   11.1%   10   11.2%   article  not  individually  indexed   0   0.0%   1   1.4%   3   3.7%   5   5.6%   target  error  in  targeturl  translation   1   1.2%   3   4.1%   7   8.6%   3   3.4%   incomplete  target  collection   11   12.8%   2   2.7%   5   6.2%   5   5.6%   incorrect  metadata  on  the  target  site   0   0.0%   1   1.4%   0   0.0%   1   1.1%   citation-­‐only  record  in  summon   0   0.0%   2   2.7%   3   3.7%   3   3.4%   cookie   2   2.3%   0   0.0%   0   0.0%   0   0.0%   item  does  not  appear  to  have  a  doi   2   2.3%   4   5.4%   5   6.2%   7   7.9%   miscellaneous   13   15.1%   22   29.7%   18   22.2%   17   19.1%   unknown   0   0.0%   1   1.4%   2   2.5%   0   0.0%   table  9.  
reasons  for  linking  failure  to  link  to  full  text  for  all  360  link  links       information  technology  and  libraries  |  march  2015   74   comparison  of  user  reports  and  random  samples   when  we  look  at  user-­‐reported  problems  during  the  same  period  over  which  we  conducted  our   manual  process  (november  1,  2012–january  29,  2013),  we  see  that  users  reported  a  problem   roughly  0.2%  of  the  time  (0.187%  of  searches  resulted  in  a  problem  report  while  0.228%  of  full-­‐ text  clicks  resulted  in  a  problem  report).  see  table  10.   sample  period   problems   reported   articlesplus   searches   mget  it   clicks   problems   reported  per   search   problems   reported  per   mget  it  click   11/1/2012– 11/30/2012   225   111,062   95,218   0.203%   0.236%   12/1/2012– 12/31/2012   105   74,848   58,346   0.140%   0.180%   1/1/2013– 1/29/2013   100   44,204   34,692   0.226%   0.288%               overall   430   230,114   188,256   0.187%   0.228%   table  10.  user  problem  reports  during  the  sample  period   the  number  of  user-­‐reported  errors  is  significantly  less  than  what  we  found  through  our   systematic  sampling  (see  table  2).  where  the  error  rate  based  on  user  reports  would  be  roughly   0.2%,  the  more  systematic  approach  showed  a  20%  error  rate.  relying  solely  on  user  reports  of   errors  to  judge  the  reliability  of  full-­‐text  links  dramatically  underreports  true  problems  by  a  factor   of  100.     conclusions  and  next  steps   comparison  of  user  reports  to  random  sample  testing  indicates  a  significant  underreporting  of   problems  on  the  part  of  users.  while  we  have  not  conducted  similar  studies  across  other  vendor   databases,  we  suspect  that  user-­‐generated  reports  likewise  significantly  lag  behind  true  errors.   future  research  in  this  area  is  recommended.     
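the arithmetic behind table 10 and the "factor of 100" comparison can be checked with a short script. this is only an illustrative sketch: the counts are transcribed from table 10, the 20% and 0.2% figures are the rounded rates quoted in the text, and the function names (rate_percent, overall, underreporting_factor) are hypothetical helpers introduced here, not part of the authors' tooling.

```python
# Counts transcribed from Table 10:
# (sample period, problems reported, ArticlesPlus searches, MGet It clicks)
PERIODS = [
    ("11/1/2012-11/30/2012", 225, 111_062, 95_218),
    ("12/1/2012-12/31/2012", 105, 74_848, 58_346),
    ("1/1/2013-1/29/2013", 100, 44_204, 34_692),
]


def rate_percent(problems, events):
    """Return problem reports as a percentage of events (searches or clicks)."""
    return 100.0 * problems / events


def overall(periods):
    """Sum problems, searches, and clicks across all sample periods."""
    problems = sum(p[1] for p in periods)
    searches = sum(p[2] for p in periods)
    clicks = sum(p[3] for p in periods)
    return problems, searches, clicks


def underreporting_factor(sampled_error_pct, reported_error_pct):
    """Ratio of the sampled error rate to the user-reported error rate."""
    return sampled_error_pct / reported_error_pct


if __name__ == "__main__":
    for label, problems, searches, clicks in PERIODS:
        print(label,
              f"{rate_percent(problems, searches):.3f}% per search,",
              f"{rate_percent(problems, clicks):.3f}% per click")

    problems, searches, clicks = overall(PERIODS)
    print("overall",
          f"{rate_percent(problems, searches):.3f}% per search,",
          f"{rate_percent(problems, clicks):.3f}% per click")

    # Rounded figures quoted in the text: 20% sampled vs. roughly 0.2% reported.
    print("underreporting factor:", round(underreporting_factor(20.0, 0.2)))
```

run on the transcribed counts, the overall rates round to the 0.187% and 0.228% reported in table 10. the factor of 100 follows from the rounded 20% and 0.2% figures; using the unrounded per-search or per-click rates would give a somewhat different ratio of the same order of magnitude.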
the number of problems discovered in full-text items that are linked via an openurl is discouraging; however, the ability of the summon discovery service to provide accurate access to full text is an overall positive because of its direct link functionality. more than 95% of direct-linked articles in our research led to the correct resource (table 3). one-click (openurl) resolution was noticeably poorer, with about 60% of requests leading directly to the correct full-text item. more alarming, we found that, of full-text requests linked through an openurl, a large portion (20%) fail. the direct links (the result of publisher-discovery service negotiations) are much more effective. this discourages us from feeling any complacency about the effectiveness of our openurl link resolution tools. the effort spent maintaining our link resolution knowledge base does not make a long-term difference in the link resolution quality.

based on the data we have collected, it would appear that more work needs to be done if openurl is to continue as a working standard. while our data shows that direct linking offers improved service for the user as an immediate reward, we do feel some concern about the longer-term effect of closed and proprietary access paths on the broader scholarly environment. from the library’s perspective, the trend to direct linking creates the risk of vendor lock-in because the vendor-created direct links will not work after the library’s business relationship with the vendor ends. an openurl is less tightly bound to the vendor that provided it. this lock-in increases the cost of changing vendors. the emergence of direct links is a two-edged sword: users gain reliability but libraries lose flexibility and the ability to adapt.

the impetus for improving openurl linking must come from libraries because vendors do not have a strong incentive to take the lead in this effort, especially when it interferes with their competitive advantage. we recommend that libraries collaborate more actively on identifying patterns of failure in openurl link resolution and remedies for those issues so that openurl continues as a viable and open method for full-text access. with more data on the failure modes for openurl transactions, libraries and content providers may be able to implement systematic improvements in standardized linking performance. we hope that the methods and data we have presented form a helpful beginning step in this activity.

acknowledgement

the authors thank kat hagedorn and heather shoecraft for their comments on a draft of this manuscript.

references

1. niso/uksg kbart working group, kbart: knowledge bases and related tools, january 2010, http://www.uksg.org/sites/uksg.org/files/kbart_phase_i_recommended_practice.pdf.
2. national information standards organization (niso), “ansi/niso z39.88 - the openurl framework for context-sensitive services,” may 13, 2010, http://www.niso.org/kst/reports/standards?step=2&project_key=d5320409c5160be4697dc046613f71b9a773cd9e.
3. adam chandler, glen wiley, and jim leblanc, “towards transparent and scalable openurl quality metrics,” d-lib magazine 17, no. 3/4 (march 2011), http://dx.doi.org/10.1045/march2011-chandler.
4. ibid.
5. national information standards organization (niso), improving openurls through analytics (iota): recommendations for link resolver providers, april 26, 2013, http://www.niso.org/apps/group_public/download.php/10811/rp-21-2013_iota.pdf.
6. niso/uksg kbart working group, kbart: knowledge bases and related tools.
7. oliver pesch, “improving openurl linking,” serials librarian 63, no. 2 (2012): 135–45, http://dx.doi.org/10.1080/0361526x.2012.689465.
8. jason price and cindi trainor, “chapter 3: digging into the data: exposing the causes of resolver failure,” library technology reports 46, no. 7 (october 2010): 15–26.
9. ibid., 26.
10. xiaotian chen, “broken-link reports from sfx users,” serials review 38, no. 4 (december 2012): 222–27, http://dx.doi.org/10.1016/j.serrev.2012.09.002.
11. lois o’neill, “scaffolding openurl results,” reference services quarterly 14, no. 1–2 (2009): 13–35, http://dx.doi.org/10.1080/10875300902961940.
12. http://www.lib.umich.edu/. see the articlesplus tab of the search box.
13. one problem we had in testing was that log data for february 2013 was not preserved. this would have been used to build the sample tested in april 2013. to get around this we decided to take two samples from the january 2013 log.
14. the “minor results” row is a combination of all results that did not represent at least 0.5% of the records using direct linking for at least one sample. this includes the following results: error but full text available, error and full text not accessible through full text link on target, main journal page, 360 link menu with no full text links, results list, search box, table of contents, and other.
15. the “minor results” row is a combination of all results that did not represent at least 0.5% of the records using 360 link for at least one sample. this includes the following results: error and full text not accessible through full text link on target, main journal page, 360 link menu with no full text links, listing of volumes/issues.

measuring library broadband networks to address knowledge gaps and data caps

chris ritzo, colin rhinesmith, and jie jiang

information technology and libraries | september 2022 https://doi.org/10.6017/ital.v41i3.13775

chris ritzo, mslis (critzo@afutures.xyz) is consultant/owner, anemophlious futures llc. colin rhinesmith, phd (crhinesmith@metro.org) is director, digital equity research center, metropolitan new york library council. jie jiang, mslis (jie.jiang@simmons.edu) is a doctoral student at the simmons university school of library and information science. © 2022.

abstract

in this paper, we present findings from a three-year research project funded by the us institute of museum and library services that examined how advanced broadband measurement capabilities can support the infrastructure and services needed to respond to the digital demands of public library users across the us. previous studies have identified the ongoing broadband challenges of public libraries while also highlighting the increasing digital expectations of their patrons. however, few large-scale research efforts have collected automated, longitudinal measurement data on library broadband speeds and quality of service at a local, granular level inside public libraries over time, including when buildings are closed.
this research seeks to address this gap in the literature through the following research question: how can public libraries utilize broadband measurement tools to develop a better understanding of the broadband speeds and quality of service that public libraries receive? in response, quantitative measurement data were gathered from an open-source broadband measurement system that was both developed for the research and deployed at 30 public libraries across the us. findings from our analysis of the data revealed that ookla measurements over time can confirm when the library’s internet connection matches expected service levels and when they do not. when measurements are not consistent with expected service levels, libraries can observe the differences and correlate this with additional local information about the causes. ongoing measurements conducted by the library enable local control and monitoring of this vital service and support critique and interrogation of the differences between internet measurement platforms. in addition, we learned that speed tests are useful for examining these trends but are only a small part of assessing an internet connection and how well it can be used for specific purposes. these findings have implications for state library agencies and federal policymakers interested in having access to data on observed versus advertised speeds and quality of service of public library broadband connections nationwide. introduction the covid-19 pandemic exposed the severity of the digital divide in the united states. during this time, lack of access to computers and the internet has been highlighted among individuals and families with limited monthly incomes in tribal, rural, and urban communities where broadband is neither available nor affordable. decades of research has shown that this digital divide is further deepened along racial and ethnic lines. 
wealthier, white, and more educated individuals consistently have higher rates of home computer and broadband ownership. many without this societal privilege rely on their local public libraries and other community spaces to fill these gaps.

the pandemic has also underscored just how significant public libraries have been in addressing people’s need for computers and high-speed internet. last year, for example, mainstream news organizations shared several stories about children, parents, and teachers all relying on wireless internet access while seated outside in school and public library parking lots, which happens both during and outside library hours.1 much less attention has been paid, however, to the broadband infrastructure and technical support that public schools and libraries need to meet the digital demands of their communities.

in 2018, our research team, composed of researchers and practitioners at the simmons university school of library and information science, measurement lab (m-lab), and internet2, received a grant (award #71-18-0110-18) from the us institute of museum and library services (imls). the purpose was to investigate how advanced broadband measurement capabilities can inform the capacity of the nation’s public libraries to support the online software applications and social and technical infrastructure needed to promote the national digital platform.2 in this paper, we present findings from this study, which seeks to address a gap in understanding, particularly among researchers, practitioners, and policymakers, about the speeds and quality of service of public library internet connections across the united states.
through our research we learned that there are significant gaps in knowledge about broadband speeds and quality of service measures that are impacting the ability of public libraries to support their communities’ digital needs. in this context, we hope the quantitative data and analysis presented in this paper contributes to the scholarship on broadband measurement in libraries, as well as to expanding awareness and understanding of broadband data. more concretely, we hope this paper helps to raise awareness of the urgent need for shared knowledge about broadband data and infrastructure that supports digital services in public libraries. we begin the paper with a brief review of key studies that have highlighted the important role of public libraries in promoting digital equity, as well as studies that have discussed the importance of measuring broadband connectivity in public libraries. we concentrate particularly on those studies that have sought to elucidate the opportunities and challenges of both connecting public libraries with high-speed internet connections and educating public librarians, other researchers, and policymakers about what is meant by broadband infrastructure and services. we then present our findings from the quantitative analysis of our broadband measurement data, which highlights the ways in which ongoing, locally collected measurements can enhance libraries’ understanding of their internet service and help inform interactions with patrons and it service providers. the paper concludes with a discussion of the contribution of our research to the scholarship, and we briefly discuss the implications for state and federal policymakers interested in better understanding the role that library broadband measurement data can play in promoting healthy digital equity ecosystems. 
literature review

digital inclusion and broadband measurement in public libraries

public libraries in the united states have been committed to bridging the digital gap by providing free public access to computers, internet, and digital literacy skills for decades.3 for example, in their study of how public libraries respond to inquiries about the digital divide through participatory forms, schenck-hamlin, han, and schenck-hamlin found that public libraries have been recognized as the “first and last” resort for internet access particularly “for those unable to afford high-speed connections at home.”4 further, bertot, real, and jaeger affirmed this idea with their digital inclusion survey data, collected over several years, by stating, “america’s public libraries are an important force for bridging this (digital) divide, with 62.1% of these outlets reporting that they are the only free providers of internet access inclusive of computers in their communities.”5

in addition to providing public access to computers and the internet, us public libraries have placed an emphasis on promoting the general public’s awareness and skills around broadband through delivering free digital literacy training sessions, as well as hosting civic discussions around the topic of broadband connections with their patrons.6 to further illustrate how public libraries narrow the digital divide, deguzman, jain, and loureiro explained that telemedicine has become a new norm in today’s medical visits, which quickly became a reality during the covid-19 pandemic. in their article, the authors show how public libraries can play a critical role in bridging this “digital health divide” that exists in many communities.7 as a bottom-up means to promote digital inclusion in the us, the role that public libraries have played to promote digital inclusion and equity cannot be overlooked.
however, as jaeger et al. explained in their study on how public libraries address the digital divide and digital inclusion, “one curious constant across policy approaches to digital divides in many, though not all, nations has been the failure to involve librarians in the formulation of definitions, policies, or other aspects of the policy-making process.”8 it is within this space that public librarians and the technological staff who support them can play an important role in co-designing the tools, skills, and knowledge needed to better understand broadband in public libraries. broadband measurement in public libraries many public libraries, particularly small, rural, and tribal libraries, face ongoing challenges in gaining accurate information about their broadband speeds and quality of service. this lack of information can limit their capacity to provide a wide range of applications and services to the community. as bertot, real, and jaeger concluded, one of the big challenges that public libraries have been dealing with is that the speed of public library internet connections “can vary significantly according to local population density.”9 in reaction to this situation, public librarians have shown great interest and need to acquire knowledge about their libraries’ current broadband performance.10 digital inclusion scholars have proposed topics that future research on public libraries and broadband measurement should explore.11 these topics include how to better inform public librarians in order to assist them in planning, as well as how to deliver sufficient and quality broadband connections to the community. other topics include looking at how to help public libraries justify the need for more workstations and bandwidth using data coming from “empirical measures, especially longitudinal measures.”12 these and other questions remain largely unanswered in the academic literature. 
the measuring library broadband networks (mlbn) project and research design

research questions and significance of study

our research sought to address this gap in the scholarship on broadband measurement in public libraries through the following research question: how can public libraries utilize broadband measurement tools and training materials to develop a better understanding of the broadband speeds and quality of service that public libraries receive? in response, our research team gathered quantitative data from an open-source broadband measurement system that was both developed for this study and deployed at 30 public libraries across the us. our research is significant because answers to this question can help strengthen public libraries as essential anchor institutions and partners in providing data to address the digital needs of their communities. the findings from our study can also assist public libraries in responding to the challenges of developing a more integrated, equitable, and dynamic set of infrastructures for delivering public computing access and digital library services.

project overview and research design

the measuring library broadband networks (mlbn) project (https://slis.simmons.edu/blogs/mlbn/) was originally conceptualized to be completed in four phases during the two-year grant period.
during the first phase, we organized a “participatory design” workshop with our 10 first-year public libraries who agreed to serve as part of our research panel on the project.13 findings from our analysis of the qualitative data gathered during the workshop revealed that public libraries wanted access to broadband measurement data in order to: (1) better communicate with their patrons about their library’s broadband connectivity, (2) respond to their communities’ digital needs, and (3) justify the importance of robust internet connectivity to their funders.14 our analysis revealed early on in the project that knowledge gaps existed around the performance of public library broadband networks, patron and staff experiences using the library’s internet connection, and the meaning and value of measurements such as speed tests. during the second phase of the project, we applied what we learned from insights gained during the workshop to our site visits with the 10 participating first-year public libraries. during our fieldwork at the libraries, we sought to interview four different groups of people: (1) library staff, (2) library administrators, (3) it staff, and (4) it administrators. the purpose was to gain multiple perspectives on the same sets of questions, which would provide additional qualitative data to help answer our research questions. in a few of the libraries, the library administrator was also the primary it professional on site. in other words, depending on the size of the organization, librarians often wore several hats, which is certainly not uncommon for small, rural, and tribal libraries. in addition to conducting interviews with these four groups, we also held focus groups with patrons on site at each of the libraries. 
during this process, we were able to learn more about the context, character, and communities of our partner libraries and gain a better sense of what it is like to work at and/or be a patron of each library, as well as why public libraries might need an open-source broadband measurement system. the other main goal during this phase was to learn more about and document the process of installing our broadband measurement devices. through this process, we gained additional insights into the nuances of the network configurations at each location and refined our device configurations and setup instructions in response. ultimately, we sought to identify potential barriers to the measurement devices working properly in the networks of our second-year libraries, when we would not have the luxury of being there in person. at the conclusion of the research program in march 2021, we asked participating library and/or it staff to complete a final evaluation survey. twenty libraries responded to a range of questions, two of which related to their understanding of the library’s internet connectivity and network management practices: “is there an overall download and/or upload cap on the connection to the entire library building?” and “is there a cap on individual devices using the internet at the library?” eight libraries responded to one or both of the above questions; their responses are in table a.2 in appendix b. training manual during phases 2 and 3, we worked with carson block, a well-known library technology expert and consultant who helped us to develop our mlbn training manual (https://dataverse.harvard.edu/dataset.xhtml?persistentid=doi:10.7910/dvn/8xxxzq). 
the development of the manual was led by chris ritzo, carson block, and colin rhinesmith to help our second-year participating public libraries install the measurement devices on their own. the manual provides a comprehensive overview of our mlbn project, including what we learned in the first year of the project about why public libraries would want to measure broadband at their libraries. section 2 focuses on the setup instructions that public libraries would need to install the devices to measure both wired and wireless internet connections. we also provided details on the hardware used, device management, and data collection, as well as the data visualization platform developed for the project, which allows public libraries to access and use the broadband data gathered from the devices installed on their network. the manual includes complete information about how the measurement platform used in this study can be set up independently for future use by any library, institution, or individual. we knew this manual would be essential for scaling to the 60 total public libraries for our project and for any additional library after the end of our grant.

final cohort of participating libraries

libraries participating in this research were recruited primarily through the suggestion of the project’s advisory board, many of whom represented state library agencies, regional research and education networks, or other intermediary organizations working with or supporting public libraries. though the covid-19 pandemic limited our ability to scale up to our goal of 60 libraries, ultimately 30 libraries were recruited to participate in the research.
appendix a lists the final cohort of participating libraries, the specific branch where measurements were conducted, the city, state, and the library’s imls code.

broadband measurement data collection

quantitative measurements of the network connections at participating libraries were collected using the murakami software developed by m-lab, running from a dedicated, on-premise measurement computer/device.15 this software provides tests from two large platforms for crowdsourced speed tests: m-lab’s network diagnostic tool (ndt), versions 5 and 7, and speedtest-cli, an open source client using the ookla platform.16 ndt is a network performance test of a connection’s bulk transport, conforming to the internet engineering task force’s (ietf) rfc 3148.17 m-lab provides two ndt testing protocols (ndt5 and ndt7), each measuring different aspects of the transmission control protocol (tcp).18 all versions of ndt measure upload and download speeds and latency using a single tcp stream between the computer running the test and the nearest m-lab server. ookla is a commercial company that created the network performance test speedtest.net.19 ookla’s test also measures upload and download speeds, as well as latency, but provides the option to measure using a single tcp stream or using multiple streams. the primary differences between these two platforms’ tests are the use of single or multiple tcp streams and the location of testing servers.20 at each location (with a few exceptions), two devices were configured using network details supplied by library or it staff and shipped to the library with setup instructions (in some cases, depending on network complexity, only one device was installed). one device was connected to the switch or router where the internet service connects to the location (egress). the other device was connected to an available switch port on the same virtual local area network (vlan) as wifi access points.
the intention was to measure the capacity of the entire location using the egress device, and the capacity of a single wifi access point (ap) to serve multiple patrons using the wifi ap device. once connected, each device ran tests at approximately six randomized times within each 24-hour period. each murakami device ran four tests: ndt 5, ndt 7, ookla single-stream, and ookla multi-stream. each test result was exported to an archive in google cloud. this data was imported into bigquery and analyzed in data studio.21 data from the 2019 public libraries survey (pls) from imls was also included to describe each library’s locale, service population, and reported numbers of public computers, annual computer use sessions, and annual wifi sessions.22 public data provided by both ookla and m-lab for the counties in which each mlbn library was located were loaded to compare each platform’s reported aggregate measurements for the surrounding area to the measurements conducted at the libraries.23 aggregate public data for the surrounding counties in our analyses excluded all measurements from the libraries themselves. along with the data itself, specific details on our data import, cleaning, and analysis are provided in our publicly available mlbn dataverse (https://dataverse.harvard.edu/dataverse/mlbn), hosted by harvard university.

limitations

the covid-19 pandemic created challenges for the research team in scaling up to 60 libraries, as was planned at the beginning of the project. therefore, we had to limit our outreach and engagement during 2020. when we asked the final 30 public libraries that were able to participate in the research whether they had closed their doors during the pandemic, all of them said yes.
however, all the libraries reported that they continued to provide wireless internet access, even though their buildings were closed to the public during this time. although we were unable to scale up to 60 public libraries, we were still thrilled with the response we received from the libraries that were able to participate. the ndt 7 tests in our program uncovered a now-resolved bug where measurements were limited by the performance of our selected premise devices, which lack proper support for encryption.24 this is observable in some of our data as a large jump in measured ndt 7 throughput after november 1, 2020, when encrypted ndt 7 tests were disabled and the tests began running unencrypted.25

findings

individual libraries’ data

data collected at each library is provided in an interactive mlbn data studio report, along with summary information about the library from the 2019 public libraries survey.26 aggregated download, upload, and latency metrics from measurements conducted at each library can be viewed on page 2, individual library data (see figure 1 for an example).27

figure 1. individual library data page for andover, massachusetts.28

a unique feature of the report is a map of server locations to which tests were conducted. this feature demonstrates the different topologies of the ookla and m-lab platforms and enables analysis of measurements to specific servers. if a library’s internet service provider (isp) hosts an ookla server, it can be selected to display only measurements of the isp’s network, as shown in figure 2.
the federal communications commission (fcc) distinguishes this topology as on-net, when the server and client are both within the same network, in contrast to off-net, where the server and client are in different networks.29

figure 2. individual library data page for clarkston, michigan, merit networks’ nearest ookla server selected.30

by selecting all servers, we can observe the wide geographic range of ookla servers used. conversely, when we select one of the ndt tests, we can see that m-lab servers are only hosted in large metropolitan areas, as seen in figure 3. this demonstrates key differences in the server locations of these two measurement platforms and how the data from each relates to the fcc’s national broadband standard.31

figure 3. test server locations for clarkston, michigan: all ookla and m-lab servers selected. panels: clarkston, michigan, all ookla servers selected;32 clarkston, michigan, all m-lab servers selected.33

additional aggregate speeds are provided on the individual library data page to communicate general measurement trends over time: maximum upload and download speed by month, day, hour, and weekday (see figures 4–7). this allows a library to confirm advertised speeds, as seen in figure 4, where the connection at clarkston, michigan, was measured consistently at just under 100 mbps symmetric download and upload. we also observe where measurements are not always consistent, as seen in figures 5 and 7. in figure 5 we observe a dip in the upload median for westchester county, new york, in late june 2020, and a drop in upload median in late october 2020. with additional information, a library could correlate these observations with network outages, service changes, or network management changes. for example, the change in october 2020 could have been a network management change or service change to 200 mbps symmetric.
in figure 7 we observe a trend that many librarians will recognize: a slight dip in median speeds over the peak hours of use.

figure 4. max speeds by month, clarkston, michigan.34

figure 5. daily aggregate speeds, westchester county, new york.35

figure 6. weekday aggregate speeds, pasco county, florida.36

figure 7. hourly aggregate speeds, estherville, iowa.37

with additional local knowledge about network use, conditions, and events, library and it staff can use ongoing measurements to confirm and explain service changes or uncover issues that were not previously known. many mlbn libraries sought ongoing measurements of internet service to confirm service delivery levels, and some shared their expected service speeds in our final program evaluation survey. the libraries that shared their expected service levels are listed in table a.2 in appendix b. using these reported speeds as an example, we can observe where the overall measured speeds were consistent with the service levels and where they were not, using the ookla multi-stream measurements. bennington (vt) free library reported a 100 mbps symmetric connection as their expected service level, and the monthly maximum speeds ranged between 93 and 98 mbps.38 similar results were seen in live oak, georgia; monroe county, michigan; and sheridan, arkansas.39 in other cases, the reported service levels did not match the measurements.
in pasco county, florida, measurements indicate a 50 mbps symmetric connection where the reported service level was 100 mbps download and 25 mbps upload.40 and in ventura county, california, our measurements indicate a ~300 mbps symmetric connection, but the reported service level was 1 gbps symmetric. finally, in several cases measurements may confirm anomalies or changes in the library’s internet service. in these examples we do not have local knowledge of changes in service or events that might explain anomalies observed in the data, but we can nonetheless observe that a change happened and make inferences about possible causes. some examples include:

• graham county, arizona: possible service delivery change in may 2020 from ~100/10 to ~300/30 (download/upload, in mbps).41
• traverse city, michigan: possible service delivery change in january 2021 from ~80/5 to ~300/20 mbps.42
• waltham, massachusetts: change to symmetric download and upload in november 2020, from ~50/25 to ~50/50 mbps.43
• truro, massachusetts: observed changes in symmetry of upload and download measurements in june 2020 and march 2021 may indicate that changes in network management were being tested to adapt to changing needs.44
• westchester county, new york: observed dip in some upload measurements in late june 2020 at specific times of day, for an unknown reason.45

comparing average monthly maximum speeds

the final two pages (5 and 6) of our data studio report display the maximum overall speeds and the average monthly maximum speeds measured for each library, filterable by imls code, access media, and type of isp.46 figure 8 shows a report for the average maximum speeds per test at libraries connected with fiber.

figure 8. average maximum speeds per test measured at mlbn libraries connected with fiber.
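the average monthly maximum described above can be illustrated with a minimal python sketch. it assumes each test result has already been reduced to a timestamp and a speed in mbps for one library, one test type, and one direction; the function name, the tuple layout, and the sample values are hypothetical and are not taken from the mlbn dataset or its bigquery schema.

```python
from datetime import datetime
from statistics import mean


def average_monthly_max(results):
    """Return ({month: max_mbps}, average of the monthly maxima).

    `results` is an iterable of (iso_timestamp, mbps) pairs. This layout is
    an assumption made for illustration, not the MLBN export format.
    """
    monthly_max = {}
    for timestamp, mbps in results:
        month = datetime.fromisoformat(timestamp).strftime("%Y-%m")
        # Keep only the fastest result seen so far in each calendar month.
        monthly_max[month] = max(mbps, monthly_max.get(month, mbps))
    if not monthly_max:
        raise ValueError("no measurements supplied")
    return monthly_max, mean(monthly_max.values())


# Hypothetical download results (Mbps) from a single test type.
sample = [
    ("2020-10-03T04:12:00", 91.0),
    ("2020-10-17T15:40:00", 97.5),
    ("2020-11-02T09:05:00", 93.2),
    ("2020-11-20T22:31:00", 95.8),
]

if __name__ == "__main__":
    by_month, average = average_monthly_max(sample)
    print(by_month, round(average, 2))
```

taking the maximum per month before averaging keeps a single slow test from pulling the figure down, which is one plausible reason to use this statistic when checking a connection against an advertised service tier.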
comparison of measurements and related data

to support increased understanding of network measurement within the public library community, we also compared measurements from the public libraries that participated in our study to the public data of crowdsourced measurements from the two large-scale internet measurement platforms used in our research, ookla and m-lab. we can observe the differences or similarities in measurements between the tests conducted from the libraries’ premise devices and the publicly released data in aggregate for the surrounding county. the weighted average speeds and latency are provided by quarter, because ookla’s public data is not available at a finer granularity.

figure 9. comparing individual library data to public datasets, twin falls, idaho, 2020 q4.47

on page 4 of the data studio report we can observe whether the measured speeds in mlbn libraries were lower or higher than measurements from the surrounding county (see figure 9), along with the percentage difference between the two sources (see figure 10).48 while these differences are interesting to observe, and in some cases seem quite pronounced, this is not a finding that explains whether libraries are getting better or worse speeds than their communities. this is a coarse comparison that we might think of as a kind of litmus test for further inquiry, rather than findings that tell a definitive story. ookla public data aggregates all measurements from all isps together, while measurements from each mlbn library come from a single isp. a subscription to ookla speedtest intelligence might enable more direct comparisons on a per isp basis.

figure 10. mlbn data studio report, page 4: was public data or library data higher or lower?
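as a rough illustration of the comparison described above, the following python sketch computes a weighted average for a county and the percentage difference between a library’s figure and that county figure. the source does not state the exact formulas used in the report, so both are assumptions: the county average is weighted by the number of tests behind each aggregate, and the percentage difference is taken relative to the county value. all names and numbers are hypothetical.

```python
def weighted_average(aggregates):
    """Average speeds weighted by test count.

    `aggregates` is a list of (average_mbps, test_count) pairs, for example
    one pair per geographic tile within a county (assumed layout).
    """
    total_tests = sum(count for _, count in aggregates)
    if total_tests <= 0:
        raise ValueError("no tests to weight by")
    return sum(mbps * count for mbps, count in aggregates) / total_tests


def compare_to_county(library_mbps, county_mbps):
    """Return ("higher" | "lower" | "equal", percent difference vs. county)."""
    if county_mbps <= 0:
        raise ValueError("county_mbps must be positive")
    # Assumed definition: difference expressed relative to the county value.
    percent = (library_mbps - county_mbps) * 100.0 / county_mbps
    if percent > 0:
        return "higher", percent
    if percent < 0:
        return "lower", percent
    return "equal", percent


if __name__ == "__main__":
    # Hypothetical quarterly county aggregates: (average Mbps, number of tests).
    county = weighted_average([(100.0, 10), (50.0, 30)])
    print(county, compare_to_county(75.0, county))
```

because the public data mixes every isp in the county, a result from a sketch like this is only a prompt for further inquiry, as the text above cautions.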
discussion

our research builds on the digital inclusion survey that used a version of the ookla test in a supplemental speed test study in which “libraries were asked to run the speed test when the library was closed, during regular hours of operation, and when usage was light, normal, or heavy, by the librarians’ estimation.”49 the software created by mlbn to collect ongoing, randomized measurements extends the idea of the digital inclusion survey’s supplemental speed test, making it a source of monitoring data that can be correlated with each location’s connection plans, service tiers, costs, and other metadata. this approach, which relies on a dedicated, on-premise device, aligns with the methods used by the fcc.50 the mlbn system also goes further, providing a framework for any open-source measurement to be added as an available test. since the conclusion of our research, several new tests have either been added or are being considered.51 as new measurement tools and analyses are developed by the research community, the mlbn system can incorporate them and bring network science researchers’ understanding to anchor institutions and the general public. while speed tests have been used in this research and its predecessors, we find there is a gap in public understanding about what these tests can tell us. one important outcome of our study is that understanding the experience of using the internet, measuring it, and regulating it all require additional measurements and approaches that go beyond speed tests alone. speed tests and the platforms that support them differ substantially from one another. internet service plans may focus on upload and download speeds, as does telecommunications regulation at the fcc, but these tests offer only simple and incomplete assessments. the ookla and m-lab platforms provide two different controlled experiments designed to measure internet protocols and performance for different segments of the internet’s topology.
they both use data generated specifically for the measurement itself. but these tests do not measure our experiences using the internet in general. for that we need network science researchers to develop new measurement methods and analyses. we advocate for even more nuance and additional metrics in the measurement and understanding of internet service beyond speed. this perspective aligns with the research and network science communities, which are designing new measurements to account for user experience, content delivery, and latency issues, all of which are often incorrectly assumed to be measured by speed tests.52 we took the approach of using multiple tests and platforms to provide complementary measurements of different aspects of delivered internet service. if we need to confirm advertised service levels, we can look at the ookla test data. in this analysis, we used the average monthly maximum speeds as measured by the ookla multi-stream test since it was used in the digital inclusion survey and most closely aligns with isps’ terms of service.53 m-lab’s ndt, on the other hand, provides a diagnostic measurement for how well the underlying tcp connection is performing over the measured path. that measurement path for m-lab tests always traverses the boundaries between networks. if we want to assess our isp’s connectivity to the internet beyond the last mile network it maintains, we can examine the ndt data. however different the methodologies and measurements of ookla’s speed test and m-lab’s ndt test are, we agree with the internet measurement community that measurements of throughput or speed should not be the only focus of assessing a connection’s quality, and a broader public understanding that includes other measurements and nuances needs to be cultivated.54 researchers’ understanding of the usefulness of both ookla and ndt measurements is not static, as recent analyses of their public datasets have shown.55 the research community is also exploring new metrics derived from various data sources that may eventually provide analyses that speak to the experience of using the internet, as well as other technical factors that can influence performance such as latency, bufferbloat, and responsiveness.56 what does this all mean for public libraries? building on the speed test used in the digital inclusion survey, the mlbn measurement system enables communities to collect ongoing measurements using a dedicated premise device, leveraging available open-source measurement tools, instead of running periodic or occasional tests. using this longitudinal data, libraries can confirm expected service levels using ookla test results, uncover where there is a mismatch in understanding of service levels or network management practice, or compare a library’s measured service to the surrounding community using public data sources. additional measurements like m-lab’s ndt can assess a library’s connectivity beyond the isp’s network. the resulting measurements are useful for interrogating the differences in platforms, their tests, and regulatory or funding benchmarks. but while speed tests can provide useful metrics for understanding general trends and anomalies, an appropriate understanding of what they do and do not measure is also needed. speed tests demonstrating advertised levels do not necessarily mean that users of that network will not experience slowness as content is delivered to their computers over the same connection. as the federal government prepares to outlay infrastructure dollars to states to improve internet access and service quality, libraries and other public institutions in their states will need specific data and understanding of its nuances and differences.
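the idea of confirming expected service levels against measurements can be sketched in python as follows. this is only an illustration: the 10 percent tolerance and the function name are arbitrary assumptions, not part of the mlbn system, and the example values are the approximate figures reported in the findings (bennington at up to 98 mbps against a 100 mbps tier, pasco county at about 50 mbps symmetric against 100/25, ventura county at about 300 mbps against 1 gbps).

```python
def check_service_level(expected_mbps, measured_max_mbps, tolerance=0.10):
    """Classify a measured maximum speed against the expected service tier.

    `tolerance` is a hypothetical threshold: results within +/-10% of the
    expected speed are treated as consistent with the tier.
    """
    if expected_mbps <= 0:
        raise ValueError("expected_mbps must be positive")
    ratio = measured_max_mbps / expected_mbps
    if ratio < 1 - tolerance:
        return "below expected"
    if ratio > 1 + tolerance:
        return "above expected"
    return "consistent"


# (expected Mbps, approximate measured Mbps) taken from the findings above.
examples = {
    "bennington download": (100, 98),
    "pasco county download": (100, 50),
    "pasco county upload": (25, 50),
    "ventura county download": (1000, 300),
}

if __name__ == "__main__":
    for name, (expected, measured) in examples.items():
        print(name, check_service_level(expected, measured))
```

a mismatch flagged this way does not say who is right; as the discussion notes, it may reflect a misunderstanding of the service tier or a network management practice, which local staff would need to investigate.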
conclusion

in this paper, we sought to promote greater understanding about the speeds and quality of service of public library internet connections, an understudied area within library and information science, as well as among broadband policymakers. library staff and administrators need information to understand and communicate about a library’s network capacities, management practices, and diagnostic or monitoring information. the availability of measurements from different sources can help build shared understanding about a library’s internet connectivity between library staff and it or network administration personnel. and while speed tests are admittedly limited in what they can tell us about internet capacity, library staff who have access to these types of measurements, as well as other information provided by a library’s it staff, would be better equipped to engage with patrons around questions of internet stability and capacity in support of the library’s mission. from our observations of the data collected, the mlbn measurement system can be used to enhance understanding of the library’s internet service and network management. for the subscribed service levels identified at mlbn libraries, ookla measurements show when the library’s measured connection speeds matched the expected service levels and when they did not. when measurements are not consistent with expected service levels, libraries can observe the differences and correlate them with additional local information. ongoing measurements conducted by the library enable local control and monitoring of this vital service and support critique and interrogation of the differences between internet measurement platforms, their topologies, tests, and data, from the perspective of the library doing the measurement.
speed tests are useful for examining these trends but may not always be indicative of a user’s experience accessing and using internet content and services. new research and leadership from the internet measurement community are needed to provide more nuanced and authentic assessments of both network performance and user experience. emerging research and analyses published openly can be added to the mlbn system to support increased public understanding of internet connection quality and user experience. we hope this paper and our research will help support public libraries interested in ongoing measurement and assessment of their internet service, as well as contribute to discussion of the implications for state and federal policymakers interested in better understanding the key role that public libraries play in their local digital equity ecosystems.

acknowledgments

funding statement

this work was supported by an award (#lg-71-18-0110-18) from the us institute of museum and library services national leadership grants for libraries program.

data accessibility

the datasets supporting this article have been uploaded to the harvard dataverse, located here: https://dataverse.harvard.edu/dataverse/mlbn

appendix a: all participating libraries

table a.1.
the final cohort of participating libraries, the specific branch where measurements were conducted, the city, state, and the library’s imls code

library | branch (if applicable) | city | state | imls region code
andover memorial hall library | n/a | andover | ma | 21 suburb, large
arkansas river valley regional library | arkansas river valley | dardanelle | ar | 33 town, remote
avery mitchell yancy (amy) regional library | avery morrison library | newland | nc | 42 rural, distant
bennington free library | n/a | bennington | vt | 32 town, distant
caruthersville public library | n/a | caruthersville | mo | 33 town, remote
cherokee public library | n/a | cherokee | ia | 33 town, remote
clarkston independence district library | main branch | clarkston | mi | 21 suburb, large
cochise county library district | elfrida | elfrida | az | 32 town, distant
denver public library | central library | denver | co | 11 city, large
estherville public library | n/a | estherville | ia | 33 town, remote
mid arkansas regional library system | grant county library | sheridan | ar | 33 town, remote
casewell county library | gunn memorial public library | yanceyville | nc | 42 rural, distant
hall county library system | gainesville | gainesville | ga | 13 city, small
hollis public library | n/a | hollis | ak | 43 rural, remote
live oak public libraries | bull street library | savannah | ga | 13 city, small
monroe county library system | bedford branch library | temperance | mi | 21 suburb, large
multnomah county library | st. johns branch | portland | or | 12 city, midsize
pasco county library | regency park library | new port richey | fl | 21 suburb, large
pryor public library | n/a | pryor | ok | 32 town, distant
union county library system | the public library for union county | lewisburg | pa | 32 town, distant
safford city-graham county library | n/a | safford | az | 32 town, distant
saline county library | n/a | benton | ar | 33 town, remote
saint paul public library | rondo branch, central branch | saint paul | mn | 11 city, large
the ferguson library | n/a | stamford | ct | 12 city, midsize
traverse area district library | kingsley branch library | kingsley | mi | 21 suburb, large
truro public library | n/a | truro | ma | 21 suburb, large
twin falls public library | n/a | twin falls | id | 33 town, remote
ventura county public library | ep foster branch, admin branch | ventura | ca | 21 suburb, large
waltham public library | main library | waltham | ma | 21 suburb, large
westchester county public library | hendrick hudson free library, library system datacenter | montrose | ny | 21 suburb, large

appendix b: final program evaluation survey

table a.2.
final evaluation responses on internet connectivity and network management

library | survey respondent role(s) | service tier from survey (download/upload) | per device limit imposed
bennington free library | library staff, it staff | 100/100
live oak public libraries | it staff | 300/300
monroe county library system | network administrator | 50/20
pasco county library | network administrator | 100/25
grant county library | library administrator | 15/15
public library for union county | it staff | 10 mb/s
ventura county public library | it staff | 1000/1000
waltham public library | library staff, it staff | 100/100 | 50 mb/s

endnotes

1 the editorial board, “doing schoolwork in the parking lot is not a solution,” the new york times, july 18, 2020, https://www.nytimes.com/2020/07/18/opinion/sunday/broadband-internet-access-civil-rights.html; kathleen gray, “these buses bring school to students,” the new york times, december 17, 2020, https://www.nytimes.com/interactive/2020/12/17/us/school-bus-remote-learning-wifi.html; cecilia kang, “parking lots have become a digital lifeline,” the new york times, may 5, 2020, https://www.nytimes.com/2020/05/05/technology/parking-lots-wifi-coronavirus.html; dan levin, “in rural ‘dead zones,’ school comes on a flash drive,” the new york times, november 13, 2020, https://www.nytimes.com/2020/11/13/us/wifi-dead-zones-schools.html.
2 institute of museum and library services, “lg-71-18-0110-18,” accessed august 25, 2021, https://www.imls.gov/grants/awarded/lg-71-18-0110-18-0.
3 john carlo bertot, brian real, and paul t. jaeger, “public libraries building digital inclusive communities: data and findings from the 2013 digital inclusion survey,” library quarterly 86, no.
3 (2016): 270–89, https://doi.org/10.1086/686674; donna schenck-hamlin, soo-hye han, and bill schenck-hamlin, “library-led forums on broadband: an inquiry into public deliberation,” library quarterly 84, no. 3 (july 2014): 278–93; sharon strover, brian whitacre, colin rhinesmith, and alexis schrubbe, “the digital inclusion role of rural libraries: social inequities through space and place,” media, culture & society 42, no. 2 (2020), https://doi.org/10.1177/0163443719853504.
4 schenck-hamlin, han, and schenck-hamlin, “library-led forums on broadband,” 280.
5 bertot, real, and jaeger, “public libraries building digital inclusive communities,” 271.
6 schenck-hamlin, han, and schenck-hamlin, “library-led forums on broadband.”
7 pamela b. deguzman, neha jain, and christine g. loureiro, “public libraries as partners in telemedicine delivery: a review and research agenda,” public library quarterly 41, no. 3 (may 2022): 294–304.
8 paul t. jaeger, john carlo bertot, kim m. thompson, sarah m. katz, and elizabeth j. decoster, “the intersection of public policy and public access: digital divides, digital literacy, digital inclusion, and public libraries,” public library quarterly 31, no. 1 (january 2012): 4.
9 bertot, real, and jaeger, “public libraries building digital inclusive communities,” 276.
10 colin rhinesmith et al., “co-designing an open source broadband measurement system with public libraries,” in larry stillman, misita anwar, colin rhinesmith, and vanessa rhinesmith, eds., proceedings: 17th cirn conference 6-8 november 2019, monash university prato centre, italy: “whose agenda: action, research, & politics” (department of human centred computing, monash university, 2020): 153–76, https://www.researchgate.net/publication/341882544_co-designing_an_open_source_broadband_measurement_system_with_public_libraries.
11 john carlo bertot and charles r. mcclure, “assessing sufficiency and quality of bandwidth for public libraries,” information technology and libraries 26, no. 1 (march 2007): 14–22; lauren h. mandel, bradley w. bishop, charles r. mcclure, john carlo bertot, and paul t. jaeger, “broadband for public libraries: importance, issues, and research needs,” government information quarterly 27, no. 3 (january 1, 2010): 280–91.
12 mandel, bishop, mcclure, bertot, and jaeger, “broadband for public libraries,” 388.
13 douglas schuler and aki namioka, eds., participatory design: principles and practices (hillsdale, nj: lawrence erlbaum associates, inc., 1993).
14 rhinesmith et al., “co-designing.” 15 measurement lab, “m-lab/murakami: run automated internet measurement tests in a docker container” (2021), https://github.com/m-lab/murakami/. 16 measurement lab, “m-lab/ndt5-client-go: ndt5 reference client implementation in go” (2021), https://github.com/m-lab/ndt5-client-go; sivel, “sivel/speedtest-cli: command line interface for testing internet bandwidth using speedtest.net” (2021), https://github.com/sivel/speedtest-cli. 17 measurement lab, “ndt (network diagnostic tool),” https://www.measurementlab.net/tests/ndt/. 18 lai yi ohlsen, matt mathis, and stephen soltesz, “evolution of ndt,” measurement lab (blog), august 5, 2020, https://www.measurementlab.net/blog/evolution-of-ndt/. 19 ookla, “speedtest,” https://www.speedtest.net/. 20 measurement lab, “where are m-lab servers hosted?”, https://support.measurementlab.net/help/en-us/9-platform/2-where-are-m-lab-servershosted. 21 mlbn data studio report, page 1—overview of mlbn libraries, https://datastudio.google.com/u/0/reporting/0dff817b-0e0e-446e-a3b3406121291124/page/gxxib. 22 institute of museum and library services, “public libraries survey,” https://www.imls.gov/research-evaluation/data-collection/public-libraries-survey. 23 ookla, “ookla’s open data initiative,” https://www.ookla.com/ookla-for-good/open-data; measurement lab, “data overview,” https://www.measurementlab.net/data/. 24 measurement lab, “detect cpu capabilities and set scheme accordingly by robertodauria · pull request #62 · m-lab/ndt7-client-go” (2021), https://github.com/m-lab/ndt7-clientgo/pull/62; measurement lab, “m-lab/ndt7-client-go: ndt7 reference client implementation in go” (2021), https://github.com/m-lab/ndt7-client-go. 
https://github.com/m-lab/murakami/ https://github.com/m-lab/ndt5-client-go https://github.com/sivel/speedtest-cli https://www.measurementlab.net/tests/ndt/ https://www.measurementlab.net/blog/evolution-of-ndt/ https://www.speedtest.net/ https://support.measurementlab.net/help/en-us/9-platform/2-where-are-m-lab-servers-hosted https://support.measurementlab.net/help/en-us/9-platform/2-where-are-m-lab-servers-hosted https://datastudio.google.com/u/0/reporting/0dff817b-0e0e-446e-a3b3-406121291124/page/gxxib https://datastudio.google.com/u/0/reporting/0dff817b-0e0e-446e-a3b3-406121291124/page/gxxib https://www.imls.gov/research-evaluation/data-collection/public-libraries-survey https://www.ookla.com/ookla-for-good/open-data https://www.measurementlab.net/data/ https://github.com/m-lab/ndt7-client-go/pull/62 https://github.com/m-lab/ndt7-client-go/pull/62 https://github.com/m-lab/ndt7-client-go information technology and libraries september 2022 measuring library broadband networks | ritzo, rhinesmith, and jiang 21 25 measurement lab, “add to ndt7 runner to force all tests to be non-tls” (2020), https://github.com/mlab/murakami/commit/3770d6b63ebd9ad62b3754e0642bd0e9216e171e. 26 mlbn data studio report, page 1—overview of mlbn libraries, https://datastudio.google.com/s/ghtw2gm-vfi. 27 mlbn data studio report, page 2—individual library data, https://datastudio.google.com/s/kppdgf3i8c4. 28 mlbn data studio report page 2—individual library data page, andover, massachusetts, https://datastudio.google.com/s/unpy5z1m21g. 29 fcc office of engineering and technology, “technical appendix to the tenth mba report” (n.d.) 24, accessed 2021-06-22, http://data.fcc.gov/download/measuring-broadbandamerica/2020/technical-appendix-fixed-2020.pdf. 30 mlbn data studio report page 2—individual library data for clarkston, michigan, merit networks’ nearest ookla server selected, https://datastudio.google.com/s/pqpcybmhaj8. 
31 ookla, “the speedtest server network,” accessed august 16, 2021, https://www.ookla.com/speedtest-servers; m-lab, “ndt data in ntia indicators of broadband need,” accessed february 15, 2022, https://www.measurementlab.net/blog/ntia/. 32 mlbn data studio report page 2—clarkston, michigan—all ookla servers selected, https://datastudio.google.com/s/oilgppl47rc. 33 mlbn data studio report page 2—clarkston, michigan—all m-lab servers selected, https://datastudio.google.com/s/j79kg_wmyrc. 34 mlbn data studio report page 2—max speeds by month—clarkston, michigan https://datastudio.google.com/s/hyljk_mpvl4. 35 mlbn data studio report page 2—daily speeds—westchester county, ny https://datastudio.google.com/s/ixgn8v4r3ia. 36 mlbn data studio report page 2—weekday aggregate speeds—pasco county, fl https://datastudio.google.com/s/gadtydi2g9k. 37 mlbn data studio report—hourly speeds—estherville, ia https://datastudio.google.com/s/n5pe2y9rjyg. 38 mlbn data studio report page 2—individual library data (bennington, vt and speedtestmulti-stream selected) https://datastudio.google.com/s/njreltxejge. 
39 mlbn data studio report page 2—individual library data—ookla multi-stream measurements for live oak, ga, https://datastudio.google.com/s/kild8kkltza; mlbn data studio report page 2—individual library data—ookla multi-stream measurements for monroe county, mi, https://datastudio.google.com/s/p140swjihjm; mlbn data studio report page 2—individual https://github.com/m-lab/murakami/commit/3770d6b63ebd9ad62b3754e0642bd0e9216e171e https://github.com/m-lab/murakami/commit/3770d6b63ebd9ad62b3754e0642bd0e9216e171e https://datastudio.google.com/s/ghtw2gm-vfi https://datastudio.google.com/s/kppdgf3i8c4 https://datastudio.google.com/s/unpy5z1m21g http://data.fcc.gov/download/measuring-broadband-america/2020/technical-appendix-fixed-2020.pdf http://data.fcc.gov/download/measuring-broadband-america/2020/technical-appendix-fixed-2020.pdf https://datastudio.google.com/s/pqpcybmhaj8 https://www.ookla.com/speedtest-servers https://www.measurementlab.net/blog/ntia/ https://datastudio.google.com/s/oilgppl47rc https://datastudio.google.com/s/j79kg_wmyrc https://datastudio.google.com/s/hyljk_mpvl4 https://datastudio.google.com/s/ixgn8v4r3ia https://datastudio.google.com/s/gadtydi2g9k https://datastudio.google.com/s/n5pe2y9rjyg https://datastudio.google.com/s/njreltxejge https://datastudio.google.com/s/kild8kkltza https://datastudio.google.com/s/p140swjihjm information technology and libraries september 2022 measuring library broadband networks | ritzo, rhinesmith, and jiang 22 library data—ookla multi-stream measurements for sheridan, ar, https://datastudio.google.com/s/pehrfxosx6s. 40 mlbn data studio report page 2—individual library data—ookla multi-stream measurements for pasco county, fl, https://datastudio.google.com/s/uxgjlfv44ze. 41 mlbn data studio report page 2—individual library data—graham county, az—observing maximum ookla measured speeds by month, https://datastudio.google.com/s/ieal3b2vbu8. 
42 mlbn data studio report page 2—individual library data—traverse city, mi – observing maximum ookla measured speeds by month, https://datastudio.google.com/s/tljy900tx-q. 43 mlbn data studio report page 2—individual library data—waltham, ma—observing maximum ookla measured speeds by month, https://datastudio.google.com/s/pxzdrb0chb0. 44 mlbn data studio report page 2—individual library data—truro, ma—observing maximum ookla measured speeds by month, https://datastudio.google.com/s/okcwp-_f6qs. 45 mlbn data studio report page 2—individual library data—westchester county, ny— observing daily aggregate speeds between june 17-29, 2020 and hourly aggregate speeds for tests in the 8 a.m., 10 a.m., 3 p.m., and 10 p.m. columns, https://datastudio.google.com/s/vlegynkoqc. 46 mlbn data studio report page 5—comparison of overall maximum speeds by test (fiber access media selected), https://datastudio.google.com/s/ii2f7onuchi; mlbn data studio report page 6—average monthly maximum speeds per test (all libraries selected), https://datastudio.google.com/s/oo7xal_kv4k. 47 mlbn data studio report page 3—comparing individual library data to public datasets—twin falls, id, 2020 q4, https://datastudio.google.com/s/vqq38wyraho. 48 mlbn data studio report page 4—was public data or library data higher or lower?, https://datastudio.google.com/s/meb4s-cflcw. 49 bertot, real, and jaeger, “public libraries building digital inclusive communities,” 271; american library association, “library broadband speed test shows increased capacity; room still for improvement” (press release), https://www.ala.org/news/pressreleases/2015/04/library-broadband-speed-test-shows-increased-capacity-room-stillimprovement. 50 federal communications commission, “measuring broadband america,” https://www.fcc.gov/general/measuring-broadband-america. 
51 measurement lab, “add new runner for ooniprobe-cli,” https://github.com/mlab/murakami/pull/103; measurement lab, “add fast.com test runner,” https://github.com/m-lab/murakami/issues/48; “data science institute,” university of chicago, https://cdac.uchicago.edu/; university of chicago data science institute, “netrics – active measurements of internet performance,” https://github.com/chicago-cdac/nm-expactive-netrics/. https://datastudio.google.com/s/pehrfxosx6s https://datastudio.google.com/s/uxgjlfv44ze https://datastudio.google.com/s/ieal3b2vbu8 https://datastudio.google.com/s/tljy900tx-q https://datastudio.google.com/s/pxzdrb0chb0 https://datastudio.google.com/s/okcwp-_f6qs https://datastudio.google.com/s/vleg-ynkoqc https://datastudio.google.com/s/vleg-ynkoqc https://datastudio.google.com/s/ii2f7onuchi https://datastudio.google.com/s/oo7xal_kv4k https://datastudio.google.com/s/vqq38wyraho https://datastudio.google.com/s/meb4s-cflcw https://www.ala.org/news/press-releases/2015/04/library-broadband-speed-test-shows-increased-capacity-room-still-improvement https://www.ala.org/news/press-releases/2015/04/library-broadband-speed-test-shows-increased-capacity-room-still-improvement https://www.ala.org/news/press-releases/2015/04/library-broadband-speed-test-shows-increased-capacity-room-still-improvement https://www.fcc.gov/general/measuring-broadband-america https://github.com/m-lab/murakami/pull/103 https://github.com/m-lab/murakami/pull/103 https://github.com/m-lab/murakami/issues/48 https://cdac.uchicago.edu/ https://github.com/chicago-cdac/nm-exp-active-netrics/ https://github.com/chicago-cdac/nm-exp-active-netrics/ information technology and libraries september 2022 measuring library broadband networks | ritzo, rhinesmith, and jiang 23 52 internet architecture board, “measuring network quality for end-users, 2021, https://www.iab.org/activities/workshops/network-quality/; david d. 
clark and sara wedeman, “measurement, meaning and purpose: exploring the m-lab ndt dataset (august 2, 2021), https://ssrn.com/abstract=3898339 or http://dx.doi.org/10.2139/ssrn.3898339. 53 bertot, real, and jaeger, “public libraries building digital inclusive communities.” 54 internet architecture board, “measuring network quality for end-users, 2021” (call for papers), accessed august 26, 2021, https://www.iab.org/activities/workshops/network-quality/; lai yi ohlsen and chris ritzo, “ndt data in ntia indicators of broadband need,” measurement lab (blog), july 15, 2021, https://www.measurementlab.net/blog/ntia/. 55 clark and wedeman, “measurement, meaning and purpose.” 56 lai yi ohlsen, “m-lab research fellows – sprint 2022,” measurement lab (blog), january 13, 2022, https://www.measurementlab.net/blog/research-fellow-announcement/; lai yi ohlsen, “upcoming m-lab community call discussing latency, bufferbloat, responsiveness,” measurement lab (blog), august 18, 2021, https://www.measurementlab.net/blog/community-call-announcement/; broadband internet technical advisory group (bitag), “latency explained,” https://bitag.org/latencyexplained.php; internet architecture board, “measuring network quality for end-users 2021,” https://www.iab.org/activities/workshops/network-quality/; caida, “nsf workshop on overcoming measurement barriers to internet research (wombir 2021), https://www.caida.org/workshops/wombir/2101/; caida, “2nd nsf workshop on overcoming measurement barriers to internet research (wombir-2), https://www.caida.org/workshops/wombir/2104/; ietf, “responsiveness under working conditions” (draft), https://datatracker.ietf.org/doc/draft-cpaasch-ippm-responsiveness/. 
revitalizing the library opac | mi and weng

the behavior of academic library users has drastically changed in recent years. internet search engines have become the preferred tool over the library online public access catalog (opac) for finding information. libraries are losing ground to online search engines. in this paper, two aspects of opac use are studied: (1) the current opac interface and searching capabilities, and (2) the opac bibliographic display. the purpose of the study is to find answers to the following questions: why is the current opac ineffective?
what can libraries and librarians do to deliver an opac that is as good as search engines to better serve our users? revitalizing the library opac is one of the pressing tasks that libraries must accomplish.

the information-seeking behavior of today’s academic library users has drastically changed in recent years. according to a survey conducted and published by oclc in 2005, approximately 89 percent of college students across all the regions that were included in the study (including areas outside the united states) begin their electronic information searches with internet search engines.1 more than half of u.s. residents used google for their searches. internet search engines dominate the information-seeking landscape. academic libraries are the ones affected most, because many college students are satisfied with the answers they find on the internet for their assignments, and they end up not taking advantage of the many quality resources in their libraries.

for many years, before the internet search engine emerged, library catalogs were the sole information-seeking gateway. just as the one-time industry giant kodak has lost ground to digital photography, academic library opacs are losing ground to online search engines. all along we academic librarians have devotedly and assiduously produced good cataloging records for the public to use. we have diligently and faithfully educated and helped our faculty and students find the proper library resources to fulfill their research needs and assignment requirements. we feel good about what we have achieved. why have our users switched to online search engines?

■ the evolution of user behavior

it is technology and rising user expectations that have contributed to the changes in user behavior. as coyle and hillmann pointed out: “today’s library users have a different set of information skills from those of just a few decades ago.
they live in a highly interactive, networked world and routinely turn to web search engines for their information needs.”2 a recent study conducted by the university of georgia on undergraduate research behavior in using the university’s electronic library concluded that internet sites and online instruction modules are the primary sources for their research.3 the students’ year of study did not make much of a difference in their choices. tenopir also concluded from her study of approximately 200 scholarly works published between 1995 and 2003 that no matter what type of resources were used, “convenience remains the single most important factor for information use.”4 recently, oclc identified three major trends in the needs of today’s information consumers—self-service (moving to self-sufficiency), satisfaction, and seamlessness.5 services provided by google, amazon, and similar companies are the major cause of these emerging trends. customers have wholeheartedly embraced these products because of their ease of use and quick delivery of “good enough” results. researchers do not need to take information literacy classes to learn how to use an online search engine. they do not need to worry about forgetting important but infrequently used search rules or commands. in addition, the search results delivered by online search engines are sorted using relevance ranking systems that are more user-friendly than the ones currently employed by academic library opacs. these are just some of the features that current academic library opacs fail to deliver. 
in 2004, campbell and fast presented their analysis of an exploratory study of university students’ perceptions of searching opacs and web search engines.6 they found that “[s]tudents express a distinct preference for search engines over library catalogues, finding the catalogue baffling and difficult to use effectively.” as a result, library opacs, because they do not fulfill user needs, have been bombarded with criticism.7 we often hear librarians complain about how library users forget what they have learned in user education classes. librarians sometimes even laugh at users’ ignorance and ineffectiveness in searching library opacs. this legacy mentality has actually prevented librarians from recognizing the changes in user behavior and expectations that have occurred in the past decade. rarely have librarians considered ineffective opac design to be at the root of unsuccessful opac use. roy tennant has mentioned frequently in his presentations that “only librarians like to search; users prefer to find”; that “users aren’t lazy, they are human.”8 it is only natural that library users turn to internet search engines first for their information needs.

jia mi and cathy weng, revitalizing the library opac: interface, searching, and display challenges. jia mi (jmi@tcnj.edu) is electronic resources/serials librarian and cathy weng (weng@tcnj.edu) is head of cataloging, the college of new jersey library, ewing. information technology and libraries | march 2008

■ the opac reexamined

cutter, in his 1876 book, introduced the objectives of the library catalog as follows:

1. to enable a person to find a book of which either
   a. the author
   b. the title
   c. the subject
   is known
2. to show what the library has
   a. by a given author
   b. on a given subject
   c. in a given kind of literature
3. to assist in the choice of a book
   a. as to its edition (bibliographically)
   b. as to its character (literary or topical)9

the majority of today’s opacs have successfully fulfilled cutter’s model in finding known items. following the card-catalog convention, bibliographic elements such as title, author, and subject have been the leading search options in opac search menus for many years. it was assumed that users always came to the library with specific author, title, or subject information in mind before searching the catalog.

the opac bibliographic display is in essence an electronic version of the card catalog. to accommodate the bibliographic data from card catalogs, many display labels were created, but often without regard to whether or not they were suitable in an online environment. this data-centered, card-catalog type of design was easily understood and fluently used by librarians, but not by most end users. campbell and fast found in their study that “while the participants were generally happy with their understanding of search engines, they frequently expressed a low opinion of their ability to search the catalogue.” they also found that students felt that “[t]he web is cluttered; the catalogue is organized. however, this organization was not always helpful; it was admired, but not understood.”10

the traditional catalog retrieval mechanism is significantly different from the web search engine. as yu and young noted in 2004, “web search engines and online bookstores have a number of features that are not typically incorporated into opacs. these functions include: natural-language entry, automated mapping to controlled vocabulary, spell-checking, similar pages, relevance-ranked output, popularity tracking, and browsing.”11 these features have unquestionably affected user expectations in searching library opacs.
teaching users to search for structured bibliographic data runs counter to the free and open internet search mechanism familiar from the google-like search experience, which requires no special training.

since academic libraries aim to provide more dynamic and versatile services, revitalizing library opacs should be considered a top priority. furthermore, librarians’ expectations of user behavior should adjust to today’s needs. educating users to become fluent in using opac search commands and rules has become less relevant as users now seldom read and follow instructions. investing effort and energy in designing a truly user-friendly opac that functions intuitively to achieve productive retrieval could not be more imperative.

academic librarians have started pondering what changes should be made to library opacs so that a truly user-friendly, twenty-first-century catalog that offers a “google-like” experience can be delivered. two important aspects that affect the usability of library opacs are addressed in this article: (1) the current interface and searching capabilities and (2) the bibliographic display. the opac’s public interface and searching capabilities together function as a finding aid. they determine how successful a user is in retrieving information and are the gateway to library resources. the effectiveness of an opac’s bibliographic display affects the user’s understanding of the bibliographic description. users use bibliographic information to identify, select, and obtain library resources.

■ the study of the public interface of library opacs

to find out how academic libraries designed and administered their opacs, the authors examined the interfaces of 123 association of research libraries (arl) libraries’ opacs powered by five major integrated library systems (ils): aleph, horizon, millennium, unicorn, and voyager. the study focused on searching ability, relevance ranking, layout, and linking functionalities.
during the study, we expected each ils system to have its own opac design. we also anticipated that search mechanisms would be managed differently at each location. however, we were surprised by the great disparities that we discovered in opac quality, a clear indication of the time and effort (or lack thereof) devoted to their maintenance and improvement. the findings are summarized below.

google-driven changes—keyword search as the default search key

in his article “mental models for search are getting firmer,” usability expert jakob nielsen argued that current users have already developed a firm mental model of searching:

search is such a prominent part of the web user experience that users have developed a firm mental model for how it’s supposed to work. users expect search to have three components:
■ a box where they can type words
■ a button labeled “search” that they click to run the search
■ a list of top results that’s linear, prioritized, and appears on a new page—the search engine results page (serp)
in our experience, when users see a fat “search” button, they’re likely to frantically look for “the box where i type my words.” the mental model is so strong that the label “search” equals keyword searching, not other types of search.12

studies have also shown that the default search option to which an opac is set affects users’ success in retrieving information. two studies on university opac search transactions confirmed that novice users preferred searching by keyword. at nanyang technological university, singapore, a recent search transaction log study was conducted to “identify query and search failure patterns with the goal of identifying areas of improvement for the system.” results indicated that “the most commonly used search option for the ntu opac is the keyword search.
the use of keyword searches contributed to 68.9 percent of all queries while other options such as title, author, and subject accounted for 16.5 percent, 8.2 percent, and 6.4 percent of all searches respectively.”13 at california state university–los angeles, a four-quarter (2002–2003) search transaction log analysis also revealed similar results. after the library implemented an “advanced keyword search” feature that provided more user-centered, behind-the-scenes search algorithms and that set keyword search as the default, the keyword search queries rose dramatically.14

many university library opacs have already begun to adopt features employed by internet search engines. among the 123 arl library opacs studied, 81 have “keyword(s) anywhere” as the default search key (see appendix a). this is a positive sign that libraries are paying attention to user search behavior. thirty-six libraries’ default search keys are still set to “title,” and six libraries, instead of providing a default search option, list field choices from which users must choose before entering their search terms.

using title search as the default option poses some potential problems. in order to retrieve good results from a title search, users are expected to type in a title in the right order, spelled correctly, and omitting the initial article (a, an, the), if any. while librarians are fluent with these seemingly simple rules, students and faculty constantly have trouble remembering them. providing online search tips and offering information literacy classes only help a little. since presenting keyword search as the default has proved effective, libraries using title search as their opac default search option might want to consider switching their default setting to keyword.

search ability—true keyword search

the basis of current opac search systems is boolean logic.
the ease of using google-like search engines comes from their implicit “and” feature, which eliminates the need to enter boolean connectors (and, or, not) between search terms. this is logical because users usually look for records that contain all the terms that they enter. sixty-six percent of the arl libraries studied have opacs with keyword set as the default search option. these libraries handle boolean logic in keyword searching very differently. all five ils vendors offer “automatic and” functionality, but not all of these libraries have adopted it: in some cases, users are required to enter boolean operators during a search. emory university library’s opac automatically executes “same” for multiple search words if no boolean operators are entered, which means that it will find records with the search terms in the same bibliographic fields. syracuse university’s opac automatically uses the boolean operator “or” for all keyword queries. this practice can generate too many irrelevant results. libraries that automatically supply the boolean operator “and” for multiple terms entered in the search box consequently produce more relevant results. in addition, none of the arl opacs studied handle auto-correction for typos, spell-check, auto-plurals, auto-word-truncation, punctuation, or special characters. this makes searching unnecessarily inconvenient.

for many years now, teaching students how to properly use boolean operators has been one of the essential topics in information literacy classes. after taking these classes, do students use boolean operators when searching? a study of 2,374 transaction logs collected by 836 french universities revealed that french university students use boolean operators infrequently. fifty-six percent of the queries used only a single term. approximately 28 percent of the queries contained one boolean operator.
the study also examined the impact of information search expertise on the use of boolean operators: approximately one-third (32 percent) of the students (considered the “novice” group in the study) still did not use boolean operators even when they were explicitly invited to do so, whereas 83 percent of librarians (considered the “expert” group in the study) used at least one boolean operator in their queries.15 therefore, complicated search strategies and syntax are mostly used by expert users. novice users prefer to use natural-language queries.

libraries also handle phrase searching in different ways. phrase searching usually is embedded within keyword search either explicitly or implicitly depending upon the ils. aleph (ex libris) libraries use a radio button for “word or phrase” or “words adjacent” or “exact phrase” options for the computer to execute the command. unicorn (sirsi/dynix) libraries provide three options: “keyword,” “begins with,” and “exact.” some libraries have the “exact” command executed to search every field in a bibliographic record; other libraries search the title, subject, and author fields only. the millennium system’s (innovative) keyword search feature can do automatic phrase and “and” search. some millennium libraries (e.g., michigan state university) take advantage of this feature to search words entered as phrases first and, if unsuccessful, the system then repeats the search for the same words using the boolean operator “and.” this feature produces more relevant search results. however, several millennium libraries have not implemented this feature. they still use “boolean keyword” search as the default and instruct users to add quotation marks to define phrases.
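the phrase-first behavior described above for some millennium libraries can be sketched in a few lines of python. this is only an illustration under stated assumptions, not the vendor's implementation: the record layout, the sample records, and the function names (tokenize, contains_phrase, keyword_search) are all hypothetical. the sketch first treats the words typed as a phrase and, only if no record matches, repeats the search with an implicit boolean "and" across all fields.

```python
import re


def tokenize(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def contains_phrase(field_tokens, query_tokens):
    """Return True if query_tokens occur as a contiguous run in field_tokens."""
    n = len(query_tokens)
    return any(field_tokens[i:i + n] == query_tokens
               for i in range(len(field_tokens) - n + 1))


def keyword_search(records, query):
    """Search the query as a phrase first; if nothing matches, fall back to implicit AND.

    Returns a (mode, hits) pair, where mode is "phrase", "and", or "none".
    """
    q = tokenize(query)
    if not q:
        return "none", []

    # Pass 1: the words must appear together, in order, inside a single field.
    phrase_hits = [r for r in records
                   if any(contains_phrase(tokenize(v), q) for v in r.values())]
    if phrase_hits:
        return "phrase", phrase_hits

    # Pass 2: every word must appear somewhere in the record (implicit AND).
    and_hits = []
    for r in records:
        words = set()
        for v in r.values():
            words.update(tokenize(v))
        if all(t in words for t in q):
            and_hits.append(r)
    return "and", and_hits


# Hypothetical sample records, for illustration only.
RECORDS = [
    {"title": "A History of New York", "author": "Irving, Washington"},
    {"title": "New York: An Illustrated History", "author": "Burns, Ric"},
    {"title": "The History of York", "author": "Drake, Francis"},
]
```

with these sample records, the query "history of new york" is satisfied by the phrase pass, while "new york history" matches no phrase and falls through to the "and" pass. an implicit "or" would also return the third record, which illustrates why automatic "or" tends to produce less relevant results.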
the voyager (ex libris, formerly endeavor) system offers two types of keyword searches: “keyword relevance” and “keyword boolean.” both options can handle phrase searching, but users are required to enter quotation marks for specific terms used as phrases. some libraries intentionally made only one keyword search option available. other libraries provided both options and used different wording for the opac search keys (see appendix b). these search keys are not self-explanatory, and users will often find them puzzling. the default help screen provided by the ils vendor and adopted by many voyager libraries does not help much either (see appendix c). thirty-one of the 35 voyager libraries provide a boolean keyword search option. only five libraries utilize the automatic “and” feature. one library uses boolean keyword search as the only keyword option, but did not activate the automatic “and” functionality.

relevance ranking in search results

when users search by keyword, the best way to sort the results is by relevance. presenting the most relevant results at the top of the results page is crucial because it enhances library resource discovery and access. other sorting options, such as title or publication date, are not very useful since users usually do not have titles or publication dates in mind when browsing search results from a keyword search. three ils systems (millennium, unicorn, and voyager) have a relevance-ranking feature, yet this functionality was very much underutilized by the libraries studied. of the eighteen unicorn libraries, only five offered relevance ranking. none made it the default sorting option. thirty-six of the 38 millennium libraries provided relevance ranking as a sorting option. only twelve of those libraries made relevance ranking the default sorting system. twenty-seven out of the thirty-five voyager libraries offered the keyword (relevance) search option, under which the search results were automatically ranked by relevance.
out of the twenty-nine voyager libraries that offered the keyword (boolean) search option, only four libraries used relevance as the default sorting system. the rest of the libraries used a “system sort” mechanism that sorted search results by bibliographic control number. figure 1 summarizes the sorting options used by the arl libraries studied and also shows the default sorting options for keyword search. unlike online search engines, which pull data directly from full-text documents, library opacs search for words from the structured metadata entered by catalogers. different fields are set to carry different weights for relevance considerations. the behind-the-scenes algorithm (the criteria used to decide the level of relevance) should be carefully established to warrant a good ranking scheme. for example, the new opac of north carolina state university library, powered by endeca, adopted an algorithm based on field weighting, phrase matching, facet lcsh, term frequency (tf), and inverse document frequency (idf). their search results are indeed more logically ranked by relevance. recently there have been suggestions to incorporate circulation statistics, book review data, and a library of congress call number table into the algorithm. the checkout data would provide a rough substitute for google’s pagerank (a count of links to a site, which is an indication of the site’s popularity), and book reviews would provide more text to be considered in the relevancy tests. using library of congress call numbers would either require having the call number table loaded and then running the search terms against it or including call numbers in the algorithm, giving more weight to titles having the same call number. for example, seven out of twenty-three results generated for a search for “new york history” on an opac have the call number “f128.” the call number “f128” is linked to the call number table with the subject new york and history. 
it can be confirmed that seven items with call number “f128” should be considered more relevant and ranked first on the results list. more research needs to be done in this area.

the search results display

the search results display is critical. the information, options, and bibliographic data presented on the browse page help users decide what actions to take next. in the opacs examined, the authors found the following problems:

1. search terms and search boxes were not retained on the results page

after a search is performed, many opacs do not effectively carry the original search information onto the results screen. this information includes the search key and the words typed in the search box. users need to consult this information to identify and select records relevant to their needs from the search results page. based on the retained information, users also decide what to do next. for example, they might change their search strategy or modify their previous search. many of the opacs studied neglected to display the original search information. even better than just displaying the text of the user’s search terms would be to maintain them in search boxes at the top and bottom of the results display page. this way, users would only have to modify their search terms rather than type new search terms each time they wished to modify their original search. only one of the twenty-one aleph libraries studied kept the previous search terms in the search box on the results page. fourteen of the aleph libraries retained neither the previous search strategy nor the search terms. six libraries placed the search box at the bottom of the search results page, where it could easily be missed.

2.
post-search limit functions were not always readily available

sometimes keyword searches produce an overwhelming number of search results. since the relevance ranking functionality currently provided by ils vendors does not work very well, the best way to refine searches is to make effective search limit options available. limiting options such as format, language, date, availability, and location should be readily available on the results page. some ilss in our study hid this feature, either under a modified search link or an advanced search link. this made refining a search unnecessarily cumbersome.

figure 1. arl libraries sorting options for keyword search (as of march 2007)
opac | libraries studied | relevance | year (publication date) | author | title | call # (subject) | format | default
aleph | 21 | 0 | 21 | 21 | 21 | 8 | 4 | year/author: 17; title/year ascending: 1; title: 1; system sort: 2
horizon | 7 | 0 | 7 | 7 | 7 | 0 | 0 | publication date: 2; title: 2; author (ascending): 1; system sort: 2
millennium | 38 | 36 | 38 | 0 | 38 | 0 | 0 | date: 20; title: 5; relevance: 12; system sort: 1
unicorn | 18 | 5 | descending 18, ascending 18 | 18 | 18 | 18 | 0 | new to old: 5; relevance: 1 (ncsu); system sort: 12
voyager | 35 | kw (r): 27; kw (b): 4 | descending 34, ascending 34 | 35 | 35 | 0 | 0 | relevance: 5; kw with relevance: 27; system sort: 8

3. item statuses were not available on the search results page

in addition to bibliographic information, users also need to know whether an item they want is available. having the item status on the browse page is very helpful because users can skip the records that have been checked out. some libraries studied did not have this information on the results browse page. users needed to go to the individual bibliographic records to find out whether an item was available or not. a few libraries provided an added-value option to limit the results by “available items,” which is a very useful feature.

4.
a lack of value-added information a book cover image conveys an impression of a book that words cannot. it can also help a user recognize a book he or she has seen previously. in addition to cover images, libraries can provide value-added and contextual information by linking those images to tables of contents, summaries, sample passages of text, and reviews. one way libraries provide value-added and contextual information is to link cover images to the library of congress’s table of contents page. another way is to link opacs to information obtained from syndetics.com, a company that provides cover images, tables of contents, summaries, author biographical information, and reviews. the ohio state university library not only adds the table of contents into the marc record, but also links the names of the authors of a particular resource to other works by the same authors. this is a great discovery tool for finding related resources, and it is especially helpful, since in the future opacs will be able to search not only books but also articles and other resources. 5. title links were misleading we found that several libraries’ opacs title links on the results page did not take users to the detailed bibliographic record, but instead directed users to an alphabetical title-browsing page. to get to the actual bibliographic record, users had to click a “display full record” link (which is sometimes difficult to locate) to view the individual bibliographic record. this misleading feature makes the retrieval process inefficient. 6. switching between individual records and the results list was cumbersome after viewing an individual bibliographic record, users will want to return to the results browse page, either by hitting the “back” button or by clicking on a “return to results” link. many library opacs in our study returned the user to the top of the results page rather than to the location to which the user had previously scrolled. 
this forced the user to scroll back down through the records that had already been examined. this feature ought to be improved. 7. the color of entry links that had already been read were not differentiated for over a decade now, web browsers have changed the color of links that have already been clicked on. however, this has not been the case with opacs. to solve this problem, visited bibliographic entry links on search results pages should likewise be given a different color from entries that have not yet been visited. this feature facilitates the browsing of the search results. if what has been viewed is clearly marked, users only need to focus on entries that have not yet been visited. some libraries in our study did not have this feature. 8. searched keywords were not highlighted when a keyword search is performed, highlighting the entered keywords in each bibliographic record that has been retrieved is helpful. based on the bibliographic elements in which the highlighted keywords appear, users can then decide how relevant the retrieved publication is to their research. all five ils vendors provide this feature, and many libraries did a good job of implementing it. however, some libraries neglected to make this feature available. 9. many libraries lack a meaningful call-number browse feature library opacs should take better advantage of call number links by allowing users to browse them much as if they were browsing shelves in the stacks. to that end, opacs should link call numbers directly to a page with more useful identifying information, such as the authors and titles. no aleph library opacs that we studied currently have this feature. instead, clicking on the hyperlinked call number field only leads users to a list of more call numbers, which is not helpful at all. 10. 
title link, subject link, and author link should be relabeled to be meaningful to end users (other value-added features)

millennium’s “similar records” and voyager’s “more like this” are added to pull similar titles under the same subjects. unicorn and horizon offer a panel on the left side of the detailed book record, which can add meaningful information to these links. but how the panel is used depends on the individual libraries. some libraries use the panel with only library holding information, but other libraries, such as university of virginia, make an informative presentation of those links to students. virginia has added three browse features to make the index links much more meaningful: “find more by this author” (author link), “find more on these topics” (subject link), and “nearby items on shelf” (call number link). (see figure 2.) this value-added feature can indeed facilitate the retrieval process.

by analyzing five major integrated library systems’ opacs among arl libraries, the authors have come to believe that librarians can make a big difference in improving opacs. no matter how good the library system is, librarians still need to invest effort, time, and technical knowledge to configure and take full advantage of the many capabilities that ilss offer. public services, technical services, and system librarians should all work together to continuously study the usability of opacs and to make them more effective. it is true that all current opacs lack spell-check and automatic stemming functionality. aleph and horizon need to add relevance ranking, and millennium, unicorn, and voyager should make our data work harder and make their relevance ranking algorithms more effective. beyond the improvements needed in those systems, the study shows that all library opacs could do a much better job if they focused on the user’s needs.
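the relevance-ranking ideas raised in this part of the study (field weighting, term frequency and inverse document frequency, and extra weight for records that share a call number class such as “f128”) can be sketched in a few lines of python. this is only an illustration under stated assumptions: the field weights, the boost value, the record layout, and the function names are hypothetical, and the sketch is not the algorithm used by ncsu’s endeca catalog or by any ils vendor.

```python
import math
from collections import Counter

# hypothetical field weights; the article names field weighting but gives no values
FIELD_WEIGHTS = {"title": 3.0, "subject": 2.0, "note": 1.0}
# assumed multiplier for records whose call number falls in the preferred class
CALL_NUMBER_BOOST = 2.0


def score(record, terms, doc_freq, n_docs, preferred_class=None):
    """Sum weight * tf * idf over the weighted fields, then apply the call number boost."""
    total = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        counts = Counter(record.get(field, "").lower().split())
        for term in terms:
            tf = counts[term]
            if tf:
                idf = math.log(1 + n_docs / doc_freq[term])
                total += weight * tf * idf
    # simplification: a plain prefix test stands in for a real call number table
    if preferred_class and record.get("call_number", "").startswith(preferred_class):
        total *= CALL_NUMBER_BOOST
    return total


def rank(records, query, preferred_class=None):
    """Return record ids ordered from most to least relevant; non-matching records are dropped."""
    terms = query.lower().split()
    doc_freq = Counter()
    for rec in records:
        words = set()
        for field in FIELD_WEIGHTS:
            words.update(rec.get(field, "").lower().split())
        for term in set(terms):
            if term in words:
                doc_freq[term] += 1
    scored = [(score(r, terms, doc_freq, len(records), preferred_class), r) for r in records]
    scored = [(s, r) for s, r in scored if s > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [r["id"] for _, r in scored]


# hypothetical sample records for the "new york history" example
SAMPLE = [
    {"id": "x", "title": "new york history guide", "call_number": "Z1"},
    {"id": "y", "subject": "new york history", "call_number": "F128.3"},
]
```

in this toy data, record “x” outranks “y” on field weighting alone because its match is in the title, while passing `preferred_class="F128"` lifts “y” to the top. how large such a boost should be is exactly the kind of question the authors say needs more research.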
■ the opac bibliographic display study when the web opac was introduced, libraries around the world quickly abandoned the traditional card catalog display and adopted the line-by-line display with display labels on one side and bibliographic information on the other. because the line-by-line display format can be locally customized, each library’s opac bibliographic display looks very different. for decades, most academic libraries in the united states have used aacr and marc as their content and metadata standards for resource description and access. marc and aacr were originally created for card catalogs in which descriptive elements and access elements were separately defined and presented. the line between the two types of elements has become less distinct in today’s web environment. many elements in bibliographic records can serve as both description and tracing elements on opacs.16 hyperlink functionality has also streamlined the retrieval process. to see how academic libraries in the united states format their opac bibliographic displays, the authors examined the opacs of fifteen academic libraries.17 the purpose was to study the effectiveness of the display of records in different formats. in the mid-1990s, wool studied the bibliographic display practices for monographs of thirty-six online catalogs in the united states. in his study, five criteria were used to analyze each bibliographic record structure.18 the authors of this paper adopted for analysis three of the five opac bibliographic display criteria used by wool, only this time with an emphasis on the user’s perspective and needs. eight different titles were reviewed and compared: three monographs, two serials, one video recording, and two sound recordings.19 the analysis given below is based on the following three criteria: ■ the accuracy and clarity of display labels; ■ the order of bibliographic elements display; and ■ the utilization of bibliographic data. 
accuracy and clarity of display labels

for this discussion, the authors divided the bibliographic elements into three areas:
■ the first tier: information about author/contributor, title, imprint, and subjects;
■ the second tier: other descriptive information, including the physical description, notes, related contributors, related titles, etc.; and
■ the third tier: the linking fields (marc 76x–78x fields) and the electronic location and access field (i.e., 856 field).

figure 2. university of virginia libraries catalog

the first-tier elements

the information displayed in the first tier can be considered the key elements for identification. opac users first examine them and decide if the manifestation described is relevant to their query. most opacs studied used “title” as the display label for the title statement. this element actually consists of the title and statement of responsibility (author, etc. statement). using the label “title” alone is not inclusive enough. one library (university of arizona library) displayed only the title portion under the label “title” and provided a separate label, “author/contributor info,” for the statement of responsibility portion, which, while helpful in a limited way, could also create more confusion. let us consider, for example, the project directory (répertoire des projets) of tdc (in french, cdt). the title statement for this data would be “project directory / tdc = répertoire des projets / cdt.” here, the english title and statement of responsibility are presented on an equal footing with the french title and statement of responsibility. the opac display using the university of arizona library’s model is as follows:

title: project directory
author/contributor info: tdc = répertoire des projets / cdt.

this arrangement will not work for items with titles and statements of responsibility in multiple languages presented on a single manifestation. the french title appears under the label “author/contributor info,” which makes no sense.

most libraries in our study used the label “author” for the principal author. the principal author could be a personal author, a corporate author, or a conference name. if it is a personal author, it could be a writer, an artist, or a composer. some opacs used “author” to represent all types of responsible bodies, be it a personal author, a corporate author, a meeting name, an artist, a music composer, etc. this use of a single label to cover a diverse set of situations is confusing. some libraries used separate labels (“author,” “corporate author,” “meeting name,” “author/artist,” “author/composer,” or “author, etc.”) for different types of responsible bodies (see appendix d).

“uniform title” was defined in aacr to collocate resources derived from the same original intellectual or artistic creation. for example, when cataloging a translation, in addition to its official translated title, an established uniform title is entered to indicate the original work. when browsing by uniform title on a properly set opac, all entries related to the original intellectual creation should be retrieved. this uniform title browsing feature helps users locate related publications in the catalog. the problem is that the term “uniform title” is only understood by catalogers, not by others. there is no label for such an entry that can be easily understood by the average user. however, suppressing the uniform title entry to avoid confusing users will cause the opac to lose its helpful collocation functionality. some libraries studied use the term “uniform title” as a display label. some libraries use “other title” as a display label. some libraries display this entry under the label “title” along with the title proper (title in the 245 field). none of the above-mentioned arrangements is ideal.

the display labels for subject headings provided by each library were very similar. most academic libraries in the united states use the library of congress subject headings and the medical subject headings as the thesauri for subject entries. specifying the thesauri for headings on opacs with acronyms like “lcsh” and “mesh” is of no help to users, because these thesauri do not clarify anything that would assist users in their research. figure 3 lists the display labels used by libraries in the study.

figure 3. display labels for subjects
display label for library of congress subject (marc 650 field, 2nd indicator 0) | display label for medical subject (marc 650 field, 2nd indicator 2)
subject (lcsh) | subject (mesh)
subject-lib. cong. | subject-medical
subject lc | subject medical
library of congress subject headings | medical subject headings
subject(s) | subject(s)
subject, general; subject, geographic | subject, medical
subject | med. subject

the second-tier elements

the elements in the second tier include the physical description, notes, related authors, and related titles. this is an area where mapping bibliographic elements onto proper display labels is difficult. this area was also not managed well by the libraries studied. unlike first-tier elements, in which one element usually corresponds to a unique display label, second-tier elements exhibit two patterns in the opacs examined: many-to-one and one-to-many. that is, multiple categories of data (of different marc fields) can be represented by one display label, e.g., incorporating physical description, numbering notes, and publication numbering into “description” (many-to-one).
on the other hand, one display label can represent one single, repeatable bibliographic element (the same marc field repeated many times), e.g., multiple general notes (one-to-many). both arrangements (one-to-many and many-to-one) can result in a simpler, cleaner public display, since some descriptive elements are self-explanatory and users can get by without specific display labels supplied. the disadvantage of these arrangements is that the level of specificity of public displays is compromised. some important descriptions can be easily missed if they are clustered in a group of elements. for bibliographic elements that are not self-explanatory, this type of arrangement can fail to convey useful information, or even worse, deliver inaccurate or vague information. for example:

description:
v. : ill. ; 28 cm (physical description, marc 300 field)
report year ends mar. 31. (numbering note, marc 515 field)
’77– (publication span, marc 362 field)

published:
philadelphia : robert morris associates, 1977– (imprint, marc 260 field)
’77– (publication span, marc 362 field)
annual (frequency, marc 310 field)

the numbering field (field 362) is defined to describe a serial publication’s chronological or numerical publication extent. placing data like “’77–” under labels such as “description” or “published” is unclear. in fact, it is inaccurate, because “’77–” is the publication span, not the publication date. without a proper label, it is difficult to convey this information to users. some libraries we studied used such labels as “publication history,” “publishing history,” “publication dates,” or “volume/date range” to describe the publication span. this practice is misleading (see appendix e).

names of coauthors, editors, cast members, performers, and related corporate bodies or meetings that contributed to or were involved in the creation of the work are considered secondary contributors.
using one label to cover the various roles (author, editor, composer, etc.) is the practice most libraries have adopted. like the primary author field, this element represents a variety of roles depending upon the type of manifestation. some opacs used one display label to cover all related personal names, corporate names, and meeting names (see appendix f). most libraries failed to supply a proper label for a secondary name when it was entered with a related title. this so-called “name-title added entry” is provided to collocate materials under the same author and title in the catalog. ideally, the name-title combined element, provided with redirect functionality via hyperlink, should perform an author-title combination search for exact retrieval. most opac systems could only perform either an author or a title search. the search results were unsurprisingly irrelevant, because they did not utilize both elements of the name-title added entry to produce results that were sufficiently specific: users would get only a list of authors or a list of titles instead of an author-title combination entry list. some libraries presented this type of element only as an unhyperlinkable note, which defeats the purpose of having such data available. handling series for opac displays is also challenging. the majority of opacs studied did a poor job in this area. in general, a series title transcribed from the resource also functions as an access element if the transcribed title is the same as the established one in the authority file. when the transcribed series title is different from the established series title, ideally the transcribed series title should only be accessible via the library system’s cross-references feature, which then directs users to bibliographic records that contain the established entry. this type of descriptive element is not meant to be displayed on the opac. 
the opacs examined used the labels listed in figure 4 to handle transcribed and established series entries. labels listed in the same row were taken from the same opac. as can be seen, users are not expected to know the difference between a “series statement” and a “series.” in many cases, these two elements are identical due to the vendor authority control process.20 this could confuse the user, especially when both elements are displayed right next to each other.

the third-tier elements

the third-tier elements consist of linking fields (marc 76x–78x fields) and electronic location and access fields (marc 856 field). the linking fields are used mostly in serial bibliographic records. their purpose is to link the title being described to its related publications, e.g., supplements, translations, preceding titles, or succeeding titles. elements in this category should be displayed and linked directly to the related record via control numbers provided in the bibliographic record. if the catalog does not have the related record, a clear message should indicate this to the user. unfortunately, many libraries do not display all the linking entries. none of the opacs studied offered direct link functionality. instead, what was usually offered was a redirect feature via hyperlink that prompted the system to issue a new author or title search. the direct link functionality via record control numbers was never made available. if the library did not have the related entry, the opac system simply took the user back to the original entry, which is a very confusing design flaw. to ease the user’s access to internet resources, the electronic location and access element (marc 856 field) was defined for catalogers to record the internet location of the resource being described and its related information.
by clicking the hyperlinked element on an opac, users seamlessly get to the desired electronic document site. the url specified in the field might link to full-text documents, the table of contents, the document abstract, the publisher’s description, or the author’s biographical information. a label that fits all types of materials is crucial. the bibliographic elements displayed under the label should also be carefully managed. under the label, some libraries displayed the type of resource (e.g., table of contents). other libraries displayed the http url only. some libraries displayed both the type of resource and the http url (see figure 5). as for the location of the label in the opac record, we found that the location of the url link depended on the opac in which it appeared: in some opacs, links were located at the top of records; in others, they appeared in the middle or at the bottom. we found that the location of the link was not terribly critical, provided that the label was prominent and the display text understandable. the order of the bibliographic elements display the way bibliographic data is organized in each opac record, together with display labels, helps users to quickly identify library resources. although each library can locally choose the arrangement of bibliographic data displayed on its opac, most libraries prefer to place the citation information (author, title, publication) ahead of other elements. the sequence of the other elements exhibited enormous variation in the opacs studied. some libraries placed the electronic access element above all other data (suny buffalo); some libraries placed local holdings information, call number, and item availability in the middle of the bibliographic record. arrangements were clearer and more understandable when provided with clear labels and a distinct layout between the local holdings information and bibliographic data. 
problems arose when second-tier elements were mingled with first-tier elements and when they shared the same display label. see example in figure 6. in this example, two titles are displayed under the “title” label. the first title, “rma annual statement studies,” is the full title (marc field 245) of the publication. the second title, “rma annual statement studies: industry default probabilities and cash flow measures,” is the title of the resource’s related publication (marc field 730), which normally is considered a second-tier element and should be placed farther from the title proper with a clear label. since the display order of bibliographic elements is completely customizable, we found in our study that few libraries put enough effort into providing clear bibliographic displays. more importantly, records in different formats (e.g., monographs, serials, music materials, video recordings) were not given equal attention. some labels and data sequences might work for one format, but not another.

figure 4. display labels for series
label for transcribed element | label for established element
series statement | series
series statement | series indexed as
other series | series
series note | series
description | series

utilization of bibliographic data

another factor that has an effect on the usability of an opac is the utilization of bibliographic data. two issues are addressed in terms of utilization of bibliographic data: (1) the completeness and suitability of the metadata displayed on an opac, and (2) the extent of repurposing the bibliographic data and creating added value to an opac.21 a typical bibliographic record contains descriptive data, access data, and administrative data. descriptive data is provided to describe the manifestation cataloged and is considered of interest to the public. access data is entered and indexed for retrieval.
administrative data is used for setting up search limits (e.g., limit by language, format) and pulling statistics (e.g., how many titles in spanish). it is most useful for internal, administrative use. librarians must be careful when deciding whether such data elements will be displayed. in terms of the completeness and the suitability of metadata in the opac display, the authors discovered the following in the opacs studied: 1. many libraries’ opacs displayed control numbers, such as the oclc control number (the 035 field), the lc control number (010 field), and other local system control numbers. this type of information is usually of no interest to the public. see example in figure 7. in this example, the numbers listed under the label “wln #” represent different types of system control numbers, which are of no concern to users and therefore should not be displayed. 2. some opacs displayed bibliographic data from the leader fields of the cataloging record. marc leader fields are a group of fixed-length codes that represent the type of resource (monograph, serial, or musical score) and material format (print, electronic, or sound recording). the information could be helpful for patrons if they are displayed with the proper label on the opac. libraries that chose to display the leader data on their opacs did not do a good job of making the information clear to users. for example, one library listed “journals and newspapers,” “computer file,” “serial,” “book,” “e-resource,” and “gov publication” under the label “record type” (see figure 8). seeing so many record types under one label can easily confuse library users. 3. some libraries omitted certain crucial variable fields, e.g., the linking entry complexity note (field 580, containing information about title history), related title access entries (fields 730 and 740, containing related titles), and linking entries (linking the record to other bibliographically related records, e.g., 76x, 77x, and 78x fields). 
these fields are defined with a clear purpose and should be carefully considered for public display with clear labels. some libraries in our study displayed them but left other irrelevant information on the opac, which clutters the display with information that does not help users. see example in figure 9. in this example, under the label “related publication,” the french version and the spanish version of jama are displayed. in addition to the french title and the spanish title, the marc 21 language code and its corresponding issn are also displayed. the language code and the eight-digit issn, since no separate label is provided for them, are confusing.

4. the linking elements not only should be displayed on the opac, but should also be hyperlinkable. they ought to be used to link to related bibliographic records. in an online environment, this sort of field can also function as a descriptive element. some opacs displayed linking entries but did not enable hyperlink functionality. some libraries displayed two instances of them, one as a descriptive element and the other as a linking element with hyperlink capability.

another important aspect of making use of bibliographic data is repurposing the bibliographic data to provide added value to opacs. lorcan dempsey mentions frequently in his blog that in order to sustain library value, libraries should “make data work harder.” he points out that “libraries have invested a great deal in bibliographic data—yet it has remained somewhat inert in our catalogs, failing to release the value of the investment.”22 these rich data can be better utilized for different purposes, including designing an enhanced opac. lavoie et al. described this further in their recent article about data mining:

as more activities move into networked spaces, more areas of our lives are shedding data. this data is increasingly being mined for intelligence that drives services. . . . [c]ompanies like amazon repurpose data to create added value.
this is a lesson librarians must learn if they want to improve their own visibility and value in increasingly crowded digital information spaces where users, as always, want good results without too much time or effort. . . . the good news is that libraries don’t come to this task empty-handed but with rich, structured information about the materials in our collections.23
figure 5. online opac record from suny buffalo
figure 6. online opac record from the college of new jersey
information technology and libraries | march 2008
tim o’reilly highlighted in his article the successful example of how amazon reutilizes data: amazon relentlessly enhanced the data, adding publisher-supplied data such as cover images, table of contents, index, and sample materials. even more importantly, they harnessed their users to annotate the data, such that after ten years, amazon, not bowker, is the primary source for bibliographic data on books, a reference source for scholars and librarians as well as consumers. . . . effectively, amazon “embraced and extended” their data suppliers.24
all opacs reviewed in the study operate within the traditional vendor-supplied module. this long-established approach gives libraries limited flexibility to customize the search key options, search results displays, restricted sorting options, and pre- and post-search limit options of their opacs. unfortunately, libraries can do very limited data mining inside the vendor’s hard-coded framework. many valuable metadata are buried in the bibliographic database. system vendors have failed to make the most of technology to better utilize data. very few libraries have thought outside the box and taken advantage of the existing rich bibliographic data. the emergence of north carolina state university’s endeca-powered opac was a good example of repurposing data and creating value-added information.
the data facets used on ncsu’s single search-and-browse combined opac interface are pulled and repurposed from their sirsi/dynix database. as one might have expected, eight of the eleven facets are extracted from the library’s marc bibliographic records (“availability” and “browse: new” are from item records). out of the eight facets, four are extracted from subject headings; two are from the fixed fields; one is from the call number field and one from the variable fields of the bibliographic record.25
■ discussion and recommendation
based on the authors’ findings above, the following are the primary factors that have contributed to the ineffectiveness of the opacs offered by today’s academic libraries.
1. system limitations
the inadequacy of today’s ils has been a known problem. inflexible search options make library catalogs difficult to use. despite the fact that some vendors diligently enhance their systems’ functionalities, overall performance is still disappointing. karen markey pointed out in a recent article that one of the reasons why the solutions recommended by researchers in the 1990s were not applied to online library catalogs was “the failure of ils vendors to monitor shifts in information-retrieval technology and respond accordingly with system improvements.”26
antelman et al. observed similarly that all major ils vendors are still marketing catalogs that represent second-generation functionality. despite between-record linking made possible by migrating catalogs to web interfaces, the underlying indexes and exact-match boolean search remain unchanged. it can no longer be said that more sophisticated approaches to searching are too expensive computationally; they may, however, be too expensive to introduce into legacy systems from a business perspective.27
since ils vendors first introduced their products back in the 1980s, user behavior and expectations have changed immensely. while libraries have started to recognize the changes and are working hard toward meeting the needs of multiple generations of users, little can be done if ils products still operate within the same old-fashioned information-retrieval structure. because ils vendors have failed to revamp their opac modules to meet user needs, libraries have been forced to seek other options. north carolina state university is one of the first libraries to exercise its options. its new opac system, powered by endeca (operated on the sirsi/dynix platform), has shown remarkable improvements in ease of use, which usability tests have verified. recently, two ils vendors (innovative and ex libris) have been in the process of developing new opac modules using new technology and a new approach in data mining.
figure 7. online opac record from the university of washington.
figure 8. online record from suny buffalo
figure 9. online opac record from the university of michigan.
revitalizing the library opac | mi and weng
2. libraries are not fully exploiting the functionality already made available by ilss
unsurprisingly, the opacs examined by the authors, if powered by the same vendor, showed similarities in general layout and interface features. during the study, it soon turned out to be easy for the authors to recognize the ils system of each opac. as mentioned previously, we expected opacs to vary somewhat. what was unexpected was the huge differences in, among other things, interface layout, search options and search languages, behind-the-scenes search algorithms, search results displays, display labels and the corresponding bibliographic data, and what data was chosen for display. the disparities that we found in these features suggested that there had been great differences in the amount of attention, energy, and time devoted by each library to designing its opac. some libraries took advantage of available features and made better use of them than others.
(see appendix g for examples of best practices of library opacs.) many libraries did only the very minimum. while we recognize that academic library opacs are difficult to use, we also need to recognize that some libraries do not fully exploit existing resources, thereby exacerbating the difficulty of using their opacs. 3. the unsuitability of marc standards to online bibliographic display as previously mentioned, aacr and marc were initially designed for card catalogs without display labels in mind. many marc fields can be used for multiple purposes. providing labels that properly fit all the cataloging data needed to cover all types of resources is nearly impossible. from the opacs studied, some libraries used vague labels in an effort to encompass as many circumstances as possible. some libraries used labels suitable only for certain formats, but not all formats. neither approach is satisfactory. the solution has to come from cataloging and metadata standards. wool identified this issue back in the 1990s: the interchangeability of descriptive data elements and access points (since each can be made to serve both functions online) makes the separate creation of description and headings seem pointless and burdensome. labeling of data elements (made possible through the mapping of terms to marc fields) creates a need for simpler, less ambiguous bibliographic data definitions than are appropriate for the dense and context-rich narrative-style records catalogers continue to create . . . cataloging standards will need to be rewritten in order to provide the kind of data flexibility expected in online systems . . . records flexible enough to be added to, subtracted from, and rearranged without loss and garbling of meaning. 
what is needed is a modular record structure, in which every segment of data can stand on its own with appropriate labeling and which can support all possible display lengths and combinations of data elements.28
a decade later, not much progress has been made in improving cataloging and metadata standards for online display. while enhancing cataloging and metadata standards for better retrieval is desirable, making the standards more complicated and difficult to adopt in order to accommodate opac displays is not. as librarians are working to simplify cataloging, our essential rich metadata should not be sacrificed. one possible solution is to have the system recognize the existence of certain subfields and produce specific display labels accordingly. this certainly will not solve all the issues with regard to display labels. regardless, there is much room for improvement, and librarians’ attention in this area is critically needed.
■ conclusion
the information-seeking world has entered an era of self-service. roy tennant described well the self-service trend: “i wish i had known that the solution for needing to teach our users how to search our catalog was to create a system that didn’t need to be taught.”29 tim o’reilly also indicated in his article “what is web 2.0” that “the web 2.0 lesson [is to] leverage customer-self service and algorithmic data management to reach out to the entire web, to the edges and not just the center, to the long tail and not just the head.” he also argued that “[t]rusting users as co-developers” is one of the core competencies of web 2.0 companies.30 academic libraries should aim toward designing a user-centered, self-sufficient, twenty-first-century online catalog that fits the web 2.0 model.
the ultimate goal is that users will be comfortable and confident using library opacs for their information needs wherever a computer is available and without special training. as campbell and fast have trenchantly asked, “are we witnessing a major disruption, a large-scale redefinition of information design and delivery so radically different from the traditional library environment that it renders irrelevant all our experience in bibliographic control?”31 this remains an open question. regardless, a new generation of opacs will need to be in place soon. much needs to be done to make academic library opacs matter. academic librarians cannot afford to be considered irrelevant in the information-seeking world. the future of academic libraries relies on effective opacs. this is one of the most pressing tasks that must be accomplished.
references and notes
1. cathy de rosa et al., perceptions of libraries and information resources: a report to the oclc membership (dublin, ohio: oclc, 2005), 1–17. http://www.oclc.org/reports/2005perceptions.htm (accessed jan. 20, 2007). 2. karen coyle and diane hillmann, “resource description and access (rda): cataloging rules of the 20th century,” d-lib magazine 13, no. 1/2 (2007). http://www.dlib.org/dlib/january07/coyle/01coyle.html (accessed feb. 3, 2007). 3. anna m. van scoyoc and caroline cason, “the electronic academic library: undergraduate research behavior in a library without books,” portal: libraries and the academy 6, no. 1 (2006): 47–58. 4. carol tenopir, “use and users of electronic library resources: an overview and analysis of recent research studies,” council on library and information resources, 2003. http://www.clir.og/pubs/reports/pub120/pub120 (accessed jan. 20, 2007). 5.
cathy de rosa et al., the 2003 oclc environmental scan (dublin, ohio: oclc, 2003), http://www.oclc.org/reports/ escan/introduction/default.htm (accessed jan. 20, 2007). 6. d. grant campbell and karl v. fast, “panizzi, lubetzky, and google: how the modern web environment is reinventing the theory of cataloguing,” the canadian journal of information and library science 28, no. 3 (2004): 25–38. 7. roy tennant, “breaking library services out of the box,” presentation (2005), http://www.cdlib.org/inside/news/ presentations/rtennant/2005netspeed/ (accessed feb. 11, 2007); andrew pace, “my kingdom for an opac,” american libraries online (feb. 2005), http://www.ala.org/ala/alonline/ techspeaking/2005colunms/techfeb2005.cfm (accessed feb. 11, 2007); karen g. schneider, “how opacs suck, part 1: relevance rank (or the lack of it),” ala techsource blog (mar. 13, 2006), http://www.techsource.ala.org/blog/2006/03/how-opacssuck-part-1-relevance-rank-or-the-lack-of-it.html (accessed feb. 11, 2007); karen g. schneider, “how opacs suck, part 2: the checklist of shame,” ala techsource blog (apr. 3, 2006), http:// www.techsource.ala.org/blog/2006/04/how-opacs-suck-part2-the-checklist-of-shame.html (accessed feb. 11, 2007); “how opacs suck, part 3: the big picture,” ala techsource blog (may 20, 2006), http://www.techsource.ala.org/blog/2006/05/ how-opacs-suck-part-3-the-big-picture.html (accessed feb. 11, 2007); lorcan dempsey, lorcan dempsey’s weblog (oct. 4, 2005), http://orweblog.oclc.org/archives/000815.html (accessed feb. 11, 2007); kristin antelman, emily lynema, and andrew k. pace, “toward a twenty-first century library catalog,” information technology and libraries 25, no. 3 (2006): 128–139. 8. roy tennant, “libraries through the looking-glass,” 2004 ala midwinter endeavor presentation. http://www.cdlib. org/inside/news/presentations/rtennant/2004ala/ (accessed march 16, 2007). 9. 
charles ammi cutter, rules for a printed dictionary catalogue (washington, d.c.: government printing office, 1876). 10. d. grant campbell and karl v. fast, “panizzi, lubetzky, and google: how the modern web environment is reinventing the theory of cataloguing,” 31. 11. holly yu and margo young, “the impact of web search engines on subject searching in opac,” information technology and libraries 23, no.4 (2004): 194. 12. jakob nielsen, “mental models for search are getting firmer,” in jakob nielsen’s alertbox, http://www.useit.com/ alertbox/20050509.html (accessed feb 20, 2007). 13. eng pwey lau and dion hoe-lian goh, “in search of query patterns: a case study of a university opac,” information processing and management 42, no. 1 (2006): 1316–1329. 14. holly yu and margo young, “the impact of web search engines on subject searching in opac,” 173. 15. dinet jérome, favart monik and passerault jean-michel, “searching for information in an online public access catalogue opac: the impacts of information search expertise on the use of boolean operators,” journal of computer assisted learning 20, no. 5 (2004): 338–346. 16. gregory wool, “the many faces of a catalog record: a snapshot of bibliographic display practices for monographs in online catalogs,” information technology and libraries 15, no. 3 (1996): 184. 17. the fifteen libraries are located at the college of new jersey, library of congress, northwestern university, princeton university, state university of new york at buffalo, temple university, university of arizona, university of florida, university of illinois–urbana-champaign, university of michigan, university of minnesota, university of rochester, university of texas– austin, university of washington, and vanderbilt university. 18. gregory wool, “the many faces of a catalog record: a snapshot of bibliographic display practices for monographs in online catalogs,” 173–195. 19. 
eight titles representing monograph, serial, video recording, and sound recording were used to study the effectiveness of the bibliographic display. the eight titles are: (1) to love the wind and the rain: african americans and environmental history, edited by dianne d. glave and mark stoll. university of pittsburgh press, 2006. (monograph) (2) to kill a mockingbird, by harper lee (monograph) (3) rma annual statement studies, robert morris associates, 1977 (serial) (4) sideways (20th century fox, 2004) (video recording) (5) chamber music (newport classic, 2000) (sound recording) (6) end of summer book of hours ; bright music, naxos, 2003 / by ned rorem (sound recording) (7) jama : the journal of the american medical association, 1960 (serial) (8) the 21st century at work, by lynn a. karoly (rand, 2004) (monograph) 20. many vendors retag the 440 field to 490 in the bibliographic record and create an 830 field based on the contents of the 440 field. the series title in the 830 field receives authority control. many libraries prefer not to restore the 830 field to a 440 field, which causes duplicate series statements on the opac if both fields are displayed. 21. lorcan dempsey, “making data work - web 2.0 and catalogs.” 22. ibid. 23. brian lavoie, lorcan dempsey, and lynn silipigni connaway, “making data work harder,” library journal.com (jan. 15, 2006), http://www.libraryjournal.com/article/ca6298444.html (accessed jan. 28, 2006). 24. tim o’reilly, “what is web 2.0: design patterns and business models for the next generation of software,” (sept. 30, 2005), http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html (accessed jan. 28, 2007). 25. tito sierra, “a faceted interface to the library catalog,” ala 2007 midwinter meeting, http://www.lib.ncsu.edu/endeca/presentations.html (accessed feb. 11, 2007). 26.
karen markey, “the online library catalog: paradise lost and paradise regained?” d-lib magazine 13, no. 1/2 (2007). http://www.dlib.org/dlib/january07/markey/01markey.html (accessed feb. 11, 2007). 27. kristin antelman, emily lynema, and andrew k. pace, “toward a twenty-first century library catalog,” 129. 28. gregory wool, “the many faces of a catalog record: a snapshot of bibliographic display practices for monographs in online catalogs,” 184–185. 29. roy tennant, “lipstick on a pig,” library journal.com (apr. 15, 2005), http://libraryjournal.com/article/ca516027.html (accessed feb. 11, 2007). 30. tim o’reilly, “what is web 2.0: design patterns and business models for the next generation of software.” 31. d. grant campbell and karl v. fast, “panizzi, lubetzky, and google: how the modern web environment is reinventing the theory of cataloguing,” 26.
appendix a. default search keys used by arl libraries (as of march 2007)
appendix b. keyword search keys used by voyager libraries
keyword (relevance) keyword (boolean) keyword with relevance ranking keyword (enclose phrases “in quotes”) keyword anywhere (user “” for phrase) keyword combined (use and/or/not “ “ for phrase) keyword anywhere (relevance ranked) keyword (and or not) keyword anywhere advanced boolean words anywhere keyword boolean basic keyword keyword(s) (user and, or, not, or “a phrase”) any word anywhere boolean search (use and or not) relevance keyword (user + for key terms) command keyword keyword phrase keyword (use “and” “or” “not”) keyword and or not( keyword boolean) keyword (results sorted by relevance) expert keyword keyword keyword expert (user an or not “phrase”) keyword command ranked keyword keyword keyword (ranked by relevance) keyword keyword command search find all words search for a phrase keyword (quick search) boolean search
appendix c.
default keyword search help page provided by voyager system
keyword search
■ enter words and/or phrases
■ use quotes to search phrases: "world wide web"
■ use + to mark essential terms: +explorer
■ use * to mark important terms: *internet
■ use ? to truncate (cut off) words: theat? finds theaters, theatre, theatrical, etc.
■ do not use boolean operators (and, or, not) to combine search terms
boolean
■ use the boolean terms (and, or, not) to combine search terms.
■ use quotation marks to search for a phrase, e.g., "united states"
■ use ? to truncate a word, e.g., browser?
■ use parentheses to group search terms, e.g., (automobile or car) and repair

appendix d. display labels for entries of principal responsibility
libraries | marc 100 (personal name) | marc 110 (corporate name) | marc 111 (meeting name)
u. of arizona | author | author | author
u. of ill. | author | author | conference
lc | personal name | corporate name | meeting name
u. of minnesota | author | author | author
u. of michigan | author | author | author
northwestern u. | author, etc. | author, etc. | author, etc.
princeton u. | author/artist | author/artist | author/artist
u. of washington | author | author | author
suny buffalo | author | author | author
temple | author | corp author | conference
u. of florida | author, etc. | author, etc. | author, etc.
u. of rochester | main author | main author | conference
ut austin | author | corporate author | conference
tcnj | principal author | principal author | conference name
vanderbilt u. | author | corporate author | meeting/event name

appendix e. display labels for publication extent
libraries | marc 362 field
u. of arizona | issued
u. of ill. | publication history
lc | description
u. of minnesota | published
u. of michigan | pub history
northwestern u. | extent of publication
princeton u. | description
u. of washington | (suppressed from opac)
suny buffalo | publication dates
temple | publication started
u. of florida | publishing history
u. of rochester | (suppressed from opac)
ut austin | publication coverage date
tcnj | description
vanderbilt u. | volume/date range

appendix f. display labels for entries of secondary responsibility
libraries | marc 700 (personal name) | marc 710 (corporate name) | marc 711 (meeting name)
u. of arizona | other auth | other auth | other auth
u. of ill champaign | other name | other name | other name
lc | related names | related names | related names
u. of minnesota | contributor | contributor | contributor
u. of michigan | contributors people | contributors other | contributors other
northwestern u. | other authors, title, etc. | other authors, title, etc. | other authors, title, etc.
princeton u. | related name(s) | related name(s) | related name(s)
u. of washington | alt author | alt author | alt author
suny buffalo | contributors | contributors | contributors
temple | other author(s) | other author(s) | other name
u. of florida | other author(s), etc. | other author(s), etc. | other author(s), etc.
u. of rochester | other author(s) | other author(s) | other author(s)
ut austin | added author | (not display) | (not display)
tcnj | other contributor(s) | other contributor(s) | conference name
vanderbilt u. | author, editor, etc. | corporate author | meeting/event

appendix g.
examples of best practices of opacs (accessed july 16, 2007)
search interface, including retaining search keys and searched terms
university of notre dame http://alephprod.library.nd.edu:8991/f/?func=find-b-0&local_base=ndu01pub
keyword searching ability
michigan state university http://magic.msu.edu/search~/x
facets browsing (endeca)
north carolina state university http://www.lib.ncsu.edu/catalog
mcmaster university http://libcat.mcmaster.ca
make author, subject and call number links more accessible
university of virginia https://virgo.lib.virginia.edu/uhtbin/cgisirsi/0/uva-lib/0/60/1180/x
links to amazon ratings
ohio state university http://library.ohio-state.edu/search
direct export to refworks
johns hopkins university https://catalog.library.jhu.edu/ipac20/ipac.jsp?profile=default#focus
university of chicago http://libcat.uchicago.edu/ipac20/ipac.jsp?profile=ucpublic
cover art/toc/summary/review
indiana university http://www.iucat.iu.edu/authenticate.cgi?status=start
guesstimate/del.icio.us persistent link enabled
virginia tech http://addison.vt.edu

the new york public library automated book catalog subsystem
s. michael malinconico: assistant chief, systems analysis and data processing office and james a. rizzolo: chief, systems analysis and data processing office, the new york public library.
a comprehensive automated bibliographic control system has been developed by the new york public library. this system is unique in its use of an automated authority system and highly sophisticated machine filing algorithms. the primary aim was the rigorous control of established forms and their cross-reference structure. the original impetus for creation of the system, and its most highly visible product, is a photocomposed book catalog.
the book catalog subsystem supplies automatic punctuation of condensed entries and contains the ability to produce cumulation/supplement book catalogs in installments without loss of control of the cross-referencing structure.
journal of library automation vol. 6/1 march 1973
background
in 1965 studies confirmed what much of the new york public library's administration had long felt: the public card catalog of the research libraries, containing entries dating back to 1857, was rapidly deteriorating.1 it was estimated that 29 percent of the cards were illegible, damaged, or in some other way unusable. further, cataloging and card filing arrearages were monotonically increasing at an alarming rate. increases in labor costs were eroding all efforts to cope with these problems manually. in addition, the deputy director at that time (now director), john m. cory, realized that a wider base of support was absolutely essential to the survival of the new york public library as an institution.
as a result of these disquieting observations, three logical conclusions followed. first, the existing card catalog would have to be closed off, rehabilitated, and photographically preserved. second, available technology should be explored as a possible solution to some of the spiraling arrearage problems. in particular the applicability of computer technology was to be explored. this exploration appeared to offer some most attractive long-term solutions. the capture of all future cataloging in a machine-readable form would obviate for all time the deterioration problem. this strategy could also provide a basis for a check against spiraling costs, since traditionally unit costs have tended to increase in manual and decrease in automated systems.2 seen within the context of the marc project at the library of congress (lc), the economies were becoming manifestly obvious.
the long-term benefits to the entire library community of a national network of shared machine-readable bibliographic data could not be denied. capture of data in machine-readable form for use by information retrieval systems which might become economically feasible in the near future had to be viewed as a matter of great value. third, wider access to the resources of the new york public library had to be provided if a wider base of support for the library's operation was to be sought.
the solution decided upon was the development of an automated bibliographic control system capable of producing photocomposed book catalogs. the book catalog would then serve as the prospective catalog and augment the retrospective card catalog, which would also appear in book form following photographic duplication of the cards.3 this solution, at one stroke, addressed itself to all three of the major problems, and showed great promise as a future investment. reproducible book catalogs could be widely distributed. a machine-based system would eliminate manual filing, would take full advantage of cataloging available from marc, and would begin at the earliest possible time the establishment of an invaluable machine-readable bibliographic data base.
photographic techniques had already been employed in producing book catalogs, e.g. the national union catalog, the book catalog of the free library of philadelphia, and the enoch pratt free library catalogs, among others.4 computer-produced book catalogs embodying various techniques (computer line printing, photo-typesetting, etc.) and levels of sophistication were being produced by many institutions, e.g. harvard university's widener library shelflist, stanford university's undergraduate library catalog, baltimore county public library's catalog, among others.5-7 an extensive review of various types of book catalogs including typical pages of each is given by hilda feinberg.8
following extensive studies conducted by messrs.
henderson, rosenthal, and nantier of the nypl research libraries, the systems analysis and data processing office (sadpo) was formed, staffed by edp and library specialists, to be completely dedicated to the solution of problems of automated bibliographic control and library automation. from the beginning it was decided that if edp technology were to be utilized, it should be utilized in a manner which took full advantage of the properties of the medium. the computer was not to be used as an ultrasophisticated and costly printing press. the application of new technology to a field will invariably lead to waste and awkward results if the intrinsic properties of the technology are not fully utilized. the fundamental properties of edp technology lie in its abilities to:
1. reorganize and combine data;
2. select items meeting a set of predefined conditions;
3. maintain a permanent but flexible correlation between items;
4. transform a set of conditions into data;
5. perform all of the above with remarkable speed and accuracy;
6. perform all operations with a merciless consistency.
thus, it was realized, at the outset of the project at nypl, that technology could provide a great deal more than the maintenance of a machine-readable record and its reorganization for display. a rigorous control of bibliographic data was possible, and would extract maximum utility from any investment in edp technology. it was with these ideas in mind that machine-based authority control and filing systems were developed. the authority control file provides the fundamental utility of the system. control of data usage has always been of paramount concern to the professional bibliographer. it becomes even more important in a machine-based system in which the data lie in an essentially invisible form until a fairly complex display operation is performed.
advantages of an authority file
another bibliographic aid which the computer could provide through an authority control system was the maintenance and integrity of a cross-reference structure. in addition, one of the classical functions of cross-referencing could be eliminated: it would no longer be necessary to direct a user from one classification which has been used extensively to a newer one when terminology changes. consider the problems which might arise if the library of congress were to change its current usage of the heading aeroplane to airplane. it would be virtually impossible, under a manual system, for a library to attempt to locate, alter, and refile all cards bearing the tracing aeroplane. with a central authority file the problem is reduced to a single transaction and a fraction of a second of effort by the computer. the change is effected with an accuracy unattainable in a manual system. finally, the common nuisance of a cross-reference leading to yet another cross-reference is automatically obviated.
the presence of a machine-readable authority file and the ability to verify use of all forms against this central authority, with machine accuracy, eliminates all clerical errors in the usage of names and headings to which a manual system is susceptible. the problem of consistent usage is greatly compounded in a machine-based system which does not provide mechanical verification. inconsistencies in any automated system generally tend to diminish its utility, and invariably lead to ludicrous results. nonetheless, inconsistencies of usage in an automated system are more readily corrected than those in a manual system. the existence of a central authority file, however, reduces the operation to maximum simplicity and allows no deviation from established standards.
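the single-transaction heading change described above can be sketched in a few lines of python. this is only an illustrative model under stated assumptions, not the nypl system's actual design: the class `AuthorityFile`, its method names, and the two sample titles are hypothetical and were introduced here. the sketch assumes that bibliographic records store a reference to the authority record rather than the heading text, so that changing aeroplane to airplane touches one record, and that see references are followed to their end so that a reference never leads only to another reference.

```python
class AuthorityFile:
    """Toy central authority file (hypothetical names, for illustration only)."""

    def __init__(self):
        self._form_by_id = {}   # authority id -> established form
        self._id_by_form = {}   # established form -> authority id
        self._see = {}          # unused variant form -> form it refers to
        self._next_id = 1

    def establish(self, form):
        """Register an established form and return its authority id."""
        auth_id = self._next_id
        self._next_id += 1
        self._form_by_id[auth_id] = form
        self._id_by_form[form] = auth_id
        return auth_id

    def add_see_reference(self, variant, target):
        """Record that `variant` is not used and refers to `target`."""
        self._see[variant] = target

    def resolve(self, form):
        """Verify a form against the file, following see references to the end."""
        visited = set()
        while form not in self._id_by_form:
            if form in visited or form not in self._see:
                raise KeyError(f"form not in authority file: {form!r}")
            visited.add(form)
            form = self._see[form]
        return self._id_by_form[form]

    def change_heading(self, auth_id, new_form):
        """Replace an established form in a single transaction."""
        old_form = self._form_by_id[auth_id]
        if new_form == old_form:
            return
        del self._id_by_form[old_form]
        self._see.pop(new_form, None)      # new form may have been a variant
        self._form_by_id[auth_id] = new_form
        self._id_by_form[new_form] = auth_id
        self._see[old_form] = new_form     # old form now refers to the new one

    def heading(self, auth_id):
        """Return the current established form for display."""
        return self._form_by_id[auth_id]


authority = AuthorityFile()
aeroplane_id = authority.establish("aeroplane")
authority.add_see_reference("flying machine", "aeroplane")

# Bibliographic records carry the authority id, not the heading text.
# The titles below are made-up sample data.
catalog = [
    {"title": "history of flight", "subject": authority.resolve("aeroplane")},
    {"title": "early aviators", "subject": authority.resolve("flying machine")},
]

authority.change_heading(aeroplane_id, "airplane")   # the single transaction

for record in catalog:
    print(record["title"], "->", authority.heading(record["subject"]))
```

in this sketch both sample records display the heading airplane after the one call to `change_heading`, with no record located, altered, or refiled. a form that was never established, such as a misspelling, raises an error in `resolve` rather than entering the catalog, which mirrors the mechanical verification the text describes.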
while maximum rigor in machine control was attempted, an attempt was also made to shield the professional librarian, who would be using the system, from as much of the tyranny imposed by the machine as possible. in the system finally adopted, the librarian need only exercise care when establishing a form. following establishment of the form, the cataloger need not be concerned with any of the details of the entry, such as punctuation, accent marks, marc delimiting or categorization. the authority subsystem supplies all such details. in short, the cataloger is only required to spell the form correctly. the machine will identify any incorrect usage; thus a great deal of tedious and time-consuming (and thereby costly) manual searching is eliminated.
at the same time that work began on the automated system at nypl extensive activity in library automation was also in progress in many other parts of the country, involving virtually all areas of library operation: cataloging, acquisitions, serials control, circulation, and reference services (information retrieval). since, at nypl, it was assumed that the bibliographic data base and its control would form the cornerstone of each of these systems, cataloging was given first priority. this approach differed from that taken at other institutions; others, columbia university for example, chose to develop an acquisitions system first.9 still others developed highly sophisticated circulation systems, ohio state university being notable among these.10 even among those institutions which chose to address themselves to the problems of automated cataloging, important differences in approach were evident. these differences were largely a result of attempts to solve different types of problems related to cataloging. among the many projects initiated at that time two will be mentioned, as they are representative of the differences in approach to automated cataloging.
the first is represented by the university of california union book catalog project, undertaken by the institute of library research (ilr). this system is characterized by an attempt to minimize, via computer programming, manual intervention in data preparation. employing the technique of automatic format recognition, the ilr staff attempted to find the most economical means of rendering a vast amount of retrospective data into machine-readable form.11 in converting such a large amount of data they also had to concern themselves with the statistical error levels to be expected from keying. having decided that extensive manual edit was too time-consuming and costly, and itself prone to statistical error, they attempted to create computer programs which would use the massive amounts of data as a self-editing device. in a sense, ilr used the nature of the problem as its own solution. the goal of the project was the production of a book catalog representing a five-year cumulation (1963-1967) of materials on the nine university of california campuses, and a marc-like data tape clean enough for print purposes.
nypl, on the other hand, decided to consider only prospective materials in a continuously published catalog, and the creation of a marc-like record which would approach in completeness, as closely as was economically feasible, that created by the library of congress. to this end manual tagging and editing were absolutely essential.
the second system to be considered is the shared cataloging system developed by the ohio college library center.12 the primary emphasis here is on the economy to be derived by instantaneous access to the combined cataloging efforts of a cooperating group of libraries. at oclc the primary emphasis was placed on on-line bibliographic data input and access. the major bibliographic product to be produced was a computer printed card set.
the overriding consideration of oclc was the sharing of resources among many users, while at nypl the major concern was the content integrity of a single user's file.
advantages of a book form catalog
a book form catalog has several advantages over a card form catalog: it is portable, compact, more readily scanned and extremely simple to reproduce. when coupled with an automated system for maintenance and production the advantages are greatly magnified, as manual filing is virtually eliminated. the format, sequencing, and usage of terms in a book catalog may be varied at will to accommodate users' needs and library service policies. advantages and disadvantages of book catalogs are summarized in the introduction to tauber and feinberg's collection of articles on book catalogs.13 comparisons of book versus card catalogs are presented by catherine macquarrie and irwin pizer in articles reprinted in the work cited above.14,15
the most obvious advantage of the book catalog is its portability. wide availability of the catalog of a library's collection makes possible a level of service not economically feasible under any other system. access to the complete collection of a library system can be made available economically to every educational institution in the region served by the system. access to a highly valuable research collection can be made available to a much wider geographic region than was hitherto possible. the concept of a union catalog for a region becomes much more viable, making possible regional cooperation in acquisitions policies and relieving the burden of heavily duplicated collections currently borne by library systems within manageable geographic regions. such cooperative ventures allow the cost of maintaining the catalog to be defrayed among the various members of the consortium.
thus, a book form catalog would appear to provide groups of libraries with the possibility of operating economies, while increasing the overall level of service to the public they serve. the utility of a book form union catalog has already been demonstrated by the experience of the mid-manhattan libraries in new york. mid-manhattan, a central circulating library, consists essentially of five libraries in two locations. provision of complete bibliographic access with a traditional card catalog would require the manual maintenance of five individual and two union catalogs. the utility of the mid-manhattan catalog has been further increased with the inclusion of the entire nypl branch library system in january 1973.
a library's internal operation benefits by wide availability of the catalog, as individual copies of the catalog can be made available to the acquisition division, the cataloging division, and each of a library's special collection administrators, making references to the traditional official catalog more efficient; such has been the experience of nypl. baltimore county public library reports a similar finding.16
a perhaps hidden advantage of a book catalog lies in its compactness. a book catalog requires neither the space nor the expensive furniture required by a card catalog. the problem of space becomes more and more acute as the "information explosion" continues to mushroom. an ironic squeeze is encountered in that the collection yearns more and more for the space occupied by the catalog, while the catalog, in growing, continues to make its own demands on available space.
description of the nypl bibliographic system files
before attempting to describe the book catalog subsystem, we shall briefly describe the nature of the files from which the bibliographic data are drawn. the complete bibliographic system consists of four major files and computer programs for their control and maintenance.
the files* are:
1. complete marc data base (updated weekly with all changes and additions) from which cataloging may be drawn;
2. bibliographic master file;
3. authority master file;
4. bibliographic/authority linkage file.
for the purpose of this discussion we shall take the existence and maintenance of these files for granted, and concern ourselves solely with their use in the production of photocomposed book catalogs.17
* the system actually consists of three independent sets of such files (marc is common to all), one each for the research libraries, the branch libraries, and the dance collection.
bibliographic master file
this file contains unit records for each bibliographic item in the collection;† books and book-like materials, monographs, serials, analytics, and indexing items are included.‡ the information content is identical to that of marc records. tagging and delimiting adhere to the marc conventions except in those cases in which it was necessary to expand delimiting in order to enhance the functional utility of the marc coding structure. some data distinctions which marc has since dropped, but which are nonetheless useful, have been retained. the expansions consist of the addition of several delimiters not used by marc in order to provide filing forms (which are automatically generated, but which may be manually overridden) for titles, and sequencing information for volume numbers of series and serials.
† separate data bases are maintained for the research and branch library systems. the research libraries file contains all book and certain book-like material added to its collections since january 1971. the branch libraries' file contains all holdings of books and book-like materials of the mid-manhattan library collections. this file currently duplicates to a large extent the holdings of the rest of the branch system, and will eventually encompass the entire system.
transformations from a marc ii communications format to the nypl format and vice versa are possible due to the isomorphism of the two records. the transformation of marc ii format records into nypl processing format is carried out in the normal course of processing, in which marc records are selected for addition to the nypl files.
authority master file
this file is the central repository of all established forms. names (personal, corporate, and place), series titles, uniform titles, conventional titles, and topical subject headings are all established on this file. categorization of each form is controlled by this file. no form is accepted for use in a bibliographic record unless it matches a form already established on the authority file, and is used consistently with the categorization assigned to it; e.g., a form categorized as a topical subject is never permitted as an author, a series title may only match a form categorized as a title, etc. the cross-reference and note structures are maintained on this file.
an additional heading employed by nypl, which falls conceptually halfway between a cross-reference and a subject heading, the dual entry, is also controlled here. the dual entry heading serves to bring together, under a non-lc heading, bibliographic items which nypl considers unique, by virtue of the nature of its collection. an example might be found in the genealogy division, which contains a very extensive collection dealing with new york city. use of the dual entry allows a sequencing under both a subject heading indirectly regionalized to new york city (lc heading) and at the same time a drawing together of all items about new york city into a single sequence headed by new york city. take, for example, the lc established heading elections-new york (city); nypl automatically causes all items traced to the above heading to appear under both the lc heading and the dual entry new york (city)-elections (figure 1). the dual entry merely provides an alternate form of organization for display. no bibliographic tracing is permitted directly to a dual entry. the additional entry point is automatically created when a catalog is printed. no manual effort by the cataloger is needed to provide the additional entry point; in addition, the bibliographic record remains rigorously marc-compatible.
‡ at nypl a distinction is made between analysis and indexing of a work. the latter refers to selective analysis, used when it is desired to provide, for example, subject access to a significant article in a periodical without creation of the series added entry. there are two types of indexing provided by the nypl system. the first creates only a subject tracing; such treatment might be accorded an article of topical significance by a staff writer of a popular periodical. the second would create both an author and subject entry; this might be used in the case of an author of note writing on a significant subject in a popular periodical, e.g. norman mailer writing on political conventions for esquire magazine.
fig. 1. the nypl research libraries dictionary catalog, july 1972: ciu-f page 201 on the left, and l-n page 297 on the right. dual entries under new york (city) are shown on the right. this catalog was produced in 6 and 8 pt. type set on 8 pt. body.
automatic control of cross-references, dual entries, and the en masse alteration of classification are facilitated by the authority subsystem together with the correlative and reorganizational capabilities of the computer. there is some irony in the relative ease with which the computer allows such individualized organization of data to be effected and the computer's reputation, richly deserved, for imposing a bland uniformity on its victims.
the authority file provides one other invaluable service: it controls, in a single location, filing forms to be associated with a heading. consistency of filing is assured and, again, extreme simplicity of alteration is possible. only one record need be changed in order to alter the filing of the entire
body of material associated with a heading. filing forms are automatically generated, with provision made for a manual override. automatic filing has been found to be correct in better than 95 percent of the cases currently in use. the remaining 5 percent required manual intervention.
the machine filing algorithms are based on language and on marc categorization and delimiting.18 initial articles are dropped in each of thirty-eight languages, including the major languages transliterated into a romanized alphabet (those employing cyrillic alphabets, oriental languages, hebrew, and yiddish). chronological subdivisions are filed automatically observing rules regarding inclusiveness of dates, etc. important chronological periods (currently fifty-four such periods) are recognized and filed automatically, e.g. american revolutionary and civil wars, french revolutions, chinese dynasties, middle ages, etc. roman enumeration is automatically filed in correct decimal sequence.
bibliographic/authority linkage file
the basic function of the bibliographic/authority linkage file is to provide a communications channel between the two major files by assigning to each authority form a neutral unique number. the linkage file then provides access to the established form regardless of the metamorphoses which it may have undergone since its original use (the number remains inviolate). each authority upon addition to the file is assigned a unique number; however, the authority file is sequenced by an alphabetic sort key. this sort key bears no logical relationship to the filing form of the heading; it is constructed by dropping punctuation and accent marks, converting to upper case, dropping multiple blanks, and appending a hash total. the linkage file maintains the correspondence between authority control number and alphabetic sort key.
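the construction of the alphabetic sort key can be sketched as below. the normalization steps follow the description above; the hash total is not specified in the article, so the one used here (the sum of the character codes of the original heading, modulo 10000) is purely an assumption, as are the function name `sort_key` and the four-digit suffix. presumably the hash total serves to keep apart headings that become identical once punctuation and accents are dropped.

```python
import unicodedata


def sort_key(heading: str) -> str:
    """Build an authority-file sort key (illustrative sketch only)."""
    # Drop accent marks: decompose, then discard combining characters.
    decomposed = unicodedata.normalize("NFD", heading)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Drop punctuation: keep only letters, digits, and blanks.
    kept = "".join(c for c in no_accents if c.isalnum() or c.isspace())
    # Convert to upper case and collapse runs of blanks.
    normalized = " ".join(kept.upper().split())
    # Assumed hash total: sum of character codes of the original heading.
    hash_total = sum(ord(c) for c in heading) % 10000
    return f"{normalized} {hash_total:04d}"


key = sort_key("elections - new york (city)")
```

two headings that differ only in accents or punctuation share the same normalized text but, under this assumed hash, receive different keys.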
only the authority control numbers, determined by the first bibliographic/authority file match for each field, are carried in the bibliographic records. in addition, information is provided to the book catalog subsystem regarding changes in the authority file (alteration of established forms, etc.) which would cause an entry exhibiting such alterations to be immediately regenerated for inclusion in a book catalog supplement. appropriate action is taken against the bibliographic file when activity to an authority heading is sensed by the book catalog subsystem. the presence of a dual entry form, which will require the creation of an additional entry under the associated variant form, is also indicated here.
alternative input files
it should be mentioned that the full set of files described above is not a mandatory requirement for creation of a book catalog. a bibliographic file in a marc ii communications format alone will suffice. we have performed tests using both another library's data file and the marc file as sole input to the system. using unmodified file update software we have generated from these marc ii format data bases complete authority files, and thence book catalogs. no cross-references or scope notes are possible in this mode of operation, since marc makes no provision for them. a further experiment was performed using another library's data base (in marc ii format) in combination with the cross-reference structure of the nypl authority file. this led to highly satisfactory results, demonstrating that a photocomposed catalog could be created, and exhibiting the utility of the input file enhanced by cross-references.§
the photocomposed book catalog subsystem
the system for production of book catalogs represents only the visible tip, albeit a large and complex tip, of the entire bibliographic system. it consists, in all, of ten computer programs and several score modules.
the system was designed with thought toward production of catalogs with a variety of output options. in most cases, these options can be attained by the elimination of entire programs or modules. space does not permit a consideration of all possible variations; the most important will be mentioned in the course of the discussion. one consideration which was deemed of paramount importance was to remain as independent of photocomposition hardware as possible. photocomposition is yet in its infancy; hence, an inextricable commitment to a particular device, it was decided, was to be avoided. the final approach taken was the design, by sadpo, of generalized photocomposition software which is responsive to device-independent typographic commands. the only function of this software is to accept, as input, completely defined text data and typographic instructions from which it generates formatted pages. this task is accomplished via a translation of device-independent into device-particular commands in the form of a photocomposition device driver tape. should a new or more desirable photocomposition device become available, or significant advantage be found in employing a different photocomposition vendor, only one program need be altered. the photocomposition software is completely generalized and can be used to generate anything from book catalogs to typeset prose, in virtually any format (see the section on the pagination program for a discussion of the formatting options provided). figures 2, 3, and 4 demonstrate some of the possibilities . . the creation, organization, and control of data to appear in the catalog was undertaken as a completely distinct set of programming tasks. 
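the separation between device-independent typographic commands and a device-particular driver can be illustrated with a small sketch. it is not the sadpo software: the command names, the two device tables, and the device codes are all invented for the example. the point is only that changing the photocomposition device means swapping one translation table while the command stream stays the same.

```python
# Device-independent typographic commands as (operation, argument) pairs.
# The operation names and the device codes below are invented for illustration.
COMMANDS = [
    ("point_size", 8),
    ("text", "elections - japan."),
    ("new_line", None),
]

# One translation table per (hypothetical) photocomposition device.
DEVICE_A = {
    "point_size": lambda n: f"<PS{n}>",
    "text": lambda s: s,
    "new_line": lambda _: "<NL>",
}
DEVICE_B = {
    "point_size": lambda n: f"[size {n}]",
    "text": lambda s: s,
    "new_line": lambda _: "[br]",
}


def drive(commands, device_table):
    """Translate device-independent commands into one device's command stream."""
    return "".join(device_table[op](arg) for op, arg in commands)


tape_a = drive(COMMANDS, DEVICE_A)
tape_b = drive(COMMANDS, DEVICE_B)
```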
§ the september 1972 hennepin county public library book catalog was published with a bibliographic data base produced by the hennepin county library combined with the nypl research libraries' authority file.
fig. 2. the nypl research libraries dictionary catalog supplement, november 1972: a-z page 1. this catalog was produced in 6 and 7 pt. type set on 7 pt. body utilizing a three column format. captions are in 8 pt. type.
fig. 3. the mid-manhattan names catalog, april 1972: a-cit page 20. this is a divided catalog produced in a two column format utilizing 6 and 8 pt. type set on 8 pt. body.
fig. 4. the mid-manhattan titles catalog supplement, july 1972. this page was created as a test utilizing 4 pt. type on 4 pt. body. the actual supplement created for use by the public was created in 6 and 8 pt. type.
design objectives of the book catalog system
before embarking upon a discussion of the technical aspects of each processing step, we shall state the objectives which we set out to meet, and the constraints, generally economic, under which they were met.
method of publication
as it is economically impractical to publish the entire catalog on a very frequent basis, a cumulation/supplement scheme was adopted. two basic types of supplements are possible: (1) a supplement containing only new items for the period represented; or (2) a cumulative supplement containing all items new to the system since the last appearance of a cumulation of the entire collection, automatically replacing all previous supplements.
the latter is more costly than the former. the economy of the former was forgone in favor of convenience to the user. under the scheme adopted, a user has, at any time, only three sources to consider: the retrospective catalog, the prospective cumulation, and the cumulative supplement.‖
we have derived several optimization formulae for reaccumulation schedules.19 application of these formulae indicated a reaccumulation cycle of approximately one year, assuming that supplements would appear monthly. the formulae also indicated that a small premium would have to be paid for the administrative convenience of spreading the printing and processing load of the cumulation over the span of the entire reaccumulation period, compared to the cost of a complete printing at the beginning of each period. the adopted publication scheme calls for the publication each month of 1/12 of the cumulation, together with a supplement containing all items which have not yet appeared in the cumulation and those which have been altered since their appearance in a cumulation. the division into twelve segments is table-controlled; the number of segments may be varied from one to sixteen. for example, in january a cumulation is published for the alphabetic span a-b; a supplement is published for the remaining letters of the alphabet. a similar situation would occur the following month, etc.
thus, at any given time the public is presented with a set of volumes representing the cumulated catalog and a supplement which contains all material not found in the former. the public is unaware of the fact that the cumulation is being cyclically updated. they are only aware of the fact that they have no more than three sources to consult: (1) the old card catalog, (2) the basic cumulative book catalog, and (3) the cumulative supplement. the fact that entries are migrating from the supplement to the basic cumulation each month is of no consequence from the standpoint of catalog usage.
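the cyclic scheme can be made concrete with a short sketch. it assumes a twelve-segment table in which only the january span a-b comes from the text; the other spans, the idea of counting monthly issues as integer steps, and the names `SEGMENTS`, `segment_index`, and `source_for` are assumptions made for illustration. an entry is taken to sit in the cumulation once its segment has been republished at or after the step in which the entry was added or last altered; until then it is carried in the cumulative supplement.

```python
# Hypothetical segment table: twelve alphabetic spans, one recumulated per month.
# Only the a-b span for january comes from the article; the rest are invented.
SEGMENTS = [("a", "b"), ("c", "c"), ("d", "e"), ("f", "g"), ("h", "i"), ("j", "l"),
            ("m", "m"), ("n", "o"), ("p", "r"), ("s", "s"), ("t", "v"), ("w", "z")]


def segment_index(filing_form):
    """Find the table segment whose alphabetic span covers the filing form."""
    first = filing_form[0].lower()
    for index, (low, high) in enumerate(SEGMENTS):
        if low <= first <= high:
            return index
    raise ValueError(f"no segment covers {filing_form!r}")


def source_for(filing_form, changed_step, current_step):
    """Say where a user finds an entry as of the issue at current_step.

    Steps count monthly issues from 1 (january of the first cycle).
    Segment i is recumulated at steps i+1, i+1+n, i+1+2n, and so on;
    a steady state is assumed, so earlier cycles also exist.
    """
    n = len(SEGMENTS)
    seg = segment_index(filing_form)
    # Most recent step, not after current_step, at which this segment was printed.
    last_cumulated = current_step - ((current_step - (seg + 1)) % n)
    return "cumulation" if changed_step <= last_cumulated else "supplement"
```

for instance, an entry filed under c and added in january stays in the supplement for that month and migrates into the cumulation when the c segment is published in february.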
‖ all material in the card catalog has become known as the retrospective collection, and all material entered into the automated system after january 1972 has become known as the prospective collection.
the decision governing representation of an item in a cumulation or supplement is made on an entry by entry basis. for example, one of the subject added entries may have migrated into the cumulation; hence, it will no longer appear in a supplement. however, the main, and all other added entries, falling into different filing ranges, will continue to appear in a supplement until they too can be absorbed into the cumulation. similarly, alterations to a bibliographic record will cause only those entries whose text or sequencing is affected to reappear in a supplement. a change to or an addition of a subject tracing will cause only that subject added entry to be regenerated for inclusion in a supplement. the main, and all other added entry citations, which remain unaltered, need not reappear in a supplement (assuming they have previously migrated into the cumulation).
condensed added entries
in order to keep printing costs to a minimum, all added entries are condensed; title page extension, publisher, and bibliographic notes do not appear under any of the added entries, the assumption being that the user who is interested in such data will take the trouble to refer to the main entry, which contains the complete bibliographic citation. this type of back-and-forth reference, while quite awkward in a card environment, is extremely simple in a book catalog. economic considerations also led to the decision to suppress tracings from the main entry. the system was designed so that these decisions are not irreversible. the choice of data which are to appear with an entry is governed by a set of tables which may be readily altered should it be desired to change the format or context of an entry.
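the table-governed choice of data can be sketched as follows. the field names, the table layout, and the function `select_fields` are hypothetical; the sample citation is taken from figure 1, except for the publisher, which is a labeled placeholder because the condensed entry in the figure does not show one.

```python
# Hypothetical table: which fields print under each kind of entry.
# Added entries omit title extension, publisher, and notes;
# tracings are suppressed everywhere, including the main entry.
ENTRY_FIELDS = {
    "main": ["author", "title", "title_extension", "place",
             "publisher", "date", "collation", "notes"],
    "added": ["author", "title", "place", "date", "collation"],
}


def select_fields(record, entry_kind):
    """Return only the fields the table allows for this kind of entry."""
    return {name: record[name] for name in ENTRY_FIELDS[entry_kind] if name in record}


record = {
    "author": "curtis, gerald l.",
    "title": "election campaigning, japanese style",
    "place": "new york",
    "publisher": "(placeholder publisher)",  # not shown in the source figure
    "date": "1971",
    "collation": "xiii, 275 p.",
    "tracings": ["elections - japan"],
}

added = select_fields(record, "added")
main = select_fields(record, "main")
```

reversing a decision, such as printing the publisher under added entries, would then amount to editing one row of the table rather than the program.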
punctuation of condensed entries is accomplished automatically. this is not a trivial problem, and one that only a cataloger can truly appreciate. consider, for a moment, the myriad ways in which bracketing may occur within the title or imprint statement, and the ways in which these may span the two fields. add to these factors the rules which do not permit the appearance of double punctuation. we have found that punctuation of added entries is effected correctly in 98 percent of catalog entries. in those instances in which ala punctuation rules are observed in the complete record, correct punctuation is assured (this is not true of cataloging obtained from european sources).
control of cross-references
it is in the realm of cross-references that the mindless consistency of the computer is most effectively employed. the goal to which we addressed ourselves was the absolute integrity of cross-referencing. under no circumstances, short of erasing a cross-reference from a previously published catalog, were cross-references to refer the user to a heading which did not have an associated bibliographic citation. all meaningful cross-references providing alternate access points to a citation must appear. by the same token, in order to minimize costs, cross-references which appear in a cumulation available to the public are not to be repeated in a supplement.
cross-references to a heading would be considered valid entry points to the catalog when bibliographic citations appear under a subdivision of that heading. for example, the appearance of bibliographic citations under negro art-exhibitions would cause all cross-references to negro art to be generated (figure 5). the same rules concerning appearance in supplements and cumulations are observed for these secondary cross-references.
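the two rules just stated, that a reference is printed only if its target heading (or a subdivision of it) carries citations, and that a reference already available in a cumulation is not repeated in a supplement, can be sketched as below. the representation of subdivisions with a " - " separator, the data layout, and the function names are assumptions made for the example; the headings are taken from figures 1 and 5.

```python
def heading_in_use(heading, cited_headings):
    """True if citations appear under the heading itself or any subdivision of it."""
    prefix = heading + " - "  # assumed separator between heading and subdivision
    return any(h == heading or h.startswith(prefix) for h in cited_headings)


def references_for_supplement(cross_refs, cited_headings, already_in_cumulation):
    """Select the (from_form, to_heading) pairs a supplement should print."""
    return [
        (from_form, to_heading)
        for from_form, to_heading in cross_refs
        if heading_in_use(to_heading, cited_headings)
        and (from_form, to_heading) not in already_in_cumulation
    ]


cited = {"negro art - exhibitions", "negro art - united states"}
refs = [
    ("negroes - art", "negro art"),
    ("elections - jurisprudence", "election law"),
]

fresh = references_for_supplement(refs, cited, set())
repeat = references_for_supplement(refs, cited, {("negroes - art", "negro art")})
```

here the reference to negro art is generated although only its subdivisions carry citations, while the reference to election law is withheld because nothing is filed under that heading.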
alterations to cross-references which have appeared in a cumulation will cause the altered forms to reappear immediately in a supplement, provided the referenced heading is still in use in the catalog. similarly, alteration of the referenced heading would cause the reference to the new form to be automatically generated.

[fig. 5. the nypl research libraries dictionary catalog supplement, october 1972: pages 120 and 121. these pages demonstrate the generation of the cross-reference negroes-art see negro art even though only subdivisions of negro art appear in the catalog.]

a further consideration extends to cross-references which have migrated into a cumulation. when a cumulation segment is updated, all cross-references which previously appeared in it should continue to appear if, and only if, the referenced heading is still in use in either the same segment of the cumulation, another segment of the cumulation, or a supplement; if not, its use is discontinued. subsequent use of the referenced heading would then call up the cross-reference for reuse.

each of the above desiderata requires rather intricate logic when the cumulation is being produced in monthly installments, as any of the following is possible:

1. cross-reference in a supplement, referenced heading in a supplement;
2. cross-reference in a supplement, referenced heading in a cumulation;
3. cross-reference in a cumulation, referenced heading in a supplement;
4. cross-reference in a cumulation, referenced heading in a cumulation.
in each case, the cross-reference must be suppressed whenever the referenced heading disappears from the catalog available to the public, but must be retained when it refers to a heading existing in any part of the catalog. the cross-reference and referenced heading may easily appear in catalog segments published as much as eleven months apart, making it absolutely essential that both the authority and book catalog subsystems maintain strict control of the cross-reference structure.

control of hierarchies

it was decided that the appearance of cataloging under a subdivision of a heading which contains associated notes should cause the higher level heading with its attendant notes to appear. such a heading would be forced to appear regardless of whether or not it itself headed a bibliographic citation, under the assumption that notes concerning a heading might be valuable to a user interested in a subdivision of that heading (see figure 6 for an example).

[fig. 6. the nypl research libraries dictionary catalog supplement, august 1972: page 2. the heading actors is caused to appear due to the presence of a scope note and the use of a subdivision of the heading. on the sample page the scope note ("here are entered works on actors, including both men and women. works about women actors alone or women as actors are entered under the heading actresses.") prints above the subdivision actors, american - biography.]

dictionary and divided catalogs

the same system was required to serve two divisions of the new york public library, each of which has different traditions and philosophies of service to identifiably different users. therefore, an additional flexibility was required of the system: the ability to produce both dictionary form and divided catalogs. the research libraries, which have traditionally used a dictionary form of catalog, wished to continue that practice. the branch libraries, on the other hand, felt that their public could be better served by a divided catalog, separated into titles, subjects, and names. the system was designed in such a manner that the modification of a single parameter in the final sort would produce either form of catalog.

book catalog subsystem - technical description

the entire subsystem consists of ten separate programs, each of which will be described below. the flow charts in figures 7, 8, and 9 depict the processing flow of the subsystem. the system was designed to operate on an ibm 360 model 40 (which has since been replaced with a 370 model 145) with 256k bytes of core storage. the programs were written exclusively in bal for a dos configuration. a conversion to full os has recently been completed. each processing step described below is executed sequentially. significant peripheral devices required are: five tape drives, one disk drive in addition to those required by the operating system, and a line printer. please refer to figures 7, 8, 9, and 10 for the programs and files referenced by symbols p1, t1, d1, etc.

entry explosion and construction - program p1

this program serves as the driver for the entire subsystem.
in this step entries are selected for inclusion in a supplement or cumulation segment. requests for data required from the authority file are initiated. the format and data content of each entry are defined by this program via a set of tables. these tables may be altered at will, allowing redefinition of the format and content of any entry. the bibliographic master file is updated to indicate the appearance of an entry in a cumulation, preventing its subsequent appearance in a supplement.

in addition, this program is charged with accepting communication of activity to the authority file and taking the appropriate action with respect to the bibliographic file. this activity may take several forms: alteration of a heading, change of delimiting, change to a filing form, posting or removal of a cross-reference or dual entry, change of categorization, or the complete transfer of all cataloging from one valid heading to another. evidence of activity to an authority heading is carried on the authority/bibliographic linkage file (d0). when such activity has affected a heading used by a bibliographic record as an authority field, the field is tagged for verification by the authority file in the next file update/authority-interface run. the indicator for the field in question, denoting previous appearance in a cumulation, is turned off. at the same time, the indicators for all other catalog tracings which require that authority field as data are turned off.

when a transfer from one heading to another has occurred, the new linkage number is inserted into the authority directory of the bibliographic record. this is not absolutely necessary, as the authority/bibliographic linkage file provides the link via a chain when a transfer has occurred. nonetheless, the insertion of the true authority control number into the bibliographic file eliminates the necessity of a chained search in all future accesses of the tracing, space on the linkage file is conserved, and no additional indicators are required to make note of the fact that the entry has been caused to reappear in a supplement as a result of the transfer. in all cases of activity to an authority record, reverification is forced for the associated tracing field in order to guarantee correct usage of the altered authority.

[fig. 7. subsystem flow chart: explode catalog entries (p1); generate responses, select headings, and update x-reference linkage; create requests for x-references, dual entries, and higher level headings (p3.1).]

[fig. 8. subsystem flow chart: eliminate duplicate heading requests (p4); locate higher level headings and dual entries (p5); format headings module.]

[fig. 9. subsystem flow chart: create requests for secondary x-references, write headings (p3.2); locate x-references, update x-reference indicators (p6); format headings module.]

[fig. 10. subsystem flow chart: insert authority text data into skeleton catalog entries (p7); pagination (p8).]

each bibliographic record is examined to determine whether it will contribute to the catalog. this is done on an entry by entry basis. each field of the bibliographic record capable of defining a catalog entry is examined. all fields which define a catalog entry (tags 1--, 245, 4--, 6--, 7--) carry a set of indicators denoting appearance in the cumulation, and a number defining the cumulation segment into which the entry should file. an additional indicator for authority fields denotes the presence (or absence) of an associated dual entry on the authority file.
appearance in the cumulation and filing segment number of the dual entry are also carried in the bibliographic record, allowing independent control of the dual entry citation. as may be readily seen, the dual entry acts as a phantom tracing in the bibliographic record and will thus not be specifically mentioned in the discussion of selection criteria below.

an entry is selected for construction on the basis of the following criteria:

1. the bibliographic record is in a valid status, i.e., it has passed all editing tests, and sufficient time for proofreading has elapsed.
2. all authority fields required for construction of the entry have been verified against the authority file in the weekly bibliographic file update/interface production runs.
3. it files in the segment being produced that month.
4. the indicator denoting appearance in the cumulation is not set.

thus, any alteration to the content of a bibliographic record, warranting immediate reappearance of an entry, may be communicated to the book catalog subsystem by the extinction of the cumulation indicator. both cumulation and supplement entries are created in the same run. the entries are separately collated by causing the highest level of the final sort to be a code denoting supplement or cumulation.

it will prove fruitful at this point to draw a distinction between a catalog entry (the printed bibliographic citation) and the machine record which is created by the system prior to phototypesetting. the machine record is nothing more than a highly organized print record. the final merging of such print records from various processing steps completely defines the text, typography, and sequencing of the final printed catalog. the machine print records created by the system up to step p8 will be referred to as text entry (te) records.
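the four selection criteria listed above can be read as a small routing function. the sketch below is one interpretation, not the original bal logic: the class and field names are hypothetical, and the rule that an eligible entry filing outside the current segment goes to the supplement is an assumption drawn from the earlier remark that entries falling into different filing ranges continue to appear in a supplement.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CatalogEntryField:
    record_valid: bool        # criterion 1: edited and proofread
    authority_verified: bool  # criterion 2: verified against the authority file
    filing_segment: int       # cumulation segment into which the entry files
    in_cumulation: bool       # criterion 4: cumulation indicator


def route_entry(entry: CatalogEntryField, current_segment: int) -> Optional[str]:
    """Return 'cumulation', 'supplement', or None (entry not constructed)."""
    if not (entry.record_valid and entry.authority_verified):
        return None
    if entry.in_cumulation:
        return None  # already available in a cumulation segment
    if entry.filing_segment == current_segment:
        entry.in_cumulation = True  # prevents later appearance in a supplement
        return "cumulation"
    return "supplement"  # assumed: eligible, but its segment is not due yet
```

turning the cumulation indicator off again, as happens after an alteration to the record, makes the same entry eligible on the next run.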
when an entry is to be included in a particular month's catalog segment or supplement, a table for the particular type of entry is consulted in order to determine the data and the typographic commands which will govern the entry's format. at this point only a skeleton text entry record is constructed, as all authority data will be obtained from the authority file. the sequencing information is contained in the sort key of each te record, which defines six levels of sorting:

1. collation: catalog or supplement. this is further refined when a divided catalog is being produced.
2. level i sort, and sort code.
3. level ii sort, and sort code.
4. level iii sort, and sort code.
5. publication date.
6. publisher.

in the case of certain series entries, levels ii and iii may be split into two half-size levels by the program in order to further refine the sort sequence. as an example of the use of sort levels i, ii, and iii, we might consider a subject added entry. in that case, the level i sort is defined by the filing form of the subject tracing, level ii by the filing form of the author's name, and level iii by the filing form of the title of the work. the sort codes are used to separate entries which would result in the same sort keys but are conceptually different, e.g., a name which might simultaneously define a title added entry, a main entry, and a subject added entry. a similar situation exists at the second sort level, where conventional titles are to be separated from titles or subject title entries.

sort key levels, like all other data elements required in a te record, will be directly inserted into the record under construction if they consist of nonauthority data; otherwise they will be identified by linkage codes for later insertion when the filing form data are returned from the authority file. the final te record will not be completed until step p7, to be described below.
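the six-level sort key can be modelled as a tuple compared field by field. this is only a sketch: the field names, the numeric collation and sort codes, and the sample filing forms are assumptions, and the half-size levels used for certain series entries are not modelled.

```python
from typing import NamedTuple


class SortKey(NamedTuple):
    collation: int   # assumed coding: 0 = cumulation, 1 = supplement
    level1: str      # e.g. filing form of the subject tracing
    code1: int       # sort code separating conceptually different entries
    level2: str      # e.g. filing form of the author's name
    code2: int
    level3: str      # e.g. filing form of the title
    code3: int
    pub_date: str
    publisher: str


# Invented filing forms for three subject added entries.
a = SortKey(0, "negro art", 2, "chase judith", 0, "afro american art", 0, "1971", "")
b = SortKey(0, "negro art", 2, "fax elton", 0, "seventeen black artists", 0, "1971", "")
c = SortKey(1, "aardvarks", 2, "doe jane", 0, "sample title", 0, "1970", "")

# Tuples compare level by level, so the collation code dominates the sort.
ordered = sorted([c, b, a])
```

because the collation code is the highest level, every cumulation entry sorts ahead of every supplement entry, which is how the two sets of entries created in one run are collated separately.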
following construction of the sort key (or indications to complete a sort key), typographic commands and text data are inserted into the te record. the typographic commands are contained as binary bit settings in a record directory. the directory also defines the location and length of each data element, or gives a linkage code when the data are to be obtained from the authority file, and hence cannot be inserted until program p7. the order of entries in the directory defines the printing sequence of text data. thus, when text data are available, true locations and lengths are provided in the record. when they are not, linkage codes replace them in the directory. these linkage codes are simply replaced by true locations and lengths when the authority text is added to the end of the record by another program (p7). it will suffice at this point to mention that all typographic commands are present in the record. the function of the commands will be discussed in detail below when the pagination program (p8) is discussed.

having constructed a set of skeleton te records, the program initiates requests to the authority file for authority text data and filing forms. requests are also made to the authority file for headings which are to print above the bibliographic citations. these headings will be constructed in the same manner as catalog entries, i.e., as te records. they will then be merged with the respective te records as citation entries. these heading requests also initiate a sequence of processing steps culminating in the location and formatting of all relevant cross-references. the necessary cross-references are formatted into te records, and are likewise merged to form the complete catalog.

when an entry is chosen for inclusion in a cumulation segment, indicators to that effect are set in the bibliographic master record; it is then written onto the updated bibliographic master file.
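the record directory described above, with linkage codes standing in for authority data until p7 appends the text, might be sketched as follows. the class names, the string linkage codes, and the sample data are hypothetical; the original records were fixed-format bal structures.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DirectoryEntry:
    format_code: int                    # typographic bit settings
    location: Optional[int] = None      # offset of the element in the text
    length: Optional[int] = None
    linkage_code: Optional[str] = None  # set while authority data is pending


@dataclass
class TextEntry:
    text: str = ""
    directory: List[DirectoryEntry] = field(default_factory=list)

    def add_text(self, data: str, format_code: int) -> None:
        """Nonauthority data: true location and length are known at once."""
        self.directory.append(DirectoryEntry(format_code, len(self.text), len(data)))
        self.text += data

    def add_link(self, linkage_code: str, format_code: int) -> None:
        """Authority data: only a linkage code is recorded in the skeleton."""
        self.directory.append(DirectoryEntry(format_code, linkage_code=linkage_code))


def complete(te: TextEntry, authority_text: Dict[str, str]) -> None:
    """Append authority text to the end of the record and resolve the links."""
    for entry in te.directory:
        if entry.linkage_code is not None:
            data = authority_text[entry.linkage_code]
            entry.location, entry.length = len(te.text), len(data)
            entry.linkage_code = None
            te.text += data


def printing_sequence(te: TextEntry) -> List[str]:
    """Directory order, not storage order, defines the printing sequence."""
    return [te.text[e.location:e.location + e.length] for e in te.directory]


skeleton = TextEntry()
skeleton.add_link("A1", format_code=0)            # author, from the authority file
skeleton.add_text("sample title.", format_code=0)
complete(skeleton, {"A1": "doe, jane."})
```

the author text is stored after the title but still prints first, because the directory entry for it comes first.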
locate authority data and select headings - program p2

all inquiries to the authority file are sorted into authority sort key sequence and matched with the authority file. all inquiries will result in a match to a valid authority record. a match for each inquiry is assured by the weekly file update/interface processing programs. inquiries to the authority file result in any combination of the following actions: (1) authority text and filing data are supplied, via a response record, to program p7 for the completion of te records created by program p1; (2) authority records are selected to serve as headings above bibliographic citations (these same records will also cause cross-references to be selected); (3) authority records are selected in order to initiate a search for the associated dual entry, as per instructions contained in the inquiry record.

the selected headings consist of complete authority records with instructions regarding their eventual use and routing. headings are routed, via a collation code, into cumulation segments or supplements. since a single authority heading may appear as both a main entry and subject heading, indicators are set defining its eventual use as one, the other, or both. these indicators will be called usage indicators. usage decisions made by p1 are passed to this step as part of the inquiry records. the results of these decisions are then transmitted as a set of codes inserted into the selected authority records.

this program is further charged with the responsibility of keeping current the catalog status indicators for cross-references by maintaining two binary indicators with every cross-reference. a cross-reference record with multiple see fields will have a pair of indicators for each see field. the first binary indicator denotes prior appearance of a cross-reference in a cumulation segment.
the second indicates that the referenced heading currently appears in some part of the catalog. in passing through the entire authority file, this program will note that a heading which falls in the current month's filing range has had no requests for its use lodged against it. when this is the case, transactions are created for every cross-reference, defined by see froms in the heading record, extinguishing the second binary indicator described above. the cross-reference will then not be used again until it is required. the need for this operation will become more evident when we discuss program p6.

the maintenance of the physical linkage between cross-references and headings is performed by the authority file update subsystem. this subsystem guarantees that the linkage is kept current regardless of alterations to headings and cross-references. hence, all see froms are guaranteed to refer to a cross-reference (direct see) record on the file.

explode hierarchies, cross-references and dual entries - program p3.1

the selected authority records are examined for the presence of see from fields. if any are found, they are used to create further inquiries to the authority file for cross-references. a similar operation is performed for dual entries with the exception that the dual entry inquiry is not created unless it was requested by program p1. the request is passed via indicators in the inquiry record (as discussed above in the description of program p2). all records which are subdivisions of headings, e.g., sculpture-technique, will cause inquiries for all significant higher level headings (sculpture in this case) to be created. higher level headings will supply additional entry points via cross-references to them, or may themselves appear if they contain notes. cross-reference requests are separated for later processing. they will be processed with requests for secondary cross-references to be generated by program p3.2 below.
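the explosion performed by p3.1 can be sketched as below. headings are represented as tuples of subdivision parts to avoid guessing at the delimiter used on the authority file; the record class, the request tuples, and the treatment of every higher level as "significant" are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Heading = Tuple[str, ...]


@dataclass
class AuthorityRecord:
    heading: Heading                                    # e.g. ("sculpture", "technique")
    see_from: List[str] = field(default_factory=list)   # see from fields


def higher_level_headings(heading: Heading) -> List[Heading]:
    """All headings above a subdivision, nearest level first."""
    return [heading[:n] for n in range(len(heading) - 1, 0, -1)]


def explode(record: AuthorityRecord) -> List[Tuple[str, object]]:
    """Create follow-up inquiries for one selected authority record."""
    requests: List[Tuple[str, object]] = [("xref", s) for s in record.see_from]
    requests += [("higher", h) for h in higher_level_headings(record.heading)]
    return requests


requests = explode(AuthorityRecord(("negro art", "united states", "history"),
                                   see_from=["art, negro"]))
```

a heading with no subdivisions yields no higher level requests, so only its own see froms are exploded.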
exclude duplicate headings and separate inquiries - program p4

this program is nothing more than a sort with exits. the input tape of selected headings and higher level heading requests is sorted, and if a request for a higher level heading has already been filled by a heading selected in p2, the request is dropped. all usage information carried by the request is logically added to the matching heading. when multiple requests for the same higher level heading are discovered, all but the first are dropped. usage information from all duplicates is added to the retained request by a logical or operation.

the authority records which were selected by p2 for use as headings are formatted into complete text entry (te) records for later input to the pagination program. te heading records are formatted by a single module invoked by this step and again in p5. the surviving hierarchy requests and all dual entry requests are separated for processing in the next step.

format headings module

all heading records selected for print are processed by this module, which converts the input text and filing data of authority records into te records. at times quasi-duplicates of the te record are constructed with different filing and typography codes for use as main entry and subject headings. at times portions of the data are encoded as nonprinting because it is known that the print data will be provided by other heading records. this is the case with author/conventional title records. the author heading is assured because of the explosion of higher level headings; hence, a simple method is provided for ensuring its appearance only once regardless of the number of associated conventional titles.
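the duplicate elimination performed by program p4 amounts to a sort followed by a merge of usage bits. in this sketch the usage indicators are modelled as two bit flags and the records as dictionaries; both choices, and the names used, are assumptions.

```python
from itertools import groupby
from typing import Dict, List

MAIN_ENTRY = 0b01     # assumed bit for use as a main entry heading
SUBJECT = 0b10        # assumed bit for use as a subject heading


def eliminate_duplicates(items: List[Dict]) -> List[Dict]:
    """Keep one record per heading key, OR-ing usage bits from the duplicates.

    Each item has 'key', 'selected' (True if chosen as a heading in P2,
    False if it is a higher level heading request), and 'usage'.
    """
    # Sort by key, with a heading selected in P2 ahead of requests for it.
    ordered = sorted(items, key=lambda r: (r["key"], not r["selected"]))
    result = []
    for _, group in groupby(ordered, key=lambda r: r["key"]):
        records = list(group)
        kept = dict(records[0])          # selected heading, else first request
        for duplicate in records[1:]:
            kept["usage"] |= duplicate["usage"]   # logical OR of usage bits
        result.append(kept)
    return result


deduped = eliminate_duplicates([
    {"key": "sculpture", "selected": False, "usage": SUBJECT},
    {"key": "sculpture", "selected": True, "usage": MAIN_ENTRY},
    {"key": "actors", "selected": False, "usage": SUBJECT},
    {"key": "actors", "selected": False, "usage": MAIN_ENTRY},
])
```

the heading already selected in p2 survives with the request's usage added to it, while the two requests for the other heading collapse into one request carrying both usages.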
when a subject heading record is created, the heading is made to appear twice in the record, once in upper case for printing, and once in its normal upper and lower case form, encoded as nonprinting, for possible use as a dictionary heading by the pagination program. the conversion to upper case is effected via a translate table, because of the presence of control information within the text for floating diacritics. also, diacritics and many special characters do not have a simple upper case equivalent due to the use of the complete ala character set.

punctuation of cross-references is effected in this module. the complexities by no means approach those encountered in punctuating condensed added entries; nonetheless, they do exist. for example: terminal periods in headings referenced in a cross-reference must be replaced with semicolons when more than one heading is referenced; a blank must be inserted following the hyphen and preceding the semicolon in open-ended dates; and the final referenced heading in a string must end in a period unless it terminates with a hyphen, quote mark, exclamation point, question mark, parenthesis, etc. typographic codes which apply to headings, notes associated with headings, and phrases in cross-references are inserted by this program when te records are created.

locate hierarchies and dual entries - program p5

all heading requests are applied to the authority master file. when the heading corresponding to a request is located, the entire authority record is written onto an output file for further processing. this process is similar to that executed when the original heading requests were processed in program p2. higher level headings are encoded for use in accordance with their categorization and filing form. when a requested dual entry heading is located, a te record is written for later processing by the pagination program.
a response record containing the filing form of the dual entry is also written onto an indexed sequential disk file. a direct access file is necessary since the catalog record contains only a link to the primary heading, and all requests for the dual entry come via a request against the primary heading in program p2. rather than attempting a complex scheme for keeping track of all bibliographic items requiring the dual entry data, only one copy of the dual entry response is isolated and indexed by the control number of the primary heading. it is then retrieved on that basis when needed.

explode secondary cross-references, separate and select hierarchical headings - program p3.2

this program is simply a phase of program p3.1 described above. the major difference lies in its handling of the authority records which it accepts as input. they are written out as te records, but only if they meet one of two conditions: the authority record matching the heading request contains notes, in which case it is selected for eventual formatting into a heading; or it represents an author required by an author/conventional title combination. in all other instances higher level headings are not selected for printing. the format headings module is invoked by this step for all higher level headings selected for print. if secondary cross-references are not desired, the explosion module which creates the requests is simply bypassed. similarly, higher level headings may be suppressed. no further attempt is made to generate higher level headings, as they have all been exploded in p3.1. the exploded cross-reference requests are separated in this program, just as they were in p3.1.

locate cross-references - program p6

prior to execution of this step tapes t3.1, t3.2, and t3.3 are sort/merged into a single tape t3.4 (figure 9). t3.4 now contains all of the transactions generated by program p2, and all cross-reference requests.
recall that p2 has created transactions extinguishing the indicator carried by cross-reference headings, denoting that the referenced heading appears somewhere in the catalog. the sort causes all of these transactions to be applied before any cross-reference requests are processed. it might appear a bit paradoxical that a request should be made to a cross-reference whose referenced heading was not selected in p2; however, recall that a cross-reference may be invoked as the result of the use of a subdivision of the referenced heading (secondary cross-reference).

at this point some discussion of the cross-reference record is in order. a cross-reference may point to several headings simultaneously, e.g., animals see aardvarks/ bears/ cats/ ... zebras. each referenced heading is controlled individually. only the required references are extracted as needed. in the example above, if aardvarks and cats appeared in the catalog those two references would have been selected, and no others. hence, the discussion which follows will be greatly simplified if we consider each cross-reference transaction to apply to only a single reference. this is effected operationally by carrying the control number of the heading which gave rise to the cross-reference request within the request.

following the application of transactions, if any, to extinguish indicators, the selection-for-print logic is executed. cross-references are selected for printing when the indicator specifies that the cross-referenced heading appears somewhere in the catalog available to the public, regardless of whether there is a specific request for it, and the cross-reference is filed in the segment being produced. a request for a cross-reference which already appears in a cumulation segment currently in use is ignored. a request for a cross-reference which is not already in the catalog is honored.
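a simplified reading of the selection-for-print rules just stated is sketched below, one see field at a time. the class, the function, and the way the two indicators are updated afterwards are assumptions; the routing of an honored request to the supplement when its segment is not the one being produced is likewise an interpretation, not a transcription of the original program.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SeeField:
    segment: int          # cumulation segment into which the reference files
    in_cumulation: bool   # first indicator: printed in a cumulation segment
    heading_in_use: bool  # second indicator: referenced heading is in the catalog


def select_cross_reference(ref: SeeField, requested: bool,
                           current_segment: int) -> Optional[str]:
    """Return 'cumulation', 'supplement', or None, and update the indicators."""
    in_use = ref.heading_in_use or requested
    ref.heading_in_use = in_use
    if not in_use:
        route = None                     # referenced heading has disappeared
    elif ref.segment == current_segment:
        route = "cumulation"             # printed with the segment being produced
    elif requested and not ref.in_cumulation:
        route = "supplement"             # request for a reference not yet in a cumulation
    else:
        route = None                     # already available in a cumulation segment
    if ref.segment == current_segment:
        # Assumed update: the reissued segment either carries the reference or not.
        ref.in_cumulation = route == "cumulation"
    return route
```

in this sketch the extinguishing transactions created by p2 correspond to setting `heading_in_use` to false before the function is called, which is why they must be applied ahead of any requests.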
the actual logic is somewhat complex; however, the end result is as described above. cross-references to be printed are routed to either a supplement or cumulation installment depending upon the filing range in which they fall. when a divided catalog is being produced cross-references are further routed into the appropriate catalog on the basis of categorization. following the selection of, or refusal to select, a heading, the indicators denoting prior appearance in the catalog and linkage to a heading in use are updated. continuing integrity of the cross-reference structure for future printings of the catalog is thus assured.

complete citation text entry records - program p7

prior to execution of this processing step, response records emanating from p2 are sorted into bibliographic item number sequence. sequencing is necessary since the skeleton te records are in the same sequence as the bibliographic master file. identification of authority response data required by a te record is via bibliographic item number and a sequence number assigned to each authority field within a bibliographic record. subfields of a response record are identified by delimiter. response records are matched to skeleton te records bearing the same item number. following the match, all required data are inserted into the skeleton te record. codes are carried in the te record directing this program to perform certain formatting functions not possible in step p1. these functions include insertion of certain combinations of parentheses and brackets required by series notes, addition of a series note to certain call numbers, and the replacement of the author portion of an author-title combination series note with his:, her:, in his:, in her:, etc. none of the above could have been accomplished in a typographically acceptable manner in program p1. dual entry data are obtained from the indexed sequential file (d1).
the identification of such data is via the authority control number of the primary lc subject heading carried in the bibliographic record. this number is used to access file d1 for the required text and filing data.

pagination - program p8

prior to execution of this step a set of page initialization records is created for the particular type of catalog being produced. these records are prepared by a program not shown in the subsystem flow. initialization records govern the overall format of the book to be produced. there are six such initialization records, all of which must appear at the beginning of the input tape. they may also appear embedded anywhere among the te records in various combinations.

the first initialization record, known as a page dimension (pd) record, defines the physical dimensions of the page to be printed. parameters carried in this record also determine the dimensions of inner and outer page margins, head and foot margins (independently for recto and verso pages), number and width of columns, body size on which to set type, and spacing between entries. when an embedded pd record is encountered the program will terminate any page currently being formatted, begin a new page, and continue formatting in accordance with the redefined dimensions.

the second initialization record defines the starting page number, and indicates whether paging is to start with a recto or verso page. the pagination program may also be directed via this record to place a black square at the edge of a page, at a location defined by the record, to serve as a thumb index. this record may also appear anywhere else on the tape. when it does appear as an embedded record it commands the program to terminate the page being formatted at that point, to begin a new page, and possibly provide a number of blank pages. this allows volumes to be broken at predefined sort points.
in this manner we may separate alphabetic segments, the various volumes of a divided catalog, or cumulation and supplement volumes, and move the thumb index. subsequently four records define caption and legend text (independently for recto and verso pages). any one or combination of these records may also occur elsewhere on the tape. when they do occur as embedded records, the program terminates the page currently being formatted, alters the appropriate caption and/or legend text, and continues to format text. interfiling of these records with te records allows captions to be changed automatically between volumes of a divided catalog, or between supplement and cumulation volumes, or at any other desired sort point.

the six records described above control those aspects of page format which are common to a large class of entries. individual te records carry typographic commands which are specific to the entry, or to an element of the entry. a code carried by each te record (entry format code) defines typographical rules for the entry as a whole. this code is used to identify data to be used in the formation of dictionary and column headings when page breaks occur. certain widow rules affecting the entire entry are specified, e.g. entry may not span columns, entry may not form the last line of a column, etc. line advance commands, defining the amount of space (if any) to be left between entries, are carried in this code.

data elements within an entry may require different typographic rules. format codes for each such element are carried within a record directory. the directory also serves to identify the location and length of text data to be typeset in accordance with the typography specified by element format codes. element format codes consist of 32-bit fullwords. groups of bits within the word define separate typographic rules.
these bits may be set in any combination, defining a complete spectrum of typography. the major typographic parameters governed by these bit settings are:
1. starting indention ("continue on the previously used line" is included).
2. overflow indention to be used if the element must be continued onto another line.
3. space to be left on a line before adding any additional text to a previously used line.
4. justification-left, right, center of column, and center of page.
5. type size height.
6. type size width relative to height.
7. type face-bold or light.
8. type style-roman or italic.
9. element widow rules-restrictions which do not allow text to: span columns, form the first line of a column, span from a verso to a recto page, or span from a recto to a verso page.
10. line break-indicating whether lines may be broken at blanks only, or may be broken at blanks and certain special characters. line break decisions observe a hierarchy of rules, e.g. if the indicator is set to break at blanks only and no blanks are found within the entire line, the program automatically reverts to the second option (break at blanks and special characters); should that also fail, the line will be broken arbitrarily at the last character which fits on the line.
11. hyphenation indicator-due to the great number of foreign languages used in the nypl catalog no hyphenation routine is employed. allowance has been made, however, for the inclusion of a hyphenation module should it be desired in the future and an indicator provided in order to invoke it.
other rules of lesser importance exist, but space does not warrant their discussion.

the entire ala character set plus several additional characters specified by nypl may be typeset via this program on an iii videocomp. diacritics are floated onto the characters they accent. the coding structure adopted by nypl consists of two unique codes preceding a pair of characters to be overprinted.
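the line-break hierarchy of item 10 above can be illustrated with a minimal python sketch. it is not the nypl pagination program: the function names, the use of character counts in place of typeset widths, and the particular set of special break characters are all assumptions made for illustration, since the article does not list them.

```python
from typing import Optional, Tuple

# Hypothetical set of "special characters" at which a line may be broken;
# the article does not say which characters qualify.
SPECIAL_BREAK_CHARS = "-/,;:"


def _find_break(text: str, width: int, break_chars: str) -> Optional[Tuple[int, int]]:
    """Return (line_end, rest_start) for the rightmost legal break, or None.

    Assumes len(text) > width, so text[width] exists.
    """
    for i in range(width, 0, -1):
        ch = text[i]
        if ch == " ":
            # A blank is dropped at the break, so it may sit just past the width.
            return i, i + 1
        if ch in break_chars and i < width:
            # A special character stays at the end of the line it closes.
            return i + 1, i + 1
    return None


def break_line(text: str, width: int, blanks_only: bool = True) -> Tuple[str, str]:
    """Split text into (line, remainder) so that line fits within width."""
    if len(text) <= width:
        return text, ""
    # Hierarchy of rules: blanks only, then blanks plus special characters.
    rule_sets = [" ", " " + SPECIAL_BREAK_CHARS]
    if not blanks_only:
        rule_sets = rule_sets[1:]
    for break_chars in rule_sets:
        found = _find_break(text, width, break_chars)
        if found is not None:
            line_end, rest_start = found
            return text[:line_end], text[rest_start:]
    # Last resort: break arbitrarily at the last character that fits.
    return text[:width], text[width:]
```

with a width of 8, the string author-title contains no blank, so the sketch falls through to the second rule and breaks after the hyphen; a string with neither blanks nor special characters is simply cut at the width.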
the first code indicates to all processing programs that the data to follow must be interpreted in a unique manner. the second defines the unique treatment to be accorded. we currently employ only two such function codes; both imply a form of overprint. coding in this manner allows unlimited expansion of the character set. a function code has been assigned but not yet utilized for overprinting of triplets. this would be necessary in handling doubly accented characters, such as are found in vietnamese. function codes have been assigned defining escapes to nonroman alphabets.

the character set includes two blanks in addition to the normal word space. one of these will provide a word space on printed output but will fail line break tests. such a character is of great utility as a separator in abbreviations and as a word space preceding such terminal characters as a close parenthesis. conversion of the nypl data base to utilize this super blank will be effected following definition of sufficiently reliable rules for its automatic generation at input. the second blank is a zero set width character. this character, when present in a machine record, is assigned a null width by the phototypesetting device. its utility lies in areas in which it is required to remove only one or two characters from a record, but it is not desired to expend the programming or processing time in restructuring the record.

all of the input text data and format codes are translated into commands to an iii videocomp 830 and written onto a driver tape. the driver tape is then delivered to a photocomposition vendor who mounts it on a videocomp to produce camera ready copy for catalog pages. the camera ready copy is then delivered to a printer who produces multilith plates, and thence, pages which are bound into monthly supplements and cumulation segments.

conclusion

photocomposed book catalogs have been in use at nypl since january 1971.
the effectiveness of the system can, perhaps, best be judged by the only adverse reaction received thus far: in the case of material which must pass through the bindery after cataloging, entries appear in the catalog before the materials reach the shelves, thereby causing annoyance to users. judged by more serious criteria, the system has been proven to be an operational success. the processing budget for the research libraries is now insignificantly higher than it was under the manual system, but cataloging volumes have increased dramatically: 7,500 titles/mo. cataloged vs. 5,500 titles/mo. under the old manual system. the increase in productivity cannot be solely attributed to the automated system. some of it is attributable to the revision, by the head of preparation services, of manual procedures.

expansion of book catalog coverage

the entire bibliographic system is currently in the final stages of revision for production of a multimedia catalog of the dance collection of the research library of the performing arts.20 the organization of citations referring to material in diverse media will be accomplished by providing separate sequences under appropriate headings, denoting: works by, works about, visual works, music, audio materials. listed under each of these headings will be the following types of materials:
1. works by-written works by an author.
2. works about-written works about an author, performer, etc. (the subheading is not used under topical subjects.)
3. visual works-photographs (original and indexed), prints and original designs, motion pictures and videotapes, filmstrips and slides.
4. music-music scores.
5. audio materials-phono records and phonotape.
these headings are not as specific as those suggested by riddle, et al.; however, they do provide the early warning function discussed by virginia taylor.21, 22 this catalog is due for publication in early 1974.
pending the success of this venture, a study will be made of the means of extending the scope of the research libraries' catalog to include nonbook materials.

in late fall 1973, an extremely exciting and bold step will be taken by the jewish division of the research libraries. they will begin data input of material in hebrew, using the recently defined ansi correspondence scheme for hebrew characters.23 within this scheme roman and special keyboard characters have been assigned to each character of the hebrew alphabet. book catalog display of hebrew text will utilize these characters in a left to right print mode until such time as development money is found for the digitization of hebrew character fonts, and for modifications to the pagination program in order to display mixed roman and hebrew text. all hebrew entries will be filed in accordance with conventions for sequencing hebrew text. the hebrew entries will be interfiled with entries in romanized forms by conceptually assuming the sequencing alphabet to contain 57 characters: blank, a, b, ..., z, 0, 1, ..., 9, א, ב, ..., ת. if we have an author who has written several titles in roman alphabet languages, and others in hebrew, we would create a sequence of main entries under his name interfiled according to the alphabetic sequence shown above. all hebrew or variant title added entries would be found in a sequence starting at the end of the roman alphabet. the primary reasons for adopting such a scheme as opposed to the more traditional romanization are:
1. a nationally endorsed correspondence schedule has been provided by ansi.
2. it is desired to enter this data into the automated system and end the manual operation at the earliest possible time.
3. it is desired not to have to revise all cataloging when true hebrew text may be economically displayed. it is virtually impossible to recover the true form of nonroman text from its romanized form.
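the interfiling rule described above (blank, then the roman letters, then the digits, then the hebrew letters) can be sketched as a sort key in python. this is an illustration only, not the nypl filing program: the names filing_key and interfile are hypothetical, and the sketch assumes unicode hebrew letters (u+05d0 through u+05ea) in place of the roman keyboard stand-ins of the ansi scheme. that unicode range includes final forms, so the sequence here does not have the 57 characters the article counts. characters outside the sequence are simply ignored, which is a further simplification.

```python
import string
from typing import Iterable, List

# Sequencing alphabet in the order the article gives: blank, a-z, 0-9, Hebrew.
# Assumption: Unicode Hebrew letters U+05D0..U+05EA stand in for the Hebrew
# portion of the sequence (this range includes final letter forms).
HEBREW_LETTERS = "".join(chr(cp) for cp in range(0x05D0, 0x05EB))
SEQUENCE = " " + string.ascii_lowercase + string.digits + HEBREW_LETTERS
RANK = {ch: position for position, ch in enumerate(SEQUENCE)}


def filing_key(entry: str) -> List[int]:
    """Map an entry to the positions of its characters in the sequence.

    Characters not in the sequencing alphabet (punctuation, etc.) are
    dropped; this is a simplification, not a documented filing rule.
    """
    return [RANK[ch] for ch in entry.lower() if ch in RANK]


def interfile(entries: Iterable[str]) -> List[str]:
    """Sort entries so that Hebrew text files after roman letters and digits."""
    return sorted(entries, key=filing_key)
```

under this key, interfile(["zohar", "שלום", "amos", "1984"]) places the two roman-alphabet titles first, then the title beginning with a digit, then the hebrew title, matching the statement that hebrew entries fall in a sequence starting at the end of the roman alphabet.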
these two areas, nonroman alphabet display and inclusion of nonbook materials, represent the only areas in which further development of the book catalog system is planned. future efforts will be directed to conversion of the batch-oriented processing system to one with on-line file maintenance capability. it should be stressed again that the primary aim of the bibliographic system is not production of book catalogs. the system was designed to create a highly controlled data base which could be used in conjunction with whatever display medium is technologically and economically feasible. online access to the catalog will require extreme control of the data, as automated retrieval techniques require very precise definition of access points. the problems of data organization become greatly magnified when crt display devices are used, as the visual scan range produced is severely limited. the extensive development effort to produce book catalogs was undertaken at nypl since it was felt that for at least the next decade book catalogs in printed or microform would provide the only economically viable form of access to the collection. book catalogs will, no doubt, also serve as backup forms of display for a considerable time after introduction of electronic access techniques.

references

1. seoud makram matta, the card catalog in a large research library: present conditions and future possibilities in the new york public library, submitted in partial fulfillment of the requirements for the degree of doctor of library science (new york: columbia university, school of library service, 1965).
2. i. a. warheit, "automation of libraries-some economic considerations," presented to: canadian association of information science, ottawa, ontario, canada, 27 may 1971.
3. james w. henderson and joseph a. rosenthal, eds., library catalogs: their preservation and maintenance by photographic and automated techniques (mit report no. 14) (cambridge, mass.: mit press, 1968).
4. margaret c. brown, "a book catalog at work (free library of philadelphia)," library resources and technical services 8:349-58 (fall 1964).
5. richard de gennaro, "harvard university's widener library shelflist conversion and publication program," college & research libraries 31:318-33 (september 1970).
6. richard d. johnson, "a book catalog at stanford," journal of library automation 1:13-50 (march 1968).
7. paula kieffer, "the baltimore county public library book catalog," library resources and technical services 10:133-41 (spring 1966).
8. hilda feinberg, "sample book catalogs and their characteristics." in: book catalogs by maurice f. tauber and hilda feinberg. (metuchen, n.j.: the scarecrow press, 1971) p.381-511.
9. paul j. fasana and heike kordish, the columbia university libraries integrated technical services system. part ii: acquisitions. (a) introduction. (new york: columbia university libraries systems office, 1970). 62 p.
10. gerry d. guthrie, "an on-line remote access and circulation system." in: american society for information science. annual meeting. 34th, denver, colorado, 7-11 november 1971. proceedings 8:3059, communications for decision-makers. (greenwood publishing corp.: westport, connecticut, 1971).
11. ralph m. shoffner, "some implications of automatic recognition of bibliographic elements," journal of the american society for information science 22:275-82 (july/august 1971).
12. frederick g. kilgour, "initial design for the ohio college library center: a case history." in: clinic on library applications of data processing, 1968. proceedings (urbana: university of illinois, graduate school of library science, 1969), p. 54-78.
13. maurice f. tauber and hilda s. feinberg, book catalogs (metuchen, n.j.: the scarecrow press, 1971).
14. catherine o. macquarrie, "library catalogs: a comparison," hawaii library association journal 21:18-24 (august 1965).
15. irwin h. pizer, "book catalogs versus card catalogs," medical library association bulletin 53:225-38 (april 1965).
16. kieffer, "the baltimore county public library," p.133-41.
17. james a. rizzolo, "the nypl book catalog system: general systems flow," the larc reports 3:87-103 (fall 1970).
18. edward duncan, "computer filing at the new york public library," the larc reports 3:66-72 (fall 1970).
19. s. michael malinconico, "optimization of publication schedules for an automated book catalog," the larc reports 3:81-85 (fall 1970).
20. dorothy lourdou, "the dance collection automated book catalog," the larc reports 3:17-38 (fall 1970).
21. jean riddle, shirley lewis, and janet macdonald, non-book materials: the organization of integrated collections. prelim. ed. (ottawa, ont.: canadian library association, 1970).
22. virginia taylor, "media designators," library resources and technical services 1:60-65 (winter 1973).
23. edward a. goldman, et al., "transliteration and a 'computer-compatible' semitic alphabet," hebrew union college annual 42:251-78 (1971).

article
hackathons and libraries
the evolving landscape 2014–2020
meris mandernach longmeier
information technology and libraries | december 2021
https://doi.org/10.6017/ital.v40i4.13389
meris mandernach longmeier (longmeier.10@osu.edu) is head of research services, the ohio state university libraries. © 2021.

abstract

libraries foster a thriving campus culture and function as a “third space,” not directly tied to a discipline.1 libraries support both formal and informal learning, have multipurpose spaces, and serve as a connection point for their communities. for these reasons, they are an ideal location for events, such as hackathons, that align with library priorities of outreach, data and information literacy, and engagement focused on social good.
hackathon planners could find likely partners in either academic or public libraries as their physical spaces accommodate public outreach events and many are already providing similar services, such as makerspaces. libraries can act solely as a host for events or they can embed in the planning process by building community partnerships, developing themes for the event, or harnessing the expertise already present in the library staff. this article, focusing on years from 2014 to 2020, will highlight the history and evolution of hackathons in libraries as outreach events and as a focus for using library materials, data, workflows, and content.

introduction

as a means of introduction to hackathons for those unfamiliar with these events, the following definition was developed after reviewing the literature. hackathons are time-bound events where participants gather to build technology projects, learn from each other and experts, and create innovative solutions that are often judged for prizes. while hacking can have negative connotations when it comes to security vulnerabilities, typically for hackathon events hacking refers to modifying original lines of code or devices with the intent of creating a workable prototype or product. events may have a specific theme (use of a particular dataset or project based on a designated platform) or may be open-ended with challenges focused on innovation or social good.

while hackathons have been a staple in software and hardware design for decades, the first hackathons with a library focus were sponsored by vendors, focused on topics such as accessibility and adaptive technology for their content and platforms.2 other industry hackathons focused on re-envisioning the role of the book in 2013 and 2014.3 as hackathons became more popular at colleges and universities, library participation evolved from content provider to event host.
these partnerships were beneficial to libraries interested in shifting the perception of libraries from books to newer areas of expertise around data and information literacy. however, many libraries realized that by partnering in planning the events greater possibilities existed to educate participants about library content and staff expertise. some examples include working with public library communities to highlight text as data, having academic subject librarians work with departmental faculty to embed events within curriculum and assignments, and having both academic and public libraries promote library-produced and publicly available datasets.4

there are many roles that libraries can take in these events. libraries can act as event hosts where they provide the space at a cost or for free.5 in other cases, library staff become collaborators and in addition to space may assist with planning logistics, judging, building partnerships, and have some staff present at the events.6 in public libraries this often includes building relationships with the city or specific segments of the community based on the theme of the event. on college campuses, it may be a partnership with a specific discipline or campus it or an outside sponsor. in this way, the libraries are building and sustaining the event due to aligned priorities with the other partners. another option would be for the library to be the primary sponsor, where the library may provide prizes, the theme for the hackathon, as well as many of the items listed above.7 however, instead of specific categories, it should be viewed as a continuum of partnership and the amount of involvement with the event should align with the library’s priorities of what it hopes to accomplish through the event.
how involved in event planning specific libraries want to be may depend on the depth of the existing partnerships as well as how many resources the library wants to commit to the event. libraries have always existed as curators and distributors of knowledge. some libraries are using hackathons to advance both their image and their practices. libraries are evolving into new roles and have grown to support more creative endeavors, such as the maker movement. this shift of libraries from book-provider to social facilitator and information co-creator aligns with hackathon events. the physical spaces themselves are ideal to support public outreach events and libraries are already providing makerspaces or similar services that would overlap with a hackathon audience.8 additionally, the spaces afforded by libraries allow flexibility and creativity to flourish, ideas to be exchanged, and different disciplines to mingle and co-produce. library staff focused on software development may have projects that would benefit from outside perspectives as well. in recent years libraries have become stewards of digital collections that can be used and reused in innovative ways. many libraries have chosen wikipedia edit-a-thons as a means of engaging with the public and enhancing access to materials.9 similarly, the collections-as-data movement is blossoming and allowing galleries, libraries, archives, and museum (glam) institutions to rethink the possible ways of interacting with collections. 
many public libraries are partnering with local or regional governments to build awareness of data sources and build bridges with the community around how they would like to interact with the data.10 additionally, as data science continues to grow in importance in both public and academic libraries, data fluency, data cleaning, and data visualization could be themes for a hackathon or data-thon.11 for those unfamiliar with these events, table 1 provides some generalized definitions created by the author of the different types of events and their intended purpose. for some organizations, there are ways to support these events that consume fewer resources or require less technical knowledge, such as edit-a-thons or code jams.

table 1. defining common hackathon and hackathon-like events, purpose, and typical size of events

type of event | definition | purpose | size of event
hackathon | a team-based sprint-like event focused on hardware or software that brings together programmers, graphic designers, interface designers, project managers or domain experts; can be open ended idea generation or for a specific provided theme | build a working prototype, typically software | up to 1,000 participants, usually determined by space available
idea fest | business pitch competition where individuals or teams pitch a solution or new company (startup) idea to a panel of judges | deliver an elevator pitch for an idea, could be to secure funding | <100
coding contest or code jam | an individual or team competition to work through algorithmic puzzles or on specific code provided | learning to code or solve challenges through coding; may produce a pitch at the end rather than a product | 20–50
edit-a-thon | an event where users improve content in a specific online community; can focus on a theme (art, country, city) or type of material (map) | improving information in online communities such as wikipedia, openstreetmap, or localwiki | 20–100
datathon | a data-science–focused event where participants are given a dataset and a limited amount of time to test, build, and explore solutions | usually a visualization or software development around a particular dataset | 50–100
makeathon | hardware focused hackathon | build working prototype of hardware | up to 300 participants

methods

to find articles in the library and information science literature related to hackathons and libraries, the author searched the association for computing machinery (acm) digital library, scopus, library literature and information science, and library and information science and technology abstracts (lista) databases. in scopus and the acm digital library, the most successful searches included the following: (hackathon* or makeathon*) and library; in library literature and information science and lista databases, the most successful searches included: hackathon or makeathon or “coding contest.” the author also searched google scholar in an attempt to locate other studies or reports, some of which came from institutional repositories. while this search strategy was not meant to be exhaustive, it uncovered many articles about hackathons and libraries and others were found by chaining citations in the articles’ reference lists. based on search locales, international articles were found but only those where the text was available in english were included, which meant that articles from asia, africa, and the global south may have been inadvertently overlooked. only two of the articles found in the search results were not held in library locations, did not use library/archival materials, or were not an outreach event where library staff were integral in planning; these were discarded.

findings

the author grouped the literature into two categories: library as place and library as source.
in the realm of library as place, the literature consisted of reporting on hackathons where the library was the host location for the event, those where the hackathon was an outreach event, and those where the hackathon was an extension of the libraries’ teaching/education mission. most of these articles were case studies and often shared tips for other libraries to consider when hosting a hackathon in library spaces. the second category, those that use library as source, focused on highlighting library spaces or services, workflows, or collections as the theme of the events. additionally, there were a few articles in the second category that discussed how to prepare or clean library data or library sources before the event to ensure that participants were able to use the materials during the time-bound event. in some cases where the source materials were from the libraries, the event also occurred in the library; thus, some articles fit into both categories and are highlighted in both sections.

results: library as place

the following summaries of hackathons and libraries as places for events will be grouped into two subgenres: library spaces and outreach events. libraries, both public and academic, are ideal locations for hosting large, technology-driven events given the usual amenities of ample parking, ubiquitous wi-fi, adequate outlets, and at times already having 24-hour spaces built into their infrastructure. more and more libraries are offering generous food and drink policies, a benefit as sustenance is a mainstay at these multiday events. additionally, libraries already host a number of outreach events and serve as a community information hub.
using libraries as event hosts for hackathons

a number of articles detail the use of library spaces to host hackathon events.12 the university of michigan library, a local hackerspace (all hands active), and the ann arbor district library teamed up to host a hackathon focused on oculus rift.13 this event grew out of a larger partnership with the community and sought to mix teams to include participants from all three areas. the 2018 article by demeter et al. highlights lessons learned from florida state university library and many of the planning steps involved when hosting large outreach events in library spaces.14 while the library initially hosted a 36-hour event, hackfsu, as a favor to the provost in the first year, they continue to host the event, providing library staff as mentors and logistical support. after the first year they started charging the student organization for use of the space and direct staffing costs for the hours beyond normal operating hours. while focused primarily on providing a central campus space, the library also sees it as a way to highlight the teaching and learning role of the library. similarly, nandi and mandernach detail the steps involved in planning hackathon events and some benefits of choosing the library as a location for the event.15 at ohio state, hackathon events in 2014 and 2015 were held in the library due to twenty-four-hour spaces, interest by the libraries in supporting innovative endeavors on campus, and a participant size (100–200 attendees) that could be accommodated in the space. other events chose academic libraries as locations for hackathons due to their central location on campus.16 an initial summary of library hackathons was captured by r. c.
davis, who detailed that libraries may be motivated to host such events as they align with library principles of “community, innovation, and outreach.”17 she points out that libraries are ideal locations because of small modular workspaces paired with a large space for final presentations. additionally, adequate and sufficiently strong wi-fi or hardwired connections, a multitude of power outlets, and 24-hour spaces are appealing for these kinds of events. event planners should know that the necessities include free food and multidisciplinary involvement. davis details ways to plan smaller events, such as code days or edit-a-thons, if staffing does not allow for a large hackathon event. in all cases, the libraries serve a purpose to either campus or community as the location and sometimes also provide staff for the events.

hackathons as library outreach

hackathon events are a great way to reach out to the community and provide a fresh look into libraries as purveyors of information focused on more than books. at the 2014 computers in libraries conference, chief library officer mary lee kennedy delivered a keynote sharing stories of the new york public library’s experiences hosting wikipedia edit-a-thons and other hackathons at various branches since 2014.18 the goals for these outreach events were to highlight strategic priorities around making knowledge accessible, re-examine the library purpose, and spark connections.
early library hackathon events focused on outreach included topics of accessibility or designing library mobile apps.19 more recent events have focused on outreach but with an eye toward sharing content as part of the coding contest.20 even library associations have hosted preconference hacking events to highlight what libraries are doing to foster innovation.21 the future libraries product forge, a four-day event, was hosted in collaboration with the scottish library and information council and delivered by product forge, a company focused on running hackathons that tackle challenging social issues. the 2016 event focused specifically on public libraries in scotland and seven teams, comprised mainly of students from a local university, worked with public library staff and users as well as regional experts in technology, design, and business.22 the goals of the event were to raise awareness of digital innovation with library services, generate enthusiasm for approaches to digital service design, and codesign new services around digital ventures. participants created novel products including digital signage, a game for young readers, a tool for collecting user stories about library services, and an app to reserve specific library spaces. another common focus for library hackathon outreach events is the theme of data and data literacy. in july 2016, the los angeles public library hosted the civic information lab’s immigration hackathon.23 this outreach event gathered 100 participants to address local issues around immigration. 
the library, motivated by establishing itself as a “welcoming, trusting environment,” wanted to be a “prominent destination of immigrant and life-enrichment information and programs and services.”24 newcastle libraries ran two day-long events focused on promoting data they released under an open license as part of the commons are forever project.25 they used both events to educate users about tools such as github, a gif-making session with historical photographs, and data visualization tools. similarly, toronto public library hosted a series of open data hackathons to highlight the role of the libraries in civic issue discourse, data literacy, and data education.26 their events combined the hackathon with other panel presentations and resources focused on mentorship and connection-building in the technology sector. the library also used the event to promote their open data policy, build awareness around the data provided by the library for the community, and highlight their role in facilitating conversations around civic issues through data literacy and data education.

edmonton public library hosted its first hackathon in 2014 for international open data day. one of the main drivers was to build the relationship with their local government.27 they built their event around the tenets laid out in the open data hackathon how-to guide and by a blog post about the city of vancouver’s 2013 international open data day hackathon.28 they took a structured approach to documenting expectations of both partners around areas such as resources, staffing, and costs, which served as a roadmap for the hackathon and the partnership. the library provided the event space, coffee and pizza, an emcee, tech help and wi-fi, door prizes and “best idea” prize, and promotional material. the city recruited participants and provided an orientation, promotional banners, and a keynote.
the event led to a deeper partnership with the city and additional hacking events. in these ways, the hackathon served a greater purpose of community building and awareness around data, the role the library plays in interpreting data, and how the libraries serve as a resource hub to the community.

events supporting library teaching mission

at academic institutions, the events often focus on outreach to their own campus community. in 2015, adelphi university hosted their first hackathon and the libraries funded the event themselves rather than seeking outside funding.29 the article details the considerable lessons learned through the process as well as a step-by-step guide to planning a smaller event. similarly, york university science and engineering library hosted hackfests in the library and embedded an event as part of an introductory computer science course.30 shujah highlighted that the benefits to the library of hosting a hackathon included establishing libraries as part of the research landscape, providing a constructive space for innovation and an innately collaborative environment, highlighting the commitment to openness and democratizing knowledge, and acknowledging the library’s role in boosting critical thinking and information literacy concepts. shin, vela, and evans highlight a community hackathon at washington state university college of medicine where a group of librarians from multiple institutions staffed a research station throughout the event.31 while the station was underutilized by participants, as only seven questions were asked during the event, the libraries deemed their participation a success as it worked as an outreach and promotion mechanism for both library services and expertise. at some public libraries, the focus of the hackathon is on education and teaching basic coding skills.
whether called a coding contest, hackathon, or tech sandbox, there are opportunities for programming with a focus on learning, skill-building, and fun.32 santa clara county library district used a peer-to-peer approach for mentoring and hosted a hackathon in 2015 for middle and high-school students.33 the library staff facilitated the event planning and recruited judges from the community, but the bulk of the event was coordinated by the students.

considerations when hosting events in library spaces

a couple of substantive reports provide overarching recommendations and considerations for hosting hackathons in library spaces, including planning checklists, tips on getting funding, building partnerships with local community officials, and thinking through the event systematically. recently, the digital public library of america (dpla) created a hackathon planning guide that details a number of logistical issues to address during the planning phases, both pre- and post-event.34 this report highlights specific considerations for galleries, libraries, archives and museums that are looking to host a hackathon. after hosting a successful hackathon, librarians at new york university created a libguide called hack your library which is a planning guide for other libraries considering hosting a similar event.35 the engage respond innovate final report: the value of hackathons in public libraries was put together following an event the carnegie uk trust sponsored.36 this guide highlights some of the challenges present with hackathons, including: intellectual property of the creations, prizes, participant diversity, and complications that arise from either approach of using specific themes or open-ended challenges.
it also highlights some of the main reasons a library should consider hackathons and other coding events, including ways to promote new roles of libraries within communities, promote specific collections, capitalize on community expertise, gain insight about users, help users build new skills and improve digital literacy, and develop tools that increase access to materials. finally, the report points out that hosting an event will not be the only solution for a library’s innovation problem. yet if the library is clear on why it wants to hold a hackathon, planning deliberately around the expectations and outcomes it is trying to achieve will increase the chances for success.

results: library as source

the other category of articles about hackathons and libraries focuses on the library as the source for the challenge or theme of the hackathon. the following summaries highlight articles in which the libraries provided challenges around library spaces or services, library datasets, workflows, or collections as the theme for the hackathon. this section also details steps involved in cleaning data for use/re-use in time-bound events.

using hackathons to improve library services and spaces

a few articles discuss libraries that proposed hackathon themes around improving library services. a 2016 article describes how adelphi university libraries hosted a hackathon and provided the theme of developing library mobile apps and web software applications.37 the winning student team created an app for library group study meetups. similarly, the librarians from university of illinois tried three approaches for library app development: a student competition, a project in a computer science course, and a coding camp. with the adventure code camp, students co-designed with librarians over the course of two days.38 they advertised to specific departments and courses and ten students were selected with six ultimately participating in the two-day coding camp.
students were sent a package of library data, available apis, and brief tutorials on coding languages that may be useful. mentors and coaches were available throughout the coding camp. the authors provided tips for others trying to replicate their approach as well as insights from the students about interest in developing apps that include library data but that don’t solely focus on library services. the following year the librarians hosted a coding contest focused specifically on app development related to library services and spaces.39 the library sponsored the event and served as both a traditional client and partner in the design process. ultimately six teams with a total of 26 individuals participated and each app was “required to address student needs for discovery of and access to information about library services, collections, and/or facilities” but not duplicate existing library mobile apps. they based their approach on massachusetts institute of technology’s entrepreneurship competition. through this process, co-ownership was preferred and many teams set up a licensing agreement as part of the competition to handle intellectual property for the software. students had two weeks to complete the apps and were judged by both library and campus it administration. this article details what they learned through the process given the amount of attrition from selection of teams to final product presentations. the new york university school of engineering worked with the libraries and used a hackathon theme of noise issues to coincide with the renovation of the library.40 the libraries created a libguide to provide structured information about the event itself (https://guides.nyu.edu/hackdibner). they used the event to market the new maker space and held workshops there leading up to the event.
in the inaugural year they held the event over the course of two semesters and saw a lot of attrition due to the event length. in the second year, following focus groups with participants, they designed a library hackathon with four goals: 1) appeal to a large base of the student population, 2) create a triangle of engagement between the student and the library, the library and the faculty, and the faculty and the students, 3) provide an adaptable model to other libraries, and 4) highlight the development of student information literacy skills.41 the second year’s approach required more work by the participants due to pitching an initial concept, providing a written proposal, and giving a final presentation. library staff and guest speakers offered workshops to help students hone their skills. the planners evaluated the event through surveys and student focus groups. overall the students applied what they learned about information literacy and were highly engaged with the codesign approach to library service improvements. similarly, mcgowan highlights two hackathons at purdue that focused on inclusive healthcare and how the libraries applied design thinking processes as part of the events.42 the librarian wanted to encourage health sciences students to examine health data challenges. to examine this issue, she applied the blended librarians adapted addie model (blaam) as a guide to developing a service to prepare students to participate in a hackathon. a number of pre-event training sessions were held in the libraries and covered topics such as research data management, openrefine and data cleaning, gephi for data visualization, and javascript. while this initial approach was in tandem with the hackathon events, students reported that they needed assistance in finding and cleaning datasets for use. in this case, developing library services to prepare for hackathon events ended up out of alignment with both the library’s mission and the participants’ expectations. 
using library materials for hackathon themes

several events have focused on library as source where the library’s materials or processes serve as the theme of the hackathon, particularly around digital humanities (dh) topics.43 in september 2016, over 100 participants worked with materials from the special collections of hamburg state and university library, a space that serves both the university and the public.44 it followed the process established by coding da vinci (https://codingdavinci.de/en), an event that occurred in 2014 and 2015. the event at hamburg state and university library had a kick-off day for sharing available datasets, brainstorming projects using library materials, and team building opportunities. the event had a second day of programming and then teams had six weeks to complete their projects. some exemplary products included a sticker printer that would print old photographs, a quiz app based on engraving plates, and using a social media platform to bring the engravings to the public. the event was successful and resulted in opening additional data from the institution. several examples focus on highlighting digital humanities approaches as part of the events. in 2016, four teams from across european institutions participated over five days in kibbutz lotan in the arava region of israel to develop linguistic tools for tibetan buddhist studies with the goal of revealing their collections to the public.45 the planning team recruited international scholars to participate in prestructured teams (teams consisted of computer scientists as well as a tibetan scholar) in israel. although it was less of a traditional hackathon, this event being more akin to an event/coding contest around a specific task, it highlighted tools and methods for understanding literary texts.
the format of the event for encouraging interdisciplinary efforts in the computational humanities was deemed successful and it was repeated the next year on manuscripts and computer-vision approaches. recently the university of waterloo detailed a series of datathons using archives unleashed to engage the community in an open-source digital humanities project.46 the goal of the events was to engage dh practitioners with the web archive analysis tools and attempt to build a web archiving analysis community. in 2016, the american museum of natural history in new york hosted their third annual hackathon event, hack the stacks, with more than 100 participants.47 the event focused on creating innovative solutions for libraries or archives and aimed to “animate, organize, and enable greater access to the increasing body of digitized content.” ten tasks were available for participants to work on, ranging from building a unified search interface to reassembling fragments of scientific notebooks and creating timelines of archival photos of the museum itself. in addition to planning the tasks, the library staff ensured that the databases and applications could handle the additional traffic. a multitude of platforms were provided (omeka, dspace, the catalog, apis, archivesspace, etc.) for hackers to use. all prototypes that were developed were open source and deposited on github at “hack the stacks.”48 some cultural institutions have used hackathons as a means of outreach and publicity and then have showcased the outputs at the museums. vhacks, a hackathon at the vatican, was held in 2018 and gathered 24 teams from 30 countries for a 36-hour event.49 the three themes for the event focused on social inclusion, interfaith dialogue, and migrants and refugees. a winner was announced for each thematic area and sponsors enticed participants to continue working on projects by having a venture capitalist pitch a few weeks after the event.
another program, museomix, concentrates on a three-day rapid prototyping event where outputs are highlighted in the museum or cultural institution.50 this event has happened annually in november since 2011 and the goal is to create interdisciplinary networks and encourage innovation and community partnership.

improving library workflows and processes

other hackathons have focused on library staff working on library processes themselves. bergland, davis, and traill detail a two-day event, catdoc hack doc, hosted by the university of minnesota data management and access department focused on increasing documentation by library staff.51 this article details logistics of preparing for the event as well as a summary of the work completed. they based their approach on the islandora collaboration group’s template on how to run a hack/doc.52 they were pleased with the workflow overall, refined some of the steps, and held it again for library staff the following year. similarly, dunsire highlights using a hackathon format to encourage adoption of a cataloging approach of resource description and access (rda) through a “jane-athon.”53 events occurred at library conferences or in conjunction with other hackathon events, such as the thing-athon at harvard, with the intention of promoting the use of rda, helping users understand the utility of rda, and sparking discussions. this approach proved useful in uncovering some limitations with rda as well as valuable feedback that could be incorporated into its ongoing development.

considerations when using libraries as source

if libraries are interested in hosting a hackathon where the library plays a more central role, there are several options of ready-to-use library and museum data that could allow the host to also serve as the content provider.
the digital public library of america released a hackathon guide, glam hack-in-a-box: a short guide for helping you organize a glam hackathon, with several sources at the end for finding data related to libraries.54 the university of glasgow began a project called the global history hackathons that seeks to improve access and excitement around global history research.55 additionally, candela et al. detail the new array of sources for sharing glam data for reuse in multiple ways, including using data in hackathon projects.56 planners could look to the collections-as-data conversations for other data sources that could be adapted for hackathon projects.57 when thinking about hackathons and cultural institutions, sustainability of projects and choice of platforms are important considerations for planners.58 ultimately, the top priority when providing a dataset is to ensure that it is clean and enough details about the dataset are available for participants to make use of it in their designs given the time constraints of most events.

discussion

hackathons often have a dual purpose of educating the participants and serving as an advertisement for the sponsor for either a platform or content. participants will develop a working prototype or improve their coding abilities; sponsors, including libraries, can benefit from rapid prototyping and idea generation using either their platforms or content. while usable apps or new ideas are a welcome outcome, even if the applications are not used, the events still feed into the larger goal of marketing libraries and their data, building relationships with local communities, or drawing attention to social good. there are benefits to libraries in either hosting or collaborating on the events. in both areas, those of library as space and library as source, hackathons help realign user expectations of libraries.
if libraries choose to become involved with hackathons or other coding or data contests, the library should be deliberate in its goals and intended outcomes as those will help shape both the event and its planning. libraries are naturally aligned with teaching and learning, are already offering co-curricular programming, and typically serve as physical and communication hubs for campus. libraries already prioritize outreach and engagement with constituents both on campuses and in the community. therefore, when programs align with library priorities of data literacy, data fluency, and information evaluation, it is a natural fit to propose involvement in hosting hackathons. many libraries are able to customize their spaces, services, and vendor interfaces, which is a benefit when thinking about having libraries as a theme for an event. other benefits exist for the hackathon event planners when partnering with a library. hackathon planners should consider reaching out to libraries as they already serve as cross-disciplinary event spaces, host many other outreach events, and are often connected to other campus and community stakeholders and communication outlets. since students from all disciplines and colleges already use the library spaces on college campuses, they are an ideal location for fostering collaborations from different colleges and majors. public libraries function as community gathering spots as well. as libraries consider hosting events, several articles provide overarching tips for planning and hosting hackathons and other time-bound events.59 table 2 provides an overview of articles and the areas of coverage for planning topics.

table 2. selected articles for tips on planning hackathon events based on common article theme areas

article author | location details | sample agenda + timelines | power and computing | mentors/judging | further readings
carruthers (2014) x x x x
nelson & kashyap (2014) x x x x x
jansen-dings, dijk, van westen (2017) x x x
bogdanov & isaac-menard (2016) x x x
nandi & mandernach (2016) x x x
grant (2017) x x x x x

as library data becomes more open and reusable, hackathons will be a way to highlight data availability, promote its use and reuse, and reach out to the community. the issues present when considering library collections as potential hackathon themes are that libraries will need to ensure the data are cleaned and contain sufficient metadata so that the data are ready to use. additionally, if there are programming language restrictions for ongoing maintenance by the library after the event, those should be specified when advertising the event. ultimately, the libraries will likely not control the intellectual property (ip) of the tool or visualization developed, but several libraries have specified the ultimate ip as part of the event details either as open source or co-owned.60 often the goal of the event is the promotion of specific materials or building awareness of a collection rather than any byproduct created during the event. however, it is important for the library to be clear about their intent when advertising to participants. the collections-as-data movement will continue to evolve and there will be a multitude of library resources that could be mined for use at hackathons or other similar events. while libraries provide an ideal location and have access to data that can be used for an event, they can also leverage their wealth of experts. library staff can serve as judges, mentors, and connectors to the wider campus or community.
events could highlight specific expertise when hackathons focus on particular approaches (data visualization), processes (metadata management or documentation), or codesign of services (physical spaces). table 3 provides examples of hackathon events from a variety of library contexts. hackathons are a great way for libraries to serve as a connector to others on campus or in their communities. if libraries are not interested or able to host an event themselves, library staff can act as mentors or event judges. at smaller schools, library staff can partner with other campus units to plan a hackathon; similarly, smaller public libraries could work with community organizations to host events. at a smaller scale if staffing is a concern or full hackathons are unrealistic, a coding contest or datathon, both of which typically have a shorter duration, might be an option. edit-a-thons are even easier to host as they require only an introduction to the editing process, ample computer space (or laptop hook-ups), and a small food budget. some edit-a-thon events happen in a single afternoon.

table 3. selected hackathon event summaries from various library contexts based on themes and products of the event

article author | type of library | size of event | time for event | purpose of event | role of the library | output
carruthers (2014) | public + city | 29 participants | 1 day | highlight open data from the libraries | event space, coffee + pizza, emcee, some prizes, assessment | building partnerships with the city, getting dataset requests
ward, hahn, mestre (2015) | academic | 6 teams; 25 participants | 2 weeks | develop apps using library data | event sponsor, mentor | app development for library using library data
mititelu & grosu (2016) | academic | 100 participants | 48 hours | bring together tech students | event space | app development for sponsors
nandi & mandernach (2016) | academic | 200 participants | 36 hours | bring together tech students | event space, planning logistics, judges | various apps, not library related
baione (2017) | private museum | 100 participants | 2 days | animate, organize, and enable greater access to digitized content from the library | create challenges, event space, judges | open source apps for glam institutions
theise (2017) | academic + public | 100 participants | 2 days + 6 week sprint | cultural hackathon to highlight library data and resources | event space, challenges, datasets for hacking | highlighted data available for use, created apps focused on library materials
almogi et al. (2019) | academic | 23 participants | 5 days | develop linguistic tools for buddhist studies | provided cleaned datasets for manipulation | linguistic tools for buddhist studies

one area for iteration around these events relates to timing. while most hackathons last 24–36 hours, some are run over the course of a one- or two-month period where coding happens remotely with a few scheduled check-ins with mentors before judging and presentations.
this notion of a remote event may have more appeal for collections-as-data–themed events as experts are more likely to be available for keynotes or mentoring. if the process instead of the product is the focus of the event, then providing a flexible structure may be more appealing to participants. if a library has more limited resources or capacity, stretching the event out over a longer period would allow for sustained interactions. however, libraries should be aware that the longer the event period, the greater the attrition of the participants. an area for future research includes assessment of library participation in events. a couple of articles highlighted the value the libraries found in the events, but it is unclear whether the participants also gained value from the libraries.61 typically, post-event surveys have focused on the participant experience or the overall event space, rather than whether it affected participants’ view of the libraries, which would be another area of interest for future research.62

conclusion

in the realm of hackathons and libraries, hackathon themes were originally a way that vendors could highlight new content or improve interfaces. libraries followed this trend and used events to reach out to constituents, make connections with their communities, and highlight evolving library services. with the growth of flexible spaces, ample technology support, and more relaxed food policies, libraries have become ideal event locations. as the collections-as-data movement evolves, there will be more opportunities to develop services related to these data and other library data which would lend themselves easily as themes for hackathons, edit-a-thons, or datathons. libraries thinking about hosting events will need to weigh the amount of time and resources they want to invest with the intended goals of hosting an event. planning is essential whether the library is the event host, a collaborator, or a sponsor of a hackathon.
for those libraries that are unable to host a full hackathon, smaller events, such as a datathon or edit-a-thon, are possibilities to provide support without the same time and resource commitment. given the growing popularity of hackathons and other coding contests, they may be a catch-all for solving several library issues simultaneously: updating the library’s image as being more than book-centric, supporting the collections-as-data movement, and engaging community partners in a new way.

acknowledgements

thank you to jody condit fagan for providing valuable suggestions on a draft of this paper and to the two anonymous reviewers whose feedback improved the quality of this manuscript.

endnotes

1 james k. elmborg, “libraries as the spaces between us: recognizing and valuing the third space,” reference and user services quarterly 50, no. 4 (2011): 338–50. 2 “a brief open source timeline: roots of the movement,” online searcher 39, no. 5 (2015): 44–45; patrick timony, “accessibility and the maker movement: a case study of the adaptive technology program at district of columbia public library,” in accessibility for persons with disabilities and the inclusive future of libraries, advances in librarianship, vol. 40, (emerald group publishing limited, 2015), 51–58; kurt schiller, “elsevier challenges library community,” information today 28, no. 7 (july 2011): 10; eric lease morgan, “worldcat hackathon,” infomotions mini-musings (blog), last modified november 9, 2008, http://infomotions.com/blog/2008/11/worldcat-hackathon/; margaret heller, “creating quick solutions and having fun: the joy of hackathons,” acrl techconnect (blog), last modified july 23, 2012, http://acrl.ala.org/techconnect/post/creating-quick-solutions-and-having-fun-the-joy-of-hackathons.
3 clemens neudecker, “working together to improve text digitization techniques: 2nd succeed hackathon at the university of alicante,” impact centre of confidence in digitisation blog, last updated april 22, 2014, https://www.digitisation.eu/succeed-2nd-hackathon/; porter anderson, “futurebook hack,” bookseller no. 5628 (june 20, 2014): 20–21; sarah shaffi, “inaugural hack crowns its diamond project,” bookseller no. 5628 (june 20, 2014): 18–19. 4 rose sliger krause, james rosenzweig, and paul victor jr. “out of the vault: developing a wikipedia edit-a-thon to enhance public programming for university archives and special collections,” journal of western archives 8, no. 1 (2017): 3; stanislav bogdanov and rachel isaac-menard, “hack the library: organizing aldelphi [sic] university libraries’ first hackathon,” college and research libraries news 77, no. 4 (2016): 180–83; matt enis, “civic data partnerships,” library journal 145, no. 1 (2020): 26–28; alex carruthers, “open data day hackathon 2014 at edmonton public library,” partnership: the canadian journal of library & information practice & research 9 no. 2 (2014): 1–13, https://doi.org/10.21083/partnership.v9i2.3121; sarah shujah, “organizing and embedding a library hackfest into a 1st year course,” information outlook 18, no. 5 (2014): 32–48; lindsay anderberg, matthew frenkel, and mikolaj wilk, “project shhh! a library design contest for engineering students,” in american society for engineering education 2018 annual conference proceedings (2018): paper id 21058, https://cms.jee.org/30900. 5 michelle demeter et al., “send in the crowds: planning and benefiting from large-scale academic library events,” marketing libraries journal 2 no. 1 (2018): 86–95, https://bearworks.missouristate.edu/cgi/viewcontent.cgi?article=1089&context=articles-lib. 6 jamie lausch vander broek and emily puckett rodgers, “better together: responsive community programming at the um library,” journal of library administration 55, no. 
2 (2015): 131–41; arnab nandi and meris mandernach, “hackathons as an informal learning platform,” in sigcse 2016 – proceedings of the 47th acm technical symposium on computing science education (february 2016): 346–51, https://doi.org/10.1145/2839509.2844590; lindsay anderberg, matthew frenkel, and mikolaj wilk, “hack your library: engage students in information literacy through a technology-themed competition,” in american society for engineering education 2019 annual conference proceedings, (2019): paper id 26221, https://peer.asee.org/32883; anna grant, hackathons: a practical guide, insights from the future libraries project forge hackathon (carnegieuk trust, 2017), https://www.carnegieuktrust.org.uk/publications/hackathons-practical-guide/; carruthers, “open data day hackathon 2014 at edmonton public library”; chad nelson and nabil kashyap, glam hack-in-a-box: a short guide for helping you organize a glam hackathon (digital public library of america, summer 2014), http://dpla.wpengine.com/wp-content/uploads/2018/01/dpla_hackathonguide_forcommunityreps_9-4-14-1.pdf. 7 david ward, james hahn, and lori mestre, “adventure code camp: library mobile design in the backcountry,” information technology and libraries 33, no. 3 (2014): 45–52; david ward, james hahn, and lori mestre, “designing mobile technology to enhance library space use: findings from an undergraduate student competition,” journal of learning spaces 4, no. 1 (2015): 30–40. 8 ann marie l. davis, “current trends and goals in the development of makerspaces at new england college and research libraries,” information technology and libraries 37, no. 2 (2018): 94–117, https://doi.org/10.6017/ital.v37i2.9825; mark bieraugel and stern neill, “ascending bloom’s pyramid: fostering student creativity and innovation in academic library spaces,” college & research libraries 78, no.
1 (2017): 35–52; elyssa kroski, the makerspace librarian’s sourcebook (chicago: ala editions, 2017); angela pashia, “empty bowls in the library: makerspaces meet service,” college & research libraries news 76 no. 2 (2015): 79–82; h. michele moorefield-lang, “makers in the library: case studies of 3d printers and maker spaces in library settings,” library hi tech 32, no. 4 (2014): 583–93; adetoun a. oyelude, “virtual reality (vr) and augmented reality (ar) in libraries and museums,” library hi tech news 35, no. 5 (2018) 1–4. 9 krause, rosenzweig, and victor jr., “out of the vault”; ed yong, “edit-a-thon gets women scientists into wikipedia,” nature news (october 22, 2012), https://doi.org/10.1038/nature.2012.11636; angela l. pratesi et al., “rod library art+feminism wikipedia edit-a-thon,” community engagement celebration day (2018): 10, https://scholarworks.uni.edu/communityday/2018/all/10; maitrayee ghosh, “hack the library! a first timer’s look at the 29th computers in libraries conference in washington, dc,” library hi tech news 31, no. 5 (2014): 1–4, https://doi.org/10.1108/lhtn-05-20140031. 10 carruthers, “open data day hackathon 2014 at edmonton public library”; bob warburton, “civic center,” library journal 141, no. 15 (2016): 38. 11 matt burton et al., shifting to data savvy: the future of data science in libraries (project report, university of pittsburgh, pittsburgh, pa, 2018): 1–24, https://d-scholarship.pitt.edu/33891/. 12 vander broek and rodgers, “better together”; nandi and mandernach, “hackathons as an informal learning platform”; robin camille davis, “hackathons for libraries and librarians,” behavioral & social sciences librarian 35, no. 
2 (2016): 87–91; bogdanov and isaac-menard, “hack the library”; ward, hahn, and mestre, “adventure code camp”; ward, hahn, and mestre, “designing mobile technology to enhance library space use”; demeter et al., “send in the crowds”; carruthers, “open data day hackathon 2014 at edmonton public library.”
13 vander broek and rodgers, “better together.”
14 demeter et al., “send in the crowds.”
15 nandi and mandernach, “hackathons as an informal learning platform.”
16 eduard mititelu and vlad-alexandru grosu, “hackathon event at the university politehnica of bucharest,” international journal of information security & cybercrime 6, no. 1 (2017): 97–98; orna almogi et al., “a hackathon for classical tibetan,” journal of data mining and digital humanities, episciences.org, special issue on computer-aided processing of intertextuality in ancient languages, hal-01371751v3 (2019): 1–10, https://jdmdh.episciences.org/5058/pdf.
17 davis, “hackathons for libraries and librarians.”
18 ghosh, “hack the library!”
19 timony, “accessibility and the maker movement”; ward, hahn, and mestre, “adventure code camp.”
20 gérald estadieu and carlos sena caires, “hacking: toward a creative methodology for cultural institutions,” (presented at the viii lisbon summer school for the study of culture “cuber+cipher+culture”, september 2017); andrea valdez, “the vatican hosts a hackathon,” wired magazine, last updated march 7, 2018, https://www.wired.com/story/vaticanhackathon-2018/; leonardo moura de araujo, “hacking cultural heritage: the hackathon as a method for heritage interpretation,” (phd diss., university of bremen, 2018): 181–231, 235–38.
21 thomas finley, “innovation lab: a conference highlight,” texas library journal 94, no. 2 (summer 2018): 61–62.
22 grant, hackathons: a practical guide.
23 warburton, “civic center.”
24 warburton, “civic center.”
25 aude charillon and luke burton, “engaging citizens with data that belongs to them,” cilip update magazine (november 2016).
26 enis, “civic data partnerships.”
27 carruthers, “open data day hackathon 2014 at edmonton public library.”
28 kevin mcarthur, herb lainchbury, and donna horn, “open data hackathon how to guide v. 1.0,” october 2012, https://docs.google.com/document/d/1fbuisdtiibaz9u2tr7sgv6gddlov_ahbafjqhxsknb0/edit?pli=1; david eaves, “open data day 2013 in vancouver,” eaves.ca (blog), march 11, 2013, https://eaves.ca/2013/03/11/open-data-day-2013-in-vancouver/.
29 bogdanov and isaac-menard, “hack the library.”
30 shujah, “organizing and embedding a library hackfest into a 1st year course.”
31 nancy shin, kathryn vela, and kelly evans, “the research role of the librarian at a community health hackathon—a technical report,” journal of medical systems 44 (2020): 36.
32 geri diorio, “programming by the book,” voices of youth advocates 35, no. 4 (2012): 326–327.
33 lauren barack and matt enis, “where teens teach,” school library journal (april 2016): 30.
34 nelson and kashyap, glam hack-in-a-box.
35 lindsay anderberg, matthew frenkel, and mikolaj wilk, “hack your library: a library competition toolkit,” june 6, 2019, https://wp.nyu.edu/hackyourlibrary/; anderberg, frenkel, and wilk, “hack your library: engage students in information literacy through a technology-themed competition.”
36 anna grant, engage. respond. innovate. the value of hackathons in public libraries (carnegieuk trust, 2020), https://www.carnegieuktrust.org.uk/publications/engage-respond-innovatethe-value-of-hackathons-in-public-libraries/.
37 bogdanov and isaac-menard,
“hack the library.”
38 ward, hahn, and mestre, “adventure code camp.”
39 ward, hahn, and mestre, “designing mobile technology to enhance library space use.”
40 anderberg, frenkel, and wilk, “project shhh!”
41 anderberg, frenkel, and wilk, “hack your library: engage students in information literacy through a technology-themed competition.”
42 bethany mcgowan, “the role of the university library in creating inclusive healthcare hackathons: a case study with design-thinking processes,” international federation of library associations and institutions 45, no. 3 (2019): 246–53, https://doi.org/10.1177/0340035219854214.
43 marco büchler et al., “digital humanities hackathon on text re-use ‘don’t leave your data problems at home!’” electronic text reuse acquisition project, event held july 27–31, 2015, http://www.etrap.eu/tutorials/2015-goettingen/; helsinki centre for digital humanities, “helsinki digital humanities hackathon 2017 #dhh17,” event held may 15–19, 2017, https://www.helsinki.fi/en/helsinki-centre-for-digital-humanities/dhh-hackathon/helsinkidigital-humanities-hackathon-2017-dhh17.
44 antje theise, “open cultural data hackathon coding da vinci–bring the digital commons to life,” in ifla wlic 2017 wroclaw poland, session 231—rare books and special collections (2017), http://library.ifla.org/id/eprint/1785.
45 almogi et al., “a hackathon for classical tibetan.”
46 samantha fritz et al., “fostering community engagement through datathon events: the archives unleashed experience,” digital humanities quarterly 15, no. 1 (2021): 1–13, http://digitalhumanities.org/dhq/vol/15/1/000536/000536.html.
47 tom baione, “hackathon & 21st-century challenges,” library journal 142, no. 2 (2017): 14–17.
48 american museum of natural history, “hack the stacks,” https://www.amnh.org/learnteach/adults/hackathon/hack-the-stacks, https://github.com/amnh/hackthestacks/wiki, https://github.com/hackthestacks.
49 andrea valdez, “inside the vatican’s first-ever hackathon: this is the holy see of the 21st century,” wired magazine, march 12, 2018, https://www.wired.com/story/inside-vhacksfirst-ever-vatican-hackathon/.
50 museomix, “concept,” accessed march 29, 2021, https://www.museomix.org/en/concept/.
51 kristi bergland, kalan knudson davis, and stacie traill, “catdoc hackdoc: tools and processes for managing documentation lifecycle, workflows, and accessibility,” cataloging and classification quarterly 57, no. 7–8 (2019): 463–95.
52 islandora collaboration group, “templates: how to run a hack/doc,” last modified december 5, 2017, https://github.com/islandora-collaborationgroup/icg_information/tree/master/templates_how_to_run_a_hack_doc.
53 gordon dunsire, “toward an internationalization of rda management and development,” italian journal of library and information science 7, no. 2 (may 2016): 308–31, http://dx.doi.org/10.4403/jlis.it-11708.
54 nelson and kashyap, glam hack-in-a-box.
55 hannah-louise clark, “global history hackathons information,” accessed april 19, 2021, https://www.gla.ac.uk/schools/socialpolitical/research/economicsocialhistory/projects/global%20historyhackathons/history%20hackathons/.
56 gustavo candela et al., “reusing digital collections from glam institutions,” journal of information science (august 2020): 1–10, https://doi.org/10.1177/0165551520950246.
57 thomas padilla, “on a collections as data imperative,” uc santa barbara, 2017, https://escholarship.org/uc/item/9881c8sv; rachel wittmann et al., “from digital library to open datasets,” information technology and libraries 38, no.
4 (2019): 49–61, https://doi.org/10.6017/ital.v38i4.11101; sandra tuppen, stephen rose, and loukia drosopoulou, “library catalogue records as a research resource: introducing ‘a big data history of music,’” fontes artis musicae 63, no. 2 (2016): 67–88.
58 moura de araujo, “hacking cultural heritage.”
59 grant, hackathons: a practical guide; grant, engage. respond. innovate.; joshua tauberer, “hackathon guide,” accessed march 26, 2021, https://hackathon.guide/; alexander nolte et al., “how to organize a hackathon—a planning kit,” arxiv preprint arxiv:2008.08025 (2020), https://arxiv.org/abs/2008.08025v2; ivonne jansen-dings, dick van dijk, and robin van westen, hacking culture: a how-to guide for hackathons in the cultural sector, waag society (2017): 1–41, https://waag.org/sites/waag/files/media/publicaties/es-hacking-culturesingle-pages-print.pdf.
60 ward, hahn, and mestre, “designing mobile technology to enhance library space use.”
61 mcgowan, “the role of the university library in creating inclusive healthcare hackathons.”
62 nandi and mandernach, “hackathons as an informal learning platform”; carruthers, “open data day hackathon 2014 at edmonton public library.”
editorial: beginnings
marc truitt
as i write these lines in late february, the first hints of spring on the alberta prairie are manifest.
alternatively, perhaps it’s just that the longer and warmer days are causing me to “think spring.” there are no signs yet of early bulbs—at least, none that i can detect with around a foot of snow in most places—but the sun is now rising at 7:30 a.m. and not setting until 6 p.m., a dramatic change from the barely seven hours of daylight typical of december and january. and while none but the hardiest souls are yet outside in shorts and shirt-sleeves, somehow, daytime highs that hover around freezing seem downright pleasant in comparison with the minus thirties (not counting the wind chill) we were experiencing even a couple of weeks ago. yes, spring is in the air, even if the calendar says it is still nearly a month away. . . .
so what, you may fairly ask, does the weather in edmonton have to do with ital? this is my first issue of ital as editor, and it may not surprise you to hear that i’ve been thinking quite a bit about what might be the right theme and tone for my first column. while i’ve been associated with the journal for quite a while—first as a board member, and more recently as managing editor—my role has always been comfortably limited to background tasks such as refereeing papers and production issues. now, that is about to change; i am stepping a bit out of my comfort zone. it’s about beginnings.
i follow with some awe in the footsteps of a long line of editors of ital (and jola, its predecessor). i’ve been honored to serve under, and to learn a great deal from, the last two, dan marmion and john webb. you, the readers of ital, and i are fortunate to have as returning managing editor judith carter, who preceded me and taught me the skills required for that post; i hasten to emphasize that she is definitely not responsible for the things i did not do right in the job! regular readers of ital will recall that john webb often referred humorously and admiringly to the members of the ital editorial board as his “junkyard dogs”; he claimed that they kept him honest.
with the addition of a couple of fine new members, i’m confident that they will continue to do so in my case! okay, with that as preface, enough about me . . . let’s talk about ital.
■ what’s inside this issue
ital content has traditionally represented an eclectic blend of the best mainstream and leading/bleeding edge of library technology. we strive to be reflective of the broad, major issues of concern to all librarians, as well as alert to interesting applications that may be little more than a blip at the edge of our collective professional radar screen. our audience is not limited to those actively working in library technology, although they certainly form ital’s core readership; rather, we seek to identify and publish content that will be relevant to all with an interest in or need to know about how technology is affecting our profession. thus, some articles will resonate with staff seeking new ways to use web 2.0 technologies to engage our readers, while other articles will appeal to those interested in better exploiting the four decades’ worth of bibliographic metadata that forms the backbone of our integrated library systems. the current issue of ital is no exception in this regard. we lead off with two papers that reflect the renewed interest of the past several years in the role and improvement of the library online catalog. jia mi and cathy weng review opac interfaces, searching functionality, and results displays to address the question of why the current opac is ineffective and what we can do to revitalize it. timothy dickey, in a contribution that received the 2007 lita/ex libris student writing award,1 summarizes the challenges and benefits of a frbr approach to current and “next-gen” library catalogs. interestingly, as will become clear at the end of this column, dickey’s is not the first prize-winning frbr study to appear in the pages of ital.
online learning has long been a subject of interest both to librarians and to the education sector as a whole. whereas the focus of many previous studies has been on the techniques and efficacy of online learning systems, connie haley’s paper takes a rather different approach, describing and exploring factors that characterize the preference of learners for online training, as compared with more traditional in-person techniques. in gary wan and zao liu’s investigation of content-based information retrieval (cbir) in digital libraries, the authors describe and argue for systems that will enable identification of images and audio clips by automated comparison against digital libraries of image and audio files. finally, wooseob jeong prototypes an innovative application for enhancing web access by the visually impaired. jeong’s application makes use of force feedback, an inexpensive, proven technology drawn from the world of video gaming.
marc truitt (marc.truitt@ualberta.ca) is associate director, bibliographic and information technology services, university of alberta libraries, edmonton, alberta, canada, and editor of ital.
information technology and libraries | march 2008
■ some ideas about where we are going
a change of editorship is always one of those good opportunities for thinking about how we might improve, or of different directions we might explore. with that in mind, here are a couple of things we’re either going to try, or that we’re considering: different voices. ital’s format has long included provision for two “opinion” columns, one by the editor, and another by the president of lita. from time to time, past editors have given over their columns for guest editorials. however, there are many other voices that could enrich ital’s pages, and the existing structure doesn’t really have a “place” for the regular airing of these voices.
beginning with the june 2008 issue, ital will include a regular column contributed by members of the board, on a rotating basis. the column will be about any topic related to technology and libraries that is on the author’s mind. i’m thinking about how we might expand this to include a similar column contributed by ital readers. while such reader contributions may lack the currency of a weblog, i think that they would make for thought-provoking commentary.
oh, and there’s that “currency thing.” in recent years, those of us who bring you ital have—as have those responsible for other ala publications—discussed at length the whole question of when and how to move to a sustainable model of electronic publishing that will address the needs of readers. this issue is of course especially important in the case of a technology-focused journal, where content tends to age rapidly. unfortunately, for various reasons, we’re not yet at the stage where we can go completely and solely electronic. a recent conversation with one board member, though, surfaced an idea that i think in the meantime has merit: essentially, we might create a preprint site for papers that have been accepted and edited for future publication in ital. we might call it something such as ital express, and its mission would be to get content awaiting publication out and accessible. is this a “done deal”? no, at this stage, it’s just an intriguing idea, and i’d be interested in hearing your views about it . . . or anything else related to ital, for that matter. you can e-mail me at marc.truitt@ualberta.ca.
■ and finally, congratulations dept.
last week, martha yee, of the film and television archive at the university of california, los angeles received the alcts cataloging and classification section’s margaret mann citation for 2008. martha was “recognized for her outstanding contributions to the practice of cataloging and her interest in cataloging education . . .
[and her] professional contributions[, which] have included active participation in ala and alcts and numerous publications.” of particular note, the citation specifically singled out her work in the areas of “frbr, opac displays, shared cataloging and other important issues, [in which] yee is making a significant contribution to the discussions that are leading the development of our field.” surely among the most important of these is her paper “frbrization: a method for turning online public finding lists into online public catalogs,” which appeared in the june 2005 issue of ital (pp. 77–95). archived at the ital site, d-list, the cdl e-scholarship repository, and elsewhere, this seminal contribution has become one of the most accessed and cited works on frbr. we at ital are proud to have provided the original venue for this paper and congratulate martha on being named recipient of the margaret mann award.
information technology and libraries | june 2006
book review
debra shapiro, editor
strategic planning and management for library managers by joseph r. matthews. westport, conn.: libraries unlimited, 2005. xiv, 150p. $40 (isbn 1-59158-231-8).
the reality for most librarians is that, sometime in their career, they will be involved in strategic management and planning. while library school courses occasionally deal with this topic, it is from a theoretical perspective only. most librarians are promoted or coerced into leadership and management roles, often with little or no training or resources at their disposal to assist them with the transition or change of responsibilities. strategic planning is one of those duties assigned to library managers and leaders that often get pushed to the lowest-priority list, mainly because there are few guidelines and handbooks available in this area. since the publication of donald riggs’s strategic planning for library managers (oryx, 1984), little attention has been given to this vital topic.
matthews’s book attempts to provide information on how to explore strategies; demystify false impressions about strategies; how strategies play a role in the planning and delivery of library services; broad categories of library strategies that can be used; and identification of new ways to communicate the impact of strategies to patrons. as the author states in the introduction, the focus of libraries has moved from collections to encompass the arena of change itself. finding strategies to enable operation in a fluid environment can mean the difference between relevance and irrelevance in today’s competitive information marketplace. the book is divided into three major sections: (1) what is a strategy, and the importance of having one; (2) the value of and options for strategic planning; and (3) the need to monitor and update strategies. the first four chapters make up the first section. chapters 1 and 2 go through the semantics and the need for strategies, as well as the realities and limitations of strategies. chapter 3 provides brief introductions to schools of strategic thought. these include the design school, the planning school, the positioning school, the entrepreneurial school, the cognitive school, the learning school, the power school, the cultural school, the environmental school, and the configuration school. chapter 4 introduces types of strategies: operational excellence, innovative services, customer intimacy, and the concept of strategic options. section 2 consists of chapters 5 through 8 and provides information on what strategic planning is, what its value is, process options such as planning alternatives and critical success factors, and implementation. section 3, comprised of chapters 9 and 10, focuses on the culture of assessment; monitoring and updating strategies; and tools available for managing the library. two appendixes are provided: one containing sample library strategic plans, and another with a critique of a library strategic plan. 
overall, the book is very straightforward and understandable, with numerous illustrations, process workflows, and charts. i found the information very interesting and useful, and the final section on assessment and measurement of strategic planning is essential for libraries to implement and monitor in today’s marketplace. the various explanations related to schools of strategic thought were especially helpful. this book should be read by every library manager and director involved in strategic planning and process.—brad eden, associate university librarian for technical services and scholarly communication, university of california, santa barbara ebsco cover 3 lita 107, 111, covers 2 and 4 index to advertisers president’s message: open access/open data colleen cuddy information technologies and libraries | march 2012 1 i am very excited to write this column. this issue of information technology and libraries (ital) marks the beginning of a new era for the journal. ital is now an open-access, electronic-only journal. there are many people to thank for this transition. the lita publications committee led by kristen antelman did a thorough analysis of publishing options and presented a thoughtful proposal to the lita board; the lita board had the foresight to push for an open-access journal even if it might mean a temporary revenue loss for the division; bob gerrity, ital editor, has enthusiastically supported this transition and did the heavy lifting to make it happen; and the lita office staff worked tirelessly for the past year to help shepherd this project. i am proud to be leading the organization during this time. to see ital go open access in my presidential year is extremely gratifying. 
as cliff lynch notes in his editorial, “the library profession has been slow to open up access to the publications of its own professional societies, to take advantage of the greater reach and impact that such policies can offer.” as librarians challenge publishers to pursue open-access venues, myself included, i am relieved to no longer be a hypocrite. by supporting open access we are sending a strong message to the community that we believe in the benefits of open access and we encourage other library organizations to do the same. ital will now reach a much broader and larger audience. this will benefit our authors, the organization, and the scholarship of our profession. i understand that while our members embrace open access, not everyone is pleased with an online-only journal. the number of new journals being offered electronically only is growing and i believe we are beginning to see a decline in the dual publishing model of publishers and societies offering both print and online journals. my library has been cutting back consistently on print copies of journals and this year will get only a handful of journals in print. personally, i have embraced the electronic publishing world. in fact, i held off on subscribing to the new yorker until it had an ipad subscription model! i estimate that i read 95 percent of my books and all of my professional journals electronically. the revolution has happened for me and for many others. i know that our membership will adapt and transition their ital reading habits to our new electronic edition and i look forward to seeing this column and the entire journal in its new format. colleen cuddy (colleen.cuddy@med.cornell.edu) is lita president 2011-12 and director of the samuel j. wood library and c. v. starr biomedical information center at weill cornell medical college, new york, new york. mailto:colleen.cuddy@med.cornell.edu president’s message | cuddy 2 earlier this week saw the research works act die. 
librarians and researchers across the country celebrated this victory as we preserved an important open-access mandate requiring the deposition of research articles funded by the national institutes of health into pubmed central. this act threatened not just research but the availability of health information to patients and their families. as librarians, we still need to be vigilant about preserving open access and supporting open-access initiatives. i would like to draw your attention to the federal research public access act (frpaa, hr 4004). this act was recently introduced in the house, with a companion bill in the senate. as described by the association of research libraries, frpaa would ensure free, timely, online access to the published results of research funded by eleven u.s. federal agencies. the bill gives individual agencies flexibility in choosing the location of the digital repository to house this content, as long as the repositories meet conditions for interoperability and public accessibility, and have provisions for long-term archiving. the legislation would extend and expand access to federally-funded research resources and, importantly, spur and accelerate scientific discovery. notably, this bill does not take anything away from publishers. no publisher will be forced to publish research under the bill’s provisions; any publisher can simply decline to publish the material if it feels the terms are too onerous. i encourage the library community to contact their representatives to support this bill. open access and open data are the keystones of e-science and its goals of accelerating scientific discovery. i hope that many of you will join me at the lita president’s program on june 24, 2012, in anaheim. 
tony hey, corporate vice president of microsoft research connections and former director of the u.k.'s e-science initiative, and clifford lynch, executive director of the coalition for networked information, will discuss data-intensive scientific discovery and its implications for libraries, drawing from the seminal work the fourth paradigm. librarians are beginning to explore our role in this new paradigm of providing access to and helping to manage data in addition to bibliographic resources. it is a timely topic and one in which librarians, due to our skill set, are poised to take a leadership role. reading the fourth paradigm was a real game changer for me. it is still extremely relevant. you might consider reading a chapter or two prior to the program. it is an open-access e-book available for download from microsoft research (http://research.microsoft.com/en-us/collaboration/fourthparadigm/). i keep a copy on my ipad, right there with downloaded ital article pdfs. http://www.arl.org/pp/access/frpaa-2012.shtml http://research.microsoft.com/en-us/collaboration/fourthparadigm/ do space’s virtual interview lab: using simple technology to serve the public in a time of crisis editorial board thoughts do space’s virtual interview lab using simple technology to serve the public in a time of crisis michael p. sauers information technology and libraries | june 2021 https://doi.org/10.6017/ital.v40i2.13461 as we start to pull ourselves out of the covid-19 pandemic it’s time to start thinking about what changes we made at our libraries in response and decide which ones we should keep and which ones need to end along with the pandemic itself. do space in omaha, nebraska is the country’s first community technology library run by community information trust, a privately funded non-profit. 
since our opening in november 2015, we have been providing access to a variety of technologies like a computer lab, laptops, high-speed internet and more along with innovative educational programs on a range of technological topics for individuals and small businesses. membership and the vast majority of services are free and anyone is welcome to join. as a result of the pandemic, we were closed to the public for three months in early 2020 and reopened under limited services to the public in june 2020. both while we were closed and since, we have implemented many changes to our services and programming, including limiting the number of available computers to support social distancing and moving our all-in-person educational programming to all-online programming, to name just two of the bigger changes. but what i’d like to talk about is one new service we implemented that was simple, easy to set up, and has had a significant impact on a number of our members: do space’s virtual interview lab. when we reopened to the public with limited services in june 2020, one of the questions we asked ourselves was what new services we could provide in the circumstances and, under those same circumstances, what new service the public might need. we considered the reality of social distancing, and the fact that our meeting rooms could no longer be used for meetings with multiple members. then, although nebraska has historically had a low unemployment rate, we realized that covid had pretty much forced many employers that had not yet moved to online interviews to do so. this was combined with the fact that covid only reinforced the already significant digital divide. someone needing to attend a job interview online could easily be lacking something as simple as a good quality webcam or microphone, or not have the bandwidth available to them at their home to successfully video conference. worst case, they may not have a computer at all. 
these are exactly the situations that do space was created for; to offer the public access to the hardware, software, and bandwidth that they need to become successful. with this in mind we decided to turn our small conference room into a virtual interview lab. the room already had a good-sized table, excellent available wifi, generally good lighting, and plain white walls, perfect as a simple background. previous users of this room would generally use a laptop to connect to our wifi and a large screen in the room. instead, for this setup we added a small micropc which we connected via an ethernet port to our gigabit fiber internet connection. michael sauers (msauers@travelinlibrarian.info) is technology manager, do space, and a member of the journal’s editorial board © 2021. mailto:msauers@travelinlibrarian.info information technology and libraries june 2021 do space’s virtual interview lab | sauers 2 to this pc we added a 27” monitor, 1080p webcam, a blue yeti microphone, and headphones. on the pc we installed every virtual meeting platform we could think of including zoom, adobe connect, microsoft teams, gotowebinar/meeting, and more, placing direct shortcuts to all of the programs and online services right on the desktop for easy access. with our setup complete we opened the lab for bookings starting july 1, 2020. use has been slow and steady, possibly due to our low unemployment rate, but the members that have used it are grateful for its existence. our marketing was first just on our website and social media but after a month or two we gathered a list of over 50 area groups and organizations that assisted folks with finding work and mailed them a stash of postcards that they could hand out and asked them to let us know if they needed any more. one group was so inspired by the project that in their thank you they said that they’d be starting their own virtual interview lab at their location. 
in the past year the lab hasn’t changed all that much with the exception of moving to a different room and a broadening of the use case. we quickly realized that members were wishing to use the room for events beyond job interviews. those using the lab have done so for attending ged and language classes, business meetings, attending do space’s own online programming, and even participating in our virtual tech mentoring program. have there been any problems? we’re dealing with technology here so of course the answer is yes, but luckily, they have been minor. for example, one person commented that our blue headphones didn’t look as “professional” as they would have liked. other times zoom needed a last-minute update which staff quickly addressed. (we encourage everyone to book the start of their session 30 minutes in advance to give us a chance to fix such issues.) otherwise, feedback has been overall very positive. here’s just a few examples: • “thank you! i actually used the room on short notice for several conference calls (plumbing disaster at my house!). it's not the intended use, but it was open and your team was kind enough to let me use it. i sincerely appreciate it. the room, by the way, has an excellent set up. wifi was lightning fast, lighting was perfect and i love that you have a microphone to focus the sound. oh, and that cute coat rack dressed up my background when i had to talk to a large client. it's fantastic that you offer this resource to the community. thanks again for letting me use it!” • “wanted to note a few things. i used this space for a research interview where i was a participant. i wasn't strictly using this space for a job interview. that said, it suited my needs perfectly. i was very happy to utilize this space. it was quiet, clean, and accommodated what i was looking for. customer service was also excellent. the service desk worker was able to promptly get me set up when i was already running a bit late for my interview. 
thank you for making this service available and also making it intuitive and easy to utilize. i will probably look to use it again in the future.” • “the room is ideal, quiet, no distractions, i was able to connect clearly using teams, no glitches, the volume was loud enough. was able to hear clearly and see interviewers faces clearly. staff at do space were available and prompt to assist before the interview when i needed set-up help adjusting the appearance / display of my head within the frame/screen.” • “you are a godsend! i am so grateful, especially in these times, that you are here. the staff are kind, patient and thoroughly knowledgeable. i love you.” information technology and libraries june 2021 do space’s virtual interview lab | sauers 3 this experience has reminded me that while all the advanced experimenting and complex coding we create to better assist our users is all well and good, sometimes just a simple computer, internet connection, and webcam can make all the difference in someone’s life. while some of the changes that we’ve made over the past year, such as moving all programming online, will be either ending or slowly transitioning to pre-pandemic states, our virtual interview lab is one new service that we will be definitely keeping for the foreseeable future. 50 information technology and libraries | june 2009 andrew k. pace president’s message: lita forever andrew k. pace (pacea@oclc.org) is lita president 2008/2009 and executive director, networked library services at oclc inc. in dublin, ohio. i was warned when i started my term as lita president that my time at the helm would seem fleeting in retrospect, and i didn’t believe it. i should have. i suppose most advice of that sort falls on deaf ears—advice to children about growing up, advice to newlyweds, advice to new parents. some things you just have to experience. now i am left with that feeling of having worked very hard while not accomplishing nearly enough. 
it’s time to buy myself some more time. my predecessor, mark beatty, likes to jokingly introduce himself in ala circles as “lita has-been” in reference to his role as lita past-president. i say jokingly because he and i both know it is not true. not only does the past-president continue in an active role on the lita board and executive committee, the past-president has the daunting task of acting as the division’s financial officer. just as mark knows well the nature of this elected (but still volunteer) commitment, so michelle frisque, my successor this july, knows that the hard work started as vice-president/ president-elect has two challenging years ahead. being elected lita president is for all intents and purposes a three-year term with shifting responsibilities. add to this the possibility of serving on the board beforehand, and it’s likely that one could serve less time for knocking over a liquor store. i’m joking, of course— there’s nothing punitive about being a lita officer; it’s as rewarding as it is challenging. neither is this intended to be a self-congratulatory screed as my last hurrah in print as lita president. i’ve referred repeatedly to the grassroots success of lita’s board, interest groups, dedicated committees, and engaged volunteers. the flatness of our division is often emulated by others. i thoroughly enjoy engagement with the lita membership, face-to-face and virtual recruitment of new members and volunteers, and group meetings to discuss moving lita forward. i love that lita is fun. fun and enjoyment, coupled with my dedication to the profession that i love, is why i plan to make the most of my time, even as a has-been. all those meetings, all that bureaucracy? well, believe it or not, i like the bureaucracy—process works when you learn to work the process—and all those meetings have actually created some excellent feedback for the lita board. 
changes in ala, changes in the membership, and changes suggested by committees and interest groups all suggest . . . guess what? change. “change” has been a popular theme these days. i’m in that weird minority of people who does not believe that people don’t like to change. i think if the ideas are good, if the destination is worthwhile, then change is possible and even desirable. i’m always geared up for change, for learning from our mistakes, for asking forgiveness on occasion and for permission even less. this is a long-winded way of saying that i think lita is ready for some change. change to the board, change to the committees and interest groups, and changes to our interactions with lita and ala staff. i think ala and the other divisions are anxious for change as well, and i feel confident that lita and its membership can help, even while we change ourselves. don’t ask me today what the details of these changes are. all i can say is that i will be there for them, help see them through, and will be there on the other side to assess which changes worked and which didn’t. one thing i hope does not change is the passion and dedication of the leaders, volunteers, and members of this great organization. i only hope that our ranks grow, even in times of financial uncertainty. lita provides a valuable network of colleagues and friends; this network is always valuable, but it is indispensable in times of difficulty. for many, lita represents a second or third divisional membership, but for networking and collegial support, i think we are second to none. i titled my previous column “lita now.” i think it’s safe for me to say now, “lita forever.” 50 information technology and libraries | june 2006 forty years! in july 1966, the library and information technology association (lita) was officially born at the american library association (ala) annual conference in new york as the information science and automation division (isad). 
it was bastille day, and i’m sure for those who had worked so hard to create this new organization that it probably seemed like a revolution, a new day. the organizational meeting held that day attracted “several hundred people.” imagine! i’ve mentioned it before, i know, but the history of the first twenty-five years of lita is intriguing reading and well worth an investment of your time. stephen r. salmon’s article “lita’s first twenty-five years: a brief history” (www.lita.org/ala/lita/aboutlita/org/1st 25years.htm) offers an interesting look back in time. any technology organization that has been in existence for forty or more years has seen a lot of changes and adapted over time to a new environment and new technologies. there is no other choice. someone (who, i don’t remember; i’d gladly attribute the quote if i did) once told me that library automation began with the electric eraser. i’m sure that many of you have neither seen an electric eraser, nor can you probably imagine its purpose. ask around. i’m sure there are staff in your organization who do remember using it. there may even be one hidden somewhere in your library. a quick search of the web even finds cordless, rechargeable electric erasers today in drafting and art supply stores. the 1960s, as lita was born, was still the era of the big mainframe systems and not-so-common programming languages. machine readable cataloging (marc) was born and oclc conceived. the 1970s saw the introduction of minicomputer systems. digital equipment corporation introduced the vax, a 32-bit platform, in 1976. the roots of many of our current integrated library systems reach back to this decade. the 1980s saw the introduction of the ibm personal computer and the apple macintosh. the graphical interface became the norm or at least the one to imitate. the 1990s saw a shift away from hardware to communication and access as the web was unveiled and began to give life to the internet bubble. 
the new millennium began with y2k. the web predominates, and increasingly, the digital form dominates almost everything we touch (text, audio, video). automation and systems evolved and changed over the years, and so did libraries. automation, which had been confined to large air-conditioned and monitored rooms, moved out into the library. it increasingly appeared at circulation desks, on staff desks, and then throughout the library. information technology (it) spread into offices everywhere and into homes. libraries had products and services to deliver to users. users demanded more convenience. of course, others knew this trend as well and provided products and services that users wanted. users often liked what they saw in stores better than what the library was able to provide. each of us attempts to keep up, compete, and beat those whom we see as our competitors. it’s a moving target and one that seems to be gaining speed. all the while, during these four decades, our association and its members continually adapted to the new environment, faced new challenges, and adopted new technologies. we would not exist if we did not. i feel that we, as an association, are again facing the need to change, to transform ourselves. it, digital technology, automation (whatever term you want to use) affects the work of virtually every library staff member. everyone’s work in the library uses or contributes to the digital presence of our employer. it is not the domain of a few. lita has a wonderful history and it has great potential to better serve the profession. what do we want our association to be? what programs and services can we provide that others do not? who can we involve to broaden our reach? how can we better communicate with members and nonmembers? if we had a clean sheet of paper, what would we write? what would we dream? we need to share that dream and bring it to life. i can’t do it. the lita board can’t do it. we need your help. we need your ideas. 
we need your energy. we need to break out of our comfort zone. none of us wants the strategic plan (www.lita.org/ala/lita/aboutlita/org/plan.htm) we adopted last year to ring hollow. we want to accelerate change and move into a reenergized future. i welcome your aspirations, ideas, and comments. i know that the lita board does as well. please feel free to contact me or any member of the board (www.lita.org/ala/lita/aboutlita/org/litagov/board.htm). lita is your association. where should we be going? help us navigate the future. patrick mullin patrick mullin (mullin@email.unc.edu) is lita president 2005–2006, and associate university librarian for access services and systems, the university of north carolina at chapel hill. president’s column editorial board thoughts: content and functionality: know when to buy ‘em, know when to code ‘em1 kenneth j. varnum2 information technologies and libraries | september 2017 3 we in library technology live in interesting times, though not those of the apocryphal curse. no, these are interesting times in the best possible way. where once there was a paucity of choice in interfaces and content, we have arrived at a time when a range of competing and valid choices exists for just about any particular technology need. data and functionality of actual utility to libraries are increasingly available not just through proprietary interfaces, but also through apis (application programming interfaces) that are ready to be consumed by locally developed applications. this has expanded the opportunity for libraries to respond more thoughtfully and strategically to local needs and circumstances than ever before. 
libraries are faced with an actual, rather than hypothetical, choice between building or buying fundamental user interfaces and systems. as the internet has evolved, and coding has become more central to the skillset of many libraries, the capability of libraries to seriously consider building their own interfaces has grown. how does a technologically capable library make the decision to buy a complete system or build its own interface to existing data? the decision can be made using a range of criteria that help define the library’s need for a locally managed solution. we’ll start by discussing technological capabilities needed to take on almost any development project, then define three criteria, and finally discuss the circumstances in which a build solution might be appropriate. the goal is to outline a process for deciding when it makes more sense to buy both the interface and the content, to build one or the other locally, or to build both. criterion 0: what are the short- and long-term technological capabilities of the library? clearly, the first point of consideration is whether the institution has the capacity to manage application development and user research. the short-term answer may be no, but the long-term answer, one based on the library’s strategic direction, may be that these skills are needed to meet the library’s goals or strategic vision. one project may not be enough to tip the scales, but if the library is continually deciding if the immediate project under discussion is the one to change the balance, then perhaps the answer is that it’s time to invest in new skillsets and capabilities. there are actually several skillsets needed to undertake development projects. individuals with coding skills are needed to adapt existing open-source software to the library’s needs (it is a rare open-source project that does exactly what a library needs it to do, with connectors to all the same data sources and library management tools already perfectly configured by somebody else), but that is not sufficient. a library also needs people with user interface and user research skills to ensure that the application meets at least the critical needs of its own user community, and does so with language and cues that match user expectations. even if there is not a permanent capability on the library’s staff, development can take place with contract services. if this is the option selected, a library would do well to make sure that staff are sufficiently trained to make minor updates to interfaces and applications, or that a longer-term arrangement is made for ongoing maintenance and updates. 1 with apologies to kenny rogers. 2 kenneth j. varnum (varnum@umich.edu), a member of the ital editorial board, is senior program manager for discovery, delivery, and library analytics at the university of michigan library, ann arbor, mi. https://doi.org/10.6017/ital.v36i3.10087 criterion 1: what is the need to customize interactions to local situations? most, but not all, applications offer opportunities to match interface features and functionality with local user needs. the more interactive and core to the library’s service model the tool is, the more likely the tool is to benefit from customization. for example, a proxy server (technology that allows an authenticated user to access licensed content as if she were in the physical library or within a campus on a defined network) has little or no user interface. there is little need to customize the tool to meet user needs, beyond ensuring the list of online resources and urls subject to being proxied is up to date. there really aren’t any particularly useful apis to consume and reproduce elsewhere, and there are easier ways to build an a-z list of licensed content than harvesting the proxy server’s configuration lists. 
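to make the a-z list point concrete, here is a minimal sketch, assuming the library keeps its own simple list of licensed resources and that its proxy accepts the destination as a `url` query parameter. the prefix `https://proxy.example.edu/login?url=`, the resource entries, and the function names are all hypothetical placeholders, not any particular product's interface; percent-encoding the target is a choice made for this sketch, and some proxies may expect the raw url instead.

```python
from urllib.parse import quote

# hypothetical proxy prefix; substitute whatever your own proxy server expects
PROXY_PREFIX = "https://proxy.example.edu/login?url="

# hypothetical, locally maintained list of licensed resources
RESOURCES = [
    {"title": "web of examples", "url": "https://www.example.org/search?db=woe"},
    {"title": "annual review of samples", "url": "https://journals.example.com/ars"},
]


def proxied_link(target_url: str, prefix: str = PROXY_PREFIX) -> str:
    """prefix a licensed resource url so the request is routed through the proxy."""
    # encode the whole target so it travels as a single query parameter value
    return prefix + quote(target_url, safe="")


def a_to_z(resources):
    """return (title, proxied url) pairs sorted by title, ignoring case."""
    ordered = sorted(resources, key=lambda r: r["title"].lower())
    return [(r["title"], proxied_link(r["url"])) for r in ordered]


if __name__ == "__main__":
    for title, link in a_to_z(RESOURCES):
        print(f"{title}: {link}")
```

the list lives with the library rather than in the proxy's configuration, which is the "easier way" the paragraph above alludes to; the proxy itself stays uncustomized.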
in contrast, the link resolver (technology that takes a citation formatted according to the openurl standard and returns a list of appropriate full-text destinations to which the library has licensed access) may well be worth bringing in house. some vendors offer their software to be run locally, while others provide api access to the metadata. at my institution, we used the 360 link api that serials solutions makes available to build our own interface using the open-source umlaut software (see https://mgetit.lib.umich.edu/). why go to the trouble of recreating an interface? for several reasons, some of which (understanding user behaviors and maintaining control over user data to the extent practical) i’ll touch on in the following two sections. the main reasons centered on providing a user interface consistent with the rest of our web presence, offering integrations with our document delivery service, a way to contact our online chat service, and a way to report problem links directly to the library when the full-text links provided by the system do not work. while these features are generally available through vendor interfaces, the user experience is hard to make consistent with other services we offer. criterion 2: what are the needs for integration with other systems from different providers? integrations can run in two directions: from the system under consideration to existing library or campus/community tools, and from those environmental tools to the library. when thinking about the buy-or-build decision, understanding the scope of these integrations up front is important. if all of the tools or services that need to consume information from or provide information to your
if, however, the other systems are themselves tricky to work with, relying on inputs or providing outputs in a non-standard or idiosyncratic way, this situation may swing the pendulum toward building the system yourself so you can manage those idiosyncrasies directly. for example, many course management systems on academic campuses can consume and provide data using the lti [learning tools interoperability] standard for data exchange. many traditional library applications do, as well, so if a library using an lti-compliant system needs to provide course reserves reading lists to the course management system, this is a ready-made way to make that information available. at the other extreme, bringing registrar’s data into a library catalog (to know who is in what courses, to provide those patrons with an appropriate reference librarian contact for a particular subject, or to give access to a reading list through a course reserves system) may only be possible through customized applications that read non-standard data. in this case, to provide the desired level of service to the campus, the library may need to build local applications. criterion 3: who manages confidentiality or privacy of user interactions? a final, and increasingly significant, criterion to consider is where the library believes responsibility for patron data and information-seeking behavior resides. notwithstanding contractual or licensing obligations taken on by library vendors, the risk of inadvertent exposure or intentional sharing of user interactions is always present. one advantage of building local systems to interact with vendor systems (link resolvers, discovery platforms, etc.) is that the vendor does not have access to the end user’s ip address or any other personally identifying information. the vendor only sees a request coming from the library’s application; all requests are equal and undifferentiated.
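the pattern described above, in which the vendor sees only the library’s application, might be sketched as follows. this is a minimal illustration under stated assumptions, not a description of any particular library’s or vendor’s system: the endpoint url, the list of allowed citation fields, and the function name are all hypothetical.

```python
from urllib.parse import urlencode

# Hypothetical vendor endpoint; not a real service.
VENDOR_BASE_URL = "https://vendor.example.org/resolve"

# Assumption: the vendor only needs these citation fields to resolve a request.
ALLOWED_PARAMS = {"genre", "issn", "isbn", "volume", "issue", "spage", "date", "atitle", "title"}


def build_vendor_request(patron_params, patron_headers, patron_ip):
    """Return the (url, headers) pair the library application sends to the vendor.

    patron_headers and patron_ip are accepted only to make the point that they
    are deliberately never copied into the outbound request.
    """
    # Keep only citation fields; drop session ids and anything else patron-specific.
    safe_params = {k: v for k, v in sorted(patron_params.items()) if k in ALLOWED_PARAMS}
    url = VENDOR_BASE_URL + "?" + urlencode(safe_params)
    # Every outbound request carries the same generic headers.
    headers = {"User-Agent": "library-resolver-proxy"}
    return url, headers


url, headers = build_vendor_request(
    {"issn": "0730-9295", "volume": "36", "session_id": "abc123"},
    {"Cookie": "patron=42", "X-Forwarded-For": "203.0.113.7"},
    "203.0.113.7",
)
```

because every outbound request is assembled from the same allow-list and the same fixed headers, two patrons asking about the same citation produce identical requests from the vendor’s point of view.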
of course, once users access the target item they are seeking (an online journal, database, etc.), that particular vendor’s site has access to that information. for libraries concerned about user privacy, the risk of exposure is somewhat mitigated by managing the discovery or access layer in-house (and deciding to maintain a level of user information that suits that particular library’s comfort level) and potentially minimizing the single point of failure for breaches. at the same time, such a decision puts more responsibility on the library or its parent information technology organization to protect data from exposure. some libraries feel they can handle this responsibility (either by careful protection of the data, or by not collecting and storing it in the first place) in a way that library vendors cannot. concluding thoughts making the buy-or-build decision is not straightforward; the criteria described here are not the only ones a library might wish to consider, but they are common ones with the greatest ramifications. putting the decision process into a framework can help a library make consistent decisions over time, enabling it to focus on the projects and systems that are most important to the library and its community (a campus, a town, or a company).

editorial board thoughts: information technology in libraries: anxiety and exhilaration mark cyzyk information technology and libraries | september 2015 6 a few weeks ago a valued colleague left our library to move his young family back home to pittsburgh. insofar as we were a two-man department, i spent the weeks following the announcement of his imminent departure picking his brain about various projects, their codebases, potential rough spots, existing trouble tickets, etc.
he left, and i immediately inherited nine-years-worth of projects and custom code including all the "micro-services" that feed into our various well-designed, high-profile, and high-performing (thanks to him) websites. this was all, naturally, anxiety-producing. almost immediately, things began to break. early on, a calendar embedded in a custom wordpress theme crucial to the functioning of two of our revenue-generating departments broke. the external vendor simply made the calendar we were screen-scraping disappear. poof, gone. i quickly created an ok-but-less-than-ideal workaround and we were back in business, at least for the time being. then, two days before the july 4 holiday, our calendar managers started reporting that our google-calendar-based system was disallowing a change to "closed" for that saturday. i somehow forced a closed notification, at least for our main library building, but no matter what any of us did we could not get such a notification to show up for a few of our other facilities. i spent quite a bit of time studying the custom middleware code that sits between our google calendars and our website, and could see where the magic was happening. i now think i know what to do -- and all i have to do is express it in that nutty programming language/platform that the kids are using these days, ruby on rails. i've never written a line of ruby in my life, but it's now or never. a little voice inside me keeps saying, "you're swimming in the deep end now -- paddle harder, and try not to sink."
while these surprise events were happening, we also switched source code management systems, so a migration was in order there, my longingly-awaited new workstation came in (and i'm sure you all know how painstaking it is to migrate all idiosyncratic data/apps/settings to a new workstation and ensure it's all present, functioning, and secure before dban-nuking your old drives), we decommissioned a central service that had been in mark cyzyk (mcyzyk@jhu.edu), a member of the ital editorial board, happily works and ages in the sheridan libraries, johns hopkins university, baltimore, maryland, usa. information technology in libraries: anxiety and exhilaration | cyzyk doi: 10.6017/ital.v34i3.8967 7 production since 2006, we fully upgraded our wordpress multisite including all plugins and themes, fixing what broke in the upgrade, and i got into the groove of working on any and all trouble tickets/change requests that spontaneously appeared, popping up like mushrooms in the verdant vale of my worklife. this was all largely in addition to my own job. so now i find myself surgically removing/stitching up code in recently-diseased custom wordpress themes, adding ruby code to a crucial piece of our website infrastructure, and learning as much as i can -- but quick -- about the wonderful and incredibly powerful bootstrap framework upon which most of our sites are built. surely it's anxiety-producing? you bet. but it's thrilling and exhilarating as well. i'm paddling hard, and so far my head remains above water. many days, i just can't wait to get to work and start paddling. this aging it guy suddenly feels ten years younger!
(but isn't all this paddling supposed to somehow result in a swimmer's body? patiently waiting...)

creating a current awareness service using yahoo! pipes and libguides elizabeth kiscaden information technology and libraries | december 2014 51 abstract migration from print to electronic journals brought an end to traditional current awareness services, which primarily used print routing. the emergence of really simple syndication, or rss feeds, and email alerting systems provided users with alternative services. to assist users with adopting these technologies, a service utilizing aggregate feeds to the library’s electronic journal content was created and made available through libguides. libraries can reestablish current awareness services using existing technologies to increase awareness and usage of library-provided electronic journal content. the current awareness service presented is an example of how libraries can build basic current awareness services utilizing freely accessible technologies. current awareness services library current awareness services, commonly referred to as “table of contents” services, historically involved the dissemination of information in the form of print journals or photocopied journal contents routed to library users subscribed to the service.1,2 these services have been particularly popular among corporate, law, and hospital libraries, which routinely route serials to primarily internal clients. while these paper-based services are still offered at some libraries, most shifted to an electronic model of service with the migration to electronic journals.
as libraries adopted electronic journals, many paper-based current awareness services transitioned to an electronic table of contents service utilizing email alerts or referred users to rss feeds made available by publishers and database vendors.3 a common challenge to a library-managed electronic table of contents service is the complexity of managing alerts for hundreds of electronic journals for multiple patrons. more often, libraries make individual users responsible for subscribing to email alerts or rss feeds on their own, effectively transferring the responsibility of subscribing to, filtering, and managing incoming information to the user. a drawback to this migration is that library users often don’t possess a clear understanding of what tools are available to create their own service.4 formerly, journals may have arrived on a user’s desk for perusal, yet now users are required to seek out information independently. additionally, despite the number of discovery tools available, library users are often unaware of journals available in an electronic format through their library.5 information management tools have become necessary in our current information environment; with the abundance of elizabeth kiscaden (elizabeth-kiscaden@uiowa.edu), former library director at waldorf college, is head, library services, hardin library for the health sciences, university of iowa, iowa city. creating a current awareness service using yahoo! pipes and libguides | kiscaden 52 information available, keeping up-to-date with new information in a discipline can be overwhelming.
therein exists an opportunity for libraries (academic, special, and public) to revitalize current awareness services and build information management tools using aggregate feeds. design and description of the service at waldorf college, the luise v. hanson library created a current awareness service utilizing rss feeds, with the intent to assist faculty with keeping up-to-date with newly published content in the library’s electronic journal collection. the service, dubbed info sos, was designed to overcome two barriers to patron participation in feed services: the chore of subscribing to and curating multiple feeds and the lack of awareness of feeds and feed reader technology. info sos was piloted to faculty during the spring of 2014 and was accompanied by an informal questionnaire to collect feedback. info sos is built on rss, or “really simple syndication” technology, one of the most prevalent tools for keeping current with new information published electronically. rss has been available for more than a decade,6 and many users, both patrons and library professionals, are using this technology. however, while powerful and freely accessible, rss feeds have their limitations. subscribing to and curating multiple feeds can become a burden. to eliminate the chore of managing multiple feeds, info sos displays feed aggregates created using yahoo! pipes (http://pipes.yahoo.com/pipes/). aggregate feeds, or feeds comprising multiple rss feeds, can be created using many tools available freely online, such as feed stitch, feed informer, feedburner, and more. yahoo!
pipes was chosen for this service primarily because it requires limited coding knowledge,7 yet the software provides a number of advanced functions for sorting and combining large groups of feeds. these advanced features became essential when building aggregate feeds for content from journal aggregators. yahoo! pipes requires a user account (free of charge) before constructing pipes. the software combines and sorts information using a visual editor that resembles virtual plumbing, which is presumably why the software is called pipes. to construct the aggregate feeds composing info sos, librarians used the fetch feed operator to combine individual rss feeds into a single feed. once combined, the service uses the sort operator, which sorts the aggregated content by date. from the sort operator, the content is connected to the pipe output, from which a single rss feed is generated. the strength of yahoo! pipes lies in the advanced tools available for manipulating feed content. for example, pipes sorts feed content from database vendors by the date it is published to the feed, not the publication date of the article. if desired, aggregate feed creators can use the rename and regex operators to remove the article publication date from the description field and use it to sort the feed content. another useful tool is the union operator, which allows creators to string together larger bundles of feeds.
figure 1. fetch feed and sort operator in yahoo! pipes
figure 2. image of yahoo! pipe using advanced tools
lack of awareness is a barrier to user adoption of rss feeds; many users have an unclear understanding of what an rss feed is. if users are unfamiliar with rss feeds, it is safe to assume that they are unfamiliar with rss reader technology as well. at waldorf college, this was confirmed by the questionnaire distributed during the pilot of this service. of the twenty-eight faculty respondents, more than 70 percent had never used an rss feed before using info sos. it is safe to assume that these faculty would not have a subscription to a feed reader. recognizing the need for an interface to deliver content, librarians used the libguides software to display content from these aggregate feeds. the software contains a tool for adding feed content, and allows for the application of an institution’s proxy prefix to the url, creating seamless access on and off campus. the info sos resource contains tabbed pages designated for individual fields (biology, psychology, library sciences, etc.) displaying aggregate feeds for journals in each subject area. for example, the physics page contains aggregate feeds for new articles published in the library’s full-text physics journals, as displayed in the figure below.
figure 3. aggregate physics feeds in libguides
user feedback info sos remains a relatively new service to library users at the luise v. hanson library, but preliminary feedback has been positive. the service was advertised to faculty via email and accompanied by a feedback survey created using google forms.
as stated previously, librarians received twenty-eight responses to the survey, a relatively strong response considering the limited number of faculty at the college. of the respondents, more than 70 percent had never used an rss feed previously, instead using a variety of other tools to stay current with their field. of those other tools, 18 percent of faculty subscribed to table of contents alerts, 27 percent browsed new issues of print journals, 25 percent visited association websites, and 23 percent conducted periodic searches for information in the library databases. it was of some concern that of these tools, only faculty using databases and subscribing to table of contents alerts would be connecting with the library’s electronic journal collection. when presented with info sos and asked whether faculty would find this tool useful, more than 70 percent responded that they would. faculty were solicited for suggestions for improving the resource, and librarians received many suggestions for expanding the content. this feedback was valuable in that it provided justification for continuing the service beyond the pilot and a list of potential subject areas to begin expanding the service. the intended outcome of the service is to assist faculty in keeping current with literature in their field and utilizing the library’s resources in the process. limitations and challenges generating feeds from popular library databases, such as ebscohost and proquest, is limited in that the publication dates for articles are contained in the description field. this can make the sort operator in yahoo!
pipes somewhat inaccurate because it sorts items by the date they were published to the feed, not by the actual publication date of the journal article. if necessary, this issue can be corrected using the rename and regex operators by copying the publication date out of the item description and sorting on it. an additional challenge regarding vendor-created feeds relates to the issue of expiring feeds created from library databases. a library profile was required for each database, such as ebscohost or proquest, to create and save feeds. this allows for the renewal of expiring feeds; the email account attached to the profile receives an invitation to renew expiring feeds. most vendors allow for feeds to be created at the database without a profile, but those feeds will automatically expire if not used within a period of time. the potential of feeds expiring may add an element of maintenance to the current awareness service. future developments yahoo! pipes offers the unique ability to publish pipes that others may share and “clone.” for libraries interested in creating aggregate feeds for popular ebscohost journals, the pipes created for info sos are available to clone at http://pipes.yahoo.com/infosos. a search of published pipes available in yahoo! pipes reveals pipes created by many public and academic libraries, all of which are available to clone and edit. the ability to share pipes with other institutions introduces the possibility of current awareness services shared between library consortia or associations.
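the combine, date-extraction, and sort steps described in the design and limitations sections can be approximated outside yahoo! pipes. the following is a minimal sketch, not the info sos pipes themselves: the item fields (title, description, feed_date) and the assumption that the description embeds the publication date as yyyy-mm-dd are hypothetical, and real vendor feeds may format dates differently.

```python
import re
from datetime import date

# Assumption: the description embeds the publication date as YYYY-MM-DD.
DATE_PATTERN = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def publication_date(item):
    """Pull the publication date out of the description (the regex step),
    falling back to the date the item was published to the feed."""
    match = DATE_PATTERN.search(item["description"])
    if match:
        year, month, day = (int(part) for part in match.groups())
        return date(year, month, day)
    return item["feed_date"]


def aggregate(feeds):
    """Combine several feeds into one list (the fetch feed / union step)
    and order it newest publication date first (the sort step)."""
    combined = [item for feed in feeds for item in feed]
    return sorted(combined, key=publication_date, reverse=True)


# Made-up sample items standing in for two journal feeds.
physics = [
    {"title": "physics article", "description": "vol. 12, published 2014-01-10",
     "feed_date": date(2014, 3, 1)},
]
biology = [
    {"title": "biology article", "description": "published 2014-02-20",
     "feed_date": date(2014, 2, 25)},
    {"title": "undated article", "description": "no date given",
     "feed_date": date(2014, 1, 5)},
]
titles = [item["title"] for item in aggregate([physics, biology])]
```

in this sample the physics item reached its feed last but was published first, so sorting on the extracted date places it after the biology item, which is the correction the limitations section describes.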
as information becomes more abundant, the need for tools and services to manage incoming information will grow correspondingly. creating and sharing services that utilize technology common to libraries presents us with the opportunity to collaborate with one another and revitalize library-engineered current awareness services. these services offer a value that is twofold: library users benefit from the ability to stay current with publications in their field, and libraries have the potential of increased usage of their purchased content. with no financial investment, an aggregate feed-based service is one that a variety of libraries can implement with the investment of only limited personnel time.
references
1. g. mahesh and dinesh kumar gupta, “changing paradigm in journals based current awareness services in libraries,” information services & use 28, no. 1 (2008): 59–65, http://dx.doi.org/10.3233/isu-2008-0555.
2. stephen m. johnson, andrew osmond, and rebecca j. holz, “developing a current awareness service using really simple syndication (rss),” journal of the medical library association 97, no. 1 (2009): 51–53, http://dx.doi.org/10.3163/1536-5050.97.1.011.
3. mahesh and gupta, “changing paradigm in journals based current awareness services in libraries.”
4. m. kathleen kern and cuiying mu, “the impact of new technologies on current awareness tools in academic libraries,” reference & user services quarterly 51, no. 2 (2011): 92–97.
5. sandra j. weingart and janet a. anderson, “when questions are answers: using a survey to achieve faculty awareness of the library’s electronic resources,” college & research libraries 61, no.
2 (2000): 127–34, http://dx.doi.org/10.5860/crl.61.2.127.
6. jim doree, “rss: a brief introduction,” journal of manual & manipulative therapy 15, no. 1 (2007): 57–58.
7. bill dyszel, “create no-code mashups with yahoo! pipes,” pc magazine 26, no. 21/22 (2007): 103–5.

58 book reviews descriptive cataloguing; a student's introduction to the anglo-american cataloguing rules 1967. by james a. tait and douglas anderson. second ed.; rev. and enl. hamden, conn.: linnet books, 1971, 122p. $5.00 this second edition contains some corrections to the errors made in the 1968 edition, and includes the changes and clarifications brought out by the aacr amendment bulletin. the number of exemplary title pages has been increased from twenty-five to forty, thus giving the student more practice in determining entries and doing descriptive cataloging. this reviewer believes that a more exact title would be "descriptive cataloging and determining entries and headings," because this introductory text not only covers descriptive cataloging as defined and explained in "part ii: descriptive cataloging" of the anglo-american cataloguing rules, but also includes some of the basic rules for determining entries and headings in aacr's "part i: entry and heading." there are three distinct sections: descriptive cataloging; determining entries and headings; and facsimile title pages for student practice. descriptive cataloging is covered in just thirteen pages, but all the basic elements are there. the explanations are clear and examples are shown, but not in the context of a full card. (unfortunately only one full catalog card is illustrated in the entire book.) it is in this section, more than in any other, where the differences between british and american cataloging become obvious.
british descriptive cataloging varies in so many ways from its american counterpart that a beginning student in an american library school would be quite confused by these variations. the next section consists of twenty-five pages and is devoted to the basic rules on entries and headings. examples are used to illustrate the rules and the authors point out some differences between the british and american texts of the aacr. the remaining seventy pages contain the forty reproduced title pages which are followed by some commentary and a key corresponding to each title page. these title pages give the student a wide range of experience in transcribing the proper information onto the card and in determining main and added entries. even though this book is an excellent introduction to the rudiments of descriptive cataloging and the determination of main and added entries, its use of british descriptive cataloging precludes its being widely adopted in beginning cataloging courses in american library schools. donald j. lehnus

centralized processing for academic libraries. by richard m. dougherty and joan m. maier. metuchen, n.j.: scarecrow press, 1971. 254p. $10.00 this is the final report of the colorado academic libraries book processing center (calbpc) two-part study investigating centralized processing. phase i, reported by laurence leonard, maier, and dougherty in centralized book processing, scarecrow, 1969, was basically a feasibility study, whereas this final report describes the beginning six months of operations that tested the phase i recommendations. partially funded by the national science foundation, the experiment measured anticipated time and cost savings, monitored acquisitions and cataloging operations, and tested product acceptability for six libraries participating in the six-month 1969 study.
even though centralized book processing might hold little appeal for the reader, this volume nonetheless is valuable to technical service heads because of its above average sophistication in applying a systems analysis approach to technical services problems. the authors objectively report their findings, outlining in detail the mistakes, the unanticipated problem areas, and what they believed to be the successes. from the start the authors encountered problems with scheduling. by the time the experiment began most participants had a large portion of their book money encumbered, and the center was forced to accept cataloging arrearages in addition to book order requests. those who did send in orders did not conform to patterns predicted in phase i. instead, the center was used as a source of obtaining more difficult materials, including foreign language items. it was discovered that in actual practice calbpc had no impact on discounts received from vendors. the vendor performance study lacked relevancy because it was based upon the date invoices were cleared for payment rather than the date books were received in house. in evaluating the total processing time, four libraries reduced their time lag by participating in the center's centralized processing, and the cost of processing the average book was reduced from $3.10 to $2.63. the product acceptance study showed that the physical processing was only partially accepted, with most of the libraries modifying a truncated title that was printed on the book card and book pocket as a by-product of the automated financial subsystem. other local modifications were made on books processed by the center, but neither that cost nor local error-correction costs were reported in the study. calbpc's automated financial subsystem was besieged with many problems resulting from lack of programming foresight and adequate consulting 60 journal of library automation vol. 5/1 march, 1972 by those who had previously designed such systems.
individuals interested in the automation of acquisitions should read this section of the report. calbpc's problems were typically those of building exceptions to exceptions in order to accommodate unanticipated program omissions. simply not recognizing that books could be processed before invoices were paid caused delays and bottlenecks of such magnitude that procedures had to be devised to circumvent requirements of the automated subsystem. many recommendations were particularly relevant to cooperative ventures. in formulating processing specifications such as call number format and abbreviation standardization, calbpc had not anticipated the infinite local variations they would have to accommodate. they quickly recognized the need for both greater quality control to minimize errors within the system and better communications and educational programs for participants. a recurring message was that librarians emphasized the esthetics of catalog cards rather than the content; thus a recommendation was made to investigate whether a positive correlation exists between the esthetics of the product and the quality of the library service. the authors emphasized that a cooperative program depends more upon competencies and willingness of individuals than the technical aspects of the operations. some diversification of services was called for but no mention was made of the possibilities of an on-line system. it was felt that in future operations the center should accept orders for out-of-print and audiovisual materials. those libraries participating in approval programs had received no benefit by having books sent first to the center; thus it was suggested that the center forward those libraries a bibliographic packet only and that the approval books bypass the center.
this well-documented study, half of which is devoted to charts and appendix materials, concluded its recommendations with a positive evaluation of the service the center had performed and suggested that public and school libraries should also be participants. ann allan

2 information technology and libraries | march 2009 andrew k. pace president’s message: lita now andrew k. pace (pacea@oclc.org) is lita president 2008/2009 and executive director, networked library services at oclc inc. in dublin, ohio. at the time of this writing, my term as lita president is half over; by the time of publication, i will be in the home stretch, a phrase that, to me, always connotes relief and satisfaction that is never truly realized. i hope that this time between ala conferences is a time of reflection for the lita board, committees, interest groups, and the membership at large. various strategic planning sessions are, i hope, leading us down a path of renewal and regeneration of the division. of course, the world around us will have its effect, in particular a political and economic effect. first, the politics. i was asked recently to give my opinion about where the new administration should focus its attention regarding library technology. i had very little time to think of a pithy answer to this question, so i answered with my gut that the united states needs to continue its investment in it infrastructure so that we are on par with other industrialized nations while also lending its aid to countries that are lagging behind. furthermore, i thought it an apt time to redress issues of data privacy and retention. the latter is often far from our minds in a world more connected, increasingly through wireless technology, and with a user base that, as one privacy expert put it, would happily trade a dna sample for an extra value meal. i will resist the urge to write at greater length a treatise on the bill of rights and its status in 2008.
i will hope, however, that lita’s technology and access and legislation and regulation committees will feel reinvigorated post–election and post–inauguration to look carefully at the issues of it policy. our penchant for new tools should always be guided and tempered by the implementation and support of policies that rationalize their use. as for the economy, it is our new backdrop. one anecdotal view of this is the number of e-mails i’ve received from committee appointees apologizing that they will not be able to attend ala conferences as planned because of the economic downturn and local cuts to library budgets. libraries themselves are in a paradoxical situation—increasing demand for the free services that libraries offer while simultaneously facing massive budget cuts that support the very collections and programs people are demanding. what can we do? well, i would suggest that we look at library technology through a lens of efficiency and cost savings, not just from a perspective of what is cool or trendy. when it comes to running systems, we need to keep our focus on end-user satisfaction while considering total cost of ownership. and if i may be selfish for a moment, i hope that we will not abandon our professional networks and volunteer activities. while we all make sacrifices of time, money, and talent to support our profession, it is often tempting when economic times are hard to isolate ourselves from the professional networks that sustain us in times of plenty. politics and economics? though i often enjoy being cynical, i also try to make lemonade from lemons whenever i can. i think there are opportunities for libraries to get their own economic bailout in supporting public works and emphasizing our role in contributing to the public good. 
we should turn our “woe-are-we” tendencies that decry budget cuts and low salaries into championed stories of “what libraries have done for you lately.” and we should go back to the roots of it, no matter how mythical or anachronistic, and think about what we can do technically to improve systemwide efficiencies. i encourage the membership to stay involved and reengage, whether through direct participation in lita activities or through a closer following of the activities in the ala office of information technology policy (oitp, www.ala.org/ala/aboutala/offices/oitp) and the ala washington office itself. there is much to follow in the world that affects our profession, and so many are doing the heavy lifting for us. all we need to do sometimes is pay attention. make fun of me if you want for stealing a campaign phrase from richard nixon, but i kept coming back to it in my head. in short: library information technology, now more than ever.

a computer system for effective management of a medical library network

richard e. nance and w. kenneth wickham: computer science/operations research center, institute of technology, southern methodist university, dallas, texas, and maryann duggan: systems analyst, south central regional medical library program, dallas, texas

trips (talon reporting and information processing system) is an interactive software system for generating reports to nlm on regional medical library network activity and constitutes a vital part of a network management information system (nemis) for the south central regional medical library program. implemented on a pdp-10/sru 1108 interfaced system, trips accepts paper tape input describing network transactions and generates output statistics on disposition of requests, elapsed time for completing filled requests, time to clear unfilled requests, arrival time distribution of requests by day of month, and various other measures of activity and/or performance.
emphasized in the trips design are flexibility, extensibility, and system integrity. processing costs, neglecting preparation of input which may be accomplished in several ways, are estimated at $.05 per transaction, a transaction being the transmittal of a message from one library to another.

journal of library automation vol. 4/4 december, 1971

introduction

the talon (texas, arkansas, louisiana, oklahoma, and new mexico) regional medical library program is one of twelve regional programs established by the medical library assistance act of 1965. the regional programs form an intermediate link in a national biomedical information network with the national library of medicine (nlm) at the apex. unlike most of the regional programs that formed around a single library, talon evolved as a consortium of eleven large medical resource libraries with administrative headquarters in dallas. a major focus of the talon program is the maintenance of a document delivery service, created in march 1970, to enable rapid access to published medical information. twx units located in ten of the resource libraries and at talon headquarters in dallas comprise the major communication channel.

in july 1970 a joint program was initiated to develop a statistical reporting system for the talon document delivery network. design and development of the system were done by the computer science/operations research center at southern methodist university, while training and operational procedures were developed by talon personnel. both parties in the effort view the statistical reporting system as a vital first step in providing talon administrators with a comprehensive network management information system (nemis). an overview of this statistical reporting system, designated as trips (talon reporting and information processing system), and its relation to nemis are discussed in the following paragraphs. the objectives and design characteristics of nemis are stated in (1).
design requirements

there were two considerations for requirements for a network management information system (nemis) for talon: 1) in what environment would talon function? 2) what should be the objectives of a network management information system and what part does a statistical reporting system play in its development? the talon staff and the design team spent an intensive period in joint discussion of these two questions.

talon environment

the talon document delivery network operates in an expansive geographical area (figure 1). the decentralized structure of the network enables information transfer between any two resource libraries. in addition talon headquarters serves as a switching center, by accepting loan requests, locating documents, and relaying requests to holding libraries. a requirement placed on talon by nlm is the submission of monthly, quarterly, and annual reports giving statistical data on network activity. these statistics provide details on: 1) requests received by channel used (mail, telephone, twx, other), 2) disposition of requests (rejected, accepted and filled, accepted and unfilled), 3) response time for filled requests, 4) response time for unfilled requests, 5) most frequent user libraries, 6) requests received from each of the other regions, and 7) non-medlars reference inquiries.

fig. 1. location of the eleven resource libraries and talon headquarters.

monthly reports require cumulative statistics on year-to-date performance, and each of the eleven resource libraries and talon headquarters is required to submit a report on its activity.

needs and objectives

while the immediate need of the talon network was to develop a system to eliminate manual preparation of nlm reports, an initial decision was made to develop software also capable of assisting talon management in policy and decision making.
the eventual need for a network management information system (nemis) being recognized, the talon reporting and information processing system (trips) was designed as the first step in the creation of nemis. provision of information in a form suitable for analytical studies of policy and decision making (e.g., the message distribution problem described by nance (2)) placed some stringent requirements on trips. for instance, the identification of primitive data elements could not be made from report considerations only; an overall decision had to be made that no sub-item of information would ever be required for a data element. in addition the system demanded flexibility and extensibility, since it was to operate in a highly dynamic environment. these characteristics are quite apparent in the design of trips.

trips design

trips is viewed as a system consisting of hardware and software components. the description of this system considers: 1) the input, 2) the software subsystems (set of programs), 3) hardware components, and 4) the output. emphasis is placed on providing an overview, and no effort is made to give a detailed description. the environment in which trips is to operate is defined in a single file (for25.dat). this file assigns network parameters, e.g., number of reporting libraries, library codes, and library titles. the file is accessed by subprograms written in fortran iv and dystal (3), the latter being a set of fortran iv subprograms, termed dystal functions, that perform primitive list processing and dynamic storage allocation operations. because it requires only fortran iv, trips can be implemented easily on most computers.

input

a transaction log, maintained by each regional library and talon headquarters, constitutes the basic input to trips. copies of log sheets are used to create a paper tape description of the transactions.
if and when compatibility is achieved between standard twx units and telephone entry to computer systems, the input could be entered directly by each regional library. (this is technically possible at present.) currently, talon headquarters is converting the transaction descriptions to machine readable form. initial data entry under normal circumstances is pictured in figure 2, which shows the sequence of operations and file accesses in two phases: 1) data entry and 2) report generation. data entry in turn comprises 1) collecting statistics, 2) diagnosis and verification of input data and 3) backup of original verified input data. trips is designed to be extremely sensitive to input data. all data is subjected to an error analysis, and a specific file (for22.dat) is used to collect errors detected or diagnosed in the error analysis routine. only verified data records are transmitted to the statistical accumulation file (for20.dat).

fig. 2. trips structure (statistical collection; report generation; reimbursable statistics report; non-reimbursable statistics report).

software subsystems

trips comprises seven subsystems or modules. within each module are several fortran iv subprograms, dystal functions and/or pdp-10 systems programs discussed under hardware components in the following section:

newy: run at the beginning of each year, newy builds an in-core data structure and transfers it to disk for each resource library in the network. it further creates the original data backup disk file (for23.dat). after disk formatting, record (the accessing and storage module) may be activated to begin accumulating statistics for the new year.
newq: run between quarters, newq purges the past quarter statistics for each library and prepares file for23.dat for the next quarter. the report for the quarter must be generated before newq is executed.
newm: run between months, newm purges the monthly statistics for each regional library and prepares file for23.dat for the backing up of next month's data.
dumpl: the utility module causes a dystal dump of the data base.
record: the accessing and storage module record incorporates the error diagnosis on input and the entry of validated data records into file for23.dat. no data record with an indicated error is permitted, and erroneous records are flagged for exception reporting. the error report (ermes.dat) may be printed on the teletype or line printer after execution of record.
report: the reporting module report generates all reimbursable statistics on a month-to-date, quarter-to-date, and year-to-date basis.
manage: utilization of trips as a network management tool is afforded by manage, which combines statistics from reimbursable and non-reimbursable transactions to generate a report providing measures of total network activity and performance.

the primary files used by the software subsystems are described briefly in table 1.

table 1. primary files in trips

file name | function of the file | comments | file type
for25.dat | contains the system definition parameters and initialization values. | created from card input to assure proper format. | ascii
for20.dat | statistical accumulation for validated data records. | two parts: (1) input translator data structure, and (2) statistical data base. | binary
for21.dat | generation of reports from information in for20.dat. | carriage control characters must be included to generate reports. | ascii
for22.dat | collects data records diagnosed as in error. | errors accumulated in for22.dat are transmitted to ermes.dat for output. | ascii
for23.dat | enables creation and updating of the backup magnetic tape. | each month's validated records added to tape. | ascii
for24.dat | enables recovery read of backup tape. | tape information stored prior to transfer of file information to for20.dat. | binary
ermes.dat | serves to output messages on data records diagnosed as in error. | if 6 or fewer errors occur, ermes is not created and messages are output to teletype. if more than 6 errors, an estimate of typing time is given to the user, who has the option of printing them on the teletype or in a report form on the line printer. | ascii

a major concern in any management information system is the system integrity. in addition to the diagnosis of input data, trips concatenates sequential copies of disk file for23.dat to provide a magnetic tape backup containing all valid data records for the current year. a failsafe tape, containing all trips programs, is also maintained.

hardware components

conversion of transaction information to machine readable form is done off line currently. using a standard twx with ascii code, paper tapes are created and spliced together. fed through a paper tape reader to a pdp-10 (digital equipment corporation), the input data is submitted to trips. control of trips is interactive, with the user monitoring program execution from a teletype. all file operations are accomplished using the pdp-10 via the teletype, and the output reports are created on a high-speed line printer. with smu's pdp-10 and sru 1108 interface, report generation can be done on line printers at remote terminals to the sru 1108 as well.

output

trips output consists of a report for each library in the network and a composite for the entire network. the report may be limited to reimbursable statistics or include all statistics. information includes: 1) errors encountered in the input phase, 2) number of requests received by channel, 3) disposition of requests (i.e., rejected, accepted/filled, accepted/unfilled, etc.
), 4) elapsed time for completing filled requests or clearing unfilled requests, 5) geographic origin of requests, 6) titles for which no holdings were located within the region, 7) types of requesting institutions, 8) arrival time distribution of requests by day of month, 9) invoice for reimbursement by talon, 10) node/network dependency coefficient as described by (4).

summary

trips is now entering its operational phase. training of personnel at the resource libraries is concluded, and data on transactions are being entered into the system. input errors have decreased significantly (from fifteen or twenty percent to approximately two percent). talon personnel are enthusiastic, and needless to say the regional library staffs are happy to see a bothersome, time-consuming manual task eliminated. in summary, the following characteristics of trips deserve repeating:

1) with its modular construction, it is flexible and extensible.
2) implemented in dystal and fortran iv, it should allow installation on most computers without major modifications.
3) designed to operate in an interactive environment, it can be modified easily to function in a batch processing environment.
4) trips is extremely sensitive to system integrity, providing diagnosis of input data, reporting of errors, magnetic tape backup of data files, and a system failsafe tape.
5) definition of primitive data elements and the structural design of trips enable it to serve as the nucleus of a network management information system (nemis) as well as to generate reports required by nlm.
6) currently accepting paper tape as the input medium, trips could be modified easily to accept punched card input and with more extensive changes could derive the input information during the message transfer among libraries.
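as a rough illustration of the input diagnosis and error reporting summarized in item 4 above, the following python sketch routes each record either to an accepted list or to an error list. it is not the trips implementation, which was written in fortran iv and dystal. the record fields, validation rules, library codes, and function names are all hypothetical; the only details taken from the article are the channel and disposition categories and the threshold of six errors for teletype output noted in table 1 (counting individual error messages rather than records is also an assumption).

```python
from dataclasses import dataclass

# Category names follow the article; the exact spellings are assumptions.
CHANNELS = {"mail", "telephone", "twx", "other"}
DISPOSITIONS = {"rejected", "filled", "unfilled"}
TELETYPE_LIMIT = 6  # table 1: 6 or fewer errors are output directly to the teletype


@dataclass
class Transaction:
    library: str       # hypothetical library code
    channel: str       # how the request arrived
    disposition: str   # outcome of the request
    days_elapsed: int  # hypothetical field: days to fill or clear the request


def diagnose(t, known_libraries):
    """Return a list of error messages; an empty list means the record is valid."""
    errors = []
    if t.library not in known_libraries:
        errors.append(f"unknown library code {t.library!r}")
    if t.channel not in CHANNELS:
        errors.append(f"unknown channel {t.channel!r}")
    if t.disposition not in DISPOSITIONS:
        errors.append(f"unknown disposition {t.disposition!r}")
    if t.days_elapsed < 0:
        errors.append("negative elapsed time")
    return errors


def record(transactions, known_libraries):
    """Split input into validated records and (record, errors) pairs."""
    accepted, rejected = [], []
    for t in transactions:
        errors = diagnose(t, known_libraries)
        if errors:
            rejected.append((t, errors))  # flagged for exception reporting
        else:
            accepted.append(t)            # only validated records are accumulated
    return accepted, rejected


def error_output_device(rejected):
    """Pick where error messages go, using the six-error threshold."""
    count = sum(len(errors) for _, errors in rejected)
    if count <= TELETYPE_LIMIT:
        return "teletype"
    return "user choice: teletype or line printer"
```

a caller would pass the parsed transaction log and the set of known library codes to `record`, accumulate statistics only from the accepted list, and hand the rejected list to `error_output_device` to decide how the messages are presented.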
finally, the processing cost of operating trips, neglecting the conversion to paper tape, is estimated to be $.05 per transaction (a message transfer from one library to another). extensive and thorough documentation of trips has been provided. availability of this documentation is under review by the funding agency.

acknowledgment

work described in this article was done under contract hew phs 1 g04 lm 00785-01, administered by the south central regional medical library program of the national library of medicine. the authors express their appreciation to dr. u. narayan bhat and dr. donald d. hendricks for their contributions to this work.

references

1. "nemis - a network management information system," status report of the south central regional medical library program, october 26, 1970.
2. nance, richard e.: "an analytical model of a library network," journal of the american society for information science, 21 (jan.-feb. 1970), 58-66.
3. sakoda, james m.: dystal: dynamic storage allocation language manual (providence, r.i.: brown university, 1965).
4. duggan, maryann: "library network analysis and planning (libnat)," journal of library automation, 2 (1969), 157-175.
president’s message andromeda yelton information technology and libraries | june 2018 3 https://doi.org/10.6017/ital.v37i2.10493
andromeda yelton (andromeda.yelton@gmail.com) is lita president 2017-18 and senior software engineer, mit libraries, cambridge, massachusetts.
as i started planning this column, i looked back over my other columns for the year and discovered that they have a theme: the connection that runs from our past, through our present, and into our future. in my first column, i talked about the first issues of ital: henriette avram founding marc right here in these pages. early lita hackers cobbling together the technologies of their age to make streamlined, inventive library services, just as lita members do today. in my second column, i talked about conferences where we come together today (lita forum 2017 and 2018) and encounter the issues of today (data for black lives). i can close my eyes and i’m in denver, chatting with long-time colleagues and first-time presenters ... or i’m at the mit media lab, watching algorithmic opportunity and injustice spar with one another, while artists and poets point us toward the wakandan imaginary. and in my third column, i talked about the possibility of lita, llama, and alcts coming together to form a new division: a potential future.
this possibility both knocked my world off its axis and let me see it in a new light; i didn’t imagine that i’d spend my presidency exploring the options for large-scale organizational transformation, and yet i can see how this route could not only address challenges all three divisions face, but also give us opportunities to be stronger together. i believe in this roadmap, but i also want us all to grapple with the question of identity. what’s peripheral, and what’s central, to who we are as library technologists? what’s ephemeral, and what endures? what’s the through line we can hold on to, across that past and present, and carry with us into the future?
today, here in the present, i’m preparing to turn over my piece of that line to president-elect bohyun kim. she has been unfailingly brilliant and diligent in the years i’ve known her, and i know she’ll ask insightful questions, advocate for all that’s best in lita and its people, and get things done. but i’m also cognizant that it was never really my line; it was yours. i had the immense privilege of carrying it for a while, but as we hear every time we survey our members, the best part of lita is the networking: it’s you. we will have many chances to discuss our through line in the months to come and i urge you to bring your voices to the table: ask your questions, tell us what matters, and depict your imaginaries.
314 news and announcements
first use of catvlib network: american red cross satellite telecast
on may 21, 1981, the american red cross celebrated its one-hundredth birthday by ending its annual conference in washington, d.c., with a special two-hour nationwide satellite telecast. the pssc coordinated distribution of the telecast, which originated from constitution hall in washington, d.c., from 10 a.m. to noon.
the program was carried on satcom i, transponder 16 (appalachian community service network), and made available to all cable systems able to receive this transponder. those areas not able to schedule the live program were offered a satellite-transmitted taped feed later in the day. the american red cross had encouraged all its local chapters to initiate program reception in their communities by approaching the local cable system about carrying the event. since the american red cross was offering a free program and trying to reach as much of the united states as possible, use of the catvlib network in conjunction with this telecast was appropriate. pssc contacted 53 libraries in 23 states that were interested in assuming local coordination for bringing this event to their communities. as local coordinators, the catvlibs had minimum responsibilities that included alerting the cable systems to schedule receiving this program (if the local red cross chapter had not already approached the catv) and contacting the local red cross chapter to offer the catvlibs' facilities for their group viewing and concomitant local celebration. of these fifty-three catvlibs, only seven could not participate because of technical problems. schedule conflicts; lack of catv, red cross, or community interest; and red cross alternative plans were the major factors that kept twelve others from directly hosting the satellite-transmitted program. the remaining thirty-four catvlibs did host community residents in their facilities. evaluation forms revealed varying degrees of catvlib participation in coordinating their first satellite event. several catvlibs, though no one came to the library for viewing, were instrumental in getting the program into the community and available to all local cable subscribers.
advance publicity, birthday cakes and refreshments, sing-alongs, taping for multiple showings, and joint library/chapter pre- and postevent activities are but a few of the ways the individual catvlibs participated. all of the evaluation forms indicated that the catvlibs wanted to be contacted as a potential local site for future satellite events. the following list names the fifty-three catvlibs that were initially contacted to be local coordinators for the red cross one-hundredth birthday satellite telecast. though not all were successful, each catvlib made an effort to bring the program to its community.
colorado: boulder public library, boulder
connecticut: thomaston public library, thomaston
florida: tarpon springs public library, tarpon springs
georgia: tri-county regional library, rome
idaho: pocatello public library, pocatello
illinois: pekin public library, pekin; rockford public library, rockford
indiana: fort wayne public library, fort wayne; monroe county public library, bloomington
iowa: kirkwood community college telecommunications center, cedar rapids; iowa city public library, iowa city
kansas: abilene public library, abilene; newton public library, newton
kentucky: lexington public library, lexington; louisville public library, louisville; camden-carroll library, morehead state university, morehead
massachusetts: greenfield community college library, greenfield; south hadley library system, south hadley
minnesota: anoka county library, fridley; cloquet public library, cloquet; crow river regional library, willmar; international falls public library, international falls; minnesota valley regional library, mankato; marshall-lyon county library system, marshall; western plains library system, montevideo; rochester public library, rochester; st. cloud public library, st. cloud
missouri: st. charles city county library, st. peters
new jersey: burlington county college library, pemberton
new york: albany public library, albany; amherst public library, williamsville; bethlehem public library, delmar; chautauqua-cattaraugus library system, jamestown; gates public library, rochester; mid-york library system, utica; ridge road elementary school library, horseheads
north carolina: davidson county community college library, lexington
ohio: greene county district library, xenia; public library of columbus and franklin county, columbus; university of toledo library, toledo
pennsylvania: altoona area public library, altoona; lancaster county library, lancaster; monroeville public library, monroeville
tennessee: memphis/shelby county public library & information center, memphis
utah: merrill library and learning resources program, utah state university, logan; weber county library, ogden
virginia: arlington county department of libraries, arlington
washington: edmonds community college library, lynnwood; lynnwood public library, lynnwood; mountlake terrace public library, mountlake terrace; seattle public library, seattle
wisconsin: middleton public library, middleton; nicolet college learning resource center, rhinelander
who's who and what's what in library video and cable
for librarians interested in who is doing what in video in libraries, or in how to do it themselves, a guidebook has been published by the video and cable communications section of the library and information technology association. it is the 461-page video and cable guidelines. edited by leslie chamberlin burk and roberto esteves, two of the most active librarians in the video field, the book includes papers from donald sager, kandy brandt, arlene farber sirkin, anne hollingsworth, and by burk and esteves. among the topics covered are a description of the present operation, future plans, problems, and benefits of video in 250 libraries in the u.s. and canada.
the book is spiral-bound and can be used conveniently as a manual for staff development programs. its price is $9.75. for additional information, or to order copies (prepaid orders only, please), contact lita, ala, 50 e. huron st., chicago, il 60611; (312) 944-6780.
316 journal of library automation vol. 14/4 december 1981
elmig electronic mail arrives
the "new arrival" to the library association family this summer is the electronic library membership initiative group. elmig is an organization of individuals established to ensure that electronically delivered information remains accessible to the general public. elmig promotes participation and leadership in the remote electronic delivery of information by publicly supported libraries and nonprofit organizations. the group's efforts are coordinated by richard sweeney, director of the public library of columbus and franklin county; neal kaske, director of oclc's office of research; and kenneth dowlin, director of the pikes peak library district. the founding goals of elmig are:
• identifying services and information best suited for the remote electronic access to and delivery of information;
• planning, funding, and developing working demonstrations of library electronic information services;
• communicating the availability of electronic library services to the community;
• informing the library profession of trends, specific events, and future directions in remote electronic delivery of information;
• creating coalitions with organizations in allied fields of interest.
organizers of elmig are working within ala to foster interest in, and facilitate the needs of, the electronic library. ala has established a membership initiative group to address the concerns of this group. the electronic library membership initiative group will meet during the ala midwinter meeting in denver. interested individuals are encouraged to attend the meeting scheduled for monday, january 25, 1982, at 2 p.m.
in room 2e of the auditorium. interest in elmig/ela has surfaced quickly. the membership group was formed in march, and gathered the 200 signatures needed for official recognition at the ala annual conference in san francisco. some 150 people met at that conference to discuss topics of concern. they decided to continue these discussions at the 1982 midwinter meeting and plan for an elmig program to be presented at philadelphia. elmig aims to address the issues concerning the electronic library on a continuing basis through ongoing interaction of its members. to facilitate this interaction, elmig will use an electronic mail system. further information on elmig and its members is available from richard sweeney at the public library of columbus and franklin county, 28 s. hamilton rd., columbus, oh 43213. see page 317 for subscriber agreement form.
heynen to head arl microform project
the association of research libraries has hired jeffrey heynen to head a two-year program designed to improve bibliographic access to microform collections in american and canadian libraries. the association has received $20,000 from the council on library resources to initiate the project, and additional funds are anticipated from other sources. heynen brings an extensive background in micrographics and publishing to the project as well as a long-standing commitment to improving the treatment, use, and bibliographic control of microforms in libraries. he has served as chair of the american library association's reproduction of library materials section, and was a participant in earlier groups that laid the foundation for the current arl project. currently president of information interchange corporation, heynen has held executive positions with congressional information service, greenwood press, and redgrave information resources. these positions have all included responsibility for the creation of large microform collections.
heynen holds memberships in numerous standards-making bodies, including the international organization for standardization (iso), the american national standards institute, and the national micrographics association, and is a lecturer at the university of maryland college of library and information services. the arl microform project is based upon a planning study conducted for the association by richard boss of information systems consultants, inc. its purpose is to stimulate and coordinate the work of libraries, microform publishers, bibliographic utilities, and regional networks in providing bibliographic access to millions of monographic titles in microform that are now inadequately or insufficiently cataloged. since the development of the plan during 1980, there has been keen interest both in the elements of the plan and in the cooperative efforts needed to achieve them. a number of libraries, both arl and non-arl members, are planning to begin or are already entering catalog records for individual titles in microform sets into bibliographic databases. for example, three arl libraries have recently been awarded grants under title ii-c of the higher education act, strengthening research library resources, to catalog major microform sets, entering the resulting records into one of the major utilities. all three libraries (stanford university, university of utah, and indiana university) will be coordinating their efforts with the goals of the arl program. key to these efforts, however, is coordination to ensure that national standards are accepted and followed, to distribute the work load so that as many sets as possible are covered and duplication of effort is avoided, and to ensure that the records are available to all libraries that want to use them.
the arl microform project will emphasize building on existing resources, coordinating efforts among the library and publishing communities and the bibliographic utilities, and, where possible, facilitating cooperative projects already planned or under way. heynen will be assisted by an advisory committee composed of representatives of both arl and non-arl libraries, the major bibliographic utilities, and microform publishers. the arl project will operate out of the office of information interchange corporation, 503 11th st., se, washington, dc 20003; (202) 544-0291. libraries and publishers interested in participating in the project are urged to contact the project office.
subscriber agreement
electronic library membership initiative group
__________________ (ala member), applies for membership in the electronic library membership initiative group, electronic mail system, and states that:
recitals:
a. elmig is an association of individuals whose mission is to ensure that information delivered electronically remains accessible to the general public; and
b. elmig seeks to promote participation and leadership in remote electronic delivery of information by publicly supported libraries and nonprofit organizations.
now therefore, the above member and oclc agree that:
1. member will deposit with oclc a $100 contribution toward the cost of electronic mail service and attendant expenses for the first year of operation, which is to commence january 1, 1982. the member recognizes that the initial member contribution may not be sufficient to pay for a year of operation and agrees, when invoiced, to make additional payments of $100, or other agreed upon sums, to oclc for the continuation of service.
2. oclc agrees that by accepting member deposits, it will secure electronic mail service for the members of elmig; and
2.1 will place member deposits in a separate elmig account from which oclc will pay the cost of the electronic mail service, u.s. postal mailings, and any other expenses incurred in the administration of ems.
2.2 will provide a year-end accounting of contributions and expenditures to members within a reasonable time after december 31, 1981, and each year-end thereafter.
member:
by __________________
title __________________
date __________________
nominations sought for lita award
nominations are being sought for the library and information technology association's award for achievement. the award is intended to recognize distinguished leadership, notable development or application of technology, superior accomplishments in research or education, or original contributions to the literature of the field. the award may be given to an individual or to a small group of individuals working in collaboration. organized institutions or parts of organized institutions are not eligible. nominations for the award may be made by any member of the american library association and should be submitted by january 15, 1982, to hank epstein, lita awards committee chairperson, 1992 lemnos dr., costa mesa, ca 92626.
are these books on your shelf?
the special library role in networks: proceedings of a conference. robert w. gibson, jr., ed. 296 p. 1980. isbn 0-87111-279-5. $10.50. reports on the current state of networking and presents a creative approach to special library involvement in network participation and management.
special libraries: special issue on information technology and special libraries. april 1981, vol. 72, no. 2. $9.00. the entire issue of this journal is devoted to the technological transformation of the information industry.
topics discussed are such advances as computer and telecommunications components, software developments, linking, and modes of access to information systems.
bibliographic utilities: a guide for the special librarian. james k. webster, ed. 32 p. 1980. isbn 0-87111-280-7. $3.75. a comparative study of the services offered by the four major north american online bibliographic utilities.
total $ ____
send to: special libraries association, order department, box jla, 235 park avenue south, new york, new york 10003. orders from individuals must be prepaid.
date ____ name ____ organization ____ street address ____ city ____ state ____ zip ____
new york city purchasers add 8¼% state and city sales tax. new york state purchasers add appropriate state and local sales tax.
182 technical communications
reports: library projects and activities
ohio state university health sciences library uses automated bookstack system
the new health sciences library at ohio state university began serving students in may 1973 with some of the most advanced features in any library in the country. it contains an automated bookstack system to locate and file books, and is the fourth library in the country to have the system (randtriever, manufactured by remington rand corp.), says jo ann johnson, director of the health sciences library. "the bookstack system will find and deliver a book via a conveyor belt in about a minute," miss johnson said. the chief advantages of the system are that it saves space and is speedy and accurate, she pointed out. "the book stacks in the new library take up about 15 percent of the total space while in most libraries the stacks take 40 to 60 percent of the space," miss johnson said. aisles in the stacks are narrow, about 15 inches, and the shelves rise through two stories of the library, twenty-two feet in all, she said.
the library has a capacity of 175,000 volumes. the accuracy of the system will reduce the problem of misfiling. also, book theft should dwindle because the stacks will be closed to users, she said. the library is connected with the computerized circulation system of the university library, made up of a main library and twenty-three branch libraries. this circulation system is the first of its type in the country and permits library users to place telephone calls to learn titles and authors and to charge out books. other features of the modern library will include a computer-assisted instruction area to be completed later, and connections to medline, the international computerized information system of medical journals. miss johnson explained that the automated bookstack system works like this: a library staff member sends instructions via a terminal to an electronic device in an aisle. the device travels on vertical and horizontal columns in the aisles. it picks out a small bin of books containing the requested one, then travels to the end of the aisle and places it on a conveyor belt. at the terminal, the staff member selects the requested book from the bin, usually containing about eight volumes, and sends the bin back for refiling. a glass window permits observation of the system.
university of california, berkeley serials key word index
the university of california, berkeley, general library has published a serials key word index to titles of 45,741 serials. the computer-produced index is the largest of a fairly new variety of key word indexes, covering titles of serials rather than articles. the 360/assembler programs written by the library systems office include a number of innovations. berkeley serial records are stored in marc format, upper-lower case, capitalized by citation rather than catalog standards. the key word extract program ignores prepositions and conjunctions, etc.
(which are not capitalized); treats certain multiword terms (la paz, united nations) as single words; prepares a library-standard sorting key (with u.s. filing as united states, & filing as and, and a distinction made between two types of hyphenation); and does no stop-list searching or other searching for excluded words. key lines are sorted by key word; all other processing is based on an alphabetic file of key words attached to main entries. thus, vocabulary control (forced interfiling of abbreviations, synonyms, cognates, etc., not heavily used in this edition) is a fast, simple runtime operation, changing certain key words (on a single alphabetic pass) and generating "see" references. exclusion of low-content words is also a fast, simple runtime operation, done in the printing program, allowing excluded-word entries to print if the word occurs first in either title or author, and generating an explanatory note under each excluded key word. listings are main-entry, alphabetic under key word groups, with brief holdings, campus location, and call number where available. the key word appears in all capital letters within each entry, and redundant entries are collapsed; that is, if a word appears more than once in an entry, each occurrence is capitalized, but the entry is only listed once under the key word. the first edition limits entries to 98 characters and holdings to 13 characters; the programs have since been revised to allow up to 193 character entries and up to 45 characters of holdings. both versions of the programs retain as much runtime flexibility as possible, while maintaining extremely low running time. the first edition, including mostly nondocument currently-received titles, is photocomposed in a 6-point slab-serif type and published in three paperbound volumes. copies are available for $60 a set from systems office, main library, university of california, berkeley, ca 94720.
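the key word extraction described above can be sketched roughly as follows. this is a minimal python illustration, not the library's 360/assembler programs: it assumes titles are capitalized by citation standards (so words starting with a lowercase letter are skipped), and the function names, the `MULTIWORD_TERMS` list, and the tokenizing rule are hypothetical choices made for the example.

```python
import re

# Hypothetical list of multiword terms that count as single key words.
MULTIWORD_TERMS = ("La Paz", "United Nations")


def extract_key_words(title, multiword_terms=MULTIWORD_TERMS):
    """Return the distinct key words of a title, in order of first appearance."""
    # Join each multiword term with "_" so it survives tokenizing as one word.
    for term in multiword_terms:
        title = title.replace(term, term.replace(" ", "_"))
    key_words = []
    for token in re.findall(r"[A-Za-z_']+", title):
        # Assumption: uncapitalized words (prepositions, conjunctions, etc.)
        # are not key words, so no stop list is consulted.
        if not token[0].isupper():
            continue
        word = token.replace("_", " ").upper()
        if word not in key_words:  # a repeated word yields only one entry
            key_words.append(word)
    return key_words


def index_lines(title):
    """Build sorted (key word, entry) pairs with the key word in capitals."""
    lines = []
    for word in extract_key_words(title):
        pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
        # Every occurrence is capitalized, but the entry is listed once.
        lines.append((word, pattern.sub(word, title)))
    return sorted(lines)


if __name__ == "__main__":
    for key, entry in index_lines("Bulletin of the United Nations"):
        print(key, "|", entry)
```

skipping lowercase-initial words stands in for the article's point that capitalization by citation standards makes a stop-list search unnecessary; the library-standard sorting key and the vocabulary-control pass are left out of this sketch.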
walt crawford, university of california, berkeley
programming and computers
plea, a pl/1 efficiency analyzer
pl/1 users find that the language offers infinite ways to invoke inefficient code. partial defense is provided by careful manual reading. another, and very
[figure: plea execution analysis listing, statement trap counts for the main procedure, with columns for statement, no. traps, and % total; the listed source includes the binary search statements j = (top + bot) / 2; if b > a(j) then bot = j; else top = j; end;]
fig. 2. each test repeated 2000 times with argument b in array and 2000 times with argument b not in array. sampling interrupt interval was .00 seconds.
to get a reasonable sampling of the remaining blocks. a comparative run showed that the optimization overhead was charged to the proper statement groups, but pragmatists will note that the problem setup was biased against the binary search solution. however, using the trap totals from tests 2, 3, and 4, the 50 percent probability test indicates that the probability of no significant difference between methods 2 and 4 is more than 5 percent; the probability of no significant time difference between methods 2 and 3 is more than 1 percent. plea is available at a program distribution fee of $25 from the share program library agency, triangle university computer center, p.o. box 12076, research triangle park, nc 27709. thanks are due dr.
david gomberg, university of california, san francisco computer center, for most of the runtime and several of the statements used in the test.
justine roberts, systems librarian, uc-san francisco
input
to the editor: i am writing to you concerning the article which appeared in the september 1972 issue of journal of library automation entitled "the shared cataloging system of the ohio college library center." i also note that this issue of your journal, even though dated september 1972, was not received until july of 1973 by this library, and, indeed, it was a timely arrival, for at the present time the northwest association of private colleges and universities is investigating the feasibility of seeking service from the oclc for some of its library requirements. however, in talking with mr. kilgour and his associates at ala this summer it was exceedingly difficult to get a complete cost picture of participation in the oclc and to this date we have not been able to get a complete cost breakdown obligation. in this regard, this article was extremely interesting and i requested one of my staff members to do a careful analysis of the cost aspects of the oclc services. i am attaching this analysis for your interest and perhaps it will be of suitable pertinence for the readership of your journal. certainly i, and others of my colleagues at this university and in napcu, would be more than interested in a response by mr. kilgour and his associates.
desmond taylor, library director, collins memorial library, university of puget sound, tacoma, washington
an analysis
summary: "average cost per card for 529,893 catalog cards in finished form and alphabetized for filing was 6.57¢ each ... the system is easy to use, efficient, reliable, and cost beneficial. an off-line catalog card production system based on a file of marc ii records was activated a year before the on-line system." requests were batched weekly.
library of congress card numbers were keypunched onto cards for searching. seventy percent were found on the first search. "members could specify a recycling period of from one to thirty-six weeks ... before unfulfilled requests were returned." lowest price in lots of one-half million permalife cards was $8.01 per thousand. cpas checked the system and found that all direct costs were included in the 6.57¢ cost. no mention is made of the preexisting cataloging systems-
fig. 2. worksheet.
monocle/chauveinc 119
following order: tags, indicators, subrecord indicators, repeat indicators (e.g., 100 00 001). on the magnetic disk, however, the order is as follows: tag, subrecord indicator, repeat indicator, indicator (e.g., 100 001 00). the second file is the main file. records in this file have the same general design as the marc ii communications records, and monocle has retained all the fields designed by the library of congress. each field begins with a two-character subfield code. grenoble does not use fields 001 to 009, but since the bibliotheque nationale will use these fields, monocle retains them. another characteristic of the second file is that records are input in random order and are given identification numbers that are their physical addresses on the disk.
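as a rough illustration of an identification number that doubles as a physical address, the python sketch below splits a ten-digit number into disk, track, and record parts. it assumes the ten-digit layout the article gives for the leader (one digit for the disk, four for the track, five for the record); the left-to-right order of the parts, the `DiskAddress` type, and the function names are assumptions made for the example, not details confirmed by the source.

```python
from typing import NamedTuple


class DiskAddress(NamedTuple):
    """Hypothetical decoded form of a ten-digit identification number."""
    disk: int    # 1 digit
    track: int   # 4 digits
    record: int  # 5 digits


def parse_identification_number(ident: str) -> DiskAddress:
    """Split a ten-digit identification number into its address parts.

    Assumed digit order: disk, then track, then record.
    """
    if len(ident) != 10 or not (ident.isascii() and ident.isdigit()):
        raise ValueError("identification number must be exactly ten digits")
    return DiskAddress(int(ident[0]), int(ident[1:5]), int(ident[5:]))


def format_identification_number(addr: DiskAddress) -> str:
    """Rebuild the ten-digit number, zero-padding each part."""
    return f"{addr.disk:01d}{addr.track:04d}{addr.record:05d}"


if __name__ == "__main__":
    addr = parse_identification_number("2013400042")
    print(addr)
    print(format_identification_number(addr))
```

because the number itself encodes where the record sits, a lookup needs no separate index; the parse step above is all that stands between the printed list and the record.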
the address, which is put in the leader, is made up of ten digits, of which one is the number of the disk, four the number of the track and five the number of the record. access to every record is simple, since the identification number is also the physical address. a printed abridged alphabetical list giving author, title and this number indexes a printout of the main file. additions and corrections are made on this printout and then added to the computer file through a correction tape. the identification number is the access point. no supplementary internal index is needed, nor is any sequential search. there is direct access to every record in the file. some fields have been added for monocle, some deleted, and some modified. the main field deleted is field 130 (main entry uniform title heading) because its place was considered to be in the group of title fields. accordingly fields 630, 730 and 930 are deleted. that is to say, they are kept on the format, but not used, as is the case with many other fields. field 008 contains codes different from those of the marc format. these 69 codes (see figure 1) are put in fixed position just after the leader and before the directory. this permits various studies and manipulations (statistics, sorts, etc.) without going to the main file, which is in a variable-length form and whose contents are therefore less easily accessible than those of fixed fields. field 080 for universal decimal classification was not developed by the library of congress or bnb. for monocle it has been given a structure that permits differentiation of the call number (when the book is classified on the shelves according to the udc) from the udc number, which is only used for the card catalogs.
in this structure "$a" represents the call number and "$b" represents the continuation of the udc number, as shown in the following example:

080 00 $a dur 539.143 $b (083) : 547.1

the colon instructs the computer to make a cross reference from the second number to the first. in field 100, main entry author personal name, the general layout was retained, but the subfield codes changed for filing purposes. as a matter of fact, the filing rules for personal names at the bibliotheque nationale differ in many aspects from american library association rules. in designing monocle, the library tried all along to give filing value to subfield codes in order to simplify programming. for instance, the filing order for the same name is:

saint
pope
emperor
kings of france
kings (other countries)
forename
single surname
surname plus forename

this gives:

john, saint
john, king of england
john
john, bishop of chartres
john, peter
john, peter, ed.
john, peter, advocate

therefore the following subfield codes have been adopted:

$a names
$b saint
$c pope
$d emperor
$e king of france
$f other kings (alphabetized by name of kingdom)
$g relator
$h date
$i numeration
$k precedent epithet
$l filing epithet
$m forename

this structure is closer to that of the bnb than to marc's, but an important change has been made in the indicators. marc and bnb indicators for this field were chosen for communications purposes and are therefore not necessarily convenient for internal processing. in fact, the program had to test every character and take action on some of them (delete a blank, transform a hyphen into a blank, etc.), which takes a lot of computer time. to facilitate construction of sort keys a change of indicators was made that assigned to each of them a specific action. for first indicator 1 no action is assigned.
that is to say that a name is filed exactly as it is, whether it is a single surname or a compound surname:

100 10 $a durand $m charles
100 10 $a smith $m john
100 10 $a castro calvo $m frederico
100 10 $a hoa tien su
100 10 $a santa cruz $m alonso de

eighty percent of names are put under this indicator and put in the sorting field without any test, which saves much computer time. first indicator 2 changes a hyphen into a blank in a compound name. the internal hyphen becomes a blank because it is filed as a blank:

martin-chauffier (filed as martin chauffier)
pasteur vallery-radot (filed as pasteur vallery radot)

first indicator 3 is used for the compound names in which a character (blank, hyphen, apostrophe) is deleted:

la fontaine (filed as lafontaine)
mac innis (filed as macinnis)
o'neil (filed as oneil)
van nostrand (filed as vannostrand)

nowhere does there seem to be a clear explanation of the reasons for creating a special field for family names (the use of this indicator in marc ii). for french libraries it is useless for filing purposes, a family name being filed as a surname. first indicator 4 is used when a complex filing is necessary, that is to say, when the technique of inserting vertical bars (or any other characters) is used in the way proposed by r. coward. the use of this specific indicator for these three bars enables the program to test for them only when this indicator is present. this means that there is just one test per name instead of ten or twenty on each character of every name. as this indicator is in the directory, the processing of the names before the sorting itself is hastened.

martin |du gard|dugard|
dupon |de la cueriviere|lacueriviere|
|mc alester|macalester|
|mc graw-hill|macgraw hill|
|muller|mueller|

first indicator 0 also has a filing function. as names of saints and kings will be a small part of the files, and in order to file them correctly, three bars are inserted to mark omissions for alphabetization.
100 00 $a therese |d'||avila $b sainte
100 00 $a therese de |l'||enfant jesus $b sainte $k marie francoise therese martin

in field 110 the subfield codes of the communications format were not sufficient for a good filing. first, there seemed no reason to separate name (inverted) and name (direct order) because there is no difference in the filing of these names, which is strictly alphabetical. there is also no logical difference between them. so monocle retains only two of these indicators: 10, for name of a corporate body entered under the name of a place, and 20, for other corporate bodies. this will be useful either for research purposes or for giving priority in filing to the name of place following upon the other name. as there are the same filing problems as in the author field, the indicator 40 has been added, which means that the three vertical lines are used.

110 40 $c martin |von||wagner universitat

the subfield coding is rather succinct in the marc format, and a change was made from the bnb coding because french practice does not use form subheading and "treaty" subheading. moreover, under the name of a corporate body there can be a subheading such as "conference." this subheading has to be interfiled with a subheading of subordinate department and then should have a different code.

library association. londres. conference.
library association. londres. cataloging group

the subfield codes are:

$a french name of the corporate body } uniform title used by
$b place } the bibliotheque nationale
$c name
$g relator
$h name of congress or conference
$i subordinate department
$j additional designation (number of the congress)
$k date of the congress
$m place of the congress
$n remainder of the title
$o type of jurisdiction
$p name of larger geographic entity
$q inverted element

monocle does not use the "$t" proposed in marc, and the same is true with many other fields (410, 610, 710, 910).
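the first-indicator actions described above for field 100 can be sketched in a few lines of python. this is only an illustration under stated assumptions, not the grenoble program: the function name filing_form is hypothetical, the vertical bars are written as "|", and the three-bar convention is assumed to mean that the text between the first and second bars is printed but not filed, while the text between the second and third bars is filed but not printed.

```python
import re

# Assumed three-bar convention: |printed only|filed only|
# (an interpretation of the examples in the text, not a documented spec).
THREE_BARS = re.compile(r"\|([^|]*)\|([^|]*)\|")


def filing_form(name: str, first_indicator: str) -> str:
    """Return the form of a field 100 $a name used to build the sort key."""
    if first_indicator == "1":
        # No action: the name is filed exactly as it is.
        filed = name
    elif first_indicator == "2":
        # A hyphen in a compound name is filed as a blank.
        filed = name.replace("-", " ")
    elif first_indicator == "3":
        # Blank, hyphen, and apostrophe are deleted.
        filed = re.sub(r"[ \-']", "", name)
    elif first_indicator in ("4", "0"):
        # Complex filing: keep only the "filed" part of each bar group.
        filed = THREE_BARS.sub(lambda m: " " + m.group(2) + " ", name)
    else:
        raise ValueError(f"unknown first indicator: {first_indicator!r}")
    # Collapse runs of blanks left over from the substitutions.
    return " ".join(filed.split())


if __name__ == "__main__":
    print(filing_form("martin-chauffier", "2"))
    print(filing_form("la fontaine", "3"))
    print(filing_form("martin |du gard|dugard|", "4"))
    print(filing_form("therese |d'||avila", "0"))
```

the point of the design is visible in the branches: the indicator is tested once per name, and only the branch for complex filing has to scan for bars.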
monocle makes important changes in the title fields, following british marc but going a little further. tags have been assigned to titles in the following order:

240 collective filing title (complete works)
241 uniform title (bible)
242 original title
243 translated title (used only for the filing of russian or greek words according to the roman alphabet)
244 romanized title
245 title

a book may have several titles, in which case they are filed under the name of the author in the numerical sequence of the tags. a collective title (the complete work) is filed before a uniform title (if it exists), and the latter before an original title, which is in turn filed before an actual title. classical works of which there are many translations have to be regrouped under the original title, but this may not be true of scientific works or of popular novels, which are filed under actual title. moreover, filing of titles can be different in different libraries and for different books in the same library, which is why the filing order will not be determined on the worksheet, but by the program. this problem in filing order was raised by the bibliotheque nationale, which does not want to have determined in the record itself which of several titles will be the filing title; titles will be put under their respective tags according to their nature, and the program will, according to certain tests, choose the filing title. however, a completely satisfying solution to achieving flexibility and unambiguity in filing has not been arrived at. monocle now uses only sequences 240, 241 and 245, using about the same indicators as the marc format but with a slightly different meaning. the first indicators in field 241 have also been changed in order to achieve proper filing whether or not a conventional title contains a personal name. for example "exposition chagall" will be filed before "exposition bibliotheque nationale."
the second indicator set to "1" shows that there should be a cross reference from this title to the title used for filing (actual title to original title, alternative title to main title). the second indicator set to "9" shows that the title is not significant and will not be used in a title catalog; field 900 is thus not used and repetition of the cross reference is avoided. monocle also employs in title fields the indicator "4" used in field 100 for complex names and an added indicator "5" for title without personal names. subfield codes have also been modified in such a way as to use their alphabetical value as filing value as well as to identify data elements within a field. the following codes are used in fields 240, 241, 242, 243, 244 (and in corresponding fields 440-444, 740-744, 940-944):

$a title
$b filing number for a logical order of the bible, koran, etc.
$c adaptation or extract
$d remainder of the title
$e filing number for languages
$f language
$g filing number for dates
$h dates
$k name of person
$l epithet
$m forename
$p place
$q corporate body

the following are examples of this subfield code use:

241 50 $a bible $b 03 $d a. t. pentateuque, genese $c extraits $e 7 $f francais $h 1967
241 50 $a exposition $p paris $q bibliotheque nationale $h 1967
241 10 $a exposition $k chagall $m marc $h 1963

for field 245 marc indicators have been retained and "40" added for title with complex filing. these titles use the three vertical lines.

245 40 $a |le xxeme|vingtieme|siecle

for more simple filing the virgule or slash is used to eliminate articles at the beginning of titles. this is more flexible than the use of one indicator to determine the number of characters to avoid in filing, especially as there can be more than nine characters to avoid.

245 00 $a the / chemistry of life

the foregoing two techniques are used in all the fields x4y of monocle (445, 945, etc.).
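the slash technique for leading articles can be illustrated with a short python sketch. it is a hedged example, not code from monocle: the function names filing_title and printed_title are hypothetical, and the sketch assumes that only the first slash in the subfield has this special meaning.

```python
def filing_title(title: str) -> str:
    """Drop everything up to and including the first slash, if there is one."""
    _, slash, rest = title.partition("/")
    kept = rest if slash else title
    return " ".join(kept.split())


def printed_title(title: str) -> str:
    """Remove the slash marker itself but keep the article for display."""
    return " ".join(title.replace("/", " ", 1).split())


if __name__ == "__main__":
    titles = [
        "the / chemistry of life",
        "la / petite chronique d'anna magdalena bach",
        "bible",
    ]
    # File the titles by their filing form, print them in their display form.
    for t in sorted(titles, key=filing_title):
        print(filing_title(t), "<-", printed_title(t))
```

because the marker travels with the text, the skipped portion can be any length, whereas a single-digit indicator counting characters to skip stops at nine.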
there are slight modifications in other fields. for example, in the "collation" field the american and british formats do not make any mention of volumes. as it comes first in monocle collation, the subfield codes of 260 are modified as follows:

$a volumes
$b height
$c pagination
$d illustration

this situation may change if an international standardized catalog description is agreed upon. in fields 400, 600, 700 and 900 the marc and bnb marc projects have foreseen only one subfield "$t" to put the title after the name, and only one field, 740 or 940, for titles alone. to permit filing author-title series or an author-title added entry with titles of works of the same author, the following title fields were constructed in exactly the same way as fields 240-245: 440, 640, 740, 940. the following fields were added, with the same indicators and subfield codes as 240-245: 441, 442, 443, 444, 741, 742, etc. the repeat indicator is used to link the author to the title in order to make one entry, since author entry and title entry may be quite independent.

410 20 001 $c national research council
445 00 001 $a / publications $y 1708

100 00 $a meynell $m esther
241 00 $a the / little chronicle of anna magdalena bach $f francais $h 1957
245 01 $a la / petite chronique d'anna magdalena bach $c trad. par m. e. buchet
700 11 $a buchet $m m. e. $g trad.
900 10 001 $a bach $m anna magdalena $g auteur suppose
945 00 001 $a la petite chronique $r voir $z 241 000
945 00 002 $a la / petite chronique d'anna magdalena bach $r voir $z 241 000

this is a very useful tool, which permits generalization of the program to interfile records of books published by an institution with records of series published by the same institution, something not possible if one is under "$t" and the other under 245. the technique is not used, however, when the name is part of the title, as in "holden day series in mathematics."
it is also useful because monocle treats large handbooks as series, which is simpler than using "$d" and "$e" in the 245 field and repeating the name of the treatise in every record or using the subrecord technique. field 502 has also been modified to permit filing dissertations by subject, town, date and number. the details of the indicators and subfield codes can be found in monocle (3). one of the main problems encountered was the processing of multivolume sets. it was thought necessary to develop a provision to permit interfiling volumes of a multivolume set. there are three cases, the simplest being that in which volumes are simply numbered 1, 2, 3 ... with or without a title and a date by volume. field 505 is used in this case, with subfield codes slightly modified:

$y volume number
$a title
$b subtitle
$e remainder (date, pagination)

following is an example:

505 00 $y 1 $a the practice of kinetics $e 1969, 450 p. $y 2 $a the theory of kinetics $e 1969, 436 p.

in the second case, when each volume has authors, title, and date, the subrecord technique can be used, each volume having its own subrecord. this is possible only for treatises with few volumes, since the complete record cannot be too long. for very complicated handbooks the series technique is employed. a record is made for the main title as a guide record, and other records are made for each volume, the name of the main treatise being repeated in fields 400-445. this case could be treated by the subrecord technique, but this would give very long and complicated records, too long to be processed by computer and difficult to correct each time a new volume comes in. although the technique used is not very logical, the guide record is made only once, and a record is made for the volume only when it comes in, without any modification to the records already in the computer.
when the records are sorted in alphabetical order, one entry will be made to the individual volume and by the "series note" will find its place under the guide record (3). there is of course no logical link internal to the file between records of different books of the same series, nor of them with their guide record. if there is a multivolume work as part of a series, in which each volume bears a different number in the series, there are two possibilities: either to use field 505 and 445 for each volume, linking them by the repeat indicator, or to use the subrecord technique. monocle makes a choice according to the complexity of the records. at the request of the bibliotheque nationale and of some documentalists wishing to use the format for bibliographies of articles, some fields were added. field 270 contains name of the printer, the place and date of printing.

indicators 00
subfield codes:
$a place
$b printer's name
$c date

field 545 is the title of a periodical from which is extracted the article in the main entry. this tag was chosen because 500 is the note number (the title of the periodical is not an entry) and 45 is the title number and can be constructed as a title field.

indicators 00
subfield codes:
$a title
$b subtitle
$c year
$d month
$e day
$y volume
$f issue
$g pagination
$h bibliographical references

"$y" was kept for volume for the sake of consistency throughout the format. since it was undesirable to alter marc fields 660 and 670, monocle employs 680-682 for french subject headings. however, name subject heading tags were retained as 600, 610 and 611, but with modified subfield coding.
as in french filing geographical names are filed before topical names, the following tags were assigned:

680 geographical names
681 topical names
682 topical names for indexes only

the last tag was created in order to differentiate between subject headings for information retrieval and headings for printed indexes only. if there is a relation between two headings, the slash is used between them to tell the computer to make an inverted entry. for example,

680 04 $a chemistry / physics

gives two entries, one under chemistry and the other under physics. to allow each library to have its own subject heading system the second indicator is used to indicate this system: for example, 04 is for bibliotheque de grenoble. codes for monocle are partially taken from the british codes instead of the american ones because they are given a filing value. they are, however, slightly different, in that there is no form subdivision. subfield codes are as follows:

$a heading
$t chronological subdivision
$u geographic subdivision
$w general subdivision, 1st level
$x general subdivision, 2nd level
$y general subdivision, 3rd level
$z general subdivision, 4th level

the levels have been requested for some information retrieval systems that have multilevel thesauri. as a general rule, the attempt was to give a filing value to most of the subfield codes in order to simplify and hasten processing without any table of translation. the latter is always possible, but burdens the program. the library of congress has published a special format for serials.
thinking it not very useful, and feeling that serials could be processed by the marc format for books, the librarians at grenoble simply added to the monocle format some fields specifically for serials, as follows:

030 coden
210 abbreviated title
515, 525, 555 not used in monocle

503, bibliographic history, is used for the "followed by" and "following" notes of a periodical, because they are simply notes and not added entries. fields 780 and 785 are not necessary, since in a catalog an entry is usually not made for these titles. most periodicals are processed by the format without any trouble. the holdings of the library are put under 090 $b, as shown in the following example:

090 00 $a cbp. 185 $b 1, 1967 $c 5732s.

$a call number
$b holdings
$c location

summary

as stated at the beginning, the library of congress in its marc ii communications format has published the most comprehensive and the most detailed analysis of a bibliographical record. some, mostly documentalists, do not agree with the marc ii complexity in coding, but their aims are not the same as those of librarians who want, first, to catalog books and catalog records according to rules required for a catalog of a large stock of books. a simple, alphabetical sort on the author names is not adequate and is quite unusable by a reader. however, an arrangement that is good for a weekly bibliography may not be sufficient for a complete catalog. the british national bibliography made a thorough study of catalog entries and produced a better filing structure in accordance with the anglo-american rules. monocle translated the marc format with slight modifications, but subsequent trials led to more modifications. monocle format has been made from a librarian's point of view, but sometimes a programmer's view of the system has brought about an improvement in it. monocle is working, but not without difficulties.
these difficulties come not from the format itself but from the on-line system, which is not working as well as expected. the system organization may not be of the best and perhaps needs a thorough study before being put into operation. the format is not completely satisfactory and needs improvement. documentalists are right when they say it is too complex and expensive. synthesis between the documentalist format, which is too simple, and the monocle format will be undertaken to simplify the worksheet and speed up input time. from the librarian's point of view there are still problems to be solved. processing of complex titles is not easy, elegant, or clear. the analysis should go deeper to determine more logical relations between data, to avoid duplication of information in the record, and to speed up processing at every stage. the technique of links between fields and records is not developed in monocle as it is in other systems. it may be helpful to connect data by use of pointers and to do away with repetition of series notes that are already input elsewhere. hierarchical links between records should be useful. hence, there is much work still to be done, but the most immediate goal is to make the monocle format operational not only for the library of grenoble university but also for the bibliotheque nationale, which has adopted it for the automation of the bibliographie de la france. the philosophy behind the modifications introduced in converting the marc communications format to the monocle processing format can and should be discussed, but they have all been made in order to improve the structure of the record not only for internal processing but also for the interfiling of records, which is much more complicated. until now work has been done only on descriptive cataloging and on author-title filing. subject indexing and information retrieval are quite another job.

references

1. avram, henriette d.; knapp, john f.
; rather, lucia j.: the marc ii format: a communications format for bibliographic data (washington, d. c.: library of congress, 1968).
2. bnb marc documentation service publication no. 1 (london: council of the british national bibliography, ltd., 1968).
3. chauveinc, marc: monocle; projet de mise en ordinateur d'une notice catalographique de livre (grenoble: universitaire de grenoble, 1970).

editorial board thoughts: the promise of immersive libraries jerome yavarkovsky information technology and libraries | december 2013 5

immersive technologies (interactive 3d graphics, simulation, and gaming technologies) have much to offer higher education by collapsing geography and by providing a richer learning environment. over the past forty years, through digitization and internet services, librarians have brought technology to bear on making it easier to find and use information, even to the point where people can find and use library resources without coming to the library. information space, the space between user and literature, has been collapsed through digital access. now there is potential to collapse the space between users themselves as they work together from different locations. in recent decades, learning has gone from a predominantly independent and competitive process for students to one that makes greater use of collaboration, cooperation, and group study. the library as a place for students and researchers to work individually with their literature has become a collaborative workspace where students work together on research projects and shared class assignments. this presents a challenge to libraries and learning institutions with limited space for students to meet, share ideas, do coursework together, work on joint projects, and practice presentations.
for more than five years, advances in the development of virtual meeting space and workspace have enabled librarians to provide immersive, 3d virtual world services that give a sense of presence that is lacking in conference calls, text chat, and web conferencing.1 as a result, not only is the individual’s physical distance from library materials eliminated, but so is the distance between individuals who work with each other using library materials. immersive technologies offer the promise of 3d virtual world libraries where students and their teachers can work together in virtual space with library materials and tools: search engines, online catalogs, media, text, etc. students would sit at their computers wherever they are and work together with classmates in a shared environment, using library materials as well as productivity tools: word processor, spreadsheet, web, or blog development software. this would be a boon for real-world institutions lacking sufficient physical space, but also for distance education and international education. for example, students might study abroad but take a class at their home institution, take classes with classmates at foreign institutions, learn languages and experience foreign cultures more directly, or take classes in locales not accessible to professors.

jerome yavarkovsky (jeromey@bc.edu) is emeritus university librarian, boston college, and founding co-chair of the libraries and museums technology working group under the immersive education initiative of the media grid.

and now, with the advent of massive open online courses, the opportunity lies within the immersive library to serve large numbers of students who have limited or no library facilities.
these students would be able to use library resources as well as to communicate and work with classmates, teaching assistants, tutors, and others whose support would be augmented through their virtual presence rather than via email or text chat. as current research material is born digital, and legacy material is digitized at accelerating rates and is delivered digitally, they are perfect for use in immersive environments. however, immersive library resources are not limited to traditional materials. they include virtual world learning objects and environments, and virtual representations of books—walk-in books or educational simulations—that are interactive experiences with literature.2 further, in virtual space, physical research objects can be part of the study environment and be brought into an active relationship with the information resources that pertain to them. for example, if you were studying 3d models of mayan pottery in a virtual library workspace, you would have access to historical accounts, conference papers, current periodical articles, photographic archives, dissertations and other material pertinent to your studies. through this, not only would distance be eliminated between individuals and library materials, but also physical research objects from pottery to architecture, from monuments to molecular models could be represented in virtual space and related to the information that pertains to them.3 so whether the collaborators are students, teachers, librarians, researchers, we can see a time, even now, when they will no longer be bound by physical workspace. in addition to the immersive library as repository and collaborative study space, the immersive library can be expected to offer enhanced services to users. for example, immersive information literacy programs, immersive research and course consultations, virtual interlibrary document management, and document delivery are just a few possibilities. 
the immersive workspace is a logical setting for instructing students in digital literacy, the evaluation of resources, and the use of information tools. the goal here would be to bring into the immersive environment the rich body of course designs and curriculum materials pertaining to the educated use of information. we know that among the most significant lifelong benefits of higher education are information skills: finding, evaluating, and using information for work or for personal enrichment. research and course consultations are further academic services that would be valuable applications in the immersive library. librarians, library assistants, and docents have all helped individuals in second life and opensim libraries by providing advice, information, and guidance. these services have further potential in 3d virtual environments through enhancement with photographs, documents and virtual structures. add to these the tools and resources of the library collaborative workspace, and the potential for student and faculty consultation and advisory services is even greater. imagine the phd candidate or student with a writing assignment in the library space with the librarian and all the tools needed for help with dissertation research. as immersive libraries realize their potential and grow in number and scale, the challenge of managing them will grow as well. the tools of 3d virtual workspaces hold promise here also to facilitate the work of the library and improve librarian productivity. librarians work across geographic boundaries regionally and nationally in consortia and multitype systems for resource sharing, collaborative research and development projects, digitization initiatives, staff development programs, and any number of efforts to economize and improve performance.
the very management of the library enterprise should benefit from 3d virtual reality tools brought to bear on the day-to-day work and communication of the library. for any who might want to learn more, the ala virtual communities and libraries member initiative group maintains an ala connect site (http://connect.ala.org/node/66325). communication on libraries in virtual environments is also available through the acrl virtual worlds interest group and its google group, acrlinsl (http://groups.google.com/group/acrlinsl).

references

1. lori bell and rhonda b. trueman, eds., virtual worlds, real libraries: librarians and educators in second life and other multi-user virtual environments (medford nj: information today, 2008); tom peters, “librarianship in virtual worlds,” library technology reports 44, no. 7 (october 2008).
2. aaron griffiths, ma education in virtual worlds: immersive literature, “presenting the novel night, by elie wiesel, as an immersive literature discussion space,” youtube video, www.youtube.com/watch?v=i-ijpjcwtxa&feature=player_embedded#t=0, blog post at f/xual education services, may 17, 2013, http://fxualeducation.wordpress.com/2013/05/17/immersivelit/; bernadette daly swanson/daisyblue hefferman, “second life: bradburyville virtual experience, fahrenheit 451,” youtube video, 2008, http://www.youtube.com/watch?v=yyhqo0q2m_g.
3. luís miguel sequeira and leonel caseuri morgado, “virtual archaeology in second life and opensimulator,” journal of virtual worlds research 6, no. 1 (april 2013), http://journals.tdl.org/jvwr/index.php/jvwr/article/view/7047.

trends at a glance: a management dashboard of library statistics emily morton-owens and karen l.
hanson information technology and libraries | september 2012 36 abstract systems librarians at an academic medical library created a management data dashboard. charts were designed using best practices for data visualization and dashboard layout, and include metrics on gatecount, website visits, instant message reference chats, circulation, and interlibrary loan volume and turnaround time. several charts draw on ezproxy log data that has been analyzed and linked to other databases to reveal use by different academic departments and user roles (such as faculty or student). most charts are bar charts and include a linear regression trend line. the implementation uses perl scripts to retrieve data from eight different sources and add it to a mysql data warehouse, from which php/javascript webpages use google chart tools to create the dashboard charts. introduction new york university health sciences libraries (nyuhsl) had adopted a number of systems that were either open-source, home-grown, or that offered apis of one sort or another. examples include drupal, google analytics, and a home-grown interlibrary loan (ill) system. systems librarians decided to capitalize on the availability of this data by designing a system that would give library management a single, continuously self-updating point of access to monitor a variety of metrics. previously this kind of information had been assembled annually for surveys like aahsl and arl. 1 the layout and scope of the dashboard was influenced by google analytics and a beta dashboard project at brown.2 the dashboard enables closer scrutiny of trends in library use, ideally resulting in a more agile response to problems and opportunities. it allows decisions and trade-offs to be based on concrete data rather than impressions, and it documents the library’s service to its user community, which is important in a challenging budget climate. 
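the abstract mentions that most charts include a linear regression trend line. as an illustration of what such a line involves, here is a minimal ordinary least-squares sketch in python. it is not the authors' code (their implementation uses perl, php, and google chart tools); the function name trend_line and the sample monthly counts are invented for the example, and the periods are assumed to be evenly spaced and numbered 0, 1, 2, and so on.

```python
from typing import Sequence, Tuple


def trend_line(values: Sequence[float]) -> Tuple[float, float]:
    """Least-squares slope and intercept for values observed at x = 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("a trend line needs at least two data points")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    # Sum of squared deviations in x, and the x-y cross deviations.
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


if __name__ == "__main__":
    # Invented monthly gate counts, for illustration only.
    monthly_gatecount = [1040, 1100, 980, 1210, 1180, 1250]
    slope, intercept = trend_line(monthly_gatecount)
    fitted = [intercept + slope * x for x in range(len(monthly_gatecount))]
    print(f"change per month: {slope:.1f}")
    print([round(v) for v in fitted])
```

the slope gives the average change per period, which is the figure a manager reads off the trend line at a glance.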
although the end product builds on a long list of technologies (especially perl, mysql, php, javascript, and google chart tools), the design of the project is lightweight and simple, and the number of lines of code required to power it is remarkably small. further, the design is modular. this means that nyuhsl could offer customized versions for staff in different roles, restricting the display to show only data that is relevant to the individual’s work. because most libraries have a unique combination of technologies in place to handle functions like circulation, reference questions, and so forth, a one-size-fits-all software package that could be used by any library may not be feasible. instead, this lightweight and modular approach could be re-created relatively easily to fit local circumstances and needs.

emily morton-owens (emily.morton.owens@gmail.com) was web services librarian and karen hanson (karen.hanson@med.nyu.edu) is knowledge systems librarian, new york university health sciences libraries, new york.

visual design principles

in designing the dashboard, we tried to use some best practices for data visualization and assembling charts into a dashboard. the best-known authority on data visualization, edward tufte, states “above all else, show the data.”3 in part, this means minimizing distractions, such as unnecessary gridlines and playful graphics. ideally, every dot of ink on the page would represent data. he also emphasizes truthful proportions, meaning the chart should be proportional to the actual measurements.4 a chart should display data from zero to the highest quantity, not arbitrarily starting the measurements at a higher number, because that distorts the proportions between the part and the whole.
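a small worked example makes the distortion concrete. the python sketch below uses invented numbers and a hypothetical helper, drawn_ratio, to compare how long one bar looks relative to another when the axis starts at zero and when it starts at a higher value.

```python
def drawn_ratio(a: float, b: float, axis_start: float = 0.0) -> float:
    """Ratio of the drawn bar lengths for values a and b on an axis starting at axis_start."""
    return (a - axis_start) / (b - axis_start)


if __name__ == "__main__":
    # Invented counts: 90 visits one month, 100 the next.
    print(drawn_ratio(90, 100))      # axis from zero: bars in the true proportion
    print(drawn_ratio(90, 100, 80))  # axis from 80: the first bar looks half as long
```

with the axis starting at 80, a 10 percent difference is drawn as if one value were half of the other.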
a chart also should not use graphics that differ in width as well as length, because that causes the area of the graphic to increase incorrectly, as opposed to simply the length increasing. pie charts are popular but have serious problems in this respect; they require users to judge the relative area of the slices, which is difficult to do accurately.5 generally, it is better to use a bar chart with bars of different lengths, whose proportions users can judge better.

color should also be used judiciously. some designers use too many colors for artistic effect, which creates a “visual puzzle”6 as the user wonders whether the colors carry meaning. some colors stand out more than others and should be used with caution. for example, red is often associated with something urgent or negative, so it should only be used in appropriate contexts. duller, less saturated colors are more appropriate for many data visualizations.

a contrasting style is exemplified by nigel holmes, who designs charts and infographics with playful visual elements. a recent study compared participants’ reactions to holmes’ work with their reactions to plain charts of the same data.7 there was no significant difference in comprehension or short-term memorability; however, the researchers found that the embellished charts were more memorable over the long term, as well as more enjoyable to look at. that said, holmes’ style is most appropriate for charts that are trying to drive home a certain interpretation. in the case of the dashboard, we did not want to make any specific point, nor did we have any way of knowing in advance what the data would reveal, so we used tufte’s principles in our design.

a comparable authority on dashboard design is stephen few. a dashboard combines multiple data displays in a single point of access.
as in the most familiar example, a car dashboard, it usually has to do with controlling or monitoring something without taking your focus from the main task.8 a dashboard should be simple and visual, not requiring the user to tune out extraneous information or interpret novel chart concepts. the goal is not to offer a lookup table of precise values. the user should be able to get the idea without reading too much text or having to think too hard about what the graph represents. thinking again of a car, its speedometer does not offer a historical analysis of speed variation because this is too much data to process while the car is moving. similarly, the dashboard should ideally fit on one screen so that it can be taken in at a glance. if this is not possible, at least all of the individual charts should be presented intact, without scrolling or being cramped in ways that distort the data.

a dashboard should present data dimensions that are dynamic. the user will refer to the dashboard frequently, so presenting data that does not change over time only takes up space. better yet, the data should be presented alongside a benchmark or goal. a benchmark may be a historical value for the same metric or perhaps a competitor’s value. a goal is an intended future value that may or may not ever have been reached. either way, including this alternate value gives context for whether the current performance is desirable. this is essential for making the dashboard into a decision-making tool.

nils rasmussen et al. discuss three levels of dashboards: strategic, tactical (related to progress on a specific project), and operational (related to everyday, department-level processes).9 so far, nyuhsl’s dashboard is primarily operational, monitoring whether ordinary work is proceeding as planned. later in this paper we will discuss ways to make the dashboard better suited to supporting strategic initiatives.
system architecture

the dashboard architecture consists of three main parts: importer scripts that get data from diverse sources, a data warehouse, and php/javascript scripts that display the data. the data warehouse is a simple mysql database; the term “warehouse” refers to the fact that it contains a stripped-down, simplified version of the data that is appropriate for analysis rather than operations.

our approach to handling the data is an etl (extract, transform, load) routine. data are extracted from different sources, transformed in various ways, and loaded into the data warehouse. our data transformations include reducing granularity and enriching the data using details drawn from other datasets, such as the institutional list of ip ranges and their corresponding departments. data rarely change once in the warehouse because they represent historical measurements, not open transactions.10

there is an importer script customized for each data source. the data sources differ in format and location. for example, google analytics is a remote data source with a unique data export api, the ill data are in a local mysql database, and libraryh3lp has remote csv log files. the scripts run automatically via a cron job at 2 a.m. and retrieve data for the previous day. that time was chosen to ensure all other nightly cron jobs that affect the databases are complete before the dashboard imports start. each script uses custom code for its data source and creates a series of mysql insert queries to put the needed data fields in the mysql data warehouse. for example, a script might pull the dates when an ill request was placed and filled, but not the title of the requested item.

a carefully thought-out data model simplifies the creation of reports. the data structure should aim to support future expansion.
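as an illustration only, the importer pattern described in this section can be sketched in a few lines. this is not the authors’ code: it uses python and the standard-library sqlite3 module as stand-ins for the perl scripts and mysql databases, and the table and column names (ill_requests, dw_ill, date_placed, date_filled) and the sample rows are hypothetical.

```python
import sqlite3


def import_ill_requests(source, warehouse, day):
    """Copy one day's ILL requests into the warehouse, keeping only the dates."""
    # Extract: pull just the fields needed for analysis; the item title stays behind.
    rows = source.execute(
        "SELECT date_placed, date_filled FROM ill_requests WHERE date_placed = ?",
        (day,),
    ).fetchall()
    # Load: one INSERT per extracted row, using the warehouse's uniform field names.
    warehouse.executemany(
        "INSERT INTO dw_ill (date_placed, date_filled) VALUES (?, ?)", rows
    )
    warehouse.commit()
    return len(rows)


# Hypothetical operational database with made-up requests.
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE ill_requests (title TEXT, date_placed TEXT, date_filled TEXT)"
)
source.executemany(
    "INSERT INTO ill_requests VALUES (?, ?, ?)",
    [
        ("journal article a", "2011-09-09", "2011-09-12"),
        ("book chapter b", "2011-09-09", "2011-09-10"),
        ("journal article c", "2011-09-10", None),
    ],
)

# Hypothetical warehouse table holding only the stripped-down fields.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE dw_ill (date_placed TEXT, date_filled TEXT)")

loaded = import_ill_requests(source, warehouse, "2011-09-09")
stored = warehouse.execute(
    "SELECT date_placed, date_filled FROM dw_ill ORDER BY date_filled"
).fetchall()
```

in a nightly job, a function like this would presumably be called once per source with the previous day’s date; the titles never reach the warehouse, which matches the stripped-down model described above.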
in the data warehouse, information that was previously formatted and stored in very inconsistent ways is brought together uniformly. there is one table for each kind of data with consistent field names for dates, services, and so forth, and others that combine related data in useful ways.

the dashboard display consists of a number of widgets, one for each chart. each chart is created with a mixture of php and javascript. google chart tools interprets lines of javascript to draw an attractive, proportional chart. we do not want to hardcode the values in this javascript, of course, because the charts should be dynamic. therefore we use php to query the data warehouse and, for each row of results, a statement that “writes” a line of the data in javascript.

figure 1. php is used to read from the database and generate rows of data as server-side javascript.

each php/javascript file created through this process is embedded in a master php page. this master page controls the order and layout of the individual widgets, using the php include feature to add each chart file to the page plus a css stylesheet to determine the spacing of the charts. finally, because all the queries take a relatively long time to run, the page is cached and refreshes itself the first time the page is opened each day. the dashboard can be refreshed manually if the database or code is modified and someone wants to see the results immediately.

many of the dashboard’s charts include a linear regression trend line. this feature is not provided by google charts and must be inserted into the widget’s code manually. the formula can be found online.11 the sums and sums of squares are totted up as the code loops through each line of data, and these totals are used to calculate the slope and intercept. in our twenty-six-week displays, we never want to include the twenty-sixth week of data because that is the present (partial) week. the linear regression line takes the form y = mx + b.
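the running-totals calculation described above can be sketched as follows. this is a minimal illustration in python rather than the php the widgets use; it applies the standard least-squares formulas, the weekly counts are made up, and the function name is hypothetical.

```python
def trend_line(values):
    """Least-squares slope and intercept for values observed at x = 0, 1, 2, ..."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points to fit a line")
    sum_x = sum_y = sum_xy = sum_xx = 0.0
    # Accumulate the sums and sums of squares in a single pass over the data.
    for x, y in enumerate(values):
        sum_x += x
        sum_y += y
        sum_xy += x * y
        sum_xx += x * x
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    intercept = (sum_y - slope * sum_x) / n
    return slope, intercept


# Made-up weekly counts; the last value stands for the current, partial week.
weekly_counts = [10, 12, 14, 16, 18, 3]
complete_weeks = weekly_counts[:-1]  # drop the partial week before fitting
slope, intercept = trend_line(complete_weeks)
```

the slope and intercept returned here are the m and b of the trend line y = mx + b.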
we can use that formula along with the slope and intercept values to calculate y-values for week zero and the next-to-last week (week twenty-five). those two points are plotted and the trend line is drawn between them. the color of the line depends on its slope (greater or less than zero). depending on whether we want that chart’s metric to go up or down, the line is green for the desirable direction and red for the undesirable direction.

details on individual systems

gatecount

most of nyuhsl’s five locations have electronic gates to track the number of patrons who visit. formerly these statistics were kept in a microsoft excel spreadsheet, but now there is a simple web form into which staff can enter the gate reading twice daily. the data goes directly into the data warehouse, and the a.m. and p.m. counts are automatically summed. there is some error checking to prevent incorrect numbers being entered, which varies depending on whether that location’s gate is the kind that provides a continuously increasing count or is reset each day. the data are presented in a stacked bar chart, summed for the week. the user can hover over the stacked bars to see numbers for each location, but the height of the stacked bar and the trend line represent the total visits for all locations together.

figure 2. stacked bar chart with trendline showing visits per week to physical library branches over a twenty-six-week period

ticketing

nyuhsl manages online user requests with a simple ticketing system that integrates with drupal. there are four types of tickets, two of which involve helping users and two of which involve reporting problems. the “helpful” services are general reference questions and literature search requests. the “trouble” services are computer problems and e-resource problems.
these two pairs each have their own stacked bar chart because, ideally, the number of “helpful” tickets would go up while the number of “trouble” tickets would go down. each chart has a trend line, color-coded for the direction that is desirable in that case.

figure 3. stacked bar chart with trendline showing trouble tickets by type

the script that imports this information into the data warehouse simply does so from another local mysql database. it only fetches the date and the type of request, not the actual question or response. it also inserts a record into the user transactions table, which will be discussed in the section on user data.

drupal

nyuhsl’s drupal site allows librarians to contribute content like subject guides and blog posts directly.12 the dashboard tracks the number of edits contributed by users (excluding the web services librarian and the web manager, who would otherwise swamp the results). this is done with a simple count query on the node_revisions table in the drupal database. because no other processing is needed and caching ensures the query will be done at most once per day, this is the only widget that pulls data directly from the original database at the time the chart is drawn.

koha

koha is an open-source opac system. at nyuhsl, koha’s database is in mysql. each night the importer script copies “issues” data from koha’s statistics table. this supports the creation of a stacked bar chart showing the number of item checkouts each week, with each bar divided according to the type of item borrowed (e.g., book or laptop). as with other charts, a color-coded trend line was added to show the change in the number of item checkouts.
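a stacked bar chart of this kind needs one row per week and one column per item type. the sketch below shows one way to pivot warehouse records into that shape and serialize the result as a javascript array literal, loosely mirroring how a server-side script can “write” the data rows a charting library reads. it is written in python rather than php, and the record layout, function name, variable name (chartData), and counts are all assumptions made for illustration.

```python
import json
from collections import defaultdict


def stacked_rows(checkouts, item_types):
    """Pivot (week, item_type, count) records into a header row plus one row per week."""
    totals = defaultdict(lambda: defaultdict(int))
    for week, item_type, count in checkouts:
        totals[week][item_type] += count
    header = ["week"] + item_types
    # Item types with no checkouts in a given week are reported as zero.
    rows = [[week] + [totals[week][t] for t in item_types] for week in sorted(totals)]
    return [header] + rows


# Made-up weekly checkout counts by item type.
checkouts = [
    ("2011-09-05", "book", 40),
    ("2011-09-05", "laptop", 15),
    ("2011-09-12", "book", 35),
]
table = stacked_rows(checkouts, ["book", "laptop"])

# Emit the table as a line of JavaScript for the page that draws the chart.
js_literal = "var chartData = " + json.dumps(table) + ";"
```

generating the rows from a query result instead of hardcoding them keeps the chart in step with the warehouse each time the page is rebuilt.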
google analytics

the dashboard relies on the google analytics php interface (gapi) to retrieve data using the google analytics data export api.13 for this source, nothing is stored in the data warehouse and there is no importer script. the first widget gets and displays weekly total visits for all nyuhsl websites, the main nyuhsl website, and visits from mobile devices. a trend line is calculated from the “all sites” count. the second widget retrieves a list of the top “outbound click” events for the past thirty days and returns them as urls. a regular expression is used to remove any ezproxy prefix, and the remaining url is matched against our electronic resources database to get the title. thus, for example, the widget displays “web of knowledge” instead of “http://ezproxy.med.nyu.edu/login?url=http://apps.isiknowledge.com/.” a future improvement to this display would require a new table in the data warehouse and an importer script to store historic outbound click results. this data would support comparison of the current list with past results to identify click destinations that are trending up or down.

figure 4. most popular links clicked on to leave the library’s website in a thirty-day period

libraryh3lp

libraryh3lp is a jabber-based im product that allows librarians to jointly manage a queue of reference queries. it offers csv-formatted log files that a perl script can access using “curl,” a command-line tool that mimics a web browser’s login, cookies, and file requests. the csv log is downloaded via curl and processed with perl’s text::csv module, and the data are then inserted into the warehouse. the first libraryh3lp widget counts the number of chats handled by each librarian over the past ninety days. the second widget tracks the number of chats for the past twenty-six weeks and includes a trend line.

figure 5.
bar chart showing number of im chats per week over a twenty-six-week period

document delivery services

the document delivery services (dds) department fulfills ill requests. the web application that manages these requests is homegrown, with a database in mysql. each night, a script copies the latest requests to the data warehouse. the dashboard uses this data to display a chart of how many requests are made each week and which publications are requested from other libraries most frequently. this data could be used to determine whether there are resources that should be considered for purchase.

the dds data was also used to demonstrate how data might be used to track service performance. one chart shows the average time it takes to fulfill a document request. further evaluation is required to determine the usefulness of such a chart for motivating improvement of the service or whether this is perceived as a negative use of the data. some libraries may find this kind of information useful for streamlining services.

figure 6. this stacked bar chart shows the number of document delivery requests handled per week. the chart separates patron requests from requests made by other libraries.

ezproxy data

ezproxy is an oclc tool for authenticating users who attempt to access the library’s electronic resources. it does not log e-resource use where the user is automatically authenticated using the institutional ip range, but the log is still valuable because it records a significant amount of use that can support in-depth analysis. because of the gaps in the data, much of the analysis looks at patterns and relationships in the data rather than absolute values.

karen coombs’ article discussing the analysis of ezproxy logs to understand e-resource use at the department level provided the initial motivation to switch on the ezproxy log.14 when logging is enabled, a standard web log file is produced.
here is a sample line from the log:

123.45.6.7 amyu0gh5brmuska hansok01 [09/sep/2011:18:25:23 -0500] post http://ovidsp.tx.ovid.com:80/sp-3.3.1a/ovidweb.cgi http/1.1 200 20472 http://ovidsp.tx.ovid.com.ezproxy.med.nyu.edu/sp-3.3.1a/ovidweb.cgi

each line in the log contains a user ip address, a unique session id, the user id, the date and time of access, the url requested by the user, the http status code, the number of bytes in the requested file, and the referrer (the page the user clicked on to get to the site).

the ezproxy log data undergoes some significant processing before being inserted into the ezproxy report tables. the main goal of this is to enrich the data with relevant supplemental information while eliminating redundancy. to facilitate this process, the importer script first dumps the entire log into a table and then performs multiple updates on the dataset.

during the first step of processing, the ip addresses are compared to a list of departmental ip ranges maintained by medical center it. if a match is found, the “location accessed” is stored against the log line. next, the user id is compared with the institutional people database, retrieving a user type (faculty, staff, or student) and a department, if available (e.g., radiology).

one item of significant interest to senior management is the level of use within hospitals. as a medical library, we are interested in the library’s value to patient care. if there is significant use in the hospitals, this could furnish positive evidence about the library’s role in the clinical setting.

next, the resource url and the referring address are truncated down to domain names. the links in the log are very specific, showing detailed user activity. because the library is operating in a medical environment, privacy is a concern and so specific addresses are truncated to a top-level domain (e.g.
ovid.com) to suppress any tie to a specific article, e-book, or other specific resource.

finally, a query is run against the remaining raw data to condense the log down to unique session id/resource combinations, and this block of data is inserted into a new table. each user visit to a unique resource in a single session is recorded; for example, if a user visits lexis nexis, ovid medline, scopus, and lexis nexis again in a single session, three lines will be recorded in the user activity table.

a single line in the final ezproxy activity table contains a unique combination of location accessed (e.g., tisch hospital), user department (e.g., radiology), user type (e.g., staff), earliest access date/time for that resource (e.g., 9/9/2011 18:25), resource name (e.g., scopus.com), session id, and referring domain (e.g., hsl.med.nyu.edu).

there is significant repetition in the log. depending on what filters are set up, every image within a webpage could be a line in the log. the method of condensing the data described previously results in a much smaller and more manageable dataset. for example, on a single day 115,070 rows were collected in the ezproxy log, but only 2,198 were inserted into the final warehouse table after truncating the urls and removing redundancy.

in a separate query on the raw data table, a distinct list containing user id, date, and the word “eresources” is built and stored in a “user transactions” table. these very basic data are stored so that simple user analysis can be performed (see “user data” below).

figure 7. line chart showing total number of ezproxy sessions captured per week over a twenty-six-week period

once the ezproxy data are transferred to the appropriate tables, the raw data (and thus the most concerning data from a privacy standpoint) is purged from the database.
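the truncation and condensing steps can be illustrated with a short sketch. it is written in python rather than the sql and perl the importer uses, it keeps only the last two labels of the host name (a naive rule that would mishandle suffixes such as .co.uk), and the session ids, timestamps, and urls are invented for the example.

```python
from urllib.parse import urlsplit


def to_domain(url):
    """Naively truncate a URL to its last two host labels, e.g. ovid.com."""
    host = urlsplit(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def condense(log_rows):
    """Keep one row per (session id, resource domain) with the earliest access time."""
    earliest = {}
    for session_id, timestamp, url in log_rows:
        key = (session_id, to_domain(url))
        # ISO-style timestamps compare correctly as plain strings.
        if key not in earliest or timestamp < earliest[key]:
            earliest[key] = timestamp
    return [(s, d, t) for (s, d), t in sorted(earliest.items())]


# Invented log rows: one session visiting three resources, one of them twice.
log_rows = [
    ("s1", "2011-09-09T18:25", "http://www.lexisnexis.com/search"),
    ("s1", "2011-09-09T18:31", "http://ovidsp.tx.ovid.com:80/sp-3.3.1a/ovidweb.cgi"),
    ("s1", "2011-09-09T18:40", "http://www.scopus.com/home.url"),
    ("s1", "2011-09-09T18:52", "http://www.lexisnexis.com/results"),
]
condensed = condense(log_rows)
```

the four raw rows collapse to three, one per resource visited in the session, and no path or query string survives the truncation.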
several dashboard charts were created using the streamlined ezproxy data, including a simple count of weekly e-resource users and a table showing resources whose use changed most significantly since the previous month. it was challenging to calculate the significance of the variations in use since resources that went from one session in a month to two sessions were showing the same proportional change as those that increased from one thousand to two thousand sessions. a basic calculation was created to highlight the more significant changes in use.

d = p − q
if d < 0 then significance = d − 8 × 10d / (q + 1)
if d > 0 then significance = d + 8 × 10d / (q + 1)

d = difference between last month and this month
p = number of visits last month (8 to 1 days ago)
q = number of visits previous month (15 to 9 days ago)

this equation serves the basic purpose of identifying unusual changes in e-resource use. for example, one e-resource was shown trending up in use after a librarian taught a course in it.

figure 8. table of e-resources showing the most significant change in use over the last month compared to the previous month

the ezproxy data has already proven to be a rich source of data. the work so far has only scratched the surface of what the data could show. only two charts are currently displayed on the dashboard, but the value of this data is more likely to come from one-off customized reports based on specific queries, like tracking use of individual resources over time or looking at variations of use within specific buildings, departments, or user types. there is also a lot that could be done with the referrer addresses. for example, the library has been submitting tips to the newsletter that is delivered by email. the referrer log allows the number of clicks from this source to be measured so that librarians can monitor the success of this marketing technique.
user data

each library system includes some user information. where user information is available in a system, a separate table is populated in the warehouse. as mentioned briefly above, a user id, a date, and the type of service used (e-resources, dds, literature search, etc.) are stored. details of the transaction are not kept here. the user id can be used to look up basic information about the user such as role (faculty, staff, student) and department. we should emphasize for clarity that the detailed information about the activity is completely separated from any information about the user so that the data cannot be joined back together.

the most sensitive data, such as the raw ezproxy log data, is purged after the import script has copied the truncated and de-identified data. even though the data stored is very basic, information at the granularity of individual users is never displayed on the dashboard. the user information is aggregated by user type for further analysis and display.

the institutional people database can be used to determine how many people are in each department. a table added to the dashboard shows the number of resource uses and the percentage of each department that used library resources in a six-month period. some potential uses of this data include identifying possible training needs and measuring the success of library outreach to specific departments. for example, if one department uses the resources very little, this may indicate a training or marketing deficit. it may also be interesting to analyze how the academic success of a department aligns with library resource use. do the highest intensity users of library resources have greater professional output or higher prestige as a research department, for example?

it is unsurprising to find that medical students and librarians are most likely to use library resources.
the graduate medical education group is third and includes medical residents (newly qualified doctors on a learning curve). as with the ezproxy data, there are numerous insights to be gained from this data that will help the library make strategic decisions about future services.

figure 9. table showing the proportion of each user group that has used at least one library service in a six-month period

results

the dashboard has been available for almost a year. it requires a password and is only available to nyuhsl’s senior management team and librarians who have asked for access. feedback on the dashboard has been positive, and librarians have begun to make suggestions to improve its usefulness. one librarian uses the data warehouse for his own reports and will soon provide his queries so that they can be added to the dashboard. the dashboard has facilitated discoveries about the nature of our users and has identified potential training needs and areas of weakness in outreach. a static dashboard snapshot was recently created for presentation to the dean of the medical school to illustrate the extent and breadth of library use.

the initial dashboard aimed to demonstrate the kinds of library statistics that it is possible to extract and display, but there is much to be done to improve its operational usefulness. a dashboard working group has been established to build on the original proof-of-concept by improving the data model and adding relevant charts. some charts will be incorporated into the public website as a snapshot of library activity.

the dashboard was structured to be adaptable and expandable. the next iteration will support customization of the display for each user. new charts will be added as requested, and charts that are perceived to be less insightful will be removed.
for example, one chart shows the number of reference chat requests answered by each librarian in addition to the number of chats handled per week. the usefulness of this chart was questioned when it was observed that the results were merely a reflection of which librarians had the most time at their own desks, allowing them to answer chats. this is an example of how it can be difficult to separate context from numbers. in this instance the individual statistics were only included because the data was available, not because of any particular request from management, so these charts may be removed from the dashboard.

nyuhsl is also investigating the ex libris tool ustat, which supports analysis of counter (counting online usage of networked electronic resources) reports from e-resources vendors. ustat covers some of the larger gaps in the ezproxy log, including journal-level rather than vendor-level analysis, and most importantly, the use statistics for non-ezproxied addresses. a future project will be to see whether there is an automated way to extract use metrics, either from ustat or directly from the vendors, to be incorporated into the data warehouse. preliminary discussions are being held with it administrators about the possibilities of ezproxying library resource urls as they pass through the firewall so that the ezproxy log becomes a more complete reflection of use.

an example of a strategic decision based on dashboard data involves nyuhsl’s mobile website. librarians had been considering the question of whether to invest substantial effort in identifying and presenting free apps and mobile websites to complement the library’s small collection of licensed mobile content. the chart of website visits on the dashboard surprisingly shows that the number of visits that come from mobile devices is consistently less than 3 percent, probably because of the relatively modest selection of mobile-optimized website resources.
rather than invest significant effort in cataloging additional potentially lackluster free resources that would not be seen by a large number of users, the team decided to wait for more headlining subscription-based resources to become available and increase traffic to the mobile site.

it would be worthwhile to add charts to the dashboard that track metrics related to new strategic initiatives requiring librarians to translate strategic ideas into measurable quantities. for example, if the library aspired to make sure users received responses more quickly, charts tracking the response time for various services could be added and grouped together to track progress on this goal.

as data continues to accumulate, it will be possible to extend the timeframe of the charts, for example, making weekly charts into monthly ones. over time, the data may become more static, requiring more complicated calculations to reveal interesting trends.

conclusions

the medical center has a strong ethic of metric-driven decisions, and the dashboard brings the library in line with this initiative. the dashboard allows librarians and management to monitor key library operations from a single, convenient page, with an emphasis on long-term trends rather than day-to-day fluctuations in use. it was put together using freely available tools that should be within the reach of people with moderate programming experience. assembling the dashboard required background knowledge of the systems in question, was made possible by nyuhsl’s use of open-source and homegrown software, and increased the designers’ understanding of the data and tools in question.

references

1 association of academic health sciences libraries, “annual statistics,” http://www.aahsl.org/mc/page.do?sitepageid=84868 (accessed november 7, 2011); association of research libraries, “arl statistics,” http://www.arl.org/stats/annualsurveys/arlstats (accessed november 7, 2011).
2 brown university library, “dashboard_beta :: dashboard information,” http://library.brown.edu/dashboard/info (accessed january 5, 2012).
3 edward r. tufte, the visual display of quantitative information (cheshire, ct: graphics, 2001), 92.
4 ibid., 56.
5 ibid., 178.
6 ibid., 153.
7 scott bateman et al., “useful junk? the effects of visual embellishment on comprehension and memorability of charts,” chi ’10 proceedings of the 28th international conference on human factors in computing systems (new york: acm, 2010), doi: 10.1145/1753326.1753716.
8 stephen few, information dashboard design: the effective visual communication of data (beijing: o’reilly, 2006), 98.
9 nils rasmussen, claire y. chen, and manish bansal, business dashboards: a visual catalog for design and deployment (hoboken, nj: wiley, 2009), ch. 4.
10 richard j. roiger and michael w. geatz, data mining: a tutorial-based primer (boston: addison wesley, 2003), 186.
11 one example: stefan waner and steven r. costenoble, “fitting functions to data: linear and exponential regression,” february 2008, http://people.hofstra.edu/stefan_waner/realworld/calctopic1/regression.html (accessed january 5, 2012).
12 emily g. morton-owens, “editorial and technological workflow tools to promote website quality,” information technology & libraries 30, no. 3 (september 2011): 92–98.
13 google, “gapi - google analytics api php interface,” http://code.google.com/p/gapi-google-analytics-php-interface (accessed january 5, 2012).
14 karen a. coombs, “lessons learned from analyzing library database usage data,” library hi tech 23, no. 4 (2005): 598–609, doi: 10.1108/07378830510636373.
picture perfect: using photographic previews to enhance realia collections for library patrons and staff

dejah t. rubel

information technology and libraries | june 2017 59

abstract

like many academic libraries, the ferris library for information, technology, and education (flite) acquires a range of materials, including learning objects, to best suit our students’ needs. some of these objects, such as the educational manipulatives and anatomical models, are common to academic libraries but others, such as the tabletop games, are not. after our liaison to the school of education discovered some accessibility issues with innovative interfaces' media management module, we decided to examine all three of our realia collections to determine what our goals in providing catalog records and visual representations would be. once we concluded that we needed photographic previews to both enhance discovery and speed circulation service, choosing processing methods for each collection became much easier. this article will discuss how we created enhanced records for all three realia collections including custom metadata, links to additional materials, and photographic previews.

introduction

ferris state university’s full-time enrollment for fall 2015 was 14,715 students. of these students, 10,216 are big rapids residents and the other 4,499 are either kendall college of art and design students or at other off-campus sites across michigan.1 during the 2014-2015 school year, flite had 14,647 check-outs including 2,558 check-outs of items in reserves, which is where our realia collections are located.2 however, reserves includes other items in addition to these collections, thus making analysis of circulation statistics problematic.
another problem with conducting such an analysis is that the educational manipulative collection already had photographic previews and the tabletop game collection is a pilot project, so there is no clear before-and-after comparison. we can, however, demonstrate that enhancing the catalog records for our anatomical model collection had a marked impact: check-outs jumped from a handful in 2014-2015 to almost 450 in 2016.

dejah t. rubel (rubeld@ferris.edu) is the metadata and electronic resources management librarian, ferris state university, big rapids, mi.
https://doi.org/10.6017/ital.v36i2.9474

literature review
although there are very few libraries using photographic previews for their realia collections, the ones that do describe similar limitations with bibliographic records and goals that only photographic previews could meet. most realia collections that warranted this extra effort are either curriculum materials or anatomical models, which is not surprising considering how difficult they are to describe. as butler and kvenild noted in their article on cataloging curriculum materials, “patrons struggled to identify which game or kit they sought based on the…information in the online catalog,” because “discovering curriculum materials in the catalog and getting a sense of the item are not easy when using traditional catalog descriptions...”3 as they continue, “the inventory and retrieval problems…were compounded by the fact that existing catalog records were not as descriptive as they should be.”4 this was also a problem for our collections because our names and descriptions were often not intuitive or precise.
in addition, as loesch and deyrup discovered while cataloging their curriculum materials collection, “…there was great inconsistency among the oclc records regarding the labeling of the format…,”5 which was another issue we needed to address. although the general material designation (gmd) has since been rendered obsolete, flite continues to use it to highlight certain material. this choice is due to some limitations with our library management system as well as our discovery layer, namely the lack of good mapping or use of the 33x fields. until this is rectified with a more modern system, we have found it easier to retain certain gmds like “sound recording”, “electronic resource”, and “realia”. thus, we needed to standardize our terms for each collection. another problem that our predecessors indicated photographic previews might resolve was missing objects or pieces of objects.6 this becomes especially important for our tabletop games collection because most of those pieces are very small and too numerous for a piece count upon return.
fortunately, “previews…can aid users in making better decisions about potential relevance, and extract gist more accurately and rapidly than traditional hit lists provided by search engines.”7 ideally, a preview will display an appropriate level of information about the object it represents in order “…to support users in making a correct judgement about the relevance of that object to the user’s information need.”8 greene goes further by listing the main roles for previews, of which the first two are the most applicable for photographic previews: aiding retrieval and aiding users in quickly making relevance decisions.9 for these uses, photographic previews of realia are ideal because users can examine the object without needing to see every detail, and they expect previews to be abstract, not exhaustive, unlike the digital surrogates that an archive would use.10 as greene also notes, the high-level goal of any preview is to "...communicate the level and scope of objects to users so that comprehension is maximized and disorientation is minimized."11 a common finding among all the previous projects was that even a single photograph provides more readily comprehensible information than several lines of description. as moeller states regarding their journal project, "they [previews of each issue's cover] give the researcher or student an immediate idea of the nature of the journal."12 he goes further to give the example of an innocuous journal title for a propagandist serial whose political nature is transparent once you view its imagery. from a staff perspective, photographic previews can also easily illustrate the number of pieces and an object's condition or orientation. this can be very useful in determining whether something is missing or damaged without having to do a time-consuming individual piece count upon check-in.
but as butler and kvenild discuss, layout within each photograph is key for illustrating missing pieces.13 unfortunately, aside from a few small projects mentioned in butler and kvenild's article, there are not many examples of photographic previews for realia collections currently being used by academic libraries. one reason might be software limitations. innovative's media management module is still unique among ils/lms software in that most vendors either provide a separate digital repository for special collections digital surrogates or they incorporate images into the catalog using third-party software like syndetic solutions™. another reason for the lack of photographic previews within catalogs may simply be the rarity of realia in academic libraries. every library certainly has a few unique pieces, like a skeleton for the pre-medical students, but often not enough to consider them an entire collection, much less a collection complex enough to warrant the extra effort of creating photographic previews of each item. at flite, we had already crossed that threshold of complexity. therefore, this article will start by discussing our educational manipulative collection, which provided the basis for how we would catalog and process the tabletop games and anatomical models.

educational manipulative collection
our first foray into creating photographic previews was completed by the previous cataloger, with over 300 items cataloged in 2004 and another 30-40 added to the collection over the next decade. unlike the other realia collections, the educational manipulatives were cataloged using innovative’s course reserves module, so no attempt was made to find or create oclc records.
nevertheless, the minimal metadata is very consistent across the collection, which supports greene’s recommendation “…that it was important to define a set of consistent attributes at the high level of the collection if any effective browsing across the collections was to be provided.”14 in our case, we rely on a combination of the gmd ([realia]), a custom call number prefix (toys box #), and a limited amount of local subject headings as shown below with “manipulatives” as the common subject for the entire collection.

690 = (d) current local subject headings in use as of 12/3/15:
art.              infant/toddler.
block props.      magnets.
boards.           manipulatives.
cognitive.        music.
discovery box.    oversize books.
discovery.        posters.
dramatics.        puppets.
finger puppets.   story apron.
flannel board.    story props.
gross motor.      woodworking.

due to the nature of descriptive metadata, photographic previews of the educational manipulatives made logical sense because “the images…are not the content. they are the metadata, the description of the materials.”15 as moeller describes, innovative’s media management module links images and many other file types directly to bibliographic records without requiring users to click an additional link unless they want to view a larger image of a thumbnail.16 similar to butler and kvenild’s project, all of our photos were 900 pixels wide by 600 pixels tall, which is slightly smaller than their default width of 1000 pixels.17 one advantage of using the media management module is its ability to automatically create thumbnails 185 pixels wide by 85 pixels tall.
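the two sizes quoted above do not share an aspect ratio (900 by 600 is 3:2, while 185 by 85 is closer to 2:1), and the article does not say how media management reconciles them. the python sketch below is not the module's algorithm; it only assumes that 185 by 85 is a bounding box and shows how an image would be scaled to fit inside it without distortion. the function name `fit_within` is hypothetical.

```python
def fit_within(width, height, max_width, max_height):
    """Scale (width, height) down to fit a bounding box, keeping the aspect ratio.

    Assumption for illustration only: the thumbnail size is treated as a
    bounding box. Images already smaller than the box are not enlarged.
    """
    if min(width, height, max_width, max_height) <= 0:
        raise ValueError("all dimensions must be positive")
    # Compare aspect ratios by cross-multiplication to stay in integers.
    if width * max_height >= height * max_width:
        # The image is relatively wider than the box: width is the limit.
        new_width = min(width, max_width)
        new_height = max(1, height * new_width // width)
    else:
        # The image is relatively taller than the box: height is the limit.
        new_height = min(height, max_height)
        new_width = max(1, width * new_height // height)
    return new_width, new_height


if __name__ == "__main__":
    # A 900 x 600 photo fitted into a 185 x 85 box.
    print(fit_within(900, 600, 185, 85))
```

under this assumption a 900 by 600 photograph would not fill the full 185-pixel width, which is one reason a system might instead crop or pad its thumbnails.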
a bigger advantage is that the images are hosted on the same server that runs our catalog, which allows us to freely distribute the images in an intuitive manner (thumbnails instead of links) without having to worry about authentication to a shared folder from off-campus, unlike our pdf files. unfortunately, our liaison to the school of education recently discovered some accessibility issues with media management that forced us to consider whether we should change the embedded photographic previews to external links. the most significant of these problems is simply the language of the proprietary viewer software. because it is written in java, if you click on a thumbnail for a larger image, many browsers, like chrome, will not run it and those that will often require a security exception to do so. we have attempted to ameliorate some of these issues by providing an faq entry on which browsers are best for viewing these images and how to add a security exception for our website, but unless or until innovative rewrites this software in a different language, these accessibility issues will persist because java is being phased out of many browsers. butler and kvenild also noted its slow response time compared to their own server.18 another issue they mentioned was that the thumbnails would not be visible in their consortial catalog, so they needed to add links in the 856 field for these users.19 this is less of an issue for us because we do not contribute any of our realia records to our consortia catalog, but moeller’s concern that in general “…enhancements involving scanned images…will not be easily shared with other libraries,”20 is entirely valid. unlike oclc records, there is no way to share attached or embedded images as part of the metadata and not the content. 
contrariwise, butler and kvenild’s concerns regarding catalog migration are very pertinent because we are considering moving to a new lms within the next few years.21 although we acknowledge that “utilizing 856 tags is an indirect method of accessing the images, as users must take the initiative to follow the links,” we will eventually have to move and link our photographic previews to ensure accessibility after migration.22

tabletop game collection
unlike the educational manipulatives, the majority of the tabletop game collection was previously cataloged in oclc, so finding good bibliographic records was easy. once downloaded, we decided to add a unique gmd ([game]), custom call number prefix (board game box #), and local subject heading “tabletop games”. however, our emerging technologies librarian, who coordinated this pilot project, felt that the single subject heading was not descriptive enough. so he gave us a spreadsheet with more specific subject headings such as “deck building”, “historical”, and “resource management” that we added as genre/form subject headings in the 655_4 field. he also suggested that we add links to the rule books, which we did using the 856 field and the link text “connect to rule book (pdf)”. because tabletop games are commercial products, finding images online was also easy. at first, we had some concerns about copyright, but we are not reselling these products or using the image as a replacement for the item. so, we concurred with butler and kvenild that “…the images in our project fall under copyright fair use.”23 another plus to using commercial images is that we could use more than one to show various aspects of setup and play. the downside is that image sizes and the content photographed varied widely, so we used our best judgement in creating labels and tried to keep them as consistent as possible.
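the genre/form headings and rule-book links described above can be written out as marc fields. the python sketch below is illustrative rather than a record of what flite loaded: it prints fields in the mnemonic text style used by common marc editors, where a backslash stands for a blank indicator. the 655 indicators follow the article's “655_4”. the 856 indicators (4 and 2), the use of subfield $y for the link text, the function names, and the example url are all assumptions made for this sketch.

```python
def genre_field(term):
    """Build a 655 genre/form field: first indicator blank, second indicator 4."""
    return f"=655  \\4$a{term}"


def rule_book_field(url, link_text="connect to rule book (pdf)"):
    """Build an 856 link field.

    Assumed for illustration: indicators 4 (HTTP) and 2 (related resource),
    with the display text carried in subfield $y.
    """
    return f"=856  42$u{url}$y{link_text}"


def enhancement_fields(genres, rule_book_url):
    """Return the extra fields for one game record, genres first."""
    fields = [genre_field(term) for term in genres]
    fields.append(rule_book_field(rule_book_url))
    return fields


if __name__ == "__main__":
    # Hypothetical game with two of the genre terms named in the article.
    for line in enhancement_fields(
        ["deck building", "resource management"],
        "https://example.org/rules/sample-game.pdf",  # placeholder URL
    ):
        print(line)
```

a local system may expect the display text in a different subfield, so that choice should be checked against the catalog's own documentation before any batch load.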
to ensure consistency across the collection, we decided that the first image should always be the top of the game’s box labeled “box cover” or “box cover – front” if there was a “box cover – back” image. (we only displayed the back of the box cover if there was significant information about the game printed on it.) then we added up to five additional images showing parts of the game like “card examples”, “game pieces”, and “game set-up”. overall, this number of images worked very well in both encore’s attached media viewer and the classic catalog/web opac, but for a few games syndetic solutions™ supplies a duplicate image. this results in a larger version of the box top image displaying to the right of the title and above the smaller thumbnails of images we added using media management. with regard to piece counts, we presumed that we would need photographic previews to aid in piece counting upon return of a tabletop game. however, our emerging technologies librarian assured us that because we are an educational institution, we could contact the vendor for free replacement pieces at any time. he also emphasized that unlike the educational manipulatives or the anatomical models, this was a pilot collection, so extensive processing would not be a good investment of our labor. the anatomical model collection, by contrast, would require images for piece counts as well as several other cataloging customizations to increase discoverability and speed circulation.

anatomical model collection
similar to our educational manipulative collection, but not nearly as extensive, our anatomical model collection has been a part of flite since its inception. unlike the manipulatives, which are used primarily by the early childhood education students, the anatomical models support a range of allied health programs including but not limited to dental hygiene, radiology, and nursing.
the majority of our two dozen models were purchased in the 20th century and, like the manipulatives, the majority were cataloged using innovative’s course reserves module. unfortunately, none of these records were very descriptive, some being so poor as to be merely a title like “jawbones” and a barcode. so, the first task was to match objects with oclc records. fortunately, this task became easier once we discovered that, for anyone who does not know human anatomy, matching the object to the vendor’s catalog image and then searching oclc by vendor model name or number works better than deciphering written descriptions. once good bibliographic records were downloaded, we decided to add one of three gmds depending on the type of model ([model], [chart], or [flash card]), a custom call number prefix (model #), and one or more of the local subject headings shown below.

690 = (d)
anatomy model.           anatomy chart.
anatomy models.          anatomy charts.
dental hygiene model.    dental model.
dental hygiene models.   dental models.

technically, all dental models could be used as anatomical models, but not vice versa. therefore, the common subject headings for the collection are “anatomy model” and “anatomy models”. to make things easier to shelve, retrieve, and inventory, we also designed numeric ranges for the call numbers, as shown below, so we would know what type of model we should expect when referring to a specific model number.
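the call number ranges in the 099 hierarchy for this collection amount to a small lookup table. as a minimal python sketch, assuming call numbers are written as the prefix “model #” followed by up to three digits, a model number can be mapped to its expected type as follows. the names `MODEL_RANGES` and `model_category` are hypothetical and are not part of any catalog software.

```python
import re

# Ranges copied from the 099 hierarchy described in the article.
MODEL_RANGES = [
    (1, 99, "anatomical charts and flash cards"),
    (100, 199, "articulated skeletons"),
    (200, 299, "disarticulated skeletons and bone kits"),
    (300, 399, "organs"),
    (400, 499, "skulls (anatomical and dental hygiene)"),
    (500, 599, "other dental models (dental studies, dental decks)"),
]


def model_category(call_number):
    """Return the model type for a call number such as 'model #204'.

    Returns None when the number falls outside every defined range and
    raises ValueError when the call number does not match the assumed format.
    """
    match = re.fullmatch(r"\s*model\s*#\s*(\d{1,3})\s*", call_number, flags=re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognized call number: {call_number!r}")
    number = int(match.group(1))
    for low, high, label in MODEL_RANGES:
        if low <= number <= high:
            return label
    return None


if __name__ == "__main__":
    for example in ("model #001", "model #204", "model #450"):
        print(example, "->", model_category(example))
```

a check like this could flag a model shelved or labeled outside its expected range during inventory.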
099 = (c) model #00x following this hierarchy:
001-099 anatomical charts and flash cards
100-199 articulated skeletons
200-299 disarticulated skeletons and bone kits
300-399 organs
400-499 skulls (anatomical and dental hygiene)
500-599 other dental models (dental studies, dental decks)

we also scanned and linked pdfs of the heavily worn model keys with the link text “connect to key pdf” before washing and rehousing all the models. once they were clean, they were ready for their shoot with ferris state university’s media production team. due to winter break, media production was able to shoot the majority of the collection fairly quickly. they returned to us high-resolution tiffs the same size as those for the manipulatives, 900 pixels by 600 pixels. in case of java viewer failure, we requested that there be one top-level image that showcases exactly what the model contains, with images of individual pieces or drawers as the succeeding images. for example, our disarticulated skeletons are housed in small plastic carts with three drawers in each cart. therefore, the first image would be a shot of all the pieces of the disarticulated skeleton and the second image would be the contents of the top drawer, the third image the contents of the middle drawer, and the last image the contents of the bottom drawer. in this specific example, we re-used the images that we posted in the catalog record by pasting them on top of the cart to show circulation staff what to expect in each drawer upon check-in. overall, photographic previews for this collection appear to be working very well for both catalog users and circulation staff “…to inform users about size, extent, and availability of collections or objects.”24 in fact, they have been working so well for this collection that usage has increased dramatically compared to previous years.

figure 1.
circulation statistics 2014-2016 (bar chart of check-outs per year for 2014, 2015, and 2016, with series for manipulatives, models, and games; data labels as extracted: 367, 317, 114, 10, 1, 444, 24)

conclusions and future directions
although we implemented photographic previews for three realia collections, we could not define any standard workflow for the process beyond correcting or downloading the metadata first and adding the images second. part of this is due to our working primarily with legacy collections because we often discovered issues, like the model keys, while working through another issue. the other part is due to the nuances involved in processing realia in general. even with good, readily available catalog records like those for the tabletop games, time still had to be spent separating, organizing, and rehousing game pieces as well as hunting down useful images. unfortunately, any type of realia processing, even if it is just textual description, is much more time-consuming than the majority of academic library cataloging. adding in the extra steps to create, upload, and link a photographic preview can nearly double that labor investment. notwithstanding, as butler and kvenild advocate “…not supplying images as metadata for items that most need them (i.e. kits, games, and models) is to make them nearly irretrievable. providing bare-bones traditional metadata for these items is analogous to delegating them to the backlog shelves of yesteryear.”25 unfortunately, neither the library management system nor the third-party catalog enhancement market currently provides a good solution to this problem. considering how great an impact photographic previews have had in the online retail market, this lack of technical support is surprising. yes, syndetic solutions™ is a great product for cover images and tables of contents for books.
however, once you go beyond traditional resources, there is a great need to allow institutions to submit their own images as part of catalog record enhancement rather than as separate digital surrogates in a digital repository. this could be done either within the library management system, like the media management module, or as an option for catalog enhancement where libraries could add images to either a shared database or their own database using standard identifiers on a third-party platform like syndetics™. further research on photographic previews is also sorely needed. as of this writing, we only have a handful of case studies and some guiding philosophy on the use of previews. consultation with internet retailers and literature on online marketing might be more applicable than library science research to evaluate their impact, but research into their direct impact vs. textual descriptions on catalog use would be ideal.

references
1. fact book 2015 – 2016 (big rapids, mi: ferris state university institutional research & testing, 2016), http://www.ferris.edu/htmls/admision/testing/factbook/factbook15-162.pdf, 47.
2. ibid., 12.
3. marcia butler and cassandra kvenild, “enhancing catalog records with photographs for a curriculum materials center,” technical services quarterly 31 (2014): 122-138, https://doi.org/10.1080/07317131.2014.875377, 122-124.
4. ibid., 126.
5. martha fallahay loesch and marta mestrovic deyrup, “cataloging the curriculum library: new procedures for non-traditional formats,” cataloging & classification quarterly 34, no. 4 (2002): 79-89, https://doi.org/10.1300/j104v34n04_08, 82.
6. butler and kvenild, “enhancing catalog records with photographs,” 128.
7. stephan greene, gary marchionini, catherine plaisant, and ben shneiderman, “previews and overviews in digital libraries: designing surrogates to support visual information seeking,” journal of the american society for information science 51, no. 4 (2000): 380-393, https://doi.org/10.1002/(sici)1097-4571(2000)51:4<380::aid-asi7>3.0.co;2-5, 381.
8. ibid.
9. ibid., 384.
10. ibid., 385.
11. ibid.
12. paul moeller, “enhancing access to rare journals: cover images and contents in the online catalog,” serials review 33, no. 4 (2007): 231-237, https://doi.org/10.1016/j.serrev.2007.09.003, 235.
13. butler and kvenild, “enhancing catalog records with photographs,” 128.
14. greene et al., “previews and overviews in digital libraries,” 388.
15. butler and kvenild, “enhancing catalog records with photographs,” 124.
16. moeller, “enhancing access to rare journals,” 234.
17. butler and kvenild, “enhancing catalog records with photographs,” 129.
18. ibid., 132.
19. ibid., 126.
20. moeller, “enhancing access to rare journals,” 237.
21. butler and kvenild, “enhancing catalog records with photographs,” 131.
22. ibid., 135.
23. ibid., 134.
24. greene et al., “previews and overviews in digital libraries,” 386.
25. butler and kvenild, “enhancing catalog records with photographs,” 136.

article
rarely analyzed: the relationship between digital and physical rare books collections
allison mccormack and rachel wittmann
information technology and libraries | june 2022
https://doi.org/10.6017/ital.v41i2.13415

allison mccormack (allie.mccormack@utah.edu) is the original cataloger for special collections, university of utah. rachel wittmann (rachel.wittmann@utah.edu) is the digital curation librarian, university of utah. © 2022.

abstract
the relationship between physical and digitized rare books can be complex and, at times, nebulous. when building a digital library, should showcasing a representative slice of the physical collection be the goal? should stakeholders focus on preservation, high-use items, or other concerns?
to explore these conundrums, a special collections librarian and a digital services librarian performed a comparative analysis of their library’s physical and digital rare books collections. after exporting marc metadata for the rare books from their ils, the librarians examined the place of publication, publication date, and broad subject range of the collection. they used this data to create a variety of visualizations with the free digital humanities tool tableau public. next, the authors downloaded the rare books metadata from the digital library and created illuminating data visualizations. were the geographic, temporal, and subject scopes of the digital library similar to those of the physical rare books collection? if not, what accounted for the differences? the implications of these and other findings will be explored.

introduction
as of august 2019, the special collections division of the university of utah j. willard marriott library held over 256,000 printed works and archival collections. approximately 22% of the collection, or just over 55,000 works, belongs to the rare books department (https://lib.utah.edu/collections/rarebooks/), which contains not only books but serials, maps, manuscripts, ephemera, and other formats. the collection covers over 4,000 years of human history, with its earliest piece, a cuneiform tablet, dating to the mid-twenty-third century bce; contains works from nearly 100 different countries; and represents a wide variety of topics, including the exploration and settlement of the american west and the history of the book. the rare books department, a subset of special collections, specifically seeks to document the history of written human communication and actively collects historical items to enhance teaching and research at the university of utah. the marriott library has been adding digitized works from the rare books department to its digital library (https://collections.lib.utah.edu/) for over 25 years.
approximately 780 works, or 1.42% of the rare books collection, have been digitized to date. however, no formal collection development plan was ever written, and rare books were selected for digitization by both curators and patrons. unfortunately, the reason a particular item was digitized is not recorded in the system: it is unclear if age, research value, physical condition, a desire to bring forward underrepresented stories, or a combination of these and other factors influenced the decision to digitize a rare book. this piecemeal approach to digital library collection development, while not uncommon, made it difficult for library staff and patrons to determine the relationship between the digital and physical collections of rare books. it also presented challenges when library staff attempted to communicate the scope and intent of the digital library to patrons, who assumed that the digitized items were representative of the overall collection. given their expertise in library metadata, the authors decided to analyze both traditional library catalog records and digital library records for the rare books collection and explore whether the digital collection was proportionally representative of the physical collection or if it differed in geographic, temporal, or subject scope in a meaningful way. they then created a series of data visualizations to better communicate information about the library’s rare books holdings.

literature review
while much has been written about methods and criteria for selecting special collections items to be digitized and the effects of digitization on collection accessibility, few authors have discussed the relationships between digital collections and the physical collections from which they were sourced.
in their highly detailed treatise on selection strategies for digitization, ooghe and moreels identify representativity, a method that “aims for a final selection that provides a representative view of the original collections,” as one of 25 selection criteria for digitization projects.1 however, alexandra mills notes that “without a thorough understanding of the institution and collections, it is impossible to create truly representative collections.”2 because many digitization initiatives are undertaken in response to user requests, preservation concerns, or the availability of project-based funding, it is likely that most libraries do not plan for their digital collections to be representative of their overall special collections holdings. as peter michel states, the digital collections at the university of nevada, las vegas, were explicitly built with popular history and popular culture in mind and were never intended to be “surrogates of the collection.”3 bradley daigle of the university of virginia explained that digitization could be undertaken to alleviate preservation concerns, respond to defined research needs, or to brand certain online content, but this approach could give the mistaken impression “that only the important materials are digitized.”4 despite the gaps in the literature, having an explicit collection development policy is still considered paramount; indeed, it is the very first principle listed in the national information standards organization (niso)’s framework for building “good” digital collections.5 to investigate this type of documentation further, the authors ran a google search using the search term “digital collection development policy site:edu”.
this yielded 10 publicly accessible digital collection development policies from academic libraries in the united states:6
• amherst college library (https://www.amherst.edu/library/services/digital/digitalcolldev)
• emerson college archives and special collections (https://www.emerson.edu/policies/digital-collections-development-policy)
• colorado state university libraries (https://lib.colostate.edu/digital-collection-development-policy/)
• florida atlantic university digital library (https://library.fau.edu/policy/digital-library-collection-development-policy)
• georgetown university library (https://www.library.georgetown.edu/digital-project-policy)
• northern illinois university digital library (https://digital.lib.niu.edu/policy/collection-development-policy)
• oregon health & science university digital collections (https://www.ohsu.edu/library/ohsu-digital-collections-development-policy)
• university of north texas university libraries (https://library.unt.edu/policies/collection-development-digital-collections/)
• wesleyan university digital library (https://digitalcollections.wesleyan.edu/about/what-we-collect)
• williams college special collections (https://specialcollections.williams.edu/collection-development-policies/digital-collections/)

a review of these 10 policies shows that their content is largely homogeneous. almost all of the policies include a mission statement, scope, and selection criteria for potential digital collection items. all policies include criteria that physical materials should meet in order to qualify for digitization. the most common criteria for digitization are materials that are rare or unique, high-use, fragile, important to institutional or regional history, and/or support campus curriculum or faculty research. in addition, the clearance to publish materials online is ubiquitous among the policies. materials eligible for online display must either be in the public domain or have intellectual property rights held by the institution; materials currently under copyright require permission from the copyright holder. a measured approach to digitization qualification has been employed by the university of north texas (unt) libraries’ digital collections and the northern illinois university digital library (niudl). unt libraries’ digital collections policy lists levels of criteria that materials must meet in order to be digitized and included in the digital library; to qualify for digitization, all criteria on level one must be met while only one criterion from level two is needed. niudl includes a priority factor rubric with criteria categories and a corresponding numerical scale; the maximum score is 35 points, and a higher value signifies a higher priority. six of the 10 policies include prioritizing materials that support diversity and inclusion missions on campus. amherst college has leveraged their digital collection development policy to include content that would increase perspectives of underrepresented groups within the digital collections and traditionally underrepresented groups more broadly.
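the two qualification schemes described above, unt's tiered criteria and niudl's point rubric, can be expressed as simple rules. the python sketch below is illustrative only: the criterion names and point values are hypothetical placeholders rather than the wording of either policy, and the only details taken from the text are that every level-one criterion plus at least one level-two criterion must be met, and that rubric scores total at most 35.

```python
def qualifies_tiered(level_one, level_two):
    """Tiered rule: every level-one criterion and at least one level-two criterion."""
    return bool(level_one) and all(level_one.values()) and any(level_two.values())


def priority_score(category_scores, max_total=35):
    """Rubric rule: sum per-category points; a higher total means a higher priority."""
    if any(points < 0 for points in category_scores.values()):
        raise ValueError("category scores must not be negative")
    total = sum(category_scores.values())
    if total > max_total:
        raise ValueError(f"total {total} exceeds the rubric maximum of {max_total}")
    return total


if __name__ == "__main__":
    # Hypothetical candidate item; criterion names are placeholders.
    level_one = {"rights_cleared": True, "within_collection_scope": True}
    level_two = {"rare_or_unique": False, "high_use": True, "fragile": False}
    print(qualifies_tiered(level_one, level_two))

    # Hypothetical rubric categories and points.
    print(priority_score({"research_value": 8, "condition": 5, "uniqueness": 7}))
```

the tiered rule yields a yes-or-no decision, while the rubric produces a ranking, so a library could use the first as a gate and the second to order the queue.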
niudl includes marginalized groups as a collection priority area in order to “deepen public understanding of the histories of people of color and other communities and populations whose work, experiences, and perspectives have been insufficiently recognized or unattended” and lists over 20 such groups. the collection candidate’s relationship to other collections is outlined in four of the 10 policies. georgetown university requires that “the materials form a coherent collection, fill gaps in existing collections, or complement existing collection strengths.” amherst college evaluates whether digitization would “enhance public awareness of archives’ collection strengths.” another function of a digital collection development policy is to inform the public on the scope and provenance of contents in their digital library. the unt digital collection policy includes a section outlining the content contributors, including partners, which can be beneficial for large-scale digital libraries that host collections from multiple partners. unt is also exemplary in defining collection curators and their responsibilities while underscoring the nature of this role, likely changing over time and not set to an individual. with no written digital collection development policy regarding special collections at the marriott library, the authors would first have to analyze both the physical and digital special collections before determining what factors may have influenced the digitization of these materials. libraries are gathering massive amounts of data, ranging from the metadata of their varied collections to patron usage statistics of both physical and digital collections. 
interpretation of the ever-growing accumulation of data can quickly become complex. by visualizing data, we are able to interpret large and often messy sets of data while processing multiple aspects of the data concurrently. for example, the ohio state university (osu) libraries used tableau desktop to combine data from various departments in order to better manage and explore information.7 tableau was osu’s data visualization software of choice due to its ease of use and accessibility, and the program was also used to create dashboards that blend data from various sources for real-time visualizations.

bibliographic metadata cleanup

to understand the marriott library’s collections, one must first understand the relevant metadata, which for the rare books department is in the machine-readable cataloging (marc) format. a popular criticism of marc, commonly used in traditional library cataloging, is that the schema is highly regulated and, at times, redundant. however, for the purposes of this project, those qualities proved to be a boon. an older, uncorrected record in the digital library might list london as the place of publication for a particular book, but it was not immediately apparent if that referred to london, england; london, ontario; or london, ohio.
however, a marc record would not only list a book’s city of publication in the 260 or 264 field but would also contain a two- or three-letter code in the 008 field that specified the country, us state, canadian province or territory, or australian state or territory in which it was published. for this reason, the authors decided to base their analysis on marc record data from the physical collection instead of the dublin core metadata used in the digital library.

in order to tease out the relationships between our digital and physical collections, each of the approximately 55,000 rare books bibliographic records stored in alma, the marriott library’s cloud-based library services platform, would have to have a common set of data points that could be compared. for the purposes of this analysis, the authors chose to investigate the place of publication and the subject of each work. despite the relative rigidity of marc metadata, some of the alma records lacked country of publication data in the 008 field. these records were not incorrect, but merely outdated: some had been copied directly from paper catalog cards when the library first transitioned to a computer-based cataloging system, while others were created using different metadata standards. approximately 6,000 rare books either completely lacked a country code in the 008 field or had data that could possibly be enhanced by, for example, replacing a code for the united states with a code for a particular state. instead of editing all 6,000 records by hand, the cataloger wrote several metadata normalization rules in alma to automatically correct the most obvious errors. records that listed chicago as the place of publication were assigned the marc geographic code for illinois, while those that were published in lugduni batavorum, the latin designation for leiden, were given the geographic code for the netherlands.
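the rule-writing step described above can be illustrated outside alma. the following is a minimal python sketch, not alma's own normalization-rule syntax: the function names, the `AMBIGUOUS_PLACES` set, and the review flag are hypothetical, and only the chicago and lugduni batavorum examples come from the text (`ilu` and `ne` are the marc country codes for illinois and the netherlands).

```python
import re

# Place-name rules. Only these two examples are named in the article;
# a real rule set would be much longer.
PLACE_TO_MARC_CODE = {
    "chicago": "ilu",           # Illinois
    "lugduni batavorum": "ne",  # Netherlands (Latin name for Leiden)
}

# Hypothetical list of city names too ambiguous to assign automatically.
AMBIGUOUS_PLACES = {"cambridge", "london"}


def normalize_place(place):
    """Drop brackets and ISBD punctuation, collapse spaces, lowercase."""
    cleaned = re.sub(r"[\[\]:;,?]", " ", place)
    return " ".join(cleaned.split()).lower()


def suggest_code(place, existing_code=""):
    """Return (code, needs_review) for one record.

    A matching rule overrides the existing 008 code (for example, a
    generic United States code becomes a state code). Otherwise the
    existing code is kept, and the record is flagged for manual review
    when the place is ambiguous or no code is present at all.
    """
    key = normalize_place(place)
    existing = existing_code.strip()
    if key in PLACE_TO_MARC_CODE:
        return PLACE_TO_MARC_CODE[key], False
    return existing, (key in AMBIGUOUS_PLACES or not existing)
```

under these assumptions, `suggest_code("Chicago :", "xxu")` upgrades the generic united states code to the illinois code, while a record whose place is `cambridge` and whose 008 code is blank is returned unchanged and flagged for the cataloger.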
however, 3,000 records were unable to be enhanced in this manner, either because their place of publication was an ambiguous city name like cambridge or because the place of publication was listed as unknown. the cataloger examined each record individually and was ultimately unable to assign a marc geographic code to 1,682 records, most of which were arabic manuscripts or advertising pamphlets that simply did not list a place of publication or creation. while these records would be excluded from the place of publication analysis, they could be mined for data on other topics. with the marc records as complete as possible, the metadata was exported from alma into an excel spreadsheet and given to the metadata librarian for further manipulation.

metadata transformation & visualization creation

the next phase involved standardizing the raw metadata to create the human-readable data, rather than marc codes, needed to produce data visualizations. once the physical rare books’ bibliographic metadata was updated in alma, it was then exported as a comma-separated values file. the raw data export produced a massive spreadsheet containing over 50,000 marc records. these records included two- and three-letter location codes for the place of publication from the library of congress marc code list for geographic areas. two-letter codes are used for most countries, while three-letter codes are used for states within the united states, provinces within canada, and territories within the united kingdom. while this additional level of location data was available for books from the united kingdom and canada, it was decided to review the collection at a country level for consistency and map display. books from the united states, however, were analyzed on a state level, considering the research is germane to an american institution.
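the reporting levels described above (states for the united states, whole countries elsewhere) could be applied to the exported codes with a small roll-up step. the sketch below is an illustration under stated assumptions, not the authors' workflow: it relies on the suffix pattern visible in codes such as `nyu`, `onc`, and `enk`, where the third letter marks the parent country, and the function name `reporting_code` is hypothetical.

```python
from collections import Counter

# Assumed suffix convention for three-letter MARC country codes:
# "u" = United States, "c" = Canada, "k" = United Kingdom, "a" = Australia.
# US codes are deliberately absent here because they stay at state level.
SUFFIX_TO_COUNTRY = {"c": "xxc", "k": "xxk", "a": "at"}


def reporting_code(marc_code):
    """Map an 008 place code to the level used for analysis.

    US state codes are kept as they are; Canadian, UK, and Australian
    subdivision codes are rolled up to their country code; two-letter
    country codes pass through unchanged. Any other three-letter code
    is returned untouched so it can be checked by hand.
    """
    code = marc_code.strip().lower()
    if len(code) == 3 and not code.endswith("u"):
        return SUFFIX_TO_COUNTRY.get(code[-1], code)
    return code


# Hypothetical sample of exported codes (two-letter codes are blank-padded).
sample = ["nyu", "utu", "onc", "enk", "stk", "gw ", "fr "]
counts = Counter(reporting_code(code) for code in sample)
```

in this sample the english and scottish codes are counted together under `xxk`, while new york and utah remain separate entries, matching the state-level treatment of american imprints.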
using a list correlating these codes to the location name provided by the library of congress (https://www.loc.gov/marc/countries/countries_code.html), a vlookup formula was used in microsoft excel to add the location names to the marc records. the vlookup formula pulls in data from one table to another as long as the two tables have one data field in common. in this exercise, both tables of data contained the library of congress location codes; therefore, the lc location codes were used to add the location names to the table containing the marc metadata. once the location names were added, some additional quality control steps were required, as lc location names that included outdated country names posed issues when mapping the data to current country names and boundaries. for example, we combined the codes for east germany and west berlin into the one representing contemporary germany. for countries that have since been dissolved and divided into multiple countries, e.g., the ussr and czechoslovakia, the records were manually checked for city names and then added to the current country. once this process was completed, the results showed the rare books were published in 97 countries and all 50 united states, as well as the district of columbia.

examining the subject content of the rare books physical collection was another aspect of analysis for this project. in contemplating this analysis, the authors considered using the lc subject heading field; however, faceting of lc subject headings and the structure of the exported data posed too many issues for a rather simple analysis. instead, the library of congress call number was used to extract high-level lc classification information for each work by separating the first two letters of the call numbers included in the exported marc metadata, which indicated lc class and subclass.
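the two spreadsheet steps described above, the exact-match lookup and the call-number split, can be sketched in python. this is an illustrative equivalent rather than the authors' excel workbook: the lookup tables are tiny hand-picked excerpts, and the field names `country_code` and `call_number` and both function names are hypothetical.

```python
import re

# Small excerpts standing in for the full Library of Congress lists.
LOCATION_NAMES = {"ilu": "Illinois", "ne": "Netherlands", "gw": "Germany"}
LC_CLASS_NAMES = {"P": "Language and Literature", "Q": "Science"}
LC_SUBCLASS_NAMES = {"PS": "American literature", "QA": "Mathematics"}


def lc_prefix(call_number):
    """Return the leading letters of a call number, at most two, uppercased."""
    match = re.match(r"\s*([A-Za-z]{1,2})", call_number)
    return match.group(1).upper() if match else ""


def enrich(record):
    """Add readable place and classification names to one exported record.

    Each dict.get call plays the role of an exact-match VLOOKUP: the
    shared key is the code, and a missing key yields "unknown".
    """
    code = record["country_code"].strip().lower()
    prefix = lc_prefix(record["call_number"])
    enriched = dict(record)
    enriched["place_name"] = LOCATION_NAMES.get(code, "unknown")
    enriched["lc_class"] = LC_CLASS_NAMES.get(prefix[:1], "unknown")
    enriched["lc_subclass"] = LC_SUBCLASS_NAMES.get(prefix, "unknown")
    return enriched


row = enrich({"country_code": "ilu", "call_number": "PS3515.E37 O4"})
```

returning "unknown" for a missing key, rather than raising an error, keeps unmatched records visible in the output so they can be reviewed later, much as an unmatched vlookup cell would stand out in a spreadsheet.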
to add the lc class and subclass names to these letters, a vlookup formula was used again to match the letter codes to the list of lc classification categories. once classification categories were added to the 55,000 records, works from all 21 lc master classes and 190 subclasses were represented in the rare books collection.

in addition to the physical rare books collection held at the marriott library, there is a selection of this collection that has been digitized and is accessible in the marriott digital library. the rare books digital collection (https://collections.lib.utah.edu/search?facet_setname_s=uum_rbc) comprises 780 works, although this number includes unique records for individual volumes within a series and therefore is not directly comparable to the marc metadata, which contains one record for a series. for example, the silver reef miner, a newspaper “devoted to the mining interests of southern utah” published during the late nineteenth century, has 30 individual volumes in the digital library, but these are represented in just one marc record. in order to compare the digital collection to the physical collection, the datasets would need to have consistent data for comparison, namely place of publication and lc classification derived from call numbers. the digital collection metadata is in the dublin core schema, which does not include all of the metadata found in the marc metadata, nor does it use the same format. while there is a dublin core spatial element used to capture geographic data on what the item is about, this does not always align neatly with the location of an item’s publication.
for example, reise in das innere nord-america in den jahren 1832 bis 1834 (2 volumes) is a book printed in germany that documents an expedition to north america in 1832–1834 and includes illustrations of native american people from the swiss artist karl bodmer. for these volumes, the appropriate dublin core spatial data would include the specific regions the expedition traveled to in north america; the marc 26x field, however, contains koblenz, germany, the city where the volumes were published. call number data was included for many digitized works, but not in a consistent format. in order to use the same data to compare the physical rare books collection to the digital one, the digital collection metadata was updated with the improved/accurate call numbers found in the marc metadata. another improvement to the digital collection metadata was the addition of the metadata management system (mms) id, a unique numerical identifier that aids in locating a record in the alma system. when the rare books’ descriptive metadata was originally converted to dublin core during the digitization process, some titles and call numbers were changed and became different from their physical counterparts. the inclusion of the mms id allows for a consistent identifier between the physical and digital collections.

when selecting data visualization software, being able to create a map of the places where books in the rare collection were published was a priority. considering the goal of creating an easily replicable workflow for other libraries, the authors sought a freely accessible program that did not require advanced geospatial skills, unlike esri’s arcgis software. tableau software is a data visualization software package with both a public and desktop version. the tableau desktop version requires a subscription fee while tableau public is open access.
for the purposes of this study, tableau public offered open access and mapping features that require no geospatial knowledge.

analysis

creating a variety of data visualizations made information about the rare books physical and digital collections more apparent than it would be from merely browsing entries in a spreadsheet. for example, there are numerous geographic disparities between the two collections of rare materials, as shown in the american states in which works from the collections were published. while books from all 50 states are found in the physical collection (fig. 1), only 18 states are represented in the digital library (fig. 2), with new york being the state in which the highest number of books were published. as new york city has long been a major publishing center in the united states, the authors were not surprised by this. however, the subsequent states were quite different: california and utah ranked second and third for the physical collection, while massachusetts and pennsylvania claimed those spots for the digital library. the authors believe several factors might influence this discrepancy. first, works can only be added to the digital library if they are no longer in copyright, and states with longer histories of european-american settlement are more likely to have published books that are now out of copyright. furthermore, these older books are more likely to be in a fragile condition and therefore may have been digitized to decrease the amount of physical handling to which they are subjected.

figure 1. marriott library physical rare books by us state.

figure 2. marriott library digital rare books by us state.

there are other discrepancies when comparing the country of publication between the physical (fig. 3) and digital collections (fig. 4).
while 61% of the physical rare books were published in the united states, only 20% of the digitized works were published in this country. the authors expected to see egypt rank highly in the physical collection, as many of the rare books were purchased by former university of utah professor dr. aziz atiya to support the middle east center for research he founded; similarly high in rank, britain, germany, france, and italy were all major centers for the early printing and publishing trade in early modern europe. however, there is strong geographic bias in the digital collection, as only north america, western europe, and one african country are represented online. copyright may again play a factor, as the earliest books from non-western countries in the collection often date to the twentieth century, but a eurocentric or other bias cannot immediately be discounted. while the physical collection contains many more european imprints than imprints from the global south, it is much more diverse than the digital collection.

figure 3. marriott library physical rare books by country.

figure 4. marriott library digital rare books by country.

the subjects represented in the collection proved to be somewhat challenging to study. due to the nature and structure of library of congress subject headings, which attempt to mirror natural language and may be composed of “strings” of phrases to represent complex topics, no tableau public visualization could be created that effectively grouped similar content areas together without looking quite fragmented.
instead, the authors based their analysis of subjects on library of congress classification numbers (i.e., call numbers) assigned to works, which, though not exact, can be understood as distillations of the subject of a work.8 once again there were considerable differences between the physical and digital rare books collections (fig. 5). as in many generalized special collections, literature and history make up significant portions of the physical collection. however, works on bibliography, or the study of books and book history, comprise a notable percentage of the collection. many of these are modern works on book history and special collections librarianship and therefore are unable to be digitized due to copyright law. nearly 9% of the digital collection is on the sciences, though these works comprise only 3% of the physical collection. while this portion of the holdings may be relatively small, it contains many scientific high points such as vesalius’ de humani corporis fabrica, early printings of ancient mathematical texts, and the journals of major scientific societies, which may have been digitized both for physical preservation and because of high interest on the part of students and faculty on campus.

figure 5. percentage of rare books physical and digital collections by library of congress class.

next steps

now that the first phase of the project is complete, the authors would like to conduct additional analyses. first, they plan to compare the usage statistics of the digital rare books collection to the circulation statistics of the physical collection. this method of inquiry was not possible at the start of the project, as circulation information for the rare books was previously not tracked in the integrated library system. now that rare books are checked out to patrons for use in the special collections reading room, this data can be quickly pulled from alma.
once there is a year’s worth of circulation data for the rare books unhindered by the changes necessitated by the coronavirus pandemic, the authors will compare it with the usage statistics of the digital collection for the same time period. do patrons in the reading room look at similar materials to online patrons, or are their interests vastly different? are some rare books used so frequently that they would benefit from the added physical security that digitization brings?

the authors also plan to pull annual usage statistics from the digitized rare books and share these with special collections division leadership. online patrons are still library patrons, and the division can use the viewing data to show the national and international reach of the collection. relatedly, the authors will investigate the digital library usage data in more depth. do patrons from utah, the united states, and the world look at similar materials, or are there geographic divides among the online patrons? do countries that are home to a majority of the university’s international student body have higher viewership numbers?

finally, the authors wish to convene a group of stakeholders to create a formal collection development plan for the rare books component of the digital library. given the library’s limited resources, it is imperative that digitization be done thoughtfully and systematically. there is a good rationale for creating a digital collection that is representative of the physical rare books collection as well as one that highlights certain collection areas. both material fragility and the modern scholarly emphasis on highlighting the stories of people of color, women, and other underrepresented groups in library collections provide strong counterarguments to making digital libraries strictly representative of their physical counterparts.
since informal conversations with patrons of the marriott library revealed that they assumed the digital library was representative of the collection overall, it is imperative that this assumption be either confirmed or disclaimed in a publicly viewable statement.

in the case of the rare books department, the authors are in favor of a focused, rather than representative, collection development policy. firstly, many of the books in the collection are under copyright and therefore cannot be digitized, while other materials like reference sources for rare books librarians will be of limited interest to the general public. furthermore, complex items such as artists’ books are often poor candidates for digitization, as they may have movable components that cannot be captured accurately in a still photograph. as for what should be included online, the authors fully support equity, diversity, and inclusion efforts at the university of utah and would like to see the digital library highlight materials from marginalized groups whenever possible. usage statistics from the physical and digital collections, when they become available, should also inform the collection development policy to encourage traffic to the digital library. whatever is ultimately decided, however, the clarity a written policy provides will help streamline decision-making and ultimately help both library staff and patrons understand and search within the digital library much more effectively.

endnotes

1 bart ooghe and dries moreels, “analysing selection for digitisation: current practices and common incentives,” d-lib magazine 15, no. 9/10 (2009), https://doi.org/10.1045/september2009-ooghe.
2 alexandra mills, “user impact on selection, digitization, and the development of digital special collections,” new review of academic librarianship 21, no. 2 (2015): 166, https://doi.org/10.1080/13614533.2015.1042117.
3 peter michel, “digitizing special collections: to boldly go where we’ve been before,” library hi tech 23, no. 3 (2005): 382, https://doi.org/10.1108/07378830510621793.
4 bradley j. daigle, “the digital transformation of special collections,” journal of library administration 52, no. 3–4 (2012): 253, https://doi.org/10.1080/01930826.2012.684504.
5 niso framework working group, a framework of guidance for building good digital collections (2007), https://www.imls.gov/sites/default/files/publications/documents/framework3.pdf.
6 the urls in the following list were accurate as of march 2, 2022.
7 sarah anne murphy, “data visualization and rapid analytics: applying tableau desktop to support library decision-making,” journal of web librarianship 7, no. 4 (2013): 465–76, https://doi.org/10.1080/19322909.2013.825148.
8 readers who do not work with marc metadata may not be familiar with how library of congress call numbers are assigned. created in 1891, the classification system is based on 21 classes designated by a single letter; subclasses add one or two letters to the initial class. catalogers must choose which one of the classes to assign to a particular work. the subject headings may guide a cataloger towards a certain class, but there is not a 1:1 relationship between subject headings and call number classes.

the free software alternative: freeware, open-source software, and libraries
james e. corbly
information technology and libraries | september 2014 65

abstract

this paper will introduce the reader to the world of freeware and open-source software. following a brief introduction, the author presents an overview of these types of software. next comes a discussion of licensing issues unique to freeware and open-source software, which leads directly to issues of registration. the author then offers several strategies readers can adopt to locate these software packages on the web. the author then addresses questions regarding the use of freeware and open-source software before offering a few closing thoughts.

introduction

i first recognized the potential savings in time, money, and labor offered to librarians and others by freeware and open-source software while i was head of technical and automated services at st. ambrose university in davenport, iowa. among other responsibilities, i oversaw the cataloging and processing of all new library materials. normally, i created original records on oclc whenever i cataloged. one of the tools i needed to complete this work (particularly with foreign language materials) was an ascii chart to provide instructions for making characters not found in english, such as ß, ¿, and ç. today a search for such symbols is relatively easy (most can be found on any character map), but twenty years ago, it presented more of a challenge. i spent many fruitless hours looking for the implement i needed. after much searching, i discovered the right tool: david lord’s ascii chart. this freeware program featured several tables: ansi characters, control characters, an ebcdic table (extended binary coded decimal interchange code), a palette (for colors), and a list of ibm pc characters. the first and last charts were the ones i referred to the most. whenever i needed a special symbol in a windows-based program, the ibm pc chart gave me the formula i needed to make it.
any time i was in dos or a dos-based utility (as i was with oclc), i consulted the ansi chart to form a diacritical mark. ascii chart was a great piece of software that saved me many hours of work and helped me formulate accurate documents and records.

james e. corbly (james.corbly@gmail.com), from austin, minnesota, has been studying freeware and open-source software for over a decade.

in another instance, shortly after i assumed my duties as the director of library services at kansas wesleyan university, the time clock that registered the work hours of student workers broke down. it was an ancient machine that demanded frequent repairs. and although it dutifully printed check-in and check-out hours on time cards, it did not calculate the number of hours worked, nor could it prevent such abuses as students punching in and out for one another. rather than taking the machine to a repair shop, i sought an alternative method of record keeping. i found one: a computer program from guia international called picture timeclock,1 which not only registers each student’s work hours but also totals them so that monthly summaries of each student’s work record can be compiled with increased speed, ease, and accuracy. the software lists among its features a photo identification module that prohibits one student from signing in or out for another. picture timeclock is freeware, so no monies were involved in its procurement. i then took this process a step further. each month, i digitized the work record sheets required by the business office so i could employ another freeware program, a pdf editor, to transfer the records from picture timeclock onto the digitized time sheets. together, these two programs reduced the amount of time needed to document student worker hours by 88% while enhancing the correctness of the submitted records.
as an added bonus, i obtained a permanent digital copy of each student’s work record which could be readily accessed whenever necessary.

today, i still rely on freeware and open-source software to solve many of my workaday problems. i utilize these packages for a variety of purposes including web content mining, manipulating pdf documents, and keeping my computers functioning at their optimal level.

what are freeware and open-source software?

among the major types of software (commercial, shareware, adware, etc.), freeware and open-source software are unique. freeware is copyrighted software given away by its owner (normally the author) for others to use. the author retains sole possession of the copyright, so users cannot alter the software. freeware authors allow individuals to use their productions in any legal manner but they do not allow anyone to sell the software for a profit. additionally, many freeware suppliers impose boundaries on the use of their products, restricting their application in commercial endeavors, for example.

open-source software carries the concept of freeware to its ultimate conclusion. with open source products, the copyright holder gives others the right to study, modify, and distribute the software free of charge to anyone for any purpose. quite often, open source products result from the collaborative efforts of contributors living in numerous locations around the world. raw program code, along with the compiled program, is available to anyone who is willing to obtain it, scrutinize it, and make additions or improvements to it in the expectation that the combined efforts of many people will result in a product increasingly useful and reliable to end users. although some open source products lack documentation, many (if not most) have active user groups or communities which serve as sources of assistance to users.
to be considered open source, a software product must meet several criteria, among which are the following:

• the software must be freely available without cost, royalties, or fees of any kind.
• the program must be distributed as source code (for programmers) and compiled code (for end users).
• end users and programmers may alter the program’s code.
• the modified source code must be redistributed under the same conditions as the license for the original version of the software.2

one must guard against the temptation to equate freeware and open-source software with public-domain software. this latter category of software is not copyrighted; hence, it is free of all costs and can be employed without any restrictions whatsoever.3 freeware and open-source software possess copyrights.

although many users may be unfamiliar with the concepts of freeware and open-source software, they nonetheless rely on them every day in their routine tasks. if one uses the web, one needs to use a web browser. those who employ microsoft’s internet explorer are using one of the most popular freeware products available,4 while firefox users rely on an open source package.5 many freeware and open source systems are standards in their fields. ccleaner (freeware) has won numerous awards for its efficiency not only as a cleaner and system optimization tool but also as a guardian of user privacy.6 audacity (open source) is a sound recorder and editor employed not only by amateurs but also by professionals in the field.7 access to the web would be impossible without the use of apache (open source), the number one http server.8 it is also important to note that software can change type. freeware can become shareware, commercial software can morph into freeware, etc.

licensing

licensing of open source products is rather straightforward. although there are over sixty-five different open source licenses, one predominates: the general public license (gpl).
this is the most popular license used for open-source software; it serves as the license for approximately 70% of open source products. the gpl first appeared in 1989. richard stallman, formerly of the massachusetts institute of technology and founder of the nonprofit free software foundation, authored it. modifications made to the gpl helped keep it vibrant as the years passed and the information technology world changed. as of this writing, the gpl in use is the third version, which came out on june 29, 2007. the foundation of the gpl consists of four principles:
1. the right of individuals to use software for any purpose;
2. the right of individuals to alter software to meet individual needs;
3. the right of individuals to share software with others; and
4. the right of individuals to freely distribute the changes one makes to software.9

information technology and libraries | september 2014 68

to these ends, the gpl gives end users the right to freely reproduce and distribute copies of a software program’s source code, providing that each copy displays a copyright notice, a disclaimer of warranty, a copy of the gpl, and gpl notices. it also grants the right to modify the software’s code and freely distribute it to others, provided that one lists all modifications made to the code and ensures that every condition outlined above is met. commentators often refer to gpl as “copyleft” licensing. copyleft is a method of making software freely available and requires all modified versions to meet the criteria already listed. one can read the text of this license at: https://www.gnu.org/licenses/gpl-3.0.txt.10

in addition to the gpl, open-source software authors distribute their work under other arrangements such as the berkeley software distribution licenses (bsdl), the mozilla public license, the nasa open source agreement, and the common public license.

freeware licensing is not nearly as uniform as open source.
there is no freeware equivalent to the gpl and the rights and responsibilities of the copyright owner vary from program to program. having noted this, there are clauses that many freeware licenses hold in common. among these are the following:
• the copyright to the software is retained either by the author of the software or its provider.
• end users may install the software and use it for any legal purpose.
• one may install and use the software only on a specified number of computers.
• users may copy and distribute the software provided the original copyright remains intact.
• one cannot charge a fee for copies of the freeware save for distribution costs.
• the software is provided “as is”; the copyright owner assumes no liability for any damage caused by product usage.
• freeware frequently has usage limits. many freeware licenses permit only personal or noncommercial use of the product. there may be limits on use of the freeware with other software packages and restrictions on freeware use over a network.

with a variety of clauses such as these, it is hardly surprising that freeware licenses vary in length. one freeware program i regularly use has a license consisting of three small paragraphs, while another boasts a license five pages in length. due to these characteristics of freeware licensing, i always copy the text of a freeware license to a blank page of my word processor. i keep this document in the master file of the software so that it is readily available should it be required for any exigency.

software registration

the above section on licensing leads directly to questions regarding registration. in this arena, freeware and open-source software differ from other types of software. when employing freeware and open source products, one often finds differences between “personal” and “business” use.
many of these packages allow unlimited use of the software as long as its use is strictly personal. in other words, if one downloads a software program, installs it on one’s computer, and utilizes it strictly for one’s own, non-commercial purposes, then use of the software is free. personal use refers to all usage that does not generate financial income, such as scrap-booking, creating personal websites, personal blogs, and print jobs such as flyers, posters, and t-shirts for grade schools and local food banks. in this case, one would simply register the software in one’s name. additionally, charities and other non-profit institutions (such as public, academic, and school libraries) may ordinarily employ freeware and open-source software under the same precincts as individuals, i.e., as personal users. however, suppose one works at a for-profit enterprise and desires to install the software on his/her office computer. naturally, since the firm owns all its office computers, one usually registers the software in the company’s name. under such conditions, many freeware packages oblige the user to seek permission from the software owner for such use. commercial use may also require a fee from the user to the software owner. these caveats also apply to individuals who obtain this software for engagement in their own moneymaking endeavors (such as freelancing). the end-user license agreement, included with all freeware packages, contains information to guide users in such contingencies. to avoid worrying about these details, some freeware and open source users simply treat these programs as if they were commercial products and register their use accordingly. this is a safe practice to adopt. one may be surprised to learn that numerous freeware and open source packages do not require registration at all; others regard user registration as a voluntary exercise. both stipulations presume non-commercial use of the software. 
locating freeware and open-source software

finding freeware and open-source software is a rather simple process. a good place for the neophyte to start searching for them is at datamation.11 datamation is a periodical that began life in hard copy format in 1957 but morphed into an e-journal in the 1990s (the final print edition appeared in february 1998). to access the list of open source replacements for popular commercial security tools, for example, simply click on the “open source” heading of the menu bar on the home page of the website. scroll down the next page to discover the list of alternatives to commercial security software in such categories as anti-malware, backup, and browsers, until finally, the reader reaches the last category, web filtering.

another important resource comes from pcmag.com. each month, the site presents a list of the best free software available in a particular category such as firewalls, video conferencing, antivirus, and presentation software. both freeware and open-source software are included in each category. last year, experts in the field examined over one hundred free software packages from nine categories. most of the evaluated packages operate on either windows 7 or windows 8, although programs designed for other platforms, such as mac os and android, were included in the lists. also assayed are free cloud-based web applications which run in a browser. note that the lists of free software from 2012 and 2013 are readily available.12

john corpuz’s “45 free and useful windows applications” is a slideshow presenting detailed information on useful applications from a variety of software categories. embedded within each software description is a link which, when clicked, takes the reader to a window where that software may be downloaded.13

dottech offers one of the most comprehensive lists of freeware for windows available.
this list consists of individual reports organized around nine categories: cleaning and maintenance, communication, files & documents, work & productivity, pdf tools, privacy and security, multimedia, network/internet tools, and miscellaneous. each report is well-written, concise, and features the type of information computer users need. although i disagree on some of the choices labeled as “best,” i cannot but appreciate the exhaustive nature of these documents. since dottech continuously revises and expands the number of reports in this set, wise users make periodical visits to this site.14 a quick search of the web will bring one to the open-source software directory where one will find open source applications listed by broad categories on the left side of the home page, categories which are then separated into topics before being subdivided by function. to discover network management software, for example, find “administrators” (category), then “networking” (topic), and finally, click on “network management” (function). in this manner, one will obtain a list of open source products matching this description.15 techradar provides another register of recommended free software.16 this site provides detailed descriptions of seventy-six freeware packages, ranging from productivity software to games. a link takes users to techradar’s download channel where one may read information on freeware and open-source software categorized by function and specific name.17 one cannot discuss freeware without mentioning the world’s largest supplier of freeware, microsoft. veteran users of its windows operating system and its office suite are undoubtedly cognizant of the templates and other helps the firm offers them. those implements represent merely a sampling of the valuable and diverse tools the company makes available to the computing world. 
microsoft provides access to their abundance of freeware via a download site on the web;18 however, many people find the site difficult to navigate. for this reason, many individuals consult a friendlier, third party site that opens the doors to this unique collection of freeware. one of the better examples of these sites can be found at gizmo’s freeware.19

internet download directories offer an easy and convenient avenue for one to obtain freeware and open-source software. there is a plethora of such sites on the web; unfortunately, they are not all of equal value. some sites are simply better than others. here is a brief list of criteria one may employ to judge these sites:
• ease of use. is the site easy to navigate? does it feature categories that enable one to search for a specific type of software? if it contains categories, does their organization enable one to find software quickly and efficiently?
• language. can one easily comprehend the language used to describe the software? does the language express complex concepts in laypeople’s terms or in a manner targeted to information technology professionals?
• are the software packages accurately described? do the descriptions detail desirable and undesirable traits of the software? do they clearly indicate the operating system required by the software? what other prerequisites does the software require for effective operation? are alternatives to that software specified?
• are software reviews presented? if so, who wrote the reviews: laypeople or information technology professionals? do the reviews offer download statistics? does the site contain a ratings system easily understood by the user?
• are aids available to help users make optimum use of the software? if aids are available, what format are they in: videos, documentation, or other?
• look for other features which may prove useful, such as: is there a link to the software’s home page?
does the download directory provide access to more than one download site?

among the download sites meeting these criteria are the following:
cnet download.com http://download.cnet.com/windows/
majorgeeks.com http://www.majorgeeks.com/
softpedia http://www.softpedia.com/
filehorse.com http://www.filehorse.com/
softonic http://en.softonic.com/
filehippo.com http://www.filehippo.com/
freewareweb http://www.freewareweb.com/
freeware guide http://www.freeware-guide.com/
freeware files http://freewarefiles.com/
sourceforge http://sourceforge.net/

this brief list does not even begin to exhaust the number of internet download directory websites available for use. frequent visits to these and similar websites will prove amply rewarding.

another method of finding this software is to simply look for it on the web via a search engine. to do this most efficiently, one will need the proper name of the software one is trying to obtain. however, if one has that element, one will find this an effective technique of obtaining freeware and open-source software.

how does one seek freeware and open source equivalents to commercial software and shareware? first, one may consult a website entitled alternativeto.20 this website will present one with a list of alternatives to specifically named commercial software packages. one has only to click on the search dialog box, key in the name of the commercial product one wishes to find alternatives to, and press the search icon. the list of results will include freeware, open-source software, and commercial products.
clicking on the name of the product will transport one to that product’s home page, where one will gain additional information on that product and be given an opportunity to download the product onto one’s computer. secondly, one may consult any of several lists of software equivalents from the web. one of the better of these registers is “100 open source apps to replace everyday software,” by cynthia harvey.21 this list provides not only names of individual open source projects and the commercial packages they replace but also links to homepages of these projects.

library-specific freeware and open-source software

in the past several years, an increasing number of librarians have turned to freeware and open-source software to help them fulfill the duties they are obligated to discharge. one of the most renowned examples of library-specific open-source software is koha, an integrated library system.22 programmers and librarians developed koha fifteen years ago for the horowhenua library trust in new zealand. since then, libraries of all types, including public, academic, and school libraries, have adopted koha as their integrated library system. numerous consortia across the globe also employ koha to meet their needs and those of their users.

one way novices may apprise themselves of this software is to consult websites such as the creative librarian, which features a page entitled “open-source software for libraries,” where software is enumerated by library function.23 any type of library, irrespective of the library’s size, can utilize the software described on this page.

questions regarding the use of freeware and open-source software

all this is not to say that freeware and open-source software do not have their challenges.
for instance, in one anecdote i related in the introduction to this paper, i did not name the pdf editor i used at kansas wesleyan. that is because the software is no longer available; unfortunately, neither is david lord’s ascii chart. whenever one spots a piece of freeware or open-source software that may be useful, it is imperative to download it immediately. the availability of many packages is limited, and once gone, they are usually difficult, if not impossible, to obtain.

another concern regards documentation. when i first obtained picture timeclock, for example, a complete set of instructions was available from the software provider. that is no longer the case. although an increasing number of freeware and open source packages offer documentation, many do not. however, as noted earlier in this report, many open-source software products have user groups called communities that exist not only to improve the software but also to provide technical assistance to those who use the software.

downloading freeware and open-source software presents its own quandaries. even though most providers of these packages go to great lengths to ensure the cleanliness of their product, it is nevertheless true that viruses and malware sometimes attach themselves to this software. whenever my security software activates during a download, i immediately cease the downloading process and make a note of the site for future reference. additionally, i always run my security software against all software downloads before installing them in order to keep my system free of any potential threats.

issues also arise from employing freeware and open-source software in business offices. individuals bring most of this software into enterprise environments. since the organization itself doesn’t procure this software, the corporation’s information technology personnel may be reluctant to provide support.
indeed, the corporation’s it department may not even permit an individual to download any outside software whatsoever onto a system under their domain. before one attempts to install such software (regardless of type of software) on one’s business unit, one should check with the company’s it people to obtain their views on the proposed installation.

closing thoughts

one question remains: why bother with freeware and open-source software? are librarians searching for new software programs to master? do they need an additional task to add to their to-do lists? are freeware and open-source software worthy of the attention of already overworked and stressed-out librarians? yes, they are worthy of our attention. why? for three key reasons.

first, freeware and open-source software are cost-effective. for no monies whatsoever, freeware and open-source software offer librarians the opportunity to add important new tools to the arsenal of implements at their disposal. that means that badly needed funds can be more strategically used by the library to help enable it to fulfill its mission to its clientele.

secondly, freeware and open-source software enable librarians to make increased use of computer hardware. computers are machines: they require software to not only tell them which tasks to execute but also to provide instructions for performing those tasks. with this software, the range of computer capabilities not only expands in terms of numbers but also increases in terms of efficiency. the bottom line: freeware and open-source software enhance the value of computer hardware to the library and its patrons.

finally, with the assistance of freeware and open-source software, librarians become better librarians. they manage their time more effectually, make better use of the resources at their disposal, and elevate the degree of customer service at all levels of the organization.
freeware and open-source software can expedite the handling of routine assignments and make possible the fulfillment of other jobs that, due to time and human limitations, are difficult, if not impossible, to address. freeware and open-source software are good for librarians, good for the library, and good for those who depend on the library for the fulfillment of their information needs. they truly foster what many individuals refer to as a “win-win situation” in the world of information acquisition, organization, preservation, and retrieval.

urls cited

1. “picture timeclock,” guia international corporation, http://workschedules.com/store/product/picture_time_clock.aspx.
2. “the open source definition,” open source initiative, http://opensource.org/docs/osd.
3. “public-domain software,” webopedia: online tech dictionary for it professionals, http://www.webopedia.com/term/p/public_domain_software.html.
4. “fast and fluid for windows 7: get internet explorer 11,” microsoft, http://windows.microsoft.com/en-us/internet-explorer/download-ie.
5. “firefox web browser,” mozilla, http://www.mozilla.org/en-us/firefox/new/.
6. “ccleaner,” piriform, http://www.piriform.com/ccleaner.
7. “audacity,” audacity: free audio editor and recorder, http://audacity.sourceforge.net/.
8. “apache,” the apache http server project, http://httpd.apache.org/.
9. “a quick guide to gplv3,” gnu operating system, http://www.gnu.org/licenses/quick-guide-gplv3.html.
10. “gnu general public license,” gnu operating system, https://www.gnu.org/licenses/gpl-3.0.txt.
11. “about us: datamation,” datamation, http://www.datamation.com/about/.
12. “the best free software,” pcmag.com, http://www.pcmag.com/article2/0,2817,2381528,00.asp.
13. “45 free and useful applications,” tom’s guide: tech for real life, http://www.tomsguide.com/us/pictures-story/286-39-best-free-windows-apps.html.
14. “best windows free software,” dottech, http://dottech.org/best-free-windows-software.
15. “open-source software directory,” http://www.opensourcesoftwaredirectory.com/.
16. “the best free software for your pc: essential pc programs you should download today,” techradar, http://www.techradar.com/us/news/software/the-best-free-software-for-your-pc-1221029.
17. “newest downloads,” techradar, http://www.techradar.com/us/downloads.
18. “microsoft download center,” microsoft corporation, http://www.microsoft.com/en-us/download/.
19. “best free microsoft downloads,” gizmo’s freeware: the best freeware reviewed and rated, http://www.techsupportalert.com/content/best-free-microsoft-downloads.htm.
20. “alternativeto,” http://alternativeto.net/.
21. “100 open source apps to replace everyday software,” datamation, http://www.datamation.com/open-source/100-open-source-apps-to-replace-everyday-software-1.html.
22. “koha library software,” official website of koha library software, http://koha-community.org/.
23. “open-source software for libraries,” the creative librarian, http://creativelibrarian.com/library-oss/.

technology integration in storytime programs: provider perspectives
maria cahill, erin ingram, and soohyung joo
information technology and libraries | june 2023
https://doi.org/10.6017/ital.v42i2.15701

maria cahill (maria.cahill@uky.edu) is professor, university of kentucky. erin ingram (erin.ingram@chpl.org) is youth librarian, cincinnati and hamilton county public library. soohyung joo (soohyung.joo@uky.edu) is associate professor, university of kentucky. © 2023.

abstract

technology use is widespread in the lives of children and families, and parents and caregivers express concern about children’s safety and development in relation to technology use. children’s librarians have a unique role to play in guiding the technology use of children and families, yet little is known about how public library programs facilitate children’s digital literacy.
this study sought to uncover librarians’ purposes for using technology in programs with young children as well as the supporting factors and barriers they encountered in attempting to do so. findings reveal 10 purposes for integrating technology into public library storytime programs and 15 factors across four dimensions that facilitate and/or inhibit its inclusion. if librarians are to embrace the media mentor role with confidence and the necessary knowledge and skills required of the task, much greater attention should be devoted to the responsibility and more support in the way of professional development and resources is necessary.

introduction

technology use is widespread in the lives of children and families. from a very early age, children in highly developed countries across the world regularly interact with technology, and data from device trackers substantiate parental reports.1 nearly all families have access to one or more mobile devices, and nearly three-fourths of children in the united states begin some form of digital engagement, primarily television viewing, before age three.2 prior to formal schooling, children (ages two to four) in highly developed countries tend to use a device with a screen for about two and a half hours per day on average.3 differences in screen use by income level and race are significant, with children from lower-income families and children of color spending more time on electronic devices than children from higher-income families and children who are white.
though most parents do allow their children to use technology, many voice some concerns about their children’s well-being, particularly regarding privacy as well as the content of the media.4 yet, young children’s digital activity can be beneficial, particularly when the technology is designed to foster active, meaningful engagement and when it facilitates social interaction.5

in light of children’s usage and parents’ concerns, librarians in public libraries have a unique role to play in this information realm. not only can librarians provide access to technology and recommended resources but they can also provide guidance in how to use technology to contribute to children’s learning, especially in the areas of reading, information literacy, and academic concepts.6 yet, little is known about whether librarians actually facilitate children’s digital literacy through integration of technology into programs, and this dearth of empirical evidence is highlighted in the association of library services to children (alsc) research agenda.7

storytime, as a program attended by both children and caregivers, can be used as a time for children’s librarians to integrate technology for the purposes of modeling and explaining how various electronic tools might be beneficial for young children.8 due to this potential, it is important to understand how and why children’s librarians are, or are not, integrating technology into storytime programs.

previous studies of technology use in children’s programs and storytimes

internationally, there have been few investigations of technology integration within library programs for young children.
within the united states, two survey studies, both commissioned by alsc, sought to capture the use of technology in youth programming.9 the initial survey launched in 2014 and the follow-up survey in 2018. respondents to these surveys reported that the types of devices used most often in libraries were proprietary institutional devices, digital tablets, tangible tech such as squishy circuits that allow children to build electrical circuits with play dough, and programmable tech such as cubetto, a wooden robot toy.10 additionally, more than half of respondents working in medium and large libraries and more than 45% of those working in small libraries indicated using digital devices during storytimes.11 conversely, a comprehensive study of programming for young children in public libraries, which included observations, concluded that, “while many libraries offer families a place to use computers and other digital resources together, few libraries actively promote the use of technology during their programming.”12 notably, neither the 2014 nor 2018 alsc survey included questions about the types of technology used in storytimes, nor were respondents asked to explain their thoughts on why or how technology was or was not included in storytime.13

a study conducted in aotearoa new zealand collected data about technology use in storytime in three phases: a survey of 25 children’s librarians, interviews with librarians in nine libraries, and a survey of 28 caregivers who attend a library storytime with a young child.14 slightly more than a quarter of the librarians responding to the survey reported incorporating digital technology such as tablets or e-books into storytime programs. the most common rationale for technology use in storytime was to educate caregivers. other reasons included for the novelty of it and to promote accessibility and the aims of library services.
interviewees explained that they used technology in storytime to show caregivers the availability of high-quality digital media such as e-books and educational apps, with one likening the use and recommendation of digital media to librarians’ traditional role as recommenders of storybooks (i.e., readers’ advisory services). conversely, one interviewee expressed reservations about using technology for fear that children would be distracted from the content of the story. the majority of caregiver respondents who had attended a storytime with digital technology reported enjoying the experience. however, those who had never attended a storytime with technology were apprehensive about doing so. technology best practices: joint media engagement and media mentorship recent scholarship encourages children’s librarians to use their expertise and experience to evaluate and recommend technology and new media resources as well as to model for adults how to interact with children as they use technology.15 for example, librarians can promote joint media engagement during storytimes both by modeling the practice and by directly explaining it to the adults in attendance. 
using technology during storytime can be seen as modeling modern literacy practices, just as reading print books has modeled literacy practices in traditional storytimes since the 1940s.16

alsc instructs youth services librarians to act as media mentors, a role that means they will assist caregivers in choosing and using technology by researching new technology and by modeling technology use, such as joint media engagement, for caregivers in programs such as storytimes.17 media mentorship is seen as an extension of how youth services librarians have traditionally been called upon to meet the needs of caregivers and children with their knowledge of child development and ability to facilitate caregivers’ information seeking.18

while alsc encourages media mentorship, the extent to which children’s librarians have embraced this role is unclear in professional research. findings from prior surveys and interviews with storytime providers suggest that librarians are regularly integrating technology into programs while observations of library programs suggest otherwise.19 further, goulding and colleagues found that while many librarians were comfortable recommending technology such as apps, it was unclear whether or not they were modeling its use during storytimes.20

study objectives

the overarching research question of this study is “how do storytime providers view the integration of technology into storytime programs?” the following three research questions guide this study.
1. what are the purposes for using technology in storytimes?
2. what are factors associated with adopting technology in storytimes?
3. what are barriers to integrating technology in storytimes?
method

participants

as part of a larger institute of museum and library services (imls)-funded, multistate study that was approved by the university of kentucky institutional review board (irb number 42829), researchers conducted semi-structured interviews with 34 library staff who facilitate storytime programs at public libraries serving urban, suburban, and rural communities across kentucky, ohio, and indiana.21 interviewees were not asked to identify their race or ethnicity. thirty-two identified as female and two as male. all but one of the participants (97%) had earned a college degree, but only 13 (38.2%) held a master’s degree from a library and information science (lis) program, while another two were enrolled in an lis master’s degree program when the interviews occurred. the majority of participants (57.1%) had five years or more of experience in children’s library services. the participants will be referred to as “storytime providers.”

procedure

the interviews were conducted by one member of the research team. other members of the team created written transcripts from recordings of the interviews. for the study reported in this paper, researchers focused on participants’ answers to the interview question “what place, if any, does technology or digital media have in a quality storytime program?” an open coding method was used to organize participants’ statements within three categories: purposes underlying technology use, factors associated with technology adoption, and barriers to technology integration. three researchers conducted open coding independently and came up with the initial set of coding results. then, the researchers discussed the coding results multiple times to assess the relevance of the coded constructs, refine operational definitions, and select one representative quote for each code. interviewees were assigned a number between 1 and 34 to eliminate identifying information.
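the independent-coding step described above can be illustrated with a small sketch. the authors do not report using any software for this step, so everything here is an assumption made for illustration: the statement ids, the coder labels, the code assignments, and the `compare_codings` helper are all hypothetical. the sketch only shows one way to separate statements that every coder labeled identically from those that would go to group discussion.

```python
from collections import Counter

# Hypothetical data: each coder independently assigns one category to each
# statement. Statement ids, coder labels, and assignments are illustrative
# only; the three category names mirror those named in the procedure.
codings = {
    "coder_a": {"s1": "purpose", "s2": "factor", "s3": "barrier"},
    "coder_b": {"s1": "purpose", "s2": "barrier", "s3": "barrier"},
    "coder_c": {"s1": "purpose", "s2": "factor", "s3": "barrier"},
}


def compare_codings(codings):
    """Split statements into unanimous codes and codes needing discussion."""
    agreed, to_discuss = {}, {}
    # Collect every statement id that at least one coder labeled.
    statement_ids = sorted({sid for codes in codings.values() for sid in codes})
    for sid in statement_ids:
        # Count how many coders chose each category for this statement.
        votes = Counter(codes[sid] for codes in codings.values() if sid in codes)
        if len(votes) == 1:
            # All coders who labeled this statement chose the same category.
            agreed[sid] = next(iter(votes))
        else:
            # Keep the vote tally so the disagreement can be discussed.
            to_discuss[sid] = dict(votes)
    return agreed, to_discuss


agreed, to_discuss = compare_codings(codings)
print("agreed:", agreed)
print("to discuss:", to_discuss)
```

in this toy data, statements s1 and s3 receive the same category from all three coders, while s2 is flagged for discussion because the coders split between “factor” and “barrier.”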
results

what are the purposes for using technology in storytimes?

to find answers to this research question, the researchers coded statements related to how or why interviewees used or wanted to use technology in storytime programs. we identified 10 specific purposes, formed operational definitions for each, and chose one representative quote (table 1). although most purposes had statements from more than one interviewee associated with them, we collaborated to choose one example due to space constraints. researchers determined that the purposes for technology use could be divided into two categories: experiential and learning. experiential purposes are those for which technology is used to create a positive, engaging experience for child and/or adult participants. learning purposes are those for which technology use is intended to help child and/or adult participants learn.

what are factors associated with adopting technology in storytimes?

to answer the second research question, researchers looked for statements explaining the reasons or causes for storytime providers using or wanting to use technology in their storytime programs. these would be factors that facilitate technology adoption. researchers coded statements independently and then discussed results multiple times to verify relevance and consolidate categories into 15 factors in four dimensions: storytime provider, participant, library system, and content. though many factors had more than one corresponding statement from participants, we chose one representative quote for each. results are presented in table 2.

what are barriers to integrating technology in storytimes?

to answer this question, researchers independently reviewed responses, looking for statements related to why storytime providers did not or did not wish to use technology during storytime.
after individual coding, we collaborated to verify relevance, refine definitions of the 15 identified barriers, and choose representative quotes. the results are presented in table 3. researchers found that three of the dimensions created for factors that lead to technology adoption could also be applied to barriers to technology integration: storytime provider, participant, and library system.

table 1. purposes for using technology in storytimes

category: experiential

purpose: accommodating large groups
operational definition: technology is used to enable a large group to view books/materials
representative quote (interviewee 2): “i had this huge group of kids. and i took them to our red room and did a story on our big screen. you know, through tumblebooks.”

purpose: children’s enjoyment
operational definition: provider incorporates media or technology because children enjoy it
representative quote (interviewee 14): “and then as far as, um, sometimes, um, we’ll have, like, at the end of a storytime, we may have a little short, um, like nonfiction or sign language or if we were doing something on the alphabet, maybe i would throw in a little dvd and give them popcorn for the end of storytime and things like that and i think that they really enjoy that. it is important to integrate that in.”

purpose: facilitating adult participation
operational definition: provider uses technology to display the words to songs to facilitate adult participation
representative quote (interviewee 12): “the closest thing i would say, i use a powerpoint that has the words on it for the parents to be able to follow along, um, or for the kids if they can pick out some of the letters or start to read, even some of the older ones.”

purpose: facilitating movements
operational definition: technology is used to facilitate movements or dancing
representative quote (interviewee 19): “in addition to our singing, just to give, you know, to change it up a little bit. so, they can hear the music. we clap rhythms. so, we use that a lot.”

purpose: playing songs/music
operational definition: technology is used to play songs or music
representative quote (interviewee 13): “we have a sound system that i love, with surround sound. we always do our last song with, you know, that, and i’ve been fortunate that it’s worked all the time.”

purpose: sound effects
operational definition: technology is used to create a sound or voice
representative quote (interviewee 17): “one of the better things that i’ve done, that i like to do, is, i like to use animal sounds. i’ll research or pull up a list of sounds on youtube or whatever and have the kids listen to them. i think that’s always been a fun way to work in a little bit of technology without taking out all of the flow.”

purpose: visual aids
operational definition: technology is used to support children’s visual experience
representative quote (interviewee 24): “and, like, it gives the kids a visual. and i feel like sometimes, if we could give them a better visual, they might be more engaged.”

category: learning

purpose: support for adult-child interaction
operational definition: technology is used to support adult-child interaction
representative quote (interviewee 1): “if you’re actually sitting down with your child, looking at it together, it’s a lot more effective and the child is getting a lot more out of it versus just sitting them in front of it and expecting to teach something to the child.”

purpose: teaching caregivers
operational definition: technology is integrated to model for caregivers
representative quote (interviewee 11): “i think it’s important to share with parents really good e-resources, such as, like, apps. and books and stuff. so, that, i think it’s very important…. i have, like, when i have like a screen, a projector screen, maybe when the book i picked for the storytime was an e-book that they could get through the library, and kind of, you know, advertise that resource, and then we would, we would read the e-book, you know, from the projector. so i’ve done, like, e-books and stuff.”

purpose: teaching concepts
operational definition: technology is used to present letters, words, numbers, shapes, sign language, colors, or coding skills, to children
representative quote (interviewee 22): “…. all these different color songs, um, and they’re actually just on youtube…. so that is one way that we’ve been incorporating technology, um, is with those color songs because it spells it out for them. they can see the word, it’s a familiar tune, and it helps them, you know, at least be able to sing, sing the song.”

table 2. factors associated with adopting technology in storytimes

dimension: storytime provider

factor: awareness
operational definition: provider is aware of the tool/technology available for storytime
representative quote (interviewee 1): “i’m aware of all kinds of apps that are out there and of course the ebooks.”

factor: familiarity
operational definition: provider feels comfortable with the technology and with integrating the technology into programs
representative quote (interviewee 1): “i feel like it’s going to be effective if it’s what you’re comfortable with and you’re excited about. because that will come through when you actually provide the storytime.”

factor: choice of provider
operational definition: ultimately it is up to the provider to choose to integrate technology or not
representative quote (interviewee 1): “i think it all depends on the provider.”

factor: provider’s philosophy and approach
operational definition: how the provider views storytime and its purpose influences technology integration
representative quote (interviewee 1): “everyone has their own, unique storytime philosophy and the way that they approach planning storytimes…. so, really, a lot of it is just ... theory of how you want to approach it since there’s so many options out there.”

factor: reaction/success with initial attempt
operational definition: if the provider tried technology integration, the success or failure of that initial attempt influences subsequent attempts
representative quote (interviewee 2): “it went over really well.”

factor: research base
operational definition: provider is aware of research to support integration of technology
representative quote (interviewee 1): “... it’s kind of what the research is saying with parents and digital media at home. it all depends on how you are using it. if you’re actually sitting down with your child, looking at it together, it’s a lot more effective and the child is getting a lot more out of it versus just sitting them in front of it and expecting to teach something to the child.”

dimension: participant

factor: number of participants
operational definition: the number of participants facilitates technology integration
representative quote (interviewee 2): “i think this summer was the first time i ever did that [used technology], and it was because i had this huge group of kids.”

factor: perception of caregivers’ reactions
operational definition: provider’s perception of how the caregivers would react to technology use
representative quote (interviewee 1): “i think they would probably be open to it…. i don’t know if maybe the perception some parents don’t want any technology, that would keep some people from appreciating it. but i think in general, it would be well-received if we tried it.”

factor: responsive to children’s interests
operational definition: provider uses digital resources because the children show interest or engagement
representative quote (interviewee 10): “kids are automatically interested in that stuff. they don’t need to be enticed. you know, you just get out an iphone or an ipad and they’re, like, gasp.”

dimension: library system

factor: access to equipment and resources
operational definition: provider has access to technology and tools
representative quote (interviewee 1): “... we have technology, i think, in our system to implement it. you know, e-readers and ipads and things that we can use in storytimes. and large screen tvs.”

factor: colleague support
operational definition: provider is part of a branch or system that shares information and resources for technology integration
representative quote (interviewee 17): “so, you know, we have, and we’ve gotten pretty [good] at sharing with other storytime providers in our system if we have any websites or anything that we’ve been using or music that works really well for ‘movers and shakers’ or anything like that.”

factor: expectation to integrate technology in programs
operational definition: provider feels pressure to integrate technology and is defensive about the choice not to do so
representative quote (interviewee 1): “i kind of apologize for it…. so, we have the technology available, and they encourage us to use it....”

factor: training
operational definition: provider has used or wants to use technology during storytime because of a training
representative quote (interviewee 17): “we did a digital mentoring training about how to appropriately model, like, tech skills and screen time with families. so we’ve been encouraged to add in a little bit more technology into our storytimes if we can do those, you know, in an appropriate way.”

dimension: content

factor: interactivity
operational definition: provider can use technology to facilitate interactivity
representative quote (interviewee 24): “... i would love to use some, like, smart tvs, smart boards, those kind of things. just for some interactive songs and you know, activities... when i go into these kindergartens and first grade and second grade rooms, like, these kids are using the smart boards for interactive activities for abcs and colors and shapes and numbers. and it may be through an activity or a song that’s being used with that smart board. and i say, ‘oh, i love that! i wish i could do that!’”

factor: theme
operational definition: provider uses technology that clearly connects to the theme of the storytime
representative quote (interviewee 17): “actually in my kinderbridge storytime now, it’s shapes month. we have the osmo tangrams that i bring out. so that’s one of the ones all four weeks i’m going to use the apps and bring out both of our ipads so that kids can practice those spatial shapes.”

table 3. barriers to integrating technology in storytime

dimension: storytime provider

barrier: fear of difficulties/problems
operational definition: provider doesn’t plan or hesitates to plan technology use because there may be problems with using it
representative quote (interviewee 13): “but technology can be a problem. when you’re planning or something and it’s not working.”

barrier: previous/own child’s experiences with tech
operational definition: provider has negative experience using technology with children
representative quote (interviewee 5): “i have a four-year old. and it’s interesting to see how he responds to technology and what he responds to. and what helps him to learn the most. and it’s just, like, night and day what he learns from. you know, hearing repeated songs and rhymes and just reading tons of books versus what he learns.… i mean, i think that probably the most he ever learned from an ipad was getting to watch sesame street. just sort of the same, sort of like watching a storytime, i think. but yeah, i think just now from experience seeing like, ‘oh! that really doesn’t. it’s not a helpful tool, i don’t think, for that age.’ just from my experience.”

barrier: undecided about the value of tech
operational definition: provider is unsure if tech integration is appropriate
representative quote (interviewee 5): “i have been all over the board in terms of that subject … like i said, it’s really important for me to pack in as much of what i think they need in a storytime. and i don’t know, again, i’m not sure that i’m doing exactly what is correct and maybe i should be exposing them more. but i feel like, especially for three- to five-year olds, it’s one of those things....”

barrier: screen time/overuse concerns
operational definition: provider is concerned about children’s screen time
representative quote (interviewee 2): “because i think there’s plenty of opportunity to be had in other places.”

barrier: storytime activities as purposeful alternative to technology
operational definition: provider deliberately chooses not to use technology in storytime because they see storytime activities as equally or more beneficial
representative quote (interviewee 16): “and one thing that i’ve gotten feedback on is that kids are exposed to the technology in pretty much every facet of their life, so if we can make this a space where they can learn and experience things in a way that doesn’t have technology and they can see that it’s still really fun and exciting and we can learn a lot, then that has its own place, too.”

barrier: unwilling to adopt a new technology
operational definition: provider keeps using the prior tool and does not try a new alternative technology
representative quote (interviewee 18): “i’m kind of old school because we’ve been using our cd player.”

dimension: participant

barrier: children devalue other components of storytime when tech is integrated
operational definition: provider perceives that the children prefer tech over other components of storytime
representative quote (interviewee 5): “i used to sometimes show a short video, and then i kind of found that that’s what they looked forward to most. i wanted to sort of change that perception of what the library was for some kids.”

barrier: difficult to use tech with young children
operational definition: provider experiences difficulty using technology with young children
representative quote (interviewee 5): “i have found, for preschoolers, that it is really hard to incorporate anything digital.”

barrier: lack of access to the internet
operational definition: poor broadband in rural area; why expose children to something they can’t use at home
representative quote (interviewee 5): “i feel like, especially here in this rural area, … [w]e have a really poor broadband network here, so not a lot of people have access to the internet. and so sometimes i feel like, also, showing them something that they can’t really utilize at home is not really helpful until they’re a little older also.”

barrier: perception or anticipated perception of some parents/caregivers
operational definition: if the provider perceives that some parents/caregivers will object to tech integration, the storytime provider may be reluctant to do so
representative quote (interviewee 1): “i don’t know if maybe the perception, some parents don’t want any technology, that would keep some people from appreciating it.”

barrier: tech is distracting for young children
operational definition: provider believes technology is distracting
representative quote (interviewee 5): “personally, i think i kind of get distracted by the media, so, then i think they would, too.”

dimension: library system

barrier: lack of access to devices
operational definition: library does not have a certain device or technology even though the provider would like to have it or thinks it would be useful for storytime
representative quote (interviewee 24): “um, i’ll be honest with you, if we had the ability, i would love to use some like smart tvs, smart boards, those kind of things.… we just don’t really have that option here.”

barrier: lack of time
operational definition: to integrate tech into storytime, the provider has to have time to explore tools and know the best resources/media to integrate, and that takes time
representative quote (interviewee 1): “and part of it’s time, too. having the time to find quality resources, and to learn how to use them. because we have the technology, i think, in our system to implement it.”

barrier: lack of training
operational definition: provider thinks self doesn’t have the knowledge, interest, skill, or training to use technology during storytime
representative quote (interviewee 15): “and i’d be open to ways to use it, but i guess i haven’t taken, you know, any trainings on … i mean, i really haven’t seen a lot of things offered at conferences.”

barrier: old facility
operational definition: library does not support installing newer technology
representative quote (interviewee 21): “... that’s a thing that we have struggled with previously because of our infrastructure and set-up. it was almost a hazard to set up a projector and have some sort of digital aspect to storytime.”

discussion

purposes

experiential

many of the storytime providers’ purposes for using technology revealed a goal to create a positive, engaging experience for all children and adults who attend storytime, a theme that prior research has highlighted.22 specifically, technology facilitates the sharing of visual aids, sound effects, and songs. providers also use technology to encourage adult participation, and like their early childhood educator colleagues, storytime providers in this study reported using technology to scaffold and coordinate children’s gross motor movements with songs and action rhymes.23

learning

storytime providers’ responses also show the aim to contribute to the learning of children and adults in storytime. this finding mirrors those of goulding, shuker, and dickie, who found that providers like to use technology in ways that coincide with the aims of children’s services.24 two of the purposes show an awareness of best practices in technology integration: support for adult-child interaction and teaching caregivers.25 additionally, storytime can be an opportune time for providers to model technology best practices for caregivers as providers have been modeling literacy best practices throughout the history of storytime programming.26 importantly, when storytime providers do model and intentionally seek to support caregivers’ learning, caregivers expand their knowledge, experience heightened confidence, and tend to utilize the strategies they encountered.27 notably, storytime providers tend to feel discomfort with providing instructional or developmental information directly to caregivers via “asides”; thus, a more palatable approach for many storytime providers might include using “we” language along the lines of “when we use digital media, we want to be sure that we are developing healthy habits. some families set a timer to help them monitor the duration of their children’s screentime.”28

one way that storytime providers might model digital media use is to search for and find information related to the storytime theme or book in one of the library’s databases. for example, if a book shared in storytime included a sloth, the storytime provider might demonstrate how to search for a video of a sloth in one of the library’s digital encyclopedias (e.g., encyclopedia britannica). storytime providers should also keep in mind that digital play can be incorporated into the informal activities that typically occur before and after storytime programs as a means to support children’s social interaction with other children.29 for example, if puzzles are typically included as one of the informal activity options before or after the storytime program, the provider might offer both traditional and digital puzzles (e.g., https://kids.nationalgeographic.com/games/puzzles/) on library-owned tablets and provide a simple how-to if needed.
supports and barriers

through the process of open coding, researchers identified four dimensions that storytime providers’ perceived supports and barriers could fall into based on the primary influential factor: provider, library system, participants, or content.

provider

the providers’ perceptions about technology and experiences with technology in the library setting serve as facilitators or barriers to integration. if a provider is aware of useful technology, familiar and comfortable with its use, knowledgeable of research supporting technology use, has a professional philosophy that can accommodate technology use, and/or has had a positive experience trying out technology, then these may be factors that lead to the adoption of technology in storytime. on the other hand, if the provider has concerns about the difficulties of technology use or the amount of time children spend on screens, if the provider’s professional philosophy views storytime as a deliberate alternative to time with technology, or if the provider has had a negative experience with technology, then these may be factors that prevent the adoption of technology in storytime. these same factors affect early childhood practitioners and influence their decisions to incorporate technology into classroom practices.30 the factors that lead to technology integration could be seen as related to media mentorship. a media mentor has awareness, familiarity, knowledge, and a professional philosophy that supports technology use, all of which were factors identified by interviewees.
professional training in mentorship was mentioned by one interviewee (17) who stated, “we did a digital mentoring training about how to appropriately model, like, tech skills and screen time with families.” thus, some providers’ responses indicate some general awareness of the currently emphasized best practice of media mentorship. however, the ambivalence toward the role of media mentor that goulding and colleagues found amongst librarians is also found here as interviewees’ responses do not give a clear picture of how they model technology use for caregivers during storytimes.31 in addition, responses that highlight barriers to technology integration show ways in which some providers are opposed to employing the role of media mentor specifically during storytime. as such, our findings align with prior observational studies that noted “few instances of librarians willing to speak directly to parents about how to interact with their children using technology.”32

participant

providers consider the perspectives of the adult and child participants in storytimes in relation to integrating technology. providers are more likely to integrate technology if they view it as an aid to facilitating sessions for large groups, they believe caregivers will be open to the technology, and they appreciate that young children show a high interest in devices such as ipads. however, children’s high interest in devices was seen by other providers as a negative aspect of technology use and a barrier to integration because they thought children were too focused on the technology itself or would be distracted by the technology.
just as early childhood teachers have been encouraged to broaden their perspectives of literacy to encompass digital literacy, so too might storytime providers, as this shift in focus would enable them to view these instances as engagement rather than distraction.33 also, the same interviewee who thought caregivers might be open to technology in storytime expressed the concern that other caregivers might not like its use. our findings related to caregiver reaction echo similar findings from goulding and colleagues: the reaction that providers anticipate from adult participants might be either a support or a barrier for technology integration.34

library system

two aspects of the library system were present in both factors and barriers: access and training. when the library system in which the provider worked gave them access to technology and training in its use for programs, they were more likely to integrate technology. in contrast, when a provider did not have access to technology, the library building did not support its use, or training was not given, the provider was less likely to integrate technology. libraries pride themselves on providing the highest level of service to members of the community and “removing barriers to access presented by socioeconomic circumstances.”35 yet, if libraries are to facilitate the digital learning of young children, it is necessary for them to recognize the digital divide impacting children’s access to technology throughout the world, and parents’ reluctance to spend money on digital apps.36

content

content was a dimension only found in factors that support technology integration, not in barriers. providers used or wanted to use technology because they could connect the technology to two essential elements in the content of storytime: interactivity and theme.
this dimension relates to purposes for technology use in the learning category as providers want to use the interactivity of technology as well as technology directly related to the session’s theme to boost children’s learning. indeed, child learning has long been librarians’ goal in providing storytime programs as has facilitating the development of parent skills.37

conclusion

technology is prevalent in the lives of children, many of whom begin interacting with digital tools as early as the first year of life, and caregivers seek guidance regarding their children’s technology use.38 while alsc has championed children’s librarians as media mentors, findings from this study, coupled with those from prior research, highlight storytime providers’ opposition to the media mentor role and the integration of technology within storytime programs.39 as a first step, storytime providers might integrate the digital tools the library is already providing. for example, if the library offers e-books (e.g., via libby), the storytime provider might consider integrating one or more picturebooks from that collection into storytime. alternatively, if the library does not have the tools necessary to share the book electronically during the program (e.g., a screen large enough for the storytime group), the provider might read the print version but then follow that up with a comment along the lines of “grownups, did you know that the library also offers this as an e-book that you could read on a phone, tablet, or other device?
i would be happy to show you how to access it and other e-books after the program.” providers looking for other ways to incorporate digital tools into library programs might read strategies recommended by librarians in a fully and freely accessible online book.40

as scholars have previously noted, early childhood providers, including those who support young children and families in libraries, need much more professional development.41 specifically, the field needs more opportunities for librarians and other early childhood educators to develop their knowledge and skills within the realm of digital technology for young children, but they also need training that advances the notion of media mentor and boosts their confidence and identities relative to that role.42 the institute of museum and library services recently funded a project designed to support librarians’ knowledge and skills within the realm of family media for children ages five to eleven years, and products from that project are certainly a good starting place for storytime providers; however, additional resources and research focused on library programs and services designed for children from birth through five years are needed.43 if librarians are to embrace the media mentor role with confidence and with the knowledge and skills the task requires, much greater attention should be devoted to that responsibility, and more support in the way of professional development and resources is necessary.

acknowledgement

this work was supported by the institute of museum and library services [federal award identification number: lg-96-17-0199-17].
endnotes

1 nalika unantenne, mobile device usage among young kids: a southeast asia study (the asianparent insights, november 2014), https://s3-ap-southeast-1.amazonaws.com/tap-sg-media/theasianparent+insights+device+usage+a+southeast+asia+study+november+2014.pdf; brooke auxier, monica anderson, andrew perrin, and erica turner, parenting children in the age of screens (pew research center, 2020), https://www.pewresearch.org/internet/2020/07/28/parenting-children-in-the-age-of-screens/; stephane chaudron, rosanna di gioia, and monica gemo, young children (0–8) and digital technology: a qualitative study across europe (publications office of the european union, 2018), https://doi.org/10.2760/294383; organization for economic cooperation and development, what do we know about children and technology? (2019), https://www.oecd.org/education/ceri/booklet-21st-century-children.pdf; victoria rideout and michael b. robb, the common sense census: media use by kids age zero to eight, 2020 (common sense media, 2020), https://www.commonsensemedia.org/sites/default/files/uploads/research/2020_zero_to_eight_census_final_web.pdf; jenny s. radesky et al., “young children’s use of smartphones and tablets,” pediatrics 146, no. 1 (2020).

2 unantenne, mobile device usage; auxier, anderson, perrin, and turner, parenting children; chaudron, di gioia, and gemo, young children (0–8) and digital technology.

3 rideout and robb, the common sense census; sebastian paul suggate and philipp martzog, “preschool screen-media usage predicts mental imagery two years later,” early child development and care (2021): 1–14.

4 auxier, anderson, perrin, and turner, parenting children; suggate and martzog, “preschool screen-media usage.”

5 marc w. hernandez, carrie e. markovitz, elc estrera, and gayle kelly, the uses of technology to support early childhood practice: instruction and assessment. sample product and program tables (administration for children & families, u.s. department of health & human services, 2020), https://www.acf.hhs.gov/media/7970; lisa b. hurwitz and kelly l. schmitt, “can children benefit from early internet exposure? short- and long-term links between internet use, digital skill, and academic performance,” computers & education 146 (2020): 103750; kathy hirsh-pasek et al., “putting education in ‘educational’ apps: lessons from the science of learning,” psychological science in the public interest 16, no. 1 (2015): 3–34.

6 amy koester, ed., young children, new media, and libraries: a guide for incorporating new media into library collections, services, and programs for families and children ages 0–5 (little elit, 2015), https://littleelit.files.wordpress.com/2015/06/final-young-children-new-media-and-libraries-full-pdf.pdf.

7 association for library service to children, national research agenda for library service to children (ages 0–14), 2019, https://www.ala.org/alsc/sites/ala.org.alsc/files/content/200327_alsc_research_agenda_print_version.pdf.
8 christner, hicks, and koester, “chapter six: new media in storytimes: strategies for using tablets in a program setting,” in a. koester, ed., a guide for incorporating new media into library collections, services, and programs for families and children ages 0–5 (little elit, 2015), 77–88.
9 kathleen campana, j. elizabeth mills, marianne martens, and claudia haines, “where are we now? the evolving use of new media with young children in libraries,” children and libraries 17, no. 4 (2019): 23–32; j. elizabeth mills, emily romeign-stout, cen campbell, and amy koester, “results from the young children, new media, and libraries survey: what did we learn?”, children and libraries 13, no. 2 (2015): 26–32.
10 campana, mills, martens, and haines, “where are we now?”.
11 campana, mills, martens, and haines, “where are we now?”.
12 susan b. neuman, naomi moland, and donna celano, “bringing literacy home: an evaluation of the every child ready to read program” (chicago: association for library service to children and public library association, 2017), 5, http://everychildreadytoread.org/wp-content/uploads/2017/11/2017-ecrr-report-final.
13 campana, mills, martens, and haines, “where are we now?”; mills, romeign-stout, campbell, and koester, “results from the young children, new media, and libraries survey.”
14 anne goulding, mary jane shuker, and john dickie, “media mentoring through digital storytimes: the experiences of public libraries in aotearoa new zealand,” in proceedings of ifla wlic (2017), https://library.ifla.org/id/eprint/1742/1/138-goulding-en.pdf.
15 goulding, shuker, and dickie, “media mentoring through digital storytimes”; cen campbell and amy koester, “new media in youth librarianship,” in a. koester, ed., a guide for incorporating new media into library collections, services, and programs for families and children ages 0–5 (little elit, 2015), 8–24.
16 jennifer nelson and keith braafladt, technology and literacy: 21st century library programming for children and teens (chicago: american library association, 2012).
17 c. campbell, c. haines, a. koester, and d. stoltz, media mentorship in libraries serving youth (chicago: association for library service to children, 2015), https://www.ala.org/alsc/sites/ala.org.alsc/files/content/media%20mentorship%20in%20libraries%20serving%20youth_final_no%20graphics.pdf.
18 association for library service to children, competencies for librarians serving children in libraries.
19 campana, mills, martens, and haines, “where are we now?”; mills, romeign-stout, campbell, and koester, “results from the young children, new media, and libraries survey”; neuman, moland, and celano, “bringing literacy home”; goulding, shuker, and dickie, “media mentoring through digital storytimes.”
20 goulding, shuker, and dickie, “media mentoring through digital storytimes,” in proceedings of ifla wlic (2017), https://library.ifla.org/id/eprint/1742/1/138-goulding-en.pdf.
21 institute of museum and library services, public libraries survey, 2016, https://www.imls.gov/research-evaluation/data-collection/public-libraries-survey.
22 maria cahill, soohyung joo, mary howard, and suzanne walker, “we’ve been offering it for years, but why do they come? the reasons why adults bring young children to public library storytimes,” libri 70, no. 4 (2020): 335–44; peter andrew de vries, “parental perceptions of music in storytelling sessions in a public library,” early childhood education journal 35, no. 5 (2008): 473–78; goulding and crump, “developing inquiring minds.”
23 courtney k. blackwell, ellen wartella, alexis r. lauricella, and michael b. robb, technology in the lives of educators and early childhood programs: trends in access, use, and professional development from 2012 to 2014 (chicago: northwestern school of communication, 2015).
24 campbell and koester, “new media in youth librarianship.”
25 campbell, haines, koester, and stoltz, media mentorship in libraries serving youth; prachi e. shah et al., “daily television exposure, parent conversation during shared television viewing and socioeconomic status: associations with curiosity at kindergarten,” plos one 16, no. 10 (2021): e0258572.
26 nelson and braafladt, technology and literacy.
27 roger a. stewart et al., “enhanced storytimes: effects on parent/caregiver knowledge, motivation, and behaviors,” children and libraries 12, no. 2 (2014): 9–14; scott graham and andré gagnon, “a quasi-experimental evaluation of an early literacy program at the regina public library/évaluation quasi-expérimentale d'un programme d'alphabétisation des jeunes enfants à la bibliothèque publique de regina,” canadian journal of information and library science 37, no. 2 (2013): 103–21.
28 maria cahill and erin ingram, “instructional asides in public library storytimes: mixed methods analyses with implications for librarian leadership,” journal of library administration 61, no. 4 (2021): 421–38.
29 leigh disney and gretchen geng, “investigating young children’s social interactions during digital play,” early childhood education journal (2021): 1–11.
30 hernandez, markovitz, estrera, and kelly, “the uses of technology”; karen daniels et al., “early years teachers and digital literacies: navigating a kaleidoscope of discourses,” education and information technologies 25, no. 4 (2020): 2415–26.
31 goulding, shuker, and dickie, “media mentoring through digital storytimes.”
32 neuman, moland, and celano, “bringing literacy home,” 58.
33 daniels et al., “early years teachers and digital literacies.”
34 goulding, shuker, and dickie, “media mentoring through digital storytimes.”
35 association for library service to children, competencies for librarians serving children in libraries (2020), https://www.ala.org/alsc/edcareeers/alsccorecomps; american library association, code of ethics of the american library association (2021), https://www.ala.org/tools/ethics.
36 jenna herdzina and alexis r. lauricella, “media literacy in early childhood report,” child development 101 (2020): 10; sara ayllon et al., digital diversity across europe: policy brief september 2021 (digigen project, 2021), https://www.digigen.eu/news/digital-diversity-across-europe-recommendations-to-ensure-children-across-europe-equally-benefit-from-digital-technology/.
37 goulding and crump, “developing inquiring minds”; nancy l. kewish, “south euclid’s pilot project for two-year-olds and parents,” school library journal 25, no. 7 (1979): 93–97.
38 auxier, anderson, perrin, and turner, parenting children; rideout and robb, the common sense census.
39 neuman, moland, and celano, “bringing literacy home”; goulding, shuker, and dickie, “media mentoring through digital storytimes.”
40 koester, ed., young children, new media, and libraries.
41 us department of education, office of educational technology, policy brief on early learning and use of technology, 2016, https://tech.ed.gov/files/2016/10/early-learning-tech-policy-brief.pdf.
42 herdzina and lauricella, “media literacy in early childhood report.”
43 rebekah willett, june abbas, and denise e. agosto, navigating screens (blog), https://navigatingscreens.wordpress.com.
new strategies in library services organization: consortia university libraries in spain
miguel duarte barrionuevo
information technology and libraries; jun 2000; 19, 2; proquest pg. 96

new political, economic, and technological developments, as well as the growth of information markets, in spain have created a foundation for the creation of library consortia. the author describes the process by which different regions in spain have organized university library consortia.
spanish libraries are public entities that depend either on central or local governments and are funded through either the national general budget or the regional government (comunidades autónomas) budget. on one hand, the player at the national level is the education and culture ministry, which contributes to the fifty-two state public libraries and shares jurisdiction with the regional government. on the other hand, universities are self-governed institutions of a public nature regulated by the ley de reforma universitaria, or university reform law, which was approved by the spanish parliament in 1983 to promote scientific study and greater self-government of spanish universities. universities have their own budget, and they are mainly funded by the regional government. the university library system is currently made up of about fifty public libraries and twelve private libraries. since the second half of the 1980s, a new philosophy concerning public services has spread in spain, as in other european countries: a philosophy calling for higher quality and more efficiency in the management and administration of public capital. there has also arisen a claim to the government's satisfactory use of public funds as a social right, as well as a claim to a return on that capital in social terms. this is where libraries' public services come into play. there is a clear interest in all the aspects related to the introduction of new techniques in management. quality management, effectiveness and efficiency measurement, cost control, service assessment, and user satisfaction or analysis from the stakeholders' point of view are concepts that are emerging in university libraries. in order to adjust to the circumstances, universities are changing their management procedures, and university libraries have been forced into managing their "business" according to managerial criteria.
the commonality of their activities, and the relaxation of geographical boundaries fostered by information technologies, have encouraged libraries to join consortia in order to remain relevant in the current library services context. such concepts as the "electronic," "digital," and "virtual" libraries lead, from my point of view, to a different configuration in the library services context; they have pushed library managers to consider strategically where they are and what their most adequate position is within this new configuration. departments dealing with information are to be wider, more heterogeneous, and multidisciplinary. new organization strategies need to be defined in order to offer services in a different way. when library managers are forced to obtain the best results out of their limited resources, the organization of consortia represents a qualitative leap forward in cooperation, efficiency, and cost-savings. library consortia aim to share resources and to promote participation on the basis of the mutual benefit of the libraries involved and, although the concepts of cooperation, coordination, and sharing resources are not new in the library world, the organization of library consortia introduces a major level of commitment and involvement among the participants.

miguel duarte barrionuevo is head director of the central library of the university of cadiz (andalucía), and an active contributor to the university libraries consortium of andalucía.

new settings, new facts

libraries are going through a crisis. a library is still an institution with a strong traditional character, but its traditional duties as depository of knowledge no longer justify its costs, and the crisis is exacerbated by an accelerated technological and informative revolution.1 within the changing atmosphere of the spanish university in the last few years, goals and objectives are affected by a number of socioeconomic, institutional, and technological factors, as well as others with an internal character that push these institutions to move toward change as an opportunity to maintain continuous improvement. materials and services are more expensive, and technology is more sophisticated every day, which leads to a need for strong investments. public financing is more and more limited while costs are growing. the university, in general, is suffering from a lack of efficiency and organizational flexibility; staff reject monotonous tasks and hold high expectations; the fast pace at which information technology has been implemented in the last few years has caused a very serious imbalance between people's skill levels and the demands of their positions. all these factors generate a new setting of weaknesses and hopes to which the university libraries have to respond in order to maintain their competitive advantages.

technology

technology has recently become a strategic element in the development of libraries. technology is more and more sophisticated and its life is shorter. its use implies the need for strong investments in computer and communication infrastructure.

economical pressure on information market agents

materials have diversified and are more and more costly, with annual increases far exceeding even inflation rate levels.
an absolute change has taken place in the supply and demand of the information market, which causes the agents' utter disorientation: the publishing sector is adapting very slowly to the electronic context; the distribution sector needs a deep technological and organizational transformation (few spanish suppliers offer added value services such as cataloguing, outsourcing, or material preparation; puvill libros and subsidiaries of multinationals such as blackwell or dawson are exceptions). electronic data interchange, a european standard like sisac, is not a standard format in the sector, and there is no national supplier that offers services of the approval plan type. additionally, the agents of the information market are very conditioned by the change in the orientation of demand. specialized users (teachers, researchers, thesis students, etc.) demand from libraries electronic resources, quick information, and access at all times from remote locations. this conflicts with the restrictive tendencies in the maintenance of public services and drastic budget cuts. libraries are forced to obtain the highest possible ratio of efficiency in the use of the fewest resources.

total quality management implementation and other management techniques

the result is implementation of total quality management (tqm), which guarantees quality of services. it is important to consider tqm as an instrument that develops organizational strategies. it is a continuous process developed in order to replace obsolete types of organization, to orient corporate activity permanently toward process optimization, and to obtain a coherent relation between efficacy in reaching objectives and efficiency in the use of resources. changes in the publishing industry, budget cuts, the quick expansion of electronic resources, new pricing policies, and problems related to copyright and intellectual property form the new setting.
in this context, the consortia organization is considered by university and library managers as a means to face the challenges that the new setting implies, to unify their capacity to exert pressure on the different agents, and to take advantage of the system's strength in order to adjust to the new situation and improve their competitive advantage.

adequate information technologies

the spanish university libraries are connected to the academic information network maintained by rediris, a scientific-technical facility that depends on the science and technology office of the prime minister. the backbone that supports the rediris services is formed by seventeen nodes, one in each region (comunidad autónoma), connected by atm circuits on atm accesses of 34/155 mbps. each node is formed by a set of communication equipment that allows the coordination of the main transmission means and of the access lines from the centers of each region. rediris participates in the ten-34 project, which aims at building a pan-european ip network of 34 mbps that interconnects us with the different academic and research networks and that is planned to become a ten-135 in 1999.2 on the other hand, the regions (comunidades autónomas) incorporate added value elements into the network segments they manage, such as faster access speeds that allow centralized architecture (for instance, the union catalogue of the consortium of libraries of galicia is managed through a broadband network of 155 mbps). the regions also allow access to databases in cd-rom and electronic formats oriented to end users in a regional context. for instance, the scientific computer center of andalucia manages twenty-two databases in cd-rom and other electronic formats that can be searched by all the andalusian universities and research centers through the andalusian scientific research network.
homogeneous automation level

the automation process of the library services, initiated at the end of the '80s, is practically completed. dobis-libis, libertas, vtls, absys, and sabini are the most widely used library management systems.3 since 1997 some libraries have updated their library automation system to unicorn (sirsi) and innopac (innovative interfaces). the spanish university libraries have a homogeneous automation level and can establish projects from the consortia perspective, such as regional union catalogs, sharing electronic information resources, and shared purchase policies.

favourable political situation

traditionally, cooperative efforts have obtained little official support. however, in recent years, a positive attitude can be perceived from the academic authorities in relation to cooperation activities and the development of cooperative projects, both as an answer to the need to reduce costs by sharing resources and as a means to face the growing and unstoppable demand from users. the initiatives for the consortia organization are supported by institutional agreements at the highest academic level among the universities: principals and vice-principals of research (such is the case of the consortia of andalucia and madrid), or they are the result of initiatives taken by the autonomous government (galicia consortium) or of a confluence of interests between the autonomous government and the universities (cataluña consortium).
remote access to end users' information resources

following the automation projects and the development of network and data transmission technologies, most university libraries have undertaken projects to integrate all information resources and maintain a wide group of services: campuswide networks, catalogs, databases on cd-rom (e.g., índice español de ciencias sociales y humanidades, índice español de ciencia y tecnología, aranzadi legislación y jurisprudencia, medline, abi inform, academic search), e-mail, and remote access via internet. access to all resources is available through the web opac of the library management system. there is access to any of these resources from any point connected to the network, whether from terminal servers, workstations, pcs, unix stations, or macs.

cooperation in spain

up to the middle of the '80s, university libraries were separate realities with scattered funds and disorganized services; they were not structured as a system and they were lacking any tradition or mentality of cooperation. in a 1994 poll, only 40 percent of university library directors declared that cooperation among libraries was important.4 we could say that the cooperation initiatives depend on the will of individuals, who obtain little support from the government. therefore, two different stages could be distinguished: one in which cooperation is the result of personal actions, taken with no institutional support, in which local projects are undertaken; and one in which individual initiatives taken by the people in charge of libraries and a certain concern from the central government converge.

will to share resources

spain did not join the movement toward library automation until the '80s. at this time, the cooperative tendencies now associated with information and communication technologies were only slightly realized in the libraries.
eventually, however, a consolidation of efforts took place, helping to bring about, at the end of the '80s and beginning of the '90s, some important cooperative initiatives out of which some specialized union catalogs emerged. some of the first cooperative initiatives arose from the association of specialized libraries.5 among these we can point out the coordinating committee of biomedical documentation, whose mission was to promote the cooperation and rationalization of document resources in the field of biomedicine. this committee holds conferences and maintains a union catalog of the daily publications on health services accessible through internet.6 documat, created in 1988, groups together the libraries specializing in mathematics and maintains a union catalog of journals on the basis of which shared acquisition plans are organized. mecano groups together the libraries of the schools of engineering and maintains a union catalog accessible through internet.7 early cooperative initiatives were also promoted by the library automation systems users groups. red universitaria española de dobis/libis began in 1990 when twelve universities using the system decided to create an online union catalog maintained by the university of oviedo. the libertas spanish users group maintains its union catalog associated with the sls database, accessible online from bristol. rueca is the union catalog of absys users.8

need to cooperate

in the early '80s a forum started in universities that attempted to influence the writing of the university statutes (as a result of the ley de reforma universitaria) and establish a general criterion for regulations. as a result of this debate, two documents have been published and have proved to be essential for subsequent cooperative development.9 some reports from conferences on university libraries held in 1989 at the university complutense of madrid had a wide influence at the national level, and the same year, fundesco produced a report about the state of the art in automation in the spanish university libraries.10 the situation that is repeated in these reports about the libraries is extremely pessimistic. their evolution from 1985 to 1995 has been perfectly described by m. taladriz and l. anglada as "the lack of recognition of the role of university libraries ... the dispersion of bibliographical funds ... the general disorganization of the library services ...."11 in 1988, red de bibliotecas universitarias (rebiun, university libraries network) was created. although initially only nine university libraries were involved, the number grew to seventeen during the following years. the cooperative activities were centralized, and they obtained remarkable results in training, the improvement of library interlending, and the publishing on cd-rom of bibliographical records from participant libraries. at the same time, and thanks to the ifla congress held in barcelona in 1993, the general need to create a wider discussion forum including all the university libraries and to obtain better cooperation and coordination was established. this idea crystallized with the creation of the conferencia de directores de bibliotecas universitarias y científicas (cobiduce, the conference of university and scientific libraries directors). the first working meeting was held in november 1993.12 this led to the merging of rebiun with cobiduce in order to concentrate all the cooperation efforts into a single institution. a single institution, which kept the name of rebiun, was created in 1996.
in 1998, rebiun became the local committee of the conferencia de rectores de las universidades españolas (crue, conference of spanish university principals). rebiun has become the organization that oversees all the cooperation and coordination efforts in spanish academic libraries. rebiun activities include a union catalog published on cd-rom, "regulations for university and scientific libraries," agreements on interlibrary loans, and activities in different working groups.13

university libraries' consortia

in the past few years the transfer of powers on education and culture to the autonomous regions, a consequence of a constitutional order, has brought about another political and administrative context for the achievement of the libraries' objectives. the autonomous regions are now working on the design of regional development plans or regional information systems that are related, unfailingly, to the cooperative activity of the libraries of the territory. this initiative can be applied to university libraries as well as any other type of library, which, through their institutions, request their autonomous governments' assistance or funding in order to achieve cooperative projects. or it could be done the other way round: a government can outline an action plan for its libraries and suggest it to the potential participants. thus, the basis for consortia development was set in the second half of the '90s, and encouraged by events like the conference held in cadiz in 1998, organized by the libraries of the university of carlos iii de madrid and the university of cadiz, and ebsco information services (spanish branch).

catalonia consortium of university libraries (consorcio de bibliotecas universitarias de cataluña)

we could sum up the situation in catalonia according to the following: the existence of new automated libraries, few automated records, the use of their own automation systems, and the existence of only three universities.
we can establish some cooperation background developed at this time: cruc, caps, and the joint selection of an automation system carried out by universidad autónoma de barcelona and universidad politécnica de cataluña. it is not until the '90s that positive factors combined to move the cooperative movement a step forward in catalonia. these positive factors were a homogeneous state of automation among university libraries, a good communications network, and the use of standards for library data recording. the previous cooperative movements and an analysis of the worldwide evolution of libraries helped in the building of a united view in which cooperation appeared as an additional instrument for the improvement of the library world. the university library directors of catalonia considered cooperation a way to accelerate the evolution of libraries, to create new services, to facilitate changes, and to save expenses. with this conviction, they wrote a proposal for the creation of a library network in catalonia, which in 1993 resulted in the interconnection of the university libraries in catalonia, followed in 1995 by the first steps toward the creation of the united catalog of the universities of catalonia. this catalog was fully operative in early 1996. at the end of 1996 the university library consortium of catalonia (cbuc) was created with the task of improving library services through cooperation.14 its objectives are:
• to create new working tools
• to improve services
• to build a digital library
• to take better advantage of resources
• to face together the changing role of libraries

the cbuc comprises the university of barcelona, universidad autónoma de barcelona, the polytechnic university of catalonia, pompeu fabra university, the university of girona, the university of lleida, rovira i virgili university, the university oberta of catalonia, and the library of catalonia.
the direction of cbuc is determined by a board of representatives from each of the institutions, an executive committee of six members, and a technical committee of library directors. a staff of seven runs the cbuc office, and different working groups audit active plans and study possible issues of concern.

university libraries consortium of the madrid region (comunidad autonoma de madrid)

the public university libraries based in the madrid region (universidad de alcala, universidad carlos iii, universidad complutense, universidad politecnica, universidad rey juan carlos, and universidad nacional de educación a distancia) are developing many cooperation programs with the following objectives:
• to facilitate access to information resources
• to improve the existing library services
• to test and promote the use of information and communication technologies
• to reduce costs by sharing resources.15

two programs have already been initiated:

interlibrary loan. an agreement to obtain a faster delivery system for books and journal articles has been established. using the services of a private courier company, maximum delivery time from one university to another will be set to forty-eight hours. this service started working on the first of september.

training. different courses for the joint training of library staff are being organized on a cooperative basis.

in the future, other programs will be developed, including a union catalog (with the creation of a collective database that will also save cataloging costs by sharing bibliographical resources) and an electronic library, which will allow common access to electronic resources.

galician libraries consortium

the galician libraries consortium is the result of a regional government initiative.16 in november 1996 the xunta de galicia signed an agreement of scientific and technological collaboration with fujitsu icl spain in which the company agreed to develop the telecommunications infrastructure of the community: the galician information highway (agi: autopista gallega de la información). inaugurated in 1997, agi serves as the basis for projects with great political and social appeal. three projects were embarked upon:
• tele-teaching,
• tele-medicine, and
• access to libraries

users have access to a loan service by which a loan may be requested from any library in the consortium. the loan works as it would in a local setting, with the same limitations, controls, and blocking as any other local loan system. the request to the system is sent online and is fulfilled within twenty-four to forty-eight hours.

the consortium originally was to encompass all types of libraries, but as the project advanced, it was decided to restrict the collaboration to university libraries. this allowed the project to move forward with greater speed, because the member libraries had more narrowly defined interests and concerns.

the xunta de galicia prepared the "protocol of intentions," which has been signed by the highest representatives of the three galician universities (universidad de santiago, universidad de la coruña, and universidad de vigo). this protocol is characterized by two essential ideas:
1. allow adequate time for planning individual incorporation into the consortium, so that each institution may participate at the rate it deems appropriate.
2. create a permanent working commission formed by representatives of the institutions involved, which will:
  • answer existing and future questions;
  • define the model of consortium that each organization desires to establish through specific objectives; and
  • promote adequate measures in order to achieve the objectives that have been set.

andalucian university libraries consortium

in the era of the internet, electronic documents, and the virtual library, maintaining independent libraries is no longer appropriate. in addition, the efforts needed to face the challenges of the information society and the changes that society is demanding of universities are destined to become weaknesses more than strengths in those institutions that face them individually. there are many reasons why it is advisable for libraries to approach these challenges collaboratively:
• the productivity and competitiveness that society demands of the universities
• the huge technological opportunities to share information
• the importance of the changes that are taking place in the products and services that the information market offers
• the high cost of the new products (e.g., e-journals)
• the need for very specialized knowledge in order to activate some of these services
• the growing demands of library users

the andalucian university libraries concluded that if they wished to stay current with information technologies, if they wished to continue implementing improved services, and if they wished to do so within their budgets, solid cooperation mechanisms would have to be established. in march 1998 the andalucian vice-principals of research requested the directors of the andalucian university libraries to analyze possible cooperative activities among the university libraries of the community. two goals were set in this meeting:
• the analysis of library automation products currently on the market.
• the analysis of the current individual management systems within the andalucian libraries (which, though automation varied within them, were each considered to be outdated) and the potential for sharing resources with the present systems, which is difficult because currently available systems may not be compatible with z39.50.

the object of this analysis is to define essential requirements so the new systems to be implemented facilitate possible cooperative actions. this possible integration will not be simple: the university pablo de olavide, recently created, is planning to purchase its own system; the universities of seville, granada, and cordoba are using dobis-libis; and the universities of cadiz and malaga are using libertas and are preparing to update to innopac. the andalucian university libraries have studied some of the systems that the spanish market offers: abys (baratz, document systems), amicus (elias), innopac (sls), sabini (sabini library automation), and unicorn (sirsi). they are preparing a catalog of electronic information resources available in the andalucian university libraries to know which resources are available and preferred by different universities.

the andalucian university libraries consortium is in an early stage; while its organizational structure and functions are defined, its tasks are still being elaborated. the delegate commission of the vice-principals of research of the andalucian universities is responsible for this work. the commission is presided over by the vice-principal of the university of seville and formed by the directors of the andalucian libraries and the juridical consultant of the university of cordoba.
the commission will produce a working paper that outlines the main facets of the organization, based on the following general principles:
• to add value to the research computer network
• to favor the use of technologies that contribute to the improvement of production times and the design of efficient processes
• to apply economies of scale:
  • in the purchase of products and services
  • in repetitive tasks and activities
• to favor the use of information resources among the members of the andalucian universities and society in general

in order for the project to succeed, the following conditions must exist:
• a homogeneous situation among the libraries in terms of regulations and technical instruments used in the description of materials, data format, and information interchange format;
• the andalucian universities are connected with high-speed optic fiber lines (32 mb);
• the administrative framework is clearly defined; and
• the responsible members of the andalucian university libraries are convinced that cooperation will substantially improve the quality of the library services in each university.

additionally, the following advantages must result:
• decline or leveling of production expenses
• economies of scale in the purchase of products such as computer systems, databases, and journal and electronic information subscriptions
• shared technical support
• shared training costs
• shared information resources through interlibrary loan

conclusions

the ultimate goal of cooperation is to bring together users and the documents and information they need; establishing relations among participant institutions is a means to that end. consortia represent the possibility to test alternatives to the traditional automated library. they represent the potential to offer the best library services to a wider number of users with all the resources they possess.
beyond simple cooperation that unites efforts and resources, consortia represent the possibility to test innovative formulas of process management and service organization from a regional perspective.

references

1. miguel duarte, "evaluación del rendimiento aplicando sistemas de gestión de calidad, la experiencia de la biblioteca de la universidad de cadiz" [performance assessment implementing total quality management systems. the university library of cadiz experience], in xv jornadas de gerencia universitaria: modelos de financiación, evaluación y mejora de la calidad de la gestión de los servicios [15th university managers meeting: financing models, assessment and quality assurance of services] (cadiz, university pr., 1997), 309-10; marta torres, "el impacto de las autopistas de la información para la comunidad academica y los bibliotecarios" [the information highway to academic community and librarians], in autopistas de la información: el reto del siglo xxi (madrid: editorial complutense, 1996), 37-55.
2. victor castelo en la mesa redonda: sueñan los informaticos con bibliotecas electrónicas. en seminario sobre consorcios de bibliotecas [do computer specialists dream of electronic libraries? round table at the libraries consortia conference], cadiz, university press, 1999, 130; see also www.rediris.es, accessed apr. 24, 2000.
3. m. jimenez and alice keefer, "library automation in spain," program 26, no. 3 (1992): 225-37; assumpcio stivill, "automation of university libraries in spain," telephasa seminar on innovative information services and information handling (tilburg, june 10-12, 1991); rebiun's statistical annual offers data about catalog automation.
4.
luis anglada and margarita taladriz, "pasado, presente y futuro de las bibliotecas universitarias españolas" [past, present and future of spanish university libraries] in ix jornadas de bibliotecas de andalucía (granada: asociación andaluza de bibliotecarios, 1996), 108-31.
5. l. anglada, "cooperació bibliotecaria a espanya [library cooperation in spain]," item 95, no. 16: 51-67.
6. see www.doc6.es/cdb, accessed apr. 24, 2000.
7. see http://biblioteca.upv.es/bib/mecano, accessed apr. 24, 2000.
8. see www.uned.es/bibliote/biblio/ruedo.htm and www.baratz.es/rueca, accessed apr. 24, 2000.
9. "the library in the university: report on the university libraries in spain, produced by a working team formed by university librarians and teachers" (madrid: ministry of general culture of the book and libraries, 1985); "university libraries: recommendations about its regulations, conference's on university libraries, 'castillo magalia,' las navas del marques," avila, may 27-28, 1986 (madrid: library coordination centre, 1987).
10. situación de las bibliotecas universitarias dependientes del mec [academic libraries from education department state of art] (madrid: universidad complutense, biblioteca, 1988); estudio sobre normalización e informatización de las bibliotecas cientificas españolas (fundesco, 1989, unpublished).
11. luis anglada and margarita taladriz, 108.
12. see consorcios de bibliotecas [consortia libraries conference], maribel gomez campillejo, ed. (cadiz: cadiz univ. pr., 1999).
13. see www2.uji.es/rebiun, accessed apr. 24, 2000.
14. for more information about cbuc, see www.cbuc.es, accessed apr. 24, 2000.
15. marta torres, los consorcios, forma de organización bibliotecaria en el s.xxi. una aproximación desde la perspectiva española. in consorcios de bibliotecas (library consortia conference), 17-35.
16.
santiago raya, "el consorcio de bibliotecas de galicia [galician library consortium]," in consorcios de bibliotecas [library consortia conference], cit, 117-25. 102 information technology and libraries i june 2000 library space information model based on gis — a case study of shanghai jiao tong university yaqi shen information technology and libraries | september 2018 99 yaqi shen (yqshen@sjtu.edu.cn) is a librarian at shanghai jiao tong university. abstract in this paper, a library-space information model (lsim) based on a geographical information system (gis) was built to visually show the bookshelf location of each book through the display interface of various terminals. taking shanghai jiao tong university library as an example, both spatial information and attribute information were integrated into the model. in the spatial information, the reading room layout, bookshelves, reference desks, and so on were constructed with different attributes. the bookshelf layer was the key attribute of the bookshelves, and each book was linked to one bookshelf layer. through the field of bookshelf layer, the book in the query system can be connected with the bookshelf-layer information of the lsim. with the help of this model, readers can search books visually in the query system and find the books’ positions accurately. it can also be used in the inquiry of special-collection resources. additionally, librarians can use this model to analyze books’ circulation status, and books with similar subjects that are frequently circulated can be recommended to readers. the library’s permanent assets (chairs, tables, etc.) could be managed visually in the model. this paper used gis as a tool to solve the problem of accurate positioning, simultaneously providing better services for readers and realizing visual management of books for librarians. introduction geographical information systems (gis) are powerful tools that can edit, store, analyze, display, and manage geographical data. 
early in 1992, several association of research libraries (arl) institutions, including the university of georgia, harvard university, north carolina state university, and southern illinois university, launched the gis literacy project and carried out an extensive survey about the possible applications of gis in libraries.1 since then, studies about the application of gis in library research have attracted more and more attention.2

gis is effective for library-planning efforts, such as investigating library-service areas, modeling the implications of the opening and closing of library services, informing initial location decisions, and so on.3 the university of idaho library adopted gis to link variables such as age, race, income, and education from the 2000 us census with the service-area maps of two proposed branch libraries. based on the thematic maps created, the demographic information about potential library users can be displayed. most importantly, the maps were also helpful for improving the library-service planning. koontz et al. from florida state university investigated the reasons for public-library closure by using gis. the authors presented a methodology using gis to describe libraries' geographic market to illustrate the effects of facility location, relocation, and permanent closure on potential users. sin used gis with inequality measures and multiple regressions to analyze statistics from the public-libraries survey and the census-tract data. then the nationwide multivariate study of the neighborhood-level variations was investigated, and the public libraries' funding and service landscapes were mapped.

library space information model based on gis | shen https://doi.org/10.6017/ital.v37i3.10308

gis can also provide strong support for library accessibility.4 in south wales, united kingdom, a case study about a preliminary analysis of spatial variations in accessibility of library services was carried out based on a gis model.
park further measured the public-library accessibility accurately and provided realistic analysis by using gis, including descriptive and statistical analyses and a road network–based distance measure. in another paper, park went a step further to measure readers' travel time and distance while they are using the library.

in addition to library planning and accessibility, gis can also be applied to managing collections, including the physical documents and digital databases of an academic library.5 solar and radovan from the national and university library of slovenia explored the possibility of creating a virtual collection of diverse materials like maps and pictorial documents using gis. they connected spatial data with other pictorial elements, including views and portrait images with hyperlinks.6 coyle from rochester public library studied the implementation of gis in the library collection. he believed that libraries that implemented gis early on would have an intellectual advantage over those coming on board later.7 sedighi conducted research about gis as a decision-support system in analyzing geospatial data in the databases of an academic library. by using the analysis functions of the system, a range of features could be indicated; for example, the spatial relationships of data based on the educational course can be analyzed.8 boda used a 3d virtual-library model to represent the most prominent and celebrated collection of classical antiquity in the alexandria library.9

beyond the applications mentioned above, some libraries have used gis techniques to analyze reader behaviors.10 xia developed gis into an analytical tool for examining the relationships between the height of the shelf and the frequency of book use, revealing that readers tended to pull books off shelves that are easily reachable by human eyes and hands. mandel used gis to map the most popular routes that readers took when entering the library.
based on the seating sweeps method, mandel adopted maps to depict use of tables and computers. the research results of both xia and mandel provide information about readers' behavior, on the basis of which the books' positions, the entry routes, and the evaluation of facilities can be adjusted strategically.

although much work has been done on the application of gis to the library, there are few reports about visually showing the exact position of each book through the library-catalog display interface, which is of great importance both for the readers and the librarians. xia located library items with gis and pointed out that updating the starting and ending call numbers for each shelf could be the most tedious work.11 in particular, gis alone cannot tell whether a book is out of its correct location or is being used by somebody else. xia advised combining gis with radio frequency identification (rfid), both of which have the capability of tracing the location of each book. stackmap, a library-mapping tool providing a collection-mapping product for librarians, was being used at the hampton library.12

the shanghai jiao tong university library built an interface that would use gis to identify the specific location of each book in the catalog. a gis model that includes spatial and attribute information was constructed. the connection of gis, rfid, and opac was discussed in detail. additionally, the relationship between the bookshelves and patrons' behavior was studied in depth. it is hoped that this gis model will bring convenient services for readers and efficient management for librarians.

methodology

background

in 1984, the shanghai jiao tong university circulation system was built based on barcode-reader technology. the first automated library-management system (lms), minisis and image library integrated system, was implemented in 1988. in 1993, the second lms, the unify online multiuser system, was implemented.
in 1994, an online public access catalogue (opac) system was built based on the unils, allowing readers to query the library's bibliographic records through the computer. in 1998, the third automated lms, a client/server–based tool, was built based on the horizon lms. in 2008, we launched the aleph integrated library system (ils). in the same year, primo, a resource discovery and access system, was introduced. in 2009, the our explore interface was built based on the primo system, providing the services of resource retrieval and access.13 rfid technology was introduced in 2014, and now readers can borrow or return books through self-service machines.

users can find a book via the opac or our explore system in the shanghai jiao tong university library homepage (http://www.lib.sjtu.edu.cn/index.php?m=content&c=index&lang=en), a screenshot of which is shown in figure 1. book information can be found through the systems, but the exact position of the books cannot be exhibited in the system. at the library reference desk, the question readers ask most frequently is where they can find a certain book. the chinese library classification (clc) system is used to organize the collections in the shanghai jiao tong university library. the librarians are very familiar with the classification. however, it is hard for inexperienced users to understand, even if they have been trained. although static maps can guide patrons to the books, patrons sometimes still have difficulty finding them. if readers could get the exact bookshelf location for a book through the opac or our explore system, the users' experience could be improved significantly, and much of the time readers spend finding books could be saved. therefore, it is necessary to introduce gis to the library with the aim of visually showing the position of each book.

furthermore, library managers need to plan the budget at the end of every year. the arrangement of different subjects should be considered in the planning.
although collection-usage data from the ils provide a reference for the planning, a library-space information model (lsim) would bring new insight.

software

there are many kinds of gis software in this research field, including commercial products such as arcgis, mapinfo, and mapgis as well as free and open-source software (foss) solutions. comparing foss and arcgis, foss reflects the broader context of the open-source software movement and its developments in gis.14 however, no single foss package can match all the functionality that arcgis has for creating thematic maps, and the spatial-analysis and data-processing functions of arcgis are more powerful. the software used in this study is the arcgis 10.3 trial version.

figure 1. opac and our explore in the shanghai jiao tong university library homepage.

methods

the lsim has two modules, spatial information and attribute information, as shown in figure 2. spatial information, including the building position, the reading-room layout, bookshelf information, and so on, is converted to shapefile format. remote-sensing information is used to set the geographic location of the library. these elements are constructed with different attributes, and 2d-attribute and 3d-multipatch data are stored in the geodatabase. arcmap and arcscene are used to generate the 2d and 3d maps and analyze the readers' behavior. we connect the spatial information with data from the opac, our explore, and rfid. the query fields (which we call "general information") in the opac are title, author, keyword, call number, issn, isbn, system number, barcode, collection location, and publisher.
in the our explore system, readers can not only search the general information, but also refine the search results by specific fields, such as topic, author, collection location, published date, and clc. the functions of book reserving and renewing are also supported by these two systems. rfid was introduced to the shanghai jiao tong university library to allow self-service, and its fields include collection location, subject, issn, isbn, barcode, and so on. barcode is the common field in all three systems and is used to connect them.

in the rfid system, the bookshelf layer uniquely identifies each shelf in the bookshelves. in the shanghai jiao tong university library, the first-book location method is used to manage books in the rfid system. the first book on each bookshelf is recorded as a different bookshelf location, and the books on one bookshelf are assigned to the same bookshelf location. the books are ordered and arranged according to the call number. a book's current status can be obtained in the rfid system by shelf inventory. the books that are borrowed by patrons or not on the right shelf would be recorded in the rfid system.

the key attribute information in the lsim is the bookshelf layer, which is used to describe the book's position. the field of the bookshelf layer is connected with the rfid data. taking the bookshelf layer of rfid as the attribute field, the position of a book can be located by the bookshelf layer in the lsim. compared to xia's research, it is easier to get the bookshelf-layer information based on the rfid in the lsim.17

figure 2. research flowchart.

the connection of the opac, rfid, and lsim is shown in figure 3. when the reader locates a book in the opac or our explore, the barcode will be shown in the system. the bookshelf layer in the rfid system can be retrieved through the barcode immediately. the map of the reading room has been embedded in the opac.
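
the lookup chain described above (barcode in the catalog, bookshelf layer in the rfid inventory, position in the lsim) can be illustrated with a minimal python sketch. this is not the library's implementation: the two dictionaries, the `ShelfPosition` record, and the `locate_book` function are hypothetical stand-ins for the rfid and lsim tables, and the coordinate values are invented. only the sample barcode and bookshelf-layer identifier are taken from the worked example in the paper's discussion section.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ShelfPosition:
    """Hypothetical position record for one bookshelf layer."""
    x: float
    y: float
    height: float


# Assumed stand-in for the RFID inventory: barcode -> bookshelf-layer id.
RFID_SHELF_LAYER = {"32832872": "a4r042c04"}

# Assumed stand-in for the LSIM attribute table: layer id -> position.
# The coordinate values are invented for illustration only.
LSIM_POSITIONS = {"a4r042c04": ShelfPosition(x=12.5, y=4.0, height=1.2)}


def locate_book(barcode: str) -> Optional[ShelfPosition]:
    """Follow barcode -> bookshelf layer -> position; None if any link is missing."""
    layer = RFID_SHELF_LAYER.get(barcode)
    if layer is None:
        # Book not found in the shelf inventory (e.g. on loan or misshelved).
        return None
    return LSIM_POSITIONS.get(layer)


if __name__ == "__main__":
    print(locate_book("32832872"))
```

returning `None` when either lookup fails mirrors the point made earlier that borrowed or misshelved books are tracked by the rfid system rather than by the map itself.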
furthermore, the coordinates of the book (x, y, height) can be shown through the bookshelf layer. an index of the bookshelf coordinates is created in the opac, rfid system, and lsim. a map-presentation field is built in the opac, and the search interface is supported by arcmap and arcscene. the field holds a url link, whose value varies from bookshelf to bookshelf. in short, when the reader searches for a book, the related bookshelf coordinates are highlighted in the map.

through the bookshelf layer field, the book information in the query system can be connected with that of the lsim. faculty and students can search books in the query system visually. as shown in figure 2, spatial information and attribute information are connected in the lsim. furthermore, an lsim based on gis is built to provide better services for readers and enhance librarians' visual management.

figure 3. the connection of the opac, rfid system, and lsim.

figure 4. finding a book in the our explore system.

figure 5a. the visual position of the book with the call number r318-53/3 (2d).

figure 5b. the visual position of the book with the call number r318-53/3 (3d).

discussion

providing services for readers by lsim

visual query in the reading room

when a book about biological medicine is required, it can be searched by using the keyword "biological medicine" in our explore. then, as shown in figure 4, a book titled amalgamation within evolution can be found with the clc call number r318-53/3. readers can find the book with the call number in the corresponding reading room. however, if the lsim is applied, the search results include not only the text information about the book's location, but also a visual map.
firstly, the barcode of the book (32832872) is identified and passed to the bookshelf layer. the bookshelf layer (a4r042c04) will be found in the lsim. then the book's spatial position can be shown on a visual map. figures 5a and 5b show the 2d and 3d visual position of the book with the call number r318-53/3, and these two results can be switched in the system. the red arrow marks the book's position. based on the visual position, readers can find the book more conveniently.

the reading rooms in shanghai jiao tong university library are organized by subject. in each reading room, the books with related categories are distributed together. figures 5a and 5b show the layout of one reading room. the books with the large clc classes, i.e., o, p, q, r, and s, were studied as an example in the reading room in this paper. the red triangles represent chairs and the light green rectangles represent desks. shelves are alphabetically labeled. the reference desk, office area, group study room, storehouse, inquiry machines, printers, and stairs are also shown.

special collections in different reading rooms

in the shanghai jiao tong university library, there are many special collections, such as contract documents, tsung-dao lee's manuscripts, alumni theses, important findings of research teams, and so on. because of their rarity, these special collections do not circulate and can only be read in the reading rooms. furthermore, these collections are located in different branch libraries. the geographical information of these resources can be input into the model. scholars can use the lsim to obtain the exact positions of these resources, go directly to the related area, and quickly find these special items.

library analysis and management

book-borrowing situation analysis

using gis, it is also possible to show how often books circulate based on their physical location. as shown in figure 6, each rectangle represents a shelf in the reading room.
the books with the same topic are placed on the same shelf. the number labeled on the shelf represents the average borrowing frequency of the books on this shelf. different colors indicate different frequencies, on a scale of five to one hundred. the clc classes o, p, and q appearing on the right of the shelves represent mathematical sciences and chemistry, astronomy and geosciences, and bioscience, respectively.

figure 6. average borrowing frequency of the books on each shelf in one reading room.

based on analysis of the relationship between borrowing frequency and subject category, the hot spots of the professional fields can be found and shown. in turn, books related to the hot spots can be recommended to readers. taking class o as an example, the shelf position of the highest borrowing frequency (100) is in row 9, column 2. according to the query system, the theme of the books on this shelf is high polymer chemistry. the books with high borrowing frequency can be highlighted both on the bookshelf and in the query system. if the higher-borrowing-frequency books on the remote shelves meet school discipline development policy, the purchases of these books will be increased. books related to the subjects with the higher borrowing frequency on the taller or lower shelves will also be considered, and vice versa.

permanent-assets management

permanent assets such as chairs, desks, shelves, inquiry machines, printers, etc., can be managed in this model. information about permanent assets (such as their status, spatial position, etc.) was input in the model, as is shown in figures 5a and 5b. librarians can find the visual positions of permanent assets at any time, and readers can conveniently find the inquiry machines or printers to search books and print documents.
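
the borrowing-frequency analysis in this section amounts to averaging loan counts per shelf and then picking the highest average within a clc class. the python sketch below illustrates that calculation under stated assumptions: the record layout, the shelf identifiers, and the loan counts are all invented for illustration, and the paper does not say how the averages were actually computed inside arcgis.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical circulation records: (shelf id, CLC class, times borrowed).
# Shelf ids and counts are invented; they are not data from the study.
loans = [
    ("o-r09-c02", "o", 120),
    ("o-r09-c02", "o", 80),
    ("o-r01-c01", "o", 10),
    ("p-r03-c01", "p", 30),
]


def average_frequency_by_shelf(records):
    """Return {shelf id: average borrow count of the books on that shelf}."""
    counts = defaultdict(list)
    for shelf, _clc_class, borrowed in records:
        counts[shelf].append(borrowed)
    return {shelf: mean(values) for shelf, values in counts.items()}


def hottest_shelf(records, clc_class):
    """Return (shelf id, average) for the most-borrowed shelf in one CLC class.

    Raises ValueError if no record belongs to the requested class.
    """
    averages = average_frequency_by_shelf(
        record for record in records if record[1] == clc_class
    )
    return max(averages.items(), key=lambda item: item[1])


if __name__ == "__main__":
    print(average_frequency_by_shelf(loans))
    print(hottest_shelf(loans, "o"))
```

in a gis workflow, the resulting per-shelf averages would be joined to the shelf polygons and used to color the map, which is the role figure 6 plays in the paper.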
library space information model based on gis | shen https://doi.org/10.6017/ital.v37i3.10308

future directions

the lsim has been tested in only one reading room and is still experimental. the model will be expanded to the whole library, providing visual information about library books and materials. as the model is used, the potential of gis in the library will be exploited to provide better services for readers and managers.

conclusion

to meet readers’ need to locate books in the library, the lsim was built to show visually the exact bookshelf layer that holds a book. spatial and attribute information are combined in the model. using the model, readers can search for books and find their positions. special collections located in different branches can also be found easily in the model. the gis model not only brings convenience to readers but also supports the library’s analysis and management. librarians can analyze books’ circulation history based on the relationship between borrowing frequency and subject category. books with higher borrowing frequency, and books related to them, can be recommended to readers. based on this analysis, more copies will be purchased of higher-borrowing-frequency books shelved in remote, taller, or lower places. permanent assets can also be managed, and librarians can conveniently find the status and spatial position of the inquiry machines, printers, and so on. in short, applying gis in the library gives visual insight into the library, providing a better reader experience and better library management.

acknowledgements

i thank guo jing, chen jiayi, and huang qinling, shanghai jiao tong university library, for their advice on the structure of this article and the grammar of the written english. i also thank liu min and peng xia, east china normal university, for their help in the model building.
research was funded by the “fundamental research funds for the central universities” (grant 17jcya13), shanghai jiao tong university.

endnotes

1 d. kevin davie, james fox, and barbara preece, the arl geographic information systems literacy project. spec kit 238 and spec flyer 238 (washington, dc: association of research libraries, 1999).

2 b. w. bishop and l. h. mandel, “utilizing geographic information systems (gis) in library research,” library hi tech 28, no. 4 (2010): 536–47.

3 karen hertel and nancy sprague, “gis and census data: tools for library planning,” library hi tech 25, no. 2 (2007): 246–59, https://doi.org/10.1108/07378830710755009; christie m. koontz, dean k. jue, and bradley wade bishop, “public library facility closure: an investigation of reasons for closure and effects on geographic market areas,” library and information science research 31, no. 2 (2009): 84–91, https://doi.org/10.1016/j.lisr.2008.12.002; sei-ching joanna sin, “neighborhood disparities in access to information resources: measuring and mapping u.s. public libraries’ funding and service landscapes,” library and information science research 33, no. 1 (2011): 41–53, https://doi.org/10.1016/j.lisr.2010.06.002.

4 gary higgs, mitch langford, and richard fry, “investigating variations in the provision of digital services in public libraries using network-based gis models,” library and information science research 35, no. 1 (2013): 24–32, https://doi.org/10.1016/j.lisr.2012.09.002; sung jae park, “measuring public library accessibility: a case study using gis,” library and information science research 34, no. 1 (2012): 13–21, https://doi.org/10.1016/j.lisr.2011.07.007; sung jae park, “measuring travel time and distance in library use,” library hi tech 30, no. 1 (2012): 151–69, https://doi.org/10.1108/07378831211213274.
5 wang xuemei et al., “applications and researches of geographic information system technologies in bibliometrics,” earth science informatics 7, no. 3 (2014): 147–52, https://doi.org/10.1007/s12145-013-0132-4.

6 renata solar and dalibor radovan, “use of gis for presentation of the map and pictorial collection of the national and university library of slovenia,” information technology and libraries 24, no. 4 (2005): 196–200, https://doi.org/10.6017/ital.v24i4.3385.

7 andrew coyle, “interior library gis,” library hi tech 29, no. 3 (2011): 529–49, https://doi.org/10.1108/07378831111174468.

8 mehri sedighi, “application of geographic information system (gis) in analyzing geospatial information of academic library databases,” electronic library 30, no. 3 (2012): 367–76, https://doi.org/10.1108/02640471211241645.

9 istván boda et al., “a 3d virtual library model: representing verbal and multimedia content in three dimensional space,” qualitative and quantitative methods in libraries 4, no. 4 (2017): 891–901.

10 xia jingfeng, “using gis to measure in-library book-use behavior,” information technology and libraries 23, no. 4 (2004): 184–91, https://doi.org/10.6017/ital.v23i4.9663; lauren h. mandel, “toward an understanding of library patron wayfinding: observing patrons’ entry routes in a public library,” library and information science research 32, no. 2 (2010): 116–30, https://doi.org/10.1016/j.lisr.2009.12.004; lauren h. mandel, “geographic information systems: tools for displaying in-library use data,” information technology and libraries 29, no. 1 (2010): 47–52, https://doi.org/10.6017/ital.v29i1.3158.

11 xia jingfeng, “locating library items by gis technology,” collection management 30, no. 1 (2005): 63–72, https://doi.org/10.1300/j105v30n01_07.

12 matt enis, “technology: capira adds stackmap,” library journal 139, no. 13 (2014): 17.

13 chen jin, the history of shanghai jiao tong university library (shanghai: shanghai jiao tong university press, 2013).

14 francis p. donnelly, “evaluating open source gis for libraries,” library hi tech 28, no. 1 (2010): 131–51, https://doi.org/10.1108/07378831011026742.

66 journal of library automation vol. 2/2 june, 1969

appendix i

preliminary guidelines for the implementation of the proposed american standard for bibliographic information interchange on magnetic tape

this appendix is not part of the proposed standard but is included to illustrate its application in one environment and recommended application in another. part a of the appendix contains general guidelines which apply to both parts b and c.
part b contains the preliminary guidelines for the library of congress, national library of medicine, and national agricultural library implementation of the standard. part c contains the proposed preliminary committee on scientific and technical information (cosati) guidelines for the implementation of the standard.

a. general

1. labels
volume header and file header labels are required and will conform to usas proposed standard x.3/552 magnetic tape labels and file structure.

2. character codes
a code for a diacritical will always be recorded before the code for the alphabetic character which it modifies.

2.1 character definitions
2.1.1 delimiter. the delimiter will consist of the "unit separator" (ascii character 1/15).
2.1.2 field terminator. the field terminator will consist of the "record separator" (ascii character 1/14).
2.1.3 record terminator. the record terminator will consist of the "group separator" (ascii character 1/13).
2.1.4 padding character. the padding character will consist of the "space" (ascii character 2/0).

2.2 basic character set
the basic character set will consist of the characters in columns 2, 3, 6 and 7 of the standard code as defined in usas x3.4-1967 code for information interchange, p. 6. this basic character set is included as part of the illustration on p. 82 of this appendix, columns 2, 3, 6 and 7.

3. type-of-record symbols
the following table indicates the type-of-record symbols that have been assigned at this time:

symbol   meaning
a        printed text
b        manuscript text
c        printed music
d        manuscript music
e        printed maps
f        manuscript maps
g        motion pictures; films
h        microform publications
i        recorded sound (language)
j        recorded sound (music)
k        pictures
l        digital media
x        authority data-names
y        authority data-subjects

4. bibliographic level symbols
the following table indicates the bibliographic level symbols that have been assigned at this time:

symbol   meaning
a        analytical (a bibliographic unit generally not published separately but part of a larger bibliographic entity)
m        monographic publication
s        serial publication (a bibliographic unit issued in successive parts, usually dated or numbered, intended to be continued indefinitely)
c        collective (a made-up collection which is gathered together and cataloged as a unit)
         indicates that this data element is not used.

5. status symbols
the following table indicates the status symbols that have been assigned at this time. the meaning of the symbols is relative to the transmitting source.

symbol   meaning
n        new record
c        changed or corrected record (complete record to be substituted for one previously transmitted)
d        deleted record

b. preliminary guidelines for the library of congress, national library of medicine, and national agricultural library implementation of the proposed american standard for a format for bibliographic information interchange on magnetic tape as applied to records representing monographic materials in textual printed form (books)

1. labels

1.1 header labels
the following table indicates the data elements of the volume and file labels and their permissible values.

1.1.1 volume header

data element name        length   contents
label identifier         3        "vol"
label number             1        "1"
volume serial number     6        reserved for user
accessibility            1        ƀ
unused                   26       reserved for user
format description       28       usaslbz39.2-1969bbiibfmtli~bl313
label standard level     1

1.1.2 file header

data element name        length   contents
label identifier         3        "hdr"
label number             1        "1"
file identifier          17       mixedƀbiblioƀdata
set identifier           6        marcƀ2
file section number      4        "0001"
file sequence number     4        "0001"
unused                   6        blanks
creation date            6        "ƀyyddd"
expiration date          6        "ƀyyddd" or blanks
accessibility            1        ƀ
block count              6        "000000"
system code              13       reserved for user
unused                   7        blanks

1.2 end of file

data element name        length   contents
label identifier         3        "eof"
label number             1        "1"
the next 50 characters correspond to those in the same positions in the header label.
block count              6        nnnnnn
the next 20 characters correspond to those in the same positions in the header label.

ƀ = blank
n = decimal digit
yy = last two digits of year
ddd = day number in julian calendar

2. delimiter and data element identifier
the delimiter will consist of the unit separator. the data element identifier will consist of one basic character. a delimiter will precede each data element identifier which in turn precedes each data element that it identifies. the first data element in each variable field will always be preceded by a delimiter and a data element identifier (even though there is only one data element in the field).

3. indicator
two indicators will be used as the first two data elements in each variable data field. each indicator will consist of one basic character. if an indicator is not used, it will be set to blank (ascii character 2/0). no indicators are used in the control fields.

4. leader
the following table indicates the data elements in the leader and their permissible values and formats.

record length            decimal digits, right justified, with leading zeros.
status                   as defined in paragraph a.5 of this appendix.
type-of-record           "a"
bibliographic level      "m"
indicator count
delimiter count
base address of data     decimal digits, right justified, leading zeros.
entry map                "4500"

5. directory
each directory entry consists of the following data elements:

tag                           3 decimal digits
length of field               4 decimal digits, right justified, leading zeros.
starting character position   5 decimal digits, right justified, leading zeros.

the directory ends with a field terminator.

6. control fields

tag
001   control number
008   fixed length data
      character positions
      0-5     date entered on file
      6       type of publication
      7-10    date of publication 1
      11-14   date of publication 2
      15-17   country of publication code
      18-21   illustration codes
      22      intellectual level code
      23      form of reproduction code
      24-27   form of contents codes
      28      government publication indicator
      29      conference proceedings indicator
      30      festschrift indicator
      31      index indicator
      32      main entry in body of entry indicator
      33      fiction indicator
      34      biography code
      35-37   language code
      38      modified record indicator
      39      cataloging source code

7. variable field data elements
columns: tag; indicators 1 and 2; data element identifier (preceded by a "unit separator"); name

010
library of congress card number
    data elements: a library of congress card number
011 linking library of congress card number
    data elements: a linking library of congress card number
015 national bibliography number
    data elements: a national bibliography number
016 linking national bibliography number
    data elements: a linking national bibliography number
020 standard book number
    data elements: a standard book number
021 linking standard book number
    data elements: a linking standard book number
025 overseas acquisition number
    data elements: a overseas acquisition number
026 linking oan number
    data elements: a linking oan number
035 local system number
    data elements: a local system number
036 linking local system number
    data elements: a linking local system number
040 cataloging source
    data elements: a cataloging source
041 language(s)
    indicator: 0 work contains more than one language; 1 work is a translation
    data elements: a group of 3-character language codes needed to describe languages of the text or its translation; b languages of summaries
042 search code
    data elements: a search code
050 library of congress call number
    indicator: 0 book is in library of congress; 1 book is not in library of congress
    data elements: a library of congress classification number; b book number
051 copy, issue, offprint statement
    data elements: a library of congress classification number; b book number; c copy information
060 national library of medicine call number
    indicator: 0 book is in national library of medicine; 1 book is not in national library of medicine
    data elements: a national library of medicine classification number; b book number
070 national agricultural library call number
    indicator: 0 book is in national agricultural library; 1 book is not in national agricultural library
    data elements: a national agricultural library classification number; b book number
071 national agricultural library subject category
    data elements: a national agricultural library subject category
080 universal decimal classification number
    data elements: a udc number
081 british national bibliography classification number
    data elements: a bnb classification number
082 dewey decimal classification number
    data elements: a ddc number
086 supt. of documents classification number
    data elements: a supt. of documents classification number
090 local call number
100 personal name as main entry (names may be established in conformity with the ala or anglo-american rules.)
    indicator 1: 0 forename only; 1 single surname; 2 multiple surname; 3 name of family
    indicator 2: 0 main entry is not subject; 1 main entry is subject
    data elements: a name; b numeration; c titles and other words associated with name; d dates; e relator; k form subheading; t title (of book)
110 corporate name as main entry
    indicator 1: 0 surname (inverted); 1 place or place and name; 2 name (direct order)
    indicator 2: 0 main entry is not subject; 1 main entry is subject
    data elements: a name; b each subordinate unit; e relator; k form subheading; t title (of book)
111 conference or meeting as main entry
    indicator 1: 0 surname (inverted); 1 place and name; 2 name (direct order)
    indicator 2: 0 main entry is not subject; 1 main entry is subject
    data elements: a name; b number; c place; d date; e subordinate unit in name; g other information; k form subheading; t title (of book)
130 uniform title heading as main entry
    indicator 1: ƀ null condition in first indicator
    indicator 2: 0 main entry is not subject; 1 main entry is subject
    data elements: a uniform title heading; t title (of a book)
240 uniform title
    indicator: 0 not printed on lc card; 1 printed on lc card
    data elements: a uniform title
241 romanized title
    indicator: 0 does not receive title added entry; 1 receives title added entry
    data elements: a romanized title
242 translated title
    data elements: a translated title
245 title statement
    indicator: 0 no title added entry in this form; 1 title added entry in this form
    data elements: a short title; b remainder of title; c transcription of remainder of title page up to next field
250 edition statement
    data elements: a edition; b additional information
260 imprint
    indicator: 0 publisher is not main entry; 1 publisher is main entry
    data elements: a place; b publisher; c date
300 collation
    data elements: a pagination or volumes; b illustration(s); c height
350 bibliographic price
    data elements: a bibliographic price
360 converted price
    data elements: a converted price
400* series note-personal name
    indicator 1: 0 forename only; 1 single surname; 2 multiple surname; 3 name of family
    indicator 2: 0 author of series is not main entry; 1 author of series is main entry
    data elements: a name; b numeration; c titles, other name-associated words; d dates; e relator; k form subheading; t title (of series); v volume or number
410* series note-corporate name
    indicator 1: 0 surname (inverted); 1 place or place and name; 2 name (direct order)
    indicator 2: 0 author of series is not main entry; 1 author of series is main entry
    data elements: a name; b each subordinate unit; e relator; k form subheading; t title (of series); v volume or number
411* series note-conference
    indicator: 0 surname (inverted); 1 place and name; 2 name (direct order)
    data elements: a name; b number; c place; d date; e subordinate unit in name; g other information; k form subheading; t title (of book); v volume or number
440* title
    data elements: a title; v volume or number
* used only when series is traced in the same form.
490 series untraced or traced differently
    indicator: 0 series not traced; 1 series traced differently
    data elements: a series statement
500 general note
    data elements: a general note
501 "bound with" note
    data elements: a "bound with" note
502 dissertation note
    data elements: a dissertation note
503 bibliographic history note
    data elements: a bibliographic history note
504 bibliography note
    data elements: a bibliography note
505 formatted contents note
    indicator: 0 "complete" contents; 1 "incomplete" contents
    2 partial contents
    data elements: a contents note
506 "limited use" note
    data elements: a "limited use" note
520 abstract or annotation
    data elements: a abstract or annotation
600 personal name as subject added entry
    indicator 1: 0 forename only; 1 single surname; 2 multiple surname; 3 name of family
    indicator 2: 0 lc subject heading; 1 subj. heading assigned for use in children's catalog; 2 nlm subject heading; 3 nal subject heading
    data elements: a name; b numeration; c titles, other name-associated words; d dates; e relator; k form subheading; t title (of book); x general subdivision; y period subdivision; z place subdivision
610 corporate name as subject added entry
    indicator 1: 0 surname (inverted); 1 place or place and name; 2 name (direct order)
    indicator 2: 0 lc subject heading; 1 subj. heading assigned for use in children's catalog; 2 nlm subject heading; 3 nal subject heading
    data elements: a name; b each subordinate unit; e relator; k form subheading; t title (of book); x general subdivision; y period subdivision; z place subdivision
611 conference as subject added entry
    indicator 1: 0 surname (inverted); 1 place and name; 2 name (direct order)
    indicator 2: 0 lc subject heading; 1 subj. heading assigned for use in children's catalog; 2 nlm subject heading; 3 nal subject heading
    data elements: a name; b number; c place; d date; e subordinate unit in name; g other information; k form subheading; t title (of book); x general subdivision; y period subdivision; z place subdivision
630 uniform title heading as subject added entry
    indicator 1: ƀ null condition in first indicator
    indicator 2: 0 lc subject heading; 1 subj. heading assigned for use in children's catalog; 2 nlm subject heading; 3 nal subject heading
    data elements: a uniform title heading; t title (of book); x general subdivision; y period subdivision; z place subdivision
650 topical subject added entry
    indicator 1: 0 not entered under place; 1 entered under place
    indicator 2: 0 lc subject heading; 1 subj. heading, children's catalog; 2 nlm subject heading; 3 nal subject heading
    data elements: a topical subject heading; b name following place entry element; x general subdivision; y period subdivision; z place subdivision
651 geographic name (not capable of authorship) as subject added entry
    indicator 1: 0 not entered under place; 1 entered under place
    indicator 2: 0 lc subject heading; 1 subj. heading assigned for use in children's catalog; 2 nlm subject heading; 3 nal subject heading
    data elements: a geographic name; b geographic name following place entry element; x general subdivision; y period subdivision; z place subdivision
652 political jurisdiction as subject added entry
    indicator 1: ƀ null condition in first indicator
    indicator 2: 0 lc subject heading; 1 subj. heading assigned for use in children's catalog; 2 nlm subject heading; 3 nal subject heading
    data elements: a political jurisdiction; x general subdivision; y period subdivision; z place subdivision
690 local subject headings
    indicator 1: ƀ reserved for user
    indicator 2: ƀ reserved for user
    data elements: a subject heading; x general subdivision; y period subdivision; z place subdivision
700 personal name as added entry
    indicator 1: 0 forename only; 1 single surname; 2 multiple surname; 3 name of family
    indicator 2: 0 alternative entry; 1 secondary entry; 2 analytical entry
    data elements: a name; b numeration; c titles, other name-associated words; d dates; e relator; k form subheading; t title (of book); u non-printing filing information
710 corporate name as added entry
    indicator 1: 0 surname (inverted); 1 place or place and name; 2 name (direct order)
    indicator 2: 0 alternative entry; 1 secondary entry; 2 analytical entry
    data elements: a name; b each subordinate unit; e relator; k form subheading; t title (of book); u non-printing filing information
711 conference as added entry
    indicator 1: 0 surname (inverted); 1 place and name; 2 name (direct order)
    indicator 2: 0 alternative entry; 1 secondary entry; 2 analytical entry
    data elements: a name; b number; c place; d date; e subordinate unit in name; g other information; k form subheading; t title (of book); u non-printing filing information
730 uniform title heading as added entry
    indicator 1: ƀ null condition in first indicator
    indicator 2: 0 alternative entry; 1 secondary entry; 2 analytical entry
    data elements: a uniform title heading; t title; u non-printing filing information
740 title traced differently from short title
    indicator 1: ƀ null condition in first indicator
    indicator 2: 0 alternative entry; 1 secondary entry; 2 analytical entry
    data elements: a title traced differently from short title
750 name not capable of authorship
    indicator 1: 0 not entered under place; 1 entered under place
    indicator 2: 0 alternative entry; 1 secondary entry; 2 analytical entry
    data elements: a name or place entry element; b name following place entry element
800* personal name-title series added entry
810* corporate name-title series added entry
811* conference-title series added entry
840* title series added entry
900 block of 100 numbers for local use

8. extended ascii character set for roman alphabet and romanized non-roman alphabets

8.1 scope
a library character set for the roman alphabet and romanized non-roman alphabets necessitates a larger number of characters than are provided for in the 7-bit american standard code for information interchange (ascii). in addition, many libraries only have a 6-bit capability. therefore, it was necessary to develop a character set which would meet all of the following requirements: (a) leave the 7-bit standard (ascii) intact, (b) expand the 7-bit standard to include an 8th bit to provide additional characters, (c) provide a shift mechanism which would make it possible to use all of the characters defined in the 8-bit set in the 6-bit environment. this section describes such a character set.

8.2 criteria governing selection of characters
8.2.1 frequency of occurrence of character
8.2.2 degree of necessity in expressing character when it occurred
8.2.3 possibility of substituting one character for another or of expressing a character by writing it out

* tags in the 800's are used for series added entries traced differently from the series statement.
with the exception that no second indicators are used in the 800's, the indicators and data element identifiers are the same as those used with the 400's.

8.3 digital codes
the correlation of the character set to digital form code is based upon the ascii (american standard code for information interchange) standard. in conformance with the design considerations of ascii (7-bit code), the character set is also correlated to an 8-bit code and a 6-bit code. the basic digital form code for the character set is the 8-bit code (see figure 1).

8.3.1 the 8-bit code is an extended form of the standard 7-bit ascii. some of the standard ascii characters such as the braces or the backwards slash are not proposed for the character set. however, no characters will be substituted for these code positions. other characters such as diacritical marks will be left in their standard position (unused) and duplicated in another portion of the code set reserved for special characters and diacriticals.

8.3.2 the 7-bit code will be derived from the 8-bit code by removing the 8th bit. those characters which previously had a 0 in the 8th bit will be considered part of the standard 7-bit ascii set. those with a 1 in the 8th bit will be considered part of the nonstandard set. a so (shift out) control character will be used to go from the standard set to the non-standard. the code will stay in the nonstandard mode until a si (shift in) control character is reached.

7-bit                                        8-bit
si   usascii                                 8th bit = 0
so   special characters and diacriticals     8th bit = 1

8.3.3 the 6-bit code will be derived by removing the 6th bit and the 8th bit. the 8-bit code set will be divided into 4 sets as follows:

columns 2, 3, 6 & 7 = standard set (referred to as the "basic set" in the proposed standard for a format for bibliographic information interchange, p. 56 and in paragraph a.2 of this appendix)

fig. 1.
proposed extended ascii character set
[figure: 8-bit code chart. legend: standard 6-bit set, non-standard set 1, non-standard set 2. key: (1) redefined elsewhere in the set. (2) to be used as terminators or delimiters. (3) to be used as shift codes for 6-bit set (non-locking).]

columns 0, 1, 4 & 5 = non-standard set (1)
columns 10, 11, 14 & 15 = non-standard set (2)

character 7b in the standard set will be used as a non-locking shift code to reach non-standard set (1); character 7/11 (column 7, row 11 of figure 1) will shift to non-standard set (2). the presence of one of these codes will indicate that the next character is in the appropriate non-standard set. the code will then be automatically shifted back to the standard set.

c.
proposed preliminary cosati guidelines for the implementation of the proposed american standard for a format for bibliographic information interchange on magnetic tape

(this implementation guide, prepared by a panel of the committee on scientific and technical information, should be regarded as a committee working paper until approval by the federal council for science and technology, to which it is in process of being presented.)

1. labels

1.1 header labels
1.1.1 volume header
as specified in paragraph b.1.1.1 of this appendix.
1.1.2 file header
as specified in paragraph b.1.1.2 of this appendix, except that the set identifier (field 4) shall contain the characters "cosati."

2. delimiter
the delimiter shall consist of the "unit separator" (ascii character 1/15).

3. indicator
the indicator is a two-character code consisting of basic characters specifying the origin or authority for the data in each variable field. the codes as presently assigned are as follows:

federal agency                                   code
legislative branch
    general accounting office                    gg
    government printing office                   gp
    library of congress                          li
judicial branch
    administrative office of the u.s. courts     ao
    the supreme court of the u.s.                si
executive branch
    american battle monuments commission                     ac
    appalachian regional commission                          ar
    atomic energy commission                                 ai
    bureau of the budget                                     bo
    canal zone government                                    cv
    central intelligence agency                              cl
    civil aeronautics board                                  cc
    commission of fine arts                                  ci
    council of economic advisers                             cf
    delaware river basin commission                          ee
    department of agriculture                                al
    department of commerce                                   co
    department of defense
        office, secretary of defense (includes defense agencies not indicated below)   dd
        department of army                                   da
        department of navy                                   dn
        department of air force                              df
        defense supply agency                                ds
        defense atomic support agency                        dh
        defense communications agency                        dk
    department of health, education, & welfare               hh
    department of housing and urban development              hu
    department of interior                                   in
    department of justice                                    ju
    department of labor                                      la
    department of state                                      su
    agency for international development                     sv
    peace corps                                              sw
    department of transportation                             to
    department of treasury                                   tr
    district of columbia government                          cz
    export-import bank of washington                         ei
    farm credit administration                               fc
    federal aviation agency                                  fa
    federal coal mine safety board                           fg
    federal communications commission                        fe
    federal deposit insurance corporation                    fk
    federal home loan bank board                             fm
    federal maritime commission                              fo
    federal mediation and conciliation service               fq
    federal power commission                                 fs
    federal reserve system                                   fu
    federal trade commission                                 fw
    foreign claims settlement commission                     fi
    general services administration                          gs
    indian claims commission                                 ik
    interstate commerce commission                           ic
    national aeronautics and space administration            nc
    national aeronautics and space council                   nf
    national capital housing authority                       nh
    national foundation on arts and humanities               au
    national labor relations board                           ni
    national mediation board                                 nm
    national security council                                no
    national science foundation                              ns
    office of economic opportunity                           oe
    office of emergency planning                             oh
    office of science and technology                         os
    office of special representative for trade negotiations  tu
    panama canal company                                     pc
    post office department                                   po
    railroad retirement board                                rr
    renegotiation board                                      re
    saint lawrence seaway development corporation            sx
    securities and exchange commission                       sl
    selective service system                                 sr
    small business administration                            sf
    smithsonian institution                                  so
    subversive activities control board                      sc
    tax court of the united states                           tc
    tennessee valley authority                               tx
    u.s. arms control and disarmament agency                 af
    u.s. civil service commission                            cr
    u.s. information agency                                  us
    u.s. tariff commission                                   uw
    veterans administration                                  va
    virgin island corporation                                vi

4. leader
the following table indicates the data elements in the leader and their permissible values and formats.

record length            decimal digits, right justified, with leading zeros
status                   as defined in paragraph a.5 of this appendix
type-of-record           "a"
bibliographic level      as defined in paragraph a.4 of this appendix
indicator count          "2"
delimiter count          "1"
base address of data     decimal digits, right justified, with leading zeros
entry map                "4500"

5. directory
each directory entry consists of the following data elements:

tag                           3 decimal digits
length                        4 decimal digits, right justified, leading zeros
starting character position   5 decimal digits, right justified, leading zeros

the directory ends with a field terminator. the entries in the directory shall be sequenced in ascending numeric order by the tag.

6. control field

tag: 001
designation: record identification number
content: an identification number is assigned for purposes of file control by the specific organization which is distributing the tape. the number may be newly assigned, may be the accession number assigned by a documentation center, or may be the report number assigned either by the originating organization or the monitoring organization, depending on the practice of the organization which writes the tape. examples might be: ad-635 050, pb-170 275, ucrl-1376.

7.
variable field data elements

tag: 100
designation: type of item
content: this defines whether the item is cataloged as a monograph, serial title, journal article, patent, technical report, audio-visual material, etc.

tag: 110
designation: security classification of item
content: this is the alphabetic code which properly specifies the security classification of the item. the codes available include:
u  unclassified
c  confidential
s  secret

tag: 120
designation: downgrading authority code
content: codes should be taken from dod industrial security manual, app. 2, automatic downgrading and declassification system.

tag: 130
designation: distribution limitation statements
content: these are limitations (other than security classification) on the availability of the item to the public. these limitations might include: not reproducible; only available on loan; only available to certain recipients.

tag: 140
designation: distribution limitation codes
content: codes corresponding to information in field 130.

tag: 150
designation: cataloging organization
content: acronym or name of cataloging organization.

tag: 160
designation: announcement journal reference
content: this designates the specific issue of the announcement journal in which this record is published.

tag: 170
designation: source report numbers
content: these report numbers are the numbers assigned to the report by the originating organization. examples might be:
ucrl-1035
rm-4244-pr
tm/adc/820/03

tag: 180
designation: monitoring organization report numbers
content: these are the report numbers assigned by the monitoring or sponsoring organizations. examples might be:
nasa-cr-263
asd-tdr-63-24

tag: 190
designation: other report identification numbers
content: these are other identification numbers such as other organization identification numbers which do not fall into the other categories.

tag: 200
designation: project numbers
content: included here are the project numbers under which the work was performed. a project is a grouping of tasks or efforts directed toward a single end result. the project is the basic building-block used in planning, reviewing, and reporting of performance of research and developing programs.

tag: 230
designation: security classification of title
content: this is the classification of the content of data element 240. use codes shown in data element 110.

tag: 240
designation: classified title
content: this is the classified title in the vernacular, transliterated if necessary. this title is integral to the work.

tag: 250
designation: unclassified translated title
content: this title is supplied by the cataloger if not on the document.

tag: 260
designation: alternate title entry
content: this is an alternate title derived from the secondary part of the title as given on the title pages or a catchword title or subtitle.

tag: 270
designation: index annotation
content: this is an edited or supplied version of the title that more accurately reflects subject content of the work than the original title.

tag: 280
designation: personal names
content: these are the names of people associated with the responsibility for the intellectual content of the item. this might include authors, compilers, illustrators, translators, etc., but it excludes personal names used as subjects. data will be entered in the form last name, initial. initial. examples might be:
smith, j. r.
roberts, a. b.

tag: 290
designation: personal name affiliation
content: this is the affiliation of each name in data element 280 if that affiliation is different from the corporate author.

tag: 300
designation: corporate authors
content: these are the names of the organizations associated with the intellectual content of the work.
they do not include the publisher or personal affiliation or sponsor except where they are also the corporate author.

tag: 310
designation: corporate author codes
content: these are the codes which correspond to the content of data element 300 (if present).

tag: 320
designation: contract numbers
content: the contract number is an alphanumeric identifier of the contract cited in the report which designates the financial support of the report. some examples might be:
af 33(657)-8146
da036-039-sc-8727 4

tag: 330
designation: grant numbers
content: the grant number is an alphanumeric identifier of the grant cited in the report which designates the financial support of the report. some examples might be:
nih-5r01-ca-03157-02
nsf-gp-2528

tag: 340
designation: sponsoring organizations
content: sponsors include any and all of the following: true sponsors, who furnished financial support and issued the contracts; monitors, who supervised compliance with the contract; and beneficiaries, for whose benefit the work was done and the report written.

tag: 350
designation: cosati subject category codes
content: these are alphanumeric codes used to group subject terms according to broad subject areas established by cosati.

tag: 360
designation: other subject heading codes
content: these are alphanumeric codes used to group subject terms according to broad subject areas which have been established by organizations other than cosati.

tag: 370
designation: primary subject term security classification
content: this is the classification code of the subject term in data element 380 which has the highest classification. use codes shown in data element 110.

tag: 380
designation: controlled primary subject terms
content: this consists of vocabulary taken from the controlled list of subject terms which describes the prime subject content and appears as a heading in the bibliography.

tag: 390
designation: secondary subject term security classification
content: this is the classification code of the subject term in data element 400 which has the highest classification.
use codes shown in data element 110.

tag: 400
designation: controlled secondary subject terms
content: this consists of vocabulary taken from the controlled list of subject terms which describes the subject content and is available in the system of the organization but does not appear in the bibliography.

tag: 410
designation: security classification of provisional subject terms
content: this is the classification code of the provisional subject term in data element 420 which has the highest classification. use codes shown in data element 110.

tag: 420
designation: provisional subject terms
content: these are terms which may be applied to subject-classify the content of the work but are usually taken from the work itself and not from any authorized listing or thesaurus. they may be terms which will eventually be classified as controlled vocabulary depending on their frequency and consistency of use or importance as a retrieval tag, or they may stay in the uncontrolled vocabulary group indefinitely.

tag: 430
designation: security classification of special retrieval terms
content: this is the classification code of the special retrieval term in data element 440 which has the highest classification. use codes shown in data element 110.

tag: 440
designation: special retrieval terms
content: these are terms which designate project names, equipment nomenclature, trade names, catch words.

tag: 450
designation: source journal citation
content: this contains the source journal title, the volume and issue number, the pages on which the article appears, and the date of the journal issue. definition is provided as to whether the item is a reprint or if the source journal is in another language.

tag: 460
designation: original language of item
content: this is the language (or languages) in which the item originally appeared if different from data element number 470.
tag: 470
designation: present language of item
content: this is the language (or languages) in which the item appears at present. it may be the original language, or it may be the result of translation.

tag: 480
designation: imprint date of item
content: this is the date of current publication of the item. this would include new imprints, translation dates, date of revision, etc.

tag: 490
designation: original date of item
content: this contains the date of the original completion or publication of the item only when different from the date of imprint.

tag: 500
designation: place of publication
content: this is the city of publication and includes state or country when necessary for identification.

tag: 510
designation: country of origin of the intellectual effort
content: this is the country where the original work was done and is not necessarily the country where the publication or translation occurred.

tag: 520
designation: number of pages
content: this is the number of pages and/or volumes as determined by current cataloging procedures.

tag: 530
designation: availability and price
content: this is the acronym or name of the specific organization, if any, from which the document is available, the hardcopy price, and the microform price.

tag: 540
designation: descriptive note
content: this is a title without subject content which describes the type of item, such as final report, progress report for the period . . . , quarterly technical status report, etc.

tag: 550
designation: bibliography note
content: this is a note which indicates the presence of bibliographic information as part of the contents of the work.

tag: 560
designation: dissertation note
content: this is a note which identifies the work as an academic dissertation presented in partial fulfillment of requirements for a degree.
it usually names the institution or faculty to which the dissertation was presented and the degree for which the author was a candidate.

tag: 570
designation: contents note
content: this is a note which lists either all or part of the contents of a work, such as authors and titles, in order to bring out important parts of the work not mentioned in the main title.

tag: 580
designation: notes, general
content: these are notes not covered elsewhere.

tag: 590
designation: owner or assignee
content: these are the names of owners or assignees of the patent.

tag: 600
designation: security classification of abstract
content: use the codes shown in data element 110.

tag: 610
designation: languages of abstracts or summaries, if different from data element 470.

tag: 620
designation: abstract
content: this is free form as it occurs in the file of the organization writing the tape, in which the source of the abstract is identified as author of the item or the organization generating the abstract. for contents notes see data element 390.

automated fake news detection in the age of digital libraries
article
uğur mertoğlu and burkay genç
information technology and libraries | december 2020
https://doi.org/10.6017/ital.v39i4.12483
uğur mertoğlu (umertoglu@hacettepe.edu.tr) is a phd candidate, hacettepe university. burkay genç (bgenc@cs.hacettepe.edu.tr) is assistant professor, hacettepe university. © 2020.

abstract
the transformation of printed media into the digital environment and the extensive use of social media have changed the concept of media literacy and people’s habits of news consumption. while online news is faster, easier, comparatively cheaper, and offers convenience in terms of people's access to information, it speeds up the dissemination of fake news.
due to the free production and consumption of large amounts of data, fact-checking systems powered by human efforts are not enough to question the credibility of the information provided, or to prevent its rapid dissemination like a virus. libraries, long known as sources of trusted information, are facing challenges caused by misinformation as mentioned in studies about fake news and libraries.1 considering that libraries are undergoing digitization processes all over the world and are providing digital media to their users, it is very likely that unverified digital content will be served by world’s libraries. the solution is to develop automated mechanisms that can check the credibility of digital content served in libraries without manual validation. for this purpose, we developed an automated fake news detection system based on turkish digital news content. our approach can be modified for any other language if there is labelled training material. this model can be integrated into libraries’ digital systems to label served news content as potentially fake whenever necessary, preventing uncontrolled falsehood dissemination via libraries. introduction collins dictionary which chose the term “fake news” as the “word of the year 2017,” describes news as the actual and objective presentation of a current event, information, or situation that is published in newspapers and broadcast on radio, television, or online.2 we are in an era where everything goes online, and news is not an exception. many people today prefer to read their daily news online, because it is a cost-effective and convenient way to remain up to date. although this convenience has lucrative benefits for society, it can also have harmful side effects. having access to news from multiple sources, anytime, anywhere has become an irresistible part of our daily routines. however, some of these sources may provide unverified content which can easily be delivered right to your mobile device. 
most importantly, potential fake news content delivered by these sources may mislead society and cause social disturbances, such as triggering violence against ethnic minorities and refugees or causing unnecessary fear related to health issues, and may sometimes even result in crises, devastating riots, and strikes. unlike news, fake news has no settled definition; in the literature it is often defined according to the data used or the limited perspective of the study. for example, difranzo and gloria-garcia defined fake news as “false news stories that are packaged and published as if they were genuine.”3 on the other hand, guess et al. see the term as “a new form of political misinformation” within the domain of politics, whereas mustafaraj is more direct and defines it as “lies presented as news.”4 a comprehensive list of 12 definitions can be found in egelhofer and lecheler.5 in simplified terms, news which is created to deceive or mislead readers can be called fake news. however, the concept of fake news is quite broad and needs to be specified meticulously. fake news is created for many purposes and emerges in many different types. most of these types, which are interwoven, are shown in figure 1. although it is not easy to cluster these types into separate groups, they can be categorized according to information quality or according to intention, that is, whether or not they were created to deceive deliberately, as rashkin et al. did.6 we propose the following classification, where the two dimensions represent the potential impact and the speed of propagation.

figure 1. the volatile distribution of the fake news types (clustered in four regions: sr, sr, sr, sr) with respect to two dimensions: speed of propagation and potential impact.
the four regions visualized are clustered according to their dangerousness. first of all, it should be noted that ordering the types of fake news with stable precision is quite a challenging task. the variations within the field depend highly on dynamic factors such as timespan, actors, and the echo-chamber effect. hence, this figure should be considered a clustering effort. the types may intersect within the regions. we will now give examples for two regions, “sr” and “sr.” for example, the sr grouping shows characteristics of high-risk levels and fast dissemination. this includes varieties of fake news such as propaganda, manipulation, misinformation, hate news, provocative news, etc. we usually encounter this in the domain of politics. this kind of news may cause critical and nonrecoverable results in politics, the economy, etc., in a short period of time. the rise of the term fake news itself can also be attributed to this kind of news. on the other hand, the relatively less severe group (sr) of fake news, comprising satire, hoax, click-bait, etc., has low-risk levels and a slow speed of dissemination. a frequently used type in this group, click-bait, is a sensational headline or link that urges the reader to click on a post, link, article, image, or video. these kinds of news have a repetitive style. it can be said that readers become aware of the falsehood after encountering it a few times. so, the risk level is lower, and dissemination is slower. vosoughi et al. stated that “falsehood diffuses significantly farther, faster, deeper, and more broadly than the truth.”7 so indeed, just one piece of fake news may affect many more people than thousands of true news items do because of the dramatic circulation of fake news.
in their recent survey about fake news, zhou and zafarani highlighted that fake news is a major concern for many different research disciplines, especially information technologies.8 having long been a trusted source of information, libraries will play an important role in the fight against fake news. kattimani et al. claim that the modern librarian must be equipped with the necessary digital skills and tools to handle both printed collections and newly emerging digital resources.9 similarly, we foresee that digital libraries, which can be defined as collections of digital content licensed and maintained by libraries, can be a part of the solution as an authority service with a collective effort. connaway et al. point to the key role of information professionals such as librarians, archivists, journalists, and information architects in helping society use the products and services related to news in a convenient way.10 as libraries all over the world are transitioning into digital content delivery services, they should, under the guidance of information professionals, implement mechanisms to avoid fake and misleading content being disseminated through them. to lay out proper future directions for the solution strategy, the interaction between the library and information science (lis) community and fake news must be clearly understood. sullivan states that the lis community has been affected deeply in the aftermath of the 2016 us presidential elections.11 moreover, he quotes many other scientists, emphasizing libraries’ and librarians’ role in the fight against fake news. for example, finley et al.
say that libraries are the direct antithesis of fake news; the american library association (ala) called fake news an anathema to the ethics of librarianship in 2017; rochlin emphasizes the role of librarians in this fight and talks about the need to adopt fake news as a central concern in librarianship; and many other researchers place librarians on the front lines of the fight against fake news.12 today, the struggle to detect fake news and prevent its spread is so popular that competitions are being organized (e.g., http://www.fakenewschallenge.org/) and conferences are being held (e.g., bobcatsss 2020). the struggle against fake news can be classified under three main strategies:
• reader awareness
• fact-checking organizations and websites
• automated detection systems
the first item requires awareness of individuals against fake news and a collective conscience within the society against spreading fake news. to this end, visual and textual checklists, frameworks, and guidance lists are being published by official organizations, such as the ifla (international federation of library associations) infographic,13 which contains eight steps to spot fake news. the radar framework and the currency, relevance, authority, accuracy, and purpose (craap) test are some of the efforts trying to increase reader awareness of fake news.14 unfortunately, because fake news is cleverly crafted to trigger people’s hunger to spread sensational information, it is very difficult to achieve full control via this strategy.
some studies explicitly showed that humans are prone to get confused when it comes to spotting lies or deciding whether a news item is fake or not.15 furthermore, people often overlook facts that conflict with their current belief, especially in politics and controversial social issues.16 the second strategy focuses on third-party manually driven systems for checking and labelling content as fake or valid. recently, we have seen many examples of offline and online organizations trying to work according to this strategy, such as a growing body of fact-checking organizations, start-ups (storyzy, factmata, etc.), and other projects with similar purposes.17 unfortunately, these manually powered systems cannot cope with the huge amounts of digital content being steadily produced. therefore, they focus only on a subset of digital content that they classify as having higher priority. even for this subset of content, their reaction speed is much slower than the fake information’s spread speed. therefore, automated and verified systems emerge as an inevitable last option. the third strategy offers automated fact-checking systems, which once trained, can deliver content labelling at unprecedented speeds. today, many researchers are researching automated solutions and building models with different methodologies.18 notwithstanding the latest studies, there is still a lot to do in the realm of automated fake news detection. automated fact-checking systems will be detailed in the rest of the paper. thanks to the internet, the collections of digital content served by digital libraries can be accessed by a great number of users without distance and time limits. therefore, we propose a solution to the problem by positioning digital libraries as automated fact-checking services, which label digital news content as fake or valid as soon as or before it is served through library systems. 
the main reason we associate this approach with digital libraries is their access to a wide variety of digital content which can be used to train the proposed mathematical models, as well as their role in society as publishers of trusted information. to this end, we develop a mathematical model that is trained using existing news content served by digital libraries and is capable of labelling news content as fake or valid with unprecedented accuracy. the proposed solution uses machine learning techniques with an optimized set of extracted features and annotated labels of existing digital news content. our study mainly contributes (a) a new set of features highly applicable for agglutinative languages, (b) the first hybrid model combining a lexicon/dictionary-based approach with machine learning methods to detect fake news, and (c) a benchmark dataset prepared in turkish for fake news detection.

literature review
contemporary studies have indicated that social, economic, and political events in recent years, especially after the 2016 us presidential elections, are increasingly associated with the concept of fake news.19 since then, fake news has begun to be used as a tool in many domains. on the other hand, researchers motivated by finding automated solutions started to make use of machine learning, deep learning, hybrid models, and other methodologies for their solutions. although computational deception detection studies applying nlp (natural language processing) operations are not new, textual deception in the context of text-based news is a new topic for the field of journalism.20 accordingly, we believe that there is a hidden body language of news text, which has linguistic clues indicating whether the news is fake or not.
thus, lexical, syntactic, semantic, and rhetorical analysis, when used with machine learning and deep learning techniques, offers encouraging directions. textual deception spans a wide spectrum, and studies have utilized many different techniques. there are some prominent studies which treated the problem as a binary classification problem utilizing linguistic clues.21 although it is still early to say the linguistic characteristics of fake news are fully understood, research into fake-news detection in english-language texts is relatively advanced compared to that in other languages. in contrast, agglutinative languages such as turkish have been little researched when it comes to fake news detection. agglutinative languages enable the construction of words by adding various morphemes, which means that words that are not practically in use may exist theoretically. for example, “gerek-siz-leş-tir-ebilecek-leri-miz-den-dir,” is a theoretically possible word that means “it is one of the things that we will be able to make redundant,” but it is not a practical one. shu et al. classified the models for the detection of fake news in their study.22 according to this study, the automated approaches can focus on four types of attributes to detect fake news: knowledge based, style based, stance based, or propagation based. among these, it can be said that the most useful approaches are the ones which focus on the textual news content. the textual content can be studied by an automated process to extract features that can be very helpful in classifying content as fake or valid. many scholars have tried to build models for automatic detection and prediction of fake news using machine learning algorithms, deep learning algorithms, and other techniques. these scholars approach the detection of fake news from many different perspectives and domains.
for example, in one of the studies, scientific news and conspiracy news were used.23 in shu et al.’s study based on credibility of news, the headlines were used to determine whether the article was clickbait or not. in another study, reis et al. worked on buzzfeed articles linked to the 2016 us election using machine learning techniques with a supervised learning approach.24 studies which try to detect satire and sarcasm can be regarded as subcategories of fake news detection.25 our observation, in line with the general view, is that satire is not always recognizable and can be mistaken for real news.26 for this reason, we included satirical news in our dataset. it should be noted that although satire or sarcasm can be classified by automated detection systems, experts should still evaluate the results of the classification. while some scholars used specific models focusing on unique characteristics, others such as ruchansky et al. proposed hybrid deep models for fake news detection, making use of multiple kinds of features such as temporal engagement between users and news articles over time, and generated a labelling methodology based on those features.27 in related studies, researchers apply many kinds of features, such as automatically extracted features, hand-crafted features, social features, network information, visual features, and others such as psycholinguistic features.28 in this work, we focused on news content features; however, social context features can also be extracted at different tiers, such as user activity patterns, analysis of user interaction, profile metadata, social network/graph analysis, etc. we also have some of these features in our data but, lacking quantitative ground truth for them, we avoided using them.
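as a rough illustration of what hand-crafted news content features can look like, the following python sketch computes a few surface statistics from a text. it is only a sketch under stated assumptions: the function name and the chosen features are hypothetical and are not the feature set used in this study.

```python
import re
from collections import Counter


def content_features(text: str) -> dict:
    """Compute a few illustrative surface features of a news text.

    The feature names are hypothetical examples, not the study's features.
    """
    # Lowercased word tokens; \w is Unicode-aware for str in Python 3,
    # so Turkish letters such as ş, ğ and ı are kept inside tokens.
    tokens = re.findall(r"\w+", text.lower())
    # Naive sentence split on runs of terminal punctuation.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_tokens = len(tokens)
    vocabulary = Counter(tokens)
    return {
        "n_tokens": n_tokens,
        "n_sentences": len(sentences),
        "avg_token_length": sum(map(len, tokens)) / n_tokens if n_tokens else 0.0,
        "type_token_ratio": len(vocabulary) / n_tokens if n_tokens else 0.0,
        "exclamation_count": text.count("!"),
        "question_count": text.count("?"),
    }


features = content_features("Şok! Bu haber doğru mu? Evet.")
```

a dictionary like this, computed per news item, is the kind of input a supervised classifier could consume; for an agglutinative language, morpheme-level features would presumably need a morphological analyzer, which this sketch does not attempt.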
methodology
in this section, we present our motivation for this work, which we visualized in a framework and named global library and information science (glis_1.0). subsequently, we discuss the construction of the automated detection system as the key element of the glis_1.0 framework. we explain the framework, model, dataset, features, and the techniques used in this section.

framework
the main structure of the proposed framework is shown in figure 2. this framework consists of highly cohesive but flexible layers.

figure 2. the glis_1.0 framework main structure.

in the presentation layer one can find the different sources of news that are publicly available. these sources can be accessed directly using their websites or can be searched for via search engines. the news is received by fact-checking organizations, which classify it manually; digital libraries, which archive and serve it; and automated detection systems (ads), which classify it automatically. digital libraries work together with fact-checking organizations and adss to present clean and valid news to the public. moreover, search engines use digital library systems to label their results as fake or valid. fact-checking organizations should also benefit from the output of adss, as instead of manually checking heaps of news content, they could now focus on news labeled as potentially fake by an ads. through glis, adss make the life of fact-checking organizations and digital libraries much easier, all the while increasing the quality of news served to the public. since figure 2 gives only a high-level overview of the structure, there may be many other components, mechanisms, or layers, but the key elements of this structure are automated detection systems and the digital libraries.
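the division of labor described above, in which an ads screens everything and fact-checkers look only at flagged items, can be sketched as a small triage routine. this is a minimal sketch under stated assumptions: `NewsItem`, `triage`, the threshold, and the `toy_score` placeholder are all hypothetical names introduced for illustration and are not part of the glis_1.0 framework.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class NewsItem:
    url: str
    text: str


def triage(
    items: Iterable[NewsItem],
    score_fn: Callable[[str], float],
    threshold: float = 0.5,
) -> Tuple[List[NewsItem], List[NewsItem]]:
    """Split items into (served, review_queue).

    score_fn stands in for an automated detection system and returns a
    "potentially fake" score in [0, 1]. Items at or above the threshold
    are queued for human fact-checkers instead of being served.
    """
    served: List[NewsItem] = []
    review_queue: List[NewsItem] = []
    for item in items:
        if score_fn(item.text) >= threshold:
            review_queue.append(item)
        else:
            served.append(item)
    return served, review_queue


def toy_score(text: str) -> float:
    """Placeholder scorer, NOT a real detector: flags texts containing '!!!'."""
    return 1.0 if "!!!" in text else 0.0


served, review_queue = triage(
    [NewsItem("u1", "calm report"), NewsItem("u2", "shocking!!!")],
    toy_score,
)
```

passing the scorer in as a function keeps the sketch independent of any particular model, so a trained classifier could replace the placeholder without changing the triage logic.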
a critical question about this framework is why we need such an authority mechanism. the answer is quite simple: technological progress is not the only solution. on the contrary, tech giants have already been subject to regulatory scrutiny for how they handle personal information.29 their policies on political ads have also been questioned. furthermore, they are often blamed for failing to fight fake news. indeed, there is a more urgent need than ever for global action. digital libraries are much more than a technological advancement. hence, they should be considered institutions or services that can act as an authority for providing news to society, especially as printed media disappears day by day. the threats caused by fake news are real and dangerous, but only recently have researchers from different disciplines begun seeking educational, technological, regulatory, or political solutions. digital librarianship can be the intersection of all these solutions for promoting information/media literacy. hence, digital librarianship will make use of many automated detection systems (adss) to serve qualified news. in the following section, we discuss ads in detail.

model

an overview of our automated detection system model, which is critical to the framework, is shown in figure 3. our fake news detection model consists of two phases. the first is language model/lexicon generation and the second is machine learning integration. in this work, we used machine learning algorithms with supervised learning techniques, which learn from labeled news data (training) and help us predict outcomes for unseen news data (test).

dataset

we collected our data from three sources:

• the primary source is the gdelt (global database of events, language and tone) project (https://www.gdeltproject.org/), a massive global news media archive offering free access to news text metadata for researchers worldwide.
it can almost be considered a digital library of news in its own right. however, gdelt does not provide the actual news text; it serves only processed metadata along with the url of the news item. gdelt normally does not check the validity of news items. we therefore used only news from approved news agencies and completely ignored news from local and lesser-known sources to maximize the validity of the news obtained automatically through gdelt. moreover, we post-processed the obtained texts by cross-validating them against teyit.org data to remove any potential fake news obtained through gdelt links.

• the second source is teyit.org, a fact-checking organization based in turkey that complies with the principles of the ifcn (international fact-checking network) and aims to prevent the spread of false information through online channels. manually analyzing each news item, they tag it as fake, true, or uncertain. we used their results to automatically download and label each news text.

• lastly, our team manually curated and verified fake and valid news obtained from various online sources and named this set mvn (manually verified news). it includes fake and valid news that we accumulated manually over time during our studies and that does not overlap with the news obtained from gdelt and teyit.org.

figure 3. integrated fake news detection model with main phases, combining a language-model-based approach with a machine learning approach.

we named our dataset trfn. in phase 2, the data is very similar to the data we used in phase 1.
however, to assess the effectiveness of the model, we excluded news from before 2017 and added new items from 2019. the news in our dataset spans 2017–2019 and is uniformly distributed over that period. table 1 outlines the dataset statistics: where the news text comes from, its class (fake or valid), the number of distinct texts, and the corresponding data collection method. the table shows that most of our valid news comes from gdelt, whereas teyit.org, a fact-checking organization, contributes only fake news.

table 1. trfn dataset summary after cleaning and duplicate removal.

dataset      class      size of processed data   collection method
gdelt        non-fake   82708                    automated
teyit.org    fake       1026                     automated
mvn          non-fake   1049                     manual
mvn          fake       400                      manual

all news items were processed with zemberek (http://code.google.com/p/zemberek), the turkish nlp engine, to extract different morphological properties of the words in the texts. after this processing phase, all obtained features were converted into tabular format and made available for future studies; the dataset is available for scholarly use upon request.

in a study of this nature, the verifiability of the data is important. as already mentioned, most of our data comes from verified sources: mainstream news agencies accessed through gdelt, and teyit.org archives, which are verified by teyit.org staff. all data used to train the models explained in the rest of the paper are verified either directly or indirectly.

another important issue is the generalizability of the dataset, which determines whether the results of the study apply only to specific domains or to all domains. although focusing on a specific news domain would clearly improve our accuracy, we preferred to work in the general domain and included news from all domains. the distribution of domains in our dataset is visualized in figure 4.
this distribution closely matches the distribution one would experience reading daily news in turkey. hence, we have no domain-specific bias in our training dataset.

figure 4. the distribution of domains in the dataset. (scitechenvwetnatlife = science, technology, environment, weather, nature, life. educulturearttourism = education, culture, art, tourism.)

moreover, during the exploratory data analysis we found strong evidence of syntactic similarity with other nlp studies in turkish. for example, the most common turkish words found in a study by the zemberek developers (http://zembereknlp.blogspot.com/2006/11/kelime-istatistikleri.html), conducted on over five million words, are consistent with the most common words in our corpus. this evidence supports the representativeness of our dataset.

the last issue worth discussing is the imbalanced nature of the dataset. a dataset is imbalanced in a binary classification study when one class occurs far more frequently than the other. in our dataset, valid news far outnumbers fake news. this generally makes it difficult to apply conventional machine learning methods to the dataset. however, it is a frequently observed phenomenon in real-world problems of this kind because of the disparity between the classes.
to avoid potential problems due to the imbalanced nature of the dataset, we used smote (synthetic minority over-sampling technique), which is an over-sampling method.30 it creates synthetic samples of the minority class that are relatively close in the feature space to the existing observations of the minority class.

features

in this study, we discarded some features because of their relatively low impact on overall performance during the exploratory data analysis and subsequently in the training phase. the most effective features we decided on are shown in table 2.

table 2. main features.

feature           group                     definition
nrootscore        language model features   the news score calculated according to the root model
nrawscore         language model features   the news score calculated according to the raw model
spellerrorscore   extracted features        spell errors per sentence
complexityscore   extracted features        the score of the complexity/readability of the news
source            labels                    the url or identifier of the news
maincategory      labels                    the category of the news
newssite          labels                    the unique address of the news

the language model features nrootscore and nrawscore are borrowed from our earlier study on fake news detection.31 in that study, we focused on constructing a fake news dictionary/lexicon based on different morphological segments of the words used in news texts. these two scores were found to be the most successful in determining the fakeness/validity of a news text, one considering the raw form of the words and the other the root form.

the extracted features are complexityscore and spellerrorscore. complexityscore basically represents the readability of the text.
studies on determining a good readability metric exist for the turkish language.32 we used a modified version of the gunning fog metric, which is based on word length and sentence length.33 since turkish is an agglutinative language, we used word length instead of syllable count. we also made some modifications to normalize the scores. the average number of syllables per word in turkish is 2.6, so we defined a word as long if it has more than 9 letters.34 for a given news text t, the complexity score (cs) is computed by equation 1.

(1) $$T_{CS} = \frac{\dfrac{\mathit{WordCount}}{\mathit{SentencesCount}} + \dfrac{\mathit{LongWordCount} \times 100}{\mathit{WordCount}}}{10}$$

the second extracted feature is spellerrorscore. we expect that fake news may contain more spelling errors than valid news. we calculated the spell error counts using the turkish spellchecker class of zemberek. because the text length of news varies, we calculate the ratio per sentence. for a given news text t, the spell error score (se) is calculated as shown in equation 2.

(2) $$T_{SE} = \frac{\mathit{SpellErrorCount}}{\mathit{SentencesCount}}$$

finally, we included the metadata categories source, maincategory, and newssite as additional identifiers for the learning process. then, we combined the features extracted with text representation techniques with the features shown in table 2 and trained the model with different classifiers. for text representation, we followed two directions in the experiments. first, we converted text into structured features with the bag of words (bow) approach, in which text is represented as the multiset of its words. second, we experimented with n-grams, which represent sequences of n consecutive words; in other words, the text is split into chunks of n words.
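the two extracted features in equations 1 and 2 can be illustrated with a minimal python sketch. it assumes that the whole sum in equation 1 is divided by 10, that sentences end at ".", "!", or "?", and that a plain set of known words stands in for the zemberek spellchecker used in the study. the function names, the tokenizer, and the sample text are hypothetical and chosen only for illustration.

```python
import re

# "more than 9 letters" in the text means 10 letters or more.
LONG_WORD_MIN_LETTERS = 10


def split_sentences(text):
    """Naive sentence splitter on '.', '!' and '?' (an assumption, not zemberek)."""
    return [s for s in re.split(r"[.!?]+", text) if s.strip()]


def split_words(text):
    """Naive tokenizer: runs of unicode letters, so turkish characters are kept."""
    return re.findall(r"[^\W\d_]+", text)


def complexity_score(text):
    """Equation 1: (words per sentence + percentage of long words) / 10."""
    words = split_words(text)
    sentences = split_sentences(text)
    if not words or not sentences:
        return 0.0
    long_words = [w for w in words if len(w) >= LONG_WORD_MIN_LETTERS]
    words_per_sentence = len(words) / len(sentences)
    long_word_percent = len(long_words) * 100 / len(words)
    return (words_per_sentence + long_word_percent) / 10


def spell_error_score(text, known_words):
    """Equation 2: spelling errors per sentence.

    known_words is a hypothetical stand-in for a real spellchecker:
    any word not in the set counts as an error.
    """
    sentences = split_sentences(text)
    if not sentences:
        return 0.0
    errors = sum(1 for w in split_words(text) if w.lower() not in known_words)
    return errors / len(sentences)


# Hypothetical sample: two sentences, seven words, one long word, one typo ("habr").
sample_text = "bu kısa bir habr. cumhurbaşkanlığı açıklama yaptı."
sample_dictionary = {"bu", "kısa", "bir", "haber", "cumhurbaşkanlığı", "açıklama", "yaptı"}

cs = complexity_score(sample_text)
se = spell_error_score(sample_text, sample_dictionary)
print(f"complexity score: {cs:.3f}, spell error score: {se:.2f}")
```

dividing by the sentence count in both scores keeps them comparable across news items of different lengths, which is the stated reason for using a ratio.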
in the bow model, documents in trfn are represented as collections of words, ignoring grammar and even word order but preserving multiplicity. in a classic bow approach, each document is represented as a fixed-length vector whose length equals the vocabulary size, so each dimension of the vector corresponds to the occurrence count of a word in a news item. we customized the generic approach by reducing variable-length documents to fixed-length vectors so that documents of varying lengths can be used with many machine learning models.

figure 5. an overview of the bow (bag of words) approach.

because we ignore word order, we reduce each document to a fixed-length histogram of word counts, as seen in figure 5. assuming n is the number of news documents and w is the number of possible words in the corpus, the n×w count matrix is generally large but sparse: we have many news documents, but most words do not occur in any given document. this rarity of terms is a drawback of the approach. therefore, we modified the model to compensate for the rarity problem by weighting the terms with the tf-idf measure, which evaluates how important a word is to a document in a collection.

the other technique we used is the n-gram model. n-gram is the generic term in computational linguistics for a string of consecutive words, and it is extensively used in text mining and nlp tasks. the prefix that replaces the n indicates the number of consecutive words in the string: a unigram is one word, a bigram is two words, and an n-gram is n words.

experimental results and discussion

in this section, the experimental process and the results are presented. all experiments were performed using the scikit-learn library.
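the bag-of-words, n-gram, and tf-idf representations described above can be sketched in plain python. this is an illustration under stated assumptions rather than the pipeline used in the study: it splits on whitespace, uses the textbook weighting tf × log(n / df), and all function names and sample documents are hypothetical. scikit-learn's own vectorizers apply a smoothed and normalized variant by default, so their numbers would differ.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all runs of n consecutive tokens, joined by spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def bag_of_terms(text, n_values=(1,)):
    """Count the terms of one document; n_values=(1, 2) gives unigram+bigram."""
    tokens = text.lower().split()
    counts = Counter()
    for n in n_values:
        counts.update(ngrams(tokens, n))
    return counts


def tfidf_matrix(documents, n_values=(1,)):
    """Build one fixed-length tf-idf vector per document over a shared vocabulary."""
    bags = [bag_of_terms(doc, n_values) for doc in documents]
    vocabulary = sorted(set().union(*bags))
    n_docs = len(documents)
    # Document frequency: in how many documents each term occurs.
    doc_freq = {t: sum(1 for bag in bags if t in bag) for t in vocabulary}
    idf = {t: math.log(n_docs / doc_freq[t]) for t in vocabulary}
    rows = []
    for bag in bags:
        total = sum(bag.values()) or 1  # guard against empty documents
        rows.append([bag[t] / total * idf[t] for t in vocabulary])
    return vocabulary, rows


# Hypothetical two-document corpus.
docs = ["sahte haber yayıldı", "haber doğrulandı"]
vocabulary, rows = tfidf_matrix(docs)
for term, weight in zip(vocabulary, rows[0]):
    print(f"{term}: {weight:.3f}")
```

in this toy corpus the word "haber" occurs in every document, so its idf is log(1) = 0 and it contributes nothing, while words unique to one document receive positive weight. passing n_values=(1, 2) yields the combined unigram+bigram representation.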
to evaluate the performance of the model and the proposed features, we employed the precision, recall, f1 score (the harmonic mean of precision and recall), and accuracy metrics. we ran many experiments using different combinations of features.

several classification models were trained: k-nearest neighbor, decision trees, gaussian naive bayes, random forest, support vector machine, extratrees classifier, and logistic regression. to be effective, a classifier should be able to correctly classify previously unseen data. to this end, we tuned the parameter values of all the classification models used. then, the models were trained and evaluated on the trfn dataset using 10-fold cross-validation. in table 3, we present the best scores of the proposed model. the results are highly encouraging and illustrate how useful automated detection systems can be as a key component of the integrated solution framework in figure 2.

we compared the algorithms on three final feature sets, chosen because they gave relatively consistent results compared with the other feature set combinations. set1 stands for bigram+fopt (optimized features), set2 stands for bowmodified+fopt, and set3 stands for unigram+bigram+fopt. the results show relative consistency in performance across the models. in almost all models, the combination of unigram+bigram and the optimized feature set (fopt) gives better results than the other combinations.

the extratrees classifier is chosen as the best model because of its higher performance. this model, also known as the extremely randomized trees classifier, is an ensemble learning technique that aggregates the results of multiple decision trees collected in a “forest” to output its classification result.
it is very similar to the random forest classifier and differs only in how the decision trees are constructed. accordingly, the results of these two classifiers are close.

table 3. evaluation results of all combinations of features and classification models.

model                    feature set   precision % (0)   precision % (1)   recall % (0)   recall % (1)   accuracy   f1 score
gaussian naive bayes     set1          93.32             93.96             93.92          93.36          93.64      93.62
gaussian naive bayes     set2          93.37             94.02             93.98          93.42          93.70      93.68
gaussian naive bayes     set3          93.95             94.21             94.19          93.97          94.08      94.07
k-nearest neighbour      set1          93.70             93.50             93.52          93.69          93.60      93.61
k-nearest neighbour      set2          93.66             94.05             94.03          93.68          93.85      93.84
k-nearest neighbour      set3          94.42             94.21             94.22          94.41          94.31      94.32
extratrees classifier    set1          94.15             94.92             94.88          94.19          94.53      94.51
extratrees classifier    set2          94.09             94.94             94.90          94.14          94.51      94.49
extratrees classifier    set3          97.90             95.72             95.81          97.86          96.81      96.85
support vector machine   set1          89.61             88.92             88.99          89.54          89.26      89.30
support vector machine   set2          89.70             88.96             89.04          89.62          89.33      89.37
support vector machine   set3          90.85             91.26             91.22          90.89          91.05      91.03
logistic regression      set1          91.56             92.28             92.23          91.62          91.92      91.89
logistic regression      set2          91.50             92.28             92.22          91.56          91.89      91.86
logistic regression      set3          92.25             92.90             92.86          92.30          92.57      92.55
random forest            set1          93.71             94.44             94.40          93.75          94.07      94.05
random forest            set2          93.87             95.00             94.94          93.94          94.44      94.41
random forest            set3          94.77             95.14             95.12          94.79          94.96      94.95
decision trees           set1          93.95             94.59             94.56          93.99          94.27      94.25
decision trees           set2          94.05             95.08             95.03          94.11          94.57      94.54
decision trees           set3          94.94             95.24             95.23          94.95          95.09      95.08

every ads in the glis_1.0 framework may use its own way to detect fake news. open-source adss may improve with feedback. hybrid models and other techniques, such as neural networks with deep learning, can also be used depending on the data, the language of the news, and the news features related to both social context and news content.
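the metrics reported in table 3 follow the standard definitions, which the short sketch below restates in python. the confusion-matrix counts are hypothetical and used only to exercise the formulas. the final lines apply the harmonic mean to the class 0 precision and recall reported for gaussian naive bayes with set1 (93.32 and 93.92); that this corresponds to the f1 score column is an assumption about how the table was computed, not something the text states.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and f1 for one class from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = harmonic_mean(precision, recall)
    return precision, recall, f1


def harmonic_mean(a, b):
    """f1 is the harmonic mean of precision and recall."""
    return 2 * a * b / (a + b)


def accuracy(tp, tn, fp, fn):
    """Share of all items that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


# Hypothetical counts: 90 true positives, 80 true negatives,
# 10 false positives, 30 false negatives.
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
acc = accuracy(tp=90, tn=80, fp=10, fn=30)
print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f} accuracy={acc:.3f}")

# Harmonic mean of the reported class 0 precision and recall
# (gaussian naive bayes, set1), rounded to two decimals.
print(round(harmonic_mean(93.32, 93.92), 2))
```

because f1 penalizes a gap between precision and recall, it gives a fuller picture than accuracy alone when one class is much rarer than the other, as in the original trfn data before over-sampling.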
conclusion and future work

in this study we presented a novel framework that offers a practical architecture for an integrated system for identifying fake news. we have tried to illustrate how digital libraries can be a service authority to promote media literacy and fight fake news. because librarians are trained to critically analyze information sources, their contributions to our proposed model are critical. accordingly, we see this work as encouragement for future collaborative studies between the lis and cs (computer science) communities. we think there is an immediate need for lis professionals to participate in and contribute to automated solutions that can help detect inaccurate and unverified information. in the same manner, we believe the collaboration of lis professionals, computer scientists, fact-checking organizations, and pioneering technology platforms is the key to providing qualified news within a real-time framework to promote information literacy. moreover, we put the reader at the core of the framework, in the feed reader position, while consuming news.

in terms of automated detection systems, we proposed a fake news detection model that integrates a dictionary-based approach with machine learning techniques and offers optimized feature sets applicable to agglutinative languages. we comparatively analyzed the findings with several classification models. we demonstrated that machine learning algorithms, when used together with dictionary-based findings, yield high scores for both precision and recall. consequently, we believe that, once operational in the field, the proposed workflow can be extended in the future to support other news elements such as photographs and videos. with the help of social network analysis (sna), it may be possible to stop or slow down the spread of fake news as it emerges.
the experiments in this work also highlighted several future research directions, such as:

• the studies can be deepened to mathematically categorize fake news types, and the dissemination characteristics of each type can be analyzed.

• the workflow has the potential to provide an automated verification platform for all news content in digital libraries to promote media literacy.

endnotes

1 m. connor sullivan, “why librarians can’t fight fake news,” journal of librarianship and information science 51, no. 4 (december 2019): 1146–56, https://doi.org/10.1177/0961000618764258.

2 “definition of 'news',” available at: https://www.collinsdictionary.com/dictionary/english/news

3 dominic difranzo and kristine gloria-garcia, “filter bubbles and fake news,” xrds: crossroads, the acm magazine for students 23, no. 3 (april 2017): 32–35, https://doi.org/10.1145/3055153.

4 andrew guess, brendan nyhan, and jason reifler, “selective exposure to misinformation: evidence from the consumption of fake news during the 2016 us presidential campaign,” european research council 9, no. 3 (2018): 4; eni mustafaraj and p. takis metaxas, “the fake news spreading plague: was it preventable?” proceedings of the 2017 acm on web science conference, (june 2017): 235–39, https://doi.org/10.1145/3091478.3091523.

5 jana laura egelhofer and sophie lecheler, “fake news as a two-dimensional phenomenon: a framework and research agenda,” annals of the international communication association 43, no. 2 (2019): 97–116, https://doi.org/10.1080/23808985.2019.1602782.
6 hannah rashkin et al., “truth of varying shades: analyzing language in fake news and political fact-checking,” proceedings of the 2017 conference on empirical methods in natural language processing, (2017): 2931–37.

7 soroush vosoughi, deb roy, and sinan aral, “the spread of true and false news online,” science 359, no. 6380 (2018): 1146–51, https://doi.org/10.1126/science.aap9559.

8 xinyi zhou and reza zafarani, “a survey of fake news: fundamental theories, detection methods, and opportunities,” acm computing surveys (csur) 53, no. 5 (2020): 1–40, https://doi.org/10.1145/3395046.

9 s. f. kattimani, praveenkumar kumbargoudar, and d. s. gobbur, “training of the library professionals in digital era: key issues” (2006), https://ir.inflibnet.ac.in:8443/ir/handle/1944/1234.

10 lynn silipigni connaway et al., “digital literacy in the era of fake news: key roles for information professionals,” proceedings of the association for information science and technology 54, no. 1 (2017): 554–55, https://doi.org/10.1002/pra2.2017.14505401070.

11 matthew c. sullivan, “libraries and fake news: what’s the problem? what’s the plan?,” communications in information literacy 13, no. 1 (2019): 91–113, https://doi.org/10.15760/comminfolit.2019.13.1.7.

12 wayne finley, beth mcgowan, and joanna kluever, “fake news: an opportunity for real librarianship,” ila reporter 35, no. 3 (2017): 8–12; american library association, “resolution on access to accurate information,” 2018; nick rochlin, “fake news: belief in post-truth,” library hi tech 35, no. 3 (2017): 386–92, https://doi.org/10.1108/lht-03-2017-0062; linda jacobson, “the smell test: in the era of fake news, librarians are our best hope,” school library journal 63, no. 1 (2017): 24–29; angeleen neely–sardon, and mia tignor, “focus on the facts: a news and information literacy instructional program,” the reference librarian 59, no.
3 (2018): 108–21, https://doi.org/10.1080/02763877.2018.1468849; claire wardle and hossein derakhshan, “information disorder: toward an interdisciplinary framework for research and policy making,” council of europe report 27 (2017).

13 ifla, “how to spot fake news,” 2017.

14 jane mandalios, “radar: an approach for helping students evaluate internet sources,” journal of information science 39, no. 4 (2013): 470–78, https://doi.org/10.1177/0165551513478889; sarah blakeslee, “the craap test,” loex quarterly 3, no. 3 (2004): 4.

15 victoria l. rubin and niall conroy, “discerning truth from deception: human judgments and automation efforts,” first monday 17, no. 5 (2012), https://doi.org/10.5210/fm.v17i3.3933; verónica pérez-rosas et al., “automatic detection of fake news,” arxiv preprint arxiv:1708.07104 (2017).

16 justin p. friesen, troy h. campbell, and aaron c. kay, “the psychological advantage of unfalsifiability: the appeal of untestable religious and political ideologies,” journal of personality and social psychology 108, no. 3 (2015): 515–29, https://doi.org/10.1037/pspp0000018.

17 tanja pavleska et al., “performance analysis of fact-checking organizations and initiatives in europe: a critical overview of online platforms fighting fake news,” social media and convergence 29 (2018).
18 yasmine lahlou, sanaa el fkihi, and rdouan faizi, “automatic detection of fake news on online platforms: a survey,” (paper, 2019 1st international conference on smart systems and data science (icssd), rabat, morocco, 2019), https://doi.org/10.1109/icssd47982.2019.9002823; christian janze, and marten risius, “automatic detection of fake news on social media platforms,” (paper, pacific asia conference on information systems (pacis), 2017); torstein granskogen, “automatic detection of fake news in social media using contextual information” (master’s thesis, norwegian university of science and technology (ntnu), 2018).

19 jacob l. nelson and harsh taneja, “the small, disloyal fake news audience: the role of audience availability in fake news consumption,” new media & society 20, no. 10 (2018): 3720–37, https://doi.org/10.1177/1461444818758715; philip n. howard et al., “social media, news and political information during the us election: was polarizing content concentrated in swing states?,” arxiv preprint arxiv:1802.03573 (2018); alexandre bovet and hernán a. makse, “influence of fake news in twitter during the 2016 us presidential election,” nature communications 10, no. 7 (2019): 1–14, https://doi.org/10.1038/s41467-018-07761-2.

20 lina zhou et al., “automating linguistics-based cues for detecting deception in text-based asynchronous computer-mediated communications,” group decision and negotiation 13, no. 1 (2004): 81–106, https://doi.org/10.1023/b:grup.0000011944.62889.6f; myle ott et al., “finding deceptive opinion spam by any stretch of the imagination,” arxiv preprint arxiv:1107.4557 (2011); rada mihalcea and carlo strapparava, “the lie detector: explorations in the automatic recognition of deceptive language,” (paper, proceedings of the acl-ijcnlp 2009 conference short papers, (2009): association for computational linguistics, 309–12); julia b. hirschberg et al., “distinguishing deceptive from non-deceptive speech,” (2005), https://doi.org/10.7916/d8697c06.
21 victoria l. rubin, yimin chen, and nadia k. conroy, “deception detection for news: three types of fakes,” proceedings of the association for information science and technology 52, no. 1 (2015): 1–4, https://doi.org/10.1002/pra2.2015.145052010083; david m. markowitz, and jeffrey t. hancock, “linguistic traces of a scientific fraud: the case of diederik stapel,” plos one 9, no. 8 (2014): e105937, https://doi.org/10.1371/journal.pone.0105937; jing ma et al., “detecting rumors from microblogs with recurrent neural networks,” (paper, proceedings of the 25th international joint conference on artificial intelligence (ijcai 2016), (2016): 3818–24), https://ink.library.smu.edu.sg/sis_research/4630.

22 kai shu et al., “fake news detection on social media: a data mining perspective,” acm sigkdd explorations newsletter 19, no. 1 (2017): 22–36, https://doi.org/10.1145/3137597.3137600.

23 eugenio tacchini et al., “some like it hoax: automated fake news detection in social networks,” arxiv preprint arxiv:1704.07506 (2017).

24 julio c.s. reis et al., “supervised learning for fake news detection,” ieee intelligent systems 34, no. 2 (2019): 76–81, https://doi.org/10.1109/mis.2019.2899143.

25 victoria l. rubin et al., “fake news or truth?
using satirical cues to detect potentially misleading news,” (paper, proceedings of the second workshop on computational approaches to deception detection, (2016): 7–17); francesco barbieri, francesco ronzano, and horacio saggion, “is this tweet satirical? a computational approach for satire detection in spanish,” procesamiento del lenguaje natural, no. 55 (2015): 135–42; soujanya poria et al., “a deeper look into sarcastic tweets using deep convolutional neural networks,” arxiv preprint arxiv:1610.08815 (2016).

26 lei guo and chris vargo, “’fake news’ and emerging online media ecosystem: an integrated intermedia agenda-setting analysis of the 2016 us presidential election,” communication research 47, no. 2 (2020): 178–200, https://doi.org/10.1177/0093650218777177.

27 natali ruchansky, sungyong seo, and yan liu, “csi: a hybrid deep model for fake news detection,” proceedings of the 2017 acm on conference on information and knowledge management, (november 2017): 797–806, https://doi.org/10.1145/3132847.3132877.

28 yaqing wang et al., “eann: event adversarial neural networks for multi-modal fake news detection,” proceedings of the 24th acm sigkdd international conference on knowledge discovery & data mining, (2018): 849–57, https://doi.org/10.1145/3219819.3219903; james w. pennebaker, martha e. francis, and roger j. booth, “linguistic inquiry and word count: liwc 2001”, mahwah: lawrence erlbaum associates 71, no. 2001 (2001).

29 “facebook, twitter may face more scrutiny in 2019 to check fake news, hate speech,” accessed may 17, 2020, available: https://www.huffingtonpost.in/entry/facebook-twitter-may-face-more-scrutiny-in-2019-to-check-fake-news-hate-speech_in_5c29c589e4b05c88b701d72e.

30 nitesh v. chawla et al., “smote: synthetic minority over-sampling technique,” journal of artificial intelligence research 16, (2002): 321–57, https://doi.org/10.1613/jair.953.
31 uğur mertoğlu and burkay genç, “lexicon generation for detecting fake news,” arxiv preprint arxiv:2010.11089 (2020).

32 burak bezirci, and asım egemen yilmaz, “metinlerin okunabilirliğinin ölçülmesi üzerine bir yazilim kütüphanesi ve türkçe için yeni bir okunabilirlik ölçütü,” dokuz eylül üniversitesi mühendislik fakültesi fen ve mühendislik dergisi 12, no. 3 (2010): 49–62, https://dergipark.org.tr/en/pub/deumffmd/issue/40831/492667.

33 robert gunning, “the technique of clear writing,” revised edition, new york: mcgraw hill, 1968.

34 ender ateşman, “türkçede okunabilirliğin ölçülmesi,” dil dergisi 58, no. 71–74 (1997).

pal: toward a recommendation system for manuscripts

scott ziegler and richard shrake

information technology and libraries | september 2018 84

scott ziegler (sziegler1@lsu.edu) is head of digital programs and services, louisiana state university libraries. prior to this position, ziegler was the head of digital scholarship and technology, american philosophical society.
richard shrake (shraker13@gmail.com) is a library technology consultant based in burlington, vermont.

abstract

book-recommendation systems are increasingly common, from amazon to public library interfaces. however, for archives and special collections, such automated assistance has been rare. this is partly due to the complexity of descriptions (finding aids describing whole collections) and partly due to the complexity of the collections themselves (what is this collection about and how is it related to another collection?). the american philosophical society library is using circulation data collected through the collection-management software package, aeon, to automate recommendations. in our system, which we’re calling pal (people also liked), recommendations are offered in two ways: based on interests (“you’re interested in x, other people interested in x looked at these collections”) and on specific requests (“you’ve looked at y, other people who looked at y also looked at these collections”). this article will discuss the development of pal and plans for the system. we will also discuss ongoing concerns and issues, how patron privacy is protected, and the possibility of generalizing beyond any specific software solution.

introduction

the american philosophical society library (aps) is an independent research library in philadelphia. founded in 1743, the library houses a wide variety of material in early american history, history of science, and native american linguistics. the majority of the library’s holdings are manuscripts, with a large amount of audio material, maps, and graphics, nearly all of which are described in finding aids created using encoded archival description (ead) standards. like similar institutions, the aps has long struggled to find new ways to help library users discover material relevant to their research.
in addition to traditional in-person, email, and phone reference, the aps has spent years creating search and browse interfaces, subject guides, and web exhibitions to promote the collections.1

as part of these ongoing efforts to connect users with collections, the aps is working on an automated recommendation system to reuse circulation data gathered through aeon. developed by atlas systems, aeon is a “request and workflow management software specifically designed for special collections libraries and archives,” and it enables the aps to gather statistics on both the use of our manuscript collections and on aspects of the library’s users.2 the automated recommendation system, which we’re calling pal, for “people also liked,” is an ongoing effort. this article presents a snapshot of current work.

pal: toward a recommendation system for manuscripts | ziegler and shrake https://doi.org/10.6017/ital.v37i3.10357

literature review

the benefits of recommendations in library opacs have long been recognized. writing in 2008 about the library recommendation system bibtip, itself started in the early 2000s, mönnich and spiering observe that “library services are well suited for the adoption of recommendation systems, especially services that support the user in search of literature in the catalog.” by 2011 oclc research and the information school at the university of sheffield began exploring a recommendation system for oclc’s worldcat.3

recommendations for library opacs commonly fall into one of two categories, content-based or collaborative filtering. content-based recommendations pair specific users to library items based on the metadata of the item and what is known about the user. for example, if a user indicates in some way that they enjoy mystery novels, items identified as mystery novels might be recommended to them. collaborative filtering combines users in some way and creates recommendations for one user based on the preferences of another user.
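to make the distinction concrete, the following javascript sketch contrasts the two categories on toy data. it is illustrative only: the item ids, genres, user names, and the functions `contentBased` and `collaborative` are hypothetical and are not drawn from bibtip, worldcat, or pal.

```javascript
// toy data: every id, genre, and user name here is hypothetical
const items = [
  { id: "b1", genre: "mystery" },
  { id: "b2", genre: "mystery" },
  { id: "b3", genre: "history" },
];
const loans = {
  alice: ["b1", "b3"],
  bob: ["b1", "b2"],
};

// content-based: match item metadata against what is known about the user
function contentBased(userGenres, catalog) {
  return catalog
    .filter((item) => userGenres.includes(item.genre))
    .map((item) => item.id);
}

// collaborative filtering: suggest what other users with overlapping
// loans have borrowed, excluding items the user already has
function collaborative(user, history) {
  const mine = new Set(history[user] || []);
  const recs = new Set();
  for (const [other, theirs] of Object.entries(history)) {
    if (other === user) continue;
    if (!theirs.some((id) => mine.has(id))) continue; // no shared item
    for (const id of theirs) {
      if (!mine.has(id)) recs.add(id);
    }
  }
  return [...recs];
}

console.log(contentBased(["mystery"], items));
console.log(collaborative("alice", loans));
```

the first function needs only metadata and a stated preference; the second needs no metadata at all, only a record of who used what.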
there can be a dark side to recommendations. the algorithms that determine which users are similar and thus which recommendations to make are not often understood. writing about algorithms in library discovery systems broadly, reidsma points out that “in librarianship over the past few decades, the profession has had to grapple with the perception that computers are better at finding relevant information than people.”4 the algorithms that are doing the finding, however, often carry the same hidden biases that their programmers have. reidsma encourages a broader understanding of algorithms in general and a deeper understanding of recommendation algorithms in particular.

the history of recommendation systems in libraries has informed the ongoing development of pal. we use both the content-based and the collaborative filtering approach to offering recommendations to users. for the purposes of communicating them to nontechnical patrons, we refer to them as “interest-based” and “request-based,” respectively. furthermore, we are cautious about the role algorithms play in determining which recommendations users see. our help text reinforces the continued importance of working directly with in-house experts, and we promote pal as one tool among the many offered by the library.

we are not aware of any literature on the development of recommendation tools for archives or special-collections libraries. the nature of the material held in these institutions presents special challenges. for example, unlike book collections, many manuscript and archival collections are described in aggregate: one description might refer to many letters. these issues are discussed in detail below.

putting data to use: recommendations based on interests and requests

the use of aeon allows the aps to gather and store data, including both data that users supply through the registration form and data concerning which collections are requested. pal uses both types of data to create recommendations.
interest-based recommendations

the first type of recommendation uses self-identified research interest data that researchers supply when creating an aeon account. when registering, a user has the option to select from a list of sixty-four topics grouped into seven broad categories (figure 1). the aps selected these interests based on suggestions from researchers as well as categories common in the field of academic history. upon signing in, a registered user sees a list of links (figure 2); each link leads to a full-page view of collection recommendations (figure 3). these recommendations follow the model, “you’re interested in x, other people interested in x looked at these collections.”

request-based recommendations

using the circulation data that aeon collects, we are able to automate recommendations in pal based on request information. upon clicking a request link in a finding aid, the user is presented with a list of recommendations on the sidebar in aeon (figure 4). each link opens the finding aid for the collection listed.

figure 1. list of interests a user sees when registering for the first time. a user can also revisit this list to modify their choices at any point by following links through the aeon interface. the selected interests generate recommendations.

figure 2. list of links appearing on the right-hand sidebar, based on interests that users select.

figure 3. recommended collections, based on interest, showing collection name (with a link to finding aid), call number, number of requests, and number of users who have requested from the collections. the user sees this list after clicking on an option from the sidebar, as shown in figure 2.

figure 4. request-based recommendation links appearing on the right-hand sidebar after a patron requests an item from a finding aid.

the process

currently, the data that drives these two functions is obtained from a semidynamic process via daily, automated sql query exports. usernames are employed to tie together requests and interests but are subsequently purged from the data before the results are presented to users and staff. this section explains the process in detail and presents code snippets where available. all code is available on github.5

interest-based recommendations

for interest-based recommendations, we employ two queries. the first query pulls every collection requested by a user for each topic for which that user has expressed an interest. the second aggregates the data for every user in the system. the following queries pull data from the microsoft sql database that aeon uses to store data, via a microsoft access intermediary. because of the number of interest options in the registration form, and the character length of some of them (“early america colonial history,” for example), we encode the interests in shortened form. “early america colonial history” becomes “ea-colhist” so as not to run into character limits in the database. this section explores each of these queries in more detail and provides example code.

the first query gathers research topics for all users who are not staff (user status is ‘researcher’), and where at least one research topic is chosen (‘researchtopics’ is not null).
the data is exported into an xml file that we call “aeonmssreg.”

    select aeondata.dbo.users.researchtopics,
           aeondata.dbo.transactions.callnumber,
           aeondata.dbo.transactions.location
    from aeondata.dbo.transactions
    inner join aeondata.dbo.users
        on aeondata.dbo.users.username = aeondata.dbo.transactions.username
    -- researchers only, with at least one topic, requesting manuscript call numbers
    where aeondata.dbo.users.researchtopics is not null
      and (aeondata.dbo.transactions.callnumber like 'mss%'
           or aeondata.dbo.transactions.callnumber like 'aps.%')
      and aeondata.dbo.users.status = 'researcher'
    for xml raw ('aeonmssreq'), root ('dataroot'), elements;

the second query combines all data for all users and exports an xml file ‘aeonmssusers.’

    -- same filter as above, but distinct rows that also carry the username
    select distinct
           aeondata.dbo.users.researchtopics,
           aeondata.dbo.transactions.callnumber,
           aeondata.dbo.transactions.location,
           aeondata.dbo.transactions.username
    from aeondata.dbo.transactions
    inner join aeondata.dbo.users
        on aeondata.dbo.users.username = aeondata.dbo.transactions.username
    where aeondata.dbo.users.researchtopics is not null
      and (aeondata.dbo.transactions.callnumber like 'mss%'
           or aeondata.dbo.transactions.callnumber like 'aps.%')
      and aeondata.dbo.users.status = 'researcher'
    for xml raw ('aeonmssusers'), root ('dataroot'), elements;

each query produces an xml file. these files are parsed using xsl stylesheets into subsets for each research interest. the stylesheets also generate counts of users requesting a collection and number of total requests for a collection by users sharing an interest. an example is the following stylesheet for the topic “early america colonial history,” which pulls from the xml file “aeonmssreg”:

[xsl stylesheet not preserved in this copy]

this process is repeated for each interest. the data from the query that we modify with xslt is presented as html that we insert into aeon templates.
this html includes the collection name (linked to finding aid), call number, number of requests, and number of users in a table. see figure 3 for how this appears to the user. the following shows how xsl is wrapped in html.

    the collections most frequently requested from researchers who expressed an interest in [topic] are listed below with links to each collection's finding aid and the number of times each collection has been requested.

    collection | call number | # of requests | # of users
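the counting that the stylesheets perform for each interest can be approximated in a few lines of javascript. this is a minimal sketch, not the xsl that pal runs: it assumes each exported row carries a comma-separated `researchtopics` string, a `callnumber`, and a `username`, and the function name `countsForInterest`, the usernames, and the sample rows are hypothetical.

```javascript
// tally requests and distinct users per collection for one interest code
function countsForInterest(rows, interestCode) {
  const byCollection = new Map();
  for (const row of rows) {
    const topics = row.researchtopics.split(",").map((t) => t.trim());
    if (!topics.includes(interestCode)) continue;
    let entry = byCollection.get(row.callnumber);
    if (!entry) {
      entry = { callnumber: row.callnumber, requests: 0, users: new Set() };
      byCollection.set(row.callnumber, entry);
    }
    entry.requests += 1;
    entry.users.add(row.username);
  }
  // usernames are used only for counting and are dropped from the output
  return [...byCollection.values()]
    .map((e) => ({ callnumber: e.callnumber, requests: e.requests, users: e.users.size }))
    .sort((a, b) => b.requests - a.requests);
}

// hypothetical sample rows
const sampleRows = [
  { researchtopics: "ea-amrev,ea-earlynat", callnumber: "mss.497.3.b63c", username: "u1" },
  { researchtopics: "ea-amrev,ea-earlynat", callnumber: "mss.497.3.b63c", username: "u1" },
  { researchtopics: "ea-amrev", callnumber: "mss.497.3.b63c", username: "u2" },
  { researchtopics: "ea-amrev", callnumber: "mss.ms.coll.200", username: "u2" },
  { researchtopics: "ea-antebellum", callnumber: "mss.ms.coll.200", username: "u3" },
];

console.log(countsForInterest(sampleRows, "ea-amrev"));
```

keeping the request count and the distinct-user count side by side matters later: a collection with many requests from a single user looks very different from one requested by many users.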
to ensure a user only sees the links that match the interests they have selected, we use javascript to determine the expressed interests of the current user and display the corresponding links to the html pages in a sidebar. this approach works well, but we must account for two quirks. the first is that many interests in the database do not conform to the current list of options because many users predate our current registration form and wrote in free-form interests. the second is that aeon stores the research information as an array rather than in a separate table, so we must account for the fact that the aeon database contains an array of values that includes both controlled and uncontrolled vocabulary.

first, we set the array as a variable so we can look for a value that matches our controlled vocabulary and separate the array into individual values for manipulation:

    // use var message to check for presence of controlled list of topics
    var message = "<#user field='researchtopics'>";
    // use var values to separate topics that are collected in one string
    var values = "<#user field='researchtopics'>".split(",");

we also create variables to generate the html entries and links out when we have extracted our research topics:

    // the html fragment assigned here, and the companion middle and close
    // variables used below, are not preserved in this copy
    var open = "";

next we set a conditional to determine if one of our controlled vocabulary terms appears in the array:

    // determine if user has an interest topic from the controlled list
    if ((message.indexOf("ea-colhis") > -1) ||
        (message.indexOf("ea-amrev") > -1) ||
        (message.indexOf("ea-earlynat") > -1) ||
        (message.indexOf("ea-antebellum") > -1) || …

if the array contains a value from our controlled vocabulary, we generate a link and translate our internal code back into a human-friendly research topic (“ea-colhist,” for example, becomes once again “early america colonial history”):

    // append one link per controlled topic found in the user's list
    for (var i = 0; i < values.length; ++i) {
        if (values[i] == "ea-colhis") {
            document.getElementById("topic").innerHTML += (open + values[i] + middle + "early america-colonial history" + close);
        } else if (values[i] == "ea-amrev") {
            document.getElementById("topic").innerHTML += (open + values[i] + middle + "early america american revolution" + close);
        } else if (values[i] == "ea-earlynat") {
            document.getElementById("topic").innerHTML += (open + values[i] + middle + "early america early national" + close);
        } else if (values[i] == "ea-antebellum") {
            document.getElementById("topic").innerHTML += (open + values[i] + middle + "early america antebellum" + close);
        } …

see figure 2 for how this appears to the user. users only see the links that correspond to their stated interest. if the array does not contain a value from our controlled vocabulary, we display the research-topic interests associated with the user account, note that we don’t currently have a recommendation, and provide a link to update the research topics for the account.

    // no controlled topic found: show the stored topics and point to the profile page
    else {
        document.getElementById("notopic").innerHTML =
            "you expressed interest in: " +
            "<#user field='researchtopics'> " +
            "we are unable to provide a specific collection recommendation for you. " +
            "please visit our user profile page to select from our list of research topics.";
    }

request-based recommendations

in addition to interest-based recommendations, pal supplies recommendations based on past requests a user has made. this section details how these recommendations are generated. aeon allows users to request materials directly from a finding aid (see figure 6). to generate our request-based recommendations we employ a query that returns the call number and username of every request in the system and export the results to an xml file called “aeonlikecollections.”

    select subquery.callnumber,
           subquery.username,
           -- drop a trailing period from the trimmed location
           iif(right(subquery.trimlocation, 1) = '.',
               left(subquery.trimlocation, len(subquery.trimlocation) - 1),
               subquery.trimlocation) as finallocation
    from (
        select distinct
               aeondata.dbo.transactions.callnumber,
               aeondata.dbo.transactions.username,
               -- keep only the part of the location before the first colon
               iif(charindex(':', [location]) > 0,
                   left([location], charindex(':', [location]) - 1),
                   [location]) as trimlocation
        from aeondata.dbo.transactions
        inner join aeondata.dbo.users
            on aeondata.dbo.users.username = aeondata.dbo.transactions.username
        where (aeondata.dbo.transactions.callnumber like 'mss%'
               or aeondata.dbo.transactions.callnumber like 'aps.%')
          and aeondata.dbo.transactions.location is not null
          and aeondata.dbo.users.status = 'researcher'
    ) subquery
    order by subquery.callnumber
    for xml raw ('aeonlikecollections'), root ('dataroot'), elements;

we then process the “aeonlikecollections” file through a series of xslt stylesheets, creating lists of every other collection that every user of the current collection has requested. first the stylesheets remove collections that have only been requested once. then we count the number of times each collection has been requested: we sort on the collection name and username and then re-sort to combine groups of requested collections with users who have requested each collection.
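the grouping described above can also be sketched in javascript rather than xslt. this is an approximation under stated assumptions, not the stylesheets pal uses: it takes the distinct call number and username pairs that the “aeonlikecollections” export contains, and the function name `alsoRequested`, the single-letter call numbers, and the usernames are hypothetical.

```javascript
// build "people also requested" lists from distinct (callnumber, username) pairs
function alsoRequested(pairs) {
  // count how many distinct pairs mention each collection
  const requestCount = new Map();
  for (const { callnumber } of pairs) {
    requestCount.set(callnumber, (requestCount.get(callnumber) || 0) + 1);
  }
  // drop collections requested only once, then group collections by user
  const byUser = new Map();
  for (const { callnumber, username } of pairs) {
    if (requestCount.get(callnumber) < 2) continue;
    if (!byUser.has(username)) byUser.set(username, new Set());
    byUser.get(username).add(callnumber);
  }
  // for each collection, tally the other collections its users requested
  const related = new Map();
  for (const collections of byUser.values()) {
    for (const a of collections) {
      if (!related.has(a)) related.set(a, new Map());
      const tally = related.get(a);
      for (const b of collections) {
        if (a === b) continue;
        tally.set(b, (tally.get(b) || 0) + 1);
      }
    }
  }
  // usernames do not appear in the result, only call numbers and counts
  const result = {};
  for (const [callnumber, tally] of related) {
    result[callnumber] = [...tally.entries()]
      .sort((x, y) => y[1] - x[1])
      .map(([other, users]) => ({ callnumber: other, users }));
  }
  return result;
}

// hypothetical sample pairs
const samplePairs = [
  { callnumber: "A", username: "u1" },
  { callnumber: "B", username: "u1" },
  { callnumber: "A", username: "u2" },
  { callnumber: "B", username: "u2" },
  { callnumber: "C", username: "u2" },
  { callnumber: "C", username: "u3" },
  { callnumber: "D", username: "u3" },
];

console.log(alsoRequested(samplePairs));
```

as in the process described in this article, usernames serve only to link requests together and are absent from the output.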
we then create a new xml file that is organized by our collection groupings. the following snippet shows a populated xml file generated by the xslt stylesheets.

    mss.497.3.b63c mss.497.3.b63c american council of learned societies … 94
    mss.ms.coll.200 mss.ms.coll.200 miscellaneous manuscripts collection … 92

we use javascript to determine the call number of the user’s current request and display the list of other collections that users who have requested the current collection have also requested. see figure 4 for how these links appear to the user.

all of the exports and processing are handled automatically through a daily scheduled task. the only personally identifiable data contained in these processes is the username, which is used for counting purposes. usernames are removed from the final products through the xslt processing on an internal administrative server, are never stored in the aeon web directory, and are never available for other library users or staff to see.

potential pitfalls and what to do about them

pal allows us to see new things about our users, and we hope that our users are able to see new collections in the library. however, there are potential pitfalls to the way we’ve been working on this project. we’re calling the two biggest pitfalls the “bias toward well-described collections” and the “problem of aboutness.”

the bias toward well-described collections

the bias toward well-described collections is best understood by examining how the aps integrates aeon into our finding aids. we offer request links at every available level of description: collection, series, folder, and item.
if a patron spends all day in our reading room and looks at the entirety of an item-level collection, they could have made between twenty and one hundred individual requests from that collection. for our statistics, each request will be counted as that collection being used. figure 6 shows a collection described at the item level; each item can be individually requested, giving the impression that this collection is very heavily used even if it is only one patron doing all the requesting.

figure 6. finding aid of collection described at the item level. a patron making their way through this collection could make as many as one hundred individual requests.

for collections described at the collection level, however, the patron has only one link to click to see the entire collection. to pal, it looks like that collection was only used once, as shown in figure 7. a patron sitting all day in our reading room looking at a collection with little description might use the collection more heavily than a patron clicking select items in a well-described collection. however, when we review the numbers, all we see is that the well-described collections get more clicks.

figure 7. screenshot of finding aid with only collection-level description. this collection has only one request link, the “special request” link at the top right. a patron looking through the entirety of this collection will only log a single request from the point of view of our statistics.

the problem of aboutness

when we speak of the problem of aboutness, we draw attention to the fact that manuscript collections can be about many different things. one researcher might come to a collection for one reason, another researcher for another reason.
a good example at the aps library is the william parker foulke papers.6 this collection contains approximately three thousand items and represents a wide variety of the interests of the eponymous mr. foulke. he discovered the first full dinosaur skeleton, promoted prison reform, worked toward abolition, and championed arctic exploration. a patron looking at this collection could be interested in any of these topics, or others. pal, however, isn’t able to account for these nuances. if a researcher interested in prison reform requests items from the foulke papers, they’ll see the same suggestions as a researcher who came to the collection for arctic exploration.

what to do about this

identifying these pitfalls is a good first step to avoiding them, but it’s only a first step. there are technical solutions, and we’ll continue to explore them. for example, the bias toward well-described collections is mitigated by showing both the number of requests and the number of users who have requested from a collection (see figure 3). we hope that by presenting both numbers, we move a little toward overcoming this bias.

however, we’re also interested in the nontechnical approaches to these issues. as mentioned in the introduction, the aps relies heavily on traditional reference service, both remote and in-house. nontechnical solutions acknowledge the shortcomings of any constructed solution and inject a healthy amount of humility into our work. additionally, the subject guides, search tools, and web exhibitions all form an ecosystem of discovery and access to supplement pal.

future steps

using data outside of aeon

we have begun exploring options for using the recommendation data outside of aeon. one early prototype surfaces a link in our primary search interface. for example, searching for the william parker foulke papers shows a link to what people who requested from this collection also looked at. see figures 8 and 9.
generalizing for other repositories

there are ways to integrate the use of aeon with ead finding aids. the systems that the aps has developed to collect data for automated recommendations take advantage of our infrastructure. we’d like for other repositories to be able to use pal. it is our hope that an institution using aeon in a different way will help us generalize this system.

generalizing beyond aeon

pal is currently configured to pull data out of the microsoft sql database used by aeon. however, all the manipulation is done outside of aeon and is therefore generalizable to data collected in other ways. because archives and special collections have long held statistics in different types of systems, we hope to be able to generalize beyond the aeon use case if there is any interest in this from other repositories.

integrating pal into aeon

conversations with atlas staff about pal have been positive, and there is interest in building many of the features into future releases of aeon. as of this writing, an open uservoice forum topic is taking votes and comments about this integration.7

figure 8. a link in the search returns that leads to recommendations based on finding aid search. clicking on the link “pal recommendations: patrons who used henry howard houston, ii papers also used these collections” will open an html page with a list of links to finding aids.

figure 9. html link of recommended finding aids based on search.

conclusion

the aps is trying to add to the already robust options for users to find relevant manuscript collections. in addition to traditional reference, web exhibitions, and online search and browse tools, we have started reusing circulation data and self-identified user interests to automate recommendations. this new system fits within the ecosystem of tools we already supply.
this is a snapshot of where the pal recommendation project is as of this writing, and we hope to work with other special collections libraries and archives to continue to grow the tool. if you are interested, we hope you reach out.

endnotes

1 “subject guides and bibliographies,” american philosophical society, accessed february 27, 2018, https://amphilsoc.org/library/guides; “exhibitions,” american philosophical society, accessed february 27, 2018, https://amphilsoc.org/library/exhibit; “galleries,” american philosophical society, accessed february 27, 2018, https://diglib.amphilsoc.org/galleries.
2 “aeon,” atlas systems, accessed february 27, 2018, https://www.atlas-sys.com/aeon/.
3 michael mönnich and marcus spiering, “adding value to the library catalog by implementing a recommendation system,” d-lib magazine 14, no. 5/6 (2008), https://doi.org/10.1045/may2008-monnich.
4 matthew reidsma, “algorithmic bias in library discovery systems,” matthew reidsma (blog), march 11, 2016, https://matthew.reidsrow.com/articles/173.
5 “americanphilosophicalsociety/pal,” american philosophical society, last modified september 11, 2017, https://github.com/americanphilosophicalsociety/pal.
6 “william parker foulke papers, 1840–1865,” american philosophical society, accessed february 27, 2018, https://search.amphilsoc.org/collections/view?docid=ead/mss.b.f826-ead.xml.
7 “recommendation system to suggest items to researchers based on users with the same research topic,” atlas systems, accessed february 27, 2018, https://uservoice.atlas-sys.com/forums/568075-aeon-ideas/suggestions/18893335-recommendation-system-to-suggest-items-to-research.
usability test results for encore in an academic library
megan johnson
information technology and libraries | september 2013 59

abstract
this case study gives the results of a usability study for the discovery tool encore synergy, an innovative interfaces product, launched at appalachian state university belk library & information commons in january 2013. nine of the thirteen participants in the study rated the discovery tool as more user friendly, according to a sus (system usability scale) score, than the library’s tabbed search layout, which separated the articles and catalog search. all of the study’s participants were in favor of switching the interface to the new “one box” search. several glitches in the implementation were noted and reported to the vendor. the study results have helped develop belk library training materials and curricula. the study will also serve as a benchmark for further usability testing of encore and appalachian state library’s website.
this article will be of interest to libraries using encore discovery service, investigating discovery tools, or performing usability studies of other discovery services.

introduction
appalachian state university’s belk library & information commons is constantly striving to make access to library resources seamless and simple for patrons. the library’s technology services team has conducted usability studies since 2004 to inform decision making for iterative improvements. the most recent versions (since 2008) of the library’s website have featured a tabbed layout for the main search box. this tabbed layout has gone through several iterations and a move to a new content management system (drupal). during fall semester 2012, the library website’s tabs were: books & media, articles, google scholar, and site search (see figure 1). some issues with this layout, documented in earlier usability studies and through anecdotal experience, will be familiar to other libraries that have tested a tabbed website interface. one user access issue is the belief of many patrons that the “articles” tab searched all articles the library had access to. in reality the “articles” tab searched seven ebsco databases, while belk library has access to over 400 databases. another problem noted with the tabbed layout was that patrons often started typing in the articles box, even when they knew they were looking for a book or dvd. this is understandable: when most of us see a search box we just start typing; we do not read all the information on the page.

megan johnson (johnsnm@appstate.edu) is e-learning and outreach librarian, belk library and information commons, appalachian state university, boone, nc.

figure 1. appalachian state university belk library website tabbed layout search, december 2012.

a third documented user issue is confusion over finding an article citation.
this is a rather complex problem, since it has been demonstrated through assessment of student learning that many students cannot identify the parts of a citation. this usability issue therefore goes beyond the patron being able to navigate the library’s interface; it is partly a lack of information literacy skills. however, even sophisticated users can have difficulty in determining if the library owns a particular journal article. this is an ongoing interface problem for belk library and many other academic libraries. google scholar (gs) often works well for users with a journal citation, since on campus they can often simply copy and paste a citation to see if the library has access, and, if so, the full text is often available in a click or two. however, if there are no results found using gs, the patrons are still not certain if the library owns the item.

background
in 2010, the library formed a task force to research the emerging market of discovery services. the task force examined summon, ebsco discovery service, primo and encore synergy and found the products, at that time, to still be immature and lacking value. in april 2012, the library reexamined the discovery market and conducted a small benchmarking usability study (the results are discussed in the methodology section and summarized in appendix a). the library felt enough improvements had been made to innovative interfaces’ encore synergy product to justify purchasing this discovery service. an encore synergy implementation working group was formed, and several subcommittees were created, including end-user preferences, setup & access, training, and marketing. to help inform the decision of these subcommittees, the author conducted a usability study in december 2012, which was based on, and expanded upon, the april 2012 study.
the goal of this study was to test users’ experience and satisfaction with the current tabbed layout, in contrast to the “one box” encore interface. the library had committed to implementing encore synergy, but there are options in layout of the search box on the library’s homepage. if users expressed a strong preference for tabs, the library could choose to keep a tabbed layout, with tabs for the articles part of encore, for the catalog part, and for other options such as google scholar and a search of the library’s website. a second goal of the study was to benchmark the user experience for the implementation of encore synergy so that, over time, improvements could be made to promote seamless access to appalachian state university library’s resources. a third goal of this study was to document problems users encountered and report them to innovative.

figure 2. appalachian state university belk library website encore search, january 2013.

literature review
there have been several recent reviews of the literature on library discovery services. thomsett-scott and reese conclude that discovery tools are a mixed blessing.1 users can easily search across broad areas of library resources, and limiting by facets is helpful. downsides include loss of individual database specificity and users’ unwillingness to look beyond the first page of results. longstanding library interface problems, such as patrons’ lack of understanding of holdings statements and of knowing when it is appropriate to search in a discipline-specific database, are not solved by discovery tools.2 in a recent overview of discovery services, hunter lists four vendors whose products have both a discovery layer and a central index: ebsco’s discovery service (eds); ex libris’ primo central index; serials solutions’ summon; and oclc’s worldcat local (wcl).
3 encore does not currently offer a central index or pre-harvested metadata for articles, so although encore has some of the features of a discovery service, such as facets and connections to full text, it is important for libraries considering implementing encore to understand that the part of encore that searches for articles is a federated search. when appalachian purchased encore, not all the librarians and staff involved in the decision making were fully aware of how this would affect the user experience. this is discussed further in the “glitches revealed” section. fagan et al. discuss james madison university’s implementation of ebsco discovery service and their customizations of the tool. they review the literature of discovery tools in several areas, including articles that discuss the selection processes, features, and academic libraries’ decision processes following selection. they conclude, the “literature illustrates a current need for more usability studies related to discovery tools.” 4 the most relevant literature to this study consists of case studies documenting a library’s experience with implementing a discovery service and task-based usability studies of discovery services. thomas and buck5 sought to determine with a task-based usability study whether users were as successful performing common catalog-related tasks in worldcat local (wcl) as they are in the library’s current catalog, innovative interfaces’ webpac. the study helped inform the library’s decision, at that time, to not implement wcl. beecher and schmidt6 discuss american university’s comparison of wcl and aquabrowser (two discovery layers), which were implemented locally. the study focused on user preferences based on students’ “normal searching patterns”7 rather than completion of a list of tasks. their study revealed undergraduates generally preferred wcl, and upperclassmen and graduates tended to like aquabrowser better.
beecher and schmidt discuss the research comparing assigned tasks versus user-defined searches, and report that a blend of these techniques can help researchers understand user behavior better.8 this article reports on a task-based study, in which the last question asks the participant to research something they had looked for within the past semester, and the results section indicates that the most meaningful feedback came from watching users research a topic they had a personal interest in. having assigned tasks also can be very useful. for example, an early problem noted with discovery services was poor search results for specific searches on known items, such as the book “the old man and the sea.” assigned tasks also give the user a chance to explore a system for a few searches, so when they search for a topic of personal interest, it is not their first experience with a new system. blending assigned tasks with user tasks proved helpful in this study’s outcomes. encore synergy has not yet been the subject of a formally published task-based usability study. allison reports on an analysis of google analytics statistics at university of nebraska-lincoln after encore was implemented.9 the article concludes that encore increases the user’s exposure to all the library’s holdings, describes some of the challenges unl faced and gives recommendations for future usability studies to evaluate where additional improvements should be made. the article also states unl plans to conduct future usability studies. although there are not yet formal published task-based studies on encore, at least one blogger from southern new hampshire university documented their implementation of the service.
singley reported in 2011, “encore synergy does live up to its promise in presenting a familiar, user-friendly search environment.”10 she points out, “to perform detailed article searches, users still need to link out to individual databases.” this study confirms that users do not understand that articles are not fully indexed and integrated; articles remain, in encore’s terminology, in “database portfolios.” see the results section, task 2, for a fuller discussion of this topic.

method
this study included a total of 13 participants. these included four faculty members, and six students recruited through a posting on the library’s website offering participants a bookstore voucher. three student employees were also subjects (these students work in the library’s mailroom and received no special training on the library’s website). for the purposes of this study, the input of undergraduate students, the largest target population of potential novice users, was of most interest. table 1 lists demographic details of the student or faculty’s college, and for students, their year. this was a task-based study, where users were asked to find a known book item and follow two scenarios to find journal articles. the following four questions/tasks were handed to the users on a sheet of paper:
1. find a copy of the book the old man and the sea.
2. in your psychology class, your professor has assigned you a 5-page paper on the topic of eating disorders and teens. find a scholarly article (or peer-reviewed) that explores the relation between anorexia and self-esteem.
3. you are studying modern chinese history and your professor has assigned you a paper on foreign relations. find a journal article that discusses relations between china and the us.
4. what is a topic you have written about this year? search for materials on this topic.
the follow-up questions were asked verbally, either after a task or as prompts while the subject was working:
1. after the first task (find a copy of the book the old man and the sea), when the user finds the book in appsearch, ask: “would you know where to find this book in the library?”
2. how much of the library’s holdings do you think appsearch/articles quick search is looking across?
3. does “peer reviewed” mean the same as “scholarly article”?
4. what does the “refine by tag” block on the right mean to you?
5. if you had to advise the library to either stay with a tabbed layout, or move to the one search box, what would you recommend?

participants were recorded using techsmith’s screen-casting software camtasia, which allows the user’s face to be recorded along with their actions on the computer screen. this allows the observer to not rely solely on notes or recall. if the user encounters a problem with the interface, having the session recorded makes it simple to create (or recreate) a clip to show the vendor. in the course of this study, several clips were sent to innovative interfaces, and they were responsive to many of the issues revealed. further discussion is in the “glitches revealed” section. seven of the subjects first used the library site’s tabbed layout (which was then the live site) as seen in figure 1. after they completed the tasks, participants filled in a system usability scale (sus) form. the users then completed the same tasks on the development server using encore synergy. participants next filled out a sus form to reflect their impression of the new interface. encore is locally branded as appsearch and the terms are used interchangeably in this study. the six other subjects started with the appsearch interface on a development server, completed a sus form, and then did the same tasks using the library’s tabbed interface.
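
the split order described above (seven participants starting with the tabbed layout, six starting with appsearch) can be sketched in a few lines of python. this is a minimal sketch and not the author’s procedure: the article does not say how participants were assigned to an order, so the random shuffle, the seed, the function name assign_orders, and the participant labels are all assumptions made for illustration.

```python
import random


def assign_orders(participants, seed=0):
    """Shuffle participants and split them between two interface orders.

    Hypothetical helper: random assignment is an assumption, since the
    article does not describe how the two groups were formed.
    """
    rng = random.Random(seed)          # fixed seed keeps the split reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    half = (len(shuffled) + 1) // 2    # with 13 people: 7 tabbed-first, 6 appsearch-first
    orders = {}
    for position, person in enumerate(shuffled):
        if position < half:
            orders[person] = ("tabbed layout", "appsearch")
        else:
            orders[person] = ("appsearch", "tabbed layout")
    return orders


# placeholder labels, not the study's actual participants
people = [f"participant {n}" for n in range(1, 14)]
orders = assign_orders(people)
tabbed_first = sum(1 for order in orders.values() if order[0] == "tabbed layout")
print(tabbed_first, len(orders) - tabbed_first)
```

alternating the starting interface in this way is a common guard against a learning effect, where the second interface a participant sees benefits from practice on the first.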
the time it took to conduct the studies ranged from fifteen to forty minutes per participant, depending on how verbal the subject was, and how much they wanted to share about their impressions and ideas for improvement. jakob nielsen has been quoted as saying you only need to test with five users: “after the fifth user, you are wasting your time by observing the same findings repeatedly but not learning much new.”11 he argues for doing tests with a small number of users, making iterative improvements, and then retesting. this is certainly a valid and ideal approach if you have full control of the design. in the case of a vendor-controlled product, there are serious limitations to what the librarians can iteratively improve. the most librarians can do is suggest changes to the vendor, based on the results of studies and observations. when evaluating discovery services in the spring of 2012, appalachian state libraries conducted a four-person task-based study (see appendix a), which used university of nebraska-lincoln’s implementation of encore as a test site to benchmark our students’ initial reaction to the product in comparison to the library’s current tabbed layout. in this small study, the average sus score for the library’s current search box layout was 62, and for unl’s implementation of encore, it was 49. this helped inform the decision of belk library, at that time, not to purchase encore (or any other discovery service), since students did not appear to prefer them. this paper reports on a study conducted in december 2012 that showed a marked improvement in users’ gauge of satisfaction with encore. several factors could contribute to the improvement in sus scores. first is the larger sample size of 13 compared to the earlier study with four participants.
another factor is that in the april study, participants were using an external site they had no familiarity with, and a first experience with a new interface is not a reliable gauge of how someone will come to use the tool over time. this study was also more robust in that it added the task of asking the user to search for something they had researched recently, and the follow-up questions were more detailed. overall it appears that, in this case, having more than four participants and a more robust design gave a better representation of user experience.

the system usability scale (sus)
the system usability scale has been widely used in usability studies since its development in 1996. many libraries use this tool in reporting usability results.12,13 it is simple to administer and score, and its results are easy to understand.14 sus is an industry standard with references in over 600 publications.15 an “above average” score is 68. scoring the scale involves a formula: for odd-numbered items, one is subtracted from the user response, and for even-numbered items, the user response is subtracted from five. the converted responses are added up, and the total is then multiplied by 2.5. this puts the answers on the familiar, easily grasped scale of 0-100. due to the scoring method, it is possible that results are expressed with decimals.16 a sample sus scale is included in appendix d.

results
the average sus score for the 13 users for encore was 71.5, and for the tabbed layout, the average sus score was 68. this small sample set indicates there was a user preference for the discovery service interface. in a relatively small study like this, these results do not imply a scientifically valid statistical measurement. as used in this study, the sus scores are simply a way to benchmark how “usable” the participants rated the two interfaces.
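
the scoring formula and the two averages reported above can be checked with a short python sketch. the function name sus_score and the list example_responses are hypothetical and are not taken from the study; the two score lists are transcribed from the per-participant values in table 1, and the sketch assumes the usual ten-item form answered on a 1-5 scale.

```python
from statistics import mean


def sus_score(responses):
    """Return the 0-100 SUS score for ten responses given on a 1-5 scale."""
    if len(responses) != 10:
        raise ValueError("sus has exactly ten items")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")
    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            total += response - 1      # odd-numbered item: response minus one
        else:
            total += 5 - response      # even-numbered item: five minus response
    return total * 2.5                 # rescale the 0-40 total to 0-100


# made-up answers from one imaginary participant (illustration only)
example_responses = [4, 2, 5, 1, 4, 2, 4, 3, 5, 2]
print(sus_score(example_responses))    # 80.0

# per-participant scores as listed in table 1 (students a-i, then faculty a-d)
encore_scores = [90, 95, 82.5, 37.5, 65, 65, 67.5, 90, 60, 40, 80, 60, 97.5]
tabbed_scores = [70, 57.5, 57.5, 92, 82.5, 77.5, 75, 82.5, 32.5, 87.5, 60, 55, 57.5]
print(round(mean(encore_scores), 1))   # 71.5
print(round(mean(tabbed_scores), 1))   # 68.2, reported in the text as 68
```

because every converted item is a whole number from 0 to 4 and the total is multiplied by 2.5, an individual score computed this way always falls on a multiple of 2.5, which is why results such as 57.5 or 82.5 appear.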
when asked the subjective follow-up question, “if you had to advise the library to either stay with a tabbed layout, or move to the one search box, what would you recommend?” 100% of the participants recommended the library change to appsearch (although four users actually rated the tabbed layout with a higher sus score). these four participants said things along the lines of, “i can get used to anything you put up.”

participant  sus (encore)  sus (tabbed layout)  year and major or college            appsearch first
student a    90            70                   senior/social work/female            no
student b    95            57.5                 freshman/undeclared/male             yes
student c    82.5          57.5                 junior/english/male                  yes
student d    37.5          92                   sophomore/actuarial science/female   yes
student e    65            82.5                 junior/psychology/female             yes
student f    65            77.5                 senior/sociology/female              no
student g    67.5          75                   junior/music therapy/female          no
student h    90            82.5                 senior/dance/female                  no
student i    60            32.5                 senior/political science/female      no
faculty a    40            87.5                 family & consumer science/female     yes
faculty b    80            60                   english/male                         no
faculty c    60            55                   education/male                       no
faculty d    97.5          57.5                 english/male                         yes
average      71.5          68

table 1. demographic details and individual and average sus scores.

discussion
task 1: “find a copy of the book the old man and the sea.” all thirteen users had faster success using encore. when using encore, this “known item” is in the top three results. encore definitely performed better than the classic catalog in saving the time of the user. in approaching task 1 from the tabbed layout interface, four out of thirteen users clicked on the books and media tab, changed the drop down search option to “title,” and were (relatively) quickly successful.
the remaining nine who switched to the books and media tab and used the default keyword search for “the old man and the sea” had to scan the results (using this search method, the book is the seventh result in the classic catalog), which took two users almost 50 seconds. this length of time for an “average user” to find a well-known book is not considered acceptable by the technology services team at appalachian state university. when using the encore interface, the follow-up question for this task was, “would you know where to find this book in the library?” nine out of 13 users did not know where the book would be, or how to find it. the three faculty members and student d could pick out the call number and felt they could locate the book in the stacks.

figure 3. detail of the screen of results for searching for “the old man and the sea”.

the classic catalog that most participants were familiar with has a “map it” feature (from the third party vendor stackmap), and encore did not have that feature incorporated yet. since this study was completed, the “map it” feature has been added to the item record in appsearch. further research can determine if students will have a higher level of confidence in their ability to locate a book in the stacks when using encore. figure 3 shows the search as it appeared in december 2012 and figure 4 has the “map it” feature implemented and pointed out with a red arrow. related to this task of searching for a known book, student b commented that in encore, the icons were very helpful in picking out media type.

figure 4. book item record in encore. the red arrow indicates the “map it” feature, an add-on to the catalog from the vendor stackmap. browse results are on the right, and only pull from the catalog results.
when using the tabbed layout interface (see figure 1), three students typed the title of the book into the “articles” tab first, and it took them a few moments to figure out why they had a problem with the results. they were able to figure it out and re-do the search in the “correct” books & media tab, but student d commented, “i do that every time!” this is evidence that the average user does not closely examine a search box; they simply start typing.

task 2: “in your psychology class, your professor has assigned you a five-page paper on the topic of eating disorders and teens. find a scholarly article (or peer-reviewed) that explores the relation between anorexia and self-esteem.” this question revealed, among other things, that seven out of the nine students did not fully understand that the terms “scholarly” and “peer reviewed” article are meant to be synonyms in this context. when asked the follow-up question “what does ‘peer reviewed’ mean to you?” student b said, “my peers would have rated it as good on the topic.” this is the kind of feedback that librarians and vendors need to be aware of in meeting students’ expectations. users have become accustomed to online ratings by their peers of hotels and restaurants, so the terminology academia uses may need to shift. further discussion on this is in the “changes suggested” section below.

figure 5. typical results for task two.

figure 5 shows a typical user result for task 2. the follow-up question asked users “what does the refine by tag box on the right mean to you?” student g reported they looked like internet ads. other users replied with variations of, “you can click on them to get more articles and stuff.” in fact, the “refine by tag” box at the top of the right column of the screen contains only indexed terms from the subject headings of the catalog. clicking a tag refines the current search results to those with the specific subject term the user clicked on.
in this study, no user clicked on these tags. for libraries considering purchasing and implementing encore, a choice of skins is available, and it is possible to choose a skin where these boxes do not appear. in addition to information from innovative interfaces, libraries can check a guide maintained by a librarian at saginaw valley state university17 to see examples of encore synergy sites, and links to how different skins (cobalt, pearl or citrus) affect appearance. appalachian uses the “pearl” skin.

figure 6. detail of screenshot in figure 5.

figure 6 is a detail of the results shown in the screenshot for an average search for task 2. the red arrows indicate where a user can click to just see article results. the yellow arrow indicates where the advanced search button is. six out of thirteen users clicked advanced after the initial search results. clicking on the advanced search button brought users to the screen pictured in figure 7.

figure 7. encore's advanced search screen.

figure 7 shows encore’s advanced search screen. this search is not designed to search articles; it only searches the catalog. this aspect of advanced search was not clear to any of the participants in this study. see further discussion of this issue in the “glitches revealed” section.

figure 8. the "database portfolio" for arts & humanities.

figure 8 shows typical results for task 2 limited just to articles. the folders on the left are basically silos of grouped databases. innovative calls this feature “database portfolios.” in this screen shot, the results of the search are narrowed to articles within the “database portfolio” of arts & humanities.
clicking on an individual database returns results from that database, and moves the user to the database’s native interface. for example, in figure 8, clicking on art full text would put the user into that database, and retrieve 13 results. while conducting task 2, faculty member a stressed she felt it was very important students learn to use discipline-specific databases, and stated she would not teach a “one box” approach. she felt the tabbed layout was much easier than appsearch and rated the tabbed layout in her sus score with an 87.5 versus the 40 she gave encore. she also wrote on the sus scoring sheet “appsearch is very slow. there is too much to review.” she also said that the small niche showing how to switch results between “books & more” and “articles” was “far too subtle.” she recommended bold tabs, or colors. librarians can forward this kind of suggestion to the vendor, but we cannot locally tweak this layout on a development server to test if it improves the user experience.

figure 9. closeup of switch for “books & more” and “articles” options.

task 3: “you are studying modern chinese history and your professor has assigned you a paper on foreign relations. find a journal article that discusses relations between china and the us.” most users did not have much difficulty finding an article using encore, though three users did not immediately see a way to limit only to articles. of the nine users who did narrow the results to articles, five used facets to further narrow results. no users moved beyond the first page of results. search strategy was also interesting. all thirteen users appeared to expect the search box to work like google. if there were no results, most users went to the advanced search, and reused the same terms on different lines of the boolean search box. once again, no users intuitively understood that “advanced search” would not effectively search for articles.
the concept of changing search terms was not a common strategy in this test group. if very few results came up, none of the users clicked on the “did you mean” or used suggestions for correction in spelling or change in terms supplied by encore. during this task, two faculty members commented on load time. they said students would not wait; results had to be instant. but when the author asked students how they felt when load time was slow, almost all said it was fine, or not a problem. they could “see it was working.” one student said, “oh, i’d just flip over to facebook and let the search run.” so perhaps librarians should not assume we fully understand student user expectations. it is also worth noting that, for the participant, this is a low-stakes usability study, not crunch time, so attitudes may be different if load time is slow for an assignment due in a few hours.

task 4: “what is a topic you have written about this year? search for materials on this topic.” this question elicited the most helpful user feedback, since participants had recently conducted research using the library’s interface and could compare ease of use on a subject they were familiar with. a few specific examples follow. student a, in response to the task to research something she had written about this semester, looked for “elder abuse.” she was a senior who had taken a research methods class and written a major paper on this topic, and she used the tabbed layout first. she was familiar with using the facets in ebsco to narrow by date, and to limit to scholarly articles. when she was using appsearch on the topic of elder abuse, encore held her facets “full text” and “peer reviewed” from the previous search on china and u.s. foreign relations. an example of encore “holding a search” is demonstrated in figures 10 and 11 below.
student a was not bothered by encore holding the limits she had put on a previous search. she noticed the limits, and then went on to further narrow within the database portfolio of “health,” which limited the results to the database cinahl first. she was happy with being able to limit by folder to her discipline. she said the folders would help her sort through the results.

student g’s topic, which she had researched within the last semester, was “occupational therapy for students with disabilities” such as cerebral palsy. she understood through experience that it would be easiest to narrow results by searching for “occupational therapy” and then adding a specific disability. student g was the user who made the most use of facets on the left. she liked encore’s use of icons for different types of materials. student b also commented on “how easy the icons made it.”

faculty b, in looking in appsearch for a topic he had been researching recently, typed in “writing across the curriculum glossary of terms” and got no results on this search. he said, “mmm, well that wasn’t helpful, so to me, that means i’d go through here” and he clicked on the google search box in the browser bar. he next tried removing “glossary of terms” from his search, and the load time was slow on articles, so he gave up after ten seconds, clicked on “advanced search,” and tried putting “glossary of terms” in the second line. this led to another dead end. he said, “i’m just surprised appalachian doesn’t have anything on it.” the author asked if he had any other ideas about how to approach finding materials on his topic from the library’s homepage and he said no, he would just try google (in other words, navigating to the group of databases for education was not a strategy that occurred to him).

faculty member d had been doing research on a relatively obscure historical event and was able to find results using encore.
when asked if he had seen the articles before, he said, “yes, i’ve found these, but it is great it’s all in one search!”

glitches revealed

it is of concern for the user experience that the advanced search of encore does not search articles; it only searches the catalog. this was not clear to any participant in this study. as noted earlier, encore’s article search is a federated search. this affects load time for article results, and also puts the article results into silos, or to use encore’s terminology, “database portfolios.” encore’s information on their website definitely markets the site as a discovery tool, saying it “integrates federated search, as well as enriched content--like first chapters--and harvested data… encore also blends discovery with the social web.”18 it is important for libraries considering purchase of encore to understand that while it does have many features of a discovery service, it does not currently have a central index with pre-harvested metadata for articles.

if innovative interfaces is going to continue to offer an advanced search box, it needs to be made explicitly clear that the advanced search is not effective for searching for articles, or innovative interfaces needs to make an advanced search work with articles by creating a central index. to cite a specific example from this study, when student e was using appsearch, on every task, after she ran a search, she clicked on the advanced search option. the author asked her, “so if there is an advanced search, you’re going to use it?” the student replied, “yeah, they are more accurate.”

another aspect of encore that users do not intuitively grasp is that when looking at the results for an article search, the first page of results comes from a quick search of a limited number of databases (see figure 8).
the users in this study did understand that clicking on the folders would narrow by discipline, but they did not appear to grasp that the results in the database portfolios are not included in the first results shown. when users click on an article result, they are taken to the native interface (such as psycinfo) to view the article. users seemed unfazed when they went into a new interface, but it is doubtful they understand they are entering a subset of appsearch. if users try to add terms or do a new search in the native database they may get relevant results, or may totally strike out, depending on the chosen database’s relevance to their research interest.

figure 10. changing a search in encore.

another problem that was documented was that after users ran a search, if they changed the text in the “search” box, the results for articles did not change. figure 6 demonstrates the results from task 2 of this study, which asks users to find information on anorexia and self-esteem. the third task asks the user to find information on china and foreign relations. figure 10 demonstrates the results for the anorexia search, with the term “china” in the search box, just before the user clicks enter, or the orange arrow for new search.

figure 11. search results for changed search.

figure 11 shows that the search for the new term, “china,” has worked in the catalog, but the results for articles are still about anorexia. in this implementation of encore, there is no “new search” button (except on the advanced search page, where there is a “reset search” button; see figure 7), and refreshing the browser had no effect on this problem. this issue was screencast19 and sent to the vendor. happily, as of april 2013, innovative interfaces appears to have resolved this underlying problem.
one purpose of this study was to determine if users had a strong preference for tabs, since the library could choose to implement encore with tabs (one for access to articles, one for the catalog, and other tab options like google scholar). this study indicated users did not like tabs in general; they much preferred a “one box solution” on first encounter.

a major concern raised was the users’ response to the question, “how much of the library’s holdings do you think appsearch/articles quick search is looking across?” twelve out of thirteen users believed that when they were searching for articles from the quick search for articles tabbed layout, they were searching all the library databases. the one exception to this was a faculty member in the english department, who understood that the articles tab searched a small subset of the available resources (seven ebsco databases out of the 400 databases the library subscribes to). all thirteen users believed appsearch (encore) was searching “everything the library owned.” the discovery service searches far more resources than other federated searches the library has had access to in the past, but it is still only searching 50 out of 400 databases.

it is interesting that in the fagan et al. study of ebsco’s discovery service, only one out of ten users believed the quick search would search “all” the library’s resources.20 a glance at james madison university’s library homepage21 suggests wording that may reduce user confusion.

figure 12. screenshot of james madison library homepage, accessed december 18, 2012.

figure 13. original encore interface as implemented in january 2013.

given the result that 100% of the users believed that appsearch looked at all databases the library has access to, the library made changes to the wording in the search box (see figure 7).
future tests can determine if this has any positive effect on the understanding of what appsearch includes.

figure 14. encore search box after this usability study was completed. the arrow highlights additions to the page as a result of this study.

some other wording changes were suggested by the finding that only seven out of nine students fully understood that “peer reviewed” would limit to scholarly articles. a suggestion was made to innovative interfaces to change the wording to “scholarly (peer reviewed)” and they did so in early january. although innovative’s response on this issue was swift, and may help students, changing the wording does not address the underlying information literacy issue of what students understand about these terms.

interestingly, encore does not include any “help” pages. appalachian’s liaison with encore has asked about this and been told by encore tech support that innovative feels the product is so intuitive that users will not need any help. belk library has developed a short video tutorial for users, and local help pages are available from the library’s homepage, but according to innovative, a link to these resources cannot be added to the top right area of the encore screen (where help is commonly located in web interfaces). although it is acknowledged that few users actually read “help” pages, it seems like a leap of faith to think a motivated searcher will understand things like the “database portfolios” (see figure 9) without any instruction at all. after implementation, the librarians here at appalachian conducted internally developed training for instructors teaching appsearch, and all agreed that understanding what is being searched and how to best perform a task such as an advanced article search is not “totally intuitive,” even for librarians.

finally, some interesting search strategy patterns were revealed.
on the second and third questions in the script (both having to do with finding articles), five of the thirteen participants used the strategy of putting in one term, then after the search ran, adding terms to narrow results using the advanced search box. although this is a small sample set, it was a common enough search strategy to make the author believe this is not an unusual approach. it is important for librarians and for vendors to understand how users approach search interfaces so we can meet expectations.

further research

the findings of this study suggest librarians will need to continue to work with vendors to improve discovery interfaces to meet users’ expectations. the context of what is being searched and when is not clear to beginning users in encore. one aspect of this test was that it was the participants’ first encounter with a new interface, and even student d, who was unenthused about the new interface (she called the results page “messy,” and her sus score was 37.5 for encore, versus 92 for the tabbed layout), said that she could learn to use the system given time. further usability tests can include users who have had time to explore the new system.

a specific task of interest for follow-up studies is whether students are better able to find an item in the stacks with the addition of the “map it” feature. locally, librarian perception is that part of the problem with this results display is simply visual spacing. the call number is not set apart or spaced so that it stands out as important information (see figure 5 for a screenshot).

another follow-up will be to repeat the question, “how much of the library’s holdings do you think appsearch is looking across?” all thirteen users in this study believed appsearch was searching “everything the library owned.” based on this finding, the library made small adjustments to the initial search box (see figures 14 and 15 as illustration).
it will be of interest to measure if this tweak has any impact.

summary

all users in this study recommended that the library move to encore’s “one box” discovery service instead of using a tabbed layout. helping users figure out when they should move to using discipline-specific databases will most likely be a long-term challenge for belk library, and for other academic libraries using discovery services, but this will probably trouble librarians more than our users.

the most important change innovative interfaces could make to their discovery service is to create a central index for articles, which would improve load time and allow for an advanced search feature for articles to work efficiently. because of this study, innovative interfaces made a wording change in search results for articles to include the word “scholarly” when describing peer reviewed journal articles in belk library’s local implementation.

appalachian state university libraries will continue to conduct usability studies and tailor instruction and e-learning resources to help users navigate encore and other library resources. overall, it is expected users, especially freshmen and sophomores, will like the new interface but will not be able to figure out how to improve search results, particularly for articles. belk library & information commons’ instruction team is working on help pages and tutorials, and will incorporate the use of encore into the library’s curricula.

references

1. thomsett-scott, beth, and patricia e. reese. "academic libraries and discovery tools: a survey of the literature." college & undergraduate libraries 19 (2012): 123-43.
2. ibid., 138.
3. hunter, athena. “the ins and outs of evaluating web-scale discovery services.” computers in libraries 32, no. 3 (2012), http://www.infotoday.com/cilmag/apr12/hoeppner-web-scale-discovery-services.shtml (accessed march 18, 2013).
4. fagan, jody condit, meris mandernach, carl s. nelson, jonathan r. paulo, and grover saunders. "usability test results for a discovery tool in an academic library." information technology & libraries 31, no. 1 (2012): 83-112.
5. thomas, bob, and stephanie buck. "oclc's worldcat local versus iii's webpac." library hi tech 28, no. 4 (2010): 648-671. doi: http://dx.doi.org/10.1108/07378831011096295
6. becher, melissa, and kari schmidt. "taking discovery systems for a test drive." journal of web librarianship 5, no. 3 (2011): 199-219. library, information science & technology abstracts with full text, ebscohost (accessed march 17, 2013).
7. ibid., 202.
8. ibid., 203.
9. allison, dee ann. “information portals: the next generation catalog.” journal of web librarianship 4, no. 1 (2010): 375–89, http://digitalcommons.unl.edu/cgi/viewcontent.cgi?article=1240&context=libraryscience (accessed march 17, 2013).
10. singley, emily. “encore synergy 4.1: a review.” the cloudy librarian: musings about library technologies, 2011, http://emilysingley.wordpress.com/2011/09/17/encore-synergy-4-1-a-review/ (accessed march 20, 2013).
11. nielsen, jakob. “why you only need to test with 5 users.” 2000, http://www.useit.com/alertbox/20000319.html (accessed december 18, 2012).
12. fagan et al., 90.
13. dixon, lydia, cheri duncan, jody condit fagan, meris mandernach, and stefanie e. warlick. "finding articles and journals via google scholar, journal portals, and link resolvers: usability study results." reference & user services quarterly 50, no. 2 (2010): 170-181.
14. bangor, aaron, philip t. kortum, and james t. miller. "an empirical evaluation of the system usability scale." international journal of human-computer interaction 24, no. 6 (2008): 574-594. doi: 10.1080/10447310802205776.
15. sauro, jeff. “measuring usability with the system usability scale (sus).” 2011, http://www.measuringusability.com/sus.php (accessed december 7, 2012).
16. ibid.
17. mellendorf, scott. “encore synergy sites.” zahnow library, saginaw valley state university, http://librarysubjectguides.svsu.edu/content.php?pid=211211 (accessed march 23, 2013).
18. encore overview, http://encoreforlibraries.com/overview/ (accessed march 21, 2013).
19. johnson, megan. videorecording made with jing on january 30, 2013, http://www.screencast.com/users/megsjohnson/folders/jing/media/0ef8f186-47da-41cf-96cb-26920f71014b
20. fagan et al., 91.
21. james madison university libraries, http://www.lib.jmu.edu (accessed december 18, 2012).

appendix a

pre-purchase usability benchmarking test

in april 2012, before the library purchased encore, the library conducted a small usability study to serve as a benchmark. the study outlined in this paper follows the same basic outline, and adds a few questions. the purpose of the april study was to measure students’ perceived success and satisfaction with the current search system for books and articles that appalachian uses, compared with the implementation of encore discovery services at university of nebraska lincoln (unl).
the methodology was four undergraduates completing a set of tasks using each system. two started with unl, and two started at appalachian’s library homepage. in the april 2012 study, the participants were three freshmen and one junior, and all were female. all were student employees in the library’s mailroom, and none had received special training on how to use the library interface. after the students completed the tasks, they rated their experience using the system usability scale (sus). in the summary conclusion of that study, the average sus score for the library’s current search box layout was 62, and for unl’s encore search it was 49. even though none of the students was particularly familiar with the current library’s interface, it might be assumed that part of the higher score for appalachian’s site was simply familiarity.

student comments from the small april benchmarking study included the following. the junior student said the unl site had "too much going on" and appalachian was "easier to use; more specific in my searches, not as confusing as compared to unl site." another student (a freshman) said she has "never used the library not knowing if she needed a book or an article." in other words, she knows what format she is searching for and doesn’t perceive a big benefit to having them grouped. this same student also indicated she had no real preference between appalachian and unl. she believed students would need to take time to learn either and that unl is a "good starting place."

appendix b

instructions for conducting the test

notes: use firefox for the browser, set to “private browsing” so that no searches are held in the cache (search terms do not pop into the search box from the last subject’s search). in the bookmark toolbar, the only two tabs available should be “dev” (which goes to the development server) and “lib” (which goes to the library’s homepage).
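the sus figures quoted in this study (for example, the averages of 62 and 49 in the april benchmark) are reported without showing how they are derived. a minimal sketch in python, assuming the standard sus scoring procedure (odd-numbered statements contribute the rating minus 1, even-numbered statements contribute 5 minus the rating, and the sum is multiplied by 2.5), shows how one participant's ten ratings become a score from 0 to 100. the function names and the sample ratings are hypothetical and are not data from this study.

```python
def sus_score(ratings):
    """Return the 0-100 SUS score for one participant's ten 1-5 ratings.

    Assumes the standard SUS item order, in which odd-numbered statements
    are positively worded and even-numbered statements are negatively worded.
    """
    if len(ratings) != 10:
        raise ValueError("SUS needs exactly 10 ratings")
    if any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("each rating must be an integer from 1 to 5")

    total = 0
    for item_number, rating in enumerate(ratings, start=1):
        if item_number % 2 == 1:
            total += rating - 1      # positively worded statement
        else:
            total += 5 - rating      # negatively worded statement
    return total * 2.5


def mean_sus(all_ratings):
    """Average the SUS scores of several participants."""
    scores = [sus_score(r) for r in all_ratings]
    return sum(scores) / len(scores)


# hypothetical ratings for a single participant (not from the study)
example = [4, 2, 4, 1, 4, 2, 5, 2, 4, 2]
print(sus_score(example))
```

because each statement contributes between 0 and 4 points, an individual score always falls on a multiple of 2.5; averages across participants need not.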
instruct users to begin each search from the correct starting place. identify students and faculty by letter (student a, faculty a, etc.).

script

hi, ___________. my name is ___________, and i'm going to be walking you through this session today. before we begin, i have some information for you, and i'm going to read it to make sure that i cover everything. you probably already have a good idea of why we asked you here, but let me go over it again briefly. we're asking students and faculty to try using our library's home page to conduct four searches, and then we will ask you a few other questions. we will then have you do the same searches on a new interface. (note: half the participants start at the development site; the other half start at the current site.) after each set of tasks is finished, you will fill out a standard usability scale to rate your experience. this session should take about twenty minutes. the first thing i want to make clear is that we're testing the interface, not you. you can't do anything wrong here. do you have any questions so far?

ok. before we look at the site, i'd like to ask you just a few quick questions. what year are you in college? what are you majoring in? roughly how many hours a week altogether--just a ballpark estimate--would you say you spend using the library website?

ok, great.

hand the user the task sheet. do not read the instructions to the participant; allow them to read the directions for themselves. allow the user to proceed until they hit a wall or become frustrated. verbally encourage them to talk aloud about their experience.

written instructions for participants

find a copy of the book the old man and the sea.

in your psychology class, your professor has assigned you a 5-page paper on the topic of eating disorders and teens. find a scholarly article (or peer-reviewed) that explores the relation between anorexia and self-esteem.
you are studying modern chinese history and your professor has assigned you a paper on foreign relations. find a journal article that discusses relations between china and the us.

what is a topic you have written about this year? search for materials on this topic.

appendix c

follow up questions for participants (or ask as the subject is working)

after the first task (find a copy of the book the old man and the sea), when the user finds the book in appsearch, ask “would you know where to find this book in the library?”

how much of the library’s holdings do you think appsearch/articles quick search is looking across?

does “peer reviewed” mean the same as “scholarly article”?

what does the “refine by tag” block on the right mean to you?

if you had to advise the library to either stay with a tabbed layout, or move to the one search box, what would you recommend?

do you have any questions for me, now that we're done?

thank subject for participating.

appendix d

sample system usability scale (sus)

each statement is rated from 1 (strongly disagree) to 5 (strongly agree).

1. i think that i would like to use this system frequently. 1 2 3 4 5
2. i found the system unnecessarily complex. 1 2 3 4 5
3. i thought the system was easy to use. 1 2 3 4 5
4. i think that i would need the support of a technical person to be able to use this system. 1 2 3 4 5
5. i found the various functions in this system were well integrated. 1 2 3 4 5
6. i thought there was too much inconsistency in this system. 1 2 3 4 5
7. i would imagine that most people would learn to use this system very quickly. 1 2 3 4 5
8. i found the system very cumbersome to use. 1 2 3 4 5
9. i felt very confident using the system. 1 2 3 4 5
10. i needed to learn a lot of things before i could get going with this system. 1 2 3 4 5

comments:

letter from the editor
improving ital’s peer review
kenneth j. varnum

information technology and libraries | june 2021
https://doi.org/10.6017/ital.v40i2.13573

over the past several months, ital has enrolled almost 30 reviewers to the journal’s new review panel. increasing the pool of reviewers for the journal supports the editorial board’s desire to provide equitable treatment to submitted articles by having two independent reviews provide double-blind consideration of each article, a practice that has now been in effect for articles submitted after may 1, 2021. i am grateful to the individuals (listed on the editorial team page) who volunteered, attended an orientation session, and have begun contributing to the work of the journal.

* * * * * *

in this issue

in the editorial section of this issue, we have a column by incoming core president margaret heller. her essay, “making room for change through rest,” highlights the need for each of us to recharge after a collectively challenging year. this inaugurates what we plan to be an occasional feature, the “core leadership column,” to which we invite contributions from members of core leadership. it is joined by two other regular items, our editorial board thoughts essay by michael p. sauers, “do space’s virtual interview lab: using simple technology to serve the public in a time of crisis,” and william yarbrough’s public libraries leading the way column, “service barometers: using lending kiosks to locate patrons.”

an interesting and diverse set of peer-reviewed articles rounds out the issue:

1. the impact of covid-19 on the use of academic library resources / ruth sara connell, lisa c. wallis, and david comeaux
2. emergency remote library instruction and tech tools: a matter of equity during a pandemic / kathia ibacache, amanda rybin, and eric vance
3. off-campus access to licensed online resources through shibboleth / francis jayakanth, anand t. byrappa, and raja visvanathan
4. a framework for measuring relevancy in discovery environments / blake l. galbreath, alex merrill, and corey m. johnson
5. beyond viaf: wikidata as a complementary tool for authority control in libraries / carlo bianchini, stefano bargioni, and camillo carlo pellizzari di san girolamo
6. algorithmic literacy and the role for libraries / michael ridley and danica pawlick-potts
7. persistent urls and citations offered for digital objects by digital libraries / nicholas homenda

kenneth j. varnum, editor
varnum@umich.edu
june 2021

6 information technology and libraries | march 2011

in the new lita strategic plan, members have suggested an objective for open access (oa) in scholarly communications. some people describe oa as articles the author has to pay someone to publish. that can be true, but that’s not how i think of it. oa is definitely not vanity publishing. most oa journals are peer-reviewed.
i like the definition provided by enablingopenscholarship: open access is the immediate (upon or before publication), online, free availability of research outputs without any of the restrictions on use commonly imposed by publisher copyright agreements.1 my focus on oa journals increased precipitously when the licensing for a popular american weekly medical journal changed. we could only access online articles from one on-campus computer unless we increased our annual subscription payment by 500 percent. we didn’t have the funds, and now the students suffer the consequences. i think it was an unfortunate decision the journal’s publishers made. i know from experience that if a student can’t access the first article they want, they will find another one that is available. interlibrary loan is simpler than ever, but i think only the patient and curious students will make the effort to contact us and request an article they cannot obtain. in 2006 scientist gary ward wrote that faculty at many institutions experience problems accessing current research. when faculty teach “what is available to them rather than what their students most need to know, the education of these students and the future of science in the u.s. will suffer.” he explains it is a false assumption that those who need access to scientific literature already have it. interlibrary loans or pay-per-view are often offered by publishers as the solution to the access problem, but this misses an essential fact of how we use the scientific literature: we browse. it is often impossible to tell from looking at an abstract whether a paper contains needed methodological detail or the perfect illustration to make a point to one’s students. 
apart from considerations of cost, time, and quality, interlibrary loans and pay-per-views simply do not meet the needs of those of us who often do not know what we’re looking for until we find it.2

i want our medical students and tomorrow’s doctors to have access to all of the most current medical research. we offer the service of providing jama articles to students, but i’m guessing that we hear from a small percentage of the students who can’t access the full text online.

are people reading oa articles? not only are scholars reading the articles, but they are citing those articles in their publications. consider the public library of science’s plosone (http://www.plosone.org/home.action), a peer-reviewed, open-access, online publication that features reports on primary research from all disciplines within science and medicine. in june 2010, plosone received its first impact factor of 4.351, an impressive number. that impact factor puts plosone in the top 25 percent of the institute for scientific information’s (isi) biology category.3 the impact factor is calculated annually by isi and represents the average number of citations received per paper published in that journal during the two preceding years.4 in other words, articles from plosone published in 2008 and 2009 were highly cited.

is oa making an impact in my medical library? i believe it is, although i won’t be happy until our students can access the online journals they want from off campus and the library won’t have to pay outrageous licensing fees. we have more than one thousand online oa journal titles in our list of online journals. the more full text students can access, the less they’ll have to settle for their second or third choice because their first choice is not available online.

i’m glad that lita members included oa in their strategic plan. the number of oa journals is increasing, and i believe we will continue to see that the articles are reaching readers and making a difference.
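the impact factor described earlier in this column is a simple ratio, and a short sketch in python may make the arithmetic concrete. it assumes the usual two-year definition: citations received in a given year to items the journal published in the two preceding years, divided by the number of items published in those two years. the function name and the counts below are hypothetical and are not figures for plosone or any other journal.

```python
def impact_factor(citations_this_year, items_in_prior_two_years):
    """Two-year impact factor: average citations per published item.

    citations_this_year: citations received this year to items the journal
        published in the two preceding years.
    items_in_prior_two_years: number of items the journal published in
        those two preceding years.
    """
    if items_in_prior_two_years <= 0:
        raise ValueError("the journal must have published at least one item")
    return citations_this_year / items_in_prior_two_years


# hypothetical journal: 400 papers published across the two preceding years,
# cited 1,200 times during the current year
print(impact_factor(1200, 400))
```

an impact factor is an average over the whole journal, so it says nothing about how often any single article was cited.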
i don’t think ital will be adopting the “author pays” model of oa, but the editorial board is dedicated to providing lita members with the access they want.

references

1. enablingopenscholarship, “enabling open scholarship: open access,” http://www.openscholarship.org/jcms/c_6157/open-access?portal=j_55&printview=true (accessed jan. 18, 2011).
2. ward, gary, “deconstructing the arguments against improved public access,” newsletter of the american society for cell biology, nov. 2006, http://www.ascb.org/filetracker.cfm?fileid=550 (accessed jan. 18, 2011).
3. davis, phil, “plos one: is a high impact factor a blessing or a curse?” online posting, june 21, 2010, the scholarly kitchen, http://scholarlykitchen.sspnet.org/2010/06/21/plosone-impact-factor-blessing-or-a-curse/ (accessed jan. 18, 2011).
4. thomson reuters, “introducing the impact factor,” http://thomsonreuters.com/products_services/science/academic/impact_factor/ (accessed jan. 18, 2011).

cynthia porter
editorial board thoughts: is open access the answer?

cynthia porter (cporter@atsu.edu) is distance support librarian at a.t. still university of health sciences, mesa, arizona.

drawing upon findings from a national survey of u.s. public libraries, this paper examines trends in internet and public computing access in public libraries across states from 2004 to 2006. based on library-supplied information about levels and types of internet and public computing access, the authors offer insights into the network-based content and services that public libraries provide. examining data from 2004 to 2006 reveals trends and accomplishments in certain states and geographic regions. this paper details and discusses the data, identifies and analyzes issues related to internet access, and suggests areas for future research.
this article presents findings from the 2004 and 2006 public libraries and the internet studies detailing the different levels of internet access available in public libraries in different states.1 at this point, 98.9 percent of public library branches are connected to the internet and 98.4 percent of connected public library branches offer public internet access.2 however, the types of access and the quality of access available are not uniformly distributed among libraries or among the libraries in various states. while the data at the national level paint a portrait of the internet and public computing access provided by public libraries overall, studies of these differences among the states can help reveal successes and lessons that may help libraries in other states to increase their levels of access.

the need to continue to increase the levels and quality of internet and public computing access in public libraries is not an abstract problem. the services and content available on the internet continue to require greater bandwidth and computing capacity, so public libraries must address ever-increasing technological demands on the internet and computing access that they provide.3 public libraries are also facing increased external pressure on their internet and computing access. as patrons have come to rely on the availability of internet and computing access in public libraries, so too have government agencies.
many federal, state, and local government agencies now rely on public libraries to facilitate citizens’ access to e-government services, such as applying for the federal prescription drug plans, filing taxes, and many other interactions with the government.4 further, public libraries also face increased demands to supply public access computing in times of natural disasters, such as the major hurricanes of 2004 and 2005.5 as a result, both patrons and government agencies depend on the internet and computing access provided by public libraries, and each group has different, but interrelated, expectations of what kinds of access public libraries should provide. however, the data indicate that public libraries are at capacity in meeting some of these expectations, while some libraries lack the funding, technology-support capacity, space, and infrastructure (e.g., power, cabling) to reach the expectations of each respective group.

as public libraries (and the internet and public computing access they provide) continue to fill more social roles and expectations, a range of new ideas and strategies can be considered by public libraries to identify successful methods for providing access that is high quality and sufficient to meet the needs of patrons and community. the goals of the public libraries and the internet studies have been to help provide an understanding of the issues and needs of libraries associated with providing internet-based services and resources.
the 2006 public libraries and the internet study employed a web-based survey approach to gather both quantitative and qualitative data from a sample of the 16,457 public library outlets in the united states.6 a sample was drawn to accurately represent metropolitan status (roughly equating to their designation of urban, suburban, or rural libraries), poverty levels (as derived through census data), state libraries, and the national picture, producing a sample of 6,979 public library outlets.7 the survey received a total of 4,818 responses for a response rate of 69 percent.

the data in this article, unless otherwise noted, are drawn from the 2004 and 2006 public libraries and the internet studies.8 while the survey received responses from libraries in all fifty states, there were not enough responses in all states from which to present state-level findings. the study was able to provide state-level analysis for thirty-five states (including washington, d.c.) in 2004 and forty-four states at the outlet level (including washington, d.c.) and forty-two states at the system level (including washington, d.c.) in 2006. in addition, there was some variance in states with adequate responses between the 2004 and 2006 studies. a full listing of the states is available in the final reports of the 2004 and 2006 studies at http://www.ii.fsu.edu/plinternet_reports.cfm. thus, the findings below reflect only those states for which both the 2004 and 2006 studies were able to provide analysis.

public libraries and internet access across the united states: a comparison by state 2004–2006
paul t. jaeger, john carlo bertot, charles r. mcclure, and miranda rodriguez
information technology and libraries | june 2007

paul t. jaeger (pjaeger@umd.edu) is an assistant professor at the college of information studies at the university of maryland; john carlo bertot (bertot@ci.fsu.edu) is professor and associate director of the information use management and policy institute, college of information, florida state university; charles r. mcclure (cmcclure@ci.fsu.edu) is francis eppes professor and director of the information use management and policy institute, college of information, florida state university; and miranda rodriguez (mrodrig08@umd.edu) is a graduate student in the college of information studies at the university of maryland.

public libraries and the internet across the states

overview of 2004 to 2006

as the public library and the internet studies have been ongoing since 1994, the questions asked in the biennial studies have evolved along with the provision of internet access in libraries. the questions have varied between surveys, but there have been consistent questions that allow for longitudinal analysis at the national level. the 2004 study introduced the analysis of the data at both the national and the state levels. with both the 2004 and 2006 studies providing data at the state level, some longitudinal analysis at the state level is now possible.

overall, there were a number of areas of consistent data across the states from 2004 to 2006. most states had fairly similar, if not identical, percentages of library outlets offering public internet access between 2004 and 2006. for the most part, changes were increases in the percentage of library outlets offering patron access. further, the average number of hours open per week in 2004 (44.5) and in 2006 (44.8) were very similar, as were the percentages of library outlets reporting increases in hours per week, decreases in hours per week, and no changes in hours per week. while these numbers are consistent, it is not known whether this average number of hours open, or the distribution of the hours open across the week, is sufficient to meet patron needs in most communities.
data across the states also indicated that physical space is the primary reason for the inability of libraries to add more workstations within the library building. there was also consistency in the findings related to upgrades and replacement schedules.

changes and continuities from 2004 to 2006

while the items noted above show some areas of stability in the internet access provided by public libraries across the states, insights are possible in the areas of change for libraries overall or in the libraries that are leading in particular areas. table 1 details the states with the highest average number of hours open per public library outlet in 2004 and 2006. between 2004 and 2006, the national average for the number of hours open increased slightly from 44.5 hours per week to 44.8 hours per week. this increase is reflected in the numbers for the individual states in 2006, which are generally slightly higher than the numbers for the individual states in 2004. for example, the top state in 2006 averaged 55.7 hours per outlet each week, while the top state in 2004 averaged 54.8 hours.

the top four states (ohio, new jersey, florida, and virginia) were the same in both years, though with the top two switching positions. this demonstrates a continuing commitment in these four states by state and local government to ensure wide access to public libraries. these states are also ones with large populations and state budgets, presumably fueling the commitment and facilitating the ability to keep libraries open for many hours each week. while the needs of patrons in other states are no less significant, the data indicate that states with larger populations and higher budgets, not surprisingly, may be best positioned to provide the highest levels of access to public libraries for state residents.

the other six states in the 2006 top ten were not in the 2004 top ten. the primary reason for this is that the six states in 2006 increased their hours more than other states.
note that the fifth-ranked state in 2004, south carolina, averaged 49 hours per outlet each week, which is less than the tenth-ranked state in 2006, illinois, at 49.5 hours. simply by maintaining the average number of hours open per outlet between 2004 and 2006, south carolina fell from fifth to out of the top ten. these differences are reflected in the fact that there is nearly a ten-hour difference from first place to tenth place in 2004; yet only a six-hour discrepancy exists from first place to tenth in 2006. these numbers suggest that hours of operation may change frequently for many libraries, indicating the need for future evaluations of operational hours in relation to meeting patron demand.

table 1. highest average number of hours open in public library outlets by state in 2004 and 2006

rank  2004                     2006
1     new jersey       54.8    ohio               55.7
2     ohio             54.6    new jersey         55.6
3     florida          52.4    florida            52.3
4     virginia         51.3    virginia           52.3
5     south carolina   49.0    indiana            51.9
6     utah             48.0    pennsylvania       50.6
7     new mexico       47.4    washington, d.c.   50.6
8     rhode island     47.3    maryland           50.0
9     alabama          46.9    connecticut        49.8
10    new york         46.2    illinois           49.5
      national         44.5    national           44.8

table 2 displays the states with the highest average number of public access workstations per public library in 2004 and 2006. the national averages between 2004 and 2006 also showed a slight increase from 10.4 workstations in 2004 to 10.7 workstations in 2006. a key reason for this slow growth in the number of workstations appears to have a great deal to do with limitations of physical space in libraries; in spite of increasing demands, space constraints often limit computer capacity.9 unlike table 1, the comparisons between 2004 and 2006 in table 2 do not show across-the-board increases from 2004 to 2006. in fact, florida had the highest average of workstations per library outlet in both 2004 and 2006, but the average number decreased from 22.6 in 2004 to 21.7 in 2006.
it is interesting to note that florida has a significantly higher number of workstations than the next highest state in both 2004 and 2006. in contrast, many of the states in the lower half of the top ten in 2004 had substantially lower average numbers of workstations in 2004 than in 2006. in 2004 there were an average of seven more computers in spot two than spot ten; in 2006, there were only an average of four more computers from spot two to ten. the large increases in the number of workstations in some states, like nevada, michigan, and maryland, indicate sizeable changes in budget, numbers of outlets, and/or population size. also of note is the significant drop of the average number of workstations in kentucky, declining from 18.8 in 2004 to fewer than 13 in 2006. a possible explanation is that, since kentucky libraries have been leaders in adopting wireless technologies (see table 3), the demand for workstations has decreased as libraries have added wireless access.

five states appear in the top ten of both years: florida, indiana, georgia, california, and new jersey. the average number of workstations in indiana, california, and georgia increased from 2004 to 2006, while the average number of workstations in florida and new jersey decreased between 2004 and 2006. some of the decreases in workstations can be accounted for by increases in the availability of wireless access in public libraries, as libraries with wireless access may feel less need to add more networked computers, relying on patrons to bring their own laptops. such a strategy, of course, will not increase access for patrons who cannot afford laptops. some libraries have sought to address this issue by having laptops available for loan within the library building.

the states listed in table 3 had the highest average levels of wireless connectivity in public library outlets in 2004 and 2006.
the differences between the numbers in 2004 and 2006 reveal the dramatic increases in the availability of wireless internet access in public libraries. the national average in 2004 was 17.9 percent, but in 2006, the national average had more than doubled to 37.4 percent of public libraries offering wireless internet access. this sizeable increase is reflected in the changes in the states with the highest levels of wireless access. every position in the ratings in table 3 shows a dramatic jump from 2004 to 2006. the top position increased from 47 percent to 63.8 percent. the tenth position increased from 19.6 percent to 47.8 percent, an increase of nearly two-and-a-half times. these increases show how much more prominent wireless internet access has become in the services that public libraries offer to their communities and to their patrons.

table 2. highest average number of public access workstations in public library outlets by state in 2004 and 2006

rank  2004                     2006
1     florida          22.6    florida      21.7
2     kentucky         18.8    indiana      17.5
3     new jersey       15.5    nevada       15.7
4     georgia          14.0    michigan     14.8
5     utah             13.0    maryland     14.6
6     rhode island     12.6    georgia      14.4
7     indiana          12.3    arizona      14.1
8     texas            11.9    california   14.0
9     california       11.8    new jersey   13.8
10    south carolina   11.7    virginia     13.0
10    new york         11.7
      national         10.4    national     10.7

table 3. highest levels of public access wireless internet connectivity in public library outlets by state in 2004 and 2006

rank  2004                     2006
1     kentucky         47%     virginia        63.8%
2     new mexico       38.6%   connecticut     56.6%
3     new hampshire    31.6%   indiana         56.6%
4     virginia         30.8%   rhode island    53.9%
5     texas            26.4%   kentucky        52.0%
6     kansas           25.8%   new jersey      50.9%
7     new jersey       22.8%   maryland        49.8%
8     rhode island     22.5%   illinois        48.3%
9     florida          21.9%   california      47.8%
10    new york         19.6%   massachusetts   47.8%
      national         17.9%   national        37.4%

four states appear on both the 2004 and 2006 lists: virginia, kentucky, rhode island, and new jersey. these four states all showed increases, but the rises in some other states were significant enough to reduce kentucky from the top-ranked state in 2004 to the fifth ranked, in spite of the fact that the number of public libraries in kentucky offering wireless access increased from 47 percent to 52 percent. in both years, a majority of the states in the top ten were located along the east coast. further, high levels of wireless access may be linked in some states to areas of high population density or the strong presence of technology-related sectors in the state, as in california and virginia. smaller states with areas of dense populations, such as connecticut, rhode island, and maryland, are also among the leaders in wireless access.

tables 4 and 5 provide contrasting pictures regarding the number of public access internet workstations in public libraries by state in 2004 and 2006. table 4 shows the states with the highest percentages of libraries that consistently have fewer workstations than are needed by patrons, while table 5 shows the states with the highest percentages of libraries that consistently have sufficient workstations to meet patron needs. of note is the fact that, unlike the preceding three tables, there appears to be no significant geographical clustering of states in tables 4 and 5.

nationally, the percentage of libraries that consistently have insufficient workstations to meet patron needs declined from 15.7 percent in 2004 to 13.7 percent in 2006, a change that is within the margin of error (+/- 3.4 percent) of the question on the 2006 survey. due to the size of the change, it is not known if the national decline was a real improvement or simply a reflection of the margin of error.
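
the margin-of-error reasoning in the preceding paragraph can be written out as a small check. the sketch below is illustrative only: the function names are hypothetical, the figures are the ones quoted in the text, and the comparison is the simple "is the change smaller than the reported margin" test the text describes, not a formal significance test of the difference between two survey estimates.

```python
def change_in_points(before: float, after: float) -> float:
    """Change between two survey percentages, in percentage points."""
    return round(after - before, 1)


def within_margin(before: float, after: float, margin: float) -> bool:
    """True when the absolute change does not exceed the margin of error."""
    return abs(change_in_points(before, after)) <= margin


# Figures quoted in the text: libraries with consistently insufficient
# workstations, 2004 vs. 2006, and the 2006 question's margin of error.
change = change_in_points(15.7, 13.7)
inconclusive = within_margin(15.7, 13.7, 3.4)
print(change, inconclusive)
```

when the check returns true, as it does for these figures, the observed change cannot be distinguished from sampling noise on the basis of the margin alone.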
washington, d.c., oregon, new mexico, idaho, and california appear on the lists for both 2004 and 2006 in table 4. washington, d.c. had the highest percentage of libraries reporting insufficient workstations in both years, though there was a significant decrease from 100 percent of libraries in 2004 to 69 percent of libraries in 2006. in this case, the significant drop represents major strides forward to providing sufficient access to patrons in washington, d.c. similarly, though california features on both lists, the percentages dropped from 44.9 percent in 2004 to 22.2 percent in 2006, a decline of more than half. states like these are obviously making efforts to address the need for increased workstations. overall, eight out of ten positions in table 4 remained constant or saw a decline in percentage in each position from 2004 to 2006, indicating a national decrease in libraries with insufficient workstations.

in sharp contrast, fewer than 20 percent of nevada libraries in 2004 reported insufficient workstations, placing well out of the top ten. however, in 2006 nevada ranked second, with 51.5 percent of public libraries reporting insufficient workstations to meet patron demand. with nevada’s rapidly growing population, it appears that the demand for internet access in public libraries may not be keeping pace with the population growth.

the percentage of public libraries reporting sufficient workstations to consistently meet patron demands increased slightly at the national level from 14.1 percent in 2004 to 14.6 percent in 2006, again well within the margin of error (+/- 3.5 percent) of the 2006 question. however, in table 5, the top ten positions in 2006 all feature lower percentages than the same positions in 2004. in 2004 the top-ranked state had 53.2 percent of libraries able to consistently meet patron needs for internet access, but the top-ranked state in 2006 had only 31 percent of libraries able to consistently meet patron access needs.

table 4. public library outlet public access workstation availability by state in 2004 and 2006: consistently have fewer workstations than are needed

rank  2004                       2006
1     washington, d.c.   100%    washington, d.c.   69.9%
2     california         44.9%   nevada             51.5%
3     florida            36%     oregon             34.8%
4     new mexico         30.7%   new mexico         31.9%
5     oregon             30.4%   tennessee          30.4%
6     utah               29.2%   alaska             27.8%
7     south carolina     28.4%   idaho              26%
8     kentucky           24.1%   california         22.2%
9     alabama            21.5%   new york           21.4%
10    idaho              21.1%   rhode island       19%
      national           15.7%   national           13.7%

table 5. public library outlet public access workstation availability by state in 2004 and 2006: always have a sufficient number of workstations to meet demand

rank  2004                     2006
1     wyoming          53.2%   louisiana        31%
2     alaska           34.9%   new hampshire    30.4%
3     kansas           32.2%   north carolina   28.4%
4     rhode island     31.4%   arkansas         26.2%
5     new hampshire    29.7%   wyoming          25.2%
6     south dakota     25.2%   mississippi      24.4%
7     georgia          25%     missouri         23.6%
8     arkansas         24.8%   vermont          22.2%
9     vermont          32.7%   nevada           20.9%
10    virginia         22.4%   pennsylvania     17.9%
10                             west virginia    17.9%
      national         14.1%   national         14.6%

four states (new hampshire, arkansas, wyoming, and vermont) appear on both the 2004 and 2006 lists. the national increase in the sufficiency of the number of workstations to meet patron access needs and decreases in all of the top-ranked states between 2004 and 2006 seem incongruous. this situation results, however, from a decrease in range of differences among the states from 2004 to 2006, so that the range is compressed and the percentages are more similar among the states. further, in some states, the addition of wireless access may have served to increase the overall sufficiency of the access in libraries, possibly leveling the differences among states.
nevertheless, the national average of only 14.6 percent of public libraries consistently having sufficient numbers of workstations to meet patron access needs is clearly a major problem that public libraries must work to address. comparing the 2006 data of tables 4 and 5 demonstrates that patron demands for internet access are being met neither evenly nor consistently across the states.

nationally, the percentage of public library systems with increases in the information technology budgets from the previous year dropped dramatically from 36.1 percent in 2004 to 18.6 percent in 2006. as can be seen in table 6, various national, state, and local budget crunches have significantly reduced the percentages of public library systems with increases in information technology budgets. when inflation is taken into account, a stationary information technology budget represents a net decrease in funds available in real dollar terms, so the only public libraries that are not actually having reductions in their information technology budgets are those with increases in such budgets. since internet access and the accompanying hardware necessary to provide it are clearly a key aspect of information technology budgets, decreases in these budgets will have tangible impacts on the ability of public libraries to provide sufficient internet access.

virtually every position on table 6 has a decrease of 20 percent to 30 percent from 2004 to 2006, with the largest decrease being from 84.2 percent in 2004 to 48.3 percent in 2006 in the second position. five states (delaware, kentucky, florida, rhode island, and south carolina) are listed for both 2004 and 2006, though every one of these states registered a decrease from 2004 to 2006. no drop was more dramatic than south carolina’s from 84.2 percent in 2004 to 31 percent in 2006.
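
the point made above about stationary budgets and inflation can be illustrated with a short calculation. this is a sketch under stated assumptions: the 3 percent inflation rate and the dollar amounts are hypothetical and do not come from the survey, and the function names are invented for the example.

```python
def real_value(nominal: float, inflation_rate: float, years: int = 1) -> float:
    """Deflate a nominal amount to base-year dollars."""
    return nominal / (1.0 + inflation_rate) ** years


def real_change_percent(last_year: float, this_year: float,
                        inflation_rate: float) -> float:
    """Percentage change in purchasing power from last year to this year."""
    return (real_value(this_year, inflation_rate) / last_year - 1.0) * 100.0


ASSUMED_INFLATION = 0.03  # hypothetical annual rate, not from the study

# A budget that is unchanged in nominal dollars loses purchasing power.
flat_budget = real_change_percent(100_000.0, 100_000.0, ASSUMED_INFLATION)

# A budget has to grow by the inflation rate just to stay level in real terms.
matched_budget = real_change_percent(100_000.0, 103_000.0, ASSUMED_INFLATION)

print(f"flat budget: {flat_budget:.2f}% real change")
print(f"budget raised by 3%: {matched_budget:.2f}% real change")
```

under these assumed figures a flat budget is a real-terms cut, which is why the text treats only systems reporting increases as not losing ground.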
overall, though, the declining information technology budgets and continuing increases in demands for information technology access among patrons create a very difficult situation for libraries.

table 6. highest levels of public library system overall internet information technology budget increases by state in 2004 and 2006

rank  2004                     2006
1     florida          87.5%   delaware         60%
2     south carolina   84.2%   kentucky         48.3%
3     rhode island     67.5%   maryland         47.6%
4     delaware         64.9%   wyoming          45.7%
5     new jersey       61.5%   louisiana        40%
6     north carolina   55.5%   florida          38%
7     virginia         53.6%   rhode island     33.3%
8     kentucky         53.2%   south carolina   31%
9     new mexico       49.3%   arkansas         27.5%
10    kansas           49%     california       27.3%
      national         36.1%   national         18.6%

public libraries and the internet in 2006

along with questions that were asked on both the 2004 and 2006 public libraries and the internet studies, the survey included new questions on the 2006 study to account for social changes, alterations of the policy environment, and the maturation of internet access in public libraries. several findings from the new questions on the 2006 study were noteworthy among the state data.

the states listed in table 7 had the highest percentage of public library systems with increases in total operating budget over the previous year in 2006. nationally, 45.1 percent of public library systems had some increase in their overall budget, which includes funding for staff, physical structures, collection development, and many other costs, along with technology. at the state level, three northeastern states clearly led the way, with more than 75 percent of library systems in maryland, delaware, and rhode island benefiting from an increase in the overall operating budget. also of note is the fact that two fairly rural and sparsely populated western states, idaho and wyoming, were among the top ten.

table 7. highest levels of public library system total operating budget increases by state in 2006

rank  state            2006
1     maryland         85.7%
2     delaware         80%
3     rhode island     76.4%
4     idaho            74.5%
5     kentucky         73.6%
6     connecticut      68.6%
7     virginia         62.8%
8     new hampshire    62.5%
9     north carolina   61.6%
10    wyoming          60.9%
      national         45.1%

five of the states in the top ten in highest percentages of increases in operating budget in 2006 were also among the top ten in highest percentages of increases in information technology budgets in 2006. comparing table 7 with table 6 reveals that delaware, kentucky, maryland, rhode island, and wyoming are on both lists. in these states, increases in information technology budgets seem to have accompanied larger increases in the overall 2006 budget. an interesting point to ponder in comparing table 6 with table 7 is the large discrepancy between the national percentage of library systems with increases in information technology budgets (18.6 percent) and the percentage with increases in overall budgets (45.1 percent). as internet access is becoming more vital to public libraries in the content and services they provide to patrons, it seems surprising that so much smaller a portion of library systems would receive an increase in information technology budgets than in overall budgets.

one growing issue with the provision of internet access in public libraries is the provision of access at sufficient connection speeds. more and more internet content and services are complex and require large amounts of bandwidth, particularly content involving audio and video components. fortunately, as demonstrated in table 8, 53.5 percent of libraries nationally indicate that their connection speed is sufficient at all times to meet patron needs. in contrast, only 16.1 percent of public libraries nationally indicate that their connection speed is insufficient to meet patron needs at all times. georgia has the highest percentage of libraries that always have sufficient connection speed at 80.5 percent.
in the case of georgia, the statewide library network is most likely a key part of ensuring the majority of libraries have sufficient access speed. many of the other states that have the highest percentages of public libraries with sufficient connection speeds are located in the middle part of the country. the state with the highest percentage of libraries with insufficient connection speed to meet patron demands is virginia, with 35 percent of libraries. curiously, virginia consistently ranks in the top ten of tables 1–3. though virginia libraries have some of the longest hours open, some of the highest numbers of workstations, and some of the highest levels of wireless access, they still have the highest percentage of libraries with insufficient connection speed.

only five states had more than 25 percent of libraries with connection speeds insufficient to meet the needs of patrons at all times. this issue is significant now in these states, as these libraries lack the necessary connection speeds. however, it will continue to escalate as an issue as content and services on the internet continue to evolve and become more complex, thus requiring greater connection speeds.

comparing table 8 with table 4 (consistently have fewer workstations than are needed) and table 5 (always have a sufficient number of workstations to meet demand) reveals some parallels. alabama and rhode island are among the top ten states both for connection speed being consistently insufficient to meet patron needs (table 8) and consistently have fewer workstations than are needed (table 4). conversely, vermont and louisiana are among the top ten states both for connection speed being sufficient to meet patron needs at all times (table 8) and always have a sufficient number of workstations to meet demand (table 5).

table 9 displays the two leading types of internet connection providers for public libraries and the states with the highest percentages of libraries using each.
nationally, 46.4 percent of public libraries rely on an internet service provider (isp) for internet access. in the states listed in table 9, three-quarters or more of libraries use an isp, with more than 90 percent of libraries in kentucky and iowa using an isp. the next most common means of connection for public libraries is through a library cooperative or library network, with 26.2 percent of libraries nationally using these means. in such cases, member libraries rely on their established network to serve as the connector to the internet. the library network approach seems to be most effective in geographically small states. the top three on the list (rhode island, delaware, and west virginia) are three of the smallest states, with more than 75 percent of libraries in each of these states connecting through a network. nationally, the remaining approximately 25 percent of libraries connect through a network managed by a nonlibrary entity or by other means.

table 8. highest percentages of public library outlets where public access internet service connection speed is sufficient at all times or insufficient by state in 2006

rank  sufficient to meet patron needs at all times   insufficient to meet patron needs
1     georgia          80.5%                         virginia         35%
2     new hampshire    70.6%                         north carolina   28.1%
3     iowa             64.2%                         alaska           27.3%
4     illinois         64%                           delaware         26.9%
5     ohio             63.9%                         mississippi      26.6%
6     indiana          63.6%                         missouri         24.3%
7     vermont          63.5%                         rhode island     23.1%
8     oklahoma         62.8%                         oregon           22.4%
9     louisiana        61.7%                         connecticut      21.5%
10    wisconsin        61.5%                         arkansas         21.2%
      national         53.5%                         national         16.1%

the highest percentages of public library systems receiving each kind of e-rate discount are presented in table 10.
e-rate discounts are an important source of technology funding for many public libraries across the country, with more than $250,000,000 in e-rate discounts distributed to libraries between 2000 and 2003.10 nationally in 2006, 22.4 percent of public library systems received discounts for internet connectivity, 39.6 percent for telecommunications services, and 4.4 percent for internal connection costs. mississippi and louisiana appear in the top five for each of the three types of discounts. minnesota and west virginia are each in the top five for two of the three lists. many of the states benefiting the most from e-rate funding in 2006 have large rural populations spread out over a geographically dispersed area, indicating the continuing importance of e-rate discounts in bringing internet connections to rural public libraries. maryland and west virginia are both included in the telecommunications service column of table 10 due to proportionally large areas of these smaller states that are rural. the importance of the telecommunications discounts in certain states is evident from the fact that more than 75 percent of public library systems in all five states listed received such discounts. in comparison, only one state has more than 75 percent of library systems receiving discounts for internet connectivity, while no state has 30 percent of library systems receiving discounts for internal connection costs, with the latter reflecting the manner in which e-rate funding is calculated.

in spite of the penetration of the internet into virtually every public library in the united states and the general expectations that internet access will be publicly available in every library, not all public libraries offer information technology training for patrons. nationally, 21.4 percent of public library outlets do not offer technology training. table 10 lists the states with the highest percentages of public library outlets not offering information technology training.
six of the ten states listed are located in the southeastern part of the country. a lack of resources or of adequate staff to provide training is a leading concern in these states. not offering patron training may be strongly linked to a lack of the economic resources to do so. for example, the two states with the highest percentage of public libraries not offering patron training, louisiana and mississippi, are also the two states in the top five recipients of each kind of e-rate funding listed in table 10. if the libraries in states like these are economically struggling just to provide internet access, it seems likely that providing accompanying training might be difficult as well. a further difficulty is that there is little public or private funding available specifically for training.

■ discussion of issues

the similarities and differences among the states indicate that the evolution of public access to the internet in public libraries is not necessarily an evenly distributed phenomenon, as some states appear to be consistent leaders in some areas and other states appear to consistently trail in others. while the national picture is one primarily of continued progress in the availability and quality of internet access available to library patrons, the progress is not evenly distributed among the states.11 libraries in different states struggle with or benefit from different issues. some public libraries are constrained by state and local budgetary limitations, while other libraries are seeking alternate funding sources through grant writing and building partnerships with the corporate world. some face barriers to providing access due to their geographical location or small service population. it may also be the case that the libraries in some states do not perceive that patrons desire increased access.
other public libraries are able to provide high-end access as a result of having strong local leadership, sufficient state and local funding, well-developed networks and cooperatives, and a proactive state library.

though the discussion of the “digital divide” has become much less frequent, the state data seem to indicate that there are gaps in levels of access among libraries in different states. while every state has very successful individual libraries in terms of providing quality internet access and individual libraries that could be doing a better job, the state data indicate that library patrons in different parts of the country have variations in the levels and quality of access available to them. uniformity across all states clearly will never be feasible, though, as different states and their patrons have different needs.

table 9. highest levels of types of internet connection provider for public library outlets by state in 2006

internet service provider          library cooperative or network
 1. kentucky         93.5%          1. rhode island     84.7%
 2. iowa             90.9%          2. delaware         79.5%
 3. new hampshire    83.8%          3. west virginia    77.9%
 4. vermont          81.1%          4. wisconsin        71.2%
 5. oklahoma         80.6%          5. massachusetts    54.7%
    wyoming          80.6%          6. minnesota        52.5%
 7. idaho            80.2%          7. ohio             48.9%
 8. montana          78.9%          8. georgia          45.1%
 9. tennessee        78.4%          9. mississippi      41.2%
10. alabama          74.6%         10. connecticut      38.5%
national:            46.4%         national:            26.2%

public libraries and internet access | jaeger, bertot, mcclure, and rodriguez 11

for example, tables 1, 2, and 3 all display features that indicate high-level internet access in public libraries: high numbers of hours open, high numbers of public access workstations, and high levels of wireless internet access. three states (maryland, new jersey, and virginia) appear in the top ten in these three lists for 2006. further, connecticut, florida, illinois, and indiana each appear in the top ten of two of these three lists.
these states clearly are making successful efforts at the state and local levels to guarantee widespread access to public libraries and the internet access they provide.

gaps in access are also evident among different regions of the country. the highest percentages of library systems with increases in total operating budgets were concentrated in states along the east coast, with seven of the states listed in table 7 being mid-atlantic or northeastern states. in contrast, the highest percentages of library systems relying on e-rate funding in table 10 were concentrated in the midwest and the southeast. further, the numbers in tables 6 and 7 showed far greater increases in the total operating budgets than in the information technology budgets in all regions of the country. as a result, public libraries in all parts of the united states may need to seek alternate sources of funding specifically for information technology costs. as can be seen in table 3, the leading states in adoption of wireless technology are concentrated in the northeast and mid-atlantic. in table 11, southern states, particularly louisiana and mississippi, had many of the highest percentages of libraries not offering any internet training to patrons. it is important to note with data from the gulf states, however, that the effects of hurricane katrina may have had a large impact on the results reported.

one key difference in a number of states seems to be the presence of a state library actively working to coordinate access issues. this particular study was not able to address such issues, but evidence indicates that the state library can play a significant role in ensuring sufficiency of internet access in public libraries in a state. maine, west virginia, and wisconsin all have state libraries that apply for and distribute funds at the statewide level to ensure all public libraries, regardless of size or geography, have high-end connections to the internet.
the state library of west virginia, for example, applied for e-rate funding for telecommunications costs on a statewide basis and received 79.1 percent funding in 2006, using such funding not only to cover connection costs for public libraries but also to provide information technology and network support to libraries.

table 10. highest percentages of public library systems receiving e-rate discounts by category and state in 2006

internet connectivity         telecommunications services      internal connection costs
1. louisiana      89.2%       1. mississippi       92.6%       1. mississippi      29.6%
2. indiana        70.8%       2. south carolina    89.4%       2. minnesota        22.6%
3. mississippi    63%         3. louisiana         79.5%       3. arizona          19.3%
4. minnesota      50.5%       4. west virginia     79.1%       4. west virginia    14.2%
5. tennessee      44.7%       5. maryland          76.2%       5. louisiana        12.3%
national:         22.4%       national:            39.6%       national:            4.4%

table 11. highest levels of public library systems not offering patron information technology training services by state in 2006

 1. louisiana         48.7%
 2. mississippi       40.7%
 3. arkansas          39.6%
 4. alaska            36%
 5. arizona           34.8%
 6. georgia           34.5%
 7. new hampshire     32.8%
 8. south carolina    31.1%
 9. tennessee         30%
10. idaho             29%
national:             21.4%

another example of a successful statewide effort to provide sufficient internet access can be found in maryland. in the early 1990s, maryland public library administrators agreed to let the state library use library services and technology act (lsta) funds to build the sailor network, connecting all public libraries in the state.12 this network predates the e-rate program by a number of years, but having an established statewide network has helped the state library to coordinate applications, funding, and services among the libraries of the state. the state budget in maryland also provides other types of funding to support the state library, the library systems, and the library outlets in providing internet access.
in states such as georgia, maryland, maine, west virginia, and wisconsin, the provision of internet access in public libraries is shaped not only by library outlets and library systems, but by the state libraries as well. in these and other states, the efforts of the state library appear to be reflected in the data from this study.

a final area for discussion is the degree to which librarians understand how much bandwidth is required to meet the needs of library users, how to measure the actual bandwidth that is available in the library, and how to determine the degree to which that bandwidth is sufficient. indeed, many providers advertise that their connection speeds are “up to” a certain speed when in fact they deliver considerably less.13 the authors have offered an analysis of determining the quality and sufficiency of bandwidth elsewhere.14 suffice it to say that there is considerable confusion as to “how good is good enough” for bandwidth and connection quality. these types of issues frame understandings of how connected libraries in different states are and whether those connections are sufficient to meet the needs of patrons.

■ future research

while the experience of individual patrons in particular libraries will vary widely in terms of whether the access available is sufficient to meet their information needs, the fact that the state data indicate variations in the levels and quality of access among some states and regions of the country is worthy of note. an important area of subsequent research will be to investigate these differences, determine the reasons for them, and develop strategies to alleviate these apparent gaps in access. investigating these differences requires consideration of local and situational factors that may affect access in one library but perhaps not in another. for example, one public library may have access to an internet provider that offers higher speed connectivity that is not available in another location.
the range of the possible local and situational factors affecting access and services is extensive. a preliminary list of the factors that contribute to being a successfully networked public library is described in greater detail in the 2006 study.15 however, additional investigation is needed into the degree to which these factors affect access, quality of service, and user satisfaction.

the personal experience of the authors in working with various state library agencies suggests the need for additional research that explores relationships between the states ranked highest in areas such as connectivity and workstations and the programs and services offered by their state library agencies. one state library, for example, has a specific program that works directly with individual public libraries to assist them in completing the various e-rate forms. is there a link between that state library providing such assistance and the state’s public libraries receiving more e-rate discounts per capita than other states? this is but one example where investigating the role of the state library and comparing those roles and services to the rankings may be useful. perhaps a number of “best practices” could be identified that would assist the libraries in other states in improving access and services.

in terms of research methods, future research on the topics identified in this article may need to draw upon strategies other than a national survey and on-site focus groups/interviews. the 2006 study, for the first time, included site visits and interviews and produced a wealth of data that supplemented the national survey data.16 on-site analysis of actual connection speeds in a sample of public libraries is but one example. the degree to which survey respondents know the connection speeds at specific workstations is unclear. simply because a t-1 line comes in the front door, it is not necessarily the speed available at a particular workstation.
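to make the t-1 point concrete, the following is a minimal arithmetic sketch rather than a measurement method. the nominal t-1 line rate of 1.544 mbit/s is standard; the even split across active workstations, the overhead fraction, and the function name are illustrative assumptions that do not come from the study.

```python
T1_KBPS = 1544.0  # nominal T-1 line rate: 1.544 Mbit/s


def per_workstation_kbps(line_kbps, active_workstations, overhead_fraction=0.0):
    """Estimate the share of a line available to each active workstation.

    Assumes (hypothetically) that usable capacity is split evenly; real
    networks rarely divide bandwidth this neatly.
    """
    if active_workstations < 1:
        raise ValueError("at least one active workstation is required")
    if not 0.0 <= overhead_fraction < 1.0:
        raise ValueError("overhead_fraction must be in [0, 1)")
    usable_kbps = line_kbps * (1.0 - overhead_fraction)
    return usable_kbps / active_workstations


if __name__ == "__main__":
    for stations in (1, 5, 10, 20):
        share = per_workstation_kbps(T1_KBPS, stations)
        print(f"{stations:2d} active workstations: about {share:.1f} kbit/s each")
```

under these assumptions, ten simultaneously active workstations would each see roughly a tenth of the line rate, which is why a speed reported for the building says little about the speed at a given seat.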
other methods such as log file analysis or user-based surveys of networked services (as opposed to surveys completed by librarians) may offer insights that could augment the national survey data.

other approaches such as policy analysis may also prove useful in better understanding access, connectivity, and services on a state-by-state basis. there has been no systematic description and analysis of state-based laws and regulations that affect public library internet access, connectivity, and services. the authors are aware of some states that ensure a minimum bandwidth will be provided to each public library in the state and pay for such connectivity. such is not true in other states. thus, a better understanding of how state-based policies and regulations affect access, connectivity, and services may identify strategies and policies that could be used in other states to increase or improve access, connectivity, and services.

the data discussed in this article also point to many other important needs in future research. the frequency with which libraries in certain states rank high in the tables suggests that those states are better able to sustain their libraries in terms of finances and usage. however, additional factors may also be key in the differences among the states. future research needs to consider the internet access in public libraries in different states in relation to other services offered by libraries and to uses of the internet connectivity in libraries, including types of online content and services available, types of training
available, community outreach, other collection issues, staffing in relation to technology, and other factors.

■ conclusion

internet and public computing access is almost universally available in public libraries in the united states, but there are differences in the amounts of access, the kinds of access, and sufficiency of the access available to meet patron demands. now that virtually every public library has an internet connection, provides internet access to patrons, and offers a range of public computing access, the attention of public libraries must refocus on ensuring that every library can provide sufficient internet and computing access to meet patron needs. the issues to address include being open to the public a sufficient number of hours, having enough internet access workstations, having adequate wireless access, and having sufficient speed and quality of connectivity to meet the needs of patrons. if a library is not able to provide sufficient access now, the situation will only continue to grow more difficult as the content and services on the internet continue to be more demanding of technical and bandwidth capacity.

public libraries must also focus on increasing provision of internet access in light of federal, state, and local governments recently adding yet another significant level of services to public libraries by “requesting” that they provide access to and training in using numerous e-government services. such e-government services include social services, prescription drug plans, health care, disaster support, tax filing, resource management, and many other activities.17

the maintenance of traditional services, the addition and expansion of public access computing and networked services, and now the addition of a range of e-government services tacitly required by federal, state, and local governments, in combination, risk stretching public library resources beyond their ability to keep up.
to avoid such a situation, public libraries, library systems, and state governments must learn from the library outlets, systems, and states that are more successfully providing sufficient internet access to their patrons and their communities. among these leaders, there are likely models for success that can be identified for the benefit of other outlets, systems, and states. beyond the lessons that can be learned from the most connected, however, there are also practical and logistical issues that remain beyond the control of an individual library and sometimes the entire state, such as geographical and economic factors.

ultimately, the analysis of state data offered here suggests that much can be learned from one state that might assist another state in terms of improving connectivity, access, and services. while the data suggest a number of significant discrepancies among the various states, it may be that a range of best practices can be identified from those more highly ranked states that could be employed in other states to improve access, connectivity, and services. staff at the various state library agencies may wish to discuss these findings and develop strategies that can then improve access nationwide.

providing access to the internet is now as established a role for public libraries as providing access to books. patrons and communities, and now government organizations, rely on the fact that internet access will be available to everyone who needs it. while there are other points of access to the internet in some communities, such as school media centers and community technology centers, the public library is often the only public access point available in many communities.18 public libraries across the states must continually work to make sure the access they provide meets all of these needs.
■ acknowledgements

the 2004 and 2006 public libraries and the internet studies were funded by the american library association and the bill & melinda gates foundation. drs. bertot, mcclure, and jaeger served as the co-principal investigators of the study. more information on these studies is available at http://www.ii.fsu.edu/plinternet/.

references and notes

1. john carlo bertot, charles r. mcclure, and paul t. jaeger, public libraries and the internet 2004: survey results and findings (tallahassee, fla.: information institute, 2005), http://www.ii.fsu.edu/plinternet_reports.cfm; john carlo bertot et al., public libraries and the internet 2006: study results and findings (tallahassee, fla.: information institute, 2006), http://www.ii.fsu.edu/plinternet_reports.cfm (accessed mar. 31, 2007).
2. bertot et al., public libraries and the internet 2006.
3. john carlo bertot and charles r. mcclure, “assessing the sufficiency and quality of bandwidth for public libraries,” information technology and libraries 26, no. 1 (2007): 14–22.
4. john carlo bertot et al., “drafted: i want you to deliver e-government,” library journal 131, no. 13 (2006): 34–39; john carlo bertot et al., “public access computing and internet access in public libraries: the role of public libraries in e-government and emergency situations,” first monday 11, no. 9 (2006), http://www.firstmonday.org/issues/issue11_9/bertot/ (accessed mar. 31, 2007).
5. ibid.; paul t. jaeger et al., “the 2004 and 2005 gulf coast hurricanes: evolving roles and lessons learned for public libraries in disaster preparedness and community services,” public library quarterly (in press).
6. there are actually nearly 17,000 service outlets in the united states. however, the sample frame eliminated bookmobiles as well as library outlets that the study team could neither geocode nor calculate poverty measures for.
additional information on the methodology is available in the study report at http://www.ii.fsu.edu/plinternet/ (accessed mar. 31, 2007).
7. bertot et al., public libraries and the internet 2006.
8. bertot, mcclure, and jaeger, public libraries and the internet 2004; bertot et al., public libraries and the internet 2006. the 2004 survey instrument is available at http://www.ii.fsu.edu/projectfiles/plinternet/plinternet_appendixa.pdf. the 2006 survey instrument is available at http://www.ii.fsu.edu/projectfiles/plinternet/2006/appendix1.pdf (accessed mar. 31, 2007).
9. bertot et al., public libraries and the internet 2006.
10. paul t. jaeger, charles r. mcclure, and john carlo bertot, “the e-rate program and libraries and library consortia, 2000-2004: trends and issues,” information technology and libraries 24, no. 2 (2005): 57–67.
11. bertot, mcclure, and jaeger, public libraries and the internet 2004; bertot et al., public libraries and the internet 2006; john carlo bertot, charles r. mcclure, and paul t. jaeger, “public libraries struggle to meet internet demand: new study shows libraries need support to sustain online services,” american libraries 36, no. 7 (2005): 78–79.
12. john carlo bertot and charles r. mcclure, sailor assessment final report: findings and future sailor development (baltimore, md.: division of library development and services, 1996).
13. matt richtel and ken belson, “not always full speed ahead,” new york times, nov. 18, 2006.
14. bertot and mcclure, “assessing the sufficiency,” 14–22.
15. bertot et al., public libraries and the internet 2006.
16. ibid.
17. bertot et al., “drafted: i want you to deliver e-government”; bertot et al., “public access computing and internet access in public libraries”; jaeger et al., “the 2004 and 2005 gulf coast hurricanes.”
18. paul t. jaeger et al., “the policy implications of internet connectivity in public libraries,” government information quarterly 23, no. 1 (2006): 123–41.
content-based information retrieval and digital libraries | wan and liu 41

content-based information retrieval and digital libraries

this paper discusses the applications and importance of content-based information retrieval technology in digital libraries. it generalizes the process and analyzes current examples in four areas of the technology.
content-based information retrieval has been shown to be an effective way to search for the type of multimedia documents that are increasingly stored in digital libraries. as a good complement to traditional text-based information retrieval technology, content-based information retrieval will be a significant trend for the development of digital libraries.

after several decades of development, digital libraries are no longer a myth. in fact, some general digital libraries such as the national science digital library (nsdl) and the internet public library are widely known and used. the advance of computer technology makes it possible to include a colossal amount of information in various formats in a digital library. in addition to traditional text-based documents such as books and articles, other types of materials, including images, audio, and video, can also be easily digitized and stored. therefore, how to retrieve and present this multimedia information effectively through the interface of a digital library becomes a significant research topic.

currently, there are three methods of retrieving information in a digital library. the first and the easiest way is free browsing. by this means, a user browses through a collection and looks for desired information. the second method, the most popular technique used today, is text-based retrieval. through this method, textual information (full text of text-based documents and/or metadata of multimedia documents) is indexed so that a user can search the digital library by using keywords or controlled terms. the third method is content-based retrieval, which enables a user to search multimedia information in terms of the actual content of image, audio, or video (marques and furht 2002). some content features that have been studied so far include color, texture, size, shape, motion, and pitch.
while some may argue that text-based retrieval techniques are good enough to locate desired multimedia information as long as it is assigned proper metadata or tags, words are sometimes not sufficient to describe what is in a person’s mind. imagine a few examples: a patron comes to a public library with a picture of a rare insect. without expertise in entomology, the librarian won’t know where to start if only a text-based information retrieval system is available. however, with the help of content-based image retrieval, the librarian can upload the digitized image of the insect to an online digital image library of insects, and the system will retrieve similar images with detailed descriptions of this insect. similarly, a patron has a segment of music audio, about which he or she knows nothing but wants to find out more. by using a content-based audio retrieval system, the patron can get similar audio clips with detailed information from a digital music library, and then listen to them to find an exact match. this procedure will be much easier than doing a search on a text-based music search system. it is definitely helpful if a user can search this non-textual information by styles and features.

in addition, the advance of the world wide web brings some new challenges to traditional text-based information retrieval. while today’s web-based digital libraries can be accessed around the world, users with different language and cultural backgrounds may not be able to do effective keyword searches of these libraries. content-based information retrieval techniques will increase the accessibility of these digital libraries greatly, and this is probably a major reason it has become a hot research area in the past decade.

ideally, a content-based information retrieval system can understand the multimedia data semantically, such as the objects it contains and the categories to which it belongs. therefore, a user is able to submit semantic queries and retrieve matched results.
however, extracting high-level or semantic features of multimedia information remains a great difficulty for current computer technology. most projects still focus on lower-level features, such as color, texture, and shape. simply put, a typical content-based information retrieval system works in this way: first, for each multimedia file in the database, certain feature information (e.g., color, motion, or pitch) is extracted, indexed, and stored. second, when a user composes a query, the feature information of the query is calculated as vectors. finally, the system compares the similarity between the feature vectors of the query and multimedia data, and retrieves the best matching records. if the user is not satisfied with the retrieved records, he or she can refine the search results by selecting the most relevant ones to the search query, and repeat the search with the new information. this process is illustrated in figure 1. the following sections will examine some existing content-based information retrieval techniques for most common information formats (image, audio, and video) in digital libraries, as well as their limitations and trends.

gary (gang) wan (gwan@tamu.edu) is a science librarian and assistant professor, and zao liu (zliu@tamu.edu) is a distance learning librarian and assistant professor at sterling c. evans library, texas a&m university, college station, texas.

gary (gang) wan and zao liu
information technology and libraries | march 2008

■ content-based image retrieval

there have been a large number of different content-based image retrieval (cbir) systems proposed in the last few years, either building on prior work or exploring novel directions. one similarity among these systems is that most perform feature extraction as the first step in the process, obtaining global image features such as color, shape, and texture (datta et al., 2005).
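as a minimal sketch of the three-step process outlined in the introduction (extract and index features, turn the query into a feature vector, rank stored items by similarity), the python example below uses a coarse color histogram as the only feature and cosine similarity as the measure. both choices, along with the function names and the toy pixel lists, are assumptions made for illustration; none of the systems discussed in this article is claimed to work exactly this way.

```python
import math
from typing import Dict, List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each 0-255


def color_histogram(pixels: Sequence[Pixel], bins_per_channel: int = 4) -> List[float]:
    """Step 1: reduce an image to a normalized, coarse RGB histogram (its feature vector)."""
    step = 256 / bins_per_channel
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Clamp so that the value 255 always falls into the last bin.
        rb, gb, bb = (min(int(c / step), bins_per_channel - 1) for c in (r, g, b))
        hist[rb * bins_per_channel ** 2 + gb * bins_per_channel + bb] += 1.0
    total = sum(hist) or 1.0
    return [count / total for count in hist]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Similarity between two feature vectors; 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_index(collection: Dict[str, Sequence[Pixel]]) -> Dict[str, List[float]]:
    """Extract and store one feature vector per item in the collection."""
    return {name: color_histogram(pixels) for name, pixels in collection.items()}


def search(index: Dict[str, List[float]], query_pixels: Sequence[Pixel],
           top_k: int = 3) -> List[Tuple[str, float]]:
    """Steps 2 and 3: vectorize the query, then rank stored items by similarity."""
    query_vector = color_histogram(query_pixels)
    scored = [(name, cosine_similarity(query_vector, vec)) for name, vec in index.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


if __name__ == "__main__":
    # Hypothetical four-pixel "images", used only to exercise the functions.
    toy_index = build_index({
        "red": [(250, 10, 10)] * 4,
        "blue": [(10, 10, 250)] * 4,
        "mixed": [(250, 10, 10)] * 2 + [(10, 10, 250)] * 2,
    })
    mostly_red_query = [(240, 20, 20)] * 3 + [(20, 20, 240)]
    for name, score in search(toy_index, mostly_red_query):
        print(name, round(score, 3))  # ranks "red" first, then "mixed", then "blue"
```

the refinement loop described in the introduction could be layered on top of this by re-running `search` with a query vector built from the items the user marks as relevant; that step is left out here to keep the sketch short.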
one of the most well-known cbir systems is query by image content (qbic), which was developed by ibm. it uses several different features, including color, sketches, texture, shape, and example images to retrieve images from image and video databases. since its launch in 1995, the qbic model has been employed for quite a few digital libraries or collections. one recent adopter is the state hermitage museum in russia (www.hermitage.ru), which uses qbic for its web-based digital image collection. users can find artwork images by selecting colors from a palette or by sketching shapes on a canvas. the user can also refine existing search results by requesting all artwork images with similar visual attributes. the following screenshots demonstrate how a user can do a content-based image search with qbic technology. in figure 2.1, the user chooses a color from the palette and composes the color schema of artwork he or she is looking for. figure 2.2 shows the artwork images that match the query schema. another example of digital libraries or collections that have incorporated cbir technology is the national science foundation’s international digital library project (www.memorynet.org), a project that is composed of several image collections. the information retrieval system for these collections includes both a traditional text-based search engine and a cbir system called simplicity (semantics-sensitive integrated matching for picture libraries) developed by wang et al. (2001) of pennsylvania state university. from the front page of these image collections, a user can choose to display a random group of images (figure 3.1). below each image is a “similar” button; clicking this allows the user to view a group of images that contain similar objects to the previously selected one (figure 3.2). by providing feedback to the search engine this way, the user can find images of desired objects without knowing their names or descriptions.
content-based information retrieval and digital libraries | wan and liu

figure 1. the general process of content-based information retrieval
figure 2.1. a user query
figure 2.2. the search results for this query

simply put, simplicity segments each image into small regions, extracts several features (such as color, location, and shape) from these small regions, and classifies these regions into some semantic categories (such as textured/nontextured and graph/photograph). when computing the similarity between the query image and images in the database, all these features will be considered and integrated, and best matching results will be retrieved (wang et al., 2001). similar applications of cbir technology in digital libraries include the university of california–berkeley’s digital library project (http://bnhm.berkeley.edu), the national stem digital library (ongoing), and virginia tech’s anthropology digital library, etana (ongoing). while these feature-based approaches have been explored over the years, an emerging new research direction in cbir is automatic concept recognition and annotation. ideally, automatic concept recognition and annotation can discover the concepts that an image conveys and assign a set of metadata to it, thus allowing image search through the use of text. a trusted automatic concept recognition and annotation system can be a good solution for large data sets. however, the semantic gap between computer processors and human brains remains the major challenge in the development of a robust automatic concept recognition and annotation system (datta et al., 2005). a recent example of efforts in this field is li and wang’s alipr (automatic linguistic indexing of pictures - real time, http://alipr.com) project (2006). through a web interface, users are able to search images in several different ways: they may do text searches and provide feedback to the system to find similar images.
users may also upload an image, and the system will perform concept analysis and generate a set of annotations or tags automatically, as shown in figure 4. the system then retrieves images from the database that are visually similar to the uploaded image. in the process of automatic annotation, if the user doesn’t think the tags given by the system are suitable, he or she can input other tags to describe the image. this is also the “training” process for the alipr system. since cbir is the major research area and has the longest history in content-based information retrieval, there are many models, products, and ongoing projects in addition to the above examples. as image collections become a significant part of digital libraries, more attention has been paid to possibilities of providing content-based image search as a complement to existing metadata search.

figure 3.1. a group of random images in the collection
figure 3.2. cbir results
figure 4. alipr’s automatic annotation feature

■ content-based audio retrieval

compared with cbir, content-based audio retrieval (cbar) is relatively new, and fewer research projects on it can be found. in general, existing cbar approaches start from the content analysis of audio clips. an example of this content analysis is extracting basic audio elements, such as duration, pitch, amplitude, brightness, and bandwidth (wold et al., 1996). because of the great difficulties in recognizing audio content, research in this area is less mature than that in content-based image and video retrieval. although no cbar system has been found to be implemented by any digital library so far, quite a few projects provide good prototypes or directions. one good example is zhang and kuo’s (2001) research project on audio classification and retrieval.
the prototype system is composed of three stages: coarse-level audio segmentation, fine-level classification, and audio retrieval. in the first stage, audio signals are semantically segmented and classified into several basic types including speech, music, song, speech with music background, environment sounds, and silence. some physical audio features—such as the energy function, the fundamental frequency, and the spectral peak tracks—are examined in this stage. in the second stage, further classification is conducted for every basic type. features are extracted from the time-frequency representation of audio signals to reveal subtle differences of timbre and pattern among different classes of sounds. based on these differences, the coarse-level segmentation obtained in stage one can be classified to narrower categories. for example, speech can be differentiated into the voices of men, women, and children. finally, in the information retrieval stage, two approaches—query-by-keyword and query-by-example—are employed. the query-by-keyword approach is more like the traditional text-based search system. the query-by-example approach is similar to content-based image retrieval systems where an image can be searched by color, texture, and histogram, and audio clips can be retrieved with distinct features, such as timbre, pitch, and rhythm. this way, a user may choose from a given list of features, listen to the retrieved samples, and modify the input feature set to get more desired results. zhang and kuo’s prototype is a very typical and classic cbar system. it is relatively mature and can be used by large digital audio libraries. more recently, li et al. (2003) proposed a new feature extraction method particularly for music genre classification named daubechies wavelet coefficient histograms (dwchs). dwchs capture the local and global information of music signals simultaneously by computing their histograms. 
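to make the idea of subband histograms more concrete, the python sketch below decomposes a signal into subbands, builds a normalized histogram of the coefficients in each subband, and appends a simple per-subband energy value. it is only an illustration under stated assumptions: it uses the haar wavelet (the simplest member of the daubechies family) rather than whatever filter and depth li et al. actually used, it takes energy to be the mean absolute coefficient, and all function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def haar_step(signal: Sequence[float]) -> Tuple[List[float], List[float]]:
    """One decomposition level: return (low-frequency approximation, high-frequency detail)."""
    approx, detail = [], []
    for i in range(0, len(signal) - 1, 2):
        a, b = signal[i], signal[i + 1]
        approx.append((a + b) / math.sqrt(2))
        detail.append((a - b) / math.sqrt(2))
    return approx, detail


def wavelet_subbands(signal: Sequence[float], levels: int) -> List[List[float]]:
    """Repeatedly split the approximation; return the detail bands plus the final approximation."""
    subbands = []
    current = list(signal)
    for _ in range(levels):
        if len(current) < 2:
            break
        current, detail = haar_step(current)
        subbands.append(detail)
    subbands.append(current)
    return subbands


def histogram(values: Sequence[float], bins: int) -> List[float]:
    """Normalized histogram of the coefficients in one subband."""
    low, high = min(values), max(values)
    width = (high - low) / bins or 1.0  # constant band: everything lands in bin 0
    counts = [0] * bins
    for value in values:
        counts[min(int((value - low) / width), bins - 1)] += 1
    return [count / len(values) for count in counts]


def subband_features(signal: Sequence[float], levels: int = 3, bins: int = 8) -> List[float]:
    """Concatenate, for every subband, its histogram followed by its energy.

    Assumes a non-empty signal. Energy here is the mean absolute coefficient,
    which is an illustrative choice rather than a documented definition.
    """
    features: List[float] = []
    for band in wavelet_subbands(signal, levels):
        features.extend(histogram(band, bins))
        features.append(sum(abs(c) for c in band) / len(band))
    return features


if __name__ == "__main__":
    # A hypothetical one-second, 64-sample sine tone stands in for a music signal.
    tone = [math.sin(2 * math.pi * 4 * n / 64) for n in range(64)]
    print(len(subband_features(tone)))  # 4 subbands x (8 bins + 1 energy) = 36
```

the resulting feature vector could then be passed to any multi-class classifier; that second step is not shown.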
similar to other cbar strategies, this method divides the process of music genre classification into two steps: feature extraction and multi-class classification. the music signal information representing the music is extracted first, and then an algorithm is used to identify the labels from the representation of the music sounds with respect to their features. since the decomposition of audio signal can produce a set of subband signals at different frequencies corresponding to different characteristics, li et al. (2003) proposed a new methodology, the dwchs algorithm, for feature extraction. with this algorithm, the decomposition of the music signals is obtained at the beginning, and then a histogram of each subband is constructed. hence, the energy for each subband is computed, and the characteristics of the music are represented by these subbands. one finding from this research reveals that this methodology, along with advanced machine learning techniques, has significantly improved accuracy of music genre classification (li et al. 2003). therefore, this methodology potentially can be used by those digital music libraries widely developed in past several years. ■ content-based video retrieval content-based video retrieval (cbvr) is a more recent research topic than cbir and cbar, partly because the digitization technology for video appeared later than those for image and audio. as digital video websites such as youtube and google video become more popular, how to retrieve desired video clips effectively is a great concern. searching by some features of video, such as motion and texture, can be a good complement to the traditional text-based search method. one of the earliest examples is the videoq system developed by chang et al. (1997) of columbia university (www.ctr.columbia.edu/videoq), which allows a user to search video based on a rich set of visual features and spatio-temporal relationships. the video clips in the database are stored as mpeg files. 
through a web interface, the user can formulate a query scene as a collection of objects with different attributes, including motion, shape, color, and texture. once the user has formulated the query, it is sent to a query server, which contains several databases for different content features. on the query server, the similarities between the features of each object specified in the query and those of the objects in the database are computed; a list of video clips is then retrieved based on their similarity values. for each of these video clips, key-frames are dynamically extracted from the video database and returned to the browser. the matched objects are highlighted in the returned key-frame. the user can interactively view these matched video clips by simply clicking on the key-frame. meanwhile, the video clip corresponding to that key-frame is extracted from the video database (chang et al. 1997). figures 5.1–5.2 show an example of a visual search through the videoq system. many other cbvr projects also examine these content features and try to find more efficient ways to retrieve data. a recent example is wang et al.’s (2006) project, vferret, a content-based similarity search tool for continuous archived video. the vferret system segments video data into clips and extracts both visual and audio features as metadata. then a user can do a metadata search or content-based search to retrieve desired video clips. in the first stage, a simple segmentation method is used to split the archived digital video into five-minute video clips. the system then extracts twenty image frames evenly from each of these five-minute video clips for visual feature extraction. additionally, the system splits the audio channel of each clip into twenty individual fifteen-second segments for further audio feature extraction. in the second stage, both audio and visual features are extracted.
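the first-stage arithmetic can be sketched in a few lines of python: a five-minute clip is 300 seconds, so twenty equal audio segments are fifteen seconds each, and one image frame per segment gives twenty evenly spaced frames. the code below is a hypothetical planning helper, not vferret’s implementation; in particular, sampling each frame at the midpoint of its segment is an assumption, as are all names.

```python
from dataclasses import dataclass
from typing import List

CLIP_SECONDS = 300                                    # five-minute clips
SEGMENTS_PER_CLIP = 20                                # twenty frames / audio segments per clip
SEGMENT_SECONDS = CLIP_SECONDS // SEGMENTS_PER_CLIP   # 15 seconds each


@dataclass
class Segment:
    clip_index: int     # which five-minute clip the segment belongs to
    start: float        # audio segment start, in seconds from the start of the video
    end: float          # audio segment end, in seconds
    frame_time: float   # time of the sampled image frame (midpoint: an assumption)


def plan_segments(video_seconds: float) -> List[Segment]:
    """Split a video into five-minute clips, and each clip into fifteen-second segments."""
    segments: List[Segment] = []
    clip_index = 0
    clip_start = 0.0
    while clip_start < video_seconds:
        for i in range(SEGMENTS_PER_CLIP):
            start = clip_start + i * SEGMENT_SECONDS
            if start >= video_seconds:
                break  # the final clip may be shorter than five minutes
            end = min(start + SEGMENT_SECONDS, video_seconds)
            segments.append(Segment(clip_index, start, end, (start + end) / 2))
        clip_index += 1
        clip_start += CLIP_SECONDS
    return segments


if __name__ == "__main__":
    plan = plan_segments(600)          # a hypothetical ten-minute recording
    print(len(plan), plan[0], plan[-1])
```

each planned segment then pairs one sampled frame with one stretch of audio, which is the unit the feature extraction stage would operate on.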
for visual features, the color element is used as the content feature. for audio features, 154 audio features originally used by ellis and lee (2004) to describe audio segments are computed. for each fifteen-second video segment, the visual feature vector extracted from the sample image and the audio feature vector extracted from the corresponding audio segment are combined into a single feature vector. in the information retrieval stage, the user submits a video clip query at first, then its feature vector is computed and compared with that of video clips in the database, and the most similar clips are retrieved (wang et al. 2006). similar projects in this area include carnegie mellon university’s informedia digital video library (www.informedia.cs.cmu.edu) and muvis of finland’s tampere university of technology (http://muvis.cs.tut.fi/index.html).

figure 5.1. the user composes a query
figure 5.2. search results for the sample query

■ content-based information retrieval for other digital formats

with the advance of digitization technology, the content and formats of digital libraries are much richer than before. they are not limited to text, image, audio, and video. some new formats of digital content are emerging. digital libraries of 3-d objects are good examples. since 3-d models have arbitrary topologies and cannot be easily “parameterized” using a standard template as in the case for 2-d forms (bustos et al. 2005), content-based 3-d model retrieval is a more challenging research topic than other multimedia formats discussed earlier. so far, four types of solutions (primitive-based, statistics-based, geometry-based, and view-based) have been found (bimbo and pala 2006). primitive-based solutions represent 3-d objects with a basic set of parameterized primitive elements. parameters are used to control the shape of each primitive element and to fit each primitive element with a part of the model. with statistics-based approaches, shape descriptions based on statistical models are created and measured. geometry-based methods, however, use geometric properties of the 3-d object and their measures as global shape descriptors. for view-based solutions, a set of 2-d views of the model and descriptors of their content are used to represent the 3-d object shape (bimbo and pala 2006). another novel example is moustakas et al.’s (2005) project on 3-d model search using sketches. in the experimental system, the vector of geometrical descriptors for each 3-d model is calculated during the feature extraction stage. in the retrieval stage, a user can initially use one of the sketching interfaces (such as the virtual reality interface or by using an air mouse) to sketch a 2-d contour of the desired 3-d object. the 2-d shape is recognized by the system, and a sample primitive is automatically inserted in the scene. next, the user defines other elements that cannot be described by the 2-d contour, such as the height of the object, and manipulates the 2-d contour until it reaches its target position. the final query is formed after all the primitives are inserted. finally, the system computes the similarities between the query model and each 3-d model in the database, and renders the best matching records. an online demonstration can be found for a european project specifically designed for a 3-d digital museum collection, sculpteur (www.sculpteurweb.org). from its web-based search interface, a user can choose to do a metadata search or content-based search for a 3-d object. the search strategy here is somewhat similar to that in some cbir systems: the user can upload a 3-d model in vrml formats, then select a search algorithm (such as similar color, texture, etc.) to perform a search within a digital collection of 3-d models.
as 3-d computer visualization has been widely used in a variety of areas, there are more research projects focusing on the content-based information retrieval techniques for this new multimedia format.

■ conclusion

there is no doubt that content-based information retrieval technology is an emerging trend for digital library development and will be an important complement to the traditional text-based retrieval technology. the ideal cbir system can semantically understand the information in a digital library, and render users the most desirable data. however, the machine understanding of semantic information remains a great difficulty. therefore, most current research projects, including those discussed in this paper, deal with the understanding and retrieval of lower-level features or physical features of multimedia content. certainly, as related disciplines such as computer vision and artificial intelligence keep developing, more research will be done on higher-level feature-based retrieval. in addition, the growing varieties of multimedia content in digital libraries have also brought many new challenges. for instance, 3-d models have now become important components of many digital libraries and museums. content-based retrieval technology can be a good direction for this type of content, since the shapes of these 3-d objects are often found more effectively if the user can compose the query visually. new cbir approaches need to be developed for these novel formats. furthermore, most cbir projects today tend to be web-based. by contrast, many projects were based on client applications in the 1990s. these web-based cbir tools will have significant influence on digital libraries or repositories, as most of them are also web-based. particularly in the age of web 2.0, some large digital repositories, such as flickr for images and youtube and google video for video, are changing people’s daily lives.
the implementation of cbir will be a great benefit to millions of users. since the nature of cbir is to provide better search aids to end users, it is extremely important to focus on the actual user’s needs and how well the user can use these new search tools. it is surprising to find that little usability testing has been done for most cbir projects. such testing should be incorporated into future cbir research before it is widely adopted.

bibliography

bimbo, a. and p. pala. 2006. content-based retrieval of 3-d models. acm transactions on multimedia computing, communications, and applications 2, no. 1: 20–43.
bustos, b., et al. 2005. feature-based similarity search in 3-d object databases. acm computing surveys 37, no. 4: 345–387.
chang, s., et al. 1997. videoq: an automated content based video search system using visual cues. in proceedings of the 5th acm international conference on multimedia, e. p. glinert, et al., eds. new york: acm.
datta, r., et al. 2005. content-based image retrieval: approaches and trends of the new age. in proceedings of the 7th international workshop on multimedia information retrieval, in conjunction with acm international conference on multimedia, h. zhang, j. smith, and q. tian, eds. new york: acm.
ellis, d. and k. lee. 2004. minimal-impact audio-based personal archives. in proceedings of the 1st acm workshop on continuous archival and retrieval of personal experiences (carpe), j. gemmell, et al., eds. new york: acm.
li, t., et al. 2003. a comparative study on content-based music genre classification. in proceedings of the 26th annual international acm sigir conference on research and development in information retrieval, c. clarke, et al., eds. new york: acm.
li, j. and j. wang. 2006. real-time computerized annotation of pictures. in proceedings of the 14th annual acm international conference on multimedia, k. nahrstedt, et al., eds. new york: acm.
marques, o. and b. furht. 2002. content-based image and video retrieval. norwell, mass.: kluwer.
moustakas, k., et al. 2005. master-piece: a multimodal (gesture+speech) interface for 3d model search and retrieval integrated in a virtual assembly application. proceedings of the enterface: 62–75.
wang, j., et al. 2001. simplicity: semantics-sensitive integrated matching for picture libraries. ieee trans. pattern analysis and machine intelligence 23, no. 9: 947–963.
wang, z., et al. 2006. vferret: content-based similarity search tool for continuous archived video. in proceedings of the 3rd acm workshop on continuous archival and retrieval of personal experiences, k. maze et al., eds. new york: acm.
wold, e., et al. 1996. content-based classification, search, and retrieval of audio. ieee multimedia 3, no. 3: 27–36.
zhang, t. and c. kuo. 2001. content-based audio classification and retrieval for audiovisual data parsing. norwell, mass.: kluwer.

index to advertisers

lita national forum: cover 2
lita guides: cover 3
lita workshops: cover 4

statement of ownership, management, and circulation

information technology and libraries, publication no. 280-800, is published quarterly in march, june, september, and december by the library information and technology association, american library association, 50 e. huron st., chicago, illinois 60611-2795. editor: john webb, librarian emeritus, washington state university libraries, pullman, wa 99164-5610. annual subscription price, $55. printed in u.s.a. with periodical-class postage paid at chicago, illinois, and other locations. as a nonprofit organization authorized to mail at special rates (dmm section 424.12 only), the purpose, function, and nonprofit status for federal income tax purposes have not changed during the preceding twelve months.
extent and nature of circulation (average figures denote the average number of copies printed each issue during the preceding twelve months; actual figures denote actual number of copies of single issue published nearest to filing date: june 2007 issue). total number of copies printed: average, 5,354; actual, 5,280. sales through dealers and carriers, street vendors, and counter sales: average, 0; actual, 462. paid or requested mail subscriptions: average, 4,283; actual, 4,193. free distribution (total): average, 292; actual, 292. total distribution: average, 5,028; actual, 4,947. office use, leftover, unaccounted, spoiled after printing: average, 326; actual, 333. total: average, 5,354; actual, 5,280. percentage paid: average, 94.19; actual, 94.10.

statement of ownership, management, and circulation (ps form 3526, september 2007) filed with the united states post office postmaster in chicago, october 1, 2007.

journal of library automation vol. 2/2 june, 1969

book reviews

the marc pilot project; final report . . . prepared by henriette d. avram. washington, library of congress, 1968. 183 pp. $3.50.

marc manuals used by the library of congress prepared by the information systems office, library of congress. chicago, american library association, 1969. 335 pp. $7.50.

the first of these two important publications is a technical report of high quality. its purpose is to describe in detail the history, objectives, system design, operation, costs, and findings of the experimental pilot project. it attains its purpose admirably; this report will long be the classic document on the first major experiment of the use of a machine readable cataloging record by a group of libraries. mrs. avram has included sufficient detail to enable the reader to understand exactly how the project operated. procedures could be reproduced from the information given.
for the many who will be using marc i or marc ii data for experiment or operations, complete information on both formats is included. four calculations of input costs yielded unit costs ranging from $2.26 to $1.31. if the cost of computer processing is subtracted from $1.31, the result is $.99, or double the approximate average of conversion costs reported from several other centers. reports from seventeen participants constitute an appendix. some accomplished nothing, others experimented with the tapes, while a third used the data in routine operations. of the participants' reports, those from the university of toronto library and the washington state library are the most detailed and contain most useful statistical data. marc manuals is an indispensable publication for any library contemplating use of, or using, marc ii tapes. the manuals are four: 1) "subscriber's guide to the marc distribution service," 76 pp.; 2) "data preparation manual: marc editors," 218 pp.; 3) "transcription manual," 22 pp.; and 4) "computer and magnetic tape unit usability study," 18 pp. this publication is the master guide to use of marc ii records. the government printing office required three-quarters of a year to produce the marc pilot project while anxious users waited. the american library association needed hardly a month to produce the marc manuals. admittedly this publication performance is new for ala, but it should receive long and loud applause. computerization has introduced a factor of timeliness into publication, and it is gratifying that ala recognizes the fact.

frederick g. kilgour

bibliography of research relating to the communication of scientific and technical information. edited by jay hillary kelley, charles l. bernier and judith c. leondar. bureau of information sciences research, graduate school of library service, rutgers, the state university. rutgers university press, new brunswick, n.j. 1967. 3510 pages.
do we need a review of a bibliography already two years old? the editor of jla says yes. more importantly, can we find good use for the bibliography it reviews? in this case, yes. its scope is both less and more than the title indicates: less, because "communication" here means documentation and excludes direct, immediate communication; more, because it extends far beyond merely the documentation of science and technology to information processes per se, though not to all of information science. psycholinguistics and epistemology seem to be ignored, and logic is given short shrift. from the seven existing major bibliographies listed at the end of this review, and from twenty abstracting and indexing services, and nearly 300 journals, the compilers have selected items published during the years 1955-1965, in nine categories : 1 ) research results, 2) new theories, 3) identifiable breakthroughs, 4) incremental gains in information sciences, services, and systems, 5) developments identified as new by the authors reporting them, 6) comprehensive reviews, 7) bibliographies, 8) evaluative articles, and 9) directories to current research. excluded are items of purely historical, biographical, speculative or entertainment value, as well as bibliographies or literature surveys of fields outside information science (is) . these criteria, and the book's subject classification scheme, are themselves useful and they reflect considerable thought, even though the user is sure to find instances where: 1) items included do not seem to measure up to the criteria or 2) he will disagree with the structure of the classification scheme. however, these faults are inherent in the bibliographic activity, inexact science that it is. the introduction offers as the project's rationale some interesting and provocative hypotheses. one relates to the epidemic nature of progress in is-i.e., that progress comes through a few identifiable discoveries. 
more basic is their assumption that "well-known bibliographies, reviews, workers, and organizations were identifiable and needed representation." (it is possible to argue that if identifiable through literature, they are likely to be already identified, at least by the people who really need them, and that a general bibliography is not needed. but the worker oriented to the literature of is, as documentalist, librarian, or as teacher, student or researcher in is, will probably be glad anyway to have so much of it in one place.) selection is slanted to the most current work, on the assumption that viable earlier contributions will be identified through citations. the editors postulated that "plagiarism, duplication, and repetition of work were so rampant that many potential items for the bibliography could be rejected on this basis." by creating in advance a classification scheme for is, and then placing the items selected in the classes, they predicted that it would be possible to identify gaps in the field, where more research is needed. the means for doing so are not discussed and left unanswered is the question: how do we determine the right amount of publication for each class? the result is a bibliography of some 3700 items, chosen from about 30,000 considered, intended as a guide rather than an exhaustive compilation. if the judgments of the editors stand the test of time, having less is more. the prospect of obsolescence, however, haunts this bibliography as it does all others, and it highlights the need for bibliographic tools that can be more easily updated by both addition and purging, like the ill-fated automation reporter, a looseleaf service in this field, discontinued for lack of support. for a profession which seeks to solve other people's information problems, is people are often slow to get the word.
but this is an indictment of the whole field, not specifically the group at rutgers, who have provided a useful tool, if not the most useful one imaginable. efficient use of the book is likely to be impaired by its appearance. photo-offset reproduction of greatly reduced typescript is not ideal for a reference book such as this. where economy dictates its use, a little imagination and quality control, not evident here, can do a great deal to overcome its faults. here, the printing is too light. there is nothing done to set off such elements as author or title. item numbers appear at the right margin in all cases; hence they are half the time buried in the gutter. the ratio of pages of index to text is appropriately high-though of course no one knows what an optimum would be. there is about a page of author index to four pages of bibliography, and a slightly greater proportion of subject indexing. shortcomings aside, this promises to be a useful bibliography. the editors do not make it clear if they intend it to be more than that-for example, the basis for a study of formal characteristics is is literature. if not, they should consider doing so. the seven major bibliographies mentioned above were completely searched for this bibliography. they are: balz, c. f. and r. h. stanwood, compilers. literature on information retrieval and machine translation. ibm, 117 pp., 2965 ref., 1962. janaske, p. c., ed. information handling and science, information, a select bibliography 1957-1961. washington, d. c., american institute of biological sciences, 1121 ref., 1962. national bureau of standards, research information center and advisory service on information processing (ricasip) [computer printout of references and indexes] washington, d. c., national bureau of standards, 11 parts, approximately 18,000 ref., june 16, and july 15, 1965. book reviews 99 neeland, f., ed. a bibliography on information science and technology for 1965. 
santa monica, calif., systems development corp., 3 parts 1750 ref., 1965.
snodey, s. r., compiler. information retrieval: systems and technology, a literature survey. north american aviation, inc., space and systems div., 272 pp. 1914 ref. (sid 63-199), jan. 15, 1963.
spangler, m., compiler & ed. general bibliography on information storage and retrieval. phoenix, general electric co., computer dept., 1550 ref., 1962.
zell, h. m. and r. j. machesney, compilers & ed. an international bibliography of non-periodical literature on documentation and information. oxford, robert maxwell & co. ltd., 1555 ref., 1965.

joseph c. donohue

evaluation of the medlars demand search service, by f. w. lancaster. u. s. department of health, education and welfare, washington, d. c., january 1968. 276 pp.

medlars, a computer-based information storage and retrieval service of the medical literature, represents a very significant effort in the management of the information explosion in the health sciences. the medlars system in itself is quite complex and this study represents an attempt to evaluate the effectiveness of the storage and retrieval from the data base which now numbers more than 800,000 citations from 2,300 journals from all over the world dating since january 1964. the study was designed to evaluate the factors related to the requirements of the user: coverage, recall power, precision, response time, format; and the effort that the user must expend to evoke a satisfactory response from the system. emphasis in this report was upon recall and precision. the study was based on 25 to 30 retrieved citations, the effectiveness of which was evaluated by the users. of 299 searches studied, the system was operating at 57.7% recall of the major relevant citations from the available data base, and 54.4% precision as judged relevant by the requesters. the more comprehensive the recall, the less precise is the output.
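the recall and precision figures quoted above follow the usual set-based definitions: recall is the share of relevant citations that were retrieved, and precision is the share of retrieved citations that were judged relevant. a minimal python sketch of that arithmetic is given here; the citation identifiers are hypothetical and the function is an illustration, not the procedure used in the medlars study.

```python
def recall_precision(retrieved, relevant):
    """Return (recall, precision) for one search.

    retrieved: citation ids the system returned (hypothetical ids).
    relevant:  citation ids judged relevant for the request.
    """
    retrieved = set(retrieved)
    relevant = set(relevant)
    hits = retrieved & relevant  # relevant citations that were actually retrieved

    # Guard against empty sets so the ratios are always defined.
    recall = len(hits) / len(relevant) if relevant else 0.0
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    return recall, precision


# Example: four citations retrieved, three known relevant, two in common.
recall, precision = recall_precision(["c1", "c2", "c3", "c4"], ["c1", "c2", "c9"])
```

widening a search usually adds both hits and misses to the retrieved set, which is why the two ratios tend to move in opposite directions.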
in addition to a determination of effectiveness, equally important was analysis of the factors contributing to a failure. the principal causes were related to the failure of the index language, the indexing subsystem, searching, and the interaction between the user and the system. the author concludes with a number of considerations for enhancement of the effectiveness of the medlars system. the author and the advisory committee are to be commended upon the depth of their evaluation, the objectivity of their appraisal and their thoughtful suggestions for improvement. such a complex information system should be under continuous self-appraisal if it is to meet the urgent needs of the scientist as he deals with the burgeoning health sciences information.

john a. prior

library effectiveness: a systems approach, by philip m. morse. the m.i.t. press, cambridge, massachusetts, 1969. 207 pp. $10.00.

as professor of theoretical physics at the massachusetts institute of technology, as a director of m.i.t.'s computing center, operations research center, and project mac, and as the first president of the operations research society of america, philip morse has been a key figure in the many scientific developments which are now playing such an important role in the design of information systems. his abiding interest in the analysis and improvement of libraries is less well-known, and it is fortunate that he has made available a detailed account of his seminal work in this area. the present book had its origins in a series of student projects which used the m.i.t. library as a laboratory for the application of operations research methods. morse has selected several mathematical models for exposition with ample verbal explanation of their theoretical implications and their practical application in explaining and predicting user behavior in the m.i.t. science library.
the number and kinds of tasks performed by library visitors are shown to follow a geometric-multinomial pattern, not unlike a game of craps. the essentially random demand for, and utilization of, library services is shown to give rise to a queuing or interference situation not unlike a telephone switchboard, where models are available to help predict the effect of providing duplicate services, usage restrictions, and reservations, and to help account for the possibilities of the user's balking or becoming discouraged. finally, the random usage of books is shown to have a mean bias with age, especially in the early years, which can be modelled by a markov chain whereby book usage settles down in an exponential fashion to some residual or "steady state" level of usage in old age. the model is used to examine book retirement policies. in all of these models, approaches employing probability are emphasized, but the relationships are kept simple enough to allow for meaningful comparisons and combinations of different classes of users and library materials. some of the observations morse is able to make about the differences among biologists, chemists, mathematicians, and physicists as library users are among the most interesting results of his analysis. unfortunately, the absence of statistical tests of significance makes it necessary to accept many of these results as useful hypotheses in need of further validation. on page 141, morse says that he anticipates "comments that are sure to be made about the cavalier way we have handled the model and the data. . . . our object was to arrive at a model simple enough so results could be obtained graphically or by slide rule. accuracy is not often important in reaching policy decisions: order-of-magnitude figures are far better than none. . . .
but, as the library becomes more 'mechanized' or 'computerized' these data will become enormously easier to collect, if the computer system is designed to gather the needed data" (author's italics). he goes on to say: "it is the author's belief, based on discouraging experience, that neither the computer experts nor the librarian (for different reasons) really know what data would be useful for the librarian to have collected, analyzed, and displayed, so he can make decisions with some knowledge of what the decision implies. what is needed before the computer designs are frozen is for models of the sort developed in this book, to be played with, to see which of them could be useful and to see what data are needed and in what form, in order that both models and computers can be used most effectively by the librarian." morse has addressed this book to both librarians and system analysts as an experimental but much needed venture. to the analyst it represents a good first attempt at modelling the complexities of a library and points the way toward more sophisticated techniques and more experimental work. to the librarian it provides some alternative to blind automation and a glimmer of hope that the evaluative techniques will come forth that are so badly needed to judge and control the efficacy of the new computer-aided systems being proposed.

f. f. leimkuhler

the role of the library in relation to other information activities: a state of the art review, by anne f. painter. u. s. army, office of chief of engineers, washington, d.c. 1968. (ctisa project, rt. no. 23.)

at one time the controversy over libraries and "information centers" was of interest to many of us. the "weinberg report" could draw a crowd at any professional meeting but, thank goodness, such issues lose their interest and, one hopes, we go on to more productive work. differentiating between libraries and "information centers" does not seem to this reviewer to be art.
nor does it seem to be in such a state as to be worth reviewing. nevertheless, professor painter has produced a large bibliography, arranged both alphabetically and by subject, preceded by some fifty rather wordy pages. the general conclusion, that libraries and "information centers" are and should be performing the same tasks to a greater and greater degree, speaks to an issue no longer of great interest. a literature survey of any kind can get tedious and one which reviews that written about a dead issue, as this publication does, becomes extremely dull. the ponderous style of official reports is present and the effort required to wade through the jargon is not rewarded by fresh insight nor perceptive evaluation. the publication is recommended to those who collect bibliographies on the subject and collectors of library science who exercise but little selectivity.

hugh c. atkinson

bnb marc documentation service publications nos. 1 and 2. london, council of the british national bibliography, ltd., 1968. part 1, £2; pt. 2, draft.

these admirable publications, presented by r. e. coward, describe, explain, and discuss essential characteristics of bnb marc records. they constitute a more comprehensive presentation of information about marc than has heretofore appeared as an integrated exposition. they are particularly valuable for their explanations of details of marc format and of cataloging practices. in addition, part 1 contains useful and informative treatises on filing, subject and other added entry data. r. e. coward prepared these documents for users of bnb marc records. but users of any variety of marc records will find stimulating and helpful discussions. since it appears most probable that bnb marc records will be used beyond the perimeters of the united kingdom, the handbook areas of these two documents will receive wide use.
the description and explanation of the communication format is fully and lucidly presented. bnb has introduced a few elaborations of lc marc that are imaginative and effective. for example, part 2 describes an attractive technique for elimination of an initial article in a title when sorting is on title. the number of characters in the article and the space following the article is determined, and this number is placed in the otherwise unused second indicator position. this information is not on the lc tapes, and would certainly be a welcome and helpful addition. part 1 contains discussions and solutions of filing problems that occupy two dozen pages. since the british national bibliography appears in bookform, its filing problems are numerous and severe. the techniques for solving their problems are effective and are presented with commendable clarity. of course, not all problems of arrangement of entries in bookform catalogs are solved, but the procedures for solution will be useful in application to architecture of other filing orders. little has been written about subject content of marc records, and most of what has appeared is also in part 1. coward briefly describes subject-heading and classed subject content of marc without pushing these two ancient jousters into the lists. however, it can confidently be predicted that marc will become a new terrain for this heroic arena. the discussion of added entries, although brief, is also novel for a marc document. however, the boundaries of a new battleground are discernible in the statement that "author and title have proved to be so cumbersome and prone to error that number systems have proliferated to take their place."
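the initial-article technique described earlier in this review (count the characters in the article plus the following space, store that count in the second indicator, and skip that many characters when sorting) can be sketched in python. this is only an illustration under stated assumptions: the article list and the function names are hypothetical and are not taken from the bnb documents.

```python
# Hypothetical list of English initial articles; a real catalog would need
# a per-language list and editorial judgment.
ARTICLES = ("the", "an", "a")


def nonfiling_count(title):
    """Number of leading characters to ignore when filing: article plus one space."""
    lowered = title.lower()
    for article in ARTICLES:
        prefix = article + " "
        if lowered.startswith(prefix):
            return len(prefix)
    return 0  # no initial article, so nothing is skipped


def filing_key(title, skip):
    """Sort key built by dropping the first `skip` characters of the title."""
    return title[skip:].lower()


titles = ["the library catalog", "an index to marc", "automation"]
# Store the count alongside each title, as the indicator position would.
records = [(t, nonfiling_count(t)) for t in titles]
in_filing_order = sorted(records, key=lambda rec: filing_key(rec[0], rec[1]))
```

storing the count with the record, rather than recomputing it at sort time, means the sorting program needs no knowledge of articles in any language.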
those librarians whose main objective is participation in the programs of the community of which their library is a segment, will surely protest that the day is not in the foreseeable future when scholars and other users will substitute standard book numbers for author-title citations. part 2 of the publication supplements part 1 with provision of detailed information on magnetic tape specifications. it also increases compatibility between bnb marc and lc marc so that no significant differences exist. where bnb marc does not include fields in lc marc, the lc fields are nevertheless described, thereby aiding either british or american users in processing marc records from either source. these two publications contain much useful information about marc records that is not available elsewhere. in addition, they contain effective emendations of marc that will stimulate all marc users to develop further improvements. richard coward and bnb are to be commended for a major contribution to marc literature.

frederick g. kilgour

library & information science abstracts. 1 (jan.-feb. 1969). london, the library association. annual subscription £6 6s.

recently two authors described librarianship as "paralyzed by decades of philosophical and literary argumentation." it is correct to state that until the past few years library literature has contained little, if any, new knowledge. however, the literature of today is beginning to swell with reports of new investigations and applications, reports which the modern librarian must make part of his armamentarium, just as the modern physician must learn of new developments if he is to be increasingly successful in prevention and cure of disease. indeed, worthwhile library literature has increased to a magnitude that requires regular perusal of abstracts to "keep up." given this circumstance, it is a pleasure to welcome an excellent new abstract journal.
library & information science abstracts (lisa) is not a mere rechristening of library science abstracts. to be sure, lisa evolved from the latter, and must be thought of as a new generation. the library association publishes lisa but aslib has joined forces with la in cooperative sponsorship. lisa now boasts a fulltime editor with some staff at la, where responsibility for abstracting in library science resides. aslib furnishes the information science abstracts under a contract with la. it is the intent of the publishers to use author abstracts or to have staff do abstracts in english and to call on a panel of abstractors that can read foreign languages. the goal for publication lag is six to fourteen weeks. if lag time can be kept within these limits, lisa will achieve at least one notable accomplishment. the main arrangement of abstracts is the british research groups' classification of library science, which appears to be adequate. the subjects are much more narrow than those that library science abstracts employed. cross references are included in the form of the citation with a reference to the location of the abstract, a most helpful procedure. an author index and a subject index are in each issue. the first issue contains 358 abstracts, so that it can be expected that some two thousand will appear annually. the abstracts are the usual indicative variety found in abstract journals and are well done. the la library will provide photocopies of the original at page rates varying from 4 1/ d to 1s 0d, depending on size of page. lisa will cover proceedings, symposia and a few monographs as well as journals. the first issue lists 251 journal titles being covered, a twenty-five percent increase in numbers of titles over library science abstracts.
however, some titles in lsa have been dropped, so that lisa covers approximately a hundred new journals, including titles in computation and information science as well as librarianship. lisa is an excellent abstract journal which every librarian who wishes to grow with his profession must read and use effectively.

frederick g. kilgour

analytics and privacy: using matomo in ebsco’s discovery service
denise fitzgerald quintel and robert wilson
information technology and libraries | september 2020 https://doi.org/10.6017/ital.v39i3.12219

denise fitzgerald quintel (denise.quintel@mtsu.edu) is discovery services librarian and assistant professor, middle tennessee state university. robert wilson (robert.wilson@mtsu.edu) is systems librarian and assistant professor, middle tennessee state university. © 2020.

abstract

when selecting a web analytics tool, academic libraries have traditionally turned to google analytics for data collection to gain insights into the usage of their web properties. as the valuable field of data analytics continues to grow, concerns about user privacy rise as well, especially when discussing a technology giant like google. in this article, the authors explore the feasibility of using matomo, a free and open-source software application, for web analytics in their library’s discovery layer. matomo is a web analytics platform designed around user-privacy assurances. this article details the installation process, makes comparisons between matomo and google analytics, and describes how an open-source analytics platform works within a library-specific application, ebsco’s discovery service.
introduction

in their 2016 article from the serials librarian, adam chandler and melissa wallace summarized concerns with google analytics (ga) by reinforcing how “reader privacy is one of the core tenets of librarianship.”1 for that reason alone, chandler and wallace worked to implement and test piwik (now known as matomo) on the library sites at cornell university. taking a cue from chandler and wallace, the authors of this paper sought out an analytics solution that was robust and private, that could easily work within their discovery interface, and provide the same data as their current analytics and discovery service implementation. this paper will expand on some of the concerns from the 2016 chandler and wallace article, make comparisons, and provide installation details for other libraries. libraries typically use ga to support data-informed decisions or build discussions on how users interact with library websites. the goal of this pilot project was to determine the similarities between google analytics and matomo, how viable matomo might be as a google analytics replacement, and seek to bring awareness to privacy concerns in the library. matomo could easily be installed on multiple websites. however, this project looked into a specific instance of monitoring, that of the library’s discovery layer, ebsco discovery service (eds).

literature review

google analytics

the 2005 release of google analytics was a massive boon to libraries who long searched for an easy to implement and budget-friendly tool for analytics. shortly after its release, academic libraries were quick to adopt the platform and install its javascript code into their library web pages.2 in a little over a decade, there have been nearly forty scholarly articles published that discuss the ways in which google analytics is used for libraries’ websites.
these articles not only introduced the service but also discussed the various ways libraries utilize the platform.3 in fact, in their survey of 279 libraries, o’brien et al.’s 2018 research found that 88 percent of libraries surveyed had implemented google analytics or google tag manager.4 in contrast, during that same period, the authors found matomo, or its earlier name, piwik, discussed in a total of five scholarly articles, with only three libraries who wrote about using it as a web analytics tool.5 in addition to measuring website use, libraries found that google analytics allowed for several different assessments. in using google analytics, libraries could provide immediate feedback for projects, indicate website design change possibilities, create key performance indicators, and determine research paths and user behaviors.6 convenience of implementation and use, minimal cost, and a user-friendly interface were all reasons cited for the widespread and fast adoption.7 although the early literature covers a lot of ground about the reporting possibilities and the coverage of google analytics, there is rarely a mention of user privacy. early articles that mention privacy provide a cursory discussion, reiterating that the data collected by google is anonymous and, therefore, protects the privacy of the user. recently, there has been a shift in literature, with articles that now provide more in-depth discussions about user privacy and the concerns libraries have with third parties that collect and host user data. o’brien et al. discussed the problematic ways that libraries adopted and implemented ga, by either overlooking front-facing policies or implementing it without the consent of their users.8 in their webometrics study, o’brien et al. found that very few libraries (1 percent) had implemented https with the ga tracking code, only 14 percent had used ip anonymization, and not a single site utilized both features.9 the concern is not solely google’s control of the data, but in google’s involvement with third-party trackers. third parties, as pekala remarks, are rarely held accountable.10 with an advertisement revenue of $134 billion in 2019, representing 84 percent of its total revenue, it is important to remember that google is an advertising company.11 google's search engine monetization transformed it into one of the world's most recognizable brands. as the most visited site in the world, google is firmly committed to security, especially when it comes to data theft. google offers protection from unwanted access into user accounts, even providing ways for high-risk users, such as journalists or political campaigns, to purchase additional security keys for advanced protection.12 but while google keeps data breaches and hackers at bay, the user data that google collects and stores for advertising revenue tells a different story. google stores user data for months on end; only after nine months is advertisement data anonymized by removing parts of ip addresses. then, after 18 months, google will finally delete stored cookie information.13 recent surveys are reporting an increase in users who want to know how companies are collecting information to provide data-driven services. in a 2019 pew research survey, 62 percent of respondents believe it is impossible to go through their daily lives untracked by companies.
additionally, even with the ease that certain data-driven services bring, “81 [percent] of the public reported that the potential risks they face because of data collection by companies outweigh the benefits.”14 cisco technologies, in a 2019 personal data survey, found a segment of the population (32 percent) that not only cares about data privacy and wants control of their data, but has also taken steps to switch providers, or companies, based on their data policies.15 additionally, in pew research survey results published as recently as april 2020, andrew perrin reports that an even larger number of u.s. adults (52 percent) are now choosing to not use products or services out of concerns for their privacy and the personal information companies collect.16 with a growing population of users who make inquiries about who, or what, is in control of their data, a web analytics tool that can easily answer those questions might serve libraries, and their users, well.

comparisons

google analytics had been the library’s only web analytics tool until the start of the pilot project. during the pilot period, the authors simultaneously ran both analytics tools. once matomo was installed the authors found several similarities between the two products, and discovered that nearly identical analyses could occur, given the quality and quantity of the data collected. the pilot study focused only on one analytics project, which would be the library’s discovery layer, ebsco’s discovery service. the authors worked with their dedicated ebsco engineer to replicate the google analytics eds widget, and have it configured to send output to matomo instead. in making comparisons, one of the common statements about ga and matomo is that the numbers will never be exact matches, oftentimes with much higher counts presented in ga than in matomo.
several forums and blogs, even matomo themselves, admit that there are several possible reasons why there is a noticeable difference between the two.17 those involved in the discussion theorize that this is due to ga spam hits, bot hits, and matomo’s ability for users to limit tracking. beyond the counts, both products measure the same kinds of metrics for websites.18 for this project, the authors only wanted to look at specific metrics within eds, those measurements that look more closely at the user, rather than the larger aggregate data. for the sake of the analysis, it is important to note that although both products have several great features; this is a specific situation where the researchers use certain features in terms of analytics. the analytics we collect for eds strive to answer specific questions:
• are users searching for known item or exploratory searches? how often?
• are users utilizing the facets and limiters? how often?
although you can use both products to count page views or set events for your website, when looking at meaningful metrics for our discovery system, we focus more on the user level. in google analytics, the best way to capture these is by going through the user explorer tool, which breaks up a user journey into search terms, events, and actions that occur during sessions. in the same way, matomo provides anonymized user profiles that include search terms, events, and actions in its visits log report. in ga, you can export this user explorer data in json format, but only at one user at a time, as seen in figure 1. this restriction also means you cannot see data from multiple users, with those details, on a single page. to contrast, in matomo’s visits log, you can export the same data (search terms, events, actions) from multiple users in csv, xml, php, tsv, json, or html formats.
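the multi-user export mentioned above can also be requested over http through matomo's reporting api, where the visits log corresponds to the `Live.getLastVisitsDetails` method. the following python sketch only builds the request url and does not contact a server. the host name, site id, and token are hypothetical placeholders, and the parameter set is an assumption based on the matomo reporting api; newer matomo releases may prefer that the token be sent in a post body rather than in the url.

```python
from urllib.parse import urlencode

# Formats named in the text above; the API expects them as the `format` value.
EXPORT_FORMATS = {"csv", "xml", "php", "tsv", "json", "html"}


def build_visits_log_url(base_url, site_id, token, export_format="json",
                         period="day", date="yesterday", limit=100):
    """Build a Reporting API URL for the visits log (Live.getLastVisitsDetails)."""
    if export_format not in EXPORT_FORMATS:
        raise ValueError(f"unsupported export format: {export_format}")
    params = {
        "module": "API",
        "method": "Live.getLastVisitsDetails",
        "idSite": site_id,
        "period": period,
        "date": date,
        "format": export_format,
        "filter_limit": limit,   # cap the number of visits returned
        "token_auth": token,     # placeholder; keep real tokens out of logs
    }
    return base_url.rstrip("/") + "/index.php?" + urlencode(params)


# Hypothetical host and token, for illustration only.
url = build_visits_log_url("https://matomo.example.edu/", 1, "TOKEN_PLACEHOLDER",
                           export_format="csv")
```

a scheduled job could fetch such a url and archive the result, which is one way to review search terms and events across many visits at once.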
as seen in figure 2, matomo offers a snapshot of this data in an easy-to-read single page, versus google’s one user at a time option which requires clicking through to see a user report.

figure 1. screenshot of the google analytics user explorer tool

figure 2. screenshot of the matomo visits log report

in summary, libraries using either of these analytics tools can measure usage and users with page views, visits, and unique visitors. looking at how users navigate a site is possible with the available user paths, from the initial search, to events as seen in figures 3 and 4, and an exit page url. goals can be set and maintained with conversion metrics tied to referrers, visits, user location, devices, or user attributes. like google analytics, matomo can run reports on engagement and performance, and share customizable user-friendly graphs or other visual representations.

figure 3. peer reviewed limiter as event action in google analytics

figure 4. peer reviewed limiter use as event name in matomo

comparisons on privacy

both google analytics and matomo offer ways to protect the privacy of your users. both offer ip anonymization, the option for data deletion after a certain time, and both provide a do not track feature for users. it is important to note the way google offers these adjustments to the user.
for matomo, do not track is a default behavior, meaning that the tracker automatically honors a browser’s settings for all sites, which is not always the case elsewhere, as respecting the do not track browser setting is voluntary for websites, not mandatory.19 google analytics offers the same service, as long as it is implemented by the user through a browser extension.20 ip anonymization and data deletion are all features that matomo users can adjust easily from the dashboard, whereas google analytics users will need to make those adjustments programmatically.21 in matomo, you can choose to automatically delete your old visitor logs from the database, although matomo recommends keeping detailed matomo logs from three to six months, and then deleting the older log data.22 in contrast, google analytics requires a user to make a data deletion request to google, which then creates a report for your review, before submitting the request to google. even after submitting a request, google still allows for seven days to reverse that decision. in terms of data retention, google analytics gives you the option to retain user data anywhere from 14 months to 50 months, with the option to never expire. fourteen months is the shortest amount of time you can retain user data for, nothing less.23 ip anonymization is the default for matomo analytics but is an opt-in feature for google analytics. again, like data retention, any adjustments to ip anonymization in matomo can occur in the dashboard with options to have two or three bytes removed from the address. google analytics will adjust the last octet to zero.24 both products are similar in several ways, but the standout feature of matomo is that the data belongs only to your institution.
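the ip anonymization options described above amount to zeroing the trailing bytes of an ipv4 address: two or three bytes in the matomo settings mentioned here, and the last octet in google analytics. a minimal python sketch of that masking follows. it illustrates the arithmetic only and is not code from either product; the function name and the sample address are hypothetical.

```python
import ipaddress


def anonymize_ipv4(address, bytes_to_mask):
    """Zero the last `bytes_to_mask` bytes (1 to 3) of an IPv4 address."""
    if bytes_to_mask not in (1, 2, 3):
        raise ValueError("bytes_to_mask must be 1, 2, or 3")
    value = int(ipaddress.IPv4Address(address))
    # Keep the leading bits and clear the trailing 8 * bytes_to_mask bits.
    mask = (0xFFFFFFFF << (8 * bytes_to_mask)) & 0xFFFFFFFF
    return str(ipaddress.IPv4Address(value & mask))


sample = "192.168.100.25"                # hypothetical visitor address
last_octet = anonymize_ipv4(sample, 1)   # "192.168.100.0", last octet zeroed
two_bytes = anonymize_ipv4(sample, 2)    # "192.168.0.0"
three_bytes = anonymize_ipv4(sample, 3)  # "192.0.0.0"
```

the more bytes removed, the coarser any location inferred from the address becomes, which is the trade-off a library makes when choosing between the two- and three-byte settings.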
in his interview with katherine schwab for fast company, mathieu aubry, matomo’s founder states it clearly: when [google] released google analytics, [it] was obvious to me that a certain percent of the world would want the same technology, but decentralized, where it’s not provided by a centralized corporation and you’re not dependent on them… if you use it on your own server, it’s impossible for us to get any data from it.25 implementation and installation originally released as piwik in 2007, matomo was designed as a replacement to phpmyvisites.26 it is an open-source software application licensed under gnu gpl v3.27 it is designed as a php/mysql application allowing the server operating system (os) and web service to best match a user’s needs or institutional preferences and expertise.28 to match the organization’s preferences and expertise, this matomo instance was set up as a linux-apache-mysql/php (lamp) stack server (centos 7 in our case) with apache 2.4.6 and mysql-mariadb 5.5.60. the required configurations needed to run matomo are well-documented on the matomo documentation site as well as the download and documentation area. depending on the version of matomo, the mileage a user gets with the documentation may vary. for example, on the recent upgrade to 3.11.0, the instance displayed a warning notification that php v7.0 had reached end of life and recommended updating to php v7.1 or greater to accommodate future matomo versions. however, at the time of this writing, the minimum php version required stated in matomo’s documentation is 5.5.9 or greater.29 like many php applications, once the prerequisite applications are installed (php, mysql, and the selected web service, apache in this case), the matomo install is completed by browsing to the server’s url or ip address on port 80. browsing to the index.php path in a web browser will guide a user through the install process. 
the installer will also review file directories on the server and inform a user of any permissions problems that will need to be addressed for correct install and use. compared to other php application install experiences, installing matomo was straightforward and easier to follow than many. within a few minutes, the admin user was created and the first website was added. the web-based administration area is also more robust and easier to use than many comparable applications. many features that might typically require configuration file changes directly on the server, including matomo upgrades, can be configured through the administration area. while the administration page has many options relating to paid-for premium features, there are several particularly helpful free configuration cards in the interface. most notable is the “system summary” card that displays the current version of matomo, php, and mysql as well as total users, segments, goals, tracking failures, total websites configured, and a few other metrics. there is a “tracking failures” card that notifies of issues with websites, and a “need help?” card that links to the matomo community forums. finally, the “system check” card displays any warnings or errors as well as a link to the full system check report. this is extremely helpful when matomo has been installed but the instance still needs additional configuration changes or follow-up tasks on upgrades. if there are warnings or errors, the full system report will often have recommendations of changes to make either in the administration page or on the server in the configuration files. these administration features make maintenance a straightforward process. since setting up the server, two upgrades have been completed. in both cases, an email notification was received indicating a new stable release was available.
on login to matomo, this information also appeared as a banner. simply clicking on the download update option automatically updated the service without any need to access the server directly or via ssh. in both cases the updates ran smoothly with one exception. in that case, several files were created or overwritten with the root user as the owner. as a result, matomo indicated an issue with the files and/or path not being found. in actuality, the files did exist, but matomo no longer had permission to read them. resolution of the problem required browsing to the directory path indicated in a warning on the server and changing ownership from the root user to the apache user to match other files. despite this issue, the update process is much more user-friendly than similarly structured applications. standalone implementation and installation of matomo is made simple by the installation documentation that is readily available on the matomo.com website, especially if one is familiar with php/mysql applications. adding one or two websites whose architectures a new matomo user is well-acquainted with is a good way for new users to pilot and get introduced to matomo’s overall functions without being so overwhelmed that the more granular functions are never learned. a system admin may find maintenance and updates to this service less problematic with less interruption of the service than similarly structured applications while users may find the overall functionality of matomo easier to use and finer points of reporting and analytics more transparent and easier to understand than google analytics. once installed, the authors then tested matomo on a low-traffic library site. after tracking proved successful, eds was entered as a new website in the matomo dashboard and the javascript tracking tag was placed in the bottom branding of eds. the process of adding eds as a new site to matomo was as easy as expected, and the data collection was almost immediate. 
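for readers who want to see what the tracking tag ultimately sends, the sketch below builds the kind of request url that matomo’s tracking http api accepts for a page view, a site search, or an event. it is a minimal illustration, not the authors’ widget: the host name, site id, and example values are hypothetical, and the parameter names (idsite, rec, url, action_name, search, e_c, e_a, e_n) are used here as documented for the tracking api to the best of our understanding.

```python
from urllib.parse import urlencode

# Hypothetical values for illustration; not taken from the authors' instance.
MATOMO_ENDPOINT = "https://matomo.example.edu/matomo.php"
SITE_ID = 2

def tracking_url(page_url, title=None, search_keyword=None, event=None):
    """Build a tracking request URL for a page view, site search, or event.

    `event` is an optional (category, action, name) tuple.
    """
    params = {
        "idsite": SITE_ID,   # which configured website the hit belongs to
        "rec": 1,            # required flag telling the tracker to record the hit
        "url": page_url,     # full URL of the page being viewed
    }
    if title is not None:
        params["action_name"] = title
    if search_keyword is not None:
        params["search"] = search_keyword        # marks the hit as a site search
    if event is not None:
        category, action, name = event
        params.update({"e_c": category, "e_a": action, "e_n": name})
    return MATOMO_ENDPOINT + "?" + urlencode(params)

if __name__ == "__main__":
    print(tracking_url("https://search.example.edu/results",
                       search_keyword="open access"))
```

because the request goes to a server the library controls, the visit data stays in the library’s own database rather than with a third party.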
to mirror the eds and google analytics integration, the authors worked with their ebsco library service engineer to create a matomo widget. luckily, another engineer had previously worked on an integration when it was known as piwik. instead of building from the ground up, the piwik widget only needed clean and updated code to match the google analytics widget, which would allow for the tracking of events and site searches. adding a user outside of the organization to matomo was necessary for the ebsco engineer to fine-tune the widget. matomo admins can set up users with specific permissions within the system, with access to only a specific site. each matomo user has their own email address and password (not domain-specific) and their own settings, and users can even customize their dashboard. after testing proved successful, the new matomo widget moved into the live profile of eds, and data collection commenced.

security

though the service is in a pilot stage with limited data collection, the authors wanted to ensure an ssl certificate was in place for login to matomo. with eff’s certbot (https://certbot.eff.org/), the authors installed a let’s encrypt (https://letsencrypt.org/) ssl certificate. the ssl certificate is automatically renewed every three months via a cronjob on our server. because of the power of the administration interface, caution should be used when assigning the “super user” role to user accounts. it would also be wise to require two-factor authentication (2fa) on the service. turning on 2fa is a very simple process and matomo works with multiple third-party authentication utilities including authy, lastpass, and 1password. while each user can choose to activate 2fa, an admin can require it for all users if desired.
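authenticator apps of the kind named above generally produce time-based one-time passwords (totp, rfc 6238). the sketch below shows how such a six-digit code is derived from a shared secret and the current time. it assumes the common defaults (hmac-sha1, 30-second step) and is offered only to illustrate the mechanism; it is not a description of matomo’s internal implementation.

```python
import hashlib
import hmac
import struct
import time

def totp(secret, unix_time, step=30, digits=6):
    """Compute a time-based one-time password (RFC 6238, HMAC-SHA1).

    `secret` is the raw shared secret as bytes; `unix_time` is seconds
    since the epoch. The defaults are the common ones, assumed here.
    """
    counter = int(unix_time) // step                       # number of elapsed time steps
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F                             # dynamic truncation offset
    chunk = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(chunk % 10 ** digits).zfill(digits)

if __name__ == "__main__":
    # Hypothetical secret; a real one is issued when a user enrols in 2FA.
    print(totp(b"example-shared-secret", time.time()))
```

the server and the app compute the same code independently, so a stolen password alone is not enough to log in.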
conclusion

as the amount of research and rate of adoption testify, since 2005 ga has set the benchmark for assessment of library web asset success and has made possible a completely new understanding of the library user experience and overall assessment of library services. matomo’s earliest iteration appeared shortly after, in 2007, and it is a viable alternative to proprietary web analytics applications with a few notable advantages over ga. from a long-term perspective, the two biggest advantages of matomo are that it is licensed under a copyleft gpl free and open source software (foss) license and is designed with user privacy at heart. for libraries, using foss applications whenever possible allows them to practice what they preach. foss does not mean cost-free. in fact, free in the foss sense is more akin to freedom (freedom to download, modify, distribute, and change the code) rather than free of charge. budgeting for a hosted subscription, support, or the costs of a library running and maintaining the application itself or through an infrastructure as a service (iaas) provider like amazon web services (aws) or microsoft’s azure is necessary, but the freedom matomo provides by ensuring the library is in control of its patron data, that it is protected, and that data is not at risk of becoming a product in and of itself may well be worth the cost. like other initiatives in the open-access movement or open-education resources, and as third-party data collection and privacy on the web become a more mainstream concern, opting to use matomo to protect patron privacy principles allows libraries to be the leaders on issues relating to privacy and intellectual freedom. as noted earlier, there are other feature-based advantages matomo provides that impact the day-to-day aspects of monitoring web asset use and assessment, like export options and viewing the full log of visits.
lastly, by focusing on eds in this pilot, the authors were able to demonstrate and verify that matomo rises to the challenge not just with traditional web asset analytics requirements, but also with library-specific applications like proprietary discovery layer services.

endnotes

1 adam chandler and melissa wallace, “using piwik instead of google analytics at the cornell university library,” serials librarian 71, no. 3 (october 2016): 174, https://doi.org/10.1080/0361526x.2016.1245645.
2 tabatha farney and nina mchale, “introducing google analytics for libraries,” library technology reports 49, no. 4 (may 2013): 5, https://journals.ala.org/ltr/article/download/4269/4881.
3 paul betty, “assessing homegrown library collections: using google analytics to track use of screencasts and flash-based learning objects,” journal of electronic resources librarianship 21, no. 1 (2009): 75–92, https://doi.org/10.1080/19411260902858631; jason d. cooper and alan may, “library 2.0 at a small campus library,” technical services quarterly 26, no. 2 (2009): 89–95, https://doi.org/10.1080/07317130802260735; stephan spitzer, “better control of user web access of electronic resources,” journal of electronic resources in medical libraries 6, no. 2 (2009): 91–100, https://doi.org/10.1080/15424060902931997; julie arendt and cassie wagner, “beyond description: converting web site usage statistics into concrete site improvement ideas,” journal of web librarianship 4, no. 1 (2010): 37–54, https://doi.org/10.1080/19322900903547414; steven j. turner, “website statistics 2.0: using google analytics to measure library website effectiveness,” technical services quarterly 27, no.
3 (2010): 261–78, https://doi.org/10.1080/07317131003765910; gail herrera, “measuring link-resolver success: comparing 360 link with a local implementation of webbridge,” journal of electronic resources librarianship 23, no. 4 (2011): 379–88, https://doi.org/10.1080/1941126x.2011.627809; wayne loftus, “demonstrating success: web analytics and continuous improvement,” journal of web librarianship 6, no. 1 (2012): 45–50, https://doi.org/10.1080/19322909.2012.651416; tabatha a. farney, “click analytics: visualizing website use data,” information technology & libraries 30, no. 3 (2011): 141–8, https://doi.org/10.6017/ital.v30i3.1771.
4 patrick o’brien et al., “protecting privacy on the web: a study of https and google analytics implementation in academic library websites,” online information review 42, no. 6 (2018): 734–51, https://doi.org/10.1108/oir-02-2018-0056.
5 junior tidal, “using web analytics for mobile interface development,” journal of web librarianship 7, no. 4 (2013): 451–64, http://doi.org/10.1080/19322909.2013.835218; ramiro federico uviña, “bibliotecas y analítica web: una cuestión de privacidad = libraries and web analytics: a privacy matter,” información, cultura y sociedad no. 33 (2015): 105–12, http://revistascientificas.filo.uba.ar/index.php/ics/article/view/1906; sukumar mandal, “site metrics study of koha opac through open web analytics and piwik tools,” library philosophy and practice (2019), https://digitalcommons.unl.edu/libphilprac/2835; mohammad azim and nabi hasan, “web analytics tools usage among indian library professionals,” 2018 5th international symposium on emerging trends and technologies in libraries and information services (2018): 31–35, https://doi.org/10.1109/ettlis.2018.8485212.
6 ian barba et al., “web analytics reveal user behavior: ttu libraries’ experience with google analytics,” journal of web librarianship 7, no. 4 (2013): 389–400, https://doi.org/10.1080/19322909.2013.828991.
7 betty, “assessing homegrown library collections.”
8 o’brien et al., “protecting privacy on the web,” 734.
9 o’brien et al., “protecting privacy on the web,” 741.
10 shayna pekala, “privacy and user experience in 21st century library discovery,” information technology & libraries 36, no. 2 (2017): 50, https://doi.org/10.6017/ital.v36i2.9817.
11 j. clement, “advertising revenue of google from 2001 to 2019,” statista, february 5, 2020, https://www.statista.com/statistics/266249/advertising-revenue-of-google; lily hay newman, “the privacy battle to save google from itself,” wired, november 1, 2018, https://www.wired.com/story/google-privacy-data/; ben popken, “google sells the future, powered by your personal data,” nbc news, may 10, 2018, https://www.nbcnews.com/tech/tech-news/google-sells-future-powered-your-personal-data-n870501; richard graham, “google and advertising: digital capitalism in the context of post-fordism, the reification of language, and the rise of fake news,” palgrave communications 3, no. 45 (2017): 2–4, https://doi.org/10.1057/s41599-017-0021-4.
12 “google advanced protection program,” google, https://landing.google.com/advancedprotection/.
13 “google privacy and terms, advertising,” google, https://policies.google.com/technologies/ads?hl=en-us.
14 brooke auxier et al., “americans and privacy: concerned, confused and feeling lack of control over their personal information,” pew research, november 15, 2019, https://www.pewresearch.org/internet/wp-content/uploads/sites/9/2019/11/pew-research-center_pi_2019.11.15_privacy_final.pdf.
15 “consumer privacy survey,” cisco, november 2019, https://www.cisco.com/c/dam/en/us/products/collateral/security/cybersecurity-series-2019-cps.pdf.
16 andrew perrin, “half of americans have decided not to use a product or service because of privacy concerns,” pew research, april 14, 2020, https://www.pewresearch.org/fact-tank/2020/04/14/half-of-americans-have-decided-not-to-use-a-product-or-service-because-of-privacy-concerns/.
17 “matomo vs. google analytics 360,” matomo.org, https://matomo.org/matomo-vs-google-analytics%20comparison/; lemon, “a comparison of data: piwik vs. google analytics,” the fplus (blog), november 30, 2016, https://thefpl.us/wrote/about-piwik; himanshu sharman, “best google analytics alternatives in 2020—matomo & piwik pro,” optimizesmart (blog), march 30, 2020, https://www.optimizesmart.com/introduction-to-piwik-best-google-analytics-alternative.
18 “matomo vs. google analytics 360,” matomo.org.
19 ryan singel, “google holds out against ‘do not track’ flag,” wired, april 15, 2011, https://www.wired.com/2011/04/chrome-do-not-track; kieren mccarthy, “do not track is back in the us senate,” the register, may
20, 2019, https://www.theregister.co.uk/2019/05/20/do_not_track; “how do i turn on the do not track features?,” mozilla, https://support.mozilla.org/en-us/kb/how-do-i-turn-do-not-track-feature.
20 “google analytics opt-out browser add-on,” google, https://support.google.com/analytics/answer/181881.
21 “ip anonymization,” google, https://developers.google.com/analytics/devguides/collection/analyticsjs/ip-anonymization.
22 “managing your database’s size,” matomo.org, https://matomo.org/docs/managing-your-databases-size/%20-%20deleting-old-unprocessed-data.
23 “data retention,” google, https://support.google.com/analytics/answer/7667196?hl=en&ref_topic=2919631.
24 “ip anonymization,” google.
25 katherine schwab, “it’s time to ditch google analytics,” fast company, february 1, 2019, https://www.fastcompany.com/90300072/its-time-to-ditch-google-analytics.
26 “matomo and phpmyvisites,” matomo.org, https://matomo.org/faq/general/faq_437.
27 “licenses,” matomo.org, https://matomo.org/licences.
28 “matomo (software),” wikipedia, https://en.wikipedia.org/wiki/matomo_(software).
29 “matomo requirements,” matomo.org, https://matomo.org/docs/requirements.
on-line and back at s.f.u.

m. sanderson: simon fraser university

simon fraser university library began operation with an automated circulation system. after deliberation, it mounted the first phase of a two-phase on-line circulation system. a radically revised loan policy caused the system design and assumptions to be called into question. a cheaper, simpler, and more effective off-line system eventually replaced the on-line system. the systems, fiscal, and administrative implications of this decision are reviewed.

the original system

when simon fraser university (sfu) library opened in 1965, circulation of materials was handled by an automated system. briefly the method of operation was as follows: to borrow a book, the patron presented a laminated plastic card which had his borrower number and borrower class (faculty, staff, graduate, undergraduate) punched in it.
the book itself contained a keypunched card holding the book's class number and brief author and title information. the book card and the patron's badge were fed into an ibm 1031 data collection terminal. the terminal transmitted the information to an ibm 1034 card punch which punched out a card containing the information from the book card, the patron's borrower number, and the date borrowed. at the end of the day, these transaction cards were used to update the loan master file. the loan master file produced daily a list of all material on loan, and fine and overdue notices for dispatch to patrons. payment cards for fines were also produced daily by the system; these cards were used to cancel fines from the file upon payment of the fine. the loan master file and the daily circulation listing also contained records of all materials on reserve. separate listings were available weekly showing reserve books and reserve photocopied material. at the end of each semester a list was produced of all students owing more than $2 in fines for the purpose of withholding grades until such time as fines were paid.

reasons for going on-line

the possibility of implementing an on-line system in one of the sfu departments was first discussed in early summer 1968. it was accepted by the computing centre management and the nonacademic department heads that:
1. the use of on-line processing generally was increasing rapidly.
2. the level of sophistication of these systems was not high.
88 journal of library automation vol. 6/2 june 1973
3. there was a shortage of people competent to design, implement, and maintain sophisticated on-line systems.
4. a demand for on-line processing at sfu would develop.
5. sfu would probably move with the general trend toward increased use of on-line systems, and an on-line system ought to be initiated to develop local expertise in anticipation of demand.
after further discussion, it was agreed that the department wishing to develop the first on-line system must be able to satisfy the following prerequisites:
1. the system should encompass the beginning and the end of a clearly defined process.
2. the system should require the simultaneous use of one or more files by two or more terminals.
3. the system should use relatively large files with a high inquiry and update rate.
4. the system should satisfy genuine objectives of the application department.

a survey of the departments showed that the library was the logical choice because:
1. it could satisfy the prerequisites.
2. it had experience with automated systems.
3. batch-processing in the loan division could be extended to the on-line mode using the existing line of equipment.
4. the library administration was prepared to make an immediate commitment of resources to the project.

the library's objectives were as follows:
1. inventory control: to gain statistics about the use of the collection. such data were available under batch processing for the general collection, but not for the reserve collection, which, with its loan periods of two hours, four hours, one day, and three days, was handled manually.
2. inventory usefulness: to determine how the library is being used and by whom. this information is essential in order to ensure that collection building is a reflection of the realities of the education process of the institution.
3. increased service: by definition, the library is a service institution. if the automated system in batch mode allowed us to speed up the transaction process to handle large volume circulation, and allowed us to produce overdue notices, bills, and statistics, thereby increasing both the efficiency and service of the loan division, then we were satisfying a built-in library objective by implementing data processing in batch mode in the loan division.
if the on-line system could give our users instant information on the status of books, then that function becomes a service objective. at sfu, the loan period and penalties for overdue books are the same for all classes of borrowers. the library has never been an enthusiastic supporter of the fines system because of the general antagonism it creates and because it favors the borrower who can afford to pay. unfortunately, there was no acceptable way to force faculty to pay fines. it was thought that the on-line system was the only way to support a system of suspension of borrowing privileges for failure to return books, in lieu of the fines system.
4. cooperation: it was agreed between the three universities of british columbia (simon fraser university, university of victoria, and university of british columbia) that the storage of low-use material in a cooperatively supported lending/storage facility would save in the order of $800,000 per year. it was felt that the on-line system would provide useful statistics for this purpose.
5. future development: it was thought that the on-line system, with its statistics-gathering potential, was a necessary preliminary to the cooperative shelflist conversion of the three universities, in turn thought necessary to provide the kind of bibliographic information to allow collaborative collection building.

the reasons why the above justifications later turned out to be invalid are given in a subsequent section.

phase i of the on-line system

(abbreviated system flowcharts of the various stages are shown in appendix 4)

the purpose of phase i was to put the general collection on-line in enquiry mode only, with batch updating every three hours; on-line updating was to wait until phase ii. in april 1969, one full-time programmer analyst and one part-time systems analyst began work on the first phase of the on-line system, using three ibm 2260 graphic display terminals.
problems with pgam, the pl/i graphic access method interface program, and multitasking support allowing the use of more than one terminal at a time (it was easy to get one terminal going) meant that by april 1970 the system was just struggling into life. there followed a period of parallel running which was unexpectedly long as a result of some of the problems peculiar to on-line systems (e.g. system down-time; designing a really effective back-up system to prevent loss of data). this phase lasted until october 1970. by july 1971 it had become apparent that the system was not cost-effective and in august 1971 the system was taken down and replaced by a revised version of the old batch system. the reasons and costs are given in a later section.

there were three display terminals in the loan division, two for patrons, one for staff, giving the following capabilities:

patron: when the patron typed in the class number of the book he was looking for, according to instructions appearing on the terminal's screen, the information was transmitted to the computer program which searched the on-line loan master file for the required class number. if the book was on loan, a message appeared on the screen giving the class number, borrower number, due date, and whether a hold had been placed on the book. if the book was not on loan or on reserve, or being repaired, or in cataloging, a message to this effect was displayed. if the patron made any errors in his use of the terminal, error routines in the program displayed messages giving corrective procedures.

staff: by use of a special password, staff members could access different modules of the enquiry program. a status query by a staff member would result in all copies of a particular class number being displayed serially on the screen, and since fines and overdues were held on the master file, this type of information was also displayed.
other routines available to staff members allowed holds to be placed on books or removed, renewals to be made, and the passwords to be altered. although passwords were a closely guarded secret, it was felt necessary to be able to change passwords in the event of their being learned by unauthorized users. since on-line updating was not to be incorporated until phase ii, the 1034 transaction cards were input every three hours and the loan master file updated in batch mode. file structure for phase i was based on an indexed sequential type of access to a loan master file which contained one 100-byte record per book on loan, one record per fine and one record per reserve book. in this way, the loan master file was in the same format as in the batch system. access to the file began with a program check of a small table held in core storage which gave ranges of class numbers with entry points to an index table. taking the appropriate entry point, the index table stored on disc was accessed. this gave the class number which headed each track for the loan master file. the index table was scanned for the appropriate track. each track of the loan master file contained fifty-four records with eighteen spaces for updates. whenever a record was changed or a new loan inserted, the new record was inserted in the update area. at the end of the day, the file was stripped of its update records and the old batch update program was used to update the loan master file. the loan master file was rewritten to disc the following morning ready for the day's updates. total file space allocated was fifty cylinders. phase ii and the demerit system phase ii was to see the on-line processing of loans and returns, the master file being updated at the time of the transaction instead of in three-hour batches. the reserve collection was to be automated and go on-line. the recording of holds and the production of hold slips for patrons and books was to be fully automated. 
detailed statistics of the use of the reserve collection were to be obtained.

one of the major objectives of phase ii was the replacement of the fines system by a demerit system. under the demerit system a patron would accrue penalty points for the length of time a book was overdue. after a certain level was reached, a warning notice was to be sent out informing him that his privileges would be suspended if a particular level of points were exceeded. if he then exceeded this level, his borrowing privileges would be suspended, and whenever he subsequently presented his library card to take out books, the checking procedure in the program would find his borrower number invalid, prevent the transaction being recorded and print a message on a 2741 terminal giving the reason for suspension. after a given period, borrowing privileges would be restored, provided that overdue materials had been returned. at exam times, penalty points would accumulate more rapidly, as they would also for reserve materials which had short loan periods.

file organization for phase ii was to be altered from that of phase i principally to allow easier retrieval and updating. a master index file would contain a brief record (26 bytes for class number, 4 bytes for relative address) for every cataloged book in the library. this index file would lead into the loan master file which would consist of variable length records: one fixed length portion of the class number and author-title, followed by varying numbers of fixed length sections giving details of the loan transaction. the number of the transaction sections would depend on the number of copies of the book which were on loan. anticipated file sizes were 60 cylinders for the master index and 30 to 40 cylinders for the loan master file. the increase in file handling efficiency and in restarting with no lost data after system down-time were seen to compensate for the increase in space allocation.
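the demerit scheme described earlier in this section can be sketched in a few lines. the article gives no point values or thresholds, so every number below is hypothetical, as are the function names; only the rules themselves (points grow with days overdue, faster for reserve material and at exam times, a warning once one level is reached, suspension once a higher level is exceeded) come from the text.

```python
# Illustrative sketch of the proposed demerit rules.
# All numeric values are hypothetical; the article states none.
WARNING_LEVEL = 10        # points at which a warning notice would be sent
SUSPENSION_LEVEL = 20     # points above which borrowing would be suspended
BASE_POINTS_PER_DAY = 1
RESERVE_MULTIPLIER = 2    # reserve items have short loan periods
EXAM_MULTIPLIER = 2       # points accumulate faster at exam times

def points_for_overdue(days_overdue, reserve=False, exam_period=False):
    """Penalty points accrued by one overdue item."""
    rate = BASE_POINTS_PER_DAY
    if reserve:
        rate *= RESERVE_MULTIPLIER
    if exam_period:
        rate *= EXAM_MULTIPLIER
    return max(days_overdue, 0) * rate

def borrower_status(total_points):
    """Map a borrower's accumulated points to the action the system would take."""
    if total_points > SUSPENSION_LEVEL:
        return "suspended"     # loan transactions would be refused
    if total_points >= WARNING_LEVEL:
        return "warned"        # warning notice sent, borrowing still allowed
    return "ok"

if __name__ == "__main__":
    total = points_for_overdue(4) + points_for_overdue(3, reserve=True, exam_period=True)
    print(total, borrower_status(total))
```

checking a borrower's status at the moment of each loan is what would have required the on-line file; the recall policy adopted later removed that requirement.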
loan policy changes

problems with the system of fines and proposals such as the demerit system led to the suggestion that a survey should be made of campus opinion on the library loan policy. an examination of the results of the questionnaire and the comments obtained led to the submission of a somewhat different loan policy to the senate library committee. this policy, briefly, was a recall system with the two-week loan period changed to a semester loan period for general loan material, and retention of the current fines system for reserve materials until the implementation of phase ii. failure to respond to recall was to be penalized by suspension of library service. the system was to be experimental for two semesters. the decision to adopt a recall system had an immediate impact on system development for phase ii:
1. specifications for phase ii needed to be reworked.
2. the demerit system was no longer required.
3. interim procedures were required to handle the recall system until the inception of phase ii.
4. file size growth became unpredictable because it was not known whether all books would stay out until the end of the semester or be returned at more frequent intervals. this could indicate a file size of between 30,000 and 80,000.

revision of thinking on on-line circulation

two significant developments made it advisable for the library to reconsider its need for an on-line system in terms of both its benefits for the library and its economic justification. the first development was, as indicated, the radical revision of library loan policy, namely the proposed adoption of a semester loan period supported by a recall system. the second was a detailed costing of the equipment requirements for phase ii of the on-line system, weighing the relative merits and costs of two alternative manufacturers. these costs have turned out to be significantly higher than originally anticipated.
consequently, it was seen that the costing done for phase ii should be done again in the light of the new developments. the original benefits of the on-line system were also reexamined.
1. inventory control: this still applied as far as the reserve collection was concerned. these statistics would have to be gained in some other way insofar as they are additional to the statistics now collected manually.
2. inventory usefulness: this was no longer a justification. by this time we had developed collection analysis programs which give a fine breakdown of the collection into separate disciplinary areas and give total volumes and book usage by borrower class in these areas. further development of these programs could give more information; e.g. referencing the registration system files could give information correlating students, courses, and book usage.
3. increase in service: this was no longer a justification.
(a) the implementation of the recall system with its attendant suspension of privileges does not demand an on-line system for its operation as would the previously proposed demerit system. with a suspension of privileges for those owing over $25 tested in early 1971, we were operating a manual system of borrower control successfully, leading us to assume that the recall system's control system would similarly function well.
(b) nobody ever complained that the information on the batch system was too old (eighteen hours old at maximum). we had even had messages (anonymous) left by frustrated users of the on-line terminals which could be paraphrased as: "what was wrong with the old system?"
4. cooperation: this was no longer a justification. extensions of the work on collection analysis mentioned in 2 above could help in the identification of high and low use items and thus provide an alternative way to save the estimated $800,000 per year.
work on collection comparison between the three british columbia universities is already underway in a tri-university task force.

5. future development: this was no longer a justification. shelflist conversion should have been hastened by the abandoning of the on-line loan system insofar as resources would be freed to work on the conversion, which is of far greater importance to the future information handling capability of the library than knowing within four seconds whether or not a book is on loan, especially as the time taken in reshelving of books makes this loan information prone to inaccuracy.

on-line and back at s.f.u./sanderson

it thus appeared that the reasons used to justify an on-line system were no longer valid, if, indeed, they ever were. when examining the cost figures again in view of the proposed recall system, the amortization of the development and equipment costs no longer seemed possible. the cost of the batch and on-line system equipment is shown in figures 1 and 2 for both ibm and colorado instruments (now mohawk). it can be seen that the difference in equipment costs between the proposed batch system and phase ii would have been over $15,000 per year. (some of the savings in equipment rental has been used to microfilm the subject catalog for distribution to three floors in the library which do not have easy access to this catalog.) the manual procedures involved with fines which phase ii was to eliminate are now considerably reduced by the recall system. the development costs of phase ii have been replaced with the cost of returning to the old batch system in a slightly improved form. the cost of this, at the computing centre, was $2,123.76. it had been predicted that writing phase ii in minerva and marc iv (two high-level program language packages) would make considerable savings in the impact on computing centre operations.
however, even taking this into account there still remain the development costs and at least $15,000 per year for extra equipment (the difference between the equipment costs for phase i (figure 1) and phase ii (figure 2)). see the appendixes for cost comparison and projections.

colorado (3 year lease)                                   monthly
  3 c-deks @ $131.29                                      $393.87
  3 c-dek cable terminals @ $2.14                            6.42
  1 central controller                                     137.25
  1 controller cable terminal box                            2.25
  2 mag tape-recorders                                     268.20
  subtotal                                                 807.99
  discount @ 12%                                           100.00
  after discount                                           707.99
  installation                                      probably free
  service contract                           approximately 122.00
  total colorado monthly                                  $829.99

ibm                                                       monthly
  1 1031a terminal @ $100.34                              $100.34
  1 1031a terminal                                         105.35
  1 1031b terminal                                          64.12
  1 1034 card punch (includes educational discount)        328.73
  service                                                    free
  installation                          equipment already on site
  total ibm monthly                                       $598.54

fig. 1. equipment costs (1971) ibm vs colorado, phase i and off-line

colorado (3 year lease)                                   monthly
  data collection:
    5 c-dek 3213 @ $131.39                                $656.95
    5 c-dek cable terminals @ $2.14                         10.70
    1 3216 central controller                              137.25
    1 controller cable terminal                              2.25
    1 interface coupler                                    112.50
    subtotal                                               919.65
    less 12% discount                                      110.36
    after discount                                         809.29
  library share of memorex 1270:
    base: 1/32 of $1,011                                    31.00
    line adapter: 1/4 of $28                                 7.00
    modem                                                   33.00
  back-up:
    9-track mag-tape recorder with free switching rpq      134.10
  printers:
    2 2741 @ $90.70                                        181.40
  display terminals:
    4 2260 @ $46.74                                        186.96
    share of 2848                                          311.10
  equipment totals                                      $1,693.85
  service contract (prime shift only)                      195.00
  total monthly cost                                    $1,888.85

ibm                                                       monthly
  data collection:
    2 1031a terminals @ $100.34                           $200.64
    2 1031b terminals @ $64.12                             128.24
    1 1031a terminal @ $105.35                             105.35
    1 2711 data set                                        115.00
    subtotal                                               549.23
  additional 2703 attachments:
    1 4879 600 bps                                          12.00
    1 4697 type ii control                                  40.00
    1 3205 data line set                                    86.00
    2 4790 line adapters @ $12                              24.00
    1 7506 (library pays half?) @ $86                       43.00
    subtotal                                              $205.00
  back-up:
    switching rpq                                           36.00
    back-up clock                                           98.21
    1034 card punch                                        328.73
  printers:
    2 2741 @ $90.70                                        181.40
  display terminals:
    4 2260 @ $46.74                                        186.96
    share of 2848                                          311.10
  equipment totals                                      $1,896.63
  service contract                                            nil
  total monthly cost                                    $1,896.63

installation:
  equipment installation and check-out                  $1,390.00
  freight charges                            approximately $100.00

fig. 2. equipment costs (1971) ibm vs colorado, phase ii

the present recall system

the recall system has been in operation since august 1971. its principal features are as follows:

1. that books be loaned for a period of one semester.
2. that they be subject to recall after a period of two weeks after borrowing.
3. that they become due on the last day of exams.
4. that there be a penalty for failure to respond to recall.
5. that there be a penalty for failure to return books after exams.
6. that the penalty be suspension of library privileges plus a $5.00 fine. in the case of failure to respond to recall, the $5.00 fine is levied five days after the recall notice is sent. in the case of failure to return books after exams, the fine is $1.00 per day to a maximum of $25.00, starting at the end of the semester. listings of overdue books will be run during this period only, and a fine payment card produced and kept in the loan division. as in the first system, the fine payment card is used to cancel fines upon payment. the fine system and checking of delinquent borrowers is being successfully handled manually.
7. that privileges will be restored only when the patron has both returned the books and paid the fine.

the automated part of the system is similar to the original system described earlier except that fine and overdue notices are produced only at the semester end as mentioned.
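the penalty rules listed above can be sketched in a few lines of python. this is only an illustration: the function names are hypothetical, and it assumes that the $5.00 recall fine applies from the fifth day after the notice onward and that the end-of-semester fine accrues by whole days; the text does not settle either point.

```python
# Illustrative sketch of the recall-system penalties described in the text.
# All names are hypothetical; boundary handling is an assumption.

RECALL_GRACE_DAYS = 5       # fine is levied five days after the recall notice
RECALL_FINE = 5.00          # flat fine for failing to respond to a recall
DAILY_OVERDUE_FINE = 1.00   # per day after the end of the semester
MAX_OVERDUE_FINE = 25.00    # ceiling on the end-of-semester fine


def recall_fine(days_since_notice: int) -> float:
    """Flat fine once the grace period after a recall notice has elapsed."""
    if days_since_notice >= RECALL_GRACE_DAYS:
        return RECALL_FINE
    return 0.0


def semester_end_fine(days_after_semester_end: int) -> float:
    """$1.00 per day after the semester ends, capped at $25.00."""
    if days_after_semester_end <= 0:
        return 0.0
    return min(days_after_semester_end * DAILY_OVERDUE_FINE, MAX_OVERDUE_FINE)


def privileges_restored(books_returned: bool, fine_paid: bool) -> bool:
    """Privileges come back only when both conditions are met."""
    return books_returned and fine_paid


if __name__ == "__main__":
    print(recall_fine(4), recall_fine(5))
    print(semester_end_fine(3), semester_end_fine(40))
    print(privileges_restored(True, False))
```

because the rules reduce to two thresholds and one cap, they are simple enough to be applied by hand, which is consistent with the statement that the fine system is being handled manually.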
the reaction of the staff handling the recall system has been favorable, as has been the reaction of patrons. initial fears that a high percentage of the books in the collection would be out all semester and be returned en masse at the end have proved unfounded. the number of books out at any one time is often less than under the previous system. people seem to be returning books when they have finished with them and taking out fewer at a time; thus, browsing and usage are not affected. books began returning at 2,000 per day on november 30, 1971 in anticipation of the december 17 due date (master file standing at around 34,000 books on loan at this point). on december 19 only 4,864 books had not been returned. by december 29 this was down to 2,169, and by january 13, 1972 down to 394. recalls have fluctuated between 35 and 130 per day and of these an average of 8 recalls per day have not been picked up by the recaller. by contrast, under the fines system, the daily production of fines, overdue, and hold notices was between 500 and 700.

the total amount of fines from september 1, 1970 to november 17, 1970 was $11,021.32. from september 6, 1971 to november 17, 1971 the figure was $2,405.03, a difference of $8,616.29. thus, although people are making similar use of the library, judging by the circulation statistics, it is not costing them as dearly.

costs

comparative computer operating costs are shown in table 1.

table 1. comparative computer operating costs

period     system                  average monthly computer cost   computer model
1969-70    old batch system        $3,100                          ibm 360-40
1970-71    phase i on-line         $3,851                          ibm 360-50
1971       recall system, batch    $1,178                          ibm 360-50
1972-73    recall system, batch    $514                            ibm 370-155

the annual average cost of computer processing is now $6,168 rather than the $19,320 projected in appendix 1. staff salaries have risen in the two years since august 1971 and loans staff costs are now $33,200 instead of $21,267.
total annual cost is now $6,168 (computer time) + $7,182 (equipment) + $33,200 (loan staff and materials) = $46,550. this is less than the projected annual cost of $57,994. the recall system certainly seems so far to be making the predicted savings, and the increase in good will in the university community is something we must also take into account on the credit side.

conclusion

as is stressed so often in systems analysis theory, and sinned against so often in practice, a clear statement of objectives is required and a thorough cost/benefit analysis of all alternative solutions is needed to prevent unwanted solutions of unreal problems. a first question should be: "what are we really trying to achieve here?" rather than: "i wonder if we could apply system x in this situation?" automation is one of many possible solutions to a problem. an on-line system is one of many possible automated solutions.

the management aspects of the decisions in setting up an on-line system were referred to in "reasons for going on-line." the thought of taking the on-line system down again was born of a number of factors. in the first place, feeling on campus caused the loan policy to evolve in a way not predictable at the time of system design. in the second place, we learned that on-line systems are not to be treated lightly. they require a great deal of careful design and technical competence if they are to be as efficient as they are impressive. they embody concepts as different from batch processing as batch processing is from the manual system it may replace. for us, the result was escalating costs, and an on-line system design that could have been better and less costly.

the solution finally adopted was the result of considering what were seen to be the real requirements: maximum availability of materials with maximum convenience; and against the background of the library's general objectives, maximum cost-effective service in an era of tight budgets.
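the cost arithmetic reported in the costs section can be re-checked with a short python sketch. the figures are those quoted in the text and in table 1; the variable names are illustrative, and treating the $6,168 annual computer cost as twelve times the 1972-73 monthly average of $514 is an inference from the numbers, not something the article states.

```python
from decimal import Decimal

# Figures quoted in the article; names are illustrative only.
MONTHLY_COMPUTER_COST = 514      # table 1, 1972-73 recall system, batch
EQUIPMENT_COST = 7_182           # annual equipment cost
LOAN_STAFF_COST = 33_200         # loan staff and materials
PROJECTED_ANNUAL_COST = 57_994   # projected annual cost quoted in the text

# Assumption: the annual computer cost is twelve months at the monthly average.
annual_computer_cost = MONTHLY_COMPUTER_COST * 12
total_annual_cost = annual_computer_cost + EQUIPMENT_COST + LOAN_STAFF_COST
saving_vs_projection = PROJECTED_ANNUAL_COST - total_annual_cost

# Fines collected in the two autumn periods compared in the text.
fines_1970 = Decimal("11021.32")
fines_1971 = Decimal("2405.03")
fines_difference = fines_1970 - fines_1971

if __name__ == "__main__":
    print("annual computer cost:", annual_computer_cost)
    print("total annual cost:", total_annual_cost)
    print("saving against projection:", saving_vs_projection)
    print("reduction in fines:", fines_difference)
```

decimal is used for the fines so that the cents subtract exactly; the whole-dollar figures are left as integers.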
appendix 1

annual circulation system cost summary (as of august 1971)

predicted annual costs          proposed    present     phase ii    savings     savings     losses      losses
                                batch       on-line     (with       over        over        compared    compared
                                (with       phase i     recall)     phase i     phase ii    with        with
                                recall)     (without                                        phase i     phase ii
                                            recall)
machine time                    $19,320     $37,150     $42,000     $17,830     $22,680     -           -
forms
  overdue notices               950         3,250       950         2,300       -           -           -
  fine notices                  15          48          15          33          -           -           -
  printouts                     3,200       960         1,000       -           -           $2,240      $2,200
postage for overdues & fines    3,530       12,000      3,530       8,470       -           -           -
envelopes                       70          250         70          180         -           -           -
postage for holds/recalls       1,200       -           1,200       -           -           1,200       -
punch cards                     1,260       1,260       -           -           -           -           1,260
loans staff
  fines                         2,000       6,000       2,000       4,000       -           -           -
  stuffing envelopes            600         2,000       600         1,400       -           -           -
  looking up addresses          400         1,200       400         800         -           -           -
reserves staff                  18,267      18,267      14,763      -           -           -           3,504
equipment
  1030 system                   7,182.48    7,182.48    14,606.04   -           7,423.56    -           -
  2260 terminals                -           1,682.64    2,243.52    1,682.64    2,243.52    -           -
  share of 2848                 -           3,733.20    3,733.20    3,733.20    3,733.20    -           -
  2741 terminals                -           -           2,176.80    -           2,176.80    -           -
totals                                                              $40,428.84  $38,257.08  $3,440      $6,964

net saving in annual cost of batch system over:
  phase i     $36,988
  phase ii    $31,293

appendix 2

gross computer operating costs during phase i (costs shown include all circulation runs)

month        cost         cpu hrs.
nov. 1970    $5,402.57    36.0241
dec. 1970     4,265.12    28.4410
jan. 1971     3,605.33    24.0419
feb.          3,937.78    26.2595
march         4,349.41    29.0043
april         2,981.39    19.8820
may           2,421.39    16.1487

average monthly operating cost of phase i over seven months:    $3,851.85
average monthly operating cost of former batch system:          $3,100.00

appendix 3

development costs for phase ii completion

present system (phase i) converted to minerva with new file organization, etc.; interface to batch system
  systems (2 months, 7 days)
    computing centre                                                 $ 1,800
    library                                                              200
    subtotal                                                           2,000
  programming and systems tests (7 months)                             5,600
  est. pacific western consulting (minerva) at $150 per day (5 days)     750
    subtotal                                                           8,350
  computer time (est.)                                                 1,500
  forms, staff training                                                   50
  parallel runs (10 days)                                                350
  minerva total                                                      $10,250

phase ii on-line
  systems (13 months, 48 days)
    computing centre                                                  11,700
    ibm support                                                        1,400
    library personnel                                                  1,900
    subtotal                                                          15,000
  programming and systems tests (21 months)                           16,800
  pacific western consulting (10 days)                                 1,500
    subtotal                                                          33,300
  computer time (est.)                                                10,000
  forms, staff training                                                1,000
  parallel runs (33 days at $35 per day)                               1,155
  equipment rental (@ $1,200 per month additional)                     1,300
  total development phase ii                                          46,755

total system development                                              57,005
(already spent, in addition): $11,576

appendix 4 (a) original circulation system
[flowchart; legible labels: circulation master listing; ibm 1034 card punch; daily circulation system; 1030 system; circulation cards; payment cards, lost book bills, reserve bills, etc.; reserves listing by course]

appendix 4 (b) phase i
[flowchart; legible labels: create on-line loan master; inquiry and update program (status, holds & renewals); 1031 badge-card readers (3 in general loans, 2 in reserves); reserves listing by course (weekly); back-up 1034 card punch]

appendix 4 (c) proposed phase ii
[flowchart; legible label: create on-line]

ll'l~c)w a record containing the iso extended cyrillic character set. l'>l'>$c)w$c)x a record containing both the iso greek and extended cyrillic character sets.

3.4 discussion-other details

when a field has an indicator to specify the number of leading characters to be ignored in filing and the text of the field begins with an escape sequence, the length of the escape sequence will not be included in the character count. when fields contain escape sequences to languages written from right to left, the field will still be given in its logical order.
for example, the first letter of a hebrew title would be the eighth character in a field (following the indicators, a delimiter, a subfield code, and a three-character escape sequence). the first letter would not appear just before the end of field character and proceed backwards to the beginning of the field. a convention exists in descriptive cataloging fields that subfield content designation generally serves as a substitute for a space. an escape sequence can occur within a word, after a subfield code, or between two words not at a subfield boundary. for simplicity, the convention that an escape sequence does not replace a space should be adopted. one other convention is also advocated: when a space, subfield code, or punctuation mark (except open quote, parenthesis or bracket) is adjacent to an escape sequence, the escape sequence will come last.

reports and working papers

wayne davison of rlin raised the following issue. after the library of congress has prepared and distributed an entirely romanized cataloging record for a russian book, a library with access to automated cyrillic input and display capability will create a record for the same book with the title in the vernacular. (since aacr2 says to give the title in the original script "wherever practicable," the library could be said to be obligated to do so.) in such an event the local record could have all the authoritative library of congress access points. to keep this record current when the library of congress record is revised and redistributed, it would be necessary to carry the lc control number in the local record. most automated systems are hypersensitive to the presence of two records with the same control number. the two records can be easily distinguished: in the library of congress record, the modified record byte in field 008 will be set to "o" and it will not have any 066, character sets present field.
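the character-counting conventions discussed in section 3.4 can be illustrated with a small python sketch. everything here is an assumption made for illustration: the function names are hypothetical, the field is modelled as a plain string with two indicators, and the escape sequence is a placeholder of exactly three characters beginning with the escape character, as the hebrew example implies. the actual designators would be those defined by the character set proposal itself.

```python
# Illustrative sketch of the counting conventions in section 3.4.
# Names and the sample escape sequence are assumptions, not part of the proposal.

ESC = "\x1b"                  # escape character that opens an escape sequence
SUBFIELD_DELIMITER = "\x1f"   # delimiter that precedes a subfield code
INDICATOR_COUNT = 2           # indicators at the start of a variable field
ESCAPE_LENGTH = 3             # "three-character escape sequence" per the text


def first_text_position(field: str) -> int:
    """1-based position of the first text character in a field laid out as
    indicators, delimiter, subfield code, optional escape sequence, text."""
    pos = INDICATOR_COUNT
    if field[pos] != SUBFIELD_DELIMITER:
        raise ValueError("expected a subfield delimiter after the indicators")
    pos += 2  # skip the delimiter and the subfield code
    if field.startswith(ESC, pos):
        pos += ESCAPE_LENGTH
    return pos + 1


def filing_start(text: str, nonfiling_count: int) -> int:
    """0-based index where filing begins: a leading escape sequence is not
    counted against the number of characters to be ignored in filing."""
    offset = ESCAPE_LENGTH if text.startswith(ESC) else 0
    return offset + nonfiling_count


# Placeholder escape sequence and sample data, for illustration only.
SAMPLE_ESCAPE = ESC + "(2"
example_field = "10" + SUBFIELD_DELIMITER + "a" + SAMPLE_ESCAPE + "TITLE"
example_text = SAMPLE_ESCAPE + "ha-sefer"

if __name__ == "__main__":
    print(first_text_position(example_field))
    print(example_text[filing_start(example_text, 3):])
```

the text stays in logical order throughout; nothing in the sketch reverses the characters of a right-to-left title, in line with the convention stated above.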
a comparison of oclc, rlg/rlin, and wln

university of oregon library

the following comparison of three major bibliographic utilities was prepared by the university of oregon library's cataloging objectives committee, subcommittee on bibliographic utilities. members of the subcommittee were elaine kemp, acting assistant university librarian for technical services; rod slade, coordinator of the library's computer search service; and thomas stave, head documents librarian. the subcommittee attempted to produce a comparison that was concise and jargon-free for use with the university community in evaluating the bibliographic utilities under consideration. the university faculty library committee was enlisted to review this document in draft form and held three meetings with the subcommittee for that purpose. the document was also shared with library faculty and staff in order to elicit suggestions for revision.

journal of library automation vol. 14/3 september 1981

a copy of the draft was sent to each utility with a request for suggestions for correction and/or clarification of the report. each of the utilities responded promptly, and their recommendations were reviewed by the subcommittee and have been incorporated into the report as it appears here. in reading this report two considerations should be kept in mind: (1) the information is current as of december 1980, and (2) the efforts at brevity and jargon-free comparison may have resulted in oversimplification in some areas. this report is one aspect of the six-months-long decision-making process that led the university of oregon library to select oclc, inc. (now the online computer library center).

introduction

an online bibliographic utility provides computer services to member libraries who, in turn, contribute computer-readable records to a common database.
the database is a collection of catalog records input by the members and other sources such as the library of congress, the government printing office, and the national library of medicine. use of the database is online, meaning that each member library accesses the computer directly and carries out its work in an interactive, conversational manner through a computer terminal located in the library. communications with the central computer are carried over a leased long-distance telephone line. the bibliographic utility produces two primary products, catalog cards and magnetic tapes of a library's catalog records, and offers many other services for processing and bibliographic control in libraries.

in addition to providing the products and services of a bibliographic utility through the research libraries information network (rlin), the research libraries group (rlg) has three other goals: (1) to provide a structure through which common research library problems can be addressed, (2) to provide scholars and others with increasingly sophisticated access to bibliographic and other forms of information, and (3) to promote, develop, and operate cooperative programs in collection development, preservation of library materials, and shared access to research materials.

the purpose of this report is to provide an overview of considerations in selecting an online bibliographic utility and a comparison of the three utilities being reviewed by the university of oregon library. each consideration is accompanied by a brief definition or explanation, and a summary of each utility's capability in providing the necessary services or products. an attempt has been made to distinguish between currently available services and those that are planned for the future, but technological and organizational changes in the utilities have complicated this task and, in some cases, made it difficult for the subcommittee members to distinguish between operational and projected capabilities.
basic characteristics

history

oclc

oclc, inc., was founded in 1967 by the ohio college association as the ohio college library center, to be the first online shared cataloging network. it has since expanded beyond the confines of the state of ohio and is currently used by nearly 2,400 member libraries in the united states and abroad. in 1977 it adopted its present name.

rlg/rlin

the research libraries group, inc., was established in 1974 by four major research libraries. in 1978 it acquired from stanford university the ballots bibliographic data system, which became the foundation for rlin (research libraries information network), rlg's wholly-owned bibliographic utility. besides being the basis for rlg's cooperative processing activities, rlin supports its other three programs: shared resources, cooperative collection development, and preservation. rlg presently has 23 owner-members.

wln

in 1975 the washington library network began testing its online system using as its base a computerized bibliographic database that several washington libraries had been building since 1972. wln is a project of the washington state library and presently has over 60 members, primarily in the northwest.

membership configuration

oclc

oclc had 2,392 member libraries in early 1981, including about 1,300 college and university libraries, 330 public libraries, 250 federal libraries, 145 special libraries, 77 law libraries, 71 members of the association of research libraries, 168 medical libraries, 37 state libraries, and at least 48 art and architecture libraries.

rlg/rlin

in december 1980, there were 23 owner-members (21 university libraries, the new york public library, and the american antiquarian society), two associate members, two affiliate members, and several museum and three law library special members. libraries which formerly contracted for ballots cataloging services from stanford university are still being served by rlin.
these include 52 libraries using rlin for online cataloging and 136 libraries using rlin on a search-only basis.

wln

wln had 65 members in early 1981, including 34 college and university libraries, 21 public libraries, two special libraries, three state libraries, five law libraries, and the pacific northwest bibliographic center.

governance

methods of governance are of concern to libraries considering membership inasmuch as they determine to a great extent the responsiveness of the utilities to the needs of their members and the ability of members to participate in setting the direction and priorities for the utility.

oclc

a 15-member board of trustees holds the powers and performs the duties necessary for governance (including filling management vacancies and approving policy and budgets). a users' council, elected by the members, participates in the election of trustees and represents the interests of the membership in an advisory capacity. it also must ratify amendments to the oclc code of regulations and articles of incorporation. of the 69 delegates to the council, 44 are from academic libraries. various advisory groups exist representing the interests of special groups within the membership, including a research libraries advisory group. twenty regional networks contract with oclc to provide services to their members. oclc libraries in oregon participate through the oclc western service center, claremont, ca, and are served by oclc's portland office.

rlg/rlin

rlg/rlin operates through a board of governors consisting of one representative from each full member institution with the president as chief operating officer. standing committees for collection management, public services, preservation, and library technical systems & bibliographic control; and program committees for east asia, art, law, theology, and music are composed of appointees from member institutions and report to the president.
wln

an 11-member computer services council is elected directly by the online participant libraries. legal responsibility for wln resides with the washington state library commission.

financial stability

an indicator of a utility's financial stability is its proven ability to generate sufficient revenues to cover expenses with the least recourse to outside funding sources. financial stability in a utility is a concern to a library considering membership not only from the standpoint of a utility's mere survival, but because of its implications for future system developments, possible dramatic fee increases should outside funding evaporate, and maintenance of high quality services and products.

oclc

oclc, inc., is a not-for-profit corporation, with tax-exempt status having been granted under section 501(c)(3) of the internal revenue code. it is self-supporting, receiving no government or private subsidies, and issuing no stock. its revenues alone support existing operations, expansion, and research and development activities. revenues result from fees charged member libraries for products and services. oclc's estimated assets for fiscal year 1980 were over $55 million and its revenues approximately $24 million. its revenue base is its 2,400 member institutions.

rlg/rlin

the research libraries group, inc., is a tax-exempt corporation owned by its 23 owner-member institutions. revenues result from fees charged members for use of the rlin database. rlg currently must supplement this income with foundation grants and loans from stanford university, because of relatively high development costs and relatively low revenues. as of this year, nearly $5.25 million has been received in grants and a $2.2 million loan was obtained, to be repaid by august 1986. rlg has projected that in 1982-83 ongoing operating costs will be met by fee-generated income.
rlg's board of governors recently approved a new income/expense structure to take effect september 1, 1981: "operating expenses matched by rates for services; system development matched by grants and loans; program and administration matched by a program partnership fee." this new program partnership fee will be a flat annual rate for full members in the range of $20,000 to $25,000. a decline in the number of units cataloged by member libraries (due in part to decreased acquisitions budgets), which is the basis for fees charged, forced the board to institute this new fee. rlg is encouraging member libraries to seek these additional funds from institutional sources outside the libraries' own budgets. the new financial structure appears to reflect a recognition of the need for outside resources to provide for research and development for at least the immediate future, and at the same time an effort to reconcile income and expense in the areas of operating expenses and program administration. its revenue base is its membership of 23 institutions. in the past rlg has estimated that financial stability would be reached when membership reached 35, but it is unclear how the new rate structure will affect that projection.

wln

the washington library network receives revenues in the form of fees for services and products. as a division of the washington state library, it also receives some funding from the state of washington. wln has been the recipient of some outside grants, but does not appear to rely heavily upon grant monies to meet ongoing expenses or system development costs. wln would like to lessen its dependency upon the state of washington, and has taken the first step by broadening the base of its advisory committee to include out-of-state members. its revenue base is its membership of approximately 60 libraries. the committee preparing this report does not have information as to the proportion of revenues generated by fees.
however, a recent (july 1, 1980) 10% increase in service rates was put into effect for these stated purposes, among others: "to recover the cost of operation of the computer service" and to "allow a modest margin to insure stability."

track record in meeting past system development deadlines

past success or failure in meeting announced deadlines for system developments may be indicative of future performance in this regard. all three utilities are heavily engaged in research and development and, while we are primarily interested in the features that are presently available, it is also important to try to gauge what each system will look like several years from now. the amount of information available to the committee varied according to the utility, so these columns are not directly comparable, but merely suggestive.

oclc

oclc tries not to attach dates to its projections because of early failures to meet announced deadlines. however, its interlibrary loan system was implemented one year early and its searching improvements are claimed to be ahead of schedule. the planned acquisitions subsystem had been scheduled for completion in summer 1980, and is currently being tested by a small number of member libraries. the conversion of oclc's database to accommodate the new cataloging rules and include new forms of names was completed on schedule in december 1980. the serials union listing capability was also completed on time. (see p. [224])

rlg/rlin

a study dated august 1978 performed for the university of california listed planned ballots system developments with projected completion dates.
this list follows, with actual completion dates or revised projections added:*

  network file system (now called "reconfigured database" by rlin): projected january 1979; revised projection april 1981
  serials cataloging: projected january 1979; actual completion late 1979
  authority control system, phase 1: projected january 1979; revised projection spring 1981
  authority linking and control, phase 2: projected fall 1979; revised projection spring 1981
  generalized acquisitions: projected fall 1979; revised projection (in two phases) june 1981, october 1981
  serials control: projected 1980; revised projection post-1982
  library management information system: projected 1979; no projected date, no resources allocated
  book/com catalog interface: projected 1980; revised projection 1981

*since 1978 the rlg board of governors has determined the order of priorities for research and development.

wln

wln's present online system was one year late, and its acquisitions module was also late. the processing of retrospective conversion tapes which had been three months behind was current by early 1981, with the exception of two special projects. large-scale system adjustments to accommodate new cataloging rules were completed on schedule, as was implementation of roll-microfilm catalogs.

database size and components

the size and makeup of the utility's database is of concern to libraries considering membership because those factors have the greatest bearing on the library's likelihood of obtaining a large portion of its cataloging information from the system.

oclc

size. over 7.1 million bibliographic records (february 1981); books: 4.9 million (october 1979); serials: 341,000 (october 1979); other: 340,000 (october 1979); name authority records: 500,000 (est. by 1981)

formats available. books; serials; films (av); maps; manuscripts; music recordings; music scores

sources of data.
  member-contributed records
  library of congress-produced machine-readable cataloging records (marc) (1968 to date)
  government printing office-produced records (cataloged directly into oclc by gpo)
  conser records (conversion of serials, a project of 15 major libraries to produce machine-readable serials cataloging records). data are entered directly into oclc, then authenticated by the library of congress and the national library of canada.
  national library of medicine-produced records
  additional sources include the following databases: canadian marc serials; minnesota union list of serials; pittsburgh regional library center serials

rlg/rlin

size. over 3 million bibliographic records (june 1980); books: 2.5 million (june 1980); serials: 460,000 (june 1980); authority records: 1.6 million (early 1981)

formats available. books; serials; films (av); maps; music recordings; music scores

sources of data.

  member-contributed records
  marc (excluding 1968-1972)
  gpo records (to be added spring 1981)
  conser records
  cataloging records from columbia and yale universities and university of minnesota biomedical libraries, previously put into machine-readable form, have been added to rlin. records from the new york public library, northwestern and pennsylvania state universities will be added in the near future.
  additional sources include the avery index to architectural periodicals.

wln

size. 2 million bibliographic records (january 1981); authority records: 2.3 million (january 1981); holdings records: 2.3 million (december 1980)

formats available. books; serials; films (av); music recordings*; music scores*

*awaiting implementation by the library of congress.

sources of data.

  member-contributed records
  marc (1968 to date)
  gpo records
  conser records (except those not yet authenticated by the library of congress)

machine-readable records from the university of illinois will be added to wln's database on a weekly basis by mid-1981.
records from certain libraries in the southeastern library network (solinet) will be added in the future, as part of an arrangement whereby wln made its computer software package available for use by illinois and solinet.

resource sharing

interlibrary loan (ill)

ill is the process by which library materials are lent and borrowed by libraries in the u.s. and foreign countries. a bibliographic utility provides two tools to aid in this process: an online union catalog used to determine which library owns the needed material, and a message switching system used to communicate among libraries and to carry out the transaction. ill at the university of oregon library is currently accomplished using a large number of printed union catalogs and is communicated by mail or western union teletype. a bibliographic utility will not completely replace ill transactions carried out in this manner. the number of requests for materials from the library collection will probably increase due to the "visibility" gained in the online union catalog.

oclc

the oclc database provides the largest online union catalog through a holdings record listed with each catalog entry. the ill message system transfers records from the database to the lending library in a request form, automatically sends the request to up to five libraries, generates records on the status of each request, and provides statistics on ill transactions. oclc ill transactions are generally faster than traditional methods of interlibrary loan because of the ability to move data directly from the online union catalog to the request form without re-typing and the ability to have requests automatically forwarded if a library is unable to fill the request immediately. oclc's ill subsystem has been in operation for a year and participating libraries have reported general satisfaction with its performance.

rlg/rlin

the rlin database provides an online union catalog through a holdings record listed with each catalog entry.
materials not located in the rlin database may be referred to the bibliographic center at yale university for further manual searching through printed union catalogs. the rlg message system may be used to create and send ill requests to other rlg libraries, though this system is not specifically designed as a comprehensive ill support system. the shared resources program committee has recently formed a task force charged with the responsibility to create a functional specification for an automated interlibrary loan system, and to determine the priority for its implementation. rlg resource sharing policy requires members to give priority to ill requests from other rlg members, to suspend fees to members, to provide on-site access to users from members' libraries' institutions, and to provide free photocopies of non-circulating materials.

wln

the wln database provides an online union catalog through a holdings record listed with each catalog entry. this online union catalog includes the local library call number and, for serials, the specific holdings of the library. the wln resource directory is a microfiche listing of the bibliographic and holdings information in the database. wln offers no message switching system for ill, though this is their highest priority for future development. in cooperation with pacific northwest bibliographic center, wln is planning experiments with a message switching system for interim use until the comprehensive ill system is developed.

cooperative acquisitions

cooperation in purchasing library materials is done in order to minimize the duplication of expensive purchases and to ensure that important works are easily available to users of the library, whether they are actually owned or not.

oclc

member libraries may search the database to determine the holdings of particular items by other member libraries, in order to avert undesirable duplicative purchases.
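the holdings lookups described above, for interlibrary loan and for avoiding duplicate purchases, can be pictured with a small sketch. this is only an illustration under stated assumptions: the record identifiers, library symbols, and function names are hypothetical and do not come from any of the three utilities; the limit of five potential lenders follows the description of oclc's ill message system given earlier.

```python
from typing import Dict, List, Set

# Hypothetical union catalog: record identifier -> symbols of holding libraries.
# Identifiers and symbols are invented for illustration only.
UNION_CATALOG: Dict[str, Set[str]] = {
    "rec-001": {"ORU", "WAU", "CSt"},
    "rec-002": {"ORU"},
    "rec-003": set(),
    "rec-004": {"LIB1", "LIB2", "LIB3", "LIB4", "LIB5", "LIB6", "LIB7"},
}


def holding_libraries(record_id: str,
                      catalog: Dict[str, Set[str]] = UNION_CATALOG) -> List[str]:
    """Return the symbols of all libraries listed as holding the record."""
    return sorted(catalog.get(record_id, set()))


def lender_string(record_id: str, requester: str, limit: int = 5,
                  catalog: Dict[str, Set[str]] = UNION_CATALOG) -> List[str]:
    """Return up to `limit` potential lenders, excluding the requesting library.

    A request would be offered to each library in turn and forwarded
    to the next one when a library cannot fill it.
    """
    lenders = [s for s in holding_libraries(record_id, catalog) if s != requester]
    return lenders[:limit]


def is_duplicative(record_id: str, buyer: str,
                   catalog: Dict[str, Set[str]] = UNION_CATALOG) -> bool:
    """True if at least one other member already holds the item."""
    return any(s != buyer for s in catalog.get(record_id, set()))


if __name__ == "__main__":
    print(lender_string("rec-001", requester="ORU"))
    print(len(lender_string("rec-004", requester="ORU")))
    print(is_duplicative("rec-002", buyer="ORU"))
```

the same holdings data serves both purposes: a non-empty lender list means an ill request can be placed, and a positive duplicate check tells a selector that another member already owns the item.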
rlg/rlin

members actively coordinate purchases of certain categories of materials in designated fields in order to avoid extensive duplication and to ensure that at least one copy of every item of research value be acquired by a member institution. in support of this effort is an automated "cooperative purchase file," containing limited bibliographic information and acquisition decisions of rlg members for all new serials on order and for all expensive items ($500 or more). member institutions agree to develop conspectuses reflecting their level of holdings and development in certain fields (subjects, language, and formats). these conspectuses are time-consuming to develop. a survey of holdings in chinese, japanese, and korean languages has been finished by 12 members. older members have completed language and literature, fine arts, philosophy, and religion. history is expected by march 1981, to be followed by the hard sciences. based upon these conspectuses, rlg members will build a system-wide collection development policy. new members are expected to begin work on their conspectuses as soon as possible, but not necessarily immediately after joining rlg.

wln

members may search the database to determine the holdings of particular items by other member libraries, in order to avert undesirable duplicative purchases. libraries may also search the in-process file to determine if items are on order by one of the 23 libraries using wln's acquisitions subsystem.

support for collection development activities

a bibliographic utility is potentially useful for collection development in that it provides a large file of bibliographic records that may be searched to assist in a) determining the existence of published materials in specified categories (on a particular subject, by a particular author, in a particular series, for example), and b) obtaining correct bibliographic information about specific items to assist in ordering them. important features in a utility in this regard are database size and variety of access points (subject, author, series titles, etc.).

oclc

useful access points by which the database may be searched include:
• personal author
• corporate author
• title
• series title
• variant names (e.g. clemens or twain)
• conference names

the database must be searched using a "search key" (a code based upon a sequence of initial letters in the words to be searched), not real words.

rlg/rlin

useful access points by which the database may be searched include:
• personal author
• corporate author
• conference names
• title
• series title
• subject heading or call number range (excluding items cataloged by the library of congress)
• publisher, using a truncated isbn (international standard book number) [restricted to items cataloged by the library of congress]

a search of rlin is likely to produce multiple records for particular items because an item held by more than one member will be displayed for as many libraries as have cataloged it through the system. it is projected that by april 1981, rlin's "reconfigured database" will have solved that problem by attaching holdings information to one unified record. it will also have merged the two bibliographic subfiles (library of congress and member cataloging) so that access by subject heading, call number range, and isbn will be available for the entire database.

wln

useful access points by which the database may be searched include:
• personal author
• corporate author or corporate author keyword (keyword searching permits the user to search for items using either the full heading: american society for information science; or words from the heading: "society" and "information." this capability is useful when the complete phrase is not known.)
• title
• corporate or conference author/title series (keyword)
• series title or truncated series title
• subject heading and/or subdivision or truncated subject heading
• corporate and conference name subject headings (keyword)

preservation of library materials

all bibliographic utilities, because of their function as a union catalog of their members' machine-readable cataloging information, have some usefulness for libraries making decisions about preservation priorities. a library may, for example, choose to give preservation treatment to item a rather than item b because item b is owned by several other libraries in the vicinity, whereas item a appears to be unique. it must be remembered, however, that many older items will not appear at all, because they were cataloged long before the utilities came into existence.

oclc

members may search holdings information in the database to determine the relative rarity of an item that is a candidate for preservation treatment.

rlg/rlin

members may search holdings information in the database to determine the relative rarity of an item that is a candidate for preservation treatment. a computerized list of members' micropreservation activities is provided. experimental programs are conducted to test new preservation technologies and applications of existing processes. preservation microfilming is being done for members by staff at yale and princeton. funds are provided to members for preservation activities. these activities are part of rlg's preservation program, one of its four major programs.

wln

members may search holdings information in the database to determine the relative rarity of an item that is a candidate for preservation treatment.

technical processing

acquisitions

the steps by which the library purchases books and other materials include:
1. pre-order searching to determine that a requested item is not already owned by the library or on order.
2. selecting a dealer likely to be able to supply desired item.
3. placing the order.
4. receiving the item.
5. clearing the order records.
6. processing the invoice for payment.
7. maintaining precise accounting of all book funds.
8. inquiring about the status of items which are not received when expected.
9. cancelling orders and adjusting accounting records when items are not available.

at the uo most acquisitions forms and files are created and maintained manually. in an automated acquisitions system the placing of the initial order generates an acquisition record for each item, which is updated as the item moves through the cycle outlined above. this eliminates the need for maintaining separate files according to the status of an order.

oclc

operational. oclc has an online name-address directory which presently can be searched while using other oclc subsystems. this file contains information about publishing, educational, library, and professional organizations and associations. this information will be automatically transferable to forms being produced online.

planned. oclc's acquisitions subsystem, which is presently being tested by selected member libraries, is projected to be generally available in spring 1981. when operational the acquisitions subsystem will permit users to:
place orders for all types of bibliographic materials (forms generated will be sent directly to supplier with copy to library)
renew subscriptions
request publications or price quotations
create deposit account orders
send prepaid orders
cancel orders
create and adjust fund records
receive periodic fund reports

rlg/rlin

operational. rlin does not have an operational acquisitions subsystem. stanford university is continuing to use a system developed as part of ballots.

planned. the rlg board of governors has approved functional specifications for an acquisitions subsystem to be introduced in two phases.
by june 1981, rlin plans to have a centralized in-process file which will contain records of all new orders, gifts, subscriptions, etc. of members, and will be able to support non-accounting aspects of the acquisitions process. the capability to store and maintain an online book fund accounting system will be achieved in october 1981. rlin expects to be able to support all files, processing, and products necessary to establish, coordinate, and monitor materials acquisitions from the point of selection decision, request, order, or receipt through completion of technical processing activity.

wln

operational. wln's acquisitions subsystem, which has been operational since may 1978, is comprised of four files:
1. in-process file which supports the majority of acquisitions activities.
2. standing orders file which has records for subscriptions and other items which are renewed or reordered on a continuing or periodic basis.
3. name and address file which contains names and addresses of book dealers and other vendors, main libraries, branch libraries, etc.
4. account status file which provides capability to maintain up-to-date accounting. information keyed into the terminal during the day is entered against the accounts nightly and is reflected in the account totals available online the following day.

records of completed transactions are transferred to a magnetic tape history file and can be used for generating statistical and other reports. with each step of the order cycle, appropriate forms and reports are generated. special system reports reflecting the status of the four files may be generated on request. instructions entered at the time of the initial order provide for automatic generation of notification forms for individuals requesting the specific item being ordered or inquiry notices for materials not received after a specified period.

planned.
further refinements of the procedures and capabilities of the system.

cataloging

the creation of a cataloging record involves:
1. describing an item
2. assigning headings for names of persons or organizations and titles by which the user might be expected to seek the item in the catalog
3. assigning a unique call number which will place the item with others of a similar nature, and
4. assigning subject headings which reflect the content of the item.

because most libraries collect many of the same materials, the concept of sharing the responsibility for cataloging was developed which makes materials available more quickly at reduced cost. with the establishment of national and international cataloging rules and standards, and the growth of large online computerized databases, it is becoming increasingly feasible to have each item cataloged only once with that cataloging information available for all libraries to use. the library of congress catalogs approximately 250,000 titles per year into machine-readable form. this cataloging is available through each of the bibliographic utilities and may be used for the creation of local catalogs. when the library of congress has not yet cataloged a specific item, a utility member library may prepare the cataloging according to specified standards and enter its cataloging into the database for use by other member libraries and for its own catalog.

another aspect of the cataloging activity is the creation of a local database which can be used as the basis of not only the local library catalog, but also of a local circulation, acquisitions, and serials system, as well as for regional union catalogs. in order to provide total access to a library's collection in this machine-readable database, information concerning every item in the library must be entered into the system. this process is called retrospective conversion.
during the retrospective conversion process the library can choose to eliminate existing inconsistencies in the treatment of library materials including reclassifying books so that most materials are retained in one main classification system. the university of oregon library has as a long-term goal completing total retrospective conversion of its collection so that all materials can be searched and located in an online catalog.

oclc

operational. oclc's online cataloging subsystem has been operational since 1971. based on the experience of similar libraries, the university of oregon library might expect to find entries in oclc's database for over 90 percent of the items searched.* these cataloging records can be modified online or accepted as is. the local library's symbol is added to indicate that it has used the cataloging record and then presorted, alphabetized catalog cards are ordered. the cards are printed overnight and shipped on a daily basis. many oclc libraries print their call number labels by means of a printer attached to their terminal. once a cataloging transaction has been completed, it is not possible to retrieve your local modifications online in the oclc system. the record of your transaction is stored and sent to your library on magnetic tape on a periodic basis. these magnetic archive tapes can be used by a vendor or local computing center to generate a local microform or online catalog, run a circulation system, etc. it is presently possible to catalog most types of materials in the oclc system including books, serials, microforms, motion pictures, music, sound recordings, maps, and manuscripts. increased emphasis has been placed on quality control and adherence to specified standards in the creation of cataloging records, but there is no official editing of cataloging records by oclc staff.

* see footnote on page 225.

in 1979-80 nearly 45 percent of the activity on oclc's cataloging subsystem was related to retrospective conversion.
oclc's large database, extended hours of service, and special pricing schedules for retrospective conversion and reclassification make it attractive for these activities. oclc charges 60 cents per retrospective conversion record during hours of peak system activity (prime time) and five cents per retrospective conversion record during less busy hours (non-prime time).

planned. oclc continues to explore means of improving quality control. after moving their central facility to new quarters in early 1981, oclc will reconsider the possibility of storing and displaying the number and location of local copies of a title.

rlg/rlin

operational. at this time the university of oregon might expect to find cataloging available for 70 to 90 percent of its ongoing work in rlin.† a search of rlin's database retrieves multiple records because each library's records are stored separately. the library selects the desired record, modifies or accepts it, enters the library's symbol, and orders cards which are printed nightly and sent in presorted, alphabetized batches. no call number labels are produced, and it is not presently possible to print labels from the terminal. local library modifications are accessible online. magnetic tapes of cataloging transactions may be purchased and used to create local online or microform catalogs. most materials may be cataloged with rlin including books, serials, microforms, motion pictures, music, sound recordings, and maps. member libraries agree to catalog in conformity with rlin standards, but there is no formal editing of records by rlin staff on an ongoing basis. sample quality checking is the responsibility of a newly-created position of quality assurance specialist.

with only 23 owner-members, rlg must carefully consider the impact on the system of allowing individual members to undertake retrospective conversion projects.
each project must be approved by the board of governors, and members are encouraged to seek outside financial support rather than asking rlin for reduced rates. rlin has just received a 1.25 million dollar grant including $600,000 to support retrospective conversion projects. rlin does not charge for retrospective records which are completely recataloged and upgraded with the book in hand. the prices for other levels of retrospective conversion cataloging range from fifty-five cents to $1.85 per record.

planned. in april 1981, rlin plans to reformat its database so that there will be only one copy of each cataloging record. member libraries' symbols and local cataloging information will be displayed with the appropriate records.

† a wide range of success rates for searching each system are cited in the literature, each dependent on the sample procedures used. the university of oregon library had 100 items searched against each database. this sample excluded books with printed library of congress card numbers, and included books, serials, microforms, music scores, recordings, documents, and non-book materials. of this sample oclc found 96, rlin found 65, and wln found 38. the range of figures cited in this report allows for variation between studies cited in the literature, word-of-mouth reports from librarians using these systems, and the university of oregon library's own sample. an analysis of this sample is being prepared. recent comparisons of searching success are found in the following: linking the bibliographic utilities: benefits and costs, submitted to the council on library resources ... by donald a. smalley [and others]. columbus, ohio, battelle, 1980; matthews, joseph r., "the four online bibliographic utilities: a comparison," library technology reports 15:6 (november-december 1979), p. 665-838; tracy, juan i. and remmerde, barbara, "availability of machine-readable cataloging: hit rates for ballots, bna, oclc, and wln for the eastern washington university library," library research 1:3 (fall 1979), p. 227-81.

wln

operational. based on the experience of others, the university of oregon library might currently expect to find cataloging records available for 50 to 70 percent of its ongoing work in the wln database.* libraries search wln's database, accept or modify the cataloging records, and order cards and labels which are printed nightly and shipped weekly. (card sets are not presorted for filing.) local cataloging information is accessible online through the library's wln terminal. magnetic tapes of a library's cataloging transactions may be purchased to run a local online or microform catalog. wln also provides microform catalogs on either microfilm or microfiche. books, serials, and audio-visual materials, but not music, sound recordings, and maps may be cataloged on wln's system. libraries cataloging in wln must conform to well-defined wln standards. new cataloging records go through an edit cycle and are reviewed by central wln staff before being added to the wln database. presently this review takes about two weeks. during this period, the cataloging record may not be retrieved online.

the wln batch retrospective conversion subsystem has been operational since august 1980. using this system a library enters brief cataloging records which are collected by the system and searched later as a unit through the wln database. records for which a match is found are billed at six cents. records not matched are billed at one cent and may be searched again at a later date. over 30 wln libraries are using this capability, which can be made available to non-members under special circumstances.

planned. wln is considering dispersing among selected member libraries responsibility for editing member-created cataloging records. wln will make music cataloging available within the near future.
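the per-record retrospective conversion charges quoted above lend themselves to a short worked calculation. the sketch below is a rough illustration only: the function names and the sample volumes are hypothetical, the rates are the ones cited in this report (wln batch: six cents matched, one cent unmatched; oclc: 60 cents prime time, five cents non-prime time), and it assumes each wln record is searched once, although unmatched records may be searched again later.

```python
from decimal import Decimal

# Rates as cited in the report, in dollars per record.
WLN_MATCHED = Decimal("0.06")
WLN_UNMATCHED = Decimal("0.01")
OCLC_PRIME = Decimal("0.60")
OCLC_NON_PRIME = Decimal("0.05")


def wln_batch_cost(records: int, matched: int) -> Decimal:
    """Cost of one batch pass: matched records at 6 cents, the rest at 1 cent."""
    if not 0 <= matched <= records:
        raise ValueError("matched must be between 0 and records")
    return matched * WLN_MATCHED + (records - matched) * WLN_UNMATCHED


def oclc_retro_cost(prime_records: int, non_prime_records: int) -> Decimal:
    """Cost of retrospective conversion split between prime and non-prime hours."""
    if prime_records < 0 or non_prime_records < 0:
        raise ValueError("record counts must be non-negative")
    return prime_records * OCLC_PRIME + non_prime_records * OCLC_NON_PRIME


if __name__ == "__main__":
    # Hypothetical project of 10,000 records, half of which find a match in wln.
    print("wln batch:", wln_batch_cost(10_000, matched=5_000))
    # The same 10,000 records converted on oclc entirely in non-prime time.
    print("oclc non-prime:", oclc_retro_cost(0, 10_000))
    # ... or entirely in prime time.
    print("oclc prime:", oclc_retro_cost(10_000, 0))
```

the figures are not directly comparable between utilities, since the proportion of records found differs from one database to another and the charges cover different kinds of work.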
* see footnote on page 225.

serials check-in

serials are publications issued in successive parts bearing numerical or chronological designations which are intended to be continued indefinitely. they include periodicals; newspapers; annual reports and yearbooks; journals, memoirs, proceedings and transactions of societies; and numbered series. the average research library will have between 15,000 and 20,000 such titles. precise data must be maintained to enter each issue as received, to discover missing issues, to request replacements for missing issues, to monitor accounting information, to renew or cancel subscriptions, and to maintain binding information. serials files contain such information as title, relationship to earlier publications, name and address of publisher, volumes the library owns, call number and location, date, volume, and number of each issue, date each issue was received, subscription dates, price, etc. at the university of oregon library all of this information is maintained in manual files. once the serials check-in operation is computerized, it is possible to generate a wide variety of serials finding lists, analyses of serials subscriptions by subject, location, department, etc., and to provide current serials information online.

oclc

operational. oclc introduced its serials control subsystem in 1976 and improvements to the system in 1979. participants create online local data records with information necessary to monitor and control each issue of each serial received by the library. institutions can check-in currently received issues online. a recent ancillary to this system is the ability to create and maintain online a cooperative record of serials owned by any group of institutions (a union list of serials).

planned. oclc plans to continue upgrading the capabilities of its serials control subsystem as needed.

rlg/rlin

operational. none.

planned.
automated serials check-in is one of several items listed for consideration after current development activities are released, probably in late 1982. no resources are presently committed to this project.

wln

operational. while wln has no current serials check-in capabilities, it does support maintenance of serials subscriptions in the acquisitions subsystem, including automatic renewal and reorder reminders. wln also produces union lists of serials.

planned. wln is investigating existing commercially-created check-in systems to see whether they can purchase an existing system to incorporate into wln's services.

management information

precise up-to-date information concerning library operations can be very useful in planning improvements in library services and in attaining efficient utilization of available personnel, resources, and materials. without the computer, the laborious record-keeping necessary to obtain useful management information almost negates the benefits of having the information.

oclc

operational. oclc produces cataloging, interlibrary loan, and serials check-in system use and system performance statistics on a regular basis. libraries can make local arrangements to create additional analyses of the information stored on subscription archival tapes of their local cataloging activity. oclc offers semimonthly, monthly, or quarterly accession lists of new materials cataloged by each library. these lists may be in call number or subject sequence. oclc has produced some special studies for institutions based on their cataloging records.

planned. when the acquisitions subsystem is operational, libraries may choose to receive a cumulative, monthly fund activity report and a periodic, cumulative fund commitment register. these reports will provide institutions with current financial control data. oclc plans to continue to develop its ability to provide management information.

rlg/rlin

operational.
system use statistics are provided in the form of the monthly invoice, which may be used to monitor cataloging and public service activity, and may be broken down into appropriate accounts by pre-planning. lists in call number order of materials cataloged by a library into rlin could be produced from local printers attached to the terminal.

planned. the generation of management information is a future development project; no special management reports are prepared presently. among the management reports included in the specifications for the acquisitions subsystem, projected for implementation by october 1981, are status reports on in-process files, materials awaiting receipt, materials received, and book fund balances.

wln

operational. wln produces aggregate system activity reports monthly, but does not analyze the cataloging activity or subject holdings. wln's acquisitions subsystem can be used to produce acquisitions-related management reports concerning account transactions, account history, standing orders, renewals and reorders, receipts, detailed encumbrances, etc. a microform accession list by title is available. a general-purpose text-editing facility may be used by management to maintain data not derived from wln operations and to produce formatted reports of this data.

planned. wln is developing the capability to store and maintain detailed collection information for each library online, including copy numbers and location symbols for each copy of a title owned by a library. no specific management information plans have been outlined at this point.

public services

reference use of the utility's terminal

a bibliographic utility has potential for use in library reference services in three major areas:

1. verification of bibliographic information. the utility's database may be searched for cataloging information not in the uo library catalog.
a verification search is made to locate a complete catalog description of a specific, known item and is carried out most easily using one of the unique numbers assigned to a publication (library of congress card number, international standard book number, etc.). if one of these is not known, a combination of author and title words, or a "search key"* based on author and title is used to retrieve the information. verification places a greater reliance on the quality of bibliographic information in the utility's database than on search techniques used to locate the information.

2. compilation of subject bibliographies. the utility's database is searched through words in the titles and subject headings in a bibliographic record in order to produce a list of materials on a given subject. this subject query can be modified using the logical relationships and, or, and not to indicate, respectively, limitations, synonyms, or exclusions in the search. the ability to obtain a printed list of references is convenient, if not required.

3. compilation of author bibliographies. the database is searched to find all material created by a particular individual or corporate body. the size of the utility's database is a major consideration, as is the source of the cataloging found in an author search. again, a printed list is necessary.

* a search key is a code based on a certain number of characters drawn from a particular element in the bibliographic reference. for instance, to find a record for william manchester's american caesar, an author/title search key using the first four letters of the author's name and the first four letters in the title would be manc,amer. various combinations of letters are used to search author names, titles, or author/title combinations. a search key may not necessarily be unique to a given item, and may retrieve other items beside the one desired.

oclc

the oclc database can be searched in a variety of ways to support reference services, though there is no subject search capability in the system. the following access points may be used in a search:
1. lc card number
2. international standard book number (isbn)
3. international standard serial number (issn)
4. coden (an abbreviation developed by chemical abstracts service for designating periodical titles)
5. government documents number
6. oclc identification number
7. personal author (search key, not full words)
8. corporate author (search key)
9. performer (search key)
10. title (search key)
11. author/title (search key)
12. series title (search key)
13. variant names (search key)
14. conference names (search key)

searches may be restricted by year or by type of material, such as books, manuscripts, maps, etc. the logical operators and, or, and not are not used in oclc. the oclc search system is primarily based on search keys and is best utilized to locate a known item. local printing is available on any oclc terminal so equipped. there is one standard print format offered.

rlg/rlin

the following access points may be used in a search of the rlin database, though not all are currently active in each subfile of the database:
1. lc card number
2. isbn
3. issn
4. coden
5. government documents number
6. rlin identification number
7. call number (complete or truncated)
8. recording label number
9. personal author
10. corporate authors or conference names (keyword or phrase)
11. title words
12. subject headings (keyword or phrase)
13. music publisher

truncation (searching of partial entries) is available to aid in searching incomplete entries and the logical operators and, or, and not may be used to broaden or restrict a search. local printers may be attached to the rlin terminals. a variety of print formats is offered. plans include unified search access points for all subfiles of the database as of april 1981.

wln

the following access points may be used to search the wln database:
1. lc card number
2. isbn
3. issn
4. wln identification number
5. personal author
6. corporate authors or conference names
7. title words
8. series title (complete or truncated)
9. corporate or conference author/title series (keyword)
10. subject headings (complete or truncated)

for a variety of reasons, the wln search system is the most powerful of the three utilities. truncation is available and the logical operators and, or, and not may be applied to broaden or restrict a search. records may be printed locally in a variety of formats on any wln terminal so equipped. wln will also provide printing at the central computer for reference bibliographies. wln search software may be purchased for local database management applications (see the section on online public catalogs).

links to other computerized services

there are presently over 150 reference databases available through commercial computerized reference service vendors. during the last ten to fifteen years, standard bibliographic indexing and abstracting publications such as chemical abstracts, historical abstracts and dissertation abstracts international have used computerized methods to organize and print references to periodical articles, reports, dissertations, conference papers, etc. the vendor creates a computer searchable version of the reference database and makes it available to libraries for a fee based on their use of the computerized search system. membership in a bibliographic utility can provide two benefits in the use of other computerized reference services:
1. discounts on fees through membership in large group contract administered by the utility.
2. access to the reference vendor's computer through the utility's terminal and communication network.

oclc

oclc's affiliated online services program provides access at discounted rates to the information services of bibliographic retrieval service (brs), lockheed information systems (lis), and the new york times information bank.
oclc's communications network does not yet permit users to link to the hosts using an oclc terminal, though this capability is anticipated in the near future.
rlg/rlin
rlin does not offer a formal program in this area, though the rlg 40 terminal is compatible with other information retrieval systems.
wln
wln does not offer a program in this area, but anticipates offering access to brs, lis, and new york times information bank.
circulation
none of the bibliographic utilities under consideration currently support circulation functions on their computers. however, each system can provide a machine-readable archive tape of our cataloging information to be used in developing a computerized circulation system. in order to keep track of circulation transactions, it is necessary to have complete retrospective conversion of the uo library catalog. another important consideration is the transferability of data between the utility's computer and the circulation computer.
oclc
oclc anticipates offering support for local circulation systems on their computer for member libraries and will demonstrate their system in mid-1981. oclc data has been successfully transferred to many local circulation systems.
230 journal of library automation vol. 14/3 september 1981
rlg/rlin
rlin does not anticipate offering local circulation services for member libraries. rlin data has been successfully transferred to several local circulation systems.
wln
wln does not anticipate offering local circulation systems on their computer for member libraries. wln data has been successfully transferred to local circulation systems and an agreement has been reached with dataphase, a computerized circulation system vendor, to discount purchase of their system by wln member libraries.
public online catalogs
again, none of the bibliographic utilities under consideration currently support public online catalogs of an individual library's collection.
a public online catalog requires further programming in order to make it easy for the public to locate materials of interest without extensive training; the bibliographic utility's searching procedures are too esoteric to be used by the general public. as in circulation, issues of data transferability and full retrospective conversion of the uo library's catalog are paramount.
oclc
oclc does not currently encourage public access to their database and does not support use of local online catalogs on their computer due to the tremendous demand for computer resources exerted by 2400 member libraries. oclc and rlg/rlin are participating in a study of user requirements for a public online catalog. oclc data has been successfully transferred to several local online catalogs, including eugene public library's circulation and online catalog system, ulisys.
rlg/rlin
rlin anticipates being able to offer public access to their database. they are participating in a study with oclc of user requirements for such a system, but no date has been announced for the development of this capability in rlin. rlin data has been successfully transferred to a local public online catalog at northwestern university.
wln
wln does not believe that a local online patron accessed catalog should be provided through the wln computer, even though they anticipate having such a capability within one year. instead, they encourage libraries to develop local systems for public access to the online computerized catalog and to obtain data from the wln cataloging system. the university of illinois is adapting the wln computer search and database management software to provide a local online catalog and computer-assisted instruction in its use for the public.

checklist for cassette recorders connected to crts
prepared by lawrence a. woods: purdue university libraries, west lafayette, indiana, for the technical standards for library automation committee, information science and automation section, library and information technology association.
introduction
a data cassette recorder connected to a printer port is an effective, low-cost method of collecting data in machine-readable form from display terminals such as the oclc 100/105. it is important that a data recorder be used rather than an audio recorder although the cassette itself can be a good-quality audio tape. it is also important to note that the data recorded on the tape are not the same as the data originally transmitted to the display terminal, but are simply a line-by-line image of what appears on the screen. a typical installation will have a minimum of two devices: one attached to the display terminal to collect data, and one attached to a printer or an input device to another computer for playback of the data. there are more than 150 various data re

book reviews
introduction to information science, tefko saracevic, ed. new york: bowker 1970, 776 pp. $25.00
the editor has put together a large volume consisting of 776, 8½ x 11 pages and weighing almost 5 pounds. it comprises 66 different articles written by almost as many authors and covers the period from 1953 to 1970. two-thirds of the articles were written during the period 1966-1969. in short, it is a collection of a large number of papers mostly from the last few years having to do in some way with information science or more properly, with information systems. the papers generally are good ones and in some cases have already become acknowledged classics. in a few cases i am a bit puzzled about their inclusion in a volume of this type. in the few months since i have had this book i have already found numerous occasions to consult several of the articles. some of the other papers which i have not seen recently i have enjoyed reading again.
the book is divided into four parts, which are further subdivided into thirteen chapters. the four parts are basic phenomena, information systems, evaluation of information systems, and a unifying theory. although the chapter headings are too numerous to list they include such topics as notions of information, communication processes, behavior of information users, concept of relevance, testing, as well as economics and growth. by virtue of the parts, chapters, and articles the editor has provided a type of classification system or structure for information science without attempting to define information science. interspersed between each of the parts and chapters is up to a page of introductory and explanatory material provided by the editor.
in a volume of this type it is important to recognize what the volume is and what it is not. as i have mentioned, it is a good anthology of important articles related to information. it is not, as the title implies, an introduction to information science. the papers are by and large unrelated to each other and the introductory comments by the editor do little to provide a unifying relationship. furthermore, the overall scope of the articles is generally quite limited and, although the editor implies it is not so, tends to equate information science to information systems.
222 journal of library automation vol. 4/4 december, 1971
the final paper in the volume by professor william goffmann is listed by the editor as part four, a unifying theory. the precise title of the chapter is somewhat less ambitious: namely, "a general theory of communication." the paper is an unpublished one (although similar papers by the author have been published elsewhere) and relates communication in a general sense to the theory of epidemic processes. although the theory is an interesting one, it would hardly qualify as a unifying theory for information science.
it certainly does not provide the unifying relationships among the various articles included in the text. my guess would be that other qualified individuals, in putting together a similar volume, would have included many different articles. this, however, is the nature of the field at this time. by comparison note the recently published volume key papers in information science, arthur w. elias editor. this book, although admittedly serving a somewhat different purpose, contains 19 papers with only a single paper in common with those of this particular volume.
in summary, this is a good collection of relevant and useful articles in information science. it is probably desirable that they be included in a single volume. serious students, educators, and research workers will find this volume to be of interest. as a reference book it will be quite useful. the book is not, however, an introduction to information science. the novice, the student, and the casual reader will probably be disappointed and confused, and in some cases might even be misled.
marshall c. yovits

information processing letters. north-holland publishing company, amsterdam. vol. 1, no. 1, 1971. english. bi-monthly. $25.00.
this journal is published by a most reputable company and has a most impressive international list of editors and references. the affiliation of editors illuminates the orientation of the journal: six of them are from departments of mathematics, computer science or cybernetics and two are from ibm laboratories. understandably, the journal is devoted basically to computer theory, software and applications, with a heavy accent on mathematically expressed theory related to the solution of computing problems, algorithms, etc. it is directed toward basic and applied scientists and not toward practitioners. people interested in library automation may, from time to time, find in it theoretical articles broadly related to their work, but they will have to do the "translating" themselves.
this journal follows the tradition of "letters" journals in physics, biology and some other disciplines. the papers are short; publication is rapid; work reported generally tends to be very specific, preliminary to or a part of some larger research project; usually small items of knowledge are reported. the "letters" journals are received in the fields where they appear with mixed emotions. for instance, ziman (nature 224:318-324, 1969) questions very much the need for these publications. on the other hand, they are a useful outlet for authors who otherwise would not publish these often useful bits of specific knowledge. recommended for research libraries related to computer science.
tefko saracevic

handbook of data processing for libraries. by robert m. hayes and joseph becker. new york: becker & hayes, inc., 1970. 885 pages. $19.95.
to write a universal handbook in a field so full of complex intellectual problems and simultaneously satisfy every potential reader is an impossible assignment. therefore the authors cannot be faulted for failing to satisfy everyone. they have succeeded in writing for a very important audience: administrators and decision makers. for this group, they have presented difficult technical material in a clear readable fashion, a reflection of their extensive teaching experience. for many library administrators, this handbook arrives five years too late. had it been available earlier, a large number of current automation projects might never have been authorized by management, or at least might have been conducted on a sounder basis. following a very conservative approach, the authors generally remain within the limitations of the current state of the art, being careful to distinguish that which is feasible (i.e., practical) from that which is possible. over and over again, they warn librarians about the limitations of computers and caution against excessively high expectations.
for administrators, the most useful material is in chapter 3, "scientific management of libraries," and in chapter 8, "system implementation." a reading of chapter 8 alone suffices to convey to the administrator the magnitude and complexity of even the most seemingly routine computer application in libraries. this chapter, the most important and useful in the entire book, covers planning, organization, staffing, hardware, site preparation, programming, data conversion, phase-over, staff orientation, and training. each of these topics, deserving of complete chapters in themselves, is treated briefly, but in enough detail to communicate the complexity of each component in the long stream of system development activities, all of which must be completed to the last detail for success.
there are three useful appendices: a glossary, an inventory of machine readable data bases, and a list of 115 sources for keeping up to date. bibliographic footnotes abound and each chapter ends with a list of suggested readings. however, it is surprising how many references are five or more years old; in fact, there is a scarcity of current references. for example, ballou's well-known guide to microreproduction equipment, now entering its 5th edition, is cited in the first edition of 1959. the authors have been badly served by their proofreaders. the book is marred by an incredible number of spelling errors in text, tables, footnotes and references, especially with personal names, plus incomplete citations. the index contains many entries too broad to be useful, such as: utilization of computer (1 entry), time sharing (1 entry), hardware (3 entries), technical services (3 entries). lacking from the index are name references to distinguished contributors to the literature, such as avram, cuadra, degennaro, fasana, and others. many of these names appear only in footnotes.
the book is rich in tabulated data and specifications for a variety of equipment. unfortunately, much of this equipment is inapplicable to library use, or the tabulated data is in error. table 12.25 lists several defunct or never marketed equipments, such as ibm's walnut and eastman kodak's minicard, without indication of non-availability. in table 11.22 there are extensive listings of crt terminals, most of which are unsuitable for library applications by reason of deficient character sets or excessive rentals. nine of the units listed showed rentals of over $1,000 per month, and two of these were virtually at $5,000 per month, clearly beyond the reach of any library. table 12.2 suggests the access time to one of 10,000 pages in microfiche is half a second, a figure that is off by an order of magnitude for mechanical equipment and by two orders of magnitude for manual systems. (more nearly correct figures are given in the text on page 396). table 12.21 lists several microfilm cameras designed expressly for non-library applications and not adaptable to any library purpose.
from a broader perspective, one misses several other features. is a "handbook" for the practitioner? if so, this volume is too elementary. can it be used as a textbook in a course in library automation and information science? the book contains no problems for students to attack, and except for references, no aids to the instructor. possibly it can serve as supplementary reading, for it contains far too much tutorial material (yet only ten pages of nearly 900 are devoted to flow charting). one wishes for more specifics drawn from the real world. a hypothetical case study in chapter 11 is illustrative: a 5% error rate is assumed for input of a 300,000 record bibliographic data base to be converted to machine readable form.
not revealed in the example is that a relatively low error rate in keyboarding may result in a very high percentage of records which must be reprocessed to achieve a high quality data base. each reprocessed record will consume computer resources: cpu time, core, disc i/o, tape reading and writing, etc. we know from marc and recon that the ratio of the total records processed to net yield is on the order of 3:2; i.e., each record must be processed on the average of one and a half times to get a "clean" record. the cost of this reprocessing is far beyond the 5% lost by faulty keyboarding.
the handbook will be a useful decision making tool for the generalist, a less helpful aid to the practitioner. it is hoped that a revised edition is in preparation, and particularly that the tabular material will be corrected and brought up to date. chapter 8, the heart of the book, should be greatly expanded. for the next edition, some consideration might be given to a two-volume work: the first volume for the administrator, and the second containing much more technical detail for the practitioner. if the two volume pattern is followed, a loose-leaf format with regular updating would be most helpful for the second half.
allen b. veaner

library automation: experience, methodology, and technology of the library as an information system, by edward w. heiliger. new york: mcgraw-hill book co., 1971. xii, 333 pp.
the need for a handbook and/or general introductory text on the topics of automation and systems analysis in libraries has been sorely felt for quite some time. during the past year, three have appeared (chapman and st. pierre, library systems analysis guidelines, wiley, 1970; hayes and becker, handbook of data processing for libraries, wiley, 1970 and the book here reviewed.) unfortunately, none is completely satisfactory, for different reasons.
a serious student wanting a reasonably comprehensive, systematic, and balanced treatment of these subjects will, i'm afraid, be forced to use all three of these titles and, even then, will have constant need to use supplementary materials for a number of aspects. the title being considered in this review by heiliger and henderson, if one judged only by the authors' intent as expressed in the preface, would be exactly the kind of work that we've all felt the need for. as they state on page vii, the purpose "is to provide a perspective of the library functions that have been or might be mechanized or automated, an outline of the methodology of the systems approach, an overview of the technology available to the library, and a projection of the prospects for library automation." and, indeed, if one looks at the table of contents there are four parts that closely parallel this statement of purpose. the parts themselves though, when inspected more closely, reveal not a systematic treatise or even an in-depth treatment of these topics, but rather a loosely connected series of essays, each on a fairly superficial level, discoursing on a variety of aspects associated with, or tangential to these topics. this indicates, at least to this reviewer, that the genesis of the book was a series of lectures presented and refined over a period of time by the authors. although not in itself a bad thing, here it is unfortunate to some degree because not enough effort was expended in amplifying the material with additional data, library-oriented examples, and illustrations, nor in logically integrating the various parts. part i, entitled "experience in library automation," begins by broadly citing a number of library automation projects mostly dating from the early 60's. the level is extremely superficial and the presentation not very enlightening, since only three or four projects are mentioned, and then only in passing.
immediately following are several excellent chapters describing traditional library activities (e.g., acquisition, cataloging, reference, etc.) in functional terms. the approach, though extremely simple, is for the most part effective and is only marred by occasional, overly condescending statements such as "library filing is a very complicated matter" or "reference librarians use serials literature extensively." unfortunately, in the 104 pages of this section there is not one illustration.
part ii, "methodology of library automation," attempts to describe the general approach and techniques of systems analysis. in a number of ways, this is the best part of the book. unfortunately, the concepts that are so simply and succinctly described are only indifferently related to activities that will be familiar to librarians. as a brief essay on the objectives and concepts of systems analysis, it is quite adequate, but as a discussion of how they relate to library problems, it is totally inadequate and often misleading.
part iii, "technology for library automation," is probably the least informative part of the book, giving the reader virtually no practical information. all of the important and obvious technological concepts are listed, but are dismissed with what oftentimes is little more than a brief definition. the one exception to this is chapter 13, entitled for no apparent reason, "concepts." this chapter is in fact an innovative and thought-provoking view of a library as a data-handling system. one wishes that this chapter had been amplified and treated more fully.
part iv, "prospects for library automation," is the least effective part of the book, having in my mind only one merit: it doesn't tack on a hollywood-style happy ending. the authors' view of the 70's, as far as can be inferred from this too short section, is cautious and mundane.
these will be, i'm convinced, the overriding characteristics of automation efforts for the next several years. i only wish that the authors had elaborated more fully on these points and presented their views more coherently.
the book is augmented with a 61-page bibliography (1,029 citations), which, though reasonably current, is of dubious worth because it is neither annotated nor particularly well balanced. certain classics, such as bourne's methods of information handling, or information storage and retrieval by becker and hayes, and certain current, basic items, such as cuadra's annual review of information science and technology and the journal of library automation, are not listed. each chapter is accompanied by a "suggested reading list" wherein materials more or less pertinent to the subject of the chapter are listed. a glossary of terms in three parts (a total of 36 pages) is also included and, though difficult to use because it is in three alphabets and interspersed with the text, provides short but very adequate definitions. unfortunately, several jargon terms used in the text itself are not included; one that was most irritating to this reviewer is the term "gigabyte" which to my knowledge has very little currency among the cognoscenti.
on balance, library automation is a title that should be recommended for a wide range of readers. though it will probably have little to offer experts in the field, it does have value as a text for library students or a general introduction for the average, non-technical librarian.
paul j. fasana

sistema colombiano de informacion cientifica y technica (sicoldic). a colombian network for scientific information, by joseph becker et al. quirama, colombia: may-june 1970. 59 p. mimeo.
the task of the study team which produced this report was to present "an implementation plan for strengthening the scientific communication process in colombia by providing a permanent systematic mechanism to function in the context of colombia's internal needs for scientific and technical information in government, industry, and among the research activities in higher education." more specifically, the expressed goal of such a mechanism is "to develop a network which will permit any scientific or technical researcher, in government, industry, or university, to access the total information resources of the country without regard for his own physical location." the study was completed in two months (according to the cover dates) and comprised four areas of investigation, namely: 1) to elucidate the advantages of developing a centrally administered national network including three levels of network nodes and a technical communications plan; 2) make an inventory of universities, institutes, telecommunications and computer facilities in colombia; 3) recommend a mix of these factors to produce specific services, and 4) propose a seven-year budget.
the republic of colombia is about the size of texas and california combined, and its population is about 1 million less than new york state. most scientific and technical workers are located in five major cities, and the country is divided into six administrative zones. within these zones twenty universities and forty-four institutes were inventoried by the study team with respect to specialization, faculty, book collections and the like. from these universities and institutes, five primary and seven secondary nodes were named to be connected by means of a telex communications system. the telex connections are not to be computer-mediated in the foreseeable future, but used for interlibrary loan and other messages. (there were two teleprocessing systems operating in colombia at the time of the study.)
basic recommendations are: that a governmental unit be established with responsibility for directing the development of sicoldic; that this unit, with a high echelon board of directors, should produce several directories, bibliographies and union lists, and publish a monthly catalog of government-sponsored scientific and technical research. in addition, a manual for use of the telecommunications system should be produced. the proposed budget is about $250,000 (4.5 million pesos) for the first year, graduating to a 25-fold increase by 1976.
in some aspects the sicoldic plan follows the pattern of some state library development plans being implemented in the u.s. the advantage of central control of information resources planning and fund control by the sicoldic group, with fairly direct access to high governmental authority, provides reasonable insurance for support of the plan, especially since these services contribute to the economic and scientific advance of colombia. there is no indication of the acceptance of the plan by colciencas, the governmental unit which commissioned it. of the sixty references in the bibliography, spanish publications predominate.
ronald miller

cooperation between types of libraries 1940-1968: an annotated bibliography, by ralph h. stenstrom. chicago: american library association, 1970.
this bibliography is an effort to sift, organize and describe the literature of library cooperation produced during the period 1940-1968. two criteria governed the selection of the 348 books and monographs listed: 1) they must deal with cooperative programs involving more than one type of library, and 2) they must describe programs in actual operation or likely to be implemented. although most of the cooperative projects described are located in the united states, other countries are represented when the material about them is written in english.
cooperative programs in the audio-visual field are included. the annotations explain the nature of the cooperative projects and give the names of participating libraries. an appendix describes briefly about 35 recent cooperative ventures not yet reported in the literature, which the editor learned about through an appeal published in professional journals. entries are arranged chronologically to facilitate direct access to the most recent developments and to permit tracing the evolution of a particular project over a period of time. three indexes provide approaches to the material by 1) name of author, cooperative project or library organization, 2) type of c

long congfa men wuhan, hubei family-level intangible cultural heritage project-inheritor of wuhan woodcarving ship model

conclusion
this article reviews the classification system of china’s intangible cultural heritage items and the integration of existing knowledge organizations and other types of resources in order to design a set of more comprehensive and reasonable metadata standards with a certain degree of scalability, and applies these standards to the actual organization of intangible cultural heritage knowledge. further research is needed to effectively protect and use the digital resources of intangible cultural heritage. additional discussion is also needed on updating and promoting existing metadata specifications and on multidimensional aggregation of existing resources to achieve knowledge discovery. through the integration of linked data and the sharing of existing digital resources, this article can encourage scholarship and conversation that leads to the preservation of china’s intangible cultural heritage.

funding statement
this work was supported by the hubei key laboratory of big data in science and technology.
this work was also supported by the palace museum’s open project in 2021, research on the dissemination of intangible cultural heritage of the palace museum from the perspective of artificial intelligence. this subject has been funded by the mercedes-benz star wish fund of china youth foundation. information technology and libraries june 2022 research on knowledge organization of intangible cultural heritage | qing, tan, sun, and chen 10 endnotes 1 feng xiangyun, xiao long, liao sansan, and zhuang jilin, “a comparative study of commonly used foreign metadata standards,” journal of university libraries 4 (2001): 15–21, https://kns.cnki.net/kcms/detail/detail.aspx?dbcode=cjfd&dbname=cjfd2001&filename=d xts200104005&uniplatform=nzkpt&v=v9a8p-rcf4csl9yoaqskj5nbnfjrmwjhsaoj2pnqq9jl0tdsle3ntrjrzeto32h. 2 ma min, “metadata—the basic format for organizing online information resources,” information science 4 (2002): 377–79, https://kns.cnki.net/kcms/detail/detail.aspx? dbcode=cjfd&dbname=cjfd2002&filename=qbkx200204012&uniplatform=nzkpt&v=yem o5mxwo0mzg5mkz6qml62oruvfchtdy2slxdbn_hesfdvspxuc-naorq0v0ikl. 3 “specification data function requirements” november 24, 2014, http://eprints.rclis.org/13191/1/frad_2009-zh.pdf. 4 lan xuliu and meng fang, “metadata format analysis of digital cultural resources,” modern information 33, no. 8: 61–64, 102, https://kns.cnki.net/kcms/detail/detail.aspx? dbcode=cjfd&dbname=cjfd2013&filename=xdqb201308015&uniplatform=nzkpt&v=skct nh3sg04qrgzqahxdh3nj2hmpk2ppmjbp4ymnpdq-phf2ffjwxpp5vcns9qc9. 5 murtha baca, “practical issues in applying metadata schemas and controlled vocabularies to cultural heritage information,” cataloging & classification quarterly 36, no. 3–4 (2003): 47–55, https://doi.org/10.1300/j104v36n03_5. 
6 yi junkai, zhou yubin, and chen gang, “research and practice of scalable digital museum metadata specification,” digital library forum 2 (2014): 43–53, https://kns.cnki.net/kcms/detail/detail.aspx?dbcode=cjfd&dbname=cjfdtemp&filename=sztg201402011&uniplatform=nzkpt&v=tf76zueher7ymnfxdfafenmm2z2tetze08zqkdhoc7wq2zwtkoao3i0ei7oyvcf1.
7 jin saiying, “research on chinese and foreign art image metadata and framework,” new art 37, no. 1 (2016): 129–32, https://kns.cnki.net/kcms/detail/detail.aspx?dbcode=cjfd&dbname=cjfdlast2016&filename=xmsh201601019&uniplatform=nzkpt&v=eynvoucbcnpzjkw84mxeabs--auqafuwanchem0p5phcmjw0s7jttnplobqop0_h.
8 xiao long and zhao liang, introduction and examples of chinese metadata (beijing: beijing library press, 2007).
9 li bo, “research on metadata model of intangible cultural heritage information resources,” library circle 5 (2011): 38–41, https://kns.cnki.net/kcms/detail/detail.aspx?dbcode=cjfd&dbname=cjfd2011&filename=tsgu201105016&uniplatform=nzkpt&v=unflzsdezr0jue0ut_npb7h0ri5vioemybvm3zytqfh2quzuycubz5tzrbshnkwh.
10 ye peng and zhou yaolin, “the framework and standards of chinese intangible cultural heritage metadata,” 2013 international conference on applied social science research (paris: atlantis press, 2013).
11 bamo qubumo, guo cuixiao, yin hubin, and li gang, “customizing discipline-based metadata standards for digital preservation of living epic traditions in china: basic principles and challenges,” 2013 digital heritage international congress, https://ieeexplore.ieee.org/document/6744746.
12 y. j. nam and s. m.
lee, “localization of metadata elements in the art museum community,” 충청문화연구 46, no. 2 (2012): 139–74.
13 bamo qubumo, c. guo, h. yin, et al., “customizing discipline-based metadata standards for digital preservation of living epic traditions in china: basic principles and challenges,” digital heritage international congress, ieee, 2014.
14 chao gojin, “unesco ethical principles for the protection of intangible cultural heritage: an introduction and comment,” inner mongolia social sciences (chinese version), 37, no. 05 (2016): 1–13, https://doi.org/10.14137/j.cnki.issn1003-5281.2016.05.00.
15 chen junxiu, “research on the mode of productive protection and utilization of intangible cultural heritage,” learning and practice 5 (2015): 118–23, https://kns.cnki.net/kcms/detail/detail.aspx?dbcode=cjfd&dbname=cjfdlast2015&filename=xxys201505014&uniplatform=nzkpt&v=telpps4abo6-qidxtqjyu9a_hy0q6ukovi4x5nz8br-u33pzq6py2d1cshqlclnw.
16 zhao zhihui, “visual analysis of the evolution path and hot frontiers of cultural heritage digitization research,” library forum 2 (2013): 33–40, https://kns.cnki.net/kcms/detail/detail.aspx?dbcode=cjfd&dbname=cjfd2013&filename=tsgl201302007&uniplatform=nzkpt&v=yezmntrx2f00eqvogxwtz5yehk3zz1dm8layjik4l1lmjvvjuq7gaiymloplnmiv.
the map as a search box:
using linked data to create a geographic discovery system gabriel mckee information technology and libraries | march 2019 40 gabriel mckee (gm95@nyu.edu) is librarian for collections and services at the institute for the study of the ancient world at new york university. abstract this article describes a bibliographic mapping project recently undertaken at the library of the institute for the study of the ancient world (isaw). the marc advisory committee recently approved an update to marc that enables the use of dereferenceable uniform resource identifiers (uris) in marc subfield $0. the isaw library has taken advantage of marc’s new openness to uris, using identifiers from the linked data gazetteer pleiades in marc records and using this metadata to create maps representing our library’s holdings. by populating our marc records with uris from pleiades, an online, linked open data (lod) gazetteer of the ancient world, we are able to create maps of the geographic metadata in our library’s catalog. this article describes the background, procedures, and potential future directions for this collection-mapping project. introduction since the concept of the semantic web was first articulated in 2001, libraries have faced the challenge of converting their vast stores of metadata into linked data.1 though bibframe, the planned replacement for the marc (machine-readable cataloging) systems that most american libraries have been using since the 1970s, is based on linked-data principles, it is unlikely to be implemented widely for several years. as a result, many libraries have delayed creating linked data within the existing marc framework. 
one reason for this delay has been the absence of a clear consensus in the cataloging community about the best method to incorporate uniform resource identifiers (uris), the key building block of linked data, into marc records.2 but recent developments have added clarity to how uris can be used in marc, clearing a path for projects that draw on uris in library metadata. this paper describes one such project undertaken by the library of the institute for the study of the ancient world (isaw) that draws on uris from the linked-data gazetteer pleiades to create maps of items in the library’s collection. a brief history of uris in marc over the last decade, the path to using uris in marc records has become more clear. this process began in 2007, when the deutsche nationalbibliothek submitted a proposal to expand the use of a particular marc subfield, $0 (also called “dollar-zero” or “subfield zero”), to contain control numbers for related authority records in main entry, subject access, and added entry fields.3 the proposal, which was approved on july 13, 2007, called for these control numbers to be recorded with a particular syntax: “the marc organization code (in parentheses) followed immediately by the number, e.g., (cabvau)2835210335.”4 this marc-specific syntax is usable within the marc environment, but is not actionable for linked-data purposes. 
a dereferenceable uri—that is, an identifier beginning with “http://” that links directly to an online resource or a descriptive representation of a person, object, or concept—could be parsed and reconstructed, but only with a significant amount of human intervention and a high likelihood of error.5 in 2010, following a proposal from the british library, $0 was redefined to allow types of identifiers other than authority record numbers, in particular international standard name identifiers (isni), using this same parenthetical-prefix syntax.6 that same year, the rda/marc working group issued a discussion paper proposing the use of uris in $0, but no proposal regarding the matter was approved at that time.7 the 2010 redefinition made it possible to place uris in $0, provided they were preceded by the parenthetical prefix “(uri)”. however, this requirement of an added character string put marc practice at odds with the typical practices of the linked data community. not only does the addition of a prefix create the need for additional parsing before the uri can be used, the prefix is also redundant, since dereferenceable uris are self-identifying. in 2015, the program for cooperative cataloging (pcc) charged a task group with examining the challenges and opportunities for the use of uris within a marc environment.8 one of this group’s first accomplishments was submitting a proposal to the marc advisory committee to discontinue the requirement of the “(uri)” prefix on uris.9 though this change appears minor, it represents a significant step forward in the gradual process of converting marc metadata to linked data. linked data applications require dereferenceable uris. the requirement of either converting an http uri to a number string (as $0 required from 2007-10), or prefixing it with a parenthetical prefix, produced identifiers that did not meet the definition of dereferenceability.
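the three $0 syntaxes described above (organization code plus control number, the 2010-era “(uri)” prefix, and a bare dereferenceable uri) can be told apart mechanically. the following is a minimal sketch, not part of the isaw workflow: the function name and the returned labels are hypothetical, and accepting “https://” alongside “http://” is an assumption.

```python
import re

# Parenthetical-prefix syntax, e.g. "(cabvau)2835210335" or "(uri)http://...".
PREFIXED = re.compile(r"^\((?P<prefix>[^)]+)\)(?P<value>.+)$")


def classify_subfield_zero(value):
    """Return a (kind, identifier) pair for the text of a MARC $0 subfield.

    kind is "uri" for a bare dereferenceable URI, "prefixed-uri" for the
    "(uri)" form (the prefix is stripped from the identifier),
    "control-number" for an organization code plus number, and "unknown"
    for anything else. The labels are illustrative, not a standard.
    """
    value = value.strip()
    # Assumption: treat both http and https schemes as dereferenceable.
    if value.startswith(("http://", "https://")):
        return ("uri", value)
    match = PREFIXED.match(value)
    if match is None:
        return ("unknown", value)
    prefix = match.group("prefix")
    rest = match.group("value").strip()
    if prefix.lower() == "uri":
        return ("prefixed-uri", rest)
    return ("control-number", f"({prefix}){rest}")
```

the extra branch needed for the “(uri)” form is the additional parsing step that the prefix requirement imposed; once the prefix is dropped, the first check is sufficient.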
as shieh and reese explain, the marc syntax in place prior to this redefinition was at odds with the practices used by semantic web services: the use of qualifiers, rather than actionable uris, requires those interested in utilizing library metadata to become domain experts and become familiar with the wide range of standards and vocabularies utilized within the library metadata space. the qualifiers force human interaction, whereas dereferenceable uris are more intuitive for machines to process, to query services, to self-describe—a truly automated processing and a wholesome integration of web services.10 though it has been possible to use prefixed uris in marc for several years, few libraries have done so, in part because of this requirement for human intervention, and in part because of the scarcity of use-cases that justified their use. the removal of the prefix requirement brings marc’s use of uris more into line with that of other semantic web services, and will reduce system fragility and enhance forward-compatibility with developing products, projects, and services. though marc library catalogs still struggle with external interoperability, the capability of inserting unaltered, dereferenceable uris into marc records is potentially transformative.11 the approval of the pcc task group on uri in marc’s 2016 proposal makes it easier to work with limited linked data applications directly within marc, rather than waiting for the implementation of bibframe. by inserting actionable uris directly into marc records, libraries can begin developing programs, tools, and projects that draw on these uris for any number of data outcomes. in the last two years, the isaw library has taken advantage of marc’s new openness to uris to create one such outcome: a bibliographic mapping project that creates browseable maps of items held by the library.
the isaw library holds approximately 50,000 volumes in its print collection, map as a search box | mckee 42 https://doi.org/10.6017/ital.v38i1.10592 chiefly focusing on archaeology, history, and philology of asia, europe, and north africa from the beginning of agriculture through the dawn of the medieval period, with a focus on cultural interconnections and interdisciplinary approaches to antiquity. the institute, founded in 2007, is affiliated with new york university (nyu) and its library holdings are cataloged within bobcat, the nyu opac. by populating our marc records with uris from pleiades, an online, linked open data (lod) gazetteer of the ancient world, the isaw library is able to create maps of the geographic metadata in our library’s catalog. at the moment, this process is indirect and requires periodic human intervention, but we are working on ways of introducing greater automation as well as expanding beyond small sets of data to a larger map encompassing as much of our library’s holdings as it makes sense to represent geographically. map-based searching for ancient subjects in the disciplines of history and archaeology, geography is of vital importance. much of what we know about the past can be tied to particular locations: archaeological sites, ancient structures, and find-spots for caches of papyri and cuneiform tablets provide the spatial context for the cultures about which they inform us. but while geospatial data about antiquity can be extremely precise, the text-based searching that is the user’s primary means of accessing library materials is much less clear. standards for geographic metadata focus on place names, which open the door for greater ambiguity, as buckland et al. explain: there is a basic distinction between place, a cultural concept, and space, a physical concept. cultural discourse tends to be about places rather than spaces and, being cultural and linguistic, place names tend to be multiple, ambiguous, and unstable.
indeed, the places themselves are unstable. cities expand, absorbing neighboring places, and countries change both names and boundaries.12 nowhere is this instability of places and their names so clear as in the fields of ancient history and archaeology, which often require awareness of cultural changes in a single location throughout the longue durée. and yet researchers in these fields have had to rely on library search interfaces that rely entirely on toponyms for accessing research materials. scholars in these disciplines, and many others besides, would be well served by a method of discovering research materials that relies not on keywords or controlled vocabularies, but on geographic location. library of congress classification and subject cataloging tend to provide greater granularity for political developments in the modern era, presenting a challenge to students of ancient history. a scholar of the ancient caucasus, for example, is likely to be interested in materials that are currently classified under the history classes for the historical region of greater armenia (ds161-199), the modern countries of armenia (dk680), azerbaijan (dk69x), georgia (dk67x), russia (dk5xx), ukraine (dk508), and turkey (ds51, ds155-156 and dr401-741); for pre- and protohistoric periods, materials may be classified in gn700-890; and texts in ancient languages of the caucasus will fall into the pk8000-9000 range. moreover, an effective catalog search may require familiarity with the romanization schemes for georgian, armenian, russian, and ukrainian. materials on the ancient caucasus fall into a dozen or more call number ranges, and there is no single term within the library of congress subject headings (lcsh) that connects them—but if their subjects were represented on a map, they would fall within a polygon only a few hundred miles long on each side. this geophysical collocation of materials from across many classes of knowledge can enable unexpected discoveries.
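the caucasus example can be made concrete with a small sketch of geographic collocation: records scattered across unrelated call number ranges are gathered by a single spatial test. everything here is hypothetical and for illustration only: the record titles, call numbers, and coordinates are invented, and the bounding box is a rough rectangle rather than a surveyed boundary or the polygon a production system would use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    title: str
    call_number: str
    lat: float  # decimal degrees, north positive
    lon: float  # decimal degrees, east positive


@dataclass(frozen=True)
class BoundingBox:
    south: float
    west: float
    north: float
    east: float

    def contains(self, record):
        """True when the record's point lies inside the box (edges included)."""
        return (self.south <= record.lat <= self.north
                and self.west <= record.lon <= self.east)


# Rough, illustrative rectangle around the Caucasus.
CAUCASUS = BoundingBox(south=38.0, west=38.0, north=44.5, east=50.5)

# Invented sample records; call numbers only echo the ranges named in the text.
RECORDS = [
    Record("sample title a", "DS175", 40.2, 44.5),
    Record("sample title b", "DK679", 41.7, 44.8),
    Record("sample title c", "PK8005", 40.4, 49.9),
    Record("sample title d", "DT61", 30.0, 31.2),
]


def records_in(box, records):
    """Return the records whose coordinates fall inside the box, in input order."""
    return [r for r in records if box.contains(r)]
```

calling `records_in(CAUCASUS, RECORDS)` returns the three records classed in ds, dk, and pk, which no single call number range or subject heading would retrieve together.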
as bidney and clair point out, “organizing information based on location is a powerful idea—it has the capacity to bring together information from diverse communities of practice that a research may never have considered . . . ‘place’ is interdisciplinary.”13 with this in mind, the isaw library has set out to create an alternative method of accessing items in its collection: a browseable, map-based interface for the discovery of library materials. literature review though geographic searching is undoubtedly useful for many different types of content, much of the work in using coordinate data and map-based representations of resources has centered on searching for printed maps and, more recently, geospatial datasets. in an article published in 2007, buckland et al. issued a challenge to libraries to complement existing text-string toponymic terminology with coordinate data.14 perhaps unsurprisingly, the most progress in meeting this challenge has been made in the area of cartographic collections. in a 2010 article, bidney discussed the library of congress’s then-new requirement of coordinates in records describing maps, and explored the possibility of using this metadata to create a geographic search interface.15 a 2014 follow-up article by bidney and clair expanded this argument to include not just cartographic materials, but all library resources, challenging libraries to develop new interfaces to make use of geospatial data.16 the most advanced geospatial search interfaces have been developed for cartographic and geospatial data. for example, geoblacklight (http://geoblacklight.org) offers an excellent map-based interface, but it is intended primarily for cartographic and gis data specifically, and not library resources more broadly. the mapfast project described by bennett et al.
in 2011 pursues goals similar to our pleiades-based discovery system.17 using fast (faceted application of subject terminology) headings, which are already present in many marc records, this project creates a searchable map via the google maps api. each fast geographic heading creates a point on the map which, when clicked, brings the user to a precomposed search in the library catalog for the corresponding controlled subject heading. one limitation to the mapfast model is the absence of geographic coordinates on many of the lc authority records from which fast headings are derived: at the time that bennett et al. described the project, coordinates were available for only 62.5 percent of fast geographic headings; additional coordinates came from the geonames database (http://www.geonames.org/).18 moreover, the method of retrieving these coordinates is based on text string matching, which introduces the possibility of errors resulting from the lack of coordination between toponyms in fast and geonames. in exploring other mapping projects, we looked most closely at projects with a focus on the ancient world, including pelagios (http://commons.pelagios.org), its geographic search tool peripleo,19 and china historical gis (chgis, http://sites.fas.harvard.edu/~chgis). as described by simon et al.
in 2016, pelagios offers a shared framework for researchers in classical history to explore geographic connections, and several applications of its data resemble our desired outcome.20 similarly, merrick lex berman’s work with the api provided by china historical gis in connection with library metadata provided important guidelines and points of comparison.21 we also explored mapping projects outside of the context of antiquity, including maphappy, the biodiversity heritage library’s map of lcsh headings, and the map interface developed for phillyhistory.org.22 first steps: metadata to develop a system for mapping the isaw library’s collection, we began by working with smaller sets of metadata. our initial collection map, which served as a proof of concept, represented the titles available in the ancient world digital library (awdl, http://dlib.nyu.edu/ancientworld), an online e-book reader created by the isaw library in collaboration with nyu’s digital library technical services department. when we initially created this interface, called the awdl atlas, awdl contained a small, manageable set of about one hundred titles. working in a spreadsheet, we assigned geographic coordinates to each of these titles and mapped them using google fusion tables (https://fusiontables.google.com). fusion tables, launched by google in june 2009, is a cloud-based platform for data management that includes a number of visualization tools, including a mapping feature that builds on the infrastructure of google maps.23 the fusion tables map created for awdl shows a pinpoint for each title in the e-book library; when clicked, each pinpoint gives basic bibliographic data about the title and a link to the e-book itself.
one problem with this initial map was that it did little to show precision—a pinpoint representing a specific archaeological site in iraq looks the same on the map as a pinpoint representing the entirety of central asia. nevertheless, the basic functionality of the awdl atlas performed as desired, providing a geographic search interface for a concrete set of resources. for our next collection map, we turned our attention to our monthly lists of new titles in our library’s collection. at the end of each month, nyu’s library systems team sends our library a marc-xml report listing all of the items added to our library’s collection that month. for several years now, we have been publishing this data on our library’s website in human-readable html form and adding the titles to a library in the open-source citation management platform zotero, allowing our users multiple pathways to discovering resources within our collection.24 in august 2016, we began creating monthly maps of these titles, using a variation of the workflow that we devised for the awdl atlas. to better represent the different levels of precision that each point represents, we implemented a color-coded range of four levels of precision, from site-specific archaeological publications to materials covering a broad, multi-country range, with a fifth category for cross-cultural materials and other works that can’t be well represented in geographic form. (these items are grouped in the mediterranean sea on the monthly new titles maps, but in a full-collection map would most likely be either excluded or represented by multiple points, as appropriate.) the initial new titles maps took a significant amount of title-by-title work to create. coordinates and assessments of precision needed to be assigned for each title individually.
we quickly began looking for ways to automate the process of geolocation, and soon settled on using data from pleiades to increase the efficiency of creating each map.25 we set our sights on marc field 651 (subject added entry-geographic name) as the best place in a marc record to put pleiades data. as a subject access field, the 651 is structured to contain a searchable text string and can also include a $0 with a uri associated with that text string. however, under current cataloging guidelines, catalogers are not free to use any uri they choose in this field: the library of congress maintains a list of authorized sources for subject terms to be used in 651 and other subject-access fields.26 in august 2016, the isaw library submitted a proposal to the library of congress for pleiades to be approved as a source of authoritative subject data and added to lc’s list of subject heading and term source codes. the proposal was approved the following month, and by early 2017 the lc-assigned code was approved for use in oclc records. with this approval in place, we began incorporating pleiades uris in marc records for items held by the isaw library. we used the names of pleiades resources as subject terms in new 651 (subject added entry-geographic name) fields, specifying pleiades as the source of the subject term in subfield $2 and adding the pleiades uri in a $0: figure 1. fields from a marc record showing an lcnaf geographic heading and the corresponding pleiades heading, with uri in $0. figure 1 shows a detail from oclc record #986242751, which describes a book containing texts from cuneiform tablets discovered at the hittite capital city hattusa. this detail shows both the lcnaf and pleiades geographic headings assigned to this record.
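the shape of such a 651 field can be sketched in a few lines. this is an illustration under stated assumptions rather than the isaw library’s tooling: the classes and the `pleiades_651` helper are hypothetical, the `#` shown for a blank first indicator is only a display convention, and the example heading and uri are the “aegyptus (roman imperial province)” resource cited later in this article, not the hattusa record in figure 1. the second indicator 7 is the standard marc value meaning that the term source is given in $2.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subfield:
    code: str
    value: str


@dataclass(frozen=True)
class DataField:
    tag: str
    ind1: str
    ind2: str
    subfields: tuple

    def display(self):
        """Render the field as tag, indicators, then $-coded subfields."""
        parts = " ".join(f"${s.code} {s.value}" for s in self.subfields)
        return f"{self.tag} {self.ind1}{self.ind2} {parts}"


def pleiades_651(name, uri):
    """Build a 651 field with the heading in $a, source code in $2, bare URI in $0."""
    # Guard against pasting in a prefixed or non-pleiades identifier.
    if not uri.startswith("https://pleiades.stoa.org/places/"):
        raise ValueError("expected a pleiades place uri")
    return DataField(
        tag="651",
        ind1="#",  # blank first indicator, shown as '#' for readability
        ind2="7",  # source of the term is specified in $2
        subfields=(
            Subfield("a", name),
            Subfield("2", "pleiades"),
            Subfield("0", uri),
        ),
    )
```

for example, `pleiades_651("aegyptus (roman imperial province)", "https://pleiades.stoa.org/places/766").display()` yields a single line with the heading in $a, `pleiades` in $2, and the unprefixed uri in $0.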
(in addition to providing a uri for the site, the pleiades heading also enhances keyword searches: the 651 field is searchable in the nyu library catalog, thus providing keyword access to one of the city’s ancient names). the second 651 field contains a second indicator 7, indicating that the source of the subject term is specified in $2, where the lc-approved code “pleiades” is specified. this is followed by a $0 containing the uri for the pleiades place resource describing hattusa. our monthly reports of new titles now contain a field for pleiades uris. currently, we are not querying pleiades directly for coordinates, but rather are using the uri as a vertical-lookup term within a spreadsheet of each month’s new titles, which is checked against a separate data file that matches pleiades uris to coordinate pairs.27 for places where no pleiades heading is available, we have begun using uris from the getty thesaurus of geographic names (tgn), marc-syntax fast identifiers, and unique lcnaf text strings, using the same vertical-lookup process to retrieve previously researched coordinate pairs for those places. next, we retrieve coordinates for newly appearing pleiades locations, research the locations of new non-pleiades places, and add both to the local database of places used. lastly, due to google fusion tables’ inability to display more than one item on a single coordinate pair, prior to uploading the map data to fusion tables we examine it for duplicated coordinate pairs, manually altering them to scatter these points to nearby locations. the overall amount of time spent on cleaning data and preparing each month’s map has decreased from more than a full day’s work in august 2016 to about two hours in january 2018. figure 2.
a screenshot from the isaw library new titles map for january 2018, showing an item-specific information window (http://isaw.nyu.edu/library/find/newtitles-2017-18/2018-jan). challenges in developing the isaw library’s mapping program, we had to overcome several challenges: early in the project, we needed to address the philosophical differences between how pleiades and lcnaf think about places and toponyms. the concept of “place” in pleiades is broad, and contains cities, structures, archaeological sites, kingdoms, provinces, and other types of administrative divisions, roads, geological features, culturally defined regions, and ethnic groups: “the term [‘place’] applies to any locus of human attention, material or intellectual, in a real-world geographic context.”28 in functional terms, a “place” in pleiades is a top-level resource containing one or more other types of data:
• one or more locations, consisting of either a precise point, an approximate rectangular polygon, or a precise polygon formed by multiple points;
• one or more names, in one or more ancient or modern languages;
• one or more connections to other place resources, generally denoting a geospatial or political/administrative connection.
locations, names, and connections contain further metadata, including chronological attestations and citations to data sources. no one of these components is a requirement—even locations are optional, as ancient texts contain references to cities and structures whose geospatial location is unknown. by contrast, library of congress rules focus almost exclusively on names—that is, text strings. there are two main categories of geographic names, as described in instruction sheet h690 of the subject headings manual (shm): headings for geographic names fall into two categories: (1) names of political jurisdictions, and (2) non-jurisdictional geographic names.
headings in the first category are established according to descriptive cataloging conventions with authority records that reside in the name authority file . . . headings in the second category are established . . . with authority records that reside in the subject authority file.29 the two categories—essentially definable as political entities and geographic regions—are both of interest to the shm only as represented by text strings. the purpose of identifying places within the framework of lc’s guidelines is to enable text-based searching and collocation of items based on uniform, human-readable terminology. at the beginning of this project, it was important to acknowledge, explore, and understand this fundamental difference, and to understand the different purposes of an authority file (identifying unique text strings), a linked data gazetteer (assembling and linking many different kinds of geospatial and toponymic data), and our mapping project (identifying coordinate pairs related to specific library resources). in our project, this philosophical gap manifested as a difference between the primary and secondary importance of authorized text strings and uris: in lcsh and lcnaf, the text string is primary, and the uri secondary (where it is used at all); in pleiades and many other linked-data sources, uris are primary and text strings secondary. lcsh and lcnaf text strings are unique, and can be considered as a sort of identifier, but they do not have the machine-readable functionality of a uri. in pleiades, the machine-readable uri is primary, and can be used to return coordinates, place names, and other human- or machine-readable data.
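treating the uri as the primary key is what makes the vertical-lookup step described earlier straightforward: coordinates are retrieved by exact identifier rather than by matching text strings. the sketch below is an assumption-laden illustration, not the spreadsheet process the isaw library actually uses. the function names are hypothetical, the coordinate table a caller supplies stands in for the local data file, and the automatic offsetting of duplicate points is a swapped-in alternative to the manual scattering described in the article.

```python
import math


def lookup_coordinates(rows, table):
    """Split (title, uri) rows into located and missing, keyed on the URI.

    table maps a URI to a (lat, lon) pair; rows whose URI is absent are
    returned separately so the place can be researched and added later.
    """
    located, missing = [], []
    for title, uri in rows:
        if uri in table:
            located.append((title, table[uri]))
        else:
            missing.append((title, uri))
    return located, missing


def scatter_duplicates(points, radius=0.05):
    """Give every title its own coordinate pair.

    The first title at a location keeps the original point; any others are
    spread evenly on a small circle around it (radius in decimal degrees).
    Results are grouped by original location, not kept in input order.
    """
    groups = {}
    for title, coords in points:
        groups.setdefault(coords, []).append(title)
    result = []
    for (lat, lon), titles in groups.items():
        result.append((titles[0], (lat, lon)))
        extras = titles[1:]
        for i, title in enumerate(extras):
            angle = 2 * math.pi * i / len(extras)
            result.append(
                (title, (lat + radius * math.sin(angle), lon + radius * math.cos(angle)))
            )
    return result
```

a caller would pass the month’s (title, uri) rows and the local coordinate table to `lookup_coordinates`, research anything returned as missing, and run `scatter_duplicates` on the located rows before upload. the radius is arbitrary and would need tuning to the zoom level of the map.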
the name of a pleiades place resource can be construed as a “subject heading,” but these text strings are not necessarily unique, and additional data from the pleiades resource may be required for disambiguation by a human reader.30 toponymic terminology (that is, human-readable text strings) is just one type of data that pleiades contains, alongside geospatial data, temporal tags, and linkages between resources.

one example of a recent change in pleiades data illustrates the fundamental difference in approach between authority control and uri management. until recently, pleiades contained two different place resources with the heading “aegyptus” (https://pleiades.stoa.org/places/766 and https://pleiades.stoa.org/places/981503), both referring to the general region of egypt. both of these resources were recently updated, and the title text of both was changed: /places/766 was retitled “aegyptus (roman imperial province)” and /places/981503 became “ancient egypt (region).” the distinction illustrates the difficulty in assigning names to places over long spans of time: egypt, as understood by pre-ptolemaic inhabitants of the nile region, had a different meaning than the administrative region established after octavian’s defeat of marc antony and cleopatra, or, for that matter, than the predynastic kingdoms of upper and lower egypt, the ottoman eyalet of misr, and the modern republic of egypt. prior to this change in pleiades, both uris were applied to marc records for items held by the isaw library, under the heading “aegyptus.” from a linked-data standpoint, there is no real problem here: the uris still link to resources describing different historical places called “egypt,” including the coordinate data needed for isaw’s collection maps. but from the standpoint of authority control, the subject term “aegyptus” on these records is now “wrong,” representing a deprecated term, and should be updated.
even here, though, a linked-data model has benefits that a text-string-based model lacks. even if they contain the same text string heading, the uri means there is no ambiguity between the two headings, and the text strings can be replaced with a batch operation based on the differences in their uris. getting away from text-string-based thinking will represent a major philosophical challenge for libraries as we move toward a linked data model for library metadata, but the many benefits of linked data will make that shift worthwhile. google fusion tables represents a future hurdle that the isaw library’s mapping project will need to clear. in december 2018, google announced that the fusion tables project would be discontinued, and that all embedded fusion tables visualizations will cease functioning on december 3, 2019.31 fortunately, the isaw library has already begun developing an alternative solution that does not rely on the deprecated fusion tables application. the core methodology used in developing our maps will remain the same, however. lastly, the geographic breadth of our collection reveals the limitations of pleiades as the sole data source for this project. at its inception, pleiades was focused on the greco-roman antiquity, and though it has expanded over time, central and east asia—regions of central interest to the isaw library—are largely not covered. because all contributions to pleiades undergo peer review prior to being published online, pleiades’ editors are understandably reluctant to commit to expanding their coverage eastward until the editorial team includes experts in these geographic areas. 
however, though we began this project with pleiades, there is no barrier to using other sources of geographic data, such as china historical gis, the getty thesaurus of geographic names (tgn, http://www.getty.edu/research/tools/vocabularies/tgn/index.html), geonames (http://www.geonames.org/), the world-historical gazetteer (http://whgazetteer.org/), or the library of congress’s linked data service (http://id.loc.gov/). the same procedures we’ve used with pleiades can be applied to any reliable data source with consistently formatted data. future directions we have already begun to move away from the google fusion tables model, and are working to develop our own javascript-based map application using mapbox (https://www.mapbox.com/) and leaflet (https://leafletjs.com/). when completed, this updated mapping application will actively query a database of pleiades headings for coordinates, further automating the process of map creation. we are looking into different methods of encoding and representing precision—for example, using points and polygons to represent sites and regions, respectively. the leaflet map interface will also enable us to show multiple items for single locations, something fusion tables is unable to do, and will thus eliminate the need to manually deduplicate coordinate pairs. to expand the number of records that contain pleiades uris, we are developing a crosswalk between existing lc geographic headings and pleiades place resources. when completed, we will use this crosswalk to batch-update our older records with pleiades data where appropriate. the crosswalk will contain uris from both pleiades and the lc linked data service, and it will be provided to the pleiades team so that pleiades resources can incorporate lc metadata as well. we are also exploring further user applications of map-based search. one function we hope to develop is a geographic notification service, allowing users to define polygonal areas of interest on the map. 
when a new point is added that falls within these polygons, the user will be notified of a new item of potential interest. some user training will be required to ensure that users define their areas of interest in such a way that they will receive results that interest them: for example, a user interested in the roman empire will likely be interested in titles about the mediterranean region in general, and may need to draw their bounding box so that it encompasses the open sea as well as sites on land. it will also require thoughtfulness about where users are likely to look for points of interest, especially for empires and other historic entities that do not correspond to modern geopolitical boundaries (for example, the byzantine empire or scythia).

additionally, we hope to begin working with chronological as well as geospatial data, with hopes of being able to add a time slider to the library map. this would enable users to focus on particular periods of history as well as geographic regions: for example, users interested in bronze age anatolia could limit results to that time period, so that they can browse the map without material from the byzantine empire “cluttering” their browsing experience.32 the online temporal gazetteer periodo (http://perio.do/) provides a rich data source to draw on, including uris for individual chronological periods and beginning and end dates for each defined temporal term. following a proposal submitted by the isaw library, periodo was approved by the library of congress as a source of subject terminology in september 2018, and its headings and uris are now useable in marc.
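a minimal sketch of how such a time slider could filter map items, assuming each item carries the beginning and end years of its period (the kind of dates periodo supplies for each defined term). the `Item` class, the sample titles, and the year values are hypothetical illustrations, not data from the isaw catalog or from periodo; bce years are written as negative integers.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """A mapped library item with the span of the period it covers."""
    title: str
    start_year: int  # negative values are BCE (illustrative convention)
    end_year: int


def overlaps(item: Item, window_start: int, window_end: int) -> bool:
    """True when the item's period overlaps the slider window at all."""
    return item.start_year <= window_end and item.end_year >= window_start


def filter_items(items: list[Item], window_start: int, window_end: int) -> list[Item]:
    """Keep only the items whose period intersects the chosen window."""
    return [item for item in items if overlaps(item, window_start, window_end)]


# Hypothetical sample data; the dates are rough placeholders only.
ITEMS = [
    Item("hypothetical title on bronze age anatolia", -3000, -1200),
    Item("hypothetical title on the byzantine empire", 330, 1453),
]

if __name__ == "__main__":
    # A slider set to 2000-1500 BCE hides the byzantine title.
    for item in filter_items(ITEMS, -2000, -1500):
        print(item.title)
```

an overlap test, rather than strict containment, keeps items whose period only partly falls inside the slider window; whether that is the desired behavior is a design choice the project would need to make.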
however, though lcsh headings for geographic places are often quite good, the guidelines for chronological headings and subdivisions are often inadequate for describing ancient historical periods, and thus enacting a chronological slider, though highly desirable, would require a large amount of manual changes and additions to existing metadata. the isaw library’s collection mapping project has accomplished its initial goal of providing a geospatial interface for the discovery of materials in our library collection. as we expand our mapping project to incorporate more of our collection, we also hope that our model can prove useful to other institutions looking for practical applications of uris in marc, alternative discovery methods to text-based searching, or both. references and notes 1 for a summary of this challenging problem, see brighid m. gonzales, “linking libraries to the web: linked data and the future of the bibliographic record,” information technology & libraries 33, no. 4 (dec. 2014): 10–22, https://doi.org/10.6017/ital.v33i4.5631. 2 see, for example, timothy w. cole et al., “library marc records into linked open data: challenges and opportunities,” journal of library metadata 13, no. 2–3 (july 2013): 178, https://doi.org/10.1080/19386389.2013.826074. 3 deutsche nationalbibliothek, “marc proposal no. 2007-06: changes for the german and austrian conversion to marc 21,” marc standards, may 25, 2007, https://www.loc.gov/marc/marbi/2007/2007-06.html. 4 ibid. 5 for a detailed discussion of the importance of actionability in unique identifiers, see jackie shieh and terry reese, “the importance of identifiers in the new web environment and using the uniform resource identifier (uri) in subfield zero ($0): a small step that is actually a big map as a search box | mckee 50 https://doi.org/10.6017/ital.v38i1.10592 step,” journal of library metadata 15, no. 3–4 (oct. 2, 2015): 220–23, https://doi.org/10.1080/19386389.2015.1099981. 
6 british library, “marc proposal no. 2010-06: encoding the international standard name identifier (isni) in the marc 21 bibliographic and authority formats,” marc standards, may 17, 2010, https://www.loc.gov/marc/marbi/2010/2010-06.html. 7 rda/marc working group, “marc discussion paper no. 2010-dp02: encoding uris for controlled values in marc records,” marc standards, dec. 14, 2009, https://www.loc.gov/marc/marbi/2010/2010-dp02.html. 8 for a summary of this task group’s work to date, see jackie shieh, “reports from the program for cooperative cataloging task groups on uris in marc & bibframe,” jlis.it: italian journal of library, archives and information science = rivista italiana di biblioteconomia, archivistica e scienza dell’informazione 9, no. 1 (2018): 110–19, https://doi.org/10.4403/jlis.it-12429. 9 pcc task group on uri in marc and the british library, “marc discussion paper no. 2016dp18: redefining subfield $0 to remove the use of parenthetical prefix ‘(uri)’ in the marc 21 authority, bibliographic, and holdings formats,” marc standards, may 27, 2016, https://www.loc.gov/marc/mac/2016/2016-dp18.html; marc advisory committee, “mac meeting minutes” (ala annual meeting, orlando, fl, 2016), https://www.loc.gov/marc/mac/minutes/an-16.html. for a cumulative description of the scope of this task group’s work, see pcc task group on uris in marc, “pcc task group on uris in marc: year 2 report to poco, october 2017” (program for cooperative cataloging, oct. 23, 2017), https://www.loc.gov/aba/pcc/documents/poco2017/pcc_uri_tg_20171015_report.pdf. 10 shieh and reese, “the importance of identifiers in the new web environment and using the uniform resource identifier (uri) in subfield zero ($0),” 221. 
11 shieh and reese, “the importance of identifiers in the new web environment and using the uniform resource identifier (uri) in subfield zero ($0)”; for a discussion of a related problem (finding a place for a uri in marc authority records), see ioannis papadakis, konstantinos kyprianos, and michalis stefanidakis, “linked data uris and libraries: the story so far,” d-lib magazine 21, no. 5/6 (june 2015), https://doi.org/10.1045/may2015-papadakis. 12 michael buckland et al., “geographic search: catalogs, gazetteers, and maps,” college & research libraries 68, no. 5 (sept. 2007): 376, https://doi.org/10.5860/crl.68.5.376. 13 marcy bidney and kevin clair, “harnessing the geospatial semantic web: toward place-based information organization and access,” cataloging & classification quarterly 52, no. 1 (2014): 70, https://doi.org/10.1080/01639374.2013.852038. 14 buckland et al., “geographic search.” 15 marcy bidney, “can geographic coordinates in the catalog record be useful?,” journal of map & geography libraries 6, no. 2 (july 13, 2010): 140–50, https://doi.org/10.1080/15420353.2010.492304. information technology and libraries | march 2019 51 16 bidney and clair, “harnessing the geospatial semantic web.” 17 rick bennett et al., “mapfast: a fast geographic authorities mashup with google maps,” code4lib journal, no. 14 (july 25, 2011): 1–9, http://journal.code4lib.org/articles/5645. 18 bennett et al., 1. 19 rainer simon et al., “peripleo: a tool for exploring heterogeneous data through the dimensions of space and time,” the code4lib journal, no. 31 (jan. 28, 2016), http://journal.code4lib.org/articles/11144. 20 rainer simon et al., “the pleiades gazetteer and the pelagios project,” in placing names: enriching and integrating gazetteers, ed. merrick lex berman, ruth mostern, and humphrey southall, the spatial humanities (bloomington: indiana univ. pr., 2016), 97–109. 21 merrick lex berman, “linked places in the context of library metadata” (nov. 
10, 2016), https://sites.fas.harvard.edu/~chgis/work/docs/papers/hvd_librarylinkeddatagroup_lexb erman_20161110.pdf. 22 lisa r. johnston and kristi l. jensen, “maphappy: a user-centered interface to library map collections via a google maps ‘mashup,’” journal of map & geography libraries 5, no. 2 (july 2009): 114–30, https://doi.org/10.1080/15420350903001138; chris freel et al., “geocoding lcsh in the biodiversity heritage library,” the code4lib journal, no. 2 (mar. 24, 2008), http://journal.code4lib.org/articles/52; gina l. nichols, “merging special collections with gis technology to enhance the user experience,” slis student research journal 5, no. 2 (2015): 52–71, http://scholarworks.sjsu.edu/slissrj/vol5/iss2/5/. 23 hector gonzalez et al., “google fusion tables: data management, integration and collaboration in the cloud,” in proceedings of the 1st acm symposium on cloud computing (indianapolis: acm, 2010), 175–80, https://doi.org/10.1145/1807128.1807158. 24 the isaw library new titles library is available at http://www.zotero.org/groups/290269. 25 since our interest was in obtaining coordinate data, we determined that lcnaf and lcsh would not be appropriate to our needs. although some marc authority records include coordinate data, it is not present in all geographic headings. moreover, where coordinate data is available in the authority file, it is not published in the rdf form of the records via the lc linked data service (http://id.loc.gov/). entries in the getty thesaurus of geographic names (tgn, http://www.getty.edu/research/tools/vocabularies/tgn/index.html) often include structured coordinate data, and in recent months we have begun using tgn uris when a pleiades uri is not available. 26 library of congress, network development & marc standards office, “subject heading and term source codes: source codes for vocabularies, rules, and schemes,” library of congress, jan. 9, 2018, https://www.loc.gov/standards/sourcelist/subject.html. 
27 it is worth noting that, since the uris are not currently being queried in the preparation of the map, much of this work could have been accomplished with pre-uri identifiers from marc data, or even unique text strings. one benefit of using uris is ease of access to coordinate data, especially from pleiades. pleiades puts coordinates front and center in its display, and even offers a one-click feature to copy coordinates to the clipboard. moreover, the entire pleiades dataset is available for download, making the retrieval of coordinates automatable locally, reducing keystrokes even without active database querying. the primary benefit of using uris instead of other forms of unique identifiers, however, is forward-compatibility. this is of immediate importance, since we are developing an updated version of the map that will actively query pleiades for coordinates. future benefits of the presence of uris also include links from pleiades into the library catalog, based on records in which place uris appear. if and when the entire catalog shifts to a linked-data model, the benefits of having these uris present will expand exponentially, as this metadata will then be available to all manner of outside sources.
28 sean gillies et al., “conceptual overview,” pleiades, mar. 24, 2017, https://pleiades.stoa.org/help/conceptual-overview.
29 library of congress, “subject headings manual (shm)” (library of congress, 2014), h 690, https://www.loc.gov/aba/publications/freeshm/freeshm.html.
30 for example, pleiades contains two place resources with the identical name “babylon”: one the mesopotamian city and capital of the region known as babylonia (https://pleiades.stoa.org/places/893951); the other the site of the muslim capital of egypt, al-fusṭāṭ, known in late antiquity as babylon (https://pleiades.stoa.org/places/727082).
31 google, “notice: google fusion tables turndown,” fusion tables help, dec.
11, 2018, https://support.google.com/fusiontables/answer/9185417.
32 a method of chronological browsing was described in vivien petras, ray r. larson, and michael buckland, “time period directories: a metadata infrastructure for placing events in temporal and geographic context,” in digital libraries, 2006. jcdl’06. proceedings of the 6th acm/ieee-cs joint conference on digital libraries (ieee, 2006), 151–60, https://doi.org/10.1145/1141753.1141782.

subject access to a data base of library holdings

alice s. clark: assistant dean for readers' services, university of new mexico general library, albuquerque. at the time this research was undertaken, the author was head of the undergraduate libraries at ohio state university.

as more academic and public libraries have some form of bibliographic description of their complete collection available in machine-readable form, public service librarians are devising ways to use the information for better retrieval. research at the ohio state university tested user response to paper and com output from selected areas of the shelflist. results indicated users at remote locations found such lists helpful, with some indication that paper printout was more popular than microfiche.

while many of the computer applications in special libraries were designed to improve subject access to the collections, the systems adopted in academic and public libraries have often been those which would handle various file operations and improve control of circulation or technical processing functions. once some of the data describing the items in the collection became available in machine-readable form, reference librarians have been tempted to find ways to use it for subject retrieval.

in november 1970, the ohio state university (osu) libraries began to use its automated circulation system using a data base representing its complete shelflist with limited information on each title:

field no.  field
1          call number
2          author
3          title
4          lc number, or nolc if none available
5          title number
6          publication date (if available)
7          ser - serial indicator. when present indicates the title is a serial.
8          neng - non-english indicator. when present indicates the title is non-english.
9          size - oversize indicator. when present indicates the book is an oversize book.
10         portxx:xx - portfolio number in which book is located (main library only).
11         mono - monographic set indicator. when present indicates title has been designated a monographic set.
12         number of holdings (not displayed if copy 1, main library)
13         reference line number
14         volume number
15         copy number
16         holdings condition code
17         library location
18         patron identification
19         number of specific saves for the copy
20         circulation status
21         date charged in the form of year, month, day
22         date due in the form of year, month, day

the system, modified from time to time, provided access by call number, record number, or author-title with an algorithm consisting of the first four letters of the author's name plus the first five letters of the title. a title search was also possible by entering four letters of the first significant word and five letters of the second significant word or five dashes.

as soon as the system was implemented, it was immediately evident that the search option was one of the most important features of the system. the circulation clerk at any location either in the main library or in any department library could search the author and title and find: (1) if the osu libraries had the book; (2) where it was regularly housed; and (3) its status (charged out, missing, lost, or available for circulation). all of this was possible without checking the card catalog except when problems of identifying the main entry existed. the immediate lack was, of course, the subject approach.
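the key derivation described above can be sketched as follows. this is an illustration under assumptions, not the osu system's code: the function names, the stop-word list used to decide which words count as "significant," the use of only the first word of the author and of the title, and the dash padding of short words are all hypothetical.

```python
import re

# Hypothetical stop list: the article does not say which words the
# system treated as non-significant.
STOP_WORDS = {"a", "an", "the", "and", "of", "in", "on", "to", "for"}


def _words(text: str) -> list[str]:
    """Split text into lowercase alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def _prefix(word: str, length: int) -> str:
    """Take the first `length` letters, padding short words with dashes (assumed)."""
    return word[:length].ljust(length, "-")


def author_title_key(author: str, title: str) -> str:
    """First four letters of the author plus first five letters of the title."""
    author_words = _words(author)
    title_words = _words(title)
    first_author = author_words[0] if author_words else ""
    first_title = title_words[0] if title_words else ""
    return _prefix(first_author, 4) + _prefix(first_title, 5)


def title_key(title: str) -> str:
    """Four letters of the first significant word plus five of the second,
    or five dashes when there is no second significant word."""
    significant = [w for w in _words(title) if w not in STOP_WORDS]
    first = significant[0] if significant else ""
    second = significant[1] if len(significant) > 1 else ""
    return _prefix(first, 4) + _prefix(second, 5)


if __name__ == "__main__":
    print(author_title_key("huxley, thomas henry", "evolution and ethics"))
    print(title_key("child psychology"))
    print(title_key("evolution"))
```

keys this short are deliberately coarse: many titles share one key, which is what lets a title key double as a rough subject search.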
as use of the system continued and library personnel became more sophisticated, various procedures offering some kind of subject approach were developed. the title search option is one possibility for finding subject access. for example, to find a book on "evolution" one can enter the title search command tls/evol---- and receive a report that there are 757 titles in which evolution is the first significant word. the terminal will then print out items as follows:

tls/evol----    page 1
757 matches    0 skipped    (not all retrieved)
01 lan, h. j.                    evolutie                     1946
02 moody, paul amos, 1903-       introduction to evolution    1970
03 brosseau, george e            evolution                    1967
04 adler, irving                 evolution                    1965
05 lotsy, j. p.                  evolution
06 smith, john maynard, 1920-    on evolution                 1972
07 miller, edward                evolution                    1917
08 watson, j. a. s.              evolution                    19-
09 kellogg, v. l.                evolution                    1924
10 shull, a. franklin            evolution                    1951

when the user types in pg2 or pg3, more titles will come up, and if more than thirty titles are desired, the original command can be reentered with a /skip 30 option to display others including all 757 titles if necessary. it is also possible to manipulate this option further since this first search may turn up the name of an author recognized as an authority on the subject.
in this case, when thomas huxley's evolution and ethics appears, the terminal attendant changes to an author-title search, ats/huxlevolu, and finds eight matches, four books by thomas huxley and four by julian sorell huxley on the same subject:

ats/huxlevolu    page 1
8 matches    0 skipped    (all retrieved in 1)
01 huxley, thomas henry     evolution and ethics, and other essays    1970
02 huxley, thomas henry     evolution and ethics and other essays     1916
03 huxley, julian           evolution, the modern synthesis           1942
04 huxley, thomas henry     evolution and ethics and other essays     1897
05 huxley, julian sorell    evolution as a process                    1954
06 huxley, thomas henry     evolution and ethics and other essays     1896
07 huxley, julian sorell    evolution in action 1st ed                1953
08 huxley, julian sorell    evolution as a process 2d ed              1958

to find the call number of any of these, the attendant merely enters a detailed line search dsl/1:

dsl/1
hm106h91896a huxley, thomas henry evolution and ethics, and other nolc 902452 1970 1
01 001 3week und
page 1 end

the ability to search by a word in the title, which in the above example gives a form of kwic subject index, is even more specific if two words are used. for example, the attendant may enter tls/chilpsych to bring up titles containing the words "child" and "psychology" as follows:

tls/chilpsych    page 1
52 matches    0 skipped    (not all retrieved)
01 jersild, arthur thomas, 1902-      child psychology. 4th             1954
02 jersild, arthur thomas, 1902-      child psychology 5th ed           1960
03 thompson, george greene, 1914-     child psychology                  1952
04 kanner, leo                        child psychiatry 3d ed            1957
05 curti, margaret (wooster)          child psychology                  1930
06 clarke, paul a                     child-adolescent psychology       1968
07 greenberg, harold a                child psychiatry in the commun    1950
08 english, horace bidwell            child psychology                  1951
09 chess, stella                      an introduction to child psych    1969
10 curti, margaret (wooster)          child psychology 2d ed            1938

the obvious subject approach is, of course, by call number. the system contains an option that permits a search on the general call number. the operator may enter either a real or an imaginary call number and receive the fifteen titles preceding and the fifteen titles subsequent to it in the shelflist. for example, with the command sps/hm106h9, using the call number from the previous example, the following ten titles will appear with that call number as the central item:

sps/hm106h9
11 hm106g77 graubard, man the slave and master
12 hm106h3 haycraft, darwinism and race progress
13 hm106h57 herter, c. biological aspects of human problems
14 hm106h6 hill, g. c. heredity and selection in sociology
15 hm106h63 hoagland, evolution and man's progress
16 °hm106h9
17 hm106h91896 huxley, evolution and ethics and other essays
18 hm106h91896a huxley, evolution and ethics and other essays
19 hm106h91897 huxley, evolution and ethics and other essays
20 hm106h91916 huxley, evolution and ethics and other essays
21 hm106k29 keller, societal evolution; a study of the evolutionary basis
page 2 input:hm106h9

entering pg1 will bring up the ten preceding titles and pg3 the ten subsequent titles. one of the best features of this system is that the patron may call in by telephone and have at least some of this information read to him; if he is at a circulation area, he may receive a printout as an instant bibliography.

recently an attempt has been made to use the file of data in other ways.
in an attempt to provide better access to the main campus collection for the people at the five regional campuses of the university, an experiment was tried using a computer printout of certain selected parts of the shelflist. since microfiche is less expensive and more compact to handle, there were good reasons for using this form rather than the paper printout form. this was an obvious application for computer output microfiche (com). once a master frame has been produced by com, the cost of additional copies is negligible. in order to test acceptance of form more accurately, it was decided to provide a list in each form to test on sample populations.

to cover some of the subjects taught at the agricultural and technical institute at wooster, a total of 20,672 titles were selected in the following areas:

subject area              classification    titles
agricultural economics    hd1401-2210       2,121
botany                    qk10-942          1,039
agriculture               s                 17,157
agricultural machinery    tj1480-1496       6
wood technology           ts800-937         197
woodworking               tt130-200         152

these titles were printed in a hard-copy printout in the following format with a program designed by gerry guthrie of the research and development division of the osu libraries:

call number = tj1496c3a3
title number = 196795
author = caterpillar tractor company
title = fifty years on tracks
publ. date = 1954
holdings = cool com regular
lc number = 55-20529

the physical form of the resulting documents varied somewhat due to the fact that each subject area was put in one cover. this meant "agriculture" (s) with 17,157 titles was too bulky to carry around, but "wood technology" was compact and easily carried to one's office or home for leisurely browsing. a brief questionnaire was used to test the reaction to the list. responses were received from 6 percent of the students and faculty at the agricultural and technical institute.
with the usual assumption that some students are not library users, there was some validity to the sample. results tabulated from these questionnaires fell into three categories: (1) nature of use; (2) value of the list; and (3) response to its form and format. since some questions were left blank, the totals were often less than 100 percent.

nature of use

the responses turned out to be evenly divided between faculty and students, 46 percent for each with some leaving this question blank. the faculty indicated that two-thirds of the use was for themselves and one-third for the students. students, of course, used it totally for their own purposes. the actual purpose of the list had been envisioned as access to the main campus collection, and increases in interlibrary loans indicated that it was effective. loans during the month of october 1973 totaled four while november's loans totaled thirty-four, showing a marked difference after the delivery of this search tool on october 31. the questionnaire showed that 77 percent indicated they used the information for this purpose. it should not have been a surprise to librarians to find that 34 percent of the sample population used the information to order a duplicate copy for the wooster ati library, an indication of readers' known proclivity for wanting their material close at hand.

users' evaluation

the increase in interlibrary loans was probably a better reflection of the users' approval than the actual questionnaire results, although the results themselves were also highly positive. seventy-seven percent checked that they found it valuable, against 15 percent who did not. eighty-five percent said they wanted more lists. requests for additional suggestions included a request to keep it up to date and a request to limit it to just recently published items, while another person asked for all of the titles located in the agricultural engineering library.
the requests indicated that several additional subject areas were wanted: communication skills, personnel management, human relations, use of airplanes in agricultural, irrigation, and drainage engineering, and environmental pollution.

suitability of form and format

some attempt was made to determine how people react to the admittedly inconvenient form of a computer printout. since financial considerations limited the possibilities to either this form or microfiche, those options were presented in the questionnaire. preference for the paper form was expressed by 84 percent of the users of the list in this form, against 8 percent who would have preferred microfiche. the population was evenly divided as to whether or not they wished to have the list in this call number order: 50 percent wanted it by straight shelflist or call number order and 50 percent wanted it alphabetically by author. the latter response may very well reflect the proportionally large number of respondents who were faculty and who supposedly would know the authors in their fields and do not use a subject approach when seeking materials.

while the original purpose of the research was to provide better subject access to a remote collection, it was also important to find out more about the user's response to microfiche if he could be given an improvement in service or a service he did not previously have. microfiche would be both more compact and less expensive if lists of this type were to be provided in many subjects and continually updated. for the microfiche section of the research project the library of congress classifications covering classics and related fields were chosen, partly on the basis that faculty in these areas had agreed to participate and encourage their students to use the list.
included were:

de1-de98       history - the mediterranean world
df101-df289    history - greece
dg11-dg209     history - italy
n5630-n5790    history of art - greek and roman
na200-na335    architecture - history - greek and roman
pa (all)       language and literature of greece and rome
z7001-z7005    bibliographies in linguistics, roman and greek literature, teaching languages

this subset produced about eleven thousand titles. the format of the com was the same as that on the paper printout, with general titles appearing at the top of each sheet or frame, e.g., shelflist-classics-greece. this took twenty-two microfiche with sixty-nine frames each listing seven or eight titles. the last frame on each fiche was an index to that fiche. a nonreduced (eyeball) character at the top listed the first call number on the fiche. it was envisioned that the user might know the general classification number, search for it by the eyeball character, then consult the index in the last frame to locate the proper frame for a specific class. in this way the user could browse through the subject area. the chief advantage of com lay in the fact that the small envelope of microfiche and a portable reader were easy to check out of the library and carry home or to an office where the user could browse through the library shelflist at a leisurely pace.

since initial reaction was negative, a subject index was prepared to make the list more usable to undergraduate students. this index was made up of the appropriate entries which appeared in the library of congress classification schedules, with all entries consolidated into one alphabet.1 using this index to find an entry (for example, "caesar, c. julius"), the student would find two areas to search: dg261-267 and pa6235-6269. he would find these areas on the microfiche with the eyeball characters, then search the index frame to find the appropriate pages.
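the lookup steps described above can be sketched in code, as a rough illustration only. the subject entry and its two class ranges come from the example in the text; the list of first call numbers per fiche (the eyeball characters) is hypothetical, as are the function names, and call numbers are compared only by their class letters and leading number.

```python
import re
from bisect import bisect_right

# Subject index entry taken from the article's example.
SUBJECT_INDEX = {
    "caesar, c. julius": [("dg261", "dg267"), ("pa6235", "pa6269")],
}

# Hypothetical eyeball characters: the first call number on each fiche,
# in shelflist order. The real set of twenty-two fiche is not given.
FICHE_STARTS = ["de1", "df101", "dg11", "dg200", "n5630",
                "pa1", "pa3000", "pa6000", "z7001"]


def class_key(call_number: str) -> tuple[str, int]:
    """Reduce a call number to (class letters, class number) for sorting."""
    match = re.match(r"([a-z]+)(\d+)", call_number.lower())
    if match is None:
        raise ValueError(f"unrecognized call number: {call_number!r}")
    return match.group(1), int(match.group(2))


_START_KEYS = [class_key(start) for start in FICHE_STARTS]


def fiche_for(call_number: str) -> int:
    """Return the 1-based number of the fiche whose range holds the call number."""
    position = bisect_right(_START_KEYS, class_key(call_number))
    if position == 0:
        raise LookupError(f"{call_number!r} precedes the first fiche")
    return position


def fiches_for_subject(subject: str) -> list[int]:
    """Fiche on which each class range for a subject begins (a range may
    continue onto the following fiche)."""
    return [fiche_for(start) for start, _end in SUBJECT_INDEX[subject]]


if __name__ == "__main__":
    print(fiches_for_subject("caesar, c. julius"))
```

a binary search over the first call numbers mirrors what the reader does by eye: find the last fiche whose eyeball character does not exceed the class sought, then turn to that fiche's index frame.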
the classics list with its index and instructions was packaged in neat, loose-leaf notebook form and, together with a portable reader, presented to classics faculty at two regional campuses. a set was also available in the library. the results were completely negative. reliance upon the cooperation of too small a number of teachers may have invalidated this part of the research, but the contrast in response to the similar printed list raised serious questions about user response to microfiche in an index or reference book situation.2 it had been anticipated that a population in the humanities or social sciences would have had more need than the science group for what was essentially a book list since serial titles did not include holdings. the complete lack of interest from the faculty in the field of classics was an unexpected disappointment but no firm conclusions could be drawn without a research strategy designed to remove any possible variables.

conclusion

increased use of marc cataloging through such systems as oclc and ballots will mean many more libraries will have their total holdings in machine-readable form with the capability of using their records in new ways. programs for distributing microfiche copies of library catalogs such as georgia tech's lends program provide inspiration for public service librarians to make use of the data and technology that technical services automation projects are supplying.3 this experiment in manipulating machine-readable library records for use in subject searching was an attempt toward better retrieval of a library's collection and indicated that such programs would be useful to extend service outside a single library location.

references

1. it may soon be possible to do this in a much simpler fashion by using the combined indexes to the library of congress classification schedules (washington, d.c.: u.s. historical documents institute, 1974).
2. doris bolef, "computer-output microfilm," special libraries 65:169-75 (april 1974). in describing the use of com at the washington university school of medicine, bolef said, "there is, however, an additional disadvantage, namely, the resistance of users to the use of microforms because of their inconvenience. patrons will sometimes choose not to read a publication when told it is available in some sort of microform only. it is assumed that librarians are not quite as reluctant, but it would be a mistake not to take this reluctance into consideration. this resistance by both librarians and patrons is stronger than is usually reported by com manufacturers and service bureaus" (p. 170-71).
3. the georgia tech library's complete card catalog is now available in microfiche form, brochure (atlanta: price gilbert memorial library, georgia institute of technology, 1972).

128 information technology and libraries | september 2010

lynne weber and peg lawrence

authentication and access: accommodating public users in an academic world

in cook and shelton’s managing public computing, which confirmed the lack of applicable guidelines on academic websites, had more up-to-date information but was not available to the researchers at the time the project was initiated.2 in the course of research, the authors developed the following questions:
■■ how many arl libraries require affiliated users to log into public computer workstations within the library?
■■ how many arl libraries provide the means to authenticate guest users and allow them to log on to the same computers used by affiliates?
■■ how many arl libraries offer open-access computers for guests to use? do these libraries provide both open-access computers and the means for guest user authentication?
■■ how do federal depository library program libraries balance their policy requiring computer authentication with the obligation to provide public access to government information?
■■ do computers provided for guest use (open access or guest login) provide different software or capabilities than those provided to affiliated users?
■■ how many arl libraries have written policies for the use of open-access computers? if a policy exists, what is it?
■■ how many arl libraries have written policies for authenticating guest users? if a policy exists, what is it?

■■ literature review

since the 1950s there has been considerable discussion within library literature about academic libraries serving “external,” “secondary,” or “outside” users. the subject has been approached from the viewpoint of access to the library facility and collections, reference assistance, interlibrary loan (ill) service, borrowing privileges, and (more recently) access to computers and internet privileges, including the use of proprietary databases. deale emphasized the importance of public relations to the academic library.3 while he touched on creating bonds both on and off campus, he described the positive effect of “privilege cards” to community members.4 josey described the variety of services that savannah state college offered to the community.5 he concluded his essay with these words: why cannot these tried methods of lending books to citizens of the community, story hours for children . . . , a library lecture series or other forum, a great books discussion group and the use of the library staff

in the fall of 2004, the academic computing center, a division of the information technology services department (its) at minnesota state university, mankato, took over responsibility for the computers in the public areas of memorial library. for the first time, affiliated memorial library users were required to authenticate using a campus username and password, a change that effectively eliminated computer access for anyone not part of the university community. this posed a dilemma for the librarians.
because of its federal depository status, the library had a responsibility to provide general access to both print and online government publications for the general public. furthermore, the library had a long tradition of providing guest access to most library resources, and there was reluctance to abandon the practice. therefore the librarians worked with its to retain a small group of six computers that did not require authentication and were clearly marked for community use, along with several standup, open-access computers on each floor used primarily for searching the library catalog. the additional need to provide computer access to high school students visiting the library for research and instruction led to more discussions with its and resulted in a means of generating temporary usernames and passwords through a web form. these user accommodations were implemented in the library without creating a written policy governing the use of open-access computers.

over time, library staff realized that guidelines for guests using the computers were needed because of misuse of the open-access computers. we were charged with the task of drafting these guidelines. in typical librarian fashion, we searched websites, including those of association of research libraries (arl) members, for existing computer access policies in academic libraries. we obtained very little information through this search, so we turned to arl publications for assistance. library public access workstation authentication, by lori driscoll, was of greater benefit and offered much of the needed information, but it was dated.1 a research result described

lynne weber (lnweber@mnsu.edu) is access services librarian and peg lawrence (peg.lawrence@mnsu.edu) is systems librarian, minnesota state university, mankato.
providing service to the unaffiliated, his survey revealed that 100 percent of responding libraries offered free in-house collection use for the general public, and many others offered additional services.16 brenda johnson described a one-day program in 1984 sponsored by rutgers university libraries forum titled “a case study in closing the university library to the public.” the participating librarians spent the day familiarizing themselves with the “facts” of the theoretical case and concluded that public access should be restricted but not completely eliminated. a few months later, consideration of closing rutgers’ library to the public became a real debate. although there were strong opposing viewpoints, the recommendation was to retain the open-door policy.17 jansen discussed the division between those who wanted to provide the finest service to primary users and those who viewed the library’s mission as including all who requested assistance. jansen suggested specific ways to balance the needs of affiliates and the public and referred to the dilemma at the university of california, berkeley, library, which had been closed to unaffiliated users.18 bobp and richey determined that california undergraduate libraries were emphasizing service to primary users at a time when it was no longer practical to offer the same level of service to primary and secondary users. they presented three courses of action: adherence to the status quo, adoption of a policy restricting access, or implementation of tiered service.19 throughout the 1990s, the debate over the public’s right to use academic libraries continued, with increasing focus on computer use in public and private academic libraries.
new authorization and authentication requirements increased the control of internal computers, but the question remained of how libraries would provide access to government information and respond to community members who expected to use the libraries supported by their taxes. morgan, who described himself as one who had spent his career encouraging equal access to information, concluded that it would be necessary to use authentication, authorization, and access control to continue offering information services readily available in the past.20 martin acknowledged that library use was changing as a result of the internet and that the public viewed the academic librarian as one who could deal with the explosion of information and offer service to the public.21 johnson described unaffiliated users as a group who wanted all the privileges of the affiliates; she discussed the obligation of the institution to develop policies managing these guest users.22 still and kassabian considered the dual responsibilities of the academic library to offer internet access to public users and to control internet material received and sent by primary and public users. further, they weighed

as consultants be employed toward the building of good relations between town and gown.6 later, however, deale indicated that the generosity to outsiders common in the 1950s was becoming unsustainable.7 deale used beloit college, with an “open door policy” extending more than 100 years, as an example of a school that had found it necessary to refuse out-of-library circulation to minors except through ill by the 1960s.8 also in 1964, waggoner related the increasing difficulty of accommodating public use of the academic library.
he encouraged a balance of responsibility to the public with the institution’s foremost obligation to the students and faculty.9 in october 1965, the ad hoc committee on community use of academic libraries was formed by the college library section of the association of college and research libraries (acrl). this committee distributed a 13-question survey to 1,100 colleges and universities throughout the united states. the high rate of response (71 percent) was considered noteworthy, and the findings were explored in “community use of academic libraries: a symposium,” published in 1967.10 the concluding article by josey (the symposium’s moderator) summarized the lenient attitudes of academic libraries toward public users revealed through survey and symposium reports. in the same article, josey followed up with his own arguments in favor of the public’s right to use academic libraries because of the state and federal support provided to those institutions.11 similarly, in 1976 tolliver reported the results of a survey of 28 wisconsin libraries (public academic, private academic, and public), which indicated that respondents made a great effort to serve all patrons seeking service.12 tolliver continued in a different vein from josey, however, by reporting the current annual fiscal support for libraries in wisconsin and commenting upon financial stewardship. tolliver concluded by asking, “how effective are our library systems and cooperative affiliations in meeting the information needs of the citizens of wisconsin?”13 much of the literature in the years following focused on serving unaffiliated users at a time when public and academic libraries suffered the strain of overuse and underfunding. the need for prioritization of primary users was discussed. 
in 1979, russell asked, “who are our legitimate clientele?” and countered the argument for publicly supported libraries serving the entire public by saying the public “cannot freely use the university lawn mowers, motor pool vehicles, computer center, or athletic facilities.”14 ten years later, russell, robison, and prather prefaced their report on a survey of policies and services for outside users at 12 consortia institutions by saying, “the issue of external users is of mounting concern to an institution whose income is student credit hour generated.”15 despite russell’s concerns about the strain of

be aware of the issues and of the effects that licensing, networking, and collection development decisions have on access.”35 in “unaffiliated users’ access to academic libraries: a survey,” courtney reported and analyzed data from her own comprehensive survey sent to 814 academic libraries in winter 2001.36 of the 527 libraries responding to the survey, 72 libraries (13.6 percent) required all users to authenticate to use computers within the library, while 56 (12.4 percent) indicated that they planned to require authentication in the next twelve months.37 courtney followed this with data from surveyed libraries that had canceled “most” of their indexes and abstracts (179 libraries, or 33.9 percent) and libraries that had canceled “most” periodicals (46 libraries, or 8.7 percent).38 she concluded that the extent to which the authentication requirement restricted unaffiliated users was not clear, and she asked, “as greater numbers of resources shift to electronic-only formats, is it desirable that they disappear from the view of the community user or the visiting scholar?”39 courtney’s “authentication and library public access computers: a call for discussion” described a follow-up with the academic libraries participating in her 2001 survey who had self-identified as using authentication or planning to employ
authentication within the next twelve months. she concluded that the libraries were ambivalent toward authentication, since more than half of the respondents provided some sort of public access. she encouraged librarians to carefully consider the library’s commitment to service before entering into blanket license agreements with vendors or agreeing to campus computer restrictions.40 several editions of the arl spec kit series showing trends of authentication and authorization for all users of arl libraries have been an invaluable resource in this investigation. an examination of earlier spec kits indicated that the definitions of “user authentication” and “authorization” have changed over the years. user authentication, by plum and bleiler, indicated that 98 percent of surveyed libraries authenticated users in some way, but at that time authentication would have been more precisely defined as authorization or permission to access personal records, such as circulation, e-mail, course registration, and file space. as such, neither authentication nor authorization was related to basic computer access.41 by contrast, it is common for current library users to authenticate to have any access to a public workstation. driscoll’s library public access workstation authentication sought information on how and why users were authenticated on public-access computers, who was driving the change, how it affected the ability of federal depository libraries to provide public information, and how it affected library services in general.42 but at the time of driscoll’s survey, only 11 percent of surveyed libraries required authentication on all computers and 22 percent required it only on selected terminals.
cook and shelton’s managing public computing

the reconciliation of material restrictions against “principles of freedom of speech, academic freedom, and the ala’s condemnation of censorship.”23 lynch discussed institutional use of authentication and authorization and the growing difficulty of verifying bona fide users of academic library subscription databases and other electronic resources. he cautioned that future technical design choices must reflect basic library values of free speech, personal confidentiality, and trust between academic institution and publisher.24 barsun specifically examined the webpages of one hundred arl libraries in search of information pertinent to unaffiliated users. she included a historic overview of the changing attitudes of academics toward service to the unaffiliated population and described the difficult balance of college community needs with those of outsiders in 2000 (the survey year).25 barsun observed a consistent lack of information on library websites regarding library guest use of proprietary databases.26 carlson discussed academic librarians’ concerns about “internet-related crimes and hacking” leading to reconsideration of open computer use, and he described the need to compromise patron privacy by requiring authentication.27 in a chapter on the relationship of it security to academic values, oblinger said, “one possible interpretation of intellectual freedom is that individuals have the right to open and unfiltered access to the internet.”28 this statement was followed later with “equal access to information can also be seen as a logical extension of fairness.”29 a short article in library and information update alerted the authors to a uk project investigating improved online access to resources for library visitors not affiliated with the host institution.30 salotti described higher education access to e-resources in visited institutions (haervi) and its development of a toolkit to assist with the complexities of
offering electronic resources to guest users.31 salotti summarized existing resources for sharing within the united kingdom and emphasized that “no single solution is likely to suit all universities and colleges, so we hope that the toolkit will offer a number of options.”32 launched by the society of college, national and university libraries (sconul), and universities and colleges information systems association (ucisa), haervi has created a best-practice guide.33 by far the most useful articles for this investigation have been those by nancy courtney. “barbarians at the gates: a half-century of unaffiliated users in academic libraries,” a literature review on the topic of visitors in academic libraries, included a summary of trends in attitude and practice toward visiting users since the 1950s.34 the article concluded with a warning: “the shift from printed to electronic formats . . . combined with the integration of library resources with campus computer networks and the internet poses a distinct threat to the public’s access to information even onsite. it is incumbent upon academic librarians to

introductory letter with the invitation to participate and a foreword containing definitions of terms used within the survey are in appendix a. in total, 61 (52 percent) of the 117 arl libraries invited to participate in the survey responded. this is comparable with the response rate for similar surveys reported by plum and bleiler (52 of 121, or 43 percent), driscoll (67 of 124, or 54 percent), and cook and shelton (69 of 123, or 56 percent).45

1. what is the name of your academic institution? the names of the 61 responding libraries are listed in appendix b.

2. is your institution public or private? see figure 1. respondents’ explanations of “other” are listed below.
■❏ state-related
■❏ trust instrument of the u.s.
people; quasigovernment
■❏ private state-aided
■❏ federal government research library
■❏ both: private foundation, public support

3. are affiliated users required to authenticate in order to access computers in the public area of your library? see figure 2.

4. if you answered “yes” to the previous question, does your library provide the means for guest users to authenticate? see figure 3. respondents’ explanations of “other” are listed below. all described open-access computers.
■❏ “we have a few ‘open’ terminals”
■❏ “4 computers don’t require authentication”
■❏ “some workstations do not require authentication”
■❏ “open-access pcs for guests (limited number and function)”
■❏ “no, but we maintain several open pcs for guests”
■❏ “some workstations do not require login”

5. is your library a federal depository library? see figure 4. this question caused some confusion for the canadian survey respondents because canada has its own depository services program corresponding to the u.s. federal depository program. consequently, 57 of the 61 respondents identified themselves as federal depository (including three canadian libraries), although 5 of the 61 are more accurately members of the canadian depository services program. only two responding libraries were members of neither the u.s. federal depository program nor the canadian depository services program.

6. if you answered “yes” to the previous question, and computer authentication is required, what provisions have been made to accommodate use of online government documents by the general public in the library? please check all that

touched on every aspect of managing public computing, including public computer use, policy, and security.43 even in 2007, only 25 percent of surveyed libraries required authentication on all computers, but 46 percent required authentication on some computers, showing the trend toward an ever increasing number of libraries requiring public workstation authentication.
most of the responding libraries had a computer-use policy, with 48 percent following an institution-wide policy developed by the university or central it department.44

■■ method

we constructed a survey designed to obtain current data about authentication in arl libraries and to provide insight into how guest access is granted at various academic institutions. it should be noted that the object of the survey was access to computers located in the public areas of the library for use by patrons, not access to staff computers. we constructed a simple, fourteen-question survey using the zoomerang online tool (http://www.zoomerang.com/). a list of the deans, directors, and chief operating officers from the 123 arl libraries was compiled from an internet search. we eliminated the few library administrators whose addresses could not be readily found and sent the survey to 117 individuals with the request that it be forwarded to the appropriate respondent. the recipients were informed that the goal of the project was “determination of computer authentication and current computer access practices within arl libraries” and that the intention was “to reflect practices at the main or central library” on the respondent’s campus. recipients were further informed that the names of the participating libraries and the responses would be reported in the findings, but that there would be no link between responses given and the name of the participating library. the survey introduction included the name and contact information of the institutional review board administrator for minnesota state university, mankato. potential respondents were advised that the e-mail served as informed consent for the study. the survey was administered over approximately three weeks. we sent reminders three, five, and seven days after the survey was launched to those who had not already responded.
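the response rates reported for this survey (61 of 117) and for the three earlier surveys it is compared with can be rechecked with a short python sketch. the counts are taken from the article; the dictionary layout and function name are illustrative assumptions, and rounding to the nearest whole percent is assumed to match the published figures.

```python
# Recompute the response rates quoted in the article.
# Each entry is (libraries responding, libraries invited), as reported.
surveys = {
    "plum and bleiler": (52, 121),
    "driscoll": (67, 124),
    "cook and shelton": (69, 123),
    "weber and lawrence": (61, 117),
}


def response_rate(responded: int, invited: int) -> int:
    """Return the response rate as a whole-number percentage."""
    return round(100 * responded / invited)


rates = {name: response_rate(*counts) for name, counts in surveys.items()}
for name, rate in rates.items():
    print(f"{name}: {rate} percent")
```

computed this way, each rate agrees with the percentage printed in the article, which supports the authors' statement that the response was comparable with that of the earlier surveys.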
■■ survey questions, responses, and findings

we administered the survey, titled “authentication and access: academic computers 2.0,” in late april 2008. following is a copy of the fourteen-question survey with responses, interpretative data, and comments. the

■❏ “some computers are open access and require no authentication”
■❏ “some workstations do not require login”

7. if your library has open-access computers, how many do you provide? (supply number). see figure 6. a total of 61 institutions responded to this question, and 50 reported open-access computers. the number of open-access computers ranged from 2 to 3,000. as expected, the highest numbers were reported by libraries that did not require authentication for affiliates. the mean number of open-access computers was 161.2, the median was 23, the mode was 30, and the range was 2,998.

8. please indicate which online resources and services are available to authenticated users. please check all that apply. see figure 7.
■❏ online catalog
■❏ government documents
■❏ internet browser

apply. see figure 5.
■❏ temporary user id and password
■❏ open access computers (unlimited access)
■❏ open access computers (access limited to government documents)
■❏ other

of the 57 libraries that responded “yes” to question 5, 30 required authentication for affiliates. these institutions offered the general public access to online government documents in various ways. explanations of “other” are listed below. three of these responses indicate, by survey definition, that open-access computers were provided.
■❏ “catalog-only workstations”
■❏ “4 computers don’t require authentication”
■❏ “generic login and password”
■❏ “librarians login each guest individually”
■❏ “provision made for under-18 guests needing gov doc”
■❏ “staff in gov info also login user for quick use”
■❏ “restricted guest access on all public devices”

figure 3.
institutions with the means to authenticate guests
figure 4. libraries with federal depository and/or canadian depository services status
figure 2. institutions requiring authentication
figure 1. categories of responding institutions

11. does your library have a written policy for use of open access computers in the public area of the library? question 7 indicates that 50 of the 61 responding libraries did offer the public two or more open-access computers. out of the 50, 28 responded that they had a written policy governing the use of computers. conversely, open-access computers were reported at 22 libraries that had no reported written policy.

12. if you answered “yes” to the previous question, please give the link to the policy and/or summarize the policy. twenty-eight libraries gave a url, a url plus a summary explanation, or a summary explanation with no url.

13. does your library have a written policy for authenticating guest users? out of the 32 libraries that required their users to authenticate (see question 3), 23 also had the means to allow their guests to authenticate (see question 4). fifteen of those libraries said they had a policy.

14. if you answered “yes” to the previous question, please give the link to the policy and/or summarize the policy. eleven

■❏ licensed electronic resources
■❏ personal e-mail access
■❏ microsoft office software

9. please indicate which online resources and services are available to authenticated guest users. please check all that apply. see figure 8.
■❏ online catalog
■❏ government documents
■❏ internet browser
■❏ licensed electronic resources
■❏ personal e-mail access
■❏ microsoft office software

10. please indicate which online resources and services are available on open-access computers. please check all that apply. see figure 9.
■❏ online catalog
■❏ government documents
■❏ internet browser
■❏ licensed electronic resources
■❏ personal e-mail access
■❏ microsoft office software

figure 5. provisions for the online use of government documents where authentication is required
figure 6. number of open-access computers offered
figure 7. electronic resources for authenticated affiliated users (n = 32)
figure 8. resources for authenticating guest users (n = 23)

■■ respondents and authentication

figure 10 compares authentication practices of public, private, and other institutions described in response to question 2. responses from public institutions outnumbered those from private institutions, but within each group a similar percentage of libraries required their affiliated users to authenticate. therefore no statistically significant difference was found between authenticating affiliates in public and private institutions. of the 61 respondents, 32 (52 percent) required their affiliated users to authenticate (see question 3) and 23 of the 32 also had the means to authenticate guests (see question 4). the remaining 9 offered open-access computers. fourteen libraries had both the means to authenticate guests and had open-access computers (see questions 4 and 7). when we compare the results of the 2007 study by cook and shelton with the results of the current study (completed in 2008), the results are somewhat contradictory (see table 1).46 the differences in survey data seem to indicate that authentication requirements are decreasing; however, the literature review (specifically cook and shelton and the 2003 courtney article) clearly indicates that authentication is on the rise.47 this dichotomy may be explained, in part, by the fact that of the more than 60 arl libraries responding to both surveys, there was an overlap of only 34 libraries. the 30 u.s.
federal depository or canadian depository services libraries that required their affiliated users to authenticate (see questions 3 and 5) provided guest access ranging from usernames and passwords, to open-access computers, to computers restricted to

libraries gave the url to their policy; 4 summarized their policies.

■■ research questions answered

the study resulted in answers to the questions we posed at the outset:
■■ thirty-two (52 percent) of the responding arl libraries required affiliated users to log in to public computer workstations in the library.
■■ twenty-three (72 percent) of the 32 arl libraries requiring affiliated users to log in to public computers provided the means for guest users to log in to public computer workstations in the library.
■■ fifty (82 percent) of 61 responding arl libraries provided open-access computers for guest users; 14 (28 percent) of those 50 libraries provided both open-access computers and the means for guest authentication.
■■ without exception, all u.s. federal depository or canadian depository services libraries that required their users to authenticate offered guest users some form of access to online information.
■■ survey results indicated some differences between software provided to various users on differently accessed computers. office software was less frequently provided on open-access computers.
■■ twenty-eight responding arl libraries had written policies relating to the use of open-access computers.
■■ fifteen responding arl libraries had written policies relating to the authentication of guests.

figure 9. electronic resources on open access computers (n = 50)
figure 10. comparison of library type and authentication requirement

■■ one library had guidelines for use posted next to the workstations but did not give specifics.
■■ fourteen of those requiring their users to authenticate had both open-access computers and guest authentication to offer to visitors of their libraries. other policy information was obtained by an examination of the 28 websites listed by respondents: ■■ ten of the sites specifically stated that the open-access computers were for academic use only. ■■ five of the sites specified time limits for use of openaccess computers, ranging from 30 to 90 minutes. ■■ four stated that time limits would be enforced when others were waiting to use computers. ■■ one library used a sign-in sheet to monitor time limits. ■■ one library mentioned a reservation system to monitor time limits. ■■ two libraries prohibited online gambling. ■■ six libraries prohibited viewing sexually explicit materials. ■■ guest-authentication policies of the 23 libraries that had the means to authenticate their guests, 15 had a policy for guests obtaining a username and password to authenticate, and 6 outlined their requirements of showing identification and issuing access. the other 9 had open-access computers that guests might use. the following are some of the varied approaches to guest authentication: ■■ duration of the access (when mentioned) ranged from 30 days to 12 months. ■■ one library had a form of sponsored access where current faculty or staff could grant a temporary username and password to a visitor. ■■ one library had an online vouching system that allowed the visitor to issue his or her own username and password online. ■■ one library allowed guests to register themselves by swiping an id or credit card. ■■ one library had open-access computers for local resources and only required authentication to leave the library domain. ■■ one library had the librarians log the users in as guests. ■■ one library described the privacy protection of collected personal information. ■■ no library mentioned charging a fee for allowing computer access. 
government documents, to librarians logging in for guests (see question 6). numbers of open-access computers ranged widely from 2 to more than 3,000 (see question 7). eleven (19 percent) of the responding u.s. federal depository or canadian depository services libraries that did not provide open-access computers issued a temporary id (nine libraries), provided open access limited to government documents (one library), or required librarian login for each guest (one library). all libraries with u.s. federal depository or canadian depository services status provided a means of public access to information to fulfill their obligation to offer government documents to guests. figure 11 shows a comparison of resources available to authenticated users and authenticated guests and offered on open-access computers. as might be expected, almost all institutions provided access to online catalogs, government documents, and internet browsers. fewer allowed access to licensed electronic resources and e-mail. access to office software showed the most dramatic drop in availability, especially on open-access computers. ■■ open-access computer policies as mentioned earlier, 28 libraries had written policies for their open-access computers (see question 11), and 28 libraries gave a url, a url plus a summary explanation, or a summary explanation with no url (see question 12). in most instances, the library policy included their campus’s acceptable-use policy. seven libraries cited their campus’s acceptable-use policy and nothing else. nearly all libraries applied the same acceptable-use policy to all users on all computers and made no distinction between policies for use of open-access computers or computers requiring authentication. following are some of the varied aspects of summarized policies pertaining to open-access computers: ■■ eight libraries stated that the computers were for academic use and that users might be asked to give up their workstation if others were waiting. 
table 1. comparison of findings from cook and shelton (2007) and the current survey (2008)

authentication requirements    2007 (n = 69)    2008 (n = 61)
some required                  28 (46%)         23 (38%)
required for all               15 (25%)         9 (15%)
not required                   18 (30%)         29 (48%)

■■ further study

although the survey answered many of our questions, other questions arose. while the number of libraries requiring affiliated users to log on to their public computers is increasing, this study does not explain why this is the case. reasons could include reactions to the september 11 disaster, the usa patriot act, general security concerns, or the convenience of the personalized desktop and services for each authenticated user. perhaps a future investigation could focus on reasons for more frequent requirement of authentication. other subjects that arose in the examination of institutional policies were guest fees for services, age limits for younger users, computer time limits for guests, and collaboration between academic and public libraries.

■■ policy developed as a result of the survey findings

as a result of what was learned in the survey, we drafted guidelines governing the use of open-access computers by visitors and other non-university users. the guidelines can be found at http://lib.mnsu.edu/about/libvisitors.html#access. these guidelines inform guests that open-access computers are available to support their research, study, and professional activities. the computers also are governed by the campus policy and the state university system acceptable-use policy. guideline provisions enable staff to ask users to relinquish a computer when others are waiting or if the computer is not being used for academic purposes. while this library has the ability to generate temporary usernames and passwords, and does so for local schools coming to the library for research, no guidelines have yet been put in place for this function.

figure 11.
online resources available to authenticated affiliated users, guest users, open-access users

these practices depend on institutional missions and goals and are limited by reasonable considerations. in the past, accommodation at some level was generally offered to the community, but the complications of affiliate authentication, guest registration, and vendor-license restrictions may effectively discourage or prevent outside users from accessing principal resources. on the other hand, open-access computers facilitate access to electronic resources. those librarians who wish to provide the same level of commitment to guest users as in the past as well as protect the rights of all should advocate to campus policy-makers at every level to allow appropriate guest access to computers to fulfill the library’s mission. in this way, the needs and rights of guest users can be balanced with the responsibilities of using campus computers. in addition, librarians should consider ensuring that the licenses of all electronic resources accommodate walk-in users and developing guidelines to prevent incorporation of electronic materials that restrict such use. this is essential if the library tradition of freedom of access to information is to continue.

finally, in regard to external or guest users, academic librarians are pulled in two directions; they are torn between serving primary users and fulfilling the principles of intellectual freedom and free, universal access to information along with their obligations as federal depository libraries. at the same time, academic librarians frequently struggle with the goals of the campus administration responsible for providing secure, reliable networks, sometimes at the expense of the needs of the outside community.
the data gathered in this study, indicating that 82 percent of responding libraries continue to provide at least some open-access computers, is encouraging news for guest users. balancing public access and privacy with institutional security, while a current concern, may be resolved in the way of so many earlier preoccupations of the electronic age. given the pervasiveness of the problem, however, fair and equitable treatment of all library users may continue to be a central concern for academic libraries for years to come. references 1. lori driscoll, library public access workstation authentication, spec kit 277 (washington, d.c.: association of research libraries, 2003). 2. martin cook and mark shelton, managing public computing, spec kit 302 (washington, d.c.: association of research libraries, 2007): 16. 3. h. vail deale, “public relations of academic libraries,” library trends 7 (oct. 1958): 269–77. 4. ibid., 275. 5. e. j. josey, “the college library and the community,” faculty research edition, savannah state college bulletin (dec. 1962): 61–66. ■■ conclusions while we were able to gather more than 50 years of literature pertaining to unaffiliated users in academic libraries, it soon became apparent that the scope of consideration changed radically through the years. in the early years, there was discussion about the obligation to provide service and access for the community balanced with the challenge to serve two clienteles. despite lengthy debate, there was little exception to offering the community some level of service within academic libraries. early preoccupation with physical access, material loans, ill, basic reference, and other services later became a discussion of the right to use computers, electronic resources, and other services without imposing undue difficulty to the guest. current discussions related to guest users reflect obvious changes in public computer administration over the years. 
authentication presently is used at a more fundamental level than in earlier years. in many libraries, users must be authorized to use the computer in any way whatsoever. as more and more institutions require authentication for their primary users, accommodation must be made if guests are to continue being served. in addition, as courtney’s 2003 research indicates, an ever increasing number of electronic databases, indexes, and journals replace print resources in library collections. this multiplies the roadblocks for guest users and exacerbates the issue.48 unless special provisions are made for computer access, community users are left without access to a major part of the library’s collections. because 104 of the 123 arl libraries (85 percent) are federal depository or canadian depository services libraries, the researchers hypothesized that most libraries responding to the survey would offer open-access computers for the use of nonaffiliated patrons. this study has shown that federal depository libraries have remained true to their mission and obligation of providing public access to government-generated documents. every federal depository respondent indicated that some means was in place to continue providing visitor and guest access to the majority of their electronic resources— whether through open-access computers, temporary or guest logins, or even librarians logging on for users. while access to government resources is required for the libraries housing government-document collections, libraries can use considerably more discretion when considering what other resources guest patrons may use. despite the commitment of libraries to the dissemination of government documents, the increasing use of authentication may ultimately diminish the libraries’ ability and desire to accommodate the information needs of the public. this survey has provided insight into the various ways academic libraries serve guest users. 
not all academic libraries provide public access to all library resources. 138 information technology and libraries | september 2010 identify yourself,” chronicle of higher education 50, no. 42 (june 25, 2004): a39, http://search.ebscohost.com/login.aspx?direct =true&db=aph&an=13670316&site=ehost-live (accessed mar. 2, 2009). 28. diana oblinger, “it security and academic values,” in luker and petersen, computer & network security in higher education, 4, http://net.educause.edu/ir/library/pdf/pub7008e .pdf (accessed july 14, 2008). 29. ibid., 5. 30. “access for non-affiliated users,” library & information update 7, no. 4 (2008): 10. 31. paul salotti, “introduction to haervi-he access to e-resources in visited institutions,” sconul focus no. 39 (dec. 2006): 22–23, http://www.sconul.ac.uk/publications/ newsletter/39/8.pdf (accessed july 14, 2008). 32. ibid., 23. 33. universities and colleges information systems association (ucisa), haervi: he access to e-resources in visited institutions, (oxford: ucisa, 2007), http://www.ucisa.ac.uk/ publications/~/media/files/members/activities/haervi/ haerviguide%20pdf (accessed july 14, 2008). 34. nancy courtney, “barbarians at the gates: a half-century of unaffiliated users in academic libraries,” journal of academic librarianship 27, no. 6 (nov. 2001): 473–78, http://search.ebsco host.com/login.aspx?direct=true&db=aph&an=5602739&site= ehost-live (accessed july 14, 2008). 35. ibid., 478. 36. nancy courtney, “unaffiliated users’ access to academic libraries: a survey,” journal of academic librarianship 29, no. 1 (jan. 2003): 3–7, http://search.ebscohost.com/login.aspx?dire ct=true&db=aph&an=9406155&site=ehost-live (accessed july 14, 2008). 37. ibid., 5. 38. ibid., 6. 39. ibid., 7. 40. nancy courtney, “authentication and library public access computers: a call for discussion,” college & research libraries news 65, no. 
5 (may 2004): 269–70, 277, www.ala .org/ala/mgrps/divs/acrl/publications/crlnews/2004/may/ authentication.cfm (accessed july 14, 2008). 41. terry plum and richard bleiler, user authentication, spec kit 267 (washington, d.c.: association of research libraries, 2001): 9. 42. lori driscoll, library public access workstation authentication, spec kit 277 (washington, d.c.: association of research libraries, 2003): 11. 43. cook and shelton, managing public computing. 44. ibid., 15. 45. plum and bleiler, user authentication, 9; driscoll, library public access workstation authentication, 11; cook and shelton, managing public computing, 11. 46. cook and shelton, managing public computing, 15. 47. ibid.; courtney, unaffiliated users, 5–7. 48. courtney, unaffiliated users, 6–7. 6. ibid., 66. 7. h. vail deale, “campus vs. community,” library journal 89 (apr. 15, 1964): 1695–97. 8. ibid., 1696. 9. john waggoner, “the role of the private university library,” north carolina libraries 22 (winter 1964): 55–57. 10. e. j. josey, “community use of academic libraries: a symposium,” college & research libraries 28, no. 3 (may 1967): 184–85. 11. e. j. josey, “implications for college libraries,” in “community use of academic libraries,” 198–202. 12. don l. tolliver, “citizens may use any tax-supported library?” wisconsin library bulletin (nov./dec. 1976): 253. 13. ibid., 254. 14. ralph e. russell, “services for whom: a search for identity,” tennessee librarian: quarterly journal of the tennessee library association 31, no. 4 (fall 1979): 37, 39. 15. ralph e. russell, carolyn l. robison, and james e. prather, “external user access to academic libraries,” the southeastern librarian 39 (winter 1989): 135. 16. ibid., 136. 17. brenda l. johnson, “a case study in closing the university library to the public,” college & research library news 45, no. 8 (sept. 1984): 404–7. 18. lloyd m. 
jansen, “welcome or not, here they come: unaffiliated users of academic libraries,” reference services review 21, no. 1 (spring 1993): 7–14. 19. mary ellen bobp and debora richey, “serving secondary users: can it continue?” college & undergraduate libraries 1, no. 2 (1994): 1–15. 20. eric lease morgan, “access control in libraries,” computers in libraries 18, no. 3 (mar. 1, 1998): 38–40, http://search .ebscohost.com/login.aspx?direct=true&db=aph&an=306709& site=ehost-live (accessed aug. 1, 2008). 21. susan k. martin, “a new kind of audience,” journal of academic librarianship 24, no. 6 (nov. 1998): 469, library, information science & technology abstracts, http://search.ebsco host.com/login.aspx?direct=true&db=aph&an=1521445&site= ehost-live (accessed aug. 8, 2008). 22. peggy johnson, “serving unaffiliated users in publicly funded academic libraries,” technicalities 18, no. 1 (jan. 1998): 8–11. 23. julie still and vibiana kassabian, “the mole’s dilemma: ethical aspects of public internet access in academic libraries,” internet reference services quarterly 4, no. 3 (1999): 9. 24. clifford lynch, “authentication and trust in a networked world,” educom review 34, no. 4 (jul./aug. 1999), http://search .ebscohost.com/login.aspx?direct=true&db=aph&an=2041418 &site=ehost-live (accessed july 16, 2008). 25. rita barsun, “library web pages and policies toward ‘outsiders’: is the information there?” public services quarterly 1, no. 4 (2003): 11–27. 26. ibid., 24. 27. scott carlson, “to use that library computer, please authentication and access | weber and lawrence 139 appendix a. the survey introduction, invitation to participate, and forward dear arl member library, as part of a professional research project, we are attempting to determine computer authentication and current computer access practices within arl libraries. we have developed a very brief survey to obtain this information which we ask one representative from your institution to complete before april 25, 2008. 
the survey is intended to reflect practices at the main or central library on your campus. names of libraries responding to the survey may be listed but no identifying information will be linked to your responses in the analysis or publication of results. if you have any questions about your rights as a research participant, please contact anne blackhurst, minnesota state university, mankato irb administrator. anne blackhurst, irb administrator minnesota state university, mankato college of graduate studies & research 115 alumni foundation mankato, mn 56001 (507)389-2321 anne.blackhurst@mnsu.edu you may preview the survey by scrolling to the text below this message. if, after previewing you believe it should be handled by another member of your library team, please forward this message appropriately. alternatively, you may print the survey, answer it manually and mail it to: systems/ access services survey library services minnesota state university, mankato ml 3097—po box 8419 mankato, mn 56001-8419 (usa) we ask you or your representative to take 5 minutes to answer 14 questions about computer authentication practices in your main library. participation is voluntary, but follow-up reminders will be sent. this e-mail serves as your informed consent for this study. your participation in this study includes the completion of an online survey. your name and identity will not be linked in any way to the research reports. clicking the link to take the survey shows that you understand you are participating in the project and you give consent to our group to use the information you provide. you have the right to refuse to complete the survey and can discontinue it at any time. to take part in the survey, please click the link at the bottom of this e-mail. thank you in advance for your contribution to our project. if you have questions, please direct your inquiries to the contacts given below. thank you for responding to our invitation to participate in the survey. 
this survey is intended to determine current academic library practices for computer authentication and open access. your participation is greatly appreciated. below are the definitions of terms used within this survey: ■■ “authentication”: a username and password are required to verify the identity and status of the user in order to log on to computer workstations in the library. ■■ “affiliated user”: a library user who is eligible for campus privileges. ■■ “non-affiliated user”: a library user who is not a member of the institutional community (an alumnus may be a nonaffiliated user). this may be used interchangeably with “guest user.” ■■ “guest user”: visitor, walk-in user, nonaffiliated user. ■■ “open access computer”: computer workstation that does not require authentication by user. 140 information technology and libraries | september 2010 appendix b. responding institutions 1. university at albany state university of new york 2. university of alabama 3. university of alberta 4. university of arizona 5. arizona state university 6. boston college 7. university of british columbia 8. university at buffalo, state university of ny 9. case western reserve university 10. university of california berkeley 11. university of california, davis 12. university of california, irvine 13. university of chicago 14. university of colorado at boulder 15. university of connecticut 16. columbia university 17. dartmouth college 18. university of delaware 19. university of florida 20. florida state university 21. university of georgia 22. georgia tech 23. university of guelph 24. howard university 25. university of illinois at urbana-champaign 26. indiana university bloomington 27. iowa state university 28. johns hopkins university 29. university of kansas 30. university of louisville 31. louisiana state university 32. mcgill university 33. university of maryland 34. university of massachusetts amherst 35. university of michigan 36. michigan state university 37. 
university of minnesota 38. university of missouri 39. massachusetts institute of technology 40. national agricultural library 41. university of nebraska-lincoln 42. new york public library 43. northwestern university 44. ohio state university 45. oklahoma state university 46. university of oregon 47. university of pennsylvania 48. university of pittsburgh 49. purdue university 50. rice university 51. smithsonian institution 52. university of southern california 53. southern illinois university carbondale 54. syracuse university 55. temple university 56. university of tennessee 57. texas a&m university 58. texas tech university 59. tulane university 60. university of toronto 61. vanderbilt university

communications

tam dalrymple

“just-in-case” answers: the twenty-first century vertical file

this article discusses the use of oclc’s questionpoint service for managing electronic publications and other items that fall outside the scope of oclc library’s opac and web resources pages, yet need to be “put somewhere.” the local knowledge base serves as both a collection development tool and as a virtual vertical file, with records that are easy to enter, search, update, or delete.

we do not deliberately collect for the vertical file, but add to it day by day the useful thing which turns up. these include clippings from newspapers, excerpts from periodicals . . . broadsides that are not injured by folding . . . anything that we know will be used if available. —wilson bulletin, 1919

information that “will be used if available” sounds like the contents of the internet.1 as with libraries everywhere, the oclc library has come to depend on the internet as an almost limitless resource. and like libraries everywhere, it has confronted the advantages and disadvantages of that scope.
this means that in addition to using the opac and oclc library’s webpages, oclc library staff have used a mix of bookmarks, del.icio.us tags, and post-it® notes to keep track of relevant, authoritative, substantive, and potentially reusable information. much has been written about the use of questionpoint’s transaction management capabilities and of the important role of knowledge bases in providing closure to an inquiry. in contrast, this article will look at questionpoint’s use as a management tool for future questions, for items that fall outside the scope of oclc library’s opac and web resources pages yet need to be “put somewhere.” the questionpoint local knowledge base is just the spot for these new vertical file items. about oclc library oclc is the world’s largest nonprofit membership computer library service and research organization. more than 69,000 libraries in 112 countries and territories around the world use oclc services to locate, acquire, catalog, lend, and preserve library materials. oclc library was established in 1977 to provide support for oclc’s mission. the collection concentrates on library, information and computer sciences, business management, and has special collections that include the papers of frederick g. kilgour and archives of the dewey decimal classification™. oclc library has a distinct clientele to which it offers a complete range of services—print and electronic collections, reference, interlibrary loan—within its subject areas. because of the nature of the organization, the library supports longterm and collaborative research, such as that done by oclc programs and research staff, as well as the immediate information needs of product management and marketing staff. oclc library also provides information to oclc’s other service areas, such as finance and human resources. 
while most oclc library acquisitions are done on demand, oclc library selects and maintains an extensive collection of periodicals, journals, and reference resources, most of them online and accessible—along with the opac—to oclc employees worldwide from the library’s webpages (see figure 1). often, however, oclc staff, like those of many organizations, are too busy to consult these resources themselves and thus depend on the library. oclc library staff pursue the answers to such research questions through the library’s collections and look to enhance the collections with “anything that we know will be” of use. one of the challenges is keeping track of the “anything” that falls outside the library’s primary collections scope; questionpoint helps with that task.

traditional uses of questionpoint

questionpoint is a service that provides question management tools aimed at increasing the visibility of reference services and making them more efficient. oclc library uses many of those tools, but there are significant ones it does not use (for example, chat). and although the library’s questionpoint-based aska link is visible by default on the front page of the corporate intranet as well as on oclc library–specific pages, less than 8 percent of questions over the last year were received through that link. one reason for this low use may be that for most of oclc library’s history, e-mail has been the primary contact method, and so it remains. even when the staff need clarification of a question, they automatically opt for telephone or e-mail messaging. working with a web form and question-and-answer software has not caught on as a replacement for these more established methods.

however, questionpoint remains the reference “workspace.” when questions come in through e-mail or phone, librarians enter them into questionpoint, using it to add notes and keep track of sources checked. completed transactions are added to the local knowledge base. (because their questions involve proprietary matters, many special libraries do not add their answers to the global knowledge base, and oclc library is no exception. the local knowledge base is accessible only by oclc library staff.)

tam dalrymple (dalrympt@oclc.org) is senior information specialist at oclc, dublin, ohio.

26 information technology and libraries | december 2008

not surprisingly, most of the questions received are about libraries, museums, and other cultural institutions, their collections, users, and staff. this means that the likelihood of reuse of the information in the oclc library knowledge base is relatively high, and makes the local knowledge base an early stop in the reference process. though statistics vary widely by individual institutions and type of library—and though some libraries have opted not to use the knowledge base—the average ratio for all questionpoint libraries is about one knowledge base search for every three questions received. in contrast, in the past year oclc library staff averaged 4.2 local knowledge base searches for every three questions received.

the view of the questionpoint knowledge base as a repository of answers to questions that have been asked is a traditional one. oclc library’s use of the questionpoint knowledge base in anticipation of information needs of its clients—as a way of collection development—is distinctive. in many respects this use creates an updated version of the old-fashioned vertical file.

nontraditional uses of questionpoint

just-in-case

the vertical file has a quirky place in the annals of librarianship. it has been the repository for facts and information too good to throw away but not quite good enough to catalog. h. w.
wilson still offers its vertical file index, a specialized subject index to pamphlets issued on topics often unavailable in book form, which began in 1932. by now, except for special collections, the internet has practically relegated the vertical file to the backroom with the card platens and electric erasers.

oclc library now uses its questionpoint knowledge base to manage information that once might have gone into a vertical file: the authoritative reports, studies, .org sites, and other resources that are often not substantive enough to catalog, but too good to hide away in a single staff member’s bookmarks. the questionpoint knowledge base provides a place for these resources; more importantly, questionpoint provides fast, efficient ways to collect, tag, manage, and use them.

questionpoint allows development of such collections with powerful capabilities that allow for future retrieval and use of the information, and it does so without the incredibly time-consuming processes of the past. a 1909 account of such processes details the inefficiency of yore:

in the public library [sic] of newark, n.j., material is filed in folders made of no. 1 tag manila paper, cut into pieces about 11x18 inches in size. one end is so turned up against the others as to make a receptacle 11x19 1/2 inches. the front fold is a half inch shorter than the back one, and this leaves a margin exposed on the back one, whereon the subject of that folder is written.2

thus a major benefit of using questionpoint to manage these resources is saving time. because questionpoint is a routine part of oclc library’s workflow, it allows the addition of items directly to the knowledge base quickly and with a minimum of fuss. there is initially no need to make the entry “pretty,” but only to describe the resource briefly, add the url, and tag it (see figure 2). unlike filing in a physical vertical file, tagging items in the knowledge base allows them to be “put” in multiple places. staff can also add comments that characterize the authoritativeness of a resource.

figure 1. oclc library intranet homepage

occasionally librarians come across articles or resources that might address multiple questions. instead of burying the data in one overarching knowledge base record, staff can make an entry for each aspect of the resource. an example of this is www.galbithink.org/libraries/analysis.htm, a page created by douglas galbi, senior economist with the federal communications commission (see figure 3). the site provides statistics, including historical statistics, on u.s. public libraries. rather than describe these generically with a tag like “library statistics”—not very useful in any case—each source can be added separately to the questionpoint knowledge base. for example, the item “audiovisual materials in u.s. public libraries” can be assigned specific tags—audiovisual, av, videos—that will make the data more accessible in the future. in other words, librarians use the faq model of asking and answering just one question at a time.

an important element in adding “answers” to oclc library’s knowledge base is the ability to provide context. with questionpoint, librarians can describe not only what the resource is but also why it may be of future use. and just the act of adding information to the knowledge base serves as a valuable mnemonic—“i’ve seen that somewhere.” records added to the knowledge base in this way can be easily updated with information about newer editions or better sources. equally valuable is the ability to edit and add keywords when the resource becomes useful for unforeseen questions.

sharing information with staff

the knowledge base also serves as a more formal collection development tool.
when librarians run across potentially valuable resources, they can send a description and a link to a product manager who may find it of use. library staff use questionpoint’s keyword capability to add tags of people’s names and job titles to facilitate ongoing current awareness. employees may provide feedback suggesting an item be added to the permanent print collection, or linked to from the library website. oclc library strives to inform users without subjecting them to information overload. when a 2007 survey of oclc staff found the library’s rss feeds seldom used, librarians began to send e-mails directly to individuals and teams. the reaction of oclc staff indicates that such personal messages, with content summaries that allow recipients to quickly evaluate the contents, are more often read than oclc library rss feeds, especially if items sent continue to be valuable. requirements that enable this kind of sharing include knowledge of company goals, staff needs, and product initiatives. to keep up-to-date, librarians meet regularly with other oclc staff, and monitor organizational changes. attendance at oclc’s members council meetings provides information on hot topics that help identify resources for future use. while oclc’s growth as a global organization has brought challenges in maintaining awareness of the full range of organization needs, the questionpoint knowledge base offers a practical way to manage increased volume.
figure 2. a sample questionpoint entry, this for a report by the national endowment for the arts
figure 3. a page with diverse facts and figures: www.galbithink.org/libraries/analysis.htm
maintaining resources of potential interest to staff with questionpoint has another benefit: it helps keep librarians aware of internal experts who can help the library with questions, and in many cases allows the library to connect staff with mutual interests to one another. this has become especially important as oclc has grown and its services continue to integrate with one another.
conclusions
beyond its usefulness as a system to receive, manage, and answer inquiries, questionpoint is providing a way to facilitate access to online resources that addresses the particular needs of oclc library’s constituency. it is fast and easy to use: a standard part of the daily workflow. it enables direct links to sources and accommodates tagging those sources with the names of people and projects, as well as subjects. it serves as part of the library’s collection management and selection system. using questionpoint in this way has some potential drawbacks. “just in case” acquisition of virtual resources entails some of the risks of traditional acquisitions: acquiring resources that are seldom used, creating a database of resources that are difficult to retrieve, and perhaps the necessity of “weeding” or updating obsolete items. with company growth comes the issue of scalability, as well. but for now, the benefits have far outweighed the risks. most of the items added have been identified for and shared with at least one staff member, so the effort has provided immediate payoff.
- the knowledge base serves as a collection development tool, helping to identify items that can be cataloged and added to the permanent collection.
- the record in the knowledge base can serve as a reminder to check for later editions.
- the knowledge base records are easy to update or even delete.
the questionpoint virtual vertical file helps oclc library manage and share those useful things that “just turn up.”
references
1. “the vertical file for pamphlets and miscellany,” wilson bulletin 1, no.
16 (june 1919): 351.
2. kate louise roberts, “vertical file,” public libraries 12 (oct. 1907): 316–17.
performance of ruecking's word-compression method when applied to machine retrieval from a library catalog
ben-ami lipetz, peter stangl, and kathryn f. taylor: research department, yale university library, new haven, connecticut
f. h. ruecking's word-compression algorithm for retrieval of bibliographic data from computer stores was tested for performance in matching user-supplied, unedited bibliographic data to the bibliographic data contained in a library catalog. the algorithm was tested by manual simulation, using data derived from 126 case studies of successful manual searches of the card catalog at sterling memorial library, yale university. the algorithm achieved 70% recall in comparison to conventional searching. its acceptability as a substitute for conventional catalog searching methods is questioned unless recall performance can be improved, either by use of the algorithm alone or in combination with other algorithms.
frederick h. ruecking has published a report (1) of a method for improving bibliographic retrieval from computerized files when searching on unverified input data supplied by requestors. the method involves compression of author-and-title information before comparison. the rules for compression cause certain types of spelling errors and word discrepancies to be ignored by the computer. ruecking reported 90.4% recall and 98.67% accuracy (precision) in a test of his method in which unverified book order requests were matched against a marc i data base that contained 1392 of the references searched. this paper reports on a small-scale manual simulation test undertaken to assess the value of the method when applied to bibliographic retrieval from a library catalog.
the opportunity to test ruecking's method when applied to retrieval from a library catalog was provided by the ready availability of data derived from a current study (2) of catalog use at sterling memorial library (3.5 million books) at yale university. this study collects, from a rigidly randomized sample of catalog users, precise information on the clues available to them at the moment of initiating a search. search clues are recorded exactly as known to the catalog user, employing his own spelling, right or wrong. for each catalog user studied, the outcome of the search is ascertained; complete catalog information is recorded for documents identified as pertinent in successful searches. search clues known to catalog users who seek specific documents correspond to the "unverified input data" which ruecking's method would match against catalog holdings. catalog information on those documents identified as pertinent corresponds to the portion of the data base that ruecking's program seeks to match. it was possible, therefore, to apply ruecking's method by manual simulation, and to test its recall performance in real catalog searches. a test of its precision was not immediately feasible because such a test would require comparison of input data with the entire catalog (or a substantial portion of it). however, the determination of recall performance would at least indicate whether the method shows sufficient promise in catalog searching to warrant evaluation of its precision. an aside on precision is in order, however. it should be noted that precision of retrieval with a given method tends to vary inversely with the size of the file being searched. although ruecking did not specify the number of records included in his marc i data base, it could not have exceeded 48,000.
had he run his test on a data base ten, or fifty, or one hundred times larger, the measured precision would certainly have been much lower than the figure reported. any librarian who is contemplating the adoption of a retrieval technique which has been tested on a data base similar to, but smaller than, his own should realize that precision performance must inevitably drop as the data base is increased. the degree of lowered precision to be expected may be predicted theoretically or estimated from tests on files of several different sizes. the data used in the evaluation of recall performance reported in this paper came from 126 searches in which the catalog users had been successful in locating the specific documents that they were seeking. the compression coding method described by ruecking was applied in each instance to the author-title search clues supplied by the catalog user and to the author-title information available on the catalog card. threshold values were computed for the catalog card data, and retrieval values were computed for the user data. when the retrieval value was at least as large as the threshold value, the document was considered "retrieved." ruecking's method was designed for use with english-language titles only. of the 126 catalog searches in the study sample, 20 involved foreign-language titles. recall was determined on both the full sample and the english-language subset of 106 searches. surprisingly, there is not a great improvement in performance when foreign-language references are excluded. it should be noted that several difficulties were encountered in applying ruecking's method because of ambiguities in the rules stated in his paper. in fact, in his figure 2 (page 236), of the seventeen illustrations of compression-coded data retrieved by his program, at least eight appear to contain departures from the compression-coding rules as stated in the paper.
his table 5 (page 235) is scantily described: "individual code test" and "full-code test" are not defined; neither are column headings. and, contrary to the text (page 234), values in columns five through seven are obtained by adding two to the calculated thresholds in only the top half of table 5; in the bottom half, no such regular correlation exists. in all cases of ambiguity, the alternative was selected that would tend to increase probability of retrieval. for example, ruecking states (page 234) that the search program provided for matching of titles on the basis of rearrangement of title words, and that the threshold value required for retrieval is raised at the same time. raising this value decreases the probability of retrieval, but it is not clear by how much the value is to be raised. for purposes of the test, the threshold value was not raised at all in cases where title words were out of correct sequence, thus retaining maximum probability of retrieval based on the number of matched words alone, regardless of their sequence. results of the test showed that, of the 126 documents in the full sample which were located successfully by manual search in the existing card catalog, only 88 were retrieved by the compression-code method, a recall rate of 70%. considering only the 106 english-language references, 77 were retrieved by the compression-code method, a recall rate of 73%. the premise for the preceding calculation of recall rate should be clearly understood. the test considered real document searches that were concluded successfully in an actual library using a manual catalog; recall is defined here as the proportion of such searches that would be concluded successfully in a hypothetical, computerized library where the only means of searching the catalog would be by ruecking's method.
in a real library with a manual catalog, wanted documents can be located in many ways, not merely through a knowledge of author and title (e.g., through subject entries, series entries, cross references). the test did not disqualify any manual approaches from consideration; it compared the real world with a specific potential alternative. obviously, the use of ruecking's method in combination with other computer programs could result in a recall rate higher than 70% or 73% by the method of calculation employed, and conceivably higher than 100% (because some document searches of manual catalogs that now end in failure might become successful using new search methods). table 1 provides detailed information on the discrepancies between user data and catalog data in the test. with respect to the full sample (126 documents), there were 49 documents for which mismatches of data were observed. of these, the compression-code method was able to "heal" mismatches in 11 instances to cause retrieval; on the other hand, manual searches had achieved retrieval in all 49 instances. with respect to the english-language sample (106 documents), there were 37 documents for which mismatches of data were observed. of these, the compression-code method was able to "heal" mismatches in 8 instances to cause retrieval; on the other hand, manual searches had achieved retrieval in all 37 instances. contrary to expectations, the compression-code method performed somewhat worse, or at least no better, in "healing" actual mismatches in english references (8 out of 37) than it did with foreign-language references (3 out of 12). the higher overall recall percentage with the english-language subset is attributable entirely to the fact that users had complete and correct data more frequently for english references (69 out of 106) than they did for foreign-language references (8 out of 20). thus, regardless of original intent, the method works equally well (or equally poorly, depending on one's viewpoint) on foreign-language and english references. if foreign-language references had been systematically ignored in applying the test to catalog searches, some 16% (20 out of 126) of the searches would have been excluded, with no real gain in performance.
table 1. results of applying ruecking's method in cases where user clues and catalog data did not match completely
columns: full sample (126 documents): retrieved, not retrieved; english subset (106 documents): retrieved, not retrieved
rows (type of mismatch in user data): had neither author nor title; had author's last name, no title; had title, no author; had wrong author; had misspelled author; had wrong words in title; had misspelled words in title; had words transposed in title; had incomplete title (a. first word correct; b. first word incorrect); had entire subtitle, no title; had part of subtitle (a. first word correct; b. first word incorrect)
entries in printed order: 1 4 1 2 2 2 2 1 9 5 2 1 2 1 1 4 2 1 9° 1 6 2 1 2 2 2; a. first word correct 1 1; b. first word incorrect 2 2
total documents***: 11 38 8 29
* 1 case of correct word stems not matched because of wrong endings.
** 2 cases of long or composite titles with maximum threshold values contained in input words but not among the first four significant words.
*** figures shown are lower than totals of figures in columns because some documents had two or more types of mismatch.
the block of interviews from which the searches used in this test were drawn included 10 unsuccessful document searches in addition to the 126 successful searches. one could speculate on whether the compression-code method would have been able to "heal" these failures, resulting in a higher performance rating.
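the counts quoted in the text can be reconciled with a few lines of arithmetic. the python sketch below is only a bookkeeping check, not part of the original study: the variable and function names are illustrative, and it assumes that every search with complete and correct user data was retrieved, which is consistent with the reported totals of 88 and 77. the foreign-language figures are derived by subtraction rather than taken from table 1.

```python
# Counts reported in the text; names are illustrative, not from the paper.
SAMPLES = {
    "full": {"searches": 126, "mismatched": 49, "healed": 11},
    "english": {"searches": 106, "mismatched": 37, "healed": 8},
}


def recall_summary(searches, mismatched, healed):
    """Return (complete_matches, retrieved, recall) for one sample.

    Assumes every search whose clues matched the catalog completely
    was retrieved by the compression-code method.
    """
    complete = searches - mismatched   # user clues agreed with the catalog
    retrieved = complete + healed      # exact matches plus "healed" mismatches
    return complete, retrieved, retrieved / searches


def foreign_subset(full, english):
    """Derive foreign-language counts as full sample minus English subset."""
    return {key: full[key] - english[key] for key in full}


summary = {name: recall_summary(**counts) for name, counts in SAMPLES.items()}
foreign = foreign_subset(SAMPLES["full"], SAMPLES["english"])
summary["foreign"] = recall_summary(**foreign)

for name, (complete, retrieved, recall) in summary.items():
    print(f"{name}: {complete} complete, {retrieved} retrieved, recall {recall:.0%}")
```

under that assumption the derived foreign-language counts (20 searches, 12 mismatches, 3 healed, 8 complete) agree with the figures given in the prose, and the full and english recall rates round to the reported 70% and 73%.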
the indications are, however, that the chances of such healing are close to zero. in a majority of these unsuccessful searches, the available data were incomplete or were not of the type that the method is intended to utilize. in the few remaining cases, it is very likely that the searches were unsuccessful simply because the desired documents were not in the library collection. recall performance as measured by the test could have been improved by modifying ruecking's rules to some extent. for example, five more titles would have been retrieved had the assigned retrieval value been increased by two units in cases where the first title word matched correctly; this would have increased overall recall performance from 70% to 74%. a further increase to 76% would have resulted from matching the user's version of the title with the catalog's subtitle, or with portions of titles which follow a punctuation mark (in addition to matching with the actual title in the catalog). extension of the compression code to include publisher and date as well as author and title would do little or nothing to improve the performance of this method. the test data, although admittedly a small sample, indicate that users who do not have accurate author and title information when they begin a search very rarely have accurate information on any other descriptive data element. it is, of course, a matter for individual judgment as to whether the performance of the compression-code method, as indicated by the test reported here, is sufficiently good to make it attractive for use in some computerized alternative to the manual library catalog. in the authors' opinion, ruecking's method does not in itself supply an adequate solution to the problem of searching a computerized catalog. however, further investigation seems warranted along two lines. first, the method might be modified to give better performance in this application.
second, it might be used in combination with some other computer methods to give searching performance approaching that which is attained today by the manual searching of card catalogs.
acknowledgment
the work reported in this paper was supported in part by a grant from the u.s. office of education, oeg-7-071140-4427.
references
1. ruecking, frederick h., jr.: "bibliographic retrieval from bibliographic input; the hypothesis and construction of a test," journal of library automation, 1 (december 1968), 227-38.
2. lipetz, ben-ami; stangl, peter: "user clues in initiating searches in a large library catalog," in american society for information science, proceedings, 5. annual meeting, october 20-24, 1968, columbus, ohio, p. 137-139.
book reviews
conceptual design of an automated national library system, by norman r. meise. metuchen, n.j.: scarecrow press, 1969. 234 pp. $5.00.
this is a very confusing book. and it is too bad, because this reviewer kept feeling that the author, norman meise, had something to present. the trouble is that he does not communicate. this, i think, is the result of two things. first, the book reflects the naivete of engineers when they come to deal with what are basically social systems like libraries. this does not mean it can't be done, but such a task needs clarity and purpose, which this book does not have. the second springs from this failure. the masses of data, assumptions, and commentary in the book are poorly organized and interrelated. it is not enough to write strings of words; those strings must communicate and relate backward and forward in the text. although never explicitly stated, the book evidently grew out of a study performed by the united aircraft corporate systems center in 1965-66 for the development and implementation of a connecticut library research center (see eric document ed 0221512). the latest reference in the book is 1966. in a field, i.e.
library networks, where a fair amount of work and discussion has taken place in the last three years (e.g. the edunet conference in 1966), a book like this quickly loses its impact. the purpose of the book, according to the author, is "to show the feasibility of a system concept rather than provide a detailed engineering design." the system is "an automated national library system" using the state of connecticut as a model. the author then adds (spoiling the whole introduction): "if these functions (bibliographic searching, acquisition, cataloging, circulation) can be economically automated, the major problems associated with our information explosion will be solved." as anatole france once said: "it is in the ability to deceive oneself that the greatest talent is shown."
classical musicians v. copyright bots: how libraries can aid in the fight
adam eric berkowitz
information technology and libraries | june 2022 https://doi.org/10.6017/ital.v41i2.14027
adam eric berkowitz (berkowitza@hcflgov.net) is supervisory librarian, tampa-hillsborough county public library. © 2022.
abstract
the covid-19 pandemic forced classical musicians to cancel in-person recitals and concerts and led to the exploration of virtual alternatives for engaging audiences. the apparent solution was to livestream and upload performances to social media websites for audiences to view, leading to income and a sustained social media presence; however, automated copyright enforcement systems add new layers of complexity because of an inability to differentiate between copyrighted content and original renditions of works from the public domain. this article summarizes the conflict automated copyright enforcement systems pose to classical musicians and suggests how libraries may employ mitigation tactics to reduce the negative impacts when uploaders are accused of copyright infringement.
introduction
the covid-19 pandemic, unlike anything the country has seen in a century, forced industries to reevaluate the manner in which they provide services to the public. businesses and citizens everywhere made hairpin turns as they quickly searched for virtual alternatives to everyday in-person activities. with many remaining home for extended periods of time, demand for digital content and entertainment skyrocketed. in may 2020, comcast reported a 40% increase in online video streaming since march 1, just weeks before governments instated stay-at-home mandates.1 throughout the year, subscription-based streaming services saw enormous surges in customer usage and, likewise, social media platforms saw a significant spike in content production and consumption.2 daily blogging on facebook replaced in-person interactions, and youtubers generated higher volumes of videos to meet viewer demand. classical musicians were also heavily reliant on social media platforms to showcase performances, as pointed out in the washington post article “copyright bots and classical musicians are fighting online. the bots are winning.” highlighted by the american library association’s american libraries, the article illustrated the toll social media content moderation algorithms took on classical musicians sharing their performances online.3 this article became the starting point for the 2021 study “are youtube and facebook canceling classical musicians?,” which investigated the relationship between classical musicians and automated copyright enforcement systems.4 what follows summarizes this study’s findings and draws attention to the role libraries can play in aiding classical musicians facing copyright infringement claims.
automated copyright enforcement
evidence shows that automated copyright enforcement systems wrongfully remove user-uploaded materials in the name of copyright protections on a regular basis.5 in fact, it happens so often that the australian broadcasting corporation began wittily dubbing such instances “copywrongs.”6 these algorithms are not designed to distinguish between recordings of music owned by record labels and those shared online by freelance musicians. they are instructed to recognize copyrighted recordings and content resembling those recordings as identical matches, ensuring the protection of intellectual property from unauthorized reproduction. as such, automated content moderation systems are incapable of making allowances for the performance of works from the public domain. such performances comprise nearly all of a classical musician’s repertoire. automated copyright enforcement systems are typically based on a combination of matching and classification methods. the most effective matching technique for content moderation is perceptual hashing, which isolates unique strings of data (hashes) taken from an uploaded file and compares distinguishing markers and patterns to a database of samples provided by copyright owners.7 this technique allows systems to detect exact matches and iterations of the original work, such as live recordings and remixes.8 among classification methods, artificial neural networks with deep learning are best suited to the task of algorithmic moderation.
consisting of a network of nodes, they are meant to simulate the structure and function of neural networks in animals and humans.9 this enables them to solve multifaceted, dynamic problems, which makes them ideal for instantaneous content moderation, allowing them to identify musical similarities in real time.10 both youtube and facebook enable users to upload recordings and broadcast live feeds to their websites. matching techniques are used to review prerecorded content since the upload process allows for automated systems to sample the material for comparison to the companies’ hash databases before allowing the recording to be posted.11 in contrast, live broadcasts are transmitted instantaneously and allow for no time to review the footage before it is visible online. therefore, hashes cannot be sampled from streaming content, requiring that classification methods using training data identify infringing material on the fly.12 while these algorithms make content moderation easier, they are limited in their capacity. one study showed that youtube is surprisingly inaccurate in its attempts to recognize infringing material in live broadcasts, failing to identify 26% of copyrighted footage within the first thirty minutes of streaming and blocking 22% of non-infringing livestreams.13 research strongly suggests that the only factors considered by music copyright enforcement systems are pitch, volume, and melodic and harmonic contour.14 those values alone cannot be used to distinguish copyrighted works from the public domain. as such, these systems are not yet advanced enough to account for the total complexity of human creativity, and human intervention is required before these programs systematically accuse uploaders of copyright infringement.15 compositions in the public domain are not subject to copyright; however, recorded performances of compositions from the public domain can be copyrighted. 
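the matching step described earlier can be illustrated with a deliberately simplified python sketch. it is a toy, not the method used by youtube or facebook: the function names, the 16-bit hash length, and the distance threshold are all assumptions made for illustration, and real systems hash spectral features of audio rather than a plain list of numbers. the sketch reduces a signal to a coarse contour hash and treats two signals as a match when their hashes differ in only a few bits.

```python
def average_hash(samples, bits=16):
    """Toy perceptual hash: one bit per chunk of the signal.

    A bit is 1 when the chunk's mean is above the mean of all chunks,
    so the hash records the coarse contour, not the exact values.
    """
    chunk = len(samples) // bits
    means = [sum(samples[i * chunk:(i + 1) * chunk]) / chunk for i in range(bits)]
    overall = sum(means) / bits
    return [1 if m > overall else 0 for m in means]


def hamming_distance(a, b):
    """Number of positions at which two equal-length hashes differ."""
    return sum(x != y for x, y in zip(a, b))


def is_match(upload, reference, max_distance=2):
    """Flag an upload whose hash is within max_distance bits of the reference."""
    return hamming_distance(average_hash(upload), average_hash(reference)) <= max_distance


# Hypothetical data: a reference recording and a louder rendition of the
# same contour, plus an unrelated signal.
levels = [1, 9, 2, 8, 3, 7, 1, 9, 2, 8, 3, 7, 1, 9, 2, 8]
reference = [level for level in levels for _ in range(8)]
rendition = [1.5 * x + 0.5 for x in reference]
unrelated = [9] * 64 + [1] * 64

print(is_match(rendition, reference))   # same contour at a different volume
print(is_match(unrelated, reference))   # different contour
```

because the hash keeps only the contour, the louder rendition is indistinguishable from the reference. that is the behavior the article describes: a musician's own performance of a public-domain work shares its melodic contour with any copyrighted recording of the same piece.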
individuals may upload or livestream their own performances of classical music without fear of infringing copyright but may not upload another musician’s copyrighted recordings of the same pieces. for example, no one owns the copyright to bach’s cello suites and, therefore, anyone can profit from performing these works. sony music, though, owns the copyright to yo-yo ma’s recordings of bach’s cello suites, and anyone uploading these specific recordings to social media would be infringing copyright and subject to the repercussions. unfortunately, automated copyright enforcement systems often misidentify an individual’s performances as copyrighted recordings.
the impact on classical musicians
classical musicians are accustomed to having their content misidentified for infringing copyright, but with the pandemic forcing many more musicians to share performances regularly on social media, the problem has become ever more pervasive. adrian spence, the artistic director for chamber ensemble camerata pacifica, found himself appealing multiple copyright claims from both facebook and youtube. on occasion, he would dispute several claims issued by different copyright owners for the same recording. until these issues were resolved, facebook suspended camerata pacifica’s ability to livestream, and youtube displayed a notification on their channel informing viewers that their videos were likely to be removed due to anticipated copyright infringement.16 owen espinosa, a high school senior, was preparing for a piano recital, and during rehearsal, facebook ended his livestream over claims of copyright infringement. he was unable to successfully appeal the claim which meant that facebook would not host his performance.
instead, he had to broadcast his recital on an acquaintance’s youtube channel.17 michael sheppard, a professional pianist, has had broadcasts interrupted and videos removed by facebook multiple times with notifications stating that music owned by naxos of america was detected in his performances.18 after facebook rejected his disputes, sheppard took to twitter, alerting naxos of his situation. his videos were eventually restored, but nothing could be done about his livestreams.19 the violinist.com broadcasts weekly, hour-long concerts featuring multiple guest musicians. during one of these performances, facebook muted child violinist yugo maeda due to a claim of copyright infringement. after appealing the notice, facebook unmuted maeda’s performance three days later.20 while covid-19 exacerbated the issue, classical musicians often had their performances interrupted or removed from social media. in 2019, conducting students at the university of british columbia had their facebook live feed interrupted over copyright infringement claims and, in 2018, facebook removed a recording of an in-home performance given by pianist james rhodes, also stating that the music infringed copyright.21 also in 2018, the australian broadcasting corporation’s abc classic fm livestreamed a performance of beethoven’s symphony no. 9. the broadcast ended with facebook issuing a claim stating that the music in question was owned by two different copyright owners.22 in 2016, violinist claudia schaer disputed several of youtube’s copyright claims. she typically had success with these appeals, but one of her recordings received three claims from different copyright owners. she was able to refute two of them; however, the third remained, and she was warned that if she was unsuccessful in her second attempt at appealing the claim, her account would receive a copyright strike, deleting her video from the site permanently.
she felt both intimidated and aggravated by the ordeal.23 the author of this article has also had to refute a copyright infringement claim on youtube. according to the notice, 51 seconds of the author’s approximately five-minute performance of beethoven’s “für elise” infringed copyright. as a result, the claimant authorized youtube to include ads in the video, allowing them to generate revenue. the dispute was upheld after the claimant’s 30-day window for a response expired. although the author does not rely on monetized videos and livestreams for income, it is unethical for another entity to profit from the work of an unaffiliated individual.
disputing a copyright claim
while there is recourse for uploaders facing copyright claims from social media sites, the appeals process can be lengthy and overwhelming. it can take more than two months for youtube to render a verdict when a musician disputes a copyright notice. during this span of time, classical musicians depending on ad revenue cease to generate income as these funds are held by the company until a final decision is made, at which point all profits accumulated by the video are released to the appropriate party. if the claim is upheld, the recording may remain online with proceeds going to the supposed copyright owner.24 uploaders may attempt to refute the result, but a failed appeal leads to the video’s removal and a copyright strike levied against the uploader, preventing them from livestreaming and monetizing videos for three months. should this occur, a counter notification can be issued which insists that the content in question has been mischaracterized as infringing and requires that would-be copyright owners file a lawsuit to uphold the claim.
after three strikes, accounts are permanently deleted along with all associated uploads.25 the time that elapses for a final verdict along with the suspension of uploading and livestreaming permissions due to a copyright strike amounts to more than five months without being able to sustain an income. when a single performance is charged with multiple claims from different entities, as in the aforementioned examples, the uploader must dispute each one individually. this makes it easy to accumulate copyright strikes, risking account termination. it would be reasonable to assume that many classical musicians who endure these circumstances avoid the dispute process for fear of youtube removing their recordings, enforcing limitations on their ability to broadcast and monetize videos, and even permanently deleting their accounts. meanwhile, mistakenly recognized copyright owners can leverage this by appropriating the earnings generated by the work of unaffiliated musicians. furthermore, should the matter be redirected to the courts, the uploader faces the burden of retaining legal counsel. youtube algorithms deal with approximately 98% of all copyright issues and, because youtube’s business model generates profits primarily via user-uploaded content, it has been found to show bias towards established copyright owners.26 copyright owners can set preferences for how they want the system to react to instances of copyright infringement, resulting in the automatic monetization of 95% of claims for the copyright owner. 
as a result, user uploads make up 50% of the revenue generated by youtube for the music industry.27 although google reported in 2018 that 60% of disputed claims were found in favor of accused uploaders, the system clearly benefits established copyright owners.28 all of the aforementioned musicians who were accused of copyright infringement had their livestreams interrupted, saw their videos removed, and witnessed companies profiting from their work performing music that has long since passed into the public domain. youtube’s video series copyright and content id on youtube attempts to educate users on how automated copyright enforcement and the dispute process work, and while fair use and copyright permissions are discussed, the public domain is never mentioned, although youtube does offer a brief explanation of the public domain on its help site.29 according to the us copyright act, the duration of copyright extends to 70 years after the death of the known composer, and for uncredited compositions or those composed by a musician under a pseudonym, copyright is recognized for 95 years from the date the work was published or 120 years from when it was composed, depending on which expires first.30 while record labels are fully within their rights to protect the recordings they own, that should have no bearing on individual musicians performing pre-twentieth-century music.
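the duration rules summarized above reduce to simple arithmetic, and a minimal sketch may make them easier to apply in a library class. the function name and parameters below are hypothetical, and the code encodes only the summary given in this article: it ignores details such as works published under older statutes, so it is an illustration and not legal guidance.

```python
from typing import Optional


def copyright_expiry_year(
    death_year: Optional[int] = None,
    publication_year: Optional[int] = None,
    creation_year: Optional[int] = None,
) -> int:
    """Return the year in which the term described in the text runs out.

    Simplified reading of the rule as summarized in the article:
    - known composer: 70 years after the composer's death
    - uncredited or pseudonymous work: 95 years from publication or
      120 years from creation, whichever expires first
    """
    if death_year is not None:
        return death_year + 70

    candidates = []
    if publication_year is not None:
        candidates.append(publication_year + 95)
    if creation_year is not None:
        candidates.append(creation_year + 120)
    if not candidates:
        raise ValueError("need a death, publication, or creation year")
    # "whichever expires first" is the earlier of the two dates
    return min(candidates)


# beethoven died in 1827, so under this simplified rule the term ends in 1897
beethoven_expiry = copyright_expiry_year(death_year=1827)
```

under these assumptions, any composer who died more than 70 years ago yields an expiry year in the past, which is the point the article makes about pre-twentieth-century repertoire.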
the majority of online music consumption occurs on social media sites with 47% of the market share going to youtube.31 reports from deezer showed a near 20% increase in users listening to classical music since the start of the pandemic.32 given that more users are gravitating towards listening to classical music, and that the most popular digital access point for music is youtube, classical musicians coping with pandemic-induced restrictions were presented with what should have proven to be a lucrative opportunity. adhering to social distancing requirements and stay-at-home mandates meant musicians cancelled their performances, leading to an exploration of virtual alternatives such as uploading recordings and livestreaming. obstructing these activities interrupts their sole source of income.

conclusion

while researchers have suggested a handful of improvements for automated copyright enforcement systems, they have not addressed the role that libraries can play in assisting classical musicians.33 the tampa-hillsborough county public library, prior to the spread of covid-19, maintained four branches outfitted with recording studios; today, that number has grown to five. prior to pandemic library closures, recording studios were reserved just over 800 times, amounting to about 1,600 hours of usage between january 1, 2019 and march 13, 2020. patrons using the recording studios produce music and videos with the intention of uploading them to social media. other libraries with recording studios likely see their patrons doing the same, but without knowledge of copyright. libraries have the means and the motive to assist classical musicians. libraries can hold classes covering the basics of copyright, fair use, and the public domain, or that expand upon how automated copyright enforcement systems work on social media. library staff, however, may feel overwhelmed by the numerous texts on these subjects and may not know where to begin.
an excellent starting point is the frequently asked questions page on the us copyright office website. this webpage offers explanations for a broad array of copyright-related issues and questions.34 fair use allows for unauthorized borrowing from a creative work; however, navigating how fair use is determined is always challenging. steven m. davis’ “computerized takedowns: a balanced approach to protect fair uses and the rights of copyright owners” is a reliable point of reference for defining fair use, its application in copyright infringement cases, and ethical and legal implications regarding the limitations of algorithmic moderation systems.35 for a thorough look into the mechanics and applications of automated copyright enforcement, refer to the previously mentioned “are youtube and facebook cancelling classical musicians?” this article offers a synopsis of the shift from physical to digital media, descriptions of different algorithmic models developed specifically for copyright enforcement, and an account of how youtube’s and facebook’s copyright enforcement systems came to be.36

libraries can also offer help sessions that support patrons through the copyright claims dispute process. the youtube dispute interface is user friendly, and the instructions are comprehensible. throughout each step, explanations are offered to clarify what is being required of the user. for example, when asked for the reasoning behind the dispute, the user is offered four options: the disputed material is original content, the user has acquired permission to reproduce the content, the content falls under fair use, or the content originates from the public domain. once selected, additional explanations for each option are given in order to provide further clarification and context, which allows the user to reconsider their choice and also helps the user better explain how their content falls under the selected category. finally, the user is asked to provide a narrative explaining how the content in question does not infringe copyright. facebook’s counternotification process is less generous, providing brief, ineffectual descriptions of copyright and a simple form requesting the user’s personal information and explanation for why the copyright infringement claim is unfounded. after library staff demonstrate the use of these interfaces, patrons can be guided to library resources to help them articulate and refine their arguments. for anything that cannot be found among the library’s collections, library staff may need to assist with internet searches, or patrons may request materials through interlibrary loan. additionally, patrons may still feel overwhelmed by the terminology being presented, which would further support the need for library programming that covers copyright-related topics. when considering the research involved to produce a convincing counterargument, information literacy and metaliteracy classes may be warranted.

libraries can also encourage patrons to include descriptions in their uploads and livestreams with links to supporting evidence explaining that the featured music belongs to the public domain, and that, as the uploader, they own the rights to recordings and broadcasts of their own performances. the public domain description on youtube’s help page provides links to columbia university libraries’ copyright advisory service and cornell university’s copyright information center, and it suggests that these resources can lead to supporting evidence regarding works in the public domain.37 another excellent resource is the international music score library project’s petrucci music library.
this database of almost 200,000 compositions belonging to the public domain features both sheet music and recordings of each of these works.38 users can also point to the public domain song anthology, a book comprising 348 popular songs from the public domain; the entire text can be downloaded from the publisher’s website.39 these resources and explanations can be included in disputes to support the reasoning for why a copyright claim is invalid.

it should be noted that library employees are most often not lawyers, and as such, it is ill-advised to answer direct questions about the specific legality of the myriad of situations musicians face when disputing copyright claims. these matters require expert, specialist knowledge with which library staff are not equipped. the role of the library should only be to provide access to resources and inform the public on various issues regarding the use of information. as information specialists, librarians are in a unique position to educate patrons on information policy, and in this case, copyright. library systems with law libraries or with access to law collections and databases would be especially suited to teach patrons about copyright, guide them through the dispute process, and assist them with gathering resources to support their counterarguments. the tampa-hillsborough county public library and other systems like it that are outfitted with both music recording studios and a law library are encouraged to offer such services.

hopefully, this overview of automated copyright enforcement, its impacts on classical musicians, and the suggestions to libraries offered here will promote further conversation that eventually leads to action and a possible solution. perhaps, as progress is made, automated copyright enforcement systems will grow more hospitable towards user-generated recordings and livestreams of classical music. after all, social media should be able to freely host the artistic talents of all musicians.
endnotes

1 “covid-19 network update,” comcast, may 20, 2020, https://corporate.comcast.com/covid-19/network/may-20-2020.
2 julia alexander, “the entire world is streaming more than ever—and it’s straining the internet,” the verge, march 27, 2020, https://www.theverge.com/2020/3/27/21195358/streaming-netflix-disney-hbo-now-youtube-twitch-amazon-prime-video-coronavirus-broadband-network; ella koeze and nathaniel popper, “the virus changed the way we internet,” the new york times, april 7, 2020, https://www.nytimes.com/interactive/2020/04/07/technology/coronavirus-internet-use.html.
3 michael andor brodeur, “copyright bots and classical musicians are fighting online. the bots are winning,” the washington post, may 21, 2020, https://www.washingtonpost.com/entertainment/music/copyright-bots-and-classical-musicians-are-fighting-online-the-bots-are-winning/2020/05/20/a11e349c-98ae-11ea-89fd-28fb313d1886_story.html.
4 adam eric berkowitz, “are youtube and facebook cancelling classical musicians? the harmful effects of automated copyright enforcement on social media platforms,” notes 78, no. 2 (december 2021): 177–202.
5 rebecca tushnet, “all of this has happened before and all of this will happen again: innovation in copyright licensing,” berkeley technology law journal 29, no. 3 (december 2014): 1147–87.
6 matthew lorenzon, “why is facebook muting classical music videos?” abc classic fm, december 21, 2018, https://www.abc.net.au/classic/read-and-watch/music-reads/facebook-copyright/10633928.
7 xia-mu niu and yu-hua jiao, “an overview of perceptual hashing,” acta electronica sinica 36, no. 7 (2008): 1405–11.
8 robert gorwa, reuben binns, and christian katzenbach, “algorithmic content moderation: technical and political challenges in the automation of platform governance,” big data & society 7, no. 1 (january 2020): 7.
9 larry hardesty, “explained: neural networks,” mit news, april 14, 2017, https://news.mit.edu/2017/explained-neural-networks-deep-learning-0414.
10 daniel graupe, principles of artificial neural networks, 3rd ed. (hackensack, nj: world scientific publishing company, 2013), 1–3.
11 gorwa, binns, and katzenbach, “algorithmic content moderation,” 7.
12 daniel (yue) zhang, jose badilla, herman tong, and dong wang, “an end-to-end scalable copyright detection system for online video sharing platforms,” in proceedings of the 2018 ieee/acm international conference on advances in social networks analysis and mining (barcelona, spain: ieee press, 2018), 626–27.
13 daniel (yue) zhang et al., “crowdsourcing-based copyright infringement detection in live video streams,” in proceedings of the 2018 ieee/acm international conference on advances in social networks analysis and mining (barcelona, spain: ieee press, 2018), 367.
14 berkowitz, “are youtube and facebook cancelling classical musicians?,” 200.
15 diego cerna aragon, “behind the screen: content moderation in the shadows of social media,” critical studies in media communication 37, no. 5 (october 19, 2020): 512–14.
16 brodeur, “copyright bots and classical musicians are fighting online.”
17 amy williams, “camerata pacifica to stream high school graduate’s senior recital,” classical candor: classical music news and reviews (blog), june 6, 2020, https://classicalcandor.blogspot.com/2020/06/classical-music-news-of-week-june-6-2020.html.
18 baltimore school for the arts, “sometimes you have to fight!,” facebook, may 22, 2020, https://www.facebook.com/baltimoreschoolforthearts/posts/sometimes-you-have-to-fight-our-michael-sheppard-was-recently-giving-a-facebook-/3146142648740808/.
19 michael sheppard (@pianistcomposer), “dear @naxosrecords please stop muting portions of works whose composers have been dead for hundreds of years.” twitter, may 9, 2020, https://twitter.com/pianistcomposer/status/1259118489622777856.
20 laurie niles, “facebook and naxos censor music student playing bach,” violinist.com (blog), july 13, 2020, https://www.violinist.com/blog/laurie/20207/28375/.
21 brodeur, “copyright bots and classical musicians are fighting online”; ian morris, “facebook blocks musician from uploading his own performance—but did he break copyright?” daily mirror, september 7, 2018, https://www.mirror.co.uk/tech/facebook-blocks-musician-uploading-performance-13208194.
22 matthew lorenzon, “why is facebook muting classical music videos?” abc classic fm, december 21, 2018, https://www.abc.net.au/classic/read-and-watch/music-reads/facebook-copyright/10633928.
23 claudia schaer, “youtube copyright issues,” violinist.com (blog), february 15, 2016, https://www.violinist.com/discussion/archive/27589/.
24 “monetization during content id disputes,” youtube help, accessed october 24, 2019, https://support.google.com/youtube/answer/7000961?hl=en&ref_topic=9282678#zippy=,filing-a-content-id-dispute,more-info-about-the-content-id-dispute-process,filing-a-content-id-appeal,more-info-about-the-content-id-appeal-process.
25 “copyright strike basics,” youtube help, accessed october 24, 2019, https://support.google.com/youtube/answer/2814000#zippy=,what-happens-when-you-get-a-copyright-strike,resolve-a-copyright-strike.
26 google, how google fights piracy (november 2018), 14, https://www.blog.google/documents/27/how_google_fights_piracy_2018.pdf; joanne e. gray and nicolas p. suzor, “playing with machines: using machine learning to understand automated copyright enforcement at scale,” big data & society 7, no. 1 (april 2020): 1–15.
27 karl borgsmiller, “youtube vs. the music industry: are online service providers doing enough to prevent piracy?” southern illinois university law journal 43, no. 3 (spring 2019): 660.
28 google, how google fights piracy, 28–31.
29 youtube creators, copyright and content id on youtube, october 12, 2020, accessed december 11, 2021, https://www.youtube.com/playlist?list=plpjk416fmkwrnrbv72kshryeknnsaafkd; “frequently asked copyright questions,” youtube help, accessed october 24, 2019, https://support.google.com/youtube/answer/2797449#c-pd&zippy=,what-is-the-public-domain.
30 “how long does copyright protection last?” copyright.gov, us copyright office, https://www.copyright.gov/faq/faq-duration.html.
31 adam j. reis and manon l. burns, “who owns that tune? issues faced by music creators in today’s content-based industry,” landslide 12, no. 3 (january & february 2020): 13–16.
32 maddy shaw roberts, “research shows huge surge in millennials and gen zers streaming classical music,” classic fm, august 19, 2020, https://www.classicfm.com/music-news/surge-millennial-gen-z-streaming-classical-music/.
33 berkowitz, “are youtube and facebook cancelling classical musicians?,” 199–201.
34 “frequently asked questions,” copyright.gov, us copyright office, https://www.copyright.gov/help/faq.
35 steven m. davis, “computerized takedowns: a balanced approach to protect fair uses and the rights of copyright owners,” roger williams university law review 23, no. 1 (winter 2018): 1–24.
36 berkowitz, “are youtube and facebook cancelling classical musicians?,” 177–202.
37 “frequently asked copyright questions,” youtube help.
38 “main page,” (website), imslp: petrucci music library, accessed december 12, 2021, https://imslp.org/wiki/main_page.
39 david berger and chuck israels, the public domain song anthology: with modern and traditional harmonization (charlottesville: aperio, 2020), https://aperio.press/site/books/m/10.32881/book2/.
learning to share: measuring use of a digitized collection on flickr and in the ir

melanie schlosser and brian stamper

information technology and libraries | september 2012 85

abstract

there is very little public data on usage of digitized library collections. new methods for promoting and sharing digitized collections are created all the time, but very little investigation has been done on the effect of those efforts on usage of the collections on library websites. this study attempts to measure the effects of reposting a collection on flickr on use of the collection in a library-run institutional repository (ir). the results are inconclusive, but the paper provides background on the topic and guidance for future efforts.

introduction

inspired by the need to provide relevant resources and make wise use of limited budgets, many libraries measure the use of their collections.
from circulation counts and in-library use studies of print materials, to increasingly sophisticated analyses of usage of licensed digital resources, the techniques have changed even as the need for the data has grown. new technologies have simultaneously presented challenges to measuring use, and allowed those measurements to become more accurate and more relevant. in spite of the relative newness of the digital era, “librarians already know considerably more about digital library use than they did about traditional library use in the print environment.”1 arl’s libqual+,2 one of the most widely adopted tools for measuring users’ perceptions of service quality, has recently been joined by digiqual and mines for libraries. these new statsqual tools3 extend the familiar libqual focus on users into the digital environment. there are tools and studies for seemingly every type of licensed digital content, all with an eye toward better understanding their users and making better-informed collection management decisions.

those same tools and studies for measuring use of library-created digital collections are conspicuous in their absence. almost two decades into library collection digitization programs, there is not a significant body of literature on measuring use of digitized collections. a number of articles have been written about measuring usage of library websites in general; arendt and wagner4 is a recent example. in one of the few studies to specifically measure use of a digitized collection, herold5 uses google analytics to uncover the geographical location of users of a digitized archival image collection. otherwise, a literature search on usage studies uncovers very little. less formal communication channels are similarly quiet, and public usage data on digitized collections on library sites is virtually nonexistent. commercial sites for disseminating and sharing digital media frequently display simple use metrics (image views, for example, or file downloads) alongside content; such features do not appear on digitized collections on library sites.

melanie schlosser (schlosser.40@osu.edu) is digital publishing librarian and brian stamper (stamper.10@osu.edu) is administrative associate, the ohio state university libraries, columbus, ohio.

usage and digitization projects

digitized library collections are created with an eye toward use from their early planning stages. an influential early clir publication on selecting collections for digitization written by a harvard task force6 included current and potential use of the analog and digitized collection as a criterion for selection. the factors to be considered include the quantitative (“how much is the collection used?”) and the qualitative (“what is the nature of the use?”). more than ten years later, ooghe and moreels7 find that use is still a criterion for selection of collections to digitize, tied closely to the value of the collection. facilitating discovery and use of the digitized collection is a major consideration during project development. payette and rieger8 is an early example of a study of the needs of users in digital library design. usability testing of the interface is frequently a component of site design; see jeng9 for a good overview of usability testing in the digital library environment. increasing usage of the digitized collection is also a major theme in metadata research and development.
standards such as the open archives initiative’s protocol for metadata harvesting10 and object reuse and exchange11 are meant to allow discovery and reuse of objects in a variety of environments, and the linked data movement promises to make library data even more relevant and reusable in the world wide web environment.12

digital collection managers have also found more radical methods of increasing usage of their collections. inserting references into relevant wikipedia articles has become a popular way to drive more users to the library’s site.13 some librarians have taken the idea a step further and have begun reposting their digital content on third-party sites. the smithsonian pioneered one reposting strategy in 2008 when they partnered with flickr, the popular photo-sharing site, to launch flickr commons.14 the commons is a walled garden within flickr that contains copyright-free images held by cultural heritage institutions such as libraries, archives, and museums. each partner institution has its own branded space (“photostream” in flickr parlance) organized into collections and sets. this model aggregates content from different organizations and locates it where users already are, but it still maintains the traditional institution/collection structure. flickr commons has been, by all measures, a very successful experiment in sharing collections with users. the smithsonian,15 the library of congress,16 the alcuin society,17 and the london school of economics18 have all written about their experiences with the commons. stephens19 and michel and tzoc20 give advice on how libraries can work with flickr, and garvin21 and vaughan22 take a broad view of the project and the partners. another sharing strategy is beginning to emerge, where digital collection curators contribute individual or small groups of images to thematic websites.
a recent example is pets in collections,23 a whimsical tumblr photo blog created by the digital collections librarian at bryn mawr college. the site’s description states, “come on, if you work in a library, archive, or museum, you know you’ve seen at least one of these: a seemingly random image of that important person and his dog or a man and a monkey wearing overalls … so now you finally have a place to share them with the world!” the site requires submissions to include only the image and a link back to the institution or repository that houses it, although submitters may include more information if they choose. although more lighthearted than most traditional library image collections, it still performs the desired function of introducing users to digital collections they may never have encountered otherwise.

clearly, these creative and thoughtful strategies are not dreamed up by digital librarians unconcerned with end use of their collections, so why do stewards of digitized collections so rarely collect, or at least publicly discuss, statistics on the use of their content? the one notable exception to this may shed some light on the matter. institutional repositories (irs) have been the one area of non-licensed digital library content where usage statistics are frequently collected and publicized. dspace,24 the widely adopted ir platform developed by mit and hewlett-packard, has increasingly sophisticated tools for tracking and sharing use of the content it hosts. digital commons,25 the hosted ir solution created by bepress, provides automated monthly download reports for scholars who use it to archive their content. the development of these features has been driven by the need to communicate value to faculty and administrators.
encouraging participation by faculty has been a major focus of ir managers since the initial ‘build it and they will come’ optimism faded and the challenge of adding another task to already busy faculty schedules became clear.26 having a clear need (outreach) and a defined audience (participating scholars) has led to a thriving program of usage tracking in the ir community. the lack of an obvious constituency and the absence of pointed questions about use in the digitized collections world have, one suspects, led to the current dearth of measurement tools and initiatives. still, questions about use do arise, particularly when libraries undertake labor-intensive usability studies or venture into the somewhat controversial landscape of sharing library-created digital objects on third party sites.27 anecdotally, the thought of sharing library content elsewhere on the web also raises concerns about loss of context and control, as well as a fear of ‘dilution’ of the library’s web presence. “if patrons can use the library’s collections on other sites,” a fellow librarian once exclaimed, “they won’t come to the library’s website anymore!” without usage data, we cannot adequately answer questions about the value of our projects or the way they impact other library services.

justification for study and research questions

there were three major motivations for this project. first, inspired by the success of the flickr commons project, we wanted to explore a method for sharing our collections more widely. an image collection and a third-party image-sharing platform were an obvious choice, since image display is not a strength of our dspace-based repository. flickr is currently a major presence in the image sharing landscape, and the existence of the commons was an added incentive for choosing flickr as our platform.
second, the collection we selected for the project (described more fully below) is not fully described, and we wanted to take advantage of flickr’s annotation tools to allow user-generated metadata. since further description of the images would have required an unusual depth of expertise, we were not optimistic that we would receive much useful data, and in fact we did not. still, we lost nothing by asking, and gained familiarity with flickr’s capabilities for metadata capture. the final motivation for the project, and the focus of the study, was the desire to investigate the effect of third-party platform sharing of a local collection on usage of that collection on library sites. the data gathered were meant partly to inform our local practice, but also to address a concern that may hold librarians back from exploring such means of increasing collection usage: the fear that doing so will divert traffic from library sites. we suspected that sharing collections more widely would actually increase usage of the items on library-owned sites, and the study was developed to explore the issue in a rigorous way. the research question for this study was: does reposting digitized images from a library site to a third-party image sharing site have an effect on usage of the images on the library site?

about the study

platforms

for the study, the images were submitted to two different platforms: the knowledge bank (kb),28 a library-managed repository, and flickr, a commercial image sharing site. the kb is an institutional repository built on dspace software with a manakin (xml-based) user interface. established in 2005, it holds more than 45,000 items, including faculty and student research, gray literature, institutional records, and digitized library collections. image collections like the one used in this study make up a small percentage of the items in the repository.
in the kb’s organizational structure, the images in the study were submitted as a collection in the library’s community, under a sub-community for the special collection that contributed them. each image was submitted as an item consisting of one image file and dublin core metadata.29 the project originally called for submitting the images to flickr commons, but the commons was not accepting new partners during the study period. instead, we created a standard flickr pro account for the libraries, while following the commons guidelines in image rights and settings. in contrast to dspace’s community/sub-community/collection structure, flickr images are organized in sets, sets belong to collections, and all images make up the account owner’s photostream. a set was created for the images, with accompanying text giving background information and inviting users to contribute to the description of the images.30 the images were accompanied by the same metadata as the items in the kb, but the files themselves were higher resolution, to take advantage of flickr’s ability to display a range of sizes for each image. all items in the set were publicly available for viewing, commenting, and tagging, and each image was accompanied by links back to the kb at the item, collection, and repository level.

the collection

the choice of a collection for the study was limited by a number of factors. first, and most obviously, it needed to be an image collection. second, it needed to be in the public domain, both to allow our digitization and distribution of the images, and also to satisfy flickr commons’ “no known copyright restrictions” requirement.31 this could be accomplished either by choosing a collection whose copyright protections had expired, or by removing restrictions from a collection to which the libraries owned the rights.
third, the curator of the collection needed to be willing and able to post the images on a commercial site. this required not only an open-minded curator, but also a collection without a restrictive donor agreement or items containing sensitive or private information. finally, we wanted the collection to be of broad public interest. the collection chosen for the study was a set of 163 photographs from osu’s charles h. mccaghy collection of exotic dance from burlesque to clubs, held by the jerome lawrence and robert e. lee theatre research institute.32 the photographs, mainly images of burlesque dancers, were published on cabinet and tobacco cards in the 1890s, putting them solidly in the public domain.

figure 1. "the devil's auction," j. gurney & son (studio). http://hdl.handle.net/1811/47633 (kb), http://www.flickr.com/photos/60966199@n08/5588351865/ (flickr)

methodology

phases

the study took place in 2011 and was organized in three ten-week phases. for the first phase (january 31 through april 11), the images were submitted to the kb. the purpose of this phase was to provide a baseline level of usage for the images in the repository. in phase two (april 12 through june 20), half of the images were randomly selected and submitted to flickr (group a). the purpose of this phase was to determine what effect reposting would have on usage of items in the repository, both on those images that were reposted and on other images in the same collection that had not been reposted. in phase three (june 21 through august 29), the rest of the images (group b) were submitted to flickr. in this phase, we began publicizing the collection. publicity consisted of sharing links to the collection on social media and sending emails to scholars in relevant fields via email lists.
these efforts led to further downstream publicity on popular and scholarly blogs.33

data collection

the unit of measurement for the study was views of individual images. to understand the notion of a “view,” we must contrast two different ways that an image may be viewed in the knowledge bank. each image in the collection has an individual web page (the item page) where it is presented along with metadata describing it. in addition, from that page a visitor may download and save the image file itself (in this collection, a jpeg). in the former case, the image is an element in a web page, while in the latter it is an image file independent of its web context. search engines and other sources commonly link directly to such files, so it is not unusual for a visitor to download a file without ever having seen it in context. in light of this, we produced two data sets, one for visits to item pages, and another for file downloads. depending on one’s interpretation, either could be construed as a “view.” ultimately there was little distinction in usage patterns between the two types of measure. the data were generated by making use of dspace’s apache solr-based statistics system, which provides a queryable database of usage events. for each item in the study, we made two queries: one for per-day counts of item page views, and another for per-day counts of image file downloads (called “bitstream” downloads in dspace parlance). in both cases, views that came from automated sources such as search engine indexing agents were excluded from our counts. views of the images in flickr were noted and used as a benchmark, but were not the focus of the study. unlike cumulative views, which are tabulated and saved indefinitely, flickr saves daily view numbers for only thirty days. as a result, daily view numbers for most of the study period were not available for analysis, and the discussion of the trends in the flickr data is necessarily anecdotal.
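the two per-item queries described above can be sketched in python. this is a minimal illustration under stated assumptions, not the authors' actual queries: the field names (type, id, time, isBot), the numeric type codes, the range-faceting parameters, the example item id, and the base url are all assumptions that should be checked against the schema and solr version of a given dspace installation.

```python
from urllib.parse import urlencode

# Assumed DSpace statistics conventions; verify against your own Solr schema.
ITEM_TYPE = 2       # assumed type code for item page views
BITSTREAM_TYPE = 0  # assumed type code for file ("bitstream") downloads


def daily_count_params(object_id, object_type, start, end):
    """Build Solr parameters asking for per-day counts of one object's usage events."""
    return {
        "q": f"type:{object_type} AND id:{object_id}",
        "fq": "-isBot:true",         # exclude events flagged as automated agents
        "rows": 0,                   # only the facet counts are needed, not the events
        "facet": "true",
        "facet.range": "time",       # assumed name of the event timestamp field
        "facet.range.start": start,
        "facet.range.end": end,
        "facet.range.gap": "+1DAY",  # one bucket per day
        "wt": "json",
    }


def daily_count_url(base_url, params):
    """Turn the parameters into a request URL for the core's select handler."""
    return f"{base_url}/select?{urlencode(params)}"


# Hypothetical item id and server location, covering the three study phases.
params = daily_count_params(12345, ITEM_TYPE,
                            "2011-01-31T00:00:00Z", "2011-08-30T00:00:00Z")
print(daily_count_url("http://localhost:8080/solr/statistics", params))
```

running the same function with BITSTREAM_TYPE and the id of the image file would give the second data set. older solr releases expose date faceting under different parameter names, so the facet settings here are the part most likely to need adjusting.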
results

at the end of the study period, the data showed very little usage of the collection in the repository. this lack of usage was relatively consistent through the three phases of the study, and in rough terms translates to less than one view of each item per day. between the two ways of measuring an image "view" (counting views of the web page where the item can be found, or counting how many times the image file was downloaded), there was little distinction. knowledge bank item pages received between 5 and 38 views per item, while files were downloaded between 4 and 34 times. further, there were no significant differences in number of views received between the first group released to flickr and the second.

                                                    kb item page views      image file downloads
                                                    min   median   max      min   median   max
group a (images released to flickr in phase ii)       5       10    35        5        9    25
group b (images released to flickr in phase iii)      6       10    38        4        9    34

table 1. the items in the study are divided into group a and group b, depending on when the images were placed on flickr. this table shows that both groups received similar traffic over the course of the study, with items having between 5 and 38 views in both groups, with a median of 10 for both, and between 4 and 34 downloads, with a median of 9 for both groups.

the items attracted more visitors on flickr, with the images receiving between 100 and 600 views each. with a few exceptions, the items that appeared towards the beginning of the set (as viewed by a user who starts from the set home page) received more views than items towards its end. this suggests a particular usage pattern: start at the beginning, browse through a certain number of images, and navigate away. a more significant trend in the flickr data is that most views of the images came after publicity for the collection began (approximately midway through the third phase of the study).
again, the lack of daily usage numbers on flickr makes it impossible to demonstrate the publicity ‘bump,’ but it was dramatic. we witnessed a similar, if smaller, ‘bump’ in usage of the items in the kb after publicity started. we were also able to identify 65 unique visitors to the kb who came to the site via a link on flickr, out of 449 unique visitors overall. of those who came to the kb from flickr, 31 continued on to other parts of the kb, and the rest left after viewing a single item or image.

discussion

with so little data, we cannot reliably answer the primary research question. reposting certainly does not seem to have lowered usage of the items in the kb, but the numbers of views in all phases were so small as to preclude drawing meaningful conclusions. a larger issue is the fact that much of the usage came immediately following our promotional efforts. this development complicated the research in a number of ways. first, because the promotional emails and social media messages specifically pointed users to the collection in flickr, it is impossible to know how the use may have differed if the primary link in the promotion had been to the knowledge bank. would the higher use seen on flickr simply have transferred to the kb? would the unfamiliarity and non-image-centric interface of the knowledge bank have thwarted casual users in their attempt to browse the collection? the centrality of the promotion efforts also suggests that one of the underlying assumptions of the study may have been wrong. this research project was premised on the idea that an openly available collection on a library website will attract a certain number of visitors (number dependent on the popularity and topicality of the subject of the collection) who find the content spontaneously via searching and browsing.
placing that same content on a third-party site could theoretically divert a percentage of those users, who would then never visit the library’s site. the percentage of users diverted would likely depend on how many more users browse the third-party site than the library site, as well as the relative position of the two in search rankings. the mccaghy collection should have been a good candidate for this type of use pattern. flickr is certainly heavily used and browsed, and burlesque, while not currently making headlines, is a subject with fairly broad popular appeal. the fact that users did not spontaneously discover the collection on either platform in significant numbers suggests that this may not be how discovery of library digitized collections works. it is not surprising that email lists and social media should drive larger numbers of users to a collection than happenstance; the power of link curation by trusted friends via informal communication channels is well known. what is surprising is that it was the only significant use pattern in evidence. the primary takeaway is that promotion is key. if we do not promote our collections to the people who are likely to be interested in them, barring a stroke of luck, it is unlikely that they will be found. anecdotally, promotional efforts are often an afterthought in digital collections work, a pleasant but unnecessary ‘extra.’ in our environment, the repository staff often feel that promotion is the work of the collection owner, who may not think of promoting the collection in the digital environment, nor know how to do so. as a result, users who would benefit from the collections simply do not know they exist. these results also suggest that librarians worried about the consequences of sharing their collections on third-party sites may be worrying about the wrong thing.
the sheer volume of information on any given topic makes it unlikely that any but the most dedicated researcher will explore all available sources. most other users are likely to rely on trusted information sources (traditional media, blogs, social networking sites) to steer them towards the items that are most likely to interest them. instead of wondering if users will still come to the library’s site if the content is available elsewhere, perhaps we should be asking of our digital collections, “is anyone using them on any site?” and if the answer is no, the owners and caretakers of those collections should explore ways to bring them to the attention of relevant audiences.

conclusion

as a usage study of a collection hosted on a library site and a commercial site, this project was not a success. flawed assumptions and a lack of usable data resulted in an inability to address the primary research question in a meaningful way. however, it does shed light on the questions that motivated it. are our digitized collections being used? what effect do current methods of sharing and promotion have on that use? librarians working with digitized collections have fallen behind our colleagues in the print and institutional repository arenas in measuring use of collections, but we have the same needs for usage data. in the current climate of heightened accountability in higher education and publicly funded institutions, we need to demonstrate the value of what we do. we need to know when our efforts to promote our collections are working, and determine which projects have been most successful and merit continued development. and as always, we need to share our results, both formally and informally, with our colleagues.
measuring use of digital resources is challenging, and obtaining accurate usage statistics requires not only familiarity with the tools involved, but also some understanding of the ways in which the numbers can be unrepresentative of actual use. the organizations that do collect usage statistics on their digitized collections should share their methods and their results with others to help foster an environment where such data are collected and used. next steps in this area could take the shape of further research projects, or simply more visible work collecting usage statistics on digital collections. of greatest utility to the field would be data demonstrating the relative effectiveness of different methods of increasing use. do labor-intensive usability studies deliver returns in the form of increased use of the finished site? which forms of reposting generate the most views? what types of publicity are most effective in bringing users to collections? how does use of a collection change over time? there are also more policy-driven questions to be answered. for example, should further investment in a collection or site be tied to increasing use of low-traffic collections, or capitalizing on success? differences in topic, format, and audience make it difficult to generalize in this area, but we can begin building a body of knowledge that helps us learn from each other’s successes and failures.

references

1 brinley franklin, martha kyrillidou, and terry plum. "from usage to user: library metrics and expectations for the evaluation of digital libraries." in evaluation of digital libraries: an insight into useful applications and methods, ed. giannis tsakonas and christos papatheodorou, 17-39. (oxford: chandos publishing, 2009). http://www.libqual.org/publications (accessed february 29, 2012)
2 “libqual+,” accessed february 29, 2012.
http://www.libqual.org/home
3 “statsqual,” accessed february 29, 2012. http://www.digiqual.org/
4 julie arendt and cassie wagner. "beyond description: converting web site usage statistics into concrete site improvement ideas." journal of web librarianship 4, no. 1 (2010): 37-54.
5 irene m. h. herold. "digital archival image collections: who are the users?" behavioral & social sciences librarian 29, no. 4 (2010): 267-282.
6 dan hazen, jeffrey horrell, and jan merrill-oldham. selecting research collections for digitization. (council on library and information resources, 1998). http://www.clir.org/pubs/reports/hazen/pub74.html (accessed february 29, 2012)
7 bart ooghe and dries moreels. "analysing selection for digitisation: current practices and common incentives." d-lib magazine 15, no. 9 (2009): 28. http://www.dlib.org/dlib/september09/ooghe/09ooghe.html.
8 sandra d. payette and oya y. rieger. "supporting scholarly inquiry: incorporating users in the design of the digital library." the journal of academic librarianship 24, no. 2 (1998): 121-129.
9 judy jeng. "what is usability in the context of the digital library and how can it be measured?" information technology & libraries 24, no. 2 (2005): 47-56.
10 “open archives initiative protocol for metadata harvesting,” accessed february 29, 2012. http://www.openarchives.org/pmh/
11 “open archives initiative object reuse and exchange,” accessed february 29, 2012. http://www.openarchives.org/ore/
12 eric miller and micheline westfall. "linked data and libraries." serials librarian 60, no. 1-4 (2011): 17-22.
13 ann m. lally and carolyn e. dunford. “using wikipedia to extend digital collections,” d-lib magazine 13, no. 5/6 (2007). accessed february 29, 2012. doi:10.1045/may2007-lally
14 “flickr: the commons,” accessed february 29, 2012. http://www.flickr.com/commons/
15 martin kalfatovic, effie kapsalis, katherine spiess, anne camp, and michael edson. "smithsonian team flickr: a library, archives, and museums collaboration in web 2.0 space." archival science 8, no. 4 (2008): 267-277.
16 josh hadro. "lc report positive on flickr pilot." library journal 134, no. 1 (2009): 23.
17 jeremiah saunders. “flickr as a digital image collection host: a case study of the alcuin society,” collection management 33, no. 4 (2008): 302-309. doi: 10.1080/01462670802360387
18 victoria carolan and anna towlson. "a history in pictures: lse archives on flickr." aliss quarterly 6 (2011): 16-18.
19 michael stephens. "flickr." library technology reports 42, no. 4 (2006): 58-62.
20 jason paul michel and elias tzoc. "automated bulk uploading of images and metadata to flickr." journal of web librarianship 4, no. 4 (2010): 435-448.
21 peggy garvin. "photostreams to the people." searcher 17, no. 8 (2009): 45-49.
22 jason vaughan. "insights into the commons on flickr." portal: libraries & the academy 10, no. 2 (2010): 185-214.
23 “pets-in-collections,” accessed february 29, 2012. http://petsincollections.tumblr.com/
24 “dspace,” accessed february 29, 2012. http://www.dspace.org/
25 “digital commons,” accessed february 29, 2012. http://digitalcommons.bepress.com/
26 dorothea salo. "innkeeper at the roach motel." library trends 57, no. 2 (2008): 98-123.
27 for an example of the type of debate that tends to surround projects like flickr commons, see http://www.foundhistory.org/2008/12/22/tragedy-at-the-commons/ (accessed february 29, 2012)
28 “the knowledge bank,” accessed february 29, 2012. http://kb.osu.edu
29 “charles h.
mccaghy collection of exotic dance from burlesque to clubs,” accessed february 29, 2012. http://hdl.handle.net/1811/47556
30 “charles h. mccaghy collection of exotic dance from burlesque to clubs,” accessed february 29, 2012. http://flic.kr/s/ahsjua3bgi
31 “flickr: the commons (usage),” accessed february 29, 2012. http://www.flickr.com/commons/usage/
32 “the jerome lawrence and robert e. lee theatre research institute,” http://library.osu.edu/find/collections/theatre-research-institute/; “charles h. mccaghy collection of exotic dance from burlesque to clubs,” http://library.osu.edu/find/collections/theatre-research-institute/personal-papers-and-special-collections/charles-h-mccaghy-collection-of-exotic-dance-from-burlesque-to-clubs/; “loose women in tights digital exhibit,” http://library.osu.edu/find/collections/theatre-research-institute/digital-exhibits-projects/loose-women-in-tights-digital-exhibit/. accessed february 29, 2012.
33 for an example of the kind of coverage it received, see http://flavorwire.com/195225/fascinating-photos-of-19th-century-vaudeville-and-burlesque-performers (accessed february 29, 2012)

thmanager | lacasta, nogueras-iso, lópez-pellicer, muro-medrano, and zarazaga-soria

the term knowledge organization systems denotes formally represented knowledge that is used within the context of digital libraries to improve data sharing and information retrieval. to increase their use, and to reuse them when possible, it is vital to manage them adequately and to provide them in a standard interchange format. simple knowledge organization systems (skos) seems to be the most promising representation for the type of knowledge models used in digital libraries, but there is a lack of tools that are able to properly manage it. this work presents a tool that fills this gap, facilitating their use in different environments and using skos as an interchange format.

unlike the largely unstructured information available on the web, information in digital libraries (dls) is explicitly organized, described, and managed. in order to facilitate discovery and access, dl systems summarize the content of their data resources into small descriptions, usually called metadata, which can be either introduced manually or automatically generated (for example, index terms automatically extracted from a collection of documents). most dls use structured metadata in accordance with recognized standards, such as marc21 (u.s. library of congress 2004) or dublin core (iso 2003).
in order to provide accurate metadata without terminological dispersion, metadata creators use different forms of controlled vocabularies to fill the content of typical keyword sections. this increase of homogeneity in the descriptions is intended to improve the results provided by search systems. to facilitate the retrieval process, the same vocabularies used to create the descriptions are usually used to simplify the construction of user queries. as there are many different schemas for modeling controlled vocabularies, the term knowledge organization systems (kos) is intended to encompass all types of schemas for organizing information and promoting knowledge management. as hodge (2000) says, “a kos serves as a bridge between the users’ information need and the material in the collection.” some types of kos can be highlighted. examples of simple types are glossaries, which are only lists of terms (usually with definitions), and authority files that control variant versions of key information (such as geographic or personal names). more complex are subject headings, classification schemes, and categorization schemes (also known as taxonomies) that provide a limited hierarchical structure. at a more complex level, kos includes thesauri and less traditional schemes, such as semantic networks and ontologies, that provide richer semantic relations. there is not a single kos on which everyone agrees. as lesk (1997) notes, while a single kos would be advantageous, it is unlikely that such a system will ever be developed. culture constrains the knowledge classification scheme because what is meaningful to one area is not necessarily meaningful to another. depending on the situation, the use of one or another kos has its advantages and disadvantages, each one having its place. these schemas, although sharing many characteristics, usually have been treated heterogeneously, leading to a variety of representation formats to store them.
thesauri are an example of the format heterogeneity problem. according to iso-2788 (norm for monolingual thesauri) (iso 1986), a thesaurus is a set of terms that describe the vocabulary of a controlled indexing language, formally organized so that the a priori relationships between concepts (for example, synonyms, broader terms, narrower terms, and related terms) are made explicit. this standard is complemented with iso-5964 (iso 1985), which describes the model for multilingual thesauri, but neither of them describes a representation format. the lack of a standard representation model has caused a proliferation of incompatible formats created by different organizations. as a result, each organization that wants to use several external thesauri has to create specific tools to transform all of them to the same format. in order to eliminate the heterogeneity of representation formats, the w3c initiative has promoted the development of simple knowledge organization systems (skos) (miles et al. 2005) for its use in the semantic web environment. skos has been created to represent simple kos, such as subject heading lists, taxonomies, classification schemes, thesauri, folksonomies, and other types of controlled vocabulary as well as concept schemes embedded in glossaries and terminologies. although skos has been proposed only recently, the number and importance of organizations involved in its creation process (and that publish their kos in this format) indicate that it will probably become a standard for kos representation. skos provides a rich, machine-readable language that is very useful to represent kos, but nobody would expect to have to create it manually or by just using a general-purpose resource description framework (rdf) editor (skos is rdf-based). however, in the digital library area, there are no specialized tools that are able to manage it adequately. therefore, this work tries to fill this gap, describing an open source tool, thmanager, that
therefore, this work tries to fill this gap, describing an open source tool, thmanager, that thmanager: an open source tool for creating and visualizing skos javier lacasta, javier nogueras-iso, francisco javier lópez-pellicer, pedro rafael muro-medrano, and francisco javier zarazaga-soria javier lacasta (jlacasta@unizar.es) is assistant professor, javier nogueras-iso (jnog@unizar.es) is assistant professor, francisco javier lópez-pellicer (fjlopez@unizar.es) is research fellow, pedro rafael muro-medrano (prmuro@ unizar.es) is associate professor, and francisco javier zarazaga-soria (javy@unizar.es) is associate professor in the computer science and systems engineering department, university of zaragoza, spain. �0 information technology and libraries | september 2007�0 information technology and libraries | september 2007 facilitates the construction of skos­based kos. although thmanager has been created to manage thesauri, it also is appropriate to create and manage any other models that can be represented using skos format. this article describes the thmanager tool, highlight­ ing its characteristics. thmanager’s layer­based architec­ ture permits the reuse of the components created for the management of thesauri in other applications where they are also needed. for example, it facilitates the selection of values from a controlled vocabulary in a metadata cre­ ation tool, or the construction of user queries in a search client. the tool is distributed as open source software accessible through the sourceforge platform (http:// thmanager.sourceforge.net/). ■ state of the art in thesaurus tools and representation models the problem of creating appropriate content for thesauri is of interest in the dl field and other related disciplines, and an increasing number of software packages have appeared in recent years for constructing thesauri. 
for instance, the web site of willpower information (http://www.willpower.demon.co.uk/thessoft.htm) offers a detailed review of more than forty tools. some are only available as a module of a complete information storage and retrieval system, but others also allow the possibility of working independently of any other software. among these thesaurus creation tools, one may note the following products:
■ bibliotech (http://www.inmagic.com/). this is a multiplatform tool that forms part of the bibliotech pro integrated library system and can be used to build an ansi/niso standard thesaurus (standard z39.19 [ansi 1993]).
■ lexico (http://www.pmei.com/lexico.html). this is a java-based tool that can be accessed and/or manipulated over the internet. thesauri are saved in a text-based format. it has been used by the u.s. library of congress to manage such vocabularies and thesauri as the thesaurus for graphic materials, the global legal information network thesaurus, the legislative indexing vocabulary, and the symbols of american libraries listing.
■ multites (http://www.multites.com/) is a windows-based tool that provides support for ansi/niso relationships plus user-defined relationships and comment fields for an unlimited number of thesauri (both monolingual and multilingual).
■ termtree 2000 (http://www.termtree.com.au/) is a windows-based tool that uses access, sql server, or oracle for data storage. it can import and export trim thesauri (a format used by the towers records information management system [http://www.towersoft.com/]), as well as a defined termtree 2000 tag format.
■ webchoir (http://www.webchoir.com/) is a family of client-server web applications that provides different utilities for thesaurus management in multiple dbms platforms.
termchoir is a hierarchical information organizing and searching tool that enables one to create and search varieties of hierarchical subject categories, controlled vocabularies, and taxonomies based on either predefined standards or a user-defined structure, which can then be exported to an xml-based format. linkchoir is another tool that allows indexers to describe information sources using terminology organized in termchoir. and seekchoir is a retrieval system that enables users to browse thesaurus descriptors and their references (broader terms, related terms, synonyms, and so on).
■ synaptica (http://www.synaptica.com/) is a client-server web application that can be installed locally on a client’s intranet or extranet server. thesaurus data is stored in a sql server or oracle database. the application supports the creation of electronic thesauri in compliance with the ansi/niso standard. the application allows the exchange of thesauri in csv (comma-separated values) text format.
■ superthes (batschi et al. 2002) is a windows-based tool that allows the creation of thesauri. it extends the ansi/niso relationships, allowing many possible data types to enrich the properties of a concept. it can import and export thesauri in xml and tabular format.
■ tematres (http://r020.com.ar/tematres/) is a web application specially oriented to the creation of thesauri, but it also can be used to develop web navigation structures or to manage the documentary languages in use. the thesauri are stored in a mysql database. it provides the created thesauri in zthes (tylor 2004) or in skos format.
finally, it must be mentioned that, given that thesauri can be considered as ontologies specialized in organizing terminology (gonzalo et al. 1998), ontology editors have sometimes been used for thesaurus construction. a detailed survey of ontology editors can be found in the denny study (2002).
all of these tools (desktop or web-based) present some problems when used as general thesaurus editors. the main one is the incompatibility of the interchange formats that they support. these tools also present integration problems. some are deeply integrated in bigger systems and cannot easily be reused in other environments because they need specific software components to work (such as a dbms to store thesauri). others are independent tools (they can be considered general-purpose thesaurus editors), but their architecture does not facilitate their integration within other information management tools. and most of them are not open source tools, so there is no possibility of modifying them to improve their functionality.

thmanager | lacasta, nogueras-iso, lópez-pellicer, muro-medrano, and zarazaga-soria

focusing on the interchange format problem, the iso-5964 standard (norm for multilingual thesauri) is currently undergoing review by iso tc46/sc 9, and it is expected that the new modifications will include a standard exchange format for thesauri. it is believed that this format will be based on technologies such as rdf/xml. in fact, some initiatives in this direction have already arisen:

■ the adl thesaurus protocol (janée et al. 2003) defines an xml- and http-based protocol for accessing thesauri. as a result of query operations, portions of the thesaurus encoded in xml are returned.
■ the language independent metadata browsing of european resources (limber) project has published a thesaurus interchange format in rdf (matthews et al. 2001). this work introduces an rdf representation of thesauri, which is proposed as a candidate thesaurus interchange format.
■ the california environmental resources evaluation system (ceres) and the nbii biological resources division are collaborating in a thesaurus partnership project (ceres/nbii 2003) for the development of an integrated environmental thesaurus and a thesaurus networking toolset for metadata development and keyword searching. one of the deliverables of this project is an rdf format to represent thesauri.
■ the semantic web advanced development for europe (swad-europe 2001) project includes the swad-europe thesaurus activity, which has defined skos, a set of specifications to represent knowledge organization systems (kos) on the semantic web (thesauri among them).

the british standards bs-5723 (bsi 1987) and bs-6723 (bsi 1985) (equivalent to the international iso-2788 and iso-5964) also lack a representation format. the british standards institute idt/2/2 working group is now developing the bs-8723 standard that will replace them and whose fifth part will describe the exchange formats and protocols for interoperability of thesauri. the objective of this working group is to promote the standard to iso, to replace iso-2788 and iso-5964. here, it is important to remark that, given the direct involvement of the idt/2/2 working group with skos development, the two initiatives will probably not diverge. the new representation format will be, if not exactly skos, at least skos-based.

taking into account all these circumstances, skos seems to be the most adequate representation model to store thesauri. given that skos is rdf-based, it can be created using any tool that is able to manage rdf (usually used to edit ontologies); for example, swoop (mindswap group 2006), protégé (noy et al. 2000), or triple20 (wielemaker et al. 2005). the problem with these tools is that they are too complex for editing and visualizing such a simple model as skos.
they are designed to create complex ontologies, so they provide too many options not specifically adapted to the type of relations in skos. in addition, they do not allow an integrated management of collections of thesauri and other types of controlled vocabularies as needed in dl processes (for example, the creation of metadata of resources, or the construction of queries in a search system).

■ skos model

skos is a representation model for simple knowledge organization systems, such as subject heading lists, taxonomies, classification schemes, thesauri, folksonomies, other types of controlled vocabulary, and also concept schemes embedded in glossaries and terminologies. this section describes the model, providing characteristics, showing the state of development, and indicating the problems found to represent some types of kos.

skos was initially developed within the scope of the semantic web advanced development for europe (swad-europe 2001). swad-e was created to support w3c's semantic web initiative in europe (part of the ist-7 programme). skos is based on a generic rdf schema for thesauri that was initially produced by the desire project (cross et al. 2001), and further developed in the limber project (matthews et al. 2001). it has been developed as a draft of an rdf schema for thesauri compatible with relevant iso standards, and later adapted to support other types of kos. among the kos already published using this new format are gemet (eea 2001), agrovoc (fao 2006), adl feature types (hill and zheng 1999), and some parts of the wordnet lexical database (miller 1990), all of them available on the skos project web page.

skos is a collection of three different rdf schema application profiles: skos-core, to store common properties and relations; skos-mapping, whose purpose is to describe relations between different kos; and skos-extension, to indicate specific relations and properties only contained in some type of kos.
for the first step of the development of the thmanager tool, only the most stable part of skos has been considered. figure 1 shows the part of skos-core used. the rest of skos-core is still unstable, so its support has been delayed until it is approved. skos-mapping and skos-extension are still in their first steps of development and are very unstable, so their management in thmanager also has been delayed until the creation of stable versions.

information technology and libraries | september 2007

in skos-core, a kos (in our case, usually a thesaurus) consists of a set of concepts (labelled as skos:concept) that are grouped by a concept scheme (skos:conceptscheme). to distinguish between the different models provided, the skos:conceptscheme contains a uri that identifies it, but to describe the model content to humans, metadata following the dublin core standard also can be added. the relation of the concept scheme with the concepts of the kos is expressed through the skos:hastopconcept relation. this relation points at the most general concepts of the kos (top concepts), which are used as entry points to the kos structure.

in skos, each concept consists of a uri and a set of properties and relations to other concepts. among the properties, skos:preflabel and skos:altlabel provide labels for a concept in different languages. the first one is used to show the label that best identifies a concept (for thesauri it must be unique). the second one is an alternative label that contains synonyms or spelling variations of the preferred label (it is used to redirect to the preferred label of the concept). the skos concepts also can contain three other properties called skos:scopenote, skos:definition, and skos:example. they contain annotations about the ways to use a concept, a definition, or examples of use in different languages.
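the part of the model described so far can be pictured with a small sketch. the python classes below are illustrative only (thmanager itself is written in java, and none of these class or method names come from the tool); the concept uri is the gemet concept used in figure 5, the scheme uri is assumed to be the gemet identifier, and the alternative label is a hypothetical synonym.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Concept:
    """One skos:Concept: a URI plus language-tagged properties."""
    uri: str
    pref_label: Dict[str, str] = field(default_factory=dict)         # lang -> one label
    alt_labels: Dict[str, List[str]] = field(default_factory=dict)   # lang -> synonyms
    scope_notes: Dict[str, List[str]] = field(default_factory=dict)
    definitions: Dict[str, List[str]] = field(default_factory=dict)
    examples: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class ConceptScheme:
    """One skos:ConceptScheme: a URI, optional Dublin Core metadata, concepts."""
    uri: str
    metadata: Dict[str, str] = field(default_factory=dict)
    concepts: Dict[str, Concept] = field(default_factory=dict)
    top_concepts: List[str] = field(default_factory=list)            # skos:hasTopConcept

    def add(self, concept: Concept, top: bool = False) -> None:
        self.concepts[concept.uri] = concept
        if top:
            self.top_concepts.append(concept.uri)

    def resolve(self, label: str, lang: str) -> Optional[Concept]:
        """Find a concept by preferred label, or redirect from an alternative one."""
        for concept in self.concepts.values():
            if concept.pref_label.get(lang) == label:
                return concept
            if label in concept.alt_labels.get(lang, []):
                return concept
        return None


scheme = ConceptScheme(
    uri="http://www.eionet.eu.int/gemet",  # assumed scheme URI
    metadata={"dc:title": "general multilingual environmental thesaurus"},
)
alloy = Concept(
    uri="http://www.eionet.eu.int/gemet/concept/340",
    pref_label={"en": "alloy"},
    alt_labels={"en": ["metal alloy"]},  # hypothetical synonym, for illustration
)
scheme.add(alloy, top=True)  # treating it as a top concept is illustrative only
```

resolving either "alloy" or "metal alloy" in english returns the same concept, which is the redirection role the text assigns to alternative labels.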
last, the skos:prefsymbol and skos:altsymbol properties are used to provide a preferred symbol or some alternative symbols that graphically represent the concept. for example, a graphical representation is very useful to identify the meaning of a mathematical formula. another example is a chemical formula, where a graphical representation of the structure of the substance also provides valuable information to the user.

with respect to the relations, each concept indicates by means of the skos:inscheme relation in which concept scheme it is contained. the skos:broader and skos:narrower relations are inverse relations used to model the generalization and specialization characteristics present in many kos (including thesauri). skos:broader relates to more general concepts, and skos:narrower to more specific ones. the skos:related relation describes associative relationships between concepts (also present in many thesauri), indicating that two concepts are related in some way.

with these properties and relations, it is perfectly possible to represent thesauri, taxonomies, and other types of controlled vocabularies. however, there is a problem for the representation of classification schemes that provide multiple coding of terms, as there is no place to store this information. under this category, one may find classification schemes such as iso-639 (iso 2002) (iso standard for coding of languages), which proposes different types of alphanumeric codes (for example, two letters and three letters). for this special case, the skos working group proposes the use of the property skos:notation. although this property is not in the skos vocabulary yet, it is expected to be added in future versions. given the need to work with these types of schemes, this property has been included in the thmanager tool.

■ thmanager architecture

this section presents the architecture of the thmanager tool.
this tool has been created to manage thesauri in skos, but it also is a base infrastructure that facilitates the management of thesauri in dls, simplifying their integration in tools that need to use thesauri or other types of controlled vocabularies. in addition, to facilitate its use on different computer platforms, thmanager has been developed using the java object-oriented language.

the architecture of the thmanager tool is shown in figure 2. the system consists of three layers: first, a repository layer where thesauri are stored and identified by means of associated metadata describing them; second, a persistence layer that provides an api for access to thesauri stored in the repository; and third, a gui layer that offers different graphical components to visualize thesauri, to search by their properties, and to edit them in different ways. the thmanager tool is an application that uses the different components provided by the gui layer to allow the user to manage the thesauri. in addition, the layered architecture allows other applications to use some of the visualization components or the methods provided by the persistence layer to provide access to thesauri.

the main features that have guided the design of these layers have been the following: a metadata-driven design, efficient management of thesauri, the possibility of interrelating thesauri, and the reusability of thmanager components. the following subsections describe these characteristics in detail.

figure 1. skos model

metadata-driven design

a fundamental aspect in the repository layer is the use of metadata to describe thesauri. thmanager considers metadata of thesauri as basic information in the thesaurus management process, being stored in the metadata repository and managed by the metadata manager.
the reason for this metadata-driven design is that thesauri must be described and classified to facilitate the selection of the one that best fits the user needs, allowing the user to search them not only by their name but also by the application domain or the associated geographical area, among others. the lack of metadata makes the identification of useful thesauri (provided by other organizations) difficult, producing a low reuse of them in other contexts.

to describe thesauri in our service, a metadata profile based on dublin core has been created. the reason to use dublin core as the basis of this profile has been its extensive use in the metadata community. it provides a simple way to describe a resource using very general metadata elements, which can be easily matched with complex domain-specific metadata standards. additionally, dublin core also can be extended to define application profiles for specific types of resources. following the metadata profile hierarchy described in tolosana-calasanz et al. (2006), the thesaurus metadata profile refines the definition and domain of dublin core elements and includes two new elements (metadata language and metadata identifier) to appropriately identify the metadata records describing a thesaurus. the profile for thesauri has been described using the iemsr format (heery et al. 2005) and is distributed with the tool. iemsr is an rdf-based format created by the jisc ie metadata schema registry project to describe metadata application profiles.

figure 3 shows the metadata created for the gemet thesaurus (the resource), expressed as a hedgehog graph (reinterpretation of rdf triplets: resources, named properties, and values). the purpose of these metadata is not only to simplify the thesaurus location to a user, but also to facilitate the identification of thesauri useful for a specific task in a machine-to-machine communication.
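as a rough illustration of this metadata-based selection, the python sketch below filters metadata records by language and subject. the function name and the dictionary layout are assumptions made for the example, not part of thmanager; the gemet values follow figure 3, and the second record is entirely hypothetical.

```python
def select_thesauri(records, language=None, subject=None):
    """Return the titles of records matching the requested language and subject."""
    selected = []
    for record in records:
        if language is not None and language not in record.get("dc:language", []):
            continue
        subjects = record.get("dc:subject", [])
        if subject is not None and not any(subject in s for s in subjects):
            continue
        selected.append(record["dc:title"])
    return selected


records = [
    {   # values taken from the gemet metadata in figure 3
        "dc:title": "general multilingual environmental thesaurus",
        "dc:language": ["en", "es", "fr"],
        "dc:subject": [
            "science.environmental sciences and engineering",
            "science.pollution, disasters and security",
            "science.natural resources",
            "science.natural sciences",
        ],
    },
    {   # hypothetical record, for contrast only
        "dc:title": "example legal vocabulary",
        "dc:language": ["en"],
        "dc:subject": ["law.legislation"],
    },
]

spanish_environment = select_thesauri(records, language="es", subject="natural resources")
```

the same filter could be driven by any other element of the profile; language and subject are used here only because figure 3 provides values for them.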
for instance, one may be interested only in thesauri that cover a restricted geographical area or have a specific theme.

efficient thesauri storage

thesauri vary enormously in size, ranging from hundreds of concepts and properties to millions. so the time spent on load, navigation, and search processes is a functional restriction for a tool that has to manage them. skos is rdf-based, and because reading rdf to extract the content is a slow process, the format is not appropriate for inner storage. to provide better access time, thmanager transforms skos into a binary format when a new skos is imported.

the persistence layer provides a unified access to the thesaurus repository. this layer is used by the gui layer to access the thesauri, but it also can be employed by other tools that need to use thesauri outside a desktop environment (for example, a thematic search system accessible through the web that requires browsing a thesaurus to facilitate construction of user queries). this layer performs the transformation of skos to the binary format when a thesaurus is imported.

figure 2. kos manager architecture (labels shown: repository, persistence, gui, applications; concept repository, metadata repository, thesaurus metadata; thesaurus persistence manager, concept manager, metadata manager, disambiguation tool, concept core, skos core, skos mapping, jena api; gui manager, viewer generator, visualization, edition, search; thmanager, desktop tools, web services, and other tools that use thesauri)

figure 3. metadata of gemet thesaurus (values shown: dc:title general multilingual environmental thesaurus; dcterms:alternative gemet; dc:subject science.environmental sciences and engineering, science.pollution, disasters and security, science.natural resources, and science.natural sciences; dc:description gemet was conceived as a "general" thesaurus, aimed to define a common general language, a core of general terminology for the environment; dc:date 2005-03-07; dc:type text.reference materials.ontology; dc:format skos; dc:identifier http://www.eionet.eu.int/gemet; dc:language en es fr ...; iaaa:metadatalanguage en; iaaa:metadataidentifier http://iaaa.cps.unizar.es/ontologies/gemet; dc:rights it can be used whenever there is no commercial profit; dc:source eurovoc thesaurus [http://europa.eu/eurovoc]; the dc:creator, dc:publisher, dc:contributor, and dc:relation values name the european topic centre on catalogue of data sources (etc/cds), the european environment agency (eea), the us environmental protection agency (epa), and the european environment information and observation network [http://www.eionet.europa.eu])
the transformation is provided using the jena library, a popular library to manipulate rdf documents that allows storing them in different kinds of repositories (http://jena.sourceforge.net/). jena provides an open model that can be extended with specialized modules to use other ways of storage, making it possible to easily change the storage format for another that is more efficient if needed.

the data structure used is shown in figure 4. the model is an optimized representation of the information given by the rdf triplets. the concepts map contains the concepts and their associated relations in the form of key-value pairs: the key is a uri identifying a concept, and the value is a relations object containing the properties of the concept. a relations object is a map that stores the properties of one concept in the form of <key, value> pairs. the keys used for this map are the names of the typical property types in the skos model (for example, narrower or broader). the only special cases for encoding these property types in the proposed data structure occur when they have a language attribute (for example, preflabel, definition, or scopenote). in those cases, we propose the use of a [lang] suffix to distinguish the property type for a particular language. for instance, preflabel_en indicates a preflabel property type in english.

additionally, it must be noted that the data type of the property values assigned to each key in the relations map varies depending on the semantics given to each property type. the data types fall into the following categories: a string for a preflabel property type; a list of strings for altlabel, definition, scopenote, and example property types; a uri for a prefsymbol property type; a list of uris for narrower, broader, related, and altsymbol property types; and a list of notation objects for a notation property type. the data type used for notation values is a complex object because there may be different notation types.
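the concepts map and its relations objects can be approximated with nested dictionaries. the python sketch below is an assumption-laden illustration of the structure just described, not the tool's java implementation: the helper names, the example parent concept, and the notation type uris are hypothetical, and property names are written in skos camel case (prefLabel_en for preflabel_en).

```python
from collections import namedtuple

# A notation value is a complex object: a type URI plus the code itself.
Notation = namedtuple("Notation", ["type", "value"])

SINGLE_VALUED = {"prefLabel", "prefSymbol"}
LANGUAGE_TAGGED = {"prefLabel", "altLabel", "definition", "scopeNote", "example"}


def relation_key(prop, lang=None):
    """Build the key of the relations map, adding the [lang] suffix when needed."""
    if prop in LANGUAGE_TAGGED:
        if lang is None:
            raise ValueError(prop + " requires a language")
        return f"{prop}_{lang}"
    return prop


def add_property(concepts, uri, prop, value, lang=None):
    """Store one property of one concept in the concepts map."""
    relations = concepts.setdefault(uri, {})
    key = relation_key(prop, lang)
    if prop in SINGLE_VALUED:
        relations[key] = value                      # a single string or URI
    else:
        relations.setdefault(key, []).append(value)  # a list of values


ALLOY = "http://www.eionet.eu.int/gemet/concept/340"
PARENT = "urn:example:concept/material"   # hypothetical broader concept
ENGLISH = "urn:example:language/english"  # hypothetical concept of a language scheme

concepts = {}
add_property(concepts, ALLOY, "prefLabel", "alloy", lang="en")
add_property(concepts, ALLOY, "altLabel", "metal alloy", lang="en")  # hypothetical
add_property(concepts, ALLOY, "broader", PARENT)
add_property(concepts, PARENT, "narrower", ALLOY)
# Two codes for the same language, as in iso-639 (type URIs are hypothetical).
add_property(concepts, ENGLISH, "notation", Notation("urn:example:iso639-1", "en"))
add_property(concepts, ENGLISH, "notation", Notation("urn:example:iso639-2", "eng"))
```

keeping the language in the key, rather than in the value, means that the label of a concept in a given language is found with a single dictionary lookup.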
a notation object consists of type and value attributes. the type attribute is a uri that identifies a particular notation type and qualifies the associated notation value.

additionally, and with the objective of increasing the speed of some operations (for example, navigation or search), some optimizations have been added. first, the uris of the top concepts are stored in the topconcepts list. this list contains redundant information, given that those concepts also are stored in the concepts map, but it makes locating them immediate. second, to speed up the search of concepts and the drawing of the alphabetic viewer, the translations map has been added. for each language supported by the thesaurus, this map contains a list of translationterm objects, that is, <concept uri, preflabel> pairs ordered by preflabel. it also contains redundant information that allows the immediate creation of the alphabetic viewer for a language, simplifying the search process; as can be seen later, this does not add a big overhead in load time. in addition, if no alphabetic viewer and search are needed, this structure can be removed without affecting the hierarchical viewer.

this solution has proven to be useful to manage the kind of thesauri we use (they do not surpass 50,000 concepts and about 330,000 properties), loading them to memory in an average computer in a reasonable time, and allowing immediate navigation and search (see section 6).

figure 4. persistence model (structures shown: the concepts map, from concept uri to relations object; the relations map, with keys prefsymbol (uri), altsymbol (list), notation (list), preflabel_[lang] (string), altlabel_[lang], definition_[lang], scopenote_[lang], and example_[lang] (lists), and related, broader, and narrower (lists); the notation object, with type (uri) and value (string); the topconcepts list of uris; and the translations map, from language (en, es, fr, ...) to a list of translationterm objects, each with concept (uri) and label (string))

interrelation of thesauri

the vast choice of thesauri that are available nowadays implies an undesired effect of content heterogeneity. although a thesaurus is usually created for a specific application domain, some of the concepts defined in thesauri from different application domains may be equivalent. in order to facilitate cross-domain classification of resources, users would benefit from the possibility of knowing the connections of a thesaurus in their application domain to thesauri used in other domains. however, it is difficult to manually detect the implicit links between those different thesauri. therefore, in order to automatically facilitate these interthesaurus connections, the persistence layer of the thmanager tool provides an interrelation function that relates a thesaurus with respect to an upper-level lexical database (the concept core displayed in figure 2).

the interrelation mechanism is based on the method presented in nogueras-iso, zarazaga-soria, and muro-medrano (2005). it is an unsupervised disambiguation method that uses the relations between concepts as disambiguation context. it applies a heuristic voting algorithm to select the most adequate sense of the used concept core for each thesaurus concept.
at the moment, the concept core is the wordnet lexical database. wordnet is a large english lexical database that groups nouns, verbs, adjectives, and adverbs into sets of cognitive synonyms (synsets), each expressing a distinct concept. those synsets are interlinked by means of conceptual-semantic and lexical relations.

the interrelation component has been conceived as an independent module that receives a thesaurus in skos as input and returns its relation with respect to the concept core using an extended version of the skos mapping model (miles and brickley 2004). this model, as mentioned before, is a part of skos that allows describing exact, major, and minor mappings between concepts of two different kos (in this case between a thesaurus and the common core). skos mapping is still in an early stage of development and has been extended in order to provide the needed functionality.

the base skos mapping provides the map:exactmatch, map:majormatch, and map:minormatch relations to indicate the degree of relation between two concepts. given that the interrelation algorithm cannot ensure that a mapping is 100 percent exact, only the major and minor match properties are used. the algorithm returns a list of possible mappings with the lexical database for each concept: the one with the highest probability is assigned as major match, and the rest are assigned as minor matches. to store the interrelation probability, skos mapping has been extended by adding a blank node with the reliability of the mapping. also, to be able to know which concepts of which thesauri are equivalent to one of the common core, the inverse relations of map:majormatch and map:minormatch have been created. an example of skos mapping can be seen in figure 5.
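the assignment rule (highest probability as major match, the rest as minor matches) and the inverse relations can be sketched in a few lines of python. the function names and the dictionary layout are assumptions for illustration; the disambiguation algorithm that produces the probabilities is not reproduced here. the uris and probabilities are those of the gemet example in figure 5.

```python
def assign_matches(candidates):
    """Split (sense_uri, probability) candidates into one major and the minor matches."""
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    return {
        "majorMatch": ranked[0] if ranked else None,
        "minorMatch": ranked[1:],
    }


def invert(mappings):
    """Build the inverse relations: which thesaurus concepts point at each core sense."""
    inverse = {}
    for concept_uri, matches in mappings.items():
        major = matches["majorMatch"]
        if major is not None:
            inverse.setdefault(major[0], {}).setdefault("hasMajorMatch", []).append(concept_uri)
        for sense_uri, _probability in matches["minorMatch"]:
            inverse.setdefault(sense_uri, {}).setdefault("hasMinorMatch", []).append(concept_uri)
    return inverse


GEMET_ALLOY = "http://www.eionet.eu.int/gemet/concept/340"
WN_ALLOY_METAL = "http://wordnet.princeton.edu/wordnet_2.0/13751474"
WN_ADMIXTURE = "http://wordnet.princeton.edu/wordnet_2.0/13664144"

mappings = {
    GEMET_ALLOY: assign_matches([
        (WN_ADMIXTURE, 8.992731),
        (WN_ALLOY_METAL, 91.00727),
    ])
}
inverse = invert(mappings)
```

with the inverse index, finding the concepts of other thesauri that share a core sense with a given concept becomes a lookup on the sense uri.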
in figure 5, the concept 340 of the gemet thesaurus (alloy) is correctly mapped to the wordnet concept number 13751474 (alloy, metal) with a probability of 91.007 percent. an unrelated minor mapping also is found, but it is given a low probability (8.992 percent).

reusability of thmanager components

on top of the api provided by the persistence layer, the gui layer has been constructed. this layer contains several graphical interfaces to provide different types of viewers, searchers, and editors for thesauri. this layer is used as the base for the construction of the thmanager tool. the tool groups a subset of the provided components, relating them to obtain a final user application that allows the management of the stored thesauri, their visualization (navigation by the concept relations), their editing, and their importation and exportation using the skos format.

the thmanager tool has been created not only as an independent tool to facilitate thesauri management, but also to allow easy integration in tools that need to use thesauri. this has been done by combining the information management with specific graphical interfaces in different black-box components. among the provided components, there is a hierarchical viewer, an alphabetic viewer, a list viewer, a searcher, and an editor, but more components can be constructed if needed. the use of the gui layer as a library of reusable graphical components makes it possible to create different tools that are able to manage thesauri with different user requirements with minimum effort, allowing also the integration of this technology in other applications that need controlled vocabularies to improve their functionality. for example, in a metadata creation tool, it can be used to provide the graphical component to select controlled values from thesauri and automatically insert them in the metadata.
it also can be used to provide the list of possible values to use in a web search system, or to provide a thesaurus-based navigation of a collection of resources in an exploratory search system.

figure 6 shows the integration process of a thesaurus visualization component in an external tool. the provided thesaurus components have been constructed following the java beans philosophy (reusable software components that can be manipulated visually in a builder tool), where a component is a black box with methods to read and change its state that can be reused when needed. here, each thesaurus component is a thesaurusbean that can be directly inserted in a graphical application to use its functionality (visualize or edit thesauri) in a very simple way. the thesaurusbeans are provided by the thesaurusbeanmanager that, given the parameters of the thesaurus to visualize and the type of visualization, returns the most adequate component to use.

■ description of thmanager functionality

the thmanager tool is a desktop application that is able to manage thesauri stored in skos. as regards the installation requirements, the application requires 100 mb of free space on the hard disk. the ram and cpu requirements depend greatly on the size and the number of thesauri loaded in the tool. considering the number and size of thesauri used as testbed in section 6, ram consumption ranges from 256 to 512 mb, and with a 3 ghz cpu (for example, pentium iv), the load times for the bigger thesauri are acceptable. however, if the thesauri are smaller, ram and cpu requirements decrease, and the tool is able to operate on a computer with just a 1 ghz cpu (for example, pentium iii) and 128 mb of ram.
given that the management of thmanager is metadata oriented, the first window in the application shows a table including the metadata records describing all the thesauri stored in the system (figure 7). the selection of a record in this table indicates to the rest of the components the selected thesaurus. the creation or deletion of thesauri also is provided here.

the only operation that can be performed when no record is selected is to import a new thesaurus stored in skos. to import it, the name of the skos file must be provided. the import tool also contains the option to interrelate the imported thesaurus to the concept core. the metadata of the thesaurus are extracted from inside the skos file if they are available, or they can be provided in an associated xml metadata file. if no metadata record is provided, the application generates a new one with minimum information, using the name of the skos file as a base.

once the user has selected a thesaurus, the user can visualize and modify its metadata or content, export it to skos, or, as mentioned before, delete it. with respect to the metadata describing a thesaurus, a metadata viewer visualizes the metadata in html and a metadata editor allows the editing of metadata following the thesaurus metadata profile described in the metadata-driven design section (figure 8 shows a screenshot of the metadata editor). different html views can be provided by adding more css files to the application. the metadata editor is customizable. to add or delete metadata elements in the metadata editor window, it is only necessary to modify the description of the iemsr profile for thesauri included in the application.

the main functionality of the tool is to visualize the thesaurus structure, showing all properties of concepts and allowing the navigation by relations (see figure 9). here, different read-only viewers are provided.
there is an alphabetic viewer that shows all the concepts ordered by the preferred label in one language. a hierarchical viewer provides navigation by broader and narrower relations. additionally, a hypertext viewer shows all properties of a concept and provides navigation by all its relations (broader, narrower, and related) via hyperlinks. finally, there also is a search system that allows the typical searches needed for thesauri (equals, starts with, contains). currently, search is limited to preferred labels in the selected language, but it could be extended to allow searches by other properties, such as synonyms, definitions, or scope notes.

figure 5. skos mapping extension (the gemet concept http://www.eionet.eu.int/gemet/concept/340, with skos:preflabel alloy, is linked through blank nodes carrying iaaa:probability values: by map:majormatch, with probability 91.00727, to http://wordnet.princeton.edu/wordnet_2.0/13751474 (alloy, metal), and by map:minormatch, with probability 8.992731, to http://wordnet.princeton.edu/wordnet_2.0/13664144 (admixture, alloy); the inverse relations are iaaa:hasmajormatch and iaaa:hasminormatch)

figure 6. gui component integration (a desktop tool asks the thesaurusbeanmanager for a component, with type: tree and thesaurus: gemet, and obtains a thesaurusbean)

all of these viewers are synchronized, so the selection of a concept in one of them produces the selection of the same concept in the others. the layered architecture described previously allows these viewers to be reused in many situations, including other parts of the thmanager tool. for example, in the thesaurus metadata editor described before, the thesaurus viewer is used to facilitate the selection of values for the subject section of metadata. also, in the thesaurus editor shown later, the thesaurus viewer simplifies the selection of a concept related (by some kind of relation) to the selected one, and provides a preview of the hierarchical viewer to help to detect wrong relations.

the third available operation is to edit the thesaurus structure. here, to create a thesaurus following the skos model, an editing component is provided (see figure 10). the graphical interface shows a list with all the concepts created in the selected thesaurus, allowing the creation of new ones (providing their uris) or deletion of selected ones. once a concept has been selected, its properties and relations to other concepts are shown, allowing the creation of new ones and the deletion of others. to facilitate the creation of relations between concepts, a selector of concepts (based on the thesaurus viewer) is provided, allowing the user to add related concepts without manually typing the uri of the associated concept. also, to see if the created thesaurus is correct, a preview of the hierarchical viewer can be shown, allowing the user to easily detect problems in the broader and narrower relations.
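because broader and narrower are inverse relations, one kind of problem in them can also be detected programmatically. the python sketch below is not a feature described for thmanager, which relies on the visual preview; it is a hypothetical complement that assumes concepts are held in a map from uri to relations, and all uris in it are invented for the example.

```python
def find_asymmetric_links(concepts):
    """List broader/narrower links whose inverse link is missing."""
    problems = []
    for uri, relations in concepts.items():
        for parent in relations.get("broader", []):
            if uri not in concepts.get(parent, {}).get("narrower", []):
                problems.append((uri, "broader", parent))
        for child in relations.get("narrower", []):
            if uri not in concepts.get(child, {}).get("broader", []):
                problems.append((uri, "narrower", child))
    return problems


# Hypothetical three-concept thesaurus; "brass" lacks its broader link to "alloy".
concepts = {
    "urn:example:metal": {"narrower": ["urn:example:alloy"]},
    "urn:example:alloy": {
        "broader": ["urn:example:metal"],
        "narrower": ["urn:example:brass"],
    },
    "urn:example:brass": {},
}
problems = find_asymmetric_links(concepts)
```

each reported triple names the concept, the relation that has no inverse, and the target concept, which is enough to locate the concept in the editor.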
with respect to the interrelation functionality, at the moment the mapping obtained is shown in the thesaurus viewers, but navigation between equivalent concepts of two thesauri must be done manually by the user. however, a navigation component still under development will allow the user to jump from a concept in a thesaurus to concepts in others that are mapped to the same concept in the common core.

as mentioned before, for efficiency, the format used to store the thesauri in the repository is binary, but the interchange format used is skos. a module for thesauri importation and exportation is therefore provided. this module is able to import from and export to skos. in addition, if the thesaurus has been interrelated with respect to the concept core, it is able to export its mapping to the concept core using the extended version of skos mapping described above.

■ results of the work

this section shows some experiments performed with the thmanager tool for the storage and management of a selected set of thesauri. in particular, this set of thesauri is relevant in the context of the geographic information community. the increasing relevance of geographic information for decision-making and resource management in different areas of government has promoted the creation of geo-libraries and spatial data infrastructures to facilitate distribution of and access to geographic information (nogueras-iso, zarazaga-soria, and muro-medrano, 2005). in this context, complex metadata schemes, such as iso-19115, have been proposed for a full-detail description of resources. many of the metadata elements in these schemes are either constrained to a selected vocabulary (iso-639 for language encoding, iso-3166 for country codes, and so on), or the user is told to pick a term from the most suitable thesaurus.
the problems with this second case are that the choice of thesauri is typically quite open, the thesauri are frequently large, and the exchange formats of available thesauri are quite heterogeneous.

in such a context, the thmanager tool has proven to be very useful for simplifying the management of the thesauri used. at the moment, eighty kos, comprising thesauri and other types of controlled vocabulary, have been created or transformed to skos and managed through this tool. table 1 shows some of them, indicating their names (name column), the number of concepts (nc column), their total number of properties and relations (np and nr columns), and the number of languages in which concept properties are provided (nl column). to give an idea of the cost of loading these structures, the sizes of skos and binary files (ss and sb columns) are provided in kilobytes (kb).

figure 7. thesaurus selector

figure 8. thesaurus metadata editor

additionally, table 1 compares the load performance of thmanager with that of other tools that load the thesauri directly from an rdf file using the jena library (times were measured on a 3 ghz pentium iv processor). for this purpose, three different load times (in seconds) have been computed. the bt column contains the load time of binary files without the cost of creating the gui for the thesauri viewers. the lt column contains the total load time of binary files (including the time of gui creation and drawing). the jt column contains the time spent by a hypothetical rdf-based editor tool to invoke jena and load the rdf skos files containing the thesauri into its memory model (it does not include gui creation). the difference between the bt and lt columns shows the time used to draw the gui once the thesauri have been loaded in memory.
the difference between the bt and jt columns shows the time gained by using binary storage instead of rdf-based storage.

the thesauri shown in the table are the adl feature types thesaurus (adl ftt), the isoc thesaurus of geography (isoc-g), the iso-639, the unesco thesaurus (unesco 1995), the ogp surveying and positioning committee code lists (epsg) (ogp 2006), the multilingual agricultural thesaurus (agrovoc), the european vocabulary thesaurus (eurovoc) (eupo 2005), the european territorial units (spain and france) (etu), and the general multilingual environmental thesaurus (gemet). they have been selected because they have different sizes and can be used to show how the load time evolves with the thesaurus size.

among them, gemet and agrovoc can be highlighted. although they are provided as skos, they include nonstandard extensions that we have transformed to standard skos relations and properties. eurovoc and unesco are examples of thesauri provided in formats other than skos that we have completely transformed into skos. the former was in an xml-based format, and the latter used a plain-text format. another thesaurus transformed to skos is the european territorial units, which contains the administrative political units in spain and france. here, the original source was a collection of heterogeneous documents that contained parts of the needed information and have been processed to generate a skos file. some classification schemes have also been transformed to skos, such as the iso-639 and the different epsg codes for coordinate reference systems (including datums, ellipsoids, and projections). with respect to controlled vocabularies created (by the authors) in skos using the thmanager tool, there is an extended version of the adl feature types that includes a more detailed classification of feature types and different glossaries used for resource classification.
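the three load-time columns can be compared with a few lines of code. the sketch below is illustrative only: the numbers are the lt, bt, and jt values reported in table 1, while the variable and function names are assumptions introduced for this example. it derives the gui drawing cost (lt minus bt) and the ratio of binary to rdf model loading (bt divided by jt) for each thesaurus.

```python
# (name, lt, bt, jt): load times in seconds, copied from table 1
LOAD_TIMES = [
    ("adl ftt", 0.4, 0.047, 0.062),
    ("isoc-g", 2.4, 1.063, 1.797),
    ("iso-639", 5.1, 1.969, 2.89),
    ("unesco", 2.1, 1.406, 2.984),
    ("epsg", 1.8, 0.969, 1.796),
    ("agrovoc", 7.5, 4.953, 14.75),
    ("eurovoc", 11.1, 9.266, 15.828),
    ("etu", 13.3, 10.625, 17.844),
    ("gemet", 13.7, 11.828, 25.61),
]


def summarize(rows):
    """Return, per thesaurus, the GUI cost (lt - bt) and the bt / jt ratio."""
    return {
        name: {
            "gui_seconds": round(lt - bt, 3),
            "binary_vs_rdf": round(bt / jt, 3),
        }
        for name, lt, bt, jt in rows
    }


summary = summarize(LOAD_TIMES)
for name, values in summary.items():
    print(f"{name:8s} gui={values['gui_seconds']:6.3f}s  "
          f"bt/jt={values['binary_vs_rdf']:.3f}")
```

with the table 1 values, the binary model loads faster than the jena model for every thesaurus listed, and the largest gui cost belongs to iso-639.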
figure 11 depicts the comparison of the different load times shown in table 1 with respect to the size of the rdf skos files. the order of the thesauri in the figure is the same as in table 1. it can be seen that the time to construct the model using a binary format is almost half the time spent to create the model using an rdf file. in addition, once the binary model is loaded, the time to generate the gui is not very dependent on thesaurus size. this is possible thanks to the redundant information added to facilitate access to top concepts and to speed up loading of the alphabetic viewer. this redundant information produces an overhead in the load of the model, but without it the drawing time would be much worse, as the information would have to be generated on the fly.

however, in spite of the improvements, for the larger thesauri considered, the load time starts to be long, given that it includes the time to load the whole structure of the thesaurus in memory and to create the objects used to manage it quickly once loaded. but, once it is loaded, future accesses are immediate (under 0.5 seconds). these accesses include opening it again, navigating by thesaurus relations, changing the visualization language, and searching concepts by their preferred labels. to minimize the load time, thesauri can be loaded in the background when the application is launched, reducing, in that way, the user perception of the load time.

figure 9. thesaurus concept selector

figure 10. thesaurus concept editor

another interesting aspect in figure 11 is the peak of the third element. it corresponds to the iso-639 classification scheme. it has the special characteristic of not having hierarchy and having many notations.
these two characteristics produce a small increase in the model load time, given that the top concepts list contains all the concepts and the notations are more complex than other relations. but most of the time is used to generate the gui of the tree viewer. the tree viewer gets all the concepts that are top terms, and for each one it asks for its preferred label in the selected language and sorts them alphabetically to show the first level of the tree. this is fast for a few hundred concepts, but not for the 7,599 in the iso-639. however, this problem could be easily solved if the metadata contained a description of the type of kos to visualize. if the tool knew that the kos does not have broader and narrower relations, it could use the structures used to visualize the alphabetic list, which are optimized to show all of the kos concepts rapidly, instead of trying to load it as a tree.

the persistence approach used has the advantage of not requiring external persistence systems, such as a dbms, and providing rapid access after loading, but it has the drawback of loading all thesauri in memory (with the associated cost in time and space). so, for much bigger thesauri, the use of some kind of dbms would be necessary. if this change were necessary, minimum modifications would be needed (one class). however, if not all the concepts are loaded, the alphabetic viewer (which shows all the concepts) would have to be updated (for example, showing the concepts by pages) or it would become too slow to work with.

■ conclusions

this article has presented a tool for managing the thesauri needed in a digital library, for creating metadata, and for running search processes using skos as the interchange format. this work reviews the tools that are available to edit thesauri, highlighting the lack of a formalized way to exchange thesauri and the difficulty of integrating those tools in other environments.
this work selects skos from the available interchange formats for thesauri as the most promising format to become a standard for thesaurus representation, and highlights the lack of tools that are able to manage it properly.

the thmanager tool is offered as the solution to these problems. it is an open source tool that can manage thesauri stored in skos, allowing their visualization and editing. thanks to the layered architecture, its components can be easily integrated in other applications that need to use thesauri or other controlled vocabularies. additionally, the components can be used to control the possible values used in a web search service to facilitate traditional or exploratory searches based on a controlled vocabulary.

the performance of the tool is demonstrated through a series of experiments on the management of a selected set of thesauri. this work analyzes the features of this selected set of thesauri and compares the efficiency of this tool with respect to other tools that load the thesauri directly from an rdf file. in particular, it is shown that the internal representation used by thmanager helps to decrease the time spent for the graphical loading of thesauri, facilitating navigation of the thesaurus contents as well as other typical operations, such as sorting or change of visualization language.

additionally, it is worth noting that the tool can be used as a library of components to simplify the integration of thesauri in other applications that require the use of controlled vocabularies. thmanager has been integrated within the open source catmdedit tool (zarazaga-soria et al. 2003), a metadata editor tool for the documentation of geographic information resources (metadata compliant with the iso-19115 geographic information metadata standard). the thesaurusbeans provided in the thmanager library have been used to facilitate keyword selection for some metadata elements. the thmanager component library also has contributed to the development of catalog search systems guided by controlled vocabularies. for instance, it has been used to build a thematic catalog in the sdiger project (zarazaga-soria 2007). sdiger is a pilot project on the implementation of the infrastructure for spatial information in europe (inspire) for the development of a spatial data infrastructure to support access to geographic information resources concerned with the european water framework directive. thanks to the thmanager components, the thematic catalog allows browsing of resources by means of several multilingual thesauri, including gemet, unesco, agrovoc, and eurovoc.

table 1. sizes of some thesauri and other types of vocabularies (lt, bt, and jt in seconds; ss and sb in kb)

name     nc      np       nr      nl  lt    bt      jt      ss      sb
adl ftt  210     210      408     1   0.4   0.047   0.062   103     41
isoc-g   5,136   5,136    1,026   1   2.4   1.063   1.797   2,796   1,332
iso-639  7,599   16,247   0       6   5.1   1.969   2.89    3,870   3,017
unesco   8,600   13,281   21,681  3   2.1   1.406   2.984   4,034   2,135
epsg     4,772   9,544    0       1   1.8   0.969   1.796   2,935   1,682
agrovoc  16,896  103,484  30,361  3   7.5   4.953   14.75   15,859  5,089
eurovoc  6,649   196,391  20,861  15  11.1  9.266   15.828  18,442  11,483
etu      44,991  89,980   89,976  2   13.3  10.625  17.844  23,828  10,412
gemet    5,244   326,602  12,750  21  13.7  11.828  25.61   28,010  15,048

future work will enhance the functionalities provided by thmanager. first, the ergonomics will be improved to show connections between different thesauri. currently, these connections can be computed and annotated, but the gui does not allow the user to navigate them.
as the base technology already has been developed, only a graphical interface is needed. second, the tool will be enhanced to support data types other than text (for example, images, documents, or other multimedia sources) for the encoding of concepts’ property values. third, it has been noted that the thesauri concepts can evolve with time. thus, a mechanism for managing the different versions of thesauri will be necessary in the future. finally, improvements in usability also are expected. thanks to the component-based design of thmanager widgets (thesaurusbeans), new viewers or editors can be readily created to meet the needs of specific users.

■ acknowledgments

this work has been partially supported by the spanish ministry of education and science through the projects tin2006-00779 and tic2003-09365-c02-01 from the national plan for scientific research, development, and technology innovation. the authors would like to express their gratitude to juan josé floristán for his support in the technical development of the tool.

references

american national standards institute (ansi). 1993. guidelines for the construction, format, and management of monolingual thesauri. ansi/niso z39.19-1993. revision of z39.19.
batschi, wolf-dieter, et al. 2002. superthes: a new software for construction, maintenance, and visualisation of multilingual thesauri. http://www.t-reks.cnr.it/docs/st_enviroinfo_2002.pdf (accessed sept. 6, 2007).
british standards institute (bsi). 1985. guide to establishment and development of multilingual thesauri. bs 6723.
british standards institute (bsi). 1987. guide to establishment and development of monolingual thesauri. bs 5723.
ceres/nbii. 2003. the ceres/nbii thesaurus partnership project. http://ceres.ca.gov/thesaurus/ (accessed june 12, 2007).
cross, phil, dan brickley, and traugott koch. 2001. rdf thesaurus specification. technical report 1011, institute for learning and research technology.
http://www.ilrt.bris.ac.uk/discovery/2001/01/rdf-thes/ (accessed june 12, 2007).
denny, michael. 2002. ontology building: a survey of editing tools. xml.com. http://xml.com/pub/a/2002/11/06/ontologies.html (accessed june 12, 2007).
european environment agency (eea). 2004. general multilingual environmental thesaurus (gemet). version 2.0. european environment information and observation network. http://www.eionet.europa.eu/gemet/rdf (accessed june 12, 2007).
european union publication office (eupo). 2005. european vocabulary (eurovoc). publications office. http://europa.eu/eurovoc/ (accessed june 12, 2007).
food and agriculture organization of the united nations (fao). 2006. agriculture vocabulary (agrovoc). agricultural information management standards. http://www.fao.org/aims/ag%20alpha.htm (accessed june 12, 2007).
gonzalo, julio, et al. 1998. applying eurowordnet to cross-language text retrieval. computers and the humanities 32, no. 2/3 (special issue on eurowordnet): 185–207.
heery, rachel, et al. 2005. jisc metadata schema registry. in 5th acm/ieee-cs joint conference on digital libraries, 381–81. new york: acm pr.
hill, linda, and qi zheng. 1999. indirect geospatial referencing through place names in the digital library: alexandria digital library experience with developing and implementing gazetteers. in asis ’99: proceedings of the 62nd asis annual meeting: knowledge: creation, organization, and use, 57–69. medford, n.j.: information today, for the american society for information science.

figure 11. thesaurus load times (load time in seconds versus skos file size in kb; series: rdf (jena) and binary thmanager)

hodge, gail. 2000. systems of knowledge organization for digital libraries: beyond traditional authority files. washington, d.c.: the digital library federation.
international organization for standardization (iso). 1985. guidelines for the establishment and development of multilingual thesauri. iso 5964.
international organization for standardization (iso). 1986. guidelines for the establishment and development of monolingual thesauri. iso 2788.
international organization for standardization (iso). 2002. codes for the representation of names of languages. iso 639.
international organization for standardization (iso). 2003. information and documentation—the dublin core metadata element set. iso 15836:2003.
janée, greg, satoshi ikeda, and linda l. hill. 2003. the adl thesaurus protocol. http://www.alexandria.ucsb.edu/~gjanee/thesaurus/ (accessed june 12, 2007).
lesk, michael. 1997. practical digital libraries. san francisco: books, bytes, and bucks.
matthews, brian m., et al. 2001. internationalising data access through limber. in third international workshop on internationalisation of products and systems: 1–14. milton keynes (uk). http://epubs.cclrc.ac.uk/bitstream/401/limber_iwips.pdf (accessed june 12, 2007).
miles, alistair, and dan brickley, eds. 2004. skos mapping vocabulary specification. w3c. http://www.w3.org/2004/02/skos/mapping/spec/2004-11-11.html (accessed june 12, 2007).
miles, alistair, brian matthews, and michael wilson. 2005. skos core: simple knowledge organization for the web. in 2005 dublin core annual conference—vocabularies in practice, 5–13. madrid: universidad carlos iii de madrid.
miller, george a. 1990. wordnet: an on-line lexical database. int. j. lexicography 3: 235–312.
mindswap group. 2006. swoop: a hypermedia-based featherweight owl ontology editor. maryland information and network dynamics lab. semantic web agents project. http://www.mindswap.org/2004/swoop/ (accessed june 12, 2007).
nogueras-iso, javier, francisco javier zarazaga-soria, and pedro rafael muro-medrano. 2005.
geographic information metadata for spatial data infrastructures—resources, interoperability, and information retrieval. new york: springer verlag.
noy, natalie f., ray w. fergerson, and mark a. musen. 2000. the knowledge model of protégé2000: combining interoperability and flexibility. in knowledge engineering and knowledge management: methods, models, and tools: 12th international conference, ekaw 2000, juan-les-pins, france, october 2–6, 2000: proceedings, 1–20 (lecture notes in computer science, 1937). new york: springer.
ogp surveying & positioning committee. 2006. surveying and positioning. http://www.epsg.org/ (accessed june 12, 2007).
semantic web advanced development for europe (swad-europe). 2001. semantic web advanced development for europe thesaurus activity. http://www.w3.org/2001/sw/europe/reports/thes (accessed june 12, 2007).
tolosana-calasanz, r., et al. 2006. semantic interoperability based on dublin core hierarchical one-to-one mappings. international journal of metadata, semantics, and ontologies 1, no. 3: 183–88.
tylor, mike. 2004. the zthes specifications for thesaurus representation, access, and navigation. http://zthes.z3950.org/ (accessed june 12, 2007).
united nations educational, scientific, and cultural organization (unesco). 1995. unesco thesaurus: a structured list of descriptors for indexing and retrieving literature in the fields of education, science, social and human science, culture, communication and information. paris: unesco publ.
u.s. library of congress. network development and marc standards office. 2004. marc standards. http://www.loc.gov/marc/ (accessed june 12, 2007).
wielemaker, jan, guus schreiber, and bob wielinga. 2005. using triples for implementation: the triple20 ontology-manipulation tool (lecture notes in computer science, 3729): 773–85. new york: springer.
zarazaga-soria, francisco javier, et al. 2003. a java tool for creating iso/fgdc geographic metadata.
in geodaten- und geodienste-infrastrukturen—von der forschung zur praktischen anwendung: beiträge zu den münsteraner gi-tagen, 26/27. juni 2003 (ifgiprints, 18). münster, germany: institut für geoinformatik, universität münster.
zarazaga-soria, francisco javier, et al. 2007. providing sdi services in a cross-border scenario: the sdiger project use case. in research and theory in advancing spatial data infrastructure concepts, 113–26. redlands, calif.: esri.

current trends and goals in the development of makerspaces at new england college and research libraries
ann marie l. davis
information technology and libraries | june 2018 94

ann marie l. davis (davis.5257@osu.edu) is faculty librarian of japanese studies at the ohio state university.

abstract

this study investigates why and which types of college and research libraries (crls) are currently developing makerspaces (or an equivalent space) for their communities. based on an online survey and phone interviews with a sample population of crls in new england, i found that 26 crls had or were in the process of developing a makerspace in this region. in addition, several other crls were actively promoting and diffusing the maker ethos. of these libraries, most were motivated to promote open access to new technologies, literacies, and stem-related knowledge.

introduction and overview

makerspaces, alternatively known as hackerspaces, tech shops, and fab labs, are trendy new sites where people of all ages and backgrounds gather to experiment and learn.
born of a global community movement, makerspaces bring the do-it-yourself (diy) approach to communities of tinkerers using technologies including 3d printers, robotics, metal- and woodworking, and arts and crafts.1 building on this philosophy of shared discovery, public libraries have been creating free programs and open makerspaces since 2011.2 given their potential for community engagement, college and research libraries (crls) have also been joining the movement in growing numbers.3

in recent years, makerspaces in crls have generated positive press in popular and academic journals. despite the optimism, scholarly research that measures their impact is sparse. for example, current library and information science literature overlooks why and how various crls choose to create and maintain their respective makerspaces. likewise, there is scant data on the institutional objectives, frameworks, and experiences that characterize current crl makerspace initiatives.4

this study begins to fill this gap by investigating why and which types of crls are creating makerspaces (or an equivalent room or space) for their library communities. specifically, it focuses on libraries at four-year colleges and research universities in new england. throughout this study, makerspace is used interchangeably with other terms, including maker labs and innovation spaces, to reflect the variation in names and objectives that underlie the current trends. in exploring their motives and experiences, this article provides a snapshot of the current makerspace movement in crls.

https://doi.org/10.6017/ital.v37i2.9825

the study finds that the number of crls actively involved in the makerspace movement is growing.
in addition to more than two dozen that have or are in the process of developing a makerspace, another dozen crls have staff who support the diffusion of maker technologies, such as 3d printing and crafting tools that support active learning and discovery, in the campus library and beyond.5 comprising research and liberal arts schools, public and private, and small and large, the crls involved with makerspaces are strikingly diverse. despite these differences, this population is united by common objectives to promote new literacies, provide open access to new technologies, and foster a cooperative ethos of making.

literature review

the body of literature on library makerspaces is brief, descriptive, and often didactic. given the newness of the maker movement in public and academic libraries, many articles focus on early success stories and defining the movement vis-à-vis the mission of the library. for instance, laura britton, known for having created the first makerspace in a public library (the fayetteville free library’s fabulous laboratory), defines a makerspace as “a place where people come together to create and collaborate, to share resources, knowledge, and stuff.”6 this definition, she determines, is strikingly similar to that of the library.

most literature on makerspaces appears in academic blogs, professional websites, and popular magazines. among the most frequently cited is tj mccue’s article, which celebrates britton’s (née smedley) fablab while distilling the intellectual underpinnings of the makerspace ethos.7 phillip torrone, editor of make: magazine, supports smedley’s project as an example of “rebuilding” or “retooling” our public spaces.8 within this camp, david lankes, professor of information studies at syracuse university, applauds such work as activist and community-oriented librarianship.9

many authors emphasize the philosophical “fit,” or intersection, of public makerspaces with the principles of librarianship.
building on torrone’s work, j. l. balas claims that creating access to resources for learning and making is in keeping with the “library’s historical role of providing access to the ‘tools of knowledge.’”10 others emphasize the hands-on, participatory, and intergenerational features of the maker movement, which has the potential to bridge the digital divide.11 still others identify areas of literacy, innovation, and ste(a)m skills where library makerspaces can have a broad impact.

while public libraries often focus on early childhood or adult education, crls adopt separate frameworks for information literacy. like public libraries, they aim to build (meta)literacies and ste(a)m skills. nevertheless, their programs are often tailored to curricular goals in the arts and sciences or specialized degrees in engineering, education, and business. this is especially true of crls situated within large, research-intensive universities. considering their specific missions and aims, this study seeks to identify the goals and challenges that reinforce the development of makerspaces in undergraduate and research environments.

research design and method

data presented in this study was gathered from library directors (or their designees) through an online survey and oral telephone interviews. after choosing a sampling frame of crls in new england, i developed a three-path survey, sent invitations, and collected and analyzed data using the online platform surveymonkey. the survey was distributed following review by the institutional review board (irb) at southern connecticut state university, where i completed a master of library science (mls) degree.

survey population

to assess generalized findings for the larger population in north america, i chose a cluster-sampling approach that limited the survey population to the crls in new england.
in generating the sampling frame, i included four-year and advanced-degree institutions based on the assumption that libraries at these schools supported specialized, research, or field-specific degrees. i omitted for-profit and two-year institutions, based on the assumption that they are driven by separate business models. this process generated a contact list of 182 library directors at the designated crls in connecticut, maine, massachusetts, new hampshire, rhode island, and vermont.

survey design

the purpose of the survey was to gather basic data about the size and structure of the respondents’ institutions and to gain insights on their views and practices regarding makerspaces (the survey is reproduced in the appendix). the first page of the survey contained a statement of consent, including my contact information and that of my irb. after a short set of preliminary questions, the survey branched into one of three paths based on respondents’ answers about makerspaces. the respondents were thus categorized into one of three groups: path one (p1) for those with no makerspace and no plans to create one, path two (p2) for those with plans to develop a makerspace in the near future, and path three (p3) for those already running a makerspace in their libraries. p3 was the longest section of the survey, containing several questions about p3 experiences with makerspaces such as staffing, programming, and objectives.

data collection

in summer 2015, brief email invitations and two reminders were sent to the targeted population.12 to increase the participation rate, i sometimes wrote personal emails and made direct phone calls to crls known to have a makerspace. for cold-call interviews, i developed a script explaining the nature of the online survey. after obtaining informed consent, i proceeded to ask the questions in the online survey and manually enter the participants’ responses at the time of the interview.
on a few occasions, online respondents followed up with personal emails volunteering to discuss their library’s experiences in more detail. i took advantage of these invitations, which often provided unique and welcome insights.

in analyzing the responses, i used tabulated frequencies for quantitative results and sorted qualitative data into two different categories. the first category was identified as “short and objective” and coded and analyzed numerically. the longer, more “subjective and value-driven” data was analyzed for common trends, relationships, and patterns. within this second category, i also identified outlier responses that suggested possible exceptions to common experiences.

results

the survey closed after one month of data collection. at this time, 55 of 182 potential respondents had participated, yielding a response rate of 30.2%. among these participants, the survey achieved a 100.0% response rate (9 completed surveys of 9 targeted crls) among libraries that were currently operating makerspaces. i created a list of all known crl makerspaces in new england based on an exhaustive website search of all crls in this region. subsequent interviews with the managers of the makerspaces on this list revealed no other hidden or unknown makerspaces in this region. of the 55 respondents, 29 (52.7%) were in p1, 17 (30.9%) were in p2, and 9 (16.4%) were in p3. (see figure 1.)

figure 1. survey participants’ (n = 55) current crl efforts and plans to develop and operate a makerspace.

among respondents in p2 and p3, the majority (13 of 23) indicated that they were from libraries that served a student population of 4,999 people or fewer, while only one library served a population of 30,000 or more (see figure 2).
in terms of sheer numbers, makerspaces might seem to be gaining traction at smaller crls, but proportionally, one cannot say that smaller crls are adopting makerspaces at a higher rate because the majority of survey participants have student populations of 19,999 or fewer (51, or 91.1%). institutions with populations over 20,000 were in a clear minority (5, or 8.9%). (see figure 3.)

figure 2. p2 and p3 crls with makerspaces or concrete plans to develop a makerspace.

figure 3. the majority of crls (67.2%) that participated in the survey had a population of 4,999 students or fewer. only 1.8% of schools that participated had a population of 30,000 students or more.

crls with no makerspace (p1 = 29)

in the first part of the survey, the majority of p1 respondents demonstrated positive views toward makerspaces despite having no plans to create one in the near future. budgetary and space limitations aside, many were relatively open to the possibility of developing a makerspace in a more distant future. in the words of one respondent, “we have several areas within the library that present a heavy demand on our budget. in [the] future, we would love to consider a makerspace, and whether it would be a sensible and appropriate investment that would benefit our students.”

when asked what their reasons were for not having a makerspace, some respondents (8, or 27.6%) said they had not given it much thought, but most (21, or 72.4%) offered specific answers. among these, the most frequently cited reason (11, or 37.8%) was that a library makerspace would be redundant: such spaces and labs were already offered in other departments within the institution or in the broader community. at one crl, for example, the respondent said the library did not want to compete with faculty initiatives elsewhere on campus.
other reasons included that makerspaces were expensive and not a priority. some (5, or 17.2%) libraries preferred to allocate their funds to different types of spaces such as “a very good book arts studio/workshop” or “simulation labs.” some (6, or 20.7%) shared concerns about a lack of space, staff, or simply “a good culture of collaboration [on campus].” merging these sentiments, one respondent concluded, “people still need the library to be fairly quiet. . . . having makerspace equipment in our library would be too distracting.”

while some were skeptical (sharing concerns about potential hazards or that makerspaces were simply “the flavor of the month”), the majority (roughly 60%) were open and enthusiastic. one respondent, in fact, held a leadership position in a community makerspace beyond campus. according to this librarian, 3d printers, scanners, and laser cutters were sure to become more common, and crls would no doubt eventually develop “a formal space for making stuff.”

crls with plans for a makerspace in the near future (p2 = 17)

the second section of the survey (p2) focused primarily on the motivations and means by which this cohort planned to develop a makerspace. when asked why they were creating a makerspace, the most common response was to promote learning and literacy (15 respondents, or 88.2%). in addition, a large majority (12 respondents, or 70.6%) felt that makerspaces helped to promote the library as relevant, particularly in the digital age. three more reasons that earned top scores (10 respondents each, or 58.8%) were being inspired by the ethos of making, creating a complement to digital repositories and scholarship initiatives, and providing access to expensive machines or tools. additional reasons included building outreach and responding to community requests.13 (see figure 4.)

figure 4. rationale behind p2 respondents’ decision to plan a makerspace (n = 17).
while p2 respondents indicated a clear decision to create a makerspace, their timeframes were noticeably different. i categorized their open responses into one of six timeframes: “within six months,” “within one year,” “within two years,” “within four years,” “within six years,” and “unknown.” the result presented a clear trimodal distribution with three subgroups: six crls with plans to open within 18 months, five with plans to open within the next two years, and six with plans to open after three or more years (see figure 5).

in addition to their timeframe, p2 respondents were also asked about their plans for financing their future makerspaces. based on their open responses, the following six funding sources emerged:

• the library budget, including surplus moneys or capital project funds
• internal funding, including from campus constituents
• donations and gifts
• external grants
• cost recovery plans, including small charges to users
• not sure/in progress

figure 5. p2 respondents’ timeframe for developing the makerspace (n = 17).

with seven mentions, the most common of the above funding sources was the “library budget.” with two mentions each, the least common sources were “cost recovery” and “not sure/in progress.” among those who mentioned external grant applications, one respondent mentioned a focus on women and stem opportunities, and another specifically discussed attempts at grants from the institute of museum and library services. (see figure 6.)

figure 6. p2 respondents’ plans for gathering and financing makerspace (n = 17).

regarding target user groups, some respondents focused on opportunities to enhance specific disciplinary knowledge, while others emphasized a general need for creating a free and open environment.
one respondent mentioned that at her state-funded library, the space would be “geared to younger [primary and secondary school] ages,” “student teachers,” and “librarians on practicum assignments.” by contrast, another respondent at a large, private, carnegie r1 university emphasized that the space was earmarked for the undergraduate and graduate students.

in contrast to the cohort in p1, a notable number in p2 chose to create a makerspace despite the existence of maker-oriented research labs elsewhere on campus. as one respondent noted, the university was still “lacking a physical space where people could transition between technologies” and an open environment “where students doing projects for faculty” could come, especially later in the evenings. another respondent at a similarly large, private institution explained that his colleagues recognized that most labs at their university were earmarked for specific professional schools. as a result, his colleagues came up with a strategy to provide self-service 3d printing stations at the media center, located in the library at the heart of campus.

crls with operating makerspaces (p3 = 9)

the final section of the survey (p3) focused on the motivations and means by which crls with makerspaces already in operation chose to develop and maintain their sites. in addition, this section gathered information on p3 crl funding decisions, service models, and types of users in their makerspaces.

of the nine respondents in this path, all had makerspaces that had opened within the last three years. among these, nearly half (4) had been in operation from one to two years; a third (3) had operated for two to three years; and two had opened within the last year. (see table 1.)

table 1. length of time the crl makerspace has been in operation for p3 respondents (n = 9).
age of crl makerspace or lab: p3

answer options        responses    %
less than 6 months    1            11.1
6–12 months           1            11.1
1–2 years             4            44.4
2–3 years             3            33.3
more than 3 years     0            0.0
total responses       9            100.0

priorities and rationale

the reasons behind p3 decisions to make a makerspace were slightly different from those of p2. while “promoting literacy and learning” was still a top priority, two other reasons, “promoting the maker culture of making” and “providing access to expensive machinery,” were deemed equally important (6 respondents, or 66.7%, for each). other significant priorities included “promoting community outreach” (4 respondents, or 44.4%), as well as “promoting the library as relevant” and a “direct response to community requests” (3 respondents, or 33.3%, for each). (see figure 7.)

figure 7. rationale behind p3 respondents’ decision to develop and maintain a makerspace (n = 9).

the answer of “other” was also given top priority (5 respondents, or 55.6%). i conclude that this indicated a strong desire among respondents to express in their own words their library’s unique decisions and circumstances. (their free responses to this question are discussed below.)

a familiar theme in the responses of the five respondents who elaborated on their choice of “other” was the desire to situate a makerspace in the central and open environment of the campus library. as one participant noted, there were “other access points and labs on campus,” but those labs were “more siloed” or cut off from the general population. by contrast, the campus library aimed to serve a broader population and anticipated a general “student need.” later, the same respondent added that the makerspace was an opportunity to promote social justice, cultivate student clubs, and encourage engagement at the hub of the campus community.
this type of ecumenical thinking was manifested in a similar remark that the library’s role was to reinforce other learning environments on campus. one respondent saw the makerspace as an additional resource “that complemented the maker opportunities that we have had in our curriculum resource center for decades.” likewise, the library makerspace was intended to offer opportunities to a range of users on campus and beyond.

funding, staffing, and service models

when prompted to discuss how they gathered the resources for their makerspaces, the largest group (4 respondents) stated that a significant means for funding was through gifts and donations. thus, the majority of crl makerspaces in new england depended primarily on contributions from friends of the library, university/college alumni, and donors. the second most common source (3 respondents) was through the library budget, including surplus money at the end of the year. making use of grant money and cost recovery were mentioned by two library participants, and internal and constituent support was useful for two libraries. (see figure 8.)

figure 8. p3 methods for gathering and financing a makerspace (n = 9).

among these, a particularly noteworthy case was a makerspace that had originated from a new student club focused on 3d printing. originally based in a student dorm, the club was funded by a campus student union, which allocated grant money to students through a budget derived from the college tuition. as the club quickly grew, it found significant support in the library, which subsequently provided space (on the top floor of the library), staff, and financial support from surplus funds in the library budget.

as this example would suggest, the sum of the responses showed that financing the makerspaces depended on a combination of strategies.
one participant summarized it best: “we’ve slowly accumulated resources over time, using different funding for different pieces. some grant funding. mostly annual budget.”

regarding service models, more than half of these libraries (five) offered a combination of programming and open lab time where users could make appointments or just drop in. by contrast, two of the libraries offered programs only, and did not offer an open lab; another two did the opposite, offering no programming but an open makerspace at designated times. of the latter, one was open monday to friday from 8 a.m. to 4 p.m., and the other was open during regular hours, with spaces that “can be booked ahead for classes or projects.” most labs supported drop-in visitors and were open evenings and weekends. at one makerspace, where there was increasingly heavy demand, the staff required students to submit proposals with project goals. (see table 2.)

while some libraries brought in community experts, others held faculty programs, and some scheduled lab time for individual classes. one makerspace prioritized not only the campus, but also the broader community, and thus featured programs for local high schools and seniors. responses from this library emphasized the social justice thread that inspired their work and the community culture that they aimed to foster.

table 2. model for services offered in the crl makerspace or 3d printing lab.

do you offer programs in the makerspace/lab or is it simply opened at defined times for users to use?

answer options                                                                            responses    %
yes, we offer the following types of programs.                                            2            22.2
no, we simply leave the makerspace/lab open at the specific times.                        2            22.2
we do both. we offer the programs and leave the makerspace/lab open at specific times.    5            55.6

as this data would suggest, most makerspaces were used by students (undergraduates and graduates) and faculty, in addition to local experts and generational groups. survey responses showed that undergraduate students were the most common users (9 of 9 respondents checked this group as the most frequent type of user), and faculty and graduate students were the second and third most common user groups in the labs (8 of 9 respondents checked these groups as most frequent). local entrepreneurs, artists, designers, craftspeople, and campus and library staff also used the makerspaces. (see figure 9.) when prompted to identify “other” categories, one respondent specifically listed “learners, makers, sharers, studiers, [and] clubs.”

figure 9. of the different types of users listed above, p3 respondents ranked them in order of who used the makerspace or equivalent lab most often (n = 9).

the number and type of staff that managed and operated the makerspaces also varied widely at the nine crls in p3. seven of the crls employed full-time, dedicated staff, among whom four participants checked off the “dedicated staff”–only options. of the remaining two crls, one reported staffing the makerspace with only one student, and one reported not having any staff working in the makerspace. i assume that the makerspace with no employees is managed by staff and students who are assigned to other, unspecified library departments or work groups. (see figure 10.)

figure 10. the staffing situations at the p3 respondents (n = 9), where each respondent is assigned a letter from “a” to “i.”

library programming was also diverse in terms of targeted audiences, speakers, and learning objectives. instructional workshops varied from 3d scanning and printing to soldering, felt making, sewing, knitting, robotics, and programming (e.g., raspberry pi).
the type of equipment contained in each lab is likely correlated with the range in programming; however, investigating these links was beyond the scope of this study. regarding this equipment, the size and activity of the participant crls varied considerably. some responses were more specific than others, and thus the resulting dataset was incomplete (see table 3).

challenges and philosophies of crl makerspaces

the final portion of the survey invited participants to freely offer their thoughts about operating a crl makerspace. what follows below is a summary of the two most prominent themes that emerged: the challenges of building the lab and the social philosophies that framed these initiatives.

in terms of challenges, the most common hurdle noted was the tremendous learning curve involved in establishing, maintaining, and promoting a makerspace. setting up some of the 3d printers, for example, required knowledge about electrical networks, computer systems, and safety policies at a federal and local level. once the hardware was running, lab managers needed to know how the machines interfaced with different and challenging software applications. communication skills were also critical, as one respondent reported, “printing anything and everything takes knowledge, experience.” communicating with stakeholders and users in accessible and proactive ways required strong teaching and customer service skills.

table 3. the types of tools and equipment used at p3 crl respondents (n = 8), which are assigned letters from a to h.

major equipment offered by individual library makerspaces or equivalent labs: path 3

crl label    response text

a    die cut machine, 3d printer, 3d pens, raspberry pi, arduino, makey makey, art supplies, sewing supplies, pretty much anything anyone asks for we will try to get.
b    2 makerbot replicators, 1 digital scanner, 1 othermill

c    3d printing, 3d scanning, and laser cutting.

d    3d printing, 3d scanning, laser cutting, vinyl cutting, large format printing, cnc machine, media production/postproduction.

e    no response

f    3 creatorx, 1 powerspec, 3 m3d, 2 replicator 2, 1 replicator2x, 1 makergear, 1 leapfrogxl, 1 ultimaker, 1 type a, 1 deltaprinter, 1 delta maker, 2 printrbot, 2 filabots, 2 x-box kinect for scanning, 2 oculus rifts, embedded systems cabinet with soldering stations, solar panels and micro controllers etc, 1 formlabs sla, 1 muve sla, rova 5, a bunch of quadcopters

g    3d printers (4 printers, 3 models), 3d scanning/digitizing equipment (3 models), raspberry pi, arduino, a laser cutter and engraving system, poster printer, digital drawing tablets, gopro, a variety of editing and design software, a number of tools (e.g. dremel, soldering iron, wrenches, pliers, hammers, etc.), and a number of consumable or misc. items (e.g. paint, electrical tape, acetone, safety equipment, led lights, screws and nails, etc.)

h    48 printers (all makerbot brand), 35 replicator 5th gen (a moderate size printer, 5 replicator z18 printers (larger built size), and 5 replicator minis, 3 replicator 2x); 5 makerbot digitzers (turntable scanners 8" by 8"); 1 cubify sense hand scanner; 7 still cameras for photogrammetry; 21 i-mac computers; 2 mac pros; 2 wacom graphics tablets (thinking about complementing other resources at other labs on campus)

another challenge that often came up was that of managing resources. as one respondent warned, crls should beware the “early adoption of certain technologies,” which can become “quickly outdated by a rapidly growing field.” for others, it was a challenge to recruit the right staff that could run and fix machines in constant need of repair.
in addition to hiring people with manufacturing and teaching skills, a successful lab required individuals who were savvy about outreach and community needs. despite such challenges, many respondents were eager to discuss the aspirations and rewards of crl makerspaces. above all, respondents focused on the pedagogical opportunities on the one hand, and the potential for outreach and social justice on the other. one participant conceded that measuring advances in literacy and education was “intangible,” but he saw great value in “giving students the experience of seeing their ideas come to fruition.” the excitement that this created for one student manifested in a buzz, and subsequently a “fever” or groundswell, in which more users came in to tinker and learn. meanwhile, the learning that took place among future professionals on campus was “critical,” even when results did not “go viral.” the aspiration to create human connections within and beyond campus was another striking theme. according to one respondent, the makerspace had “enabled some incredibly fruitful collaborations with different departments on campus.” this “fantastic outcome” was becoming more and more visible as the maker community grew. other crl makerspaces took pride in fostering a type of learning that was explicitly collaborative, exciting, and even “fun” for users. this in turn meant that some libraries were becoming “very popular,” generating a lot of “good pr,” and becoming central in the lives of new types of library users. along these lines, some respondents aimed to leverage the power of the makerspace to achieve social justice goals that resonated with core values of librarianship. according to one enthusiastic participant, the ethos of sharing was alive and strong among the staff and the many students who saw their participation in the lab as a lifestyle and culture of collaborating. 
in another initiative, the respondent looked forward to eventually offering grants to those users who proposed meaningful ways to use the makerspace to create practical value for the community. from this perspective, there was added value in having the 3d printing lab situated specifically on a college or university campus. according to this respondent, the unique quality of the crl makerspace was that by virtue of its location amid numerous and energetic young people, it was ripe for exploitation by those “who had great ideas and time and energy to do good.”

discussion

the aim of this study was to explore why and which types of crls had developed makerspaces (or an equivalent space) for their communities. of the 56 respondents, roughly half (46%) were p2 and p3 libraries that were currently developing or operating a makerspace, respectively. data from this survey indicated that none of the p2 or p3 crls fit a mold or pattern in terms of their size, educational models, or classifications.

upon analyzing the data, i found that the differentiators between the three groups were less clearly defined than originally anticipated. in one example of blurred lines, at least two respondents in p1 indicated that they were more actively engaged with makerspaces than two respondents in p2. despite not having physical labs within their libraries, these p1 respondents were in the process of actively supporting or making plans for a makerspace within their crl community.

one p1 respondent, for example, served on the planning board for a local community makerspace and had therefore “thoroughly investigated and used” the makerspace at a neighboring university. based on his knowledge, he decided to develop a complementary initiative (e.g., a book arts workshop) at his university library.
although his library did not yet have a formal makerspace, he felt confident that the diffusion of 3d printers would come to his library in the near future. another p1 respondent was responsible for administering faculty teaching and innovation grants. among the recent grant recipients were two faculty collaborators who used the library’s funds to build a makerspace at a campus location that was separate from the library. although the makerspace was not directly developed by the respondent’s library, it was nevertheless a direct product of his library’s programmatic support. the respondent reported that for this reason, his library did not want to compete with its own faculty initiatives. in another example of blurred distinctions, one librarian in p2 was as deeply immersed in providing access and education on makerspaces as his colleagues in p3. although he was not clear on when or how his library would finance a future makerspace, his library already offered many of the same services and workshops as p3 libraries. as a “maker in the library,” he offered noncredit-bearing 3d printing seminars to students and offered trial 3d printing services in the library for graduates of the 3d printing seminar. in addition, he made appearances at relevant campus events. when the university museum ran a 3d printing day, for instance, he participated as an expert panelist and gave public demonstrations on library-owned 3d printers and a scanner kinect bar. in sum, despite the respondents’ categorization in p1 and p2, they sometimes shared more in common with the cohorts in p2 and p3, respectively. given their library’s programmatic involvement in creating and endorsing the maker movement, these respondents were more than just “interested” or “open to” the prospect of creating a makerspace. while only 16% of crls (p3 = 9) responded as actively operating a makerspace, another 30% (p2 = 17) were involved in developing a makerspace in the near future. 
moreover, the number of crls formally involved with the diffusion of maker technologies was not limited to just these two groups. although some makerspaces were not directly run by the library, they had come to fruition because of library-based funding, grants, and professional support. and although some libraries did not have immediate plans for a makerspace, they were already promoting maker technologies and the maker ethos in other significant ways.

conclusion

this study is one of the first comprehensive and comparative studies on crl makerspace programs and their respective goals, policies, and outcomes. while the number of current crl makerspaces is relatively low, the data suggests that the population is increasing; a growing number of crls are involved in the makerspace movement. more than two dozen crls were planning to develop makerspaces in the near future, helping to diffuse maker technologies through crl programming, and/or supporting nonlibrary maker initiatives on campus and beyond. in addition, some crls were buying equipment, hiring dedicated staff, offering relevant workshops and demonstrations, and supporting community efforts to build labs beyond the library.

although the author aimed to find structural commonalities between crls in groups p2 and p3, none were found. respondents in these groups came from institutions of all sizes, a wide variety of endowment levels, and both public and private funding models, and they ranged in emphasis from the liberal arts to professional certifications and graduate-level research.

although a majority of crl respondents were not currently making plans to create a makerspace, many respondents were enthusiastic about current trends, and some even promoted the maker movement in unexpected ways. acknowledging the steady diffusion of 3d printers, many anticipated using such technologies in the future to promote traditional library values and goals.
respondents in p2 and p3 indicated that their primary rationale for developing a makerspace was to promote learning and literacy. other prominent reasons included promoting library outreach and the maker culture of learning. data from crls with makerspaces indicated that these benefits were often symbiotic and correlated to strong ideas about universal access to emergent tools and practices in learning.

unexpected challenges for developing and operating makerspaces included staffing them with highly skilled, knowledgeable, and service-oriented employees. learning the necessary skills (including operating the printers, troubleshooting models, and maintaining a safe environment, to name a few) was time-consuming and labor intensive. the majority of funding for crls with or planning maker labs came from internal budgets, gifts and donors, and some grants.

while some p1 crls indicated that their reason for not developing makerspaces was a lack of community interest, p2 and p3 crls were not necessarily motivated by user requests or needs, nor was lack of explicit need or interest a deterrent. on the contrary, a few reported a desire to promote the campus library as ahead of the curve by keeping in front of student and community needs.

in a similar contradiction, some p1 respondents reported that their libraries did not want to compete with other labs on campus. respondents from p2 and p3, however, wanted to offer an alternative to the more siloed or structured model of department- or lab-funded makerspaces. although makerspaces were sometimes forming in other parts of campus, some p2 and p3 crls felt there was a gap in accessibility and therefore aimed to offer more open and flexible spaces.

a final salient theme among p2 and p3 respondents was their commitment to equity of access and issues of social justice. above all, they saw a unique fit for makerspaces in their crl philosophies to serve the greater good.
among other advantages, crls were in a unique position to leverage the power of the makerspaces to take advantage of campus communities of “cognitive surplus” and millennial aspirations to share and create spontaneous communities of knowledge.

given the amount of resources that are required to create and maintain a makerspace, this research will be useful for crls considering such a space in the future. the present data suggests that no one type of library currently has a monopoly on makerspaces; regardless of size or funding levels, the common thread among p2 and p3 crls was simply a commitment to providing access to emergent technologies and supporting new literacies. while annual budgets and grant applications were critical for some libraries, the majority of crls funded the bulk of their makerspaces through gifts and donations. future studies on the characteristics and challenges of p2 and p3 populations beyond those in new england will certainly amplify our understanding of these trends.

appendix: survey questions

informed consent

current trends in the development of makerspaces and 3d printing labs at new england college and research libraries

consent for the participation in a research study

southern connecticut state university

purpose

you are invited to participate in a research project conducted by ann marie l. davis, a master’s student in library and information studies at southern connecticut state university. the purpose of this project is to investigate the experiences and goals of college and research libraries (crls) that currently have or are making plans to have an open makerspace (or an equivalent room or space). the results from this study will be included in a special project report for the mls degree and the basis for an article to submit for peer-review.
procedures

if you decide to participate, you will volunteer to take a fifteen-minute online survey.

risks and inconveniences

there are no known risks associated with this research; other than taking a short amount of time, the survey should not burden you or infringe on your privacy in any way.

potential benefits and incentive

by participating in this research, you will be contributing to our understanding of current trends and practices with regards to community learning labs in crls. in addition, you will be providing useful knowledge that can support other libraries in making more informed decisions as they potentially develop their own makerspaces in the future.

voluntary participation

your participation in this research study is voluntary. you may choose not to participate and you may withdraw your consent to participate at any time. you will not be penalized in any way should you decide not to participate or withdraw from this study.

protection of confidentiality

the survey is anonymous and does not ask for sensitive or confidential information.

contact information

before you consent, please ask any questions on any aspect of this study that is unclear to you. you may contact me at my student email address at any time: xxx@owls.southernct.edu. if you have questions regarding your rights as a research participant, you may contact the southern connecticut state institutional review board at (203) xxx-xxxx.

consent

by proceeding to the next page, you confirm that you understand the purpose of this research, the nature of this survey and the possible burdens and risks as well as benefits that you may experience. by proceeding, this indicates that you have read this consent form, understand it, and give your consent to participate and allow your responses to be used in this research.

acrl survey on makerspaces and 3d printers

q1. what is the size of your college or university?
• 4,999 students or less
• 5,000–9,999 students
• 10,000–19,999 students
• 20,000–29,999 students
• 30,000 students or more

q2. how would you categorize your institution? (please check all that apply)
• private
• public
• doctorate-granting university (awards 20 or more doctorates)
• master’s college or university (awards 50 or more master’s degrees, but fewer than 20 doctorates)
• liberal arts and sciences college
• other

q3. do any of the libraries at your institution have a makerspace or equivalent hands-on learning lab (including a 3-d printing station or lab)?
• yes [if “yes,” respondents are directed to question 14]
• no [if “no,” respondents are directed to question 4]

q4. do any of the libraries at your institution have plans to develop a makerspace or equivalent learning lab in the near future?
• yes [if “yes,” respondents are directed to question 8]
• no [if “no,” respondents are directed to question 5]

path one (crls with no makerspace, no plans for makerspace)

q5. are there specific reasons why your institution has decided not to pursue developing a makerspace or equivalent lab in the near future?
• no reasons. we have not given much thought to makerspaces for our library.
• yes

q6. thank you for your participation. would you like a copy of the results when the report is completed? if yes, please enter your email address in the space provided.
• no
• yes (please enter your email address below)

q7. you have almost concluded this survey. before signing off, please feel free to share your thoughts and comments regarding the makerspace movement in college and research libraries. if no comments, please click “next” to end the survey.

path two [crls with plans to build a makerspace]

q8. what are the main goals that motivated your library’s decision to develop a makerspace or equivalent lab?
(please check all that apply) • promote community outreach • promote learning and literacy • promote the library as relevant • promote the maker culture of making • provide access to expensive machines or tools • complement digital repository or digital scholarship projects • as a direct response to community requests or needs • other q9. of these goals, please rank them in order of their level of priority for your library. (choose “n/a” for goals that you did not select in the previous question) • promote community outreach • promote learning and literacy • promote the library as relevant • promote the maker culture of making • provide access to expensive machines or tools • complement digital repository or digital scholarship projects • as a direct response to community requests or needs • other q10. what is your library’s time frame for developing a makerspace or equivalent lab? q11. what are your library’s current plans for gathering and/or financing the resources needed for developing and maintaining the makerspace or equivalent lab? q12. thank you for your participation. would you like a copy of the results when the report is completed? • no • yes (please enter your email address below) q13. you have almost concluded this survey. before signing off, please feel free to share your thoughts and comments regarding the makerspace movement in college and research libraries. if no comments, please click “next” to end the survey. information technology and libraries | june 2018 114 path three [crls with a makerspace] q14. how long have you had your makerspace or equivalent learning lab? • less than 6 months • 6–12 months • 1–2 years • 2–3 years • more than 3 years q15. what were the main goals that motivated your library's decision to develop a makerspace or equivalent lab? 
(please check all that apply) • promote community outreach • promote learning and literacy • promote the library as relevant • promote the maker culture of making • provide access to expensive machines or tools • complement digital repository or digital scholarship projects • as a direct response to community requests or needs other q16. of these goals, please rank them in order of their level of priority for your library. (choose “n/a” for goals that you did not select in the previous question) • promote community outreach • promote learning and literacy • promote the library as relevant • promote the maker culture of making • provide access to expensive machines or tools • complement digital repository or digital scholarship projects • as a direct response to community requests or needs • other q17. how did your library gather and/or finance the resources needed for developing and maintaining the makerspace or equivalent learning lab? q18. do you offer programs in the makerspace/lab or is it simply opened at defined times for users to use? • yes, we offer the following types of programs: • no, we simply leave the makerspace/lab open at the following times (please note times and/or if a reservation is required): • we do both. we offer the following types of programs and leave the makerspace/lab open at the following times (please note types of programs, times open, and if a reservation is required): current trends and goals in the development of makerspaces | davis 115 https://doi.org/10.6017/ital.v37i2.9825 q19. what type of community members tend to use your library's makerspace or equivalent lab most? (please check all that apply) • undergraduate researchers • graduate researchers • faculty • staff • general public • local artists, designers, or craftspeople • local entrepreneurs • other q20. of the cohorts chosen above, please rank them in order of who uses the makerspace or equivalent lab most often. 
(use “n/a” for cohorts that are not relevant to your space or lab) • undergraduate researchers • graduate researchers • faculty • staff • general public • local artists, designers, or craftspeople • local entrepreneurs • other q21. how many dedicated staff does your library currently employ for the makerspace or equivalent? • 0 • 1 • 2 • 3 • other q22. where is your makerspace or equivalent lab located? q23. what is the title or name of your makerspace or equivalent lab, and if known, what were the reasons behind this particular name? q24. what major equipment and services does your library makerspace or equivalent lab provide? q25. what unexpected considerations, challenges, or failures has your library faced in developing and maintaining the makerspace or equivalent lab? q26. how would you assess the benefits or “return on investment” of having a makerspace or equivalent lab? q27. thank you for your participation. would you like a copy of the final results when the report is completed? if yes, please enter your email address in the space provided. • no • yes (please enter your email address below) q28. you have almost concluded this survey. before signing off, please feel free to share your thoughts and comments regarding the makerspace movement in college and research libraries. if no comments, please click “next” to end the survey.

references and notes

1 laura britton, “a fabulous laboratory: the makerspace at fayetteville free library,” public libraries 51, no. 4 (july/august 2012): 30–33, http://publiclibrariesonline.org/2012/10/a-fabulous-labaratory-the-makerspace-at-fayetteville-free-library/; madelynn martiniere, “hack the world: how the maker movement is impacting innovation: from diy geige,” medium, october 27, 2014, https://medium.com/@mmartiniere/hack-the-world-how-the-maker-movement-is-impacting-innovation-bbc0b46bd820#.3mnhow4jz. 2 david v.
loertscher, “maker spaces and the learning commons,” teacher librarian 39, no. 6 (october 2012): 45–46, accessed december 9, 2016, library, information science & technology abstracts with full text, ebscohost; jon kalish, “libraries make room for high-tech ‘hackerspaces,’” national public radio, december 25, 2011, http://www.npr.org/2011/12/10/143401182/libraries-make-room-for-high-tech-hackerspaces; diane slatter and zaana howard, “a place to make, hack, and learn: makerspaces in australian public libraries,” australian library journal 62, no. 4: 272–84, https://doi.org/10.1080/00049670.2013.853335. 3 sharon crawford barniskis, “makerspaces and teaching artists,” teaching artist journal 12, no. 1: 6–14. 4 anne wong and helen partridge, “making as learning: makerspaces in universities,” australian academic & research libraries 47, no. 3 (september 2016): 143–59, https://doi.org/10.1080/00048623.2016.1228163. 5 erich purpur et al., “refocusing mobile makerspace outreach efforts internally as professional development,” library hi tech 34, no. 1 (2016): 130–42. 6 britton, “a fabulous laboratory,” 30. 7 tj mccue, “first public library to create a maker space,” forbes, november 15, 2011, http://www.forbes.com/sites/tjmccue/2011/11/15/first-public-library-to-create-a-maker-space/. 8 phillip torrone, “is it time to rebuild and retool public libraries and make ‘techshops’?,” make:, march 20, 2011, http://makezine.com/2011/03/10/is-it-time-to-rebuild-retool-public-libraries-and-make-techshops/. 9 r. david lankes, “killing librarianship,” (keynote speech, new england library association annual conference, october 3, 2011, burlington, vermont), https://davidlankes.org/killing-librarianship/.
10 janet l. balas, “do makerspaces add value to libraries?,” computers in libraries 32, no. 9 (november 2012): 33. 11 balas, “do makerspaces add value to libraries?,” 33; adrian g. smith et al., “grassroots digital fabrication and makerspaces: reconfiguring, relocating and recalibrating innovation?” (working paper, university of sussex, spru working paper swps, falmer, brighton, september 2013), https://doi.org/10.2139/ssrn.2731835. 12 the number of and interval between emails corresponded roughly with dillman’s “five-contact framework” as outlined in carolyn hank, mary wilkins jordan, and barbara m.
wildemuth, “survey research,” in applications of social research methods to questions in information and library science, edited by barbara wildemuth, 256–69 (westport, ct: libraries unlimited, 2009), 261. 13 in choosing these priorities, respondents were asked to select as many of the reasons as applied to their own crl.

the recon pilot project: a progress report october 1970-may 1971

henriette d. avram and lenore s. maruyama: marc development office, library of congress, washington, d. c.

synopsis of three progress reports on the recon pilot project submitted by the library of congress to the council on library resources covering the period october 1970-may 1971. progress is reported in the following areas: recon production, foreign language editing test, format recognition, microfilming, input devices, and tasks assigned to the recon working task force.

introduction

with the implementation of the marc distribution service in march 1969, the library of congress and the library community have had available in machine readable form the catalog records for english language monographs cataloged since 1969.
most libraries, however, also need to convert their older cataloging records, and the library of congress attempted to meet these needs by establishing the recon pilot project in august 1969. during the two-year period of the pilot project, various techniques for conversion of retrospective bibliographic records have been tested, and a useful body of catalog records is being converted to machine readable form. the pilot project is being supported with funds from the library of congress, the council on library resources, and the u.s. office of education. earlier articles in the journal of library automation have described the progress through september 1970 ( 1, 2, 3 ). this article covers the period october 1970 through may 1971. 160 journal of library automation vol. 4/3 september, 1971 progress-october 1970 through may 1971 recon production the conversion of 8476 records in the 1969 and 7-series of card numbers that had not been included in the marc distribution service was completed, and these records were sent to 47 subscribers of the marc distribution service. the subscribers were not charged for these records but were asked to send a tape reel to the library for the duplication process. at present, the recon data base consists of 25,206 records in the 7, 1969, and 1968 series of card numbers. records in the 1968 series that were part of the data base for the marc pilot project are being converted by program from the marc i format to the marc ii format, proofed, and updated. to date, 7551 out of 7583 marc i records have been processed. prior to the implementation of the marc distribution service, records were input for test purposes, and the resulting practice tapes contain data requiring correction or updating to correspond with the present specifications of the marc ii format. of the 8340 titles on the practice tapes, 3460 have been updated and reside on the recon master file. 
these updated machine readable records will be distributed with the recon titles in the 1968 card series.

foreign languages editing experiment

a foreign language editing experiment was conducted to test the accuracy of marc/recon editors in editing french and german language records. records used for this test included 1180 of the 5000 recon research titles. at least 50 percent accuracy was expected since half of the task of editing a marc record involves being able to read the language of the record. the other half involves identifying the data elements by their location in the record. the three editors used in the experiment had studied french in high school, one having had an additional year in college; none had studied german. each editor was required to edit approximately 200 records in each language. statistics on the number of records edited per hour and the number of errors made, when compared with the same editors' statistics for editing english language records, showed that each editor maintained an approximately equal rate of speed in editing foreign language records as in editing english. the error rate for each editor, however, was more than tripled on foreign records, and each made approximately as many errors in french (the language studied) as in german. each editor averaged more than 12 errors per batch in french and 12 in german. since the marc editorial office has established a standard of 2.5 errors per batch (20 records comprising a batch) as being acceptable for trained marc editors, this error rate would have to be lowered in a production environment. the majority of errors occurred in the title field, which is a portion of the record that must be read for content in order to be edited correctly. the second largest number of errors occurred in the fixed fields, which are also dependent upon a reading knowledge of the language of the record for accurate coding.
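the gap between the observed error rate and the editorial standard can be worked out directly from the figures quoted above. the sketch below is only that arithmetic; the constant and function names are illustrative, and 12 errors per batch is used as a lower bound because the text reports "more than 12" for french.

```python
# Figures quoted in the text; names are illustrative, not from the project.
RECORDS_PER_BATCH = 20
ACCEPTABLE_ERRORS_PER_BATCH = 2.5
OBSERVED_ERRORS_PER_BATCH = 12  # lower bound: the text reports 12 or more per batch


def errors_per_record(errors_per_batch, records_per_batch=RECORDS_PER_BATCH):
    """Convert a per-batch error count into an average per-record rate."""
    return errors_per_batch / records_per_batch


def ratio_to_standard(observed, standard=ACCEPTABLE_ERRORS_PER_BATCH):
    """How many times the acceptable standard the observed rate is."""
    return observed / standard


if __name__ == "__main__":
    print(errors_per_record(ACCEPTABLE_ERRORS_PER_BATCH))  # standard, per record
    print(errors_per_record(OBSERVED_ERRORS_PER_BATCH))    # observed, per record
    print(ratio_to_standard(OBSERVED_ERRORS_PER_BATCH))    # observed / standard
```

on these figures the foreign language batches ran at no less than about 4.8 times the acceptable standard, or roughly 0.6 errors per record against 0.125.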
the number of errors made in each batch of records by each editor was tabulated to determine if any improvement was made during the course of the experiment. in no case was improvement noted. statistics were also kept on the number of times an editor consulted various sources for help: e.g., dictionaries, the editing manual, the lc official catalog, the reviser, or a language specialist. dictionaries were consulted frequently, and the reviser and language specialists rarely. typing statistics (number of errors) were also recorded for 181 french and 185 german records. the error rate for typing foreign language material was lower than for typing english. the english language statistics, however, were combined for several typists, and the foreign language statistics were for one typist only. charts showed that there was no improvement in the number of typing errors made at the end of the test. the primary conclusion drawn from the results of the experiment is that in order to edit foreign language records with an acceptable degree of accuracy, it would be necessary for the editor to have a good knowledge of the language as well as the editing procedures.

format recognition

format recognition is a technique that allows the computer to process unedited bibliographic records by analyzing data strings for certain keywords, significant punctuation, and other clues to determine proper identification of data fields. the library of congress has been developing this technique since early 1969 in order to eliminate substantial portions of the manual editing process, which in turn should represent a considerable savings in the cost of creating machine readable records. the recon report, which was written prior to the completion of the first format recognition feasibility study, concluded that "partial editing combined with format recognition processing is a promising alternative to full editing."
(4) since that time, the emphasis in the development of the programs has been shifted to no editing prior to format recognition processing. the programs are in the final stages of acceptance testing, and it is expected that 75% of the records can be processed without errors created by the format recognition programs. preliminary estimates show that it takes approximately half a second of machine time to process one record by format recognition; the manual editing process, on the other hand, takes approximately six minutes per record. the total amount of core storage required is approximately 120k: 80k for the programs and 40k for the keyword lists. although the keyword lists are maintained as a separate data set on a 2314 disk pack, they are loaded into memory during processing. the format recognition programs have been written in assembler language for the library's ibm 360/40 under dos. the logical design of the format recognition process, with detailed flow charts needed for implementation of computer programming, has been published as a working document by the american library association so that the technical content would be available to assist librarians in their automation projects (5). workflow for format recognition begins with the input of unedited catalog records via the mt/st following the typing specifications created for format recognition. after being processed by the format recognition programs, these records are proofed by the editors (the first instance in which they see the records), and the necessary corrections or verifications made. correction procedures for format recognition records are the same as those used for regular marc records. figures 1, 2, and 3 are examples of the printed card used for input, the mt/st hard copy, and the proofsheet of the record created by format recognition.
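the general idea of identifying fields from keywords and punctuation can be illustrated with a very small sketch. this is not the library of congress algorithm, which is documented in the published working document and relies on much larger keyword lists; the keyword tuple, the regular expressions, and the function name below are all assumptions made for illustration, and the sample strings are taken from the ewart record used in the figures.

```python
import re

# Hypothetical keyword list; the real programs used far larger lists.
COLLATION_KEYWORDS = ("p.", "illus.", "ports.", "plates", "cm.")

YEAR = re.compile(r"\b(1[5-9]\d\d)\b")          # a plausible imprint year
SUBJECT_TRACING = re.compile(r"^\d+\.\s+\S")    # e.g. "1. love."


def guess_tag(segment: str) -> str:
    """Guess a MARC-style tag for one segment of an unedited record.

    Only surface clues are used: a leading arabic numeral, collation
    keywords together with digits, or a year together with commas.
    """
    text = segment.strip().lower()
    if SUBJECT_TRACING.match(text):
        return "650"  # arabic-numbered subject tracing
    words = text.split()
    if any(word in COLLATION_KEYWORDS for word in words) and re.search(r"\d", text):
        return "300"  # collation: pagination, illustrations, size
    if YEAR.search(text) and "," in text:
        return "260"  # imprint: place, publisher, date
    return "245"      # fall back to treating the segment as a title


if __name__ == "__main__":
    for segment in (
        "the world's greatest love affairs.",
        "london, odhams, 1967 [i.e. 1968].",
        "287 p. 8 plates, illus., ports. 22 cm.",
        "1. love.",
    ):
        print(guess_tag(segment), segment)
```

a rule set this small would misclassify many real records; it is meant only to show why segments that cannot be resolved by position or punctuation, such as titles, still depend on someone reading the content.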
initial use of the format recognition programs is for input of approximately 16,000 recon records in the 1968 card series. input of current marc records via format recognition will begin at a later date. recon records were chosen for large-scale testing because they are not required for an actual production operation such as the marc distribution service. in addition, work has begun on the expansion of format recognition to foreign languages. analysis is being done on german and french monograph records, and eventually spanish, for new or expanded keyword lists and some changes to the algorithms.

[fig. 1. input for format recognition: printed lc catalog card for ewart, andrew, the world's greatest love affairs, lc card number 68-97457.]

hq801.a2e9 ewart, andrew the world's greatest love affairs. london, odhams, 1967 [i.e. 1968]. 287 p. 8 plates, illus., ports. 22 cm. 25/(b68-03757) 1. love. 2. biography. i. title. 301.41/4/0922 68-97457 library of congress

fig. 2. mt/st hard copy.

[fig. 3. proofsheet of format recognition record: the same record with content designators assigned by the programs under tags 050, 100, 245, 260, 300, 350, 015, 650, and 082.]
microfilming

for a full-scale retrospective conversion project at the library of congress, it is likely that records for input would be microfilmed from the card division record set and updated from the corresponding records in the library's official catalog. a subset of the record set, such as the catalog cards for a given year, would be microfilmed and then the appropriate records, i.e., english language monographs, german monographs, etc., would be selected after filming. costs were calculated for a base figure of 100,000 records for the year 1965, and four different methods of microfilming have been estimated as follows by the library's photoduplication service: 1) microfilming for a direct-read optical character reader ($2000); 2) microfilming for reader/printer specifications ($2350); 3) microfilming for reader specifications ($400); and 4) microfilming for a xerox copyflo printout of a card overlaid on an 8 x 10½ worksheet ($7000). the differences in cost are primarily attributable to the type of camera used (rotary or planetary) and the kind of feed mechanism (manual or automatic). other factors need to be considered, such as the fact that film suitable for ocr requirements could not be used on xerox copyflo or even for contact printing to positive film. since a readable copy of the original printed card is necessary for updating and proofing, microfilming for direct-read ocr would not be a viable alternative.

input devices

the monitoring of existent input devices was continued with an investigation of dissly systems' scan data optical character reader. scan data has been modified, via software, to read 55 different type fonts which are recognized by a "best compare" technique using six stored fonts to match against the remaining 49. according to the manufacturer, direct-reading is accomplished with approximately 95% level of accuracy.
errors are recorded during a proofing cycle and corrected in the machine readable data base. the scan data equipment does not have a transport for a 3 x 5 document, so that a number of 3 x 5 cards must be attached to an 8 x 14 document for scanning, and therefore these cards would not be returned to the library by the manufacturer. under these conditions, cards to be read by scan data equipment would have to be obtained from stock rather than from the card division record set. unfortunately, many cards are out of stock; and of those that are in stock many may be cards reprinted several times by photo-offset methods and consequently have a poor image. therefore the use of this device would be severely hampered. fifty good quality cards were submitted to dissly systems for an experiment that was run without any modifications to the existing machine and software. five of the 50 cards were returned to the library with a matching printout. the results were not encouraging because many lines of text were missed and many characters misread. recon working task force the recon working task force has compiled work statements for contractual support for two of its research projects. these projects involve investigations on the implications of a national union catalog in machine readable form and the possible utilization of machine readable data bases other than that of the library of congress for use in a national bibliographic store. preliminary tasks related to these projects have been described in earlier progress reports ( 6, 7). the recon pilot project/ avram and maruyama 165 the first part of the work statement deals with the products that could be derived from the machine readable national union catalog: a bibliographic register, indexes by name, title, and subject, and a register of locations. these indexes would provide multiple access points to the records in the national union catalog. 
the bibliographic register will contain a full bibliographic record on each title covered. the indexes will contain partial records which are associated with the full records in the register, and a given index file will carry one or more partial records for every record in the register. for each title in the register, the register of locations lists those libraries where copies of the title are held. the assumption is made that the indexes under consideration will contain the following data elements (the numeric designations and subfield codes are those used in the marc format fields):

name index
name (100, 110, 111, 400, 410, 411, 600, 610, 611, 700, 710, 711, 800, 810, 811)
short title (245)
main entry in abbreviated form
date (fixed field date 1)
language (fixed field language code)
lc card number
register number

title index
short title (130, 240, 241, 245, 440, 630, 730, 740, 840)
main entry in abbreviated form
date (fixed field date 1, or may be omitted if in heading)
language (fixed field language code, or may be omitted if in heading)
lc card number
register number

subject index
subject heading (650, 651)
main entry (100, 110, or 111)
short title (245)
date (fixed field date 1)
language (fixed field language code)
lc card number
register number

the abbreviated form of main entry noted above is to be included in the record of the name or title index unless the name itself is carried in the main entry of that record. it is defined as follows: 1) for a personal name, a conference, or a uniform title heading, subfield "$a" is appended in brackets after the title; and 2) for a corporate name, subfield "$a" plus the first "$b" subfield are appended, within a single set of brackets, after the title.

the specific objective of this project is to define and investigate alternative processing schemes associated with an automated national union catalog.
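the two rules for the abbreviated form of main entry lend themselves to a short sketch. the record layout below (a dictionary with a main entry tag and a list of subfield pairs) is a hypothetical structure chosen for illustration, not the marc communications format, and the date, language code, and register number in the sample record are illustrative values only.

```python
def first_subfield(subfields, code):
    """Return the first value carrying the given subfield code, or None."""
    return next((value for c, value in subfields if c == code), None)


def abbreviated_main_entry(tag, subfields):
    """Bracketed short form of a main entry, following the two rules above."""
    a = first_subfield(subfields, "a") or ""
    if tag == "110":
        # corporate name: $a plus the first $b, within one set of brackets
        b = first_subfield(subfields, "b")
        return "[" + (a if b is None else a + " " + b) + "]"
    # personal name, conference, or uniform title heading: $a only
    return "[" + a + "]"


def title_index_entry(record):
    """Build a partial record for the title index from a full register record."""
    tag, subfields = record["main_entry"]
    return {
        "heading": record["short_title"] + " " + abbreviated_main_entry(tag, subfields),
        "date": record["date_1"],
        "language": record["language"],
        "lc_card_number": record["lc_card_number"],
        "register_number": record["register_number"],
    }


# Hypothetical register record; field values other than the name, title,
# and card number are illustrative.
sample = {
    "main_entry": ("100", [("a", "ewart, andrew")]),
    "short_title": "the world's greatest love affairs",
    "date_1": "1968",
    "language": "eng",
    "lc_card_number": "68-97457",
    "register_number": 1,
}

if __name__ == "__main__":
    print(title_index_entry(sample)["heading"])
```

the register number is what ties each partial index record back to the full record in the bibliographic register, which is why it appears in all three index layouts.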
this study will explore and examine these processing schemes and the following components:
1) techniques for introducing the necessary input into the automated nuc system. the considerations to be covered include the relationship to marc input, use of the format recognition programs, and the problems of language in terms of selection of input.
2) techniques for structuring or organizing the data contained in the register and the various indexes to establish and maintain the relationships among the records contained in these data bases.
3) techniques and procedures connected with the production of the products listed above. this investigation will also cover any selection and sorting procedures necessary.
4) analysis of the format, i.e., graphic design and printing, size, style, typographic variation, condensation, etc.
5) examination of alternative cumulation patterns associated with the products of the system. in this connection, items such as number of characters in an average entry, average number of entries on a page, expected rate of increase of number of entries in catalog, and segmentation of catalog are to be taken into consideration.
6) feasibility of producing a register through automation techniques. if this can be accomplished, further investigation will be directed toward the feasibility and cost of segmenting the register into three sections: one produced from machine readable records (english and whatever roman alphabet language records are in machine readable form); one produced from roman alphabet language records which are only in printed form; and one produced from non-roman alphabet language records which are only in printed form.

the costs associated with the various techniques and procedures enumerated above as well as with their components will be calculated. from these figures an average total cost per title cataloged is to be determined for each alternative processing scheme.
these cost values (one per alternative scheme) are to be compared with those associated with a purely manual processing scheme. included in this cost analysis will be the associated costs for different forms of hard copy as well as for the use of com (computer output microfilm). from any one index and the register of locations, the maximum number of alphabetic and numeric lists (registers of location ordered by register number) will be determined, taking into account ease of usage and technical and economic feasibility. the intent is to have as few lists as possible and still keep the cost within reasonable bounds. supplements to the indexes should be issued monthly; supplements to the register of locations may be issued monthly or quarterly.

the second project is a continuation of a previous investigation on the possible utilization of machine readable data bases other than that produced by the library of congress for use in a national bibliographic store. the results of this project should determine if the use of other data bases is economically and technically feasible. using three or four data bases selected by the recon working task force, the study will determine the following:
1) method and cost of acquiring these other data bases in machine readable form.
2) analysis of the kinds of programs capable of converting records from a number of these data bases into the marc format. different level data bases might require different kinds of programs. if such an effort is deemed feasible, a cost estimate for such a program or array of programs will be calculated.
3) method and cost of printing the records for examination, corrections, etc.
4) method and cost of eliminating records already in the marc data base.
5) method and cost of comparing these records against the lc official catalog and making the necessary changes in the data or content designators.
6) cost for input of additions and corrections.
7) method and cost of incorporating the additions and corrections in the machine readable file.
8) cost of providing means by which these records would not be input again by any future lc retrospective conversion effort.

a result of this project should be a determination as to whether high potential or medium potential files, or both, are suitable for conversion. a determination will be made of the minimum yield or the minimum number of titles needed to justify writing the programs to convert these data bases. a factor to be considered is that the number of unique titles will decrease as more data bases are converted for this pool of records.

it was decided that the research tasks to study the problems in distributing name and subject cross reference control files would be dropped because of limitations of time and funds. an additional task, however, has been added that can be performed within the time limits of the pilot project. during the past year, the library of congress card division has recorded information about card orders in machine readable form. this information will be analyzed as to the year and language of the most frequent orders because it is assumed that the most popular card orders bear a relationship to the potential use of a data base in machine readable form by libraries in the field. this study involves the following:
1) analysis of a frequency count of lc card orders for a one-year period and preparation of a distribution curve for card series.
2) analysis of a sample of frequently ordered cards to determine with fair reliability the proportion of english language titles in this group. the sample will be large enough to give an indication of other language groups that might be significant for any recon effort.
3) preparation of distribution curves for english language and non-english titles by card series.
4) mathematical analysis of the results of 1)-3) above to arrive at a table to show the anticipated utility of converting specified subsets of the lc card set.

outlook

research in input devices has not uncovered any equipment that offers a significant technical and cost improvement over the mt/st currently used in the library of congress. on-line correction and verification of marc/recon records will, however, speed conversion and will offer relief in the flow of documents and paper work required in a purely batch operation. since marc/recon records will be corrected and verified in one operation rather than by the cyclic process of the present system, cost savings should be realized. the library of congress will have this on-line capability through the multiple use marc system. this new system is still in the design phase, and a projected date for implementation has not yet been set. to date investigations in the use of direct-read optical character readers have demonstrated that there are no devices currently available capable of scanning the lc printed card. the format recognition programs are operational, and recon titles in the 1968 card series are being converted without any prior editing of the records. procedures are being implemented to gather the necessary data to compare costs of the format recognition technique with costs of conversion with human editing. production statistics have shown that retrospective records are more costly to convert than current records. this higher cost is attributed to the additional tasks in recon of selecting the subset for input from the lc record set and comparing the records with the lc official catalog for updating. since cards in the lc record set do not necessarily reflect the latest changes made to the cards in the lc official catalog, the official catalog comparison is necessary to ensure that recon records are as up-to-date as the cards in the official catalog.
although the recon report (8) recommended conversion in reverse chronological order with highest priority given to the last ten years of english language monograph cataloging, the working task force study on the card division popular titles may reveal that selective conversion is a more practical approach. the orderliness of chronological conversion by language does mean that records in machine readable form can be ascertained easily. it is interesting, however, to speculate on the use of these records compared with popular titles which may cross many years and languages. the marc/recon titles constitute the data base for the phase ii card division mechanization project, and close liaison continues to be maintained between both projects. it is recognized that the distribution of cards and marc records requires the same computer based bibliographic files and has similar hardware and software requirements. plans are presently underway to transfer the duplication of tapes for marc subscribers from the library's ibm 360/40 to the card division's spectra 70 when the phase ii system is operational. the recon pilot project does not officially end until august 1971. in an attempt to make information available as rapidly as possible, the preparation of the final report will begin this summer, since several aspects of the project are complete enough to be documented. the final report will be published by the library of congress, and its availability will be announced in the lc information bulletin and in professional journals.

acknowledgments

the authors wish to thank the staff members associated with the recon pilot project in the marc development office, the marc editorial office, the technical processes research office, and the photoduplication service of the library of congress for their contributions to the project and, therefore, to this report. special thanks are due to patricia e.
parker of the marc development office for her work on the foreign language editing experiment and for writing that section of this article.

references
1. avram, henriette d.: "the recon pilot project: a progress report," journal of library automation, 3 (june 1970), 102-114.
2. avram, henriette d.; guiles, kay d.; maruyama, lenore s.: "the recon pilot project: a progress report, november 1969-april 1970," journal of library automation, 3 (september 1970), 230-251.
3. avram, henriette d.; maruyama, lenore s.: "recon pilot project: a progress report, april-september 1970," journal of library automation, 4 (march 1971), 38-51.
4. recon working task force: conversion of retrospective catalog records to machine-readable form: a study of the feasibility of a national bibliographic service (washington, d.c.: library of congress, 1969), 179.
5. u. s. library of congress. information systems office. format recognition process for marc records: a logical design (chicago, american library association, 1970).
6. avram, guiles, maruyama, op. cit., 248-249.
7. avram, maruyama, op. cit., 49-51.
8. recon working task force, op. cit., 11.

letter from the editor (september 2020)
kenneth j. varnum
information technology and libraries | september 2020
https://doi.org/10.6017/ital.v39i3.12xxx

with “unprecedented” rising to first place on my personal list of words i would prefer never to need to use again, let alone hear used, i find it eminently satisfying that some activities and events from before covid continue in their usual, predictable ways. for me, the quarterly rhythm of publication of information technology and libraries is one of those activities. it is helping keep me grounded. while it is certainly not much in the scope of what is happening all around me, it is at least something.
one thing that is changing is that this journal, along with library resources and technical services and library leadership & management, is now a publication of ala’s newest division: core: leadership, infrastructure, futures. you’ll notice a new logo at the top of our site, reflecting the new organizational structure. i am excited about the possibilities of richer cross-core cooperation and collaboration as we explore our new structure. this issue includes the first, and last, lita president’s message from incoming and outgoing lita president evviva weinraub lajoie. evviva assumed the lita presidency this summer, just before the merger of lita, llama, and alcts into the new core division took place on september 1. members of those three merged divisions should watch for information about elections for the new core president in october. i am pleased that this issue includes the 2020 lita/ex libris student writing award winning article, evaluating the impact of the long-s upon 18th-century encyclopedia britannica automatic subject metadata generation results, by sam grabus of drexel university. julia bauder, the chair of this year’s selection committee (i was also a member, as ital editor) said, “this valuable work of original research helps to quantify the scope of a problem that is of interest not only in the field of library and information science, but that also, as grabus notes in her conclusion, could affect research in fields from the digital humanities to the sciences.” before closing, i would like to express my appreciation to breanne kirsch, who ably served on the editorial board from 2018 to 2020. sincerely, kenneth j.
varnum, editor varnum@umich.edu september 2020 https://doi.org/10.6017/ital.v39i3.12235

information technology and libraries at 50: the 1970s in review
sandra shores
information technology and libraries | june 2018

sandra shores (sandra.shores@ualberta.ca), a member of lita and the ital editorial board, is senior it & facilities officer, learning services, university of alberta.

what a pleasure it has been to scan through a decade of articles, communications, and news from the ten volumes comprising the 1970s’ journal of library automation, predecessor to information technology and libraries. despite the open-access availability of several of these volumes, i requested the entire run from our high density storage library and delighted in the dusty covers between my hands and the musty smells wafting from the paper and ink. at the same time, i rued past practices at this university library of slicing out journal covers and front and back advertising in order to save a few cents in bindery costs. such a loss of information about a journal’s history! by the end of the 1970s, i was almost through my undergraduate degree and intended to pursue a library science degree after a gap year, not being able to imagine a more inspiring or satisfying place to build a career than an academic library. even as an avid library user at that time, i realized that technology was having an increasing impact on operations but, until reviewing these ten years of the journal, did not grasp the tipping point reached. the decade began with tentative language around technology including uncertainty about whether this phenomenon was best called mechanization or automation. that uncertainty extended to the naming of new things, resulting in words not yet joined, or if joined, then with hyphenation: data base or data-base, key word or key-word, on line or on-line, for example.
in the early years, the profession began to imagine the coming together of disparate small library applications into what was referred to as a “total” system; the integrated library system or ils had neither been imagined nor articulated. early concerns in the decade focused on the rising costs of library services alongside the high price of computing. cost-benefit analysis drove many decisions related to library automation. in a 1970 article conceptualizing an online catalog, frederick kilgour claimed that the “productivity of library workers is not continuously increasing as is that of workers in the general economy”1 and concluded that “mechanization, or more specifically, computerization, is the only avenue that extends toward the goal of economic viability.”2 fortunately, the decade saw considerable advances in computer engineering coinciding with steadily decreasing costs in processing power, data storage and networking. nine years later, computer scientist g. salton imagined a much more viable future for libraries, “postulating a completely new library design where the shelf arrangement of books and journals would be replaced by a computerized store containing presumably the full text of all library items together with appropriate search methodologies.”3 while several decades would pass before the emergence of project gutenberg and other mass book scanning projects, technology was sufficiently affordable for libraries by the end of the 1970s that the focus shifted to ways of working cooperatively to harness the powerful new opportunities. jola attracted articles on library networks, cataloguing cooperatives, union serial lists, robust circulation systems, early interlibrary loan systems, and commercial and not-for-profit database services. many applications and systems grew out of projects begun at university and other libraries, but the decade also saw the early emergence of vendor solutions.
the 1970s in review | shores
https://doi.org/10.6017/ital.v37i2.10494

by the final volume of the 1970s, caryl and stratton mcallister reported on dobis/libis, “an integrated library system with strong authority file control that can be used directly by the library staff and its borrowers.”4 the ils was born. a few professional themes that still resonate today stand out from this glance into the past, the first being a shift from valuing ownership of materials to valuing cooperation and resource sharing. technology combined with an increasing emphasis on standards of description and communication offered new possibilities for regional and national resource sharing, leading many in the profession to acknowledge the futility of trying to build comprehensive collections for their institutions. a number of articles highlight the impact of technology on library personnel, noting that the introduction of automation is disruptive to staff, leaving many feeling unprepared to succeed in their jobs. the roots of evidence-based decision making can be seen in a few articles, for example one describing how newly available data from first generation circulation systems can inform the acquisition of additional copies of high demand titles.5 the library user comes more into focus as the decade progresses, in studies about user satisfaction with mediated search and retrieval services, whether run in batch mode or online, and in concept papers imagining systems that support end-user searching. other articles express concern over the costs of online searching and other computer services, especially as the costs of library materials continued to rise and journals consumed more of the collections budget. finally, members of the profession understood early that data about use of library materials stored in computer systems needs protection.
at the ala midwinter meeting in 1973, the information science and automation division (precursor to lita) passed a motion urging ala to develop data privacy policy, noting the “vulnerability of machine-readable files due to the large quantity of data processed.”6 william mathews provides an excellent article to read in anticipation of library technologies in the 1980s. with considerable prescience in “advances in electronic technologies” he takes the reader through microprocessors, high performance computing, the pending home computer phenomenon and new storage and processing technologies.7 by the end of the 1970s, the library world was on the cusp of a technology revolution! in bringing this review to a close, i would be remiss not to draw attention to the illustrious professionals who assumed editorial roles in the early years of the journal. frederick kilgour, henriette avram, verner clapp, pauline atherton, and others not only had extraordinary careers but also set high standards for the journal and the association. kudos to them for their establishment of jola!

1 frederick g. kilgour, “concept of an on-line computerized library catalog,” journal of library automation 3, no. 1 (march 1970): 2, https://doi.org/10.6017/ital.v3i1.5123.
2 kilgour, “concept,” 3.
3 g. salton, “suggestions for library network design,” journal of library automation 12, no. 1 (march 1979): 39-40.
4 caryl mcallister and a. stratton mcallister, “dobis/libis: an integrated, on-line library management system,” journal of library automation 12, no. 4 (december 1979): 300.
5 robert s. grant, “predicting the need for multiple copies of books,” journal of library automation 4, no. 2 (june 1971): 64-71, https://doi.org/10.6017/ital.v4i2.5583.
6 “highlight of minutes: information science and automation division, board of directors meeting,” journal of library automation 6, no. 1 (march 1973): 57, https://ejournals.bc.edu/ojs/index.php/ital/article/view/5761/5140.
7 william d.
mathews, “advances in electronic technologies,” journal of library automation 11, no. 4 (december 1978): 299-307.

evaluation of the new jersey digital highway
judy jeng

the aim of this research is to study the usefulness of the new jersey digital highway (njdh, www.njdigitalhighway.org) and its portal structure. the njdh intends to provide an immersive and user-centered portal for new jersey history and culture. the research recruited 145 participants and used a web-based questionnaire that contained three sections: for everyone, for educators, and for curators. the feedback on the usefulness of the njdh was positive and the portal structure was favorable. the research uncovered several reasons why some collections did not want to or could not participate. the findings also suggested priorities for further development. this study is one of the few on the evaluation of cultural heritage digital libraries.

the new jersey digital highway (njdh, www.njdigitalhighway.org) is a digital library for new jersey history and culture, including collections of new jersey libraries, museums, archives, and historical societies. the njdh, funded in part by the 2003 national leadership grant of the institute for museum and library services, is a joint project by new jersey state library, the new jersey division of archives and records management at rutgers university libraries, the new jersey historical society, and the american labor museum. as part of the project, the njdh identifies 686 cultural heritage institutions (public libraries, archives, historical societies, and museums). as of november 2007, there are more than ten thousand objects (pictures, records, and oral histories) in the repository. more are being added daily.
the njdh, at this writing, is still very much a work in process. the principal investigator of this project continues to extend opportunities to more communities to link their sites and scan their images.1 the njdh provides portals for four different groups of people: everyone, educators, students, and librarians and curators. its mission is to develop an immersive, user-centered information portal and to support the new jersey learner through a collaboration among cultural heritage institutions that supports preservation of the past, new access strategies for the future, and active engagement with resources at the local and the global level for shared access and local ownership. the njdh uses fedora (flexible extensible digital object repository architecture) as a platform to mount participating institutions’ digital objects and metadata. fedora, developed jointly by cornell university and the university of virginia and currently supported through an andrew w. mellon foundation grant, is customizable and allows local institutions to have true control over what they digitize and post.2 fedora is built on xml with core standards that support flexibility and interoperability such as mets (metadata encoding and transmission standard, www.loc.gov/standards/mets) and oai-pmh (open archives initiative protocol for metadata harvesting, www.openarchives.org) functions. fedora was chosen for the njdh because it can effectively accommodate and manage a broad array of information sources with the flexibility to integrate with other information repositories. the njdh uses a metadata structure based on mods (metadata object description schema, www.loc.gov/standards/mods), mets, niso, and premis (preservation metadata, www.loc.gov/standards/premis) metadata standards to support preservation of digital objects, to ensure scalability for projects and interoperability with other systems through oai-pmh.
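to make the oai-pmh interoperability mentioned above more concrete, the following is a minimal sketch of how a harvester might form protocol requests. it is an illustration only, not a description of how the njdh is implemented: the endpoint address and the helper name build_oai_request are hypothetical, since the article does not give the njdh harvesting address. the verb names and the oai_dc metadata prefix come from the oai-pmh 2.0 specification, which requires every repository to support oai_dc.

```python
from urllib.parse import urlencode

# the six request verbs defined by oai-pmh version 2.0
OAI_VERBS = {
    "Identify", "ListMetadataFormats", "ListSets",
    "ListIdentifiers", "ListRecords", "GetRecord",
}


def build_oai_request(base_url, verb, **arguments):
    """Return the GET url for a single oai-pmh request.

    build_oai_request is a hypothetical helper written for this sketch.
    """
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown oai-pmh verb: {verb}")
    # the verb always comes first, followed by any verb-specific arguments
    query = {"verb": verb, **arguments}
    return f"{base_url}?{urlencode(query)}"


# hypothetical endpoint, used only for illustration
BASE_URL = "https://repository.example.org/oai"

# ask the repository to describe itself
print(build_oai_request(BASE_URL, "Identify"))

# harvest records as unqualified dublin core, the one mandatory format
print(build_oai_request(BASE_URL, "ListRecords", metadataPrefix="oai_dc"))
```

a harvester would send these urls over http and parse the xml responses; whether a given repository also exposes richer formats such as mods depends on its configuration, which a ListMetadataFormats request would reveal.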
this hybrid approach enables njdh collection managers and metadata creators to provide information through multiple presentation standards in a schema easily understood within distinctive cultural heritage organization communities. mods is used for descriptive metadata, provides and retains standard bibliographic cataloging principles, and is therefore easily mapped to marc. the njdh therefore includes a mapping utility that allows the export of records from the njdh to online catalogs for any organization that wants to make its digital objects accessible within its integrated library system. additionally, there are four other types of metadata in njdh: source metadata describes provenance, condition, and conservation of analog source materials such as photographs, books, maps, audio, and video; technical metadata describes born digital images and provides information about the digital master files that will be maintained for long-term preservation and access; rights metadata identifies the rights holder(s) for each information source, identifies the permissions for use including any restrictions, and documents the copyright status of each work; digital provenance metadata provides a digital “audit trail” of any changes to the metadata.3 use of the njdh has grown steadily: the site has some three thousand unique visitors a month, averaging eight to ten thousand visits per month.4

judy jeng (jjeng@njcu.edu) is head of collection services, new jersey city university, new jersey.
information technology and libraries | december 2008

prior cultural heritage digital library evaluations

literature review indicates that few researchers have investigated the usability or the evaluation of cultural heritage digital libraries.
the minerva (ministerial network for valorising activities) project proposed a number of criteria and principles specifically for usability evaluations of cultural web applications, including visibility, affordance, natural mapping, constraints, conceptual models, feedback, safety, flexibility, the scope and aim of the site, meaningful organization of the website’s functions, quality of content (for example, consistency, completeness, conciseness, accuracy, objectivity), design of functional layout, consistent use of graphics and multimedia components, as well as provision for navigation tools and search mechanisms.5 in addition, vaki, dallas, and dalla proposed sixteen usability guidelines for cultural applications.6 garoufallou, siatri, and balatsoukas reported their research on the user interface of the veriagrid application.7 the veriagrid system (www.theveriagrid.org) is a platform based on digital cartography that supports a vector map of the city of veria organized by layers and linked to multimedia objects such as text, images, photos, and video clips. the researchers were interested in learnability, errors, and satisfaction.

usefulness as the primary evaluation criterion for the njdh

the njdh aims to serve heterogeneous communities and information needs. as with other digital cultural services, its usability issues are not easy to address. lynch has said that digital libraries of cultural heritage don’t really have natural communities around them and that digital materials find their own unexpected user communities.8 garoufallou, siatri, and balatsoukas said that “different types of users, such as students and scholars or tourists and travelers look at these services from different angles (for example, scholarly or recreational needs).
thus, the provision of accessible and user-friendly systems is important for the wider use and acceptance of these services.”9 the aim of this evaluation was to assess usefulness of the njdh from the perspectives of general users, educators, and cultural heritage professionals. usefulness is one of the criteria of usability with a focus on “did it really help me?” and “was it worth the effort?” usefulness differs from usableness in that usableness refers to functions such as “can i turn it on?” or “can i invoke that function?” usefulness can also mean “serving an intended purpose.” in the technology acceptance model (tam) developed by davis and his colleagues, perceived usefulness refers to the extent to which an information system will enhance a user’s performance.10 in addition to usefulness and usableness, jeng has gathered a comprehensive collection of usability criteria such as effectiveness, efficiency, satisfaction, learnability, ease of use, memorability, mistake recovery, and interface effectiveness.11 usability is a multidimensional construct and has a theoretical root in human–computer interaction. although usefulness may be an important evaluation criterion, thomas and jeng report that usefulness is an often overlooked criterion of usability.12 literature review indicates that usefulness has been used as either the primary or one of the criteria in the following evaluations of digital libraries: elibraryhub, the digital work environment, grow (geotechnical, rock, and water engineering, www.grow.arizona.edu), mcmaster university library’s gateway, the miguel de cervantes virtual library, minnesota’s foundations project, and the moving image collections.13 this paper reports the evaluation of the njdh.

research method

a web-based online survey was conducted in september–december 2006. the questionnaire was designed, collected, and analyzed using web-based software called surveymonkey. a convenience sampling method was used in this study.
subjects were recruited by posting a link on the njdh website, by posting announcements on a number of electronic discussion lists for educators and cultural heritage professionals, and by word-of-mouth invitations. the participants were asked to complete a two-part questionnaire. the first part gathered demographic data such as gender, age, ethnic background, educational background, the county they live in, and how they learned about the njdh. the second part contained three sections: one for everyone, one for educators, and one for cultural heritage professionals. the section for everyone contained twenty-six questions, including seven-point likert scales and open-ended questions with a focus on the digital library’s usefulness, navigation, design, terminology, and user lostness. in addition to this general section, educators were asked to complete another fifteen questions pertaining specifically to the educators’ portal; the cultural heritage professionals had another thirteen questions regarding the librarians and curators’ portal. a total of 145 individuals participated in the survey, of which 32 were educators (22%) and 28 (20%) were cultural heritage professionals. the participants were mostly white (127 respondents or 89%), mostly female (118 respondents or 81%), and most had a master’s or doctoral degree (114 respondents or 79%). in terms of age distribution, more than half of the participants were over 50 (79 respondents or 55%) (see table 1). nearly all (136 respondents or 94%) were residents of new jersey. among the educators who participated in this survey and evaluated the educators’ portal, 56% (18 respondents) worked at colleges or universities, 16% (5 respondents) worked at high schools, 13% (4 respondents) worked at elementary or middle schools, and 6% (2 respondents) identified themselves as specialists in museums, libraries, or archives.
roughly a third (10 respondents or 31%) were teachers, 3% (1 respondent) was a teaching assistant, 13% (4 respondents) were school administrators, and 28% (9 respondents) were school library media specialists or librarians (see table 2). in terms of what they teach, 27% (7 respondents) teach new jersey history, 23% (6 respondents) teach social studies, 12% (3 respondents) teach civics, 8% (2 respondents) teach geography, and 8% (2 respondents) teach popular culture. as to the survey participants who identified themselves as cultural heritage professionals, 61% (17 respondents) worked at libraries, 11% (3 respondents) worked at museums, 11% (3 respondents) worked at historical societies, and 4% (1 respondent) worked with archives. in terms of their roles at those organizations, 61% (17 respondents) said they were faculty or staff, 18% (5 respondents) were administrators, one was a consultant, one was a librarian, and one was a volunteer (see table 3). n findings how do users find out about the njdh and will they come back? the survey found that more than half of the respondents (58 participants or 40%) learned about the njdh from their colleagues or friends, 19 participants (13%) learned through attending conferences, 16 participants (11%) were linked from other websites (see figure 1). the njdh digital library intends to build rich and “one stop shop” digital collections of new jersey history and culture. cultural heritage digital library plays a particularly important role for students of the humanities because the digital library is the humanist’s laboratory, its resources are the scholar’s primary data.14 it is important to enhance users’ awareness of this digital library among new jerseyans and even promote this cultural heritage digital library to users at global level. table 1. 
demographic data (n = 145) total % gender male 27 18.6 female 118 81.4 age 18–24 1 0.7 25–49 63 44.1 50–64 74 51.7 65+ 5 3.5 ethnic background white 127 89.4 african american 5 3.5 asian 6 4.2 hispanic 3 2.1 native american 1 0.7 education high school 5 3.4 associate’s degree 7 4.8 bachelor’s degree 19 13.1 master’s or phd degree 114 78.6 in terms of the purposes of visiting the njdh, the study found 72 respondents (76%) were just browsing and 23 respondents (24%) were looking for specific information such as a specific county information, history, and family genealogy (see figure 2). seventy-two respondents (74%) replied that they will come back to use the njdh again (see figure 3). those who said “no” gave reasons such as their doubts on whether the information in the njdh is reliable and authoritative, the depth and breadth of content in this digital library, and the inconsistency of fonts and font sizes. n navigation navigation has been reported in literature as a common problem in a digital library. users could accidentally leave the digital library, following the links to other web-based resources, and were unaware that they were no longer using the digital library. brinck, gergle, and wood report that disorientation is among the biggest frustrations for web users.15 20 information technology and libraries | december 2008 average 2.54 on a 7-point likert scale, 1 being easy to navigate and 7 being difficult to navigate). twentythree participants (25%) marked 1 on the likert scale, 28 participants (30%) marked 2, and 26 participants (28%) marked 3. these brought the total of the top three points to 83%. the overall response regarding user lostness was also not a problem (response average 2.42 on a 7-point likert scale, 1 being not lost at all and 7 being very lost). only two participants expressed they were very lost and one expressed lost. 
the reasons that could lead to user lostness include the lack of material in the collections so far, the need for explanation of how relevance is ranked, the home page being text heavy and cluttered, the photos not being legible, the lack of author information in documents, no indication of a trail of how one got there, lengthy urls, the need for better chosen direct links instead of layered links, and patrons’ unfamiliarity with icons and their functions. n layout the rating for the layout of the njdh was very positive (response average 2.54 on a 7-point likert scale, 1 being good and 7 being bad). however, the site may improve its appearance in the following areas: there is currently too much text per page (the font is too small and the use of typography, informational hierarchy, and white space must be improved); more important information needs to go at the top of pages; and more colors need to be used. n terminology the degree to which users interact with a digital library depends on how well users understand the terminology displayed on the system interface. literature review has indicated that the inappropriate use of jargon has been a common problem in digital library design. hartson, shivakumar, and pérez-quinones report from their usability inspection of the networked computer science technical reference library (www.ncstrol.org) that problems with wording accounted for 36% of the digital library’s usability problems.16 system designers often assume too much about the extent of user knowledge. the precise use of words in a user interface is one of the utmost important design considerations for usability. table 2. educators’ demographic data (n = 32) total % institutions university or college 18 56 high school 5 16 elementary or middle school 4 13 museums and others 2 6 no answer 3 9 total 32 100 roles teacher 10 31 teaching assistant 1 3 administrator 4 13 librarian 9 28 no answer 8 25 total 32 100 table 3. 
cultural heritage professional’s demographic data (n = 28)

                              total     %
institutions
  library                        17    61
  museum                          3    11
  historical society              3    11
  archives                        1     4
  others or no answer             4    14
roles
  faculty or staff               17    61
  administrator                   5    18
  consultant                      1     4
  librarian                       1     4
  volunteer                       1     4
  no answer                       3    11

this survey found the overall response regarding the navigation of the njdh was very positive. this research also found that the overall response regarding terminology and labeling in the njdh was positive (response average 2.34 on a 7-point likert scale, 1 being clear and 7 being not clear).

usefulness

usefulness was the fundamental research focus of this study. this research investigated whether the njdh was useful to the general public, educators, and students. the responses were overwhelmingly positive: 73% of the respondents gave 1–3 ratings on the 7-point likert scale (1 being useful and 7 being not useful): 30% (29 respondents) marked 1, 33% (32 respondents) marked 2, and 12% (12 respondents) marked 3. the average response was 2.63.

in the section where educators evaluated the educator’s portal, the rating was also positive (response average 3.04). those educators felt that the most useful information was the “how to” information for teaching with digital resources, researching genealogy, developing an oral history, and so on. twelve respondents (44%) indicated they would encourage their students to use the njdh site for term papers or homework assignments. thirteen respondents (50%) indicated they would make their own lesson plans using the resources and information from the njdh.

regarding the student’s portal, the educators who responded to the survey indicated that, from their perspective, the most useful information for students was the general information about new jersey, including a directory of cultural heritage organizations, places to visit, etc.
as for the librarians and curators’ portal, the cultural heritage professionals identified the librarians and curators’ resource center as the most useful resource in the njdh, followed by the digital highway collections roadmap and associated guidelines, the calendar, the searching capabilities of new jersey cultural heritage organizations, and new jersey information. sixteen respondents (67%) said they would recommend this digital library to their patrons, two respondents (8%) said they would not, and six respondents (25%) were not sure. the njdh administrators need to work harder in this area to enhance usefulness for cultural heritage professionals and their patrons.

the survey asked all respondents to suggest what themes should be enriched in the njdh collections. the suggestions were, in this order: new jersey history, new jersey state and county documents, new jersey culture, genealogy, everyday life in new jersey, new jersey industry, more immigration resources, education in new jersey, new jersey in wartime, and transportation.

regarding the librarians and curators’ portal, the respondents suggested the contents of this particular portal should be enhanced in the following priority order: (1) more links to other websites with history resources and activities, (2) access to mentors experienced in digitizing and metadata who can provide one-to-one assistance, (3) a discussion list or blog where users can ask questions or share ideas with others, (4) information about training sessions around new jersey on digitization and metadata, (5) more resources on digital preservation and metadata, (6) educational activities that users can share with their patrons, (7) a tool for users to create their own interactive activities using the njdh resources, and (8) more information about helping patrons to use the njdh more effectively.

figure 1. where did you hear about njdh?
portal structure

the njdh provides four portals for different target users: everyone, educators, students, and librarians and curators. each portal provides a different interface and packages different information for a different type of user. the survey found 80% of the subjects understood the purpose of the four portals (by marking 1 or 2 on the 7-point likert scale) and only 4 participants (4%) found this type of portal structure confusing. the survey further found 65% of participants felt this kind of portal structure was helpful to them.

why not contributing to the njdh collections?

the respondents indicated that the barriers for them to contribute collections or resources to the njdh were, in this order: (1) lack of staff or time, (2) lack of funding, (3) lack of knowledge, and (4) copyright concerns.

statistical analyses

the study found demographic factors, such as age, gender, ethnic background, and educational level, do not have significant effects on a number of areas: (1) how the participants ranked usefulness of the digital library, (2) usefulness evaluation of the four-portal structure, (3) understanding of terminology, (4) ease of navigation, and (5) lostness. the study found the correlation between navigation and lostness was statistically significant: r(66) = .83, p < .001. when a user felt the system was easy to navigate, the user felt less lost. the study also found usefulness of the digital library has a statistically significant effect on a user’s return decision. a one-way analysis of variance was conducted. the analysis of variance was significant, f(2, 59) = 20.42, p < .001. the strength of relationship between usefulness ranking and the decision of whether to revisit the digital library, as assessed by η², was strong, with the usefulness factor accounting for 41% of the variance of the return decision.
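the reported effect size can be checked from the f statistic and its degrees of freedom. the following is a minimal python sketch, assuming the standard one-way anova identity η² = (f × df_between) / (f × df_between + df_within) and the usual t transformation of a pearson correlation; the function names are illustrative and are not part of the study.

```python
import math


def eta_squared(f_stat: float, df_between: int, df_within: int) -> float:
    """Effect size for a one-way ANOVA, recovered from F and its degrees of freedom."""
    explained = f_stat * df_between
    return explained / (explained + df_within)


def t_from_r(r: float, df: int) -> float:
    """t statistic for testing a Pearson correlation r, where df = n - 2."""
    return r * math.sqrt(df / (1.0 - r * r))


# Values reported in the text: F(2, 59) = 20.42 and r(66) = .83
print(f"eta squared: {eta_squared(20.42, 2, 59):.2f}")
print(f"t for r(66) = .83: {t_from_r(0.83, 66):.2f}")
```

with the values from the text, the first function returns about 0.41, which agrees with the 41% of variance reported above.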
because the overall f test was significant, follow-up tests were conducted to evaluate pairwise differences among the means. using the tukey test, the pairwise comparisons yes vs. no and yes vs. not sure were significant. the pairwise comparison no vs. not sure was not significant.

conclusions

usability evaluation is a user-centered evaluation to learn about users’ needs, expectations, and satisfaction. this research studied usefulness, navigation, user lostness, terminology, and layout. the overall response was positive, and the finding was that the njdh was useful in providing new jersey history and culture information. designers of the njdh learned from the study the priorities for adding various new jersey themes to the collections and how to make the site easier to use. as a result of the study, lifelong learners were identified as an important target audience. this research provided insights on why people came to use this particular digital library, how much they enjoyed using it, and how to improve ease of use, navigation, website appearance, and the use of terminology and labeling. the front page of the website was redesigned to address the overuse of text on each page. the study also helped to discover what components of the site were more useful and why. furthermore, it investigated why some museums or collections in new jersey have not participated in this digital library development project. as a result of the study, more emphasis has been placed on building tools to increase independent collection contribution by museums and archives. the observations of this study may help the development of other academic digital libraries because the barriers found in the study are common obstacles.

figure 2. purpose of the most recent visit

figure 3. will you use njdh again?
eighteen months after the study, the njdh governance planning committee still uses the evaluation report to address more complex and fundamental changes and the reorganization of the digital library. the study confirmed that users of this digital library appreciated the idea of providing different portals for different users. the study did not find that demographic factors (age, gender, ethnic background, and educational level) play statistically significant roles in the usefulness rankings of the digital library or in portal structure, terminology, ease of use, or user lostness. the study found there was a strong correlation between ease of navigation and user lostness. users do not feel lost when a system is easy to navigate. the study also found users will come back to revisit a digital library when they find the site useful.

acknowledgments

judy jeng and grace agnew were the codesigners of the questionnaire for this study. judy jeng served as the evaluation consultant for the njdh. grace agnew, the associate university librarian for digital library systems at rutgers university, was the principal investigator of the njdh. the njdh received funding from institute of museum and library services grant lg30-03-0269-03.

references

1. linda langschied, “history and high-tech intersect on the new jersey digital highway,” www.imls.gov/profiles/nov07.shtm (accessed aug. 12, 2008).
2. linda langschied and ann montanaro, “the new jersey digital highway: a next-generation approach to statewide digital library development,” microform & imaging review 34, no. 4 (2005): 167–73.
3. the new jersey digital highway: final report on imls grant #lg30-03-0269-03, www.njdigitalhighway.org/documents/njdh-final_report_www_version.pdf (accessed aug. 12, 2008).
4. ibid.
5. minerva working group 5, handbook for quality in cultural web sites improving quality for citizens: version 1.2—draft.
(2003), www.minervaeurope.org/publications/qualitycriteria1_2draft/qualitypdf1103.pdf (accessed aug. 12, 2008).
6. elina vaki, costis dallas, and christina dalla, calimera: cultural applications: local institutions mediating electronic resources: deliverable d 18: usability guidelines, www.calimera.org/lists/resources%20library/the%20end%20user%20experience,%20a%20usable%20community%20memory/usability%20guidelines.pdf (accessed aug. 12, 2008).
7. emmanouel garoufallou, rania siatri, and panagiotis balatsoukas, “virtual maps—virtual worlds: testing the usability of a greek virtual cultural map,” journal of the american society for information science and technology 59, no. 4 (2008): 591–601.
8. clifford lynch, “digital collections, digital libraries and the digitization of cultural heritage information,” first monday 7, no. 5 (2002), www.firstmonday.org/issues/issue7_5/lynch/ (accessed aug. 12, 2008).
9. garoufallou, siatri, and balatsoukas, “virtual maps—virtual worlds,” 591–601.
10. fred d. davis, “perceived usefulness, perceived ease of use, and user acceptance of information technology,” mis quarterly 13, no. 3 (1989): 319–40; fred d. davis, richard p. bagozzi, and paul r. warshaw, “user acceptance of computer technology: a comparison of two theoretical models,” management science 35, no. 8 (1989): 982–1003.
11. judy jeng, “usability of the digital library: an evaluation model” (phd diss., rutgers university, 2006): 10–19; judy jeng, “usability assessment of academic digital libraries: effectiveness, efficiency, satisfaction, and learnability,” libri: international journal of libraries and information services 55, no. 2/3 (2005): 96–121; judy jeng, “what is usability in the context of the digital library and how can it be measured?” information technology and libraries 24, no. 2 (2005): 47–56.
12.
rita leigh thomas, “elements of performance and satisfaction as indicators of the usability of digital spatial interfaces for information-seeking: implications for isla” (phd diss., univ. of southern california, 1998); judy jeng, “usability of the digital library: an evaluation model” (phd diss., rutgers university, 2006): 33.
13. yin-leng theng, mei-yee chan, ai-ling khoo, and raju buddharaju, “quantitative and qualitative evaluations of the singapore national library board’s digital library,” in design and usability of digital libraries: case studies in the asia pacific, ed. yin-leng theng and schubert foo (hershey, pa.: information science publishing, 2005): 334–49; n. meyyappan, schubert foo, and g. g. chowdhury, “design and evaluation of a task-based digital library for the academic community,” journal of documentation 60, no. 4 (2004): 449–75; janice lodato, “creating an educational digital library: grow a national civil engineering education resource library,” (paper presented at the conference on human factors in computing systems, vienna, austria, apr. 24–29, 2004), in the acm digital library, http://portal.acm.org/citation.cfm?id=985942&coll=portal&dl=acm&cfid=32427354&cftoken=28824529 (accessed aug. 12, 2008); brian detlor et al., fostering robust library portals: an assessment of the mcmaster university library gateway (hamilton, ont.: michael g. degroote school of business, mcmaster university, 2003); álvaro quijano-solís and raúl novelo-peña, “evaluating a monolingual multinational digital library by using usability: an exploratory approach from a developing country,” the international information & library review 37, no. 4 (2005): 329–36; eileen quam, “informing and evaluating a metadata initiative: usability and metadata studies in minnesota’s foundations project,” government information quarterly 18, no.
3 (2001): 181–94; judy jeng, “metadata usefulness evaluation of the moving image collections” (paper presented at the new jersey library association annual conference, long branch, new jersey, apr. 23–25, 2007), www.njla.org/conference/2007/presentations/metadata.pdf (accessed aug. 12, 2008).
14. gregory crane and clifford wulfman, “towards a cultural heritage digital library,” proceedings of the 3rd acm/ieee-cs joint conference on digital libraries, in the acm digital library, http://delivery.acm.org/10.1145/830000/827150/p75-crane.pdf?key1=827150&key2=9784876911&coll=acm&dl=acm&cfid=8598346&cftoken=44546164 (accessed aug. 12, 2008).
15. tom brinck, darren gergle, and scott d. wood, designing web sites that work: usability for the web (san francisco: morgan kaufmann, 2002).
16. h. rex hartson, priy a. shivakumar, and manuel a. pérez-quinones, “usability inspection of digital libraries: a case study,” international journal on digital libraries 4, no. 2 (2004): 108–23.

highlights of minutes
information science and automation division
board of directors meeting
1973 midwinter meeting
washington, d.c.

monday, january 29, 1973

the meeting was called to order by president ralph shoffner at 8:10 a.m. the following were present: board-ralph m. shoffner (chairman), richard s. angell, don s. culbertson (isad executive secretary), paul j. fasana, donald p. hammer, susan k. martin, and berniece coulter, secretary, isad. committee chairman-stephen r. salmon. guests-charles stevens and david weisbrod.

report of national commission on library and information science. mr. charles stevens, executive director of the national commission on library and information science, discussed the commission's priorities and objectives for planning library and information services for the nation.
the commission has identified six areas of activity in which to conduct investigations in relation to its charge, which is to study " ... library and information services adequate to meet the needs of the people of the united states." these six areas are: (1) understanding information needs of the users; (2) adequacies and deficiencies of current library and information services; (3) patterns of organization; (4) legal and financial restrictions on libraries; (5) technology in library and information systems; and (6) human resources.

report to ala planning committee. the report to the ala planning committee on isad's long range plans was deferred until after the isad objectives committee report is received in june.

objectives committee interim report. mr. stephen salmon, chairman, provided an interim report of the committee. the committee will recommend that the division continue to exist and will list its proposed objectives, which may differ from the original objectives. at the request of louise giles, chairman of the information technology discussion group, special attention will be given to that group's interests in formulating the statement of objectives.

membership survey committee. mr. shoffner relayed ms. pope's report that the membership survey will cost $700.00, which is not available in the current budget. mr. culbertson said that the cost could be decreased by surveying a sample of 1,000 members. the decision was to request the full amount for the survey, to be performed in the coming fiscal year.

asidic representative. mr. peter t. watson, through correspondence with mr. shoffner, reported that asidic is interested in liaison with ala, and was concerned with the possibility of accomplishing this through isad. mr. culbertson reported that asidic could become an affiliate of ala for a $40.00 fee, but that isad could recommend a formal liaison, especially if isad and asidic had similar interests.

motion.
it was moved by paul fasana that this matter of asidic liaison with ala be passed on to the executive director, mr. robert wedgeworth, and that the president of isad write and inform him of such. seconded by richard angell. carried.

policy statement on privacy of data processing records. mr. culbertson had been approached about isad's making a statement on broad issues of data processing, including privacy. the ala washington office has made known a need for such a statement on which to base its stand in certain hearings. mr. hammer felt it very appropriate that the association (ala) take a position on it. mr. weisbrod mentioned that isad could be involved because of the vulnerability of machine-readable files due to the large quantity of data processed.

motion. it was moved by paul fasana that the isad board recommend to the ala council that it (ala) develop some policy expressing its membership's attitude toward the privacy of machine-readable data. seconded by donald hammer. carried.

jola editor. mr. shoffner reported, concerning the appointment of an editor, that two contacts were outstanding and he would report to the board on wednesday. mr. culbertson has been serving as temporary editor. mr. fasana noted that the schedule for 1972 was for four issues, but only one had appeared. he asked what plans there were to catch up or cancel. mr. culbertson said that legally isad could not cancel any issues, and that a statement had been written for the "memo to members" section of american libraries. he also mentioned the previous board action to have jola technical communications become a part of the 1973 volume.

wednesday, january 31, 1973

mr. shoffner called the meeting to order at 10:00 a.m. those present were: board-ralph m. shoffner (chairman), richard s. angell, don s. culbertson (isad executive secretary), paul j. fasana, donald p. hammer, susan k. martin, and berniece coulter, secretary, isad.
committee chairmen-brigitte kenney, ronald miller, and velma veneziano. guest-peter watson.

journal of library automation vol. 6/1 march 1973

conference planning committee report. mrs. susan martin, chairman, reported that the 1972 seminar on telecommunications had been successful, and the april seminar with the national microfilm association in detroit was proceeding as scheduled. the seminar on the national libraries, originally scheduled for january, and the seminar on networks which was to be in march had both been postponed until the next fiscal year. planning of the las vegas preconference program is continuing smoothly; the institute is to be concerned with a review of the state-of-the-art of library automation. it will update the isad preconference institute of 1967.

isad/led education committee report. a written report was submitted. (see exhibit 1.)

rtsd/isad/rasd representation in machine-readable form of bibliographic information committee report. chairman velma veneziano reported that as a result of a jola technical communications announcement that the committee meeting was open and that there would be discussion of the controversial international standard bibliographic description (isbd), 200-300 persons attended the committee meeting. the committee felt that changes such as isbd in the marc records by the library of congress should take into account the users of the marc distribution service. committee action on the isbd was delayed until the isbd for serials proposal was further along. it was stated that the isbd for serials should be as consistent as possible with the isbd for monographs. the committee suggested that each division publish these standards in its journal.

motion. it was moved by paul fasana that the isad board suggest to the jola editorial board that discussion drafts of standards be published in the journal of library automation. seconded by donald hammer. carried.

mrs.
veneziano pointed out that a resolution was passed concerning the formation of an ad hoc task force for a period of two years. the task force would work with emerging standards relating to character sets: greek and cyrillic alphabets; mathematical and logical symbols; and control characters relating to communications. three persons were suggested for the task force: charles payne of the university of chicago, david weisbrod of yale, and michael malinconico of the new york public library, in addition to lucia rather and henriette avram of the library of congress. the task force would report back to the board through the committee.

motion. it was moved by paul fasana that isad consider the creation of a task force to work with emerging standards relating to character sets and the insertion of a fund request in the isad budget for $1,060 ($700 for 2 trips for 3 persons and $360 per diem for 3 persons for 2 days for each of 2 trips). seconded by donald hammer. carried.

the committee wished to go on record, since rtsd had recently formed a committee on computer filing, that computer filing rules were a function of the interdivisional committee on representation in machine-readable form of bibliographic information.

the subject of library codes was discussed. bowker was assigning numeric codes to libraries, book publishers, and book dealers. the committee is concerned about standards and does not wish to see the creation of systems of incompatible codes.

telecommunications committee report. brigitte kenney, chairman, submitted a written 18-month report of the committee (exhibit 2). miss kenney announced that she was resigning as chairman of the committee and that no present member was available to assume the chairmanship. mr. hammer, as president-elect, was charged with appointing the next chairman.
the function statement of the telecommunications committee has been grouped into four areas: (1) communication to members; (2) training; (3) legislative matters; and (4) research. she pointed out that both jola technical communications and american libraries had said in writing that they would accept articles on telecommunications, particularly cable tv, and had accepted none. also, she had attempted for a year and a half to assemble an information packet at ala headquarters, but did not know the status of the project. headquarters had requested guidelines on cable policy from the committee; she stated that they had not succeeded in completing this task. no guidelines had been provided. after isad and ala sources did not respond to a request to publish a cable newsletter, the american society for information science was approached. the asis council approved this the previous friday and she had obtained seed money from the markle foundation.

miss kenney referred to the resolution introduced that afternoon in council that an ad hoc ala committee be established to address itself exclusively to cable matters and be representative of all units of ala, and that it take on very specific tasks with clearly delineated time limits. she further stated that she had not felt that isad had given adequate support to the isad telecommunications committee's activities, and thought that the board would have to decide if this was an appropriate committee for isad. if so, was the function statement too broad? should it be narrowed to just data transfer? miss kenney also suggested that the committee be expanded in size to include more people involved in telecommunications.

in the discussion which followed it was indicated that it could take from two to three years to set up a committee in ala as an interdivisional committee. it was decided that a committee chairman should be found and that the board could then work with the chairman in the definition of the tasks to be performed.

publishing of minutes. it was decided that the board of directors express to the editorial board their desire that the minutes of board meetings be published in the journal.

seminar and institute topics committee report. ronald miller, chairman, enumerated the following points of the committee's meeting: that (1) a long range plan for seminar programs be written to cover the period from july 1974 through june 1978; (2) part of the money from the institutes be budgeted to support a professional staff person at ala headquarters to handle the burden of the work; (3) policy be established concerning commercial groups using isad programs for a marketing channel, particularly products of use to libraries; (4) institutes or seminars be regionalized in the u.s. and canada; and (5) liaison efforts be utilized (a) within the network of ala, (b) through subcontractors, and (c) through continuing education programs of library schools or other institutes of higher education. in the discussion by the board it was agreed that a written document, both specific and general, be put before the isad membership concerning future seminar and institute topics in order to obtain reactions.

jola editor appointed. motion. it was moved by donald hammer that the board approve the appointment of susan k. martin as editor of the journal of library automation. seconded by paul fasana. carried.

tribute to don culbertson. "the board commended don s. culbertson for long, energetic and useful service to isad."

exhibit 1
january 23, 1973
isad/led education committee report

the isad/led education committee met sunday, january 28, at 9:30 a.m. in the garden restaurant of the shoreham hotel. present were members james liesener, robert kemper, gerald jahoda, edward heiliger, and (ex officio) ralph shoffner. absent were ann painter and duane johnson.
discussion focused on disc (developmental information science curriculum), what has been achieved by the disc contingent working under the aegis of asis, and how isad/led could contribute to achieving the disc objective of producing transferable "modules" or packaged programs for information science teaching. it was decided that to reach this objective what would be required were:
(1) an overall structure or frame of reference which could be used to coordinate modules developed by interested and dedicated individuals.
(2) specifications for module construction.

re 2-it was decided to await the completion of modules currently being developed by charles davis and david buttz and to examine these (at las vegas) as providing guidelines for module specifications.

re 1-it was suggested by ralph shoffner that a frame of reference might be achieved, with some dispatch, by drawing up a list of about 20 questions in the area of information science, which library schools might expect their graduating students to answer, each question being answerable in no more than an hour. the idea was that modules might be designed around these questions. also, it was seen that these questions might serve a useful purpose in organizing information science teaching in light of professional program evaluation and accreditation.

the suggestion of "questions" was enthusiastically received and the following day gerald jahoda, edward heiliger and charles davis drew up a "sample" list of questions and outlined the following procedure:
(1) the sample list of questions is sent to isad/led education committee members as well as to asis sig/eis and asis education committee members for recommendations in the way of additions, deletions and word revisions. by february 15, 1973.
(2) the questions are revised and edited by an ad hoc committee consisting of interested members of the three committees involved. by march 30, 1973.
(3) the revised list of questions is sent to accredited library schools in the u.s. and canada for additions, deletions and word revisions. by april 15.
(4) isad/led education committee members together with invited members from the asis committees involved revise the question list at las vegas.
(5) designating potential module constructors for each of the questions on the final question list. formulation of module specifications at las vegas. immediately after las vegas the designated module constructors will be solicited. they will be sent a "question" together with module specification.

this is where we are january 29, 1973.

respectfully submitted,
elaine svenonius

exhibit 2
telecommunications committee annual report 1972/73

1. communications:
a. cable newsletter: after exhausting every possible avenue within ala (amlibs, jola technical communications, headquarters clearinghouse, information packet) the chairman received the mandate considered necessary to go ahead with plans for an effective communications medium. the mandate came in the form of a unanimous resolution from the 104 attendees at the cable institute, held in september, to produce such a newsletter. isad board approval/endorsement was obtained, and the chairman approached asis, which will publish the newsletter. start-up money was obtained from the markle foundation for the first promotional issue, which will receive widest distribution. based on response to the initial mailing, the newsletter will continue on a subscription basis, provided 750 subscriptions are obtained. the chairman and two other people will volunteer their time as coeditors.
b. the chairman has been operating a clearinghouse on cable information out of her office, which has become incredibly time-consuming. it is impossible for one person to do all that is needed; innumerable letters have been written and phone conversations held with people and groups wanting advice on dozens of issues connected with cable.
it is hoped that the newsletter, the proceedings of the cable institute, and a soon-to-be-established task force within srrt on cable will lessen the almost impossible load.
c. specific letters were written in response to requests from the rocky mountain federation (justification of library use of the ats-f satellite) and senator mike gravel (who introduced several bills on telecommunications and wanted to know what libraries could do with this medium), and a presentation will be made to the national commission hearings in new york.
d. a librarian-representative was located, suggested, and subsequently appointed to the fcc federal-state-local advisory committee on cable. (a first for librarians!)
e. liaison was maintained with nonlibrary groups: publicable, of which the chairman is a member, the mitre symposium on cable, to which the chairman was invited, and, as a result of that meeting, the aspen workshop on cable in palo alto, which the chairman attended by invitation from douglass cater, together with eight other people, to decide on the direction this activity should take. at all three meetings the chairman attempted to represent the library viewpoint on cable.
f. a las vegas program was to be planned, together with acrl and the av committee. plans did not materialize, and the committee is being approached by the soon-to-be-established srrt task force on cable to cosponsor a program on cable at las vegas.

2. training:
1. institute on cable television for librarians: held september 17-20, 1972, and attended by over 100 librarians from thirty-four states, representing public and state libraries primarily, this was directed by the chairman, and funded by usoe. russell shank and frank norwood, consultants to the tc committee, presented major talks. the entire institute was videotaped and the tapes are available. proceedings will be issued in march as a double issue of the drexel library quarterly.
the institute was designed to provide a format and material (including videotaped presentations) to allow others to do their own institutes.
2. telecommunications seminar: conducted by russell shank, consultant to the tc committee, it presented an overview of various aspects of telecommunications. held in washington september 25-26, 1972, it, too, was attended by almost 100 librarians from all types of libraries. the chairman and frank norwood, consultant, participated in the presentation of papers.
3. legislative matters: the committee expressed its concern to the ala legislation committee about the lack of sufficient personnel to keep abreast of legislative and regulatory matters affecting telecommunications. the chairman of the legislation committee responded by stating that the ala washington office had been trying to do its best, in the absence of funding for additional personnel, and would continue to do so. the committee attempts to follow legislative and regulatory developments in the telecommunications area, and works closely with the washington office in this activity, providing persons to testify, and supplying two of the four members of the subcommittee on copyright (shank and kenney). the committee participated actively in the revision of the ala policy booklet, concerning itself with matters pertaining to networks and telecommunications. all recommendations were incorporated in the final draft of this document.
4. research: the telecommunications requirements study, long ago proposed, is dormant. shank and kenney are actively working on putting together a proposal to respond to a call for proposals from nsf in the area of telecommunications policy research. the committee will discuss the proposal during the midwinter meeting, 1973.
respectfully submitted, brigitte kenney
catqc and shelf-ready material | jay, simpson, and smith 41
michael jay, betsy simpson, and doug smith
catqc and shelf-ready material: speeding collections to users while preserving data quality
libraries contract with vendors to provide shelf-ready material, but is it really shelf-ready? it arrives with all the physical processing needed for immediate shelving, then lingers in back offices while staff conduct item-by-item checks against the catalog. catqc, a console application for microsoft windows developed at the university of florida, builds on oclc services to get material to the shelves and into the hands of users without delay and without sacrificing data quality. using standard c programming, catqc identifies problems in marc record files, often applying complex conditionals, and generates easy-to-use reports that do not require manual item review.
a primary goal behind improvements in technical service workflows is to serve users more efficiently. however, the push to move material through the system faster can result in shortcuts that undermine bibliographic quality. developing safeguards that maintain sufficiently high standards but don’t sacrifice productivity is the modus operandi for technical service managers. the implementation of oclc’s worldcat cataloging partners (wcp, formerly promptcat) and bibliographic record notification services offers an opportunity to retool workflows to take advantage of automated processes to the fullest extent possible, but also requires some backroom creativity to assure that adequate access to material is not diminished.
literature review
quality control has traditionally been viewed as a central aspect of cataloging operations, either as part of item-by-item handling or manual and automated authority maintenance. how this activity has been applied to outsourced cataloging was the subject of a survey of academic libraries in the united states and canada. a total of 19 percent of libraries in the survey indicated that they forgo quality control of outsourced copy, primarily for government documents records. however, most respondents reported they review records for errors. of that group, 50 percent focus on access points, 30 percent check a variety of fields, and a significant minority, 20 percent, look at all data points. overall, the libraries expressed satisfaction with the outsourced cataloging using the following measures of quality supplied by the author: accuracy, consistency, adequacy of access points, and timeliness.1
at the inception of oclc’s promptcat service in 1995, ohio state university libraries participated in a study to test similar quality control criteria with the stated goals of improving efficiency and reducing copyediting. the results were so favorable that the author speculated that promptcat would herald a future where libraries can “reassess their local practices and develop greater confidence in national standards so that catalog records can be integrated into local opacs with minimal revision and library holdings can be made available in bibliographic databases as quickly as possible.”2 fast forward a few years and the new incarnation of promptcat, wcp, is well on its way to fulfilling this dream.
in a recent investigation conducted at the university of arkansas libraries, researchers concluded that error review of copy supplied through promptcat is necessary, but the error rate does not warrant discontinuance of the service.
the benefits in terms of time savings far outweigh the effort expended to correct errors, particularly when the focus of the review is to correct errors critical to user access. while the researchers examined a wide variety of errors, a primary consideration was series headings, particularly given the problems cited in previous studies and noted in the article.3 with the 2006 announcement by the library of congress (lc) to curtail its practice of providing controlled series access, the cataloging community voiced great concern about the effect of that decision on user access.4 the arkansas study determined that “the significant number of series issues overall (even before lc stopped performing series authority work) more than justifies our concern about providing series authority control for the shelf-ready titles.” approximately one third of the outsourced copy across the three record samples studied had a series, and, of that group, 32 percent needed attention, predominantly taking the form of authority record creation with associated analysis and classification decisions.5 the overwhelming consensus among catalogers is that error review is essential. as far as can be determined, an underlying premise behind such efforts seems to be that it is done with the book in hand. but could there be a way to satisfy the concerns without the book in hand? certainly, validation tools embedded in library management systems provide protections whether records are manually entered or batchloaded, and outsourced authority maintenance services (for those who can use them) offer further control. 
but a customizable tool that allows libraries to target specific needs, both standards-based and local, without relying on item-by-item handling can contribute to an economy of scale demanded by an environment with shrinking budgets and staff to devote to manual bibliographic scrutiny. if that tool is viewed as part of a workflow stream involving local error detection at the receiving location as well as enhancement at the network level (i.e., oclc’s bibliographic record notification service), then it becomes an important step in freeing catalogers to turn their attention to other priorities, such as digitized and hidden collections.
michael jay (emjay@ufl.edu) is information technology expert, software unit, information technology department; betsy simpson (betsys@uflib.ufl.edu) is chair, cataloging and metadata department; and doug smith (dougsmith@uflib.ufl.edu) is head, copy cataloging unit, cataloging and metadata department, george a. smathers libraries, university of florida, gainesville.
42 information technology and libraries | march 2009
local setting and workflow
the george a. smathers libraries at the university of florida encompasses six branches that address the information needs of a diverse academic research campus with close to fifty thousand undergraduate and graduate students. the technical services division, which includes the acquisitions and licensing department and the cataloging and metadata department, acquires and catalogs approximately forty thousand items annually. seeking ways to minimize the handling of incoming material, beginning in 2006 the departments developed a workflow that made it possible to send shelf-ready incoming material directly to the branches after check-in against the invoice. shelf-ready items represent approximately 30 percent of the libraries’ purchased monographic resources at this time.
by using wcp record loads along with vendor-supplied shelf-ready processing, the time from receipt to shelf has been reduced significantly because it is no longer necessary to send the bulk of the shipments to cataloging and metadata. exceptions to this practice include specific categories of material that require individual inspection. the vendor is asked to include a flag in books that fall into many of these categories:
- any nonprocessed book or book without a spine label
- books with spine labels that have numbering after the date (e.g., vol. 4, no. 2)
- books with cds or other formats included
- books with loose maps
- atlases
- spiral-bound books
- books that have the words “annual,” “biennial,” or a numeric year in the title (these may be a serial add to an existing record or part of a series that will be established during cataloging)
to facilitate a post-receipt record review for those items not sent to cataloging and metadata, acquisitions and licensing runs a local programming tool, catqc, which reports records containing attributes that cataloging and metadata has determined necessitate closer examination. figure 1 is an example of the reports generated, which are viewed using the mozilla firefox browser. copy catalogers rotate responsibility for checking the report and revising records when necessary. retrieval of the physical piece is only necessary in the 1 percent of cases where the item needs to be relabeled.
figure 1. an example report from catqc
catqc report
catqc analyzes the content of the wcp record file and identifies records with particular bibliographic coding, which are used to detect potential problems:
1. encoding levels 2, 3, 5, 7, e, j, k, m
2. 040 with non-english subfield b
3. 245 fields with subfields h, n, or p
4. 245 fields with subfields a or b that contain numerals
5. 245 fields with subfields a or b that contain red flag keywords
6. 246 fields
7. 490 fields with first indicator 0
8. 856 fields without subfield 3
9. 6xx fields with second indicators 4, 5, 6, and 7
the numbers following each problem listed below indicate which codes are used to signal the presence of a potential problem.
minimal-level copy (1)
the library’s wcp profiles, currently in place for three vendors, are set up to accept all oclc encoding levels. with such a wide-open plan, it is important to catch records with minimal-level copy to assure that appropriate access points exist and are coded correctly. the library encounters these less-than-full encoding levels infrequently.
parallel records (2)
catqc identifies foreign library records that are candidates for parallel record treatment by indicating in the report if the 040 has a non-english subfield b. the report includes a 936 field if present to alert catalogers that a parallel record is available.
volume sets (3, 4, 5)
the library does not generally analyze the individual volumes of multipart monographic sets (i.e., volume sets) even when the volumes have distinctive titles. these volumes are added to the collection under the title of the set. the june 2006 decision by lc to produce individual volume records when a distinctive title exists caused concern about the integrity of the libraries’ existing open volume set records. because such records typically have enumeration indicated in the subfield n, and sometimes p, of the 245 field, the program searches for instances of those subfields. in addition, the program detects the presence of numerals in the 245 and keywords such as “volume,” “part,” and “number” as well as common abbreviations of those words (e.g., v. or vol.).
serial vs. monograph treatment (4, 5)
titles owned by the library and classified as serials sometimes are ordered inadvertently as monographs, resulting in the delivery of a monographic record. a similar problem also occasionally arises with new titles. by detecting numerals, keywords, or the presence of one or more of the subfields in the 245 field, we can quickly scan a list of records with these characteristics. of course, most of the records detected by catqc are false hits because of the broad scope of the search; however, it takes only a few minutes to scan through the record list.
non-print formats (3)
the library does not receive records for any format other than print through wcp. consequently, detecting the presence of a subfield h in the 245 field is a good signal that there may be a problem with the record.
alternate titles (6)
alternate titles can be an important access point for library users. sometimes text that should properly be in subfield i (e.g., “at head of title”) of the 246 field is placed in subfield a in front of the alternate title. this adversely affects user access to the title through browse searching. catqc checks for and reports the presence of a 246 field. the cataloger can then quickly confirm that it is coded correctly.
untraced series (7)
as a program for cooperative cataloging (pcc) participant, the library opted to follow pcc practice to continue to trace series despite lc’s decision in 2006 to treat as untraced all series statements in newly cataloged records. because some libraries chose to follow lc in its decision, there has been an overall increase in the use of untraced series statements across all types of record-encoding levels. to address this issue, catqc searches all wcp records for 490 fields with first indicator 0. catalogers check the authority files for the series and make any necessary changes to the records. this is by far the most frequent correction made by catalogers.
links (8)
to provide users with information about the nature of the urls displayed in the catalog, catalogers ensure that explanatory text is recorded in subfield 3 of the 856 field.
catqc looks for the absence of subfield 3, and, if absent, displays the 856 field in the report as a hyperlink. the cataloger adds the appropriate text (e.g., full text) as needed.
subject headings with second indicators 4, 5, 6, and 7 (9)
the catqc report reviewed by catalogers includes subject headings with second indicator 4. when these headings duplicate headings already on the record, catalogers delete them from our local system. when the headings are not duplicates, the catalogers change the second indicator 4 to 0. typically, 6xx fields with second indicators 5, 6, and 7 contain non-english headings based on foreign thesauri. these headings can conflict with lc headings and, in some cases, are cross references on lc authorities. the resulting split files are not only confusing to patrons, but also add to the number of errors reported that require authority maintenance. for these reasons, our policy is to delete the headings from our local system. catqc detects the presence of second indicators 5, 6, or 7 and creates a modified file with the headings removed, with one exception: a heading with second indicator 7 and subfield 2 of “nasat,” which indicates the heading is taken from the national aeronautics and space administration thesaurus, is not removed because the local preference is to retain the “nasat” headings.
library-specific issues
catqc resolves local problems when needed. for example, when more than one lc call number was present on the record, the wcp spine manifest sent to the vendor used to contain the second call number, which was affixed to the item. when the wcp records were loaded into the library’s catalog, the first call number populated the holding. as a result, there was a discrepancy between the spine label on the book and the call number in the catalog.
prior to generating the report, catqc found multiple instances of call numbers in the records in the wcp file and created a modified file with the call numbers reordered so that the correct call number was used on the holding when the record was loaded.
previously, the library’s opac did not display the text in subfield 3 of the 856 field, which specifies the type of material covered by the link, and to the user it appeared that the link was to a full-text resource. this was particularly troublesome for records with lc links to table of contents, publisher descriptions, contributor information, and sample text. to prevent user frustration, catqc was programmed to move the links on the wcp records to 5xx fields. when the opac interface improved and the programming was no longer necessary, catqc was revised.
analysis
to see how well catqc and oclc’s bibliographic notification service were meeting our goal of maintaining high-quality bibliographic control, 63 reports were randomly selected from the 171 reports generated by catqc between october 2007 and april 2008. catqc found no problems in twelve (19 percent) of the selected reports. these twelve were not used in the analysis, leaving fifty-one catqc reports examined with at least one potential problem flagged for review. an average of 35.6 percent of the records in the sample of reports was flagged as requiring review by a cataloger. an average of thirteen possible problems was detected per report. of these, 55 percent were potential problems requiring at least some attention from the cataloger. the action required of the cataloger varied from simply checking the text of a field displayed in the report (e.g., 246 fields) to bringing up the record in aleph and editing the bibliographic record (e.g., verifying and correcting series headings or eliminating unwanted subject headings).
why the relatively high rate of false positives (45 percent)?
to minimize missing serials and volumes belonging to sets, catqc is designed to err on the side of caution. two of the criteria listed earlier were responsible for the vast majority of the false positives generated by catqc: 245 fields with subfields a or b that contain numerals and 245 fields with subfields a or b that contain red-flag keywords. clearly, if every record with a numeral in the 245 is flagged, a lot of hits will be generated that are not actual problems. the list of keywords was purposefully designed to be extensive. for example, “volume,” “vol.,” and “v.” are all triggers causing a record to be flagged. therefore a bibliographic record containing the phrase “volume cost profit analysis” in the 245 field would be flagged as a potential problem.
at first glance, a report filled with so many false positives may seem inefficient and burdensome for catalogers to use; however, this is largely mitigated by the excellent display format. the programmer worked closely with the copy cataloging unit staff to develop a user-friendly report format. each record is framed separately, making it easy to distinguish from adjoining records. potential problems are highlighted with red lettering, immediately alerting catalogers to what the potential problem might be. whenever a potential problem is found, the text of the entire field appears in the report so that catalogers can see quickly whether the field triggering the flag is an actual problem. it takes a matter of seconds to glance through the 245 fields of half a dozen records to see if the numeral or keyword detected is a problem. the catalogers who work with these reports estimated that it took them between two and three hours per month to both review the files and make corrections to bibliographic records.
a second component of bibliographic quality maintenance is oclc’s bibliographic record notification service.
this service compares newly upgraded oclc records with records held by the library and delivers the upgraded records to the library. because catqc flags records with encoding levels of 2, 3, 5, 7, e, j, k, and m, it was possible to determine if these records had, in fact, been upgraded in oclc. in the sample, thirty-three records were flagged because of the encoding level. no upgrade had been made to 21.2 percent of the records in oclc as of august 2008. upgrades had been made to 45.5 percent of the records. the remaining 33.3 percent of the records were manually loaded by catalogers in copy cataloging. these typically are records for items brought to copy cataloging by acquisitions and licensing because they meet one or more of the criteria for individual inspection discussed previously. when catalogers search oclc and find that the received record has not been upgraded, they search for another matching record. a third of the time, a record of higher quality than that received is found in oclc and exported to the catalog. the reason why the record of better quality is not harvested initially is not clear. it is possible that at the time the records were harvested both records were of equivalent quality and by chance one was enhanced over another. in no instance had any of the records originally harvested been upgraded (this is not reflected in the 21.2 percent of records not upgraded).
encoding level 8 records are excluded from catqc reports. because of the relatively quick turnaround for upgrades of this type of copy, the library decided to rely solely on the bibliographic record notification service.
technical specifications
catqc is a console application for windows. written in standard c, it is designed to be portable to multiple operating systems with little modification. no graphic interface was developed because (a) the users are satisfied with the current operating procedure and (b) the treatment of the records is predefined as a matter of local policy.
the user opens a command console (cmd.exe) and types “catqc”+space+“[name of marc file]”+enter. the corrected file is generated; catqc analyzes the modified file and creates the xml report. it moves the report to a reviewing folder on a file server across the lan and indicates to the user that it is terminating. modifications require action by a programmer; the user cannot choose from a list of options. benefits include a 100 kb file size and a processing speed of approximately 1,000 records per second. no quantitative analysis has yet been done related to the speed of processing, but to the user the entire process seems nearly instantaneous. the genesis of the project was an interest in the record structure of marc files brought about in the programmer by the use of earlier local automation tools. the project was speculative. the first experiment contained the programming structure that would become catqc. one record is read into memory at a time, and there is another array held for individual marc fields. conceptually, the records are divided into three portions—leader, directory, and dataset—when the need arises to build an edited record. initially there was no editing, only the production of the report. the generation of strict, valid xml is a significant aspect of catqc. an original document type was created, along with a corresponding cascading style sheet. the reports are viewable to anyone with an xml–capable browser either through file server, web server, or e-mail. (the current version of internet explorer does not fully support the style sheet syntax.) this continues to be convenient for the report reviewers because they do not have to be client application operators. see appendix a for an excerpt of a document instance and appendix b for the document type definition. 
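the record handling described above (one record in memory, divided into leader, directory, and data) and a few of the report criteria listed earlier can be sketched in a few lines. this is only an illustrative sketch in python, not the authors' c code: the function names (build_record, parse_record, flag_record), the flag wording, and the short keyword list are assumptions made for the example, and build_record exists only to produce a small test record. the layout it relies on is the standard marc 21 one: a 24-character leader with the base address of data in positions 12-16 and the encoding level in position 17, followed by 12-character directory entries (tag, field length, starting position).

```python
import re

FT, RT, SF = b"\x1e", b"\x1d", "\x1f"  # field/record terminators, subfield delimiter
FLAGGED_LEVELS = set("2357EJKM")        # encoding levels listed in the article
# Illustrative subset of red-flag keywords; the full local list is not published.
KEYWORDS = re.compile(r"\b(?:volume|part|number)\b|\b(?:vol|v)\.", re.IGNORECASE)


def build_record(encoding_level, fields):
    """Assemble a tiny MARC 21 record for the demo (hypothetical helper)."""
    directory, body = b"", b""
    for tag, data in fields:
        chunk = data.encode("utf-8") + FT
        directory += f"{tag}{len(chunk):04d}{len(body):05d}".encode("ascii")
        body += chunk
    base = 24 + len(directory) + 1          # leader + directory + its terminator
    total = base + len(body) + 1            # + record terminator
    leader = f"{total:05d}nam a22{base:05d}{encoding_level}a 4500".encode("ascii")
    return leader + directory + FT + body + RT


def parse_record(raw):
    """Split one record into its leader and a list of (tag, field data)."""
    leader = raw[:24].decode("ascii")
    base = int(leader[12:17])               # base address of data
    directory = raw[24:base - 1]            # 12-byte entries: tag, length, start
    fields = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12].decode("ascii")
        length, start = int(entry[3:7]), int(entry[7:12])
        data = raw[base + start:base + start + length - 1]  # drop terminator
        fields.append((entry[:3], data.decode("utf-8")))
    return leader, fields


def flag_record(leader, fields):
    """Return the potential problems found in one parsed record."""
    flags = []
    if leader[17].upper() in FLAGGED_LEVELS:
        flags.append(f"encoding level {leader[17]}")
    for tag, data in fields:
        if tag < "010" or len(data) < 2:    # control fields carry no indicators
            continue
        ind1, ind2 = data[0], data[1]
        subfields = [(s[0], s[1:]) for s in data[2:].split(SF) if s]
        codes = {code for code, _ in subfields}
        if tag == "245":
            flags += [f"245 subfield {c}" for c in "hnp" if c in codes]
            title = " ".join(v for c, v in subfields if c in "ab")
            if re.search(r"\d", title):
                flags.append("245 numeral")
            if KEYWORDS.search(title):
                flags.append("245 keyword")
        elif tag == "246":
            flags.append("246 present")
        elif tag == "490" and ind1 == "0":
            flags.append("490 first indicator 0")
        elif tag == "856" and "3" not in codes:
            flags.append("856 lacks subfield 3")
        elif tag.startswith("6") and ind2 in "4567":
            flags.append(f"{tag} second indicator {ind2}")
    return flags


# Demo record modelled on the appendix a excerpt (full-level, untraced series).
sample = build_record(" ", [
    ("245", "10" + SF + "adifference algebra /" + SF + "clevin alexander."),
    ("490", "0 " + SF + "aalgebras and applications ;" + SF + "vv. 8"),
])
leader, fields = parse_record(sample)
print(flag_record(leader, fields))  # ['490 first indicator 0']
```

the sketch only reports; it does not reproduce the editing steps (removing headings, reordering call numbers) or the xml report. keeping the checks in one function that returns plain strings mirrors the article's point that the treatment of records is fixed by local policy rather than chosen by the user at run time.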
catqc is not currently a generalized tool such as marcedit, a widely used marc editing utility that provides a standard array of basic capabilities: field counting, field and subfield deletion (with certain conditional checks), field and subfield additions, field swapping and text replacement, and file conversion to and from various formats such as marcxml and dublin core as well as between marc-8 and utf-8 encodings.6 marcedit continues to grow and does offer programmability that relies on the windows scripting host. this requires the user to either learn vbscript or use the wizards offered by marcedit.
the catqc development goal was to create a report, viewable through a lan or the internet, which alerts a group of catalogers to potential problems with specific records, often illustrating those problems. although it might have been possible to use a combination of marcedit capabilities and local programming to help achieve this goal, it likely would have been a more cumbersome route, particularly taking into consideration the multidimensional conditionals desired. it was deemed easier to write a program that addresses local needs directly in a language already familiar to the programmer.
as catqc evolved, it was modified to identify more potential problems and to do more logical comparisons as well as to edit the files as necessary before generating the reports. catqc addresses a particular workflow directly and provides one solution. it is procedural as opposed to event driven or object oriented. with version 1.3, the generic functions were extracted into marclib 1.0, a common object file format library. functions specific to local workflow remain in catqc. the program is freely available to interested libraries by contacting the authors.
as of this writing, the university of florida plans to distribute this utility under the gnu general public license version 3 (see www.opensource.org/licenses/gpl-3.0.html) while retaining copyright.
conclusion
catqc provides catalogers an easy way to check the bibliographic quality of shelf-ready material without the book in hand. as a result, throughput time from receipt to shelf is reduced, and staff can focus data review on problem areas: those affecting access or interfering with local processes. some of the issues addressed by catqc are of concern to all libraries while others reflect local preferences. the program could be easily modified to conform to those preferences. automation tools such as catqc are of key importance to libraries seeking ways to streamline workflows to the benefit of users.
references and notes
1. vinh-the lam, “quality control issues in outsourcing cataloging in united states and canadian academic libraries,” cataloging & classification quarterly 40, no. 1 (2005): 101–22.
2. mary m. rider, “promptcat: a projected service for automatic cataloging—results of a study at the ohio state university libraries,” cataloging & classification quarterly 20, no. 4 (1995): 43.
3. mary walker and deb kulczak, “shelf-ready books using promptcat and ybp: issues to consider (an analysis of errors at the university of arkansas),” library collections, acquisitions, & technical services 31, no. 2 (2007): 61–84.
4. “lc pulls plug on series authority records,” cataloging & classification quarterly 43, no. 2 (2006): 98–99.
5. walker and kulczak, “shelf-ready books.”
6. for more information about marcedit, see http://oregonstate.edu/~reeset/marcedit/html/index.php.
appendix a. catqc document instance excerpt
wcp file analysis: 201 records analyzed.
record: 71
oclc number: 243683394
timestamp: 20080824000000.0
245: 10 |a difference algebra /|c levin alexander.
245 h
245 n
245 p
numerals
keywords
490: 0 |a algebras and applications ;|v v. 8
. . .
appendix b. catqc document type definition
editor’s comments: odds and ends
bob gerrity
information technology and libraries | june 2016 1
this issue marks the midpoint of information technology and libraries’ fifth year as an open-access e-only journal. the move to online-only in 2012 was inevitable, as ital’s print subscription base was no longer covering the costs of producing and distributing the print journal. moving to an e-only model using an open-source publishing platform (the public knowledge project’s open journal systems) provided a low-cost production and distribution system that has allowed ital to continue publishing without requiring a large ongoing investment from lita. the move to open access, however, was not inevitable, and i commend lita for supporting that move and for continuing to provide a base subsidy that supports the journal’s ongoing publication. i also thank the boston college libraries for their ongoing support in hosting ital along with a number of other oa journals.
since ital is now open, access to it can no longer be offered as an exclusive benefit that comes with lita membership. regardless of the publishing model, though, ital has always relied on voluntary contributions of the time and expertise of reviewers and editors. i’d like to acknowledge the contributions of our past and current editorial board members, who play a key role in ensuring the ongoing quality and vitality of the journal. we will be adding a few additional board members shortly, to help ensure that review of submissions to the journal is completed as quickly and effectively as possible.
speaking of peer review, one of the recent innovative startups in the scholarly communication space is a company called publons, which tracks and verifies peer-review activity, providing a mechanism for academics to report (and possibly receive institutional credit for) their peer-review work, an undervalued part of the scholarly communication framework. (full disclosure: at university of queensland we are conducting a pilot project with publons, to integrate the peer-review activities of our academics into our institutional repository.)
in addition to new approaches to peer review, such as publons and academic karma, there are quite a few recent examples of innovations in various aspects of scholarly communication that are worth keeping an eye on. these include new collaborative authoring tools such as overleaf, impact-measurement tools such as impactstory, and personal digital library platforms such as readcube. on a broader scale, initiatives such as peerj are building open access publishing platforms intended to dramatically improve the efficiency of and drive down the overall costs of scholarly publishing.
february marked the 14th anniversary of a key trigger event in the open access movement: the launch of the budapest open access initiative in 2002.
bob gerrity (r.gerrity@uq.edu.au), a member of lita and the editor of information technology and libraries, is university librarian at the university of queensland, brisbane, australia.
http://ejournals.bc.edu/ojs/index.php/ https://publons.com/ http://academickarma.org/ https://www.overleaf.com/ https://impactstory.org/ https://www.readcube.com/ https://peerj.com/ http://www.budapestopenaccessinitiative.org/read
editor’s comments | gerrity doi: 10.6017/ital.v35i2.9462 2
much has happened in the 14 years since the budapest initiative, on various fronts:
- policy: introduction and widespread adoption of funder and institutional oa mandates;
- technology: development and widespread adoption of institutional repositories, recent development of mechanisms to facilitate the discovery of oa publications (e.g., share on the library side and chorus on the publisher side);
- publishing: establishment of new oa megajournals (e.g., plos, biomed central), embrace of hybrid oa models by mainstream commercial publishers.
yet despite all the hype, acrimony, and activity triggered by the oa movement, a recent analysis in the chronicle of higher education suggests the growth of oa has been slow and incremental: the percentage of research articles published annually in fully open-access format has increased at an average rate of around one percentage point a year, from 4.8% in 2008 to 12% in 2015. at this rate, the tipping point for oa still seems very far away. lots of energy has been and continues to be invested by different stakeholders in different approaches, and the green vs. gold argument still predominates.
recent developments suggest momentum is gaining for a more radical shift. in december 2015, the max planck institute, a key player in the launch of oa with the berlin declaration on open access in 2003, hosted the 12th version of its annual oa conference to further the discussion around open access. ironically, unlike previous meetings and seemingly in philosophical conflict with the underpinnings of the oa movement, the meeting was by invitation only.
given the topic, though, a “proposal to flip subscription journals to open access,” the closed nature of the meeting is understandable. underpinning the proposal was a 2015 paper from the max planck digital library that suggested that the amount of money currently being spent (largely by libraries) on journal subscriptions should be sufficient to fund research publication costs if applied to a “flipped” journal publishing business model, from subscription-based to gold open access.1
http://www.arl.org/focus-areas/shared-access-research-ecosystem-share#.v3xhlzn95ty http://www.chorusaccess.org/ http://chronicle.com/article/as-an-open-access-megajournal/234890 https://openaccess.mpg.de/berlin-declaration https://www.mpg.de/9202262/area-wide-transition-open-access
information technology and libraries | june 2016
in the netherlands, the university sector has adopted a national approach in negotiating deals with several major publishers (springer, sage, elsevier, and wiley) that allow dutch authors to publish their papers as gold oa, without additional charges (but, depending on the publisher, with limits on total numbers and/or which journals are available within the deals).2 the so-called “dutch deal” by the vsnu (association of universities in the netherlands) and ukb (dutch consortium of university libraries and royal library) takes a national approach to flipping the model, attempting to bundle access rights for dutch readers with apc credits for dutch authors. the dutch government, which currently holds the eu presidency, is pushing hard for a europe-wide adoption of this approach.
last month, the eu’s competitiveness council agreed that all scientific papers should be freely available by 2020.3
meanwhile, in the us, the “pay it forward” research project at the university of california is examining what the institutional financial impact of a flipped model would be. the study is looking at existing institutional journal expenditures on subscriptions and modeling what a future, apc-based model would look like based on institutional research publication output and estimated average apc charges.
who knows when or if a global flip might occur, but it does strike me that the scholarly publishing world is overdue for a major shakeup. from the point of view of a university librarian, focused on keeping journal subscription costs in line (unsuccessfully, i might add), i think there is a real danger in not considering what a flip to a gold model might look like. the commercial publishers we all complain about are successfully exploiting the gold model as an additional revenue stream which, for the most part, academic libraries have been ignoring, since the individual apcs typically are paid from someone else’s budget. this has allowed the overall envelope of spending on research publication (subscriptions and apcs) to grow significantly.
perhaps a more interesting question is what the impact of a flip on libraries would be. if gold oa became the predominant model, we would no longer need all of the complex systems we’ve built to manage subscriptions and user access. to quote homer simpson, “woohoo!”
in the “watch this space” arena, ebsco’s recently launched open-source library services platform (lsp) initiative is beginning to take shape.
it now has a name, folio (for future of the libraries is open), and as marshall breeding put it, the project “injects a new dynamic into the competitive landscape of academic library technology, pitting an open source framework backed by ebsco against a proprietary market dominated by ex libris, now owned by ebsco archrival proquest.”4 publicly listed participants in the project include (in addition to ebsco) ole, index data, bywater, bibliolabs, and sirsidynix.5 the platform release timetable calls for an initial “technical preview” release of the code for the base platform in august 2016, and an anticipated release of the apps needed to operate a library in early 2018.6
http://icis.ucdavis.edu/?page_id=286
1. ralf schimmer, kai karin geschuhn, and andreas vogler, disrupting the subscription journals’ business model for the necessary large-scale transformation to open access (2015), doi:10.17617/1.3.
2. frank huysmans, “vsnu-wiley: not such a big deal for open access,” warekennis (blog), march 1, 2016, https://warekennis.nl/vsnu-wiley-not-such-a-big-deal-for-open-access/.
3. martin enserink, “in dramatic statement, european leaders call for ‘immediate’ open access to all scientific papers by 2020,” science, may 27, 2016, doi:10.1126/science.aag0577.
4. marshall breeding, “ebsco supports new open source project,” american libraries, april 22, 2016, https://americanlibrariesmagazine.org/2016/04/22/ebsco-kuali-open-source-project/.
5. https://www.folio.org/collaboration.php.
6. https://www.folio.org/apps-timelines.php.
information technology and libraries | june 2007
i write my final president’s column a month after the midwinter meeting in seattle.
you will read it as preparations for the ala annual conference in washington, d.c. are well underway. despite that disconnect in time, i am confident that the level of enthusiasm will continue uninterrupted between the two events. indeed, the midwinter meeting was highly charged with positive energy and excitement. the feelings are reignited if you listen to the numerous podcasts now found on the lita blog. the lita bloggers and podcasters were omnipresent, reporting on all of the meetings and recording the musings of the lita top tech trendsters.
by the time you have read this you will have also, hopefully, cast your ballot for lita officers and directors after having had the opportunity to listen to brief podcast interviews with the candidates. the lita board approved the election podcasts at the annual conference in new orleans. thanks to the collaborative efforts of the nominating committee and the bigwig members, we have this new input into our voting decision-making.
the most exciting aspects of the midwinter meeting were the face-to-face networking opportunities that make lita so great. the lita happy hour crowd filled the six arms bar and lit it up with the wonderful lita glow badges. what was particularly gratifying to me was the number of new lita members alongside those of us who have been around longer than we care to count. the networking that went on there was phenomenal!
the other important networking opportunity for lita members was the lita town meeting led by lita vice president mark beatty. the room was packed with eager members ready to brainstorm about what they think lita should be doing after consuming a wonderful breakfast. lita’s sponsored emerging leader, michelle boule, and mark have collated the findings and will be working with the other emerging leaders to fine-tune a direction. the podcast interview of michelle and mark is an excellent summary of what you can expect in the next year when mark is president.
as stated earlier, this is my last president’s column, which means my term is winding down. using lita’s strategic plan as a guide, i have worked with many of you in lita to ensure that we have a structure in place that allows us to be more adaptable to the rapidly changing world and to make sure that lita is relevant to lita members 365 x 24 x 7 and not just at conferences and lita national forum.
attracting and retaining new members is critical for the health of any organization and in that vein, mark and i have used the ala emerging leaders program as a jumping-off point to work with lita’s emerging leaders. the bigwig group is fomenting with energy and excitement as they rally bloggers and have this past year launched the podcasting initiative and the lita wiki. all of these things are making it easier for members to communicate about issues of interest in their work as well as to conduct lita business. the lita blog had over nine thousand downloads of its podcasts in the first three weeks after midwinter, which confirms the desire for these types of communications!
i appointed two task forces that provided recommendations to the lita board at midwinter. the assessment and research task force has recommended that a permanent committee be established to monitor the collection of feedback and assessment data on lita programs and services. having an established assessment process will enable the board to know how well we are accomplishing our strategic plan and to keep us on the correct course to meet membership needs. the education working group has recommended the merger of two committees, the education and regional institutes committees, into one education committee. this merged committee will develop a variety of educational opportunities including online and face-to-face sessions. we hope to have both of these committees up and going later in 2007. happily, the feedback from the town meeting parallels the recommendations of the task forces.
the board will be revisiting the strategic plan at the annual conference using information gathered at the town meeting. we will also be looking at what new services we should be initiating. all arrows seem to be pointing towards more educational and networking opportunities both virtual and in person. i anticipate that lita members will see some great new things happening in the next year.
i have very much enjoyed the opportunity to serve as the lita president this past year. the best part has been getting to know so many lita members who have such creative ideas and who roll up their sleeves and dig in to get the work done. i am very grateful for everyone who has volunteered their time and talents to make lita such a great organization.
bonnie postlethwaite (postlethwaiteb@umkc.edu) is lita president 2006/2007 and associate dean of libraries, university of missouri–kansas city.
president’s column
bonnie postlethwaite
article
web content strategy in practice within academic libraries
courtney mcdonald and heidi burkhardt
information technology and libraries | march 2021
https://doi.org/10.6017/ital.v40i1.12453
courtney mcdonald (crmcdonald@colorado.edu) is associate professor and user experience librarian, university of colorado boulder. heidi burkhardt (heidisb@umich.edu) is web project manager and content strategist, university of michigan. © 2021.
abstract
web content strategy is a relatively new area of practice in industry, in higher education, and, correspondingly, within academic and research libraries. the authors conducted a web-based survey of academic and research library professionals in order to identify present trends in this area of professional practice by academic librarians and to establish an understanding of the degree of institutional engagement in web content strategy within academic and research libraries. this article presents the findings of that survey.
based on analysis of the results, we propose a web content strategy maturity model specific to academic libraries.
introduction
our previous article traced the history of library adoption of web content management systems (cms), the evolution of those systems and their use in day-to-day library operations, and the corresponding challenges as libraries have attempted to manage increasingly prolific content creation workflows across multiple, divergent cms platforms.1 these challenges include inconsistencies in voice and a lack of sufficient or dedicated resources for library website management, resulting in the absence of shared strategic vision and organizational unity regarding the purpose and function of the library website.
we concluded that a productive solution to these challenges lay in the inherently user-centered practice of web content strategy, defined as “an emerging discipline that brings together concepts from user experience design, information architecture, marketing, and technical writing.”2
we further noted that organizational support for web content management and governance strategies for library-authored web content had been rarely addressed in the library literature, despite the growing importance of this area of expertise to the successful provision of support and services: “libraries must proactively embrace and employ best practices in content strategy . . . to fully realize the promise of content management systems through embracing an ethos of library-authored content.”3
we now investigate the current state of practice and philosophy around the creation, editing, management, and evaluation of library-authored web content. to what degree, if at all, does web content strategy factor into the actions, policies, and practices of academic libraries and academic librarians today? does a suitable measure for estimating the maturity of web content strategy practice for academic libraries exist?
background
maturity models
maturity models are one useful mechanism for consistently measuring and assessing an organization’s current level of achievement in a particular area, as well as providing a path to guide future growth and improvement: “maturity levels represent a staged path for an organization’s performance and process improvement efforts based on predefined sets of practice areas. . . . each maturity level builds on the previous maturity levels by adding new functionality or rigor.”4
the initial work on maturity models emerged from carnegie mellon institute (cmi), focused on contract software development.5 since that time, cmi founded the cmmi institute, which has expanded the scope of maturity models into other disciplines. many such models, created for a variety of specific industries or specializations, have since been developed based on the cmmi institute approach, in which stages are defined as:
• maturity level 1: initial (unpredictable and reactive);
• maturity level 2: managed (planning, performance, measurement and control occur on the project level);
• maturity level 3: defined (proactive, rather than reactive, with organization-wide standards);
• maturity level 4: quantitatively managed (data-driven with shared, predictable, quantitative performance improvement objectives that align to meet the needs of internal and external stakeholders); and
• maturity level 5: optimizing (stable, flexible, agile, responsive, and focused on continuous improvement).6
application of maturity models within user experience work in libraries
thus far, discussion of maturity models in the library literature relevant to web librarianship has primarily centered on user experience (ux) work.
in their 2020 paper “user experience methods and maturity in academic libraries,” young, chao, and chandler noted, “. . . several different ux maturity models have been advanced in recent years,” reviewing approximately a half-dozen approaches with varying emphases and numbers of stages.7
in 2013, coral sheldon-hess developed the following five-stage model, based on the aforementioned cmmi framework, for assessing maturity of ux practice in library organizations:
1 – decisions are made based on staff’s preferences, management’s pet projects. user experience [of patrons] is rarely discussed.
2 – some effort is made toward improving the user experience. decisions are based on staff’s gut feelings about patrons’ needs, perhaps combined with anecdotes from service points.
3 – the organization cares about user experience; one or two ux champions bring up users’ needs regularly. decisions are made based on established usability principles and studies from other organizations, with occasional usability testing.
4 – user experience is a primary motivator; most staff are comfortable with ux principles. users are consulted regularly, not just for major decisions, but in an ongoing attempt at improvement.
5 – user experience is so ingrained that staff consider the usability of all of their work products, including internal communications. staff are actively considerate, not only toward users but toward their coworkers.8
as an indicator of overall ux maturity within an organization, sheldon-hess focuses on “consideration” in interactions not only between library staff and library patrons, but also between library staff: “when an organization is well and truly steeped in ux, with total awareness of and buy-in on user-centered thinking, its staff enact those principles, whether they’re facing patrons or not.”9
in 2017, macdonald conducted a series of semi-structured interviews with 16 ux librarians to investigate, among other things, “the organizational aspects of ux librarianship across various library contexts.”10 macdonald proposes a five-stage model, broadly similar in concept to the cmmi institute structure and to sheldon-hess’s model. most compelling, however, were these three major findings, taken from macdonald’s list:
• some (but not all) ux librarian positions were created as part of purposeful and strategic efforts to be more self-aware; . . .
• the biggest challenges to doing ux are navigating the complex library culture, balancing competing responsibilities, and finding ways to more efficiently employ ux methods; and
• the level of co-worker awareness of ux librarianship is driven by the extent to which ux work is visible and by the individual ux librarian’s ability to effectively communicate their role and value.11
based on analysis of the results of their 2020 survey of library ux professionals, in which they asked respondents to self-diagnose their organizations, young, chao, and chandler presented, for use in libraries, their adaptation of the nielsen norman group’s eight-stage scale of ux maturity:
• stage 1: hostility toward usability / stage 2: developer-centered ux - apathy or hostility to ux practice; lack of resources and staff for ux.
• stage 3: skunkworks ux - ad hoc ux practices within the organization; ux is practiced, but unofficially and without dedicated resources or staff; leadership does not fully understand or support ux.12
• stage 4: dedicated ux budget - leadership beginning to understand and support ux; dedicated ux budget; ux is assigned fully or partly to a permanent position.
• stage 5: managed usability - the ux lead or ux group collaborates with units across the organization and contributes ux data meaningfully to organizational and strategic decision-making.
• stage 6: systematic user-centered design process - ux research data is regularly included in projects and decision-making; a wide variety of methods are practiced regularly by multiple departments.
• stage 7: integrated user-centered design / stage 8: user-driven organization - ux is practiced throughout the organization; decisions are made and resources are allocated only with ux insights as a guide.13
young et al.’s findings supported macdonald’s, underscoring the importance of shared organizational understandings, priorities, and culture related to ux activities and personnel:
ux maturity in libraries is related to four key factors: the number of ux methods currently in use; the level of support from leadership in the form of strategic alignment, budget, and personnel; the extent of collaboration throughout the organization; and the degree to which organizational decisions are influenced by ux research. when one or more of these four connected factors advances, so too does ux maturity.14
these findings are consistent with larger patterns in the management of library-authored web content identified in the earlier-cited literature review:
inconsistent processes, disconnects between units, varying constituent goals, and vague or ineffective wcm governance structures are recurrent themes throughout the literature . . . web content governance issues often signal a lack of coordination, or even of unity, across an organization.15
assessing the maturity of content strategy practice in libraries
we consider kristina halvorson’s definition of content strategy, offered in content strategy for the web, as the authoritative definition. halvorson states: “content strategy is the practice of planning for the creation, delivery, and governance of useful, usable content.”16 this definition can be divided into five elements:
1. planning: intentionality and alignment, setting goals, discovery and auditing, connecting to a strategic plan or vision
2. creation: roles, responsibilities, and workflows for content creation; attention to content structure; writing or otherwise developing content in its respective format
3. delivery: findability of content within site and more broadly (i.e., search engine optimization), use of distinct communication channels
4. governance: maintenance and lifecycle management of content through coordinated process and decision making; policies and procedures; measurement and evaluation through analysis of usage data, testing, and other means
5. useful/usable (hereafter referred to as ux): relevant, current, clear, concise, and in context
jones discusses the application of content strategy–specific maturity models as a potential tool for content strategists: “the[se] model[s] can help your company identify your current level of content operations, . . . decide whether that level will support your content vision and strategy . . .
[and] help you plan to get to the next level of content operations.”17 three examples of maturity models developed for use by content strategy industry professionals map industry-specific terms, tools, and actions to the level-based structure put forward by the cmmi institute (see table 1).
table 1. comparative table of content strategy maturity models
content strategy, inc.18 [2016] | jones (gathercontent)19 [2018] | randolph (kapost)20 [2020]
ad hoc: inconsistent quality, lack of uniform practice, little or no opportunity to understand customer needs | chaotic: no formal content operations, only ad hoc approaches | reactive: chaotic, siloed, lacking clarity, chronically behind
rudimentary: movement toward structure, unified process and voice; can be derailed by timelines, resistance | piloting: trying content operations in certain areas, such as for a blog | siloed: struggles to collaborate, poorly defined and inconsistently measured goals
organized & repeatable: strong leadership, uniform process and voice has become routine, integration of user-focused data collection | scaling: expanding formal content operations across business functions | mobilizing: varying collaboration, content is centralized but not necessarily accessible, defined strategy sometimes impacted by ad hoc requests
managed & sustainable: larger buy-in across organization, can sustain changes in leadership, increased number and sophistication of methods | sustaining: solidifying and optimizing content operations across business functions | integrating: effective collaboration across multiple teams, capability for proactive steps, still struggle to prove roi
optimized: close alignment to strategic objectives, integration across the organization, leadership within and outside the organization | thriving: sustaining while also innovating and seeing return on investment (roi) | optimizing: cross-functional collaboration results in seamless customer messaging and experiences, consistently measured roi contributes to planning
while these models have some utility for content strategy practitioners in higher education, including those in academic and research libraries, emphasis on commercial standards for assessing success (e.g., business goals, centrally managed marketing) limits their direct application in the academic environment.
the 2017 blog post by tracey playle, “ten pillars for getting the most of your content: how is your university doing?”, presented ten concepts paired with questions, which could be used by higher education content professionals to reflect on their current state of practice.21 this model was developed for use by a consultancy, and the “pillars” (for example, “strategy and vision,” “risk tolerance and creativity,” and “training and professional development”) are more broadly conceived than typical maturity models. thus, this approach seems more appropriate as a personal or management planning tool rather than as a model for evaluating maturity across library organizations.
methods
following review and approval by the researchers’ institutional review boards, a web-based survey collecting information about existing workflows for web content, basic organizational information, and familiarity with concepts related to web content strategy was distributed to 208 professionals in april 2020. the survey was available for four weeks. participants were drawn from academic and research libraries across north america, providing their own opinions as well as information on behalf of their library organization. (see appendix a: institution list.)
the sample group (n=208) was composed of north american academic and research libraries that are members of the following nationally and regionally significant membership organizations (excluding non-academic member institutions): the association of research libraries, the big ten academic alliance, the greater western library alliance, and/or the oberlin group. some libraries are members of multiple groups. details are supplied below in table 3.
we identified individuals (n=165) based on their professional responsibilities and expertise using the following order and process:
1. individual job title contains some combination of the following words and/or phrases: content strategy, content specialist, content strategist, web content, web communications, digital communications, digital content
2. head of web department or department email
3. head of ux department or department email
4. head of it or department email
for institutions where a specific named individual could not be identified through a review of the organizational website, we identified a general email (e.g., libraries@state.edu) as the contact (n=43). a mailing list was created in mailchimp, and two campaigns were created: one for named individuals, and one for general contacts. only one response was requested per institution. (see appendix b: recruitment emails.)
the 165 named individuals, identified as described above, received a personalized email inviting them to participate in the study. the recruitment email explained the purpose of the study, advised potential participants of possible risks and their ability to withdraw at any time, and included a link to the survey. a separate email was sent to the 43 general contacts on the same day, explaining the purpose of the study and requesting that the recipient forward the communication to the appropriate person in the organization.
this email also included information advising potential participants of possible risks and their ability to withdraw at any time, and a link to the survey.
data was recorded directly by participants using qualtrics. the bulk of survey data does not include any personal information; we did not collect the names of institutions as part of our data collection, so identifying information is limited to information about institutional memberships.
for the group of named individuals, one email bounce was recorded. the open rate for personalized emails sent to named individuals was approximately 62% (88 of 142 successfully delivered emails were opened) and the survey link was followed 66 times. the general email group had a 51% open rate (n=22) with 11 clicks of the survey link. with recruitment occurring in april 2020, most individuals and institutions were at the height of switching to remote operations in light of the covid-19 pandemic. despite this, our open rates were considerably higher than average open rates as reported by mailchimp.22 as discussed below, we achieved our minimum response rate goal of 20%.
table 2. survey question topics and response count
question | topic | category | response count
1 | consent | n/a | 43
2 | organizational memberships | demographic | 40
3 | approx. # full-time employees | demographic | 41
4 | cms products used | infrastructure/organizational structure | 41
5 | primary cms | infrastructure/organizational structure | 39
6 | number of site editors | infrastructure/organizational structure | 39
7 | describe responsibility for content | infrastructure/organizational structure | 39
8 | existence of position(s) with primary duties of web content | infrastructure/organizational structure | 39
9 | titles of such positions, if any | infrastructure/organizational structure | 24
10 | familiar with web content strategy | content strategy practices | 36
11 | definition of web content strategy | content strategy practices | 32
12 | policies or documentation | content strategy practices | 35
13 | methods | content strategy practices | 37
14 | willing to be contacted | n/a | 37
15 | name | n/a | 27
16 | email | n/a | 26
the survey included 16 questions; question topics and response counts are noted in table 2. informed consent was obtained as part of the first survey question. (see appendix c: survey questions and appendix d: informed consent document.) most questions were multiple-choice or short answer (i.e., a number). two questions required longer-form responses.
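as a quick arithmetic check, the rates reported in this section can be recomputed from the counts given in the text: 88 of 142 delivered emails opened, 22 of 43 general-contact emails opened, and 43 consent responses (table 2) from 208 institutions contacted. the sketch below is illustrative only; the helper name `rate` is a hypothetical label and is not part of the study's tooling.

```python
# counts are taken from the text; `rate` is a hypothetical helper name.
def rate(part: int, whole: int) -> float:
    """return part/whole as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)

named_open_rate = rate(88, 142)    # opened / delivered, named individuals
general_open_rate = rate(22, 43)   # opened / sent, general contacts
response_rate = rate(43, 208)      # consent responses / institutions contacted

print(named_open_rate, general_open_rate, response_rate)
```

the rounded values agree with the approximate 62% and 51% open rates quoted above and with the stated 20% minimum response rate goal.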
information collected fell into the following three categories:
• demographics (estimated total number of employees; institutional memberships; estimated number of employees with website editing privileges)
• infrastructure and organizational structure (content management systems used to manage library-authored web content; system used to host primary public-facing website; distribution of responsibility for website content; titles of positions (if any) whose primary responsibilities focus on web content)
• web content strategy practices (familiarity with; personal definition; presence or absence of policy or documentation; evaluation methods regularly used)
upon completion of the survey questions, participants had the option to indicate that they would be willing to be contacted for an individual interview as part of planned future research on this topic. twenty-seven individuals (63%) opted in and provided us with their contact information.
findings
in sum, 43 responses were received, resulting in a response rate of 20.67%. because we did not collect names of individuals or institutions and used an anonymous link for our survey, we cannot determine the ultimate response rate by contact group (named individuals or general email).
demographic information
the bulk of responses came from association of research libraries members, but within-group response rates show that the proportion of responses from each group was relatively balanced within the overall 20% response rate.
table 3.
distribution of survey contacts, responses, and response rates by group (see endnote 23)
organization | member libraries contacted | responses | share of total responses (%) | group response rate (%)
association of research libraries | 117 | 26 | 50.98 | 22.22
big ten academic alliance | 15 | 5 | 9.8 | 33.0
greater western library alliance | 38 | 8 | 15.69 | 21.05
oberlin group | 80 | 12 | 23.53 | 15.0

infrastructure & organizational structure

content management systems

a variety of content management systems are used to manage library-authored web content (see table 4); libguides, wordpress, omeka, and drupal were most commonly used across the group. other systems mentioned as write-in responses included acquia drupal, cascade, fedora-based systems, archivesspace, google sites, and “wiki and blog.” one response stated, “most pages are just non-cms for the website.” write-in responses for “other” and “proprietary system hosted by institution” were carried forward within the survey from question 3 to question 4, and are available in full in appendix e: other content management systems mentioned by respondents.

table 4. cms products used to manage library-authored web content
q3: cms products used | percentage (%) | count
libguides | 28.06 | 39
wordpress | 18.71 | 26
omeka | 15.11 | 21
drupal | 13.67 | 19
other | 9.35 | 13
sharepoint | 7.19 | 10
proprietary system hosted by institution | 7.19 | 10
adobe experience manager | 0.72 | 1
total | 100 | 139

for their primary library website, just under half of respondents relied on drupal (n=17, 43.59%). slightly fewer selected the specific system, whether the institution’s proprietary system or some other option, that they had shared as a write-in answer for the previous question; in total just under 36% (n=14). despite the widespread use reported in the previous question, only two respondents indicated that their primary website was hosted in libguides. (see table 5.)

table 5.
cms used to host primary library website
q4: primary website cms | percentage (%) | count
drupal | 43.59 | 17
other (write in answers) | 20.51 | 8
wordpress | 15.38 | 6
libguides | 5.13 | 2
proprietary system hosted by institution (write in answers) | 15.38 | 6

dedicated positions, position titles, and organizational workflows

almost two-thirds of respondents (n=24, 61.5%) indicated there were position(s) within their library whose primary duties were focused on the creation, management, and/or editing of web content. a total of 52 position titles were shared (the full list of position titles can be found in appendix f). terms and phrases most commonly occurring across this set were web (15), librarian (15), user experience (10), and digital (8). explicitly content-focused terms appeared more rarely: content (6), communication/communications (5), and editor (1).

table 6. frequency of terms and phrases in free-text descriptions of website content management, grouped by the authors into concepts
concept (count) | collaborative (29) | assigned roles (18) | locus of control (13) | support (5) | libguides (14)
terms (count) | group (7) | admin* (6) | their own (7) | training (2) | -
 | team (6) | manager (5) | review (3) | guidance (2) | -
 | distributed (5) | editor/s (4) | oversight (3) | consulting (1) | -
 | committee (3) | developer (3) | permission (1) | - | -
 | stakeholder (3) | product owner (2) | - | - | -
 | representative (2) | - | - | - | -
 | cross-departmental (1) | - | - | - | -
 | decentralized (1) | - | - | - | -
 | inclusive (1) | - | - | - | -

most respondents described collaborative workflows for web content management, in which a group of representatives or delegates collectively stewards website content (see table 6 for a summary and appendix f for full-text responses). collaborative concepts appeared 29 times, including terms like group (7), team (6), distributed (5), and committee (3).
within this set, decentralized, inclusive, and cross-departmental each appeared once. similarly, within terms related to locus of control, the phrase “their own” appeared seven times. specifically assigned roles or responsibilities were mentioned 18 times, including terms like admin/administrator (6), manager (5), and editor/s or editorial (4). respondents discussed support structures such as training, guidance or consulting five times. libguides were mentioned 14 times.

over 60% of respondents indicated that 20 or fewer employees had editing privileges on the library website (see table 7). three respondents commented “too many” when citing the number or range: “too many! i think about five, but there could be more”; “too many, about 12”; “too many to count, maybe 20+.”

table 7. distribution of the number of employees with website editing privileges
response | percentage (%) | count
less than five | 23.08 | 9
5–10 | 20.51 | 8
11–20 | 17.95 | 7
21–99 | 23.08 | 9
100–199 | 10.26 | 4
200+ | 2.56 | 1

the greatest variation in practice regarding how many employees had website editing privileges occurs in institutions with more than 100 total employees, where institutions reported within every available range (see table 8).

table 8. comparison of number of total employees and of number of employees with editing privileges
rows: number of total employees; columns: number of employees with editing privileges
number of employees | less than 5 | 5–10 | 11–20 | 21–99 | 100–199 | 200+
4–10 | 2 | - | - | - | - | -
11–25 | 3 | 1 | - | - | - | -
26–50 | - | 2 | 2 | - | - | -
51–99 | 1 | 1 | 4 | 1 | - | -
100+ | 3 | 4 | 2 | 8 | 4 | 1

web content strategy practices

almost all respondents (n=36, 83%) reported that they were familiar with the concept of web content strategy. conversely, only 20% (n=7) reported that their library had either a documented web content strategy or web content governance policy.
respondents were asked, optionally, to provide a definition of web content strategy in their own words, and we received 32 responses (see appendix g: definitions of web content strategy). we analyzed the free-text definitions of content strategy based on the five elements of halvorson’s previously cited definition: planning, creation, delivery, governance, and ux. we first rated the definitions individually, then determined a mutually agreed rating for each. across the set, responses most commonly addressed concepts or activities related to planning and ux, and least commonly mentioned concepts or activities related to delivery (see table 9).

table 9. occurrence of content strategy elements in free-text definitions
element | associated concepts | count | percentage (%)
plan | intentional, strategic, brand, style, best practices | 29 | 91
creation | workflows, structure, writing | 20 | 63
delivery | findability, channels | 13 | 41
governance | maintenance, lifecycle, measurement/evaluation | 16 | 50
ux | needs of the user, relevant, current, clear, concise, in context | 19 | 59.38

responses were scored on each of the five elements as follows: zero points, concept not mentioned; one point, some coverage of the concept; two points, thorough coverage of the concept. representative examples are provided in table 10. a perfect score for any individual definition would be 10. the median score across the group was four, and the average score was 3.4. we consider scores of three or less to indicate a basic level of practice; scores from four to seven to be an intermediate level of practice; and scores of eight or above to be advanced levels of practice. of the 32 responses to the free-text definition question, one respondent failed to include any data, 14 responses were classed as basic, 17 responses as intermediate, and none were advanced.

table 10.
example showing scoring of four representative free-text definitions provided by respondents
column key: plan (intentional, strategic, brand, style, best practices); creation (workflows, structure, writing); delivery (findability, channels); governance (maintenance, lifecycle, measurement/evaluation); ux (needs of the user, relevant, current, clear, concise, and in context)
free-text definition of content strategy | plan | creation | delivery | governance | ux | total score
intentional and coordinated vision for content on the website. | 1 | 0 | 0 | 0 | 0 | 1
an overarching method of bringing user experience best practices together on the website including heuristics, information architecture, and writing for the web. | 1 | 1 | 0 | 0 | 1 | 3
strategies for management of content over its entire lifecycle to ensure it is accurate, timely, usable, accessible, appropriate, findable, and well-organized. | 1 | 0 | 1 | 1 | 1 | 4
the process of creating and enacting a vision for the organization and display of web content so that it is user friendly, accurate, up-to-date, and effective in its message. web content strategy often involves considering the thoughts and needs of many stakeholders, and creating one cohesive voice to represent them all. | 2 | 1 | 0 | 1 | 2 | 6

respondents reported most frequent use of practices associated with web librarianship and user experience work: analysis of usage data (n=36) and usability testing (n=28) (see fig. 1). content-specific methods were less commonly used overall.

figure 1. frequency of reported usage of analysis and evaluation methods

the five other responses mainly clarified or qualified the selections, although some added additional information, for example:

at this time, all library websites use a standard template, so they have the same look and feel.
beyond that everything else is “catch as catch can” because we do not have a web services librarian, nor are we likely to get that dedicated position any time soon, given the recent covid-19 financial upheaval.

brand guidelines, accessibility guidance, and personal responsibility were also mentioned.

discussion

the targeted recruitment methodology and survey, representing a combination of demographic and practice-based questions, aspired to collect data suitable to generate a snapshot of how web content strategy work is being undertaken in academic libraries at this time, as well as the depth and breadth of that practice.

we were struck by several contrasts in findings: first and foremost, the 80–20 inversion across responses related to knowledge of web content strategy versus its practice. this was particularly notable in combination with respondents’ reports that, in nearly two-thirds of organizations, one or more positions exist with primary duties focused on the creation, management, and/or editing of web content.

the influence of ux thinking and methods in academic libraries is visible in the frequency of respondents’ reported use of general and established ux practices for maintaining the primary website (e.g., usability testing). the other four elements of halvorson’s definition were less thoroughly covered, both in provided definitions of web content strategy and in methods reported. some respondents mentioned use of methods such as content audits or inventories and style guides, but many fewer reported reliance on review checklists, content calendars, and readability scores.
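the scoring rubric described in the findings (zero, one, or two points on each of halvorson’s five elements, for a maximum of 10) can be sketched in a few lines of python. this is a minimal illustration rather than the authors’ own tooling: the names `ELEMENTS`, `total_score`, and `TABLE_10_RATINGS` are hypothetical, and the only data used are the four representative ratings reported in table 10.

```python
# The five rubric elements, as named in the article (hypothetical constant name).
ELEMENTS = ("plan", "creation", "delivery", "governance", "ux")


def total_score(ratings: dict) -> int:
    """Sum per-element ratings (0, 1, or 2 each) into a total between 0 and 10."""
    for element in ELEMENTS:
        if ratings.get(element) not in (0, 1, 2):
            raise ValueError(f"rating for {element!r} must be 0, 1, or 2")
    return sum(ratings[element] for element in ELEMENTS)


# Ratings for the four representative definitions, in the order given in table 10.
TABLE_10_RATINGS = [
    {"plan": 1, "creation": 0, "delivery": 0, "governance": 0, "ux": 0},
    {"plan": 1, "creation": 1, "delivery": 0, "governance": 0, "ux": 1},
    {"plan": 1, "creation": 0, "delivery": 1, "governance": 1, "ux": 1},
    {"plan": 2, "creation": 1, "delivery": 0, "governance": 1, "ux": 2},
]

totals = [total_score(r) for r in TABLE_10_RATINGS]
print(totals)  # [1, 3, 4, 6], matching the total score column of table 10
```

the sketch covers only the arithmetic of the rubric; assigning the per-element ratings remains a judgment call, which the authors made by rating individually and then agreeing on a shared rating.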
in reviewing the self-reported definitions of content strategy for evidence of each of the five elements of halvorson’s previously discussed definition, trends in findings suggest higher levels of maturity in the elements of planning, creation, and ux, and lower levels in the elements of delivery and governance. nearly all respondents (91%) referenced the element of planning. almost two-thirds mentioned concepts or practices related to creation, and approximately 60% of respondents referenced usability of content or a focus on the user in some capacity. only half made mention of governance (including maintenance and evaluation), and even fewer (41%) referenced delivery, whether considering content channels or findability; in fact, no single definition touched on both.

overall, the results of the analysis of provided definitions (discussed in the previous section) suggest that at present, web content strategy as a community of practice in academic libraries is operating at, or just above, a basic level.

proposed maturity model

from these findings, and referencing the structure of the cmmi institute five-stage maturity model, the authors propose the following content strategy maturity model for academic libraries. as previously noted in our findings, we assess the web content strategy community of practice in academic libraries as operating at, or just above, a basic level.

to align the proposed maturity model with the definition scores, we applied the 10-point rating scale for provided definitions to the five levels by assigning two points per level, so a score of one or two would be equivalent to level 1, a score of three or four equivalent to level 2, and so on (table 11).

table 11.
comparison of maturity model with definition rating scale and maturity assessment
maturity model level | definition score | assessment
level 1 | 1 | basic
level 1 | 2 | basic
level 2 | 3 | basic
level 2 | 4 | intermediate
level 3 | 5 | intermediate
level 3 | 6 | intermediate
level 4 | 7 | intermediate
level 4 | 8 | advanced
level 5 | 9 | advanced
level 5 | 10 | advanced

content strategy maturity model for academic libraries

level 1: ad hoc
• no planning or governance
• creation and delivery are reactive, distributed, and potentially chaotic
• no or minimal consideration of ux

level 2: establishing
• some planning and evidence of strategy, such as use of content audits and creation of a style guide; may be localized within specific groups or units
• basic coordination of content creation workflows
• delivery workflows not explicitly addressed, or remain haphazard
• no or minimal organization-wide governance structures or documentation in place; may be localized within specific groups or units
• evidence of active consideration of ux in creation and structure of content

level 3: scaling
• intentional and proactive planning coordinated across multiple units
• basic content creation workflows in place across organization
• delivery considered, but may not be consistent or strategic
• ad hoc evaluation through usage data and usability testing; organization-wide governance documents and workflows may be at a foundational level
• consideration of ux is integral to process of creating useful, usable content
• web content creation and maintenance is assigned at least partly to a permanent position with some level of authority and responsibility for the primary website

level 4: sustaining
• alignment in planning, able to respond to organizational priorities; style guidelines and best practices widely accepted
• established and accepted workflows for content creation are coordinated through a person, department, team, or other governing body
• delivery includes strategic and consistent use of channels, as well as consideration of findability
• regular and strategic evaluation occurs; proactive maintenance and retirement practices in place; managed through established governance documents and workflows
• web content strategy explicitly assigned partly or fully to a permanent position

level 5: thriving
• full lifecycle of content (planning, creation, delivery, maintenance, retirement) managed in coordination across all library-authored web content platforms
• governance established and accepted throughout the organization, including documented policies, procedures, and accountability
• basic understanding of content strategy concepts and importance across the organization
• overall stable, flexible, agile, responsive, user-centered and focused on continuous improvement

as previously mentioned, the median score across the group was four, and the average score was 3.4; these measures suggest that the majority of survey respondents’ organizational web content strategy maturity levels would currently stand at level 2 or 3, with a few at level 1.

conclusion

the findings of this survey and assessment, while inherently limited, suggest that web content strategy is currently not a pervasive factor for academic libraries and academic web librarians in the development and implementation of actions, policies, and practices related to website creation, maintenance, and evaluation. we have proposed a measure for self-estimating the maturity of web content strategy practice for academic libraries. our content strategy maturity model for academic libraries, while grounded both in industry best practices and in evidence from practitioners in academic libraries, is nonetheless a work in progress.
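for readers who want to try the self-estimate, the alignment in table 11 (two points of definition score per maturity level) can be sketched as follows. this is an illustrative sketch only: the function names `maturity_level` and `assessment` are hypothetical, the assessment cut-offs are read directly from the rows of table 11, and scores of zero are rejected because table 11 does not list them.

```python
import math


def maturity_level(score: int) -> int:
    """Map a definition score (1-10) to a maturity level (1-5), two points per level."""
    if not 1 <= score <= 10:
        raise ValueError("score must be between 1 and 10")
    return math.ceil(score / 2)


def assessment(score: int) -> str:
    """Return the assessment label listed for this score in table 11."""
    if not 1 <= score <= 10:
        raise ValueError("score must be between 1 and 10")
    if score <= 3:
        return "basic"
    if score <= 7:
        return "intermediate"
    return "advanced"


# Reproduce the three columns of table 11 for every listed score.
for s in range(1, 11):
    print(f"level {maturity_level(s)} | {s} | {assessment(s)}")
```

under this mapping the reported median score of four corresponds to level 2 with an intermediate assessment, which is consistent with the row for a score of four in table 11.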
we intend to further develop and strengthen the model through follow-up interviews with practitioners, drawing on those survey respondents who opted in to being contacted. interviewees will be invited to discuss their work within and outside the frame of the proposed maturity model, and to provide feedback on the model itself, with the ultimate goal of enabling a better understanding of web content strategy practice in academic libraries and the needs of its community of practice.

endnotes

1 courtney mcdonald and heidi burkhardt, “library-authored web content and the need for content strategy,” information technology and libraries 38, no. 3 (september 15, 2019): 8–21, https://doi.org/10.6017/ital.v38i3.11015.
2 mcdonald and burkhardt, 14.
3 mcdonald and burkhardt, 16.
4 “cmmi levels of capability and performance,” sec. maturity levels, cmmi institute llc, accessed may 28, 2020, https://cmmiinstitute.com/learning/appraisals/levels.
5 “about cmmi institute,” cmmi institute llc, accessed may 28, 2020, https://cmmiinstitute.com/company.
6 “cmmi levels of capability and performance,” sec. maturity levels.
7 scott w. h. young, zoe chao, and adam chandler, “user experience methods and maturity in academic libraries,” information technology and libraries 39, no. 1 (march 16, 2020): 2, https://doi.org/10.6017/ital.v39i1.11787.
8 coral sheldon-hess, “ux, consideration, and a cmmi-based model,” para. 6, july 25, 2013, http://www.sheldon-hess.org/coral/2013/07/ux-consideration-cmmi/.
9 sheldon-hess, “ux, consideration, and a cmmi-based model,” para. 2, http://www.sheldon-hess.org/coral/2013/07/ux-consideration-cmmi/.
10 craig m. macdonald, “‘it takes a village’: on ux librarianship and building ux capacity in libraries,” journal of library administration 57, no. 2 (february 17, 2017): 196, https://doi.org/10.1080/01930826.2016.1232942.
11 macdonald, 212.
12 skunk works is trademarked by lockheed martin corporation, but is informally used to describe an experimental, sometimes secret, research and development group focused on agile innovation.
13 young, chao, and chandler, “user experience methods and maturity in academic libraries,” 19.
14 young, chao, and chandler, 23.
15 mcdonald and burkhardt, “library-authored web content and the need for content strategy,” 15–16.
16 kristina halvorson, content strategy for the web, 2nd ed. (berkeley, ca: new riders, 2012), 28.
17 colleen jones, “a content operations maturity model,” sec. a maturity model for content operations, gather content (blog), november 30, 2018, https://gathercontent.com/blog/content-operations-model-of-maturity.
18 “understanding the content maturity model,” content strategy inc. (blog), march 2016, https://www.contentstrategyinc.com/understanding-content-maturity-model/.
19 jones, “a content operations maturity model,” sec. a maturity model for content operations.
20 zoë randolph, “where do you fall on the content operations maturity model?,” sec. the content operations maturity model, kapost blog (blog), april 20, 2020, https://kapost.com/b/content-operations-maturity-model/.
21 tracy playle, “ten pillars for getting the most of your content: how is your university doing?,” pickle jar communications (blog), september 29, 2017, http://www.picklejarcommunications.com/2017/09/29/content-strategy-benchmarking/.
22 “email marketing benchmarks by industry,” mailchimp, accessed june 15, 2020, https://mailchimp.com/resources/email-marketing-benchmarks/.
23 some libraries are members of multiple groups.

appendices

appendix a: institution list
appendix b: recruitment emails
appendix c: survey questions
appendix d: informed consent document
appendix e: other content management systems mentioned by respondents
appendix f: organizational responsibility for content; and position titles
appendix g: definitions of web content strategy

appendix a: institution list

institution | membership(s)
agnes scott college | oberlin group
alabama | arl
alberta | arl
albion college | oberlin group
alma college | oberlin group
amherst college | oberlin group
arizona | arl, gwla
arizona state | arl, gwla
arkansas | gwla
auburn | arl
augustana college | oberlin group
austin college | oberlin group
bard college | oberlin group
barnard college | oberlin group
bates college | oberlin group
baylor | gwla
beloit college | oberlin group
berea college | oberlin group
boston | arl
boston college | arl
boston public library | arl
bowdoin college | oberlin group
brigham young | arl, gwla
british columbia | arl
brown | arl
bryn mawr college | oberlin group
bucknell university | oberlin group
calgary | arl
california, berkeley | arl
california, davis | arl
california, irvine | arl
california, los angeles | arl
california, riverside | arl
california, san diego | arl
california, santa barbara | arl
carleton college | oberlin group
case western reserve | arl
chicago | arl, btaa
cincinnati | arl
claremont colleges | gwla, oberlin group
clark university | oberlin group
coe college | oberlin group
colby college | oberlin group
colgate university | oberlin group
college of the holy cross | oberlin group
college of wooster | oberlin group
colorado | arl, gwla
colorado college | oberlin group
colorado state | arl, gwla
columbia | arl
connecticut | arl
connecticut college | oberlin group
cornell | arl
dartmouth | arl
davidson college | oberlin group
delaware | arl, gwla
denison university | oberlin group
denver | gwla
depauw university | oberlin group
dickinson college | oberlin group
drew university | oberlin group
duke | arl
earlham college | oberlin group
eckerd college | oberlin group
emory | arl
florida | arl
florida state | arl
franklin & marshall college | oberlin group
furman university | oberlin group
george washington | arl
georgetown | arl
georgia | arl
georgia tech | arl
gettysburg college | oberlin group
grinnell college | oberlin group
guelph | arl
gustavus adolphus college | oberlin group
hamilton college | oberlin group
harvard | arl
haverford college | oberlin group
hawaii | arl
hope college | oberlin group
houston | arl, gwla
howard | arl
illinois, chicago | arl, gwla
illinois, urbana | arl, btaa
indiana | arl, btaa
iowa | arl, btaa
iowa state | arl, gwla
johns hopkins | arl
kalamazoo college | oberlin group
kansas | arl, gwla
kansas state | gwla
kent state | arl
kentucky | arl
kenyon college | oberlin group
knox college | oberlin group
lafayette college | oberlin group
lake forest college | oberlin group
laval | arl
lawrence university | oberlin group
library of congress | arl
louisiana state | arl
louisville | arl
macalester college | oberlin group
manhattan college | oberlin group
manitoba | arl
maryland | arl, btaa
massachusetts | arl
mcgill | arl
mcmaster | arl
miami | arl
michigan | arl, btaa
michigan state | arl, btaa
middlebury college | oberlin group
mills college | oberlin group
minnesota | arl, btaa
missouri | arl, gwla
mit | arl
morehouse/spelman colleges (auc) | oberlin group
mount holyoke college | oberlin group
nebraska | arl, btaa
nevada las vegas | gwla
new mexico | arl, gwla
new york | arl
north carolina | arl
north carolina state | arl
northwestern | arl, btaa
notre dame | arl
oberlin college | oberlin group
occidental college | oberlin group
ohio | arl
ohio state | arl, btaa
ohio wesleyan university | oberlin group
oklahoma | arl, gwla
oklahoma state | arl, gwla
oregon | arl, gwla
oregon state | gwla
ottawa | arl
pennsylvania | arl
pennsylvania state | arl, btaa
pittsburgh | arl
princeton | arl
purdue | arl, btaa
queen's | arl
randolph-macon college | oberlin group
reed college | oberlin group
rhodes college | oberlin group
rice | arl, gwla
rochester | arl
rollins college | oberlin group
rutgers | arl, btaa
sarah lawrence college | oberlin group
saskatchewan | arl
sewanee: the university of the south | oberlin group
simmons university | oberlin group
simon fraser | arl
skidmore college | oberlin group
smith college | oberlin group
south carolina | arl
southern california | arl, gwla
southern illinois | arl, gwla
southern methodist | gwla
st. john's university / college of st. benedict | oberlin group
st. lawrence university | oberlin group
st. olaf college | oberlin group
suny-albany | arl
suny-buffalo | arl
suny-stony brook | arl
swarthmore college | oberlin group
syracuse | arl
temple | arl
tennessee | arl
texas | arl, gwla
texas a&m | arl, gwla
texas state | gwla
texas tech | arl, gwla
toronto | arl
trinity college | oberlin group
trinity university | oberlin group
tulane | arl
union college | oberlin group
utah | arl, gwla
utah state | gwla
vanderbilt | arl
vassar college | oberlin group
virginia | arl
virginia commonwealth | arl
virginia tech | arl
wabash college | oberlin group
washington | arl, gwla
washington and lee university | oberlin group
washington state | arl, gwla
washington u.-st. louis | arl, gwla
waterloo | arl
wayne state | arl, gwla
wellesley college | oberlin group
wesleyan university | oberlin group
west virginia | gwla
western | arl
wheaton college | oberlin group
whitman college | oberlin group
whittier college | oberlin group
willamette university | oberlin group
williams college | oberlin group
wisconsin | arl, btaa
wyoming | gwla
yale | arl
york | arl

appendix b: recruitment emails

recruitment email: named recipients

this message is intended for *|mmerge6|*

dear *|fname|*,

we are writing today to ask for your participation in a research project “content strategy in practice within academic libraries,” (cu boulder irb protocol #18-0670), led by co-investigators courtney mcdonald and heidi burkhardt (university of michigan).

we have provided the information below as a downloadable pdf should you wish to keep it for your records.
the purpose of the study is to establish an understanding of the degree of institutional engagement in web content strategy within academic and research libraries, and what trends may be detected in this area of professional practice.

our primary subject population consists of academic and research libraries that are members of the following nationally and regionally significant membership organizations (excluding nonacademic member institutions): association of research libraries, big ten academic alliance, greater western library alliance, and/or the oberlin group.

if you opt to participate, we expect that you will be in this research study for the duration of the time it takes to complete our web-based survey. you will not be paid to be in this study. whether or not you take part in this research is your choice. you can leave the research at any time and it will not be held against you.

we expect about 210 people, representing their institutions, in the entire study internationally. this survey will be available over a four-week period in the spring of 2020, through friday, may 1.

** confidentiality
-----------------------------------------------------------
information obtained about you for this study will be kept confidential to the extent allowed by law. research information that identifies you may be shared with the university of colorado boulder institutional review board (irb) and others who are responsible for ensuring compliance with laws and regulations related to research, including people on behalf of the office for human research protections. the information from this research may be published for scientific purposes; however, your identity will not be given out.
** questions
-----------------------------------------------------------
if you have questions, concerns, or complaints, or think the research has hurt you, contact the research team at crmcdonald@colorado.edu.

this research has been reviewed and approved by an irb. you may talk to them at (303) 735 3702 or irbadmin@colorado.edu if:
* your questions, concerns, or complaints are not being answered by the research team.
* you cannot reach the research team.
* you want to talk to someone besides the research team.
* you have questions about your rights as a research subject.
* you want to get information or provide input about this research.

thank you for your consideration,
courtney mcdonald crmcdonald@colorado.edu
heidi burkhardt heidisb@umich.edu

============================================================
not interested in participating? you can ** unsubscribe from this list (*|unsub|*).
this email was sent to *|email|* (mailto:*|email|*)
why did i get this? (*|about_list|*) unsubscribe from this list (*|unsub|*) update subscription preferences (*|update_profile|*)

recruitment email: named recipients

dear library colleague,

we are writing today to ask for your participation in a research project “content strategy in practice within academic libraries,” (cu boulder irb protocol #18-0670), led by co-investigators courtney mcdonald and heidi burkhardt (university of michigan).

our primary subject population consists of academic and research libraries that are members of the following nationally and regionally significant membership organizations (excluding nonacademic member institutions): association of research libraries, big ten academic alliance, greater western library alliance, and/or the oberlin group.

we ask that you forward this message to the person in your organization whose role includes oversight of your public web site.
we are only requesting a response from one person at each institution contacted. thank you for your assistance in routing this request. we have provided the information below as a downloadable pdf should you wish to keep it for your records.

the purpose of the study is to establish an understanding of the degree of institutional engagement in web content strategy within academic and research libraries, and what trends may be detected in this area of professional practice.

if someone within your library opts to participate, we expect that person will be in this research study for the duration of the time it takes to complete our web-based survey. the participant will not be paid to be in this study.

whether or not someone in your library takes part in this research is an individual choice. the participant can leave the research at any time and it will not be held against them.

we expect about 210 people, representing their institutions, in the entire study internationally. this survey will be available over a four-week period in the spring of 2020, through friday, may 1.

** confidentiality
-----------------------------------------------------------
information obtained about you for this study will be kept confidential to the extent allowed by law. research information that identifies you may be shared with the university of colorado boulder institutional review board (irb) and others who are responsible for ensuring compliance with laws and regulations related to research, including people on behalf of the office for human research protections. the information from this research may be published for scientific purposes; however, your identity will not be given out.
** questions
-----------------------------------------------------------
if you have questions, concerns, or complaints, or think the research has hurt you, contact the research team at crmcdonald@colorado.edu.

this research has been reviewed and approved by an irb. you may talk to them at (303) 735 3702 or irbadmin@colorado.edu if:
* your questions, concerns, or complaints are not being answered by the research team.
* you cannot reach the research team.
* you want to talk to someone besides the research team.
* you have questions about your rights as a research subject.
* you want to get information or provide input about this research.

thank you for your consideration,
courtney mcdonald
crmcdonald@colorado.edu
heidi burkhardt
heidisb@umich.edu
============================================================
not interested in participating? you can ** unsubscribe from this list (*|unsub|*).
this email was sent to *|email|* (mailto:*|email|*)
why did i get this? (*|about_list|*) unsubscribe from this list (*|unsub|*) update subscription preferences (*|update_profile|*)

appendix c: survey questions

web content strategy methods and maturity

start of block: introduction

q1 web content strategy methods and maturity in academic libraries (cu boulder irb protocol #20-0581)

purpose of the study
the purpose of the study is to gather feedback from practitioners on the proposed content strategy maturity model for academic libraries, and to further enhance our understanding of web content strategy practice in academic libraries and the needs of its community of practice.
q2 please make a selection below, in lieu of your signature, to document that you have read and understand the consent form, and voluntarily agree to take part in this research.
o yes, i consent to take part in this research. (1)
o no, i do not grant my consent to take part in this research. (2)

skip to: end of survey if q2 = no, i do not grant my consent to take part in this research.

end of block: introduction

start of block: demographic information

q3 estimated total number of employees (fte) at your library organization:
o less than five (12)
o 5-10 (13)
o 11-20 (14)
o 21-99 (15)
o 100-199 (16)
o 200+ (17)

q4 estimated number of employees with editing privileges within your primary library website:
o less than five (12)
o 5-10 (13)
o 11-20 (14)
o 21-99 (15)
o 100-199 (16)
o 200+ (17)

q5 does your library have a documented web content strategy and / or a web content governance policy?
o no (1)
o yes (2)

q6 are there position(s) within your library whose primary duties are focused on creation, management, and/or editing of web content?
o no (1)
o yes, including myself (2)
o yes, not including myself (3)

end of block: demographic information

start of block: web content strategy

q7 please indicate the degree to which each of the five elements of content strategy are currently in practice at your library.

q8 creation
employ editorial workflows, consider content structure, support writing.
this is currently in practice at my institution. (1)
o definitely true (48)
o somewhat true (49)
o somewhat false (50)
o definitely false (51)

q9 delivery
consider findability, discoverability, and search engine optimization, plus choice of content platform or channels.
this is currently in practice at my institution. (1)
o definitely true (48)
o somewhat true (49)
o somewhat false (50)
o definitely false (51)

q10 governance
support maintenance and lifecycle of content, as well as measurement and evaluation.
this is currently in practice at my institution. (1)
o definitely true (31)
o somewhat true (32)
o somewhat false (33)
o definitely false (34)

q11 planning
use an intentional and strategic approach, including brand, style, and writing best practices.
this is currently in practice at my institution. (1)
o definitely true (31)
o somewhat true (32)
o somewhat false (33)
o definitely false (34)

q12 user experience
consider needs of the user to produce relevant, current, clear, concise, and in context.
this is currently in practice at my institution. (1)
o definitely true (31)
o somewhat true (32)
o somewhat false (33)
o definitely false (34)

q13 please rank the elements of content strategy (as defined above) in order of their priority based on your observations of practice in your library.
• ______ creation (1)
• ______ delivery (2)
• ______ governance (3)
• ______ planning (4)
• ______ user experience (5)

q14 how would you assess the content strategy maturity of your organization?
o basic (1)
o intermediate (2)
o advanced (3)

end of block: web content strategy

start of block: thank you!
q15 your name: ________________________________________________________________

q16 thank you very much for your willingness to be interviewed as part of our research study. prior to continuing on to finalize your survey submission, please sign up for an interview time: [link] (this link will open in a new window in order to allow you to finalize and submit your survey response after scheduling an appointment)

please contact courtney mcdonald, crmcdonald@colorado.edu, if you experience any difficulty in registering or if there is not a time available that works for your schedule.

end of block: thank you!

appendix d: informed consent document

permission to take part in a human research study

title of research study: content strategy in practice within academic libraries
irb protocol number: 18-0670
investigators: courtney mcdonald and heidi burkhardt

purpose of the study
the purpose of the study is to establish an understanding of the degree of institutional engagement in web content strategy within academic and research libraries, and what trends may be detected in this area of professional practice. our primary subject population consists of academic and research libraries that are members of the following nationally and regionally significant membership organizations (excluding nonacademic member institutions): association of research libraries, big ten academic alliance, and/or greater western library alliance. we expect that you will be in this research study for the duration of the time it takes to complete our web-based survey. we expect about 210 people, representing their institutions, in the entire study internationally.

explanation of procedures
we are directly contacting each library to request that the appropriate individual(s) complete a web-based survey.
this survey will be available over a four-week period in the spring of 2020.

voluntary participation and withdrawal
whether or not you take part in this research is your choice. you can leave the research at any time and it will not be held against you. the person in charge of the research study can remove you from the research study without your approval. possible reasons for removal include an incomplete survey submission.

confidentiality
information obtained about you for this study will be kept confidential to the extent allowed by law. research information that identifies you may be shared with the university of colorado boulder institutional review board (irb) and others who are responsible for ensuring compliance with laws and regulations related to research, including people on behalf of the office for human research protections. the information from this research may be published for scientific purposes; however, your identity will not be given out.

payment for participation
you will not be paid to be in this study.

contact for future studies
we would like to keep your contact information on file so we can notify you if we have future research studies we think you may be interested in. this information will be used by only the principal investigator of this study and only for this purpose. you can opt-in to provide your contact information at the end of the online survey.

questions
if you have questions, concerns, or complaints, or think the research has hurt you, contact the research team at crmcdonald@colorado.edu.

this research has been reviewed and approved by an irb. you may talk to them at (303) 735 3702 or irbadmin@colorado.edu if:
• your questions, concerns, or complaints are not being answered by the research team.
• you cannot reach the research team.
• you want to talk to someone besides the research team.
• you have questions about your rights as a research subject.
• you want to get information or provide input about this research.

signatures
in lieu of your signature, your acknowledgement of this statement in the online survey document documents your permission to take part in this research.

appendix e: other content management systems mentioned by respondents

question #4: which of the following content management systems does your library use to manage library-authored web content?

write-in responses for ‘proprietary system hosted by institution’
• xxxxxxxxxxx
• archivesspace
• pressbooks
• preservica
• hippo cms
• siteleaf
• cascade
• dotcms
• terminal four
• acquia drupal
• fedora based digital collections system built in house

write-in responses for ‘other’
• wiki and blog
• we draft content in google docs & also use gather content for auditing.
• google sites
• cascade
• ebsco stacks
• modx
• islandora and online journal system
• contentful
• we also have some in-house-built tools such as for room booking; some of these are quite old and we would like to upgrade or improve them when we can. (very few people can make edits in these tools.)
• cascade
• the majority of the library website (and university website) is managed by a locally developed cms; however, the university is in the process of migrating to the acquia drupal cms.
• blacklight, vivo, fedora
• most pages are just non-cms for the website

appendix f: organizational responsibility for content; and position titles

question 6
please explain how your organization distributes responsibility for content hosted in your content management system(s).
if different parties (individuals, departments, collaborative groups) are responsible for managing content in different platforms please describe.
• we have one primary website manager who oversees the management of the website, including content strategy and editing, and 2 editors who assist with small editing tasks.
• we have content editors that edit content for individual libraries and collections. there is a content creator network managed by library communications. they provide trainings and guidance for content editors and act as reviewers, but not every single thing gets reviewed.
• we have a team of developers and product owners who are responsible for managing web content.
• we currently have a very distributed model, where virtually any library staff member or student assistant can request a drupal account and then make changes to existing content or develop new pages. we have a cross-departmental team that oversees the libraries' web interfaces and makes decisions about library homepage content, the menu navigation, overall ia, etc. we have web content guidelines to help staff as they develop new content. we have identified functional and technical owners for each of our cmss and have slightly different processes for managing content in those cmss. our general approach, however, is very inclusive (for better or worse ;) ) - lots of staff have access to creating and editing content. we are, however, moving to a less distributed content for drupal in particular. moving forward, we'll have a small team responsible for editing and developing new content. this is to ensure that content is more consistent and user-centered. we attempted to identify funding for a full-time content manager but were unsuccessful, so this team will attempt to fill the role of a full-time content manager.
• ux is the product owner and admin.
if staff want content added to the website, they send a request to ux, we structure and edit content in a google doc, and then ux posts to the website.
• there's no method for how or why responsibility is distributed. it ends up being something like, someone wants to add some content, they get editing access, they can now edit anything for as long as they're at the library. we are a super decentralized and informal library.
• the primary content managers are the xxxxxx librarian and the xxxxxx. other individuals (primarily librarians) that are interested in editing their content have access on our development server. their edits are vetted by the xxxxxx and/or the xxxxxx librarian before being moved into production.
• the xxxxxx department (6 staff) manages content and helps staff throughout the organization create and maintain content. ux staff sometimes teach others how to manage content, and sometimes do it for them. if design or content is complex, usually ux staff do the work. many staff don't maintain any content beyond their staff pages. subject specialists and instruction librarians maintain content [like] libguides-like content, but we don't use libguides. branch library staff maintain most of the content for their library pages.
• in addition, the xxxxxx manages the catalog. the xxxxxx department manages special web projects. and the xxxxxx department manages social media, publications, and news.
• a web content team made up of two administrators and librarians from xxxxxx and xxxxxx makes executive-level decisions about web content.
• the xxxxxx team (xxxxxx) provides oversight and consulting for online user interfaces chaired by a xxxxxx position which is new and is not yet filled.
• for the public website, content editing is distributed to many groups and teams throughout the libraries.
• the xxxxxx team manages the main portions of the site including the homepage, news, maps, calendars, etc. the research librarians and subject liaisons manage the research guides. the xxxxxx provides guidance regarding overall responsibilities and style guidelines.
• site structure and top-level pages for our main website resides with xxxxxx. page content is generally distributed to the departments closest to the services described by the pages.
• right now editing of pages is distributed to those individuals who have the closest relationship to the pages being edited, with a significantly smaller number of people having administrative access to all of the libraries' websites.
• primary website is co-managed by xxxxxx team (4 people) and xxxxxx team (3 people). xxxxxx team creates timely content about news/events/initiatives while xxxxxx team manages content on evergreen topics.
• research librarians and staff manage libguides content, which is in sore need of an inventory and pruning.
• primarily me, plus two colleagues who serve with me as a web editorial board
• one librarian manages the content and makes changes based on requests from other library staff
• my role (xxxxxx) is xxxxxx. we also have a web content creator in our xxxxxx. i chair our xxxxxx group (xxxxxx), which has representatives from each division in the library and they are the primary stewards of supporting library authored web content. our "speciality" platforms (libguides, omeka, and wordpress for microsites) all have service leads, but content is managed by the respective stakeholders. the lead for libguides is a xxxxxx [group] member due to its scope and scale. in our primary website, we are currently structured around drupal organic groups for content management with xxxxxx [group] having broad editing access. in our new website, all content management will go through the xxxxxx, with communications for support and dynamic content (homepage, news, events) management.
• management is somewhat in flux right now. we recently migrated our main web site to acquia drupal; there is a very new small committee consisting of xxxxxx, and three representatives from elsewhere in the library. for libguides, all reference, instruction, and subject librarians can edit their own guides; the xxxxxx has tended to have final oversight but i don't know if this has ever been formally delegated.
• librarians manage their own libguides subject guides; several members of xxxxxx can make administrative changes to coding, certificates, etc. on the entire site; there are individuals in different departments who control their own pages/libguides. there is a group within the library that administers wordpress for the institution. other content systems are administered by individuals within the library.
• librarians are responsible for their own libguides. the xxxxxx department manages changes to most content, although some staff do manage their own wordpress content. they tend not to want to.
• individuals. mainly one person authors content. the other individual has created some research guides.
• individuals in different positions and departments within the library are assigned roles based on the type of content they frequently need to edit.
• for instance, xxxxxx staff have the ability to create and edit exhibition content in drupal. xxxxxx staff and xxxxxx staff have the ability to create and edit equipment content. the event coordinator and librarians and staff involved in instruction are allowed to create and edit event and workshop listings.
• only the communication coordinator is permitted to create news items that occupy real estate on the home page and various service point home pages.
• as for general content, the primary internal stakeholders for that content typically create and edit that content, but if any staff notice a typo or factual error they are encouraged to correct them on their own, although they can also submit a request to the it department if they are not comfortable doing so.
• subject specific content is hosted in libguides, and is maintained by subject liaison librarians. other content in libguides, software tutorials or information related to electronic resources for example, is created and maintained by appropriate specialists.
• the drupal site when launched had internal stakeholders explicitly defined for each page, and only staff from the appropriate group could edit that content (e.g. if xxxxxx was tagged as the only stakeholder for a page about xxxxxx policies, then only staff from the xxxxxx department with editing privileges could edit that page). this system was abandoned after about two years as it was considered too much overhead to maintain and also the introduction of a content revisioning module that kept a history of edits alleviated fears of malicious editing.
• individuals are assigned pages to keep content updated. the xxxxxx is responsible for coordinating with those staff and offers training to make sure content gets updated.
• individual liaison librarians are responsible for their own libguides. i and the "xxxxxx" are the primary editors of the wordpress site, although 4 others have editing access (an employee who writes and posts news articles, the liaison librarian who spearheaded our new video tutorials, and two who work in special collections to update finding aids on that site, which is still on wordpress and i would consider under the main libraries web page, but is part of a multisite installation.)
• in omeka and libguides, librarians are pretty self-sufficient and responsible for all of their own content.
the three or four digital projects faculty and staff who work with omeka manage it internally alongside one of our developers. our omeka instance is relatively small-scale.
• i (xxxxxx) oversee our libguides environment. while i am in the process of creating and implementing formal libguides content and structure guidelines, as of now it's a bit of a free-for-all with everyone responsible for the content pertaining to their own liaison department(s). content is made available to patrons via automatically populating legacy landing pages (we've had libguides for a decade and i've been with the institution not yet a year).
• as the xxxxxx, i am ultimately responsible for almost all of the content in our wordpress environment. that said, i try to distribute content upkeep responsibilities to the relevant department for each piece of the site. managers and committee chairs provide me with what they want on the web, and as needed (and in consultation with them) i review/rewrite it for the web (plain language), develop information architecture, design the front-end, and accessibly publish the content. there are only a few faculty and staff at my library who are comfortable with wordpress - but one of my long-term goals is to empower more folks to enact their own minor edits (e.g., updating hours, lending policies, etc.) while i oversee large-scale content creation, overall architecture, and strategy. we have a blog portion of our wordpress site which is not managed by anyone in particular, but i tend to clean it up if things go awry.
• generally all of our web authors *can* publish to most parts of the site. (a very few content types (mostly featured images that display on home pages) can be edited only by admins and a small number of super-users.) however the great majority of people who can post content very rarely do (and some never do).
some edit or post only to specific blogs, some only to their own guides or to very specific pages or suites of pages (e.g. liaison librarians to their own guides; thesis assistant to thesis pages). our small group in xxxxxx reviews new and updated pages and edits for in-house style and usability guidelines, and also trains and works collaboratively with web authors to create more usable content and reduce duplication - but given the large number of authors (with varied priorities, skills, and preferences) and pages we have trouble keeping up. we also more actively manage content on home pages.
• for the main website and intranet, we have areas broken apart by unit area. we use workbench access to determine who can edit which pages. libguides is managed by committee, but most of the librarians have access. proprietary systems have separate accounts for those who need access.
• for libguides, librarians can create content as they like, though there is a group that provides some (light) oversight. for main library website, most content is overseen by departments (in practice, one person each from a handful of “areas”, such as the branches, access services, etc.).
• dotcms is primarily managed in systems (2 staff), with delegates from admin and outreach allowed to make limited changes to achieve their goals. libguides is used by all librarians and several staff, with six people given admin privileges. wordpress is used only in special collections.
• xxxxxx dept manages major public facing platforms (drupal, wordpress, and shares libguides responsibilities with xxxxxx dept). xxxxxx manages omeka. within platforms, responsibilities are largely managed by department with individuals assigned content duties & permissions as needed.
• different units maintain their content; one unit has overall management and checks for uniformity, needed updates, and broken links.
• developers/communications office oversees some aspects, library management, research and collections librarians, and key staff edit other pieces.
• currently, content is maintained by the xxxxxx librarian in coordination with content stakeholders from around the organization. we are in the process of migrating our site from drupal to omniupdate. once that is complete, we will develop a new model for content responsibilities.
• content is provided by department/services.
• 5 librarians manage the libguides

question 9
titles of positions in your organization whose primary duties involve creation, management and/or editing of web content:
• head of web services; developer; web designer; user experience librarian
• user experience librarian, lead librarian for discovery systems, digital technologies development librarian, lead librarian for software development. and we have titles that are university system it titles that don't mean a whole lot, such as technology support specialist and business and technology applications analyst.
• web content specialist
• user experience strategist, user experience designer, user experience student assistants, director of marketing communications and events
• sr. ux specialist
• web support consultant; coordinator, web services & library technology
• editor & content strategist in library communications
• web manager
• discovery & systems librarian
• head of library systems and technology
• web services and data librarian
• communications manager
• web content and user experience specialist
• metadata and discovery systems librarian, systems analyst, outreach librarian
• digital services librarian; manager, communication services; communication specialist
• (1) web project manager and content strategist, (2) web content creator
• web services librarian
• web developer ii
• sr.
software engineer, program director for digital services
• user experience librarian
• digital initiatives & scholarly communication librarian; senior library associate in digital scholarship and services
• web services and usability librarian
• senior library specialist - web content
• web developer, software development librarian

appendix g: definitions of web content strategy

question 11
in your own words, please define web content strategy.
• a cohesive plan to create an overall strategy for web content that includes tone, terminology, structure, and deployment to best communicate the institution's message and enable the user. for the next question, the true answer is sort of. we have the start of a style guide. we also have the university's branding policies. we also have a web governance committee that is university-wide, of which i'm a part of. however, we don't have a complete strategy and it is certainly not documented. so you pick.
• planning, development, and management of web content. two particularly important parts of web content strategy for academic library websites: 1. keeping content up to date and unpublishing outdated content. 2. building consensus for the creation and maintenance of a web style guide and ensuring that content across the large website adheres to the style guide.
• strategies for management of content over its entire lifecycle to ensure it is accurate, timely, usable, accessible, appropriate, findable, and well-organized.
• a system of workflows, training, and governance that supports the entire lifecycle of content, including creation, maintenance, and updating of content across all communications channels (e.g. websites, social media, signage).
• a comprehensive, coordinated, planned approach to content across the site including components such as style guides, accessibility, information architecture, discoverability, seo.
• not terribly familiar with the concept in a formal sense but think of it related to how the institution considers the intersection of content made available by the institution, the management and governance of issues such as branding/identity, accessibility, design, marketing, etc.
• intentional and coordinated vision for content on the website
• content strategy is the planning for the lifecycle of content. it includes creating, editing, reviewing, and deleting content. we also use a content strategy framework to determine each of the following for the content on our websites: audience, page goal, value proposition, validation, and measurement strategy.
• website targets the community to ensure they can find what they need
• the process of creating and enacting a vision for the organization and display of web content so that it is user friendly, accurate, up-to-date, and effective in its message. web content strategy often involves considering the thoughts and needs of many stakeholders, and creating one cohesive voice to represent them all.
• web content strategy is the planning, design, delivery and governance plan for a website. this responsibility is guided by the library website management working group.
• a web content strategy is a cohesive approach to managing and editing online content. an effective strategy takes into account web accessibility standards and endeavors to produce and maintain consistent, reliable, user-centered content. an effective content strategy evolves to meet the needs of online users and involves regular user testing and reviews of web traffic/analytics.
• web content strategy is the theory and practice of creating, managing, and publishing web content according to evidence-based best practices for usability and readability • making sure your content aligns with both your business goals and your audience needs. • a plan to oversee the life cycle of useful, usable content from its creation through maintenance and ultimately removal. • web content strategy is the overarching strategy for how you develop and disseminate web content. ideally, it would be structured and user tested to ensure that the content you are spending time developing is meeting the needs of your library and your community. • a web content strategy guides the full lifecycle of web content, including creation, maintenance, assessment, and retirement. it also sets guiding principles, makes responsibility and authority clear, and documents workflows. • an overarching method of bringing user experience best practices together on the website including: heuristics, information architecture, and writing for the web • planning and management of online content • a defined strategy for creating and delivering effective content to a defined audience at the right time. • in the most basic sense, web content strategy is matching the content, services and functionality of web properties with the organizational strategic goals. • web content strategy can include guidelines, processes, and/or approaches to making your website(s) usable, sustainable, and findable. it's a big-picture or higher-level way of thinking about your site(s), rather than page by page or function by function. • deliberate structures and practices to plan, deliver, and evaluate web content. • producing content that will be useful to users and easy for them to access • tying content to user behavior/user experience?
• web content strategy is the thoughtful planning and construction of website content to meet users' needs. • n/a • cohesive planning, development, and management of web content, to engage and support library users. • working with teams and thinking strategically and holistically about the usability, functions, services, information, etc. provided on the website to best meet the needs of the site's users, as well as incorporating the marketing/promotional perspectives offered by the website. • planning and managing web content • web content strategy is the idea that all written and visual information on a certain site would conform to or align with the goals for that site. • ensuring that the most accurate and appropriate words, images, and other assets are presented to patrons at the point of need, while using web assets to tell stories patrons might not know they want to know.
mobile website use and advanced researchers: understanding library users at a university marine sciences branch campus
mary j. markland, hannah gascho rempel, and laurie bridges
information technology and libraries | december 2017 7
abstract
this exploratory study examined the use of the oregon state university libraries website via mobile devices by advanced researchers at an off-campus branch location.
branch campus–affiliated faculty, staff, and graduate students were invited to participate in a survey to determine what their research behaviors are via mobile devices, including frequency of their mobile library website use and the tasks they were attempting to complete. findings showed that while these advanced researchers do periodically use the library website via mobile devices, mobile devices are not the primary mode of searching for articles and books or for reading scholarly sources. mobile devices are most frequently used for viewing the library website when these advanced researchers are at home or in transit. results of this survey will be used to address knowledge gaps around library resources and research tools and to generate more ways to study advanced researchers’ use of library services via mobile devices.
introduction
as use of mobile devices has expanded in the academic environment, so has the practice of gathering data from multiple sources about what mobile resources are and are not being used. this data informs the design decisions and resource investments libraries make in mobile tools. web analytics is one tool that allows researchers to discover which devices patrons use to access library webpages. but web analytics data do not show what patrons want to do and what hurdles they face when using the library website via a mobile device. web analytics also lack nuance in that they cannot distinguish user characteristics, such as whether users are novice or advanced researchers, which may affect how these users interact with a mobile device. user surveys are another tool for gathering data on mobile behaviors. user surveys help overcome some of the limitations of web analytics data by directly asking users about their perceived research skills and the resources they use on a mobile device.
as is the case at most libraries, oregon state university libraries serves a diverse range of users.
we were interested in learning whether advanced researchers, particularly advanced researchers who work at a branch campus, use the library’s resources differently than main campus users. we were chiefly interested in these advanced researchers because of the mobile nature of their work. they are graduate students and faculty in the field of marine science who work in a variety of locations, including their offices, labs, and in the field (which can include rivers, lakes, and the ocean). we focused on the use of the library website via mobile devices as one way to determine whether specific library services should be adapted to best meet the needs of this targeted user community.
mary j. markland (mary.markland@oregonstate.edu) is head, guin library; hannah gascho rempel (hannah.rempel@oregonstate.edu) is science librarian and coordinator of graduate student success services; and laurie bridges (laurie.bridges@oregonstate.edu) is instruction and outreach librarian, oregon state university libraries and press.
mobile website use and advanced researchers | markland, rempel, and bridges doi:10.6017/ital.v36i4.9953
oregon state university (osu) is oregon’s land-grant university; its home campus is in corvallis, oregon. hatfield marine science center (hmsc) in newport is a branch campus that includes a branch library. guin library at hmsc serves osu students and faculty from across the osu colleges along with the co-located federal and state agencies of the national oceanic and atmospheric administration (noaa), us fish and wildlife service, environmental protection agency (epa), united states geological survey (usgs), united states department of agriculture (usda), and the oregon department of fish and wildlife. the guin library is in newport, which is forty-five miles from the main campus.
like many other branch libraries, guin library was established at a time when providing a print collection close to where researchers and students work was paramount, but today it must adapt its services to meet the changing information needs of its user base. branch libraries are typically designed to serve a clientele or subject area, which can create a different institutional culture from the main library. guin library serves advanced undergraduates, graduate students, and scientific researchers. hmsc’s distance from corvallis, the small size of the researcher community, and the shared focus on a research area, marine sciences, create a distinct culture. while guin library is often referred to as the “heart of hmsc,” the number of in-person library users is decreasing. this decline is not unexpected as numerous studies have shown that faculty and graduate students have fewer needs that require an in-person trip to the library.1 studies have also shown that faculty and graduate students can be unaware of the services and resources that libraries provide, thereby continuing the cycle of underuse.2
to learn more about the needs of hmsc’s advanced researchers, this exploratory study examined their research behaviors via mobile devices. the goals of this study were to
• determine if and with what frequency advanced researchers at hmsc use the osu libraries website via mobile devices;
• gather a list of tasks advanced users attempt to accomplish when they visit the osu libraries website on a mobile device; and
• determine whether the mobile behaviors of these advanced researchers are different from those of researchers from the main osu campus (including undergraduate students), and if so, whether these differences warrant alternative modes of design or service delivery.
literature review
the conversation about how best to design mobile library websites has shifted over the past decade.
early in the mobile-adoption process some libraries focused on creating special websites or apps that worked with mobile devices.3 while libraries globally might still be creating mobile-specific websites and apps,4 us libraries are trending toward responsively designed websites as a more user-friendly option and a simpler solution for most libraries with limited staff and budgets.5
most of the literature on mobile-device use in higher education is focused on undergraduates across a wide range of majors who are using a standard academic library.6 to help provide context for how libraries have designed their websites for mobile users, some of those specific findings will be shared later. but because our study focused on graduate students and faculty in a science-focused branch library, we will begin with a discussion of what is known about more advanced researchers’ use of library services and their mobile-device habits.
several themes emerged from the literature on graduate students’ relationships with libraries. in an ironic twist, faculty think graduate students are being assisted by the library while librarians think faculty are providing graduate students with the help they need to be successful.7 as a result, many graduate students end up using their library’s resources in an entirely disintermediated way. graduate students, especially those in the sciences, visit the physical library less often and use online resources more than undergraduate students.8 most graduate students start their research process with assistance from academic staff, such as advisors and committee members,9 and are unaware of many library services and resources.10 as frequent virtual-library users who receive little guidance on how to use the library’s tools, graduate students need a library website that is clear in scope and purpose, offers help, and has targeted services.11
compared to reports on undergraduate use of mobile devices to access their library’s website, relatively few studies have focused on graduate-student or faculty mobile behaviors. a recent survey of japanese library and information science (lis) students compared undergraduate and graduate students’ usage of mobile devices to access library services and found slight differences. however, both groups reported accessing libraries as last on their list of preferred smartphone uses.12 aharony examined the mobile use behaviors of israeli lis graduate students and found approximately half of these graduate students used smartphones and perceived them to be useful and easy tools for use in their everyday life, and could transfer those habits to library searching behaviors.13 when looking specifically at how patrons use library services via a mobile device, rempel and bridges found the top reason graduate students at their main campus used the osu libraries website via mobile devices was to find information on library hours, followed by finding a book and researching a topic.14 barnett-ellis and vann surveyed their small university and found that both undergraduate and graduate students were more than twice as likely to use mobile devices as were their faculty and staff; a majority of students also indicated they were likely to use mobile devices to conduct research.15 finally, survey results showed graduate students in hofstra university’s college of education reported accessing library materials via a mobile device twice as often as other student groups. in addition, these graduate students reported being comfortable
graduate students were also more likely to be at home when using their mobile device to access the library, a finding the authors attributed to education graduate students frequently being employed as full-time teachers.16 research on how faculty members use library resources characterizes a population that is confident in their literature-searching skills, prefers to search on their own, and has little direct contact with the library.17 faculty researchers highly value convenience;18 they rely primarily on electronic access to journal articles but prefer print access to monographs.19 faculty tend to be self-trained at using search tools, such as pubmed or other online databases, and therefore are not always aware of the more in-depth functionality of these tools.20 in contrast to graduate students, rempel and bridges found that faculty using the library website via mobile devices were less interested in information about the physical library, such as library hours, and were more likely to be researching a topic.21 medical faculty are one of the few faculty groups whose mobile-research behaviors have been specifically examined. a survey administered by bushhousen et al. at a medical university revealed that a third of respondents used mobile apps for research-related activities.22 findings by boruff and storie indicate that one of the biggest barriers to mobile use in health-related academic settings was wireless access.23 thus apps that did not require the user to be connected to the internet were highly desired. faculty and graduate students in health-related academic settings saw a role for the library in advocating for better wireless infrastructure, providing access to a targeted set of heavily used resources, and providing online guides or in-person tutorials on mobile apps or procedures specific to their institution. 
24 according to the literature, most design decisions for library mobile sites have been made on the basis of information collected about undergraduate students’ behavior at main-branch campuses. to help inform our understanding of how recent decisions have been made, the remainder of the literature review focuses on what is known about undergraduate students’ mobile behavior. undergraduate students are very comfortable using mobile technologies and perceive themselves to be skilled with these devices. according to the 2015 educause center for research and analysis’ (ecar) study of undergraduate students and information technology, most undergraduate students consider themselves sophisticated technology users who are engaged with information technologies.25 undergraduate students mainly use their smartphones for nonclass activities. but students indicate they could be more effective technology users if they were more skilled at tools such as the learning management system, online collaboration tools, e-books, or laptops and smartphones in class. of interest to libraries is the ecar participants’ top area of reported interest, “search tools to find reference or other information online for class work.”26 however, when a mobile library site is in place, usage rates have been found to be lower than anticipated. in a study of undergraduate science students, salisbury et al. found only 2 percent of respondents reported using their cell phones to access library databases or the library’s catalog every hour or daily, despite 66 percent of the students browsing the internet using their mobile information technology and libraries | december 2017 11 phone hourly or daily. salisbury et al. speculated that users need to be told about mobileoptimized library resources if libraries want to increase usage. 
27 rempel and bridges used a pop-up interrupt survey while users were accessing the osu libraries mobile site.28 this approach allowed a larger cross-section of library users to be surveyed. it also reduced memory errors by capturing their activities in real time. activities that had been included in the mobile site because of their perceived usefulness in a mobile environment, such as directions, asking a librarian a question, and the coffee shop webcam, were rarely cited as a reason for visiting the mobile site. the osu libraries branch at hmsc is entering a new era. a marine studies initiative will result in the building of a new multidisciplinary research campus at hmsc that aims to serve five hundred undergraduate students. the change in demographics and the increase in students who will need to be served has prompted guin library staff to explore how the current population of advanced researchers interact with library resources. in addition, examining the ways undergraduate students at the main campus use these tools will help with planning for the upcoming changes in the user community. methods this study used an online qualtrics survey to gather information about how frequently advanced researchers (graduate students, faculty, and affiliated scientists at a branch library for marine science) use the osu libraries website via mobile devices, what they search for, and other ways they use mobile devices to support their research behaviors. a recruitment email with a link to the survey was sent to three discussion lists used by hmsc community in spring 2016. the survey was available for four weeks, and a reminder email was sent one week before the survey closed. the invitation email included a link to an informedconsent document. once the consent document had been reviewed, users were taken to the survey via a second link. 
respondents could provide an email address to receive a three-dollar coffee card for participating in the study, but their email address was recorded in a separate survey location to preserve their anonymity. the invitation email indicated that this survey was about using the website via a mobile device, and the first survey question asked users if they had ever accessed the library website on a mobile device. if they answered “no,” they were immediately taken to the end of the survey and were not recorded as participants in the study.
a similar survey was conducted with users from osu’s main campus in 2012–13 and again in 2015. the results from 2012–13 have been published previously,29 but the results from 2015 have not. while the focus of the present study is on the mobile behaviors of advanced researchers in the hmsc community, data from the 2015 main-campus study is used to provide a comparison to the broader osu community. osu main-campus respondents in 2015 and hmsc participants in 2016 both answered closed- and open-ended questions that explored participants’ general mobile-device behaviors and behaviors specific to using the osu libraries website via mobile devices. however, the hmsc survey also asked questions about behaviors related to using the osu (nonlibrary) website via a mobile device and participants’ mobile scholarly reading and writing behaviors. the survey concluded with several demographic questions.
the survey data was analyzed using qualtrics’ cross-tab functionality and microsoft excel to observe trends and potential differences between user groups. open-ended responses were examined for common themes.
twenty-three members of the hmsc community completed the survey, whereas one hundred participants responded to the 2015 main campus survey.
participation in the 2015 survey was capped at one hundred respondents because limited incentives were available. the participation difference between the two surveys reflects several differences between the two sampled communities. the most obvious difference is size. the osu community comprises more than thirty-six thousand students, faculty, and staff; the hmsc community is approximately five hundred students, researchers, and faculty, some of whom are also included as part of the larger osu community. the second factor influencing response rates relates to the difference in size between the two communities, but is more striking in the hmsc community: the survey relied on a self-selected group of users who indicated they had a history of using the library website via a mobile device. therefore, it is not possible to estimate the population size of mobile-device library-website users specific to the branch library or the main campus library. this limitation means that the results from this study cannot be used to generalize findings to all users who visit a library website via mobile devices; instead the results are intended to present a case that other libraries may compare with behaviors observed on their own campuses. sharing the behaviors of advanced researchers at a branch campus is particularly valuable as this population has historically been understudied.

results and discussion

participant demographics and devices used

of the twenty-three respondents to the hmsc mobile behaviors survey, 13 (62 percent) were graduate students, 7 (34 percent) were faculty (this category includes faculty researchers and courtesy faculty), and one respondent was a noaa employee. two participants declined to declare their affiliation. of the 97 respondents to the 2015 osu main-campus survey who shared their affiliation, 16 (16 percent) were graduate students, 5 (5 percent) were faculty members, and 69 (71 percent) were undergraduates.
respondents varied in the types of mobile devices they used when doing library research. smartphones were used by 78 percent (18 respondents) and 22 percent (5 respondents) used a tablet. apple (15 respondents) was the most common device brand used, although six of the respondents used an android phone or tablet. compared to the general population’s device ownership, these respondents are more likely to own apple devices, but the two major device types owned (apple and android) match market trends.30

frequency of library site use on mobile devices

most of the hmsc respondents are infrequent users of the library website via mobile devices: 50 percent (11 respondents) did so less than once a month; 41 percent (9 respondents) did so at least once a month; and 9 percent (2 respondents) did so at least once a week. the low level of library website usage via mobile devices was especially notable as this population reports being heavy users of the library website via laptops or desktop computers, with 82 percent (18 respondents) visiting the library website via those tools at least once a week.

researchers at hmsc used the library website via mobile devices much less often than the 2015 main-campus respondents (undergraduates, graduate students, and faculty). no hmsc respondents visited the mobile site daily compared to 10 percent of main-campus users, and only 9 percent of hmsc respondents visited weekly compared to 28 percent of main-campus users (see figure 1).

figure 1. 2016 hmsc participants vs. 2015 osu main-campus participants reported frequency of library website visits via a mobile device by percent of responses.

while hmsc advanced researchers share some mobile behaviors with main-campus students, this exploratory study demonstrates they do not use the library website via mobile devices as frequently.
one possible reason is that researchers rarely spend time going to and from classes and therefore do not have small gaps of time to fill throughout their day. instead, their daily schedule involves being in the field or in the lab collecting and analyzing data. alternatively, they are frequently involved in writing-intensive projects such as drafting journal articles or grant proposals. they carve out specific periods to do research and do not appear to be filling time with short bursts of literature searching. they can work on laptops and do not need to multitask on a phone or tablet between classes or in other situations. mobile-device ownership among hmsc graduate students might also be limited because of personal budgets that do not allow for owning multiple mobile devices or for having the most recent model. in addition, this group of scientists may not be on the front edge of personal technologies, especially compared to medical researchers, because few mobile apps are designed specifically for the research needs of marine scientists.

where researchers are when using mobile devices for library tasks

because mobile devices facilitate connecting to resources from many locations, and because advanced researchers conduct research in a range of settings (including the field, the office, and home), we asked respondents where they were most likely to use the library website via a mobile device. thirty-two percent were most likely to be at home, 27 percent in transit, 18 percent at work, and 9 percent in the field.
the popularity of using the library website via mobile devices while in transit was somewhat unexpected, but perhaps should not have been because many people try to maximize their travel time by multitasking on mobile devices. the distance from the main campus might explain this finding because a local bus service provides an easy way to travel to and from the main campus, and the hour-long trip would provide opportunities for multitasking via a mobile device. relatively few respondents used mobile devices to access the library website while at work. previous studies show that a lack of reliable campus wireless internet access can affect students’ ability to use mobile technology.31 hmsc also struggles to provide consistent wireless access, and signals are spotty in many areas of our campus. despite signal boosters in guin library, wireless access is still limited at times. in addition, cell phone service is equally spotty both at hmsc and up and down the coast of oregon. it is much less frustrating to work on a device that has a wired connection to the internet while at hmsc. these respondents did use mobile devices while at home, which might indicate they had a better wireless signal there. alternatively, working from home on a mobile device might indicate that they compartmentalize their library-research time as an activity to do at home instead of in the office. researchers used their mobile devices to access the library while in the field less than originally expected, but upon further reflection, it made sense that researchers would be less likely to use library resources during periods of data collection for oceanic or other water-based research projects because of their focused involvement during that stage. the water-based research also increases the risk of losing mobile devices. 
library resources accessed via mobile devices

to learn more about how these respondents used the library website, we asked them to choose what they were searching for from a list of options. respondents could choose as many options as applied to their searching behaviors. hmsc respondents’ primary reason for visiting the library’s site via a mobile device was to find a specific source: 68 percent looked for an article, 45 percent for a journal, 36 percent for a book, and 14 percent for a thesis. many of the hmsc respondents also looked for procedural or library-specific information: 36 percent looked for hours, 32 percent for my account information, 18 percent for interlibrary loan, 14 percent for contact information, 9 percent for how to borrow and request books, 9 percent for workshop information, and 9 percent for oregon estuaries bibliographies, a unique resource provided by the hmsc library. fifty-five percent of searches were for a specific source and 43 percent were for procedural or library-specific information. notably missing from this list were respondents who reported searching via their mobile device for directions to the library.

compared to the 2015 osu libraries main-campus survey respondents, hmsc respondents were much more likely to visit the library website via a mobile device to look for an article (68 percent vs. 37 percent), find a journal (45 percent vs. 23 percent), access my account information (32 percent vs. 7 percent), use interlibrary loan (18 percent vs. 5 percent), or find contact information (14 percent vs. 1 percent). however, unlike hmsc participants, who do not have access to course reserves at the branch library, 7 percent of osu main-campus respondents used their mobile devices to find course reserves on the library website. see figure 2.
figure 2. 2016 hmsc vs. 2015 osu main-campus participants reported searches while visiting the library website via a mobile device by percent of responses.

it is possible that hmsc users with different affiliations might use the library site via a mobile device differently. these exploratory findings show that graduate students used the greatest variety of content via mobile devices. graduate students as a group reported using 11 of the 14 provided content choices via a mobile device while faculty reported using 8 of the 14. graduate students were the largest group (62 percent of respondents), which might explain why as a group they searched for more types of content via mobile devices. interestingly, faculty members and faculty researchers reported looking for a thesis via a mobile device, but no graduate students did. perhaps these graduate students had not yet learned about the usefulness of referencing past theses as a starting point for their own thesis writing. or perhaps they were only familiar with searching for journal articles on a topic. in contrast, faculty members might have been searching for specific theses for which they had provided advising or mentoring support.

to help us make decisions about how to best direct users to library content via mobile devices, we asked respondents to indicate their searching behaviors and preferences. of the 16 hmsc respondents who answered this question, 12 (75 percent) used our web-scale discovery search box via mobile devices; 4 (25 percent) reported that they did not. presumably these latter searchers were navigating to another database to find their sources.
of 16 respondents, only 6 (38 percent) indicated that they looked for a specific library database (as opposed to the discovery tool) when using a mobile device. those respondents who were looking for a database tended to be looking for the web of science database, which makes sense for their field of study.

when conducting searches for sources on their mobile devices, hmsc respondents employed a variety of search strategies: the 12 respondents who replied used a combination of author (75 percent), journal title (67 percent), keyword (67 percent), and book title (50 percent) searches when starting at the mobile version of the discovery tool. when asked about their preferred way to find sources, a majority of hmsc respondents reported that they tended to prefer a combination of searching and menu navigation while using the library website from mobile devices, while the remainder were evenly divided between preferring menu-driven and search-driven discovery.

while osu libraries does not currently provide links to any specific apps for source discovery, such as pubmed mobile or jstor browser, 13 (62 percent) of the hmsc respondents indicated they would be somewhat or very likely to use an app to access and use library services. this finding connects to the issue of reliable wireless access. medical graduate students had a wider array of apps available to them, but the primary reason they wanted to use these apps was because they provided a better searching experience in hospitals that had intermittent wireless access, an experience to which researchers at hmsc could relate.32

university website use behaviors on mobile devices

to help situate respondents’ library use behaviors on mobile devices in comparison to the way they use other academic resources on mobile devices, we asked hmsc respondents to describe their visits to resources on the osu (nonlibrary) website via mobile devices.
compared to their use of the library site on a mobile device, respondents’ use of university services was higher: 43 percent (9 respondents) visited the university’s website via a mobile device at least once a week compared to only 9 percent (2 respondents) who visited the library site with that frequency. this makes sense because of the integral role many of these university services play in most university employees’ regular workflow. respondents indicated visiting key university sites including myosu (a portal webpage, visited by 60 percent of respondents), the hmsc webpage (55 percent), canvas (the university’s learning management system, visited by 50 percent of respondents), and webmail (45 percent). see figure 3.

figure 3. university webpages hmsc respondents access on a mobile device by percent of responses.

university resources such as campus maps, parking locations, and the graduate school website were frequently used by this population. the use of the first two makes sense as hmsc users are located off-site and need to use maps and parking guidance when they visit the main campus. the use of the graduate school website makes sense because the respondents were primarily graduate students and graduate school guidelines are a necessary source of information. interestingly, our advanced users are similar to undergraduates in that they primarily read email, information from social networking sites, and news on their mobile devices.33

other research behaviors on mobile devices

we wanted to know what other research-related behaviors the hmsc respondents are engaged in via mobile devices to determine if there might be additional ways to support researchers’ workflows. we specifically asked about respondents’ reading, writing, and note-taking behaviors to learn how well these respondents have integrated them with their mobile usage behaviors.
all respondents reported reading on their mobile device (see figure 4). email represented the most common reading activity (95 percent), followed by “quick reading” activities, such as reading social networking posts (81 percent), current news (81 percent), and blog posts (62 percent). smaller numbers used their mobile devices for academic or long-form reading, such as reading scholarly articles (33 percent) or books (19 percent). of those respondents who read articles and books on their mobile devices, only respondents highlighted or took notes using their mobile device. seven respondents used a citation manager on their mobile device: three used endnote, one used mendeley, one used pages, and one used zotero. one respondent used evernote on their mobile device, and one advanced user reported using specific data and database management software, websites, and apps related to their projects. more advanced and interactive mobile-reading features, such as online spatial landmarks, might be needed before reading scholarly articles on mobile devices becomes more common.34

figure 4. what hmsc respondents reported reading on a mobile device by percent of responses.

limitations

this exploratory study had several limitations, most of which reflect the nature of doing research with a small population at a branch campus. this study had a small sample size, which limited observations of this population; however, future studies could use research techniques such as interviews or ethnographic studies to gather deep qualitative information about mobile-use behaviors in this population.
a second limitation was that, unlike previous studies of the osu libraries mobile website, this study could not use google analytics to compare survey results with what users were actually doing on the library website. because of how hmsc’s network was set up, anyone at hmsc using the osu internet connections is assigned an ip address that shows a corvallis, oregon, location rather than a newport, oregon, location, which made it impossible to isolate hmsc-specific users in google analytics. the research behaviors of advanced researchers at a branch campus have not been well examined; despite its limitations, this study provides beneficial insights into the behaviors of this user population.

conclusion

focusing on how advanced researchers at a branch campus use mobile devices while accessing library and other campus information provides a snapshot of key trends among this user group. these exploratory findings show that these advanced researchers are infrequent users of library resources via mobile devices and, contrary to our initial expectations, are not using mobile devices as a research resource while conducting field-based research. findings showed that while these advanced researchers do periodically use the library website via mobile devices, mobile devices are not the primary mode of searching for articles and books or for reading scholarly sources. mobile devices are most frequently used for viewing the library website when these advanced researchers are at home or in transit.

the results of this survey will be used to address the hmsc knowledge gaps around use of library resources and research tools via mobile devices. both graduate students and faculty lack awareness of library resources and services and have unsophisticated library research skills.35 while the osu main campus has library workshops for graduate students and faculty, these workshops have been inconsistently duplicated at the guin library.
because the people working at hmsc come from such a wide variety of departments across osu that focus on marine sciences, hmsc has never had a library orientation. the results indicate possible value in devising ways to promote guin library’s resources and services locally, which could include highlighting the availability of mobile library access.

while several participants mentioned using research tools like evernote, pages, or zotero on their mobile devices, most participants did not report enhancing their mobile research experience with these mobile-friendly tools. workshops specifically modeling how to use mobile-friendly tools and apps such as dropbox, evernote, goodreader, or browzine could help introduce the benefits of these tools to these advanced researchers. because wireless access is even more of a concern for researchers at this branch location than for researchers at the main campus, database-specific apps will be explored to determine if the use of searching apps could help alleviate inconsistent wireless access. if database apps that are appropriate for marine science researchers are available, these will be promoted to this user population.

future research might involve follow-up interviews, focus groups, or ethnographic studies, which could expand the knowledge of these researchers’ mobile-device behaviors and their perceptions of mobile devices. exploring the technology usage by these advanced researchers in their labs, including electronic lab notebooks or other tools, might be an interesting contrast to their use of mobile devices. in addition, as the hmsc campus grows with the expansion of the marine studies initiative, increasing numbers of undergraduates will use guin library.
the ecar 2015 statistics show that current undergraduates own multiple internet-capable devices.36 presumably, these hmsc undergraduates will be likely to follow the trends seen in the ecar data. certainly, the plans to expand hmsc’s internet and wireless infrastructure will affect all its users.

our mobile survey gave us insights into how a sample of the hmsc population uses the library’s resources and services. these observations will allow guin library to expand its services for the hmsc campus. we encourage other librarians to explore their unique user populations when evaluating services and resources.

references

1 maria anna jankowska, “identifying university professors’ information needs in the challenging environment of information and communication technologies,” journal of academic librarianship 30, no. 1 (2004): 51–66, https://doi.org/10.1016/j.jal.2003.11.007; pali u. kuruppu and anne marie gruber, “understanding the information needs of academic scholars in agricultural and biological sciences,” journal of academic librarianship 32, no. 6 (2006): 609–23; lotta haglund and per olsson, “the impact on university libraries of changes in information behavior among academic researchers: a multiple case study,” journal of academic librarianship 34, no. 1 (2008): 52–59, https://doi.org/10.1016/j.acalib.2007.11.010; nirmala gunapala, “meeting the needs of the ‘invisible university’: identifying information needs of postdoctoral scholars in the sciences,” issues in science and technology librarianship, no. 77 (summer 2014), https://doi.org/10.5062/f4b8563p.
2 tina chrzastowski and lura joseph, “surveying graduate and professional students’ perspectives on library services, facilities and collections at the university of illinois at urbana-champaign: does subject discipline continue to influence library use?,” issues in science and technology librarianship no. 45 (winter 2006), https://doi.org/10.5062/f4dz068j; kuruppu and gruber, “understanding the information needs of academic scholars in agricultural and biological sciences”; haglund and olsson, “the impact on university libraries of changes in information behavior among academic researchers.”
3 ellyssa kroski, “on the move with the mobile web: libraries and mobile technologies,” library technology reports 44, no. 5 (2008): 1–48, https://doi.org/10.5860/ltr.44n5.
4 paula torres-pérez, eva méndez-rodríguez, and enrique orduna-malea, “mobile web adoption in top ranked university libraries: a preliminary study,” journal of academic librarianship 42, no. 4 (2016): 329–39, https://doi.org/10.1016/j.acalib.2016.05.011.
5 david j. comeaux, “web design trends in academic libraries—a longitudinal study,” journal of web librarianship 11, no. 1 (2017): 1–15, https://doi.org/10.1080/19322909.2016.1230031; zebulin evelhoch, “mobile web site ease of use: an analysis of orbis cascade alliance member web sites,” journal of web librarianship 10, no. 2 (2016): 101–23, https://doi.org/10.1080/19322909.2016.1167649.
6 barbara blummer and jeffrey m. kenton, “academic libraries’ mobile initiatives and research from 2010 to the present: identifying themes in the literature,” in handbook of research on mobile devices and applications in higher education settings, ed. laura briz-ponce, juan juanes-méndez, and josé francisco garcía-peñalvo (hershey, pa: igi global, 2016), 118–39.
7 jankowska, “identifying university professors’ information needs in the challenging environment of information and communication technologies.”
8 chrzastowski and joseph, “surveying graduate and professional students’ perspectives on library services, facilities and collections at the university of illinois at urbana-champaign.”
9 carole a. george et al., “scholarly use of information: graduate students’ information seeking behaviour,” information research 11, no. 4 (2006), http://www.informationr.net/ir/11-4/paper272.html.
10 kristin hoffman et al., “library research skills: a needs assessment for graduate student workshops,” issues in science and technology librarianship 53 (winter-spring 2008), https://doi.org/10.5062/f48p5xfc; hannah gascho rempel and jeanne davidson, “providing information literacy instruction to graduate students through literature review workshops,” issues in science and technology librarianship 53 (winter-spring 2008), https://doi.org/10.5062/f44x55rg.
11 jankowska, “identifying university professors’ information needs in the challenging environment of information and communication technologies.”
12 ka po lau et al., “educational usage of mobile devices: differences between postgraduate and undergraduate students,” journal of academic librarianship 43, no. 3 (may 2017): 201–8, https://doi.org/10.1016/j.acalib.2017.03.004.
13 noa aharony, “mobile libraries: librarians’ and students’ perspectives,” college & research libraries 75, no. 2 (2014): 202–17, https://doi.org/10.5860/crl12-415.
14 hannah gascho rempel and laurie m. bridges, “that was then, this is now: replacing the mobile-optimized site with responsive design,” information technology and libraries 32, no. 4 (2013): 8–24, https://doi.org/10.6017/ital.v32i4.4636.
15 paula barnett-ellis and charlcie pettway vann, “the library right there in my hand: determining user needs for mobile services at a medium-sized regional university,” southeastern librarian 62, no. 2 (2014): 10–15.
16 william t. caniano and amy catalano, “academic libraries and mobile devices: user and reader preferences,” reference librarian 55, no. 4 (2014): 298–317, https://doi.org/10.1080/02763877.2014.929910.
17 haglund and olsson, “the impact on university libraries of changes in information behavior among academic researchers.”
18 kuruppu and gruber, “understanding the information needs of academic scholars in agricultural and biological sciences.”
19 christine wolff, alisa b. rod, and roger c. schonfeld, “ithaka s+r us faculty survey 2015,” ithaka s+r, april 4, 2016, http://www.sr.ithaka.org/publications/ithaka-sr-us-faculty-survey-2015/.
20 m. macedo-rouet et al., “how do scientists select articles in the pubmed database? an empirical study of criteria and strategies,” revue européenne de psychologie appliquée/european review of applied psychology 62, no. 2 (2012): 63–72.
21 rempel and bridges, “that was then, this is now.”
22 ellie bushhousen et al., “smartphone use at a university health science center,” medical reference services quarterly 32, no. 1 (2013): 52–72, https://doi.org/10.1080/02763869.2013.749134.
23 jill t. boruff and dale storie, “mobile devices in medicine: a survey of how medical students, residents, and faculty use smartphones and other mobile devices to find information,” journal of the medical library association 102, no. 1 (2014): 22–30, https://doi.org/10.3163/1536-5050.102.1.006.
24 bushhousen et al., “smartphone use at a university health science center”; boruff and storie, “mobile devices in medicine.”
25 eden dahlstrom et al., “ecar study of students and information technology, 2015,” research report, educause center for analysis and research, 2015, https://library.educause.edu/~/media/files/library/2015/8/ers1510ss.pdf?la=en.
26 ibid., 24.
27 lutishoor salisbury, jozef laincz, and jeremy j. smith, “science and technology undergraduate students’ use of the internet, cell phones and social networking sites to access library information,” issues in science and technology librarianship 69 (spring 2012), https://doi.org/10.5062/f4sb43pd.
28 rempel and bridges, “that was then, this is now.”
29 ibid.
30 “mobile/tablet operating system market share,” netmarketshare, march 2017, https://www.netmarketshare.com/operating-system-market-share.aspx?qprid=8&qpcustomd=1.
31 boruff and storie, “mobile devices in medicine”; patrick lo et al., “use of smartphones by art and design students for accessing library services and learning,” library hi tech 34, no. 2 (2016): 224–38, https://doi.org/10.1108/lht-02-2016-0015.
32 boruff and storie, “mobile devices in medicine.”
33 dahlstrom et al., “ecar study of students and information technology, 2015.”
34 caroline myrberg and ninna wiberg, “screen vs. paper: what is the difference for reading and learning?” insights 28, no. 2 (2015): 49–54, https://doi.org/10.1629/uksg.236.
35 barnett-ellis and vann, “the library right there in my hand”; haglund and olsson, “the impact on university libraries of changes in information behavior among academic researchers”; hoffman et al., “library research skills”; kuruppu and gruber, “understanding the information needs of academic scholars in agricultural and biological sciences”; lau et al., “educational usage of mobile devices”; macedo-rouet et al., “how do scientists select articles in the pubmed database?”
36 dahlstrom et al., “ecar study of students and information technology, 2015.”

48 information technology and libraries | march 2007

zoomify image is a mature product for easily publishing large, high-resolution images on the web. end users view these images with existing web-browser software as quickly as they do normal, downsampled images. a flash-based zoomifyer client asynchronously streams image data to the web browser as needed, resulting in response times approaching those of desktop applications using minimal bandwidth. the author, a librarian at cornell university and the principal architect of a small, open-source company, worked closely with zoomify to produce a cross-platform, open-source implementation of that company’s image-processing software and discusses how to easily deploy the product into a widely used web-publishing environment.
limitations are also discussed as are areas of improvement and alternatives.

zoomifyer from zoomify (www.zoomify.com) enables users to view large, high-resolution images within existing web-browser software while providing a rich, interactive user experience. a small zoomifyer client, authored in macromedia flash, is embedded in an html page and makes asynchronous requests to the server to stream image data back to the client as needed. by streaming the image data in this way, the image renders as quickly as a normal, downsampled image, even for images that are gigabytes in size. as the user pans and zooms, the response time approaches that of desktop applications while using the smallest possible bandwidth necessary to render the image. and because flash has 98.3 percent browser saturation, viewing “zoomified” images is seamless for most users and allows them to view images interactively in much greater detail than would otherwise be practical or even possible.1

zoomify image (sourceforge.net/projects/zoomifyimage) was developed at cornell university in collaboration with zoomify to provide an open-source, cross-platform, and scriptable version of the processing software that creates the image data displayed in a zoomifyer client. this work was immediately integrated into an innovative content-management system that was being developed within the zope application server, a premier web application and publishing platform. authors in this system can add high-resolution images just as they normally add downsampled images, and the image is automatically processed on the server by zoomify image and displayed within a zoomifyer client. zoomify image is now in its second major release on sourceforge and contains user-contributed software to easily deploy it in other environments such as php.

zoomifyer has been used in a number of applications in many fields, and can greatly enhance many research and instructional activities.
the application of zoomifyer to digital-image collections is obvious: it allows libraries to deliver an unprecedented level of detail in images published to the web. new applications also suggest themselves, such as serving high-resolution images taken from tissue samples in a medical lab or using zoomifyer in advanced geospatial image applications, particularly when advanced client features such as annotations are used.

the zoomifyer approach also has positive implications for preservation and copyright protection. zoomify image generates cached derivatives of master image files so the image masters are never directly accessed in the application or sent over the internet. image data are stored and transmitted to the client in small chunks so that end users do not have access to the full data of the original image.

deploying zoomify image

dependencies and installation

zoomify image was designed initially to be a faithful, cross-platform port of zoomify’s image-processing software. it was developed in close cooperation with zoomify to provide a scriptable method for invoking the image-preparation process for zoomifyer clients so this technology could be used in more environments. zoomify image is written in the python programming language and uses the third-party python imaging library (pil) with jpeg support, both of which are also open source and cross-platform. it has been tested in the following environments:

■ python 2.1.3
■ pil 1.1.3

and

■ python 2.4.3
■ pil 1.1.4

installers for python and pil exist for all major platforms and can be obtained at python.org and www.pythonware.com/products/pil. the installation documentation that comes with pil will help you locate the appropriate jpeg libraries if they are missing from your system. for macosx, you can find pre-built binary installers for python, pil and zope at sourceforge.net/projects/mosxzope.
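before running the converter it can help to confirm that python can import the imaging library and that jpeg support was compiled in. the sketch below is not part of zoomify image. it assumes pillow, the maintained fork of pil, because the features module it relies on does not exist in the pil 1.1.x releases listed above; the function name jpeg_support is made up for illustration.

```python
def jpeg_support():
    """Return True or False for JPEG support, or None if Pillow is not installed."""
    try:
        # Assumption: Pillow (the maintained PIL fork) is installed.
        # PIL 1.1.x, as tested in the article, has no `features` module.
        from PIL import features
    except ImportError:
        return None
    # check_codec reports whether the named codec was compiled into the library.
    return bool(features.check_codec("jpg"))


if __name__ == "__main__":
    status = jpeg_support()
    if status is None:
        print("imaging library not installed")
    elif status:
        print("imaging library found, JPEG support present")
    else:
        print("imaging library found, but JPEG support is missing")
```

a result of none or false points back to the installation documentation mentioned above rather than to a problem in zoomify image itself.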
introducing zoomify image
adam smith

adam smith (ajs17@cornell.edu) is a systems librarian at cornell university library, ithaca, new york.

the “ez” version of the zoomifyer client, a flash-based applet with basic pan and zoom functionality, is packaged with zoomify image for convenience so the software can be used immediately once installed. the ez client is covered by a separate license and can be easily replaced with more advanced clients from zoomify at www.zoomify.com. (a description of how to upgrade the zoomifyer client is included in this paper.)

after python and pil with jpeg support are installed, download the zoomify image software from sourceforge.net/projects/zoomifyimage and decompress it.

using zoomify image from the command line

begin exploring zoomify image by invoking it on the command line:

    python /zoomifyfileprocessor.py

or, to process more than one file at a time:

    python /zoomifyfileprocessor.py

the images input to zoomify image are typically either tiff or jpeg files, but can be in any of the many formats that pil can read.2 an image called “test.jpg” is included in the zoomify image distribution and is of sufficient size and complexity to provide an interesting example.

during processing, zoomify image creates a new directory to hold the converted image data in the same location as the image file being processed. the name of this directory is based on the file name of the image being processed, so that, for example, an image called “test.jpg” would have a corresponding folder called “test” containing the converted image data used by the zoomifyer client.
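the directory-naming rule can be sketched in a few lines. this is an illustration of the rule as the text describes it, not code taken from zoomify image, and the function name output_dir_for is hypothetical. it covers the case above and also files without an extension, which get “_data” appended so the directory cannot collide with the file itself.

```python
import os


def output_dir_for(image_path):
    """Return the directory that would hold the converted image data."""
    folder, file_name = os.path.split(image_path)
    root, ext = os.path.splitext(file_name)
    # With an extension: drop it ("test.jpg" -> "test").
    # Without one: append "_data" ("test" -> "test_data").
    name = root if ext else file_name + "_data"
    return os.path.join(folder, name)


if __name__ == "__main__":
    for path in ("test.jpg", "test"):
        print(path, "->", output_dir_for(path))
```

keeping the output next to the source image means a script can locate the converted data from the image path alone.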
if the image file has no file extension, the directory is named by appending “_data” to the image name, so that an image file named “test” would have a corresponding directory called “test_data.” if the process is re-run on the same images, any previously generated data are automatically deleted before being regenerated.

zoomify provides substantial documentation and sample code on its web site that demonstrates how to use the data generated by zoomify image in several environments. user-contributed code is bundled with zoomify image itself, further demonstrating how to dynamically incorporate this conversion process into several environments. an example of the use of zoomify image within the zope application server is given below.

incorporating zoomify image into the zope application server

the popular zope application server contains a number of built-in services including a web server, ftp and webdav servers, plug-ins for accessing relational databases, and a hierarchical object-oriented database that uses a file-system metaphor for storage. this object database provides a unique opportunity to incorporate zoomifyer into zope seamlessly.

to use zoomify image with zope, the distribution must be decompressed into your zope products directory. for versions 2.7.x and up, this is at:

    /products/

in zope versions prior to the 2.7.x series, the products directory is at:

    /lib/python/products/

restart zope, and the option to add zoomify image objects appears within the web-based zope management interface (zmi). after selecting this option, a form is presented that is identical to the form used for adding ordinary image objects within zope. when an image is uploaded using this form, zope automatically invokes the zoomify image conversion process on the server and links the generated data to the default zoomifyer client that comes with the distribution.
if the image is subsequently edited within zmi to upload a new version, any existing conversion data for that image are automatically deleted, and the new conversion data are generated to replace them, just as when invoked on the command line. again, the uploaded image can be in any format that zope recognizes as having a content-type of “image/...” and that pil can read.

the only potential “gotcha” in this process is that in the versions of the zoomifyer client the author has tested, zoomify image objects that have file names (in zope terminology, the file name is the object’s “id” property) with extensions other than “.jpg” are not displayed properly by the zoomifyer client. so, when uploading a tiff image, for example, the id given to the zoomify image object should either not contain an extension, or it should be changed from image.tif to something like image_tif. this bug has been reported to zoomify and may be fixed in newer versions of the flash-based viewing software at the time of publication.

to view the image within the zoomifyer client, simply call the “view” method of the object from within a browser. so, for a zoomify image object uploaded to:

    http:///test/test.jpg

go to this url:

    http:///test/test.jpg/view

or, to include this view of the image within a zope page template (zpt), simply call the tag method of the zoomify image just as you would a normal image object in zope. so, in a zpt, use this:

it is possible that the zoomify image conversion process will not have had time to complete when someone tries to view the image. the zoomify image object will attempt to degrade gracefully in this situation by trying to display a downsampled version of the image that is generated part way through the conversion process, or, if that is also not available, finally informing the user that the image is not yet ready to be viewed.
this logic is built into the tag method.

to add larger images more efficiently, or to add images in bulk, the zoomify image distribution contains detailed documentation on configuring zope to accept images via ftp or webdav and automatically process them through zoomify image when they are uploaded.

finally, the default zoomifyer client can be overridden by uploading a custom zoomifyer client into a location where the zoomify image object can “acquire” it, and giving it a zope id of “zoomifyclient.swf”.

how it works

to be viewed by a zoomifyer client, an image must be processed to produce tiles of the image at different scales, or tiers. an xml file that describes these tiles is also necessary. zoomify image provides a cross-platform method of producing these tiled images and the xml file that describes them.

beginning at 100-percent scale, the image is successively scaled in half to produce each tier, until both the width and height of the final tier are, at most, 256 pixels each. each tier is further divided into tiles that are, at most, 256 pixels wide by 256 pixels tall, as seen in figure 1. these tiles are created left to right, top to bottom. tiles are saved as images with the naming convention indicated in figure 2.

figure 1. tiers and tiles for a 2048 x 2048 pixel image

figure 2. tile image naming scheme

the numbering is zero-based, so that the smallest tier is represented by one tile that is at most 256 x 256 pixels with the name “0-0-0.jpg.” tiles are saved in directories in groups of 256, and those directories also follow a zero-based naming convention starting with “tilegroup0.” lower-numbered tile groups contain lower-numbered tiles, so 0-0-0.jpg is always in tilegroup0. zoomifyer clients understand this tile-naming scheme and only request tiles from the server that are necessary to stitch together the portion of the image being viewed at a particular scale.

limitations

zoomify image was developed to meet two goals:

1. to provide a cross-platform port of the zoomifyer converter for use in unix/linux systems, and
2. to make the converter scriptable, and ultimately integrate it into open-source content-management software, particularly zope.

this zoomifyer port was written in python, a mature, high-level programming language with an execution model similar to java. although zoomify image continues to be optimized, compared to the official zoomify conversion software, it is slower and more limited in the sizes of images it can reasonably process. anecdotally, zoomify image has been used effectively on images hundreds of megabytes in size, but significant performance degradation has been reported in the multi-gigabyte range.

because of these limitations in zoomify image, the official zoomify image-processing software is recommended for converting very large images manually in a windows or macintosh environment. the zoomify image product is recommended in the following circumstances:

■ the conversion must be performed on a unix/linux machine.
■ the conversion process must be scriptable, such as for batch processing or being run dynamically.
■ image sizes are not in the multi-gigabyte range.

if a scriptable, cross-platform version of the zoomifyer converter is needed, but performance is an issue, several things can be done to extend the current limits of the software. obviously, upgrading hardware, particularly ram, is effective and relatively inexpensive. running the latest versions of python and pil will also help. each new version of python makes significant performance improvements, and this was a primary goal of version 2.5, which was released in september 2006.

the author believes that the current weak link in the performance chain is related to how zoomify image is loading image data into memory with pil during processing.
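because conversion cost grows with the number of tiers and tiles that must be generated, it is useful to see how quickly those numbers grow under the halving scheme described in “how it works.” the sketch below is not the zoomify image source. it makes three assumptions that the text does not confirm: odd dimensions are halved by rounding down, tile names follow a tier-column-row pattern (the text only shows “0-0-0.jpg”), and tiles are numbered across tiers from smallest to largest when they are assigned to tile groups. the function names are hypothetical.

```python
TILE = 256        # maximum tile edge in pixels, per the text
GROUP_SIZE = 256  # tiles per tile-group directory, per the text


def tier_sizes(width, height, tile=TILE):
    """Return the (width, height) of every tier, smallest tier first."""
    sizes = [(width, height)]
    while sizes[-1][0] > tile or sizes[-1][1] > tile:
        w, h = sizes[-1]
        # Assumption: halve with rounding down, never below one pixel.
        sizes.append((max(1, w // 2), max(1, h // 2)))
    sizes.reverse()
    return sizes


def tile_names(width, height, tile=TILE, group_size=GROUP_SIZE):
    """Yield (tile_group_directory, file_name) for every tile."""
    index = 0  # running tile number across all tiers
    for tier, (w, h) in enumerate(tier_sizes(width, height, tile)):
        cols = -(-w // tile)  # integer ceiling division
        rows = -(-h // tile)
        for row in range(rows):      # top to bottom
            for col in range(cols):  # left to right
                # Directory name spelled as it appears in the text.
                group = "tilegroup%d" % (index // group_size)
                # Assumption: names are tier-column-row.
                yield group, "%d-%d-%d.jpg" % (tier, col, row)
                index += 1


if __name__ == "__main__":
    for size in (2048, 4096, 32768):
        count = sum(1 for _ in tile_names(size, size))
        print(size, "px square:", len(tier_sizes(size, size)), "tiers,", count, "tiles")
```

for the 2048 x 2048 image of figure 1 this gives four tiers and 85 tiles, all within the first tile group. each doubling of the image edge adds a tier and roughly quadruples the tile count, which is consistent with the slowdown reported for very large images.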
in the current distribution, a python script contributed by gawain avers, which is based partially on the zoomify image approach, uses imagemagick instead of pil for image manipulation and is better able to process multi-gigabyte images. the author would like to add the ability to designate the image library at runtime in future versions of zoomify image.

future development

beyond improving the performance of the core processing algorithm, the author would also like to explore opportunities for more efficiently processing images within zope, such as spawning a background thread for processing images so the zope web server can immediately respond to the client’s image-submission request.

the author would also like to improve the tag method to display data more flexibly in the zoomifyer client and ensure consistent behavior with zope’s default image tag method. finally, zoomify image could also benefit from the addition of a simple configuration file to control such runtime properties as image quality and which third-party image-processing library to use.

conclusion

zoomify image is mature, open-source software that makes it possible to publish large, high-resolution images to the web. it is designed to be convenient to use in a variety of architectures, and the images it produces can be viewed within existing browser software. download it for free, begin using it in minutes, and explore its unique possibilities.

references

1. adobe systems, macromedia flash player statistics, http://www.adobe.com/products/player_census/flashplayer/ (accessed march 1, 2007).
2. pythonware, python imaging library handbook: image file formats, http://www.pythonware.com/library/pil/handbook/formats.htm (accessed aug. 6, 2006).

resources

macromedia flash player statistics (http://www.adobe.com/products/player_census/flashplayer/) (accessed jan. 2, 2007).
python imaging library (pil) (http://www.pythonware.com/products/pil/) (accessed jan. 2, 2007).
python programming language official web site (http://www.python.org/) (accessed jan. 2, 2007).
zoomify image (http://sourceforge.net/projects/zoomifyimage/) (accessed jan. 2, 2007).
zoomify (http://www.zoomify.com/) (accessed jan. 2, 2007).
zope community (http://www.zope.org/) (accessed jan. 2, 2007).
zope installers for macosx (http://sourceforge.net/projects/mosxzope/) (accessed jan. 2, 2007).

reference is dead, long live reference: electronic collections in the digital age
heather b. terrell

information technology and libraries | december 2015

abstract

in a literature survey on how reference collections have changed to accommodate patrons’ web-based information-seeking behaviors, one notes a marked “us vs. them” mentality: a fear that the internet might render reference irrelevant. these anxieties are oft-noted in articles urging libraries to embrace digital and online reference sources. why all the ambivalence? citing existing research and literature, this essay explores myths about the supposed superiority of physical reference collections and how patrons actually use them, potential challenges associated with electronic reference collections, and how providing vital e-reference collections benefits the library as well as its patrons.

introduction

reference collections are intended to meet the immediate information needs of users. reference librarians develop these collections with the intention of using them to answer in-depth questions and to conduct ready-reference searches on a patron’s behalf. library users depend on reference collections to include easily navigable finding tools that assist them in locating sources that contain reliable information in a useful, accessible format and can be accessed when the information is needed.
the expectation for print reference collections is that they consist of high-use materials; they are designated noncirculating ostensibly so that they are available for on-demand access by both patrons and staff, who use them frequently. however, librarians and patrons alike have acquired what margaret landesman calls “online habits”: the most-utilized access point to information is often the 24/7 web.1 in a wired world, where the information universe of the internet is not only on our desktops, but also in our pockets and on our fashion accessories, the role of the print reference collection is less relevant in supporting information and research aims. in no other realm have the common practices of both users and librarians changed more than in how we seek information.

nevertheless, a technology-related panic seems to be at the boil, with article titles like “are reference books becoming an endangered species?”2 and “off the shelf: is print reference dead?”3 words like “invasion” are used to describe the influx of electronic reference sources. we read about the “unsustainable luxury” of housing hundreds, sometimes thousands, of unused books on the open shelves.

heather b. terrell (hterrell@milibrary.org), a recent mlis degree graduate from the school of library and information science, san jose state university, is winner of the 2015 lita/ex libris student writing award.

reference is dead, long live reference: electronic collections in the digital age | terrell | doi: 10.6017/ital.v34i4.9098

all this handwringing leads us to wonder why librarians in the field need this much coaxing to weed their print reference collections in favor of electronic reference resources. does this format transition really constitute such a dire situation? what if the decline of print reference usage isn’t a problem?
and what’s so luxurious about dusty, overcrowded shelves full of books no one cares to use? in “the myth of browsing,” barclay concludes that “the continued over-my-dead-body insistence that no books be removed [from libraries] is an unsustainable position.”4 a survey of the relevant literature reveals that staff resistant to the transition from print to electronic reference collections often share three core presumptions about reference: • users prefer using print sources, and the importance of patrons’ ability to browse the physical collection is paramount. • the reliability of web-based reference sources may be questionable, especially when compared with the authority of print reference materials. • access to print materials is the only option that certain users (namely, those without library cards) have for being connected to information. there also seems to be a more subtle assumption at play in the print vs. electronic reference debate—that print books are more “dignified,” cultivating a scholarly atmosphere in the library. certain objections to removing print reference collections to closed stacks and using the newly freed public space to build a cooperative learning commons, for instance, tend to devolve into hysterics about the potentiality of libraries becoming “no better than a starbucks.” the “no better” variable in this equation is a cosmetic one—librarians aren’t worrying about libraries serving up a flavored “information latte” for vast profit margins—they are worrying that libraries will be perceived as a place to loiter, use the internet, and “hang out,” rather than a place for serious study. one thing for librarians who worry about this potential outcome to consider is that loyal coffee shop denizens would be up in arms to learn that their favorite shop was being closed or its services being reduced or eliminated. the implications are clear. 
perhaps libraries should consider the café model: a collaborative “no-shushing” zone. the difference between a library and a coffee house is that at a library, people are able to explore, learn, and be entertained using the resources provided by the institution. at homer babbidge library at the university of connecticut, staff considered it important to “maintain a vestige of the reference collection, so that students were reminded they had entered a special place where scholarship was tangible.”5 however, users considered the underutilized stacks of books a waste of space that could be better used for cooperative work areas or computer access stations. the students’ needs and interests were heeded, and homer babbidge library’s learning commons has been a successful endeavour.

reference collections: history and purpose

brief points about the history of reference services lend context to the arguments presented in favor of building electronic reference collections. grimmelman points out that “it’s almost a cliché to assert that the internet is like a vast library, that it causes problems of information overload, or that it contains both treasures and junk in vast quantities.”6 from the earliest dedicated reference departments to the 24/7 reference model developed in response to progressing technology, tyckoson affirms that “one thing remains constant: users still need help. the question . . . is how to provide the library’s users with the best help.”7

browsing collections in libraries are newer than one might assume.
prior to world war ii, academic library faculty could browse to find reference materials that met the information needs of students, but undergrads weren’t even allowed in the stacks.8 in public libraries, reference collections were open to users, but reference rooms were considered to be, first and foremost, the domain and workplace of the reference librarian.9 this raises the question: what is the domain and workplace of the contemporary reference librarian? arguably, the answer to this query is wherever the information is, which today means online.

ready reference collections arose from the need to make the most commonly used resources in the library convenient and readily available for patron use.10 the most commonly used resources in contemporary libraries are those found online: again, where the information is. both users and librarians now turn to the web as the first resort for answering quick reference queries, and they turn to online databases and journals for exploring complex research questions. meanwhile print works that were once used daily sit moldering, gathering dust on the shelves either because they are outdated or because no one thinks to find them when the answer is available at the swipe of a finger or the click of a mouse from where they sit, whether that’s in the library or in, ahem, a coffee shop. “the convenience, speed, and ubiquity of the internet is making most print reference sources irrelevant,” tyckoson says.11

print preference, browsing collections

library use is increasing, but, as landesman and others point out, it is increasing because users want access to computers, instruction in technology, study spaces, or just a place to be that’s not home, not school, and not work.
users do not come to the library for reference sources; researchers and scholars prefer to access full-text works via their computing devices.12 the argument that users prefer print sources is antiquated, and the emphasis on building browsing collections of physical reference materials reflects a misguided notion that users crave tactile information. landesman is blunt: “when it is a core title, users want it online.”13

statistics bear her assertions out. studies show that usage of print reference collections is minimal and that users strongly prefer online access to reference materials.

• at stetson university, usage statistics gathered during the 2003–4 academic year showed that only 8.5 percent of the library’s 25,626 reference volumes were used during that period.14
• a year-long study by public libraries revealed that only 13 percent of winter park (fl) public library’s collection of 8,211 reference items was used.15
• when texas a&m university converted its reference collection’s primary format from print to e-books, a dramatic increase in use of the e-versions of reference titles was recorded.16
• in a survey of william & mary law library users, a majority of respondents indicated that they consciously select online reference sources over print, citing convenience and currency as top reasons for doing so.17

scanning the shelves may seem to some to be the most intuitive way to search for information, but in actual practice, browsing is ineffective: books at eye level are used more often, patrons are limited to sources not being used by another patron at any given moment, and overcrowding of the shelves results in patrons overlooking useful materials.18 browsing is the least effective way for patrons to “shop” a collection.
searching by electronic means overcomes the obstacles inherent in browsing the physical shelves when using well-designed search algorithms that employ keywords on the basis of accurate metadata. landesman indicates that if librarians commit to educating patrons on the use of the reference databases, ebooks, and websites they offer, online reference will be “a huge win for users.”19 it should be noted: no one suggests that print reference should be eliminated entirely, at least not yet. smaller print reference collections result in better-utilized spaces; they ensure that remaining resources in the physical collections are used more effectively—only the items that are actually high-use are included, which makes these sources easier to locate; books formerly classified as reference materials are able to circulate to those interested in their specialized content. smaller print reference collections serve patrons in myriad ways, including freeing up funds that can be used to enhance electronic reference collections. digital reference services are just another way of organizing information—there is no revolution here, unless it is in providing information with more efficiency—with breadth, depth, and access that surpasses what is possible via a print-only reference collection. the inevitable digital shift is a very natural evolution of patron-driven library services rather than cause for consternation on the part of library service providers. 
web reliability: google and wikipedia

those who argue that the reliability of online sources is questionable are typically referring to google results and wikipedia entries, which have little bearing on a library’s electronic collections of databases and e-books, but have plenty to do with a library’s reference services: these two sources are very often used in lieu of printed fact-finding sources such as atlases, subject or biographical dictionaries, and specialized sources like bartlett’s familiar quotations, which was last printed in 2012 and has recently gone digital.

for questions of fact, google is often a convenient and “reliable enough” source for most queries; authority of the results yielded by a google search is not always detectable, and is sometimes intentionally obscured, so the librarian must vet results carefully and select the most reliable sources when providing ready reference to patrons. however, google is far more than just its main search page. for instance, google scholar allows searchers to locate full-text articles as well as citations for scholarly papers and peer-reviewed journal articles. in general, there are many tools on the web, and librarians must expend effort determining how to make the best use of each. in particular, google is better suited to some information tasks than others; it’s up to the librarian to know when to use this tool and when to eschew it.

wikipedia has been the subject of much heated debate since its inception in 2001, but in a study conducted by nature magazine, the encyclopedias britannica and wikipedia were evaluated on the basis of a review of fifty pairs of articles by subject experts and found to be comparable in terms of the number of content errors: 2.9 and 3.9, respectively.20 deliberate misstatements of fact, usually in biographical entries, are cited as evidence that wikipedia is utterly unreliable as a reference source.
in fact, print sources have been plagued with the same issues. for many years, the dictionary of american biography contained an entry based on a hoax claiming a (nonexistent) diary of horatio alger, and while the entry was removed in later editions, the article was still referred to in the index for several years after its removal.21 if anything, it seems that format might provide a false sense of assurance that a source’s authority is infallible. all reference sources include bias, and all will include faulty information. the major difference between print and electronic sources is that in the digital era, using the tools of technology, these errors can be corrected quickly. what some see as declining quality of a source based on its format is simply a longstanding feature of human-produced reference works, dissociated from any print vs. web debate.

access: collection vs. policy

some academic and public libraries intend to decrease or discontinue purchasing print-based reference sources so funds can be diverted to build electronic reference collections; they weed print reference to make room for information commons containing technology used for accessing these electronic collections. the basic assumption in the objection to this practice is that the traditional model of in-person reference is integral to a functioning reference collection, that access to information depends on that information being printed on a physical page.

reference services are provided virtually via chat, im, and email. reference services are provided via the library’s website. reference services are provided by roving librarians, by librarians engaging in one-on-one literacy sessions, and in large-group training sessions. long gone are the days of the reference librarian who waits patiently at her station for a patron to approach with a question.
since the reference services model no longer mandatorily includes a stationary point on the library map, nor does providing quality reference depend solely on the depth and breadth of the print reference collection, how are print reference collections used? as indicated previously, about 10 percent of print reference collections are used by patrons on a regular basis. concern for the information needs of library users who do not have library cards is well-intentioned, but the question remains: if 90 percent of a collection goes unused, even when those users without library cards have access to these materials, is the collection useful? as stewart bodner of nypl says, “it is a disservice to refer a user to a difficult print resource when its electronic counterpart is a far superior product.”22 how users want to receive their information matters—access should not depend on whether a user can obtain a library card. for those libraries with high concentrations of patrons who do not qualify for library cards (e.g., individuals who do not have a fixed home address, or who cannot obtain a state-issued id card), libraries might reconsider their policies rather than their collections. computer-only access cards can be provided on a temporary basis for visitors and others who are unable to obtain permanent cards. san francisco public library recently instituted a welcome card for those members of the community who cannot meet identification requirements for full library privileges. the welcome card allows the user full access to computers and online resources and permits the patron to check out one physical item at a time.23 when compared with purchasing, housing, and maintaining vast print reference collections, this is a significantly less costly and far more patron-centered solution to the problem of access to electronic information sources— librarians should be advocates for users, with the goal being access to knowledge, no matter its format. 
information technology and libraries | december 2015 61

conclusion: building better hybrid collections

most library professionals agree that libraries should collect both print and electronic sources for their reference collections, but the ratio of print to digital is up for debate. as more formats with improved capabilities appear, researchers find that patrons prefer those sources that provide them with the best functionality. it is essential to look to the principles on which reference services are founded. one of those principles is to build collections on the basis of user preferences. librarians must consider what the reference collection is for and whether assumptions about patron preferences are backed by evidence. in essence, this means considering what “reference” means to users rather than defaulting to the status quo. a reference collection development policy must be based on what is actually used often, not on what might be used at some point in the future. the library is not an archive, preserving great tomes for posterity; the collections in a library are for use. with less emphasis on print materials, librarians might focus on the wealth of sources available electronically via databases and ebooks, as well as open-source, free online resources. librarians must cultivate an understanding of the resources patrons use and the formats in which they prefer to access information. as heintzelman and coauthors state, “a reference collection should evolve into a smaller and more efficient tool that continually adapts to the new era, merging into a symbiotic relationship with electronic resources.”24 rules of reference that were devised when print works were the premier sources of reference information no longer apply.
reference librarians must lead the way in responding to the digital shift—creating electronic collections centered on web-based recommendations, licensed databases of journals, and ebooks—with a focus on rich, interactive, and unbiased content. weeding reference collections of outdated and unused tomes, moving some materials to the closed stacks while allowing others to circulate, and building e-book reference collections allows libraries to provide effective reference services by cultivating collections that patrons want to use. much of the transition from print to electronic reference collections can be accomplished by ensuring that resources are promoted to patrons and staff, that training in using these tools is provided to patrons and staff, that librarians become involved in the selection of digital collections, and that the spaces where print collections were formerly housed are used in ways the community finds valuable. one need not worry about the “invasion” of e-reference or the “death” of print reference. the two can coexist peacefully and vitally, as long as librarians maintain focus on selecting the best material for their reference collections, no matter its format. references 1. margaret landesman, “getting it right—the evolution of reference collections,” reference librarian 44, no. 91–92 (2005): 8. 2. nicole heintzelman, courtney moore, and joyce ward, “are reference books becoming an endangered species? results of a yearlong study of reference book usage at the winter park public library,” public libraries 47, no. 5 (2008): 60–64. 3. sue polanka, “off the shelf: is print reference dead?” booklist 104, no. 9/10 (january 1 & 15, 2008): 127. 4. donald a. barclay, “the myth of browsing: academic library space in the age of facebook,” american libraries 41, no. 6–7 (2010): 52–54. 5. scott kennedy, “farewell to the reference librarian,” journal of library administration 51, no. 4 (2011): 319–25. 6. 
james grimmelmann, “information policy for the library of babel,” journal of business & technology law 3 (2008): 29. 7. david a. tyckoson, “issues and trends in the management of reference services: a historical perspective,” journal of library administration 51, no. 3 (2011): 259–78. 8. donald a. barclay, “the myth of browsing: academic library space in the age of facebook,” american libraries 41, no. 6–7 (2010): 52–54. 9. tyckoson, “issues and trends in the management of reference services.” 10. carol a. singer, “ready reference collections,” reference & user services quarterly 49, no. 3 (2010): 253–64. 11. tyckoson, “issues and trends in the management of reference services,” 293. 12. landesman, “getting it right,” 8. 13. ibid., 10. 14. jane t. bradford, “what’s coming off the shelves? a reference use study analyzing print reference sources used in a university library,” journal of academic librarianship 31, no. 6 (2005): 546–58. 15. heintzelman, moore, and ward, “are reference books becoming an endangered species?” 16. dennis dillon, “e-books: the university of texas experience, part 1,” library hi tech 19, no. 2 (2001): 113–25. 17. paul hellyer, “reference 2.0: the future of shrinking print reference collections seems destined for the web,” aall spectrum 13 (march 2009): 24–27. 18. barclay, “the myth of browsing.” 19. landesman, “getting it right.” 20. jim giles, “internet encyclopaedias go head to head,” nature 438, no. 7070 (2005): 900–901. 21. denise beaubien bennett, “the ebb and flow of reference products,” online searcher 38, no. 4 (2014): 44–52. 22. mirela roncevic, “the e-ref invasion-now that e-reference is ubiquitous, has the confusion in the reference community subsided?” library journal 130, no. 19 (2005): 8–16. 23. san francisco public library, “welcome card,” sfpl.org/pdf/services/sfpl314.pdf (2014): 1–2. 24.
heintzelman, moore, and ward, “are reference books becoming an endangered species?”

230 journal of library automation vol. 14/3 september 1981

for member libraries and will demonstrate their system in mid-1981. oclc data has been successfully transferred to many local circulation systems.

rlg/rlin
rlin does not anticipate offering local circulation services for member libraries. rlin data has been successfully transferred to several local circulation systems.

wln
wln does not anticipate offering local circulation systems on their computer for member libraries. wln data has been successfully transferred to local circulation systems and an agreement has been reached with dataphase, a computerized circulation system vendor, to discount purchase of their system by wln member libraries.

public online catalogs
again, none of the bibliographic utilities under consideration currently support public online catalogs of an individual library's collection. a public online catalog requires further programming in order to make it easy for the public to locate materials of interest without extensive training; the bibliographic utility's searching procedures are too esoteric to be used by the general public. as in circulation, issues of data transferability and full retrospective conversion of the uo library's catalog are paramount.

oclc
oclc does not currently encourage public access to their database and does not support use of local online catalogs on their computer due to the tremendous demand for computer resources exerted by 2400 member libraries. oclc and rlg/rlin are participating in a study of user requirements for a public online catalog. oclc data has been successfully transferred to several local online catalogs, including eugene public library's circulation and online catalog system, ulisys.

rlg/rlin
rlin anticipates being able to offer public access to their database.
they are participating in a study with oclc of user requirements for such a system, but no date has been announced for the development of this capability in rlin. rlin data has been successfully transferred to a local public online catalog at northwestern university.

wln
wln does not believe that a local online patron-accessed catalog should be provided through the wln computer, even though they anticipate having such a capability within one year. instead, they encourage libraries to develop local systems for public access to the online computerized catalog and to obtain data from the wln cataloging system. the university of illinois is adapting the wln computer search and database management software to provide a local online catalog and computer-assisted instruction in its use for the public.

checklist for cassette recorders connected to crts

prepared by lawrence a. woods: purdue university libraries, west lafayette, indiana, for the technical standards for library automation committee, information science and automation section, library and information technology association.

introduction
a data cassette recorder connected to a printer port is an effective, low-cost method of collecting data in machine-readable form from display terminals such as the oclc 100/105. it is important that a data recorder be used rather than an audio recorder, although the cassette itself can be a good-quality audio tape. it is also important to note that the data recorded on the tape are not the same as the data originally transmitted to the display terminal, but are simply a line-by-line image of what appears on the screen. a typical installation will have a minimum of two devices: one attached to the display terminal to collect data, and one attached to a printer or an input device to another computer for playback of the data. there are more than 150 different data recording devices on the market.
this checklist is prescriptive in nature, outlining and describing those features that are necessary or desirable for a typical application. in addition to features, environmental considerations are briefly mentioned along with information for the purchase, lease, or rental of data equipment.

features
in general, features must be compatible between all devices used for recording and playback in a given application. some features that are desirable for certain applications are unnecessary or inappropriate for others.
1. recording media. the philips cassette is most widely used and may be interchanged between the recorders of different manufacturers that utilize it. the cartridge (either 3m or a vendor proprietary cartridge) is gaining popularity because of its greater storage and transfer rates, but as yet is not widely used.
2. code. most print ports on display terminals use ascii (american standard code for information interchange) data code. the recorder selected should use the same.
3. interfaces. the cassette recorder has an "in" plug to accept data. this must be compatible with the print port on the terminal, usually rs232c. the "out" plug on the recorder sends the recorded data to a printer or to a computer. this interface should also be rs232c.
4. recording characteristics
a. the number of tracks can vary from one to four. this is one of the factors that determine the amount of data that can be recorded on a single cassette. four tracks are recommended.
b. density also affects the amount of data that can be recorded. usual densities are 800 or 1,600 bits per inch (bpi).
c. recording mode. there are several modes available. phase encoded (pe) is the best mode for data applications. non-return to zero (nrz) is a popular mode, but has poor error recovery. ibm has a version called nrzi, which improves on nrz but still is less reliable than phase encoded. other commonly found modes are complementary nrz and ratio recording.
d.
recording format. there is a variety of recording formats. to be assured of compatibility with the terminal and playback device, the format should be either ansi (american national standards institute) or ecma (european computer manufacturers association) compatible.
5. transmission
a. duplex. the recorder should have both full and half duplex available.
b. data transfer rate (baud rate). baud rate is usually switch-selectable from 110 to 9600. the recorder must be set at the same speed as the printer port on the terminal. the oclc 100 and 105 terminals have a printer port baud rate selection switch that may be set at 100, 150, 300, 600, 1200, and a meaningless 1800 baud. select a recorder that has the fastest compatible setting: 1200 baud is best. data must be played back at a rate compatible with the receiving device.
6. tape transport characteristics
a. read/write speed is usually a function of the baud rate.
b. non-recording speeds. this feature is important for convenience. fast forward and rewind should be available. one hundred twenty inches per second will rewind a cassette in about thirty seconds.
c. drive mechanism. four options are available: capstan, pinch roller, servomechanism, or reel-to-reel. pinch roller is the most precise but reduces the life of the tape.
7. packaging. this feature can affect the price of the final configuration. if any item is listed as "separate," increase the total price accordingly. components that can be either internal or separate are: controller, interface, or power supply.
8. remote operation. some devices use ascii control codes to trigger controls automatically. this is a useful feature, but the device must have a transparent mode switch; otherwise codes embedded in the data being recorded or sent may trigger undesired operations such as rewind.
9. operating characteristics
a.
rewind, fast forward, initialize, send, and receive are all necessary operations and should be switch-controlled.
b. edit, auto program search, string search, skip, etc., are useful for word-processing operations but are of little use in simple data collection and transmission.
c. read backward is desirable for sort operations.
d. character mode, line mode, and string mode are useful for printing operations but of little use in data transmission.
e. online/offline should be switch-selectable.
f. simultaneous read/write is useful for editing operations.
g. direct block accessing is useful if there is a need to search for recorded data but is not used in sequential processing.
h. auto reverse is a useful feature for recording or transmitting more data than can be recorded on one side of a cassette.

environmental requirements
1. humidity range. humidity range should be 20 percent to 80 percent without condensation. lower humidity will cause excessive static electricity.
2. temperature. temperature range should be between ten degrees and forty degrees centigrade.
3. power requirements. most recorders require a standard 115-volt alternating current at 47 to 63 hz and draw about 60 watts. the circuit should be free from interference such as that caused by fluorescent lights. a transformer may be required in the outlet to guarantee even power.
4. space requirements. the recorder usually can be stored on a desk top. it is important that the indicator lights be visible to the terminal operator to monitor its operation.

purchase
1. maintenance and availability. ask how many drives the manufacturer has installed to date. this may vary from a few hundred to one hundred thousand or more. establish a maintenance contract with the company or a local service bureau. it may be necessary to acquire a spare recorder to use as backup.
2. price. determine ahead of time what features you are actually going to use. bells and whistles all cost money.
a simple reliable recorder can be purchased for around $700. multiple drive units and other features can run as high as $3,600.

what’s in a word? rethinking facet headings in a discovery service

david nelson and linda turney

information technology and libraries | june 2015 76

abstract

the emergence of discovery systems has been well received by libraries that have long been concerned with offering a smorgasbord of databases that require either individual searching of databases or the problematic use of federated searching. the ability to search across a wide array of subscribed and open-access information resources via a centralized index has opened up access for users to a library’s wealth of information resources. this capability has been particularly praised for its “google-like” search interface, thereby conforming to user expectations for information searching. yet all discovery services also include facets as a search capability and thus provide faceted navigation, a search feature for which google is not particularly well suited. discovery services thus provide a hybrid search interface. an examination of e-commerce sites clearly shows that faceted navigation is an integral part of their discovery systems. many library opacs also now are being developed with faceted navigation capabilities. however, the discovery services’ faceted structures suffer from a number of problems that inhibit their usefulness and their potential. this article examines several of these issues and offers suggestions for improving the discovery search interface.
it also argues that vendors and libraries need to work together to more closely analyze the user experience of the discovery system.

david nelson (david.nelson@mtsu.edu) is chair, collection development and management, and linda turney (linda.turney@mtsu.edu) is cataloging librarian, james e. walker library, middle tennessee state university, murfreesboro, tennessee.

what’s in a word? rethinking facet headings in a discovery service | nelson and turney doi: 10.6017/ital.v34i2.5629 77

introduction

the emergence of google as the premier search engine1 has had a very profound effect on searcher expectations regarding information.2 by virtue of its simplicity and the remarkably powerful search algorithms that enable its highly relevant results, the simple search box of google has clearly triumphed as the preferred way to find information.

but is the google search model really the panacea that libraries need to resolve their search interface requirements? the nature of search engine and search interface design is a very complex issue. unfortunately for academic libraries, google has dominated discussions and thinking about search engine interfaces: “just google it!” is a simple google search box really the preferred vehicle with which libraries should be delivering their content, both licensed and unlicensed?

the assumption librarians make to justify the use of a google model is that library users are essentially google users,3 or that they have the same information searching needs.4 this is a flawed assumption.
as an academic library, we are tasked with making discoverable not simply digital-only information, but information objects with discrete characteristics that often constitute the object of a search, e.g., an audio book, a film, or even a book on a shelf. google has put the emphasis on the keyword, with remarkably gratifying results for the average lay user. however, a recent project information literacy study concluded that “google-centric search skills that freshmen bring from high school only get them so far—but not far enough—with finding and using the trusted sources they need for fulfilling college research assignments.”5 until now, the library web development focus on providing a “google-like” search has, unfortunately, diverted attention from an appreciation of the developments in other areas of the internet world, such as e-commerce, where searching for information is an integral component of the buyer–seller relationship.

commercial entities have a vested interest in developing their websites to enable each user to have a successful search outcome. while the search interfaces routinely encountered at various e-commerce sites may seem obvious, it is important to remember that one is looking at a series of deliberate decisions made with regard to the interface organization and structure.
for companies, the search interface represents millions of dollars in investment, and the design is part of their search engine optimization strategy.6 in this way, companies and other organizations create robust search interfaces that enable visitors to effectively and efficiently find what they want in the company “knowledgebase.”

it is clear that the product search industry has arrived at some very significant conclusions about user search behavior, and that they strive to optimize their interfaces to accommodate those conclusions. three features stand out: (1) the importance of facets as a key component in the search design; (2) the personalization of the text that instructs the user; and (3) intelligibility of facet labels. in a blog article on facets in e-commerce websites, scharnell advises that, when determining what the facets are, there are two rules to follow: (1) keep it simple; and (2) create an intuitive structure.7

the primary goal of a commercial website is to bring about what is called conversion, that is, getting someone to the site (driving traffic) and, ultimately, making a sale (the conversion). companies have discovered that facets are the key to enabling their potential customers to locate discrete pieces of information (e.g., a product) almost intuitively. broughton observes that “there is an evident faceted approach to product information in many commercial websites.”8 an important characteristic of the faceted structure is that it enables the user to browse a collection.
thus the goals of a commercial site successfully employing faceted navigation are not that different from the objectives that a library discovery layer seeks to accomplish. while the literature on information literacy is now vast, very few articles deal with the role that facets play in the discovery process for student searchers. fagan has addressed this issue of facets.9 ramdeen and hemminger discuss the role of facets in the library catalog.10 to date, reviews of discovery systems or catalog interfaces tend to place emphasis on helping patrons to search our demonstrably flawed systems rather than considering the interfaces as the actual source of problems for users.11

it can be argued that comparing an academic site and a commercial site compares apples and oranges, there being little connection between complex, open-ended subject/research questions and a search of a company’s inventory of goods. however, there are elements of commonality at the higher level of an information need that drive an individual to perform any kind of information search. in both the subject/topic search and the product search there is a need to evaluate results as they appear and to make various decisions while going through a search process to limit and narrow a search. that is, for the information objects that libraries seek to make discoverable, their extratextual characteristics are often every bit as important as the content itself.
this leads to a discussion of facets, the various attributes by which we can further describe the “manifestation” and the “expression” (using the frbr sense here) of an intellectual creation. we need to pay more attention to the importance of facets as a critical component of the search process. that is, we must begin to move away from the mantra that our single search box will provide a successful result without additional considerations, and from the idea that the facets are of secondary, even tertiary importance. badke observes “that users of google actually need a deeper level of information literacy because google offers so little opportunity to nuance or facet results.”12

yet facets are a key part of our discovery interface design. however, a full and successful exploitation of their possibilities has been significantly hobbled by a use of jargon-heavy terminology that assumes users will immediately and instinctively grasp the concept of a faceted term. even a superficial study of many successful commercial websites quickly leads the thoughtful observer to the conclusion that their web developers and designers have been making excellent use of focus groups and surveys to make the search process as easy as possible. while businesses have an obvious monetary incentive to make sure their users do not leave a site because the site itself presented a problem, libraries have the same interest in making sure our users are equally able to easily search our site.
a library’s site should not, by its assumptions about the user, present obstacles to their search success.13

with the growing use of discovery systems,14 academic libraries are entering into a new phase of search engine deployment.15 by making use of a preindexed database rather than the more restrictive federated search process, the discovery service interface allows a user to search for content in a wide variety of publication and media types (e.g., journals, books, dictionaries, audio books, videos, manuscripts, newspapers, images, etc.). to assist searchers, discovery systems provide faceted navigation along with the search box interface. several studies have shown that the use of facets in the library environment has proven effective in assisting searchers.16 however, it is equally clear that library vendors have not thought deeply about the facet category labels, and libraries, which can do a certain amount of customization, tend toward unquestioning acceptance of the vendor-supplied labels. this is a critical area involving both the user interface and the user experience; libraries and vendors need to spend far more time and effort on ensuring the intelligibility of the facet labels and on finding effective ways to encourage their use.

the presence of facets is a standard feature for all library discovery systems.17 however, as we will show below, facet labels are not easily understandable for the average user and our search systems tend toward emphasizing our users as “anonymous service recipient(s).”

what are facets?
a review of various discussions of facets in information retrieval literature reveals the elasticity of the term, along with related terms.18 will observes that “what a facet is has been stretched . . . and the term is used loosely to mean any system in which terms are selected from pre-defined groups at the time of searching.”19 it is probably easiest to understand the use of the term facet in information retrieval systems as categories derived from the universe of objects that one is seeking to discover, whether we are dealing with manufactured products at home depot or greek manuscripts in a library collection.20 what adds to the problem of definition is the number of synonyms: “the term facet is commonly considered as analogous to category, attribute, class and concept.”21 how objects are grouped would most logically determine the facets that are necessary for the classification scheme. it is the objects that are under a facet that present a problem in understanding. niso z39.19 defines facets as “attributes of content objects encompassing various non-semantic aspects of a document,”22 thus including such things as author, language, format, etc. the terms that are indexed are not the facets but rather concepts that exist in a unique relationship to the facet. “homer” is indexed under a facet “author,” but indexing the term author is meaningless.

another source of confusion is the failure to distinguish between facets and filters, both of which are used to refine or narrow a search.23 when a search interface states that it is using “faceted navigation,” usually both facets and filters are present.
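the relationship described above, in which a value such as “homer” is indexed under a facet such as “author” while the facet label itself is never an indexed term, can be sketched in a few lines of python. this is a minimal illustration under stated assumptions, not a description of how any discovery service is implemented: the sample records, the field names, and the helper functions (build_facet_index, select, facet_counts, year_between) are all hypothetical, and modelling a filter over a range as a plain numeric comparison is a choice made only for this sketch.

```python
from collections import defaultdict

# Hypothetical sample records; fields and values are illustrative only.
RECORDS = [
    {"id": 1, "title": "iliad", "author": "homer", "format": "book", "year": 1998},
    {"id": 2, "title": "odyssey", "author": "homer", "format": "audio book", "year": 2006},
    {"id": 3, "title": "aeneid", "author": "virgil", "format": "book", "year": 2008},
]

# Facet labels are headings for grouping; they are not themselves indexed.
FACETS = ("author", "format")


def build_facet_index(records, facets=FACETS):
    """Map each facet label to {indexed value: set of record ids}."""
    index = {facet: defaultdict(set) for facet in facets}
    for record in records:
        for facet in facets:
            index[facet][record[facet]].add(record["id"])
    return index


def select(index, facet, value):
    """Return the ids of the records that carry `value` under `facet`."""
    return set(index[facet].get(value, set()))


def facet_counts(index, facet, ids):
    """For display: how many of `ids` fall under each value of `facet`."""
    counts = {}
    for value, members in index[facet].items():
        hits = len(members & ids)
        if hits:
            counts[value] = hits
    return counts


def year_between(records, ids, first, last):
    """Assumed model of a range filter: a comparison, not a term lookup."""
    return {r["id"] for r in records
            if r["id"] in ids and first <= r["year"] <= last}


index = build_facet_index(RECORDS)
homer_ids = select(index, "author", "homer")
print(sorted(homer_ids))                          # ids indexed under author "homer"
print(facet_counts(index, "format", homer_ids))   # formats among those records
print(select(index, "author", "author"))          # the label itself matches nothing
print(sorted(year_between(RECORDS, {1, 2, 3}, 2000, 2010)))
```

in this sketch, selecting “homer” under “author” is a lookup of an indexed value, whereas looking up the word “author” returns nothing, because the label only names the group. the counts returned by facet_counts correspond to the numbers an interface might show next to each value once a selection has narrowed the result set.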
because both a facet and a filter are part of retrieval, it is often difficult to separate the two. once again, we encounter a terminological problem. for example, one can speak of how a facet itself is used to filter a search in the sense that it refines or narrows a search to a smaller segment of the universe of objects. here the term filter refers to the process of narrowing a search. but we also have filters that deal with ranges. thus, the filter “date” covers a range of time, from say one month or one year, to a range over a specific period of time. the same can be seen for the filter “price,” used to specify only one amount, say $5, or a range from $100 to $299. the critical difference between a facet and a range filter is that the terms found in a facet are indexed while a range filter (e.g., date or price) is not an indexed term. it is important to maintain a clear distinction between a facet and a range filter because the underlying metadata is different. a range filter sorts the content in a specific way and at the same time narrows the results.

our examples, along with the closer analysis of the ebsco eds discovery system below, will amply demonstrate that facets and filters are extremely effective in information retrieval systems.
the challenge that libraries face is the need to make sure that users are aware of their presence on a search interface rather than relying exclusively on keywords and on the algorithmically based result.24 the value of the faceted/filtered search is its ability to lead the searcher quickly and efficiently to the desired result, a result that will too often elude the user even with a powerful google search, unless that user gets most of the terms exactly right.

we chose various e-commerce websites because they have extremely large numbers of site visits or because they were smaller specialty sites that reflected a more highly optimized use of facets. a wide range of product types was in the selection. the frequency of visits indicates that large numbers of users are exposed to a search page structure and terminology, which in turn establishes a standard for a set of user expectations. best buy, target, and home depot are among the top one hundred accessed websites, a fact richly indicative of the type of influence they will have in setting user search expectations. an examination of these websites reveals an underlying set of best practices for making use of faceted navigation with text searching.

linguistic personalization

with the advent of web 2.0 there are several forms of interaction an individual can have with a website.
these can be considered forms of personalization of websites.25 usually, personalization is “largely about filtering content to satisfy an individual’s particular” information needs.26 we see personalization at its most complex in the algorithmically adjusted results to a search based on previous searches. there we find the feature of suggestions that are offered to an individual on the basis of search results, a feature offered by amazon and netflix. while we will not be able to personalize our discovery services in a manner similar to netflix or amazon, we can improve the quality of the interaction in other areas of “personalization.” we should be seeking ways we can more directly speak to individual searchers, for example, by selecting words and phrases that speak directly to a person’s needs.

our examination of many e-commerce sites reveals a robust use of linguistically personalized features as an intrinsic part of their website design and enhancement. that is, e-commerce sites make use of their interface itself to directly communicate with their customers in a way that makes use of certain linguistic features that can be easily adopted by library sites. combined with faceted searching, adding certain linguistic features should prove effective in encouraging the use of the facets, and in the process improve both the search results and the user experience. this constitutes the fundamental challenge for the academic library: to help shape the mental model with regard to the universe of content that we provide through our search interface.
finally, there is what we can consider a form of linguistic personalization in which language is used to “speak” more directly to a searcher. it is this third feature of linguistic personalization that libraries can more easily control and customize with the discovery services, as well as at other places on the library website.

there is, of course, the personalization that is intended primarily for those who register and then set up their own accounts. however, there is also the personalization in terms of text communication in which the website uses both pronouns and verb forms that directly address the searcher. this is seen in the use of the second-person pronoun, either the subject or the possessive, “you” or “your,” and for verbal forms, the use of the second-person imperative (usually the same as the infinitive in english). this type of personalization is a web design decision. the search box now frequently contains text, ranging from simple noun lists to sentences, all of which are intended to encourage the user to make use of the search capabilities. after a search has occurred, the results are also indicated with text that speaks directly to a person by means of the use of pronouns and verbs. we find the following interesting examples in table 1:

wording (pronoun use)                 site         notes
what are you looking for today?       kroger       search box
what can we help you find?            home depot   search box
what are you looking for?             lowe’s       search box
your selections                       target       post-search
we found x results for [search term]  target       post-search
narrow your results                   tigerdirect  post-search

table 1. linguistic personalization examples

in examining the features that are found at these e-commerce sites, it is interesting to note the use of either of two words for the facet instructions: refine or narrow, two words our users will routinely encounter in nonlibrary searching.

the various sites all have the following elements:
1. search box
2. search results outcome clearly shown
3. facet instruction [“refine,” “narrow,” “show”]
4. facets

major problems with library discovery interfaces

we can identify three important areas that need to be considered with the discovery interface design:
1. the search box itself
2. the facet labels and their intelligibility
3. getting the user to the facets area

the library search box

the search box makes an excellent point of departure for implementing improvements of the library’s discovery interfaces. note that companies do not assume prior search knowledge on the part of their potential market; they explicitly tell people what they can do in the search box. as we see in table 1, many companies (e.g., home depot and lowe’s) are choosing to use entire sentences, not merely clipped phrases or strings of nouns.

many libraries are beginning to populate the search box with text. however, that text is often simply a noun list of types of formats, e.g., articles, books, media. it is important to point out that there is an implicit expectation of an action present in a search box.
but too often when our library websites supply a list of nouns, we are assuming that we are answering the question in the mind of the searcher. searchers are looking for a subject or topic, and we supply a string of nouns that enumerate formats. so right from the beginning, we find a mismatch between the user’s purpose when coming to a library’s search box and our arbitrary enumeration not of topics, but of types of information sources.

once we recognize this problem, we have some very good options to choose from in terms of personalizing the search box in a way that is more analogous to what home depot and lowe’s offer:

1. what are you looking for?

the sentence above is colloquial; it is exactly what a person would expect to hear when approaching a reference librarian or a service counter in a variety of settings.

2. what are you searching for?

this is a more complex concept because it includes what can be considered a technical term (“search”), a word now commonly understood within the context of searching for information and not only applicable to a lost dog or strayed notebook.

this simple adjustment matches the user’s intent with a clearly stated purpose in the search box. there are additional ways we can enrich the search box that will assist the users in their queries.

both examples use the pronoun you so that the sentence speaks directly to the individual searcher. there is, of course, the option to just use a verb in the imperative: “search for . . .” or “enter [keywords, terms, etc.]”.
however, the added feature of the pronoun you promotes the involvement of the participant-searcher. see also the interesting article by thompson, gray, and kim on the use of personal pronouns in social media communications by university students.27

facets column

all library discovery services make use of facets. because the facets column is a far more challenging area of linguistic personalization for the discovery interface, specific design features should be employed to immediately attract the user’s attention to it. this is a very complicated area that deals with user behavior, interface design, etc. how do we direct the user’s attention to the facets column, or even make the user aware of the facets on the lefthand side? we can add a note after a search that says something to the effect of “too many results/hits? try narrowing your search with the facets below.” although this involves difficult interface design issues, it is very important that we begin to think more seriously about ways to draw our users into the search process more intuitively and effectively. if we don’t, we will find the continual underutilization of an incredibly powerful searching feature.

we also know that users routinely ignore advertising banners so often that the literature has christened this tendency “banner blindness”; in the same way, if our facet labels are meaningless, they will be overlooked.28 we condemn the discovery service interface to the same fate if we are not careful to choose meaningful labels that make sense when the “average” student or faculty user encounters them.
currently, we also assume knowledge on the part of our users that they clearly do not have, or we anticipate much greater success with instruction than is usually warranted. there are several studies that show the disparity between the searcher’s self-assessment and the reality of the actual skill possessed.29

one of the main problems users experience with search engines is their inability to narrow their searches, especially because we are now dealing with such a large array of information source types.30 this is where the use of facets comes into its own. as we seek to make the discovery interface the first and, eventually, probably the only primary interface to our selected resources, the user needs to know how to easily find a video or a sound recording as well as a pertinent article. this should be done through an easily accessible and understandable search interface. the success of the e-commerce sites in making effective and profitable use of facets amply demonstrates the value of facets even for complex research questions and topics.

this brings up the matter of naming conventions for the facets. it is clear that, despite the newness of discovery services, the facet labels simply continue the naming conventions that are used in databases.
we know from usability studies that library jargon is a stumbling block for our users.31 when we do not pay close attention to the appropriateness of each facet category label, we simply continue the utilization of a terminology that is foreign to the understanding of many of our users,32 undermining the use of a powerful searching feature merely because of user ignorance of the terms. an honest appraisal of the discovery interface will bring us immediately face-to-face with one of our primary legacy library problems, our heavily jargon-laden vocabulary. in fact, we are actually dealing simultaneously with two problems: the facet labels that are chosen and the complexity of the information universe that discovery systems expose. at a presentation on discovery services at the 2014 ala annual conference, one speaker went so far as to say that facets are not used in discovery searches.33 this underscores the unpleasant reality that we are dealing with both a design problem and an intelligibility problem, not the failure of facets as a navigational feature. at a recent loex presentation, one school had already thrown in the towel and will concentrate on teaching academic search premier over the discovery service primo.34 again, this reveals that users are having a problem with the interface and its display content.

suggestions for improving facets and the facet labels

currently, the facet labels in library discovery service interfaces are limited to a list of nouns that designate the facets that can be used for narrowing or limiting a search.
however, the labels that we use may not be meaningful to our users and are simply a list of nouns that are, by and large, not really understood.35 in addition, a facet label is intended to have the user do something; hence a verb of action is implied. in standard classification taxonomies, the facet is used for organizing and grouping the objects that will be included in the facet. for a discovery system, the facet is there to lead the searcher to content on the basis of the content’s differing characteristics as expressed through a facet. one has to ask the question, exactly why would a student do something simply because that student sees a noun on the lefthand side? we need to provide more context during the search process.

below we make recommendations that we think will enhance the intelligibility and the usability of facets.36 it will be important for libraries and vendors to do substantial user experience investigations into the various options that are available for use on a discovery page. our goal is to draw attention to the current inadequacies in how facets have been implemented in discovery services and to encourage a more systematic approach to this important area of our library information delivery capabilities.

1. as observed above, in the e-commerce sites, the facet is indicated by the presence of an icon marker that allows for the facet to expand and contract. in our sample of sites, the +/- sign and a triangle (a full triangle, not a right- or downward-pointing chevron) were used about equally. eds made the decision to go with the chevron symbol. this is a user interface issue and one that needs further examination and testing.
we think that the +/- sign is a more suitable visual icon indicator for a user to take a specific action. the + and - signs also have a value attached to them that says to a user “yes” for the + sign and “no” for the - sign, thereby signaling a user to expand (+) or contract (-) a list. we want to attract users to the facets and prompt them to take an action.

2. make sure that only facets and filters are collapsible and expandable and that the design interface makes this clear.

3. the term limit is often found in discovery systems. this is a term that was not found in our sample of e-commerce sites. the two primary terms are refine and narrow. the advantage of using these terms is that one can more easily personalize this feature, “narrow your results to” [full text] [scholarly . . .] [date]; these are two words that users normally see when searching e-commerce sites.

4. the facet “source types” is a common facet label. this is obscure terminology that users, especially students, tend not to know. a suitable option to personalize this category could be, “what type of information do you need?” and then list the types. at least by asking the question, a user will be encouraged to look at the possibilities available, e.g., academic journals, trade publications, magazines.

in the following list of facets, we can see that the facets themselves are inherently contradictory or do not actually represent what they purport to be. this is not an argument against facets; rather, we need to rethink exactly what we do want our metadata to do.
to simply take up space on the facets column does not serve any purpose. it is also clear that we need to systematically monitor the use of facets, and for this we need analytics. at this point, it is difficult, if not impossible, to know whether facets have been used for searches and, if so, which facets have been used. until we routinely gather this sort of data, we will not have the appropriate data to make suitable decisions about facets and their use.

1. language: this facet represents the language (both written and spoken content) of the work. while the term language is understood by users, we need to consider whether the word alone triggers a response. since users most likely want only english, the facet label can ask that question, and then the selection of language choices will appear, making it clear that there are other choices as well.

a question like “do you want english only?” will then elicit a response to narrow the results by language. with the majority of the materials in english, this may be moot, but it does encourage the searcher to think about the language.

the discovery layer adds the facet term “undetermined” when the provided metadata does not specify the language of a work. in a sense, the metadata has holes, and a user who is searching for a particular language will inadvertently exclude relevant search results if the facet is used too soon to filter out undesired languages. we recommend that filtering by language be used only as necessary and only when the user is overwhelmed by a large number of results in unwanted languages.

2. publisher: this facet represents the entity or the issuer of a published work.
this applies across both serial and nonserial materials. the user most likely understands this term. but the question is, what is the value of this facet? while we do have the metadata for this, it is difficult to understand the circumstances under which one will actually limit a search by the publisher. we suggest not displaying this facet.

3. publication: this facet represents the source title of the published work, such as a journal, trade magazine, or newspaper. this applies primarily to articles, book reviews, columns, etc., and not to publications like books, sound recordings, and videos. the user must be made aware that this facet should be used for serial-type materials only. an alternative to “publication” can be “article source.” this facet answers an implicit search query, which could be posed in a pop-up window: “what journal or magazine are you looking for?”

4. content providers: this is a very problematic facet. it is not difficult to surmise that most users when encountering this term would not know what it means and, more significantly, why it is important. in fact, the term itself is not accurate, another interesting issue that must be dealt with. the “content providers” may not be the actual providers of content but rather providers of the metadata content, which is something altogether different. for example, emerald is the actual content provider for an article, yet a different provider, the metadata provider, is listed as the content provider.
a suggested replacement for this term is “sources.” wording for a pop-up window could be, “to narrow your search, choose from a source that most closely matches your topic. the sources are from different types of subject databases.”

5. subject: the use of the facet “subject” may seem to be obvious, yet, upon closer inspection, the nature of this facet is problematic. what is the cognitive connection between first doing a keyword search and then seeing on the lefthand side the facet label “subject”? why should users assume they should now click on a link called “subject,” since they just finished doing a subject search? we need to provide the context for an action that takes into account the most common experience of the user. using the term “topic” rather than “subject” would allow us to offer a term that is more congruent with the familiar vocabulary of a student’s classroom experience because generally students are directed to research topics.

a university of washington libraries usability study from the prediscovery era (2004) found that users preferred “browse subjects” to “by subject.”37 here we see the presence of a verb specifying an action. the significant finding for our purposes from this earlier study is the fact that users found the phrase with a verb more meaningful than the phrase with a preposition.
we suggest making it clear that the user can further refine the search by the suggested subjects that are listed in the facets by using the phrase “narrow your topic” or “further narrow your topic.” the pop-up window could say, “to narrow your search, choose from this list of possible topics that most closely match your search terms.”

the conclusion reached by the university of arizona study is even more relevant for the discovery layer interface: “we learned that if students have no idea why or when they should use an index, they will not choose a link labeled index, no matter how well designed the web page is.”38 this is the situation with facet labels. if they are not intelligible, or do not at least provoke some response to a question posed, they will be ignored, and if ignored, their potential value goes completely unused.

conclusion

e-commerce has concluded, in the face of overwhelmingly positive evidence, that facets are an essential aspect of the successful (i.e., profitable) user experience, and facets have been almost universally adopted by companies that sell products, have very large product lines, and need to lead their customers to exactly the type of product they want. in our discovery layers, we also need to develop the kinds of features that promote the effective use of the resources we offer our academic users, and build in, where feasible, appropriate features.
modifications can and should be made as libraries work with their discovery-services vendor to rationalize an interface page that should include natural language, easily understandable navigation, logical taxonomic ordering of the facets, etc. in essence, both product searches and academic information searches present the same scenario: we begin with an information need, a retrieval system, and the need to achieve recall, precision, and relevance.

discovery services allow for an information search to be carried out essentially as a google search while limiting the scope of facets to assistance in refining it. we can be confident that our users, many (or even most) of whom also use e-commerce faceted search sites, are able to recognize a similar search interface. thus we are dealing with an important design issue. but to what extent do our users take advantage of faceted searches? as it stands at this writing, the link between the facets and their corresponding content “documents” (articles or video) is simply not clear. the characteristics of our discoverable objects must be tied in with what a user would be likely to understand.

we need analytics capable of supplying this sort of critical user-experience information. it may be that we are dealing with conflicting mental models about information searching. students and other members of the academic community may simply not be adequately cognizant of the implicit faceted nature of their query, and this becomes a new opportunity for improvements in our approach to user instruction.
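the usage data called for above could be gathered with very little machinery. the following is a minimal sketch under stated assumptions, not a description of any vendor’s analytics: it assumes a hypothetical search log in which each search records the facet labels the user clicked (the SearchEvent class, its field names, and the sample log are all invented for illustration). it reports the share of searches that used any facet and how often each facet label was used.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class SearchEvent:
    """One logged search. Hypothetical structure, not an actual discovery-service log format."""
    query: str
    facets_used: list = field(default_factory=list)  # facet labels clicked during this search


def facet_usage(events):
    """Return (share of searches that used any facet, per-facet search counts)."""
    if not events:
        return 0.0, Counter()
    searches_with_facets = sum(1 for e in events if e.facets_used)
    # Count each facet at most once per search so repeated clicks do not inflate totals.
    counts = Counter(label for e in events for label in set(e.facets_used))
    return searches_with_facets / len(events), counts


# Invented sample log: two of the four searches used at least one facet.
log = [
    SearchEvent("climate change policy"),
    SearchEvent("climate change policy", ["source types", "language"]),
    SearchEvent("jane austen", ["subject"]),
    SearchEvent("solar panels"),
]

share, counts = facet_usage(log)
print(f"searches using a facet: {share:.0%}")
for label, n in counts.most_common():
    print(f"{label}: {n}")
```

counting a facet once per search, rather than once per click, is a design choice made here so that the figures answer the two questions the paper raises: whether facets were used at all, and which ones. a real log would also need to record the label wording in effect at the time, so that usage before and after a relabeling can be compared.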
it is clear that libraries and vendors need to work together to properly evaluate the facet labels if facets are to begin to achieve their potential as an essential search function. disheartening statements to the effect that no one uses them, or that the discovery system itself is already branded a failure, demonstrate that the discovery layer, while clearly a powerful tool for integrating a range of accessible resources, is still in its infancy. our purpose in this paper was to draw attention to both the proven value of faceted navigation and the ongoing problem of confusing or inadequately understood library terminology that is presently hindering what should be a powerful tool in our information discovery warehouse.

references

1. google is ranked number 1 according to alexa, a traffic-ranking website. “the top 500 sites on the web,” alexa, accessed may 9, 2014, http://www.alexa.com/topsites.
2. irene lopatovska, megan r. fenton, and sara campot, “examining preferences for search engines and their effects on information behavior,” proceedings of the american society for information science & technology 49, no. 1 (2012): 1–11.
3. betsy sparrow, jenny liu, and daniel m. wegner, “google effects on memory: cognitive consequences of having information at our fingertips,” science 333, no. 6043 (2011): 776–78; daniel m. wegner and adrian f. ward, “how google is changing your brain,” scientific american 309, no. 6 (2013): 58–61; robin marantz henig and samantha henig, twentysomething: why do young adults seem stuck?
(new york: hudson street, 2012), 139–43; matti näsi and leena koivusilta, “internet and everyday life: the perceived implications of internet use on memory and ability to concentrate,” cyberpsychology, behavior, and social networking 16, no. 2 (2013): 88–93.
4. alison j. head, learning the ropes: how freshmen conduct course research once they enter college (project information literacy, december 5, 2013), http://projectinfolit.org/images/pdfs/pil_2013_freshmenstudy_fullreport.pdf.
5. ibid., 2.
6. joshua steimle, “what does seo cost? [infographic],” forbes, september 12, 2013, http://www.forbes.com/sites/joshsteimle/2013/09/12/what-does-seo-cost-infographic/.
7. frank scharnell, “guide to ecommerce facets, filters and categories,” youmoz (blog), april 30, 2013, http://moz.com/ugc/guide-to-ecommerce-facets-filters-and-categories.
8. vanda broughton, “meccano, molecules, and the organization of knowledge: the continuing contribution of s. r. ranganathan” (presentation, international society for knowledge organization uk chapter, london, november 5, 2007), 2, http://www.iskouk.org/presentations/vanda_broughton.pdf.
9. jody condit fagan, “discovery tools and information literacy,” journal of web librarianship 5, no. 3 (2011): 171–78.
10. sarah ramdeen and bradley m. hemminger, “a tale of two interfaces: how facets affect the library catalog search,” journal of the american society for information science & technology 63 (2012): 702–15.
11. amy f. fyn, vera lux, and robert j. snyder, “reflections on teaching and tweaking a discovery layer,” reference services review 41, no.
1 (2013): 113–24. see also the various presentations at recent loex conferences.
12. william badke, “pushing a big rock up a hill all day: promoting information literacy skills,” online searcher 37, no. 6 (2013): 67.
13. see the following blog entry on library jargon, which makes observations on terms such as “periodicals” and “databases”: “periodicals and other library jargon,” mr. library dude (blog), march 18, 2011, http://mrlibrarydude.wordpress.com/tag/library-jargon/. this presentation on library jargon is a very helpful contribution to the discussion: mark aaron polger, “re-thinking library jargon: maintaining consistency and using plain language” (slideshow presentation, february 5, 2011), http://www.slideshare.net/markaaronpolger/library-jargon-newestjan2010feb2010-6815908.
14. we are referring here to systems such as ebsco eds, proquest summon, and ex libris primo.
15. beth thomsett-scott and patricia e. reese, “academic libraries and discovery tools: a survey of the literature,” college & undergraduate libraries 19, no. 2–4 (2012): 123–43; helen dunford, review of planning and implementing resource discovery tools in academic libraries, by mary pagliero popp and diane dallis, the australian library journal 62, no. 2 (2013): 175–76.
16. sarah ramdeen and bradley m. hemminger, “a tale of two interfaces: how facets affect the library catalog search,” journal of the american society for information science & technology 63 (2012): 713; kathleen bauer and alice peterson-hart, “does faceted display in a library catalog increase use of subject headings?,” library hi tech 30, no.
2 (2012): 354; jody condit fagan, “usability studies of faceted browsing: a literature review,” information technology & libraries 29, no. 2 (2010): 62, http://dx.doi.org/10.6017/ital.v29i2.3144.
17. william f. chickering and sharon q. yang, “evaluation and comparison of discovery tools: an update,” information technology & libraries 33, no. 2 (2014), http://dx.doi.org/10.6017/ital.v33i2.3471.
18. vanda broughton, “the need for a faceted classification as the basis of all methods of information retrieval,” aslib proceedings 58, no. 1/2 (2006): 49–72.
19. leonard will, “rigorous facet analysis as the basis for constructing knowledge organization systems (kos) of all kinds” (paper presented at 2013 isko uk conference, london, july 8–9, 2013): 4, http://www.iskouk.org/conf2013/papers/willpaper.pdf.
20. marti a. hearst, “design recommendations for hierarchical faceted search interfaces,” in proceedings of the acm sigir workshop on faceted search (2006), http://flamenco.sims.berkeley.edu/papers/faceted-workshop06.pdf.
21. kathryn la barre, “traditions of facet theory, or a garden of forking paths?,” in facets of knowledge organization: proceedings of the isko uk second biennial conference, 4th–5th july, 2011, london (bingley, uk: emerald, 2012), 96.
22. la barre, “traditions of facet theory, or a garden of forking paths?,” 98.
23. scharnell, “guide to ecommerce facets, filters and categories.”
24. andrew d. asher, lynda m. duke, and suzanne wilson, “paths of discovery: comparing the search effectiveness of ebsco discovery service, summon, google scholar, and conventional library resources,” college & research libraries 74, no.
5  (2013):  464–88.     25.    saverio  perugini,  “personalization  by  website  transformation:  theory  and  practice,”   information  processing  &  management  46,  no.  3  (2010):  284;  elizabeth,  f.  churchill,  “putting   the  person  back  into  personalization,”  elizabeth  f.  churchill  (blog),  july  24,  2013,   http://elizabethchurchill.com/uncategorized/putting-­‐the-­‐person-­‐back-­‐into-­‐personalization/.   26.    churchill,  “putting  the  person  back  into  personalization.”   27.    celia  thompson,  kathleen  gray,  and  hyejeong  kim,  “how  social  are  social  media  technologies   (smts)?  a  linguistic  analysis  of  university  students’  experiences  of  using  smts  for  learning,”   the  internet  &  higher  education  21  (2014):  31–40,   http://dx.doi.org/10.1016/j.iheduc.2013.12.001.   28.    “banner  blindness  studies,”  bannerblindness.org,  accessed  april  7,  2014,   http://bannerblindness.org/banner-­‐blindness-­‐studies/.   29.    melissa  gross  and  don  latham,  “undergraduate  perceptions  of  information  literacy:  defining,   attaining,  and  self-­‐assessing  skills,”  college  &  research  libraries  70,  no.  4  (2009):  336–50.   30.    see  the  section  “most  internet  users  say  they  do  not  know  how  to  limit  the  information  that  is   collected  about  them  by  a  website,”  pew  report  2012,   http://www.pewinternet.org/2012/03/09/main-­‐findings-­‐11/#most-­‐internet-­‐users-­‐say-­‐ they-­‐do-­‐not-­‐know-­‐how-­‐to-­‐limit-­‐the-­‐information-­‐that-­‐is-­‐collected-­‐about-­‐them-­‐by-­‐a-­‐website.   31.    chris  jasek,  “how  to  design  library  websites  to  maximize  usability,”  library   connect,  pamphlet  5  (2007):  4,   http://libraryconnectarchive.elsevier.com/lcp/0502/lcp0502.pdf.  
see  also  the  results   compiled  in  this  paper  of  fifty-­‐one  intelligibility  studies,  john  kupersmith,  “library  terms  that   users  understand”  (university  of  california,  2012),   http://escholarship.org/uc/item/3qq499w7.   32.    paige  alfonzo,  “my  library  usability  study  stage  1,”  librarian  enumerations  (blog),  june  19,   2013,  http://librarianenumerations.wordpress.com/2013/06/19/library-­‐usability-­‐study/.   33.    “discussing  discovery  services:  what's  working,  what’s  not  and  what’s  next?”  (discussion   forum,  ala  2014  annual  conference,  las  vegas,  nevada,  june  29,  2014).       what’s  in  a  word?  rethinking  facet  headings  in  a  discovery  service  |  nelson  and  turney   doi:  10.6017/ital.v34i2.5629   91     34.    susan  avery  and  lisa  janicke  hinchliffe,  “hopes,  impressions,  and  reality:  is  a  discovery  layer   the  answer?”  (program,  loex  2014  annual  conference,  grand  rapids,  michigan,  may  8–10,   2014),   http://www.loexconference.org/2014/presentations/'loex2014_'hopes%20impressions%2 0and%20reality-­‐averyhinchliffe.pdf.   35.    kupersmith,  “library  terms  that  users  understand.”   36.    we  are  taking  our  examples  from  ebsco  eds  with  which  we  are  most  familiar.  the  issues   discussed  are  common  to  all  discovery  systems.     37.    kupersmith,  “library  terms  that  users  understand.”   38.    ruth  dickstein  and  vicki  mills,  “usability  testing  at  the  university  of  arizona  library:  how  to   let  the  users  in  on  the  design,”  information  technology  and  libraries  19,  no.  3  (2000):  144–51.   106 information technology and libraries | september 2009 michelle frisquepresident’s message michelle frisque (mfrisque@northwestern.edu) is lita president 2009–10 and head, information systems, north western university, chicago. 
by the time you read this column i will be lita president; however, as i write this, i still have a couple of weeks left in my vice-presidential year. i have been warned by so many that my presidential year will fly by, and i am beginning to understand how that could be. i can’t believe i am almost done with my first year. i have enjoyed it and sometimes been overwhelmed by it, especially when i began the process of appointing lita volunteers to committees and liaison roles. i didn’t realize how many appointments there were to make. i want to thank all of the lita members who volunteered. you really helped make the appointment process easier. as a volunteer organization, lita relies on you, and once again many of you have stepped up. thank you. during the appointment process i was introduced to many lita members whom i had not yet met. i enjoyed being introduced to you virtually, and i look forward to meeting you in person in the coming year. i also want to thank the lita office. they were there whenever i needed them. without their assistance i would not have been able to successfully complete the appointment process. over the last year i have been working closely with this year’s lita emerging leaders, lisa thomas and holly tomren. i have really enjoyed the experience. their enthusiasm and energy are contagious. i wish every lita member could have been at this year’s lita camp in columbus, ohio, on may 8. during one of the lightning round sessions, lisa went to the podium and gave an impassioned speech about the benefits of belonging to a professional organization like lita. if there was a person in the audience who was not yet a lita member, i am sure they joined immediately afterward. she really captured the essence of why i became active in lita and why i continue to stay so involved in this organization so many years later. i can honestly say that as much as i have given to lita, i have received so much more in return.
that is the true benefit of lita membership. over the last year, the lita board has had some great discussions with lita members and leaders. those conversations will continue as we start the work of drafting a new strategic plan. i want to create a strategic plan that will chart a meaningful path for the association and its members for the next several years. i want it to provide direction but also be flexible enough to adapt to changes in the information technology association landscape. as andrew pace mentioned in his last president’s message, changes will be coming. while we still aren’t sure exactly what those changes are, we know that it is time to seriously look at the current organizational structure of lita to make sure it best fits our needs today while continuing to remain flexible enough to meet our needs tomorrow. when i think of the organizational changes we are exploring, i can’t help but think of the houses i see on my favorite home improvement shows. lita has good bones. the structure and foundation are solid and well built, and as long as the house is well cared for, should last for years to come. however, like all houses, improvements need to be made over time to keep up with the market. the lita structure and foundation will be the same. when you drive up to the house you will still recognize the lita structure. when you walk in the door my hope is that you will still get that same homey feeling you had before, maybe with a few “oohs” and “aahs” thrown in as you notice the upgrades and enhancements. as the year progresses we will know more. i will use this column and other communication avenues to keep you informed of our plans and to gather your input. i would like to close my first column by thanking you for giving me this opportunity to serve you as the lita president. i am honored and humbled by the trust you have placed in me, and i am ready to start my presidential year. i hope it does not go by too quickly. i want to savor the experience. 
now let’s get started!

geographic information and technologies in academic libraries: an arl survey of services and support
ann l. holstein
information technology and libraries | march 2015 38

abstract
one hundred fifteen academic libraries, all current members of the association of research libraries (arl), were selected to participate in an online survey in an effort to better understand campus usage of geographic data and geospatial technologies, and how libraries support these uses. the survey was used to capture information regarding geographic needs of their respective campuses, the array of services they offer, and the education and training of geographic information services department staff members. the survey results, along with review of recent literature, were used to identify changes in geographic information services and support since 1997, when a similar survey was conducted by arl. this new study has enabled recommendations to be made for building a successful geographic information service center within the campus library that offers a robust and comprehensive service and support model for all geographic information usage on campus.

introduction
in june 1992, the arl in partnership with esri (environmental systems research institute) launched the gis (geographic information systems) literacy project.
this  project  sought  to   “introduce,  educate,  and  equip  librarians  with  the  skills  necessary”  to  become  effective  gis  users   and  to  learn  how  to  provide  patrons  with  “access  to  spatially  referenced  data  in  all  formats.”1   through  the  implementation  of  a  gis  program,  libraries  can  provide  “a  means  to  have  the   increasing  amount  of  digital  geographic  data  become  a  more  useful  product  for  the  typical   patron.”2     in  1997,  five  years  after  the  gis  literacy  project  began,  a  survey  was  conducted  to  elucidate  how   arl  libraries  support  patron  gis  needs.  the  survey  was  distributed  to  121  arl  members  for  the   purpose  of  gathering  information  about  gis  services,  staffing,  equipment,  software,  data,  and   support  these  libraries  offered  to  their  patrons.  seventy-­‐two  institutions  returned  the  survey,  a  60%   response  rate.  at  that  time,  nearly  three-­‐quarters  (74%)  of  the  respondents  affirmed  that  their   library  administered  some  level  of  gis  services.3  this  indicates  that  the  gis  literacy  project  had  an   evident  positive  impact  on  the  establishment  of  gis  services  in  arl  member  libraries.   since  then,  it  has  been  recognized  that  the  rapid  growth  of  digital  technologies  has  had  a   tremendous  effect  on  gis  services  in  libraries.4  we  acknowledge  the  importance  of  assessing     ann  l.  holstein  (ann.holstein@case.edu)  is  gis  librarian  at  kelvin  smith  library,  case  western   reserve  university,  cleveland,  ohio.     
how geographic services in academic research libraries have further evolved over the past 17 years in response to these advancing technologies as well as the increasingly demanding geographic information needs of their user communities.

method
for this study, 115 academic libraries, all current members of arl as of january 2014, were invited to participate in an online survey in an effort to better understand campus usage of geographic data and geospatial technologies and how libraries support these uses. similar in nature to the 1997 arl survey, the 2014 survey was designed to capture information regarding geographic needs of their respective campuses, the array of services, software, and support the academic libraries offer, and the education and training of geographic information services department staff members. our aim was to determine the range of support patrons can anticipate at these libraries and ascertain changes in gis library services since the 1997 survey.
a cross-sectional survey was designed and administered using qualtrics, an online survey tool. it was distributed in january 2014 via email to the person identified as the subject specialist for mapping and/or geographic information at each arl member academic library. when the survey closed after two weeks, 54 institutions had responded to the survey. this accounts for 47% participation. responding institutions are listed in the appendix.
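the participation figures can be checked with simple arithmetic: the response rate is the number of responding institutions divided by the number invited. the python sketch below recomputes the 2014 rate (54 of 115) and the 1997 rate (72 of 121) reported in this paper; the function name `response_rate` is an illustrative assumption and is not part of the survey instrument.

```python
def response_rate(responses: int, invited: int) -> float:
    """Return the survey response rate as a percentage."""
    if invited <= 0:
        raise ValueError("invited must be positive")
    return 100.0 * responses / invited


# Counts reported in the paper for the two surveys.
rate_2014 = response_rate(54, 115)
rate_1997 = response_rate(72, 121)

print(f"2014: {rate_2014:.0f}%")
print(f"1997: {rate_1997:.0f}%")
```

rounded to whole percentages, both values agree with the 47% and 60% stated in the text.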
results
software and technologies
we were interested in learning what types of geographic information software and technologies are currently offered at academic research libraries. results show that 100% of survey respondents offer gis software/mapping technologies at their libraries, 36% offer remote sensing software (to process and analyze remotely sensed data such as aerial photography and satellite imagery), and 36% offer global positioning system (gps) equipment and/or software. nearly all (98%) said that their libraries provide esri arcgis software, with 83% also providing access to google maps and google earth, and 35% providing qgis (previously known as quantum gis). a smattering of other gis, remote-sensing, and gps products is also offered by some of the libraries, although not in large numbers (see table 1 for full listing).
the fact that nearly all survey respondents offer arcgis software at their libraries comes as no surprise. arcgis is the most commonly provided mapping software available in academic libraries, and in 2011, it was determined that 2,500 academic libraries were using esri products.5 esri software was most popular in 1997 as well, undoubtedly because esri offered free software and training to participants of the gis literacy project.6

software/technology    type              % of providing libraries
esri arcgis            gis               98
google maps/earth      gis               83
qgis                   gis               35
autocad                gis               19
erdas imagine          remote sensing    19
grass                  gis               15
envi                   remote sensing    15
geoda                  gis               6
pci geomatica          remote sensing    6
garmin map source      gps               6
simplymap              gis               4
trimble terrasync      gps               4

table 1.
geographic  information  software/mapping  technologies  provided  at  arl  member   academic  libraries  (2014)   google  maps  and  google  earth,  launched  in  2005,  have  quickly  become  very  popular  mapping   products  used  at  academic  libraries—a  close  second  only  to  esri  arcgis.  in  addition  to  being  free,   their  ease  of  use,  powerful  visualization  capabilities,  “customizable  map  features  and  dynamic   presentation  tools”  make  them  attractive  alternatives  to  commercial  gis  software  products.7     since  1997,  many  software  programs  have  fallen  out  of  favor.  mapinfo,  idrisi,  maptitude,  and   sammamish  data  finder/geosight  pro  were  gis  software  programs  listed  in  the  1997  survey   results  that  are  not  used  today  at  arl  member  academic  libraries.8  instead,  open  source  software   such  as  qgis,  grass,  and  geoda  are  growing  in  popularity.  they  are  free  to  use  and  their  source   code  may  be  modified  as  needed.   gps  equipment  lending  can  be  very  beneficial  to  students  and  campus  researchers  who  need  to   collect  their  own  field  research  locational  data.  the  2014  survey  found  that  30%  of  respondents   loan  recreational  gps  equipment  at  their  libraries  and  10%  loan  mapping-­‐grade  gps  equipment.   the  high  cost  of  mapping-­‐grade  gps  equipment  (several  thousand  dollars)  may  be  a  barrier  for   some  libraries;  however,  this  is  the  type  of  equipment  recommended  in  best-­‐practice  methods  for   gathering  highly  accurate  gps  data  for  research.  in  addition  to  expense,  complexity  of  operation  is   another  consideration.  
while it is “fairly simple to use a recreational gps unit,” a certain level of advanced training is required for operating mapping-grade gps equipment.9 a designated staff member may need to take on the responsibility of becoming the in-house gps expert and routinely offer training sessions to those interested in borrowing mapping-grade gps equipment.

location
at 36% of responding libraries, the geographic information services area is located where the paper maps are (map department/services); 19% have separated this area and designated it as a geospatial data center, gis, or data services department; 13% integrate it with the reference department; and just 4% of libraries house the gis area in government documents. table 2 lists all reported locations for this service area. not surprisingly, in 1997, government documents (39%) was nearly as popular a location for this service area as the map department (43%).10 libraries identified government documents as a natural fit, keeping gis services within close proximity to spatial data sets then being distributed by government agencies, most notably the us government printing office (gpo). these agencies had made the decision to distribute “most data in machine readable form,”11 including the 1990 census data as topologically integrated geographic encoding and referencing (tiger) files.12 gis technologies were needed to access and most effectively use information within these massive spatial datasets.
location                                        % of libraries (1997)    % of libraries (2014)
map department/services                         43                       36
government documents                            39                       4
reference                                       10                       13
geospatial data center, gis, or data services   3                        19
not in any one location                         -                        9
digital scholarship center                      -                        6
combined area (i.e., map dept. & gov. docs.)    -                        6

table 2. location of the geographic information services area within the library (1997 and 2014)
at 59% of responding libraries, geographic information software is available on computer workstations in a designated area, such as within the map department. however, many do not restrict users by location and have the software available on all computer workstations throughout the library (37%) or on designated workstations distributed throughout the library (33%). a small percentage (7%) loan laptops to patrons with the software installed, allowing full mobility throughout the entire library space.

staffing
most professional staff working in the geographic information services department hold one or more postbaccalaureate advanced degrees. of 113 geographic services staff at responding libraries, 65% have obtained an ma/ms, mls/mlis, or phd; 43% have one advanced degree, while 22% have two postbaccalaureate degrees. half (50%) hold an mls/mlis, 31% hold an ma/ms, and 6% hold a phd. nearly one-third (31%) have a ba/bs as their highest educational degree, 3% have a two-year technical degree, and 2% have only a ged or high school diploma.
in 1997, 84% of gis librarians and specialists at arl libraries had an mls degree.13 at that time, the incumbent was most often recruited from within the library to assume this new role, whereas today’s gis professionals are just as likely to come from nonlibrary backgrounds, bringing their expertise and advanced geographic training to this nontraditional librarian role.
figure 1. highest educational degree of geographic services staff (2014)
on average, this department is staffed by two professional staff members and three student staff. student employees can be a terrific asset, especially if they have been previously trained in gis. students are likely to be recruited from departments that are the heaviest gis users at the university (e.g., geography, geology). some libraries have implemented “co-op” programs where students can receive credit for working at the gis services area. these dual-benefit positions are quite lucrative to students.14

campus users
in a typical week during the course of a semester, responding libraries each serve approximately sixteen gis users, four remote sensing users, and three gps users. these users may obtain assistance from department staff either in person or remotely via phone or email.
on average, undergraduate and graduate students compose the majority (75%) of geographic service users (32% and 43%, respectively). faculty members compose 14% of the users, followed by staff (including postdoctoral researchers) at 7%. some institutions also provide support to public patrons and alumni (4% and 1%, respectively).
in 1997, it was estimated that on average, 63% of gis users were students, 22% were faculty, 8% were staff, and 8% were public.15
[figure 1 chart labels: ged/hs 2%; 2-yr tech 3%; ba/bs 31%; ma/ms/mlis 58%; phd 6%]
figure 2. comparison of the percentage of geographic service users by patron status (1997 and 2014)
the top three departments that use gis software at arl campuses are environmental science/studies, urban planning/studies, and geography. the most frequent remote sensing software users come from the departments of environmental science/studies, geography, and archaeology. gps equipment loan and software usage is most popular with the departments of environmental science/studies, geography, biology/ecology, and archaeology (see table 3 for full listing). some departments are heavy users of all geographic technologies, while others have shown interest in only one. for example, the departments of psychology and medicine/dentistry have used gis but have expressed little or no interest in using remote-sensing or gps technologies.

support and services
the campus community is supported by library staff in a variety of ways with regard to gis, remote-sensing, and gps technology and software use. nearly all (94%) libraries provide assistance using the software for specific class assignments and projects, and 78% are able to provide more in-depth research project consultations. more than one-quarter (27%) of reporting libraries will make custom gis maps for patrons, although there may be a charge depending on the library, project, and patron type (10%).
most (90%) offer basic use and troubleshooting support; however, just 39% offer support for software installation, and 55% offer technical support for problems such as licensing issues and turning on extensions. the campus computing center or information technology services (its) at arl institutions most likely fields some of the software installation and technical issues rather than the library, thus accounting for the lower percentages.
a variety of software training may be offered to the campus community through the library; 80% of responding libraries make visits to classes to give presentations and training sessions, 69% host workshops, 47% provide opportunities for virtual training courses and tutorials, and 4% offer certificate training programs.
[figure 2 chart: percentage axis 0 to 80; categories: students, faculty, staff, public, alumni; series: 1997 and 2014]

department                                  gis    remote sensing    gps
anthropology                                24     10                8
archaeology                                 24     14                13
architecture                                24     1                 6
biology/ecology                             32     10                13
business/economics                          23     1                 3
engineering                                 18     9                 11
environmental science/studies               41     22                16
forestry/wildlife/fisheries                 21     12                10
geography                                   35     22                15
geology                                     31     12                10
history                                     27     2                 2
information sciences                        14     1                 0
nursing                                     8      1                 2
medicine/dentistry                          9      0                 0
political science                           25     3                 5
psychology                                  4      0                 0
public health/epidemiology/biostatistics    30     3                 9
social work                                 2      0                 1
sociology                                   22     0                 3
soil science                                17     5                 4
statistics                                  8      3                 0
urban planning/studies                      36     7                 9

table 3.
number of arl libraries reporting frequent users of gis, remote-sensing, or gps software and technologies from a campus department (2014)
often, the library is not the only place people can go to obtain software support and training on campus. most (86%) responding libraries state that their university offers credit courses, and 41% of campuses have a gis computer lab located elsewhere on campus that may be utilized. its is available for assistance at 29% of the universities, and continuing education offers some level of training and support at 14% of campuses.

data collection and access
most (85%) of responding libraries collect geographic data and allocate an annual budget for it. “libraries that have invested money in proprietary software and trained staff members will tend to also develop and maintain their own collection of data resources.”16 of those collecting data, 26% spend less than $1,000 annually, 15% spend between $1,000 and $2,499, 17% spend between $2,500 and $5,000, while 41% spend more than $5,000. in 1997, 79% of libraries spent less than $2,000 annually, and only 9% spent more than $5,000.17
figure 3. annual budget allocations for geographic data (2014)
a dramatic shift has occurred over the years in budget allocations for data sets. academic libraries no longer just collect free government data sets, as was typically the case in 1997; they are investing much more of their materials budget in building up the geographic data collection for their users.
data is made accessible to campus users in a variety of ways.
a  majority  (84%)  offer  data  via   remote  access  or  download  from  a  networked  campus  computer,  using  a  virtual  private  network   (vpn)  or  login.  more  than  half  (62%)  of  responding  libraries  provide  access  to  data  from   workstations  within  the  library,  and  64%  lend  cd-­‐roms.   roughly  one-­‐quarter  (26%)  of  responding  libraries  provide  users  with  storage  for  their  data.  of   those,  29%  have  a  dedicated  geographic  data  server,  14%  use  the  main  library  server,  29%  point   users  to  the  university  server  or  institutional  repository,  and  36%  allow  users  to  store  their  data   directly  onto  a  library  computer  workstation  hard  drive.   internal  use  of  gis  in  libraries   geographic  information  technologies  may  be  used  internally  to  help  patrons  navigate  the  library’s   physical  collections  and  efficiently  locate  print  materials.  of  the  survey  respondents,  60%  use  gis   for  map  or  air  photo  indexing,  27%  use  the  technology  to  create  floor  maps  of  the  library  building,   and  15%  use  it  to  map  the  library’s  physical  collections.  “the  use  of  gis  in  mapping  library   collections  is  one  of  the  non-­‐traditional  but  useful  applications  of  gis.”18  gis  can  be  used  to  link   library  materials  to  simulated  views  of  floor  maps  through  location  codes.19  this  enables  patrons   to  determine  the  exact  location  of  library  material  by  providing  them  with  item  “location  details   such  as  stacks,  row,  rack,  shelf  numbers,  etc.”20  the  gis  system  can  become  a  useful  tool  for   collection  management  and  can  be  a  tremendous  time-­‐saver  for  patrons,  especially  those   unfamiliar  with  the  cataloging  system  or  collection  layout.     
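the idea of linking library materials to floor-map positions through location codes can be illustrated with a small lookup. the python sketch below is a hypothetical illustration only: the `floor-stack-row-shelf` code format, the `ShelfLocation` type, the `parse_location_code` function, and the sample code are all assumptions made for this example and are not taken from any system described in the survey or the cited literature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShelfLocation:
    """Physical position of an item (hypothetical schema)."""
    floor: int
    stack: str
    row: int
    shelf: int


def parse_location_code(code: str) -> ShelfLocation:
    """Parse a hypothetical 'floor-stack-row-shelf' code such as '2-B-14-3'."""
    parts = code.strip().split("-")
    if len(parts) != 4:
        raise ValueError(f"expected 4 fields, got {len(parts)}: {code!r}")
    floor, stack, row, shelf = parts
    # Stack labels are normalized to upper case; the rest are integers.
    return ShelfLocation(int(floor), stack.upper(), int(row), int(shelf))


loc = parse_location_code("2-b-14-3")
print(f"floor {loc.floor}, stack {loc.stack}, row {loc.row}, shelf {loc.shelf}")
```

in a real deployment the parsed fields would be joined to polygons or points on a floor-plan layer in the gis; that step depends on the local data model and is left out here.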
[figure 3 chart axis: percent (%), 0 to 45]

discussion
recommendations for building a successful geographic information service center
the geographic information services area is often a blend of the traditional and modern. it can extend to paper maps, atlases, gps equipment, software manuals, large-format scanners, printers, and gis. gis services may include a cluster of computers with gis software installed, an accessible collection of gis data resources, and assistance available from the library staff. the question for academic libraries today is no longer “whether to offer gis services but what level of service to offer.”21 every university has different gis needs, and the library must decide how it can best support these needs. there is no set formula for building a geographic information service center because each institution “has a different service mission and user base.”22 every library’s gis service program will be designed with its unique institutional needs in mind; however, each will incorporate some combination of hardware, software, data, and training opportunities provided by at least one knowledgeable staff member.23
“gis represents a significant investment in hardware, software, staffing, data acquisition, and ongoing staff development.
either new money or significant reallocation is required.”24 establishing new gis services or enhancing existing ones in the library requires the “serious assessment of long-term support and funding needs.”25 commitment of the university as a whole, or at least support from senior administration, “library administration, and related campus departments” is crucial to its success.26 receiving “more funding will mean more staff, better trained staff, a more in-depth collection, better hardware and software, and the ability to offer multiple types of gis services.”27
once funding for this endeavor has been secured, it is of utmost importance to recruit a gis professional to manage the geographic information service center. to be most effective in this position, the incumbent should possess a graduate degree in gis or geography; depending on what additional responsibilities are required of the candidate (e.g., reference, cataloging), a second degree in library science is also strongly recommended. this staff member should possess mapping and gis skills, which include experience with esri software and remote sensing technologies. employees in this position may be given job titles such as “gis specialists, gis/data librarians, gis/map librarians, digital cartographers, spatial data specialists, and gis coordinators.”28
with the new staff member on board, hereafter referred to as “gis specialist,” decisions such as what software to provide, which data sets to collect, and what types of training and support to offer to the campus can be made.
consulting with research centers and academic departments that currently use or are interested in using gis and remote sensing technologies is a good way to learn about software, data, and training needs and to determine the focus and direction of the geographic information services department.29 campus users often come from academic departments that “have neither staff nor facilities to support gis,” and “may only consist of one or two faculty and a few graduate students. these gis users need access to software, data, and expertise from a centralized, accessible source of research assistance, such as the library.”30

at minimum, esri arcgis, google maps, and google earth should be supported, with additional remote sensing or open source gis software depending on staff expertise and known campus needs. when purchasing commercial software licenses, such as for esri arcgis, discounts for educational institutions are usually available. additionally, negotiating campus-wide software licenses may be a good option to consider, as the costs are usually far less than purchasing individual or floating licenses. costs for campus-wide licensing are typically determined by the number of full-time equivalent (fte) students enrolled at the university.

geographic information and technologies in academic research libraries | holstein 47
facilitating “access to educational resources such as software tools and applications, how-to-guides for data and software,” and tutorials is crucial.31 the gis specialist must be familiar with how gis software can be used by many disciplines and with the availability of “training courses or tutorials, sources or extensible gis software, and hundreds of software and application books.”32 tutorials may be provided directly by a software vendor (e.g., esri virtual campus) or developed in-house by the gis specialist. creating “gis tutorials on short, task-based techniques such as georeferencing or geocoding” and making them readily available online or as a handout may save the time otherwise spent repeatedly explaining these techniques to patrons.33

geospatial data collection development is a core function of the geographic information services department. to effectively develop the data collection, the gis specialist must fully comprehend the needs of the user community as well as possess a “fundamental understanding of the nature and use of gis data.”34 this is often referred to as “spatial literacy.”35 it is crucial to keep abreast of “recent developments, applications, and data sets.”36

the gis specialist will spend much more time searching for and acquiring geographic data sets than selecting and purchasing traditional print items such as maps, monographs, and journals for the collection. a budget should be established annually for the purchase of all geographic materials, both print and digital. a great challenge for the specialist is to acquire data at the lowest cost possible.
while a plethora of free data is available online from government agencies and nonprofit organizations, other data, available only from private companies, may be quite expensive because of the high production costs. a collection development policy should be created that indicates the types of materials and data collected and specifies geographic regions, formats, and preferred scales.37 the needs of the user community must be carefully considered when establishing the policy.

the expertise of the gis specialist is needed not only to help patrons locate the appropriate geographic data, but also to use the software to process, interpret, and analyze it. “only the few library patrons that have had gis experience are likely to obtain any level of success without intervention by library staff”;38 thus, for any mapping program installed on a library computer, “staff must have working knowledge of the program” and must be able to provide support to users.39 furthermore, the gis specialist must be able to train patrons to use the software to complete common tasks such as file format conversion, data projection, data manipulation, and geoprocessing. these geospatial technologies involve a steep learning curve, and unfortunately “hands-on training options outside the university are often cost-prohibitive” for many.40 the campus community requires training opportunities to be both convenient and inexpensive.

teaching hands-on geospatial technology workshops, from the basic to the advanced, is fundamental to educating the campus community.
workshops will “vary from institution to institution, with some offering students an introduction to mapping and others focusing on specific features of the program, such as georeferencing, geocoding, and spatial analysis. some also offer workshops that are theme specific,” such as “working with census data” or “digital elevation modeling.”41 custom workshops or training sessions can be developed to meet a specific campus need, tailored for a specific class in consultation with an instructor, or designed especially for other library staff.

today’s geographic information service center

the academic map librarian from the 1970s or 1980s would hardly recognize today’s geographic information service center. what was once a room of map cases and shelves of atlases and gazetteers is now a bustling geospatial center. computers, powerful gis and remote-sensing technologies, gps devices, digital maps, and data are now available to library patrons. every library surveyed provides gis software to campus users, and 85% also actively collect gis and remotely sensed data. with the assistance of expertly trained library staff, users with no or limited experience using geospatial technologies are enabled to analyze spatial data sets and create custom maps for coursework, projects, and research. nearly all surveyed libraries (94%) have staff that can assist students specifically with software use for class assignments and projects, while 90% provide assistance with more generalized use of the software. a majority of libraries also offer a variety of software training sessions and workshops and give presentations to the campus community.
all this is made possible through the library’s commitment to this service area and the availability of highly trained professional staff, most of whom hold a master’s or doctoral degree. the library has truly established itself as the go-to location on campus for spatial mapping and analysis. this role has only strengthened in the years since the launch of the arl gis literacy project in 1992.

references

1. d. kevin davie et al., comps., spec kit 238: the arl geographic information systems literacy project (washington, dc: association of research libraries, office of leadership and management services, 1999), 16.
2. ibid., 3.
3. ibid., i.
4. abraham parrish, “improving gis consultations: a case study at yale university library,” library trends 55, no. 2 (2006): 328, http://dx.doi.org/10.1353/lib.2006.0060.
5. eva dodsworth, getting started with gis: a lita guide (new york: neal-schuman, 2012), 161.
6. davie et al., spec kit 238, i.
7. eva dodsworth and andrew nicholson, “academic uses of google earth and google maps in a library setting,” information technology & libraries 31, no. 2 (2012): 102, http://dx.doi.org/10.6017/ital.v31i2.1848.
8. davie et al., spec kit 238, 8.
9. gregory h. march, “surveying campus gis and gps users to determine role and level of library services,” journal of map & geography libraries 7, no. 2 (2011): 170–71, http://dx.doi.org/10.1080/15420353.2011.566838.
10. davie et al., spec kit 238, 5.
11. george j. soete, spec kit 219: transforming libraries issues and innovation in geographic information systems (washington, dc: association of research libraries, office of management services, 1997), 5.
12. camila gabaldón and john repplinger, “gis and the academic library: a survey of libraries offering gis services in two consortia,” issues in science and technology librarianship 48 (2006), http://dx.doi.org/10.5062/f4qj7f8r.
13. davie et al., spec kit 238, 5.
14. soete, spec kit 219, 9.
15. davie et al., spec kit 238, 10.
16. dodsworth, getting started with gis, 165.
17. davie et al., spec kit 238, 9.
18. d. n. phadke, geographical information systems (gis) in library and information services (new delhi: concept, 2006), 36–37.
19. ibid., 13.
20. ibid., 74.
21. rhonda houser, “building a library gis service from the ground up,” library trends 55, no. 2 (2006): 325, http://dx.doi.org/10.1353/lib.2006.0058.
22. melissa lamont and carol marley, “spatial data and the digital library,” cartography and geographic information systems 25, no. 3 (1998): 143, http://dx.doi.org/10.1559/152304098782383142.
23. carolyn d. argentati, “expanding horizons for gis services in academic libraries,” journal of academic librarianship 23, no. 6 (1997): 463, http://dx.doi.org/10.1559/152304098782383142.
24. soete, spec kit 219, 11.
25. carol cady et al., “geographic information services in the undergraduate college: organizational models and alternatives,” cartographica 43, no. 4 (2008): 249, http://dx.doi.org/10.3138/carto.43.4.239.
26. houser, “building a library,” 325.
27. r. b. parry and c. r. perkins, eds., the map library in the new millennium (chicago: american library association, 2001), 59–60.
28.
patrick florance, “gis collection development within an academic library,” library trends 55, no. 2 (2006): 223, http://dx.doi.org/10.1353/lib.2006.0057.
29. houser, “building a library,” 325.
30. ibid., 323.
31. ibid., 322.
32. parrish, “improving gis,” 329.
33. ibid., 336.
34. florance, “gis collection development,” 222.
35. soete, spec kit 219, 6.
36. dodsworth, getting started with gis, 165.
37. soete, spec kit 219, 8.
38. gabaldón and repplinger, “gis and the academic library.”
39. dodsworth, getting started with gis, 164.
40. houser, “building a library,” 323.
41. dodsworth, getting started with gis, 161–62.

appendix

responding institutions

arizona state university libraries
auburn university libraries
boston college libraries
university of calgary libraries and cultural resources
university of california, los angeles, library
university of california, riverside, libraries
university of california, santa barbara, libraries
case western reserve university libraries
colorado state university libraries
columbia university libraries
university of connecticut libraries
cornell university library
dartmouth college library
duke university library
university of florida libraries
georgetown university library
university of hawaii at manoa library
university of illinois at chicago library
university of illinois at urbana-champaign library
indiana university libraries bloomington
johns hopkins university libraries
university of kansas libraries
mcgill university library
university of manitoba libraries
university of maryland libraries
massachusetts institute of technology libraries
university of miami libraries
university of michigan library
michigan state university libraries
university of nebraska–lincoln libraries
new york university libraries
university of north carolina at chapel hill libraries
north carolina state university libraries
northwestern university library
university of oregon libraries
university of ottawa library
university of pennsylvania libraries
pennsylvania state university libraries
purdue university libraries
queen’s university library
rice university library
university of south carolina libraries
university of southern california libraries
syracuse university library
university of tennessee, knoxville, libraries
university of texas libraries
texas tech university libraries
university of toronto libraries
tulane university library
vanderbilt university library
university of waterloo library
university of wisconsin–madison libraries
yale university library
york university libraries

a cost effectiveness model for comparing various circulation systems

thomas k. burgess: washington state university library

two models for circulation systems costing are presented. both the automated and the manual models are based on experience gained in the analysis of circulation services at washington state university library. validation tests for the model assumptions are devised and explained. use of the models for cost effectiveness comparison and for cost prediction is discussed, and examples are given showing their application.

introduction

many methods for analyzing cost effectiveness have been presented recently in the literature.1 one main difficulty with studies of effectiveness is in quantifying the benefits, or in the case of libraries, assigning values to the quantity or quality of the services offered.
2, 3 one way to circumvent this difficulty is to compare the costs of different methods of providing the same services. value assessment of the services is eliminated by keeping them constant as shown in most cost benefit studies.16 this, of course, is not always possible when comparing manual library systems with mechanized systems. library circulation systems, however, may fit this type of model with relative ease. for this reason, the models described below were developed to compare a manual with a mechanized system. they have the added advantage of allowing for the prediction of costs for either the manual or automated system based on certain circulation loads.

the utilization of the models is probably best understood by working through an application. therefore, a description of these applications as performed at washington state university library will be used. assumptions based on practices peculiar to washington state university are removed by the model through the use of the activities definitions for our library.

washington state university library has been operating a mechanized circulation system since 1967. based on past experience, the system has recently undergone major modifications to improve its capabilities. we consider it to be a highly efficient machine circulation system. thus, cost effectiveness comparison with a similar manual operation can provide information on effectiveness of automated circulation systems in general as well as on the wsu implementation.

76 journal of library automation vol. 6/2 june 1973

model consideration

to insure that the comparisons were fair and that biases were held to a minimum, mathematical models had to be established with rather rigid constraints. validations of these models had to be devised to insure that extrapolations of the model results were meaningful. information about our manual system in operation prior to 1967 is sparse, as no analysis had been performed.
it was decided that the manual model should, therefore, be a variant of the machine model, since our machine system includes a small manual system.

if the models are to be useful to others, they should make very few assumptions about circulation tasks. therefore, the models should break out each specific task so costs can be accumulated. this also insures that only circulation tasks are counted. if total hours of staff assigned to circulation are used as the basic labor costs, their time at other library functions is included and would produce erroneous data. using a breakdown by tasks will allow use of the model even if major changes occur in organizational or physical rearrangement of the circulation functions.

twenty-three basic activities were identified that would cover all circulation functions of our library. a similar list should be prepared for each library to be modeled.7 our list can be used as a guide. these functions and their definitions are listed in appendix a. fifteen functions represent activities for which both the quantity of the activity and the average time to perform it are required information for building the model. of the nine remaining activities, eight require only the measurement of total performance time. the last activity, computer operations, was subdivided into three parts: computer charges, library equipment rental costs, and computer personnel costs. the computer personnel costs represent time donated by the computer center to keypunch, decollate and burst printouts, and prepare and schedule jobs. these personnel costs are a part of the machine system and are not reflected otherwise in the computer charges. these three charges are summed and used as a single dollar figure in the model.
in our machine system, as in many other circulation systems, it is impossible to split our computer cost for each circulation subfunction because we use integrated data bases which are charged as a single storage rental cost and not split up among the various programs.

the collection of data for this study could have resulted in a sizeable effort and could have unduly biased the data which were to be collected.8, 9 for example, circulation clerks might have taken as much time to measure the circulation transactions as the circulation transactions themselves required. therefore, we requested supervisors to estimate the time necessary for these tasks, the number of transactions performed, and the percentage of staff and student hours used. these data were developed monthly for a three-month period during the middle portion of a semester.

validation of these estimates to insure their reasonability was accomplished by comparing the total time expended in circulation as reported in the collected data with the total time assigned to circulation activities as reported in the payroll records (the usual manner of estimating costs).10 a surprisingly high degree of correlation was found, primarily because few of our circulation staff members have responsibilities outside of circulation. the payroll data also had to be adjusted to reflect actual hours used in circulation functions. a 25 percent figure was used for regular staff to reflect holidays and leaves (8.4-10 percent), coffee breaks (8-12 percent), sickness (2 percent), tardiness and work slumps (3 percent), and miscellaneous (3 percent). by the same method 15 percent was determined for student help. the difference in total hours between the two samples was less than 5 percent (appendix b, table 5).
the study data were collected from five separate organizational areas (three circulation desks, technical service division, and the library administrative office) which are reasonably independent of each other; the monthly variation in activities reported by the various units was also closely correlated. for example, the percent increase in checkouts for a month was approximately the same at all three circulation desks.

model calculations-automated system

monthly totals were averaged for each activity's transaction time, number of transactions, and percentage of effort allocated to staff or student labor (appendix b, tables 1, 2, and 3). average hourly wages were developed separately for student (part-time help) and staff based on salaries of personnel allocated to circulation. the total hours and salaries were then calculated for staff and student help for each activity. the following example shows the formulas used in calculating some of the entries in the tables:

a1 manual checkout

transactions ($A1_{T}$) times transaction time ($A1_{TT}$) equals total time expended ($A1_{TE}$), adjusted to hours [1]

total part-time help in hours ($A1_{PTH}$) equals ($A1_{TE}$) times the percentage of student effort ($A1_{PPTH}$) [2]

total staff hours ($A1_{S}$) equals ($A1_{TE}$) times the percentage of staff help ($A1_{PS}$) [3]

total salaries ($A1_{TS}$) equals ($A1_{PTH}$) times student rate ($R_{PTH}$) plus ($A1_{S}$) times staff rate ($R_{S}$) [4]

x shelving

total student hours: $X_{PTH} = X_{TE} \cdot X_{PPTH}$ [5]

total staff hours: $X_{S} = X_{TE} \cdot X_{PS}$ [6]

total salaries: $X_{TS} = X_{S} \cdot R_{S} + X_{PTH} \cdot R_{PTH}$ [7]

all other activities were calculated in the same manner as shown above. personnel hours used were totaled and multiplied by the hourly rates. the salary totals and the computer costs were then added together to get the total system cost per month (appendix b, table 6).

total salary cost $= 1.15 \sum_{i=A}^{V} i_{PTH} \cdot R_{PTH} + 1.25 \sum_{i=A}^{V} i_{S} \cdot R_{S}$ [8]

figure 1 represents curves of monthly cost vs.
monthly circulation. the automated system curve was determined from the initial model plotted point, and from extrapolations to other plotted points, which were computed based on the following factors:

a 25 percent increase or decrease in circulation will result in a 5 percent increase or decrease in computer costs. this estimate results from analyzing the computer processes. the bulk of the computer cost results from sorting and other total file processes which are reasonably insensitive to changes in volume of updating.

a 25 percent change in circulation results in a 30 percent change in personnel costs. the 5 percent differential may be conservative, but results from the need for additional supervisory support with its higher salary for each additional operational position added.

using the above factors, several additional points were predicted and plotted and the automated system curve was drawn to fit these points (appendix b, table 7).

validation of these factors was determined by using budget information and circulation data available from the year 1968 (appendix b, table 8). these data were used to establish a point on the graph. the 1968 costs were compared to the predicted cost as shown by the curve for the circulation volume in 1968. this provided a cost differential which was within 1 percent of the curve predicted costs (figure 1). these data were adjusted to reflect annual circulation hours used in circulation in 1971.

model calculations-manual system

the manual model was a modification of the automated model. obviously no machine costs were incurred, but costs for filing and retrieving cards from large tub files of book cards of items in circulation must be added to each check-out or check-in procedure as well as to snags, holds, and other categories.
since some loaned materials are not included in our automated system, a small manual circulation operation runs parallel to the automated system and was included in the automated study. this small manual system served as the base activity for the manual model in the study.

retrieval time from a card tub file is dependent on the number of cards in the file. the tub file size is approximately equal to the size of the computer's circulation file. sample filing times were measured on a catalog card file of comparable size. the results were an average of 40 seconds per item on timings of single records and of batches of alphabetized records to be filed. this figure was then used to extend the average time of the appropriate activities in the parallel manual systems data (activities a1, b, i1, k, m, and q). following the calculation method used in our automated system, data were developed from the 1971 circulation data and a curve was drawn for the manual system (appendix c and appendix b, table 7 and figure 1). validation of this curve by budget information available from 1967 shows that the difference in the predicted cost from actual cost was less than 2 percent (appendix c, table 5). this represents a significant correlation and validates the entire manual model.

generalized use of the models

as has been shown by the example of its use at wsu, the models provide for two functions: cost comparison of automated and manual circulation systems at the same levels of book circulation, and prediction of cost in either a manual or automated system at different levels of circulation. of course, combinations of these models may be made, such as: at what circulation levels are costs of both models equal? or, what will my costs

[figure 1: monthly cost vs. monthly circulation, with the cross-over point marked; horizontal axis runs from 4,000 to 36,000]
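the arithmetic behind formulas [1] through [8] and the two extrapolation factors can be illustrated with a short python sketch. it is not part of the original study: the class and function names, the choice of minutes as the unit of transaction time, and the sample figures are assumptions made for illustration, and the extrapolation helper simply applies the stated percentages once to a base point.

```python
from dataclasses import dataclass


@dataclass
class Activity:
    """One circulation activity, e.g. A1 manual checkout (field names are illustrative)."""
    transactions: float    # monthly number of transactions
    minutes_each: float    # average transaction time; minutes are an assumed unit
    student_share: float   # fraction of effort done by part-time student help
    staff_share: float     # fraction of effort done by regular staff

    def total_hours(self) -> float:
        # Formula [1]: transactions x transaction time, adjusted to hours.
        return self.transactions * self.minutes_each / 60.0

    def student_hours(self) -> float:
        # Formulas [2] and [5]: total time x percentage of student effort.
        return self.total_hours() * self.student_share

    def staff_hours(self) -> float:
        # Formulas [3] and [6]: total time x percentage of staff effort.
        return self.total_hours() * self.staff_share


def activity_salary(activity, student_rate, staff_rate):
    # Formulas [4] and [7]: hours x hourly rate, summed over both labor types.
    return (activity.student_hours() * student_rate
            + activity.staff_hours() * staff_rate)


def total_salary_cost(activities, student_rate, staff_rate):
    # Formula [8]. The 1.15 and 1.25 factors appear to correspond to the
    # 15 and 25 percent payroll adjustments described in the text.
    student = sum(a.student_hours() for a in activities)
    staff = sum(a.staff_hours() for a in activities)
    return 1.15 * student * student_rate + 1.25 * staff * staff_rate


def scale_costs(circulation, personnel_cost, computer_cost, increase=True):
    # One extrapolation step: a 25 percent change in circulation moves
    # personnel costs by 30 percent and computer costs by 5 percent.
    sign = 1 if increase else -1
    return (circulation * (1 + sign * 0.25),
            personnel_cost * (1 + sign * 0.30),
            computer_cost * (1 + sign * 0.05))


# Invented sample figures, for illustration only.
checkout = Activity(transactions=1200, minutes_each=2.0,
                    student_share=0.5, staff_share=0.5)
print(total_salary_cost([checkout], student_rate=2.0, staff_rate=4.0))
print(scale_costs(10000, personnel_cost=1000.0, computer_cost=500.0))
```

keeping each activity as its own record mirrors the paper's argument for a task-level breakdown: an activity can be added, removed, or re-timed without touching the rest of the calculation.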
[figure residue; surviving diagram labels: search engine, content provider, choose, combine, query, combined results]

open search environments: the free alternative to commercial search services | o’riordan 52

opensearch makes very few assumptions about the types of sources, the type of query or the search operation. it is ideal for combining content from multiple disparate sources, which may be data from repositories, webpages, or syndicated feeds. for illustrative purposes, listing 1 gives an example opensearch description for harvesting book information from an example digital library called diglib. the root node includes an xml namespace attribute, which gives the url for the standard version. the url element specifies the content type (a mime type), the query (book in this case), and the index offset at which to begin. the rel attribute states that the result is a collection of resources.

[listing 1 markup lost in extraction; surviving text values: diglib; harvests book items; en-us; utf-8]

listing 1. xml opensearch description record.

next, we describe some deployed applications that use opensearch. ojax uses web technologies such as ajax (asynchronous javascript and xml) to provide a federated search service for oai-pmh compatible open repositories.41, 42 ojax also supports the discovery feature of opensearch, as described in the opensearch 1.1 specification, for auto-detecting that a repository is searchable. stored searches are in atom format.

open-source meta-search engines can combine the results of opensearch enabled search engines.43 a system built as a proof-of-concept uses four search sources: a9.com, yacy, mozdex, and alpha. a user can issue a text query (word or phrase) with boolean operators and several modifiers. users can prefer or bias particular engines by setting weights. the system combines results using a voting algorithm, ranks them, and is implemented using the lucene library. opensearch can be employed to specify the search sources and as a common format when results are combined.
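to make the role of the description document concrete, the following python sketch parses a small opensearch 1.1 description and fills in its url template. the document shown is a hypothetical stand-in written for this illustration, not the record from listing 1: the host `diglib.example.org`, the template, and the helper names `fill_template` and `search_url` are assumptions. only the namespace, the element and attribute names, and the `{searchTerms}` and `{startIndex?}` template parameters come from the opensearch 1.1 specification.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import quote

OS_NS = "{http://a9.com/-/spec/opensearch/1.1/}"

# Hypothetical description document; host and template are invented.
DESCRIPTION = """\
<OpenSearchDescription xmlns="http://a9.com/-/spec/opensearch/1.1/">
  <ShortName>DigLib</ShortName>
  <Description>Harvests book items</Description>
  <Url type="application/rss+xml" rel="collection" indexOffset="0"
       template="http://diglib.example.org/search?q={searchTerms}&amp;start={startIndex?}"/>
  <Language>en-us</Language>
  <InputEncoding>UTF-8</InputEncoding>
</OpenSearchDescription>
"""

PARAM = re.compile(r"\{([A-Za-z][\w:]*)(\?)?\}")


def fill_template(template, values):
    """Substitute {name} and {name?} parameters in an OpenSearch URL template."""
    def substitute(match):
        name, optional = match.group(1), match.group(2) is not None
        if name in values:
            return quote(str(values[name]), safe="")
        if optional:
            return ""          # optional parameters may be left empty
        raise KeyError(name)   # required parameter was not supplied
    return PARAM.sub(substitute, template)


def search_url(description_xml, terms, start=None):
    """Read the first Url element of a description and build a query URL."""
    root = ET.fromstring(description_xml)
    url = root.find(OS_NS + "Url")
    if url is None:
        raise ValueError("description has no Url element")
    values = {"searchTerms": terms}
    if start is not None:
        values["startIndex"] = start
    return fill_template(url.get("template"), values)


print(search_url(DESCRIPTION, "book"))
print(search_url(DESCRIPTION, "open search", start=10))
```

a meta-search client could run the same two steps against every source it knows about, which is what makes a shared description format useful for combining engines.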
as levan points out, “the job of the meta-search engine is made much simpler if the local search engine supports a standard search interface.”44 levan also mentions mxg in this context. nguyen et al. describe an application where over one hundred search engines are used in experiments in federated search.45 the search sources were mostly opensearch-compliant search engines. an additional tool scrapes results from noncompliant systems.

intersynd uses opensearch to help provide a common protocol for harvesting web feeds. intersynd is a syndication system that harvests, stores, and provides feed recommendations.36 it uses java.net’s rome (rss and atom utilities) library to represent feeds in a format-neutral way.

information technology and libraries | june 2014 53

intersynd is syndication middleware that allows sources to post and services to fetch information in all major syndication formats (see figure 2). its feed-discovery module, disco, uses the nutch crawler and the opensearch protocol to harvest feeds. nutch is an open-source library for building search engines that supports opensearch. nutch builds on the lucene information retrieval library, adding web-specifics, such as a crawler, a link-graph database, and parsers for html.

figure 2. opensearch in intersynd.

opensearch 1.1 allows returned results in either rss 2.0 or atom 1.0 format or an opensearch format, the “bare minimum of additional functionality required to provide search results over rss channels” (quoted from a9 website). listing 2 below shows a disco results list in rss 2.0 format. opensearch fields appear in the channel description. the nutch fields appear within each item (not shown). an opensearch namespace is specified in the opening xml element. the following additional opensearch elements appear in the example: totalresults, itemsperpage and startindex.
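as a side note on how a client might consume these three paging elements, here is a minimal python sketch that reads them from an rss 2.0 channel. the feed is a hypothetical example written for this illustration (it is not the nutch output of listing 2), it assumes the opensearch 1.1 namespace, and the function name `paging_info` is invented.

```python
import xml.etree.ElementTree as ET

OS = "{http://a9.com/-/spec/opensearch/1.1/}"

# Hypothetical results feed; titles and links are invented.
FEED = """\
<rss version="2.0" xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">
  <channel>
    <title>example search: metasearch</title>
    <link>http://search.example.org/?q=metasearch</link>
    <description>hypothetical results feed</description>
    <opensearch:totalResults>282</opensearch:totalResults>
    <opensearch:startIndex>0</opensearch:startIndex>
    <opensearch:itemsPerPage>10</opensearch:itemsPerPage>
    <item><title>first hit</title><link>http://example.org/1</link></item>
    <item><title>second hit</title><link>http://example.org/2</link></item>
  </channel>
</rss>
"""


def paging_info(rss_text):
    """Extract the OpenSearch paging fields and item titles from an RSS feed."""
    channel = ET.fromstring(rss_text).find("channel")

    def number(tag):
        text = channel.findtext(OS + tag)
        return int(text) if text is not None else None

    total = number("totalResults")
    start = number("startIndex")
    per_page = number("itemsPerPage")
    # Offset of the next page, or None when paging data is missing or exhausted.
    next_start = None
    if None not in (total, start, per_page) and start + per_page < total:
        next_start = start + per_page
    return {
        "total": total,
        "start": start,
        "per_page": per_page,
        "next_start": next_start,
        "titles": [item.findtext("title") for item in channel.findall("item")],
    }


print(paging_info(FEED))
```

because the paging fields sit in the channel rather than in the items, a harvester can decide whether to request another page before it processes any individual result.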
[listing 2 markup lost in extraction; surviving text values: nutch: metasearch; nutch search results for query: metasearch; http://localhost/nutch-1.6dev/opensearch?query=metasearch&start=0&hitspersite=2&hitsperpage=10; 282; 0; 10; metasearch; cut...; more items cut...]

listing 2. results produced using nutch with opensearch.

we mention one more application of opensearch here. a series of nasa projects to develop a set of interoperable standards for sharing information employs various open technologies for sharing and disseminating datasets, including opensearch for its discovery capability.46 discovery of document and data collections is by keyword search, using the opensearch protocol.

there are various extensions to opensearch. for example, an extension to handle sru allows sru (search/retrieval via url) queries within opensearch contexts. other proposed extensions include support for mobility, e-commerce, and geo-location.

sru (search/retrieval via url)

a technology with some similarities to opensearch but more comprehensive is sru (search/retrieval via url).47 sru is an open restful technology for web search. the current version is sru 2.0, standardized by the organization for the advancement of structured information standards (oasis) as searchretrieve version 1.0. sru was developed to provide functionality similar to the widely deployed z39.50 standard for library information retrieval, updated for the web age.48 sru addresses aspects of search and retrieval by defining several models: a data model, a query model, a processing model, a result set model, a diagnostics model, and a description-and-discovery model.

sru is extensible and can support various underlying low-level technologies. both lucene and dspace implementations are available. the oclc implementation of sru supports both rss and atom feed formats and the atom publishing protocol. sru uses http as the application transport and xml formats for messages.
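a minimal python sketch of what such an http request might look like is given below. it is an illustration under stated assumptions, not code from any of the systems discussed: the endpoint `sru.example.org` and the function name `sru_search_url` are hypothetical, the parameter names follow the sru 1.2 searchretrieve operation, and the query is written in cql, sru's query language.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def sru_search_url(base_url, cql_query, start_record=1,
                   maximum_records=10, version="1.2"):
    """Build an SRU searchRetrieve request as an HTTP GET URL."""
    params = {
        "operation": "searchRetrieve",
        "version": version,
        "query": cql_query,                  # a CQL search clause
        "startRecord": start_record,         # SRU record positions start at 1
        "maximumRecords": maximum_records,
    }
    # urlencode percent-encodes the CQL quotes, spaces, and operators.
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint; dc.title is an index from the Dublin Core context set.
url = sru_search_url("http://sru.example.org/catalog",
                     'dc.title = "open search"',
                     start_record=11, maximum_records=5)
print(url)

# Decoding the query string recovers the original CQL text.
print(parse_qs(urlsplit(url).query)["query"][0])
```

the response to such a request would be an xml document, which a client could parse in the same way as the rss and atom results returned by opensearch sources.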
requests can use either the http get or post method. sru supports a high-level query language called contextual query language (cql). cql is a human-readable query language consisting of search clauses. sru operation involves three parts: explain, search/retrieve and scan. explain is a way to publish resource descriptions. search/retrieve entails the sending of requests (formulated in cql) and the receiving of responses over http. the optional sru scan enables software to query the index. the result list is in xml schema format. the meta-search service mxg uses sru, but relaxes the requirement to use cql.39

srw (search/retrieve web service) is a web services implementation of sru that uses soap as the transfer mechanism.

hammond combines opensearch with sru technology in an application for nature publishers.49 he also points out the main differences between the protocols, such as sru’s use of a query specification language and differences in the results records. as well as supporting opensearch data formats (rss and atom), the nature application also supports json (javascript object notation). opensearch is used for formatting the result sets whereas sru/cql is used for querying. this search application launched as a public service in 2009.

listing 3 below is an example from the nature application showing cql search queries ( tags) used in an opensearch description document. note how both the sru querytype and the opensearch searchterms attributes appear in the query. further details on how to use sru and opensearch together are on the opensearch website.

[listing 3 markup lost in extraction; surviving text values: nature.com; opensearch interface for nature.com; the nature.com opensearch service; nature.com opensearch sru]

listing 3. example using sru and opensearch.

other technologies

here we more briefly survey some additional technologies of relevance to open-search interoperability.
xml-based approaches to information integration, such as the use of xquery, are an option but do not present a loose integration. chudnov et al. describe a simple api for a copy function for web applications to enable syndication, searching, and linking of web resources.50 called unapi, it requires small changes for publishers to add the functionality to web resources such as repositories and feeds. developers can layer unapi over sru, opensearch, or openurl.51

announced in 2008, yahoo!’s searchmonkey technology, also called yahoo!’s open search platform, allowed publishers to add structured metadata to yahoo! search results. searchmonkey divided the problem into two parts: metadata extraction and result presentation. it is not clear how much of this technology survived yahoo! and microsoft’s new search alliance, signed in 2010.52 mika described a search interface technology called microsearch that is similar in nature.53 in microsearch, semantic fields are added to a search and the search result presentation is enriched with metadata extracted from retrieved content. govaerts et al. described a federated search and recommender system that operates as a browser add-on. the system is opensearch-compliant and all results are in the atom format.54

the corporation for national research initiatives (cnri) digital object architecture (doa) provides a framework for managing digital objects in a networked environment. it consists of three parts: a digital object repository, a resolution mechanism (handle system), and a digital object registry. the repository access protocol (rap) provides a means of networked access to digital objects, which supports authentication and encryption.55

summary and conclusions

a rich set of formats and protocols and working implementations show that open search technology is an alternative to the dominant commercial search services.
in particular, we discussed the lightweight opensearch and sru protocols as suitable glue to create loosely coupled search-based applications. these can complement other developments in resource discovery and description, open repositories, and open-source information retrieval. the flexibility and extensibility offer exciting opportunities to develop new applications and new types of applications. the successful deployment of open search technology shows that this technology has matured to support many uses. a fruitful area of further development would be to make working with these standards easier for developers and even accessible to the nonprogrammer.

references
1. google custom search api, https://developers.google.com/custom-search/v1/overview.
2. mike cafarella and doug cutting, “building nutch: open source search: a case study in writing an open source search engine,” acm queue 2, no. 2 (2004), http://0-dl.acm.org.library.ucc.ie/citation.cfm?doid=988392.988408.
3. wray buntine et al., “opportunities from open source search,” in proceedings, the 2005 ieee/wic/acm international conference on web intelligence, 2–8 (2005), http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1517807.
4. jamie callan, “distributed information retrieval,” advances in information retrieval 5 (2000): 127–50.
5. péter jacsó, “internet insights—thoughts about federated searching,” information today 21, no. 9 (2004): 17–27.
6. ricardo baeza and prabhakar raghavan, “next generation web search,” in search computing (berlin heidelberg: springer, 2010): 11–23, http://link.springer.com/chapter/10.1007/978-3-642-12310-8_2.
7. trevor strohman et al., “indri: a language model-based search engine for complex queries,” in proceedings of the international conference on intelligent analysis 2, no. 6 (2005): 2–6.
8. xapian project website, http://xapian.org/.
9. andrew aksyonoff, introduction to search with sphinx: from installation to relevance tuning (sebastopol, ca: o’reilly, 2011).
10. rohit khare, “nutch: a flexible and scalable open-source web search engine,” oregon state university, 2004, p. 32, http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.105.5978.
11. “xapian users,” http://xapian.org/users.
12. christian middleton and ricardo baeza-yates, “a comparison of open source search engines,” 2007, http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.119.6955.
13. apache solr, http://lucene.apache.org/solr/.
14. ross singer, “in search of a really ‘next generation’ catalog,” journal of electronic resources librarianship 20, no. 3 (2008): 139–42, http://www.tandfonline.com/doi/pdf/10.1080/19411260802412752.
15. europeana portal, http://www.europeana.eu/portal/.
16. maristella agosti et al., delosdlms—the integrated delos digital library management system (berlin heidelberg: springer, 2007).
17. mehdi alipour-hafezi et al., “interoperability models in digital libraries: an overview,” electronic library 28, no. 3 (2010): 438–52, http://www.emeraldinsight.com/journals.htm?articleid=1864156.
18. institute of electrical and electronics engineers, ieee standard computer dictionary: a compilation of ieee standard computer glossaries (new york: ieee, 1990).
19. clifford lynch and hector garcía-molina, “interoperability, scaling, and the digital libraries research agenda,” in iita digital libraries workshop, 1995.
20. andreas paepcke et al., “interoperability for digital libraries worldwide,” communications of the acm 41, no. 4 (1998): 33–42.
21. manjula patel et al., “semantic interoperability in digital library systems,” 2005, http://delos-wp5.ukoln.ac.uk/project-outcomes/si-in-dls/si-in-dls.pdf.
22. georgios athanasopoulos et al., “digital library technology and methodology cookbook,” deliverable d3.4, 2011, http://www.dlorg.eu/index.php/outcomes/dl-org-cookbook.
23. herbert van de sompel and oren beit-arie, “open linking in the scholarly information environment using the openurl framework,” new review of information networking 7, no. 1 (2001): 59–76, http://www.tandfonline.com/doi/abs/10.1080/13614570109516969.
24. daniel chudnov, “coins for the link trail,” library journal 131 (2006): 8–10.
25. lois mai chan and marcia lei zeng, “metadata interoperability and standardization—a study of methodology, part ii,” d-lib magazine 12, no. 6 (2006), http://www.dlib.org/dlib/june06/zeng/06zeng.html.
26. json (javascript object notation), http://www.json.org/.
27. the open archives initiative protocol for metadata harvesting, http://www.openarchives.org/oai/openarchivesprotocol.html.
28. herbert van de sompel et al., “the ups prototype: an experimental end-user service across e-print archives,” d-lib magazine 6, no. 2 (2000), http://www.dlib.org/dlib/february00/vandesompel-ups/02vandesompel-ups.html.
29. oaister, http://oaister.worldcat.org/.
30. mark nottingham, ed., “the atom syndication format. rfc 4287,” memorandum, ietf network working group, 2005, http://www.ietf.org/rfc/rfc4287.
31. rss 2.0 specification, berkman center for internet & society at harvard law school, july 15, 2003, http://cyber.law.harvard.edu/rss/rss.html.
32. “rss 2.0 and atom 1.0 compared,” http://www.intertwingly.net/wiki/pie/rss20andatom10compared.
33. jay brodsky et al., eds., “the information and content exchange (ice) protocol,” working draft, version 2.0, 2003, http://xml.coverpages.org/icev20-workingdraft.pdf.
34. open archives initiative object reuse and exchange, http://www.openarchives.org/ore/.
35. opml (outline processor markup language), http://dev.opml.org/.
36. adrian p. o’riordan and m. oliver o’mahoney, “engineering an open web syndication interchange with discovery and recommender capabilities,” journal of digital information 12, no. 1 (2011), http://journals.tdl.org/jodi/index.php/jodi/article/viewarticle/962.
37. vicky reich and david s. h. rosenthal, “lockss: a permanent web publishing and access system,” d-lib magazine 7, no. 6 (2001): 14, http://mirror.dlib.org/dlib/june01/reich/06reich.html.
38. erik selberg and oren etzioni, “multi-service search and comparison using the metacrawler,” in proceedings of the fourth int'l www conference, boston, 1995. [pub info?]
39. niso metasearch initiative, metasearch xml gateway implementers guide, version 1.0, niso rp-2006-02, 2006, http://www.niso.org/publications/rp/rp-2006-02.pdf.
40. dewitt clinton, “opensearch 1.1 specification, draft 5,” http://opensearch.org/specifications/opensearch/1.1.
41. judith wusteman, “ojax: a case study in agile web 2.0 open source development,” in aslib proceedings 61, no. 3 (2009): 212–31, http://dx.doi.org/10.1108/00012530910959781.
42. judith wusteman and padraig o’hlceadha, “using ajax to empower dynamic searching,” information technology & libraries 25, no. 2 (2013): 57–64, http://0-www.ala.org.sapl.sat.lib.tx.us/lita/ital/sites/ala.org.lita.ital/files/content/25/2/wusteman.pdf.
43. adrian p. o’riordan, “open meta-search with opensearch: a case study,” technical report hosted at cora.ucc.ie repository, 2007, http://dx.doi.org/10468/982.
44. ralph levan, “opensearch and sru: a continuum of searching,” information technology & libraries 25, no. 3 (2013): 151–53, https://napoleon.bc.edu/ojs/index.php/ital/article/view/3346.
45. dong nguyen et al., “federated search in the wild: the combined power of over a hundred search engines,” in proceedings of the 21st acm international conference on information and knowledge management (maui, hawaii: acm press, 2012): 1874–78, http://dl.acm.org/citation.cfm?id=2398535.
46. b. d. wilson et al., “interoperability using lightweight metadata standards: service & data casting, opensearch, opm provenance, and shared sciflo workflows,” in agu fall meeting abstracts 1 (2011): 1593, http://adsabs.harvard.edu/abs/2011agufmin51c1593w.
47. library of congress, “sru—search/retrieve via url,” www.loc.gov/standards/sru.
48. the library of congress network development and marc standards office, “z39.50 maintenance agency page,” www.loc.gov/z3950/agency.
49. tony hammond, “nature.com opensearch: a case study in opensearch and sru integration,” d-lib magazine 16, no. 7/8 (2010), http://mirror.dlib.org/dlib/july10/hammond/07hammond.print.html.
50. daniel chudnov et al., “introducing unapi,” 2006, http://ir.library.oregonstate.edu/xmlui/handle/1957/2359.
51. daniel chudnov and deborah england, “a new approach to library service discovery and resource delivery,” serials librarian 54, no. 1–2 (2008): 63–69, http://www.tandfonline.com/doi/abs/10.1080/03615260801973448.
52. “news about our searchmonkey program,” yahoo! search blog, 2010, http://www.ysearchblog.com/2010/08/17/news-about-our-searchmonkey-program/.
53. peter mika, “microsearch: an interface for semantic search,” in semantic search, international workshop located at the 5th european semantic web conference (eswc 2008) 334 (2008): 79–88, http://ceur-ws.org/vol-334/.
54. sten govaerts et al., “a federated search and social recommendation widget,” in proceedings of the 2nd international workshop on social recommender systems ([pub info?], 2011): 1–8.
55. s. [first name?] reilly, “digital object protocol specification, version 1.0,” november 12, 2009, http://dorepository.org/documentation/protocol_specification.pdf.

user experience with a new public interface for an integrated library system
kelly blessinger and david comeaux
information technology and libraries | march 2020
https://doi.org/10.6017/ital.v39i1.11607
kelly blessinger (kblessi@lsu.edu) is head of access services, louisiana state university. david comeaux (davidcomeaux@lsu.edu) is systems and discovery librarian, louisiana state university.

abstract

the purpose of this study was to understand the viewpoints and attitudes of researchers at louisiana state university toward the new public search interface from sirsidynix, enterprise. fifteen university constituents participated in user studies to provide feedback while completing common research tasks.
particularly of interest to the librarian observers were identifying and characterizing where problems were expressed by the participants as they utilized the new interface. this study was approached within the framework of the cognitive load theory and user experience (ux). problems that were discovered are discussed along with remedies, in addition to areas for further study. introduction the library catalog serves as a gateway for researchers at louisiana state university (lsu) to access the print and electronic resources available through the library. in 2018 lsu, in collaboration with our academic library consortium (louis: the louisiana library network), upgraded to a new library catalog interface. this system, called enterprise, was developed by sirsidynix, which also provides symphony, an integrated library system (ils) long used by the lsu libraries. “sirsidynix and innovative interfaces are the two largest companies competing in the ils arena that have not been absorbed by one of the top-level industry players.”1 there were several reasons for the change. most importantly, sirsidynix made the decision to discontinue updates to the previous online public access catalog (opac), known as e-library, and focus development on enterprise. in response to this announcement, the louis consortium chose to sunset the e-library opac in the summer of 2018. this was welcome news to many, especially the systems librarian, who had felt frustrated by the antiquated interface of the old opac as well as its limited potential for customization. the newer interface has a more modern design and includes features such as faceted browsing to better suit the twenty-first-century user. enterprise also delivers better keyword searching. this is largely because it uses the solr search platform, which operates on an inverted index. 
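to make the idea of an inverted index concrete, here is a toy sketch in python. it is an illustration only and makes several assumptions: the three catalog records are invented, terms are split on whitespace, and a query matches only records containing every term. it is not how solr or enterprise is implemented, and it does no relevance ranking.

```python
from collections import defaultdict


def build_inverted_index(documents):
    """Map each unique term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every term in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    # Start from the postings of the first term, then intersect the rest.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Invented sample records standing in for catalog entries.
catalog = {
    1: "history of louisiana",
    2: "louisiana music history",
    3: "introduction to music theory",
}
index = build_inverted_index(catalog)

print(sorted(search(index, "louisiana history")))  # ids of matching records
print(sorted(search(index, "music")))
```

the lookup touches only the term lists, never the full records, which is the reason an index of this kind can answer keyword queries quickly.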
solr (pronounced “solar”) is based on open source indexing technology and is customizable, more flexible, and usually provides more satisfactory results to common searches than our previous catalog. inverted indexing can be conceptualized similarly to indexes within books. “instead of scanning the entire collection, the text is preprocessed and all unique terms are identified. this list of unique terms is referred to as the index. for each term, a list of documents that contain the term is also stored.”2 unlike the old catalog, which sorted results by date (newest to oldest), enterprise ranks results by relevance, like search engines. the new search is also faster because the results are matched to the inverted index instead of whole records.3

the authors wanted to investigate how well this new interface would meet users’ research needs. while library database and website usage patterns can be assessed through quantitative measures using web analytics, librarians are often unaware of the elements that cause frustration for the users unless they are reported. prior to enterprise going live, the library’s head of access services solicited internal “power users” in the library to use the new interface. power users were identified as library personnel in units who used the catalog daily for their work. this group included interlibrary loan, circulation, and some participants in research and instruction services. these staff members were asked to use enterprise as their default search to help discover problems before it went live. a shared document was created in google drive for employees to leave feedback regarding their experiences and suggestions for improvements. the systems librarian had access to this folder and periodically accessed it and made warranted changes that were within his control.
several changes were made based on feedback from the internal user group. these included adding the term “checked out” in addition to the due date, and the adjustment of features that were not available or working correctly in the advanced search mode, such as date, series, call number, and isbn. several employees were curious about differences between the results in the old system and enterprise due to the new algorithm. additionally, most internal users found the basic search too simplistic and not useful, so the advanced search mode was made the default search. among the suggestions, there was also praise for the new interface. some of these statements concerned the user-enablement tools, such as “i was able to login using my patron information. i really like the way that part functions,” and others concerned areas where additional information was now available, such as “i do enjoy that it shows the table of contents—certainly helps with checking citations for ill.”

while the feedback from internal stakeholders was helpful, the authors were determined to gather feedback from patrons as well. to obtain this feedback, the authors elected to conduct usability studies. usability testing employs representative users to complete typical tasks while observers watch, facilitate, and take notes. the goal of this type of study is to collect both qualitative and quantitative data to explore problem areas and to gauge user satisfaction.4

enterprise includes an integration with ebsco discovery service (eds) to display results from the electronic databases subscribed to by the library as well as the library’s holdings. eds was implemented several years ago as a separate tool. the implementation team suspected that launching enterprise with periodical article search functionality might be confusing to those who were not accustomed to the catalog operating in this manner.
therefore, for the initial roll-out, the discovery functionality was disabled in enterprise, leaving it to function strictly as a catalog to library resources. this decision will be revisited later. as at many other academic libraries, eds is currently the default search for users from the lsu libraries homepage. other search interfaces, labeled “catalog,” “databases,” and “e-journals,” are also included as options in a tabbed search box.

conceptual framework

two schools of thought helped to frame this research inquiry: cognitive load theory and user experience (ux). cognitive load theory relates to the amount of new information a novice learner can take on at a given time due to limitations of the working memory. this theory originated in the field of instructional design in the late 1980s.5 the theory states that what separates novice learners from experts is that the latter know the background, or are familiar with the schema of a problem, whereas novices start without this information. accordingly, experts are able to categorize and work with problems as they are presented, whereas new learners need to formulate their problem-solving strategies while encountering new knowledge. as a result, novices are quicker to max out the cognitive load in their working memories while trying to solve problems. ux emerged from the field of computer science and measures the satisfaction of users with their experience with a product.

a 2010 article reviewed materials published in 65 separate studies with “cognitive load” in the title or abstract.6 early articles on cognitive load focused on learning and instructional development. these studies concentrated on limiting extraneous information (e.g., materials and learning concepts), which affects the amount of information able to be held in the working memory.7 while the research that developed cognitive load theory centered on real-life problem-solving scenarios, later research focused on its impact in e-learning environments and learning regarding this delivery mode.8 in contrast to cognitive load theory, which was formed by academic study, the concept of ux was developed in response to user/customer satisfaction, particularly regarding electronic resources such as websites.9 user testing allows end users to provide real-time feedback to developers so they see the product working, and in particular, to note where it could be improved. ux correlates well with cognitive load theory for this study, as the concept arose with the widespread use of computers in the workplace and in homes in the mid-1990s.

user studies

user expectations have shifted beyond the legacy or “classic” opac, originally designed for use by experienced researchers with the primary goal of searching for known items.10 user feedback has historically been sought when libraries release new platforms and services and to gauge user satisfaction regarding research tools. “libraries seek fully web-based products without compromising the rich functionality and efficiencies embodied in legacy platforms.”11 a study by borbely used a combination of the log files of opac searches and a satisfaction questionnaire to determine which factors were most important to both professional and nonprofessional users. their findings indicated that task effectiveness, defined as the system returning relevant results, was the primary factor related to user satisfaction.12 many of the articles dealing with user studies and library holdings published in recent years have focused on next-generation catalogs (ngcs).
this was defined in a 2011 study by 12 characteristics: “a single point of entry for all library resources, state of the art web interface, enriched content, faceted navigation, simple keyword search box with a link to advanced search box on every page, relevancy based on circulation statistics and number of copies, ‘did you mean . . .’ spell checking, recommendations/related materials based on transaction logs, user contributions (tagging and ranking), rss feeds, integration with social networking sites, and persistent links.”13 catalogs defined as next-generation provide more options and functionality in a user-friendly, intuitive format. they are typically designed to more closely mimic web search engines, with which novice users are already familiar. tools within ngcs such as the faceted browsing of results have been reported as popular in user studies, especially among searchers without high levels of previous search experience. “faceted browsing offers the user relevant subcategories by which they can see an overview of results, then narrow their list.”14 a 2015 study interviewed 18 academic librarians and users to seek their feedback regarding new features made possible by ngcs. their findings indicate “that while the next-generation catalogue interfaces and features are useful, they are not as ‘intuitive’ as some of the literature suggests, regardless of the users’ searching skills.”15 this study also indicated that users typically use the library catalog in combination with other tools such as google scholar or worldcat local both for ease of use and for a more complete review of the literature. while enterprise contains the twelve elements that yang and hofmann defined for an ngc, since the discovery element has been disabled, lsu libraries’ use of enterprise may be better described as an ils opac with faceted results navigation.
while the implementation of discovery services and other web tools has shifted users to sources other than the catalog, many searchers often still prefer to use the library’s catalog. reasons for this may include familiarity with the interface, the ability to limit results to smaller numbers, or the unavailability of specific desired search options through other interfaces.

problem statement

the purpose of this study was to understand the viewpoints and attitudes of university stakeholders regarding a new interface to the online catalog. in particular, four areas were investigated:
1. identification of problems searching for books on general and distinct topics.
2. identification of problems searching for books with known titles and specific journal titles.
3. exploration of the usability of patron-empowerment features.
4. identification of other issues and/or frustrations (e.g., “pain points”).

methodology

three groups of users were identified for this study: undergraduate students, graduate students, and staff/faculty. the student participants were the easiest to recruit due to a fine forgiveness program that was initiated at lsu libraries in 2016. this program gives library users the option of completing user testing in lieu of some or all of their library fines (up to $10 per user test). all the student participants were recruited in this manner. additionally, five faculty/staff members identified as frequent library users were asked to participate. the participant pool for user testing included five undergraduate students, five graduate students, and five faculty and staff members. a group size of five is considered a best practice in user testing.16 the total sample for this study was 15 library users representing these three unique user groups. these participants are described in more detail in appendix a. for the observations, individuals were brought to the testing room in the library.
this is a small neutral conference room with a table, laptop, and chairs for the librarian observers and participants. each participant was tested individually and was asked to speak aloud through their thought process as they used the new interface. the authors employed a technique known as “rapid iterative testing.” this type of testing involves updating the interface soon after problems are identified by a user or observer. thus, after each user test, confusing and extraneous information was removed, applying cognitive load theory and improving the interface in alignment with the concept of ux. this approach helped to minimize the number of times participants repeatedly encountered the same issues. this framework makes this study more of a working study than a typical user study. a demonstration of this type of testing is included as figure 1.

figure 1. iterative testing model. this shows the logon screen before and after it was modified. this was based on the observation that users were unsure what information to enter here.

the software usability studio was utilized to record audio and video of the participants’ electronic movements throughout each test. although the software can also record video of users throughout tests, the authors felt that this may make the participants uncomfortable and possibly more reluctant to openly share their opinions. at the beginning of each user study, participants were informed of the purpose of the study, the process, and the estimated time of the observation (30 to 45 minutes). the participants were then asked to sign a consent form for participation in the study. the interviews began with two open-ended pre-observation questions to gauge the users’ previous library experience.
the first question asked students whether they had received library training in any of their courses, or, if the participant was a teaching staff or faculty member, if they regularly arranged library instruction for their students. the second question explored if they needed to use library resources in a previous assignment or required these in one they had assigned. then volunteers were given a written list of four multi-part task-based exercises, detailed in appendix b. these exercises were designed to evaluate the areas of concern outlined in the problem statement and to let the users explore the system, helping the observers discover unforeseen issues. the observations ended with two follow-up questions that asked the participants to describe their experience with the new interface. they were asked what they liked and what they found frustrating. they were also asked if there were areas where they felt they needed more help, and how the design could be made more user friendly. after the testing was completed, the audio files were imported into temi, a transcription tool that provided the text of what the users and the observers said throughout the test periods. the authors reviewed these transcripts and the recorded videos of the users’ keystrokes within the system for further clarity. the process and all instruments involved were reviewed by the lsu institutional review board prior to the testing. all user tests took place from march through november 2018.

findings

previous library training and assignments

three of the five undergraduate participants had received some previous library training from their professors or from a librarian who visited their classes. those that had training tended to recall specific databases they found useful, such as cq researcher or jstor.
the assignments requiring library research mentioned by the undergraduate participants typically required several scholarly resources as a component of an assignment. four out of five of the graduate-student participants also indicated that they had some library training, and most also indicated they used the library frequently. two of the graduate-student participants, both studying music, mentioned that a research course was a required component of their degree program. the staff and faculty members tested mentioned that, depending on the format of the course, they either demonstrated resources to their students themselves, or would request a librarian to teach more extensive training. participant 13, a teaching staff member, mentioned that in a previous class she was “able to get our subject librarian to provide things to cater to the students, and they had research groups, so she [the subject librarian] was very helpful.” some of the teaching staff and faculty mentioned providing specific scholarly resources for their students. they acknowledged that since these were provided, their students did not gain hands-on experience finding scholarly materials themselves. participant 10, a faculty member, stated that she usually requires that students in one of her courses “do an annotated bibliography. i’ll require that they find six to ten sources in the library and usually require that at least three or four of those sources be on the shelf physically, because i want them to actually work with the books, and in addition, to avail themselves of electronic resources.” most of the staff and faculty participants indicated that, despite its weaknesses, they preferred using the online catalog over eds, mainly because eds included materials outside of our collection. when asked to explain, participant 12, a staff member said, because i feel like [with] ebsco you get a ton of results, and you know, i’m still looking for stuff that you guys have. 
um, [however] because of the way the original catalog is, i feel like i have to go through discovery to get a pretty accurate search on what lsu has. because, when i do use the discovery search, it’s a lot more sensitive, or should i say maybe a lot less sensitive, and it will pick up a lot of results. . . . it searches your catalog really well, just like worldcat does. . . . so, if the catalog was the thing that was able to do that, that would be cool. if the catalog search was more intuitive and inviting, i wouldn’t even bother going to some of these other places.

books: general and distinct topics

the observers noticed multiple participants using or commenting on known techniques learned from experience with the old catalog interface. these included boolean operators such as “and” to connect terms within the results. enterprise does not include boolean logic in its searches. a goal of the structure for the new algorithm is to provide a search closer to natural language. while most of the student participants typically searched by keyword when searching for books on general topics, staff and faculty participants typically preferred to search within the subject or title fields. faculty and staff participants also actively utilized the library of congress subject heading links within records and said that they also recommended that their students find materials in this manner. participant 9, a faculty member, said that he usually told his students to “find one book, then go to the catalog record . . . where you’ll get the subject headings. because . . . you’re not going to guess the subject headings just off the top of your head, and that’s how they are organized. that’s the best way of getting . . . your real list.” many users were able to deduce that a book was available in print based on the call number listed.
however, some undergraduate-student participants were confused by links in individual records and assumed that these were links to electronic versions of the book. in fact, these links primarily led to additional information about the book, such as the table of contents. the observers also found that many students used the publication date limiter when searching for materials, while commenting that they typically favored recent publications, unless the subject matter made a historical perspective beneficial. the date limiter, while effective for books, is less effective for periodicals, which include a range of publication dates. more advanced researchers, such as one staff participant, enjoyed the option to sort their results by date, but sorted these by “oldest first,” indicating that they did this to find primary documents.

known title books

none of the user groups tested had trouble locating a known book title within the new interface, although two undergraduate students remarked that they preferred to search by the author, if known, to narrow the results. most undergraduates determined if the books were relevant to their needs based on the title and did not explore the tables of contents or further information available. graduate students tended to be more sophisticated in their searches for relevance and used the additional resources available in the records. participant 12, a staff member, mentioned that he liked the new display of the results when he was searching for a specific book. while the old system contained brief title information in the results display, he believed the new version showed more pertinent information, such as the call number, in the initial results. he said, “and this is also great too, because the old system . . . you would bring up information, then there’s another tab you have to click on to get to the . . . real meat of it. so . . .
this is really good to see if it’s a book, to know what the number is immediately, just to not have to go through so many clicks.”

specific journals

specific journal results were problematic and were confusing in multiple ways. the task regarding journals directed users to find a specific journal title within enterprise, and then to determine whether 2018 issues were available and in what format (e.g., print or electronic). all the student users had trouble determining whether a journal was in print or electronic and if the year they needed was available. the task of finding a specific journal title and its available date range was also troublesome to many students. the catalog lists “publication dates” for journals prominently in the search results. however, these dates indicate the years that a journal was published, not the years that the library holds. users need to go into the full record for a journal to see the available years listed under the holding information. unfortunately, this was not intuitive to many. additionally, the presentation of years in records for journals was also unclear to some. for instance, participant 2, a freshman, did not understand that a dash next to an initial date (e.g., 2000–) indicated that the library held issues from 2000 to the present. many student users, especially those familiar with google scholar or eds, did not understand that journals are solely indexed in the catalog by the title of the journal. this is problematic for those who are accustomed to more access points for journals, such as article title and author. journals were additionally confusing because each format (e.g., print, electronic, or microfilm) has its own record in the catalog.
typically, users clicked on the first record with the title that matched the one they required, and assumed that this was all that was available, rather than scrolling through the results to view the different formats and the varying years within. this issue was problematic for all the student participants. participant 5, a phd student, summed up this frustration by saying, “i find stuff like that sometimes when i’m looking for other things. like, it shows the information [for the journal] but great, awesome. i found that [the journal] is maybe, hopefully somewhere, but sometimes you click on whatever link they have, and it goes to this one thing and there’s like one of the volumes available. so, this is not useful.” while records in the past used to be cataloged with all the holding information in one record, a better functionality for end users, this practice was changed in the mid- to late 2000s because updating journal holdings was a manual process completed by technical services staff. this timeframe was when the influx of electronic journal records began to steadily increase, making this workflow too cumbersome. due to all these known issues with journals, when asked to search for a specific journal in the catalog, several advanced searchers (graduate students, staff, and faculty) indicated they would not use the catalog to find journals. several stated other sources they preferred to use, whether google scholar, interlibrary loan, or subject-specific databases in their fields. after fumbling around with the catalog, participant 12, a staff member, summed this up by saying, “i guess if i was looking for a journal, i would just go back to the main page, and go from there [from the e-journals option]. i haven’t really searched for journals from the catalog.
the catalog is usually my last [resort], especially for something like a journal.”

usability of patron-empowerment features

many participants were confused by the login required to engage with the patron-enablement tools prior to the iterative changes demonstrated in figure 1. once changes were made clarifying the required login information, patrons were able to use the patron-enablement tools well, placing holds and selecting the option to send texts regarding overdue materials. however, few undergraduate participants intuitively understood the functionality of the checkboxes next to records to retrieve items later. some participants assumed that they needed to be logged into the system to use this functionality, similar to eds. participant 1, a senior, said that she used a different method for retrieving items later, stating “normally, i’m going to be honest, if i needed the actual title, i’d put it in a word document on my computer. i wouldn’t do it on the library website.” a graduate student, participant 14, stated that while he was aware of the purpose of the checkboxes, he would not use them because the catalog would not be the only resource he would be using. he said that his preference was to “keep a reference list [in word] for every project. and then this reference list will ultimately become the reference for the work done.” participants in every category noted that they did not usually create lists in the catalog to refer to them later. there was enthusiasm regarding the new option to text records, with participant 6, a staff member, going so far as to say “boy, this is gonna make me very annoying to my friends” and staff participant 12 stating “that’s a really cool feature. i think that’s more helpful than this email to yourself.” unfortunately, there were several issues discovered regarding the text functionality. the first issue was that it was not reliably working with all carriers.
once that was resolved, the systems librarian removed extraneous information regarding the functionality. this included text stating that “standard rates apply” and a requirement for users to choose their phone carrier before a text could be sent. these were both deemed unnecessary as it was assumed that users would know whether they were charged for receiving text messages. additionally, one of the international graduate student participants did not understand the connection between texting information and the tool to do this, which was titled “sms notification.” although the texts were sent successfully while the users were performing the studies, it was discovered later that the texts did not include the call numbers for items. after discussion regarding this problem arose at a louis conference, the decision was made to hide texting as an option until the system was able to properly text all the necessary information. when sirsidynix can fix this issue, the language around this will likely be made more intuitive by labeling it “text notifications” instead of sms.

other issues and frustrations

the researchers noticed that some options were causing confusion, such as the format limiter. under this drop-down option, several areas were displayed that did not align to known formats in the lsu collections, such as “continuing resources.” to remedy this, all the formats that could not be identified were removed as options. another confusing element was the way that records were displaying in initial user tests. some marc cataloging information was visible to users, so the systems librarian modified the full record display to hide this information.
originally, the option to see this information was moved to the side under an option to “view marc record.” however, since this still seemed to confuse users, this button was changed to “librarian view.” undergraduate-student users reported confusion when they needed to navigate out of the catalog into a new, unfamiliar database interface to obtain an electronic article. participant 3, a senior, described her feelings when this happened, that she felt like she was “not in the hands of lsu anymore. i’m with these people, and i don’t know how to work this for sure.” another undergraduate user suggested that the system provide a warning when this occurs, so users would know that they would be navigating in a new system. since so many of the records link to databases and other non-catalog resources, this was not pursued. several undergraduate-student users mentioned that they didn’t understand the physical layout of the library, and that they used workarounds to get the materials they needed rather than navigate the library. for example, some were using the “hold” option in the catalog to have staff pull materials for them for reasons not initially intended by the library. rather than using this feature for convenience, they stated they were using it due to a lack of awareness of the layout of the library or the call number system. one user, participant 4, a sophomore, used the hold feature to determine whether a book was in print or electronic. when she clicked on the “hold” button in a record and it was successful, she said “okay, so i can place a hold on it, so i’m assuming there is copy here.”

follow-up questions

feedback on the new interface was primarily positive. several participants mentioned that the search limiters were now more clearly presented as choices from drop-down boxes. additionally, result lists are now categorized by facets such as format and location, which users had options to “include” or “exclude” at their discretion.
participant 9, a faculty member, particularly liked the new library of congress subject facet from within the search results. she mentioned that these were available in the past interface, but the process to get to them was much more cumbersome. she regarded this new capability as a “game changer” and “something she hadn’t even dreamed of.” experienced searchers, such as participant 6, a staff member, noticed and appreciated the improvements in search results made possible by the new algorithm. she said, “it’s very easy to look at, especially compared to the old database, and the keyword searching is a lot better.” after conducting a brief search, another staff member, participant 12, mentioned that he thought the results returned by an author search were much more relevant than in the past. he said, “sometimes with the older system even name searches can be sort of awful. i mean . . . this is a lot better. . . . if you type in benjamin franklin, for me at least, it’s difficult to get to the things he actually wrote. you know, you can find a lot of books about [them], and so you kind of have to filter until you can find . . . the subject.”

figure 2. catalog disclaimer.

the new search is also more forgiving of misspellings than the old version, which responded with “this search returned no results” when an item was misspelled. those who were very familiar with the old interface, such as staff and faculty, were particularly excited by small changes. these included being able to use a browser’s back button instead of the navigation internal to the system, or the addition of the isbn field to the primary search. prior to the new interface, the system would give an error message when a user attempted to use a browser’s back button instead of the internal navigation.
additionally, users mentioned that they liked that features were similar to the previous interface with additional options. an example of a new feature is the system employing fuzzy logic to provide “did you mean” and autocomplete suggestions when users start typing in titles they are interested in, similar to what google provides. this same logic also returns results with related terms, eliminating the need for truncation tools.17 one graduate student, however, particularly mentioned missing boolean operators; they thought they were helpful because students had been taught these and were familiar with them. due to this comment, and other differences between the old and new interfaces, a disclaimer was added to make users aware of these changes (see figure 2). two of the five undergraduate participants and two faculty participants noted that if they didn’t understand something, or needed help, they would ask a person at a service desk for assistance. one of the staff members mentioned she would use the virtual chat if she had a question regarding the catalog. she also suggested that a question mark symbol providing additional information when clicked might be helpful if users got confused in the system. to allow users to provide continued feedback, the systems librarian created a web form for users to ask questions or report errors regarding the system.

discussion

study participants requested several updates; unfortunately, some of the recommendations suggested were in areas where the systems librarian had little control to make changes. since the initial advanced search was not easy to customize, the systems librarian created a custom new advanced search that more closely fit the needs of catalog users than the built-in search.
one limitation of the default advanced search that several participants and staff users noted was the inability to limit by publication date. to work around this problem, one of the features the systems librarian implemented in the custom search was a date-range limiter. while still falling short of the patrons’ desired outcome of inputting a precise date to limit by, the date range feature was still a step forward. he was also able to make stylistic changes, such as bolding fields or making buttons appear in bright colors to make them more visible. other changes included eliminating confusing options and reordering the way full records appeared. this included moving the call number to a more visible area than where it was originally located. after a staff participant suggested it, he was also able to make the author name a hyperlinked field. now users can click on an author’s name to see what other books are available by that author within the library. the systems librarian was also able to make the functionality of the checkboxes more intuitive by adding a “select an action” option at the top of the list of results, which more clearly indicated what could be done with the checked options. these include being added to a list, printed, emailed, or placed on hold. the username and pin required to engage with the user-enablement tools were continually problematic, and not intuitive. only one of the student participants knew their login information, a graduate student close to graduation. the username is a unique nine-digit lsu id number, which students, faculty, and staff don’t often use. the pin is system generated, so there is no way users could intuit what their pin is. once the user selects the “i forgot my pin” option, however, the pin is sent to them, and then they have the option to change it to something they prefer. this setup is not ideal, especially since many other resources on campus are accessed through a more familiar text-based login and password.
the addition of “i forgot my pin” to this part of the interface helps by anticipating and assisting with this problem by providing an example with the nine-digit id number, but this can also be overlooked. for this reason and for other security reasons related to paying fines, the library is exploring options to provide a single-sign-in login mechanism. the lack of knowledge regarding the physical layout of the library cannot be solely blamed on the users. in 2014, the lsu libraries made several changes to middleton library, the main campus library. the first was the closing of a once distinct collection, education resources, whose titles were merged with the regular collections. the second was weeding a large number of materials on the library’s third floor to facilitate the creation of a math lab. the resulting shifting of the collection had a direct impact on how patrons were able to locate materials within the library. due to required deadlines, access services staff needed to place books in groupings out of their typical call number locations. the department is still working to remedy this five years later. in addition, service points previously manned to assist with wayfinding on the third and fourth floors were closed.

conclusion

to optimize whatever resources the library provides, user feedback is a useful tool. however, there are limitations to this study, the most obvious being that the data collection took place and was based on researchers at one institution. this could limit the applicability of the study’s results. the second is regarding the sampling method for user tests. student users self-selected by volunteering in lieu of user fines, and staff and faculty, identified as frequent library users, were purposively selected. less-experienced users may have encountered different issues as they navigated the system.
most of the student participants indicated they had received library training in some manner, and that they had been required to use library resources in the past to complete assignments. only a small number of participants were undergraduates without library training. the authors noted that the student participants who had received library training were more likely to attempt complicated searches and to explore advanced features. however, they also tended to try to conduct searches that were optimized for the previous catalog, such as using boolean logic. those with library training were also more likely to identify problematic areas, such as searching for journals, and to develop workarounds to get the materials they desired. the two graduate students in music, who were required to take a research course, both indicated how helpful this knowledge was to conducting research in their field. the user tests in this study demonstrated which information points the users at lsu found to be the most relevant, which allowed the system librarian to redesign a search that better fit their needs. this included hiding or separating extraneous information, such as additional information regarding texting, and making changes so all marc coding only appeared under the newly created “librarian view.” while this study demonstrated that the advanced researcher participants created workarounds regarding journal searches, undergraduate participants also created workarounds (such as placing holds) to accommodate their lack of knowledge regarding the library system and the physical library. several of the undergraduate participants reported having anxiety regarding their ability to navigate systems when the catalog linked to databases with interfaces new to them. the authors found that more advanced researchers appreciated having more data in catalog records, such as information on publishers and library of congress subject headings. 
students without as much exposure to library resources tended to prefer to conduct keyword searches and were more likely to judge the relevance of a record based mainly on the title or year of publication. most of the staff and faculty participants in this study indicated that they preferred to use the opac over eds. less-seasoned researchers tended to prefer ease and convenience over additional control and functionality. these kinds of generalizations could be tested by additional studies at other universities. the new user-empowerment features were received positively, especially the new “text notifications” feature. most participants indicated that they found it easy to renew items within the interface. however, the authors discovered that few patrons indicated they would use “my lists” to capture records they would like to retrieve later. the user tests highlighted how many problems lsu library users are having signing on to the system to utilize the user-enablement tools. it is hoped that the upcoming change to a single sign on will alleviate these issues and the users’ frustrations. the systems librarian would like to incorporate other changes, such as the request to return to the same spot on a list after going into a full record rather than returning to the top of the list. he is also planning on programming the mobile view for the catalog soon. currently the mobile site is still linking to the desktop version. he has also reintroduced the option to conduct a boolean search by linking to the old catalog due to so many users being familiar with it. the text messaging is expected to be corrected in an upcoming upgrade. overall, response from the participants in this study was positive, especially regarding the new algorithm.
they also appreciated the familiarity of the design with the previous catalog interface, with additional features and functionality. regardless of the limitations in this study, some of this study’s findings reaffirm those from previous user studies. these include researchers indicating a need to consult multiple resources either in combination with or in exclusion of the catalog, and ngcs not being as intuitive as expected. the need to consult multiple resources particularly correlated with this study’s findings regarding journals. librarians were aware that searching journals in the online catalog was tricky for users due to multiple issues. many of the experienced participants in this study mentioned that they appreciated the new algorithm because it provided more accurate results. this reaffirmed results from the borbely study, which indicated that task effectiveness, or the system returning relevant results, was the primary factor related to user satisfaction. also similar to findings from the literature, users appreciated the newly available faceted-browsing features. unlike in a previous study, however, it was the advanced searchers, rather than the novices, who mentioned these specifically as an improvement.18 the authors noted that it was common for undergraduate library participants to express confusion regarding navigating the physical library, so the library has taken several steps to remedy this. since this user testing was completed, a successful grant was written to provide new digital signage to replace outdated signage. this digital signage will be much more flexible and easier to update than the older fixed signage. additionally, this grant provided a three-year license to the stackmaps software. this software has since been integrated into the catalog and eds tool to direct users to physical locations within the libraries.
additionally, the access services department updated physical bookmarks that display the call number ranges and resources available on each floor. these are now available at all the library’s public service desks. the library will also continue providing the popular “hold” services for patrons. this is a relatively new service, which was started to offset confusion and to assist patrons during the construction they may have encountered during the changes to the library. finally, since the fine forgiveness program has been so fruitful regarding recruitment for user studies, the special collections library also anticipates providing user studies in lieu of photocopying costs in the future.

future research

these user tests made it obvious that finding specific journal information through the catalog was difficult for most users. this is an area that needs remediation, and the systems librarian plans to conduct further user testing to explore avenues to make searching for journal holdings more efficient. another potential area for further study includes assessing enterprise’s integration of article records. as previously mentioned, enterprise can be configured to include article-level records into its display. however, this functionality would duplicate an existing feature of our main search tab, an implementation of eds that we have labeled “discovery.” while the implementation team felt that duplicating this functionality on a search tab labeled “catalog” might initially confuse users, replacing our current default search tab with enterprise warrants serious consideration. an additional area to explore is a rethinking of the tabbed search box design.
while the tabbed design remains popular in libraries, a trend toward a single search box on the library homepage has been observed in academic libraries.19 a future study with an emphasis on determining the best presentation of various search interfaces, including either a reshuffling of available tabs or a move to a single search box, is planned for the foreseeable future.

appendix a: study participants

participant  status     year       major                                date tested
1            undergrad  senior     international studies & psychology   3/23/2018
2            undergrad  freshman   mass communication                   4/6/2018
3            undergrad  senior     child and family studies             4/13/2018
4            undergrad  sophomore  pre-psychology                       4/24/2018
5            graduate   phd        music                                4/26/2018
6            staff      n/a        english                              5/3/2018
7            graduate   masters    music                                5/3/2018
8            graduate   phd        french                               5/4/2018
9            faculty    n/a        history                              5/7/2018
10           faculty    n/a        english                              5/8/2018
11           undergrad  junior     accounting                           6/1/2018
12           staff      n/a        history                              6/4/2018
13           staff      n/a        mass communication                   8/28/2018
14           graduate   phd        curriculum and instruction           10/3/2018
15           graduate   phd        petroleum engineering                10/2/2018

appendix b: user exercises worksheet

1) you need to do a research paper on gerrymandering and race.
   a) identify three books that you may want to use.
   b) how would you save these titles to refer to later?
2) you are looking for the book titled harriet tubman and the fight for freedom by lois e. horton. find out the following and write your answers below.
   a) does the library own this in print?
   b) what is the call number?
   c) if we have this book, go into the record, and text yourself the information.
   d) place a hold on this book.
3) you need an article from the journal of philosophy. do we have access to the 2018 issues? what type of access (e.g., print or electronic)?
4) log in to your personal account to see the following:
   a) what you have checked out currently; if you have materials out, try to renew an item.
   b) determine any fines you owe.
   c) add a text notification for overdue notices.

endnotes

1 marshall breeding, “library systems report 2018: new technologies enable an expanded vision of library services,” american libraries (may 1, 2018): 22–35.
2 david a. grossman and ophir frieder, information retrieval: algorithms and heuristics, 2nd ed. (dordrecht, the netherlands: springer, 2004).
3 dikshant shahi, apache solr: a practical approach to enterprise search (berkeley, ca: springer ebooks, 2015). ebscohost.
4 “usability testing,” u.s. department of health and human services, accessed june 1, 2019, https://www.usability.gov/how-to-and-tools/methods/usability-testing.html.
5 john sweller, “cognitive load during problem solving: effects on learning,” cognitive science 12, no. 2 (1988): 257–85, https://doi.org/10.1207/s15516709cog1202_4.
6 nina hollender et al., “integrating cognitive load theory and concepts of human–computer interaction,” computers in human behavior 26, no. 6 (2010): 1278–88, https://doi.org/10.1016/j.chb.2010.05.031.
7 wolfgang schnotz and christian kürschner, “a reconsideration of cognitive load theory,” educational psychology review 19, no. 4 (2007): 469–508, https://doi.org/10.1007/s10648-007-9053-4.
8 jeroen j. g. van merriënboer and paul ayres, “research on cognitive load theory and its design implications for e-learning,” educational technology research and development 53, no. 3 (2005): 5–13, https://doi.org/10.1007/bf02504793.
9 ashok sivaji and soo shi tzuaan, “website user experience (ux) testing tool development using open source software (oss),” in 2012 southeast asian network of ergonomics societies conference (seanes), ed. halimahtun m. khalid et al. (langkawi, kedah, malaysia: ieee, 2012), 1–6, https://doi.org/10.1109/seanes.2012.6299576.
10 deeann allison, “information portals: the next generation catalog,” journal of web librarianship 4, no. 4 (2010): 375–89, https://doi.org/10.1080/19322909.2010.507972.
11 breeding, “library systems report 2018.”
12 maria borbely, “measuring user satisfaction with a library system according to iso/iec tr 9126‐4,” performance measurement and metrics 12, no. 3 (2011): 151–71, https://doi.org/10.1108/14678041111196640.
13 sharon q. yang and melissa a. hofmann, “next generation or current generation? a study of the opacs of 260 academic libraries in the usa and canada,” library hi tech 29, no. 2 (2011): 266–300, https://doi.org/10.1108/07378831111138170.
14 jody condit fagan, “usability studies of faceted browsing: a literature review,” information technology & libraries 29, no. 2 (2010): 58–66, https://doi.org/10.6017/ital.v29i2.3144.
15 hollie m. osborne and andrew cox, “an investigation into the perceptions of academic librarians and students towards next-generation opacs and their features,” program: electronic library and information systems 51, no. 4 (2015): 2163, https://doi.org/10.1108/prog-10-2013-0055.
16 “best practices for user centered design,” online computer library center (oclc), accessed june 7, 2019, https://www.oclc.org/content/dam/oclc/conferences/acrl_user_centered_design_best_practices.pdf.
17 “enterprise,” sirsidynix, accessed june 27, 2019, https://www.sirsidynix.com/enterprise/.
18 fagan, “usability studies.”
19 david j. comeaux, “web design trends in academic libraries—a longitudinal study,” journal of web librarianship 11, no. 1 (2017): 1–15, https://doi.org/10.1080/19322909.2016.1230031.

primo new user interface: usability testing and local customizations implemented in response

blake lee galbreath, corey johnson, and erin hvizdak

information technology and libraries | june 2018 10

blake lee galbreath (blake.galbreath@wsu.edu) is core services librarian, corey johnson (coreyj@wsu.edu) is instruction and assessment librarian, and erin hvizdak (erin.hvizdak@wsu.edu) is reference and instruction librarian, washington state university.

abstract

washington state university was the first library system of its 39-member consortium to migrate to primo new user interface. following this migration, we conducted a usability study in july 2017 to better understand how our users fared when the new user interface deviated significantly from the classic interface.
from this study, we learned that users had little difficulty using basic and advanced search, signing into and out of primo, and navigating their account. in other areas, where the difference between the two interfaces was more pronounced, study participants experienced more difficulty. finally, we present customizations implemented at washington state university to the design of the interface to help alleviate the observed issues.

introduction

a july 2017 usability study by washington state university (wsu) libraries was the final segment of a six-month process for migrating to the new user interface of ex libris primo called primo new ui. wsu libraries assembled a working group in december 2016 to plan for the migration from the classic interface to primo new ui and met bi-weekly through may 2017. to start, the primo new ui working group attempted to answer some baseline questions: what can and cannot be customized in the new interface? how, and according to what timeline, should we introduce the new interface to our library patrons? what methods could be used to assess the new interface?

this working group customized the look and feel of the new interface to conform to wsu branding and then released a beta version of primo new ui in march, leaving the older interface (primo classic) as the primary means of access to primo but allowing users to enter and test the beta version of the new interface. in early may (at the start of the summer semester), the prominence of the old and new interfaces was reversed, making primo new ui the default interface but leaving the possibility of continued access to primo classic. the older interface was removed from public access in mid-august, just prior to the start of the fall semester. the public had the opportunity to work with the beta version from march to may and then had another two months of experience with the production release by the time the usability study took place in july 2017.
the remainder of this paper will focus on the details of this usability study.

primo new user interface | galbreath, johnson, and hvizdak 11 https://doi.org/10.6017/ital.v37i2.10191

research questions

primo new ui was the name given to the new front end of the primo discovery layer, which was made available to customers in august 2016. according to ex libris, “its design is based on user studies and feedback to address the different needs of different types of users.”1 we were primarily interested in understanding the usability of the essential functionalities of primo new ui, especially where the design of the new interface deviated significantly from the classic interface (taking local customizations into account). for example, we noted that the new interface introduced the following differences to the user (this ordinal list corresponds to the number labels in figure 1):

1. basic search tabs were expressed as drop-downs.
2. the advanced search link was less prominent than it was with our customized shape and color in the classic interface.
3. main menu items were located in a separate area from the sign in and my account links.
4. my favorites and help/chat icons were located together and in a new section of the top navigation bar.
5. sign in and my account links were hidden beneath a “guest” label.
6. facet values were no longer associated with checkboxes or underlining upon hover.
7. availability statuses were expressed through colored text.

figure 1. basic search screen in primo new ui.

we also observed a fundamental change in the structure of the record in primo new ui: the horizontally oriented and tabbed structure of the classic record (see figure 2) was converted to a vertically oriented and non-tabbed structure in the new interface (see figure 3).
additionally, the tabbed structure of the classic interface opened in a frame of the brief results area, while the same information was displayed on the full display page of the new interface. the options displayed in these areas are known as get it and view it (although we locally branded our sections availability and request options and access options, respectively). therefore, we were eager to see how this change in layout might affect a participant’s ability to find get it and view it information on the full display page.

taking the above observations into account, we formulated the following questions:

1. will the participant be able to find and use the basic search functionality?
2. will the participant be able to understand the availability information of the brief results?
3. will the participant be able to find and use the sign in and sign out features?
4. will the participant be able to understand the behavior of the facets?
5. will the participant be able to find and use the actions menu? (see the “send to” boxed area in figure 3.)
6. will the participant be able to navigate the get it and view it areas of the full display page? (see the “availability and request options” boxed area in figure 3.)
7. will the participant be able to navigate the my account area?
8. will the participant be able to find and use the help/chat and my favorites icons?
9. will the participant be able to find and use the advanced search functionality?
10. will the participant be able to find and use the main menu items? (see figure 1, number 3.)

figure 2. horizontally oriented and tabbed layout of primo classic.

literature review

2012 witnessed a flurry of studies involving primo classic. majors compared the experiences of users within the following discovery interfaces: encore synergy, summon, worldcat local, primo central, and ebsco discovery service.
the study used undergraduate students enrolled at the university of colorado and focused on common undergraduate searching activities. each interface was tested by five or six participants who also completed an exit survey. observations specific to the primo interface noted that users had difficulty finding and using existing features, such as email and e-shelf, and difficulty connecting their failed searches to interlibrary loan functionality.2

figure 3. vertically oriented and non-tabbed layout of primo new ui.

comeaux noted issues relating to terminology and the display of services during usability testing carried out at tulane university. twenty people, including undergraduates, graduates, and faculty members, participated in this study, which tested five typical information-seeking scenarios. the study found several problems related to terminology. for example, participants did not fully understand the meaning of the expand my results functionality.3 participants also did not understand that the display text “no full-text” could be used to order an item via interlibrary loan.4 the study also concluded that the mixed presentation of differing resource types (e.g., books, articles, reviews) was confusing for patrons who were attempting known-item searches.5

jarrett documented a usability study conducted at flinders university library. the aims of the study were to determine user perceptions regarding the usability of the discovery layer, the relevance of the information retrieved, and the user experiences of this search interface compared to other interfaces.6 the usability portion of the study scored the participants’ completion of tasks in the primo discovery layer as difficult, confusing, neutral, or straightforward.
scores indicated that participants had difficulty determining different editions of a book, locating a local thesis, and placing an item on hold. the investigators also observed that students had issues signing into primo and distinguishing between journals and journal articles.7

nichols et al. conducted a usability test on a newly implemented primo instance at the university of vermont libraries in 2012. their research questions were designed to understand primo’s design, functionality, and layout.8 the majority of the participants were undergraduate students. similar to comeaux's findings, confusion occurred when participants had to find specific or relevant records within longer sets of results.9 nichols et al. also noticed that test subjects had difficulty navigating and finding information in the primo tabbed structure. like jarrett, nichols et al. noted that participants had difficulty distinguishing between journals and articles.10 similar to those in majors's study, participants in nichols et al. had difficulty finding certain primo functionality, such as email, the e-shelf, and the feature to open items in a new window.11 the investigators concluded that these tools were difficult to find because they were buried too deep in the interface.

the university of kansas libraries conducted two usability studies on primo. the first study took place during the 2012–13 academic year and involved 27 participants, including undergraduate, graduate, and professional students, who performed four to five main tasks in two separate sessions. similar to other studies, participants experienced great difficulty using the save to e-shelf and email citation tools.12 kliewer et al. conducted the second usability study in 2016, which focused primarily on student satisfaction with the primo discovery tool. thirty undergraduates participated in this study, which collected both qualitative and quantitative data.
in contrast to most usability studies of discovery services, this study allowed participants to explore primo with open-ended searches to more closely mimic natural searching strategies. results of the study indicated that the participants preferred basic search to advanced search, used facets (but not enough to maximize their searching potential), rarely moved beyond the first page of search results, and experienced difficulties using the link resolver. in response to the latter, a primo working group clarified language on the link resolver page to better differentiate between links to articles and links to journals.13

brett, lierman, and turner conducted a usability study at the university of houston libraries focusing primarily on undergraduate students. users were able to complete the assigned tasks, but the majority did not do so in the most efficient manner. that is, the participants did not take full advantage of primo functionality, such as facets, holds, and recalls. additionally, some participants exhibited difficulty distinguishing among the terms journals, journal articles, and newspaper articles. another difficulty participants experienced was knowing what further steps to take once they had successfully found an item in the results list. for example, participants had trouble locating stacks guides, finding request features, and using call numbers. the researchers concluded that many of the issues witnessed in this usability study could be mitigated via library instruction.14

usability testing of primo new ui has recently begun to gain a foothold in academic libraries. in addition to conducting usability testing on primo classic in april 2015 (5 participants, 5–6 tasks), researchers at boston university carried out both pre- and post-launch testing of the new interface in december 2016 and april 2017, respectively.
pre-launch testing with five student participants identified issues with “labelling, locating links to online services, availability statement links in full results, [and] my favorites.”15 after completing fixes, post-launch testing with four students (2 infrequent users, 2 frequent) found that they were able to easily complete tasks, use filters, save results, and find links to online resources. usage statistics for the new interface, compared to classic, also showed an increased use of facets after fixes, and an increase in the use of some features but decrease in the use of others, providing information on what features warranted further examination.16

california state university (csu) libraries conducted usability studies on primo new ui with 24 participants (undergraduate students, graduate students, and faculty) across five csu campuses. five standard tasks were required: find a specific book, find a specific film, find a peer-reviewed journal article, find an item in the csu network not owned locally, and find a newspaper article. each campus added additional questions based on local needs. participants were overwhelmingly positive about the interface look and feel, ease of use, and speed of the system. the success rate for each task varied across the campuses, with participants having greater success on simple tasks such as finding a specific or known item and mixed results on more difficult tasks including using scopes, understanding icons and elements of the frbr record, and facets. steps were taken to relabel and rearrange the scopes and facets so that they were more meaningful to users, and frbr icons were replaced.
the authors concluded that primo is an ideal solution to incorporate both global changes and local preference because of its customizability.17

university of washington libraries conducted usability studies on the classic and new primo interfaces. the primo new ui study observed 12 participants. each 60-minute session included an orientation, pre- and post-tests, tasks, and follow-up questions. difficulties were noted with terminology, the site logo, the inability to select multiple facets, unclear navigation, volume requesting, advanced search logic, the pin location in item details, and the date facet. a/b testing with 12 participants (from both the new and classic ui studies) revealed the need to fix the sign-in prompt for my favorites, enable libraries to add custom actions to the actions menu, add a sort option for favorites in the new interface, add the ability to rearrange elements on a single item page, and add zotero support. overall, participants preferred the new interface. generally, participants easily completed basic tasks, such as known-item searches, searches for course reserves, and open searches, but had more difficulty with article subject searching, audio/visual subject searching, and print-volume searching, which was consistent from the classic to the new interfaces for student participants.18

method

we conducted a diagnostic usability evaluation of primo new ui using eight participants, whom we recruited from the wsu faculty, staff, and student populations. in the end, we received a skewed distribution among the categories: three members of staff and five students (two undergraduate students and three graduate students). the initial composition of the participants comprised a greater number of undergraduate students, but substitution created the final makeup. all the study participants had some exposure to primo classic in the past.
we recruited participants by hanging flyers around the libraries of our pullman campus and the adjoining student commons area. we offered the participants $15 in exchange for their time, which we advertised as being a maximum of one hour.

the usability test was designed by a team of three library staff, one from systems (it) and two from research services (reference/instruction). two of us were present at each session, one to read the tasks aloud and the other to document the session. we used camtasia to record each session so that we would have the ability to return to it later if we needed to verify our notes or other specifics of the session. we stored the recordings on a secured share of the internal library drive. we received an institutional review board certificate of exemption (irb #16190) to conduct this study.

this usability test comprised eleven tasks (see appendix a) to test the research questions described above. the tasks were drafted in consultation with the ex libris set of recommendations for conducting primo usability testing.19 each investigator drew their conclusions as to the participants’ successes and failures. we then met as a group to form a consensus regarding task success and failure (see appendix b). we met to discuss the patterns that emerged and to formulate remedies to problems we perceived as hindering student success.

results

for each of the ten research questions below, consult appendix b to see details regarding the associated tasks and how each participant approached and completed each task.

task set(s) related to research question 1: will the participant be able to find and use the basic search functionality?

this was one of the easier tasks for the participants to complete. some participants did not follow the task literally to find their favorite book or movie, but rather completed a search for an item or topic of interest to them.
all the participants completed this task successfully. task set(s) related to research question 2: will the participant be able to understand the availability information of the brief results? the majority of the participants understood that the availability text and its color represented important access information. however, there were instances where the color of the availability status was in conflict with its text. this led at least one participant to evaluate the availability of a resource incorrectly. task set(s) related to research question 3: will the participant be able to find and use the sign in and sign out features? the participants all successfully completed this task. participants used multiple methods to sign in: the guest link in the top navigation bar, the sign in link from the ellipsis main menu item, and the get it sign in link on the full display page. all participants signed out via the user link in the top navigation bar. task set(s) related to research question 4: will the participant be able to understand the behavior of the facets? almost all of the participants were able to select the articles facet without issue. one person, however, misunderstood the include behavior of the facets. instead of using the include behavior, this participant used the exclude behavior to remove all facets other than the articles facet. only two participants attempted to use the print books facet to complete the task, “from the list of results, find a print book that you would need to order from another library.” instead, the other 75 percent simply scanned the list of results to find the same information. five out of the eight participants attempted to find the peer-reviewed facet when completing the task to choose any peer-reviewed article from a results list: three were successful, while one selected the newspaper articles facet, and another selected the reviews facet. 
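as a rough check on the proportions reported for research question 4, the counts above can be tallied in a short python sketch. only the counts come from the text; the names (facet_counts, share) are illustrative assumptions and are not part of the study's materials.

```python
# minimal sketch: tally the facet outcomes reported for research question 4.
# all names here are hypothetical; only the counts are taken from the text.

TOTAL_PARTICIPANTS = 8  # eight participants, per the method section

facet_counts = {
    "used print books facet": 2,
    "attempted peer-reviewed facet": 5,
    "chose peer-reviewed facet (success)": 3,
    "chose newspaper articles facet": 1,
    "chose reviews facet": 1,
}


def share(count, total=TOTAL_PARTICIPANTS):
    """return count/total formatted as a percentage with one decimal place."""
    return f"{count / total:.1%}"


if __name__ == "__main__":
    for label, count in facet_counts.items():
        print(f"{label}: {count} of {TOTAL_PARTICIPANTS} ({share(count)})")

    # participants who scanned the results list instead of using the facet
    scanned = TOTAL_PARTICIPANTS - facet_counts["used print books facet"]
    print(f"scanned results list instead: {scanned} of {TOTAL_PARTICIPANTS} ({share(scanned)})")
```

the three outcomes for the peer-reviewed task (three, one, and one) sum to the five participants who attempted the facet, and the six participants who scanned the list correspond to the 75 percent cited above.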
task set(s) related to research question 5: will the participant be able to find and use the actions menu?

the tasks related to the actions menu (copy a citation and email a record) were some of the most difficult for the participants: two were successful, three had some difficulty, and three were unsuccessful. of those who experienced difficulty, one seemed not to understand the task fully; this participant found and copied the citation, but then spent additional time looking for a “clipboard.” the other two participants were both distracted by competing areas of interest: the citations section of the full display and the section headings of the full display. of those who were unsuccessful, one suffered from a technical issue that ex libris needs to resolve (the functionality to expand the list of action items failed), one did not seem to understand what a citation was when they found it, and another could not find the email functionality. this last subject continued searching in the ellipsis area of the main menu, in the my account area, and the facets, but ultimately never found the email icon in the scrolling section of the actions menu.

task set(s) related to research question 6: will the participant be able to navigate the get it and view it areas of the full display page?

three participants experienced substantial difficulty in completing this set of tasks. these participants were distracted by the styled show libraries and stack chart buttons on the full display page that were competing for attention with the requesting options.

task set(s) related to research question 7: will the participant be able to navigate the my account area?

all of the participants completed this task successfully.
four participants located the back-arrow icon to exit the my account area, while the other four participants used alternate methods: using the library logo, selecting the new search button, and signing out of primo. task set(s) related to research question 8: will the participant be able to find and use the help/chat and my favorites icons? participants encountered very little difficulty in finding a way to procure help and chat with a librarian, with one exception. participant 2 immediately navigated to and opened our help/chat icon, but then moved away from this service because it opened in a new tab. this same participant, along with three others, had a more difficult time finding and deciding to use the pin this item icon than did the three participants who completed the same task with ease. the remaining participant failed to complete this task because they could not find the my favorites area of primo. task set(s) related to research question 9: will the participant be able to find and use the advanced search functionality? one participant had more trouble finding the advanced search functionality than the other seven. another experienced a technical difficulty, in which the primo screen froze during the experiment, and we had to begin the task anew. the remaining six people easily finished the tasks. task set(s) related to research question 10: will the participant be able to find and use the main menu items? the majority of the participants completed this task with ease, navigating to the databases link in the main menu items. one participant, however, was confused by the term database but was able to succeed once we provided a brief definition of the term. the remaining two participants were further confused by the term and instead entered general search terms into the primo search bar. these two participants failed to find the list of databases. 
discussion

study participants completed four of our task sets with relative ease: using basic search (see research question 1 above), signing into and out of primo (see research question 3 above), navigating their my account area (see research question 7 above), and using advanced search (see research question 9 above). there was one exception: one participant experienced minor trouble finding the advanced search link, checking first among the drop-down options on our basic search page. subsequent and unrelated to this study, wsu elected to eliminate the first set of drop-down options from our primo landing page. further testing might tell us if this reduction in the number of drop-down options has effectively made the advanced search link more prominent for users. also, the ease with which participants were able to use items located underneath the “guest” label contradicted our expectations. we predicted that this opacity would cause users issues, but it did not seem to deter them. from this, we concluded that the placement of the sign in options in the upper right corner is sufficient to maintain continuity.

participants encountered a moderate degree of difficulty completing two task sets: determining availability statuses and navigating the get it area of the full display page. concerning availability, participants were quick to understand that statuses such as “check holdings” relayed that the item was not available. the participants were also keen to notice that green availability statuses implied access while non-green availability statuses implied non-access. however, per the design of the new interface, certain non-green links became green after opening the full display page of primo. this was a significant deviation from the classic interface, where colors indicating availability status did not change. this design element misled one participant.
of note, we did not observe participants experiencing issues with the converted format of the get it and view it areas (see figures 2 and 3) per se. however, we did notice that three of our participants were unnecessarily distracted by the show libraries link when trying to find resource sharing options because wsu had previously styled the show libraries links with color and shape. therefore, our local branding in this area impeded usability and led us to rethink the hierarchy of actions on the full display page. similar to comments made by demars, study participants also remarked that the layout of the full display was cluttered and difficult to read.20 we therefore took steps to make this page more readable for the viewer. study participants displayed the greatest difficulty completing the remaining four task sets: selecting a main menu item, refining a search via the facets, using the actions menu, and navigating the my favorites functionality. however, web design was not necessarily the culprit in all four areas. three participants experienced difficulty finding the databases link (a main menu item). after further discussion, it became apparent that this trouble related not to usability but to information literacy—they did not understand the term databases. therefore, like majors and comeaux,21 we recognize the recurring issue of library jargon, and like brett, lierman, and turner,22 we believe that this issue would best be mitigated via library instruction. in agreement with the literature, two participants selected the incorrect facet because they had difficulty distinguishing among the terms articles, newspaper articles, reviews and peer-reviewed.23 further, one of these participants experienced even more difficulty because of not understanding the inherent functionality of the facet values. that is, this participant did not grasp that the facet value links performed an inclusion process by default. 
to the contrary, this person believed that they would have had to exclude all unwanted facet values to arrive at the wanted facet value. the change in facet behavior between classic and new interfaces likely caused this confusion. in primo classic, wsu had installed a local customization that provided checkboxes and underlining upon hover for each facet value. the new interface did not provide either one of these clues to the user. additionally, we observed, similar to kliewer et al. and brett, lierman, and turner, that participants oftentimes preferred to scan the results list over refining their search via faceting.24 this finding also matches a 2014 ex libris user study indicating that users are easily confused by too many interface options and thus tend to ignore them.25

regarding the actions menu, the majority of the participants attempted to find the email icon in the correct section of the full display page (i.e., the “send to” section). however, because of a technical issue in the design of the new interface, the email icon was not always present for the participant to find. for others, it was difficult to reach the icon even when it was present as participants had to click the right arrow three to four times to navigate past all the citation manager icons. this observed difficulty in finding existing functionalities in primo echoes that cited by majors and nichols et al.26 participants also experienced significant difficulty deciphering between the similarly named functionalities of the citation icon and the citations section of the full display page. as a result of this observed difficulty, we concluded that differentiating sections of the page with distinct naming conventions would be beneficial to users.
as in the results reported by boston university, our study participants encountered significant issues when trying to save items into their my favorites list.27 we noticed that participants had difficulty making connections between the icons named keep this item/remove this item and the my favorites area. during testing, it was clear that many of the participants were drawn to the pin icon for the correctly anticipated functionality but then were confused that the tooltips did not include any language resembling “my favorites.” from this last observation, we surmised that providing continuity in language between these icons and the my favorites area would increase usability for our library patrons. pepitone reported problems with the placement of the my favorites pin icon,28 but we observed this being less of a problem than the actual terminology used to name the pin icon. beyond success and failure, a 2014 ex libris user study suggested that academic level and discipline play a key role in user behavior.29 however, we were unable to draw meaningful conclusions about differences among user groups because of our small and homogeneous participant pool.
decisions made in response to usability results
declined to change
facets. although one participant did not understand the inclusion mechanism of the facet values, we declined to investigate a customization in this area. according to the primo august 2017 release notes, ex libris plans to make considerable changes to the faceting functionality.30 therefore, we decided to wait until after this release to reassess whether customization was warranted.
implemented a change
labels
citations. we observed confusion between the citation icon of the actions menu and the section of the full display page labeled “citations.” to differentiate between the two items, we changed the actions menu icon text to “cite this item” (see figure 4) and the heading for the citations section to “references cited” (see figure 5).
figure 4. cite this item icon of the actions menu.
figure 5. references cited section of the full display page.
my favorites. there was a mismatch among the tooltip texts of the my favorites icons. we changed the tooltip language for the “keep this item” pin to read “add to my favorites” (see figure 6) and the tooltip language for the “unpin this item” pin to read “remove from my favorites” (see figure 7).
figure 6. add to my favorites language for my favorites tooltip.
figure 7. remove from my favorites language for my favorites tooltip.
availability statuses. per the design of the new interface, certain non-green links became green after opening the full display page of primo new ui. we implemented css code to retain the non-green coloring of the availability statuses after opening the full display. in this case, “check holdings” remains orange (see figure 8).
figure 8. availability status color of brief display, before and after opening the full display.
link removal
full display page headings. there was confusion as to the function of the headings on the full display page. these are anchor tags, but patrons clicked on them as if they were functional links. no patrons used the headings successfully. therefore, we hid the headings section via css (see figure 9).
figure 9. removal of headings on full display page.
links to other institutions. we observed participants attempting to use the links to other institutions to place resource sharing requests. therefore, we removed the hyperlinking functionality of the links in the list, via css (see figure 10).
figure 10. neutralization of links to other institutions.
prioritized the emphasis of certain functionalities
request options and show libraries buttons.
it is usually more important to be able to place a request than to find the names of other institutions that own an item. however, the show libraries button was originally styled with crimson coloring, which drew unwarranted attention, while the requesting links were not. therefore, we added styling to the resource-sharing links and removed styling from the show libraries button via css (see figure 11).
figure 11. resource sharing link with crimson color; show libraries button with styling removed.
e-mail icon. we observed that the e-mail icon of the actions menu was difficult to find. therefore, we decreased the number of icons and moved the emailing functionality to the left side of the actions menu (see figure 12).
figure 12. email icon prioritized over citation manager icons.
contrast and separation
full display page sections. participants noted that the information on the full display page tended to run together. to remedy this, we created higher contrast between the foreground and background of the page sections via css. we also styled the section titles and dividers with color, among other edits (see figure 13).
figure 13. separated sections of full display page (see figure 3 to compare to the new ui default full display page design).
conclusion
while providing one of the first studies on primo new ui, we acknowledge several limitations. previous studies on primo had larger study populations than this one (which had eight participants). however, we relied on nielsen’s finding that usability studies uncover most design deficiencies with five or more participants.31 additionally, the scope of this study was limited to the usability of the desktop view. we recommend further studies that will concentrate on accessibility compliance and that will test the interface on mobile devices.
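the five-participant guideline cited above is commonly explained with a simple cumulative discovery model, 1 - (1 - L)^n, where L is the chance that a single participant exposes a given problem and n is the number of participants. the sketch below is an illustration only, not a calculation performed in this study; the function name proportion_found is hypothetical, and the default rate of 0.31 is an assumption taken from the cross-project average that nielsen and landauer report, not a value measured here.

```python
# Cumulative problem-discovery model: 1 - (1 - L)^n.
# ASSUMPTION: discovery_rate = 0.31 is the cross-project average reported
# by Nielsen and Landauer; it was NOT measured in this usability study.


def proportion_found(participants: int, discovery_rate: float = 0.31) -> float:
    """Expected share of usability problems observed at least once."""
    if participants < 0:
        raise ValueError("participants must be non-negative")
    if not 0.0 <= discovery_rate <= 1.0:
        raise ValueError("discovery_rate must be between 0 and 1")
    # Each participant independently misses a problem with probability
    # (1 - discovery_rate); all n miss it with that value raised to n.
    return 1.0 - (1.0 - discovery_rate) ** participants


if __name__ == "__main__":
    # Compare the five-participant guideline with this study's eight.
    for n in (1, 3, 5, 8):
        print(f"{n} participants: {proportion_found(n):.1%}")
```

under that assumed rate, five participants would be expected to surface roughly 84 percent of problems and eight participants roughly 95 percent; a lower real-world rate would reduce both figures.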
regarding the study design, the question arose as to whether the participants’ difficulties reflected poor design functionality or a misunderstanding of library terminology (as noted by majors and comeaux).32 the researchers did not carry out pre-tests or an assessment of participants’ level of existing knowledge. this limitation is almost always unavoidable, however, as a task list will always risk not fitting the skills or knowledge of every participant. the study design may also explain why some features went unused. while not using the facets may reflect that participants are unaware of them, it could also reflect the fact that they never had to scroll past the first few items to find the needed resource. users might have felt a greater need to use the facets had we assigned more difficult discovery tasks. the study also contained an investigative bias in that the researchers were part of the working group that developed the customized interface and then tested those customizations. this bias could have been reduced if the study had used researchers who were not a part of the same group that made these customizations. despite these limitations, there are still key findings of note. tasks that participants completed with the greatest ease mapped to those that we assume they do most often, which included basic searching for materials and accessing account information. tasks beyond these basics proved to be more difficult. this raises the question of whether difficulties were really a function of the interface design or whether they reflected ongoing literacy issues.
therefore, it is crucial that designers work with public services and instruction librarians to identify areas where users might be well-served by making certain functionalities more user-friendly and creating educational and training opportunities to increase awareness of these functionalities.33 bringing diverse perspectives into the study is also crucial so that researchers can discover and be more conscious of commonalities in design and literacy needs, particularly regarding advanced tasks.
appendix a: usability tasks
note: search it is the local branding for primo at washington state university.
1) please search for your favorite book or movie.
   a) is this item available for you to read or watch?
   b) how do you know that this item is or isn’t available for you to read or watch?
2) please sign in to search it.
3) please perform a search for “causes of world war ii” (do not include quotation marks).
   a) limit your search results to articles.
   b) for any of the records in your search results list:
      i) find the citation for any item and copy it to the clipboard.
      ii) email this record to yourself.
4) please perform a search for “actor’s choice monologues” (do not include quotation marks).
   a) from the list of results, find a print book that you would need to order from another library.
5) please perform a search for a print book with isbn 0582493498.
   a) this book is checked out. how would you get a copy of it?
   b) pretend that this book is not checked out. please show us the information from this record that you would use to find this item on the shelves.
6) please navigate to your library account (from within search it).
   a) pretend that you have forgotten how many items you have checked out. please show us how you would find out how many items you currently have checked out.
   b) exit your library account area.
7) please navigate to advanced search.
   a) perform any search on this page.
8) please show us where you would go to find help and/or chat with a librarian?
9) please perform a search using the keywords “gender and media.”
   a) add any source to your my favorites list. then open my favorites and click on the title of the source you just added.
   b) return to your list of results. choose any peer-reviewed article that has the full text available. click on the link that will access the full text.
10) please find a database that might be of interest to you (e.g., jstor).
11) please sign out of search it and close your browser.
appendix b: usability results
note: search it is the local branding for primo at washington state university.
research question 1: will the participant be able to find and use the basic search functionality?
associated task(s):
1. please search for your favorite book or movie.
participant | successful? | commentary
1 | yes | searches for “the truman show” from the beginning.
2 | yes | searches for “pet sematary” from the beginning.
3 | yes | searches for “additive manufacturing” from the beginning.
4 | yes | signs in first, navigates to new search, searches for “pzt sensor design.”
5 | yes | searches for “the notebook” from the beginning.
6 | yes | searches for “das leben der anderen” from the beginning.
7 | yes | searches for “legally blonde” from the beginning.
8 | yes | searches for “jurassic park” from the beginning.
research question 2: will the participant be able to understand the availability information of the brief results?
associated task(s):
1b. how do you know that this item is or isn’t available for you to read or watch?
4a. from the list of results, find a print book that you would need to order from another library.
participant | successful? | commentary
1 | yes | differentiates between green and orange text; uses the “check holdings” availability status. clicks on “availability and request option” heading and then clicks on the resource sharing link.
2 | yes, with difficulty | says that green “check holdings” status indicates ability to read the book. selects book with “check holdings” status and locates resource sharing link.
3 | yes, with difficulty | unclear. initially, goes to a record with online access; redoes search, eventually locates resource sharing link.
4 | yes | says the record for the item reads “in place” and the availability indicator = 1. the record for the item reads “check holdings.”
5 | yes | says that status is indicated by statement “available at holland/terrell libraries.” the record for the item reads “check holdings.”
6 | yes | says that status is indicated by statement “available at holland/terrell libraries” and “item in place.” clicks on “check holdings”; says that orange color denotes fact that we don’t have it.
7 | yes | hovers over “check holdings” status, and then notes that “availability” statement reads “did not match any physical resources.” the record for the item reads “check holdings.”
8 | yes | says that status is indicated by statement “available at holland/terrell libraries.” says the record for the item reads “check holdings.”
research question 3: will the participant be able to find and use the sign in and sign out features?
associated task(s):
2. please sign into search it.
11. please sign out of search it and close your browser.
participant | successful? | commentary
1 | yes | navigates to “guest” link, signs in.
2 | yes | navigates to ellipsis, signs in. navigates to “user” link, signs out.
3 | yes | navigates to “guest” link, signs in. navigates to “user” link, signs out.
4 | yes | n/a: already signed in. navigates to “user” link, signs out.
5 | yes | navigates to “guest” link, signs in. navigates to “user” link, signs out.
6 | yes | navigates to “guest” link, signs in.
navigates to “user” link, signs out.
7 | yes | uses sign in link from full display page. navigates to “user” link, signs out.
8 | yes | navigates to “guest” link, signs in. navigates to “user” link, signs out.
research question 4: will the participant be able to understand the behavior of the facets?
associated task(s):
3a. limit your search results to articles.
4a. from the list of results, find a print book that you would need to order from another library.
9b. return to your list of results. choose any peer-reviewed article that has the full text available.
participant | successful? | commentary
1 | yes | selects articles facet. n/a: does not use facets (however, participant investigates the library and type facets, returns to results lists).
2 | yes | selects articles facet. n/a: does not use facets.
3 | no | uses “exclude” property to remove everything but articles. uses “exclude” property to remove everything but print books. looks in facet type for articles; selects newspaper articles instead.
4 | yes, with difficulty | selects articles facet. selects print books facet. selects articles under type facet, clicks on “full-text available” status, selects peer-reviewed articles facet.
5 | no | selects articles facet. n/a: does not use facets. screen freezes (technical issue) and participant is forced to redo search. n/a: does not use facets. when further prompted to find only peer-reviewed articles, participant searches pre-filter area and then selects reviews facet.
6 | yes | selects articles facet. clicks on “check holdings.” participant hovers over “online access” text and then selects peer-reviewed facet.
7 | yes | looks in drop-down scope, then moves to articles facet. n/a: does not use facets. n/a: does not use facets.
8 | yes | hovers over peer-reviewed articles facet, and then selects articles facet. n/a: does not use facets. selects peer-reviewed facet.
research question 5: will the participant be able to find and use the actions menu?
associated task(s):
3.b.i. for any of the records in your search results list, find the citation for any item and copy it to the clipboard.
3.b.ii. for any of the records in your search results list, email this record to yourself.
participant | successful? | commentary
1 | yes | briefly looks at citation icon, scrolls to bottom of page and looks at citations area, returns to citation icon. scrolls to bottom of page, returns to actions area, scrolls with arrow to find email icon, emails to self.
2 | no | initially clicks on citation manager icon (easybib), then clicks on citation icon and copies to clipboard. could not find email icon (technical issue with search it), although further discussion reveals that participant expects to see email function within “send to” heading.
3 | no | opens full display page of item, scrolls to bottom of page. clicks on the citation icon but doesn’t see what they are looking for. finds email icon and emails to self.
4 | no | opens full display page of item, clicks on the citation icon, double-clicks to highlight citation. could not find email icon. searches in ellipsis. attempts the keep this item pin. navigates to my account. searches in facets.
5 | yes, with difficulty | finds citation icon, but then leaves the area via citations heading and winds up at web of science homepage. hovers over “cited in this” language. finds the copy functionality. attempts send to heading twice, looks through actions icons, scrolls to right, finds email icon.
6 | yes | finds citation icon, copies to clipboard. scrolls down page, returns to actions menu, scrolls to email icon, emails record to self.
7 | yes, with difficulty | copies citation from the brief result, and then spends some time trying to find “the clipboard.” navigates to the email icon.
8 | yes, with difficulty | scrolls to bottom of full display page, clicks on citing this link, clicks on title to record, and then copies first 3 lines of record. scrolls until finds email icon, but then moves to send to heading, and then back to email icon, and sends.
research question 6: will the participant be able to navigate the get it and view it areas of the full display page?
associated task(s):
5.a. this book is checked out. how would you get a copy of it?
5.b. please show us the information from this record that you would use to find this item on the shelves.
9.b. click on the link that will access the full text.
participant | successful? | commentary
1 | yes | clicks on “check holdings” availability status, clicks on availability and request options heading, clicks on request summit item link. refers to call number in alma iframe. clicks “full-text available” status, clicks database name.
2 | yes | opens record, locates resource sharing link. refers to call number; opens stack chart to find call number. clicks on title, clicks database name.
3 | yes | locates request option. locates call number in record. clicks “full-text available” status, clicks database name.
4 | yes, with difficulty | clicks on show libraries button, then finds request option after searching page. locates call number in record. clicks “full-text available” status but does not click on database name.
5 | yes, with difficulty | moves to stack chart button, then to show libraries button, and then to availability and request options heading, clicks on stack chart, clicks on show libraries, moves into first library listed and back out, and finally to ill link. finds call number on full display page.
6 | yes | finds request summit option. identifies call number and stack chart as means to find book. clicks on database name.
7 | yes, with difficulty |
looks at status statement, scrolls to bottom of page, then show libraries button, then request summit option. identifies call number and stack chart as means to find book. attempts to use “full-text available” link, then clicks on database name.
8 | yes | finds summit request option. identifies call number and stack chart as means to find book. attempts to use “full-text available” link, then clicks on database name.
research question 7: will the participant be able to navigate their my account area?
associated task(s):
6. please navigate to your library account (from within search it).
6a. pretend that you have forgotten how many items you have checked out. please show us how you would find out how many items you currently have checked out.
6b. exit your library account area.
participant | successful? | commentary
1 | yes | navigates to my account from “user” link. navigates to loans tab. uses back arrow icon.
2 | yes | navigates to my account from “user” link. navigates to loans tab. uses back arrow icon.
3 | yes | navigates to my account from main menu ellipsis. navigates to loans. uses back arrow icon.
4 | yes | navigates to my account from main menu ellipsis. navigates to loans. uses back arrow icon.
5 | yes | navigates to my account from “user” link. navigates to loans. signs out of search it.
6 | yes | navigates to my account from “user” link. navigates to loans. uses search it logo to exit.
7 | yes | navigates to my account from “user” link. navigates to loans. uses new search button to exit.
8 | yes | navigates to my account from “user” link. navigates to loans. uses search it logo to exit.
research question 8: will the participant be able to find and use the help/chat and my favorites icons?
associated task(s):
8. please show us where you would go to find help and/or chat with a librarian?
9.a. add any source to your my favorites list.
then, open my favorites and click on the title of the source you just added.
participant | successful? | commentary
1 | yes, with difficulty | navigates to help/chat icon. navigates to keep this item pin, hesitates, navigates to ellipsis, returns to and clicks on pin. moves to my favorites via animation. clicks on title.
2 | yes, with difficulty | initially navigates to help/chat icon, but thinks it is the wrong button because chat is not directly available within search it. navigates to keep this item pin, hesitates, looks around, selects pin. moves to my favorites via animation. clicks on title.
3 | yes, with difficulty | navigates to help/chat icon. navigates to ellipsis, actions menu, and tags section. finds keep this item pin.
4 | no | navigates to help/chat icon. navigates to ellipsis, keep this item pin, my account, and facets; quits search.
5 | yes, with difficulty | navigates to help/chat icon. adds keep this item pin after investigating 12 other icons. moves to my favorites via animation. clicks on title.
6 | yes | navigates to help/chat icon. adds keep this item pin and moves to my favorites via animation. clicks on title.
7 | yes | navigates to help/chat icon. checks actions menu, adds keep this item pin and moves to my favorites via animation. clicks on title.
8 | yes | navigates to help/chat icon. adds keep this item pin and moves to my favorites via animation. clicks on title.
research question 9: will the participant be able to find and use the advanced search functionality?
associated task(s):
7. please navigate to advanced search.
7a. perform any search on this page.
participant | successful? | commentary
1 | yes | navigates to advanced search. performs search.
2 | yes | navigates to advanced search. performs search.
3 | yes, with difficulty | navigates to basic search drop-down, then to new search, then to advanced search. has trouble inserting cursor into search box.
4 | yes, with difficulty | navigates to advanced search.
builds complex search, then search it freezes and we have to restart the search tool.
5 | yes | navigates to advanced search. performs search.
6 | yes | navigates to advanced search. performs search.
7 | yes | navigates to advanced search. performs search.
8 | yes | navigates to advanced search. performs search.
research question 10: will the participant be able to find and use the main menu items?
associated task(s):
10. please find a database that might be of interest to you (e.g., jstor).
participant | successful? | commentary
1 | yes | navigates to “databases” link of main menu.
2 | yes | navigates to “databases” link of main menu.
3 | no | types query “stretchable electronics” into search box, but unsure how to find a database in the results lists.
4 | no | types query “reinforced concrete” into search box, but unsure how to find a database in the results lists.
5 | yes, with difficulty | is confused by term database. enters “ieee” in search box.
6 | yes | navigates to “databases” link of main menu.
7 | yes | searches within drop-down scopes, then facets, then moves to “databases” link of main menu.
8 | yes | navigates to “databases” link of main menu.
1 “frequently asked questions,” ex libris knowledge center, accessed august 28, 2017, https://knowledge.exlibrisgroup.com/primo/product_documentation/050new_primo_user_interface/010frequently_asked_questions.
2 rice majors, “comparative user experiences of next-generation catalogue interfaces,” library trends 61, no. 1 (2012): 186–207, https://doi.org/10.1353/lib.2012.0029.
3 david comeaux, “usability testing of a web-scale discovery system at an academic library,” college & undergraduate libraries 19, no. 2–4 (2012): 199, https://doi.org/10.1080/10691316.2012.695671.
4 comeaux, “usability testing,” 202.
5 comeaux, “usability testing,” 196–97.
6 kylie jarrett, “findit@flinders: user experiences of the primo discovery search solution,” australian academic & research libraries 43, no. 4 (2012): 280, https://doi.org/10.1080/00048623.2012.10722288.
7 jarrett, “findit@flinders,” 287.
8 aaron nichols et al., “kicking the tires: a usability study of the primo discovery tool,” journal of web librarianship 8, no. 2 (2014): 174, https://doi.org/10.1080/19322909.2014.903133.
9 nichols, “kicking the tires,” 181.
10 nichols, “kicking the tires,” 184.
11 nichols, “kicking the tires,” 184–85.
12 scott hanrath and miloche kottman, “use and usability of a discovery tool in an academic library,” journal of web librarianship 9, no. 1 (2015): 9, https://doi.org/10.1080/19322909.2014.983259.
13 greta kliewer et al., “using primo for undergraduate research: a usability study,” library hi tech 34, no. 4 (2016): 576, https://doi.org/10.1108/lht-05-2016-0052.
14 kelsey brett, ashley lierman, and cherie turner, “lessons learned: a primo usability study,” information technology & libraries 35, no. 1 (2016): 21, https://doi.org/10.6017/ital.v35i1.8965.
15 cece cai, april crockett, and michael ward, “our experience with primo new ui,” ex libris users of north america conference 2017, accessed november 4, 2017, http://documents.el-una.org/1467/1/caicrockettward_051017_445pm.pdf.
16 cai, crockett, and ward, “our experience with primo new ui.”
17 j.
michael demars, “discovering our users: a multi-campus usability study of primo” (paper presented, international federation of library associations and institutions world library and information conference 2017, warsaw, poland, august 14, 2017), 11, http://library.ifla.org/1810/1/s10-2017-demars-en.pdf.
18 anne m. pepitone, “a tale of two uis: usability studies of two primo user interfaces” (slideshow presentation, primo day 2017: migrating to the new ui, june 12, 2017), https://www.orbiscascade.org/primo-day-2017-schedule/.
19 “primo usability guidelines and test script,” ex libris knowledge center, accessed october 28, 2017, https://knowledge.exlibrisgroup.com/primo/product_documentation/new_primo_user_interface/primo_usability_guidelines_and_test_script.
20 demars, “discovering our users,” 9.
21 majors, “comparative user experiences,” 190; comeaux, “usability testing,” 198–204.
22 brett, lierman, and turner, “lessons learned,” 21.
23 jarrett, “findit@flinders,” 287; nichols, “kicking the tires,” 184; brett, lierman, and turner, “lessons learned,” 20–21.
24 kliewer et al., “using primo for undergraduate research,” 571–72; brett, lierman, and turner, “lessons learned,” 17.
25 miri botzer, “delivering the experience that users expect: core principles for designing library discovery services,” white paper, nov 25 2015, 10, http://docplayer.net/10248265-delivering-the-experience-that-users-expect-core-principles-for-designing-library-discovery-services-miri-botzer-primo-product-manager-ex-libris.html.
26 majors, “comparative user experiences,” 194; nichols et al., “kicking the tires,” 184–85.
27 cai, crockett, and ward, “our experience with primo new ui,” 28–29.
28 pepitone, “a tale of two uis,” 29.
29 botzer, “delivering the experience,” 4–5; christine stohn, “how do users search and discover? findings from ex libris user research,” library technology guides, may 5 2015, 7–8, https://librarytechnology.org/document/20650.
30 “primo august 2017 highlights,” ex libris knowledge center, accessed november 2, 2017, https://knowledge.exlibrisgroup.com/primo/product_documentation/highlights/027primo_august_2017_highlights.
31 jakob nielsen, “how many test users in a usability study?,” nielsen norman group, jun 4, 2012, https://www.nngroup.com/articles/how-many-test-users/.
32 majors, “comparative user experiences,” 190; comeaux, “usability testing,” 200–204.
33 brett, lierman, and turner, “lessons learned,” 21.
adding value to the university of oklahoma libraries history of science collection through digital enhancement

maura valentino

information technology and libraries | march 2014 25

“in getting my books, i have been always solicitous of an ample margin; this not so much through any love of the thing itself, however agreeable, as for the facility it affords me of penciling suggested thoughts, agreements and differences of opinion, or brief critical comments in general.” —edgar allan poe

abstract

much of the focus of digital collections has been and continues to be on rare and unique materials, including monographs. a monograph may be made even rarer and more valuable by virtue of handwritten marginalia. using technology to enhance scans of unique books and make previously unreadable marginalia readable increases the value of a digital object to researchers. this article describes a case study of enhancing the marginalia in a rare book by copernicus.

background

the university of oklahoma libraries history of science collections holds many rare books and other objects pertaining to the history of science.
one of the rarest holdings is a copy of nicolai copernici torinensis de revolvtionibvs orbium coelestium (on the revolutions of the heavenly spheres), libri vi, a book famous for copernicus’ revolutionary astronomical theory that rejected the ptolemaic earth-centered universe and promoted a heliocentric, sun-centered model. the history of science collections’ copy of this manuscript contains notes added to the margins. similar notes were made in eight different existing copies, and the astrophysicist owen gingerich determined that these notes were created by a group of astronomers in paris known as the offusius group.1 the notes are of significant historical importance as they offer information on the initial reception of copernicus’ theories by the catholic community. having been created almost five hundred years ago in 1543, the handwriting is faded and the ink has absorbed into the paper. written in cursive script, the letters have merged as the ink has dispersed, adding to the difficulties inherent in reading these valuable annotations.

maura valentino (maura.valentino@oregonstate.edu) is metadata librarian, oregon state university, corvallis, oregon. previously she was digital initiatives coordinator at the university of oklahoma.

the book had previously been digitized, and while some of the margin notes were readable, many of the notes were barely visible. therefore much of the value of the book was being lost in digital form. to rectify this situation the decision was made to enhance the marginalia. it was further decided that once the margin notes were enhanced, two digital representations of each page that contained notes would be included in the digital collection.
one copy would present the main text in the most legible fashion (figure 1) and the second copy would highlight the marginalia and ensure that these margin notes were as legible as possible even if in doing so the readability of the main text was diminished (figure 2).

figure 1. text readable.
figure 2. marginalia enhanced.

while creating a written transcript of the marginalia was considered and would have added some value to the digital object, this solution was rejected in favor of digital enhancement for the following reasons. many of the notes contained corrections with lines drawn to the area of text that was being changed, or bracket numbers (figure 3). in addition, some of the notes are corrections of numbers or tables, so a transcript of the text would do little to demonstrate the writer’s intentions in creating the margin note (figure 4).

figure 3. bracketed corrections.
figure 4. numerical corrections.

also, sometimes there was bleed through from the reverse page, further disrupting the clarity of the marginalia (figures 5 and 6). therefore it was determined that making the notes more readable through digital enhancement would provide the collection’s users with the most useful resource.

figure 5. highlighted—bleed through reduced.
figure 6. bleed through behind marginalia.

the book can be viewed in its entirety here: http://digital.libraries.ou.edu/cdm/landingpage/collection/copernicus

literature review

“modification of photographs to enhance or change their meaning is nothing new. however, the introduction of techniques for digitizing photographic images and the subsequent development of powerful image editing software has both broadened the possibilities for altering photographs and brought the means of doing so within the reach of anyone with imagination and patience.”2 —richard s. croft

the primary goal of this project was to give researchers in the history of science the ability to clearly decipher the marginalia created by the astronomers of the offusius group as they annotated the book using the margins as an editing space. the literature agrees that marginalia is an important piece of history worth preserving. hauptman states, “the thought that produces the necessity for a citation or remark leads directly into the marginal notation.”3 he also adds, “their close proximity to the text allows for immediate visual connection.”4 howard asserts, “for writers and scholars, the margins and endpapers became workshops in which to hammer out their own ideas, and offered spaces in which to file and store information.”5 she also adds that marginalia can “serve as a form of opposition.”6 this is true in this case as some of the marginalia contradicts copernicus. nikolova-houston argues for the historical aspect: “each of the marginalia and colophons is a unique production by its author, and exists in only one copy.”7 she goes on to add, “manuscript marginalia and colophons possess historical value as primary historical sources. they are treated as historical evidence along with other written and oral traditions.”8 such ideas provide a strong justification for the implementation of marginalia enhancement in digital collections.

as mentioned above, it was determined that a transcription would not have had the same effect as digital enhancement of the margin notes. this approach is also supported by the literature. for example, ferrari argues for the digital publication of the marginalia that fernando pessoa, the portuguese writer, made while reading.
one of the cornerstones of his argument is that digital representation of marginalia allows the reader not only to see the words but also the underlining and other symbols that are not easily put into a transcript. in this way, the user of the digital collection obtains a more complete view of the intent of the marginalia’s author.9

another goal of this project was the general promotion of the university of oklahoma’s history of science collections. johnson, in his new york times article, notes that marginalia lend books an historical context while enabling users to infer other meanings from their texts.10 he also quotes david spadafora, president of the newberry library in chicago, who proclaims that “the digital revolution is a good thing for the physical object.” as more people access historical artifacts in electronic form, he notes, “the more they’re going to want to encounter the real object.”11 in this way, enhancement of the marginalia in digital collections can lead to further exposure for the collection and to greater use of the physical objects themselves.

using digital enhancement is not a new idea. morgan asserts, “the innovation of the world wide web is its exciting capacity for space that, while not limitless, is weightless and far less limited than that of the printed book.”12 li, adelson, and agarwala also add, “local manipulation of color and tone is one of the most common operations in the digital imaging workflow.”13 the literature shows that other projects have used enhancement of the digital object to increase the usefulness of the original artifact. one of the projects pursued during the library of congress’s american memory initiative involved the digitization of the work of photographer solomon butcher.
in this case, technicians were able to enhance an area of one photograph that was blurry in normal photographic processes and allow the viewer to see inside a building.14 the archivo general de indias also used digital enhancement to remove stains and bleed-through from ancient manuscripts and render previously unreadable manuscripts readable.15 in an article advocating for a digital version of william blake’s poem the four zoas, morgan notes that some features of the manuscript can only be seen in the digital version rather than a transcription: “sections of the manuscript show intense revision, with passages rubbed out, moved earlier or later in the manuscript, and often, added in the margins.”16

digital processing is not limited to the use of photo editing software. although giralt asserts that it is a common method, “the ample potential for image control and manipulation provided by digital technology has stirred a great interest in postproduction, and digital editing.”17 other projects have used various technologies to enhance images to give added meaning to a digital image. once again, in her article advocating for the digitizing of william blake’s the four zoas, morgan asserts that various enhancement technologies would help readers obtain the greatest benefit from the manuscript. for example, providing “the added benefit of infra-red photography,” would allow “readers to see many of the erased illustrations.”18 she even hopes coding will enhance the usefulness of a digital object: “our impulse to use xml in order to richly encode a text works against passivity. with coding we clarify a work down to its smallest units, and illuminate specific aspects of its structure, aspects that are often less obvious when the work is presented in the form of a printed book.”19

method

locating the marginalia

each page of the book had been previously scanned and was stored in tagged image file format (tiff).
each digital page (tiff image) was carefully examined for marginalia. this was achieved by examining the image in adobe photoshop using the zoom tool to enlarge the image as necessary. as many notes were barely visible, the entirety of each page had to be examined in detail to ensure that margin notes were not overlooked. enlargement of the image in photoshop greatly facilitated this process.

enhancing the marginalia

once all the pages with marginalia were identified, each page was loaded into adobe photoshop for digital processing and enhancement. the following procedure was used (note: the specific directions that follow reference adobe photoshop cs4 for windows but can be generally applied to most software programs intended for photo editing):

1. using the zoom tool, the image was enlarged to facilitate examination and interaction with the marginalia.
2. individual margin notes were selected using the rectangular marquee tool. the area selected included any lines that were drawn from the notes to the original text so it would be clear to what text the margin note referred.
3. as the handwritten margin notes were orange in tone, a blue filter was applied (as blue is the contrasting color to orange) by selecting adjustments from the image menu and then choosing black and white to display the black and white dialog box. in the black and white dialog box, blue filter was selected from the preset drop-down menu. this small adjustment greatly enhanced the readability of the margin notes.
4. with the area still selected, adjustments was again selected from the image menu. from that adjustments submenu, brightness and contrast was selected. adjustments were made to both these values using the sliders presented by the resulting dialog box to further enhance the margin notes’ legibility.
for this particular project, the values selected were generally negative twenty for contrast and positive twenty for brightness.

file naming conventions

each enhanced image was saved with the same filename as the digital image of the original manuscript page, but with an a (for annotated) added to the end of the filename. this naming scheme enabled a distinction between pages with and without enhanced marginalia. this series of steps was repeated for each page (see table 1).

page name                              explanation
book spine; book cover; inside cover   pictures of the covers
blank page with ruler                  to measure page
folio 001                              page 1 as originally scanned
folio 001 verso                        page 1, reverse side, as originally scanned
folio 001 verso a                      page 1, reverse side, with highlighted marginalia
folio 002                              page 2 as originally scanned
folio 002a                             page 2 with highlighted marginalia
folio 002 v                            page 2, reverse side
folio 002 v a                          page 2, reverse side, with highlighted marginalia

table 1. filenames.

importing into the digital management system

contentdm was the digital management system selected for this project. all original manuscript page images and enhanced marginalia page images were imported into contentdm following their creation. the next step was to bring all the pages into contentdm as one compound object. a microsoft excel spreadsheet was created with a line for each page, annotated or not. only three fields were used: title, rights, and filename. a description of the book was placed on its history of science digital collections webpage with a link to the compound object in contentdm, so further metadata was not necessary and can always be added later. the first row only contained the title of the book (no filename). there were tiffs available of the cover, the bookend, the inside cover, and the book with a ruler. these were the next rows. then we began with the pages and titled them as the pages were numbered.
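as an illustration of steps 3 and 4 of the enhancement procedure above, the following python sketch shows why a blue filter darkens orange ink relative to paper, and then applies a simple linear brightness and contrast change. it is not the workflow used in the project, which relied on adobe photoshop. the pixel colors, function names, and the linear adjustment formula are all assumptions made for illustration, and taking the blue channel alone is a simplified stand-in for photoshop's blue filter preset.

```python
# Illustrative sketch only: the project used Adobe Photoshop, not code.
# All colours, names, and formulas below are hypothetical.


def luminance(rgb):
    """Ordinary grayscale value of an (r, g, b) pixel (BT.601 weights)."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def blue_filter(rgb):
    """Simplified blue filter: use only the blue channel as the gray value."""
    return float(rgb[2])


def adjust(value, brightness=0, contrast=0):
    """Simple linear brightness/contrast change on a 0-255 gray value.

    `contrast` is a percentage change of the spread around mid-gray (128).
    This is NOT Photoshop's exact algorithm.
    """
    scaled = (value - 128) * (1 + contrast / 100) + 128 + brightness
    return max(0.0, min(255.0, scaled))


ink = (200, 120, 40)     # hypothetical faded orange ink
paper = (230, 220, 190)  # hypothetical aged paper

# How far apart ink and paper are, with and without the blue filter.
plain_gap = luminance(paper) - luminance(ink)
filtered_gap = blue_filter(paper) - blue_filter(ink)
print(f"ink/paper gap, plain grayscale: {plain_gap:.1f}")
print(f"ink/paper gap, blue filter:     {filtered_gap:.1f}")

# Step 4 with the article's typical values (+20 brightness, -20 contrast).
for name, pixel in (("ink", ink), ("paper", paper)):
    gray = blue_filter(pixel)
    print(name, round(adjust(gray, brightness=20, contrast=-20), 1))
```

with these hypothetical colors the gap between ink and paper is larger in the blue channel than in ordinary grayscale, which is the effect step 3 relies on.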
there were ten pages numbered with roman numerals and then the pages began with alphanumeric page numbers. each page that had handwritten notes had the original page (page 2, for example) and the page with the notes highlighted (page 2 annotated). this would allow the viewer to view the pages in their original form or with the notes highlighted or both, depending on each user’s research interests. once the excel file was complete with each page and its filename entered in a row, the file was saved as a tab-delimited file.

import into contentdm required that all the tiff files be in one folder. once the files were moved, the contentdm compound object wizard was used to import. this book was imported as a compound object with no hierarchy. as this book was published in 1543, it has no chapters. to specify page names, the choice to label pages using tab-delimited object for printing was used. the filenames did not contain page numbers, and the choice to label pages in a sequence was not an option, as two copies of each annotated page existed.

each object imported into contentdm has a thumbnail image associated with it. contentdm will create this image, but the cover of this book is not attractive, so a jpeg file was created using an image from the book that is often associated with copernicus (see figure 3).

conclusions

this project resulted in a digital representation of the physical book that is much more useful to researchers than the original, unenhanced digital object. this history of science collection holds not only the first edition of books important to the history of science, but the subsequent editions so that researchers can see how the ideas of science have changed over time. this new digital edition of de revolutionibus allows researchers to see how another scientist made corrections in copernicus’ book as one step in the change in theory over time and insight into the reaction of the catholic church.
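the file naming convention and the import spreadsheet described in the method section can be sketched in python as follows. this is a minimal sketch, not the project's actual process, which used microsoft excel and the contentdm compound object wizard. the function names, example filenames, page titles, and rights text are hypothetical, and the absence of a header row is an assumption, since the article only says that the first row held the title of the book.

```python
import csv
import io


def annotated_name(filename):
    """Add 'a' (for annotated) to the end of the filename, before any extension."""
    stem, dot, ext = filename.rpartition(".")
    if not dot:  # no extension present
        return filename + "a"
    return f"{stem}a.{ext}"


def build_rows(book_title, rights, pages):
    """Build (title, rights, filename) rows for a compound object.

    `pages` is a list of (page_title, filename, has_marginalia) tuples.
    """
    rows = [(book_title, "", "")]  # first row: title of the book only
    for page_title, filename, has_marginalia in pages:
        rows.append((page_title, rights, filename))
        if has_marginalia:
            # Second copy of the page, with the margin notes highlighted.
            rows.append((f"{page_title} annotated", rights, annotated_name(filename)))
    return rows


def to_tab_delimited(rows):
    """Serialise the rows as tab-delimited text, one page per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


example_rows = build_rows(
    "de revolutionibus",            # hypothetical title text
    "hypothetical rights statement",
    [
        ("page 1", "folio_001.tif", False),
        ("page 2", "folio_002.tif", True),
    ],
)
print(to_tab_delimited(example_rows), end="")
```

keeping the original and the annotated copy as adjacent rows mirrors the article's point that sequential page labels could not be used, because each annotated page exists twice.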
the format that contentdm creates for the object and a clear naming scheme allow the user to view the pages with or without the marginalia, thus making this a useful object for many types of users (see figure 4). however, using photoshop to highlight areas of a page allowed the digital initiatives department to understand the power of this tool. in understanding the utility and power of photoshop, the digital initiatives department has determined it to be a useful tool in other projects. a project to eliminate some images of people’s fingers that inadvertently were photographed along with pages in a book or manuscript has been added to the queue. in future, digitized books or manuscripts with useful notes will undergo these enhancement processes.

references

1. owen gingerich, “the master of the 1550 radices: jofrancus offusius,” journal for the history of astronomy 24 (1993): 235–53, http://adsabs.harvard.edu/full/1993jha....24..235g.
2. richard s. croft, “fun and games with photoshop: using image editors to change photographic meaning” (in: visual literacy in the digital age: selected readings from the annual conference of the international visual literacy association (rochester, ny, october 13–17, 1993)): 3–10.
3. robert hauptman, documentation: a history and critique of attribution, commentary, glosses, marginalia, notes, bibliographies, works-cited lists, and citation indexing and analysis (jefferson, nc: mcfarland, 2008).
4. ibid.
5. jennifer howard, “scholarship on the edge,” chronicle of higher education 52, no. 9 (2005).
6. ibid.
7. tatiana nikolova-houston, “marginalia and colophons in bulgarian manuscripts and early printed books,” journal of religious & theological information 8, no. 1/2 (2009), http://www.tandfonline.com/doi/abs/10.1080/10477840903459586#preview.
8. ibid.
9. patricio ferrari, “fernando pessoa as a writing-reader: some justifications for a complete digital edition of his marginalia,” portuguese studies 24, no. 2 (2008): 69–114, http://www.jstor.org/stable/41105307.
10. dirk johnson, “book lovers fear dim future for notes in the margins,” new york times, february 20, 2011, http://www.nytimes.com/2011/02/21/books/21margin.html?_r=3&emc=tnt&tntemail1=y&.
11. ibid.
12. paige morgan, “the minute particular in the immensity of the internet: what coleridge, hartley and blake can teach us about digital editing,” romanticism 15, no. 2 (2009), http://www.euppublishing.com/doi/abs/10.3366/e1354991x09000774.
13. y. li, e. adelson, and a. agarwala, “scribbleboost: adding classification to edge-aware interpolation of local image and video adjustments,” eurographics symposium on rendering 27, no. 4 (2008), http://www.mit.edu/~yzli/eg08.pdf.
14. s. michael malinconico, “digital preservation technologies and hybrid libraries,” information services & use 22, no. 4 (2002): 159–74, http://iospress.metapress.com/content/gep1rx9rednylm2n.
15. ibid.
16. morgan, “minute particular.”
17. gabriel f. giralt, “realism and realistic representation in the digital age,” journal of film & video 62, no. 3 (2010): 3, http://muse.jhu.edu/journals/journal_of_film_and_video/v062/62.3.giralt.html.
18. morgan, “minute particular.”
19. morgan, “minute particular.”

news and announcements

programmers discussion group meets: pl/1, the marc format, and holdings

twenty-two computer programmers, analysts, and managers met on june 29 in san francisco for the formative meeting of the lita/isas programmers discussion group. in an informal and informative hour, the group established ground rules, started a mailing list, planned the topic for midwinter 1982, and found out more about practices in fifteen library-related installations.

programming language usage

what programming languages are used, and used primarily, at the installations? nine languages turned up, excluding database management systems (and lumping all "assembly" languages together), but one language accounted for more than one-half of the responses:

language                        users   primary
pl/1                            14      13
assembler/assembly languages    8       5
cobol                           4       2
pascal                          3       1
basic                           1       1
c                               1       1
mils (a mumps dialect)          1
fortran                         0
snobol                          0

(note: some installations use more than one "primary" language.)

a second round of hands showed only four users with no use of pl/1.

marc format usage

these questions are asked on an agency-by-agency basis. one agency made no use of the marc communications format. none of those receiving marc-format tapes were unable to recreate the format. eight of the fifteen agencies made significant internal-processing use of the marc-communications-format structure, including the leader, directory, and character storage patterns; this question was made more explicit to try to narrow the answers. thus, the marc communications format is used as a processing format in a significant number of institutions. only three agencies use ascii internally; most use of marc takes place within ebcdic. (all but three agencies were using ibm 360/370 equivalent computers; the parallel is clear.)
computer usage

as noted, all but three agencies use ibm equivalents in the mainframe range; three of those use plug-compatible equipment such as magnuson and amdahl. the other major computers are cdc, dec/vax, and data general eclipse systems. smaller computers in use include dc, dec 11/70, datapoint, and ibm series/1 units.

home terminals and computers

four of those present currently have home terminals. three have home computers.

future plans for the discussion group

the midwinter 1982 topic will be "holdings," with some emphasis on dealing with holdings formats in various technical processing systems (such as oclc, utlas, wln, rlin). an announcement and mailing list will go to all those on the mailing list, as will an october/november mailing with questions sent to the chair. those interested should send their names and addresses to walt crawford, rlg, jordan quad, stanford, ca 94305. it is anticipated that papers on the topic may be ready by midwinter; questions and comments are welcomed. note: there will be no set speakers or panelists; this will be a true discussion group. the topic for the philadelphia meeting will be set at midwinter 1982.-walt crawford, chair, the research libraries group, inc.

channel 2000

a test of a viewdata system called channel 2000 was conducted by oclc in columbus, ohio, during the last quarter of 1980. an outgrowth of the oclc research department's home delivery of library services program, channel 2000 was developed and tested to investigate technical, business, market, and social issues involved in electronic delivery of information using videotex technology.

data collection

throughout the test, data were collected in three ways. transaction logs were maintained, recording keystrokes of each user during the test, thus allowing future analyses and reconstruction of the test sessions.
questionnaires requesting demographic information, life-style, opinion leadership, and attitudes toward channel 2000 were collected from each user in each household before, during, and after the test. six focus-group interviews were held and audiotaped to obtain specific user responses to the information services.

attitudes toward library services

forty-six percent of the respondents agreed that channel 2000 saved time in getting books from the library. responding to other questions, 29 percent felt that they would rather go to a traditional library than order books through channel 2000, and 38 percent of the users felt that channel 2000 had no effect on their library attendance. forty-one percent of the channel 2000 test group felt that their knowledge of library services increased as a result of the channel 2000 test. in addition, 16 percent of the respondents stated that they spent more time reading books than they did before the test.

eighty-two percent of the respondents felt that public libraries should spend tax dollars on services such as channel 2000. although this might suggest that library viewdata services should be tax-based, subsequent focus-group interviews indicated that remote use of these services should be paid for by the individual, whereas on-site use should be "free." sixty-three percent of the test population stated that they would probably subscribe and pay for a viewdata library service, if the services were made available to them off-site.

purchase intent

respondents were asked to rank-order the seven channel 2000 services according to the likelihood that they would pay money to have that service in their home. a mean score was calculated for each channel 2000 service, and the following table shows rank order of preference.
rank order   channel 2000 service
1            video encyclopedia: locate any of 32,000 articles in the new academic american encyclopedia via one of three easy look-up indexes
2            video catalog: browse through the videocard catalog of the public libraries of columbus and franklin county, and select books to be mailed directly to your home
3            home banking: pay your bills; check the status of your checking and savings accounts; look up the balance of your visa credit card; look up your mortgage and installment loans; get current information on bank one interest rates
4            public information: become aware of public and legislative information in ohio
5            columbus calendar: check the monthly calendar of events for local educational and entertainment happenings
6            math that counts!: teach your children basic mathematics, including counting and simple word problems
7            early reader: help your children learn to read by reinforcing word relationships

the final report, mailed to all oclc member libraries, was published as channel 2000: description and findings of a viewdata test conducted by oclc in columbus, ohio, october-december 1980. dublin, ohio: research department, online computer library center, inc., 1981. 21p.

238 journal of library automation vol. 14/3 september 1981

notis software available

at the 1981 ala annual conference in san francisco, the northwestern university library announced the availability of version 3.2 of the notis computer system. intended for medium and large research libraries or groups of libraries, notis provides comprehensive online integrated-processing capabilities for cataloging, acquisitions, and serials control. patron access by author and title has been in operation for more than a year, and version 3.2 adds subject-access capability as well as other new features. an improved circulation module and other enhancements are under development for future release.
although notis, which runs on standard ibm or ibm-compatible hardware, has been in use by the national library of venezuela for several years, northwestern only recently decided to actively market the software, and provided a demonstration at the ala conference. a contract has been signed with the university of florida, and several other installations are expected within a few months. further information on notis may be obtained from the northwestern university library, 1935 sheridan rd., evanston, il 60201.

bibliographic access & control system

the washington university school of medicine library announces its computer-based online catalog/library control system known as the bibliographic access & control system (bacs). the system is now in operation and utilizes marc cataloging records obtained from oclc since 1975, serials records from philsom serials control network, and machine-readable patron records. features of interest in the system are:

1. patron access by author, title, subject, call number, or combination of keywords. the public-access feature has been in operation since may 1981. online instructions support system use, minimizing staff intervention. user survey indicates a high degree of satisfaction with the system.
2. low cost public access terminal with a specially designed overlay board.
3. barcode-based circulation system featuring the usual functions, including recalls for high demand items, overdue notices, suspension of circulation privileges, etc.
4. cataloging records loaded from oclc marc records by tape and from a microcomputer interface at the oclc printer port. authority control available on three levels: (a) controlled authority, i.e., mesh or lc, (b) library-specific assigned authority, and (c) word list available to user.
5. full cataloging functions online, including editing, deleting, and entering records.
6. serials control from philsom system.
philsom is an online distributed computer network that currently controls serials for sixteen medical school libraries. philsom features rapid online check-in, claims, fiscal control, union lists, and management reports.
7. five possible displays of the basic bibliographic record, varying from a brief record for the public access terminal to complete information for cataloging and reference staff.
8. two levels of documentation available online.

the software is available to interested libraries, bibliographic utilities, or commercial firms. contact: washington university school of medicine library, 4580 scott, st. louis, mo 63110; (314) 454-3711.

editor's notes

goodbye jola

it is with mixed emotions that we note that this is the last issue of the journal of library automation. the first issue appeared in march 1968, just shortly after this editor had graduated from library school. under the editorships of frederick g. kilgour and susan k. martin, jola established itself as a major source of information about developments in library automation. this is also the last issue of the first volume produced by a new editorial board. the current editors are especially indebted to eileen mahoney of ala's central publication unit, whose experience, patience, and wise counsel contributed materially to making this last volume one we are all proud of.

hello ital

please welcome volume 1, number 1 of information technology and libraries when its bright new face appears on your doorstep in march. it will look very familiar to you. the new name reflects many of the shifts in emphasis that have gradually been introduced in recent years as changing technologies have encouraged a broadening of jola's original scope. we plan to introduce some minor changes to increase ital's utility, but see these as evolutionary. we continue to solicit comments and suggestions on how the journal can better serve your needs.
synchronicity

in our september issue, we initiated a new section, "reports and working papers," in which we reproduce documents we believe deserve a wider readership than their original distribution. we were amused to note a similar innovation in the august bulletin of the american society for information science. we would welcome comments on the usefulness (or wastefulness) of the new section.

standards

standards continue to be a major concern in our field. we hope those of you involved with acquisitions systems will find the communications by sandy paul and jim long in this issue useful. we encourage you to participate in standards development efforts when possible. please try to use developed standards whenever they are applicable to your work. the isbn, san (standard address number), sln (standard library number), and other standard numbers will become increasingly important as our systems become more interdependent in this shrinking world.

a library website redesign in the time of covid: a chronological case study

communication

erin rushton and bern mulligan

information technology and libraries | december 2022 https://doi.org/10.6017/ital.v41i4.15101

erin rushton (erushton@binghamton.edu) is head of digital initiatives and resource discovery, binghamton university libraries. bern mulligan (mulligan@binghamton.edu) is associate librarian emeritus, binghamton university libraries. © 2022.

he found a glimmer of hope in the ruins of disaster…. gabriel garcía márquez, love in the time of cholera

abstract

in november 2019, binghamton university libraries initiated a website redesign project. our goal was to create a user-centered, data-informed website with refreshed content and upgraded functionality.
originally, our redesign plan included in-person card-sorting activities, focus groups, and usability studies, but when the libraries went remote in march 2020 due to the covid-19 pandemic, we had to quickly reassess and adapt our processes and workflows. in this article, we will discuss how we completed this significant project remotely by relying on effective project management, communication, teamwork, and flexibility.

introduction

website redesign projects can be daunting, even during normal circumstances. this article will outline how we accomplished a website redesign project in a reasonable timeframe during the unprecedented circumstances of the covid-19 pandemic.

binghamton university is part of the state university of new york (suny) system. founded in 1946, it has an enrollment of about 18,000 graduate and undergraduate students. binghamton university libraries are an important part of the university, serving as the center of the university’s intellectual community. our website is the libraries’ most important tool for “scaling up” services to our users. it is as important as the physical library and became even more so with the digital demands imposed by covid, particularly the importance of access to streaming video during the pandemic.

a truism in website redesign is that your current website is never more popular than when you take it down. however, our redesigned website was successfully launched to general approbation and is considered a functional and cosmetic improvement over the old website. we were pleasantly surprised how little negative feedback we received. people just started using it, which may be the highest compliment paid to a design team.
as we will highlight throughout the article, we believe that the success of this project was the result of the following:

• a dedicated/functional web team,
• the ability to meet frequently and on a moment’s notice,
• the ability to focus almost exclusively on the project; and
• effective project management.

premise for the redesign

the last time the libraries had fully redesigned the website was in 2013. that project was significant because we had migrated from a locally hosted site to the university’s web content management system, omniupdate. highlights of the 2013 redesign included a fresh look and feel with binghamton university colors, updated site architecture and navigation, and expanded search options from the home page. up until 2013, we had typically redesigned our website on a five-year cycle. as such, we began discussing the possibility of another redesign in mid-2018. we also thought it was an opportune time to redesign the website because the libraries had a new dean, we were offering new services and initiatives, and we had become more mindful of accessibility issues and responsive design.

the web team

the libraries’ web team is the group that leads website redesign projects and maintains the website. the team generally meets as needed between redesign projects. in anticipation of the planned 2019–2020 redesign, however, we began scheduling more regular meetings. our preferred meeting times were friday mornings. one of the prerequisites for these meetings was to take turns bringing bagels and juice. we quickly got to know what kind of bagel each of us preferred. in november 2019, the team consisted of ben coury, bern mulligan, erin rushton (chair), and dave vose.
except for ben, the other members of the web team had participated in several past library website redesigns. ben had recently been hired as the libraries’ digital web designer. he brought high-end programming skills to the team and expertise and knowledge about user experience and accessibility which were integral to the success of the project. it was an advantage to have a small, dedicated, and agile team that collaborated and communicated well. this positive chemistry or esprit de corps among members allowed us to debate any controversial issues professionally, not personally. we internalized the team’s mission and worked single-mindedly toward its successful completion.

binghamton university is a google campus, which meant we had access to the full suite of google apps (e.g., gmail, google calendar, google drive, google groups, etc.). at the beginning of the redesign, we had already begun to use google to create, share, and archive our committee documents.

project timeline

this is the timeline for the project, which took ten months to complete. it occurred basically in two phases: before covid and after covid. much of the work that occurred before covid was completed in person. all of what occurred after covid was done virtually: via email, phone, and, as was the case with most organizations during the pandemic, via zoom.
• november 2019: planning phase
• december 2019–march 11, 2020: in-person meetings with library constituent groups
• march 12, 2020: meeting with student advisory group
• march 17, 2020: covid shutdown
• april 15, 2020: meeting with communications and marketing
• april–june 2020: website architecture and content review
• july 2020: creating templates, migrating content, and designing the home page
• august 2020: final meetings and details
• august 19, 2020: successful launch of the new website

when planning a website redesign project, it is understood that there will be a lot of meetings. as the pace of the project accelerates and as the deadline approaches, so does the frequency of meetings. during the ten months of this project, we met 64 times: 16 in-person meetings from november 2019 to march 12, 2020; 25 virtual meetings from march 17 to the end of june; and 23 virtual meetings in july and august. during july and august until launch, the other aspects of our jobs took a back seat to the project; it’s pretty much all we worked on. we sometimes met twice a day, and that’s not counting the incidental phone calls between individual members for questions about sticking points in the process.

november 2019: planning phase

the libraries’ user interface steering committee (uisc) has oversight for the various public library interfaces, including the website. the committee consists of the web team members and representatives from different departments in the libraries, including public services, technical services, and special collections. the uisc helped us establish the goals and objectives for the redesign project, gave us feedback about ideas during the redesign process, and monitored our ongoing progress. there were four goals that became apparent almost immediately.
over the years, many website development requests had been postponed, so our first goal was to accommodate these improvements and enhancements. and since university communications and marketing, the unit on campus responsible for the entire university web domain, had updated templates since our last redesign, our second goal was to utilize the new templates. our third goal was to create a user-centered, data-informed website. finally, our fourth goal was to address accessibility issues on the website and make it easier for users to navigate.

december 2019–march 11, 2020: in-person meetings with library constituent groups

once the goals had been established, we began scheduling meetings with a variety of library constituent groups. these included preservation, reader services, special collections, and dean’s council. it was important for us to get input from these groups because we wanted everyone to feel represented by the project since the website is the gateway for many of our services and resources. we also wanted the redesign process to be transparent so that everyone knew what was happening. at each meeting, we discussed how the website was working for their area and what improvements they wanted to see. we also provided a snapshot of the website for each area which showed current usage statistics (see fig. 1).

figure 1. screenshot of a website inventory spreadsheet for special collections.

march 12, 2020: meeting with student advisory group

we met with the student library advisory committee (slac) on march 12. this group offered open, two-way communication between the libraries and the student population and provided a mechanism for the libraries to solicit feedback on specific issues as needed. as an incentive for students attending this meeting, we provided free pizza and snacks.
at this meeting, we asked the students a variety of questions, including why they used the website and what they liked and disliked about it. some of the answers we received were surprising to us. for one thing, few of the students began their research from our website. while they had used the website, they were not particularly familiar with most of its features. a second consensus of the group was that if they wanted to know something about the libraries (e.g., our hours), they just googled it. finally, although our ask a librarian service was linked from several places on the website, none of the students in the group had ever noticed it and were unaware of the service. these revelations further informed what we wanted to address in the redesign.

march 17, 2020: covid shutdown

on march 17, the university abruptly closed in-person services due to the growing pandemic. for a few weeks after the shutdown, the priority for all university employees was transitioning to remote work and providing virtual services. once we felt ready to resume the project, we had to reassess how we would continue it remotely. there were two significant challenges: how to continue our committee work as a distributed team and how to continue gathering user feedback given that no one was physically on campus.

we discovered that transitioning into a distributed team worked well for the project. we no longer had to reserve meeting spaces and set up laptops and projectors. instead, we could quickly organize zoom meetings, sometimes on the fly, when we had something we wanted to discuss. and all our committee files were already on google drive. as a result, we were better able to focus almost exclusively on the project and were less impacted by the distractions that often occur in the office environment.
unfortunately, we were unable to conduct any of the focus groups with teaching faculty or the “guerrilla” usability testing with students that we had planned. however, we were glad to have had the in-person meetings with some of our constituents before the work-from-home phase because we had established a good foundation of what we needed to accomplish in the redesign.

april 15, 2020: meeting with communications and marketing

one of the first groups we met with virtually was university communications and marketing (c&m). as mentioned above, this is the unit on campus that is responsible for omniupdate, the university’s website platform. the purpose of this meeting was to discuss our plans and timeline and to clarify the role of c&m in the project. although we now had our own web designer in ben, we knew that c&m had oversight over the entire university web presence and would have to decide whether our redesign fit in with the rest of the domain in terms of its appearance and accessibility.

april–june 2020: website architecture and content review

the next significant part of the project was reviewing our current pages. during preliminary planning the year before, we had a student list all pages of the libraries’ website in a google spreadsheet which also included google analytics from the three previous years. we literally spent hours poring over this document (see fig. 2). it helped us identify which pages would migrate mostly as is, which pages would need additional review, which pages would be converted to libguides and vice versa, and which pages would be deleted.

figure 2. screenshot of a spreadsheet listing all library web pages.
originally, we had planned to do a physical card sorting like we had done in past redesigns. we had even created the cards and had scheduled an all-day card-sorting event, complete with pizza. but covid changed all that. since we were meeting virtually, we had to think of another way to accomplish this. a breakthrough occurred when ben introduced trello, an online collaboration and project management tool that allowed us to work together on the new website’s architecture in real time via zoom (see fig. 3).

figure 3. screenshot of a trello board.

we recreated the six existing main navigation categories and created “cards” for each web page. trello made it easy to move these virtual cards around into the different categories. we had received feedback that six categories were perhaps too many for users to choose from. we spent a number of meetings in may discussing/debating what the new navigation categories should be and where pages fit under them. we decided to fold the locations & collections and special collections sections into about, and search & find into research, because we felt these changes made more logical and functional sense. we added a top-level link to my account because this was also something that users had suggested should be more prominent.

another aspect of the libraries’ website redesign project was the content of libguides. initially, they were meant to be subject guides, but over the years some of our web pages were converted into libguides. as part of the redesign, we worked closely with the collections and instruction department to make decisions about where content should be located. most libguides are now subject or research guides while descriptions of libraries’ services are web pages.
july 2020: creating templates, migrating content, and designing the home page

by the end of june, we had accomplished the following:

• met with available constituencies,
• identified what content we needed to migrate,
• identified what content needed to be created; and
• had a new website architecture.

at this point, we felt we needed to focus on the actual migration of content and the design of the home page. on july 2, we met virtually with collections and instruction to discuss how the search box on the home page would look and function. it was no mean feat coming to consensus on this. we were on a tight schedule, but we wanted to be sure that we heard everyone’s ideas. taking all feedback into consideration, ben then had the ultimate challenge of coding and designing a functioning search box.

for the last three weeks of july, we met daily, and sometimes twice a day, as we began the daunting process of migrating all the lower-level pages. we were definitely feeling the pressure to complete the project on time. since we were using the new university templates, designing the lower-level pages was relatively straightforward. ben had customized the template and provided us with a migration guide. the guide included instructions on how to create and format new pages. this allowed the other members of the web team to migrate content while he focused on the more complicated aspects of the project.

all migrated content was reviewed to ensure that it was current. for some pages, this required input from several departments. to facilitate the updating, we copied the content of every page we were migrating into a google doc to allow for collaborative editing. once the content was updated, it would be copied and pasted into the new template. the screenshot in figure 4 represents the redesign for most of the lower-level pages.

figure 4.
screenshot of a new lower-level page.

the most significant changes were the relocation of the navigation column from left to right and some new features that were included in omniupdate, including the ability to add contact information. we had to do some formatting, such as adding headings and hyperlinks, and some metadata work, such as creating page descriptions and keywords. we also had to pick a photo for the banner of each page. we had planned to have the university photographer take new photos, but because there were no students and hardly any staff on campus, we had to rely on pictures of our spaces he had already taken.

we were tracking the migration on a spreadsheet (see fig. 5). this document contained the new architecture of the website and had links to the google docs mentioned previously. the spreadsheet also noted who was responsible for reviewing the content and the status of each page.

figure 5. screenshot of the spreadsheet used to track the migration of webpages.

while this mass migration was taking place, ben focused on creating the pages that required additional coding and customization such as the ask a librarian page, the staff directory, and library tutorials. he also worked on the design of the home page. one of the tools ben used to help with the mock-up was adobe xd (see fig. 6).

figure 6. screenshot of initial design of home page in adobe xd.

adobe xd is a design tool in the adobe suite used for prototyping user interfaces. ben created an interactive wireframe for the home page and landing pages. this allowed us to discuss the interface and make changes without a lot of time spent on mock-ups.
august 2020: final meetings and details

we met with the user interface steering committee on august 5 to get their input on the redesign. on august 7, we met with communications and marketing again to go over our design and get final approval for our home page. we also previewed the site to library people at an all-staff meeting later that day. the final week before site launch was dedicated to the special collections pages that had embedded links for finding aids and to working on the ask a librarian page. and the last day was spent going through and making sure everything was in order before launch.

august 19, 2020: successful launch of the new website

we launched the redesigned website on august 19 (see fig. 7). some of its new features include

• a large “hero” photo which can easily be changed depending on what the libraries want to promote;
• a redesigned search box;
• popular links featured on the home page;
• a news section which pulls from our blog software and allows for automatic updates; and
• a prominent, visually attractive place on the home page for special collections since it was bumped from the top navigation bar.

figure 7. partial screenshot of the new library home page.

conclusion

the process that we had initially envisioned did not work out because of the covid-19 pandemic. although we did meet with our student advisory group, we never got to hold any focus groups with teaching faculty or usability studies with students. but thanks to zoom and other online tools, we were still able to gather some user feedback. we also had a new website ready before fall 2020.
while there was certainly unanticipated stress in tackling a project like this in the middle of a pandemic, we felt that working remotely in some ways helped us to be more productive. we were better able to focus almost exclusively on this project and were less impacted by the distractions that often occur in the office environment. we also felt that the quick adoption of zoom made us more agile about scheduling and holding meetings.

despite some of the challenges that we faced throughout the project, the redesigned website is a success. since the launch, we have made a few minor changes to the overall architecture of the site. the most significant change was adding a giving link to the navigation menu at the request of our dean and the binghamton university foundation. our library website is never static, as we continue to update our home page with news and events and change our hero banner to reflect the priorities of the libraries. while we have no plans for another major redesign in the near future, we are open to making changes and improvements as needed.

bibliography

becker, danielle a., and lauren yannotta. “modeling a library web site redesign process: developing a user-centered web site through usability testing.” information technology and libraries 32, no. 1 (2013): 6–22.
buell, jesi, and mark sandford. “from dreamweaver to drupal: a university library website case study.” information technology and libraries 37, no. 2 (2018): 118–26.
wu, jin, and janis f. brown. “website redesign: a case study.” medical reference services quarterly 35, no. 2 (2016): 158–74.
zhu, candice. “website makeover: transforming the user experience starting from scratch.” computers in libraries 41, no. 6 (2021): 21–6.
article

black, white, and grey: the wicked problem of virtual reality in libraries

gillian d. ellern and laura cruz

information technology and libraries | december 2021
https://doi.org/10.6017/ital.v40i4.12915

gillian d. ellern (ellern@email.wcu.edu) is associate professor and systems librarian, hunter library, western carolina university. laura cruz (lxc601@psu.edu) is associate research professor, schreyer institute for teaching excellence, the pennsylvania state university. © 2021.

abstract

this study seeks to extend wicked problems analysis within the context of a library’s support for virtual reality (vr) and the related extended reality (xr) emerging technologies. the researchers conducted 11 interviews with 13 librarians, embedded it staff, and/or faculty members who were involved in administering, managing, or planning a virtual reality lab or classroom in a library (or similar unit) in a higher education setting. the qualitative analysis of the interviews identified clusters of challenges, which are categorized as either emergent (but solvable) such as portability and training; complicated (but possible) such as licensing and ethics; and/or wicked (but tameable). the respondents framed their role in supporting the wickedness of vr/xr in three basic ways: library as gateway, library as learning partner, and library as maker.
five taming strategies were suggested from this research to help librarians wrestle with these challenges of advocating for a vision of vr/xr on their respective campuses. this research also hints at a larger role for librarians in the research of technology diffusion and what that might mean to their role in higher education in the future.

introduction

political scientists horst rittel and melvin webber coined the term “wicked problems” in the early 1970s to refer to problems that were sufficiently complex that they defied conventional problem-solving methods.1 initially framed as broad social problems, such as food security or climate change, wicked problems are characterized by having ambiguous parameters, shifting requirements and/or stakeholders, and, perhaps more importantly, “no determinable stopping point.”2 such problems are called wicked because they are “diabolical, in that they resist the usual attempts to resolve them.”3 without the possibility of a clear solution, the end product of wicked problems analysis is not to solve the problem but rather to find ways to “tame” it, an approach which runs counter to conventional models of not only planning but also reasoning.4

if taming is the last step in wicked problems analysis, a critical first step is to determine if a given challenge is, in fact, wicked, as that will then determine what tools, perspectives, and strategies will need to be brought to the table.
simple problems can be resolved by matching them to known solutions, more complicated problems may be addressed by analyzing engineering solutions, but super complex/messy/wicked problems require an entirely different mindset.5 persistent frustration with the limitations of conventional problem-solving models has led to a proliferation of studies identifying a host of wicked problems, ranging from the global (covid-19 response) to the local (dysfunctional families).6 the present study seeks to apply the framework of wicked problem analysis to the question of the role of academic libraries in supporting emerging technologies, using the integration of vr/xr as a case study.

literature review

the wicked problem of libraries and technology has been recognized by a number of scholars, each using a different frame of reference, as perhaps fits the inherent ambiguity of a wicked problem. scholars have identified electronic data management, research data management, and ebooks as library problems that are wicked in nature, and howley notes that the question of public access touches on larger social issues that could be described as wicked.7 a recent article by williams and willet identifies makerspace technology as boundary work, suggesting that it challenges conventional roles and relationships held by libraries and librarians, an approach which implies the existence of a wicked problem.8 despite these exceptions, at least one set of library scholars has noted that “there are very few applications [of wicked problems] in librarianship.”9 the present study seeks to make the case that the application of the wicked problems framework to the question of the role of the libraries in emerging technology can illuminate new strategies, roles, and pathways forward.
while research on wicked problems in libraries may be limited, the role of libraries in the curation, development, and dissemination of virtual reality (vr)—or using the more encapsulated term of extended reality (xr)—has been extensively written about by library scholars. although it could be argued that the current output reflects the nascent stages of vr/xr as a research field as scholars explore a library’s role with virtual reality (vr), mixed reality (mr), augmented reality (ar), and everything associated with them such as virtual worlds or 3d 360-degree videos, it is clear that, to date, the published works about vr/xr largely fall into two camps: the visionary and the applied. the former contains studies advocating for the integration of vr/xr (and related technologies) as part of a vision of the future for libraries; and the latter are applied studies that booth labels as “technorealistic.”10 in other words, these are descriptions of established practice or suggestions of practical strategies for how a library (or librarian) can actually implement a vr/xr lab or related program.11 what remains in shorter supply are critical and/or empirical studies that situate the development of vr/xr as an institutional capacity into larger, arguably wicked, questions of the evolving purpose and position of libraries. the case of vr/xr presents a distinctive perspective on the wicked problem of the technological orientation of academic libraries. unlike issues such as electronic records management, vr/xr is not part of the core technological infrastructure of a library, nor does it touch directly upon prior core administrative functions, such as collection development or access services. rather, it is perceived as an extension of library services, particularly those related to the evolving educational mission of the academic library and its role as a broader facilitator of information literacy across disciplines. 
as one library scholar remarks: “as libraries are increasingly called upon to support knowledge exchange beyond traditional books and journals, the creation of novel types of research infrastructure will shape the preservation and access expectations of constituents.”12 the present study looks at how librarians navigate, or tame, the myriad of challenges that arise not just from rethinking how an academic library engages with technology, but from pushing the boundaries of what library work is (or could be).

as the emergence of vr/xr technology begins to cast a larger shadow over higher education, many librarians have argued that academic libraries associated with institutions with high research activity are especially well situated to take on a leadership role, an opportunity that they had largely missed with recent related technologies such as 3d printing. not wanting to be left behind, these libraries have embraced vr/xr technology at a relatively rapid rate. a recent (unpublished) study noted that in 2015, only 3% (n=4) out of the 125 sampled research universities had a vr/xr presence; by 2020, that percentage increased to 66% (n=77), a rate which appears to be outstripping that of technology competitors such as gis, institutional repositories, and data visualization services.13 given the relatively high up-front resource investment required to support vr/xr, it would appear that many university libraries are doubling down on the prospect that vr/xr will be an integral part of their future. the degree to which the rapid adoption of vr/xr will live up to its promise remains to be seen, but the present study seeks to illuminate how current librarians are seeking to tame this potentially savage beast.
methodology

this irb-approved study is based on the qualitative analysis of eleven interviews with thirteen librarians (8), embedded it staff (3), or faculty members (2), all of whom were involved with the adoption of vr/xr technology at their respective libraries. the inclusion criteria for the study were described in the consent document as those people “currently involved in administering, managing, or planning a virtual reality lab or classroom in a library (or similar unit) in a higher education setting.” to identify potential participants, the researchers conducted a web search using the terms “library” and “virtual reality” or “vr” and then utilized a snowball sampling method to generate a list of potential interviewees that included multiple library types (e.g., academic research libraries [arls], public libraries) as well as institutional types (e.g., community colleges). one large library had multiple participants including one librarian and two support staff responsible for the vr room.

taken collectively, these participants’ institutions included community colleges (3), public libraries (2), medical libraries (4), and academic research libraries (4), located in either the united states (10) or canada (1). the pool of the us educational institutions (10) represented five different carnegie classifications: associate’s colleges, doctoral universities, doctoral/professional universities, master’s colleges & universities, and special focus four-year institutions. these comprised a mix of small-, medium-, and large-sized institutions (by full-time enrollment, or fte). all the organizations in this study (11) were public institutions.

each interviewee received a copy of the possible interview topics in advance, including a list of potential challenges faced by libraries seeking to integrate vr/xr (see appendix a).
the list of challenges was crafted from a literature search, as well as the personal experience of one of the researchers, a librarian who oversees a vr lab. each hour-long, semi-structured interview was conducted via zoom, machine transcribed with kaltura, and further edited manually by the researchers. the transcripts then underwent three rounds of coding. first, the researchers independently reviewed the body of transcripts in their entirety and identified emergent themes. in the second round of coding, potential themes were merged into semi-structured coding guidelines, which were used to code each interview separately. in the third and final round, the themes were re-evaluated and adjusted based on feedback from the previous rounds, leading to the identification of a problem-based typology (emergent, complicated, wicked). from our process, we gained insight into a myriad of challenges facing libraries as they work to integrate vr/xr into the work that they do. that insight has, in turn, led to the development of a conceptual framework that we believe will be useful to others seeking to wrestle with these challenges in the future.

table 1.
equipment, staffing, and funding for vr/xr spaces in participating libraries

location of vr service in library | number of pcs connected | number of mobile headsets | types of pc headsets | types of mobile headsets | staffing | one-time funding | continuing funding
room | 2 | 10 | htc (vive and pro) | oculus go, spectra vr | 2 staff | yes | no
entrance | 1 | 0 | oculus rift sv | | 2 staff | yes | as needed
room | 4 | 8 | htc vive pro | oculus quest and microsoft hololens | 1 staff, 3 students | yes | no, but planned
room | 5 | 1 | oculus rift s and htc vive pro | oculus quest | 3 staff, 3 students | yes | as needed
room, mobile vr | 1 | 3 in circ, several in office | htc vive cosmos | oculus quest, oculus go, samsung odyssey, lenovo mirage solo, hololens, playstation vr and google cardboard | 2 staff, 8 students | no | as needed
mobile vr, entrance | 2 | 6 | htc vive, oculus rift | oculus go, oculus quest | all circ staff (2/3 per shift) | yes | no
room | 3 | 7 | oculus rift, htc vive pro, htc vive standard | google cardboard, insignia vr viewers | 2 staff | yes | no, but planned
mobile vr, entrance | 4 | 30 | htc vive, oculus rift | oculus quest, google cardboard or plastic viewers | circ staff at each of 4 locations | no | yes

results

library vr/xr spaces

even within the relatively small sample of institutions included in our study, we found that there was a fairly wide range of practice regarding vr/xr library labs, with considerable variance in location, number and manufacturer of headsets, staffing, and funding, as seen in table 1.

challenges

through our coding process, we identified clusters of challenges, which we categorized as emergent (but solvable), complicated (but possible), and/or wicked (but tameable).

emergent challenges

our respondents identified a number of challenges that are frequently associated with the adoption of emergent technology, regardless of who is choosing to adopt it or what they are choosing to adopt.
in other words, any person or place adopting xr (or other types of emergent technology) at this stage of its development is likely to run into similar issues.

portability and mobility

portability (or lack thereof) was frequently referenced as a limitation of the current technology. the most common headsets purchased for the first generation of library vr lab spaces have physical cords and sensors that have to be plugged in (to high-performing computers) during use. one intrepid librarian even described carting around her bulky alienware desktop computers and video displays between campuses, but needing to find a better way because, “it made the computer folks very angry because it’s so delicate and our sidewalks are so bumpy.” she now uses an alienware laptop and some sturdy tripods (for the base stations) on these trips. the lack of portability not only limited the ability of libraries to take vr/xr out of the library for events and in-class presentations, but it also exacerbated existing space constraints, with users having to be literally tethered to cpus, screens, and base stations. as one of our respondents put it, “the biggest issue is that it’s in one place and it’s stuck there.” in this case, manufacturers are aware of the limitations on mobility, and it appears as though wireless headsets will be the next wave of adaptation by the industry. several wireless headsets have already come and gone, as vendors continue to work to overcome both technological and human-centered challenges. the oculus go, google cardboard, and google daydream have all been brought to the market and subsequently been discontinued.14 only one of the libraries we spoke to indicated that they had purchased a wireless headset, and that headset (the oculus go) turned out to be of limited utility.
while this next generation of headsets will likely solve a number of operability issues, it also has the potential to compound another challenge noted by most of our respondents, i.e., a lack of sustainable funding for equipment refresh. the majority of our respondents (6 of 8, or 75%) indicated that they purchased their equipment through one-time funding sources, whether internal or external grants (n=6), end-of-year funds, or some combination of these.

vr/xr training

vr/xr experience remains new to most people outside of the gaming world, so it has fallen largely on librarians to develop introductory training protocols at the level of access to the technology. there are distinctive challenges in introducing vr/xr to a broader audience. some of those challenges may be physical. in the technology’s earlier stages, a number of users experienced symptoms such as nausea or seasickness, and while these have been lessened with higher refresh rates and movable lenses, other virtual reality induced symptoms and effects (vrise) continue to emerge with studies of longer-term use.15 two of our librarians expressed concerns that other long-term effects may still be unknown, and both of the public libraries included in this study banned vr/xr use for patrons under 14 years of age until more is known about how it affects developing brains, a recommendation that is now supported by most vendors as well. even for those who do not suffer from physical symptoms, the technology can be disorienting and uncomfortable. this contributes to higher levels of anxiety, which, somewhat ironically, vr/xr has been shown to alleviate in some clinical trials.16 for these reasons, vr/xr labs require staffing not just to safeguard the equipment and ensure its appropriate use, but also to coach users through their new experience.
as one of our respondents described her experience, “a lot of people will put on the vr headset and not move because they’re used to computer displays being two-dimensional … it is not common knowledge yet that you can move around and this environment [moves] with you. and they [new users] will just stand there.” coaching someone to move around in a virtual reality environment is not a straightforward endeavor either, as one of our librarians relates: “how do you interact with somebody who can’t see you in a way that’s respectful? because that can be kind of disconcerting if you’ve got a headset on and all of sudden somebody touches your hand?” one of our respondents drew upon her experience as a swimming coach to develop a set of “non-touching” verbal protocols for her student lab assistants to utilize in working with clients who are new to the interior mobility of virtual worlds. other challenges identified by our respondents that might fit into the emerging technologies category include the following: liability, aspects of licensing, physical space modifications, room and equipment management, training curriculum, logistics of engaging with multiple users, availability of apps/games, equipment installation, and evaluation procedures. this list could perhaps also include the need to not only educate patrons on what the emerging technology can do but to advocate for its future significance. as one of our respondents stated, “i think you can write about it and speak about as much as you want. it’s a matter of getting them in there.”

complicated challenges

unlike emergent issues, complicated challenges are unlikely to be resolved without concerted intervention and leadership and, even then, it is possible that a single or clear solution may not be readily identified. challenges that fall into this category may be described as grey areas, in which future directions remain scattered, unclear, or uncertain.
embracing these complexities means that libraries looking to adopt vr/xr currently must be willing to venture out on their own, embracing both the opportunities and the risks inherent in forecasting future technology use.

licensing

an example of one of these complicated challenges that emerged from our interviews is the issue of licensing. many vr/xr titles are available for free, through services such as steam and the oculus store. all of our respondents indicated that they acquired content via these services. other popular vr/xr academic titles, such as 3d organon anatomy and google tilt brush, are licensed and potential users must pay a fee to access the full functionality of the tool. the challenge is that these licenses are mostly offered on an individual basis (“for home use only”), a reflection of the primary customer base for vr/xr content creators, e.g., gamers. a number of distributors do offer institutional licenses, but these are primarily for use in companies, with a relatively stable and readily identifiable list of employee users or limited number of stations. some vr/xr distributors offer a lesser-known (and less used) license known as an arcade license (e.g., steam pc café), but the prices are determined based on the assumption that the person renting the software for use will be receiving a fee, an assumption which does not work for libraries, which do provide arcade-like services but do so free of charge. in other words, none of these available license types are well-suited for library use; the former too limited, the latter too expensive. as one of our librarians suggested, this is the “sort of the crack that libraries fall into a lot of the time anyway, with regard to [issues such as] document delivery, right, [in which the rule is stated], but it probably doesn’t apply to us in the same way because we’re a library.
but it doesn’t explicitly say what i need to do about it.” what this means is that the majority of the librarians we spoke with indicated that they adopted one or more of these license types, but there was discomfort with the maladaptation to library practice and uncertainty as to what might constitute a best practice in the current market space. in the case of vr/xr, this state of affairs is likely due, at least in part, to a lack of awareness of or concern for libraries (or educational labs) as customers on the part of vendors. our respondents indicated that this oversight may be changing, however, as four of the librarians we interviewed reported that game developers reached out to them and negotiated deals in which libraries would receive equipment in exchange for beta-testing new titles with student populations. that said, awareness does not equate to priority, as one of our respondents noted, “i am concerned that we will be one of the last audiences that get some consideration in terms of the functionality that meets the library’s needs.” even if these issues are resolved in the context of vr/xr specifically, it seems likely that the complicated problem of “library as customer” will persist with the advent of new technologies and new technology providers.

ethics

the challenge of vendor relationships is compounded by other emergent ethical issues surrounding the integration of vr/xr into the library. several of the ethical concerns raised by our interviewees are connected to broader social concerns with technology use, such as issues of privacy and security, and others are related to long-standing ethical debates within libraries, such as the degree to which content should be limited by the library. our interviewees had divided opinions, for example, on whether or not the vr/xr lab should offer games. on one hand, the availability of games brought students to the library and engaged them with the new technology.
on the other hand, the provision of games constitutes, for some stakeholders, a potentially significant shift away from an academic or scholarly mission for the library. as one respondent put it, “i can’t say that libraries have traditionally not been a place for people to have fun, but i think that’s something that … rubs some people a little bit the wrong way.” another stated, “my big concern at the beginning was that we would put this in and people would [say] … that’s for video games. why did the library buy video games?” the question of including popular content should be a familiar debate to librarians, but the issue is ratcheted up a notch when engagement may also include actions, such as shooting, that may be especially sensitive for college campuses. as one interviewee reflected, “we are a university in the south. and if you had a bunch of white male students that love to go play this game, is that going to make somebody from another group feel uncomfortable or unwelcome or feel like this is not a space for them?” as this example implies, unlike the often private act of reading, vr/xr experiences often take place in virtual places that are at least quasi-public, a venue for which few ethical precedents exist (yet). conversations on the legal and ethical implications of fully virtual crimes, such as rape or robbery, for example, constitute a lively, but so far unresolved, scholarly conversation.17

wicked problems

where the challenges faced by libraries get most complicated, however, is when the integration of vr/xr touches upon the more fundamental question of the appropriate roles for libraries in the digital age.
our respondents framed their vr labs and services largely within existing roles, e.g., gateway or learning partner, with some attention to emerging roles, such as maker, but they also acknowledged that this adaptation was awkward, solutions were (often) makeshift, and anomalies persisted. this suggests the potential for paradigm shifts in the role(s) libraries can play in shaping the intersections of knowledge between the “real” and virtual worlds.

library as gateway

a number of our respondents connected the library’s adoption of vr/xr technology to its role in providing access to technology for those who may not otherwise have it. this role was especially pronounced in the case of academic libraries located in public universities and public libraries serving a defined community. as one of the respondents described their role, “we’re pleased to have them come and learn how to use these technologies because they’re new and we’re trying to make it more democratized that students can come and use it. they don’t have to pay for it. they don’t have to worry about like a lab being locked away from them.
they can come in anytime there’s a staff member and use the stuff where here it will provide them tutorials and instruction if they want to use it.” similarly, another respondent stated, “libraries … offer an entry-level kind of way to engage with this technology in a free way where anyone who is even remotely curious, even if it doesn’t have anything to do with … anything academic, can engage with this stuff.” a third respondent stated that the case they made internally (to their library colleagues) was “to explain the importance of the library philosophy of having equitable access to resources … books are a resource, but technology is also a resource.” we have characterized this role as a gateway, rather than strictly as an access issue, because it also encompasses a vision of a pathway, one which starts in the library but may continue to other places, whether specialized labs in the discipline, in the workforce, or as part of their everyday lives. as one of our respondents put it, “we’re very much about these technologies. they’re here; they’re coming; they’re going to be a big thing soon. and we want our students to know what they are and be comfortable with them. so, we try to position ourselves as a place where they can start learning.” this gateway function is, however, characterized by competing stakeholders, both inside and outside of the library—a defining characteristic of wicked problems. this latter is perhaps best illustrated by looking at issues of accessibility. as the statements above attest, librarians see one of their primary service roles as providing access to technologies such as vr/xr to people who might not otherwise have it. that same sentiment, though, can be flipped on its head when taking other aspects of accessibility into consideration. most vr/xr programs are not ada compliant, whether they are being offered in the physical or virtual public spaces of the library. 
in its current form, vr/xr is an inherently visual technology, so those who are visually impaired cannot utilize it to the same extent as others. most vr/xr programs require physical movements that may not be possible for those with limited mobility. our librarians have created a few hacks, or workarounds, to provide short-term accommodations for individual students (e.g., a verbal narration of visual interactions), but generally speaking, the technology is not fully accessible.

library as learning partner

several of our respondents indicated that they saw the library’s adoption of vr/xr technology as an extension of their role as partners in the learning enterprise. this role could be conceived directly, in that the librarian mediates between classroom needs and available vr/xr titles and capabilities. this form of direct mediation could be responsive, i.e., identifying options in response to requests received, or proactive, i.e., identifying options and then reaching out to faculty who might wish to avail themselves of them.

integrating vr/xr material into the library’s ils

this role is especially critical at this stage of vr/xr development, as none of the libraries we spoke to had integrated the available titles into their online, public-facing catalogs or integrated library system (ils). in other words, if a patron wants to know what titles are available, the best way to find out would be to ask the librarian directly and/or visit the vr/xr lab in person. as one librarian put it, “there’s not the infrastructure or the architecture we have around a book.
if you were, say, a student in a history class and you wanted to study this thing, there’s no way to discover that as part of the more general resources of the library.” several of our respondents were developing workarounds, such as libguides and web-based directories, but none of these would be accessible through a general search of the library catalog or citation databases. determining how to catalog and/or curate vr/xr artifacts may be challenging and time-consuming, but it is a problem that has an eventual solution. what is less clear, however, is what the long-term role of the library may be beyond this cataloging function. our respondents consistently indicated that this remains one of the lesser-developed roles for vr/xr in the library, and many identified raising faculty awareness as an especially high priority. while several identified this as essentially a “marketing problem,” it would appear that the challenge extends more deeply. many librarians do not have additional degrees in either educational development or instructional design, which encompasses the practice of matching learning outcomes to technology tools. the two most successful examples of matching learning outcomes to library-based vr/xr that we heard of were faculty driven: one a project to scan actual human body parts for use in a vr setting, the other a criminal justice project related to empathy education using virtual encounters. these kinds of alignment activities can only occur if there is a tool available to match the proposed learning outcome. most of our respondents lamented the limited availability of titles that are appropriate for use in academic settings, so even if awareness were raised, there may not be sufficient content to meet academic needs. as one librarian suggests, “students will say, i’ve seen the anatomy tool, but right now i’m taking chemistry or i’m taking genetics. do you have anything that will help me with that? i’m a visual learner. i really liked this format.
and that’s been challenging for because it’s so new. there’s not a coherence in terms of the titles and subject areas that you get.” and another characterized the issue this way, “it’s like the bargain video bin at walmart. sometimes you have to dig through to find something because it’s just, it’s so new right now.” the issue of availability may seem like an emergent technology issue (as above), but the challenge is further compounded by limitations on capacity, as most library vr labs can only hold one class at a time, and even then, the numbers may be limited, necessitating workarounds such as rotations, remote screen-casting, or extended office hours. even with multiple headsets, most of the time students cannot be in the same virtual reality space together. despite these challenges, many of our respondents were focused on optimizing current capacities, at least in part because of pressure to justify the continued expenditure of both personnel time and equipment costs. this precarious state of affairs is reflective of both tightening university budgets and the frequent presence of internal sources of resistance from more traditionally minded colleagues within the library itself (noted tactfully by three of our respondents). bearing all of these factors in mind, it would seem that the question of the long-term sustainability and scalability of vr/xr as a learning service for libraries remains unresolved.

library as maker

there may be another way to frame vr/xr in the context of libraries. in several cases (n=3), our respondents framed vr/xr not as an extension of classroom-focused service, but rather of support for the research enterprise. as one of our respondents described it, “if they’re still working on a project and they need a thing for this academic project.
and then we’re just providing a new way to provide that service, closing some of the research cycle loop, that we’re now part of a different part of that same loop of creating things.” this is a reflection of the changing nature of outputs from scholarly research. where outputs were previously confined largely to print artifacts, e.g., peer-reviewed journals, researchers now face an increasing number of choices when it comes to ways to represent the scholarship being created, e.g., knowledge artifacts. this can include artifacts created in, through, or with vr/xr. several of the librarians (n=4) we spoke to mentioned that their vr/xr lab came packaged, in a sense, along with their 3d printing stations. in each case, the librarians noted that the utility of the 3d printers had resonated more readily with library users, and two indicated that they had aspirations to link the two processes in an effort to boost interest in the vr/xr space. for example, one respondent indicated that they wanted users to be able to create an object in a vr/xr program, such as google tilt brush, and then print their creation on an associated 3d printer. libraries have long provided non-3d printing services, largely as ancillary services to support researchers, so this example may, at first glance, appear to be simply a slightly more high-tech version of a pre-existing service. these made objects, too, could potentially be stored, cataloged, and disseminated through the library system and/or via a dedicated database such as sketchfab.com. in our interviews, however, the respondents hinted that this linkage (between vr/xr and 3d printing) may actually be a first step towards a more fundamental shift in re-imagining the role of the library vis-à-vis technology. rather than functioning primarily as service providers, emerging technology librarians have the opportunity to become more active (co-)creators of content and facilitators of change.
in one case, the vr/xr lab director, also a faculty member, developed partnerships with strategic programs on campus, such as the office of admissions, to generate original content that was specific to their institution. fortunately, the faculty member was able to draw on coding skills she had gained in prior professional roles. in another case, the library partnered with an external developer to generate original content with direct relevance to the community, a project that served to generate interest in the library, vr/xr, and local issues, all at the same time. there is a fundamental difference between a library hosting a maker space and becoming a maker itself. while librarians have traditionally characterized themselves as facilitators of knowledge rather than knowledge creators, there is some evidence that this shift may not be quite as profound as it might appear. this shift began with libraries and librarians scanning and digitizing items from their siloed special collections and archives. the resulting databases are often treated as published works in and of themselves, with the library acting as curator and publisher. in addition, librarians currently hold faculty rank at many research universities and actively present and publish in library-focused journals, thematic journals (e.g., information literacy), and other venues, often alongside faculty partners.18 the embedded curricular model places librarians in the role of learning designers and as creators of extended, discipline-specific content. it should be noted, too, that content development is not the only “creator” role available.
when you build a knowledge management system (like a library catalog), the choices you make serve not just to organize knowledge, but also to shape that knowledge and, yes, create physical and cognitive pathways to and through it.19 it is perhaps not a coincidence that identifying pathways has been identified as a signature taming strategy for wicked problems.

discussion: taming wicked problems

our study frames the adoption of vr/xr technology by academic libraries as embedded in the larger wicked problem of library reinvention in a digital age. that said, one of the fundamental characteristics of a wicked problem is not that it is very difficult to solve, but that it is intrinsically unsolvable (or nearly so). this may explain why the question of libraries and technology seems to be a conversation that never goes away, as the question involves a perpetually moving target, embedded in the ever-shifting social, economic, and political dynamics that are taking place well beyond the walls of any library.20 this characterization does not mean, however, that we should not keep trying a variety of strategies to untangle these wicked knots.

taming strategy 1: embracing wickedness

in a recent essay about learning in higher education, randy bass characterized the wicked problem designation as potentially liberating, rather than discouraging. embracing wickedness serves to move the conversation from thinking of libraries as broken or backward (and therefore, in need of solutions), to a view of the question as a grand challenge, a continual thought experiment that requires ongoing inquiry, thoughtful consideration, and an expansive, rather than reductive, perspective.21 as a grand challenge, the question of libraries and emerging technologies such as vr/xr becomes less of a mad scramble to maintain relevance and more of a scholarly conversation that enhances the role of the library as an inclusive and pluralistic space.
in this framework, the question of whether or not a library should embrace a new technology or technology-related service is not bounded by the intrinsic qualities of that technology itself, nor does it mean that libraries everywhere will need to land upon the same, or even similar, technologies; rather, they might seek convergence in the role of libraries as tamers of these wicked problems.

taming strategy 2: integrating adaptability

the librarians we spoke to generally described their units as falling under the category of “early majority” in rogers’s well-known diffusion of technology model, in that they wanted to see evidence that vr/xr will be useful to others before committing their resources, but they also wanted to serve a gateway role in introducing promising new technologies to their patrons.22 much of the research on technology diffusion, however, has focused on either end of the curve, i.e., the innovators or non-innovators, and comparatively less research has been done on the role played by those in the middle, such as these libraries.23 by positioning themselves as early majority adopters, academic libraries would potentially be able to articulate a clear and distinct role for themselves vis-à-vis other units within the university that support technology-enabled learning, while also giving themselves the ability to leverage more resources outside of the library itself. the model also has the advantage of providing a sustainable model of re-invention. as a given technology matures along the continuum, the library’s role recedes, enabling it to embrace the next emerging technology. as one of our respondents pointed out, their library used to give training on how to use a mouse and, one day, gateway training for vr is likely to go the same route.
taming strategy 3: building networks

because wicked problems are complex and ill-defined, taming them is often done by connecting to others with different perspectives.24 our respondents were largely emerging technology librarians who used a number of on-the-ground strategies to tame the wickedness of the task of advocating for a vision of vr/xr on their respective campuses. most of these strategies required creating relationships beyond the walls of the library, e.g., building organizational networks; connecting to community organizations; developing joint, shared, or embedded positions; cultivating faculty champions in academic units; and initiating shared programming. these collaborative strategies resonate with another characteristic of wicked problems, i.e., that they require the ability to think across conventional organizational and disciplinary silos.

taming strategy 4: exercising interdisciplinary imagination

and what other role at a university has more experience with this kind of intellectual dexterity than a librarian? our respondents mentioned working with faculty from 14 different disciplines in the context of their responses to our interview questions, and that’s without being asked. as higher education increasingly shifts its attention towards addressing wicked problems, librarians may be well poised to serve a gateway role in modeling, supporting, and conducting what is now being called “convergent” research.25 this has been described as transdisciplinary inquiry that integrates knowledge from multiple data sources, disciplinary perspectives, and lived experiences in order to confront the world’s most complex problems.26

taming strategy 5: modeling

as learning partners in the future of higher education, librarians will have a role to play in developing our students’ abilities to tame these same wicked problems.27 this partnership is not limited to the kind of information and digital literacy needed for cross-disciplinary research.
taming wicked problems requires not just a specific set of knowledge or skills but a certain disposition, e.g., a willingness to engage in answering seemingly impossible questions; the flexibility to find pathways through those challenges; the ability to persevere through short-term setbacks; and, above all else, the motivation to support the ability of others to flourish.28 this same set of wicked qualities could easily be applied to all of the respondents in our study, each of whom has succeeded because of a deeply held, intrinsic passion for (and commitment to) the possibilities for what technology and libraries can do together.
conclusion
the library remains a model of not just individual, but also organizational resiliency. as new technologies such as vr/xr arise, the library as an institution will find ways to weather emerging challenges, resolve complicated problems, and disentangle super complex, i.e., wicked, dilemmas, each of which requires the cultivation of distinctive knowledge, skills, and dispositions. in this study, we argue that the strategies associated with wicked problem solving can serve to strengthen the ability of libraries (and librarians) to serve an active role in our collective future, whether that future is “real” or virtual (or both).
appendix a – email to interviewees
subject: we are interested in your experience with virtual reality at your library/university/college
a research study interview request
hi invitee,
you are being invited to participate in a research study of how universities navigate the integration of virtual reality labs. you were selected as a possible participant because of your experience in managing or implementing such labs. your participation entails a 45–60 minute interview, conducted through zoom.
we will be especially interested in how you, your library, or your university navigated one of the following “grey areas” where a situation is ill-defined or not readily conforming to a category or an existing set of rules or policies. these include, but are not limited to, your professional perspective in one or more of the following:
1. physical and software liability
2. licensing and infringement
3. user accounts with the university and/or with the vendor
4. physical space modifications needed for vr
5. room and equipment management
6. separating collection development policies from equipment and use policies
7. use policies for the equipment, software, and users
8. controlling the vr equipment and software
9. time, research, and staff needed to run this service
10. training and learning curve for users (both faculty and students)
11. logistics of using the vr room for a class and within a class
12. integrating vr into a college course
13. selecting appropriate vr items to purchase
14. evaluating vr items
15. paying for vr items including approval, licensing, purchasing processes
16. installing and maintaining vr items including regular updates, the user installing software/games, management of hardware/software, repair, etc.
17. budget for vr (amount, repair, one-time/continuing)
18. a vr topic of your choice
of course, we will not be able to cover all of the areas listed above during our short interview with you. we are sending them so you can begin thinking about these vr challenges and prioritize them. based on our own experience, we think you have important insight to share about some of them that will be beneficial to the broader university and library communities.
appendix b – interview protocol
1. tell us about the history of you/your library with vr.
2. how have you/your library navigated one of the following grey areas (drawn from working with vr in libraries) where a situation is ill-defined or not readily conforming to a category or an existing set of rules or policies? these include but are not limited to your professional perspective in one or more of the following (from the list we sent you in our invitation email):
• physical and software liability
• licensing and infringement
• user accounts with the university and/or with the vendor
• physical space modifications needed for vr
• room and equipment management
• separating collection development policies from equipment and use policies
• use policies for the equipment, software, and users
• controlling the vr equipment and software
• time, research, and staff needed to run this service
• training and learning curve for users (both faculty and students)
• logistics of using the vr room for a class and within a class
• integrating vr into a college course
• selecting appropriate vr items to purchase
• evaluating vr items
• paying for vr items including approval, licensing, purchasing processes
• installing and maintaining vr items including regular updates, the user installing software/games, management of hardware/software, repair, etc.
• budget for vr (amount, repair, one-time/continuing)
• a vr topic of your choice
please describe an occasion where you were faced with one of these complex, challenging, and/or potentially insurmountable obstacles in integrating vr into your library (or university more broadly). how did you navigate this challenge?
3. please describe one way in which the values, practices, and ethos of librarianship may have been challenged by the integration of a vr lab and the purchase and curation of vr artifacts.
endnotes
1 horst w. j. rittel and melvin m. webber, “dilemmas in a general theory of planning,” policy sciences 4, no.
2 (1973): 155–69.
2 cameron tonkinwise, “design for transitions – from and to what?” design philosophy papers 13, no. 1 (may 2015): 15, http://dx.doi.org/10.1080/14487136.2015.1085686.
3 valerie a. brown, john harris, and jacqueline russell, tackling wicked problems: through the transdisciplinary imagination (london: taylor & francis group, 2010): 302, ebook central.
4 bayard l. catron, “on taming wicked problems,” dialogue 3, no. 3 (1981): 13–16; luke houghton, “engaging alternative cognitive pathways for taming wicked problems,” emergence: complexity and organization 17, no. 1 (2015), https://www.researchgate.net/publication/282282336_engaging_alternative_cognitive_pathways_for_taming_wicked_problems_a_case_study.
5 catron, “on taming wicked problems”; falk daviter, “coping, taming or solving: alternative approaches to the governance of wicked problems,” policy studies 38, no. 6 (november 2017): 571–88, https://doi.org/10.1080/01442872.2017.1384543; david j. snowden and mary e. boone, “a leader’s framework for decision making,” harvard business review (november 1, 2007), https://hbr.org/2007/11/a-leaders-framework-for-decision-making.
6 natallia pashkevich, “wicked problems: background and current state,” philosophia reformata 85, no. 2 (november 4, 2020): 119–24, https://doi.org/10.1163/23528230-8502a008.
7 andrew m. cox, mary anne kennan, liz lyon, and stephen pinfield, “developments in research data management in academic libraries: towards an understanding of research data service maturity,” journal of the association for information science and technology 68, no. 9 (2017): 2182–2200, https://doi.org/10.1002/asi.23781; julie mcleod and sue childs, “a strategic approach to making sense of the ‘wicked’ problem of erm,” records management journal 23, no. 2 (2013): 104–35, http://dx.doi.org/10.1108/rmj-04-2013-0009; shelley wilkin and peter g. underwood, “research on e-book usage in academic libraries: ‘tame’ solution or a ‘wicked problem’?” south african journal of libraries & information science 81, no. 2 (july 2015): 11–18, https://doi.org/10.7553/81-2-1560; brendan howley, “libraries, prosperity’s wicked problems, and the gifting economy,” information today 33, no. 6 (july 2016): 14–15, proquest.
8 rachel d. williams and rebekah willett, “makerspaces and boundary work: the role of librarians as educators in public library makerspaces,” journal of librarianship and information science 51, no. 3 (september 2019): 801–13, https://doi.org/10.1177/0961000617742467.
9 cox, pinfield, and smith, “moving a brick building.”
10 matt cook et al., “challenges and strategies for educational virtual reality,” information technology and libraries 38, no. 4 (december 16, 2019): 25–48, https://doi.org/10.6017/ital.v38i4.11075; kung jin lee, w. e. king, negin dahya, and jin ha lee, “librarian perspectives on the role of virtual reality in public libraries,” proceedings of the association for information science and technology 57, no. 1 (2020): e254, https://doi.org/10.1002/pra2.254; hannah pope, “virtual and augmented reality in libraries,” library technology reports 54, no. 6 (september 8, 2018): 1–25; felicia ann smith, “‘virtual reality in libraries is common sense,’” library hi tech news 36, no. 6 (august 28, 2019): 10–13, https://doi.org/10.1108/lhtn-06-2019-0040; char booth, “from technolust to technorealism,” public services quarterly 5, no. 2 (june 2009): 139–42, https://doi.org/10.1080/15228950902868504.
11 megan frost, michael goates, sarah cheng, and jed johnston, “virtual reality: a survey of use at an academic library,” information technology and libraries 39, no. 1 (march 2020): 1–12, https://doi.org/10.6017/ital.v39i1.11369; jennifer grayburn, zack lischer-katz, kristina golubiewski-davis, and veronica ikeshoji-orlati, 3d/vr in the academic library: emerging practices and trends (washington, dc: council on library and information resources, 2019), https://eric.ed.gov/?id=ed597662; susan lessick and michelle kraft, “facing reality: the growth of virtual reality and health sciences libraries,” journal of the medical library association: jmla 105, no. 4 (october 2017): 407–17, https://doi.org/10.5195/jmla.2017.329; kenneth j. varnum, ed., beyond reality: augmented, virtual, and mixed reality in the library (chicago: american library association, 2019); richard smith and oliver bridle, “using virtual reality to create real world collaborations,” proceedings of the iatul conferences, paper 5 (2018): 10, https://docs.lib.purdue.edu/iatul/2018/collaboration/5/; carl r. grant and stephen rhind-tutt, “is your library ready for the reality of virtual reality? what you need to know and why it belongs in your library,” in o, wind, if winter comes, can spring be far behind? (charleston conference, 2019), https://doi.org/10.5703/1288284317070; dorothy carol ogdon, “hololens and vive pro: virtual reality headsets,” journal of the medical library association: jmla 107, no. 1 (january 2019): 118–21, https://doi.org/10.5195/jmla.2019.602.
12 grayburn et al., 3d/vr in the academic library, 8.
13 douglas bates, “library service study,” unpublished data, june 2, 2020; andrew m. cox, mary anne kennan, liz lyon, and stephen pinfield, “developments in research data management in academic libraries: towards an understanding of research data service maturity,” journal of the association for information science and technology 68, no. 9 (2017): 2182–2200, https://doi.org/10.1002/asi.23781; priti jain, “new trends and future applications/directions of institutional repositories in academic institutions,” library review 60, no. 2 (2011): 125–41, http://dx.doi.org/10.1108/00242531111113078; janice g. norris and elka tenner, “gis in academic business libraries: the future,” journal of business & finance librarianship 6, no. 1 (september 2000): 23, https://doi.org/10.1300/j109v06n01_03.
14 ross rubin, “vendors face the tough reality of affordable vr,” zdnet (july 13, 2020), https://www.zdnet.com/article/vendors-face-the-tough-reality-of-affordable-vr/.
15 sarah sharples, sue cobb, amanda moody, and john r. wilson, “virtual reality induced symptoms and effects (vrise): comparison of head mounted display (hmd), desktop and projection display systems,” displays 29, no. 2 (march 1, 2008): 58–69, https://doi.org/10.1016/j.displa.2007.09.005.
16 emily carl et al., “virtual reality exposure therapy for anxiety and related disorders: a meta-analysis of randomized controlled trials,” journal of anxiety disorders 61 (january 1, 2019): 27–36, https://doi.org/10.1016/j.janxdis.2018.08.003.
17 edward castronova, on virtual economies (rochester, ny: social science research network, july 1, 2002), https://papers.ssrn.com/abstract=338500.
18 barbara i. dewey, “the embedded librarian: strategic campus collaborations,” resource sharing & information networks 17, no. 1/2 (march 2004): 5–17; alessia zanin-yost, “academic collaborations: linking the role of the liaison/embedded librarian to teaching and learning,” college & undergraduate libraries 25, no. 2 (april 2018): 150–63, https://doi.org/10.1080/10691316.2018.1455548.
19 xiaoping sheng and lin sun, “developing knowledge innovation culture of libraries,” library management 28, no. 1/2 (january 9, 2007): 36–52, https://doi.org/10.1108/01435120710723536.
20 lorcan dempsey, “libraries and the informational future: some notes,” information services & use 32, no. 3/4 (july 2012): 201–12, https://doi.org/10.3233/isu-2012-0670.
21 randall bass, “what’s the problem now?” to improve the academy: a journal of educational development 39, no. 1 (spring 2020), https://doi.org/10.3998/tia.17063888.0039.102; kate crowley and brian w. head, “the enduring challenge of ‘wicked problems’: revisiting rittel and webber,” policy sciences 50, no. 4 (december 1, 2017): 539–47, https://doi.org/10.1007/s11077-017-9302-4.
22 brady d. lund, isaiah omame, solomon tijani, and daniel agbaji, “perceptions toward artificial intelligence among academic library employees and alignment with the diffusion of innovations’ adopter categories,” college & research libraries 81, no. 5 (july 2020): 865–82, https://doi.org/10.5860/crl.81.5.865.
23 david a. abrahams, “technology adoption in higher education: a framework for identifying and prioritising issues and barriers to adoption of instructional technology,” journal of applied research in higher education 2, no. 2 (2010): 34–49, https://doi.org/10.1108/17581184201000012.
24 tilmann lindberg, christine noweski, and christoph meinel, “evolving discourses on design thinking: how design cognition inspires meta-disciplinary creative collaboration,” technoetic arts: a journal of speculative research 8, no. 1 (may 2010): 31–37, https://doi.org/10.1386/tear.8.1.31/1; nancy roberts, “wicked problems and network approaches to resolution,” international public management review 1, no. 1 (2000): 1–19.
25 heather leary and samuel severance, “using design-based research to solve wicked problems,” icls 2020 proceedings (june 2020): 1805–6, https://repository.isls.org/bitstream/1/6452/1/1805-1806.pdf; deborah l. mulligan and patrick alan danaher, “the wicked problems of researching within the educational margins: some possibilities and problems,” in researching within the educational margins: strategies for communicating and articulating voices, ed. deborah l. mulligan and patrick alan danaher (cham, switzerland: palgrave macmillan, 2020): 23–39, https://doi.org/10.1007/978-3-03048845-1_2.
26 brown, harris, and russell, tackling wicked problems, ebook central; chris burman, marota aphane, and naftali mollel, “the taming wicked problems framework: reflections in the making,” journal for new generation sciences 15 (april 20, 2018): 51–73, https://www.researchgate.net/publication/324646298_the_taming_wicked_problems_framework_reflections_in_the_making; “convergence research at nsf,” national science foundation, accessed october 21, 2021, https://www.nsf.gov/od/oia/convergence/.
27 alex jorgensen and kara lindaman, “practicing democracy on wicked problems through deliberation: essentials for civic learning and student development,” journal of management policy and practice 21, no. 2 (2020): 28–39, https://www.proquest.com/scholarlyjournals/practicing-democracy-on-wicked-problems-through/docview/2435720594/se-2; paul hanstedt, creating wicked students: designing courses for a complex world (sterling, virginia: stylus publishing, 2018), ebook central.
28 ronald barnett, “learning for an unknown future,” higher education research & development 31, no. 1 (february 1, 2012): 65–77, https://doi.org/10.1080/07294360.2012.642841; stephanie wilson and lisa zamberlan, “design for an unknown future: amplified roles for collaboration, new design knowledge, and creativity,” design issues 31, no. 2 (spring 2015): 3–15, https://doi.org/10.1162/desi_a_00318; robin kundis craig, “resilience theory and wicked problems,” vanderbilt law review 73, no. 6 (december 2020): 1733–75, proquest; larry j. leifer and martin steinert, “dancing with ambiguity: causality behavior, design thinking, and triple-loop-learning,” information knowledge systems management 10, no. 1–4 (march 2011): 151–73.
virtual reality as a tool for student orientation in distance education programs: a study of new library and information science students
articles
sandra valenti, brady lund, and ting wang
information technology and libraries | june 2020
https://doi.org/10.6017/ital.v39i2.11937
dr. sandra valenti (svalenti@emporia.edu) is assistant professor, school of library and information management, emporia state university. brady lund (blund2@g.emporia.edu) is a doctoral student of library and information management at emporia state university. ting wang (twang2@emporia.edu) is a doctoral student of library and information management, emporia state university.
abstract
virtual reality (vr) has emerged as a popular technology for gaming and learning, with its uses for teaching presently being investigated in a variety of educational settings. however, one area where the effect of this technology on students has not been examined in detail is as a tool for new student orientation in colleges and universities. this study investigates this effect using an experimental methodology and the population of new master of library science (mls) students entering a library and information science (lis) program. the results indicate that students who received a vr orientation expressed more optimistic views about the technology, saw greater improvement in scores on an assessment of knowledge about their program and chosen profession, and saw a small decrease in program anxiety compared to those who received the same information as standard text-and-links. the majority of students also indicated a willingness to use vr technology for learning for long periods of time (25 minutes or more). the researchers concluded that vr may be a useful tool for increasing student engagement, as described by game engagement theory.
literature review
computer-assisted instruction (cai) has, for many years, been considered an effective method of instructional delivery that improves student engagement and outcomes.1 new technologies, such as the learning management system (lms), online video, laptops and tablets, word processors, spreadsheets, and presentation platforms, have all significantly altered how knowledge is transferred to and measured in students. when adopted by instructors, these technologies can improve the quality of student learning, student work, and the evaluation of that work. empirical research has shown that learning technologies do indeed contribute to better learning than a lecture alone.2 positive reaction to the adoption of new learning technologies among student populations has been shown across all grade levels, from pre-k through postgraduate education.3 research in the fields of instructional design technology (idt) and information science (is) has shown that the novelty of new learning technology provides short-term improvement in outcomes.4 this supports the broader hypothesis that engagement increases retention of knowledge. these findings would suggest that, at least in the short term, instructors could anticipate improvement in knowledge retention through the use of a new technology like virtual reality. when used in sustained instructional efforts, many learning technologies show some promise for improving the attainment of learning outcomes.5 this is why interest in learning technology has grown so significantly in the past two decades and the job outlook for instructional designers is growing faster than the national average.6
a large proportion of instructional technologies are not truly “adopted” by instructors, but rather used only in one-off sessions and then discarded.7 there seem to be some common factors among those technologies that are adopted and used regularly by instructors: 1. practicality, or the amount of work the new technology requires versus the perceived value of said technology; 2. affordability, or the cost of a new technology versus the perceived value of said technology; and 3. stability, or the likelihood of the product to be continuously supported and updated by its manufacturer (e.g., a product like microsoft office has a higher likelihood of ongoing maintenance).8 as noted by lund and scribner, only recently, with the introduction of free vr development programs and inexpensive viewers/headsets like google cardboard, has vr fit these criteria.9 it is finally practical to use vr as a learning tool for classrooms with large numbers of students.
“virtual reality is the computer-created counterpart to actual reality. through a video headset, computer programs present a visual world that can, pixel-perfectly, replicate the real world – or show a completely unreal one.”10 virtual reality is distinct from augmented reality, which augments a real-world, real-time image (e.g., viewed through a camera on a mobile device) with computer-generated information, such as images, text, videos, animation, and sound.11 the focus of the present study is virtual reality only, not related augmented (or mixed) reality technology.
an important contribution to the study of virtual reality in library and information science (lis) is varnum’s beyond reality.12 this short introductory book covers both theoretical and practical considerations for the use of virtual, augmented, and mixed reality in a variety of library contexts.
while the book describes how vr can be utilized in a variety of library education (for non-lis majors) contexts, it does not include an example of how virtual reality may be used for library school education. it also does not investigate in significant detail the use of virtual reality for a virtual orientation to an academic program. these are the gaps that the present study attempts to address.
the present study may be viewed through the framework of game engagement theory, as described by whitton.13 game engagement theory suggests that five major learning engagement factors exist and that using gaming activities may improve how well learning activities address these factors. these factors include:
• challenge, motivation to undertake activity;
• control, the level of choice;
• immersion, extent to which an individual is absorbed into activity;
• interest, an individual’s interest in the subject matter; and
• purpose, the perceived value of the outcome of the activity.
it has been suggested by several researchers, including dede, that immersive experiences like vr touch on similar factors of engagement.14
emporia state university’s school of library and information management
the setting for this study is emporia (ks) state university’s school of library and information management (esu slim). esu slim is the oldest library school west of the mississippi river, founded in 1902. compared to other lis education programs, esu slim is unique in that it offers a hybrid course delivery format. the six core courses in the mlis degree program are online with two in-person class weekends for each class. each class weekend is eleven hours: from 6 to 9 p.m. friday and 9 a.m. to 5 p.m. saturday at one of nine distance education locations scattered throughout the western half of the united states.
due to this course delivery format, the student population of esu slim may skew slightly older and include more individuals who are employed full-time than the populations of residential master’s programs. esu slim uses a cohort system, with a new group of students beginning annually at each of the eight distance locations as well as the main emporia, kansas campus. before each new cohort begins its first course, a one-day, in-person student orientation is offered on the campus in which the cohort will attend classes. the purpose of this experimental study is to examine how well vr technology can support or satisfy the role of the in-person student orientation by emulating the experience/information students receive during this informational session.
methods
this study used a pre-test/post-test experimental design. depending on the state in which the students reside, they were assigned to either the experimental or the control group. the experimental group received a cardboard vr headset (similar to google cardboard) and a set of instructions on how to use it. they were instructed to utilize this headset to view an interactive experience that introduced elements of library service and library education as a form of new student orientation. students in the control group received a set of links that contained the same information as the vr experience, but in a more static (non-immersive and non-interactive) setting.
participants for this study were library school students from four states: south dakota, idaho, nevada, and oregon. these students were all enrolled in a mixed-delivery program in lis. for each core course in the program, students attend two intensive, in-person, weekend class sessions. the rest of the course content is delivered via a learning management system.
for this study, the researchers were particularly interested in understanding the role of vr orientation for distance education students, as these students do not have access to the physical university campus and thus miss out on information that in-person interaction with faculty and the library environment might provide. this also seemed like a worthwhile population to study given that a large portion of lis programs have adopted the distance education (online or mixed-delivery) format.
in march 2019, a sample of this population was asked to complete a short survey to indicate their interest in virtual reality for new student orientation and the extent to which acquiring information via this medium may relieve their anxiety and increase their success in the program. sixty-one percent of students indicated at least some elevated level of anxiety about their first mls course, while 55 percent agreed that knowing more about the program’s faculty and course structure and purpose would decrease that anxiety. students were also asked to indicate the most pressing information needs they have about the program. these needs are displayed in table 1 below. this information was used to guide the design of the vr content for this study.
table 1. information needs expressed by new mls students
information need | number of respondents (out of 55)
information about esu’s curriculum | 50
what courses professors normally teach | 42
information about information access | 41
information about librarianship in general | 39
professors’ research interests | 35
information about esu’s faculty | 27
to see who they are via a video introduction | 25
information about esu’s library | 24
why they teach for esu’s mls program | 23
a little personal information about faculty | 20
information about my regional director | 14
to which associations do faculty belong | 13
information about esu’s physical spaces | 5
information about esu’s archives | 4
these students were also asked to indicate the extent to which they would like to use vr to virtually “meet” faculty, learn more about the program’s format, see program spaces, and learn about library services, using a five-point likert scale. the findings for this question are displayed in figure 1.
figure 1. new mls students’ reception to using vr as an orientation tool
[bar chart: frequency of respondents (vertical axis, 0 to 20) by category of vr use as student orientation tool (i’d like to use vr to “meet” faculty; i’d like to use vr to learn more about the program format; i’d like to use vr to see the classrooms; i’d like to learn more about library services using vr), with bars for strongly agree, agree, neutral, disagree, and strongly disagree.]
based on the largely positive response towards using vr for new student orientation, the researchers progressed to the experimental phase of the study. a vr experience was developed using veer vr (veer.tv), a completely free and intuitive vr-creation platform. within this platform, creators are able to upload images that were captured using a 360-degree vr camera (we used a samsung gear 360 camera) and drag-and-drop interactive elements, including text boxes, videos, audio, and transitions to new images. thus, it was possible to create a vr experience within the setting of an academic library where users could navigate throughout the building and virtually meet faculty and learn about fundamental concepts in librarianship.
for this phase of the study, a set of research questions was defined, a hypothesis was created, and independent and dependent variables were identified:
research questions
1. will vr improve students’ knowledge of topics related to their library school and basic library topics, relative to those without a vr experience?
2. will vr reduce students’ anxiety about their library program, relative to those without a vr experience?
3. will students’ perceptions towards the usefulness of vr be significantly different based on whether or not they utilized the vr experience?
hypothesis
use of vr will improve students’ knowledge of topics related to library schools and librarianship, reduce their anxiety, and result in a more positive perspective towards vr technology.
variables
independent variable: whether a student viewed the vr experience for a virtual orientation or viewed the web links for an online orientation.
dependent variables: change in students’ scores on a post-test assessment of orientation knowledge, compared to their pre-test scores; change in students’ anxiety levels and perceptions of vr.
experimental phase
the experimental phase of the study was conducted in august 2019. twenty-nine students agreed to participate in this study. the age and gender characteristics of this population are as follows: fourteen under age 35, eleven age 35–44, four age 45+; nine male, seventeen female, and three fluid or transgender.
thirty-three percent of the students who agreed to participate were in the control group, while 67 percent were in the experimental group. all participants in the study received a free vr headset, which was theirs to keep. funding for these vr headsets was provided by a generous grant from a benefactor at the researchers’ university. participants in the control group were encouraged to use the vr headset after they had completed their participation in the study. both groups received instructions with their viewer that directed them to complete a pre-test survey, embedded within a module of their learning management system account. following the pre-test, the experimental group was instructed to use the vr experience created by the researchers to learn about their library school, its faculty, and library concepts. the control group was instructed to use links provided in the module to experience the same content, but without the vr experience. following the experience, both groups were instructed to complete a post-test survey in the module, as well as a follow-up survey that asked questions about how long they interacted with the content, how the experience affected their program anxiety, and additional comments. once the data was collected for all participants, the researchers conducted a series of analyses on the data, including an analysis of covariance (ancova) for post-test scores among the control and experimental groups, and an ancova for program anxiety following the experimental treatment.15
results
figure 2 displays the amount of time participants in the experimental group spent using the vr experience. nearly 60 percent of participants spent more than 25 minutes using the virtual reality experience.
this finding may seem remarkable, given that the average attention span of students is generally no more than a handful of minutes, but it aligns with that of geri, winer, and zaks, who found that engagement with interactive video lengthens the attention span of users, and it supports the premise of engagement theory as discussed in the literature review.16 only 10 percent of individuals assigned to the experimental group decided not to use the headset. additionally, about one-third of participants in both the experimental and control groups indicated that they used the vr headset to view other content after they completed the study.

figure 2. amount of time experimental group participants spent in vr experience

table 2 shows responses to the likert questions about participants’ post-test perspectives of vr. participants in the vr group generally had more favorable perspectives on their experience than participants in the control group. participants in the control group, however, were a bit more optimistic about the idea that vr has promising uses for education and librarianship (though both groups expressed optimistic perspectives on these questions). there was some indication that participants would be willing to use vr for student orientation again, as both groups responded favorably to the idea that vr orientation information is appropriate and negatively to the idea that it would be better to get information from other sources. tables 3 and 4 display the ancova for pre-test/post-test score change among groups and the change in anxiety among the groups, respectively.
post-test scores for the experimental group (17.23 correct out of 20 questions, or 86 percent) and the control group (17.38/20, or 87 percent) were virtually identical; however, the pre-test scores differed (the experimental group, at 72 percent, scored worse on the pre-test than the control group, at 78 percent), so the change in scores was actually greater for the experimental group. as shown in table 3, though, this difference in score change was not found to be statistically significant, f(1, 20) = .641, p = .4, r = .01. that is, no significant difference was found as to whether vr improves scores compared to links. it can be concluded, however, that the links and vr together did improve scores from the pre-test to the post-test, with ancova values of f(1, 20) = 7.6, p < .01, r = .47.

table 2. post-test perspectives of vr for experimental and control groups
question | control (text-links)* | experimental (vr)*
the instructions were easy to understand and follow | 3 | 3.38
the viewer/text-links were fun to use | 3 | 3.63
the vr/text-links content was engaging | 3 | 3.13
i would recommend continuing vr/text-links use | 2.67 | 3
i felt better informed about the topics presented | 2.5 | 3.11
the information given was helpful | 2.5 | 3.38
i feel more connected to the school than before | 2.5 | 2.88
virtual reality is just a fad | 2 | 2.88
there are exciting uses for vr in education | 4 | 3.5
there are exciting uses for vr in librarianship | 4 | 3.5
using vr is too time consuming | 2 | 3
i’d rather get information in formats other than vr | 2.5 | 2.89
vr orientation information is appropriate | 4 | 3.38
*five-point likert scale (level of agreement: 1, strongly disagree; 5, strongly agree)

table 3. ancova for pre-test/post-test change in scores
source | degrees of freedom | f value | p value
pretest | 1 | .135 | .7
group | 1 | .641 | .4
error | 18 | |
total | 19 | |
corrected total | 20 | |

though the vr group generally reported less anxiety on a five-point likert scale following the experiment than the control group (both groups showed some reduction), this difference was not statistically significant at p < .05 (though it was significant at p < .1). it is worth noting that few students indicated experience with vr before this study, so it may have simply been the unfamiliar technology, not the nature of the content, that resulted in anxiety not dropping as far as anticipated. at the same time, it is worth noting, as bawden and robinson did, that information overload, which could certainly be the product of immersive vr orientations, is connected to information anxiety.17 thus, it may be better, in the design of vr orientations, to keep the amount of new information at a minimum, only introducing broad concepts and allowing more freedom and flexibility for the user.

table 4. ancova for anxiety following the orientation experience
source | sum of squares | df | mean square | f | sig.
between groups | 3.219 | 1 | 3.219 | 3.449 | .079
within groups | 17.733 | 19 | .933 | |
total | 20.952 | 20 | | |

discussion
participants in this study expressed willingness to use vr for extended periods of time (over 25 minutes) and demonstrated strong levels of engagement. based on this finding, it seems possible that a well-designed vr orientation could be a suitable substitute for the in-person orientation for distance students.
this is a significant finding, given that the majority of existing research on orientation for distance education students focuses on the design of online course modules or video streaming for orientation, which are not nearly as immersive and dynamic as physical presence in the environment.18 vr much more closely emulates physical presence than noninteractive, nonimmersive videos and text.

those among the participants who were in the experimental (vr) group expressed more favorable perspectives towards the technology. this suggests that experience with the technology increases comfort with and interest in it. this aligns with the findings of theng, mei-ling, liu, and cheok, among others, who found that users of vr were more likely to accept the technology after usage.19 additionally, participants stated interest in using vr for other purposes, including the one-third of participants who have already used the technology to explore other apps suggested by the researchers.

the findings of this study align with game engagement theory in several of its key aspects. vr is shown to have garnered the interest of the students who participated in the study, as indicated in table 2, aligning with the aspect of interest. they could see the purpose of the experience and were able to take control of the experience to ensure that they interacted with necessary information to satisfy this purpose. this is opposed to the control group, which had to follow links and read text in a sequential order with little control or creativity involved. accordingly, greater improvement in scores was observed for the experimental group. even though the improvement was not statistically significant, this could likely be explained by the relatively small sample size. with a larger number of participants, the statistical strength of the differences between the two study groups may have been more pronounced. this is one limitation of the present study.
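the score changes and the table 4 f ratio discussed above can be rechecked with simple arithmetic. the following python sketch is illustrative only: it uses the rounded percentages and the sums of squares as reported in the text and tables, the variable and function names are hypothetical, and it does not reproduce the authors' ancova.

```python
# rounded group means as reported in the results section (percent correct)
pre_test = {"experimental": 72.0, "control": 78.0}
post_test = {"experimental": 86.0, "control": 87.0}


def score_gain(group: str) -> float:
    """percentage-point change from pre-test to post-test for one group."""
    return post_test[group] - pre_test[group]


def f_ratio(ss_between: float, df_between: int,
            ss_within: float, df_within: int) -> float:
    """f statistic as the ratio of between-group to within-group mean squares."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within


# sums of squares and degrees of freedom taken from table 4
anxiety_f = f_ratio(3.219, 1, 17.733, 19)

if __name__ == "__main__":
    for group in ("experimental", "control"):
        print(f"{group}: {score_gain(group):+.0f} percentage points")
    print(f"table 4 f ratio: {anxiety_f:.3f}")
```

under these reported values the experimental group gains 14 percentage points against 9 for the control group, and the ratio of mean squares comes to about 3.449, consistent with the f value in table 4.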
in addition to a small participant group, several other limitations exist with this study. participants came from only a small sample of states, all in the western half of the united states. a less homogeneous sample may have produced more robust results. some vr headsets arrived late due to delays in distributing them, giving the students less opportunity to review the content than they otherwise may have had. finally, the researchers were not able to easily troubleshoot problems with accessing the vr experience for distance students. while the researchers did their best to help all participants figure out how to use the technology, several students opted to discontinue participation when the technology gave them trouble. this also led to a smaller study sample population than initially anticipated.

conclusion
the findings of this study may have several important implications for library professionals who are considering using vr technology for library orientations or instruction. this study found vr to have a positive effect on students’ interest and to slightly increase scores and reduce anxiety among them. while there is no indication from this study whether vr would produce positive effects over a sustained period of time (e.g., every class session over the course of a semester), in limited usage it appears to at least draw students’ attention more than traditional online teaching options like static text and links. the same vr experience developed to introduce students to basic concepts within librarianship/the library could be used for undergraduate and graduate students in all majors during library orientation sessions. this may make the library a more memorable component of students’ early university experiences, as opposed to lecture information that students are likely to easily forget.
library professionals may consider these factors when deciding whether to opt for the more traditional methods of instruction/orientation or to experiment with a more innovative method of teaching like virtual reality.

endnotes
1 jennifer j. vogel et al., “using virtual reality with and without gaming attributes for academic achievement,” journal of research on technology in education 39, no. 1 (2006): 105–18, https://doi.org/10.1080/15391523.2006.10782475.
2 yigal rosen, “the effects of an animation-based on-line learning environment on transfer of knowledge and on motivation for science and technology learning,” journal of educational computing research 40, no. 4 (2009): 451–67, https://doi.org/10.2190/ec.40.4.d; elisha chambers, efficacy of educational technology in elementary and secondary classrooms: a meta-analysis of the research literature from 1992–2002 (carbondale, il: southern illinois university at carbondale, 2002).
3 elisha chambers, “efficacy of educational technology in elementary and secondary classrooms: a meta-analysis of the research literature from 1992–2002,” phd diss., southern illinois university at carbondale, 2002.
4 jason m. harley et al., “comparing virtual and location-based augmented reality mobile learning: emotions and learning outcomes,” educational technology research and development 64, no. 3 (2016): 359–88, https://doi.org/10.1007/s11423-015-9420-7; jocelyn parong and richard e. mayer, “learning science in immersive virtual reality,” journal of educational psychology 110, no. 6 (2018): 785–95, https://doi.org/10.1037/edu0000241; paul legris, john ingham, and pierre collerette, “why do people use information technology? a critical review of the technology acceptance model,” information and management 40, no. 3 (2003): 191–204, https://doi.org/10.1016/s0378-7206(01)00143-4.
5 zaid khot et al., “the relative effectiveness of computer-based and traditional resources for education in anatomy,” anatomical sciences education 6, no. 4 (2013): 211–15, https://doi.org/10.1002/ase.1355; michael j. robertson and james g. jones, “exploring academic library users’ preferences of delivery methods for library instruction,” reference & user services quarterly 48, no. 3 (2011): 259–69.
6 joshua kim, “instructional designers by the numbers,” inside higher ed (2015), https://www.insidehighered.com/blogs/technology-and-learning/instructional-designers-numbers.
7 elena olmos-raya et al., “mobile virtual reality as an educational platform: a pilot study on the impact of immersion and positive emotion induction in the learning process,” eurasia journal of mathematics science and technology education 14, no. 6 (2018): 2045–57, https://doi.org/10.29333/ejmste/85874.
8 brady d. lund and shari scribner, “developing virtual reality experiences for archival collections: case study of the may massee collection at emporia state university,” the american archivist, https://doi.org/10.17723/aarc-82-02-07.
9 lund and scribner, “developing virtual reality experiences for archival collections.”
10 kenneth j. varnum, “preface,” in kenneth j. varnum, ed., beyond reality: augmented, virtual, and mixed reality in the library (chicago: ala editions, 2019): x.
11 brady d. lund and daniel a. agbaji, “augmented reality for browsing physical collections in academic libraries,” public services quarterly 14, no. 3 (2018): 275–82, https://doi.org/10.1080/15228959.2018.1487812.
12 kenneth j. varnum, ed., beyond reality: augmented, virtual, and mixed reality in the library (chicago: ala editions, 2019).
13 nicola whitton, “game engagement theory and adult learning,” simulation and gaming 42, no. 5 (2011): 596–609, https://doi.org/10.1177/1046878110378587.
14 chris dede, “immersive interfaces for engagement and learning,” science 323, no. 5910 (2010): 66–69, https://doi.org/10.1126/science.1167311.
15 pat dugard and john todman, “analysis of pre-test-post-test control group designs in educational research,” educational psychology 15, no. 2 (1995): 181–98, https://doi.org/10.1080/0144341950150207.
16 nitza geri, amir winer, and beni zaks, “challenging the six-minute myth of online video lectures: can interactivity expand the attention span of learners?,” online journal of applied knowledge management 5, no. 1 (2017): 101–11.
17 david bawden and lyn robinson, “the dark side of information: overload, anxiety and other paradoxes and pathologies,” journal of information science 35, no. 2 (2009): 180–91, https://doi.org/10.1177/0165551508095781.
18 moon-heum cho, “online student orientation in higher education: a developmental study,” educational technology research and development 60, no. 6 (2012): 1051–69, https://doi.org/10.1007/s11423-012-9271-4; karmen crowther and alan wallace, “delivering video-streamed library orientation on the web: technology for the educational setting,” college and research libraries news 62, no. 3 (2001): 280–85.
19 yin-leng theng et al., “mixed reality systems for learning: a pilot study understanding user perceptions and acceptance,” international conference on virtual reality (2007): 728–37, https://doi.org/10.1007/978-3-540-73335-5_79.

data center consolidation at the university at albany
rebecca l. mugridge and michael sweeney
information technology and libraries | december 2015 18

abstract
this paper describes the experience of the university at albany (ualbany) libraries’ migration to a centralized university data center. following an introduction to the environment at ualbany, the authors discuss the advantages of data center consolidation. lessons learned from the project include the need to participate in the planning process, review migration schedules carefully, clarify costs of centralization, agree on a service level agreement, communicate plans to customers, and leverage economies of scale.

introduction
data centers are facilities that house servers and related equipment and systems. they are distinct from data repositories, which collect various forms of research data, although some data repositories are occasionally called data centers. many colleges and universities have data centers or server rooms distributed across one or more campuses, as does the university at albany (ualbany).
this paper reports on the experiences of the libraries at ualbany as the libraries’ application and storage servers were consolidated into a new, state-of-the-art, university data center in a new building on campus. the authors discuss the advantages of consolidation, the planning process for the actual move, and lessons learned from the migration.

background
the university at albany is one of four university centers that are part of the state university of new york (suny) system. founded in 1844, ualbany has approximately 13,000 undergraduates, 4,500 graduate students, and more than 1,000 faculty members. it offers 118 undergraduate majors and minors, and 138 master’s, doctoral, and certificate programs. ualbany resides on three campuses: uptown (the main campus), downtown, and east.1

the uptown campus was built in the 1960s on grounds formerly owned by the albany country club. the campus was designed by noted architect edward durell stone in 1962–63 and was built in 1963–64. the campus buildings include four residential quadrangles surrounding a central “academic podium” consisting of thirteen three-story buildings connected on the surface by an overhanging canopy and below ground by a maze of tunnels and offices. many of the university’s classrooms, lecture halls, academic and operational offices, and infrastructure are housed within the podium on the basement or subbasement levels. this includes the university’s original data center, which is located in a basement room in the center of the podium.

rebecca l. mugridge (rmugridge@albany.edu) is interim dean and director and associate director for technical services and library systems, and michael sweeney (msweeney2@albany.edu) is head, library systems department, university libraries, university at albany, albany, new york.
data center consolidation at the university of albany | mugridge and sweeney | doi: 10.6017/ital.v34i4.8650 19

while visually striking and unique, the architectural design of the podium has presented many challenges since its construction, one of which is regular flooding of the basement and subbasement levels. the original data center was flooded many times, to the extent that any heavy rainstorm had the potential to disrupt functionality and connectivity. when the university was first built in the 1960s, it was not known to what extent computing would become part of the university’s infrastructure, and the room that housed the data center was not built to today’s standards for environmental control, such as the need for cooling. at the same time, server rooms sprouted all over the university, with many of the colleges and other units purchasing servers and maintaining server rooms in less than ideal conditions. these included server rooms in the college of arts and sciences, the school of business, the athletics department, the university libraries, and many other units.

university libraries’ server room
the university libraries maintained its own server room with two racks full of equipment that supported all of the libraries’ computing needs. these servers supported our website, mssql and mysql databases, ezproxy, illiad (interlibrary loan service), ares (electronic reserve service), and our search engine appliance (google mini). they also included our domain controller, intranet, and several servers used for backup. two servers and a storage area network housed our virtual environment, containing an additional nine virtual servers. these included servers to support library blogs, wikis, file storage, development and test servers, and additional backup servers.
the only library servers not housed in the libraries’ server room were the integrated library system (ils) servers that were maintained primarily by the university’s information technology services (its) staff, our backup domain controllers, and a server holding backups of our virtual servers. the ils production server was housed in ualbany’s data center and the ils test/backup server was housed in the alternate data center in another building on campus. also, two of the libraries’ backup servers for other applications were housed in the university data center.

the libraries’ server room was a 340-square-foot room on the third floor of the main campus library that was networked to support servers housed in two racks protected by a fire suppression system. there were two ceiling-mounted air conditioning units that cooled the room sufficiently for optimum performance. the libraries’ windows system administrator’s office was nearby and had a connecting door to the server room, giving him ready access to the servers when needed.

data center consolidation
data center consolidation is defined as “an organization's strategy to reduce it assets by using more efficient technologies. some of the consolidation technologies used in data centers today include server virtualization, storage virtualization, replacing mainframes with smaller blade server systems, cloud computing, better capacity planning and using tools for process automation.”2 in addition to the investigation and use of these technologies, the planning for a new data center often involves the construction of a new building or the renovation of a current building.

there were several drivers behind ualbany’s decision to build a new data center. in addition to the concerns mentioned above about the potential flooding risk of the current data center, the ability to manage optimum temperature was also a factor.
the current data center was built to house 1960s-era equipment and was not able to keep up with the cooling requirements of the more extensive computing equipment in use in the twenty-first century. the current data center also occupied what is considered prime real estate at the university, at the center of campus and near the lecture center, which experiences high foot traffic during the academic year. the new data center was constructed near the edge of campus, with little foot or auto traffic, allowing the space previously occupied by equipment to be repurposed in a way that better meets the university’s needs.

like many other universities, ualbany is increasingly making use of cloud computing capabilities. for example, the email and calendaring systems are cloud-based. nevertheless, this movement is being made in a deliberate and thoughtful way, leaving many of our administrative computing needs reliant on the use of physical servers. ualbany and the libraries have decreased the number of physical servers necessary by relying on a virtualized environment, and part of the project to move to the new data center included a conversion from physical to virtual servers. the libraries’ ils production and test servers remain physical, as do several of the other libraries’ application servers. many of the libraries’ backup servers are now virtual.

while there was no official mandate to consolidate all of the distributed server rooms across campus into the new data center, everyone involved understood that this was a direction the university administration supported. the libraries’ dean and director also supported this effort on behalf of the libraries and charged libraries’ staff to collaborate with its to make this happen.
some of the drivers behind this decision include the promise of a better environment, improved security, backup generators for computing equipment, the use of its’s virtual environment, the automation of server management, a faster network, the ability to repurpose the libraries’ server room, and more. these drivers are described in more detail later in this paper.

construction
planning for ualbany’s new data center began in the mid-2000s and included the identification of funding and the architectural design of the new building, later to be named the information technology building (itb). the actual construction began in 2013, with an estimated completion date of february 2014 and occupancy in april 2014. unexpected challenges during construction delayed the timeline somewhat, and the construction was not completed until may 2014. the certificate of occupancy was granted in fall 2014. the data center is certified as tier iii by uptime institute,3 and the building is designated leed gold.

alternate data center
simultaneously with the construction of the new data center, the university entered into an agreement with another suny institution to house our alternate data center. this center was originally housed in another building on the ualbany campus, less than a mile from both the main data center and itb, in a building leased by ualbany. this situation left some environmental issues out of our control, which was not ideal. for example, an air conditioner failure in fall 2013 caused our backup and test ils server to be down for six days, affecting our ability to use that server for other purposes and holding up several projects. in addition, data center best practice calls for an alternate data center to be housed at a distance from the main data center. in february of 2014, the servers in the alternate data center were moved to their new location.
this included the libraries’ backup ils server as well as two backup servers formerly housed in the main data center.

advantages to the libraries moving to the university data center
there were many advantages to the libraries moving to a centralized data center. many of these advantages also applied to the other units considering a move to the new data center, but for the purposes of this paper, we are addressing them in the context of the libraries’ experience.

repurpose space
the libraries’ server room occupied a large office that could be repurposed to house multiple staff offices or student spaces. the libraries have many group study rooms available for student use; however, they are in great demand, and the possibility of gaining more space for student use was seen as an advantage to making the move to a new data center.

climate control
the new data center is built on a raised floor that allows better air circulation. hundreds of servers and other pieces of equipment create a lot of excess heat, and raised floor construction allows for better circulation of air. new racks have chimneys that exhaust heat from high-density computing environments. air conditioners supply a constant stream of air that will maintain the optimum temperature for computing equipment. sensors continually monitor humidity and keep it at an optimal level. this was an improvement over the libraries’ current server room, which had sufficient air conditioning for our relatively small number of physical servers but did not have backup generators to keep equipment running during a power outage.

backup generators
the new data center was built with two backup generators. if the building suddenly loses power, the backup generators will immediately start and provide a seamless source of energy.
a secondary benefit to the university is that the backup generators can also provide a source of energy to other buildings on that side of campus; this area did not previously have a backup source of energy. again, the libraries’ server room did not have a redundant electrical supply. in the event of a power outage, battery units would allow the servers to shut down properly if the outage lasted more than forty-five minutes.

security
with server rooms scattered all over the university, security issues were a concern. now that the servers are housed in one location, the university can provide a highly secure environment in a more cost-effective way. the new data center has card-swipe access to the building and biometric access to the data center itself. there are also cameras installed in the building as a further security measure.

virtual environment
although the libraries have made strides toward moving into a virtualized environment in the past few years, we had many constraints on our ability to keep up with developments. the libraries’ virtual environment was two versions behind ualbany’s virtual environment, and the storage needs of the libraries’ virtual environment were at capacity. part of the incentive to moving into the new data center was the ability to downsize some of our physical equipment and migrate some of our physical servers to virtual equivalents.

automation of server management
one of the benefits of consolidating servers into one environment is that they are in a secure location, but it is still possible to manage them from a distance. the virtual environment has a web-based console that allows system administrators to connect to and manage the virtual servers, and the physical servers can be managed over the network as well. even though the servers are centralized, our system administrator can work from an office in the library, or from home if needed.
faster network
part of the project to construct a new data center included the installation of an additional fiber network across campus. the new fiber network connects all buildings on campus with each other and the new data center. all of the network equipment was upgraded, providing faster connections and response time. the additional fiber network is fault tolerant: if the primary network fails, the second fiber network can immediately take its place with no loss of service.

staging and work room
the new data center was designed to include a staging and work room. this can be used by any of the system administrators who are responsible for equipment housed in the data center, and it allows them to work on equipment in a room adjacent to the locked and secure data center.

equipment inventory
part of the planning for the migration involved creating a detailed inventory of equipment. the libraries already had a server inventory, but the information collected for the migration went far beyond just a list of the servers. this helped us identify who was responsible for physical and virtual servers and who was responsible for the services and applications that ran on those servers. creating the equipment inventory also allowed us to consolidate and decommission equipment that was no longer needed, and additionally helped us determine a prioritization and time line for the move.

applications inventory
in addition to creating an equipment inventory, the libraries created an applications inventory that included information about the dependencies that applications had on each other. for example, the libraries’ electronic resources reserve application (ares) had been integrated into the university’s course management system, blackboard. that meant that when blackboard was inaccessible, ares was as well.
all of these dependencies had to be taken into account when planning the schedule for the move.

disadvantages to the libraries moving to the university data center
the libraries have noted few disadvantages to moving to the new data center. what might seem like disadvantages are in reality just a change in the way we do our work. for example, we have been asked to inform someone in its before we go to the new data center to work on a server. this is a simple step, and has not hindered our work at any time. another change is the need to use a tool created by its to configure our virtual servers, and we found that the tool has been configured to give us fewer administrative options than its staff have. this has reinforced our understanding that we need to be present and proactive in representing the libraries’ interests in managing all of our computing equipment and software.

migration days
the majority of ualbany’s servers were moved from the main data center to the new one in itb on august 9, 2014. however, we were unwilling to move all of the libraries’ servers on that day, which fell in the middle of the summer session. a compromise was reached between the libraries and its that allowed many of the libraries’ less mission-critical servers to be moved on the same day as the university’s servers. these servers were primarily ones that were used for development and backup purposes, one exception being the server that supported the libraries’ electronic reserves service. this server was dependent on the university-supported blackboard server, which was being moved on august 9, so the libraries agreed to move this server that day so there would not be two downtimes for the electronic reserve system. the libraries’ most critical servers were moved to itb on august 18, 2014. this was the first day of intersession and would affect students and faculty the least.
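the scheduling constraint described above, in which the electronic reserves server moved on the same day as blackboard, can be sketched as a small dependency lookup. this is an illustrative python sketch, not the tool or process the libraries used: the function name, the data layout, and the server label library-website are assumptions; only the ares/blackboard dependency and the two dates come from the text.

```python
from __future__ import annotations


def plan_move_days(preferred_day: dict[str, str],
                   depends_on: dict[str, str]) -> dict[str, str]:
    """assign each server a move day.

    a server that depends on another service inherits that service's move
    day, so the dependent application has a single downtime window.
    every upstream service must also appear in preferred_day.
    """
    plan: dict[str, str] = {}

    def resolve(server: str, chain: tuple[str, ...] = ()) -> str:
        if server in plan:
            return plan[server]
        if server in chain:
            raise ValueError(f"circular dependency involving {server}")
        upstream = depends_on.get(server)
        if upstream is None:
            day = preferred_day[server]
        else:
            day = resolve(upstream, chain + (server,))
        plan[server] = day
        return day

    for server in preferred_day:
        resolve(server)
    return plan


# hypothetical inventory; only the ares/blackboard link and the dates are from the text
preferred_day = {
    "blackboard": "2014-08-09",
    "ares": "2014-08-18",
    "library-website": "2014-08-18",
}
depends_on = {"ares": "blackboard"}

move_plan = plan_move_days(preferred_day, depends_on)
```

in this sketch ares inherits blackboard's august 9 date even though its preferred date is august 18, mirroring the compromise described above, while a server with no dependency keeps its preferred date.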
there were many people involved in the move, including the library systems staff, the migration consulting firm staff, the professional moving company that was hired to carry out the move, and its staff who were responsible for the network and other support. move activities included shutting down and backing up applications, powering off the servers, and packing the equipment. at itb the equipment was unpacked, placed in its assigned rack location, plugged in, and powered on. then each server had to be started, and applications tested. all of this activity began at 3:00 a.m. and continued until early afternoon. the day concluded with a conference call between all parties involved to confirm that everything was up and running as expected.

lessons learned

participate in the process
the libraries were invited to participate in the planning for a new data center early in the process. its, ul, and other units with significant server collections met and discussed their computing needs and respective computing infrastructures. once the construction of itb began, the planning ramped up and monthly meetings of stakeholders became weekly meetings. agendas for these meetings included round robin reports about
• construction project oversight;
• migration consulting;
• partnerships (with other units on campus, including the libraries);
• status of our alternate data center (housed 10 miles away at another suny institution);
• campus fiber network;
• internal wiring and network design;
• administrative computing planning and move;
• research computing planning and move;
• systems management (storage and virtual environment) planning and move;
• data center advisory structure; and
• campus notification and public relations.
these meetings gave us an opportunity to learn about and understand all aspects of the data center migration project.
participants reviewed project timelines and other documents that were housed on a shared wiki space. after the data center migration consultants were hired, they began to use the microsoft onedrive collaboration space to share and distribute documents. meeting regularly with all project participants allowed us to ask questions to clarify priorities and timelines and to advocate for the libraries’ needs.

review schedules carefully
as with many construction projects, unexpected delays in the construction of the data center delayed all of our plans. originally the building was to be completed in february; this was later changed to april and then may. after the construction was complete, the building had to be commissioned, which means that every system within the building had to be tested independently by outside inspectors. coordination of this work is very time consuming, and the completion of the commissioning delayed occupancy by another few months. the university was finally given permission to move equipment into the data center in july. in the meantime, our consultants were working feverishly to develop timelines for the move, identify and secure a contract with a professional it moving company, and create “playbooks” for each move. the playbook is a document that includes
• names and contact information of everyone involved with the move;
• sequence of events: an hour-by-hour description of all activities;
• server overview: including the name, make, model, rack location and elevation, and contact person for each server; and
• schematics of both old and new server locations including details about each server as well as rack locations and elevations.
library staff became concerned when the original date scheduled to move most of the university’s application servers, including the libraries’ ils server, was in the middle of the summer session.
although the projected downtime was only to be twelve hours (and probably fewer), library staff were not willing to have twelve hours’ downtime during a short four-week summer session. there were concerns that downtime, not only to the online catalog, but also to all of the libraries’ databases, the website, online reference service, electronic reserves, and other resources would present a severe hardship to faculty and students. we also recognized the risk, however small, of something going wrong during the move that would cause a lengthier downtime. at the same time the university was concerned about pushing the move too close to the start of the fall semester, as well as the increased cost of scheduling a second move date. during these negotiations it became apparent that the libraries’ needs are different from administrative computing needs. whereas the middle of a semester is a poor time for libraries’ servers to experience downtime, it can be a better time for administrative computing, which is often busier during intersession when grading reports are being run and personnel databases are being updated. ultimately, the libraries advocated for and secured an agreement for a second move date, scheduled for the first work day after the end of the summer session. similarly, its was encouraging all of its partners across the campus to move as much computing as possible into their virtual environment. this is a worthwhile goal, but again the libraries had to negotiate to make this change according to the schedule best for the libraries and their users. the its virtual environment was a more current release of the virtual machine (vm) software than the libraries were using, so the libraries were faced with not only a migration, but also an upgrade. ultimately, we postponed the vm migration until after the physical migration, and we have benefited from waiting.
other partners have had to work through a number of kinks in the process, and the libraries’ vm migration has benefited from the other partners’ experience.

clarify costs of centralization
when ualbany began to consider and plan for a centralized data center, one of the concerns raised by the various data center managers from units other than its was the cost of centralizing their servers in another location. centralized data centers have many costs: heating, cooling, security, staffing, cleaning, backup energy sources, networking costs, and more. the question on everyone’s mind was who was going to pay for these costs. would each unit have to pay toward the maintenance of the data center? some objected to the idea of having to pay to be a tenant in a centralized data center, when they already had their own data center or server room at what seemed like no cost. the only cost they experienced was an opportunity cost of what else they could use the server room for. in the libraries’ case, the server room could be used for group study, office space, or other purposes, but it did not cost the libraries money to use it as a server room because utilities are covered centrally by the university. on the other hand, by migrating some of our computing to the its virtual environment, we may save money in the long run because we will not have to replace hardware and pay warranty fees. after much negotiation the university settled on a five-year commitment to no charges for the partnering units on campus, including the libraries. this agreement was documented in a partnership agreement drafted by a group of representatives from all of the key units involved.

contribute to the development of a service level agreement
library staff contributed to the development of a service level agreement (sla) for our participation in a centralized data center. having an sla in place ensures that all parties to the agreement understand their rights and responsibilities.
we began by searching other universities’ websites for samples of slas, which we shared with its staff who were assigned to this project. the establishment of a centralized data center includes several major elements: data center as a service (dcaas), infrastructure as a service (iaas), as well as the network that connects it all. the sla that was developed, still in draft form, has elements that address the following:
• the length of the agreement
• network uptime
• infrastructure as a service
  o server/storage environment and technical support
  o access to iaas
  o file backup and retention
  o maintenance of partner systems
  o its scheduled maintenance
  o data ownership, security, responsibility, and integrity
  o business continuity, tiering, and disaster recovery
  o availability and response time of its staff
• data center as a service
  o environment and support
  o building access and security
  o physical rack space
  o deliveries
  o scheduled maintenance
  o communications
• glossary
we recommend that institutions considering data center consolidation projects complete their sla and other agreements before moving servers into a shared environment. in our case, however, we were unable to finalize the sla prior to the actual move. this was not because of any particular demands that the libraries were making, but was primarily because of the rapid approach of the deadline for moving into the new data center. it had to be completed before the beginning of fall semester, and preferably with a few weeks to spare in case anything went wrong. while the planning for the data center construction and migration seemed to stretch over a long period of time, the final few months turned into a frenzy of activity that ranged from last-minute construction details to nailing down the exact order in which thousands of pieces of equipment would be moved.
although not every detail was ironed out at the time of the move, the intentions and spirit of the sla have been documented and it will be completed during 2015.

communicate developments and plans
during the planning and development for the data center migration project, we recognized that it would be important to communicate any changes to the libraries’ systems availability to our users. its also recognized the need to communicate such changes. both its and the libraries took a many-pronged approach to communicating developments and plans related to the migration. within the libraries we shared updates at library faculty meetings as well as meetings of the library policy group (the dean’s administrative policy team). we sought feedback from many groups on proposed move dates, establishing intersession as the preferred time to move any libraries’ servers that would affect access to resources used by faculty or students. as the moves got closer the communication efforts were ramped up. within the libraries, we posted alerts on the libraries’ webpage that linked to charts indicating what services would be unavailable and when. we also included slides on the libraries’ main webpage with the same information. the same slide was posted on all three libraries’ flat-screen monitors, on which we post important news and dates. we sent mass emails to all libraries’ staff that reminded them when services would be down. staff members who were responsible for specific services made an effort to contact their customers directly. for example, the head of access services contacted faculty members about the scheduled interruptions of ares, our electronic resources reserves system. some of the downtime affected just users, and other downtime also affected staff who could not work in the ils during the move.
we planned alternate activities for staff members who could not work during the downtime and had a productive division clean-up day instead. its also made great efforts to communicate to the university community about the moves and any potential downtime. their efforts included mass emails to all faculty, staff, and students. its created and posted slides to the libraries’ flat-screen monitors, as well as other monitors throughout the university. its also formed a team of liaisons from each school and college, using that group as yet another conduit to communicate changes. they shared draft schedules, seeking input on the effect of downtime on the university’s functions.

leverage economies of scale
one of the challenges of maintaining a distributed data center environment is that each system administrator or unit has to manage its own servers singlehandedly. in the case of the libraries at ualbany, we had moved in the direction of using the power of virtualization to manage many of our servers. virtualization refers to the process of creating virtual servers within one physical server, thereby multiplying the value of a single server many times. the libraries had virtualized a number of library servers, saving money by not having to purchase additional costly physical servers. however, its, with its greater purchasing power, was using more current and advanced virtualization software, hardware, and services than the libraries. its created a suite of services that allows system administrators access to the virtual environment so they can manage their virtual servers from their own offices. by moving into the its virtual environment (iaas), the libraries are able to leverage the economies of scale that environment offers.

conclusion
the consolidation of distributed data centers or server rooms on university campuses offers many advantages to their owners and administrators and only minimal disadvantages.
the university at albany carried out a decade-long project to design and build a state-of-the-art data center. the libraries participated in a two-year project to migrate their servers to the new data center. this included hiring a data center migration consulting firm and developing a migration plan and schedule for the physical move, which took place in late summer 2014. the authors have found that there are many advantages to consolidating data centers, including taking advantage of economies of scale, an improved physical environment, better backup services and security systems, and more. lessons learned from this experience include the value of participating in the process, reviewing migration schedules carefully, clarifying the costs of consolidation, contributing to the development of an sla, and communicating all plans and developments to the libraries’ customers, including faculty, staff, and students. as other university libraries consider the possibility of consolidating their data centers, the authors hope that this paper will provide some guidance to their efforts.

references
1. “fast facts,” university at albany, accessed march 31, 2015, www.albany.edu/about/about_fastfacts.php.
2. “data center consolidation, it consolidation,” webopedia, accessed march 31, 2015, www.webopedia.com/term/d/data-center-consolidation-it-consolidation.html.
3. uptime institute, accessed march 31, 2015, https://uptimeinstitute.com/tiercertification/.
article

library management practices in the libraries of pakistan: a detailed retrospective
asim ullah, shah khusro, and irfan ullah

information technology and libraries | september 2022

asim ullah (asimullah@uop.edu.pk) is doctoral candidate, department of computer science, university of peshawar. shah khusro (khusro@uop.edu.pk) is professor, department of computer science, university of peshawar. corresponding author irfan ullah (irfan@sbbu.edu.pk) is assistant professor, department of computer science, shaheed benazir bhutto university, sheringal. © 2022.

abstract
library and information science is still in its infancy in pakistan, primarily in resource management, description, discovery, and access. the reasons are many, including librarians’ lack of interest in and use of modern tools, techniques, and best practices. finding a solution to these challenges requires a comprehensive study that identifies the current state of libraries in pakistan. this paper fills this gap in the literature by reviewing the relevant literature published between 2015 and 2021 and selected through a rigorous search and selection methodology. it also analyzes the websites of 82 libraries in pakistan through a theoretical framework based on various aspects. the findings of this study include the following: libraries in pakistan need a transition from traditional and limited solutions to more advanced information and communication technology (ict)-enabled, user-friendly, and state-of-the-art systems to produce a dynamic, consumable, and sharable knowledge space. they must adopt social semantic cataloging to bring all the stakeholders onto a single platform. a library consortium should be developed to link users to local, multilingual, and multicultural collections for improved knowledge production, recording, sharing, acquisition, and dissemination.
these findings benefit pakistani libraries, librarians, information science professionals, and researchers in other developing countries. to the best of our knowledge, this is the first study of its kind providing insights into the current state of libraries in pakistan through the study of their websites using a rigorous theoretical framework and in the light of the latest relevant literature.

introduction
with the inception of the web, library and information science (lis) professionals and researchers have solved several major challenges and issues regarding resource description, discovery, and access. yet, many new problems arise in the practices and services delivered by libraries if they are not in line with emerging technologies and standards. these problems are promptly addressed by the libraries and their lis professionals using cutting-edge technologies, sufficient training, and the availability of the required resources. this practice keeps these libraries functional and acceptable among their users, especially in developed countries. on the other hand, in less developed and developing countries, libraries are losing their importance, which may be due to the adherence of these libraries to outdated lis approaches. pakistan is one of the developing countries where this is often observed. but, before devising a solution to regain their value, importance, and acceptance, it is essential to identify the current state of libraries in pakistan. to address this need, this paper reviews and summarizes the findings of the well-reputed published literature regarding libraries in pakistan and collects and analyzes important details from library websites.
this study is inspired by two review articles that considered different aspects of lis research.1 the most similar is the article from noh and chang, who analyzed lis practices by reviewing relevant literature regarding libraries in korea from 1970 to 2018.2 however, to the best of our knowledge, we found no holistic, systematic literature review covering the current state of library management practices in pakistan and highlighting its key challenges, issues, and research opportunities. similarly, ganaee and rafiq studied the current state and features of the websites of the 85 academic libraries of pakistan via surveys and interviews to identify their issues and problems.3 the websites were analyzed for contrasting color schemes, readable text, minimal use of horizontal scrolling, language, staff details, opacs, navigation, and other details of the information architecture. inspired by ganaee and rafiq, this study contributes a theoretical evaluation framework to study the current state of libraries by analyzing their websites. it comprises several aspects and criteria, including the availability of general information and information about resources and collections, the use of web 2.0 tools, the design of the website, the offering of web-based services, the use of instruction tools, and the application of accessibility guidelines for supporting individuals with visual and other impairments. the paper extends the findings and implications of the aforementioned research by highlighting the current state of library management practices in the libraries of pakistan, the challenges and issues those libraries face, and the research opportunities that lie ahead of them in the realm of modern digital technologies.
the paper provides a systematic literature review of the relevant literature on the libraries of pakistan and devises a theoretical framework to collect and analyze data by visiting the websites of the selected 82 libraries of pakistan that have an online presence.4 the study has implications for researchers and lis professionals in pakistan and those of developing countries coping with similar challenges and issues. the first section of this paper presents the methodology for selecting relevant literature by adopting the well-known prisma framework.5 the second section presents a summary of key findings. the third section presents a discussion and analysis. the last section concludes the paper, followed by endnotes and an appendix holding data about the selected 82 websites of the libraries of pakistan.

methodology
this section discusses the literature search and selection strategy and the theoretical evaluation framework used to study the websites of the selected 82 libraries of pakistan.

the literature search and selection strategy
this section discusses the search and selection process for collecting the relevant literature using google scholar. google scholar indexes more than 389 million records and has the highest coverage of knowledge and research areas.6 we developed rigorous search and selection criteria by adopting the prisma methodology for gathering the relevant scholarly literature.7 the prisma methodology is a systematic literature review approach, ensuring transparent and complete reporting on selecting relevant literature in a given course of inquiry.8 it tracks a full record of how the relevant literature was selected. it visualizes the details in a prisma flow diagram,9 shown in figure 1. the first step in applying prisma and following this diagram is to develop a search framework consisting of keywords or search queries that maximize the coverage and accuracy of finding relevant studies.
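the bookkeeping behind a prisma flow can be checked with a few lines of code. the sketch below is illustrative only: the function name `prisma_flow` and the dictionary keys are assumptions of this example, while the counts are the ones this study reports in table 1 and figure 1.

```python
# Per-query counts as reported in table 1 of this study (13 search queries).
records_matched = [963, 645, 355, 258, 240, 109, 93, 75, 76, 68, 30, 29, 17]
relevant_by_snippet = [15, 48, 26, 71, 11, 3, 0, 1, 5, 8, 0, 4, 2]
duplicates = [0, 2, 15, 39, 11, 0, 0, 1, 5, 7, 0, 3, 1]


def prisma_flow(matched, relevant, dupes, excluded_on_skim, excluded_on_reading):
    """Recompute the record counts at each stage of the selection.

    Hypothetical helper: it only adds and subtracts the reported counts.
    """
    identified = sum(matched)
    total_duplicates = sum(dupes)
    to_screen = sum(relevant) - total_duplicates
    after_skim = to_screen - excluded_on_skim
    included = after_skim - excluded_on_reading
    return {
        "identified": identified,
        "after_duplicates_removed": identified - total_duplicates,
        "screened_by_title_and_abstract": to_screen,
        "read_in_depth": after_skim,
        "included": included,
    }


# 26 records were excluded after skimming and 3 after in-depth reading.
flow = prisma_flow(records_matched, relevant_by_snippet, duplicates, 26, 3)
for stage, count in flow.items():
    print(f"{stage}: {count}")
```

recomputing the totals this way is a quick consistency check between a search-framework table and the numbers shown in the flow diagram.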
the search framework for this study was developed by following ullah and khusro and liberati et al.10 table 1 summarizes the search framework. for each search query it gives the total number of records matched and the number of relevant records identified by reading the title and text snippet of each item in the search results list. the duplicates that appear after entering the next search query are recorded in the “duplicates identified” column and are removed from the count of relevant records matched against the given search query. the net results are reported in the final column to be further screened by title, abstract, and other details on the publisher’s website. inclusion and exclusion criteria narrow the selection further so that only relevant items are included and the irrelevant ones are filtered out. using this search framework, our inclusion criteria selected the following publications:
• publications that discuss computer and web-based software solutions regarding resource acquisition, description (cataloging), discovery, and access inside the library or libraries of pakistan.
• publications that highlight the use and the adaptation of technologies, especially modern cataloging practices, the use of semantic web, and linked open data (lod) in the libraries of pakistan.
• publications that highlight issues and challenges faced by pakistani libraries to become part of the global library community and learn from their best practices in terms of software and related technologies.
• publications in the english language with pakistani context and published during 2015–2021.
the exclusion criteria to remove items from the list included the following:
• publications published before 2015 or written in languages other than english.
• publications that are of low academic significance with low-quality publication venues. examples include papers having incomplete details or those published in non-peer-reviewed journals and conferences.
• theses, dissertations, surveys, review articles, patents, and citations.

table 1. the search framework: keywords and criteria for finding relevant publications

| s. no. | search query | records matched | relevant records by title & text snippet | duplicates identified | net items to be screened by title & abstract |
|---|---|---|---|---|---|
| 1 | “library science”, “information science”, “lis”, “libraries”, “pakistan” | 963 | 15 | 0 | 15 |
| 2 | “academic libraries”, “university libraries”, “digital libraries”, “pakistan” | 645 | 48 | 2 | 46 |
| 3 | “library staff”, “training”, “resources”, “library automation”, “libraries”, “pakistan” | 355 | 26 | 15 | 11 |
| 4 | “libraries”, “university libraries”, “hec”, “digital library”, “pakistan” | 258 | 71 | 39 | 32 |
| 5 | “collection management”, “collection development”, “libraries”, “pakistan” | 240 | 11 | 11 | 0 |
| 6 | “design”, “accessibility”, “usability”, “responsiveness”, “websites”, “libraries”, “pakistan” | 109 | 3 | 0 | 3 |
| 7 | “social networking”, “social web”, “libraries”, “facebook”, “twitter”, “youtube”, “pakistan” | 93 | 0 | 0 | 0 |
| 8 | “services”, “web 2.0”, “rating”, “review”, “comment”, “libraries”, “pakistan” | 75 | 1 | 1 | 0 |
| 9 | “library automation”, “computerization”, “library software”, “libraries”, “pakistan” | 76 | 5 | 5 | 0 |
| 10 | “automation”, “integrated library systems”, “library software”, “pakistan” | 68 | 8 | 7 | 1 |
| 11 | “azad jammu and kashmir”, “punjab”, “sindh”, “khyber pakhtunkhwa”, “balochistan”, “gilgit”, “libraries”, “pakistan” | 30 | 0 | 0 | 0 |
| 12 | “digitization”, “digital skills”, “digital competencies”, “libraries”, “pakistan” | 29 | 4 | 3 | 1 |
| 13 | “book selection”, “acquisition”, “classification”, “cataloging”, “libraries”, “pakistan” | 17 | 2 | 1 | 1 |
| total | | 2958 | 194 | 84 | 110 |

figure 1 visualizes the search process using the well-known prisma diagram.11 google scholar retrieved 2,958 records. the search queries returned 84 duplicate records, which were identified and removed, leaving 2,874 records for initial screening. after an initial screening using title and text snippets, we identified 110 records as relevant, setting aside the remaining 2,764 records. these 110 records were then accessed by visiting their publishers’ websites to read their title, abstract, and other details. the full texts of these papers were obtained. after skimming the full text of these records and applying the inclusion/exclusion criteria, 26 were excluded, leaving 84 publications for in-depth reading and analysis. an in-depth reading of these 84 articles and application of the inclusion/exclusion criteria identified a further 3 articles as irrelevant, leaving 81 articles to be included in the analysis and discussion.

figure 1. prisma diagram regarding the selection of relevant publications.

the evaluation framework
the theoretical evaluation framework used to collect relevant data from the selected websites is shown in table 2. it summarizes the purpose of each criterion and its possible values using abbreviations.

table 2. the evaluation framework for libraries: criteria and their descriptions

| s. no. | criteria | explanation |
|---|---|---|
| 1 | s. no. | the serial no. of each record in table a-1 of appendix a: details of libraries |
| 2 | library name | purpose: the name of the library. values: library name |
| 3 | url | purpose: the url of the library. values: url |
| 4 | library website design12 | purpose: whether the website design is kept user-centered and accessible to blind and visually impaired people. values: language clarity (lc: yes/no); presentation clarity (pc: yes/no); support for special people (sp: yes/no); logical structure (ls: yes/no); responsive web design (rwd: yes/no); multilinguality (mlw) of web pages (yes/no) |
| 5 | general information13 | purpose: general information available on the website regarding content. values: copyright statement (c); resources and services (rs); mission/goals/objectives (g); news/events (ne); contact details (cn); frequently asked questions (faq); last updated (lu); map/directions to the library (mp); calendar (cl); virtual tour (vt); policies (p); word cloud (wc); opening hours details (oh); not available (na) |
| 6 | web 2.0 tools14 | purpose: the purpose of web 2.0 tools is to connect the library users and get updates from the library management about different contents demanded or needed by the library users. users can share and comment on the library holdings in their friends’ circle through these social networking applications. this criterion is set for analyzing whether social networking applications are used in pakistani libraries or not and which social networking tool is mostly used. values: facebook (fb); flickr (fr); twitter (t); rss (r); social bookmarking (s); instagram (i); blogs (b); wikis (w); youtube (yt); pinterest (pi); not available (na) |
| 7 | web-based library services15 | purpose: the services offered by the library on the web. it has subcolumns including search, browsing, and other. search, values: opac; author (at); title (tt); subject (su); keyword (ke); and advanced search (as). browsing, values: author (at); title (tt); subject (su); category (ca); keyword (ke). other, values: ask a librarian (al); email (em); loan (ln); awareness (aw); newsletter (nw); delivery (de); sms; ready reference questions (rq); chat (ch); library exhibits (lx); feedback (fb); reserving computers (rc); council services (cs); smartphone-based services (sp); not available (na) |
| 8 | resources and collections16 | purpose: this criterion aims to analyze the nature, variety, and types of the resources that are mostly available in pakistani libraries. values: opac; bibliographic databases (bd); full-text databases (ft); journals (j); books (b); audiobooks (ab); magazines (mg); online reference sources (or); opac of other libraries (opac-o); multimedia collections (mc); other (o); special collections (sc); multilinguality (mlr) of resources; not available (na); information of physical resources (ph) |
| 9 | instructional tools17 | purpose: tools to guide users in searching, browsing, and other services. values: research guides (rg); subject guides/pathfinders (sg); opac search tips (tips); information literacy program (infl); citation guides (cg); online tutorials (ot); user groups (ug); plagiarism guides (pg); webinars (wb); not available (na) |
| 10 | accessibility guidelines18 | purpose: whether the website and library follow the accessibility guidelines. values: yes/no |

summary of key observations
the lis practices in pakistan’s libraries are gradually shifting from manual to digital. however, they are still far from meeting the latest international practices of resource management, acquisition, cataloging, classification, circulation, discovery, access, and accessibility for people with disabilities, including those with visual impairments. this section has a twofold objective.
first, it reviews the latest literature regarding the current state of lis practices in the libraries of pakistan to identify the challenges and issues being faced and future research opportunities. second, it extends these findings by evaluating the websites of the selected 82 libraries for a clearer picture of the current state of these chosen libraries.

lis practices in the light of published literature

this section discusses lis practices in the libraries of pakistan with details from the published literature. the following subsections briefly discuss these practices.

collection development and management

books are given greater importance as the main holdings in the libraries of pakistan. currently, printed books are selected in the conventional manual manner. book selection tools include suppliers’ lists, publishers’ catalogs, book fairs/visits to book shops, book reviews, recommendations from the readers, selection committees, suggestion registers, and publishers/suppliers’ desk copies. the requested books are supplied to the libraries. librarians check these books physically and verify their accuracy. if a book is damaged or missing, it is reported to the vendor so that new copies can be arranged. online or electronic book selection and procurement from national and international book vendors is rare. purchasing softcover books in batches is also very rare. these aspects have been discussed in several research publications by lis professionals and researchers of pakistan.
one prominent reason is the lack of a sufficient budget and of a clear, standard resource acquisition and management policy.19 the following are some of the notable challenges and issues that appeared in the published literature:
• development of a quality collection.20
• lack of formal policies and guidelines for collection selection, acquisition, and related activities.21
• lack of electronic resources22 and challenges in their subscription and off-campus access.23
• inadequate collections and the resulting limited use of resources.24
• financial constraints.25
• lack of formal policies and procedures for collection development and management, including selection, acquisition, digitization, and access.26
• lack of proper library communities and the coordination among them for collection development and management.27
• failure to fulfill the user information needs.28

researchers have made some recommendations (that could also be treated as research opportunities) to address these challenges: the libraries need to meet user needs and keep pace in disseminating current and updated scientific knowledge and new insights in the literature to achieve excellence in service delivery.29 the factors affecting lis practices in the academic libraries of pakistan include collection development goals, management policies and procedures, user requirements, budget, and evaluation.30 the user information needs should be considered to the fullest, and a user-centric approach should be developed to improve content selection.31 librarians should understand the use of linked and open data (lod) for creating standard metadata records for information resources management in libraries.32 in this regard, the librarians should consider the major challenges, including the lack of technical expertise, awareness of the latest tools and technologies, the complexity of technologies, non-availability of vocabularies, and legal issues.33 the librarians must consider the research community’s limited demand and use of e-resources in academic activities.34 there is a significant relationship between the digital resources database and the development of academic research for generating innovative ideas and improving researchers’ cognitive abilities.35 therefore, libraries must be well aware of maintaining sufficient and up-to-date resources. social networking sites should be considered for knowledge management practices among the employees in public and private universities.36 effective policies should be developed to increase the researchers’ satisfaction and research productivity.37

resource description, discovery, and access

for resource description and access, most libraries in pakistan use online public access catalogs (opacs). the use of specialized software, such as dspace and e-prints, for developing and using institutional repositories and digital libraries is rare. libraries still rely on conventional, manual or partially computerized, slow, and old methods of records management and are limited to opac-based search and retrieval. they are less aware of and familiar with the modern best practices of using lod for resource description, sharing, and access.
lod is unproven and new to the libraries of pakistan for several reasons, including the complexity in deployment and usage and the constraints on financial and human resources.38 some of the notable challenges and issues that appeared in the published literature include the following:
• lack of or limited searching and access to resources39 and their sharing.40
• lack of synchronous or digital reference services41 and the poor availability of virtual reference services.42
• lack of a search and retrieval solution for multilingual resources written in pashtu, arabic, and urdu.43
• limited or no use of big data analytics to improve acquisition, preservation, curation, and data analysis.44
• insufficient information on the websites regarding their libraries and lack of communication support for end users.45
• less frequent use of web 2.0, website aid tools, and limited information about their libraries.46
• the smaller size of the library website and the lack of aids including site index, frequently asked questions, and user guides about its use.47
• the lack of awareness, best practices, it staff, and the complexity in implementing lod in resource description, discovery, sharing, and access.48

these challenges can be addressed if the recommendations of the researchers are considered. some of these recommendations, which also serve as research opportunities, include: the library management practices should consider using and exploiting ontologies and lod to develop more rigorous classification systems for improved resource description, discovery, and access.49 strategic planning and policies are essential for incorporating ict in the libraries of pakistan, with emphasis on resource description, discovery, access, and sharing through web-based services.50 besides the library’s reference desk and e-mail service, online instant messaging and search engine tools must be used for virtual reference service (vrs) in libraries.
a proper set of written policies and standard operating procedures (sops) for vrs must be introduced.51 the collaboration and sharing of experiences and skills for deploying lod is also vital.52 through lod, the libraries of pakistan can be linked to other global libraries to promote our indigenous literature on the web.53 it is challenging to migrate data from text-based and marc catalogs to linked data formats. in addition, the recognition and provision of uris is challenging. synchronizing terminologies with linked data technology and minimizing its complexity is also challenging. the conversion of marc 21 records to resource description framework (rdf) is onerous.54 a list of the bibliographic databases should be provided on the library website with instructions for their usage, and relevant content should be made accessible discipline-wise through an authenticated login.55 services like “ask a librarian,” search, searching via barcode scanners, and maintaining a rich database should be considered by each library through their online and mobile phone interfaces.56 in developing smartphone-based library applications, it is essential to consider service quality, affinity, usefulness, ease of use, satisfaction, confirmation, and continuous usage.57 the information architecture of libraries’ websites should be analyzed from the perspective of their users, and their navigation system should be improved and adapted accordingly.58 usefulness and cost are the most influential factors that should be considered while adopting library software such as koha.59 the design and quality of the contents and services of the library website are important.60 the use of digital library resources positively impacts research productivity and should be considered.61

adherence to new standards, practices, and technologies

the lack of interest from library staff in adopting and adhering to new standards and technologies is another issue that cannot be ignored. another reason for this non-adherence could be the lack of knowledge by upper management and failure to understand the modern-day needs of library users. however, some developments are gaining momentum. for example, several libraries offer web-based services.62 in some scenarios, university students use the social web to access and share resources.63 the pakistan scientific and technological information center (pastic) is developing a searchable database of indigenous collections64 supporting smartphone-based search and access.65 pastic is also creating a consortium-level public access catalog of the scientific periodicals produced by the authors of pakistan.66 the aga khan university has developed an integrated resource management system for connecting different, geographically dispersed libraries of various campuses in pakistan.67 access to digital libraries through the higher education commission of pakistan (hec) digital library, a library management system, and e-document delivery are some of the notable innovations in the lis domain of pakistan.68 there are 122 public universities, 95 private universities, and more than 600 non-degree-awarding institutions with hec-dl access.69

the lis practices in pakistani libraries mostly suffer from the lack of professional training,70 awareness of the latest library standards and technologies,71 technological and it proficiency,72 policies for library processes and ict,73 knowledge regarding lod technologies,74 engagement with digitization activities,75 resource sharing and collaboration,76 sufficient financial resources,77 and a supportive and assistive atmosphere for persons with special needs,78 as well as from issues regarding archiving, cataloging, and disseminating local and indigenous literature and artifacts.79 a library must find ways of adopting new tools, standards, technologies, and necessary training to support users in resource management, discovery, and access. hec pakistan maintains one such library, offering the universities of pakistan and their scholars free access, including off-campus online access, to research publications and periodicals.80 however, most university library users are not fully satisfied with collection development, and a major part of the literature is still not accessible.81 besides, as discussed, pastic is playing an active role in developing a library consortium and a searchable database/catalog of the indigenous collections of pakistan. several university librarians have adopted knowledge management practices to deliver and improve their library services efficiently.82 apart from these few initiatives, the research and development of lis practices in the libraries of pakistan have been minimal and need significant attention.
some of the notable challenges and issues that appeared in the published literature include:
• librarians have limited or outdated knowledge regarding research data management.83
• inappropriate infrastructure.84
• limited or no use of ict, knowledge, and expertise in the use of computers, internet connectivity issues, and inadequate computer labs.85
• training and leadership.86
• lack of supporting it staff.87
• lack or limited use of human resource management88 and leadership.89
• financial constraints.90
• lack of dynamic websites for the libraries.91
• lack of tools and standard library software.92
• the very basic level of digital competencies for developing, managing, and protecting digital libraries in universities of pakistan.93
• lack of uniformity and standard features in library websites.94
• less frequent use of web 2.0, website aid tools, and limited information about their libraries.95
• the smaller size of the library website and the lack of aids including site index, frequently asked questions, and user guides about its use.96
• the relative infant stage of information commons (information technology infrastructure, services, and resources).97
• negligible willingness and interest in research data management.98
• reluctance in sharing research data99 and weak and informal collaboration on research.100

some recommendations (that could also be treated as research opportunities) made by researchers include: services, including electronic services, librarian’s end services, and technical knowledge services, should be improved in the special libraries of pakistan.101 it is essential to understand the need to deploy and use library software such as koha, dspace, e-prints, and evergreen.102 human resource management, especially effective leadership with a broader vision, boldness, charismatic personality, and knowledge dissemination abilities, is required to lead staff and manage their social relationships.103 as an information manager of the library, a librarian must be fully aware of web 3.0, the semantic web, and artificial intelligence (ai) tools to become expert in the digital landscape.104 web 2.0 tools and social networking sites should be used in marketing and advertising the library services to the end users.105 the cataloging paradigms should incorporate social collaborative cataloging metadata.106 artificial intelligence tools and services should be considered where lis professionals can collaborate and join hands with computer science professionals to develop libraries.107 academic libraries’ performance can be improved by using big data tools and analytics.108 quality enhancement and industrial affiliation are important for increasing the quality and quantity of research in academia.109 the digital library, institutional repository software, bibliographic databases, e-journals searching, and referencing tools are very important for increasing the research production of the public sector universities.110 the competencies of ict skills, education in copyright laws and intellectual property, using digital and physical learning resources, and collection development must be improved.111 hec must provide funds for information commons projects for significant benefits to library users.112

lis practices in the light of the studied websites

this section attempts to highlight the current state of the libraries in pakistan through data and observations collected from their websites. reviewing a library’s website reveals several aspects of its current state. table a-1 in appendix a summarizes the collected data obtained through the evaluation framework discussed in the methodology section and summarized in table 2. for example, a library with a website that is not user-centered and accessible to people with visual impairments, a criterion outlined as item 4 in table 2, may face issues with supporting it staff, lack of expertise, and budget constraints. a library that is unable to offer web-based services cannot meet the needs of a major portion of its users interested in accessing content and services online. a similar impact is connected to each of the remaining criteria of the evaluation framework. the lack of certain pieces of information on the library website affects its users negatively and may discourage them from using it. it is notable that most of the libraries of pakistan have no websites at all, which makes it challenging to discuss their strengths and limitations.
as shown in figure 2, only 36% (of the selected 82 websites) of the libraries listed on the hec website have websites, leaving 64% with no online presence. this also makes it challenging to draw a clearer picture of the current state of libraries of pakistan and, therefore, the statistics presented here give only a rough estimate of the exact details.

figure 2. percentage of libraries in pakistan with and without websites.

figure 3 shows the statistics concerning the appearance and design of library websites in pakistan, which are improving in language and presentation clarity, logical structure, responsive web design, and access to the hec digital library. these websites need improvement in providing accessibility tools for people with disabilities, meeting accessibility guidelines, and incorporating multilingual support.

figure 3. library website design, accessibility, and access to hec digital library (lc: language clarity; pc: presentation clarity; sp: support for special people; ls: logical structure; rwd: responsive web design; mlw: multilinguality of web pages; accessibility guidelines; and hec dl access).

figure 4 shows that most of the libraries (63 out of 82: 76.8%) offer general information on their websites.
the most prominent among these include contact details (50 out of 82: 61%), copyright statement (47 out of 82: 57.3%), and library operating hours (46 out of 82: 56.1%), followed by resources (27 out of 82: 32.9%), news/events (25 out of 82: 30.5%), mission/goals/objectives (24 out of 82: 29.3%), maps/directions to the library building (19 out of 82: 23.2%), policies (18 out of 82: 22%), frequently asked questions (16 out of 82: 19.5%), and last update (12 out of 82: 14.6%). the virtual tour, calendar, and word cloud are the least provided, as shown. finally, a considerable number of libraries (19 out of 82: 23.2%) lack most of the general information.

figure 4. number of library websites that offer general information to their users about cn: contact details; c: copyright; oh: opening hours details; rs: resources; ne: news/events; g: mission/goals/objectives; mp: map/directions to the library; p: policies; faq: frequently asked questions; lu: last update; vt: virtual tour; wc: word cloud; cl: calendar; and na: not available.

figure 5 shows the details of the libraries that allow sharing their contents or communicating with their users using web 2.0 tools and social media. most of the libraries (53 out of 82: 64.6%) are not connected with their users through social networking. most of the libraries that exploit web 2.0 tools use facebook (26 out of 82: 31.7%), followed by twitter (22 out of 82: 26.8%), youtube (9 out of 82: 11%), instagram (8 out of 82: 9.8%), and rss (5 out of 82: 6.1%).

figure 5. number of library websites that provide social networking through fb: facebook; t: twitter; yt: youtube; i: instagram; r: rss; b: blog; s: social bookmarking; w: wikis; pi: pinterest; fr: flickr; na: not available.
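
the counts and percentages reported alongside figures 4 and 5 can be rechecked with a few lines of code. the following is a minimal python sketch under stated assumptions, not the authors’ procedure: the helper names `share` and `tally_codes` are hypothetical, and the sample cells are only the web 2.0 entries of rows 1, 2, 4, 5, and 7 of table a-1, used as an illustrative subset rather than the full data set.

```python
from collections import Counter

TOTAL_LIBRARIES = 82  # sample size reported in the study


def share(count, total=TOTAL_LIBRARIES):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


def tally_codes(cells):
    """Count how many rows mention each code in comma-separated table cells."""
    counts = Counter()
    for cell in cells:
        # a set keeps a code from being counted twice within one row
        codes = {code.strip().lower() for code in cell.split(",") if code.strip()}
        counts.update(codes)
    return counts


# web 2.0 cells for rows 1, 2, 4, 5, and 7 of table a-1 (illustrative subset only)
sample_cells = ["na", "fb, t, i", "fb, t, i, yt", "fb, t", "fb, t, i, fr"]
sample_counts = tally_codes(sample_cells)

for code, count in sample_counts.most_common():
    print(f"{code}: {count} of {len(sample_cells)} sample rows")

# counts stated in the text for figure 5, converted to percentages of 82
for label, count in [("no social networking", 53), ("facebook", 26), ("twitter", 22)]:
    print(f"{label}: {count} out of {TOTAL_LIBRARIES}: {share(count)}%")
```

keeping the raw counts and deriving the percentages from them in one place makes it easier to spot a percentage that does not match its count.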
figure 6 shows the statistics for the instructional tools used by different websites of libraries in pakistan. these tools are for new visitors or anyone who requires instruction in navigation, search, and access to the contents of the library’s website. most of the libraries (67 out of 82: 81.7%) do not offer instructional tools on their websites. only a few (15 out of 82: 18.3%) provide instructional tools in one form or another. these include information literacy programs (10 out of 82: 12.2%), citation guides (7 out of 82: 8.5%), research guides (6 out of 82: 7.3%), subject guides/pathfinders (4 out of 82: 4.9%), tutorials (3 out of 82: 3.7%), opac search tips (3 out of 82: 3.7%), webinars (2 out of 82: 2.4%), plagiarism guides (2 out of 82: 2.4%), and user groups (1 out of 82: 1.2%).

figure 6. number of library websites that provide the instructional tools of infl: information literacy program; cg: citation guides; rg: research guides; sg: subject guides/pathfinders; ot: online tutorials; tips: opac search tips; wb: webinars; pg: plagiarism guides; ug: user groups; na: not available.

figure 7 shows the statistics about searching as part of web-based services provided by different libraries on their websites. most libraries (53 out of 82: 64.6%) offer search, most commonly by keyword (44 out of 82: 53.7%), followed by title (42 out of 82: 51.2%), advanced search (39 out of 82: 47.6%), author (38 out of 82: 46.3%), subject (36 out of 82: 43.9%), and opac (5 out of 82: 6.1%). a considerable number of libraries (29 out of 82: 35.4%) have no search functionality.

figure 7. number of libraries offering web-based searching services through at: author; tt: title; su: subject; ke: keyword; as: advanced search; opac; na: not available.

figure 8 shows that most libraries’ websites (53 out of 82: 64.6%) offer browsing using different options and filters. most libraries allow browsing through categories (42 out of 82: 51.2%), followed by title (40 out of 82: 48.8%), author (38 out of 82: 46.3%), subject (36 out of 82: 43.9%), and keyword (28 out of 82: 34.1%). several libraries (29 out of 82: 35.4%) offer no such browsing functionalities.

figure 8. number of libraries offering web-based browsing service parameters including ca: category; tt: title; at: author; su: subject; ke: keyword; na: not available.

figure 9 shows the statistics for web-based services offered by libraries other than search and browsing, which are depicted separately in figures 7 and 8, respectively. most libraries (63 out of 82: 76.8%) do not offer these services on their websites. only a few of them (19 out of 82: 23.2%) offer services such as ask a librarian (14 out of 82: 17.1%), followed by email and delivery (9 out of 82: 11% each), loan (6 out of 82: 7.3%), chat and ready reference questions (4 out of 82: 4.9% each), and spreading awareness among users (3 out of 82: 3.7%). the remaining services, such as newsletter, reserving computers for the users, council services, smartphone-based services, and short messaging service, are offered on almost none of the selected libraries’ websites.

figure 9. number of libraries offering other web-based library services that provide support for accessing and discovering any service or resource other than search and browsing. these services include al: ask a librarian; em: email; de: delivery; ln: loan; fb: feedback; ch: chat; rq: ready reference questions; aw: awareness; nw: newsletter; rc: reserving computers; lx: library exhibits; cs: council services; sp: smartphone-based services; sms; na: not available.

figure 10 shows the details offered by libraries about their resources and collections. a considerable number of these libraries (20 out of 82: 24.4%) provide no such information. most libraries (62 out of 82: 75.6%) give details about books (45 out of 82: 54.9%), followed by journals (39 out of 82: 47.6%), bibliographic databases (37 out of 82: 45.1%), opac (17 out of 82: 20.7%), full-text databases (10 out of 82: 12.2%), magazines (9 out of 82: 11%), physical books (7 out of 82: 8.5%), online reference sources (6 out of 82: 7.3%), opac of other libraries (3 out of 82: 3.7%), audiobooks (2 out of 82: 2.4%), and multimedia collections (1 out of 82: 1.2%).

figure 10. number of libraries offering resources and collections including b: books; j: journals; bd: bibliographic databases; opac; o: other; ft: full-text databases; mg: magazines; ph: physical books; or: online reference sources; opac-o: opac of other libraries; ab: audiobooks; mc: multimedia collections; sc: special collection; mlr: multilinguality of resources; na: not available.

discussion and analysis

the study of the websites of the 82 libraries of pakistan reveals that the majority are not technically sound and cannot assist and offer services to their users, including people with visual or physical impairments. the key observations made in the previous section emphasize the need for the libraries of pakistan to transform their practices from manual to automated and web-based services.
this can be achieved through collaborative research and development efforts from several domains, including computer science, lis, human-computer interaction, ai, the semantic web, and lod. there are several examples of library consortia that enable collaborative efforts to make catalogs, websites, and activities available and accessible from a single platform.113 these include the online computer library center (oclc), the international coalition of library consortia (icolc), hathitrust digital library, the arxiv e-print archive, google books, and shared print storage.114 in pakistan, pastic made the first effort to develop such a consortium115 to allow access to the holdings of the libraries of pakistan by combining their opacs. it offers a searchable database of the collections and enables resource sharing among all the member libraries.116 however, its successful implementation in pakistan requires a willingness among all the libraries of pakistan to share data, interact professionally, and benefit from modern technologies. the consortium should be supported with the best practices from information retrieval and semantic web technologies to offer better search and retrieval functionalities. users should be made part of the resource description so that the idea of social semantic cataloging117 can be realized, where users can discuss their information needs, recommend books and resources, and enrich the catalog with user-generated content. artificial intelligence and deep learning algorithms should be exploited in book recommendations so that the available professional metadata and user-generated content can be used to the fullest in serving the users’ information needs. the resulting rich metadata should be made available and consumable as lod to benefit other potential applications.
this will enable the libraries to meet the complex information needs of the users, who describe them in natural language. natural language is ambiguous, and resources described through user-generated content, produced by users in the same language, will better support the search and recommendation of books.118 this will improve the resource description, discovery, and access services of the libraries of pakistan to a greater extent.

figure 3 depicts another significant limitation of the websites of the libraries of pakistan: extremely limited availability of navigational, retrieval, and visualization aids for people with visual impairments. most of the libraries’ websites have no provision for accessibility mechanisms. this is unfortunate, as in 2017 it was reported that 21.78 million people were affected by blindness and vision impairment.119 although several technological aids have been devised for performing daily life activities, including navigation, orientation, localization, and obstacle detection,120 the majority of the libraries of pakistan lack accessibility-related solutions for those who are blind or have a visual impairment. holdings should be enriched with audio and braille books and supplemented with an ict-based accessibility solution. the library building should accommodate visitors with diverse needs. information about accessibility should be shared as part of the general information on the library’s website. in this regard, all the stakeholders of the libraries, including government and non-government organizations, educational institutions, and lis professionals, should be involved to work collaboratively on an effective accessibility solution for all library users.121

smartphones have been among the top trends in pakistan, especially for college and university students who use them most frequently.
according to the infographic by grappetite, 77% of smartphone users are between 21 and 30, and 12% are aged 31 to 40 years.122 these statistics suggest that people in these two age groups are the most likely users of libraries, as they usually need a variety of books. according to statista, smartphone ownership in pakistan has increased from 10% in 2014 to 51% in 2020.123 according to the pakistan telecommunication authority, there are currently 191 million cellular/mobile phone subscribers and 110 million 3g/4g subscribers.124 these statistics suggest that libraries should also benefit from incorporating smartphones. the most prominent opportunities are developing smartphone apps that let users learn about a library’s collection via the web and producing an interactive user interface that helps them find answers to several of their questions regarding library services. the library opacs can be made usable and accessible through mobile web applications. there are several prospects and opportunities for using smartphones to make library space accessible to people with disabilities. a smartphone application can be developed to assist readers with navigation, localization, and finding items of interest in the library.

conclusions

this study aims to provide a holistic view of the current state of libraries in pakistan in the light of the most relevant and recent research works from lis professionals and researchers. it also attempts to identify some of the major challenges, issues, and research opportunities regarding the current state of lis practices in libraries of pakistan in comparison with that of technologically advanced countries.
the study suggests a need for increasing technology proficiency, adaptability of the latest technologies, proper legislation for lis practices that meet international standards, improvements in collection development, and efforts to meet library users’ needs. the libraries of pakistan need a transition from traditional and limited solutions to a more advanced, ict-enabled, user-friendly, and state-of-the-art system to produce a dynamic, consumable, and sharable knowledge space. the libraries must adopt a social semantic cataloging environment to bring all stakeholders to a single platform. development of a library consortium is critical to connect our local, multilingual, and multicultural collections to users for improved knowledge production, recording, sharing, acquisition, and dissemination. we hope that lis professionals of pakistan and the rest of the world, in general, find this article supportive to their current and future studies. information technology and libraries september 2022 library management practices in the libraries of pakista 24 ullah, khusro, and ullah appendix a: details of libraries table a-1. the comparison and evaluation of libraries using the criteria in table 1. s. no. library name url library website design general information web 2.0 tools web-based library services resources / collections instructional tools accessibility guidelines hec dl access lc pc sp ls rwd mlw search browse other 1. central library university of peshawar http://www.uop.e du.pk/library/ ✓ ✓     c, g, ne, cn na na na na ph na  ✓ 2. brains institute peshawar http://www.brains .edu.pk/library-2/ ✓ ✓     mp fb, t, i na na na ph na  ✓ 3. library of edwardes college, the mall peshawar cantt https://www.edw ardes.edu.pk/libra ry ✓ ✓     c, oh na na na na na na  ✓ 4. the aga khan university library https://www.aku. edu/library/pages /home.aspx ✓ ✓  ✓ ✓  c, rs, g, ne, cn, faq, lu, mp, cl, vt, p, wc, oh fb, t, i, yt ke, as ca, ke na ph, o na  ✓ 5. 
air university central library https://www.au.edu.pk/pages/library/about_library.aspx ✓ ✓  ✓ ✓  c, rs, g, ne, cn, faq, lu, mp, cl, vt, p, wc, oh fb, t ke, tt, su, as ca na ph, b, bd, j na  ✓
6. the allama iqbal open university (aiou) http://library.aiou.edu.pk/ ✓ ✓  ✓ ✓  c, rs, g, ne, cn, faq, lu, oh, wc na ke, tt, su, as ca na ph, b, bd, j na  ✓
7. bahria university libraries https://bahria.edu.pk/libraries/ ✓ ✓  ✓ ✓  c, rs, g, ne, cn, p, oh fb, t, i, fr ke, tt, su, as ca na ph, b, bd, j na  ✓
8. library of balochistan university of engineering & technology, khuzdar http://www.buetk.edu.pk/?page_id=7368 ✓ ✓  ✓   g na na na na ph na  ✓
9. library of balochistan university of information technology, engineering & management sciences (buitems) https://www.buitms.edu.pk/library/defaulthecandbuitems.aspx       na na na na na na na  ✓
10. library of baqai medical university https://baqai.edu.pk/digitallibrary.php       na na na na na na na  ✓
11. library of barrett hodgson university https://www.bhu.edu.pk/home/tierlibrarybuilding       na na na na na na na  
12. library of beaconhouse national university https://www.bnu.edu.pk/bnu/facilities/library ✓ ✓  ✓   g, oh na na na na na na  ✓
13.
comsats university junaid zaidi library https://ciit.insigniails.com/library/home https://library.comsats.edu.pk/ ✓ ✓  ✓ ✓ ✓ c, rs, g, ne, cn, faq, lu, oh fb, t at, tt, su, ke, as, ss at, tt, su, ke, ca de, al b, o ot ✓ ✓
14. city university of science and information technology http://cusit.edu.pk/library/       na na na na na na na  ✓
15. library of fatima jinnah women university https://fjwu.edu.pk/library/ ✓ ✓     rs, g, ne, cn, lu, oh, p fb, t na na na na na  ✓
16. library of federal urdu university of arts, sciences & technology https://fuuast.edu.pk/library/       na na na na na na na  ✓
17. library of forman christian college http://library.fccollege.edu.pk/ ✓ ✓  ✓ ✓  c, rs, cn, faq, p, oh, mp r, t, fb at, tt, su, ke, as, ss at, tt, su, ca al, em, ln, aw, nw, de, rq, lx, fb, rc, cs opac, bd, ft, j, b, ab, mg, opac-o, mc rg, sg, tips, infl, cg, ot, ug, pg, wb  ✓
18. library of foundation university, islamabad http://fui.edu.pk/fui_main_site/index.php/campuslife/library      c na na na na na na  ✓
19.
library of gift university https://www.gift.edu.pk/page/library-overview ✓ ✓  ✓   c, rs, g, ne, cn, p, oh na tt, at, su ke ch, al, em, aw, de opac, bd, ft, j, b, or na  ✓
20. library of ghulam ishaq khan institute of engineering sciences & technology http://119.159.235.56:8085/forms/default.aspx ✓ ✓  ✓   c, oh na at, tt, su at, tt, su, ca na opac, bd, ft, j, b, ab na  ✓
21. library of gomal university http://clib.ddns.net/ ✓ ✓  ✓ ✓  p, wc na ke, at, tt, su, as ke, at, tt, su na opac, bd, ft, j, b na  ✓
22. library of government college university http://library.gcu.edu.pk/ ✓ ✓  ✓ ✓  c, rs, g, ne, cn, lu, mp, p, oh na at, tt, su, ke, as at, tt, su, ca al, em, ln, rq, fb opac, bd, ft, j, b, or rg, sg, tips, cg  ✓
23. government college university faisalabad https://library.gcuf.edu.pk/ ✓ ✓  ✓ ✓  g, cn, p na ke, tt, at, as su na opac, bd, ft, j na  ✓
24. library of government college for women university https://www.gcwus.edu.pk/library/ ✓ ✓  ✓ ✓  c, rs, p, cn, oh na as, ke, at, tt, su ca, ke, at, tt, su na bd, ft, j na  ✓
25. library of greenwich university https://www.gre.ac.uk/it-andlibrary/library ✓ ✓  ✓ ✓  c, rs, g, ne, cn, faq, lu, mp, cl, vt, p, wc, oh fb, t, i, yt as, at ca em, ln, de, rq, ch, fb, sp b, bd ot, wb ✓ 
26. library of hitec university http://111.68.98.204/libmax/opac/index.aspx ✓ ✓  ✓ ✓  c, rs, cn, faq, p, oh na ke ke na b, j, opac na  ✓
27. library of habib university https://habib.edu.pk/library/ ✓ ✓  ✓ ✓  oh, c, rs, cn, fb, t, i, yt ke ke al, em, ln, aw, nw, de, ch opac, j, ft, b, mg na  ✓
28.
library of hamdard university http://library.hamdard.edu.pk/ ✓ ✓  ✓ ✓  c, rs, cn, faq, p, oh r, t, fb tt, at, su, ke ca, ke al, de opac, bd, j, b infl, sg  ✓
29. panjab elibrary https://elibrary.punjab.gov.pk/ ✓ ✓  ✓ ✓  c, rs, cn, faq, g, ne, p, oh, mp fb, t, yt tt, at, su, as tt, at, su, ca fb opac, bd, mg, ft, b, j infl  ✓
30. library of ilma university https://ilmauniversity.edu.pk/digitallibrary ✓ ✓  ✓ ✓  c, mp na na na na bd, ft, j, mg, or na  ✓
31. library of iqra national university https://iqra.edu.pk/library/ ✓ ✓  ✓ ✓  c, oh, cn na na na na na na  ✓
32. library of international islamic university https://www.iiu.edu.pk/?page_id=171 ✓ ✓  ✓ ✓  c, ne, oh, cn, rs fb, t, at, tt, su at, tt, su fb o, bd, opac na  ✓
33. library of institute of space technology https://www.ist.edu.pk/library       na na na na na na na  ✓
34. library of institute of southern punjab https://isp.edu.pk/libraryitsupport       na na na na na na na  ✓
35. library of islamia university punjab http://library.iub.edu.pk/ ✓ ✓  ✓ ✓  oh na at, tt, su, ke, as at, tt, su, ke, ca rc, em, de opac na  ✓
36. library of isra university https://isra.edu.pk/library/     ✓  na na tt, su, at, as tt, su, at, ca na opac na  ✓
37. library of jinnah sindh medical university http://www.jsmu.edu.pk/faciltieslibrary.html       na na na na na na na  ✓
38. library of khyber medical university https://www.kmc.edu.pk/new/library/       na na na na na na na  ✓
39. library of king edward medical university https://kemu.edu.pk/library       g, oh na na na na na na  ✓
40.
library of lahore college for women university http://www.lcwu.edu.pk/lcwulibrary-researchwebsites.html ✓ ✓  ✓ ✓  g, rs, faq, p na na na na bd na  ✓
41. library of lahore university of management sciences https://library.lums.edu.pk/ ✓ ✓  ✓ ✓  ne, cn, vt, oh fb, i ke, at, tt ke, at, tt, su, ca al, ch bd, j, b infl, tips, rg  ✓
42. library of mehran university of engineering & technology http://library.muet.edu.pk/index.php ✓ ✓  ✓ ✓  c, ne, cn, oh fb, yt, t, i, r, b at, tt, su, ke, as at, tt, su, ca al, de, ln bd, j, b, opac, or infl, rg  ✓
43. library of minhaj university https://library.mul.edu.pk/ ✓ ✓  ✓ ✓  c, ne, mp, rs, cn, oh, g fb, t, yt at, tt, su, ke, as at, tt, su, ca ln, de b, j, or, bd, infl, pg, cg, rg  ✓
44. library of mirpur university of science & technology https://cms.must.edu.pk:8083/forms/default.aspx ✓ ✓  ✓ ✓  c, oh, cn na at, tt, su, as, ke at, tt, su, ke, ca na b na  ✓
45. library of mohammad ali jinnah university http://ils.jinnah.edu/ ✓ ✓  ✓ ✓  c, oh, cn na at, tt, su, ke, as at, tt, su, ca na b, j, bd na  ✓
46. engr. abul kalam library ned university of engineering & technology https://library.neduet.edu.pk/ ✓ ✓  ✓ ✓  c, cn na ke, au, tt ke, at, tt na b, j, mg, bd cg  ✓
47. library of namal institute, mainwali http://library.namal.edu.pk/ ✓ ✓  ✓ ✓  ne, cn, oh na at, tt, su, ke, as at, tt, su, ca na j, b, bd, mg na  ✓
48. library of national defense university http://111.68.99.107/libmax/opac/index.aspx ✓ ✓  ✓ ✓  c, cn, oh na ke, as ke, ca na b, j, mg na  ✓
49. library of national textile university http://ntu.edu.pk/library/ ✓ ✓  ✓ ✓  cn, oh, ne, faq fb tt, su, at, as tt, su, at, ca na b, bd, j, opac cg, infl  ✓
50. library of national university of sciences & technology http://www.nust.edu.pk/library/pages/default.aspx ✓ ✓  ✓ ✓  cn, mp, c, g, oh, vt, faq na at, tt, su, ke, as at, tt, su, ke, ca na b, bd, j, opac infl  ✓
51. library of peoples university of medical & health sciences for women http://opac.pumhs.edu.pk/ ✓ ✓  ✓ ✓  cn, mp, c, g, oh, vt, faq na at, tt, su, ke, as at, tt, su, ke, ca na b, bd, j, opac infl  ✓
52. library of shaheed benazir bhutto university sheringal dir upper pakistan http://142.54.178.188:5229/ ✓ ✓  ✓ ✓  na na at, tt, su, ke, as at, tt, su, ke, ca na b, bd, j, opac na  ✓
53. library of shaheed zulfikar ali bhutto institute of science & technology https://szabist.edu.pk/szabistlibrary/ ✓   ✓ ✓  cn, mp, c, g, oh, vt, ne, faq, p na ke, as at, tt, su, ke, ca al, em b, j na  ✓
54. library of sir syed case institute of technology https://case.edu.pk/library/default.aspx ✓ ✓  ✓ ✓  oh, cn fb, t tt, at, ke, su, as tt, ca na b, o na  ✓
55.
library of the islamia college, peshawar http://142.54.178.188:5209 ✓ ✓  ✓ ✓  na na na na na na na  ✓
56. library of university of balochistan http://web.uob.edu.pk/uob/departments/library/library.php ✓ ✓  ✓ ✓  cn, mp, c na ke ke na b na  ✓
57. library of the university of agriculture peshawar http://www.aup.edu.pk/library.php       na na na na na na na  ✓
58. library of university of buner https://www.ubuner.edu.pk/library ✓      oh, g, c na na na na na na  
59. library of university of central punjab http://library.ucp.edu.pk/ ✓ ✓  ✓ ✓  oh, g, c, rs, ne, cn, mp, vt fb tt, at, as tt, at, ca na b, mg, j, bd cg, infl  ✓
60. library of university of engineering & technology khyber pakhtunkhwa https://www.uetpeshawar.edu.pk/library.php ✓      na na na na na na na  ✓
61. library of university of engineering technology lahore http://library.uet.edu.pk/ ✓ ✓  ✓ ✓  na r at, tt, su, ke, as ke, ca, at, tt al b, j, bd na  ✓
62. library of university of engineering & technology, taxila https://www.uettaxila.edu.pk/library.aspx ✓ ✓  ✓ ✓  cn, rs, ne, oh, c t at, tt, su, ke, as at, tt, su, ke, ca al b, bd, j na  ✓
63. library of university of haripur http://www.uoh.edu.pk/centrallibrary.php?page=mjyx ✓ ✓  ✓ ✓  cn, rs, ne, oh, c na ke at, tt, su, ke, ca na b, bd, j, o na  ✓
64. library of university of karachi http://www.uok.edu.pk/library/index.php ✓ ✓  ✓ ✓  cn, rs, ne, oh, c, mp na ke at, tt, su, ke, ca na b, bd, j, o na  ✓
65.
library of university of management & technology https://library.umt.edu.pk/home.aspx ✓ ✓  ✓ ✓  cn, rs, ne, oh, c, mp fb, t at, tt, su, ke, as at, tt, su, ke, ca al, em b, bd, j na  ✓
66. online catalogue, central library, university of sargodha http://142.54.178.188:5157/ ✓ ✓  ✓ ✓  na na at, tt, su, ke, as at, tt, su, ke, ca na b, bd, j na  ✓
67. library of university of south asia https://library.usa.edu.pk/ ✓ ✓  ✓ ✓  cn, rs, lu, oh, c na at, tt, su, ke, as at, tt, su, ke, ca al, rq b, bd, j na  ✓
68. library of university of the punjab https://pulibrary.edu.pk/ ✓ ✓  ✓ ✓  cn, rs, oh, c fb tt, at, as, ke tt, ca al, ch, em bd, b, j, o, opac-o rg, sg, cg  ✓
69. library of zia-uddin university https://zu.edu.pk/academics/library/ ✓ ✓  ✓ ✓  cn, rs, oh, c, g, ne, p, lu na at, tt, su, ke, as at, tt, su, ke, ca na bd, j, opac-o na  ✓
70. library of cabinet division, islamabad http://ndw.gov.pk/index.html ✓ ✓  ✓ ✓  cn, rs, oh, c, faq, g, ne, p, lu na na na na na na  
71. elibrary, government of the punjab https://elibrary.punjab.gov.pk/ ✓ ✓  ✓ ✓  mp fb, t, yt at, tt, su, ke, as at, tt, su, ke, ca na bd, b, j, o, opac-o na  
72. hec digital library http://hecpk.summon.serialssolutions.com/ ✓ ✓  ✓ ✓  na na ke, as at, su, ca na b, o, j, mg na  ✓
73. bahauddin zakariya university (bzu), multan http://library.bzu.edu.pk ✓ ✓  ✓ ✓  na na na na na b, j na  ✓
74. begum nustrat bhutto women university, sukkur http://143.244.157.171 ✓ ✓  ✓   na fb, i opac, at, tt, su, ke, as at, tt, su, ke, ca na b, j na  ✓
75.
cecos university of information technology & emerging sciences http://sites.google.com/view/librarycup/home ✓ ✓  ✓   lu, cn, oh na opac, at, tt at, tt na b na  ✓
76. dha suffa university http://dclkarachi.com ✓ ✓  ✓ ✓  c, cn, mp fb, t tt, at, su, ke tt, at, su, ke na b na  ✓
77. institute of business management https://iobm.daphnis.opalsinfo.net/bin/home ✓ ✓  ✓ ✓  c, cn, lu r opac, at, tt, ke, as, su ca, at, tt, su na b na  ✓
78. jinnah university for women https://www.juw.edu.pk/campusfacilities/library-1/ ✓ ✓  ✓   c, cn na na na na na na  ✓
79. khawaja freed university of engineering & information technology, rahim yar khan https://kfueit.edu.pk/aboutlibrary?1=1&menu=sidelink?main=840&main=859&parent=facilities ✓ ✓  ✓   c, cn, oh fb, t, yt na na na na na  ✓
80. kinnaird college for women, lahore http://www.kinnaird.edu.pk/library3/ ✓ ✓  ✓   cn, faq na na na na or na  
81. lahore leads university https://leads.edu.pk/libraries-.php ✓ ✓  ✓   cn, c fb, t opac, at, tt, ke, as, su ca, at, tt, su na b na  ✓
82. minhaj university https://lrc.mul.edu.pk/ ✓ ✓  ✓ ✓  c, cn, mp fb, t, yt opac, ke su, ca na b na  ✓

endnotes

1 younghee noh and rosa chang, “international collaboration in library and information science research in korea,” international journal of knowledge content development & technology 9, no. 2 (2019): 91–110, https://doi.org/10.5865/ijkct.2019.9.2.091; muhammad abbas ganaee and muhammad rafiq, “pakistani university library web sites: features, contents, and maintenance issues,” journal of web librarianship 10, no. 4 (2016): 294–315, https://doi.org/10.1080/19322909.2016.1195308.
2 noh and chang, “international collaboration in korea,” 95.
3 ganaee and rafiq, “pakistani university library web sites,” 294.
4 in this study, we evaluated the websites of the libraries of public and private sector universities and research institutes. these websites are listed on the digital library website of hec, pakistan, available at http://www.digitallibrary.edu.pk/institutes.php.
5 alessandro liberati et al., “the prisma statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: explanation and elaboration,” journal of clinical epidemiology 6, no. 7 (2009): e1–e34, https://doi.org/10.1016/j.jclinepi.2009.06.006.
6 michael gusenbauer, “google scholar to overshadow them all? comparing the sizes of 12 academic search engines and bibliographic databases,” scientometrics 118, no. 1 (2019): 177–214, https://doi.org/10.1007/s11192-018-2958-5.
7 liberati et al., “prisma,” e9.
8 liberati et al., “prisma,” e1.
9 liberati et al., “prisma,” e5.
10 irfan ullah and shah khusro, “social book search: the impact of the social web on book retrieval and recommendation,” multimedia tools and applications 79, no. 11 (2020): 8011–60, https://doi.org/10.1007/s11042-019-08591-0; liberati et al., “prisma,” e9–e10.
11 liberati et al., “prisma,” e5.
12 charlene l. al-qallaf and alaa ridha, “a comprehensive analysis of academic library websites: design, navigation, content, services, and web 2.0 tools,” international information & library review 51, no. 2 (2019): 93–106, https://doi.org/10.1080/10572317.2018.1467166; rozalynd p. mcconnaughy and steven p. wilson, “content and design features of academic health sciences libraries’ home pages,” medical reference services quarterly 37, no. 2 (2018): 153–67, https://doi.org/10.1080/02763869.2018.1439219; gricel dominguez, sarah j. hammill, and ava iuliano brillat, “toward a usable academic library web site: a case study of tried and tested usability practices,” journal of web librarianship 9, no. 2–3 (2015): 99–120, https://doi.org/10.1080/19322909.2015.1076710.
13 al-qallaf and ridha, “web 2.0 tools,” 100; mcconnaughy and wilson, “libraries’ home pages,” 166–67; anna mierzecka and andrius suminas, “academic library website functions in the context of users’ information needs,” journal of librarianship and information science 50, no.
2 (2018): 157–67, https://doi.org/10.1177/0961000616664401; alan kerr and diane rasmussen pennington, “public library mobile apps in scotland: views from the local authorities and the public,” library hi tech 36, no. 2 (2018): 237–51, https://doi.org/10.1108/lht-05-2017-0091; saleeq ahmad dar, “mobile library initiatives: a new way to revitalize the academic library settings,” library hi tech news 36, no. 5 (2019): 15–21, https://doi.org/10.1108/lhtn-05-2019-0032.
14 al-qallaf and ridha, “web 2.0 tools,” 102; mcconnaughy and wilson, “libraries’ home pages,” 159.
15 al-qallaf and ridha, “web 2.0 tools,” 95–97; irfan ullah and shah khusro, “on the search behaviour of users in the context of interactive social book search,” behaviour & information technology 39, no. 4 (2020): 443–62, https://doi.org/10.1080/0144929x.2019.1599069; mcconnaughy and wilson, “libraries’ home pages,” 153–67; mierzecka and suminas, “website functions,” 164; kerr and pennington, “scotland,” 243; dar, “mobile library initiatives,” 15–17.
16 al-qallaf and ridha, “web 2.0 tools,” 100; mcconnaughy and wilson, “libraries’ home pages,” 153–67; mierzecka and suminas, “website functions,” 162–64; kerr and pennington, “scotland,” 243.
17 al-qallaf and ridha, “web 2.0 tools,” 95–100; mcconnaughy and wilson, “libraries’ home pages,” 158; mierzecka and suminas, “website functions,” 161, 162; dar, “mobile library initiatives,” 19.
18 mierzecka and suminas, “website functions,” 158; paul khawaja, “a software tool-based accessibility assessment of public library websites in the united states” (master’s paper, university of north carolina, chapel hill, 2020), 1–51, https://doi.org/10.17615/432g-f412; rita kosztyánné mátrai, “how to make an electronic library accessible,” the electronic library 36, no. 4 (2018): 620–32, https://doi.org/10.1108/el-07-2017-0143.
19 muhammad rafi, ghalib khan, and sikandar ali, “challenges associated with resource selection in public libraries of khyber pakhtunkhawa, pakistan,” information and knowledge management 6, no. 2 (2016): 27–33, https://www.iiste.org/journals/index.php/ikm/article/view/28709.
20 ghalib khan and rubina bhatti, “collection development and management in the university libraries of pakistan: a survey of users’ satisfaction,” international information & library review 53, no. 3 (2021): 239–53, https://doi.org/10.1080/10572317.2020.1830739; muhammad rafi, sikandar ali, and ashfaq ahmad, “administrative challenges to public libraries in khyber pakhtunkhawa pakistan: an empirical study,” journal of studies in social sciences 15, no. 1 (2016): 32–48, https://infinitypress.info/index.php/jsss/article/view/1280.
21 khan and bhatti, “collection development,” 248; rafi, ali, and ahmad, “khyber pakhtunkhawa,” 36.
22 amjid khan, rubina bhatti, ghalib khan, and muhammad ismail, “the role of academic libraries in facilitating undergraduate and post-graduate studies: a case study of the university of peshawar, pakistan,” chinese librarianship: an international electronic journal 2014, no. 38 (2014): 36–49, http://white-clouds.com/iclc/cliej/cl38kbki.pdf; atta ur-rehman marwat and muhammad younus, “evaluation of college libraries in khyber pakhtunkhwa, pakistan: condition, role, and challenges,” library philosophy and practice 2020, no.
4049 (2020): 1–43, https://digitalcommons.unl.edu/libphilprac/4049.
23 muhammad naeem and nadeem siddique, “use of print and electronic journals by the academic community: a survey at gc university lahore,” library philosophy and practice 2020, no. 3788 (2020): 1–16, https://digitalcommons.unl.edu/libphilprac/3788; muhammad abbas ganaee, “library websites of pakistani universities: an exploratory study,” qualitative and quantitative methods in libraries 5, no. 2 (2017): 385–95, http://www.qqml.net/index.php/qqml/article/view/325; alia arshad and kanwal ameen, “academic scientists’ scholarly use of information resources in the digital environment: perceptions and barriers,” global knowledge, memory and communication 67, no. 6/7 (2018): 467–83, https://doi.org/10.1108/gkmc-05-2018-0044.
24 khan et al., “facilitating,” 36, 45.
25 muhammad rafiq, kanwal ameen, and munazza jabeen, “barriers to digitization in university libraries of pakistan: a developing country’s perspective,” the electronic library 36, no. 3 (2018): 457–70, https://doi.org/10.1108/el-01-2017-0012; marwat and younus, “college libraries,” 24; nadeem siddique and khalid mahmood, “status of library software in higher education institutions of pakistan,” international information & library review 47, no. 3–4 (2015): 59–65, https://doi.org/10.1080/10572317.2015.1087796.
26 muhammad rafiq and kanwal ameen, “towards a digitization framework: pakistani perspective,” pakistan journal of information management & libraries 15, no. 1 (2014): 22–29, http://journals.pu.edu.pk/journals/index.php/pjiml/article/view/757; khan and bhatti, “collection development,” 241; marwat and younus, “college libraries,” 37.
27 rafiq, ameen, and jabeen, “barriers,” 459, 465; khan and bhatti, “collection development,” 252.
28 khan and bhatti, “collection development,” 247.
29 khan et al., “facilitating,” 46.
30 khan and bhatti, “collection development,” 240.
31 khan and bhatti, “collection development,” 240, 247.
32 nosheen fatima warraich and abebe rorissa, “adoption of linked data technologies among university librarians in pakistan: challenges and prospects,” malaysian journal of library & information science 23, no. 3 (2018): 1–13, https://doi.org/10.22452/mjlis.vol23no3.1.
33 nazia wahid, nosheen fatima warraich, and muzammil tahira, “mapping the cataloguing practices in information environment: a review of linked data challenges,” information and learning science 119, no. 9/10 (2018): 586–96, https://doi.org/10.1108/ils-10-2017-0106.
34 haseeb ahmad piracha and kanwal ameen, “policy and planning of research data management in university libraries of pakistan,” collection and curation 38, no. 2 (2019): 39–44, https://doi.org/10.1108/cc-08-2018-0019; amjid khan and shamshad ahmed, “usage of e-databases and e-journals by research community in pakistani universities: issues and perspectives,” library philosophy and practice 2020, no. 4570 (2020): 1–11, https://digitalcommons.unl.edu/libphilprac/4570.
35 muhammad rafi, zheng jianming, and khurshid ahmad, “evaluating the impact of digital library database resources on the productivity of academic research,” information discovery and delivery 47, no.
1 (2019): 42–52, https://doi.org/10.1108/idd-07-2018-0025; asif altaf and nosheen fatima warraich, “awareness and use of electronic information sources by university students in pakistan,” pakistan library & information science journal 48, no. 4 (2017): 14–25, https://www.researchgate.net/publication/326356264.
36 muhammad naeem and mohammad javid khan, “do social networking applications support the antecedents of knowledge sharing practices?” vine journal of information and knowledge management systems 49, no. 4 (2019): 494–509, https://doi.org/10.1108/vjikms-12-2018-0133.
37 amjid khan, shamshad ahmed, asad khan, and ghalib khan, “the impact of digital library resources usage on engineering research productivity: an empirical evidences from pakistan,” collection building 36, no. 2 (2017): 37–44, https://doi.org/10.1108/cb-10-2016-0027; ikram ul haq and rabiya ali faridi, “knowledge sharing practices amongst the library and information professionals of pakistan in the digital era,” in cooperation and collaboration initiatives for libraries and related institutions, ed. collence takaingenhamo chisita (hershey, pa: igi global, 2020), 200–17, https://doi.org/10.4018/978-1-7998-0043-9.ch010.
38 warraich and rorissa, “adoption,” 7, 8, 13.
39 khan and bhatti, “collection development,” 248; rafi, ali, and ahmad, “khyber pakhtunkhawa,” 42; siddique and mahmood, “library software,” 64.
40 rafiq, ameen, and jabeen, “barriers,” 464, 465, 467; sajjad ahmad, shehzad ahmad, and muhammad kamran, “electronic information resource sharing among the research scholars: a case of university of peshawar,” pakistan library & information science journal 50, no. 2 (2019): 45–60.
41 mirza abdul rasheed and muhammad rafiq, “new trends and practices for digital reference service (drs) a survey in the university libraries of punjab, pakistan,” pakistan library & information science journal 48, no. 4 (2017): 44–55; saira hanif soroya and kanwal ameen, “what do they want?
millennials and role of libraries in pakistan,” the journal of academic librarianship 44, no. 2 (2018): 248–55, https://doi.org/10.1016/j.acalib.2018.01.003.
42 rubia khan, arif khan, sidra malik, and haroon idrees, “virtual reference services through web search engines: study of academic libraries in pakistan,” publications 5, no. 2 (2017): 1–13, https://doi.org/10.3390/publications5020006.
43 hafiz habib-ur-rehman, haroon idrees, and ahsan ullah, “organization and usage of information resources at deeni madaris libraries in pakistan,” library review 66, no. 3 (2017): 163–78, https://doi.org/10.1108/lr-02-2016-0016; nadeem siddique and khalid mahmood, “library software in pakistan: a review of literature,” library review 63, no. 3 (2014): 224–40, https://doi.org/10.1108/lr-04-2013-0048.
44 khurshid ahmad, zheng jianming, and muhammad rafi, “an analysis of academic librarians competencies and skills for implementation of big data analytics in libraries,” data technologies and applications 53, no. 2 (2019): 201–16, https://doi.org/10.1108/dta-09-2018-0085.
45 shahzad abbas, shanawar khalid, and fakhar abbas hashmi, “library websites as source of marketing of library resources: an empirical study of hec recognized universities of pakistan,” qualitative and quantitative methods in libraries 5, no.
1 (2017): 235–49, http://www.qqml.net/index.php/qqml/article/view/321; rubina bhatti, awais asghar, and amjid khan, “the websites in university libraries of pakistan: current status and new perspectives,” pakistan library & information science journal 46, no. 1 (2015): 26–35. 46 ganaee and rafiq, “pakistani university library web sites,” 294, 303–9. 47 ganaee, “library websites,” 385. 48 warraich and rorissa, “linked data technologies,” 7–9. 49 asim ullah, shah khusro, and irfan ullah, “bibliographic classification in the digital age: current trends & future directions,” information technology and libraries 36, no. 3 (2017): 48–77, https://doi.org/10.6017/ital.v36i3.8930. 50 muhammad ss mirza, and muhammad arif, “challenges in information technology adoption in pakistani university libraries,” international journal of knowledge content development & technology 6, no. 1 (2016): 105–16, https://doi.org/10.5865/ijkct.2016.6.1.105. 51 khan et al., “virtual reference services,” 4–8. 52 wahid, warraich, and tahira, “mapping,” 587, 593. 53nosheen fatima warraich, “linked data technologies in libraries: an appraisal,” journal of political studies 23, no. 2 (2016): 697–707. 54 corine deliot, “publishing the british national bibliography as linked open data,” catalogue & index, no. 174 (march 2014): 13–18, https://cdn.ymaws.com/www.cilip.org.uk/resource/collection/f71f19c3-49cf-462d-8165b07967ee07f0/catalogue_and_index_issue_174,_march_2014.pdf. 
https://doi.org/10.3390/publications5020006 https://doi.org/10.1108/lr-02-2016-0016 https://doi.org/10.1108/lr-04-2013-0048 https://doi.org/10.1108/dta-09-2018-0085 https://doi.org/10.1108/dta-09-2018-0085 http://www.qqml.net/index.php/qqml/article/view/321 https://doi.org/10.6017/ital.v36i3.8930 https://doi.org/10.5865/ijkct.2016.6.1.105 https://cdn.ymaws.com/www.cilip.org.uk/resource/collection/f71f19c3-49cf-462d-8165-b07967ee07f0/catalogue_and_index_issue_174,_march_2014.pdf https://cdn.ymaws.com/www.cilip.org.uk/resource/collection/f71f19c3-49cf-462d-8165-b07967ee07f0/catalogue_and_index_issue_174,_march_2014.pdf information technology and libraries september 2022 library management practices in the libraries of pakistan 42 ullah, khusro, and ullah 55 muhammad rafi et al., “knowledge-based society and emerging disciplines: a correlation of academic performance,” the bottom line 33, no. 4 (2020): 337–58, https://doi.org/10.1108/bl-12-2019-0130. 56 ali mansouri and nooshin soleymani asl, “assessing mobile application components in providing library services,” the electronic library 37, no. 1 (2019): 49–66, https://doi.org/10.1108/el-10-2018-0204. 57 hamaad rafique et al., “do digital students show an inclination toward continuous use of academic library applications? a case study,” the journal of academic librarianship 47, no. 2 (2020): 1–15, https://doi.org/10.1016/j.acalib.2020.102298. 58 ganaee and rafiq, “pakistani university library web sites,” 303–10. 59 asad khan, “investigating the factors influencing librarians’ intention toward the adoption of koha—an open source integrated library system in pakistan,” library philosophy and practice 2020, no. 4360 (2020): 1–52: https://digitalcommons.unl.edu/libphilprac/4360. 60 arslan sheikh, “evaluating the usability of comsats institute of information technology library website: a case study,” the electronic library 35, no. 1 (2017): 121–36, https://doi.org/10.1108/el-08-2015-0149. 
61 khan and ahmed, “research community,” 2, 3. 62arif khan, haroon idrees, and khan mudassir, “library web sites for people with disability: accessibility evaluation of library websites in pakistan,” library hi tech news 32, no. 6 (2015): 1–7, https://doi.org/10.1108/lhtn-01-2015-0010. 63muhammad tariq and khalid mahmood, “use, purpose and usage ranking of online informatio n resources by university research students,” 2015 4th international symposium on emerging trends and technologies in libraries and information services, noida, india (2015): 257–63, https://doi.org/10.1109/ettlis.2015.7048208. 64 “online book search,” pakistan scientific and technological information center (pastic), accessed march 24, 2022, http://pastic.gov.pk/advancebooksearch.aspx. 65“objectives,” pakistan scientific and technological information center (pastic), accessed march 22, 2022, http://pastic.gov.pk/objectives.aspx?par=abtp&cmenu=objectives. 66 “about us,” consortium of s&t and r&d libraries of pakistan (cstrdlp), accessed march 22, 2022, http://consortium.pastic.gov.pk. 67 ashraf sharif, “integrating libraries across continents: a case of aga khan university’s nine libraries in five countries,” (paper, national conference on career development of lis professionals and overall improvement of libraries in pakistan, islamabad, 2012): 1–12, https://ecommons.aku.edu/libraries/18. 
https://doi.org/10.1108/bl-12-2019-0130 https://doi.org/10.1108/el-10-2018-0204 https://doi.org/10.1016/j.acalib.2020.102298 https://digitalcommons.unl.edu/libphilprac/4360 https://doi.org/10.1108/el-08-2015-0149 https://doi.org/10.1108/lhtn-01-2015-0010 https://doi.org/10.1109/ettlis.2015.7048208 http://pastic.gov.pk/advancebooksearch.aspx http://pastic.gov.pk/objectives.aspx?par=abtp&cmenu=objectives http://consortium.pastic.gov.pk/ https://ecommons.aku.edu/libraries/18 information technology and libraries september 2022 library management practices in the libraries of pakistan 43 ullah, khusro, and ullah 68 sania awais and kanwal ameen, “the current innovation status of university libraries in pakistan,” library management 40, no. 3/4 (2019): 178–90, https://doi.org/10.1108/lm-112017-0125. 69 “participants of digital library,” higher education commission (hec) of pakistan, accessed march 24, 2022, http://digitallibrary.edu.pk/institutes.php. 70 warraich and rorissa, “adoption,” 8; rafiq, ameen, and jabeen, “barriers,” 465; shamshad ahmed, arslan sheikh, and muhammad akram, “implementing knowledge management in university libraries of punjab, pakistan,” information discovery and delivery 46, no. 2 (2018): 83–94, https://doi.org/10.1108/idd-08-2017-0065; sabah jan, “status of electronic resources in libraries: a review study,” library philosophy and practice 2019, no. 2524 (2019): 1–20, https://digitalcommons.unl.edu/libphilprac/2524. 71 warraich and rorissa, “adoption,” 1, 7–9; rafiq, ameen, and jabeen, “barriers,” 459, 460; ahmed, sheikh, and akram, “knowledge management,” 84. 72 rafiq and ameen, “digitization framework,” 26; rafi, ali, and ahmad, “khyber pakhtunkhawa,” 39; kanwal ameen, “changing scenario of librarianship in pakistan: managing with the challenges and opportunities,” library management 32, no. 3 (2011): 171–82, https://doi.org/10.1108/01435121111112880. 
73 rafiq, ameen, and jabeen, “barriers,” 463; asad khan, mohamad noorman masrek, khalid mahmood, and saima qutab, “factors influencing the adoption of digital reference services among the university librarians in pakistan,” the electronic library 35, no. 6 (2017): 1225–46, https://doi.org/10.1108/el-05-2016-0112; ghalib khan and rubina bhatti, “the impact of higher education commission of pakistan’s funding on the collection development budgets of university libraries,” the bottom line 29, no. 1 (2016): 12–24, https://doi.org/10.1108/bl06-2015-0008. 74 irfan ullah, shah khusro, asim ullah, and muhammand naeem, “an overview of the current state of linked and open data in cataloging,” information technology and libraries 37, no. 4 (2018): 47–80, https://doi.org/10.6017/ital.v37i4.10432; wahid, warraich, and tahira, “mapping,” 593. 75 rafiq, ameen, and jabeen, “barriers,” 460. 76 piracha and ameen, “policy and planning,” 39, 42; ahmed, sheikh, and akram, “knowledge management,” 85. 77 warraich and rorissa, “adoption,” 7, 8; jan, “status,” 13. 78 sania awais and kanwal ameen, “information accessibility for students with disabilities: an exploratory study of pakistan,” malaysian journal of library & information science 20, no. 2 (2017): 103–15, https://mjlis.um.edu.my/article/view/1768; khan, idrees, and mudassir, “accessibility evaluation,” 6. 
https://doi.org/10.1108/lm-11-2017-0125 https://doi.org/10.1108/lm-11-2017-0125 http://digitallibrary.edu.pk/institutes.php https://doi.org/10.1108/idd-08-2017-0065 https://digitalcommons.unl.edu/libphilprac/2524 https://doi.org/10.1108/01435121111112880 https://doi.org/10.1108/el-05-2016-0112 https://doi.org/10.1108/bl-06-2015-0008 https://doi.org/10.1108/bl-06-2015-0008 https://doi.org/10.6017/ital.v37i4.10432 https://mjlis.um.edu.my/article/view/1768 information technology and libraries september 2022 library management practices in the libraries of pakistan 44 ullah, khusro, and ullah 79 nosheen fatima warraich, amara malik, and kanwal ameen, “gauging the collection and services of public libraries in pakistan,” global knowledge, memory and communication 67, no. 4/5 (2018): 244–58, https://doi.org/10.1108/gkmc-11-2017-0089. 80 alia arshad and kanwal ameen, “scholarly communication in the age of google: exploring academics’ use patterns of e-journals at the university of the punjab,” the electronic library 35, no. 1 (2017): 167–84, https://doi.org/10.1108/el-09-2015-0171. 81 khan and bhatti, “collection development,” 242, 251. 82 khurshid ahmad and muhammad rafiq, “methods of knowledge management practices in pakistani universities’ libraries,” nust journal of social sciences and humanities 4, no. 1 (2018): 115–26, https://doi.org/10.51732/njssh.v4i1.30. 83 piracha and ameen, “policy and planning,” 42. 84 piracha and ameen, “policy and planning,” 39. 85 khan and bhatti, “collection development,” 248–49; rafi, ali, and ahmad, “khyber pakhtunkhawa,” 34, 41, 45; muhammad arif and khalid mahmood, “the changing role of librarians in the digital world: adoption of web 2.0 technologies by pakistani librarians,” the electronic library 30, no. 
4 (2012): 469–79, https://doi.org/10.1108/02640471211252184; warraich, malik, and ameen, “gauging,” 249– 55; amjid khan and shamshad ahmed, “analyzing the relationship between organizational culture and lifelong learning among the information professionals in the university libraries of pakistan,” information discovery and delivery 50, no. 1 (2022): 1–11, https://doi.org/10.1108/idd-01-2019-0001; ahsan ullah and harron idrees, “technical staff positions and technology related tasks: a study of university libraries in pakistan,” pakistan journal of information management & libraries 18, no. 1 (2017): 52–61, https://ssrn.com/abstract=2918822. 86 khan and bhatti, “collection development,” 242; rafi, ali, and ahmad, “khyber pakhtunkhawa,” 44; marwat and younus, “college libraries,” 27; shehzad ahmad and sajjad ahmad, “status of ict in the university libraries of khyber pakhtunkhwa,” pakistan library & information science journal 48, no. 2 (2017): 37–48; sajjad ahmad, shehzad ahmad, and muhammad arshad, “attitude of university information professionals’ toward the use and application of ict: a case of khyber pakhtunkhwa,” pakistan library & information science journal 51, no. 3 (2020): 51–64; rabia abdul karim and anila fatima shakil, “a research study about the importance of e library for globalized learning among students at university level in karachi, pakistan,” rads journal of social sciences & business management 4, no. 2 (2017): 104–14, http://www.jssbm.juw.edu.pk/index.php/jssbm/article/view/45. 87 piracha and ameen, “policy and planning,” 39–41; rafi, ali, and ahmad, “khyber pakhtunkhawa,” 44; warraich, malik, and ameen, “gauging,” 249; ahmad, ahmad, and kamran, “sharing,” 45; ullah and idrees, “technical staff,” 59; ahmad and ahmad, “ict,” 37; karim and shakil, “globalized learning,” 113; warraich and rorissa, “adoption,” 8; siddique and mahmood, “pakistan,” 237. 
https://doi.org/10.1108/gkmc-11-2017-0089 https://doi.org/10.1108/el-09-2015-0171 https://doi.org/10.51732/njssh.v4i1.30 https://doi.org/10.1108/02640471211252184 https://doi.org/10.1108/idd-01-2019-0001 https://ssrn.com/abstract=2918822 http://www.jssbm.juw.edu.pk/index.php/jssbm/article/view/45 information technology and libraries september 2022 library management practices in the libraries of pakistan 45 ullah, khusro, and ullah 88 rafiq, ameen, and jabeen, “barriers,” 463–66. 89 murtaza ashiq, shafiq ur rehman, and syeda hina batool, “academic library leaders’ conceptions of library leadership in pakistan,” malaysian journal of library & information science 24, no. 2 (2019): 55–71, https://doi.org/10.22452/mjlis.vol24no2.4; piracha and ameen, “policy and planning,” 42, 43; amara malik and kanwal ameen, “library and information science collaboration in pakistan: challenges and prospects,” information and learning science 119, no. 9/10 (2018): 555–71, https://doi.org/10.1108/ils-09-2017-0096. 90 rafiq, ameen, and jabeen, “barriers,” 463–67; marwat and younus, “college libraries,” 24; siddique and mahmood, “library software,” 61. 91 mirza and arif, “challenges,” 113. 92 marwat and younus, “college libraries,” 39; ahmad and ahmad, “ict,” 38–47; siddique and mahmood, “pakistan,” 224, 234, 235. 93 shakeel ahmad khan and rubina bhatti, “digital competencies for developing and managing digital libraries: an investigation from university librarians in pakistan,” the electronic library 35, no. 3 (2017): 573–97, https://doi.org/10.1108/el-06-2016-0133. 94 midrar ullah, “content analysis of medical college library websites in pakistan indicates necessary improvements,” health information & libraries journal (14 july, 2021): 1–10, https://doi.org/10.1111/hir.12386. 95 ganaee and rafiq, “pakistani university library web sites,” 294, 303–9. 96 ganaee, “library websites,” 385. 
97arslan sheikh, “development of information commons in university libraries of pakistan: the current scenario,” the journal of academic librarianship 41, no. 2 (2015): 130–39, https://doi.org/10.1016/j.acalib.2015.01.002. 98 piracha and ameen, “policy and planning,” 39, 42, 43. 99 piracha and ameen, “policy and planning,” 42. 100 malik and ameen, “collaboration,” 563, 564. 101 waqar ahmad, muhammad shahid soroya, and munazza jubeen, “electronic, librarian’s end, techno knowledge and multifactor services in the special libraries of lahore,” pakistan library & information science journal 48, no. 4 (2017): 102–14, https://www.researchgate.net/publication/334546141. 102 nadeem siddique and khalid mahmood, “combating problems related to library software in higher education institutions of pakistan: an analysis of focus groups,” malaysian journal of library & information science 21, no. 1 (2016): 35–51, https://doi.org/10.22452/mjlis.vol21no1.3. https://doi.org/10.22452/mjlis.vol24no2.4 https://doi.org/10.1108/ils-09-2017-0096 https://doi.org/10.1108/el-06-2016-0133 https://doi.org/10.1111/hir.12386 https://doi.org/10.1016/j.acalib.2015.01.002 https://www.researchgate.net/publication/334546141 https://doi.org/10.22452/mjlis.vol21no1.3 information technology and libraries september 2022 library management practices in the libraries of pakistan 46 ullah, khusro, and ullah 103 ashiq, rehman, and batool, “library leaders,” 61–68. 104 waqar ahmed, “third generation of the web: libraries, librarians and web 3.0,” library hi tech news 32, no. 4 (2015): 6–8, https://doi.org/10.1108/lhtn-11-2014-0100. 105 abid hussain and saeed ullah jan, “awareness of web 2.0 technology in the academic libraries: an islamabad perspective,” library philosophy and practice 2018, no. 
1945 (2018): 1–13, https://digitalcommons.unl.edu/libphilprac/1945; azizur rahman, amjid khan, and ghalib kan, “assessment of web 2.0 applications in university libraries of khyber pakhtunkhwa,” pakistan library & information science journal 50, no. 3 (2019): 9–18; muhammad tufail khan and muhammad rafiq, “library social media services (lsms)! going viral for survival,” pakistan library & information science journal 50, no. 3 (2019): 23–32. 106 ullah et al., “current state,” 64–66. 107 muhammad yousuf ali, salaman bin naeem, and rubina bhatti, “artificial intelligence tools and perspectives of university librarians: an overview,” business information review 37, no. 3 (2020): 116–24, https://doi.org/10.1177/0266382120952016. 108 y. m. atiquil islam, khurshid ahmad, muhammad rafi, and zheng jianming, “performance– based evaluation of academic libraries in the big data era,” journal of information science 47, no. 4 (2020): 458–71, https://doi.org/10.1177/0165551520918516. 109 abid hussain and muhammad ibrahim, “research productivity of library and information science in khyber pakhtunkhwa: a case study of sarhad university of science peshawar, pakistan,” journal of information management and library studies 1, no. 1 (2018): 54–63, http://jimls.kkkuk.edu.pk/jimls/index.php/jimls/article/view/14. 110 shamshad ahmed and atta ur rehman, “perceptions and level of ict competencies: a survey of librarians at public sector universities in khyber pakhtunkhwa, pakistan,” pakistan journal of information management and libraries 18, no. 1 (2016): 1–11, http://journals.pu.edu.pk/journals/index.php/pjiml/article/viewarticle/951; altaf and warraich, “awareness,” 14–22; munir moosa sadruddin, “contribution of digital libraries and its role in reaping quality researches in pakistan – challenges and opportunities,” pakistan library & information science journal 46, no. 1 (2015): 60–70. 
111 muhammad umar farooq, ahsan ullah, memoona iqbal, and abid hussain, “current and required competencies of university librarians in pakistan,” library management 37, no. 8/9 (2016): 410–45, https://doi.org/10.1108/lm-03-2016-0017. 112 sheikh, “information commons,” 138. 113 kimberly l. armstrong and thomas h. teper, “library consortia and the cic: leveraging scale for collaborative success,” serials review 43, no. 1 (2017): 28–33. https://doi.org/10.1080/00987913.2017.1284493. 114 armstrong and teper, “library consortia,” 30–32. 115 “about us,” pastic. https://doi.org/10.1108/lhtn-11-2014-0100 https://digitalcommons.unl.edu/libphilprac/1945 https://doi.org/10.1177/0266382120952016 https://doi.org/10.1177/0165551520918516 http://jimls.kkkuk.edu.pk/jimls/index.php/jimls/article/view/14 http://journals.pu.edu.pk/journals/index.php/pjiml/article/viewarticle/951 https://doi.org/10.1108/lm-03-2016-0017 https://doi.org/10.1080/00987913.2017.1284493 information technology and libraries september 2022 library management practices in the libraries of pakistan 47 ullah, khusro, and ullah 116 “objectives,” pastic. 117 ullah et al, “current state,” 64–66. 118 ullah et al., “current state,” 64–67. 119 bilal hassan, ramsha ahmed, bo li, ayesha noor, and zahid ul hassan, “a comprehensive study capturing vision loss burden in pakistan (1990–2025): findings from the global burden of disease (gbd) 2017 study,” plos one 14, no. 5 (2019): e0216492, https://doi.org/10.1371/journal.pone.0216492. 120 izaz khan, shah khusro, and irfan ullah, “technology-assisted white cane: evaluation and future directions,” peerj 6, no. e6058 (2018.): 1–27, https://doi.org/10.7717/peerj.6058. 121 awais and ameen, “information accessibility,” 111–13. 122 “smartphone usage in pakistan,” accessed march 24, 2022, https://pas.org.pk/smart-phoneusage-in-pakistan-infographics. 
123 “smartphone penetration rate as share of connections in pakistan from 2014 to 2020,” statista research department, december 30, 2016, https://www.statista.com/statistics/671542/smartphone-penetration-as-share-ofconnections-in-pakistan. 124 “telecom indicators,” pakistan telecommunication authority, january 2022, https://www.pta.gov.pk/en/telecom-indicators https://doi.org/10.1371/journal.pone.0216492 https://doi.org/10.7717/peerj.6058 https://pas.org.pk/smart-phone-usage-in-pakistan-infographics https://pas.org.pk/smart-phone-usage-in-pakistan-infographics https://www.statista.com/statistics/671542/smartphone-penetration-as-share-of-connections-in-pakistan https://www.statista.com/statistics/671542/smartphone-penetration-as-share-of-connections-in-pakistan https://www.pta.gov.pk/en/telecom-indicators abstract introduction methodology the literature search and selection strategy the evaluation framework summary of key observations lis practices in the light of published literature collection development and management resource description, discovery, and access adherence to new standards, practices, and technologies lis practices in the light of the studied websites discussion and analysis conclusions appendix a: details of libraries endnotes 2 information technology and libraries | september 2007 w elcome to my first ital president’s column. each president only gets a year to do these col­ umns, so expectations must be low all around. my hope is to stimulate some thinking and conversation that results in lita members’ ideas being exchanged and to create real opportunities to implement those ideas. my first column i thought i would keep short and sweet, and discuss just a few of the ideas that have been rattling around in my head since the 2007 midwinter lita town meeting, which have been enhanced by a number of discussions among librarians over the last six months. 
with any luck, these thoughts might have some bearing on what any of those ideas could mean to our organization.
first off, i don’t think i can express how weird this whole presidential appellation is to me. i am extremely proud to be associated with lita, and honored and surprised at being elected. i come from a consortia environment and an extremely flat organization. solving problems is often a matter of throwing all the parties in a room together and hashing it out until solutions are arrived at. i’ve been a training librarian for quite a while now, and pragmatic approaches to problem solving are my central focus. i’m a consortia wrangler, a trainer, and a technology pusher, and i hope my approach is, and will be, to listen hard and then see what can be accomplished. so in my own way, i find being president kind of on the embarrassing side. it’s like not knowing what to do with your hands when you’re speaking in public.
at the lita town meeting (http://litablog.org/2007/06/17/lita-town-meeting-2007-report/) it was pretty obvious that members want community in all its various forms, face-to-face in multiple venues and online in multiple venues. it’s also pretty obvious from the studies done by pew internet and american life and by oclc that our users, and in particular our younger users, really want community. the web 2.0 and the library 2.0 movements are responses to that desire. as a somewhat flippant observation, we spent a generation educating our kids to work in groups, and now we shouldn’t be surprised that they want to work and play in groups. many of us work effectively in collaborative groups every day. we find it exciting, productive, and even fun. it’s an environment that we would like to create for our patrons, in-house and virtually. it’s what we would like to see in our association.
having been to every single top tech trends program and listened to the lita trendsters, one theme that often comes up is that complaining about the systems our vendors deliver can at times be pointless, because they simply deliver what we ask for. there is of course a corollary to this. once a system is in the marketplace, adding functionality often becomes centered around the low-hanging fruit. as a fictitious example, a vendor might easily add the ability to change the colors of the display to the patron, but adding a shelf list browse might take serious coding to create. so through discussions and rfp, we ask for and get the pretty colors while the browsing function waits, a form of procrastination. so then does innovation come only when all the low-hanging fruit has finally been plucked, and there’s nothing else to procrastinate on?
as social organizations (libraries, ala, lita, and other groups), it appears that we have plucked all the low-hanging fruit of web 1.0. e-mail and static web pages have been done to death. as a pragmatist, what concerns me most is implementation. what delivery systems should and can we adopt and develop to fulfill the promise of services we’d like? can we ensure that barriers to participation are either eliminated or so low as to include everyone? i like to think that web 2.0 is innovation toward mirroring how we personally want to work and play and how we want our social structures to perform. so how can we make lita mirror how we want to work and play? i do know it’s not just making everything a wiki.
mark beatty (mbeatty@wils.wisc.edu) is lita president 2007/2008 and trainer, wisconsin library services, madison.
president’s column
mark beatty
22 information technology and libraries | march 2005
the metascholar initiative of emory university libraries, in collaboration with the center for the study of southern culture, the atlanta history center, and the georgia music hall of fame, received an institute of museum and library services grant to develop a new model for library-museum-archives collaboration. this collaboration will broaden access to resources for learning communities through the use of the open archives initiative protocol for metadata harvesting (oai-pmh). the project, titled music of social change (mosc), will use oai-pmh as a tool to bridge the widely varying metadata standards and practices across museums, archives, and libraries. this paper will focus specifically on the unique advantages of the use of oai-pmh to concurrently maximize the exposure of metadata emergent from varying metadata cultures.
the metascholar initiative of emory university libraries, in collaboration with the center for the study of southern culture, the atlanta history center, and the georgia music hall of fame, received an institute of museum and library services grant to develop a new model for library-museum-archives collaboration to broaden access to resources for learning communities through the use of the open archives initiative protocol for metadata harvesting (oai-pmh).1 the collaborators of the project, entitled music of social change (mosc), are creating a subject-based virtual collection concerning music and musicians associated with social-change movements such as the civil-rights struggle. this paper will specifically focus on the advantages offered by oai-pmh in amalgamating and serving metadata from these institutional sources that are significantly different in kind.2
there has been a great deal of discussion within the library community as to the possibilities oai-pmh holds for harvesting, aggregating, and then disseminating research metadata.
however, in reality, only a few institutions (be they museums, archives, or libraries) have actually begun to utilize oai-pmh to this end. there are some practical, historical barriers to implementing any shared system for distributing metadata across institutions that differ in kind more than in degree. one of these significant differences lies in metadata cultures and practices.
libraries have traditionally assigned metadata incrementally, at an item level within their collection(s). the strength of this model is that at least a minimal amount of metadata is assigned to a very high percentage of items within the collection. the challenge of such a system is that, for such metadata records to interoperate within a shared database and through a common interface (for example, the traditional union catalog), the metadata fields have had to be quite rigidly defined compared to those within archival and museum environments. due to tradition as well as the sheer volume of items collected by libraries, metadata at an item level are not greatly detailed or contextualized. often, items within library collections lack robust relational mapping to other items within or outside of the collection, as is done, for example, in archival processing.
content contextualization is highly valued by archival metadata practices and culture as the central tenet of metadata creation. items at a subcollection level almost always have metadata derivative from and deferential to that of the collection-level metadata. the great benefit of archival practices in metadata assignment is a contextualization of content that reflects the background, the topographic place in time and space of a given portion of a collection and its organic, emergent relationship to the whole.
the weaknesses of this model are a great inconsistency in description details and variables (at the collection and subcollection levels), as well as very disparate levels of granularity within the hierarchy of the structure of a collection at which metadata are assigned. such disparities among institutional types feed an unnecessary level of misunderstanding by libraries of the metadata culture and aims of archives as well as those of museums.
museums often have very skeletal documented (as opposed to undocumented) metadata about their collections or objects therein. often museums are not funded to make metadata on their collections freely available. it is common, in fact, for curatorial staff to view metadata as intellectual property to which they serve as gatekeepers, reflecting a professional value placed upon contextualizing materials for users. this is done on a user-by-user or exhibition-by-exhibition basis, depending on user background or the thesis of a given exhibition. additionally, museums perceive information on the aboutness of their collections to be a class of capital with which they can always potentially cost-recover or generate income. within the culture of museums, staff have traditionally been disinclined to make their collections available in an unmediated manner. additionally, there has been resistance to documenting information about collections in a systematic way. there is even greater resistance to adhering to any prescriptions on metadata as would be required for compliance with even the most minimally structured database. such regulation would discriminate against the nuanced information required for each and every object within a collection.
the mosc project: using the oai-pmh to bridge metadata cultural differences across museums, archives, and libraries
eulalia roel
eulalia roel (eulalia.roel@gmail.com) is coordinator of information resources at the federal reserve, atlanta.
why oai-pmh to bridge these cultures?
oai-pmh was selected by the mosc project as a means to bridge some of these substantial disparities. the protocol is often mistakenly assumed to function only with metadata expressed as unqualified dublin core (dc). in fact, the protocol functions with any metadata format expressed in extensible markup language (xml); this is the minimal requirement for content to serve metadata through oai-pmh. this includes those formats that have been well received by institutions other than libraries, such as xml encoded archival description (ead) as it is used in archives. as per 4.2 of the oai-pmh guidelines for repository implementers,
communities are able to develop their own collection description xml schemas for use within description . . . elements. if all that is desired is the ability to include an unstructured textual description, then it is recommended that repositories use the dublin core description element. seven existing schemes are: dublin core, encoded archival description (ead), the eprints schema, rslp collection description schema, uddi/wsdl, marc21, and the branding schema.3
the oai protocol has often been partnered with unqualified dc metadata, as this is the most minimal metadata structure necessary for participation in an oai harvesting system. not only are these dc fields unqualified, no fields are actually required. no structure or regulations are codified outside of requiring metadata contributors to adhere to this unqualified metadata schema. therefore, the oai protocol requires minimal technology support and resources at any given contributing site (such support varying more widely across institutions than even their metadata practices themselves). this maximizes flexibility in metadata contribution, as well as interoperability within the collective data pool that a user can search. granted, this unregulated framework does come at a cost of inconsistency in metadata detail and quality.
however, the great advantage of such nominal requirements is that they enable contributors with minimal metadata-encoding practices to participate in the metadata collaborative. following is an example of a record as it may appear in the mosc collection:
oai:atlantahistorycenter.com:10 2003-03-31 south:blues south:mississippi-delta-region
long hall recordings morris, william blues .. comment: sound amateur recording 2003-05-16 sound recording http://atlantahistorycenter.com/porcelain/10
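the record above has lost its xml markup. as a rough illustration only, the sketch below rebuilds a record of this shape with the standard oai-pmh header elements (identifier, datestamp, setSpec) and unqualified dublin core elements. the assignment of each value to a particular dc element (title, creator, subject, description, date, type, identifier) is an assumption made for illustration, as is reading the space in the url as a line-wrap artifact; the original element names are not recoverable from the text.

```python
import xml.etree.ElementTree as ET

# Namespace URIs defined by OAI-PMH 2.0 and Dublin Core.
OAI = "http://www.openarchives.org/OAI/2.0/"
OAI_DC = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC = "http://purl.org/dc/elements/1.1/"


def build_record(identifier, datestamp, set_specs, dc_fields):
    """Build an OAI-PMH <record> carrying unqualified Dublin Core metadata.

    dc_fields is a list of (element_name, value) pairs; no element is
    required, so the list may be as short or as long as a contributor wishes.
    """
    record = ET.Element(f"{{{OAI}}}record")
    header = ET.SubElement(record, f"{{{OAI}}}header")
    ET.SubElement(header, f"{{{OAI}}}identifier").text = identifier
    ET.SubElement(header, f"{{{OAI}}}datestamp").text = datestamp
    for spec in set_specs:
        ET.SubElement(header, f"{{{OAI}}}setSpec").text = spec
    metadata = ET.SubElement(record, f"{{{OAI}}}metadata")
    dc = ET.SubElement(metadata, f"{{{OAI_DC}}}dc")
    for element, value in dc_fields:
        ET.SubElement(dc, f"{{{DC}}}{element}").text = value
    return record


# Values taken from the example record; the element mapping is assumed.
record = build_record(
    "oai:atlantahistorycenter.com:10",
    "2003-03-31",
    ["south:blues", "south:mississippi-delta-region"],
    [
        ("title", "long hall recordings"),
        ("creator", "morris, william"),
        ("subject", "blues"),
        ("description", ".. comment: sound amateur recording"),
        ("date", "2003-05-16"),
        ("type", "sound recording"),
        ("identifier", "http://atlantahistorycenter.com/porcelain/10"),
    ],
)
print(ET.tostring(record, encoding="unicode"))
```

because the field list is just a sequence of pairs, a contributor with only a title and an identifier could pass two pairs and still produce a valid unqualified dc record.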
additionally, with no fields required by the dc schema, institutions can have absolute discretion as to what metadata are exposed if this is a concern (as may be for privacy considerations for archives or for intellectual-property concerns for museums). however, one of the great strengths of implementing oai-pmh is that, while the threshold for regulating metadata is low, the protocol can also handle any metadata format expressed by xml, including data formats significantly more structured than dc; for example, ead, text encoding initiative (tei), and tei lite-defined documents. scholars are then able to access these scholarly objects via one point, while still being able to collectively access and utilize all metadata objects available in all collections, from the most to the least robust.

the aim of the mosc project participants in selecting oai-pmh is to maximize participation from fairly disparate kinds of organizations, with equally disparate kinds of metadata cultures and practices. in comparison to other, currently available methods of metadata aggregation, oai-pmh is maximally forgiving of discordant metadata suppliers. thereby, the hope is, metadata contributions are maximized. concurrently, the protocol allows for highly robust metadata formats. as the cost for inclusion in aggregated systems, in some cases metadata objects are stripped down. this need is eliminated when oai-pmh is utilized. the use of the protocol allows for the inclusion of objects consisting of the most skeletal unqualified dublin core elements, while still accommodating the most complicated metadata objects. optimally, this is a means to achieve a critical mass of contributed resources that will enable end users to utilize the mosc project as the premier site and a primary resource for information on materials about music and musicians associated with social-change movements.

information technology and libraries | march 2005
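to make the harvesting side concrete, here is a minimal sketch of how a harvester might form a `ListRecords` request against a contributor's repository. the verb and the `metadataPrefix`, `set`, and `from` arguments are part of oai-pmh; the base url, the `ead` prefix, and the function name are hypothetical, since each repository advertises its own supported prefixes (only `oai_dc` is guaranteed).

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Return the URL for an OAI-PMH ListRecords request.

    metadata_prefix selects the format to harvest: "oai_dc" for unqualified
    Dublin Core, or a richer repository-defined prefix where one is offered.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec  # restrict the harvest to one set
    if from_date:
        params["from"] = from_date  # harvest only records changed since this date
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository address; the same harvester code asks for the
# skeletal or the richly structured format just by changing the prefix.
print(list_records_url("http://example.org/oai"))
print(list_records_url("http://example.org/oai", "ead", set_spec="south:blues"))
```

the point of the sketch is that the request looks the same whether the contributor exposes bare dublin core or a heavily structured format; only the prefix changes.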
acknowledgment

the author would like to express her sincerest gratitude to the institute of museum and library services for funding the music of social change project.

references

1. “metascholar: an emory university digital library research initiative,” emory university libraries web site. accessed sept. 1, 2004, http://metascholar.org/; “the center for southern culture,” university of mississippi web site. accessed sept. 1, 2004, www.olemiss.edu/depts/south/; “atlanta history center,” atlanta history center web site. accessed sept. 1, 2004, www.atlantahistorycenter.com/; “georgia music hall of fame,” georgia music hall of fame web site. accessed sept. 1, 2004, www.gamusichall.com/home.html; “institute of museum and library services: library-museum collaboration,” institute of museum and library services web site. accessed sept. 1, 2004, www.imls.gov/grants/l-m/index.htm.
2. “implementation guidelines for the open archives initiative protocol for metadata harvesting,” open archives initiative web site. accessed sept. 1, 2004, www.openarchives.org/oai/openarchivesprotocol.html#introduction.
3. “4.2 collection and set descriptions,” open archives initiative web site. accessed sept. 1, 2004, www.openarchives.org/oai/2.0/guidelines-repository.htm#setdescription.

content management systems: trends in academic libraries
ruth sara connell
information technology and libraries | june 2013

abstract

academic libraries, and their parent institutions, are increasingly using content management systems (cmss) for website management. in this study, the author surveyed academic library web managers from four-year institutions to discover whether they had adopted cmss, which tools they were using, and their satisfaction with their website management system. other issues, such as institutional control over library website management, were raised.
the survey results showed that cms satisfaction levels vary by tool and that many libraries do not have input into the selection of their cms because the determination is made at an institutional level. these findings will be helpful for decision makers involved in the selection of cmss for academic libraries.

introduction

as library websites have evolved over the years, so have their role and complexity. in the beginning, the purpose of most library websites was to convey basic information, such as hours and policies, to library users. as time passed, more and more library products and services became available online, increasing the size and complexity of library websites. many academic library web designers found that their web authoring tools were no longer adequate for their needs and turned to cmss to help them manage and maintain their sites. for other web designers, the choice was not theirs to make. their institution transitioned to a cms and required the academic library to follow suit, regardless of whether the library staff had a say in the selection of the cms or its suitability for the library environment.

the purpose of this study was to examine cms usage within the academic library market and to provide librarians quantitative and qualitative knowledge to help make decisions when considering switching to, or between, cmss. in particular, the objectives of this study were to determine (1) the level of saturation of cmss in the academic library community; (2) the most popular cmss within academic libraries, the reasons for the selection of those systems, and satisfaction with those cmss; (3) if there is a relationship between libraries with their own dedicated information technology (it) staff and those with open source (os) systems; and (4) if there is a relationship between institutional characteristics and issues surrounding cms selection.
ruth sara connell (ruth.connell@valpo.edu) is associate professor of library services and electronic services librarian, christopher center library services, valparaiso university, valparaiso, in.

although this study largely focuses on cms adoption and related issues, the library web designers who responded to the survey were asked to identify what method of web management they use if they do not use a cms and asked about satisfaction with their current system. thus, information regarding cms alternatives (such as adobe’s dreamweaver web content editing software) is also included in the results.

as will be discussed in the literature review, cmss have been broadly defined in the past. therefore, for this study participants were informed that only cmss used to manage their primary public website were of interest. specifically, cmss were defined as website management tools through which the appearance and formatting is managed separately from content, so that authors can easily add content regardless of web authoring skills.

literature review

most of the library literature regarding cms adoption consists of individual case studies describing selection and implementation at specific institutions. there are very few comprehensive surveys of library websites or the personnel in charge of academic library websites to determine trends in cms usage. the published studies including cms usage within academic libraries do not definitively answer whether overall adoption has increased.
in 2005 several georgia state university librarians surveyed web librarians at sixty-three of their peer institutions, and of the sixteen responses, six (or 38 percent) reported use of “cms technology to run parts of their web site.”1 a 2006 study of web managers from a wide range of institutions (associates to research) indicated a 26 percent (twenty-four of ninety-four) cms adoption rate.2 a more recent 2008 study of institutions of varying sizes resulted in a little more than half of respondents indicating use of cmss, although the authors note that “people defined cmss very broadly,”3 including tools like moodle and contentdm, and some of those libraries indicated they did not use the cms to manage their website.

a 2012 study by comeaux and schmetzke differs from the others mentioned here. rather than polling librarians and asking them to self-identify whether they used cmss, the authors reviewed the academic library websites of the fifty-six campuses offering ala-accredited graduate degrees (generally larger universities), using tools and examining page code to try to determine on their own whether the libraries used cmss. they identified nineteen out of fifty-six (34 percent) sites using cmss. the authors offer this caveat, “it is very possible that more sites use cmss than could be readily identified. this is particularly true for ‘home-grown’ systems, which are unlikely to leave any readily discernible source code.”4 because of the different methodologies and population groups in these studies, it is not possible to use their results to draw conclusions regarding cms adoption rates within academic libraries over time.

as mentioned previously, some people define cmss more broadly than others. one example of a product that can be used as a cms, but is not necessarily a cms, is springshare’s libguides. many libraries use libguides as a component of their website to create guides.
however, some libraries have utilized the product to develop their whole site, in effect using it as a cms. a case study by two librarians at york college describes why they chose libguides as their cms instead of as a more limited guide creation tool.5

several themes recurred throughout many of the case study articles. one common theme was the issue of lack of control and problems of collaboration between academic libraries and the campus entities controlling website management. amy york, the web services librarian at middle tennessee state university, described the decision to transition to a cms in this way, “and while it was feasible for us to remain outside of the campus cms and yet conform to the campus template, the head of the it web unit was quite adamant that we move into the cms.”6 in a study by bundza et al., several participants who indicated dissatisfaction with website maintenance mentioned “authority and decision-making issues” as well as “turf struggles.”7

other articles expressed more positive collaborative experiences. morehead state university librarians kmetz and bailey noted, “when attending conferences and hearing the stories of other libraries, it became apparent that a typical relationship between librarians and a campus it staff is often much less communicative and much less positive than [ours].
because of the relatively smooth collaborative spirit, a librarian was invited in 2003 to participate in the selection of a cms system.”8 kimberley stephenson also emphasized the advantageous relationships that can develop when a positive approach is used, “rather than simply complaining that staff from other departments do not understand library needs, librarians should respectfully acknowledge that campus web developers want to create a site that attracts users and consider how an attractive site that reflects the university’s brand can be beneficial in promoting library resources and services.”9 however, earlier in the article she does acknowledge that the iterative and collaborative process between the library and their university relations (ur) department was occasionally contentious and that the web services librarian notifies ur staff before making changes to the library homepage.10

another common theme in the literature was the reasoning behind transitioning to a cms. one commonly cited criterion was access control or workflow management, which allows site administrators to assign contributors editorial control over different sections of the site or approve changes before publishing.11 however, although this feature is considered a requirement by many libraries, it has its detractors. kmetz and bailey indicated that at morehead state university, “approval chains have been viewed as somewhat stifling and potentially draconian, so they have not been activated.”12

these studies greatly informed the questions used and development of the survey instrument for this study.

method

in designing the survey instrument, questions were considered based on how they informed the objectives of the study. to simplify analysis, it was important to compile as comprehensive a list of cmss as possible.
this list was created by pulling cms names from the literature review, the web4lib discussion list, and the cmsmatrix website (www.cmsmatrix.org).

in order to select institutions for distribution, the 2010 carnegie classification of institutions of higher education basic classification lists were used.13 the author chose to focus on three broad classifications:

1. research institutions, consisting of the following carnegie basic classifications: research universities (very high research activity), research universities (high research activity), and dru: doctoral/research universities.
2. master’s institutions, consisting of the following carnegie basic classifications: master's colleges and universities (larger programs), master's colleges and universities (medium programs), master's colleges and universities (smaller programs).
3. baccalaureate institutions, consisting of the following carnegie basic classifications: baccalaureate colleges—arts & sciences and baccalaureate colleges—diverse fields.

the basic classification lists were downloaded into excel with each of the three categories in a different worksheet, and then each institution was assigned a number using the random number generator feature within excel. the institutions were then sorted by those numbers, creating a randomly ordered list within each classification. to determine sample size for a stratified random sampling, ronald powell’s “table for determining sample size from a given population”14 (with a .05 degree of accuracy) was used. each classification’s population was considered separately, and the appropriate sample size chosen from the table. the population size of each of the groups (total number of institutions within that carnegie classification) and the corresponding sample sizes were

• research: population = 297, sample size = 165;
• master’s: population = 727, sample size = 248;
• baccalaureate: population = 662, sample size = 242.
the total number of institutions included in the sample was 655. the author then went through the list of selected institutions and searched online to find their library webpages and find the person most likely responsible for the library’s website. during this process, there were some institutions, mostly for-profits, for which a library website could not be found. when this occurred, that institution was eliminated and the next institution on the list used in its place. in some cases, the person responsible for web content was not easily identifiable; in these cases an educated guess was made when possible, or else the director or a general library email address was used.

the survey was made available online and distributed via e-mail to the 655 recipients on october 1, 2012. reminders were sent on october 10 and october 18, and the survey was closed on october 26, 2012. out of 655 recipients, 286 responses were received. some of those responses had to be eliminated for various reasons. if two responses were received from one institution, the more complete response was used while the other response was discarded. some responses included only an answer to the first question (the name of the institution, or a decision to skip that question and answer demographic questions instead) and no other responses; these were also eliminated. once the invalid responses were removed, 265 remained, for a 40 percent response rate.

before conducting an analysis of the data, some cleanup and standardization of results was required. for example, a handful of respondents indicated they used a cms and then indicated that their cms was dreamweaver or adobe contribute. these responses were recoded as non-cms responses. likewise, one respondent self-identified as a non-cms user but then listed drupal as his/her web management tool, and this was recoded as a cms response.
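the sampling procedure described above was carried out in excel. purely as an illustration, the same steps (randomly order each stratum, take institutions from the top of the list, and skip any without a findable library website) could be sketched as follows. the function name, the `has_library_site` check, and the seed are hypothetical stand-ins; the population and sample sizes are the ones reported in the text.

```python
import random

# Population and sample sizes reported for each Carnegie stratum.
STRATA = {
    "research": {"population": 297, "sample": 165},
    "master's": {"population": 727, "sample": 248},
    "baccalaureate": {"population": 662, "sample": 242},
}


def draw_sample(institutions, sample_size, has_library_site, seed=None):
    """Randomly order one stratum and take the first eligible institutions.

    An institution without a findable library website is skipped and the
    next one on the randomly ordered list is used in its place.
    """
    rng = random.Random(seed)
    ordered = list(institutions)
    rng.shuffle(ordered)
    chosen = []
    for institution in ordered:
        if len(chosen) == sample_size:
            break
        if has_library_site(institution):
            chosen.append(institution)
    return chosen


total_sampled = sum(s["sample"] for s in STRATA.values())
response_rate = 265 / total_sampled  # valid responses / surveys distributed
print(total_sampled, f"{response_rate:.0%}")
```

note that the response rate uses the 265 valid responses rather than the 286 received, which matches how the figure is reported in the text.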
demographic profile of respondents

for the purposes of gathering demographic data, respondents were offered two options. they could provide their institution’s name, which would be used solely to pair their responses with the appropriate carnegie demographic categories (not to identify them or their institution), or they could choose to answer a separate set of questions regarding their size, public/private affiliation, and basic carnegie classification.

the largest response group by basic carnegie classification was master’s with 102 responses (38 percent), followed by baccalaureate institutions (94 responses or 35 percent) and then research institutions (69 responses or 26 percent). this corresponds closely with the distribution percentages, which were 38 percent master’s (248 out of 655), 37 percent baccalaureate (242 out of 655), and 25 percent research (165 out of 655).

of the 265 responses, 95 (36 percent) came from academic librarians representing public institutions and 170 (64 percent) from private. of the private institutions, the vast majority (166 responses or 98 percent) were not-for-profit, while 4 (2 percent) were for-profits.

to define size, the carnegie size and setting classification was used. very small institutions are defined as less than 1,000 full-time equivalent (fte) enrollment, small is 1,000–2,999 fte, medium is 3,000–9,999 fte, and large is at least 10,000 fte. the largest group of responses came from small institutions (105 responses or 40 percent), then medium (67 responses or 25 percent), large (60 responses or 23 percent), and very small (33 responses or 12 percent).

results

the first question asking for institutional identification (or alternative routing to carnegie classification questions) was the only question for which an answer was required. in addition, because of question logic, some people saw questions that others did not based on how they answered previous questions.
thus, the number of responses varies for each question.

one of the objectives of this study was to identify whether there were relationships between institutional characteristics and cms selection and management. the results that follow include both descriptive statistics and statistically significant inferential statistics discovered using chi-square and fisher’s exact tests. statistically significant results are labeled as such.

the responses to this survey show that most academic libraries are using a cms to manage their main library website (169 out of 265 responses or 64 percent). overall, cms users expressed similar (although slightly greater) satisfaction levels with their method of web management compared with non-cms users (see table 1).

table 1. satisfaction by cms use
(cells show number of responses and percent of column)

user is highly satisfied     use a cms to manage library website
or satisfied                 yes            no
yes                          79 (54%)       41 (47%)
no                           68 (46%)       46 (53%)
total                        147 (100%)     87 (100%)

non-cms users

non-cms users were asked what software or system they use to govern their site. by far, the most popular system mentioned among the 82 responses was adobe dreamweaver, with 24 (29 percent) users listing it as their only or primary system. some people listed dreamweaver as part of a list of tools used; for example “php / mysql, integrated development environments (php storm, coda), dreamweaver, etc.,” and if all mentions of dreamweaver are included, the number of users rises to 31 (38 percent). some version of “hand coded” was the second most popular answer with 9 responses (11 percent), followed by adobe contribute with 7 (9 percent). many of the “other” responses were hard to classify and were excluded from analysis.
some examples include:

• ftp to the web
• voyager public web browser ezproxy
• excel, e-mail, file folders on shared drives

among the top three non-cms web management systems, dreamweaver users were most satisfied, selecting highly satisfied or satisfied in 15 out of 24 (63 percent) cases. hand coders were highly satisfied or satisfied in 5 out of 9 cases (56 percent), and adobe contribute users were only highly satisfied or satisfied in 3 out of 7 (43 percent) cases.

respondents not using a cms were asked whether they were considering a move to a cms within the next two years. most (59 percent) said yes. research libraries were much more likely to be planning such a move (81 percent) than master’s (50 percent) or baccalaureate (45 percent) libraries (see table 2). a chi-square test rejects the null hypothesis that the consideration of a move to cms is independent of basic carnegie classification; this difference was significant at the p = 0.038 level.

table 2. non-cms users considering a move to a cms within the next two years, by carnegie classification*
(cells show number of responses and percent of column)

         baccalaureate   master’s      research      total
no       11 (55%)        11 (50%)      4 (19%)       26 (41%)
yes      9 (45%)         11 (50%)      17 (81%)      37 (59%)
total    20 (100%)       22 (100%)     21 (100%)     63 (100%)

chi-square = 6.526, df = 2, p = .038
* excludes “not sure” responses

non-cms users were asked to provide comments related to topics covered in the survey, and here is a sampling of responses received:

• cmss cost money that our college cannot count on being available on a yearly basis.
• the library doesn't have overall responsibility for the website. university web services manages the entire site, i submit changes to them for inclusion and updates.
• we are so small that the time to learn and implement a cms hardly seems worth it. so far this low-tech method has worked for us.
• the main university site was moved to a cms in 2008. the library was not included in that move because of the number of pages. i hear rumors that we will be forced into the cms that is under consideration for adoption now. the library has had zero input in the selection of the new cms.

cms users

when respondents indicated their library used a cms, they were routed to a series of cms related questions. the first question asked which cms their library was using. of the 153 responses, the most popular cmss were drupal (40); wordpress (15); libguides (14), which was defined within the survey as a cms “for main library website, not just for guides”; cascade server (12); ektron (6); and modx and plone (5 each).

these users were also asked about their overall satisfaction with their systems. among the top four cmss, libguides users were the most satisfied, selecting highly satisfied or satisfied in 12 out of 12 (100 percent) cases. the remaining three systems’ satisfaction ratings (highly satisfied or satisfied) were as follows: wordpress (12 out of 15 cases or 80 percent), drupal (26 out of 38 cases or 68 percent), and cascade server (3 out of 11 cases or 27 percent).

when asked whether they would switch systems if given the opportunity, most (61 out of 109 cases or 56 percent) said no. looking at the responses for the top four cmss, responses echo the satisfaction responses. libguides users were least likely to want to switch (0 out of 7 cases or 0 percent), followed by wordpress (1 out of 5 cases or 17 percent), drupal (8 out of 23 cases or 26 percent), and cascade server (3 out of 7 or 43 percent) users.

respondents were asked whether their library uses the same cms as their parent institution. most (106 out of 169 cases or 63 percent) said yes.
libraries at large institutions (over 10,000 fte) were much less likely (34 percent) than their smaller counterparts to share a cms with their parent institution (see table 3). a chi-square test rejects the null hypothesis that sharing a cms with a parent institution is independent of size: at a significance level of p = 0.001, libraries at smaller institutions are more likely to share a cms with their parent.

table 3. cms users whose libraries use the same cms as their parent institution, by size
(cells show number of responses and percent of column)

         large        medium       small        very small   total
no       23 (66%)     15 (33%)     19 (27%)     6 (35%)      63 (37%)
yes      12 (34%)     31 (67%)     52 (73%)     11 (65%)     106 (63%)
total    35 (100%)    46 (100%)    71 (100%)    17 (100%)    169 (100%)

chi-square = 15.921, df = 3, p = .001

not surprisingly, a similar correlation holds true for comparing shared cmss and simplified basic carnegie classification. baccalaureate and master’s libraries were more likely to share cmss with their institutions (69 percent and 71 percent respectively) than research libraries (42 percent) (see table 4). at a significance level of p = 0.004, a chi-square test rejects the null hypothesis that sharing a cms with a parent institution is independent of basic carnegie classification.
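the chi-square figures reported with tables 2 and 3 can be rechecked from the published cell counts. the sketch below is not the software the author used (that is not stated); it is a small standard-library illustration that computes the pearson chi-square statistic for a contingency table and a p-value from the closed-form survival function for integer degrees of freedom. the function names are illustrative.

```python
from math import erfc, exp, gamma, sqrt


def chi_square(table):
    """Return (statistic, degrees of freedom) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


def chi_square_p(statistic, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    half = statistic / 2
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{k=0}^{df/2-1} (x/2)^k / k!
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_{k=1}^{(df-1)/2} (x/2)^(k-1/2) / Gamma(k+1/2)
    total = sum(half ** (k - 0.5) / gamma(k + 0.5) for k in range(1, (df - 1) // 2 + 1))
    return erfc(sqrt(half)) + exp(-half) * total


# Rows are "no" and "yes"; columns follow the order used in the tables.
table_2 = [[11, 11, 4], [9, 11, 17]]
table_3 = [[23, 15, 19, 6], [12, 31, 52, 11]]
for name, table in (("table 2", table_2), ("table 3", table_3)):
    statistic, df = chi_square(table)
    print(name, round(statistic, 3), df, round(chi_square_p(statistic, df), 3))
```

run against the counts in tables 2 and 3, the sketch reproduces the reported statistics and p-values to three decimal places.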
table 4. cms users whose libraries use the same cms as their parent institution, by carnegie classification
(cells show number of responses and percent of column)

         baccalaureate   master’s      research      total
no       19 (31%)        18 (29%)      26 (58%)      63 (37%)
yes      43 (69%)        44 (71%)      19 (42%)      106 (63%)
total    62 (100%)       62 (100%)     45 (100%)     169 (100%)

chi-square = 11.057, df = 2, p = .004

when participants responded that their library shared a cms with the parent institution, they were asked a follow-up question about whether the library made the transition with the parent institution. most (80 out of 99 cases or 81 percent) said yes, the transition was made together. however, private institutions were more likely to have made the switch together (88 percent) than public (63 percent) (see table 5). a fisher’s exact test rejects the null hypothesis that transition to cms is independent of institutional control: at a significance level of p = 0.010, private institutions are more likely than public to move to a cms in concert.

table 5. users whose libraries and parent institutions use the same cms: transition by public/private control*
(cells show number of responses and percent of column)

                          private        public        total
switched independently    9 (13%)        10 (37%)      19 (19%)
switched together         63 (88%)       17 (63%)      80 (81%)
total                     72 (101%)**    27 (100%)     99 (100%)

fisher’s exact test: p = .010
* excludes responses where people indicated “other”
** due to rounding, total is greater than 100%

similarly, a relationship existed between transition to cms and basic carnegie classification. baccalaureate institutions (93 percent) were more likely than master’s (80 percent), which were more likely than research institutions (53 percent) to make the transition together (see table 6).
a chi-square test rejects the null hypothesis that the transition to cms is independent of basic carnegie classification: at a significance level of p = 0.002, higher degree granting institutions are less likely to make the transition together.

table 6. users whose libraries and parent institutions use the same cms: transition by carnegie classification*
(cells show number of responses and percent of column)

                          baccalaureate   master’s       research      total
switched independently    3 (7%)          8 (21%)        8 (47%)       19 (19%)
switched together         40 (93%)        31 (80%)       9 (53%)       80 (81%)
total                     43 (100%)       39 (101%)**    17 (100%)     99 (100%)

chi-square = 12.693, df = 2, p = .002
* excludes responses where people indicated “other”
** due to rounding, total is greater than 100%

this study indicates that for libraries that transitioned to a cms with their parent institution, the transition was usually forced. out of the 88 libraries that transitioned together and indicated whether they were given a choice, only 8 libraries (9 percent) had a say in whether to make that transition. and even though academic libraries were usually forced to transition with their institution, they did not usually have representation on campus-wide cms selection committees. only 25 percent (22 out of 87) of respondents indicated that their library had a seat at the table during cms selection.

when comparing cms satisfaction ratings among libraries that were represented on cms selection committees versus those that had no representation, it is not surprising that those with representation were more satisfied (13 out of 22 cases or 59 percent) than those without (21 out of 59 cases or 36 percent). the same holds true for those libraries given a choice whether to transition.
those given a choice were satisfied more often (6 out of 8 cases or 75 percent) than those forced to transition (21 out of 71 cases or 30 percent).

respondents who said that they were not on the same cms as their institution were asked why they chose a different system. many of the responses indicated a desire for freedom from the controlling influence of the it or marketing arms of the institution:

• we felt drupal offered more flexibility for our needs than cascade, which is what the university at large was using. i've heard more recently that the university may be considering switching to drupal.
• university pr controls all aspects of the university cms. we want more freedom.
• we are a service-oriented organization, as opposed to a marketing arm. we by necessity need to be different.

cms users were asked to provide a list of three factors most important in their selection of their cms and to rank their list in order of importance. the author standardized the responses, e.g. “price” was recorded as “cost.” the factors listed first, in order of frequency, were ease of use (15), flexibility (10), and cost (6). ignoring the ranking, 38 respondents listed ease of use somewhere in their “top three”, while 23 listed cost, and 16 listed flexibility.

another objective of this study was to determine if there was a positive correlation between libraries with their own dedicated it staff and those who chose open source cmss. therefore cms users were asked if their library had its own dedicated it staff, and 66 out of 143 libraries (46 percent) said yes. then the cmss used by respondents were translated into two categories, open source or proprietary systems (when a cms listed was unknown it was coded as a missing value), and a fisher’s exact test was run against all cases that had values for both variables to see if a correlation existed.
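for readers unfamiliar with fisher’s exact test, the following is a minimal standard-library sketch of a two-sided test on a 2×2 table, using the common convention of summing the probabilities of all tables no more likely than the observed one. the article does not say which software or which two-sided convention was used, so this is an assumption, and the function name is illustrative. the counts are the published cell counts from the article’s tables 5 and 7.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    With row and column totals fixed, the top-left cell follows a
    hypergeometric distribution; the p-value sums the probabilities of
    every possible table that is no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denominator = comb(row1 + row2, col1)

    def probability(x):
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    observed = probability(a)
    # Small tolerance so ties with the observed table are counted.
    return sum(
        p
        for p in (probability(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


# table 5: switched independently (private, public) / switched together (private, public)
p_table_5 = fisher_exact_two_sided(9, 10, 63, 17)
# table 7: open source (own it yes, no) / not open source (own it yes, no)
p_table_7 = fisher_exact_two_sided(37, 32, 14, 24)
print(p_table_5, p_table_7)
```

under this convention the first table falls below the conventional 0.05 threshold and the second does not, which agrees with the conclusions drawn in the text.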
although those with library it had open source systems more frequently than those without, the difference was not significant (see table 7).

table 7. libraries with own it personnel by open source cmss
(cells show number of responses and percent of column)

                      library has own it
cms is open source    yes           no            total
yes                   37 (73%)      32 (57%)      69 (65%)
no                    14 (28%)      24 (43%)      38 (36%)
total                 51 (101%)*    56 (100%)     107 (101%)*

fisher’s exact test: p = .109
* due to rounding, total is greater than 100%

in another question, people were asked to self-identify if their organization uses an open source cms, and if so asked whether they have outsourced any of its implementation or design to an outside vendor. most (61 out of 77 cases or 79 percent) said they had not outsourced implementation or design. one person commented, “no, i don't recommend doing this. the cost is great, you lose the expertise once the consultant leaves, and the maintenance cost goes through the roof. hire someone fulltime or move a current position to be the keeper of the system.”

one of the advantages of having a cms is the ability to give multiple people, regardless of their web authoring skills, the opportunity to edit webpages. therefore, cms users were asked how many web content creators they have within their library. out of 152 responses, the most frequent range cited was 2–5 authors (72 responses or 47 percent), followed by a single author (33 responses or 22 percent), 6–10 authors (20 responses or 13 percent), 21–50 authors (16 responses or 11 percent), 11–20 authors (6 responses or 4 percent), and over 50 authors (5 responses or 3 percent). because this question was open-ended and responses varied greatly, including “over 100 (over 20 are regular contributors)” and “1–3”, standardization was required. when a range or multiple numbers were provided, the largest number was used.
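the standardization rule just described (take the largest number mentioned, then place it in one of the reported ranges) can be sketched as below. how the author actually coded the answers is not documented, so the regular expression, the function names, and the handling of answers with no number are assumptions.

```python
import re

# Upper bound of each reported range, paired with its label.
RANGES = [(1, "1"), (5, "2-5"), (10, "6-10"), (20, "11-20"), (50, "21-50")]


def largest_number(answer):
    """Return the largest whole number mentioned in a free-text answer."""
    numbers = [int(match) for match in re.findall(r"\d+", answer)]
    return max(numbers) if numbers else None


def author_range(answer):
    """Map a free-text count of web authors to one of the reported ranges."""
    count = largest_number(answer)
    if count is None or count < 1:
        return None  # no usable number; would need manual review
    for upper_bound, label in RANGES:
        if count <= upper_bound:
            return label
    return "over 50"


for answer in ("over 100 (over 20 are regular contributors)", "1–3", "1"):
    print(repr(answer), "->", author_range(answer))
```

a rule like this is simple but blunt: it would misread a figure written with a thousands separator, so a real cleanup pass would still want a manual check of unusual answers.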
respondents were asked whether their library uses a workflow management process requiring page authors to receive approval before publishing content. of the 131 people who responded yes or no, most (88 responses or 67 percent) said no.

cms users were asked to provide comments related to topics covered in the survey. many comments mentioned issues of control (or lack thereof), while another common theme was concerns with specific cmss. here is a sampling of responses received:
• having dedicated staff is a necessity. there was a time when these tools could be installed and used by a techie generalist. those days are over. a professional content person and a professional cms person are a must if you want your site to look like a professional site... i'm shocked at how many libraries switched to a cms yet still have a site that looks and feels like it was created 10 years ago.
• since the cms was bred in-house by another university department, we do not have control over changing the design or layout. the last time i requested a change, they wanted to charge us.
• our university marketing department, which includes the web team, is currently in the process of switching [cmss]. we were not invited to be involved in the selection process for a new cms, although they did receive my unsolicited advice.
• we compared costs for open source and licensed systems, and we found the costs to be approximately equivalent based on the development work we would have needed in an open source environment.
• the library was not part of the original selection process for the campus' first cms because my position didn't exist at that time. now that we have a dedicated web services position, the library is considered a "power user" in the cms and we are often part of the campus wide discussions about the new cms and strategic planning involving the campus website.
• we currently do not have the preferred level of control over our library website; we fought for customization rights for our front page, and won on that front. however, departments on campus do not have permission to install or configure modules, which we hope will change in the future.
• there’s a huge disconnect between it/administration and the library regarding unique needs of the library in the context of web-based delivery of information.

discussion
comparing the results of this study to previous studies indicates that cms usage within academic libraries is rising. the 64 percent cms adoption rate found in this survey, which used a narrower definition of cms than some previous studies cited in the literature review, is higher than adoption rates in any of said studies. as more libraries make the transition, it is important to know how different cmss have been received among their peers. although cms users are slightly more satisfied than non-cms users (54 percent vs. 47 percent), the tools used matter. so if a library using dreamweaver to manage its site is given an option of moving with its institution to a cms and that cms is cascade server, it should strongly consider sticking with its current non-cms method based on the respective satisfaction levels reported in this study (63 percent vs. 27 percent).

satisfaction levels are important, but should not be considered in a vacuum. for example, although libguides users reported very high satisfaction levels (100 percent were satisfied or very satisfied), users were mostly (11 out of 14 users or 79 percent) small or very small schools, while the remaining three (21 percent) were medium schools. no large schools reported using libguides as their cms. libguides may be wonderful for a smaller school without need of much customization or, in some cases, access to technical expertise but may not be a good cms solution for larger institutions.
one of the largest issues raised by survey respondents was libraries’ control, or lack thereof, when moving to a campus-selected cms. given the complexity of academic libraries’ websites, library representation on campus-wide cms selection committees is warranted. not only are libraries more satisfied with the results when given a say in the selection, but libraries have special needs when it comes to website design that other campus units do not. including library representation ensures those needs are met.

some of the respondents’ comments regarding lack of control over their sites are troubling for libraries that are being forced into, or are considering, a move to a campus cms. clearly, having to pay another campus department to make changes to the library site is not an attractive option for most libraries. nor should libraries have to fight for the right or ability to customize their home pages. developing good working relationships with the decision makers may help prevent some of these problems, but likely not all. this study indicates that it is not uncommon for academic libraries to be forced into cmss, regardless of the cmss’ acceptability to the library environment.

conclusion
the adoption of cmss to manage academic libraries’ websites is increasing, but not all cmss are created equal. when given input into switching website management tools, library staff have many factors to take into consideration. these include, but are not limited to, in-house technical expertise, desirability of open source solutions, satisfaction of peer libraries with considered systems, and library-specific needs, such as workflow management and customization requirements. ideally, libraries would always be partners at the table when campus-wide cms decisions are being made, but this study shows that this does not happen in most cases.
if a library suspects that it is likely to be required to move to a campus-selected system, its staff should be alert for news of impending changes so that they can become involved at the beginning of the process and provide input. a transition to a bad cms can have long-term negative effects on the library, its users, and staff. a library’s website is its virtual “branch” and vitally important to the functioning of the library. the management of such an important component of the library should not be left to chance.

references
1. doug goans, guy leach, and teri m. vogel, “beyond html: developing and re-imagining library web guides in a content management system,” library hi tech 24, no. 1 (2006): 29–53, doi:10.1108/07378830610652095.
2. ruth sara connell, “survey of web developers in academic libraries,” the journal of academic librarianship 34, no. 2 (march 2008): 121–129, doi:10.1016/j.acalib.2007.12.005.
3. maira bundza, patricia fravel vander meer, and maria a. perez-stable, “work of the web weavers: web development in academic libraries,” journal of web librarianship 3, no. 3 (july 2009): 239–62.
4. david comeaux and axel schmetzke, “accessibility of academic library web sites in north america—current status and trends (2002–2012),” library hi tech 31, no. 1 (january 28, 2013): 2.
5. daniel verbit and vickie l. kline, “libguides: a cms for busy librarians,” computers in libraries 31, no. 6 (july 2011): 21–25.
6. amy york, holly hebert, and j. michael lindsay, “transforming the library website: you and the it crowd,” tennessee libraries 62, no. 3 (2012).
7. bundza, vander meer, and perez-stable, “work of the web weavers: web development in academic libraries.”
8. tom kmetz and ray bailey, “migrating a library’s web site to a commercial cms within a campus-wide implementation,” library hi tech 24, no.
1 (2006): 102–14, doi:10.1108/07378830610652130.
9. kimberley stephenson, “sharing control, embracing collaboration: cross-campus partnerships for library website design and management,” journal of electronic resources librarianship 24, no. 2 (april 2012): 91–100.
10. ibid.
11. elizabeth l. black, “selecting a web content management system for an academic library website,” information technology & libraries 30, no. 4 (december 2011): 185–89; andy austin and christopher harris, “welcome to a new paradigm,” library technology reports 44, no. 4 (june 2008): 5–7; holly yu, “chapter 1: library web content management: needs and challenges,” in content and workflow management for library web sites: case studies, ed. holly yu (hershey, pa: information science publishing, 2005), 1–21; wayne powel and chris gill, “web content management systems in higher education,” educause quarterly 26, no. 2 (2003): 43–50; goans, leach, and vogel, “beyond html.”
12. kmetz and bailey, “migrating a library’s web site.”
13. carnegie foundation for the advancement of teaching, 2010 classification of institutions of higher education, accessed february 4, 2013, http://classifications.carnegiefoundation.org/descriptions/basic.php.
14. ronald r. powell, basic research methods for librarians (greenwood, 1997).

managing in-library use data: putting a web geographic information systems platform through its paces
bruce godfrey and rick stoddart
information technology and libraries | june 2018 34
bruce godfrey (bgodfrey@uidaho.edu) is gis librarian and rick stoddart (rstoddart@uidaho.edu) is education librarian at the university of idaho library.

abstract
web geographic information system (gis) platforms have matured to a point where they offer attractive capabilities for collecting, analyzing, sharing, and visualizing in-library use data for space-assessment initiatives.
as these platforms continue to evolve, it is reasonable to conclude that enhancements to these platforms will not only offer librarians more opportunities to collect in-library use data to inform the use of physical space in their buildings, but also that they will potentially provide opportunities to more easily share database schemas for defining learning spaces and observations associated with those spaces. this article proposes using web gis, as opposed to traditional desktop gis, as an approach for collecting, managing, documenting, analyzing, visualizing, and sharing in-library use data and goes on to highlight the process for utilizing the esri arcgis online platform for a pilot project by an academic library for this purpose.

introduction
a geographic information system (gis) is a computer program for working with geographic data. a gis is an ideal tool for capturing data about library learning spaces because they can be described by a geographic area. the learning spaces might be small or large, irregularly shaped or symmetrical; either way, the shape can be described by a set of geographic coordinates. tools for storing, managing, documenting, analyzing, and visualizing geographic data can all be found in a gis. the locations and shapes of geographic features (such as library learning spaces) as well as attributes of those features (such as the type of learning space) can be captured in a gis.

the roots of giss stretch back to the 1960s. goodchild characterizes giss’ advances in spatial analysis during the 1970s and the growth of gis in the 1980s, coinciding with the proliferation and affordability of desktop computers.1 the enhancement of gis software from desktop computer applications to online platforms has been underway for some time. the origins of web gis can be traced back to the 1990s, but it is only since the mid-2000s that products have really matured to a point where they can be viable alternatives to their desktop counterparts.
web gis first appeared in 1993 when xerox corporation’s palo alto research center created an online map viewer.2 their map viewer, running in a web browser, was the first demonstration of performing gis tasks without gis software installed on a local computer. even though this early web-based gis application had limited capabilities, the potential of performing gis operations from computers anywhere and anytime was recognized. the possible capabilities of web gis began to be more fully discussed in the mid-1990s.3 web gis software became available in earnest in 1996 as gis companies began releasing commercial offerings.4 the first two decades of this century have seen web gis explode in functionality and scope to become an integral part of most giss.
managing in-library use data | godfrey and stoddart 35 https://doi.org/10.6017/ital.v37i2.10208
in late 2012, a collaborative mapping platform hosted by esri (environmental systems research institute) named arcgis online (https://www.arcgis.com/) was released. esri is a gis software company that was founded in 1969, and its products are used by more than 7,000 colleges and universities across the globe.5 the collaborative platform enables users to create, manage, analyze, store, and share maps, applications, and data on the internet.

gis software continues to evolve from desktop computer programs to specialized software applications (i.e., apps) that are part of a web-focused platform. this transformation is greatly broadening the accessibility of the technology to a wider array of users. what was once a technology reserved for geographic information professionals because of its complexity and cost has now been streamlined and put in the hands of nonprofessionals who want to take advantage of its many possibilities.
it is no longer reserved for academic disciplines such as geographic information science and remote sensing science; instead, gis has seen its use grow in humanities and social science to the point where libraries are developing targeted services for these disciplines.6 professionals are afforded the ability to share their data more easily, and nonprofessionals are able to utilize those data to create information and knowledge more easily. this transformation bodes well for libraries because it lowers technological hurdles that might have precluded the technology’s use for space-assessment and other place-based initiatives in the past. now that software-as-a-service (saas) mapping platforms such as mango, gis cloud, and arcgis online enable users to access capabilities over the internet, there is no server software for users to install or licensing to configure. additionally, the training required by personnel to gather, utilize, and manage data has been greatly reduced compared to its desktop predecessor. academic libraries, and libraries in general, stand to gain from the evolution.

the use of desktop gis for space assessment
the value of space planning efforts in libraries and the observational methods employed to conduct such activities have been well articulated in library research. the use of desktop gis as a tool for collecting in-library use data in academic libraries has been present for more than a decade. bishop and mandel show that libraries’ use of gis falls into two broad categories, analyzing service area populations and facilities management, the latter of which encompasses “in-library use and occupancy of library study space.”7 work related to the use of gis to study library-patron spaces is discussed below.

in the past twenty years, academic libraries have seen many transformations in their roles on college and university campuses. gis technologies have helped document and respond to those transformations.
xia outlined the value of using gis as a tool for space management in academic libraries more than a decade ago because of its “capacity for analyzing spatial data and interactive information.”8 in one study, xia describes using esri arcview 3.x desktop software for library space management. arcview was esri’s first gis software to have a graphical user interface; predecessors had command-line interfaces. xia mentions the use of, at that time, the emerging arcgis product, which went on to replace arcview 3.x. gis proved to be a valuable tool for xia to track the spatial distribution of books in the library environment.9 xia went on to measure and visualize the occupancy of study space using arcview.10 lastly, xia used arcview as an item-locating system within the physical space of the library.11

more recently, mandel utilized mapwindow, an open-source desktop gis originally developed at idaho state university, for creating maps of fictional in-library use data.12 mandel’s process demonstrated how a gis could be utilized to visualize the use of library spaces for marketing materials and services as well as graphically depicting a library’s value. coyle argued for the use of gis as a tool to analyze the interior space of the library, and specifically the library collection itself, while not implementing a system with any specific gis package.13 given and archibald detailed their use of visual traffic sweeps as an approach to collect and visualize in-library use data.14 their workflow involved utilizing a microsoft excel spreadsheet to capture data and then importing the data into arcgis to query and visualize the data. therefore, gis wasn’t used for data capture; it was used toward the end of the process to visualize these data.
while the body of work details the use of desktop gis for working with in-library use data, collaborative web gis platforms now offer opportunities to advance existing research in this arena by streamlining data-collection workflows, sharing database schemas, and enabling broader collaboration with peers, thereby potentially creating opportunities for new research. fusing the capabilities of these new platforms with traditional observational methods of gathering data on how people are using library spaces extends the body of knowledge and offers interesting new opportunities for research such as cross-institutional comparisons. it is critical for twenty-first-century academic libraries to collect such data to continue to evolve with the changing needs of digital-age campus research and culture.

utilizing a cloud-based platform for learning space assessment
discussed below is the approach employed for this pilot project to use web gis to collect, manage, share, and visualize information about library learning spaces. this pilot project utilized the esri arcgis online platform and client applications accessing that platform (see figure 1). collector for arcgis (http://doc.arcgis.com/en/collector), a ready-made app, was used for data collection. arcgis desktop (http://desktop.arcgis.com) was used at the outset to create the initial database schema. a custom html/javascript web application was developed to better enable library administrators to visualize the data as a map, table, or chart. prior to the implementation of this pilot project, the circulation department conducted floor sweeps for safety purposes (e.g., making sure certain doors were locked), but space assessment data had never been gathered for the library.

research study location
all observations were taken during fall 2016 and spring 2017 at the university of idaho library and the gary strong curriculum center. this article focuses on the implementation of the platform for use at the library.
the first floor of the university of idaho library underwent a remodel during winter 2016. the remodel included new furniture and different configurations of areas better customized for learning and studying. spaces such as group study, booths, and brainstorming spaces figured prominently in the remodel. additionally, expanded food and beverage options, together with proximity to open seating areas located near natural light, provide a welcoming environment. library hours were also expanded to 24 hours per day, 5 days a week. with these changes arose the desire to digitally collect data to learn about the use of these new locations by patrons. utilizing these data to inform decision-making about future changes to the physical spaces in the library, as well as connecting library learning spaces to campus learning outcomes, were goals of this research.

figure 1. infrastructure for the pilot project.

selecting the arcgis online platform
using locally existing resources to implement this pilot project was a requirement. funding was not available to purchase server software or hardware. personnel time could be carved out of existing positions for this effort, but money was not available to hire additional personnel. the university of idaho library does not have a dedicated it unit, so choices were limited. purchasing business-intelligence software such as tableau was cost prohibitive. an open-source tool such as suma, developed by north carolina state university libraries, was not a practical option in this case because the system requirements did not align with the expertise of existing personnel.15 fortunately, the arcgis online platform was available for this research at no cost to the library, and existing personnel had experience using the platform.
the university of idaho participates in, and contributes financially to, a state of idaho higher education site license for esri software. the software is then available to personnel across the institution for research, teaching, and, to a lesser extent at this time, administrative purposes. since arcgis online is a cloud platform, there is no server software to install and update and no server hardware to configure. additionally, the university of idaho gis librarian was familiar with the capabilities of the platform and available to actively participate in this research. in short, researchers’ access to and existing expertise with the arcgis online platform, coupled with the extensive capabilities of the platform itself, made it the best choice for this research.

pilot project design
a public services librarian and the gis librarian assumed leadership roles for the pilot project. the public services librarian led tasks associated with defining the learning (i.e., the data-collection) spaces, defining the data fields and domains for those spaces, and overseeing personnel responsible for collecting these data. the gis librarian led tasks associated with creating the database schema, creating the geographic features representing the learning spaces, creating a web application to visualize the data, and managing content on the arcgis online platform. library personnel were responsible for collecting the data.

gathering ancillary data
having building floor plans in a digital format was helpful for data collectors to orient themselves in the space when looking at a map on a mobile device. our research team was able to acquire georeferenced building floor plans for our institution from the information technology services unit on campus. each of the four floors of the library was published to arcgis online as a hosted tile layer to serve as a frame of reference for data collectors.
managing content and users
arcgis online provides the ability to create and define groups. groups are collections of items that can be shared among named users. individual user accounts for each project participant were created, and a group containing items for this pilot project to be shared among those users was created. this approach allowed all data associated with the project to be private and only shared among personnel participating in the project.

database design
the primary knowledge product resulting from this research was a web application containing a two-dimensional map, tables, and charts. a geodatabase, which is an assemblage of geographic datasets, needed to be designed and created to provide data to the web application.16 designing a geodatabase begins with defining the operational layers required to gather information.17 for this pilot project, one operational layer depicting individual learning spaces was required (see table 1).

table 1. description of the learning spaces layer
layer: learning spaces
map use: learning spaces define areas intended for a specific type of learning
data source: digitized using building floor plans as a frame of reference
representation: polygons

the learning spaces layer was used to store the geometry of the individual learning spaces. a table to store observations for each learning space was needed, and a relationship between each individual space and the observations for each space was required (see figure 2). the relationship binds observations to their appropriate learning spaces. the relationship was defined to allow one learning space to relate to many observations for that space.

figure 2. data elements of the geodatabase.

fields, analogous to columns in a spreadsheet, were defined for the learning spaces and observations table to store descriptive information.
for example, a friendly name was assigned to each learning space. additionally, domains were defined to manage valid values for specific fields. domains were necessary for quality control and quality assurance to enforce data integrity, enabling data collectors to pick items from lists rather than having to type the item names. this feature eliminates potential data-collection errors. field names, data types, field descriptions, and domains for this pilot project can be found in the appendix. defining data-collection spaces a template was created to define the information required to create each learning space feature. these features were created by digitizing them on a computer screen for each of the four floors of the library using the building floor plans as a frame of reference. ten learning spaces were defined for the first floor of the library and one each for floors 2, 3, and 4. a map for each floor was created and published to arcgis online as a hosted feature layer.18 each map contained two layers: one for the floor plan and one for the learning spaces (figure 3). library personnel used these maps to collect data. data collection data collection was accomplished using collector for arcgis installed on mobile devices. this eliminated the need for any software-development costs for data collection. collector for arcgis is a ready-made arcgis online application that is designed to provide an easy-to-use interface for collecting location-based data. the software was installed on a variety of devices, including a samsung galaxy tablet, a surface tablet, and an apple ipad. the online collection mode was enabled during collection, resulting in data being transferred real-time to arcgis online. the software can collect data in an offline mode, but, because strong internet connections were available in both campus buildings, the online mode was utilized. 
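the database design described above (one learning space related to many observations, with domains restricting the values a collector can enter) can be sketched in plain python. this is a minimal sketch and not the geodatabase itself: the class names, field names, and domain values are hypothetical placeholders, since the project's actual fields and domains are listed in the article's appendix.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Tuple

# Hypothetical domain of valid values for the "type of use" field.
USE_TYPE_DOMAIN = {"individual study", "group study", "socializing", "other"}


@dataclass
class Observation:
    user_count: int
    use_type: str
    collector: str
    observed_at: datetime
    comment: str = ""

    def __post_init__(self) -> None:
        # A domain acts as a pick list: anything outside it is rejected.
        if self.use_type not in USE_TYPE_DOMAIN:
            raise ValueError(f"use_type not in domain: {self.use_type!r}")
        if self.user_count < 0:
            raise ValueError("user_count cannot be negative")


@dataclass
class LearningSpace:
    name: str  # the "friendly name" for the space
    floor: int
    polygon: List[Tuple[float, float]]  # boundary coordinates
    observations: List[Observation] = field(default_factory=list)

    def record(self, observation: Observation) -> None:
        # One learning space relates to many observations.
        self.observations.append(observation)


booths = LearningSpace("booths", 1, [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)])
booths.record(Observation(3, "group study", "collector a", datetime(2016, 10, 3, 10, 0)))
print(len(booths.observations))
```

rejecting out-of-domain values at entry time is the same idea as letting collectors pick from a list rather than type free text.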
the collection workflow consisted of library personnel traversing the floors of the library and recording data about the number of users in each space, what the users were doing in the space, and entering additional context comments if necessary. library staff were encouraged to use their own expertise and observational cues (e.g., textbooks present) when recording data associated with patron activities in library spaces. the date, time, and name of the data collector were recorded automatically, an option available through the arcgis online platform. the user interface for the software was friendly and intuitive and required minimal training (figure 4). a list was provided to select the type of use for the selected space. data were accessible via arcgis online immediately following collection.

figure 3. first floor learning spaces of the university of idaho library overlaid on the building floor plan.

figure 4. the collector for arcgis user interface utilized for data collection.

results of using web gis
web gis, specifically arcgis online, offered the functionality required for collecting and managing in-library use data. additionally, the platform offers librarians supplementary opportunities for collaborative space-assessment projects. while the arcgis online platform proved to be useful for this pilot project, some of the advantages and limitations encountered are discussed below.

advantage: ease of use through targeted applications
esri software has been used in academia for decades. while the early command-line versions and later desktop versions were the playground of those with gis training, web gis applications have a decidedly friendlier interface because of the ability to customize applications on the platform for specific purposes.
for example, applications with management functionality can be separated from applications intended for data gathering. the need for excessive functionality to be included in one interface is replaced with a more modular framework, resulting in user interfaces that are less complex than those seen in many desktop gis programs.

while some personnel involved with this project had used esri software for many years and were familiar with the capabilities of arcgis online, they had not used the platform for data collection prior to this project. managing users and content for the project proved to be straightforward. it was made even easier when enterprise logins were configured, which allowed personnel to sign in using their institutional user name and password. authoring the database schema, creating the necessary maps, and publishing those maps as hosted services was not complicated for those with basic desktop and web gis knowledge. those responsible for collecting data needed little training using collector for arcgis to begin data collection. finally, librarians with no gis background were able to export the data to a familiar format (comma-separated values) to begin analysis using software such as excel. in short, authoring the database and map services remains best handled by those with gis experience. however, targeted application interfaces enable users without gis experience to collect and work with data.

advantage: participation in enterprise architecture
conducting library research on a platform many faculty, students, and staff are beginning to use for research, learning, and administration places librarians within the same collaborative space as the communities they are serving. in the case of this research, our need for building floor plans presented opportunities to more broadly discuss enterprise gis at our institution by sharing this information.
interaction took place between the library, facilities services, and information technology services, resulting in a cultivation of relationships around data sharing. furthermore, integration of our enterprise security with the arcgis online platform adds a level of legitimacy to geospatial data management efforts.

advantage: potential for cross-institutional collaborative projects
the potential for cross-institutional collaboration on library-space assessment and other projects should not be overlooked when using the arcgis online platform. such collaborations are even more manageable because esri software is being used by more than 7,000 colleges and universities across the globe. even though cross-institutional collaboration was not a goal of this research, the opportunities for projects or programs of this nature became abundantly clear. items created in arcgis online can be shared between organizations. simply sharing a library-space-assessment database schema with librarians at other institutions would allow them to quickly implement a similar project on the arcgis online platform. this opens the door to new research opportunities. the functionality exists for one institution to host a database that personnel from multiple institutions could populate. a single dataset containing learning spaces of multiple institutions with multiple contributors could be created, managed, and analyzed collaboratively. this could enable lower-resource libraries to participate in projects with larger institutions as economies of scale are realized. and it offers the ability to undertake projects across multiple institutions to explore broader space assessment or other research questions.

limitation: updating hosted feature service schemas
the ability to author and edit schemas entirely in arcgis online has not yet matured to a point where it matches the abilities of its desktop counterpart.
specifically, updating a published schema is currently difficult to accomplish in arcgis online because a user-friendly interface does not exist. however, the task can be accomplished by editing the javascript object notation (json) of the hosted feature service. while this is a current limitation for managers of the hosted feature service and not data collectors, it is anticipated that this will be addressed in future updates.

limitation: user interface for standards-based metadata

items created as part of the pilot project were documented using the metadata editor provided in arcgis online. arcgis online’s users can create and maintain geospatial standards-based metadata for content. however, the user interface for creating metadata based on either the iso 19115-series or federal geographic data committee (fgdc) content standard for digital geospatial metadata (csdgm) could be improved by reducing its complexity and allowing batch updates of specific elements. item documentation for the platform focuses on creating and editing elements of arcgis-format metadata. it should be noted, and potentially added as a point of concern for librarians, that the ability to author and edit metadata based on the iso and csdgm standards was introduced three years after the initial release of arcgis online.

limitation: visualizing data in related tables

the ability to visualize data collected as part of this project using ready-made applications in arcgis online yielded unsatisfactory results. the primary limitation was related to working with repeated measurements for the learning spaces. ready-made applications like web appbuilder and operations dashboard have limited support for a user-friendly presentation of repeated learning-space observations. therefore, a custom web application was developed by a university of idaho student using the esri javascript application programming interface (api).
the application provides the ability to select a date range, a time scope (e.g., daytime, nighttime, all hours), a building, and a floor to visualize the data. the learning spaces are colored by the total number of users in a space on the basis of the parameters selected (see figure 5).

figure 5. map view of the space assessment dashboard application.

for each individual space, a chart and table can be displayed to gain further insight (see figures 6 and 7).

figure 6. chart view of the space assessment dashboard application.

figure 7. table view of the space assessment dashboard application.

limitations: data-collection software issues

using collector for arcgis on devices running windows 10 proved frustrating because of a documented bug with collector. a “you are not connected to the internet” error would appear randomly, even when there was a valid internet connection. a workaround was implemented to circumvent the issue, but it was a source of frustration for data-collection staff. offline data-collection mode was experimented with to see if it was a more favorable option; however, the date and time of the data collection are not captured in offline mode, so that potential workflow was abandoned. there were no issues encountered for data collectors who used the samsung galaxy (running the android operating system) or an apple ipad.

conclusions

web-based gis platforms such as arcgis online have evolved to the point where they offer the functionality required for collecting and managing in-library use data. the arcgis online platform performed commendably for this pilot project.
while arcgis desktop was used to author the original database schema in this project, it is reasonable to conclude that it is only a matter of time until the functionality required to complete the entire workflow in the web-based platform is available. using mobile and desktop devices outfitted with the collector for arcgis application proved to be a practical way to collect real-time in-library use data. managing project users and the items those users were able to access was straightforward. while the visualization tools for repeated measurements data are currently limited in arcgis online, the data are accessible as a web service, and the sky is the limit on custom web-application development. looking ahead, adjusting schemas to capture height above and below ground level to take advantage of 3d data models and visualization is intriguing. use of this model may be beneficial for space-assessment projects that seek to gather data more broadly across institutions. finally, a noteworthy realization from this research is the potential for inter-institutional and cross-institution collaboration on library space–assessment projects, or other projects for that matter. librarians can begin embracing the web gis movement alongside those in the communities they participate in and serve. opportunities to create efficiencies are possible through the simple sharing of database schemas. additionally, the ability for one institution to host a database enabling personnel at multiple institutions, or at multiple libraries at larger institutions, to contribute data is available and ready for further research.
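the kind of summary the dashboard produces (total users per space for a chosen date range and time scope) can also be approximated from the exported comma-separated values. the following is a minimal sketch under stated assumptions, not the project's actual code: the column names spaceid, type_of_usage, and number_of_users and the coded values come from the appendix schemas, while the observed_at timestamp column, the function names, and the sample rows are hypothetical and exist only for illustration.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Coded values copied from the "type of use" domain listed in the appendix.
TYPE_OF_USAGE = {
    0: "browsing stacks",
    1: "individual studying",
    2: "lounging",
    3: "meeting / group study",
    4: "service point (circulation / reference / its help)",
    5: "using library computers",
}


def total_users_by_space(csv_text, start, end, hours=range(0, 24)):
    """Sum number_of_users per spaceid for rows inside a date range and time scope.

    'observed_at' is an assumed (hypothetical) ISO-8601 timestamp column.
    """
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        observed = datetime.fromisoformat(row["observed_at"])
        if start <= observed <= end and observed.hour in hours:
            totals[row["spaceid"]] += int(row["number_of_users"])
    return dict(totals)


def total_users_by_usage(csv_text):
    """Sum number_of_users per decoded type-of-usage label."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        label = TYPE_OF_USAGE[int(row["type_of_usage"])]
        totals[label] += int(row["number_of_users"])
    return dict(totals)


# Made-up sample rows, for illustration only (not data from the study).
SAMPLE = """spaceid,type_of_usage,number_of_users,observed_at
1a,3,4,2018-02-05T10:00:00
1a,1,2,2018-02-05T21:30:00
1b,2,5,2018-02-06T11:15:00
"""

daytime = total_users_by_space(
    SAMPLE, datetime(2018, 2, 5), datetime(2018, 2, 7), hours=range(8, 18)
)
by_usage = total_users_by_usage(SAMPLE)
```

passing a different hours range (for example, range(18, 24) for evening observations) changes the time scope without touching the rest of the logic, which mirrors the date-range and time-scope selectors described for the custom application.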
appendix: schemas for each object in the geodatabase used for data collection

building name table and associated domain values

domainname   buildingname
description  name of the building
fieldtype    smallinteger
domain type  codedvalue

code  name
0     library
1     education

space identifier table and associated domain values

domainname   spaceid
description  identifier for the area
fieldtype    string
domain type  codedvalue

code  name
1a    group study
1b    café
1c    landing
1d    computer lab
1e    individual/small group study
1f    mill (134)
1g    group study (133)
1h    group study (132)
1i    group study (131)
1j    classroom (120)
2a    2nd floor
3a    3rd floor
4a    4th floor
3a_1  imtc area 1
3b_1  imtc area 2
3c_1  imtc area 3
3d_1  imtc area 4

type of use table and associated domain values

domainname   typeofusage
description  type of usage of the area.
fieldtype    smallinteger
domain type  codedvalue

code  name
0     browsing stacks
1     individual studying
2     lounging
3     meeting / group study
4     service point (circulation / reference / its help)
5     using library computers

space assessment areas feature class table

field     datatype      description        domain
globalid  guid          global identifier
spaceid   string        space identifier   spaceid
floor     string        building floor
bldgname  smallinteger  building name      buildingname

space assessment areas observations table

field            datatype      description        domain
type_of_usage    smallinteger  type of usage      typeofusage
number_of_users  smallinteger  number of users
globalid         guid          global identifier
spaceid          string        space identifier   spaceid
comments         string        general comments

space assessment areas feature class to observations relationship class

cardinality        onetomany
isattributed       false
iscomposite        false
forwardpathlabel   space_assessment_data
backwardpathlabel  space_assessment_areas
description        relationship between the space assessment areas and data collected

origin class name       origin primary key  origin foreign key
space_assessment_areas  spaceid             spaceid

references

1 michael f. goodchild, “part 1. spatial analysts and gis practitioners,” journal of geographical systems 2, no. 1 (2000): 5–10, https://doi.org/10.1007/s101090050022.
2 pinde fu and jiulin sun, web gis: principles and applications (redlands, ca: esri, 2011), 7.
3 suzana dragićević, “the potential of web-based gis,” journal of geographical systems 6, no. 2 (2004): 79–81, https://doi.org/10.1007/s10109-004-0133-4.
4 fu and sun, web gis, 9.
5 “who we are,” esri, accessed october 17, 2017, http://www.esri.com/about-esri#who-we-are.
6 ningning kong, michael fosmire, and benjamin dewayne branch, “developing library gis services for humanities and social science: an action research approach,” college & research libraries 78, no. 4 (2017): 413–27, https://doi.org/10.5860/crl.78.4.413.
7 bradley wade bishop and lauren h. mandel, “utilizing geographic information systems (gis) in library research,” library hi tech 28, no. 4 (2010): 543, https://doi.org/10.1108/07378831011096213.
8 jingfeng xia, “library space management: a gis proposal,” library hi tech 22, no. 4 (2004): 375, https://doi.org/10.1108/07378830410570476.
9 jingfeng xia, “gis in the management of library pick-up books,” library hi tech 22, no. 2 (2004): 209–16, https://doi.org/10.1108/07378830410543520.
10 jingfeng xia, “visualizing occupancy of library study space with gis maps,” new library world 106, no. 5/6 (2005): 219–33, https://doi.org/10.1108/03074800510595832.
11 jingfeng xia, “locating library items by gis technology,” collection management 30, no. 1 (2005): 63–72, https://doi.org/10.1300/j105v30n01_07.
12 lauren h. mandel, “geographic information systems: tools for displaying in-library use data,” information technology & libraries 29, no. 1 (2010): 47–52, https://doi.org/10.6017/ital.v29i1.3158.
13 andrew coyle, “interior library gis,” library hi tech 29, no. 3 (2011): 529–49, https://doi.org/10.1108/07378831111174468.
14 lisa m. given and heather archibald, “visual traffic sweeps (vts): a research method for mapping user activities in the library space,” library & information science research 37, no. 2 (2015): 100–108, https://doi.org/10.1016/j.lisr.2015.02.005.
15 “suma,” north carolina state university libraries, accessed october 17, 2017, https://www.lib.ncsu.edu/projects/suma.
16 “what is a geodatabase?,” esri, accessed october 17, 2017, http://desktop.arcgis.com/en/arcmap/10.4/manage-data/geodatabases/what-is-a-geodatabase.htm.
17 “geodatabase design steps,” esri, accessed october 17, 2017, http://desktop.arcgis.com/en/arcmap/10.4/manage-data/geodatabases/geodatabase-design-steps.htm.
18 “hosted layers,” esri, accessed october 17, 2017, http://doc.arcgis.com/en/arcgis-online/share-maps/hosted-web-layers.htm.
communication

a library website migration: project planning in the midst of a pandemic

isabel vargas ochoa

information technology and libraries | december 2022
https://doi.org/10.6017/ital.v41i4.14801

isabel vargas ochoa (ivargas2@csustan.edu) is stockton campus & web services librarian, california state university, stanislaus. © 2022.

abstract

this article provides a background on the migration of the california state university (csu), stanislaus library website from an open-source platform to a content management system specifically designed for library websites.
before the migration, there was a trial of different content management systems (cms), a student usability study, and consultations with outside web and systems librarians to acquire better insight into their experiences migrating a library website and their familiarity with the different cms trialed.1 the evaluation process, website design, and usability study began before the pandemic and the global shift to remote services. however, despite this shift, the timeline for the migration was not altered and the migration was completed as planned. within a year, the library website migration planning, designing, trialing, and structural organization were completed using a modified waterfall model approach.

background

completed under a sudden time limit, the website migration project for the california state university (csu), stanislaus library website is both distinctive and relevant to other libraries that plan to complete a redesign of their website, on both desktop and mobile screens, to meet accessibility requirements under a limited schedule caused by unforeseen circumstances (in this case, a global pandemic and sudden shift to remote work). the website migration project included a reconsideration of the content management system (cms) the library was hosted on. csu stanislaus, surrounded by agricultural landscapes and settled in the central valley, is a hispanic-serving and minority-majority university. ethnic minorities make up 70% of total enrollment and three-fourths of the undergraduates are first-generation students.2 in fall 2021, a little over 8,800 fte (full-time equivalent) students were enrolled and total enrollment reached 10,500.3 the university has two campuses, turlock and stockton, and four colleges: the college of science; the college of business administration; the college of arts, humanities and social sciences; and the college of education, kinesiology and social work.
the csu stanislaus library website has been designed and redesigned over twenty years for services and content updates, university and library rebranding, and to comply with web accessibility requirements. before the library website migration in 2020, at the start of the covid-19 pandemic, the website was developed and produced using the drupal platform (version 7), an open-source cms. the contents of the library website have been updated from time to time since its first years and the website’s front-end design has been modified during the past few years. before the website migration from drupal to springshare llc, the library explored various cms, including wordpress and joomla. initially, staff encountered issues on the former library website hosted on drupal. over the years, the library’s website became difficult to maintain because of its continuously modified framework, and new branding became difficult to implement across the website’s content and overall theme. the objective of the website migration project was to effectively enhance the website’s interface for usability and to meet current standards and guidelines for accessibility. additionally, the university was set to launch a new design of the institutional website, which required that the library emulate the university website’s design for uniformity. in preparation for the university website redesign, it was necessary for the library to model the new university design, explore cms, and migrate the library website. a possible migration from drupal 7 to drupal 8 was investigated from 2017 to 2018, when the former web services librarian was in the process of redesigning the website; however, the redesign and migration was not completed.
it was discovered around the same time that editing or upgrading the initial design and development of the library website, which utilized a community-developed design heavily customized over the years, would make the migration extremely difficult. any editing that triggered modification of the locally customized theme caused other elements of the website to break or collapse, particularly the website’s layout design, including the header, footer, and menu. it quickly became apparent that it would be more sensible to begin the redesign of the website infrastructure on a platform starting from scratch. a new design would also facilitate the application of the latest accessibility and usability standards. a complete rebuild would also afford the library the option to consider other web management platforms. this would evidently be a challenging feat. so, a modified version of the waterfall model was adopted. for this migration, a simple cascading approach was chosen as it worked best with the natural flow of the library’s planned migration. the waterfall model consists of the following objective processes: requirement analysis, design, construction, acceptance testing, and transition.4 for the planned migration, the requirement analysis was confirming that i would have a local and cloud server available for trialing the cms and developing a website design. the design and construction processes would be complete when the new website design was created, the cms trials were finalized, and outside web services and systems librarians were consulted. the testing phase would be complete when the student usability studies were concluded. as explored in my previous article, “navigation design and library terminology,” a user-centered usability study was conducted to assist in the library website redesign and create the website prototype. the prototype was designed to assess the library website’s front-end elements as well as the layout theme and overall design.
lastly, the transition, or migration process, would be the final planned objective in the approach. as the web services librarian, i worked as the website migration project manager. the project manager migrated the final and redesigned library website and website content, conducted the student usability studies, tested the cms and created a cms recommendation for the library, and consulted with outside librarians on their experience migrating a website and using drupal or springshare as their library website cms.

timeline and an unexpected pandemic

the cms trials began in fall 2019 and continued until summer 2020. the former library website used drupal 7, so drupal 8 and springshare’s libguides cms were set to be trialed. the trials consisted of developing and designing a new library website on the platforms. experiences were documented and the design process was recorded for analysis and comparison of the platforms. this information was used to determine which platform would best support the new website. consultations were also sought from various web services librarians, system librarians, and information technology professionals.

table 1. timeline of the website migration

semester     activities
fall 2019    • trial springshare libguides cms5
             • develop library website design prototype
             • consult outside web and systems librarians on their migration to libguides cms
spring 2020  • trial drupal (version 8)
             • test website design prototype through a student usability study6
             • complete, compare, and analyze libguides cms and drupal platform trials
summer 2020  • finalize library website design
             • migrate former library website content to final chosen platform
             • complete library website migration

cms trials

libguides cms and drupal were the systems trialed for the library website.
drupal was used for more than two decades and consideration was given to upgrading to the latest version of the platform, drupal 8. springshare libguides cms was trialed as library staff and faculty were already familiar with libguides and subscribed to several other springshare applications, including libanswers and libchat for virtual research consultations, libcal for reserving library spaces, and libinsight for user analytics. trialing of libguides cms began in fall 2019. a website prototype (design, theme, homepage) was developed and designed using the platform. the platform was analyzed and explored heavily since it was a platform that had not been previously utilized by our library. libguides cms offers unlimited advanced customized groups. features like publishing workflow management, discussion boards, internal sites, various account types, password protections, ip restrictions, and courseware integration via lti were also researched during the test. in terms of content creation and maintenance, there were limitations under libguides cms. libguides cms has a built-in framework, ideal for libraries, with default settings that may disrupt or limit complex customization. at the time of the trial, additional support from springshare was needed to override default settings. also, libguides cms, compared to drupal, did not provide an option for tracking revisions on guides and content. drupal, a highly programmable free open-source website platform, was the previous platform the library website was hosted on. like all upgrades, drupal 8 offered a series of new features and improvements, from framework to themes. as a highly programmable platform, it requires mindful designing and programming to establish the infrastructure and design. drupal 8 was tested and trialed on a development site in february 2020.
the development site on drupal was utilized for the final evaluation of the cms. when the campus was ordered to partially close in march 2020, the home page and foundational design were completed; however, the development site was inaccessible remotely due to a block by the campus firewall. the development site resided on a protected local server on campus, and special permissions were required for remote access. unfortunately, i was not granted the special permissions required in due time, so the development site on drupal was put on hold. based on projections after creating a foundational design in february 2020, it would have taken about six months to complete the overall website design. remotely, i continued the drupal 8 evaluation and considered the results of librarian consultations and the literature on drupal as a cms.

consultations and cms comparisons

generally, the difference observed between both platforms is that libguides cms is a content management system primarily designed and maintained for library websites, whereas drupal is a framework for all sites, including highly customized websites. to support the cms comparisons, six web services and system librarians were consulted prior to the migration. the librarians were from distinct institutions: two community colleges and four 4-year universities ranging from 2,000 enrolled students to 30,000 enrolled students, and library departments from 10 to over 200 library personnel. a systems librarian from a university of about 2,000 enrolled undergraduate students, regarding their experience migrating their website from drupal to libguides cms, shared that, “[their migration] took a couple months . . .
we worked with campus it and springshare to ‘flip the switch’.” a digital services librarian from a university of over 10,000 enrolled students explained, “the entire transition probably took about 6–8 months.” the time to migrate a library website would depend on the size of the website, which was also influenced by the size of the campus. with more than 10,000 students, the csu stanislaus library website migration project was scheduled to be completed by the end of the summer semester, from june 2020 to august 2020. creating a new website from scratch on drupal proved to be a longer process than creating a new website on libguides cms. a systems librarian explained that “[libguides cms] is also quite streamlined compared to trying to maintain a more complex platform like drupal, which makes it a bit easier for librarians who are not full-time professional coders.” still, libguides cms is not as robust and did not offer the level of customized creation that drupal offered for general websites. the systems librarian added that although it is helpful to have a web services or systems librarian who can code full time, “turnover happens and some libraries can’t be sure there will always be someone on hand who is comfortable doing that coding.” ideally, having a full-time programmer is valuable for any library managing their own website; however, it is currently not the case for our university library. a user interface developer from a university of over 30,000 enrolled undergraduate students described their experience using a large amount of css to override default settings in libguides cms. they explained, “we have a large amount of overriding css, not to mention that it makes [it] messy. 
when building a site [in libguides] you can do whatever you want as long as you know where to put your code, utilize the js libraries springshare uses, implement css to override their system default items/styles and use their api.” before finalizing our migration, we were required to contact the springshare technical support team to override default settings in libguides cms. however, the default settings are implemented to guide nonprogramming librarians when creating web content. a web services librarian from a university with 2,000 undergraduate students enrolled stated, “i don't think [libguides cms] will be anything like a drupal or wordpress cms. but, i do believe that their software is the perfect niche for libraries and librarians.” libguides cms required getting used to as csu stanislaus library staff were accustomed to hosting the website on a cms that allowed file transfer protocol (ftp). as a systems librarian explained, it can be difficult to organize a large amount of coding in libguides cms since “you don’t get your own server that you can configure and use for things like ftp storage.” the experiences shared by librarians were similar throughout our process for creating the new site and design, and these consultations in particular were not only insightful but helped prepare for and organize the structure of the website before actually migrating the content. an additional component considered before choosing a cms was the technical support and server options available for each. libguides cms is cloud-based and hosted by springshare, which currently uses amazon web services (aws). upgrades are implemented by springshare overnight, as well as minute-by-minute base backups. for the most part, systems and web librarians were satisfied with springshare technical support to implement these changes.
with drupal, the institution can choose whether to host their site on a cloud or local server. during our migration in summer 2020, we sought assistance from springshare technical support to modify security certificates and custom domain names. if a site is hosted on drupal, the librarian can implement security certificates and update custom domain names without having to contact the drupal technical support team. it is fundamental for a library to consider these features as well, especially if under a set timeline. these consultations with developers and with systems and web librarians aided in the understanding of what libguides cms and drupal offered based on general comparisons, programming, customization, and technical support.

cms accessibility compliance

the accessibility levels of each cms also supported the final decision of the chosen cms. according to the web content accessibility guidelines (wcag), there are three levels of accessibility conformance: a (lowest), aa (midrange), and aaa (highest).7 currently, the target level of accessibility for csu campus websites is aa, which also includes all the guidelines found under conformance a. regardless of the foundational framework for both libguides cms and drupal, it was determined after exploring accessibility on these cms that developers should regularly test the design and content customization for accessibility. ultimately, the accessibility levels of a library website and its mobile responsiveness are dependent on the local management and development of the sites.

website design: usability study

concurrent to the consultations, trials, and design development, a usability study was conducted in february 2020 with a total of 38 university student participants, including undergraduate students, from freshmen to seniors, and graduate students.
the usability study was organized to test the website navigation design prototype that was built and used during the cms trials. students’ feedback would guide the decision of whether to design an audience-based navigation menu or a topic-based navigation menu. the study was conducted in a closed and monitored library room with laptops prepared. students were asked to answer questions and complete tasks to test the menu navigation design of the website prototype. each student’s actions were recorded through screen recordings and visual observation, and students were assigned numbers to ensure anonymity (e.g., student 1, student 2). the following seven tasks were used for the student usability study:

1. find research help on citing legal documents—a california statute—in apa style citation.
2. find the library hours during spring break.
3. find information on the library study spaces hours and location.
4. you’re a student at the stanislaus state stockton campus and you need to request a book from turlock to be sent to stockton. fill out the request-a-book form.
5. you are a graduate student and you need to submit your thesis online. fill out the thesis submission form.
6. for your history class, you need to find information on the university’s history in the university archives and special collections. find information on the university archives and special collections.
7. find any article on salmon migration in portland, or. you need to print it, email it to yourself, and you also need the article cited.

during each study, students’ actions were screen recorded using snagit, a screen capture and screen recording software installed on the laptops. data collected included the ease of access in terms of navigation behavior, the number of clicks, web pages visited, and the time it took for students to complete each of the seven tasks. that data was recorded and analyzed from the anonymously saved screen recorded videos.
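measures such as clicks and completion time per task lend themselves to a simple per-task summary. the sketch below is only an illustration of that kind of tabulation, assuming each observation is coded from the recordings as a (student number, task number, clicks, seconds) tuple; the record layout, the function name, and every value shown are hypothetical placeholders, not results from the study.

```python
from statistics import median


def summarize_tasks(records):
    """Return median clicks and median seconds for each task number."""
    by_task = {}
    for _student, task, clicks, seconds in records:
        entry = by_task.setdefault(task, {"clicks": [], "seconds": []})
        entry["clicks"].append(clicks)
        entry["seconds"].append(seconds)
    return {
        task: {
            "median_clicks": median(values["clicks"]),
            "median_seconds": median(values["seconds"]),
        }
        for task, values in sorted(by_task.items())
    }


# Placeholder records for illustration only: (student, task, clicks, seconds).
example_records = [
    (1, 1, 3, 40),
    (2, 1, 5, 60),
    (1, 2, 2, 30),
]

summary = summarize_tasks(example_records)
```

medians are used here because a few students who get lost in a menu can skew an average; that choice is a suggestion rather than the method reported in the article.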
students were also asked to answer questions at the end of the study in the form of a written survey, which was then collected and utilized to support the final decision of the outcome of the prototype design. the results of the usability study provided the library with a variety of outcomes and several elements were integrated to lead the redesign of the library website’s header and main menu. the results of the study showed that the prototype’s design navigation, an audience-based navigation, was not as user friendly as predicted; therefore, the library website prototype design would need to be edited and modified to revert to the current navigation design of the existing website, which is a topic-based navigation. students had difficulties with the audience-based navigation design since it required them to select an “audience type” under the menu (fig. 1). their selection was determined by assessing where they believed the information was found. since most students did not understand the structure of the website, they did not know how to utilize the audience-based navigation to complete the seven tasks. although they found that the navigation design of the website was clear and simple, it required “getting used to.” the results also highlighted the effects of the use of library terms. to make menu links exceptionally user friendly, clear and common terminology was added. an additional component was a search-all search box for the website, which was advocated by the student participants. based on navigation results, the main menu and submenus were also structured to not only be clear and organized, but for popular pages to be mapped and linked in more than one menu.

figure 1. screenshot of the audience-based navigation design developed for the library website prototype.

figure 2. screenshot of the topic-based navigation in the former library website.
the design structure of the website relied on the organization and management of the website pages. to maintain a congruent structure, it was necessary to choose a navigation design that met the needs of our students. in this case, the results determined that the topic-based navigation was preferred; thus, the management of website pages and submenus was modified to fit this navigation.
the usability study was focused on testing the navigation design of the website and the navigation main menu. given the helpful feedback from users and having more participants than expected, it would have been beneficial to also test other aspects of the website in this study. the home page is the landing point and statistically the most visited web page of the library’s website. it is the hotspot for students and our university’s community to find the catalog, resources to fulfill their research needs, upcoming news and events, the reservation platform, and more. however, as the website redesign progressed, there were challenges in designing the primary components of the website home page, such as assessing what elements were fundamental to have on the library website, following web accessibility requirements, and following the university website’s new theme and design. ultimately, the components of the former library website’s home page were migrated in a similar structure to the redesigned website. yet, adding questions and tasks on the usability of the library home page and its content components would have certainly aided not only the redesign and migration, but also the direction of the library website’s future development. this information will be a focal point for future usability studies of the library’s website.
the final migration
the greatest challenge throughout the project was the time constraint due to the covid-19 pandemic.
because the pandemic brought several unforeseen obstacles into staff work schedules, it was challenging to manage the time needed to complete the project and simultaneously work around tasks surfaced by the pandemic. however, staff committed to stay on schedule and to complete the migration project before the start of the fall semester despite the circumstances. after transitioning to remote library services, emphasis was placed on developing the website and web content. even more so, the migration project served to ensure that the library was providing an enhanced and accessible desktop and mobile website for users who were now working from home. additionally, this included the management of web services, on top of the migration project. a concise and organized schedule was necessary and although time management of the different projects and tasks offset by the pandemic was challenging, the web services librarian was fortunate to have support from the library information technology staff. after the cms trials and after the website prototype usability study, libguides cms was chosen as the content management system for the university library website. because the library was looking for an easy-to-use platform, utilizing libguides cms reduced the time needed to build an infrastructure and allowed simple website content management, maintenance, and an improvement over the former website’s accessibility and mobile responsiveness. the platform worked well for the campus and the library; however, each library should evaluate its respective department priorities, along with what is expected, desired, and needed for their individual library website to successfully showcase services and programs to users. following a modified waterfall model approach proved to be a success for the website migration project due to existing resources and scheduled timeline for implementation. 
in a future virtual renovation or redesign of the library website, the library will explore various project planning models pertinent to the future proposal’s desired outcomes.
endnotes
1 isabel vargas ochoa, “navigation design and library terminology,” information technology and libraries 39, no. 4 (2020).
2 “diversity and equity data portal,” california state university, stanislaus, 2021, https://www.csustan.edu/iea/diversity-and-equity-data-portal.
3 “quick facts,” california state university, stanislaus, 2021, https://www.csustan.edu/iea/institutional-data/quick-facts.
4 bob hughes and roger ireland, project management for it-related projects, 3rd edition (swindon, uk: bcs learning and development, 2019).
5 vargas ochoa, “navigation design and library terminology.”
6 vargas ochoa, “navigation design and library terminology.”
7 “web content accessibility guidelines (wcag) 2 level aaa conformance,” w3c web accessibility initiative (wai), 13 july 2020, https://www.w3.org/wai/wcag2aaa-conformance.
information technology and libraries | december 2007
enterprise digital asset management (dam) systems are beginning to be explored in higher education, but little information about their implementation issues is available. this article describes the university of michigan’s investigation of managing and retrieving rich media assets in an enterprise dam system.
it includes the background of the pilot project and descriptions of its infrastructure and metadata schema. two case studies are summarized: one in healthcare education, and one in teacher education and research. experiences with five significant issues are summarized: privacy, intellectual ownership, digital rights management, uncataloged materials backlog, and user interface and integration with other systems.
universities are producers and repositories of large amounts of intellectual assets. these assets are of various forms: in addition to text materials, such as journal papers, there are theses, performances from performing arts departments, recordings of native speakers of indigenous languages, or videos demonstrating surgical procedures, to name a few.1 such multimedia materials have not, in general, been available outside the originating academic department or unit, let alone systematically cataloged or indexed. valuable assets are “lost” by being locked away in individual drawers or hard disks.2
managing and retrieving multimedia assets are not problems confined to academia. media companies such as broadcast news agencies and movie studios also have faced this problem, leading to their adoption of digital asset management (dam) systems. in brief, dam systems are not only repositories of digital rich media content and the associated metadata, but also provide management functionalities similar to database management systems, including access control.3 a dam system can “ingest digital assets, store and index assets for easy searching, retrieve assets for use in many environments, and manage the rights associated with those assets.”4
in summer 2000, the university of michigan (u-m) tv station, umtv, was searching for a video archive solution. that fall, a u-m team visited cnn and experienced a “eureka!” moment.
as james hilton, then-associate provost for academic, information, and instructional technology affairs, later wrote, “building a digital asset management into the infrastructure . . . will be the digital equivalent of bringing indoor plumbing to the campus.”5 in spring 2001, an enterprise dam system was considered for inclusion in the university infrastructure. upon completion of a limited proof-of-concept project, a cross-campus team developed the request for proposals (rfp) for the dams living lab, which was issued in july 2002 and subsequently awarded to ibm and ancept. in august 2003, hardware and software installation began in the living lab.6 by 2006, the project changed its name to bluestream to appeal to the growing mainstream user base.7
six academic and two support units agreed to partner in the pilot:
■ school of education
■ school of dentistry
■ college of literature, science, and the arts
■ school of nursing
■ school of pharmacy
■ school of social work
■ information technology central services
■ university libraries
the academic units were asked to provide typical and unusual digital media assets to be included in the living lab pilot. the pilot focused on rich media, so the preferred types of assets were digital video, images, and other multimedia delivered over the web.
the living lab pilot was designed to address four key questions:
■ how to create a robust infrastructure to process, manage, store, and publish digital rich media assets and their associated metadata.
■ how to build an environment where assets are easily searched, shared, edited, and repurposed in the academic model.
■ how to streamline the workflow required to create new works with digital rich media assets.
■ how to provide a campuswide platform for future application of rights declaration techniques (or other ip tools) to existing assets.
this article describes the challenges encountered during the research-and-development phase of the u-m enterprise dam system project known as the living lab. the project has now ended, and the implemented project is known as bluestream.
enterprise digital asset management system pilot: lessons learned
yong-mi kim, judy ahronheim, kara suzuka, louis e. king, dan bruell, ron miller, and lynn johnson
yong-mi kim (kimym@umich.edu) is carat-rackham fellow 2004, school of information; judy ahronheim (jaheim@umich.edu) is metadata specialist, university libraries; kara suzuka (ksuzuka@umich.edu) is assistant research scientist, school of education; louis e. king (leking@umich.edu) is managing producer, digital media commons; dan bruell (danlbee@umich.edu) is director, school of dentistry; ron miller (ronalan@umich.edu) is multimedia services position lead, school of education; and lynn johnson (lynjohns@umich.edu) is associate professor, school of dentistry, university of michigan, ann arbor.
■ background of the living lab: u-m enterprise dam system project
an enterprise project such as the living lab at u-m can have significant impact on an institution’s teaching and learning activities by allowing all faculty and students easy yet secure access to media assets across the entire campus. such extensive impact can only be obtained by overcoming numerous and varied obstacles and by documenting actual implementation experiences employed to overcome those challenges. enterprise dam system vendors such as stellent, artesia, and canto list clients from many different industry sectors, including government and education, but provide no detailed case studies on their web sites.8 information regarding the status of enterprise dam system projects and specific issues that arose during implementation is difficult to find.
information publicly available for enterprise dam system projects in higher education is usually in the form of white papers or proposals that do not cover the actual implementations.9 given the high degree of interest and the number of pilot projects announced in recent years, this shortcoming has prompted the writing of this article, which presents the most important lessons learned during the first phase of the living lab pilot project with the hope that these experiences will be valuable to other academic institutions considering similar projects.
as part of its core mission, u-m strives to meet the teaching and learning needs of the entire campus. thus, the living lab pilot solicited participation from a diverse cross-section of the university’s departments and units with the goal of evaluating the use of varied teaching and learning assets for the system. from the beginning, it was expected that this system would handle assets in many different forms, such as digital video or digitized images, and also accommodate various organizational schemas and metadata for different collections. this sets the u-m enterprise dam system apart from projects that focus on only one type of collection or define a large monolithic metadata schema for all assets.
data were gathered through interviews with asset providers, focus groups with potential users, and a review of the relevant literature. a number of barriers were identified during the pilot’s first phase. while there were some technical barriers, the most significant barriers were cultural and organizational ones for which technical solutions were not clear. perhaps the most significant cultural divide was between the culture of academia and the culture of the commercial sector. cultural and organizational assumptions from commercial business practices were embedded in the design of the products initially used in the living lab implementation.
thus, an additional implementation challenge was determining which issues should be resolved through technical means, and which should be solved by changing the academic culture. this is expected to be an ongoing challenge.
■ architecture (building the infrastructure)
an enterprise dam system in an academic community such as u-m needs to support a wide variety of services in order to meet the numerous and varied teaching, research, service, and administrative functions. figure 1 illustrates the services that are provided by an enterprise dam system and concurrently demonstrates its complexity. the left column, process, lists a few of the media processes that various producers will use to prepare their media for subsequent ingestion into the enterprise dam system; the middle column, manage, demonstrates the various functions of the enterprise dam system; while the third column, publish, lists a subset of the publishing venues for the media.
figure 1. component services of the living lab (created by louis e. king, ©2004 regents of the university of michigan)
because an enterprise dam system supports a variety of rich media, a number of software tools and workflows are required. figure 2 illustrates this complexity and describes the architecture and workflow used to add a video segment. the organization of figure 2 parallels that of figure 1. the left column, process, indicates that flip factory by telestream is used to convert digital video from the original codec to one that can be used for playback.10 in addition, videologger by virage uses media analysis algorithms to extract key frames and time codes from the video as well as to convert speech to text for easy searching.11 the middle column, manage, illustrates tools from ibm that help create rich media as well as tools from stellent, such as its ancept media server (ams), that store and index the rich media assets.12 the third column, publish, illustrates two examples of how these digital video assets could be made available to the end user. one strategy is as a real video stream using real networks’ helix server, and the other as a quicktime video stream using ibm’s videocharger.13 a thorough discussion of all of the software and hardware that make up u-m’s dam system is beyond the scope of this article. however, a list of the software components with links to their associated web sites is provided in figure 3.
from the beginning the living lab pilot aimed for a diverse collection of assets to promote resource discovery and sharing across the university. figure 4 illustrates how the living lab is expected to fit into the varied publishing venues that comprise the campus teaching and learning infrastructure. existing storage and network infrastructures are used to deliver media assets to various software systems on campus. the living lab is used to streamline the cataloging, searching, and retrieving processes encountered during academic teaching and research activities.
the following example describes how the enterprise dam system fits into the future campus cyberinfrastructure. a faculty member in the school of music is a jazz composer. one of her compositions is digitally stored in the enterprise dam system along with the associated metadata (cataloging information) that will allow the piece to be found during a search.
that single audio file is then found, accessed, and used by five unique publishing venues: the course web site, the university web site, a radio broadcast, the music store, and the library archive. the faculty member uses the piece in her jazz interpretation course and thus includes a link to the composition on her sakai course web site.14 when she receives an award, the u-m issues a press release on the u-m web site that includes a link to an audio sample. concurrently, michigan radio uses the enterprise dam system to find the piece for a radio interview with her that includes an audio segment.15 her performance is published by block m records, u-m’s web-based recording label, and, lastly, the library permanently stores the valuable piece in its institutional archive, deep blue.16
figure 2. the living lab architecture (created by louis e. king, ©2004 regents of the university of michigan)
figure 3. software used in the living lab
north american systems ancept media server: www.nasi.com/ancept.php
ibm content manager: www-306.ibm.com/software/data/cm/cmgr/mp/
telestream flip factory: www.telestream.net/products/flipfactory.htm
virage videologger: www.virage.com/content/products/index.en.html
ibm video charger: www-306.ibm.com/software/data/videocharger/
real networks helix server: www.realnetworks.com/products/media_delivery.html
apple quicktime streaming server: www.apple.com/quicktime/streamingserver/
handmade software image alchemy: www.handmadesw.com/products/image_alchemy.htm
■ metadata (managing assets within the academic model)
the vision for enterprise dam at u-m is for digital assets to not only be stored in a secure repository, but also be findable, accessible, and usable by the appropriate persons in the university community in their academic endeavors. information about these assets, or metadata, is a crucial component of fulfilling this vision. an important question that arises is, “what kind of metadata should be required for the assets in the living lab?”
to help answer this question, potential asset providers were interviewed regarding their current approach to metadata, such as whether they used a particular schema and how well it met their purposes. not surprisingly, asset providers had widely varied metadata implementations. while the assets intended for the living lab pilot all had some metadata, the scope and granularity varied greatly. metadata storage and access methods also varied, ranging from databases implemented using commercial database products and providing web front-ends, to a combination of paper and spreadsheet records that had to be consulted together to locate a particular asset. the assets to be used in the living lab pilot consisted primarily of high- and low-resolution digital images and digitized video.
these interviews also generated a number of requirements for any potential living lab metadata schema. it was determined that the schema should be able to:
■ describe heterogeneous collections at an appropriate level of granularity and detail, allowing for domain-specific description needs and vocabularies;
■ allow metadata entry by non-specialists;
■ enable searches across multiple subject areas and collections;
■ provide provenance information for the assets; and
■ provide information on authorized uses of the assets for differing classes of users.
an examination of the literature showed a general consensus that no single metadata standard could meet the requirements of heterogeneous collections.17 projects as diverse as pb core and vius at penn state adopted the approach of drawing from multiple existing metadata standards.18 their approaches differ in that pb core is a combination of selected metadata elements from a number of standards plus additional elements unique to pb core, while vius opted for a merged superset of all the elements in the standards selected.
in interviews with asset providers (usually faculty), cataloging backlog and the lack of personnel for generating and entering metadata emerged as consistent problems. there was concern that an overly complex or specialized schema would aggravate the cataloging backlog by making metadata generation time-consuming and cumbersome. budgetary constraints made hiring professional metadata creators prohibitive. another aspect of the personnel problem was that adequate description required subject specialists who were, ideally, the resource authors or creators. but subject specialists, while familiar with the resources and the potential audience for them, may not be knowledgeable about how to produce high-quality metadata, such as controlled vocabularies or consistent naming formats.
to address these issues, the simpler and more straightforward indexing process offered by dublin core (dc) was selected as the starting point for the metadata schema in the living lab.19 dc was originally developed to support resource discovery of a digital object, with resource authors as metadata creators. dc is a relatively small standard, but is extensible through the use of qualifiers. it has been adopted as a standard by a number of standards organizations, such as iso and ansi.
a body of research exists on its use in digital libraries and its efficacy for author-generated metadata, and there are metadata crosswalks between dc and most other metadata standards. a number of other subject-specific standards were also examined for more specialized description needs and controlled vocabularies: vra core, ims learning resource meta-data specification, and snodent.20
in the end, the project leaders elected to adopt a rather novel approach to metadata by not defining one metadata schema for all assets. by taking advantage of the power of multiple approaches (for example, pb core for mix-and-match, and vius for a merged superset) each collection can have its own schema as long as it contains the elements of a more general, lowest-common-denominator schema. this overall schema, um_core, was defined based on dc. the elements are prefixed with dc or um to specify the schema origin. um_publisher and um_alternatepublisher identify who should be contacted about problems or questions regarding that particular asset. um_secondarysubject is a cross-collection subject classification schema developed by the u-m libraries, and helps map the asset into the context of the university.
figure 4. the enterprise dam system as the future campus infrastructure for academic venues (created by louis e. king, ©2004 regents of the university of michigan)
in adopting such an approach to metadata, metadata creation is seen not as a one-shot process, but a collaborative and iterative one. for example, on initial ingestion into the living lab, the only metadata entered for an image may be dc_title, dc_date, and um_publisher. additional metadata may be entered as users discover and use the asset, or as input from a subject specialist becomes available. the discussion so far has focused on metadata produced with human intervention.
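the containment rule described above, in which each collection keeps its own schema provided it includes every um_core element, can be sketched as a simple set comparison. the python below is an illustration under stated assumptions, not the living lab implementation: the element names are taken from the article's um_core figure, while the function names and the example collection element dent_procedure are hypothetical.

```python
# um_core element names as listed in the article's um_core figure.
UM_CORE = frozenset({
    "dc_title", "dc_creator", "dc_subject", "um_secondarysubject",
    "dc_description", "dc_publisher", "dc_contributor", "dc_date",
    "dc_type", "dc_format", "dc_identifier", "dc_source", "dc_language",
    "dc_relation", "dc_coverage", "dc_rights", "um_publisher",
    "um_alternatepublisher",
})


def missing_core_elements(collection_schema):
    """Return the um_core elements a collection schema fails to include."""
    return sorted(UM_CORE - set(collection_schema))


def is_valid_collection_schema(collection_schema):
    """A collection schema is acceptable if it is a superset of um_core."""
    return not missing_core_elements(collection_schema)


def schema_origin(element):
    """Read the schema origin from the element prefix ("dc" or "um")."""
    prefix, _, _ = element.partition("_")
    return prefix


# Hypothetical collection schema: um_core plus one invented local element.
dental_schema = UM_CORE | {"dent_procedure"}
```

the check applies to the schema, not to individual records: as the text notes, a newly ingested asset may carry only a few populated elements and be filled in over time.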
a number of metadata elements can be obtained from the digital objects through the use of software. in an enterprise dam system, this is referred to as automatically generated metadata and is what can be directly obtained from a computer file, such as file name, file size, and file format. this type of metadata is expected to play a larger role as an increasing proportion of assets will be born digital and come accompanied by a rich set of embedded metadata. for example, images or video produced by current digital cameras contain exchangeable image file format (exif) metadata, which include such information as image size, date produced, and camera model used. when available, the living lab presents automatically generated metadata to the user in addition to the elements in um_core.
thus, asset metadata in the living lab can be produced in two ways: automatically generated through a tool such as virage videologger in the case of video, or entered by hand through the current dam system interface.21 in addition, if metadata already exist in a database format, such as filemaker, they can be imported once the appropriate mappings are defined.22
videologger, a video analysis tool for digital video files, can extract video key frames, add closed captions, determine whether the audio is speech or music, convert speech to text, and identify (through facial recognition) the speaker(s). these capabilities allow for more sophisticated searching of video assets compared to the current capabilities of search engines such as google. some degree of content-based searching can now be done, as opposed to searching that relies on the title and other textual description provided separately from the video itself. for the pilot, particular interest was expressed in the speech recognition capability of videologger. videologger generates a time-coded text of spoken keywords with 50 to 90 percent accuracy.
the result is not nearly accurate enough to generate a transcript, but does indeed provide robust data for searching the content of video. given the diversity of assets in the living lab, it is clear that the university can utilize low-cost keyword analysis to enhance search granularity as well as the more expensive, fully accurate hand-processed transcript.
■ workflow examples
two instructional challenges demonstrate how an enterprise digital asset management system can provide a solution to instructional dilemmas and how a unique workflow needs to be created for each situation. the challenges related to each project are described.
school of dentistry
the educational dilemma
the u-m school of dentistry uses standardized patient instructors (spis) to assess students’ abilities to interact with patients. carefully trained actors play carefully scripted patient roles. dental students interview the patients, read their records, and make decisions about the patients’ care, all in a few minutes (see figure 6). each session is video recorded. currently, spis grade each student on predetermined criteria, and the video recording is only used if a student contests the spis’ grade. ideally, a dental educator should review each recording and also grade each student. however, the u-m class size of 105 dental students causes a recording-based grading process to be prohibitively expensive in terms of personnel time. in addition, the use of digital videotape makes it difficult for the recorded sessions to be made available to the students. because the tapes are part of the student’s record, they cannot be checked out. if a student wants to review a tape, she or he must make an appointment and review it in a supervised setting.
living lab solution
the u-m school of dentistry’s living lab pilot attempted simultaneously to improve the spi program and lower the cost of faculty grading spi sessions through three goals:
1. use speech-to-text analysis to create an easily searched transcript;
2. streamline the recording process; and
3. make the videos available online for student review.
each of these challenges and the current results are summarized.
figure 5. the u-m enterprise dam system metadata scheme um_core: dc_title, dc_creator, dc_subject, um_secondarysubject, dc_description, dc_publisher, dc_contributor, dc_date, dc_type, dc_format, dc_identifier, dc_source, dc_language, dc_relation, dc_coverage, dc_rights, um_publisher, um_alternatepublisher
speech-to-text analysis
it was hypothesized that an effective speech-to-text analysis of the spi session could enable a grader quickly to locate video segments that: (1) represented student discussion of specific dental procedures; and (2) contained student verbalizations of key clinical communication skills.23 in summer 2005, nine spi sessions were recorded and a comparison between manual transcription and the automated speech-to-text processes was conducted. the transcribed audio track was manually marked up with time-coded reference points and inserted as an annotation track to the video. those same videos also were analyzed through the videologger speech-to-text service in the living lab, resulting in an automatically generated, time-coded text track. lastly, six keywords were selected that, if spoken by the student, indicated the correct use of either a dental procedure or good communication skills. keyword searches were conducted on both the manual transcription and the speech-to-text analysis. three results were calculated on the keyword searches of both versions of all nine recorded sessions.
they were: (1) the number of successful keyword searches; (2) the number of successful search results that did not actually contain the keywords (false positives); and (3) the time required to complete the manual transcription and speech-to-text analysis of the recordings. the results demonstrated that the speech-to-text analysis matched the manual transcription 20 to 60 percent of the time. also, the speech-to-text process resulted in a false positive less than 10 percent of the time. lastly, the time required to complete the speech-to-text analysis of a session was two minutes, while the average time required to complete a manual transcription of the same session was 180 minutes. while not perfect, the results are encouraging and suggest that manually transcribing the audio is no longer necessary. improvements are being made to the clinical environment and microphones so that a higher-quality recording is obtained. it is anticipated that those changes combined with improved software will improve the results of the speech-to-text analysis sufficiently so that automated keyword searches can be conducted for grading purposes.
streamlining the recording process
scale is a significant challenge to capturing 105 spi interactions in a short amount of time. two to three weeks are required for the entire class of 105 students to complete a series of spi experiences, with as many as four concurrent sessions at any given time. in summer 2006, it was decided to record 50 percent of one class. logistically, one camera operator could staff two stations simultaneously. the stations had to be physically close enough for a one-person operation, but not so close that audio from the adjacent session was recorded. the optimal distance was about thirty to thirty-five feet of separation. staggering the start times of each session allowed the camera operator to make sure each was started with optimal settings.
since the results of the speech-to-text analysis were linked to the quality of the equipment used, two prosumer minidv cameras with professional quality microphones and tripods also were purchased.
student availability
an important strength of living lab is the ability to make the assets both protected and accessible. the current iteration does not have an interface for user-created access control lists (acls); instead, they need to be created by a systems administrator. once a systems administrator has created an acl, academic technology support staff can add or subtract people. to satisfy family educational rights and privacy act regulations, a separate acl is needed for each student for the spi project.24 currently, the possibility of including the spi recordings and their associated transcriptions as elements of an eportfolio is being explored.25 in the meantime, students can use url references to include these videos and transcripts in such web-based tools as eportfolios and course management systems.
discussion
as the challenges of improving speech-to-text analysis, recording workflow, and user-created acls are overcome, the spi program will be able to operate at a new and previously unimagined level. a more objective keyword grading process can be instituted. students will be easily able to search through and review their sessions at times and locations that are convenient for them. living lab also will allow students to view their eportfolio of spi interactions and witness how they have improved their communication skills with patients. for the first time in healthcare education, a clinician’s communication skills, such as bedside or chairside manner, will be able to be taught and assessed using objective methods.
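the keyword-based review anticipated in this discussion can be pictured as a lookup over a time-coded text track: given the words recognized in a recording and their offsets, return the moments at which each keyword was spoken so a grader or student can jump to them. the python below is a minimal sketch under stated assumptions; it does not reproduce videologger's output format, the track structure and function name are hypothetical, the sample keywords are invented (the article does not list the six keywords used), and only single-word keywords are handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TimedWord:
    """One recognized word and its offset in the recording (assumed layout)."""
    seconds: float
    text: str


def find_keyword_times(track, keywords):
    """Map each keyword to the offsets (in seconds) where it was spoken.

    Matching is case-insensitive and ignores surrounding punctuation.
    """
    wanted = {k.lower() for k in keywords}
    hits = {k: [] for k in sorted(wanted)}
    for word in track:
        token = word.text.lower().strip(".,;:!?\"'")
        if token in wanted:
            hits[token].append(word.seconds)
    return hits


# Invented sample data for illustration only.
sample_track = [
    TimedWord(1.0, "Do"),
    TimedWord(2.5, "you"),
    TimedWord(3.0, "floss?"),
    TimedWord(9.0, "Floss"),
]
sample_hits = find_keyword_times(sample_track, ["floss", "pain"])
```

because the recognition accuracy reported above is well short of a full transcript, offsets returned this way are best treated as pointers into the video for human review, not as a grade in themselves.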
school of education

the challenge of using records of practice for research and professional education

classroom documentation plays a significant role in educational research and in the professional education of teachers at the u-m school of education. collections of videos capturing classroom lessons, small-group work, and interviews with students and teachers (as well as other classroom records, such as images of student work, teacher lesson plans, and assessment documents) are basic to much of the research that takes place in the school of education. however, there also is a large and increasing demand to use these records from real classrooms for educational purposes at the u-m and beyond, creating rich media materials for helping preservice and practicing teachers learn to see, understand, and engage in important practices of teaching. this desire to create widely distributed educational materials from classroom documentation raises two important challenges: first, protecting the identity of children (and, in some cases, teachers); and second, ensuring that the classroom records can be easily accessed by individuals who have permission to view and use the records while being inaccessible to those without permission.

one research and materials development project at the u-m school of education has been exploring the use of living lab to support the critical work of processing classroom records for use in research and in educational materials, and the distribution and protection of classroom records as they are integrated into teacher education lessons and professional development sessions at the u-m and other sites in the united states. the findings and challenges of these efforts are summarized below.
processing classroom records

the classroom records used in the pilot were processed in three main ways, producing three different types of products:

■ preservation copies are high-quality formats of the classroom records with minimal loss of digital information that can be read by modern computers with standard software. these files are given standardized filenames, cleaned of artifacts and minor irregularities, and de-identified (that is, digitally altered to remove any information that could reveal the identity of the students and, in some cases, of the teachers).
■ working copies are lower-quality versions of the preservation copies that are still sufficient for printing or displaying and viewing. trading some degree of quality for smaller file sizes and thus data rates, the working copies are easier for people to use and share. additionally, these files are further developed to enhance usability: videos are clipped and composited to feature particular episodes; videos also are subtitled, flagged with chapter markers (or other types of coding), and embedded with links for accessing other relevant information; images of student and teacher work are organized into multipage pdfs with bookmarks, links, and other navigational aids; and all files are embedded with metadata for aiding their discovery and revealing information about the files and their contents.
■ distribution copies are typically similar in quality to the working copies but are often integrated into other documents or with other content; they are labeled with copyright information and statements about the limitations of use. they are, in many cases, edited for use on a variety of platforms and copy protected in small ways (for example, word and powerpoint files are converted to pdfs).

the living lab was found to support this processing of classroom records in two important ways.
first, the system allowed for the setup and use of workflows that enabled undergraduate students hired by the project to upload processed files into the system and walk through a series of quality checks, each focused on a different aspect of the products. so, for example, when checking the preservation copies, one person was assigned to check the preservation copy against the actual artifact to make sure everything was captured adequately and that the resulting digital file was named properly (“quality check 1”). another individual was assigned to make sure the content was cleaned up properly and that no identifying information appeared anywhere (“quality check 2”). and finally, a third person checked the file against the metadata to make sure that all basic information about the file was correct (“quality check 3”). files that passed through all checks were organized into collections accessible to project members and others (“organize”). files that failed along the way were sent back to the beginning of the workflow (the “drawing board”), fixed, and checked again (see figure 7).

figure 6. a dental student interviewing an spi.

second, living lab allowed asset and collection development to be carried out collaboratively and iteratively, enabling different individuals to add value in different ways over time. undergraduate students did much of the initial processing and checking of the assets; skilled staff members converted subtitles into speech metadata housed within living lab; and, eventually, project faculty and graduate students will add other types of analytic codes and content-specific metadata to the assets.
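the quality-check workflow described above behaves like a small state machine: a file advances through three checks, and any failure returns it to the drawing board to be fixed and checked again. the following python sketch shows that control flow only. it is not the living lab workflow engine; the function names, the dictionary used to represent an asset, and the sample checks are hypothetical.

```python
from typing import Callable, Dict, List

# Stage names taken from the workflow described in the text.
CHECKS = ["quality check 1", "quality check 2", "quality check 3"]


def run_workflow(
    asset: dict,
    checks: Dict[str, Callable[[dict], bool]],
    fix: Callable[[dict], None],
    max_rounds: int = 5,
) -> List[str]:
    """Return the list of stages an asset visits until it is organized.

    Assumption: a failed check sends the asset back to the drawing board,
    where `fix` is applied before all checks are run again from the start.
    """
    history = ["drawing board"]
    for _ in range(max_rounds):
        for name in CHECKS:
            history.append(name)
            if not checks[name](asset):
                fix(asset)
                history.append("drawing board")
                break
        else:
            # Every check passed without a break: file the asset in a collection.
            history.append("organize")
            return history
    raise RuntimeError("asset did not pass all checks within max_rounds")


# Hypothetical checks, one per reviewer role described in the text.
checks = {
    "quality check 1": lambda a: " " not in a["filename"],  # properly named
    "quality check 2": lambda a: a["deidentified"],         # no identifying information
    "quality check 3": lambda a: a["metadata_ok"],          # metadata matches the file
}


def fix(asset: dict) -> None:
    # Hypothetical repair step: standardize the filename.
    asset["filename"] = asset["filename"].replace(" ", "_")


asset = {"filename": "lesson 01.mov", "deidentified": True, "metadata_ok": True}
history = run_workflow(asset, checks, fix)
```

in this example the file fails the first check because of its filename, returns to the drawing board, and then passes all three checks on the second round before being organized.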
distribution and protection of classroom records

in addition to supporting the production of various types of assets and collections, the living lab supported the distribution and protection of classroom records for use in education settings both at u-m and other institutions. for example, almost fifteen hours of classroom videos from a third-grade mathematics class were made accessible to and were used by instructors and students in the college of education at michigan state university. in a different context, approximately ten minutes of classroom video was made available to instructors in mathematics departments at brigham young university, the university of georgia, and the city college of new york to use in courses for elementary teachers.

each asset (and its derivatives) housed within living lab has a url that can be embedded within web pages and online course-management systems, allowing for a great deal of flexibility in how and where the assets are presented and used. at the same time, each call to the server is checked and, when required, users are prompted to authenticate by logging in before any assets are delivered. this has great potential for easily, seamlessly, and safely integrating living lab assets into a variety of web spaces.

although this feature has indeed allowed for a great deal of flexibility, there were and continue to be challenges with creating an integrated and seamless experience for school of education students and their instructors. for example, depending on a variety of factors, such as user operating systems and web browser combinations, users might be prompted for multiple logins. additionally, the login for the living lab server can be quite unforgiving, locking out users who fail to log in properly in the first few tries and providing limited communication about what has occurred and what needs to be done to correct the situation.
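the check applied to each asset url can be sketched as a three-way decision: deliver the asset, prompt for login, or refuse. the python below is a simplified illustration under stated assumptions, not the living lab implementation; the `Asset` and `Decision` types, the `check_request` function, the example url, and the user names are all hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional, Set


class Decision(Enum):
    DELIVER = "deliver"
    LOGIN_REQUIRED = "login required"
    DENIED = "denied"


@dataclass
class Asset:
    url: str
    public: bool = False
    # Assumption: an acl is modeled as a plain set of user names allowed to view.
    acl: Set[str] = field(default_factory=set)


def check_request(asset: Asset, user: Optional[str]) -> Decision:
    """Decide what to do with one request for an asset url.

    `user` is None when the requester has not yet authenticated.
    """
    if asset.public:
        return Decision.DELIVER
    if user is None:
        # Protected asset and no session: prompt for login before delivering anything.
        return Decision.LOGIN_REQUIRED
    return Decision.DELIVER if user in asset.acl else Decision.DENIED


# Hypothetical classroom video restricted to two instructors.
video = Asset(
    url="https://dam.example.edu/assets/third-grade-math-01",
    acl={"instructor_a", "instructor_b"},
)

anonymous = check_request(video, None)
allowed = check_request(video, "instructor_a")
outsider = check_request(video, "student_x")
```

because the decision is made per request rather than per page, the same url can be embedded in a course site or web page without exposing the asset to viewers who are not on its acl.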
discussion

during the living lab pilot, a number of workflow challenges were overcome that now allow numerous and varied types of media related to classroom records to be ingested into living lab, and derivatives created. this demonstrates that living lab is ready for complex media challenges associated with instruction. however, the next challenge, delivering assets easily and smoothly to others, still remains. once authentication and authorization are conducted using single sign-on techniques that allow users to access assets securely from living lab through other systems, assets will be able to be incorporated into web-based materials and used to enhance the instruction of teachers in ways that have yet to be conceived.

■ privacy, intellectual property, and copyright

during the course of the pilot, a number of issues emerged. among these were some of the most critical issues that institutions considering a similar asset management system need to address. these issues are:

■ privacy;
■ intellectual ownership and author control of materials;
■ digital rights management and copyright;
■ uncataloged materials backlog; and
■ user interface and integration with other campus systems.

up to this point, enterprise dam systems had been developed and used primarily by commercial enterprises (for example, cnn and other broadcasting companies). using a product developed by and for the commercial sector brought to the fore the cultural differences between the academy and the commercial sector (see figure 8). the first three issues in the previous list are related to the differing cultures of commercial enterprise and academia. these issues are addressed below. the fourth and fifth issues are addressed in the section “other important issues.”

privacy

videos of medical procedures can be of tremendous value to students.
in their own words, “watching is different from reading about it in a textbook.” but subjects have the right to retract their consent regarding the use of their images or treatment information for educational purposes. this creates a dilemma: if other assets have been created using the withdrawn material, do all of them have to be withdrawn? for example, if a professor included an image from the university’s dam system in a classroom powerpoint or keynote presentation, and subsequently included the presentation in the university’s dam system, what is the status of this file if the patient withdraws consent for use of her or his treatment information?26 when must the patient’s request be fulfilled? can it be done at the end of the semester, or does it need to be completed immediately? if the request must be fulfilled immediately, the faculty member may not have sufficient time to find a comparable replacement. waiting until the end of the semester helps balance patient privacy with teaching needs. in either case, files must be withdrawn from the enterprise dam system and links to those files removed. consent status and asset relationships must be part of the metadata for an asset to handle such situations. consideration must be given to associating a digital copy of all consent forms with the corresponding asset within an enterprise dam system.

figure 7. living lab workflow: drawing board → quality check 1 → quality check 2 → quality check 3 → organize

intellectual ownership and author control of materials

authors’ rights, as recognized by the berne convention for the protection of literary and artistic works, have two components.27 one, the economic right in the work, is what is usually recognized by copyright law in the united states, being a property right that the author of the work can transfer to others through a contract.
the other component, the moral rights of the author, is not explicitly acknowledged by copyright law in the united states and thus may escape consideration regarding ownership and use of intellectual property. moral rights include the right to the integrity of the work, and thus come into play in situations where a work is distorted or misrepresented. unlike economic rights, moral rights cannot be transferred and remain with the author. in a university setting, the university may own the economic right for a researcher’s work, in the form of copyright, but the researcher retains moral rights.

the following incident illustrates what can happen when only property rights are taken into account. a digital video segment of a medical procedure was being shown as part of a living lab demo at a university it showcase. because the u-m held the copyright for that particular videotape, no problems were foreseen regarding its usage. a faculty member recognized the video as one she had created several years earlier and expressed great concern that it had been used for such a purpose without her knowledge or consent. the concern arose from the fact that the video showed an outdated procedure. while the faculty member continued to use this video in the classroom, she felt this was different from having it available through the living lab. in the classroom, the faculty member alerted students to the outdated practices during the viewing, and she had full control over who viewed it. the faculty member felt she lost both this control and the opportunity to provide that clarification when the video became available through living lab. that is, her work was now misrepresented and her moral rights as an author were violated.
digital rights management and copyright

in the academic world, digital rights management (drm) is becoming a necessary component in disseminating intellectual products of all forms.28 however, at this time there are few standards and no technical drm solution that works for all media on all platforms. therefore, u-m has elected to use social rather than technical means of managing digital rights. the living lab metadata schema provides an element for rights statements, dc_rights. these metadata, combined with education of the university community about copyright and fair use and with the highly granular access control and privileges management of the system, provide the community with the knowledge and tools to use the assets ethically.

the university can establish rights declarations to use in the dc_rights field as standards are developed and precedent is established in the courts. these declarations may include copyright licenses developed by the university legal counsel as well as those from the creative commons.29

figure 8. differences between commercial and university uses of a dam system.
commercial dam system model | university dam system model
assets held centrally | federated ownership of assets
access, roles, and privileges managed centrally | distributed management of access, privileges, and roles
metadata frameworks: monolithic | federated metadata schema; agnostic user interface(s) re: privileges, ownership

current solution: access control lists

a clear difference between the cultures of commercial enterprises and academia emerged regarding access to assets, administered through acls.30 an acl specifies who is allowed to access an asset and how they can use it.
in commercial settings, access to assets is centrally managed, while in academia, with its complex set of intellectual and copyright issues, it is preferable to have access managed by the asset holders. university users repeatedly asked for the ability to define acls for each asset in the living lab. currently, end users and support staff cannot define acls; only system administrators can create them. the middleware for user-defined acls has been fully developed, and the user interface for user-defined acls will be made available in the next version.

this capability is important in the academic environment because the composition of the group(s) of people requiring access to a particular asset is fluid and can span many organizational boundaries, both within and outside the university. a research group owning a collection of assets may want to restrict access for various reasons, including requirements set forth by an institutional review board (irb, a university group that oversees research projects involving human subjects), or regulations such as the health insurance portability and accountability act of 1996, which addresses patient health information privacy.32 the research group will want flexible access control, as research group members may collaborate with others inside and outside the university. the original irb approval may specify that confidentiality of the subjects must be maintained, and that collected data, such as video or transcripts, can be viewed only by those directly involved in the research project and cannot be browsed by other researchers not involved in the study or by the public at large. in another situation, a collection of art images may be viewable only by current students of the institution, thus requiring a different acl.

this situation is still open to interpretation. some say patient consent regarding the use of information for instructional purposes cannot be withdrawn for the use of existing information at the home institution.
they can only withdraw it for the use of future assets. others may feel that patients can withdraw permission for the use of their patient assets.

other important issues

uncataloged materials backlog

what emerged from interviews and focus groups with content providers was that while there was no lack of assets they would like to see online, a large proportion of these assets had never been cataloged or even systematically labeled in some form. this finding may be attributed in part to the pilot focusing on existing assets that had not previously been available for widespread sharing (such as the files stored on faculty hard disks and departmental servers) and were known only to a favored few. owners or creators of these materials had not consciously thought about sharing these materials or making them available to others. librarians, in contrast, have developed systems and practices to ensure the findability of materials that enter the library.

asset owners were more than willing to have the assets placed online, but did not have the time or resources to provide the appropriate metadata. hiring personnel to create the metadata is problematic, as there is a limit to the metadata that can be entered by non-experts, and experts often are scarce and expensive. for example, for a collection of oral pathology images of microscopic slides, a subject expert must provide the diagnoses, stain, magnification, and other information for each image. without these details, merely putting the slides online is of little value, but these metadata cannot be provided by laypeople. collaborative metadata creation, allowing multiple metadata authors and iterations, may be one solution to this problem.

a number of studies indicate that both organizational support and user-friendly metadata creation tools are necessary for resource authors to create high-quality metadata.33 some of the backlog may be resolved through development of tools aimed at resource authors.
in addition, increased use of digital file formats with embedded metadata may contribute to reducing future backlog by requiring less human involvement in metadata creation. faculty need to be taught that metadata raises the value and utility of assets. as they come to understand the essential role metadata plays, they, too, will invest in its creation.

user interface and integration with other systems

an enterprise dam system has two basic types of uses: by producers and by users. producers tend to be digital media technologists who create the digital assets and ingest them into the enterprise dam system. the users are the faculty, students, and staff who use these digital assets in their teaching, learning, or research.

the research and development version of the enterprise dam system, living lab, works well for digital asset producers, but not for the users of these digital assets. ingestion and accessing processes are quite complex and are not currently integrated with other campus systems, such as the online library catalog or the sakai-based, campuswide course management system, ctools.34 digital producers who are comfortable with complex systems are able to ingest and access rich media. however, users have to log onto the enterprise dam system and navigate its complex user interface. the level of complexity of accessing the media can create a barrier to adoption and use. if the level of complexity for accessing the assets is too high for users, then the system also is too complex to expect users to contribute to the ingestion of digital assets.

in both student and faculty focus groups there was concern about the technical skills needed for faculty use of an enterprise dam system in the classroom.
ideally, faculty should be able to incorporate assets seamlessly from the enterprise dam system into their classroom materials, such as powerpoint or keynote presentations. then, the presentations created on their computers should display without glitches on the classroom system. obviously faculty members cannot be expected to troubleshoot in the classroom when display problems occur. if the enterprise dam system is perceived as difficult to use, or as requiring a lot of troubleshooting by the user, this will discourage adoption by the faculty. this creates additional demands on the enterprise dam system, and potential additional it staffing demands for the academic units wanting to promote enterprise dam system use. when a problem is experienced in the classroom, the departmental it support, not the enterprise dam system support team, will be the first to be called.

ideally, an enterprise dam system should be linked to the campus it infrastructure such that users or consumers do not interact with the dam system itself, but rather through existing academic tools, such as the library gateway, course management system, or departmental web sites. having to learn a new system could be a significant barrier to use for many potential dam system users in academia.

■ conclusions and lessons learned

the vision of a dam system that would allow faculty and students easy yet secure access to myriad rich media assets is extremely appealing to members of the academy. conducting the pilot projects revealed numerous technical and cultural problems to resolve prior to achieving this vision. the authors anticipate that other institutions will need to address these same issues before undertaking their own enterprise dam system.

using commercial software developed in academia

during the course of the living lab pilot, the differences between academia and the commercial sector proved to be a significant issue.
assumptions about the organizational culture and work methods are built into systems, often in a tacit manner. in the case of the initial iteration of the living lab, these assumptions were those of the corporate world, the primary clients of the commercial providers as well as the environment of the developers. u-m project participants, meanwhile, brought their own expectations based on the reality of their work environment in academia. universities do not have a strict hierarchical structure; each academic unit and department has a great degree of local control. academia also has a culture of sharing, where teaching or research products are often shared with no payment involved, other than acknowledgment of the source. thus, there was a process of mutual education and negotiation regarding what was and was not acceptable in the enterprise dam system implementation.

this difference of cultures first manifested itself with acls. in the initial implementation, an acl could be defined only by a system administrator. this was a showstopper for the u-m participants, who had expected that asset providers themselves would be able to define and modify the acl for any particular asset. a centralized solution with a single owner of the assets (the company), which is acceptable in the corporate environment, is not acceptable in a university environment, where each user is both consumer and owner. defining who has access to an asset can be a complex problem in academia, since this access is a moving target subject to both departmental and institutional constraints.

libraries and librarians

the traditional role of libraries is one of preserving and making accessible the intellectual property of all of humanity. with each new advance in information technology, such as dam systems, the role of libraries and librarians continues to evolve. this pilot highlighted the role and value of librarians skilled in metadata development and assignment.
without their expertise and early involvement, there would have been no standard method of indexing assets, thus preventing users from finding useful media. also, the project reinforced two reasons for encouraging asset creators to assign metadata at the asset creation point instead of at the archival point. one, this ensures that metadata are assigned when the content expertise is available. it is very difficult for producers to assign metadata retrospectively, and the indexing information may no longer be available at the point of archive. two, metadata assignment at the point of asset creation helps to ensure consistent metadata assignment that lends itself to automated solutions at the time of archiving.35 thus, while their role in digital asset management systems continues to evolve, the authors predict that the librarians’ role will revolve around metadata, and that libraries will start to become the archive for digital materials. it is anticipated that librarians will work with technical experts to develop workflows that include automated metadata assignment to help faculty routinely add existing and new collections of assets to the system. one example of such a role is deep blue at the university of michigan. deep blue is a digital framework for preserving and finding the best scholarly and artistic work produced at the university.

production productivity

new technical complexities emerge with each new asset collection added to the u-m system. new workflows as well as richer software features continue to be developed to meet newly identified integration and user interface needs. as the living lab experience advances, technical barriers are eliminated and new workflows automated.
the authors anticipate that, eventually, automated workflows will allow faculty and staff to routinely use digital assets with a minimum of technical expertise, thus decreasing the personnel costs associated with the use of rich media. for the foreseeable future, however, technically knowledgeable staff will be required to develop these workflows and even complete a significant amount of the work.

academic practice

the more delicate and challenging issue is educating faculty on the value and power of digital assets to improve their research and teaching. dam is a new concept to faculty, and it will only become useful when integrated into their daily teaching and research. this will happen as faculty members become more knowledgeable and increase their comfort in the use of digital assets. the dental case study demonstrates that an improved student experience can be provided with such an asset management system, while the education case study demonstrates that a complex set of authentic classroom materials can be organized and ingested for use by others. these case studies are only two examples of the unanticipated outcomes that result from the use of digital assets in education. the authors predict that as more unanticipated and innovative uses of digital assets are discovered, these new uses will, in turn, lead to increased academic productivity: for example, teaching more without increasing the number of faculty, students teaching each other with rich media, small-group work, and project-based learning. the list of possibilities is endless.

as the living lab evolved from a research and development project into the implementation project known as bluestream, it has become an actual classroom resource. this article described myriad issues that were addressed so that other institutions can embark on their own enterprise dam systems fully informed about the road ahead. the remaining technical issues can and will be resolved over time.
the greatest challenges that remain are being discovered as faculty and students use bluestream to improve teaching, learning, and research activities. the success of bluestream specifically, and enterprise dam systems in general, will be determined by their successes and failures in meeting the needs of faculty and students.

■ acknowledgements

the authors recognize that the living lab pilot program was conducted with the support of others. we thank ruxandra-ana iacob for her administrative contributions to the project. we thank both ruxandra-ana iacob and sharon grayden for their assistance with writing this article. thanks to karen dickinson for her encouragement, optimism, and constant support throughout the project. we thank mark fitzgerald for his vision regarding the potential of the school of dentistry spi project and for conducting the original research.

the living lab pilot was conducted with support from the university of michigan office of the provost through the carat partnership program, which provided funding for the pilot, and the carat-rackham fellowship program, which funded the metadata work.

references

1. a. doyle and l. dawson, “current practices in digital asset management,” internet2/cni performance archive & retrieval working group, 2003, http://docs.internet2.edu/doclib/draft-internet2-humanities-digital-asset-management-practices-200310.html (accessed feb. 17, 2007).
2. d. z. spicer, p. b. deblois, and the educause current issues committee, “fifth annual educause survey identifies current it issues,” educause quarterly 27, no. 2 (2004): 8–22.
3. humanities advanced technology and information institute (hatii), university of glasgow, and the national initiative for a networked cultural heritage (ninch), “the ninch guide to good practice in the digital representation and management of cultural heritage materials,” 2003, www.nyu.edu/its/humanities/ninchguide (accessed july 10, 2005).
4. a.
mccord, “overview of digital asset management systems,” educause evolving technologies committee, sept. 6, 2002.
5. james l. hilton, “digital management systems,” educause review 38, no. 2 (2003): 53.
6. james hilton, “university of michigan digital asset management system,” 2004, http://sitemaker.umich.edu/bluestream/files/dams_year01_campus.ppt (accessed feb. 15, 2007).
7. the university of michigan, “bluestream,” 2006, http://sitemaker.umich.edu/bluestream (accessed feb. 15, 2007).
8. oracle corp., “stellent universal content management,” 2006, www.stellent.com/en/index.htm (accessed feb. 15, 2007); artesia digital media group, “artesia: the open text digital media group,” 2006, www.artesia.com/ (accessed feb. 15, 2007); canto, “canto,” 2007, www.canto.com (accessed feb. 15, 2007).
9. r. d. vernon and o. v. riger, “digital asset management: an introduction to key issues,” www.cit.cornell.edu/oit/arch-init/digassetmgmt.html (accessed sept. 24, 2004); yan han, “digital content management: the search for a content management system,” library hi tech 22, no. 4 (2004): 355–65; stanford university libraries and academic information resources, “media preservation: digital preservation,” 2005, http://library.stanford.edu/depts/pres/mediapres/digital.html (accessed july 29, 2005).
10. telestream, “telestream, inc.,” 2005, www.telestream.net/products/flipfactory.htm (accessed feb. 15, 2007).
11. autonomy, inc., “virage products overview: virage videologger,” 2006, www.virage.com/content/products/index.en.html (accessed feb. 15, 2007).
12. international business machines corp., “ancept media server: digital asset management solution,” 2007, www.nasi.com/ancept.php (accessed feb. 15, 2007).
13. realnetworks, inc., “realnetworks media servers,” 2007, www.realnetworks.com/products/media_delivery.html (accessed feb.
15, 2007); apple, inc., “quicktime streaming server,” 2007, www.apple.com/quicktime/streamingserver (accessed feb. 15, 2007); international business machines corp., “db2 content manager video charger,” 2007, www­306.ibm. com/software/data/videocharger/ (accessed feb. 15, 2007). 14. sakai, “sakai: collaboration and learning environment for education,” 2007, www.sakaiproject.org (accessed feb. 15, 2007). 15. the university of michigan, “michigan radio,” 2007, www.michiganradio.org (accessed feb. 15, 2007). 16. the university of michigan, “block m records,” 2005, www.blockmrecords.org (accessed feb. 15, 2007); the univer­ sity of michigan, “deep blue,” 2007, http://deepblue.lib.umich. edu (accessed feb. 15, 2007). 17. e. duval et al., “metadata principles and practicalities,” d-lib magazine 8, no 4 (2002); a. m. white et al., “pb core— the public broadcasting metadata initiative: progress report,” 2003 dublin core conference sept. 28–oct. 2, 2003, seattle; j. attig, a. copeland, and m. pelikan, “context and meaning: the challenges of metadata for a digital image library within the university,“ college & research libraries 65, no. 3 (may 2004): 251–61. 18. white et al., “pb core—the public broadcasting meta­ data initiative”; attig, copeland, and pelikan, “context and meaning.” 19. dublin core metadata initiative, “dublin core metadata initiative,” 2007, http://dublincore.org (accessed feb. 15, 2007). 20. visual resources association, “vra core categories, version 3.0,” 2002, www.vraweb.org/vracore3.htm (accessed feb. 15, 2007); louis j. goldberg, et al., “the significance of snodent,” studies in health technology and informatics 116 (aug. 2005): 737–42; http://ontology.buffalo.edu/medo/sno­ dent_05.pdf (accessed feb. 15, 2007). 21. autonomy, “virage products overview.” 22. filemaker, inc., “filemaker,” 2007, www.filemaker.com/ products (accessed feb. 15, 2007). 23. m. 
fitzgerald et al., “efficacy of speech­to­text technol­ ogy in managing video recorded interactions,” journal of dental research 85, special issue a (2006): abstract no. 833. 24. u.s. department of education, “family educational rights and privacy act ferpa,” 2005, www.ed.gov/policy/ gen/guid/fpco/ferpa/index.html (accessed feb. 15, 2007). 25. g. lorenzo and j. ittelson, “an overview of e­portfolios,” educause learning initiative, 2005, http://educause.edu/ir/ library/pdf/eli3001.pdf (accessed feb. 15, 2007). 26. microsoft corp., “microsoft office powerpoint 2007,” 2007, http://office.microsoft.com/en­us/powerpoint/default. aspx (accessed feb. 15, 2007); apple, inc., “keynote,” 2007, www.apple.com/iwork/keynote (accessed feb. 15, 2007). 27. world intellectual property organization, “berne con­ vention for the protection of literary and artistic works,” 1979, www.wipo.int/treaties/en/ip/berne/trtdocs_wo001.html (accessed feb. 15, 2007). 28. wikimedia foundation, inc., “digital rights manage­ ment,” 2007, http://en.wikipedia.org/wiki/digital_rights_ management (accessed feb. 15, 2007). 29. creative commons, “creative commons,” 2007, http:// creativecommons.org (accessed feb. 15, 2007). 30. wikimedia foundation, inc., “access control list,” 2007, http://en.wikipedia.org/wiki/access_control_list (accessed feb. 15, 2007). 31. the university of michigan, “um institutional review boards,” 2007, www.irb.research.umich.edu (accessed feb. 15, 2007). 32. health insurance portability and accountability act of 1996 (hipaa), “centers for medicare and medicaid ser­ vices,” 2005, www.cms.hhs.gov/hipaageninfo/downloads/ hipaalaw.pdf (accessed feb. 15, 2007). 33. j. greenberg et al., “author­generated dublin core meta­ data for web resources: a baseline study in an organization,” journal of digital information 2, no. 2 (2002), http://journals.tdl. org/jodi/article/view/jodi­39/45 (accessed nov. 10, 2007); a. crystal and j. 
greenberg, “usability of a metadata creation application for resource authors,” library & information science research 27, no. 2 (2005): 177–89. 34. the university of michigan, “ctools,” 2007, https:// ctools.umich.edu/portal (accessed feb. 15, 2007). 35. m. cox et al., descriptive metadata for television (amster­ dam: focal pr., 2006); michael a. chopey, “planning and imple­ menting a metadata­driven digital repository,” cataloging & classification quarterly 40, no. 3/4 (2005): 255–87. user testing with microinteractions: enhancing a next-generation repository communication user testing with microinteractions enhancing a next generation repository sara gonzales, matthew b. carson, guillaume viger, lisa o'keefe, norrina b. allen, joseph p. ferrie, and kristi holmes information technology and libraries | march 2021 https://doi.org/10.6017/ital.v40i1.12341 sara gonzales (sara.gonzales2@northwestern.edu) is data librarian, galter health sciences library & learning center, northwestern university feinberg school of medicine. matthew b. carson (matthew.carson@northwestern.edu) is head, digital systems/senior research data scientist, galter health sciences library & learning center, northwestern university feinberg school of medicine. guillaume viger (guillaume.viger@northwestern.edu) is senior developer, galter health sciences library & learning center, northwestern university feinberg school of medicine. lisa o’keefe (lisa.okeefe@northwestern.edu) is senior program administrator, galter health sciences library & learning center, northwestern university feinberg school of medicine. norrina b. allen (norrina-allen@northwestern.edu) is associate professor of preventive medicine (epidemiology) and pediatrics, northwestern university feinberg school of medicine. joseph p. ferrie (ferrie@northwestern.edu) is professor and department chair of economics, northwestern university. 
kristi holmes (kristi.holmes@northwestern.edu) is director, galter health sciences library & learning center, and professor of preventive medicine (health and biomedical informatics) and medical education at northwestern university feinberg school of medicine. © 2021. abstract enabling and supporting discoverability of research outputs and datasets are key functions of university and academic health center institutional repositories. yet adoption rates among potential repository users are hampered by a number of factors, prominent among which are difficulties with basic usability. in their efforts to implement a local instance of inveniordm, a turnkey next generation repository, team members at northwestern university’s galter health sciences library & learning center supplemented agile development principles and methods and a user experience design-centered approach with observations of users’ microinteractions (interactions with each part of the software’s interface that requires human intervention). microinteractions were observed through user testing sessions conducted in fall 2019. the result has been a more user-informed development effort incorporating the experiences and viewpoints of a multidisciplinary team of researchers spanning multiple departments of a highly ranked research university. introduction galter health sciences library & learning center facilitates and supports the discoverability of knowledge for the faculty, students, and staff of the feinberg school of medicine at northwestern university. as an integrated unit in northwestern university’s clinical and translational sciences institute (nucats) and a key partner to other institutes across northwestern university’s two campuses, enabling maximum ease of use of library resources and support for meaningful information discoveries for researchers at all stages has been a prime motivator. 
these motivators helped drive the selection and development of an upgraded institutional repository infrastructure at galter, a project which began in 2018. discovery of resources through repository tools depends upon many factors: metadata and controlled vocabularies used, storage and retrieval capacity, and familiarity and comfort level with the tool on the part of researchers and students. is the institutional repository link easy to find on the website? more importantly, is it easy to use? can records be created and files uploaded with ease? do searches bring meaningful results, and can they be filtered and organized for maximum impact? from early on in the repository upgrade project, galter library partnered with northwestern’s institute for innovations in developmental sciences (devsci), both to answer these questions and to find practical ways to serve researchers who aimed to discover relevant datasets through a repository. the work of the interdisciplinary devsci group is focused on human development across the lifespan in all areas, including physical, emotional, psychological, and socioeconomic, providing a multidisciplinary perspective for the collaboration. through this partnership, devsci’s goal was to develop a data repository or index through which they could discover the datasets of their fellow researchers and find new collaborators, thus providing an ideal perspective from which to provide critical feedback.
galter health sciences library & learning center selected the inveniordm (research data management) extensible institutional repository (ir) platform as its local ir code upgrade. inveniordm is a python-based, modular and scalable ir developed by cern (the european organization for nuclear research) and collaborators.1 the first version of the invenio framework was developed in 2000. in 2018, version 3.0 of invenio was released,2 with significantly improved software and code rewritten to make it a modular framework. this new framework now serves as a foundation for modern research data management and scholarly communications through a trusted digital repository. inveniordm is being collaboratively developed, with its many partners and robust developer community ensuring the framework’s maintenance, improvement, and preservation capacity into the foreseeable future. to galter library and its partners at devsci, the development timespan required to build a local instance of inveniordm presented the perfect opportunity to address one of the major stumbling blocks of ir adoption: the user experience. if a repository were designed with users’ needs in mind, and took into account their behaviors and interactions with every aspect of the tool, it had the potential to increase adoption and usability far beyond numbers generally observed for university irs. designing for user behaviors was the goal of galter library’s repository development team as we launched a round of user testing of the repository’s alpha version in fall 2019. literature review it is an exciting time for irs serving researchers in the sciences and particularly in the field of biomedical research.
new robust repository frameworks capable of storing and preserving data for decades into the future are being developed to meet widely articulated researcher needs, including those user behaviors and technologies highlighted by the confederation of open access repositories (coar) in detailed guidelines for next generation repository features.3 these features include interoperable resource transfer, metadata-enhanced discovery by navigation, and exposure of permanent identifiers. researcher-focused organizations such as coar endorse and promulgate fair principles to make deposited and shared data findable, accessible, interoperable, and reusable.4 meanwhile, federal agencies such as the national institutes of health are increasingly incorporating policies to encourage best practices for data management and data sharing of grant funded research.5 these policies often recommend depositing data in a robust, secure, and accessible ir which can be maintained by the researchers’ own institution. in recent years, the majority of the deposited products of research have been stored in subject repositories or made available via social media platforms which carry no guarantee of long-term curation and preservation of shared resources.
this happens even though institutional repositories are well-represented on the overall repository landscape, suggesting that these institutional assets are underutilized in critical data workflows.6 the reasons for slow adoption and use of institutional repositories (irs) by researchers in routine workflows are many: irs can be perceived as adding to researchers’ administrative work burden through the need to clean, deposit, and catalog data and other research outputs; many feel trepidation about open science practices and their effects on citation counts; and researchers may feel unsure about copyright restrictions on materials they might deposit.7 narayan and luca, in their study of one university’s ir adoption challenges, outline some of the deep-seated motivations behind this trepidation, such as the social and psychological barriers imposed by researchers’ own, and university-encouraged, traditional views of scholarly publishing, as well as the ways in which these views are heavily supported by university systems in their tenure and promotion policies. in addition, many researchers perceive the content contained in irs as restricted or of limited use compared to the volume of resources that can be found through a google search.8 the ir is often seen as a small island within the larger digital research landscape. the degree to which repository managers are attuned to their local users’ professional and personal needs with regard to a repository will have a large impact on adoption rates among hesitant user populations.
as witt and betz & hall point out, professional motivators have arisen in multiple disciplines to deposit in irs not only preprints, but datasets, data dictionaries, readme files, and other reproducibility-supporting resources, in order to provide open access to the products of federally funded research.9 building on funder and publisher mandates for making both publications and datasets open access, ir builders and maintainers can employ various methods to increase the motivation momentum towards ir adoption. they can highlight repository champions, faculty users of ir tools who can provide use cases and success stories about the benefits the ir brings to them, as both depositors and searchers.10 they can help to allay fears and confusion around depositors’ rights with regard to deposited materials by carefully explicating license types and definitions and by consulting with researchers on the correct licenses to choose for their deposits. they can work with their repository’s developers to create value-added modifications to the repository, including user-friendly browsing, featured collections, and researcher pages, which highlight the most current research at the institution.11 importantly, if the ir maintainers are able to modify the repository’s interface to suit local needs, they can help ensure that the majority of users have a positive experience with the repository’s interface, one in which every interaction is intuitive and in which there are no wasted steps or unnecessary clutter. such usability can be achieved through examining repository users’ microinteractions, that is, interactions with each small part of the software’s interface that requires human intervention. for those engaged in library technology projects in recent years, user experience (ux) design will be a familiar concept. 
ux design seeks to make a user’s interaction with a product—often a web-based tool—easier and more intuitive, frequently through manipulating the behavior of the user.12 however, in the recent trend toward designing with a focus on microinteractions, software developers are influenced by the data they glean from observing users’ interactions with each part of the software’s interface that requires human intervention, noting from the users themselves the intuitive and non-intuitive parts of an interaction in order to determine where changes should be made.13 the development team for inveniordm has taken an approach that combines traditional user-experience design based around collaborative, open source code and tools and common dataset metadata standards such as datacite (https://schema.datacite.org); observations from invenio’s 20 years of serving as cern’s ir; and examination of microinteractions at certain key stages of development. these microinteractions revolve around common user actions within an ir, including depositing items, searching, browsing, and creating a user account.14 the result of combining these approaches has allowed galter health sciences library & learning center to put users’ needs at the forefront of its new ir. inveniordm: a next-generation repository galter library’s selection of a new ir solution was carefully considered and motivated by the organization’s need for a robust, forward-looking, and feature-rich repository that could support best practices in research data management and sharing as realized through the invenio framework. the python framework incorporates community-built python libraries, while also leveraging flask, a postgresql or mysql back-end database, the react js user interface, and the extremely fast elasticsearch json-native distributed search engine.
the resulting tool is eminently scalable, securely housing petabytes’ worth of easily discoverable records. galter library began our collaboration with cern to build a local instance of inveniordm, while contributing to the overall repository source code, in late 2018. since that time, a local developer has worked on the code and contributed to the repository’s project roadmap, updating github issues and pushing releases.15 a project manager, data librarian, and the library leadership have also been involved throughout the project in the areas of general guidance and management, oversight, assessment, dissemination and outreach, and requirements gathering. many requirements were gleaned from the devsci community through conversations and informal interviews around data storage practices. in early 2019, to ensure that the repository was meeting the initially envisioned requirements, the galter library repository team analyzed the requirements thus far gathered for the project, which had been translated into github issues and added to by team members and collaborators. the requirements gathered from devsci collaborators and galter librarians were found to map directly into key ir functional categories outlined separately by ir stakeholders throughout the globe such as the national institutes of health, the confederation of open access repositories, the digital repository of ireland, the department of computer and information sciences at covenant university, nigeria, and others. those requirements included record creation and ingest, robust metadata for accessing a record, user account and permissions, user authentication, search functionality, resource access/download, and community pages and features (see fig. 1).16 figure 1. local repository requirements mapped to ir functional categories.
with the repository’s key functional requirements defined, the development team needed real-world data to help inform the microinteractions that would bring the functions to life, both for repository managers and users. to acquire microinteraction data, galter library’s data librarian designed and organized a round of user testing of the alpha release of the repository in autumn 2019. the alpha release of inveniordm was completed by september 2019, meeting a deadline established by one of the project’s key funders, the national center for data to health (cd2h), through a grant funded by the national center for advancing translational sciences (ncats). this early alpha release enabled record creation and file upload, application of seven metadata elements (title, authors, description, resource type, subjects, visibility, and license), user authentication, search, faceting/filtering, and download of resources. to make the experience of searching the alpha release repository as realistic as possible, the data librarian asked colleagues from devsci to provide data for seed records for the repository, based on their own research. colleagues willingly obliged, and over a dozen seed records based on real-world clinical studies and other studies focused on human development were created in early october 2019. while conducting email and word-of-mouth recruitment with members of the devsci community, the data librarian worked on a testing script designed to require the maximum number of microinteractions possible as each user worked with the repository. the script asked users to complete the following list of tasks (see fig. 2), while thinking aloud and noting anything that they found unusual or anything that they would have expected to see in the user interface of the repository. figure 2. ir user testing script tasks.
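as a rough illustration of the seven metadata elements the alpha release applied (title, authors, description, resource type, subjects, visibility, and license), the sketch below models a record as a plain python dataclass. the class name `AlphaRecord`, the field types, and the `missing_fields` helper are hypothetical and chosen only for illustration; this is not inveniordm's actual data model or api.

```python
from dataclasses import dataclass, field


@dataclass
class AlphaRecord:
    """Hypothetical record holding the seven alpha-release metadata elements."""
    title: str
    authors: list[str]
    description: str
    resource_type: str
    subjects: list[str] = field(default_factory=list)
    visibility: str = "open access"  # the only visibility level named in the text
    license: str = ""

    def missing_fields(self) -> list[str]:
        """Return the names of metadata elements that are still empty."""
        return [name for name, value in vars(self).items() if not value]


# A seed-style record with subjects and license not yet filled in.
record = AlphaRecord(
    title="example developmental study",
    authors=["a. researcher"],
    description="dataset described for testing purposes only",
    resource_type="dataset",
)
print(record.missing_fields())
```

a check of this kind could, for example, prompt a depositor about empty elements before a record is published.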
the user testing tasks conform to many of the functional requirements for institutional repositories identified from our requirements gathering (fig. 1), including user authentication and account, search functionality, resource access/download, record creation/ingest, and robust metadata for accessing a record (detailed record page). by october 2019, ten northwestern university faculty members, mainly from devsci, and two information professionals had agreed to test the alpha version of inveniordm. the data librarian arranged to securely host testing sessions through web conferencing software. testers agreed to have the sessions recorded and shared their screens as they worked through the test scenarios, allowing the data librarian to observe their movements through the repository and to review the recordings later in case anything was missed. testing sessions generally lasted between twenty and thirty minutes, although some lasted from forty-five minutes to one hour. after sessions were completed, the data librarian recorded in text documents a description of all microinteractions and verbal observations testers made about the repository. she later transferred that data to a spreadsheet, listing each criterion individually and manually adding a count of how many testers either reported or were observed to experience the same phenomenon (appendix 1). reported phenomena and observed difficulties that users experienced in testing the repository were aggregated and included in the final reported data if at least two testers reported or had the same experience. through these counts the data librarian was able to identify which microinteractions proved most challenging to the testers.
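the counting rule described above (keep a phenomenon only if at least two testers reported or experienced it) was applied manually in a spreadsheet; the following python sketch shows one way the same tally could be expressed. the tester identifiers and observation labels are invented placeholders, not the study's data, and the function name `aggregate` is an assumption.

```python
from collections import Counter

# Hypothetical per-tester notes; labels are illustrative, not the study's data.
observations = {
    "tester_01": ["missed start upload button", "unclear login credentials"],
    "tester_02": ["missed start upload button"],
    "tester_03": ["wanted a clear-all-filters control"],
}


def aggregate(notes_by_tester, min_testers=2):
    """Count testers per observation and keep those meeting the threshold."""
    counts = Counter()
    for notes in notes_by_tester.values():
        counts.update(set(notes))  # set() so each tester counts once per observation
    kept = [(obs, n) for obs, n in counts.items() if n >= min_testers]
    # Most frequently shared observations first, ties broken alphabetically.
    return sorted(kept, key=lambda item: (-item[1], item[0]))


reported = aggregate(observations)
print(reported)
```

sorting by count puts the most widely shared difficulties first, which mirrors how the counts were used to identify the most challenging microinteractions.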
discussion manual, qualitative analysis of the user testing data revealed challenges that users faced with inveniordm that were best captured and expressed when observed as microinteractions. though user experience design had been employed extensively in the design of the database, it was the nuances of the interactions that showed where improvements in the design could still be made. almost every functional area of the repository demonstrated a need for increased user input in its design. the results of user exercises testing the various functional areas are described below. user profile screens ease-of-use exercise while most testers did not experience difficulties in locating the login button on the repository’s home page that allowed them to access the user profile portions of the site (9/12 located the button in less than three seconds), most testers (7/12) requested clearer instructions for which username and password to use (e.g., ldap or shibboleth-based credentials), and three-quarters of testers (9/12) inquired about where and how to add information about themselves to their profiles (e.g., professional title, contact information, department and other affiliations, etc.). while the required task consisted simply of logging successfully into one’s profile, the testers rightly discovered and acknowledged that the robust, cris (current research information system)-like features of many repositories’ user profile pages had not yet been fully implemented in inveniordm. finding datasets exercise the next task required testers to perform a search any way they liked in the repository, locate a dataset record, and download the associated data file. whether searching using filters or by entering keywords to find a known item, users were always able to easily identify a data file within a record and download it. 
a special feature of inveniordm that occasionally made finding a data file to download challenging was that the repository was designed to serve as both a repository for digital files and a data index. the main feature of a data index is that it will store records representing datasets without necessarily storing the datasets themselves. the data index option is crucial for health sciences researchers, who are often motivated to share data files as openly as possible for reasons of reproducibility, open access to scientific data, and compliance with funder mandates, but who cannot always safely deposit data files because they contain personally identifiable information (pii) or protected health information, as defined by hipaa,17 about the human subjects involved in their studies. this phenomenon spurred us to create seed records in the repository that represent real clinical studies, for which the data could be made available upon request from the researchers, but for which a data file was not uploaded to the repository. by following the protocol of making this clinical data public as safely as possible, these records were created and tagged with a visibility level of open access. two of the testers stated that they believed that a record tagged as open access implied that a data file was available for download, and many others (10/12) expected either a visual cue or another filter to allow users to home in on only the dataset record results that contained a deposited data file. filter searching exercise searching the repository, particularly via the filters, resulted in some of the most interesting results of the entire testing process.
in the testing exercises, users were asked to search for anything they wanted, either narrowing their results from a direct term search or beginning a search from the full record set using the filter options on the left side of the screen. opinions differed among the repository team members and some testers as to whether applying two filters at once to the results of a search would combine the two subsets of results with and or with or. another way to phrase this question: upon the application of one filter, would the search results and the other filters update in real time? for instance, if i filtered my search results to include only the deposits of one particular author, and if the file type filter choices still contained, after the filtering action, types including pdfs, xlsx files, doc files, mov files, etc., is it safe to assume that my chosen author deposited all those types of files? (see fig. 3.) or do the filters behave independently of each other? one third of the testers (4/12) said they expected the filtering choices to update in real time and that the application of two filters should combine the results with and, returning only records that match both filters. figure 3. filters available in inveniordm. seven of the twelve testers said that the most helpful of the five filters available in the repository at the time were resource type, file type, and subjects, while slightly fewer found the author and license filters helpful. three of the twelve asserted that it would not occur to them to filter on a resource’s license. specific record search exercise in the specific record search, users were asked to find a specific paper by a particular author.
none of the testers experienced any difficulties or significant delays in bringing up the requested record, which served as a testament both to their searching abilities and to the robustness of the elasticsearch framework utilized in the repository’s infrastructure. ten of the twelve testers mentioned that they found the preformatted citations available with each record helpful, and two of these testers requested an easy way to export the citation in their desired format to endnote. three of the twelve testers experienced a brief initial delay in locating the known record because they still had filters applied in the repository from a previous search. they requested an easy way to clear all previous filters when starting a new search. creating a record exercise as one of the larger user testing tasks, users were provided with a dummy file representing a dataset and asked to deposit and describe it with appropriate metadata using the repository’s cataloging form. the first part of this task involved finding the button that brings the user to the cataloging form, a button which is available from several places in the inveniordm layout (home page, search results page, and profile page). eight of the twelve testers took longer than three seconds to locate this button, termed the “catalog your research” button. two of these eight reported that “catalog” was not the verb they would associate with depositing and describing a data file; to some, “catalog” seemed a library-centric term. on the repository’s record creation page, a space exists to either upload or drag and drop a file, and when this task is done, a large, blue “start upload” button appears that the user must click to begin the file upload. (see fig. 4.) 
yet despite its size and color, almost half the testers (5/12) did not notice that they had to click it in order to complete the upload of their file and, worse, they often completed the record creation process and published their record without noticing that the file had not uploaded. visual cues were needed to confirm for the user whether a file was successfully uploaded or not. in addition, automatic upload upon browsing and attaching a file or dragging and dropping a file was reported as an expected behavior by many users. figure 4. users often missed the blue “start upload” button just beneath the file name. most users applied descriptive metadata successfully and easily, but some experienced trouble while appending subject metadata to describe the subject matter of their deposits. as the repository is being customized for a health sciences library, subject fields are offered in inveniordm to allow appending both medical subject headings (mesh) terms and terms from faceted application of subject terminology (fast), a vocabulary derived from the library of congress subject headings (lcsh). since the two vocabularies’ terms are offered in separate fields (fast serving as a more universal set of terms to complement the biomedically oriented mesh), users became confused, not knowing which, if either, subject field they should complete. a solution involving a single subject field that queries both the mesh and fast apis is warranted in order to simplify the subject-tagging experience for the user. editing a record exercise editing the record they had just created, the final testing task, proved to be unproblematic for testers.
eleven of the twelve found the edit button on their record pages in less than three seconds, and the editing process was reported to be straightforward. one tester observed that they were unable to change their file (i.e., make a version-level change to their record), but this was only because version change functionality had not yet been implemented in the repository. reflection/unguided feedback once all the testing exercises outlined above were completed, testers were asked to talk about the uses for which they might employ a tool like inveniordm. without prompting or suggestions of uses, the testers overwhelmingly stated that they would use the completed repository for the very functions for which most irs are built: storing data files (two users), sharing data for open science or to fulfill funders’ mandates (two users), searching for others’ datasets (four users), creating gray literature collections to showcase their conference presentations and posters (three users), embedding repository-issued dois from their datasets into manuscripts and posters (three users), and storing data in private collections to share with trusted collaborators (four users). this data shows that university faculty and researchers have various specific needs for repository solutions, which can be met if the repositories are designed with these needs in mind. after testing: next steps the user testing experience for inveniordm proved to be a highly enjoyable process for all involved. the tester participants expressed enthusiasm in the process and appreciated the opportunity to share their ideas about the functionality of an ir while it was still in the design stages. the testers’ enthusiasm reinforced the notion that many university faculty members are eager for an intuitive, user-friendly tool that will allow them to store, retrieve, and share their research outputs, as long as the tool is designed with their needs in mind. 
observation of testers’ microinteractions with galter library’s new institutional repository has helped the local development team to better understand what those needs are. the results of the user testing were presented at the inveniordm product meeting at cern in january 2020. they were well received and led to immediate adjustments to the repository’s development. as development continues into 2021, the repository team at galter health sciences library & learning center will design and manage at least one additional user testing round to ensure that the repository continues to meet its goals of serving key functional requirements of irs while also providing users the best possible experience through each interaction they have with the tool. as user testing sessions demonstrate, there is much room for growth in the achievement of a truly intuitive interface design in even some of the seemingly simplest functions of the repository, such as intuitively placing a deposit button or homing in on the right combination and placement of filters. the galter library development team is committed to continuing to work toward an intuitive and seamless user experience. on this journey the repository team acknowledges and thanks the testers and future users of its repository and the researchers and support staff for whom the tool is being built and without whom it could not be built half as well.

acknowledgements

the project team would like to acknowledge northwestern university’s institute for innovations in developmental sciences and the northwestern university clinical and translational sciences institute (nucats). inveniordm project team members sara gonzales, guillaume viger, matthew b.
carson, lisa o’keefe, and kristi holmes were partially funded by the ctsa program national center for data to health, grant u24tr002306 and nucats, grant ul1tr001422.

appendix 1. inveniordm user testing aggregate data, divided by task

reporting criteria

a phenomenon or observation was noted if it was reported by, or observed in the behavior of, two or more testers. for four tasks, seconds were counted as a mark of how easy it was to find a repository element that enabled the task:

1. finding the login button to access user account
2. finding the citation button after the search for a specific record
3. finding the "catalog your research" button
4. finding the "edit record" button

counting of seconds was done with an iphone stopwatch while reviewing the recorded sessions. if finding the required button took the user longer than a generous count of three seconds, it was deemed that the user had a hard time locating the item.

user profile screens results

9/12 testers wanted to add information about themselves and their appointment (department, title, contact information, etc.)
7/12 wanted clearer instructions for the username and password they use to log in
3/12 testers took three seconds or more to locate the user login button on the home page

finding datasets exercise results

10/12 testers expected a (sortable, filterable) cue on the search results screen to show whether record has a file to download
3/12 testers wanted grayed out instructions or search tips/suggestions in the search box
2/12 testers believed that the open access pill in search results implied there would be a file to download
2/12 testers believed that the subject pills in full record view should be clickable to enable direct search on the subjects

filter searching exercise results

7/12 testers said the most helpful filters were resource type, file type, and subjects, followed by author, then license
4/12 testers expected filter choices to update in real time based on initial filter chosen
3/12 testers expected option to expand beyond the top 10 authors in the authors filter
3/12 testers were not familiar with the choices of mesh and fast terms
3/12 testers would not think of filtering on license
2/12 testers expected guidance on the licenses' meanings if browsing/filtering by license is offered
2/12 testers expected greater filter collapsing/expanding options than what was offered
2/12 testers expected to apply two filters at once
2/12 testers wanted to filter on sample or demographic information of study subjects

specific record search exercise results

10/12 testers found the preformatted citations helpful
7/12 testers found the citation button in less than three seconds
3/12 testers had trouble with their known-record searching because filters were on when they started; needed an easy way to clear all filters
2/12 testers expected an option to download the found record’s citation to endnote

creating a record exercise results

8/12 testers took longer than three seconds to find the “catalog your research” button; of those, two would have used a different phrase [“’catalog’ is too library-centric”]
5/12 testers did not see “start upload” button after dropping their files, and an additional two said they expected auto-upload immediately upon dropping their files, with no “start upload” button necessary
3 of the 5 testers who missed the “start upload” button did not notice that their file did not get saved to their record
5/12 testers did not notice at first that the “save draft” step was needed before clicking publish, and one additional tester said they expected record auto-save, which would help in filling out a longer record
5/12 testers wanted guidance on which license to choose
4/12 testers expected some kind of instructions for filling out the cataloging page, even if only for specific fields like description or title
3/12 testers found the resource type interface intuitive
3/12 testers thought the arrow in the mesh (subject) field implied the availability of a drop-down list of options
2/12 testers did not see the drop-down choices under resource type umbrella categories at first
2/12 testers expected terms entered in the mesh (subject) fields to stay there, or a warning that they will disappear if there is no match
2/12 testers wanted more guidance on choosing visibility level (private, public, etc.)
2/12 testers wanted more definitions/assistance about difference between medical and topical subject terms
2/12 testers wanted more definitions/assistance with fast terms
2/12 testers said they would prefer a default license option

editing a record exercise results

11/12 testers found the edit button in less than three seconds

reflection/unguided feedback results

4/12 testers would use the repository to search for data
4/12 testers would store data files in private collections to be shared only with trusted collaborators
3/12 would embed the repository-issued dois from their datasets into their manuscripts and papers
3/12 testers would create their own grey literature collections of conference abstracts and posters
2/12 testers would use the repository for storing data files
2/12 testers would use the repository for open access/open science/data sharing compliance motivations

references

1 “inveniordm: the turn-key research data management repository,” cern (european organization for nuclear research), accessed march 11, 2020, https://invenio-software.org/products/rdm/.
2 lars holm nielsen, “invenio v3.0.0 released,” invenio blog (blog), invenio, june 7, 2018, https://invenio-software.org/blog/invenio-v300-released/.
3 “next generation repositories: behaviours and technical recommendations of the coar next generation repositories working group,” confederation of open access repositories (coar), november 28, 2017, https://www.coar-repositories.org/files/ngr-final-formatted-report-cc.pdf.
4 “the fair data principles,” force11, accessed march 11, 2020, https://www.force11.org/group/fairgroup/fairprinciples.
5 national institutes of health, “final nih policy for data management and sharing,” nih office of extramural research, accessed january 15, 2021, https://grants.nih.gov/grants/guide/notice-files/not-od-21-013.html.
6 gary e. gorman, jennifer rowley, and stephen pinfield, “making open access work: the ‘state-of-the-art’ in providing open access to scholarly literature,” online information review 39, no. 5 (september 2015): 604–36.
7 bhuva narayan and edward luca, “issues and challenges in researchers’ adoption of open access and institutional repositories: a contextual study of a university repository,” in proceedings of rails – research applications, information and library studies, 2016, school of information management, victoria university of wellington, new zealand, 6–8 december (2016); information research: an international electronic journal, 22, no. 4 (december 2017), http://hdl.handle.net/10453/121438.
8 beth st. jean, soo young rieh, elizabeth yakel, and karen markey, “unheard voices: institutional repository end-users,” college & research libraries 72, no. 1 (january 2011): 21–42.
9 michael witt et al., “connecting researchers to data repositories in the earth, space, and environmental sciences,” in digital libraries: supporting open science, ircdl 2019, ed. leonardo candela and gianmaria silvello (2019); communications in computer and information science 988, 86–96; sonya betz and robyn hall, “self-archiving with ease in an institutional repository: microinteractions and the user experience,” information technology and libraries 34, no. 3 (september 2015): 43–58.
10 betz and hall, “self-archiving with ease,” 43–58.
11 st. jean, rieh, yakel, and markey, “unheard voices,” 21–42.
12 “user experience design,” wikipedia, last modified january 12, 2021, https://en.wikipedia.org/wiki/user_experience_design.
13 betz and hall, “self-archiving with ease,” 43–58.
14 a. o. adewumi, n. a. omoregbe, and sanjay misra, “usability evaluation of mobile access to institutional repository,” international journal of pharmacy and technology 8, no. 4 (december 2016): 22892–905.
15 “inveniordm project roadmap,” cern (european organization for nuclear research), accessed march 17, 2020, https://invenio-software.org/products/rdm/roadmap/.
16 national institutes of health, office of the director, “supplemental information to the nih policy for data management and sharing: selecting a repository for data resulting from nih-supported research,” last modified october 29, 2020, https://grants.nih.gov/grants/guide/notice-files/not-od-21-016.html; “coar community framework for good practices in repositories, public version 1,” confederation of open access repositories (coar), last modified october 8, 2020, https://www.coar-repositories.org/coar-community-framework-for-good-practices-in-repositories/; sharon webb and charlene mcgoohan, the digital repository of ireland: requirements specification (national university of ireland maynooth, 2015), https://doi.org/10.3318/dri.2015.6; adewumi, omoregbe, and misra, “usability evaluation of mobile access to institutional repository,” 22892–905; suntae kim, “functional requirements for research data repositories,” international journal of knowledge content development & technology 8, no. 1 (march 2018): 25–36.
17 u.s. department of health and human services, “summary of the hipaa security rule,” last modified july 26, 2013, https://www.hhs.gov/hipaa/for-professionals/security/laws-regulations/index.html.
highlights of minutes

information science and automation division
board of directors meeting
1973 annual conference
las vegas, nevada

monday, june 25, 1973

the meeting was called to order by president ralph shoffner at 8:25 a.m. those present were: board: ralph shoffner, paul j. fasana, susan k. martin, donald p. hammer, and berniece coulter, secretary, isad. guests: frederick kilgour, james dolby, stephen salmon, james rizzolo, lavahn overmyer, douglas ferguson, and brett butler.

minutes of midwinter meeting. there was a request that the minutes of the board meetings at midwinter (january 1973, washington, d.c.) be further edited, deleting and clarifying, for publication. mrs. susan martin, editor of jola, informed those present that the deadline for submission of copy for the march 1973 issue is the middle of july.
president shoffner suggested a separate meeting be held during conference to revise the midwinter minutes.

cooperation with asis/siglan. douglas ferguson, representing the american society for information science special interest group on library automation & networks (asis/siglan), presented a proposal for the board's consideration that isad cooperate with asis/siglan in the areas of publications, programs, and research proposals. the aims of such cooperation would be to reach people (for membership) and to save money. mr. ferguson was interested in the board's response on this matter. mrs. martin stated that cooperation in the publications area might be relatively easy for asis, but for ala it might be another matter. mr. ferguson felt that implementation would be the problem. he would like to focus on "specific purpose projects" functions: sharing of budget and membership. mr. fasana suggested that the three chairmen of the isad ad hoc committees (research topics, seminar and institute topics, and objectives) meet with mr. ferguson, and that charles husbands, isad's representative to asis, be included. president shoffner named jim dolby, don bosseau, and douglas ferguson to set up a time and place to meet with those asis/siglan board members present at the conference to discuss isad and asis/siglan cooperation.

journal of library automation vol. 6/3 september 1973

report of jola editor. mrs. martin informed the board of the status of jola. the june 1972 issue had been mailed, the september 1972 issue would be mailed in two weeks (about the middle of july), and the december 1972 issue was in the galley stage and should be mailed within six weeks. the post office has been told that the mailing of the jola issues would be caught up by the end of the calendar year. the jola technical communications has completed publication of the 1972 issues and is ready to be incorporated into jola.
she suggested that the board reconsider incorporating jola/tc into the journal since tc would become neither timely nor a newsletter. the decision to incorporate had originally been voted by the board as a result of ala publishing board's request that all divisions cut their budget item for publications by 10 percent. the 1973/74 budget was set up for a combined journal. mrs. martin asked for the board's opinion. president shoffner asked if it would be possible to return to a monthly publication as soon as cost allowed. mrs. martin replied that a reduction in the jola budget would have to be made to allow for monthly publication since tc was now incorporated into jola's budget. the question was referred to the isad editorial board.

information science abstracts. ben ami lipetz is interested in sponsorship and cooperation in relation to isa. several associations are now sponsors. to become a sponsor the association makes an initial financial commitment and then gains advisory board capacity. isad at one time had a subscription drive for fifty new isa subscriptions from isad members, which was not fulfilled. motion by paul fasana that some attempt be made to evaluate isa and make recommendations to the board whether isad should consider promoting subscriptions to isa. seconded by susan martin. carried.

report of the committee on objectives. mr. stephen salmon, chairman, reported that his committee on objectives would meet that morning at 10:00 a.m. and they had not yet discussed the draft report. he had talked with the members of the committee by telephone, however. mr. richard angell had written to president shoffner; the main point of his letter was that isad not accept the recommendations of the media group that isad incorporate them into the division. he argued that ala organization does not provide for types of materials, although it does provide for types of libraries and types of activities.
he also felt that there was not a broad enough area of mutual concern, and that coo should deal with this matter. mr. salmon pointed out two matters: (1) when the full objectives committee met with representatives of the information technology discussion group, there did seem to be a community of effort; the "chemistry" seemed to be there, and it did seem to work well, and (2) it was expressed that "media" has been around ala for years, even decades, and may continue without a "home" while the best answer is still being determined. mr. salmon suggested that isad could attempt a solution at this time. mr. shoffner mentioned that one of his particular concerns was that the objectives committee very explicitly review and then express its opinion on the relation of the technology group to ala's audiovisual committee. mr. salmon stated the committee's feelings were that isad objectives should be changed to include media in its scope. he asked whether the information technology discussion group should continue as a discussion group or become a committee, round table, or section. president shoffner said he had put "proposed audiovisual committee" on the agenda to determine whether or not there were strong objections to the information technology discussion group being housed in isad as a committee. if not, the objectives committee could be charged to come back to the board with recommendations as to the form of the group. mr. salmon stated, however, that the objectives committee was not an organization committee and the form of the group was not its concern. the establishment of a committee was provided for in the division bylaws. mr. fasana thought isad's objectives were broad enough as stated in the bylaws, article ii, section 1, the object of the division: particularly the words "and related technological developments." mr.
salmon said the committee felt that media should be specifically provided for in the isad objectives statement rather than leaving it open to be read into the present statement. he mentioned that historically the "founding fathers" of isad had merely called the name of the division: "library automation" but that the "information science" had been put in by coo to please the reference services division who were interested in that area. to place the matter before coo could take six years, mr. fasana stated. mr. salmon felt, however, that it may be the time to call coo's attention to this discussion group so that eventually they might eliminate the nineteen or so av committees which have no formal connection with ala's audiovisual committee. mr. kilgour wanted to know how the board felt about a name change of the division and mr. shoffner suggested he see steve salmon and larry auld about his suggestions.

report of research topics ad hoc committee. mr. james l. dolby, chairman, expressed the feelings of his committee that the present research need lay in the area of improvement of library operations and suggested concentrating on planning and evaluating existing and proposed methods rather than on system "breakthroughs." a tremendous amount of data has been accumulated by presently operating automated systems. this data should now be put to use in research studies. first, acquisitions: no library is comprehensive, not even the library of congress. data should be collected that should enable the library to know the ratio of use to collectivity, e.g. lists of high-use books; funneling all acquisitions through a single source to reduce duplicate buying; the possible merging of data bases (upon which circulation data can shed some light); and setting up methods by which libraries can interchange data so that comparative analyses can be made. second, cataloging: mr.
dolby made several recommendations: (1) develop measures of cost effectiveness of various access systems. he stressed a need to provide more access to library collections, looking into the means of expanding from two or three subject headings per book in a cataloging system to forty or fifty. we need to know what people are trying to do in libraries by systematic collection of information about their activities. (2) enumerate various types of information requests made by users. included in this would be the collection of data on in-library use. (3) determine needs in terms of coordinated planning, cooperation, and hardware and software transferability which should be confronted before the fact, rather than after, as more and more regional operations take shape. (4) develop the problem-solving and idea-producing capabilities of library staffs to a maximum. (5) develop a continuing education program for librarians covering isad-related topics. (6) establish a curriculum committee to deal with problems in schools of librarianship and to make information science a teachable subject.

conference planning committee report. chairman brett butler had two statements to make regarding future programs: (1) the theme of an institute program which had been postponed would be carried over to the 1974 new york annual conference: "library automation in the national libraries," enlarged to include other national libraries; (2) the 1975 san francisco meeting would be on "information science and library automation" and focus on the things dolby mentioned in his research topics ad hoc committee report. he further stated he would like to see more co-sponsored programs.

wednesday, june 27, 1973

president ralph shoffner opened the board meeting at 10:10 a.m. the following were in attendance: board: ralph shoffner, donald hammer, paul fasana, susan martin, and berniece coulter, secretary, isad.
guests: stephen salmon, james rizzolo, douglas ferguson, brett butler, david waite, velma veneziano, pearce grove, ronald miller, frederick kilgour, and lawrence w. s. auld. president shoffner pronounced a quorum present.

seminar and institute topics. chairman ron miller said the committee had reviewed the literature and felt there was value in regard to continuing education programs. he said they had received summaries and statistics on previous seminars and felt the institutes should be continued.

conference planning committee report. brett butler, chairman, summarized the seminars held during the year. the microforms seminar in detroit was not held as a separate seminar but incorporated into sessions of the national microform association's annual meeting. two other seminars were not held as planned but postponed. surplus monies in the preconference fund were to be used for publishing of the proceedings of the preconference. tapes had been professionally made of these proceedings. the 1975 meetings were being planned now with the goal of cooperating with asis. the program in new york on national libraries which was cancelled last midwinter (1973, washington, d.c.) was being considered with the scope to be increased to include other national libraries: france, great britain, etc. the program could require more time than the normal time slots. maryann duggan was preparing a goals document to be distributed to each participating library. mr. butler continued with his report saying that maryann duggan would plan and coordinate the networks seminar for the spring. the focus would be on the proposed theme: "advertising library automation: how to share your efforts." the general feeling is that isad should also do something with other associations in state library operations and library schools. mr.
fasana stated that some feedback on the las vegas preconference had been related in such questions as "why was the registration fee for preconference so expensive this year?" mr. butler reported that a reduction in the price of the proceedings was offered to registrants. isad would have to subsidize ala publishing services for the discount offered these registrants because ala cannot sell at different prices to the membership. mr. miller stated that his committee felt that a good deal of the work now done by volunteers should be done by a staff member and those costs be included in the registration fees. this approach should be used for future seminars. mr. fasana said that some divisions list the analysis or a breakdown of costs on their advertising or program for preconferences. apparently, president shoffner said, the price was reasonable, judging from the response. mr. butler said another objection was the conflict with acrl's preconference on networks. had we had knowledge of their preconference previously, the two could have been coordinated. mr. shoffner said that in a large conference conflicts were to be expected. mrs. veneziano felt there were few people in attendance at the networks preconference who, had it not been held, would have attended isad's.

president given authority to appoint committee members without individual approval. mr. kilgour asked that the board give him the authority to appoint committee members without approval of each individual as the isad bylaws stated. he asked for blanket approval. this was previously given to mr. shoffner. mr. kilgour asked that the board invoke section 2b of the bylaws and provide that the appointments last until the end of the president's term. mr. fasana suggested that a sense of "yes" be given.

report of marbi committee. mrs.
veneziano, the chairman, suggested the acronym "marbi" (machine-readable bibliographic information) be used for convenience in remembering the lengthy title of the committee. the committee was still concerned with trying to define its role and the mechanism to implement the role. one important consideration of the committee was to serve as a link between the members of ala and the library of congress in order to avoid the repetition of the problem which arose regarding isbd and marc records. henriette avram had prepared a position paper on this committee's relationship to any impending content changes to marc records. these would not be changes required as a result of changes in cataloging rules, over which lc has no say, but rather changes in the needs of the library community. the paper was not intended to set forth a permanent operation, but to propose guidelines. it outlined what would happen at the point where some possibility for change was discovered, and how lc and the committee would communicate. it did not, of course, detail the committee's communication with division members and the relationship of the committee to the marc users discussion group. the committee concluded that it should communicate at least with the marc subscribers, and that james rizzolo, chairman of the marc users discussion group, would assume the responsibility of circulating information on impending changes to the marc subscribers and marc users and get the information back to the committee which would determine if there was a consensus and in its best judgment give a reply back to lc. a second activity of the committee would be to reach interested people through some means such as jola tc or lrts. the voluminous amount of papers should not be distributed generally as they become obsolete quickly, but a center is needed for storing these papers and the fact that they exist should be circulated to the library field in general.
copies could be made available for a price to those interested. mrs. veneziano hoped this could be worked out with someone at ala headquarters. another area, that of nonbibliographic data (e.g., uniform library codes, dealer codes, etc.), is of interest to the committee. mrs. veneziano's personal opinion was that though the function statement of the committee indicates responsibility for bibliographic information only, the committee should also be involved with anything which impacts the use of that bibliographic data. she expressed hesitancy to have many other committees working in this area as too much time is devoted to getting feedback from other committees before a decision can be made. the committee would like to propose adoption of a mechanism whereby it can set up a subcommittee, task force, or working group with a limited life-span which would study and react to very specific technical proposals and working papers, etc. which are developing informally at a national and international level. these subgroups must be very responsive so that the committee will not be placed in a position where it cannot take action readily. also there must be a flexible mechanism for establishing subcommittees. rtsd has strong feelings on setting up such a group without approval by the division. she did not think, however, there was any objection to creating task forces and felt that it was the only way to obtain expert comment on some of these materials. the feeling of the committee was that henriette avram's position paper should be accepted with minor provisos: the implementation (section 1-b) and the time frame allowed for reporting back to lc. henriette avram is to go back to lc and check these modifications. if they meet lc's approval, the committee will accept the position paper. the character set subcommittee, consisting of charles payne, david weisbrod, and michael malinconico, will study the latest draft working papers and comment to mrs.
avram who is on the task force. mr. fasana said the committee had power to set up subcommittees because of the board's previous approval on this; in addition, the function statement gives the committee the right to set up a task force. president shoffner pointed out that the results of deliberations require a joint submission to all three boards. mr. hammer volunteered help in any coordination which might be needed. the comment regarding the distribution of the opinion survey that mrs. avram's report calls for was that it should not be general but that it be noted (perhaps in jola tc) that the survey is available. mr. shoffner suggested that the board accept john kountz's statement regarding the establishment of a committee on nonbibliographical data and reconsider the matter again at midwinter. john linford suggested that it would be best to expand the charge to the committee to include the nonbibliographical area. paul fasana stated that the committee's function statement now is so worded that it can include noncataloging data. the sense of the board was agreement that the authority already existed. telecommunications committee report. the new chairman, david waite, reported that the committee first discussed the committee's focus, as it was the desire of the board, as he understood it, to make some changes in this committee. in the past the activities of the committee were basically in cable tv. the present members were not too interested in making that their prime target, but instead the electronic communications of bibliographic data. they are not going to just look at hot issues but are currently proceeding in the area of telecommunications information, at the same time keeping their eyes open for important developments in the technological field under the broad base of telecommunications. [journal of library automation vol. 6/3 september 1973] the two main focal points of the committee would be education and standards.
in education the committee would try to communicate with decision makers as related to aspects of communications to be investigated and then overflow to the general library community. mrs. martin asked if there might not be a problem with the committee's taking on this role since seminars, institutes, etc., were the function of the conference planning committee. the planning for such seminars, etc., mr. butler said, on a six or nine month basis did not work adequately. there was no objection to the telecommunications committee functioning in this area. mr. kilgour remarked that at&t and other phone companies, as well as fcc, had a great deal going on with impact on telecommunications and with networks presently and he felt that ala should present a position to fcc. the committee should therefore inform itself extensively as to what is going on so that if it appears some action by ala was needed, we would be prepared. committee on objectives report. chairman stephen salmon summarized the discussion by the objectives committee of the three issues raised at the first session of the isad board meeting regarding the information technology discussion group: (1) how such a group should fit into the organizational structure of isad. the committee sensed a media committee was not an answer but felt a discussion group was appropriate. the media group should be continued even if transformed into a committee. (2) the restatement of the objectives and activities of the division to include the media group. consideration was given to paul fasana's words that "related technology" in the isad bylaws' objectives statement included educational technology already. but the committee agreed with the board that a change in the language would help clarify and mr.
kilgour had some recommendations in rewording the objectives statement to solve the problem. (3) terminology for the name of the division and the journal. they finally identified three possible name changes for the division: (a) information science and library automation, (b) information science and educational technology, and (c) information science and technology. the final decision was that the present name of the division, "information science and automation," was the best. the committee also thought the draft report should specifically include another objective, i.e., to offer expertise in this area to others in ala and other professional organizations like arl. mr. salmon listed the additions and changes made and included in the final draft of the committee's report. motion. paul fasana moved that the isad board accept and adopt the report of the objectives committee. seconded by susan k. martin. carried. mrs. martin remarked that the information technology discussion group was already within isad. mr. shoffner explained that the board had accepted the group only for one year and during that year isad intended to determine whether this activity was within isad's scope. mrs. martin asked, if isad considers educational technology and audiovisual concerns to be within its scope, what the relationship would be with the other audiovisual committees in ala and also what coo's role would be? the board was not asserting what is out of scope with any other parts of ala, mr. shoffner answered, only what was within isad's scope. president shoffner thanked chairman salmon for his report and the committee for its work in carrying out the original charge as given and meeting the time schedule. he then declared the committee disbanded. there was some discussion on coordinating with other av committees in ala and what aspect of av isad would be concerned with. mr.
kilgour suggested that the information technology discussion group should pursue its own goals and not concern itself with the coordination of all ala av groups. such coordination he felt was impossible. whether a number of committees or subcommittees in the discussion group could be formed was also discussed. mr. shoffner stated that these should be "units" of the discussion group, not "committees." he further said he was reluctant to establish committees and would do so only after a group of people committed to doing a job showed, over some continuing period of time, productive activity on a number of different tasks that relate to each other. report of editorial board. mrs. martin said that at the monday isad board meeting she had talked of retaining jola tc as a separate publication, but the final feeling of the editorial board was negative. the thought was to create a separate section within jola but with a different format, as the green sheets are inserted in the library association record ("liaison"). don bosseau, editor of jola tc, remarked that the editorial board had provided insight into another need which was for truly technical communications, e.g., a short summary which would show up later in a longer, detailed article. the editorial board felt that tc should be made into something that has more impact than news releases. isad/led education committee report. a written report was submitted by the committee. (see exhibit a.) cola report. the membership of the discussion group had increased to 145. chairman don bosseau, who has held that position since the incorporation of the group within isad, said ballots would be sent out shortly for the election of a new chairman. mr. bosseau also asked about control of membership in the group and stated that ala's guidelines indicate one person per institution as a maximum membership.
the board corrected this idea by saying that this limitation was not ala's but the old cola limitation. there is no limit on membership by ala. mr. butler asked that the planning of cola, marc, and information technology discussion groups' meetings be coordinated. mrs. martin said that david weisbrod had suggested that there be a cola meeting at asis and that would be part of the cooperation between asis and isad in the program area. marc users discussion group. mr. james rizzolo told of his intent to make a survey by breaking up the mailing lists he had into three groups: (1) marc subscribers, (2) those interested in using marc, and (3) an informational group. mr. kilgour thought the group was called "marc subscribers" not "marc users." mr. shoffner said the name had always been marc users. originally there had been the intent to set up a "marc subscribers" group but mr. culbertson had said that it would not fit into either isad or ala's structure. they then settled on marc users discussion group. it was stated that both marc and cola discussion groups should be in the program section of the ala conference program book. it was pointed out that program time can be requested by committee or discussion group chairmen. information technology discussion group. mr. shoffner requested mr. donald hammer to inform the isad information technology discussion group that the board would not establish an av committee, but intended to continue with the information technology discussion group in response to their memo of march 2, 1973 requesting an av committee within isad. report to ala planning committee. mr. shoffner also requested mr. hammer to forward the objectives committee report on the long range plans of isad to the ala planning committee as a means to meet their request. (this report had been deferred from midwinter so that the final report of the objectives committee could first be heard.) rtsd computer filing committee. mr.
fasana said he was asked by the rtsd board why isad had refused their request to appoint an isad member to the rtsd computer filing committee. mr. hammer said he would see that a committee member was appointed. mr. shoffner expressed appreciation to the board and turned over the gavel to president fred kilgour. the meeting was adjourned at 12:00 noon.

exhibit a. june 25, 1973. minutes of the 1973 annual isad/led meeting

the 1973 annual meeting of isad/led convened june 25 at caesar's palace, atrium i, las vegas. present were members jim liesener, ann painter and elaine svenonius; and visitors martha west (california state university, san jose), barbara fleming (university of nevada, reno) and philip heer (university of denver); in attendance were pauline atherton and charles davis. discussion centered on two topics: the disc questions as commented upon by library school faculties and the future course of disc. the general and specific comments on the disc questions given by library school faculties are given on the attached sheets. these sheets include, in addition to responses reported at ala, responses which arrived belatedly throughout the summer. general criticisms are primarily of two types: the questions are either too broad or they are outside the domain of information science. it was felt that had the use to which the questions are to be put (viz., to develop modules, not to examine graduating students) been clearer, the charge "too broad" would not have resulted. as to what is to be included in the domain of information science, this was precisely the point of the exercise of generating questions, and comments limiting or extending the domain of information science should be accorded consideration. at the june 25 meeting, the individual questions were discussed generally in light of comments received.
following the discussion, participants at the meeting expressed informally and with varying degrees of determination interest in developing modules around certain of the questions. the meeting ended with a discussion of the future of disc. a technical session is being planned by the es sig at los angeles in october: program modules for developing curricula in information science; the plan for module development will be advertised and some demonstration modules shown with a view to drawing up module specifications. also contemplated is a program by isad/led in january at midwinter ala. elaine svenonius, august 15, 1973.

an algorithm for variable-length proper-name compression

james l. dolby: r & d consultants company, los altos, california

viable on-line search systems require reasonable capabilities to automatically detect (and hopefully correct) variations between request format and stored format. an important requirement is the solution of the problem of matching proper names, not only because both input specifications and storage specifications are subject to error, but also because various transliteration schemes exist and can provide variant proper name forms in the same data base. this paper reviews several proper name matching schemes and provides an updated version of these schemes which tests out nicely on the proper name equivalence classes of a suburban telephone book. an appendix lists the corpus of names used for algorithm test.

a viable on-line search system cannot reasonably assume that each user will invariably provide the proper input information without error. human beings not only make errors, but also expect their correspondents, be they human or mechanical, to be able to cope with these errors, at least at some reasonable error-rate level. many of the difficulties in implementing computer systems in many areas of human activity stem from failure to recognize, and plan for, routine acceptance of errors in the systems.
indeed, computing did not become the widespread activity it is now until the so-called higher-level languages came into being. although it is customary to think of higher-level languages as being "more english-like," the height of their level is better measured by the brevity with which various jobs can be expressed (for brevity tends to reduce errors) and the degree of sophistication of their automatic error detection and correction procedures. the processing of catalog information for the purposes of exposing and retrieving information presents at least two major areas for research in automatic error detection and correction. at the first stage, the data bank must be created, updated and maintained. methods for dealing with input errors at this level have been derived by a number of groups and it seems reasonable to assert that something in the order of 60% of the input errors can be detected automatically (1, 2, 3). with the possibility of human proofreading and error detection through actual use, it is reasonable to expect a mature data base to have a very low over-all error rate. [journal of library automation vol. 3/4 december, 1970] at the second stage, however, when a user approaches the data base through a terminal or other on-line device, the errors will be of a recurring nature. each user will generate his own error set and, though experience will tend to minimize the error rate for a particular user, there will be an essentially irreducible minimum error rate even for an experienced user. if the system is to attract users other than professional interrogators, it must respond intelligently at this minimal error level. this paper explores certain problems associated with making "noisy matches" in catalog searches.
because preliminary information indicates that the most likely source of input errors is in the keyboarding of proper names, the main emphasis of the paper is on the problem of algorithmically compressing proper names in such a way as to identify similar names (and likely misspellings) without over-identifying the list of possible authors.

existing name-compression algorithms

the problem of providing equivalence classes of proper names is hardly new. library catalogs, telephone directories and other major data bases have made use of "see-also"-type references for many years. some years ago remington-rand derived an alphanumeric name compression algorithm, soundex, that could be applied either by hand or by machine for such purposes (4). perhaps the most widely used on-line retrieval system presently in existence, the airline reservation system (such as sabre), makes use of such an algorithm (5). the closely related problem of compressing english words (either to establish noisy matches, to eliminate misspelled words, or simply to achieve data bank compression) has also received some attention (6, 7, 8). implementation of such algorithms has been described (9, 10, 11, 12, 13). although english word structure differs from proper-name structure in some important respects (e.g., the existence of suffixes), three of the algorithms are constructed by giving varying degrees of attention to the following five areas of word structure:
1) the character in word-initial position;
2) the character set (a, e, i, o, u, y, h, w);
3) doubled characters (e.g., tt);
4) transformation of consonants (i.e., all alphabetic characters other than those in 2 above) into equivalence classes;
5) truncation of the residual character string.
the word-initial character receives varying attention. soundex places the initial consonant in the initial position of the compressed form and then transforms all other consonants into equivalence classes with numeric titles.
sabre maintains the word-initial character even if it is a vowel. in the armour research foundation scheme (arf), the word-initial character is also retained as is. both soundex and sabre eliminate all characters in the set 2) above. the arf scheme retains all characters in shorter words and deletes vowels only, to reduce the compressed form to four characters, deleting the "u" after "q," the second vowel in a vowel string, and then all remaining vowels. all three systems delete the second letter of a double-letter string. sabre goes a step further and deletes the second letter of a double-letter string occurring after the vowels have been deleted. thus, the second "r" of "bearer" would be deleted. soundex maps the eighteen consonants into six equivalence classes:
1) b, f, p, v
2) c, g, j, k, q, s, x, z
3) d, t
4) l
5) m, n
6) r
sabre and arf do not perform any transformations on these eighteen consonants. finally, all three systems truncate the remaining string of characters to four characters. for shorter forms, padding in the form of zeros (soundex), blanks (sabre), or hyphens (arf) is added so that all codes are precisely four characters long. variable-length coding schemes have been considered but generally rejected for implementation on major systems because of the attendant difficulties of programming and the fact that code compression is enhanced by fixed-length codes where no interword space is necessary. although fixed-length schemes of length greater than four have been considered, no definitive data appears to be available as to the enhanced ability of compressed codes to discriminate by introduction of more characters. the sabre system does add a fifth character but makes use of the person's first initial for added discrimination. tukey (14) has constructed a personal author code for his citation indexing and permuted title studies on an extensive corpus of the statistical literature.
in this situation the author code is a semi-mnemonic code in a tag form to assist the user in identification rather than to be used as a basic entry point. however, tukey does note that in his corpus a three-character code of the surname, plus two initials, is superior to a five-character surname code for purposes of unique identification.

measuring algorithmic performance

one of the main problems in constructing linguistic algorithms is to decide on appropriate measures of performance and to obtain data bases for implementing such measures. in this case it is clear that certain improvements in existing algorithms can be made, particularly by using more sophisticated transformation rules for the consonants, and that the problems of implementing such changes are not so great in today's context as they were when the systems noted above were originally derived. improvements in processing speeds and programming languages, however, do not remove the need for keeping "linguistic frills" to a minimum. ideally, it would be desirable to have a list of common errors in keyboarding names as a test basis for any proposed algorithms. unfortunately, no such list of sufficient size appears to be available. lacking this, one can speculate that certain formal properties of the predictability of language might be useful in deriving an algorithm. at the english word level, some effort has been made to exploit measures of entropy as developed by shannon in this direction (6, 7). however, there is good reason to question whether entropy, at least when measured in the usual way, is strongly correlated with actually occurring errors (15). as an alternative, one can study existing lists of personal-name equivalence classes to derive such algorithms and then test the algorithm against such classes, measuring both the degree of over-identification and the degree of under-identification.
clearly, such tests will carry more weight if they are conducted under economic forcing conditions where weaknesses in the test set will lead to real and measurable expense to the organization publishing the list. the sabre system operates under strong economic forcing conditions in the sense that airline passengers frequently have a number of competitive alternatives available to them and lost reservations can cause sufficient inconvenience for them to consider these alternatives. however, the main application of the sabre system is to rather small groups of persons (at least when compared to the number of personal authors in a typical library catalog), so that errors of over-identification are essentially trivial in cost to the airlines. a readily available source of "see-also"-type equivalence classes of proper names is given in the telephone directory system. here, the economic forcing system is not so strong as in the airline situation, but it is measurable in that failure to provide an adequate list will lead to increased user dependence on the information operator, with consequent increased cost to the telephone company. as a test of the feasibility of using such a set of equivalence classes, the 451 classes found in the palo alto-los altos (california) telephone directory were copied out by hand and used in deriving and testing the algorithm given in the next section and the soundex algorithm. there remains the question of deciding what is to constitute proper agreement between any algorithm and the set of equivalence classes chosen as a data base. at the grossest level it seems reasonable to argue that overidentification is less serious than under-identification. false drops only tend to clog the line. lost reference points, on the other hand, lead to lost information. 
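the two measures can be illustrated with a short python sketch. it is an illustration only, not the procedure used in the study: the names `simple_soundex`, `score`, and `sample` are assumptions, the four sample classes are taken from the directory classes discussed in this paper, and the coder follows the simplified description of soundex given earlier (initial letter kept, the set a, e, i, o, u, y, h, w dropped, doubled letters collapsed, six consonant classes, four characters with zero padding) rather than any production soundex.

```python
# Consonant classes as listed in the text for soundex.
SOUNDEX_CLASSES = {
    "1": "bfpv",
    "2": "cgjkqsxz",
    "3": "dt",
    "4": "l",
    "5": "mn",
    "6": "r",
}
DIGIT = {ch: d for d, letters in SOUNDEX_CLASSES.items() for ch in letters}


def simple_soundex(name):
    """Four-character code following the simplified description in the text."""
    letters = [c for c in name.lower() if "a" <= c <= "z"]
    if not letters:
        return "0000"
    # Delete the second letter of any doubled-letter string.
    deduped = [letters[0]]
    for c in letters[1:]:
        if c != deduped[-1]:
            deduped.append(c)
    # Keep the initial letter; map remaining consonants, drop a e i o u y h w.
    code = deduped[0] + "".join(DIGIT[c] for c in deduped[1:] if c in DIGIT)
    return code[:4].ljust(4, "0")


def score(classes, coder):
    """Return (split_classes, distinct_classes) for a list of name classes."""
    split = 0
    intact_codes = set()
    for names in classes:
        codes = {coder(n) for n in names}
        if len(codes) > 1:
            split += 1                  # under-identification: class broken apart
        else:
            intact_codes.add(codes.pop())
    # Intact classes that share a code are merged (over-identification),
    # so only distinct codes are counted as distinct classes.
    return split, len(intact_codes)


sample = [
    ["bird", "burd", "byrd"],
    ["baer", "bahr", "bare"],
    ["bauer", "baur", "bower"],
    ["holm", "home"],
]
print(score(sample, simple_soundex))  # (1, 2)
```

in this small sample the holm/home class is split (under-identification), while the baer and bauer classes receive the same code and collapse into one (over-identification), leaving two distinct classes out of four.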
investigation of other applications of linguistic algorithms, such as algorithms to hyphenate words, identify semantically similar words through cutting off of suffixes, and so forth, indicates that it is usually possible to reduce crucial error (in this case under-identification) to something under 5%, while preserving something in the order of 80% of the original distinctions (or efficiency) of the system. efforts to improve materially on the "five-and-eighty" rule generally lead to solutions involving larger context and/or extensive exception dictionaries. in this study efforts are directed at achieving a "five-and-eighty" solution.

a variable-length name-compression scheme

in light of the fact that no definitive information is available on the problems of truncating errors in name-compression algorithms, it is convenient to break the problem into two pieces. first is derivation of a variable-length algorithm of the required accuracy and efficiency and then determination of the errors induced by truncation. a study of the set of equivalence classes given in the palo alto-los altos telephone directory made fairly clear that with minor modifications of the basic five steps used in the other algorithms noted above, it would not be too difficult to provide a reasonably accurate match without requiring too much over-identification. the main modifications made consisted of maintaining the position of the first vowel and using local context to make transformations on the consonants. the algorithm is given below. (the rules given must be applied in the order given both with respect to the rules themselves and to the order of the lists within the rules, as the precedence relations are important to the performance of the algorithm.)

a spelling equivalent abbreviation algorithm for personal names

1) transform: "mcg" to "mk", "mag" to "mk", "mac" to "mk", "mc" to "mk".
2) working from the right, recursively delete the second letter from the following letter pairs: "dt", "ld", "nd", "nt", "rc", "rd", "rt", "sc", "sk", "st".
3) transform: "x" to "ks", "ce" to "se", "ci" to "si", "cy" to "sy", "consonant-ch" to "consonant-sh"; all other occurrences of "c" to "k", "z" to "s", "wr" to "r", "dg" to "g", "qu" to "k", "t" to "d", "ph" to "f" (after the first letter).
4) delete all consonants other than "l", "n", and "r" which precede the letter "k" (after the first letter).
5) delete one letter from any doubled consonant.
6) transform "pf#" to "p#", "#pf" to "#f", "vowel-gh#" to "vowel-f#", "consonant-gh#" to "consonant-g#", and delete all other occurrences of "gh". ("#" is the word-beginning and word-ending marker.)
7) replace the first vowel in the name by the symbol "•".
8) delete all remaining vowels.
9) delete all occurrences of "w" or "h" after the first letter in the word.

the vowels are taken to be (a, e, i, o, u, y). the remaining literal characters are treated as consonants.

the algorithm splits 22 (4.9%) of the 451 equivalence classes given by the phone directory. on the other hand, the algorithm provides 349 distinct classes (not counting those classes that were broken off in error) or 77.4% of the 451 classes in the telephone directory data base. thus has been achieved a reasonable approximation to the "five-and-eighty" performance found in other linguistic problem areas. to give a proper appreciation of the nature of these under-identification errors, they are discussed below individually.
1) the name bryer is put in the same equivalence class with a variety of spellings of the name bear. the algorithm fails to make this identification.
2) blagburn is not equated to blackburn.
3) the name davison is equated to davidson in its various forms.
the algorithm fails to make this identification and this appears to be one of a modest class of difficulties that occur prior to the -son, -sen names.
4) the class of names dickinson, dickerson, dickison, and dickenson are all equated by the directory but kept separate, except for the two forms of dickinson, by the algorithm.
5) the name holm is not equated with the name home.
6) the name holmes is not equated with the name homes.
7) the algorithm fails to equate jaeger with two forms of yaeger.
8) the algorithm fails to equate lamb with lamn.
9) the algorithm incorrectly assumes that the final "gh" of leigh should be treated as an "f." treating final "gh" either as a null sound or an "f" leads to about the same number of errors in either direction.
10) the algorithm fails on the pairing of leicester and lester. the difficulty is an intervening vowel.
11) the algorithm fails to equate the various forms of lindsay with the forms of lindsley.
12) the algorithm fails to equate the various forms of mclaughlin with mclachlan.
13) the algorithm fails to equate mccullogh with mccullah. this is again the final "gh" problem.
14) the algorithm fails to equate mccue with mchugh (again the final "gh" problem).
15) the algorithm fails to equate moretton with morton. this is an intervening vowel problem.
16) the algorithm fails to equate rauch with roush.
17) the algorithm fails to equate robinson with robison (another -son type problem).
18) the algorithm incorrectly assumes that the interior "ph" of shepherd is an "f."
19) the algorithm fails to equate speer with speier.
20) the algorithm fails to equate stevens with stephens.
21) the algorithm fails to equate stevenson with stephenson.
22) the algorithm fails to equate the various forms of the word thompson (an -son problem).

in several of the errors noted above it may be questioned whether the telephone directory is following its own procedures with complete rigor.
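the nine rules of the variable-length scheme can be sketched in python as follows. this is a minimal sketch under stated assumptions, not the program used for the tests reported here: the function name `compress` is hypothetical, rule 1 is assumed to apply only at the start of the name, rule 4 is applied in a single pass, and the rule 3 substitutions are applied at every position (the parenthetical "after the first letter" in rule 3 is ambiguous, and applying the substitutions everywhere agrees with the appendix code for philips).

```python
import re

MARK = "\u2022"  # the vowel marker, printed as a bullet in the text
PAIRS = {"dt", "ld", "nd", "nt", "rc", "rd", "rt", "sc", "sk", "st"}


def compress(name):
    """Variable-length code for a surname, applying rules 1-9 in order."""
    s = "".join(c for c in name.lower() if "a" <= c <= "z")
    if not s:
        return ""
    # Rule 1 (assumption: only at the start of the name).
    for prefix in ("mcg", "mag", "mac", "mc"):
        if s.startswith(prefix):
            s = "mk" + s[len(prefix):]
            break
    # Rule 2: scan from the right; after a deletion, re-test the same spot.
    i = len(s) - 2
    while i >= 0:
        if s[i:i + 2] in PAIRS:
            s = s[:i + 1] + s[i + 2:]
        else:
            i -= 1
    # Rule 3: substitutions in the order listed (assumption: every position).
    s = s.replace("x", "ks")
    s = s.replace("ce", "se").replace("ci", "si").replace("cy", "sy")
    s = re.sub(r"(?<=[^aeiouy])ch", "sh", s)
    s = s.replace("c", "k")
    for old, new in (("z", "s"), ("wr", "r"), ("dg", "g"),
                     ("qu", "k"), ("t", "d"), ("ph", "f")):
        s = s.replace(old, new)
    # Rule 4: drop consonants other than l, n, r standing before "k",
    # never touching the first letter (assumption: single pass).
    s = s[:1] + re.sub(r"[^aeiouylnr](?=k)", "", s[1:])
    # Rule 5: delete one letter from any doubled consonant.
    s = re.sub(r"([^aeiouy])\1", r"\1", s)
    # Rule 6: "pf" and "gh" at the word boundaries, then any other "gh".
    s = re.sub(r"pf$", "p", s)
    s = re.sub(r"^pf", "f", s)
    s = re.sub(r"(?<=[aeiouy])gh$", "f", s)
    s = re.sub(r"(?<=[^aeiouy])gh$", "g", s)
    s = s.replace("gh", "")
    # Rules 7 and 8: mark the first vowel, delete the remaining vowels.
    m = re.search(r"[aeiouy]", s)
    if m:
        s = s[:m.start()] + MARK + re.sub(r"[aeiouy]", "", s[m.end():])
    # Rule 9: delete "w" and "h" after the first character.
    return s[:1] + re.sub(r"[wh]", "", s[1:])


for n in ("eckhardt", "brandt", "dixon", "dickson", "jaeger", "yaeger"):
    print(n, compress(n))
```

under these assumptions dixon and dickson receive the same code, while jaeger and yaeger do not, which is the behavior described in item 7 of the list above.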
setting aside the cases where the directory's own rigor is in question, the primary errors occur with the final "gh," the words ending in "son," and the words with the extraneous interior vowels. each of these problems can be resolved to any desired degree of accuracy, but only at the expense of noticeable increases in the degree of complexity of the algorithm.

the truncation problem

simple truncation does not introduce errors of under-identification; it can only lead to further over-identification. examination of the results of applying the algorithm to the telephone directory data base shows that no new over-identification is introduced if the compressed codes are all reduced to the leftmost seven characters. further truncation leads to the following results:

code length                              7    6    5    4
cumulative over-identification losses    0    1    6    45

thus there is a strong argument for maintaining at least five characters in the compressed code. however, there is no real need for restriction to simple truncation. following the procedures used in the arf system, further truncation can be obtained by selectively removing some of the remaining characters. the natural candidate for such removal is the vowel marker. if the vowel marker is removed from all the five-character codes, only six more over-identification errors are introduced. removal of the vowel markers from all of the codes would have introduced 17 more errors of over-identification. the utility of the vowel marker is in the short codes. this in turn suggests that introduction of a second vowel marker in the very short codes may have some utility, and this is indeed the case. if the conception of vowel marker is generalized as marking the position of a vowel-string (i.e., a string of consecutive vowels), where for these purposes a vowel is any of the characters (a, e, i, o, u, y, h, w), and these markers are maintained as "padding" in the very short words, 18 errors of over-identification are eliminated at the cost of two new errors of under-identification.
in this way the following modification to the variable-length algorithm is derived:
1) mark the position of each of the first two vowel strings with an "•", if there is more than one vowel.
2) truncate to six characters.
3) if the six-character code has two vowel markers, remove the right-hand vowel marker. otherwise, truncate the sixth character.
4) if the resulting five-character code has a vowel marker, remove it. otherwise remove the fifth character.
5) for all codes having less than four characters in the variable-length form, pad to four characters by adding blanks to the right.

measured against the telephone directory data base, this fixed-length compression code provides 361 distinct classes (not counting improper class splits as separate classes) or 80% of the 451 given classes. twenty-four (5.3%) of the classes are improperly split. by way of comparison, the soundex system improperly splits 135 classes (30%) and provides only 287 distinct classes (not counting improperly split classes), or 63.8% of the telephone directory data base.

acknowledgments

this research was carried out for the institute of library research, university of california, under the sponsorship of the office of education, research grant no. oeg-1-7-071083-5068. the author would like to thank ralph m. shoffner and kelley l. cartwright for suggesting the problem and for a number of useful comments on existing systems. allan j. humphrey was kind enough to program the variable-length version of the algorithm for test purposes.

appendix: corpus of names used for algorithm test

a list of personal-name equivalence classes from the palo alto-los altos telephone directory is arranged according to the variable-length compression code (with the vowel marker "•" treated as an "a" for ordering).
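a variable-length code of this kind can be reduced to the fixed four-character form by steps 2 to 5 of the modification given above. the python sketch below is an illustration under assumptions, not the program used in the study: the name `fixed_length` is hypothetical, the input is assumed to carry its vowel-string markers already (step 1), the two-marker code in the usage lines is a made-up example, and where a five-character code still holds two markers the right-most one is removed, a case the steps do not spell out.

```python
MARK = "\u2022"  # vowel-string marker, printed as a bullet in the text


def fixed_length(code):
    """Reduce a marked variable-length code to exactly four characters."""
    code = code[:6]                              # step 2: truncate to six
    if len(code) == 6:                           # step 3
        if code.count(MARK) >= 2:
            cut = code.rindex(MARK)              # right-hand vowel marker
            code = code[:cut] + code[cut + 1:]
        else:
            code = code[:5]                      # drop the sixth character
    if len(code) == 5:                           # step 4
        if MARK in code:
            cut = code.rindex(MARK)              # assumption: right-most marker
            code = code[:cut] + code[cut + 1:]
        else:
            code = code[:4]                      # drop the fifth character
    return code.ljust(4)                         # step 5: pad with blanks


print(repr(fixed_length("bl\u2022kbrn")))        # 'blkb'
print(repr(fixed_length("\u2022bl")))            # '•bl '
print(repr(fixed_length("\u2022br\u2022ms")))    # 'brms'
```

the short code keeps its marker and is padded with a blank, while the longer codes lose their markers before any consonant beyond the fifth position is considered, which is the trade-off the truncation discussion argues for.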
names whose compressed codes do not match the one given in the first column (and hence represent weaknesses in the algorithm and/ or the directory groupings) are given in italics. a small number of directory entries that do not bear on the immediate problem have been deleted from the list : bell's see also bells; co-op see also co-operative; st. see also saint; etc. 0 bl abel, abele, abell, able 0 brms abrahams, abrams 0 brmsn abrahamson, abramson •d eddy, eddie 0 dmns edmonds, edmunds 0 dmnsn edmondson, edmundson 0 dms adams, addems 0 gn eagen, egan, eggen 0 gr jaeger, yaeger, yeager °kn aiken, aikin, aitken °kns adkins, akins °kr okr ·ks 0 lbrd ·ln 0 ln 0 lsn 0 lvr •ms 0 ngl 0 nl 0 nrs 0 nrsn •ns 0 rksn 0 rl 0 rn •rns •rs 0 rvn 0 rvng 0 sbrn b•n b•ns b°kmn b0 l b0 l b0 l b0 l b.l b 0 ln b·m b 0 mn b•n b0 nd b·r b0 r b•r b•r b 0 rbr b•rc b 0 rgr b 0 rk b 0 rn algorithm for name compressionjdolby 265 acker, aker eckard, eckardt, eckart, eckert, eckhardt oakes, oaks, ochs albright, allbright elliot, elliott allan, allen, allyn ohlsen, olesen, olsen, olson, olsson oliveira, olivera, olivero ames, eames engel, engle, ingle o'neal, o'neil, o'neill andrews, andrus andersen, anderson, andreasen ennis, enos enrichsen, erickson, ericson, ericsson, eriksen earley, early erwin, irwin aarons, ahrends, ahrens, arens, arentz, arons ayers, ayres ervin, ervine, irvin, irvine erving, irving osborn, osborne, osbourne, osburn beatie, beattie, beatty, beaty, beedie betts, betz bachman, bachmann, backman bailey, baillie, bailly, baily, bayley beal, beale, beall, biehl belew, ballou, bellew buhl, buell belle, bell bolton, boulton baum, bohm, bohme bauman, bowman bain, bane, bayne bennet, bennett baer, bahr, baier, bair, bare, bear, beare, behr, beier, bier, bryer barry, beare, beery, berry bauer, baur, bower bird, burd, byrd barbour, barber berg, bergh, burge berger, burger boerke, birk, bourke, burk, burke burn, byrne 266 journal of library automation vol. 
3/4 december, 1970 b 0 rnr b 0 rns b 0 rnsn b0 rs bl°kbrn bl 0 m br 0 d br 0 n br 0 n d 0 ds d°f d 0 gn d°k n•knsn n•ksn n•l n•l n•l d 0 mn n•n n•n n•n n•n n•n d0 nl d.r n•r d 0 rm d 0 vdsn n•vs dr0 sl f• f°fr f 0 gn f 0 l f 0 l f 0 lknr f 0 lps f 0 ngn f 0 nl f0 rl f 0 rr f 0 rr f 0 rs bernard, bernhard, bernhardt, bernhart berns, bims, burns, byrns, byrnes bernstein, bornstein bertsch, birch, burch blackburn, blagburn blom, bloom, bluhm, blum, blume brode, brodie, brody braun, brown, browne brand, brandt, brant diezt, ditz duffie, duffy dougan, dugan, duggan dickey, dicke dickenson, dickerson, dickinson, dickison dickson, dixon, dixson dailey, daily, daley, daly dahl, dahle, dall, doll deahl, deal, diehl diamond, dimond, dymond dean, deane, deen denney, denny donahoo, donahue, donoho, donohoe, donohoo, donohue, dunnahoo downey, downie dunn, dunne donley, donnelley, donnelly daugherty, doherty, dougherty dyar, dyer derham, durham davidsen, davidson, davison davies, davis driscoll, driskell fay, fahay, fahey fifer, pfeffer, pfeiffer fagan, feigan, fegan feil; pfeil feld, feldt, felt faulkner, falconer philips, phillips finnegan, finnigan finlay, finley farrell, ferrell ferrara, ferreira, ferriera foerster, forester, forrester, forster forrest, forest f 0 rs f 0 rs f 0 sr fl 0 n fl 0 ngn fr0 fr0 dmn fr0 drksn fr°k fr0 ns fr0 ns fr 0 s fr0 sr g0 d g0 ds g°f g0 l g0 lmr g0 lr g0 ms g0 nr g 0 nsls g0 nslvs g0 rd c•rn g 0 rn g 0 rnr c•rr g 0 s gr 0 gr.fd gr0 n gr•s h•n h°f h°fmn h0 g h 0 gn h°k h°ksn h 0 l h•l h•l h0 l h 0 ld algorithm for name compressionjdolby 267 faris, farriss, ferris, ferriss first, fuerst, furst fischer, fisher flinn, flynn flanagan, flanigan, flannigan frei, frey, fry, frye freedman, friedman frederickson, frederiksen, fredickson, fredriksson franck, frank france, frantz, franz frances, francis freeze, freese, fries fraser, frasier, frazer, frazier good, goode getz, goetz, goetze goff, gough gold, goold, gould gilmer, gilmore, gilmour gallagher, 
gallaher, galleher gomes, gomez guenther, gunther gonzales, gonzalez consalves, gonzalves garratt, garrett garrity, geraghty, geraty, gerrity gorden, gordohn, gordon gardiner, gardner, gartner garrard, gerard, gerrard, girard gauss, goss gray, grey griffeth, griffith green, greene gros, grose, gross hyde, heidt hoff, hough, huff hoffman, hoffmann, hofman, hofmann, huffman hoag, hoge, hogue hagan, hagen hauch, hauck, hauk, hauke hutcheson, hutchison holley, holly holl, hall halley, haley haile, hale holiday, halliday, holladay, holliday i 268 journal of libra1·y automation vol. 3/4 december, 1970 h 0 lg h 0 lm h 0 lms h 0 ln h0 m h 0 mr h 0 n h 0 n h0 nn h 0 nrks h 0 nrksn h0 ns h0 ns i-jonsn h 0 r h 0 r h 0 r h 0 r h 0 rmn h 0 rmn h 0 rmn h0 rn h 0 rn h 0 rn h 0 rngdn h 0 s h 0 s h 0 s h 0 sn h 0 vr r tfr rfrs tkb rkbsn rks rl rms rmsn rnsn rs ko k°f k°fmn helwig, hellwig holm, home holmes, homes highland, hyland ham, hamm hammar, hammer hanna, hannah hahn, hahne, harm, haun hanan, hannan, hannon hendricks, hendrix, henriques hendrickson, henriksen, henrikson heintz, heinz, heinze, hindes, hinds, hines, hinze haines, haynes henson, hansen, hanson, hanssen, hansson, hanszen herd, heard, hird, hurd hart, hardt, harte, heart hare, hair hardey, hardie, hardy hartman, hardmen, hardman, hartmann herman, hermann, herrmann harman, harmon heron, herrin, herron hardin, harden hom, horne herrington, harrington haas, haase, hasse howes, house, howse hays, hayes houston, huston hoover, hover jew, jue jeffery, jeffrey jefferies, jefferis, jefferys, jeffreys jacobi, jacoby jacobsen, jacobson, jackobsen jacques, jacks, jaques jewell, juhl jaimes, james jameson, jamieson, jamison jahnsen, jansen, jansohn, janssen, jansson, janzen, jensen, jenson joice, joyce kay, kaye coffee, coffey coffman, kauffman, kaufman, kaufmann k°k k0 l k0 l k0 lmn k0 lr k0 mbrln k 0 mbs k0 mp k0 mps k0 n k0 n k0 n k0 n k0 n k0 n k0 n k 0 nl k 0 nr k0 ns k0 p k0 pl k0 r k0 r k0 r k0 r k0 r k 0 rd k0 rln k 0 
rn k0 rsnr k0 s k0 s k0 s k0 sl k0 slr k 0 sr kl 0 n kl.,rk kl 0 sn kr 0 kr 0 gr kr.,mr kr 0 n kr 0 s kr 0 s algor·ithm. for name compressionfdolby 269 cook, cooke, koch, koche cole, kohl, koll kelley, kelly coleman, cohnan koehler, koeller, kohler, koller chamberlain, chamberlin combs, coombes, coombs camp, kampe, kampf campos, campus cahn, conn, kahn cahen, cain, caine, cane, kain, kane chin, chinn chaney, cheney coen, cohan, cohen, cohn, cone, koehn, kahn coon, kuhn, kuhne kenney, kenny, kinney conley, conly, connelly, connolly conner, connor coons, koontz, kuhns, kuns, kuntz, kunz coop, co-op, coope, coupe, koop chapel, chapell, chappel, chappell, chappelle, chapple carrie, carey, cary corey, cory carr, kar, karr kurtz, kurz kehr, ker, kerr cartwright, cortright carleton, carlton carney, cerney, kearney kirschner, kirchner chace, chase cass, kass kees, keyes, keys cassel, cassell, castle kesler, kessler, kestler kaiser, kayser, keizer, keyser, kieser, kiser, kizer cline, klein, kleine, kline clark, clarke claussen, clausen, clawson, closson crow, crowe krieger, kroeger, krueger, kruger creamer, cramer, kraemer, kl·amer, kremer craine, crane christie, christy, kristee crouss, kraus, krausch, krause, krouse 270 journal of library automation vol. 
3/4 december, 1970 kr 0 s kr 0 s kr 0 snsn lo lo l 0 d l 0 dl l 0 drmn l°k l°ks l 0 ln l 0 lr l 0 mb l 0 mn l 0 mn l0 n l0 n l0 n l0 n l 0 ng l 0 nn l 0 ns l 0 r l 0 rns l 0 rns l 0 rsn l 0 s l 0 s l 0 sr l0 v l 0 vd l 0 vl l 0 vn m 0 d m 0 dn m0 ds m 0 dsn m°kl m°km m°ks m°ks m 0 ln m 0 ln m 0 lr m 0 lr cross, krost crews, cruz, kruse christensen, christiansen, christianson loe, loewe, low, lowe lea, lee, leigh lloyd, loyd litle, littell, little, lytle ledterman, letterman leach, leech, leitch lucas, lukas laughlin, loughlin lawler, lawlor lamb, lamm lemen, lemmon, lemon layman, lehman, lehmann lind, lynd, lynde lion, lyon lin, linn, lynn, lynne lain, laine, laing, lane, layne lang, lange london, lundin lindsay, lindsey, lindsley, linsley lawry, lowery, lowrey, lowry lawrence, lowrance laurence, lawrance, lawrence, lorence, lorenz larsen, larson lewis, louis, luis, luiz lacey, lacy leicester, lester levey, levi, levy leavett, leavitt, levit lavell, lavelle, leavelle, loveall, lovell lavin, levin, levine mead, meade m oretton, morton mathews, matthews madison, madsen, matson, matteson, mattison, mattson michael, michel meacham, mechem marques, marquez, marquis, marquiss marcks, marks, marx maloney, moloney, molony mullan, mullen, mullin mallery, mallory moeller, moller, mueller, muller m0 lr m 0 ls m 0 n m0 nr m0 nr m0 nsn m 0 r m 0 r m0 r m0 r m0 r m0 rf m0 rl m 0 rn m 0 rs m0 rs mk0 mk0 mk0 mk 0 mk 0 l mk 0 lf mk 0 lm mk 0 n mk 0 nr mk 0 ns mk0 ns mk0 r mk0 r mkd 0 nl mkf 0 rln mkf 0 rsn mkl 0 d mkl 0 kln mkl 0 ln mkl 0 n mkl•n mkl 0 s mkm 0 ln mkn°l mkr•o n°kl n°kls n°kls algorithm for name compressionjdolby 271 millar, miller miles, myles mahan, mann miner, minor monroe, munro monson, munson murray, murrey maher, maier, mayer mohr, moor, moore meyers, myers meier, meyer, mieir, myhre murphey, murphy merrell, merrill marten, martin, martine, martyn meyers, myers maurice, morris, morse mccoy, mccaughey magee, mcgee, mcgehee, mcghie mackey, mackay, mackie, mckay 
mccue, mchugh magill, mcgill mccollough, mccullah, mccullough mccallum, mccollum, mccolm mckenney, mckinney macintyre, mcentire, mcintire, mcintyre mackenzie, mckenzie maginnis, mcginnis, mcguinness, mcinnes, mcinnis maguire, mcguire mccarthy, mccarty macdonald, mcdonald, mcdonnell macfarland, macfarlane, mcfarland, mcfarlane macpherson, mcpherson macleod, mccloud, mcleod maclachlan, maclachlin, mclachlan, mclaughlin, mcloughlin mcclellan, mcclelland, mclellan mcclain, mcclaine, mclain, mclane maclean, mcclean, mclean mccloskey, mcclosky, mccluskey macmillan, mcmillan, mcmillin macneal, mcneal, mcneil, mcneill magrath, mcgrath nichol, nicholl, nickel, nickle, nicol, nicoll nicholls, nichols, nickels, nickles, nicols nicholas, nicolas 272 journal of library automation vol. 3/4 d ecember, 1970 n°klsn n°ksn n°l n°lsn n°mn n°rs n°sbd p•n p 0 drsn p•c p 0 lk p0 lsn p•n p•r p•r p0 rk p 0 rks p•rs r•rs p•rs p 0 rsn pr°kr pr 0 ns pr 0 r r• r• r 0 bnsn r•n r•n r 0 d r 0 dr r•ns r 0 gn r•gr r°k r°k r°kr n•l r0 mngtn r0 mr n•ms n•n r0 nr r•s nicholsen, nicholson, nicolaisen, nicolson nickson, nixon neal, neale, neall, neel, neil, neill neilsen, neilson, nelsen, nelson, nielsen, nielson, nilson, nilssen, nilsson neumann, newman norris, nourse nesbit, nesbitt, nisbet pettee, petty peterson, pederson, pedersen, petersen, petterson page, paige polak, pollack, pollak, pollock polson, paulsen, paulson, poulsen, poulsson paine, payn, payne parry, perry parr, paar park, parke parks, parkes pierce, pearce, peirce, piers parish, parrish paris, parris pierson, pearson, pehrson, peirson prichard, pritchard prince, prinz prior, pryor roe, rowe rae, ray, raye, rea, rey, wray robinson, robison rothe, roth rudd, rood, rude reed, read, reade, reid rider, ryder rhoades, rhoads, rhodes regan, ragon, reagan rodgers, rogers richey, ritchey, ritchie reich, reiche reichardt, richert, rickard reilley, reilly, reilli, riley remington, rimington reamer, reimer, riemer, rimmer ramsay, ramsey rhein, 
rhine, ryan reinhard, reinhardt, reinhart, rhinehart, rinehart reas, reece, rees, reese, reis, reiss, ries r0 s r0 s r0 s r•vs s•br s°fl s•fn s°fns s°fnsn s°fr s°fr s•cl s 0 glr s•k s•ks s•l s•l s•lr s•ls s•lv s•lvr s 0 mkr s 0 mn s 0 mn s•mrs s·ms s•n s 0 n s 0 nr s0 nrs s 0 pr s·r s·r s·r s 0 r s0 r s•rl s 0 rlng s•rmn s0 rn s•rr sos sm 0 d algorithm for name compressionjdolby 273 rauch, rausch, roach, roche, roush rush, rusch russ, rus reaves, reeves seibert, siebert schofield, scofield stefan, steffan, steffen, stephan, stephen steffens, stephens, stevens steffensen, steffenson, stephenson, stevenson schaefer, schaeffer, schafer, schaffer, schafer, shaffer, sheaffer stauffer, stouffer siegal, sigal sigler, ziegler schuck, shuck sachs, sacks, saks, sax, saxe seeley, seely, seley schell, shell schuler, schuller schultz, schultze, schulz, schulze, shults, shultz silva, sylva silveira, silvera, silveria schomaker, schumacher, schumaker, shoemaker, shumaker simon, symon seaman, seemann, semon somers, sommars, sommers, summers simms, sims stein, stine sweeney, sweeny, sweney senter, center sanders, saunders shepard, shephard, shepheard, shepherd, sheppard stahr, star, starr stewart, stuart storey, story saier, sayre schwartz, schwarz, schwarze, swartz schirle, shirley sterling, stirling scheuermann, schurman, sherman stearn, stem scherer, shearer, sharer, sherer, sheerer sousa, souza smith, smyth, smythe 274 journal of library automation vol. 
3/4 december, 1970 sm 0 d sn°dr sn°l sp 0 lng sp 0 r sp 0 r sr 0 dr sr0 dr t0 d t 0 msn t0 rl tr 0 s v·l v·l v·r w•o w 0 dkr w·nl w·nmn w 0 dr w 0 drs w 0 gnr w 0 l w 0 l w 0 l w 0 lbr w 0 lf w 0 lkns w 0 lks w 0 ln w 0 lr w 0 lrs w 0 ls w 0 ls w 0 ls w 0 lsn w 0 n w 0 r w 0 r w 0 rl w 0 rnr w 0 s w·smn schmid, schmidt, schmit, schmitt, smit schneider, schnieder, snaider, snider, snyder schnell, snell spalding, spaulding spear, speer, speirer spears, speers schroder, schroeder, schroeter schrader, shrader tait, tate thomason, thompson, thomsen, thomson, tomson terrel, terrell, terrill tracey, tracy vail, vaile, vale valley, valle vieira, vierra white, wight whitacre, whitaker, whiteaker, whittaker whiteley, whitley whitman, wittman woodard, woodward waters, watters wagener, waggener, wagoner, wagner, wegner, waggoner willey, willi wiley, wylie wahl, wall wilber, wilbur wolf, wolfe, wolff, woolf, woulfe, wulf, wulff wilkens, wilkins wilkes, wilks whalen, whelan walter, walther, wolter walters, walthers, wolters wallace, wallis welch, welsh welles, wells willson, wilson winn, wynn, wynne worth, wirth ware, wear, weir, wier wehrle, wehrlie, werle, worley warner, werner weis, weiss, wiese, wise, wyss weismann, weissman, weseman, wiseman, wismonn, wissman algorithm for name compressionjdolby 275 references 1. cox, n.s.m.; dolby, j. l.: "structured linguistic data and the automatic detection of errors." in advances in computer typesetting (london: institute of printing, 1966), pp. 122-125. 2. cox, n.s.m.; dews, j. d.; dolby, j. l.,: the computer and the library (hamden, conn.: archon press, 1967). 3. dolby, j. l.; forsyth, v. j.; resnikoff, h. l.: computerized library catalogs: their growth, cost and utility (cambridge, massachusetts: the m.i.t. press, 1969) . 4. becker, joseph; hayes, robert m. : information storage and retrieval (new york: wiley, 1963 ), p. 143. 5. 
davidson, leon: "retrieval of misspelled names in airlines passenger record system," communications of the acm, 5 (1962), 169-171.
6. blair, c. r.: "a program for correcting spelling errors," information & control, 3 (1960), 60-67.
7. schwartz, e. s.: an adaptive information transmission system employing minimum redundancy word codes (armour research foundation report, april 1962). (ad 274-135).
8. bourne, c. p.; ford, d.: "a study of methods for systematically abbreviating english words and names," journal of the acm, 8 (1961), 538-552.
9. kessler, m. m.: "the 'on-line' technical information system at m.i.t.," in 1967 ieee international convention record (new york: institute of electrical and electronics engineers, 1967), pp. 40-43.
10. kilgour, f. g.: "retrieval of single entries from a computerized library catalog file," american society for information science, proceedings, 5 (1968), 133-136.
11. nugent, w. r.: "compression word coding techniques for information retrieval," journal of library automation, 1 (december 1968), 250-260.
12. rothrock, h. i.: computer-assisted directory search; a dissertation in electrical engineering (philadelphia: university of pennsylvania, 1968).
13. ruecking, f. h.: "bibliographic retrieval from bibliographic input; the hypothesis and construction of a test," journal of library automation, 1 (december 1968), 227-238.
14. tukey, j. w.: a tagging system for journal articles and other citable items: a status report (princeton, n.j.: statistical techniques research group, princeton university, 1963).
15. resnikoff, h. l.; dolby, j. l.: a proposal to construct a linguistic and statistical programming system (los altos, cal.: r & d consultants company, 1967).

editorial board thoughts: “india does not exist.”
mark cyzyk
information technology and libraries | june 2013 4

often, i find myself trolling online forums, searching for and praying i find a bona-fide solution to a technical problem.
typically, my process begins with the annoying discovery that many others are running into the same, or very similar, difficulty. many others. once i get over my initial frustration ("why isn't this problem fixed by now?"), i proceed to read, to attempt to determine which of the often conflicting and even contradictory suggestions for fixing the problem might actually work.

i thought it would be instructive to step back for a moment and examine this experience. to do so, i want to use as my example, as my straw man, not a technical question, but a more generic question, the sort of question anyone might conceivably ask. i'll ask this question, then i'll list what i think might be answers, in form and substance, from the technical forums had it been asked there:

"i want to go to india. how best to get there?"

why would you want to go there?
you could fly.
you could take a ship.
why go to india? iceland is much better.
i went to india once and it wasn't that great.
you never specify where in india you want to go. we can't help you until you tell us where in india you want to go.
i am sick and tired of these people who don't read the forums. your query has been answered before.
the only way to get there is to fly first class on continental.
you could ride a mule to india.
new zealand is much better. you should go there instead.
it is impossible to go to india.
you can get from india to anywhere in europe very easily via india air.
you should read a passage to india, i forget who wrote it. i read it as an undergraduate. it was very good.
you are an idiot for wanting to go to india.
india does not exist.

mark cyzyk (mcyzyk@jhu.edu), a member of lita and the ital editorial board, is the scholarly communication architect in the sheridan libraries, the johns hopkins university, baltimore, maryland.

i think it's safe to say that the signal to noise ratio here is low.
if we truly want to answer a question, we don't want to add noise. pontificating, posturing, and automatically posing as a mentor in a mentor/protégé relationship will typically be construed as adding nothing but noise to the signal. in most cases, we who answer such questions are not here to educate, except insofar as we provide a clear and concise answer to a technical query issued by one of our peers. what should we assume? first off, we should assume that the person writing the question is sincere: he truly does want to go to india. we need not question his motives. the best way to think about this is that the query is a hypothetical: if he were to want to go to india, how best to do it? if you were to want to go to india, how best to do it? this requires a certain level of empathy on the part of the one answering the question, a level of empathy of which the technical forums are all but devoid. many answers on those forums are so tone-deaf to human need they may as well have been written by robots. "how best to get there" is tricky because you must make some assumptions. assumptions are fine as long as you're explicit about them. one assumption might be: he is leaving from the east coast of the united states. another assumption might be: he is going to india only for a short while, for a conference or vacation. yet another one might be: by "best" he means "quickest, most efficient, least expensive." stating these assumptions, then stating your answer to the question, is appropriate and is what is most helpful. stating your assumptions is tantamount to stating your understanding of the original question, its scope and context. this is always a helpful thing to do when attempting to communicate with another human being. now, communication and plumbing the depth of human need, at least with respect to informational and bibliographic needs, has always been a strong suit of librarians, so what i write here is not really directed at librarians. 
it is, though, directed at those of us who straddle both the library world and the technology world, if that distinction is not a false one and can be usefully made. i think it important for those of us split between two cultures to ensure that we fall to one side and not the other, in particular that we do not fall into the oftentimes loutish and ultimately unproductive communication mores exhibited by many of the online technical forums.

whenever my wife and i hear a news story on tv or radio openly wondering why more women do not go into i.t., i blurt out something like: "you wanna know why? just go read the comments section of most posts at slashdot.com. why on earth would anyone who didn't have to put up with that kind of culture actually choose to put up with it?" isn't "india does not exist" exactly the kind of response one would find on slashdot.com if the initial question were, "i want to go to india - how best to get there?"

with all this in mind, i hereby issue my own question, this time a technical one:

"i want to programmatically convert a largish set of documents from pdf to docx format. how best to do it?"

i hope you don't think i'm an idiot.

reproduced with permission of the copyright owner. further reproduction prohibited without permission.

using gis to measure in-library book-use behavior
xia, jingfeng
information technology and libraries; dec 2004; 23, 4; proquest pg. 184

that development rests with the local site. for the nwda consortium, this development, using the code base, has been manageable. the current state of interface development for the nwda project can be reviewed at http://nwda.wsulibs.wsu.edu/project_info/.

conclusion

in selecting an ead search-and-retrieval system, one important question for the consortium was, which software solution had the best prospects for migration in the future?
because of the inherent strengths of native xml technology in comparison to the other product categories listed in table 1, a native xml database appeared to be the best approach, and textml provided the best combination of licensing costs, software capabilities, and support. it is important to note that the distinctions between native xml databases and databases that support xml through extensions (xml-enabled databases) may become more difficult to discern over time, in part due to the existing expertise and investments in rdbms technologies.16 nevertheless, capabilities central to native xml, such as the use of an xml-based query language, are integral to the success of such hybrid systems.

references and notes

1. daniel pitti, "encoded archival description: the development of an encoding standard for archival finding aids," the american archivist 60, no. 3 (summer 1997): 269.
2. daniel pitti, "encoded archival description: an introduction and overview," d-lib magazine 5, no. 11 (nov. 1999). accessed nov. 2, 2004, www.dlib.org/dlib/november99/11pitti.html.
3. daniel v. pitti and wendy m. duff (eds.), "introduction," in encoded archival description on the internet (binghamton, n.y.: haworth, 2001), 3.
4. james m. roth, "serving up ead: an exploratory study on the deployment and utilization of encoded archival description finding aids," the american archivist 64, no. 2 (fall/winter 2001): 226.
5. sarah l. shreeves et al., "harvesting cultural heritage metadata using the oai protocol," library hi tech 21, no. 2 (2003): 161.
6. nancy fleck and michael seadle, "ead harvesting for the national gallery of the spoken word" (paper presented at the coalition for networked information fall 2002 task force meeting, san antonio, tex., dec. 2002). accessed nov. 2, 2004, www.cni.org/tfms/2002b.fall/handouts/h-ead-fleckseadle.doc.
7. anne j.
gilliland-swetland, "popularizing the finding aid: exploiting ead to enhance online discovery and retrieval," in encoded archival description on the internet (binghamton, n.y.: haworth, 2001), 207.
8. ibid., 210-14.
9. charlotte b. brown and brian e. c. schottlaender, "the online archive of california: a consortial approach to encoded archival description," in encoded archival description on the internet (binghamton, n.y.: haworth, 2001), 99.
10. ibid., 103-5. oac available at: www.oac.cdlib.org/. accessed nov. 2, 2004.
11. christopher j. prom and thomas habing, "using the open archives initiative protocols with ead," in proceedings of the second acm/ieee-cs joint conference on digital libraries (portland, ore., july 2002). accessed nov. 2, 2004, http://dli.grainger.uiuc.edu/publications/jcdl2002/p14prom.pdf.
12. marc cyrenne, "going native: when should you use a native xml database?" aiim e-doc magazine 16, no. 6 (nov./dec. 2002), 16. accessed nov. 2, 2004, www.edocmagazine.com/article_new.asp?id=25421.
13. product category decisions based upon definitions and classifications available from: ronald bourret, "xml database products." accessed nov. 2, 2004, www.rpbourret.com/xml/xmldatabaseprods.htm.
14. cyrenne, "going native," 18.
15. bill stockting, "ead in a2a," microsoft powerpoint presentation. accessed nov. 2, 2004, www.agad.archiwa.gov.pl/ead/stocking.ppt.
16. uwe hohenstein, "supporting xml in oracle9i," in akmal b. chaudhri, awais rashid, and roberto zicari (eds.), xml data management: native xml and xml-enabled database systems (boston: addison-wesley, 2003), 123-4.
using gis to measure in-library book-use behavior
jingfeng xia

this article is an attempt to develop geographic information systems (gis) technology into an analytical tool for examining the relationships between the height of the bookshelves and the behavior of library readers in utilizing books within a library. the tool would contain a database to store book-use information and some gis maps to represent bookshelves. upon analyzing the data stored in the database, different frequencies of book use across bookshelf layers are displayed on the maps. the tool would provide a wonderful means of visualization through which analysts can quickly realize the spatial distribution of books used in a library. this article reveals that readers tend to pull books out of the bookshelf layers that are easily reachable by human eyes and hands, and thus opens some issues for librarians to reconsider the management of library collections.

several years ago, when working as a library assistant reshelving books in a university library, the author noted that the majority of books used inside the library were from the mid-range layers of bookshelves. that is, by proportion, few books pulled out by library readers were from the top or bottom layers. books on the layers that were easily reachable by readers were frequently utilized. such a book-use distribution pattern made the job of reshelving books easy, but created some inquiries: how could book locations influence the choices of readers in selecting books? if this was not an isolated observation, it must have exposed an interesting phenomenon that librarians needed to pay attention to. then, by finding out the reasons, librarians might become capable of guiding, to some extent, users' selectiveness on library books by deliberately arranging collections at designated heights on bookshelves.
a research study was designed to develop geographical information systems (gis) into an analytical tool to examine former casual observations by the author. the study was conducted in the mackimmie library at the university of calgary. this paper highlights the results of the study that aimed at assessing the behavior of library readers in pulling out books from bookshelves. these books, when not checked out, are categorized as "pickup books" because they are usually discarded inside a library after use and then picked up by library assistants for reshelving. like many other libraries, the mackimmie library does not encourage readers to reshelve books themselves.

arcview, a gis software package, was selected to develop the tool for this study because gis has the functions of dynamically analyzing and displaying spatial data. the research on library readers pulling out books involves the measurements of bookshelf heights, and thus deals with spatial coordinates. with the capability of presenting bookshelves in different views on maps, gis is able to provide readers with an easy understanding of the analytical results in visual forms, which make any textual descriptions wordy. at the same time, some gis products are available now in most academic libraries, thus giving developers convenient access to use.

hypothesis

when library users decide to check books out of a library, these books are what they think of as useful. people are usually hesitant to carry home books that are of little or uncertain use, not only because of the limit on the number of check-out books, but also because of the physical work required for carrying them. moreover, some items, such as periodicals and multimedia materials, are either designated as "reference only" or have a very short loan period. it is reasonable to believe that users carefully select what they want from library collections and keep these books for handy use outside the library.
by contrast, in-library book use represents a different category of library readers' behavior. there are two general categories of in-library book use: readers bringing their own books into a library for use, and readers pulling out books from bookshelves inside a library. the former is commonly seen when students study textbooks for examinations (not the topic of this study), while the latter is a little more complex.1

as library users approach bookshelves to extract books, they may or may not have a definite target. when coming with call numbers, people will deliberately draw the books they want for reading, photocopying, or referencing. however, there are times when users only wander in bookshelf aisles of desired collections, uncertain about singling out specific books. they may simply shelf-shop to randomly select whatever is interesting to them, or they may locate a subject of need and go to the storage position(s) to look for whatever books are there. no matter what these readers' intentions are, they roam among collections, pick books for quick use, and leave them inside the library after use, although some materials may also be checked out.

because of such arbitrary selections from library collections, physical convenience sometimes influences library users in taking books from bookshelves: they may look around for books on bookshelf layers that are at a reachable height. the standard library bookshelf is higher than the average person's height and is structured to have five to eight layers. in academic libraries, "wood shelving is available in three heights: 82 in. (2050 mm), with a bottom shelf and six adjustable shelves; 60 in. (1500 mm), with a bottom shelf and four adjustable shelves; and 42 in. (1050 mm), with a bottom shelf and two adjustable shelves."2 for regular collections in most academic libraries, bookshelves are usually about eighty-two inches high and have seven layers.
books on the top layer are out of reach for many readers, requiring them to use a ladder to draw a book from it. many users are hesitant to use ladders. even worse, a reader will have to bend over or squat down to view the contents of books on the bottom layer of a bookshelf. hence, the hypothesis is that books used inside a library are primarily distributed among the mid-ranged layers of bookshelves. specifically, if a bookshelf has seven layers, books placed on layers two through six are most frequently consulted. this is the subject of this research paper.

jingfeng xia (jxia@email.arizona.edu) is a student at the school of information resources and library science at the university of arizona, tucson.

using gis to measure in-library book-use behavior | xia

reproduced with permission of the copyright owner. further reproduction prohibited without permission.

background

a considerable number of studies have investigated the utilization of books that are checked out of a library. an estimate made in 1967 pointed out that over seven hundred research results pertained to this topic.3 however, the situation of books used inside a library has not been given enough attention. one of the reasons for this seeming neglect comes from the belief that the records of library books in circulation provide similar information as those of books used within libraries.4 this misunderstanding was lately criticized by other researchers who discovered the differences in use behavior between library readers taking books home and those using books inside libraries.5 researchers have now recognized that correlations between the two sets of data are not as strong as they seemed to be. such recognition, unfortunately, has not resulted in more consequent work to explore the issue of in-library book use.
this is probably due to the difficulties of collecting data or the lack of appropriate research methods.6 also, the majority of relevant surveys were conducted several decades ago and focused primarily on exploring a good method of sampling in-library book use.7 among these studies, fussler and simon preferred to carry out research by distributing questionnaires among library readers; drott used random-sampling methods to statistically examine the importance of library-book use; and jain, as well as salverson, emphasized dividing the survey times into different investigation units when conducting research. similarly, morse pointed out the complexity of measuring library-book use at work, advocating an involvement of computerized operations in library-book management. the sampling strategies and analytical methods implemented in past studies are still applicable to current research. nonetheless, because many new technologies have come into view since then, it is quite likely that some new ways of obtaining and analyzing the data of in-library book use can now be developed. the new approaches must have the capability of providing not only accurate measurement of the data but also the means for easy manipulation. their results must be able to enhance the understanding of user behavior in exploring the resources of existing collection inventories. one of the solutions is an analytical tool. an analytical tool can control data collection and analysis by computerization. if the system is able to accumulate constantly updated records over time, it will remedy the problem of poor sampling that many researchers have encountered, because analysis will then be done on all the data rather than with certain isolated samples. the development of modern technologies makes such data collection and storage possible and easier than ever before.
one example of the technologies is the radio frequency identification (rfid) tag system that has been adopted by some public and academic libraries recently.8 this system stores a tag in each library item with the item's bibliographic information, and uses an antenna to keep track of the tag. by automatically communicating with data stored in the tags, the system can collect data on all library collections in a timely manner and export them into predesigned databases for easy management. data analysis and presentation comprise another part of the analytical mechanism. researchers have to carefully evaluate existing technologies in order to select proper products or develop particular programs to integrate with rfid (if used) and the databases. it is fortunate that gis technology is available with numerous functions for analyzing and demonstrating data, especially spatial data. data visualization through gis products has been very good, which gives them advantages over other analytical, statistical, or reporting products. combining rfid and gis into one system would seem to be the perfect solution: the former can effectively carry out data collection and the latter can efficiently perform data analysis and presentation. however, while gis products have been used in libraries in the united states for more than a decade, most academic libraries are hesitant to invest in rfid because of its high costs. gis technology alone, however, can still provide sufficient functions to be developed into such an analytical tool. up to now, those libraries that have provided gis services only use the software that assists in the utilization of geospatial data and mapping technologies for users.9 gis is not exploited enough to aid the management of libraries themselves and the research of library collections.

information technology and libraries | december 2004
some commercial gis software, such as "library decision" by civic technologies, has been recently marketed to support the analysis of library-user data for public libraries.10 however, it only works well on the data of conventional geographical nature, that is, the distribution and location of libraries and their users with the mapping of city blocks and streets. it does not apply to a library and its books, and especially not to the distribution of books used inside the library. such products are also not applicable to academic libraries that do not always concentrate on the analysis of geographical areas of their users. even so, gis has all the functions that such a proposed analytical tool demands. it is suitable for assisting in the research of in-library book use where library floor layouts or other facilities can be drawn into maps on multiple-dimensional views. at the same time, bookshelves with individual layers can be treated as an innovative form of map by gis technology (see figure 1), making visible the relationship of book use to the height of the bookshelf. as soon as the presentation mechanism is linked to databases, any updates on book use will be mirrored visually.

method

this project is one of a series of projects for developing gis into a tool to manage and analyze the usage characteristics of library books. the other projects include using gis to measure book usability for the development of collection inventories; to assist in the management of library physical space and facilities; and to locate library items.11 in order to make gis workable for the subject of this paper, the focus was placed only on the exploration of correlations between bookshelf heights and book-use frequencies in an academic library environment.

figure 1. the front view of one bookshelf rack on the fifth floor of the university of calgary mackimmie library. eight bookshelves assemble the range. here, different shades of color represent the numbers of books used on each individual layer. the display is only for demonstration and not to actual scale.

there are two major steps to conducting this research: collecting data and developing a gis analytical tool. since mackimmie library did not invest in rfid at the time this research was undertaken, personal observations were made to record book-use data.12 the development of the gis tool involves creating a small database to store data and facilitate data analysis. it also requires creating several bookshelf and shelf-range maps to present analytical results in visualized forms. arcview, the most popular gis product in the world, was utilized for the development. this paper presents only a portion of collection areas at mackimmie library. part of the fifth floor, where some collections of humanities and social sciences are stored, was selected because this floor is among the busiest of the floors used by readers. it is filled with sixty-eight ranges of bookshelves containing books from call numbers b to du. the terms used in this paper include bookshelf, referring to one unit of furniture fitted with horizontal shelves to hold books; rack, which includes more than one bookshelf standing together in a line; and range, composed of two racks standing back-to-back. bookshelves on the fifth floor are arranged to surround a group of facility rooms in the central area. study corridors are set between bookshelves and the wall. each bookshelf range consists of two bookshelf racks, each of which in turn has eight individual bookshelves. all of the bookshelves are about eighty-two inches high and have seven layers. the layers, except for the top ones that are open, are equal in height, width, and length.
data collection

personal surveys were taken by the author to note down each call number of books that were not in their original positions on the shelves, but instead were found discarded on the floor, tables, chairs, sofas, or on top or in front of other stocked books. books on the shelving carts are also accounted for. the surveys were separately conducted three times a day (morning, afternoon, and evening) in order to catch as many books used in a day as possible. to avoid recording the same book more than once, no duplicate call numbers were accepted for any single day even though the same book was found in different locations on that day. on the other hand, the same call number could be entered into the records on the second day although it was recorded the day before and remained in the same place without being picked up by library assistants. (this duplicate recording was very rare because of the routine work of book pickup by library assistants.) a period of two weeks was designated for the survey in the first half of december 2002. the final examination week was planned because it represents a week of heavy book use, although previous research found that readers in this week tended to use library collections less than their own study materials.13 a supplementary survey that also lasted two weeks, including a final examination week, was conducted in the library in late spring 2004. to simplify the research, some exceptions were established for data collection. periodicals were excluded because they have a very short loan period (generally one day). library users may prefer to read articles in journals within the library and thus will have a clear idea as to what materials to read.14 books belonging to other floors of the library, or books belonging to the fifth floor but found outside the area, were not included in the analysis. furthermore, due to the nature and time limit of these observations, books pulled out of targeted bookshelves were not distinguished from books taken from bookshelves at random. this information can only become available through interviews with library users, which can be another research project. each bookshelf layer was recorded with and signified by two call numbers: the start and end numbers of books. for example, the call numbers "bf1999 .k54" to "bh21 .b35 1965," representing books stored on a particular layer, were recorded to identify that layer. because book shifting can happen from time to time, such recording of start and end call numbers for individual bookshelf layers only reflects the conditions when this research was undertaken and may need updates whenever changes occur.

data manipulation and visualization

using a bookshelf layer as the recording unit is essential for the analysis of the relationship between book use and bookshelf height. each book used can be classified to fit in one unit according to the call number of the book. therefore, building a database with a table for layers will be an important part in the development of such an analytical tool. the layers table includes a data field as an identifier to stand for the sequence of each layer (1 for the top layer, 2 for the next layer down, and so on), in addition to storing the start and end call numbers of books for each layer. if more than one bookshelf in the library has seven layers, layer identifiers will iterate from bookshelf to bookshelf.
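a minimal sketch of how a used book might be assigned to a layer from that layer's start and end call numbers. it is not the author's implementation: the names `call_key` and `find_layer`, the simplified parsing rule (class letters, then class number, then the remainder compared as text), and the boundaries of the first layer are assumptions made for illustration; only the range "bf1999 .k54" to "bh21 .b35 1965" comes from the text above. the class number is split out because plain text comparison would sort "bf205" after "bf1999".

```python
import re

# Simplified pattern: class letters, class number, then everything else.
# Real call numbers have more structure; this is an illustrative assumption.
_CALL_NO = re.compile(r"^([a-z]+)\s*(\d+(?:\.\d+)?)\s*(.*)$")


def call_key(call_no):
    """Turn a call number into a tuple that sorts in shelf order."""
    m = _CALL_NO.match(call_no.strip().lower())
    if m is None:
        raise ValueError(f"unrecognized call number: {call_no!r}")
    letters, number, rest = m.groups()
    return (letters, float(number), rest)


def find_layer(call_no, layers):
    """Return (shelf_id, layer_id) of the layer whose range holds call_no."""
    key = call_key(call_no)
    for shelf_id, layer_id, start_no, end_no in layers:
        if call_key(start_no) <= key <= call_key(end_no):
            return shelf_id, layer_id
    return None  # book belongs to no recorded layer


# (shelf_id, layer_id, start call number, end call number)
layers = [
    (1, 1, "bf1 .a1", "bf1999 .k53"),         # hypothetical boundaries
    (1, 2, "bf1999 .k54", "bh21 .b35 1965"),  # range quoted in the text
]

print(find_layer("bf205 .m3", layers))    # falls in layer 1, not layer 2
print(find_layer("bf1999 .k54", layers))  # a boundary book counts for its layer
```

because book shifting changes the boundaries, the `layers` list would need to be refreshed whenever shelves are rearranged.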
therefore, this table will also need an identifier for each individual bookshelf with which layers are associated. the database will also contain such information as bookshelf ranges, bookshelf racks, and books, all of which are individual database tables that are joined with each other by relational keys. among them, the ranges table is simply characterized by its identifier, and is designed to represent two racks of bookshelves that stand back to back. the bookshelves table is identified by the call numbers of the start and end books stored across individual bookshelves rather than on individual layers. furthermore, the books table is primarily filled with the data of individual book call numbers as well as book pickup times and book discard locations. gis has limited ability for organizing database structure. if necessary, other database management systems, such as microsoft access, can be incorporated. query codes are built to get summarized information for specific purposes, and the aggregated data are exported into gis databases for further spatial analysis or convenient visual presentation. data visualization can be shown at different levels: by layer, bookshelf, rack, and range. the first attempt at making a visual demonstration of this research is for the area of individual bookshelves at layer level (see figure 1). the following query will return necessary summarized information:

select count(b.call_no) as total_num, l.layer_id, l.shelf_id from (books b inner join layers l on b.some_id = l.some_id) where b.call_no >= l.start_no and b.call_no <= l.end_no group by l.layer_id, l.shelf_id order by l.shelf_id, l.layer_id

at the same time, another attempt is made to demonstrate book numbers per layer, at bookshelf level, across multiple bookshelf ranges.
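the layer-level aggregation can be tried out with a small in-memory database. this is a sketch under stated assumptions, not the study's database: it uses python's sqlite3 rather than microsoft access, joins on a hypothetical `shelf_id` column (the text leaves the join key unspecified), counts rows per layer, treats the start and end call numbers as belonging to the layer, and compares a `sort_key` column that is assumed to hold a zero-padded form of each call number prepared beforehand. all sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table layers (shelf_id integer, layer_id integer,
                     start_key text, end_key text);
create table books  (call_no text, sort_key text, shelf_id integer,
                     pickup_time text, discard_location text);
""")

# sort_key: class number zero-padded so that text order equals shelf order.
conn.executemany("insert into layers values (?, ?, ?, ?)", [
    (1, 1, "bf0001.a1", "bf1999.k53"),        # hypothetical top layer
    (1, 2, "bf1999.k54", "bh0021.b35 1965"),  # range quoted in the text
])
conn.executemany("insert into books values (?, ?, ?, ?, ?)", [
    ("bf205 .m3", "bf0205.m3", 1, "2002-12-02 morning", "table"),
    ("bf1999 .k54", "bf1999.k54", 1, "2002-12-02 afternoon", "floor"),
    ("bg100 .c2", "bg0100.c2", 1, "2002-12-03 morning", "sofa"),
    ("bh21 .b35 1965", "bh0021.b35 1965", 1, "2002-12-03 evening", "cart"),
])

# One row per (layer, shelf) that had at least one book used.
rows = conn.execute("""
    select count(b.call_no) as total_num, l.layer_id, l.shelf_id
    from books b inner join layers l on b.shelf_id = l.shelf_id
    where b.sort_key >= l.start_key and b.sort_key <= l.end_key
    group by l.layer_id, l.shelf_id
    order by l.shelf_id, l.layer_id
""").fetchall()

for total_num, layer_id, shelf_id in rows:
    print(f"shelf {shelf_id}, layer {layer_id}: {total_num} book(s) used")
```

changing the `group by` columns (for example to a rack or range identifier) is the kind of query modification that would switch the display unit.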
this demonstration provides a better visualization in the gis display so that an overall view of the height distributions of book usage over certain collection areas can be presented (see figures 2 and 3). to achieve such visualization, data must be compared in order to get information about which layer of a bookshelf contains the most frequently used books and which holds those that are rarely visited. this demonstration indicates that any alternative selection of analytical-display units can be easily performed by making modifications on the query that works on aggregating data. technically, data visualization can be presented by using any gis software, although arcview is used here because it has been available in the systems of many academic libraries. bookshelf ranges in mackimmie library's fifth floor were drawn into map features. in order to show them with a three-dimensional view, each of the seven layers was given a sequential number as its height value, and all bookshelves were treated as having the same height. these height values are treated as the z values in any three-dimensional analysis. then, by associating the numbers of books from the database with the heights of layers on the map, arcview is able to sketch the height distributions of in-library book use in new perspectives, dramatically improving the understanding of book use. in order to implement the visualization of all layers across a bookshelf range, layers were drawn as map features (see figure 1). layer heights and widths are in appropriate proportion. (individual books on each layer are for demonstration only, and thus are not in the exact shape and number.) figure 1 shows how a bookshelf rack has been presented as a gis map, which is a totally new idea in the applications of gis visualization. the database and visualization mechanism constitute what is referred to in this paper as the analytical tool.
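the comparison step behind the three-dimensional view (finding, for each bookshelf, the layer from which most books were taken and turning it into a z value) might look like the sketch below. the function name, the sample counts, and the mapping from layer identifier to height value are assumptions: the text numbers layers from the top down but does not say how the sequential height numbers were assigned, so the sketch simply gives the top layer the largest value.

```python
LAYERS_PER_SHELF = 7  # all bookshelves in the study have seven layers


def busiest_layer_heights(counts):
    """Map each shelf_id to the height value of its most-used layer.

    counts: iterable of (shelf_id, layer_id, total_num) rows, where
    layer_id 1 is the top layer. Ties keep the first layer seen.
    """
    best = {}  # shelf_id -> (layer_id, total_num)
    for shelf_id, layer_id, total_num in counts:
        current = best.get(shelf_id)
        if current is None or total_num > current[1]:
            best[shelf_id] = (layer_id, total_num)
    # Assumed height scale: bottom layer = 1, top layer = LAYERS_PER_SHELF.
    return {
        shelf_id: LAYERS_PER_SHELF + 1 - layer_id
        for shelf_id, (layer_id, _) in best.items()
    }


# Invented per-layer counts for two bookshelves.
counts = [(1, 1, 0), (1, 3, 9), (1, 4, 12), (2, 2, 5), (2, 7, 1)]
z_values = busiest_layer_heights(counts)
print(z_values)
```

the resulting dictionary is the kind of table that could be joined to bookshelf map features so that the gis software extrudes each shelf to its z value.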
one will find that the development is relatively easy and the tool is incredibly simple. however, it is a dynamic device. if expanded into other parts of the library collections, this tool will become an integrated system that is able to assist in the management of library book use and

figure 2. a three-dimensional view of bookshelf ranges on the fifth floor at the mackimmie library. the height of each bookshelf represents the corresponding height of the layer from which most books were removed. this display is not to actual scale.

[figure: sample terminal session; the printed dialogue is not recoverable from the scan]

journal of library automation vol. 7/4 december 1974

if not used, file name is assumed to refer to the file last searched. if the parameters are not enumerated, those specified for the picture selection are typed out. the parameters to be typed out can be enumerated or the specification parameters called for. if neither of these is done, the values of all parameters are typed out. parameters typed out are identified by column headings.

phase transfer commands

command: function
respecify: allows respecification of selection parameters; only those parameters which are reentered are changed; previously specified parameters retain their values.
search: similar to respecify, except only those pictures in the present list are candidates for selection. this is more efficient than again searching through all the pictures.
continue: if the search was terminated before all pictures had been processed, the search is continued from where it had been suspended.
restart: to view another set of pictures (all specified parameter values are deleted).

appendix 5. mariner 9 picture records

field number: 1 to 22, 23-28
fixed-length portion: fiche code, data code, file name, id number (das), unit #, picture number, footprint code, unused
variable portion: das time, orbit, latitude, longitude, solar lighting angle, phase angle, viewing angle, slant range, camera resolution, local time, filter, exposure time, role and file of filter version on roll film, comments (content descriptors)
the internet as a source of academic research information: findings of two pilot studies

harry m. kibirige and lisa depalo

information technology and libraries; mar 2000; 19, 1; proquest pg. 11

as a source of serious subject-oriented information, the internet has been a powerful feature in the information arena since its inception in the last quarter of the twentieth century. it was, however, initially restricted to government contractors or major research universities operating under the aegis of the advanced research projects agency network (arpanet).1 in the 1990s, the content and use of the internet were expanded to include mundane subjects covered in business, industry, education, government, entertainment, and a host of other areas. it has become a magnanimous network of networks the measurement of whose size, impact, and content often eludes serious scholarly effort.2 opening the internet to common usage literally opened the flood gates of what has come to be known as the information superhighway. currently, there is virtually no subject that cannot be found on the internet in one form or another. there is both hype and reality as to what the internet can generate in terms of substantive information. in their daily pursuits of information, information professionals as well as end-users of information are challenged with regard to what their expectations are and what actually is delivered in terms of tangible information products and services on the net. academic users are a special breed in that both faculty and students have specific topics covered in their courses of study or faculty research agendas for which they need information.
the use of electronic resources found on and off the internet is becoming increasingly vital for education and training in academic environments.3 five basic elements often are required in the electronic resources that academic information seekers desire: accessibility, timeliness, readability, relevance, and authority. the internet excels in the first three, but depending on how and from where the information is gathered, it may not be so reliable with regard to the last two elements. the two pilot studies discussed in this article involved four academic institutions and were conducted by the researchers approximately twelve months apart. one covering two institutions was done in the fall of 1997. it was replicated covering another two institutions in the spring of 1999. the main goal of the studies was to investigate how academic users perceive search engines and subject-oriented databases as sources of topical information. the basic underlying question was, "when faced with a topical subject, what is the users' predominant recourse, online databases (which may include cd-rom, or dvd databases) or search engines?" our results indicated that there is predominant preference for search engines for the group taken as a whole. further analysis using nonparametric correlation coefficients (kendall's tau_b and spearman's rho), however, indicated that those who use the internet monthly or weekly had high correlations with online databases as their preferred predominant information sources. on the other hand, daily users tended to have high correlations with search engines as preferred predominant information sources.

information seeking behavior of academic users

over the years, several studies have been conducted on how users seek and find information relevant to their needs. for the purposes of our analysis three categories will be used: the undergraduate, the graduate, and the post-doctoral research faculty user.
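for readers unfamiliar with the two nonparametric coefficients named in the findings above, the sketch below computes kendall's tau_b and spearman's rho for two ordinal variables using only the standard library. it is an illustration, not the authors' analysis: the function names are arbitrary, and the coded responses (frequency of internet use as 1 = monthly, 2 = weekly, 3 = daily; preferred source as 0 = online databases, 1 = search engines) are invented, not survey data.

```python
from itertools import combinations
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def kendall_tau_b(x, y):
    """Kendall's tau_b: pairwise agreement, adjusted for ties in x or y."""
    concordant = discordant = tied_x = tied_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied on both variables: excluded from every term
        elif dx == 0:
            tied_x += 1
        elif dy == 0:
            tied_y += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    return (concordant - discordant) / sqrt((untied + tied_x) * (untied + tied_y))


# Invented responses from seven hypothetical users.
usage = [1, 1, 2, 2, 3, 3, 3]
prefers_search_engine = [0, 0, 0, 1, 1, 1, 1]
print(kendall_tau_b(usage, prefers_search_engine))
print(spearman_rho(usage, prefers_search_engine))
```

both coefficients range from -1 to 1; a positive value here would mean that more frequent use goes together with a preference for search engines.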
while the levels of how the needed information may be articulated and packaged may be different, the five basic required elements in the electronic information resources needed by academics, already identified, remain the same. the internet has, however, added another dimension to the information-seeking behavior of all academics in that much of the needed information, if and when found, has a higher chance of appearing as full text (sometimes defined as viewdata) on the internet.4 with viewdata the end user has the ultimate in information seeking and acquisition in that he or she will get text, images, and sound in one, two, or more resources on the net. the process also may be accomplished in one sitting or search session on the computer terminal. the internet thus may be more likely to generate viewdata in contrast to conventional databases, which have for a long time been associated with the less desirable citations. in many instances and with a little persistence, it can provide the analogy of "one stop shopping" whereby a user can get viewdata needed for a topic. this may explain the tendency to try the internet first as a potential information source even for experienced searchers. to be effective, such searching needs experience and a lot of patience while sifting through pages of useless verbiage, as the information sources often are garnered from several sites. categories of academic users have varying levels of expertise in information seeking and have different characteristics in their information-seeking behavior.

harry m. kibirige is associate professor at queens college, city university of new york, and lisa depalo is assistant professor at college of staten island, city university of new york.
undergraduate users

undergraduates are at the lowest point on the totem pole with regard to expertise in information seeking at any academic institution. there is more to the information needs of undergraduate students than can be revealed during the reference interview process. there are the pervading needs that the information age has created, which can be met only by those who possess critical thinking skills. critical thinking skills are imperative to much more than completing college-level assignments: they are also imperative to surviving in the job market once students graduate. this premise has been set forth in the 1992 united states government report from the department of labor and the secretary's commission on achieving necessary skills (scans) entitled skills and tasks for jobs: a scans report for america 2000. this report defined two types of skills needed to excel in the workplace and labeled them as competencies and foundations. effective reference and instruction services can help students develop the critical thinking skills needed to meet the information competency, in particular, since it pertains to one who "acquires and evaluates information, organizes and maintains information, interprets and communicates information, and uses computers to process information."5 acquiring and evaluating information can be particularly difficult for undergraduates in the information age since one is bombarded with data in print and electronic formats. one can easily determine the reliability of print sources by looking at the name of the author, editor, or publisher. however, the internet has become a popular choice for students who need to do research. it has gained the reputation for providing all that one needs right at one's fingertips. the problem is that one cannot readily discern what is reliable and what is not without some instruction.
it may be argued that the undergraduates' information seeking is somewhat eased by the general guides they get from the faculty in the classroom. there is the general professorial lecture which outlines the topics to be covered during the course, as well as associated relevant readings used to broaden the subjects covered. in addition, there is the text book which elaborates on material covered in class. finally, there are journal articles and other information sources which ordinarily are placed on reserve. as far as subject content covered in class lectures and discussion is concerned, information is usually well organized and accessible. at that level information seeking is minimal and often guided by the dictates of the professor. but then enters the term paper and the whole student peace of mind with regard to information gathering habits is disturbed. the term paper brings many unknowns to the undergraduate. the magnitude of the subject to be covered is initially fuzzy. the resources needed to get background as well as specific information are also fuzzy. furthermore, even when the resources are a little clear, sifting through them and making rational selection of relevant material may be problematic. the whole academic exercise entails learning and using new information tools, many of which were not covered in high school. computers and other electronic equipment have accentuated the undergraduates' mesmerization process in their information-seeking effort. a trait that most undergraduates exhibit in their information-seeking behavior is approaching the reference librarian for suggestions of leads to information sources needed for the term paper topic. they also may request the librarian to evaluate the sources as to their relevance, and sometimes even ask him or her to fetch the actual material needed. 
6 with the advent of the internet and other electronic resources online or otherwise (e.g., dialog, lexis-nexis, cd-roms, dvd, and tapes), the undergraduate may go directly to the internet terminal and thus skip the librarians' counsel and hand-holding which used to be vital for accessing the printed material. unless the undergraduate student is well-groomed in searching the internet, this relatively new tendency to act independently of the information professional may result in hours of useless roaming on the net with little relevant information retrieved.

the graduate user

in their study of business students, atkinson and figueroa found that graduates reported fewer hours spent in the library than undergraduates.7 the researchers did not attempt to explain why that was so. perhaps because of their search skills, graduates do more focused information seeking and do not waste much of their time browsing and floundering in the unknown information abyss within the library. the researchers reported an equal interest in searching internet resources and online databases (e.g., lexis-nexis, dow jones, and abi/inform) among graduates and undergraduates. however, their research was done at the end of 1995 and beginning of 1996, before the proliferation of search engines on the internet. as an information searcher the graduate is more sophisticated compared with the undergraduate. subject coverage is usually more clearly defined in many of the assignments encountered. he or she has gone through most of the pitfalls of the undergraduate experience and can select a subject and research it relatively well. most likely due to the nature of their assignments, undergraduates' information needs may be satisfied by simple information systems that allow users to browse. their searches also tend to be less exhaustive than those of graduates.
on the other hand, graduates are faced with relatively narrower subjects and prefer to conduct more comprehensive searches.8

the post-doctoral researcher-faculty

faculty have mastered the art of getting relevant information. many belong to the informal invisible college and attend professional conferences, both of which are used to get information for teaching and research. hart's study found that formal sources, which may be found in the personal and college or university library, are more important in the faculty's information-seeking effort than informal ones.9 according to hart, this information-seeking characteristic would be applicable to printed and electronic resources found on the internet. although our research did not specifically test it, online databases tend to direct the end user to more formalized, definitive, and tested resources than the internet search engines do. this would minimize user search time and maximize relevance of the information needed by the research academic faculty. in other words, while the listserv might be one of the internet substitutes for the invisible college, information found on it would be more acceptable to a research faculty member if it directs him or her to reliable and verifiable databases, e.g., cendata (u.s. census bureau information database), edgar (u.s. securities and exchange commission database), or dow jones.

developments in the electronic resources arena have made many hard copies less popular. subject-oriented databases can be searched either in the library or in faculty offices. curtis et al. researched the information-seeking behavior of health sciences faculty and found a relatively new and growing information-seeking characteristic.
according to curtis et al., faculty tend to prefer to search electronic resources from their offices rather than go to the library. 10 that is not surprising, for if a faculty member can access library catalogs and electronic databases, some of which can provide viewdata (full text), it is not necessary for him or her to go to the library for some of the information needed. in addition, if cd-rom databases are on a local area network accessible via the college online catalog, faculty may seldom go to a library whose resources are on the network via a library web site, telnet, or the traditional dial up. i the pilot studies with the general information-seeking behavior of academic users in mind, the researchers decided to investigate the use of search engines for information sources in the academe in the new york metropolitan area. search engines were contrasted to databases which may be url(universal resource locator) accessible online via an internet browser, stand alone on cd-rom, or on cdrom towers linked by a library local area network. in her article on web search engines, schwartz discussed recent studies done on their performance . she pointed out the fact that the end user is not often a participant in such studies .11 although our research was not on evaluation, we deliberately focused on the end user to gather statistics on perception of web search engine utility in internet surfing and information seeking . kassel evaluated search engines indicating their variety and complexity when used to search the internet. 12 other relevant literature indicated the difficulty of navigating the internet for both the information professional and the end user. it also indicated how direct access to databases was a shortcut to retrieving some of the topical information . our periodic observations of internet users revealed heavy use of search engines. 
we suspected that end users use them to get topical information which might otherwise be easily gotten from online databases. consequently, we thought it necessary to conduct a study on end-user perception.

objectives

our objectives in embarking on the pilot studies were to:

1. find the frequency of internet use by end users. this would allow us to check whether there is a correlation between frequency of internet use and perception of search engine utility.
2. find the most popular search engine. examining the most popular search engine with respect to indexing policy might indicate whether it would generate more topical subject type of information.
3. gauge the use of online and cd-rom databases in the library. in order to help the end-users' memory as to what databases are involved in the research, common databases were listed on the questionnaire as examples.
4. gauge the use of search engines in libraries and information centers. common search engines likewise were listed to help the end user identify what they were.
5. relate the results to pragmatic library and information-center functioning in providing information.

methodology

four metropolitan new york academic institutions were selected: borough of manhattan community college; iona college; queens college of the city university of new york; and wagner college. the main criterion for selection was ease of access for the researchers. a composite sample of users was selected from these institutions to participate in the studies. the sample used was dynamic and self-selected in that whoever used the "internet terminal" was a potential research subject. only end users, as opposed to information professionals/librarians, were used in the study.
while subjects sat at the terminal, they were requested to complete the questionnaire and return it to the reference/information desk. simplicity dictated the design of the research and data collection instrument (questionnaire). it was one page, multicolored, and was entitled "internet use questionnaire." we estimated that it would take the subjects four to seven minutes to complete. our assumption in designing it to be simple and least time-consuming was that since the subjects were sitting at the terminals, they were time conscious.

figure 1. frequency of internet use (daily 46%, weekly 45%, monthly 9%)

while subjects were asked to complete the questionnaire, they had the option not to. forty copies of the questionnaire were given to each academic institutional library, making a total of 160. useable returns were 155, or 97 percent. in addition to the questionnaire, we conducted exit interviews with some of the subjects who were using the internet terminals after they handed in the completed questionnaires. the purpose of the interviews was to have some idea as to how the users perceived the utility of the internet in getting electronic-based information. four questions were used:

1. how do you find the internet as an information source?
2. did you get what you needed from the internet?
3. do you have a favorite search engine?
4. is there any point when you would seek the assistance of the reference librarian/information specialist?

analysis of the data was done using the spss (statistical package for the social sciences) package. we used descriptive statistics for general group tendencies: frequency of internet use and preferred sources for topical subject search. for inferential statistics we preferred the non-parametric pairwise two-tailed correlation coefficients, kendall's tau_b and spearman's rho statistics. microsoft's excel program package was used to draw some of the illustrations.
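the two rank statistics named above can be sketched in a few lines. this is a minimal illustration and not the spss procedure the study used: the function names are hypothetical, and the two lists are made-up counts for four institutions, chosen only to mirror the n = 4 in the study. ties receive average ranks, and tau_b uses the standard tie-corrected denominator.

```python
from itertools import combinations
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def kendall_tau_b(x, y):
    """Kendall's tau_b: concordant minus discordant pairs, corrected for ties."""
    concordant = discordant = tied_x_only = tied_y_only = 0
    for i, j in combinations(range(len(x)), 2):
        dx, dy = x[i] - x[j], y[i] - y[j]
        if dx == 0 and dy == 0:
            continue  # tied on both variables: excluded from every count
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    return (concordant - discordant) / sqrt(
        (untied + tied_x_only) * (untied + tied_y_only)
    )


# Hypothetical counts per institution (NOT data from the study).
daily_users = [18, 20, 15, 17]
search_engine_users = [30, 35, 28, 33]

print("spearman's rho:", round(spearman_rho(daily_users, search_engine_users), 3))
print("kendall's tau_b:", round(kendall_tau_b(daily_users, search_engine_users), 3))
```

with only four observations per variable, a single swapped rank changes either coefficient substantially, so coefficients from a sample of this size are best treated as descriptive.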
results

the study revealed that an overwhelming majority of subjects (91 percent) use the internet at least once a week (this includes those who use it daily). an almost equal number (45 percent) use it weekly (at least once a week); 46 percent use it at least once a day (see figure 1). as figure 2 shows, search engines are the predominant preferred tools for searching topical subjects on the internet as contrasted to online and cd-rom databases.

figure 2. preferred sources for subject search (search engine 84%, online db 16%)

we used the two-tailed pairwise correlation coefficients to see whether there are correlations between frequency of internet use and tool preferences. as table 1 and table 2 indicate, subjects who used the internet monthly or weekly had high correlations with online databases. daily users, however, tended to have high correlations with search engines as tools to get to topical subject information sources.

table 1. nonparametric correlations-spearman's rho

| | | daily | seng | monthly | weekly | ondb |
|---|---|---|---|---|---|---|
| daily | correlation coefficient | 1.000 | -.544 | -.258 | -.544 | .258 |
| | sig. (2-tailed) | . | .456 | .742 | .456 | .742 |
| | n | 4 | 4 | 4 | 4 | 4 |
| seng | correlation coefficient | -.544 | 1.000 | .316 | .500 | .316 |
| | sig. (2-tailed) | .456 | . | .684 | .500 | .684 |
| | n | 4 | 4 | 4 | 4 | 4 |
| monthly | correlation coefficient | -.258 | .316 | 1.000 | .949 | .800 |
| | sig. (2-tailed) | .742 | .684 | . | .051 | .200 |
| | n | 4 | 4 | 4 | 4 | 4 |
| weekly | correlation coefficient | -.544 | .500 | .949 | 1.000 | .632 |
| | sig. (2-tailed) | .456 | .500 | .051 | . | .368 |
| | n | 4 | 4 | 4 | 4 | 4 |
| ondb | correlation coefficient | .258 | .316 | .800 | .632 | 1.000 |
| | sig. (2-tailed) | .742 | .684 | .200 | .368 | . |
| | n | 4 | 4 | 4 | 4 | 4 |

interpretations and conclusions

search engines certainly provide the most common access points utilized by library/information center users to get to electronic resources on the internet. unfortunately, the average user seems to have the impression that the internet is a be-all and almost a panacea to all information problems. kassel suggests that, at best, search engines seem to reach just about half of the web pages available on the internet.13 sullivan has given several reasons why search engine coverage is incomplete and search results sometimes may be misleading.14 among the most cogent reasons are: documents may be changed after they have been picked up for inclusion; deleted materials may be displayed as available; and web sites or files which are password accessible are not covered. much of the information needed in academe is proprietary and available via database vendors. using search engines as the main recourse to topical information shortchanges the user and may lead to frustration unless the high user expectations are tempered by constant education by the information specialist.

the pilot studies do not give conclusive answers as to why the weekly and monthly internet users correlated with those who use online and cd-rom databases. it might be that they search the internet via search engines as supplements to conventional online sources. alternatively they may search using search engines on an exploratory basis when they begin a relatively new subject. daily users who correlated with search engines might have mistaken the highway function of search engines from the actual sources, for example: edgar or medline or eric. it might have been the problem of confusing "the end" with the "means to the end."

implications for information professionals

our studies indicated that a majority of the users in the sample preferred the search engines as access points to the internet for topical information. the interest in search engines correlated with the state university of new york at albany study which also indicated their predominant use in searching the internet.
15 while the albany study was general, ours related the search engines to getting topical information and the use of online databases as an alternative. our findings point to the need to re-educate the internet user in several aspects of the superhighway. first, content: the fact that only a fraction of the possible sites (approximately one half) are indexed by the search engines. second, authority: because it is so easy to self-publish on the internet, a lot of information of low integrity (for instance) or factual inaccuracy may be mistaken for reliable sources. third, transiency of information found on the internet must be pointed out. the maxim "here today, gone tomorrow" is appropriate for several web sites on the internet. finally, information professionals must emphasize in their training the proven online databases to which users should go directly, if and when those databases are provided by the library or information center. information professionals have a direct link to providing users with guidance to proven online databases, specifically during course-integrated instruction.

table 2. nonparametric correlations-kendall's tau_b

| | | daily | seng | monthly | weekly | ondb |
|---|---|---|---|---|---|---|
| daily | correlation coefficient | 1.000 | -.516 | -.236 | -.516 | .236 |
| | sig. (2-tailed) | . | .346 | .655 | .346 | .655 |
| | n | 4 | 4 | 4 | 4 | 4 |
| seng | correlation coefficient | -.516 | 1.000 | .183 | .400 | .183 |
| | sig. (2-tailed) | .346 | . | .718 | .444 | .718 |
| | n | 4 | 4 | 4 | 4 | 4 |
| monthly | correlation coefficient | -.236 | .183 | 1.000 | .913 | .667 |
| | sig. (2-tailed) | .655 | .718 | . | .071 | .174 |
| | n | 4 | 4 | 4 | 4 | 4 |
| weekly | correlation coefficient | -.516 | .400 | .913 | 1.000 | .548 |
| | sig. (2-tailed) | .346 | .444 | .071 | . | .279 |
| | n | 4 | 4 | 4 | 4 | 4 |
| ondb | correlation coefficient | .236 | .183 | .667 | .548 | 1.000 |
| | sig. (2-tailed) | .655 | .718 | .174 | .279 | . |
| | n | 4 | 4 | 4 | 4 | 4 |
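the two-tailed significance values reported with spearman's rho in table 1 can be checked by hand. the sketch below assumes they come from the usual t approximation, t = rho * sqrt((n - 2) / (1 - rho^2)), with n - 2 = 2 degrees of freedom; that is an assumption about how the package computed them, and the function name is hypothetical. for 2 degrees of freedom the t distribution has a closed-form cdf, and the p-value reduces to 1 - |rho|.

```python
from math import sqrt


def spearman_p_two_tailed_n4(rho):
    """Two-tailed p-value for a rank correlation with n = 4 observations,
    assuming the t approximation with n - 2 = 2 degrees of freedom."""
    if abs(rho) >= 1.0:
        return 0.0
    t = rho * sqrt(2.0 / (1.0 - rho * rho))
    # Student's t with 2 df has CDF F(t) = 1/2 + t / (2 * sqrt(2 + t**2)),
    # so the two-tailed p-value is 1 - |t| / sqrt(2 + t**2).
    return 1.0 - abs(t) / sqrt(2.0 + t * t)


# (coefficient, reported significance) pairs as legible in table 1.
table_1_pairs = [
    (0.949, 0.051),
    (0.800, 0.200),
    (0.632, 0.368),
    (0.500, 0.500),
    (0.316, 0.684),
    (0.258, 0.742),
    (-0.544, 0.456),
]

for rho, reported in table_1_pairs:
    computed = spearman_p_two_tailed_n4(rho)
    print(f"rho={rho:+.3f}  computed p={computed:.3f}  reported p={reported:.3f}")
```

under this assumption even a coefficient of .949 stays just above the conventional .05 threshold when n = 4, which is consistent with the authors' caution that the pilot studies do not give conclusive answers.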
education for the end user is paramount to the optimum utilization of electronic information sources. a well-developed information resources instruction program is needed in conjunction with the one-on-one instruction that takes place every day at the reference/information desk. such instruction programs must be cumulative, if they are to be effective in an age of burgeoning choices for end users who can more and more often choose to be remote users of information resources. in an academic environment, early intervention at the freshman level is paramount, but also must be pursued in a structured manner at the upper levels. many college and university information resources instruction programs are based on a one-shot, approximately fifty-minute session, which often is executed as an orientation to the library/information center. such a method of instruction has no guarantee that there will be further guidance sought, either at the behest of a teaching faculty member in the form of course-integrated instruction, or on an individual level at the reference desk. developing effective ways to integrate information resources instruction into the lives of end users is one of the challenges information professionals face in the new millennium with an increase in the use of electronic resources found on the internet.

references and notes

1. jon guice, "looking backward and forward at the internet," the information society 14, no. 3 (july/sept. 1998): 201-11.
2. g. mcmurdo, "the net by numbers," journal of information science 22, no. 5 (1996): 1397-411.
3. n. l. pelzer and others, "library use and information seeking behavior of veterinary medical students revisited in the electronic environment," bulletin of the medical library association 86, no. 3 (july 1998): 346-55.
4. harry m. kibirige, "viewdata," in encyclopedia of electrical and electronics engineering, vol. 23, ed. by g. webster (new york: john wiley, 1999), 223-31.
5. department of labor, the secretary's commission on achieving necessary skills, skills and tasks for jobs (washington, d.c.: department of labor, 1992).
6. gloria l. leckie, "desperately seeking citations: uncovering faculty assumptions about the undergraduate search process," journal of academic librarianship 22, no. 3 (1996): 202-208.
7. joseph d. atkinson and miguel figueroa, "information seeking behavior of business students: a research study," the reference librarian 58 (1997): 59-73.
8. deborah shaw, "bibliographic database searching by graduate students in language and literature: search strategies, systems interfaces, and relevance judgements," library & information science research 17, no. 4 (fall 1995): 327-45.
9. richard l. hart, "information gathering among the faculty of a comprehensive college: formality and globality," journal of academic librarianship 23, no. 1 (jan. 1997): 21-27.
10. k. l. curtis and others, "information-seeking behavior of health science faculty: the impact of new information technologies," bulletin of the medical library association 85, no. 4 (oct. 1997): 402-10.
11. candy schwartz, "web search engines," journal of the american society for information science 49, no. 11 (sept. 1998): 973-82.
12. amelia kassel, "internet power searching: finding pearls in a zillion grains of sand," information outlook (apr. 1999): 28-32.
13. ibid.
14. danny sullivan, "search engine coverage study published," search engine watch. accessed march 11, 2000, www.searchenginewatch.com/sereport/99/os-size.html.
15. wei peter he, "what are they doing on the internet?: study of information seeking behaviors," internet reference services quarterly 1, no. 1 (1996): 31-51.
16 information technology and libraries i march 2000 researchgate metrics’ behavior and its correlation with rg score and scopus indicators: a combination of bibliometric and altmetric analysis of scholars in medical sciences article researchgate metrics’ behavior and its correlation with rg score and scopus indicators a combination of bibliometric and altmetric analysis of scholars in medical sciences saeideh valizadeh-haghi, hamed nasibi-sis,* maryam shekofteh, and shahabedin rahmatizadeh information technology and libraries | march 2022 https://doi.org/10.6017/ital.v41i1.14033 saeideh valizadeh-haghi (saeideh.valizadeh@gmail.com) is assistant professor, department of medical library and information sciences, school of allied medical sciences, shahid beheshti university of medical sciences, tehran, iran. *corresponding author hamed nasibi-sis (nasibi.lib@gmail.com) is msc. graduate, department of medical library and information sciences, school of allied medical sciences, shahid beheshti university of medical sciences. maryam shekofteh (shekofteh_m@yahoo.com) is associate professor, department of medical library and information sciences, school of allied medical sciences, shahid beheshti university of medical sciences. shahabedin rahmatizadeh (shahab.rahmatizadeh@gmail.com) is assistant professor, department of health information technology and management, school of allied medical sciences, shahid beheshti university of medical sciences, tehran, iran. © 2022. abstract objective: social networking sites are appropriate tools for sharing and exposing scientific works to increase citations. the objectives of the present study are to investigate the activity of iranian scholars in the medical sciences in researchgate and to explore the effect of each of the four researchgate metrics on the rg score. moreover, the citation metrics of the faculty members in scopus and the relationship between these metrics and the rg score were explored. 
methods: the study population included all sbmu faculty members who have profiles in researchgate (n=950). the data were collected through researchgate and scopus in january 2021. the spearman correlation coefficient was applied to examine the relationship between researchgate metrics and scopus indicators as well as to determine the effect of each researchgate metric on the rg score.

results: the findings revealed that the publication sharing metric had the highest correlation (0.918) with the rg score and had the greatest impact on it (p-value <0.001), while the question asking metric showed the lowest correlation (0.11). moreover, there was a significant relationship between the rg score and scopus citation metrics (p-value <0.05). furthermore, all four rg metrics had a positive and significant relationship with scopus indicators (p-value <0.05), in which the number of shared publications had the highest correlation compared to other rg metrics.

conclusion: researchers’ participation in the researchgate social network is effective in increasing citation indicators. therefore, more activity in the researchgate social network may have favorable results in improving universities’ ranking.

introduction

conducting any scientific activity first requires gaining knowledge of previous relevant research and citing those sources. there is often a content link between these activities and the sources cited.1 typically, receiving citations is essential and valuable for researchers because, on the one hand, this issue is effective in the career advancement and promotion of researchers and, on the other hand, researchers intend to have a greater impact on science by receiving more citations.
these works should be shared with other researchers and exposed to their observation to increase scholars’ research activities’ citation rate using appropriate tools.2 in this respect, academic social network sites are appropriate tools for sharing and exposing scientific works to increase citation rates.3 academic social network sites have brought researchers together regardless of time and space constraints and have facilitated scientific communication and information exchange.4 in addition, researchers can use these networks to pursue their common interests with other users.5 various studies indicate that sharing publications and subsequently publications’ visibility through social networks increases the citation rate of these works by more than 50%. it has also been observed that journal articles which are shared through these networks have received more citations than other articles in the same journals.6 one academic social network site for the exchange of scientific information is researchgate, which authors can use to cooperate with researchers in all scientific disciplines.7 through this network, researchers‘ scientific works will have better visibility by other people.8 to use this network, users must initially create their profile and then perform scientific activities.9 the researchers’ activity level in this network is indicated by the rg score, which is determined based on four individual metrics, including the number of shared publications, the researcher’s activity in asking questions, the researcher’s activity in answering other people’s questions, and the number of followers. the individual rg metrics affect researchers’ rg score, but the extent of individual metrics impact on this score is not clear.10 shahid beheshti university of medical sciences (sbmu) is one of the top universities in iran. 
according to the evaluation of medical universities’ research activities in the webometrics ranking of world universities, sbmu has achieved the fourth rank among iran’s medical universities.11 in the centre for science and technology studies (cwts) leiden ranking, this university is ranked 11th among iranian universities and 646th among world universities.12 faculty members are one of the main components in universities’ educational structure and play a crucial role in generating, conducting, and disseminating knowledge. due to the importance of citations of faculty members’ scientific works in ranking systems and the situation of sbmu in world rankings, it seems that measures should be taken to improve the citations of faculty members of this university as one of the ways to improve the ranking of the university. considering that more than half of the published articles never receive citations, as well as the positive role of research sharing on social networks in increasing the number of citations, it seems that the activities of sbmu faculty members in the researchgate network may increase the citations count to their research and, consequently, improve the university’s ranking.13 however, to date, no research has been carried out on the activity of the faculty members of sbmu in researchgate.

literature review

various studies have addressed researchers’ activity in the researchgate academic network. the level of researchers’ activities in researchgate and the relationship between citation metrics and rg score are among the topics that have been investigated.

regarding researchers’ activity in researchgate, numerous studies have been carried out. among these, the research of nikkar, rahmani, lui, sheeja, muhammad yousuf ali, mahajan, and joshi can be mentioned.14 nikkar et al.
conducted a study to investigate surgical researchers’ activities in researchgate, which revealed that the majority of these researchers (86.24%) are active members in researchgate.15 rahmani et al. identified the activity of faculty members of technical colleges in researchgate, which showed that most of these researchers (64.16%) were active members of this network.16 the study by sheeja et al. of naval architecture engineering researchers at researchgate revealed that most of them (65%) have a researchgate profile.17 the study by muhammad yousuf ali, titled “altmetrics of pakistani library and information science researchers at researchgate,” indicated that 75.73% of researchers have a researchgate profile.18 in contrast, in studies conducted by mahajan et al., joshi et al., and lui et al., findings showed that less than half of the surveyed researchers are active users of this network.19 in addition to measuring activity in researchgate, some other studies also examined the relationship between the rg score and citation indicators. in this regard, in a study by joshi et al., it was revealed that there is a significant relationship between the rg score and citation metrics.20 shrivastava et al. also conducted an analysis of researchgate profiles of panjab university lecturers.21 the results demonstrated that there is a significant relationship between rg and citation metrics. a study conducted by naderbeigi et al. showed that there is a significant relationship between activity on the researchgate network, rg score, and scopus metrics of the faculty members of sharif university of technology.22 the allied medical science scientists’ activity in researchgate was examined by valizadeh-haghi et al.23 the study revealed that there is a significant relationship between rg scores and scopus indicators. correspondingly, the findings showed that there is a significant relationship between lecturers’ academic ranking and their rg scores as well as scopus indicators. 
according to the literature, it seems that the effect of each of the individual metrics of researchgate on the rg score has not so far been studied. researchgate also has not officially specified the impact of any of its individual metrics on the rg score, while researchers’ awareness of this impact may affect their activity behavior in any of the individual metrics to increase their rg score. previous studies also show that limited research has been conducted in iran regarding faculty members’ activity in researchgate. moreover, none of these studies has investigated the activity of all faculties of a university. therefore, the objectives of the present study include (1) investigating the activity of sbmu faculty members in researchgate, (2) investigating the effect of each of the four individual researchgate metrics on the rg score of the faculty members, (3) determining the citation metrics of the faculty members in scopus, and (4) examining the relationship between individual rg metrics and the faculty members’ scopus citation metrics.

material and methods

the present altmetrics study population included all sbmu faculty members who have profiles in researchgate (n=950). to extract the number as well as the name of the faculty members, we used the iranian scientometrics information database, which is developed by the deputy of research and technology of ministry of health and medical education of iran.24 the number of faculty members in this system was 1,430, of which 950 had profiles in researchgate and were examined. the data regarding rg score were collected through direct observation of their profiles in researchgate. the rg score includes four metrics: number of shared publications, researcher’s activity in asking questions, researcher’s activity in answering other people’s questions, and followers.
data related to the number of citations and the h-index of each of the lecturers were collected by viewing their profiles in the scopus database in january 2021. in this study, it was assumed that there is a significant relationship between researchgate individual metrics and scopus citation metrics. given that the data were not normally distributed, to examine this relationship, the spearman correlation coefficient was used. moreover, regarding that the impact of each of the researchgate metrics on the rg score has not been officially determined, the spearman correlation coefficient was applied to determine the effect of the individual metrics of researchgate on the rg score of the participants. the collected data were analyzed using excel and spss version 18 software. results the rg score of the faculty members is shown in table 1. all faculty members had an rg score, and most of the faculty members (79.5%) had an rg score of less than one. the average rg score of participants was 15.88. table 1. rg score of the sbmu faculty members rg score frequency mean min max sd median members % <1 55 5.79 0.01 0 0.59 0.08 0 1-11 300 31.58 6.18 1.13 10.98 2.89 6.56 11-21 297 31.26 16.09 11 20.98 2.83 15.92 21-31 209 22 25.24 21.06 30.97 2.79 25.25 31-41 82 8.63 34.9 31.02 40.32 2.56 34.5 41-51 6 0.63 42.88 41.46 45.84 1.57 42.34 41-61 1 0.11 56.49 56.49 56.49 56.49 total 950 100 15.88 0 56.49 10.42 15.05 the findings show that most of the faculty members have shared their publications in researchgate, but only 4.11% of them are not active in sharing their publications (see table 2). information technology and libraries march 2022 researchgate metrics’ behavior | valizadeh-haghi, nasibi-sis, shekofteh, and rahmatizadeh 5 table 2. 
number of shared publications of the sbmu faculty members publication frequency mean min max sd median members % 0 39 4.11 0 0 0 0 0 1-50 689 72.53 19.70 1 50 13.12 17 51-100 145 15.26 69.83 51 100 13.24 68 101-150 36 3.79 121.28 101 147 13.50 118 151-200 23 2.42 174.04 153 199 14.66 172 201-250 10 1.05 222.80 203 243 14.29 226 251-300 5 0.53 262.40 251 275 11.13 264 401-450 1 0.11 402 402 402 402 501-550 1 0.11 522 522 522 -_ 522 801-850 1 0.11 824 824 824 824 total 950 100 39.32 0 824 54.73 23 the findings indicate that most faculty members (94.42%) did not have any activity in asking questions. the highest level of activity in this metric was performed by 0.11% of faculty members (see table 3). table 3. the faculty members’ activity in asking questions question frequency mean min max sd median members % 0 897 94.42 0 0 0 0 0 1-10 51 5.37 1.73 1 9 1.58 1 20-30 1 0.11 28 28 28 28 41-50 1 0.11 46 46 46 46 total 950 100 0.17 0 46 1.82 0 table 4. faculty members’ activity in answering questions answers frequency mean min max sd median members % 0 840 88.42 0 0 0 0 0 1-5 91 9.58 2.08 1 5 1.34 2 6-10 10 1.05 7 6 9 1.15 7 11-15 3 0.32 13.33 12 15 1.53 13 16-20 2 0.21 16 16 16 0 16 21-25 1 0.11 25 25 25 25 31-35 1 0.11 31 31 31 31 41-45 1 0.11 41 41 41 41 216-220 1 0.11 218 218 218 218 total 950 100 0.68 0 218 7.43 0 information technology and libraries march 2022 researchgate metrics’ behavior | valizadeh-haghi, nasibi-sis, shekofteh, and rahmatizadeh 6 additionally, in answer to other researchers’ questions, most faculty members (88.42%) did not have any activity. the highest level of activity in this metric was done by 0.11% of them (see table 4). the findings demonstrated that most faculty members had followers, and only 0.74% had no followers (see table 5). table 5. 
the number of followers of sbmu faculty members

followers | members | % | mean | min | max | sd | median
0 | 7 | 0.74 | 0 | 0 | 0 | 0 | 0
1-50 | 654 | 68.84 | 21.51 | 1 | 50 | 13.58 | 20
51-100 | 146 | 15.37 | 69.36 | 51 | 99 | 13.76 | 65.5
101-150 | 66 | 6.95 | 121 | 102 | 150 | 14.53 | 119
151-200 | 39 | 4.11 | 171.92 | 151 | 198 | 14.38 | 169
201-250 | 20 | 2.11 | 223.60 | 202 | 246 | 13.29 | 223.5
>250 | 18 | 1.89 | 391.44 | 253 | 891 | 181.72 | 338
total | 950 | 100 | 53.05 | 0 | 891 | 72.17 | 31

the correlation between the rg metrics and the rg score was examined using the spearman correlation test. the publication-sharing metric had the highest correlation with the rg score (0.918, p-value < 0.001) and was therefore taken to have the greatest impact on it. the question-asking metric had the lowest correlation (0.11) with the rg score (see table 6).

table 6. the correlation between researchgate metrics and rg score of faculty members

rg score | publication | followers | question | answers
correlation coefficient | 0.918 | 0.774 | 0.11 | 0.185
p-value | < .001 | < .001 | .001 | < .001

the number of citations of the faculty members in the scopus database is presented in table 7. the findings showed that most faculty members had citations, and only 5.16% of them had not received any citations.

table 7.
number of citations of sbmu faculty members in scopus

citations | members | % | mean | min | max | sd | median
0 | 49 | 5.16 | 0 | 0 | 0 | 0 | 0
1-500 | 771 | 81.16 | 108.91 | 1 | 495 | 114.80 | 67
501-1000 | 68 | 7.16 | 694.06 | 503 | 997 | 148.87 | 690
1001-1500 | 29 | 3.05 | 1213.66 | 1016 | 1481 | 131.55 | 1180
1501-2000 | 14 | 1.47 | 1771.14 | 1594 | 1995 | 131.41 | 1741
2001-2500 | 9 | 0.95 | 2175.56 | 2029 | 2446 | 137.10 | 2192
2501-3000 | 2 | 0.21 | 2747 | 2686 | 2808 | 86.27 | 2747
3001-3500 | 2 | 0.21 | 3207.5 | 3007 | 3408 | 283.55 | 3207.5
3501-4000 | 2 | 0.21 | 3767 | 3582 | 3952 | 261.63 | 3767
4001-4500 | 1 | 0.11 | 4459 | 4459 | 4459 | - | 4459
4501-5000 | 1 | 0.11 | 4581 | 4581 | 4581 | - | 4581
6501-7000 | 1 | 0.11 | 6907 | 6907 | 6907 | - | 6907
19001-19500 | 1 | 0.11 | 19272 | 19272 | 19272 | - | 19272
total | 950 | 100 | 279.37 | 0 | 19272 | 817.14 | 83

the findings indicated that most faculty members had an h-index in scopus, and the mean of their h-index was 6.46 (see table 8).

table 8. h-index of sbmu faculty members

h-index | members | % | mean | min | max | sd | median
0 | 49 | 5.16 | 0 | 0 | 0 | 0 | 0
1-10 | 732 | 77.05 | 4.54 | 1 | 10 | 2.57 | 4
11-20 | 137 | 14.42 | 14.38 | 11 | 20 | 2.81 | 14
21-30 | 29 | 3.05 | 24.41 | 21 | 30 | 2.71 | 24
31-40 | 2 | 0.21 | 35.50 | 31 | 40 | 6.36 | 35.5
61-70 | 1 | 0.11 | 63 | 63 | 63 | - | 63
total | 950 | 100 | 6.46 | 0 | 63 | 5.96 | 5

the correlation between researchgate indicators and scopus citation metrics is presented in table 9.

table 9. correlation between researchgate indicators and scopus citation metrics

indicator | h-index: p-value | h-index: correlation coefficient | citation: p-value | citation: correlation coefficient
rg score | < .001 | 0.803 | < .001 | 0.791
publication | < .001 | 0.735 | < .001 | 0.715
question | .006 | 0.09 | .019 | 0.076
answers | < .001 | 0.147 | < .001 | 0.119
followers | < .001 | 0.694 | < .001 | 0.676

the findings showed a positive and significant relationship between the rg score and scopus citation metrics (p-value < 0.05). additionally, all four rg metrics had a positive and significant relationship with both scopus citation metrics, citations and the h-index (p-value < 0.05).
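as a rough illustration of the statistic behind tables 6 and 9, the sketch below computes a spearman rank correlation in plain python: both variables are converted to average ranks (tied values share the mean of their positions), and the pearson correlation of the two rank vectors is returned. the function names and the sample values are hypothetical and are not taken from the study's data; the authors report using spss, not this code.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman coefficient = Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / sqrt(var_x * var_y)


# Hypothetical toy data (NOT from the study): shared publications and citations.
publications = [3, 10, 25, 40, 40, 120]
citations = [0, 15, 60, 55, 300, 900]
rho = spearman_rho(publications, citations)
```

because only the ordering of the values matters, a few extreme profiles (such as the single member with 19272 citations) do not dominate the coefficient, which is one reason a rank-based measure suits data that are not normally distributed.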
the findings showed that the number of shared publications had the highest correlation with citations (0.715) and h-index (0.735) compared to other rg metrics. discussion researchgate’s mission is to link the academic world and make research accessible to all scholars. this study has compared the rg metrics of sbmu faculty members. the major findings are highlighted and discussed around the four research questions of this study. the findings of the present study revealed that even though more than half of the faculty members have profiles in researchgate and are active in this network, compared to the findings of other studies, such as those of yousuf ali, janmohammadi, rahmani and nikkar, this rate is low.25 this issue may be due to the lack of knowledge and familiarity of faculty members with the researchgate social network or the lack of the need to publish outputs on the researchgate social network, which needs further investigation. the present study results also showed that the mean rg score of sbmu faculty members is similar to the results of other studies conducted in iran and other international studies.26 the current study results indicated that the subjects’ activity in the four rg metrics was slight in some indicators, and the highest activity was related to publications metric. the lowest level of members’ activity was related to the asking-questions metric. considering that the rg score results from the scores obtained by the researcher in the four rg metrics, this study’s research results confirm that the faculty members did not pay enough attention to the activity in all the rg metrics. the present findings showed that faculty members have limited activities in sharing publications, which is aligned with the results from other studies.27 this could be due to several reasons. firstly, young faculty members who have recently joined the university as faculty members may have fewer publications in comparison with older members. 
another reason may be that faculty members who have recently joined the researchgate social network have not had enough time to share all of their publications. it should be noted that sharing publications on researchgate has major ramifications for the open access movement. one reason authors do not publish on rg may be that they do not have the rights to do so. in this regard, it is worth mentioning that the publication-sharing metric covers both full-text sharing and abstract sharing, and sharing the abstract is legal. researchers were therefore free to share their abstracts, but they have not done so. the findings also show that most of the sbmu faculty members have no activity in two metrics: asking questions and answering questions. compared to other studies, the activity of sbmu faculty members in these metrics is at a lower level.28 possible reasons for this could be a lack of awareness of the importance of these metrics for increasing the rg score, a lack of english language proficiency to participate in asking and answering questions, and a lack of time. however, more research is needed in this area. the results showed that most of the sbmu faculty members have followers. the mean number of followers of the faculty members is similar to what was found in other studies.29 as the number of followers increases, a researcher’s popularity in their subject area increases. followers may even be influenced by the researcher’s studies in other areas and follow the researcher’s activities in researchgate, and as the number of followers grows, the citation rate may increase as well.30 therefore, it is recommended that the importance and role of each of the rg metrics in raising the rg score be explained to faculty members through posters, workshops, and educational brochures.
in this study, the correlation between each of the rg metrics and the rg score was examined using the spearman correlation test. the results revealed that the shared publication and number of followers metrics have a stronger correlation with the rg score than the question and answer metrics. the results also showed a significant correlation between all rg metrics and the rg score of sbmu faculty members. the study results indicated that most of the sbmu faculty members have citations in the scopus database and have an h-index, but most of them fall into the lowest citation range. according to the present study findings, the subjects have little activity in the researchgate social network, and this low activity may be one of the reasons for the low number of citations. research on surgeons’ publications has also confirmed this.31 nevertheless, there is a need for further research on the low number of citations of faculty members of sbmu. the present study’s findings demonstrated a significant relationship between the rg score and scopus citation metrics (h-index and number of citations). in this regard, the highest correlation was observed between the h-index and the rg score (correlation coefficient = 0.803, p-value < 0.001). this finding is consistent with other studies’ findings.32 there is also a significant relationship between each of the rg metrics and the scopus citation metrics. the highest correlations with the scopus citation metrics were observed for the publication metric and the lowest for the question and answer metrics (p-value < 0.05 in all cases). considering the positive relationship between each of the rg metrics and scopus citation metrics, it is suggested that faculty members pay enough attention to all of these metrics to help increase their citation indicators.
due to the researchgate social network’s role in increasing the visibility of researchers’ scientific outputs, faculty members can consider the use of this network as one of the tools to increase the number of citations and the h-index.

conclusion

easy access to research outputs and increased visibility are among the most important features of researchgate, which, according to the results of this study, have a significant impact on increasing the use of research outputs and thus increasing citations. as the results revealed, researchers’ participation in the researchgate social network is effective in increasing citation indicators, including the number of citations and the h-index. therefore, more activity in the researchgate social network, followed by receiving citations, can have favorable results in improving rankings for both research institutes and universities. universities can encourage faculty members to join and work in researchgate and other academic social networks by offering privileges that count toward their academic rank. libraries and research centers can explain the importance of faculty members’ activities in these networks by holding workshops on altmetric indicators and academic social network sites, especially researchgate. they can also explain to researchers the benefits of using this network and sharing more scientific outputs.

declaration of interest: none

funding: this work was supported by the school of allied medical sciences, shahid beheshti university of medical sciences, tehran, iran [grant number 28727]. the research ethics committee has approved this research under the ethical code number ir.sbmu.retech.rec.1400.310.

endnotes

1 bart penders, “ten simple rules for responsible referencing,” plos computational biology 14, no. 4 (2018), https://doi.org/10.1371/journal.pcbi.1006036; b. s.
lancho barrantes et al., “citation flows in the zones of influence of scientific collaborations,” journal of the american society of information science technology 63, no. 3 (2012): 481–89, https://doi.org/10.1002/asi.21682. 2 h. a. piwowar, r. s. day, and d. b. fridsma, “sharing detailed research data is associated with increased citation rate,” plos one 2, no. 3 (2007): e308, https://doi.org/10.1371/journal.pone.0000308. 3 stefano bortoli, paolo bouquet, and themis palpanas, “social networking: power to the people,” in papers presented in w3c workshop on the future of social networking position, january, barcelona (2009); brian kelly, “can linkedin and academia.edu enhance access to open repositories?”, impact of social sciences blog, 2012, https://blogs.lse.ac.uk/impactofsocialsciences/2012/08/23/linkedin-academia-enhanceaccess-to-open-repositories/. 4 bortoli, bouquet, and palpanas, “social networking.” 5 nicole muscanell and sonja utz, “social networking for scientists: an analysis on how and why academics use researchgate,” online information review 41, no. 5 (2017): 744–59, https://doi.org/10.1108/oir-07-2016-0185. 6 stevan harnad, “publish or perish—self-archive to flourish: the green route to open access,” ercim news 64 (2006), http://eprints.ecs.soton.ac.uk/11715/1/harnad-ercim.pdf. 7 vala ali rohani and siew hock ow, “eliciting essential requirements for social networks in academic environments,” in 2011 ieee symposium on computers & informatics (ieee, 2011):171–76, https://doi.org/10.1109/isci.2011.5958905. 
8 elena giglia, “academic social networks: it’s time to change the way we do research,” european journal of physical and rehabilitation medicine 47, no. 2 (2011): 345–49. 9 rohani and hock ow, “eliciting.” 10 peter kraker and elisabeth lex, “a critical look at the researchgate score as a measure of scientific reputation,” (paper, quantifying and analysing scholarly communication on the web (ascw15), oxford, uk, june 30, 2015): 7–9, https://doi.org/10.5281/zenodo.35401. 11 webometrics ranking of world universities 2021, https://www.webometrics.info/en/asia/iran%20%28islamic%20republic%20of%29. 12 cwts leiden ranking 2018, https://www.leidenranking.com/ranking/2018/list. 13 richard van noorden, “the science that’s never been cited,” nature 552 (2017): 162–64, https://doi.org/10.1038/d41586-017-08404-0; rishabh shrivastava and preeti mahajan, “an altmetric analysis of researchgate profiles of physics researchers: a study of university of delhi (india),” performance measurement and metrics 18, no. 1 (2017): 52–66, https://doi.org/10.1108/pmm-07-2016-0033. 14 dennis h. lui et al., “contemporary engagement with social media amongst hernia surgery specialists,” hernia 21, no. 4 (2017): 509–15, https://doi.org/10.1007/s10029-017-1609-8; n. k. sheeja and susan k.
mathew, “researchgate profiles of naval architecture scientists in india: an altmetric analysis,” library philosophy and practice (2019): 1–9, https://digitalcommons.unl.edu/libphilprac/2305/; muhammad yousuf ali and joanna richardson, “pakistani lis scholars’ altmetrics in researchgate,” program 51, no. 2, (2017):152–69, https://doi.org/10.1108/prog-07-2016-0052; preeti mahajan, har singh, and anil kumar, “use of snss by the researchers in india: a comparative study of panjab university and kurukshetra university,” library review 62, no. 8/9 (2013): 525–46; neil d. joshi et al., “social media in neurosurgery: using researchgate,” world neurosurgery 127 (2019): e950– e956, https://doi.org/10.1016/j.wneu.2019.04.007; maliheh nikkar, rahim alijani, and hamid ghazizadeh khalifeh mahaleh, “investigation of the presence of surgery researchers in research gate scientific network: an altmetrics study,” iranian journal of surgery 25, no. 2, (2017): 76–82; maryam rahmani et al., “rg score compared with h-index: a case study,” sciences and techniques of information management 4, no. 2 (2018): 61–76, http://stim.qom.ac.ir/article_1139_en.html. 
15 nikkar, alijani, and ghazizadeh khalifeh mahaleh, “investigation.” 16 rahmani et al., “rg score.” 17 sheeja and mathew, “researchgate.” 18 yousuf ali and richardson, “pakistani.” 19 mahajan, singh, and kumar, “use of snss”; joshi et al., “social media”; lui et al., “contemporary.” 20 joshi et al., “social media.” 21 rishabh shrivastava and preeti mahajan, “relationship amongst researchgate altmetric indicators and scopus bibliometric indicators: the case of panjab university chandigarh (india),” new library world (2015), https://doi.org/10.1108/nlw-03-2015-0017. 22 farahnaz naderbeigi and alireza isfandyari-moghaddam, “researchers’ scientific performance in researchgate: the case of a technology university,” library philosophy & practice (2018), https://digitalcommons.unl.edu/libphilprac/1752. 23 hamed nasibi-sis, saeideh valizadeh-haghi, and maryam shekofteh, “researchgate altmetric scores and scopus bibliometric indicators among lecturers,” performance measurement and metrics 22, no. 1 (2020): 15–24, https://doi.org/10.1108/pmm-04-2020-0020. 24 iranian scientometrics information database, https://isid.research.ac.ir/, accessed december 26, 2021.
25 yousuf ali and richardson, “pakistani”; maryam janmohammadi, maryam rahmani, and zahra rootan, “review of rg indices and ranking of researchers in research gate: case study: faculty of veterinary medicine, university of tehran,” in proceedings international interactive information retrieval conference (tehran, 2017); rahmani et al., “rg score”; nikkar, alijani, and ghazizadeh khalifeh mahaleh, “investigation.” 26 rahmani et al., “rg score”; naderbeigi and isfandyari-moghaddam, “researchers”; janmohammadi, rahmani, and rootan, “review of rg”; shrivastava and mahajan, “an altmetric analysis”; shrivastava and mahajan, “relationship.” 27 shrivastava and mahajan, “an altmetric analysis”; shrivastava and mahajan, “relationship”; janmohammadi, rahmani, and rootan, “review of rg.” 28 shrivastava and mahajan, “an altmetric analysis”; shrivastava and mahajan, “relationship.” 29 shrivastava and mahajan, “an altmetric analysis”; shrivastava and mahajan, “relationship”; naderbeigi and isfandyari-moghaddam, “researchers”; janmohammadi, rahmani, and rootan, “review of rg.” 30 shrivastava and mahajan, “an altmetric analysis.” 31 nikkar, alijani, and ghazizadeh khalifeh mahaleh, “investigation.” 32 sheeja and mathew, “researchgate”; joshi et al., “social media”; shrivastava and mahajan, “relationship”; naderbeigi and isfandyari-moghaddam, “researchers.”

letter from the editor kenneth j.
varnum information technology and libraries | december 2019 1 https://doi.org/10.6017/ital.v38i4.11923 earlier this fall, i had the privilege of participating in the sharjah library conference, a three-day event hosted by the sharjah book authority in the united arab emirates with programming coordinated by the ala international relations office. the experience of meeting with so many librarians from cultures different from my own was truly rewarding and enriching. it was both refreshing and invigorating to see, first-hand, the global importance of the local matters that occupy so much of my professional life. i returned to my regular job with a newfound appreciation for how universal the issues i spend so much of my professional time on (information access, equity, user experience, and the like) really are. it is easy to get lost in the weeds of my own circumstances and environment, and sometimes difficult to look up and explore what colleagues, known and unknown, are doing and thinking. the experience reinforced the importance of open access publications such as information technology and libraries. while “open access” doesn’t remove every possible barrier to accessing the knowledge, experience, and lessons contained within its virtual covers, it does remove the all-important paywall. and that is no small thing, in a community of library technologists who interact and exchange information through social media, email, and other tools. our open access status gives this journal a vibrant platform for sharing knowledge, experience, and expertise with all who seek it. i hope you find this issue’s contents useful and informative, and will share the items you find most important with your peers at your institutions and beyond. i invite you to add your own knowledge and experience to our collective wisdom through a contribution to the journal. for more details, see the about the journal page or get in touch with me. sincerely, kenneth j.
varnum, editor varnum@umich.edu december 2019 https://www.sibfala.com/program http://www.sba.gov.ae/ http://www.ala.org/aboutala/offices/iro https://ejournals.bc.edu/index.php/ital/about mailto:varnum@umich.edu information retrieval using a middleware approach danijela boberić krstićev information technology and libraries | march 2013 54 abstract this paper explores the use of a mediator/wrapper approach to enable the search of an existing library management system using different information retrieval protocols. it proposes an architecture for a software component that will act as an intermediary between the library system and search services. it provides an overview of different approaches to add z39.50 and search/retrieval via url (sru) functionality using a middleware approach that is implemented on the bisis library management system. that wrapper performs transformation of contextual query language (cql) into lucene query language. the primary aim of this software component is to enable search and retrieval of bibliographic records using the sru and z39.50 protocols, but the proposed architecture of the software components is also suitable for inclusion of the existing library management system into a library portal. the software component provides a single interface to server-side protocols for search and retrieval of records. additional protocols could be used. this paper provides practical demonstration of interest to developers of library management systems and those who are trying to use open-source solutions to make their local catalog accessible to other systems. introduction information technologies are changing and developing very quickly, forcing continual adjustment of business processes to leverage the new trends. these changes affect all spheres of society, including libraries. 
there is a need to add new functionality to existing systems in ways that are cost effective and do not require major redevelopment of systems that have achieved a reasonable level of maturity and robustness. this paper describes how to extend an existing library management system with new functionality supporting easy sharing of bibliographic information with other library management systems. one of the core services of library management systems is support for shared cataloging. this service consists of the following activities: when processing a new bibliographic unit, a librarian first checks whether the unit has already been recorded in another library in the world. if it is found, the librarian stores that electronic record in his/her local database of bibliographic records. to enable those activities, a standard way of communication between different library management systems must exist. currently, the well-known standards in this area are z39.50¹ and sru.²

danijela boberić krstićev (dboberic@uns.ac.rs) is a member of the department of mathematics and informatics, faculty of sciences, university of novi sad, serbia.

in this paper, a software component that integrates services for retrieval of bibliographic records using the z39.50 and sru standards is described. the main purpose of that component is to encapsulate the server sides of the appropriate protocols and to provide a unique interface for communication with the existing library management system. the same interface may be used regardless of which protocols are used for communication with the library management system. in addition, the software component acts as an intermediary between two different library management systems. the main advantage of the component is that it is independent of the library management system with which it communicates.
also, the component could be extended with new search and retrieval protocols. by using the component, the functionality of existing library management systems would be improved and redevelopment of the existing system would not be necessary. it means that the existing library management system would just need to provide an interface for communication with that component. that interface can even be implemented as an xml web service. standards used for search and retrieval the z39.50 standard was one of the first standards that defined a set of services to search for and retrieve data. the standard is an abstract model that defines communication between the client and server and does not go into details of implementation of the client or server. the model defines abstract prefixes used for search that do not depend on the implementation of the underlying system. it also defines the format in which data can be exchanged. the z39.50 standard defines query language type-1, which is required when implementing this standard. the z39.50 standard has certain drawbacks that new generation of standards, like sru, is trying to overcome. sru tries to keep functionality defined by z39.50 standard, but to allow its implementation using current technologies. one of the main advantages of the sru protocol, as opposed to z39.50, is that it allows messages to be exchanged in a form of xml documents, which was not the case with the z39.50 protocol. the query language used in sru is called contextual query language (cql).3 the sru standard has two implementations, one in which search and retrieval is done by sending messages via the hypertext transfer protocol (http) get and post methods (sru version) and the other for sending messages using the simple object access protocol (soap) (srw version). 
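to make the http get variant concrete, the sketch below builds an sru searchRetrieve url in which the request is carried as parameter/value pairs and the search itself is a cql string. the parameter names (operation, version, query, startRecord, maximumRecords) come from the sru specification; the base url, the helper name build_sru_url, and the cql index dc.title are assumptions chosen for illustration and do not describe any particular server.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_sru_url(base_url, cql_query, start_record=1, maximum_records=10,
                  version="1.1"):
    """Build an SRU searchRetrieve URL (HTTP GET variant of the protocol)."""
    params = {
        "operation": "searchRetrieve",
        "version": version,
        "query": cql_query,  # CQL text; urlencode percent-escapes it
        "startRecord": start_record,
        "maximumRecords": maximum_records,
    }
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and CQL index, for illustration only.
url = build_sru_url("http://sru.example.org/catalog",
                    'dc.title = "library automation"')

# Decoding the query string recovers the original parameter/value pairs.
decoded = parse_qs(urlsplit(url).query)
```

an srw client would carry the same information inside a soap envelope instead of the url, so a helper like this one is only relevant to the sru variant.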
the main difference between sru and srw is in the way of sending messages.4 the srw version of the protocol packs messages in the soap envelope element, while the sru version of the protocol sends messages based on parameter/value pairs that are included in the url. another difference between the two versions is that the sru protocol uses only http for message transfer, while srw can use secure shell (ssh) and simple mail transfer protocol (smtp) in addition to http.

related work

a common approach for adding sru support to library systems, most of which already support the z39.50 search protocol,5 has been to use existing software architecture that supports the z39.50 protocol. simultaneously supporting both protocols is very important because individual libraries will not decide to move to the new protocol until it is widely adopted within the library community. one approach in the implementation of a system for retrieval of data using both protocols is to create two independent server-side components for z39.50 and sru, where both software components access a single database. this approach involves creating a server implementation from scratch without the utilization of existing architectures, which could be considered a disadvantage.

figure 1. software architecture of a system with separate implementations of server-side protocols

this approach is good if there is an existing z39.50 or sru server-side implementation, or if there is a library management system that, for example, supports just the z39.50 protocol but has open source code and allows changes that would permit the development of an sru service. the system architecture that is based on this approach is shown in figure 1 as a unified modeling language (uml) component diagram.
in this figure, the software components that constitute the implementation of the client and the server side for each individual protocol are clearly separated, while the database is shared. the main disadvantage of this approach is that adding support for new search and retrieval protocols requires the transformation of the query language supported by that new protocol into the query language of the target system. for example, if the existing library management system uses a relational database to store bibliographic records, then for every new protocol added, its query language must be transformed into the structured query language (sql) supported by the database.

[figure 1 components: z39.50 client side, sru client side, z39.50 server side, sru server side, database; interfaces: zservice, sruservice, jdbc]

however, in most commercial library management systems that support server-side z39.50, local development and maintenance of additional services may not be possible due to the closed nature of the systems. one of the solutions in this case would be to create a so-called “gateway” software component that implements both an sru server and a z39.50 client, used to access the existing z39.50 server. that is, if an sru client application sends a search request, the gateway will accept that request, transform it into a z39.50 request, and forward the request to the z39.50 server. similarly, when the gateway receives a response from the z39.50 server, the gateway will transform this response into an sru response and forward it to the client. in this way, the client will have the impression that it communicates directly with the sru server, while the existing z39.50 server will think that it sends its response directly to a z39.50 client. figure 2 presents a component diagram that represents the architecture of the system that is based on this approach.

figure 2.
software architecture of a system with a gateway

[figure 2 components: sru client side, gateway (sru server side, srutoz3950converter, z39.50 client side), z39.50 server side, database; interfaces: sruservice, zservice, jdbc]

the software architecture shown in figure 2 is one of the most common approaches and is used by the library of congress (lc),6 which uses the commercial voyager7 library information system, which allows searching by the z39.50 protocol. in order to support search of the lc database using sru, indexdata8 developed the yazproxy software component,9 which is an sru/z39.50 gateway. the same idea10 was used in the implementation of “the european library”11 portal, which aims to provide integrated access to the major collections of all the european national libraries. another interesting approach in designing software architecture for systems dealing with retrieval of information can be observed in the systems involved in searching heterogeneous information sources. the architecture of these systems is shown in figure 3. the basic idea in most of these systems is to provide the user with a single interface to search different systems. this means that there is a separate component that will accept a user query and transform it into a query that is supported by the specific system component that offers search and data retrieval. this component is also known as a mediator. a separate wrapper component must be created for each system to be searched, to convert the user's query to a query that is understood by the particular target system.12

figure 3. architecture with the mediator/wrapper approach

figure 3 shows a system architecture that enables communication with three different systems (system1, system2, and systemn), each of which may use a different query language and therefore needs a different wrapper component (wrapper1, wrapper2, and wrappern).
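the mediator/wrapper idea in figure 3 can be sketched in a few lines. in this illustration every class, method, and query syntax is hypothetical: the mediator accepts a query in a toy uniform form (a field and a term), each wrapper converts it to the syntax of its own target system, and the mediator merges the answers. real systems would replace the in-memory lists with calls to the underlying catalogs.

```python
from abc import ABC, abstractmethod


class Wrapper(ABC):
    """Translates a uniform query for one target system and runs it there."""

    @abstractmethod
    def convert(self, field, term):
        """Return the query in the target system's own (concrete) syntax."""

    @abstractmethod
    def search(self, field, term):
        """Return matching records as a list of dicts."""


class InMemoryWrapper(Wrapper):
    """Stand-in for a real system: records are kept in a plain list."""

    def __init__(self, name, template, records):
        self.name = name
        self.template = template  # concrete syntax, e.g. '{field}:{term}'
        self.records = records

    def convert(self, field, term):
        return self.template.format(field=field, term=term)

    def search(self, field, term):
        native_query = self.convert(field, term)  # what would be sent onward
        hits = [r for r in self.records
                if term.lower() in r.get(field, "").lower()]
        # Tag each hit with its source so callers can tell systems apart.
        return [dict(r, source=self.name, native_query=native_query)
                for r in hits]


class Mediator:
    """Single entry point: forwards one query to every registered wrapper."""

    def __init__(self, wrappers):
        self.wrappers = list(wrappers)

    def search(self, field, term):
        results = []
        for wrapper in self.wrappers:
            results.extend(wrapper.search(field, term))
        return results


# Two hypothetical systems with different concrete query languages.
system1 = InMemoryWrapper("system1", "{field}:{term}",
                          [{"title": "Library Automation Basics"}])
system2 = InMemoryWrapper("system2", '{field} = "{term}"',
                          [{"title": "Automation in Archives"},
                           {"title": "Cataloging"}])
mediator = Mediator([system1, system2])
found = mediator.search("title", "automation")
```

adding a further source in this sketch means writing one more wrapper; the mediator and its callers stay unchanged, which is the property that makes the approach attractive for federated search.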
[figure 3 component labels: client, mediator, uniform query language, wrapper1, wrapper2, wrappern, converter1, converter2, convertern, concrete query language1, concrete query language2, concrete query languagen, system1, system2, systemn]

in this architecture, each system can itself be a new mediator component that interacts with other systems. that is, a wrapper component can communicate with a system or with another mediator. the role of the mediator is to accept the request defined by the user and send it to all wrapper components. each wrapper component knows how to transform the query sent by the mediator into a query that is supported by the target system with which the wrapper communicates. in addition, the wrapper has to transform data received from the target system into a format prescribed by the mediator. communication between client applications and the mediator may be through one of the protocols for search and retrieval of information, for example the sru or z39.50 protocols, or it may be through standard http.

systems whose architecture is based on the mediator/wrapper approach are described in several papers. coiera et al. (2005)13 describe the architecture of a system for federated search of journals in the field of medicine, using an internal query language called unified query language (uql). for each information source with which the system communicates, a wrapper was developed to translate queries from uql into the native query language of the source. the wrapper also has the task of returning search results to the mediator. those results are returned as an xml document in a defined internal format called unified response language (urel). as an alternative to these purpose-built languages (uql and urel), the cql query language and the sru protocol could be used.
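the mediator/wrapper flow described above can be sketched in a few lines of java. this is only an illustration under stated assumptions, not the architecture of any system cited here: the names Wrapper, InMemoryWrapper and Mediator are hypothetical, queries and records are plain strings, and the "target system" is an in-memory list searched by substring.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// hypothetical wrapper contract: accept the mediator's uniform query, return records
interface Wrapper {
    List<String> search(String uniformQuery);
}

// toy wrapper around an in-memory "target system"; both converters are assumptions
class InMemoryWrapper implements Wrapper {
    private final Function<String, String> toNativeQuery;    // uniform query -> native query
    private final Function<String, String> toMediatorFormat; // native record -> mediator format
    private final List<String> nativeRecords;

    InMemoryWrapper(Function<String, String> toNativeQuery,
                    List<String> nativeRecords,
                    Function<String, String> toMediatorFormat) {
        this.toNativeQuery = toNativeQuery;
        this.nativeRecords = nativeRecords;
        this.toMediatorFormat = toMediatorFormat;
    }

    @Override
    public List<String> search(String uniformQuery) {
        String nativeQuery = toNativeQuery.apply(uniformQuery);
        List<String> hits = new ArrayList<>();
        for (String record : nativeRecords) {
            if (record.contains(nativeQuery)) {           // stand-in for the target system's search
                hits.add(toMediatorFormat.apply(record)); // convert the result for the mediator
            }
        }
        return hits;
    }
}

// the mediator forwards the same uniform query to every registered wrapper
class Mediator {
    private final List<Wrapper> wrappers = new ArrayList<>();

    void register(Wrapper wrapper) {
        wrappers.add(wrapper);
    }

    List<String> search(String uniformQuery) {
        List<String> merged = new ArrayList<>();
        for (Wrapper wrapper : wrappers) {
            merged.addAll(wrapper.search(uniformQuery));
        }
        return merged;
    }
}
```

supporting an additional source in this sketch means registering one more wrapper; the mediator itself does not change, which is the property the approach is chosen for.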
another example of the use of mediators is described by cousins and sanders (2006),14 who address interoperability issues in cross-database access and suggest how to incorporate a virtual union catalogue into the wider information environment through the application of middleware, using the z39.50 protocol to communicate with underlying sources.

software component for services integration

this paper describes a software component that enables the integration of services for search and retrieval of bibliographic records into an existing library system. the main idea is that the component should be modular and flexible, in order to allow the addition of new search protocols and easy integration into the existing system. based on the papers analyzed in the previous section, it was concluded that a mediator/wrapper approach would work best. the architecture of a system that includes the component and allows search and retrieval of bibliographic records from other library systems is shown in figure 4.

figure 4. architecture of system for retrieval of bibliographic records
[figure 4 component labels: z39.50 client, sru client, z39.50 server, sru server, intermediary (mediator, wrapper), recordmanager, library information system]

in figure 4, the central place is occupied by the intermediary component, which consists of a mediator component and a wrapper component. this component is an intermediary between the search service and an existing library system. the library system provides an interface (recordmanager) that is responsible for returning records that match the received query. figure 4 also shows the client applications that use specific protocols for communication (sru and z39.50), as well as the components that represent the server-side implementation of those protocols.
this paper will not describe the architecture of the components that implement the server side of the z39.50 and sru protocols, primarily because many open-source solutions15 already implement those components and can easily be connected to this intermediary component. in order to test the intermediary component, we used the server side of the z39.50 protocol developed through the jafer project;16 for the sru server side, we developed a special web service in the java programming language. in further discussion, it is assumed that the intermediary component receives queries from server-side z39.50 and sru services, and that this component does not contain any implementation of these protocols.

the mediator component, which is part of the intermediary component, must accept queries sent by the server-side search and retrieval services. the mediator component uses its own internal representation of queries, so received queries must be transformed into that internal representation. after that, the mediator establishes communication with the wrapper component, which is in charge of executing queries in the existing library system. the basic role of the wrapper component is to transform queries received from the mediator into queries supported by the library system. after executing the query, the wrapper sends the search results to the mediator as an xml document. before sending those results to the server side of the protocol, the mediator must transform them into the format that was requested by the client.

mediator software component

the mediator is a software component that provides a single interface for different client applications. in this study, as shown in figure 4, a slightly different solution was selected.
instead of communicating directly with the client application, which in the case of data exchange protocols is the client side of the protocol, the mediator communicates with the server components that implement the appropriate protocols, and the client application exchanges messages with the corresponding server side of the protocol. the z39.50 client exchanges messages with the appropriate z39.50 server, and that server communicates with the mediator component. the same applies when the sru protocol is used. it is important to emphasize that the z39.50 and sru servers communicate with the mediator through a unified interface, represented in figure 5 by the class mediatorservice. in this way the same method is used to submit the query and receive results, regardless of which protocol is used. that means that our system becomes more scalable and that it is possible to add new search and retrieval protocols without refactoring the mediator component.

figure 5 shows the uml class diagram that describes the mediator software component. the mediatorservice class is responsible for communication with the server side of the z39.50 and sru protocols. this class accepts queries from the server side of the protocols and returns bibliographic records in the format defined by the server. the mediator can accept queries defined in different query languages. its task is to transform these queries into an internal query language, in which they are forwarded to the wrapper component. in this implementation, accepted queries are transformed into an object representation of cql, as defined by the sru standard. one of the reasons for choosing cql is that concepts defined in the z39.50 standard query language can be easily mapped to the corresponding concepts defined by cql. cql is semantically rich, so it can be used to create various types of queries.
also, because it is based on the concept of context sets, it is extensible and allows the use of various types of context sets for different purposes. cql is therefore not limited to searching bibliographic material. it could, for example, be used for searching geographical data. accordingly, it was assumed that cql is a general query language and that probably any query language could be transformed into it. in this implementation, the object model of a cql query defined in the cql-java project17 was used. if a new query language has to be supported, it would be necessary to map the new query language into cql or to extend the object model of cql with new concepts. the current implementation of the mediator component can transform two different types of queries into the cql object model: type-1 queries (used by z39.50) and cql queries. to add a new query language, it would only be necessary to add a new class that implements the interface queryconverter shown in figure 5; the architecture of the mediator component remains the same.

figure 5. uml class diagram of mediator component

one task of the mediator component is to return records in the format that was requested by the client that sent the request. because the mediator communicates with the z39.50 and sru server side, it is the task of the z39.50 and sru server side to check whether the format that the client requires is supported by the underlying system. if it is not supported, the request is not sent to the mediator. otherwise, the mediator ensures the transformation of retrieved records into the chosen format.
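the way query converters plug into the mediator can be sketched as follows. this is a minimal sketch under stated assumptions rather than the article's implementation: CqlTerm is a simplified stand-in for the cql-java object model (a single index/relation/term triple), Type1Query is a toy stand-in for a z39.50 type-1 query that carries one bib-1 use attribute, the accepts method is a hypothetical helper that does not appear in figure 5, and the mapping of use attributes 4, 1003 and 21 (title, author, subject in bib-1) to dublin core indexes is illustrative.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// simplified stand-in for the cql-java object model (assumption)
final class CqlTerm {
    final String index;
    final String relation;
    final String term;

    CqlTerm(String index, String relation, String term) {
        this.index = index;
        this.relation = relation;
        this.term = term;
    }
}

// toy stand-in for a z39.50 type-1 query with a single bib-1 use attribute (assumption)
final class Type1Query {
    final int useAttribute;
    final String term;

    Type1Query(int useAttribute, String term) {
        this.useAttribute = useAttribute;
        this.term = term;
    }
}

interface QueryConverter {
    boolean accepts(Object query);      // hypothetical helper, not part of figure 5
    CqlTerm parseQuery(Object query);
}

// handles only the simplest cql form: index = term
class CqlStringConverter implements QueryConverter {
    public boolean accepts(Object query) {
        return query instanceof String;
    }

    public CqlTerm parseQuery(Object query) {
        String[] parts = ((String) query).split("=", 2);
        if (parts.length != 2) {
            throw new IllegalArgumentException("expected 'index = term'");
        }
        return new CqlTerm(parts[0].trim(), "=", parts[1].trim());
    }
}

class RpnConverter implements QueryConverter {
    // illustrative mapping from bib-1 use attributes to dublin core indexes
    private static final Map<Integer, String> USE_TO_INDEX =
            Map.of(4, "dc.title", 1003, "dc.creator", 21, "dc.subject");

    public boolean accepts(Object query) {
        return query instanceof Type1Query;
    }

    public CqlTerm parseQuery(Object query) {
        Type1Query q = (Type1Query) query;
        String index = USE_TO_INDEX.get(q.useAttribute);
        if (index == null) {
            throw new IllegalArgumentException("unmapped use attribute: " + q.useAttribute);
        }
        return new CqlTerm(index, "=", q.term);
    }
}

class MediatorService {
    private final List<QueryConverter> converters = new ArrayList<>();

    void addConverter(QueryConverter converter) {
        converters.add(converter);
    }

    // pick the first converter that understands the incoming query
    CqlTerm toInternal(Object query) {
        for (QueryConverter converter : converters) {
            if (converter.accepts(query)) {
                return converter.parseQuery(query);
            }
        }
        throw new IllegalArgumentException("unsupported query type");
    }
}
```

in this sketch a cql string and a type-1 query for the same title end up as the same internal term, so everything downstream of the mediator only ever sees one representation.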
the mediator obtains bibliographic records from the wrapper in the form of an xml document that is valid according to the appropriate xml schema.18 the xml schema allows the creation of an xml document describing bibliographic records according to the unimarc19 or marc2120 format. the current implementation of the mediator component supports transformation of bibliographic records into an xml document that is an instance of the unimarcslim xml schema,21 the marc21slim xml schema,22 or the dublin core xml schema.23 adding support for a new format would require creating a new class that extends the class recordserializer (figure 5). because this mediator component works with xml, the transformation of bibliographic records into a new format could also be done by using extensible stylesheet language transformations (xslt).

[figure 5 class diagram contents]
mediatorservice: getrecords(object query, string format) : string[]
wrapper: executequery(cqlnode cqlquery) : string[]
queryconverter: parsequery(object query) : cqlnode (implemented by cqlstringconverter and rpnconverter)
recordserializer: serialize(string r) : string (extended by marc21serializer, dublincoreserializer, and unimarcserializer)

wrapper software component

the wrapper software component is responsible for communication between the mediator and the existing library system. that is, the wrapper component is responsible for transforming the cql object representation into a concrete query that is supported by the existing library system and for obtaining results that match the query. the implementation of the wrapper component depends directly on the architecture of the existing library system. figure 7 proposes a possible architecture of the wrapper component.
this proposed architecture assumes that the existing library system provides some kind of service that the wrapper component can use to send the query and obtain results. the recordmanager interface in figure 7 is an example of such a service. recordmanager has two operations: one executes the query and returns the hits, and the other returns the bibliographic records. this proposed solution is useful for libraries that use a library management system that can be extended. it may not be appropriate for libraries using an “off the shelf” library management system that cannot be extended.

the proposed architecture of the wrapper component is based on the strategy design pattern,24 primarily because of the need to transform the cql query into a query that is supported by the library system. according to the cql concept of context sets, all prefixes that can be searched are grouped in context sets, and these sets are registered with the library of congress. the concept of context sets enables specific communities and users to define their own prefixes, relations, and modifiers without fear that a name will clash with the name of a prefix defined in another set. that is, it is possible to define two prefixes with the same name, but they belong to different sets and therefore have different semantics. cql makes it possible to combine, in a single query, elements that are defined in different context sets. when parsing a query, it is necessary to check which context set a particular item belongs to and then to apply the appropriate mapping of the element from the context set to the corresponding element defined by the query language used in the library system.

the strategy design pattern belongs to the group of patterns that describe the behavior of objects (behavioral patterns), which determine the responsibility of each object and the way in which objects communicate with each other.
the main task of the strategy pattern is to enable easy adjustment, at runtime, of the algorithm that is applied by an object. the strategy pattern defines a family of algorithms, each of which is encapsulated in a single object. figure 6 shows a class diagram from the book “design patterns: elements of reusable object-oriented software,”25 which describes the basic elements of the strategy pattern.

figure 6. strategy design pattern

the basic elements of this pattern are the classes context, strategy, concretestrategya and concretestrategyb. the class context is in charge of choosing and changing algorithms by creating an instance of the appropriate class that implements the interface strategy. the interface strategy contains the method algorithminterface(), which all classes that implement that interface must implement. the class concretestrategya implements one concrete algorithm.

this design pattern is used when transforming cql queries primarily because cql queries can consist of elements that belong to different context sets, whose elements are interpreted differently. the classes context, strategy, cqlstrategy and dcstrategy, shown in figure 7, are the elements of the strategy pattern responsible for mapping concepts defined by cql. the class context is responsible for selecting the appropriate parsing strategy, depending on which context set the element that is going to be transformed belongs to. the classes cqlstrategy and dcstrategy are responsible for mapping the elements belonging to the cql and dublin core context sets, respectively, into the appropriate elements of the particular query language used by the library system. the use of the strategy pattern makes it possible to change, at runtime, the algorithm that parses the query depending on which context set is used.
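the roles of context, strategy, cqlstrategy and dcstrategy can be illustrated with a short java sketch. it is a simplification under stated assumptions, not the article's code: only the index-mapping operation is shown (the parseoperand operation from figure 7 is omitted), the method name follows the spelling used in the figure, the dublin core field names ti, au and sb are taken from the article's mapping example, and the field name kw for cql.serverChoice is purely hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// one mapping algorithm per context set
interface Strategy {
    String mapIndexToUnderlayingPrefix(String index);
}

class DcStrategy implements Strategy {
    // dublin core indexes mapped to target-system fields (values from the article's example)
    private static final Map<String, String> FIELDS =
            Map.of("title", "ti", "creator", "au", "subject", "sb");

    public String mapIndexToUnderlayingPrefix(String index) {
        String field = FIELDS.get(index);
        if (field == null) {
            throw new IllegalArgumentException("unmapped dc index: " + index);
        }
        return field;
    }
}

class CqlStrategy implements Strategy {
    public String mapIndexToUnderlayingPrefix(String index) {
        // hypothetical rule: the cql.serverChoice index goes to a catch-all keyword field
        if ("serverChoice".equals(index)) {
            return "kw";
        }
        throw new IllegalArgumentException("unmapped cql index: " + index);
    }
}

// selects and swaps the active strategy at runtime, keyed by context set name
class Context {
    private final Map<String, Strategy> strategies = new HashMap<>();
    private Strategy current;

    Context() {
        strategies.put("cql", new CqlStrategy());
        strategies.put("dc", new DcStrategy());
    }

    void setStrategy(String contextSet) {
        Strategy strategy = strategies.get(contextSet);
        if (strategy == null) {
            throw new IllegalArgumentException("unsupported context set: " + contextSet);
        }
        current = strategy;
    }

    String mapIndexToUnderlayingPrefix(String index) {
        if (current == null) {
            throw new IllegalStateException("no strategy selected");
        }
        return current.mapIndexToUnderlayingPrefix(index);
    }
}
```

supporting another context set in this sketch would mean adding one more Strategy implementation and registering it in Context; the calling code stays the same.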
the described implementation of the wrapper component enables the parsing of queries that contain only elements that belong to the cql and/or the dublin core context set. in order to provide support for a new context set, a new implementation of the interface strategy (figure 7) would be required, including an algorithm to parse the elements defined by this new set.

figure 7. uml class diagram of wrapper component

integration of intermediary software components into the bisis library system

the bisis library system was developed at the faculty of science and the faculty of technical sciences in novi sad, serbia, and has had several versions since its introduction in 1993. the fourth and current version of the system is based on xml technologies. among the core functional units of bisis26 are:
• circulation of library material
• cataloging of bibliographic records
• indexing and retrieval of bibliographic records
• downloading bibliographic records through the z39.50 protocol
• creation of a card catalog
• creation of statistical reports

an intermediary software component has been integrated into the bisis system. the intermediary component was written in the java programming language and implemented as a web application. communication between the server applications that support the z39.50 and sru protocols and the intermediary component is done using the software package hessian.27 hessian offers a simple implementation of two protocols for communicating with web services, a binary protocol and its corresponding xml protocol, both of which rely on http.
use of the hessian package makes it easy to create a java servlet on the server side and a proxy object on the client side, which is used to communicate with the servlet. in this case, the proxy object is deployed on the server side of the protocol and the intermediary component contains a servlet. communication between the intermediary and bisis is also realized using the hessian software package, which makes it possible to create a distributed system, because the existing library system, the intermediary component, and the server applications that implement the protocols can be located on physically separate computers.

[figure 7 class diagram contents]
context: setstrategy(string strategy) : void; mapindextounderlayingprefix(string index) : string; parseoperand(string index, cqltermnode node) : object
strategy: mapindextounderlayingprefix(string index) : string; parseoperand(string underlayingpref, cqltermnode node) : object (implemented by cqlstrategy and dcstrategy)
recordmanager: select(object query) : int[]; getrecords(int hits[]) : string[]
wrapper: executequery(cqlnode cqlquery) : string[]; makequery(cqlnode cql, object underlayingquery) : object

the bisis library system uses the lucene software package for indexing and searching. lucene has defined its own query language,29 so the wrapper component that is integrated into bisis has to transform the cql query object model into the object representation of the query defined by lucene. the wrapper therefore first needs to determine which context set the index belongs to and then apply the appropriate strategy for mapping the index.
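the translation from a cql query tree to a lucene query can be sketched as a small recursive function. this is an illustrative sketch, not the bisis wrapper: Node, TermNode and BooleanNode are minimal stand-ins for the cql-java node classes, the output is a string in lucene's classic query syntax (field:"term" combined with AND, OR, NOT) rather than lucene query objects, only double quotes inside terms are escaped, and the mapping rules passed to the constructor (including field names such as ti and au) are supplied by the caller.

```java
import java.util.Locale;
import java.util.Map;

// minimal stand-ins for the cql-java node classes (assumption, not the library's api)
interface Node {
}

final class TermNode implements Node {
    final String index; // qualified index, for example dc.title
    final String term;

    TermNode(String index, String term) {
        this.index = index;
        this.term = term;
    }
}

final class BooleanNode implements Node {
    final String operator; // and, or, not
    final Node left;
    final Node right;

    BooleanNode(String operator, Node left, Node right) {
        this.operator = operator;
        this.left = left;
        this.right = right;
    }
}

class LuceneQueryBuilder {
    // context set -> (index -> lucene field)
    private final Map<String, Map<String, String>> rules;

    LuceneQueryBuilder(Map<String, Map<String, String>> rules) {
        this.rules = rules;
    }

    String toLucene(Node node) {
        if (node instanceof BooleanNode) {
            BooleanNode b = (BooleanNode) node;
            String op = b.operator.toLowerCase(Locale.ROOT);
            if (!op.equals("and") && !op.equals("or") && !op.equals("not")) {
                throw new IllegalArgumentException("unsupported boolean operator: " + b.operator);
            }
            // lucene's classic syntax expects upper-case boolean operators
            return "(" + toLucene(b.left) + " " + op.toUpperCase(Locale.ROOT) + " " + toLucene(b.right) + ")";
        }
        TermNode t = (TermNode) node;
        // step 1: determine the context set from the qualified index
        int dot = t.index.indexOf('.');
        if (dot < 0) {
            throw new IllegalArgumentException("index must be qualified by a context set: " + t.index);
        }
        String contextSet = t.index.substring(0, dot);
        String name = t.index.substring(dot + 1);
        // step 2: apply that context set's mapping rules
        Map<String, String> fields = rules.get(contextSet);
        if (fields == null || !fields.containsKey(name)) {
            throw new IllegalArgumentException("no mapping for " + t.index);
        }
        return fields.get(name) + ":\"" + t.term.replace("\"", "\\\"") + "\"";
    }
}
```

with rules that map dc.title to ti and dc.creator to au, the cql query dc.title = hamlet and dc.creator = shakespeare would be rendered as (ti:"hamlet" AND au:"shakespeare").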
the rules for mapping an index to lucene fields are read from an xml document that is defined for every context set. listing 1 below provides an example of an xml document that contains some rules for mapping indexes of the dublin core context set to lucene index fields. the xml element index holds the name of the index that is going to be mapped, while the xml element mappingelement holds the name of the lucene field. for example, the title index defined in the dublin core context set, which denotes search by title of the publication, is mapped to the field ti, which is used by the search engine of the bisis system.

[listing 1 content (index, mappingelement): (title, ti), (creator, au), (subject, sb)]

listing 1. xml document with rules for mapping the dublincore context set

after the index is mapped to the corresponding field in lucene, a similar procedure is repeated for the relation, which may belong to some other context set or may have modifiers that belong to some other context set. in that case it is necessary to switch the current mapping strategy to a new one. by doing this, all elements of the cql query are converted into a lucene query, so the new query can be sent to bisis to be executed.

approximately 40 libraries in serbia currently use the bisis system, which includes a z39.50 client, allowing the libraries to search the collections of other libraries that support communication through the z39.50 protocol. with the intermediary component integrated in the bisis system, non-bisis libraries may now search the collections of libraries that use bisis. as a first step, the intermediary component has been integrated in only a few libraries, without any major problems. the component is most useful to the city libraries that use the bisis system, because they have many branches, which can now search and retrieve bibliographic records from their central libraries.
the component could potentially be used by other library management systems, assuming the presence of an appropriate wrapper component to transform cql to the target query language.

conclusion

this paper describes an independent, modular software component that enables the integration of a service for search and retrieval of bibliographic records into an existing library system. the software component provides a single interface to server-side protocols for searching and retrieving records, and could be extended to support additional server-side protocols. the paper describes the communication of this component with z39.50 and sru servers. the software component was developed for integration with the bisis library system, but it is an independent component that could be integrated into any other library system.

the proposed architecture of the software component is also suitable for including existing library systems in a single portal. the architecture of the portal should involve one mediator component whose task would be to communicate with the wrapper components of individual library systems. each library system would implement its own search and store functionalities and could function independently of the portal. the basic advantage of this architecture is that it is possible to include new library systems that provide search services. it is only necessary to add a new wrapper that performs the appropriate transformation of the query obtained from the mediator component into a query that the library system can process. the task of the mediator is to send queries to the wrappers, while each wrapper establishes communication with a specific library system. after obtaining the results from the underlying library systems, the mediator should be able to combine the results, remove duplicates, and sort the results. in this way, the end user would have the impression of searching a single database.

references

1.
“information retrieval (z39.50): application service definition and protocol specification,” http://www.loc.gov/z3950/agency/z39-50-2003.pdf (accessed february 22, 2013).
2. “search/retrieval via url,” http://www.loc.gov/standards/sru/.
3. “contextual query language – cql,” http://www.loc.gov/standards/sru/specs/cql.html.
4. eric lease morgan, “an introduction to the search/retrieve url service (sru),” ariadne 40 (2004), http://www.ariadne.ac.uk/issue40/morgan.
5. larry e. dixson, “yaz proxy installation to enhance z39.50 server performance,” library hi tech 27, no. 2 (2009): 277-285, http://dx.doi.org/10.1108/07378830910968227; mike taylor and adam dickmeiss, “delivering marc/xml records from the library of congress catalogue using the open protocols srw/u and z39.50,” (paper presented at world library and information congress: 71st ifla general conference and council, oslo, 2005).
6. mike taylor and adam dickmeiss, “delivering marc/xml records from the library of congress catalogue using the open protocols srw/u and z39.50,” (paper presented at world library and information congress: 71st ifla general conference and council, oslo, 2005).
7. “voyager integrated library system,” http://www.exlibrisgroup.com/category/voyager.
8. “indexdata,” http://www.indexdata.com/.
9. “yazproxy,” http://www.indexdata.com/yazproxy.
10. theo van veen and bill oldroyd, “search and retrieval in the european library,” d-lib magazine 10, no. 2 (2004), http://www.dlib.org/dlib/february04/vanveen/02vanveen.html.
11. “the european library,” http://www.theeuropeanlibrary.org./tel4/.
12. gio wiederhold, “mediators in the architecture of future information systems,” computer 25, no. 3 (1992): 38-49, http://dx.doi.org/10.1109/2/121508.
13. enrico coiera, martin walther, ken nguyen, and nigel h.
lovell, “architecture for knowledge-based and federated search of online clinical evidence,” journal of medical internet research 7, no. 5 (2005), http://www.jmir.org/2005/5/e52/.
14. shirley cousins and ashley sanders, “incorporating a virtual union catalogue into the wider information environment through the application of middleware: interoperability issues in cross-database access,” journal of documentation 62, no. 1 (2006): 120-144, http://dx.doi.org/10.1108/00220410610642084.
15. “sru software and tools,” http://www.loc.gov/standards/sru/resources/tools.html; “z39.50 registry of implementators,” http://www.loc.gov/z3950/agency/register/entries.html.
16. “jafer toolkit project,” http://www.jafer.org.
17. “cql-java: a free cql compiler for java,” http://zing/z3950.org/cql/java/.
18. bojana dimić, branko milosavljević and dušan surla, “xml schema for unimarc and marc 21 formats,” the electronic library 28, no. 2 (2010): 245-262, http://dx.doi.org/10.1108/02640471011033611.
19. “unimarc formats and related documentation,” http://www.ifla.org/en/publications/unimarc-formats-and-related-documentation.
20. “marc 21 format for bibliographic data,” http://www.loc.gov/marc/bibliographic/.
21.
“unimarcslim xml schema,” http://www.bncf.firenze.sbn.it/progetti/unimarc/slim/documentation/unimarcslim.xsd.
22. “marc21slim xml schema,” http://www.loc.gov/standards/marcxml/schema/marc21slim.xsd.
23. “dublincore xml schema,” http://www.loc.gov/standards/sru/resources/dc-schema.xsd.
24. erich gamma, richard helm, ralph johnson, and john vlissides, design patterns: elements of reusable object-oriented software (indianapolis: addison-wesley, 1994), 315-323.
25. ibid.
26. danijela boberić and branko milosavljević, “generating library material reports in software system bisis,” (proceedings of the 4th international conference on engineering technologies icet, novi sad, 2009); danijela boberić and dušan surla, “xml editor for search and retrieval of bibliographic records in the z39.50 standard,” the electronic library 27, no. 3 (2009): 474-495, http://dx.doi.org/10.1108/02640470910966916 (accessed february 22, 2013); bojana dimić and dušan surla, “xml editor for unimarc and marc21 cataloguing,” the electronic library 27, no. 3 (2009): 509-528, http://dx.doi.org/10.1108/02640470910966934 (accessed february 22, 2013); jelena rađenović, branko milosavljević and dušan surla, “modelling and implementation of catalogue cards using freemarker,” program: electronic library and information systems 43, no. 1 (2009): 63-76, http://dx.doi.org/10.1108/00330330934110 (accessed february 22, 2013); danijela tešendić, branko milosavljević and dušan surla, “a library circulation system for city and special libraries,” the electronic library 27, no. 1 (2009): 162-186, http://dx.doi.org/10.1108/02640470910934669.
27. “hessian,” http://hessian.caucho.com/doc/hessian-overview.xtp.
28. branko milosavljević, danijela boberić, and dušan surla, “retrieval of bibliographic records using apache lucene,” the electronic library 28, no. 4 (2010): 525-539, http://dx.doi.org/10.1108/02640471011065355.
acknowledgement

the work is partially supported by the ministry of education and science of the republic of serbia, through project no. 174023: “intelligent techniques and their integration into wide-spectrum decision support.”

public libraries leading the way
service barometers: using lending kiosks to locate patrons
william yarbrough
information technology and libraries | june 2021
https://doi.org/10.6017/ital.v40i2.13499

public libraries have been using lending kiosks for close to ten years now. typically, kiosks are used as a sort of satellite collection, delivering library services directly to the community. often, they target people with limited mobility or who lack reliable transportation. reaching these underserved populations helps expand a library’s service area and users, which can factor into state aid. but when amanda jackson first took over as director for the chesapeake public library, back in 2018, one of her first ideas involved using lending kiosks in a way that’s slightly unconventional.

chesapeake is the second largest city in the commonwealth of virginia, measuring at 350 square miles.
while considered suburban, the city is also plenty rural, with large areas of farmland, forest, swamps, and river. one of those areas is southern chesapeake. spanning roughly 130 square miles, southern chesapeake stretches all the way down to the border with north carolina. the closest library, however, is located on the northern end of the city, all the way up in great bridge. this library, which houses over 289,000 items (not to mention a law library and history room), is the largest in the city. more than 500,000 people visit it annually. still, for many of the 19,586 residents in southern chesapeake, it’s a bit of a hike.

for this size population, breaking ground on a new library branch would be warranted. especially since that population is growing. from 2018 to 2019, southern chesapeake saw its population increase by 1.62%, good for second highest among the city’s nine boroughs. but even a small library, of, say, 12,000 square feet, would cost $3 million, and that’s being conservative. justifying that big of an expense, both to the city council and local taxpayers, requires proving a return on investment.

“building a new library brings a lot of excitement and energy into the community,” jackson says. “but first, to support that decision, we need to better understand what south chesapeake needs from the library.”

the first person jackson turned to was maiko medina, who heads up cpl’s it division. like jackson, medina has worked in libraries for close to 20 years, first as frontline staff before transitioning to it. while with the neighboring virginia beach public library system, he helped install a variety of new systems, including self-checkout kiosks.

william yarbrough (wyarbrou@infopeake.org) is administrative assistant, chesapeake public library. © 2021.

together, medina and jackson came up with a plan that uses lending kiosks as a type of service barometer. cpl would install kiosks all around south chesapeake, at city parks, community
centers, police and fire stations, and local businesses. each kiosk would provide a selection of new and popular items from the library’s catalogue, which patrons could check out on the spot.

“by studying how the kiosks are used, we’ll get a better idea for where our patrons are,” medina says. “it’ll also tell us what they’re interested in.”

this plan was submitted as a capital project. at a proposed $113,000, the project would fund one initial lending kiosk, along with an accompanying holds locker. chesapeake city council approved this project for fiscal year 2020. installation of the first kiosk in southern chesapeake was then scheduled for 2021. however, when the covid-19 pandemic hit, jackson, medina and the rest of the team at cpl recognized the need to speed the plan into action.

“we kept hearing about how kids were struggling with virtual learning,” jackson says, “especially those in our underserved communities.”

one of those communities is south norfolk. among the neighborhood’s 22,851 residents, 59.2% identify as black. 31% are 19 or younger. of those children, 39% are enrolled at a title i school. as schools were forced to move learning online, many of these students fell behind, whether because they lacked reliable internet, access to a home computer, or both.

to meet this need, the neighboring dr. clarence v. cuffee library was transformed into an outreach and innovation center. along with a business center, maker spaces, stem walls, and a rotating art gallery, the new-and-improved cuffee library came with a student learning center. through this service, students can schedule one-on-one tutoring appointments with library staff and local college students, either virtual or in-person. of course, adding these new services required moving other things around.
books, dvds, and other materials were redistributed to other libraries across the system. this may seem like an odd decision (after all, what’s a library without books?). but patrons weren’t using this library for materials; in fact, the number of items checked out from the collection (17,922) was significantly lower than any of the other six branches. still, the library didn’t want to abandon those patrons who rely on cuffee for more traditional services, especially since many were likely stuck at home during the pandemic. to meet this need, the library chose not to wait until 2021. last october, shortly after the newly renovated dr. clarence v. cuffee outreach and innovation center opened, it secured additional funding through the cares act to install a lending kiosk right outside the center’s main entrance. with this kiosk, a lendit 200 (courtesy of d-tech international), patrons can check out from a rotating list of 200 items. they can also get any other item in the library’s collection by using the holdit locker (also from d-tech). both services are free and are available 24 hours a day, 7 days a week. so far, since last december, 136 items have been checked out through the cuffee lendit kiosk. another 304 have been checked out via the holds locker. among those check-outs, most popular are adult nonfiction dvds (33%) and adult fiction books (20%). as a result, more of these items will be rotated into the kiosk’s collection. not only that, but next month, the library will break ground on another, bigger lending kiosk. located at fire station 7, this kiosk will be the first lendit 500 installed in south chesapeake. “the lending kiosk has really helped us continue to serve our patrons during all the changes brought on over the past 18 months,” say both jackson and medina.
“now that things are starting to move ahead a little, we’re excited for how this technology will help us reach more of chesapeake.” over the next couple of years, chesapeake public library will use these lending kiosks to learn more about what the growing number of people in south chesapeake need from the library. maybe that’s more kiosks, a small storefront or even a full-sized, brick-and-mortar building. either way, new and innovative technologies like the lending kiosks will lead the way, helping cpl deliver services further into the community.

techniques for special processing of data within bibliographic text

paula goossens: royal library albert i, brussels, belgium.

an analysis of the codification practices of bibliographic descriptions reveals a multiplicity of ways to solve the problem of the special processing of certain characters within a bibliographic element. to obtain a clear insight into this subject, a review of the techniques used in different systems is given. the basic principles of each technique are stated, examples are given, and advantages and disadvantages are weighed. simple local applications as well as more ambitious shared cataloging projects are considered.

introduction

effective library automation should be based on a one-time manual input of the bibliographic descriptions, with multiple output functions. these objectives may be met by introducing a logical coding technique. the higher the requirements of the output, the more sophisticated the storage coding has to be. in most cases a simple identification of the bibliographic elements is not sufficient. the requirement of a minimum of flexibility in filing and printing operations necessitates the ability to locate certain groups of characters within these elements. it is our aim, in this article, to give a review of the techniques solving this last problem. as an introduction, the basic bibliographic element coding methods are roughly schematized in the first section.
according to the precision in the element identification, a distinction is made between two groups, called respectively field level and subfield level systems. the second section contains discussions on the techniques for special processing of data within bibliographic text. three basic groups are treated: the duplication method, the internal coding techniques, and the automatic handling techniques. the different studies are illustrated with examples of existing systems. for the field level projects we confined ourselves to some important german and belgian applications. in the choice of the subfield level systems, which are marc ii based, we tried to be more complete. most of the cited applications, for practical reasons, only concern the treatment of monographs. this cannot be seen as a limitation because the methods discussed are very general by nature and may be used for other material. each system which has recourse to different special processing techniques is discussed in terms of each of these techniques, enabling one to get a realistic overview of the problem. in the last section, a table of the systems versus the techniques used is given. the material studied in this paper provided us with the necessary background for building an internal coding technique in our internal processing format.

bibliographic element codification methods

field level systems

the most rudimentary projects of catalog automation are limited to a coarse division of the bibliographic description into broad fields. these are marked by special supplied codes and cover the basic elements of author, title, imprint, collation, etc. in some of the field level systems, a bibliographic element may be further differentiated according to a more specific content designation, or according to a function identification.
for instance, the author element can be split up into personal name and corporate name, or a distinction can be made between a main entry, an added entry, a reference, etc. this approach supports only the treatment of each identified bibliographic element as a whole for all necessary processing operations, filing and printing included. this explains why, in certain applications, some of the bibliographic elements are duplicated, under a variant form, according to the subsequent treatments reflected in the output functions. details on this will be discussed later. here we only mention as an example the deutsche bibliographie and the project developed at the university of bochum.1-4 it is evident that these procedures are limited in their possibilities and are not economical if applied to very voluminous bibliographic files. for this reason, at the same time, more sophisticated systems, using internal coding techniques, came into existence. these allow one to perform separate operations within a bibliographic element, based on a special indication of certain character strings within the text. as there is an overlap in the types of internal coding techniques used in the field level systems and in the subfield level systems, this problem will later be studied as a whole. we limit ourselves to citing some projects falling under this heading. as german applications we have the deutsche bibliographie and the bikas system.5 in belgium the programs of the quetelet fonds may be mentioned.6,7

[journal of library automation vol. 7/3, september 1974]

subfield level systems

in a subfield level system the basic bibliographic elements, separated into fields, are further subdivided into smaller logical units called subfields. for instance, a personal name is broken into a surname, a forename, a numeration, a title, etc. such a working method provides access to smaller logical units and will greatly facilitate the functions of extraction, suppression, and transposition.
thus, more flexibility in the processing of the bibliographic records is obtained. as is well known, the library of congress accomplished the pioneering work in developing the marc ii format: the communications format and the internal processing format.8-11 these will be called marc lc and a distinction between the two will only be made if necessary. the marc lc project originated in the context of a shared cataloging program and immediately served as a model in different national bibliographies and in public and university libraries. in this paper we will discuss bnb marc of the british national bibliography, the nypl automated bibliographic system of the new york public library, monocle of the library of the university of grenoble, canadian marc, and fbr (forma bibliothecae regiae), the internal processing format of the royal library of belgium.12-21 in order to further optimize the coding of a bibliographic description, the library of congress also provided for each field two special codes, called indicators. the function of these indicators differs from field to field. for example, in a personal name one of the indicators describes the type of name, to wit: forename, single surname, multiple surname, and name of family. some of the indicators may act as an internal code. in spite of the well-considered structuring of the bibliographic data in the subfield level systems, not all library objectives may yet be satisfied. to reduce the remaining limitations, some approaches similar to those elaborated in field level systems are supplied. some (nypl, marc lc internal format, and canadian marc) have, or will have, in a very limited way, recourse to a procedure of duplication of subfields or fields. all cited systems, except nypl, use to a greater or lesser degree internal coding techniques. finally some subfield level systems automatically solve certain filing problems by computer algorithms. this option was taken by nypl, marc lc, and bnb marc.
each of these methods will be discussed in detail in the next section.

techniques for special processing of data

methods for special treatment of words or characters within bibliographic text were for the most part introduced to support exact file arrangement procedures and printing operations. in order to give concrete form to the following explanation, we will illustrate some complex cases. each example contains the printing form and the filing form according to specific cataloging practices for some bibliographic elements. consider the titles in examples 1, 2, and 3, and the surnames in examples 4, 5, and 6.

example 1: l'automation des bibliotheques
           automation bibliotheques
example 2: bulletino della r. accademia medica di roma
           bolletino accademia medica roma
example 3: ibm 360 assembler language
           i b m three hundred sixty assembler language
example 4: mc kelvy
           mackelvy
example 5: van de castele
           vandecastele
example 6: martin du gard
           martin dugard

we do not intend, in this paper, to review the well-known basic rules for building a sort key (the translation of lowercase characters to uppercase, the completion of numerics, etc.). our attention is directed to the character strings that file differently than they are spelled in the printing form. the methods developed to meet these problems are of a very different nature. for reasons of space, not all the examples will be reconsidered in every case; only those most meaningful for the specific application will be chosen.

duplication methods

we briefly repeat that this method consists of the duplication of certain bibliographic elements in variant forms, each of them exactly corresponding to a certain type of treatment. in bochum, the title data are handled in this way. one field, called "sachtitel," contains the filing form of the title followed by the year of edition.
another field, named "titelbeschreibung," includes the printing form of the title and the other elements necessary for the identification of a work (statements of authorship, edition statement, imprint, series statement, etc.). to apply this procedure to examples 1, 2, and 3, the different forms of each title respectively have to be stored in a printing field and in a sorting field. analogous procedures are, in a more limited way, employed in the deutsche bibliographie. for instance, in addition to the imprint, the name of the publisher is stored in a separate field to facilitate the creation of publisher indexes. the technique of the duplication of bibliographic elements has also been considered in subfield level systems. the nypl format furnishes a filing subfield in those fields needed for the creation of the sort key. this special subfield is generally created by program, although in exceptional cases manual input may be necessary. in the filing subfield the text is preceded by a special character indicating whether or not the subfield has been introduced manually. marc lc (internal format) and canadian marc opt for a more flexible approach in which the filing information is specified with the same precision as the other information. the sorting data are stored in complete fields containing, among others, the same subfields as the corresponding original field. because in most subfield level systems the number of different fields is much higher than in field level systems, the duplication method becomes more intricate. provision of a separately coded field for each normal field which may need filing information is excluded. only one filing field is supplied, which is repeatable and stored after the other fields. in order to link the sorting fields with the original fields, specific procedures have been devised.
marc lc, for instance, reserves one byte per field, the sorting field code, to announce the presence or the absence of a related sorting field. the link between the fields themselves is placed in a special subfield of the filing field.22 in the supposition that examples 3 and 4 originate from the same bibliographical description, this method may be illustrated schematically as follows:

tag   sorting field code   sequence number   data
100   x                    1                 $a$mc kelvy
245   x                    1                 $a$ibm 360 assembler language
880                        1                 $ja$1001$mackelvy
880                        2                 $ja$2451$i b m three hundred sixty assembler language

as is well known, the personal author and title fields are coded respectively as tag 100 and tag 245. tag 880 defines a filing field. in the second column, the letter x identifies the presence of a related sorting field. the third column contains a tag sequence number needed for the unequivocal identification of a field. in the last column the sign $ is a delimiter. the first $ is followed by the different subfield codes. the other delimiters initiate the subsequent subfields. in tag 100 and 245, the first subfields contain the surname and the short title respectively. in tag 880 the first subfield gives the identification number of the related original field (tag followed by sequence number, e.g., 1001 for tag 100, sequence number 1). the further subfield subdivision is exactly the same as in the original fields. in canadian marc a slightly different approach has been worked out. note that in neither of the last two projects has this technique been implemented yet. for an evaluation of the duplication method different means of application must be considered. if not systematically used for several bibliographic elements, the method is very easy at input. the cataloger can fill in the data exactly as they are; no special codes must be embedded in the text. but it is easy to understand that a more frequent need of duplicated data renders the cataloging work very cumbersome. in regard to information processing, this method consumes much storage space.
first, a certain percentage of the data is repeated; second, in the most complete approach of the subfield level systems, space is needed for identifying and linking information. for instance, in marc lc, one byte per field is provided containing the sorting field code, even if no filing information at all is present. finally, programming efforts are also burdened by the need for special linking procedures. in order to minimize the use of the duplication technique, the cited systems reduce their application in different ways. bochum simplified its cataloging rules in order to limit its use to title information. as will be explained further, the deutsche bibliographie also has recourse to internal coding techniques. nypl, marc lc, and canadian marc only call on it if other more efficient methods (see later) fail. they also make an attempt to adapt existing cataloging practices to an unmodified machine handling of nonduplicated and minimally coded data.

internal coding techniques

separators

separators are special codes introduced within the text, identifying the characters to be treated in a special way. a distinction can be made among four procedures.

1. simple separators. with this method, each special action to be performed on a limited character string is indicated by a group of two identical separators, each represented as a single special sign. illustration on examples 2, 3, 4, and 6 gives:

example 2: £bolletino £¢bulletino della r. ¢accademia medica ¢di ¢roma
example 3: £i b m three hundred sixty £¢ibm 360 ¢assembler language
example 4: m£a£c¢ ¢kelvy
example 6: martin du¢ ¢gard

the characters enclosed between each group of two corresponding codes £ must be omitted for printing operations. in the same way the characters enclosed between two corresponding codes ¢ are to be ignored in the process of filing.
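the rule just stated can be sketched in a few lines of python. this is only an illustration, assuming £ marks nonprinting text and ¢ marks nonfiling text as in the examples above; the function names are hypothetical and do not come from any of the cited systems.

```python
import re

NONPRINT = "£"  # text enclosed by two £ is filed but not printed
NONFILE = "¢"   # text enclosed by two ¢ is printed but not filed


def _strip(coded: str, drop: str, keep: str) -> str:
    """Delete every region enclosed by a pair of `drop` separators,
    then delete the remaining `keep` separator characters themselves."""
    pattern = re.escape(drop) + ".*?" + re.escape(drop)
    return re.sub(pattern, "", coded).replace(keep, "")


def printing_form(coded: str) -> str:
    # Hypothetical helper: omit the nonprinting regions.
    return _strip(coded, NONPRINT, NONFILE)


def filing_form(coded: str) -> str:
    # Hypothetical helper: omit the nonfiling regions.
    return _strip(coded, NONFILE, NONPRINT)


coded = "m£a£c¢ ¢kelvy"          # example 4
print(printing_form(coded))       # mc kelvy
print(filing_form(coded))         # mackelvy
```

both helpers have to walk the whole coded string on every call, which is the processing cost the method is criticized for.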
in the case that only the starting position of a special action has to be indicated, one separator is sufficient. for instance, if in example 1 we limit ourselves to coding the first character to be taken into account for filing operations, we have:

example 1: l'/automation des bibliotheques

where a slash is used as sorting instruction code. the simple separator method has tempting positive aspects. occupying a minimum of storage space (maximum two bytes for each instruction), the technique gives a large range of processing possibilities. indeed, excluding the limitation on the number of special signs available as separators, no other restrictions are imposed. this argument will be rated at its true worth only after evaluation of the multiple function separators method and of the indicator techniques. the major disadvantage of the simple separator method lies in its slowness of exploitation. in fact, for every treatment to be performed, each data element which may contain special codes has to be scanned, character by character, to localize the separators within the text and to enable the execution of the appropriate instructions. for example, in the case of a printing operation, the program has to identify the parts of the text to be considered and to remove all separators. the sluggishness of execution was for some, as for canadian marc, a reason to reject this method.23 as already mentioned, another handicap with cataloging applications is the loss of a number of characters caused by their use as special codes. it is self-evident that each character needed as a separator cannot be used as an ordinary character in the text. for bochum this was a motive to reject this method. many of the field level systems with internal codes have recourse to simple separators.
we mention the deutsche bibliographie, in which some separators indicate the keywords serving for automatic creation of indexes and others give the necessary commands for font changes in photocomposition applications. in order to reduce the number of special signs, the deutsche bibliographie also duplicates certain bibliographic data. bikas uses simple separators for filing purposes. the technique is also employed in subfield level systems. in monocle each title field contains a slash, indicating the first character to be taken into account for filing.

2. multiple function separators. designed by the british, the technique of the multiple function separators was adopted in monocle. the basic idea consists of the use of one separator characteristic for instructing multiple actions. in the case of monocle these actions are printing only, filing only, and both printing and filing. in order to give concrete form to this method we apply it to examples 3, 4, and 6, using a vertical bar as special code.

example 3: |ibm 360 |i b m three hundred sixty |assembler language
example 4: m|c |ac|kelvy
example 6: martin du| ||gard

the so-called three-bar filing system divides a data element into the following parts:

data to be filed and printed | data to be printed only | data to be filed only | data to be filed and printed

in comparison with the simple separator technique, this method has the advantage of needing fewer special characters. a gain of storage space cannot be assumed directly. as is the case in example 6, if only one special instruction is needed, the set of three separators must still be used. on the other hand, one must note that a repetition of identical groups of multiple function separators within one data element must be avoided. subsequent use of these codes leads to very unclear representations of the text and may cause faulty data storage. this can well be proved if the necessary groups of three bars are inserted in examples 1 and 2.
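the three-bar division can be illustrated with a short python sketch. it assumes exactly one group of three bars per data element, as the text recommends; the function name is hypothetical and the error handling is an illustrative choice, not a description of the monocle programs.

```python
def split_three_bar(coded: str, bar: str = "|") -> tuple[str, str]:
    """Return (printing form, filing form) of a data element coded with
    one group of three bars:
    both | print only | file only | both
    """
    parts = coded.split(bar)
    if len(parts) == 1:
        # No bars at all: the element prints and files as it stands.
        return coded, coded
    if len(parts) != 4:
        # Two bars, or a second group of three, cannot be interpreted
        # unambiguously by this simple scheme.
        raise ValueError("expected exactly three bars in the data element")
    both_before, print_only, file_only, both_after = parts
    printing = both_before + print_only + both_after
    filing = both_before + file_only + both_after
    return printing, filing


print(split_three_bar("m|c |ac|kelvy"))      # ('mc kelvy', 'mackelvy')
print(split_three_bar("martin du| ||gard"))  # ('martin du gard', 'martin dugard')
```

the `ValueError` branch reflects the warning above: once groups of bars are repeated inside one element, the parts can no longer be told apart by position alone.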
of the studied systems, monocle is the only one to use this method.

3. separators with indicators. as mentioned in the description of subfield level systems, two indicators are added for each field present. in order to speed up the processing time in separator applications, indicators may be exploited. in monocle the presence or the absence of three bars in a subfield is signalled by an indicator at the beginning of the corresponding field. this avoids the systematic search for separators within all the subfields that may contain special codes. the number of indicators being limited, it is self-evident that in certain fields they may already be used for other purposes. as a result, some of the separators will be identified at the beginning of the field and others not. this leads to a certain heterogeneity in the general system concept which complicates the programming efforts. under this heading, we have mentioned the use of indicators only in connection with multiple function separators. note that this procedure could be applied as well in simple separator methods. nevertheless, none of the subfield level systems performs in this fashion because it is not necessary for the particular applications. this method is not followed in the field level systems as no indicators are provided.

4. compound separators. a means of avoiding the second disadvantage of the simple separator technique is to represent each separator by a two-character code: the first one, a delimiter, identifies the presence of the separator and is common to each of them; the second one, a normal character, identifies the separator's characteristic. taking the sign £ as delimiter and indicating the functions of nonprinting and nonfiling respectively by the characters a and b, examples 2 and 4 give in this case:

example 2: £abolletino £a£bbulletino della r. £baccademia medica £bdi £broma
example 4: m£aa£ac£b £bkelvy

thus the number of reserved special characters is reduced to one, independent of the number of different types of separators needed. in none of the considered projects is this technique used, probably because of the amount of storage space wasted.

indicators

as the concept of adding indicators in a bibliographic record format is an innovation of marc lc, the methods described under this heading concern only subfield level systems. although at the moment of the creation of marc lc one did not anticipate the systematic use of indicators for filing, its adherents made good use of them for this purpose.

1. personal name type indicator. as mentioned earlier, in marc lc one of the indicators, in the field of a personal name, provides information on the name type. this enables one to realize special file arrangements. for example, in the case of homonyms, the names consisting only of a forename can be filed before identical surnames. using the same indicator, an exact sort sequence can be obtained for single surnames, including prefixes. knowing that the printing form of example 5 is a single surname, the program for building the sort key can ignore the two spaces. the systems derived from marc lc developed analogous indicator codifications adapted to their own requirements. this seems to be an elegant method for solving particular filing problems in personal names. nevertheless, its possibilities are not large enough to give full satisfaction. for instance, example 6 gives a multiple surname with prefix in the second part of the name. the statement of multiple surname in the indicator does not give enough information to create the exact sort form. because of this shortcoming, monocle had recourse to the technique called "separators with indicators."

2. indicators identifying the beginning of filing text.
bnb marc reserves one indicator in the title field for identification of the first character of the title to be considered for filing. this indicator is a digit between zero and nine, giving the number of characters to be skipped at the beginning of the text. applying this technique to example 1, the corresponding filing indicator must have the value three. without having recourse to other working methods, this title sorts as:

example 1: automation des bibliotheques

notice that the article des still remains in the filing form. this procedure has the advantage of being very economical in storage space and in processing time. moreover the text is not cluttered with extraneous characters. on the other hand we must disapprove of the limitation of this technique to the indication of nonfiling words at the beginning of a field. the possibility of identifying certain character strings within the text is not provided for. taking examples 2 and 3 we observe that the stated conditions cannot be fulfilled. another negative side is the number of characters to be ignored, which may not exceed nine. also one indicator must be available for this filing indication. after bnb marc, marc lc and canadian marc also introduced this technique.

3. separators with indicators. the use of indicators in combination with separators has been treated above.

pointers

a final internal coding technique which seems worth studying is the one developed at the royal library of belgium for the creation of the catalogs of the library of the quetelet fonds, a field level system. the pointer technique is rather intricate at input but has many advantages at output. because there is inadequate documentation of this working method, we will try to give an insight into it by schematizing the procedures to be followed to create the final storage structure. at input, the cataloger inserts the necessary internal codes as simple separators within the text.
these codes are extracted by program from the text and placed before it, at the beginning of each field. each separator, now called pointer characteristic, is supplemented with the absolute beginning address and the length of its action area within the text. in the quetelet fonds the pointer characteristic is represented by one character, the address and length occupy two bytes each. the complete set of pointers (pointer characteristics, lengths, and addresses) is named pointer field. this field is incorporated in a sort of directory, starting with the sign "&" identifying the beginning of the field, followed by the length of the directory, the length of the text, and the pointer field itself. this is illustrated in figure 1. note that each field contains the five first bytes, even if no pointers are present. in the quetelet fonds, pointers are used for the following purposes: nonfiling, nonprinting, kwic index, indication of a corporate name in the title of a periodical, etc. examples 2, 3, and 4 should be stored in this system as represented in figure 2.

directory: & | ld | lt | pointer field: x ax lx | y ay ly | ...
followed by: text

representation of the structure of a field in the internal processing format of the quetelet fonds system. the codes respectively represent: &: field delimiter; ld: length of directory; lt: length of text; x, y, ...: pointer characteristics; ax, ay, ...: addresses of the beginning of the related action area inside the text; lx, ly, ...: length of these action areas.

fig. 1. structure of directory with pointer technique.

the advantages of the pointer technique are numerous. first, we must mention the relative rapidity of the processing of the records. in fact, in order to detect a specific pointer, only the directory has to be consulted. all subsequent instructions can be executed immediately. in contrast with most of the other methods discussed, there is no objection to using pointers for all internal coding purposes needed.
this enables one to pursue homogeneity in the storage format, facilitating the development of programs. further, the physical separation of the internal codes and the text allows, in most cases, a direct clean text representation without any reformatting. finally, unpredictable expansions of internal coding processes can easily be added without adaptation of the existing software. a great disadvantage of the pointer technique lies in the creation of the directory. the storage space occupied by the pointers is also great in comparison with the place occupied by internal codes in other methods. a further handicap is the limitation imposed at input due to the use of simple separators.

example 2: & 20 53 | a 00 10 | b 10 19 | b 46 03 | bolletino bulletino della r. accademia medica di roma
example 3: & 15 52 | a 00 26 | b 26 08 | i b m three hundred sixty ibm 360 assembler language
example 4: & 15 09 | a 01 01 | b 03 01 | mac kelvy

representation of examples 2, 3, and 4 in the quetelet fonds format. a represents the pointer characteristic for nonprinting data; b is the pointer characteristic for nonfiling data.

fig. 2. pointer technique as applied to bibliographic data.

in spite of these negative arguments, we see a great interest in this method, and wish to give some suggestions in order to relieve or to eliminate some of them. initially we must realize that the creation of a record takes place only once, while the applications are innumerable. the possibility of automatically adding some of the codes may also be considered. data needing special treatment expressed in a consistent set of logical rules can be coded by program. only exceptions have to be treated manually.
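the conversion from input separators to a directory of pointers can be sketched as follows in python. this is an illustration under stated assumptions, not the quetelet fonds program: the input separators £ and ¢ are hypothetical choices mapped to the characteristics a (nonprinting) and b (nonfiling), addresses are counted from zero, and two decimal digits stand in for each two-byte length or address.

```python
from dataclasses import dataclass


@dataclass
class Pointer:
    kind: str     # pointer characteristic: 'a' nonprinting, 'b' nonfiling
    address: int  # 0-based start of the action area in the clean text
    length: int   # number of characters in the action area


# Assumed mapping from input separators to pointer characteristics.
SEPARATORS = {"£": "a", "¢": "b"}


def extract_pointers(coded: str) -> tuple[str, list[Pointer]]:
    """Strip paired separators from the text and record each enclosed
    region as a pointer. An unpaired opening separator is ignored."""
    text: list[str] = []
    pointers: list[Pointer] = []
    open_at: dict[str, int] = {}
    for ch in coded:
        kind = SEPARATORS.get(ch)
        if kind is None:
            text.append(ch)
        elif kind in open_at:
            start = open_at.pop(kind)
            pointers.append(Pointer(kind, start, len(text) - start))
        else:
            open_at[kind] = len(text)
    return "".join(text), pointers


def serialize(text: str, pointers: list[Pointer]) -> str:
    """Directory (&, ld, lt, pointer field) followed by the clean text."""
    field = "".join(f"{p.kind}{p.address:02d}{p.length:02d}" for p in pointers)
    ld = 5 + len(field)  # '&' + 2 bytes ld + 2 bytes lt, plus 5 bytes per pointer
    return f"&{ld:02d}{len(text):02d}{field}{text}"


def render(text: str, pointers: list[Pointer], skip: str) -> str:
    """Output form that omits every action area of characteristic `skip`."""
    drop: set[int] = set()
    for p in pointers:
        if p.kind == skip:
            drop.update(range(p.address, p.address + p.length))
    return "".join(c for i, c in enumerate(text) if i not in drop)


text, ptrs = extract_pointers("m£a£c¢ ¢kelvy")  # example 4
print(serialize(text, ptrs))    # &1509a0101b0301mac kelvy
print(render(text, ptrs, "a"))  # printing form: mc kelvy
print(render(text, ptrs, "b"))  # filing form:   mackelvy
```

at output only the short directory has to be inspected to decide which characters to drop; the text itself is never scanned for codes, which is the speed advantage claimed for the technique.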
in considering the space occupied by the directory, some profit could be imagined by trying to reduce the storage space occupied by the addresses and the lengths. there is also a solution to be found by not having systematically to provide pointer field information. one must realize that only a small percentage of the fields may contain such codes. finally, the restrictions at input may be removed by using complex separators. such a change does not have any repercussion on the directory. as far as we know, the pointer technique has not been used in a subfield level system. at our library an internal processing format of the subfield level type, called fbr, is under development, in which a pointer technique based on the foregoing is incorporated.

automatic handling techniques

in order to give a complete review of the methods of handling data within bibliographic text, we must also treat the methods in which both the identification and the special treatment of these data are done during the execution of the output programs. the working method can easily be demonstrated with example 1. only the printing form must be recorded. the program for building the sort key processes a look-up table of nonfiling words including the articles l' and des. the program checks every word of the printing form for a match with one of the words of the nonfiling list. the sort key is built up with all the words which are not present in this table. to treat example 4, an analogous procedure can be worked through. an equivalence list of words for which the filing form differs from the printing form is needed. if, during the construction of the sort key, a match is found with a word in the equivalence list, the correct filing form, stored in this list, is placed in the sort key. the other words are taken in their printing form. in our case, using the equivalence list, mc should be replaced by mac.
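the two look-up procedures can be sketched together in python. the table contents are limited to the words named in the text; treating the elided article l' as a prefix to be cut off a word is an assumption of this sketch, and the function and table names are hypothetical.

```python
# Look-up tables holding only the entries mentioned in the text.
NONFILING_WORDS = {"des"}
NONFILING_PREFIXES = ("l'",)     # assumption: elided articles are cut off as prefixes
EQUIVALENCES = {"mc": "mac"}     # printing form -> filing form


def sort_key(printing_form: str) -> str:
    """Build a sort key from the printing form alone, using the tables."""
    key_words = []
    for word in printing_form.lower().split():
        for prefix in NONFILING_PREFIXES:
            if word.startswith(prefix):
                word = word[len(prefix):]
        if not word or word in NONFILING_WORDS:
            continue  # nonfiling word: leave it out of the key
        key_words.append(EQUIVALENCES.get(word, word))
    return " ".join(key_words)


print(sort_key("l'automation des bibliotheques"))  # automation bibliotheques
print(sort_key("mc kelvy"))                        # mac kelvy
```

the sketch stops at word substitution: closing up "mac kelvy" to the filing form "mackelvy" of example 4 would need a further rule, and titles such as examples 2 and 3 cannot be handled by table look-up alone.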
in order to speed up the look-up procedures, different methods of organization of the look-up tables can be devised. other types of automatic processing techniques can be illustrated by the special filing algorithms constructed for a correct sort of dates. for instance, in order to be able to sort b.c. and a.d. dates in chronological order, the year 0 is replaced by the year 5000; b.c. dates are subtracted from this number and a.d. dates are added to it. for example, 300 b.c. becomes 4700 and a.d. 1974 becomes 6974. thus dates back to 5000 b.c. can be correctly treated. this technique, introduced by nypl, is also used at lc. the advantages of automatic handling techniques are many. no special arrangements must be made at input. only the bibliographic elements must be introduced under the printing form and no special codes have to be added. there is no storage space wasted for storing internal codes. as negative aspects we note that not all cataloging rules can be expressed in rigid systematic process steps. examples 2 and 3 illustrate this point. one must also recognize that the special automatic handling programs must be executed each time a sort key is built up, increasing the processing time. this procedure may give some help for filing purposes, but we can hardly imagine that it could solve all internal coding problems. think of the instructions to be given for the choice of character type while working with a typesetting machine. the automatic handling technique is very extensively applied in the nypl programs; marc lc has recourse to it for treating dates, and bnb marc for personal names.24 none of the field level systems considered here uses this method.
summary and conclusions
table 1 presents, for the discussed systems, a summary of the methods used for treating data in a bibliographic text. the duplication and indicator techniques have the most adherents. however, we must keep in mind that in most of the systems the duplication of data only represents an extreme solution. on the other hand, indicators are very limited in their possibilities. as far as the flexibility and application possibilities are concerned, the simple separators and the pointers present the most interesting prospects. automatic handling techniques may produce good results for use in well-defined fields or subfields. from the evaluations given for the different methods, we conclude that for a special application the choice of a method depends greatly on the objectives, namely the sort of special processing facilities needed, the volume of data to be treated, and the frequency of execution.
[table 1. review of the techniques for special processing of data within bibliographic text used or planned in the discussed systems. columns: duplication; internal codes (separators: simple, multiple function; separators with indicators; indicators: personal name type, beginning of filing text; pointers); automatic handling. rows: deutsche bibliographie, bochum, bikas, quetelet fonds, marc lc, bnb marc, nypl, monocle, canadian marc, fbr.]
references
1. rudolf blum, "die maschinelle herstellung der deutschen bibliographie in bibliothekarischer sicht," zeitschrift für bibliothekswesen und bibliographie 13:303-21 (1966).
2. die zmd in frankfurt am main; herausgegeben von klaus schneider (berlin: beuth-vertrieb gmbh, 1969), p.133-37, 162-67.
3. magnetbanddienst deutsche bibliographie, beschreibung für 7-spur-magnetbänder (frankfurt on the main: zentralstelle für maschinelle dokumentation, 1972).
4.
ingeborg sobottke, "rationalisierung der alphabetischen katalogisierung," in electronische datenverarbeitung in der universitätsbibliothek bochum; herausgegeben in verbindung mit der pressestelle der ruhr-universität bochum von günther pflug und bernhard adams (bochum: druck- und verlagshaus schürmann & klagges, 1968), p.24-32.
5. datenerfassung und datenverarbeitung in der universitätsbibliothek bielefeld: eine materialsammlung; hrsg. von elke bonness und harro heim (munich: pullach, 1972).
6. michel bartholomeus, l'aspect informatique de la catalographie automatique (brussels: bibliothèque royale albert ier, 1970).
7. m. bartholomeus and m. hansart, lecture des entrées bibliographiques sous format 80 colonnes et création de l'enregistrement standard; publication interne: mecono b015a (brussels: bibliothèque royale albert ier, 1969).
8. henriette d. avram, john f. knapp, and lucia j. rather, the marc ii format: a communications format for bibliographic data (washington, d.c.: library of congress, 1968).
9. books, a marc format: specifications for magnetic tapes containing catalog records for books (5th ed.; washington, d.c.: library of congress, 1972).
10. "automation activities in the processing department of the library of congress," library resources & technical services 16:195-239 (spring 1972).
11. l. e. leonard and l. j. rather, internal marc format specifications for books (3d ed.; washington, d.c.: library of congress, 1972).
12. marc record service proposals (bnb documentation service publications no.1 [london: council of the british national bibliography, ltd., 1968]).
13. marc ii specifications (bnb documentation service publications no.2 [london: council of the british national bibliography, ltd., 1969]).
14. michael gorman and john e. linford, description of the bnb marc record: a manual of practice (london: council of the british national bibliography, ltd., 1971).
15.
edward duncan, "computer filing at the new york public library," in larc reports vol.3, no.3 (1970), p.66-72.
16. nypl automated bibliographic system overview, internal report (new york: new york public library, 1972).
17. marc chauveinc, monocle: projet de mise en ordinateur d'une notice catalographique de livre. deuxième édition (grenoble: bibliothèque universitaire, 1972).
18. marc chauveinc, "monocle," journal of library automation 4:113-28 (sept. 1971).
19. canadian marc (ottawa: national library of canada, 1972).
20. format de communication du marc canadien: monographies (ottawa: bibliothèque nationale du canada, 1973).
21. to be published.
22. private communications (1973).
23. private communications (1972).
24. private communications (1973).
4 information technology and libraries | march 2005
the challenges encountered in building the international children’s digital library (icdl), a freely available online library of children’s literature, are described. these challenges include selecting and processing books from different countries, handling and presenting multiple languages simultaneously, and addressing cultural differences. unlike other digital libraries that present content from one or a few languages and cultures, and focus on either adult or child audiences, icdl must serve a multilingual, multicultural, multigenerational audience. the research is presented as a case study for addressing these design criteria; current solutions and plans for future work are described.
the internet is a multilingual, multicultural, multigenerational environment. while once the domain of english-speaking, western, adult males, the demographics of the internet have changed remarkably over the last decade. as of march 2004, english was the native language of only 35 percent of the total world online population.
as of march 2004, asia, europe, and north america each make up roughly 30 percent of internet usage worldwide.1 in the united states, women and men now use the internet in approximately equal numbers, and children and teenagers use the internet more than any other age group.2 creators of online digital libraries have recognized the benefit of making their content available to users around the world, not only for the obvious benefits of broader dissemination of information and cultural awareness, but also as tools for empowerment and strengthening community.3 creating digital libraries for children has also become a popular research topic as more children access the internet.4 the international children’s digital library (icdl) project seeks to combine these areas of research to address the needs of both international and intergenerational users.5 ■ background and related work creating international software is a complex process involving two steps: internationalization, where the core functionality of the software is separated from localized interface details, and localization, where the interface is customized for a particular audience.6 the localization step is not simply a matter of language translation, but involves technical, national, and cultural aspects of the software.7 technical details such as different operating systems, fonts, and file formats must be accommodated. national differences in language, punctuation, number formats, and text direction must be handled properly. finally, and perhaps most challenging, cultural differences must be addressed. hofstede defines culture as “the collective mental programming of the mind which distinguishes the members of one group or category of people from another.”8 these groups might be defined by national, regional, ethnic, religious, gender, generation, social class, or occupation differences. by age ten, most children have learned the value system of their culture, and it is very difficult to change. 
hofstede breaks culture into four components: values, rituals, heroes, and symbols. these components manifest themselves everywhere in software interfaces, from acceptable iconic representations of people, animals, and religious symbols to suitable colors, phrases, jokes, and scientific theories.9 however, as hoft notes, culture is like an iceberg: only 10 percent of the characteristics of a culture are visible on the surface.10 the rest are subjective, unspoken, and unconscious. it is only by evaluating an interface with users from the target culture that designers can understand if their software is acceptable.11
developers of online digital libraries have had to contend with international audiences for many years, and the marc and oclc systems have reflected this concern by including capabilities for transliteration and diacritical characters (accents) in various languages.12 however, it is only more recently, with the development of international character-set standards and web browsers that recognize these standards, that truly international digital libraries have emerged. greenstone, an open-source software project based in new zealand, allows people to create online digital libraries in their native language and culture.13 oclc recently completed a redesign of firstsearch, a web-based bibliographic and full-text retrieval service, to accommodate users with different software, languages, and disabilities.14 researchers at virginia tech redesigned citidel, an online collection of computer-science technical reports, to create an online community that allows users to translate their interface into different languages.15
the international children’s digital library: a case study in designing for a multilingual, multicultural, multigenerational audience
hilary browne hutchinson, anne rose, benjamin b. bederson, ann carlson weeks, and allison druin
hilary browne hutchinson (hilary@cs.umd.edu) is a faculty research assistant in the institute for advanced computer studies and a ph.d. student in the department of computer science. anne rose (rose@cs.umd.edu) is a faculty research assistant in the institute for advanced computer studies. benjamin b. bederson (bederson@cs.umd.edu) is an associate professor in the department of computer science and the institute for advanced computer studies and director of the human-computer interaction laboratory. ann carlson weeks (acweeks@umd.edu) is professor of the practice in the college of information studies. allison druin (allisond@umiacs.umd.edu) is an assistant professor in the college of information studies and the institute for advanced computer studies. all authors are affiliated with the university of maryland-college park and the human-computer interaction laboratory.
researchers have also realized that beyond accessibility, digital libraries have enormous potential for empowerment and building community, especially in developing countries. witten et al. and downie describe the importance of community involvement when creating a digital library for a particular culture, both to empower users and to make sure the culture is accurately reflected.16 even more than accurately reflecting a culture, a digital library also needs to be understood by the culture. duncker notes that a digital-library interface metaphor based on a traditional physical library was incomprehensible to the maori culture in new zealand, who are not familiar with the conventions of western libraries.17 in addition to international libraries, a number of researchers have focused on creating digital libraries for children.
recognizing that children have difficulty with spelling, reading, and typing, as well as traditional categorization methods such as the dewey decimal system, a number of researchers have created more child-friendly digital libraries.18 pejtersen created the bookhouse interface with a metaphor of rooms in a house to support different types of searching.19 külper et al. designed the bücherschatz interface for children who are eight to ten years old using a treasure-hunt metaphor.20 druin et al. designed the querykids interface for young children to find information about animals.21 theng et al. used the greenstone software to create an environment for older children to write and share stories.22 the icdl project seeks to build on and combine research in both international and children’s digital libraries. as a result, icdl is more ambitious than other digital library projects in a number of respects. first, it is designed for a broader audience. while the digital libraries already described target one or a few cultures or languages, icdl’s audience includes potentially every culture and language in the world. second, the content is not localized. part of the library’s goal is to expose users to books from different cultures, so it would be counterproductive to present books only in a user’s native language. as a result, the interface not only supports multiple languages and cultures, but it also supports them simultaneously, frequently on the same screen. third, icdl’s audience not only includes a broad group of adults from around the world, but also children from three to thirteen years of age. to address these challenges, a multidisciplinary, multilingual, multicultural, and multigenerational team was created, and the development was divided into several stages. in the first stage, completed in november 2002, a java-based, english-only version of the library was created that addressed the searching and reading needs of children. 
in the second stage, completed in may 2003, an html version of the software was developed that addressed the needs of users with minimal technology. in the third stage, completed in may 2004, the metadata for the books in the library were translated into their native languages, allowing users to view these metadata in the language of their choice. the final stage, currently in progress, involves translating the interface to different languages and adjusting some of the visual design of the interface according to the cultural norms of the associated language being presented. in this paper, the research is presented as a case study, describing the solutions implemented to address some of these challenges and plans for addressing ongoing ones.
■ icdl project description
the icdl project was initiated in 2002 by the university of maryland and the internet archive with funding from the national science foundation (nsf) and the institute of museum and library services (imls). today, the project continues at the university of maryland. the goals of the project include:
■ creating a collection of ten thousand children’s books in one hundred languages;
■ collaborating with children as design partners to develop new interfaces for searching, browsing, reading, and sharing books in the library; and
■ evaluating the impact of access to multicultural materials on children, schools, and libraries.
the project has two main audiences: children three to thirteen years of age and the adults who work with them, as well as international scholars who study children’s literature. the project draws together a multidisciplinary team of researchers from computer science, library science, education, and art backgrounds. the research team is also multigenerational—team members include children seven to eleven years of age, who work with the adult members of the team twice a week during the school year and for two weeks during the summer to help design and evaluate software.
using the methods of cooperative inquiry, including brainstorming, low-tech prototyping, and observational note taking, the team has researched, designed, and built the library’s category structure, collection goals, and searching and reading interfaces.23 the research team is also multilingual and multicultural. adult team members are native or fluent speakers of a number of languages besides english, and are working with school children and their teachers and librarians in the united states, new zealand, honduras, and germany to study how different cultures use both physical and digital libraries. the team is also working with children and their teachers in the united states, hungary, and argentina to understand how children who speak different languages can communicate and learn about each other’s cultures through sharing books. finally, an advisory board of librarians from around the world advises the team on curatorial and cultural issues, and numerous volunteers translate book and web-site information.
■ icdl interface description
icdl has four search tools for accessing the current collection of approximately five hundred books in thirty languages: simple, advanced, location, and keyword. all are implemented with java servlet technology, use only html and javascript on the client side, and can run on a 56k modem. these interfaces were created during the first two development phases. the team visited physical libraries to observe children looking for books, developed a category hierarchy of kid-friendly terms based on these findings, and designed different tools for reading books.24 using the simple interface (figure 1), users can search for books using colorful buttons representing the most popular search categories.
the advanced interface (figure 2) allows users to search for books in a compact, text-link-based interface that contains the entire library category hierarchy. by selecting the location interface (figure 3), users can search for books by spinning a globe to select a continent. finally, with the keyword interface, users search for books by typing in a keyword. younger children seem to prefer the simplicity and fun of the location interface, while older children enjoy browsing the kid-friendly categories, such as colors, feelings, and shapes.25 all of these methods search the library for books with matching metadata. users can then read the book using a variety of book readers, including standard html pages and more elaborate java-based tools developed by the icdl team that present book pages in comic or spiral layouts (figures 4–6). in addition to the public interface, icdl also includes a private web site that was developed for book contributors to enter bibliographic metadata about the books they provide to the library (figures 7 and 8). using the metadata interface, contributors can enter information about their books in the native language of the book, and optionally translate or transliterate this information into english or latin-based characters. the design of icdl is driven by its audience, which includes users, contributors, and volunteers of all ages from around the world—more than six hundred thousand unique visitors from more than two hundred countries (at last count). as a result, books written in many different languages for users of different ages and cultural backgrounds must be collected, processed, stored, and presented.
the rest of this paper will describe some of the challenges encountered, and still being encountered, in the development process, including selecting and processing a more diverse collection of books, handling different character sets and fonts, and addressing differences in cultural, religious, social, and political interpretation.
figure 1. icdl simple interface
figure 2. icdl advanced interface
■ book selection and processing
the first challenge in the icdl project is obtaining and managing content. collecting books from around the world is a challenge because national libraries, publishers, and creators (authors and illustrators) all have different rules regarding copyrights. the goal is to identify and obtain award-winning children’s books from around the world, for example, books on the white ravens list, which are also made available to icdl users (www.icdlbooks.org/servlet/whiteravens).26 however, unsolicited books are also received, frequently in languages the team cannot read. as a result, members of the advisory board and various children’s literature organizations in different countries are relied on to review these books. these groups help determine whether books are relevant and acceptable in the culture they are from, and whether they are appropriate for the three-to-thirteen age group. these groups are eager to help; including them in the process is an effective way to build the project and the community surrounding it. in addition to collecting and scanning books, bibliographical metadata in the native language of the book (title, creator[s], publisher, abstract) are also collected via the web-based metadata form filled out by the book contributors.
it was decided to base the icdl metadata specification on the dublin core because of its international background, ability to be understood by nonspecialists, and the possibilities to extend its basic elements to meet icdl’s specific needs (see www.icdlbooks.org/metadata/specification for more details).27 contributors who provide metadata have the option of translating them to english; they also can transliterate them to latin characters, if necessary. regardless of what language or languages they provide, they are asked to provide information that they create themselves, such as the abstract, in a format that is easily understandable by children. simple, short sentences make the information easy for children to read, and easier to translate to other languages. the metadata provided allow the team to catalog the books for browsing according to the various categories and to index the books for keyword searching. even though translation to english is optional, the english-speaking metadata team needs the metadata in english in order to catalog the books. since many contributors do not have the time or ability to provide all of this information, volunteers who speak different languages are relied on to check the metadata that get submitted, and translate or transliterate them as necessary. this method allows information to be collected from contributors without overwhelming them, and also helps build and maintain the volunteer community.
figure 3. icdl location interface
figure 4. icdl standard book reader
figure 5. icdl comic book reader
figure 6. icdl spiral book reader
■ handling different character sets
the metadata form allows contributors to provide information from the comfort of an operating system and keyboard in their native language, but this flexibility requires software that can handle many different character sets. for example, english uses a latin character set; russian uses a cyrillic character set; and an arabic character set is used for persian/farsi. fortunately, there exists a single character set called unicode, an international, cross-platform standard that contains a unique encoding for nearly every character in every language.28 unfortunately, not all software supports unicode yet. in the first stage of implementation in icdl, metadata information was collected only in english, so unicode compliance was not a problem. however, in the next phase of development, which included collecting and presenting metadata in the native language of all of the books, the software had to be adjusted to use unicode because icdl supports potentially every language in the world. the open-source mysql database, recently upgraded to allow storage of unicode data, was already in use for storing metadata. icdl’s web applications run on apache http and tomcat web servers, both of which are freely available and unicode-compliant. however, both the web site and the database had to be internationalized and localized to separate the template for metadata presentation from the content in different languages. a unicode-compliant database driver was necessary for passing information between the database and the web site. both the public and metadata web-site applications are written using freely available java servlet technology. the java language is unicode-compliant, but some adjustments had to be made to icdl’s servlet code to force it to handle data using unicode.
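why code has to be forced to agree on unicode at every step can be illustrated with a small sketch. python is used here only for brevity (icdl itself is written in java), and the sample title and the function name round_trip are hypothetical. the sketch shows that the same bytes, decoded with the wrong character set, come back garbled, which is the failure that an explicitly declared encoding avoids.

```python
def round_trip(text, write_charset, read_charset):
    """encode text with one character set and decode the bytes with another."""
    return text.encode(write_charset).decode(read_charset)


title = "Книга"  # hypothetical russian metadata value ("book")

# writer and reader both use utf-8: the text survives unchanged
print(round_trip(title, "utf-8", "utf-8") == title)

# the reader wrongly assumes latin-1: each cyrillic letter turns into
# two unrelated characters, so the value is both wrong and longer
garbled = round_trip(title, "utf-8", "latin-1")
print(len(title), len(garbled))
```

the same reasoning applies to each hop named in the text (form, servlet, database driver, database): one component with a different default is enough to corrupt the stored value.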
to allow users to conduct keyword searches for books in the public interface, apache’s freely available lucene search engine is used to create indices of book metadata, which can then be searched. lucene is unicode-compliant, but a separate index for each language had to be created, requiring users to select a search language. this requirement was necessary for two reasons: (1) to avoid confusion over the same words with different meanings (bra means good in swedish); and (2) different languages have different rules for stopwords to ignore (the, of, a in english), truncation of similar words (cats has the same root as cat in english), and separation of characters (chinese does not put white space between symbols). lucene has text analyzers for a variety of languages that support these different conventions. for languages that lucene does not support, icdl volunteers translated english stopwords, and simple text analyzers were created by the team.
figure 7. icdl metadata interface with spanish metadata
figure 8. icdl metadata interface with japanese metadata
finally, html headers created by the java servlets had to be modified to indicate that the content being delivered to users was in unicode. most current browsers and operating systems recognize and handle web pages properly delivered in unicode. for those that do not, help pages were created that explain how to configure common browsers to use unicode, and how to upgrade older browsers that do not support unicode. by making the icdl systems fully unicode-compliant, contributors from all over the world can enter metadata about books in an easily accessible html form using their native languages, and the characters are properly transmitted and stored in the icdl database. volunteers can then use the same form to translate or transliterate the metadata as necessary. finally, this information can be presented to our users when they look at books. for example, the book where’s the bear?
(harris, 1997) is written in six different languages.29 the original metadata came in english, but icdl volunteers translated them to italian, japanese, french, spanish, and german. users looking at the preview page for this book in the library have the opportunity to change the display language of the book to any one of these languages using a pull-down menu (figures 9 and 10). currently, only the book metadata language can be changed, but in the next stage of development, all of the surrounding interface text (navigation, labels) will be translated to different languages as well. the plan for doing this is to take a similar approach to the citidel and greenstone projects by creating a web site where volunteers can translate words and phrases from the icdl interface into their native language.30 like the creators of citidel, the team believes that machine-based translation would not provide good enough results. unfortunately, the resources do not exist for the team to do the translating themselves. encouraging volunteers to translate the site will help enlarge and enrich the icdl community. for languages that do not receive volunteer translation, translation services are an affordable alternative.
■ character-set complications
several issues have arisen as a result of collecting multilingual metadata in many character sets. first, different countries use different formats for dates and times, so contributors are allowed to specify the calendar used when they enter date information (muslim or julian). second, not only do different countries use different formats for numbers, the numbers themselves are also different. for example, the arabic numbers for 1, 2, 3 are ١, ٢, ٣. even though java is unicode-compliant, it treats numbers as latin characters, necessitating the storing of latin versions of any non-latin numbers used internally by the software for calculations, such as book page count.
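the idea of keeping a latin copy of a non-latin number for calculations can be sketched as follows. this is an illustration in python rather than icdl’s java code, and the function name to_latin_digits and the sample page count are hypothetical. it relies on the standard unicodedata module, which reports the numeric value of any unicode decimal digit.

```python
import unicodedata


def to_latin_digits(value):
    """replace every unicode decimal digit with its latin (ascii) equivalent."""
    converted = []
    for char in value:
        if char.isdecimal():
            # unicodedata.decimal gives the digit's value as an int, e.g. 1 for u+0661
            converted.append(str(unicodedata.decimal(char)))
        else:
            converted.append(char)  # leave letters and punctuation untouched
    return "".join(converted)


page_count_as_entered = "\u0661\u0662\u0663"  # arabic-indic digits for 1, 2, 3
page_count = int(to_latin_digits(page_count_as_entered))
print(page_count + 1)
```

storing the converted value alongside the value as entered keeps the display faithful to what the contributor typed while giving the software a number it can calculate with.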
a third issue is that some of the metadata, such as author and illustrator names, need to be transliterated so their values can be displayed when the metadata are shown in a latin-based language. ideally, the transliteration standards used for a language need to be consistent so that the same values are always transliterated the same way. unfortunately, the team has found no practical way to enforce this, except to state the standard to be used in the icdl metadata specification. when different standards are used, it makes comparison of equal items much more difficult. for example, the same persian/farsi creator has been transliterated as both “hormoz riyaahi” and “hormoz riahi.” it cannot be assumed that a person is the same just because the name is the same (john smith), and when a name is in a character set that the team cannot understand, this problem becomes more challenging.
figure 9. where’s the bear? in english
figure 10. where’s the bear? in japanese
finally, there was the question of how to handle differences in character-set length and direction in the interface. different languages use different numbers of characters to present the same text. icdl screens had to be designed in such a way that the metadata in languages with longer or shorter representations than the english version would still fit. the team anticipates having to make additional interface changes to accommodate longer labels and navigational aids when the remainder of the interface is translated. it also had to be considered that, while most languages are read left to right, a few (arabic and hebrew) are read right to left. as a result, screens were designed so that book metadata were reasonably presented in either direction.
currently, only the text is displayed right to left, but eventually the goal is to mirror the entire interface to be oriented right to left when content is shown in right-to-left languages. for the problem of how to handle the arrows for turning pages in right-to-left languages—since these arrows could be interpreted as either “previous” and “next” or “left” and “right”—“previous” and “next” were chosen for consistency, so they work the same way in left-to-right books and right-to-left books.
■ font complications
while most current browsers and operating systems recognize unicode characters, whether or not the characters are displayed properly depends on whether users have appropriate fonts installed on their computers. for instance, a user looking at where’s the bear? and choosing to display the metadata in japanese will see the japanese metadata only if the computer has a font installed that includes japanese characters. otherwise, depending on the browser and operating system, the user may see question marks, square boxes, or nothing at all instead of the japanese characters. the good news is that many users will never face this problem. the interface for icdl is presented in english (until it is translated to other languages). most operating systems come with fonts that can display english characters, and the team has metadata in english (always presented first by default) for nearly all the books. users who choose to display book metadata in another language are likely to do so because they actually can read that language, and therefore are likely to have fonts installed for displaying that language. furthermore, many commonly used software packages, such as microsoft office, come with fonts for many languages. as a result, many users will have fonts installed for more languages than just those required for the native language of their operating system.
of course, fonts will still be a problem for other users, such as those with new computers that have not yet been configured with different fonts or those using a public machine at a library. these users will need to install fonts so they can view book metadata, and eventually the entire interface, in other languages. to assist these users, help pages have been created that explain the process of installing a font on various operating systems. ■ issues of interpretation while technical issues have been a major challenge for icdl, a number of nontechnical issues relating to interpretation have also been encountered. first, until the interface has been translated into different languages, visual icons are crucial for communicating information to young children who cannot read, and to users who do not speak english. however, certain pictorial representations may not be understood by all cultures, or worse, may offend some cultures. for example, one icon showing a boy sticking out his tongue had to be redesigned when it was learned this was offensive in the chinese culture. the team has also redesigned other icons, such as those using stars as the rating system for popular books. the original icons used five-sided stars, which are religiously significant, so they were changed to more neutral seven- or eight-sided stars. as the team continues to internationalize the interface, there will likely be a need to change other icons that are difficult to represent in a culturally neutral way when the interface is displayed in different languages. for instance, it is a real challenge to create icons for categories such as mythology or super heroes, since the symbols and stories for these concepts differ by culture. icons for such categories as funny, happy, and sad are also complicated because certain common american facial and hand representations have different, sometimes offensive, meanings in different cultures. 
what is considered funny in one culture (e.g., a clown) may not be understood well by another culture. different versions of such icons may have to be created, depending on the language and cultural preferences of users. the team relies on its multicultural members, volunteers, and advisory board to highlight these concerns. religious, social, and political problems of interpretation have also been encountered. icdl’s collection develops unevenly as relationships are built with various publishers and libraries. as a result, there are currently many arabic books and only a few hebrew books; this has generated multiple e-mails from users concerned that icdl is taking a political stance on the arab-israeli conflict. to address this concern, the team is currently working to develop a more balanced collection. many books published in hong kong are received from contributors in either hong kong or china who want their own country to be credited with publication. to address this concern, it was decided to credit the publication country as “hong kong/china” to avoid offending either party. finally, some books have been received with potentially objectionable content. some of these are historical books involving presentation of content that is now considered derogatory. some include subject matter that may be deemed appropriate by some cultures but not by others. some include information that may be too sophisticated for children three to thirteen years of age in any culture. while careful not to include books that are inappropriate for children in this age group, the team does not want to censor books whose content is subjectively offensive. instead, the contributors of such books are consulted to make sure they are aware of icdl collection-development guidelines. if they believe that a book is historically or culturally appropriate, the book is included. 
a statement is also provided at the bottom of all the book pages indicating that the books in the library come from diverse cultures and historical periods and may not be appropriate for all users of the library. ■ conclusions and lessons learned designing a digital library for an international, intergenerational audience is a challenging process, but it is hugely rewarding. the team is continually amazed with feedback from users all over the world expressing thanks that books are made available from their countries, from teachers who use the library as a resource for lesson planning, from parents who have discovered a new way to read with their children, and from children who are thrilled to discover new favorite books that they cannot get in their local library. thus, the first recommendation the team can make based on experience is that creating international digital-library resources for children is a rich and rewarding area of research that others should continue to explore. a second important lesson learned is that an international, intergenerational team is an absolute necessity. simply having users and testers from other countries is not enough; their input is valuable, but it comes too late in the design process to influence major design changes. team members from different cultural backgrounds offer perspectives that an american-only team simply would not think to consider. similarly, team members who are children understand how children like to look for and read books, and what interface tools are difficult or easy, and fun or not fun. enthusiastic advisors and volunteers are also a crucial resource. the icdl team does not have the time, money, or resources to address all of the issues that surface, and advisors and volunteers are key resources in the development process. bringing together as diverse a team as possible is highly recommended. 
the goals of educational enrichment and international understanding in an international library make it an attractive resource for people to want to help, so assembling such a team is not as difficult as it sounds. beyond the human resources, the technical resources involved in making icdl an international environment necessitate the examination and adjustment of software and interfaces at every level. unlike many digital libraries that only focus on one or a few languages, icdl must be simultaneously multilingual, multicultural, and multigenerational. as a result, a third lesson is that freely available and open-source technologies are now available for making the necessary infrastructure meet these criteria. with varying degrees of complexity, the team was able to get all the pieces to work together properly. the more difficult challenge, unfortunately, falls on icdl’s users, who may need to install new fonts to view metadata in different languages. however, as computer and browser technologies advance to reflect more global applications, this problem is expected to lessen and eventually disappear. having technical staff capable of searching for and integrating open-source tools with international support to handle these technical issues is highly recommended, as well as usability staff versed in the nuances of different operating systems and browsers. finally, the more subjective issue of cultural interpretation has proven to be the most interesting challenge. it is one that will likely not disappear as icdl’s collection grows and the next stage of development is embarked on for translating the interface to support other languages and cultures. the fourth lesson learned is that culture pervades every aspect of both the visual design and the content of the interface, and that it is necessary to examine one’s own biased cultural assumptions to ensure respect of others. 
however, with the enthusiasm that continues to be seen in the icdl team members, advisors, volunteers, and users, future design challenges can be addressed with their help. the final recommendation is to actively seek feedback from team members, volunteers, and users from different backgrounds about the cultural appropriateness of all aspects of your software. it may not be possible to address all cultures in your audience right away, but it is important to have a framework in place so that these issues are addressed eventually. ■ acknowledgments icdl is a large project with many people who make it the wonderful resource that it has become. we thank them all for their continued hard work, as well as our many volunteers and our generous contributors. we would especially like to thank nsf for our information technology research grant, and imls for our national leadership grant. without this generous funding, our research would not be possible. references 1. internet world stats. accessed mar. 9, 2005, www.internetworldstats.com. 2. national telecommunications and information administration (2004). “a nation online: entering the broadband age.” accessed mar. 9, 2005, www.ntia.doc.gov/reports/anol/index.html. 3. i. witten et al., “the promise of digital libraries in developing countries,” communications of the acm 44, no. 5 (2001): 82–85; j. downie, (2003). “realization of four important principles in cross-cultural digital library development,” workshop paper for jcdl 2003. accessed dec. 16, 2004, http://music-ir.org/~jdownie/jcdl03_workshop_downie_dun.pdf. 4. p. busey and t. doerr, “kid’s catalog: an information retrieval system for children,” youth services in libraries 7, no. 1 (1993): 77–84; u. külper, u. schulz, and g. 
will, “bücherschatz—a prototype of a children’s opac,” information services and use no. 17 (1997): 201–14; a. druin et al., “designing a digital library for young children: an intergenerational partnership,” in proceedings of the acm/ieee-cs joint conference on digital libraries (new york: association for computing machinery, 2001), 398–405. 5. a. druin, “what children can teach us: developing digital libraries for children with children,” library quarterly (in press). accessed dec. 16, 2004, www.icdlbooks.org. 6. a. marcus, “global and intercultural user-interface design,” in j. jacko and a. sears, eds., the human-computer interaction handbook (mahwah, n.j.: lawrence erlbaum assoc., 2002), 441–63. 7. t. fernandes, global interface design (boston: ap professional, 1995). 8. g. hofstede, cultures and organizations: software of the mind (new york: mcgraw-hill, 1991). 9. fernandes, global interface design. 10. n. hoft, “developing a cultural model,” in e. del galdo and j. nielsen, eds., international user interfaces (new york: wiley, 1996), 41–73. 11. j. nielsen, “international usability engineering,” in e. del galdo, and j. nielsen, eds., international user interfaces (new york: wiley, 1996), 1–13. 12. c. borgman, “multimedia, multicultural, and multilingual digital libraries: or, how do we exchange data in 400 languages?” d-lib magazine 3 (june 1997). 13. i. witten et al., “greenstone: a comprehensive opensource digital library software system,” in proceedings of digital libraries 2000 (new york: association for computing machinery, 2000), 113–21. 14. g. perlman, “the firstsearch user interface architecture: universal access for any user, in many languages, on any platform,” in cuu 2000 conference proceedings (new york: association for computing machinery, 2000), 1–8. 15. s. 
perugini et al., “enhancing usability in citidel: multimodal, multilingual, and interactive visualization interfaces,” in proceedings of jcdl ‘04 (new york: association for computing machinery, 2004), 315–24. 16. witten et al., “the promise of digital libraries in developing countries”; downie, “four important principles.” 17. e. duncker, “cross-cultural usability of the library metaphor,” in proceedings of jcdl ‘02 (new york: association for computing machinery, 2002), 223–30. 18. p. moore, and a. st. george, “children as information seekers: the cognitive demands of books and library systems,” school library media quarterly 19 (1991): 161–68; p. solomon, “children’s information retrieval behavior: a case analysis of an opac,” journal of the american society for information science and technology 44, no. 5 (1993): 245–64; busey and doerr, “kid’s catalog,” 77–84. 19. a. pejtersen, “a library system for information retrieval based on a cognitive task analysis and supported by an iconbased interface,” acm conference on information retrieval (new york: association for computing machinery, 1989), 40–47. 20. külper et al., “bücherschatz—a prototype of a children’s opac,” 201–14. 21. druin et al., “designing a digital library,” 398–405. 22. y. theng et al., “dynamic digital libraries for children,” in proceedings of the joint conference on digital libraries (new york: association for computing machinery, 2001), 406–15. 23. a. druin, “cooperative inquiry: developing new technologies for children with children,” in proceedings of human factors in computing (new york: association for computing machinery, 1999), 592–99. 24. j. hourcade et al., “the international children’s digital library: viewing digital books online,” interacting with computers 15 (2003): 151–67. 25. k. reuter and a. 
druin, “bringing together children and books: an initial descriptive study of children’s book searching and selection behavior in a digital library,” in proceedings of american society for information science and technology conference (in press). 26. international youth library, the white ravens 2004. available for purchase at www.ijb.de/index2.html (accessed dec. 16, 2004). 27. dublin core metadata initiative. accessed dec. 16, 2004, www.dublincore.org. 28. unicode consortium (2004). accessed dec. 16, 2004, www.unicode.org. 29. j. harris, where’s the bear? (los angeles: the j. paul getty museum, 1997). 30. perugini et al., “enhancing usability in citidel,” 315–24. usability test results for a discovery tool in an academic library jody condit fagan, meris mandernach, carl s. nelson, jonathan r. paulo, and grover saunders information technology and libraries | march 2012 83 abstract discovery tools are emerging in libraries. these tools offer library patrons the ability to concurrently search the library catalog and journal articles. while vendors rush to provide feature-rich interfaces and access to as much content as possible, librarians wonder about the usefulness of these tools to library patrons. to learn about both the utility and usability of ebsco discovery service, james madison university (jmu) conducted a usability test with eight students and two faculty members. the test consisted of nine tasks focused on common patron requests or related to the utility of specific discovery tool features. software recorded participants’ actions and time on task, human observers judged the success of each task, and a post–survey questionnaire gathered qualitative feedback and comments from the participants. participants were successful at most tasks, but specific usability problems suggested some interface changes for both ebsco discovery service and jmu’s customizations of the tool. 
the study also raised several questions for libraries above and beyond any specific discovery-tool interface, including the scope and purpose of a discovery tool versus other library systems, working with the large result sets made possible by discovery tools, and navigation between the tool and other library services and resources. this article will be of interest to those who are investigating discovery tools, selecting products, integrating discovery tools into a library web presence, or performing evaluations of similar systems. jody condit fagan (faganjc@jmu.edu) is director, scholarly content systems, meris mandernach (manderma@jmu.edu) is collection management librarian, carl s. nelson (nelsoncs@jmu.edu) is digital user experience specialist, jonathan r. paulo (paulojr@jmu.edu) is education librarian, and grover saunders (saundebn@jmu.edu) is web media developer, carrier library, james madison university, harrisonburg, va. introduction discovery tools appeared on the library scene shortly after the arrival of next-generation catalogs. the authors of this paper define discovery tools as web software that searches journal-article and library-catalog metadata in a unified index and presents search results in a single interface. this differs from federated search software, which searches multiple databases and aggregates the results. examples of discovery tools include serials solutions summon, ebsco discovery service, ex libris primo, and oclc worldcat local; examples of federated search software include serials solutions webfeat and ebsco integrated search. with federated search software, results rely on the federated software’s search algorithm and relevance ranking as well as on each searched database’s own algorithms and relevance rankings. 
discovery tools, which import metadata into one index, apply one set of search algorithms to retrieve and rank results. this difference is important because it contributes to a fundamentally different user experience in terms of speed, relevance, and ability to interact consistently with results. combining the library catalog, article indexes, and other source types in a unified interface is a big change for users because they no longer need to choose a specific search tool to begin their search. research has shown that such a choice has long been in conflict with users’ expectations.1 federated search software was unable to completely fulfill users’ expectations because of its limited technology.2 now that discovery tools provide a truly integrated search experience, with greatly improved relevance rankings, response times, and increased consistency, libraries can finally begin to meet this area of user expectation. however, discovery tools present new challenges for users: will they be able to differentiate between source types in the integrated results sets? will they be able to limit large results sets effectively? do they understand the scope of the tool and that other online resources exist outside the tool’s boundaries? the sea change brought by discovery tools also raises challenges for librarians, who have grown comfortable with the separation between the library catalog and other online databases. discovery tools may mask important differences between disciplinary searching, and they do not currently offer discipline-specific strategies or limits. they also lack authority control, which makes topical precision a challenge. their usual prominence on library websites may direct traffic away from carefully cultivated and organized collections of online resources. discovery tools offer both opportunities and challenges for library instruction, depending on the academic discipline, users’ knowledge, and information-seeking need. 
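the architectural difference described above, between federated search (each database ranks its own results, which are then aggregated) and a discovery tool (one index, one ranking), can be illustrated with a minimal sketch. it is a toy model written for this illustration and does not reflect how any vendor product is implemented: the `Record` class, the word-count `score` function, the round-robin merge, and the sample titles are all assumptions.

```python
from dataclasses import dataclass
from itertools import zip_longest


@dataclass
class Record:
    title: str
    source: str  # hypothetical source label, e.g. "catalog" or "articles"


def score(query: str, record: Record) -> int:
    """toy relevance: number of query words that appear in the title."""
    words = record.title.lower().split()
    return sum(1 for term in query.lower().split() if term in words)


def federated_search(query, sources):
    """rank inside each source separately, then interleave the ranked lists.

    scores are never compared across sources, so the merged order depends
    on the interleaving step rather than on a single relevance ranking.
    """
    ranked_lists = []
    for records in sources.values():
        hits = [r for r in records if score(query, r) > 0]
        hits.sort(key=lambda r: score(query, r), reverse=True)
        ranked_lists.append(hits)
    merged = []
    for group in zip_longest(*ranked_lists):
        merged.extend(r for r in group if r is not None)
    return merged


def unified_search(query, sources):
    """load every record into one index and rank with one scoring function."""
    index = [r for records in sources.values() for r in records]
    hits = [r for r in index if score(query, r) > 0]
    hits.sort(key=lambda r: score(query, r), reverse=True)
    return hits


# invented sample data, for illustration only
SOURCES = {
    "catalog": [
        Record("library usability testing handbook", "catalog"),
        Record("software testing basics", "catalog"),
    ],
    "articles": [
        Record("usability testing in library discovery tools", "articles"),
        Record("library usability studies", "articles"),
    ],
}
QUERY = "library usability testing"

if __name__ == "__main__":
    for name, search in (("federated", federated_search), ("unified", unified_search)):
        print(name, [r.title for r in search(QUERY, SOURCES)])
```

with this sample data, `federated_search` places the weak catalog match “software testing basics” ahead of the stronger article match “library usability studies,” because each source is ranked separately before merging; `unified_search` orders all four records by the same score.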
james madison university (jmu) is a predominantly undergraduate institution of approximately 18,000 students in virginia. jmu has a strong information literacy program integrated into the curriculum through the university’s information seeking skills test (isst). the isst is completed before students are able to register for third-semester courses. additionally, the library provides an information literacy tutorial, “go for the gold,” that supports the skills needed for the isst. jmu launched ebsco discovery service (eds) in august 2010 after participating as a beta development partner in spring and summer 2010. as with other discovery tools, the predominant feature of eds is integration of the library catalog with article databases and other types of sources. at the time of this study, eds had a few differentiating features. first, because of ebsco’s business as a database and journal provider, article metadata was drawn from a combination of journal-publisher information and abstracts and index records. the latter included robust subject indexing (e.g., the medical subject headings in cinahl). the content searched by eds varies by institution according to the institution’s subscription. jmu had a large number of ebsco databases and third-party database subscriptions through ebsco, so the quantity of information searched by eds at jmu is quite large. eds also allowed for extensive customization of the tool, including header navigation links, results-screen layout, and the inclusion of widgets in the right-hand column of the results screen. jmu libraries developed a custom “quick search” widget based on eds for the library home page (see figure 1), which allows users to add limits to the discovery-tool search and assists with local authentication requirements. 
based on experience with a pilot test of the open-source vufind next-generation catalog, jmu libraries believed users would find the ability to limit up-front useful, so quick search’s first drop-down menu contained keyword, title, and author field limits; the second drop-down contained limits for books, articles, scholarly articles, “just leo library catalog,” and the library website (which did not use eds). the “just leo library catalog” option limited the user’s search to the library catalog database records but used the eds interface to perform the search. to access the native catalog interface, a link to leo library catalog was included immediately above the search box as well as in the library website header. figure 1. quick search widget on jmu library homepage. evaluation was included as part of the implementation process for the discovery tool, and therefore a usability test was conducted in october 2010. the purpose of the study was to explore how patrons used the discovery tool, to uncover any usability issues with the chosen system, and to investigate user satisfaction. specific tasks addressed the use of facets within the discovery tool, patrons’ use of date limiters, and the usability of the quick search widget. the usability test also had tasks in which users were asked to locate books and articles using only the discovery tool, then repeat the task using anything but the discovery tool. this article interprets the usability study’s results in the context of other local usability tests and web-usage data from the first semester of use. some findings were used to implement changes to quick search and the library website, and to recommend changes to ebsco; however, other findings suggested general questions related to discovery tool software that libraries will need to investigate further. 
literature review literature reviewed for this article included some background reading on users and library catalogs, library responses to users’ expectations, usability studies in libraries, and usability studies of discovery tools specifically. the first group of articles comprised a discussion about the limitations of traditional library catalogs. the strengths and weaknesses of library catalogs were reported in several academic libraries’ usability studies.3 calhoun recognized that library users’ preference for google caused a decline in the use and value of library catalogs, and encouraged library leaders to “establish the catalog within the framework of online information discovery systems.” 4 this awareness of changes in user expectations during a time when google set the benchmark for search simplicity was echoed by numerous authors who recognized the limits of library catalogs and expressed a need for the catalog to be greatly modernized to keep pace with the evolution of the web. 5 libraries have responded in several ways to the call for modernization, most notably through investigations related to federated searching and next-generation catalogs. several articles have presented usability studies results for various federated searching products.6 fagan provided a thorough literature review of faceted browsing and next-generation catalogs.7 western michigan university presented usability study results for the next-generation catalog vufind, revealing that participants took advantage of the simple search box but did not use the next-generation catalog features of tagging, comments, favorites, and sms texting. 
8 the university of minnesota conducted two usability studies of primo and reported that participants were satisfied with using primo to find known print items, limit by author and date, and find a journal title.9 tod olson conducted a study with graduate students and faculty using the aquabrowser interface, and his participants located sources for their research they had not previously been able to find.10 the literature also revealed both opportunities and limitations of federated searching and next-generation catalogs. allison presented statistics from google analytics for an implementation of encore at the university of nebraska-lincoln. 11 the usage statistics revealed an increased use of article databases as well as an increased use of narrowing facets such as format and media type, and library location. allison concluded that encore increased users’ exposure to the entire collection. breeding concluded that federated searching had various limitations, especially search speed and interface design, and was thus unable to compete with google scholar. 12 usability studies of next-generation catalogs revealed a lack of features necessary to fully incorporate an entire library’s collection. 
breeding also recognized the limitations of next-generation library catalogs and saw discovery tools as their next step in evolution: “it’s all about helping users discover library content in all formats, regardless of whether it resides within the physical library or among its collections of electronic content, spanning both locally owned materials and those accessed remotely through subscriptions.” 13 the dominant literature related to discovery tools discussed features,14 reviewed them from a library selector perspective,15 summarized academic libraries’ decisions following selection, 16 presented questions related to evaluation after selection,17 and offered a thorough evaluation of common features.18 allison concluded that “usability testing will help clarify what aspects need improvement, what additions will make [the interface] more useful, and how the interface can be made so intuitive that user training is not needed.”19 breeding noted “it will only be through the experience of library users that these products will either prove themselves or not.”20 libraries have been adapting techniques from the field of usability testing for over a decade to learn more about user behavior, usability, and user satisfaction, with library web sites and systems. 21 rubin and chisnell and dumas and redish provided an authoritative overview of the benefits and best practices of usability testing. 22 in addition, campbell and norlin and winters offered specific usability methodologies for libraries.23 worldcat local has dominated usability studies of discovery tools published to date. ward, shadle, and mofield conducted a usability study at the university of washington. 
24 although the second round of testing was not published, the first round involved seven undergraduate and three graduate students; its purpose “was to determine how successful uw students would be in using worldcat local to discover and obtain books and journal articles (in both print and electronic form) from the uw collection, from the summit consortium, and from other worldcat libraries.” 25 although participants were successful at completing these tasks, a few issues arose out of the usability study. users had difficulty with the brief item display because reviews were listed higher than the actual items. the detailed item display also hindered users’ ability to distinguish between various editions and formats. the second round of usability testing, not yet published, included tasks related to finding materials on specific subject areas. boock, chadwell, and reese conducted a usability study of worldcat local at oregon state university.26 the study included four tasks and five evaluative questions. forty undergraduate students, sixteen graduate students, twenty-four library employees, four instructors, and eighteen faculty members took part in the study. they summarized that users found known-title searching to be easier in the library catalog but found topical searches to be more effective in worldcat local. the participants preferred worldcat local for the ability to find articles and search for materials in other institutions. western washington university also conducted a usability study of worldcat local. 
they selected twenty-four participants with a wide range of academic experience to conduct twenty tasks in both worldcat local and the traditional library catalog.27 the comparison revealed several problems in using worldcat local, including users’ inability to determine the scope of the content, confusion over the intermixing of formats, problems with the display of facet option, and difficulty with known-item searches. western washington university decided not to implement worldcat local. oclc published a thorough summary of several usability studies conducted mostly with academic libraries piloting the tool, including the university of washington; the university of california (berkeley, davis, and irvine campuses); ohio state university; the peninsula library system in san mateo, california; and the free library of urbana and the des plaines public library, both in illinois.28 the report conveyed favorable user interest in searching local, group, and global collections together. users also appreciated the ability to search articles and books together. the authors commented, “however, most academic participants in one test (nine of fourteen) wrongly assumed that journal article coverage includes all the licensed content available at their campuses.”29 oclc used the testing results to improve the order of search results, provide clarity about various editions, improve facets for narrowing a search, provide links to electronic resources, and increase visibility of search terms. 
at grand valley state university, doug way conducted an analysis of usage statistics after implementing the discovery tool summon in 2009; the usage statistics revealed an increased use of full-text downloads and link resolver software but a decrease in the use of core subject databases.30 the usage statistics showed promising results, but way recommended further studies of usage statistics over a longer period of time to better understand how discovery tools affect entire library collections. north carolina state university libraries released a final report about their usability study of summon.31 the results of these usability studies were similar to other studies of discovery tools: users were satisfied with the ability to search the library catalog and article databases with a single search, but users had mixed results with known-item searching and confusion about narrowing facets and results ranking. although several additional academic libraries have conducted usability studies of encore, summon, and ebsco discovery service, the results have not yet been published.32 only one usability study of ebsco discovery service was found. in a study with six participants, williams and foster found users were satisfied and able to adapt to the new system quickly but did not take full advantage of the rich feature set.33 combined with the rapid changes in these tools, the literature illustrates a current need for more usability studies related to discovery tools. the necessary focus on specific software implementations and different study designs make it difficult to identify common themes. additional usability studies will offer greater breadth and depth to the current dialogue about discovery tools. this article will help fill the gap by presenting results from a usability study of ebsco discovery service. 
publishing such usability results of discovery tools will inform institutional decisions, improve user experiences, and advance the tools’ content, features, and interface design. in addition, libraries will be able to more thoroughly modernize library catalogs to meet users’ changing needs and expectations as well as keep pace with the evolution of the web.

method

james madison university libraries’ usability lab features one workstation with two pieces of usability software: techsmith’s morae (version 3) (http://www.techsmith.com/morae.asp), which records screen captures of participant actions during the usability studies, and the usability testing environment (ute) (version 3), which presents participants with tasks in a web-browser environment. the ute also presents end-of-task questions to measure time on task and task success. the study of eds, conducted in october 2010, was covered by an institutional review board–approved protocol. participants were recruited for the study through a bulk email sent to all students and faculty. interested respondents were randomly selected to include a variety of grade levels and majors for students and years of service and disciplines taught for faculty members. the study included ten participants with varying levels of experience: two freshmen, two sophomores, two juniors, one senior, one graduate student, and two faculty members. three of the participants were from the school of business, one from education, two from the arts and humanities, and two from the sciences. the remaining two participants had dual majors in the humanities and the sciences.
a usability rule of thumb is that at least five users will reveal more than 75 percent of usability issues.34 because the goal was to observe a wide range of user behaviors and usability issues, and to gather data about satisfaction from a variety of perspectives, this study used two users of each grade level plus two faculty participants (for a total of ten) to provide as much heterogeneity as possible. student participants were presented with ten pre–study questions, and faculty participants were asked nine pre–study questions (see appendix a). the pre–study questions were intended to gather information about participants’ background, including their time at jmu, their academic discipline, and their experience with the library website, the ebscohost interface, the library catalog, and library instruction. since participants were anonymous, we hoped their answers would help us interpret unusual comments or findings. pre–test results were not used to form comparison groups (e.g., freshmen versus seniors) because these groups would not be representative of their larger populations. these questions were followed by a practice task to help familiarize participants with the testing software. the study consisted of nine tasks designed to showcase usability issues, show the researchers how users behaved in the system, and measure user satisfaction. appendix b lists the tasks and what they were intended to measure. in designing the test, determining success on some tasks seemed very objective (find a video about a given topic) while others appeared to be more subjective (those involving relevance judgments). for this reason, we asked participants to provide satisfaction information on some tasks and not others. in retrospect, for consistency of interpretation, we probably should have asked participants to rate or comment on every task.
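the five-user rule of thumb cited earlier in this section is commonly explained with a simple problem-discovery model, in which each participant independently exposes a given usability problem with probability p, so that n participants are expected to reveal a share 1 - (1 - p)^n of the problems. the python sketch below illustrates that model only. it is not taken from this study, the function names are illustrative, and p = 0.31 is an assumed, frequently quoted average rather than a value measured here.

```python
def share_of_problems_found(n_users: int, p: float) -> float:
    """Expected share of usability problems seen at least once.

    Assumes every problem is exposed by each user independently with
    the same probability p (a simplifying assumption of the model).
    """
    if n_users < 0:
        raise ValueError("n_users must be non-negative")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    return 1.0 - (1.0 - p) ** n_users


def users_needed(target_share: float, p: float, max_users: int = 1000) -> int:
    """Smallest number of users whose expected share reaches target_share."""
    for n in range(max_users + 1):
        if share_of_problems_found(n, p) >= target_share:
            return n
    raise ValueError("target not reached within max_users")


# Assumed per-user discovery probability (illustrative, not from this study).
ASSUMED_P = 0.31

for n in (1, 3, 5, 10):
    share = share_of_problems_found(n, ASSUMED_P)
    print(f"{n:2d} users -> {share:.0%} of problems expected")
```

under the assumed p, five users are expected to reveal roughly 84 percent of problems, which is consistent with the "more than 75 percent" rule of thumb; a lower p would require more users to reach the same share.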
all of the tasks were presented in the same order. tasks were completed either by clicking “answer” and answering a question (multiple choice or typed response), or by clicking “finished” after navigating to a particular webpage. participants also had the option to skip the task they were working on and move to the next task. allowing participants to skip a task helps differentiate between genuinely incorrect answers and incorrect answers due to participant frustration or guessing. a time limit of 5 minutes was set for tasks 1–7, while tasks 8 and 9 were given time limits of 8 minutes, after which the participant was timed out. time limits were used to ensure participants were able to complete all tasks within the agreed-upon session. average time on task across all tasks was 1 minute, 35 seconds. after the study was completed, participants were presented with the system usability scale (sus), a ten-item scale using statements of subjective assessment and covering a variety of aspects of system usability.35 sus scores, which provide a numerical score out of 100, are affected by the complexity of both the system and the tasks users may have performed before taking the sus. the sus was followed by a post–test consisting of six open-ended questions, plus one additional question for faculty participants, intended to gather more qualitative feedback about user satisfaction with the system (see appendix a). a technical glitch with the ute software affected the study in two ways. first, on seven of the ninety task attempts (ten participants, nine tasks each), the ute failed to enforce the task’s maximum time limit, and participants exceeding the limit were allowed to continue until they completed or skipped the task. one participant exceeded the time limit on task 1, while three of these errors occurred on each of tasks 8 and 9.
this problem potentially limits the ability to compare the average time on task across tasks; however, since this study used time on task in a descriptive rather than comparative way, the impact on interpreting results is minimal. the seven instances in which the glitch occurred were included in the average time on task data found in figure 3 because the times were not extreme and the time limit had been imposed mostly to be sure participants had time to complete all the tasks. a second problem with the ute was that it randomly and prematurely aborted some users’ tasks; when this happened, participants were informed that their time had run out and were then moved on to the next task. this problem is more serious because it is unknown how much more time or effort the participant would have spent on the task or whether they would have been more successful. because of this, the results below specify how many participants were affected for each task. although this was unfortunate, the results of the participants who did not experience this problem still provide useful cases of user behavior, especially because this study does not attempt to generalize observed behavior or usability issues to the larger population. although a participant mentioned a few technical glitches during testing to the facilitator, the extent of software errors was not discovered until after the tests were complete (and the semester was over) because the facilitator did not directly observe participants during sessions.

results

the participants were asked several pre–test questions to learn about their research habits. all but one participant indicated they used the library website no more than six times per month (see figure 2). common tasks this study’s student participants said they performed on the website were searching for books and articles, searching for music scores, “research using databases,” and checking library hours.
the two faculty participants mentioned book and database searches, electronic journal access, and interlibrary loan. participants were shown the quick search widget and were asked “how much of the library’s resources do you think the quick search will search?” seven participants said “most”; only one person, a faculty member, said it would search “all” the library’s resources.

[pie chart: < 1 visit (2); 1–3 visits (4); 4–6 visits (3); > 7 visits (1)]
figure 2. monthly visits to library website

when shown screenshots of the library catalog and an ebscohost database, seven participants were sure they had used leo library catalog, and three were not sure. three indicated that they had used an ebsco database before, five had not, and two were not sure. participants were also asked how often they had used library resources for assignments in their major field of study; four said “often,” two said “sometimes,” one “rarely/never,” and one “very often.” students were also asked “has a librarian spoken to a class you’ve attended about library research?” and two said yes, five said no, and one was not sure. a “practice task” was administered to ensure participants were comfortable with the workstation and software: “use quick search to search a topic relating to your major/discipline or another topic of interest to you. if you were writing a paper on this topic how satisfied would you be with these results?” no one selected “no opinion” or “very unsatisfied”; sixty percent were “very satisfied” or “satisfied” with their results; forty percent were “somewhat unsatisfied.” figure 3 shows the time spent on each task, while figure 4 describes participants’ success on the tasks.

task                                        1     2     3     4     5     6     7     8     9
no. of responses (not including timeouts)   10    9     5     7     9     10    10    8     10
avg. time on task (in seconds)              175*  123   116   97    34    120   92    252*  255*
standard deviation                          212   43    50    49    26    36    51    177   174

*includes time(s) in excess of the set time limit. excess time allowed by software error.

[bar chart, “average time for all tasks (not including timeouts)”: time on task (in seconds) for tasks 1–9, with data labels 175, 123, 116, 97, 34, 120, 92, 292, 255]
figure 3. average time spent on tasks

the first task (“what was the last thing you searched for when doing a research assignment for class? use quick search to re-search for this.”) started participants on the library homepage. participants were then asked to “tell us how this compared to your previous experience” using a text box. the average time on task was almost 2 minutes; however one faculty participant took more than 12 minutes on this task; if his or her time was removed, the time on task average was 1 minute, 23 seconds. figure 5 shows the participants’ search terms and their comments.

how success determined:
task 1: users only asked to provide feedback
task 2: valid typed-in response provided
task 3: how many subtasks completed (out of 3)
task 4: how many subtasks completed (out of 2)
task 5: correct multiple choice answer
task 6: how many subtasks completed (out of 2)
task 7: end task at correct web location
task 8: how many subtasks completed (out of 4)
task 9: how many subtasks completed (out of 4)

participant  task 1  task 2    task 3  task 4  task 5     task 6  task 7   task 8   task 9
p01          n/a     correct   3       2       timeout    2       correct  0*       0**
p02          n/a     correct   3*      1       correct    2       correct  0**      3
p03          n/a     correct   0*      1       incorrect  2       correct  4        3
p04          n/a     correct   2       0*      correct    2       skip     3        2
p05          n/a     correct*  2       2       correct    1       correct  4        2
p06          n/a     correct   3*      1       correct    1       correct  3        0**
p07          n/a     correct   2       1*      correct    1       correct  0        2
p08          n/a     correct   2       0*      correct    0       skip     timeout  0**
p09          n/a     correct   2*      skip    correct    2       correct  4        2
p10          n/a     correct   1*      1       correct    2       skip     4        2

note: “timeout” indicates an immediate timeout error. users were unable to take any action on the task.
*user experienced a timeout error while working on the task. this may have affected their ability to complete the task.
**user did not follow directions.

figure 4. participants’ success on tasks

participant (jmu status; major/discipline), search terms, and comments:

p01 (faculty; geology). search terms: large low shear wave velocity province
comments: ebsco did a fairly complete job. there were some irrelevant results that i don’t remember seeing when i used georef.

p02 (faculty; computer information systems & management science (statistics)). search terms: student cheating
comments: this is a topic that i am somewhat familiar with the related literature. i was pleased with the diversity of journals that were found in the search. the topics of the articles was right on target. the recency of the articles was great. this is a topic for which i am somewhat familiar with the related literature. i was impressed with the search results regarding: diversity of journals; recency of articles; just the topic in articles i was looking for.

p03 (graduate student; education). search terms: death of a salesman
comments: there is a lot of variety in the types of sources that quick search is pulling up now. i would still have liked to see more critical sources on the play but i could probably have found more results of that nature with a better search term, such as “death of a salesman criticism.”

p04 (1st year; voice performance). search terms: current issues in russia
comments: it was somewhat helpful in the way that it gave me information about what had happened in the past couple months, but not what was happening now in russia.

p05 (3rd year; nursing). search terms: uninsured and health care reform
comments: the quick search gave very detailed articles i thought, which could be good, but were not exactly what i was looking for. then again, i didn’t read all these articles either

p06 (1st year; history). search terms: headscarf law
comments: this search yielded more results related to my topic. i needed other sources for an argument on the french creating law banning religious dress and symbols in school. using other methods with the same keyword, i had an enormous amount of trouble finding articles that pertained to my essay.

p07 (3rd year; english). search terms: jung
comments: i like the fact that it can be so defined to help me get exactly what i need.

p08 (4th year; spanish). search terms: restaurant industry
comments: this is about the same as the last time that i researched this topic.

p09 (2nd year; hospitality). search terms: aphasia
comments: there are many good sources, however there are also completely irrelevant sources.

p10 (2nd year; management). search terms: rogers five types of feedback
comments: there is not many documents on the topic i searched for. this may be because the topic is not popular or my search is not specific/too specific.

figure 5. participants’ search terms and comments

the second task started on the library homepage and asked participants to find a video related to early childhood cognitive development. this task was chosen because jmu libraries have significant video collections and because the research team hypothesized users might have trouble because there was no explicit way to limit to videos at the time. the average time on this task was two minutes, with one person experiencing an arbitrary time out by the software. participants were judged to be successful on this task by the researchers if they found any video related to the topic. all participants were successful on this task, but four entered, then left the discovery tool interface to complete the task. five participants looked for a video search option in the drop-down menu, and of these, three immediately used something other than quick search when they saw that there was no video search option.
of those who tried quick search, six opened the source type facet in eds search results and four selected a source type limit, but only two selected a source type that led directly to success (“non-print resources”). task 3 started participants in eds (see figure 6) and asked them to search on speech pathology, find a way to limit search results to audiology, and limit their search results to peer-reviewed sources. participants spent an average of 1 minute, 40 seconds on this task, with five participants being artificially timed out by the software. participants’ success on this task was determined by the researchers’ examination of the number of subtasks they completed. the three subtasks consisted of successfully searching for the given topic (speech language pathology), limiting the search results to audiology, and further limiting the results to peer-reviewed sources. four participants were able to complete all three subtasks, including two who were timed out. (the times for those who were timed out were not included in time on task averages, but they were given credit for success.) five completed just two of the subtasks, failing to limit to peer-reviewed; one of these because of a timeout. it was unclear why the remaining participants did not attempt to alter the search results to “peer reviewed.” looking at the performed actions, six of the ten typed “and audiology” into search keywords to narrow the search results, while one found and used “audiology” in the subject facet on the search results page. six participants found and used the “scholarly (peer reviewed) journals” checkbox limiter.

figure 6. ebsco discovery service interface

beginning with the results they had from task 3, task 4 asked participants to find more recent sources and to select the most recent source available.
task success was measured by correct completion of two subtasks: limiting the search results to the last five years and finding the most recent source. the average time on task was 1 minute, 14 seconds, with three artificial timeouts. of those who did not time out, all seven were able to limit their sources to be more recent in some way, but only three were able to select the most recent source. in addition to this being a common research task, the team was interested to see how users accomplished this task. three typed in the limiter in the left-hand column, two typed in the limiter on the advanced search screen, and two used the date slider. two participants used the “sort” drop-down menu to change the sort order to “date descending,” which helped them complete this task. other participants changed the dates, and then selected the first result, which was not the most recent. task 5, which started within eds, asked participants to find a way to ask a jmu librarian for help. the success of this task was measured by whether they reached the correct url for the ask-a-librarian page; eight of the ten participants were successful. this task took an average of only 31 seconds to complete, and eight of the ten used the ask-a-librarian link at the top of the page. of the two unsuccessful participants, one was timed out, while another clicked “search modes” for no apparent reason, then clicked back and decided to finish the task. task 6 started in the eds interface and asked participants to locate the journal yachting and boating world and select the correct coverage dates and online status from a list of four options; participants were deemed successful at two subtasks if they selected the correct option and successful at one subtask if they chose an option that was partially correct. participants took an average of two minutes on this task; only five answered correctly.
during this task, three participants used the ebsco search option “so journal title/source,” four used quotation marks, and four searched or re-searched with the “title” drop-down menu option. three chose the correct dates of coverage, but were unable to correctly identify the online availability. it is important to note that only searching and locating the journal title were accomplished with the discovery tool; to see dates of coverage and online availability, users clicked jmu’s link resolver button, and the resulting screen was served from serials solutions’ article linker product. although some users spent more time than perhaps was necessary using the eds search options to locate the journal, the real barriers to this task were encountered when trying to interpret the serials solutions screen. task 7, where participants started in eds, was designed to determine whether users could navigate to a research database outside of eds. users were asked to look up the sculpture genius of mirth and were told the library database camio would be the best place to search. they were instructed to “locate this database and find the sculpture.” the researcher observed the recordings to determine success on this task, which was defined as using camio to find the sculpture. participants took an average of 1 minute, 32 seconds on this task; seven were observed to complete the task successfully, while three chose to skip the task. to accomplish this task, seven participants used the jmu research databases link in the header navigation at some point, but only four began the task by doing this. six participants began by searching within eds. the final two tasks started on the library homepage and were a pair: participants were asked to find two books and two recent, peer-reviewed articles (from the last five years) on rheumatoid arthritis. 
task 8 asked them to use the library’s eds widget, quick search, to accomplish this, and task 9 asked them to accomplish the same task without using quick search. when they found sources, they were asked to enter the four relevant titles in a text-entry box. the average time spent on these tasks was similar: about four minutes per task. comparing these tasks was somewhat confusing because some participants did not follow instructions. user success was determined by the researchers’ observation of how many of the four subtasks the user was able to complete successfully: find two books, find two articles, limit to peer reviewed, and select articles from last five years (with or without using a limiter); figure 4 shows their success. looking at the seven users who used quick search on task 8, six limited to “scholarly (peer reviewed) journals”; six limited to the last five years; and seven narrowed results using the source type facet. the average number of subtasks completed on task 8 was 3.14 out of 4. looking at the seven users who followed instructions and did not use quick search on task 9, all began with the library catalog and tried to locate articles within the library catalog. the average number of subtasks completed on task 9 was 2.29 out of 4. some users tried to locate articles by setting the catalog’s material type drop-down menu to “periodicals” and others used the catalog’s “periodical” tab, which performed a title keyword search of the e-journal portal. for task 9, only two users eventually chose a research database to find articles. user behavior can only be compared for the six users (all students) who followed instructions on both tasks; a summary is provided in figure 4. after completing all nine tasks, participants were presented with the system usability scale. eds scored 56 out of 100.
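for readers unfamiliar with how a sus score is derived, the python sketch below shows the standard scoring procedure for the ten-item questionnaire: odd-numbered (positively worded) items contribute their rating minus 1, even-numbered (negatively worded) items contribute 5 minus their rating, and the sum is multiplied by 2.5 to give a score from 0 to 100. the response values in the example are hypothetical and are not this study’s data, and the function names are illustrative.

```python
from statistics import mean


def sus_score(responses):
    """Score one participant's SUS questionnaire (ten ratings, each 1-5)."""
    if len(responses) != 10:
        raise ValueError("SUS has exactly ten items")
    if any(r not in range(1, 6) for r in responses):
        raise ValueError("each rating must be an integer from 1 to 5")

    total = 0
    for item_number, rating in enumerate(responses, start=1):
        if item_number % 2 == 1:
            total += rating - 1   # odd items: positively worded
        else:
            total += 5 - rating   # even items: negatively worded
    return total * 2.5            # rescale the 0-40 sum to 0-100


def study_sus(all_responses):
    """Average the per-participant scores to get a study-level score."""
    return mean(sus_score(r) for r in all_responses)


# Hypothetical ratings for two participants (not data from this study).
hypothetical_responses = [
    [4, 2, 4, 2, 3, 3, 4, 2, 3, 3],
    [3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
]

print(study_sus(hypothetical_responses))
```

a study-level figure such as the one reported here is normally the mean of the individual participants’ scores, which is what the second function sketches.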
following the sus, participants were asked a series of post–test questions. only one of the faculty members chose to answer the post–test questions. when asked how they would use quick search, all eight students explicitly mentioned class assignments, and the participating faculty member replied “to search for books.” two students mentioned books specifically, while the rest used the more generic term “sources” to describe items for which they would search. when asked “when would you not use this search tool?” the faculty member said “i would just have to get used to using it. i mainly go to [the library catalog] and then research databases.” responses from the six students who answered this question were vague and hard to categorize:
• “not really sure for more general question/learning”
• “when just browsing”
• “for quick answers”
• “if i could look up the information on the internet”
• “when the material i need is broad”
• “basic searching when you do not need to say where you got the info from”
when asked for the advantages of quick search, four specifically mentioned the ability to narrow results, three respondents mentioned “speed,” three mentioned ease of use, and three mentioned relevance in some way (e.g., “it does a pretty good job associating keywords with sources”). two mentioned the broad coverage and one compared it to google, “which is what students are looking for.” when asked to list disadvantages, the faculty member mentioned he/she was not sure what part of the library home page was actually “quick search,” and was not sure how to get to his/her library account.
three students talked about quick search being “overwhelming” or “confusing” because of the many features, although one of these also stated, “like anything you need to learn in order to use it efficiently.” one student mentioned the lack of an audio recording limit and another said “when the search results come up it is hard to tell if they are usable results.” knowing that quick search may not always provide the best results, the research team also asked users what they would do if they were unable to find an item using quick search. a faculty participant said he or she would log into the library catalog and start from there. five students mentioned consulting a library staff member in some fashion. three mentioned moving on from library resources, although not necessarily as their first step. one said “find out more information on it to help narrow down my search.” only one student mentioned the library catalog or any other specific library resource. when participants were asked if “quick search” was an appropriate name, seven agreed that it was. of those who did not agree, one participant’s comment was “not really, though i don’t think it matters.” and another’s was “i think it represents the idea of the search, but not the action. it could be quicker.” the only alternative name suggestion was “search tool.”

web traffic analysis

web traffic through quick search and in eds provides additional context for this study’s results. during august–december 2010, quick search was searched 81,841 times from the library homepage. this is an increase from traffic into the previous widget in this location that searched the catalog, which received 41,740 searches during the same period in 2009. even adjusting for an approximately 22 percent increase in website traffic from 2009 to 2010, this is an increase of 75 percent.
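the text does not say exactly how the traffic adjustment was made. as a hedged illustration, the python sketch below computes the raw change in searches from the two counts above and then applies two possible adjustments for the approximately 22 percent growth in website traffic: dividing the growth out, and subtracting it from the raw change. both adjustment functions are assumptions introduced here for illustration, not the authors’ documented method.

```python
def raw_change(before: float, after: float) -> float:
    """Fractional change from `before` to `after` (0.5 means +50 percent)."""
    return after / before - 1.0


def ratio_adjusted_change(before: float, after: float, traffic_growth: float) -> float:
    """Assumed adjustment 1: scale the baseline up by overall traffic growth,
    then measure the change against that scaled baseline."""
    expected_after = before * (1.0 + traffic_growth)
    return after / expected_after - 1.0


def difference_adjusted_change(before: float, after: float, traffic_growth: float) -> float:
    """Assumed adjustment 2: subtract overall traffic growth from the raw change."""
    return raw_change(before, after) - traffic_growth


# Counts reported in the text for the homepage search widget.
searches_2009 = 41_740   # previous catalog widget, same period in 2009
searches_2010 = 81_841   # quick search, august-december 2010
traffic_growth = 0.22    # approximate growth in website traffic, 2009 to 2010

print(f"raw change:          {raw_change(searches_2009, searches_2010):.1%}")
print(f"ratio-adjusted:      {ratio_adjusted_change(searches_2009, searches_2010, traffic_growth):.1%}")
print(f"difference-adjusted: {difference_adjusted_change(searches_2009, searches_2010, traffic_growth):.1%}")
```

under these assumptions the raw increase is about 96 percent; subtracting the traffic growth gives roughly 74 percent, close to the reported figure, while dividing it out gives roughly 61 percent. which reading was intended is not stated.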
interestingly, the traffic to the most popular link on the library homepage, research databases, went from 55,891 in 2009 to 30,616 in 2010, a decrease of 55 percent when adjusting for the change in website traffic. during fall 2010, 28 percent of quick search searches from the homepage were executed using at least one drop-down menu. twelve percent changed quick search’s first drop-down menu to something other than the keyword default, with “title” being the most popular option (7 percent of searches) followed by author (4 percent of searches). twenty percent of users changed the second drop-down option; “just articles” and “just books” were the most popular options, garnering 7 percent and 6 percent of searches, respectively, followed by “just scholarly articles,” which accounted for 4 percent of searches. looking at ebsco’s statistical reports for jmu’s implementation of eds, there were 85,835 sessions and approximately 195,400 searches during august–december 2010. this means about 95 percent of eds sessions were launched using quick search from the homepage. there was an average of 2.3 searches per session, which is comparable to past behavior in jmu’s other ebscohost databases.

discussion

the goal of this study was to gather initial data about user behavior, usability issues, and user satisfaction with discovery tools. the task design and technical limitations of the study mean that comparing time on task between participants or tasks would not be particularly illuminating; and, while the success rates on tasks are interesting, they are not generalizable to the larger jmu population. instead, this study provided observations of user behavior that librarians can use to improve services, it suggested some “quick fixes” to usability issues, and it pointed to several research questions.
when possible, these observations are supplemented by comparisons between this study and the only other published usability study of eds.36 this study confirmed a previous finding of user studies of federated search software and discovery tools: students have trouble determining what is searched by various systems.37 on the task in which they were asked not to use quick search to find articles, participants tried to search for articles in the library catalog. although all but one of this study’s participants correctly answered that quick search did not search “all” library resources, seven thought it searched “most.” either “most” or “some” would be considered correct; however, it is interesting that answering this question more specifically is challenging even for librarians. many journals in subject article indexes and abstracts are included in the eds foundation index; furthermore, jmu’s implementation of eds includes all of jmu’s ebsco subscription resources as well, making it impractical to assemble a master list of indexed titles. of course, there are numerous online resources with contents which may never be included in a discovery tool, such as political voting records, ethnographic files, and financial data. users often have access to these resources through their library. however, if they do not know the library has a database of financial data, they will certainly not consider this content in their response to a question of how many of the library resources are included in the discovery tool. as discovery tools begin to fulfill users’ expectations for a “single search,” libraries will need to share best practices for showcasing valuable, useful collections that fall outside the discovery tool’s scope or abilities. this is especially critical when reviewing the 72 percent increase in homepage traffic to the homepage search widget compared with the 55 percent decrease in homepage traffic to the research databases page.
it is important to note these trends do not mean the library’s other research databases have fallen in usage by 55 percent. though there was not a comprehensive examination of usage statistics, spot-checking suggested ebsco and non-ebsco subject databases had both increases and decreases in usage from previous years. another issue libraries should consider, especially when preparing for instruction classes, is that users do not seem to understand which information needs are suited to a discovery tool versus the catalog or subject-specific databases. several tasks provided additional information about users’ mental models of the tool, which may help libraries make better decisions about navigation customizations in discovery tool interfaces and on library websites. task 7 was designed to discover whether users could find their way to a database outside of eds if they knew they needed to use a specific database. six participants, including one of the faculty members, began by searching eds for the name of the sculpture and/or the database name. on task 1, a graduate student who searched on “death of a salesman” and was asked to comment on how quick search results compared to his or her previous experience, said, “i would still have liked to see more critical sources on the play but i could probably have found more results of that nature with a better search term, such as ‘death of a salesman criticism.’” while true, most librarians would suggest using a literary criticism database, which would target this information need. librarians may have differing opinions regarding the best research starting point, but their rationale would be much different than that of the students in this study. this study’s participants said they would use quick search/eds when they were doing class work or research, but would not use it for general inquiries.
if librarians were to list which user information needs are best met by a discovery tool versus a subject-specific database, the types of information needs listed would be much more numerous and diverse, regardless of differences over how to classify them. in addition to helping users choose between a discovery tool and a subject-specific database, libraries will need to conceptualize how users will move in and out of the discovery tool to other library resources, services, and user accounts. while users had no trouble finding the ask-a-librarian link in the header, it might have been more informative if users started from a search results page to see if they would find the right-hand column’s ask-a-librarian link or links to library subject guides and database lists. discovery tools vary in their abilities to connect users with their online library accounts and are changing quickly in this area. this study also provided some interesting observations about discovery tool interfaces. the default setting for ebsco discovery service is a single search box. however, this study suggests that while users desire a single search, they are willing to use multiple interface options. this was supported by log analysis of the library’s locally developed entry widget, quick search, in which 28 percent of searches included the use of a drop-down menu. on the first usability task, users left quick search’s options set to the default. on other tasks, participants frequently used the drop-down menus and limiters in both quick search and eds. for example, on task 2, which asked them to look for videos, five users looked in the quick search format drop-down menu. on the same task within eds, six users attempted to use the source type facet. use of limiters was similarly observed by williams and foster in their eds usability study.38 one eds interface option that was not obvious to participants was the link to change the sort order.
when asked to find the most recent article, only two participants changed the sort option. most others used the date input boxes to limit their search, then selected the first result even though it was not the most recent one. it is unclear whether these participants assumed the first result was the most recent or whether they could not figure out how to display the most recent sources. finding a journal title from library homepages has long been a difficult task,39 and this study provided no exception, even with the addition of a discovery tool. it is important to note that the standard eds implementation would include a “publications” or “journals a–z” link in the header; in eds, libraries can customize the text of this link. jmu did not have this type of link enabled in our test, since the hope was that users could find journal titles within the eds results. however, neither eds nor the quick search widget’s search interfaces offered a way to limit the search to a journal title at the time of this study. during the usability test, four participants changed the field search drop-down menu to “title” in eds, and three participants changed the eds field search drop-down menu to “so journal title/source,” which limits the search to articles within that journal title. while both of these ideas were good, neither one resulted in a precise results set in eds for this task unless the user also limited to “jmu catalog only,” a nonintuitive option. since the test, jmu has added a “journal titles” option to quick search that launches the user’s search into the journal a–z list (provided by serials solutions). in the two months after the change (february and march 2011), only 391 searches were performed with this option. this was less than 1 percent of all searches, indicating that while it may be an important task, it is not a popular one.
like many libraries with discovery tools, jmu added federated search capabilities to eds using ebscohost integrated search software in an attempt to draw some traffic to databases not included in eds (or not subscribed to through ebsco by jmu), such as mla international bibliography, scopus, and credo reference. links to these databases appeared in the upper-right-hand column of eds during the usability study (see figure 6). usage data from ebsco showed that less than 1 percent of all jmu’s eds sessions for fall 2010 included any interaction with this area. likewise, williams and foster observed their participants did not use their federated search until explicitly asked to do so.40 perhaps users faced with discovery tool results simply have no motivation to click on additional database results. since the usability test, jmu has replaced the right-hand column with static links to ask-a-librarian, subject guides, and research database lists. readers may wonder why one of the most common tasks, finding a specific book title, was not included in this usability study; this was because jmu libraries posed this task in a concurrent homepage usability study. in that study, twenty of the twenty-five participants used quick search to find the title “pigs in heaven” and choose the correct call number. eleven of the twenty used the quick search drop-down menu to choose a title search option, further confirming users’ willingness to limit up-front. the average time on this task was just under a minute, and all participants completed this task successfully, so this task was not repeated in the eds usability test. other studies have reported trouble with this type of task;41 much could depend on the item chosen as well as the tool’s relevance ranking. user satisfaction with eds can be summarized from the open-ended post–study questions, from the responses to task 1 (figure 5), and the sus scale.
answers to the post–study questions indicated participants liked the ability to narrow results, the speed and ease of use, and relevance of the system. a few participants did describe the system as being “overwhelming” or “confusing” because of the many features, which was also supported by the sus scores. jmu has been using the sus to understand the relative usability of library systems. the sus offers a benchmark for system improvement; for example, ebsco discovery service received an sus of only 37 in spring 2010 (n = 7) but a 56 in this study in fall 2010 (n = 10). this suggests the interface has become more usable. in 2009, jmu libraries also used the sus to test the library catalog’s classic interface as well as a vufind interface to the library catalog, which received scores of 68 (n = 15) and 80 (n = 14), respectively. the differences between the catalog scores and eds indicate an important distinction between usability and usefulness, with the latter concept encompassing a system’s content and capabilities. the library catalog is, perhaps, a more straightforward tool than a discovery tool and attempts to provide access to a smaller set of information. it has none of the complexity involved in finding article-level or book chapter information. all else being equal, simpler tools will be more usable. in an experimental study, tsakonas and papatheodorou found that while users did not distinguish between the concepts of usability and usefulness, they prefer attributes composing a useful system in contrast to those supporting usability.42 discovery tools, which support more tasks, must make compromises in usability that simpler systems can avoid. in their study of eds, williams and foster also found overall user satisfaction with eds.
their participants made positive comments about the interface as well as the usefulness and relevance of the results.43 jmu passed on several suggestions to ebsco related to eds based on the test results. ebsco subsequently added “audio” and “video” to the source types, which enabled jmu to add a “just videos at jmu” option to quick search. while it is confusing that “audio” and “video” source types currently behave differently than the others in eds, in that they limit to jmu’s catalog as well as to the source type, this behavior produces what most local users expect. a previous usability study of worldcat local showed users have trouble discriminating between source types in results lists, so the source types facet is important.44 another piece of feedback provided to ebsco was that on the task where users needed to choose the most recent result, only two of our participants sorted by date descending. perhaps the textual appearance of the sort option (instead of a drop-down menu) was not obvious to participants (see figure 6); however, williams and foster did not observe this to be an issue in their study.45

future research

the findings of this study suggest many avenues for future research. libraries will need to revisit the scope of their catalogs and other systems to keep up with users’ mental models and information needs. catalogs and subject-specific databases still perform some tasks much better than discovery tools, but libraries will need to investigate how to situate the discovery tool and specialized tools within their web presence in a way that will make sense to users. when should a user be directed to the catalog versus a discovery tool? what items should libraries continue to include in their catalogs? what role do institutional repositories play in the suite of library tools, and how does the discovery tool connect to them (or include them)? how do library websites begin to make sense of the current state of library search systems?
above all, are users able to find the best resources for their research needs? although research on searchers’ mental models has been extensive,46 librarians’ mental models have not been studied as such. yet placing the discovery tool among the library’s suite of services will involve compromises between these two models. another area needing research is how to instruct users to work with the large numbers of results returned by discovery tools. in subject-specific databases, librarians often help users measure the success of their strategy, or even their topic, by the number of results returned: in criminal justice abstracts, 5,000 results means a topic is too broad or the search strategy needs refinement. in a discovery tool, a result set this large will likely have some good results on the first couple of pages if sorted by relevance; however, users will still need to know how to grow or reduce their result sets. participants in this study showed a willingness to use limiters and other interface features, but not always the most helpful ones. when asked to narrow a broad subject on task 3 of this study, only one participant chose to use the “subject” facet even when the subtopic, audiology, was clearly available. most added search terms. it will be important for future studies to investigate the best way for users to narrow large result sets in a discovery tool. this study also suggested possible areas of investigation for future user studies. one interesting finding related to this study’s users’ information contexts was that when users were asked to search on their last research topic, it did not always match up with their major: a voice performance student searched on “current issues in russia,” and the hospitality major searched on “aphasia.” to what extent does a discovery tool help or hinder students who are searching outside their major area of study?
one of jmu’s reference librarians noted that while he would usually teach a student majoring in a subject how to use that subject’s specific indexes, as opposed to a discovery tool, a student outside the major might not need to learn the subject-specific indexes for that subject and could be well served by the discovery tool. future studies could also investigate the usage and usability of discovery tool features in order to continue informing library customizations and advice to vendors. for example, this study did not have a task related to logging into a patron account or requesting items, but that would be good to investigate in a follow-up study. another area ripe for further investigation is discovery tool limiters. this study’s participants frequently attempted to use limiters, but didn’t always choose the correct ones for the task. what are the ideal design choices for making limiters intuitive? this study found almost no use of the embedded federated search add-on: is this true at other institutions? finally, this study and others reveal difficulty in distinguishing source types. development and testing of interface enhancements to support this ability would be helpful to many libraries’ systems.

conclusion

this usability test of a discovery tool at james madison university did not reveal as many interface-specific findings as it did questions about the role of discovery tools in libraries. users were generally able to navigate through the quick search and eds interfaces and complete tasks successfully. tasks that are challenging in other interfaces, such as locating journal articles and discriminating between source types, continued to be challenging in a discovery tool interface. this usability test suggested that while some interface features were heavily used, such as drop-down limits and facets, other features were not used, such as federated search results.
as discovery tools continue to grow and evolve, libraries should continue to conduct usability tests, both to find usability issues and to understand user behavior and satisfaction. although discovery tools challenge libraries to think not only about access but also about the best research pathways for users, they provide users with a search that more closely matches their expectations.

acknowledgement

the authors would like to thank patrick ragland for his editorial assistance in preparing this manuscript.

correction

april 12, 2018: at the request of the author, this article was revised to remove a link to a website.

references

1. emily alling and rachael naismith, “protocol analysis of a federated search tool: designing for users,” internet reference services quarterly 12, no. 1 (2007): 195, http://scholarworks.umass.edu/librarian_pubs/1/ (accessed jan. 11, 2012); frank cervone, “what we've learned from doing usability testing on openurl resolvers and federated search engines,” computers in libraries 25, no. 9 (2005): 10; sara randall, “federated searching and usability testing: building the perfect beast,” serials review 32, no. 3 (2006): 181–82, doi: 10.1016/j.serrev.2006.06.003; ed tallent, “metasearching in boston college libraries – a case study of user reactions,” new library world 105, no. 1 (2004): 69–75, doi: 10.1108/03074800410515282.
2. s. c. williams and a. k. foster, “promise fulfilled? an ebsco discovery service usability study,” journal of web librarianship 5, no. 3 (2011), http://www.tandfonline.com/doi/pdf/10.1080/19322909.2011.597590 (accessed jan. 11, 2012).
3. janet k. chisman, karen r. diller, and sharon l. walbridge, “usability testing: a case study,” college & research libraries 60, no. 6 (november 1999): 552–69, http://crl.acrl.org/content/60/6/552.short (accessed jan. 11, 2012); frances c. johnson and jenny craven, “beyond usability: the study of functionality of the 2.0 online catalogue,” new review of academic librarianship 16, no.
2 (2010): 228–50, doi: 10.1108/00012531011015217 (accessed jan. 11, 2012); jennifer e. knievel, jina choi wakimoto, and sara holladay, “does interface design influence catalog use? a case study,” college & research libraries 70, no. 5 (september 2009): 446–58, http://crl.acrl.org/content/70/5/446.short (accessed jan. 11, 2012); jia mi and cathy weng, “revitalizing the library opac: interface, searching, and display challenges,” information technology & libraries 27, no. 1 (march 2008): 5–22, http://0-www.ala.org.sapl.sat.lib.tx.us/ala/mgrps/divs/lita/publications/ital/27/1/mi.pdf (accessed jan. 11, 2012).
4. karen calhoun, “the changing nature of the catalog and its integration with other discovery tools,” http://www.loc.gov/catdir/calhoun-report-final.pdf (accessed mar. 11, 2011).
5. dee ann allison, “information portals: the next generation catalog,” journal of web librarianship 4, no. 1 (2010): 375–89, http://digitalcommons.unl.edu/cgi/viewcontent.cgi?article=1240&context=libraryscience (accessed jan. 11, 2012); marshall breeding, “the state of the art in library discovery,” computers in libraries 30, no. 1 (2010): 31–34; c. p. diedrichs, “discovery and delivery: making it work for users . . . taking the sting out of serials!” (lecture, north american serials interest group, inc. 23rd annual conference, phoenix, arizona, june 5–8, 2008), doi: 10.1080/03615260802679127; ian hargraves, “controversies of information discovery,” knowledge, technology & policy 20, no. 2 (summer 2007): 83, http://www.springerlink.com/content/au20jr6226252272/fulltext.html (accessed jan.
11, 2012); jane hutton, “academic libraries as digital gateways: linking students to the burgeoning wealth of open online collections,” journal of library administration 48, no. 3 (2008): 495–507, doi: 10.1080/01930820802289615; oclc, “online catalogs: what users and librarians want: an oclc report,” http://www.oclc.org/reports/onlinecatalogs/default.htm (accessed mar. 11, 2011).
6. c. j. belliston, jared l. howland, and brian c. roberts, “undergraduate use of federated searching: a survey of preferences and perceptions of value-added functionality,” college & research libraries 68, no. 6 (november 2007): 472–86, http://crl.acrl.org/content/68/6/472.full.pdf+html (accessed jan. 11, 2012); judith z. emde, sara e. morris, and monica claassen-wilson, “testing an academic library website for usability with faculty and graduate students,” evidence based library & information practice 4, no. 4 (2009): 24–36, http://kuscholarworks.ku.edu/dspace/bitstream/1808/5887/1/emdee_morris_cw.pdf (accessed jan. 11, 2012); karla saari kitalong, athena hoeppner, and meg scharf, “making sense of an academic library web site: toward a more usable interface for university researchers,” journal of web librarianship 2, no. 2/3 (2008): 177–204, http://www.tandfonline.com/doi/abs/10.1080/19322900802205742 (accessed jan. 11, 2012); ed tallent, “metasearching in boston college libraries – a case study of user reactions,” new library world 105, no. 1 (2004): 69–75, doi: 10.1108/03074800410515282; rong tang, ingrid hsieh-yee, and shanyun zhang, “user perceptions of metalib combined search: an investigation of how users make sense of federated searching,” internet reference services quarterly 12, no. 1 (2007): 211–36, http://www.tandfonline.com/doi/abs/10.1300/j136v12n01_11 (accessed jan. 11, 2012).
7. jody condit fagan, “usability studies of faceted browsing: a literature review,” information technology & libraries 29, no.
2 (2010): 58–66, http://web2.ala.org/ala/mgrps/divs/lita/publications/ital/29/2/fagan.pdf (accessed jan. 11, 2012).
8. birong ho, keith kelley, and scott garrison, “implementing vufind as an alternative to voyager’s web voyage interface: one library’s experience,” library hi tech 27, no. 1 (2009): 82–92, doi: 10.1108/07378830910942946 (accessed jan. 11, 2012).
9. tamar sadeh, “user experience in the library: a case study,” new library world 109, no. 1 (2008): 7–24, doi: 10.1108/03074800810845976 (accessed jan. 11, 2012).
10. tod a. olson, “utility of a faceted catalog for scholarly research,” library hi tech 25, no. 4 (2007): 550–61, doi: 10.1108/07378830710840509 (accessed jan. 11, 2012).
11. allison, “information portals,” 375–89.
12. marshall breeding, “plotting a new course for metasearch,” computers in libraries 25, no. 2 (2005): 27.
13. ibid.
14. dennis brunning and george machovec, “interview about summon with jane burke, vice president of serials solutions,” charleston advisor 11, no. 4 (2010): 60–62; dennis brunning and george machovec, “an interview with sam brooks and michael gorrell on the ebscohost integrated search and ebsco discovery service,” charleston advisor 11, no. 3 (2010): 62–65, http://www.ebscohost.com/uploads/discovery/pdfs/topicfile-121.pdf (accessed jan. 11, 2012).
15.
ronda rowe, “web-scale discovery: a review of summon, ebsco discovery service, and worldcat local,” charleston advisor 12, no. 1 (2010): 5–10; k. stevenson et al., “next-generation library catalogues: reviews of encore, primo, summon and summa,” serials 22, no. 1 (2009): 68–78.
16. jason vaughan, “chapter 7: questions to consider,” library technology reports 47, no. 1 (2011): 54; paula l. webb and muriel d. nero, “opacs in the clouds,” computers in libraries 29, no. 9 (2009): 18.
17. jason vaughan, “investigations into library web scale discovery services,” articles (libraries), paper 44 (2011), http://digitalcommons.library.unlv.edu/lib_articles/44.
18. marshall breeding, “the state of the art in library discovery,” 31–34; sharon q. yang and kurt wagner, “evaluating and comparing discovery tools: how close are we towards next generation catalog?” library hi tech 28, no. 4 (2010): 690–709.
19. allison, “information portals,” 375–89.
20. breeding, “the state of the art in library discovery,” 31–34.
21. galina letnikova, “usability testing of academic library websites: a selective bibliography,” internet reference services quarterly 8, no. 4 (2003): 53–68.
22. jeffrey rubin and dana chisnell, handbook of usability testing: how to plan, design, and conduct effective tests, 2nd ed. (indianapolis, in: wiley, 2008); joseph s. dumas and janice redish, a practical guide to usability testing, rev. ed. (portland, or: intellect, 1999).
23. nicole campbell, ed., usability assessment of library-related web sites: methods and case studies (chicago: library & information technology association, 2001); elaina norlin and c. m.
winters, usability testing for library web sites: a hands-on guide (chicago: american library association, 2002).
24. jennifer l. ward, steve shadle, and pam mofield, “user experience, feedback, and testing,” library technology reports 44, no. 6 (2008): 17.
25. ibid.
26. michael boock, faye chadwell, and terry reese, “worldcat local task force report to lamp,” http://hdl.handle.net/1957/11167 (accessed mar. 11, 2011).
27. bob thomas and stefanie buck, “oclc’s worldcat local versus iii’s webpac: which interface is better at supporting common user tasks?” library hi tech 28, no. 4 (2010): 648–71.
28. oclc, “some findings from worldcat local usability tests prepared for ala annual,” http://www.oclc.org/worldcatlocal/about/213941usf_some_findings_about_worldcat_local.pdf (accessed mar. 11, 2011).
29. ibid., 2.
30. doug way, “the impact of web-scale discovery on the use of a library collection,” serials review 36, no. 4 (2010): 214–20.
31. north carolina state university libraries, “final summon user research report,” http://www.lib.ncsu.edu/userstudies/studies/2010_summon/ (accessed mar. 28, 2011).
32. alesia mcmanus, “the discovery sandbox: aleph and encore playing together,” http://www.nercomp.org/data/media/discovery%20sandbox%20mcmanus.pdf (accessed mar. 28, 2011); prweb, “deakin university in australia chooses ebsco discovery service,” http://www.prweb.com/releases/deakin/chooseseds/prweb8059318.htm (accessed mar. 28, 2011); university of manitoba, “summon usability: partnering with the vendor,” http://prezi.com/icxawthckyhp/summon-usability-partnering-with-the-vendor (accessed mar. 28, 2011).
33. williams and foster, “promise fulfilled?”
34. jakob nielsen, “why you only need to test with 5 users,” http://www.useit.com/alertbox/20000319.html (accessed aug. 20, 2011).
35. john brooke, “sus: a ‘quick and dirty’ usability scale,” in usability evaluation in industry, ed. p. w. jordan et al.
(london: taylor & francis, 1996), http://www.usabilitynet.org/trump/documents/suschapt.doc (accessed apr. 6, 2011).
36. williams and foster, “promise fulfilled?”
37. seikyung jung et al., “libraryfind: system design and usability testing of academic metasearch system,” journal of the american society for information science & technology 59, no. 3 (2008): 375–89; williams and foster, “promise fulfilled?”; laura wrubel and kari schmidt, “usability testing of a metasearch interface: a case study,” college & research libraries 68, no. 4 (2007): 292–311.
38. williams and foster, “promise fulfilled?”
39. letnikova, “usability testing of academic library websites,” 53–68; tom ipri, michael yunkin, and jeanne m. brown, “usability as a method for assessing discovery,” information technology & libraries 28, no. 4 (2009): 181–86; susan h. mvungi, karin de jager, and peter g. underwood, “an evaluation of the information architecture of the uct library web site,” south african journal of library & information science 74, no. 2 (2008): 171–82.
40. williams and foster, “promise fulfilled?”
41. ward et al., “user experience, feedback, and testing,” 17.
42. giannis tsakonas and christos papatheodorou, “analysing and evaluating usefulness and usability in electronic information services,” journal of information science 32, no. 5 (2006): 400–419.
43. williams and foster, “promise fulfilled?”
44.
bob thomas and stefanie buck, “oclc’s worldcat local versus iii’s webpac: which interface is better at supporting common user tasks?” library hi tech 28, no. 4 (2010): 648–71.
45. williams and foster, “promise fulfilled?”
46. tracy gabridge, millicent gaskell, and amy stout, “information seeking through students’ eyes: the mit photo diary study,” college & research libraries 69, no. 6 (2008): 510–22; yan zhang, “undergraduate students’ mental models of the web as an information retrieval system,” journal of the american society for information science & technology 59, no. 13 (2008): 2087–98; brenda reeb and susan gibbons, “students, librarians, and subject guides: improving a poor rate of return,” portal: libraries and the academy 4, no. 1 (2004): 123–30; alexandra dimitroff, “mental models theory and search outcome in a bibliographic retrieval system,” library & information science research 14, no. 2 (1992): 141–56.

appendix a

pre–test 1: please indicate your jmu status (1st year, 2nd year, 3rd year, 4th year, graduate student, faculty, other)
pre–test 2: please list your major(s) or area of teaching (open ended)
pre–test 3: how often do you use the library website? (less than once a month, 1–3 visits per month, 4–6 visits per month, more than 7 visits per month)
pre–test 4: what are some of the most common things you currently do on the library website? (open ended)
pre–test 5: how much of the library’s resources do you think the quick search will search? (less than a third, less than half, half, most, all)
pre–test 6: have you used leo? (show screenshot on printout) (yes, no, not sure)
pre–test 7: have you used ebsco? (show screenshot on printout) (yes, no, not sure)
pre–test 8 (student participants only): how often have you used library web resources for course assignments in your major?
(rarely/never, sometimes, often, very often)
pre–test 9 (student participants only): how often have you used library resources for course assignments outside of your major? (rarely/never, sometimes, often, very often)
pre–test 10 (student participants only): has a librarian spoken to a class you've attended about library research? (yes, no, not sure)
pre–test 11 (faculty participants only): how often do you give assignments that require the use of library resources? (rarely/never, sometimes, often, very often)
pre–test 12 (faculty participants only): how often have you had a librarian visit one of your classes to teach your students about library research? (rarely/never, sometimes, often, very often)
post–test 1: when would you use this search tool?
post–test 2: when would you not use this search tool?
post–test 3: what would you say are the major advantages of quick search?
post–test 4: what would you say are the major problems with quick search?
post–test 5: if you were unable to find an item using quick search/ebsco discovery service what would your next steps be?
post–test 6: do you think the name “quick search” is fitting for this search tool? if not, what would you call it?
post–test 7 (faculty participants only): if you knew students would use this tool to complete assignments would you alter how you structure assignments and how?

appendix b

practice task: use quick search to search a topic relating to your major / discipline or another topic of interest to you. if you were writing a paper on this topic how satisfied would you be with these results?
purpose: help users get comfortable with the usability testing software. also, since the first time someone uses a piece of software involves behaviors unique to that case, we wanted participants’ first use of eds to be with a practice task.
1. what was the last thing you searched for when doing a research assignment for class?
use quick search to re-search for this. tell us how this compared to your previous experience.
purpose: having participants re-search a topic with which they had some experience and interest would motivate them to engage with results and provide a comparison point for their answer. we hoped to learn about their satisfaction with relevance, quality, and quantity of results. (user behavior, user satisfaction)
2. using quick search find a video related to early childhood cognitive development. when you’ve found a suitable video recording, click answer and copy and paste the title.
purpose: this task aimed to determine whether participants could complete the task, as well as show us which features they used in their attempts. (usability, user behavior)
3. search on speech pathology and find a way to limit your search results to audiology. then, limit your search results to peer reviewed sources. how satisfied are you with the results?
purpose: since there are several ways to limit results in eds, we designed this task to show us which limiters participants tried to use, and which limiters resulted in success. we also hoped to learn about whether they thought the limiters provided satisfactory results. (usability, user behavior, user satisfaction)
4. you need more recent sources. please limit these search results to the last 5 years, then select the most recent source available. click finished when you are done.
purpose: since there are several ways to limit by date in eds, we designed this task to show us which limiters participants tried to use, and which limiters resulted in success. (usability, user behavior)
5. find a way to ask a jmu librarian for help using this search tool. after you’ve found the correct web page, click finished.
purpose: we wanted to determine whether the user could complete this task, and which pathway they chose to do it. (usability, user behavior)
6. locate the journal yachting and boating world.
what are the coverage dates? is this journal available in online full text? we wanted to determine whether the user could locate a journal by title. (usability) 7. you need to look up the sculpture genius of mirth. you have been told that the library database, camio, would be the best place to search for this. locate this database and find the sculpture. we wanted to know whether users who knew they needed to use a specific database could find that database from within the discovery tool. (usability, user behavior). 8. use quick search to find 2 books and 2 recent peer reviewed articles (from the last 5 years) on rheumatoid arthritis. when you have found suitable source click answer and copy and paste the titles. click back to webpage if you need to return to your search results. these two tasks were intended to show us how users completed a common, broad task with and without a discovery tool, whether they would be more successful with or without the tool, and what barriers existed with and without the tool (usability, user behavior) 9. without using quick search, find 2 books and 2 recent peer reviewed articles (from the last 5 years) on rheumatoid arthritis. when you have found suitable sources click answer and copy and paste the titles. click back to webpage if you need to return to your search results. 295 listings of uncataloged collections fred l. bellomy: head, and lies n. jaccarino: systems analyst, library systems staff, university of california, santa barbara, california. an operational computerized system used by the ucsb libraries produces listings of bibliographic data about items in collections where full cataloging treatment is not considered justified. the system produces listings of the brief bibliographic records sorted by any of the data elements in the record including up to twenty-five subjects terrrl8. of special interest are the authority listings of descriptions and the coordinate indexes to the full records. 
journal of library automation vol. 3/4 december, 1970
introduction
this short report was extracted from the more comprehensive document, listings of uncataloged collections: systems documentation, santa barbara: university of california, december 1969, library systems document ls 69-11. the library staff at the university of california at santa barbara is using computerized procedures to produce a variety of listings of bibliographic information about items in uncataloged collections. although many similar systems undoubtedly have been developed to do similar jobs, this one is noteworthy in two respects, first in being well-documented and second because its versatility has been tested on three totally different collections. the machine programs, written in pl/1, were first used to list the ucsb art exhibition catalogs collection, but they were designed to be versatile so that they could be applied easily to other similar collections as well. at present these programs are also being used at ucsb to list the documentation of marine pollution due to major oil spills (the oil spill information center). the programs have been successfully tested also on about one hundred items of the ucsb collection of early american trade catalogs. application to other collections (such as the phono record collection or video tape file) has been studied and is feasible. although it is usually difficult to use programs that were not specifically tailored for a particular user, these programs represent at least one instance where attention to versatility and the probable broad scope of possible applications has resulted in a system capable of producing listings for different collections at any location where there is access to an ibm system 360 computer and a staff capable of adapting about a half dozen job control language (jcl) statements. the machine written listings of catalogs provide a limited amount of bibliographic data about each item in the collection.
the advantage of such listings is the expedition with which a new, not-yet-cataloged, collection can be made accessible.
description
as a first step in obtaining a listing, library staff members examine each item in the collection to be listed and transcribe the necessary bibliographic data to an input work sheet (figure 1). information on the work sheet is keypunched into one or more punched cards. these records, once in the computer, can be sorted in various ways to provide a variety of listings. master listings can be produced at desired intervals (e.g. monthly). multiple copies of each list can be produced, and the sheets of computer printout are a convenient form of access to the material when individual copies of the list are separated and placed in hard-board binders for distribution to the library service desks. program "packages" (i.e. jcl decks) contain many comment cards, so that each package is self explanatory after very little instruction. to keep the system simple for the librarians who use it, separate "packages" have been prepared for each different listing (or combination of listings) decided on. listings of the full records (see figure 2) have been prepared now by 1) classification letter, 2) accession number, 3) year of "exhibit", 11) main and secondary subjects, 12) agency name, 13) agency city, and 17) author. obviously, others are possible. listings of subjects (figure 3) and agencies with the number of times each was used accompany full record listings by subject and agency. these are used as authority lists for future term assignments. another package, artindx, is used to produce coordinate indexes by subject, agency, author and others. an example of the subject index is shown in figure 4. such indexes are used with a master listing of the full bibliographic records in accession number order. this method reduces the amount of printout required to provide many different description approaches to the collection.
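the three kinds of output described above (full-record listings sorted by a data element, authority lists with usage counts, and coordinate indexes that point back to accession numbers) can be sketched briefly. this is a minimal illustration in python, not the original pl/1 programs; the record layout, the sample values, and the function names listing, authority_list, and coordinate_index are all assumptions made up for the example.

```python
from collections import Counter, defaultdict

# Hypothetical brief records; field names and values are illustrative only.
records = [
    {"accession": 933, "year": 1905, "agency": "british museum",
     "subjects": ["bronze age", "bronzes, european"]},
    {"accession": 670, "year": 1915, "agency": "maggs bros.",
     "subjects": ["graphic arts"]},
    {"accession": 669, "year": 1922, "agency": "maggs bros.",
     "subjects": ["graphic arts"]},
]


def listing(records, element):
    """Return the full records sorted by one data element."""
    return sorted(records, key=lambda r: r[element])


def authority_list(records):
    """Return (subject term, number of times assigned) pairs in term order."""
    counts = Counter(term for r in records for term in r["subjects"])
    return sorted(counts.items())


def coordinate_index(records):
    """Map each subject term to the accession numbers of items carrying it."""
    index = defaultdict(list)
    for r in records:
        for term in r["subjects"]:
            index[term].append(r["accession"])
    return {term: sorted(nums) for term, nums in sorted(index.items())}


if __name__ == "__main__":
    for r in listing(records, "year"):
        print(r["year"], r["accession"], r["agency"])
    for term, count in authority_list(records):
        print(count, term)
    for term, numbers in coordinate_index(records).items():
        print(term, numbers)
```

because the index holds only accession numbers, a reader looks up a term in the index and then finds the full record in the single master listing kept in accession number order, which is why the approach needs less printout than one full listing per access point.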
listings of uncataloged collections / bellomy and jaccarino

catalog collections input worksheet
data element: card column
1. classification letter: 2-3
2. accession number: 4-8
3. year of exhibit: 9-12
4. b&w illustration no.: 13-15
5. color illustration no.: 16-18
6. chronology (y=yes, n=no): 19
7. bibliography no. pages: 20-21
8. bib. ft. notes (y=yes, n=no): 22
9. pages no.: 23-25
10. spare: 26-30
11. subject(s) (separate with ";"): var
12. agency name: var
13. agency city: var
14. agency state: var
15. agency country: var
16. title: var
17. author: var
18. spare: var
note: data elements 1-10 are fixed field and are to be keyed into the card columns indicated. the card sequence number is always keyed into column 1. data elements 11-18 are variable field and each is to be terminated with a "". every record must contain exactly eight of these end of variable field marks (" ").
fig. 1. input worksheet for catalog collections.

[figure: computer printout headed "art exhibition catalog in date sequence, december 1, 1968, page 3", with columns for agency, no., subject, and notes, author, title, etc.]
fig. 2. sample listing of full record.

[figure: computer printout with two columns, count and subject, listing subject terms beginning with "painting" (by nationality, period, and collection) with the number of times each was used.]
fig. 3. subject listing.

[figure: computer printout headed "oil spill information center subject index, october, 1970, page 102"; each subject term (oil imports, oil in navigable waters act, oil lands, oil leakage, oil leaks, oil pollution) is followed by the accession numbers of the items indexed under it.]

igrams might never have been developed.
journal of library automation vol. 14/1 march 1981
publishing firm. with a feeling of deja vu i listened to an explanation of how difficult it is to develop a system for the novice; one proposed solution is to allow only the first four letters of a word to be entered (one of the search methods used at the library of congress, which does suggest some cross-fertilization). whatever the trends, the reality is that librarians and information scientists are playing decreasing roles in the growth of information display technology.
hardware systems analysts, advertisers, and communications specialists are the main professions that have an active role to play in the information age. perhaps the answer is an immediate and radical change in the training of library schools of today. our small role may reflect our penchant to be collectors, archivists, and guardians of the information repositories. have we become the keepers of the system? the demand today is for service, information, and entertainment. if we librarians cannot fulfill these needs our places are not assured. should the american library association (ala) be ensuring that libraries are a part of all ongoing tests of videotex, at least in some way, either as organizers, information providers, or in analysis? consider the force of the argument given at the ala 1980 new york annual conference that cable television should be a medium that librarians become involved with for the future. certainly involvement is an important role, but we, like the industrialists and marketers before us, must make smart decisions and choose the proper niche and the most effective way to use our limited resources if we are to serve any part of society in the future.
bibliography
1. electronic publishing review. oxford, england: learned information ltd. quarterly.
2. home video report. white plains, new york: knowledge industry publications. weekly.
3. ieee transactions on consumer electronics. new york: ieee broadcast, cable, and consumer electronics society. five times yearly.
4. international videotex/teletext news. washington, d.c.: arlen communications ltd. monthly.
5. videodisc/teletext news. westport, conn.: microform review. quarterly.
6. videoprint. norwalk, conn.: videoprint. two times monthly.
7. viewdata/videotex report. new york: link resources corp. monthly.

data processing library: a very special library
sherry cook, mercedes dumlao, and maria szabo: bechtel data processing library, san francisco, california.
the 1980s are here and with them comes the ever broadening application of the computer. this presents a new challenge to libraries. what do we do with all these computer codes? how do we index the material? and most importantly, how do we make it accessible to our patrons or computer users? bechtel's data processing library has met these demands. the genesis for the collection was bechtel's conversion from a honeywell 6000 computer to a univac 1100 in 1974. all the programs in use at that time were converted to run on the univac system. it seemed a good time to put all of the computer programs together from all of the various bechtel divisions into a controlled collection. the librarians were charged with the responsibility of enforcing standards and control of bechtel's computer programs. the major benefits derived from placing all computer programs into a controlled library were:
1. company-wide usage of the programs.
2. minimize investment in program development through common usage.
3. computer file and documentation storage by the library to safeguard the investment.
4. central location for audits of program code and documentation.
5. centralized reporting on bechtel programs.
developing the collection involved basic cataloging techniques which were greatly modified to encompass all the information that computer programs generate, including actual code, documentation, and listings. historically, this information must be kept indefinitely on an archival basis. the machine-readable codes themselves are grouped together and maintained from the library's budget. finally, a reference desk is staffed to answer questions from the entire user community. documentation for programs is strictly controlled. code changes are arranged chronologically to provide only the most current release of a program to all users. historical information is kept and is crucial to satisfy the demands of auditors (such as the nuclear regulatory commission).
additionally, the names of people administratively connected with the program are recorded and their responsibilities defined (valuable in situations of liability for work completed yesteryear). the backbone of the operation is a standards manual that spells out and discusses the file requirements, documentation specifications, and control forms. this standard is made readily available throughout bechtel. in addition, there are in-house education classes about the same document. indeed, the central data processing library is the repository of computer information at bechtel. the centralization and control of computer programs eliminates the chaos that can occur if too many individuals maintain and use the same computer program.

article
accessibility of tables in pdf documents: issues, challenges, and future directions
nosheen fayyaz, shah khusro, and shakir ullah
information technology and libraries | september 2021 https://doi.org/10.6017/ital.v40i3.12325
nosheen fayyaz (nosheenfayaz@uop.edu.pk) is doctoral candidate, university of peshawar. shah khusro (khusro@uop.edu.pk) is professor, university of peshawar. shakir ullah (ullah@ulm.edu) is instructor, university of louisiana monroe. © 2021.
abstract
people access and share information over the web and in other digital environments, including digital libraries, in the form of documents such as books, articles, technical reports, etc. these documents are in a variety of formats, of which the portable document format (pdf) is most widely used because of its emphasis on preserving the layout of the original material. the retrieval of relevant material from these derivative documents is challenging for information retrieval (ir) because the rich semantic structure of these documents is lost. the retrieval of important units such as images, figures, algorithms, mathematical formulas, and tables becomes a challenge.
among these elements, tables are particularly important because they can add value to the resource description, discovery, and accessibility of documents not only on the web but also in libraries if they are made retrievable and presentable to readers. sighted users comprehend tables for sensemaking using visual cues, but blind and visually impaired users must rely on assistive technologies, including text-to-speech and screen readers, to comprehend tables. however, these technologies do not pay sufficient attention to tables in order to effectively present tables to visually impaired individuals. therefore, ways must be found to make tables in pdf documents not only retrievable but also comprehensible. before developing such solutions, it is necessary to review the available assistive technologies, tools, and frameworks for their capabilities, strengths, and limitations from the comprehension perspective of blind and visually impaired people, along with suitable environments like digital libraries. we found no such review article that critically and analytically presents and evaluates these technologies. to fill this gap in the literature, this review paper reports on the current state of the accessibility of pdf documents, digital libraries, assistive technologies, tools, and frameworks that make pdf tables comprehensible and accessible to blind and visually impaired people. the study findings have implications for libraries, information sciences, and information retrieval.
introduction
the web has a huge collection of documents, including pages, books, blogs, articles, reports, etc., available in different formats. these formats include html (hypertext markup language), epub (electronic publication), azw (amazon word), and the ubiquitous pdf (portable document format). pdf is layout oriented and unstructured, having elements such as text, images, tables, and metadata. all these elements carry specific information and have their relative importance.
tables can be part of a structured or unstructured document. a structured table, like in html, is relatively easy to extract and interpret, as it has a starting and ending tag pair for the table itself, its headings, each row, and discrete values. however, unstructured documents, which can include books, journals, audio, video, images, and documents, do not follow a specified format or structure for the organization of information.1 a table has levels of abstraction; the higher levels of abstraction have fewer details whereas a lower level gives more information. a human reader has to understand and comprehend the underlying semantics of the table content for sensemaking. the content of a table has a strong bond with its context, as it has concrete information regarding the surrounding text; therefore, tables are hard to comprehend when taken out of context. poorly conceived information is more dangerous, as it can lead to misconceptions and poor decisions. any system or component that interacts with humans must be capable of offering comprehensible explanations.2 a reader understands a table in at least three cognitive processes: comprehension, searching, and interpretation & comparison.3 in contrast, blind and visually impaired persons need assistance in comprehending the tabulated information, for example, understanding table structure and its content, searching for particular information in a table, and comparing and interpreting tabular data.
therefore, they need technical solutions for reading documents.4 according to the world health organization (who), the number of blind and visually impaired people has increased significantly and has risen to 2.2 billion, so technical solutions or assistive technology are a must for their reading.5 assistive technologies are supposed to handle the three main kinds of print disabilities: vision problems, motor skill problems, and cognitive problems.6 for vision problems we have tools like text-to-speech and screen readers to help blind and visually impaired people to read text documents. however, these tools work on the upper level of abstraction and give limited information to users because they focus on text and ignore components related to presentation such as tables, graphs, images, etc. this limitation is not only found on the web but also affects other digital environments including digital libraries, where more reliable document collections are present but their retrieval and presentation to blind and visually impaired people is challenging.7 for example, a study identified the limitations of digital libraries in meeting the specific needs of blind and visually impaired people and suggested including help features for a more user-friendly experience.8 michigan state university has taken an initiative to make digital content accessible by adopting wcag 2.0 as official technical guidelines. they presented a five-year accessibility plan for making new, existing, and purchased content accessible along with resource allocations, training for their staff, and future requirements.9 these efforts may motivate other libraries to adopt such measures and make them convenient places for people with disabilities.
common accessibility problems include the lack of alternate descriptions, using visual cues to describe interactions in the user interface, and fuzzy visuals and audio.10 furthermore, the sight-centered nature of the digital library creates problems for blind users, such as the absence of meaningful descriptions for nontext content and instructions, along with information about the digital library’s features due to missing textual or verbal instructions.11 the traditional usage of a digital library makes a canned or routine utilization of its collections, which may be broadened by making computation-ready collections.12 the accessibility of these documents will help in knowledge dissemination to blind and visually impaired people. researchers have presented frameworks and algorithms for exploring and interpreting pdf elements like images, charts, tables, and graphs. these interpretations are the basis for both humans and machines to gain meaningful insights out of tabular data. this paper highlights the significance of the rich semantics of pdf tables and the challenges in their interpretation and their presentation to blind and visually impaired people. it is proposed to present the tables’ explicit and implicit information in a progressive manner to reduce the cognitive overload on blind and visually impaired individuals. this might be achieved by providing some basic information (such as a table caption and the number of rows and columns in the table), which may be followed by navigation and querying within the table. this stepwise approach of leveraging a table’s semantics may help in its better comprehension. table semantics will also be helpful in libraries, information science, and information retrieval, as it has the potential to improve library cataloging and classification.
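the progressive presentation proposed above (basic facts first, then navigation, then querying) can be illustrated with a small sketch. this is only one possible reading of the idea in python, assuming the table has already been extracted into a caption, a header row, and data rows; the class table, its method names, and the sample data are hypothetical and are not taken from any tool reviewed here.

```python
from dataclasses import dataclass


@dataclass
class Table:
    caption: str
    header: list[str]
    rows: list[list[str]]

    def overview(self) -> str:
        """Step 1: the basic facts a listener hears first."""
        return (f"{self.caption}. {len(self.rows)} rows, "
                f"{len(self.header)} columns: {', '.join(self.header)}.")

    def read_cell(self, row: int, column: str) -> str:
        """Step 2: navigation; announce one cell together with its header."""
        value = self.rows[row][self.header.index(column)]
        return f"row {row + 1}, {column}: {value}"

    def query(self, column: str, value: str) -> list[str]:
        """Step 3: querying; read out each row whose column equals value."""
        col = self.header.index(column)
        return [
            "; ".join(f"{h}: {v}" for h, v in zip(self.header, row))
            for row in self.rows if row[col] == value
        ]


# Made-up sample table used only to exercise the three steps.
sample = Table(
    caption="tools reviewed",
    header=["tool", "input", "method"],
    rows=[["tool a", "pdf", "heuristic"],
          ["tool b", "image", "deep learning"]],
)

if __name__ == "__main__":
    print(sample.overview())
    print(sample.read_cell(1, "method"))
    print(sample.query("input", "pdf"))
```

each method returns plain strings so that the same output could be handed to a text-to-speech engine or a braille display; the listener asks for more detail only when the overview suggests the table is relevant.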
the next section of this paper discusses and reviews the efforts and limitations in the existing literature and presents a general model of table processing and interpretation. the following section identifies the prominent issues and challenges regarding general table structure, format, interpretation, and evaluation; the presentation of tables to blind and visually impaired people; and, specifically, accessibility in digital libraries. the last section contains some future research directions that will unleash some new dynamics of this domain.
the current state of table processing
a table presents summarized information in a particular arrangement, where the structure of the table reveals some implicit semantics. in 1996, xinxin wang defined the logical structure and style rules of tables and presented wang’s abstract model.13 the model separated the logical structure from layout specification and is considered a generic and complete model in the literature.14 unstructured tables can be regular or irregular. a regular table has intersecting vertical and horizontal borders that form a grid of cell bounding boxes, while in an irregular table, there is no relationship between the number of rows and columns.15 tables can be n-dimensional, having spanning cells or multiline cells. tables can be long and span multiple pages and they can be floating (can be placed to the left or right of the page, with text wrapped around them). sometimes tables have no explicit boundaries and even worse, cell separators may not be visible.
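to make the notion of spanning cells concrete, the following sketch stores each cell with its position and span and expands the cells into a full grid, rejecting tables whose cells overlap or leave gaps. it is a simplified illustration in python, not wang’s abstract model or any published algorithm; the names cell and to_grid, and the sample cells, are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    row: int
    col: int
    text: str
    rowspan: int = 1
    colspan: int = 1


def to_grid(cells, n_rows, n_cols):
    """Expand spanning cells into a full row-by-column grid of text.

    Raises ValueError if cells overlap, fall outside the grid, or leave gaps.
    """
    grid = [[None] * n_cols for _ in range(n_rows)]
    for cell in cells:
        for r in range(cell.row, cell.row + cell.rowspan):
            for c in range(cell.col, cell.col + cell.colspan):
                if not (0 <= r < n_rows and 0 <= c < n_cols):
                    raise ValueError(f"cell {cell.text!r} leaves the grid")
                if grid[r][c] is not None:
                    raise ValueError(f"cell {cell.text!r} overlaps another")
                grid[r][c] = cell.text
    if any(value is None for row in grid for value in row):
        raise ValueError("some grid positions are not covered by any cell")
    return grid


# Made-up table: a two-row header with one row-spanning and one
# column-spanning cell, followed by a single data row.
sample_cells = [
    Cell(0, 0, "year", rowspan=2),
    Cell(0, 1, "illustrations", colspan=2),
    Cell(1, 1, "b&w"),
    Cell(1, 2, "color"),
    Cell(2, 0, "1905"),
    Cell(2, 1, "153"),
    Cell(2, 2, "1"),
]

if __name__ == "__main__":
    for row in to_grid(sample_cells, n_rows=3, n_cols=3):
        print(row)
```

repeating the text of a spanning cell in every position it covers is a design choice: it lets a reader or a program ask for the headers above any data cell without having to know where the span started.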
a table can have a variety of content that includes numerical data, text, symbols, images, and equations.16 table location and table structure identification in documents are evaluated in the international conference on document analysis and recognition (icdar) table competition.17 the methods used for the identification of table structure are rule-based methods,18 data-driven methods,19 and the graphical neural network.20 multiple frameworks and approaches are used for the extraction and processing of tables from structured documents like html,21 and unstructured documents like images and pdf documents.22 keeping in view all the conducted studies and research, we present a general model of table processing and interpretation in figure 1, showing the prominent inputs (web, pdf, and images), processes, and some notable outputs. the model has a table extraction process that is followed by processing which yields multiple outputs including organized data, analyzed structure, and analyzed content. processing can be followed by establishing relationships within the table, within table and context, or with other related tables. moreover, tables presented in open formats such as csv, xml, json, and rdf extend their potential by exploration, creating or extending ontologies and knowledge bases, and publishing tables on a linked open data (lod) cloud to establish links with other open data sources. below is a detailed discussion of table extraction and processing and the relationships of tables with content and context.
figure 1. a general model for table interpretation.
table extraction and processing
the recognition, extraction, and processing of tables, from a variety of documents, have used multiple approaches.23 the hidden table semantics will not only help in understanding tables but can also contribute to digital library cataloging.
these approaches are categorized in the following sections.
using heuristics
different heuristic approaches are presented for extraction and processing of unstructured tables. for example, pdftrex uses spatial features and follows a bottom up approach for the recognition and extraction of tables. it represents a table as a two-dimensional grid on a cartesian plane and extracts the table as a set of cells along with their coordinates.24 in another approach, natural language processing (nlp) features are used for deeper understanding of text. these tools use parts of speech and dependency paths for the extraction of tables and for finding relations among tables by using the nlp toolkit.25 milosevic et al. presented five steps for table processing, i.e., table detection, functional analysis, structural analysis, synthetic analysis, and semantic analysis,26 while roya rastan endorsed the first three steps in his phd dissertation and proposed a framework for the processing of tables. this framework consists of four layers: input management, table processing, storage, and management.27 to recognize and extract tables from documents, ad hoc heuristics are used with the existing methods and techniques, which includes three steps: (1) “preprocessing” to define and prepare text chunks from a source table by using the features of text like font, space, and bounding box; (2) “text block recovering” to identify the set of text chunks that could be treated as the content of a single cell; and (3) “cell recovering” to observe the arrangement of cells for identifying the rows and columns.28 the authors exploited appearance features of text printing instructions and position of drawing cursor for table detection and structure recognition in their web-based solution.
they claimed to attain an f-score of 83.18% for table extraction and 93.64% for structure recognition.29 furthermore, researchers at stanford university presented an interactive document reader in which structural analysis was combined with rule-based matching and natural language processing to associate a table’s values with the related text and build sentence-table pairs in the document. they also tried to relate tables in two data sets but achieved a success rate of only 48.8%.30 although the results were not satisfactory, this effort opens new avenues for practitioners and researchers.

using segmentation

the segmentation approach is used to identify tables, along with their columns and rows, in unstructured and untagged pdf documents.31 visual separators and geometric content layout information are used for the extraction of tables from multiple pages of documents, and the technique has been tested on e-books and scientific documents.32 ali et al. adopted a segmentation approach to deal with incomplete, impure, and complex tables by extracting the table schema, data, and reading paths of data and representing them in a layout-independent format.33 the extraction of tables from images has also used segmentation through a top-down pipeline approach. text and tables were identified in medical laboratory reports, where the content of tables needs to be correctly captured and interpreted.34 because medical tables include text, numbers, characters, and symbols, their correct interpretation is critical, and a minor error can lead to very dangerous outcomes. a system named texus used segmentation to prepare text chunks for finding relations among cells.
the system provided end-to-end processing of tables and claimed to detect a variety of tables in a layout-independent format from a data set of complex financial tables. it interpreted the tables and produced an xml file describing the structure of the tables, with the access path of each data cell recorded as an attribute.35 similarly, page segmentation is carried out by using deep learning methods to identify tables, text, and figures.36

using machine learning and deep learning approaches

machine learning and deep learning techniques are also used for the automatic detection and extraction of table data. random forest classifiers are used to detect a table header.37 multitask fully convolutional neural networks (fcn) are used for page segmentation to identify the table, text block, and figure elements.38 the k-nearest neighbor method and layout heuristics are used in a system named tao for the automatic detection, extraction, and organization of data in tables in order to generate an enriched representation of the data.39 similarly, deep learning techniques like rcnn are used to capture tables from the university of nevada, las vegas (unlv) data set of document images. based on the assumption that tabular data is mostly numeric, the researchers used color-coding to distinguish numeric and textual data and claim to have achieved improved performance.40 table detection and recognition in born-digital documents and images is carried out by using transfer learning with faster r-cnn in order to overcome the shortage of labeled data sets, and fcn semantic segmentation is used for table structure recognition. the method is evaluated using the icdar 2013 data set.41 another approach, named decnt, worked on images (in any format) for the extraction of tables by using a combination of deformable cnn with faster r-cnn/fpn.
the method is evaluated using the icdar 2013, icdar 2017 pod, unlv, and marmot data sets.42 in other research, the authors pointed out the weaknesses of the existing methods and techniques for understanding tables and presented a “graph neural network” approach to analyze the structure of pdf tables and handle spanning cells.43 along with that, multiple deep learning techniques are used for integrating and querying tables, using word embedding, rnn, knn, and lstm for the classification of financial tables.44 in the case of web tables, annotation of table columns is performed by using a convolutional neural network (cnn) along with transfer learning in order to overcome the shortage of target data sets. the column semantics are embedded into a vector space and are used for predicting the type of columns without using the metadata. the method is tested on two web table data sets, t2dv2 and limaye.45

using ontologies

ontologies have also played a vital role in the detection, recognition, and annotation of tables from the web, images, and pdf documents. ontologies consider the content and structure of a table for their conceptual representation. a system named tableminer was used to interpret tables semantically by identifying the semantic concepts for the columns and disambiguating the cell content by using rdfa and microdata for improved annotations of the table.46 tableminer considered relational tables and mapped the table headers to ontology properties for linking the cell values with entities.47 the relationships between a table and its context are also extracted and annotated, and the provenance of relationships is preserved to resolve ambiguity.48 a framework named tabel, developed by varish mulwad, has a module for converting a table to a graphical model to infer the semantics of the table header, cells, and their relation to each other. these semantics are used to convert the graphical representation to rdf triples by using knowledge bases along with the author’s own defined ontology or any other ontology.49 an ontology is also used for finding the relevant tables in the domain of technical documents only.50 for easier interpretation by ontologies and greater usability of government data, it is suggested that unstructured tabular data be published in an open format like csv (comma-separated values).51 the studies mentioned above have mostly used relational tables, technical document tables, government data, and medical data, with the main objective of making tables open and interrelated. along with that, another study argued that, besides the metadata of a resource, user-generated content may also be considered and published as linked open data for improved consumption, which would also contribute to better cataloging of digital libraries.52 unfortunately, this approach still has problems like disambiguation and correlation of complex tables, besides other issues involved in publishing and consuming data as open and linked data.53

relationship of tables with content and context

the content of a table is presented in a particular arrangement in order to give some specific information. therefore, the table content should be interpreted for the hidden semantics among the cell content, the context, and other related tables in a particular domain.
in this regard, natural language techniques are used for the identification of relationships in the table and the related text using the nlp toolkit. the researchers claimed improvement in table schema identification and quality of relations.54 similarly, ontologies are used to identify the semantic relations among the text, table contents, and table structure.55 another research project used rule-based matching and structural analysis to find relationships between table cells and sentence text by developing sentence-table pairs in the document. this project tried to develop a relationship between the tables of two data sets but achieved only a 48.8% success rate.56 a system named texus tried to find the relations among the values of cells using cell entries, categories, and access paths. it used segmentation techniques for the preparation of text chunks and produced an xml file about the structure of the table showing the access paths of each data cell as an attribute.57 narrowing the table-understanding domain to clinical literature, with a focus on just the numerical and textual data of tables from xml documents, milosevic et al. extended their previous work and tried to identify the relationship between the table and the surrounding text. they added pragmatic analysis, cell selection, and syntactic analysis; defined five categories of cells, depending upon the data in the cells; and identified seven semantic categories for the specification of the table extraction process. the pubmedcentral data set is used to test the developed system with regard to task, variables, and complexity. the authors claimed to achieve an f-score between 82% and 92%.58 a “graph neural network” model was developed to build an undirected graph for the prediction of relations among adjacent cells.
the model was tested on benchmark data sets, i.e., icdar 2013 and tablebank-2019, and was claimed to outperform existing methods.59 another system, named tablepedia, unified the tables of experimental results with regard to method, data set, metric, score, and source into tuples. the system extracted the related tables and identified conflicting results by using rule-based and learning-based methods with the help of sql operations.60 an sql-like query was proposed for financial tables in pdf format by using deep learning approaches.61 all the mentioned techniques for establishing relationships rely on rule-based, learning-based, segmentation, neural network, heuristic, and ontology approaches. among these, ontologies can establish interdomain relationships and explorations. however, they still have issues that will be discussed in the conclusion section.

existing accessibility-driven solutions for pdf documents

apart from the systems and frameworks for table understanding and processing, a mechanism or a solution is needed to present tables in a meaningful way to blind and visually impaired people. the accessibility of digital documents depends on the captured structural information and its availability for processing by other software and applications; tagged pdf, for example, can help in summarizing, navigating, and providing structural information about the content.62 nazimi made an effort to present a framework for understanding complex documents and their components, including images, charts, and tables, in a nonvisual representation for blind and visually impaired people.63 the existing solutions for reading pdf documents to blind and visually impaired people focus on text and give little attention to elements such as tables, images, graphs, and charts. for tables in particular, these solutions either read the table caption and ignore the content, or read the table as if it were text, which renders it meaningless.
these assistive technologies are divided into four main categories.
1. text-to-speech tools
2. screen readers
3. voice assistants
4. natural language generators (nlg)

text-to-speech tools include products like wordtalk, virtual speaker, audiobook reader voice, voicepaper, and dream reader. these tools can read text from txt, pdf, or doc files aloud and have an interface for user interaction, where the user can copy and paste the text or give the path of the file to read. some of the tools are free with limited features, while others are proprietary. they need human interaction, which might make them difficult for a visually impaired or blind person to use.

screen readers include jaws, nvda, cobra, voiceover, and talkback. they speak out every user activity that takes place, like opening or closing a window, clicking on a button, or reading text from a txt or pdf file. these tools are helpful for visually impaired people, as they do not need the user to open specific software and then specify the path of the files to read. the most popular tool, jaws, is proprietary and used on windows. nvda is free software for windows, while voiceover is a free tool provided with apple’s operating systems (including macos and ios). nvda reads the text, taking note of punctuation; it reads the table row by row like text and then reads the caption of the table at the end. it can also read the alternate text of the table if it is included in acrobat pro. these tools need the user to be aware of what he or she is doing.

similarly, there are voice assistants like apple siri, microsoft cortana, google assistant, and amazon alexa. all these tools take instructions from the user and then try to provide solutions. these tools may ask users several queries to clarify what they want, and they provide limited functionalities such as reading out gps coordinates, playing music, etc.
the natural language generator (nlg) is used to convert raw text or data into narrations. existing popular systems or tools include arria nlg, quill, ibm watson, ax nlg cloud, amazon polly, and wordsmith. these systems are used to perform data analysis and convert the extracted analytics to narrations that can be easily understood by the user. these tools are not meant for narrating tables from unstructured documents. however, a framework has been developed using a neural encoder-decoder to generate text from tables. it is claimed that the solution outperformed the existing solutions and achieved higher bleu scores and f-scores on the weathergov, wikibio, and wikitable data sets and on a chinese data set, wikibiocn.64 this technique focuses on formal tables and ignores complex tables as well as unstructured pdf documents. a new emerging category in this paradigm is the document-centered assistant, which tries to help users review documents by letting them ask questions. the field is currently being studied with respect to the type of questions that a user may ask and the candidate machine learning models that can be used for answering them. the questions would be different from factoid questions and chitchat, because here the focus would be on relevant information from that specific document.65 this category seems to have considerable scope for understanding, reviewing, and inferring knowledge from documents. apart from the solutions mentioned above, there are some java and python tools and libraries that are used for table extraction from pdf documents; they are shown in table 1. some of the tools are commercial and claim to extract tables, table rows, and even table cells from documents and images, like pdftables, docparser, and pdftron. similarly, there are also open-source java and python libraries for table and metadata extraction from images and documents.
the libraries that extract tables from images are camelot, excalibur, and pdfplumber, whereas the libraries that extract tables from non-image-based documents are tabula, pypdf2, pdf table extractor, pdfplumber, and pdfminer. among these five, pdf table extractor is browser-based and pdfminer works with structured tables and digs out the semantic relations. for working with unstructured tables in pdf documents and developing a table extractor component for an integrated environment, tabula, pdfplumber, and pypdf2 might be better choices. the research and solutions mentioned above regarding table detection and understanding are carried out to make them meaningful to machines and humans who have no visual impairment or dyslexia. therefore, the future documentation may consider the inclusion of translations and lay summaries (concise descriptions in simple words) of objects or elements within the document, as essential components, to make them accessible to blind and visually impaired individuals as well.66 in this regard, the world wide web consortium (w3c) developed the web accessibility guidelines for developing web documents to make the nontext elements accessible. these guidelines include elements such as captions for tables and figures, description of figures, and summaries of the tables.67 similarly, html has tags to include summaries of a table, including , ,

. microsoft word has an option “text alternative” to add a description of a table or figure for visually impaired people, who will use screen readers for reading the document. adobe acrobat reader also has an accessibility pane to tag tables and add alternative text and descriptions of tables, which is used by the nvda screen reader to read aloud. moreover, commonlook office, whose motto is “build accessibility into documents early,” has add-ins for microsoft word or powerpoint to add enough accessibility content to the documents to make the resulting pdf accessible. however, already-developed unstructured documents, without any accessibility features, still need some measures to make the documents understandable to visually impaired or blind users. keeping in mind the statistics of visually impaired people and the unstructured data of the future (the global data sphere will grow from 33zb to 175zb, and 80% of this worldwide data will be unstructured), visually impaired individuals cannot be ignored for their access to knowledge.68 therefore, we would need mechanisms for making these unstructured documents understandable to as many people as possible by incorporating accessibility measures in the document readers. the following section highlights some of the key issues in this domain.

table 1. solutions and libraries for table extraction and processing.

| s no. | tool | open source | image based | comments |
|---|---|---|---|---|
| 1 | tabula | y | n | extracts data tables from pdf and saves as csv or excel spreadsheet. it works on native pdf files and cannot extract scanned tables. it supports multiple platforms but does not support batch processing. |
| 2 | pdftables | n | n | extracts page, table, table row, and even table cell. it is a fully automated api. it supports multiple platforms and multiple programming languages. |
| 3 | docparser | n | y | extracts information from images and forms. it is a cloud-based application and supports batch processing. it parses the documents and offers more features but needs human intervention. it shows poor accuracy in handwritten application forms. |
| 4 | pdftron | n | n | supports multiple platforms and multiple programming languages. |
| 5 | camelot | y | y | a python library that extracts tables from images. it has built-in ocr. |
| 6 | excalibur | y | y | a web-based solution which is powered by camelot. |
| 7 | pypdf2 | y | n | a python library that can do batch processing with multiple files. |
| 8 | pdfplumber | y | y | a python library built on pdfminer. |
| 9 | pdf table extractor | y | n | a web-based tool built on tabula. it supports scraping of multiple page tables and comparison of cell values. |
| 10 | pdfminer | y | n | a python library that extracts information like location, fonts, and lines of the text. it focuses on analyzing text. it has a pdf parser. it figures out the semantic relationships among structured tables. |

tool websites: tabula (https://tabula.technology/), pdftables (https://pdftables.com/), docparser (https://docparser.com/), pdftron (https://www.pdftron.com/), and pdf table extractor (https://resourcegovernance.org/analysis-tools/tools/pdf-table-extractor).

issues and challenges in the existing systems

tables can be utilized in multiple scenarios including information extraction, table search, ontology engineering, conversion to dbms, and document engineering.69 the situation becomes difficult when a blind or visually impaired person needs to understand the tables. the issues and challenges in dealing with pdf tables are categorized in the following sections.

table structure

tables in pdf documents need more focus on table structure detection because they do not follow a defined formal structure.70 several knowledge gaps are identified in the literature regarding table structure, such as the identification of functional areas of tables, for which silva argued for the use of multiple heuristics and machine learning algorithms in parallel or in sequence.71 the variety of structural layouts creates problems in their identification, which can be handled by defining more rules at the lexical and syntactic layers of table processing. this could also be fruitful for better semantic annotations.72 in addition, the variety of cell content or inconsistent cell content, along with implicit header cells, creates problems in understanding the tables, especially by machines.73 the vector representation of web tables may be applied to pdf tables for semantic annotations and identification of column types.74 along with that approach, graphical representation and a graph neural network (gnn) can also be used for better structure identification in multiple domains.75 new data sets need to be introduced for structure recognition in various domains, including business and finance, as they use a large number of tables in their documents.76 from the discussion above, table structure inconsistencies, cell content inconsistencies, and the functional and logical processing of tables need more research effort to eliminate the stated problems. along with that, the inclusion of more data sets will also help in handling the diversity in the field.
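the implicit header cells and column types discussed in this section can be illustrated with a small rule-based sketch. it is not taken from any of the cited systems: the function names, the numeric pattern, and the 0.6 threshold are hypothetical choices made only for illustration, and the sketch assumes that a table has already been extracted as a list of rows of cell strings.

```python
import re

# hypothetical pattern: integers, decimals, thousands separators, percentages
NUMERIC = re.compile(r"^[+-]?\d[\d,]*(\.\d+)?%?$")


def is_numeric(cell):
    """Return True if a cell string looks like a number."""
    return bool(cell) and bool(NUMERIC.match(cell.strip()))


def column_types(rows, threshold=0.6):
    """Label each column 'numeric' or 'text' by the share of numeric cells.

    Empty or missing cells are ignored; the threshold is an assumption.
    """
    n_cols = max(len(row) for row in rows)
    types = []
    for c in range(n_cols):
        cells = [row[c].strip() for row in rows
                 if c < len(row) and row[c] and row[c].strip()]
        numeric = sum(is_numeric(x) for x in cells)
        is_num_col = bool(cells) and numeric / len(cells) >= threshold
        types.append("numeric" if is_num_col else "text")
    return types


def header_row_count(rows):
    """Count leading rows without any numeric cell (a simple heuristic).

    A table with no numeric cells at all is reported as all header rows,
    which is one reason a real system would need additional rules.
    """
    count = 0
    for row in rows:
        if any(is_numeric(c) for c in row if c):
            break
        count += 1
    return count


sample = [
    ["tool", "open source", "pages"],
    ["tabula", "y", "12"],
    ["camelot", "y", "7"],
]
print(column_types(sample), header_row_count(sample))
```

a rule this simple fails on tables whose headers contain years or whose body is entirely textual, which is consistent with the argument above that several heuristics and learning-based methods need to be combined.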
table formats

the existing format of tables in pdf lacks the metadata needed for further processing; therefore, the conversion of pdf tables to other formats, especially open formats, will open new avenues. some researchers have worked on converting tables to csv format, which retains the basic structure but lacks some cell formatting. others worked on the transformation of web tables to relational tables for easy manipulation.77 in contrast, xml can handle complex data and is more easily read by humans. therefore, a methodology is presented to work on tables in xml format, but it considers tables having text and numerical data only.78 json can also be used as an alternative to xml; it is smaller in size than xml and can handle complex and hierarchical data. the json format has less support than xml but is preferred for web applications due to its interoperability and lightweight features.

table interpretation

the variable representation patterns of table values, dense content, and natural language processing create problems in the correct interpretation of tables.79 anaphora resolution techniques and document-level discourse parsers are suggested to handle complex references among multiple domains.80 moreover, handling the locality features of a table and the annotation of its property feature can lead to better interpretation of tables.81 the use of a knowledge base is suggested for understanding and annotating the relationships among tables and text to get more information about the entities extracted from tables and text.82 similarly, the extraction of data and its precision in medical and financial tables is an issue that needs the attention of researchers, as both fields have crucial data in their tables.83 for easy interpretation of tables, machine learning classifiers, based on table headings and captions, can be used to classify them into their respective domains.84 the relationships of tables within a specific domain or among multiple domains can be established by developing ontologies.85 this will enable the tables to be published on an lod cloud that will establish more relationships and infer insights from multiple domains.

table evaluation

most of the researchers working on pdf tables have tried to evaluate their work with popular data sets such as icdar 2013, icdar 2015, icdar 2017 pod, pubmed, unlv, and marmot. because pdf documents exist in multiple domains, new data sets should be introduced for structure recognition, especially in business and finance, as these domains use a large number of tables in their documents.86 an evaluation methodology was proposed for table detection, structure recognition, and functional and semantic analysis.87 unfortunately, there are no standard metrics, parameters, or formal methodology for table processing evaluation.88 therefore, standard evaluation metrics should be defined for pdf tables in order to standardize the evaluation of algorithms and frameworks.

table presentation to blind and visually impaired users

the available tools and techniques for reading documents aloud to blind and visually impaired people either read the table caption only and ignore the content or treat the tables as text and read the rows line by line. this does not help these users to understand the semantics of the table and its content.
besides the content of the table, its layout shows grouping and connections among the content, which current solutions do not present to blind and visually impaired people.89 therefore, tools and screen readers need to present tables in a nonvisual format or give a summarized view of tables by following the w3c guidelines, instead of reading the table like text.90 the summarized view of tables can become part of bibliographic metadata and can contribute to cataloging from the perspective of linked and open data.91 a study examined the accessibility of pdf articles published by four journal publishers and presented the findings in graphs to show the trend from 2009 to 2013, using parameters including meaningful title, alternate text for images, and logical reading order.92 the author further applied the same methodology to analyze the articles published in the next four years (2014 to 2018) and came to the conclusion that the accessibility of pdf documents had improved. however, the journal publishers, who should be more aware of disability and accessibility, did not consistently follow the pdf/ua accessibility requirements and wcag 2.0 when producing pdf versions of their articles.93 therefore, visually impaired individuals should be provided with a mechanism for understanding the digital content and underlying semantics at multiple levels of abstraction, such as general information about the document and its elements (including tables), its structure and content, navigation in the table, and querying the table to get more details and lessen cognitive overload.

accessibility of digital library collection

the accessibility of large-scale digital library collections can enhance content for sighted as well as visually impaired users.
the traditional utilization of digital library collections needs to be broadened by making computation-ready collections meant to be used and consumed in multiple domains.94 an effort was made by researchers to digitize and archive a digital repository of images and convert them to pdf/a documents but, unfortunately, the researchers came up with limited semantics as they did not consider the elements within the documents themselves.95 the accessibility of these converted documents may be compromised by these limited semantics. the rich semantics of tables can be used in the bibliographic classification of a digital library’s collection to increase the search width of the digital library.96 blind and visually impaired users can be assisted in using digital libraries, as they may need help at physical and cognitive levels. at the physical level, blind users may face difficulty in accessing information, identifying path and status, and efficiently evaluating information. at the cognitive level, they may face problems in understanding multiple structures, programs, information, and features of the digital library, and the need to stick to some specific formats. therefore, the inclusion of help features will make the digital library friendly to blind and visually impaired people by incorporating meaningful descriptions for nontextual elements.97 the sight-centered nature of the digital library creates problems for blind and visually impaired users due to missing textual or verbal instructions.
some researchers identified the inclusion of labels and meaningful descriptions for hyperlinks, instructions, structure, multimedia content and nontext content to make digital libraries friendly to blind and visually impaired people.98 at the same time, others argue for improvement in usability by introducing help features in terms of usefulness, ease of use, and user satisfaction.99 the accessibility of digital libraries in general and its content in specific may be improved by accommodating help features in the interface and meaningful descriptions for the contents’ nontext elements including tables. conclusions and future research directions this study discusses the accessibility of tables included in pdf documents in general as well as in the specific environment of digital libraries. existing frameworks, algorithms, and solutions for the processing and interpretation of pdf tables, specifically their presentation to blind and visually impaired people, are thoroughly discussed. a general workflow of table processing is also presented in figure 1. the available solutions for reading out pdf documents to blind and visually impaired people are analyzed for their output, specifically for their attitude towards handling tables. furthermore, a list of resources for table interpretation and presentation are discussed along with their different features. the issues and challenges in table structure, format, interpretation, evaluation, its presentation to blind, and accessibility of digital library collection are discussed. the researchers working in the domain of accessibility, digital library, and pdf tables can extend and modify the current solutions and algorithms by following the future research directions given below. • the structure of a table has implicit semantic information which a sighted reader can infer but a blind reader needs assistance to understand. 
the structure of a pdf table is extracted using multiple approaches like heuristics, ontologies, machine learning, and segmentation, whereas vectors are used for web tables.100 therefore, combining multiple approaches and using vectors for pdf tables may produce better results.
• the content of a table is usually numeric or very short text and needs proper interpretation. therefore, a knowledge base can be used to get more information about the entities extracted from tables and text in order to understand and annotate the relationships among tables and text.101 these knowledge bases can be predetermined or may be selected automatically according to the table content or domain.
• table interpretation can become easy if tables are classified according to their domains by using machine learning classifiers. the classification can be based on table headings and captions, as well as the title and author of the document.102
• ontologies are used to relate the tables within a specific domain or among multiple domains, and publishing them on an lod cloud will establish new relationships.103 this will help in inferring new insights from complex, long, and numerical tables.
• unstructured data and content can be made available for multiple uses and interpretations if they are converted to open formats like csv, json, and xml.104 among these, csv comes with repeated content and xml needs special parsers, whereas json is lightweight and easy to write and read.105 json has support from nosql databases like mongodb and apache couchdb and from web application apis like those of twitter, youtube, and facebook. therefore, json might be a better option for the conversion of pdf tables, as it supports multiple interpretations and navigation within tables.
• the processes used for evaluation of tables have no defined metrics.106 therefore, the table evaluation processes should be defined with their respective metrics in order to standardize the research in this domain.
• the precision of the extracted table content is crucial, especially in medical, financial, and experimental tables that have numeric data. therefore, the preprocessing of tables or their conversion to other formats needs more attention to avoid any truncation or rounding of the data.
• the presentation of tables to blind or visually impaired people can be in nonvisual or summarized form.107 the summaries may be presented nonvisually, including the structural layout as well as a brief introduction of the table, to minimize the cognitive overload on these individuals.
• to evaluate the accessibility of digital library interfaces, 16 heuristics were proposed to bring digital libraries within reach of users; however, more heuristics are needed to make generalized interfaces for all individuals.108
• the nontext elements of digital library collections should have meaningful descriptions so that blind and visually impaired individuals can understand them better. the user-generated content about these nontext elements could be used for cataloging.109
• the rich semantics of tables can be exploited for cataloging and classification that will be helpful in exploratory searching.
• as the michigan state university libraries have taken the initiative of assessing and improving the accessibility of digital library content by adopting the wcag guidelines, other libraries can also adopt the model for providing accessible content to their users, including blind and visually impaired individuals.
• the development of new data sets for tables in multiple domains can help researchers interpret tables and establish cross-domain relationships.
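the json, precision, and summary points in the list above can be illustrated with a minimal python sketch. it assumes that a table has already been extracted into a header list and rows of cell strings (the extraction step itself is out of scope), and the function names table_to_json and summarize_table are hypothetical, not taken from any tool surveyed here. cell text is kept verbatim as a string, so a value such as 5.10 is neither truncated nor rounded; the decimal module is used only to flag which cells look numeric. the second function builds a short structural summary of the kind that could be read out before the cell contents.

```python
import json
from decimal import Decimal, InvalidOperation


def is_numeric(text):
    """Return True if the cell text parses as a decimal number."""
    try:
        Decimal(text)
        return True
    except InvalidOperation:
        return False


def table_to_json(header, rows):
    """Serialize an extracted table as JSON, one record per row.

    Assumption: `header` is a list of column names and `rows` is a list
    of lists of cell strings. Cell text is stored unchanged so that
    numeric values keep every digit that appeared in the source.
    """
    records = []
    for row in rows:
        record = {}
        for name, cell in zip(header, row):
            text = cell.strip()
            record[name] = {"text": text, "numeric": is_numeric(text)}
        records.append(record)
    return json.dumps(records, ensure_ascii=False, indent=2)


def summarize_table(caption, header, rows):
    """Build a short structural summary suitable for reading aloud."""
    return (
        f"{caption}. table with {len(rows)} rows and "
        f"{len(header)} columns. column headings: {', '.join(header)}."
    )


if __name__ == "__main__":
    header = ["test", "value"]
    rows = [["glucose", "5.10"], ["note", "n/a"]]
    print(summarize_table("lab results", header, rows))
    print(table_to_json(header, rows))
```

storing the original cell text next to a numeric flag is one possible design choice: a consumer that needs arithmetic can parse the text with decimal on demand, while a screen reader can announce the value exactly as printed.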
this review paper is an attempt to highlight the knowledge gap in processing pdf tables and making them accessible to blind and visually impaired individuals. an efficient and open-source solution for making pdf documents accessible to blind and visually impaired people needs to exploit heuristics, ontologies, machine learning, and deep learning by using open-source libraries and tools for understanding and interpreting the tabular content in order to reduce information overload.
endnotes
1 roya rastan, “automatic tabular data extraction and understanding” (phd diss., university of new south wales, 2017).
2 mark t. maybury, “communicative acts for explanation generation,” international journal of man-machine studies 37, no. 2 (1992): 135–72.
3 patricia wright, “the comprehension of tabulated information: some similarities between reading prose and reading tables,” nspi journal 19, no. 8 (1980): 25–29, https://doi.org/10.1002/pfi.4180190810.
4 jean-claude guédon et al., future of scholarly publishing and scholarly communication: report of the expert group to the european commission (brussels: european commission, directorate-general for research and innovation, 2019), https://doi.org/10.2777/836532.
5 world health organization, world report on vision, october 8, 2019, https://www.who.int/publications-detail/world-report-on-vision/.
6 mireia ribera turró, “are pdf documents accessible?” information technology and libraries 27, no. 3 (2008): 25–43, https://doi.org/10.6017/ital.v27i3.3246.
7 kyunghye yoon, laura hulscher, and rachel dols, “accessibility and diversity in library and information science: inclusive information architecture for library websites,” library quarterly 86, no. 2 (2016): 213–29, https://doi.org/10.1086/685399.
8 iris xie et al., “using digital libraries non-visually: understanding the help-seeking situations of blind users,” information research 20, no. 2 (2015): 673.
9 heidi m. schroeder, “implementing accessibility initiatives at the michigan state university libraries,” reference services review 46, no. 3 (2018): 399–413, https://doi.org/10.1108/rsr-04-2018-0043.
10 joanne oud, “accessibility of vendor-created database tutorials for people with disabilities,” information technology and libraries 35, no. 4 (2016): 7–18, https://doi.org/10.6017/ital.v35i4.9469.
11 rakesh babu and iris xie, “haze in the digital library: design issues hampering accessibility for blind users,” electronic library 35, no. 5 (2017): 1052–65, https://doi.org/10.1108/el-10-2016-0209.
12 rachel wittmann et al., “from digital library to open datasets,” information technology and libraries 38, no. 4 (2019): 49–61, https://doi.org/10.6017/ital.v38i4.11101.
13 xinxin wang, “tabular abstraction, editing, and formatting” (phd diss., university of waterloo, 1996).
14 rastan, “automatic tabular data extraction,” 25.
15 azadeh nazemi, “non-visual representation of complex documents for use in digital talking books” (phd diss., curtin university, 2015).
16 rastan, “automatic tabular data extraction,” 14.
17 max göbel et al., “icdar 2013 table competition,” in 2013 12th international conference on document analysis and recognition (2013): 1449–53, https://doi.org/10.1109/icdar.2013.292.
18 burcu yildiz, katharina kaiser, and silvia miksch, “pdf2table: a method to extract table information from pdf files,” in proceedings of the 2nd indian international conference on artificial intelligence (iicai, 2005): 1773–85; tamir hassan and robert baumgartner, “table recognition and understanding from pdf files,” in ninth international conference on document analysis and recognition (icdar 2007) (2007): 1143–47, https://doi.org/10.1109/icdar.2007.4377094; alexey shigarov et al., “tabbypdf: web-based system for pdf table extraction,” in international conference on information and software technologies (springer international publishing, 2018): 257–69, https://doi.org/10.1007/978-3-319-99972-2_20.
19 minghao li et al., “tablebank: table benchmark for image-based table detection and recognition,” preprint, arxiv:1903.01949; sebastian schreiber et al., “deepdesrt: deep learning for detection and structure recognition of tables in document images,” in 2017 14th iapr international conference on document analysis and recognition (icdar) (2017): 1162–67, https://doi.org/10.1109/icdar.2017.192.
20 zewen chi et al., “complicated table structure recognition,” preprint, arxiv:1908.04729.
21 michael cafarella et al., “ten years of webtables,” in proceedings of the vldb endowment 11, no. 12 (august 2018): 2140–49, https://doi.org/10.14778/3229863.3240492.
22 shah khusro, asima latif, and irfan ullah, “on methods and tools of table detection, extraction and annotation in pdf documents,” journal of information science 41, no. 1 (2015): 41–57, https://doi.org/10.1177/0165551514551903.
23 hassan, “table recognition and understanding”; richard zanibbi, dorothea blostein, and james r cordy, “a survey of table recognition,” document analysis and recognition 7, no. 1 (2004): 1–16, https://doi.org/10.1007/s10032-004-0120-9; andreiwid sheffer corrêa and pär-ola zander, “unleashing tabular content to open data: a survey on pdf table extraction methods and tools,” in proceedings of the 18th annual international conference on digital government research (june 2017): 54–63, https://doi.org/10.1145/3085228.3085278; christopher clark and santosh divvala, “looking beyond text: extracting figures, tables and captions from computer science papers” (paper, aaai workshops at the twenty-ninth aaai conference on artificial intelligence, austin, tx, january 25–26, 2015).
24 ermelinda oro and massimo ruffolo, “pdf–trex: an approach for recognizing and extracting tables from pdf documents,” in 2009 10th international conference on document analysis and recognition (icdar) (2009): 906–10, https://doi.org/10.1109/icdar.2009.12.
25 vidhya govindaraju, ce zhang, and christopher ré, “understanding tables in context using standard nlp toolkits,” in proceedings of the 51st annual meeting of the association for computational linguistics (sofia, bulgaria: association for computational linguistics, august 2013): 658–64.
26 nikola milosevic et al., “disentangling the structure of tables in scientific literature,” in natural language processing and information systems, nldb 2016, lecture notes in computer science 9612 (springer, cham), https://doi.org/10.1007/978-3-319-41754-7_14.
27 rastan, “automatic tabular data extraction,” 48.
28 alexey shigarov, andrey mikhailov, and andrey altaev, “configurable table structure recognition in untagged pdf documents,” in proceedings of the 2016 acm symposium on document engineering (2016): 119–22, https://doi.org/10.1145/2960811.2967152.
29 shigarov et al., “tabbypdf,” 262, 263, 265.
30 dae hyun kim et al., “facilitating document reading by linking text and tables,” in proceedings of the 31st annual acm symposium on user interface software and technology (october 2018): 423–34, https://doi.org/10.1145/3242587.3242617.
31 hassan, “table recognition and understanding,” 1145.
32 jing fang et al., “a table detection method for multipage pdf documents via visual separators and tabular structures,” in 2011 international conference on document analysis and recognition (2011): 779–83, https://doi.org/10.1109/icdar.2011.304.
33 bahadar ali and shah khusro, “a divide-and-merge approach for deep segmentation of document tables,” in proceedings of the 10th international conference on informatics and systems (may 2016): 43–49, https://doi.org/10.1145/2908446.2908473.
34 wenyuan xue et al., “table analysis and information extraction for medical laboratory reports,” in 2018 ieee 16th intl conf on dependable, autonomic and secure computing, 16th intl conf on pervasive intelligence and computing, 4th intl conf on big data intelligence and computing and cyber science and technology congress (dasc/picom/datacom/cyberscitech) (2018): 193–99, https://doi.org/10.1109/dasc/picom/datacom/cyberscitec.2018.00043.
35 roya rastan, hye-young paik, and john shepherd, “texus: a unified framework for extracting and understanding tables in pdf documents,” information processing & management 56, no. 3 (2019): 895–918, https://doi.org/10.1016/j.ipm.2019.01.008.
36 dafang he et al., “multi-scale multi-task fcn for semantic page segmentation and table detection,” in 2017 14th iapr international conference on document analysis and recognition (icdar) (2017): 254–61, https://doi.org/10.1109/icdar.2017.50.
37 jing fang et al., “table header detection and classification,” in proceedings of the twenty-sixth aaai conference on artificial intelligence (july 2012): 599–605.
38 he et al., “multi-scale multi-task,” 255.
39 martha o. perez-arriaga, trilce estrada, and soraya abad-mota, “tao: system for table detection and extraction from pdf documents,” florida artificial intelligence research society conference, north america (2016).
40 saman arif and faisal shafait, “table detection in document images using foreground and background features,” in 2018 digital image computing: techniques and applications (dicta) (2018): 1–8, https://doi.org/10.1109/dicta.2018.8615795.
41 schreiber et al., “deepdesrt,” 1163, 1164.
42 shoaib ahmed siddiqui et al., “decnt: deep deformable cnn for table detection,” ieee access 6 (2018): 74151–61, https://doi.org/10.1109/access.2018.2880211.
43 chi et al., “complicated table structure recognition.”
44 rahul anand, hye-young paik, and cheng wang, “integrating and querying similar tables from pdf documents using deep learning,” 2019, preprint, arxiv:1901.04672.
45 jiaoyan chen et al., “colnet: embedding the semantics of web tables for column type prediction,” in proceedings of the aaai conference on artificial intelligence 33, no. 1: 29–36, https://doi.org/10.1609/aaai.v33i01.330129.
46 ziqi zhang, “towards efficient and effective semantic table interpretation,” in international semantic web conference (2014): 487–502, https://doi.org/10.1007/978-3-319-11964-9_31.
47 ivan ermilov, sören auer, and claus stadler, “user-driven semantic mapping of tabular data,” in proceedings of the 9th international conference on semantic systems (september 2013): 105–12, https://doi.org/10.1145/2506182.2506196.
48 martha o perez-arriaga, trilce estrada, and soraya abad-mota, “table interpretation and extraction of semantic relationships to synthesize digital documents,” in proceedings of the 6th international conference on data science, technology and application—data (2017): 223–32, https://doi.org/10.5220/0006436902230232.
49 varish mulwad, “tabel—a domain-independent and extensible framework for inferring the semantics of tables,” (phd diss., university of maryland, 2015).
50 syed tahseen raza rizvi et al., “ontology-based information extraction from technical documents,” in proceedings of the 10th international conference on agents and artificial intelligence (icaart) (2018): 493–500, https://doi.org/10.5220/0006596604930500.
51 corrêa and zander, “unleashing tabular content to open data,” 55.
52 irfan ullah et al., “an overview of the current state of linked and open data in cataloging,” information technology and libraries 37, no. 4 (2018): 47–80, https://doi.org/10.6017/ital.v37i4.10432.
53 nosheen fayyaz, irfan ullah, and shah khusro, “on the current state of linked open data: issues, challenges, and future directions,” international journal on semantic web and information systems (ijswis) 14, no. 4 (2018): 110–28, https://doi.org/10.4018/ijswis.2018100106.
54 govindaraju, zhang, and ré, “understanding tables in context using standard nlp toolkits,” 660, 661.
55 perez-arriaga, estrada, and abad-mota, “table interpretation and extraction,” 227.
56 kim et al., “facilitating document reading,” 425, 426.
57 rastan, paik, and shepherd, “texus,” 906.
58 nikola milosevic et al., “a framework for information extraction from tables in biomedical literature,” international journal on document analysis and recognition (ijdar) 22, no. 1 (2019): 55–78, https://doi.org/10.1007/s10032-019-00317-0.
59 chi et al., “complicated table structure recognition.”
60 wenhao yu et al., “tablepedia: automating pdf table reading in an experimental evidence exploration and analytic system,” in the world wide web conference (may 2019): 3615–19, https://doi.org/10.1145/3308558.3314118.
61 anand, paik, and wang, “integrating and querying similar tables.”
62 turró, “are pdf documents accessible?” 2, 4.
63 nazemi, “non-visual representation of complex documents,” 110, 111, 112, 118.
64 juan cao, “generating natural language descriptions from tables,” ieee access 8 (2020): 46206–16, https://doi.org/10.1109/access.2020.2979115.
65 maartje ter hoeve et al., “conversations with documents: an exploration of document-centered assistance,” in proceedings of the 2020 conference on human information interaction and retrieval (march 2020): 43–52, https://doi.org/10.1145/3343413.3377971.
66 guédon et al., “future of scholarly publishing,” 42.
67 w3c, “wcag 2.0.”
68 world health organization, “world report on vision”; david reinsel, john gantz, and john rydning, “data age 2025: the digitization of the world, from edge to core,” idc white paper, #us44413318 (framingham, ma: idc, november 2018), https://www.seagate.com/files/www-content/our-story/trends/files/idc-seagate-dataage-whitepaper.pdf/.
69 rastan, “automatic tabular data extraction,” 18, 19.
70 arif and shafait, “table detection in document images,” 1.
71 ana costa e silva, “parts that add up to a whole: a framework for the analysis of tables,” (phd diss., edinburgh university, uk, 2010).
72 milosevic et al., “a framework for information extraction from tables,” 60.
73 rastan, “automatic tabular data extraction,” 14.
74 chen et al., “colnet,” 31.
75 mulwad, “tabel,” 23; chi et al., “complicated table structure recognition.”
76 siddiqui et al., “decnt,” 74160.
77 david w embley, sharad seth, and george nagy, “transforming web tables to a relational database,” 2014 22nd international conference on pattern recognition (2014): 2781–86, https://doi.org/10.1109/icpr.2014.479.
78 milosevic et al., “a framework for information extraction from tables,” 56.
79 milosevic et al., “a framework for information extraction from tables,” 55, 56.
80 kim et al., “facilitating document reading,” 432.
81 chen et al., “colnet,” 36.
82 asima latif et al., “a hybrid technique for annotating book tables,” int. arab j. inf. technol 15, no. 4 (2018): 777–83.
83 rastan, paik, and shepherd, “texus,” 909.
84 milosevic et al., “a framework for information extraction from tables,” 61, 62, 65, 66.
85 rizvi et al., “ontology-based information extraction,” 496.
86 siddiqui et al., “decnt,” 74160.
87 max göbel et al., “a methodology for evaluating algorithms for table understanding in pdf documents,” in proceedings of the 2012 acm symposium on document engineering (september 2012): 45–48, https://doi.org/10.1145/2361354.2361365.
88 rastan, paik, and shepherd, “texus,” 917.
89 david pinto et al., “table extraction using conditional random fields,” in proceedings of the 26th annual international acm sigir conference on research and development in information retrieval (july 2003): 235–42, https://doi.org/10.1145/860435.860479.
90 nazemi, “non-visual representation of complex documents,” 118–44; w3c, “wcag 2.0.”
91 ullah et al., “current state of linked and open data in cataloging,” 47, 48.
92 julius t. nganji, “the portable document format (pdf) accessibility practice of four journal publishers,” library and information science research 37, no. 3 (2015): 254–62, https://doi.org/10.1016/j.lisr.2015.02.002.
93 julius t. nganji, “an assessment of the accessibility of pdf versions of selected journal articles published in a wcag 2.0 era (2014–2018),” learned publishing 31, no. 4 (2018): 391–401, https://doi.org/10.1002/leap.1197.
94 wittmann et al., “from digital library to open datasets,” 49, 50.
95 yan han and xueheng wan, “digitization of text documents using pdf/a,” information technology and libraries 37, no. 1 (2018): 52–64, https://doi.org/10.6017/ital.v37i1.9878.
96 asim ullah, shah khusro, and irfan ullah, “bibliographic classification in the digital age: current trends & future directions,” information technology and libraries 36, no. 3 (2017): 48–77, https://doi.org/10.6017/ital.v36i3.8930.
97 xie et al., “using digital libraries non-visually,” paper 673.
98 babu and xie, “haze in the digital library,” 1057–59.
99 iris xie et al., “enhancing usability of digital libraries: designing help features to support blind and visually impaired users,” information processing and management 57, no. 3 (2020): 102110, https://doi.org/10.1016/j.ipm.2019.102110.
100 chen et al., “colnet,” 31, 32.
101 kim et al., “facilitating document reading,” 432.
102 milosevic et al., “a framework for information extraction from tables,” 61.
103 rizvi et al., “ontology-based information extraction,” 496.
104 embley, seth, and nagy, “transforming web tables to a relational database,” 2783; milosevic et al., “a framework for information extraction from tables,” 60.
105 nicholas j tierney and karthik ram, “a realistic guide to making data available alongside code to improve reproducibility,” preprint, arxiv:2002.11626.
106 rastan, paik, and shepherd, “texus,” 917.
107 nazemi, “non-visual representation of complex documents,” 118–44; w3c, “wcag 2.0.”
108 mexhid ferati and wondwossen m. beyene, “developing heuristics for evaluating the accessibility of digital library interfaces,” in universal access in human–computer interaction, design and development approaches and methods, uahci 2017, lecture notes in computer science 10277 (springer, cham), https://doi.org/10.1007/978-3-319-58706-6_14.
109 ullah et al., “current state of linked and open data in cataloging,” 64.
articles
applying gamification to the library orientation: a study of interactive user experience and engagement preferences
karen nourse reed and a. miller
information technology and libraries | september 2020 https://doi.org/10.6017/ital.v39i3.12209
karen nourse reed (karen.reed@mtsu.edu) is associate professor, middle tennessee state university. a. miller (a.miller@mtsu.edu) is associate professor, middle tennessee state university. © 2020.
abstract
by providing an overview of library services as well as the building layout, the library orientation can help newcomers make optimal use of the library. the benefits of this outreach can be curtailed, however, by the significant staffing required to offer in-person tours. one academic library overcame this issue by turning to user experience research and gamification to provide an individualized online library orientation for four specific user groups: undergraduate students, graduate students, faculty, and community members. the library surveyed 167 users to investigate preferences regarding orientation format, as well as likelihood of future library use as a result of the gamified orientation format. results demonstrated a preference for the gamified experience among undergraduate students as compared to other surveyed groups.
introduction
background
newcomers to the academic campus can be a bit overwhelmed by their unfamiliar environment: there are faces to learn, services and processes to navigate, and an unexplored landscape of academic buildings to traverse. whether one is an incoming student or recently hired employee of the university, all need to become quickly oriented to their surroundings to ensure productivity. in the midst of this transition, the academic library may or may not be on the list of immediate inquiries; however, the library is an important place to start. newcomers would be wise to familiarize themselves with the building and its services so that they can make optimal use of its offerings. two studies found that students who used the library received better grades and had higher retention rates.1 another study regarding university employees revealed that untenured faculty made less use of the library than tenured faculty, a problem attributed to lack of familiarity with the library.2 researchers have also found that faculty will often express interest in different library services without realizing that these services are in fact available.3 it is safe to say that libraries cannot always rely on newcomers to discover the physical and electronic services on their own; they need to be shown these items in order to mitigate the risk of unawareness.
in consideration of these issues, the walker library at middle tennessee state university (mtsu) recognized that more could be done to welcome its new arrivals to campus. the public university enrolls approximately 21,000 students, the majority of whom are undergraduates. however, with a carnegie classification of doctoral/professional and over one hundred graduate degree programs, there was a strong need for specialized research among the university’s graduate students and faculty. other groups needed to use the library too: non-faculty employees on campus as well as community users who frequently used walker library for its specialized and general collections. the authors realized that when new members of these different groups arrived on campus, few opportunities were available for acclimation to the library’s services or building layout. limited orientation experiences were conducted within library instruction classes, but these sessions primarily taught research skills and targeted freshman general-education classes as well as select upper-division and graduate classes.
in short, it appeared that students, employees, and visitors to the university would largely have to discover the library’s services on their own through a search on the library website or an exploration of the physical library. it was very likely that, in doing so, the newcomers might miss out on valuable services and information. as mtsu librarians, the authors felt strongly that library orientations were important to everyone at the university so that they might make optimal use of the library’s offerings. the authors based this opinion on their knowledge of relevant scholarly literature as well as their own anecdotal experiences with students and faculty.4 the authors defined the library orientation differently from library instruction: in their view, an orientation should acquaint users with the services and physical spaces of the library, as compared to instruction that would teach users how to use the library’s electronic resources such as databases. the desired new approach would structure orientations in response to the different needs of the library’s users. for example, the authors found that undergraduates typically had distinct library interests compared to faculty. it was recognized that library orientations were time-consuming for everyone: library patrons at mtsu often did not want to take the time for a physical tour, nor did the library have the staffing to accommodate large-scale requests. the authors turned to the gamification trend, and specifically interactive storytelling, as a solution. interactive storytelling has previous applications in librarianship as a means of creating an immersive and self-guided user experience.5 however, no previous research appears to have been conducted to understand the different online, gamified orientation needs of various library groups. 
to overcome this gap, the authors developed an online, interactive, game-like experience via storytelling software to orient four different groups of users to the library’s services. these groups were undergraduate students, graduate students, faculty members (which included both faculty and staff at the university), and community members (i.e., visitors to the university or alumni); see figure 1 for an illustration of each group’s game avatar. these groups were invited to participate in the gamified experience called libgo (short for library game orientation). after playing libgo, participants gave feedback through an online survey. this paper will give a brief explanation of the creation of the game, as well as describe the results of research conducted to understand the impact of the gamified experience across the four user groups.
figure 1. libgo players were allowed to self-select their user group upon entering the game. each of the four user groups was assigned an avatar and followed a logic path specified for that group.
literature review
traditional orientation
searches for literature on library orientation yield very broad and yet limited details about users of the traditional library orientation method.
it is important to note that the terms “library tour” and “library orientation” can be somewhat vague, because this terminology is not interchangeable, yet is frequently treated as such in the literature.6 these terms are often included among library instruction materials which predominately influence undergraduate students.7 kylie bailin, benjamin jahre, and sarah morris define orientation as “any attempt to reduce library anxiety by introducing students to what a college/university library is, what it contains, and where to find information while also showing how helpful librarians can be.”8 their book is a culmination of case studies of academic library orientation in various forms worldwide where the common theme across most chapters is the need to assess, revise, and change library orientation models as needed, especially in response to feedback, staff demands, and the evolving trend of libraries and technology.9 furthermore, the majority of these studies are undergraduate-focused, and often freshman-focused, while only a few studies are geared towards graduate students. other traditional orientation problems discussed in the literature include students lacking intrinsic motivation to attend library orientation, library staff time required to execute the orientation, and lack of attendance.10 additionally, among librarians there seems to be consensus that the traditional library tours are the least effective means of orientation, yet they are the most highly used and with attention predominately focused on the undergraduate population alone.11
in 1997, pixey anne mosely described the traditional guided library tour as ineffective, and documented the trend of libraries discontinuing it in favor of more active learning options.12 her study surveyed 44 students who took a redesigned library tour, all of whom were undergraduates (with freshmen as the target population). although mosely’s study only addressed one group of library users, it does attempt to answer a question on library perception whereby 93 percent of surveyed students indicated feeling more comfortable in using the library after the more active learning approach.13 a comparison study by marcus and beck looked at traditional vs treasure hunt orientations, and ultimately discovered that perception of the traditional method is limited by the selective user population and lack of effective measurements. they cited the need for continued study of alternative approaches to academic library orientation.14
a study by kenneth burhanna, tammy eschedor voelker, and julie gedeon looked at the traditional library tour from the physical and virtual perspective. confronted with a lack of access to the physical library, these researchers at kent state university decided to add an online option for the required traditional freshman library tour.15 their study compared the efficacy of learning and affective outcomes between face-to-face library tours and those of online library tours. of the 3,610 students who took the required library tour assignment, 3,567 chose the online tour method and 63 opted or were required to take the in-person, librarian-led tour. surveys were later sent to a random list of 250 students who did not take the in-person tour and the 63 students who did take the in-person tour.
of the 46 usable responses all but one were undergraduates and 39 (85 percent) of them were freshmen.16 this is a small sample size with a ratio of slightly greater than 2:1 for online versus in-person tour participation. although results showed that an instructor’s recommendation on format selection was the strongest influencing factor, convenience was also significant for those who selected the online option (81.5 percent). in contrast, only 18.5 percent of the students who took the face-to-face tour rated it as convenient. the authors found that regardless of tour type, students were more comfortable using the library (85 percent) and more likely to use library resources (80 percent) after having taken a library tour. interestingly, students who took the online tour seemed slightly more likely to visit the physical library than those who took the in-person tour. ultimately the analysis of both tours showed this method of library orientation encourages library resource use, and the “online tour seems to perform as well, if not slightly better than the in-person tour.”17

gamification use in libraries

an alternative format to the traditional method is gamification. gamification has become a familiar trend within academic libraries in recent years, and most often refers to the use of a technology-based game delivery within an instructional setting. some users find gamified library instruction to be more enjoyable than traditional methods. for these people, gamification can potentially increase student engagement as well as retention of information.18 the goal of gamification is to create a simplified reality with a defined user experience.
kyle felker and eric phetteplace emphasized the importance of user interaction over “specific mechanics or technologies” in thinking about the gamification design process.19 proponents of gamification of library instructional content indicate that it connects to the broader mission of library discovery and exploration as exemplified through collaboration and the stimulation of learning.20 additional benefits of gamification are its teaching, outreach and engagement functions.21

many researchers have documented specific applications of online gaming as a means of imparting library instruction. mary j. broussard and jessica urick oberlin described the work of librarians at lycoming college in developing an online game as one approach to teaching about plagiarism.22 melissa mallon offered summaries of nine games produced for higher education, several of which were specifically created for use by academic libraries.23 many of these online library games reviewed used flash, or required players to download the game before playing. by contrast, j. long detailed an initiative at miami university to integrate gamification into the library instruction, a project which utilized twine.24 twine is an in-browser method and therefore avoids the problem of requiring users to download additional software prior to playing the game.

other libraries have used online gamification specifically as a tool for library orientations. although researchers have demonstrated that the library orientation is an important practice in establishing positive first impressions of the library and counteracting library anxiety among new users, the differences between in-person versus online delivery formats are unclear.25 several successful instances have been documented in which the orientation was moved to an online game format.
nancy o’hanlon, karen diaz, and fred roecker described a collaboration at ohio state university libraries between librarians and the office of first year experience; for this project, they created a game to orient all new students to the library prior to arrival on campus.26 the game was called “head hunt,” and was cited among those games listed in the article by mallon.27 anna-lise smith and lesli baker reported the “get a clue” game at utah valley university which oriented new students over two semesters.28 another orientation game developed at california state university-fresno was noteworthy for its placement in the university’s learning management system (lms).29

in reviewing the literature regarding online library gamification efforts, there appear to be several best practices. several studies cite initial student assessment to understand student knowledge and/or perceptions of the content, followed by an iterative design process with a team of librarians and computer programmers.30 felker and phetteplace reinforced the need for this iterative process of prototyping, testing, deployment, and assessment as one key to success; however, they also stated that the most prevalent reason for failure is that the games are not fun for users.31 librarians are information experts, and are not necessarily trained in fun game design. some libraries have solved this problem by partnering with or hiring professional designers; however, for many under-resourced libraries, this is not an option.32 taking advantage of open-source tools, as well as the documented trial-and-error practices of others, can be helpful to newcomers who wish to break into new library engagement methods utilizing gamification.

as literature has shown, a traditional library tour may have a place in the list of library services, but for whom and at what cost are questions with limited answers in studies done to date.
gamification has offered an alternative perspective but with narrow accounts of its success in the online storytelling format and for users outside of the heavily focused freshman group. across all literature of library orientation studies, there is little reference to other library user populations such as faculty, staff, community users, distance students, or students not formally part of a class that requires library orientation.

development of the library game orientation (libgo)

libgo was developed by the authors with not only a consideration for the walker library user experience, but also with a specific attention to the differing needs of the multiple user groups served by the library. this user-focused concern led to exploring creative methodologies such as user experience research and human-centered design thinking, a process of overlapping phases that produces a creative and meaningful solution in a non-linear way. the three pillars of design thinking are inspiration, ideation, and iteration.33 defining the problem and empathizing with the users (inspiration) led into the ideation phase, whereby the authors created low- and high-fidelity prototypes. the prototypes were tested and improved (iteration) through the use of beta testing in which playtesters interacted with the gamified orientation. the authors were novice developers of the gamified orientation, and this entailed a learning curve for not only the design thinking mindset but also the technical achievability. the development started with design thinking conversations and quickly turned to low-fidelity prototypes designed on paper. the development soon advanced to the actual coding so that the authors could get early designs tested before launching the final version.
prior to deployment on the library’s website, libgo underwent a series of playtesting by library faculty, staff, and student employees. this testing was invaluable and led to such improvements as streamlining of processes and less ambiguity of text.

libgo was developed with the twine open-source software (https://twinery.org), a product which is primarily used for telling interactive, non-linear stories with html. twine was an excellent application for this project as it allowed the creation of an online and interactive “choose your own adventure” styled library orientation game, in which users could explore the library based upon their selection of one of multiple available plot directions. with a modest learning curve and as an open source software, twine is highly accessible for those who are not accustomed to coding. for those who know html, css, javascript, variables, and conditional logic, twine’s capabilities can be extended.

the library’s interactive orientation adventure requires users to select one of the four available personas: undergraduate student, graduate student, faculty, or community member. users subsequently follow that persona through a non-linear series of places, resources and points of interest built with the html output of using twee (twine’s programming language). see figure 2 for an example point of interest page and figure 3 for an example of a user’s final score after completing the gamified experience. once the twine story went through several iterations of design and testing, the html file was placed on the library’s website for the gamified orientation to be implemented with actual users.

figure 2. this instructional page within libgo explains how to reserve different library spaces online.
upon reading this content, the user will progress by clicking on one of the hypertext lines in blue font at the bottom.

figure 3. based upon the displayed avatar, this libgo page is representative of a graduate student’s completion of libgo. the page indicates the player’s final score and gives additional options to return to the home page or complete the survey.

purpose of study

libgo utilized the common “choose your own adventure” format whereby players progress through a storyline based upon their selection of one of multiple available plot directions. although the literature suggests that other technology-based methods are an engaging and instructive mode of content delivery, little prior research exists regarding this specific approach to library outreach. furthermore, no previous research appears to have been conducted to understand the different online, gamified orientation needs of various library groups. the researchers wanted to understand the potential of interactive storytelling as a means to educate a range of users on library services as well as make the library more approachable from a user perspective.

the study was designed to understand the user experience of each of the four groups. the researchers hoped to discern which users, if any, found the gamified experience to be a helpful method of orientation to the library’s physical and electronic services. another area of inquiry was to determine whether this might be an effective delivery method by which to target certain segments of the campus for outreach. finally, the study intended to determine whether this method of orientation might incline participants toward future use of the library.
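the branching structure described above, in which a player picks a persona, follows links from page to page, and finishes with a score, can be pictured as a small graph of passages. the sketch below is only an illustration in python, not the authors’ twine or twee source; every passage name, link label, text string, and point value in it is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Passage:
    """One page of the story: its text, outgoing links, and points awarded."""
    text: str
    links: Dict[str, str] = field(default_factory=dict)  # link label -> passage name
    points: int = 0


# Hypothetical story graph; none of this content is taken from libgo itself.
STORY: Dict[str, Passage] = {
    "start": Passage(
        "choose your persona",
        {
            "undergraduate student": "ug_intro",
            "graduate student": "grad_intro",
            "faculty": "faculty_intro",
            "community member": "community_intro",
        },
    ),
    "ug_intro": Passage("places to study", {"continue": "reserve_space"}, points=1),
    "grad_intro": Passage("research help", {"continue": "reserve_space"}, points=1),
    "faculty_intro": Passage("course support", {"continue": "finish"}, points=1),
    "community_intro": Passage("visitor services", {"continue": "finish"}, points=1),
    "reserve_space": Passage("reserving library spaces", {"finish": "finish"}, points=1),
    "finish": Passage("final score page"),
}


def play(story: Dict[str, Passage], choices: List[str],
         start: str = "start") -> Tuple[List[str], int]:
    """Follow a sequence of link labels; return the visited passages and the score."""
    current = start
    visited = [current]
    score = story[current].points
    for choice in choices:
        links = story[current].links
        if choice not in links:
            raise ValueError(f"'{choice}' is not a link on passage '{current}'")
        current = links[choice]
        visited.append(current)
        score += story[current].points
    return visited, score


grad_path, grad_score = play(STORY, ["graduate student", "continue", "finish"])
community_path, community_score = play(STORY, ["community member", "continue"])
```

in this toy graph the graduate persona passes through one more page than the community persona, so the two players end on the same final page with different scores. a real twine story would express the same idea with passages, links, and story variables rather than a python dictionary.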
methodology

overview

the authors selected an embedded mixed methods design approach in which quantitative and qualitative data were collected concurrently through the same assessment instrument.34 the survey instrument primarily collected quantitative data; however, a qualitative open-response question was embedded at the end of the survey. this question gathered additional data by which to answer the research questions. each data set (one quantitative and one qualitative) was analyzed separately for each participant group, and then the groups were compared to develop a richer understanding of participant behavior.

research questions

the data collection and subsequent analysis attempted to answer the following questions:

1. which group(s) of library users prefer to be oriented to library services and resources through the interactive storytelling format, as compared to other formats?
2. which group(s) of library users are more likely to use library services and resources after participating in the interactive storytelling format of orientation?
3. what are user impressions of libgo, and are there any differences in impression based on the characteristics of the unique user group?

participants

participants for the study were recruited in-person and via the library website. in-person recruitment entailed the distribution of flyers and use of signage to recruit participants to play libgo in a library computer lab during a one-day event. online recruitment lasted approximately ten weeks and simply involved the placement of a link to libgo on the home page of the library’s website. a total of 167 responses were gathered through both methods and participants were distributed as shown in table 1.

table 1. composition of study’s participants
group number | affiliation | number of responses
1 | undergraduate students | 55
2 | graduate students | 62
3 | faculty | 13
4 | staff | 28
5 | community members | 9
total | | 167

for the purposes of statistical data analysis, groups 3 and 4 were combined to produce a single group of 41 university employee respondents; also, group 5’s data was not included in the statistical analysis due to the low number of participants. qualitative data for all groups, however, was included in the non-statistical analysis.

survey instrument

a survey with twelve total questions was developed for this study and was administered online through qualtrics. after playing libgo, participants were asked to voluntarily complete the survey; if they agreed, they were redirected to the survey’s website. before participants answered any survey questions, the instrument presented an informed consent statement. all aspects of the research, including the survey instrument, were approved through the university’s institutional review board (protocol number 18-1293).

the first part of the survey (see appendix a) consisted of ten questions, each with a ten-point likert-scale response. the first five questions were each designed to measure a preference construct, and the next five questions each measured a likelihood construct. the preference construct referred to the participant’s preference for a library orientation: did they prefer libgo’s online interactive storytelling format, or did they prefer another format such as in-person talks? the likelihood construct referred to the participant’s self-perceived likelihood of more readily engaging with the library in the future (both in-person and online) after playing libgo. the second part of the survey gathered the participant’s self-reported affiliation (see table 1 for the list of possible group affiliations) as well as offered participants an open-ended response area for optional qualitative feedback.
data collection

the study’s data was collected in two stages. in stage one, libgo was unveiled to library visitors during a special campus-wide week of student programming events. on the library’s designated event day, the researchers held a drop-in event at one of the library’s computer labs (see figure 4 for an example of event advertisement). library visitors were offered a prize bag and snacks if they agreed to play libgo and complete the survey. during the three-hour-long drop-in session, 58 individual responses were collected: the vast majority of these came from undergraduate students (51 responses), with additional responses from graduate students (n = 4), university staff employees (n = 2), and one community member. community members were defined as anyone not currently directly affiliated with the university; this group may have included prospective students or alumni.

stage two began the day after the library drop-in event, and simply involved the placement of a link to libgo on the home page of the library’s website. any visitor to the library’s website could click on the advertisement to be taken to libgo. this link remained active on the library website for ten weeks, at which point the final data was gathered. a total of 167 responses were gathered during both stages and participants were distributed as previously shown in table 1.

figure 4. example of student libgo event advertisement

results

quantitative findings

statistical analysis of each of the ten quantitative questions required the use of one-way anova in spss. a post hoc test (hochberg’s gt2) was run in each instance to account for the different sample sizes. for all statistical analysis, only the data from undergraduates, graduate students, and university employees (a group which combined both faculty and staff results) were utilized.
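as a rough illustration of how an effect size relates to a one-way anova result, omega squared can be recovered from the f statistic, the between-groups degrees of freedom, and the total sample size. the python sketch below is not the authors’ spss procedure. it assumes the standard fixed-effects formula ω2 = df_between(f − 1) / (df_between(f − 1) + n), takes the group sizes from table 1 and the f values reported in table 3, and uses the commonly cited kirk cut-offs (.010 small, .059 medium, .138 large) as an assumption about how the labels were assigned.

```python
def omega_squared(f_stat: float, df_between: int, n_total: int) -> float:
    """Estimate omega squared for a one-way ANOVA from its F statistic."""
    numerator = df_between * (f_stat - 1.0)
    return numerator / (numerator + n_total)


def kirk_label(w2: float) -> str:
    """Label an effect size using commonly cited Kirk cut-offs (an assumption here)."""
    if w2 >= 0.138:
        return "large"
    if w2 >= 0.059:
        return "medium"
    if w2 >= 0.010:
        return "small"
    return "negligible"


# Group sizes from table 1; faculty (13) and staff (28) form one employee group.
group_sizes = {"undergraduate": 55, "graduate": 62, "employee": 13 + 28}
n_total = sum(group_sizes.values())
df_between = len(group_sizes) - 1
df_within = n_total - len(group_sizes)

# F statistics as reported in table 3 of the article.
reported_f = {"question 2": 3.714, "question 3": 4.508, "question 6": 7.178}

effect_sizes = {
    question: round(omega_squared(f_stat, df_between, n_total), 2)
    for question, f_stat in reported_f.items()
}
labels = {question: kirk_label(w2) for question, w2 in effect_sizes.items()}

for question in reported_f:
    print(question, effect_sizes[question], labels[question])
```

under these assumptions the sketch gives degrees of freedom of 2 and 155 and effect sizes of .03, .04, and .07, which agree with the values the article reports for questions 2, 3, and 6.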
a listing of mean comparisons by group, for each of the ten survey questions, may be found in table 2. the analysis of the one-way anovas yielded statistically significant results for three of the ten individual questions in the first part of the survey: questions 2, 3, and 6 (see table 3).

table 2. descriptive statistics for survey results (10-point scale, with 10 as most likely)
survey question | mean for undergraduate students | mean for graduate students | mean for university employees
1. in considering the different ways to learn about walker library, do you find this library orientation game to be more or less preferable as compared to other orientation options (such as in-person tours, speaking with a librarian, or clicking through the library website on your own)? | 7.02 | 6.39 | 6.02
2. in your opinion, was the library orientation game a useful way to get introduced to the library’s services and resources? | 8.13 | 6.94 | 7.12
3. if your friend needed a library orientation, how likely would you be to recommend the game over other orientation options (such as in-person tours, speaking with a librarian, or clicking through the library website on your own)? | 7.38 | 5.94 | 5.98
4. please indicate your level of agreement with the following statement: “as compared to playing the game, i would have preferred to learn about the library’s resources and services by my own exploration of the library website?” | 6.11 | 6.50 | 5.88
5. please indicate your level of agreement with the following statement: “as compared to playing the game, i would have preferred to learn about the library’s resources and services through an in-person orientation tour.” | 6.11 | 5.08 | 5.76
6. after playing this orientation game, are you more or less likely to visit walker library in person? | 8.27 | 6.94 | 6.90
7. after playing this library orientation game, are you more or less likely to use the walker library website to find out about the library (such as hours of operation, where to go to get different materials/services, etc.)? | 7.82 | 6.97 | 7.20
8. after playing this library orientation game, are you more or less likely to seek help from a librarian at walker library? | 6.95 | 6.58 | 6.63
9. after playing this library orientation game, are you more or less likely to use the library’s online resources (such as databases, journals, e-books)? | 7.67 | 7.15 | 6.90
10. after playing this library orientation game, are you more or less likely to attend a library workshop, training, or event? | 6.96 | 6.73 | 6.24

table 3. overall statistically significant group differences
question | df | f | p | ω2
question 2 | 2 | 3.714 | .027 | .03
question 3 | 2 | 4.508 | .012 | .04
question 6 | 2 | 7.178 | .001 | .07

question 2 asked “in your opinion, was the library orientation game a useful way to get introduced to the library’s services and resources?” the one-way anova found that there was a statistically significant difference between groups (f(2, 155) = 3.714, p = .027, ω2 = .03). the post hoc comparison using the hochberg’s gt2 test revealed that undergraduates were statistically significantly more likely to rate libgo as a useful introduction (m = 8.13, sd = 1.94, p = .031) as compared to the graduate students (m = 6.94, sd = 2.72). there was no statistically significant difference between undergraduates and the university employees (p = .145).
according to criteria suggested by roger kirk, the effect size of .03 indicates a small effect in perceived usefulness of libgo as an introduction among undergraduates.35

question 3 asked “if your friend needed a library orientation, how likely would you be to recommend the game over other orientation options (such as in-person tours, speaking with a librarian, or clicking through the library website on your own)?” the one-way anova found that there was a statistically significant difference between groups (f(2, 155) = 4.508, p = .012, ω2 = .04). the post hoc comparison using the hochberg’s gt2 test found that undergraduates were statistically significantly more likely to prefer libgo over other orientation options (m = 7.38, sd = 2.49, p = .021) as compared to graduate students (m = 5.94, sd = 3.06). there was no statistically significant difference between undergraduates and university employees (p = .053). the effect size of .04 indicates a small effect regarding undergraduate preference for libgo versus other orientation options.

question 6 asked “after playing this library orientation game, are you more or less likely to visit walker library in person?” the one-way anova found that there was a statistically significant difference between groups (f(2, 155) = 7.178, p = .001, ω2 = .07). the post hoc comparison using the hochberg’s gt2 test revealed that undergraduates were statistically significantly more likely to visit the library after playing libgo (m = 8.27, sd = 2.09, p = .003) as compared to graduate students (m = 6.94, sd = 2.20). additionally, the test found that undergraduates were statistically significantly more likely to visit the library after playing libgo (p = .007) as compared to university employees (m = 6.90, sd = 2.08). according to criteria suggested by kirk, the effect size of .07 indicates a medium effect regarding undergraduate potential to visit the library in person after playing libgo.36

in addition to testing each individual survey question, tests were run to understand the possible group differences by construct (preference and likelihood). the preference construct was an aggregate of survey questions 1-5, and the likelihood construct was an aggregate of survey questions 6-10. for both constructs, the one-way anova found results which were not statistically significant.

in all, the quantitative findings indicated three areas by which the experience of playing libgo was more helpful for the surveyed undergraduates than the other surveyed groups (i.e., graduate students or university employees). at this point, the analysis turned to the qualitative data so as to better understand participant views of libgo.

qualitative findings

analysis of the qualitative results was limited to the data collected in the survey’s final question. question 12 was an open-response area, and was intentionally prefaced with a vague prompt: “do you have any final thoughts for the library (suggestions, additions, modification, comments, criticisms, praise, etc.)?” of the 167 total survey responses, 67 individuals chose to answer this question. preliminary analysis showed that the feedback derived from this question covered a spectrum of topics, ranging from remarks on the libgo experience itself to broader concerns regarding other library services.

open coding strategies were utilized to interpret the content of participant responses. under this methodology, the responses were evaluated for general themes and then coded and grouped under a constant comparative approach.37 nvivo 12 software was used to code all 67 participant responses. initial coding yielded eight open codes, but these were later consolidated into six final codes (see table 4). one code (libgo improvement tip) was rather nuanced and yielded five axial codes (see table 5).
axial codes denoted secondary concerns which fell under a larger category of interest. although some participants gave longer feedback which addressed multiple concerns, care was taken to segregate each distinct concern to a specific code. therefore, it is important to note that some comments addressed multiple concerns, and so the total number of concerns (n = 76) is greater than the total number of individuals responding to the prompt (n = 67).

table 4. distribution of qualitative codes by user group
code | undergraduate | graduate | faculty | staff | community member | total # concerns
positive feedback | 7 | 7 | 1 | 4 | 2 | 21
negative feedback | 1 | 2 | 0 | 3 | 0 | 6
in-person tour preference | 2 | 3 | 0 | 1 | 0 | 6
libgo improvement tip | 5 | 11 | 1 | 3 | 3 | 23
library services feedback | 2 | 4 | 3 | 0 | 0 | 9
library building feedback | 1 | 7 | 1 | 2 | 0 | 11
total: | 18 | 34 | 6 | 13 | 5 | 76

discussion of qualitative themes

positive feedback (21 separate concerns). affirmative comments regarding libgo were primarily split between undergraduate and graduate students, with a small number of comments coming from the other groups. although all groups stated that the game was helpful, one undergraduate wrote “i wish i would’ve received this orientation at the very beginning of the year!” a graduate student declared “this was a creative way to engage students, and i think it should be included on the website for fun.” both community members commented on the utility of libgo in providing an orientation without having to physically come to the library; for example, “interactive without having to actually attend the library in person which i liked.” additionally, a community member pointed out the instructional capability of libgo, writing “i think i learned more from the game than walking around in the library.”

negative feedback (6 separate concerns). unfavorable comments regarding libgo primarily challenged the orientation’s characterization as a “game” in terms of its lack of fun.
one graduate student wrote a comment representative of this concern by stating, “the game didn’t really seem like a game at all.” a particularly searing comment came from a university staff member who wrote, “calling this collection of web pages an ‘interactive game’ is a stretch, which is a generous way of stating it.”

in-person tour preference (6 separate concerns). a small number of concerns indicated a preference for in-person orientations versus online. one undergraduate cited the ability to ask questions during an in-person tour as an advantage of that delivery medium. a graduate student mentioned their desire for kinesthetic learning over an online approach, writing, “i prefer hands on exploration of the library.”

libgo improvement tip (23 separate concerns). suggested improvements to libgo were the largest area of qualitative feedback and produced five axial themes (subthemes); see table 5 for a breakdown of the five axial themes by group.

1. design issues were the largest cited area of improvement, and the most commonly mentioned design problem was the inability of the user to go back to previously seen content. although this functionality did in fact exist, it was apparently not intuitive to users; design modifications in future iterations are therefore critical. other users made suggestions as to the color scheme used and the ability to magnify image sizes.
2. user experience was another area of feedback, and primarily included suggestions on how to make libgo a more fun experience. one graduate student offered a role-playing game alternative. another graduate student expressed an interest in a game with side missions, in addition to the overall goals, where tokens could be earned for completed missions; the student justified these changes by stating “i feel that incorporating these types of idea will make the game more enjoyable.” in suggesting similar improvements, one undergraduate stated that libgo “felt more like a quiz than a game.”
3. technology issues primarily addressed two related issues: images not loading and broken links. images not loading could be dependent on many factors, including the user’s browser settings, internet traffic (volume) delaying load time, or broken image links, among others. broken links could be the root issue since the images used in libgo were taken from other areas of the library website. this method of gathering content pointed out a design vulnerability of using existing image locations (controlled by non-libgo developers) rather than images exclusively for libgo.
4. content issues were raised exclusively by graduate students. one student felt that libgo placed an emphasis on physical spaces in the library and did not give a deep enough treatment to library services. another graduate student asked for “an interactive map to click on so that we physically see the areas” of the library, thus making the interaction more user-friendly with a visual.
5. didn’t understand purpose is a subtheme where improvement is needed and is based on two comments made by the two university staff members. one wrote that “an online tour would have been better and just as informative,” although libgo was not only designed to be an online tour of the library, but also an orientation of the library’s services. the other staff member wrote, “i read the rules but it was still unclear what the objective was.” in all, it is clear that libgo’s purpose was confusing for some.
table 5. libgo improvement tip axial codes by user group
axial code | undergraduate | graduate | faculty | staff | community member | total # concerns
design | 4 | 3 | 0 | 0 | 1 | 8
user experience | 1 | 2 | 1 | 0 | 1 | 5
tech issue | 0 | 1 | 0 | 1 | 0 | 2
content | 0 | 5 | 0 | 0 | 1 | 6
didn’t understand purpose | 0 | 0 | 0 | 2 | 0 | 2
total: | 5 | 11 | 1 | 3 | 3 | 23

library services feedback (9 separate concerns). several participants took the opportunity to provide feedback on general library services rather than on libgo itself. undergraduates simply gave general positive feedback about the value of the library, but many graduate students gave recommendations regarding specific electronic resource improvements. additionally, one graduate student wrote, “i think it is critical to meet with new graduate students before they start their program,” something the library used to do but had not pursued in recent years. although these comments did not directly pertain to libgo, the authors accepted all of them as valuable feedback to the library.

library building feedback (11 separate concerns). this was another theme in which graduate students dominated the comments. feedback ranged from requests for microwave use to additional study tables and better temperature control in the building. several participants asked for greater enforcement of quiet zones. as with the library services feedback, the authors took these comments as helpful to the overall library rather than to libgo.

discussion

the results of this study indicated that some groups of library visitors better received the gamified library orientation experience than other groups. undergraduate students indicated the largest appreciation for a library orientation via libgo.
specifically, they demonstrated a statistically significant difference over the other groups in supporting libgo’s usefulness as an orientation tool, a preference for libgo over other orientation formats, and a likelihood of future use of the physical library after playing libgo. these very encouraging results provide evidence for the efficacy of alternative means of library orientation. the qualitative results provided additional helpful insight regarding the user impressions from each of the five surveyed groups. this feedback demonstrated that a variety of groups benefited from the experience of playing libgo, including some community members who appreciated libgo as a means of becoming acclimated to the library without having to enter the building. a virtual orientation format was not ideal for a few players who indicated a preference for a face-to-face orientation due to the ability to ask questions. many people identified areas of improvement for libgo. graduate students in particular offered a disproportionate number of suggestions as compared to the other groups. while they provided a great deal of helpful feedback, it is possible that graduate students were so distracted by the perceived problems that they could not fully take in the experience or gain value from libgo’s orientation purpose. it is also very likely that libgo simply was not very fun for these players: several players noted that it did not feel like a game but rather a collection of content. the review of literature indicated that this amusement issue is a common pitfall of educational games. although the authors tried to design an enjoyable orientation experience, it is possible that more work is needed to satisfy user expectations. the mixed-methods design of this study was instrumental in providing a richer understanding of user perceptions.
while the statistical analysis of participant survey responses was very helpful in identifying clear trends between groups, the qualitative analysis helped the authors draw valuable conclusions. specifically, the open-response data demonstrated that additional groups such as graduate students and community members appreciated the experience of playing libgo; this information was not readily apparent through the statistical analysis. additionally, the qualitative analysis demonstrated that many groups had concerns regarding areas of improvement that may have impaired their user experience. these important findings could help guide future directions of the research. in all, the authors concluded this phase of the research feeling satisfied that libgo showed great promise for library orientation delivery but could benefit from continued development and future user assessment. although undergraduate students seemed most receptive overall to a virtual orientation experience, other groups appeared to have benefited from the resource. study limitations a primary limitation of this study was its small sample size. as the entire university campus was targeted for participation in the study, the number of respondents was far too small to generalize the results. despite this limitation however, the study’s population reflected many different groups of library patrons on campus. the findings are therefore valuable as a means of stimulating future discussion regarding the value of alternative library orientation methods utilizing gamification. another limitation is that the authors did not pre-assess the targeted groups for their prior knowledge of walker library services and building layout, nor for their interest in learning about these topics. it is possible that various groups did not see the value in learning about the library for a variety of reasons. 
faculty members, in particular, may have considered their prior knowledge adequate for navigating the electronic holdings or building layout without recognizing the value of the many other services offered physically and electronically by the library. all groups may have experienced a level of “library anxiety” that prevented them from being motivated to learn more about the library.38 it is difficult to understand the range of covariate factors without a pre-assessment. finally, there was qualitative evidence supporting the limitation that libgo did not properly convey its stated purpose of orientation rather than imparting research skills. without understanding libgo’s focus on library orientation, users could have been confused or disappointed by the experience. although care was taken to make this purpose explicit, some users indicated their confusion in the qualitative data. this observed problem points to a design flaw that undoubtedly had some bearing on the study’s results.

conclusion & future research

convinced of the importance of the library orientation, the authors sought to move this traditional in-person experience to a virtual one. the quantitative results indicated that the gamified orientation experience was useful to undergraduate students in its intended purpose of acclimating users to the library, as well as encouraging their future use of the physical library. at a time in which physical traffic to the library has shown a marked decline, new outreach strategies should be considered.39 the results were also helpful in showing that this particular iteration of the gamified orientation was preferred over other delivery methods by undergraduate students, as compared to other groups, to a statistically significant level.
this is an important finding as it demonstrates that a diversified outreach strategy is necessary: different groups of library patrons desire their orientation information in different formats. the next logical question to ask however is: why did the other groups examined through the statistical data analysis (graduate students and faculty) not appreciate the gamified orientation to the same level as undergraduates? the answers to this question are complicated and may be explained in part by the qualitative analysis. based upon those findings, it is possible that the game did not appeal to these groups on the basis of fun or enjoyment; this concern was specifically mentioned by graduate students. faculty members, including staff, provided a smaller level of qualitative feedback; it is therefore difficult to speculate as to their exact reasons for disengagement with libgo. with this concern in mind, the authors would like to concentrate their next iteration of research on the specific library orientation needs of graduate students and faculty. both groups present different, but critical, needs for outreach. graduate students were the largest group of survey respondents, presumably indicating a high level of interest in learning more about the library. many graduate programs at mtsu are delivered partially or entirely online; as a result, these students may be less likely to come to campus. due to graduate students’ relatively infrequent visits to campus, a virtual library orientation could be even more meaningful for them in meeting their need for library services information. faculty are another important group to target because if they lack a full understanding of the library’s offerings, they are unlikely to assign assignments that wholly utilize the library’s services. although it is possible that faculty prefer an in-person orientation, many new faculty have indicated limited availability for such events. a virtual orientation seems conducive to busy schedules. 
however, it is possible that the issue is simply a matter of marketing: faculty may not know that a virtual option is available, nor do they necessarily understand all that the library has to offer. in all, future research should begin with a survey to understand what both groups already know about the library, as well as the library services they desire. another necessary step in future research would be the expansion of the development team to include computer programmers. although the authors feel that libgo holds great promise as a virtual orientation tool, more needs to be done to enhance the user’s enjoyment of the experience. twine is user-friendly software that other librarians could pick up without having to be computer programmers; however, programmers (professional or student) could bring design expertise to the project. future iterations of this project should incorporate the skills of multiple groups, including expertise in libraries, user research, visual design, interaction design, programming, marketing, and testers from each type of intended audience. collectively, this group will have the greatest impact on improving the user experience and ultimately the usefulness of a gamified orientation experience. this work with gamification, and specifically interactive storytelling, was a valuable experience for walker library. these results should encourage other libraries seeking an alternate delivery method for orientations. the authors hope to build upon the lessons learned from this mixed methods research study of libgo to find the correct outreach medium for their range of library users.

acknowledgments

special thanks to our beta playtesters and student assistants who worked the libgo event, which was funded, in part, by mt engage and walker library at middle tennessee state university.
appendix a: survey instrument

endnotes

1 sandra calemme mccarthy, “at issue: exploring library usage by online learners with student success,” community college enterprise 23, no. 2 (january 2017): 27–31; angie thorpe et al., “the impact of the academic library on student success: connecting the dots,” portal: libraries and the academy 16, no. 2 (2016): 373–92, https://doi.org/10.1353/pla.20160027.
2 steven ovadia, “how does tenure status impact library usage: a study of laguardia community college,” journal of academic librarianship 35, no. 4 (january 2009): 332–40, https://doi.org/10.1016/j.acalib.2009.04.022.
3 chris leeder and steven lonn, “faculty usage of library tools in a learning management system,” college & research libraries 75, no. 5 (september 2014): 641–63, https://doi.org/10.5860/crl.75.5.641.
4 kyle felker and eric phetteplace, “gamification in libraries: the state of the art,” reference and user services quarterly 54, no. 2 (2014): 19–23, https://doi.org/10.5860/rusq.54n2.19; nancy o’hanlon, karen diaz, and fred roecker, “a game-based multimedia approach to library orientation,” (paper, 35th national loex library instruction conference, san diego, may 2007), https://commons.emich.edu/loexconf2007/19/; leila june rod-welch, “let’s get oriented: getting intimate with the library, small group sessions for library orientation,” (paper, association of college and research libraries conference, baltimore, march 2017), http://www.ala.org/acrl/sites/ala.org.acrl/files/content/conferences/confsandpreconfs/2017/letsgetoriented.pdf.
5 kelly czarnecki, “chapter 4: digital storytelling in different library settings,” library technology reports, no. 7 (2009): 20–30; rebecca j. morris, “creating, viewing, and assessing: fluid roles of the student self in digital storytelling,” school libraries worldwide, no. 2 (2013): 54–68.
6 sandra marcus and sheila beck, “a library adventure: comparing a treasure hunt with a traditional freshman orientation tour,” college & research libraries 64, no. 1 (january 2003): 23–44, https://doi.org/10.5860/crl.64.1.23.
7 lori oling and michelle mach, “tour trends in academic arl libraries,” college & research libraries 63, no. 1 (january 2002): 13–23, https://doi.org/10.5860/crl.63.1.13.
8 kylie bailin, benjamin jahre, and sarah morriss, “planning academic library orientations: case studies from around the world,” (oxford, uk: chandos publishing, 2018): xvi.
9 bailin, jahre, and morriss, “planning academic library orientations.”
10 marcus and beck, “a library adventure”; a. carolyn miller, “the round robin library tour,” journal of academic librarianship 6, no. 4 (1980): 215–18; michael simmons, “evaluation of library tours,” edrs, ed 331513 (1990): 1–24.
11 marcus and beck, “a library adventure”; oling and mach, “tour trends”; rod-welch, “let’s get oriented.”
12 pixey anne mosley, “assessing the comfort level impact and perceptual value of library tours,” research strategies 15, no. 4 (1997): 261–70, https://doi.org/10.1016/s0734-3310(97)90013-6.
13 mosley, “assessing the comfort level impact and perceptual value of library tours.”
14 marcus and beck, “a library adventure,” 27.
15 kenneth j. burhanna, tammy j. eschedor voelker, and jule a. gedeon, “virtually the same: comparing the effectiveness of online versus in-person library tours,” public services quarterly 4, no. 4 (2008): 317–38, https://doi.org/10.1080/15228950802461616.
16 burhanna, voelker, and gedeon, “virtually the same,” 326.
17 burhanna, voelker, and gedeon, “virtually the same,” 329.
18 felker and phetteplace, “gamification in libraries.”
19 felker and phetteplace, “gamification in libraries,” 20.
20 felker and phetteplace, “gamification in libraries.”
21 felker and phetteplace, “gamification in libraries”; o’hanlon et al., “a game-based multimedia approach.”
22 mary j. broussard and jessica urick oberlin, “using online games to fight plagiarism: a spoonful of sugar helps the medicine go down,” indiana libraries 30, no. 1 (january 2011): 28–39.
23 melissa mallon, “gaming and gamification,” public services quarterly 9, no. 3 (2013): 210–21, https://doi.org/10.1080/15228959.2013.815502.
24 j. long, “chapter 21: gaming library instruction: using interactive play to promote research as a process,” distributed learning (january 1, 2017), 385–401, https://doi.org/10.1016/b978-0-08-100598-9.00021-0.
25 rod-welch, “let’s get oriented.”
26 o’hanlon et al., “a game-based multimedia approach.”
27 mallon, “gaming and gamification.”
28 anna-lise smith and lesli baker, “getting a clue: creating student detectives and dragon slayers in your library,” reference services review 39, no. 4 (november 2011): 628–42, https://doi.org/10.1108/00907321111186659.
29 monica fusich et al., “hml-iq: fresno state’s online library orientation game,” college & research libraries news 72, no. 11 (december 2011): 626–30, https://doi.org/10.5860/crln.72.11.8667.
30 broussard and oberlin, “using online games”; fusich et al., “hml-iq”; o’hanlon et al., “a game-based multimedia approach.”
31 felker and phetteplace, “gamification in libraries.”
32 felker and phetteplace, “gamification in libraries”; fusich et al., “hml-iq.”
33 “design thinking for libraries: a toolkit for patron-centered design,” ideo (2015), http://designthinkingforlibraries.com.
34 john w. creswell and vicki l. plano clark, designing and conducting mixed methods research (thousand oaks, ca: sage publications, 2007).
35 roger kirk, “practical significance: a concept whose time has come,” educational and psychological measurement, no. 5 (1996).
36 kirk, “practical significance.”
37 sandra mathison, “encyclopedia of evaluation,” sage, 2005, https://doi.org/10.4135/9781412950558.
38 rod-welch, “let’s get oriented.”
39 felker and phetteplace, “gamification in libraries.”

japanese character input: its state and problems

ichiko morita: ohio state university, columbus.

computer processing of information is highly advanced in japan, and it continues to be researched and improved by the cooperative efforts of the government, private corporations, and individual scientists, who are among the best in the world. this paper introduces various approaches to the computer input of information currently developed in japan, and discusses the possibility of their applications to the processing of east asian-vernacular language materials in large research libraries in this country.

processing of catalog information through an on-line shared-cataloging system has become a part of american libraries' common practice, and its financial and temporal savings have been proven. however, there are some materials not yet considered appropriate for computer processing. the library of congress' plans for romanizing catalog information for all non-roman language materials and putting them on marc tapes for quick distribution of information have been objected to by a large number of specialists in the field.
the opponents' reason has been that computerization of vernacular languages by means of transliteration is not satisfactory. such materials are best handled in their own writing systems (the languages in this category include chinese, japanese, korean, hebrew, arabic, and various languages in india). those specialists in the field who see systems working for roman-alphabet materials generally agree that automated systems are very efficient and useful for their research. it would be best if non-roman language materials could be processed through computers using their own writing systems. as far as technology goes, it is possible to process such materials in their original form. systems that have the capability of handling those languages directly have been developed; among the most advanced are the japanese systems. japan has overcome numerous difficulties in developing systems that are capable of handling japanese characters. although automation of libraries is not as widespread as in the united states (due perhaps to a delay in the development of computers), some japanese libraries already have a decade of experience with advanced systems. many others have recently started to adopt them. wide utilization of these systems seems to be just a matter of time. it will be beneficial to review japanese methods and consider possible adaptation of them to our systems. in the following sections, various japanese approaches to inputting the japanese language are explained with an eye to future automation of non-roman language materials in this country.

manuscript received august 1980; accepted december 1980.

japanese character input/morita 7

the japanese language and the computer

it should be noted, first of all, that the japanese language is an entirely different language from chinese, although they are often confused because they both use the same chinese ideographs in writing. each chinese ideograph, or character, symbolizes a certain object or denotes a certain meaning.
the japanese use them in the japanese language with its own pronunciation in the context of its own grammar, whereas the chinese use them in the chinese language with its own pronunciation in the context of its own grammar. this means that a chinese ideograph could mean the same thing in both languages, but be pronounced or read differently and used in different grammatical environments. the chinese ideographs used in japanese are referred to as kanji, which are, to complicate the matter, used along with japanese syllabaries called kana. kana, in two styles called hiragana and katakana, total about 170 characters. depending on whether ,a kanji is used with another kanji or kana, the reading of it varies. at different times one set of kanji may be read in two or three different ways. the total number of kanji is about 50,000. in comprehensive dictionaries, about 40,000 or more kanji are included. medium-sized ones, such as ueda's daijiten, include about 15,000; concise ones about 8,000 to 10,000. 1 according to several tests on frequency of kanji occurrence made in various japanese institutions, approximately 3,000 kanji appear in high frequency, 3,000 are of moderate frequency, and several thousand more are of infrequent occurrence. as for geographical names, 2,279 kanji will cover most of japan and 1,500 kanji will suffice to cover personal names, except for very unusual names. 2 approximately 6,300 characters are needed for major newspapers such as the asahi and the nikkei. the trends in the use of kanji are to simplify the characters themselves, and not to use difficult kanji with many strokes. in 1946, the japanese government established 1,850 kanji as those for daily use, 3 and today newspapers and official documents use only those kanji, except for some personal and geographical names. the implication of this trend for computerization of kanji is that, depending on the documents to be covered, the need in number and kind of kanji varies. 
that is, institutions that deal with scientific or current information do not need as many kanji as other types of institutions that handle documents covering longer periods and larger areas of knowledge.

8 journal of library automation vol. 14/1 march 1981

for example, japan information center for science and technology, which mainly handles the latest scientific information, claims that with approximately 6,000 kanji it can function satisfactorily. an example from the other extreme is the national institute of japanese literature, whose collection covers older historical periods, during which a great number of kanji were used and many kanji went through changes, mostly simplification in style. the latter institute is constantly adding new kanji to its system. it is obvious then that the first problem in the computerization of japanese materials is the number and kind of kanji to be included in the system. this is a problem of hardware. the other problem concerns software. when japanese is written, its words are not divided as in english, for the combination of kanji and kana helps visually to make sentences understandable without word division. also, compound nouns are made by adding other words to a noun, so that, if a set of kanji represents one noun, one can expand its meaning by adding another kanji to it. though word division has been a problem in transliteration and is not new in computerization, both arbitrarily divided words and undivided words in particular become serious problems in the computer files and in the retrieval of information. a question may be raised as to why we need kanji processing in spite of these problems; why isn't computer handling of alphanumerics and kana, which is in use today, sufficient? the answer to this is mainly that kanji possess a definite visual effect. also, if only romanized languages or kana alone are used, many homonyms may make the meaning ambiguous.
while it is quite possible to write japanese only in kana or in the romanized forms, as proven by the systems in use, it is better, for efficiency and precision, to express the language in the way it is actually written. as for the problem of word division, study is in progress on methods of dividing words systematically and automatically, incorporating the latest research in the field of applied linguistics. this is more concerned with the development of software, and this paper will not delve into it.

inputting

various japanese approaches to inputting kanji and kana are organized below into six major groupings according to different inputting devices. they are: (1) full keyboard, (2) component pattern input, (3) kana keyboard, (4) stenotype, (5) optical character recognition, and (6) voice recognition. these six methods are further divided into subvariations as shown in table 1 (note 4).

table 1. input systems (major approaches, variations, and subvariations, with training needed, characters/minute, and characters accommodated)

full keyboard
  kanji teletypewriter: training medium-extensive; 40-100 characters/minute; 2,300-4,000 characters accommodated
  japanese typewriter: training medium; 30-50 characters/minute
    character location: 2,205 characters accommodated
    coded-plate scanning: 2,863 characters accommodated
    coded typeface: 2,200-3,000 characters accommodated
    modified coded typeface: 3,000-4,096 characters accommodated
  tablet style: training medium-small; 30-70 characters/minute; 2,800-4,000 characters accommodated
    electromagnetic
    electrostatic
    photoelectric
component pattern input
kana keyboard
  two-key stroke: training extensive; 60-120 characters/minute; 4,096 characters accommodated
    location correspondence
    association memory
  display selection: training small; 20-30 characters/minute
  kana-kanji conversion
    word conversion
    sentence conversion
stenotype
optical character recognition
voice recognition

(one further value, 1,000-2,500, appears in the kana keyboard part of the table; its row cannot be determined from this copy.)

full keyboard

the main feature of this approach is use of a full character keyboard as the inputting device. the operator uses the full character keyboard rather than codes or other symbols.
the keyboard varies by model, usually consisting of frequently used kanji and both sets of kana, supplemented by arabic numerals, roman, cyrillic, and greek alphabets in upper and lower cases, often with italics, signs, and diacritical marks. to each character, a two-byte binary code (expressed by a four-digit numeral) is assigned, so that when the inputter types a character the code for the character is punched on paper or cassette tape.

kanji teletypewriter

the oldest method for kanji inputting, still widely in use, is the kanji teletypewriter system or multishift system. one variation of this approach, developed by the national diet library at an early stage of its computerization, has 192 character keys, each having fourteen characters in three columns and five lines, as shown in figure 1. in addition, there are fourteen selection keys arranged in three columns and five rows on the lower left of the keyboard to correspond to the pattern of characters on each character key. when an operator strikes the character key b with the right hand and the selection key a with the left hand at the same time, the code for the character c is punched on the tape.

[figure 1: keyboard diagram labeling character key b, character c, and selection key a]

fig. 1. kanji teletypewriter keyboard of the national diet library.
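The multishift principle described above can be sketched in a few lines. This is only an illustration under stated assumptions: the numbering of keys, the rule that combines a character key with a selection key, and the hexadecimal rendering of the four-digit numeral are all hypothetical, since the article does not give the national diet library's actual code assignment. Only the counts (192 character keys, fourteen selection keys) and the two-byte code come from the text.

```python
# Hypothetical model of a multishift (kanji teletypewriter) keyboard.
# The counts come from the article; the numbering scheme is an assumption.
CHARACTER_KEYS = 192   # character keys on the keyboard
SELECTION_KEYS = 14    # selection keys, one per character position on a key


def multishift_index(character_key: int, selection_key: int) -> int:
    """Combine a character key and a selection key into one character index."""
    if not 0 <= character_key < CHARACTER_KEYS:
        raise ValueError("character key out of range")
    if not 0 <= selection_key < SELECTION_KEYS:
        raise ValueError("selection key out of range")
    return character_key * SELECTION_KEYS + selection_key


def multishift_code(character_key: int, selection_key: int) -> bytes:
    """Return the index as a two-byte binary code, as it might be punched on tape."""
    return multishift_index(character_key, selection_key).to_bytes(2, "big")


def as_four_digits(code: bytes) -> str:
    """Render a two-byte code as a four-digit numeral (hexadecimal is assumed here)."""
    return code.hex()


# The keyboard can address at most 192 * 14 = 2,688 positions.
CAPACITY = CHARACTER_KEYS * SELECTION_KEYS
```

Under this numbering the addressable capacity (2,688 positions) is slightly larger than the number of characters the keyboard is reported to carry, which is consistent with a few positions being left unused.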
included on this keyboard are:

kanji                  2,006
kana                      90
western alphabets        144
numerals                  20
symbols and marks        210
kanji patterns            40
kanji components         139
space                      1
total                  2,650 (note 6)

by using shift keys on the upper left of the keyboard, kana in both styles and alphabets in upper and lower cases can be input. for satisfactory operation, the keyers must be professionally trained, and it is said that one to three months are necessary for them to be fully trained and able to input an average of fifty to sixty kanji per minute. this is not as fast as most other methods discussed.

japanese typewriter

the second of the full keyboard approaches is the japanese typewriter method, which uses a modification of the standard japanese typewriter with a tray filled with kanji printing types. the operator finds a character in the tray and punches it by moving a metal handle as the type bar is punched down to print the character. this is rather primitive and different in its operation from the english typewriter, which uses the ten-finger touch method. there are four variations:

character location method. kanji are arranged on a keyboard by their codes, so that when a key is punched, the kanji is typed on regular paper as if it had been done by a regular japanese typewriter. at the same time, the code is automatically read from the location of the key and is punched on tape.

code-plate scanning method. each type bar has a plate attached on its side, and the code for the character is marked on its plate. when a key is typed, the kanji is printed on paper and the code from the plate is optically scanned at the same time.

coded typeface method. each typeface is made with a character on the upper half and a code for it on the lower half. when a key is typed, both the character and code are printed. the code on the bottom half is optically scanned from the printed paper.

modified coded typeface method.
instead of typing both characters and codes on the paper, this method prints only the characters on the front of the paper and, at the same time, prints a bar code on the back of the paper. the machine capable of doing this is complicated. the size of the character on a typeface can be bigger than in the variation above, and the bar code can be larger to make the scanning of the code easier and more precise.

as the discussion of the four variations indicates, the japanese typewriter offers the advantage of being able to monitor input at the time of keying. since the japanese typewriter has been in use for a long time in offices where a quantity of official documents are dealt with, and since ordinary japanese typists can use this system without any additional training, the use of equipment similar in operation was considered advantageous. however, it should be noted that japanese typewriters have never become as prevalent as english typewriters, and the demand for computers comes from more areas than just those where japanese typewriters are used. for this reason, the use of japanese typewriters is not as advantageous as its proponents claim. an obvious disadvantage is its slow speed of operation: thirty to fifty characters per minute on the average. another disadvantage is that the number of characters on the keyboard is limited to about 3,000.

tablet style

this method, also known as the pen-touch method, was recently developed. each character has a key, and characters are arranged in a certain order. the location of the characters on a matrix sheet determines the two-byte binary code, which consists of a two-digit numerical abscissa and a two-digit numerical ordinate. the operator touches the key with a pen-shaped detector and the code for the character is punched on the paper tape. the operation is one-handed, requiring only a light touch of the key by a detector.
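The way a tablet position becomes a code can be illustrated with a short sketch. It assumes, hypothetically, that the abscissa is written first and the ordinate second, and that both run from 00 to 99; the article states only that the code consists of a two-digit abscissa and a two-digit ordinate.

```python
from typing import Tuple


def tablet_code(abscissa: int, ordinate: int) -> str:
    """Build a four-digit code from a key's position on the matrix sheet.

    Assumption: abscissa first, ordinate second, each in the range 00-99.
    """
    if not (0 <= abscissa <= 99 and 0 <= ordinate <= 99):
        raise ValueError("each coordinate must fit in two digits")
    return f"{abscissa:02d}{ordinate:02d}"


def tablet_position(code: str) -> Tuple[int, int]:
    """Recover the (abscissa, ordinate) position from a four-digit code."""
    if len(code) != 4 or not code.isdigit():
        raise ValueError("expected a four-digit numeric code")
    return int(code[:2]), int(code[2:])
```

For example, under these assumptions a key at abscissa 12 and ordinate 7 would yield the code 1207, and the same code maps back to that position, which is why no separate lookup table is needed between the key's location and its code.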
keys are on one flat keyboard and are color-coded by sections to make it easier for the operator to locate them. light touch operation reduces operator fatigue. this method does not require special training. however, the number of kanji on a keyboard of reasonable size is limited to approximately 3,500. by shifting, twice as many characters can be handled, though not all characters are indicated on the keyboard. speed of input is not very high: thirty to seventy characters per minute. this system, already used in many libraries, is becoming increasingly popular because of its easy operation. there are three different technologies used: electromagnetic, electrostatic, and photoelectric. there are no differences in actual input operation among these electronically different methods.
component pattern input
although not a full keyboard method, component pattern input is closely related to these methods. the idea behind this approach is that most kanji are composed of one or more basic component units, two or more of which can be put together into one kanji according to one predetermined pattern out of forty general patterns. the inputting device has keys for those forty patterns along with keys for individual components on a special keyboard. to compose a kanji, a key for an appropriate pattern is selected and typed, and components are chosen to fill each individually numbered block of the selected pattern, following the established order as shown below (ref. 7). each pattern has a code, and so does each component. when a key is typed, the code is punched on a paper tape as shown in figure 2. there are cases where a kanji with two components can be a component of another kanji, as shown in the first and second examples in figure 2. a kanji is constructed by punching at least three codes: one for a pattern and at least two for components.
then, a kanji dictionary consisting of several thousand master-code combinations (see figure 3) is stored in a magnetic drum, and the several codes to compose a kanji punched on paper or cassette tapes are converted through this dictionary to a two-byte binary code assigned to that particular kanji. these are then handled as other kanji with an individual code.
fig. 2. component pattern input (columns: kanji not on the keyboard; pattern; component parts (radicals); codes).
fig. 3. kanji dictionary.
though this can be a stand-alone approach to inputting kanji, the principle has been adopted by the national diet library to supplement the inputting of kanji on the full keyboard kanji teletypewriter. the national diet library uses this system when inputting kanji that are not included in its keyboard. instead of having a special separate keyboard, the kanji teletypewriter of the national diet library integrates patterns and components as equivalents to other characters. its keyboard includes forty patterns and approximately 140 components.
this was the most elementary approach to computerize kanji. conceived in the early developmental stage of kanji processing, it used one of the characteristics of kanji, the composition from several components. in actual situations, this technique requires at least three key strokes for one kanji and consumes time to locate the needed component on the keyboard.
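the dictionary step described above amounts to a table lookup keyed by the punched sequence of one pattern code followed by its component codes. the following python sketch shows the idea only; every code value in it is invented for illustration (none is copied from figures 2 or 3), and the names `KANJI_DICTIONARY` and `compose_kanji` are hypothetical.

```python
# Every code value below is invented for illustration; none is copied
# from figures 2 or 3 of the article.
KANJI_DICTIONARY = {
    # (pattern code, component codes...) -> single two-byte kanji code
    ("2801", "1001", "1002"): "8101",
    ("2802", "1001", "1003", "1004"): "8102",
}


def compose_kanji(punched_codes):
    """Convert one pattern code plus its component codes to a kanji code."""
    key = tuple(punched_codes)
    # The article requires at least three codes: a pattern and two components.
    if len(key) < 3:
        raise ValueError("need one pattern code and at least two component codes")
    if key not in KANJI_DICTIONARY:
        raise LookupError(f"no kanji registered for {key}")
    return KANJI_DICTIONARY[key]


print(compose_kanji(["2801", "1001", "1002"]))
```

the sketch also makes the cost visible: each kanji needs at least three key strokes and one dictionary access before it has a code of its own.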
the technique also requires the complicated extra step of putting input codes through a kanji dictionary to combine component codes into a single code per kanji. no library is currently using this system by itself.
kana keyboard system
the keyboard of a japanese syllabary typewriter is adapted from the conventional english typewriter keyboard and has standard roman alphabet keys that carry katakana in shift (figure 4). since the number of katakana exceeds that of roman letters, the katakana are extended to the keys for numerals and punctuation marks. this means that this typewriter can be used either for kana or roman letters by changing its mode.
fig. 4. kana typewriter keyboard.
two-key stroke method
this variation of the kana keyboard system is referred to as the two-key stroke system, and uses kana as codes, not as letters. roman letters can be used as codes, too. there are two different subvariations:
location correspondence. keys are divided into two sections: one for the right hand, and the other for the left hand. if two keys are to be stroked, there will be four possible combinations of key strokes: (1) left hand twice, (2) left and right, (3) right and left, and (4) right twice. the keyboard is accompanied by a kanji table in which characters are arranged in several blocks and in a certain order within each block. each block, which contains twenty-four kanji in a four-by-six arrangement, is made according to each combination of strokes: the first block is left and left; the second block is left and right, etc. within each block, the ordinate consists of keys for the first stroke and the abscissa of keys for the second. the position of a kanji at the intersection of the two indicates which keys are to be typed. when kanji a is to be typed (see figure 5), since it is in block a, indicating the stroke combination left and left, the operator types a and w by left hand. if kanji b is to be typed, the operator types key a by left hand and key p by right.
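the way a pair of key strokes selects a block and a position can be sketched as follows. this is a hedged illustration: the split of keys between the two hands is an assumed conventional touch-typing split, not the layout of the actual kana typewriter, and the names `hand_of` and `locate` are hypothetical. only the block order (left-left, left-right, right-left, right-right) comes from the text.

```python
# Assumed split of a QWERTY-style keyboard between the two hands;
# the real kana typewriter layout may differ.
LEFT_HAND_KEYS = set("qwertasdfgzxcvb")
RIGHT_HAND_KEYS = set("yuiophjklnm")

# Block numbering follows the order of stroke combinations given in the text.
BLOCK_BY_HANDS = {
    ("left", "left"): 1,
    ("left", "right"): 2,
    ("right", "left"): 3,
    ("right", "right"): 4,
}


def hand_of(key: str) -> str:
    """Return which hand strikes the given key."""
    if key in LEFT_HAND_KEYS:
        return "left"
    if key in RIGHT_HAND_KEYS:
        return "right"
    raise ValueError(f"unknown key: {key!r}")


def locate(first_key: str, second_key: str) -> tuple[int, str, str]:
    """Return (block number, row key, column key) for a two-key stroke."""
    block = BLOCK_BY_HANDS[(hand_of(first_key), hand_of(second_key))]
    # Within a block, the first stroke selects the row (ordinate) and the
    # second stroke selects the column (abscissa).
    return block, first_key, second_key


print(locate("a", "w"))  # both strokes by the left hand
print(locate("a", "p"))  # left hand, then right hand
```

read in the other direction, the same table tells the operator which two keys to strike for a kanji found at a given row and column of a given block.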
each key has a byte code and a combination of two key strokes makes a composite, a two-byte binary code, for a kanji. the bit may be changed by shifting, and different kanji can be typed if another table is prepared for kanji with different bits.
fig. 5. kanji table for location correspondence method (block a: left, left; block b: left, right).
association memory method. in this method, each kanji is given two kana which usually represent a reading of that kanji. the operator associates a kanji to be input with two kana assigned to that kanji, and types them with two strokes using the kana keys.
both of the key-stroke methods are economical as well as convenient because of the wide availability of kana typewriters. mainly for that reason, both of these systems have been well accepted and are expected to grow further. since this touch method does not require the operator to look for the character on the keyboard to input, it is the fastest to operate and is considered suitable for input in quantity. it is possible to input 60 to 120 characters per minute. the only drawback is that the operator must get acquainted with the arrangement of kanji in the first variation, and must memorize all the associated kana spelling for many kanji in case of the second variation. in either case, the operator must be professionally trained. the japan information center for science and technology, which indexes many scientific publications, employs a vendor who uses the location correspondence variation of this system for inputting information.
display selection
this also uses a kana typewriter with a screen in front.
when a word is typed in kana, a group of kanji with that sound are displayed on the screen. the operator chooses the right kanji with a light pen, a slow but accurate operation. the operator does not have to be specially trained for this.
kana-kanji conversion
in contrast to the conventional approach of full keyboard inputting, an entirely new method for inputting kanji is gaining popularity as the availability of sophisticated software increases. this uses a kana typewriter keyboard to input japanese in syllabary or romanized form, converting them to kanji by software. there are two ways of conversion: one that converts word by word, and the other sentence by sentence.
stenotype
the stenotype is a typewriter-like device. the operator must be able to take shorthand. when the stenotype is used, it punches words on paper tape. therefore, inputting is high speed. however, the operator must receive proper training.
optical character recognition
this system, developing quickly and expected to gain wider use, can scan a maximum of 2,500 printed kanji (ref. 8). one variation connects a writing tablet to a computer so that as the operator writes kanji on the tablet, the computer scans them in stroke order. this function of scanning by the stroke order is considered to be an advantage for processing some types of japanese documents. the drawbacks are that the system is still very expensive, and the number of recognizable characters is fewer than 2,000.
voice recognition
this is an oral-visual system, in which the human voice is read by a computer. obviously the most difficult to develop, this system is still in an experimental stage. however, a prototype has been demonstrated at various exhibitions, and the system apparently possesses great potential.
summary
pattern configuration and output devices for japanese characters are basically the same as those for english.
however, the pattern generation of characters is mechanically more complicated than that of the roman alphabet, because kanji have a more complicated structure than roman letters and the number of components is greater. each kanji is represented by a two-byte binary code rather than one byte as in the roman alphabet. because of this, the efficiency of retrieval is low. presently, hard copy and typesetting for printing of hard copy are the major output forms, and very little on-line retrieval of information with kanji is in current operation.
problems particular to kanji processing
among numerous problems in processing kanji through computers, major ones are: (1) which kanji are to be included; (2) how many characters are to be handled; (3) what code should be assigned and how it should be arranged on the keyboard or table; and (4) how the kanji not included on the keyboard should be treated.
in the early stage of kanji computer development, different institutions handled the problems in ways best suited to their individual needs, according to the nature of the literature covered, the amount of literature processed, and the kinds of output needed. they experimented with the then best available capabilities. as a result, the finished systems are all independent and mutually incompatible. standardization is obviously necessary for exchange of information among the systems.
in order to set standards for selection of characters and assignment of codes, jis (japanese industrial standard) c6226-1978 has been compiled by the japan association for development of information processing. this is a table of characters designed for information exchange (a portion of which is shown in figure 6). it has a one-byte code as its abscissa and another as its ordinate. characters are arranged so that the intersection of abscissa and ordinate determines a kanji whose code consists of four numerals, two from the abscissa and two from the ordinate.
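the relation between a position in the code table and its code can be sketched in python. this is a minimal illustration under assumptions that are not stated in the article: the table is taken to be 94 by 94 positions numbered from 1, the two coordinates are called `row` and `cell` here, and each byte is taken to be the coordinate plus hexadecimal 20, as the standard's code table is commonly documented. the function names are hypothetical.

```python
def four_numeral_code(row: int, cell: int) -> str:
    """Return the four-numeral code for a table position.

    Assumption: positions run from 1 to 94 on each axis.
    """
    if not (1 <= row <= 94 and 1 <= cell <= 94):
        raise ValueError("row and cell must each be between 1 and 94")
    return f"{row:02d}{cell:02d}"


def two_byte_code(row: int, cell: int) -> bytes:
    """Return the two-byte form of the same position.

    Assumption: each byte is the coordinate offset by 0x20, which keeps
    both bytes in the printable seven-bit range.
    """
    four_numeral_code(row, cell)  # reuse the range check
    return bytes([0x20 + row, 0x20 + cell])


print(four_numeral_code(16, 1))    # 1601
print(two_byte_code(16, 1).hex())  # 3021
```

whichever form is used, the point of the standard is that every institution derives the same code from the same table position, so records can be exchanged without a private translation table.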
included in the table are kana in both styles, roman, greek, and cyrillic alphabets in upper and lower cases, diacritical marks, numerals, and punctuation marks, as follows:
1. special characters 108
2. numerals (arabic) 10
3. roman alphabets 52
4. hiragana 83
5. katakana 86
6. greek alphabets 48
7. cyrillic alphabets 66
8. kanji 6,349
total 6,802 (ref. 9)
in the first section of the table, numerals, alphabets, kana, and special characters are grouped. in the second section, a total of 2,965 frequently used kanji are arranged as the first priority group, and an additional 3,384 kanji are selected as the second group (ref. 10) in the bottom half of the table. kanji are printed in the preferred style for printing typeface. this table will resolve problems 1 to 3 mentioned above. institutions that had arranged their own codes for kanji, including the national institute of japanese literature, are now automatically translating their own codes into jis codes.
fig. 6. code of the japanese graphic character set for information interchange.
in cases where needed kanji are not included on the keyboard, handling varies. with the japanese typewriter, because each kanji is inscribed on a typeface, only the kanji on that typeface is printed when the type bar is stroked. therefore, only kanji that have typefaces can be input in this system, while some other handling is possible in other methods.
while the number of characters that can be accommodated on keyboards is limited to 2,000 to 3,500, depending on the type of equipment, character generators have the capability of outputting more than the number of characters on the keyboard. figure 7 shows their relationship. characters that are in the generator but not on the keyboard must be frequently processed, because the number of characters needed for most documents could reach 6,000 to 6,500.
using a shift key to enter another mode is a fairly common technique for inputting uncommon kanji. the keyboard may not have a character but, if the character generator has it, the code for that character can be input by shifting. for example, if a character on the keyboard has a code 0117, a bit is changed so the code 8117 can be typed by shifting and typing that key. if the code 8117 is assigned to another kanji not on the keyboard but indexed in the dictionary, it can be input. this applies for the kanji teletypewriter, tablet style, and the two-key stroke variations of the kana typewriter.
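the 0117 to 8117 example can be reproduced with a single bit operation. the sketch below assumes that the codes are four hexadecimal digits and that shifting sets the highest bit of the two-byte code; the article's example is consistent with this reading but does not state it. the names `SHIFT_BIT`, `shifted`, and `key_for` are hypothetical.

```python
SHIFT_BIT = 0x8000  # assumed: the highest bit of the two-byte code


def shifted(code: str) -> str:
    """Return the code produced by typing a key while shifting."""
    value = int(code, 16)
    if not 0 <= value <= 0xFFFF:
        raise ValueError("code must fit in two bytes")
    if value & SHIFT_BIT:
        raise ValueError("code is already in the shifted range")
    return f"{value | SHIFT_BIT:04X}"


def key_for(code: str) -> str:
    """Return the unshifted keyboard code whose key produces this code."""
    value = int(code, 16)
    if not 0 <= value <= 0xFFFF:
        raise ValueError("code must fit in two bytes")
    return f"{value & ~SHIFT_BIT & 0xFFFF:04X}"


print(shifted("0117"))  # 8117
print(key_for("8117"))  # 0117
```

the operator still needs a printed dictionary to learn that the wanted kanji lives at 8117, since nothing on the key itself shows the shifted character.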
in the kanji teletypewriter system used by the national diet library, the keyboard accommodates 2,650 characters, while its character generator has the capability for 5,717.
fig. 7. kanji creating capability (labels: keyboard characters; character generator capability; system capability; outside system capability).
operators in the national diet library input kanji that are not on the keyboard by using the component pattern input method. alternatively, if the operator finds the kanji code in the specially compiled dictionary in which codes for kanji are indexed, a shift key is used to change the bit, thus creating the code for kanji not on the keyboard. most other tablet systems use code dictionaries. in the two-key stroke variations of kana typewriters, tables of kanji for second and third or more shifts can be built, especially when the location correspondence method is used.
the handling of kanji that are not in character generators is more difficult. only the digital character generator, the kind that uses either dot or stroke, can add characters fairly easily. in the flying spot system, characters can be added, but it must be done professionally with an additional character cylinder and is very costly. the national diet library, which now uses flying spot, limits addition of kanji to a minimum. because its output is solely in printed book form, the national diet library inputs a fill character for kanji not in the system. when the phototypeset masters are made, the fill characters are replaced by typeset characters. the use of a fill character suffices only when the output is phototypeset, because there is a step to replace fill characters by typeface. however, as long as the data base includes many fill characters on the magnetic tapes, the on-line retrieval of information or later utilization of tapes becomes unsatisfactory.
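why fill characters harm later retrieval can be seen in a small python sketch. it is an illustration only: the article does not say which character the national diet library uses as its fill character, so the geta mark below is an assumed placeholder, and the names `encode_with_fill` and `is_fully_searchable` are hypothetical.

```python
FILL = "\u3013"  # geta mark; an assumed placeholder, not named in the article


def encode_with_fill(text, repertoire):
    """Replace every character outside the system's repertoire with FILL.

    Returns the stored record and a worksheet of (position, character)
    pairs that a typesetter would need to restore the original by hand.
    """
    stored = []
    worksheet = []
    for position, char in enumerate(text):
        if char in repertoire:
            stored.append(char)
        else:
            stored.append(FILL)
            worksheet.append((position, char))
    return "".join(stored), worksheet


def is_fully_searchable(record: str) -> bool:
    """A record containing a fill character cannot be matched reliably."""
    return FILL not in record


record, worksheet = encode_with_fill("abc", set("ac"))
print(record, worksheet, is_fully_searchable(record))
```

the worksheet exists only outside the machine-readable record, so once the printed book is made, the tape alone no longer says which kanji stood at each filled position.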
the national institute of japanese literature uses a dot matrix and prints by wire-dot impact. if a kanji is not in the character generator, the institute's staff composes the kanji in an enlarged dot matrix and creates the capability for printing in the generator. if the kanji made in such a way is used only once, the kanji pattern is not stored in the character generator, so that the generator does not reach its full capacity quickly. the enlarged dot composite for kanji created in the institute is filed and indexed for future use. most other institutions simply do not use those less commonly used kanji, and substitute kana for them.
in addition to the problems common to any character output, such as size and number of dots, the problem of the space for kanji in relation to other characters and the choice of vertical or horizontal printing of japanese sentences with kanji must be considered. kanji have many strokes and, as mentioned before, are expressed by two-byte codes. each kanji needs a double space when displayed on screens or printed. when a kanji is used with numerals or kana, the kanji part looks fine but the numerical part has too much space between each numeral. therefore, input of kanji is done in a kanji mode and input of kana, roman alphabets, and numerals is done in a kana-numerical mode. in this way a multidigit figure looks like one whole figure rather than a line of one-digit figures.
some formal documents must be printed in the traditional vertical arrangement. to cope with this situation, some line printers have the capability to precompose a vertical page before printing it. there are multicolor crts on the market that can be used for the retrieval of library-related information, e.g., main entry in red, series statement in yellow.
one last problem that must be considered is that most of these systems require trained operators, or else the operation is very slow.
the information is edited and compiled by the editors and prepared for input in the form of worksheets. so are the revisions. at various stages of revising the text, the information must be printed, given to the editors, and revised. further developments in simplifying input and revising texts for efficient flow are to be expected.
application of kanji systems
processing of vernacular-language materials in their own writing systems is considered vital for research libraries in this country. in adopting the kanji systems in such libraries, there are three major factors that must be considered: the objectives and needs of the institution, the cost, and the personnel.
first, the institution must know what it must accomplish by means of such a system. the needs may not be the same for all institutions. is the system for retrieving catalog information, or for inputting catalog and other information? is it for internal processing or patron use? is it for a large bibliographic utility to distribute information to its subscribers, or for an individual institution to process its own information? could the system be shared by the department of asian studies in any way? the character set needs of the institution are a major factor in choosing the system. since input and output devices are different, i.e., one cannot input kanji on a crt and retrieve kanji from the same crt, the institution must consider how much it will need to input, or whether it can rely on available data bases. some institutions may not need any input equipment if they utilize available data bases. if japan marc and other tapes are made accessible by a large bibliographic utility in this country, the institutions will be able to obtain bibliographic information in kanji on the screen. if they want only catalog cards or a com catalog, they will not need any equipment except the terminals supported by the utility.
if they want to input, they must consider what form or forms of output they need and how to create the characters not included in the system, in addition to which system to choose.
second, cost is an important factor. is the expense justified in terms of the other needs of the library? what can be accomplished per dollar spent? the kanji systems are still expensive, though the cost will eventually be reduced. how much can be spent and how much continuing support can be expected are factors that modify system expectations. the budget must include not only the one-time hardware cost, but also the software, maintenance, and personnel.
third, the availability of personnel will affect the choice of system. what degree of language expertise does the system require in each stage of operation, such as inputting, maintenance, and programming? does it need terminal operators trained in those languages? what other personnel does the system need as far as language-related qualification is concerned?
apart from the three major factors discussed above, there are some technical aspects that must be adjusted to library situations in this country. since japanese, chinese, and korean use the same chinese ideographs to different degrees and in different ways, libraries considering automated processing of these language materials will probably be expected to handle all three languages by the same system, to say nothing of the other non-roman scripts. problems will arise in selecting characters for inclusion in the system. as pointed out earlier with regard to japanese character processing, there are simply too many characters for the present capacity of any computer. if korean and chinese languages are to be handled by the same computer, this problem multiplies. the korean alphabet, called hangul, would have to be included. chinese has more characters than japanese.
worse yet is the fact that some kanji are simplified in different ways in japan and china, so that they are neither mutually recognizable nor interchangeable. it will be an enormous task to accommodate both in the same system.
another problem is the arrangement and indexing of kanji. if a full keyboard, a japanese typewriter keyboard, or a two-key stroke system, especially its location correspondence method by kana typewriter, is considered for japanese, chinese, and korean, the arrangement of the characters must be indexed and accessed for the three languages, in addition to the multiple readings found in japanese. for example, kanji on the japanese keyboard are usually arranged by the initial sound of the japanese reading of the kanji. this arrangement will be useless for chinese and korean, because japanese readings are not the same as chinese or korean readings. the arrangement of kanji on the keyboards must be based on some new principle common to these languages.
even if the kana-kanji conversion is used, and roman alphabet-kanji conversion software is adopted, software to handle those three languages must be developed. such software would have to be highly sophisticated. the presence of many homonyms in chinese will cause a great problem to the extent that the system relies on transliterated or romanized forms of the language. recognition of the many identical spellings in different language contexts will be extremely difficult.
the above discussion is based on what is currently available in japan. the combination of existing inputting, generating, and outputting equipment developed by japanese technology opens up various possibilities for us to build effective systems in this country.
acknowledgment
this article is based on a study conducted in japan while the author was a japan foundation professional fellow and a visiting research fellow of the center for research on information and library science, university of tokyo.
references
1.
national institute of japanese literature, implementation of a computer system and a kanji handling system at nijl (tokyo: nijl, 1978), p.16.
2. toshio ishiwata, "kanji shori kenkyu ni motomerareru mono" ["requirements for study on kanji processing"] computopia no.9 (1977), p.35.
3. gendai yogo no kiso chishiki, 1980 [basic knowledge on current terms, 1980] (tokyo: jiyukokuminsha, 1980), p.999.
4. figures are taken from the following two sources and compiled by the author: hasegawa, jitsurō. "kanji shori sochi" ["kanji processing devices"] johō shori [information processing] 19, no.4:353 (april 1978). sugai, kazurō. "kanji nyū-shutsuryoku sochi no kaihatsu doko" ["a trend in development of kanji input-output devices"] business communication 16, no.7:41 (1979).
5. used for the pattern input mentioned in the following component pattern input system.
6. national diet library, library automation in the national diet library (tokyo: the library, 1979), p.4.
7. ibid., p.7.
8. asia business consultants is using an optical character recognition system that can scan handwritten kana and numerals in a small scale to input and process catalog information for a library collection.
9. "johō kokan no tame no kanji fugō no hyōjunka" ["standardization of kanji code for information interchange"] kagaku gijutsu bunken sābisu [scientific and technical documents service] no.50 (1978), p.29.
10. ibid., p.28.
ichiko morita is assistant professor in library administration and head, automated processing division, the ohio state university libraries.
editor's notes
most jola readers are aware of significant delays in publication in the last volume. susan k. martin, a former editor of jola, and richard d. johnson, a former editor of college & research libraries, gave freely of their time and energy to bring the journal back on schedule.
mary madden, judith schmidt, and the members of the editorial board under the leadership of charles husbands all worked closely with sue and richard in this effort. this was a second time around for sue, who undertook a similar task when she assumed the jola editorship in 1972. the jola readership and this editor owe debts of gratitude to sue, richard, and all the others who helped.
we do not foresee major changes in the format of the journal as established principally under the editorships of kilgour and martin. we look for increased strength in our book reviews section under the editorship of david weisbrod. the addition of tom harnish as assistant editor for video technologies indicates our recognition of the growing importance of video-based information systems.
we encourage reader suggestions. we welcome brief communications of successes or failures that might be of interest to other readers. letters to the editor about any of our feature articles or communications are solicited.
the next generation library catalog | yang and hofmann 141
sharon q. yang and melissa a. hofmann
the next generation library catalog: a comparative study of the opacs of koha, evergreen, and voyager
open source has been the center of attention in the library world for the past several years. koha and evergreen are the two major open-source integrated library systems (ilss), and they continue to grow in maturity and popularity. the question remains as to how much we have achieved in open-source development toward the next-generation catalog compared to commercial systems. little has been written in the library literature to answer this question. this paper intends to answer this question by comparing the next-generation features of the opacs of two open-source ilss (koha and evergreen) and one proprietary ils (voyager’s webvoyage).
much discussion has occurred lately on the next-generation library catalog, sometimes referred to as the library 2.0 catalog or “the third generation catalog.”1 different and even conflicting expectations exist as to what the next-generation library catalog comprises:
in two sentences, this catalog is not really a catalog at all but more like a tool designed to make it easier for students to learn, teachers to instruct, and scholars to do research. it provides its intended audience with a more effective means for finding and using data and information.2
such expectations, despite their vagueness, eventually took concrete form in 2007.3 among the most prominent features of the next-generation catalog are a simple keyword search box, enhanced browsing possibilities, spelling corrections, relevance ranking, faceted navigation, federated search, user contribution, and enriched content, just to mention a few. over the past three years, libraries, vendors, and open-source communities have intensified their efforts to develop opacs with advanced features. the next-generation catalog is becoming the current catalog.
the library community welcomes open-source integrated library systems (ilss) with open arms, as evidenced by the increasing number of libraries and library consortia that have adopted or are considering open-source options, such as koha, evergreen, and the open library environment project (ole project). librarians see a golden opportunity to add features to a system that will take years for a proprietary vendor to develop. open-source opacs, especially that of koha, seem to be more innovative than their long-established proprietary counterparts, as our investigation shows in this paper. threatened by this phenomenon, ils vendors have rushed to improve their opacs, modeling them after the next-generation catalog. for example, ex libris pushed out its new opac, webvoyage 7.0, in august of 2008 to give its opac a modern touch.
one interesting question remains.
in a competition for a modernized opac, which opac is closest to our visions for the next-generation library catalog: open-source or proprietary? the comparative study described in this article was conducted in the hope of yielding some information on this topic. for libraries facing options between open-source and proprietary systems, “a thorough process of evaluating an integrated library system (ils) today would not be complete without also weighing the open source ils products against their proprietary counterparts.”3
■■ scope and purpose of the study
the purpose of the study is to determine which opac of the three ilss (koha, evergreen, or webvoyage) offers more in terms of services and is more comparable to the next-generation library catalog. the three systems include two open-source ilss and one proprietary ils. koha and evergreen are chosen because they are the two most popular and fully developed open-source ilss in north america. at the time of the study, koha had 936 implementations worldwide; evergreen had 543 library users.4 we chose webvoyage for comparison because it is the opac of the voyager ils by ex libris, the biggest ils vendor in terms of personnel and marketplace.5 it also is one of the more popular ilss in north america, with a customer base of 1,424 libraries, most of which are academic.6 as the sample only includes three ilss, the study is very limited in scope, and the findings cannot be extrapolated to all open-source and proprietary catalogs. but, hopefully, readers will gain some insight into how much progress libraries, vendors, and open-source communities have achieved toward the next-generation catalog.
sharon q. yang (yangs@rider.edu) is systems librarian and melissa a. hofmann (mhofmann@rider.edu) is bibliographic control librarian, rider university.
142 information technology and libraries | september 2010
■■ literature review
a review of the library literature found two relevant studies on the comparison of opacs in recent years. the first study was conducted by two librarians in slovenia investigating how much progress libraries had made toward the next-generation catalog.7 six online catalogs were examined and evaluated, including worldcat, the slovene union catalog cobiss, and those of four public libraries in the united states. the study also compared services provided by the library catalogs in the sample with those offered by amazon. the comparison took place primarily in six areas: search, presentation of results, enriched content, user participation, personalization, and web 2.0 technologies applied in opacs. the authors gave a detailed description of the research results supplemented by tables and snapshots of the catalogs in comparison. the findings indicated that “the progress of library catalogues has really been substantial in the last few years.” specifically, the library catalogues have made “the best progress on the content field and the least in user participation and personalization.” when compared to services offered by amazon, the authors concluded that “none of the six chosen catalogues offers the complete package of examined options that amazon does.”8 in other words, library catalogs in the sample still lacked features compared to amazon.
the other comparative study was conducted by linda riewe, a library school student, in fulfillment of the requirements for her master’s degree from san jose state university. the research described in her thesis is a questionnaire survey targeted at 361 libraries that compares open-source (specifically, koha and evergreen) and proprietary ilss in north america. more than twenty proprietary systems were covered, including horizon, voyager, millennium, polaris, innopac, and unicorn.9 only a small part of her study was related to opacs.
it involved three questions about opacs and asked librarians to evaluate the ease of use of their ils opac’s search engines, their opac search engine’s completeness of features, and their perception of how easy it is for patrons to make self-service requests online for renewals and holds. a scale of 1 to 5 was used (1 = least satisfied; 5 = very satisfied) regarding the three aspects of opacs. the mean and median satisfaction ratings for open-source opacs were higher than those of proprietary ones. koha’s opac received mean ratings of 4.3, 3.9, and 3.9, respectively, the highest in all three categories, while the proprietary opacs received 3.9, 3.6, and 3.6.10 evergreen fell in the middle, still ahead of proprietary opacs. the findings reinforced the perception that open-source catalogs, especially koha, offer more advanced features than proprietary ones. as riewe’s study focused more on the cost of and user satisfaction with ilss, it yielded limited information about the connected opacs.

no comparative research has measured the progress of open-source versus proprietary catalogs toward the next-generation library catalog. therefore the comparison described in this paper is the first of its kind. as only koha, evergreen, and voyager’s opacs are examined in this paper, the results cannot be extrapolated. studies on a larger scale are needed to shed light on the progress librarians have made toward the next-generation catalog.

■■ method

the first step of the study was identifying and defining a set of measurements by which to compare the three opacs. a review of library literature on the next-generation library catalog revealed different and somewhat conflicting points of view as to what the next-generation catalog should be. as marshall breeding put it, “there isn’t one single answer.
we will see a number of approaches, each attacking the problem somewhat differently.”11 this study uses the most commonly held visions, which are summarized well by breeding and by morgan’s lita executive summary.12 the ten parameters identified and used in the comparison were taken primarily from breeding’s introduction to the july/august 2007 issue of library technology reports, “next-generation library catalogs.”13 the ten features reflect some librarians’ visions for a modern catalog. they serve as additions to, rather than replacements of, the feature sets commonly found in legacy catalogs. the following are the definitions of each measurement:
■■ a single point of entry to all library information: “information” refers to all library resources. the next-generation catalog contains not only bibliographical information about printed books, video tapes, and journal titles but also leads to the full text of all electronic databases, digital archives, and any other library resources. it is a federated search engine for one-stop searching. it not only allows for one search leading to a federation of results, it also links to full-text electronic books and journal articles and directs users to printed materials.
■■ state-of-the-art web interface: library catalogs should offer “intuitive interfaces” and “visually appealing sites” that compare well with other internet search engines.14 a library’s opac can be intimidating and complex. to attract users, the next-generation catalog looks and feels similar to google, amazon, and other popular websites. this criterion is highly subjective, however, because some users may find google and amazon anything but intuitive or appealing. the underlying assumption is that some internet search engines are popular, and library catalogs should resemble them to become popular themselves.
■■ enriched content: breeding writes, “legacy catalogs tend to offer text-only displays, drawing only on the marc record.
a next-generation catalog might bring in content from different sources to strengthen the visual appeal and increase the amount of information presented to the user.”15 the enriched content includes images of book covers, cd and movie cases, tables of contents, summaries, reviews, and photos of items that traditionally are not present in legacy catalogs.
the next generation library catalog | yang and hofmann
■■ faceted navigation: faceted navigation allows users to narrow their search results by facets. the types of facets may include subjects, authors, dates, types of materials, locations, series, and more. many discovery tools and federated search engines, such as villanova university’s vufind and innovative interfaces’ encore, have used this technology in searches.16 auto-graphics also applied this feature in their opac, agent iluminar.17
■■ simple keyword search box: the next-generation catalog looks and feels like popular internet search engines. the best example is google’s simple user interface. that means that a simple keyword search box, instead of a controlled vocabulary or specific-field search box, should be presented to the user on the opening page with a link to an advanced search for users in need of more complex searching options.
■■ relevancy: traditional ranking of search results is based on the frequency and positions of terms in bibliographical records during keyword searches. relevancy has not worked well in opacs. in addition, popularity is another factor that has not been taken into consideration in relevancy ranking. for instance, “when ranking results from the library’s book collection, the number of times that an item has been checked out could be considered an indicator of popularity.”18 by the same token, the size and font of tags in a tag cloud or the number of comments users attach to an item may also be considered relevant in ranking search results.
so far, almost no opacs are capable of incorporating circulation statistics into relevancy ranking.
■■ “did you mean . . . ?”: when a search term is not spelled correctly or nothing is found in the opac in a keyword search, the spell checker will kick in and suggest the correct spelling or recommend a term that may match the user’s intended search term. for example, a modern catalog may generate a statement such as “did you mean . . . ?” or “maybe you meant . . . .” this may be a very popular and useful service in modern opacs.
■■ recommendations and related materials: the next-generation catalog is envisioned as promoting reading and learning by making recommendations of additional related materials to patrons. this feature is an imitation of amazon and websites that promote selling by stating “customers who bought this item also bought . . . .” likewise, after a search in the opac, a statement such as “patrons who borrowed this book also borrowed the following books . . .” may appear.
■■ user contribution (ratings, reviews, comments, and tagging): legacy catalogs only allow catalogers to add content. in the next-generation catalog, users can be active contributors to the content of the opac. they can rate, write reviews, tag, and comment on items. user contribution is an important indicator of use and can be used in relevancy ranking.
■■ rss feeds: the next-generation catalog is dynamic because it delivers lists of new acquisitions and search updates to users through rss feeds. modern catalogs are service-oriented; they do more than provide a simple display of search results.

the second step is to apply these ten visions to the opacs of koha, evergreen, and webvoyage to determine if they are present or absent. the opacs used in this study included three examples from each system. they were product demos and live catalogs randomly chosen from the user lists on the product websites.
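the relevancy definition above suggests that circulation counts could feed into ranking. as a minimal sketch of that idea, and not a description of how koha, evergreen, or webvoyage rank results, the following python example compares a plain term-frequency score with one that adds a damped bonus for checkouts. the record fields, the sample catalog, and the 0.5 weight are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Record:
    title: str
    text: str        # searchable text of the bibliographic record
    checkouts: int   # hypothetical circulation count for the item


def term_score(record: Record, query: str) -> float:
    """Traditional score: how often the query terms occur in the record."""
    words = record.text.lower().split()
    return float(sum(words.count(term) for term in query.lower().split()))


def popularity_score(record: Record, query: str, weight: float = 0.5) -> float:
    """Term score plus a damped bonus for circulation (assumed weighting)."""
    base = term_score(record, query)
    if base == 0:
        return 0.0  # popularity never promotes a record that does not match
    return base + weight * math.log1p(record.checkouts)


def rank(records, query, scorer):
    """Return matching records, best first, under the given scoring function."""
    hits = [r for r in records if scorer(r, query) > 0]
    return sorted(hits, key=lambda r: scorer(r, query), reverse=True)


# Made-up sample data, not taken from any catalog in the study.
catalog = [
    Record("unix handbook", "unix unix system administration handbook", 2),
    Record("learning unix", "learning the unix operating system", 250),
    Record("cooking basics", "introduction to cooking", 900),
]

by_terms = [r.title for r in rank(catalog, "unix", term_score)]
by_popularity = [r.title for r in rank(catalog, "unix", popularity_score)]
print(by_terms)       # term frequency alone favors "unix handbook"
print(by_popularity)  # the checkout bonus moves "learning unix" to the top
```

the logarithm keeps a heavily borrowed item from overwhelming textual relevance, and a record that does not match the query is never returned however popular it is. both choices are design assumptions of this sketch.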
the latest releases at the time of the study were koha 3.0, evergreen 2.0, and webvoyage 7.1. in case of discrepancies between product descriptions and reality, we gave precedence to reality over claims. in other words, even if the product documentation lists and describes a feature, this study does not include it if the feature is not in action either in the demo or live catalogs. despite the fact that a planned future release of one of those investigated opacs may add a feature, this study only recorded what existed at the time of the comparison. the following are the opacs examined in this paper.

koha
■■ koha demo for academic libraries: http://academic.demo.kohalibrary.com/
■■ wagner college: http://wagner.waldo.kohalibrary.com/
■■ clearwater christian college: http://ccc.kohalibrary.com/

evergreen
■■ evergreen demo: http://demo.gapines.org/opac/en-us/skin/default/xml/index.xml
■■ georgia pines: http://gapines.org/opac/en-us/skin/default/xml/index.xml
■■ columbia bible college: http://columbiabc.evergreencatalog.com/opac/en-ca/skin/default/xml/index.xml

webvoyage
■■ rider university libraries: http://voyager.rider.edu
■■ renton college library: http://renton.library.ctc.edu/vwebv/searchbasic
■■ shoreline college library: http://shoreline.library.ctc.edu/vwebv/searchbasic

the final step includes data collection and compilation. a discussion of findings follows. the study draws conclusions about which opac is more advanced and has more features of the next-generation library catalog.

■■ findings

each of the opacs of koha, evergreen, and webvoyage is examined for the presence of the ten features of the next-generation catalog.

single point of entry for all library information

none of the opacs of the three ilss provides true federated searching.
to varying degrees, each is limited in access, showing an absence of contents from electronic databases, digital archives, and other sources that generally are not located in the legacy catalog. of the three, koha is the most advanced. while webvoyage and evergreen only display journal-holdings information in their opacs, koha links journal titles from its catalog to proquest’s serials solutions, thus leading users to full-text journals in the electronic databases. the example in figure 1 (koha demo) shows the journal title unix update with an active link to the full-text journal in the availability field. the link takes patrons to serials solutions, where full text at the journal-title level is listed for each database (see figure 2). each link takes the user to the full text in the corresponding database.

state-of-the-art web interface

as beauty is in the eye of the beholder, the interface of a catalog can be appealing to one user but off-putting to another. with this limitation in mind, the out-of-the-box user interface at the demo sites was considered for each opac. all three catalogs have a google-like simplicity in presentation. all of the user interfaces are highly customizable. it largely depends on the library to make the user interface appealing and welcoming to users. figures 3–5 show snapshots from each ils’s demo site, none of which has been customized.

however, there are a few differences in the “state of the art.” for one, koha’s navigation between screens relies solely on the browser’s forward and back buttons, while webvoyage and evergreen have internal navigation buttons that more efficiently take the user between title lists, headings lists, and record displays, and between records in a result set. while all three opacs offer an advanced search page with multiple boxes for entering search terms, only webvoyage makes the relationship between the terms in different boxes clear.
by the use of a drop-down box, it makes explicit that the search terms are by default anded and also allows for the selection of or and not. in koha’s and evergreen’s advanced search, however, the terms are anded only, a fact that is not at all obvious to the user. in the demo opacs examined, there is no option to choose or or not between rows, nor is there any indication that the search is anded. the point of providing multiple search boxes is to guide users in constructing a boolean search without their having to worry about operators and syntax. in koha, however, users have to type an or or not statement themselves within the text box, thus defeating the purpose of having multiple boxes. while evergreen allows for a not construction within a row (“does not contain”), it does not provide an option for or (“contains” and “matches exactly” are the other two options available). see figures 6–8. thus koha’s and evergreen’s advanced search is less than intuitive for users and certainly less functional than webvoyage’s.

figure 1. link to full-text journals in serials solutions in koha
figure 2. links to serials solutions from koha

enriched content

to varying degrees, enriched content is present in all three catalogs, with koha providing the most. while all three catalogs have book covers and movie-container art, koha has much more in its catalog. for instance, it displays tags, descriptions, comments, and amazon reviews. webvoyage displays links to google books for book reviews and content summaries but does not have tags, descriptions, and comments in the catalog. see figures 9–11.

faceted navigation

the koha opac is the only catalog of the three to offer faceted navigation. the “refine your search” feature allows users to narrow search results by availability, places, libraries, authors, topics, and series.
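as a minimal sketch of what narrowing by facets involves, and not koha’s actual implementation, the following python example counts the values of one facet across a result set and then filters the set by the selected facet values. the records, field names, and function names are hypothetical.

```python
from collections import Counter

# Hypothetical records standing in for a result set; the field names mirror
# some of the facets named in the text (authors, topics, libraries).
RESULTS = [
    {"title": "unix update", "author": "lee", "topic": "unix", "library": "main"},
    {"title": "unix in a nutshell", "author": "lee", "topic": "unix", "library": "branch"},
    {"title": "linux basics", "author": "kim", "topic": "linux", "library": "main"},
    {"title": "shell scripting", "author": "kim", "topic": "unix", "library": "main"},
]


def facet_counts(records, field):
    """Count how many records carry each value of one facet field."""
    return Counter(record[field] for record in records)


def refine(records, selections):
    """Keep only the records that match every selected facet value."""
    return [
        record
        for record in records
        if all(record[field] == value for field, value in selections.items())
    ]


# Facet values that could be offered alongside the full result set.
topic_facet = facet_counts(RESULTS, "topic")
print(topic_facet)

# Selecting one facet value narrows the set; a second selection narrows it again.
step_one = refine(RESULTS, {"topic": "unix"})
step_two = refine(RESULTS, {"topic": "unix", "library": "main"})
print([r["title"] for r in step_one])
print([r["title"] for r in step_two])
```

the point of the sketch is that each selection is applied on top of the earlier ones, so the result list can only shrink.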
clicking on a term within a facet adds that term to the search query and generates a narrower list of results. the user may then choose another facet to further refine the search. while evergreen appears to have faceted navigation upon first glance, it actually does not possess this feature. the following facets appear after a search generates hits: “relevant subjects,” “relevant authors,” and “relevant series.” but choosing a term within a facet does not narrow down the previous search. instead, it generates an entirely new search with the selected term; it does not add the new term to the previous query. users must manually combine the terms in the simple search box or through the advanced search page. webvoyage also does not offer faceted navigation; it only provides an option to “filter your search” by format, language, and date when a set of results is returned. see figures 12–14.

figure 6. voyager advanced search
figure 7. koha advanced search
figure 8. evergreen advanced search
figure 9. koha enriched content
figure 10. evergreen enriched content
figure 11. voyager enriched content
figure 12. koha faceted navigation
figure 13. evergreen faceted navigation
figure 14. voyager faceted navigation

keyword searching

koha, evergreen, and webvoyage all present a simple keyword search box with a link to the advanced search (see figures 3–5).

figure 3. koha: state-of-the-art user interface
figure 4. evergreen: state-of-the-art user interface
figure 5. voyager: state-of-the-art user interface

relevancy

none of koha, evergreen, or webvoyage provides any evidence of meeting the criteria of the next-generation catalog’s more inclusive vision of relevancy ranking, such as accounting for an item’s popularity or allowing user tags. koha uses index data’s zebra program for its relevance ranking, which “reads structured records in a variety of input formats . . . and allows access to them through exact boolean search expressions and relevance-ranked free-text queries.”19 evergreen’s dokuwiki states that the base relevancy score is determined by the cover density of the searched terms. after this base score is determined, items may receive score bumps based on word order, matching on the first word, and exact matches depending on the type of search performed.20 these statements do not indicate that either koha or evergreen goes beyond the traditional relevancy-ranking methods of legacy systems, such as webvoyage.

did you mean . . . ?

only evergreen has a true “did you mean . . . ?” feature. when no hits are returned, evergreen provides a suggested alternate spelling (“maybe you meant . . . ?”) as well as a suggested additional search (“you may also like to try these related searches . . .”). koha has a spell-check feature, but it automatically normalizes the search term and does not give the option of choosing a different one. this is not the same as a “did you mean . . . ?” feature as defined above. while the normalizing process may be seamless, it takes the power of choice away from the user and may be problematic if a particular alternative spelling or misspelling is searched purposefully, such as “womyn.” (when “womyn” is searched as a keyword in the koha demo opac, 16,230 hits are returned. this catalog does not appear to contain the term as spelled, which is why it is normalized to “women.” the fact that the term does not appear as is may not be transparent to the searcher.) with normalization, the user may also be unaware that any mistake in spelling has occurred, and the number of hits may differ between the correct spelling and the normalized spelling, potentially affecting discovery. the normalization feature also only works with particular combinations of misspellings, where letter order affects whether a match is found. otherwise the system returns a “no result found!” message with no suggestions offered. (try “homoexuality” vs. “homoexsuality.” in koha’s demo opac, the former, with a missing “s,” yields 553 hits, while the latter, with a misplaced “s,” yields none.) however, koha is a step ahead of webvoyage, which has no built-in spell checker at all. if a search fails, the system returns the message “search resulted in no hits.” see figures 15–17.

figure 15. evergreen: did you mean . . . ?
figure 16. koha: did you mean . . . ?
figure 17. voyager: did you mean . . . ?

recommendations/related materials

none of the three online catalogs can recommend materials for users.

user contributions

koha is the only system of the three that allows users to add tags, comments, descriptions, and reviews. in koha’s opac, user-added tags form tag clouds, and the font and size of each keyword or tag indicate that keyword or tag’s frequency of use. all the tags in a tag cloud serve as hyperlinks to library materials. users can write their own reviews to complement the amazon reviews. all user-added reviews, descriptions, and comments have to be approved by a librarian before they are finalized for display in the opac. nevertheless, the user contribution in the koha opac is not easy to use. it may take many clicks before a user can figure out how to add or edit text. it requires user login, and the system cannot keep track of the search hits after a login takes place. therefore the user contribution features of koha need improvement. see figure 18.

figure 18. koha user contributions

rss feeds

koha provides rss feeds, while evergreen and webvoyage do not.

■■ conclusion

table 1 is a summary of the comparisons in this paper. these comparisons show that the koha opac has six out of the ten compared features for the next-generation catalog, plus two halves. its full-fledged features include state-of-the-art web interface, enriched content, faceted navigation, a simple keyword search box, user contribution, and rss feeds. the two halves indicate the existence of a feature that is not fully developed. for instance, “did you mean . . . ?” in koha does not work the way the next-generation catalog is envisioned. in addition, koha has the capability of linking journal titles to full text via serials solutions, while the other two opacs only display holdings information. evergreen falls into second place, providing four out of the ten compared features: state-of-the-art interface, enriched content, a keyword search box, and “did you mean . . . ?” webvoyage, the voyager opac from ex libris, comes in third, providing only three out of the ten features for the next-generation catalog.

based on the evidence, koha’s opac is more advanced and innovative than evergreen’s or voyager’s. among the three catalogs, the open-source opacs compare more favorably to the ideal next-generation catalog than the proprietary opac. however, none of them is capable of federated searching. only koha offers faceted navigation. webvoyage does not even provide a spell checker. the ils opac still has a long way to go toward the next-generation catalog. though this study samples only three catalogs, hopefully the findings will provide a glimpse of the current state of open-source versus proprietary catalogs.

ils opacs are not comparable in features and functions to stand-alone opacs, also referred to as “discovery tools” or “layers.” some discovery tools, such as ex libris’ primo, also are federated search engines and are modeled after the next-generation catalog. recently they have become increasingly popular because they are bolder and more innovative than ils opacs.
two of the best stand-alone open-source opacs are villanova university’s vufind and oregon state university’s libraryfind.21 both boast eight out of ten features of the next-generation catalog.22 technically it is easier to develop a new stand-alone opac with all the next-generation catalog features than to mend old ils opacs. as more and more libraries are disappointed with their ils opacs, more discovery tools will be implemented. vendors will stop improving ils opacs and concentrate on developing better discovery tools. the fact that ils opacs are falling behind current trends may eventually bear no significance for libraries, at least for the ones that can afford the purchase or implementation of a more sophisticated discovery tool or stand-alone opac. certainly small and public libraries that cannot afford a discovery tool or a programmer for an open-source opac overlay will suffer, unless market conditions change.

references

1. tanja merčun and maja žumer, “new generation of catalogues for the new generation of users: a comparison of six library catalogues,” program: electronic library & information systems 42, no. 3 (july 2008): 243–61.
2. eric lease morgan, “a ‘next-generation’ library catalog: executive summary (part #1 of 5),” online posting, july 7, 2006, lita blog: library information technology association, http://litablog.org/2006/07/07/a-next-generation-library-catalog-executive-summary-part-1-of-5/ (accessed nov. 10, 2008).
3. marshall breeding, introduction to “next generation library catalogs,” library technology reports 43, no. 4 (july/aug. 2007): 5–14.
4. ibid.
5. marshall breeding, “library technology guides: key resources in the field of library automation,” http://www.librarytechnology.org/lwc-search-advanced.pl (accessed jan. 23, 2010).
6. marshall breeding, “investing in the future: automation marketplace 2009,” library journal (apr. 1, 2009), http://www.libraryjournal.com/article/ca6645868.html (accessed jan. 23, 2010).
7. marshall breeding, “library technology guides: company directory,” http://www.librarytechnology.org/exlibris.pl?sid=20100123734344482&code=vend (accessed jan. 23, 2010).
8. merčun and žumer, “new generation of catalogues.”
9. ibid.
10. linda riewe, “integrated library system (ils) survey: open source vs. proprietary-tables” (master’s thesis, san jose university, 2008): 2–5, http://users.sfo.com/~lmr/ils-survey/tables-all.pdf (accessed nov. 4, 2008).
11. ibid., 26–27.
12. breeding, introduction.
13. ibid.; morgan, “a ‘next-generation’ library catalog.”
14. breeding, introduction.
15. ibid.
16. ibid.
17. villanova university, “vufind,” http://vufind.org/ (accessed june 10, 2010); innovative interfaces, “encore,” http://encoreforlibraries.com/ (accessed june 10, 2010).
18. auto-graphics, “agent iluminar,” http://www4.auto-graphics.com/solutions/agentiluminar/agentiluminar.htm (accessed june 10, 2010).
19. breeding, introduction; morgan, “a ‘next-generation’ library catalog.”
20. index data, “zebra,” http://www.indexdata.dk/zebra/ (accessed jan. 3, 2009).
21. evergreen dokuwiki, “search relevancy ranking,” http://open-ils.org/dokuwiki/doku.php?id=scratchpad:opac_demo&s=core (accessed dec. 19, 2008).
22. villanova university, “vufind”; oregon state university, “libraryfind,” http://libraryfind.org/ (accessed june 10, 2010).
23. sharon q. yang and kurt wagner, “open source stand-alone opacs” (microsoft powerpoint presentation, 2010 virtual academic library environment annual conference, piscataway, new jersey, jan. 8, 2010).

table 1. summary
(yes = feature present; partial = feature present but not fully developed; no = feature absent)

features of the next generation catalog | koha | evergreen | voyager
single point of entry for all library information | partial | no | no
state-of-the-art web interface | yes | yes | yes
enriched content | yes | yes | yes
faceted navigation | yes | no | no
keyword search | yes | yes | yes
relevancy | no | no | no
did you mean . . . ? | partial | yes | no
recommended/related materials | no | no | no
user contribution | yes | no | no
rss feed | yes | no | no
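the totals reported in the conclusion can be checked against table 1. as a small sketch, the following python example encodes the table with “yes,” “partial,” and “no” standing in for the check, half, and cross marks of the printed table, and counts each mark per system. the data structure and function name are illustrative assumptions; the marks themselves come from table 1.

```python
SYSTEMS = ("koha", "evergreen", "voyager")

# One row per feature in table 1; marks are listed in the order of SYSTEMS.
TABLE_1 = {
    "single point of entry for all library information": ("partial", "no", "no"),
    "state-of-the-art web interface": ("yes", "yes", "yes"),
    "enriched content": ("yes", "yes", "yes"),
    "faceted navigation": ("yes", "no", "no"),
    "keyword search": ("yes", "yes", "yes"),
    "relevancy": ("no", "no", "no"),
    "did you mean": ("partial", "yes", "no"),
    "recommended/related materials": ("no", "no", "no"),
    "user contribution": ("yes", "no", "no"),
    "rss feed": ("yes", "no", "no"),
}


def tally(table):
    """Count the yes, partial, and no marks for each system."""
    totals = {name: {"yes": 0, "partial": 0, "no": 0} for name in SYSTEMS}
    for marks in table.values():
        for name, mark in zip(SYSTEMS, marks):
            totals[name][mark] += 1
    return totals


totals = tally(TABLE_1)
for name in SYSTEMS:
    counts = totals[name]
    print(f"{name}: {counts['yes']} full, {counts['partial']} partial, {counts['no']} absent")
```

the counts agree with the conclusion: six full features plus two halves for koha, four for evergreen, and three for voyager.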
index to advertisers
lita cover 3, cover 4
yalsa cover 2
communications
฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ 
฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ 
฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ 
user-centered design of a web site | manzari and trinidad-christensen 163

this study describes the life cycle of a library web site created with a user-centered design process to serve a graduate school of library and information science (lis). findings based on a heuristic evaluation and usability study were applied in an iterative redesign of the site to better serve the needs of this special academic library population. recommendations for design of web-based services for library patrons from lis programs are discussed, as well as implications for web sites for special libraries within larger academic library settings.

user-centered design principles were applied to the creation of a web site for the library and information science (lis) library at the c. w. post campus of long island university. this web site was designed for use by master’s degree and doctoral students in the palmer school of library and information science. the prototype was subjected to a usability study consisting of a heuristic evaluation and usability testing. the results were employed in an iterative redesign of the web site to better accommodate users’ needs. this was the first usability study of a web site at the c. w. post library.

human-computer interaction, the study of the interaction of human performance with computers, imposes a rigorous methodology on the process of user-interface design. more than an intuitive determination of user-friendliness, a successful interactive product is developed by careful design, testing, and redesign based on the testing outcomes.
testing the product several times as it is being developed, or iterative testing, allows the users’ needs to be incorporated into the design. the interface should be designed for a specific community of users and set of tasks to be accomplished, with the goal of creating a consistent, usable product.

the lis library had a web site that was simply a description of the collection and did not provide access to online specialized resources. a new web site was designed for the lis library by the incoming lis librarian who made a determination of what content might be useful for lis students and faculty. the goal was to have such content readily accessible in a web site separate from the main library web site. the web site for the lis library includes:

- access to all online databases and journals related to lis;
- a general overview of the lis library and its resources as well as contact information, hours, and staff;
- a list of all print and online lis library journal subscriptions, grouped by both title and subject, with links to access the online journals;
- links to other web sites in the lis field;
- links to other university web pages, including the main library’s home page, library catalog, and instructions for remote database access, as well as to the lis school web site;
- a link to jake (jointly administered knowledge environment), a project by yale university that allows users to search for periodical titles within online databases, since the library did not have this type of access through its own software.

this information was arranged in four top-level pages with sublevels. design considerations included making the site both easy to learn and efficient once users were familiar with it. since classes are taught at four locations in the metropolitan area, the site needed to be flexible enough to serve students at the c. w. post campus library as well as remotely. the layout of the information was designed to make the web site uncluttered and attractive.
several color schemes were tried, and users were informally polled about them. a version with white text on black background prompted strong likes or dislikes when shown to users. although this combination is easy to read, it was rejected because of the strong negative reactions from several users. photographs of the lis library and students were included. the pages were designed with a menu on the left side; fly-out menus were used to access submenus. where main library pages already existed for information to be included in the lis web site, such as lis hours and staff, links to those pages were made instead of re-creating the information in the lis web site. an attempt was made to render the site accessible to users with disabilities, and pages were made compliant with world wide web consortium (w3c) standards by checking them with the w3c html validator and cascading style sheet validator.1

user-centered design of a web site for library and information science students: heuristic evaluation and usability testing
laura manzari and jeremiah trinidad-christensen

laura manzari (manzari@liu.edu) is an associate professor and library and information science librarian at the c. w. post campus of long island university, brookville, n.y. jeremiah trinidad-christensen (jt2118@columbia.edu) is a gis/map librarian at columbia university, new york, n.y.

164 information technology and libraries | september 2006

literature review

usability is a term with many definitions, varying by field.2 the fields of industrial engineering, product research and development, computer systems, and library science all share the study of human-and-machine interaction, as well as a commitment to users. dumas and redish explain it simply: “usability means that the people who use the product can do so quickly and easily to accomplish their own tasks.”3 user-centered design incorporates usability principles into product design and places the focus on the user during project development.
gould and lewis cite three principles of user-centered design: an early focus on users and tasks, empirical measurement of product usage, and iterative design to include user input into product design and modification.4 jakob nielsen, an often-cited usability engineering specialist, emphasizes that for increased functionality, engineering usability principles should apply to web design, which should be treated as a software development project. he advocates incorporating user evaluation into the design process first through a heuristic evaluation, followed by usability testing with a redesign of the product after each phase of evaluation.5 usability principles have been applied to library web-site design; however, library web-site usability studies often do not include the additional heuristic evaluation recommended by nielsen.6 in addition to usability, consideration should also be given during the design process to making the web site accessible to people with disabilities. federal agencies are now required by the rehabilitation act to make their web sites accessible to the disabled. section 508 part 1194.22 of the act enumerates sixteen rules for internet applications to help ensure web-site access for people with various disabilities.7 similarly, the web accessibility initiative hosted by the w3c works to ensure that accessibility practices are considered in web-site design. 
they developed the web content accessibility guidelines for making web sites accessible to people with disabilities.8 although articles have been written about usability testing of academic library web sites, very little has been written about usability testing of special-collection web sites for distinct user populations within larger academic settings.9

heuristic evaluation methodology

heuristic evaluation is a usability engineering method in which a small set of expert evaluators examine a user interface for design problems by judging its compliance with a set of recognized usability principles or heuristics. nielsen developed a set of ten widely adopted usability heuristics (see sidebar). after studying the use of individual evaluators as well as groups of varying sizes, nielsen and molich recommend using three to five evaluators for a heuristic evaluation.10 the use of multiple experts will catch more flaws than a single expert, but using more than five experts does not produce greater results. in comparisons of heuristic evaluation and usability testing, the heuristic evaluation uncovered more of the minor problems while usability testing uncovered more major, global problems.11 since each method tends to uncover different usability problems, it is recommended that both methods be used complementarily, particularly with an iterative design change between the heuristic evaluation and the usability testing.

for the heuristic evaluation, four people were approached from the palmer lis school faculty and ph.d. program with expertise in web-site design and human-computer interaction. three agreed to participate. they were asked to familiarize themselves with the web site and evaluate it according to nielsen’s ten heuristics, which were provided to them.

heuristic evaluation results

the evaluators were all in agreement that the language was appropriate for lis students.
one evaluator said if new students were not familiar with some of the terms they soon would be. another thought jake, the tool to access full text, might not be clear to students at first, but the lis web-site explanation was fine the way it was. they were also in agreement that the web site was well designed. comments included: “the purpose and description of each page is short and to the point, and there is a good, clean, viewable page for the users”; “the site was well designed and not over designed”; “very clear and user friendly”; “excellent example of limiting unnecessary irrelevant information.” the only page to receive a “poor layout” comment was the lengthy subject list of journals, though no suggestions for improvement were made. concern was expressed about links to other web sites on campus. one evaluator thought new students might be confused about the relationship between long island university, c. w. post, and the palmer school. two evaluators thought links to the main library’s web site could cause confusion because of the different design and layout. a preference for the design of the lis library web site over the main library and palmer school web sites was expressed. to eliminate some confusion, the menu options for other campus web sites were dropped down to a separate menu right below the menu of lis web pages. for additional clarity, some of the main library pages were re-created in the style of the lis pages instead of linking to the original page. the evaluators made several concrete suggestions for menu changes, which were included in the redesign. it was suggested that several menu options were unclear and needed clarification, so additional text was added for clarity at the expense of brevity. long island university’s online catalog is named liucat and was listed that way on the menu. new students might not be familiar with this name, so the menu label was changed to liucat (library catalog). 
for the link to jake, a description, “find periodicals in online databases,” was added for clarification. it was also suggested that the link to the main library web page for all databases could cause confusion since the layout and design of that page is different. the wording was changed to “all databases (located in the c. w. post library web site).”

menu options were originally arranged in order of anticipated use (see figure 1). thus, the order of menu options from the lis home page was databases, journals, library catalog, other web sites, palmer school, and main library. evaluators suggested that putting the option for lis home page first would give users an easy “emergency exit” to return to the home page if they were lost. the original menu options also varied from page to page. for example, menu options on the database page referred only to pages that users might need while doing database searches. at the suggestion of evaluators, the menu options were changed to be consistent on every page (see figure 2). a redesign based on these results was completed and posted to the internet for public use (see figure 3).

figure 1. original menu
figure 2. revised menu

jakob nielsen’s usability heuristics

- visibility of system status: the system should always keep users informed about what is going on, through appropriate feedback within reasonable time.
- match between system and the real world: the system should speak the user’s language, with words, phrases, and concepts familiar to the user rather than system-oriented terms. follow real-world conventions, making information appear in a natural and logical order.
- user control and freedom: users often choose system functions by mistake and will need a clearly marked “emergency exit” to leave the unwanted state without having to go through an extended dialogue. support undo and redo.
- consistency and standards: users should not have to wonder whether different words, situations, or actions mean the same thing. follow platform conventions.
- error prevention: even better than good error messages is a careful design that prevents problems from occurring in the first place.
- recognition rather than recall: make objects, actions, and options visible. the user should not have to remember information from one part of the dialogue to another. instructions for use of the system should be visible or easily retrievable whenever appropriate.
- flexibility and efficiency of use: accelerators, unseen by the novice user, may often speed up the interaction for the expert user such that the system can cater to both inexperienced and experienced users. allow users to tailor frequent actions.
- aesthetic and minimalist design: dialogues should not contain information that is irrelevant or rarely needed. every extra unit of information in a dialogue competes with the relevant units of information and diminishes their relative visibility.
- help users recognize, diagnose, and recover from errors: error messages should be expressed in plain language (no codes), precisely indicate the problems, and constructively suggest a solution.
- help and documentation: even though it is better if the system can be used without documentation, it may be necessary to provide help and documentation. any such information should be easy to search, focused on the user’s task, list concrete steps to be carried out, and not be too large.12

usability testing methodology

usability testing is an empirical method for improving design. test subjects are gathered from the population who will use the product and are asked to perform real tasks using the prototype while their performance and reactions to the product are observed and recorded by an interviewer. this observation and recording of behavior distinguishes usability testing from focus groups. observation allows the tester to see when and where users become frustrated or confused. the goal is to uncover usability problems with the product, not to test the participants themselves. the data gathered are then analyzed to recommend changes to fix usability problems. in addition to recording empirical data such as number of errors made or time taken to complete tasks, active intervention allows the interviewer to question participants about reasons for their actions as well as about their opinions regarding the product. in fact, subjects are asked to verbalize their thought processes as they complete the tasks using the interface. test subjects are usually interviewed individually and are all given the same pretest briefing from a script with a list of instructions followed by tasks representing actual use. test subjects are also asked questions about their likes and dislikes. in most situations, payment or other incentives are offered to help recruit subjects. four or five subjects will reveal 80 percent of usability problems.13

messages were sent to students via the palmer school’s mailing lists requesting volunteers. a ten-dollar gift certificate to a bookstore was offered as an inducement to recruitment. input was desired from both master’s degree and doctoral students. the first nine volunteers to respond, all master’s degree students, were accepted. this group included students from both the main and satellite campuses. no ph.d. students volunteered to participate at first, citing busy schedules, but eventually a doctoral student was recruited. testing was conducted in computer labs at the library, at the palmer school, and at the manhattan satellite campus.
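the rule of thumb that four or five subjects reveal about 80 percent of usability problems can be illustrated with a simple problem-discovery model. the sketch below assumes the commonly cited form 1 - (1 - L)^n, usually attributed to nielsen and landauer, where n is the number of testers and L is the chance that one tester exposes a given problem. the function name and the default rate of 0.31 are assumptions made for illustration; they were not measured in this study, and real projects vary widely.

```python
def share_of_problems_found(n_testers: int, discovery_rate: float = 0.31) -> float:
    """Expected share of usability problems exposed by n independent testers.

    Uses the model 1 - (1 - L)**n. The default L = 0.31 is an assumed
    average per-tester discovery rate, not a value from this study.
    """
    if n_testers < 0:
        raise ValueError("n_testers must be non-negative")
    if not 0.0 <= discovery_rate <= 1.0:
        raise ValueError("discovery_rate must be between 0 and 1")
    # Probability that a problem is missed by every tester is (1 - L)**n.
    return 1.0 - (1.0 - discovery_rate) ** n_testers


if __name__ == "__main__":
    # Show how quickly the expected coverage flattens as testers are added.
    for n in (1, 3, 4, 5, 10):
        share = share_of_problems_found(n)
        print(f"{n:2d} testers: about {share:.0%} of problems expected")
```

under these assumptions, four testers are expected to expose roughly 77 percent of problems and five roughly 84 percent, which is consistent with the rule of thumb; the curve flattens quickly after that, so each additional tester adds less.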
demographic information was gathered regarding users’ gender, age range, university status, familiarity with computers, with the internet, and with the lis library, as well as the type of internet connection and browser usually used. the subjects were given eight tasks to complete using the web site. the tasks reflected both the type of assignment a student might receive in class and the type of information they might seek on the lis web site on their own. the questions were designed to test usability of different parts of the web site.

usability testing results

the first task tested the print journals page and asked if the lis library subscribes to a specific journal and whether it is refereed. (the web site uses an asterisk next to a journal title to indicate that it is refereed.) all subjects were able to easily find that the lis library does hold the journal title. although it was not initially obvious that the asterisk was a notation indicating that the journal was refereed, most of the subjects eventually found the explanatory note. many of the subjects did not know what a refereed journal was, and some asked if a definition could be provided on the site.

for the second task, subjects needed to use jake to find the full text of an article. none of the students were familiar with jake, but all were able to use the lis web site to gain an understanding of its purpose and to access it.

the third task asked subjects to find a library association, which required using the other web sites page. all subjects demonstrated an understanding of how to use this page and found the information.

the fourth task tested the full-text databases page. only one subject actually used this page to complete the task. the rest used the all databases link to the main library’s database list. that link appears above the link to full-text databases and most subjects chose that link without looking at the next menu option. several subjects became confused when they were taken to the main library’s page, just as the evaluators had predicted. even though wording was added warning users that they were leaving the lis web site, most subjects did not read it and wondered why the page layout changed and was not as clear. they also had trouble navigating back to the lis web site from the main library web site.

figure 3. final home page

the fifth task tested the journals by subject page. this task took longer for most of the subjects to answer, but all were able to use the page successfully to find a journal on a given subject. the sixth task required using the lis home page, and everyone easily used it to find the operating hours. the seventh task required subjects to find an online journal title that could be accessed from the electronic journals page. all subjects navigated this page easily.

the final task asked subjects to find a book review. most subjects did not look at the page for library and information sciences databases to access the books in print database, saying they did not think it would be included there. instead, they used the link to the main library’s database page. one subject was not able to complete this task.

problems primarily occurred during testing when subjects left the lis page to use a non-library science database located on the main web site. subjects had problems getting back to the lis site from the main library site. while performing tasks, some subjects would scroll up and down long lists instead of using the toolbars provided to bring the user to an exact location on the page. some preferred using the back button instead of using the lis web-site menu to navigate. these seemed to be individual styles of using the web and not any usability problem with the site.
several people consistently used the menu to return to the lis home page before starting each new task, even though they could have navigated directly to the page they needed, making a return to the home page unnecessary. this validated the recommendation from the heuristic study that the link to the home page always be the first menu option to give users a comfortable safety valve when they get lost. the final questions asked subjects for their opinions on what they did and did not like about the web site, as well as any suggestions for improving the site. all subjects responded that they liked the layout of the pages, calling them uncluttered, clean, attractive, and logical. there were very few suggestions for improving the site. one person asked that contact information be included on the menu options in addition to its location right below the menu on the lis home page. another participant suggested adding class syllabi to the web site each semester, listing required texts along with a link to an online bookstore. some of the novice users asked for explanations of unfamiliar terms such as “refereed journals.” a participant suggested including a search engine instead of using links to navigate the site. this was considered during the initial site design but was not included since the site did not have a large number of pages. however, a search engine may be worth including. the one doctoral student had previously only used the main library’s web page to access databases. originally, he said he did not see the advantage of a site devoted to information science sources for doctoral candidates, since that program is more multidisciplinary. however, after completing the usability study, the student concluded that the lis web site was useful. he suggested that it should be publicized more to doctoral candidates and that it be more prominently highlighted on the main library web site. 
though the questions asked were about the lis web site, several subjects complained about the layout of the main library web site and suggested that it have better linking to the lis web site to enable it to be accessed more easily.

conclusions

iterative testing and user-centered design resulted in a product that testing revealed to be easy to learn and efficient to use, and about which subjects expressed satisfaction. based on findings that some students had not even been aware of the existence of the lis web site, greater emphasis is now given to the web site and its features during new student orientations.

the biggest problem users had was navigating from the web pages of the main library back to the lis site. it was suggested that the lis site be highlighted more prominently on the main library web site. some users were confused by the different layouts between the sites, but no one expressed a preference for the design used by the main library web site. despite this confusion, subjects overwhelmingly expressed positive feedback about having a specialized library site serving their specific needs.

issues regarding web-site design can be problematic for smaller specialized libraries within larger institutions. in this case, some of the problems navigating between the sites could be resolved by changes to the main library site. the design of the lis web site was preferred over the main campus web site by both the heuristic evaluators and the students in the usability test. however, designers of a main library web site might not be receptive to suggestions from a specialized or branch library. although consistency in design would eliminate confusion, requiring the special collection’s web site to follow a design set by the main institution could be a loss for users. in this instance, the main site was designed without user input, whereas the specialized library serving a smaller population was able to be more dynamic and responsive to its users.
finding an appropriate balance for a site used by students new to the field as well as advanced students is 168 information technology and libraries | september 2006 a challenge. although the students in the study were all experienced computer and web users, their familiarity with basic library concepts varied greatly. a few novice users expressed some confusion as to the difference between journals and index databases. there actually was a description of each of these sources on the site but it was not read. (the subjects barely read any of the site’s text, so it can be difficult to make some points clearer when users want to navigate quickly without reading instructions. several subjects who did not bother to read text on the site still suggested having more notes to explain unfamiliar terms. however, if the site becomes too overloaded with explanations of library concepts, it could become annoying for more advanced users.) a separate page with a glossary is a possibility—based on the study, however, it will probably not be read. another possibility is a handout for students that could have more text for new users without cluttering the web site. having such a handout would also serve to publicize the site. there was some concern prior to the study that offering more advanced features, such as providing access to jake or indicating which journals are refereed, might be off-putting for new students; therefore, test questions were designed to gauge reactions to these features. most students in the study did express some intimidation at not being familiar with these concepts. however, all the subjects eventually figured out how to use jake and, once they tried it, thought it was a good idea to include it. even new students who had the most difficulty were still able to navigate and learn from the site to be able to use it efficiently. an online survey was added to the final design to allow continuous user input. 
the site consistently receives positive feedback through these surveys. it was planned that responses could be used to continually assess the site and ensure that it is kept responsive and up-to-date; however specific suggestions have not yet been forthcoming. how valuable was usability testing to the web-site design? several good suggestions were made and implemented, and the process confirmed that the site was well designed. it provided some insight into how subjects used the web site that had not been anticipated by the designers. since usability studies are fairly easy and inexpensive to conduct, it is probably a step worth taking during the web-site design process even if it results in only minor changes to the design. references and notes 1. w3c, “the w3c markup validation service,” validator .w3.org (accessed nov. 1, 2005); w3c, “the w3c css validation service,” jigsaw.w3.org/css-validator (accessed nov. 1, 2005). 2. see carol m. barnum, usability testing and research (new york: longman international, 2002); alison j. head, “web redemption and the promise of usability,” online 23, no. 6 (1999): 20–29; international standards organization, ergonomic requirements for office work with visual display terminals. part 11: guidance on usability—iso 9241-11 (geneva: international organization for standardization, 1998); judy jeng, “what is usability in the context of the digital library and how can it be measured?” information technology and libraries 24, no. 2 (2005): 47–52; jakob nielsen, usability engineering (boston: academic, 1993); ruth ann palmquist, “an overview of usability for the study of users’ web-based information retrieval behavior,” journal of education for library and information science 42, no. 2 (2001): 123–36. 3. joseph s. dumas and janice c. redish, a practical guide to usability testing (portland: intellect bks., 1999), 4. 4. john d. gould and clayton h. 
lewis, “designing for usability: key principles and what designers think,” communications of the acm 28, no. 3 (1985): 300–11. 5. jakob nielsen, “heuristic evaluation,” in jakob nielsen and robert l. mack, eds., usability inspection methods (new york: wiley, 1994), 25–62. 6. see denise t. covey, usage and usability assessment: library practices and concerns (washington, d.c.: digital library federation, 2002); nicole campbell, usability assessment of library-related web sites (chicago: ala, 2001); kristen l. garlock and sherry piontek, designing web interfaces to library services and resources (chicago: ala, 1999); anna noakes schulze, “user-centered design for information professionals,” journal of education for library and information science 42, no. 2 (2001): 116–22; susan m. thompson, “remote observation strategies for usability testing,” information technology and libraries 22, no. 3 (2003): 22–32. 7. government services administration, “section 508: section 508 standards,” www.section508.gov/index.cfm?fuseaction=content&id=12#web (accessed nov. 1, 2005). 8. w3c, “web content accessibility guidelines 2.0,” www.w3.org/tr/wcag20 (accessed nov. 1, 2005). 9. see susan augustine and courtney greene, “discovering how students search a library web site: a usability case study,” college and research libraries 63, no. 4 (2002): 354–65; brenda battleson, austin booth, and jane weintrop, “usability testing of an academic library web site: a case study,” journal of academic librarianship 27, no. 3 (2001): 188–98; janice krueger, ron l. ray, and lorrie knight, “applying web usability techniques to assess student awareness of library web resources,” journal of academic librarianship 30, no. 4 (2004): 285–93; thura mack et al., “designing for experts: how scholars approach an academic library web site,” information technology and libraries 23, no. 1 (2004): 16–22; mark shelstad, “content matters: analysis of a web site redesign,” oclc systems & services 21, no.
3 (2005): 209–25; robert l. tolliver et al., “web site redesign and testing with a usability consultant: lessons learned,” oclc systems & services 21, no. 3 (2005): 156–67; dominique turnbow et al., “usability testing for web redesign: a ucla case study,” oclc systems & services 21, no. 3 (2005): 226–34; leanne m. vandecreek, “usability analysis of northern illinois user-centered design of a web site | manzari and trinidad-christensen 169 university libraries’ web site: a case study,” oclc systems & services 21, no. 3 (2005): 181–92. 10. jakob nielsen and rolf molich, “heuristic evaluation of user interfaces,” in proceedings of the acm chi ’90 (new york: association for computing machinery, 1990), 249–56. 11. robin jeffries et al., “user interface evaluation in the real world: a comparison of a few techniques,” in proceedings of the acm chi ’91 (new york: association for computing machinery, 1991), 119–24; jakob nielsen, “finding usability problems through heuristic evaluation,” in proceedings of the acm chi ’92 (new york: association for computing machinery, 1992), 373–86. 12. jakob nielsen, “heuristic evaluation,” 25–62. 13. jeffrey rubin, handbook of usability testing: how to plan, design, and conduct effective tests (new york: wiley, 1994); jakob nielsen, “why you only need to test with five users, alertbox mar. 19, 2000,” www.useit.com/alertbox/20000319.html (accessed nov. 1, 2005). letter from the editor: september 2021 letter from the editor september 2021 kenneth j. varnum information technology and libraries | september 2021 https://doi.org/10.6017/ital.v40i3.13859 in the editorial section of this issue, we have two columns to share. the september editorial board thoughts essay is by paul swanson, “building a culture of resilience in libraries,” reflecting on the lessons of covid-driven flexibility and suggests that a culture of resilience in our libraries will help us to more easily adapt to these, and emerging, changes we will inevitably encounter. 
that is followed by carole williams’ public libraries leading the way column, “delivering: automated materials handling for staff and patrons,” in which she discusses the effects of an automated materials handling system on both the staff and patrons of the charleston county (sc) public library. in peer-reviewed content, we have a diverse set of articles on range of topics: bias mitigation in metadata; accessibility of pdf documents; two articles on automated classification of different kinds of texts; two articles with lessons learned due to our abrupt move to remote service; and a case study on the importance of product ownership. 1. mitigating bias in metadata: a use case using homosaurus linked data / juliet hardesty and allison nolan 2. accessibility of tables in pdf documents: issues, challenges and future directions / nosheen fayyaz, shah khusro, and shakir ullah 3. text analysis and visualization research on the hetu dangse during the qing dynasty of china / zhiyu wang, jingyu wu, guang yu, and zhiping song 4. topic modeling as a tool for analyzing library chat transcripts / hyunseung koh and mark fienup 5. expanding and improving our library’s virtual chat service: discovering best practices when demand increases / parker fruehan and diana hellyar 6. a rapid implementation of a reserve reading list solution in response to the covid-19 pandemic / matthew black and susan powelson 7. product ownership of a legacy institutional repository: a case study on revitalizing an aging service / mikala narlock and don brower kenneth j. 
varnum, editor varnum@umich.edu september 2021 https://ejournals.bc.edu/index.php/ital/article/view/13781 https://ejournals.bc.edu/index.php/ital/article/view/13697 https://ejournals.bc.edu/index.php/ital/article/view/13053 https://ejournals.bc.edu/index.php/ital/article/view/12325 https://ejournals.bc.edu/index.php/ital/article/view/13279 https://ejournals.bc.edu/index.php/ital/article/view/13333 https://ejournals.bc.edu/index.php/ital/article/view/13117 https://ejournals.bc.edu/index.php/ital/article/view/13209 https://ejournals.bc.edu/index.php/ital/article/view/13241
2 information technology and libraries | march 2010
michelle frisque (mfrisque@northwestern.edu) is lita president 2009–10 and head, information systems, northwestern university, chicago.
michelle frisque president’s message: join us at the forum!
the first lita national forum i attended was in milwaukee, wisconsin. it seems like it was only a couple of years ago, but in fact nine national forums have since passed. i was a new librarian, and i went on a lark when a colleague invited me to attend and let me crash in her room for free. i am so glad i took her up on the offer because it was one of the best conferences i have ever attended. it was the first conference that i felt was made up of people like me, people who shared my interests in technology within the library. the programming was a good mix of practical know-how and mindblowing possibilities.
my understanding of what was possible was greatly expanded, and i came home excited and ready to try out the new things i had learned. almost eight years passed before i attended my next forum in cincinnati, ohio. after half a day i wondered why i had waited so long. the program was diverse, covering a wide range of topics. i remember being depressed and outraged about the current state of internet access in the united states as reported by the office for information technology policy. i felt that surge of recognition when i discovered that other universities were having a difficult time documenting and tracking the various systems they run and maintain. i was inspired by david lankes’s talk, “obligations of leadership.” if you missed it you can still hear it online. it is linked from the lita blog (http://www.litablog.org). while the next forum may seem like a long way off to you, it is in the forefront of my mind. the national forum 2010 planning committee is busy working to make sure this forum lives up to the reputation of forums past. this year’s forum takes place in atlanta, georgia, september 30–october 3. the theme is “the cloud and the crowd.” program proposals are due february 19, so i cannot give you specifics about the concurrent sessions, but we do hope to have presentations about projects, plans, or discoveries in areas of library-related technology involving emerging cloud technologies; software-as-service, as well as social technologies of various kinds; using virtualized or cloud resources for storage or computing in libraries; library-specific open-source software (oss) and other oss “in” libraries; technology on a budget; using crowdsourcing and user groups for supporting technology projects; and training via the crowd. each accepted program is scheduled to maximize the impact for each attendee. programming ranges from five-minute lightning talks to full day preconferences.
in addition, on the basis of attendee comments from previous forums, we have also decided to offer thirty- and seventy-five-minute concurrent sessions. these concurrent sessions will be a mix of traditional single- or multispeaker formats, panel discussions, case studies, and demonstrations of projects. finally, poster sessions will also be available. while programs such as the keynote speakers, lightning talks, and concurrent sessions are an important part of the forum experience, so is the opportunity to network with other attendees. i know i have learned just as much talking with a group of people in the hall between sessions, during lunch, or at the networking dinners as i have sitting in the programs. not only is it a great opportunity to catch up with old friends, you will also have the opportunity to make new ones. for instance, at the 2009 national forum in salt lake city, utah, approximately half of the people who attended were first-time attendees. the national forum is an intimate event whose attendance ranges between 250 and 400 people, thus making it easy to forge personal connections. attendees come from a variety of settings, including academic, public, and special libraries; library-related organizations; and vendors. if you want to meet the attendees in a more formal setting you can attend a networking dinner organized on-site by lita members. this year the dinners were organized by the lita president, lita past president, lita president-elect, and a lita director-at-large. if you have not attended a national forum or it has been a while, i hope i have piqued your interest in coming to the next national forum in atlanta. registration will open in may! the most up-to-date information about the 2010 forum is available at the lita website (http://www.lita.org). i know that even after my lita presidency is a distant memory, i will still make time to attend the lita national forum. i hope to see you there!
87 book reviews
automation in libraries, by r. t.
kimber. oxford, pergamon press, 1968. 140 pp. $6.00. many books have been published in recent years on the subject of library automation. very few of them, however, have succeeded in making meaningful contributions to a better understanding of the subject. this volume has made a sincere effort to be one of the few. although library automation is an ambiguous term which lacks precise definition, it is used here clearly to mean the use of computers in libraries. the book is intended for those with no computer background but who are familiar with library operations. it attempts to give a good introduction to current practices in library automation and a fairly detailed account of the state of the art. in the first chapter, "libraries and automation," mr. kimber discusses the relationship between the library and the computer. seeing the computer as a means of performing human clerical functions, he points out two important attitudes that must be observed: first, one must not change to a computer system just for the sake of changing, and second, one must be willing to change if the change means improvement. the monetary worth of the computer in the library is difficult to express because the end result is not increased profit but better service. since benefits from computer operations can be expressed in time and effort saved, these are the means of monetary comparison the author suggests. he also observes that although there are many good reasons for wanting computerized operations, some of these are merely emotional. chapter ii, "introduction to computers," is written by anne h. boyd, lecturer in computation at queen's university of belfast. miss boyd gives a brief review of the development and use of computers and discusses the fundamentals of computer systems. the next four chapters by mr.
kimber present computerized systems for various library activities: chapter iii, "ordering and acquisitions," chapter iv, "circulation control," chapter v, "periodicals listing and accessioning," chapter vi, "catalogues and bibliographies." each chapter with the minimum of technical terminology gives a good account of what is involved in automating a particular operation. his treatment is very informative on these matters. in his final chapter (chapter vii, "the present state of automation in libraries") kimber discusses current trends of library automation and gives examples of libraries which use computers. his list is admittedly not comprehensive, but it does provide a comparison to the "ideal" systems he has described in the earlier chapters. in commenting on the future of computerized library systems, he sees these systems as an escape from the problems of everyday library operations.
88 journal of library automation vol. 3/1 march, 1970
this book should be a good addition to the current books on library automation. one unfortunate aspect, however, appears to be an absence of treatment regarding the psychological impact of automation on librarians and users which is certainly one important aspect to be considered when automation of a system is proposed. also, at times the author, in attempting to simplify his discussion, has made a generalized statement without fuller explanation. this could be misleading and tend to confuse the uninitiated reader. these deficiencies are not of major consequence and do not prejudice the total work, but care should be taken in reading. sul h. lee
1968 international directory of research and development scientists, philadelphia: institute for scientific information, inc., 1969, 1352 pages (approx.). $60.00.
the second issue of the "international directory of research and development scientists" (idr&ds) lists the names and organizational addresses of 152,648 authors whose papers were listed in either
4o implications of marc, and the library of congress systems studies. (this paper includes twenty-eight pages of appendices, mostly charts.) two additional papers include a discussion of the future of, and a tabulation of trends affecting, library automation. much of the material in these non-survey papers is reported more completely elsewhere and some of it now seems dated. the material presented in this publication must have produced a highly effective educational institute in 1967. in 1969, its value is at best as a first reader in library automation but not as the state-of-the-art review the title proclaims. charles t. payne
computers and data processing: information sources, by chester morrill, jr. an annotated guide to the literature, associations, and institutions concerned with input, throughput, and output of data. detroit: gale research co., [1969]. 275 pp. $8.75. (management information guide, 15) this latest volume in the management information guide series should prove as useful as its predecessors, offering to those persons interested in or concerned with computers and data processing (and who now is not?) an organized and extensive survey of the basic and necessary source of available information. thus the text is for the most part an annotated bibliography of pertinent references arranged in broad categories, each category prefaced with a paragraph or two of comment. this is in the style of mr. morrill's earlier contribution to the series, systems and procedures including office management, 1967 and, in general, that of all the volumes of the series.
section 7 "operating" is the largest category, some forty pages of references subdivided into "manuals," "digital computers," "data transmission," "fortran," "software" and the like. section 9, entitled "front office references," is of particular interest to the reference librarian, since it serves as a guide to desirable dictionaries, handbooks and abstracting services in the fields of automation and data processing. individual annotations are usually brief, informative and on occasion evaluative. they give evidence of considerable skill in the art of capsule characterization. the prefatory paragraphs and notes to each section characterize the particular topic as successfully and succinctly as do the individual annotations. the preface to section 3, "personnel," is particularly felicitous. coverage is ample not only as to the subjects chosen but also as to numbers of references under individual subjects. an important thirty pages of appendices lists additional sources of information associations, manufacturers, seminars, publishers, placement firms, etc.-particularly valuable to the business man or government official as a desk or front-office reference book, although the librarian will also find it of value in providing specific information for his clientele. in all, this is a highly competent and very welcome addition to the series as well as to the ranks of special reference sources so necessary to the proper practice of the reference librarian's art. i think of crane's a guide to the literature of chemistry and white's sources of information in the social sciences and consider the author quite comfortable in their company as well as in that of his colleagues in the series. in addition, he evinces in his annotations and prefaces a wit, a turn of phrase and a capacity for direct statement that inform and delight the user. he displays an expertise in the fields of management and computer science, and one feels one can rely on his selection and judgment. 
eleanor r. devlin
centralized book processing: a feasibility study based on colorado academic libraries by lawrence e. leonard, joan m. maier and richard m. dougherty. metuchen, n.j.: scarecrow press, 1969. 401 pp. $10.00. in october 1966 the national science foundation awarded a grant to the university of colorado libraries and the colorado council of librarians for research in the area of centralized processing. the project was in three phases. phase i involved an examination of the feasibility of establishing a book-processing center to serve the needs of the nine state-supported college and university libraries in colorado (which range in size from the university of colorado, with 805,959 volumes as of june 30, 1967, to metropolitan state college, a new institution with 8,310 volumes). phase ii involved a simulation study of the proposed center, while phase iii involved an operational book-processing center on a one-year experimental basis. this book summarizes the results of the first two phases of the study. phase i involved a detailed time-and-cost analysis of the acquisition, cataloging, and bookkeeping procedures in the nine participating libraries, with resultant processing costs per volume which are both convincing and somewhat startling, ranging as they do from $2.67 to $7.71 per volume. the operating specifications of the proposed book-processing center are then set forth and a mathematical model for simulating its operations under a variety of alternative conditions is prepared. the conclusions are less than surprising: "a centralized book processing center to serve the needs of the academic libraries in colorado is a viable approach to book processing." project benefits are enumerated, in the areas of cost savings, time-lag reductions, and the more efficient utilization of personnel.
unfortunately, while many of the conclusions are buttressed by a dazzling array of tables and mathematical formulas (how can most librarians really argue with a regression analysis correlation coefficient matrix?), some of the most important savings cited are based on simple guesses, in some cases very simple guesses. to mention just two examples: 1) we are told that "a discount advantage expected through the use of combined ordering and a larger volume of ordering is conservatively estimated at 5% ... " (perhaps, but what is this based on?) 2) in the area of time lag reduction, "the greatest savings in time will accrue when the center is able to purchase materials from a vendor who has built up his book stock to reflect the needs of academic institutions. up to now, vendors have been unwilling to do this because there is insufficient profit motive." would nine libraries combining together change this profit picture? it is unfortunate that this report could not have waited on phase iii, the completion of the one-year trial of the operational center which was to have been ready in august 1969, so that we could see just how the predictions for the center worked out in practice. as it stands, however, the book is a valuable study in library systems analysis and design, and its identification and quantification of the various technical processing activities can yield real benefits to librarians everywhere, be they ever so decentralized. norman dudley
a guide to a selection of computer-based science and technology reference services in the u.s.a., american library association, chicago, illinois, 1969, 29 pages. $1.50. this guide is an attempt to bring together those reference publications which are also available in machine readable form. as a "selection" it is limited to eighteen sources from government, professional and private organizations.
the guide is the result of a survey undertaken in 1968 by the science and technology reference services committee of the american library association reference services division. the committee was composed of elsie bergland, john mcgowan, william page, joseph paulukonis, margaret simonds, george caldwell, robert krupp and richard snyder. each entry is broken down into three units: 1) the characteristics of the data base, 2) the equipment configuration and 3) the use of the file. subject headings under characteristics of the data base include subject matter, literature surveyed, types of material covered, etc. the equipment configuration section describes computer model, core, operating systems, and programming language. the use of the file section covers potential uses of the data base by the producer and the subscriber. unfortunately for publications of this sort, they become out of date rather quickly. the continuing series, the directory of computerized information in science and technology, is updated periodically and is a very useful reference tool in this field. gerry d. guthrie
orthographic error patterns of author names in catalog searches
renata tagliacozzo, manfred kochen, and lawrence rosenberg: mental health research institute, the university of michigan, ann arbor, michigan
an investigation of error patterns in author names based on data from a survey of library catalog searches. position of spelling errors was noted and related to length of name. probability of a name having a spelling error was found to increase with length of name. nearly half of the spelling mistakes were replacement errors; following, in order of decreasing frequency, were omission, addition, and transposition errors.
computer-based catalog searching may fail if a searcher provides an author or title which does not match with the required exactitude the corresponding computer-stored catalog entry (1). in designing computer aids to catalog searching, it is important to build in safety features that decrease sensitivity to minor errors.
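the four error types named in the abstract (replacement, omission, addition, transposition) can be made concrete with a short sketch. the python function below is only an illustration under stated assumptions, not the authors' procedure: the name classify_error and its return labels are hypothetical, and it recognizes only names that differ from the catalog form by exactly one such error, reporting anything else as "multiple".

```python
def classify_error(correct: str, typed: str) -> str:
    """Label the difference between a catalog name and a searcher's spelling.

    Hypothetical helper for illustration. Returns one of: "none",
    "replacement", "omission", "addition", "transposition", "multiple".
    """
    if correct == typed:
        return "none"

    len_c, len_t = len(correct), len(typed)

    if len_c == len_t:
        # Same length: one changed letter, or two adjacent letters swapped.
        diffs = [i for i in range(len_c) if correct[i] != typed[i]]
        if len(diffs) == 1:
            return "replacement"
        if (
            len(diffs) == 2
            and diffs[1] == diffs[0] + 1
            and correct[diffs[0]] == typed[diffs[1]]
            and correct[diffs[1]] == typed[diffs[0]]
        ):
            return "transposition"
        return "multiple"

    if len_c == len_t + 1:
        # Searcher dropped one letter of the correct name.
        for i in range(len_c):
            if correct[:i] + correct[i + 1:] == typed:
                return "omission"
        return "multiple"

    if len_t == len_c + 1:
        # Searcher added one letter that is not in the correct name.
        for i in range(len_t):
            if typed[:i] + typed[i + 1:] == correct:
                return "addition"
        return "multiple"

    return "multiple"


if __name__ == "__main__":
    for typed in ["kochen", "kocken", "kochn", "kochenn", "kohcen", "cohan"]:
        print(typed, "->", classify_error("kochen", typed))
```

a tally of these labels over a set of observed misspellings would give the kind of frequency ordering the abstract reports, although the real survey data may well have been coded by different rules.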
for example, compression coding techniques may be used to minimize the effects of spelling errors on retrieval (2, 3, 4). preliminary to the design of good protection devices, the application of error-correction coding theory (5, 6, 7) and data on error patterns in actual catalog searches (8, 9) may be helpful. a recent survey of catalog use at three university libraries yielded some data of the above-mentioned kind (10). the aim of this paper is to present and analyze those results of the survey which bear on questions of error control in searching a computer-stored catalog. in the survey, users were interviewed at random as they approached the catalog. of the 2167 users interviewed, 1489 were searching the catalog for a particular item ("known-item searches"). of these, 67.9% first entered the catalog with an author's or editor's name, 26.2% with a title, and 5.9% with a subject heading. approximately half the searchers had a written citation, while half relied on memory for the relevant in
editorial board thoughts bradford lee eden musings on the demise of paper
we have been hearing the dire predictions about the end of paper and the book since microfiche was hailed as the savior of libraries decades ago. now it seems that technology may be finally catching up with the hype. with the amazon kindle and the sony reader beginning to sell in the marketplace despite the cost (about $360 for the kindle), it appears that a whole new group of electronic alternatives to the print book will soon be available for users next year. amazon reports that e-book sales quadrupled in 2008 from the previous year. this has many technology firms salivating and hoping that the consumer market is ready to move to digital reading as quickly and profitably as the move to digital music. some of these new devices and technologies are featured in the march 3, 2009, fortune article by michael v.
copeland titled “the end of paper?”1 part of the problem with current readers is their challenges for advertising. because the screen is so small, there isn’t any room to insert ads (i.e., revenue) around the margins of the text. but new readers such as plastic logic, polymer vision, and firstpaper will have larger screens, stronger image resolution, and automatic wireless updates, with color screens and video capabilities just over the horizon. still, working out a business model for newspapers and magazines is the real challenge. and how much will readers pay for content? with everything “free” over the internet, consumers have become accustomed to information readily available for no immediate cost. so how much to charge and how to make money selling content? the plastic logic reader weighs less than a pound, is one-eighth of an inch thick, and resembles an 8½ x 11 inch sheet of paper or a clipboard. it will appear in the marketplace next year, using plastic transistors powered by a lithium battery. while not flexible, it is a very durable and break-resistant device. other e-readers will use flexible display technology that allows one to fold up the screen and place the device into a pocket. much of this technology is fueled by e-ink, a start-up company that is behind the success of the kindle and the reader. they are exploring the use of color and video, but both have problems in terms of reading experience and battery wear. in the long run, however, these issues will be resolved. expense is the main concern: just how much are users willing to pay to read something in digital rather than analog? amazon has been hugely successful with the kindle, selling more than 500,000 for just under $400 in 2007. and with the drop in subscriptions for analog magazines and newspapers, advertisers are becoming nervous about their futures. or will the “pay by the article” model, like that used for digital music sales, become the norm? 
so what should or do these developments mean for libraries? it means that we should probably be exploring the purchase of some of these products when they appear and offering them (with some content) for checkout to our patrons. many of us did something similar when it became apparent that laptops were wanted and needed by students for their use. many of us still offer this service today, even though many campuses now require students to purchase them anyway. offering cutting-edge technology with content related to the transmission and packaging of information is one way for our clientele to see libraries as more than just print materials and a social space. and libraries shouldn’t pay full price (or any price) for these new toys; companies that develop these products are dying to find free research and development focus groups that will assist them in versioning and upgrading their products for the marketplace. what better avenue than college students?
related to this is the recent announcement by the university of michigan that their university press will now be a digital operation to be run as part of the library.2 decreased university and library budgets have meant that university presses have not been able to sell enough of their monographs to maintain viable business models. the move of a university press to a successful scholarly communication and open-source publishing entity like the university of michigan libraries means that the press will be able to survive, and it also indicates that the newer model of academic libraries as university publishers will have a prototypical example to point out to their university’s administration. in the long run, these types of partnerships are essential if academic libraries are to survive their own budget cuts in the future.
references
1. michael v. copeland, “the end of paper?” cnnmoney.com, mar. 3, 2009, http://money.cnn.com/2009/03/03/technology/copeland_epaper.fortune/ (accessed june 22, 2009).
2.
andrew albanese, “university of michigan press merged with library, with new emphasis on digital monographs,” libraryjournal.com, mar. 26, 2009, http://www.libraryjournal.com/article/ca6647076.html (accessed june 22, 2009).
bradford lee eden (eden@library.ucsb.edu) is associate university librarian for technical services and scholarly communication, university of california, santa barbara.

communication
a tale of two tools: comparing libkey discovery to quicklinks in primo ve
jill k. locascio and dejah rubel
information technology and libraries | june 2023 https://doi.org/10.6017/ital.v42i2.16253
jill k. locascio (jlocascio@sunyopt.edu) is associate librarian, suny college of optometry. dejah rubel (dejahrubel@ferris.edu) is metadata and electronic resources management librarian, ferris state university. © 2023.
introduction
consistent delivery of full-text content has been a challenge for libraries since the development of online databases. library systems have attempted to meet this challenge, but link resolvers and early direct linking tools often fell short of patron expectations. in the last several years, a new generation of direct linking tools has appeared, two of which will be discussed in this article: third iron’s libkey discovery and quicklinks by ex libris, a clarivate company. figure 1 shows the “download pdf” link added by libkey. figure 2 shows the “get pdf” link provided by quicklinks. the way we configured our discovery interface, a resource cannot receive both the libkey and quicklinks pdf links. these two direct linking tools were chosen because they were both relatively new to the market in april 2021 when this analysis took place and they can both be integrated into primo ve, the library discovery system of choice at the authors’ home institutions of suny college of optometry and ferris state university.
through analysis of the frequency of direct links, link success rate, and number of clicks, this study may help determine which product is most likely to meet your patrons’ needs.
figure 1. example of a libkey discovery link in primo ve.
figure 2. example of a quicklink in primo ve.
literature review
over the past 20 years link resolvers and direct linking have evolved in tandem. early link generator tools, such as proquest’s sitebuilder, often involved a process that “… proved too cumbersome for most end-users.”1 five years later, tools from ebsco, gale, ovid, and proquest had improved, but they were all proprietary. bickford postulates that metadata-based standards, like openurl, may make linking as simple as copying and pasting from the address bar; however, they may be more likely to fail “… as long as vendors use incompatible, inaccurate, or incomplete metadata.”2 the first research was wakimoto’s 2006 study of sfx, which relied on 224 test queries and 188,944 individual uses for its data set.3 of those queries, 39.7% of search results included a full-text link and that link was accessed 65.2% of the time. unfortunately, wakimoto also discovered that 22.2% of all full-text results failed and concluded that most complaints against sfx were problems with the systems it links to and not the link resolver itself. although intended to be provider-neutral, the openurl standard is, in fact, vulnerable to metadata omissions. content providers, whether aggregators or publishers, have a vested interest in link stability and platform use and have therefore invested in building direct link generation tools.
in 2006, grogg examined ebsco’s smartlink, which checks access rights before generating the link; proquest’s crosslinks, which was used to link from proquest to another vendor’s content; silverplatter and links@ovid, which relied on a knowledge base in the terabytes for static links.4 in 2008, cecchino described the national library of medicine’s linkout tool for selected publishers within pubmed.5 they also described two ovid products: links@ovid and linksolver, noting that the former is similar to linkout and the latter is similar to sfx. most of the time these tools worked well, but their use was restricted to a particular platform or set of publishers. as online public catalogs became discovery layers, direct linking became a feature of the library management system. two studies have been done thus far: silton’s analysis of summon and stuart’s analysis of 360 link. in 2014, silton tested the percentage of full-text articles retrievable from summon by running a test query and examining the first 100 results. over a year, the total success rate for unfiltered queries rose from 61% to 76%. after direct linking was introduced, the success rate of link resolver links rose to 65.8–73% and direct links succeeded 90.48–100% of the time. silton concluded, “while direct linking had some issues in its early months, it generally performs better than the link resolver.”6 in 2011, stuart, varnum, and ahronheim began testing the 1-click feature of 360 link on 579 citations, 82.2% of which were successful. after direct linking became an option for summon in 2012, 61–70% of their sample relied on it. “between direct linking and 1-click about 93 to 94% of the time an attempt was made to lead users directly to the full text of the article … [and] … we were able to reach full text … from 79% to about 84% of the time.”7 direct linking outperformed 1-click with a 90% success rate compared to 58–67% for 1-click. 
stuart also compared the actual error rate with one based on user reports and discovered that “relying solely on user reports of errors to judge the reliability of full-text links dramatically underreports true problems by a factor of 100.”8 openurl links were especially alarming with approximately 20% of them failing. although direct linking is more reliable, stuart closes by noting that direct linking binds libraries closer to vendors thereby decreasing their institutional flexibility.
methods
the goal of this project was to assess two of the latest direct linking tools: ex libris’s native quicklinks feature and third iron’s libkey discovery. we performed a side-by-side comparison of the two tools by searching for specific articles in primo ve, the library discovery system used by the authors’ respective home institutions, suny college of optometry and ferris state university, and measuring
• how often each vendor’s direct links appeared on the brief record;
• success rate of the links; and
• number of clicks it takes from each link to reach the pdf full text.
both suny college of optometry and ferris state university use ex libris’ alma as their library services platform. alma provides a number of usage reports in their analytics module. we sourced the queries used in our analysis from the alma analytics link resolver usage report. the report contains a field, number of requests, which records the number of times an openurl request was sent to the link resolver. an openurl request is sent to the link resolver when the user clicks on a link to the link resolver from an outside source (such as google scholar), when the user submits a request using primo’s citation linker, or when the user accesses the article’s full record in primo by clicking on either the brief record’s title or availability statement.
this means that results that have a direct link (whether a quicklink or libkey discovery link) on the brief record will not appear in the report if the user clicked the direct link to the article. thus, in order to create test searches that would be an accurate representation of articles being accessed, we used article titles taken from suny optometry’s october 2019 alma link resolver usage report— a report that was generated prior to the implementation of both libkey discovery and quicklinks. the report was filtered to include only articles with the source type of primo/primo central to ensure that the initial search was taking place within the native primo interface, as requests from outside sources like google scholar or from primo’s citation linker are irrelevant to this analysis. this filtering generated a total of 412 articles. after further removal of duplicates and non -article material, there were 386 article titles in our test query set. we created two separate primo views as test environments: one with libkey discovery and the other with quicklinks. we ran the test searches twice in each view. in the first round of testing, we recorded whether a direct link was present. we also recorded the name of the full-text provider (if present), as well as whether the article was open access. suny optometry does not filter their primo results by availability; therefore, many of the articles included in the initial search did not have any associated full-text activations. since these articles are irrelevant to our assessment, we removed them before analyzing the first round of data and proceeding with the second search. the exception to these removals were articles identified as open access by unpaywall, as the presence of unpaywall links is independent of any activations in alma. furthermore, third iron’s libkey discovery and ex libris’ quicklinks both incorporate unpaywall’s api into their products to provide direct links to pdfs of open access articles. 
this functionality helps fill coverage gaps where institutions may not have activated a hybrid open access journal due to its paywalls. therefore, we are including the presence of direct links resulting from the unpaywall api when determining whether a libkey discovery link or quicklink is present. after filtering for availability, we had 254 article titles for the first round of searching and analysis. the initial analysis revealed the need to further filter the articles used for the second round of searching, which would provide a much closer comparison of the two direct linking tools as third iron had partnered with more content providers than ex libris. controlling for shared providers would give a more accurate representation of how each direct linking tool performs in relation to the other. when controlling for shared providers and open access articles, we were left with 145 article titles for the second query set. during the second round of searching, we measured whether the direct link was successful in linking to the full text—meaning that the link was neither broken nor linked to an incorrect article—and how many clicks were necessary to get from the direct link to the article pdf. along the way, additional qualitative measures were observed, such as document download time and metadata record quality. while not as easy to measure as the quantitative data, these observations provided additional insight into the strengths and weaknesses of each of these direct linking tools. since april 2022, when our research was conducted, ex libris has added several quicklinks providers, possibly increasing the current number of quicklinks available. additionally, both rounds of searching were conducted on campus, so our analysis excludes any consideration of authentication and/or proxy information.
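the filtering steps described in this section can be sketched as a small pipeline. this is a minimal illustration, not the authors' actual workflow: the record fields (`source_type`, `is_article`, `activated`, `open_access`, `shared_provider`) are hypothetical names chosen for the example and are not columns of the alma analytics report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    # All field names are hypothetical, chosen for this sketch.
    title: str
    source_type: str       # e.g. "primo/primo central" or "google scholar"
    is_article: bool       # False for non-article material
    activated: bool        # has an associated full-text activation
    open_access: bool      # identified as open access (e.g. by unpaywall)
    shared_provider: bool  # provider participates in both linking tools


def build_query_sets(rows):
    """Apply the filters in the order the methods section describes them."""
    # 1. Keep only requests that started inside the native primo interface.
    in_primo = [r for r in rows if r.source_type == "primo/primo central"]
    # 2. Drop duplicate titles and non-article material.
    seen, titles = set(), []
    for r in in_primo:
        if r.is_article and r.title not in seen:
            seen.add(r.title)
            titles.append(r)
    # 3. First-round set: activated full text, or open access.
    round_one = [r for r in titles if r.activated or r.open_access]
    # 4. Second-round set: shared providers, still keeping open access.
    round_two = [r for r in round_one if r.shared_provider or r.open_access]
    return in_primo, titles, round_one, round_two
```

in the study, the corresponding set sizes were 412, 386, 254, and 145.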
results of the 254 articles searched, 208 (82%) had libkey discovery links present while 129 (52%) had quicklinks present. while this seems like a large discrepancy between the two direct link providers, it can be explained by the fact that during the time of testing, ex libris was collaborating with fewer content providers than third iron. ex libris has since added more providers. while the provider discrepancy meant that there were many instances where a libkey discovery link was present where a quicklink was not, there were 5 articles where a quicklink was present while a libkey discovery link was not. as mentioned previously, the criterion for the 254 articles included in the second round of searching was that the articles must be activated in alma or must be open access. of these 254 articles, we identified 137 (54%) as open access. of those open access articles, 132 (96%) had libkey discovery links present, and 118 (86%) had quicklinks present. we found that 113 (82%) of the open access articles had both libkey discovery links and quicklinks present. we also discovered within this set of 137 open access articles that 30 (22%) were from non-activated resources. of those 30 open access articles from non-activated titles, all 30 (100%) had libkey discovery links appearing on the brief results and 24 (80%) had quicklinks. to get a better idea of how libkey discovery links and quicklinks compared in terms of linking success, we filtered to only those articles available from providers who were participating in both libkey discovery links as well as quicklinks. since both direct linking tools use unpaywall integrations, we continued to include open access articles. this filtering resulted in 145 articles where libkey discovery links were present in 137 articles (94%) while quicklinks were present in 129 articles (89%). we found that 123 (85%) of these 145 articles had both libkey discovery links and quicklinks present. 
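the overlap counts just reported for the 145 shared-provider articles can be cross-checked with simple inclusion-exclusion. the helper below is a sketch written for this purpose; its function and key names are our own, and only the four input counts come from the study.

```python
def overlap_breakdown(total, with_a, with_b, with_both):
    """Split a set of articles by which of two link types are present."""
    only_a = with_a - with_both           # tool A link but no tool B link
    only_b = with_b - with_both           # tool B link but no tool A link
    either = with_a + with_b - with_both  # at least one link present
    return {
        "only_a": only_a,
        "only_b": only_b,
        "either": either,
        "neither": total - either,
    }


def pct(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


# 145 articles: 137 with libkey discovery links, 129 with quicklinks, 123 with both.
shared = overlap_breakdown(145, 137, 129, 123)
# shared == {"only_a": 14, "only_b": 6, "either": 143, "neither": 2}
```

under these counts, 14 articles have only a libkey discovery link, 6 have only a quicklink, and 2 have neither. `pct(137, 145)`, `pct(129, 145)`, and `pct(123, 145)` give 94, 89, and 85, matching the reported percentages.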
there were 2 (1%) articles that had neither libkey discovery links nor quicklinks present despite being activated in a journal currently participating as a provider in both direct linking tools. there were also 14 articles (10%) that had libkey discovery links but not quicklinks; all of these articles were open access. in total, of the 145 articles searched, 128 (88%) were identified as open access. as for the 137 libkey discovery links, 130 (95%) of them successfully linked to the article. on average it took 1.07 clicks to get to the pdf of the article. of the 129 quicklinks, 126 (98%) of them successfully linked to the article. on average it took 1.07 clicks to get to the pdf of the article. we also attempted to measure the time it took for the pages to load after the initial click on the libkey discovery links and quicklinks; however, the tools used to measure this, as well as the environments in which the links were being clicked, proved too varied to provide an appropriate comparison. nevertheless, we noted observations such as the page load times after clicking on libkey discovery links and quicklinks were generally consistent, but quicklinks attempts to connect to the wiley platform took a significant time (at least 10 seconds) to load.
conclusions
with high article linking success rates, both third iron’s libkey discovery and ex libris’ quicklinks deliver on the promise to provide fast and seamless access to full-text articles. however, the libkey discovery tool far outpaces quicklinks when it comes to coverage. both direct linking tools perform well with open access articles, supplying libraries with better options for full-text links to articles that may be in hybrid journals. as with any kind of full-text linking, both direct linking tools rely on metadata.
in conclusion, while libkey discovery provides a more complete direct linking solution, both libkey discovery and quicklinks are reliable tools that improve primo’s discovery and delivery experience.
endnotes
1 david bickford, “using direct linking capabilities in aggregated databases for e-reserves,” journal of library administration 41, no. 1/2 (2004): 31–45, https://doi.org/10.1300/j111v41n01_04.
2 bickford, 45.
3 wendy furlan, “library users expect link resolvers to provide full text while librarians expect accurate results,” evidence based library and information practice 1, no. 4 (2006): 60–63, https://doi.org/10.18438/b88c7p.
4 jill e. grogg, “linking without a stand-alone link resolver,” library technology reports 42, no. 1 (2006): 31–34.
5 nicola j. cecchino, “full-text linking demystified,” journal of electronic resources in medical libraries 5, no. 1 (2008): 33–42, https://doi.org/10.1080/15424060802093377.
6 kate silton, “assessment of full-text linking in summon: one institution’s approach,” journal of electronic resources librarianship 26, no. 3 (2014): 163–69, https://doi.org/10.1080/1941126x.2014.936767.
7 kenyon stuart, ken varnum, and judith ahronheim, “measuring journal linking success from a discovery service,” information technology and libraries 34, no. 1 (2015): 52–76, https://doi.org/10.6017/ital.v34i1.5607.
8 stuart, varnum, and ahronheim, 74.

public libraries leading the way
how covid affected our python class at the worcester public library
melody friedenthal
information technology and libraries | december 2021 https://doi.org/10.6017/ital.v40i4.14041
melody friedenthal (mfriedenthal@mywpl.org) is a public services librarian, worcester public library. © 2021.
in june 2020, ital published my account of how the worcester public library (ma) came to offer a class in python programming and how that class was organized. although readers may have read the article in the middle of our covid-year, i wrote it mostly in early january 2020, before libraries across the country closed in an effort to protect staff and patrons from the disease. from spring 2020 through april 2021, i taught intro to coding: python for beginners five more times. but, of course, these classes were not face-to-face. like virtually all other library, musical, political, religious, and cultural programming across the world, our python course was taught virtually. the public services team has one professional zoom account, which my colleagues and i share.
how did going remote affect this class? it depends on whether your perspective is that of a student or that of the instructor. many of us have read how difficult it’s been for teachers to effectively reach their elementary-through-college-age students. i’ve had many of the same challenges, but since nearly all my students are adults and they all chose to take this class, i don’t need to grapple with fidgety kids or recess. on the other hand, there were few distractions in our computer lab, while covid-time students have to grapple with pets, children squabbling, or noise from a tv. i was teaching from my home office. at the library i have one monitor but at home i have two, which makes it easier for me to spread out my assorted documents.
to “protect” my students from seeing my messy house, i used a virtual background, one chosen not to distract. however, the software which determines the borders of a human presenter isn’t perfect and there is sometimes a halo behind my head of the things behind me; this may be distracting itself. prior to covid, since we had twelve seats in the computer lab, we limited registration to fourteen, allowing for some no-shows (and we have two spare laptops, in case everyone showed up). a week prior to session one i would email the registrants, asking them to confirm their continued interest. if a student didn’t confirm, i’d give their seat to someone on the waitlist. while i was not prepared to make my class a mooc (massive open online course) because i individually review homework and give lots of feedback, we did increase maximum registration to fifteen since the number of seats in the computer lab was no longer a limiting factor. and, as before, i ask for confirmation via email, but i also include in that email two links and an attached word doc. the document is an excerpt from cory doctorow’s novel little brother on the joys of coding. the first embedded link leads to the free version of zoom. the second link is to the thonny website (https://thonny.org). thonny is a free ide (integrated development environment) where students can write and execute python code. we used thonny when i taught face-to-face, but the lab computers all had thonny installed, and were ready for students to use. now, i have to depend on the ability of students to download the software to their own computers. i ask students to do the two downloads ahead of session one. which brings us to two problems: the class was no longer accessible to students who live in a household without a computer and internet service.
and, as i found out with one prospective student, it’s not accessible to patrons who don’t have administrative rights to their computer; that is, the ability to download new software. when a patron confirms their interest, i email them the course manual. it now contains about 93 pages. i told students they might choose to print it but doing so is up to individual preference. the advantage of having a digital copy is that students can search for keywords easily. the disadvantage is that the cost of printing the manual is shifted to the student and may be prohibitive for some. in session one, i acknowledged that it’s difficult to learn technical material via zoom, and i encouraged everyone to ask questions during class and to email me if they are stymied while working on the homework. i reiterated that invitation during every session. while teaching, i bounce back-and-forth between screen-sharing my thonny window and the manual, while trying to keep an eye on the little zoom windows showing my students. some students cannot or choose not to turn on their video. this is a problem for me, since i can’t readily determine who’s asking a question. moreover, it is helpful to associate a face with a name. and since i give out a certificate of completion to each student who does the homework and attends all sessions, i want to make sure the student is actually taking part. i’ve had students who sign in, leave their camera off, and then apparently leave (i call on students by name and sometimes the no-video ones never respond). offering the class online has advantages in snowy worcester. students can tune in from the comfort of their own homes, avoid the slick roads, bypassing paying for parking at the municipal lot next to our building or for a bus to downtown, or the discomfort of walking in a dark citycenter in the evening. another plus: as program organizers and program participants have discovered, with videoconferencing we are no longer limited geographically. 
i had registrants who live in pennsylvania and georgia. as always, students range from total beginners to experienced programmers-of-other-languages. i’ve thought about how i can give extra time to the former while not boring the latter. one thing i’ve done is to make some assignments optional and say, “if you want an extra challenge, give this a try….” i’ve slowed the class down a bit, leaving more time for coding during each session. if a student has difficulties, i invited them to share their screen. this pedagogical technique actually works better via zoom than in-person, because we could all see that screen equally well. in the computer lab, only the student who sat at the same (2-person) desk could easily see what the other person had coded. another thing i’ve done is to ratchet down the formality of the class: i am chattier and demo fun games i’ve written, e.g., hangman, tic-tac-toe, rock-paper-scissors, and you sunk my battleship, for inspiration. i experimented with using the built-in zoom whiteboard but that wasn’t satisfactory, so i wrote supplementary notes as comments in thonny. parents were fearful their kids were not being intellectually challenged when schools were closed due to the pandemic, so maybe i shouldn’t have been surprised that the april 2021 class contained seven children. there would have been an eighth, but when i realized one registrant was just seven years old, i told his mother that, while she was the best judge of her son’s abilities, i discouraged him from taking the class. she decided to take it herself.
figure 1. a word-cloud of our fall 2020 project outcome evaluations (includes other digital learning programs).
at our sixth and final session i traditionally execute a program which draws colorful graphics, rather like spirograph. students were able to see each curve being drawn in a new window launched by the ide.
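a program of this kind might look like the sketch below. it is an illustration only: the article does not show the class program or say which graphics library it used, so python's standard turtle module, the curve parameters, and the length of the pause are all assumptions. the short pause after the window opens is one way to leave time to bring that window into a screen share before anything is drawn.

```python
import math
import time


def hypotrochoid_points(big_r, small_r, pen_offset, steps=720):
    """Points on a spirograph-style curve (a hypotrochoid).

    big_r and small_r are integer radii of the fixed and rolling circles;
    pen_offset is the pen's distance from the rolling circle's centre.
    """
    diff = big_r - small_r
    # The curve closes after small_r / gcd(big_r, small_r) full turns.
    turns = small_r // math.gcd(big_r, small_r)
    points = []
    for k in range(steps + 1):
        t = 2 * math.pi * turns * k / steps
        x = diff * math.cos(t) + pen_offset * math.cos(diff / small_r * t)
        y = diff * math.sin(t) - pen_offset * math.sin(diff / small_r * t)
        points.append((x, y))
    return points


def draw(points, pause_seconds=5):
    """Open a turtle window, wait, then trace the curve."""
    import turtle  # imported here so the maths above runs without a display

    pen = turtle.Turtle()      # creating the turtle opens the graphics window
    pen.hideturtle()
    pen.speed(0)
    time.sleep(pause_seconds)  # time to switch the shared screen to this window
    pen.penup()
    pen.goto(points[0])
    pen.pendown()
    for point in points[1:]:
        pen.goto(point)
    turtle.done()
```

calling `draw(hypotrochoid_points(150, 65, 90))` opens the window, waits five seconds, and then traces one closed curve.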
but this window doesn’t exist until i execute the program. while we were using zoom, when i attempted to share my screen, the students missed the first graphics, no matter how fast i was at screen-sharing. i made the execution “sleep” for a few seconds to give me time to switch screens before the graphics were drawn.
a larger percentage of students earned the certificate of completion during the virtual classes than on average in the in-person pre-covid classes, perhaps 75% vs. 40%. for the in-person classes our communications officer printed the certificates on heavy paper adorned with the wpl logo; i signed each and handed them out during the final session. for our virtual classes, the certificates were digitally signed and then emailed; students could print them if they chose.
this follow-up is being written during october 2021, and with a substantial percentage of massachusetts residents vaccinated for covid, the worcester public library is now back to offering many programs in-person, including python. the city of worcester requires mask use in all municipal buildings, and while some patrons don’t cooperate, i’ve told my students that anyone not wearing a mask properly will be asked to leave the computer lab. with so many people out of work due to the economic devastation wrought by covid, we were gratified to be able to offer a class that teaches in-demand skills, especially ones that can be applied in a work-from-home environment.

frbrization of a library catalog | dickey
the functional requirements for bibliographic records (frbr)’s hierarchical system defines families of bibliographic relationship between records and collocates them better than most extant bibliographic systems.
certain library materials (especially audio-visual formats) pose notable challenges to search and retrieval; the first benefits of a frbrized system would be felt in music libraries, but research already has proven its advantages for fine arts, theology, and literature—the bulk of the non-science, technology, and mathematics collections. this report will summarize the benefits of frbr to next-generation library catalogs and opacs, and will review the handful of ils and catalog systems currently operating with its theoretical structure.
editor’s note: this article is the winner of the lita/ex libris writing award, 2007.
the following review addresses the challenges and benefits of a next-generation online public access catalog (opac) according to the functional requirements for bibliographic records (frbr).1 after a brief recapitulation of the challenges posed by certain library materials—specifically, but not limited to, audiovisual materials—this report will present frbr’s benefits as a means of organizing the database and public search results from an opac.2 frbr’s hierarchical system of records defines families of bibliographic relationship between records and collocates them better than most extant bibliographic systems; it thus affords both library users and staff a more streamlined navigation between related items in different materials formats and among editions and adaptations of a work. in the eight years since the frbr report’s publication, a handful of working systems have been developed. the first benefits of such a system to an average academic library system would be felt in a branch music library, but research already has proven its advantages for fine arts, theology, and literature—the bulk of the non-science, technology, and mathematics collections.
■ current search and retrieval challenges

the difficulties faced first, but not exclusively, by music users of most integrated library systems fall into two related categories: issues of materials formats, and issues of cataloging, indexing, and marc record structure. music libraries must collect, catalog, and support materials in more formats than anyone else; this makes their experience of the most common ils modules—circulation, reserves, and acquisitions—by definition more complicated.

the study of music continues to rely on the interrelated use of three distinct information formats—scores (the notated manifestation of a composer’s or improviser’s thought), recordings (realizations in sound, and sometimes video, of such compositions and improvisations), and books and journals (intellectual thought regarding such compositions and improvisations)—music libraries continue to require . . . collections that integrate [emphasis mine] these three information formats appropriately.3

put a different way, “relatedness is a pervasive characteristic of music materials.”4 this is why frbr’s model of bibliographic relationships offers benefits that will first impact the music collection.5 at present, however, musical formats pose search and retrieval challenges for most ils users, and the problem is certainly replicated with microforms and video recordings. the marc codes distinguish between material formats, but they support only one category for sound recordings, lumping together cd, dvd audio, cassette tape, reel-to-reel tape, and all other types.6 this single “sound recording” definition is easily reflected in opacs (such as those powered by innovative interfaces’ millennium and ex libris’ aleph 500) and union catalogs (such as worldcat.org).7 however, the distinction between sound recording formats is embedded in subfields of the 007 field, which presently cannot be indexed by many library automation systems because the subfields are not adjacent.
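the point about non-adjacent 007 data can be made concrete with a small sketch. it assumes the fixed character positions of the marc 21 007 layout for sound recordings (position 00 for category, 01 for carrier, 03 for speed) and lists only a few code values; the function name, the display labels, and the sample strings are illustrative assumptions, not any vendor’s indexing logic.

```python
# Partial, illustrative code tables for the MARC 21 007 field (sound recordings).
# Only a handful of values are listed; the full tables live in the MARC 21 documentation.
CARRIER = {"d": "sound disc", "s": "sound cassette", "t": "sound-tape reel"}  # position 01
SPEED = {"b": "33 1/3 rpm", "f": "1.4 m per second"}                          # position 03


def describe_sound_recording(field_007: str) -> str:
    """Return a display label built from two non-adjacent 007 positions."""
    if not field_007 or field_007[0] != "s":
        raise ValueError("not a sound-recording 007 field")
    carrier = CARRIER.get(field_007[1:2], "unknown carrier")
    speed = SPEED.get(field_007[3:4], "")
    # Hypothetical display rule: the carrier alone cannot separate a compact
    # disc from an LP, so the speed position has to be read as well.
    if carrier == "sound disc" and speed == "1.4 m per second":
        return "compact disc"
    if carrier == "sound disc" and speed == "33 1/3 rpm":
        return "lp record"
    return carrier


if __name__ == "__main__":
    # Sample 007 strings, made up for illustration.
    for value in ("sd fsngnnmmned", "sd bsmennmplu", "ss lsnjlcmpnue"):
        print(value, "->", describe_sound_recording(value))
```

an index built on a derived label of this kind, rather than on the raw field, is one way a system could expose the format distinction to users; whether a given ils can do so is a separate question.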
an even more central challenge derives from the fact that music sound recordings—like journals and essay collections—contain within each item more than one work. thus, for one of the central material formats collected by a music library (as well as by a public library or other academic branches), users routinely find themselves searching for a distinct subset of the item record. perversely, though music catalogers do tend to include analytic added-entries for the subparts of a cd recording or printed score, and major ils vendors are learning to index them, aacr2 guidelines set arbitrary cutoff points of about fifteen tracks on a sound recording, and three performable units within a score.8 subsets of essay collections and journal runs are routinely exposed to users’ searches by indexing and abstracting services and major databases, but subsets of libraries’ music collections depend upon catalogers to exploit the marc records for user access.9

timothy j. dickey (dickeyt@oclc.org) is a post-doctoral researcher, oclc office of programs and research, dublin, ohio.
frbrization of a library catalog: better collocation of records, leading to enhanced search, retrieval, and display
timothy j. dickey
information technology and libraries | march 2008

in light of these pervasive bibliographic relationships, catalogers of music (again, with parallels in other subjects) have developed a distinctive approach to the marc metadata schema. in particular, they—with their colleagues in literature, fine arts, and theology—rely upon the 700t field for uniform work titles, and upon careful authority control.10 however, once again, many major ils portals have spotty records in affording access to library collections via these data.
innovative interfaces’ millennium, though it clearly leads other major library products in this market, frequently frustrates music librarians (it is, of course, not alone in doing so).11 its automatic authority control feature works poorly with (necessary) music authority records.12 and even though innovative has been one of the first vendors to add a database index to the 700t field, partly in response to concerns expressed to the company by the music librarians’ user group, millennium apparently does not allow for an appropriate level of follow-through on searching.13 an initial search by name of a major composer, for instance, yields a huge and cluttered result set containing all indexed 700t fields.14 the results do helpfully include the appropriate see also references, but those references disappear in a subsidiary (limited) search. in addition, the subsidiary display inexplicably changes to an unhelpful arrangement of generic 245 fields (“mozart, symphonies”; “mozart, operas, excerpts”). similar challenges will be faced by other parts of an academic or large public library collection, including the literature collections (for works such as shakespeare’s plays), fine arts (for images and artists’ works), and theology (for works whose uniform title is in latin). the opac interfaces of other major ils vendors fare little better. the same search (for “mozart”) on the emory university library catalog (with an ils by sirsidynix), similarly yields a rich results set of more than one thousand records, and poses similar problems in refining the search.15 in the case of this opac, an index of 700t fields also exists, but it only may be searched from the inside of a single record; as with millennium, sirsidynix’s interface will then group the next set of results confusingly by 245 fields. 
the library corporation’s carl-x apparently does not contain a 700t index; the simple “mozart” search returns a much-simplified set of only 97 results organized by 245a fields, and thus offers a more concise set of results but avoids the most incisive index for audio-visual materials.16 ex libris offers a somewhat more helpful display of its more restricted results; unfortunately for the present comparison, though the detailed results set does list the “format” of all mozart-authored items, the same term— “music”—is used for sound recordings, musical scores, and score excerpts, with no attempt logically to group the results around individual works.17 no 700t index appears present.

■ the frbr paradigm: review of literature and theory

from the earliest library catalogs in the modern age, the tools of bibliographic organization have sought to afford users both access to the collection and collocation of related materials. anglo-american cataloging practice has traditionally served the first function by main entries and alternate access points and the second function by classification systems. however, as knowledge increases in scope and complexity, the systems of bibliographic control have needed to evolve. as early as the 1950s, theories were developing that sought to distinguish between the intellectual content of a work, and its often manifold physical embodiments.18 the 1961 paris international conference on cataloging principles first reified within the cataloging community a work-item distinction, though even the 1988 publication of the anglo-american cataloging rules, 2nd ed., “continued to demonstrate confusion about the nature . . .
of works.”19 meanwhile, extensive research into the nature of bibliographic relationships groped toward a consensus definition of the entity-types that could encompass such relationships.20 ed o’neill and diane vizine-goetz examined some one hundred editions of smollett’s the expedition of humphrey clinker over a two-hundred-year span of publication history to propose a hierarchical set of definitions to define entity levels.21 the theoretical entities include the intellectual content of a work—which in the case of audio-visual works, may not even exist in any printed formats—the various versions, editions, and printings in which that intellectual content manifests itself, and the specific copies of each manifestation which a library may hold.22 research has discovered such clusters of bibliographically related entities for as much as 50 percent or more of all the intellectual works in any given library catalog, and as many as 85 percent of the works in a music catalog.23 this work laid the foundation for frbr (and, once again, incidentally underscored the breadth of its applicability to, and beyond, music catalogs).

the theoretical framework of frbr is most concisely set forth in the final report of the ifla study group. the long-awaited publication traces its genesis to the 1990 stockholm seminar, and the resultant 1992 founding of the ifla study group on functional requirements for bibliographic records. the study group set out to develop:

a framework that identifies and clearly defines the entities of interest to users of bibliographic records, the attributes of each entity, and the types of relationships that operate between entities . . . a conceptual model that would serve as the basis for relating specific attributes and relationships . . . to the various tasks that users perform when consulting bibliographic records.
the study makes no a priori assumptions about the bibliographic record itself, either in terms of content or structure.24

in other words, the intention of the group’s deliberations and the final report is to present a model for understanding bibliographic entities and the relationships between them to support information organization tools. it specifically adopts an approach that defines classes of entities based upon how users, rather than catalogers, approach bibliographic records—or, by natural extension, any system of metadata. the frbr hierarchical entities comprise a fourfold set of definitions:

■ work: “a distinct intellectual or artistic creation”;
■ expression: “the intellectual or artistic realization of a work” in any combination of forms (including editions, arrangements, adaptations, translations, performances, etc.);
■ manifestation: “the physical embodiment of an expression of a work”; and
■ item: “a single exemplar of a manifestation.”25

examples of these hierarchical levels abound in the bibliographic universe, but frequently music offers the quickest examples:

■ work: mozart’s die zauberflöte (the magic flute)
■ work: puccini’s la bohème
  ■ expression: the composer’s complete musical score (1896)
    ■ manifestation: edition of the score printed by ricordi in 1897
  ■ expression: an english language edition for piano and voices
  ■ expression: a performance by mirella freni, luciano pavarotti, and the berlin philharmonic orchestra (october 1972)
    ■ manifestation: a recording of this performance released on 33⅓ rpm sound discs in 1972 by london records
    ■ manifestation: a re-release of the same performance on compact disc in 1987 by london records
      ■ item: the copy of the compact disc held by the columbus metropolitan library
      ■ item: the copy of the compact disc held by the university of cincinnati

in fact, lis research has tended to demonstrate what music librarians have always
understood—that relatedness among items and complexity of families is most prevalent in audio-visual collections. even before the ifla report had been penned, sherry vellucci had set out the task: “to create new catalog structures that better serve the needs of the music user community, it is important first to understand the exact nature and complexity of the materials to be described in the catalog.”26 even limiting herself to musical scores alone (that is, no recordings or monographs), vellucci found that more than 94.8 percent of her sample exhibited at least one bibliographic relationship with another entity in the collection; she further related this finding to the very “inherent nature of music, which requires performance for its aural realization,” as opposed to, for example, monographic book printing.27 vellucci and others have frequently commented on how the relatedness of manifestations—in different formats, arrangements, and abridgements—of musical works continues to be a problem for information retrieval in the world of music bibliography.28 musical works have been variously and industriously described by musicologists and music bibliographers. yet, in the information retrieval domain [and, i might add, under both aacr and aacr2] . . . systems for bibliographic information retrieval . . . have been designed with the document as the key entity, and works have been dismissed as too abstract . . .29 the work is the access point many users will bring—in their minds, and thus in their queries—to a system. they intend, however, to discover, identify, and obtain specific manifestations of that work. 
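the four-level hierarchy defined earlier (work, expression, manifestation, item) maps naturally onto a nested data structure. the following is a minimal sketch, not a cataloging schema: the class and attribute names are assumptions chosen for illustration, and the sample data simply restates the la bohème example from the text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    holder: str  # the library holding this single exemplar


@dataclass
class Manifestation:
    description: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    description: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    title: str
    expressions: List[Expression] = field(default_factory=list)

    def all_manifestations(self) -> List[Manifestation]:
        """Collect every manifestation of every expression of this work."""
        return [m for e in self.expressions for m in e.manifestations]

    def all_items(self) -> List[Item]:
        """Collect every held copy, whatever its format or edition."""
        return [i for m in self.all_manifestations() for i in m.items]


# The la bohème example from the text, expressed in this structure.
boheme = Work(
    title="la bohème",
    expressions=[
        Expression("complete musical score (1896)",
                   [Manifestation("score printed by ricordi, 1897")]),
        Expression("english language edition for piano and voices"),
        Expression("performance by freni, pavarotti, berlin philharmonic (1972)", [
            Manifestation("33 1/3 rpm sound discs, london records, 1972"),
            Manifestation("compact disc, london records, 1987", [
                Item("columbus metropolitan library"),
                Item("university of cincinnati"),
            ]),
        ]),
    ],
)

if __name__ == "__main__":
    print(len(boheme.all_manifestations()), "manifestations")
    print([item.holder for item in boheme.all_items()])
```

a query that starts from the work and walks downward, as `all_items` does here, mirrors the user behavior described above: the work is the access point, and a specific manifestation or copy is the goal.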
very recently, research has begun to demonstrate that the frbr model can offer specific advantages to music retrieval in cases such as these: “the description of bibliographic data in a frbr-based database leads to less redundancy and a clearer presentation of the relationships which are implicit in the traditional databases found in libraries today.”30 explorations of the theory in view of the benefits to other disciplines, such as audio-visual and other graphic materials, maps, oral literature, and rare books, have appeared in the literature as well.31 the admitted weakness of the frbr theory, of course, is that it remains a theory at its inception, with still precious few working applications.

■ frbr applications

working implementations of frbr to catalogs, opacs, and ilss are still relatively few but promise much for the future. the frbr theoretical framework has remained an area of intense research at oclc, which has even led to some prototype applications and, very recently, deployment in the worldcat local interface.32 a scattered few other researchers have crafted frbr catalogs and catalog displays for their own ends; the library of congress has a prototype as well. innovative, the leading academic ils vendor, announced a frbr feature for 2005 release, yet shelved the project for lack of a beta-testing partner library.33 ex libris’ primo discovery tool, one other complete ils (by visionary technologies for library systems, or vtls), and the national library of australia, have each deployed operational frbr applications.34 the number of projects testifies to the high level of interest among the cataloging and information science communities, while the relatively small number of successful applications testifies to the difficulties faced.
oclc has engaged in a number of research projects and prototypes in order to explore ways that frbrization of bibliographic records could enhance information access. oclc research frequently notes the potential streamlining of library cataloging by frbrization; in addition they have experienced “superior presentation” and “more intuitive clustering” of search results when the model is incorporated into systems.35 work-level definitions stand behind such oclc research prototypes as audience level, dewey browser, fictionfinder, xisbn, and live search. in every case, researchers determined that, though it was very difficult to automate any identification of expressions, application of work-level categories both simplifies and improves search result sets.36 an algorithm common to several of these applications is freely available as an open source application, and now as a public interface option in oclc’s worldcat local.37 the algorithm creates an author/title key to cluster worksets (often at a higher level than the frbr work, as in the case of the two distinct works that are the book and screenplay for gone with the wind). in the public search interface, the results sets may be grouped at the work level; users may then execute a more granular search for “all editions,” an option that then displays the group of expressions linked to the work record. unfortunately, as the software does not use 700t fields (its intention is to travel up the entity hierarchy, and it uses the 1xx, 24x, and 130 fields), its usefulness in solving the above challenges may not be immediate. a somewhat similar application (though merrilee proffitt declares it not to be a frbr product) was redlightgreen, a user interface for the exrlg union catalog based upon quasi-frbr clustering.38 the reports from designers of other automated systems offer interesting commentaries on the process. 
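the idea of clustering records on an author/title key can be illustrated with a short sketch. this is not oclc’s algorithm: the normalization rules, the key format, and the record layout below are assumptions made for illustration, and a production system would draw on the 1xx, 24x, and 130 fields and on authority data rather than on two plain strings.

```python
import re
import unicodedata
from collections import defaultdict


def normalize(text: str) -> str:
    """Lowercase, strip diacritics and punctuation, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", no_marks.lower())
    return " ".join(cleaned.split())


def work_key(author: str, title: str) -> str:
    """Build a hypothetical author/title key for grouping records."""
    return f"{normalize(author)}/{normalize(title)}"


def cluster(records):
    """Group record ids by their author/title key."""
    groups = defaultdict(list)
    for rec in records:
        groups[work_key(rec["author"], rec["title"])].append(rec["id"])
    return dict(groups)


# Made-up sample records; the ids are placeholders.
RECORDS = [
    {"id": "r1", "author": "Mitchell, Margaret", "title": "Gone with the Wind"},
    {"id": "r2", "author": "MITCHELL, MARGARET.", "title": "Gone with the wind."},
    {"id": "r3", "author": "Puccini, Giacomo", "title": "La bohème"},
    {"id": "r4", "author": "Puccini, Giacomo", "title": "La Boheme"},
]

if __name__ == "__main__":
    for key, ids in cluster(RECORDS).items():
        print(key, ids)
```

because the key ignores everything except author and title, it groups at or above the work level and says nothing about expressions, which is consistent with the limitation noted in the text.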
the team building an automatically frbrized database and user interface for austlit—a new union collection of australian literature among eight academic libraries and the national library of australia—acknowledged some difficulty with non-monographic works such as poems, though the majority of their database consisted of simpler work-manifestation pairs.39 based on strongly positive user feedback (“the presentation of information about related works [is] both useful and comprehensible”), a similar application was attempted on the australian national music gateway musicaustralia; it is unclear whether the project was shelved due to difficulties in automating the frbrization process.40 one recent application created for the perseus digital library adopts a somewhat different approach.41 rather than altering previously created marc records to allow hierarchical relationships to surface, this team created new records using crosswalks between marc and, for instance, mods, for work-level records. they claim some moderate level of success; though once again, their discussion of the process is more illuminating than their product. mimno and crane successfully allowed a single manifestation-level record to link upwards to many expressions, a necessary analytic feature especially for dealing with sound recordings. they did practically demonstrate the difficulty of searching elements from different levels of the hierarchy at the same time (such as work title and translator), a complication predicted by yee.42 three ils vendors have released products that use the frbr model: portia (visualcat), ex libris (primo), and vtls (virtua).43 the first product, a cataloging utility from a smaller player in the vendor market, claims to incorporate frbr into its metadata capture, yet the information available does not explain how, nor do they offer an opac to exploit it. 
the 2007 release of ex libris’ primo offers what the company calls “frbr groupings” of results.44 this discovery tool is not itself an ils, but promises to interoperate with major existing ils products to consolidate search results. it remains unclear at this time how ex libris’ “standard frbr algorithms” actually group records; the single deployment in the danish royal library allows searching for more records with the same title, for instance, but does not distinguish between translations of the same work.45 vtls, on the other hand, has since 2004 offered a complete product that has the potential to modify existing marc records—via local linking tags in the 001 and 004 fields—to create frbr relationships.46 their own studies agreed with oclc that a subset, roughly 18 percent, of existing catalog records (most heavily concentrated in music collections) would benefit from the process, and they thus allow for “mixed” catalogs, with only subsets (or even individually selected records) to be frbrized. the company’s own information suggests relatively simple implementation by library catalogers, coupled with robust functionality for users, and may be the leading edge of the next generation of catalog products.

■ frbr solutions

the ifla study group, following its user-centered approach, set out a list of specific tasks that users of a computer-aided catalog should be able to accomplish:

■ to find all manifestations embodying certain criteria, or to find a specific manifestation given identifying information about it;
■ to identify a work, and to identify expressions and manifestations of that work;
■ to select among works, among expressions, and among manifestations; and
■ to obtain a particular manifestation once selected.

it seems clear that the frbr model offers a framework of relationships that can aid each task.
unfortunately, none of the currently available commercial solutions may be in themselves completely applicable for a single library. the oclc work-set algorithm is open source, as well as easily available through worldcat local, but it only works to create super-work records; it also ignores the 700t field so crucial to many of the issues noted above. none of the other home-grown applications may have code available to an institution. the virtua module from vtls offers a very tempting solution, but may require a change of vendor.47 either adapting one of these solutions or designing a local application, then, raises the question: what would the ideal system entail? catalog frbrization will transpire in two segments: enhancing the existing catalog to add bibliographic relationships to surface in the retrieval phase, and designing or adapting a new interface and display to reflect the relationships.48 the first task may prove the more formidable, due to the size of even a modest catalog database and the difficulties often observed in automating such a task; while the librarians constructing the austlit system found a relatively high percentage of records could be transferred en masse, the oclc research team had difficulty automatically pinpointing expressions from current marc records.49 despite current technology trends toward users’ application of tags, reviews, and other metadata, a task as specialized as adding bibliographic relationships to the catalog demands specialized cataloging professionals.50 the best approach within a current library structure may be to create a single new position to head the project and to act as liaison with cataloging staff in the various branches and with vendor staff, if applicable. each library branch may judge on its own the proportions of records to frbrize, beginning with high-traffic works and authors, those for whom search results tend to be the most overwhelming and confusing to users.
each branch can be responsible for allocation of cataloging staff effort to the process, and will thus have specialist oversight of subsets of the database. three technical solutions to actually changing the database structure have been attempted in the literature to date: incrementally improving the existing marc records to better reflect bibliographic relationships, adding local linking tags, and simply creating new metadata schemas. the vtls solution of adding local linking tags seems most appropriate; relationships between records are created and maintained via unique identifiers and linking statements in the 001 and 004 fields.51 oclc’s open source software could expedite the creation of work-level records, and the creation of expression-level records will be made easier by the large amount of bibliographic information already present in the current catalog. wherever possible, cataloging staff also should take the opportunity to verify or create links to authority files so as to enhance retrieval.52 creating a new catalog display option could be accomplished via additions to current opac coding, either by adopting worldcat local or by designing parts of a new local interface. it need not even require a complete revision; the single site (ucl) currently deploying vtls’ frbrized interface maintains a mixed catalog and offers, once again, a highly intuitive model.53 when a searcher comes across a bibliographic record for which frbr linking is available, they may click a link to open a new display screen. we should strive, however, to use simple interface statements such as “view all different kinds of holdings,” “this work has x editions, in y languages” or “this version of the work has been published z times” (both the oclc prototype and the austlit gateway offer such helpful and user-friendly statements). 
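the linking approach and the suggested interface statements can be sketched together. the sketch assumes a simplified record in which the 001 value is the record’s own identifier and the 004 value names its parent in the hierarchy; the field usage, the "level" and "language" keys, and the sample data are hypothetical and do not describe how vtls actually stores its links.

```python
from collections import defaultdict

# Hypothetical, simplified records. "001" holds the record's own identifier;
# "004", when present, holds the identifier of its parent record.
RECORDS = [
    {"001": "w1", "level": "work", "label": "la bohème"},
    {"001": "e1", "004": "w1", "level": "expression", "label": "full score", "language": "ita"},
    {"001": "e2", "004": "w1", "level": "expression", "label": "english vocal score", "language": "eng"},
    {"001": "m1", "004": "e1", "level": "manifestation", "label": "ricordi, 1897"},
]


def children_index(records):
    """Map each parent identifier to the records that link up to it."""
    index = defaultdict(list)
    for rec in records:
        parent = rec.get("004")
        if parent is not None:
            index[parent].append(rec)
    return index


def work_statement(work_id, index):
    """Build a plain-language summary of the expressions linked to a work."""
    expressions = [r for r in index.get(work_id, []) if r["level"] == "expression"]
    languages = {r["language"] for r in expressions if "language" in r}
    return f"this work has {len(expressions)} editions, in {len(languages)} languages"


def render_tree(record, index, depth=0):
    """Return indented lines showing a record and everything linked beneath it."""
    lines = ["  " * depth + f'{record["level"]}: {record["label"]}']
    for child in index.get(record["001"], []):
        lines.extend(render_tree(child, index, depth + 1))
    return lines


if __name__ == "__main__":
    index = children_index(RECORDS)
    print(work_statement("w1", index))
    print("\n".join(render_tree(RECORDS[0], index)))
```

records without a 004 value are simply left out of the index, which is one way a “mixed” catalog, with only some records frbrized, could coexist with the linked subset.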
though the foundational work of both tillett and smiraglia focused upon taxonomies of relationships, the hierarchical structure of the ifla proposal should remain at the forefront of the display, with a secondary organization by type of relationship or type of entity. rather than adopting a design which automatically refreshes at each click, a tree organization of the display should be more user-friendly, allowing users to maintain a visual sense of the organization that they are encountering (see appendix for screenshots of this type of tree display).54 format information should be included in the display, as an indication of a user’s primary category, as well as a distinction among expressions of a work. with these changes, the library catalog will begin to afford its users better access to many of its core collections. frbrization of even part of the catalog—concentrating on high-incidence authors, as identified by subject specialists—will allow it better to reflect, and collocate, items within the families of bibliographic relationships that have been acknowledged a part of library collections for decades. this increased collocation will begin to counteract the pitfalls of mere keyword searching on the part of users, especially in conjunction with renewed authority work. finally, frbr offers a display option in a revamped opac that is at the same time simpler than current result lists, and more elegant in its reflection of relatedness among items. each feature should better enable the users of our catalog to find, select, and obtain appropriate resources, and will bring our libraries into the next generation of cataloging practice.

references and notes

1. ifla committee on the functional requirements for bibliographic records, final report (munich: k. g. saur, 1998); see also http://www.ifla.org/vii/s13/wgfrbr/bibliography.htm (accessed mar. 10, 2007).
2.
this paper began as a graduate research assignment for lis 60640 (library automation), in the kent state university mlis program, march 19, 2007. my thanks to jennifer hambrick, nancy lensenmayer, and joan lippincott, for their helpful comments on earlier drafts. the curricular assignment asked for a library automation proposal in a specific library setting; the original review contained a set of recommendations concerning frbr through the lens of a (fictional) medium-sized academic library system, that of st. hildegard of bingen catholic university. as will be noted below, the branch music library typically serves a small population of music majors (graduate and undergraduate) within such an institution, but also a large portion of the student body that use the library’s collection to support their music coursework and arts distribution requirements. any music library’s proportion of the overall system’s holdings may be relatively small, but will include materials in a diverse set of formats: monographs, serials, musical scores, sound recordings in several formats (cassette tapes, lps, cds, and streaming audio files), and a growing collection of video recordings, likewise in several formats (vhs, laser discs, and dvd). it thus offers an early test case for difficulties with an automated library system.
3. dan zager, “collection development and management,” notes—quarterly journal of the music library association 56, no. 3 (march 2000): 569.
4. sherry l. vellucci, “music metadata and authority control in an international context,” notes—quarterly journal of the music library association 57, no. 3 (mar. 2001): 541.
5. the opac for the university of huddersfield library system famously first deployed a search option for related items (“did you mean . . . ?”); http://www.hud.ac.uk/cls (accessed july 10, 2007). frbr not only offers the related item search, but also logically groups related works throughout the library catalog.
6.
allyson carlyle demonstrated empirically that users value an object’s format as one of the first distinguishing features: “user categorization of works: toward improved organization of online catalog displays,” journal of documentation 55, no. 2 (mar. 1999): 184–208 at 197.
7. millennium will feature heavily in the following discussion, both because of its position leading the academic library automation market (being adopted wholesale by, for instance, the ohio statewide academic library consortium), and because it was the subject of the original paper.
8. see alastair boyd, “the worst of both worlds: how old rules and new interfaces hinder access to music,” caml review 33, no. 3 (nov. 2005), http://www.yorku.ca/caml/review/33-3/both_worlds.htm (accessed mar. 12, 2007); michael gorman and paul w. winkler, eds., anglo-american cataloging rules, 2nd ed. (chicago: ala, 1988).
9. in the past few years, a small subset of the search literature has described technical efforts to develop search engines that can query by musical example; see j. stephen downie, “the scientific evaluation of music information retrieval systems: foundations and future,” computer music journal 28, no. 2 (summer 2004): 12–23. a company called melodis corporation has recently announced a successful launch of a query-by-humming search engine, though a verdict from the music community remains out; http://www.midomi.com (accessed jan. 31, 2007).
10. see vellucci, “music metadata and authority control in an international context”; richard p. smiraglia, “uniform titles for music: an exercise in collocating works,” cataloging and classification quarterly 9, no. 3 (1989): 97–114; steven h. wright, “music librarianship at the turn of the century: technology,” notes—quarterly journal of the music library association 56, no. 3 (mar. 2000): 591–97.
each author builds upon the foundational work of barbara tillett, “bibliographic relationships: toward a conceptual structure of bibliographic information used in cataloging” (ph.d. diss., university of california at los angeles, 1987).
11. “at conferences, [my colleagues] are always groaning if they are a voyager client,” interview with an academic music librarian by the author, feb. 9, 2007.
12. several prominent music librarians only discovered that innovative’s system had such a feature when instances of the automatic system’s changing carefully crafted music authority records were discovered; mark sharff (washington university in st. louis) and deborah pierce (university of washington), postings to innovative music users’ group electronic discussion list, oct. 6, 2006, archive accessed feb. 1, 2007.
13. music librarians are the only subset of the millennium users to have formed their own innovative users’ group. sirsidynix has a separate users’ group for stm librarians, and ex libris hosts a law librarians’ users’ group, two other groups whose interaction with the ils poses discipline-specific challenges.
14. searches were tested on the ohio state university libraries’ opac, http://library.osu.edu (accessed mar. 10, 2007).
15. http://www.emory.edu/libraries.cfm (accessed june 27, 2007).
16. searches performed on the library of oklahoma state university, http://www.library.okstate.edu (accessed june 27, 2007); tlc has considered making frbrization a possible feature of their product. they offer some concatenation of “intellectually similar bibliographic records,” and “tlc continues to monitor emerging frbr standards”; don kaiser, personal communication to the author, july 8, 2007. i was unable to reach representatives of sirsidynix on this issue.
17. searches performed on the mit library catalog, powered by aleph 500, http://libraries.mit.edu (accessed june 27, 2007).
18.
eva verona, “literary unit versus bibliographic unit [1959],” in foundations of descriptive cataloging, ed. michael carpenter and elaine svenonius, 155–75 (littleton, colo.: libraries unlimited, 1985), and seymour lubetzky, principles of cataloging, final report phase i: descriptive cataloging (los angeles: institute for library research, 1969), are usually credited with the foundational work on such theories; see richard p. smiraglia, the nature of “a work”: implications for the organization of knowledge (lanham, md.: scarecrow, 2001), 15–33, to whom the following overview is indebted.
19. anglo-american cataloging rules, cited in smiraglia, the nature of “a work,” 33.
20. among the many library and information science thinkers contributing to this body of research, the most prominent have been patrick wilson, “the second objective” in the conceptual foundations of descriptive cataloging, ed. elaine svenonius, 5–16 (san diego: academic publ., 1989); edward t. o’neill and diane vizine-goetz, “bibliographic relationships: implications for the function of the catalog,” in the conceptual foundations of descriptive cataloging, ed. elaine svenonius, 167–79 (san diego: academic publ., 1989); barbara ann tillett, “bibliographic relationships: toward a conceptual structure of bibliographic information used in cataloging” (ph.d. diss., university of california, los angeles, 1987); eadem, “bibliographic relationships,” in relationships in the organization of knowledge, carol a. bean and rebecca green, eds., 19–35 (dordrecht: kluwer, 2001) (summary of her dissertation findings on 19–20); martha m. yee, “manifestations and near-equivalents: theory with special attention to moving-image materials,” library resources and technical services 38, no. 3 (1994): 227–55.
21. o’neill and vizine-goetz, “bibliographic relationships”; see also edward t.
o’neill, “frbr: application of the entityrelationship model to humphrey clinker,” library resources and technical services 46, no. 4 (oct. 2002): 150–59. 22. theorists in music semiotics who have more or less profoundly influenced music librarians’ view of their materials include jean-jacques nattiez, music and discourse: toward a semiology of music, trans. by carolyn abbate (princeton, n.j.: princeton univ. pr., 1990), and lydia goehr, the imaginary museum of musical works (new york: oxford univ. pr., 1992). see also smiraglia, the nature of “a work,” 64. for a concise overview of how semiotic theory has influenced thinking about literary texts, see w. c. greetham, theories of the text (oxford: oxford univ. pr., 1999), 276–325. 23. studies have found families of derivative bibliographic relationships in 30.2 percent of all worldcat records, 49.9 percent of records in the catalog of georgetown university library, 52.9 percent in the burke theological library (union theological seminary), 57.9 percent of theological works in the new york university library, and 85.4 percent in the sibley music library at the eastman school of music (university of rochester). see smiraglia, the nature of “a work,” 87, who cites richard p. smiraglia and gregory h. leazer, “derivative bibliographic relationships: the work relationship in a global bibliographic database,” journal of the american society for information science 50 (1999): 493–504; richard p. smiraglia, “authority control and the extent of derivative bibliographic relationships” (ph.d. diss., university of chicago, 1992); richard p. smiraglia, “derivative bibliographic relationships among theological works,” proceedings of the 62nd annual meeting of the american society for information science (medford, n.j.: information today, 1999): 497–506; and sherry l. 
vellucci, “bibliographic relationships among musical bibliographic entities: a conceptual analysis of music represented in a library catalog with a taxonomy of the relationships” (d.l.s. diss., columbia university, 1994). 24. ifla, final report, 2–3. 25. ibid, 16–23. 26. sherry l. vellucci, bibliographic relationships in music catalogs (lanham, md.: scarecrow, 1997), 1. 27. ibid, 238; 251. 28. vellucci, “music metadata”; richard p. smiraglia, “musical works and information retrieval,” notes: quarterly journal of the music library association 58, no. 4 (june 2002). patrick le boeuf notes that users of music collections often use the single word “score” to indicate any one of the four frbr entities; “musical works in the frbr model or ‘quasi la stessa cosa’: variations on a theme by umberto eco,” in functional requirements for bibliographic records (frbr): hype or cure-all? ed. patrick le boeuf, 103–23 at 105–06 (new york: haworth, 2005). 29. smiraglia, “musical works and information retrieval,” 2. 30. marte brenne, “storage and retrieval of musical documents in a frbr-based library catalogue” (masters’ thesis, oslo university college, 2004), 79. see also john anderies, “enhancing library catalogs for music,” paper presented at the conference on music and technology in the liberal arts environment, hamilton college, june 22, 2004; powerpoint presentation accessed mar. 12, 2007, from http://academics. hamilton.edu/conferences/musicandtech/presentations/catalog-enhancements.ppt; boyd, “the worst of both worlds.” 31. see the extensive bibliography compiled by ifla, cataloging division: “frbr bibliography,” http://www.ifla.org/ vii/s13/wgfrbr.bibliography.htm (accessed mar. 10, 2007). 32. the first ils deployment of the worldcat local application using frbr is with the university of washington libraries: http://www.lib.washington.edu (accessed june 27, 2007). 33. innovative interfaces, inc., “millennium 2005 preview: frbr support,” inn-touch (june 2004), 9. 
interestingly, the one-page advertisement for the new service chose a musical work, puccini’s opera la bohème, to illustrate how the sorting would work. innovative interfaces booth staff at the ala national conference, washington, d.c., june 24, 2007, told the author the company has moved in a different development direction now (investing more heavily in faceted browsing). 34. denmark’s det kongelige bibliotek has been the first ex libris partner library to deploy primo, http://www.kb.dk/en (accessed july 10, 2007). the vtls system has been operating since 2004 at the université catholique de louvain, http://www.bib.ucl.ac.be (accessed mar. 15, 2007). for austlit, see http://www.austlit.edu.au (accessed mar. 14, 2007). 35. rick bennett, brian f. lavoie, and edward t. o’neill, “the concept of a work in worldcat: an application of frbr,” library collections, acquisitions, and technical services 27, no. 1 (spring 2003): 45–60. work-level records allow manifestation and item records to inherit labor-intensive subject classification metadata; eric childress, “frbr and oclc research,” paper presented at the university of north carolina-chapel hill, apr. 10, 2006, http://www.oclc.org/research/presentations/childress/20060410-uncch-sils.ppt (accessed mar. 12, 2007). 36. thomas b. hickey, edward t. o’neill, and jenny toves, “experiments with the ifla functional requirements for bibliographic records (frbr),” d-lib 8, no. 9 (sept. 2002), http://www.dlib.org/dlib/september02/hickey/09hickey.html (accessed mar. 12, 2007). 37. thomas b. hickey and jenny toves, “frbr work-set algorithm,” apr. 2005 report, http://www.oclc.org/research/projects/frbr/default.htm (accessed mar. 12, 2007); algorithm available at http://www.oclc.org/research/projects/frbr/algorithm.htm. on worldcat local, see above, note 32. 38.
merrilee proffitt, “redlightgreen: frbr between a rock and a hard place,” http://www.ala.org/ala/alcts/alctsconted/ presentations/proffitt.pdf (accessed mar. 12, 2007). redlight green has been discontinued, and some of its technology incorporated into worldcat local. 39. http://www.austlit.edu.au (accessed mar. 14, 2007), but unfortunately a subscription database at this time, and thus unavailable for operational comparison. see marie-louise ayres, “case studies in implementing functional requirements for bibliographic records: austlit and musicaustralia,” alj: the australian library journal 54, no. 1 (feb. 2005): 43–54, http:// www.nla.gov.au/nla/staffpaper/2005/ayres1.html (accessed mar. 12, 2007). 40. ibid. 41. see david mimno and gregory crane, “hierarchical catalog records: implementing a frbr catalog,” d-lib 11, no. 10 (oct. 2005); http://www.dlib.org/dlib/october05/ crane/10crane.html (accessed mar. 12, 2007). 42. ibid. see also martha m. yee, “frbrization: a method for turning online public finding lists into online public catalogs,” information technology and libraries 24, no. 3 (2005): 77–95, http://repositories.cdlib.org/postprints/715 (accessed mar. 12, 2007). 43. portia, “visualcat overview,” http://www.portia.dk/ pubs/visualcat/present/visualcatoverview20050607.pdf (accessed mar. 14, 2007); vtls, inc., “virtua,” http://www.vtls. com/brochures/virtua.pdf (accessed mar. 14, 2007). 44. http://www.exlibrisgroup.com/primo_orig.htm (accessed july 10, 2007). 45. syed ahmed, personal communication to the author, july 10, 2007; searches run july 10, 2007, on http://www.kb.dk/en. the library’s holdings of manifestations of mozart’s singspiel opera, the magic flute, run to four different groupings on this catalog: one under the title “die zauberflöte,” one under the title “la flute enchantée: opéra fantastique en 4 actes,” and two separate groups under the title “tryllefløtjen.” 46. 
“vtls announces first production use of frbr,” http:// www.vtls.com/corporate/releases/2004/6.shtml (accessed mar. 14, 2007). unfortunately, though this press release indicates commitments on the part of the université catholique de louvain and vaughan public libraries (ontario, canada) to use fully frbrized catalogs, only the first is operating in this mode as of july 2007, and with only a subset of its catalog adapted. 47. virtua is not interoperable, for instance, with any of innovative’s other ils modules, which continue to dominate a number of larger academic consortia; john espley, vtls inc. director of design, personal communication to the author, mar. 15, 2007. 48. see allyson carlyle, “fulfilling the second objective in the online catalog: schemes for organizing author and work records into usable displays,” library resources and technical services 41, no. 2 (1997): 79–100. 49. even at the work-level, yee distinguished fully eight different places in a marc record in which the identity of a work may be located, “frbrization,” 79–80. 50. gregory leazer and richard p. smiraglia imply that cataloger-based “maps” of bibliographic relationships are inadequate; “bibliographic families in the library catalog: a qualitative analysis and grounded theory,” library resources and technical services 43, no. 4 (1999): 191–212. the cataloging failures they describe, however, are more a result of inadequacies in the current rules and practice, and do not really prove that catalogers have failed in the task of creating useful systems. 51. vinood chacra and john espley, “differentiating libraries though enriched user searching: frbr as the next dimensions in meaningful information retrieval,” powerpoint presentation, http://www.vtls.com/corporate/frbr.shtml (accessed mar. 10, 2007). 52. see yee, “frbrization.” 53. http://www.bib.ucl.ac.be (accessed mar. 15, 2007). 54. 
not only does the ex libris primo application need clickthroughs, it creates a new window for an extra step before presenting a new group of records. bibliography anderies, john. “enhancing library catalogs for music.” paper presented at the conference on music and technology in the liberal arts environment, hamilton college, june 22, 2004; http://academics.hamilton.edu/conferences/musicandtech/presentations/catalog-enhancements.ppt (accessed mar. 12, 2007). ayres, marie-louise. “case studies in implementing functional requirements for bibliographic records: austlit and musicaustralia.” alj: the australian library journal 54, no. 1 (feb. 2005): 43–54; http://www.nla.gov.au/nla/staffpaper/2005/ ayres1.html (accessed mar. 12, 2007). bennett, rick, brian f. lavoie, and edward t. o’neill. “the concept of a work in worldcat: an application of frbr.” library collections, acquisitions, and technical services 27, no. 1 (spring 2003): 45–60. boyd, alistair. “the worst of both worlds: how old rules and new interfaces hinder access to music.” caml review 33, no. 3 (nov. 2005); http://www.yorku.ca/caml/review/33-3/ both_worlds.htm (accessed mar. 12, 2007). brenne, marte. “storage and retrieval of musical documents in a frbr-based library catalogue.” masters’ thesis, oslo university college, 2004. carlyle, allyson. “fulfilling the second objective in the online catalog: schemes for organizing author and work records into usable displays,” library resources and technical services 41, no. 2 (1997): 79–100. ______. “user categorization of works: toward improved organization of online catalog displays.” journal of documentation 55, no. 2 (mar. 1999): 184–208 chacra, vinood, and john espley. “differentiating libraries though enriched user searching: frbr as the next dimensions in meaningful information retrieval.” powerpoint presentation, http://www.vtls.com/corporate/frbr.shtml (accessed mar. 10, 2007). childress, eric. 
“frbr and oclc research.” paper presented at the university of north carolina-chapel hill, apr. 10, 2006; http://www.oclc.org/research/presentations/childress/20060410-uncch-sils.ppt (accessed mar. 12, 2007). hickey, thomas b., and edward o’neill. “frbrizing oclc’s worldcat.” in functional requirements for bibliographic records (frbr): hype or cure-all? ed. patrick le boeuf, 239–51. new york: haworth, 2005. hickey, thomas b., and jenny toves. “frbr work-set algorithm.” apr. 2005 report; http://www.oclc.org/research/frbr (accessed mar. 12, 2007). hickey, thomas b., edward t. o’neill, and jenny toves, “experiments with the ifla functional requirements for bibliographic records (frbr),” d-lib 8, no. 9 (sept. 2002); http://www.dlib.org/dlib/september02/hickey/09hickey.html (accessed mar. 12, 2007). ifla study group on the functional requirements for bibliographic records. functional requirements for bibliographic records: final report. munich: k. g. saur, 1998. layne, sara shatford. “subject access to art images.” in introduction to art image access: issues, tools, standards, strategies, murtha baca, ed., 1–18. los angeles: getty research institute, 2002. leazer, gregory, and richard p. smiraglia. “bibliographic families in the library catalog: a qualitative analysis and grounded theory.” library resources and technical services 43, no. 4 (1999): 191–212. le boeuf, patrick. “musical works in the frbr model or ‘quasi la stessa cosa’: variations on a theme by umberto eco.” in functional requirements for bibliographic records (frbr): hype or cure-all? patrick le boeuf, ed., 103–23. new york: haworth, 2005. markey, karen. subject access to visual resources collections: a model for computer construction of thematic catalogs. new york: greenwood, 1986. mimno, david, and gregory crane. “hierarchical catalog records: implementing a frbr catalog.” d-lib 11, no. 10 (oct.
2005); http://www.dlib.org/dlib/october05/crane/10crane. html (accessed mar. 12, 2007). o’neill, edward t. “frbr: application of the entity-relationship model to humphrey clinker.” library resources and technical services 46, no. 4 (oct. 2002): 150–59. o’neill, edward t., and diane vizine-goetz. “bibliographic relationships: implications for the function of the catalog.” in the conceptual foundations of descriptive cataloging. elaine svenonius, ed., 167–79. san diego: academic publ., 1989. proffitt, merrilee. “redlightgreen: frbr between a rock and a hard place.” paper presented at the 2004 ala annual conference, orlando, fla.; http://www.ala.org/ala/alcts/alctsconted/presentations/proffitt.pdf (accessed mar. 12, 2007). smiraglia, richard p. bibliographic control of music, 1897–2000. lanham, md.: scarecrow and music library association, 2006. ______. “content metadata: an analysis of etruscan artifacts in a museum of archaeology.” cataloging and classification quarterly, 40, no. 3/4 (2005): 135–51. ______. “musical works and information retrieval,” notes: quarterly journal of the music library association 58, no. 4 (june 2002): 747–64. ______. the nature of “a work”: implications for the organization of knowledge. lanham, md.: scarecrow, 2001. ______. “uniform titles for music: an exercise in collocating works.” cataloging and classification quarterly 9, no. 3 (1989): 97–114. tillett, barbara ann. “bibliographic relationships.” in relationships in the organization of knowledge. carol a. bean and rebecca green, eds., 19–35. dordrecht: kluwer, 2001. vellucci, sherry l. bibliographic relationships in music catalogs. lanham, md.: scarecrow, 1997. ______. “music metadata and authority control in an international context.” notes—quarterly journal of the music library association 57, no. 3 (mar. 2001): 541–54. wilson, patrick. “the second objective.” in the conceptual foundations of descriptive cataloging. elaine svenonius, ed., 5–16. san diego: academic publ., 1989. 
wright, h. s. “music librarianship at the turn of the century: technology.” notes: quarterly journal of the music library association 56, no. 3 (mar. 2000): 591–97. yee, martha m. “frbrization: a method for turning online public finding lists into online public catalogs.” information technology and libraries 24, no. 3 (2005): 77–95; http://repositories.cdlib.org/postprints/713 (accessed mar. 12, 2007). ______. “manifestations and near-equivalents: theory with special attention to moving-image materials.” library resources and technical services 38, no. 3 (1994): 227–55. zager, daniel. “collection development and management.” notes: quarterly journal of the music library association 56, no. 3 (2000): 567–73.

appendix: examples of a frbrized tree display
a search on also sprach zarathustra on the online public access catalog for the université catholique de louvain, with results frbrized (a vtls opac).
selecting the first work yields the following screen: . . . which, when frbrized, yields a list of expressions.
any part of the tree may be expanded, to display manifestations, and item-level records follow.

articles
evaluating the impact of the long-s upon 18th-century encyclopedia britannica automatic subject metadata generation results
sam grabus
information technology and libraries | september 2020 https://doi.org/10.6017/ital.v39i3.12235
sam grabus (smg383@drexel.edu) is an information science phd candidate at drexel university’s college of computing and informatics, and research assistant at drexel’s metadata research center. this article is the 2020 winner of the lita/ex libris student writing award. © 2020.
abstract this research compares automatic subject metadata generation when the pre-1800s long-s character is corrected to a standard < s >. the test environment includes entries from the third edition of the encyclopedia britannica, and the hive automatic subject indexing tool. a comparative study of metadata generated before and after correction of the long-s demonstrated an average of 26.51 percent potentially relevant terms per entry omitted from results if the long-s is not corrected. results confirm that correcting the long-s increases the availability of terms that can be used for creating quality metadata records. a relationship is also demonstrated between shorter entries and an increase in omitted terms when the long-s is not corrected. introduction the creation of subject metadata for individual documents is long known to support standardized resource discovery and analysis by identifying and connecting resources with similar aboutness .1 in order to address the challenges of scale, automatic or semi-automatic indexing is frequently employed for the generation of subject metadata, particularly for academic articles, where the abstract and title can be used as surrogates in place of indexing the full text. when automatically generating subject metadata for historical humanities full texts that do not have an abstract, anachronistic typographical challenges may arise. one key challenge is that presented by the historical “long-s” < ſ >. in order to account for these idiosyncrasies, there is a need to understand the impact that they have upon the automatic subject indexing output. addressing this challenge will help librarians and information professionals to determine whether or not they will need to correct the long-s when automatically generating subject metadata for full-text pre-1800s documents. 
the problem of the long-s in optical character recognition (ocr) for digital manuscript images has been discussed for decades.2 many scholars have researched methods for correcting the long-s through the use of rule-based algorithms or dictionaries.3 while the problem of the long-s is well-known in the digital humanities community, automatic subject metadata generation for a large corpus of pre-1800s documents is rare, as is research about the application and evaluation of existing automatic subject metadata generation tools on 18th-century documents in real-world information environments. the impact of the long-s upon automatic subject metadata generation results for pre-1800s texts has not been extensively explored. the research presented in this paper addresses this need. the paper reports results from basic statistical analysis and visualization of automatic subject indexing results from the helping interdisciplinary vocabulary engineering (hive) tool, before and after the correction of the historical long-s in the 3rd edition of the encyclopedia britannica.

background work was conducted over the summer and fall of 2019, and the research presented was conducted during winter 2020. the work was motivated by current work on the “developing the data set of nineteenth-century knowledge” project, a national endowment for the humanities collaborative project between temple university’s digital scholarship center and drexel university’s metadata research center.
the grant is part of a larger project, temple university’s “19th-century knowledge project,” which is digitizing four historical editions of the encyclopedia britannica.4 the next section of this paper presents background covering the historical encyclopedia britannica data, the automatic subject metadata generation tool used for this project, a brief background of “the long-s problem,” and the distribution of encyclopedia entry lengths in the 3rd edition. the background section will be followed by research objectives and method supporting the analysis. next, the results are presented, demonstrating prevalence of terms omitted from the automatic subject metadata generation results if the long-s is not corrected to a standard small < s > character, as well as the impact of encyclopedia entry length upon these results. the results are followed by a contextual discussion, and a conclusion that highlights key findings and identifies future research. background indexing for the 19th-century knowledge project the 19th-century knowledge project, an neh-funded initiative at temple university, is fully digitizing four historical editions of the encyclopedia britannica (the 3rd, 7th, 9th, and 11th). the long-term goal of the project is to analyze the evolving conceptualization of knowledge across the 19th century.5 the 3rd edition of the encyclopedia britannica (1797) is the earliest edition being digitized for this project. the 3rd edition consists of 18 volumes, with a total of 14,579 pages, and individual entries ranging from four to over 150,000 words. for each individual entry, researchers at temple have created individual tei-xml files from the ocr output. in order to enrich accessibility and analysis across this digital collection, the knowledge project will be adding controlled vocabulary subject headings into the tei headers of each encyclopedia entry xml file. 
considering the size of this corpus, both in terms of entry length and number of entries, automatic subject metadata generation will be required for the creation of this metadata. the knowledge project will employ controlled vocabularies to replace or complement naturally extracted keywords for this process. using controlled vocabularies adheres to metadata semantic interoperability best practices, ensures representation consistency, and helps to bypass linguistic idiosyncrasies of these 18th and 19th century primary source materials.6 we selected two versions of the library of congress subject headings (lcsh) as the controlled vocabularies for this project. lcsh was selected due to its relational thesaurus structure, multidisciplinary nature, and continued prevalence in digital collections due to its expressiveness and status as the largest general indexing vocabulary.7 in addition to the headings from the 2018 edition of lcsh, headings from the 1910 lcsh are also implemented in order to provide a more multi-faceted representation, using temporally-relevant terms that may have been removed from the contemporary lcsh. the tool applied for this process is hive, a vocabulary server and automatic indexing application.8 hive allows the user to upload a digital text or url and select one or more controlled vocabularies, and it performs automatic subject indexing by mapping naturally extracted keywords to the available controlled vocabulary terms. hive was initially launched as an imls linked open vocabulary and indexing demonstration project in 2009. since that time, hive has been further developed, with the addition of more controlled vocabularies, user interface options, and the rake keyword extraction algorithm.
the rake keyword extraction algorithm has been selected for this project after a comparison of topic relevance precision scores for three keyword extraction algorithms.9

the long-s problem
early in our metadata generation efforts, we discovered that the 3rd edition of the encyclopedia britannica employs the historical long-s. originating in early roman cursive script, the long-s was used in typesetting up through the 18th century, both with and without a left crossbar. by the end of the 18th century, the long-s fell out of use with printers.10 as outlined by lexicographers of the 17th and 18th centuries, the rules for using the long-s were frequently vague, complicated, inconsistent over time, and varied according to language (english, french, spanish, or italian).11 these rules specified where in a word the long-s should be used instead of a short < s >, whether it is capitalized, where it may be used in proximity to apostrophes, hyphens, and the letters < f >, < b >, < h >, and < k >; and whether it is used as part of a compound word or abbreviation.12 this is further complicated by the inclusion of the half-crossbar, which occasionally results in two consequences: (a) the long-s may be interpreted by ocr as an < f >, and (b) < b > and < f > may be interpreted by ocr as a long-s. figure 1 shows an example from the 3rd edition entry on russia, in which the original text specifies “of” (line 1 in top figure), yet the ocr output has interpreted the character as a long-s. the long-s may also occasionally be interpreted by the ocr as a lowercase < l >, such as the “univerlity of dublin” in the 3rd edition entry on robinson (the most rev sir richard). these complications and inconsistencies are challenges when developing python rules for correcting the long-s in an automated way, and even preexisting scripts will need to be adapted for individual use with a particular corpus.

figure 1. example from the 3rd edition entry on russia, comparing the original use of a letter < f > in “of” to the ocr output of the same passage, which mistakenly interprets the character as a long-s.

despite the transition away from the long-s towards the end of the 18th century, the 3rd edition of the encyclopedia britannica (published in 1797) implements the long-s throughout, with approximately 100,594 instances of the long-s in the ocr output. when performing metadata generation with the hive tool on the ocr output for an entry, the long-s is most often interpreted by the automatic metadata generation tool as an < f >, which can result in (a) inaccurate keyword extraction (e.g., russians → ruffians), and (b) when mapping extracted keywords to controlled vocabulary terms, essential topics could be unidentifiable, and hive will subsequently omit them from the results because they cannot be mapped to controlled vocabulary terms. figure 2 provides a truncated view of long-s words in the 3rd edition entry on rum, which are subsequently removed from the pool of automatically extracted keywords when performing the automatic subject indexing sequence in hive. using keyword extraction algorithms that are largely dependent upon term frequencies, automatic subject indexing for an entry on rum may be substantially hindered when meaningful and frequently occurring words such as sugar and yeast are removed.

figure 2. examples of the long-s in the 3rd edition encyclopedia britannica entry on rum.

using this example entry, the automatic subject indexing results were compared using python, to determine which terms only appear when the long-s has been corrected to the standard < s >. the comparison showed that 16 total terms no longer appeared in the results when the long-s was not corrected to a standard < s >: ten terms using the 2018 lcsh, and six terms using the 1910 lcsh.
these omitted results included the terms sugar and yeast. the next section will discuss the encyclopedia entry word count for this corpus, and the possible impact that this may have upon automatic subject indexing between corrected and uncorrected long-s instances.

encyclopedia entry lengths
consistent with other encyclopedia britannica editions in the 18th and 19th centuries, the encyclopedia entries in the 3rd edition vary substantially in length. a convenience sample of 3,849 3rd edition entries ranging in length from 2 to 202,848 words demonstrated an arithmetic mean of 826.60 words and a median word count of 71. as shown in figure 3, this indicates a significant skew towards shorter entry lengths. for the vast majority of encyclopedia entries in this corpus, a low total word count may affect the degree of long-s impact for automatic subject indexing results, given the importance of term availability and frequency for keyword extraction algorithms.

figure 3. scatterplot of word count for a convenience sample of 3,849 3rd edition encyclopedia britannica entries.

large-scale metadata generation requires time, labor, and resources, and it becomes more costly when accounting for the complications of correcting the long-s for a particular corpus. library and information professionals working with digital humanities resources will need to understand the impact of correcting or not correcting the long-s in the corpus before designating resources and developing a protocol for generating the automatic or semi-automatic metadata for full-text resources. this includes understanding whether or not the length of each individual document will affect the degree of long-s impact upon the results. this challenge, and the issues reviewed above, are addressed in the research presented below.
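The dictionary-based correction mentioned in the background can be illustrated with a minimal sketch. It is not the script used in this study: the function name `correct_long_s` and the tiny `LEXICON` are hypothetical, and it assumes that a long-s either survives OCR as the character ſ or has been misread as < f >. A word that is already in the lexicon is left alone, so an ambiguous pair such as ruffians/russians would not be resolved by this approach.

```python
from itertools import product

# Hypothetical mini-lexicon for illustration; a real run would load a full word list.
LEXICON = {"sugar", "yeast", "of", "fermentation", "possess"}

def correct_long_s(word, lexicon=LEXICON):
    """Return a lexicon word reachable by reading some 'f' as 's', else the word itself."""
    # A literal long-s character (U+017F) is unambiguous and is always replaced.
    word = word.replace("\u017f", "s")
    if word in lexicon or "f" not in word:
        return word
    positions = [i for i, ch in enumerate(word) if ch == "f"]
    # Try every way of keeping or replacing each 'f' (exponential in the number of f's).
    for choice in product("fs", repeat=len(positions)):
        chars = list(word)
        for pos, ch in zip(positions, choice):
            chars[pos] = ch
        candidate = "".join(chars)
        if candidate in lexicon:
            return candidate
    return word  # no lexicon match: leave the OCR token untouched

corrected = [correct_long_s(w) for w in ["fugar", "yeaft", "of", "poffefs"]]
```

Because unmatched tokens are returned unchanged, a sketch like this errs on the side of leaving text alone, which is one reason a corpus-specific rule set may still be needed.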
objectives
the overriding goal of this work is to determine the prevalence of omitted terms in automatic subject indexing results when the long-s is not corrected in the 3rd edition entries of the encyclopedia britannica.
research questions:
1. what is the average number of terms that are omitted from automatic subject indexing results when the long-s is not corrected to a standard <s>?
2. how does the encyclopedia entry length affect the number of terms that are omitted when the long-s is not corrected to a standard <s>?
this analysis will approach these goals by performing a comparative analysis of automatic subject indexing results to determine the number of terms that are omitted from the results when the long-s is not corrected to a standard letter <s>. basic descriptive statistics are generated to determine central tendency. the quantity of terms omitted is then compared with encyclopedia entry word counts. these objectives were shaped by collaboration between drexel university’s metadata research center and temple university’s digital scholarship center. the next section of this paper will report on methods and steps taken to address these objectives.
methods
we approached this research by performing a comparative analysis of subject metadata generated both before and after the correction of the historical long-s in the 3rd edition of the encyclopedia britannica. the hive tool was used to automatically generate the subject metadata. descriptive statistics were applied, and visualizations produced from the results were also examined to identify trends.
figure 4. the 30 3rd edition encyclopedia britannica entries randomly selected for this study, sorted in ascending order by their word counts.
the protocol for performing this research involved the following steps:
1. compile a sample for testing:
1.1. a random sample of 30 encyclopedia entries was identified from a convenience sample of entries that comprise the letter s volumes of the 3rd edition. the entries range in length from 6 to 6,114 words. the median word count for entries in this sample is 99 words.
1.2. the entry terms selected for this study and their respective word counts are visualized in figure 4.
1.3. for each entry, the long-s terms in the original xml file were extracted to a list.
2. perform automatic subject indexing sequence upon entries to generate lists of terms:
2.1. using the 2018 and 1910 versions of the lcsh.
2.2. with fixed maximum subject heading results set to 40: 20 maximum terms returned with the 2018 lcsh, and 20 maximum terms returned with the 1910 lcsh.
2.3. before long-s correction and after long-s correction, using the oxygen xml editor tei to txt transformation.
3. perform a left outer join on python data frames, between terms generated when the long-s has been corrected (left) and terms generated when the long-s has not been corrected (right). the unmatched rows in the resulting left outer join are the terms that are omitted from the automatic indexing results if the long-s is not corrected to a standard small <s>. the quantity of terms omitted is recorded for comparison.
4. analysis: descriptive statistics were generated to determine central tendency for the number and percentage of terms omitted when the long-s is not corrected. the quantity of terms omitted is also visualized in a continuous scatterplot with the corresponding word counts, to demonstrate that the quantity of terms omitted when the long-s is not corrected seems to relate to the length of the document being automatically classified.
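the comparison in step 3 can be sketched in a few lines. this is a minimal illustration rather than the study's script: plain python lists stand in for the data frames, the function names and the sample terms are hypothetical, and the choice of the corrected result count as the denominator for the percentage is an assumption. with pandas, the same rows could be obtained from a `merge` using `how="left"` and `indicator=True`.

```python
def omitted_terms(corrected, uncorrected):
    """Return terms found after long-s correction that the uncorrected run missed.

    This keeps the unmatched rows of a left outer join in which the
    corrected results are the left-hand side.
    """
    uncorrected_set = set(uncorrected)
    return [term for term in corrected if term not in uncorrected_set]


def percent_omitted(corrected, uncorrected):
    """Share of the corrected results missing from the uncorrected results.

    Assumption: the denominator is the number of corrected results.
    """
    if not corrected:
        return 0.0
    return 100.0 * len(omitted_terms(corrected, uncorrected)) / len(corrected)


# Hypothetical indexing results for one entry (not data from the study).
corrected_terms = ["Sugar", "Yeast", "Fermentation", "Bread"]
uncorrected_terms = ["Fermentation", "Bread"]

print(omitted_terms(corrected_terms, uncorrected_terms))
print(percent_omitted(corrected_terms, uncorrected_terms))
```

running the same comparison once per vocabulary (2018 lcsh and 1910 lcsh) would give the per-vocabulary counts that the results section reports.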
results
the results report the prevalence of omitted terms when the long-s is not corrected to a standard <s>, as well as a visualization of the number of terms omitted as they relate to the encyclopedia entry length. for each of the 30 sample entries automatically indexed with hive, a fixed maximum of 40 terms was returned: a maximum of 20 terms using the 2018 lcsh, and a maximum of 20 terms using the 1910 lcsh. as seen in table 1, central tendency is measured using the arithmetic mean and median, along with the standard deviation and range. the average number of terms omitted from an entry’s results is 6.73, and the average percentage of terms omitted from an entry’s results is 26.51 percent, with the 2018 and 1910 editions of lcsh performing at similar rates. the full results are displayed in appendix a.
for each entry in the sample, the results in appendix a display the total terms omitted when the long-s is not corrected, the number of 2018 lcsh terms omitted, the number of 1910 lcsh terms omitted, and the encyclopedia entry word count. figure 5 visualizes the total number of terms omitted for each entry when the long-s is not corrected, demonstrating an increase in terms omitted for entries with lower word counts. these results are broken down by vocabulary used in figure 6, demonstrating that both vocabularies used to generate these results indicate a significant increase in omitted terms for shorter entries.
table 1. measures of centrality, standard deviation, range, and percentage for quantity of terms omitted when the long-s is not corrected to a standard <s>, rounded to the hundredth. for each entry, a maximum of 40 terms were returned: 20 using 2018 lcsh and 20 using 1910 lcsh. the total results returned vary according to the entry length. these totals are reported in appendix b. (n = 30 entries.)
measure                              both vocabularies   2018 lcsh   1910 lcsh
average, terms omitted               6.73                3.67        3.07
median, terms omitted                5                   3           2
standard deviation                   6.53                3.84        3.17
range, terms omitted                 0-24                0-13        0-11
average percentage, omitted terms    26.51%              27.51%      24.28%
median percentage, omitted terms     22.36%              20.00%      19.09%
figure 5. number of automatic subject indexing terms that are omitted when the long-s is not corrected to a standard <s> as compared by encyclopedia entry word count.
figure 6. number of automatic subject indexing terms that are omitted when the long-s is not corrected to a standard <s> as compared by encyclopedia entry word count, separated by controlled vocabulary version.
discussion
the analysis above presents measures of centrality for quantity of terms omitted if the long-s is not corrected to a standard <s> prior to automatic subject indexing using hive, as well as a visualization to represent the relationship between encyclopedia entry word count and number of terms omitted. although researchers have identified challenges with the long-s and have focused a great deal on the technologies and methods used to correct it, there is still limited work examining the results of not correcting the long-s character when performing an automatic subject indexing sequence. this research demonstrated an average of 6.73 potentially relevant terms omitted from automatic indexing results when the long-s is not corrected, accounting for an average of 26.51 percent of the total results, with an approximately equal distribution of omitted terms across the two controlled vocabulary versions used.
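the averages, medians, standard deviations, and ranges restated above can be recomputed from the per-entry counts listed in appendix a. the sketch below does so with python's standard `statistics` module; the data are copied from appendix a, while the variable and function names are illustrative only.

```python
import statistics

# (entry term, 2018 LCSH terms omitted, 1910 LCSH terms omitted, entry word count)
# Values copied from Appendix A.
APPENDIX_A = [
    ("sardis", 13, 11, 381), ("suction", 13, 11, 38),
    ("stylites, pillar saints", 13, 6, 199), ("shadwell", 10, 4, 211),
    ("salicornia", 6, 7, 254), ("sepulchre", 3, 8, 348),
    ("sitta nuthatch", 5, 4, 620), ("sprat", 3, 6, 475),
    ("serapis", 5, 3, 587), ("strada", 1, 7, 189),
    ("shoad", 4, 3, 463), ("sign", 5, 2, 68),
    ("shooting", 3, 3, 6114), ("strata", 3, 3, 2920),
    ("stewartia", 4, 1, 72), ("subclavian", 3, 2, 20),
    ("schweinfurt", 2, 2, 84), ("scroll", 2, 2, 45),
    ("spalatro", 3, 1, 99), ("special", 3, 1, 24),
    ("samogitia", 2, 1, 112), ("shakespeare", 0, 3, 3855),
    ("sinapism", 1, 1, 25), ("sect", 1, 0, 20),
    ("severino", 1, 0, 38), ("shaddock", 1, 0, 6),
    ("scarlet", 0, 0, 65), ("shallop, shalloop", 0, 0, 42),
    ("soldanella", 0, 0, 56), ("spoletto", 0, 0, 99),
]


def summarize(values):
    """Mean, median, sample standard deviation, and range, rounded to the hundredth."""
    return {
        "mean": round(statistics.mean(values), 2),
        "median": statistics.median(values),
        "stdev": round(statistics.stdev(values), 2),  # sample form (n - 1)
        "range": (min(values), max(values)),
    }


lcsh_2018 = [row[1] for row in APPENDIX_A]
lcsh_1910 = [row[2] for row in APPENDIX_A]
both = [a + b for a, b in zip(lcsh_2018, lcsh_1910)]

for label, values in (("both", both), ("2018 lcsh", lcsh_2018), ("1910 lcsh", lcsh_1910)):
    print(label, summarize(values))
```

with these values, the sample standard deviation (`statistics.stdev`) reproduces the 6.53 reported in table 1, whereas the population form (`statistics.pstdev`) would round to 6.42. the percentage rows of table 1 are not recomputed here because the per-entry result totals are not listed in appendix a.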
when the quantity of terms omitted is visualized using a continuous scatterplot, the results also demonstrated a significant increase in omitted terms for shorter entries, with longer entries less affected. these results reflect the impact of term frequency and total word count in keyword extraction and automatic subject indexing, with longer documents having a greater pool of total terms from which to identify key terms.
considering the complexities and similarities of the typographical characters in the original manuscript, the ocr output process for this corpus occasionally confuses the letters <s>, <f>, <r>, and <l>. as a result, an occasional long-s word in this study did not originally contain an <s> (e.g., sor instead of for). correction of these long-s ocr errors requires the development of a dictionary-based script. an additional complication of this research is that the corrected ocr output for the encyclopedia entries still contains a few errors not related to the long-s, which prevent the mapping of the affected term to any controlled vocabulary term (e.g., in the entry on sepulchre, the ocr output for the term palestine was palestinc).
these results are specific to this particular corpus of 3rd edition encyclopedia britannica entries, but it is very likely that testing another set of pre-1800s documents containing the long-s would also illustrate that for best results with any algorithm or tool, the long-s needs to be corrected. the results are also specific to the two versions of the lcsh used, the 1910 lcsh and the 2018 lcsh, which are available in the hive tool. the 1910 version is key for the time period being studied, and the more contemporary 2018 version has supported additional analysis on the impact of the long-s. both of these vocabularies are important to the larger 19th-century knowledge project.
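the dictionary-based correction mentioned above could take roughly the following shape. this is a sketch under stated assumptions, not the project's script: it assumes the long-s is still marked in the text as the character ſ (u+017f), it tries only <s> and <f> as readings, and the word list is a tiny hypothetical stand-in for a real dictionary.

```python
import re
from itertools import product

LONG_S = "\u017f"  # the long-s character

# Hypothetical word list; a real script would load a full dictionary.
LEXICON = {"sugar", "yeast", "for", "is", "used"}


def readings(word, alternatives="sf"):
    """Yield every spelling obtained by reading each long-s as one of the alternatives."""
    slots = [i for i, ch in enumerate(word) if ch == LONG_S]
    for choice in product(alternatives, repeat=len(slots)):
        chars = list(word)
        for i, ch in zip(slots, choice):
            chars[i] = ch
        yield "".join(chars)


def correct_word(word, lexicon):
    """Prefer the plain <s> reading; fall back to a unique dictionary match."""
    if LONG_S not in word:
        return word
    default = word.replace(LONG_S, "s")
    if default in lexicon:
        return default
    matches = [r for r in readings(word) if r in lexicon]
    return matches[0] if len(matches) == 1 else default


def correct_text(text, lexicon):
    """Apply correct_word to every word in a lowercase text."""
    return re.sub(r"\w+", lambda m: correct_word(m.group(0), lexicon), text)


# "\u017for" is the misrecognized word: a naive replacement would give "sor".
sample = "\u017fugar is u\u017fed \u017for yea\u017ft"
print(correct_text(sample, LEXICON))
```

a word that has no dictionary match under any reading is left with the plain <s> replacement, so errors unrelated to the long-s, such as palestinc, pass through unchanged and would need separate handling.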
it should be noted that while the lcsh is updated weekly, we were limited to what is available via the hive tool, and any discrepancies that may be found with the 2020 lcsh will very likely have a minimal effect upon metadata generation results. the 2020 lcsh will be incorporated into hive soon and can be explored in future research.
conclusion and next steps
the objective of this research was to determine the impact of correcting the long-s in pre-1800s documents when performing an automatic metadata generation sequence using keyword extraction and controlled vocabulary mapping. this was accomplished by performing an automatic subject indexing sequence using the hive tool, followed by a basic statistical analysis to determine the quantity of terms omitted from the results when the long-s is not corrected to a standard <s>. the number of omitted terms was also compared with the encyclopedia entry word count and visualized to demonstrate a significant increase in omitted terms for shorter encyclopedia entries. the study was conclusive in confirming that the correction of the long-s is a critical part of our workflow.
the significance of this research is that it demonstrates the necessity of correcting the long-s prior to performing automatic subject indexing on historical documents. beyond the correction of the long-s, the larger next steps for this project are to continue to explore automatic metadata generation for this corpus. these next steps include the comparison of results using contemporary vs. historical vocabularies and streamlining a protocol for bulk classification procedures and integration of terms into the tei-xml headers.
the research presented here can inform other digital humanities and even science-oriented projects, where researchers may not be aware of the impact of the long-s on automatic metadata generation not only for subjects, but also named entities, particularly when automatic approaches with controlled vocabularies are desired.
acknowledgements
the author thanks dr. jane greenberg and dr. peter logan for their guidance. the author acknowledges the support of the neh grant #haa-261228-18.
appendix a
terms omitted per entry when the long-s is not corrected (total, 2018 lcsh, and 1910 lcsh), with the encyclopedia entry word count.
entry term                 total   2018 lcsh   1910 lcsh   word count
sardis                     24      13          11          381
suction                    24      13          11          38
stylites, pillar saints    19      13          6           199
shadwell                   14      10          4           211
salicornia                 13      6           7           254
sepulchre                  11      3           8           348
sitta nuthatch             9       5           4           620
sprat                      9       3           6           475
serapis                    8       5           3           587
strada                     8       1           7           189
shoad                      7       4           3           463
sign                       7       5           2           68
shooting                   6       3           3           6114
strata                     6       3           3           2920
stewartia                  5       4           1           72
subclavian                 5       3           2           20
schweinfurt                4       2           2           84
scroll                     4       2           2           45
spalatro                   4       3           1           99
special                    4       3           1           24
samogitia                  3       2           1           112
shakespeare                3       0           3           3855
sinapism                   2       1           1           25
sect                       1       1           0           20
severino                   1       1           0           38
shaddock                   1       1           0           6
scarlet                    0       0           0           65
shallop, shalloop          0       0           0           42
soldanella                 0       0           0           56
spoletto                   0       0           0           99
appendix b
*n = 30 entries
                         average terms returned   median terms returned
corrected                24.77 / 40 possible      28 / 40 possible
uncorrected              26.47 / 40 possible      29 / 40 possible
2018 lcsh corrected      14.10 / 20 possible      19 / 20 possible
2018 lcsh uncorrected    13.47 / 20 possible      18.5 / 20 possible
1910 lcsh corrected      11.27 / 20 possible      11 / 20 possible
1910 lcsh uncorrected    10.13 / 20 possible      9 / 20 possible
endnotes
1 liz woolcott, “understanding metadata: what is metadata, and what is it for?,” routledge
(november 17, 2017), https://doi.org/10.1080/01639374.2017.1358232; koraljka golub et al., “a framework for evaluating automatic indexing or classification in the context of retrieval,” journal of the association for information science and technology 67, no. 1 (2016), https://doi.org/10.1002/asi.23600; lynne c. howarth, “metadata and bibliographic control: soul-mates or two solitudes?,” cataloging & classification quarterly 40, no. 3-4 (2005), https://doi.org/10.1300/j104v40n03_03.
2 a. belaid et al., “automatic indexing and reformulation of ancient dictionaries” (paper presented at the first international workshop on document image analysis for libraries, palo alto, ca, 2004), https://doi.org/10.1109/dial.2004.1263264.
3 beatrice alex et al., “digitised historical text: does it have to be mediocre” (paper presented at the konvens 2012 (lthist 2012 workshop), vienna, september 21, 2012); ted underwood, “a half-decent ocr normalizer for english texts after 1700,” the stone and the shell, december 10, 2013, https://tedunderwood.com/2013/12/10/a-half-decent-ocr-normalizer-for-english-texts-after-1700/.
4 “nineteenth-century knowledge project,” (github repository), 2020, https://tu-plogan.github.io/.
5 “nineteenth-century knowledge project.”
6 marcia lei zeng and lois mai chan, “metadata interoperability and standardization a study of methodology, part ii,” d-lib magazine 12, no. 6 (2006); g. bueno-de-la-fuente, d. rodríguez mateos, and j. greenberg, “chapter 10 automatic text indexing with skos vocabularies in hive” (elsevier ltd, 2016); sheila bair and sharon carlson, “where keywords fail: using metadata to facilitate digital humanities scholarship,” journal of library metadata 8, no. 3 (2008), https://doi.org/10.1080/19386380802398503.
7 john walsh, “the use of library of congress subject headings in digital collections,” library review 60, no. 4 (2011), https://doi.org/10.1108/00242531111127875.
8 jane greenberg et al., “hive: helping interdisciplinary vocabulary engineering,” bulletin of the american society for information science and technology 37, no. 4 (2011), https://doi.org/10.1002/bult.2011.1720370407.
9 sam grabus et al., “representing aboutness: automatically indexing 19th-century encyclopedia britannica entries,” nasko 7 (2019), pp. 138-48, https://doi.org/10.7152/nasko.v7i1.15635.
10 karen attar, “s and long s,” in oxford companion to the book, eds. michael felix suarez and h. r. ii woudhuysen (oxford: oxford university press, 2010); ingrid tieken-boon van ostade, “spelling systems,” in an introduction to late modern english (edinburgh university press, 2009).
11 andrew west, “the rules for long-s,” tugboat 32, no. 1 (2011).
12 attar, “s and long s.”
editorial board thoughts
a&i databases: the next frontier to discover
mark dehmlow
information technology and libraries | march 2015 1
i think it is fair to say that the discovery technology space is a relatively mature market segment, not complete, but mature.
much of the easy-to-negotiate content has been negotiated, and many of the systems on the market are above or approaching a billion records. this would seem a lot, but there is a whole slice of tremendously valuable content still not fully available across all platforms, namely the specialized subject abstracting and indexing database content. this content has a lot of significant value for the discovery community: many of those databases go further back than content pulled from journal publishers or full-text databases. equally important is that they represent an important portion of humanities and social sciences content that is less represented in discovery systems as compared to stem content. for vendors of a&i content, the concerns are clear and realistic: unlike journal publishers, whose metadata is meant to direct users to their main content (full text), the metadata for a&i publishers is the main content. according to a recent nfais report, a major concern for them is that if they include their content in discovery systems, they “risk loss of brand awareness” and the implication is that institutions will be more likely to cancel those subscriptions.1 the focus therefore seems to have been how to optimize the visibility of their content in discovery systems before being willing to share it.
in addition to the nfais report, some of the conversations i have seen on the topic seem to focus on wanting discovery system providers to meet a more complex set of requirements that will maximize leveraging the rich metadata contained in those resources, the idea being that utilizing that metadata in specific ways will increase the visibility of the content. in principle i think it is a commendable goal to maximize the value of the comprehensive metadata a&i records contain, and the complexities of including a&i data in discovery systems need to be carefully considered, namely blending multiple subject and authority vocabularies, and ensuring that metadata records are appropriately balanced with full text in the relevancy algorithm. but i also worry that setting too many requirements that are too complicated will lead to delayed access and biased search results. it is important that this content is blended in a meaningful way, but determining relevancy is a complex endeavor, and it is critically important for relevancy to be unbiased from the content provider perspective and instead focus on the user, their query, and the context of their search.
mark dehmlow (mark.dehmlow@nd.edu), a member of the ital editorial board, is program director, library information technology, university of notre dame, south bend, in.
another concern that i have heard articulated is that results in discovery services are unlikely to be as good as native a&i systems because of the already mentioned blending issues. this is likely to be true, but i think it is critical to focus on the purpose of discovery systems.
as donald hawkins recently wrote in a summary of a workshop called “information discovery and the future of abstracting and indexing services,” “a&i services provide precision discipline-specific searching for expert researchers, and discovery services provide quick access to full text.”2 hawkins indicates that discovery systems are not meant to be sophisticated search tools, but rather a quick means to search a broad range of scholarly resources and, i think, sometimes a quick starting point for researchers. because of the nature of merging billions of scholarly records into a single system, discovery systems will never be able to provide the same experience as a native a&i system, nor should they. over time, they may become better tuned to provide a better overall experience for the three different types of searchers we have in higher education: novice users like undergraduates looking for a quick resource, advanced users like graduate students and faculty looking for more comprehensive topical coverage, and expert users like librarians who want sophisticated search features to hone in on the perfect few resources. many of the discovery systems are working on building these features, but the industry will take time to solve this problem, and i tend to look at things through the lens of our end users: non-inclusion of this content directly impacts their overall discovery experience.
one might ask, if the discovery system experience isn’t as precise and complete as the native a&i experience, why bother? in addition to broadening the subject scope by including many of the more narrow and deep subject metadata, there is also the importance of serendipitous finding.
that content, in the context of a quick user search, may drive the user to just the right thing that they need. in addition, my belief is that with that content, we can build search systems that are deeper than google scholar, and by extension provide our end users with a superior search experience. and so i advocate for innovating now instead of waiting to work out all of the details. i am not suggesting moving forward callously, but swiftly. the work that niso has done on the open discovery initiative has resulted in some good recommendations about how to proceed. for example, they have suggested two usage metrics that could be valuable for measuring a&i content use in discovery systems: search counts (by collection and customer for a&i databases) and results clicks (number of times an end user clicks on a content provider’s content in a set of results).3
while i think these types of metrics are aligned with the types of measures that libraries evaluate a&i database usage by, i think at the same time they don’t really say much about the overall value of the resources themselves. sometimes in the library profession, our obsession for counting stuff loses connection with collecting metrics that actually say something about impact. of the two counts, i could see perhaps counting the result clicks as having more value. in this instance, knowing that a user found something of interest from a specific resource at the very least indicates that it led the user some place. i think the measure of search counts by collection is less useful.
at best it indicates that the resource was searched, but it tells us nothing about who was searching for an item, what they found, or what they subsequently did with the item once they found it. i do think we in libraries need to consider the bigger picture. regardless of the number of searches (which doesn’t really tell us anything anyway), we need to recognize the value alone of including the a&i content, and instead of trying to determine the value of the resource by the number of times it was searched, focus more on the breadth of exposure that content is getting by inclusion in the discovery system.
i think a more useful technical requirement for discovery providers would be to provide pathways to specific a&i resources within the context of a user’s search, not dissimilar to how google places sponsored content at the top of their search results, a kind of promotional widget. in this case, using metadata returned from the query, the systems could calculate which one or two specific resources would guide the user to more in-depth research. by virtue of inclusion of the resource in the discovery system, those resources could become part of the promotional widget. this would guide users back to the native a&i resource, which both libraries and a&i providers want, and it would do that in a more intuitive and meaningful way for the end user.
all of the parties involved in the discovery discussion can bring something to the table if we want to solve these issues in a timely way.
i hope that a&i publishers and discovery system providers make haste and get agreements underway for content sharing, and i would recommend that instead of requiring finished implementations based on complex requirements before loading content, both of them should focus on some achievable short- and long-term goals. integrating a&i content perfectly will take some time to complete, and the longer we wait, the longer our users have a sub-optimal discovery experience. discovery providers need to make long-term commitments to developing mechanisms that satisfy usage metrics for a&i content, although i would recommend defining measures that have true value. a&i providers should be measured in their demands: while their stakes in system integration are real, there runs a risk of content providers vying for their content to be preferred when relevancy neutrality is paramount for a discovery system to be effective. i think it is worth lauding the efforts of a few trailblazing a&i publishers such as thomson reuters and proquest who have made agreements with some of the discovery providers and are sharing their a&i content already, providing some precedent for sharing a&i content. lastly, libraries and knowledge workers need to develop better means for calculating overall resource value, moving beyond strict counts to thinking of ways to determine the overall scholarly/pedagogical impact of those resources, and they need to make the fact alone that an a&i publisher shares its data with a discovery provider indicate significant value for the resource.
references
1. nfais, recommended practices: discovery systems. nfais, 2013. https://nfais.memberclicks.net/assets/docs/bestpractices/recommended_practices_final_aug_2013.pdf.
2. hawkins, donald t., “information discovery and the future of abstracting and indexing services: an nfais workshop.” against the grain, 2013. http://www.against-the-grain.com/2013/08/information-discovery-and-the-future-of-abstracting-and-indexing-services-an-nfais-workshop/.
3. open discovery initiative working group, open discovery initiative: promoting transparency in discovery. baltimore: niso, 2014. http://www.niso.org/apps/group_public/download.php/13388/rp-19-2014_odi.pdf.
editor’s comments
bob gerrity
information technology and libraries | september 2013 3
this month’s issue
in this month’s issue, we welcome back the president’s message column, with incoming lita president cindi trainor describing upcoming lita events, priorities, and opportunities for members. university of denver mlis candidate gina schlesselman-tarango contributes a compelling piece describing the background, use, and potential library application of searchable signatures in web 2.0 applications such as instagram. jenny emanuel from university of illinois reports on the complex relationship that millennial academic librarians have with technology. kristina l. southwell and jacquelyn slater from university of oklahoma present the findings of a study evaluating the accessibility of special collections finding aids to screen readers for visually impaired users. ping fu from central washington university and moira fitzgerald from yale university look at the potential effects of cloud-based next-generation library services platforms on staffing models for systems and technical-services departments.
visiting the discovery side of library services, megan johnson from appalachian state university reports on usability testing of appalachian’s “one box” integrated articles and catalog search, using innovative interfaces’ encore discovery service.
speaking of usability, i had the chance recently to observe a usability testing session for my library’s website, and was reminded of the importance of designing library websites and delivering web-based library services that will actually be of value to our users, delivered with their context in mind rather than ours. my library, like many others, has a website rich in content and complexity and organized around our structure. to the user i was observing, the complexity and library-centric organization clearly were obstacles to the rich content we offer. an undergraduate art history major, she was primarily interested in library resources and services that were directly connected to her coursework and that were accessible from the university’s learning management system (lms). she valued the convenience of direct access from the lms to library-managed course readings and past exam papers. but, when asked to navigate to the same resources using the library homepage as a starting point, rather than the lms, she quickly became frustrated and confused by the overload of search options with (to her) confusing labels. she was further stymied by our proclivity to make things more complex than they need to be (or should be). a simple example: a common occurrence at the beginning of semester is that students with outstanding library fines/fees are blocked from registering for classes. rather than providing a simple, direct “resolve my library fees” link, with clear instructions on how to fix their problem, as quickly as possible, we instead provide pages of information about how and why the fines/fees were calculated, with no link to a solution to the problem at hand. my takeaways from the session were that (1) our website needs to be radically simplified and (2) we should be focussing on designing and delivering services that can be embedded in the context of the user’s natural workflows, not the library’s. easier said than done, of course.
bob gerrity (r.gerrity@uq.edu.au) is university librarian, university of queensland, australia.
reviewers needed
the ital editorial board has room for a couple of additional members, to help us keep up with incoming article submissions. if you have a passion for library technology, a willingness to undertake a few reviews each year, and are a member of lita (or willing to join), please send me an e-mail indicating your interest and area(s) of expertise. as always, suggestions and feedback on ital are welcome, at the e-mail address above.
technical communications
isad/solinet to sponsor institute
"networks and networking ii; the present and potential" is the theme of an isad institute to be held at the braniff place hotel on february 27-28, 1975, in new orleans. the sponsors are the information science and automation division of ala and the southeastern library network (solinet). this second institute on networking will be an extension of the previous one held in new orleans a year ago. the ground covered in that previous institute will be the point of departure for "networks ii." the purpose of the previous institute was to review the options available in networking, to provide a framework for identifying problems, and to suggest evaluation strategies to aid in choosing alternative systems. while the topics covered in the previous institute will be briefly reviewed in this one, some speakers will take different approaches to the subject of networking, while other speakers will discuss totally new aspects.
in addition to the papers given and the resultant questions and answers from the floor, a period of round table discussions will be held during which the speakers can be questioned on a person-to-person basis. a new feature to isad institutes now being planned will be the presence of vendors' exhibits. arrangements are being made with the many vendors and manufacturers whose services are applicable to networking to exhibit their products and systems. it is hoped that many of them will be interested in responding to this opportunity.
the program will include:
"a systems approach to selection of alternatives": resource sharing, components, communications options, planning strategy. joseph a. rosenthal, university of california, berkeley.
"state of the nation": review of current developments and an evaluation. brett butler, butler associates.
"the library of congress, marc, and future developments." henriette d. avram, library of congress.
"data bases, standards and data conversions": existing data bases, characteristics, standardization, problems. john f. knapp, richard abel & co.
"user products": possibilities for product creation, the role of user products. maurice freedman, new york public library.
"on-line technology": hardware and software considerations, library requirements, standards, cost considerations of alternatives. philip long, state university of new york, albany.
"publishers' view of networks": copyright, effect on publishers, effect on authorship, impact on jobbers, facsimile transmission. carol nemeyer, association of american publishers.
"national library of canada": current and anticipated developments, cooperative plans in canada, international cooperation. rodney duchesne, national library of canada.
"administrative, legal, financial, organizational and political considerations": actual and potential problems, organizational options, financial commitment, governance. fred kilgour, oclc.
registration will be $75.00 to members of ala and staff members of solinet institutions, $90.00 to nonmembers, and $10.00 to library school students. for hotel reservation information and registration blanks, contact donald p. hammer, isad, american library association, 50 e. huron st., chicago, il 60611; 312-944-6780.

316 journal of library automation vol. 7/4 december 1974

regional projects and activities

indiana cooperative library services authority

the first official meeting of the board of directors of the indiana cooperative library services authority (incolsa) was held june 4, 1974, at the indiana state library in indianapolis. a direct outgrowth of the cooperative bibliographic center for indiana libraries (cobicil) feasibility study project sponsored by the indiana state library and directed by mrs. barbara evans markuson, incolsa has been organized as an independent not-for-profit organization "to encourage the development and improvement of all types of library service." to date, contracts have been signed by sixty-one public, thirteen academic, fourteen school, and five special libraries, a total of ninety-three libraries. incolsa is being funded initially by a three-year establishment grant from the u.s. office of education, library services and construction act (lsca) title i funds. officers are: president, harold baker, head of library systems development, indiana state university; vice-president, dr. michael buckland, assistant director for technical services, purdue university libraries; secretary, mary hartzler, head of catalog division, indiana state library; treasurer, mary bishop, director of the crawfordsville book processing center; and three directors-at-large: phil hamilton, director of the kokomo public library; edward a. howard, director of the evansville-vanderburgh county public library; and sena kautz, director of media services, duneland school corporation.
stanford's ballots on-line files publicly available through spires

september 16, 1974

the stanford university libraries automated technical processing system, ballots (bibliographic automation of large library operations using a timesharing system), has been in operation for twenty-two months and supports the acquisition and cataloging of nearly 90 percent of all materials processed. important components of the ballots operations are several on-line files accessible through an unusually powerful set of indexes. currently available are: a file of library of congress marc data starting from january 1, 1972 (with a gap from may to august 1972); an in-process file of individual items being purchased by stanford; an on-line catalog (the catalog data file) of all items cataloged through the system, whether copy was derived from library of congress marc data, was input from non-marc cataloging copy, or resulted from stanford's own original cataloging efforts; and a file of see, see also, and explanatory references (the reference file) to the catalog data file. in addition, during september and october 1974, the 85,000 bibliographic and holdings records (already in machine-readable form on magnetic tape) representing the entire j. henry meyer memorial undergraduate library were converted to on-line meyer catalog data and meyer reference files in ballots. these files are publicly available through spires (stanford public information retrieval system) to any person with a terminal that can dial up the stanford center for information processing's academic computer services computer (an ibm 360 model 67) and who has a valid computer account.
the marc file can be searched through the following index points:
• lc card number
• personal name
• corporate/conference name
• title

the in-process, catalog data, and reference files for stanford and for meyer can also be searched as spires public subfiles through the following index points:
• ballots unique record identification number
• personal name
• corporate/conference name
• title
• subject heading (catalog data and reference file records only)
• call number (catalog data and reference file records only)
• lc card number

the title and corporate/conference name indexes are word indexes; this means that each word is indexed individually. search requests may draw on more than one index at a time by using the logical operators "and," "or," and "and not" to combine index values sought. if you plan to use spires to search these files, or if you would like more information, a publication called guide to ballots files may be ordered by writing to: editor, library computing services, s.c.i.p.-willow, stanford university, stanford, ca 94305. this document contains complete information about the ballots files and data elements, how to open an account number, and how to use spires to search ballots files. a list of ballots publications and prices is also available on request. as additional libraries create on-line files using ballots in a network environment, these files will also be available. these additions will be announced in jola technical communications.

data base news

interchange of aip and ei data bases

a national science foundation grant (gn-42062) for $128,700 has been awarded to the american institute of physics (aip), in cooperation with engineering index (ei), for a project entitled "interchange of data bases." the grant became effective on may 1, 1974, for a period of fifteen months. the project is intended to develop methods by which ei and aip can reduce their input costs by eliminating duplication of intellectual effort and processing.
through sharing of the resources of the two organizations and an interchange of their respective data bases, aip and ei expect to improve the utilization of these computer-readable data bases. the basic requirement for the development of the interchange capability for computer-readable data bases is the establishment of a compatible set of data elements. each organization has unique data elements in its data base. it will therefore be necessary to determine which of the data elements are absolutely essential to each organization's services, which elements can be modified, and what other elements must be added. after the list of data elements has been established, it will be possible to write the specifications and programs for format conversions from aip to ei tape format and vice versa. simultaneously, there will be the development of language conversion facilities between ei's indexing vocabulary and aip's physics and astronomy classification scheme (pacs). it is also planned to investigate the possibility of establishing a computer program which can convert aip's indexing to ei's terms and vice versa. with the accomplishment of the above tasks, it will be possible to create new services and repackage existing services to satisfy the information demands in areas of mutual interest to engineers and physicists, such as acoustics and optics.

eric data base users conference

the educational resource information center (eric) held an eric data base users conference in conjunction with the 37th annual meeting of the american society for information science (asis) in atlanta, georgia, october 13-17, 1974. the eric data base users conference provided a forum for present and potential eric users to discuss common problems and concerns as well as interact with other components of the eric network: central eric, the eric processing and reference facility, eric clearinghouse personnel, and information dissemination centers.
although attendees have in the past been primarily oriented toward machine use of the eric files, all patterns of usage were represented at this conference, from manual users of printed indexes to operators of national on-line retrieval systems. a number of invited papers were presented dealing with subjects such as:
• the current state and future directions of educational information dissemination. sam rosenfeld (nie), lee burchinal (nsf).
• what services, systems, and data bases are available? marvin gechman (information general), harvey marron (nie).
• the roles of libraries and industry, respectively, in disseminating educational information. richard de gennaro (university of pennsylvania), paul zurkowski (information industry association).

several organizations (national library of canada, university of georgia, wisconsin state department of education) were invited to participate in "show and tell" sessions to describe in detail how they are using the eric system and data base. a status report covering eric on-line services for educators was presented by dr. carlos cuadra (system development corporation) and dr. roger summit (lockheed). interactive discussion groups covered a number of subjects including:
• computer techniques: programming methods, use of utilities, file maintenance, search system selection, installation, and operation.
• serving the end user of educational information.
• introduction to the eric system: what tools, systems, and services are available and how are they used?
• beginning and advanced sessions on computer searching the eric files. on-line terminals were used to demonstrate and explain use of machine capabilities.

commercial services and developments

scope data inc. ala train compatible terminal printers

scope data inc. currently is offering a high-speed, nonimpact terminal printer for use in various interactive printing applications.
capability can be included in the series 200 printer as an extra-cost feature to print the eight-bit ascii character set for ala character set with 176 characters. for further information contact alan g. smith, director of marketing, scope data inc., 3728 silver star rd., orlando, fl 32808.

institute for scientific information puts life sciences data base on-line through system development corporation

the institute for scientific information (isi) has announced that it will collaborate with system development corporation (sdc) to provide on-line, interactive, computer searches of the life sciences journal literature. scheduled to be fully operational by july 1, 1974, the isi-sdc service is called scisearch® and is designed to give quick, easy, and economical access to a large life sciences literature file. stressing ease of access, the sdc retrieval program, orbit, permits subscribers to conduct extremely rapid literature searches through two-way communications terminals located in their own facilities. after examining the preliminary results of their inquiries, searchers are able to further refine their questions to make them broader or narrower. this dialog between the searcher and the computer (located in sdc's headquarters in santa monica, california) is conducted with simple english-language statements. because this system is tied in to a nationwide communications network, most subscribers will be able to link their terminals to the computer through the equivalent of a local phone call. covering every editorial item from about 1,100 of the world's most important life sciences journals, the service will initially offer a searchable file of over 400,000 items published between april 1972 and the present. each month approximately 16,000 new items will be added until the average size of the file totals about one-half million items and represents two-and-one-half years of coverage.
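the file-size figures quoted above can be checked with a little arithmetic. the sketch below is only an illustration: it assumes the file behaves as a rolling window that retains the most recent thirty months of additions, which the announcement implies but does not state, and the function names are hypothetical.

```python
# Figures taken from the announcement; the rolling-window model is an assumption.
INITIAL_ITEMS = 400_000      # searchable items at launch
MONTHLY_ADDITIONS = 16_000   # approximate new items added each month
WINDOW_MONTHS = 30           # two-and-one-half years of coverage


def steady_state_size(monthly_additions: int, window_months: int) -> int:
    """Size of a rolling file that keeps only the most recent window_months."""
    return monthly_additions * window_months


def months_to_reach(initial: int, target: int, monthly_additions: int) -> int:
    """Whole months of additions needed to grow from initial to at least target."""
    remaining = max(0, target - initial)
    return -(-remaining // monthly_additions)  # ceiling division


if __name__ == "__main__":
    print(steady_state_size(MONTHLY_ADDITIONS, WINDOW_MONTHS))
    print(months_to_reach(INITIAL_ITEMS, 500_000, MONTHLY_ADDITIONS))
```

under these assumptions the steady-state file holds 480,000 items, which agrees with the "about one-half million" quoted, and the initial file would pass the half-million mark after seven months of additions.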
to assure subscribers maximum retrieval effectiveness when dealing with this massive amount of information, the data base can be searched in several ways. included are searches by keywords, word stems, word phrases, authors, and organizations. one of the search techniques utilized, citation searching, is an exclusive feature of the isi data base. for every item retrieved through a search, subscribers can receive a complete bibliographic description that includes all authors, journal citation, full title, a language indicator, a code for the type of item (article, note, review, etc.), an isi accession number, and all the cited references contained in the retrieved article. the accession number is used to order full-text copies of relevant items through isi's original article tear sheet service (oats®). this ability to provide copies of every item in the data base distinguishes the isi service from many others.

current library of congress catalog on-line for reference searches

information dynamics corporation (idc) has agreed to collaborate with system development corporation (sdc) to provide reference librarians, researchers, and scholars with on-line interactive computer searches of all library materials being cataloged by the library of congress. scheduled to be fully operational as of october 1, 1974, the sdc-idc service is called sdc-idc/libcon and is designed to give quick, easy, and economical access to a large portion of the world's scholarly library materials. as in the isi service described above, the data base can be searched in several ways. included are compound logic searches by keywords, word stems, word phrases, authors, organizations, and subject headings for most english materials. one of the search techniques utilized, string searching, is an exclusive feature of sdc's orbit system.
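compound logic searching over keywords and word stems, as described in the announcements above, can be illustrated with a small word index. this is a minimal sketch and not the orbit or spires implementation: the names `build_word_index` and `lookup`, the trailing `*` used to mark a word stem, and the sample titles are all hypothetical.

```python
from collections import defaultdict


def build_word_index(records):
    """Map each lowercased word of each title to the set of record ids containing it."""
    index = defaultdict(set)
    for record_id, title in records.items():
        for word in title.lower().split():
            index[word].add(record_id)
    return index


def lookup(index, term):
    """Return ids matching an exact word, or a word stem when term ends with '*'."""
    term = term.lower()
    if term.endswith("*"):
        stem = term[:-1]
        hits = set()
        for word, ids in index.items():
            if word.startswith(stem):
                hits |= ids
        return hits
    return set(index.get(term, set()))


# Hypothetical sample records, for illustration only.
records = {
    1: "library automation handbook",
    2: "automated cataloging in libraries",
    3: "physics of sound",
}
index = build_word_index(records)

# The logical operators map onto set operations.
both = lookup(index, "librar*") & lookup(index, "automation")           # "and"
either = lookup(index, "physics") | lookup(index, "automation")         # "or"
excluding = lookup(index, "librar*") - lookup(index, "automation")      # "and not"
```

because every word is indexed individually, each search term resolves to a set of record identifiers, and "and," "or," and "and not" reduce to set intersection, union, and difference.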
keyword searching of cataloged items including all foreign materials processed by the library of congress is an exclusive feature of the idc data base not currently available in other on-line marc files. for individual items retrieved through a search, subscribers can receive a bibliographic description that includes authors, full title, an idc accession number, the lc classification number, and publisher information.

standards

the isad committee on technical standards for library automation invites your participation in the standards game

editor's note: the tesla reactor ballot will be provided in forthcoming issues. to use, photocopy the ballot form, fill out, and mail to: john c. kountz, associate for library automation, office of the chancellor, the california state university and colleges, 5670 wilshire blvd., suite 900, los angeles, ca 90036.

the procedure

this procedure is geared to handle both reactive (originating from the outside) and initiative (originating from within ala) standards proposals to provide recommendations to ala's representatives to existing, recognized standards organizations. to enter the procedure for an initiative standards proposal you must complete an "initiative standards proposal" using the outline which follows:

initiative standard proposal outline: the following outline is designed to facilitate review by both the committee and the membership of initiative standards proposals and to expedite the handling of the initiative standard proposal through the procedure. since the outline will be used for the review process, it is to be followed explicitly. where an initiative standard requirement does not require the use of a specific outline entry, the entry heading is to be used followed by the words "not applicable" (e.g., where no standards exist which relate to the proposal, this is indicated by: vi. existing standards. not applicable).
note that the parenthetical statements following most of the outline entry descriptions relate to the ansi standards proposal section headings to facilitate the translation from this outline to the ansi format. all initiative standards proposals are to be typed, double spaced on 8½ x 11" white paper (typing on one side only). each page is to be numbered consecutively in the upper right-hand corner. the initiator's last name followed by the key word from the title is to appear one line below each page number.

i. title of initiative standard proposal (title).
ii. initiator information (foreword).
a. name
b. title
c. organization
d. address
e. city, state, zip
f. telephone: area code, number, extension
iii. technical area. describe the area of library technology as understood by initiator. be as precise as possible since in large measure the information given here will help determine which ala official representative might best handle this proposal once it has been reviewed and which ala organizational component might best be engaged in the review process.
iv. purpose. state the purpose of standard proposal (scope and qualifications).
v. description. briefly describe the standard proposal (specification of the standard).
vi. relationship of other standards. if existing standards have been identified which relate to, or are felt to influence, this standard proposal, cite them here (expository remarks).
vii. background. describe the research or historical review performed relating to this standard proposal (if applicable, provide a bibliography) and your findings (justification).
viii. specifications. (optional) specify the standard proposal using record layouts, mechanical drawings, and such related documentation aids as required in addition to text exposition where applicable (specifications of the standard).

kindly note that the outline is designed to enable standards proposals to be written following a generalized format which will facilitate their review.
in addition, the outline permits the presentation of background and descriptive information which, while important during any evaluation, is a prerequisite to the development of a standard.

tesla reactor ballot
identification number for standing requirement: ______
reactor information
name: ______ title: ______
organization: ______
address: ______
city: ______ state: ______ zip: ______
telephone (area code, number, ext.): ______
need (for this standard): for [ ] against [ ]
specification (as presented in this requirement): for [ ] against [ ]
can you participate in the development of this standard? yes [ ] no [ ]
reason for position: (use format of proposal. additional pages can be used if required)

the reactor ballot is to be used by members to voice their recommendations relative to initiative standards proposals. the reactor ballot permits both "for" and "against" votes to be explained, permitting the capture of additional information which is necessary to document and communicate formal standards proposals to standards organizations outside of the american library association. as you, the members, use the outline to present your standards proposals, tesla will publish them in jola-tc and solicit membership reaction via the reactor ballot. throughout the process tesla will ensure that standards proposals are drawn to the attention of the applicable american library association division or committee. thus, internal review usually will proceed concurrently with membership review. from the review and the reactor ballot tesla will prepare a "majority recommendation" and a "minority report" on each standards proposal. the majority recommendation and minority report so developed will then be transmitted to the originator, and to the official american library association representative on the appropriate standards organization where it should prove a source of guidance as official votes are cast.
in addition, the status of each standards proposal will be reported by tesla in jola-tc via the standards scoreboard. the committee (tesla) itself will be nonpartisan with regard to the proposals handled by it. however, the committee does reserve the right to reject proposals which after review are not found to relate to library automation.

input

to the editor: we have been asked by the members of the ala interdivisional committee on representation in machine readable form of bibliographic information (marbi) to respond to your editorial in the june 1974 issue of the journal of library automation. this editorial dealt with the council of library resources' [sic] involvement in a wide range of projects, ranging from the sponsorship of a group which is attempting to develop a subset of marc for use in inter-library exchange of bibliographic data (cembi), to management of a project which has as its goal the creation of a national serials data base (conser), and, more recently, to the convening of a conference of library and a&i organizations to discuss the outlook for comprehensive national bibliographic control. you raised several legitimate questions: 1) has sufficient publicity been given to these activities of the council so that all, not just a few, libraries are aware of what is happening and have an opportunity to exert an influence on developments? and, 2) is the council bypassing existing channels of operation and communication? you also suggest that proposals from groups such as cembi be channeled through an official ala committee such as marbi for intensive review and evaluation. it should be pointed out that marbi is not charged with the development of standards. it acts to monitor and review proposals affecting the format and content of machine readable bibliographic data, where that data has implications for national or international use.
this applies to proposals emanating from cembi and conser as well as from other concerned groups. all indications to date are that the council is fully aware of marbi's role and will not bypass marbi. a number of members of marbi are also members of cembi and marbi is represented on the conser project. also reassuring is the fact that, unless we allow lc to fall by the wayside in its role as the primary creator and distributor of machine readable data, any standards for format or content developed by a council-sponsored group will eventually be reflected in the marc records distributed by lc. the library of congress has issued a statement, published in the june 1974 issue of jola, to the effect that it will not implement any changes in the marc distribution system which are not acceptable to marbi. marbi and lc have worked out a procedure whereby all proposed changes to marc are submitted to marbi. they are then published in jola and distributed to members of the marc users discussion group for comments. comments are collected and evaluated by marbi and a report submitted to lc, with its recommendations. the marbi review process does not guarantee perfection and there is no assurance that everyone will be satisfied. compromise and expediency are the name of the game in this extremely complicated and uncharted area of standards for machine readable bibliographic data. however, the council has undoubtedly learned from the isbd(m) experience that it cannot make decisions which affect libraries without the greatest possible involvement of librarians. it is the feeling of the marbi committee members that the council intends to work with marbi in future projects which fall into marbi's area of concern.

velma veneziano, marbi past chairperson
ruth tighe, chairperson

editor's note: it is gratifying to note that marbi's response reflects the opinions expressed in the june 1974 editorial.
the library community will doubtless be pleased to learn of clr's intention to work closely with marbi. -skm

to the editor: as briefly discussed with you, your editorial in the june 1974 issue of jola is both admirable and disturbing (to me, at least). the problem of national leadership in the area of library automation is a critical problem indeed. being in the "boondocks" and far removed from the scene of action, i can only express to you my perception as events and activities filter through to me. i can remember as far back as 1957 when adi had a series of meetings in washington, d.c. trying to establish a national program for bibliographic automation. i have been through eighteen years of meetings, committees, conferences, etc. concerned with trying to develop a national plan for bibliographic automation and information storage and retrieval systems. i have worked with nsf, usoe, department of commerce, u.s. patent office, engineering and technical societies, dod agency: the entire spectrum. i spent a good many years working in adi and asis, sla, and most recently ala. at no time were we able to make significant progress towards a national system. even the great airlie house conference did not produce any significant changes in the fragmented, competitive "non-system." it has only been in the recent past since clr has taken an aggressive posture that i am able to see the beginning of orderly development of a national automated bibliographic system. i certainly agree that any topic as critical as those being discussed by cembi should be in the public domain, but i also believe that the progress made by cembi would not have been possible without clr taking the initiative in getting these key agencies together. thank goodness someone quit talking and started doing something at the national level!
i sincerely believe that in the absence of a national library and with the current lack of legally derived authority in this arena, clr provides a genuine service to the total library community in establishing cembi. hopefully, your very excellent article (in the same issue of jola) on "standards for library automation ... " will help to put the entire issue of bibliographic record standards into perspective. as a former chemist and corrosion engineer, i am fully aware of the absolute necessity for technical standards. i am also fully aware of the necessity of developing technical standards through the process you outlined in your article. hopefully, clr action with cembi will expedite this laborious process and help to push our profession forward into the twentieth century. since we ourselves have not been able to do it through all these years, i am personally grateful that some group such as clr took the initiative and forced us to do what we should have done years ago.

maryann duggan, slice office director

editor's note: positive action and progressive movement are, of course, desirable and are often lacking in large organizations. however, positive action without communication of this action to the affected population can only be detrimental. on issues of the complexity of those addressed by cembi and conser, review by the library community is always useful, even though action may be temporarily delayed. -skm

to the editor: on page 233 of the september issue of jola there is a report from the information industry association's micropublishing committee chairman (henry powell). he states that "... the committee spelled out several areas of concern to micropublishers which will be the subject of committee action ...." one of the concerns of the committee is that a z39 standards committee has recommended "standards covering what micropublishers can say about their products." (emphasis mine.)
as chairman of the z39 standards subcommittee which is developing the advertising standard referred to, i wish to point out that there is no intention on the part of the subcommittee to tell micropublishers what they can say nor what they may say about their products. the subcommittee, which is composed of representatives from three micropublishing concerns, two librarians, and myself, has from the beginning taken the view that the purpose of the standard would be to provide guidance for micropublishers and librarians alike. we are most anxious that no one feel that the subcommittee has any intention of attempting to use the standards mechanism to tell any micropublisher how he must design his advertisements. in addition it should be noted that no ansi standard is compulsory.

carl m. spaulding, program officer, council on library resources

decision-making in the selection, procurement, and implementation of alma/primo: the customer perspective

jin xiu guo and gordon xu

information technology and libraries | march 2023 https://doi.org/10.6017/ital.v42i1.15599

jin xiu guo (jiguo@fiu.edu) is associate dean for technical services, florida international university. gordon xu (gordon.xu@njit.edu) is associate university librarian for collections & information technology, new jersey institute of technology. © 2023.

abstract

this case study examines the decision-making process of library leaders and administrators in the selection, procurement, and implementation of ex libris alma/primo as their library services platform (lsp). the authors conducted a survey of libraries and library consortia in canada and the united states who have implemented or plan to implement alma.
the results show that most libraries use both request for information (rfi) and request for proposal (rfp) in their system selection process, but the vendor-offered training is insufficient for effective operation. one-third of the libraries surveyed are considering switching to open-source options for their next automation system. these insights can benefit libraries and library consortia in improving their technological readiness and decision-making processes. introduction with the exponential growth of digital information, libraries have been seeking innovative systems to manage electronic resources and provide collection services. the next-generation integrated library system (ils) should address both current challenges and future demands. with that in mind, new cloud-based commercial products have come into the market in recent years. ex libris alma, oclc worldshare, and innovative sierra are often referred to as library service platforms (lsps) compared to a client-based ils. among these new products, selecting and implementing a new system is no small task. studies show that libraries might overlook the capacity of an ils to accommodate many functions and make a tough choice between sticking with the current vendor or switching to another before investing time and resources to migrate to a completely new system.1 libraries do not make these kinds of decisions in a rational manner, which involves clearly defining the problem, identifying and evaluating potential options, weighing the pros and cons of each option, considering an organization’s values, goals, and preferences, making a choice based on a systematic analysis, and continuously reassessing and adjusting the decision as new information becomes available. as a result, a selected system might not be the best fit for a library’s actual needs.2 library consortia also face a similar challenge, but in a more complex context. 
for example, sharing cost, level of collaboration, and integration with other library applications can be quite different from a small library to a large research library. additionally, the requirement for security and scalability can vary among consortial members. ninety-four percent of academic libraries migrated their systems to alma in 2018 by joining a consortium.3 at a consortial level, managing a system migration project adds a significant challenge because of the competing, often conflicting desires of constituent institutions. budgeting for a migration project needs to be secured before the project takes place. the one-time migration cost has a huge impact on a library’s decision on a new system. lengthy procurement processes mean that it can take a year to communicate requirements, solicit bids, and make a final decision. libraries also wonder if they should acquire such a new system through a consortial deal or on their own. a successful implementation of a new system starts with making a sound choice. the system migration project encompasses various technological and management decisions made by project managers, team leaders, and library administrators. decisions about data cleanup, migration mapping, system configuration, communication, and training can have a tremendous impact on project outcomes, staffing, existing workflows, and job functions and responsibilities. in the meantime, the project itself also provides libraries a great opportunity to improve the existing operational and staffing model and to adjust their strategy to manage technological and organizational change. there are few studies on decision-making of the alma/primo selection, procurement, and migration from the user’s perspective.
alma is a cloud-based library management system that helps libraries manage, deliver, and discover digital and physical resources. it offers functionalities such as resource discovery, resource management, resource sharing, and analytics. primo ve is a next-generation library discovery platform that provides users with access to a central index of the library’s collections. it offers a personalized and intuitive search experience, with features such as faceted searching, saved searches, and item recommendations. both alma and primo ve are ex libris products. this case study fills the gap and provides a better understanding of how american and canadian library leaders and administrators make decisions for their libraries and consortia. the pairing of ex libris’s alma and primo products has become a widely accepted next-generation system due to its cloud-based model for managing both electronic and print resources. the findings of this study offer insights and lessons learned to help library leaders and administrators make better decisions on future technological change.

literature review

the growing user demand for electronic resources over the last decade has led libraries to make a rapid digital transformation to manage and deliver online library services. consequently, system providers are eager to develop next-generation library systems. organizations have started to adopt cloud computing as their infrastructure. a benefit of cloud computing is that local it staff no longer need to handle hardware failures and software installation. cloud computing streamlines processes and saves time and money. additionally, cloud computing not only enables libraries to deliver resources and services in a network and a library community but also frees libraries from managing technology to focus on collection building, service improvement, and innovation.
therefore, libraries have started to migrate their client-based integrated library systems (ils) to cloud-based next-generation systems, often referred to as lsps. these lsps can be connected with other web applications, increase collection visibility and accessibility, streamline workflows, reduce duplication of staffing and collections, and create a greener ecosystem for organizations.4

library consortia have been playing vital roles in resource sharing, cooperative purchasing, discovery, user experience, and technical support. many libraries migrate to a shared next-generation ils or lsp by joining a consortium. besides sharing common needs, participating libraries are quite different with respect to their sizes, the kinds and numbers of resources they provide, services, priorities, and staffing. although this could pose some challenges for participating libraries, such as cost sharing, workflow design, policy, and the collaboration model, libraries still benefit greatly from the shared catalog and enhanced metadata as well as cooperation on a global level through product communities such as eluna and igelu.5

the selection of a new system is not a small decision. calvert and read pointed out that some libraries turned to the “sheep syndrome” of selecting what other libraries have bought due to a lack of software knowledge.6 their study suggested that a request for proposal (rfp) could be a part of the lsp selection process by providing a consistent set of vendor responses with a narrow scope, a formal statement of requirements for benchmarking, and a mechanism for vendors to compete. gallagher advised considering existing contracts, financial resources, and rfps before beginning a system assessment.
he indicated that the expiration date of the current ils and opt-out clauses of the existing contract could be the indicators of a go-live date. a price quote including a one-time implementation fee and a cost-benefit analysis of the current ecosystem compared to the vendor offer could provide a helpful document that envisions future library services.7 in addition to an rfp, yang and venable also considered the library automation marketplace and the needs of their own library when migrating from sirsidynix symphony to alma/primo.8

gallaway and hines embraced competitive usability techniques to test a set of standard tasks across multiple systems by using focus groups at loyola university new orleans to select a next-generation system.9 they also collected anecdotal information and feedback on the system performance of the current library online catalog through a survey of library staff. this evidence-based decision-making process supports system selection in a rational manner.

manifold, on the other hand, proposed a principled approach to selecting a new lsp. he believed that system selection was a part of the continuing process of organizational change and needed to involve library staff and users throughout the process. today’s lsp systems can connect almost the entire range of library operations, from resource management and acquisitions to user request fulfillment and the integration of subject guides on research, teaching, and learning. a system migration is much more than just a move to a new system; instead, it is a transfer to a new culture. he suggested the acquisitions process must start with educating participants on the features of various systems, methods of vendor assessment, the rules of contract negotiation, communication, and stress management.
the success in system selection and implementation should be measured over the life span of the system to guide new decisions along the way.10

in addition to commercial products, some libraries are acquiring open-source software (oss) that enables them to have greater control over customization. the potential benefits of oss include cost effectiveness, interoperability, user friendliness, reliability, stability, auditability, and customization. koha, evergreen, folio, abcd, winisis, newgenlib, emilda, pmb (phpmybibli) and weblis are examples of oss ils/lsp products on the market.11 when selecting and implementing an oss solution, small libraries such as the paine college colins-callaway library, with a limited budget and small staff, chose a hosted open-source ils (koha) to obtain specific expertise and services at a reasonable price.12

once a system is selected, the implementation process itself can be critical to the perception of overall system success. lovins expressed concern about choosing a project management approach that is schedule-driven over results-driven. he also recommended organizing implementation activities around the incoming system functionality. for a consortium-wide system migration, a “train-the-trainer” strategy was adopted in the training program, which mostly offered demonstrations instead of instruction to future trainers.13 the program hardly met libraries’ expectations for training.

active staff participation in a system migration is key to a project’s success.
banerjee and middleton reported that when library staff owned the migration process, they observed fewer mistakes, greater satisfaction with the new system, and quicker troubleshooting of problems that did arise as a result of the migration.14 avery shared that the god’s bible college libraries did an informal pre- and post-assessment of library users and staff to gather feedback on both the legacy and target ils. he recommended conducting a formalized pre- and post-evaluation of user satisfaction with the ils.15

stewart and morrison observed that acquisitions workflows in a shared alma environment must balance required consortial needs with local policies and procedures. the unmet training needs and the lack of an electronic resources management (erm) module in alma presented challenges for library staff to develop and manage alma workflows. they argued that a two-year project cycle was overly ambitious, especially if the consortium was large and the individual libraries involved varied widely.16

when migrating from horizon to symphony (both are sirsidynix products), king fahd university of petroleum and minerals, based in dhahran, saudi arabia, experienced a delayed implementation. some unmet needs, such as a dramatic shift of workflows, user interface customization, and training support by a system provider or its parent company not matched by a local vendor, became hurdles for this project.17 although a new lsp, including alma/primo and oss, empowers libraries to create unified workflows across functional modules, this feature requires a system user to have cross-functional roles to conduct these activities.18

when migrating from non-ex libris product lines to alma/primo, libraries may need to make tough implementation decisions. for example, the university of south carolina migrated library data to alma/primo from innovative’s millennium and ebsco’s full text finder.
when the legacy and target products are from different vendors, the system migration can be more complicated in communication, data mapping, data quality, and expected results of data migration. for the usc library, the preexisting duplicate records for electronic resources should have been cleaned up before the migration.19

libraries should address their concerns about key activities during the implementation to get the best possible result. the joint bank fund library had a three-day onsite training in workflows in the middle of the project. it would have been much more effective if the library had communicated with the vendor to reschedule the training at a later stage of the migration, because library staff were not yet familiar with the lsp at the scheduled time.20

the university of north carolina at charlotte migrated from oclc’s worldshare management services (wms) to alma/primo after migrating from millennium to wms four and a half years previously. the atkins library went through the second system migration because wms modules did not meet the library’s needs. going through two system migrations in the span of five years was particularly costly, and frustrated technical services staff spent more than half of their work time on data cleanup. additional time for data cleaning, workflow design, and training was also needed after the migration to alma.21

fu and fitzgerald studied the effect of lsps on staffing models for library systems and technical services by analyzing the software architecture, workflows, and functionality of voyager and millennium against those realigned in alma, wms (worldshare management services), and innovative sierra.
they discovered that the workload of systems staff could be reduced by around 40 percent, so library systems staff could have additional time to focus on local applications development, the discovery interface, and system integration. meanwhile, the functionality of the next-generation ils provides a centralized data services platform to manage all types of library assets with unified workflows. consequently, libraries could streamline and automate workflows for both physical and electronic resources through systems integration and enhanced functionality. this change requires libraries to reconsider their staffing models, redefine job descriptions, and even reorganize the library structure to leverage the benefits of a new lsp.22

western michigan university (wmu) decided to reorganize its technical services department after the alma migration was completed in 2015. after the alma implementation, it was observed that staff spent 38 percent less time working with physical materials. the systems department also shifted its focus from back-end system support to front-end user and other new technologies. wmu consolidated fourteen departments into six and renamed technical services to resource management, composed of cataloging and metadata, collections and stacks, and electronic resources. the lsp administration was shared by four certified alma administrators and one discovery administrator residing in the resource management department.23

although researchers and library practitioners have studied ils selection and implementation processes and the impact of migration on library operation and staffing, only the studies on the rfp and usability testing have focused on decision-making in ils selection. today, library administrators and leaders face technological change more often while making a transformation to a digital business model. they should understand how decisions are made at different organizational levels when managing change.
this study aims to fill this gap and help library administrators and leaders to better prepare for future change through the following research questions:

• what is the decision-making process and what do libraries consider?
• how do libraries evaluate the migration project?
• what are the impacts of the system migration on library staffing and operation?
• what lessons have libraries learned from the system migration?
• what will libraries do differently for the future system migration?

methods

researchers have adopted both qualitative and quantitative methods for studies about system migration. the literature indicates that both interviews and surveys have been employed to collect data for these studies.24 usability testing through a set of tasks across systems has also been utilized in system selection.25 a comparative analysis of vendor documents, rfp responses, and webinars has been applied in studying the impact of system migration on staffing models.26 in this research, the authors used a qualitative method through a survey to understand decision-making on system selection, procurement, and implementation.

data collection

the population for this study is those libraries that implemented or are planning to implement alma. through the eluna membership management site (https://eluna40.wildapricot.org/), the authors identified 1,440 libraries in the united states and canada that use at least one ex libris product. with help from sue julich at university of iowa libraries, who manages the site, 1,150 alma libraries were identified. the authors also contacted marshall breeding, the founder and publisher of library technology guides (https://librarytechnology.org/), and obtained a list of 1,134 alma libraries in the united states and canada.
comparing the alma libraries acquired from the two different sources, they eventually identified 1,079 libraries from the united states and 55 libraries from canada as eligible survey-participating libraries.

the authors developed a 13-question survey in qualtrics. this questionnaire aimed to help participants recall the project experience and offer them an opportunity to self-reflect and give feedback. the survey was distributed via email to the eligible libraries. a few email reminders were sent out to encourage participation. upon the closure of the survey, 291 libraries (27%) had completed the survey in full.

data analysis

qualtrics generates data analysis and reports. the authors conducted a text analysis by manually categorizing responses to the open-ended survey questions to clarify the characteristics of each response, and then presented and analyzed the data in microsoft excel.

findings

part i: library profile & background information

the participating libraries have diverse profiles in terms of size and geographic location and reflect points of view ranging from the small library to the library consortium. remarkably, during the survey, the authors received requests for a complete survey questionnaire so that respondents could coordinate and provide complete and accurate data on behalf of their libraries.

respondents

the majority of the respondents in this survey were deans, directors of the library or university librarians, and system librarians (see table 1). respondents holding a wide variety of other position titles across cataloging, acquisitions, technical support, and reference also participated in the survey (see table 2).

participating libraries

geographic location

the participating libraries were located in the united states and canada, and the majority of them were american libraries (see table 3). the american libraries were distributed in 36 states, while the canadian libraries came from 4 provinces.
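the tallying step behind the data analysis described above (hand-assigned categories turned into counts and percentages) can be illustrated with a minimal sketch. this is not the authors' workflow, which used qualtrics and microsoft excel; the function name `tally_categories` and the example codes are hypothetical and are not survey data.

```python
from collections import Counter


def tally_categories(coded_responses):
    """Return (category, count, percent) rows, most frequent category first.

    `coded_responses` is a list with one hand-assigned category label per
    open-ended answer. Percentages are rounded to whole numbers.
    """
    counts = Counter(coded_responses)
    total = sum(counts.values())
    rows = []
    for category, count in counts.most_common():
        rows.append((category, count, round(100 * count / total)))
    return rows


# Hypothetical labels for illustration only (NOT the survey's actual data).
example_codes = (
    ["training"] * 5 + ["communication"] * 3 + ["data cleanup"] * 2
)

for category, count, percent in tally_categories(example_codes):
    print(f"{category:<15}{count:>3}{percent:>5}%")
```

each response carries exactly one label in this sketch; if a response could receive several categories, the percentages would need to be computed against the number of respondents rather than the number of labels.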
table 1. the position titles of the respondents

position title                                       percentage
dean/director of the library/university librarian    35%
system librarian                                     23%
other                                                42%

table 2. the other position titles of the respondents

other position titles
assessment librarian
asset management librarian
assistant director
associate dean
associate director
associate law librarian
associate university librarian
cataloging and metadata librarian
cataloging librarian
collections librarian
consortial executive director
deputy director of the library
director of library systems
director of library technology services
director of technical services
electronic resources librarian
head librarian
head of acquisitions
head of collection management
head of library systems
head of library technology services
head of metadata and cataloging
head of technical services
ils coordinator
instructional technology librarian
lead librarian
library technician
library technology manager
manager of archives & access services
manager of digital services
manager of technical support
metadata librarian
project director
public services librarian
reference librarian/webmaster
resource description and access librarian
solutions architect, alma implementation project manager
supervisor for access services
technical services and instruction librarian
technical services librarian
technical services section head
technology manager

table 3. the geographic locations of the libraries

country          percentage
united states    92%
canada           8%

library size

the libraries served student populations of widely varying size, ranging from fewer than 1,000 to over 50,000 students (see table 4).
the smallest library had only 199 students while the largest library system or consortium had 482,000. the number of employees in those institutions ranged from less than 1,000 employees to over 20,000 faculty and staff (see table 5). the smallest institution may only have 10 employees, while there were three larger institutions with over 50,000 faculty and staff.

table 4. student population (number of ftes)

student population (number of ftes)    percentage
<1,000                                 6%
1,000–1,999                            14%
2,000–2,999                            10%
3,000–3,999                            8%
4,000–4,999                            4%
5,000–5,999                            6%
6,000–6,999                            4%
7,000–7,999                            6%
8,000–8,999                            4%
9,000–9,999                            1%
10,000–14,999                          9%
15,000–19,999                          8%
20,000–29,999                          6%
30,000–39,999                          5%
40,000–49,999                          3%
50,000+                                4%

table 5. faculty and staff population (number of ftes)

faculty/staff population (number of ftes)    percentage
<100                                         9%
100–499                                      25%
500–1,000                                    17%
1,000–1,999                                  14%
2,000–2,999                                  7%
3,000–4,999                                  12%
5,000–9,999                                  9%
10,000–19,999                                4%
20,000+                                      5%

library type

the majority of the libraries were single campus libraries; some were part of a multicampus library system or consortium libraries (see table 6). the other types of libraries may include single campus libraries serving more than one institution or location, central offices of a consortium, part of a statewide system, or independent libraries involved in consortium purchase and implementation of alma.

table 6. library type

library type                             percentage
single campus library                    45%
part of a multicampus library system     24%
part of a consortium                     26%
other                                    5%

previous integrated library system (ils)

the majority of previous ilss used by the participating libraries were voyager, aleph, millennium, and sierra (see table 7), and their vendors were ex libris, innovative interfaces, inc., and sirsidynix (see table 8).
thirty-seven percent of libraries reported that they had used their previous ils for over 20 years before they planned to migrate or migrated to alma (see table 9). also, one-fifth of libraries indicated that the system they used prior to alma was their first ils; the move to alma was therefore their only experience in system migration (see table 10). all libraries used cataloging, circulation, and opac modules in their previous ilss, and they also used other modules (see tables 11 and 12).

table 7. the previous ilss

the previous ils                   percentage
voyager                            29%
aleph                              24%
millennium                         16%
sierra                             12%
symphony                           6%
worldshare management services     3%
horizon                            2%
workflows                          2%
tlc                                1%
clio                               1%
evergreen                          1%
surpass                            1%
the library corporation            1%
other                              3%

table 8. the previous system vendors

the previous ils vendor        percentage
ex libris                      49%
innovative interfaces, inc.    28%
sirsidynix                     11%
oclc                           4%
endeavor                       1%
tlc                            1%
surpass                        1%
the library corporation        1%
other                          5%

table 9. years with the previous systems

years with the previous system    percentage
3                                 1%
4                                 1%
5–9                               7%
10–14                             18%
15–19                             27%
20+                               37%
unknown                           9%

table 10. whether the previous systems were the first ilss

was it your first ils    percentage
no                       72%
yes                      20%
unknown                  7%

table 11. modules used in previous ils

modules used in previous ilss    percentage
cataloging                       100%
circulation                      100%
opac                             100%
serials                          77%
acquisitions                     76%
course reserves                  64%
interlibrary loan                28%
other                            9%

table 12. other modules used in previous ilss

other modules used in previous ilss
analytics
booking
course reserves
discovery system
electronic resource management
ereserves
inn-reach
licensing

part ii: implementation process

alma modules/functions

the majority of libraries reported that they will implement or have implemented the following alma modules: fulfillment, primo/primo ve, resource management, and acquisitions (see table 13). some libraries mentioned that they used summon instead of primo/primo ve because they had used it before the system migration.

table 13. alma modules/functions implemented

alma modules/functions implemented         percentage
fulfillment                                100%
primo/primo ve                             93%
resource management                        92%
acquisitions                               84%
erm (electronic resources management)      77%
course reserves                            73%
network zone                               50%
interlibrary loan                          40%
digital collections                        21%
other                                      8%

selection process

rfi and rfp

when asked if an rfi (request for information) was involved, more than half of the libraries answered yes (see fig. 1). about half of the libraries reported that they did not conduct a system functionality survey to collect information from library users and colleagues (see fig. 2). more than half of the libraries indicated that the rfp (request for proposal) process was required for the system migration (see fig. 3). libraries that did not conduct the rfp process gave a variety of reasons (see fig. 4): an rfp may not be necessary when migrating systems with the same vendor, there was no increase in expenditure, the expenditure did not reach a budget threshold (e.g., less than $100,000), or the previous contract provided for upgrading to a new product with the same vendor.
another reason was that libraries might have an existing relationship with vendors and would like to continue using their products. some libraries were given authority by the university administration and library directors to handle the negotiation, or they thought an rfi offered sufficient information to make this decision. other libraries had no say in whether to conduct an rfi or rfp process, for reasons such as their system was outdated and they had to migrate, the decision was made by the consortium, or alma was their sole source procurement.

figure 1. whether an rfi (request for information) was involved (yes 52%, no 40%, unknown 8%).

figure 2. whether a system functionality survey was conducted (no 51%, yes 43%, unknown 6%).

figure 3. whether an rfp (request for proposal) was involved.

figure 4. the rationales for libraries who did not conduct the rfp.

decision-making

the authors found that the common roles involved in the decision-making process included library dean/director, alma local implementation team, and alma project working group (consortium) (see fig. 5). some libraries indicated that their system migration decision was made by university executives (provost, vp finance, cio, and cfo), campus it, aul for library technology, or all librarians/staff. one library reported that the dean of arts, languages & learning services made the selection decision instead of the library or librarians.

figure 5. the decision makers.
important factors for system selection

the authors found that the four most important elements to consider for system selection were budget reality; electronic resource management (erm), bibliographic, and authority control; discovery layers (primo, primo ve); and cloud hosted (see table 14).

table 14. the important factors for system selection
(columns, left to right: strongly disagree, somewhat disagree, neither agree nor disagree, somewhat agree, strongly agree)

important factor for system selection            sd     swd    n      swa    sa
the budget reality                               3%     6%     11%    34%    47%
the number of libraries adopted                  7%     7%     27%    40%    19%
erm, bibliographic, & authority control          2%     2%     17%    38%    41%
discovery layers (primo, primo ve)               6%     4%     13%    27%    50%
the analytics/reporting functionality            4%     6%     15%    41%    35%
cloud hosted                                     3%     3%     12%    36%    47%
the campus it infrastructure & its ecosystems    8%     12%    31%    31%    18%
integration with other erps                      12%    15%    30%    33%    10%
customer support & satisfaction                  4%     6%     21%    37%    31%
system user training programs                    5%     11%    24%    38%    21%

data migrated

the most common types of data migrated to alma were bibliographic records, holdings and items, patrons, and circulation data (see fig. 6). some libraries reported that they also migrated other types of data including vendor lists, e-resource data, all available data types, etc.

figure 6. the data migrated to alma.

discovery service

the survey asked if there were any libraries that migrated to alma and did not choose primo/primo ve for their discovery service. nine libraries reported that this was the case for them. four of them used summon, four chose ebsco discovery service, and one adopted their locally developed product.
when asked the reason for their choices, the nine libraries indicated that they would like to stay with the existing discovery service. additionally, two of the libraries stated that a budget limitation was a part of their reasons, and one library gave a better discovery service for users as the rationale.

part iii: feedback on alma migration

system migration evaluation

the majority of libraries reported that they did not conduct a formal post-migration evaluation. half of the libraries thought the migration achieved their project goals, or met the needs of library operations (acquisitions, cataloging, fulfilment, discovery, etc.) (see fig. 7).

figure 7. whether a formal post migration evaluation was conducted.

some libraries also provided their own migration evaluation, including rfp mandatory requirements signoff, availability study, focus groups with library staff, usability testing with students and faculty, feedback and cross-checking with consortium, debrief of library staff, etc. some only did an informal evaluation, which turned out to be poorly handled or not very satisfactory. for example, one consortium did a survey on the migration and provided the feedback to ex libris for improvement. other libraries reported that they had not done the evaluation because they had not started the migration process, were still in the migration stage, did not treat an evaluation as a part of the decision-making process, or were offered alma as a free product because of their consortial partnerships.

valuable lessons learned

the authors asked what were the most valuable lessons the libraries had learned from the migration project, and how they would implement the migration differently if they had a chance to do it again.
the most valuable lessons concentrated on training, communication, engagement, implementation process, and data cleanup/preparation (see fig. 8). these lessons are shared in greater detail in the discussion section.

figure 8. the valuable lessons learned from the migration project.

prospective migration

when asked if they would consider working with ex libris again if they migrated to a new system in the future, 70 percent of libraries gave an affirmative answer, but some libraries indicated that they would seek other alternatives (see fig. 9). when asked how likely they would be to consider implementing an open-source ils, the majority of libraries conveyed that they would not consider open source; only 7 percent of libraries would consider it (see fig. 10).

figure 9. whether ex libris products would be considered in the future.

figure 10. whether an open-source ils would be considered in the future.

discussion

the authors examine the above findings further through the lens of the research questions raised in the literature review section.

the decision-making process and factors considered

the survey indicates that both rfi and rfp are important for a selection process. fifty-two percent of the libraries conducted an rfi and 57 percent required the rfp process for the system migration. interestingly, some libraries did not roll out the rfp process, citing a variety of sound reasons such as no increase in expenditure, staying within the budget threshold, existing relationships with vendors, sole source procurement, consortium decision, riders, etc.
besides rfi and rfp, 43 percent of libraries went through a system functionality survey to collect information from library users and colleagues. for most libraries, the library dean or director, alma local implementation team, or alma project working group of a consortium were involved in the decision-making process. in some cases, university executives such as provost, vp finance, cio, cfo, campus it, and associate dean or associate university librarian for library technology made a collective decision. in a rare case, the dean of arts, languages & learning services made the call for the system selection.

when considering system migration, many factors can be important. this survey shows that libraries mainly consider budget reality; erm, bibliographic, and authority control; discovery layers; and cloud-hosted systems. it is interesting that most libraries would like to move to a cloud-based system that has better functionality for discovery and electronic resources management. the survey also reveals that library administration needs to find a way to offset the cost increase of the system migration. the lack of comparable system or service offerings in the market also contributes to the decision on system selection.

project evaluation

project evaluation provides important feedback from both system users and system providers and a great opportunity for libraries to learn. the findings indicate that many libraries do not have a formal assessment process. some consortia have conducted surveys and provided feedback to ex libris, but no response from ex libris to that feedback was reported. both libraries and system vendors have lost the opportunity to learn and improve project management. for example, well-documented complaints about ex libris training have not been effectively addressed. some libraries believe a demonstration-focused training model does not provide the same experience as onsite training offers.
many libraries have had trouble with acquisitions workflows. the eocr (electronic order confirmation record) and edi (electronic data interchange) processes are standard practices in libraries today to generate order records and create invoices automatically, and they should be part of the implementation contract to ensure that libraries can operate appropriately after a new system goes live.

it is time for both libraries and system providers to consider a formal project assessment as a part of future system migrations. libraries will not do better if they do not improve today. libraries cannot improve if they do not know where previous projects have gone wrong. a better way to learn from mistakes is project assessment.

impacts on library staffing and library operation

some libraries reported that insufficient staffing during the system migration created additional problems and hardships. some library departments have been stretched very thin in order to work on the migration project in addition to their regular operational duties. however, about one-third of survey-participating libraries have reported that meeting the needs of library operation, including acquisitions, cataloging, fulfilment, and discovery, is a criterion of project evaluation. the lack of dedicated lsp migration project staff creates a challenge for system migration. most importantly, additional staffing time and technical capacity are important factors that decide whether libraries can fully take advantage of the functionalities of a new system. libraries might manage the system migration better by hiring additional technical staff on a project basis to handle technical aspects if staff cannot be released from library operation to focus on the migration project.
the system integration and unified automated workflows of a modern lsp can enable libraries to run their operations more efficiently. particularly in a shared environment or network, libraries could share bibliographic records for general collections more widely and deeply, which could dramatically reduce the need for both original and copy cataloging. system staff no longer need to install or upgrade proprietary software and maintain servers in house. these changes might cause job insecurity for some library staff. it is critical for library leaders to adjust some job responsibilities and to help staff develop new skills to meet new demands. this requires library administration to create a culture of embracing change, learning, and collaboration. staff can take advantage of a new system by being curious and reassessing previous workflows. library administration could create a flexible structure to encourage learning and collaboration across departments.

lessons learned

many libraries shared valuable lessons they learned from the migration projects. those lessons concentrate on training, communication and engagement, implementation process, and data cleanup and preparation.

training

many libraries expressed dissatisfaction with the training provided by their vendor. for example, libraries moving to alma reported that ex libris could have focused more on in-person, post-migration training. as it was, staff felt undertrained because they had access only to online training before the libraries had access to their own data in alma/primo. additionally, ex libris did not have regular trainers for a particular library, so there was less continuity across training sessions than there could have been. some suggested that ex libris offer a concentrated, several-day initial training for migration, so that libraries have a solid overview of the entire system before data is exported for test loads, and then move into detailed weekly training that includes more library staff.
it seems a good idea to schedule more training sessions after implementation because libraries may not know how the system functions during the implementation period. in an ideal world, libraries would put more contractual obligations on ex libris to train staff more thoroughly. after all, libraries need to hold ex libris more accountable for project outcomes. consortium libraries should insist that ex libris provide specialized individual trainers and technical contacts. attending group training sessions conducted by a variety of different ex libris trainers does not work well in large migration projects. ex libris needs to train the library staff rather than focusing on training the consortium support staff and expecting them to do most of the staff training.

ex libris does offer a variety of free training webinars; however, it charges customers for bespoke or more intimate training sessions. a barrier for many libraries is that they simply cannot afford to pay more for these bespoke training sessions, so they depend on in-house training and best practices (e.g., work groups, training committees, in-house power users, etc.) to meet the training needs of their library personnel.

communication and engagement

many libraries expressed that communication is extremely important and that buy-in from stakeholders at all levels is critical to the migration project’s success. investing the initial time to bring all stakeholders on board will pay off. blocking off time for weekly meetings with involved staff and ex libris is key. some suggested asking more questions and seeking to understand the functionality of the new system more deeply.
for consortial libraries, librarians can become much closer to each other and learn to seek out and receive help from one another in ways that they might never have done before. this networking can be an invaluable source of mutual support going forward.

some libraries reported that, due to a lack of communication, an overly sudden decision on the implementation timeline was made at the legislative level. information regarding requirements and expenses was not fully clarified before the process began and came as a surprise during the migration. the whole process felt very rushed by the vendor, with insufficient training, which turned out to be very dissatisfying.

implementation process

a system migration is complex and requires a great deal of time, institutional resources, and staff. some key processes needed to be better prepared in advance, such as staff training, project plans and major milestones, system analysis, customer inputs for implementation and configuration, data cleanup, physical to electronic processing (p2e), source data extraction, validation and delivery, workflow analysis, fulfillment network, authentication, third-party integrations, data review and testing, go-live readiness checklist, etc. in practice, the migration was often more time- and resource-intensive than expected, meaning that libraries found it difficult to complete their part of the process in the contractually specified time. libraries should clear the decks so that core staff can focus on migration, and make sure there are no other major projects occurring at the same time. if staff have insufficient time during the migration window, libraries need to hire temporary experienced staff for the project. this investment will benefit library operation in the long run. the implementation team members should have more dedicated time to be trained so that the library staff are well prepared and knowledgeable in the areas in which they work.
it is wise to clean up data as much as possible prior to migration. it would be ideal if the existing workflows were fully documented with diagrams so that it would be easier to determine what parts of the workflows need to change.

some libraries reported that their migration happened during the pandemic with state-issued stay-at-home orders in force. it was extremely stressful juggling all of the changes for the library while keeping up with system migration. ideally, libraries would postpone a migration rather than carry it out during a pandemic. but if libraries have no other choice, one benefit is that closures can be used for cutover days. the stress of the implementation and of trying to get things done may cause frustrations to boil over. it is advisable to manage these situations by adding support where needed and by always ensuring that communication is a top priority so that any confusion is kept to a minimum.

for consortial libraries, it is important for individual member institutions to have their own project managers. in hindsight, consortial libraries would have tried to standardize more configurations across the consortium, such as user groups, circulation settings, item types, etc.

some libraries felt the whole migration process was rushed by the vendor and, as a result, was not very successful. libraries should not let the vendor talk them into a compressed, several-month migration timeline; instead, they should spend more time on the preparation and implementation process.

data cleanup and preparation

although it is tedious and time consuming, many libraries suggested cleaning up data as much as possible prior to migration. more pre-migration data cleanup would avoid a post-migration mess.
some libraries recommended more stringent cleanup of catalog records, acquisitions data, circulation data, patron records, weeding, etc. it is important to make sure the cataloging structure matches the structure of the new system. some libraries noted that, had they taken the data review stage more seriously and fully modeled the processes and workflows that would be needed, they would have had fewer data cleanup problems to address after the migration was complete.

some libraries cautioned that alma’s p2e (physical to electronic) migration process was more complex than anticipated. they stated that the p2e conversion did not work as it should have and that ex libris should do a better job in the future. due to misalignment of source and target collections, the p2e process resulted in a large cleanup after the migration. a number of libraries would have asked more questions about what data was migrated and to where. they reported that ex libris had migrated data that should not have been migrated, and a messy system was the result.

planning for future system migrations

when asked what they would do differently for a future system migration, many libraries provided very interesting insights. some libraries believed that the system migration put library leadership in a difficult position. they needed to engage all library employees in decision-making and provide staff with the resources they needed to navigate change, experience the vulnerability of learning a new system, and even have difficult conversations with colleagues. at the same time, library leaders are accountable to their parent organizations and subject to budget pressure and mandates to follow procurement processes, which are geared around efficiency and hierarchy rather than promoting democratic decision-making and self-governance. many libraries expressed a concern about training.
they stated that they would demand a separate contract for training in the future and put more contractual obligations on system providers to train staff more thoroughly. they would spell out in greater detail what a successful migration would consist of in order to hold ex libris responsible for the outcome. during the bidding process, library staff should be less distracted by smooth presentations and instead ask difficult questions about system functionality.

another concern is pricing. one early adopter of alma stated that they learned the risks, rewards, and excitement of helping with a developing product, as they felt aleph was a dead end and did not see many other alternatives. they would have negotiated more strongly with ex libris on pricing considering the immaturity of the product and pricing model at the time of adoption. some libraries felt they were not given competitive pricing, and their costs went up significantly, which constituted a large budget shift. some small libraries believed alma is too big for them, and oclc might be more appropriate for their size of collections and materials. they realized they underutilized a very expensive system.

some libraries preferred a customized implementation as opposed to the one-size-fits-all model ex libris offered. they stated that despite learning the new system, they found that the solutions ex libris offered for their implementation rarely worked. they would have been better off fitting their own workflows into alma (especially for budgeting). ex libris seems not to be ready to work with single-campus small colleges.

other libraries reported that they had multiple people in a project management role, which created communication issues. they learned that in any future migration process they should have a single project manager empowered to make decisions.
for consortium libraries, some libraries suggested taking advantage of cohorts of migrating institutions to share information and issues and to raise common questions. they would have made some local decisions instead of simply going with the consortium’s. one consortium experienced a major difficulty in that the group implementation took place across different countries. the time difference with their implementation team added an additional dimension to project management. they would have done an individual migration instead of a group migration since they had a very complex institutional structure.

some libraries strongly recommended open-source systems as well. they believed that the trend toward vertical consolidation of vendors is not healthy for the library system market in the long run. with mergers and acquisitions, gigantic companies are formed and might over-control the market and pricing.

conclusions

decision-making on the selection, procurement, and implementation of a new lsp is a process that requires gathering information and seeking input from library administration, experts, and different levels of stakeholders in a systematic way to ensure system quality, fitness, and a successful implementation. the findings suggest that libraries should adopt an rfi/rfp (request for information/proposal) or system functionality survey as the basis for system selection. budget, resource discovery, and electronic resources management are the most important factors to be considered in an ils selection. staffing time and technical capability must be addressed before implementing a new system to enable libraries to manage user expectations. insufficient staff and a lack of technical skills could affect the realization of the benefits of a new system. technological change can lead to shifts in staff job responsibilities and to a new way of working together.
it is important for library administration to address organizational change when making technological change. a formal project assessment is essential for libraries and system providers to learn and improve collectively. open-source systems could open doors for libraries to seek more customized and affordable systems.

research limitations

like all research studies, this study has limitations that provide opportunities for further investigations. firstly, because we asked for responses from individuals, not libraries, the findings might be biased by participants’ individual experiences. secondly, due to the limitation of time, space, and number of survey questions, reported data mainly focused on alma libraries and could not cover migration experiences of libraries migrating to other products or all aspects of system migration. the library community would benefit from further research that interviews participating libraries of different sizes, types, and geographic locations, as well as different system providers.

practical implications

every new system has its advantages and downsides. to help libraries fully take advantage of a new system, it would be helpful if vendors could evaluate training, the physical to electronic (p2e) process, and system affordability. providing training after a system goes live will help libraries implement workflows effectively and give staff a better experience. p2e is crucial for ensuring that all relevant information is transferred and maintained in the new system. vendors could address potential p2e issues before a system migration takes place so that libraries might approach data cleanup differently. it would be great if vendors could customize system modules or functionalities as needed by both small and large libraries.
this will give libraries the flexibility to invest in the most needed library operations at different prices to make the system affordable. customer services can be crucial for libraries to continue optimizing the new system down the road. regularly seeking libraries’ feedback can foster a positive customer relationship and benefit both libraries and vendors.

acknowledgements

the authors appreciate the support of marshall breeding and sue julich for providing the library contact lists. the authors would also like to thank the office of research integrity for reviewing the survey questionnaire and providing comments. much gratitude goes to the survey participants who volunteered their time to participate in this study and took the time to communicate with the authors in order to provide accurate responses for their libraries or consortia.

appendix: survey questionnaire

adult online consent to participate in a research study
a customers’ perspective: decision-making on system migration

summary information

things you should know about this study:
• purpose: the purpose of the study is to understand how library leaders make decisions on system migration during technological change and the impact of these decisions on library operation and staff.
• procedures: if you choose to participate, you will be asked to answer 12 multiple-choice questions and 3 open-ended questions.
• duration: this will take about 15 to 20 minutes.
• risks: there is little risk or discomfort from this research since you share your project experience anonymously.
• benefits: the main benefit to you from this research is to self-reflect on the project and have an opportunity to share the project experience. we plan to publish our findings, which will bring potential benefits to you and the library community.
• alternatives: there are no known alternatives available to you other than not taking part in this study.
• participation: taking part in this research project is voluntary.

please carefully read the entire document before agreeing to participate.

confidentiality

the records of this study will be kept private and will be protected to the fullest extent provided by law. in any sort of report we might publish, we will not include any information that will make it possible to identify you. research records will be stored securely and only the researcher team will have access to the records.

the following questions are for general analytical use only. although qualtrics does not collect your email address, please do not provide your personal identification indicators (pii) with your answers. if pii appear in the responses, we will apply a data anonymization process to anonymize pii after the results are added into the final tally.

right to decline or withdraw

your participation in this study is voluntary. you are free to participate in the study or withdraw your consent at any time during the study. you will not lose any benefits if you decide not to participate or if you quit the study early. the investigator reserves the right to remove you without your consent at such time that he/she feels it is in the best interest.

researcher contact information

if you have any questions about the purpose, procedures, or any other issues relating to this research study you may contact jin guo (jiguo@fiu.edu) or gordon xu (gordon.xu@njit.edu).

irb contact information

if you would like to talk with someone about your rights of being a subject in this research study or about ethical issues with this research study, you may contact the fiu office of research integrity by phone at 305-348-2494 or by email at ori@fiu.edu.
participant agreement

i have read the information in this consent form and agree to participate in this study. i have had a chance to ask any questions i have about this study, and they have been answered for me. by clicking on the “consent to participate” button below i am providing my informed consent.

consent to participate

section i: library profile and background information

1. your title:
   a. dean/director of the library/university librarian
   b. system librarian
   c. other (please specify: _________________)
2. describe your institution
   a. location
      i. us
      ii. canada
      iii. state
2. total student and faculty population
   a. total student population (number of ftes)
   b. total faculty population (number of ftes)
3. information about your library
   a. single campus library
   b. part of a multicampus library system
   c. part of a consortium
   d. other (please specify: _________________)
4. previous ils:
   a. the previous ils name:
   b. the previous ils vendor:
   c. years with the previous system:
   d. was it your first ils?
      a. yes
      b. no
5. ils modules in use prior to alma migration: (please check all that apply)
   a. acquisitions
   b. cataloging
   c. circulation
   d. interlibrary loan
   e. reserves
   f. serials
   g. opac
   h. other (please specify: _____________________)

section ii: alma implementation process

6. alma modules/functions implemented: (please check all that apply)
   a. acquisitions
   b. resource management
   c. fulfillment
   d. interlibrary loan
   e. course reserves
   f. erm
   g. network zone
   h. primo/primo ve
   i. digital collections
   j. other (please specify: ________________________)
7. the system selection process
   • was an rfi (request for information) involved?
     a. yes
     b. no
   • did you conduct a system functionality survey to collect information from library users and colleagues?
     a. yes
     b. no
   • was the rfp (request for proposal) process required?
     a. yes, please specify the person/department that prepared for rfp. _____
     b. no, please provide the reason why (e.g., budget cap less than $100k, etc.) _____
8. who was involved in the decision-making process?
   • alma project working group (consortium)
   • alma local implementation team
   • project manager(s)
   • library dean
   • institutional coordinators/leads
   • departmental heads
   • others (please specify ______)
9. what are important factors for system selection (5 points, weight/per response)?
   • the budget reality
   • the number of libraries adopted
   • e-resource management (erm), bibliographic, and authority control
   • discovery layers (primo, primo ve)
   • the analytics/reporting functionality
   • cloud hosted
   • the university/college it infrastructure and its ecosystems
   • integration with other erp (enterprise resource planning) systems/platforms
   • customer support & satisfaction
   • system user training programs
10. what data was migrated (please select all that apply)?
   • authority data
   • bibliographic records
   • holdings and items
   • patrons
   • loans, holds, and fines
   • acquisitions
   • course reserves
   • digital metadata and objects
11. please skip this question if you use primo/primo ve. if you chose non-ex libris products for discovery service, please specify the product____, and select the possible reason below:
   • budget limitation
   • stay with the existing discovery service
   • others

section iii: feedback on alma migration project.

12. how did your library evaluate the system migration project?
   • no formal post-migration evaluation
   • user satisfaction survey
   • achieved the project goals
   • met the needs of library operations (acquisitions, cataloging, fulfilment, discovery, etc.)
13. open-ended questions
   • what are the most valuable lessons you have learned from this project? if you had a chance to do it again, how would you implement the migration differently?
   • would the library consider working with ex libris again if it were to migrate to a new system in the future?
   • how likely is it that this library would consider implementing an open-source ils?

endnotes

1 zhonghong wang, “integrated library system (ils) challenges and opportunities: a survey of us academic libraries with migration projects,” the journal of academic librarianship 35, no. 3 (2009): 207–20, https://doi.org/10.1016/j.acalib.2009.03.024.
2 teri oaks gallaway and mary finnan hines, “competitive usability and the catalogue: a process for justification and selection of a next-generation catalogue or web-scale discovery system,” library trends 61, no. 1 (2012): 173–85.
3 guoying liu and ping fu, “shared next generation ilss and academic library consortia: trends, opportunities and challenges,” international journal of librarianship 3, no. 2 (2018): 53–71.
4 matt goldner, “winds of change: libraries and cloud computing,” bcla browser: linking the library landscape 4, no. 1 (2012): 1–7.
5 liu and fu, “shared next generation,” 53–71; jone thingbø, frode arntsen, anne munkebyaune, and jan erik kofoed, “transitioning from a self-developed and self-hosted ils to a cloud-based library services platform for the bibsys library system consortium in norway,” bibliothek forschung und praxis 40, no. 3 (2016): 331–40, https://doi.org/10.1515/bfp-2016-0052.
6 philip calvert and marion read, “rfps: a necessary evil or indispensable tool?” electronic library 24, no. 5 (2006): 649–61.
7 matt gallagher, “how to conduct a library services platform review and selection,” computers in libraries 36, no. 8 (2016): 20.
8 zhongqin (june) yang and linda venable, “from sirsidynix symphony to alma/primo: lessons learned from an ils migration,” computers in libraries 38, no. 2 (march 2018): 10–13.
9 gallaway and hines, “competitive usability,” 173–85.
10 alan manifold, “a principled approach to selecting an automated library system,” library hi tech 18, no. 2 (2000): 119–30, https://doi.org/10.1108/07378830010333455.
11 ayoku a. ojedokun, grace o. o. olla, and samuel a. adigun, “integrated library system implementation: the bowen university library experience with koha software,” african journal of library, archives and information science 26, no. 1 (2016): 31–42.
12 lyn h. dennison and alana faye lewis, “small and open source: decisions and implementation of an open source integrated library system in a small private college,” georgia library quarterly 48, no. 2 (spring 2011): 6–9.
13 daniel lovins, “management issues related to library systems migrations. a report of the alcts camms heads of cataloging interest group meeting. american library association annual conference, san francisco, june 2015,” technical services quarterly 33, no. 2 (2016): 192–98, https://doi.org/10.1080/07317131.2016.1135005.
14 kyle banerjee and cheryl middleton, “successful fast track implementation of a new library system,” technical services quarterly 18, no. 3 (2001): 21–33.
15 joshua m. avery, “implementing an open source integrated library system (ils) in a special focus institution,” digital library perspectives 32, no. 4 (2016): 287–98, https://doi.org/10.1108/dlp-02-2016-0003.
16 morag stewart and cheryl aine morrison, “breaking ground: consortial migration to a next-generation ils and its impact on acquisitions workflows,” library resources & technical services 60, no. 4 (2016): 259–69.
17 zahiruddin khurshid and saleh a. al-baridi, “system migration from horizon to symphony at king fahd university of petroleum and minerals,” ifla journal 36, no. 3 (2010): 251–58, https://doi.org/10.1177/0340035210378712.
18 efstratios grammenis and antonios mourikis, “migrating from integrated library systems to library services platforms: an exploratory qualitative study for the implications on academic libraries’ workflows,” qualitative and quantitative methods in libraries 9, no. 3 (september 2020): 343–57, http://qqml-journal.net/index.php/qqml/article/view/655/585.
19 abigail wickes, “e-resource migration: from dual to unified management,” serials review 47, no. 3–4 (2021): 140–42.
20 yang and venable, “from sirsidynix,” 13.
21 joseph nicholson and shoko tokoro, “cloud hopping: one library’s experience migrating from one lsp to another,” technical services quarterly 38, no. 4 (2021): 377–94.
22 ping fu and moira fitzgerald, “a comparative analysis of the effect of the integrated library system on staffing models in academic libraries,” information technology and libraries 32, no. 3 (september 2013): 47–58.
23 geraldine rinna and marianne swierenga, “migration as a catalyst for organizational change in technical services,” technical services quarterly 37, no. 4 (2020): 355–75, https://doi.org/10.1080/07317131.2020.1810439.
24 vandana singh, “experiences of migrating to open source integrated library systems,” information technology and libraries 32, no. 1 (2013): 36–53, https://doi.org/10.6017/ital.v32i1.2268; shea-tinn yeh and zhiping walter, “critical success factors for integrated library system implementation in academic libraries: a qualitative study,” information technology and libraries 35, no. 3 (2016): 27–42, https://doi.org/10.6017/ital.v35i3.9255; grammenis and mourikis, “migrating from integrated library systems,” 343–54; xiaoai ren, “service decision-making processes at three new york state cooperative public library systems,” library management 35, no. 6 (2014): 418–32, https://doi.org/10.1108/lm-07-2013-0060; wang, “integrated library system,” 207–20; pamela r. cibbarelli, “helping you buy ils,” computers in libraries 30, no. 1 (2010): 20–48, https://www.infotoday.com/cilmag/cilmag_ilsguide.pdf; calvert and read, “rfps,” 649–61.
25 gallaway and hines, “competitive usability,” 173–85.
26 fu and fitzgerald, “a comparative analysis,” 47–58.
book reviews
the future of the printed word: the impact and implications of the new communications technology. edited by philip hills. westport, conn.: greenwood, 1980. 172p. $25. lc: 80-1716. isbn: 0-313-22693-8 (lib. bdg.).
the character of this volume is as much that of a topical journal or annual review as that of a monograph. a dozen authors have contributed thirteen chapters, all but one prepared especially for this publication. ten of the chapters are by british authors, two by americans, and one by european community personnel located in luxembourg. an amusing punch satire about book (built-in orderly organized knowledge) is reprinted as an unnumbered fourteenth chapter. in an excellent opening essay, john m. strawhorn notes: "in this book, the expression printed word is construed very broadly, to include words in any kind of display: paper, microforms, crt's, plasma panels and so on."
his essay is a terse but pointed review of the organization of information transfer, some current trends, factors affecting acceptance of new technologies, and some broad projections for the future. provocative essays by maurice b. line and p. j. hills, editor of the volume, explore the printed word from the points of view of a bookperson and an educator. in one of the most elegant metaphors to appear in information science literature, line suggests: "the printed butterfly will emerge from its electronic chrysalis, but it will also return again to it in due time. the vast majority of documents will thus be stored in electronic (chrysalis) form, but the majority of those used at any given time will be in their printed (butterfly) form." two incisive and thorough chapters on official information by patricia wright systematically explore the use of old and new technologies for forms, leaflets, and signs. wright makes acute and useful observations on how technology can hinder or help gathering and dispersion of governmental information. the graphic information research unit of the royal college of art has done excellent work in recent years in exploring how various display options affect comprehension. linda reynolds provides a good essay, "designing for the new communications technology," based on that research. the review of prospects for electronic journal publishing by donald w. king is a good overview, especially for beginners. a chapter on euronet diane describes problems in creating an online database capability in the european political environment. chapters on printing technologies, microforms, and videodiscs cover all major alternatives but suffer from brevity. two brief but competent speculative essays, which add little, complete the volume. the work lacks a general index, but the organization of chapters makes this a minor flaw.
use of presumably common british acronyms without explanation, especially in credits and citations, is an irritant for non-u.k. readers. the work would make an excellent supplementary text for a course on the history of the book. practitioners in publishing or library and information science will find much of interest. -brian aveney.
journal of library automation vol. 14/3 september 1981
turnkey automated circulation systems: aids to libraries in the market place. edited by judith bernstein. chicago: american library assn., 1980. 332p. $10.50.
when my library entered the marketplace for an automated circulation system, i searched the literature for aids. had i found this book at that time i would have been disappointed. what i would expect from a 332-page book with a subtitle, "aids to libraries in the market place," would be numerous examples of what had been done before. i would expect samples of the analyses that other libraries had done to justify entering the marketplace, samples of the rfps that had been sent to vendors, and samples of the contracts that had been signed. i would like to see a case study (or two) of the complete process of procurement. admittedly, this expectation is somewhat of an ideal, but these are "aids" that we searched for and that other libraries now ask from us. what does this book provide? an editorial introduction gives a sense of the difficulties of the marketplace and the frustrations encountered in it. a two-page bibliography gives a reasonable selection of readings to provide a background for decision making. a discussion titled "hiring a consultant-why and how," is a very useful enumeration of details to be considered in the decision to hire a consultant and in the agreement with a consultant. a model request for proposal is a good synthesis of the details to be included in almost every library's rfp and thus provides a starting point for the library new to the marketplace.
all of this is what i consider to be the substance of this book, and it ends at page 40. the remaining 292 pages are devoted to the "profiles" of individual libraries which have installed automated circulation systems. the profiles are intended to assist in the identification of libraries to be contacted for further information, but provide little useful information by themselves. my primary objection to this book is the misleading nature of the citation. one expects more than three hundred pages of "aids" and finds a directory with a forty-page preface. but for the librarian new to the marketplace it may be worth the price. -alan e. hagyard, yale university library, new haven, connecticut.
archives and the computer, by michael cook. london: butterworths, 1980. 152p. $29.95. lc: 80-41286. isbn: 0-408-10734-0.
michael cook recognizes the special predicament of the archivist whose job consists of trying to satisfy three contradictory needs: (1) the need to arrange and describe archives by their provenance, (2) the need to store them most efficiently by shape and size, and (3) the need to access them to answer inquiries that are mostly subject-oriented. the solution to these conflicting requirements may come from the computer. as cook says, "the speed and variety of computerized lists and indexes derived from a single data base could solve this problem by producing finding aids in all possible sorts of order." in a very handsomely produced, sturdily bound book, archives and the computer, michael cook, archivist of the university of liverpool, reports on various computer systems serving the needs of the archivists. his book starts with a general discussion on the nature of automated systems and their relation to manual ones. this is followed by the description of a select group of archives systems-some still in use, others put to their well-deserved rest after a few years' use.
he covers records management systems (i.e., the area of handling current records) and archives management systems (i.e., the handling of noncurrent documents). in the final chapter cook moves the discussion away from computer processing of traditional, familiar forms of archival material, focusing instead on processing archives that are themselves machine-readable data files. how does the archivist accomplish all of the necessary tasks if the archives are not readable by the human eye? how does he appraise, arrange, describe, and access them? i like mr. cook's cautious and sober attitude. talking about system design, he remarks, "at this stage decisions will be made which will be irrevocable in practical terms, and may cause much trouble later." about implementation and testing, "computer systems should help people to work more effectively in a more interesting environment; if they fail in this, or appear to fail, there is something wrong, and it would perhaps be better not to introduce the change." the records management systems he describes are used by british county and city record offices. an interesting feature in one of them, a system called arms, is a printout that tabulates for each class of documents the number of requests in a year, per year stored. this printout could be very helpful in modifying established retention periods on the basis of experience. the following archives systems are described: prospec (adopted by the public record office of london), nars a-1 (used by the national archives of the usa), spindex (first used by the national archives and the national historical publications and records commission), selgem (used by the archives of the smithsonian institution), stairs (an ibm system, used, among others, by the house of lords record office in london), paradigm (developed and used at the university of illinois), mistral (used by the national archives of ivory coast), and arcaic (used and abandoned by the east sussex record office).
of all these systems, i found the description of selgem the most educational. besides listing the fields making up a computer record, cook shows an example of an actual record as it appears in the master list, and as it appears in the printed guide to the archives. he also includes an actual segment of the name/subject index. although there is a brief mention about the choice between networking versus isolated, separate systems, the book does not speculate about the possibility of a network of many institutions building a common database. nor does the author discuss the much debated and very timely question of whether archivists could possibly agree on a uniform computer record for the description of manuscripts and archives, similar to the way in which librarians have agreed on using the marc formats for the description of their materials. a glossary of technical terms, a "select directory" of archival systems, and a "select bibliography" are useful additions to the main text. this book is recommended more to the archivist looking for a computer system than to the systems analyst who wants to learn how archives are processed. -suzanna lengyel, yale university library, new haven, connecticut.
the library and information manager's guide to online services. edited by ryan e. hoover. white plains, n.y.: knowledge industry publications, 1980. 270p. $29.50 hardcover, $24.50 softcover. lc: 80-21602. isbn: 0-914236-60-1 (hardcover); 0-914236-52-0 (softcover).
hoover and seven colleagues provide an overview of the main issues and techniques involved in starting and managing an online retrieval service. the emphasis is on a library setting-the implicitly broader focus conveyed by the title is not matched by any specific coverage of, for example, the online search activity of the for-profit information brokers, where funding, staffing, publicizing, and the search process itself are handled differently than in libraries.
the three large, general search services (lockheed, sdc, and brs) are used throughout for the descriptions and search examples, and their bibliographic databases inevitably receive the most attention. there is a noticeable slant toward the two agencies with which several of the contributors are or were affiliated-the university of utah (which doesn't detract from the book's objectivity) and sdc (which does). the chapters are of uneven quality and scope. most of the obvious areas are covered-the available search systems and databases; equipment needs; search techniques; managing an online service in a library; training searchers; promoting service; and measurement and evaluation. taken as a whole, the book is a good state-of-the-art report, even though it is already becoming outdated in terms of industry facts. the numerous charts and tables serve to flesh out the text, but do we really need six photographs of terminals (two of them showing the same searcher at the same terminal, the only difference being that in one there is an onlooker) to illustrate that "some searchers prefer to have the user present"? brief chapters on the growing network of online user groups, and on the future of online services (largely derived from lancaster) end the text, and the book has a serviceable bibliography, glossary, and index. six years ago i reviewed one of the first kipi publications: it was in typescript, comb-bound, a little more than one hundred pages, and it cost $24.50. this is a much better production and, considering inflation since 1975, it represents vastly better value for money. it should serve as a useful handbook for those of us in the field, as well as those just starting, for another year or two. -peter watson, california state university, chico.
basics of online searching, by charles t. meadow and pauline atherton cochrane. new york: wiley, 1981. 245p. $15.95. lc: 80-23050. isbn: 0-471-05283-3.
the use of online information retrieval services is becoming widespread throughout the information community, whether in traditional libraries or in business, industry, or government offices. the need for trained searchers is evident by looking at the job advertisements and at the quantity of training programs being offered around the country. the programs presented by the machine-assisted reference section (mars) of the reference and adult services division of ala are always packed. the librarians attending ala annual conferences seem to be hungry for any information available about online information retrieval services. this text fills an obvious need for the professional who attended library school before course offerings in online information retrieval were available. although online information retrieval is now being taught in most library and information science curriculums, there have been only a few attempts at providing a textbook for beginning students, and none of those has been very successful since the lancaster and fayen information retrieval online in 1973. basics of online searching is a text intended "to teach the principles of interactive bibliographic searching . . . to those with little or no prior experience. the major intended audiences are students, working information specialists and librarians, and end users, the people for whom all this searching is done. " because the authors have done an excellent job of targeting their audience and sticking to that target, this text will be useful at the introductory level. the authors cover the elements of interactive searching including the reference interview, boolean logic, search strategy development, telecommunications and equipment, basic database structure, selective dissemination of information, and how to get help from search-service vendors. the text is relatively free of jargon and does a good job of defining in context new terms as they appear. 
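the boolean logic and truncation that the text is said to cover can be illustrated with a small sketch. this is not taken from the book or from any search service: the records, the function names, and the query notation in the comments are all hypothetical, and the three commercial systems each use their own command syntax.

```python
from collections import defaultdict

# Hypothetical toy records; not drawn from any real database.
RECORDS = {
    1: "online searching in academic libraries",
    2: "library automation and circulation systems",
    3: "training searchers for online retrieval",
    4: "videotex and teletext services",
}

def build_index(records):
    """Map each word to the set of record numbers that contain it."""
    index = defaultdict(set)
    for number, text in records.items():
        for word in text.split():
            index[word].add(number)
    return index

def truncated(index, stem):
    """Union of postings for every indexed word that starts with `stem`."""
    hits = set()
    for word, postings in index.items():
        if word.startswith(stem):
            hits |= postings
    return hits

INDEX = build_index(RECORDS)

# AND narrows: records with "online" and any word beginning "search".
both = INDEX["online"] & truncated(INDEX, "search")
# OR broadens: records with any word beginning "librar", or "videotex".
either = truncated(INDEX, "librar") | INDEX["videotex"]
# NOT excludes: records with "online" but without "training".
without = INDEX["online"] - INDEX["training"]

print(sorted(both), sorted(either), sorted(without))
```

set intersection, union, and difference stand in for and, or, and not; truncation is simply a union over every indexed term sharing a stem, which is why it broadens a search.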
the authors begin with basic definitions and a brief overview of the process of interactive searching. the reference interview and search strategy development are covered adequately, first with an introduction and then in a later chapter providing more detailed information. telecommunications and computer equipment are covered in enough detail for the novice. the next five chapters cover search language, databases, various types of text searching, and how to get on and off the computer. this section of the book uses examples that show the different approaches to the same process on three different systems-brs, orbit, and dialog. the authors do not lose sight of their intent to demonstrate the principles of online searching. there is a brief chapter on selective dissemination of information (sdi) and cross-file searching. the chapter explains how sdi is used and gives examples of constructing and saving a search for sdi on each of the three systems. the last chapter of the book, "search strategy," is especially good. there seemed to be something beyond the basic elementary information of the preceding chapters. the authors clearly demonstrate concept development and search strategy formulation. the authors do an excellent job of integrating the discussion of the three major search service vendors, lockheed's dialog, system development's orbit, and bibliographic retrieval services, inc. examples are used from each of the services with a discussion of the differences. the book does clarify the similarity of the services by showing how each function can be accomplished on each system. searchers using only one system now might use this text to see how easily their knowledge could be transferred to another system. problems with the text do not abound, but there are some that should be brought to the attention of the reader. there is a slight problem with the format of the examples.
the reviewer found herself searching for the completion of a paragraph of text on a few occasions. the examples are very good and clear; they are simply not separated from the text adequately for easy reading. there were a couple of instances of unnecessary redundancy. there were two separate discussions, one on truncation and one on searching word fragments, which could have been improved by integration into one section. there was a repetition of "steps in the presearch interview and the online search" in chapter 3 and then again in chapter 12. this is almost a page of steps, which are very good, but a simple reference back to the earlier list would have sufficed. but the biggest problem with the text in the eyes of this reviewer is that of omission. there was no discussion of citation searching or of evaluation of search results, and no mention of the various training options available for the novice searcher. this reviewer would like to have seen more information on where to go next as guidance to the novice. the one hundred pages of appendixes seem unnecessary and will soon be out of date. library school teachers planning to use this as a text would do well to request free, up-to-date materials rather than relying upon the documents in the appendix, which are more than a year old at the time of this writing. almost every book on this topic has made the same mistake of reprinting search-service and database-producer literature. overall, however, the authors have succeeded very capably in their intended endeavor "to teach principles, rather than the detailed mechanics of any particular search system." there is a place in the literature for this very basic text, which is well written, uses clear examples, and teaches in an understated way. for those people who are afraid of automation, afraid to touch a computer terminal, and are insecure about their ability to do online searching, this book will relieve most of those fears and insecurities.
the authors acknowledge their desire to give simple instructions and offer a chapter called "assistance" for people who need more help. novices might assume they could read this book, purchase a terminal, get a password and system manual, and begin searching. as a matter of fact one could do this, but the results would likely be a discredit to the search-service vendor because of a lack of system-specific training on the part of the searcher. most people, like this reviewer, can conceptualize a new process, but would feel more comfortable with some type of formal hands-on training-even for half a day. there are too many little things that can be an impediment to success. the reviewer would heartily recommend this book to inexperienced searchers and library school students but would warn the experienced searchers that there is nothing new for them. -carolyn m. gray, western illinois university, macomb.
quick•search cross-system database search guides. san jose, calif.: california library authority for systems and services, 1980. 21 charts. $75 (class members), $95 (nonmembers). isbn: 0-938098-00-4.
the class on-line reference service (colrs) is a cooperative program for public, academic, and special libraries offering training and consultation on almost any aspect of online reference searching through the major commercial vendors of databases. this service is a part of class, the california library authority for systems and services, and acts as a contact point for searchers and the database industry through vendor-training sessions, database training, and the coordination of large group contracts with dialog information services and bibliographic retrieval services (brs). this close relationship to the online industry gives class a unique position from which to supply information on databases from a multiple search-system perspective.
the publication of the quick•search cross-system database search guides is a natural outgrowth of the colrs program in training and consulting. the twenty-one charts in quick•search show the formats used to search for information in a specific database across the two or three vendors offering the database commercially. the databases were selected as the most commonly searched through the major commercial search services: bibliographic retrieval services, dialog information services, and system development corporation search service (sdc). eight databases in the sciences, eight in the social sciences, and five multidisciplinary files are included in the complete set. two subsets of the science and multidisciplinary files, and the social science and multidisciplinary files are available for $60 for class members and $80 for nonmembers. the eight science databases are biosis, cab abstracts, compendex, energyline, enviroline, food science & technology abstracts, inspec, and oceanic abstracts. the social science files are abi/inform, eric, exceptional child education resources, library and information science abstracts, management contents, psychological abstracts, social scisearch, and u.s. political science documents. the multidisciplinary databases are conference papers index, comprehensive dissertation index, ntis, pais international, and ssie current research. the stated purpose of the quick•search guides is to aid the experienced searcher who must use databases from more than one search service by showing the formats for each vendor of a database side by side for comparison. because most searchers tend to use a database on only one system, the guides are really more appropriate to an organization where several searchers may be using the same database through different systems and a "universal" quick-reference chart is needed.
because each guide covers only one database, the level of detail shown is much greater than in the simple-command comparison charts previously published. the guides are arranged to show particular features of the databases as they are used on the different search systems. the file label used to access the database and those fields that are searched when a term is entered with no restriction (the basic index) are shown at the top of each chart. the fields used in subject searching follow and show the field codes used to restrict subject searches, along with the format used online to enter search terms. the typical fields illustrated are title, subject descriptor, identifier, abstract, and category or section code. these fields vary according to database, but include the majority of subject access points used in the file. the balance of the chart is used to illustrate the field codes and formats used to retrieve information from other access points in the database such as author, journal source, language, publication date, document type, report numbers, or update code. these alternate access points vary widely by database, but each chart provides information on limiting searches by date, language, or update code at a minimum. the guides supply a useful amount of information for the experienced searcher needing a prompt on a form of entry for the fields available in a database, but a good understanding of the search system is required to use them properly. given the close contact class has with the database producers and online vendors, it is somewhat surprising to find inaccuracies and some misinterpretation in some of the guides. in the preface, for instance, the editor states, "in many brs files, uj and un are paragraph labels used in addition to de, mj, and mn. they are used to indicate major (uj) or minor (un) single word descriptors, similar to the df in dialog and iw in orbit." 
it is true that df is used in dialog to indicate a single-word descriptor, but in orbit the code is it. in brs, uj and un mean the term so restricted is an "unbound" part of a multiword descriptor-not a single-word descriptor (see brs/eric database guide, p. 14). the use of iw in orbit retrieves "unbound" words from the it field. the most trouble in the charts appears to be in the orbit sections. the basic index is misrepresented in several files and the iw field is only irregularly listed, even when it is present in the sdc version of the database. suggestions on the use of sensearch and stringsearch are not consistently illustrated for fields that cannot be directly restricted in some databases on orbit, such as abstract or supplementary index terms. many times the suggested search entry would not restrict retrieval to the field indicated on the chart. these inaccuracies would probably not doom an experienced searcher to failure in using a database, but they are annoying and do little to inspire absolute confidence in the information presented. class is to be complimented on the graphic representations in quick*search and the heavy stock used for the guides (the paper will probably outlive the information printed on it). addenda are planned for those databases changed or reloaded since the preparation of quick*search in october 1980, and a second edition is already under consideration. the quick*search guides are not meant as a replacement for vendor or database documentation and, in fact, are simply repackaged versions of the basic file descriptions available from the online vendors. considering the price of this publication, organizations would do well to consider investing instead in detailed user guides and updates for their searchers in order to provide the most accurate and current information on databases on a specific system. -rod slade, university of oregon library, eugene.
viewdata and videotext, 1980-81: a worldwide report.
transcript of viewdata '80, first world conference on viewdata, videotex, and teletext, london, march 26-28, 1980. white plains, n.y.: knowledge industry publications, 1980. 623p. $75 softcover. lc: 80-18234. isbn: 0-914236-77-6.
videotex '81. proceedings of videotex '81 international conference and exhibition, may 20-22, 1981, toronto, canada. northwood hills, middlesex, u.k.: online conferences ltd., 1981. 470p. $85 softcover.
viewdata '80 and videotex '81 were two state-of-the-art conferences for the emerging videotex field. videotex is the generic name for mass-market, consumer-oriented information retrieval systems of low cost and relative ease of use. videotex, as a technology, is divided into teletext systems and viewdata systems. teletext systems sequentially broadcast information using a portion of the television signal. subscribers, using a special decoder, can select individual pages from the several hundred offered. viewdata systems, on the other hand, are quite like online information systems except for their use of a television as a display device, their simplicity, and their broader range of transactions and information. these conference proceedings will be of interest to a limited audience. they are not for the complete beginner. nor will they provide hours of entertaining reading. neither meets academic publication criteria; many of the papers are fluff, outlines, or sales pitches. both proceedings have their share, unfortunately large, of uninformative articles. but if you are seriously interested in videotex's technology, uses, and social implications, then by all means at least skim the 1981 conference papers. the proceedings do describe the state of the art. moreover, the two proceedings, taken together, show some of the changes in the videotex field in the last year ... and not only in the spelling of "videotex." as state of the art, the viewdata '80 conference proceedings are already superseded.
most of the material has been adequately covered by now in other publications at a much lower cost. there are two exceptions to this, both worth noting. the proceedings has several excellent articles on the japanese captain system, the best published on that system. of additional interest is a report on control data corporation's (cdc) market test of their plato educational system. their report suggests a large consumer market for high-quality educational services even at a relatively high price. the videotex '81 conference proceedings are, of course, more current. there are four major topics of interest in the proceedings. firstly, there are several good presentations on videotex services, such as electronic publishing, retailing, and banking. there is an excellent discussion on what videotex means to newspapers, both in opportunities and threats. secondly, and particularly recommended, is a paper by tydeman and zwimpfer of the institute for the future. the paper outlines some of the social changes and problems that may result from large-scale videotex implementation. thirdly, there are updates on the existing videotex technologies and efforts from the french, japanese, canadian, and british groups. the british are perhaps the most interesting since they have a year of operational experience with their viewdata system, prestel. they state that most usage was from the business community, and their reports suggest that services are shifting to attract that market. if this is the case, it is a significant change from the original consumer orientation. there is also a good article on a prestel information provider's first year. of additional interest is that prestel-compatible databases and systems are being constructed in britain. thus, people will be able to access different systems using the same protocol. finally, there are numerous fascinating papers on american efforts.
the americans, in contrast to the british, seem very unsettled; there is still a multiplicity of designs. (at&t's decision on a modified telidon standard, not reported in the proceedings but a major event of the conference, may ameliorate that.) the papers indicate overall that the "classic" definitions of viewdata and teletext will crumble or will be supplemented in the face of 100-channel, two-way cable systems. several papers document how these new cable capabilities will provide channels for large amounts of information to be delivered by teletext, viewdata, or hybrid systems. a paper by simon notes that cable will not only provide large audiences for information services but will also eliminate some of the traditionally defined viewdata functions. for example, people will not buy commodity prices from a viewdata service if that same information is available on a cable channel at a lower price. unfortunately, there are some topics missing from the 1981 conference proceedings. consumer-oriented educational services are mentioned little. system-performance or human-factor considerations are rarely analyzed. there is much discussion of what services should be offered, but there is little discussion of how those services should be offered. no presentation is made on how to design very large databases for ease of use. particularly distressing is the relative omission of the word "quality" from the american papers in both proceedings. one cannot expect every home to be wired to access the entire library of congress. nonetheless, one can hope that videotex will not become merely a medium for used-car advertising. -mark s. ackerman, department of computer and information science, ohio state university and oclc, inc., columbus.
a candid look at collected works: challenges of clustering aggregates in glimir and frbr
gail thornburg
information technology and libraries | september 2014 53

abstract
creating descriptions of collected works in ways consistent with clear and precise retrieval has long challenged information professionals. this paper describes problems of creating record clusters for collected works and distinguishing them from single works: design pitfalls, successes, failures, and future research.

overview and definitions
the functional requirements for bibliographic records (frbr) was developed by the international federation of library associations (ifla) as a conceptual model of the bibliographic universe. frbr is intended to provide a more holistic approach to retrieval and access of information than any specific cataloging code. frbr defines a work as a distinct intellectual or artistic creation. put very simply, an expression of that work might be published as a book. in frbr terms, this book is a manifestation of that work.1

a collected work can be defined as “a group of individual works, selected by a common element such as author, subject or theme, brought together for the purposes of distribution as a new work.”2 in frbr, this type of work is termed an aggregate or “manifestation embodying multiple distinct expressions.”3 zumer describes an aggregate as “a bibliographic entity formed by combining distinct bibliographic units together.”4 here the terms are used interchangeably. in frbr, the definition of aggregates applies only to group 1 entities, i.e., not to groups of persons or corporate bodies.
the ifla working group on aggregates has defined three distinct types of aggregates: (1) collections of expressions, (2) aggregates resulting from augmentation or supplementing of a work with additional material, and (3) aggregates of parallel expressions of one work in multiple languages.5 while noting the relationships between the categories, this paper will focus on the first type. aggregates of the first type include selections, anthologies, series, books with independent sections by different authors, and so on. aggregates may occur in any format, from a volume containing both of the j. d. salinger works catcher in the rye and franny and zooey to a sound recording containing popular adagios from several composers to a video containing three john wayne movies.

gail thornburg (thornbug@oclc.org) is consulting software engineer and researcher at oclc, dublin, ohio.

the environment
the oclc worldcat database is replete with bibliographic records describing aggregates. it has been estimated that the database may contain more than 20 percent aggregates.6 this proportion may increase as worldcat coverage of recordings and videos tends to increase.

in the global library manifestation identifier (glimir) project, automatic clustering of the records into groups of instances of the same manifestation of a work was devised. glimir finds and groups similar records for a given manifestation and assigns two types of identifiers for the clusters. the first type is a manifestation id, which identifies parallel records differing only in language of cataloging or metadata detail, some of which are probably true duplicates whose differences cannot be safely deduplicated by a machine process. the second type is a content id, which describes a broader clustering, for instance, physical and digital reproductions and reprints of the same title from differing publishers.
this process started with the searching and matching algorithms developed for worldcat. the glimir clustering software is a specialization of the matching software developed for the batch loading of records to worldcat, deduplicating the database, and other search and comparison purposes.7 this form of glimirization compares an incoming record to database search results to determine what should match for glimir purposes. this is a looser match in some respects than what would be done for merging duplicates. the initial challenges of tailoring matching algorithms to suit the needs of glimir have been described in thornburg and oskins8 and in gatenby et al.9 the goals of glimir are (1) to cluster together different descriptions of the same resource and to get a clearer picture of the number of actual manifestations in worldcat so as to allow the selection of the most appropriate description, and (2) to cluster together different resources with the same content to improve discovery and delivery for end users. according to richard greene, “the ultimate goal of glimir is to link resources in different sites with a single identifier, to cluster hits and thereby maximize the rank of library resources in the web sphere.”10 glimir is related conceptually to the frbr model. if the goal of frbr is to improve the grouping of similar items for one work, then glimir similarly groups items within a given work. manifestation clusters specify the closest matches. content clusters contain reproductions and may be considered to represent elements of the expression level of the frbr model. the frbr and glimir algorithms this paper discusses have evolved significantly over the past three years. in addition, it should be recognized that the frbr algorithms use a map/reduce keyed approach to cluster frbr works and some glimir content while the full glimir algorithms use a more detailed and computationally expensive record comparison approach. 
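the keyed approach mentioned above can be illustrated with a minimal sketch. this is not the oclc implementation: the field names (`author`, `title`), the normalization, and the choice of key are assumptions made purely for illustration. the "map" step derives a key from each record and the "reduce" step gathers records that share a key, which is far cheaper than comparing every record against every other.

```python
import re
from collections import defaultdict


def normalize(text):
    """Lowercase, replace punctuation with spaces, collapse whitespace.

    A deliberately crude, hypothetical normalization for illustration.
    """
    return " ".join(re.sub(r"[^a-z0-9 ]", " ", text.lower()).split())


def work_key(record):
    """Map step: build a grouping key from assumed author and title fields."""
    return (normalize(record.get("author", "")), normalize(record.get("title", "")))


def cluster_by_key(records):
    """Reduce step: collect the ids of all records that share a key."""
    groups = defaultdict(list)
    for rec in records:
        groups[work_key(rec)].append(rec["id"])
    return dict(groups)


records = [
    {"id": 1, "author": "Salinger, J. D.", "title": "Franny and Zooey"},
    {"id": 2, "author": "salinger, j d", "title": "Franny and Zooey."},
    {"id": 3, "author": "Salinger, J. D.", "title": "The Catcher in the Rye"},
]
clusters = cluster_by_key(records)
```

records 1 and 2 fall under the same key despite differences in punctuation and case, while record 3 forms its own group. a keyed pass like this can only bring candidates together; the detailed record comparison the paper describes is a separate and more expensive step.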
the frbr batch process starts with worldcat enhanced with additional authority links, including the production glimir clusters. it makes several passes through worldcat, each pass constructing keys that pull similar records together for comparison and evaluation. as described by toves, “successive passes progressively build up knowledge about the groups allowing us to refine and expand clusters, ending up with the work, content and manifestation clusters to feed into production.”11 each approach to clustering has its limits of feasibility, but the frbr and glimir combined teams have endeavored to synchronize changes to the algorithms and to share insights. some materials are easier to cluster using one approach, and some using the other.

clustering meets aggregates
in the initial implementation of glimir, the issue of handling collected works was considered out of scope for the project. with experience, the team realized there can be no effective automatic glimir clustering if collected works are not identified and handled in some way.

why is this? suppose a record exists for a text volume containing work a. this matches to a record containing work a, but actually also containing work b. this matches to a work containing b and also containing works c, d, and e. the effect is a snowballing of cluster members that serves no one.

how could this happen? in a bibliographic database such as worldcat, items representing collected works can be catalogued in several ways. efforts to relax matching criteria in just the right degree to cluster records for the same work are difficult to devise and apply. the glimir and frbr teams consulted several times to discuss clustering strategies for works, content, and manifestation clusters. practical experience with glimir led to rounds of enhancements and distinctions to improve the software’s decisions.
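the snowballing effect described above follows from the fact that pairwise matches are transitive once they are merged into clusters. the sketch below is a toy model, not the glimir matcher: each record is reduced to the set of works it contains (something real software cannot read directly from a record), and the function names are hypothetical. it contrasts a naive "shares any work" match with a guarded match that refuses to pair a single work with a collected work.

```python
def find(parent, x):
    """Union-find lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def cluster(records, allow_match):
    """Merge every pair of records that allow_match accepts."""
    parent = {rid: rid for rid in records}
    ids = sorted(records)
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            if allow_match(records[a], records[b]):
                parent[find(parent, a)] = find(parent, b)
    groups = {}
    for rid in ids:
        groups.setdefault(find(parent, rid), []).append(rid)
    return sorted(groups.values())


# toy data: each record is modeled as the set of works it contains
records = {
    "r1": {"a"},
    "r2": {"a", "b"},
    "r3": {"b", "c", "d", "e"},
}


def shares_a_work(x, y):
    """Naive match: any work in common."""
    return bool(x & y)


def guarded(x, y):
    """Guarded match: never pair a single work with a collected work."""
    same_kind = (len(x) > 1) == (len(y) > 1)
    return same_kind and shares_a_work(x, y)


naive_clusters = cluster(records, shares_a_work)
guarded_clusters = cluster(records, guarded)
```

under the naive match all three records end up in one cluster even though r1 and r3 have nothing in common. the guard keeps the single work apart, although r2 and r3 still cluster together, so separating distinct collected works from one another remains a further problem.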
while glimir clusters can be, and have been, undone and redone on more than one occasion, it took experience from the team to realize that the clues to a collected work must be recognized.

bible and beowulf
as with many initial production startups, the output of glimir processing was monitored. reports for changes in any clusters of more than fifty were reviewed by quality control catalogers for suspicious combinations. and occasionally a library using a glimir- or frbr-organized display would report a strange cluster.

this was the case with a huge malformed cluster of records for the bible. such a work set tends to be large and unmanageable by nature; there are a huge number of records for the bible in worldcat. however, it was noticed the set had grown suddenly over the previous two months. user interface applications stalled when attempting to present a view organized by such a set. one day, a local institution reported that a record for beowulf had turned up in this same work set. this started the team on an investigation.

after much searching and analysis of the members of this cluster, the index case was uncovered. in many cases bibliographic records are allowed to cluster based on a uniform title. what the team found connecting these disparate records was a totally unexpected use of the uniform title: a field 240 subfield a with the contents “b.”. that’s right, “b.”. once the first case was located, it was not hard to figure out that there were numerous uniform “titles” with other single letters of the alphabet. so in this odd usage, bible and beowulf could come together if insufficient data were present in two records to discriminate by other comparisons. so, potentially, could other titles that started with “b.” seeing this unanticipated use of the uniform title field, the frbr and glimir algorithms were promptly modified to guard against it. the frbr and glimir clusters were then unclustered and redone.
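a guard against this kind of uniform title can be very small. the sketch below is an assumption about what such a check might look like, not the rule actually added to the frbr and glimir algorithms: the function name and the one-character threshold are hypothetical.

```python
def usable_uniform_title(title_240a):
    """Return True if a 240 $a value has enough content to cluster on.

    Counts letters and digits only, so "b." counts as a single character.
    The threshold of more than one character is an illustrative assumption.
    """
    significant = [ch for ch in title_240a if ch.isalnum()]
    return len(significant) > 1


checks = {t: usable_uniform_title(t) for t in ("b.", "bible", "beowulf", "")}
```

with this check a record whose uniform title is just “b.” would have to match on other evidence, so bible and beowulf would no longer be drawn together by the 240 field alone.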
this was a data issue, and unanticipated uses of fields in a record will crop up, if usually with less drama.

further experience showed more. in the examination of another ill-formed cluster, a reviewer realized that one record had the uniform title stated as “illiad” but the item title was homer’s “odyssey.” of course these have the same author, and may easily have the same publisher. even the same translator (e.g., richmond lattimore) is not improbable for a work like this. this was a case of bad data, but it collapsed two very large clusters into one.

music and identification of collected works
as music catalogers know, musical works are very frequently presented in items that are collections of works. the rules for creating bibliographic records for music, whether scores or recordings or other, are intricate. the challenges to software to distinguish minor differences in wording from critical differences seem to be endless.

moreover, musical sound recordings are largely collected works due to the nature of publication. as noted by papakhian, personal author headings are repeated oftener in sound recording collections than in the general body of materials.12 there are several factors that may contribute to such an observation. there are likely to be numerous recordings by the same performer of different works and numerous records of the same work by different performers. composers are also likely to be performers. the point is, for sound recordings an author statement and title may be less effective discriminators than for printed materials.

vellucci13,14 and riley15 have written extensively on the problems of music in frbr models. the problem of distinguishing and relating whole/part relationships is particularly tricky. musical compositions often consist of units or segments that can be performed separately. so they are generally susceptible to extraction.
these extractive relationships are seen in cases where parts are removed from the whole to exist separately, or perhaps parts for a violin or other instrument are extracted from the full score. software must be informed with rules as to significant differences in description of varying parts and varying descriptions of instruments, and in this team’s experience that is particularly difficult.

krummel has noted that the bibliographic control of sound recordings has a dimension beyond item and work, that is, performance.16 different performances of the same beethoven symphony need to be distinguished. cast and performer list evaluation and dates checking are done by the software. however, the comparisons the software can make are susceptible to fullness or scarcity of data provided in the bibliographic record. there is great variation observed in the numbers of cast members stated in a record. translator and adapter information can prove useful in the same sense of roles discrimination for other types of materials. this is close scrutiny of a record. at the same time consider that an opera can include the creative contributions of an author (plot), a librettist, and a musical composer. yet these all come together to provide one work, not a collected work.

tillett has categorized seven types of bibliographic relationships among bibliographic entities, including the following:
1. equivalence, as exact copies or reproduction of a work. photocopies, microforms are examples.
2. derivative relationships, or, a modification such as variations, editions, translations.
3. descriptive, as in criticism, evaluation, review of a work.
4. whole/part, such as the relation of a selection from an anthology.
5. accompanying, as in a supplement or concordance or augmentation to a work.
6. sequential, or chronological relationships.
7. shared characteristic relationships, as in items not actually related that share a common author, director, performer, or other role.17

while it is highly desirable for a software system to notice category 1 to cluster different records for the same work, that same software could be confused by “clues,” such as in category 7. and the software needs to understand the significance of the other categories in deciding what to group and what to split.

to handle these relations in bibliographic records, tillett discusses linking devices including, for instance, uniform titles. yet uniform titles are used for the categories of equivalence relationships, whole/part relationships, and derivative relationships. this becomes more and more complex for a machine to figure out. of course, uniform titles within bibliographic records are supposed to link to authority records via text string only. consideration should ideally be given to linking via identifiers, as has been suggested elsewhere.18

thematic indexes
review of glimir clusters for scores and recordings showed a case where haydn’s symphonies a and b were brought together. these were outside the traditional canon of the 104 haydn symphonies and were referred to as “a” and “b” by the haydn scholar h. c. robbins landon. this misclustering highlighted the need for additional checks in the software.

the original glimir software was not aware of thematic indexes as a tool for discrimination. thematic indexes are numbering systems for the works of a composer. the köchel mozart catalog, as in k. 626, is a familiar example. these designations are not unique across composers; that is, each is intended to be unique within one composer’s works, but identical designators may coincidentally have been assigned to works by different composers.
while “b” series numbers may be applied to works of chambonnières, couperin, dvořák, pleyel, and others, the presence of more than one b number is suggestive of collected work status. for more on the various numbering systems, see the interesting discussion by the music library association.19

however, the software cannot merely count likely identifiers in the usual place. this could lead to falsely flagging aggregates; one work by dvořák could have b.193, which is incidentally equivalent to opus 105. clearly, any detection of multiple identifiers of this sort must be restricted to identifiers of the same series.

string quartet number 5, or maybe 6
cases of renumbering can cause problems in identifying collected works. an early suppressed or lost work, later discovered and added to the canon of the composer’s work, can cause renumbering of the later works. clustering software must be very attentive to discrete numbers in music, but can it be clever enough?

the works of paul hindemith (1895–1963) offer an example. his first string quartet was written in 1915, but long suppressed. his publisher was generally schott. long after hindemith’s death, this first quartet was unearthed, and then was published by schott. the publisher then renumbered all the quartets. so quartets previously 1 through 6 became 2 through 7. the rediscovered work was then called “no. 1,” though sometimes called “no. 0” to keep the older numbering intact. further, the last two quartets did not even have opus numbers assigned and were both in the same key.20 this presents a challenge.

anything musical
another problem case emerged when reviewers noticed a cluster contained both the unrelated songs “old black joe” and “when you and i were young maggie.” on investigation, the cluster held a number of unrelated pieces. here the use of alternate titles in a 246 field had led to overclustering, and the rules for use of 246 fields were tightened in frbr and glimir.
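the same-series restriction on thematic-index identifiers, described under thematic indexes above, can be sketched as follows. the series labels, the regular expression, and the function names are assumptions for illustration only; real designators are far more varied than this pattern allows, and this is not the glimir code.

```python
import re
from collections import defaultdict

# hypothetical, deliberately short list of series labels: opus, köchel, and "b" numbers
SERIES = ("op", "k", "b")
IDENTIFIER = re.compile(r"\b(" + "|".join(SERIES) + r")\.\s*(\d+)", re.IGNORECASE)


def series_numbers(text):
    """Group the distinct numbers found in the text by their series label."""
    found = defaultdict(set)
    for label, number in IDENTIFIER.findall(text):
        found[label.lower()].add(int(number))
    return found


def suggests_collected_work(text):
    """True only when one series contributes more than one distinct number."""
    return any(len(numbers) > 1 for numbers in series_numbers(text).values())


one_work = suggests_collected_work("string quartet, b. 193 (op. 105)")
two_works = suggests_collected_work("string quartets, b. 192 and b. 193")
```

the first string carries two identifiers, but they come from different series and describe one work, so it is not flagged. the second carries two numbers from the same series and is flagged. simply counting identifiers would have flagged both.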
as in the other problem cases, cycles of testing were necessary to estimate sufficient yet not excessive restrictions. rules too strict split good clusters and defeat the purpose of frbr and glimir.

at this point the glimir/frbr team recognized that rules changes were necessary but not sufficient. that is, a concerted effort to handle collected works was essential.

strategies for identifying collected works
the greatest problem, and most immediate need, was to stop the snowballing of clusters. clusters containing some member records that are collected works can suddenly mushroom out of control. rule 1 was that a record for a collected work must never be grouped with a record for a single work. if all in a group are collected works, that is closer to tolerable (more on that later).

with time and experimentation, a set of checks was devised to allow collected works to be flagged. these clues were categorized into two types: (1) conclusive evidence, or (2) partial evidence. a type 2 clue needed another piece of evidence in the record. finding the best clues was a team effort. it was acknowledged that to prevent overclustering, overidentification of aggregates was preferable to failure to identify them. several cycles of tests were conducted and reviewed, assessing whether the software guessed right.

table 1 illustrates the types of checks done for a given bibliographic record. here the “$” is used as abbreviation for subfield, and “ind” equals indicator.

| area | field | rule | notes |
| --- | --- | --- | --- |
| uniform title | 240 $a and no $m, $n, $p, or $r | title in $a on list of terms, without the other subfields listed, is collected work | this is a long list of terms such as “symphonies,” “plays,” “concertos,” and so on. |
| title | 245 | contains “selections,” is collected | |
| | 245 | 245 with multiple semicolons and doc type “rec” | |
| | 246 | if four or more v246 fields with ind2 = 2, 3, or 4, is collected. if more than 1 246, consider partial evidence | |
| extent | 300 | if 300$a has “pagination multiple” or “multiple pagings,” is collected | |
| contents notes | 505 $a and $t | 1. check $a for first and last occurrences of “movement”. if not multiple movement occurrences and does have multiple “ / ” pattern. 2. if the above doesn’t find multiple patterns, also look for “ ; ” patterns. 3. if the above checks don’t produce more than 1 pattern, look for multiple “ – ” patterns. 4. count 505s $t cases. 5. count $r cases. | if all / any the above produce more than one pattern instance, or more than one $t, or more than one $r, is collected. |
| various fields for thematic index clues | 505a | if any v505 $a, check for differing opuses. (this also checks for thematic index cases.) if found, is collected. | for types score and recording |
| related work | 740 | if 1 or more 740 and 1 has indicator 2 = 2, is collected. if only multiple 740s, partial evidence | |
| author | 700/710/711/730 | check for $t and $n. and check 730 ind 2 value of “2.” if 730 with ind2 = 2 or multiple $t is found, is collected. if only 1 $t, partial evidence | |
| | 100/110/111, 700/710, 730 | if format recording, and both records are collected work, require cast list match to cluster anything but manifestation matches. that is, do not cluster at content level without verifying by cast. | |

table 1. checks on bibliographic records.

frailties of collected works identification in well-cataloged records
the above table illustrates many areas in a bibliographic record that can be mined for evidence of aggregates. the problem is that cataloging practice offers no single mandatory rule for cataloging a collected work correctly. moreover, as worldcat membership grows, the use of multiple schemes of cataloging rules for different eras and geographic areas adds to the complexity, even assuming that all the bibliographic records are cataloged “correctly.” correct cataloging is not assumed by the team.
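a few of the checks in table 1 can be sketched to show how conclusive and partial evidence might be combined. everything about the representation is an assumption: records are plain dictionaries keyed by marc tag, each field holds an `ind2` value and a list of (subfield code, value) pairs, the term list is a three-item excerpt, and treating two partial clues as sufficient is one reading of "needed another piece of evidence." only the 240, 245, 246, 300, and 740 rules are covered.

```python
# excerpt only; the paper describes the real list of terms as long
COLLECTIVE_TERMS = {"symphonies", "plays", "concertos"}


def subfield_values(field, code):
    """Return the values of every subfield with the given code."""
    return [value for c, value in field["subfields"] if c == code]


def evidence(record):
    """Return (conclusive, partial) lists of clue names for one record."""
    conclusive, partial = [], []
    for f in record.get("240", []):
        codes = {c for c, _ in f["subfields"]}
        titles = [v.strip(" .,;").lower() for v in subfield_values(f, "a")]
        if any(t in COLLECTIVE_TERMS for t in titles) and not codes & {"m", "n", "p", "r"}:
            conclusive.append("240 collective uniform title")
    for f in record.get("245", []):
        if any("selections" in v.lower() for _, v in f["subfields"]):
            conclusive.append("245 selections")
    f246 = record.get("246", [])
    if sum(1 for f in f246 if f.get("ind2") in {"2", "3", "4"}) >= 4:
        conclusive.append("four or more 246 with ind2 of 2, 3, or 4")
    elif len(f246) > 1:
        partial.append("multiple 246")
    for f in record.get("300", []):
        phrases = ("pagination multiple", "multiple pagings")
        if any(p in v.lower() for v in subfield_values(f, "a") for p in phrases):
            conclusive.append("300 multiple pagings")
    f740 = record.get("740", [])
    if any(f.get("ind2") == "2" for f in f740):
        conclusive.append("740 with ind2 of 2")
    elif len(f740) > 1:
        partial.append("multiple 740")
    return conclusive, partial


def is_collected(record):
    """Flag on any conclusive clue, or on two or more partial clues (assumed)."""
    conclusive, partial = evidence(record)
    return bool(conclusive) or len(partial) >= 2


plain = {"ind2": " ", "subfields": [("a", "x")]}
uniform = {"240": [{"ind2": " ", "subfields": [("a", "Symphonies.")]}]}
numbered = {"240": [{"ind2": " ", "subfields": [("a", "Symphonies,"), ("n", "no. 5")]}]}
one_partial = {"246": [plain, plain]}
two_partial = {"246": [plain, plain], "740": [plain, plain]}
results = [is_collected(r) for r in (uniform, numbered, one_partial, two_partial)]
```

the bias toward flagging follows the paper's stated preference for overidentifying aggregates over missing them. a production version would also need the 505, 7xx, and thematic-index checks, which depend on pattern counting not shown here.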
software confounded
with all the checks outlined in the table, the team still found cases of collected works that seemed to defy machine detection. one record had the two separate works, tom sawyer and huckleberry finn, in the same title field, with no other clues to the aggregate nature of the item.

the work brustbild was another case. for this electronic resource set, brustbild appeared to be the collection set title, but the specific title for each picture was given in the publisher field.

a cluster for the work gedichte von eduard mörike (score) showed problems with the uniform title, which was for the larger work, while the cluster records each actually represented parts of the work.

the bad cluster for si ku quan shu zhen ben bie ji, an electronic resource, contained records which each appeared to represent the entire collection of 400 volumes, but the link in each 856 field pointed only to one volume in the set.

limitations of the present approach
the current processing rules for collected works adopt a strategy of containment. the problem may be handled in the near term by avoiding the mixing of collected works with noncollected works, but the clusters containing collected works need further analysis to produce optimal results.

for example, it is one thing, for scores, to notice “arrangements” as a clue to the presence of an aggregate. the requirement also exists that an arrangement should not cluster with the original score. the rules for clustering and distinguishing different sets of arrangements present another level of complexity. checks to compare and equate the instruments involved in an arrangement are quite difficult; in this team’s experience, they fail more often than they succeed. without initial explication of the rules for separating arrangements, reviewers quickly found clusters such as haydn’s schöpfung, which included records for the full score, vocal score, and an arrangement for two flutes.
an implementation that expects one manifestation to have the identifier of only one work is a conceptual problem for aggregates. a simple case: if the description of a recording of bernstein’s mass has an obscurely placed note indicating the second side contains the work candide, mass is likely to be dominant in the clustering effect, with the second work effectively “hidden.” this manifestation would seem to need three work ids, one for the combination, one for mass, and one for candide.

this does not easily translate to an implementation of the frbr model but could perhaps be achieved via links. several layers of links would seem necessary. a manifestation needs to link to its collected work. a collected work needs links to records for the individual works that it contains, and vice versa, individual works need to link to collective works. this can be important for translations, for example, into russian, where collective works are common even where they do not exist in the original language.

lessons learned
first and foremost, plan to deal with collected works. for clustering efforts this must be addressed in some way for any large body of records.

secondly, formats deserve focused attention. the initial implementation of the glimir algorithms used test sets mainly composed of a specific work. after all, glimir clusters should all be formed within one work. these sets were carefully selected to represent as many different types of work sets as possible, whether clear or difficult examples of work set members. plenty of attention was given to the compatibility of differing formats, given the looser content clustering. these were good tests of the software’s ability to cluster effectively and correctly within a set that contained numerous types of materials. random sets of records were also tested to cross check for unexpected side effects.
in retrospect, the team would have expanded the test sets that focused on specific formats. recordings, scrutinized as a group, can show different problems than scores or books. the distinctions to be made are probably not complete.

another lesson learned in glimir concerned the risks of clustering. the deliberate effort to relax the very conservative nature of the matching algorithms used in glimir was critical to success in clustering anything. singleton clusters don’t improve anyone’s view. in the efforts to decide what should and should not be clustered, it was initially hard to discern the larger scale risks of overclustering. risks from sparse records were probably handled fairly well in this initial effort, but risks from complex records needed more work. collected works are only one illustration of the risks of overclustering.

future research
the current research suggests a number of areas for possible further exploration:
• the option for human intervention to rearrange clusters not easily clustered automatically would seem to be a valuable enhancement.
• there is next the general question, what sort of processing is needed, and feasible, to distinguish the members of clusters flagged as collected works?
• part versus whole relationships can be difficult to distinguish from the information in bibliographic records. further investigation of these descriptions is needed.
• arrangements of works in music are so complex as to suggest an entire study by themselves. work on this area is in progress, but it needs rules investigation.
• other derivative relationships among works: do these need consideration in a clustering effort? can and should they be brought together while avoiding overclustering of aggregates?
• how much clustering of collected works may actually be helpful to persons or processes searching the database? how can clusters express relationships to other clusters?
information technology and libraries | september 2014 63 conclusion clustering bibliographic records in a database as large as worldcat takes careful design and undaunted execution. the navigational balance between underclustering and overclustering is never easy to maintain, and course corrections will continue to challenge the navigators. acknowledgments this paper would have been a lesser thing without the patient readings by rich greene, janifer gatenby, and jay weitz, as well as their professional insights and help in clarifying cataloging points. special thanks to jay weitz for explicating many complex cases in music cataloging and music history. references 1. barbara tillett, “what is frbr? a conceptual model for the bibliographic universe,” last modified 2004, accessed november 22, 2013, http://www.loc.gov/cds/frbr.html. 2. janifer gatenby, email message to the author, november 10, 2013. 3. international federation of library associations (ifla) working group on aggregates, final report of the working group on aggregates, september 12, 2011, http://www.ifla.org/files/assets/cataloguing/frbrrg/aggregatesfinalreport.pdf. 4. maja zumer and edward t. o’neill, “modeling aggregates in frbr,” cataloging and classification quarterly 50, no. 5–7 (2012): 456–72. 5. ifla working group on aggregates, final report. 6. zumer and o’neill, “modelling aggregates in frbr.” 7. gail thornbug and w. michael oskins, “misinformation and bias in metadata processing: matching in large databases,” information technology & libraries 26, no. 2 (2007): 15–22. 8. gail thornburg and w. michael oskins, “matching music: clustering versus distinguishing records in a large database,” oclc systems and services 28, no. 1 (2012): 32–42. 9. janifer gatenby et al., “glimir: manifestation and content clustering within worldcat,” code{4}lib journal 17 (june 2012),http://journal.code4lib.org/articles/6812. 10. richard o. 
greene, “cataloging alchemy: making your data work harder” (slideshow presented at the american library association annual meeting, washington, dc, june 26–29, 2010), http://vidego.multicastmedia.com/player.php?p=ntst323q. 11. jenny toves, email message to the author, december 17, 2013. 12. arsen r. papakhian, “the frequency of personal name headings in the indiana university music library card catalogs,” library resources & technical services 29 (1985): 273–85. http://www.loc.gov/cds/frbr.html http://www.ifla.org/files/assets/cataloguing/frbrrg/aggregatesfinalreport.pdf http://journal.code4lib.org/articles/6812 http://vidego.multicastmedia.com/player.php?p=ntst323q a candid look at collected works | thornburg 64 13. sherry l. vellucci, bibliographic relationships in music catalogs (lanham, md: scarecrow, 1997). 14. sherry l. vellucci, “frbr and music,” in understanding frbr: what it is and how it will affect our retrieval tools, ed. arlene g. taylor (westport, ct: libraries unlimited, 2007), 131–51. 15. jenn riley, “application of the functional requirements for bibliographic records (frbr) to music,” www.dlib.indiana.edu/~jenlrile/presentations/ismir2008/riley.pdf. 16. donald w. krummel, “musical functions and bibliographic forms,” the library, 5th ser. 31 (1976): 327–50. 17. barbara tillett, “bibliographic relationships: toward a conceptual structure of bibliographic information used in cataloging,” (phd diss., graduate school of library & information science, university of california, los angeles, 1987), 22–83. 18. program for cooperative cataloging (pcc) task group on the creation and function of name authorities in a non marc environment, “report on the pcc task group on the creation and function of name authorities in a non marc environment,” last modified 2013, http://www.loc.gov/aba/pcc/rda/rda%20task%20groups%20and%20charges/reportpcc tgonnameauthina_nonmarc_environ_finalreport.pdf. 19. 
music library association, authorities subcommittee of the bibliographic control committee, “thematic indexes used in the library of congress/naco authority file,” http://bcc.musiclibraryassoc.org/bcc-historical/bcc2011/thematic_indexes.htm. 20. jay weitz, email message to the author, may 6, 2013. http://www.dlib.indiana.edu/~jenlrile/presentations/ismir2008/riley.pdf http://www.loc.gov/aba/pcc/rda/rda%20task%20groups%20and%20charges/reportpcctgonnameauthina_nonmarc_environ_finalreport.pdf http://www.loc.gov/aba/pcc/rda/rda%20task%20groups%20and%20charges/reportpcctgonnameauthina_nonmarc_environ_finalreport.pdf http://bcc.musiclibraryassoc.org/bcc-historical/bcc2011/thematic_indexes.htm overview and definitions the environment clustering meets aggregates in the initial implementation of glimir, the issue of handling collected works was considered out of scope for the project. with experience, the team realized there can be no effective automatic glimir clustering if collected works are not identified ... why is this? suppose a record exists for a text volume containing work a. this matches to a record containing work a, but actually also containing work b. this matches to a work containing b and also containing works c, d, and e. the effect is a snowb... bible and beowulf music and identification of collected works thematic indexes string quartet number 5, or maybe 6 anything musical strategies for identifying collected works the greatest problem, and most immediate need, was to stop the snowballing of clusters. clusters containing some member records that are collected works can suddenly mushroom out of control. rule 1 was that a record for a collected work must never be grouped with a record for a single work. if all in a group are collected works, that is closer to tolerable (more on that later). 
frailties of collected works identification in well-cataloged records
software confounded
limitations of the present approach
lessons learned
future research
conclusion
acknowledgments
this paper would have been a lesser thing without the patient readings by rich greene, janifer gatenby, and jay weitz, as well as their professional insights and help in clarifying cataloging points. special thanks to jay weitz for explicating many co...
references

150 book reviews

networks and disciplines; proceedings of the educom fall conference, october 11-13, 1972, ann arbor, michigan. princeton: educom, 1973. 209p. $6.00.

as with so many conferences, the principal beneficiaries of this one are those who attended the sessions, and not those who will read the proceedings. except for a few prepared papers, the text is the somewhat edited version of verbatim, ad lib summaries of a number of workshop sessions and two panels that purport to summarize common themes and consensus. since few people are profound in ad lib commentaries, the result is shallow and repetitive. the forest of themes is completely lost among a bewildering array of trees. the conference was, i am sure, exciting and thought-provoking for the participants. it was simply organized, starting with statements of networking activities in a number of disciplines, i.e., chemistry, language studies, economics, libraries, museums, and social research. the paper on economics is by far the best organized presentation of the problems and potential of computers in any of the fields considered, and perhaps the best short presentation yet published for economics. the paper on libraries was short, that on chemistry lacking in analytical quality, that on language provocative, that on social research highly personal, and that on museums a neat mixture of reporting and interpreting.
much of the information is conditional, that is, it described what might or could be in the realm of the application of computers to the various subjects. the speakers all directed their papers to the concept of networks, interpreted chiefly as widespread remote access to computational facilities. the papers are followed by very brief transcripts of the summaries of workshops in which the application of computers to each of the disciplines was presumably discussed in detail. much of each summary is indicative and not really informative about the discussions. the concluding text again is the transcript of two final panels on themes and relationships among computer centers. the only description for this portion of the text is turgid. in the midst of all this is the banquet paper presented by ed parker, who as usual was thoughtful and insightful, and several presentations by national science foundation officials that must have been useful at the time to guide those relying on federal funding for computer networks in developing proposals. i can't think of another reference that touches on the potential of computers in so many different disciplines, but it is apparent from the breadth of ideas and the range of suggested or tested applications that a coherent and analytical review should be done. this volume isn't it. russell shank smithsonian institution the analysis of information systems, by charles t. meadow. second edition. los angeles: melville publishing co., 1973. a wiley-becker & hayes series book. this is a revised edition of a book first published in 1967. the earlier edition was written from the viewpoint of the programmer interested in the application of computers to information retrieval and related problems. the second edition claims to be "more of a textbook for information science graduate students and users" (although it is not clear who these "users" are) . 
elsewhere the author indicates that his emphasis is on "software technology of information systems" and that the book is intended "to bridge the communications gap among information users, librarians and data processors." the book is divided into four parts: language and communication (dealing largely with indexing techniques and the properties of index languages), retrieval of information (including retrieval strategies and the evaluation of system performance), the organization of information (organization of records, of files, file sets), computer processing of information (basic file processes, data access systems, interactive information retrieval, programming languages, generalized data management systems). the second two sections are, i feel, much better than the first. these are the areas in which the author has had the most direct experience, and the topics covered, at least in their information retrieval applications, are not discussed particularly well or particularly fully elsewhere. it is these sections of the book that make it of most value to the student of information science. i am less happy about meadow's discussion of indexing and index languages, which i find unclear, incomplete, and inaccurate in places. the distinction drawn between pre-coordinate and post-coordinate systems is inaccurate; meadow tends to refer to such systems simply as keyword systems, although it is perfectly possible to have a post-coordinate system based on, say, class numbers, which can hardly be considered keywords, while it is also possible to have keyword systems that are essentially pre-coordinate. in fact, meadow relates the characteristic of being post-coordinate to the number of terms an indexer may use ("... permit their users to select several descriptors for an index, as many as are needed to describe a particular document"), but this is not an accurate distinction between the two types of system.
the real difference is related to how the terms are used (not how many are used), including how they are used at the time of searching. the references to faceted classification are also confusing and a number of statements are made throughout the discussion on index languages that are completely untrue. for example, meadow states (p. 51) that "a hierarchical classification language has no syntax to combine descriptors into terms." this is not at all accurate since several hierarchical classification schemes, including udc, do have synthetic elements which allow combination of descriptors, and some of these are highly synthetic. in fact, meadow himself gives an example (p. 38-39) of this synthetic feature in the udc. it is also perhaps unfortunate that the student could read all through meadow's discussion of index languages without getting any clear idea of the structure of a thesaurus for information retrieval and how this thesaurus is applied in practice. moreover, meadow used medical subject headings as his example of a thesaurus (p. 33-34), although this is not at all a conventional thesaurus and does not follow the usual thesaurus structure. my other criticism is that the book is too selective in its discussion of various aspects of information retrieval. for example, the discussion on automatic indexing is by no means a complete review of techniques that have been used in this field. likewise, the discussion of interactive systems is very limited, because it is based solely on nasa's system, recon. the student who relied only on meadow's coverage of these topics would get a very incomplete and one-sided view of what exists and what has been done in the way of research. in short, i would recommend this book for those sections (p. 183-412) that deal with the organization of records and files and with related programming considerations.
the author has handled these topics well and perhaps more completely, in the information retrieval context, than anyone else. indexing and index languages, on the other hand, are subjects that have been covered more completely, clearly, and accurately by various other writers. i would not recommend the discussion on index languages to a student unless read in conjunction with other texts.

f. w. lancaster
university of illinois

application of computer technology to library processes, a syllabus, by joseph becker and josephine s. pulsifer. metuchen, n.j.: scarecrow press, 1973. 173p. $5.00.

despite the large number of institutions offering courses related to library automation, including just about every library school in north america, accredited or not, there is a remarkable shortage of published material to assist in this instruction. with the publication of this small volume a light has been kindled; let us hope it will be only the first of many, for larger numbers of better educated librarians must surely result in higher standards in the field.

152 journal of library automation vol. 7/2 june 1974

this syllabus covers eight topics related to the use of computers in libraries, titled as follows: bridging the gap (librarians and automation); computer technology; systems analysis and implementation; marc program; library clerical processes (which encompasses acquisitions, cataloging, serials, circulation, and management information); reference services; related technologies; and library networks. each topic is treated as a unit of instruction, and each receives the identical treatment as follows. the units each start with an introductory paragraph, explaining what the field encompasses, and indicating the purpose of teaching that topic. the purpose of systems analysis, for example, is "to develop the sequence of steps essential to the introduction of automated systems into the library."
a series of behavioral objectives are then listed, to show what the student will be able to do (after he has learned the material) that he presumably was unable to do before. for example, there are seven behavioral objectives in the unit on computer technology, of which the first four are: "1) the student will be able to discuss the two-fold requirement to represent data by codes and data structures for purposes of machine manipulation, 2) the student will be able to identify the basic components of computer systems and describe their purposes, 3) the student will be able to differentiate hardware and software and describe briefly the part that programming plays in the overall computer processing operation, 4) the student will be able to define the various modes of computer operation and indicate the utility of each in library operations." the remaining three objectives refer to the student's ability to enumerate and compare types of input, output, and storage devices. then an outline of the instructional material is presented, followed by the detailed and well-organized material for instruction. in no case can the material presented here be considered all that an instructor would need to know about the field, but a surprising amount of specific detail is included, along with a carefully organized framework within which to place other knowledge. the end result is to present to the instructor a series of outlines that would encompass much of the material included in a basic introductory course in library automation. every instructor would, presumably, want to add other topics of his own in addition to adding other material to the topics treated in this volume, but he has here an extremely helpful guide to a basic course, and the only work of its kind to be published to date. peter simmons school of librarianship university of british columbia the larc reports, vol. 6, issue 1. 
online cataloging and circulation at western kentucky university: an approach to automated instructional resources management. 1973. 78p. this is a detailed account of the design, development, and implementation of online cataloging and circulation which have been in operation at western kentucky university for several years. the library's reasons for using computers are similar to those of many college and university libraries that experienced rapid growth during the 1960s. the faculty of the division of library services first prepared a detailed proposal with appropriate feasibility studies and cost analyses to reclassify the collection from dewey decimal to library of congress classification. the proposal was approved by the administration of the university, and the decision was made to utilize campus computer facilities via online input techniques for reclassification, cataloging, and circulation. "project reclass" was accomplished during 1970-71 using ibm 2741 ats/360 terminals. a circulation file was subsequently generated from the master record file. the main library is housed in a new building and has excellent computer facilities within the library that are connected to the university computer center. cataloging information is input directly into the system via ats terminals; ibm 2260 visual display terminals are used for inquiry into the status of books and patrons; and ibm 1031/1033 data collection terminals are used to charge out and check in books. catalog cards and book catalogs in upper/lower case are produced in batch mode on regular schedule. the on-line circulation book record file is used in conjunction with the on-line student master record and payroll master record files for preparation of overdue and fine notices. apparently the communication between library staff and computer personnel has been well above average, and cooperation of the administration and other interested parties has been outstanding.
the attention given to planning, scheduling, training, and implementation is impressive. what has been accomplished to date is considered very successful, and plans are underway to develop on-line acquisitions ordering and receiving procedures. the report has some annoying shortcomings such as referring to the library of congress as "national library"; frequent use of the word "xeroxing," which the xerox corporation is attempting to correct; "inputing" for "inputting"; and several other misspelled words. some parts are poorly organized and unclear, but the report does provide many useful details for those considering a similar undertaking.

lavahn overmyer
school of library science
case western reserve university

letter from the editors (march 2023)
kenneth j. varnum and marisha c. kelly
information technology and libraries | march 2023
https://doi.org/10.6017/ital.v42i1.16319

welcome to the march 2023 issue. despite the date, snow still covers the ground where the editor lives, and winter still appears to be holding on tightly to both coasts. we’re pleased to share with you the first issue of the calendar year and a collection of five peer-reviewed articles, as well as some news and updates (below). we also have a column in our public libraries leading the way series, “virtual production at cloud901 in the memphis central library” by alan ji and david mason, about how that library has adapted cutting-edge production techniques used in streaming tv shows such as the mandalorian to create virtual scenery in their teen-focused makerspace.
peer-reviewed articles in the current issue are listed here: • the current state and challenges in democratizing small museums’ collections online / avgoustinos avgousti and georgios papaioannou • services to mobile users: the best practice from the top visited public libraries in the us / yan quan liu and sarah lewis • decision-making in the selection, procurement, and implementation of alma/primo: the customer perspective / jin xiu guo and gordon xu • exploring final project trends utilizing nuclear knowledge taxonomy: an approach using text mining / faizhal arif santosa • japanese military “comfort women” knowledge graph: linking fragmented digital records / haram park and haklae kim call for new editorial board members coming in april the ital editorial board, a core committee, will be issuing a call for volunteers in april. for those selected, two-year terms of service will start on july 1. editorial board members have a critical role in building the foundation for the journal’s future through setting policy and content guidelines. members of the board have several key responsibilities: • shaping the direction and strategy for the journal; • participating in online editorial board meetings; • soliciting contributions to the journal (based on personal networking, conference attendance, etc.); and • optionally reviewing articles submitted to the journal, for those who want to be involved at an even deeper level (see the peer reviewer job description). if you are interested in furthering the scholarly record for library technology and have a background in information technology in libraries, archives, or museums, this is an exciting opportunity to contribute to the profession and engage with colleagues across all types of organizations in examining the role of technology in libraries. because we want the editorial board to reflect the broad diversity of core’s membership, we especially encourage individuals from underrepresented groups and identities to apply. 
ital will move to a new host this summer

over the past year, the editors of the three core journals, ital, library leadership & management (ll&m), and library resources and technical services (lrts), have been working with core and the core board to consolidate our journals on a single publishing platform. we’re pleased to say that ll&m and ital will move this summer to ala’s open journal systems platform, where lrts is already published. we’ll have more details to share in our june issue, before the move, but want to let you know some important details:
• ital’s urls will change, but dois will continue to resolve to the new home of the journal. we will work with our current host, boston college, to set up redirects to the new location.
• ala uses the same publishing platform as boston college, open journal systems, so for authors and reviewers, the experience will remain the same.
• articles published in ital (and our two sibling journals) will continue to be open access with no fees charged to authors or readers. authors maintain copyright in their work.
we are very grateful to boston college for their support of information technology and libraries over the past decade, and to the core board for supporting this project.

be a part of a future issue

as the u.s.
academic year hurtles to a close this spring, it’s a great time to think about the work you’ve accomplished and what you might share with your library colleagues near and far. our call for submissions (https://ejournals.bc.edu/index.php/ital/call-for-submissions) outlines the topics of interest to the journal (basically, if the submission discusses the intersection of libraries/archives/museums and technology, it’s potentially in scope) and the process for submitting an article. we’d love to consider your article for publication. or, if you have an idea you’d like to discuss with ital’s editors, contact either of us at the email addresses below.

kenneth j. varnum, editor
marisha c. kelly, assistant editor
varnum@umich.edu
marisha.librarian@gmail.com

salazar 170 information technology and libraries | september 2006

traditional, larger libraries can rely on their physical collection, coffee shops, and study rooms as ways to entice patrons into their library. yet virtual libraries have only their online presence to attract students to resources. this can only be achieved by providing a fully functional site that is well designed and organized, allowing patrons to navigate and locate information easily. one such technology significantly improving the overall usefulness of web sites is a content management system (cms). although the cms is not a novel technology per se, it is a technology smaller libraries cannot afford to ignore. in the fall of 2004, the northcentral university electronic learning resources center (elrc), a small, virtual library, moved from a static to a database-driven web site. this article explains the importance of a cms for the virtual or smaller library and describes the methodology used by elrc to complete the project.
state of the virtual library

the northcentral university electronic learning resource center (elrc), a virtual library, moved from a static to a database-driven web site in 2004.1 before this, the site consisted of 450 static pages and continued to multiply due to the creation and expansion of northcentral university (ncu) programs. to provide the type of service demanded by our internet-savvy patrons, the elrc felt it needed to evolve to the next stage of web management and design. ncu, with a current enrollment of roughly twenty-one hundred full-time students, is one of many for-profit virtual universities (including the university of phoenix, capella, and walden, among others) seeking to carve a niche in the education market by offering professional degrees entirely online.2 in the past few years, distance education has experienced exponential growth, causing virtual universities to flourish, but forcing on their libraries the challenge of keeping pace.3 typically, virtual libraries are manned by a limited staff comprised of one or two librarians who are responsible for all facets of the library, including interlibrary loan, virtual reference, library instruction, and web site management, among other library duties.4 web site management, as expected, becomes cumbersome when a site exceeds two hundred or more static pages and a clear and structured system is not in place to maintain a proliferating number of web pages. because virtual, for-profit libraries do not rely on public funding and taxes, they tend not to be as concerned about autonomy as public or state libraries, which must find ways to stay within budget and curtail expenses. on the same note, some academic libraries prefer to maintain a local area network (lan), while other libraries may not have the staff, resources, or need for such a system. thus, for some virtual libraries, such as elrc, the incorporation of technology takes on a more dependent role.
that is, where some libraries are encouraged to explore open source applications and create homegrown tools, the virtual, smaller-staffed library finds itself more or less reliant on its university’s information technology (it) department.5 virtual libraries address the needs of distance education students, who demand an equivalent, if not surpassing, level of service and instruction as they would expect to find at physical libraries.6 meeting these needs requires a great deal of creativity, ingenuity, and a strong technical background. recent trends in developing technologies such as mylibrary, learning objects, blogs, virtual chat, and federated searching have broadened the scope of possibilities for the smaller-staffed, virtual library. in particular, a content management system (cms) utilizes a combination of tools that provide numerous advantages, as outlined below: 1. the creation of templates that maintain a consistent design throughout the site 2. the convenience of adding, updating, and deleting information from a single, online location 3. the creation and maintenance of interactive pages or learning objects 4. the implementation of a simple editing interface that eliminates knowledge of extensible hypertext markup language/hypertext markup language (xhtml/ html) by library staff simply defined, a cms is comprised of a database, server pages such as active server page (asp), personal home page (php), or coldfusion; a web server—for example, internet information server (iis), personal web server (pws), or apache; and an editing tool to manage web content.7 these resources vary in price, but for a virtual library integrated into a larger university, it is ideal to implement applications and software supported by the university. for the autonomous academic library, this may differ. 
there are advantages and disadvantages to using proprietary and nonproprietary software, and it is left to the library, virtual or physical, to determine the type of resources needed to meet the goals and mission of the university.8 although this article focuses on the creation of tools for a homegrown cms, some libraries may wish to explore commercial cms packages that include additional services such as technical support. these cms packages will vary in price and services depending on the vendor and the needs of the library.9

content management for the virtual library
ed salazar
ed salazar (esalazar@ncu.edu) is reference/web librarian at northcentral university.

elrc transformed

in fall 2004, a group that consisted of two librarians, the education chair, and a programmer convened to discuss the redesign of the elrc web site, which had become increasingly difficult to manage. specifically, the amount of duplicated content, inconsistent design and layout, and unstructured architecture of the site posed severe navigational and organizational problems. the group selected and compared other academic library sites to determine a desired design and theme for the new elrc site. discussions also involved the addition of features such as a site search and breadcrumbs, which the group felt were essential. as a result, the creation of a homegrown cms using proprietary software became the route of choice for meeting the increasing demands of patrons and the need to expand the site. because ncu utilizes microsoft (ms) information system products, it was agreed ms or ms-compatible applications would be used to create the cms, which consisted of sql server, iis, asp, visual basic script (vbscript), jspell iframe, and ms visual interdev.
ms visual interdev and jspell iframe supplanted our previous web editor, ms frontpage, which seemed to generate superfluous code and thus made it difficult to debug or alter the design and layout of pages. also, using jspell iframe eliminated the need for future ncu librarians to possess an expertise in xhtml/html. with these pieces in place, the arduous task of culling content from static pages and entering it into a database was begun. the database the sql server database helped in organizing and structuring content, and allowed for the creation of templates and administration (admin) pages.10 in addition, the database played an integral part in creating the search, breadcrumb, and site map features the group so desperately wanted. a significant amount of time was spent weeding the site for information that had become obsolete or irrelevant to elrc. it should be noted that the group originally attempted to use access for a database but stumbled across several problems, one being the inability to maintain a stable and reliable connection to the database. the templates with the database nearly complete, the programmer began creating asp templates in ms visual interdev. these templates basically serve as the shell of the web page, preserving the design and layout elements of the page while extracting unique content based on a user’s request. in essence, a single template can produce hundreds of pages consistent in design. likewise, a single change to the template can alter the entire design of the site. for the elrc, seven templates were created for more than 450 pages. figure 1 shows the elrc course guides template. figure 2 shows the public view of the elrc course guide template. changes to the templates are done using ms visual interdev, which offers a user-friendly environment for managing web pages. 
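the template idea described above, one shared shell filled with page-specific content pulled from a database, can be sketched in a few lines. this is only an illustration under stated assumptions: the elrc used asp templates against sql server, whereas the sketch below uses python with an in-memory sqlite database, and the table name, column names, and sample rows are hypothetical.

```python
import sqlite3
from html import escape
from string import Template

# Hypothetical page shell; every page shares this layout.
PAGE_TEMPLATE = Template(
    "<html><head><title>$title</title></head>"
    "<body><h1>$title</h1><div>$body</div></body></html>"
)


def build_demo_db():
    """Create an in-memory table standing in for the site's content store."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pages (id INTEGER PRIMARY KEY, title TEXT, body TEXT)"
    )
    conn.executemany(
        "INSERT INTO pages (id, title, body) VALUES (?, ?, ?)",
        [
            (1, "course guides", "guides arranged by course."),
            (2, "writing center", "help with academic writing."),
        ],
    )
    return conn


def render_page(conn, page_id):
    """Fetch one page's unique content and pour it into the shared shell."""
    row = conn.execute(
        "SELECT title, body FROM pages WHERE id = ?", (page_id,)
    ).fetchone()
    if row is None:
        raise KeyError(f"no page with id {page_id}")
    title, body = row
    # Escape stored text so it cannot break the surrounding markup.
    return PAGE_TEMPLATE.substitute(title=escape(title), body=escape(body))


conn = build_demo_db()
print(render_page(conn, 1))
print(render_page(conn, 2))
```

because both pages pass through the same `PAGE_TEMPLATE`, editing that one string changes the layout of every page, which mirrors the point that a single change to a template can alter the design of the whole site.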
ms visual interdev also includes helpful features, such as highlighting code errors for easy debugging, and the ability to access, create, and maintain stable connections to databases.11 in addition, the ms visual interdev editor recognizes commonly used asp commands, allowing the user to save time by utilizing keyboard shortcuts when programming. besides creating templates, asp server-include files and cascading style sheets (css) were incorporated, allowing for the easy modification of code on a single file instead of each and every page or template. this, in particular, is time-efficient when having to add or change database connections or design elements. also, the elrc took extra precaution to ensure that style elements met the accessibility requirements and standards set forth by the world wide web consortium (w3c), as well as tested the site on other browsers, such as firefox and netscape.12 as the site continues to grow and expand, so may the need for additional templates. creation or replication of templates is simple, requiring a basic understanding of programming and the re-assigning of new variables in the code to match added or modified tables. there is some speculation in the near future of migrating the site to the asp.net environment for added functionality and security. if and when that time comes, the elrc will be ready. at present, ncu is not considering the use of open source code or applications (the exception being the apache web server); this is primarily due to available technical support, security, and intuitiveness of use associated with commercial software. in addition, the ncu information system was built using commercial software and a complete transition to open source, at the moment, is not possible or desirable. with the templates complete, the elrc began running a prototype of the new site, making it accessible to students and faculty from a link on the old site. a survey was created that allowed users to comment on the new site. 
one detail of importance to note is that the survey duplicated a prior survey done on the old site in 2003 in order to provide the elrc with comparative data.

the admin pages

the next phase of the project required the creation of admin pages, which would allow content to be quickly added, updated, and deleted on the site. these pages, like the templates, were created in ms visual interdev; display content is housed within the database on the web, thus allowing it to be changed on the fly. figure 3 shows all of the web pages for the elrc within a table. what is particularly convenient about the admin edit pages is the incorporation of the jspell iframe editor, which serves as the front-end editor to the site. the reason for using jspell iframe, as stated earlier, is its ease of use: the simple tool bar provides the basic, essential tools necessary for creating content without the daunting number of buttons and menu selections other editors tend to have. also, jspell iframe is reasonably priced and does not entail a complex installation or require any space on local hard drives; instead, the program is maintained on the server. consequently, all that is required is the insertion of the jspell iframe javascript code into the web pages. in addition to jspell iframe, fields within admin edit pages are or can be pre-populated by content in the database. for instance, the title or display order of links can be easily edited or changed. longer text fields comprised of paragraphs are created or modified using jspell iframe. deleting a page is simple, requiring only the click of a delete button in the bottom right-hand corner. figure 4 shows jspell iframe embedded within an admin edit page. the admin add page is straightforward. information is entered into the fields appearing on a form page, and the proper page type designation is selected from a drop-down menu.
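the add, edit, and delete operations behind such admin pages reduce to a handful of database statements. the following is a minimal sketch, not the elrc code: it assumes a single hypothetical `pages` table with a title, a page type, and a display order, uses sqlite in place of sql server, and the page type values stand in for whatever the drop-down menu actually offered.

```python
import sqlite3

# Hypothetical values for the page type drop-down menu.
PAGE_TYPES = ("course guide", "database list", "general")


def open_store():
    """Create an in-memory table standing in for the site database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pages ("
        "id INTEGER PRIMARY KEY, title TEXT NOT NULL, "
        "page_type TEXT NOT NULL, display_order INTEGER NOT NULL, body TEXT)"
    )
    return conn


def add_page(conn, title, page_type, display_order, body=""):
    """Admin add: insert a new page after checking the page type."""
    if page_type not in PAGE_TYPES:
        raise ValueError(f"unknown page type: {page_type}")
    cur = conn.execute(
        "INSERT INTO pages (title, page_type, display_order, body) "
        "VALUES (?, ?, ?, ?)",
        (title, page_type, display_order, body),
    )
    conn.commit()
    return cur.lastrowid


def edit_page(conn, page_id, title, display_order):
    """Admin edit: change a page's title and its position among links."""
    conn.execute(
        "UPDATE pages SET title = ?, display_order = ? WHERE id = ?",
        (title, display_order, page_id),
    )
    conn.commit()


def delete_page(conn, page_id):
    """Admin delete: remove the page row."""
    conn.execute("DELETE FROM pages WHERE id = ?", (page_id,))
    conn.commit()


def list_titles(conn):
    """Return titles in the order the public site would display them."""
    rows = conn.execute("SELECT title FROM pages ORDER BY display_order")
    return [row[0] for row in rows]


conn = open_store()
psych = add_page(conn, "psychology guide", "course guide", 2)
business = add_page(conn, "business guide", "course guide", 1)
print(list_titles(conn))
edit_page(conn, psych, "psychology course guide", 0)
print(list_titles(conn))
delete_page(conn, business)
print(list_titles(conn))
```

since every change is a row update rather than a file edit, the public pages pick it up on the next request, which is what lets content be changed on the fly.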
yet, more importantly, the admin add and the admin edit pages can filter information to specific users for security purposes and library needs. figure 5 shows an admin add page. figure 6 shows an admin edit page.

the admin pages were designed with flexibility in mind. main column headings may be sorted, as seen in figure 3, allowing one to locate a particular page. the sorting feature also displays the inner structure of the database that, in turn, identifies parent-child relationships between pages in the elrc, which is useful and necessary when adding pages to the elrc site. due to the careful thought used in creating the admin pages, they have proven to be extremely effective and useful in maintaining a library web site. each and every change to the site can be made on the web, allowing content to be edited remotely and eliminating the need for installing and maintaining expensive editing software on local and remote machines.

usability testing

with the site completed, the elrc felt it important to perform usability tests, but how does a virtual library conduct usability testing when all of its students are distance education students? this is a difficult question that takes some ingenuity to answer. to solve this problem, staff members were asked (begged) to volunteer for the study; five staff members took part. also, a local college class of about ten students was persuaded to participate in the study. granted, the total number of subjects is not representative of the ncu student body; however, substantial changes to the site were made from the data gathered. more usability testing is expected in the immediate future.

figure 1. elrc course guide template
figure 2. public view of the elrc course guide template

content management for the virtual library | salazar

the findings

usability testing complete, the site was launched.
during this period, a few minor hang-ups were experienced, including broken links, form page errors, and stray design elements, but these were only minor problems that were quickly fixed. feedback from the elrc survey showed that nearly all of the students and faculty, roughly fifty respondents, approved of the changes by commenting that the site had improved in layout and organization of content as well as navigation. also, responses and comments from usability testing participants were equally positive and encouraging. figure 7 shows the new ncu learners elrc home page. although it is difficult to establish a direct connection between the elrc site and usage, recent statistics appear promising. since the inception of the new site in december 2004, the number of visits to the elrc learners home page has jumped 10 percent. this number is expected to rise as ncu continues to grow and students become more acquainted and familiar with the site. the project took nearly six months to complete and required the expertise of a programmer. although programming may be outside the requisites of a distance librarian, managing the site is not. a general understanding of control statements and sql is all that is needed. for the distance librarian who spends almost all of his or her time online, these skills can be acquired on the job or by taking introductory programming courses at a local college. in the hope that the site will continue to expand in concert with the growing body of ncu students, recently the elrc added a writing center and blog. with the entire site now being database driven, adding, updating, deleting content is done effortlessly. ideally, students and faculty will play a greater role in the development of the elrc site as a result of the changes. involving patrons with the site can play an integral, beneficial role in their academic pursuits. figure 3. web pages for elrc within a table figure 4. 
jspell iframe editor embedded within an admin edit page

conclusion

the elrc at ncu encourages other virtual or smaller libraries to explore their resources for improving their library web sites, which involves understanding campus resources and personnel. with the ever-burgeoning growth of technological resources, every library, small or large, virtual or physical, public or private, can empower itself to meet the needs of internet-savvy students. it is only a matter of being aware of the resources and putting them to good use.

references and notes

1. the ncu elrc web site comprises three separate sites: the public site www.ncu.edu/elrc (accessed dec. 2, 2004), the mentors site http://mentors.ncu.edu/elrc (accessed dec. 2, 2004), and the learners site http://learners.ncu.edu/elrc (accessed dec. 2, 2004). although similar in design, each site is tailored to meet the needs of each individual group as well as protect ncu’s resources, services, and information. access to subscription resources and personal information is available upon authentication of the user to the site.
2. for a detailed overview of virtual libraries, see valerie a. akuna, “virtual universities: the new higher education paradigm,” estrella mountain college, http://students.estrellamountain.edu/drakuna/virtualuniversities.htm (accessed feb. 15, 2005).
3. u.s. department of education, national center for education statistics, “the condition of education 2004,” distance education at postsecondary institutions, http://nces.ed.gov/pubsearch/pubsinfo.asp?pubid=2004077 (accessed feb. 8, 2005).
4. for more information on the role of the virtual librarian in a virtual university, see jan zastrow, “going the distance: academic librarians in the virtual university,” university of hawaii–kapiolani community college, http://library.kcc.hawaii.edu/~illdoc/de/depaper.htm (accessed jan. 29, 2005).
5.
for an overview on developing an open source cms, please see mark dahl, “content management strategy for a college library web site,” information technology and libraries 23, no. 1 (2004).
6. for a detailed discussion on distance education and virtual libraries, see smiti gandhi, “academic librarians and distance education: challenges and opportunities,” reference & user services quarterly 43, no. 2 (2003).
7. for detailed information on using asp pages for managing databases, see xiaodong li and john paul fullerton, “create, edit, and manage web database content using active server pages,” library hi tech 20, no. 3 (2002); see also, bryan h. davidson, “database driven, dynamic content delivery: providing and managing access to online resources using microsoft access and active server pages,” oclc systems and services 17, no. 1 (2001).
figure 6. admin edit page
figure 5. admin add page
8. for advantages and disadvantages of open source and proprietary software, see john caroll, “open source versus proprietary: both have advantages,” special to cnet asia, http://asia.cnet.com/builder/program/work/0,39009380,39181451,00.htm (accessed feb. 4, 2004); see also, stephen shankland, “study: open-source database going mainstream,” cnet, http://ecoustics-cnet.com.com/study+open-source+databases+going+mainstream/2100-7344_3-5171543.html (accessed feb. 4, 2004).
9. for information on commercial content management vendors and prices, see cms watch, www.cmswatch.com/cms/vendors (accessed feb. 15, 2005). “sql server 2000 product overview,” microsoft windows server system, www.microsoft.com/sql/evaluation/overview/default.asp (accessed feb. 15, 2005).
10. for a review on visual interdev, see maggie biggs, “visual studio 6.0 demonstrates improved integration,” infoworld 20, no. 35 (1998), www.infoworld.com/cgi-bin/displaytc.p1?/reviews/980831vstudio6.htm (accessed feb. 4, 2004).
11.
“checklist of checkpoints for web content accessibility guidelines 1.0,” w3c, www.w3.org/tr/wai-webcontent/full-checklist.html (accessed feb. 1, 2005).
12. jspell iframe 2004, www.jspell.com/iframe-spell-checker.html (accessed dec. 2, 2004).

figure 7. elrc learners home page

who rules the rules?

"why can't the english teach their children how to speak?" wondered henry higgins, implying that a lack of widely and consistently followed rules of usage created linguistic backwardness and anarchy. higgins' question might be rephrased today as: "when will the code teach its founders how to catalog?"

the library of congress has historically fitted catalog codes to its own practices rather than following them slavishly. the best example is the lamentable policy of superimposition: continued use of preestablished forms of names that are not in compliance with the paris principles or aacr1. this was a cause of widespread confusion and complaint and the practice was eventually discontinued ... well, sort of discontinued.

the various interpretations of aacr1, the inclusion of new rules, and pressure for further modifications eventually led to the drafting of aacr2, a code that was supposed to end variance and controversial practices. one might assume that including lc as a principal author of the new text and an lc official as one of the editors might result in a code that it could actually follow. judging by the spate of exceptions and interpretations made so far (more than 300), this has not been the case. in the place of superimposition, we have new impositions known as "compatible headings." they may not be readily ascertained according to the rules, but have been granted a sort of bibliographic squatter's rights.
although it would be simpler for catalogers to follow the rules consistently, they must instead check several cataloging service bulletins and name authorities to see whether lc has determined that a given personal, corporate, or serial name is already "compatible" with aacr2. this can result in cataloging delays, higher processing costs, and inconsistent entries. aacr2 and uncertainties regarding its application by lc have been widely credited with lower cataloging productivity.

this is not to imply that lc is behaving in a strictly arbitrary or capricious manner vis-a-vis the code. they can be seen as caught on the horns of a trilemma, with vast internal needs and increasing external demands competing for a shrinking budget. president reagan may have whispered sweet nothings during national library week, but during budget hearings it became clear that libraries are not as "truly needy" as impoverished generals and interior decorators. decisions to depart from aacr2 have been based primarily on cost factors. the decision by the rtsd catalog code revision committee and the joint steering committee not to consider cost and implementation factors has led both to widespread opposition to the code resulting in a one-year delay in implementation, and to the modifications that lc has made and is making.

journal of library automation vol. 14/3 september 1981

some variations such as using "dept." for "department" and "house" for "house of representatives" make fiscal and common sense. many other lc changes are simply bibliographic nit-picking, minor irritants to catalogers who must flip back and forth between the text of aacr2 and half a dozen bulletins to settle a minor point of description. why didn't lc representatives attempt to say, "wait a minute, we just can't do that now," while the code was being considered rather than after it was published?
anyway, considering that lc was starting up a whole new catalog and closing the old one, one wonders why rules not to be applied retrospectively had to be tinkered with to such an extent.

major questions still to be resolved include not only the compatible-name quandary, but the treatment of serials, microform reproductions, establishment of corporate names and determination of when works "emanate from" corporate bodies, and the romanization of slavic names.

the decision to use title entry for serials and monographic series even in the case of generic titles has been controversial. there are, of course, exceptions to the rules, and there will be differences in how uncertain catalogers construct complex entries with parenthetical modifiers. unfortunately, rules establishing entries for serials have sometimes been muddied rather than clarified in the bulletin. consider the example in the winter 1981 issue wherein the bulletin of the engineering station of west virginia university is entered under "bulletin," while the same publication for the entire university is entered under "west virginia university bulletin." also, consider the complex cross-reference structure required to direct users between the two files, both of which may well be split again, historically, between author/title and title main entry. this is a special problem in the case of large monographic series generated by corporate bodies.

the lc position on microform reproductions of previously published works is clearer, but is still a point of controversy. they have decided to provide the imprint and collation (er, make that "publication, distribution, etc., area" and "physical description area") of the original work, with a description of the microform in a note. in other words, they're sticking to aacr1. the rtsd ccs committee on cataloging: description and access is currently trying to resolve this conflict, one in which many research libraries have sided with lc.
this body is also trying to unravel the mystique of "corporate emanation" introduced in aacr2.

another sore point has been the lc decision to follow an alternative rule, which prefers commonly known forms of romanized names over those established via systematic romanization. that lc is correctly following the spirit of the general principle for personal names is little comfort to research libraries with large slavic collections.

how are other libraries responding to the murky form of aacr2? some are closing old card catalogs and continuing them with com or temporary card supplements. some of these are establishing cross-reference links between variant forms of names between catalogs, while others are not. some are keeping their catalogs open and shifting files, while others are splitting files. some are shifting some files and splitting others.

aacr2 was intended to provide headings that could be easily ascertained by the user. ironically, the temporary result is scrambled catalogs: access systems involving multiple lookups and built-in confusion. until most bibliographic records are in machine-readable form under reliable authority control this will continue to be the case. authority control, it would seem, has long been an idea whose time has come but whose application is yet to be realized.

the cooperative efforts of the library of congress and the major bibliographic utilities to establish reliable automated authority control will do much to ameliorate the problems presented by aacr2. it would also be helpful if lc, perhaps with the financial assistance of other libraries, networks, and foundations, would publish what might be called aacr2½: not a new edition of the code but one accurately reflecting actual lc practice. finally, future code makers would be wise to consider cost and other implementation factors in their deliberations.
professor higgins, ever the optimist, would rather sing "wouldn't it be loverly" than hear another verse of "i did it my way."

james r. dwyer

editor's notes

title change

it often seems that the only things that change their names as often as library publications are standards organizations. not to be left out, jola will be called information technology and libraries beginning with volume 1, number 1, the march 1982 issue. this name was approved by the lita board in san francisco this june as more accurately reflecting the true scope of the journal.

new section

with this issue, we are initiating a new section: "reports and working papers." this is intended to help disseminate documents of particular interest to the jola readership. we solicit suggestions of documents, often developed as working papers for a specific purpose or group but of interest and value to our readership. in general, documents in this section are neither refereed nor edited.

mitch

i take great personal pleasure in publishing mike malinconico's speech upon presenting the 1981 lita award to mitch freedman.

readers' comments

we do continue to solicit suggestions about the journal but receive few. is anybody reading it? if you have any thoughts about what we should or shouldn't do, we would welcome your sharing them.

statistical behavior of search keys

abraham bookstein: graduate library school, university of chicago

editor's note: the editor and author are aware that varying approaches may be taken to the problem presented here. readers are invited to respond in the form of a paper or a technical communication.

in discussion about search keys, concern has been expressed as to how the number of items retrieved by a single value relates to collection size. this paper creates a statistical model that attempts to give some insight into this behavior.
it is concluded that, in general, the observed behavior can be explained as being intrinsically statistical in nature rather than being a property of specific search keys. an attempt is made to relate this model to other research, and to indicate how this model may be made to yield more accurate predictions.

journal of library automation vol. 6/2 june 1973

introduction

various experiments suggest that it may be possible to develop, as an access route into a file of bibliographic records, a search key* whose values can be easily derived from such bibliographic data as is likely to be available to its users.1 some concern, however, has been expressed regarding the nonuniqueness of these keys: if the number of items retrieved were often to exceed an amount easily handled by a user of the system, the value of this access route would be considerably diminished. accordingly, an important measure of search key performance is the frequency with which a large number of records is retrieved as the search key is applied to the file. this measure is related, for example, to how many memory accesses will be required, on the average, to retrieve all records satisfying a request; it is also an important consideration in deciding which display device should be installed in a system.2,3

* by the phrase "search key" we mean a key similar to the 3-3 or 3-1-1-1 keys used at ohio college library center and other places, which is made up by concatenating truncations of bibliographic data elements.

after evaluating such a measure for a search key on a particular file, it is reasonable to ask how that measure will change over time, as the file increases in size. the nature of this variation has already been of concern to researchers in the field. kilgour, on the basis of a number of experiments carried out at oclc, notes that "there remains a major problem to be solved and a major question to be answered.
the problem is constituted of those replies that contain a number of entries exceeding the optimal maximum. ... the major question to be answered is how truncated search keys will perform on files ten and a hundred times the size of that used in this experiment."4 he elsewhere observes that "as a file of bibliographic entries increases, the maximum number of entries per reply does not increase in a one-to-one ratio. ..."5 this paper presents a mathematical model that addresses itself to the problem defined by kilgour and attempts to explain his observation; it is suggested that the gross features of the behavior are statistical in nature and not properties of specific search keys.

a view of collection growth

the cause of the phenomenon observed by kilgour can best be understood by first considering a simple model which, while not itself valid, does cast light on the nature of the behavior. this first model neglects the effect of randomness both in the growth of the collection and in the arrival of requests. it supposes our search key has the following property: regardless of collection size, the fraction of the collection retrieved by a particular search key value, $v_i$, is exactly given by a constant $f_i$; thus, if the file holds $N$ records, a request for $v_i$ will retrieve $n_i = f_i N$ records. this model similarly assumes that among any sizeable number of requests, the fraction of the time any particular search key value will occur is fixed; thus, for any subset of search key values, it is possible to determine how often members of that subset will occur among a set of requests.

in particular, for any integer $n$, we can form the set of all the search key values that will retrieve less than $n$ items. we can then determine how often search key values from that set are requested. if, for example, requests for these values occur 99 percent of the time, then we can assert that 99 percent of the time less than $n$ items will be retrieved.
if the file contains $N$ items, then these $n$ items constitute the fraction $f = n/N$ of the file. should the collection size increase to $lN$, then the model predicts that 99 percent of the time less than $f(lN) = ln$ items would be retrieved. in other words, we have precisely the behavior kilgour observes does not occur.

this argument shows that a simple deterministic model does not conform to experience with search keys. the model breaks down in two ways, which accounts for the discrepancy between the results derived from it and kilgour's observations:

1. in any actual library, the fraction of the time that a particular request will appear within a sequence of requests will vary; and
2. in comparing two different samples having the same size, the number of items having a given search key value will vary.

the first of these factors is easily dealt with and its analysis will suggest the number of requests to use in a test of search key behavior in a given library. for a particular collection, let $S$ denote the set of search key values for which, say, twenty or more items are retrieved. we would like to find the fraction of the time that a request in $S$ occurs in the long run; suppose this value is in fact $q$. then among $M$ requests, the probability that $m$ members of $S$ occur is given by the binomial distribution $f_b(m \mid q, M)$. this distribution has a mean of $qM$ and a variance of $qM(1 - q)$. should we desire to estimate the actual fraction of the time that twenty or more items will be retrieved, we can take a sample of $M$ requests and compute $\hat{q}$, the fraction of the requests with search key values in $S$; if we do so, we will usually get a value for $\hat{q}$ between

$$q - \frac{2}{\sqrt{M}}\sqrt{q(1-q)} \quad \text{and} \quad q + \frac{2}{\sqrt{M}}\sqrt{q(1-q)}.$$

if, for example, $q = .01$ and $M = 10{,}000$, we would tend to find $\hat{q}$ in the interval $.01 \pm .002$.
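the interval quoted above can be checked numerically. the following python sketch is illustrative only and is not part of the original paper: the function names `request_sample_interval` and `requests_needed` are hypothetical, and the two-standard-deviation rule is the same normal approximation the text relies on.

```python
import math


def request_sample_interval(q, M):
    """Bounds within which the sample fraction usually falls.

    q is the long-run fraction of requests whose key value lies in S,
    M is the number of requests sampled. The half-width is two standard
    deviations of the sample fraction, (2 / sqrt(M)) * sqrt(q * (1 - q)).
    """
    half_width = 2.0 / math.sqrt(M) * math.sqrt(q * (1.0 - q))
    return q - half_width, q + half_width


def requests_needed(q, half_width):
    """Smallest M (up to rounding) whose two-sigma half-width is at most half_width."""
    return math.ceil(4.0 * q * (1.0 - q) / half_width ** 2)


if __name__ == "__main__":
    low, high = request_sample_interval(0.01, 10_000)
    print(f"interval for q = .01, M = 10,000: {low:.5f} to {high:.5f}")
    print("requests needed for a half-width of .002:", requests_needed(0.01, 0.002))
```

run in the other direction, `requests_needed` suggests how many requests a test of search key behavior might use to reach a chosen precision, under the same approximation.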
thus the effect of randomness in the arrival of requests can easily be controlled by increasing the number of requests considered; furthermore, the size of error can be predicted.

we next introduce the second factor; its analysis will suggest how the behavior of search keys will change as the collection grows in size. for this purpose we adopt a model of collection growth which assumes that as items arrive, they are randomly distributed among the search key values in accordance with some probability distribution. if we suppose that the probability of an item being assigned a specified search key value, $v_i$, is $p_i$, then in a collection of $N$ items we may conclude that the probability of $n$ items having that value is given by the binomial distribution:

$$f_b(n \mid p_i, N) = \binom{N}{n} p_i^{\,n} (1 - p_i)^{N-n}.$$

if $g'(v_i)$ is the probability that the value $v_i$ is selected from the request population, then the probability that the "next" request retrieve $n$ items is given by

$$\sum_i g'(v_i)\, f_b(n \mid p_i, N) = \int g(p)\, f_b(n \mid p, N)\, dp;$$

here $g(p)\,dp \equiv \sum_{p \le p_i \le p + dp} g'(v_i)$ is the probability that a request arrive with value $p_i$ in the interval $(p, p + dp)$, and $g(p)$ will be treated as a continuous function.**

since the expectation of the binomial distribution is given by $pN$, we have $N \int p\, g(p)\, dp \equiv N\bar{p}$ as the expected number of items retrieved by a random request; since this is proportional to $N$, doubling the size of the collection will, on the average, double the amount of material retrieved. similarly, the variance, $\sigma^2$, is given by

$$\sigma^2 = N^2\left(\overline{p^2} - \bar{p}^2\right) + N \int p(1 - p)\, g(p)\, dp.$$

should $\overline{p^2} - \bar{p}^2$, the variance of $p$, be small, this reduces to $N \int p(1 - p)\, g(p)\, dp \equiv \bar{\sigma}^2 N$, so that approximately 95 percent of the time the amount of material retrieved would be less than

$$N\bar{p} + 2\sqrt{N}\,\bar{\sigma} = N\left(\bar{p} + \frac{2\bar{\sigma}}{\sqrt{N}}\right).$$

** this result would more precisely be expressed as $\int f_b(n \mid p, N)\, dG(p)$, which has the form of a stieltjes integral. the expression used in the text is simpler and reasonably valid because of the vast number of values the search key can take.

it is the factor $\bar{p} + 2\bar{\sigma}/\sqrt{N}$, and its dependence on $N$, that may account for kilgour's nonlinearity, and not any property intrinsic in the nature of any type of search key. thus, to the extent that this model reflects what is really happening, the 95 percent point increases roughly proportionately with file size; the "constant" of proportionality, however, is the sum of two terms: the first is a true constant, and the second is a term that approaches zero as the file gets larger. in particular, this model suggests that we will never reach a leveling-off point: as the file increases in size, the number of items retrieved will also increase, and the pattern of increase will become increasingly linear.

up to this point this discussion has been qualitative in nature, being based upon general statistical considerations and making use of the normal approximation to some unknown distribution; its broad conclusions are, however, consistent with the findings of earlier workers and can explain certain unanticipated properties of search keys. to proceed further it will be necessary to restrict the form of the function $g(p)$; this will be attempted in the following section of this paper.

relationship of model to earlier research

interest in access methods that are appropriate for files of bibliographic data has generated a considerable amount of empirical research on search key behavior. of necessity, this pioneering work has been of a descriptive nature, resulting in data showing search key behavior in specific environments. while these efforts have lent a good deal of insight into the nature of search keys, the basic weakness of such research lies in the difficulty of extending these findings to other situations. one purpose of a mathematical model such as
the one being developed here is to provide this increased generality by representing in a concise and easily manipulated form the results of previous research. it is accordingly of interest to indicate the relationship between previous work on search keys and our model.

research on search key performance has been of two kinds. the first kind seeks to answer the question: for any number, $n$, how many search key values retrieve $n$ items? the answer to this question depends only on the search key and the collection; it is independent of the pattern of request arrivals. the second kind of research involves the actual arrival of requests; it tries to answer the question: for any number $n$, how frequently will requests resulting in the retrieval of $n$ items occur?

to discuss this research in terms of our model requires a closer examination of the function $g(p)$ previously defined. we recall that $g(p)\,dp \equiv \sum_{p \le p_i \le p + dp} g'(v_i)$, with $dp$ being a small number. thus $g(p)$ is determined by two factors:

a. the number of search key values in the interval $(p, p + dp)$. let us denote this value by $f(p)\,dp$, so $f(p)$ is the density of search keys at $p$. we make use here of the fact that although the number of possible search key values is finite, the number is very large, so their distribution can be thought of as continuous.
b. the average probability of search keys, with values $p_i$ near $p$, being requested. we shall refer to this quantity as $g''(p)$.

by combining these factors we have $g(p) = g''(p)\, f(p)$.

in terms of this discussion, the first type of research described above is in fact estimating $f(p)$: if there are $s$ search key values that retrieve $n$ items from a collection of $N$ items, then $s$ is an estimate of $\frac{1}{N} f\!\left(\frac{n}{N}\right)$; this relation uses $n = pN$, and $dp = \frac{n+1}{N} - \frac{n}{N} = \frac{1}{N}$. the second kind of research directly estimates $g(p)$.
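the two kinds of measurement described above can be sketched in a few lines of python. this is a hedged illustration rather than a procedure taken from the paper or from the studies it cites: the function names, the dictionary layout (search key value mapped to number of items retrieved), and the sample keys are all assumptions made for the example.

```python
from collections import Counter


def key_counts_by_retrieval_size(items_per_key):
    """First kind: for each n, how many search key values retrieve exactly n items.

    Depends only on the key and the collection, not on request arrivals.
    """
    return Counter(items_per_key.values())


def request_fraction_by_retrieval_size(items_per_key, requests):
    """Second kind: for each n, the fraction of requests that retrieve exactly n items.

    Requests for key values absent from the file count as retrieving zero items.
    """
    sizes = Counter(items_per_key.get(key, 0) for key in requests)
    total = len(requests)
    return {n: count / total for n, count in sizes.items()}


if __name__ == "__main__":
    # Hypothetical 3-3 style keys and the number of records each retrieves.
    file_index = {"abc,def": 1, "boo,sta": 3, "kil,tit": 1}
    # Hypothetical stream of requests against that file.
    request_log = ["boo,sta", "boo,sta", "abc,def", "zzz,zzz"]
    print(key_counts_by_retrieval_size(file_index))
    print(request_fraction_by_retrieval_size(file_index, request_log))
```

dividing the first tally by the spacing $1/N$ gives a rough estimate of the density $f(p)$ at $p = n/N$, while the second tally is a direct empirical counterpart of the request-weighted distribution.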
guthrie, in a recent paper, provides a bridge between the two types of research by discussing his findings in terms of two models.6 one of his models, which asserts that each search key value has an equal chance of being requested, is equivalent to the assumption that $g''(p) = 1$, and $g(p) = f(p)$. guthrie finds that this is not an adequate representation of his data. guthrie's second model asserts that each item has an equal chance of being requested. in our terms this becomes $g''(p) \propto p$, and $g(p) \propto p\, f(p)$. this model, while an improvement over the first, still disagrees with the data. furthermore, these models do not estimate $f(p)$; even if guthrie's model were correct, we would not know the probability that $n$ items would be retrieved until we were told how many search key values contained $n$ items. in the next section we will try to remedy this situation by means of a two-parameter representation of $g(p)$.

a representation of $f(p)$

to get a more detailed account of search key behavior by experiment is difficult since the two aspects of randomness already discussed are confounded; the experimenter only sees the combined effect. we will, however, try to estimate the distribution $g(p)$ by a distribution of the form

$$\frac{(\alpha + \beta + 1)!}{\alpha!\,\beta!}\; p^{\alpha} (1 - p)^{\beta}.$$

we believe that such an attempt is reasonable on three grounds:

a. it is not possible to find $g(p)$ exactly, and moreover, it is not clear that this would be desirable. we are interested in a reasonable approximation that is satisfactory for decision-making purposes;
b. the above distribution assumes a wide variety of shapes as $\alpha$ and $\beta$ vary; it seems likely that values of $\alpha$ and $\beta$ can be found for which this distribution is close enough to $g(p)$; and
c. this distribution is mathematically tractable.

if we proceed using the above approximation for $g(p)$, we find:

(i) the probability, $P(n)$, of $n$ items being retrieved is given by

$$P(n) = \binom{N}{n} \frac{(\alpha + \beta + 1)!}{\alpha!\,\beta!} \cdot \frac{(\alpha + n)!\,(N - n + \beta)!}{(\alpha + \beta + N + 1)!}; \tag{1}$$

(ii) the expected number of items retrieved, $E$, is given by

$$E = N\,\frac{\alpha + 1}{\alpha + \beta + 2}; \tag{2}$$

and (iii) the variance, $V$, of the number of items retrieved is given by

$$V = N\,\frac{\alpha + 1}{\alpha + \beta + 2} \cdot \frac{\beta + 1}{\alpha + \beta + 3}\left(1 + \frac{N}{\alpha + \beta + 2}\right). \tag{3}$$

if the experiment is performed on a small sample, the expectation and variance can be computed and the values of $\alpha$ and $\beta$ estimated from the relations

$$\beta = \frac{\alpha\left(1 - \frac{E}{N}\right) + 1}{E/N} - 2, \tag{4}$$

and

$$\alpha = \frac{E\left(1 - \frac{E}{N}\right) - \frac{V}{N}}{\frac{V}{E} - \left(1 - \frac{E}{N}\right)} - 1. \tag{5}$$

usually $E/N$ will be much smaller than one; in this case we may use the approximations:

$$\beta = (\alpha + 1)\,\frac{N}{E}, \tag{4'}$$

and

$$\alpha = \frac{E - \frac{V}{N}}{\frac{V}{E} - 1} - 1. \tag{5'}$$

once $\alpha$ and $\beta$ have been evaluated, we can compute the probabilities $P(n)$ for files of arbitrary size, and with these values we can make assertions regarding the probability of, say, more than 30 items being retrieved. a relation that can be derived from formula 1 and may be of use when comparing this model with experiment is:

$$\frac{P(n)}{P(n + 1)} = \frac{1 + \dfrac{\beta}{N - n}}{1 + \dfrac{\alpha}{n + 1}}.$$

the probability of zero retrievals is likely to be an extraordinary point in the distributions $g(p)$ and $P(n)$ since it is influenced by the knowledge that a user may have of the collection; this effect is likely to be encountered in a sampling process in which the requests have to be generated artificially. in such cases it would be advisable to treat $P(0)$ as an empirically derived parameter, $\theta$, and use the modified formula

$$P'(n) = \begin{cases} \theta & \text{if } n = 0, \\[4pt] (1 - \theta)\,\dfrac{P(n)}{1 - P(0)} & \text{if } n \neq 0. \end{cases} \tag{6}$$

the value of $\theta$ can be estimated by the fraction of requests retrieving zero items; for sampling techniques using only productive requests, $\theta$ will be zero. $\alpha$ and $\beta$ can be calculated as before from the mean and variance of the sample.

conclusion

the above discussion is intended as an attempt to provide some theoretical understanding of the puzzling behavior discovered in the use of search keys and also to provide some guide for those experimenting with samples of such files.
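for readers experimenting with samples, formulas 1 through 5 of the preceding section can be sketched in python as follows. this is a minimal sketch under stated assumptions, not the author's own procedure: the function names are hypothetical, factorials are evaluated through the log-gamma function so that non-integer $\alpha$ and $\beta$ are allowed, and the moment estimates are obtained by solving formulas 2 and 3 for $\alpha$ and $\beta$ (which requires the sample variance to exceed the plain binomial variance).

```python
import math


def log_fact(x):
    """log(x!) via the gamma function; valid for any real x > -1."""
    return math.lgamma(x + 1.0)


def p_retrieved(n, N, alpha, beta):
    """Formula 1: probability that a request retrieves n items from a file of N."""
    log_p = (
        log_fact(N) - log_fact(n) - log_fact(N - n)  # binomial coefficient
        + log_fact(alpha + beta + 1) - log_fact(alpha) - log_fact(beta)
        + log_fact(alpha + n) + log_fact(N - n + beta)
        - log_fact(alpha + beta + N + 1)
    )
    return math.exp(log_p)


def expected_items(N, alpha, beta):
    """Formula 2: expected number of items retrieved."""
    return N * (alpha + 1) / (alpha + beta + 2)


def variance_items(N, alpha, beta):
    """Formula 3: variance of the number of items retrieved."""
    s = alpha + beta + 2
    return N * (alpha + 1) / s * (beta + 1) / (s + 1) * (1 + N / s)


def estimate_alpha_beta(E, V, N):
    """Solve formulas 2 and 3 for alpha and beta from a sample mean E and variance V."""
    m = E / N
    r = V / (E * (1 - m))   # ratio of V to the binomial variance; must exceed 1
    s = (N - r) / (r - 1)   # s = alpha + beta + 2
    return m * s - 1, (1 - m) * s - 1


def tail_probability(threshold, N, alpha, beta):
    """Probability that more than `threshold` items are retrieved."""
    return 1.0 - sum(p_retrieved(n, N, alpha, beta) for n in range(threshold + 1))


if __name__ == "__main__":
    # Hypothetical sample of 1,000 items with observed mean 20 and variance 213.5.
    alpha, beta = estimate_alpha_beta(E=20.0, V=213.5, N=1000)
    for size in (1000, 10_000, 100_000):
        print(size, expected_items(size, alpha, beta),
              tail_probability(30, size, alpha, beta))
```

the loop at the end illustrates the intended use: parameters fitted on a small sample are carried over to larger file sizes to ask how often more than 30 items would be retrieved. the sample figures are invented for the illustration.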
we do, however, urge caution for the latter uses. an analysis similar to the above can be useful under several different circumstances, such as: determining the future behavior expected of a search key in a single library as the collection grows; determining the behavior for one library based upon experiments conducted on a different but similar library; and extrapolating from the performance of a search key in a sample of the collection to its performance in the full collection. if one wishes to compare two different libraries, one can note that as far as search key values are concerned, a particular library's collection can be thought of as a random sample of the larger population from which it selects its material, and accordingly the formula for P(n) should be valid. in this case, if two different collections are drawn from the same population, the g(p) refers to this population and the libraries are distinguished by the parameter N; when we are considering samples from a single library, then N is the sample size and g(p) refers to the library itself. no theoretical basis exists at present for estimating to what extent the populations being considered depend upon the type of library, if any, so this problem must be dealt with empirically. we have assumed here that these populations are similar with regard to search key values. should these populations in fact vary, it is possible that they can be broken down, e.g., by language, into subpopulations that are stable and for each of which the analysis is valid. acknowledgments this work was made possible by clr/neh grant no. e0-262-70-4658. i would like to express my gratitude to members of the university of chicago systems development office for their many comments and suggestions on this work. 116 journal of library automation vol. 6/2 june 1973 references 1. frederick g. kilgour, philip l. long, eugene b. leiderman, and alan l.
landgraf, "title-only entries retrieved by use of truncated search key," journal of library automation 4:207-10 (dec. 1971). 2. a. bookstein, "double hashing," journal of the american society for information science 23:402-25 (nov.-dec. 1972). 3. a. bookstein, "hash coding with a non-unique search key," to be published in the journal of american society for information science. 4. frederick g. kilgour, philip l. long, eugene b. leiderman, and alan l. landgraf, "retrieval of bibliographic entries from a name-title catalog by use of truncated search keys." preprint. 5. kilgour, long, leiderman, and landgraf, "title-only entries," p.209-10. 6. gerry p. guthrie and steven d. slifko, "analysis of search key retrieval on a large bibliographic file," journal of library automation 5:96-100 (june 1972). letter from the editor kenneth j. varnum information technology and libraries | march 2018 1 https://doi.org/10.6017/ital.v37i1.10388 this issue marks 50 years of information technology and libraries. the scope and ever-accelerating pace of technological change over the five decades since journal of library automation was launched in 1968 mirror what the world at large has experienced. from “automating” existing services and functions a half century ago, libraries are now using technology to rethink, recreate, and reinvent services — often in areas that simply were in the realm of science fiction. in an attempt to put today’s technology landscape in context, ital will publish a series of essays this year, each focusing on the highlights of a decade. in this issue, editorial board member mark cyzyk talks about selected articles from the first two volumes of the journal. in the remaining issues this year, we’ll tackle the 1970s, 1980s, 1990s, and 2000s. the journal itself, now as ever before, focuses on the present and the near future, so we will hold off recapitulating the current decade until our centennial celebration in 2068.
as we look back over the journal’s history, the editorial board is also looking to the future. we want to make sure that we know for whom we are publishing these articles, and to make sure that the journal is as relevant to today’s (and tomorrow’s) readership as it has been for those who have brought us to the present. to that end, we invite anyone who is reading this issue to take this brief survey (https://umich.qualtrics.com/jfe/form/sv_6hafly0cyjpbk4j) — tell us a little about how you came to ital today, how you’re connected with library technology, and what you’d like to see in the journal. it won’t take much of your time (no more than 5 minutes) and will help us understand the context in which we are working. there’s another opportunity for you to help shape the future of the journal. due to a number of terms being up at the end of june 2018, we have at least five openings on the editorial board to fill. if you are passionate about libraries and technology, enjoy working with authors to shape their articles, and want to help set out today’s scholarly record for tomorrow’s technologists, submit a statement of interest at https://goo.gl/forms/5gbqouuseolxrfx52. we seek to have an editorial board that represents the diversity of library technology practitioners, and particularly invite individuals from non-academic libraries and underrepresented demographic groups to apply. sincerely, kenneth j. varnum editor march 2018 accessible, dynamic web content using instagram jaci wilkinson information technology and libraries | march 2018 19 jaci wilkinson (jaci.wilkinson@umontana.edu) is web services librarian at the university of montana. abstract this is a case study in dynamic content creation using instagram’s application program interface (api).
an embedded feed of the mansfield library archives and special collections’ (asc) most recent instagram posts was created for their website’s homepage. the process to harness instagram’s api highlighted competing interests: web services’ desire to most efficiently manage content, asc staff’s investment in the latest social media trends, and everyone’s institutional commitment to accessibility. introduction the mansfield library archives and special collections (asc) at the university of montana had a simple enough request. their homepage had been static for years and it was not possible to add more content creation to anyone’s workload. however, they had a robust instagram account with more than one thousand followers. was there any way to synchronize workflows with an instagram embed on the homepage? the solution was more complicated than we thought. we developed an instagram embed, but in the process grappled with some fundamental questions of technology in the library. how do we streamline the creation and sharing of ephemeral, dynamic content? how do we reconcile web accessibility standards with the innovative new platforms we want to incorporate on our websites? libraries have invested heavily in social media to improve their approachability, reduce library anxiety, and interact with their users. at the mansfield library, this investment has paid off for asc. this unit was an early adaptor of instagram, a photo and short video–sharing application with the public or approved followers. the asc instagram account launched in january 2015, and staff quickly settled on the persona of “banjo cat” to share collection items and relevant history. banjo cat was inspired by a whimsical nineteenth-century photograph in asc of a cat playing a banjo (see figure 1). asc now has about 1,200 followers including many other libraries, archives, and special collections. 
in fact, connecting to a wider community of similar institutions was a driving factor in creating an instagram account. the asc staff member who updates the account said, while we have lots of interactions with patrons on facebook we have basically zero interactions with other institutions. instagram is all about interacting with other institutions, sharing ideas for posts, commenting on posts. so by learning about this community and participating and interacting with it we are able to . . . learn about programs and ideas that we would probably not have access to otherwise. 1 mailto:jaci.wilkinson@umontana.edu accessible, dynamic web content using instagram | wilkinson 20 https://doi.org/10.6017/ital.v37i1.10230 figure 1. banjo cat by l. a. de ribas. mansfield library archives and special collections. 1880s. but while asc’s social media thrived, its website was bereft of dynamic content. given that the asc homepage is the ninth most visited page on the library site, it felt like a wasted opportunity to let such a highly trafficked area lack engaging, current, and appealing content. it seemed only natural to harness the energy put into the asc instagram account and embed that same light-hearted, community-oriented, and collection-focused content on the asc homepage. literature review libraries are enthusiastic adopters of social media; one study even shows that as of 2013, 94 percent of academic libraries had a social media presence.2 a 2006 library journal article observed the following about myspace, then a popular social media platform: “given the popularity and reach of this powerful social network, libraries have a chance to be leaders on their college campuses and in the larger community by realizing the possibilities of using social networking sites like myspace to bring their services to the public.” 3 this open-minded spirit and willingness to try new technology trends was shrewd. 
pew research reports that as of 2016, 69 percent of americans use some type of social media. 4 social media use has grown more representative of the population: the percentage of older adults on at least one social media site continues to increase.5 for academic libraries, the pull of facebook was immediately strong because of the initial requirement for users to have a .edu address. academic libraries very early on attempted to connect with students about services, resources, and spaces using facebook.6 information technology and libraries | march 2018 21 dynamic content is a gateway to building interest toward and buy-in to an institution. in user experience literature, “user delight” is “a positive emotional affect that a user may have when interacting with a device or interface.”7 in walter’s hierarchy of user needs, pleasure tops all other needs.8 figure 2. aaron walter’s hierarchy of user needs, from therese fessenden, “a theory of user delight: why usability is the foundation for delightful experiences,” nielsen norman group, march 25, 2017, https://www.nngroup.com/articles/theory-user-delight/. using social media to engage users with special collections has its own niche. special collections are typically housed in closed stacks and have no digital equivalent. 
often the materials housed in special collections are rare, fragile, exotic, beautiful, and unusual; a study of library blogs and social media found that those with higher aesthetic value received more visitors and more revisits.9 social media “gives users an idea of what the collection offers while it promotes and potentially gains foot traffic.”10 it has even been suggested that social media gives special collections the opportunity to stand in when digitization isn’t possible: “instead of digitizing a whole collection, librarians can highlight important parts of the collection with a snippet of its history.”11 in creating ucla’s powell library instagram account, librarian danielle salomon https://www.nngroup.com/articles/theory-user-delight/ accessible, dynamic web content using instagram | wilkinson 22 https://doi.org/10.6017/ital.v37i1.10230 writes, “special collections items and digital library images can be a treasure trove of social media content. one of our library’s goals is to increase students’ exposure to special collections items, so we draw heavily from these collections.”12 instagram is a relative newcomer to social media, but it has been consistently successful since its inception in 2010.13 as of 2016, 28 percent of americans use instagram, up from 11 percent in 2013.14 facebook bought instagram in 2012 and has since bolstered the application’s success by making the two platforms easy to navigate and share between. after vine, a short video application, was shuttered in 2017, instagram’s ability to take and post short videos has increased its value. instagram is distinct in that it is mobile-dependent: it is difficult to run the application through a web browser, and only one device can operate an instagram account. within the library community, instagram’s adoption has been strongest in academic libraries. 
this is tied to the high number of instagram users who are college-age.15 another reason libraries select instagram is because it has more diverse users than other social media applications, specifically african americans and latinos.16 in a 2016 study, instagram was the second-most pick among college students at western oregon university when asked what social media application the library should use (twitter came in first). the most popular use of instagram in academic libraries is familiarizing students with services, resources, and spaces. uses include first-year instruction activities to combat library anxiety and mini-contests that ask users to identify what posted photos are of.17 ucla’s powell library discovered students posting instagram photos of their spaces, so they initially joined to repost those photos and interact with those users. instagram makes a library seem approachable. librarian joanna hare reflected on this discovery: “instagram is really powerful in that respect because you can just snap a few photos [and] show what’s going on . . . so that students don’t view the library as being intimidating.” 18 approachability is augmented by delegating photography and posting tasks to library student employees. social media is less often seen as a way to help create dynamic content for a library’s website. the exceptions to this trend have come from institutions with substantial technology resources. north carolina state university created an open source software that adds photos posted by anyone on instagram to a library photo collection when a certain hashtag is used.19 the university of nebraska’s calvin t. ryan library created an rss feed that disseminates blog posts to twitter, facebook, and the library homepage. posts from followed accounts in twitter and facebook are also a part of the resulting feed. 
the rss feed requires use of a third-party tool called dlvr.it (https://dlvrit.com/), which supports many other social media applications, but not instagram. a notable absence in literature on social media use in libraries is any mention of accessibility concerns. the “improving the accessibility of social media for public service” toolkit developed by a group of us government offices is a useful resource that includes specific guidelines on making instagram posts more accessible.20 the toolkit explains that “more and more organizations are using social media to conduct outreach, recruit job candidates and encourage workplace productivity. . . . but not all social media content is accessible to people with certain disabilities, which limits the reach and effectiveness of these platforms. and with 20% of the population estimated to have a disability, government agencies have an obligation to ensure that their messages, services and products are as inclusive as possible.”21 given the stated importance of social media in library literature, the lack of conversation about accessibility and social media is a barrier to inclusivity. https://dlvrit.com/ information technology and libraries | march 2018 23 mansfield library archives and special collections’ instagram feed dynamic content was lacking from any part of the asc website, but staff had a dearth of time and knowledge of the content management system to create web content. there was a drive to solve this problem because a new web services librarian had recently been hired. when the web services librarian learned of asc’s thriving instagram presence, she pursued the possibility of including that content on the asc website. she felt that, in addition to being more efficient, content creation should stay in-house given the highly specialized nature of asc’s collections, spaces, and resources. 
the ideal solution would allow asc staff to create and manage an instagram feed unassisted; the web services librarian sought the simplest possible solution for them. our content management system and instagram’s developer website were first consulted with the hope that one provided an automated embed or plugin. our content management system, cascade, could pull in content from facebook and twitter but not instagram, and instagram did not have an automated feed creator. after more research, we learned that third-party instagram feed embeds are the only possible way to create an instagram feed without using instagram’s api. the api was considered a last-resort option because we knew that asc staff could not manage the code themselves. the idea of using any third-party service was undesirable because of a lack of control, stability, and accessibility. if the service has technical issues or goes out of business, it would be very noticeable given the visibility of asc’s homepage. in 2012, a student advocacy organization at the university of montana filed a civil rights complaint with the us department of education focusing on disabled students’ unequal access to electronic and information technologies. since then, the mansfield library has been proactive to eliminate barriers to access.22 given this history, we are wary of the accessibility of third-party applications to someone using assistive technology, most likely, a screen reader. juicer (https://www.juicer.io/), for example, is a freely available service for an instagram feed but in exchange it retains its branding prominently at the top of the feed. an example of juicer in use can be found on the home page of the baltimore aquarium (http://aqua.org/). tests of juicer showed that it was not accessible for a screen reader. finally, it didn’t fit our need: juicer curated posts from other users depending on the hashtags and reposts, but we only wanted to feature our own content. 
the unpredictability of other accounts’ posts ending up on the asc homepage was not desirable. instagram’s developer site did not make finding a solution easy. the page titled “embedding” is about embedding individual posts on a webpage, not a whole feed.23 this content does not even link out to an explanation of how to embed a feed. the “authentication” page is where the process begins because calling the api requires a token from an authenticated instagram account user.24 a user is authenticated by creating a client id and then receiving an access token. another roadblock on the instagram developer site is that the “authentication” page provides no further information about using the access token to call the api. it took outside research to finally figure out the steps needed to make the api requests for asc’s feed.25 php code is used to call the api and copy the three most recent asc instagram posts to a local server file. (using javascript to call the api is a poor choice because that code will make the account’s access token public. if anyone sees this token they can use it themselves to pull your feed using the instagram api.) css replicates the look and feel of instagram with white, minimalistic icons and a simple photo display that darkens and shows the beginning of the description when a user’s mouse hovers over it. all code from this project is freely available in github.26 there is a catch to this embedded feed process. the directions given through instagram and by the online sources we used only took us to sandbox mode (in web development, sandbox refers to a restricted or test version of a final product). in sandbox, instagram limits the number of requests to the api. unfortunately, a request was made every time someone went to the asc page.
the initial feed stopped working in minutes because we did not realize what this limitation of sandbox mode meant. another look at the instagram developer site taught us that the only way to leave sandbox was to have our “app,” as instagram called it, reviewed.27 in other words, instagram has only set up their api to be used for full application development (like juicer). we decided not to leave sandbox mode because of uncertainty about what instagram’s review process would entail. if our app was rejected, would they force us to discontinue our work? the timeline for the approval process was also uncertain. distrust and uncertainty, unfortunately, guided our decision-making at this stage. instead of undergoing the review process, we reconfigured the php code to call the api only once a day. this made the feed less dynamic because it was not updating in real time. for our purposes this was not a problem; the asc instagram account is updated at most once or twice a week anyway. as a result, we are “scraping” asc’s instagram account. although “crawling, scraping, and caching” are prohibited by instagram’s terms of use, other instagram feeds in github have similar workarounds and point out that a plugin/scraper “uses (the) same endpoint that instagram is using in their own site, so it’s arguable if the toc [terms of use] can prohibit the use of openly available information.”28 while figuring out how to work with the instagram api, a major accessibility roadblock cropped up: there was no place for the alt text—descriptive information about the image that is used by assistive technologies for users with low vision. besides taking or uploading a photo, the only other actions offered to create a new post were to write a caption, tag people, or add a location. only the caption allowed for a text string. without alt text, not only is the instagram feed unintelligible to a screen reader but it disturbs a screen reader user’s interaction with all other content on that page.
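the once-a-day refresh described above amounts to a time-based cache in front of the api call. the project itself used php; the sketch below uses python purely for illustration, and `load_feed`, `fetch_posts`, and the json cache file are hypothetical names, not taken from the project's code. the network call is passed in as a function, so nothing here assumes a particular instagram endpoint or token format.

```python
import json
import os
import time
from typing import Callable, List, Optional

ONE_DAY = 24 * 60 * 60  # seconds


def load_feed(cache_path: str,
              fetch_posts: Callable[[], List[dict]],
              keep: int = 3,
              max_age: float = ONE_DAY,
              now: Optional[float] = None) -> List[dict]:
    """Return cached posts, calling fetch_posts only if the cache is stale."""
    now = time.time() if now is None else now

    # Serve the local copy while it is younger than max_age.
    if os.path.exists(cache_path):
        age = now - os.path.getmtime(cache_path)
        if age < max_age:
            with open(cache_path, encoding="utf-8") as fh:
                return json.load(fh)

    # Cache missing or stale: make one API request and keep the newest posts.
    posts = fetch_posts()[:keep]

    # Write to a temporary file first so page loads never see a partial file.
    tmp_path = cache_path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as fh:
        json.dump(posts, fh)
    os.replace(tmp_path, cache_path)
    return posts
```

because every page view reads the local file, the number of api requests no longer depends on traffic, which is what keeps a feed like this within a request limit.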
an asc staff member discovered a solution when she noticed a joshua tree national park instagram post with alt text at the bottom of the caption. although initially put off by the “wordiness,” we concluded this was the only logical way to move forward. the benefits to this format of alt text took focus as we moved through the project: the asc staff member was able to choose the desired alt text without any additional steps or skills, and we grew to relish the opportunity to explain to curious users what the #alttext hashtag meant and why it was important to us. php code isolates all text after #alttext and displays that as the alt text to a screen reader. since the instagram feed was implemented, it has been interesting to follow how the instagram developer site has changed and grown. although facebook has owned instagram for five years, the instagram developer site is only now starting to link out to facebook developer content. most recently, the instagram developer site has been advertising the instagram graph api for use by business accounts. this type of development is useless for us because we have a personal instagram account, not a business account. and the function of the instagram graph api is focused on the internal user and analytics, not the end user and user experience. even if the instagram graph api was available for personal accounts, it is worth asking if this type of data collection would be of use to an organization that doesn’t have the labor of a devoted marketing team. information technology and libraries | march 2018 25 dynamic content through social media and web content provides opportunities to create user delight because it focuses on visually appealing, fun, timely, and interesting information. for archives, special collections, and other cultural heritage institutions, this content is particularly useful because it provides a look into collections that are interesting and rare but also fragile and housed in closed stacks. 
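the #alttext convention described above needs only a little string handling. the project's own code is php; the python sketch below is an illustration, and `split_caption` and `img_tag` are hypothetical helpers rather than the project's functions. it assumes the marker appears once, near the end of the caption.

```python
import re
from html import escape
from typing import Tuple

# Case-insensitive marker that separates the caption from the alt text.
ALT_MARKER = re.compile(r"#alttext", re.IGNORECASE)


def split_caption(caption: str) -> Tuple[str, str]:
    """Split a caption into (visible caption, alt text after the marker)."""
    match = ALT_MARKER.search(caption)
    if match is None:
        # No marker: the whole caption is visible and no alt text is known.
        return caption.strip(), ""
    visible = caption[:match.start()].strip()
    alt_text = caption[match.end():].strip(" :\n\t")
    return visible, alt_text


def img_tag(src: str, caption: str) -> str:
    """Build an <img> element whose alt attribute comes from the caption."""
    _, alt_text = split_caption(caption)
    # Escape both values so quotes in a caption cannot break the markup.
    return '<img src="{}" alt="{}">'.format(
        escape(src, quote=True), escape(alt_text, quote=True)
    )
```

a post without the marker yields an empty alt attribute here, which screen readers treat as decorative; whether to fall back to the full caption instead is a design choice left open in this sketch.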
these positives are tempered by the reality many of these institutions face: budgets are tight, staffs are small, and technical expertise might be lacking. this paper demonstrates how important and useful social media is to create dynamic website content. unfortunately, there is a gap in library literature on accessibility and social media; although social media content is ephemeral or lacks specific utility, libraries need to pay more attention to the various ways users access resources and information through social media, especially if that same content appears on the institution’s website. the asc’s embedded homepage instagram feed fits their needs, is accessible, and builds community around their unique collections. by providing all the code created in this project in github,29 including the css we used, our hope is that institutions interested in this instagram feed model could replicate it for their own purposes without extensive technical support. acknowledgments i am thankful for the expertise of carlie magill, donna mccrea, and wes samson. without them this project would not have been possible. references 1 carlie magill, e-mail message to author, august 8, 2017. 2 michael sutherland, “rss feed 2.0” code4lib 31, january 28, 2016, http://journal.code4lib.org/articles/11299. 3 beth evans, “your space or myspace?” library journal 131 (2006): 8–12. library, information science & technology abstracts, ebscohost. 4 “social media fact sheet,” pew research center, january 12, 2017, http://www.pewinternet.org/fact-sheet/social-media/. 5 ibid. 6 brian s. mathews, “do you facebook?” c&rl news, may 2006, http://crln.acrl.org/index.php/crlnews/article/viewfile/7622/7622. 7 therese fessenden, “a theory of user delight: why usability is the foundation for delightful experiences,” nielsen norman group, march 25, 2017, https://www.nngroup.com/articles/theory-user-delight/. 8 ibid. 
9 daryl green, “utilizing social media to promote special collections: what works and what doesn’t” (paper, 78th ifla general conference and assembly, helsinki, finland, june 2012), 11, https://www.ifla.org/past-wlic/2012/87-green-en.pdf. 10 katrina rink, “displaying special collections online,” serials librarian 73, no. 2 (2017): 1–9, https://doi.org/10.1080/0361526x.2017.1291462. 11 ibid. 12 danielle salomon, “moving on from facebook,” college & research libraries news 74, no. 8 (2013): 408–12, https://crln.acrl.org/index.php/crlnews/article/view/8991. 13 sarah perez, “the rise of instagram,” techcrunch, april 24, 2012, https://techcrunch.com/2012/04/24/the-rise-of-instagram-tracking-the-apps-spread-worldwide/. 14 “social media fact sheet,” pew research center, january 12, 2017, http://www.pewinternet.org/fact-sheet/social-media/. 15 lauren wallis, “#selfiesinthestacks: sharing the library with instagram,” internet reference services quarterly 19, no. 3–4 (2014): 181–206, https://doi.org/10.1080/10875301.2014.983287. 16 elizabeth brookbank, “so much social media, so little time: using student feedback to guide academic library social media strategy,” journal of electronic resources librarianship 27, no. 4 (2015): 232–47, https://doi.org/10.1080/1941126x.2015.1092344; salomon, “moving on from facebook.” 17 wallis, “#selfiesinthestacks”; salomon, “moving on from facebook.” 18 wendy abbott et al., “an instagram is worth a thousand words: an industry panel and audience q&a,” library hi tech news 30, no. 7 (2013): 1–6, https://doi.org/10.1108/lhtn-08-2013-0047.
19 salomon “moving on from facebook.” 20 “federal social media accessibility toolkit hackpad,” digital gov, accessed november 25, 2017, https://www.digitalgov.gov/resources/federal-social-media-accessibility-toolkit-hackpad/ . 21 ibid. 22 donna e. mccrea, “creating a more accessible environment for our users with disabilities: responding to an office for civil rights complaint,” archival issues 38, no. 1 (2017): 7, https://scholarworks.umt.edu/ml_pubs/25/ 23 “embedding,” instagram developer, accessed november 25, 2017, https://www.instagram.com/developer/embedding/. 24 “authentication,” instagram developer, accessed november 25, 2017, https://www.instagram.com/developer/authentication/ . 25 pranay deegoju, “embedding instagram feed in your website,” logical feed, december 25, 2015, https://www.logicalfeed.com/embedding-instagram-feed-in-your-website . 26 wes samson, “ws784512 instagram,” github, 2016, https://github.com/ws784512/instagram. 27 “sandbox mode,” instagram developer, accessed november 25, 2017, https://www.instagram.com/developer/sandbox/. 28 “terms of use,” instagram, accessed november 25, 2017, https://help.instagram.com/478745558852511; and “image-hashtag-feed,” digitoimisto dude oy, accessed november 25, 2017, https://github.com/digitoimistodude/image-hashtag-feed. 
29 samson, “ws784512 instagram.” editorial: singularity—are we there, yet? | truitt 55 in my last column, i wrote about two books—nicholas carr’s the shallows and william powers’ hamlet’s blackberry—relating to learning in the always-on, always connected environment of “screens.”1 since then, two additional works have come to my attention. while i won’t be able to do them justice in the space i have here, they deserve careful consideration and open discussion by those of us in the library community. if carr’s and powers’ books are about how we learn in an always-connected world of screens, sherry turkle’s alone together and elias aboujaoude’s virtually you are about who we are in the process of becoming in that world.2 turkle is a psychologist at mit who studies human–computer interactions. among her previous works are the second self (1984) and life on the screen (1995).
aboujaoude is a psychiatrist at the stanford university school of medicine, where he serves as director of the obsessive compulsive disorder clinic and the impulse control disorders clinic. based on extensive coverage of specialist and popular literature, as well as numerous anonymized accounts of patients and subjects encountered by the authors, both works are characterized by thorough research and thoughtful analysis. while their approaches to the topic of “what we are becoming” as a result of screens may differ— aboujaoude’s, for example, focuses on “templates” and the terminology of traditional psychiatry, while turkle’s examines the relationship between loneliness and solitude (they are different), and how these in turn relate to the world of screens—their observations of the everyday manifestations of what might be called the pathology of screens bear many common threads. i’m acutely aware of the potential for injustice (at best) and misrepresentation or misunderstanding (rather worse) that i risk in seeking to distill two very complex studies into such a small space. and, frankly, i’m still trying to wrap my head around both the books and the larger issues they raise. with that caveat, i still think we should be reading about and widely discussing the phenomena reported, which many of us observe on a daily basis. in the sections that follow, i’d like to touch on a very few themes that emerge from these books. ■■ “why do people no longer suffice?”3 a pair of anecdotes that turkle recounts to explain her reasons for writing the current book seems worth sharing at the outset. in the first, she describes taking her then-fourteen-year-old daughter, rebecca, to the charles darwin exhibition at new york’s american museum of natural history in 2005. among the many artifacts on display was a pair of live giant galapagos tortoises: “one tortoise was hidden from view; the other rested in its cage, utterly still. 
rebecca inspected the visible tortoise thoughtfully for a while and then said matter-of-factly, ‘they could have used a robot.’” when turkle queried other bystanders, many of the children agreed, with one saying, “for what the turtles do, you didn’t have to have live ones.” in this case, “alive enough” was sufficient for the purpose at hand.4 sometime later, turkle read and publicly expressed her reservations about british computer scientist david levy’s book, love and sex with robots, in which levy predicted that by the middle of this century, love with robots will be as normal as love with other humans, while the number of sexual acts and lovemaking positions commonly practiced between humans will be extended, as robots will teach more than is in all of the world’s published sex manuals combined.5 contacted by a reporter from scientific american about her comments regarding levy’s book, turkle was stunned when the reporter, equating the possibility of relationships between humans and robots with gay and lesbian relationships, accused her of likewise opposing these human-to-human relationships. if we now have reached a point where gay and lesbian relationships can strike us as comparable to human-to-machine relationships, something very important has changed; for turkle, it suggested that we are on the threshold of what she terms the “robotic moment”: this does not mean that companionate robots are common among us; it refers to our state of emotional—and i would say philosophical—readiness. i find people willing to seriously consider robots not only as pets but as potential friends, confidants and romantic partners. we don’t seem to care what these artificial intelligences “know” or “understand” of the human moments we might “share” with them. at the robotic moment, the performance of connection seems connection enough. we are poised to attach to the inanimate without prejudice.6
marc truitt (marc.truitt@ualberta.ca) is associate university librarian, bibliographic and information technology services, university of alberta libraries, edmonton, alberta, canada, and editor of ital. 56 information technology and libraries | june 2011 while these examples are admittedly extreme, both authors agree that something very basic has changed in the way we conduct ourselves. turkle characterizes it as mobile technology having made each of us “pausable,” i.e., that a face-to-face interaction being interrupted by an incoming call, text message, or e-mail is no longer extraordinary; rather, in the “new etiquette,” it is “close to the norm.”10 and the rudeness, as well we know, isn’t limited to mobile communications. referring to “flame wars,” which regularly erupt in online communities, aboujaoude observes: the internet makes it easier to suspend ethical codes governing conduct and behavior. gentleness, common courtesy, and the little niceties that announce us as well-mannered, civilized, and sociable members of the species are quickly stripped away to reveal a completely naked, often unpleasant human being.11 even our routine e-mail messages—lacking as they often do salutations and closing sign-offs—are characterized by a form of curtness heretofore unacceptable in paper communications. remarkably, to those old enough to recall the traditional norms, the brusqueness is not only unintended, it is as well unconscious; “[we] just don’t think warmth and manners are necessary or even advisable in cyberspace.”12 ■■ castles in the air: avatars, profiles, and remaking ourselves as we wish we were finally, a place to love your body, love your friends, and love your life. —second life, “what is second life?”13 one of the interesting and worrisome themes in both turkle’s and aboujaoude’s studies is that of the reinvention and transformation of the self, in the form of online personas and avatars. 
this is the stock-in-trade of online communities and gaming sites such as facebook and second life. these sites cater to our nearly universal desire to be someone other than who we are: online, you’re slim, rich, and buffed up, and you feel you have more opportunities than in the real world. . . . we can reinvent ourselves as comely avatars. we can write the facebook profile that pleases us. we can edit our messages until they project the self we want to be.14 the problem is that for many there is an increasing fuzziness at the interface between real and virtual ■■ changing mores, or the triumph of rudeness i can’t think of any successful online community where the nice, quiet, reasonable voices defeat the loud, angry ones. . . . the computer somehow nullifies the social contract. —heather champ, yahoo!’s flickr community manager7 sadly, we’ve all experienced it. we get stuck on a bus, train, or in an elevator with someone engaged in a loud conversation on her or his mobile phone. all too often, the person is loudly carrying on about matters we wish we weren’t there to hear. perhaps it’s a fight with a partner. or a discussion of some delicate health matter. whatever it is, we really don’t want to know, but because of the limitations imposed by physical spaces, we can’t avoid being a party to at least half of the conversation. what’s wrong with these individuals? do they really have no consideration or sense of propriety? it turns out that in matters of tact and good taste, the ground has shifted, and where once we understood and abided by commonly accepted rules of conduct and respect for others, we do so no longer. indeed, the everyday obnoxious intrusions by those using public spaces for their private conversations are among the least of offenders. 
consider the following situations shared by turkle: sal, 62 years old, holds a small dinner party at his home as part of his “reentry into society” after several years of having cared for his recently deceased wife: i invited a woman, about fifty, who works in washington. in the middle of a conversation about the middle east, she takes out her blackberry. she wasn’t speaking on it. i wondered if she was checking her e-mail. i thought she was being rude, so i asked her what she was doing. she said that she was blogging the conversation. she was blogging the conversation.8 turkle later tells of attending a memorial service for a friend. several [attendees] around me used the [printed] program’s stiff, protective wings to hide their cell phones as they sent text messages during the service. one of the texting mourners, a woman in her late sixties, came over to chat with me after the service. matter-of-factly, she offered, “i couldn’t stand to sit that long without getting on my phone.” the point of the service was to take a moment. this woman had been schooled by a technology she’d had for less than a decade to find this close to impossible.9

enough” became yet more blurred. turkle’s anecdotes of children explaining the “aliveness” of these robots are both touching and disturbing. speaking of a tamagotchi, one child wrote a poem: “my baby died in his sleep. i will forever weep. then his batteries went dead. now he lives in my head.”19 the concept of “alive enough” is not unique to the very young, either. by 2009, sociable robots had moved beyond children’s toys with the introduction of paro, a baby seal-like “creature” aimed at providing companionship to the elderly and touted as “the most therapeutic robot in the world. . . . the children were onto something: the elderly are taken with the robots.
most are accepting and there are times when some seem to prefer a robot with simple demands to a person with more complicated ones.”20 where does it end? turkle goes on to describe nursebot, a device aimed at hospitals and long-term care facilities, which colleagues characterized as “a robot even sherry can love.” but when turkle injured herself in a fall a few months later, [i was] wheeled from one test to another on a hospital stretcher. my companions in this journey were a changing collection of male orderlies. they knew how much it hurt when they had to lift me off the gurney and onto the radiology table. they were solicitous and funny. . . . the orderly who took me to the discharge station . . . gave me a high five. the nursebot might have been capable of the logistics, but i was glad that i was there with people. . . . between human beings, simple things reach you. when it comes to care, there may be no pedestrian jobs.21 but need we librarians care about something as farfetched as nursebot? absolutely. now that ibm has proven that it can design a machine—okay, an array of machines, but something much more compact is surely coming soon—that can win at jeopardy!, is the robotic reference librarian really that much of a hurdle? take a bit of watson technology, stick it in nursebot, give it sensible shoes, and hey, i can easily imagine bibliobot, factory-standard in several guises, including perhaps donna reed (as mary, who becomes the town librarian in the alter-life of capra’s it’s a wonderful life) or shirley jones (as marian, the librarian, in the music man). i like donna reed as much as anyone, but do i really want reference assistance from her android doppelgänger? but then, for years after the introduction of the atm, i confess that i continued taking lunch hours off just so that i could deal with a “real person” at the bank, so perhaps it’s just me. the future is in the helping/service professions, indeed! 
and when we’re all replaced by robots (sociable and otherwise), what will we do to fill the time? personas: “not surprisingly, people report feeling let down when they move from the virtual to the real world. it is not uncommon to see people fidget with their smartphones, looking for virtual places where they might once again be more.”15 turkle speaks of the development of what she terms a “vexed relationship” between the real and the virtual: in games where we expect to play an avatar, we end up being ourselves in the most revealing ways; on social-networking sites such as facebook, we think we will be presenting ourselves, but our profile ends up as somebody else—often the fantasy of who we want to be. distinctions blur.16 and indeed, some completely lose sight of what is real and what is not. aboujaoude relates the story of alex, whose involvement in an online community became so consuming that he not only created for himself an online persona—“’i then meticulously painted in his hair, streak by streak, and picked “azure blue” for his eye color and “snow white” for his teeth.’”—but also left his “real” girlfriend after similarly remaking the avatar of his online girlfriend, nadia—“from her waist size to the number of freckles on her cheeks.” speaking of his former “real” girlfriend, alex said, “real had become overrated.”17 ■■ “don’t we have people for these jobs?”18 ageist disclaimer: when i grew up, robots—those that weren’t in science fiction stories or films—were things that were touted as making auto assembly lines more efficient, or putting auto workers out of jobs, depending on your perspective. while not technically a robot, the other machine that characterized “that time” was the automated teller machine (atm), which freed us from having to do our banking during traditional weekday hours, and not coincidentally resulted, again, in the loss of many entry-level jobs in financial institutions. 
as i recall, we were all reassured that the future lay in “helping/service” professions, where the danger of replacement by machines was thought to be minimal. now, fast forward 30 years. the first half of turkle’s book is the history of “sociable robots” and our interactions with them. moving from the reactions of mit students to joseph weizenbaum’s eliza in the mid-1970s, she recounts her studies of children’s interactions, first with electronic toys—e.g., tamagotchi—and later, with increasingly sophisticated and “alive” robots, such as furby, aibo, and my real baby. with each generation, these devices made yet more “demands” on their owners—for care, “feeding”, etc. and with each generation, the line between “alive” and “alive

to admit that we’ve seen many examples of how connectedness between people we’d otherwise consider “normal” has and is changing our manners and mores.24 many libraries and other public spaces, reacting to patron complaints about the lack of consideration shown by some users, have had to declare certain areas “cell phone free.” in the interest of getting your attention, i’ve admittedly selected some fairly extreme examples from the two books at hand. however, i think the point is that, now that the glitter of always-on, always-connected, has begun to fade a bit, there is a continuum of dysfunctional behaviors that we are beginning to notice, and it’s time to talk about how we as librarians fit into all of this. are there things we in libraries are doing that encourage some of these less desirable and even unhealthy behaviors? which takes us to a second concern raised by some of my gentle draft-readers: we’ve heard this tale before. television, and radio before it, were technologies that, when they were new, were criticized as corrupting and leading us to all sorts of negative, self-destructive, and socially undesirable behaviors.
how are screens and the technology of always-connected any different? a part of me—the one that winces every time someone glibly refers to the “transformational” changes taking place around us—agrees. i was trained as a historian, to take a long view about change. and we’re talking about technologies that—in the case of the web— have been in common use for just over fifteen years. that said, my interest here is in seeing our profession begin a conversation about how connective technologies have influenced behavioral changes in people, and especially about how we in libraries may be unwittingly abetting those behavioral changes. television and radio were fundamentally different technologies in that they were one-way broadcast tools. and to the best of my recollection, neither has ever been widely adopted by or in libraries. yes, we’ve circulated videos and sound recordings, and even provided limited facilities for the playback of such media. but neither has ever really had an impact on the traditional core business of libraries, which is the encouragement and facilitation of the largely solitary, contemplative act of reading. connective technologies, in the form of intelligent machines and network-based communities, can be said to be antithetical to this core activity. we need to think about that, and to consider carefully the behaviors we may be encouraging. notwithstanding those critics of change in our profession who feel we move far too glacially, i would maintain that we have often been, if not at the forefront of the technology pack, then certainly among its most enthusiastic ■■ where from here? i titled this column “singularity.” for those not familiar with the literature of science fiction, turkle provides a useful explanation: this notion has migrated from science fiction to engineering. the singularity is the moment—it is mythic; you have to believe in it—when machine intelligence crosses a tipping point. 
past this point, say those who believe, artificial intelligence will go beyond anything we can currently conceive. . . . at the singularity, everything will become technically possible, including robots that love. indeed, at the singularity, we may merge with the robotic and achieve immortality. the singularity is technological rapture.22 i think it’s pretty clear that we’re still a fair distance from anything that one might reasonably term a singularity. but the concept is surely present, albeit in a somewhat less hubristic degree, when we speak in uncritical awe of “game-changing” or “transformational” technologies. turkle puts it this way: the triumphalist narrative of the web is the reassuring story that people want to hear and that technologists want to tell. but the heroic story is not the whole story. in virtual worlds and computer games, people are flattened into personae. on social networks, people are reduced to their profiles. on our mobile devices, we often talk to each other on the move and with little disposable time—so little, in fact, that we communicate in a new language of abbreviation in which letters stand for words and emoticons for feelings. . . . we are increasingly connected to each other but oddly more alone: in intimacy, new solitudes.23 some of my endlessly patient friends—the ones who provide both you and me with some measure of buffering from the worst of my rants in prepublication drafts of these columns—have asked questions about how all this relates to libraries, for example: how much it is legitimate to generalize to the broader population research findings from cases of obsessive compulsive disorder? the individuals studied are, of course, obsessive and compulsive, in relation to the internet and new technologies. do their behaviors not represent an extreme end of the population? a fair question. and yes, the examples i’ve provided in this column are admittedly somewhat extreme. 
but turkle and aboujaoude both point to many examples that are far more common. i think all of us would have

references and notes
1. marc truitt, “editorial: the air is full of people,” information technology and libraries 30 (mar. 2011): 3–5. http://www.ala.org/ala/mgrps/divs/lita/ital/302011/3001mar/editorial_pdf.cfm (accessed apr. 25, 2011).
2. sherry turkle, alone together: why we expect more from technology and less from each other (new york: basic books, 2011); elias aboujaoude, virtually you: the dangerous powers of the e-personality (new york: norton, 2011).
3. turkle, 19.
4. ibid., 3–4.
5. quoted in ibid., 5.
6. ibid., 9–10. emphasis added.
7. quoted in aboujaoude, 99.
8. turkle, 162. emphasis in original.
9. ibid., 295.
10. turkle, 161.
11. aboujaoude, 96.
12. ibid., 98.
13. quoted in turkle, 1.
14. ibid., 12.
15. ibid.
16. ibid., 153.
17. aboujaoude, 77–78.
18. turkle, 290.
19. ibid., 34.
20. ibid., 103–4.
21. ibid., 120–21.
22. ibid., 25.
23. ibid., 18–19.
24. for a recent and typical example, see david carr, “keep your thumbs still when i’m talking to you,” new york times, apr. 15, 2011, http://www.nytimes.com/2011/04/17/fashion/17text.html (accessed may 2, 2011).
25. aboujaoude, 283.

adopters. in our quest to remain “relevant” to our university or school administrations, governing boards, and (in theory, at least) our patrons, we have embraced with remarkably little reservation just about every technology trend that’s come along in the past few decades. at the same time, we’ve been remarkably uncritical and unreflective about our role in, and the larger implications of, what we might be doing by adopting these technologies. aboujaoude, in a surprising, but i think largely correct summary comment, observes: extremely little is available, however, for the individual interested in learning more about how virtual technology has reshaped our inner universe and may be remapping our brains.
as centers of learning, public libraries, schools, and universities may be disproportionately responsible for this deficiency. they outdo one another in digitalizing their holdings and speeding up their internet connections, and rightfully see those upgrades as essential to compete for students, scholars, and patrons. in exchange, however, and with few exceptions, they teach little about the unintended, less obvious, and more personal consequences of the world wide web. the irony is, at least in some libraries’ case, that their very survival seems threatened by a shift that they do not seem fully engaged in trying to understand, much less educate their audiences about.25 i could hardly agree more. so, how do we answer aboujaoude’s critique?

letter from the editor: farewell 2020
kenneth j. varnum
information technology and libraries | december 2020
https://doi.org/10.6017/ital.v39i4.13051

i don’t think i’ve ever been so ready to see a year in the rear-view mirror as i am with 2020. this year is one i’d just as soon not repeat, although i nurture a small flame of hope. hope that as a society what we have experienced this year will exert a positive influence on the future. hope that we recall the critical importance of facts and evidence. hope that we don’t drop the effort to be better members of our local, national, and global communities and treat everyone equitably. hope that as a global populace we continue to get into “good trouble” and push back against institutionalized policies and practices of racism and discrimination and strive to be better. despite the myriad challenges this year has brought, it is welcome to see so many libraries continuing to serve their communities, adapting to pandemic restrictions, and providing new and modified access to books and digital information.
and equally gratifying, from my perspective as ital’s editor, is that so many library technologists continue to generously share what they have learned through submissions to this journal. along those lines, i’m extending my annual invitation to our public library colleagues to propose a contribution to our quarterly column, “public libraries leading the way.” items in this series highlight a technology-based innovation from a public library perspective. topics we are interested in could include any way that technologies have helped you provide or innovate service to your communities during the pandemic, but could touch on any novel, interesting, or promising use of technology in a public library setting. columns should be in the 1,000-1,500 word range and may include illustrations. these are not intended to be research articles. rather, public libraries leading the way columns are meant to share practical experience with technology development or uses within the library. if you are interested in contributing a column, please submit a brief summary of your idea. wishing you the best for 2021, kenneth j. varnum, editor varnum@umich.edu december 2020 https://ejournals.bc.edu/index.php/ital/pllw https://docs.google.com/forms/d/e/1faipqlsd7c0-g-lxetkj2ukjokd7oyt-vprtoizdm1fs8xuhkotctug/viewform

highlights of lita board meetings

these highlights are published to inform division members of the activities of their board. they are abstracted from the official minutes.

1981 ala annual conference, san francisco

first session, june 29, 1981

board members present: s. michael malinconico, brigitte l. kenney, barbara e. markuson, nancy l. eaton, kenneth j. bierman, bonnie k. juergens, marilyn j. rehnberg, helen cyr, heike kordish, donald p. hammer.

lita election results.
vice-president/president-elect: carolyn m.
gray
director-at-large: hugh atkinson
ala councilor: bonnie k. juergens
vccs vice-chairperson/chairperson-elect: mary h. karpinski
vccs secretary: patricia m. paine
vccs member-at-large: leon l. drolet, jr.
avs chairperson: anne t. meyer
avs vice-chairperson/chairperson-elect: louis r. pointon
avs member-at-large: michael d. miller
isas vice-chairperson/chairperson-elect: james c. thompson
isas member-at-large: sherrie schmidt

evaluation of electronic mail project. the members of the board reviewed their experiences and impressions with the ontyme electronic mail system. the general consensus was that the system was very good and everyone was pleased with it and wants to expand its use. the board has not yet used the source, although we are now subscribers to the system. motion was made by markuson, seconded by rehnberg, and passed that: the electronic mail project be extended through the midwinter meeting, 1982, with a total budget of $2,000 from the inception of the project.

lita's representation on ansi x-3. x-3 is the american national standards institute committee on computers and information processing. discussion included the mechanics of keeping the membership informed of proposed standards being considered, the large amount of time required of the representative to monitor, study, and disseminate the proposed standards, and the costs involved for lita to support a representative. juergens requested that if a division-wide representative to x-3 is appointed that that person should also be made ex officio to the isas/tesla committee or be liaison to the chair of isas. no action was taken.

goals and long-range planning committee. kenney announced that she had appointed an ad hoc goals and long-range planning committee chaired by george abbott.

directory of library systems in use. the suggestion was made that a directory of the many automated systems in use in libraries would be very useful.
a motion was made by markuson, seconded by kenney, and passed that: in response to inquiry about a directory to assist in identifying specific applications of technology in libraries, media, and information centers, that the publications committee explore the feasibility of an online lita directory of library, media, and information center use of technology. the investigation should consider format of description, potential of interactive online updating, and possible output byproducts, and should result in a draft rfp for consideration by the lita board for review at midwinter.

president's program at philadelphia. kenney announced her plans for the lita president's program at the philadelphia ala annual conference. she is planning to transmit by satellite to fifty receiving sites around the country an "ala sampler" of outstanding technically-based programs from the philadelphia conference and short vignettes of what ala is all about. the subject of "on-line catalogs" has been chosen for the president's program and segments of it and the rtsd/lita/rasd preconference institute on the same subject will be used. the program is intended for people who cannot get to ala conferences. if not enough registration is received by the coming ala midwinter meeting the whole activity would be cancelled.

oral history project. at the 1980 new york ala conference, the suggestion was made that in the future many of the pioneers in the field of library automation will pass off the scene and it was felt that it was lita's responsibility to capture for posterity the ideas and philosophy of those people. a motion was made by kenney, seconded by eaton, and passed that: an ad hoc committee be formed to investigate an oral history project in all aspects and submit a detailed set of alternative approaches for the board's consideration. the library history roundtable will be informed of the committee's activity and invited to participate.
second session, june 30, 1981

board members present: s. michael malinconico, brigitte l. kenney, barbara e. markuson, nancy l. eaton, kenneth j. bierman, ronald f. miller, bonnie k. juergens, marilyn j. rehnberg, helen cyr, heike kordish, charles husbands, and donald p. hammer.

324 journal of library automation vol. 14/4 december 1981

lita section reports: isas. bonnie juergens, chairperson of isas, reported that the section has approved three programs for the philadelphia conference. asis will be asked to cosponsor the program "information science, computer science, and library science: in search of common ground". another program is "the uses of microcomputers in medium-sized public and academic libraries," and the third one will be a detailed analysis and comparison of the marc format. juergens reported that the isas retrospective conversion discussion group and one of the same name in rtsd would like to combine. a motion was made by juergens, and passed that: isas pursue appropriate steps to invite the rtsd section which currently hosts a discussion group on retrospective conversion to combine that discussion group with the lita/isas retrospective conversion discussion group. the invitation to rtsd will include a specific description of mutual responsibilities.

electronic library membership initiative group. (information report by richard sweeney, public library of columbus and franklin co., ohio; and neal kaske, oclc.) sweeney reviewed the discussions that took place at a meeting held in columbus on march 23-24, 1981 concerned with the whole area of remote electronic access to information and its impact on the library field. the group concluded that its members want to have some input on a very immediate level on the direction technology goes and the direction the policies and issues go. out of that meeting came a mission statement which is now the function statement of the ala electronic library membership initiative group (elmig).
sweeney read that statement and reported on the group's concern for the future of libraries when these remote systems become established. he commented on the large number of programs and meetings on these areas that are not coordinated and not really providing the leadership our field should be giving. the almost total lack of research on these areas was also commented on. the need for the associations to provide the leadership was stressed. several members of the lita board expressed interest in providing a "home" for elmig within lita as many of lita's interests are those of the mig. both groups are concerned with the same issues it was pointed out.

lita section reports: audio-visual section. avs recommended that an audiovisual task force be established, which would include other ala units, and would share information about their plans, and would try to avoid major schedule conflicts and overlaps. a motion was made by cyr, and passed that lita board approve ad hoc lita a-v section participation in a broadbased task force involving rtsd, pla, acrl, aasl and others to coordinate audiovisual-related activities. cyr asked board's sanction for an "a-v interest group breakfast" where people could just socialize and talk together. this would be sometime in the future. the board members had no objection.

marbi committee report. elaine woods reported that the marbi committee is focusing more on the principles and the issues that need to be addressed in the marc format. the committee is current with l.c. proposals. marbi has drawn up a shopping list of issues to be addressed and they are now working on some of them.

publications committee report. charles husbands informed the board that the publications committee feels it is time to change the title of jola. they have chosen a title of information technology and libraries, and it is to be effective with the march 1982 issue.
after discussion, a motion was made by bonnie juergens, and passed to that effect. the matter of raising the subscription price of jola was discussed. due to the fact that the division's subsidy to the journal will greatly increase next budget year, the motion was made by ken bierman, and passed that: non-member prices for the journal of the division be increased to $20 for a one-year subscription and $5.50 for a single issue, effective with march 1982, and that the published member subscription price be raised sufficiently to conform to postal regulation. husbands requested that various members of the jola editorial board be included in the lita electronic mail system. approved by the board by consensus. husbands asked the board to keep in mind the possibility of publishing some of the results of the oral history project in jola. brian aveney asked the board to allow him to investigate the possibility of putting the full text of jola online. it would be an experiment to see what people would do with it. the board approved by consensus. aveney will return with a final proposal later. other such ideas were discussed including the proposals to put the "headlines" from the lita newsletter on the source, and to include the roster of lita committees in the oclc address directory. arrangements are in process for both of these activities.

goals and long-range planning committee. george abbott, chairperson, asked the board's permission to include his committee on lita's electronic mail system. the intent would be to use it for text editing of committee documents. board approved by consensus. abbott reported that the committee expects to hold open hearings at midwinter and to have a basic document for discussion at that time.

third session, june 30, 1981

board members present: s. michael malinconico, brigitte l. kenney, ronald f. miller, kenneth j. bierman, marilyn j. rehnberg, heike kordish, and donald p. hammer.

bylaws and organization report.
there have been seven changes to the lita bylaws that kordish will prepare in text form for the board to act on at midwinter in time for the spring ala ballot.

ala priorities survey. ron miller reported that the ala executive board took action on the ala priorities and there are five of them. briefly, they are access to information, legislation and funding, intellectual freedom, public awareness, and personnel resources.

joint council on educational telecommunications. lynne bradley reported that jcet has established a task force to bring information to its members about the new technologies and how they can best be used in education. since lita members have much of the necessary expertise, bradley suggested that lita organize a one-day program for jcet. some board members were very much interested and bradley was asked to work with the lita program planning committee to organize such a program.

program planning committee. sue tyner reported that the telecommunications committee will hold a preconference institute at the philadelphia annual conference called "the teleconference center." it is intended to teach librarians how to set up a teleconference center. the lita group that has been putting on the "data processing specifications and contracting" workshops has been asked to hold a workshop prior to the ifla meeting. malinconico suggested that the board adopt a policy of lita costs plus 15 percent, but that a subcommittee of the lita program planning committee should be set up to define policy in this area. carolyn gray was suggested as a person for this committee. marilyn rehnberg, chairperson of vccs, reported a request from national audio-visual association asking lita to put on a "video showcase" for the seminar part of the nava annual conference in anaheim in january.
lita board of directors meetings
record of votes, 1981 annual conference
motions (in order of appearance in the "highlights")

board member              1  2  3  4  5  6  7  8
s. michael malinconico    y  y  y  y  y  y  y  y
brigitte l. kenney        y  y  y  y  y  y  y  y
barbara e. markuson       y  y  y  y  y  y  y  y
nancy l. eaton            y  y  y  y  y  y  y  y
kenneth j. bierman        y  y  y  y  y  y  y  y
ronald f. miller          0  0  0  y  y  y  y  y
angie w. leclercq         0  0  0  0  0  0  0  0
helen cyr                 y  y  y  y  y  y  y  y
bonnie k. juergens        y  y  y  y  y  y  y  y
marilyn j. rehnberg       y  y  y  y  y  y  y  y

key: y = yes, a = abstain, 0 = absent

president’s message
andromeda yelton
information technology and libraries | march 2018 2

andromeda yelton (andromeda.yelton@gmail.com) is lita president 2017-18 and senior software engineer, mit libraries, cambridge, massachusetts.

in my last president’s message, i talked about change (ital’s transition to new leadership) and imagination (wakanda and the archival imaginary). today change and imagination are on my mind again as lita contemplates a new path forward: potentially becoming a new combined division with alcts and llama. as you may have already seen on litablog (http://litablog.org/2018/02/lita-alcts-and-llama-document-on-small-division-collaboration/), the three divisional leadership teams have been envisioning this possibility, and all three division boards discussed it at midwinter. while the idea sprang out of our shared challenges with financial stability, in discussing it we’ve realized how much opportunity we have to be stronger together. for instance, we’ve heard for years that you, lita members, want more of a leadership training pathway, and more ways to stay involved with your lita home as you move into management; alignment with llama automatically opens up all kinds of possibilities. they have an agile divisional structure with their communities of practice and an outstanding set of leadership competencies.
and anyone involved with library technology knows that we live and die by metadata, but we aren’t all experts in it; joining forces with alcts creates a natural home for people no matter where they are (or where they’re going) on the technology/metadata continuum. alcts also runs far more online education than lita and runs a virtual conference. meanwhile, of course, lita has a lot to offer to llama and alcts. you already know how rewarding the networking is, and how great the depth of expertise on technology topics. we also bring strong publications (like this very journal), marquee conference programs (like top tech trends and the imagineering panel), and a face-to-face conference. (speaking of which, please pitch a session (http://bit.ly/2gpgxdf) for the 2018 lita forum!) i want to emphasize that no decisions have been made yet. the outcome of our three board discussions was that we all feel there is enough merit to this proposal to explore it further, but none of us are formally committed to this direction. furthermore, it is not practically or procedurally possible to make a change of this magnitude until at least 2019. in the meantime, we expect there will be numerous working groups to determine if and how this all could work, as well as open forums for the membership of all three divisions to express hopes, concerns, and ideas. personally, my highest priority is to ensure that you, the members, continue to have a divisional home: one that gives you learning opportunities and a place for professional camaraderie, and that is on solid financial footing so it can continue to be here for you in the long term.
so, i’m excited about the possibilities that a superhero teamup affords, but i’m even more excited to hear from you. do you find this prospect thrilling, scary, both? do you think we should absolutely go this way, or definitely not, or maybe but with caveats and questions? please tell me what you think. you can submit anonymous feedback and questions at https://bit.ly/litamergefeedback. i will periodically collate and answer these questions on litablog. you can also reach out to me personally any time (andromeda.yelton@gmail.com).

https://doi.org/10.6017/ital.v37i1.10386

articles
automated storage & retrieval system: from storage to service
justin kovalcik and mike villalobos
information technology and libraries | december 2019 114

justin kovalcik (jdkovalcik@gmail.com) is director of library information technology, csun oviatt library. mike villalobos (mike.villalobos@csun.edu) is guest services supervisor, csun oviatt library.

abstract
the california state university, northridge (csun) oviatt library was the first library in the world to integrate an automated storage and retrieval system (as/rs) into its operations. the as/rs continues to provide efficient space management for the library. however, added value has been identified in materials security and inventory as well as customer service. the concept of library as space, paired with improved services and efficiencies, has resulted in the as/rs becoming a critical component of library operations and future strategy.
staffing, service, and security opportunities, paired with support and maintenance challenges, enable the library to provide a unique critique and assessment of an as/rs.

introduction
“space is a premium” is a phrase not unique to libraries; however, due to the inclusive and open environment promoted by libraries, their floor space is especially attractive to those within and outside of the building’s traditional walls. in many libraries, the majority of floor space is used to house a library’s collection. in the past, as collections grew, floor space became increasingly limited. faced with expanding expectations and demands, libraries struggled to identify a balance between transforming space for new services while adding materials to a growing collection. in addition to management activities like weeding, other solutions such as offsite storage and compact shelving rose in popularity as a method to create library space in the absence of new building construction. years later as collections move away from print and physical materials, libraries are beginning to reexamine their building’s space and envision new features and services. “now that so many library holdings are accessible digitally, academic libraries have the opportunity to make use of their physical space in new and innovative ways.”1 the csun oviatt library took a novel approach and launched the world’s first automated storage and retrieval system (as/rs) in 1991 as a storage solution to resolve its building space limitations. the project was a california state university (csu) system chancellor’s office initiative that cost more than $2 million to implement and began in 1989. the original concept “came from the warehousing industry, where it had been used by business enterprises for years.”2 by leveraging and storing physical materials in the as/rs, the csun oviatt library is able to create space within the library for new activities and services.
“instead of simply storing information materials, the library space can and should evolve to meet current academic needs by transforming into an environment that encourages collaborative work.”3

automated storage & retrieval system | kovalcik and villalobos 115 https://doi.org/10.6017/ital.v38i4.11273

unfortunately, as the first stewards of an as/rs, csun made decisions that led to mismanagement and neglect, resulting in the as/rs facing many challenges in becoming a stable and reliable component of the library. however, recent efforts have sought to resolve these issues and have resulted in system updates along with improved management and functionality. whereas in the past low-use materials were placed in the as/rs to create space for new materials, now materials are moved into the as/rs to create space for patrons, secure collections, and improve customer service. as part of this critical review, the functionality and maintenance along with the historical and current management of the as/rs will be examined.

background
csun is the second-largest member of the twenty-three-campus csu system. the diverse university community includes over 38,000 students and more than 4,000 employees.4 consisting of nine colleges offering 60 baccalaureate degrees, 41 master’s degrees, 28 credentials in education, and various extended learning and special programs, csun provides a diverse community with numerous opportunities for scholarly success.5 the csun oviatt library’s as/rs is an imposing and impressive area of the library that routinely attracts onlookers and has become part of the campus tour. the as/rs is housed in the library’s east wing and occupies an area that is 8,000 square feet and 40 feet high arranged into six aisles.
the 13,260 steel bins, each 2 feet x 4 feet, in heights of 6, 10, 12, 15, and 18 inches, are stored on both sides of the aisles, enabling the as/rs to store an estimated 1.2 million items.6 each aisle has a storage retrieval machine (srm) that performs automatic, semiautomatic, and manual “picks” and “deposits” of the bins.7 the as/rs was assessed in 2014 as responsibilities, support, and expectations of the system shifted and previous configurations were no longer viable. discontinued and failing equipment, unsupported server software, inconsistent training and use, and decreased local support and management were identified as impediments for greater involvement in library projects and operations. campus provided funding in 2015 to update the server software as well as major hardware components on three of the six aisles. divided into two phases, the server software upgrade was completed in may 2017 followed by the hardware upgrade in january 2019.8

literature review
the continued growth of student, faculty, and academic programs along with evolving expectations and needs since the late 1980s has required the library to analyze library services and examine the building’s physical space and storage capacity. in the late 1980s, identifying space for increasing printed materials was the main contributing factor in implementing the as/rs. in the mid-2010s, creating space within the library for new services was dependent on a stable and reliable as/rs. “the conventional way of solving the space problem by adding new buildings and off-site storage facilities was untenable.”9 a benefit of an as/rs, as creaghe and davis predicted in 1986, was “the probable slow transition from books to electronic media, an aaf [automated access facility] may postpone the need for future library construction indefinitely.”10 the as/rs has enabled the library to create space by removing physical materials while enhancing customer service, material security, and inventory control.
“the role of the library as service has been evolving in lockstep with user needs. the current transformative process that takes place in academia has a powerful impact on at least two functional areas of the library: library as space and library as collection.”11 in addition, the “increased security the aaf … offers will save patrons time that would be spent looking for books on the open shelves that may be in use in the library, on the waiting shelves, misplaced, or missing.”12 in subsequent years, library services have evolved to include computer labs with multiple high-use printers/scanners/copiers, instructional spaces, individual and group study spaces, makerspaces, etc., in addition to campus entities that have required large amounts of physical space within the library. “it is well-known that academic libraries have storage problems. traditional remedies for this situation, used in libraries across the nation, include off-site storage for less used volumes, as well as, more recently, innovative compact shelving. these solutions help, but each has its disadvantages, and both are far from ideal. . . . when the eastern michigan university library had the opportunity to move into a new building, we saw that an as/rs system would enable us to gain open space for activities such as computer labs, training rooms, a cafe, meeting rooms, and seating for students studying.”13 the as/rs provides all the space advantages of off-site storage and compact shelving while adding further value and mitigating their drawbacks: the time delays of off-site retrieval and the confusion of accessing and using compact shelving.

staffing & usage

1991–1994
following the 80/20 principle, low-use items were initially selected for storage in the as/rs. “when the storage policy was being developed in [the] 1990s, the 80/20 principle was firmly espoused by librarians. . . .
thus, by moving lower-use materials to as/rs, the library could still ensure that more than 80% of the use of the materials occurs on volumes available in the open stacks.”14 items were identified as low-use if one of the following three conditions was met: (1) the item’s last circulation date was more than five years ago; (2) the item was a non-circulating periodical; or (3) the item was not designed to leave an area and received little patron usage, such as the reference collection. in 1991, the as/rs was loaded with 800,000 low-use items and went live for the first time later that year. staffing for the initial as/rs department consisted of one full-time as/rs supervisor (40 hours/week), one part-time as/rs repair technician (20 hours/week), and 40 hours a week of dedicated student employees, for a total of 100 hours a week of dedicated as/rs management. the as/rs was largely utilized as a specialized service for internal library operations with limited patron-initiated requests. as/rs operations were uniquely created and customized for each as/rs operator as well as the desired task needing to be performed. skills were developed internally with knowledge and training shared by word of mouth or accompanied with limited documentation.

2000–mid-2000s
the as/rs department functioned in this manner until the 1994 northridge earthquake struck the campus directly and required partial building reconstruction to the library. although there was no damage to the as/rs itself or its surrounding structure, extensive damage occurred in the wings of the library. the damage resulted in the library building being closed and inaccessible. when the library reopened in 2000, it was determined that, due to previous low as/rs usage, a dedicated department was no longer warranted. the as/rs supervisor position was dissolved, the student employee budget was eliminated, and the as/rs technician position was not replaced after the employee retired in 2008.
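the three low-use conditions applied in 1991 can be read as a simple "any of" test. the sketch below is illustrative only: the record fields (last_circulated, is_noncirculating_periodical, is_reference) are hypothetical names chosen for this example and are not fields from the library's catalog or the as/rs software, and the five-year window is approximated in days.

```python
from dataclasses import dataclass
from datetime import date, timedelta


# Hypothetical record layout; the field names are illustrative assumptions.
@dataclass
class Item:
    last_circulated: date
    is_noncirculating_periodical: bool = False
    is_reference: bool = False  # stands in for "not designed to leave an area"


# Approximation of "five years" that ignores leap days.
FIVE_YEARS = timedelta(days=5 * 365)


def is_low_use(item: Item, today: date) -> bool:
    """Return True if any one of the three low-use conditions holds."""
    not_circulated_recently = (today - item.last_circulated) > FIVE_YEARS
    return (
        not_circulated_recently
        or item.is_noncirculating_periodical
        or item.is_reference
    )
```

because the conditions are joined with "or", a recently circulated item still qualifies if it belongs to a non-circulating periodical run or to the reference collection.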
as/rs operational responsibilities were consolidated into the circulation department and as/rs administration into the systems department. both circulation and systems departments redefined their roles and responsibilities to include the as/rs without additional budgetary funding, staffing, or training. in order for as/rs operations to be absorbed by these departments, changes had to occur in the administration, operating procedures, staffing assignments, and access to the as/rs. all five circulation staff members and twenty student employees received informal training by members of the former as/rs department in the daily operations of the as/rs. the circulation members also received additional training for first-tier troubleshooting of as/rs operations such as bin alignments, emergency stops, and inventory audits. the as/rs repair technician remained in the systems department; however, as/rs troubleshooting responsibility was shared among the systems support specialists and dedicated as/rs support was lost. the administrative tasks of scheduling preventive maintenance services (pms), resolving as/rs hardware/equipment issues with the vendor, and maintaining the server software remained with the head of the systems department. without a dedicated department providing oversight for the as/rs, issues and problems began to occur frequently. circulation had neither the training nor resources available to master procedures or enforce quality control measures. similarly, the systems department became increasingly removed from daily operations. many issues were not reported at all and became viewed as system quirks that required workarounds or were viewed as limitations of the system.
for issues that were reported, troubleshooting had to start all over again and systems relied on circulation staff being able to replicate the issue in order to demonstrate the problem. systems personnel retained little knowledge on performing daily operations, and troubleshooting became more complex and problematic as different operators had different levels of knowledge and skill that accompanied their unique procedures.

mid-2000s–2015
these issues were further exacerbated when areas outside of circulation were given full access to the as/rs in the mid-2000s. employees from different departments of the library began entering and accessing the as/rs area and operated the as/rs based on knowledge and skills they learned informally. student assistants from these other departments also began accessing the area and performing tasks on behalf of their informally trained supervisors. further, without access control, employees as well as students ventured into the “pit” area of the as/rs where the srms move and end-of-aisle operations occur. this area contains many hazards and is unsafe without proper training. during this period, the special collections and archives (sc/a) department loaded thousands of un-cataloged, high-use items into the as/rs that required specialized service from circulation. these items were categorized as “non-library of congress” and inventory records were entered into the as/rs software manually by various library employees. in addition, paper copies were created and maintained as an independent inventory by sc/a. over the years, the sc/a paper inventory copies were found to be insufficiently labeled, misidentified, or missing. therefore, the as/rs software inventory database and the sc/a paper copy inventory contained conflicts that could not be reconciled. to resolve this situation, an audit of sc/a materials was completed in spring 2019 to locate inventory that was thought to be missing.
all bound journals and current periodicals were eventually loaded into the as/rs as well, causing other departments and areas to rely on the as/rs more heavily. departments such as interlibrary loan and reserves, as well as patrons, began requesting materials stored in the as/rs more routinely and frequently. the as/rs transformed from a storage space with limited usage to an active area with simultaneous usage requests of different types throughout the day. without a dedicated staff to organize, troubleshoot, and provide quality control, there was an abundance of errors that led to long waits for materials, interdepartmental conflicts, and unresolved errors. high-use materials from sc/a, as well as currently received periodicals from the main collection, were the catalysts that drove and eventually warranted change in the as/rs usage model from storage to service. the inclusion of these materials created new primary customers identified as internal library departments: sc/a and interlibrary loan (ill). with over 4,000 materials contained in the as/rs, sc/a requires prompt service for processing archival material into the as/rs and filling specialized patron requests for these materials. in addition, ill processes over 500 periodical requests per month that depend on as/rs services. the additional storage and requests created an uptick in overall as/rs utilization that carried over into circulation desk operations as well.

2015–present
the move from storage to service was not only inevitable due to an evolving as/rs inventory, but was necessary in order to regain quality control and manage the library-wide projects that involved the as/rs. the increased usage and reliance on the as/rs required that the system be well maintained and managed. administration of the as/rs remains within systems, and circulation student employees continue to provide supervised assistance to the as/rs.
the crucial change was the need, identified within circulation, for a dedicated operations and project manager. an as/rs lead position was created with responsibilities for the daily operations and management of the system and service. however, this was not a complete return to the original staffing concept of the early 1990s. the concept for this new position focuses on project management and system operations rather than the original sole attention to system operations. the as/rs lead is the point of contact for all library projects that utilize the as/rs, relays any as/rs issues or concerns to systems, and oversees daily as/rs usage. this shift is necessary due to the increased demand and reliance on the system that has changed its charge from storage to service.

customer service
the library noted over time that the as/rs could be used as a tool in weeding and other collection shift projects to create space and aid in reorganizing materials. as more high-use materials were loaded into the as/rs, the indirect advantages of the as/rs became more apparent. patrons request materials stored within the as/rs through the library’s website and pick up the materials at the circulation desk. there is no need for patrons to navigate the library, successfully use the classification system, and search shelves to locate an item that may or may not be there. as kirsch notes, “the ability to request items electronically and pick them up within minutes eliminates the user’s frustration at searching the aisles and floors of an unfamiliar library.”15 the vast majority of library patrons are csun students who commute and must make the best use of their time while on campus. housing items in the as/rs creates the opportunity to have hundreds of thousands of items all picked up and returned to one central location.
this makes it far easier for library patrons, especially users with mobility challenges, to engage with a plethora of library materials. the time allotted for library research and/or enjoyment becomes more productive as their desired materials are delivered within minutes of arriving in the building. as heinrich and willis state, “the provision of the nimble, just-in-time collection becomes paramount, and the demand for as/rs increases exponentially.”16 as/rs items are more readily available than shelved items on the floor, as it takes minutes to have as/rs items returned and made available once again. “they may be lost, stolen, misshelved, or simply still on their way back to the shelves from circulation; we actually have no way of knowing where they are without a lengthy manual search process, which may take days. . . . unlike books on the open shelves, returned storage books are immediately and easily ‘reshelved’ and quickly available again.”17 another advantage is there is no need to keep materials in call-number order with the unpleasant reality of missing and misshelved items. items in the as/rs are assigned bin locations that can only be accessed by an operator- or user-initiated request. the workflow required to remove a material from the as/rs involves multiple scans and procedures, providing a level of accountability that does not exist for items stored on floor shelves. further, users are assured of an item’s availability within the system. storing materials in the as/rs ensures that items are always checked out when they leave the library and not sitting unaccounted for in library offices and processing areas. it also avoids patron frustration of misshelved, recently checked-out, or missing items.
security
the decision to follow the 80/20 principle and place low-use items in the as/rs meant high-use items remained freely available to library patrons on the open shelves of each floor. this resulted in high-use items being available for patron browsing and checkout, as well as patron misuse and theft. the sole means of securing these high-use items involved tattle-tape and installing security gates at the main entrance. therefore, the development of policies and procedures for the enforcement of these gates was also required. beyond the inherent cost, maintenance, and issue of ensuring items are sensitized and desensitized correctly, gate enforcement became another issue that rested upon the circulation department. even assuming a theft would occur by passing through the gates at the main entrance of the library, enforcement is limited in the actions that may be performed by library employees. touching, impeding the path, following, detaining, searching, etc. of library patrons are restricted actions reserved for campus authorities such as the police and not library employees. rather than attempting to enforce a security mechanism in which we have no authority, the as/rs provides an alternative for the security of high-use and valuable materials. storing items in the as/rs eliminates the possibility of theft or damage by visitors and places control and accountability over the internal use of materials. “there would be far fewer instances of mutilation and fewer missing items.”18 further, access to the as/rs area was restricted from all library personnel to only circulation and systems employees with limited exceptions. individual logins also provided a method of control and accountability as each operator is required to use a personal account rather than a departmental account to perform actions on the as/rs. materials stored in the as/rs are, “more significantly . . .
safer from theft and vandalism.”19

inventory
conducting a full inventory of a library collection is time consuming, expensive, and often inaccurate by the time of completion. missing or lost items, shelf reading projects, in-process items, etc. create overhead for library employees and generate frustration for patrons searching for an item. massive, library-wide projects such as collection shifts and weeding are common endeavors undertaken to create space, remove outdated materials, and improve collection efficiency. however, actions taken on an open-shelf collection are time consuming, costly, and inefficient, and they affect patron activities. these projects typically involve months of work and require multiple departments to complete. items stored within the as/rs do not experience these challenges because the system is managed by a full-time employee throughout the year and not on a project basis. the system is capable of performing inventory audits, and does not affect public services. therefore, while the cost of an item on an open shelf is $0.079, the cost of storing the same item in the as/rs is $0.02.20 routine and spot audits ensure an accurate inventory, confirm capacity level of the system, and establish best management of the bins. as/rs inventory audits are highly accurate and much more efficient than shelf reading with little impact to patron services. “while this takes some staff time, it is far less time-consuming than shelf reading or searching for misshelved books.”21 storing materials in the as/rs is more efficient than on open shelves; however, bin management is essential in ensuring bins are configured in the best arrangement to achieve optimal efficiency. the size and configuration of bins directly affects storage capacity. type of storage, random or dedicated, also influences capacity, efficiency, and accessibility of items.
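to make the per-item cost comparison above concrete, the short sketch below scales the two quoted figures to a collection. it reads the figures as $0.079 (open shelf) and $0.02 (as/rs), treating the digits that trail the second figure as the citation marker. the source does not say what period or cost components the figures cover, so the totals are unit-free illustrations, and the 800,000-item count is borrowed from the 1991 load purely as an example input.

```python
from decimal import Decimal

# Per-item costs as quoted in the text; the period they cover is not stated.
OPEN_SHELF_COST = Decimal("0.079")
ASRS_COST = Decimal("0.02")


def storage_costs(item_count: int) -> dict:
    """Scale the quoted per-item costs to a collection of item_count items."""
    open_shelf = OPEN_SHELF_COST * item_count
    asrs = ASRS_COST * item_count
    return {
        "open_shelf": open_shelf,
        "asrs": asrs,
        "difference": open_shelf - asrs,
        "ratio": OPEN_SHELF_COST / ASRS_COST,  # open shelf relative to as/rs
    }


# Example input only: 800,000 is the number of items loaded in 1991.
example = storage_costs(800_000)
```

on these figures an open-shelf item costs just under four times as much as the same item in the as/rs; decimal arithmetic is used so the small per-item values are not distorted by binary floating point.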
the 13,260 steel bins in the as/rs range in height from 6 to 18 inches. the most commonly used bins are the 10- and 12-inch bins; however, there is a finite number of these bin heights. unfortunately, the smallest and largest bins are rarely used due to material sizes and weight capacity; therefore, as/rs optimal capacity is unattainable and the number of materials eligible for loading is limited by the number of bins available. the library also determined that dedicated, rather than random, bin storage aided in locating specialized materials, reduced loading and retrieval errors, and enhanced accessibility by arranging highly used bins in reachable locations. in the event an srm breaks down and an aisle becomes nonfunctional for retrieving bins, strategically placing the highest used and specialized locations in bins that can be manually pulled is a proactive strategy. however, this requires dedicated bins with an accurate and known inventory that has been arranged in accessible locations.

lessons learned

disasters & security
in 1994, the as/rs proved to provide a much more stable and secure environment than the open stacks when it successfully endured a 6.9 earthquake. the reshelving of more than 300,000 items required a crew of more than thirty personnel over a year to complete. many items were destroyed from the impact of falling to the floor and being buried underneath hundreds of other items. the as/rs in contrast consisted of over 800,000 items and successfully sustained the brunt of the earthquake’s impact with no damage to any of the stored items. unfortunately, the materials that had been loaded into the as/rs in 1991 were low-use items that were viewed as one step from weeding.
therefore, high-use items stored in open shelves were damaged and required the long process of recovery and reconstruction: identifying and cataloging damaged and undamaged materials, disposal of those damaged, renovation of the area, and purchase of new items. the low-use items stored in the as/rs by contrast required only that a few bins that had slightly shifted be pushed back fully into their slots. as/rs items have proven to be more secure from misplacement, theft, and physical damage from earthquakes as compared to items in open shelves.

maintenance, support, and modernization
the csun oviatt library has received two major updates to the as/rs since it was installed in 1991. in 2011, the as/rs received updates for communication and positioning components. the second major update occurred in two phases between 2016 and 2018 and focused on software and equipment. in phase one, server and client-side software was updated from the original software created in 1989. in phase two, half the srms received new motors, drives, and controllers. due to the many years of reliance on preventive maintenance (pm) visits and avoidance of modernization, our vendors were unable to provide support for the as/rs software and had difficulty locating equipment that had become obsolete. preventive maintenance visits were used to maintain the status quo and are not a long-term strategy for maintaining a large investment and critical component of business operations. creaghe and davis note that, “current industrial facility managers report that with a proper aaf [automated access facility] maintenance program, it is realistic to expect the system to be up 95–98 percent of the time.”22 pm service is essential for long-term as/rs success; however, preventive maintenance alone is incapable of modernization and ensuring equipment and software do not become obsolete. maintenance is not the same as support; rather, maintenance is an aspect of support.
support includes points of contact who are available for troubleshooting, spare supplies on hand for quick repairs, a life-cycle strategy for major components, and long-term planning and budgeting. kirsch described eastern michigan university’s strategy as follows: “although the dean is proud and excited about this technology, he acknowledges that just like any computerized technology, when it’s down, it’s down. to avoid system problems, emu bought a twenty-year supply of major spare parts and employs the equivalent of one-and-a-half full-time workers to care for its automated storage and retrieval system.”23 a system that relies solely on preventive maintenance will quickly become obsolete and require large and expensive projects in the future if the system is to continue functioning. further, modernization provides an avenue for new features and functions that increase functionality and efficiency.

networking

the csun oviatt library receives, on average, three to four visits a year, along with multiple emails and phone calls, from other libraries requesting information about the as/rs. these conversations help the library view the as/rs from different perspectives and prompt it to review current practices.

information technology and libraries | december 2019 122

the library has learned through speaking with many different libraries that the needs, design, and configuration of an as/rs can be as unique as the libraries inquiring. the csun oviatt library, for example, is much different from the three other csu system libraries that have an as/rs. because our system has been outdated, it has been difficult to form meaningful groups or share information, since the systems are all different from each other. as more conversations occur and systems become more modern and standard, there is potential for knowledge sharing as well as group lobbying efforts for features and pricing.
buy in

user confidence in any system is required in order for that system to be successful. convincing a user base to accept moving materials from readily available open shelves into steel bins housed within inaccessible 40-foot-high aisles will be difficult if the system is consistently down. therefore, the better the as/rs is managed and supported, the more reliable and dependable the system will be and the more likely user confidence is to grow. informing stakeholders of long-term planning and welcoming feedback demonstrates that the system is being supported and managed with an ongoing strategy that is part of future library operations. similarly, administrators need confirmation that large investments and mission-critical services are stable, reliable, and efficient. creating a new line item in the budget for as/rs support and equipment life cycle requires justification along with a firm understanding of the system. in addition, staffing and organizational responsibilities must be reviewed in order to establish an environment that is successful and efficient. continuous assessments of the as/rs (downtime, projects involved, services and efficiencies provided, etc.) help illustrate the importance and impact of the system on library operations as a whole.

recording usage and statistics

unfortunately, usage statistics were not recorded for the as/rs prior to june 2017. therefore, data is unavailable to analyze previous system usage, maintenance, downtime, or project involvement. data-driven decisions require the collection of statistics for system analysis and assessment. following the server software and hardware updates, efforts have been made to record project statistics, inventory audits, and srm faults, as well as public and internal paging requests.
conclusion

the as/rs remains, as heinrich & willis described it, “a time-tested innovation.”24 through lessons learned and objective assessment, the library is positioning the as/rs to be a critical component for future development and strategy. by expanding the role of the as/rs to include functions beyond low-use storage, the library discovered efficiencies in material security, customer service, inventory accountability, and strategic planning. the csun oviatt library has learned, experienced, and adjusted its perception, treatment, and usage of the as/rs over the past thirty years. factors such as access to the area, staffing, and inventory auditing are easily overlooked, while other potential functions such as material security and customer services may not be identified without ongoing analysis and assessment. critical review, free of a limited or biased perception, has enabled the library to realize the greater functionality the as/rs is able to provide.

notes

1 shira atkinson and kirsten lee, “design and implementation of a study room reservation system: lessons from a pilot program using google calendar,” college & research libraries 79, no. 7 (2018): 916–30, https://doi.org/10.5860/crl.79.7.916.
2 helen heinrich and eric willis, “automated storage and retrieval system: a time-tested innovation,” library management 35, no. 6/7 (august 5, 2014): 444-53, https://doi.org/10.1108/lm-09-2013-0086.
3 atkinson and lee, “design and implementation of a study room reservation system,” 916–30.
4 “about csun,” california state university, northridge, february 2, 2019, https://www.csun.edu/about-csun.
5 “colleges,” california state university, northridge, may 8, 2019, https://www.csun.edu/academic-affairs/colleges.
6 estimated as/rs capacity was calculated by determining the average size and weight of an item for each size of bin along with the most common bin layout. the average item was then used to determine how many could be stored along the width and length (and, if appropriate, height) of the bin, and these counts were then multiplied. many factors affect the overall capacity, including bin layout (with or without dividers), stored item type (book, box, records, etc.), weight of the items, and operator determination of full, partial, or empty bin designation. the as/rs mini-loaders have a weight limit of 450 pounds including the weight of the bin.
7 “automated storage and retrieval system (as/rs),” csun oviatt library, https://library.csun.edu/about/asrs.
8 “automated storage and retrieval system (as/rs),” csun oviatt library, https://library.csun.edu/about/asrs.
9 heinrich and willis, “automated storage and retrieval system,” 444-53.
10 norma s. creaghe and douglas a. davis, “hard copy in transition: an automated storage and retrieval facility for low-use library materials,” college & research libraries 47, no. 5 (september 1986): 495-99, https://doi.org/10.5860/crl_47_05_495.
11 heinrich and willis, “automated storage and retrieval system,” 444-53.
12 creaghe and davis, “hard copy in transition,” 495-99.
13 linda shirato, sarah cogan, and sandra yee, “the impact of an automated storage and retrieval system on public services,” reference services review 29, no. 3 (september 2001): 253-61, https://doi.org/10.1108/eum0000000006545.
14 heinrich and willis, “automated storage and retrieval system,” 444-53.
15 sarah e. kirsch, “automated storage and retrieval—the next generation: how northridge’s success is spurring a revolution in library storage and circulation,” paper presented at the acrl 9th national conference, detroit, michigan, april 8-11, 1999, http://www.ala.org/acrl/sites/ala.org.acrl/files/content/conferences/pdf/kirsch99.pdf.
16 heinrich and willis, “automated storage and retrieval system,” 444-53.
17 shirato, cogan, and yee, “the impact of an automated storage and retrieval system,” 253-61.
18 kirsch, “automated storage and retrieval.”
19 shirato, cogan, and yee, “the impact of an automated storage and retrieval system,” 253-61.
20 cost of material management was calculated by removing building operational costs (lighting, hvac, carpet, accessibility/open hours, etc.) and focusing on the management of the material instead. the management of materials (or unit cost) is determined by dividing the total amount of fixed and variable costs by the total number of units: $31,500 in annual shelving student budget divided by 400,000 items equals $0.079 per material per year in open shelves; $18,000 in annual as/rs student budget divided by 900,000 items equals $0.02 per material per year in the as/rs.
21 shirato, cogan, and yee, “the impact of an automated storage and retrieval system,” 253-61.
22 creaghe and davis, “hard copy in transition,” 495-99.
23 kirsch, “automated storage and retrieval.”
24 heinrich and willis, “automated storage and retrieval system,” 444-53.
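the unit-cost arithmetic in note 20 can be checked with a short python sketch. the budgets and item counts are the ones given in the note; the function name `unit_cost` is illustrative and is not part of the original study.

```python
def unit_cost(annual_budget: float, item_count: int) -> float:
    """Return the annual management cost per item (budget divided by items)."""
    if item_count <= 0:
        raise ValueError("item_count must be positive")
    return annual_budget / item_count


# Figures taken from note 20.
open_shelves = unit_cost(31_500, 400_000)  # student shelving budget, open shelves
asrs = unit_cost(18_000, 900_000)          # student budget, as/rs

print(f"open shelves: ${open_shelves:.3f} per item per year")
print(f"as/rs:        ${asrs:.3f} per item per year")
```

under these figures, the per-item cost in the open shelves is roughly four times the per-item cost in the as/rs.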
a systematic approach towards web preservation

muzammil khan and arif ur rahman

information technology and libraries | march 2019 71

muzammil khan (muzammilkhan86@gmail.com) assistant professor, department of computer and software technology, university of swat. arif ur rahman (badwanpk@gmail.com) assistant professor, department of computer science, bahria university islamabad.

abstract

the main purpose of the article is to divide the web preservation process into small explicable stages and design a step-by-step web preservation process that leads to creating a well-organized web archive. a number of research articles about web preservation projects and web archives were studied, and a step-by-step systematic approach for web preservation was designed. the proposed comprehensive web preservation process describes and combines strengths of different techniques observed during the study for preserving digital web contents into a digital web archive. for each web preservation step, different approaches and possible implementation techniques have been identified that can be adopted in digital archiving. the potential value of the proposed model is to guide archivists, related personnel, and organizations to effectively preserve their intellectual digital contents for future use. moreover, the model can help to initiate a web preservation process and create a well-organized web archive to efficiently manage the archived web contents.
a section briefly describes the implementation of the proposed approach in a digital news stories preservation framework for archiving news published online from different sources.

introduction

the amount of information generated by institutions is increasing with the passage of time. one of the mediums that carries this information is the world wide web (www). the www has become a tool to share information quickly with everyone regardless of their physical location. the number of web pages is vast. google and bing each index approximately 4.8 billion pages.1 though the www is a rapidly growing source of information, it is fragile in nature. according to the available statistics, 80 percent of pages become unavailable after one year and 13 percent of links (mostly web references) in scholarly articles are broken after 27 months.2 moreover, 11 percent of posts and comments on websites for various purposes are lost within a year. according to another study conducted on 10 million web pages collected from the internet archive in 2001, the average lifespan of web pages is 1,132.1 days with a standard deviation of 903.5 days, and 90.6 percent of those web pages are inaccessible today.3 this fragility causes valuable scholarly, cultural, and scientific information to vanish and become inaccessible to future generations. in recent years, it was realized that the lifespan of digital objects is very short, and rapid technological changes make it more difficult to access these objects. therefore, there is a need to preserve the information available on the www.
digital preservation is performed using the primary methods of emulation and migration: emulation provides the preserved digital objects in their original format, while migration provides objects in a different format.4

systematic approach towards web preservation | khan and ur rahman 72 https://doi.org/10.6017/ital.v38i1.10181

in the last two decades, a number of institutions worldwide, such as national and international libraries, universities, and companies, started to preserve their web resources (resources found at a web server, i.e., web contents and web structure). the first web archive, the internet archive, was initiated in 1996 by brewster kahle; it holds more than 30 petabytes of data, which includes 279 billion web pages, 11 million books and texts, and 8 million other digital objects such as audio, video, and image files. more than seventy web archive initiatives have been started in 33 countries since 1996, which shows the importance of web preservation projects and preservation of web contents. this information era encourages librarians, archivists, and researchers to preserve the information available online for upcoming generations. while digital resources may not replace the information available in physical form, the digital version of these information resources improves access to the available information.5 there are different aspects of the preservation process and web archiving, e.g., digital objects’ ingestion into the archive during the preservation process, digital objects’ format and storage, archival management, administrative issues, access and security to the archive, and preservation planning. these aspects need to be understood for effective web preservation and will help in addressing the challenges that occur during the preservation process. the reference model for an open archival information system (oais) is an attempt to provide a high-level framework for the development and comparison of digital archives.
in web preservation, a challenging task is to identify the starting point of the preservation process and to complete the process effectively, which helps in proceeding to the other activities. moreover, the complicated nature of the web and the complex structure of web contents make the preservation of web content even more difficult. the oais reference model helps in achieving the goals of a preservation task in a step-by-step manner. the stakeholders (producer, management, and consumer) are identified, and the packages that need to be processed, i.e., the submission information package (sip), archival information package (aip), and dissemination information package (dip), are clearly defined.6 this study aims to design a step-by-step systematic approach for web preservation that helps in understanding the challenges of preservation or archival activities, especially those that relate to digital information objects at various steps of the preservation process. the systematic approach may lead to an easy way to analyze, design, implement, and evaluate the archive with clarity and with different options for an effective preservation process and archival development. an effective preservation process is one that leads to a well-organized, easily managed web archive and meets the designated community’s requirements. this approach may help to address the challenges and risks that confront archivists and analysts during preservation activities.

step-by-step systematic approach

digital preservation is “the set of processes and activities that ensure long-term, sustained storage of, access to and interpretation of digital information.”7 the growth and decline rates of www content and the importance of the information presented on the web make it a key candidate for preservation. web preservation confronts a number of challenges due to the web’s complex structure, the variety of available formats, and the type of information (purpose) it provides.
the overall layout of the web varies from domain to domain based on the type of information and its presentation. websites can be categorized based on two things: first, the type of information (i.e., the web contents) and second, the way this information is presented (i.e., the layout or structure of the web page). examples include educational, personal, news, e-commerce, and social networking websites, which vary a lot in their contents and structure. the variations in the overall layout make it difficult to preserve different web contents in a single web archive. the web preservation activities are summarized in figure 1. the following sections explain the web preservation activities and their possible implementation in the proposed systematic approach.

defining the scope of the web archive

the www provides an opportunity to share information using various services, such as blogs, social networking websites, e-commerce, wikis, and e-libraries. these websites provide information on a variety of topics and address different communities based on their interests and needs. there are many differences in the way the information is handled and presented on the www. in addition, the overall layout of the web changes from one domain to another.8 therefore, it is not practically feasible to develop a single system to preserve all types of websites for the long term. so, before starting to preserve the web, the archivist should define the scope of the web to be archived. the archive will be either a site-centric, topic-centric, or domain-centric archive.9

site-centric archive

a site-centric archive focuses on a particular website for preservation. these types of archives are mostly initiated by the website creator or owner. site-centric web archives allow access to the old versions of the website.
topic-centric archive

topic-centric archives are created to preserve information on a particular topic published on the web for future use. for scientific verification, researchers need to refer to the available information, while it is difficult to ensure access to these contents due to the ephemeral nature of the web. a number of topic-centric archive projects have been undertaken, including the archipol archive of dutch political websites,10 the digital archive for chinese studies (dachs),11 minerva by the library of congress,12 and the french elections web archive for archiving the websites related to the french elections.13

domain-centric archive

the word “domain” refers to a location, network, or web extension. a domain-centric archive covers websites published with a specific domain name (dns), using either a top-level domain (tld), e.g., .com, .edu, or .org, or a second-level domain (sld), e.g., .edu.pk or .edu.fr. an advantage of domain-centric archiving is that the archive can be created by automatically detecting specific websites. several projects have a domain-centric scope, e.g., the portuguese web archive (pwa) of national websites,14 kulturarw, a swedish royal library web archive collection of .se and .com domain websites,15 and the uk government web archive collection of uk government websites, e.g., .gov.uk domain websites.

understanding the web structure

after defining the scope of the intended web archive, the archivist will have a better understanding of the interests and expected queries of the intended community based on the resources available or the information provided by the selected domain. the focus in this step is to understand the type of information (contents) provided by the selected domain and how the information has been presented. the web can be understood along two dimensions. the first considers the web as a medium that communicates contents using various protocols, e.g., http, and the second considers the web as a content container, which presents the contents to the viewers and is not simply contents, e.g., the underlying technology used to display the contents.16 the preservation team should understand such parameters as the technical issues, the future technologies, and the expected inclusion of other related content.

figure 1. systematic approach for web preservation process.

identify the web resources

the archivist should understand the contents and the representation of the contents of the selected domain, e.g., blogs, social networking websites, institutional websites, educational institutional websites, newspaper websites, or entertainment websites. all of these websites provide different information and address individual communities that have distinct information needs. a web page is the combination of two things, i.e., web contents and web structure.17 the resources which can be preserved are as follows.

web contents

web contents or web information can be categorized into the following categories:

• textual contents (plain text): this category describes textual information that appears on a web page. it does not include links, behaviors, and presentation stylesheets.
• visual contents (images): these contents are the visual forms of information or are complementary material to the information provided in textual form.
• multimedia contents: as another form of information, multimedia contents mainly include audio and video. they may also include animation or even text as a part of a video or a combination of text, audio, and video.

web structure

web structure can be categorized into the following categories:

• appearance (web layout or presentation): this category indicates the overall layout or presentation of a web page.
the look and feel of a web page (representation of the contents) are important and are maintained with different technologies, e.g., html or stylesheets.
• behavior (code navigations): this category covers link navigation, which can be within a website or to other websites, external document links, and dynamic and animated features, such as live feeds, comments, tagging, or bookmarking.

identify designated community

the archivist should identify the designated community of the intended web archive and carefully analyze their functional requirements and expected queries. the designated community means the potential users, i.e., those who access the archived web contents for different purposes, such as accessing old information that is not available in normal circumstances, referring to an old news article that was not bookmarked properly, or retrieving relevant news articles published long ago.

prioritize the web resources

after a comprehensive assessment of the resources of the selected domain and the identification of potential users’ requirements and expected queries, the archivist should prioritize the web resources. the complexity of web resources and their representation causes complications in the digital preservation process. generally, it may be undesirable or unviable to preserve all web resources; therefore, it is worthwhile to designate the web resources for preservation. the priority should be assigned on the basis of two things: first, the potential reuse of the resource and second, the frequency with which the resource will be accessed. the resources with no value, little value, or those managed elsewhere can be excluded.
for prioritization of resources, the moscow method can be applied.18 the acronym moscow can be elaborated as:

m (must have): the resource must be preserved and be a part of the archive. for example, in the digital news story archive (dnsa), the textual news story must be preserved in the archive because the preservation emphasis is on the textual news story.19 online news contains textual news stories; many news stories contain associated images, and a fraction of news stories contain associated audio-video contents.
s (should have): the resource should be preserved if at all possible. almost all the news stories have associated images; a few news stories have associated audio and video that complement them, and these should be preserved as a part of the news story in the web archive.
c (could have): the resource could be preserved if it does not affect anything else or is nice to have. the web structure in dnsa depends on the resources to be used for the preservation of news stories; the layout of the newspaper website could (c) be a part of the preservation process if it does not affect anything, e.g., storage capacity and system efficiency.
w (won’t have): the resource would not be included. archiving multiple versions of the layout or structure of the online newspaper is not worthwhile, and hence they would not (w) be preserved.

the prioritization of these resources is very important in the context of web preservation planning because it avoids wasting time and energy, and it is the best way to handle users’ requirements and fulfill their expected queries.

how to capture the resource(s)

the selection of a feasible capturing technique depends on, first, the resources to be captured and, second, the capturing task frequency. there are three techniques for capturing web resources, i.e., by browser, web crawler, and authoring system. each capturing technique has associated advantages and disadvantages.7
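the moscow prioritization described above can be sketched in python as follows. the resource names and their assigned priorities mirror the dnsa example in the text; the `Priority` enum, the `Resource` class, and the `select_for_preservation` helper are hypothetical names introduced only for illustration, and the rule that "could have" resources are kept only when capacity allows is a simplifying assumption.

```python
from dataclasses import dataclass
from enum import IntEnum


class Priority(IntEnum):
    """MoSCoW levels; a lower value means a higher priority."""
    MUST = 1
    SHOULD = 2
    COULD = 3
    WONT = 4


@dataclass(frozen=True)
class Resource:
    name: str
    priority: Priority


def select_for_preservation(resources, capacity_available):
    """Keep MUST and SHOULD resources; keep COULD only if capacity allows.

    WONT resources are always excluded. The capacity flag is an
    illustrative stand-in for constraints such as storage or efficiency.
    """
    selected = []
    for res in sorted(resources, key=lambda r: r.priority):
        if res.priority in (Priority.MUST, Priority.SHOULD):
            selected.append(res.name)
        elif res.priority is Priority.COULD and capacity_available:
            selected.append(res.name)
    return selected


# Priorities taken from the DNSA example in the text.
dnsa_resources = [
    Resource("multiple layout versions", Priority.WONT),
    Resource("newspaper website layout", Priority.COULD),
    Resource("associated images, audio, and video", Priority.SHOULD),
    Resource("textual news story", Priority.MUST),
]

print(select_for_preservation(dnsa_resources, capacity_available=False))
```

sorting by priority before filtering means the resulting list is also the order in which resources would be handled if capacity ran out partway through.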
web capturing using browsers

the intended web content can be captured using browsers after a web page is rendered when the http transaction occurs. this technique is also referred to as a snapshot or post-rendering technique. the method captures those things which are visible to the users; the behavior and other attributes remain invisible. capturing only static contents is one of the disadvantages of the browser approach, which generally preserves contents in the form of images. it is best for well-organized websites, and commercial tools are available for capturing the web. the following are well-known tools for capturing the web using browsers.

webcapture (https://web-capture.net/) is a free online web-capturing service. it is a fast web page snapshot tool, which can grab web pages in seven different formats, i.e., jpeg, tiff, png, and bmp image formats, pdf, svg, and postscript files of high quality. it also allows downloading the intended format in a zip file and is suitable for long vertical web pages with no distortion in layout.

a.nnotate (http://a.nnotate.com/) is an online annotating web snapshot tool for keeping track of information gathered from the web efficiently and easily. it allows adding tags and notes to the snapshot and building a personal index of web pages as a document index. the annotation feature can be used for multiple purposes, for example, compiling an annotated library of objects for organization, sharing commented web pages, product comparison, etc.

snagit (https://www.techsmith.com/screen-capture.html) is a well-known snapshot tool for capturing screens with built-in advanced image editing features and screen recording. snagit is a commercial and advanced screen capture tool that can capture web pages with images, linked files, source code, and the url of the web page.
acrobat webcapture (file > create > pdf from web page...) creates a tagged pdf file from the web page that a user visits while the adobe pdf toolbar is used for the entire website.20

the capture by a browser technique has the following advantages:

• by this technique, the archivist can capture only the displayed contents, which is an advantage if you need to preserve the displayed contents only.
• it is a relatively simple technique for well-organized websites.
• commercial tools exist for web capturing using browsers.

in addition, the disadvantages are the following:

• capturing displayed contents only is a disadvantage if the focus is not only on displayed contents.
• it results in frozen contents and treats contents as if they are publications.
• it loses the web structure, such as appearance, behavior, and other attributes of the web page.

web capturing using an authoring system/server

the authoring system capturing technique is used for web harvesting directly from the website hosting server. all the contents, e.g., textual information, images, and source code, are collected from the source web server. the authoring system allows the archivist to preserve the different versions of the website. the authoring system depends on the infrastructure of the content management system and is not a good choice for external resources. the system is best for an owned web server and works well for limited internal purposes. the web curator tool (http://webcurator.sourceforge.net/), pandas (an old british library harvesting tool), and netarchivesuite (https://sbforge.org/display/nas/netarchivesuite) are known tools used for planning and scheduling web harvesting. they can be used by non-technical personnel for both selection and harvesting of web content under selection policies. these web archiving tools were developed in a collaboration of the national library of new zealand and the british library and are used for the uk web archive (http://www.ariadne.ac.uk/issue50/beresford/).
the tools can interface with web crawlers, such as heritrix (https://sourceforge.net/projects/archivecrawler/). authoring systems are also referred to as workflow systems or curatorial tools.

the authoring system has the following advantages:

• it is best for web harvesting, which captures everything available.
• it is easy to perform if you have proper access permission or you own the server or system to be accessed for capturing the resources.
• it works for short- to medium-term resources and is feasible for internal access within organizations.

the disadvantages of web capturing using the authoring system are:

• it captures all available raw information, not only presentations.
• it may be too reliant on the authoring infrastructure or the content management system.
• it is not feasible for long-term resources or for external access from outside the organization.

web capturing using web crawlers

web crawlers are perhaps the most widely used technique for capturing web contents in a systematic and automated manner.21 crawler development requires expertise and experience with different tools, i.e., knowledge of the strengths and weaknesses of technologies and of the viability of a tool in a specific scenario. the main advantage of crawlers is that they extract embedded content. heritrix, httrack, wget, and deeparc are common examples of web crawlers.

heritrix (https://github.com/internetarchive/heritrix3/wiki) is an open source, freely available web crawler written in java and developed by the internet archive. heritrix is one of the most widely used extensible, web-scale web crawlers in web preservation projects. initially, heritrix was developed for special-purpose crawling of specific websites; it is now a resourceful, customizable web crawler for archiving the web.

httrack (https://www.httrack.com/) is a freely available configurable browser utility.
httrack crawls html, images, and other files from a server to a local directory and allows offline viewing of the website. the httrack crawler downloads a complete website from the web server to a local computer system and makes it available for offline viewing with all related link structure, so that it seems as if the user is browsing it online. it also updates the archived websites on the local system from the server and resumes interrupted extractions. httrack is available for both windows and linux/unix operating systems.

wget (http://www.gnu.org/software/wget/) is a freely available non-interactive command line tool that can easily be combined with other technologies and scripts. it can capture files from the web using the widely used ftp, ftps, http, and https protocols, and it supports cookies as well. it also updates the archived websites and resumes interrupted extractions. wget is available for both microsoft windows and unix operating systems.

the advantages of web crawling:

• widely used in capturing techniques.
• can capture specific content or everything.
• avoids some of the accessing issues, such as link rewriting and embedded external content from an archive or live.

disadvantages associated with web crawling:

• much work is required, as well as tools or development expertise and experience, etc.
• the web crawler may not have the right scope: sometimes it does not capture everything that it should, and sometimes it captures too much content.

web content selection policy

in the previous steps, the web resources are identified and prioritized based on the requirements and expected queries of the designated community, and a feasible capturing technique is identified based on the capturing frequency. now, the contents need to be prepared and filtered for selection, and a feasible selection approach needs to be chosen based on the contents.
a web content selection policy helps to determine and clarify which web contents need to be captured, based on the priorities, purpose, and scope of web contents already defined.22 the selection policy comprises a description of the context, the intended users, the access mechanisms, and the expected uses of the archive. the selection policy may comprise the selection process and the selection approach. the selection process can be divided into three subtasks, i.e., preparation, discovery, and filtering, which in combination provide a qualitative selection of web contents to a certain extent, as shown in figure 2.

the main objective of the preparation phase is to determine the targeted information space, the capture technique, the capturing tools, extension categorization, granularity level, and the frequency of archiving activity. domain experts are the people best placed to help in preparation, regardless of the scope of the web archive. the domain experts may be archivists, researchers, librarians, or any other authentic reference, i.e., a document or a research article.

the tools defined in the preparation phase help to discover the intended information in the discovery phase. they can be divided into the following four categories:
1. hubs may be global or topical directories, collections of sites, or even a single web page with essential links related to a particular subject or topic.
2. search engines can facilitate discovery through a precise query or a set of alternative queries related to a topic. the use of specialized search engines can significantly improve the discovery of related information.
3. crawlers can be used to extract web contents such as textual information, images, audio, video, and links. moreover, the overall layout of a web page or a whole website can also be extracted in a well-defined, systematic manner.
4. external sources may be non-web sources of any kind, such as printed material or mailing lists, which can be monitored by the selection team.

the main objective of the discovery phase is to determine the sources of information to be stored in the archive. this can be achieved in two ways, corresponding to two discovery methods, i.e., exogenous and endogenous. first, a manually created entry point list is used to determine the entry points (usually links) for crawling the collection, and the list is updated manually during the crawl. this exogenous discovery is used in manual selection and mostly relies on exploiting an entry point list drawn from hubs, search engines, and non-web documents. second, an automatically created entry point list is used: links are extracted automatically, yielding an updated list during every crawl. this endogenous discovery is used in automatic selection and relies on link extraction by crawlers that explore the entry point list.

figure 2. selection process.

the main objective of the filtering phase is to optimize and condense the discovered web contents (the discovery space). filtering is important in order to collect more specific web content and remove unwanted or duplicated content. usually, for preservation, an automatic filtering method is used; manual filtering is useful if robots or automatic tools cannot interpret the web content. the discovery and filtering phases can be combined practically or logically.

several evaluation axes can be used for the selection policy (e.g., quality, subject, genre, and publisher). in the literature, we have three known techniques for selecting web content. the selection approach can be either automatic or manual.
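as an illustration of endogenous discovery combined with duplicate filtering, the following is a minimal sketch in python using only the standard library. it is not taken from any of the tools named in this article; the names `LinkExtractor` and `discover` and the example urls are hypothetical, and a production crawler would also need fetching, politeness rules, and scope checks.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkExtractor(HTMLParser):
    """Collects the href targets of <a> tags found in one html page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def discover(page_url, html, seen):
    """Return new absolute links found in html, skipping ones already seen.

    `seen` is the set of entry points known so far; it is updated in place,
    so the entry point list grows automatically during the crawl.
    """
    parser = LinkExtractor()
    parser.feed(html)
    new_links = []
    for href in parser.links:
        # resolve relative links and drop the #fragment part
        absolute, _fragment = urldefrag(urljoin(page_url, href))
        if absolute not in seen:      # filtering step: drop duplicates
            seen.add(absolute)
            new_links.append(absolute)
    return new_links
```

calling `discover` once per fetched page, with a shared `seen` set, gives the automatically updated entry point list described above; the exogenous variant would instead start from a hand-maintained list.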
manual content selection is very rare because it is labor intensive: it requires automatic tools for finding the content and then manual review of that collection to identify the subset that should be captured. automatic selection policies are used frequently in web preservation projects for web collection, especially for web archives.23 the choice of collection approach depends on the frequency with which the web content is preserved in the archive. there are four different selection approaches for web content collection.

unselective approach

the unselective approach implies collecting everything possible; with this approach, the whole website and its related domains and subdomains are downloaded to the archive. it is also referred to as automatic harvesting or selection, bulk selection, and domain selection.24 the automatic approach is used where a web crawler performs the collection. examples are the collection of websites from a domain, e.g., all educational institution websites under .edu (harvesting at domain level), or the collection of all possible contents and pages from a website by extracting the embedded links (harvesting at website level). a section of the data preservation community believes that technically it is a relatively cheap and quick collection approach that yields a comprehensive picture of the web as a whole. in contrast, its significant drawbacks are that it generates huge amounts of unsorted, duplicated, and potentially useless data and consumes too many resources.
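the difference between harvesting at domain level and at website level can be expressed as two scope checks on a url. the sketch below is illustrative only; the function names and the example hosts are assumptions and do not come from any crawler discussed here.

```python
from urllib.parse import urlsplit


def in_domain_scope(url, suffix):
    """Domain-level harvesting: accept any host under a suffix such as 'edu'."""
    host = (urlsplit(url).hostname or "").lower()
    suffix = suffix.lower().lstrip(".")
    # match the suffix on a label boundary, so 'example.education' is rejected
    return host == suffix or host.endswith("." + suffix)


def in_site_scope(url, site_host):
    """Website-level harvesting: accept only pages served by one host."""
    host = (urlsplit(url).hostname or "").lower()
    return host == site_host.lower()
```

a crawler would apply one of these checks to every extracted link before queuing it; a scope that is too wide or too narrow leads directly to the over- and under-capture problems of crawling noted earlier.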
the swedish royal library’s kulturarw3 project, one of the first to adopt this approach, harvests websites at domain level, i.e., it collects websites from the .se domain, which are websites physically located in sweden.25 usually, national web archive initiatives adopt the unselective approach, most notably nedlib, a helsinki university library harvester, and aola, an austrian online archive.26

selective approach

the selective approach was adopted by the national library of australia (nla) in the pandas project in 1997. in this approach, a website is included for archiving based on certain predefined strategies and on the access and information provided by the archive. the library of congress’ minerva project and the british library project “britain on the web” are other known projects that have adopted the selective approach. according to nla, the selected websites are archived based on nla guidelines after negotiation with the owners.27 the inclusion decision can be taken at one of the following levels:
• website level: which websites should be included from a selected domain, e.g., to archive all educational websites from the top-level domain “.pk”.
• web page level: which web pages should be included from a selected website, e.g., to archive the homepages of all educational websites.
• web content level: which type of web contents should be preserved, e.g., to archive all the images from the homepages of educational websites.

a selective approach is best if the number of websites to be archived is very large, or if the archiving process targets the entire www and wants to narrow down the scope by identifying the resources in which the archivists are most interested. this approach makes implicit or explicit assumptions about the web contents that are not to be selected for preservation. it may be very helpful to initiate a pilot preservation project, which identifies: what is possible?
what can be managed? in addition, some tangible results may be obtained easily and quickly, which helps to broaden the scope of the project. the selective approach may be based on predefined criteria or on an event.

selective approach based on criteria involves selecting web resources based on various predefined sets of criteria. nla’s guidance characterizes the criteria-based selective approach as the “most narrowly defined method” and describes it as “thematic selection.” simple or complex content-selection criteria can be defined, depending on the overall goal of preservation. examples are all resources owned by an organization; all resources of one genre, e.g., all programming blogs; resources contributed to a common subject; resources addressing a specific community within an institution, e.g., students or staff; all publications belonging to an individual organization or group of organizations; and all resources that may benefit external users or an external user community, e.g., historians or alumni.

selective approach based on event involves selecting web resources or websites based on various time-based events. archivists may focus on websites that address important national or international events, e.g., disasters, elections, and the football world cup. event-based websites have two characteristics: (1) very frequent updates and (2) website content is lost after a short time, e.g., a few weeks or a few months. examples are the start and end of a term or academic year, the duration of an activity, e.g., a research project, and the appointment or departure of a senior official.
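criteria-based and event-based selection can both be read as filters over a list of candidate resources. the following sketch assumes a hypothetical `Resource` record with a few illustrative fields (genre, owner, publication date); real selection policies would use whatever attributes the archive actually records.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Resource:
    url: str
    genre: str        # e.g. "blog" or "news" (illustrative values)
    owner: str
    published: date


def select_by_criteria(resources, genre=None, owner=None):
    """Criteria-based selection: keep resources matching every given criterion."""
    return [
        r for r in resources
        if (genre is None or r.genre == genre)
        and (owner is None or r.owner == owner)
    ]


def select_by_event(resources, start, end):
    """Event-based selection: keep resources published during the event window."""
    return [r for r in resources if start <= r.published <= end]
```

because event-based content is updated often and disappears quickly, a selection of this kind would normally be paired with a short capture interval for the duration of the event.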
deposit approach

in the deposit collection approach, the administrator or owner of the website submits an information package, which includes a copy of the website with the related files that can be accessed through its hyperlinks. the archival information package is applicable to small collections (of a few websites), or where the owner of the website initiates the preservation project, e.g., a company can initiate a project for preserving its own website. the deposit collection approach was adopted by the national archives and records administration (nara) for the collection of us federal agency websites in 2001 and by die deutsche bibliothek (ddb, http://deposit.ddb.de/) for the collection of dissertations and some online publications. new digital initiatives are heavily dependent on administrator or owner support and should provide an easy way to deposit new content in the repository; e.g., in macewan university’s institutional repository, the librarians leading the project tried to offer an easy and effective way to deposit archival contents.28

combined approach

there are advantages and disadvantages associated with each collection approach, and there is an ongoing debate about which approach is best in a given situation. for example, the deposit approach should rest on an inexpensive agreement with the depositors. the emphasis is on combining the automatic harvesting and selective approaches, as these two are cheaper than the other selection approaches because few staff are required to cope with the technological challenges. this initiative was taken by the bibliothèque nationale de france (bnf) in 2006.
the bnf automatically crawls information about updated web pages, stores it in an xml-based “site delta,” and evaluates individual pages using page relevancy and importance, similar to how google ranks pages.29 the bnf used a selective approach, referred to as the “deposit track,” for the deep web (that is, web pages or websites that are behind a password or are otherwise not generally accessible to search engines).

metadata identification

cataloging is required to discover a specific item in a digital collection. an identifier or set of identifiers is required to retrieve a digital record from a digital repository or an archive. for digital documents, this catalog, registration, or identifier is referred to as metadata.30 metadata are structured information about resources that describes, locates (discovers or places), and manages digital information resources and makes them easier to retrieve (access) and use. metadata are often referred to as “data about data” or “information about information,” but it may be more helpful and informative to describe these data as “descriptive and technical documentation.”31 metadata can be divided into the following three categories:
1. descriptive metadata describes a resource for discovery and identification purposes. it may consist of elements such as the title, author(s), abstract, and keywords of a document.
2. structural metadata describes how compound objects are put together, for example, how sections are ordered to form chapters.
3. administrative metadata provides information to facilitate resource management, such as when and how a file was created, who can access the file, its type, and other technical information.
administrative metadata is classified into two types: (1) rights management metadata, which addresses intellectual property rights, and (2) preservation metadata, which contains the information needed to archive and preserve a resource.32

due to new information technologies, digital repositories, especially web-based repositories, have grown rapidly over the last two decades. this growth has prompted the digital library communities to devise metadata strategies to manage the immense amount of data stored in digital libraries.33 metadata play a vital role in the long-term preservation of digital objects, and it is important to identify the metadata that may help to retrieve a specific object from the archive after preservation. according to duff et al., “the right metadata is the key to preserving digital objects.”34 hundreds of metadata standards have been developed over the years for different user environments, disciplines, and purposes; many of them are in their second, third, or nth edition.35 digital preservation and archiving require metadata standards to trace digital objects and ensure access to them. several of the common standards are briefly discussed below.

dublin core metadata initiative (dcmi, http://dublincore.org/) was initiated at the 2nd world wide web conference in 1994 and was standardized as ansi/niso z39.85 in 2001 and iso 15836 in 2003.36 the main purpose of the dcmi was to define an element set for representing web resources; initially, thirteen core elements were defined, which later increased to a fifteen-element set. the elements are optional and repeatable, can appear in any order, and can be expressed in xml.37

metadata encoding and transmission standard (mets, http://www.loc.gov/standards/mets/) is an xml metadata standard intended to represent information about complex digital objects.
mets evolved from the earlier making of america ii (moa2) project in 2001, supported by the library of congress and sponsored by the digital library federation (dlf), and was registered with the national information standards organization (niso) in 2004. a mets document contains seven major sections, each of which covers a different aspect of metadata.38

metadata object description schema (mods, http://www.loc.gov/standards/mods/) was initiated by the marc21 maintenance agency at the library of congress in 2002. mods elements are richer than dcmi, simpler than the marc21 bibliographic format, and expressed in xml.39 mods identifies the widest facets or features of an object and presents nineteen high-level optional elements.40

visual resources association core categories (vra core, http://www.loc.gov/standards/vracore/) was developed in 1996, and the current version 4.0 was released in 2007. the vra core is a widely used standard in art, libraries, and archives for such objects as paintings, drawings, sculpture, architecture, and photographs, as well as books and decorative and performance art.41 the vra core contains nineteen elements and nine sub-elements.42

preservation metadata implementation strategies (premis, http://www.loc.gov/standards/premis/) was developed in 2005 under the sponsorship of the online computer library center (oclc) and the research libraries group (rlg); it includes a data dictionary and some information about metadata. premis defines a set of five interacting core semantic units, or entities, and an xml schema for supporting digital preservation activities. it is concerned not with discovery and access but with common metadata; for descriptive metadata, other standards (dublin core, mets, or mods) need to be used.
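since descriptive metadata has to come from a standard such as dublin core, a short example may help to show what such a record looks like in xml. the sketch below uses python’s standard `xml.etree.ElementTree` module and the dublin core element namespace; the `record` wrapper element and all field values are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"   # dublin core element set namespace
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build a <record> element holding one dc:* child per (name, value) pair."""
    record = ET.Element("record")   # hypothetical wrapper, not part of dublin core
    for name, value in fields:
        child = ET.SubElement(record, f"{{{DC_NS}}}{name}")
        child.text = value
    return record


record = dublin_core_record([
    ("title", "example archived home page"),
    ("creator", "example organization"),
    ("creator", "second contributor"),      # elements are repeatable
    ("date", "2019-01-31"),
    ("identifier", "http://example.org/"),
])
xml_text = ET.tostring(record, encoding="unicode")
```

the list of pairs reflects the properties noted above: every element is optional, any element may be repeated, and the order is free.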
the premis data model contains intellectual entities (content that can be described as a unit, e.g., books, articles, databases), objects (discrete units of information in digital form, which can be files, bitstreams, or any representation), agents (people, organizations, or software), events (actions that involve an object and an agent known to the system), and rights (assertions of rights and permissions).43

it is indisputable that good metadata improves access to the digital objects in a digital repository. therefore, the creation and selection of appropriate metadata make the web archive accessible to the archive user. structural metadata helps to manage the archival collection internally, as well as the related services, but may not always help to discover the primary source of the digital object.44 currently, there are many semi-automated metadata generation tools. the use of these semi-automatic tools for generating metadata is crucial for the future, considering the complexity and cost of manual metadata creation.45

archival format

web archive initiatives select websites for archiving based on the relevance of the contents and the intended audience of the archived information. the size of web archives varies significantly depending on their scope and the type of content they preserve, e.g., web pages, pdf documents, images, audio, or video files.46 to preserve these contents, a web archive uses storage formats that contain metadata and utilizes data compression techniques. the internet archive defined the arc format (http://archive.org/web/researcher/arcfileformat.php), which was later used as a de facto standard. in 2009, the international organization for standardization (iso) established the warc format (https://goo.gl/0rbwsn) as an official standard for web archiving. approximately 54 percent of web archive initiatives applied the arc and warc formats for archiving.
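to make the idea of a storage format that carries metadata alongside content more concrete, the following sketch serializes a single warc/1.0 record of type “resource” by hand. it is a simplified illustration under stated assumptions: the function name is hypothetical, only a handful of header fields are written, and no compression is applied; an archive would normally rely on an established warc library rather than code like this.

```python
import uuid
from datetime import datetime, timezone


def warc_resource_record(target_uri, payload, content_type="text/html"):
    """Serialize one WARC/1.0 'resource' record as bytes.

    The record is a version line, named header fields, a blank line,
    the payload block, and two trailing CRLF pairs.
    """
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    headers = [
        "WARC/1.0",
        "WARC-Type: resource",
        f"WARC-Target-URI: {target_uri}",
        f"WARC-Date: {now}",
        f"WARC-Record-ID: <urn:uuid:{uuid.uuid4()}>",
        f"Content-Type: {content_type}",
        f"Content-Length: {len(payload)}",   # length of the payload in bytes
    ]
    head = "\r\n".join(headers).encode("utf-8") + b"\r\n\r\n"
    return head + payload + b"\r\n\r\n"
```

records built this way can be appended one after another into a single file, which is what lets shared tools such as indexers and replay utilities work across archives that use the same format.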
the use of standard formats helps archivists by facilitating the creation of collaborative tools, such as search engines and ui utilities, to manipulate the archived data efficiently.47

information dissemination mechanisms

a well-defined preservation process can lead to a well-organized web archive that is easy to maintain and from which a specific digital object is easy to retrieve using information dissemination techniques. poor search results are one of the main problems in information dissemination for web archives: users of a web archive spend excessive time retrieving the documents or information that satisfy their queries. archivists are more concerned with “ofness” (“what collections are made up of”), whereas archive users are concerned with “aboutness” (“what collections are about”).48 to use the full potential of web archives, a usable interface is needed to help the user search the archive for a specific digital object. full-text and keyword search are the dominant ways to search unstructured information repositories, as is evident from online search engines. the sophistication of the search results returned for user queries depends on the ranking tools.49 access tools and techniques are attracting the attention of researchers, and approximately 82 percent of european web archives concentrate on such tools, which makes these web archives easily accessible.50 the lucene full-text search engine and its extension nutchwax are widely used in web archiving.
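keyword search with ranking can be illustrated with a very small inverted index scored by term frequency and inverse document frequency. this is a textbook-style sketch and not how lucene or nutchwax are implemented; the function names, the whitespace tokenizer, and the sample documents are assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def build_index(docs):
    """Map each term to {doc_id: term frequency} for a dict of doc_id -> text."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, count in Counter(text.lower().split()).items():
            index[term][doc_id] = count
    return index


def search(index, n_docs, query):
    """Rank documents by the sum of tf * idf over the query terms."""
    scores = defaultdict(float)
    for term in query.lower().split():
        postings = index.get(term, {})
        if not postings:
            continue                  # term occurs in no document
        idf = math.log(n_docs / len(postings))
        for doc_id, tf in postings.items():
            scores[doc_id] += tf * idf
    # most relevant document first
    return sorted(scores, key=scores.get, reverse=True)
```

a term that occurs in fewer documents receives a higher weight, so documents that use rare query terms often are placed at the top of the result list.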
moreover, by combining semantic descriptions that are already present or implicit in the descriptive metadata, reasoning-based or semantic searching of the archival collection can open up novel possibilities for archival content retrieval and browsing.51 even in the current era of digital archives, mobile services are being adopted in digital libraries; e.g., access to e-books, library databases, catalogs, and text messaging are common mobile services offered in university libraries.52

in a massive repository, a user query may retrieve millions of documents, which makes it difficult for users to identify the most relevant information. to overcome this problem, a ranking model estimates the relevance of the results to the user’s query using specified criteria and sorts the results, placing the most relevant result at the top.53 a number of ranking models exist in the literature: conventional ranking models (e.g., tf-idf and bm25f), temporal ranking models (e.g., pagerank), and learning-to-rank models (e.g., l2r).

the findings of the systematic approach for web preservation are used to automate the process of digital news-story preservation. the steps of the proposed model were carefully adopted to develop a tool that is able to add contextual information to the stories to be preserved.

digital news stories preservation framework

the advancement of web technologies and the maturation of the internet attract news readers to access news online, where it is provided by multiple sources, and to obtain the desired information comprehensively. the amount of news published online has grown rapidly, and it is cumbersome for an individual to browse through all online sources for relevant news articles. news generation in the digital environment is no longer a periodic process with a fixed single output, such as printed newspapers.
the news is instantly generated and updated online in a continuous fashion. however, for reasons such as the short lifespan of digital information and the speed at which information is generated, it has become vital to preserve digital news for the long term. digital preservation includes various actions to ensure that digital information remains accessible and usable for as long as it is considered important.54 libraries and archives carefully digitize and preserve newspapers, which are considered a good source for knowing history. many approaches have been developed to preserve digital information for the long term. the lifespan of news stories published online varies from one newspaper to another, i.e., from one day to a month. although a newspaper may be backed up and archived by the news publisher or national archives, in the future it will be difficult to access particular information published in various newspapers regarding the same news story. the issue becomes even more complicated if a story is to be tracked through an archive of many newspapers, which requires different access technologies.

the digital news story preservation (dnsp) framework was introduced to preserve digital news articles published online from multiple sources.55 the dnsp framework adopts the proposed step-by-step systematic approach for web preservation to develop a well-organized web archive. initially, the main objectives defined for the dnsp framework are:
• to initiate a well-organized national level digital news archive of multiple news sources.
• to normalize news articles during preservation to a common format for future use.
• to extract explicit and implicit metadata, which would be helpful in ingesting stories to the archive and browsing through the archive in the future.
• to introduce content-based similarity measures to link digital news articles during preservation.
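the last objective, linking stories by content-based similarity, can be illustrated with a generic term-overlap measure. the sketch below uses jaccard similarity over whitespace-separated terms together with a fixed threshold; it is an assumption made purely for illustration and is not the measure used in the dnsp framework, whose definition is not reproduced here. all names and sample texts are hypothetical.

```python
def terms(text):
    """Lowercased set of whitespace-separated terms in a story."""
    return set(text.lower().split())


def jaccard(story_a, story_b):
    """Share of distinct terms that two stories have in common (0.0 to 1.0)."""
    a, b = terms(story_a), terms(story_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def link_similar(stories, threshold):
    """Return pairs of story ids whose similarity meets the threshold."""
    ids = sorted(stories)
    return [
        (x, y)
        for i, x in enumerate(ids)
        for y in ids[i + 1:]
        if jaccard(stories[x], stories[y]) >= threshold
    ]
```

in practice the threshold would have to be chosen empirically, for example by comparing the computed links against human judgments of which stories belong together.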
the digital news story extractor (dnse) is a tool developed to facilitate the extraction of news stories from online newspapers and their migration to a normalized format for preservation. normalization also includes a step to add metadata in the digital news stories archive (dnsa) for future use.56 to facilitate access to news articles preserved from multiple sources, some mechanism needs to be adopted for linking the archived digital news articles. an effective term-based approach, the “common ratio measure for stories (crms),” is introduced for linking digital news articles in dnsa; it links similar news articles during the preservation process.57 the approach is empirically analyzed, and its results are compared in order to reach conclusive arguments. the initial results, computed automatically using the common ratio measure for stories, are encouraging and are compared with the similarity of news articles based on human judgment. the results are generalized by defining a threshold value based on multiple experimental results using the proposed approach.

currently, work is ongoing to extend the scope of dnsa to two languages, i.e., urdu and english, and to develop content-based similarity measures for linking news articles published in urdu-english. moreover, research is underway to develop tools that exploit the linkage created among stories during the preservation process for search and retrieval tasks.

summary

effective strategic planning is critical in creating web archives; hence, it requires a well-understood and well-planned preservation process. the process should result in a well-organized web archive that includes not only the content to be preserved but also the contextual information required to interpret the content.
the study attempts to answer many questions by guiding the archivists and related personnel, such as: how to lead the web preservation process effectively? how to initiate the preservation process? how to proceed through different steps? what are the possible techniques that may help to create a well-organized web archive? how can the archived information can be used to its greatest potential? to answer these questions, the study resulted in an appropriate step-by-step process for web preservation and a well-organized web archive. the targeted goal of each step is identified by researching the existing approaches that can be adopted. the possible techniques for those approaches are discussed in detail for each step. references 1 “world wide web size,” the size of the world wide web, visited on jan 31, 2019, http://www.worldwidewebsize.com/. 2 brian f. lavoie, “the open archival information system reference model: introductory guide,” microform & imaging review 33, no. 2 (2004): 68-81; alexandros ntoulas, junghoo cho, and christopher olston, “what's new on the web? the evolution of the web from a search engine perspective,” in proceedings of the 13th international conference on world wide web-04 (new york, ny: acm, 2004), 1-12. information technology and libraries | march 2019 87 3 teru agata et al., “life span of web pages: a survey of 10 million pages collected in 2001,” ieee/acm joint conference on digital libraries, (ieee, 2014), 463-64, https://doi.org/10.1109/jcdl.2014.6970226. 4 timothy robert hart and denise de vries, “metadata provenance and vulnerability,” information technology and libraries 36, no. 4 (dec. 2017): 24-33, https://doi.org/10.6017/ital.v36i4.10146. 5 claire warwick et al., “library and information resources and users of digital resources in the humanities,” program 42, no. 1 (2008): 5-27, https://doi.org/10.1108/00330330810851555. 6 lavoie, “open archival information system reference model.” 7 susan farrell, k. ashley, and r. 
davis, “a guide to web preservation,” practical advice for web and records managers based on best practices from the jisc-funded powr project (2010), https://jiscpowr.jiscinvolve.org/wp/files/2010/06/guide-2010-final.pdf. 8 lavoie, “open archival information system reference model;” farrell, ashley, and davis, “guide to web preservation.” 9 peter lyman, “archiving the world wide web,” washington, library of congress (2002), https://www.clir.org/pubs/reports/pub106/web/. 10 diomidis spinellis, “the decay and failures of web references,” communications of the acm 46, no. 1 (2003): 71-77, https://dl.acm.org/citation.cfm?doid=602421.602422. 11 digital archive for chinese studies (dachs) archive2 https://www.zo.uniheidelberg.de/boa/digital_resources/dachs/index_en.html, visited on jan 31, 2019. 12 julien masanès, “web archiving methods and approaches: a comparative study,” library trends 54, no. 1 (2005): 72-90, https://doi.org/10.1353/lib.2006.0005. 13 hanno lecher, “small scale academic web archiving: dachs,” in web archiving (berlin/heidelberg: springer, 2006), 213-25, https://doi.org/10.1007/978-3-540-463320_10. 14 daniel gomes et al., “introducing the portuguese web archive initiative,” in 8th international web archiving workshop (berlin/heidelberg: springer, 2009). 15 gerrit voerman et al., “archiving the web: political party web sites in the netherlands,” european political science 2, no. 1 (2002): 68-75, https://doi.org/10.1057/eps.2002.51. 16 sonja gabriel, “public sector records management: a practical guide,” records management journal 18, no. 2 (2008), https://doi.org/10.1108/00242530810911914. 17 farrell, ashley, and davis, “guide to web preservation.” systematic approach towards web preservation | khan and ur rahman 88 https://doi.org/10.6017/ital.v38i1.10181 18 jung-ran park and andrew brenza, “evaluation of semi-automatic metadata generation tools: a survey of the current state of the art,” information technology and libraries 34, no. 
generating collaborative systems for digital libraries | visser and ball 187
information technology and libraries | december 2010

marijke visser and mary alice ball
the middle mile: the role of the public library in ensuring access to broadband

this paper discusses the role of the public library in ensuring access to the broadband communication that is so critical in today’s knowledge-based society. it examines the culture of information in 2010, and then asks what it means if individuals are online or not. the paper also explores current issues surrounding telecommunications and policy, and finally seeks to understand the role of the library in this highly technological, perpetually connected world.

marijke visser (mvisser@alawash.org) is information technology policy analyst and mary alice ball (maryaliceball@yahoo.com) is former chair, telecommunications subcommittee, office for information technology policy, american library association, washington, dc.

in the last twenty years library collections have evolved from being predominantly print-based to ones that have a significant digital component. this trend, which has a direct impact on library services, has only accelerated with the advent of web 2.0 technologies and participatory content creation. cutting-edge libraries with next generation catalogs encourage patrons to post reviews, contribute videos, and write on library blogs and wikis. even less adventuresome institutions offer a variety of electronic databases licensed from multiple publishers and vendors. the piece of these library portfolios that is at best ignored and at worst vilified is the infrastructure that enables internet connectivity. in 2010, broadband telecommunication is recognized as essential to access the full range of information resources. telecommunications experts articulate their concerns about the digital divide by focusing on first- and last-mile issues of bringing fiber and cable to end users. the library, particularly the public library, represents the metaphorical middle mile providing the public with access to rich information content. equally important, it provides technical knowledge, subject matter expertise, and general training and support to library users.

this paper discusses the role of the public library in ensuring access to the broadband communication that is so critical in today’s knowledge-based society. it examines the culture of information in 2010, and then asks what it means if individuals are online or not. the paper also explores current issues surrounding telecommunications and policy, and finally seeks to understand the role of the library in this highly technological, perpetually connected world.

■■ the culture of information

information today is dynamic. as the internet continues on its fast-paced, evolutionary track, what we call ‘information’ fluctuates with each emerging web-based technology. theoretically a democratic platform, the internet and its user-generated content is in the process of fundamentally altering culture and society. in some circles the changes happen in real time as new web-based applications are developed, adopted, and integrated into the user’s daily life. these users are the early adopters, the internet cognoscenti. second tier users appreciate the availability of online resources and use a mix of devices to access internet content but vary in the extent to which they try the latest application or device. the third tier users also vary in the amount they access the internet but have generally not embraced its full potential, from not seeking out readily available resources to not connecting at all.1 regardless of the degree to which they access the internet, all of these users require basic technology skills and a robust underlying infrastructure.

since the introduction of web 2.0, the number and type of participatory web-based applications has continued to grow. many people are eagerly taking part in creating an increasing variety of web-based content because the basic tools to do so are widely available. the amateur, creating and sharing for primarily personal reasons, has the ability to reach an audience of unprecedented size. in turn, the internet audience, or virtual audience, can select from a vast menu of formats, including multimedia and print. with print resources disappearing, it is increasingly likely for an individual to only be able to access necessary material online. web-based resources are unique in that they enable an undetermined number of people, personally connected or complete strangers, to interact with and manipulate the content thereby creating something new with each interaction and subsequent iteration. many of these new resources and applications require much more bandwidth than traditional print resources. with the necessary technology no longer out of reach, a cross-section of society is affecting the course the twenty-first century is taking vis à vis how information is created, who can create it, and how we share it.2 in turn, who can access web-based content and who decides how it can be accessed become critical questions to answer. as people become more adept at using web-based tools and eager to try new applications, the need for greater broadband will intensify.

the economic downturn is having a marked effect on people’s internet use. if there was a preexisting problem with inadequate access to broadband, current circumstances exacerbate it to where it needs immediate attention. access to broadband internet today increases the amount of information and variety of formats available to the user. in turn more content is being distributed as users create and share original content.3 businesses, nonprofits, municipal agencies, and educational institutions appreciate that by putting their resources online they reach a broader segment of their constituency. this approach to reaching an audience works provided the constituents have their own access to the materials, both physically and intellectually. it is one thing to have an internet connection and another to have the skill set necessary to make productive use of it.

as reported in job-seeking in u.s. public libraries in 2009, “less than 44% of the top 100 u.s. retailers accept in-store paper applications.”4 municipal, state, and federal agencies are increasingly putting their resources online, including unemployment benefit applications, tax forms, and court documents.5 in addition to online documents, the report finds social service agencies may encourage clients to make appointments and apply for state jobs online.6 many of the processes that are now online require an ability to navigate the complexities of the internet at the same time as navigating difficult forms and websites. the combination of the two can deter someone from retrieving necessary resources or successfully completing a critical procedure. while early adopters and policy-makers debate the issues surrounding internet access, the other strata of society, knowingly or not and to varying degrees, are enmeshed in the outcomes of these ongoing discussions because their right to information is at stake.

■■ barriers to broadband access

by condensing internet access issues to focus on the availability of adequate and sustainable broadband, it is possible to pinpoint four significant barriers to access: price, availability, perceived relevance, and technical skill level. the first two barriers are determined by existing telecommunications infrastructure as well as local, state, and federal telecommunications policies. the latter barriers are influenced by individual behaviors. both divisions deserve attention.

if local infrastructure and the internet service provider (isp) options do not support broadband access to all areas within a community’s boundaries, the result will be that some community members can have broadband services at home while others must rely on work or public access computers. it is important to determine what kind of broadband services are available (e.g., cable, dsl, fiber, satellite) and if they are robust enough to support the activities of the community. infrastructure must already be in place or there must be economic incentive for isps to invest in improving current infrastructure or in installing new infrastructure.

the geographical location of a community will also influence what kind of internet service is available because of deployment costs. these costs are typically reflected in varying prices to consumers. in addition to the physical layout of an area, current federal telecommunications policies limit the degree to which incentives can be used on the local level.7 encouraging competition between isps, including municipal electric utilities, incumbent local exchange carriers, and national cable companies, for example, requires coordination between local needs and state and federal policies. such coordinated efforts are inherently difficult when taking into consideration the numerous differences between locales. ultimately, though, all of these factors influence the price end users must pay for internet access.

with necessary infrastructure and telecommunications policies in place, there are individual behaviors that also affect broadband adoption. according to the pew study, “home broadband adoption 2008,” 62 percent of dial-up users are not interested in switching to broadband.8 clearly there is a segment of the population that has not yet found personal relevance to high-speed access to online resources. in part this may be because they only have experience with dial-up connections. depending on dial-up gives the user an inherently inferior experience because bandwidth requirements to download a document or view a website with multimedia features automatically prevent these users from accessing the same resources as a user with a high-speed connection. a dial-up user would not necessarily be aware of this difference. if this is the only experience a user has it might be enough to deter broadband adoption, especially if there are other contributing factors like lack of technical comfort or availability of relevant content.

motivation to use the internet is influenced by the extent to which individuals find content personally relevant. whether it is searching for a job and filling out an application, looking at pictures of grandchildren, using skype to talk to a family member deployed in iraq, researching healthcare providers, updating a personal webpage, or streaming video, people who do these things have discovered personally relevant internet content and applications. understanding the potential relevance of going online makes it more likely that someone would experiment with other applications, thus increasing both the familiarity with what is available and the comfort level with accessing it. without relevant content, there is little motivation for someone not inclined to experiment with internet technology to cross what amounts to a significant hurdle to adoption. anthony wilhelm argues in a 2003 article discussing the growing digital divide that culturally relevant content is critical in increasing the likelihood that non-users will want to access web-based resources.9 the scope of the issue of providing culturally relevant content is underscored in the 2008 pew study, which found that of the 27 percent of adult americans who are not internet users, 33 percent report they are not interested in going online.10 that pew can report similar information five years after the wilhelm article identifies a barrier to equitable access that has not been adequately resolved.

■■ models for sustainable broadband availability

in discussing broadband, the question of what constitutes broadband inevitably arises. gillett, lehr, and osorio, in “local government broadband initiatives,” offer a functional definition: “access is ‘broadband’ if it represents a noticeable improvement over standard dial-up and, once in place, is no longer perceived as the limiting constraint on what can be done over the internet.”11 while this definition works in relationship to dial-up, it is flexible enough to apply to all situations by focusing on “a noticeable improvement” and “no longer perceived as the limiting constraint” (added emphasis). ensuring sustainable broadband access necessitates anticipating future demand. short-sighted definitions, applicable at a set moment in time, limit long-term viability of alternative solutions. devising a sustainable solution calls for careful scrutiny of alternative models, because the stakes are so high in the broadband debate.

there are many different players involved in constructing information policies. this does not mean, however, that their perspectives are mutually exclusive. in debates with multiple perspectives, it is important to involve stakeholders who are aligned with the ultimate goal: assuring access to quality broadband to anyone going online. what is successful for one community may be entirely inappropriate in another; designing a successful system requires examining and comparing a range of scenarios. existing circumstances may predetermine a particular starting point, but one first step is to evaluate best practices currently in place in a variety of communities to come up with a plan that meets the unique criteria of the community in question.

sustainable broadband solutions need to be developed with local constituents in mind and successful solutions will incorporate the realities of current and future local technologies and infrastructure as well as local, state, and federal information policies. presupposing that the goal is to provide the community with the best possible option(s) for quality broadband access, these are key considerations to take into account when devising the plan. in addition to the technological and infrastructure issues, within a community there will be a combination of ways people access the internet. there will be those who have home access, those who need public access, and those who do not seek access at all. success hinges on understanding that each community is unique, on leveraging its strengths, and on ameliorating its weaknesses.

local government can play a significant role in the availability of broadband access. from a municipal perspective, emphasizing the role of broadband as a factor in economic development can help define how the municipality should most effectively advocate for broadband deployment and adoption. gillett offers four initiatives appropriate for stimulating broadband from a local viewpoint. municipal governments can
■■ become leaders in developing locally relevant internet content and using broadband in their own services;
■■ adopt policies that make it easier for isps to offer broadband;
■■ subsidize broadband users and/or isps; or
■■ become involved in providing the infrastructure or services themselves.12

individually or in combination these four initiatives underscore the fact that government awareness of the possibilities for community growth made possible by broadband access can lead to local government support for the initiatives of other local agencies, including nonprofit, municipal, or small businesses. agencies partnering to support community needs can provide evidence to local policy makers that broadband is essential for community success. once the municipality sees the potential for social and economic development, it is more likely to support policies that stimulate broadband buildout.

building strong local partnerships will set the stage for the development of a sustainable broadband initiative as the different stakeholders share perspectives that take into account a variety of necessary components. when the time comes to implement a strategy, not only will different perspectives have been included, the plan will have champions to speak for it: the government, isps, public and private agencies, and community members. it is important to know which constituents are already engaged in supporting community broadband initiatives and which should be tapped. the ultimate purpose in establishing broadband internet access in a community is to benefit the individual community members, thereby stimulating local economic development. key players need to represent agencies that recognize the individual voice.

a 2004 study led by strover provides an example of the importance of engaging local community leaders and agencies in developing a successful broadband access project.13 the study looked at thirty-six communities that received state funding to establish community technology centers (ctc). it addressed the effective use and management of ctcs and called attention to the inadequacy of supplying the hardware without community support systems in place. users need a support system that highlights opportunities available via the internet and that provides help when they run into problems. access is more than providing the infrastructure and hardware. the potential users must also find content that is culturally relevant in an environment that supports local needs and expectations. strover found the most successful ctcs were located in places that “actively attracted people for other social and entertaining reasons.”14 in other words, the ctcs did not operate in a vacuum devoid of social context. successful adoption of the ctcs as a resource for information was dependent on the targeted population finding culturally relevant content in a supportive environment. an additional point made in the study showed that without strong community leadership, there was not significant use of the ctc even when placed in an already established community center.15 this has significant implications for what constitutes access as libraries plan broadband initiatives.

investments in technology and a national commitment to ensure universal access to these new technologies in the 1990s provide the current policy framework. as suggested by wilhelm in 2003, to continue to move forward the national agenda needs to focus on updating policies to fit new information circumstances as they arise. today’s information policy debates should emphasize a similar focus. beyond accelerating broadband deployment into underserved areas, wilhelm suggests there needs to be support for training and content development that guarantees communities will actually use and benefit from having broadband deployed in their area.16 technology training and support for local agencies that provide the public with internet access, as well as opportunities for the individuals themselves, is essential if policies are going to actually lead to useful broadband adoption. individual and agency internet access and adoption require investment beyond infrastructure; they depend on having both culturally relevant content and the information literacy skills necessary to benefit from it.

■■ finding the right solution

though it may have taken an economic crisis to bring broadband discussions into the living room, the result is causing renewed interest in a long-standing issue. many states have formed broadband task forces or councils to address the lack of adequate broadband access at the state level and, on the national front, broadband was a key component of the american recovery and reinvestment act of 2009.17 the issue changes as technologies evolve but the underlying tenet of providing people access to the information and resources they need to be productive members of society is the same. what becomes of the current emphasis on universal broadband depends on selecting the best of the alternative plans according to carefully vetted criteria in order to develop a flexible and forward-thinking course of action.

can we let people remain without access to robust broadband and the necessary skill set to use it effectively? no. as more and more resources critical to basic life tasks are accessible only online, those individuals that face challenges to going online will likely be socially and economically disadvantaged when compared to their online counterparts. this potential for an intensifying digital divide is recognized in the federal communications commission’s (fcc) national broadband plan (nbp) released in march 2010.18 the nbp states six national broadband goals, the third of which is “every american should have affordable access to robust broadband service, and the means and skills to subscribe if they so choose.”19 research conducted for the recommendations in the nbp was comprehensive in scope including voices from industry, public interest, academia, and municipal and state government. responses to more than thirty public notices issued by the fcc provide evidence of wide concern from a variety of perspectives that broadband access should become ubiquitous if the united states is to be a competitive force in the twenty-first century.

access to essential information such as government, public safety, educational, and economic resources requires a broadband connection to the internet. it is incumbent on government officials, isps, and community organizations to share ideas and resources to achieve a solution for providing their communities with robust and sustainable broadband. it is not necessary to have all users up to par with the early adopters. there is not a one-size-fits-all approach to wanting to be connected, nor is there a one-size-fits-all solution to providing access. what is important is that an individual can go online via a robust, high-speed connection that meets that individual’s needs at that moment. what this means for finding solutions is
■■ there needs to be a range of solutions to meet the needs of individual communities;
■■ they need to be flexible enough to meet the evolving needs of these communities as applications and online content continue to change; and
■■ they must be sustainable for the long term so that the community is prepared to meet future needs that are as yet unknown.

solutions to providing broadband internet access will be most successful when they are designed starting at the local level. community needs vary according to local demographics, geography, existing infrastructure, types of service providers, and how state and federal policies mesh with local ordinances. local stakeholders best understand the complex interworking of their community and are aware of who should be included in the decision-making process. including a local perspective will also increase the likelihood that as community needs change, new issues will be brought to the attention of policy makers and agencies who advocate for the individual community members.

community agencies that already are familiar with local needs, abilities, and expectations are logical groups to be part of developing a successful local broadband access strategy. the library exemplifies a community resource whose expertise in local issues can inform information policy discussions on local, state, and federal levels. as a natural extension of library service, libraries offer the added value support necessary for many users to successfully navigate the internet. the library is an established community hub for informational resources and provides dedicated staff, technology training opportunities, and no-fee public access computers with an internet connection. libraries in many communities are creating locally relevant web-based content as well as linking to other community resources on their own websites. seeking a partnership with the local library will augment a community broadband initiative.

it is difficult to appreciate the impacts of current information technologies because they change so rapidly there is not enough time to realistically measure the effects of one before it is mixed in with a new innovation. with web-based technologies there is a lag time between what those in the front of the pack are doing online and what those in the rear are experiencing. while there is general consensus that broadband internet access is critical in promoting social and economic development in the twenty-first century as is evidenced by the national purposes outlined in the nbp, there is not necessarily agreement on benchmarks for measuring the impacts. three anticipated outcomes of providing community access to broadband are
■■ civic participation will increase;
■■ communities will realize economic growth; and
■■ individual quality of life will improve.

when a strategy involves significant financial and energy investments there is a tendency to want palpable results. the success of providing broadband access in a community is challenging to capture. to achieve a level of acceptable success it is necessary to focus on local communities and aggregate anecdotal evidence of incremental changes in public welfare and economic gain. acceptable success is subjective at best but can be usefully defined in the context of local constituencies. referring to participation in the development of a vibrant culture, horrigan notes that “while inherently difficult to measure, these kinds of social and cultural capital are important elements in ongoing debates about uses and consequences of broadband access. an ongoing challenge for those interested in the social, economic, and policy consequences of modern information networks will be to keep up with changing notions of what it means to be connected in cyberspace.”20

the social contexts in which a broadband plan will be enacted influence the appropriateness of different scenarios and should help guide which ones are implemented. engaging a variety of stakeholders will increase the likelihood of positive outcomes as community members embrace the opportunities provided by broadband internet access. it is difficult, however, to anticipate the outcomes that may occur as users become more familiar with the resources and achieve a higher level of comfort with technology. ramirez states, “the ‘unexpected outcomes’ section of many evaluation reports tends to be rich with anecdotes . . . . the unexpected, the emergent, the socially constructed innovations seem to be, to a large extent, off the radar screen, and yet they often contain relevant evidence of how people embrace technology and how they innovate once they discover its potential.”21

community members have the most to gain from having broadband internet access. including them will increase the community’s return on its investment as they take advantage of the available resources. ramirez suggests that “participatory, learning, and adaptive policy approaches” will guide the community toward developing communication technology policies that lead to a vibrant future for individuals and community alike.22 as success stories increase, the aggregation of local communities’ social and economic growth will lead to a net sum gain for the nation as a whole.

■■ the role of the library

public libraries play an important role in providing internet access to their community members. according to a 2008 study, the public library is the only outlet for no-fee internet access in 72.5 percent of communities nationwide; in rural communities the number goes up to 82.0 percent.23 beyond having desktop or, in some cases, wireless access, public libraries offer invaluable user support in the form of technical training and locally relevant content. libraries provide a secondary community resource for other local agencies who can point their clients to the library for no-fee internet access. in today’s economy where anecdotal reports show an increase in library use, particularly internet use, the role of the public

isolation. an individual must possess skills to navigate the online resources. as users gain an understanding of the potential personal growth and opportunities broadband yields, they will be more likely to seek additional online resources. by stimulating broadband use, the library will contribute to the social and economic health of the community. if the library is to extend its role as the information hub in the community by providing no-fee access to broadband to anyone who walks through the door, the local community must be prepared to support that role. it requires a commitment to encourage buildout of appropriate technology necessary for the library to maintain a sustainable internet connection. it necessitates that local communities advocate for national information and communication policies that are pro-library.
when public policy supports the library’s efforts, the local community benefits and society at large can progress. what if the library’s own technology needs are not met? the role of the library in its community is becoming increasingly important as more people turn to it for their internet access. without sufficient revenue, the library will have a difficult time meeting this additional demand for services. in turn, in many libraries increased demand for broadband access stretches the limit of it support for both the library staff and the patrons needing help at the computers. what will be the fallout from the library not being able to provide internet services the patrons desire and require? will there be a growing skills difference between people who adopt emerging technologies and incorporate them into their daily lives and those who maintain the technological status quo? what will the social impact be of remaining off line either completely or only marginally? can the library be the bridge between those on the edge, those in the middle, and those at the end? with a strong and well articulated vision for the future, the library can be the link that provides the community with sustainable broadband.

■■ conclusion

the recent national focus on universal broadband access has provided an opportunity to rectify a lapse in effective information policy. whether the goal includes facilitating meaningful access continues to be more elusive. as government, organizations, businesses, and individuals rely more heavily on the internet for sharing and receiving information, broadband internet access will continue to increase in importance. following the status quo will not necessarily lead to more people having broadband access in the long run. the early adopters will continue to stimulate technological innovation which, in turn, will trickle down the ranks of the different user types. currently, library as a stable internet provider cannot be overestimated.
to maintain its vital function, however, the library must also resolve infrastructure challenges of its own. because of the increased demand for access to internet resources, public libraries are finding their current broadband services are not able to support the demand of their patrons. the issues are two-fold: increased patron use means there are often neither sufficient workstations nor broadband speeds to meet patron demand. in 2008, about 82.5 percent of libraries reported an insufficient number of public workstations, and about 57.5 percent reported insufficient broadband speeds.24 to add to these already significant issues, the report indicates libraries are having trouble supporting the necessary information technology (it) because of either staff time constraints or the lack of a dedicated it staff.25 public libraries are facing considerable infrastructure management issues at a time when library use is increasing. overcoming the challenges successfully will require support on the local, state, and federal levels. here is where the librarian, as someone trained to become inherently familiar with the needs of her local constituency and ethically bound to provide access to a variety of information resources, needs to insert herself into the debate. librarians need to be ahead of the crowd as the voice that assures content will be readily accessible to those who seek it. today, the elemental policy issue regarding access to information via the internet hinges on connectivity to a sustainable broadband network. to promote equitable broadband access, the librarian needs to be aware of the pertinent information policies in place or under consideration, and be able to anticipate those in the future. additionally, she will need to educate local policy makers about the need for broadband in their community. in some circumstances, the librarian will need to move beyond her local community and raise awareness of community access issues on the state and federal levels.
the librarian is already able to articulate numerous issues to a variety of stakeholders and can transfer this skill to advocate for sustainable broadband strategies that will succeed in her local community. there are many strata of internet users, from those in the forefront of early adoption to those not interested in being online at all. the early adopters drive the market which responds by making resources more and more likely to be primarily available only online. as we continue this trend, the social repercussions increase from merely not being able to access entertainment and news to being unable to participate in the knowledge-based society of the twenty-first century. by folding in added value online access for the community, the library helps increase the likelihood that the community will benefit from broadband being available to the library patrons and by extension to the community as a whole. to realize the internet’s full potential, access to it cannot be provided in

generating collaborative systems for digital libraries | visser and ball 193

community, the entire community benefits regardless of where and how the individuals go online. the effects of the internet are now becoming broadly social enough that there is a general awareness that the internet is not decoration on contemporary society but a challenge to it.28 being connected is no longer an optional luxury; to engage in the twenty-first century it is essential. access to the internet, however, is more than simple connectivity. successful access requires: an understanding of the benefits to going on line, technological comfort, information literacy, ongoing support and training, and the availability of culturally relevant content. people are at various levels of internet use, from those eagerly anticipating the next iteration of web-based applications to those hesitant to open an e-mail account. this user spectrum is likely to continue.
though the starting point may vary depending on the applications that become important to the user in the middle of the spectrum, there will be those out in front and those barely keeping up. the implications of the pervasiveness of the internet are only beginning to be appreciated and understood. because of their involvement at the cutting edge of internet evolution, librarians can help lead the conversations. libraries have always been situated in neutral territory within their communities and closely aligned with the public good. librarians understand the perspective of their patrons and are grounded in their local communities. librarians can therefore advocate effectively for their communities on issues that may not completely be understood or even recognized as mattering. connectivity is an issue supremely important to the library as today access to the full range of information necessitates a broadband connection. libraries have carved out a role for themselves as a premier internet access provider in the continually evolving online culture. as noted by bertot, mcclure, and jaeger, the “role of internet access provider for the community is ingrained in the social perceptions of public libraries, and public internet access has become a central part of community perceptions about libraries and the value of the library profession.”29 in times of both economic crisis and technological innovation, there are many unknowns. in part because of these two juxtaposed events, the role of the public library is in flux. additionally, the network of community organizations that libraries link to is becoming more and more complex. it is a time of great opportunity if the library can articulate its role and frame it in relationship to broader society. evolving internet applications require increasing amounts of bandwidth and the trend is to make these bandwidth-heavy applications more and more vital to daily life. 
one clear path the library community can take however, the supply of internet resources is unevenly stimulating user demand and the unequal distribution of broadband access has greater potential for significant negative social consequences. staying the course and following a haphazard evolution of broadband adoption, may, in fact, renew valid concerns about a digital divide. without an intentional and coordinated approach to developing a broadband strategy, its success is likely to fall short of expectations. the question of how to ensure that internet content is meaningful requires instituting a plan on a very local level, including stakeholders who are familiar with the unique strengths and weaknesses of their community. strover, in her 2000 article the first mile, suggests connectivity issues should be viewed from a first mile perspective where the focus is on the person accessing the internet and her qualitative experience rather than from a last mile perspective which emphasizes isp, infrastructure, and market concerns.26 both perspectives are talking about the same physical section of the connection network: the piece that connects the user to the network. according to strover, distinguishing between the first mile and last mile perspectives is more than an arbitrary argument over semantics. instead, a first mile perspective represents a shift “in the values and priorities that shape telecommunications policy.”27 by switching to a first mile perspective, connectivity issues immediately take into account the social aspects of what it means to be online. who will bring this perspective to the table? and how will we ascertain what the best approach to supporting the individual voice should be? the first mile perspective is one the library is intimately familiar with as an organization that traditionally advocates for the first mile of all information policies. 
the library is in a key position in the connectivity debate because of its inclination to speak for the user and to be aware of the unique attributes and needs of its local community. as part of its mission, the library takes into account the distinctive needs of its user community when it designs and implements its services. a natural outgrowth of this practice is to be keenly aware of the demographics of the community at large. the library can leverage its knowledge and understanding to create an even greater positive impact on the social, educational, and economic community development made possible by broadband adoption. to extend the first mile perspective analogy, in the connectivity debate, the library will play the role of the middle mile: the support system that successfully connects the internet to the consumer. while the target populations for stimulating demand for broadband are really those in the second tier of users, by advocating for the first mile perspective, the library will be advocating for equitable information policies whose implementation has bearing on the early adopters as well. by stimulating demand for broadband within a

initiatives,” 538.
12. ibid., 537–58.
13. sharon strover, gary chapman, and jody waters, “beyond community networking and ctcs: access, development, and public policy,” telecommunications policy 28, no. 7/8 (2004): 465–85.
14. ibid., 483.
15. ibid.
16. wilhelm, “leveraging sunken investments in communications infrastructure,” 282.
17. see, for example, the virginia broadband round table (http://www.otpba.vi.virginia.gov/broadband_roundtable.shtml), the ohio broadband council (http://www.ohiobroadbandcouncil.org/), and the california broadband task force (http://gov.ca.gov/speech/4596). see www.fcc.gov/recovery/broadband/ for information on broadband initiatives in the american recovery and reinvestment act.
18.
federal communications commission, national broadband plan: connecting america, http://www.broadband.gov/ (accessed apr. 11, 2010).
19. ibid.
20. horrigan, “broadband: what’s all the fuss about?” 2.
21. ricardo ramirez, “appreciating the contribution of broadband ict with rural and remote communities: stepping stones toward an alternative paradigm,” the information society 23 (2007): 86.
22. ibid., 92.
23. denise m. davis, john carlo bertot, and charles r. mcclure, “libraries connect communities: public library funding & technology access study 2007–2008,” 35, http://www.ala.org/ala/aboutala/offices/ors/plftas/0708/librariesconnectcommunities.pdf (accessed jan. 24, 2009).
24. john carlo bertot et al., “public libraries and the internet 2008: study results and findings,” 11, http://www.ii.fsu.edu/projectfiles/plinternet/2008/everything.pdf (accessed jan. 24, 2009). these numbers represent an increase from the previous year’s study which suggests that libraries while trying to meet demand are not able to keep up.
25. ibid.
26. sharon strover, “the first mile,” the information society 16, no. 2 (2000): 151–54.
27. ibid., 151.
28. clay shirky, “here comes everybody: the power of organizing without organizations.” berkman center for internet & society (2008). video presentation. available at http://cyber.law.harvard.edu/interactive/events/2008/02/shirky (retrieved march 1, 2009).
29. john carlo bertot, charles r. mcclure, and paul t. jaeger, “the impacts of free public internet access on public library patrons and communities,” library quarterly 78, no. 3 (2008): 286, http://www.journals.uchicago.edu.proxy.ulib.iupui.edu/doi/pdf/10.1086/588445 (accessed jan. 30, 2009).

is to develop its role as the middle mile connecting the increasing breadth of internet resources to the general public. the broadband debate has moved out of the background of telecommunication policy and into the center of public attention.
now is the moment that calls for an information policy advocate who can represent the end user while understanding the complexity of the other stakeholder perspectives. the library undoubtedly has its own share of stakeholders, but over time it is an institution that has maintained a neutral stance within its community, thereby achieving a unique ability to speak for all parties. those who speak for the library are able to represent the needs of the public, work with a diverse group of stakeholders, and help negotiate a sustainable strategy for providing broadband internet access.

references and notes

1. lee rainie, “2.0 and the internet world,” internet librarian 2007, http://www.pewinternet.org/presentations/2007/20-and-the-internet-world.aspx (accessed mar. 4, 2009). see also john horrigan, “a typology of information and communication technology users,” 2007, www.pewinternet.org/~/media//files/reports/2007/pip_ict_typology.pdf.pdf (accessed feb. 12, 2009).
2. lawrence lessig, “early creative commons history, my version,” video blog post, 2008, http://lessig.org/blog/2008/08/early_creative_commons_history.html (accessed jan. 20, 2009). see the relevant passage from 20:53 through 21:50.
3. john horrigan, “broadband: what’s all the fuss about?” 2007, p. 1, http://www.pewinternet.org/~/media/files/reports/2007/broadband%20fuss.pdf.pdf (accessed feb. 12, 2009).
4. “job-seeking in us public libraries,” public library funding & technology access study, 2009, http://www.ala.org/ala/research/initiatives/plftas/issuesbriefs/brief_jobs_july.pdf (accessed mar. 27, 2009).
5. ibid.
6. ibid.
7. sharon e. gillett, william h. lehr, and carlos osorio, “local government broadband initiatives,” telecommunications policy 28 (2004): 539.
8. john horrigan, “home broadband adoption 2008,” 10, http://www.pewinternet.org/~/media//files/reports/2008/pip_broadband_2008.pdf (accessed feb. 12, 2009).
9. anthony g.
wilhelm, “leveraging sunken investments in communications infrastructure: a policy perspective from the united states,” the information society 19 (2003): 279–86.
10. horrigan, “home broadband adoption,” 12.
11. gillett, lehr, and osorio, “local government broadband

roel

some considered 2000 the year of the e-book, and due to the dot-com bust, that could have been the format’s high-water mark. however, the first quarter of 2004 saw the greatest number of e-book purchases ever with more than $3 million in sales. a 2002 consumer survey found that 67 percent of respondents wanted to read e-books; 62 percent wanted access to e-books through a library. unfortunately, the large amount of information written on e-books has begun to develop myths around their use, functionality, and cost. the author suggests that these myths may interfere with the role of libraries in helping to determine the future of the medium and access to it. rather than fixate on the pros and cons of current versions of e-book technology, it is important for librarians to stay engaged and help clarify the role of digital documents in the modern library.

although 2000 was unofficially proclaimed as the year of the electronic book, or e-book, due in part to the highly publicized release of a stephen king short story exclusively in electronic format, the dot-com bust would derail a number of high-profile e-book endeavors. with far less fanfare, the e-book industry has been slowly recovering. in 2004, e-books represented the fastest-growing segment of the publishing industry. during the first quarter of that year, more than four hundred thousand e-books were sold, a 46 percent increase over the previous year’s numbers.1 e-books continue to gain acceptance with some readers, although their place in history is still being determined—fad? great idea too soon? wrong approach at any time? the answers partly depend on the reader’s perspective.
the main focus of this article is the role of e-book technologies in libraries. libraries have always served as repositories of the written word, regardless of the particular medium used to store the words. from the ancient scrolls of qumran to the hand-illuminated manuscripts of medieval europe to the familiar typeset codices of today, the library’s role has been to collect, organize, and share ideas via the written word. in today’s society, the written word is increasingly encountered in digital form. writers use word processors; readers see words displayed; and researchers can scan countless collections without leaving the confines of the office. for self-proclaimed book lovers, the digital world is not necessarily an ideal one. emotional reactions are common when one imagines a world without a favorite writing pen or the musty-smelling, yellowed pages of a treasured volume from youth. one of the battle lines between the traditional bibliophile and the modern technologist is drawn over the concept of the e-book. some see this digital form of written word as an evolutionary step beyond printed texts, which have been sometimes humorously dubbed tree-books. although a good deal of attention has been generated by the initial publicity regarding newer e-book technologies, the apparent failures of most of them have begun to establish myths around the concept. abram points out that the relative success of e-books in niche areas (such as reference works) is in direct contrast with public opinion of those purchasing novels and popular literature through traditional vendors.2 crawford paraphrases lewis carroll in describing this confusion: “when you cope with online content about e-books, you can believe six impossible things before breakfast.”3 incidentally, this article will attempt to dispel a mere five of the myths about e-books.
the future of e-books and the critical role of libraries in this future are best served by uncovering these myths and seeking a balanced, reasoned view of their potential. a 2002 consumer survey on e-books found that 67 percent of respondents wanted to read an e-book, and 62 percent wanted that access to be from a library.4 underlying this position is the assumption that the ideas represented by the written word are of paramount importance to both writers and readers. it is also assumed that libraries will continue their critical role in collecting, organizing, and sharing information.

myth 1—e-books represent a new idea that has failed

many libraries have invested in various forms of e-book delivery with mixed results.5 sottong wisely warns of the premature adoption of e-book technology, which he dubs a false pretender as a replacement to printed texts.6 however, the last five years are but a small part of a longer history, and presumably, a still longer future. as is often the case with computer jargon, the term e-book has emerged and gained currency in a very short amount of time. however, the concept of providing written texts in an electronic format has existed for a long time, as demonstrated by bush’s description of the memex.7

dispelling five myths about e-books
james e. gall
james e. gall (james.gall@unco.edu) is assistant professor of educational technology at the university of northern colorado, greeley.
dispelling five myths about e-books | gall 25
26 information technology and libraries | march 2005

the gutenberg project put theory into practice by converting traditional texts into digital files as early as 1971.8 even if the e-book merely represents the latest incarnation of the concept, it does so tenuously. books in their present form have a history of hundreds of years, or thousands if their parchment and papyrus ancestors are included. this history is rich with successes and failures of technology.
for example, petroski presents an interesting historical examination of the problem of storing books when the one book–one desk model collapsed under the proliferation of available texts.9 similarly, a determination on the success or failure of e-books, or digital texts, based upon a relatively short period of time, is fraught with difficulty. rather, it is important to look at recent developments as merely a next step. the technology is clearly not ready for uncritical, widespread acceptance, but it is also deserving of more than a summary dismissal.

myth 2—e-books are easily defined

the term e-book means different things depending on the context. at the simplest, it refers to any primarily textual material that is stored digitally to be delivered via electronic display. one of the confusing aspects of defining e-books is that in the digital world, information and the media used to store, transfer, and view it are loosely coupled. an e-book in digital form can be stored on cd–rom or any number of other media and then passed on through computer networks or telephone lines. the device used to view an e-book could be a standard computer, a personal digital assistant (pda), or an e-book reader (the dedicated piece of equipment on which an e-book can be read; confusingly, also referred to as an e-book). technically, virtually any computing device with a display could be used as an e-book reader. from a practical point of view, our eyes might not tolerate reading great lengths of text on a wireless phone, and banks will not likely provide excerpts of chaucer during atm transactions. another important factor in defining e-books is the actual content. a conservative definition is that an e-book is an electronic copy or version of a printed text. this appears to be the predominant view of publishers.
purists often maintain that a true e-book is one that is specifically written for that format and not available in traditional printed form.10 this was one of the categories of the short-lived (2000–2002) frankfurt e-book awards. of course, the multitude of textual materials that could be delivered via the technology exceeds these definitions. magazines, primary-source documents, online commentaries and reviews, and transcripts of audio or video presentations are just a short list of nonbook materials that are finding their way into e-book formats. one can note with some sense of irony that the technology behind the web was originally designed as a way for scientists to disseminate research reports.11 despite the web’s popularity, reading research reports makes up an exceedingly small percentage of its use today. although there is a continuing effort to reach a common standard for e-books (see www.openebook.org/), the current marketplace contains numerous noncompatible formats. this noncompatibility is the result of both design and competitive tradeoffs. in the case of the former, there is a distinct philosophical difference between formats that attempt to retain the original look and navigation of the printed page (such as adobe’s popular pdf files) versus those that retain the text’s structure but allow variability in its presentation (as best exemplified by the free-flowing nature of texts presented as html pages). this difference can also be seen in the functionality built around the format. traditional systems provide readers with familiar book characteristics such as a table of contents, bookmarks, and margin notes, a view that could be named bibliocentric. the alternative is one that takes more advantage of the new medium and could be labeled technocentric, and can most easily be seen in the extensive use of hyperlinking.12 the simplest use of hyperlinking provides an easy form of annotating texts and presenting related texts.
on the other extreme, hyperlinks are used in the creation of nonlinear texts in which the followed links provide a unique context for building meaning on the part of the reader.13 it is interesting to note that a preliminary study of e-book features found that the most desirable features tended to reflect the functionality of traditional books and the least desirable features provided functionality not found there.14 competitive tradeoffs are a critical issue at the current point of e-book development. the current profit models of publishing entities and copyright concerns of authors seem naturally opposed to e-book formats in which texts are freely shared, duplicated, and distributed. for example, the open ebook forum is the most prominent organization devoted to the development of standards for e-book technologies. in late 2004, their web site listed seventy-six current members. although the american library association is a member, it is one of only six members representing library-oriented organizations. in comparison, thirty-five members (or 46 percent) are publishing organizations, and thirteen (or 17 percent) are technology companies.15 the number of traditional publishers versus technology companies on this list may suggest that a bibliocentric view of e-books would be more favored. this also appears to confirm one media prediction that traditional publishers would continue to dominate efforts with this new medium.16 however, the limited representation of libraries in this endeavor is troubling (despite the disclaimer of using an admittedly rough metric for measuring impact). it is clear that many industry formats attempt to limit the ability to distribute materials by keying files so that they may only be viewed on one device or a specific installed version of the reader software. this creates technological problems for entities like libraries that attempt to provide access to information for various parties.
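the membership shares quoted above for the open ebook forum can be checked with a few lines of arithmetic. the sketch below is only an illustration: the counts (seventy-six members, thirty-five publishers, thirteen technology companies, six library-oriented organizations) come from the text, while the variable and function names are made up for this example, and the share for library-oriented organizations is derived here rather than stated in the article.

```python
# membership counts as quoted in the text (open ebook forum, late 2004)
TOTAL_MEMBERS = 76
members_by_group = {
    "publishing organizations": 35,
    "technology companies": 13,
    "library-oriented organizations": 6,
}


def share_percent(count, total):
    """return count as a whole-number percentage of total."""
    return round(100 * count / total)


# hypothetical helper mapping: group name -> rounded percentage of all members
shares = {
    group: share_percent(count, TOTAL_MEMBERS)
    for group, count in members_by_group.items()
}

for group, pct in shares.items():
    print(f"{group}: {pct} percent")
```

under these counts, the publisher and technology shares round to the 46 and 17 percent given in the text, and the six library-oriented members work out to roughly 8 percent, which is the "limited representation" the passage describes.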
the concept of fair use of copyrighted materials has to be reexamined under an entirely new set of assumptions. another irony is that the availability of free, public-domain materials in e-book format can be viewed as negative by the publishing industry. after investing considerable time and effort in developing e-book technology, publishers would prefer that users continue purchasing new e-book material rather than spend time reading the vast library of free historical material. many of these content issues are currently being played out in courts and the marketplace, particularly with regard to digital music and video.17 although one can humorously imagine the so-called problems associated with a population obsessed with downloading and reading great literature, the precedents set by these popular media will have a direct impact on the future of digital texts. despite the labor required to scan or key entire print books into digital formats, there have been some reports of this type of piracy.18 other models for the dissemination of digital intellectual property that are not determined by traditional material concerns of supply and demand will continually be attempted. for example, nelson predicted a hypertext-publishing scheme in which all material was available, but royalties were distributed according to actual access by end users.19 theoretically, such a system would provide a perfect balance between access and profitability. in nelson’s words, “nothing will ever be misquoted or out of context, since the user can inquire as to the origins and native form of any quotation or other inclusion. royalties will be automatically paid by a user whenever he or she draws out a byte from a published document.”20

myth 3—e-books and printed books are competing media

many, if not most, published articles regarding e-books follow classic plot construction; the writer must present a protagonist and an antagonist.
bibliophiles place the printed page as the hero and the e-book as the potential bane of civilization. proulx, one such author, was quoted as saying, “nobody is going to sit down and read a novel on a twitchy little screen—ever.”21 technologists cast the e-book as the electronic savior of text, replacing the tired tradition of the printed word in the same way the printed word replaced oral traditions. hawkins quotes an author who claims that e-books are “a meteor striking the scholarly publication world.” his slightly more restrained view was that e-books had the potential “to be the most far-reaching change since gutenberg’s invention.”22 grant places this metaphorical battle at the forefront by titling an article “e-books: friend or foe?”23 before deciding which side to take, consider whether this clash of media is an appropriate metaphor. this author has introduced samples of current e-book technology in graduate classes he has taught. when presented with the technology as part of the coursework, students quickly declare their allegiances. bibliophiles most often suggest that the technology will never replace the love of curling up with a good book. the technologists will ask how many pages can be stored in the device and then fantasize about the types of libraries they can carry and the various venues for reading that they will explore. however, after a few weeks of using the devices, both groups tend to move to a middle ground of practical use. at that point, the discussion turns to what materials are best left on the printed page (usually described as pleasure reading) and what would be useful in e-book format (reference works, course catalogs, how-to manuals). other instructors have reported similar patterns of use.24 at this point, the observation is largely anecdotal, but it does call into question the perceived need for a decisive referendum on the value of e-books. the issue is not whether e-books will replace the printed word.
the concern of librarians and others involved in the infrastructure of the book should be on developing the proper role for e-books in a broader culture of information. unless this approach is taken, the true goal of libraries—disseminating information to the public—will suffer. the gap between bibliophile and technologist approaches can already be seen in the materials available in e-book format. the publishing industry in general treats the e-book as just another format, releasing the same titles in hardcover, book-on-tape, and e-book at the same time. on the opposite end of the spectrum, technologists have adopted various e-book formats for creating and transferring numerous reference documents. given their preferences, it is easy to find e-book references on unix, html coding, and the like, but there is a scarcity of materials in philosophy, history, and the arts. librarians seem the most appropriate group for developing shared understanding. publishers and e-book hardware and software manufacturers need to be concerned with the bottom line. libraries, by design, are concerned with the preservation of information and its continued dissemination long after the need to sell a particular book has passed. the hobby of creating and transferring texts to digital form is idiosyncratic and unorganized when viewed from the highest levels. libraries not only contain expertise in all areas of human endeavor, but also have strategies for categorizing and maintaining information in productive ways. in short, libraries are the best line of defense for maintaining the value of the printed page and promoting the value of digital texts.

myth 4—e-books are expensive

a common complaint about e-books is that they are expensive. on the surface, this seems clear.
dedicated ebook readers seemed to bottom out at around $300, and a new bestseller in e-book format is priced about the same as the hardcover edition. add the immediate and longterm costs of rechargeable batteries and the electricity needed to power them, and the economic case against the e-book appears closed. what if we turn the same critical eye to the printed page? the manufacture and distribution of printed texts is highly developed and astounding. when gutenberg succeeded in putting the christian bible in the hands of the moneyed public, he surely could not have comprehended the billions of copies that would eventually be distributed. even with the wealth of printed material at hand, one must still consider the high cost of the system. the law of supply and demand rules books as a tangible product. the most profitable books are those that will reach the most readers. specialized texts have limited audiences and, therefore, will usually be priced higher. this produces problems for both groups. popular texts must be printed in high quantities and delivered to various outlets. unfortunately, the printed page does have maintenance costs. sellen and harper point out that the actual printing cost is insignificant compared with the cost of dealing with documents after printing. they cite one study that indicated that united states businesses spend about $1 billion per year designing and printing forms, but spend an additional $25 to $35 billion filing, storing, and retrieving them.25 books are no different; as any librarian knows, it costs money to maintain a collection and protect texts from the environment and the effects of age. in the retail arena, the competition is fiercer. books that do not sell are removed in favor of those that do. 
it is estimated that 10 percent of texts printed each year are turned to pulp, although, fortunately, many are recycled.26 the bbc reported that more than two million former romance novels were used in the construction of a new tollway.27 with more specialized texts, the problem is not wealth, but scarcity. if a text is not profitable, it will probably go out of print. this is often synonymous with inaccessible. from the publisher’s perspective, it is only cost-effective to commit to a printing when the demand is high enough. a library is a good source of out-of-print texts, provided that it has been funded appropriately to acquire and maintain the particular works that are needed.

e-books are not a panacea. other innovations, such as on-demand publishing, may be part of the answer in solving the economic issues regarding collections. however, e-books can help alleviate some of these issues. e-books are easily copied and distributed, which is a boon to the researcher and information consumer. in many cases, the goal is access to information, not the possession of a book. easy distribution could also benefit the author and publisher if appropriate reimbursement systems are put into place. as previously described, nelson originally envisioned his online hypertext system, xanadu, with a mechanism for royalties based on access—a supply-and-demand system for ideas, not materials.28 the systems used to manage access to digital materials continue to increase in complexity and have spawned a whole new business of digital rights management (drm).29 examples include reciprocal (www.reciprocal.com), overdrive (www.overdrive.com), and netlibrary (www.netlibrary.com).
libraries are the specific target of netlibrary, which promotes an e-books-on-demand project that allows free access for short periods of time.30 the creation of a standard digital object identifier (doi) for published materials may also help online publishers and entities like libraries manage their digital collections more easily.31 online music systems, such as apple’s itunes (www.itunes.com), strike a workable balance between quick-and-easy access to music and a workable, economic model for reimbursing artists. e-books also have appeal for special audiences who already require assistive technologies for accessing print collections.32

with the hidden costs of printed texts discussed, another important economic issue of e-books to examine is a current trend in usage. despite the availability of dedicated e-book readers, the largest growth in e-book usage is surely in nondedicated devices. e-book–reading software is available for personal computers, laptops, and pdas. according to one source, microsoft had sold four million pocketpc e-book-enabled devices, and had two million downloads of the ms reader for the personal computer; palm had sold approximately 20 million e-book-enabled devices; and adobe had more than 30 million acrobat readers downloaded.33 these numbers alone indicate some 24 million reader-capable pdas, and 32 million reader-capable pcs, for a total of 56 million devices. although it is difficult to find data on actual use, one online bookseller reported some data on e-book use from an audience survey.34 although 88 percent had purchased books online, only 16 percent had read an e-book (11 percent using a pc, 3 percent on a handheld device, and 2 percent on both). it is presumed that in most cases this equipment was purchased for other reasons, with e-book reading being a secondary function. as such, it would be unfair to include the full cost of this equipment in any calculation of the cost of providing information in an e-book format.
if it were included, the cost of providing artificial lighting in any building where reading takes place would likewise need to be calculated as part of the cost of the printed page. the potential user base for the e-book rises as more computers and pdas are sold, decreasing the need for special equipment.

this does not mean that the dedicated e-book reader is obsolete. by most commercial accounts, the apple newton was a failure. its bulky size and awkward interface were the subject of much ridicule. however, it did introduce the concept of the pda. the success of the palm line of products owes much to the proof of concept provided by the newton. the makers of the portable gameboy videogame system are repositioning it for multimedia digital-content delivery, and plan to pilot a flash-memory download system for various content types, including e-books.35 innovative products such as e-paper are already developed in prototype form.36 they are likely to lead to another wave of dedicated e-book readers or provide e-book–reading potential embedded in other consumer applications.

■ myth 5—e-books are a passing fad

it is trendy to list the failures of past media (such as radio, film, and television) in impacting education despite great initial promise.37 however, all those media are still with us after having found particular niches within our culture. if the e-book is viewed as just an alternative format, comparisons with past experiences of library collections containing videotapes, record albums, and such are not appropriate.38 however, if e-books are viewed as a tool or way to access information, the questions change. instead of asking how digital formats will replace print collections, we can ask how an e-book version will extend the reach of our current collection or provide our readers with resources previously unavailable or unaffordable.
when trying to locate a research article, one is generally not concerned with whether the local library has a loose copy, bound copy, microform, microfiche, or even has to resort to interlibrary loan. as long as the content is accessible and can be cited, it can be used. electronic access to journal content is becoming more common. perhaps dry journal articles do not conjure up the same romantic visions of exploring the stacks that may hinder greater acceptance of e-books.

a parallel can be drawn to the current work of film-restoration experts. the medium of film has reached an age where some of the earliest influential works no longer exist or are in a condition of rapid deterioration. according to one film site, more than half of the films made before 1950 have already been lost due to decay of existing copies.39 the work of restoration involves finding what remains of a great work in various vaults and collections. often, the only usable film is a second- or third-generation copy. from digitized copies, cleaning, color correction, and other painstaking work, a restored and—it is hoped—complete work emerges. ironically, once this laborious process is completed, a near-extinct classic is suddenly available to millions in the form of a dvd at a local retailer.

what if the same attitude were taken with the world’s collections of printed materials? jantz has described potential impacts of e-book technology on academic libraries.40 lareau conducted a study on using e-books to replace lost books at kent state university, but found that limited availability and high costs did not make it feasible at the time.41 project gutenberg (www.gutenberg.net) and the electronic text center at the university of virginia (http://etext.lib.virginia.edu) are two examples of scholars attempting to save and share book content in electronic forms, but more efforts are needed. unfortunately, the shift to digital content has also contributed to the sheer volume of content available.
edwards has recently discussed issues in attempting to archive and preserve digital media.42 the web may be suffering from a glut of information, but the content is highly skewed toward the new and technology oriented. in a few years, we may find that nontechnology–related endeavors are no longer represented in our information landscape.

■ conclusion

the e-book industry is currently dominated by commercial-content providers, such as franklin, and software companies, most notably adobe, palm, and microsoft. traditional print-based publishers have also maintained continued interest in the medium. it is assumed that these publishers had the capital to weather the ups and downs of the industry more so than new publishers dedicated solely to e-book delivery. although the contributions and efforts of these organizations are needed, the future of e-book content should not be left to their largesse.

when the rocket e-book device was initially released, a small but loyal following of readers contributed thousands of titles to its online library. some of these titles were self-published vanity projects or brief reference documents, but many were public-domain classics, painstakingly scanned or keyed in by readers wishing to share their favorite reads. when gemstar purchased rocket, the software’s ability to create non-purchased content was curtailed and the online library of free titles dismantled. apparently, both were viewed as limiting the profitability of the e-book vendor. however, gemstar recently announced that it would discontinue its e-book reading devices, presumably because of a lack of profitability. this can be seen as a cautionary tale for libraries, which often define success by number of volumes available and accessed rather than units sold. committing to a technology that concurrently requires consumer success can be problematic.

bibliophile and technologist alike must take responsibility for the future of our collective information resources.
the bibliophile must ensure that all aspects of human knowledge and creativity are nurtured and allowed to survive in electronic forms. the technologist must ensure that accessibility and intellectual-property rights are addressed with every technological innovation. parry provides three concrete suggestions for public libraries in response to new media demands: continue to acknowledge and respond to customer demands, revisit the library’s mission statement for currency, and promote or accelerate shared agreements with other institutions to alleviate the high costs of accumulating resources.43 the proper frame of mind for these activities is suggested by levy:

we make a mistake, i believe, when we fixate on particular forms and technologies, taking them in and of themselves, to be the carriers of what we want to embrace or resist. . . . it isn’t a question, it needn’t be a question, of books or the web, of letters or e-mail, of digital libraries or the bricks-and-mortar variety, of paper or digital technologies. . . . these modes of operation are only in conflict when we insist that one or the other is the only way to operate.44

in the early 1930s, lomax dragged his primitive audio-recording equipment over the roads of the american south to capture the performances of numerous folk musicians.45 at the time, he certainly didn’t imagine that at one point in history someone with a laptop computer sitting in a coffee shop with wireless access could download the performances of robert johnson from itunes. however, without his efforts, those unique voices in our history would have been lost. it is hoped that the readers of the future will be thanking the library professionals of today for preserving our print collections and enabling their access digitally via our primitive, but evolving, e-book technologies.

references

1.
open e-book forum, “press release: record e-book retail sales set in q1 2004,” june 4, 2004. accessed dec. 27, 2004, www.openebook.org. 2. stephen abram, “e-books: rumors of our death are greatly exaggerated,” information outlook 8, no. 2 (2004): 14–16. 3. walt crawford, “the white queen strikes again: an e-book update,” econtent 25, no. 11 (2002): 46–47. 4. harold henke, “consumer survey on e-books.” accessed dec. 27, 2004, www.openebook.org. 5. sue hutley, “follow the e-book road: e-books in australian public libraries,” aplis 15, no. 1 (2002): 32–37; andrew k. pace, “e-books: round two,” american libraries 35, no. 8 (2004): 74–75; michael rogers, “librarians, publishers, and vendors revisit e-books,” library journal 129, no. 7 (2004): 23–24. 6. stephen sottong, “e-book technology: waiting for the ‘false pretender,’” information technology and libraries 20, no. 2 (2001): 72–80. 7. vannevar bush, “as we may think,” atlantic monthly 176, no. 1 (1945): 101–108. 8. michael s. hart, “history and philosophy of project gutenberg.” accessed dec. 27, 2004, www.gutenberg.net/ about.shtml. 9. henry petroski, the book on the bookshelf (new york: vintage, 2000). 10. steve ditlea, “the real e-books,” technology review 103, no. 4 (2000): 70–73. 11. tim berners-lee, weaving the web: the original design and ultimate destiny of the world wide web by its inventor (new york: harpercollins, 1999). 12. james e. gall and annmari m. duffy, “e-books in a college course: a case study” (presented at the association for educational communications and technology conference, atlanta, ga., nov. 8–10, 2001). 13. george p. landow, hypertext 2.0: the convergence of contemporary critical theory and technology (baltimore, md.: johns hopkins univ. pr., 1997). 14. harold henke, “survey on electronic book features.” accessed dec. 27, 2004, www.openebook.org. 15. open e-book forum, “press release: record e-book retail sales set in q1 2004.” 16. 
lori enos, “report: e-book industry set to explode,” e-commerce times, 20 dec. 2000. accessed dec. 27, 2004, www.ecommercetimes.com/story/6215.html. 17. luis a. ubinas, “the answer to video piracy,” mckinsey quarterly no. 1. accessed dec. 27, 2004, www.mckinseyquarterly.com. 18. mark hoorebeek, “e-books, libraries, and peer-to-peer file-sharing,” australian library journal 52, no. 2 (2003): 163–68. 19. theodor h. nelson, “managing immense storage,” byte 13, no. 1 (1988): 225–38. 20. ibid., 238. 21. jacob weisberg, “the way we live now: the good ebook,” new york times, 4 june 2000. accessed dec. 27, 2004, www.nytimes.com. 22. donald t. hawkins, “electronic books: a major publishing revolution. part 1: general considerations and issues,” online 24, no. 4 (2000): 14–28. 23. steve grant, “e-books: friend or foe?” book report 21, no. 1 (2002): 50–54. 24. lori bell, “e-books go to college,” library journal 127, no. 8 (2002): 44–46. 25. abigail j. sellen and richard h. harper, the myth of the paperless office (cambridge, mass.: mit pr., 2002). 26. stephen moss, “pulped fiction,” sydney morning herald, 29 mar. 2002. accessed dec. 27, 2004, www.smh.com.au. 27. bbc news, “m6 toll built with pulped fiction,” bbc news uk edition, 18 dec. 2003. accessed dec. 27, 2004, http://news.bbc.co.uk. 28. nelson, “managing immense storage.” 29. michael a. looney and mark sheehan, “digitizing education: a primer on e-books,” educause 36, no. 4 (2001): 38–46. 30. brian kenney, “netlibrary, ebsco explore new models for e-books,” library journal 128, no. 7 (2003). 31. stephen h. wildstrom, “a library to end all libraries,” business week (july 23, 2001): 23.

online.” they have implemented several process improvements already and will complete their work by the 2005 ala annual conference. this past fall, michelle frisque, lita web manager, conducted a survey of our members about the lita web site.
michelle and the web coordinating committee are already working on a new look and feel for the lita web site based on the survey comments, and the result promises to be phenomenal. on top of all of the current activities, new vision statement, strategic planning, and the lita web site redesign, mary taylor and the lita board worked with a graphic designer to develop a new lita logo. after much deliberation, the new logo debuted at the 2004 lita national forum with great enthusiasm. many members commented that the new logo expresses the “energy” of lita and felt the change was terrific. with your help, lita had a very successful conference in orlando. although there were weather and transportation difficulties, the lita programs and discussions were of the highest quality, as always. the program and preconference offerings for the upcoming annual conference in chicago promise to be as strong as ever. don’t forget, lita also offers regional institutes throughout the year. check the lita web site to see if there’s a regional institute scheduled in your area. lita held another successful national forum in fall 2004 in st. louis, “ten years of connectivity: libraries, the world wide web, and the next decade.” the threeday educational event included excellent preconferences, general sessions, and more than thirty concurrent sessions. i want to thank the wonderful 2004 lita national forum planning committee, chaired by diane bisom, the presenters, and the lita office staff who all made this event a great experience. the next lita national forum will be held at the san jose marriott, san jose, california, september 29–october 2, 2005. the theme will be “the ubiquitous web: personalization, portability, and online collaboration.” thomas dowling, chair, and the 2005 lita national forum planning committee are preparing another “must attend” event. next year marks lita’s fortieth anniversary. 
2006 will be a year for lita to celebrate our history, future, and our many accomplishments. we are fortunate to have lynne lysiak leading the fortieth anniversary task force activities. i know we all will enjoy the festivities. i look forward to working with many of you as we continue to make lita a wonderful and vibrant association. i encourage you to send me your comments and suggestions to further the goals, services, and activities of lita.

32. terence cavanaugh, “e-books and accommodations: is this the future of print accommodation?” teaching exceptional children 35, no. 2 (2002): 56–61. 33. skip pratt, “e-books and e-publishing: ignore ms reader and palm os at your own peril,” knowledge download, 2002. accessed dec. 27, 2004, www.knowledge-download.com/260802-e-book-article. 34. davina witt, “audience profile and demographics,” mar./apr. 2003. accessed dec. 27, 2004, www.bookbrowse.com/media/audience.cfm. 35. geoff daily, “gameboy advance: not just playing with games,” econtent 27, no. 5 (2004): 12–14. 36. associated press, “flexible e-paper on its way,” associated press, 7 may 2003. accessed dec. 27, 2004, www.wired.com/news. 37. richard mayer, multimedia learning (cambridge, uk: cambridge university press, 2000). 38. sottong, “e-book technology.” 39. amc, “film facts: read about lost films.” accessed june 19, 2003, www.amctv.com/article?cid=1052. 40. ronald jantz, “e-books and new library service models: an analysis of the impact of e-book technology on academic libraries,” information technology and libraries 20, no. 2 (2001): 104–15. 41. susan lareau, the feasibility of the use of e-books for replacing lost or brittle books in the kent state university library, 2001, eric, ed 459862. accessed dec. 27, 2004, http://searcheric.org. 42. eli edwards, “ephemeral to enduring: the internet archive and its role in preserving digital media,” information technology and libraries 23, no. 1 (2004): 3–8. 43.
norm parry, format proliferation in public libraries, 2002, eric, ed 470035. accessed dec. 27, 2004, http://searcheric.org. 44. david m. levy, scrolling forward: making sense of documents in the digital age (new york: arcade pub., 2001). 45. about alan lomax. accessed dec. 27, 2004, www.alan-lomax.com/about.html.

(president’s column continued from page 2)

index to advertisers: art & tech, 24; ebsco, cover 2; lita, covers 3–4

information technology and libraries | september 2007

wikis in libraries
matthew m. bejune

wikis have recently been adopted to support a variety of collaborative activities within libraries. this article and its companion wiki, librarywikis (http://librarywikis.pbwiki.com/), seek to document the phenomenon of wikis in libraries. this subject is considered within the framework of computer-supported cooperative work (cscw). the author identified thirty-three library wikis and developed a classification schema with four categories: (1) collaboration among libraries (45.7 percent); (2) collaboration among library staff (31.4 percent); (3) collaboration among library staff and patrons (14.3 percent); and (4) collaboration among patrons (8.6 percent). examples of library wikis are presented within the article, as is a discussion for why wikis are primarily utilized within categories i and ii and not within categories iii and iv. it is clear that wikis have great utility within libraries, and the author urges further application of wikis in libraries.

in recent years, the popularity of wikis has skyrocketed. wikis were invented in the mid-1990s to help facilitate the exchange of ideas between computer programmers. the use of wikis has gone far beyond the domain of computer programming, and now it seems as if every google search contains a wikipedia entry. wikis have entered into the public consciousness.
so, too, have wikis entered into the domain of professional library practice. the purpose of this research is to document how wikis are used in libraries. in conjunction with this article, the author has created librarywikis (http://librarywikis.pbwiki.com/), a wiki to which readers can submit additional examples of wikis used in libraries. the article will proceed in three sections. the first section is a literature review that defines wikis and introduces computer-supported cooperative work (cscw) as a context for understanding wikis. the second section documents the author’s research and presents a schema for classifying wikis used in libraries. the third section considers the implications of the research results.

■ literature review

what’s a wiki?

wikipedia (2007a) defines a wiki as:

a type of web site that allows the visitors to add, remove, edit, and change some content, typically without the need for registration. it also allows for linking among any number of pages. this ease of interaction and operation makes a wiki an effective tool for mass collaborative authoring.

wikis have been around since the mid-1990s, though it is only recently that they have become ubiquitous. in 1995, ward cunningham launched the first wiki, wikiwikiweb (http://c2.com/cgi/wiki), which is still active today, to facilitate the exchange of ideas among computer programmers (wikipedia 2007b). the launch of wikiwikiweb was a departure from the existing model of web communication, where there was a clear divide between authors and readers. wikiwikiweb elevated the status of readers, if they so chose, to that of content writers and editors. this model proved popular, and the wiki technology used on wikiwikiweb was soon ported to other online communities, the most famous example being wikipedia. on january 15, 2001, wikipedia was launched by larry sanger and jimmy wales as a complementary project for the now-defunct nupedia encyclopedia.
nupedia was a free, online encyclopedia with articles written by experts and reviewed by editors. wikipedia was designed as a feeder project to solicit new articles for nupedia that were not submitted by experts. the two services coexisted for some time, but in 2003 the nupedia servers were shut down. since its launch, wikipedia has undergone rapid growth. at the close of 2001, wikipedia’s first year of operation, there were 20,000 articles in eighteen language editions. as of this writing, there are approximately seven million articles in 251 languages, fourteen of which have more than 100,000 articles each. as a sign of wikipedia’s growth, when this manuscript was first submitted four months earlier, there were more than five million articles in 250 languages.

author’s note: sources in the previous two paragraphs come from wikipedia. the author acknowledges the concerns within the academy regarding the practice of citing wikipedia within scholarly works; however, it was decided that wikipedia is arguably an authoritative source on wikis and itself. nevertheless, the author notes that there were changes—insubstantial ones—to the cited wikipedia entries between when the manuscript was first submitted and when it was revised four months later.

matthew m. bejune (mbejune@purdue.edu) is an assistant professor of library science at purdue university libraries. he also is a doctoral student at the graduate school of library and information science, university of illinois at urbana-champaign.

wikis and cscw

wikis facilitate collaborative authoring and can be considered one of the technologies studied under the domain of cscw. in this section, cscw is explained and it is shown how wikis fit within this framework. cscw is an area of computer science research that considers the application of computer technology to support cooperative (also referred to as collaborative) work. the term was first coined in 1984 by irene greif (1988) and paul cashman to describe a workshop they were planning on the support of people in work environments with computers. over the years there have been a number of review articles that describe cscw in greater detail, including bannon and schmidt (1991), rodden (1991), schmidt and bannon (1992), sachs (1995), dourish (2001), ackerman (2002), olson and olson (2002), dix, finlay, abowd, and beale (2004), and shneiderman and plaisant (2005).

publication in the field of cscw primarily occurs through conferences. the first conference on cscw was held in 1986 in austin, texas. since then, the conference has been held biennially in the united states. proceedings are published by the association for computing machinery (acm, http://www.acm.org/). in 1991, the first european conference on computer supported cooperative work (ecscw) was held in amsterdam. ecscw also is held biennially, in odd-numbered years. ecscw proceedings are published by springer (http://www.ecscw.uni-siegen.de/). the primary journal for cscw is computer supported cooperative work: the journal of collaborative computing. papers also appear within publications of the acm and chi, the conference on human factors in computing.

cscw and libraries

as libraries are, by nature, collaborative work environments—library staff working together and with patrons—and as digital libraries and computer technologies become increasingly prevalent, there is a natural fit between cscw and libraries. the following researchers have applied cscw to libraries. twidale et al. (1997) published a report sponsored by the british library research and innovation centre that examined the role of collaboration in the information-searching process to inform how information systems design could better address and support collaborative activity.
twidale and nichols (1998) offered ethnographic research of physical collaborative environments—in a university library and an office—to aid the design of digital libraries. they wrote two reviews of cscw as applied to libraries—the first was more comprehensive (twidale and nichols 1998) than the second (twidale and nichols 1999). sánchez (2001) discussed collaborative environments designed and prototyped for digital library environments.

classification of collaboration

technologies that facilitate collaborative work are typically classified within cscw across two continua: synchronous versus asynchronous, and co-located versus remote. if put together in a two-by-two matrix, there are four possibilities: (1) synchronous and co-located (same time, same place); (2) synchronous and remote (same time, different place); (3) asynchronous and remote (different time, different place); and (4) asynchronous and co-located (different time, same place). this classification schema was first proposed by johansen et al. (1988). nichols and twidale (1999) mapped work applications within the realm of cscw in figure 1. wikis are not present in the figure, but their absence is not an indication that they are not cooperative work technologies. rather, wikis were not yet widely in use at the time cscw was considered by nichols and twidale. the author has added wikis to nichols and twidale’s graphical representation in figure 2. interestingly, wikis are border-crossers fitting within two quadrants: the upper right—asynchronous and co-located; and the lower right—asynchronous and remote. wikis are asynchronous in that they do not require people to be working together at the same time. they are both co-located and remote in that people working collaboratively may not need to be working in the same place. it is also interesting to note that library technologies also can be mapped using johansen’s schema.
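the two-by-two schema described above can be written down as a small data structure. the following is only an illustrative sketch in python, not part of the original research: the names Time, Place, QUADRANTS, WIKI_QUADRANTS, and describe are hypothetical, and the only placement encoded is the one stated in the text, namely that wikis fall in both asynchronous quadrants.

```python
from enum import Enum
from itertools import product


class Time(Enum):
    SYNCHRONOUS = "same time"
    ASYNCHRONOUS = "different time"


class Place(Enum):
    CO_LOCATED = "same place"
    REMOTE = "different place"


# the four cells of the two-by-two matrix (hypothetical name)
QUADRANTS = list(product(Time, Place))

# placement stated in the text: wikis span both asynchronous quadrants
WIKI_QUADRANTS = {
    (Time.ASYNCHRONOUS, Place.CO_LOCATED),
    (Time.ASYNCHRONOUS, Place.REMOTE),
}


def describe(quadrant):
    """return a label such as 'synchronous and remote (same time, different place)'."""
    time, place = quadrant
    time_name = time.name.lower()
    place_name = place.name.lower().replace("_", "-")
    return f"{time_name} and {place_name} ({time.value}, {place.value})"


if __name__ == "__main__":
    for quadrant in QUADRANTS:
        marker = "wikis" if quadrant in WIKI_QUADRANTS else "-"
        print(f"{describe(quadrant)}: {marker}")
```

keeping the two continua as separate enumerations means that a technology can be assigned to more than one quadrant, which is what the border-crossing position of wikis requires.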
nichols and twidale (1999) also mapped library technologies in this way, and figure 3 illustrates the variety of collaborative work that goes on within libraries.

[figure 1. classification of cscw applications. axes: co-located versus remote; synchronous versus asynchronous. entries: meeting rooms; distributed meetings; muds and moos; shared drawing; video conferencing; collaborative writing; team rooms; organizational memory; workflow; web-based applications; collaborative writing]

■ method

in order to discover the widest variety of wikis used in libraries, the author searched for examples of wikis used in libraries within three areas—the lis literature, the library success wiki, and within messages posted on three professional electronic discussion lists. when examples were found, they were logged and classified according to a schema created by the author. results are presented in the next section.

the first area searched was within the lis literature. the author utilized the wilson library literature and information science database. there were two main types of articles: ones that argued for the use of wikis in libraries, and ones that were case studies of wikis that had been implemented.

the second area searched was within library success: a best practices wiki (http://www.libsuccess.org/) (see figure 4), created by meredith farkas, distance learning librarian at norwich university. as the name implies, it is a place for people within the library community to share their success stories. posting to the wiki is open to the public, though registration is encouraged. there are many subject areas on the wiki, including management and leadership, readers’ advisory, reference services, information literacy, and so on. there also is a section about collaborative tools in libraries (http://www.libsuccess.org/index.php?title=collaborative_tools_in_libraries), in which examples of wikis in libraries are presented.
within this section there is a presentation about wikis made by farkas (2006) titled wiki world (http://www.libsuccess.org/index.php?title=wiki_world), from which examples were culled.

the third area searched was messages from three professional electronic discussion lists: web4lib, dig_ref, and libref-l. the web4lib electronic discussion list (tennant 2005) is “for the discussion of issues relating to the creation, management, and support of library-based world wide web servers, services, and applications.” the list is moderated by roy tennant and the web4lib advisory board and was started in 1994. the dig_ref electronic discussion list is a forum for “people and organizations answering the questions of users via the internet” (webjunction n.d.). the list is hosted by the information institute of syracuse, school of information studies, syracuse university, and was created in 1998. the libref-l electronic discussion list is “a moderated discussion of issues related to reference librarianship” (balraj 2005). established in 1990, it is operated out of kent state university and moderated by a group of list owners.

these three electronic discussion lists were selected for two reasons. first, the author subscribes to each list and, prior to the research, had noted messages about wikis in libraries on them. second, based on the descriptions above, the selected lists reasonably cover the discussion of wikis in libraries within the professional library electronic discussion lists.

one year of messages, november 15, 2005, through november 14, 2006, was analyzed for each list. messages about wikis in libraries were identified through keyword searches against the author’s personal archive of electronic discussion list messages collected over the years. an alternative method would have been to search the web archive of each list, but the author found it easier to search within his mail client, microsoft outlook. the word “wiki” was found in 513 messages: 354 in web4lib, 91 in dig_ref, and 68 in libref-l. this approach had high recall, as discourse about wikis frequently included the word “wiki,” though low precision, as many results were not about wikis used in libraries. common false hits included messages about the nature study (giles 2005) that compared wikipedia to encyclopedia britannica, and messages that mentioned wikis in general rather than examples of wikis used within libraries. the author read each of the 513 messages and arrived at a much shorter list of thirty-nine messages about wikis in libraries: thirty-two in web4lib, three in dig_ref, and four in libref-l.

figure 2. classification of cscw applications including wikis (axes: co-located versus remote; synchronous versus asynchronous. applications shown: those of figure 1, with wikis added in two quadrants)

figure 3. classification of collaborative work within libraries (axes: co-located versus remote; synchronous versus asynchronous. activities shown: personal help, reference interview, issue of book on loan, face-to-face interactions, use of opacs, database search, video conferencing, telephone, notice boards, post-it notes, memos, documents for study, social information filtering, e-mail, voicemail, distance learning, postal services)

figure 4. library success: a best practices wiki (http://www.libsuccess.org/)

wikis in libraries | bejune

■ results

classification of the results

after all wiki examples had been collected, it became clear that there was a way to classify the results.
in farkas’s (2006) presentation about wikis, she organized wikis in two categories: (1) how libraries can use wikis with their patrons; and (2) how libraries can use wikis for knowledge sharing and collaboration. this schema, while it accounts for two types of collaboration, is not granular enough to represent the types of collaboration found within the wiki examples identified, so another schema was needed.

twidale and nichols (1998) identified three types of collaboration within libraries: (1) collaboration among library staff; (2) collaboration between a patron and a member of staff; and (3) collaboration among library users. their classification schema mapped well to the examples of wikis that were identified; however, it too was not granular enough, as it did not distinguish between intraorganizational and extraorganizational collaboration among library staff, the two most common types of wiki usage found in the research (see appendix). to account for these types of collaboration, which are common not only to wiki use in libraries but to all professional library practice, the author modified twidale and nichols’s schema (see figure 6). the improved schema also uniformly represents entities across the categories: library staff and member of staff are referred to as “library staff,” and patrons and library users are referred to as “patrons.”

examples of wikis used in libraries for each category are provided to better illustrate the proposed classification schema.

■ collaboration among libraries

the library instruction wiki (http://instructionwiki.org/main_page) is an example of a wiki that is used for collaboration among libraries (figure 7). it appears that the wiki was originally set up to support library instruction within oregon (it is unclear whether it was associated with a particular type of library, say academic or public), but now the wiki supports library instruction in general.
the wiki is self-described as “a collaboratively developed resource for librarians involved with or interested in instruction. all librarians and others interested in library instruction are welcome and encouraged to contribute.” the tagline for the wiki is “stop reinventing the wheel” (library instruction wiki 2006). on this wiki there is a list of library instruction resources that includes the following: handouts, tutorials, and other resources to share; teaching techniques, tips, and tricks; class-specific web sites and handouts; glossary and encyclopedia; bibliography and suggested reading; and instruction-related projects, brainstorms, and documents. within the handouts, tutorials, and other resources to share section, the author found a wide variety of resources from libraries across the country. similarly, there were a number of suggestions under the teaching techniques, tips, and tricks section.

another example of a wiki used for collaboration among libraries is the library success wiki (http://www.libsuccess.org/), one of the sources of examples of wikis used in this research. in addition to what was described earlier in this paper, library success seems to be one of the most frequently updated library wikis and perhaps the most comprehensive in its coverage of library topics.

figure 5. wiki world (http://www.libsuccess.org/index.php?title=wiki_world)

figure 6. four types of collaboration within libraries: (1) collaboration among libraries (extra-organizational); (2) collaboration among library staff (intra-organizational); (3) collaboration among library staff and patrons; (4) collaboration among patrons

■ collaboration among library staff

the university of connecticut libraries’ staff wiki (http://wiki.lib.uconn.edu/) is an example of a wiki used for collaboration among library staff (figure 8).
this wiki is a knowledge base containing more than one thousand information technology services (its) documents. these its documents support the information technology needs of the library organization. examples include answers to commonly asked questions, user manuals, and instructions for a variety of computer operations. in addition to being a repository of its documents, the wiki also serves as a portal to other wikis within the university of connecticut libraries. there are many other wikis connected to library units; teams; software applications, such as the libraries’ ils; libraries within the university of connecticut libraries; and other university of connecticut campuses.

the health sciences library knowledge base, stony brook university (http://appdev.hsclib.sunysb.edu/twiki/bin/view/main/webhome) is another example of a wiki that is used for collaboration among library staff (figure 9). the wiki is described as “a space for the dynamic collaboration of the library staff, and a platform of shared resources” (health sciences library 2007). on the wiki there are the following content areas: news and announcements; hsl departments; projects; troubleshooting; staff training resources, working papers and support materials; and community activities, scholarship, conferences, and publications.

■ collaboration among library staff and patrons

there are only a few examples of wikis used for collaboration among library staff and patrons to cite as exemplars. one example is the st. joseph county public library (sjcpl) subject guides (http://www.libraryforlife.org/subjectguides/index.php/main_page), seen in figure 10. this wiki is a collection of resources and services in print and electronic formats to assist library patrons with subject area searching. as the wiki is published by library staff for public consumption, it has more of a professional feel than wikis from the first two categories.
pages have images, and the content is structured to look like a standard web page. though the wiki looks like a web page, there still remain a number of edit links that follow each section of text on the wiki. while these links are important for those editing the wiki (library staff only, in this case), they likely puzzle library patrons, who may think that they can edit the wiki when, in fact, they cannot.

another example of collaboration between library staff and patrons that takes a similar approach is the usc aiken gregg-graniteville library web site (http://library.usca.edu/) in figure 11. as with the sjcpl subject guides, this wiki looks more like a web site than a wiki. in fact, the usc aiken wiki conceals its identity as a wiki even more than the sjcpl subject guides do. the only evidence that the web site is a wiki is a link at the bottom of each page that says “powered by pmwiki.” pmwiki (http://pmwiki.org/) is a content management system that uses wiki technology on the back end to manage a web site while retaining the look and feel of a standard web site. the benefits of using a wiki in such a way seem to be shared content creation and management.

figure 7. library instruction wiki (http://instructionwiki.org/)

figure 8. the university of connecticut libraries’ staff wiki (http://wiki.lib.uconn.edu/)

■ collaboration among patrons

as there are only three examples of wikis used for collaboration among patrons, all are highlighted in this section. the first example is wiki worldcat (http://www.oclc.org/productworks/wcwiki.htm), sponsored by oclc. wiki worldcat launched as a pilot project in september 2005. the service allows users of open worldcat, oclc’s web version of worldcat, to add book reviews to item records.
though this wiki does not have many book reviews in it, even for contemporary bestsellers, it gives a taste of how a wiki could be used to facilitate collaboration among patrons.

a second example is the biz wiki from ohio university libraries (http://www.library.ohiou.edu/subjects/bizwiki/index.php/main_page) (see figure 12). the biz wiki is a collection of business information resources available through ohio university. the wiki was created by chad boeninger, reference and instruction librarian, as an alternate form of a subject guide or pathfinder. what separates this wiki from those in the third category, collaboration among library staff and patrons, is that the wiki is editable by patrons as well as librarians. similarly, butler wikiref (http://www.seedwiki.com/wiki/butler_wikiref) is a wiki that has reviews of reference resources created by butler librarians, faculty, staff, and students (see figure 13).

figure 9. health sciences library knowledge base (http://appdev.hsclib.sunysb.edu/twiki/bin/view/main/webhome)

figure 10. sjcpl subject guides (http://libraryforlife.org/subjectguides/index.php/main_page/)

figure 11. usc aiken gregg-graniteville library (http://library.usca.edu/)

full results

thirty-three wikis were identified. two wikis were classified in two categories each, so the category counts sum to thirty-five. the full results are available in the appendix. table 1 illustrates how wikis were not uniformly distributed across the four categories: category i had 45.7 percent, category ii had 31.4 percent, category iii had 14.3 percent, and category iv had 8.6 percent. about 77 percent of all examples were found within categories i and ii. as seen in some of the examples in the previous section, wikis were utilized for a variety of purposes.
here is a short list of purposes for which wikis were utilized: sharing information, supporting association work, collecting software documentation, supporting conferences, facilitating librarian-to-faculty collaboration, creating digital repositories, managing web content, creating intranets, providing reference desk support, creating knowledge bases, creating subject guides, and collecting reader reviews.

wiki software utilization is summarized in tables 2 and 3. mediawiki is the most popular software utilized by libraries (33.3 percent), followed by unknown software (30.3 percent), pbwiki (12.1 percent), pmwiki (12.1 percent), seedwiki (6.1 percent), twiki (3 percent), and xwiki (3 percent). if the values for unknown are removed from the totals (table 3), mediawiki is utilized in almost half (47.8 percent) of all library wiki applications.

■ discussion

with a wealth of examples of wikis in categories i and ii and a dearth of examples in categories iii and iv, the library community seems to be more comfortable using wikis to collaborate within the community than using them to collaborate with library patrons or to enable collaboration among patrons. the research results pose two questions: why are wikis predominantly used for collaboration within the library community? and why are wikis minimally used for collaborating with patrons and helping patrons to collaborate with one another?

why are wikis predominantly used for collaboration within the library community?

this is perhaps the easier of the two questions to explain. there is a long legacy of intraorganizational and extraorganizational cooperation and collaboration within libraries. one explanation for this is the shared budgetary climate within libraries. all too often there are insufficient money, staff, and resources to offer desired levels of service. librarians work together to overcome these barriers.
prominent examples include cooperative cataloging, interlibrary lending, and the formation of consortia to negotiate pricing. another explanation can be found in the personal characteristics of library professionals. librarianship is a service profession that attracts service-minded individuals who are interested in helping others, whether they are library patrons or fellow colleagues. a third reason is the role of library associations, such as the international federation of library associations and institutions, the american library association, the special libraries association, and the medical library association, as well as many others at the international, national, state, and local levels, and the work that is done through these associations at annual conferences and throughout the year. libraries use wikis to collaborate intraorganizationally and extraorganizationally because collaboration is what they do most naturally.

figure 12. ohio university libraries biz wiki (http://www.library.ohiou.edu/subjects/bizwiki)

figure 13. butler wikiref (http://www.seedwiki.com/wiki/butler_wikiref)

why are wikis minimally used for collaborating with patrons and helping patrons to collaborate with one another?

the reasons why libraries are only minimally using wikis to collaborate with patrons and for patron collaboration are more difficult to ascertain. however, given the untapped potential of wikis, the proposed answers to this question are more important and may lead to future implementations of wikis in libraries. here are four possible explanations, some more speculative than others.

first, perhaps one of the reasons lies in the way libraries are conceived by library patrons and librarians alike. a strong case can be made for libraries as places of collaborative work, and the author takes this position. however, historically libraries have been repositories of information, and this remains a pervasive and difficult concept to change: libraries are frequently seen simply as places to get books. in this scenario, the librarian is a gatekeeper with whom a patron interacts to get a book, if the patron interacts with a librarian at all. it also is worth noting that the relationship is one-way: the patron needs the assistance of the librarian, but not the other way around. viewed in these terms, this is not a collaborative situation. for libraries to use wikis to collaborate with library patrons, library patrons and librarians might need to reconceptualize libraries. similarly, this extreme conceptualization of libraries does not consider patrons working with one another, even though it is an activity that occurs formally and informally within libraries, all the more so with the emergence of interdisciplinary and multidisciplinary work. if wikis are to be used to facilitate collaboration between patrons, the conceptualization of the library by library patrons and librarians must be expanded.

second, there may be fears within the library community about authority, responsibility, and liability. libraries have long held the responsibility of ensuring the authority of the bibliographic catalog. if patrons are allowed to edit the library wiki, there is potential for negatively affecting the authority of the wiki and even the perceived authority of the library. likewise, there is potential liability in allowing patrons to post to the library wiki. similar concerns have been raised in the past about other collaborative technologies, such as blogs, bulletin boards, mailing lists, and so on, all aspects of the library 2.0 movement. if libraries are fully to realize library 2.0 as described by casey and savastinuk (2006), miller (2006), and courtney (2007), these issues must be considered.

table 1. classification summary
category | no. | %
i: collaboration among libraries | 16 | 45.7
ii: collaboration among library staff | 11 | 31.4
iii: collaboration among library staff and patrons | 5 | 14.3
iv: collaboration among patrons | 3 | 8.6
total | 35 | 100.0

table 2. software totals
wiki software | no. | %
mediawiki | 11 | 33.3
unknown | 10 | 30.3
pbwiki | 4 | 12.1
pmwiki | 4 | 12.1
seedwiki | 2 | 6.1
twiki | 1 | 3.0
xwiki | 1 | 3.0
total | 33 | 100.0

table 3. software totals without unknowns
wiki software | no. | %
mediawiki | 11 | 47.8
pbwiki | 4 | 17.4
pmwiki | 4 | 17.4
seedwiki | 2 | 8.7
twiki | 1 | 4.3
xwiki | 1 | 4.3
total | 23 | 100.0

third, perhaps it is a matter of fit. it might be the case that wikis are utilized in categories i and ii and not within categories iii and iv because the tools are better suited to support the types of activities within categories i and ii. consider some of the activities listed earlier: supporting association work, collecting software documentation, supporting conferences, creating digital repositories, creating intranets, and creating knowledge bases. each of these illustrates a wiki that is utilized for the creation of a resource with multiple authors and readers, a task that is well suited to wikis. wikipedia is a great example of a wiki with clear, shared tasks for multiple authors and multiple readers and a sense of persistence over time. in contrast, relationships between library staff and patrons do not typically lead to the shared creation of resources. while it is true that the relationship between patron and librarian in the context of a patron’s research assignment can be collaborative depending on the circumstances, authorship is not shared but is possessed by the patron. in addition, research assignments in the context of undergraduate coursework are short-lived and seldom go beyond the confines of a particular course.
in terms of patrons working together with other patrons, there is the precedent of group work; however, groups often produce projects or papers that share the characteristics of the nongroup research assignments described above. this, of course, does not mean that wikis are not suitable for collaboration within categories iii and iv; perhaps the opportunities for collaboration are fewer, or they require more imagination about the types and ways of doing collaborative work.

fourth, perhaps it is a matter of “not yet.” while the research has shown that libraries are rarely utilizing wikis in categories iii and iv, this may be because it is too soon. it should be noted that wikis are still new technologies. it might be the case that librarians are experimenting in safer contexts so they will gain experience prior to trying more public projects where their expertise will be needed. if this explanation is true, it is expected that more examples of wikis in libraries will soon emerge. as they do, the author hopes that all examples of wikis in libraries, new and old, will be added to the companion wiki to this article, librarywikis (http://librarywikis.pbwiki.com/).

■ conclusion

it appears that wikis are here to stay, and that their utilization within libraries is only just beginning. this article documented the current practice of wikis used in libraries using cscw as a framework for discussion. the author located examples of wikis in three places: within the lis literature, on the library success wiki, and within messages from three professional electronic discussion lists. thirty-three examples of wikis were identified and classified using a classification schema created by the author. the schema has four categories: (1) collaboration among libraries; (2) collaboration among library staff; (3) collaboration among library staff and patrons; and (4) collaboration among patrons.
wikis were used for a variety of purposes, including sharing information, supporting association work, collecting software documentation, supporting conferences, facilitating librarian-to-faculty collaboration, creating digital repositories, managing web content, creating intranets, providing reference desk support, creating knowledge bases, creating subject guides, and collecting reader reviews. by and large, wikis were primarily used to support collaboration among library staff extraorganizationally and intraorganizationally (45.7 percent and 31.4 percent, respectively; about 77 percent combined), and less so in the support of collaboration among library staff and patrons (14.3 percent) and collaboration among patrons (8.6 percent). mediawiki was the most widely used software, accounting for 47.8 percent of the wikis whose software could be identified. it is clear that there are plenty of examples of wikis utilized in libraries, and more to be found each day. the profession now faces the task of extending the use of this technology, and it remains to be seen how wikis will continue to be used within libraries.

works cited

ackerman, mark s. 2002. the intellectual challenge of cscw: the gap between social requirements and technical feasibility. in human-computer interaction in the new millennium, ed. john m. carroll, 179–203. new york: addison-wesley.

balraj, leela, et al. 2005. libref-l. kent state university libraries. http://www.library.kent.edu/page/10391 (accessed june 12, 2007). archive is available at this link as well.

bannon, liam j., and kjeld schmidt. 1991. cscw: four characters in search for a context. in studies in computer supported cooperative work, ed. john m. bowers and steven d. benford, 3–16. amsterdam: elsevier.

casey, michael e., and laura c. savastinuk. 2006. library 2.0. library journal 131, no. 14: 40–42. http://www.libraryjournal.com/article/ca6365200.html (accessed june 12, 2007).

courtney, nancy. 2007. library 2.0 and beyond: innovative technologies and tomorrow’s user (in press). westport, conn.: libraries unlimited.

dix, alan, et al. 2004. socio-organizational issues and stakeholder requirements. in human computer interaction, 3rd ed., 450–74. upper saddle river, n.j.: prentice hall.

dourish, paul. 2001. social computing. in where the action is: the foundations of embodied interaction, 55–97. cambridge, mass.: mit pr.

farkas, meredith. 2006. wiki world. http://www.libsuccess.org/index.php?title=wiki_world (accessed june 12, 2007).

giles, jim. 2005. internet encyclopaedias go head to head. nature 438: 900–01. http://www.nature.com/nature/journal/v438/n7070/full/438900a.html (accessed june 12, 2007).

greif, irene, ed. 1988. computer supported cooperative work: a book of readings. san mateo, calif.: morgan kaufmann publishers.

health sciences library, state university of new york, stony brook. 2007. health sciences library knowledge base. http://appdev.hsclib.sunysb.edu/twiki/bin/view/main/webhome (accessed june 12, 2007).

johansen, robert, et al. 1988. groupware: computer support for business teams. new york: free press.

library instruction wiki. 2006. http://instructionwiki.org/main_page (accessed june 12, 2007).

miller, paul. 2006. coming together around library 2.0. d-lib magazine 12, no. 4. http://www.dlib.org/dlib/april06/miller/04miller.html (accessed june 12, 2007).

nichols, david m., and michael b. twidale. 1999. computer supported cooperative work and libraries. vine 109: 10–15. http://www.comp.lancs.ac.uk/computing/research/cseg/projects/ariadne/docs/vine.html (accessed june 12, 2007).

olson, gary m., and judith s. olson. 2002. groupware and computer-supported cooperative work. in the human-computer interaction handbook: fundamentals, evolving technologies and emerging applications, ed. julie a. jacko and andrew sears, 583–95. mahwah, n.j.: lawrence erlbaum associates, inc.
rodden, tom t. 1991. a survey of cscw systems. interacting with computers 3, no. 3: 319–54.

sachs, patricia. 1995. transforming work: collaboration, learning, and design. communications of the acm 38: 227–49.

sánchez, j. alfredo. 2001. hci and cscw in the context of digital libraries. in chi ’01 extended abstracts on human factors in computing systems. conference on human factors in computing systems. seattle, wash., mar. 31–apr. 5, 2001.

schmidt, kjeld, and liam j. bannon. 1992. taking cscw seriously: supporting articulation work. computer supported cooperative work 1, no. 1/2: 7–40.

shneiderman, ben, and catherine plaisant. 2005. collaboration. in designing the user interface: strategies for effective human-computer interaction, 4th ed., 408–50. reading, mass.: addison wesley.

tennant, roy. 2005. web4lib electronic discussion. webjunction.org. http://lists.webjunction.org/web4lib/ (accessed june 12, 2007). archive is available at this link as well.

twidale, michael b., et al. 1997. collaboration in physical and digital libraries. report no. 64, british library research and innovation centre. http://www.comp.lancs.ac.uk/computing/research/cseg/projects/ariadne/bl/report/ (accessed june 12, 2007).

twidale, michael b., and david m. nichols. 1998a. using studies of collaborative activity in physical environments to inform the design of digital libraries. technical report cseg/11/98, computing department, lancaster university, uk. http://www.comp.lancs.ac.uk/computing/research/cseg/projects/ariadne/docs/cscw98.html (accessed june 12, 2007).

twidale, michael b., and david m. nichols. 1998b. a survey of applications of cscw for digital libraries. technical report cseg/4/98, computing department, lancaster university, uk. http://www.comp.lancs.ac.uk/computing/research/cseg/projects/ariadne/docs/survey.html (accessed june 12, 2007).

webjunction. n.d. dig_ref electronic discussion list. http://www.vrd.org/dig_ref/dig_ref.shtml (accessed june 12, 2007).

wikipedia. 2007a. wiki. http://en.wikipedia.org/wiki/wiki (accessed april 29, 2007).

wikipedia. 2007b. wikiwikiweb. http://en.wikipedia.org/wiki/wikiwikiweb (accessed april 29, 2007).

appendix. wikis in libraries

i = collaboration between libraries
ii = collaboration between library staff
iii = collaboration between library staff and patrons
iv = collaboration between patrons

category | description | location | wiki software
i | library success: a best practices wiki. a wiki capturing library success stories. covers a wide variety of topics. also features a presentation about wikis (http://www.libsuccess.org/index.php?title=wiki_world) | http://www.libsuccess.org/ | mediawiki
i | wiki for school library association in alaska | http://akasl.pbwiki.com/ | pbwiki
i | wiki to support reserves direct. free, open-source software for managing academic reserves materials developed by emory university. | http://www.reservesdirect.org/wiki/index.php/main_page | mediawiki
i | sunyla new tech wiki. a place for state university of new york (suny) librarians to share how they are using information technologies to interact with patrons | http://sunylanewtechwiki.pbwiki.com/ | pbwiki
i | wiki for librarians and faculty members to collaborate across campuses. being used with distance learning instructors and small groups | message from robin shapiro on the dig_ref electronic discussion list, dated 10/18/2006 | unknown
i | discusses setting up three wikis in last month: “one to support a pre-conference workshop, another for behind-the-scenes conferences planning by local organizers, and one for conference attendees to use before they arrived and during the sessions” (30). | fichter, darlene. 2006. using wikis to support online collaboration in libraries. information outlook 10, no. 1: 30–31. | unknown
i | unofficial wiki to the american library association 2005 annual conference | http://meredith.wolfwater.com/wiki/index.php?title=main_page | mediawiki
i | unofficial wiki to the 2005 internet librarian conference | http://ili2005.xwiki.com/xwiki/bin/view/main/webhome | xwiki
i | wiki for the canadian library association (cla) 2005 annual conference | http://wiki.ucalgary.ca/page/cla | mediawiki
i | wiki for south carolina library association | http://www.scla.org/governance/homepage | pmwiki
i | wiki set up to support national discussion about institutional repositories in new zealand | http://wiki.tertiary.govt.nz/~institutionalrepositories | pmwiki
i | the oregon library instruction wiki used for sharing information about library instruction | http://instructionwiki.org/ | mediawiki
i | personal repositories online wiki environment (prowe). an online repository sponsored by the open university and the university of leicester that uses wikis and blogs to encourage the open exchange of ideas across communities of practice | http://www.prowe.ac.uk/ | unknown
i | lis wiki. space for collecting articles and general information about library and information science | http://liswiki.org/wiki/main_page | mediawiki
i | making of modern michigan. a wiki to support a state-wide digital library project | http://blog.lib.msu.edu/mmmwiki/index.php/main_page | unknown (behind firewall)
i | wiki used as a web content editing tool in a digital library initiative sponsored by emory university, the university of arizona, virginia tech, and the university of notre dame | http://sunylanewtechwiki.pbwiki.com/ | pbwiki
ii | wiki at suny stony brook health sciences library used as knowledge base | http://appdev.hsclib.sunysb.edu/twiki/bin/view/main/webhome; presentation can be found at: http://ms.cc.sunysb.edu/%7edachase/wikisinaction.htm | twiki
ii | wiki at york university used internally for committee work. exploring how to use wikis as a way to collaborate with users | message from mark robertson on the web4lib electronic discussion list, dated 10/13/2006 | unknown
ii | wiki for internal staff use at the university of waterloo. they utilize access control to restrict parts of the wiki to groups | message from chris gray on the web4lib electronic discussion list, dated 08/09/2006 | unknown
ii | wiki at the university of toronto for internal communications, technical problems, and as a document repository | message from stephanie walker on the libref-l electronic discussion list, dated 10/28/2006 | unknown
ii | wiki used for coordination and organization of portable professor program, which appears to be a collaborative information literacy program for remote faculty | http://tfpp-committee.pbwiki.com/ | pbwiki
ii | the university of connecticut libraries’ staff wiki, which is a repository of information technology services documents | http://wiki.lib.uconn.edu/wiki/main_page | mediawiki
ii | wiki used at binghamton university libraries for staff intranet. features pages for committees, documentation, policies, newsletters, presentations, and travel reports | screenshots can be found at http://library.lib.binghamton.edu/presentations/cil2006/cil%202006_wikis.pdf | mediawiki
ii | wiki used at the information desk at miami university | described in: withers, rob. “something wiki this way comes.” c&rl news 66, no. 11 (2005): 775–77. | unknown
ii | use of wiki as knowledge base to support reference service | http://oregonstate.edu/~reeset/rdm/ | unknown
ii | university of minnesota libraries staff web site in wiki form | https://wiki.lib.umn.edu/ | pmwiki
ii | wiki used to support the mit engineering and science libraries b-team. the wiki may no longer be active, but is still available | http://www.seedwiki.com/wiki/b-team | seedwiki
iii | a wiki that is subject guide at st. joseph county public library in south bend, indiana | http://www.libraryforlife.org/subjectguides/index.php/main_page | mediawiki
iii | wiki used at the aiken library, university of south carolina as a content management system (cms) | http://library.usca.edu/main/homepage | pmwiki
iii | doucette library of teaching resources wiki. a repository of resources for education students | http://wiki.ucalgary.ca/page/doucette | mediawiki
iv | wiki worldcat (wikid) is an oclc pilot project (now defunct) that allowed users to add reviews to open worldcat records | http://www.oclc.org/productworks/wcwiki.htm | unknown
iii and iv | wikiref lists reviews of reference resources (databases, books, web sites, etc.) created by butler librarians, faculty, staff, and students. | http://www.seedwiki.com/wiki/butler_wikiref; reported in matthies, brad, jonathan helmke, and paul slater. using a wiki to enhance library instruction. indiana libraries 25, no. 3 (2006): 32–34. | seedwiki
iii and iv | wiki used as a subject guide at ohio university | http://www.library.ohiou.edu/subjects/bizwiki/index.php/main_page; presentation about the wiki: http://www.infotoday.com/cil2006/presentations/c101-102_boeninger.pps | mediawiki

evaluation and comparison of discovery tools: an update
f. william chickering and sharon q. yang
information technology and libraries | june 2014

abstract

selection and implementation of a web-scale discovery tool by the rider university libraries (rul) in the 2011–2012 academic year revealed that the endeavor was a complex one. research into the state of adoption of web-scale discovery tools in north america and the evolution of product effectiveness provided a good starting point.
in the following study, we evaluated fourteen major discovery tools (three open source and eleven proprietary), benchmarking them against sixteen criteria recognized as the advanced features of a “next generation catalog.” some of the features have been used in previous research on discovery tools. the purpose of the study was to evaluate and compare all the major discovery tools, and the findings serve to update librarians on the latest developments and user interfaces and to assist them in their adoption of a discovery tool.

introduction

in 2004, the rider university libraries’ (rul) strategic planning process uncovered a need to investigate federated searching as a means to support research. a tool was needed to search and access all journal titles available to rul users at that time, including 12,000+ electronic full-text journals. because federated search could not provide relevancy ranking, a consequence of its real-time search operations, and because the products then available were costly, the decision was made to defer implementation of federated search. monitoring developments yearly revealed no improvements strong enough to adopt the approach. by 2011, the number of electronic full-text journals had increased to 51,128, and by this time federated search as a concept had metamorphosed into web-scale discovery. clearly, the time had come to consider implementing this more advanced approach to searching the ever-growing number of journals available to our clients.

though rul passed on federated searching, viewing it as too cumbersome to serve our students well, we anticipated the day when improved systems would emerge. vaughn nicely describes the ability of more highly evolved discovery systems to “provide quick and seamless discovery, delivery, and relevancy-ranking capabilities across a huge repository of content.”1 yang and hofmann anticipated the emergence of web-scale discovery with their evaluation of next generation catalogs.
2,3 by 2011, informed by yang and hofmann’s research, we believed that the systems in the marketplace were sufficiently evolved to make our efforts at assessing available systems worthwhile. this coincided nicely with an important objective in our strategic plan: investigate link resolvers and discovery tools for federated searching and opac by summer 2011. heeding alexander pope’s advice to “be not the first by whom the new are tried, nor yet the last to lay the old aside,”4 we set about discovering what systems were in use throughout north america and which features each provided.

f. william chickering (chick@rider.edu) is dean of university libraries, rider university, lawrenceville, new jersey. sharon q. yang (yangs@rider.edu) is associate professor–librarian at moore library, rider university, lawrenceville, new jersey.

some history

in 2006, antelman, lynema, and pace observed that “library catalogs have represented stagnant technology for close to twenty years.” better technology was needed “to leverage the rich metadata trapped in the marc record to enhance collection browsing. the promise of online catalogs has never been realized. for more than a decade, the profession either turned a blind eye to problems with the catalog or accepted that it is powerless to fix them.”6

dissatisfaction with catalog search tools led us to review the vufind discovery tool. while it had some useful features (spelling, “did you mean?” suggestions), it still suffered from inadequacies in full-text search and the cumbersome nature of searcher-designated boolean searching. it did not work well in searching printed music collections and, of course, only served as a catalog front end.
with this all in mind, rul developed a set of objectives to improve information access for clients:

• to provide information seekers with
    • an easy search option for academically valid information materials
    • an effective search option for academically valid information materials
    • a reliable search option for academically valid information materials across platforms
• to recapture student academic search activity from google
• to attempt revitalizing the use of monographic collections
• to provide an effective mechanism to support offerings of e-books
• to build a firm platform for appropriate library support of distance learning coursework

literature review

marshall breeding first discussed broad-based discovery tools in 2005, shortly after the launch of google scholar. he posited that federated search could not compete with the power and speed of a tool like google scholar, and he proclaimed the need for what he described as a “centralized search model.”7

building on breeding’s observations four years earlier, diedrichs astutely observed in 2009 that “user expectations for complete and immediate discovery and delivery of information have been set by their experiences in the web2.0 world. libraries must respond to the needs of those users whose needs can easily be met with google-like discovery tools, as well as those that require deeper access to our resources.”10 in that same year, dolski described the common situation in many academic libraries when, in reference to the university of nevada las vegas (unlv) library, he stated, “our library website serves as the de facto gateway to our electronic, networked content offerings. yet usability studies have shown that findability, when given our website as a starting point, is poor. undoubtedly this is due, at least in part, to interface fragmentation.”11 this perfectly described the way we had come to view rul’s situation.
in 2010, breeding reviewed the systems in the market, noting that these are not just next-generation catalogs. he stressed “equal access to content in all forms,” a concept we now take for granted. a key virtue in discovery tools, he notes, is the “blending of the full text of journal articles and books alongside citation data, bibliographic, and authority records resulting in a powerful search experience. rather than being provided a limited number of access points selected by catalogers, each word and phrase within the text becomes a possible point of retrieval.” breeding further points out that: “web-scale discovery platforms will blur many of the restrictions and rules that we impose on library users. rather than having to explain to a user that the library catalog lists books and journal titles but not journal articles, users can simply begin with the concept, author, or title of interest and straightaway begin seeing results across the many formats within the library’s collection.”12

working with freshmen at rider university revealed that they are ahead of the professionals in approaching information this way, and we believed that web-scale discovery tools could help our users.

as we began the process of selecting a discovery tool, we looked at the experiences of others. fabbi at the university of nevada las vegas (unlv) folded in a strong component of organizational learning in a highly structured manner that was unnecessary at rider.13 no information was disclosed on the process of selecting a discovery vendor, though the website reveals the presence of a discovery tool (http://library.nevada.edu/). in contrast, many librarians at rider explored a variety of libraries’ application of search tools. following hofmann and yang’s work, a process of vendor demonstrations and analysis of feasibility led to a trial of ebsco discovery service.
what we hoped for is what way, at grand valley state, reported in 2010 from his analysis of serials solutions’ summon:

    an examination of usage statistics showed a dramatic decrease in the use of traditional abstracting and indexing databases and an equally dramatic increase in the use of full text resources from full text database and online journal collections. the author concludes that the increase in full text use is linked to the implementation of a web-scale discovery tool.14

method

understanding both rul’s objectives and the state of the art as reflected in the literature, we concluded that an up-to-date review of discovery tool adoptions was in order before moving forward in the process of selecting a product.

the resulting study included these steps: (1) compiling a list of all the major discovery tools, (2) developing a set of criteria for evaluation, (3) examining between four and seven websites where a discovery tool was deployed and evaluating each tool against each criterion, (4) recording the findings, and (5) analyzing the data.

the targeted population for the study included all the major discovery tools in use in the united states. we define a discovery tool as a library user interface independent of any library systems. a discovery tool can be used to replace the opac module of an integrated library system or live side-by-side with the opac. other names for discovery tools include stand-alone opac, discovery layer, or discovery user interface. lately, a discovery tool is more often called a discovery service because most are becoming subscription-based and reside remotely in a cloud-based saas (software as a service) model.
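the recording and analysis steps of the method can be sketched in a few lines of python. this is only an illustrative sketch, not the authors' instrument: the tool names, site names, criterion names, and observations below are hypothetical placeholders, and the aggregation rule (a tool receives "yes" for a criterion if any sampled site exhibits the feature) is an assumption made for the example.

```python
from collections import defaultdict

# Hypothetical subset of criteria; the study itself used sixteen.
CRITERIA = ["one_stop_search", "faceted_navigation", "rss_feeds"]

# Hypothetical observations: (tool, site, criterion, feature present?).
observations = [
    ("tool_a", "site_1", "faceted_navigation", True),
    ("tool_a", "site_2", "faceted_navigation", False),
    ("tool_a", "site_1", "rss_feeds", False),
    ("tool_b", "site_3", "one_stop_search", True),
]


def summarize(observations, criteria):
    """Build a tool-by-criterion table of present/absent values.

    Assumed rule: a feature counts as present for a tool when at
    least one sampled site shows it.
    """
    present = defaultdict(set)
    for tool, _site, criterion, found in observations:
        present[tool]  # make sure every observed tool gets a row
        if found:
            present[tool].add(criterion)
    return {
        tool: {c: (c in features) for c in criteria}
        for tool, features in present.items()
    }


table = summarize(observations, CRITERIA)
for tool, row in sorted(table.items()):
    marks = ", ".join(f"{c}={'yes' if v else 'no'}" for c, v in row.items())
    print(f"{tool}: {marks}")
```

a flat table of this kind maps directly onto a spreadsheet with one column per criterion, which keeps the recording step separate from the later analysis.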
the authors compiled a list of fourteen discovery tools based on marshall breeding’s “major discovery products” guide published in “library technology guides.”15 those included aquabrowser library, axiell arena, bibliocommons (bibliocore), blacklight, ebsco discovery service, encore, endeca, extensible catalog, sirsidynix enterprise, primo, summon, visualizer, vufind, and worldcat local. two open-source discovery layers, sopac (the social opac) and scriblio, were excluded from this study because very few libraries are using them.

for evaluation in this study, academic libraries were preferred over public libraries during the sample selection process. however, some discovery tools, such as bibliocommons, were more popular among public libraries. therefore examples of public library websites were included in the evaluation. the sites that made the final list were chosen either from a customer list maintained on the vendor’s website or from breeding’s “library technology guides.”16 the following is the final list of libraries whose implementations were used in the study.

example library sites with proprietary discovery tools:

aquabrowser (serials solutions)
1. allen county public library at http://smartcat.acpl.lib.in.us/
2. gallaudet university library at http://discovery.wrlc.org/?skin=ga
3. harvard university at http://lib.harvard.edu/
4. norwood young america public library at http://aquabrowser.carverlib.org/
5. selco southeastern libraries cooperating at http://aquabrowser.selco.info/?c_profile=far
6. university of edinburgh (uk) at http://aquabrowser.lib.ed.ac.uk/

axiell arena (axiell)
1. doncaster council libraries (uk) at http://library.doncaster.gov.uk/web/arena
2. lerums bibliotek (lerums library, sweden) at http://bibliotek.lerum.se/web/arena
3. london libraries consortium-royal kingston library (uk) at http://arena.yourlondonlibrary.net/web/kingston
4. norddjurs (denmark) at https://norddjursbib.dk/web/arena/
5. north east lincolnshire libraries (uk) at http://library.nelincs.gov.uk/web/arena
6. someron kaupunginkirjasto (finland) at http://somero.verkkokirjasto.fi/web/arena
7. syddjurs (denmark) at https://bibliotek.syddjurs.dk/web/arena1

bibliocore (bibliocommons)
1. halton hills public library at http://hhpl.bibliocommons.com/dashboard
2. new york public library at http://nypl.bibliocommons.com/
3. oakville public library at http://www.opl.on.ca/
4. princeton public library at http://princetonlibrary.bibliocommons.com/
5. seattle public library at http://seattle.bibliocommons.com/
6. west perth (australia) public library at http://wppl.bibliocommons.com/dashboard
7. whatcom county library system at http://wcls.bibliocommons.com/

ebsco discovery service/eds (ebsco)
1. aston university (uk) at http://www1.aston.ac.uk/library/
2. columbia college chicago library at http://www.lib.colum.edu/
3. loyalist college at http://www.loyalistlibrary.com/
4. massey university (new zealand) at http://www.massey.ac.nz/massey/research/library/library_home.cfm
5. rider university at http://www.rider.edu/library
6. santa rosa junior college at http://www.santarosa.edu/library/
7. st. edward's university at http://library.stedwards.edu/

encore (innovative interfaces)
1. adelphi university at http://libraries.adelphi.edu/
2. athens state university library at http://www.athens.edu/library/
3. california state university at http://coast.library.csulb.edu/
4. deakin university (australia) at http://www.deakin.edu.au/library/
5. indiana state university at http://timon.indstate.edu/iii/encore/home?lang=eng
6. johnson and wales university at http://library.uri.edu/
7. st. lawrence university at http://www.stlawu.edu/library/

endeca (oracle)
1. john f. kennedy presidential library and museum at http://www.jfklibrary.org/
2. north carolina state university at http://www.lib.ncsu.edu/endeca/
3. phoenix public library at http://www.phoenixpubliclibrary.org/
4. triangle research libraries network at http://search.trln.org/
5. university of technology, sydney (australia) at http://www.lib.uts.edu.au/
6. university of north carolina at http://search.lib.unc.edu/
7. university of ottawa (canada) libraries at http://www.biblio.uottawa.ca/html/index.jsp?lang=en

enterprise (sirsidynix)
1. cerritos college at http://cert.ent.sirsi.net/client/cerritos
2. maricopa county community colleges at https://mcccd.ent.sirsi.net/client/default
3. mountain state university/university of charleston at http://msul.ent.sirsi.net/client/default
4. university of mary at http://cdak.ent.sirsi.net/client/uml
5. university of the virgin islands at http://uvi.ent.sirsi.net/client/default
6. western iowa tech community college at http://wiowa2.ent.sirsi.net/client/default

primo (ex libris)
1. aberystwyth university (uk) at http://primo.aber.ac.uk/
2. coventry university (uk) at http://locate.coventry.ac.uk/
3. curtin university (australia) at http://catalogue.curtin.edu.au/
4. emory university at http://web.library.emory.edu/
5. new york university at http://library.nyu.edu/
6. university of iowa at http://www.lib.uiowa.edu/
7. vanderbilt university at http://www.library.vanderbilt.edu

visualizer (vtls)
1. blinn college at http://www.blinn.edu/library/index.htm
2. edward via virginia college of osteopathic medicine at http://vcom.vtls.com:1177/
3. george c. marshall foundation at http://gmarshall.vtls.com:6330/
4. scugog memorial public library at http://www.scugoglibrary.ca/

summon (serials solutions)
1. arizona state university at http://lib.asu.edu/
2. dartmouth college at http://dartmouth.summon.serialssolutions.com/
3. duke university at http://library.duke.edu/
4. florida state university at http://www.lib.fsu.edu/
5. liberty university at http://www.liberty.edu/index.cfm?pid=178
6. university of sydney at http://www.library.usyd.edu.au/

worldcat local (oclc)
1. boise state university at http://library.boisestate.edu/
2. bowie state university at http://www.bowiestate.edu/academics/library/
3. eastern washington university at http://www.ewu.edu/library.xml
4. louisiana state university at http://lsulibraries.worldcat.org/
5. saint john's university at http://www.csbsju.edu/libraries.htm
6. saint xavier university at http://lib.sxu.edu/home

examples of open source and free discovery tools:

blacklight (the university of virginia library)
1. columbia university at http://academiccommons.columbia.edu/
2. johns hopkins university at https://catalyst.library.jhu.edu/
3. north carolina university at http://historicalstate.lib.ncsu.edu
4. northwestern university at http://findingaids.library.northwestern.edu/
5. stanford university at http://www-sul.stanford.edu/
6. university of hull (uk) at http://blacklight.hull.ac.uk/
7. university of virginia at http://search.lib.virginia.edu/

extensible catalog/xc (extensible catalog organization/carli/university of rochester)
1. demo at http://extensiblecatalog.org/xc/demo
2. extensible catalog library at http://xco-demo.carli.illinois.edu/dtmilestone3
3. kyushu university (japan) at http://catalog.lib.kyushu-u.ac.jp/en
4. spanish general state authority libraries (spain) at http://pcu.bage.es/
5. thailand cyber university/asia institute of technology (thailand) at http://globe.thaicyberu.go.th/

vufind (villanova university)
1. auburn university at http://www.lib.auburn.edu/
2. carnegie mellon university libraries at http://search.library.cmu.edu/vufind/search/advanced
3. colorado state university at http://lib.colostate.edu/
4. saint olaf college at http://www.stolaf.edu/library/index.cfm
5. university of michigan at http://mirlyn.lib.umich.edu
6. western michigan university at https://catalog.library.wmich.edu/vufind/
7. yale university library at http://yufind.library.yale.edu/yufind/

the following list of criteria was used for the purpose of the evaluation. some were based on those used by the previous studies on discovery tools.17,18,19 the list embodied the librarians’ vision for the next-generation catalog and contained some of the most desirable features for a modern opac. the authors were aware of other desirable features for a discovery layer, and the following list was by no means the most comprehensive, but it served the purpose of the study well.

1. one-stop search for all library resources.
a discovery tool should include all library resources in its search including the catalog with books and videos, journal articles in databases, and local archives and digital repository. this can be accomplished by the unified index or federated search, an essential component for a discovery tool. some of the discovery tools are described as web-scale because of their potential to search seamlessly across all library resources.
2. state-of-the-art web interface. a discovery tool should have a modern design similar to e-commerce sites, such as google, netflix, and amazon.
3. enriched content. discovery tools should include book cover images, reviews, and user-driven input, such as comments, descriptions, ratings, and tag clouds. the enriched content can be either from library patrons, commercial sources, or both.
4. faceted navigation. discovery tools should allow users to narrow down the search results by categories, also called facets. the commonly used facets include locations, publication dates, authors, formats, and more.
5. simple keyword search box with a link to advanced search at the start page. a discovery tool should start with a simple keyword search box that looks like that of google or amazon. a link to the advanced search should be present.
6. simple keyword search box on every page. the simple keyword search box should appear on every page of a discovery tool.
7. relevancy. the criteria for ranking results by relevancy should take into consideration circulation statistics and books with multiple copies. more frequently circulated books indicate popularity and usefulness, and they should be ranked nearer the top of the display. a book held in multiple copies may also be an indication of importance.
8. “did you mean . . . ?” spell-checking. when an error appears in the search, the discovery tool should present the corrected query spelling as a link so that users can simply click on it to get the search results.
9. recommendations/related materials. a discovery tool should recommend resources for readers in a similar manner to amazon or other e-commerce sites, based on transaction logs. this should take the form of “readers who borrowed this item also borrowed the following . . . ” or a link to recommended readings. it would be ideal if a discovery tool could recommend the most popular articles, a service similar to ex libris’ bx usage-based services.
10. user contribution. user input includes descriptions, summaries, reviews, criticism, comments, rating and ranking, and tagging or folksonomies.
11. rss feeds. a modern opac should provide rss feeds.
12. integration with social networking sites. when a discovery tool is integrated with social networking sites, patrons can share links to library items with their friends on social networks like twitter, facebook, and delicious.
13. persistent links. records in a discovery tool contain a stable url capable of being copied and pasted and serving as a permanent link to that record. they are also called permanent urls.
14. auto-completion/stemming. a discovery tool is equipped with an algorithm that can auto-complete the search words or supply a list of previously used words or phrases for users to choose from. google has stemming algorithms.
15. mobile compatibility. there is a difference between being “mobile compatible” and a “custom mobile website.” the former indicates a website can be viewed or used on a mobile phone, and the latter denotes a different version of the user interface specially built for mobile use. in this study we count both as “yes.”
16. functional requirements for bibliographic records (frbr). the latest development of rda certainly makes a discovery tool more desirable if it can display frbr relationships. for instance, a discovery tool may display and link different versions, editions or formats of a work, what frbr refers to as expressions and manifestations.

for record keeping and analysis, a microsoft excel file with sixteen fields based on the above criteria was created. the authors checked the discovery tools on the websites of the selected libraries and recorded those features as present or absent.

rda compatibility is not used as a criterion in the study because most discovery tools allow users to add rda fields in marc. by now, all the discovery tools should be able to display, index, and search the new rda fields.

findings

one-stop searching for all library resources: this is the most desirable feature when acquiring a discovery tool. unfortunately it also presented the biggest challenge for vendors. both librarians and vendors have been struggling with this issue for the past several years, yet no one has worked out a perfect solution. based on the examples the authors examined, this study found that only five out of fourteen discovery tools can retrieve articles from databases along with books, videos, and digital repositories. those include ebsco discovery service, encore, primo, summon, and worldcat local. whereas encore uses an approach similar to federated search, performing live searches of databases, the other discovery tools build a single unified index. the single unified index requires libraries to send their catalog data and local information to the vendor for update, so these discovery tools may fall behind in reflecting up-to-the-minute accuracy in local holdings; federated search, by contrast, does real-time searching and does not lag behind in displaying current information. both approaches are limited in what they cover.
both need permission from content providers for inclusion in the unified index or to develop a connection to article databases for real-time searching. discovery tools that do not have their own unified index or real-time searching capability provide web-scale searching through other means. for instance, vufind has developed connectors to application programming interfaces (apis) by serials solutions or oclc to pull search results from summon and worldcat local. encore not only developed its own real-time connection to electronic databases but is enhancing its web-scale search by incorporating the unified index from other discovery tools such as the ebsco discovery service. aquabrowser is augmented by 360 federated search for the same purpose. despite those possibilities, the authors did not find article-level retrieval in the sample discovery tools other than the main five mentioned above.

comparing the coverage of each tool’s web-scale index can be challenging. ebsco, summon, and worldcat local publicize their content coverage on the web while primo and encore only share this information with their customers. this makes it hard to compare and evaluate the content coverage without contacting vendors and asking for that information. at present, none of the five discovery tools (ebsco discovery service, encore, primo, summon, and worldcat local) can boast 100% coverage of all library resources. in fact, none of the internet search engines, including google or google scholar, can retrieve 100% of all resources. therefore web-scale searching is more a goal than a possibility. apart from political and economic reasons, this is in part due to the nonbibliographic structure of the contents in databases such as scifinder and some others.
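the trade-off between a unified index and federated search described above can be illustrated with a toy python sketch. it is a simplification under stated assumptions, not a description of how any vendor's product works: the two sources, their records, and the function names are hypothetical, and "searching" is reduced to exact word matching.

```python
from collections import defaultdict


def build_unified_index(sources):
    """Harvest every record ahead of time into one inverted index."""
    index = defaultdict(set)
    for source_name, records in sources.items():
        for record_id, text in records.items():
            for word in text.lower().split():
                index[word].add((source_name, record_id))
    return index


def search_unified(index, term):
    """Answer from the pre-built index; fast, but only as fresh as the harvest."""
    return sorted(index.get(term.lower(), set()))


def search_federated(sources, term):
    """Query each source live at search time and merge the hits."""
    hits = []
    for source_name, records in sources.items():
        for record_id, text in records.items():
            if term.lower() in text.lower().split():
                hits.append((source_name, record_id))
    return sorted(hits)


# Hypothetical sources standing in for a catalog and an article database.
sources = {
    "catalog": {"b1": "library automation history"},
    "articles": {"a1": "discovery tools in the library"},
}

index = build_unified_index(sources)        # snapshot taken at harvest time
sources["catalog"]["b2"] = "library wikis"  # holding added after the harvest

unified_hits = search_unified(index, "library")
federated_hits = search_federated(sources, "library")
print("unified:  ", unified_hits)
print("federated:", federated_hits)
```

in this sketch the record added after the harvest is visible only to the live search, which mirrors the lag in local holdings noted above; what the sketch leaves out is the cost side of federated search, since every query must wait on every source.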
one stop searching is still a work in progress: discovery tools provide students with a quick and simple way to retrieve a large but still incomplete list of resources held by a library. for more in-depth research, students are still encouraged to search the catalog, discipline-specific databases, and digital repositories separately.

state of the art interface—all the discovery tools are very similar in appearance to amazon.com. some are better than others. this study did not rate each discovery tool on a scale and thus did not distinguish fine differences in appearance. rather, each discovery tool was given a “yes” or “no.” the designation was based on subjective judgment. all the discovery tools received “yes” because they are very similar in appearance.

enriched content—all the discovery tools have embedded book cover images or video jacket images, but some display more, such as ratings and rankings, user-supplied or commercially available reviews, overviews, previews, comments, descriptions, title discussion, excerpts, or age suitability, just to name a few. a discovery tool may display enriched content by default out of the box, but some may need to be customized to include it. the following is a list of the enriched content the authors found implemented in each discovery tool in the sample. the number in the last column indicates how many types of enriched content were found in the discovery tool at the time of the study. bibliocommons and aquabrowser stand out from the rest and made the top two on the list based on the number of types of enriched content from noncataloging sources (see figure 1). it is debatable how much nontraditional data a discovery tool should incorporate into its display. how useful such data is for users warrants a separate discussion.

faceted navigation—faceted navigation has become a standard feature in discovery tools over the last two years.
it allows users to further divide search results into subsets based on predetermined terms. facets come from a variety of fields in marc records. some discovery tools have more facets than others. the most commonly seen facets include location or collections, publication dates, formats, author, genre, and subjects. faceted navigation is highly configurable, as many discovery tools allow libraries to decide on their own facets. faceted navigation has become an integral part of a discovery tool.

simple keyword search box on the starting page with a link to advanced search—the original idea is to allow a library’s user interface to resemble google by displaying a simple keyword search box with a link to advanced search at the starting page. most discovery tools provide the flexibility for libraries to choose or reject this option. however, many librarians find this approach unacceptable, as they feel it lacks precision in searching and thus may mislead users. because the keyword box is highly configurable and it is up to the library to decide how to present it, many libraries have added a pull-down menu with options to search keywords, authors, titles, and locations. in doing so, the original intention of a google-like simple search box is lost. only a few libraries follow the google-like box style at the starting page; most have altered the simple keyword search box to include a drop-down menu or radio buttons, so the box is neither simple nor limited to keyword search. nevertheless, this study gave all the discovery tools a “yes.” all the systems are capable of this feature even though libraries may choose not to use it.
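the faceted navigation described above amounts to counting the values of a chosen field across the current result set and then filtering on the value a user selects. the following is a minimal sketch of that idea, not the implementation of any particular discovery tool; the record fields (`format`, `location`) and the sample data are assumptions made for illustration.

```python
from collections import Counter

# illustrative result set; a real tool would derive these fields from marc records
records = [
    {"title": "romeo and juliet", "format": "book", "location": "main"},
    {"title": "romeo and juliet", "format": "dvd", "location": "media"},
    {"title": "hamlet", "format": "book", "location": "main"},
]


def facet_counts(results, field):
    """Count how many results carry each value of the given field."""
    return Counter(r[field] for r in results if field in r)


def apply_facet(results, field, value):
    """Narrow the result set to records whose field equals the chosen value."""
    return [r for r in results if r.get(field) == value]


format_facet = facet_counts(records, "format")
books = apply_facet(records, "format", "book")
location_facet_for_books = facet_counts(books, "location")
```

because each facet is computed from the current result set, selecting one facet value changes the counts shown for the remaining facets.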
rank | discovery tool | enriched content | total
1 | bibliocommons | cover images, tags, similar title, private note, notices, age suitability, summary, quotes, video, comments, and rating | 11
2 | aquabrowser | cover images, previews, reviews, summary, excerpts, tags, author notes & sketches, full text from google, rating/ranking | 9
3 | enterprise | cover images, reviews, google previews, summary, excerpts | 5
4 | axiell arena | cover images, tags, reviews, and title discussion | 4
4 | vufind | cover images, tags, reviews, comments | 4
5 | primo | cover images, tags, previews | 3
5 | worldcat local | cover images, tags, reviews | 3
6 | encore | cover images, tags | 2
6 | visualizer | cover images, reviews | 2
6 | summon | cover images, reviews | 2
7 | blacklight | cover images | 1
7 | ebsco discovery service | cover images | 1
7 | endeca | cover images | 1
7 | extensible catalog | cover images | 1

figure 1. the ranked list of enriched content in discovery tools.

simple keyword search box on every page—the feature enables a user to start a new search at every step of navigation in the discovery tool. most of the discovery tools provide such a box at the top of the screen as users navigate through the search results and record displays; the exceptions are extensible catalog and enterprise by sirsidynix. the feature is missing from the former, while the latter has it except when displaying bib records in a pop-up box.

relevancy—traditionally, relevancy is uniformly based on a computer algorithm that calculates the frequency and relative position of a keyword (field weighting) in a record and displays the search results based on the final score. other factors have never been a part of the decision in the display of search results. in the discussion on next-generation catalogs, relevancy based on circulation statistics and other factors came up as a desirable possibility, but no discovery tool had met this challenge until now.
primo by ex libris is the only one among the discovery tools under investigation that can sort the final results by popularity. “primo’s popularity ranking is calculated by use. this means that the more an item record has been clicked and viewed, the more popular it is.”20 even though those are not real circulation statistics, this is considered to be a revolutionary step and a departure from traditional relevancy. three years ago none of the discovery tools provided this option.21 to make relevancy ranking even more sophisticated, scholarrank, another service by ex libris, can work with primo to sort the search results based not only on a query match but also on an item’s value score (its usage and number of citations) and a user’s characteristics and information needs. this shows the possibility of more advanced relevancy ranking in discovery tools. other vendors will most likely follow in the future, incorporating more sophistication in their relevancy algorithms.

spell checker/“did you mean . . . ?”—the most commonly observed way of correcting a misspelling in a query is “did you mean . . . ?” but there are other variations providing the same or similar services. some of those variations are very user-friendly. the following is a list of different responses when a user enters misspelled words (see figure 2). “xxx” represents the keyword being searched.

discovery tools | responses for misspelled search words | notes
aquabrowser | did you mean to search: xxx, xxx, xxx? | the suggested words are hyperlinks to execute new searches.
axiell arena | your original search for xxx has returned no hits. the fuzzy search returned n hits. | automatically displays a list of hits based on fuzzy logic. “n” is a number.
bibliocommons | did you mean xxx (n results)? | displays suggested word along with the number of results as a link.
blacklight | no records found. | no spell checker, but possible to add by local technical team.
ebsco discovery service | results may also be available for xxx. | the suggested word is a link to execute a new search.
encore | did you mean xxx? | the suggested word is a link to execute a new search.
endeca | did you mean xxx? | the suggested word is a link to execute a new search.
enterprise | did you mean xxx? | the suggested word is a link to execute a new search.
extensible catalog | sorry, no results found for: xxx. | no spell checker, but possible to add by local technical team.
primo | did you mean xxx? | the suggested word is a link to execute a new search.
summon | did you mean xxx? | the suggested word is a link to execute a new search.
visualizer | did you mean xxx? | the suggested word is a link to execute a new search.
vufind | 1. no results found in this category. search alternative words: xxx, xxx, xxx. 2. perhaps you should try some spelling variation: xxx, xxx, xxx. 3. your search xxx did not match any resources. what should i do now? a list of suggestions including checking a web dictionary. | 1. alternative words are links to execute new searches. 2. suggested words are links to execute new searches. 3. suggestions what to do next.
worldcat local | did you mean xxx? | the suggested word is a link to execute a new search.

figure 2. spell checker.

most of the discovery tools on the list provide this feature; the exceptions are blacklight and extensible catalog. open-source solutions sometimes provide a framework to which features can be added. this leaves many possibilities for local developers to add and develop. for instance, a dictionary or spell checker may be easily installed even if a discovery tool does not come with one out of the box. this feature may be configurable.
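as an illustration of how a local team might add a basic “did you mean . . . ?” response to a tool that lacks one, the sketch below uses python's standard `difflib` module to suggest indexed terms that are close to a misspelled query. it is only one possible approach under stated assumptions: the `did_you_mean` function, the vocabulary, and the 0.6 similarity cutoff are illustrative choices, not part of any discovery tool studied here.

```python
import difflib


def did_you_mean(query, vocabulary, max_suggestions=3):
    """Return close matches from the vocabulary, or [] if the query is already a known term."""
    if query in vocabulary:
        return []
    return difflib.get_close_matches(query, vocabulary, n=max_suggestions, cutoff=0.6)


# hypothetical list of terms drawn from the search index
indexed_terms = ["library", "catalog", "discovery", "metadata"]

suggestions = did_you_mean("discovry", indexed_terms)
if suggestions:
    message = "did you mean " + ", ".join(suggestions) + "?"
else:
    message = "no records found."
```

a production spell checker would normally draw its vocabulary from the search index itself and turn each suggestion into a link that runs a new search, as most of the tools in figure 2 do.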
recommendation—amazon has one of those search engines with a recommendation system such as “customers who bought item a also bought item b.” the ecommerce recommendation algorithms analyze the activities of shoppers on the web and build a database of buyer profiles. the recommendations are made based on shopper behavior. applied to library content, this could become “readers who were interested in item a were also interested in item b.” however, most discovery tools do not have such a recommendation system. instead, they have adopted different approaches. most discovery tools make recommendations from bibliographic data in marc records, such as subject headings for similar items.

primo is one of the few discovery tools with a recommendation system similar to those used by amazon and other internet commercial sites. its bx article recommender service is based on usage patterns collected from its link resolver, sfx. developed by ex libris, bx is an independent service that integrates well with primo but can also serve as an add-on function for other discovery tools. bx is an excellent example of how discovery tools can suggest new leads and directions for scholars in their research.

the authors counted all the discovery tools that provide some kind of recommendation, regardless of whether the technological approach uses marc data or algorithms. ten out of fourteen discovery tools provide this feature in various forms (see figure 3): axiell arena, bibliocommons, ebsco discovery service, encore, endeca, extensible catalog, primo, summon, worldcat local, and vufind. the following are some of the recommendations found in those discovery tools. the authors did not find any recommendation in the libraries that use aquabrowser, enterprise, visualizer, or blacklight.
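the “readers who were interested in item a were also interested in item b” idea can be sketched as a simple co-usage count. this is an illustrative sketch only and is not the algorithm used by bx or amazon; the session data and the `build_co_usage` and `also_interested_in` names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_co_usage(sessions):
    """Count, for every item, how often each other item appears in the same session."""
    co_usage = {}
    for items in sessions:
        for a, b in combinations(sorted(set(items)), 2):
            co_usage.setdefault(a, Counter())[b] += 1
            co_usage.setdefault(b, Counter())[a] += 1
    return co_usage


def also_interested_in(co_usage, item, limit=3):
    """Rank co-used items by count, breaking ties alphabetically for stable output."""
    counts = co_usage.get(item, {})
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [other for other, _ in ranked[:limit]]


# hypothetical usage sessions, e.g. items requested together by one reader
sessions = [
    ["item a", "item b"],
    ["item a", "item b", "item c"],
    ["item b", "item d"],
]

co_usage = build_co_usage(sessions)
recommendations_for_a = also_interested_in(co_usage, "item a")
```

a recommender built from marc data would instead compare subject headings or other bibliographic fields, which is the approach most of the tools in this study take.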
discovery tools | language used for recommending or linking to related items
axiell arena | “see book recommendations on this topic”; “who else writes like this?”
bibliocommons | “similar titles & subject headings & lists that include this title”
ebsco discovery service | “find similar results”
encore | “other searches you may try”; “additional suggestions”
endeca | “recommended titles for. . . . view all recommended titles that match your search”; “more like this”
extensible catalog | “more like this”; “searches related to . . . ”
primo | “suggested new searches by this author”; “suggested new searches by this subject”; “users interested in this article also expressed an interest in the following:”
summon | “search related to . . . ”
worldcat local | “more like this”; “similar items”; “related subjects”; “user lists with this item”
vufind | “more like this”; “similar items”; “suggested topics”; “related subjects”

figure 3. language used for recommendation.

some discovery tool recommendations are designed in a more user-friendly manner than others. most recommendations exist exclusively for items. ideally, a discovery tool should provide an article recommendation system like ex libris’ bx usage-based service that will show users the most frequently used and most popular articles. at the time of this evaluation, no discovery tool except primo had incorporated an article recommendation system. research is needed to evaluate how patrons use recommendation services and whether they find recommendations beneficial in discovery tools.

user contribution—traditionally, bibliographic data has been safely guarded by cataloging librarians for quality control. it has been unthinkable that users would be allowed to add data to library records. the internet has brought new perspectives on this issue. half of the discovery tools (7) under evaluation provide this feature to varying degrees (see figure 4).
designed primarily for public libraries, bibliocommons seems the most open to user-supplied data among all the discovery tools. many other discovery tools (7) allow users to contribute tags and reviews. all the discovery tools allow librarians to censor user-supplied data before releasing it for public display. the following figure is a summary of the types of data these discovery tools allow users to enter.

ranking | discovery tool | user contribution
1 | bibliocommons | tags, similar title, private note, notices, age suitability, summary, quotes, video, comments, and ratings (10)
2 | aquabrowser | tags, reviews, and ratings/rankings (3)
2 | axiell arena | tags, reviews, and title discussions (3)
2 | vufind | tags, reviews, comments (3)
3 | primo | tags and reviews (2)
3 | worldcat local | tags and reviews (2)
4 | encore | tags (1)
5 | blacklight | (0)
5 | endeca | (0)
5 | enterprise | (0)
5 | extensible catalog | (0)
5 | summon | (0)
5 | visualizer | (0)

figure 4. discovery tools based on user contribution.

past research indicates that folksonomies or tags are highly useful.22 they complement library-controlled vocabularies, such as library of congress subject headings, and increase access to library collections. a few discovery tools allow user-entered tags to form “word clouds.” the relative importance of tags in a word cloud is emphasized by font color and size. a tag list is another way to organize and display tags. in both cases, tags are hyperlinked to a relevant list of items. some tags serve as keywords to start new searches, while others narrow search results. only four discovery tools, aquabrowser, encore, primo, and worldcat local, provide both tag clouds and lists. bibliocommons provides only tag lists for the same purpose. the rest of the discovery tools do not have either. one setback of user-supplied tags for subject access is their incomplete nature.
they may lead users to partial retrieval of information, as users add tags only to items that they have used. the coverage is not systematic and inclusive of all collections. therefore data supplied by users in discovery tools remains controversial. it is possible to seed systems with folksonomies using services like librarything for libraries, which could reduce the impact of this issue.

rss feed/email alerts—this feature can automatically send a list of new library resources to users based on their search criteria. it can be useful for experienced researchers or frequent library users. some discovery tools may use email alerts as well. eight out of fourteen discovery tools in this evaluation provide rss feeds: aquabrowser, axiell arena, ebsco discovery service, endeca, enterprise, primo, summon, and vufind. an rss feed can be added as a plug-in in some discovery tools if it does not come as part of the base system.

integration with social networking sites—as most college students participate in social networking sites, this feature provides an easy way for them to share resources on those sites. users can click an icon in the discovery tool to place a link to a resource and share it with friends on facebook, twitter, delicious, and many other social networking sites. nine out of the fourteen discovery tools provide this feature: aquabrowser, axiell arena, bibliocommons, ebsco discovery service, encore, endeca, primo, worldcat local, and extensible catalog. some discovery tools provide integration possibilities with many more social networking sites than others. so far, the interaction between discovery tools and social networking sites is limited to sharing resources. social networking sites should be carefully evaluated for the possibility of integrating some of their popular features into discovery tools.
persistent link—this is also called permanent link or permurl. not all the links displayed in a browser location box are persistent links; therefore, some discovery tools specifically provide a link in the records for users to copy and keep. five out of fourteen discovery tools explicitly list this link in records: aquabrowser, axiell arena, blacklight, ebsco discovery service, and worldcat local. the authors marked a system as “no” when a permanent link is not prominently displayed in a discovery tool. in other words, only those discovery tools that explicitly provide a persistent link are counted as “yes.” however, the url in a browser’s location box during the display of a record may serve as a persistent link in some cases. for instance, vufind does not provide a permanent url in the record, but indicates on the project site that the url in the location box is a persistent link.

auto-completion/stemming—when a user types keywords in the search box, the discovery tool will supply a list of words or phrases that she or he can readily choose from. this is a highly useful feature that google excels at. stemming not only automatically completes the spelling of a keyword, but also supplies a list of phrases that point to existing items. the authors found this feature in six out of fourteen discovery tools: axiell arena, endeca, enterprise, extensible catalog, summon, and worldcat local.

mobile interface—the terms “mobile compatible” and “mobile interface” denote two different concepts. a mobile interface is a simplified version of the normal browser interface of a discovery tool, optimized for use on mobile phones, and the authors counted only those discovery tools that have a separate mobile interface. a discovery tool may be mobile friendly or compatible and does not necessarily need a separate mobile interface.
many discovery tools, such as ebsco, can detect a request from a mobile phone and automatically direct the request to the mobile interface. eleven out of fourteen claim to provide a separate mobile interface. blacklight, enterprise, and extensible catalog do not seem to have a separate mobile interface even though they may be mobile friendly.

frbr—frbr groupings denote the relationships between work, manifestation, expression, and items. for instance, a search will retrieve not only a title, but also different editions and formats of the work. only three discovery tools can display frbr relationships: extensible catalog (open source), primo by ex libris, and worldcat local by oclc. so far, most discovery tools are not capable of displaying the manifestations and expressions of a work in a meaningful way. from the user’s point of view, this feature is highly desirable. figure 5 is a screenshot from primo showing a large number of different adaptations of the work “romeo and juliet.” figure 6 displays the same intellectual work in different manifestations such as dvd, vhs, books, and more.

figure 5. display of frbr relationships in primo.

figure 6. different versions of the same work in primo.

summary

the following are the summary tables of our comparison and evaluation. proprietary and open-source programs are listed separately in these tables. the total number of features the authors found in a particular discovery tool is displayed at the end of the column. proprietary discovery tools seem to have more advanced characteristics of a modern discovery tool than their open-source counterparts. the open-source program blacklight displays fewer advanced features, but seems flexible for users to add features. see figures 7, 8, and 9.

figure 7. proprietary discovery tools.
feature | aquabrowser | axiell arena | bibliocommons | ebsco/eds | encore | endeca
1. single point of search | no | no | no | yes | yes | no
2. state of the art interface | yes | yes | yes | yes | yes | yes
3. enriched content | yes | yes | yes | yes | yes | yes
4. faceted navigation | yes | yes | yes | yes | yes | yes
5. simple keyword search box on the starting page | yes | yes | yes | yes | yes | yes
6. simple keyword search box on every page | yes | yes | yes | yes | yes | yes
7. relevancy | no | no | no | no | no | no
8. spell checker/“did you mean . . . ?” | yes | yes | yes | yes | yes | yes
9. recommendation | no | yes | yes | yes | yes | yes
10. user contribution | yes | yes | yes | no | yes | no
11. rss | yes | yes | no | yes | no | yes
12. integration with social network sites | yes | yes | yes | yes | yes | yes
13. persistent links | yes | yes | no | yes | no | no
14. stemming/autocomplete | no | yes | no | no | no | yes
15. mobile interface | yes | yes | yes | yes | yes | yes
16. frbr | no | no | no | no | no | no
total | 11/16 | 13/16 | 10/16 | 12/16 | 11/16 | 11/16

feature | enterprise | primo | summon | visualizer | worldcat local
1. single point of search | no | yes | yes | no | yes
2. state of the art interface | yes | yes | yes | yes | yes
3. enriched content | yes | yes | yes | yes | yes
4. faceted navigation | yes | yes | yes | yes | yes
5. simple keyword search box on the starting page | yes | yes | yes | yes | yes
6. simple keyword search box on every page | no | yes | yes | yes | yes
7. relevancy | no | yes | no | no | no
8. spell checker/“did you mean . . . ?” | yes | yes | yes | yes | yes
9. recommendation | no | yes | yes | no | yes
10. user contribution | no | yes | no | no | yes
11. rss | yes | yes | yes | no | no
12. integration with social network sites | no | yes | no | no | yes
13. persistent links | no | no | no | no | yes
14. stemming/autocomplete | yes | no | yes | no | yes
15. mobile interface | no | yes | yes | yes | yes
16. frbr | no | yes | no | no | yes
total | 7/16 | 14/16 | 11/16 | 7/16 | 14/16

figure 8. proprietary discovery tools (continued).

feature | blacklight | extensible catalog | vufind
1. one point of search | no | no | no
2. state of the art interface | yes | yes | yes
3. enriched content | yes | yes | yes
4. faceted navigation | yes | yes | yes
5. simple keyword search box on the starting page | yes | yes | yes
6. simple keyword search box on every page | yes | yes | yes
7. relevancy | no | no | no
8. spell checker/“did you mean . . . ?” | no | no | yes
9. recommendation | no | yes | yes
10. user contribution | no | no | yes
11. rss | no | no | yes
12. integration with social network sites | no | yes | no
13. persistent links | yes | no | no
14. stemming/auto-complete | no | yes | no
15. mobile interface | no | no | yes
16. frbr | no | yes | no
total | 6/16 | 9/16 | 10/16

figure 9. free and open-source discovery tools.

as one-stop searching is the core of a discovery tool, this consideration placed five discovery tools above the rest: encore, ebsco discovery service, primo, summon, and worldcat local (see figure 10). these five are web-scale discovery services. all of them use their native unified index except encore, which has incorporated the ebsco unified index in its search. despite great progress made in the past three years in one-stop searching, none of the discovery tools can truly search across all library resources; all of them have some limitations as to the coverage of content. each unified index may cover different databases as well as overlap the others in many areas. one possible solution may lie in a hybrid approach that combines a unified index with federated search (also called real-time discovery). those old and new technologies may work well when complementing each other. it remains an open question whether libraries will ever have one-stop searching in its true sense.

discovery tools | one-stop searching
encore | yes
ebsco discovery service | yes
primo | yes
summon | yes
worldcat local | yes

figure 10. the discovery tools capable of one stop searching.

it is also worth mentioning that one-stop searching is a vital and central piece of discovery tools. those discovery tools without a native unified index or connectors to databases for real-time searching are at a disadvantage.
therefore discovery tools that do not provide web-scale searching are investigating various possibilities to incorporate one-stop searching. some are drawing on the unified indexes of those discovery tools that have them through connectors to the application programming interfaces (apis) of those products. for instance, vufind includes connectors to the apis of a few other systems that have a unified index or vast resources, such as summon and worldcat. blacklight may provide one-stop searching through the primo api. such a practice may present other problems, such as calculating relevancy ranking across resources that may not live in the same centralized index, thus not achieving fully balanced relevancy ranking. nevertheless, discovery tool developers are working hard to achieve one-stop searching. as a unified index can be shared across discovery tools, in the next few years more and more discovery services may offer one-stop searching.

based on the count of the sixteen criteria in the checklist, we ranked primo and worldcat local as the top two discovery tools. based on our criteria, primo has two unique features that make it stand out: relevancy enhanced by usage statistics and value score, and the frbr relationship display. worldcat local and extensible catalog are the other two discovery tools that can display frbr relationships (see figure 11).

rank | discovery tools | number of advanced features
1 | primo and worldcat local | 14/16
2 | axiell arena | 13/16
3 | ebsco discovery service | 12/16
4 | aquabrowser, encore, and endeca | 11/16
5 | bibliocommons, summon, and vufind | 10/16
6 | extensible catalog | 9/16
7 | enterprise and visualizer | 7/16
8 | blacklight | 6/16

figure 11. ranked discovery tools.

limitations

as discovery tools are going through new releases and improvements, what is true today may be false tomorrow.
discovery tools constantly improve and evolve, and many features are not included in this evaluation, such as integration with google maps for the location of an item and user-driven acquisitions. innovations are added to discovery tools constantly. this study covers only the most common features that the library community has agreed a discovery tool should have.

some open-source discovery tools may provide a skeleton of an application that leaves the code open for users to develop new features. therefore different implementations of an open-source discovery tool may encompass totally different features that are not part of the core application. for instance, the university of virginia developed virgo based on blacklight, adding many advanced features. thus it is quite a challenge to distinguish what comes with the software from what is a local development.

this study focused on the user interface of discovery tools. not included are the content coverage, application administration, and searching capability of the discovery tools. those three are important factors when choosing a discovery tool.

conclusion

search technology has evolved far beyond federated searching. the concept of a “next generation catalog” has merged with this idea and spawned a generation of discovery tools bringing almost google-like power to library searching. the problems facing libraries now are the intelligent selection of a tool that fits their contexts, and structuring a process to adopt and refine that tool to meet the objectives of the library. our findings indicate that primo and worldcat local have better user interfaces, displaying more advanced features of a next generation catalog than their peers. for rul, ebsco discovery service (eds) provides something approaching the ease of google searching from either a single search box or a very powerful advanced search.
being aware of the limitations noted above, rider’s libraries elected to continue displaying traditional search options in addition to what we’ve branded “library one search.” another lesson from this process is that, when negotiating for a vendor-hosted test, libraries must be sure that the test period begins when the configuration is complete rather than when the data load begins. all phases of the project took far more time than anticipated. the client institution’s implementation coordinator or team needs to review progress daily and communicate often with the vendor-based implementation team.

with the evaluative framework this study provides, libraries moving toward discovery tools should consider the changing capabilities of the available discovery tools to make informed choices.

references

1. jason vaughan, “investigations into library web-scale discovery services,” information technology & libraries 31, no. 1 (2012): 32–33, http://dx.doi.org/10.6017/ital.v31i1.1916.
2. sharon q. yang and melissa a. hofmann, “next generation or current generation? a study of the opacs of 260 academic libraries in the usa and canada,” library hi tech 29, no. 2 (2011): 266–300.
3. melissa a. hofmann and sharon q. yang, “‘discovering’ what’s changed: a revisit of the opacs of 260 academic libraries,” library hi tech 30, no. 2 (2012): 253–74.
4. alexander pope, “alexander pope quotes,” http://www.brainyquote.com/quotes/authors/a/alexander_pope.html.
5. f. william chickering, “linking information technologies: benefits and challenges,” proceedings of the 4th international conference on new information technologies, budapest, hungary, december 1991, http://web.simmons.edu/~chen/nit/nit%2791/019-chi.htm.
6. kristin antelman, emily lynema, and andrew k. pace, “toward a twenty-first century library catalog,” information technology & libraries 25, no. 3 (2006): 128–39, http://dx.doi.org/10.6017/ital.v25i3.3342.
7. marshall breeding, “plotting a new course for metasearch,” computers in libraries 25, no. 2 (2005): 27–29.
8. judith carter, “discovery: what do you mean by that?” information technology & libraries 28, no. 4 (2009): 161–63, http://dx.doi.org/10.6017/ital.v28i4.3326.
9. priscilla caplan, “on discovery tools, opacs and the motion of library language,” library hi tech 30, no. 1 (2012): 108–15.
10. carol pitts diedrichs, “discovery and delivery: making it work for users,” serials librarian 56, no. 1–4 (2009): 79, http://dx.doi.org/10.1080/03615260802679127.
11. alex a. dolski, “information discovery insights gained from multipac, a prototype library discovery system,” information technology & libraries 28, no. 4 (2009): 173, http://dx.doi.org/10.6017/ital.v28i4.3328.
12. marshall breeding, “the state of the art in library discovery,” computers in libraries 30, no. 1 (2010): 31–34.
13. jennifer l. fabbi, “focus as impetus for organizational learning,” information technology & libraries 28, no. 4 (2009): 164–71, http://dx.doi.org/10.6017/ital.v28i4.3327.
14. douglas way, “the impact of web-scale discovery on the use of a library collection,” serials review 36, no. 4 (2010): 214–20, http://dx.doi.org/10.1016/j.serrev.2010.07.002.
15. marshall breeding, “library technology guides: discovery products,” http://www.librarytechnology.org/discovery.pl.
16. ibid.
17. sharon q. yang and kurt wagner, “evaluating and comparing discovery tools: how close are we towards next generation catalog?” library hi tech 28, no. 4 (2010): 690–709.
18. yang and hofmann, “next generation or current generation?” 266–300.
19. melissa a. hofmann and sharon q. yang, “how next-gen r u? a review of academic opacs in the united states and canada,” computers in libraries 31, no. 6 (2010): 26–29.
20. brown library of virginia western community college, “primo-frequently asked questions,” http://www.virginiawestern.edu/library/primo-faq.php#popularity_ranking.
21. yang and wagner, “evaluating and comparing discovery tools,” 690–709.
22. yanyi lee and sharon q. yang, “folksonomies as subject access—a survey of tagging in library online catalogs and discovery layers,” paper presented at ifla post-conference “beyond libraries-subject metadata in the digital environment and semantic web,” tallinn, estonia, 18 august 2012, http://www.nlib.ee/html/yritus/ifla_jarel/papers/4-1_yan.docx.

letter from the editor
kenneth j. varnum
information technology and libraries | june 2018 1

in this june 2018 issue, we continue our celebration of ital’s 50th year with a summary by editorial board member sandra shores of the articles published in the 1970s, the journal’s first full decade of publication. the 1970s are particularly pivotal in library technology, as they mark the introduction of the personal computer, as a hobbyist’s tool, to society. the web is still more than a decade away, but the seeds are being planted.

with this issue, we introduce a new look for the journal, thanks to the work of lita’s web coordinating committee, and in particular kelly sattler (also a member of the editorial board), jingjing wu, and guy cicinelli. the new design is much easier on the eyes and more legible, and sports a new graphic identity for ital.

board transitions

june marks the changing of the editorial board.
a significant number of board members’ terms expire this june 30, and i’d like to take this opportunity to thank those departing members for their years of service to information technology and libraries, and the support they have offered me this year as i began as editor. each has ably and generously contributed to the journal’s growth over the last years, and i thank them for their service to the journal and to ital:
• mark cyzyk (johns hopkins university)
• mark dehmlow (notre dame university)
• sharon farnel (university of alberta)
• kelly sattler (michigan state university)
• sandra shores (university of alberta)

these are big shoes to fill, but i am excited about the new members who have been appointed for two-year terms beginning july 1, 2018. in march, we extended a call for volunteers for two-year terms on the editorial board. we received almost 50 applications, and ultimately added seven new members:
• steven bowers (wayne state university)
• kevin ford (art institute of chicago)
• cinthya ippoliti (oklahoma state university)
• ida joiner (independent consultant)
• breanne kirsch (university of south carolina upstate)
• michael sauers (do space, omaha, nebraska)
• laurie willis (san jose public library)

readership survey summary
over the past three months, we ran a survey of the ital readership to understand in a bit more detail who you are, collectively. the survey received 81 complete responses out of about 11,000 views of pages with the survey link on them. here are some brief summary results:
• nearly half (46%) of respondents have attended at least one lita event (in-person or online).
• three quarters (75%) of respondents are from academic libraries. public, special, and lis programs make up an additional 20%.
• the majority (56%) are librarians, with the remaining spread across a number of other roles.
• almost two thirds (63%) of respondents have never been lita members, a quarter (25%) are current members, and the remainder are former members.
• about four fifths (81%) of responses came from the current issue (either the table of contents or individual articles).

an invitation
what can you share with your library colleagues in relation to technology? if you have interesting research about technology in a library setting, or are looking for a venue to share your case study, get in touch with me at varnum@umich.edu.

sincerely,
kenneth j. varnum, editor
varnum@umich.edu
june 2018
https://doi.org/10.6017/ital.v37i2.10571

multimedia will have a profound effect on libraries during the next decade. this rapidly developing technology permits the user to combine digital still images, video, animation, graphics, and audio. it can be delivered in a variety of finished formats, including streaming video on the web, video on dvd/vcd, and embedded digital objects within a web page or presentation software such as powerpoint; it can also be used within graphic designs or printed as hard copy. this article examines the elements of multimedia creation, as well as requirements and recommendations for implementing a multimedia facility in the library.

the term multimedia, which some may remember being used in the early 1970s as the name for slide shows set to music, now is used to describe “a number of diverse technologies that allow visual and audio media to be combined in new ways for the purpose of communicating.”1 almost all personal computers sold today are capable of viewing multimedia; many can, with minor modifications, also create multimedia. one of the most important features of multimedia is its flexibility. multimedia creation has several distinct elements: inputs, processes performed on those inputs, and outputs (see figure 1). each element can be described as follows.
• inputs: new video can be recorded, or existing video stored on a hard disk, cd/dvd, or tape can be imported. the same is true of audio, with the added flexibility of creating soundtracks or sound effects later, during the editing process. digital still images can be used, either shot on a camera or created by scanning an existing picture. digital artwork or animated sequences created in other software also can be brought in.
• processing: regardless of the source, these digital inputs are loaded into the editing software. at this stage, the user will select and arrange the images and sounds, and the software may permit special effects to be created. in addition, the editing software may compress the file so that it is easier to use than the large files produced by raw video and audio recording.
• outputs: at this point, the user has more choices to make. the new multimedia file can be sent to a program that will encode it for streaming video in any one of a variety of popular formats, such as windows media, realmedia, or clipstream. then it can be mounted on a web site (either a regular page or within courseware such as webct or blackboard), or the file could be burned onto a cd or dvd, or it could be used within presentation software such as microsoft powerpoint. or the output file from the editing process could be encoded and embedded so that it is an avatar running as part of a web page with a product such as rovion bluestream. the possibilities are nearly endless.

all of this is made possible by advances in technology on a variety of fronts. one of the happy anomalies in technology is that greater performance is frequently accompanied by lower costs. this is certainly the case with much of the activity surrounding multimedia.
the following factors have fostered advances in multimedia:
• increase in processing power and decrease in cost of computer hardware;
• quality and affordability of video equipment;
• compression of multimedia files;
• consumer broadband internet access; and
• current multimedia editing software.

the first two technology factors concern the equipment involved in multimedia production. leading off is the familiar, ever-increasing speed of processors and improved memory and hard-drive space, all delivered for less money. this trend is something that many people take for granted, but a reality check is sometimes in order. the processor in the typical desktop machine on advertised special today is approximately forty-four times as fast as the first pentium processor sold ten years ago, and is equipped with sixteen times as much ram and 117 times as much hard-drive space, all at 20 percent of the cost of the old machine (not even adjusted for inflation!). the second factor is the incredible quality available in consumer-market video equipment at reasonable costs. while the images produced with consumer-grade video would not play well at the local megaplex movie theater, they look very good on the small screens found on computers, televisions, and classroom projectors. the third factor is that tremendous compression of multimedia files can be achieved during the editing process. an incoming raw-video file (in the standard .avi format) can be compressed with editing, encoding, and dedicated third-party compression software to an incredible 1 to 2 percent of its original size, and it will still retain very good quality as a digital object on the web and in other desktop viewing applications. the fourth factor is extremely critical for the success of multimedia web applications. home access is shifting away from dial-up access to broadband, with its greatly increased transfer rates.
half of all united states homes with internet access are already using broadband, and the forecast is for steady increase in these numbers.2 although not all broadband is created equal, it is all significantly faster than dial-up access. the final technology factor concerns the software that is currently available to the multimedia web developer. a developer can achieve some quite professional results with even the most basic products, and then can grow into more complex software that supports increasing levels of expertise. once again, this software is being sold in the price range that typical consumers can afford.

distinctive expertise: multimedia, the library, and the term paper of the future | gregory a. mitchell
information technology and libraries | march 2005
gregory a. mitchell (mitchellg@utpa.edu) is assistant director, resource management at the university of texas—pan american library, edinburg, texas.

small really is beautiful
creating a multimedia lab in the library need not be a large, complex undertaking. in fact, it can be very low cost and as simple as a single workstation. it is also scalable, allowing the library to start small and build in complexity and cost as time, money, and human resources will permit. at the bare-bones minimum, a multimedia lab would consist of a workstation with the software necessary for acquiring, editing, and outputting the files. for practical purposes, though, the workstation should be equipped with a network connection, a cd/dvd burner, a scanner, and a webcam with microphone. another very useful option is an analog-digital bridge device, which enables the capture of analog input (such as vhs tape) into digital files for the editor. to achieve better-quality video when shooting original content, a digital-video camera, tripod, wireless microphone, and portable light kit would be recommended.
since more time typically is spent at the editing station than with the camera, the lab can be expanded with additional workstations before investing in another camera. experience at the author’s institution has shown that it is possible to operate a lab with ten workstations and only three video cameras and three still cameras. finally, output from the editing process will likely be printed, so a photo-quality printer is another convenient option. this illustrates that the entry into multimedia work need not be a large expense, especially if an existing workstation and any other equipment are already available. if a fairly recent workstation is available to dedicate to the project, the library’s total startup cost could range from $200 to $1,000. not many new library services can be launched for as little as that. equipment specifications are not the focus of this discussion; the reader may consult the excellent tutorials available from desktop video and pc magazine’s online product guide.3 finally, the creation of a studio is a worthwhile option. although some video will need to be shot on location, many times it is possible to set up and shoot in just one place. a studio is the best place in which to work because it is a controlled environment. it does not need to be large or complicated, and a quiet office or study room can be set up with little effort and expense. the studio gives the users control over the sound and the lighting, and involves minimal setup time for projects.

the research paper of the future
multimedia has begun to attract attention in the library community. joe janes, chair of library and information science at the information school at the university of washington and the person responsible for developing the internet public library, recently stated he foresees a growing role for multimedia in the library. it will replace much of the traditional, text-based communication that people are accustomed to.
for example, multimedia projects can become the research paper of the future for students.4 it is the medium in which many library customers will be working. experience from the author’s institution with creating a multimedia lab would seem to confirm his observation. during the first year and a half of operation, use of the lab has steadily increased (see figure 2).

figure 1. multimedia creation process

collaboration
the multimedia lab opens the doors to collaborative opportunities with faculty and students from a variety of disciplines across campus. this is because multimedia, like geographic information systems (gis) or other electronic information and communication technologies, is a tool and is not discipline-specific. as important as it is to make the connection with faculty, this medium is something with which the students will frequently lead the way. they are, after all, the mtv generation, and multimedia has an incredible appeal to their visual orientation. faculty themselves have used it to augment their web-based courses as well as traditional classroom instruction. the author’s library has even initiated a multimedia résumé service for graduating students. the students can record a video introduction of themselves, encode this as a rovion bluestream avatar, and post it with their résumés on the web. this creates a much stronger impression than a standard résumé, hopefully giving the students an edge in promoting themselves on the job market. even more impressive is the variety of projects that are created in the lab by the students. one might expect to see interest from students in art and communications classes, but students come from many other disciplines as well.
for example, business students have effectively used multimedia in their graduate-school business-plan presentations, while biology students like to use the graphics capabilities to study close-ups of slides. education students have employed it to produce multimedia instructional aids, and a sociology student put together a presentation on underserved, low-income neighborhoods. the library supplies the facility and instruction; only the imagination of students is needed. libraries have always been involved in the students’ research and writing process, by providing content, instruction, and facilities for producing the final research product. the same is true in the multimedia environment, although implementing a multimedia lab calls for some new skills for librarians. these include familiarity with basic principles of videography, learning how to use the cameras and other equipment, and gaining some mastery of the editing and encoding software.

why put it in the library?
in addition to the research-paper analogy, the author believes that librarians can point with pride to the values and value that libraries offer their communities. the library is a central and neutral location, not on one department’s or college’s turf. libraries are conveniently open for many hours per week. many of the information resources that students might use to prepare the presentation are in the library. and librarians have a professional ethic that drives them to provide instruction and assistance for the services the library offers. since multimedia production does have a learning curve and most new users need help in mastering the technology, it does not fit very well with the typical 24/7 drop-in computer lab that the campus information technology (it) department often operates. this is a good opportunity for librarians to recognize some of their strengths and capitalize on them. in addition, this can be a breath of fresh air for librarians.
here is an opportunity to learn about something new and creative. most people find that they have less room for creativity as time goes by.5 a multimedia lab in the building offers librarians the opportunity to create multimedia productions for the library, besides assisting students and faculty with their projects.

figure 2. university of texas—pan american library multimedia lab usage

potential problems
there are some obstacles to overcome, of course. they need not be seen as major, but it is best to be realistic when beginning any new venture. it is almost always a good idea to start small, with a pilot project that will yield valuable lessons before venturing into anything big.
• equipment: define what specifications are needed, see what is already available to use or borrow, then figure out what you will actually need to buy.
• software: check out the variety of software for editing and production; think about how you want to begin using multimedia (primarily on the web, in presentation software such as powerpoint, or as standalone videos on cds and dvds).
• money: if funding permits, a library can invest several thousand dollars in a high-end multimedia computer, associated peripherals such as a color printer and one or more scanners, and a software suite to meet initial anticipated demands for multimedia creation and editing. if funding is scarce, you may want to investigate what existing equipment could be used in support of a pilot project.
• location: the lab needs some space of its own, accessible to students and monitored by staff. although the editing workstation could be in an area with other computers, a quiet area is needed for shooting video so that there will not be interference from noise and unwanted foot traffic through the shots.
• staffing and training: a multimedia lab is not a good candidate for self-service. librarians and staff who will provide the service need to learn how to use the equipment and software.
make sure that they all have an acceptable level of competence and confidence so that the library can shine with its new service, but expect that everyone will need to continue to learn and grow in their proficiency. if your library plans to produce its own multimedia sessions as well, it would be a good investment to attend a class on television or video production.
• hours: how many hours per week will the new service be available? if it is the entire time the library is open, be prepared to train plenty of staff. repeat users will need less help as their skills increase (by the way, some of these students can be great work-study employees).
• instruction: plan to offer formal orientation and instruction sessions to faculty and their classes. if your lab is small, this is challenging, but it can be accomplished with some creativity. for example, a general instruction session on concepts can be done in a classroom, followed up by a series of small groups working by appointment for the applied-learning component in the multimedia lab. the author and a colleague have even done instruction outside the library using laptops and cameras, creating a de facto mobile studio.
• copyright: if there are already vcrs or photocopiers in the library, you have had to deal with this issue. the pan american library at university of texas does not allow people to use its lab to copy movies, which is a request that surely will come to you, and we post the usual copyright notices just as we do at our photocopiers. for some excellent information on copyright, visit the american library association web site (www.ala.org).
• evaluation: plan on at least basic evaluation of the service. this can include an assessment of the effectiveness of the instruction sessions, a survey of satisfaction with the lab itself, a questionnaire on the intended uses of the multimedia projects, demographic data on the students, or other student input.
logs of the number of uses and peak-demand periods are extremely useful for planning and for justifying further expenditures and staffing requests.
• flexibility for the future: whatever you do in a pilot phase, keep an open mind; you are trying to learn from the experience so that you can make good decisions for the direction of this new service. it may not go exactly the way you originally thought, because of serendipity, or changes in technology, or very strong demand from some segments of the campus instead of others, or other environmental factors.

conclusion
benefits to the library from the multimedia lab are many. one of the most important benefits is that it keeps the library involved in the process of academic communication, as the medium of the communication changes with technology. by being involved in this evolving medium at its early stages, the library is poised to pounce on opportunities to employ it to the benefit of the library in instruction and content delivery. the library also would position itself on campus as a key player in it and the leading local expert in the growing field of multimedia. since multimedia is a tool that crosses the entire range of subject disciplines on campus, it opens the doors of faculty to collaborate with librarians in exciting new ways. just as many campuses already have learning and collaborative communities that grew around their web courseware or gis endeavors, so too can one develop around multimedia. the appendix offers a list of multimedia web sites to consider. libraries are more than warehouses of books and periodicals. as more and more of our resources have been made available electronically, and indeed more of higher education has moved to electronic delivery, many libraries have been faced with declining gate counts, circulations, and reference statistics. as someone observed, we are victims of our own success. so what is the role of the library?
we are intrinsically involved in the process of instruction, academic research, and communication. as kling observed, “one important strategic idea is that libraries configure their it services and activities to emphasize the distinctive expertise of their librarians rather than simply concentrate on the size and character of the documentary collection.”7 it is imperative therefore that libraries pick out the new trends that will allow them to excel by capitalizing on their traditional strengths.

references
1. scala, inc. multimedia directory. accessed apr. 21, 2004, www.scala.com/multimedia/multimedia-definition.html.
2. nielsen/netratings as of june, 2004. accessed aug. 10, 2004, www.websiteoptimization.com/.
3. about.com, dvt101. accessed apr. 15, 2004, http://desktopvideo.about.com/library/weekly/aa040703a.htm; “anatomy of a video editing workstation,” pc magazine. accessed apr. 16, 2004, www.pcmag.com/article2/0,1759,1264650,00.asp.
4. college of dupage, “joe janes and colleagues: preparing for the future of digital reference,” a satellite broadcast from the college of dupage, 16 apr. 2004.
5. sandra kerka, creativity in adulthood (columbus, ohio: eric clearinghouse on adult career and vocational education, eric digest no. 204, ed429186, 1999).
6. american library association, “copyright issues, primer on the digital millennium.” accessed may 10, 2004, www.ala.org/ala/washoff/woissues/copyrightb/dmca/dmcprimer.pdf.
7. rob kling, “the internet and the strategic reconfiguration of libraries,” library administration & management 15, no. 3 (summer 2001): 144–51.

appendix. for further reading: a multimedia web-site tour
the following is a sampling of some of the most popular and interesting multimedia software, with examples of completed productions.
this is not an official endorsement of any one product over another, whether listed here or not. a look at these sites will, however, give the reader an idea about the power and possibilities of multimedia communications.

adobe (www.adobe.com)
the well-known makers of some of the most powerful and popular editing software packages for graphics and video.

camtasia (www.camtasia.com)
easy to use, this is a good example of the type of software that does screen capture and recording, which is handy for producing online tutorials.

clipstream (www.clipstream.com)
an excellent example of the type of newer encoding software that achieves incredible compression of video and delivers it over the web with no viewer or plug-ins required for the user.

finalcut pro (www.apple.com/finalcutpro)
a perennial favorite among the mac crowd, this software is relatively easy to learn and lets the developer achieve dramatic results.

flashants (www.flashants.com)
a handy program that converts flash animation into .avi video format so that you can integrate animated sequences into a video production.

macromedia (www.macromedia.com)
the makers of flash and director, which are some of the most popular graphics, animation, and multimedia editing tools in the business.

pinnacle (www.pinnaclesys.com)
what finalcut pro is to the mac, this package is for the pc environment. easy to use, yet sophisticated in the results achieved.

rovion (www.rovion.com)
rovion bluestream is an encoder that enables the creation of avatar characters to appear live on your web page. a plug-in is required for the user, but this approach definitely gets attention.

serious magic (www.seriousmagic.com)
an award-winning software package that allows you to turn a workstation into a studio, complete with teleprompter capability, sound effects, graphics, and editing.
university of texas—pan american library (www.lib.panam.edu/libinfo/media.asp)
links to multimedia projects at the author’s institution, including productions made by staff and students.

identifying key steps for developing mobile applications and mobile websites for libraries
devendra dilip potnis, reynard regenstreif-harms, and edwin cortez
information technology and libraries | september 2016

abstract
mobile applications and mobile websites (mamw) represent information systems that are increasingly being developed by libraries to better serve their patrons. because of a lack of in-house it skills and the knowledge necessary to develop mamw, a majority of libraries are forced to rely on external it professionals, who may not help libraries meet patron needs and may instead deplete libraries’ scarce financial resources. this paper applies a system analysis and design perspective to analyze the experience and advice shared by librarians and it professionals engaged in developing mamw. this paper identifies key steps and precautions to take while developing mamw for libraries. it also advises library and information science graduate programs to equip their students with the specific skills and knowledge needed to develop and implement mamw.

introduction
the unprecedented adoption and ongoing use of a variety of context-specific mobile technologies by diverse patron populations, the ubiquitous nature of mobile content, and the increasing demand for location-aware library services have forced libraries to “go mobile.” mobile applications and mobile websites (mamw), that is, web portals running on mobile devices, represent information systems that are increasingly being developed and used by libraries to better serve their patrons. however, a majority of libraries lack the in-house human resources necessary to develop mamw.
because of a lack of staff equipped with the requisite it skills and knowledge, libraries are often forced to partner with and rely on external it professionals, potentially losing control over the process of developing mamw.1 partnerships with external it professionals do not always help libraries meet the information needs of their patrons but instead can deplete their scarce financial resources. it then becomes necessary for librarians to understand the process of developing mamw so that they can better evaluate mamw and better serve library patrons. one possibility is for librarians to re-educate themselves through continuing education or other professional development activities. another solution would be to see library and information science (lis) schools strengthen their curriculum in the area of management, evaluation, and application of mamw and related emerging technologies.

devendra dilip potnis (dpotnis@utk.edu) is associate professor, school of information sciences; reynard regenstreif-harms (reynardrh@gmail.com) is project archives technician, great smoky mountains national park, gatlinburg, tennessee; and edwin cortez (ecortez@utk.edu) is professor, school of information sciences, university of tennessee at knoxville.
doi:10.6017/ital.v35i2.8652

issues, challenges, and strategies for providing librarians with these opportunities are abundant and have been debated for more than thirty years, especially since libraries started experiencing the impact of microchip and portable technologies.2 any practical and immediate guidance could help librarians in charge of developing mamw.3 however, a majority of the practical guidance available for developing mamw for libraries is limited to specific settings or patron populations.
also, the practical guidance is not theoretically validated, curtailing its generalizability for diverse library settings. for instance, a number of librarians and it professionals share their experience of mamw development to serve a specific patron population in a specific library setting.4,5 their accounts typically describe their successes in developing mamw, the lessons learned during the development of mamw, or their advice for developing mamw. this paper applies a system analysis and design perspective from the information systems discipline to examine the experience and advice shared by librarians and it professionals for identifying the key steps and precautions to be taken when developing mamw for libraries. system analysis and design, a branch of the information systems discipline, is the most widely used theoretical knowledgebase available for developing information systems.6 according to the system analysis and design perspective, development, planning, analysis, design, implementation, and maintenance are the six phases of building any information system.7 the next section synthesizes our method for this secondary research. the following section discusses the key steps we identified for developing, planning, analyzing, designing, implementing, and maintaining mamw for libraries. the concluding section presents the implications of this study for libraries and lis graduate programs.
information technology and libraries | september 2016 45
method
we began this study with a practitioner’s handbook guiding libraries to use mobile technologies for delivering services to diverse patron populations.8 to search the literature relevant to our research, we devised many key phrases, including but not limited to “mobile technolog*,” “mobile applications for libraries,” and “mobile websites for libraries.” as part of our active information-seeking process, we applied a snowball sampling technique to collect more than seventy-five scholarly research articles, handbooks, ala library technology reports, and books hosted on ebsco and information science source databases. our passive information-seeking was helped by article suggestions from emerald insight and elsevier science direct, two of the most widely used journal hosting sites, in response to the journal articles we accessed there. we applied the following four criteria to establish the relevancy of publications to our research: accuracy of facts; publication date (i.e., from 2000 to 2014); credibility of authors; and content focused on problems, solutions, advice, and tips for developing mamw. several research articles published by information technology and libraries and library hi tech, two top-tier journals covering the development of mamw for libraries, built the foundation of this secondary research.
we analyzed the collected literature using the qualitative data presentation and analysis method proposed by miles and huberman.9 we developed microsoft excel summary sheets to code the experience and advice shared by librarians and it professionals. the coded data was read repeatedly to identify and name patterns and themes. each relevant publication was analyzed individually and then compared across subjects to identify patterns and common categories. the inter-coder reliability between the two authors who analyzed data was 85 percent.
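the paper does not state which reliability statistic was used. as a minimal sketch, assuming simple percent agreement (the share of items that both coders coded identically), the calculation could look like the following. the function name and the sample codes are hypothetical and are not taken from the study’s data.

```python
from typing import Sequence


def percent_agreement(coder_a: Sequence[str], coder_b: Sequence[str]) -> float:
    """Return the percentage of items to which both coders assigned the same code."""
    if len(coder_a) != len(coder_b):
        raise ValueError("both coders must code the same set of items")
    if not coder_a:
        raise ValueError("at least one coded item is required")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100.0 * matches / len(coder_a)


# Hypothetical codes assigned by two coders to the same 20 passages.
coder_a = ["planning"] * 8 + ["design"] * 7 + ["testing"] * 5
coder_b = ["planning"] * 8 + ["design"] * 4 + ["testing"] * 8

# The coders disagree on 3 of the 20 passages.
print(percent_agreement(coder_a, coder_b))
```

percent agreement does not correct for agreement that occurs by chance; a statistic such as cohen’s kappa does, but the source gives no indication that one was used.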
data analysis helped us identify the key steps needed for planning, analyzing, designing, implementing, and maintaining mamw for libraries.
findings and discussion
key steps for planning mamw
forming and managing a team
building teams of people with the appropriate skills, knowledge, and experience is one of the first steps suggested by the existing literature for planning mamw. it is essential for team members to be aware of new developments and trends in the market.10 for instance, developers should be aware of print resources on relevant technologies such as apache, asp, javascript, php, ruby on rails, and python; online resources such as detectmobilebrowser.com and w3c mobileok checker to test catalogs, design functionality, and accessibility on mobile devices; and various online communities of developers who could provide peer-support when needed.11 team members are also expected to keep up with new developments in mobile devices, platforms, operating systems, digital rights management terms and conditions, and emerging standards for content formats.12 periodic delegation of various tasks could help libraries develop mamw effectively.13 libraries should also form productive, financially feasible partnerships with external stakeholders such as internet service providers and network administrators for hosting mamw on appropriate internet servers that meet desired safety and security standards.14,15
requirements gathering
requirements for developing mamw can be collected through empirical research and secondary research. typically, the goal of empirical research is to help libraries
- gather patron preferences for and expectations of mamw,16,17
- stay abreast of the continual evolution of patron needs,18
- periodically (e.g., quarterly, annually, biannually, etc.) gather and evaluate user needs,19
- index the content of mamw,20
- investigate the acceptance of the library’s use of mamw by patrons,21
- understand user needs, and
- identify top library services requested by patrons.
empirical research in the form of usability testing, functional validation, user surveys, etc., should be carried out before developing mamw to inform the development process and/or after developing mamw to study their adoption by library patrons. empirical research typically involves the identification of patrons and other stakeholders who are going to be affected by mamw. this step is followed by developing data-collection instruments, collecting data from patrons and other stakeholders, and analyzing qualitative and quantitative data using appropriate techniques and software.22
secondary research mainly focuses on scanning and assessing existing literature. for instance, using appropriate datasets on mobile use, librarians may be able to identify the factors responsible for the adoption of mobile technologies.23 typically, such factors include but are not limited to cognitive, affective, social, and economic conditions of potential users. mamw developers could also scan the environment by examining existing mamw and reviewing the literature to create sets of guidelines for replacing old information systems by developing new, well-functioning mamw.24 librarians could also scan the market for free software options to conserve financial resources.25
making strategic choices
mobile applications or mobile websites?
one of the most important strategic decisions libraries need to make during this phase is whether to use a mobile app or a mobile website (that is, a web portal running on mobile devices) for offering services to patrons.
mobile websites are web browser-based applications that might direct mobile users to a different set of content pages, serve a single set of content to all patrons while using different style sheets or templates reformatted for desktop or mobile browsers, or use a site transcoder (a rule-based interpreter), which resides between a website and a web client and intercepts and reformats content in real time for a mobile device.26,27 mobile apps are more challenging to build than mobile websites because they require separate and specific programming for each operating system.28 mobile apps also burden users and their devices. for instance, users are expected to remember the functionality of each menu item, and a significant amount of memory is required to store and support apps on mobile devices. however, potential profitability, better mobile-device functionality, and greater exposure through app stores can make mobile apps a more economical option than mobile websites.29
buy or build?
in the planning phase, libraries also need to decide whether to buy commercial, off-the-shelf (cots) mamw or build a customized mamw. when making this choice, mamw need to be evaluated in terms of customer support and service, maintenance, and the ability to meet patron and library needs.30 sometimes libraries purchase cots products and end up customizing them, benefiting from both options. for example, some libraries first purchase packaged mobile frameworks to create simple, static mobile websites and subsequently develop dynamic library apps specific to library services.31
managing scope
many libraries have limited financial resources, which makes it necessary for their staff to manage the scope of mamw development.
prioritizing tasks and identifying mission-critical features of mamw are among the most common activities undertaken by libraries to manage this scope.32 for instance, it is not practical to make entire library websites mobile because libraries would end up serving only those patrons who access their sites over mobile alone. instead, libraries should determine which part of the website should go mobile. a growing trend of using approaches like mobile-first design, in which a mobile version of a website is designed first and then scaled up to a larger desktop version, could help librarians better manage the scope of mamw development. alternatively, jeff wisniewski, a leading web services librarian in the united states, advises libraries to create a new mobile-optimized homepage alone, which is faster than trying to retrofit the library’s existing homepage for mobile.33 this advice is highly practical because no webmaster has any interest in trying to maintain two distinct versions of the library’s webpages with details such as hours of operations and contact information.
selecting the appropriate software development method
there are three key methods for developing mamw: structured methodologies (e.g., waterfall or parallel), rapid application prototyping (e.g., phased, prototyping, or throwaway prototyping), and agile development, an umbrella term used to refer to the collection of agile methodologies like crystal, dynamic systems development method, extreme programming, feature-driven development, and scrum. there is a bidirectional relationship between these mamw development methods and the resources available for their development. project resources such as funding, duration, and human resources influence and are affected by the type of software development method selected for developing mamw.
however, studies rarely pay attention to this important dimension of the planning phase.34
key steps in the analysis phase
requirements analysis
after collecting data from patrons, the next natural step is to analyze the data to inform the process of conceptualizing, building, and developing mamw.35 the requirements-analysis phase helps libraries achieve user-centered design of mamw and assess the return on investment in mamw.
the context and goals of the patrons using mobile devices, and the tasks they are likely and unlikely to perform on a mobile device, are the key considerations for developing user-centered mamw for library patrons.36 it is critical to gather, understand, and review user needs.37 surveys can be administered on paper or online, and the responses can be analyzed using advanced statistical techniques or qualitative software.38,39 the analysis allows the following questions to be answered: which library services do patrons use most frequently on their mobile devices? what is their level of satisfaction with those services? what types of library services and products would they like to access with their mobile phones in the future?
survey analyses can help librarians predict which mobile services patrons will find most useful;40 they can also help librarians classify users on the basis of their perceptions, experience, and habits when using mobile technologies to access library services.41 as a result, libraries can identify and prioritize functional areas for their mamw deployment.42 mamw developers can learn from their users’ humbling and/or frustrating experience of using mobile devices for library services.
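the first survey question above (which library services patrons use most frequently on their mobile devices) can be answered with a simple tally. the following is a minimal sketch under stated assumptions: each response is taken to be a list of service names selected by one respondent, and the function name, service names, and responses are all hypothetical rather than drawn from any study cited here.

```python
from collections import Counter


def rank_services(responses: list[list[str]]) -> list[tuple[str, int]]:
    """Count how many respondents reported using each service, most used first."""
    counts: Counter[str] = Counter()
    for services in responses:
        # Count each service at most once per respondent.
        counts.update(set(services))
    # Sort by descending count, then alphabetically so that ties are ordered predictably.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical survey responses: the services each respondent uses on a mobile device.
responses = [
    ["catalog search", "hours"],
    ["catalog search", "account renewal"],
    ["hours"],
    ["catalog search", "ask a librarian", "hours"],
]

for service, respondents in rank_services(responses):
    print(f"{service}: {respondents} of {len(responses)} respondents")
```

a ranking like this is only a starting point for prioritizing functional areas; satisfaction levels and desired future services would need their own questions and analysis.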
in addition, libraries can keep track of their patrons’ positive and negative observations, their information-sharing practices, and how they create group experiences on the platform provided by their libraries.43 to improve existing mamw, libraries could also use google analytics, a free web metrics tool, for identifying the popularity of mamw features and analyzing statistics on how they are used.44 to develop operating system-specific mobile apps, google analytics can be used to learn about the popularity of mobile devices used by patrons.45
ideally, libraries should calculate and document roi before investing in the development of mamw.46 for instance, libraries can run a cost-benefit analysis on the process of developing mamw and compare various library services offered over mobile devices.47 typically the following data could help libraries run the cost-benefit analysis: specific deliverables (e.g., features of mamw), resources (e.g., resources needed, available resources, etc.), risks (e.g., types of risks, level of risks, etc.), performance requirements, and security requirements for developing mamw. this analysis would help libraries make decisions on service provisions such as specific goals to be set for developing mamw, feasibility of introducing desired features of mamw, and how to manage available resources to meet the set goals.48 libraries should also examine what other libraries have already done to provide mobile services.49
communication/liaising with stakeholders
effective communication between developers and stakeholders influences almost every aspect of developing information systems. however, existing studies do not emphasize the significance of communication with stakeholders. for instance, several studies vaguely refer to the translation of user needs into technology requirements.50 but few studies point out the precise modeling technique (e.g., entity relationship diagrams, unified modeling language, etc.)
for converting user needs into a language understood by software developers. developers should communicate best practices and suggestions for the future implementation of mamw in libraries,51 which involves the prediction and selection of appropriate mamw for libraries,52 the demonstration of what is possible and how services are relevant, and how new resources can help create value for libraries.53,54 communication with users is also critical for creating value-added services for patrons who use different mobile technologies to meet their needs related to work, leisure, commuting, etc.55 however, the existing literature on mamw development for libraries does not mention the significance of this activity.
key steps for designing mamw
prototyping
prototyping refers to the modeling or simulation of an actual information system. mamw can have paper-based or computer-based prototypes. prototyping allows developers to directly communicate with mamw users to seek their feedback. developers can correct or modify the original design of mamw until users and developers are in agreement about the system design. building consensus between mamw developers and potential users is another key challenge to overcome during this phase, which may put a financial burden on mamw development projects. it requires skilled personnel to manage the scope, time, human resources, and budget of such projects.
wireframing is one of the most prominent prototyping techniques practiced by librarians and it professionals for developing mamw for libraries.56 this technique depicts schematic on-screen blueprints of mamw, lacking style, color, or graphics, focusing mainly on functionality, behavior, and priority of content.
selecting hardware, programming languages, platforms, frameworks, and toolkits
existing literature on the development of mamw for libraries covers the selection and management of software; software development kits; scripting languages like javascript; data management and representation languages such as html, xml, and their text editors; and ajax for animations and transitions. the existing literature also guides libraries for training their staff for using mamw to better serve patrons.57 a few studies also provide guidance on selecting cots products such as webkit, an open source web browser engine that renders webpages on smartphones and allows users to view high-quality graphics on data networks with faster throughput.58 however, it might be a good idea to use licensed open source cots products because licensed software allows libraries to legally distribute software within their organizations as covered by the licensing agreement. libraries that use software-licensing agreements may also be able to seek expert help and advice whenever they have a concern or query.
in the authors’ experience, librarians have shared a few effective strategies to design mamw. one key strategy is to purchase reliable device emulators and cross-compatible web editors. these technologies allow the user to work with the design at the most basic level, save documents as text, transfer the documents between web programs, and direct designers toward simple solutions.59 sample cross-compatible web editors include, but are not limited to, notetab pro (http://www.notetab.com/), code lobster (http://www.codelobster.com/), and bluefish (http://bluefish.openoffice.nl).
hybrid mobile app frameworks like bootstrap, ionic, mobile angular ui, intel xdk, appcelerator titanium, sencha, kendo ui, and phonegap use a combination of web technologies like html, css, and javascript for developing mobile-first, responsive mamw. a majority of these frameworks use a drag-and-drop approach and do not require any coding for developing mobile apps. one-click api connect further simplifies the process. user-interface frameworks like jquerymobile and topcoat eliminate the need to design user interfaces manually. importantly, mamw developed using such frameworks can support many mobile platforms and devices.
toolkits like github, skyronic, crudkit, and hawhaw enable developers to quickly build mobile-friendly crud (create/read/update/delete) interfaces for php, laravel, and codeigniter apps. such mobile apps also work with mysql and other databases, allowing them to receive and process data and display information to users.
table 1 categorizes specific hardware and software features recommended for mamw to better serve library patrons.
# | areas of information systems/it | specific features recommended for developing mamw for libraries
1 | human-computer interaction (hci)
behavioral, cognitive, motivational, and affective aspects of hci design
- design responsive web sites for libraries to enhance user experience60
- design a user interface meeting the expectations and needs of potential users (e.g., menu with the following items: library catalog, patron accounts, ask a librarian, contact information, listing of hours, etc.)61
- design meaningful mobile websites based on user needs, documenting and maintaining mobile websites62
usability engineering
- design concise interfaces with limited links, descriptive icons, home and parent-link icons63
- create a user-friendly site (e.g., the dok library concept center in delft, netherlands, offers a welcome text message to first-time visitors)64
- effectively transition from traditional websites to mobile-optimized sites with responsive design65
- create user-friendly interface designs66
- present a clean, easy to navigate mobile version of search results67
information visualization
- automatically maintain reliable and stable fundamental information required by indoor localization systems68
- save time by redesigning existing sites69,70
2 | web programming
html, xml, etc.
- design sites with a complete separation of content and presentation71
- code html and css for better user experiences72
- create and shorten links to make them easier to input using small or virtual keyboards73
using client-side and server-side scripting such as javascript object notation, etc.
- design and develop mashups74
- develop mamw using client-server architecture, accessible on mobile devices75
without scripting
- implement widgetization to facilitate the integration of mobile websites, developing a widget library for mobile-based web information systems76
3 | open source
- design mobile websites that allow users to leverage the same open source technology as the main websites77
- design mobile websites linking to other existing services like library h3lp and library catalogs with mobile interfaces such as mobilecat78
4 | networking
- design a mobile website capable of exploiting advancements in technology such as faster mobile data networks79
- identify and address technology issues (e.g., connectivity, security, speed, signal strength, etc.) faced by patrons when using mamw80
5 | input/output devices
- use a mobile robot to determine the location of fixed rfid tags in space81
- design mamw capable of processing data communicated using radio frequency identification devices, near-field communication technology, and bluetooth-based technology like ibeacons82
- offer innovative services using augmented-reality tools83
6 | databases
- integrate a back-end database of metadata with front-end mobile technologies84
- integrate front-end of mobile mamw with back-end of standard databases and services85
7 | social media and analytics
- integrate social media sites (e.g., foursquare, facebook place, gowalla, etc.) with existing checkout services for accurate and information rich entries86
- implement google voice or a free text-messaging service87
- use google analytics for mobile optimized website by copying the free javascript code generated from google analytics and paste it into library webpages to gain insight into what resources are used and who used them88
- integrate a geo-location feature with mobile services89
table 1.
mamw with specific hardware and software features
from the above table, which is based on the analysis of the literature on developing mobile applications and mobile websites for libraries, it becomes clear that web programming and hci are the two leading technology areas that shape the development of mamw and consequently the services offered by them.
designing user interfaces of mamw
librarians and it professionals engaged in developing mamw for libraries make the following recommendations.
use two style sheets: css plays a key role in giving user interfaces a uniform display across all webpages. studies recommend designing two style sheets, namely mobile.css and iphone.css, when developing mamw, since most of the time smartphones ignore mobile stylesheets.90 in that case, iphone.css could direct itself to browsers of a specific screen-width, helping those mobile devices that are not directed to the mobile website by the mobile.css stylesheet.91
minimize use of javascript: javascript is instrumental in detecting what mobile device is being used by patrons and then directing them to the appropriate webpage with options including full website, simple text-based, and touch-mobile-optimized. however, it is critical to minimize the use of javascript on library mobile websites because not every smartphone offers the minimum level of support required to operate it.92
handle images intelligently: to help patrons optimize their bandwidth use, image files on mobile sites should be incorporated with css rather than html code; also, to ensure consistency in the appearance of user interfaces of mobile websites, images should be kept to the same absolute size.93
key steps for implementing mamw
programming for mamw
programming is at the heart of developing mamw. as shown in table 1 above, web programming enables developers to build mamw with a number of value-added features for patrons.
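the recommendation above to minimize javascript leaves open how a device would otherwise be routed to the full, simple text-based, or touch-optimized version of a site. one option, which is our assumption and is not described in the cited studies, is to inspect the user-agent header on the server. the sketch below shows only the classification step; the function name, the version labels, and the deliberately short patterns are hypothetical, and real user-agent strings vary far more widely than these patterns cover.

```python
import re

# Hypothetical, deliberately short patterns; real user-agent strings vary widely.
TOUCH_PATTERN = re.compile(r"iphone|ipad|android", re.IGNORECASE)
BASIC_MOBILE_PATTERN = re.compile(r"mobile|blackberry|opera mini|symbian", re.IGNORECASE)


def choose_site_version(user_agent: str) -> str:
    """Pick a site version label from a User-Agent header value."""
    if TOUCH_PATTERN.search(user_agent):
        return "touch"  # touch-mobile-optimized pages
    if BASIC_MOBILE_PATTERN.search(user_agent):
        return "text"   # simple text-based pages
    return "full"       # full website, also the fallback for unknown devices


# Illustrative user-agent strings, shortened for readability.
print(choose_site_version("Mozilla/5.0 (iPhone; CPU iPhone OS 9_0 like Mac OS X) Mobile Safari"))
print(choose_site_version("BlackBerry9000/4.6.0.167 Profile/MIDP-2.0"))
print(choose_site_version("Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/50.0 Safari/537.36"))
```

defaulting to the full website for unrecognized devices keeps all content reachable when detection fails, at the cost of a less tailored layout.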
for instance, a web-application server running on cold fusion can process data communicated via web browsers on mobile devices; this feature allows mamw users to access search engines on library websites via smartphones.94 also, client-side processing of classes (with a widget library) allows patrons to use their mobile devices as thin clients, thereby optimizing the use of network bandwidth.95
testing mamw
past studies recommend testing the content, display/design, and functionality of mamw in a controlled environment (e.g., usability lab) or in the real world (i.e., in libraries).
content: librarians are advised to set up testing databases for testing image presentation, traditional free text search, location-based search, barcode scanning for isbn search, qr encapsulation, and voice search.96
display/design: librarians can review and test mamw on multiple devices to confirm that everything displays and functions as intended.97 they can also test a beta version of their mobile website with varying devices to provide guidance regarding image sizing;98 beta versions are also useful in testing mobile websites for their display on different browsers and devices.99
functionality: librarians can set up testing practices and environments for the most heavily used device platforms (e.g., hci incubators such as eye testing software, which is a combination of virtual emulators and mobile devices not owned by libraries).100,101 they can also use the user agent switcher add-on for firefox to test a mobile website and use web-based services like device anywhere and browser cam offering mobile emulation to test the functionality of mamw.102
training patrons
patrons are unlikely to use a new information system for managing information resources unless they realize its significance. however, training patrons for using a newly developed mamw is almost completely missing from the studies describing the process of developing mamw for libraries.
joe murphy, a technology librarian at yale university, identifies the significance of user training in managing the change from traditional to mobile search and advises librarians to explore the mobile literacy skills of their patrons and educate them on how to use new systems.103
data management
mamw cannot function properly without clean data. cleaning up data, curating data, and addressing other data-related issues are some of the least mentioned activities in the literature for developing mamw. however, it is necessary for librarians engaged in developing mamw to identify and address common challenges for managing data when used for mamw. for example, it might be a good strategy for librarians to study the best practices for managing data-related issues when offering reference services using sms.104
skills needed for maintaining mamw
documentation and version control of software
past studies recommend developing a mobile strategy for building a mobile-tracking device and evaluating mobile infrastructure to ensure the continued assessment and monitoring of mobile usage and trends among patrons.105 however, past studies do not report or provide many details about the maintenance of mamw, which leads us to infer that maintenance of mamw involving documentation and version control is a neglected aspect of their development.
open source software development is increasingly becoming a common practice for developing mamw. implementing version-control software (e.g., subversion and github) to accommodate the needs of developers distributed across the world is a necessity for developing mamw.
version-control software provides a code repository with a centralized database for developers to share their code, which minimizes errors associated with overwriting or reverting code changes and maximizes software development collaboration efforts.106
conclusion
there are various forces driving change in the knowledge and skills area for information professionals: technologies, changing environments, and the changing role of it in managing and providing services to patrons. these forces affect all levels of it-based professionals, those responsible for information processing and those responsible for information services.
this paper has examined the key steps and precautions to be taken by libraries while developing mamw to better serve their patrons. after analyzing the existing guidance offered by librarians and it professionals from the system analysis and design perspective, we find that some of the most ignored activities in mamw development are selecting appropriate software development methodologies, prototyping, communicating with stakeholders, software version control, data management, and training patrons to use newly developed or revamped mamw. the lack of attention to these activities could hinder libraries’ ability to better serve patrons using mamw. it is necessary for librarians and it professionals to pay close attention to the above activities when developing mamw.
our study also shows that web programming and hci are the two most widely used technology areas for developing mamw for libraries. to save their scarce financial resources, which otherwise could be invested in partnering with external it professionals, libraries could either train their existing staff or recruit lis graduates equipped with the skills and knowledge identified in this paper to develop mamw (see table 2).
# | key steps for developing mamw | skills and knowledge required for developing mamw
a | planning phase
1 | forming and managing team | human resource management
2 | making strategic choices | time management; cost management; quality management; human resource management (e.g., staff capacity)
3 | requirements gathering | research (empirical and secondary)
4 | managing scope (e.g., managing financial resources, prioritizing tasks, identifying mission-critical features of mamw, etc.) | scope management
5 | selecting an appropriate software development method | time management; cost management; quality management
b | analysis phase
6 | requirements analysis | research (empirical and secondary)
7 | communication/liaising with stakeholders | communications management
c | design phase
8 | prototyping | software development (hci)
9 | selecting hardware and programming languages and platforms | software development (web programming and hci)
10 | designing user interfaces of mamw | software development (hci)
d | implementation phase
11 | programming for mamw | software development (web programming, e.g., android, ios, visual c++, visual c#, visual basic, etc.)
12 | testing mamw | software development (web programming and hci)
13 | training patrons | human resource management
14 | data management (e.g., cleaning up data, curating data, etc.) | data management
e | maintenance phase
15 | documentation and version control of software | software development (web programming and hci)
table 2. skills and knowledge necessary to develop mamw
the management of scope, time, cost, quality, human resources, and communication related to any project is known as project management.107 in addition to the skills and knowledge related to project management, librarians would also need to be proficient in software development (with an emphasis on hci and web programming), data management, and the proper methods for conducting empirical and secondary research for developing mamw. if lis programs equip their graduate students with the skills and knowledge identified in this paper, the next generation of lis graduates could develop mamw for libraries without relying on external it professionals, which would make libraries more self-reliant and better able to manage their financial resources.108
this paper assumes a very small number of scholarly publications to be reflective of the real-world scenarios of developing mamw for all types of libraries. this assumption is one of the limitations of this study. also, the sample of publications analyzed in this study is not statistically representative of the development of mamw for libraries around the world. in the future, the authors plan to interview librarians and it professionals engaged in developing and maintaining mamw for their libraries to better understand the landscape of developing mamw for libraries.
references
1. devendra potnis, ed cortez, and suzie allard, “educating lis students as mobile technology consultants” (poster presented at 2015 association for library and information science education annual meeting, chicago, january 25–27), http://f1000.com/posters/browse/summary/1097683.
2. edwin michael cortez, “new and emerging technologies for information delivery,” catholic library world no. 54 (1982): 214–18.
3. kimberly d. pendell and michael s. bowman, “usability study of a library’s mobile website: an example from portland state university,” information technology & libraries 31, no.
2 (2012): 45–62, http://dx.doi.org/10.6017/ital.v31i2.1913. 4. godmar back and annette bailey, “web services and widgets for library information systems,” information technology & libraries 29 no. 2 (2010): 76–86, http://dx.doi.org/10.6017/ital.v29i2.3146 . http://f1000.com/posters/browse/summary/1097683 http://dx.doi.org/10.6017/ital.v31i2.1913 http://dx.doi.org/10.6017/ital.v29i2.3146 information technology and libraries | september 2016 57 5. hannah gascho rempel and laurie bridges, “that was then, this is now: replacing the mobile optimized site with responsive design,” information technology & libraries 32, no. 4 (2013): 8–24, http://dx.doi.org/10.6017/ital.v32i4.4636. 6. june jamrich parsons and dan oja, new perspectives on computer concepts 2014: comprehensive, course technology (boston: cengage learning, 2013). 7. ibid. 8. andrew walsh, using mobile technology to deliver library services: a handbook (london: facet, 2012). 9. matthew b. miles and a. michael huberman, qualitative data analysis (thousand oaks, ca: sage, 1994). 10. bohyun kim, “responsive web design, discoverability and mobile challenge,” library technology reports 49, no 6 (2013): 29–39, https://journals.ala.org/ltr/article/view/4507. 11. james elder, “how to become the “tech guy and make iphone apps for your library,” the reference librarian 53, no. 4 (2012): 448–55, http://dx.doi.org/10.1080/02763877.2012.707465. 12. sarah houghton, “mobile services for broke libraries: 10 steps to mobile success,” the reference librarian 53, no. 3 (2012): 313–21, http://dx.doi.org/10.1080/02763877.2012.679195. 13. pendell and bowman, “usability study.” 14. lisa carlucci thomas, “libraries, librarians and mobile services,” bulletin of the american society for information science & technology 38, no. 1 (2011): 8–9, http://dx.doi.org/10.1002/bult.2011.1720380105. 15. elder, “how to become the ‘tech guy.’” 16. kim, “responsive web design.” 17. 
chad mairn, “three things you can do today to get your library ready for the mobile experience,” the reference librarian 53, no. 3 (2012): 263–69, http://dx.doi.org/10.1080/02763877.2012.678245. 18. rempel and bridges, “that was then.” 19. rachael hu and alison meier, “planning for a mobile future: a user research case study from the california digital library,” serials 24, no. 3 (2011): s17–25. 20. kim, “responsive web design.” http://dx.doi.org/10.6017/ital.v32i4.4636 https://journals.ala.org/ltr/article/view/4507 http://dx.doi.org/10.1080/02763877.2012.707465 http://dx.doi.org/10.1080/02763877.2012.679195 http://dx.doi.org/10.1002/bult.2011.1720380105 http://dx.doi.org/10.1080/02763877.2012.678245 identifying key steps for developing mobile applications & mobile websites for libraries | potnis, regenstreif-harms, and cortez |doi:10.6017/ital.v35i2.8652 58 21. lorraine paterson and boon low, “student attitudes towards mobile library services for smartphones,” library hi tech 29, no. 3 (2011): 412–23, http://dx.doi.org/10.1108/07378831111174387. 22. jim hahn, michael twidale, alejandro gutierrez and reza farivar, “methods for applied mobile digital library research: a framework for extensible wayfinding systems,” the reference librarian 52, no. 1-2 (2011): 106–16, http://dx.doi.org/10.1080/02763877.2011.527600. 23. patterson and low, “student attitudes.” 24. gillian nowlan, “going mobile: creating a mobile presence for your library,” new library world 114, no. 3/4 (2013): 142–50, http://dx.doi.org/10.1108/03074801311304050. 25. elder, “how to become the ‘tech guy.’” 26. matthew connolly, tony cosgrave, and baseema b. krkoska, “mobilizing the library’s web presence and services: a student-library collaboration to create the library’s mobile site and iphone application,” the reference librarian 52, no. 1-2 (2010): 27–35, http://dx.doi.org/10.1080/02763877.2011.520109. 27. 
stephan spitzer, “make that to go: re-engineering a web portal for mobile access,” computers in libraries 3 no. 5 (2012): 10–14. 28. houghton, “mobile services.” 29. cody w. hanson, “mobile solutions for your library,” library technology reports 47, no. 2 (2011): 24–31, https://journals.ala.org/ltr/article/view/4475/5222. 30. terence k. huwe, “using apps to extend the library’s brand,” computers in libraries 33, no. 2 (2013): 27–29. 31. edward iglesias and wittawat meesangnill, “mobile website development: from site to app,” bulletin of the american society for information science and technology 38, no. 1 (2011): 18– 23. 32. jeff wisniewski, “mobile usability,” bulletin of the american society for information science & technology 38, no. 1 (2011): 30–32, http://dx.doi.org/10.1002/bult.2011.1720380108. 33. jeff wisniewski, “mobile websites with minimal effort,” online 34, no. 1 (2010): 54–57. 34. hahn et al., “methods for applied mobile digital library research.” 35. j. michael demars, “smarter phones: creating a pocket sized academic library,” the reference librarian 53, no. 3 (2012): 253–62, http://dx.doi.org/10.1080/02763877.2012.678236. http://dx.doi.org/10.1108/07378831111174387 http://dx.doi.org/10.1080/02763877.2011.527600 http://dx.doi.org/10.1108/03074801311304050 http://dx.doi.org/10.1080/02763877.2011.520109 https://journals.ala.org/ltr/article/view/4475/5222 http://dx.doi.org/10.1002/bult.2011.1720380108 http://dx.doi.org/10.1080/02763877.2012.678236 information technology and libraries | september 2016 59 36. kim griggs, laurie m. bridges, and hannah gascho rempel, “library/mobile: tips on designing and developing mobile websites,” code4lib no. 8 (2009), http://journal.code4lib.org/articles/2055. 37. demars, “smarter phones.” 38. hahn et al., “methods for applied mobile digital library research.” 39. beth stahr, “text message reference service: five years later,” the reference librarian no. 52, no. 
1-2 (2011): 9–19, http://dx.doi.org/10.1080/02763877.2011.524502. 40. patterson and low, “student attitudes.” 41. ibid. 42. ibid. 43. hanson, “mobile solutions for your library.” 44. stahr, “text message reference service.” 45. spitzer, “make that to go.” 46. allison bolorizadeh et al., “making instruction mobile,” the reference librarian 53, no. 4 (2012): 373–83, http://dx.doi.org/10.1080/02763877.2012.707488. 47. maura keating, “will they come? get out the word about going mobile,” the reference librarian no. 52, no. 1-2 (2010): 20-26, http://dx.doi.org/10.1080/02763877.2010.520111. 48. patterson and low, “student attitudes.” 49. hanson, “mobile solutions for your library.” 50. patterson and low, “student attitudes.” 51. hanson, “mobile solutions for your library.” 52. cody w. hanson, “why worry about mobile?,” library technology reports no. 47, no. 2 (2011): 5–10, https://journals.ala.org/ltr/article/view/4476. 53. keating, “will they come?” 54. spitzer, “make that to go.” 55. kim, “responsive web design.” 56. wisniewski, “mobile usability.” 57. elder, “how to become the ‘tech guy.’” http://journal.code4lib.org/articles/2055 http://dx.doi.org/10.1080/02763877.2011.524502 http://dx.doi.org/10.1080/02763877.2012.707488 http://dx.doi.org/10.1080/02763877.2010.520111 https://journals.ala.org/ltr/article/view/4476 identifying key steps for developing mobile applications & mobile websites for libraries | potnis, regenstreif-harms, and cortez |doi:10.6017/ital.v35i2.8652 60 58. sally wilson and graham mccarthy, “the mobile university: from the library to the campus,” reference services review 38, no. 2 (2010): 214–32, http://dx.doi.org/10.1108/00907321011044990. 59. brendan ryan, “developing library websites optimized for mobile devices,” the reference librarian 52, no. 1-2 (2010): 128–35, http://dx.doi.org/10.1080/02763877.2011.527792. 60. kim, “responsive web design.” 61. connolly, cosgrave, and krkoska, “mobilizing the library’s web presence and services.” 62. 
demars, “smarter phones.” 63. mark andy west, arthur w. hafner, and bradley d. faust, “expanding access to library collections and services using small-screen devices,” information technology & libraries 25 (2006): 103–7. 64. houghton, “mobile services.” 65. rempel and bridges, “that was then.” 66. elder, “how to become the ‘tech guy.’” 67. heather williams and anne peters, “and that’s how i connect to my library: how a 42second promotional video helped to launch the utsa libraries’ new summon mobile application,” the reference librarian 53, no. 3 (2012): 322–25, http://dx.doi.org/10.1080/02763877.2012.679845. 68. hahn et al., “methods for applied mobile digital library research.” 69. danielle andre becker, ingrid bonadie-joseph, and jonathan cain, “developing and completing a library mobile technology survey to create a user-centered mobile presence,” library hi-tech 31, no. 4 (2013): 688–99, http://dx.doi.org/10.1108/lht-03-2013-0032. 70. rempel and bridges, “that was then.” 71. iglesias and meesangnill, “mobile website development.” 72. elder, “how to become the ‘tech guy.’” 73. andrew walsh, “mobile information literacy: a preliminary outline of information behavior in a mobile environment,” journal of information literacy 6, no. 2 (2012): 56–69, http://dx.doi.org/10.11645/6.2.1696. 74. back and bailey, “web services and widgets.” 75. ibid. 76. ibid. 77. spitzer, “make that to go.” http://dx.doi.org/10.1108/00907321011044990 http://dx.doi.org/10.1080/02763877.2011.527792 http://dx.doi.org/10.1080/02763877.2012.679845 http://dx.doi.org/10.1108/lht-03-2013-0032 http://dx.doi.org/10.11645/6.2.1696 information technology and libraries | september 2016 61 78. iglesias and meesangnill, “mobile website development.” 79. bohyun kim, “the present and future of the library mobile experience,” library technology reports 49, no. 6 (2013): 15–28, https://journals.ala.org/ltr/article/view/4506. 80. pendell and bowman, “usability study.” 81. 
hahn et al., “methods for applied mobile digital library research.” 82. andromeda yelton, “where to go next,” library technology reports 48, no. 1 (2012): 25–34, https://journals.ala.org/ltr/article/view/4655/5511. 83. ibid. 84. hahn et al., “methods for applied mobile digital library research.” 85. houghton, “mobile services.” 86. ibid. 87. mairn, “three things you can do today.” 88. ibid. 89. tamara pianos, “econbiz to go: mobile search options for business and economics— developing a library app for researchers,” library hi tech 30, no. 3 (2012): 436–48, http://dx.doi.org/10.1108/07378831211266582. 90. demars, “smarter phones.” 91. ryan, “developing library websites.” 92. pendell and bowman, “usability study.” 93. ryan, “developing library websites.” 94. michael j. whitchurch, “qr codes and library engagement,” bulletin of the american society for information science & technology 38, no. 1 (2011): 14–17. 95. back and bailey, “web services and widgets.” 96. jingru hoivik, “global village: mobile access to library resources,” library hi tech 31, no. 3 (2013): 467–77, http://dx.doi.org/10.1108/lht-12-2012-0132. 97. elder, “how to become the ‘tech guy.’” 98. ryan, “developing library websites.” 99. west, hafner and faust, “expanding access.” 100. hu and meier, “planning for a mobile future.” 101. iglesias and meesangnill, “mobile website development.” https://journals.ala.org/ltr/article/view/4506 https://journals.ala.org/ltr/article/view/4655/5511 http://dx.doi.org/10.1108/07378831211266582 http://dx.doi.org/10.1108/lht-12-2012-0132 identifying key steps for developing mobile applications & mobile websites for libraries | potnis, regenstreif-harms, and cortez |doi:10.6017/ital.v35i2.8652 62 102. wisniewski, “mobile usability.” 103. joe murphy, “using mobile devices for research: smartphones, databases and libraries,” online 34, no. 3 (2010): 14–18. 104. 
amy vecchione and margie ruppel, “reference is neither here nor there: a snapshot of sms reference services,” the reference librarian 53, no. 4 (2012): 355–72, http://dx.doi.org/10.1080/02763877.2012.704569. 105. hu and meier, “planning for a mobile future.” 106. wilson and mccarthy, “the mobile university.” 107. project management institute, a guide to the project management body of knowledge (pmbok guide) (newtown square, pa: project management institute, 2013). 108. devendra potnis et al., “skills and knowledge needed to serve as mobile technology consultants in information organizations,” journal of education for library & information science 57 (2016): 187–96. http://dx.doi.org/10.1080/02763877.2012.704569 abstract introduction method forming and managing a team key steps in the analysis phase key steps for designing mamw key steps for implementing mamw skills needed for maintaining mamw conclusion forming and managing team this paper assumes a very small number of scholarly publications to be reflective of the real-world scenarios of developing mamw for all types of libraries. this assumption is one of the limitations of this study. also, the sample of publications anal... references can bibliographic data be put directly onto the semantic web? | yee 55 martha m. yee can bibliographic data be put directly onto the semantic web? this paper is a think piece about the possible future of bibliographic control; it provides a brief introduction to the semantic web and defines related terms, and it discusses granularity and structure issues and the lack of standards for the efficient display and indexing of bibliographic data. it is also a report on a work in progress—an experiment in building a resource description framework (rdf) model of more frbrized cataloging rules than those about to be introduced to the library community (resource description and access) and in creating an rdf data model for the rules. 
i am now in the process of trying to model my cataloging rules in the form of an rdf model, which can also be inspected at http://myee.bol.ucla.edu/. in the process of doing this, i have discovered a number of areas in which i am not sure that rdf is sophisticated enough yet to deal with our data. this article is an attempt to identify some of those areas and explore whether or not the problems i have encountered are soluble; in other words, whether or not our data might be able to live on the semantic web. in this paper, i am focusing on raising the questions about the suitability of rdf to our data that have come up in the course of my work.

this paper is a think piece about the possible future of bibliographic control; as such, it raises more complex questions than it answers. it is also a report on a work in progress: an experiment in building a resource description framework (rdf) model of frbrized descriptive and subject-cataloging rules. here my focus will be on the data model rather than on the frbrized cataloging rules for gathering data to put in the model, although i hope to have more to say about the latter in the future. the intent is not to present you with conclusions but to present some questions about data modeling that have arisen in the course of the experiment. my premise is that decisions about the data model we follow in the future should be made openly and as a community rather than in a small, closed group of insiders. if we are to move toward the creation of metadata that is more interoperable with metadata being created outside our community, as is called for by many in our profession, we will need to address these complex questions as a community following a period of deep thinking, clever experimentation, and astute political strategizing.

the vision

the semantic web is still a bewitching midsummer night’s dream.
it is the idea that we might be able to replace the existing html-based web consisting of marked-up documents (or pages) with a new rdf-based web consisting of data encoded as classes, class properties, and class relationships (semantic linkages), allowing the web to become a huge shared database. some call this web 3.0, with hyperdata replacing hypertext. embracing the semantic web might allow us to do a better job of integrating our content and services with the wider internet, thereby satisfying the desire for greater data interoperability that seems to be widespread in our field. it also might free our data from the proprietary prisons in which it is currently held and allow us to cooperate in developing open-source software to index and display the data in much better ways than we have managed to achieve so far in vendor-developed ils opacs or in giant, bureaucratic bibliographic empires such as oclc worldcat.

the semantic web also holds the promise of allowing us to make our work more efficient. in this bewitching vision, we would share in the creation of uniform resource identifiers (uris) for works, expressions, manifestations, persons, corporate bodies, places, subjects, and so on. at the uri would be found all of the data about that entity, including the preferred name and the variant names, but also including much more data about the entity than we currently put into our work (name-title and title), such as personal name, corporate name, geographic, and subject authority records. if any of that data needed to be changed, it would be changed only once, and the change would be immediately accessible to all users, libraries, and library staff by means of links down to local data such as circulation, acquisitions, and binding data. each work would need to be described only once at one uri, each expression would need to be described only once at one uri, and so forth.
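the "described only once at one uri" idea above can be sketched in a few lines of python. this is a minimal illustration under stated assumptions, not a description of any existing service: the example.org uris, the field names, and the two helper functions are all hypothetical. local records hold only a uri, so a correction made once in the shared description shows up wherever the uri is resolved.

```python
# hypothetical shared registry: one description per entity, keyed by uri.
# the example.org uris and the field names are illustrative assumptions.
registry = {
    "http://example.org/person/1": {
        "preferred_name": "twain, mark",
        "variant_names": ["clemens, samuel langhorne"],
    },
    "http://example.org/work/1": {
        "preferred_title": "adventures of huckleberry finn",
        "creator": "http://example.org/person/1",
    },
}

# local data (circulation, acquisitions, binding) stores only the work uri.
local_items = [
    {"barcode": "0001", "work": "http://example.org/work/1"},
    {"barcode": "0002", "work": "http://example.org/work/1"},
]


def display_line(item):
    """resolve the item's work uri at display time instead of copying names."""
    work = registry[item["work"]]
    creator = registry[work["creator"]]
    return f"{creator['preferred_name']}. {work['preferred_title']}"


def find_entities(name):
    """return uris whose preferred or variant name matches `name` exactly."""
    return [
        uri
        for uri, data in registry.items()
        if name == data.get("preferred_name")
        or name in data.get("variant_names", [])
    ]
```

changing `registry["http://example.org/person/1"]["preferred_name"]` once changes what `display_line` returns for every local item, which is the efficiency the vision describes. the sketch says nothing about who is allowed to make that change, which is the institutional question the text raises next.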
very much up in the air is the question of what institutional structures would support the sharing of the creation of uris for entities on the semantic web. for the data to be reliable, we would need to have a way to ensure that the system would be under the control of people who had been educated about the value of clean and accurate entity definition, the value of choosing “most commonly known” preferred forms (for display in lists of multiple different entities), and the value of providing access under all variant forms likely to be sought. at the same time, we would need a mechanism to ensure that any interested members of the public could contribute to the effort of gathering variants or correcting entity definitions when we have had inadequate information. for example, it would be very valuable to have the input of a textual or descriptive bibliographer applied to difficult questions concerning particular editions, issues, and states of a significant literary work. it would also be very valuable to be able to solicit input from a subject expert in determining the bounds of a concept entity (subject heading) or class entity (classification).

martha m. yee (myee@ucla.edu) is cataloging supervisor at the university of california, los angeles film and television archive.

56 information technology and libraries | june 2009

the experiment (my project)

to explore these bewitching ideas, i have been conducting an experiment. as part of my experiment, i designed a set of cataloging rules that are more frbrized than is rda in the sense that they more clearly differentiate between data applying to expression and data applying to manifestation. note that there is an underlying assumption in both frbr (which defines expression quite differently from manifestation) and on my part, namely that catalogers always know whether a given piece of data applies at the expression or the manifestation level.
that assumption is open to questioning in the process of the experiment as well. my rules also call for creating a more hierarchical and degressive relationship between the frbr entities work, expression, manifestation, and item, such that data pertaining to the work does not need to be repeated for every expression, data pertaining to the expression does not need to be repeated for every manifestation, and so forth. degressive is an old term used by bibliographers for bibliographies that provide great detail about first editions and less detail for editions after the first. i have adapted this term to characterize my rules, according to which the cataloger begins by describing the work; any details that pertain to all expressions and manifestations of the work are not repeated in the expression and manifestation descriptions.

this paper would be entirely too long if i spent any more time describing the rules i am developing, which can be inspected at http://myee.bol.ucla.edu. here, i would like to focus on the data-modeling process and the questions about the suitability of rdf and the semantic web for encoding our data. (by the way, i don’t seriously expect anyone to adopt my rules! they are radically different than the rules currently being applied and would represent a revolution in cataloging practice that we may not be up to undertaking in the current economic climate. their value lies in their thought-experiment aspect and their ability to clarify what entities we can model and what entities we may not be able to model.)

i am now in the process of trying to model my cataloging rules in the form of an rdf model (“rdf” as used in this paper should be considered from now on to encompass rdf schema [rdfs], web ontology language [owl], and simple knowledge organization system [skos] unless otherwise stated); this model can also be inspected at http://myee.bol.ucla.edu.
in the process of doing this, i have discovered a number of areas in which i am not sure that rdf is yet sophisticated enough to deal with our data. this article is an attempt to outline some of those areas and explore whether the problems i have encountered are soluble, in other words, whether or not our data might be able to live on the semantic web eventually. i have already heard from rdf experts bruce d’arcus (miami university) and rob styles (developer of talis, a semantic web technology company), whom i cite later, but through this article i hope to reach a larger community. my research questions can be found later, but first some definitions.

definition of terms

the semantic web is a way to represent knowledge; it is a knowledge-representation language that provides ways of expressing meaning that are amenable to computation; it is also a means of constructing knowledge-domain maps consisting of class and property axioms with a formal semantics.

rdf is a family of specifications for methods of modeling information that underpins the semantic web through a variety of syntax formats; an rdf metadata model is based on making statements about resources in the form of triples that consist of
1. the subject of the triple (e.g., “new york”);
2. the predicate of the triple that links the subject and the object (e.g., “has the postal abbreviation”); and
3. the object of the triple (e.g., “ny”).

xml is commonly used to express rdf, but it is not a necessity; it can also be expressed in notation 3 or n3, for example.1

rdfs is an extensible knowledge-representation language that provides basic elements for the description of ontologies, also known as rdf vocabularies. using rdfs, statements are made about resources in the form of
1. a class (or entity) as subject of the rdf triple (e.g., “new york”);
2. a relationship (or semantic linkage) as predicate of the rdf triple that links the subject and the object (e.g., “has the postal abbreviation”); and
3. a property (or attribute) as object of the rdf triple (e.g., “ny”).

owl is a family of knowledge representation languages for authoring ontologies compatible with rdf.

skos is a family of formal languages built upon rdf and designed for representation of thesauri, classification schemes, taxonomies, or subject-heading systems.

research questions

actually, the full-blown semantic web may not be exactly what we need. remember that the fundamental definition of the semantic web is “a way to represent knowledge.” the semantic web is a direct descendant of the attempt to create artificial intelligence, that is, of the attempt to encode enough knowledge of the real world to allow a computer to reason about reality in a way indistinguishable from the way a human being reasons. one of the research questions should probably be whether or not the technology developed to support the semantic web can be used to represent information rather than knowledge. fortunately, we do not need to represent all of human knowledge; we simply need to describe and index resources to facilitate their retrieval. we need to encode facts about the resources and what the resources discuss (what they are “about”), not facts about “reality.” based on our past experience, doing even this is not as simple as people think it is. the question is whether we could do what we need to do within the context of the semantic web. sometimes things that sound simple do not turn out to be so simple in the doing.

my research questions are as follows:
1. is it possible for catalogers to tell in all cases whether a piece of data pertains to the frbr expression or the frbr manifestation?
2. is it possible to fit our data into rdf? given that rdf was designed to encode knowledge rather than information, perhaps it is the wrong technology to use for our purposes?
3. if it is possible to fit our data into rdf, is it possible to use that data to design indexes and displays that meet the objectives of the catalog (i.e., providing an efficient instrument to allow a user to find a particular work of which the author and title are known, a particular expression of a work, all of the works of an author, all of the works in a given genre or form, or all of the works on a particular subject)?

as stated previously, i am not yet ready to answer these questions. i hope to find answers in the course of developing the rules and the model. in this paper, i am focusing on raising the questions about the suitability of rdf to our data that have come up in the course of my work.

other relevant projects

other relevant projects include the following:
1. frbr, functional requirements for authority data (frad), functional requirements for subject authority records (frsar), and frbr-object-oriented (frbroo). all are attempts to create conceptual models of bibliographic entities using an entity-relationship model that is very similar to the class-property model used by rdf.2
2. various initiatives at the library of congress (lc), such as lc subject headings (lcsh) in skos,3 the lc name authority file in skos,4 the lccn permalink project to create persistent uris for bibliographic records,5 and initiatives to provide skos representations for vocabularies and data elements used in marc, premis, and mets. these all represent attempts to convert our existing bibliographic data into uris that stand for the bibliographic entities represented by bibliographic records and authority records; the uris would then be available for experiments in putting our data directly onto the semantic web.
3. the dc-rda task group project to put rda data elements into rdf.6 as noted previously and discussed further later, rda is less frbrized than my cataloging rules, but otherwise this project is very similar to mine.
4. dublin core’s (dc’s) work on an rdf schema.7 dublin core is very focused on manifestation and does not deal with expressions and works, so it is less similar to my project than is the dc-rda task group’s project (see further discussion later).

why my project?

one might legitimately ask why there is a need for a different model than the ones already provided by frbr, frad, frsar, frbroo, rda, and dc. the frbr and rda models are still tied to the model that is implicit in our current bibliographic data in which expression and manifestation are undifferentiated. this is because publishers publish and libraries acquire and shelve manifestations. in our current bibliographic practice, a new bibliographic record is made for either a new manifestation or a new expression. thus, in effect, there is no way for a computer to tell one from the other in our current data. despite the fact that frbr has good definitions of expression (change in content) and manifestation (mere change in carrier), it perpetuates the existing implicit model in its mapping of attributes to entities. for example, frbr maps the following to manifestation: edition statements (“2nd rev. ed.”); statements of responsibility that identify translators, editors, and illustrators; physical description statements that identify illustrated editions; and extent statements that differentiate expressions (the 102-minute version vs. the 89-minute version); etc. thus the frbr definition of expression recognizes that a 2nd revised edition is a new expression, but frbr maps the edition statement to manifestation.
in my model, i have tried to differentiate more cleanly data applying to expressions from data applying to manifestations.8 frbr and rda tend to assume that our current bibliographic data elements map to one and only one group 1 entity or class. there are exceptions, such as title, which frbr and rda define at work, expression, and manifestation levels. however, there is a lack of recognition that, to create an accurate model of the bibliographic universe, more data elements need to be applied at the work and expression level in addition to (or even instead of) the manifestation level. in the appendix i have tried to contrast the frbr, frad, and rda models with mine. in my model, many more data elements (properties and attributes) are linked to the work and expression level. after all, if the expression entity is defined as any change in work content, the work entity needs to be associated with all content elements that might change, such as the original extent of the work, the original statement of responsibility, whether illustrations were originally present, whether color was originally present in a visual work, whether sound was originally present in an audiovisual work, the original aspect ratio of a moving image work, and so on. frbr also tends to assume that our current data elements map to one and only one entity. in working on my model, i have come to the conclusion that this is not necessarily true. in some cases, a data element pertaining to a manifestation also pertains to the expression and the work. in other cases, the same data element is specific to that manifestation, and, in other cases, the same data element is specific to its expression. this is true of most of the elements of the bibliographic description. 
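the point that a data element such as extent may belong at the work, expression, or manifestation level can be sketched with plain subject-predicate-object triples. this is only an illustration under stated assumptions: the `ex:` identifiers and predicate names are hypothetical and are not taken from frbr, rda, or the rdf model discussed in this paper, and the fallback from expression to work is one possible reading of the degressive idea, not the method used in that model.

```python
from collections import namedtuple

# a triple in the rdf sense: subject, predicate, object.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

# hypothetical identifiers and predicates, for illustration only.
triples = [
    Triple("ex:work1", "ex:hasExpression", "ex:expression1"),
    Triple("ex:work1", "ex:hasExpression", "ex:expression2"),
    Triple("ex:expression1", "ex:hasManifestation", "ex:manifestation1"),
    Triple("ex:expression2", "ex:hasManifestation", "ex:manifestation2"),
    # the original extent is recorded once, at the work level.
    Triple("ex:work1", "ex:originalExtent", "102 minutes"),
    # only the expression whose content differs carries its own extent.
    Triple("ex:expression2", "ex:extent", "89 minutes"),
]


def objects(triples, subject, predicate):
    """all objects stated for a given subject and predicate."""
    return [t.object for t in triples
            if t.subject == subject and t.predicate == predicate]


def parent(triples, child, predicate):
    """the subject that points at `child` through `predicate`, or None."""
    for t in triples:
        if t.predicate == predicate and t.object == child:
            return t.subject
    return None


def extent_for_manifestation(triples, manifestation):
    """use the expression's own extent if stated, else the work's original."""
    expression = parent(triples, manifestation, "ex:hasManifestation")
    work = parent(triples, expression, "ex:hasExpression")
    own = objects(triples, expression, "ex:extent")
    if own:
        return own[0]
    return objects(triples, work, "ex:originalExtent")[0]
```

here nothing about extent is stored on either manifestation: the first resolves to the work's 102 minutes and the second to its expression's 89 minutes. a record-per-manifestation model would instead repeat the extent on each record, which is how the expression-level distinction gets lost.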
frad, in attempting to deal with the fact that our current cataloging rules allow a single person to have several bibliographic identities (or pseudonyms), treats person, name, and controlled access point as three separate entities or classes. i have tried to keep my model simpler and more elegant by treating only person as an entity, with preferred name and variant name as attributes or properties of that entity.

frbroo is focused on the creation process for works, with special attention to the creation of unique works of art and other one-off items found in museums. thus frbroo tends to neglect the collocation of the various expressions that develop in the history of a work that is reproduced and published, such as translations, abridged editions, editions with commentary, etc.

dc has concentrated exclusively on the description of manifestations and has neglected expression and work altogether.

one of the tenets of semantic web development is that, once an entity is defined by a community, other communities can reuse that entity without defining it themselves. the very different definitions of the work and expression entities in the different communities described above raise some serious questions about the viability of this tenet.

assumptions

it should be noted that this entire experiment is based on two assumptions about the future of human intervention for information organization. these two assumptions are based on the even bigger assumption that, even though the internet seems to be an economy based on free intellectual labor, and, even though human intervention for information organization is expensive (and therefore at more risk than ever), human intervention for information organization is worth the expense.

assumption 1: what we need is not artificial intelligence, but a better human–machine partnership such that humans can do all of the intellectual labor and machines can do all of the repetitive clerical labor.
currently, catalogers spend too much time on the latter because of the poor design of current systems for inputting data. the universal employment provided by paying humans to do the intellectual labor of building the semantic web might be just the stimulus our economy needs.

■ assumption 2: those who need structured and granular data (and the precise retrieval that results from it) to carry out research and scholarship may constitute an elite minority rather than most of the people of the world (sadly), but that talented and intelligent minority is an important one for the cultural and technological advancement of humanity. it is even possible that, if we did a better job of providing access to such data, we might enable the enlargement of that minority.

■ granularity and structure issues

as soon as one starts to create a data model, one encounters granularity or cataloger-data parsing issues. these issues have actually been with us all along as we developed the data model implicit in aacr2r and marc 21. those familiar with rda, frbr, and frad development will recognize that much of that development is directed at increasing structure and granularity in cataloger-produced data to prepare for moving it onto the semantic web. however, there are clear trade-offs in an increase in structure and granularity. more structure and more granularity make possible more powerful indexing and more sophisticated display, but more structure and more granularity are more complex and expensive to apply and less likely to be implemented in a standard fashion across all communities; that is, it is less likely that interoperable data would be produced.
any switching or mapping that was employed to create interoperable data would produce the lowest common denominator (the simplest and least granular data), and once rendered interoperable, it would not be possible for that data to swim back upstream to regain its lost granularity. data with less structure and less granularity could be easier and cheaper to apply and might have the potential to be adopted in a more standard fashion across all communities, but that data would limit the degree to which powerful indexing and sophisticated display would be possible. take the example of a personal name: currently, we demarcate surname from forename by putting the surname first, followed by a comma and then the forename. even that amount of granularity can sometimes pose a problem for a cataloger who does not necessarily know which part of the name is surname and which part is forename in a culture unfamiliar to the cataloger. in other words, the more granularity you desire in your data, the more often the people collecting the data are going to encounter ambiguous situations. another example: currently, we do not collect information about gender self-identification; if we were to increase the granularity of our data to gather that information, we would surely encounter situations in which the cataloger would not necessarily know if a given creator was self-defined as a female or a male or of some other gender identity. presently, if we are adding a birth and death date, whatever dates we use are all together in a $d subfield without any separate coding to indicate which date is the birth date and which is the death date (although an occasional “b.” or “d.” will tell us this kind of information). we could certainly provide more granularity for dates, but that would make the marc 21 format much more complex and difficult to learn. 
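the date example can be made concrete with a small python sketch. it assumes the date forms commonly seen in name headings (a year range, an open-ended range, or a single year prefixed by “b.” or “d.”); the class and function names are hypothetical and belong to no marc 21 tool or library. it illustrates that an undifferentiated date string can be split into birth and death dates only by convention and guesswork, and that a bare year cannot be classified at all.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LifeDates:
    """Granular form: birth and death are coded separately."""
    birth: Optional[str] = None
    death: Optional[str] = None


def parse_dates_subfield(value: str) -> LifeDates:
    """Guess birth/death years from one undifferentiated date string.

    Hypothetical helper: the rules below are assumptions about common
    heading forms, not a complete or authoritative parser.
    """
    text = value.strip().rstrip(".,")
    if text.startswith("b."):
        return LifeDates(birth=text[2:].strip())
    if text.startswith("d."):
        return LifeDates(death=text[2:].strip())
    if "-" in text:
        birth, death = text.split("-", 1)
        return LifeDates(birth=birth.strip() or None,
                         death=death.strip() or None)
    # A bare year gives no indication of which date it is.
    return LifeDates()


closed_range = parse_dates_subfield("1835-1910")
open_range = parse_dates_subfield("1944-")
birth_only = parse_dates_subfield("b. 1835")
ambiguous = parse_dates_subfield("1835")
```

separately coded birth and death dates would remove the guesswork, at the cost of two more elements for every cataloger to learn and apply.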
people who dislike the marc 21 format already argue that it is too granular and therefore requires too much of a learning curve before people can use it. for example, tennant claims that “there are only two kinds of people who believe themselves able to read a marc record without referring to a stack of manuals: a handful of our top catalogers and those on serious drugs.”9 how much of the granularity already in marc 21 is used either in existing records or, even if present, is used in indexing and display software? granularity costs money, and libraries and archives are already starving for resources. granularity can only be provided by people, and people are expensive. granularity and structure also exist in tension with each other. more granularity can lead to less structure (or more complexity to retain structure along with granularity). in the pursuit of more granularity of data than we have now, rda, attempting to support rdf–compliant xml encoding, has been atomizing data to make it useful to computers, but this will not necessarily make the data more useful to humans. to be useful to humans, it must be possible to group and arrange (sort) the data meaningfully, both for indexing and for display. the developers of skos refer to the “vast amounts of unstructured (i.e., human readable) information in the web,”10 yet labeling bits of data as to type and recording semantic relationships in a machine-actionable way do not necessarily provide the kind of structure necessary to make data readable by humans and therefore useful to the people the web is ultimately supposed to serve. consider the case of music instrumentation. 
if you have a piece of music for five guitars and one flute, and you simply code number and instrumentation without any way to link “five” with “guitars” and “one” with “flute,” you will not be able to guarantee that a person looking for music for five flutes and one guitar will not be given this piece of music in their results (see figure 1).11 the more granular the data, the less the cataloger can build order, sequencing, and linking into the data; the coding must be carefully designed to allow the desired order, sequencing, and linking for indexing and display to be possible, which might call for even more complex coding. it would be easy to lose information about order, sequencing, and linking inadvertently.

actually, there are several different meanings for the term structure:

1. structure is an object of a record (structure of document?); for example, elings and waibel refer to “data fields . . . also referred to as elements . . . which are organized into a record by a data structure.”12
2. structure is the communications layer, as opposed to the display layer or content designation.13
3. structure is the record, field, and subfield.
4. structure is the linking of bits of data together in the form of various types of relationships.
5. structure is the display of data in a structured, ordered, and sequenced manner to facilitate human understanding.
6. data structure is a way of storing data in a computer so that it can be used efficiently (this is how computer programmers use the term).

i hasten to add that i am definitely in favor of adding more structure and granularity to our data when it is necessary to carry out the fundamental objectives of our profession and of our catalogs. i argued earlier that frbr and rda are not granular enough when it comes to the distinction between data elements that apply to expression and those that apply to manifestation.
if we could just agree on how to differentiate data applying to the manifestation from data applying to the expression instead of our current practice of identifying works with headings and lumping all manifestation and expression data together, we could increase the level of service we are able to provide to users a thousandfold. however, if we are not going to commit to differentiating between expression and manifestation, it would be more intellectually honest for frbr and rda to take the less granular path of mapping all existing bibliographic data to manifestation and expression undifferentiated, that is, to use our current data model unchanged and state this openly. i am not in favor of adding granularity for granularity’s sake or for the sake of vague conceptions of possible future use. granularity is expensive and should be used only in support of clear and fundamental objectives.

figure 1a. extract from yee rdf model that illustrates one technique for modeling musical instrumentation at the expression level (using a blank node to group repeated number and instrument type). properties shown: instrumentation of musical expression; original instrumentation of musical expression, number of a particular instrument; original instrumentation of musical expression, type of instrument.

figure 1b. example of encoding of musical instrumentation at the expression level based on the above model (5 guitars, 1 flute).

■ the goal: efficient displays and indexes

my main concern is that we model and then structure the data in a way that allows us to build the complex displays that are necessary to make catalogs appear simple to use. i am aware that the current orthodoxy is that recording data should be kept completely separate from indexing and display (“the applications layer”).
because i have spent my career in a field in which catalog records are indexed and displayed badly by systems people who don’t seem to understand the data contained in them, i am a skeptic. it is definitely possible to model and structure data in such a way that desired displays and indexes are impossible to construct. i have seen it happen! the lc working group report states that “it will be recognized that human users and their needs for display and discovery do not represent the only use of bibliographic metadata; instead, to an increasing degree, machine applications are their primary users.”14 my fear is that the underlying assumption here is that users need to (and can) retrieve the single perfect record. this will never be true for bibliographic metadata. users will always need to assemble all relevant records (of all kinds) as precisely as possible and then browse through them before making a decision about which resources to obtain. this is as true in the semantic web—where “records” can be conceived of as entity or class uris—as it is in the world of marc–encoded metadata. some of the problems that have arisen in the past in trying to index bibliographic metadata for humans are connected to the fact that existing systems do not group all of the data related to a particular entity effectively, such that a user can use any variant name or any combination of variant names for an entity and do a successful search. currently, you can only look for a match among two or more keywords within the bounds of a single manifestation-based bibliographic record or within the bounds of a single heading, minus any variant terms for that entity. 
thus, when you do a keyword search for two keywords, for example, “clemens” and “adventures,” you will retrieve only those manifestations of mark twain’s adventures of tom sawyer that have his real name (clemens) and the title word “adventures” co-occurring within the bounded space created by a single manifestation-based bibliographic record. instead, the preferred forms and the variant forms for a given entity need to be bounded for indexing such that the keywords the user employs to search for that entity can be matched using co-occurrence rules that look for matches within a single bounded space representing the entity desired. we will return to this problem in the discussion of issue 3 in the later section “rdf problems encountered.”

the most complex indexing problem has always proven to be the grouping or bounding of data related to a work, since it requires pulling in all variants for the creator(s) of that work as well. otherwise, a user who searches for a work using a variant of the author’s name and a variant of the title will continue to fail (as they do in all current opacs), even when the desired work exists in the catalog. if we could create a uri for the adventures of tom sawyer that included all variant names for the author and all variant titles for the work (including the variant title tom sawyer), the same keyword search described above (“clemens” and “adventures”) could be made to retrieve all manifestations and expressions of the adventures of tom sawyer, instead of the few isolated manifestations that it would retrieve in current catalogs.

we need to make sure that we design and structure the data such that the following displays are possible:

■ display all works by this author in alphabetical order by title with the sorting element (title) appearing at the top of each work displayed.
■ display all works on this subject in alphabetical order by principal author and title (with principal author and title appearing at top of each work displayed), or title if there is no principal author (with title appearing at top of each work displayed).

we must ensure that we design and structure the data in such a way that our structure allows us to create subgroups of related data, such as instrumentation for a piece of music (consisting of a number associated with each particular instrument), place and related publisher for a certain span of dates on a serial title change record, and the like.

■ which standards will carry out which functions?

currently, we have a number of different standards to carry out a number of different functions; we can speculate about how those functions might be allocated in a new semantic web–based dispensation, as shown in table 1. in table 1, data structure is taken to mean what a record represents or stands for; traditionally, a record has represented an expression (in the days of hand-press books) or a manifestation (ever since reproduction mechanisms have become more sophisticated, allowing an explosion of reproductions of the same content in different formats and coming from different distributors). rda is record-neutral; rdf would allow uris to be established for any and all of the frbr levels; that is, there would be a uri for a particular work, a uri for a particular expression, a uri for a particular manifestation, and a uri for a particular item. note that i am not using data structure in the sense that a computer programmer does (as a way of storing data in a computer so that it can be used efficiently).
currently, the encoding of facts about entity relationships (see table 1) is carried out by matching data-value character strings (headings or linking fields using issns and the like) that are defined by the lc/naco authority file (following aacr2r rules), lcsh (following rules in the subject cataloging manual), etc. in the future, this function might be carried out by using rdf to link the uri for a resource to the uri for a data value.

display rules (see table 1) are currently defined by isbd and aacr2r but widely ignored by systems, which frequently truncate bibliographic records arbitrarily in displays, supply labels, and the like; rda abdicates responsibility, pushing display out of the cataloging rules. the general principle on the web is to divorce data from display and allow anyone to display the data any way they want. display is the heart of the objects (or goals) of cataloging: the point is to display to the user the works of an author, the editions of a work, or the works on a subject. all of these goals only can be met if complex, high-quality displays can be built from the data created according to the data model.

table 1. possible reallocation of current functions in a new semantic web–based dispensation

function | current | future?
data content, or content guidelines (rules for providing data in a particular element) | defined by aacr2r and marc 21 | defined by rda and rdf/rdfs/owl/skos
data elements | defined by isbd–based aacr2r and marc 21 | defined by rda and rdf/rdfs/owl/skos
data values | defined by lc/naco authority file, lcsh, marc 21 coded data values, etc. | defined as ontologies using rdf/rdfs/owl/skos
encoding or labeling of data elements for machine manipulation; same as data format? | defined by iso 2709–based marc 21 | defined by rdf/rdfs/xml
data structure (i.e., what a record stands for) | defined by aacr2r and marc 21; also frbr? | defined by rdf/rdfs/owl/skos
schematization (constraint on structure and content) | marc 21, mods, dcmi abstract model | defined by rdf/rdfs/owl/skos
encoding of facts about entity relationships | carried out by matching data value strings (headings found in lc/naco authority file and lcsh, issn’s, and the like) | carried out by rdf/rdfs/owl/skos in the form of uri links
display rules | ils software, formerly isbd–based aacr2r | “application layer” or yee rules
indexing rules | ils software | sparql, “application layer,” or yee rules

indexing rules (see table 1) were once under the control of catalogers (in book and card catalogs) in that users had to navigate through headings and cross-references to find what they wanted; currently indexing is in the hands of system designers who prefer to provide keyword indexing of bibliographic (i.e., manifestation-based) records rather than provide users with access to the entities they are really interested in (works, authors and subjects), all represented currently by authority records for headings and cross-references. rda abdicates responsibility, pushing indexing concerns completely out of the cataloging rules. the general principle on the web is to allow resources to be indexed by any web search engines that wish to index them. current web data is not structured at all for either indexing or display. i would argue that our interest in the semantic web should be focused on whether or not it will support more data structure, as well as more logic in that data structure, to support better indexes and better displays than we have now in manifestation-based ils opacs. crucial to better indexing than we have ever had before are the co-occurrence rules for keyword indexing, that is, the rules for when a co-occurrence of two or more keywords should produce a match.
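the difference between the two co-occurrence rules can be sketched in python, reusing the “clemens” and “adventures” search discussed earlier. the three manifestation records, their identifiers, and the function names are invented for illustration, and the work-level “bounded space” is simply a list of name and title variants attached to a work identifier; no existing opac or rdf store is being described.

```python
import re

# Hypothetical manifestation-based records, each pointing to a work.
records = [
    {"id": "m1", "work": "w1",
     "text": "twain, mark. the adventures of tom sawyer"},
    {"id": "m2", "work": "w1",
     "text": "clemens, samuel langhorne. adventures of tom sawyer"},
    {"id": "m3", "work": "w1",
     "text": "twain, mark. tom sawyer"},
]

# Hypothetical bounded space for the work: all creator-name variants
# and all title variants gathered under one work identifier.
work_variants = {
    "w1": [
        "twain, mark",
        "clemens, samuel langhorne",
        "the adventures of tom sawyer",
        "tom sawyer",
    ],
}


def tokens(text):
    """Lowercased word tokens of a string."""
    return set(re.findall(r"\w+", text.lower()))


def search_by_record(keywords):
    """Keywords must co-occur within a single manifestation record."""
    wanted = {k.lower() for k in keywords}
    return [r["id"] for r in records if wanted <= tokens(r["text"])]


def search_by_work(keywords):
    """Keywords must co-occur within the variants bounded by one work;
    every manifestation of a matching work is then returned."""
    wanted = {k.lower() for k in keywords}
    hits = []
    for work_id, variants in work_variants.items():
        bounded = set().union(*(tokens(v) for v in variants))
        if wanted <= bounded:
            hits.extend(r["id"] for r in records if r["work"] == work_id)
    return hits


record_hits = search_by_record(["clemens", "adventures"])
work_hits = search_by_work(["clemens", "adventures"])
```

under the record-bounded rule only the one record that happens to carry both words is found; under the work-bounded rule all three manifestations are retrieved, including those that use neither “clemens” nor “adventures.”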
we need to be able to do a keyword search across all possible variant names for the entity of interest, and the entity of interest for the average catalog user is much more likely to be a particular work than to be a particular manifestation. unfortunately, catalog-use studies only have studied so-called known-item searches without investigating whether a known-item searcher was looking for a particular edition or manifestation of a work or was simply looking for a particular work in order to make a choice as to edition or manifestation once the work was found. however, common sense tells us that it is a rare user who approaches the catalog with prior knowledge about all published editions of a given work. the more common situation is surely one in which a user desires to read a particular shakespeare play or view a particular david lean film and discovers that the desired work exists in more than one expression or manifestation only after searching the catalog. we need to have the keyword(s) in our search for a particular work co-occur within a bounded space that encompasses all possible keywords that might refer to that particular work entity, including both creator and title keywords. notice in table 1 the unifying effect that rdf could potentially have; it could free us from the use of multiple standards that can easily contradict each other, or at least not live peacefully together. examples are not hard to find in the current environment. one that has cropped up in the course of rda development concerns family names. presently the rules for naming families are different depending on whether the family is the subject of a work (and established according to lcsh) or whether the family is responsible for a collection of papers (and established according to rda). 
■ types of data

rda has blurred the distinctions among certain types of data, apparently because there is a perception that on the semantic web the same piece of data needs to be coded only once, and all indexing and display needs can be supported from that one piece of data. i question that assumption on the basis of my experience with bibliographic cataloging. all of the following ways of encoding the same piece of data can still have value in certain circumstances:

■ transcribed; in rdf terms, a literal (i.e., any data that is not a uri, a constant value). transcribed data is data copied from an item being cataloged. it is valuable for providing access to the form of the name used on a title page and is particularly useful for people who use pseudonyms, corporate bodies that change name, and so on. transcribed data is an important part of the historical record and not just for off-line materials; it can be a historical record of changing data on notoriously fluid webpages.

■ composed; in rdf terms, also a literal. composed data is information composed by a cataloger on the basis of observation of the item in hand; it can be valuable for historical purposes to know which data was composed.

■ supplied; in rdf terms, also a literal. supplied data is information supplied by a cataloger from outside sources; it can be valuable for historical purposes to know which data was supplied and from which outside sources it came.

■ coded; in rdf, represented by a uri. coded data would likely transform on the semantic web into links to ontologies that could provide normalized, human-readable identification strings on demand, thus causing coded and normalized data to merge into one type of data. is it not possible, though, that the coded form of normalized data might continue to provide for more efficient searching for computers as opposed to humans? coded data also has great cross-cultural value, since it is not as language-dependent as literals or normalized headings.
■ normalized headings (controlled headings); in rdf, represented by a uri. normalized or controlled headings are still necessary to provide users with coherent, ordered displays of thousands of entities that all match the user’s search for a particular entity (work, author, subject, etc.). the reason google displays are so hideous is that, so far, the data searched lacks any normalized display data. if variant language forms of the name for an entity are linked to an entity uri, it should be possible to supply headings in the language and script desired by a particular user.

■ the rdf model

those who have become familiar with frbr over the years will probably not find it too difficult to transition from the frbr conceptual model to the rdf model. what frbr calls an “entity,” rdf calls a “subject” and rdfs calls a “class.” what frbr calls an “attribute,” rdf calls an “object” and rdfs calls a “property.” what frbr calls a “relationship,” rdf calls a “predicate” and rdfs calls a “relationship” or a “semantic linkage” (see table 2).

the difficulty in any data-modeling exercise lies in deciding what to treat as an entity or class and what to treat as an attribute or property. the authors of frbr decided to create a class called expression to deal with any change in the content of a work. when frbr is applied to serials, which change content with every issue, the model does not work well. in my model, i found it useful to create a new entity at the manifestation level, the serial title, to deal with the type of change that is more relevant to serials, the change in title. i also created another new entity at the manifestation level, title-manifestation, to deal with a change of title in a nonserial work that is not associated with a change in content. one hundred years ago, this entity would have been called title-edition.
i am also in the process of developing an entity at the expression level (surrogate) to deal with reproductions of original artworks that need to inherit the qualities of the original artwork they reproduce without being treated as an edition of that original artwork, which ipso facto is unique. these are just examples of cases in which it is not that easy to decide on the classes or entities that are necessary to accurately model bibliographic information. see the appendix for a complete comparison of the classes and entities defined in four different models: frbr, frad, rda, and the yee cataloging rules (ycr). the appendix also shows variation among these models concerning whether a given data element is treated as a class/entity or as an attribute/property. the most notable examples are name and preferred access point, which are treated as classes/entities in frad, as attributes in frbr and ycr, and as both in rda.

■ rdf problems encountered

my goal for this paper is to institute discussion with data modelers about which problems i observed are insoluble and which are soluble:

1. is there an assumption on the part of semantic web developers that a given data element, such as a publisher name, should be expressed as either a literal or using a uri (i.e., controlled), but never both? cataloging is rooted in humanistic practices that require careful recording of evidence. there will always be value in distinguishing and labeling the following types of data:

■ copied as is from an artifact (transcribed)
■ supplied by a cataloger
■ categorized by a cataloger (controlled)

tim berners-lee (the father of the internet and the semantic web) emphasizes the importance of recording not just data but also its provenance for the sake of authenticity.15 for many data elements, therefore, it will be important to be able to record both a literal (transcribed or composed form or both) and a uri (controlled form). is this a problem in rdf?
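as a hedged illustration of what is being asked for, the following python sketch records a publisher both as a transcribed literal and as a controlled uri on the same resource. the rdf data model itself does not forbid one resource from carrying both a literal-valued and a uri-valued statement, but everything specific here is an assumption: the triples are plain python tuples rather than real rdf, and the property names, the example.org identifiers, and the transcribed string are all invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class URI:
    value: str


@dataclass(frozen=True)
class Literal:
    value: str


# Invented identifiers; example.org is a reserved documentation domain.
BOOK = URI("http://example.org/manifestation/1")
TRANSCRIBED = URI("http://example.org/terms/publisherNameTranscribed")
CONTROLLED = URI("http://example.org/terms/publisher")

graph = [
    # Evidence copied as is from the (imaginary) title page.
    (BOOK, TRANSCRIBED, Literal("printed for the author by j. smith")),
    # Controlled form: a link to an entity with its own identifier.
    (BOOK, CONTROLLED, URI("http://example.org/agent/smith-j")),
]


def objects(subject, predicate):
    """All objects stated for a subject and predicate."""
    return [o for s, p, o in graph if s == subject and p == predicate]


def publisher_statement(subject):
    """Return (strings for display, uris for linking) for one resource."""
    display = [o.value for o in objects(subject, TRANSCRIBED)
               if isinstance(o, Literal)]
    links = [o.value for o in objects(subject, CONTROLLED)
             if isinstance(o, URI)]
    return display, links


display_forms, link_targets = publisher_statement(BOOK)
```

keeping the two statements under separate properties also preserves the label that says which form was transcribed and which was controlled, which is the distinction argued for above.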
as a corollary, if any data that can be given a uri cannot also be represented by a literal (transcribed and composed data, or one or the other), it may not be possible to design coherent, readable displays of the data describing a particular entity. among other things, cataloging is a discursive writing skill. does rdf require that all data be represented only once, either by a literal or by a uri? or is it perhaps possible that data that has a uri could also have a transcribed or composed form as a property? perhaps it will even be possible to store multiple snapshots of online works that change over time to document variant forms of a name for works, persons, and so on.

table 2. the frbr conceptual model translated into rdf and rdfs

frbr | rdf | rdfs
entity | subject | class
attribute | object | property
relationship | predicate | relationship/semantic linkage

2. will the internet ever be fast enough to assemble the equivalent of our current records from a collection of hundreds or even thousands of uris? in rdf, links are one-to-one rather than one-to-many. this leads to a great proliferation of reciprocal links. the more granularity there is in the data, the more linking is necessary to ensure that atomized data elements are linked together. potentially, every piece of data describing a particular entity could be represented by a uri leading out to a skos list of data values. the number of links necessary to pull together all of the data just to describe one manifestation could become astronomical, as could the number of one-to-one links necessary to create the appearance of a one-to-many link, such as the link between an author and all the works of an author. is the internet really fast enough to assemble a record from hundreds of uris in a reasonable amount of time?
given the often slow network throughput typical of many of our current internet connections, is it really practical to expect all of these pieces to be pulled together efficiently to create a single display for a single user? we yet may feel nostalgia for the single manifestation-based record that already has all of the relevant data in it (no assembly required). bruce d’arcus points out, however, that

“i think if you’re dealing with rdf, you wouldn’t necessarily be gathering these data in real-time. the uris that are the targets for those links are really just global identifiers. how you get the triples is a separate matter. so, for example, in my own personal case, i’m going to put together an rdf store that is populated with data from a variety of sources, but that data population will happen by script, and i’ll still be querying a single endpoint, where the rdf is stored in a relational database.”16

in other words, d’arcus essentially will put them all in one place, or in one database that “looks” from a uri perspective to be “one place” where they’re already gathered.

3. is rdf capable of dealing with works that are identified using their creators? we need to treat author as both an entity in its own right and as a property of a work, and in many cases the latter is the more important function for user service. lexical labels, or human-readable identifiers for works that are identified using both the principal author and the title, are particularly problematic in rdf given that the principal author is an entity in its own right. is rdf capable of supporting the indexing necessary to allow a user to search using any variant of the author’s name and any variant of the title of a work in combination and still retrieve all expressions and manifestations of that work, given that author will have a uri of its own, linked by means of a relationship link to the work uri?
is rdf capable of supporting the display of a list of one thousand works, each identified by principal author, in order first by principal author, then by title, then by publication date, given that the preferred heading for each principal author would have to be assembled from the uri for that principal author and the preferred title for each work would have to be assembled from the uri for that work? for fear that this will not, in fact, be possible, i have put a human-readable work-identifier data element into my model that consists of principal author and title when appropriate, even though that means the preferred name of the principal author may not be able to be controlled by the entity record for the principal author. any guidance from experienced data modelers in this regard would be appreciated.

according to bruce d’arcus, this is purely an interface or application question that does not require a solution at the data layer.17 since we have never had interfaces or applications that would do this correctly, even though the data is readily available in authority records, i am skeptical about this answer! perhaps bruce’s suggestion under item 9 of designating a sortname property for each entity is the solution here as well. my human-readable work identifier consisting of the name of the principal creator and uniform title of work could be designated the sortname property for the work. it would have to be changed whenever the preferred form of the name for the principal creator changed, however.

4. do all possible inverse relationships need to be expressed explicitly, or can they be inferred? my model is already quite large, and i have not yet defined the inverse of every property as i really should to have a correct rdf model.
in other words, for every property there needs to be an inverse property; for example, the property iscreatorof needs to have the inverse property iscreatedby; thus “twain” has the property iscreatorof, while “adventures of tom sawyer” has the property iscreatedby. perhaps users and inputters will not actually have to see the huge, complex rdf data model that would result from creating all the inverse relationships, but those who maintain the model will have to deal with a great deal of complexity. however, since i’m not a programmer, i don’t know how the complexity of rdf compares to the complexity of existing ils software.

5. can rdf solve the problems we are having now because of the lack of transitivity or inheritance in the data models that underlie current ilses, or will rdf merely perpetuate these problems? we have problems now with the data models that underlie our current ilses because of the inability of these models to deal with hierarchical inheritance, such that whatever is true of an entity in the hierarchy is also true of every entity below that entity in the hierarchy. one example is that of cross-references to a parent corporate body that should be held to apply to all subdivisions of that corporate body but never are in existing ils systems. there is a cross-reference from “fbi” to “united states. federal bureau of investigation,” but not from “fbi counterterrorism division” to “united states. federal bureau of investigation. counterterrorism division.” for that reason, a search in any opac name index for “fbi counterterrorism division” will fail. we need systems that recognize that data about a parent corporate body is relevant to all subdivisions of that parent body. we need systems that recognize that data about a work is relevant to all expressions and manifestations of that work.
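the desired behavior for the fbi example can be sketched in python. this is a minimal illustration of inheritance along an explicitly marked hierarchical link, written as application logic over an invented data structure; the identifiers, field names, and functions are hypothetical, and the sketch makes no claim about what rdf, owl, or any existing ils provides.

```python
import re

# Hypothetical authority data: each body has its own preferred name,
# its own variant names, and an explicit hierarchical parent link.
bodies = {
    "fbi": {
        "preferred": "united states. federal bureau of investigation",
        "variants": ["fbi"],
        "parent": None,
    },
    "fbi-ctd": {
        "preferred": "counterterrorism division",
        "variants": [],
        "parent": "fbi",
    },
}


def normalize(name):
    """Lowercase and drop punctuation so headings compare as keywords."""
    return " ".join(re.findall(r"\w+", name.lower()))


def names_for(body_id):
    """Every searchable name of a body, inheriting all names
    (preferred and variant) of its ancestors in the hierarchy."""
    body = bodies[body_id]
    own = [normalize(n) for n in [body["preferred"]] + body["variants"]]
    if body["parent"] is None:
        return own
    return [f"{parent_name} {own_name}"
            for parent_name in names_for(body["parent"])
            for own_name in own]


def lookup(query):
    """Identifiers of all bodies for which the query is a known name."""
    wanted = normalize(query)
    return [body_id for body_id in bodies if wanted in names_for(body_id)]


subdivision_hit = lookup("fbi counterterrorism division")
full_heading_hit = lookup(
    "united states. federal bureau of investigation. "
    "counterterrorism division"
)
```

because the variant “fbi” is inherited through the parent link, the search for “fbi counterterrorism division” succeeds without a separate cross-reference on the subdivision. the sketch depends on the parent link being marked as hierarchical, which is why it matters whether the data model can distinguish such links from other relationships.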
rdf allows you to link a work to an expression and an expression to a manifestation, but i don’t believe it allows you to encode the information that everything that is true of the work is true of all of its expressions and manifestations. rob styles seems to confirm this: “rdf doesn’t have hierarchy. in computer science terms, it’s a graph, not a tree, which means you can connect anything to anything else in any direction.”18 of course, not all links should be this kind of transitive or inheritance link. one expression of work a is linked to another expression of work a by links to work a, but whatever is true of one of those expressions is not necessarily true of the other; one may be illustrated, for example, while the other is not. whatever is true of one work is not necessarily true of another work related to it by a related-work link. it should be recognized that bibliographic data is rife with hierarchy. it is one of our major tools for expressing meaning to our users. corporate bodies have corporate subdivisions, and many things that are true for the parent body also are true for its subdivisions. subjects are expressed using main headings and subject subdivisions, and many things that are true for the main heading (such as variant names) also are true for the heading combined with one of its subdivisions. geographic areas are contained within larger geographic areas, and many things that are true of the larger geographic area also are true for smaller regions, counties, cities, etc., contained within that larger geographic area. for all these reasons, i believe that, to do effective displays and indexes for our bibliographic data, it is critical that we be able to distinguish between a hierarchical relationship and a nonhierarchical relationship.

6.
to recognize the fact that the subject of a book or a film could be a work, a person, a concept, an object, an event, or a place (all classes in the model), is there any reason we cannot define subject itself as a property (a relationship) rather than a class in its own right? in my model, all subject properties are defined as having a domain of resource, meaning there is no constraint as to the class to which these subject properties apply. i’m not sure if there will be any fall-out from that modeling decision.

7. how do we distinguish between the corporate behavior of a jurisdiction and the subject behavior of a geographical location? sometimes a place is a jurisdiction and behaves like a corporate body (e.g., united states is the name of the government of the united states). sometimes place is a physical location in which something is located (e.g., the birds discussed in a book about the birds of the united states). to distinguish between the corporate behavior of a jurisdiction and the subject behavior of a geographical location, i have defined two different classes for place: place as jurisdictional corporate body and place as geographic area. will this cause problems in the model? will there be times when it prevents us from making elegant generalizations in the model about place per se? there is a similar problem with events. some events are corporate bodies (e.g., conferences that publish papers) and some are a kind of subject (e.g., an earthquake). i have defined two different classes for event: conference or other event as corporate body creator and event as subject.

8. what is the best way to model a bound-with or an issued-with relationship, or a part–whole relationship in which the whole must be located to obtain the part? the bound-with relationship is actually between two items containing two different works, while the issued-with relationship is between two manifestations containing two different works (see figure 2).
is this a work-to-work relationship? will designating it a work-to-work relationship cause problems for indicating which specific items or manifestation-items of each work are physically located in the same place? this question may also apply to those part–whole relationships in which the part is physically contained within the whole and both are located in the same place (sometimes known as analytics). one thing to bear in mind is that in all of these cases the relationship between two works does not hold between all instances of each work; it only holds for those particular instances that are contained in the particular manifestation or item that is bound with, issued with, or part of the whole. however, if the relationship is modeled as a work-1-manifestation to work-2-manifestation relationship, or a work-1-item to work-2-item relationship, care must be taken in the design of displays to pull in enough information about the two or more works so as not to confuse the user.

9. how do we express the arrangement of elements that have a definite order? i am having trouble imagining how to encode the ordering of data elements that make up a larger element, such as the pieces of a personal name. this is really a desire to control the display of those atomized elements so that they make sense to human beings rather than just to machines. could one define a property such as natural language order of forename, surname, middle name, patronymic, matronymic and/or clan name of a person given that the ideal order of these elements might vary from one person to another? could one define properties such as sorting element 1, sorting element 2, sorting element 3, etc., and assign them to the various pieces that will be assembled to make a particular heading for an entity, such as an lcsh heading for a historical period? (depending on the answer to the question in item 11, it may or may not be possible to assign a property to a property in this fashion.)
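the idea of numbered sorting elements and a per-person natural-language order can be made concrete with a small python sketch. it says nothing about whether rdf can carry this information; the class `NamePart`, its field names, and the two sample names are all hypothetical, and the case-folded comparison is a simplification that ignores locale-aware collation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NamePart:
    kind: str              # e.g. "surname" or "forename"
    value: str
    sort_position: int     # plays the role of "sorting element 1, 2, 3"
    display_position: int  # natural-language order for this particular person


def sort_key(parts):
    """assemble a sort key from the parts, in sorting-element order."""
    ordered = sorted(parts, key=lambda part: part.sort_position)
    return tuple(part.value.casefold() for part in ordered)


def display_form(parts):
    """assemble the name as it would be read aloud for this person."""
    ordered = sorted(parts, key=lambda part: part.display_position)
    return " ".join(part.value for part in ordered)


# forename-first display, surname-first sorting
TWAIN = [
    NamePart("forename", "mark", sort_position=2, display_position=1),
    NamePart("surname", "twain", sort_position=1, display_position=2),
]

# surname-first display and surname-first sorting
MAO = [
    NamePart("surname", "mao", sort_position=1, display_position=1),
    NamePart("forename", "zedong", sort_position=2, display_position=2),
]

SORTED_DISPLAY = [display_form(p) for p in sorted([TWAIN, MAO], key=sort_key)]
```

here both people sort by surname, yet each is displayed in the order natural for that name. the design choice is to store the two orders separately on each part, so that neither order has to be inferred from the other.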
are there standard sorting rules we need to be aware of (in unicode, for example)? are there other rdf techniques available to deal with sorting and arrangement? bruce d’arcus suggests that, instead of coding the name parts, it would be more useful to designate sortname properties;19 might it not be necessary to designate a sortname property for each variant name, as well, for cases in which variants need to appear in sorted displays? and wouldn’t these sortname properties complicate maintenance over time as preferred and variant names changed?

10. how do we link related data elements in such a way that effective indexing and displays are possible? some examples: number and kind of instrument (e.g., music written for two oboes and three guitars); multiple publishers, frequencies, subtitles, editors, etc., with date spans for a serial title change (or will it be necessary to create a new manifestation for every single change in subtitle, publisher name, place of publication, etc.?). the assumption seems to be that there will be no repeatable data elements. based on my somewhat limited experience with rdf, it appears that there are record equivalents (every data element, whether property or relationship, pertaining to a particular entity with a uri), but there are no field or subfield equivalents that allow the sublinking of related pieces of data about an entity. indeed, rob styles goes so far as to argue that ultimately there is no notion of a “record” in rdf.20 it is possible that blank nodes might be able to fill in for fields and subfields in some cases for grouping data, but there are dangers involved in their use.21 to a cataloger, it looks as though the plan is for rdf data to float around loose without any requirement that there be a method for pulling it together into coherent displays designed for human beings.

11. can a property have a property in rdf?
as an example of where it might be useful to define a property of a property, robert maxwell suggests that date of publication is really an attribute (property) of the published by relationship (another property).22 another example: in my model, a variant title for a serial is a property. can that property itself have the property type of variant title to encompass things like spine title, key title, etc.? another example appeared in item 9, in which it is suggested that it might be desirable to assign sort-element properties to the various elements of a name property. 12. how do we document record display decisions? there is no way to record display decisions in rdf itself; it is completely display-neutral. we could not safely commit to a particular rdf–based data model until a significant amount of sample bibliographic data had been created and open-source indexing and display software had been designed and user-tested on that data. it may be that we will need to supplement rdf with some other encoding mechanism that allows us to record display decisions along with the data. current cataloging rules are about display as much as they are about content designation. isbd concerns the order in which the elements should be displayed to humans. the cataloging objectives concern display to users of such entity groups as the works of an author, the editions of a work, and the works on a subject. 13. can all bibliographic data be reduced to either a class or a property with a finite list of values? another way to put this is to ask if all that catalogers do could be reduced to a set of pull-down menus. cataloging is the art of writing discursive prose as much as it is the ability to select the correct value for a particular data element. we must deal with ambiguous data (presented by joe blow could mean that joe created the entire work, produced it, distributed it, sponsored it, or merely funded it). we must sometimes record information without knowing its exact meaning. 
we must deal with situations that have not been anticipated in advance. it is not possible to list every possible kind of data and every possible value for each type of data up front before any data is gathered. it will always be necessary to provide a plain-text escape hatch. the bibliographic world is a complex, constantly changing world filled with ambiguity.

figure 2. examples of part–whole relationships. how might these be best expressed in rdf?

issued-with relationship
a copy of charlie chaplin’s 1917 film the immigrant can be found on a videodisc compilation called charlie chaplin, the early years along with two other chaplin films. this compilation was published and collected by many different libraries and media centers. if a user wants to view this copy of the immigrant, he or she will first have to locate charlie chaplin, the early years, then look for the desired film at the beginning of the first videodisc in the set. the issued-with relationship between the immigrant and the other two films on charlie chaplin, the early years is currently expressed in the bibliographic record by means of a “with” note: first on charlie chaplin, the early years, v. 1 (62 min.) with: the count – easy street.

bound-with relationship
the university of california, los angeles film & television archive has acquired a reel of 16 mm. film from a collector who strung five warner bros. cartoons together on a single reel of film. we can assume that no other archive, library, or media collection will have this particular compilation of cartoons, so the relationship between the five cartoons is purely local in nature. however, any user at the film & television archive who wishes to view one of these cartoons will have to request a viewing appointment for the entire reel and then find the desired cartoon among the other four on the reel. the bound-with relationship among these cartoons is currently expressed in a holdings record by means of a “with” note: fourth on reel with: daffy doodles – tweety pie – i love to singa – along flirtation walk.

what are the next steps?

in a sense, this paper is a first crude attempt at locating unmapped territory that has not yet been explored. if we were to decide as a community that it would be valuable to move our shared cataloging activities onto the semantic web, we would have a lot of work ahead of us. if some of the rdf problems described above are insoluble, we may need to work with semantic web developers to create a more sophisticated version of rdf that can handle the transitivity and complex linking required by our data. we will also need to encourage a very complex existing community to evolve institutional structures that would enable a more efficient use of the internet for the sharing of cataloging and other metadata creation. this is not just a technological problem, but also a political one. in the meantime, the experiment continues. let the thinking and learning begin!

references and notes

1. “notation3, or n3 as it is more commonly known, is a shorthand non–xml serialization of resource description framework models, designed with human-readability in mind: n3 is much more compact and readable than xml rdf notation. the format is being developed by tim berners-lee and others from the semantic web community.” wikipedia, “notation 3,” http://en.wikipedia.org/wiki/notation_3 (accessed feb. 19, 2009).
2. frbr review group, www.ifla.org/vii/s13/wgfrbr/; frbr review group, franar (working group on functional requirements and numbering of authority records), www.ifla.org/vii/d4/wg-franar.htm; frbr review group, frsar (working group, functional requirements for subject authority records), www.ifla.org/vii/s29/wgfrsar.htm; frbroo, frbr review group, working group on frbr/crm dialogue, www.ifla.org/vii/s13/wgfrbr/frbr-crmdialogue_wg.htm.
3.
library of congress, response to on the record: report of the library of congress working group on the future of bibliographic control (washington, d.c.: library of congress, 2008): 24, 39, 40, www.loc.gov/bibliographic-future/news/lcwgrptresponse_dm_053008.pdf (accessed mar. 25, 2009).
4. ibid., 39.
5. ibid., 41.
6. dublin core metadata initiative, dcmi/rda task group wiki, http://www.dublincore.org/dcmirdataskgroup/ (accessed mar. 25, 2009).
7. mikael nilsson, andy powell, pete johnston, and ambjorn naeve, expressing dublin core metadata using the resource description framework (rdf), http://dublincore.org/documents/2008/01/14/dc-rdf/ (accessed mar. 25, 2009).
8. see for example table 6.3 in frbr, which maps to manifestation every kind of data that pertains to expression change with the exception of language change. ifla study group on the functional requirements for bibliographic records, functional requirements for bibliographic records (munich: k. g. saur, 1998): 95, http://www.ifla.org/vii/s13/frbr/frbr.pdf (accessed mar. 4, 2009).
9. roy tennant, “marc must die,” library journal 127, no. 17 (oct. 15, 2002): 26.
10. w3c, skos simple knowledge organization system reference, w3c working draft 29 august 2008, http://www.w3.org/tr/skos-reference/ (accessed mar. 25, 2009).
11. the extract in figure 1 is taken from my complete rdf model, which can be found at http://myee.bol.ucla.edu/ycrschemardf.txt.
12. mary w. elings and gunter waibel, “metadata for all: descriptive standards and metadata sharing across libraries, archives and museums,” first monday 12, no. 3 (mar. 5, 2007), http://www.uic.edu/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/1628/1543 (accessed mar. 25, 2009).
13. oclc, a holdings primer: principles and standards for local holdings records, 2nd ed. (dublin, ohio: oclc, 2008), 4, http://www.oclc.org/us/en/support/documentation/localholdings/primer/holdings%20primer%202008.pdf (accessed mar. 25, 2009).
14.
the library of congress working group, on the record: report of the library of congress working group on the future of bibliographic control (washington, d.c.: library of congress, 2008): 30, http://www.loc.gov/bibliographic-future/news/lcwg-ontherecord-jan08-final.pdf (accessed mar. 25, 2009).
15. talis, sir tim berners-lee talks with talis about the semantic web: transcript of an interview recorded on 7 february 2008, http://talis-podcasts.s3.amazonaws.com/twt20080207_timbl.html (accessed mar. 25, 2009).
16. bruce d’arcus, e-mail to author, mar. 18, 2008.
17. ibid.
18. rob styles, e-mail to author, mar. 25, 2008.
19. bruce d’arcus, e-mail to author, mar. 18, 2008.
20. rob styles, e-mail to author, mar. 25, 2008.
21. w3c, “section 2.3, structured property values and blank nodes,” in rdf primer: w3c recommendation 10 february 2004, http://www.w3.org/tr/rdf-primer/#structuredproperties (accessed mar. 25, 2009).
22. robert maxwell, frbr: a guide for the perplexed (chicago: ala, 2008).

entities/classes in rda, frbr, frad compared to yee cataloging rules (ycr)

rda, frbr, and frad ycr group 1: work work group 1: expression expression surrogate group 1: manifestation manifestation title-manifestation serial title group 1: item item group 2: person person fictitious character performing animal group 2: corporate body corporate body corporate subdivision place as jurisdictional corporate body conference or other event as corporate body creator jurisdictional corporate subdivision family (rda and frad only) group 3: concept concept group 3: object object group 3: event event or historical period as subject group 3: place place as geographic area discipline genre/form name identifier controlled access point rules (frad only) agency (frad only)

appendix.
entity/class and attribute/property comparisons 70 information technology and libraries | june 2009 attributes/properties in frbr compared to frad model entity frbr frad work title of the work form of work date of the work other distinguishing characteristics intended termination intended audience context for the work medium of performance (musical work) numeric designation (musical work) key (musical work) coordinates (cartographic work) equinox (cartographic work) form of work date of the work medium of performance subject of the work numeric designation key place of origin of the work original language of the work history other distinguishing characteristic expression title of the expression form of expression date of expression language of expression other distinguishing characteristics extensibility of expression revisability of expression extent of the expression summarization of content context for the expression critical response to the expression use restrictions on the expression sequencing pattern (serial) expected regularity of issue (serial) expected frequency of issue (serial) type of score (musical notation) medium of performance (musical notation or recorded sound) scale (cartographic image/object) projection (cartographic image/object) presentation technique (cartographic image/object) representation of relief (cartographic image/object) geodetic, grid, and vertical measurement (cartographic image/ object) recording technique (remote sensing image) special characteristic (remote sensing image) technique (graphic or projected image) form of expression date of expression language of expression technique other distinguishing characteristic surrogate can bibliographic data be put directly onto the semantic web? 
| yee 71 model entity frbr frad manifestation title of the manifestation statement of responsibility edition/issue designation place of publication/distribution publisher/distributor date of publication/distribution fabricator/manufacturer series statement form of carrier extent of the carrier physical medium capture mode dimensions of the carrier manifestation identifier source for acquisition/access authorization terms of availability access restrictions on the manifestation typeface (printed book) type size (printed book) foliation (hand-printed book) collation (hand-printed book) publication status (serial) numbering (serial) playing speed (sound recording) groove width (sound recording) kind of cutting (sound recording) tape configuration (sound recording) kind of sound (sound recording) special reproduction characteristic (sound recording) colour (image) reduction ratio (microform) polarity (microform or visual projection) generation (microform or visual projection) presentation format (visual projection) system requirements (electronic resource) file characteristics (electronic resource) mode of access (remote access electronic resource) access address (remote access electronic resource) edition/issue designation place of publication/distribution publisher/distributor date of publication/distribution form of carrier numbering title-manifestation serial title item item identifier fingerprint provenance of the item marks/inscriptions exhibition history condition of the item treatment history scheduled treatment access restrictions on the item location of item attributes/properties in frbr compared to frad (cont.) 
72 information technology and libraries | june 2009 model entity frbr frad person name of person dates of person title of person other designation associated with the person dates associated with the person title of person other designation associated with the person gender place of birth place of death country place of residence affiliation address language of person field of activity profession/occupation biography/history fictitious character performing animal corporate body name of the corporate body number associated with the corporate body place associated with the corporate body date associated with the corporate body other designation associated with the corporate body place associated with the corporate body date associated with the corporate body other designation associated with the corporate body type of corporate body language of the corporate body address field of activity history corporate subdivision place as jurisdictional corporate body conference or other event as corporate body creator jurisdictional corporate subdivision family type of family dates of family places associated with family history of family concept term for the concept type of concept object term for the object type of object date of production place of production producer/fabricator physical medium event term for the event date associated with the event place associated with the event attributes/properties in frbr compared to frad (cont.) can bibliographic data be put directly onto the semantic web? 
| yee 73 model entity frbr frad place term for the place coordinates other geographical information discipline genre/form name type of name scope of usage dates of usage language of name script of name transliteration scheme of name identifier type of identifier identifier string suffix controlled access point type of controlled access point status of controlled access point designated usage of controlled access point undifferentiated access point language of base access point script of base access point script of cataloguing transliteration scheme of base access point transliteration scheme of cataloguing source of controlled access point base access point addition rules citation for rules rules identifier agency name of agency agency identifier location of agency attributes/properties in frbr compared to frad (cont.) 74 information technology and libraries | june 2009 attributes/properties in rda compared to ycr model entity rda ycr work title of the work form of work date of work place of origin of work medium of performance numeric designation key signatory to a treaty, etc. 
other distinguishing characteristic of the work original language of the work history of the work identifier for the work nature of the content coverage of the content coordinates of cartographic content equinox epoch intended audience system of organization dissertation or theses information key identifier for work language-based identifier (preferred lexical label) variant language-based identifier (alternate lexical label) language-based identifier (preferred lexical label) for work language-based identifier for work (preferred lexical label) identified by principalcreator in combination with uniform title language-based identifier (preferred lexical label) for work identified by title alone (uniform title) supplied title for work variant title for work original language of work responsibility for work original publication statement of work dates associated with work original publication/release/broadcast date of work copyright date of work creation date of work date of first recording of a work date of first performance of a work finding date of naturally occurring object original publisher/distributor/broadcaster of work places associated with work original place of publication/distribution/broadcasting for work country of origin of work place of creation of work place of first recording of work place of first performance of work finding place of naturally occurring object original method of publication/distribution/broadcast of work serial or integrating work original numeric and/or alphabetic designations—beginning serial or integrating work original chronological designations— beginning serial or integrating work original numeric and/or alphabetic designations—ending serial or integrating work original chronological designations— ending encoding of content of work genre/form of content of work original instrumentation of musical work instrumentation of musical work—number of a particular instrument instrumentation of musical work—type of instrument original 
voice(s) of musical work voice(s) of musical work—number of a particular type of voice voice(s) of musical work—type of voice original key of musical work numeric designation of musical work coordinates of cartographic work equinox of cartographic work original physical characteristics of work original extent of work original dimensions of work mode of issuance of work can bibliographic data be put directly onto the semantic web? | yee 75 model entity rda ycr work (cont.) original aspect ratio of moving image work original image format of moving image work original base of work original materials applied to base of work work summary work contents list custodial history of work creation of archival collection censorship history of work note about relationship(s) to other works expression content type date of expression language of expression other distinguishing characteristic of the expression identifier for the expression summarization of the content place and date of capture language of the content form of notation accessibility content illustrative content supplementary content colour content sound content aspect ratio format of notated music medium of performance of musical content duration performer, narrator, and/or presenter artistic and/or technical credits scale projection of cartographic content other details of cartographic content awards key identifier for expression language-based identifier (preferred lexical label) for expression variant title for expression nature of modification of expression expression title expression statement of responsibility edition statement scale of cartographic expression projection of cartographic expression publication statement of expression place of publication/distribution/release/broadcasting for expression place of recording for expression publisher/distributor/releaser/broadcaster for expression publication/distribution/release/broadcast date for expression copyright date for expression date of recording for 
expression numeric and/or alphabetic designations for serial expressions chronological designations for serial expressions performance date for expression place of performance for expression extent of expression content of expression language of expression text language of expression captions language of expression sound track language of sung or spoken text of expression language of expression subtitles language of expression intertitles language of summary or abstract of expression instrumentation of musical expression instrumentation of musical expression—number of a particular instrument instrumentation of musical expression—type of instrument voice(s) of musical expression voice(s) of musical expression—number of a particular type of voice voice(s) of musical expression—type of voice key of musical expression appendages to the expression expression series statement mode of issuance for expression notes about expression surrogate [under development] attributes/properties in rda compared to ycr (cont.) 
76 information technology and libraries | june 2009 model entity rda ycr manifestation title statement of responsibility edition statement numbering of serials production statement publication statement distribution statement manufacture statement copyright date series statement mode of issuance frequency identifier for the manifestation note media type carrier type base material applied material mount production method generation layout book format font size polarity reduction ratio sound characteristics projection characteristics of motion picture film video characteristics digital file characteristics equipment and system requirements terms of availability key identifier for manifestation publication statement of manifestation place of publication/distribution/release/broadcast of manifestation manifestation publisher/distributor/releaser/broadcaster manifestation date of publication/distribution/release/broadcast carrier edition statement carrier piece count carrier name carrier broadcast standard carrier recording type carrier playing speed carrier configuration of playback channels process used to produce carrier carrier dimensions carrier base materials carrier generation carrier polarity materials applied to carrier carrier encoding format intermediation tool requirements system requirements serial manifestation illustration statement manifestation standard number manifestation isbn manifestation issn manifestation publisher number manifestation universal product code notes about manifestation titlemanifestation key identifier for title-manifestation variant title for title-manifestation title-manifestation title title-manifestation statement of responsibilities title-manifestation edition statement publication statement of title-manifestation place of publication/distribution/release/broadcasting of titlemanifestation publisher/distributor/releaser, broadcaster of title-manifestation date of publication/distribution/release/broadcast of titlemanifestation 
title-manifestation series title-manifestation mode of issuance notes about title-manifestation title-manifestation standard number attributes/properties in rda compared to ycr (cont.) can bibliographic data be put directly onto the semantic web? | yee 77 model entity rda ycr serial title key identifier for serial title variant title for serial title title of serial title serial title statement of responsibility serial title edition statement publication statement of serial title place of publication/distribution/release/broadcast of serial title publisher/distributor/releaser/broadcaster of serial title date of publication/distribution/release/broadcast of serial title serial title beginning numeric and/or alphabetic designations serial title beginning chronological designations serial title ending numeric and/or alphabetic designations serial title ending chronological designations serial title frequency serial title mode of issuance serial title illustration statement notes about serial title serial title issn-l item preferred citation custodial history immediate source of acquisition identifier for the item item-specific carrier characteristics key identifier for item item barcode item location item call number or accession number item copy number item provenance item condition item marks and inscriptions item exhibition history item treatment history item scheduled treatment item access restrictions attributes/properties in rda compared to ycr (cont.) 
78 information technology and libraries | june 2009 model entity rda ycr person name of the person preferred name for the person variant name for the person date associated with the person title of the person fuller form of name other designation associated with the person gender place of birth place of death country associated with the person place of residence address of the person affiliation language of the person field of activity of the person profession or occupation biographical information identifier for the person key identifier for person language-based identifier (preferred lexical label) for person clan name of person forename/given name/first name of person matronymic of person middle name of person nickname of person patronymic of person surname/family name of person natural language order of forename, surname, middle name, patronymic, matronymic and/or clan name of person affiliation of person biography/history of person date of birth of person date of death of person ethnicity of person field of activity of person gender of person language of person place of birth of person place of death of person place of residence of person political affiliation of person profession/occupation of person religion of person variant name for person fictitious character [under development] performing animal [under development] corporate body name of the corporate body preferred name for the corporate body variant name for the corporate body place associated with the corporate body date associated with the corporate body associated institution other designation associated with the corporate body language of the corporate body address of the corporate body field of activity of the corporate body corporate history identifier for the corporate body key identifier for corporate body language-based identifier (preferred lexical label) for corporate body dates associated with corporate body field of activity of corporate body history of corporate body language of corporate 
body place associated with corporate body type of corporate body variant name for corporate body corporate subdivision [under development] place as jurisdictional corporate body [under development] attributes/properties in rda compared to ycr (cont.) can bibliographic data be put directly onto the semantic web? | yee 79 model entity rda ycr conference or other event as corporate body creator [under development] jurisdictional corporate subdivision [under development] family name of the family preferred name for the family variant name for the family type of family date associated with the family place associated with the family prominent member of the family hereditary title family history identifier for the family concept term for the concept preferred term for the concept variant term for the concept type of concept identifier for the concept key identifier for concept language-based identifier (preferred lexical label) for concept qualifier for concept language-based identifier variant name for concept object name of the object preferred name for the object variant name for the object type of object date of production place of production producer/fabricator physical medium identifier for the object key identifier for object language-based identifier (preferred lexical label) for object qualifier for object language-based identifier variant name for object event name of the event preferred name for the event variant name for the event date associated with the event place associated with the event identifier for the event key identifier for event or historical period as subject language-based identifier (preferred lexical label) for event or historical period as subject beginning date for event or historical period as subject ending date for event or historical period as subject variant name for event or historical period as subject place name of the place preferred name for the place variant name for the place coordinates other geographical information identifier 
for the place key identifier for place as geographic area language-based identifier (preferred lexical label) for place as geographic area qualifier for place as geographic area variant name for place as geographic area discipline key identifier for discipline language-based identifier (preferred lexical label) (name or classification number or symbol) for discipline translation of meaning of classification number or symbol for discipline attributes/properties in rda compared to ycr (cont.) 80 information technology and libraries | june 2009 model entity rda ycr genre/form key identifier for genre/form language-based identifier (preferred lexical label) for genre/form variant name for genre/form name scope of usage date of usage identifier controlled access point rules agency note: in rda, the following attributes have not yet been assigned to a particular class or entity: extent, dimensions, terms of availability, contact information, restrictions on access, restrictions on use, uniform resource locator, status of identification, source consulted, cataloguer’s note, status of identification, and undifferentiated name indicator. name is being treated as both a class and a property. identifier and controlled access point are treated as properties rather than classes in both rda and ycr. attributes/properties in rda compared to ycr (cont.)

public libraries leading the way
utilizing technology to support and extend access to students and job seekers during the pandemic
daniel berra
information technology and libraries | march 2021 https://doi.org/10.6017/ital.v40i1.13261
daniel berra (danielb@pfulgervilletx.gov) is assistant director, pflugerville (texas) public library. © 2021. “public libraries leading the way” is a regular column spotlighting technology in public libraries.

the ongoing pandemic necessitated a reimagining of public library services and resources.
out of this challenge rose opportunities to better serve the needs of our communities during the pandemic and beyond. when our library first closed our doors to the public last march, we began discussions on how the needs of our community have changed. we identified two key groups for whom the pandemic had forced an uncomfortable shift: students suddenly thrust into virtual learning and adults who had lost their jobs. while we continue to serve all members of our community in a variety of ways, we looked to increase support for these specific groups utilizing available technology. like many public libraries, the pflugerville public library quickly shifted our service model to include virtual programs, curbside pickup, library cards issued remotely and a focus on electronic resources. our community is rapidly growing and diverse. many of our nearly 70,000 residents are frequent users of library services, attend our wide array of programs, hold meetings, study or work inside the building and enjoy both the physical and virtual library collection. the pandemic shift required our talented staff to find ways to provide a similar level of service to a community who heavily utilizes the library. for both students and job seekers, we took steps to alleviate some of the difficulties the building’s closure caused by utilizing existing technology. we worked with the city’s it department to extend the library’s wi-fi to cover the entire parking lot, allowing for 24-hour access. we also utilized our existing print from your own device system to allow library users to submit print jobs and then pick them up through our curbside service. we added additional wi-fi hotspots available for checkout to ensure access at home for those lacking internet. since these services were already offered to some degree, the expansion of access was relatively easy to implement. 
for students, we drew upon our existing relationship with the pflugerville independent school district (pfisd) to provide support and extend access. we expanded the offering of our special digit cards, which allow students to sign up for an account giving them access to all of our electronic resources and wi-fi hotspots. the school district’s librarians handle the signups and then submit the forms so we can set up the accounts and then contact students by email or phone. we further extended access to ebooks by working with the district and our vendor, overdrive, to provide a direct way for students to browse and check out through the district’s own ebook app. this allows students to seamlessly see both of our collections, significantly increasing their reading options and removing barriers to access. on the support front, we utilized a portion of the city’s cares act funds directed toward the library to launch a live, virtual tutoring service called brainfuse helpnow. students of all ages have anonymous access to tutors from home seven days a week, as well as additional homework support resources. this piece meshes nicely with some of our virtual programming for teens, like our sat and act practice tests and other test- and career-preparation e-resources. recognizing the pandemic’s impact on the economy, and how this directly affects our community, we worked to prioritize support for the unemployed and under-employed. we added a resume review/job-search coaching service led by two of our circulation staff members. we utilized another portion of our cares act funds to offer career online high school, providing adults with access to an online program to obtain their high school diploma. we also began lending laptops for home use to ensure access to necessary technology.
some of our support was already in place before the pandemic began, and we made a significant marketing push to highlight these e-resources. for instance, we partner with the pflugerville community development corporation to provide the online training resource lynda.com (soon to be linkedin learning). we saw a large increase in usage particularly in the first few months of the pandemic as community members looked to add employable skills to their toolboxes. we also created a page on our website with all of our job search assistance resources and services highlighted in one place. while the main emphasis of these efforts utilizes technology, serving the needs of the entire community also requires supporting those who are generally less connected. we have to balance our digital expectations with something more tangible, recognizing many library users still utilize the library in a more traditional way. for students, our senior youth services librarian partnered with pfisd for a book give away in conjunction with the district’s food distribution program to get books in the hands of children for the summer. we also began distributing “care kits” through our curbside service that include personal grooming products and cold weather gear for anyone in need. while 2020 featured the addition of many new services or significant expansion of existing ones, we are focused in 2021 on increasing our marketing efforts for these offerings. relying too heavily on digital forms of communication can limit the impact of our services. for instance, if we want to let people who do not have access to the internet at home know we have wi-fi hotspots and laptops available for checkout, then spreading the word through our standard methods of social media, website, and email will prove ineffective. with the building currently closed to the public, we face an additional barrier to communication. 
to help alleviate some of this, we have created a job search assistance flyer that we are distributing at places like local food pantries. we plan to expand on similar methods of marketing throughout the year. while positive feedback is often hidden from libraries since we prioritize patron privacy and anonymity, we have received a few specific stories that highlight our impact. our first scholarship recipient for career online high school shared how the opportunity to obtain her high school diploma will open up new professional avenues and erase the stigma of having not completed high school. another community member who took advantage of our job search coaching to prepare for an interview expressed gratitude to the library staff who helped increase his employment chances. we also see resumes and homework assignments printed through our virtual printing service, hear from parents with children utilizing hotspots for virtual schooling, see cars in the parking lot using the extended wi-fi, and track statistics showing a large increase in the usage of our electronic resources.
https://library.pflugervilletx.gov/services/assistance-for-job-seekers
the ongoing pandemic necessitated a re-imagining of library services. the needs of our community changed, and we set out to find ways to use technology to assist those who need it the most, while remaining mindful of those who are not as comfortable in the digital age. the combination of utilizing technology to address the current needs and expanding access to this technology has allowed us to better serve the community. we are in the process now of evaluating all of our changes to determine which ones will continue even after the pandemic ends.
we already know that we will keep our methods of extending access like the expanded wi-fi availability, laptops for checkout, digit cards for students, and the seamless connection to our ebook collection for pfisd. in the area of support, we will continue to offer career online high school, brainfuse helpnow for virtual tutoring, and our resume review/job search coaching service. public libraries are well positioned to innovate and adjust to changes in society. it is one of the things we do extremely well, out of necessity, but also out of a deep desire to serve our communities. all of the shifts the pflugerville public library made related to supporting students and job seekers drew upon existing technology and available resources. what changed were the areas on which we chose to focus our efforts. by prioritizing support and access while pinpointing the needs of the moment, we found ways to better serve our community within the context of everything else we provide. while the jury is still out on how successful some of these initiatives will prove, we already know that many of these changes will continue long after the pandemic ends.

editorial board thoughts | dehmlow 53
mark dehmlow
editorial board thoughts
the ten commandments of interacting with nontechnical people

more than ten years of working with technology and interacting with nontechnical users in a higher education environment has taught me many lessons about successful communication strategies. somehow, in that time, i have been fortunate to learn some effective mechanisms for providing constructive support and leading successful technical projects with both technically and “semitechnically” minded patrons and librarians. i have come to think of myself as someone who lives in the “in between,” existing more in the beyond than the bed or the bath, and, while not a native of either place, i like to think that i am someone who is comfortable in both the technical and traditional cliques within the library.
ironically, it turns out that the most critical pieces to successfully implementing technology solutions and bridging the digital divide in libraries have been categorically nontechnical in nature; it all comes down to collegiality, clear communication, and a commitment to collaboration. as i ruminated on the last ten-plus years of working in technology, i began to think of the behaviors and techniques that have proved most useful in developing successful relationships across all areas of the library. the result is this list of the top ten dos and don’ts for those of us self-identified techies who are working more and more often with the self-identified nontechnical set.

1. be inclusive. i have been around long enough to see how projects that include only technical people are doomed to scrutiny and criticism. the single best strategy i have found to getting buy-in for technical projects is to include key stakeholders and those with influence in project planning and core decision-making. not only does this create support for projects, but it encourages others to have a sense of ownership in project implementation, and when people feel ownership for a project, they are more likely to help it succeed.

2. share the knowledge. i don’t know if it is just the nature of librarianship, but librarians like to know things, and more often than not they have a healthy sense of curiosity about how things work. i find it goes a long way when i take a few moments to explain how a particular technology works. our public services specialists, in particular, often want to know the details of how our digital tools work so that they can teach users most effectively and answer questions users have about how they function. sharing expertise is a really nice way to be inclusive.

3. know when you have shared enough. in the same way that i don’t need to know every deep detail of collections management to appreciate it, most nontechies don’t need hour-long lectures on how each component of technology relates to the other. knowing how much information to share when describing concepts is critical to keeping people’s interest and generally keeping you approachable.

4. communicate in english. it is true that every specialization has its own vocabulary and acronyms (oh how we love acronyms in libraries) that have no relevance to nonspecialists. i especially see this in the jargon we use in the library to describe our tools and services. the best policy is to avoid jargon and explain concepts in lay-person’s terms or, if using jargon is unavoidable, define specialized words in the simplest terms possible. using analogies and drawing pictures can be excellent ways to describe technical concepts and how they work. it is amazing how much from kindergarten remains relevant later in life!

5. avoid techno-snobbery. i know that i am risking virtual ostracism in writing this, but i think it needs to be said. just because i understand technology does not make me better than others, and i have heard some variant of the “cup holder on the computer” joke way too often. even if you don’t make these kinds of comments in front of people who aren’t as technically capable as you, the attitude will be apparent in your interactions, and there is truly nothing more condescending.

6. meet people halfway. when people are trying to ask technology-related questions or converse about technical issues, don’t correct small mistakes. instead, try to understand and coax out their meaning; elaborate on what they are saying, and extend the conversation to include information they might not be aware of. people don’t like to be corrected or made to feel stupid; it is embarrassing. if their understanding is close enough to the basic idea, letting small mistakes in terminology slide can create an opening for a deeper understanding. you can provide the correct terminology when talking about the topic without making a point to correct people.

7. don’t make a clean technical/nontechnical distinction. after i once offered the “technical” perspective on a topic, one librarian told me that it wasn’t that they themselves didn’t have any technical perspective; it just wasn’t perhaps as extensive as mine. each person has some level of technical expertise; it is better to encourage the development of that understanding rather than compartmentalizing people on the basis of their area of expertise.

8. don’t expect everyone to be interested. just because i chose a technical track and am interested in it doesn’t mean everyone should be. sometimes people just want to focus on their area of expertise and let the technical work be handled by the techies.

9. assume everyone is capable, at least at some level. sometimes it is just a question of describing concepts in the right way, and besides, not everyone should be a programmer. everyone brings their own skills to the table and that should be respected.

10. expertise is just that, and no one, no one knows everything. there just isn’t enough time, and our brains aren’t that big. embrace those with different expertise, and bring those perspectives into your project planning. a purely technical perspective, while perhaps being efficient, may not provide a practical or intuitive solution for users. diversity in perspective creates stronger projects.

mark dehmlow (mdehmlow@nd.edu) is digital initiatives librarian, hesburgh libraries, university of notre dame, notre dame, indiana. information technology and libraries | june 2009.
in the same way that the most interesting work in academia is becoming increasingly multidisciplinary, so too the most successful work in libraries needs to bring diverse perspectives to the fore. while it is easy to say libraries are constantly becoming more technically oriented because of the expanse of digital collections and services, the need for the convergence of the technical and traditional domains is clear. digital preservation is a good example of an area that requires the lessons and strengths learned from physical preservation, and, if anything, the technical aspects still raise more questions than solutions; just see henry newman’s article “rocks don’t need to be backed up” to see what i mean.1 increasingly, as we develop and implement applications that better leverage our collections and highlight our services, their success hinges on their usability, user-driven design, and implementations based on user feedback. these “user”-based evaluation techniques fit more closely with traditional aspects of public services: interacting with patrons. lastly, it is also important to remember that technology can be intimidating. it has already caused a good deal of anxiety for those in libraries who are worried about long-term job security as technology continues to initiate changes in the way we perform our jobs. one of the best ways to bring people along is to demystify the scary parts of technology and help them see a role for themselves in the future of the library. going back to maslow’s hierarchy of needs, people want to feel a sense of security and belonging, and i believe it is incumbent upon those of us with a deep understanding of technology to help bring the technical to the traditional in a way that serves everyone in the process.

reference
1. henry newman, “rocks don’t need to be backed up,” enterprise storage forum.com (mar. 27, 2009), www.enterprisestorageforum.com/continuity/features/article.php/3812496 (accessed april 24, 2009).
student use of library computers: are desktop computers still relevant in today’s libraries?
susan thompson
information technology and libraries | december 2012 20

abstract
academic libraries have traditionally provided computers for students to access their collections and, more recently, to facilitate all aspects of studying. recent changes in technology, particularly the increased presence of mobile devices, call into question how libraries can best provide technology support and how it might affect the use of other library services. a two-year study conducted at california state university san marcos library analyzed student use of computers in the library, both the library’s own desktop computers and laptops owned by students. the study found that, despite the increased ownership of mobile technology by students, they still clearly preferred to use desktop computers in the library. it also showed that students who used computers in the library were more likely to use other library services and physical collections.

introduction
for more than thirty years, it has been standard practice in libraries to provide some type of computer facility to assist students in their research. originally, the focus was on providing access to library resources, first the online catalog and then journal databases. for the past decade or so, this has expanded to general-use computers, often in an information-commons environment, capable of supporting all aspects of student research from original resource discovery to creation of the final paper or other research product. however, times are changing and the ready access to mobile technology has brought into question whether libraries need to or should continue to provide dedicated desktop computers. do students still use and value access to computers in the library? what impact does student computer use have on the library and its other services?
have we reached the point where we should reevaluate how we use computers to support student research? california state university san marcos (csusm) is a public university with about nine thousand students, primarily undergraduates from the local area. csusm was established in 1991 and is one of the youngest campuses in the 23-campus california state university system. the library, originally located in space carved out of an administration building, moved into its own dedicated library building in 2004. one of the core principles in planning the new building was the vision of the library as a teaching and learning center. as a result, a great deal of thought went into the design of technology to support this vision. rather than viewing technology’s role as just supporting access to library resources, we expanded its role to providing cradle-to-grave support for the entire research process. we also felt that encouraging students to work in the library would encourage use of traditional library materials and the expertise of library staff, since these resources would be readily available.1 susan thompson (sthompsn@csusm.edu) is coordinator of library systems, california state university san marcos. student use of library computers | thompson 21 rethinking our assumptions about library technology’s role in the student research process led us to consider the entire building as a partner in the students’ learning process. rather than centralizing all computer support in one information commons, we wanted to provide technology wherever students want to use it. we used two strategies. first, we provided centralized technology using more than two hundred desktop computers, most located in four of our learning spaces: reference, classrooms, the media library, and the computer lab. three of these spaces are configured like information commons, providing full-service research computers grouped around the service desks near each library entrance. 
in addition, simplified “walk-up” computers are available on every floor. the simplified computers provide limited web services to encourage quick turnaround and have no login requirement to ensure ready access to library collections for everyone, including community members. the other major component of our technology plan was the provision of wireless throughout the building, along with extensive power outlets to support mobile computing. more than forty quiet study rooms, along with table “islands” in the stacks, help support the use of laptops for group study. however, only two of these quiet study rooms, located in the media library, provide desktop computers designed specifically to support group work. in 2009 and again in 2010, we conducted computer use studies to evaluate the success of the library’s technology strategy and determine whether the library’s desktop computers were still meeting student needs as envisioned by the building plan. the goal of the study was to obtain a better understanding of how students use the library’s computers, including types of applications used, computer preferences, and computer-related study habits. the study addressed several specific research questions. first, librarians were concerned that the expanded capabilities of the desktop computers distracted students from an academic and library research focus. were students using the library’s computers appropriately? second, the original technology plan had provided extensive support for mobile technology, but the technology landscape has changed over time. how did the increase in student ownership of mobile devices (now at more than 80 percent) affect the use of the desktop computers? finally, did providing an application-rich computer environment encourage students to conduct more of their studying in the library, leading them to use traditional library collections and services more frequently?
this article will focus on the study results pertaining to the second and third research questions. we found that, in line with our expectations, students using library computer facilities also made extensive use of traditional library services. however, we were surprised to discover that the growing availability of mobile devices had relatively little impact on students’ continuing preference for library-provided desktop computers.

literature review
the concept of the information commons was just coming into vogue in the early 2000s, when we were designing our library building, and it strongly influenced our technology design as well as building design. information commons, defined by steiner as the “functional integration of technology and service delivery,” have become one of the primary methods by which libraries provide enhanced computing support for students studying in the library.2 one of the changes in libraries motivating the information-commons concept is the desire to support a broad range of learning styles, including the propensity to mix academic and social activities. particularly influential to our design was the concept of the information commons supporting students’ projects “from inception to completion” by providing appropriate technologies to facilitate research, collaboration, and consultation.3 providing access to computers appears to contribute to the value of libraries as “place.” shill and toner, early in the era of information commons, noted “there are no systematic, empirical studies documenting the impact of enhanced library buildings on student usage of the physical library.”4 since then, several evaluations of the information-commons approach seem to show a positive correlation between creation of a commons and higher library usage because students are now able to complete all aspects of their assignments in the library.
for example, the university of tennessee and indiana university have shown significant increases in gate counts after they implemented their commons.5 while many studies discuss the value of information commons, very few look at why library computers are preferred over computers in other areas on campus. burke looked at factors influencing students’ choice of computing facilities at an australian university.6 given a choice of central computer labs, residence hall computers, and the library’s information commons, most students preferred the computers in the library over the other computer locations, with more than half using the library computers more than once a week. they rated the library most highly on its convenience and closeness to resources. perhaps the most important trend likely to affect libraries’ support for student technology needs is the increased use of mobile technology. the 2010 nationwide educause center for applied research (ecar) study, from the same year as the second csusm study, showed that 89 percent of students had laptops.7 other nationwide studies have corroborated this high level of laptop ownership.8 so, does this increased use of laptops and mobile devices affect the use of desktop computers? the 2010 ecar study reported that desktop ownership (about 50 percent in 2010) had declined by more than 25 percent between 2006 and 2009, a significant period in the lifetime of csusm’s new library building. pew’s internet & american life project trend data showed desktop ownership as the only gadget category in which ownership is decreasing, from 68 percent in 2006 to 55 percent at the end of 2011.9 some libraries and campuses are beginning to respond to the increase in laptop ownership by changing their support for desktop computers.
university of colorado boulder, in an effort to decrease costs and increase availability of flexible campus spaces, is making a major move away from providing desktop computers.10 while they found that 97 percent of their students own laptops and other mobile devices, they were concerned that many students still preferred to use desktop computers when on campus. to entice students to bring their laptops to campus, the university is enhancing their support for mobile devices by converting their central computer labs into flexible-use space with plentiful power outlets, flexible furniture, printing solutions, and access to the usual campus software. nevertheless, it may be premature for all libraries and universities to eliminate their desktop computer support. tom, voss, and scheetz found students want flexibility with a spectrum of technological options.11 certainly, they want wi-fi and power outlets to support their mobile technology. however, students also want conventional campus workstations providing a variety of functions, such as quick print and email computers, long-term workstations with privacy, and workstations at larger tables with multiple monitors that support group work. while the ubiquity of laptops is an important factor today, other forms of mobile devices may become more important in the future. a 2009 wall street journal article reported the trend for business travelers is to rely on smartphones rather than laptops.12 for the last three years, educause’s horizon reports have made support for non-laptop mobile technologies one of the top trends. 
the 2009 horizon report mentioned that in countries like japan, “young people equipped with mobiles often see no reason to own personal computers.”13 in 2010, horizon reported an interesting pilot project at a community college in which one group of students was issued mobile devices and another group was not.14 members of the group with the mobile devices were found to work on the course more during their spare time. the 2011 horizon report discusses mobiles as capable devices in their own right that are increasingly users’ first choice for internet access.15 therefore, rather than trying to determine which technology is most important, libraries may need to support multiple devices.

trends described in the ecar and horizon studies make it clear that students own multiple devices. so how do they use them in the study environment? head’s interviews with undergraduate students at ten us campuses found that “students use a less is more approach to manage and control all of the it devices and information systems available to them.”16 for example, in the days before final exams, students were selective in their use of technology to focus on coursework yet remain connected with the people in their lives. the question then may not be which technology libraries should support but rather how to support the right technology at the right time.

method

the csusm study used a mixed-method approach, combining surveys with real-time observation to improve the effectiveness of assessment and generate a more holistic understanding of how library users made their technology choices. the study protocol received exempt status from the university human subjects review board. it was carried out twice over a two-year period to determine whether time of the semester affected usage. in 2009, the study was administered at the end of the spring term, april 15 to may 3.
we expected that students near the end of the term would be preparing for finals and completing assignments, including major projects. the 2010 study was conducted near the beginning of the term, february 4 to february 18. we expected that early term students would be less engaged in academic assignments, particularly major research projects.

we carried out each study over a two-week period. an attempt was made to check consistency by duplicating each time and location. each location was surveyed monday through thursday, once in the morning and once in the afternoon during the heavy-use times of 11 a.m. and 2 p.m. the survey locations included two large computer labs (more than eighty computers each), one located near the library reference desk and one near the academic technology helpdesk. other locations included twenty computers in the media library, a handful of desktop computers in the curriculum area, and laptop users, mostly located on the fourth and fifth floor of the library. the fourth and fifth floor observations also included the library’s forty quiet study rooms. for the 2010 study, the other large computer lab on campus (108 computers), located outside the library, also was included for comparison purposes.

we used two techniques: a quantitative survey of library computer users and a qualitative observation of software application usage and selected study habits. the survey tried to determine the purpose for which the student was using the computer that day, what their computer preference was, and what other business they might have in the library. it also asked students for their suggestions for changes in the library. the survey was usually completed within the five-minute period that we had estimated and contained no identifying personal information.
the survey administrator handed out the one-page paper survey, along with a pencil if desired, to each student using a library workstation or using a laptop during each designated observation period. users who refused to take the survey were counted in the total number of students asked to do the survey. however, users who indicated they refused because they had already completed a survey on a previous observation date were marked as “dup” in the 2010 survey and were not counted again. the “dup” statistic proved useful as an independent confirmation of the popularity of the library computers.

the second method involved conducting “over-the-shoulder” observations of students using the library computers. while students were filling out the paper survey, the survey administrator walked behind the users and inconspicuously looked at their computer screens. all users in the area were observed whether or not they had agreed to take the survey. the one exception was users in group-study rooms. the observer did not enter the room and could only note behaviors visible from the door window, such as laptop usage or group studying. based on brief (one minute or less) observations, administrators noted on a form the type of software application the student was using at that point in time. the observer also noted other, nondesktop computer technical devices in use (specifically laptops, headphones, and mobile devices such as smart phones), and study behaviors, such as groupwork (defined as two or more people working together). the student was not identified on the form. we felt that these observations could validate information provided by the users on the survey.

results

we completed 1,452 observations in 2009 and 2,501 observations in 2010. the gate counts for the primary month each study took place (70,607 for april 2009 and 59,668 for february 2010) show the library was used more heavily during the final exam period.
the larger number of results the second year was due to more careful observation of laptop and study-group computer users on the fourth and fifth floor and the addition of observations in a nonlibrary computer lab rather than an increase of students available to be observed. the observations looked at application usage, study habits, and devices present, but this article will only discuss the observations pertaining to devices.

in 2009, 17 percent of students were observed using laptops (see table 1). this number almost doubled in 2010 to 33 percent. most laptop users were observed on the fourth and fifth floors where furniture, convenient electrical outlets, and quiet study rooms provided the best support for this technology. very few desktop computers were available, so students desiring to study on these floors had to bring their own laptops. almost 20 percent of students in 2010 were observed with other mobile technology, such as cell phones or ipods, and 16 percent were wearing headphones, which indicated there was other, often not visible, mobile technology in use.

table 1. mobile technology observed

in 2009, 1,141 students completed the computer-use survey. however, we were unable to accurately determine the return rate that year. the nature of the study, which surveyed the same locations multiple times, meant that many of the students were approached more than once to complete the survey. thus the majority of the refusals to take the survey occurred because the subject had already completed one previously. the 2010 study accounted for this phenomenon by counting refusals and duplications separately. in 2010, 1,123 students completed the survey out of 1,423 unique asks, resulting in a 79 percent return rate. the 619 duplicates counted represented about half of the 2010 surveys completed and could be considered another indicator of frequent use of the library’s computers.
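the return-rate arithmetic above can be checked with a short sketch. the counts are the ones reported in the text; the variable and function names are illustrative assumptions, not part of the study’s own tooling.

```python
# Figures reported in the text for the 2010 survey.
completed = 1123     # surveys completed
unique_asks = 1423   # unique students asked (duplicates excluded)
duplicates = 619     # students who declined because they had already responded


def percent(part: int, whole: int) -> int:
    """Return part/whole as a whole-number percentage."""
    return round(100 * part / whole)


return_rate = percent(completed, unique_asks)
duplicate_share = percent(duplicates, completed)

print(f"return rate: {return_rate}%")
print(f"duplicates relative to completed surveys: {duplicate_share}%")
```

the second figure is one way to read the “about half” statement: it compares the duplicate count with the number of completed surveys.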
the 2010 results included an additional 290 surveys completed by students using the other large computer lab on campus outside the library.

table 1 values (percent of students observed):
laptop in use: 17% (2009), 33% (2010)
headphones in use: 16% (2010)
mobile device in use (cell phone, ipod): 18% (2010)

table 2. frequency of computer use
daily when on campus: 49% (2009), 42% (2010)
several times a week: 33% (2009), 30% (2010)
several times a month: 11% (2009), 15% (2010)
rarely use comps in library: 9% (2009), 10% (2010)

in both years of the study, 78 percent of students said they preferred to use computers in the library to other computer lab locations on campus. students also indicated they were frequent users (see table 2). in 2009, 82 percent of students used the library computers frequently: 49 percent daily and 33 percent several times a week. the frequency of use in the 2010 early term study dropped about 10 percentage points to 72 percent but with the same proportion of daily vs. weekly users. convenience and quiet were the top reasons given by more than half of students as to why they preferred the library computers, followed closely by atmosphere. about a quarter of students preferred library computers because of their close access to other library services.

table 3. preferred computer to use in the library

the types of computer that students preferred to use in the library were desktop computers followed by laptops owned by the students (see table 3). it is notable that the preference for desktop computers changed significantly from 2009 to 2010: 84 percent of students preferred desktop computers in 2009 vs. 72 percent in 2010, a decrease of 12 percentage points. not surprisingly, few students preferred the simplified walk-up computers used for quick lookups. however, we did not expect such little interest in checking out laptops, with only 2 percent preferring that option.
the 2010 study added a new question to the survey to better understand the types of technology devices owned by students (see table 4). in 2010, 84 percent of students owned a laptop (combining the netbook and laptop statistics). almost 40 percent of students owned a desktop; therefore, many students owned more than one type of computer. of the 85 percent of students who indicated they had a cell phone, about one-third indicated they owned smart phones. the majority of students own music players. the one technology students were not interested in was e-book readers, with less than 2 percent indicating ownership.

table 3 values (percent of respondents):
sit-down pc: 84% (2009), 71% (2010)
walk-up pc: 6% (2009), 5% (2010)
own laptop: 23% (2009), 28% (2010)
laptop checked out in library: 2% (2009), 2% (2010)

table 4. technology devices owned by students (2010)

to understand how the use of technology might affect use of the library in general, the survey asked students what other library services they used on the same day they were using library computers. table 5 shows survey responses are very similar between the late term 2009 study and the early term in 2010. by far the most popular use of the library, by more than three-quarters of the students, was for study. around 25 percent of the students planned to meet with others, and 20 percent planned to use the media services. around 15 percent of students planned to check out print books, 15 percent planned to use journals, and 10 percent planned to ask for help. the biggest difference for students early in the term was an increased interest (5 percent more) in using the library for study. the late-term students were 9 percent more likely to meet with others.

by contrast, users in the nonlibrary computer lab were much less likely to make use of other library services. only 24 percent of nonlibrary users planned to study in the library, and 8 percent planned to meet with others in the library that day.
use of all other library services was less than 5 percent by the nonlibrary computer users.

table 4 values (percent of 2010 respondents owning each device):
laptop: 77%
ipod/mp3 music player: 59%
regular cell phone: 52%
desktop computer: 40%
smart phone: 31%
netbook: 7%
other handheld devices: 1%
kindle/book reader: 1%

table 5. other library services used (values listed as 2009, 2010, nonlibrary lab)
study: 76%, 81%, 23%
meet with others: 35%, 26%, 7%
use media: 20%, 22%, 4%
checkout a book: 16%, 13%, 3%
look for journals/newspapers: 15%, 13%, 3%
ask questions/get help: 10%, 10%, 2%
use a reserve book: 8%, 9%, 2%
create a video/web page: 6%, 3%, 0%
pick up ill/circuit: 3%, 3%, 0%
other: 0%, 4%, 1%

in 2010, we also asked users what changes they would like in the library, and 58 percent of respondents provided suggestions. the question was not limited to technology, but by far the biggest request for change was to provide more computers (requested by 30 percent of all respondents). analysis of the other survey questions regarding computer ownership and preferences revealed who was requesting more traditional desktops in the library. surprisingly, most were laptop users; 90 percent of laptop owners wanted more computers and 88 percent of the respondents making this request were located on the fourth and fifth floors, where users were almost exclusively on laptops. the next most common comments were remarks indicating student satisfaction with the current library services: 19 percent of students said they were satisfied with current library services and 9 percent praised the library and its services. commonality of requests dropped quickly at that point, with the fourth most common request being for more quiet (2 percent).

discussion

the results show that students consistently prefer to use computers in the library, with 78 percent declaring a preference for the library over other computer locations on campus both years of the study.
this preference is confirmed by the statistics reported by csusm’s campus it department, which tracks computer login data. this data consistently shows the library computer labs are used more than nonlibrary computer labs, with the computers near the library reference desk as the most popular followed closely by the library’s second large computer lab, which is located next to the technology help desk. for instance, during the 2010 study period, the reference desk lab (80 computers) had 6,247 logins compared to 3,218 logins in the largest nonlibrary lab (108 computers), roughly double the amount of usage. the data also shows that use of the computers near the reference desk increased by 15 percent between 2007 and 2010.

supporting the popularity of using computers in the library is the fact that most students are repeat customers. table 2 shows 82 percent of the 2009 late-term respondents used the library computers several times a week with almost half using our computers daily. in contrast, 72 percent of the 2010 early term students used the library computers daily or several times a week. the 10 percentage point drop in frequency of visits to the library for computing applied to both laptop and desktop users and seems to be largely due to not yet receiving enough work from classes to justify more frequent use.

the kind of computer that users preferred changed somewhat over the course of the study. the preference for desktop computers dropped from 84 percent of students in 2009 to 72 percent in 2010 (see table 3). one reason for this 12 percentage point drop may be related to how the survey was administered. the 2010 study did a more thorough job of surveying the fourth and fifth library floors where most laptop users are. as a result, the laptop floors represented 29 percent of the response in 2010 vs. only 13 percent in 2009. these numbers are also reflected in the proportion of laptops observed each year: 33 percent in 2010 vs. 17 percent in 2009 (see table 1).
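the login comparison reported above can also be normalized by lab size, since the two labs hold different numbers of computers. the sketch below uses only the login and computer counts given in the text; the per-computer figure is an added illustration, not a statistic reported by the study.

```python
# Login counts and lab sizes reported for the 2010 study period.
labs = {
    "library reference desk lab": {"computers": 80, "logins": 6247},
    "largest nonlibrary lab": {"computers": 108, "logins": 3218},
}

# Logins per computer for each lab.
per_computer = {
    name: lab["logins"] / lab["computers"] for name, lab in labs.items()
}

library = labs["library reference desk lab"]
other = labs["largest nonlibrary lab"]

# Ratio of raw login totals (the "double" comparison in the text).
total_ratio = library["logins"] / other["logins"]

# Ratio after adjusting for the number of computers in each lab.
per_computer_ratio = (
    per_computer["library reference desk lab"]
    / per_computer["largest nonlibrary lab"]
)

for name, value in per_computer.items():
    print(f"{name}: {value:.1f} logins per computer")
print(f"ratio of total logins: {total_ratio:.2f}")
print(f"ratio of logins per computer: {per_computer_ratio:.2f}")
```

because the library lab has fewer computers, the per-computer ratio comes out larger than the ratio of raw totals.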
the drop in desktop computer preference is interesting because it was not matched by an equally large increase in laptop preference, which only increased by 5 percentage points. the other reason for the decrease in desktop preference is likely the larger change seen nationwide in student laptop ownership. for instance, the pew study of gadget ownership showed a 13 percent drop in desktop ownership over a five-year period, 2006–2011, while at the same time laptop ownership almost doubled from 30 percent to 56 percent.17 however, it is interesting to note that, according to the pew study, in 2011 the percent of adults who owned each type of device was nearly equal: 55 percent for desktops and 56 percent for laptops.

the 2010 survey tried to better understand students’ preferences by identifying all the kinds of technology they had available to them. we found that 77 percent of csusm students owned laptops and an additional 7 percent owned the netbook form of laptops (see table 4). the combined 84 percent laptop ownership is comparable with the 2010 ecar study’s finding of 89 percent student laptop ownership nationwide.18 this high level of laptop ownership may explain why the users who preferred laptop computers almost all preferred to use their own rather than laptops checked out in the library.

despite the high laptop ownership and decrease in desktop preference, it is significant that the majority of csusm students still prefer to use desktop computers in the library. aside from the 72 percent of respondents who specifically stated a preference for desktop computers, the top suggestion for library improvement was to add more desktop computers, requested by 38 percent of respondents. further analysis of the survey data revealed that it was the laptop owners and the fourth and fifth floor laptop users who were the primary requestors of more desktop computers.
to try to better understand this seemingly contradictory behavior, we have done some further investigation. anecdotal conversations with users during the survey indicated that convenience and reliability are two factors affecting students’ decisions to use desktop computers. the desktop computers’ speed and reliable internet connections were regarded as particularly important when uploading a final project to a professor, with some students stating they came to the library specifically to upload an assignment.

in may 2012, the csusm library held a focus group that provided additional insight into the question of desktops vs. laptops. all eight student participants in the focus group owned laptops, yet all eight indicated that they preferred to use desktop computers in the library. when asked why, participants cited the reliability and speed of the desktop computers and the convenience of not having to remember to bring their laptop to school and “lug” it around. another factor influencing convenience may be that our campus does not require that students own a laptop and bring it to class, so they may have less motivation to travel with their laptop. supporting the idea that students perceive different benefits for each type of computer, six of the eight participants owned a desktop computer in addition to a laptop. the 2010 study also showed that students see value in owning both a desktop and a laptop computer, since the 40 percent ownership of desktop computers overlaps the 84 percent ownership of laptops (see table 4).

table 6. reasons students prefer using library computer areas

for almost half of the students surveyed, one of the reasons for their preference for using computers in the library was either the ready access to library services or staff (see table 6).
[table 6 chart: categories “library services are close” and “library staff are close”; series 2009 and 2010; axis 0% to 30%; bar values not recoverable]

even more significant, when specifically asked what else they planned to do in the library that day besides using the computer (see table 5), more than 80 percent of the students indicated that they intended to use the library for purposes other than computing. the top two uses for the library were studying (76 percent in 2009, 81 percent in 2010) and meeting with others (35/26 percent), indicating the importance of the library as place. the most popular library service was the media library (20/22 percent) followed by collections with 16/13 percent planning to check out a book and 15/13 percent planning to look for journals and newspapers. it is interesting that the level of use of these library services was similar whether early or late in the term. the biggest difference was that early term students were less likely to be working with a group but were slightly more likely to be engaged in general studying. even the less-used services, such as asking a question (10 percent) or using a reserve book (8 percent), exhibited an appropriate amount of usage if one looks at the actual numbers. for example, 8 percent of 1,123 2010 survey respondents represent 90 students who used reserve materials sometime during the 8 hours of the two-week survey period.

to put the use of the library by computer users into perspective, we also asked students using the nonlibrary computer lab if they planned to use the library sometime that same day. only 24 percent of the nonlibrary computer users planned to study in the library that day vs. 81 percent of the library computer users; only 4 percent planned to use media vs. 24 percent; and 2 percent planned to check out a book vs. 13 percent. the implication is clear that students using computers in the library are much more likely to use the library’s other services.
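the conversion from percentages to “actual numbers” used in the reserve-book example above can be sketched as follows. the respondent total and the two percentages come from the text; the dictionary and its labels are illustrative assumptions.

```python
# Number of 2010 survey respondents reported in the text.
respondents_2010 = 1123

# Percentages quoted in the text for two of the less-used services.
shares = {
    "used a reserve book": 8,
    "asked a question": 10,
}

# Convert each percentage into an approximate number of students.
counts = {
    service: round(respondents_2010 * pct / 100)
    for service, pct in shares.items()
}

for service, count in counts.items():
    print(f"{service}: about {count} students")
```

the same conversion applies to any other row of table 5, with the caveat that the published percentages are already rounded.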
we usually think of providing desktop computers as a service for students, and so it is. however, the study results show that providing computers also benefits the library itself. it reinforces its role as place by providing a complete study environment for students and encouraging all study behaviors including communication and working with others. the popularity of the library computers provides us with a “captive audience” of repeat customers.

conclusion

the csusm library technology that was planned in 2004 is still meeting students’ needs. although most of our students own laptops, most still prefer to use desktop computers in the library. in fact, providing a full-service computer environment to support the entire research process benefits the entire library. students who use computers in the library appear to conduct more of their studying in the library and thus make more use of traditional library collections and services.

going forward, several questions arise for future studies. csusm is a commuter school. students often treat their work space in the library as their office for the day, which increases the importance of a reliable and comfortable computer arrangement. one question that could be asked is whether the results would be different for colleges where most students live on campus or nearby. if the university requires that all students own their own laptop and expects them to bring them to class, how does that affect the relevance of desktop computers in the library? the 2010 study was completed just a few weeks before the first ipad was introduced. since students have identified convenience and weight as reasons for not carrying their laptops, are tablets and ultra-light computers, like the macbook air, more likely to be carried on campus by students and used more frequently for their research?
how important is it to have a supportive mobile infrastructure with features such as high speed wifi, ability to use campus printers, and access to campus applications? are students using smart phones and other mobile devices for study purposes? in fact, are we focusing too much on laptops, and are other mobile devices starting to take over that role?

this study’s results make it clear that we can’t just look at data such as ecar’s, which show high laptop ownership, and assume that means students don’t want or won’t use library computers. as the types of mobile devices continue to grow and evolve, libraries should continue to develop ways to facilitate their research role. however, the bottom line may not be that one technology will replace another but rather that students will have a mix of devices and will choose which device is best suited to a particular purpose. therefore libraries, rather than trying to pick which device to support, may need to develop a broad-based strategy to support them all.

references

1. susan m. thompson and gabriella sonntag, “chapter 4: building for learning: synergy of space, technology and collaboration,” in learning commons: evolution and collaborative essentials (oxford: chandos publishing, 2008), 117–199.
2. heidi m. steiner and robert p. holley, “the past, present, and possibilities of commons in the academic library,” reference librarian 50, no. 4 (2009): 309–332.
3. michael j. whitchurch and c. jeffery belliston, “information commons at brigham young university: past, present, and future,” reference services review 34, no. 2 (2006): 261–78.
4. harold shill and shawn tonner, “creating a better place: physical improvements in academic libraries, 1995–2002,” college & research libraries 64 (2003): 435.
5. barbara i. dewey, “social, intellectual, and cultural spaces: creating compelling library environments for the digital age,” journal of library administration 48, no.
1 (2008): 85–94; diane dallis and carolyn walters, “reference services in the commons environment,” reference services review 34, no. 2 (2006): 248–60.
6. liz burke et al., “where and why students choose to use computer facilities: a collaborative study at an australian and united kingdom university,” australian academic & research libraries 39, no. 3 (september 2008): 181–97.
7. shannon d. smith and judith borreson caruso, the ecar study of undergraduate students and information technology, 2010 (boulder, co: educause center for applied research, october 2010), http://net.educause.edu/ir/library/pdf/ers1006/rs/ers1006w.pdf (accessed march 21, 2012).
8. pew internet & american life project, “adult gadget ownership over time (2006–2012),” http://www.pewinternet.org/static-pages/trend-data-(adults)/device-ownership.aspx (accessed june 14, 2012); the horizon report: 2009 edition, the new media consortium and educause learning initiative, http://net.educause.edu/ir/library/pdf/hr2011.pdf (accessed march 21, 2012); the horizon report: 2010 edition, the new media consortium and educause learning initiative, http://net.educause.edu/ir/library/pdf/hr2011.pdf (accessed march 21, 2012); the horizon report: 2011 edition, the new media consortium and educause learning initiative, http://net.educause.edu/ir/library/pdf/hr2011.pdf (accessed march 21, 2012).
9. pew internet, “adult gadget ownership.”
10. deborah keyek-franssen et al., computer labs study (university of colorado boulder office of information technology, october 7, 2011), http://oit.colorado.edu/sites/default/files/labsstudy-penultimate-10-07-11.pdf (accessed june 15, 2012).
11. j.
s. c. tom, k. voss, and c. scheetz, “the space is the message: first assessment of a learning studio,” educause quarterly 31, no. 2 (2008), http://www.educause.edu/ero/article/space-message-first-assessment-learning-studio (accessed june 25, 2012).
12. nick wingfield, “time to leave the laptop behind,” wall street journal, february 23, 2009, http://online.wsj.com/article/sb122477763884262815.html (accessed june 15, 2012).
13. the horizon report: 2009 edition.
14. the horizon report: 2010 edition.
15. the horizon report: 2011 edition.
16. alison j. head and michael b. eisenberg, “balancing act: how college students manage technology while in the library during crunch time,” project information literacy research report, information school, university of washington, october 12, 2011, http://projectinfolit.org/pdfs/pil_fall2011_techstudy_fullreport1.1.pdf (accessed june 14, 2012).
17. pew internet, “adult gadget ownership.”
18. smith and caruso, ecar study.

digital collections are a sprint, not a marathon: adapting scrum project management techniques to library digital initiatives
michael j. dulock and holley long
information technology and libraries | december 2015 5

abstract

this article describes a case study in which a small team from the digital initiatives group and metadata services department at the university of colorado boulder (cu-boulder) libraries conducted a pilot of the scrum project management framework. the pilot team organized digital initiatives work into short, fixed intervals called sprints, a key component of scrum.
working for more than a year in the modified framework yielded significant improvements to digital collection work, including increased production of digital objects and surrogate records, accelerated publication of digital collections, and an increase in the number of concurrent projects. adoption of sprints has improved communication and cooperation between participants, reinforced teamwork, and enhanced their ability to adapt to shifting priorities.

introduction

libraries in recent years have freely adapted methodologies from other disciplines in an effort to improve library services. for example, librarians have
• employed usability testing techniques to enhance users’ experience with digital library interfaces,1 improve the utility of library websites,2 and determine the efficacy of a visual search interface for a commercial library database;3
• adopted participatory design methods to identify information visualizations that could augment digital library services4 and determine user needs in new library buildings;5 and
• utilized principles of continuous process improvement to enhance workflows for book acquisition and implementation of serial title changes in a technical services unit.6

librarians often come to the profession with disciplinary knowledge from an undergraduate degree unrelated to librarianship, so it should come as no surprise that they bring some of that disciplinary knowledge to their work. the interdisciplinary nature of librarianship also creates an environment that is amenable to adoption or adaptation of techniques from a variety of sources, not only those originating in library science. in this paper, the authors describe their experiences in applying a modified scrum management framework to facilitate digital collection production. they begin by elucidating the fundamentals of scrum and then describe a pilot project using aspects of the methodology. they discuss the outcomes of the pilot and posit additional features of scrum that may be adopted in the future.

michael j. dulock (michael.dulock@colorado.edu) is assistant professor and metadata librarian, university of colorado boulder. holley long (longh@uncw.edu), previously assistant professor and systems librarian for digital initiatives at university of colorado, boulder, is digital initiatives librarian, randall library, university of north carolina wilmington. doi: 10.6017/ital.v34i4.5869

fundamentals of scrum project management

the scrum project management framework, one of several techniques under the rubric of agile project management, originated in software development and has been applied in a variety of library contexts including the development of digital library platforms7 and library web applications.8 scrum’s salient characteristics include self-managing teams that organize their work into “short iterations of clearly defined deliverables” and focus on “communication over documentation.”9 the scrum primer: a lightweight guide to the theory and practice of scrum describes the roles, tools, and processes involved in this project management technique.10

scrum teams are cross-functional and consist of five to nine members who are cross-trained to perform multiple tasks. in addition to the team, two individuals serve specialized roles, scrum master and product owner. the scrum master is responsible for ensuring that scrum principles are followed and for removing any obstacles that hinder the team’s productivity. hence the scrum master is not a project manager, but a facilitator. the product owner’s role is to manage the product by identifying and prioritizing its features.
this individual represents the stakeholders’ interests and is ultimately responsible for the product’s value. the team divides their work into short, fixed intervals called sprints that typically last two to four weeks and are never extended. at the beginning of each sprint, the team meets to select and commit to completing a set of deliverables. once these goals are set, they remain stable for the duration; course corrections can occur in later sprints. in software development, the scrum team aims to complete a unit of work that stands on its own and is fully functional, known as a potentially shippable increment. it is selected from an itemized list of product features called the product backlog. the backlog is established at the outset of development and consists of a comprehensive list of tasks that must occur to complete the product. a well-constructed backlog has four characteristics. first, it is prioritized with the features that will yield the highest return on investment at the top of the list. second, the backlog is appropriately detailed, so that the tasks at the top of the list are well-defined whereas those at the bottom may be more vaguely demarcated. third, each task receives an estimation for the amount of effort required to complete it, which helps the team to project a timeline for the product. finally, the backlog evolves in response to new developments. individual tasks may be added, deleted, divided, or reprioritized over the life of the project. during the course of a sprint, team members meet to plan the sprint, check-in on a daily basis, and then debrief at the conclusion of the sprint. they begin with a two-part planning meeting in which the product owner reviews the highest priority tasks with the team. in the second half of the meeting, the team and the scrum master determine how many of the tasks can be accomplished in information technologies and libraries |december 2015 7 the given timeframe, thus defining the goals for the sprint. 
this meeting generally lasts no longer than four hours for a two-week sprint. every day, the team holds a brief meeting to get organized and stay on track. during these “daily scrums,” each team member shares three pieces of information: what has been accomplished since the previous meeting, what will be accomplished before the next meeting, and what, if any, obstacles are impeding the work. these fifteen-minute meetings provide the team with a valuable opportunity to communicate and coordinate their efforts. sprints conclude with two meetings, a review and a retrospective. during the review, the team inspects the deliverables that were produced during that sprint. the retrospective provides an opportunity to discuss the process, what is working well, and what needs to be adjusted. figure 1. typical meeting schedule for a two-week sprint evidence in the literature suggests that scrum improves both outcomes and process. one meta-analysis of 274 programming case studies found that implementing scrum led to improved productivity as well as greater customer satisfaction, product quality, team motivation, and cost reduction.11 proponents of this project management technique find that it leads to a more flexible and efficient process. scrum’s brief iterative work cycles and evolving product backlog promote adaptability so the team can address the inevitable changes that occur over the life of a project. by contrast, traditional project management techniques have been criticized for requiring too much time upfront on planning and being too rigid to respond to changes in later stages of the project.12 scrum also promotes communication over documentation,13 resulting in less administrative overhead as well as increased accountability and trust between team members.
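the backlog and sprint-planning mechanics described above can be sketched in a few lines of python. this is a minimal illustration only, not part of the scrum primer or of the authors' workflow: the names `BacklogItem` and `plan_sprint`, the use of hours as the effort unit, and the greedy selection rule are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class BacklogItem:
    """One task in a hypothetical product backlog."""
    task: str
    priority: int   # lower number = higher priority (assumed convention)
    effort: float   # estimated effort, here in hours (assumed unit)


def plan_sprint(backlog, capacity):
    """Walk the backlog in priority order and commit to every task that
    still fits in the remaining sprint capacity, skipping any that do not."""
    selected = []
    remaining = capacity
    for item in sorted(backlog, key=lambda i: i.priority):
        if item.effort <= remaining:
            selected.append(item)
            remaining -= item.effort
    return selected


backlog = [
    BacklogItem("publish", priority=3, effort=4),
    BacklogItem("select materials", priority=1, effort=6),
    BacklogItem("digitize", priority=2, effort=10),
]
sprint = plan_sprint(backlog, capacity=12)
sprint_tasks = [item.task for item in sprint]
```

because the backlog is an ordinary list, adding, deleting, dividing, or reprioritizing tasks between sprints amounts to editing the list before the next call to `plan_sprint`. a real team would negotiate the selection in the planning meeting rather than apply a fixed rule.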
scrum pilot at university of colorado boulder libraries the university of colorado boulder (cu-boulder) libraries digital initiatives team was interested in adopting scrum because of its incremental approach to completing large projects, its focus on communication, and its flexibility. these attributes meshed well with the group’s goals to publish larger collections more quickly and to more effectively multitask the production of multiple high priority collections. the group’s staffing model and approach to collection building prior to the scrum pilot is described here to provide some context for this choice of project management tool. digital collection proposals are vetted by a working group composed of ten members, the digital library management group (dlmg), to ensure that major considerations such as copyright status are fully investigated before undertaking the collection. approved proposals are prioritized by the appropriate collection manager as high, medium, or low and then placed in a queue for scanning and metadata provisioning. a core group of individuals generally works on all digital collections, including the metadata librarian, the digital initiatives librarian, and one or both of the digitization lab managers. additionally, the team frequently includes the subject specialist who nominated the collection for digitization, staff catalogers, and other library staff members whose expertise is required. at any given time, the queue may contain as many as fifteen collections, and the core team works on several of them concurrently to address the separate needs of participating departments. while this approach allows the teams to distribute resources more equitably across departments, progress on individual collections can be slower than if they are addressed one at a time.
prior to implementing aspects of scrum, the team also completed the scanning and metadata records for every object in the collection before it was published. as a result, publication of larger collections trailed behind smaller collections. the details of digital collection production vary depending on the nature of the project, but the process usually follows the same broad outline. unless the entire collection will be digitized, the collection manager chooses a selection of materials on the basis of criteria such as research value, rarity, curatorial considerations, copyright status, physical condition, feasibility for scanning, and availability of metadata. photographs and paper-based materials are then evaluated by the preservation department to ensure that they are in suitable condition for scanning. likewise, the media lab manager evaluates audio and video media for condition issues such as sticky shed syndrome, which will affect digitization.14 depending on format, the material is then digitized by the digitization lab manager or the media lab manager and their student assistants according to locally established workflows that conform to nationally recognized best practices. once the material is digitized, student assistants apply post-processing procedures as appropriate and required, such as running ocr (optical character recognition) software to convert images to text or equalizing levels on an audio file. the lab managers then check the files for quality assurance and move the files to the appropriate location on the server. the metadata librarian creates a metadata template appropriate to the material being digitized by using industry standards such as visual resources association core (vra core), metadata object description schema (mods), pbcore, and dublin core (dc). metadata creation methods depend on the existence of legacy metadata for the analog materials and in what format legacy metadata is contained.
the metadata librarian, along with his staff and/or student assistants, adapts legacy metadata into a format that can be ingested by the digital library software or creates records directly in the software when there is no legacy metadata. metadata is formatted or created in accordance with existing input standards such as cataloging cultural objects (cco) and resource description and access (rda), and it is enhanced as much as possible using controlled vocabularies such as the art & architecture thesaurus (aat) and library of congress subject headings. the metadata librarian performs quality assurance on the metadata records during creation and before the collection is published. in the final stages, the collection is created in the digital library software, at which time search and display options are established: thumbnail labels, default collection sorting, faceted browsing fields, etc. then the files and metadata are uploaded and published online. the highlight of the cu-boulder digital library is the twenty-seven collections drawn from local holdings in archives, special collections department, music library, and earth sciences and map library, among others. the library also contains purchased content and “luna commons” collections created by institutions that use the same digital library platform, for a total of more than 185,000 images, texts, maps, audio recordings, and videos. the following four collections were created during the scrum pilot and illustrate the types of materials available in the cu-boulder digital library: the colorado coal project consists of video and audio interviews, transcripts, and slides collected between 1974 and 1979 by the university of colorado coal project.
the project was funded by the colorado humanities program and the national endowment for the humanities to create an ethnographic record of the history of coal mining in the western united states from immigration and daily life in the coal camps to labor conditions and strikes, including ludlow (1913–14) and columbine (1927). the mining maps collection provides access to scanned maps of various mines, lodes, and claims in colorado from the late 1800s to the early 1900s. these maps come from a variety of creators, including private publishers and us government agencies. the vasulka media archive showcases the work of pioneering video artists steina and woody vasulka and contains some of their cutting-edge studies in video that experiment with form, content, and presentation. steina, an icelander, educated in music at the prague conservatory of music, and woody, a graduate of prague's film academy, arrived in new york city just in time for the new media explosion. they brought with them their experience of the european media awakening, which helped them blend seamlessly into the youth media revolution of the late sixties and early seventies in the united states. the 3d natural history collection comprises one hundred archaeology and paleontology specimens from the rocky mountain and southwest regions, including baskets, moccasins, animal figurines, game pieces, jewelry, tools, and other everyday objects from the fremont, clovis, and ancestral puebloan cultures as well as a selection of vertebrate, invertebrate, and track paleontology specimens from the mesozoic through the cenozoic eras (250 ma to the present). the diffusion of effort across multiple collections and a slower publication rate for larger collections offered opportunities for improvement.
after attending a conference session on scrum project management for web development projects, one of the team members recognized scrum’s potential to improve production processes since the technique divides large projects into manageable subtasks that can be accomplished in regular, short intervals.15 this approach would allow the team to switch between different high priority collections at regularly defined intervals to facilitate steady progress on competing priorities. working in sprints would also make it easier to publish smaller portions of a large collection at regular intervals. thus scrum held the potential to increase the production rate for larger collections and make the team’s progress more transparent to users and colleagues. in april 2013, a small team of cu-boulder librarians and staff initiated a pilot to assess the effect on processes and outcomes for digital collection production. rather than involving individuals from all affected units, regardless of their level of engagement in a particular project, the scrum pilot was limited to the three individuals who were involved in most, if not all, of the projects undertaken: the digital initiatives librarian, metadata librarian, and digitization lab manager.16 by including these three individuals, the major functions of metadata provision, digitization, and publication were covered in the trial with no disruption to the existing workflows or organizational structures. selecting this group also ensured that scrum would be tested in a broad range of scenarios and on collections from several different departments. to begin, the team met to review the scrum project management framework and considered how best to pilot the technique. taking a pragmatic approach, they only adopted those aspects of scrum that were deemed most likely to result in improved outcomes.
if the pilot were successful, other aspects of scrum could be incrementally incorporated later. the group discussed how scrum roles, processes, and tools could be adapted to digital collection workflows and determined that sprints would likely have the highest return on investment. they also chose to adapt and hybridize certain aspects of the planning meeting and daily scrum to achieve goals that were not being met by other existing meetings. sprint planning and end meetings were combined so that all three participants knew what each had completed and what was targeted for the next sprint. select activities of sprint planning and end meetings were already a part of the monthly dlmg meetings, making additional sprint meetings redundant. daily scrum meetings were excluded as the team felt that daily meetings would not produce enough benefit to justify the costs. in addition, two of the three participants have numerous responsibilities that lie outside of projects subject to the scrum pilot, so each person does not necessarily perform scrum-related work every day. however, the short meeting time was adopted into the planning/end meeting, as were elements of the three core questions of the daily scrum meeting, with some modifications. the questions addressed in the biweekly meetings are: what have you done since the last meeting? what are you planning for the next meeting? what impediments, if any, did you encounter during the sprint? the latter question was sometimes addressed mid-sprint through emails, phone calls, or one-off meetings that include a larger or different group of stakeholders. the team adopted the two-week duration typical of scrum sprints for the pilot. this has proven to be a good medium-term timeframe. it was short enough that the team could adjust priorities quickly, but long enough to complete significant work. the team chose to combine the sprint planning and sprint review meetings into a single meeting. 
part of the motivation for a trial of the scrum technique was to minimize additional time away from projects while maximizing information transfer during the meetings. a single biweekly planning/review meeting was determined to be sufficient for reporting accomplishments and setting goals: substantial and free of irrelevant content, yet not overly burdensome as “yet another meeting.” at each sprint meeting, each participant reported on results from the previous sprint. work that was completed allowed the next phase of a project to proceed. based on the results of the last sprint, each team member set measurable goals that could be realistically met in the next two-week sprint. there has been a concerted effort to keep the meetings short, limited to about twenty to twenty-five minutes. to enforce this habit, the sprint meetings were scheduled to begin twenty minutes before other regularly scheduled meetings for most or all of the participants. this helped keep participants on-topic and reinforced the transfer-of-information aspect of the meetings, with minimal leeway for extraneous topics. reflection the modified scrum methodology described above has been in place for more than a year. there have been several positive outcomes resulting from this practice. beginning with the most practical, production has become more regular than it was before scrum was implemented. the nature of digital initiatives in this environment dictates that many projects are in progress at once, in various stages of completion. the production work, such as digitizing media or creating metadata records, has become more consistent and regular. instead of production peaks and valleys, there is more of a straight line as portions of projects are finished and others come online. this in turn has resulted in faster publication of collections.
in 2013, the team published six new collections, twice as many as the previous year. the ability to put all hands on deck for a project for a two-week period can increase productivity. since sprints allow for short, concentrated bursts of work on a single project, smaller projects can be completed in a few sprints and larger projects can be divided into “potentially shippable units” and thus published incrementally. another benefit of scrum is that the variability of the two-week sprint cycle allows the team to work on more collections concurrently. for example, during a given sprint, scanning is underway for one collection, a metadata template is being constructed for another, the analog material in a third is being examined for pre-scanning preservation assessment, and a fourth collection is being published. while this type of multitasking occurred before the team piloted sprints, the scrum project management framework lends more structure and coordination to the various team members’ efforts. collection building activities can be broken down into subtasks that are accomplished in nonconsecutive sprints without undercutting the team’s concerted efforts. as a result, the team can juggle competing priorities much more effectively. the team is working with multiple stakeholders at any given time, each of whom may have several projects planned or in progress. as focus shifts among stakeholders and their respective projects, the scrum team is able to adjust quickly to align with those priorities, even if only for a single sprint. this also makes it easier to respond to emerging requests or address small, focused projects on the basis of events such as exhibits or course assignments. additional benefits of the scrum methodology pertain to communication and work style among the three scrum participants. the frequent, short meetings are densely packed and highly focused. 
each person has only a few minutes to describe what has been accomplished, explain problems encountered, troubleshoot solutions, and share plans for the next sprint. the return on the time investment of twenty minutes every two weeks is significant: there is no time to waste on issues that do not pertain directly to the projects underway, just completed, or about to start. a further result is that the group’s sense of itself as a team is enhanced. as stated above, the three scrum participants do not all work in the same administrative unit within the library. though they shared frequent communication by email as projects progressed, regular sprint meetings have fostered a closer sense of team. the participants know from sprint to sprint what the others are doing; they can assist one another with problems face-to-face and coordinate with one another so that work segments progress toward production in a logical sequence. with more than a year of experience with scrum, the pilot team has determined that several aspects of the methodology have worked well in our environment. in general, the sprint pattern fits well with existing operating modes. the monthly dlmg meeting, which includes a large and diverse group, provides an opportunity to discuss priorities, review project proposals, establish standards, and make strategic decisions. the bi-weekly sprint meetings dovetail nicely, with one meeting taking place at a midpoint between dlmg meetings, and one just prior to dlmg meetings. this allows the three scrum participants to focus on strategic items during the dlmg meeting but keep a close eye on operational items in between. the scrum methodology has also accommodated the competing priorities that the three participants must balance on an ongoing basis.
there is considerable variation between participants in terms of roles and responsibilities, but the division of work into sprints has given the team greater opportunity to fit production work in with other responsibilities, such as supervision and training; scholarly research and writing; service performed for disciplinary organizations; infrastructure building; and planning, research, and design work for future projects. the two-week sprint duration is a productive time interval during which the team can set and reach incremental goals, whether that is starting and finishing a small project on short notice, making a big push on a large-scale project, or continuing gradual progress on a large, deliberately paced initiative. the brief meetings ensure that participants focus on the previous sprint and the upcoming sprint. there is usually just enough time to discuss accomplishments, goals, and obstacles, with some time left to troubleshoot as necessary. the meeting schedule and structure allow each individual to set his or her own goals so that he or she can make maximum progress during the sprint. this in turn feeds into accountability. there is always an external check on one’s progress: the next meeting comes up in two weeks, creating an effective deadline (which also sometimes corresponds to a project deadline). it becomes easier to stay on task and keep goals in sight with the sprint report looming in a matter of days. at the same time, scrum helps to define each person’s role and clarifies how roles align with each other. some tasks are completely independent, while others must be done in sequence and depend on another’s work. the sprint schedule allows large, complex projects to be divided into manageable pieces so that each sprint can result in a sense of accomplishment, even if it may require many sprint cycles to actually complete a project. this is especially true for large digital initiatives.
for instance, completing the entire project may take a year, but subsets of a collection may be published in phases at more frequent intervals in the meantime.
summary of benefits
● enhanced ability to manage multiple concurrent projects
● published large collections incrementally, increasing responsiveness to users and other stakeholders
● improved team building
● increased communication and accountability among team members
future considerations based on these outcomes, the team can safely say that it met its objectives for the test pilot. one of the reasons that it was feasible to try this when the participants were already highly committed is that the pilot used a small portion of the scrum methodology and was not too rigid in its approach. the team felt that a hybrid of the scrum planning and scrum review meeting held twice a month would provide the benefits without overburdening schedules with additional meetings. there were also plans to have a virtual email check-in every other week to loosely achieve the goals of the daily scrum meeting, that is, to improve communication and accountability. the email check-in fell by the wayside; the team found it wasn’t necessary because there were already adequate opportunities to check in with each other over the course of a two-week sprint. the team has found the sprints and modified scrum meetings to be highly useful and relatively easy to incorporate into their workflows. the next phase of the pilot will implement product backlogs and burn down charts, diagrams showing how much work remains for the team in a single sprint, with the goal of tracking collections’ progress at the item level through each step of the planning, selection, preservation assessment, digitization, metadata provisioning, and publication workflows. figure 2.
hypothetical backlog for the first sprint of a digital collection17 scrum backlogs are arranged on the basis of a task’s perceived benefit for customers. to adapt backlogs for digital collection production work, the backlog task list’s order will instead be based in part on the workflow sequence. for example, pieces from the physical collection must be selected before preservation staff can assess them. additionally, the backlog items will be sequenced according to the materials’ research value or complexity. for instance, the digitization of a folder of significant correspondence from an archival collection would be assigned a higher priority in the backlog than the digitization of newspaper clippings of minor importance from the same collection. or, materials that are easy to scan would be listed in the backlog ahead of fragile or complex items that require more time to complete. this will allow the team to publish the most valuable items from the collection more quickly. according to scrum best practices, backlogs are also appropriately detailed. in the context of digital collection production work, collections’ backlogs would begin with a standard template of high-level activities: materials’ selection, copyright analysis, preservation assessment, digitization, metadata creation, and publication. as the team progresses through backlog items, they will become increasingly detailed. backlogs also evolve. scrum’s ability to respond to change has been one of its strongest assets in this environment and therefore the backlog’s ability to evolve will make it a valuable addition to the team’s process. for example, materials that a collection manager uncovers and adds to the project late in the process can be easily incorporated into the backlog or materials in the collection that are needed to support an upcoming instruction session can be moved up in the backlog for the next sprint.
in this way, the backlog will support the team’s goal to nimbly respond to shifting priorities and emerging opportunities.
figure 3. hypothetical burn down chart18
the final relevant feature of a backlog, the “effort estimates,” taken in conjunction with the burn down chart will help the team develop better metrics for estimating the time and resources required to complete a collection. when an item is added to the backlog, team members estimate the amount of effort needed to complete it. the burn down chart illustrates how much work remains and, in general practice, is updated on a daily basis. given that the team has truncated the scrum meeting schedule, this may occur on a weekly basis, but will nonetheless benefit the team in several ways. initially, it will keep the team on track and provide valuable and detailed information for stakeholders on the collections’ progress. as the team accrues old burn down charts from completed collections, they can use the data to hone their ability to estimate the amount of time and resources needed to complete a given project. conclusion through the pilot conducted for digital initiatives at cu-boulder libraries, application of aspects of the scrum project management framework has demonstrated significant benefits with no discernable downside. adoption of sprint planning and end meetings resulted in several positive outcomes for the participants. digital collection production has become more regular; work can be underway on more collections simultaneously; and collections are, on average, published more quickly. in addition, communication and cooperation among the sprint pilot participants have increased and strengthened the sense of teamwork among them. the sprint schedule has blended well with existing digital initiatives meetings and workflows, and has enhanced the team’s ability to handle ever-shifting priorities.
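the burn down chart discussed under future considerations reduces to simple arithmetic: start from the total estimated effort and subtract the effort completed in each reporting period. the following python sketch shows that calculation under stated assumptions; the function name `burn_down`, the effort figures, and the choice of reporting period are hypothetical and are not taken from the team's data.

```python
def burn_down(total_effort, completed_per_period):
    """Return the effort remaining at the start and after each reporting
    period (daily in general scrum practice, possibly weekly for this team)."""
    remaining = [total_effort]
    for done in completed_per_period:
        # Remaining work never drops below zero, even if estimates were low.
        remaining.append(max(remaining[-1] - done, 0))
    return remaining


# Hypothetical sprint: 40 hours estimated, four reporting periods logged.
chart = burn_down(40, [5, 10, 0, 8])
```

plotting `chart` against the period number gives the familiar downward-sloping line; comparing the final value with zero across several completed collections is one way the accumulated charts could inform later effort estimates.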
additional aspects of scrum, such as product backlogs and burn down charts, will be incorporated into the participants’ workflows to allow them to better track the work done at the item level, provide more detailed information for stakeholders during the course of a project, and predict how much time and effort will be required for future projects. the positive results of this pilot demonstrate the benefits to be gained by looking outside standard library practice and adopting techniques developed in another discipline. given the range of activities performed in libraries, the possibilities to improve workflows and increase efficiency are limitless as long as those doing the work keep an open mind and a sharp eye out for methodologies that could ultimately benefit their work, and in turn, their users.
references
1. sueli mara ferreira and denise nunes pithan, “usability of digital libraries,” oclc systems & services: international digital library perspectives 21, no. 4 (2005): 316, doi: 10.1108/10650750510631695.
2. danielle a. becker and lauren yannotta, “modeling a library web site redesign process: developing a user-centered web site through usability testing,” information technology & libraries 32, no. 1 (2013): 11, doi: 10.6017/ital.v32i1.2311.
3. jodi condit fagan, “usability testing of a large, multidisciplinary library database: basic search and visual search,” information technology & libraries 25, no. 3 (2006): 140–41, doi: 10.6017/ital.v25i3.3345.
4. panayiotis zaphiris, kulvinder gill, terry h.-y. ma, stephanie wilson and helen petrie, “exploring the use of information visualization for digital libraries,” new review of information networking 10, no. 1 (2004): 58, doi: 10.1080/1361457042000304136.
5.
benjamin meunier and olaf eigenbrodt, “more than bricks and mortar: building a community of users through library design,” journal of library administration 54, no. 3 (2014): 218–19, doi: 10.1080/01930826.2014.915166.
6. lisa a. palmer and barbara c. ingrassia, “utilizing the power of continuous process improvement in technical services,” journal of hospital librarianship 5, no. 3 (2005): 94–95, doi: 10.1300/j186v05n03_09.
7. javier d. fernández et al., “agile dl: building a delos-conformed digital library using agile software development,” in research and advanced technology for digital libraries, edited by birte christensen-dalsgaard et al. (berlin: springer-verlag, 2008), 398–99, doi: 10.1007/978-3-540-87599-4_44.
8. michelle frisque, “using scrum to streamline web applications development and improve transparency” (paper presented at the 13th annual lita national forum, atlanta, georgia, september 30–october 3, 2010).
9. frank h. cervone, “understanding agile project management methods using scrum,” oclc systems & services 27, no. 1 (2011): 19, doi: 10.1108/10650751111106528.
10. pete deemer, gabrielle benefield, craig larman, and bas vodde, “the scrum primer: a lightweight guide to the theory and practice of scrum” (2012), 3–15, www.infoq.com/minibooks/scrum_primer.
11. eliza s. f. cardozo et al., “scrum and productivity in software projects: a systematic literature review” (paper presented at the 14th international conference on evaluation and assessment in software engineering (ease), 2010), 3.
12. cervone, “understanding agile project management,” 18.
13. ibid., 19.
14. sticky shed syndrome refers to the degradation of magnetic tape where the binder separates from the carrier. the binder can then stick to the playback equipment rendering the tape unplayable.
15. frisque, “using scrum.”
16.
the media lab manager responsible for audio and video digitization did not participate because his lab offers fee-based services to the public and thus has long-established business processes in place that would not have blended easily with sprints.
17. figure 2 is based on an illustration created by mountain goat software, “sprint backlog,” https://www.mountaingoatsoftware.com/agile/scrum/sprint-backlog.
18. figure 3 is adapted from a template created by expert project management, “burn down chart template,” www.expertprogrammanagement.com/wp-content/uploads/templates/burndown.xls.

eclipse editor for marc records
bojana dimić surla
information technology and libraries | september 2012 65

abstract
editing bibliographic data is an important part of library information systems. in this paper we discuss existing approaches in developing user interfaces for editing marc records. there are two basic approaches: screen forms that support entering bibliographic data without knowledge of the marc structure, and direct editing of marc records shown on the screen. this paper presents the eclipse editor, which fully supports editing of marc records. it is written in java as an eclipse plug-in, so it is platform-independent. it can be extended for use with any data store. the paper also presents a rich client platform (rcp) application made of a marc editor plug-in, which can be used outside of eclipse.
the practical application of the results is integration of the rcp application into the bisis library information system. introduction an important module of every library information system (lis) is one for editing bibliographic records (i.e., cataloguing). most library information systems store their bibliographic data in a form of marc records. some of them support cataloging by direct-editing of marc record; others have a user interface that enables entering bibliographic data by a user who knows nothing about how marc records are organized. the subject of this paper is user interfaces for editing marc records. it gives software requirements and analyzes existing approaches in this field. as the main part of the paper, we present the eclipse editor for marc records, developed at the university of novi sad, as a part of the bisis library information system. eclipse uses the marc 21 variant of the marc format. the remainder of this paper describes the motivation for the research, presents the software requirements for cataloging according to marc standards, and provides background on the marc 21 format. it also describes the development of the bisis software system, reviews the literature concerning tools for cataloging, and analyzes existing approaches in developing user interfaces for editing marc records. the results of the research are presented in the final section, which describes the functionality and technical characteristics of the eclipse marc editor. the rich client platform (rcp) version of the editor, which can be used independently of eclipse, is also presented. motivation the motivation for this paper was to provide an improved user interface for cataloging by the marc standard that will lead to more efficient and comfortable work for catalogers. bojana dimić surla (bdimic@uns.ns.ac.yu) is an associate professor, university of novi sad, serbia. 
there are two basic approaches in developing user interfaces for marc cataloging. the first approach includes using a classic screen form made of text fields and labels with the description of the bibliographic data, without marc standard indication. the second approach is direct editing of a record that is shown on the screen. those two approaches will be discussed in detail in “existing approaches in developing user interfaces for editing marc records” below. the current editor in the bisis system is a mixture of these two approaches: it supports direct editing, but data input is done via a text field, which opens on double click.1 the idea presented in this paper is to create an editor that overcomes the drawbacks of previous solutions. the approach taken in creating the editor was direct record-editing with real-time validation and no additional dialogs.

software requirements for marc cataloging
the user interface for marc cataloging needs to support the following functions:
• creating marc records that satisfy constraints proposed by the bibliographic format
• selecting codes for field tags, subfield names, and values of coded elements, such as character positions in leader and control fields, indicators, and subfield content
• validating entered data
• access to data about the marc format (a “user manual” for marc cataloging)
• exporting and importing created records
• providing various previews of the record, such as catalog cards

background
marc 21
as was previously mentioned, the eclipse editor uses the marc 21 variant. marc 21 consists of five formats: bibliographic data, authority data, holdings data, classification data, and community information.2 marc 21 records consist of three parts: record leader, set of control fields, and set of data fields. the record leader content, which follows the ldr label, includes the logical length of the record (first five characters) and the code for record status (sixth character).
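as a minimal sketch of the leader layout just described, the following python reads the logical record length from the first five characters and the record status from the sixth. the function name `parse_leader` and the sample leader string are illustrative assumptions and are not taken from the editor described in this paper; the 24-character leader length comes from the marc 21 specification.

```python
def parse_leader(leader: str) -> dict:
    """Extract the two leader elements named in the text.

    Characters 1-5 (indexes 0-4) hold the logical record length and
    character 6 (index 5) holds the record status code.
    """
    if len(leader) != 24:
        raise ValueError("a MARC 21 leader must be exactly 24 characters")
    length_text = leader[0:5]
    if not length_text.isdigit():
        raise ValueError("the record length must be five digits")
    return {"record_length": int(length_text), "record_status": leader[5]}


# Hypothetical leader, made up for illustration only.
sample = "00714nam a2200205 a 4500"
print(parse_leader(sample))
```

a real-time validator of the kind the paper describes would run checks like these on every edit; here they simply raise an error.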
after the record leader, there are control fields. every control field is written on a new line and consists of the three-character numeric tag and the content of the control field. the content of the control field can be a single datum or a set of fixed-length bibliographic data. control fields are followed by data fields in the record. every line in the record that contains a data field consists of a three-character numeric tag, the value for the first and the second indicator (or the number sign (#) if indicators are not defined for the field), and the list of subfields that belong to the field.

detailed analysis of marc 21 shows that there are some constraints on the structure and content of the marc 21 record. constraints on the structure define which fields and subfields can appear more than once in the record (i.e., whether the fields and subfields are repeatable), the allowed length of the record elements, and all the elements of the record defined by marc 21. constraints on the record content are defined on the content of the leader, indicators, control fields, and subfields. moreover, some constraints connect several elements in the record (when the content of one element depends on the content of another element in the record). an example of a constraint on the structure of data field 016 is that the field has a first indicator, whereas the second indicator is undefined. the field 016 can have subfields a, z, 2, and 8, of which z and 8 are repeatable.

bisis
the results presented in this paper belong to the research on the development of the bisis library information system. this system, which has been in development since 1993, is currently in its fourth version. the editor for cataloging in the current version of bisis was the starting point for the development of eclipse, the subject of this paper.3 apart from an editor for cataloging, the bisis system has a module for circulation and an editor for creating z39.50 queries.4 the indexing and searching of bibliographic records was implemented using the lucene text server.5 as a part of the editor for cataloging, we developed a module for generating various reports and catalog cards from marc records.6 bisis also supports creating an electronic catalog of unimarc records on the web, where the input of bibliographic data can be done without knowing unimarc, but the entered data are mapped to unimarc and stored in the bisis database.7

the recent research within the bisis project relates to its extension for managing research results at the university of novi sad. for that purpose, we developed the current research information system (cris) on the recommendation of the nonprofit organization eurocris.8 the paper “cerif compatible data model based on marc 21 format” proposes a data model, compatible with the common european research information format (cerif), that is based on marc 21. in this model, the part of the cerif data model that relates to research results is mapped to marc 21. furthermore, on the basis of this model, a research management system at the university of novi sad was developed.9 the paper “cerif data model extension for evaluation and quantitative expression of scientific research results” explains the extension of cerif for evaluation of published scientific research. the extension is based on the semantic layer of cerif, which enables classification of entities and their relationships by different classification schemas.10

the current version of the bisis system is based on a variant of the unimarc format. the development of the next version of bisis, which will be based on marc 21, is in progress. the first task was migrating existing unimarc records.11 the second task is developing the editor for marc 21 records, which is the subject of this paper.
cataloging tools
an editor for cataloging is a standard part of a cataloger’s workstation and the subject of numerous studies. lange describes the cataloging development process from handwritten cataloging cards, to typewriters (first manual, then electronic), to the appearance of marc records and pc-based cataloger’s workstations.12 leroya and thomas debate the influence of web development on cataloging. they stress that the availability of information on the web, as well as the possibility of having several applications open at the same time in different windows, greatly influences the process of creating bibliographic records. their paper also indicates that there are some problems that result from using large numbers of resources from the web, such as errors that arise from copy-paste methods. consequently, there is a need for automatic spell checking and for the possibility of a detailed review by a cataloger during editing.13 khurshid deals with general principles of the cataloger’s workstation, its configuration, and its influence on a cataloger’s productivity. in addition to efficient access to remote and local electronic resources, khurshid includes record transfer through a network and sophisticated record editing as important functions of a cataloger’s workstation. furthermore, khurshid says it is possible to improve cataloging efficiency in the windows-based cataloger’s workstation by finding bibliographic records in other institutions and cutting and pasting lengthy parts of the record (such as summary notes) to their own catalog.14

existing approaches in developing user interfaces for editing marc records
the basic source for this analysis of existing user interfaces for editing marc records was the official site for marc standards of the library of congress in addition to scientific journals and conferences.
the analysis of existing systems shows that there are two basic approaches in the implementation of editing marc records:15
• entering bibliographic data in classic screen forms made of text fields and labels, which does not require knowledge of the marc format (concourse,16 koha,17 j-marc18)
• direct editing of a marc record shown on the screen (marcedit,19 isismarc,20 catalis,21 polaris,22 marcmaker and marcbreaker,23 exlibris voyager24).

both of these approaches have advantages and disadvantages. the drawback of the first approach is that it provides a limited set of bibliographic data to edit, and the extension of that set implies changes to the application, or in the best cases changes in configuration. another problem is that there are usually a lot of text fields, text areas, combo boxes, and labels on the screen that need to be organized into several tabs or additional windows. this situation usually makes it difficult for the users to see errors or to connect different parts of the record when checking their work. moreover, all of the solutions found in the first group perform little validation of data entered by the user.25 one important advantage of the first approach is that the application can be used by a user who is not familiar with the standard, thus the need for access to marc data can be avoided (one of the functions listed in “software requirements for marc cataloging” above).

as for the second approach, editing a marc record directly on the screen overcomes the problem of extending the set of bibliographic data to enter. it also enables users to scan entered data and check the whole record, which appears on the screen. users can also copy and paste parts of records from other resources into the editor.
however, a majority of those applications are actually editors for editing marc files that are later uploaded in some database or transformed in some other format (marcedit, marcmaker and marcbreaker, polaris), and they usually support little or no data validation.26 they allow users to write anything (i.e., the record structure is not controlled by the program), and only validate at the end of the process when uploading or transforming the record. among those editors there are those, such as catalis and isismarc, that present the marc record as a table. they support the control of structure, but the record presented in this way is usually too big to fit on the screen, so it is separated into several tabs. an important function of editing marc records is selecting code for coded elements that can be positioned in the leader or control field, value of the indicator, or value of the subfield. there are also field tags or subfield codes that sometimes need to be selected for addition to a record. all analyzed editors provide additional dialogs for picking this code that require the user to constantly open and close dialogs, which sometimes can be annoying for the user. one important fact about editors in the second group is that they can be used only by a user who is familiar with marc, so access to the large set of marc element descriptions can make the job easier. some of the mentioned systems provide descriptions of the fields and subfields (e.g., isismarc), but most of them do not. findings the editor for marc records was developed as a plug-in for eclipse; therefore it is similar to eclipse’s java code editors. as the editor is written in java, it is platform-independent. the main part of this editor was created using oaw xtext framework for developing textual domain-specific languages.27 it was created using model-driven software development by specifying the model of marc record in a form of xtext grammar and generating the editor. 
all main characteristics of the editor were generated on the basis of the specification of constraints and extensions of the xtext grammar; therefore, changes to the editor can be realized by changing the specification. moreover, this editor can be easily adjusted for any database by using the concept of extension and extension point in the eclipse plug-in. we make this application independent of eclipse by using rich client platform (rcp) technology. this editor is implemented for marc 21 bibliographic and holdings formats.

user interface
figure 1 shows the editor opened within eclipse. the main area is marked with “1”; it shows the marc 21 file that is being edited. that file contains one marc 21 bibliographic record. the field tags and subfield codes are highlighted in the editor, which contributes to presentation clarity. the area marked with “2” lists the errors in the record, that is, invalid elements entered in the record. the area marked with “3” shows data about marc 21 in a tree form. this part of the screen has two other possible views: a marc 21 holdings format tree and a navigator, which is the standard eclipse view for browsing resources for the opened project. the actions for creating a record are available in the cataloging menu and on the cataloging toolbar, which is marked with “4.” these are actions for previewing the catalog card, creating a new bibliographic record, loading a record from a database (importing the record), uploading a record to a database (exporting the record), and creating a holdings record for this bibliographic record.

figure 1. eclipse editor for marc records

in the eclipse editor for marc, selecting codes is enabled without opening additional dialogs or windows (figure 2). that is a standard eclipse mechanism for code completion: typing ctrl + space opens the dropdown list with all possible values for the cursor’s current position.
figure 2. selecting codes

record validation is done in real time, and every violation is shown while editing (figure 3). figure 3 depicts two errors in the record: one is a wrong value in the second character position in control field 008, and another is that two 100 fields were entered, which is a field that cannot be duplicated in a record.

figure 3. validation errors

rcp application of the cataloging editor
as shown above, the editor is available as an eclipse plug-in, which raises the question of what a cataloger will do with all the other functions of the eclipse integrated development environment (ide). as seen in figures 1 and 3, there are a lot of additional toolbars and menus that are not related to cataloging. the answer lies in rcp technology. rcp technology generates independent software applications on the basis of a set of eclipse plug-ins.28 the main window of an rcp application with additional actions is shown in figure 4. besides the cataloging menu that is shown, the window also contains the file menu, which includes save and save as actions, as well as the edit menu, which includes undo and redo actions. all of these actions are also available via the toolbar.

figure 4. rcp application

conclusion
the goal of this paper was to review current user interfaces for editing marc records. we presented two basic approaches in this field and analyzed the advantages and disadvantages of each. we then presented the eclipse marc editor, which is part of the bisis library software system. the idea behind eclipse is inputting structured marc data in a form similar to that of programming-language editors. the author did not find this approach in the accessible literature. the rcp application of the presented editor will find its practical application in future versions of the bisis system.
it represents an upgrade of the existing editor and a starting point for forming the version of the bisis system that will be based on marc 21. the acquired results can also be used for the input of other data into the bisis system, including data from the cris system used at the university of novi sad.

this paper shows that eclipse plug-in technology can be used for creating end-user applications. the development of applications with the plug-in technology enables the use of a large library of existing components from the eclipse user interface, whereby writing source code is avoided. additionally, the plug-in technology enables the development of extensible applications by using the concept of the extension point. in this way, we can create software components that can be used by a great number of different information systems. by using the concept of “extension point,” the editor can be extended by the functions that are specific for a data store. an extension point was created for export and import of marc records, which means the marc editor plug-in can be used with any database management system by extending this extension point in eclipse plug-in technology.

future work in the development of the eclipse marc editor is to implement support for additional marc formats, for authority and classification data, and for community information. these formats propose the same record structure but have different constraints on the content and different sets of fields and subfields, as well as different codes for character positions and subfields. therefore the appearance of the editor will remain the same. the only difference will be the specification of the constraints and codes for code completion. another interesting topic for discussion is considering implementation of other modules of library information systems in eclipse plug-in technology.

references
1.
bojana dimić and dušan surla, “xml editor for unimarc and marc21 cataloging,” electronic library 27 (2009): 509–28; bojana dimić, branko milosavljević, and dušan surla, “xml schema for unimarc and marc 21 formats,” electronic library 28 (2010): 245–62.
2. library of congress, “marc standards,” http://www.loc.gov/marc (accessed february 19, 2011).
3. dimić and surla, “xml editor”; dimić, milosavljević, and surla, “xml schema.”
4. danijela tešendić, branko milosavljević, and dušan surla, “a library circulation system for city and special libraries,” electronic library 27 (2009): 162–68; branko milosavljević and danijela tešendić, “software architecture of distributed client/server library circulation,” electronic library 28 (2010): 286–99; danijela boberić and dušan surla, “xml editor for search and retrieval of bibliographic records in the z39.50 standard,” electronic library 27 (2009): 474–95.
5. branko milosavljević, danijela boberić, and dušan surla, “retrieval of bibliographic records using apache lucene,” electronic library 28 (2010): 525–36.
6. jelana rađenović, branko milosavljević, and dušan surla, “modelling and implementation of catalogue cards using freemarker,” program: electronic library and information systems 43 (2009): 63–76.
7. katarina belić and dušan surla, “model of user friendly system for library cataloging,” comsis 5 (2008): 61–85; katarina belić and dušan surla, “user-friendly web application for bibliographic material processing,” electronic library 26 (2008): 400–410; eurocris homepage, www.eurocris.org (accessed february 21, 2011).
8. dragan ivanović, dušan surla, and zora konjović, “cerif compatible data model based on marc 21 format,” electronic library 29 (2011), http://www.emeraldinsight.com/journals.htm?articleid=1906945.
9.
eurocris, “common european research information format,” http://www.eurocris.org/index.php?page=cerifreleasesandt=1 (accessed february 21, 2011); dragan ivanović et al., “a cerif-compatible research management system based on the marc 21 format,” program: electronic library and information systems 44 (2010): 229–51.
10. gordana milosavljević et al., “automated construction of the user interface for a cerif-compliant research management system,” the electronic library 29 (2011), http://www.emeraldinsight.com/journals.htm?articleid=1954429; dragan ivanović, dušan surla, and miloš racković, “a cerif data model extension for evaluation and quantitative expression of scientific research results,” scientometrics 86 (2010): 155–72.
11. gordana rudić and dušan surla, “conversion of bibliographic records to marc 21 format,” electronic library 27 (2009): 950–67.
12. holley r. lange, “catalogers and workstations: a retrospective and future view,” cataloging & classification quarterly 16 (1993): 39–52.
13. sarah yoder leroya and suzanne leffard thomas, “impact of web access on cataloging,” cataloging & classification quarterly 38 (2004): 7–16.
14. zahirrudin khurshid, “the cataloger’s workstation in the electronic library environment,” electronic library 19 (2001): 78–83.
15. library of congress, “marc standards,” http://www.loc.gov/marc (accessed february 19, 2011).
16. book systems, “concourse software product,” http://www.booksys.com/v2/products/concourse (accessed february 19, 2011).
17. koha library software community homepage, http://koha-community.org (accessed february 19, 2011).
18.
wendy osborn et al., “a cross-platform solution for bibliographic record manipulation in digital libraries” (paper presented at the sixth iasted international conference communications, internet and information technology, july 2–4, 2007, banff, alberta, canada).
19. terry reese, “marcedit—your complete free marc editing utility,” http://people.oregonstate.edu/~reeset.marcedit/html/index.php (accessed february 19, 2011).
20. united nations educational scientific and cultural organization, “isismarc,” http://portal.unesco.org/ci/en/ev.php-url_id=11041&url_do=do_topic&url_section=201.html (accessed february 19, 2011).
21. fernando j. gómez, “catalis,” http://inmabb.criba.edu.ar/catalis (accessed february 19, 2011).
22. polaris library systems homepage, http://www.gisinfosystems.com (accessed february 19, 2011).
23. library of congress, “marcmaker and marcbreaker user’s manual,” http://www.loc.gov/marc/makrbrkr.html (accessed february 19, 2011).
24. exlibris, “exlibris voyager,” http://www.exlibrisgroup.com/category/voyager (accessed february 19, 2011).
25. book systems, “concourse software product.”
26. bonnie parks, “an interview with terry reese,” serials review 31 (2005): 303–8.
27. eclipse.org, “xtext,” http://www.eclipse.org/xtext (accessed february 19, 2011).
28. the eclipse foundation, “rich client platform,” http://wiki.eclipse.org/index.php/rich_client_platform (accessed february 19, 2011).

methods of randomization of large files with high volatility
patrick c. mitchell: senior programmer, washington state university, pullman, washington, and thomas k. burgess: project manager, institute of library research, university of california, los angeles, california

key-to-address conversion algorithms which have been used for a large, direct access file are compared with respect to record density and access time. cumulative distribution functions are plotted to demonstrate the distribution of addresses generated by each method. the long-standing practice of counting address collisions is shown to be less valuable in judging algorithm effectiveness than considering the maximum number of contiguously occupied file locations.

the random access disk file used by the washington state university library acquisition sub-system is a large file with a sizable number of records being added and deleted daily. this file represents not only materials on order by the acquisitions section, but all materials which are in process within the technical services area of the library. the size of the file currently varies from approximately 12,000 to 15,000 items and has a capacity of 18,000 items. over 40,000 items are added and purged annually. each record consists of both fixed length fields and variable length fields. fixed fields primarily contain quantity and accounting information; the variable length fields represent bibliographic data.
records are blocked at 1,000 characters for file structuring purposes; however, the variable length information is treated as strings of characters with delimiters. the key to the file is a 16-character structure which is developed from the purchase order number. the structure of the key is as follows: six digits of the original purchase order number, two digits of partial order and credit information, and eight digits containing the computed relative record address. proper development of this key turns out to be the most important factor in achieving efficiency in both file access time and record density within the file.
80 journal of library automation vol 3/1 march, 1970
the w.s.u. purchase order numbering system, developed from a basic six-digit purchase order number, allows up to one million entries. of these, the library currently uses four blocks: one block for standing orders, one block for orders originating from the university after the system becomes operational, another block used by the systems people in prototype testing of the system, and a fourth block which was given to one vendor who operates an approval book program. in mapping a possible million numbers into eighteen thousand disk locations, there is a high probability that the disk addresses for more than one record will be the same. disk location, also called disk address, home position, and relative record address (rra) in this paper, refers to the computed offset address of a record in the file, relative to the starting address of the file. currently, the file resides on an ibm 2316 disk pack which can store six 1000-character records per track. thus if the starting address of the file is track 40, a record with rra = 5 would have its home position on track 40, while a record with rra = 6 would have its home position on track 41.
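the key layout and the track arithmetic above can be sketched in a few lines of python. this is an illustration only: the function names are hypothetical, the rra is assumed to count from zero (which is what the track 40 and track 41 example implies), and in the actual system the operating system, not the application, resolves track addresses.

```python
RECORDS_PER_TRACK = 6  # six 1000-character records per track, per the text


def home_track(rra: int, start_track: int) -> int:
    """Track that holds the home position of relative record address `rra`."""
    if rra < 0:
        raise ValueError("rra must be non-negative")
    return start_track + rra // RECORDS_PER_TRACK


def build_key(po_number: int, partial: int, rra: int) -> str:
    """Assemble the 16-character key: 6 digits of purchase order number,
    2 digits of partial order and credit information, 8 digits of rra."""
    if not (0 <= po_number < 10**6 and 0 <= partial < 100 and 0 <= rra < 10**8):
        raise ValueError("a component does not fit its fixed-width field")
    return f"{po_number:06d}{partial:02d}{rra:08d}"


print(home_track(5, 40), home_track(6, 40))
print(build_key(123456, 1, 5))
```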
it should be noted that routines in this system are required to calculate neither absolute track address nor relative track address and therefore the file could be moved to any direct access device supported by os/bdam without program modification. when two records map into the same address, it is called a collision. for a write statement under the basic direct access method of the ibm 360 operating system, the system locates the generated disk address and, if another record is found there, sequentially searches from that point forward until a vacant space is found and then stores the new record in that space. the sequential search is done by a hardware program in the i/o channel and proceeds at the rotational speed of the device on which the file resides. the cpu is free during this period to service other users. similarly, when searching for a record, the system locates the disk address and matches keys; if they do not match, it sequentially searches forward from that point. long sequential searches sharply degrade the operating efficiency of on-line systems. in initial experimentation with this file, it was discovered that some records were 2,500 disk positions away from their computed locations. this seriously reduced response time to the terminals which were operating against those records. the necessity to develop a method for placing each record close to its calculated location became quite obvious. however, the methodology for doing this was not as clear. the upper bound delay for a direct access read/write operation can be defined as the largest number of contiguously occupied record locations within the file. the problem of minimizing this upper bound for a particular file is equivalent to finding an algorithm which maps the keys in such a way that unoccupied locations are interspersed throughout the file space. one method for doing this is to triple the amount of space required for the file.
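the forward search on a collision and the upper-bound measure defined above can be modelled in memory as follows. this is a simplified sketch, not the channel program itself: the file is a python list, `None` marks a vacant location, the search stops at the end of the file rather than wrapping around (an assumption, since the text does not say what happens there), and all names are hypothetical.

```python
from typing import List, Optional


def store(slots: List[Optional[str]], home: int, key: str) -> int:
    """Store `key` at its home position or at the first vacant location
    after it. Returns how far the record landed from its home position."""
    for pos in range(home, len(slots)):
        if slots[pos] is None:
            slots[pos] = key
            return pos - home
    raise RuntimeError("no vacant location at or after the home position")


def longest_occupied_run(slots: List[Optional[str]]) -> int:
    """Largest number of contiguously occupied locations in the file."""
    longest = current = 0
    for slot in slots:
        current = current + 1 if slot is not None else 0
        longest = max(longest, current)
    return longest


slots: List[Optional[str]] = [None] * 8
displacements = [store(slots, 2, "a"), store(slots, 2, "b"),
                 store(slots, 3, "c"), store(slots, 6, "d")]
print(displacements, longest_occupied_run(slots))
```

in this small run only one pair of keys shares a home position, yet a third record is also displaced; that is why the length of the occupied run, rather than the collision count, bounds the search.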
this has been a traditional approach but is unsatisfactory in terms of its efficiency in space utilization. the method first used by the library was motivated by the necessity to "get on the air." its requirements were that it be easily implemented and perform to a reasonable degree. the prime modulo scheme seemed to qualify and was selected. as this algorithm was used, the largest prime number within the file size was divided into the purchase order number and the modulo remainder was used as an address; that is, rra = [po modulo pr] where rra is the relative record address, po is the purchase order number, and pr is a prime number. during the initial period file size grew to about 8,000 records. because the acquisitions section was converting from its manual operation, the file continued to grow in size and the collision problem became pronounced. when the file reached about 70% capacity-that is when 70% of the space allocated for the file was being occupied by records-this method became unusable; records were then located so far from their original addresses that terminal response times became degraded and batch process routines began to have significant increases in run times. with no additional space available to expand the size of the file, it became necessary to increase the record density within the existing file bounds. therefore an adaptation of the original algorithm was developed. 
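the prime modulo scheme, together with the forward search on collision and the upper-bound delay defined earlier, can be sketched in python as follows. this is a small in-memory illustration, not the bdam implementation: the list of slots stands in for the disk file, wrapping around at the end of the file is an assumption, and all function names are hypothetical.

```python
from typing import List, Optional


def largest_prime_at_most(n: int) -> int:
    """Largest prime <= n, found by trial division (adequate for small n)."""
    for candidate in range(n, 1, -1):
        if all(candidate % d for d in range(2, int(candidate**0.5) + 1)):
            return candidate
    raise ValueError("no prime <= n")


def prime_modulo_address(po: int, pr: int) -> int:
    """rra = po modulo pr."""
    return po % pr


def store(slots: List[Optional[int]], po: int, pr: int) -> int:
    """Place po at its home slot or the next vacant one; return the displacement.

    Wrapping around at the end of the file is an assumption of this sketch.
    """
    home = prime_modulo_address(po, pr)
    for step in range(len(slots)):
        position = (home + step) % len(slots)
        if slots[position] is None:
            slots[position] = po
            return step
    raise RuntimeError("file is full")


def longest_occupied_run(slots: List[Optional[int]]) -> int:
    """Largest number of contiguously occupied slots (the upper-bound delay)."""
    longest = current = 0
    for slot in slots:
        current = current + 1 if slot is not None else 0
        longest = max(longest, current)
    return longest


slots: List[Optional[int]] = [None] * 10
pr = largest_prime_at_most(len(slots))  # 7
displacements = [store(slots, po, pr) for po in (3, 10, 17, 4)]
print(displacements)                # [0, 1, 2, 2]
print(longest_occupied_run(slots))  # 4
```

purchase order numbers 3, 10, and 17 all share the home position 3, so each later arrival is pushed one slot further, and the record for number 4 is displaced in turn even though it did not collide at the outset. this is the clustering that made the scheme unusable as the file filled.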
in addition to generating the original number by dividing a prime number into the purchase order number and keeping the modulo remainder, the purchase order number was multiplied by 300 and divided by that same prime number to get an additional modulo remainder; the latter was added to the first modulo remainder and the sum then divided by 2:

rra = [(po modulo pr) + (300 × po modulo pr)] / 2

again this scheme brought some relief, but the file continued to grow as the system was implemented, and it became obvious that this procedure would also fail because of over-crowded areas in the file. a search of the literature using w. b. climenson's chapter on file structure (2) as a start provided some other methods for reducing the collision problem (1, 3, 4, 5, 6). several randomization or hashing schemes were examined. however, none of these methods appeared to be particularly pertinent to the set of conditions at washington state.

in order to bring relief from the continuing problem of file and program maintenance involved with changing the file-mapping algorithm, research was initiated to devise an algorithm which would, independent of the input data, map records uniformly across the available file space. the algorithm which resulted utilizes a pseudo-random number generator, rand (7), developed at the w.s.u. computing center (randl, program 360l-13.5.004, computing center library, computing center, washington state university, pullman, washington). the normal use of rand is to generate a sequence of uniformly distributed integers over the interval [1, m], where m is a specified upper bound in the interval [1, 2^31 - 1]. in addition to m, rand has a second input parameter: n, which is the last number generated by rand. given m and n, rand generates a result r.
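the two-remainder formula and the interface of rand described above can be sketched in python. the internals of rand are not given in the text, so rand_like below is only a stand-in with the same inputs and output range: it uses one step of a lehmer-style multiplicative congruential generator with multiplier 16807, which is an assumption and not the w.s.u. routine. integer division in the averaged formula is also an assumption.

```python
LEHMER_MODULUS = 2**31 - 1  # matches the stated upper limit for m
LEHMER_MULTIPLIER = 16807   # assumption: a classic multiplier, not taken from the paper


def averaged_modulo_address(po: int, pr: int) -> int:
    """rra = ((po modulo pr) + (300 * po modulo pr)) / 2, with integer division."""
    return ((po % pr) + ((300 * po) % pr)) // 2


def rand_like(m: int, n: int) -> int:
    """Stand-in with rand's interface: bound m, previous value n, result r in [1, m]."""
    if not 1 <= m <= LEHMER_MODULUS:
        raise ValueError("m must lie in [1, 2**31 - 1]")
    step = (LEHMER_MULTIPLIER * n) % LEHMER_MODULUS
    return 1 + step % m


print(averaged_modulo_address(1000, 97))  # 53
r = rand_like(18000, 123456)
print(1 <= r <= 18000)                    # True
```

for po = 1000 and pr = 97 the two remainders are 30 and 76, and their average is 53. because both remainders are smaller than pr, the averaged address also stays below pr.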
rand is used by the algorithm to generate relative disk addresses by setting m to the size or capacity of the file, by setting n to the purchase order number of the record to be located, and by using r as the relative address of the record:

rra = rand (po, m).

in order to test the effectiveness of this algorithm and others which might be devised, a file simulation program was written (bdamsim, program 360l-06.7.008, computing center library, computing center, washington state university, pullman, washington). inputs to this program are: a) an algorithm to generate relative record locations; b) a sequential file which contains the input data for "a"; c) various scalar values such as file capacity, approximate number of records in the file, title of output, etc. the program analyzes the numbers generated by "a" operating on "b" within the constraints of "c". the outputs of the program are some statistical results and a graphical plot showing the cumulative distribution function of the generated addresses. figures 1, 2, and 3 show the plotted output of the three algorithms operating against the current acquisitions file. the abscissas of the plots

[fig. 1. rra = po modulo pr. abscissa: relative record addresses (x 10^2).]

[fig. 2. rra = ((po modulo pr) + (300 x po modulo pr))/2.]

[fig. 3. rra = rand (po, pr).]

while any abandoned cluster (14,692,237 out of 24,030,176!) was erroneously described as follows: this xml empty statement omits the specific information about the abandoned cluster.
to obtain this invaluable information again, we filed a bug by email.29 the decision taken was drastic: starting in may 2020, viaf stopped including this information in its monthly dump, as stated at the bottom of the page itself.30 as a result, the only recourse available to viaf contributors or any other institution that wants to synchronize its authority records with viaf identifiers is to rely on an external identification tool such as wikidata!

information technology and libraries june 2021 beyond viaf | bianchini, bargioni, and pellizzari di san girolamo

materials and methods

any comparison between viaf and wikidata must consider their different content. viaf contains personal name clusters, corporate name clusters, geographic name clusters, and work clusters, whereas wikidata allows items to describe any kind of entity relevant in the universe of discourse of the users’ data and irrespective of their bibliographic nature. even if all kinds of viaf clusters are relevant for bibliographic control, this study is limited to the analysis of personal name clusters in viaf and of items having “instance of: human” (p31:q5) in wikidata, because they are by far the most represented in viaf and they can be directly compared.31 some entities, such as mythological persons, legendary persons, etc., that are personal clusters in viaf, are not treated as humans in wikidata and belong to other instances (e.g., https://www.wikidata.org/wiki/q95074).

a twofold approach was used to compare viaf and wikidata: first, data analyses of viaf and wikidata were performed, to compare viaf clusters and wikidata items and to investigate their reciprocal relationships (see the data analysis section).
second, a comparison of several general characteristics, such as scope, objectives, philosophy, authority control, and identification, was made based on respective websites and available literature to find and highlight differences and similarities. full viaf dumps are available in native xml, rdf, marc-21 xml, or iso-2709 marc-21 (http://viaf.org/viaf/data/). viaf clusters were analyzed using an xml dump published on september 6, 2020 (http://viaf.org/viaf/data/viaf-20200906-clusters.xml.gz). full wikidata dumps are available in xml, json, or rdf.32 however, given the size of the entire dataset, it is much more convenient to create customized rdf dumps using the tool wdumper (https://wdumps.toolforge.org/). all the information (settings, dimension, and date of base dump) about dumps created using wdumper remains traced (https://wdumps.toolforge.org/dumps). wikidata items were analyzed using a customized rdf dump updated to september 14, 2020 (https://wdumps.toolforge.org/dump/732). the customized dump contains all statements with non-deprecated values33 present in items having both “instance of: human” (p31:q5) in best rank and at least one value of “viaf id” (p214) in best rank. both dumps were parsed using three perl scripts. dumps and scripts were uploaded on zenodo and are all available for analysis and reuse.34 perl scripts generate json data that are published on the html page http://catalogo.pusc.it/beyond_viaf/, where they are interpreted by javascript scripts in order to populate eight tables: three dedicated to viaf (tables 1–3) and five to wikidata (tables 4–8). 
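the selection rule for the customized dump (non-deprecated statements, “instance of: human” and “viaf id” both in best rank) can be illustrated with a short python sketch. the item shape used here, a mapping from property id to (value, rank) pairs, is a simplified and hypothetical structure and not the format produced by wdumper; the function names are likewise illustrative. best rank is taken to mean the preferred statements when any exist and the normal ones otherwise.

```python
from typing import Dict, List, Tuple

Statement = Tuple[str, str]        # (value, rank); rank is "preferred", "normal" or "deprecated"
Item = Dict[str, List[Statement]]  # property id -> statements (hypothetical, simplified shape)


def best_rank_values(statements: List[Statement]) -> List[str]:
    """Values of preferred statements if any exist, otherwise of normal ones."""
    preferred = [value for value, rank in statements if rank == "preferred"]
    if preferred:
        return preferred
    return [value for value, rank in statements if rank == "normal"]


def is_selected(item: Item) -> bool:
    """True for a human (P31 = Q5) with at least one VIAF ID (P214), both in best rank."""
    is_human = "Q5" in best_rank_values(item.get("P31", []))
    has_viaf = bool(best_rank_values(item.get("P214", [])))
    return is_human and has_viaf


person = {"P31": [("Q5", "normal")], "P214": [("57898554", "normal")]}
withdrawn = {"P31": [("Q5", "normal")], "P214": [("57898554", "deprecated")]}
print(is_selected(person), is_selected(withdrawn))  # True False
```

an item whose only viaf id is deprecated is left out, which mirrors the restriction to non-deprecated values described above.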
in order to select the statements to be analyzed in wikidata items, three sets of relevant properties were found through three distinct sparql queries at the end of september 2020: viaf members (table 5), authority controls related to libraries but not being viaf members (table 6), and biographical dictionaries (table 7).35 at the beginning of october 2020, another sparql query was performed to find all the personal items containing the authority controls related to libraries but not being viaf members (table 6, column 4), without restricting the search to personal items having at least one value of “viaf id” (p214).36

data analysis: viaf clusters and wikidata items

for this paper, two different versions of the data tables were produced: the first version, available at http://catalogo.pusc.it/beyond_viaf/, is a full, commented, and dynamic version of all the tables. within that version, links to the acronyms (such as lc, dnb, sudoc, etc.) of all the viaf contributors and other data providers are available too. static versions of these tables are included in this paper with commentary.

viaf

viaf has 22,099,715 personal clusters, half of which (50.90%; table 1, col. 2) are isolated clusters (i.e., they contain only one id).
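the notion of an isolated cluster can be made concrete with a small python sketch that counts clusters by the number of ids they contain. the cluster representation (a mapping from cluster id to a list of source acronym and source id pairs) and the sample data are hypothetical and are not the structure of the viaf dump or of the authors' perl scripts.

```python
from collections import Counter
from typing import Dict, List, Tuple

Cluster = List[Tuple[str, str]]  # (source acronym, source id) pairs; hypothetical shape


def clusters_by_id_count(clusters: Dict[str, Cluster]) -> Counter:
    """Number of clusters for each cluster size (count of ids in the cluster)."""
    return Counter(len(ids) for ids in clusters.values())


def isolated_share(clusters: Dict[str, Cluster]) -> float:
    """Percentage of clusters that contain exactly one id."""
    return 100.0 * clusters_by_id_count(clusters)[1] / len(clusters)


sample = {
    "c1": [("LC", "n1")],
    "c2": [("DNB", "d1")],
    "c3": [("LC", "n2"), ("DNB", "d2")],
    "c4": [("LC", "n3"), ("ISNI", "i1"), ("WKP", "Q1")],
}
print(isolated_share(sample))  # 50.0
```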
the presence of isolated clusters is interesting because it means that those clusters are created based on data coming from just one source. what is more, the percentage of isolated clusters is much higher (71.19%; table 1, col. 12) if just viaf contributors are taken into account (i.e., excluding isolated clusters due to data from other data providers, such as isni). it is worth noting that other data providers can form isolated clusters, with the notable exception of wikidata (for which viaf uses the acronym wkp), which never appears in isolated clusters (table 1, cols. 7 and 8).

table 1. viaf personal clusters by number of sources [adapted from http://catalogo.pusc.it/beyond_viaf/#tb1]

the total number of ids present in viaf clusters is 51,327,847 (table 2), distributed in 22,099,715 clusters; the most relevant contributors include lc (7,266,628 ids), dnb (5,677,731 ids), sudoc (3,278,189 ids), and nta (2,754,036 ids), while the most relevant other data providers are isni (8,455,814 ids) and wkp (2,148,680 ids) (table 2). apart from lc and dnb, data about isolated clusters (table 2, col. 5) shows that the number of isolated clusters tends to slowly decrease over time and that clustering has improved: recently added sources tend to have a higher share of isolated ids. another relevant figure is that sources in non-latin alphabets usually have higher shares of isolated ids.37 so, a high number of isolated clusters may reveal a source whose ids still partly need to be merged into existing clusters.

table 2.
viaf personal clusters by source [adapted from http://catalogo.pusc.it/beyond_viaf/#tb2]

the histories of viaf clusters, as contained in xml dumps, appear odd and inconsistent. for example, many viaf contributors in their first year of appearance seem to have no additions and many removals (e.g., bav row; for complete information see table 3 on the website at http://catalogo.pusc.it/beyond_viaf/#tb3). the inconsistency is due to the absence of redirected and abandoned clusters in the data. nevertheless, the histories allow us to reconstruct the year of first contribution of each source (information otherwise unavailable) and to detect major changes in the data provided to viaf by each source.38

table 3. viaf history of personal clusters by source [adapted from http://catalogo.pusc.it/beyond_viaf/#tb3]

wikidata

wikidata has 8,304,947 personal items and 2,061,046 of them contain a viaf id. usually one or more viaf sources are extracted from the viaf id(s), so that 1,905,470 personal items containing viaf id have at least one viaf source id (table 4, col. 1). wikidata records ids from a wide range of other resources, such as non-viaf bibliographic agencies and biographical dictionaries (investigated in these tables), but also encyclopedias and various online databases. considering the 2,061,046 items containing a viaf id, 684,367 items contain only one viaf source id (table 4, col. 1), but only 353,710 items contain only one id when viaf source ids, non-viaf source ids, and biographical dictionary ids are counted together (table 4, col. 15); so, more than 300,000 items containing only one viaf source id have at least one non-viaf source id and/or one biographical dictionary id.

table 4. wikidata personal items (pers. it.)
by number of ids [adapted from http://catalogo.pusc.it/beyond_viaf/#tb4]

viaf and wikidata: a data comparison

from a quantitative perspective, wikidata personal items (8,304,947) are 37.58% of viaf personal clusters (22,099,715), while wikidata personal items having a viaf id (2,061,046) are 9.26%. ids from viaf sources present in wikidata personal items containing viaf id (6,292,778; table 5, col. 3) are 12.91% of ids present in viaf personal clusters (48,740,933; table 5, col. 4).

in the authors’ opinion, the quantitative comparison between viaf and wikidata must be carefully considered. it could be argued that this is a noticeable disadvantage of wikidata with respect to viaf, but that would hold only from a bibliographic control perspective, and the other side of the coin must be examined too. as wikidata represents any kind of entity relevant for its users (libraries, archives, museums, and many other stakeholders), viaf contains just over a third of wikidata items (37%). furthermore, a very large part of the personal entities represented in wikidata (at present, more than 6,200,000, i.e., about 75%) cannot rely on viaf for identification purposes (for example, because wikidata personal items can also represent singers, lawyers, pilots, and so on). it can be concluded that viaf can be considered just one specialized source, in the domain of the semantic web and with respect to the objectives of wikidata.

considering single viaf sources, wikidata surpasses viaf by number of ids only in two cases, perseus (135.18%) and simacob (102.17%) (table 5, col. 5). this is possible because wikidata and viaf gather different sets of data from these two sources; the former uses sets of data obtained by its users, while the latter uses only data sent by the contributor. all the other sources, because of the absence of systematic imports, are much rarer in wikidata than in viaf.
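several of the shares quoted in this comparison follow directly from the counts given in the text, as the following python sketch shows. only figures stated in the text are used; the variable and function names are illustrative.

```python
def percentage(part: int, whole: int) -> float:
    """Share of part in whole, as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


WIKIDATA_PERSONAL_ITEMS = 8_304_947
WIKIDATA_ITEMS_WITH_VIAF_ID = 2_061_046
VIAF_PERSONAL_CLUSTERS = 22_099_715
VIAF_SOURCE_IDS_IN_WIKIDATA = 6_292_778  # table 5, col. 3
VIAF_SOURCE_IDS_IN_VIAF = 48_740_933     # table 5, col. 4

# wikidata personal items relative to viaf personal clusters
print(percentage(WIKIDATA_PERSONAL_ITEMS, VIAF_PERSONAL_CLUSTERS))        # 37.58

# viaf source ids recorded in wikidata relative to those in viaf
print(percentage(VIAF_SOURCE_IDS_IN_WIKIDATA, VIAF_SOURCE_IDS_IN_VIAF))   # 12.91

# personal items that carry no viaf id at all
without_viaf_id = WIKIDATA_PERSONAL_ITEMS - WIKIDATA_ITEMS_WITH_VIAF_ID
print(without_viaf_id)                                                    # 6243901
print(percentage(without_viaf_id, WIKIDATA_PERSONAL_ITEMS))               # 75.18
```

the last two lines correspond to the statement that more than 6,200,000 personal items, about 75%, cannot rely on viaf for identification.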
table 5. wikidata personal items (pers. it.) by viaf source [adapted from http://catalogo.pusc.it/beyond_viaf/#tb5]

table 6 and table 7 show authority control in wikidata leaving aside viaf. wikidata contains some non-viaf sources (usually non-national libraries or groups of libraries which couldn’t become viaf contributors); their ids in personal items having viaf id (894,161) are 86.04% of their ids in all personal items (958,206; table 6, col. 4), meaning that wikidata provides a clusterization for more than 64,000 ids (6%) probably corresponding to non-existent viaf clusters (table 6, totals).

table 6. wikidata personal items (pers. it.) by non-viaf sources [adapted from http://catalogo.pusc.it/beyond_viaf/#tb6]

table 7. wikidata personal items (pers. it.) by biographical dictionary [adapted from http://catalogo.pusc.it/beyond_viaf/#tb7]

in general the presence of ids of biographical dictionaries (796,609 ids in total) in 725,755 personal items having viaf id helps significantly in the definition of authoritative dates of birth and death (table 7, total of column 2 and table 4, total of column 12).
a comparison between table 1, column 7, and table 2, row wkp (the acronym for wikidata wrongly used by viaf) shows that 2,147,319 clusters contain 2,148,680 wkp ids; it means that, from a viaf point of view, wikidata duplicates are only 1,361. furthermore, a comparison between the total and row 0 in table 8, col. 1, shows that 2,061,046 items contain at least one viaf id and that 2,037,638 items contain exactly one viaf id; so, items containing one or more viaf duplicates are 23,408. as a result, it can be concluded that the percentage of duplicates in wikidata is less than 0.01% and in viaf is about 0.01%, so wikidata is as trustworthy as viaf.

viaf and wikidata are not only able to discover reciprocal duplicates but can also discover duplicates in viaf sources, through a comparison between table 8, col. 3 (containing the total number of the cases in which a viaf source has at least one duplicate) and table 8, col. 5 (containing the total number of the cases in which viaf sources are duplicated). however, while duplicates recorded by viaf are findable only by querying the monthly dumps using in-house programs, duplicates discovered by wikidata are easily findable through sparql queries detecting single-value constraint violations.

table 8. wikidata personal items (pers. it.) by repeated viaf sources and viaf source ids [adapted from http://catalogo.pusc.it/beyond_viaf/#tb8]

discussion

viaf and wikidata are quite different in their purpose, scope, organizational and theoretical approach, data harvesting and management.
a major difference between viaf and wikidata is in their purpose: on the one hand, viaf aims to identify bibliographic entities and to connect authority data provided by selected contributors (national libraries, cultural agencies, and other major institutions) and extracted from other data providers (such as isni, rism or de663, wikidata, etc.) through the creation of clusters by means of software. on the other hand, like isni, wikidata focuses on both identification and description of entities and has the purpose of collaboratively building a database concerning the sum of all relevant knowledge (provided that each item complying with its notability criteria is accepted) using a crowdsourced approach (https://www.wikidata.org/wiki/wikidata:notability).

another relevant difference between viaf and wikidata is their scope: while viaf aims to identify a few selected types of entities already described within the bibliographic universe by national agencies, wikidata aims to identify and describe any kind of entity of interest for the wikidata community. wikidata items may exist for any kind of entity and may contain a very broad range of data and of external identifiers.
so, wikidata can represent bibliographic data and entities (e.g., at present wikidata records data for 54% of all the bibliographic sources cited in wikipedia entries), any other kind of entity provided for in viaf (i.e., agents, works, expressions, and places), and any other entity defined by the frbr-ifla lrm model (e.g., manifestations, items, timespans, nomens, res, etc.), and by other models relevant for the glam universe (such as frbroo and cidoc).39 but it is open to any data model because it can also include any kind of entity outside the bibliographic or cultural heritage universe, as it is a knowledge base capable of containing any kind of statement on any entity users want to describe. in addition, for any kind of entity there is no minimum or maximum number of statements that must or can be added; as soon as an entity is clearly identified, it can be added to wikidata. moreover, when missing, new identifiers (and properties for description) can be proposed by anyone through property proposals and, if well defined, they are usually approved within two weeks (https://www.wikidata.org/wiki/wikidata:property_proposal). a broader scope is supposed to be much more convenient for users who wish to discover previously unknown links and information in the semantic web.

organizational model

due to the viaf top-down approach, data is completely managed by oclc with no chance for common users or medium and small libraries or other institutions to directly improve viaf clusters (e.g., by adding other data coming from their collections or from encyclopedias or online databases, merging duplicates, solving conflations, etc.). as the wikidata approach is “to crowdsource data acquisition, allowing a global community to edit the data,” data is curated directly by users interested in their creation and use.40 so, in wikidata, data is produced by volunteers, by means of semiautomatic or manual data harvesting from any desired and available source.
moreover, users’ statistics show that authoritative data from national bibliographic agencies and other libraries, archives, and museums are normally uploaded by common users, not by librarians (or any other kind of institutional data curator).41

identification function

the theoretical approach differs too, both as to the form of the names and as to identification function. in viaf, preferred and variant forms of names for persons are based on national cataloguing codes. because national codes are different, viaf is needed and works as a neutral hub of all the national preferred forms. cataloguing rules can assure uniformity and univocity to the forms of the names of the entities within a national catalogue but are quite complicated for users to understand and use. in ranganathan’s words, “the cataloguing conventions are on the surface quite contrary to what mr. everybody is familiar with.”42 in contrast, preferred forms in wikidata are based on the international principles of the convenience of the user and common usage.43 a clear example is the use of the direct form of name (jane doe) instead of the inverted form of name (doe, jane).

a different usage in the forms of names could be an issue for the integration of library metadata in wikidata. in practice, however, it is not. first, there is no conflict between the wikidata form and any other form from a theoretical point of view, as the wikidata form is already treated in viaf as the preferred form within its specific context.44 in addition to that, wikidata accepts any library identifier, so that any library-controlled form can be linked to a wikidata item and vice versa. furthermore, a wikidata bot could be programmed to dump authorized and variant access points from national authority files and add them to the item labels and aliases.
45 lastly, it could be argued that national cataloguing codes are compliant with the icp principles and with the convenience of the user and common usage. but a remarkable difference is that while in national codes principles are applied by cataloguers for users, in wikidata they are expressed directly by the users themselves.

as the identification function is a major feature of the semantic web, the different approach of viaf and wikidata to this issue must be underlined. as noted, “viaf remains neutral towards differences in the cataloguing policy of its data contributors” and, for this reason, viaf accepts all ids provided by its sources, even when they are not clearly identifiable entities but are just labels (see for example https://viaf.org/viaf/307171748 or https://viaf.org/viaf/305052259).46 on the contrary, wikidata explicitly requires each item to refer to “a clearly identifiable conceptual or material entity” (second notability criterion; https://www.wikidata.org/wiki/wikidata:notability). as a consequence, many isolated clusters formed by viaf on the basis of single contributors’ ids related to not-clearly-identifiable entities are not acceptable in wikidata and remain unlinked.

moreover, data on cluster duplication shows that identification in wikidata is performed with the same quality level as in viaf. clusters for identification purposes are created both in viaf and wikidata but, unlike in viaf, in wikidata external identifiers (like all the other data) are not provided in a structured way by national libraries or other institutions (with very few exceptions); instead, identifiers are usually found and added by common users through web scrapers and after data cleaning. what is more, matches are not performed automatically, but semiautomatically (through tools such as openrefine or mix’n’match; https://mix-n-match.toolforge.org/ and https://openrefine.org/) or manually.
an enhanced feature of wikidata in clusterization is the record of a wider variety of sources and relative ids: due to its openness, wikidata refers to viaf and its sources, but also to any other library or cultural institution and to a large number of reference sources like encyclopedias and biographical dictionaries too (table 7). a wider variety of identification sources and manual work assure a higher level of identification.

data quantity

data harvesting affects both quantity and quality of data. in viaf, data are collected from periodical contributions of viaf participants, with very large sets of data. therefore, from a quantitative point of view, viaf has a far larger number of people (22,099,715 personal clusters) in comparison with wikidata (8,304,947 personal items). even though wikidata was created in 2012, the number of personal items in wikidata is currently only over a third (37%) of all viaf personal clusters.

although quantities are not directly comparable due to the different universe to be described, in the last few years initiatives to enhance organized cooperation between libraries and wikidata and to promote data production in wikidata are increasing. a very high-quality initiative is supported by cornell university, harvard university, stanford university, and the university of iowa’s school of library and information science, in collaboration with the library of congress and the program for cooperative cataloging (pcc).
their linked data for production (ld4p) wikidata project is “an indepth exploration of how wikidata could serve as a platform for publishing, linking, and enriching library linked data” (https://www.wikidata.org/wiki/wikidata:wikiproject_linked_data_for_production). an additional example is the ifla wikidata working group that was formed “to explore and advocate for the use of and contribution to wikidata by library and information professionals, the integration of wikidata and wikibase with library systems, and alignment of the wikidata ontology with library metadata formats such as bibframe, rda, and marc” (https://www.ifla.org/node/92837).

even so, wikidata is still very far from having a structured workflow to ingest data from national or local libraries, museums, and archives. in fact, while the projects mentioned above are mainly dedicated to explaining to librarians and institutions why wikidata is important and how to contribute to it, there are still very few projects mainly dedicated to the actual large-scale synchronisation of library and bibliographic data with wikidata. such projects also require considerable effort in manually cleaning up the discrepancies and oddities that emerge from the synchronisation. relevant exceptions are the national library of wales47 and the biblioteca europea di informazione e cultura, where significant work has been done to synchronise respective databases of authors (and of other types of entities) with wikidata.48

data quality

data quality also needs to be analyzed in detail.
even if data from national libraries are authoritative and of high quality, as a virtual file viaf neither has nor produces its own data. consequently, viaf data does not always remain authoritative because errors can be both inherited and added, and clusters can be duplicated. the issue is well known to isni, which “whenever necessary [. . .] splits and merges data coming from viaf, and even applies protection to data that has been fixed manually.”49 as shown in table 2 and table 8, viaf clusters are subject to isolation and duplication when they are created and to many changes and updates when they are maintained. so, even if viaf collects a huge amount of authoritative data and creates clusters of ids, viaf users cannot always safely and continuously rely on them. data flows just in one direction (from national libraries to viaf), viaf deletes and rebuilds clusters without giving priority to the stability of one cluster over another, and, after april 2020, viaf no longer makes available to users a record of its changes.50

on the contrary, wikidata data is always under the strict control of any user, as its structure is designed to trace even the smallest change to its data. every single addition or deletion is documented, not just to recover easily from any vandalism, but also to support any decision with clear evidence. any stakeholder can know exactly if, how, when, and why data changed, at any moment.

what is more, from a qualitative point of view, wikidata seems to offer a better solution for the recording of authority data than viaf. first, it can store a wider variety of data about a person in a more semantic way. not only is it possible in wikidata to express preferred and variant forms of the name, related names, works, co-authors, publication statistics, and other data about the person, as in viaf, but all these data are expressed in a semantic way.
for example, whereas in viaf “bach, anna magdalena” is just a related name of johann sebastian bach, in wikidata she is recorded and qualified as the person who married the musician. thanks to that different approach, wikidata can represent and show bach’s full genealogical tree (https://magnus-toolserver.toolforge.org/ts2/geneawiki/?q=q1339). as adamich noted, “building graphs from bibliographic entities is really about making the data machine readable and understandable. it is about making the data web enabled. in terms of translation, linked data opens up a whole new world over our marc entrapment.”51

quality is also enhanced by the matching method: whereas viaf matches identities by an algorithm based on explicit identifiers or string matching (such as the forms of the name, dates, and bibliographic relationships),52 wikidata matches are usually decided by a human, the user, or (in the case of semiautomatic imports) at least checked a posteriori by a human after some time. the higher precision of manual over automatic matching is also recognized in the viaf guidelines.53 furthermore, as seen above, the notability policy requires that, when clear identification is impossible, no item be created in wikidata.

data maintenance and usability

data quality also relies on maintenance. comparison between wikidata items and viaf clusters shows a very small but constant presence of errors to be fixed in both (around 0.01%), even if it is impossible to determine with certainty whether viaf uses wikidata error pages.
problems with the direct correction of viaf errors by viaf contributors have already been noted: “while clustering anomalies can be handled by viaf itself, reporting errors found in source data of viaf partners raise problems related to the efficiency of the notification workflows. at this point, involvement of viaf partners themselves in the process is needed.”54 on the other hand, in wikidata anyone can edit items, add new data or delete mistakes, merge items, fix various issues, and so on, on the fly. due to its openness, wikidata may also suffer from vandalism, but it has its own solutions.55 in addition, the accuracy and reliability of data receive special attention because the data are uploaded and maintained by users who are direct stakeholders. for this reason, in wikidata, references to bibliographical or biographical sources and to the ids of other data providers, such as national and international identification systems, are suggested, promoted, and carefully examined. moreover, there is a commitment to monitor the consistency of viaf clusters. the ability of wikidata to identify inconsistent viaf clusters, and the fact that viaf isolated clusters can be reduced by at least 30%56 by referring to identifiers from wikidata and other data providers, are the best demonstration of the quality of its data and of the importance of the other data providers in viaf clusterization.

as to the usability of data, the internal search of viaf offers only basic functions: the only available filter limits results to clusters having one specific source. filters for clusters having and/or not having a specific group of sources, or for clusters having more or fewer sources, would be very useful, especially for finding duplicates.
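the kind of source filter described above can be stated precisely. the following python sketch is only an illustration and not a viaf function: it assumes that clusters have already been downloaded into plain dictionaries with a hypothetical `sources` set of contributor codes (the cluster ids and the sample data are invented), and it selects clusters that have every source in one group, none of the sources in another, and a bounded number of sources overall.

```python
from typing import Iterable, Optional

# Hypothetical, hand-made cluster records for illustration only;
# real VIAF data is not delivered in this shape.
CLUSTERS = [
    {"id": "c1", "sources": {"lc", "dnb", "bnf"}},
    {"id": "c2", "sources": {"dnb"}},
    {"id": "c3", "sources": {"lc", "iccu"}},
    {"id": "c4", "sources": {"iccu"}},
]


def filter_clusters(
    clusters: Iterable[dict],
    having: frozenset = frozenset(),
    not_having: frozenset = frozenset(),
    min_sources: int = 0,
    max_sources: Optional[int] = None,
) -> list:
    """Return the ids of clusters that satisfy every given condition."""
    selected = []
    for cluster in clusters:
        sources = cluster["sources"]
        if not having <= sources:      # every required source must be present
            continue
        if not_having & sources:       # no excluded source may be present
            continue
        if len(sources) < min_sources:
            continue
        if max_sources is not None and len(sources) > max_sources:
            continue
        selected.append(cluster["id"])
    return selected


# Isolated clusters (a single source) are natural candidates for duplicate checks.
isolated = filter_clusters(CLUSTERS, max_sources=1)

# Clusters that include lc but lack dnb.
lc_without_dnb = filter_clusters(
    CLUSTERS, having=frozenset({"lc"}), not_having=frozenset({"dnb"})
)
```

combining the "having", "not having", and "more or fewer sources" conditions in one call is a design choice made here for brevity; a real search interface could just as well expose them as separate facets.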
in contrast, wikidata has a sparql query service, which returns results based on the current status of the database, and its internal search can integrate some of the functions of the query service, making it possible to look for items having and/or not having specific statements (https://www.wikidata.org/wiki/special:search).57 as for potential duplicates in their sources, viaf has no page dedicated to listing cases of (supposedly) duplicate ids from its sources, while wikidata makes it easy to find cases in which a single source has (supposedly) duplicate ids, through constraint violations58 and appropriate sparql queries.

a comparison table

a comparison table was built to compare scope, role, system, and functions between viaf and wikidata, inspired by and adapted from a viaf vs isni comparison.59

table 9. comparison between and complementarity of viaf and wikidata features

feature: scope
viaf:
● persons
● organizations
● works
● expressions
● locations
wikidata:
● any kind of viaf entity
● any “res” of ifla lrm
● any entity of cidoc
● any other non-glam entity
● any entity in the universe of discourse

feature: software
viaf:
● unknown
wikidata:
● wikibase60

feature: data. person entity properties
viaf:
● preferred form of name, based on national cataloguing rules
● very rich variant forms of name, identified by national agencies variant forms
● sources
wikidata:
● preferred form of name (label) based on convenience of the user and common usage61
● variant forms of name (aliases), organized by languages and scripts62
● sources (as statements and references and with qualifiers)

feature: data. quantity (persons)
viaf:
● number of clusters: 33,656,281 (sept. 2020)
● number of personal clusters: 22,099,715 (sept. 2020)
wikidata:
● number of entities: 90,260,081 (oct. 2020)
● number of personal items: 8,304,947 (oct. 2020)
● number of personal items with viaf id: 2,061,046 (sept. 2020)

feature: data. harvesting
viaf:
● data are provided by authoritative national bibliographic agencies
wikidata:
● data are added through massive semiautomatic imports and/or manually by any interested user

feature: data. quality
viaf:
● data are granted by authoritative national bibliographic agencies
wikidata:
● data are controlled by any directly interested user, based on data from viaf, available bibliographic agencies, and other authoritative bibliographic sources

feature: data. other entities properties
viaf:
● isbn, titles, dates included in the cluster
● dates, genre, bibliographic references from sources, xlinks, etc.
● properties are unchangeable
wikidata:
● any kind of property applicable to an entity can be used (multimedia included)63
● all statements admit references, which are strongly recommended in some cases
● unavailable properties can be freely added through a process of property proposal64

feature: data. dates
viaf:
● dates are extracted from authority and bibliographic records using a parsing technique; calendars and precision are not available65
wikidata:
● dates are imported semiautomatically from various sources or filled in manually; different calendars are available and further statements can be made through qualifiers66

feature: data. vandalism
viaf:
● no vandalism: data are editable only by oclc
wikidata:
● everyone can edit, but items which are frequently vandalized can be temporarily or permanently protected from the edits of unregistered users67

feature: data. fixing errors, deduplicating, or unmerging clusters/items
viaf:
● suggestions and requests via email
● asynchronous
● presumably, automated processes and human interventions
● viaf rebuilds clusters and does not give priority to the stability of one cluster over another68
wikidata:
● everyone can edit69
● instantaneous
● probable errors (constraint violations) are detected in an automated way (by bots and through queries)
● pages with lists of probable errors (constraint violations) are freely available and constantly updated in an automated way (by bots)70

feature: data. license
viaf:
● all public data (license: http://opendatacommons.org/licenses/by/1.0/)
wikidata:
● all public data (license: https://creativecommons.org/publicdomain/zero/1.0/deed.it)

feature: role
viaf:
● create clusters
● ingest authority records from viaf contributors and other data providers (including wkd and isni)
● publish and diffuse viaf ids and data
wikidata:
● create items with a worldwide recognized and standard identifier
● interlink items with any available external identifier
● ingest data from viaf, from viaf contributors, and other data providers (e.g., isni)
● allow the creation and maintenance on toolforge of free tools (e.g., mix’n’match) to ingest external identifiers71
● manage library, bibliographic, and non-library and non-bibliographic linked data
● publish and diffuse wikidata ids and data

feature: organizational model
viaf:
● oclc service, guided by viaf council of participating institutions
● hierarchical, top-down
● membership on request and subordinated to approval
● largely limited to national bibliographic agencies
wikidata:
● wikimedia project
● distributed, bottom-up
● everyone can take part in the project72
● open to any bibliographic or non-bibliographic institution (national, large, medium, and small)

feature: system. website
viaf:
● interface only in english language
wikidata:
● interface in nearly any language and script; new ones can be added
● online facilities (end user input; edit online facilities for end user)
● login enhances users’ experience (by gadgets and scripts)

feature: system. updating
viaf:
● periodical (asynchronous) ingestions
wikidata:
● continuous, instantaneous, free updates

feature: system. versioning
viaf:
● history is included in each present cluster and for abandoned clusters
● history is inaccessible in redirected clusters
wikidata:
● page history available in each item and for redirected items
● for deleted items, history is accessible only to administrators

feature: long-term preservation policy
viaf:
● oclc maintains the hosting, software, and data for viaf73
wikidata:
● wikimedia foundation maintains the hosting, software, and data for wikidata74

feature: notifications to stakeholders
viaf:
● notifications to be sent to data providers
wikidata:
● notifications are sent to end users and contributors

feature: display, search, and download
viaf:
● in multiple formats: xml and json, including justlinks.json
● basic search interface
● clusters are listed without clear ranking rule
● integrating monthly dumps
● api endpoint75
● before april 2020, by monthly dump with persist links; after, monthly dumps without persist links
wikidata:
● in multiple formats: json, php, n3, ttl, nt, rdf, jsonld, html76
● search interface77
● api endpoint78
● sparql query endpoint79
● dumps80, also customizable81
● see https://www.wikidata.org/wiki/help:about_data

feature: linked data and sru
viaf:
● linked data
● sru82 (search and browse indexes, using cql syntax; output formats are xml or html)
wikidata:
● linked data

feature: interoperability. local
viaf:
● local institution can only reconcile viaf ids to their own data
● as changes are made by viaf, synchronization must be periodically performed by sources and local institutions
wikidata:
● full reconciliation, upload, and synchronization of local ids on wikidata and vice versa
● dedicated tools: mix’n’match
● other tools: openrefine
● bots
● manually

conclusion

the main features of viaf and wikidata and their data on personal entities were analyzed and compared in this study to focus on analogies and differences, and to highlight their reciprocal role and helpfulness in the worldwide bibliographical context and in the semantic web environment. viaf is a major international initiative to address the challenge of reliably identifying bibliographic agents on the web, by means of authoritative data based on national cataloguing codes and coming from the national libraries involved in the ubc program. moreover, viaf is a pillar of the identification process that users enact within wikidata. still, the comparison emphasized a few relevant issues in viaf’s approach, designed more than twenty years ago: a very selective policy on the inclusion of its sources (contributors and other data providers) and on their participation in governance, which prevents the project from opening up worldwide to non-national libraries and cultural institutions; an obvious neutrality toward data coming from its contributors, even when data are not compliant with the identification requirements of the semantic web; difficulties in the correct clustering of ids (duplicate clusters to be merged and conflated clusters to be split), and a one-way flow of data due to its top-down approach that prevents a quick and cooperative workflow
to identify and fix errors; the ability to identify only a narrow range of entities (i.e., mainly bibliographic entities, but not even all those provided by ifla lrm).

on the other hand, the semantic web has offered important new tools and opportunities to libraries, archives, museums, and other cultural institutions, and their data are recognized as a relevant asset for building the backbone of the semantic web as regards the control of entities of bibliographic and cultural interest. after eight years of existence, wikidata is playing a relevant role in the publication, aggregation, and control of bibliographic and non-bibliographic information in the semantic web too. it is increasingly indicated as a hub for identifiers in the semantic web.83 wikidata depends on viaf for a large part of the identification work on its items, and viaf’s preeminent role in wikidata is acknowledged by its primary position in the identifiers section of each item. for this reason, the wikidata community constantly monitors the consistency of viaf clusters and continuously updates lists of the errors present in them. on the other hand, if viaf is undoubtedly very useful to the wikidata community, wikidata can support the consistency of viaf clusters. the wikidata informational ecosystem is much larger and wider and can be built by any interested institution and person, and its identification function can also count on the authority work of national and non-national libraries excluded from the viaf environment, and on authoritative non-bibliographical reference sources too.

this study opens some research perspectives. the analysis was limited to data about personal entities, as this kind of entity was the only one directly comparable; further research is needed to extend the analysis to other kinds of entities.
moreover, more research should be devoted to the treatment of special categories of persons and their names, such as mythological and legendary characters, ancient greek and latin authors, kings, queens, popes, saints, and so on, as the viaf guidelines84 themselves list the clusterization of such names among viaf’s typical problems (and they often get five or more viaf ids in wikidata). a further line of research should consider the relevance of the clusterization of encyclopedias and other reference sources in the identification process within wikidata. lastly, isolated clusters need more consideration; in this study they were used as a clue to relatively recent uploads in viaf, but lc and dnb show a high rate of isolated clusters too (maybe due to the richness of their collections and metadata). more research on isolated clusters could help to describe more precisely the possible role of non-national libraries and institutions, and of their locally rich collections, in identifying lesser-known agents (not just persons) in a worldwide perspective.

from the analyzed data and direct comparison, it can be concluded that viaf and wikidata can be constantly improved through reciprocal comparison, which allows the discovery of errors in both. viaf and wikidata are two relevant tools for authority control in the semantic web, and each has a specific role to play and different stakeholders. unfortunately, in contrast to the relationship between viaf and isni, at present no aspect of viaf-wikidata interoperability is discussed between the managing structures of the two systems, on either a regular or an irregular basis.
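the remark above that some items carry five or more viaf ids can be checked against the wikidata query service. the sketch below is a hedged illustration, not one of the queries used in this study: it only builds the sparql text, relying on p214 (the viaf id property cited in the endnotes) and on the `wdt:` prefix predefined by the query service. the function name and the default threshold are hypothetical choices made for this example, and sending the query to the service is left to the reader.

```python
def multiple_viaf_ids_query(min_ids: int = 5, limit: int = 100) -> str:
    """Build a SPARQL query listing items with at least `min_ids` VIAF IDs (P214)."""
    if min_ids < 2:
        raise ValueError("min_ids must be at least 2 to describe a duplication")
    # Doubled braces are literal braces inside an f-string.
    return f"""
SELECT ?item (COUNT(DISTINCT ?viaf) AS ?viafCount)
WHERE {{
  ?item wdt:P214 ?viaf .
}}
GROUP BY ?item
HAVING (COUNT(DISTINCT ?viaf) >= {min_ids})
ORDER BY DESC(?viafCount)
LIMIT {limit}
""".strip()


query = multiple_viaf_ids_query()
print(query)
```

the printed text can be pasted into the query service at https://query.wikidata.org/; a query spanning every item may exceed the time limit of the public service, so restricting it to a class of items (for instance, to the special categories of persons mentioned above) is probably advisable.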
while wikidata appears to be more reliable with regard to the identification process, its most significant weakness lies in its unorganized and unplanned crowdsourced data acquisition, even if it is based at present on about 11,500 active editors.85 furthermore, the wikidata community still lacks the constant support and cooperation of institutional data curators such as librarians, archivists, and museum curators. many current projects are mainly dedicated to explaining to potential institutional stakeholders the importance and usefulness of wikidata for their institutional missions, but there are still too few projects devoted to the massive synchronization of data from institutional silos to wikidata. as soon as these initiatives reach a critical mass, however, wikidata will become the real global hub of the web of data.

acknowledgements

all the authors cooperated in the drafting and revision of the article. nevertheless, each author mainly authored specific sections and subsections of the article:
• stefano bargioni: data analysis; viaf; wikidata; viaf and wikidata: a data comparison.
• carlo bianchini: introduction; discussion; organizational model; identification function; data quantity; data quality; data maintenance and usability.
• camillo carlo pellizzari di san girolamo: relationship between viaf and libraries; relationship between wikidata and academic, research, and public libraries; relationship between viaf and wikidata; wikidata controls on viaf; materials and methods; conclusion.
all authors contributed to a comparison table. the authors wish to thank the anonymous reviewer whose suggestions helped to improve and enrich the paper, and the editor for his helpful edits.
information technology and libraries june 2021 beyond viaf | bianchini, bargioni, and pellizzari di san girolamo

endnotes

1 thomas baker et al., library linked data incubator group final report, sec. 2 (w3c incubator group, october 25, 2011), http://www.w3.org/2005/incubator/lld/xgr-lld-20111025/.
2 baker et al., library linked data.
3 dorothy anderson, universal bibliographic control. a long term policy - a plan for action (münchen: verlag dokumentation, 1974), 11.
4 anila angjeli, andrew mac ewan, and vincent boulet, “isni and viaf: transforming ways of trustfully consolidating identities,” in ifla wlic 2014 (ifla 2014 lyon, ifla, 2014), 2, http://library.ifla.org/985/1/086-angjeli-en.pdf.
5 rick bennett et al., “viaf (virtual international authority file): linking the deutsche nationalbibliothek and library of congress name authority files,” international cataloguing and bibliographic control 36, no. 1 (2007): 12–18; barbara b. tillett, the bibliographic universe and the new ifla cataloging principles: lectio magistralis in library science = l’universo bibliografico e i nuovi principi di catalogazione dell’ifla: lectio magistralis di biblioteconomia (fiesole (firenze): casalini libri, 2008), 14–15, http://digital.casalini.it/9788885297814; “viaf. connect authority data across cultures and languages to facilitate research,” oclc, 2020, https://www.oclc.org/en/viaf.html.
6 gildas illien and françoise bourdon, “a la recherche du temps perdu, retour vers le futur: cbu 2.0” (paper, ifla wlic 2014, lyon, france, 2014), 13–14, http://library.ifla.org/956/.
7 illien and bourdon, “a la recherche,” 15.
8 gordon dunsire and mirna willer, “the local in the global: universal bibliographic control from the bottom up” (paper, ifla wlic 2014, lyon, france, 2014), 11, http://library.ifla.org/817/.
9 luca martinelli, “wikidata: la soluzione wikimediana ai linked open data,” aib studi 56, no. 1 (march 2016): 75–85, https://doi.org/10.2426/aibstudi-11434; jesús tramullas, “objetos culturales y metadatos: hacia la liberación de datos en wikidata,” anuario thinkepi 11 (2017): 319–21, https://doi.org/10/ghbj63; xavier agenjo-bullón and francisca hernández-carrascal, “wikipedia, wikidata y mix’n’match,” anuario thinkepi 14 (2020), https://doi.org/10/ghbj6t; claudio forziati and valeria lo castro, “the connection between library data and community participation: the project share catalogue-wikidata,” jlis.it 9, no. 3 (2018): 109–20, https://doi.org/10/ggxj9n; adrian pohl, “was ist wikidata und wie kann es die bibliothekarische arbeit unterstützen?,” abi technik 38, no. 2 (2018): 208, https://doi.org/10/ghbj6w; arl white paper on wikidata: opportunities and recommendations (the association of research libraries, 2019), https://www.arl.org/wp-content/uploads/2019/04/2019.04.18-arl-white-paper-on-wikidata.pdf; regine heberlein, “on the flipside: wikidata for cultural heritage metadata through the example of numismatic description” (paper, ifla wlic 2019, libraries: dialogue for change, session 206: art libraries with subject analysis and access, athens, greece, august 28, 2019), http://library.ifla.org/2492/1/206-heberlein-en.pdf.
10 arl white paper on wikidata, 27–30; theo van veen, “wikidata: from ‘an’ identifier to ‘the’ identifier,” information technology and libraries 38, no. 2 (2019): 72–81, https://doi.org/10/ghbj62; hilary thorsen, “ld4p: linked data for production: wikidata as a hub for identifiers” (slideshow presentation, june 11, 2020), https://docs.google.com/presentation/d/1jwz3_ncf5rdd-7ejetglfv99uv2pnd1v/edit?usp=embed_facebook.
11 tillett, the bibliographic universe, 15.
12 open data commons attribution license (odc-by) v1.0 (as stated in http://viaf.org/viaf/data/).
13 “viaf admission criteria,” oclc, 2020, https://www.oclc.org/content/dam/oclc/viaf/viaf%20admission%20criteria.pdf.
14 the description of wikidata source in http://viaf.org/viaf/partnerpages/wkp.html seems to refer to wikipedia before the existence of wikidata. the same acronym wkp reflects this anachronism, whereas isni correctly uses wkd. anyway, this description, as well as many others, requires an update.
15 stacy allison-cassin and dan scott, “wikidata: a platform for your library’s linked open data,” code4lib journal 40 (may 4, 2018), https://journal.code4lib.org/articles/13424.
16 carlo bianchini and pasquale spinelli, “wikidata at fondazione levi (venice, italy): a case study for the publication of data about fondo gambara, a collection of 202 musicians’ portraits,” jlis.it 11, no. 3 (september 15, 2020): 24.
17 ifla working group on functional requirements and numbering of authority records (franar), functional requirements for authority data: a conceptual model (münchen: k. g. saur, 2009), 46, https://www.ifla.org/files/assets/cataloguing/frad/frad_2013.pdf. for qualifiers, see https://www.wikidata.org/wiki/help:qualifiers; for references see https://www.wikidata.org/wiki/help:sources.
18 partial lists are linked from https://wikibase-registry.wmflabs.org/wiki/main_page.
19 see https://www.transition-bibliographique.fr/fne/french-national-entities-file/; the proof of concept is available at https://github.com/abes-esr/poc-fne.
20 jean godby et al., creating library linked data with wikibase: lessons learned from project passage (dublin, oh: oclc research, 2019): 8, https://doi.org/10.25333/faq3-ax08.
21 ifla, “opportunities for academic and research libraries and wikipedia” (discussion paper, 2016), 10, https://www.ifla.org/files/assets/hq/topics/info-society/iflawikipediaopportunitiesforacademicandresearchlibraries.pdf.
22 john riemer, “the program for cooperative cataloging & a wikidata pilot” (slideshow presentation, june 16, 2020), slide 5, https://docs.google.com/presentation/d/1npkaqdggft1wi2vx0zgmtixwxwjpq96ntxx4mmyxffi/edit#slide=id.p.
23 godby et al., “creating library linked data,” 8.
24 maximilian klein and alex kyrios, “viafbot and the integration of library data on wikipedia,” code4lib journal 22 (october 14, 2013), https://journal.code4lib.org/articles/8964.
25 ifla cataloguing section and ifla meeting of experts on an international cataloguing code, statement of international cataloguing principles (icp) (den haag: ifla, 2016), para. 5.3.
26 https://www.wikidata.org/wiki/mediawiki:wikibase-sortedproperties#ids_with_datatype_%22external-id%22; isni (p213, https://www.wikidata.org/wiki/property:p213) is presently sorted after viaf instead of in the iso section because it is considered primarily as a viaf source.
27 epìdosis, viaf e wikidata.mpg, 2020, https://commons.wikimedia.org/wiki/file:viaf_e_wikidata.mpg; a list of gadgets is available at https://www.wikidata.org/wiki/wikidata:viaf/cluster#gadgets.
28 the main error-report page is https://www.wikidata.org/wiki/wikidata:viaf/cluster/conflating_entities; its subpage https://www.wikidata.org/wiki/wikidata:viaf/cluster/conflating_specific_entries is designed for collecting “easy” cases of conflation, when only a few members of a cluster should be moved elsewhere, while the cluster is substantially sane.
29 moreno hayley, email to author, march 23, 2020. asked whether data about abandoned clusters would be maintained, viaf answered, “we recognize that the data in the file was not usable. viaf is in a period of transition and it was decided that we could not at this time fix the file so it has been removed from the list of available downloads.”
30 the statement read: “the persist-rdf.xml file has been removed and will no longer be available,” accessed october 23, 2020.
31 angjeli, mac ewan, and boulet, “isni and viaf,” 3.
32 https://dumps.wikimedia.org/wikidatawiki/; instructions and a list of kinds of data dumps are available at https://www.wikidata.org/wiki/wikidata:database_download.
33 a general explanation of ranks is available at https://www.wikidata.org/wiki/help:ranking. here is a short summary: values of statements can be ranked in three ways, “preferred,” “normal” (default), and “deprecated”; the expression “values with non-deprecated rank” includes all values with preferred rank or normal rank; the expression “values with best rank” includes only values with preferred rank or normal rank, with this condition: if the same statement has two or more values and at least one of them has preferred rank, values with normal rank are not counted; if there are no values with preferred rank, all values with normal rank are counted.
34 viaf and wikidata dumps, together with the scripts, were published on zenodo at https://doi.org/10.5281/zenodo.4457114.
35 the queries can be performed using the following links: viaf members: https://w.wiki/i5j; authority controls related to libraries but not being viaf members: https://w.wiki/i5k; biographical dictionaries: https://w.wiki/i5n.
36 the query can be performed using the following link: https://w.wiki/i5p.
37 it could be because they are probably more difficult to cluster, but in some cases also because they represent infrequently described entities.
38 as suggested by the reviewer, more removals than additions may be a clue of a cleanup project.
39 pat riva, patrick le boeuf, and maja zumer, ifla library reference model, draft (den haag: ifla, 2017), https://www.ifla.org/files/assets/cataloguing/frbr-lrm/ifla_lrm_2017-03.pdf; nick crofts et al., “definition of the cidoc conceptual reference model,” version 5.0.4, icom/cidoc crm special interest group, 2011, http://www.cidoc-crm.org/html/5.0.4/cidoc-crm.html; chryssoula bekiari et al., eds., frbr object-oriented definition and mapping from frbrer, frad and frsad, version 2.0 (international working group on frbr and cidoc crm harmonisation, 2013), http://old.cidoccrm.org/docs/frbr_oo/frbr_docs/frbroo_v2.0_draft_2013may.pdf; lydia pintscher, lea lacroix, and mattia capozzi, “what’s new on the wikidata features this year,” youtube video, october 26, 2020, truocolo, https://www.youtube.com/watch?v=ebxdzk54gru. 40 denny vrandečić and markus krötzsch, “wikidata: a free collaborative knowledgebase,” communications of the acm 57, no. 10 (september 23, 2014): 80, https://doi.org/10/gftnsk. 41 for a general statistic see http://wikidata.wikiscan.org/users; for a statistic about the viaf property see https://bambots.brucemyers.com/navelgazer.php?property=p214; changing the id of the property at the end of the url allows exploring other property statistics. 42 shiyali ramamrita ranganathan, reference service, 2nd ed., ranganathan series in library science 8 (bombay: asia publishing house, 1961), 74. 43 ifla cataloguing section and ifla meeting of experts on an international cataloguing code, statement of international cataloguing principles (icp), 5, https://www.ifla.org/publications/node/11015. 44 wikidata does have a guideline for a preferred label, and its choice is based on users’ convenience (https://www.wikidata.org/wiki/help:label, par. 1.2) as required by international cataloguing principles (2016). 
as to the choice of the wikidata label in a specific language, viaf does not show any clear principle, while the authors believe that it would be preferable to use the english (“en”) label, whenever available. see ifla cataloguing section and ifla meeting of experts on an international cataloguing code, statement of international cataloguing principles (icp).
45 for example, in september it was done for nkc using openrefine (sample edit: https://www.wikidata.org/w/index.php?title=q520487&diff=1269046867&oldid=1266870464).
46 angjeli, mac ewan, and boulet, “isni and viaf,” 9.
47 simon cobb (https://www.wikidata.org/wiki/user:sic19) became wikidata visiting scholar in 2017 (https://en.wikipedia.org/wiki/user:jason.nlw/wikidata_visiting_scholar).
48 federico leva and marco chemello, “the effectiveness of a wikimedian in permanent residence: the beic case study,” jlis.it 9, no. 3 (september 2018): 141–47, https://doi.org/10.4403/jlis.it-12481.
49 angjeli, mac ewan, and boulet, “isni and viaf,” 11.
50 andrew mac ewan, “isni, viaf and naco and their relationship to orcid, discussion paper for pcc policy committee, 4 november,” 2013, 2, http://www.loc.gov/aba/pcc/documents/isni%20poco%20discussion%20paper%202013.docx.
51 tom adamich, “library cataloging workflows and library linked data: the paradigm shift,” technicalities 39, no. 3 (may/june 2019): 14.
52 oclc, viaf guidelines, rev. july 16, 2019, 2, https://www.oclc.org/content/dam/oclc/viaf/viaf%20guidelines.pdf.
53 oclc, viaf guidelines, 5. “when viaf is unable to algorithmically match some of the source authority records with each other, they can be manually pulled together into a single cluster using an internal table.”
54 angjeli, mac ewan, and boulet, “isni and viaf,” 16.
55 stefan heindorf et al., “vandalism detection in wikidata,” in proceedings of the 25th acm international conference on information and knowledge management, cikm ’16 (new york, ny: association for computing machinery, 2016), 327–36, https://doi.org/10/gg2nmm; amir sarabadani, aaron halfaker, and dario taraborelli, “building automated vandalism detection tools for wikidata,” in proceedings of the 26th international conference on world wide web companion, www ’17 companion (republic and canton of geneva, che: international world wide web conferences steering committee, 2017), 1647–54, https://doi.org/10/ghhtzf.
56 see table 1, col. 1 vs col. 9; it should be noted that col. 9 considers only non-viaf sources and biographical dictionaries, but wikidata also links to encyclopedias and other online databases.
57 for example, people not having viaf id but having iccu id (https://tinyurl.com/y6hbtjuo); instructions about the internal search are available at https://www.mediawiki.org/wiki/help:extension:wikibasecirrussearch.
58 https://www.wikidata.org/wiki/wikidata:database_reports/constraint_violations.
59 angjeli, mac ewan, and boulet, “isni and viaf,” 16.
60 https://www.mediawiki.org/wiki/wikibase/datamodel.
61 “the label is the most common name that the item would be known by” (https://www.wikidata.org/wiki/help:label). see also ifla cataloguing section and ifla meeting of experts on an international cataloguing code, statement of international cataloguing principles (icp), 5, https://www.ifla.org/publications/node/11015.
62 bots exist to create more and more variant forms based on matching properties, such as date of birth (p569) and date of death (p570), and to import variant forms of names from national authority files. see, for example, https://www.wikidata.org/w/index.php?title=q5669&diff=611600491&oldid=608231160.
63 https://www.wikidata.org/wiki/help:data_type.
64 https://www.wikidata.org/wiki/wikidata:property_proposal.
65 jenny a. toves and thomas b. hickey, “parsing and matching dates in viaf,” code4lib journal, 26 (october 21, 2014), https://journal.code4lib.org/articles/9607; stefano bargioni, “from authority enrichment to authoritybox: applying rda in a koha environment,” jlis.it 11, no. 1 (2020): 175–89, https://doi.org/10/gg66rq.
66 https://www.wikidata.org/wiki/help:dates.
67 see heindorf et al., “vandalism detection in wikidata.”
68 see mac ewan, “isni, viaf and naco.”
69 see https://www.wikidata.org/wiki/help:merge, https://www.wikidata.org/wiki/help:split_an_item, and https://www.wikidata.org/wiki/help:conflation_of_two_people.
70 complete list at https://www.wikidata.org/wiki/wikidata:database_reports/constraint_violations (e.g., https://www.wikidata.org/wiki/wikidata:database_reports/constraint_violations/p214).
71 https://admin.toolforge.org/; see also xavier agenjo-bullón and francisca hernández-carrascal, “registros de autoridades, enriquecimiento semántico y wikidata,” anuario thinkepi 12 (2018): 361–72, https://doi.org/10/ghbj6z.
72 https://www.wikidata.org/wiki/wikidata:property_proposal.
73 https://www.oclc.org/en/viaf.html.
74 https://www.wikidata.org/wiki/wikidata:introduction.
75 https://platform.worldcat.org/api-explorer/apis/viaf.
76 https://www.wikidata.org/wiki/special:entitydata; see also https://www.wikidata.org/wiki/wikidata:database_download.
77 https://www.wikidata.org/wiki/special:search.
78 https://www.wikidata.org/w/api.php.
79 https://query.wikidata.org/.
80 https://dumps.wikimedia.org/wikidatawiki/.
81 https://wdumps.toolforge.org/.
82 https://www.oclc.org/developer/develop/web-services/viaf/authority-source.en.html.
83 van veen, “wikidata.”
84 see “typical problems” in viaf guidelines: https://www.oclc.org/content/dam/oclc/viaf/viaf%20guidelines.pdf.
85 pintscher, lacroix, and capozzi, “what’s new.”

on-line acquisitions by lolita

frances g. spigai: former information analyst, oregon state university library; and thomas mahan: research associate, oregon state university computer center, corvallis, oregon.

the on-line acquisition program (lolita) in use at the oregon state university library is described in terms of development costs, equipment requirements, and overall design philosophy. in particular, the record format and content of records in the on-order file, and the on-line processing of these records (input, search, correction, output) using a cathode ray tube display terminal are detailed.

the oregon state university library collection has grown by 15,000-20,000 new titles per year (corresponding to 30,000-35,000 volumes per year) for the past three years to a total of approximately 275,000 titles (600,000 volumes); continuing serials account for a large percentage of annual "volume" growth. these figures would indicate an average input of 60-80 new titles per day. on an average, a corresponding number of records are removed each day upon completion of the processing cycle.
a like number of records are updated when books and invoices are received. in addition, approximately 200 searches per day are made to determine whether an item is being ordered or to determine the status of an order.

since the mid-1960's, and with the introduction of time-sharing, a handful of academic libraries (1, 2, 3) and several library networks (4, 5, 6) have introduced the advantages (7) of on-line computer systems to library routines. most of the on-line library systems use teletypewriter terminals. use of visual displays for library routines has been limited, although stanford anticipates using visual displays with ibm 2741 typewriter terminals in a read-only mode (1), and the library of the ibm advanced systems development division at los gatos, sharing an ibm 360/50, uses an ibm 2260 display for ordering and receiving (8). in addition, an institute of library research study, focusing on on-line maintenance and search of library catalog holdings records, has concluded that even with the limited number of characters available on all but the most expensive display terminals "... the high volume of data output associated with bibliographic search makes it desirable to incorporate crt's as soon as possible, in order to facilitate testing on a basis superior to that achievable with the mechanical devices" (9).

many academic libraries, during shelflist conversion or input of acquisition data, use a series of tags for bibliographic information. some of these tags are for in-house use, while others presumably are used to aid in the conversion of marc tape input to the library's own input format.

the number of full-time staff required to design and operate automated systems in individual academic libraries typically ranges from seven to fifteen. this doesn't seem to be an inordinate range, since most departments of a medium-large to large academic library require a similar size staff for operational purposes alone.
lolita (library on-line information and text access) is the automated acquisition system used by the oregon state university library. it operates in an on-line, time-shared, conversational mode, using a cathode ray tube (cdc-210) or a 35-ksr teletype as a terminal, depending upon the operation required. both types of equipment are in the acquisitions department of the library; each interacts with the university's main computer (cdc-3300, 91k core, 24-bit words), which, in turn, accesses the mass storage disk (cdc-814, capable of storing almost 300 million characters) through the use of lolita's programs in conjunction with the executive program, os-3 (10). under the os-3 time-sharing system, lolita shares the use of the central computer memory and processor with up to 59 other concurrent users; the use of the mass storage disk is also shared with other users of the university's computer center. (lolita will require approximately 11 million characters of disk storage.) lolita's programs are written in fortran and in the assembly language, compass, and are composed of two sets: those which maintain the outstanding order file, and those which produce printed products and maintain the accounting and vendor files.

several key factors have shaped the design of lolita. an on-line, time-sharing system has been operating at osu since july 1968, and on-line capabilities have been available for test purposes since the summer of 1967. programming efforts could be concentrated exclusively on the design of lolita and an earlier pilot project (11), for no time was needed to design, debug or redesign the operating system software, as was necessary at washington state u. and the u. of chicago (2, 12). heavy reliance was put on assembly language coding for the usual reasons, plus the knowledge that the computer center's next computer is to be a cdc-3500, with an instruction set identical to that which the library now uses. in short, neither the os-3 operating system nor the assembly language will change for the next few years. an added motivation influencing program design was the desire to minimize response time for the user.

journal of library automation vol. 3/4 december, 1970

in view of the transient nature of a university library's student and civil service staff, the need for an easily-learned and maintained system is paramount. the flexible display format of the crt allows a machine readable worksheet, with a built-in, automatic, tagging scheme; it obviates the need for a paper worksheet, and thus eliminates a time-consuming, tedious, and error-prone conversion process. the book request slip contains the source information for input. proofreading and correction are done on-line at time of input. alterations can be made at any later time as well.

lolita has used from 1.5 to 3.0 fte through the period of design to operation. after an initial testing and data base buildup period, anticipated to last about six months, and during which lolita will be run in parallel with the manual system, it is expected that the on-order/in-process, vendor, and accounting files will be maintained automatically and that reports and forms currently output by the acquisitions department staff will be generated automatically. specifically, records comprising three files will be kept on-line: 1) the outstanding order file (a slight misnomer since it includes and will include three types of book request data: outstanding orders, desiderata of high priority, and in-process material), 2) name and address for those vendors of high use (approximately 200 of 2500, or about 8%), and codes and use-frequency counts for all vendors, and 3) accounting data for all educational resource materials purchased by the oregon state university library.
it should be kept in mind that, although lolita is designed for book order functions, the final edited record, after the item has been cataloged, will be captured on magnetic tape as a complete catalog record. thus, all statistics and information, except circulation data, will be available for future book acquisitions.

this project is being undertaken for two reasons: 1) the oregon state university library is concerned that librarians achieve their potential as productive professionals through the use of data processing equipment for routine procedures, and that cost savings may be realized as the library approaches a total system encompassing all of the technical services routines, and 2) a uniquely receptive computer center and a successful on-line time-sharing facility are available.

record format and content

each book request is described by 27 data elements which are grouped into three logical categories and are displayed in three logical "pages" of a crt screen. the categories are: 1) bibliographic information, 2) accounting information, and 3) inventory information; figures 1, 2, and 3 list the data elements in the same sequence as they appear on the crt screen. though most data elements listed are self-explanatory, eight require some description.

order number
flag word
author
title
edition
id number
publisher
year published
notes
fig. 1. bibliographic information.

order number
date requested
date ordered
estimated price
number of copies
account number
vendor code
vendor invoice number
invoice date
actual price
date received
date 1st claim sent
date 2nd claim sent
fig. 2. accounting information.

order number
bib cit
date cataloged
volume
issue
location code
lc class number
fig. 3. inventory information.

flag word
this data element indicates the status of a request. the normal order procedure needs no flag word.
exceptions are dealt with automatically by entering an appropriate flag word. as more requests are added to the system, and as more exceptional instances are uncovered, more flag words will undoubtedly be added. to date there are twelve flag words, plus one data element which serves both as a data element and as a status signal. flag words and procedures activated are described below.

conf: confirming orders for materials ordered by phone or letter, and for unsolicited items which are to be added to the collection. the order form is not mailed, but used for processing internal to the library only. accounting routines are activated.
gift: for gift or exchange items, a special series number prefixed by a "g" is assigned and the printed purchase order is used internally only. this flag word also acts as a signal so that accounting routines will not encumber any money. the primary reason for assigning a purchase order number is to provide a record indexing mechanism (this is also true for held orders).
held: selected second-priority orders being held up for additional book budget funds. these order records are kept on line, and are assigned a special series of purchase order numbers, prefixed by an "h." no accounting procedures accompany these orders, although a purchase order is generated and manually filed by purchase order number.
live: held orders which have been activated. this word causes a reassignment of purchase order numbers to the next number in the main sequence (instead of an "h"-prefixed number) and sets up the natural chain of accounting events. the new purchase order number is then written or typed on the order form, the order date added, and the order mailed.
cash: orders for books from vendors who require advance payment. an expenditure, instead of an encumbrance, is recorded.
rush: used for books which are to be rush ordered and/or rush cataloged. rush will also be rubber-stamped on the purchase order for emphasis.
no special procedures are activated within the computer programs; rush is an instruction for people.
docs: used when ordering items from vendors with whom the osu library maintains deposit accounts (e.g. government printing office). this causes a zero encumbrance in the accounting scheme; cash is used to put additional money into deposit accounts.
canc: cancelled orders. unencumbers monies and credits accounts for cash orders.
reis: used to reissue an order for an item which has been cancelled. a new purchase order containing a new order number, vendor, etc. will automatically be issued. re-input is not necessary; however, changes in vendor no., etc., can be made.
part: denotes a partial shipment for one purchase order. no catalog date can be entered while part appears as the flag word. invo will replace part when the final shipment has been received; canc will replace part if the final shipment is not received, and the order is reissued for the portion received.
invo: when invoice information is entered into the file, invo is typed in as the flag word. this causes accounting information (purchase order number, vendor code, invoice number, actual price, invoice date, account number) to be duplicated in the accounting file.
kill: used to remove an inactive record from the file (cf. date cataloged).
date cataloged: a value entered for this data element signals the end of processing. the record is removed from the main file and transferred to magnetic tape. changes and additions to inventory and bibliographic data elements are anticipated at this final point, to bring the record into line with those of the catalog dept.

author(s)
all authors are to be included in this data element, corporate authors, joint authors, etc. the entry form is last name first (e.g. smith, john a.). for compound authors, a slash is used as the delimiter separating names (e.g. smith, john a./jones, john paul).
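the entry form for the author(s) element can be illustrated with a short parsing sketch. this is python written for illustration only, not lolita's own code (which the article says is in fortran and compass). the names parse_authors and Author are hypothetical, and treating an entry without a comma as a single corporate name is an assumption drawn from the examples given.

```python
from typing import List, NamedTuple


class Author(NamedTuple):
    last: str   # surname, or the whole name when no comma is present
    rest: str   # forenames or initials; empty for a corporate author


def parse_authors(field: str) -> List[Author]:
    """Split a slash-delimited author field into last-name-first entries."""
    authors = []
    for raw in field.split("/"):
        name = raw.strip()
        if not name:
            continue  # ignore empty segments such as a trailing slash
        last, _, rest = name.partition(",")
        authors.append(Author(last.strip(), rest.strip()))
    return authors


if __name__ == "__main__":
    for author in parse_authors("smith, john a./jones, john paul"):
        print(author)
```

a corporate author with a slash inside its name would be split incorrectly by this sketch; the article does not say how such a case is entered.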
id number
standard book number, vendor catalog number, etc.

order number
the order number is automatically assigned to one of three series depending on the flag word: the main number series with the fiscal year as prefix; held order series with an "h"-prefix (stored in the order number index as 101; the "h" is what is printed on the order forms); and gift series with a "g"-prefix (likewise stored in the order number index as 102).

vendor code
a sample of 18 months of invoice data (obtained from the comptroller's office) for the library resource account number indicates the use of 2200 vendors during that period of time. by sorting by invoice frequency and dollar amount, about 200 vendors were identified who either invoiced the library more than 12 times during this time period (since the invoices tended to contain more than one item for frequently used vendors, the number of purchase orders issued could easily be several times this amount), or whose invoices totalled over $110.00. of these, 171 have been selected for on-line storage. they will be assigned code numbers 1 to 171, and names and addresses of these vendors will be included on the computer generated purchase orders. authority files for all vendors are kept on rolodex units; one set is arranged alphabetically by vendor name, the other by vendor code.

account number
the library account to which the book is charged. the number is divided into four sections: 1) a two-digit prefix identification for osu, 2) a four-digit identification for osu library resource expenditures, 3) a one- or two-digit identification of the particular library resource fund account to be charged (e.g. science, humanities, serials, binding, etc.), and 4) a one- or two-digit code identifying the subject which most closely describes the request. from this data, statistics will be derived which describe expenditures by subject as well as by fund allocation.
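the four-section account number can be sketched as a small parser. the following python is an illustration under stated assumptions, not the article's implementation: the hyphen separator is inferred from the sample value 30-1061-6-20 that appears in figure 7, and the names AccountNumber and parse_account_number are hypothetical.

```python
import re
from typing import NamedTuple


class AccountNumber(NamedTuple):
    institution: str  # two-digit prefix identifying osu
    resource: str     # four-digit library resource expenditure id
    fund: str         # one- or two-digit fund account (science, serials, ...)
    subject: str      # one- or two-digit subject code


# assumption: the four sections are written with hyphens between them
_PATTERN = re.compile(r"(\d{2})-(\d{4})-(\d{1,2})-(\d{1,2})")


def parse_account_number(text: str) -> AccountNumber:
    """Split an account number into its four sections, or raise ValueError."""
    match = _PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a four-section account number: {text!r}")
    return AccountNumber(*match.groups())


if __name__ == "__main__":
    account = parse_account_number("30-1061-6-20")
    print(account.fund, account.subject)
```

keeping fund and subject as separate fields is what would allow expenditures to be totalled either by fund allocation or by subject, as the text anticipates.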
this will provide a powerful tool for collection building and may also be a political aid in governing departmental participation in book selection.

bibcit
bibliographic citation code which cites the location by acquisitions dept. personnel of bibliographic data (l.c. copy, etc.). this information is included on the catalog work slip (4th copy of the purchase order) so that duplicate searching by the catalog dept. can be avoided.

lc classification number
refers to the call number as it is assigned by the osu catalog dept.

file organization

on-order record
the operating system for oregon state university's on-line, time-sharing system reads into memory a quarter page (or file block) of 510 computer words at a time. each on-order (outstanding order) record is composed of a block of 51 computer words (204 6-bit characters), or linked lists of blocks, in order to best use this system. thus, each quarter page is divided into ten physical records of 51 computer words apiece. for records requiring more than one block, the nearest available block of 51 words within the same 510-word file block is used; but if none is vacant within the same file block, the first available 51-word block in the file is used. if none is free, the file is lengthened to provide more blocks.

a bit array is used to keep track of the status (in use, vacant) of records in the main file. in the bit array, each of 20 bits of each 24-bit computer word corresponds to a 51-word block in the main file. as in figure 4, the 13th bit has a zero value, indicating a vacancy in the 13th 51-word block of the main file; the 14th bit has a value of 1, indicating the 14th 51-word block in the on-order file is in use. a total of 10,120 block locations can be monitored by each file block of the bit array.

records in this file are logically ordered by purchase order number, the arrangement effected by pointers which string the blocks together.
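the bit-array bookkeeping and the "same file block first" allocation rule can be sketched as follows. this is a minimal python illustration, not the compass or fortran code the article describes. the class name BlockMap, the choice of the least significant bit for the first block of each word, and the decision to return None where the real system would lengthen the file are all assumptions.

```python
from typing import List, Optional, Tuple

BITS_USED_PER_WORD = 20   # per the text, 20 bits of each 24-bit word are used
BLOCKS_PER_FILE_BLOCK = 10


class BlockMap:
    """Tracks which 51-word record blocks are in use, one bit per block."""

    def __init__(self, n_words: int) -> None:
        # each int stands in for one word of the bit array
        self.words: List[int] = [0] * n_words

    def _locate(self, block: int) -> Tuple[int, int]:
        # blocks are numbered from 1, matching the "13th" and "14th" in the text
        return divmod(block - 1, BITS_USED_PER_WORD)

    def in_use(self, block: int) -> bool:
        word, bit = self._locate(block)
        return bool((self.words[word] >> bit) & 1)

    def mark(self, block: int, used: bool) -> None:
        word, bit = self._locate(block)
        if used:
            self.words[word] |= 1 << bit
        else:
            self.words[word] &= ~(1 << bit)

    def first_vacant(self, start: int, stop: int) -> Optional[int]:
        """Return the first vacant block in start..stop inclusive, or None."""
        for block in range(start, stop + 1):
            if not self.in_use(block):
                return block
        return None

    def allocate_near(self, block: int) -> Optional[int]:
        """Prefer a vacancy in the same ten-block file block, else the first anywhere."""
        group_start = (block - 1) // BLOCKS_PER_FILE_BLOCK * BLOCKS_PER_FILE_BLOCK + 1
        found = self.first_vacant(group_start, group_start + BLOCKS_PER_FILE_BLOCK - 1)
        if found is None:
            found = self.first_vacant(1, len(self.words) * BITS_USED_PER_WORD)
        if found is not None:
            self.mark(found, True)
        return found  # None here is where the real file would be lengthened
```

the article's figure of 10,120 monitored blocks equals 506 × 20; the sketch does not model how the remaining words of a 510-word bit-array block are used, since the text does not say.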
[figure 4 labels: 510-word file block; one word of bit array; unused 4 bits]
fig. 4. bit array monitor of record block use in the on order file.

access points

order number
the order number index is arranged by the main portion of the order number, and within that, it is in prefix number sequence. the sequence in figure 5 illustrates order number index arrangement (as well as the logical arrangement of the on-order file). the order number index allows quick access to selected points within the main file. conceptually, the ordered main file is segmented into strings of records whose order numbers fall into certain ranges. more specifically, items whose sequence numbers range from 0 to 4 (ignoring the prefix of the order number) comprise the first segment, 5 to 9 the second, etc. the index itself merely contains pointers to the leading record in each (conceptual) segment. thus, in the records whose purchase order numbers are shown in figure 5, there would be pointers to the second (69-124) and sixth (70-125), but not to the others. to reach the fourth (101-124) one follows the index to the second, and then follows the block pointers through the third to the fourth.

...
102-118
69-124    fiscal year 1969, order number 124
70-124    fiscal year 1970, order number 124
101-124   held order number 124 for the current year
102-124   gift order number 124 for the current year
70-125
102-125
70-126
...
(note: the prefix 'h,' which is printed on the purchase orders, is represented as the number 101 for internal computer processing; likewise 102 represents the prefix 'g')
fig. 5. order number index sequence.

[figure 6 fields, in order: p.o. number forward pointer; p.o. number backward pointer; time of last update; p.o. number; title forward pointer; title backward pointer; pointers to author(s); title; date of request; date ordered; encumbered price; number of copies; account number (2 words); vendor number; flag word; publisher; date of publication; notes; edition; id number; bibcit; lc classification number; volume number; issue; location code; vendor's invoice number; invoice date; actual price; date received; date first claim sent; date second claim sent]
fig. 6. "on order" record organization.

author(s)
the author index is in the form of a multi-tiered inverted tree. the lowest tier is an inverted index containing the only representation of the author's names (it is not stored in the on-order record, figure 6), and, for each author, pointers to the records of each of his books (figure 7). the entries for several authors may be packed into a single 51-word block, if space permits. each higher tier serves to direct the indexing mechanism to the proper block in the next tier below, and to this end as much as needed of an author's name is filed upwards into higher tiers; this method is described in more detail by lefkovitz (13) as "the unique truncation variable length key-word key."

[figure 7 panels: author index directory (level 0 + 1); inverted author index (level 0), whose control word holds # chars. in record, # chars. in full name of author, and # of titles in on order file; on order file. sample entries shown include jones, john paul and on order record values 10/20/69, 10/29/69, $4.95, and 30-1061-6-20]
fig. 7. author index organization and access to on order file.

title
not yet programmed.
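the order number index described under access points can be sketched in a few lines. the python below is an illustration only: a sorted list stands in for the pointer-linked on-order file, list positions stand in for block pointers, and the names sort_key, build_index, and find are hypothetical. order numbers are written as (prefix, sequence) pairs, with 101 and 102 standing for the 'h' and 'g' prefixes as in figure 5.

```python
SEGMENT_SPAN = 5  # sequence numbers 0-4 form the first segment, 5-9 the second, ...


def sort_key(order):
    """Order by the main (sequence) portion first, then by prefix number."""
    prefix, sequence = order
    return (sequence, prefix)


def build_index(orders):
    """Return the ordered file and, per segment, the position of its leading record."""
    ordered = sorted(orders, key=sort_key)
    index = {}
    for position, (_, sequence) in enumerate(ordered):
        index.setdefault(sequence // SEGMENT_SPAN, position)
    return ordered, index


def find(ordered, index, order):
    """Jump to the segment leader, then walk forward as if following block pointers."""
    position = index.get(order[1] // SEGMENT_SPAN)
    hops = 0
    while position is not None and position < len(ordered):
        if ordered[position] == order:
            return position, hops
        if sort_key(ordered[position]) > sort_key(order):
            break  # passed the place where the record would be
        position += 1
        hops += 1
    return None, hops


if __name__ == "__main__":
    figure_5 = [(70, 126), (102, 118), (101, 124), (69, 124),
                (102, 125), (70, 124), (70, 125), (102, 124)]
    ordered, index = build_index(figure_5)
    print(ordered)
    print(find(ordered, index, (101, 124)))
```

with only the eight records of figure 5 loaded, 102-118 also becomes a segment leader, because the earlier records that the figure elides are absent from this sample.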
on-line record processing

record creation
after a number of new book requests have been searched to determine their absence from osu's collection and after they have been bibliographically identified, they are batched for vendor assignment and readied for entry into the on-line file of book requests via the crt (figure 8).

fig. 8. book request processing.

lolita's starting page is obtained by typing in the word lolita on the crt screen. the text illustrated in figure 9 is then displayed on the screen of the crt. when "1" is typed in, indicating a wish to create a record, the first data element of the first page of input appears (figure 10). (since the majority of records do not need a flag word upon input, the flag word fill-in line appears only on a redisplay of this page, and the flag word may be inserted at that time.)

main file
please indicate a choice
1. create a new entry
2. locate an existing entry
9. terminate all processing
fig. 9. "starting" page of function choices.

author(s):
examples: jones
dequincey, thomas
washington, booker t.
adams, john quincy/ doe, john
american medical association
fig. 10. first data element displayed in new record creation process.

at this point the user can go in one of two directions. the first page of input information may be entered one data element at a time, each element being requested in a tutorial fashion by lolita. alternately, all of the first page data may be input at once, with data elements separated by delimiters. the user can switch from one method to the other at any point.
a control key (return) is the delimiter used to signal the end of each data element, and, at the same time, return repositions the cursor (which indicates the position of the next character to be typed on the crt screen) to the location of the next data element to be filled in. another control key (send): 1) serves as a terminal delimiter, and 2) transmits data on the screen to the computer, thereby 3) triggering the continuation of processing until the next screen display is generated. thus, with page one, data elements are displayed, filled in and sent one at a time in the tutorial approach, or all seven data elements are typed in at once, a return mark following items 1-6, then sent after the last data element. return or send must be used with each data element, even with those for which there is no information. this secures the sequence of element input, thus providing an easy (for the user) and automatic way of tagging elements for any future tape searches to provide statistics or analytical reports. in particular, this process obviates all content restrictions on variable (i.e., free-form) items. each of the pages is redisplayed after input, and corrections can be made at this time. the crt is used for all input and its write-over capabilities are utilized for corrections, as compared to the "read-only" use planned for crt displays used for stanford's ballots (1).

except for the flag word, all the data elements on the first page are variable in length and unrestricted as to content. data elements on pages 2 and 3 (figures 2 and 3) are more of a fixed length in nature; thus with these pages, a whole page at a time is always filled in and sent: the tutorial function is inherent in the display. the concluding display is shown in figure 11.

send if all done, type 1-3 to review pages.
fig. 11. review option.
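the positional tagging that return and send make possible can be sketched as follows. this python is an illustration, not lolita's code. the carriage-return character standing in for the return mark and the name tag_page_one are assumptions, as is the list of seven page-one elements, which is inferred from figure 1 by leaving out the order number (assigned automatically) and the flag word (shown only on redisplay). the sample title is made up.

```python
# assumption: figure 1's elements minus order number and flag word
PAGE_ONE_ELEMENTS = ("author", "title", "edition", "id number",
                     "publisher", "year published", "notes")

RETURN = "\r"  # assumption: stands in for the return key's delimiter mark


def tag_page_one(screen_text: str) -> dict:
    """Assign each return-delimited value to a data element by position alone."""
    values = screen_text.split(RETURN)
    if len(values) != len(PAGE_ONE_ELEMENTS):
        raise ValueError("every element needs its delimiter, even when empty")
    return dict(zip(PAGE_ONE_ELEMENTS, (value.strip() for value in values)))


if __name__ == "__main__":
    # six return marks follow items 1-6; the end of the string plays the part of send
    sent = "adams, john quincy/ doe, john\ran example title\r\r\r\r\r"
    record = tag_page_one(sent)
    print(record["author"])
    print(repr(record["edition"]))
```

because the tag comes from position rather than content, a value may contain any characters other than the delimiter itself, which is the point the text makes about free-form items.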
because batched searching and input are assumed, when one search or input is finished, the program recycles to continue searching or inputting without going back to the starting page (figure 9) each time.

record search

searching programs have been completed which will search by order number and by author. title searching will be implemented within the next few months, although a satisfactory scheme for title searching (improving on manual methods, yet economical) has not been uncovered. methods suggested or used by ames, kilgour, ruecking, and spires have been noted (14, 15, 16, 17).

the procedure for searching within the outstanding order file begins with the display of choices shown in figure 9. one types a "2," indicating a desire to locate an existing entry, and the text shown in figure 12 is displayed on the crt screen. at this point one chooses to search either by order number or by author. if one selects a valid order number representing a request record, the first page of that record, containing bibliographic information, is displayed. this is followed by the display shown in figure 11, so that accounting and inventory information may also be reviewed. for the user's convenience the order number is displayed in the upper right-hand corner of each of the three pages, both upon record input and search redisplay.

to search by author, one types the author's name on the second line of figure 12, using the same format as that used in record creation.

------------------------: order number
------------------------------: author
supply one of the above (start on the appropriate line)
fig. 12. display of search options.

if the author has only one entry in the outstanding order file, the first page of the entry will appear, etc. (as in the order number search above). if the author entered has more than one entry in the on-line file, information depicted in figure 13 will be displayed on the screen of the crt.
_______________: enter number or 'nf' (not found)
1. night of the iguana
2. the milk-train doesn't stop here anymore
3. cat on a hot tin roof
n. the glass menagerie
fig. 13. display of multiple titles on file for one author.

if the requested title is one of the titles displayed, one types its number and the record for that title will be displayed. if the title isn't among those displayed, typing nf results in a redisplay of the text in figure 12 so that searching can continue.

for personal authors, variant forms of the name may be located using the following procedure. the word others is entered at the top of the screen, after an unsuccessful author search, so that a search for author j. p. jones would find all documents by john paul jones, joseph p. jones, j. peter jones, etc., as well as j. p. jones. a search for john p. jones would find all documents by j. p. jones, john jones and j. peter jones as well as john p. jones.

record changes

additions and corrections to the original record are made by first locating the record (by order number, author, or eventually, title), adding to the data elements, or writing over them (for corrections), and transmitting the information. examples of this procedure include: 1) entering the date received, 2) recording the vendor invoice number, invoice date, and actual price and 3) inserting or changing a flag word. in addition, after an item has been cataloged, the record is revised to include catalog data, as well as to exclude extraneous order notes.

output

aside from the crt displays, output is in three forms: off-line tape, printed forms and on-line files (figure 14). examples of output are library purchase orders, accounting reports, vendor data, and records of cataloged items. the number of potential reporting uses is limited only by money and imagination.

[figure 14: diagram not reproduced]
fig. 14. output from on-line on order file input.
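the "others" search for variant forms of a personal name, described under record search, can be illustrated with a short sketch. the article does not give lolita's matching rule, so the rule below is an assumption inferred from the two examples in the text: surnames must agree, an initial matches any forename beginning with that letter, and a forename missing from one form matches anything. names are assumed to be in the "surname, forenames" format of figure 10.

```python
def _parts(name: str) -> tuple[str, list[str]]:
    """Split 'surname, forenames' into a surname and a list of forename tokens."""
    surname, _, given = name.partition(",")
    tokens = given.replace(".", " ").lower().split()
    return surname.strip().lower(), tokens


def _compatible(a: str, b: str) -> bool:
    """An initial matches any forename with the same first letter."""
    if len(a) == 1 or len(b) == 1:
        return a[0] == b[0]
    return a == b


def is_variant(query: str, candidate: str) -> bool:
    """True if candidate could be another form of the queried personal name."""
    q_surname, q_given = _parts(query)
    c_surname, c_given = _parts(candidate)
    if q_surname != c_surname:
        return False
    # zip stops at the shorter list, so a missing forename matches anything.
    return all(_compatible(a, b) for a, b in zip(q_given, c_given))
```

under this rule a query for "jones, j. p." accepts "jones, john paul" and "jones, joseph p.", while "jones, john p." accepts "jones, john" but rejects "jones, joseph p.", which agrees with the examples in the text.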
[figure 15: library purchase order form, not reproduced. fields shown: order number, date, id number, author, title, publisher, vendor name, vendor address, volumes, edition, estimated price, no. of copies, vendor code, account, date of pub., flag, gift or held order no., bibcit]
fig. 15. purchase order.

the purchase order, shown in figure 15, is composed of four copies: 1) the vendor's copy to be retained by him, 2) a vendor "report" copy, 3) the copy which is kept as a record in the osu library, and 4) a catalog work slip to be forwarded to the catalog department with the book. purchase orders are printed on the library's teletype, which is equipped with a sprocket-feed. orders can also be printed on the line printer in the computer center. while this is a slightly cheaper data processing procedure, since no terminal costs are incurred, convenience and security have produced a victory in "economics over economies" (18), and the librarian's time has been considered in the total scheme.

for gift items, purchase orders are produced as the cheapest means of preparing a catalog work slip. held purchase orders are produced and manually filed in purchase order number sequence, but when their status is changed to live, the old numbers are automatically replaced by a purchase order number in the main series. these new numbers are written onto the purchase orders, along with any other changes, and the orders are mailed. the flag word live also activates accounting procedures.

there are two sets of accounting reports. the first is generated when the purchase orders are issued and contains tabulated information for the library's bookkeeper, the head of business records in the acquisitions dept., and the comptroller of the oregon state system of higher education.
the second summary report is issued after the book and invoice have been received and will contain additional information pertinent to the invoicing procedure; this report has the same distribution as the first. periodic reports are planned for the library's subject divisions summarizing expenditures by account number, reference area, and subject. programming for this has not yet been done. a frequency count will be stored with each vendor code and periodic listings will be printed for use in retaining vendors.

after an item has been cataloged, the catalog work slip and a slip equivalent to a main-entry catalog card are sent to acquisitions, and all remaining information and changes are recorded in the on-line record. this record is then transferred to a file from which it is dumped onto a magnetic tape. this off-line file will be used for statistical analyses and will be the start of a machine readable data base.

future plans will, of course, depend on funding; however, two logical steps which could follow immediately and require no additional conversion are: 1) additional computer generated paper products (charge cards, catalog cards, book spine labels, new book lists, etc.), and 2) a management information system using acquisition and cataloging data. the construction of a central serial record in machine readable form would produce many valuable by-products. a program for the translation of the marc ii test tape has been written which causes these records to be printed out on the computer center's line printer; and since a subscription to the marc tapes is now available to osu for test purposes, its advantages and compatibility with lolita will be investigated as time permits.

unsolved problems, aside from those which everyone working in a data processing environment faces (e.g.
system and hardware breakdown, continued project funding, and lengthy delivery times for hardware), include: 1) the widely varying system response times (commonly from a fraction of a second up to 60 seconds; usually 2-15 seconds); 2) the lack of personnel skilled in both data processing and library techniques; 3) the limited print train currently available on the line printer (62 character set); and 4) bureaucratic policy, which can render the most sophisticated plans for automation unfeasible if properly applied. it is recognized that all these problems can be solved by money, time, and priorities. meanwhile, the period of in-parallel operation will be valued as a time to educate, to test, to gather statistics, and to further refine the programs and procedures which comprise lolita.

evaluation

preliminary input samples indicate that a daily average of from 8 hours, 20 minutes, to 10 hours and 45 minutes will be necessary for input, searches, editing and corrections using the crt. an additional 3 hours per day of terminal time using the teletype will be required to produce the purchase orders, answer rush search questions if the crt is busy, and activate the daily batch programs (accounting reports, etc.).

the sad economic plight of most libraries causes librarians to cast an especially suspicious eye on the costs of automation; a few words on osu's data processing costs may be of interest. the cost of total development efforts to produce lolita is under $90,000 (though considerably less was actually expended), or an average annual cost of $30,000 over a three-year period. this compares favorably with average annual incomes of from $50,000 to over $300,000 in federal funds alone for other on-line library acquisition projects in universities (19, 20, 21, 22). a total of 6.75 man-years was required to design lolita.
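the arithmetic behind the figures just quoted can be checked with a short worked example. all input values are taken from the text; adding the crt and teletype estimates into one combined total of terminal-hours per day is an illustrative choice here, since the two devices run separately.

```python
from datetime import timedelta

# Development cost: the stated ceiling, spread over the stated three years.
total_development_cost = 90_000    # dollars ("under $90,000")
project_years = 3
average_annual_cost = total_development_cost / project_years

# Estimated daily terminal use, from the preliminary input samples.
crt_low = timedelta(hours=8, minutes=20)
crt_high = timedelta(hours=10, minutes=45)
teletype = timedelta(hours=3)


def hours(duration: timedelta) -> float:
    """Express a duration as decimal hours."""
    return duration.total_seconds() / 3600


# Combined terminal-hours per day across both devices (illustrative total).
daily_low = hours(crt_low + teletype)
daily_high = hours(crt_high + teletype)
```

on these figures the average annual cost is $30,000, and combined daily terminal use falls between 11 hours 20 minutes and 13 hours 45 minutes.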
the 6.75 man-years comprises 2.5 years of programming, 3.25 years of systems analysis, coordination and documentation, and 1.0 year of clerical work, and represents the efforts of four students and six professional workers. this total does not include the time spent by acquisitions department personnel in reviewing lolita's abilities or in learning to use the terminals.

current data processing rates charged by the computer center include the following: crt rental, $100/mo.; cpu time, $300/hr.; terminal time, $2.00/hr.; on-line storage, 15¢ per 2040 characters per month. the teletype has been purchased, thus only local phone line charges are incurred. the on-line system is available for use from 7:30 a.m. to 11:00 p.m. each week-day, and from 7:30 a.m. to 5:00 p.m. on saturday, which more than covers the 8-5 schedule of the acquisitions department.

acknowledgments

the work on which this paper is based was supported by the administration, the computer center and the library of oregon state university. special mention is due robert s. baker, systems analyst, osu library, and lawrence w. s. auld, head, technical services, osu library, for their extensive participation in the lolita project and for their many suggestions which benefitted the final version of this paper. hans weber, head, business records, osu library, also contributed much to lolita's design.

references

1. veaner, allen b.: project ballots: bibliographic automation of large library operations using a time-sharing system. progress report, march 27, 1969-june 26, 1969 (stanford, california: stanford university libraries, 29 july 1969), ed-030 777.
2. burgess, thomas k.; ames, l.: lola: library on-line acquisition sub-system (pullman, washington: washington state university, systems office, july 1968), pb-179 892.
3. payne, charles: "the university of chicago's book processing system."
in stanford conference on collaborative library systems development: proceedings, stanford, california, october 4-5, 1968 (stanford, california: stanford university libraries, 1969), ed-031 281, 119-139.
4. pearson, karl m.: marc and the library service center: automation at bargain rates (santa monica, california: system development corporation, 12 september 1969), sp-3410.
5. nugent, william r.: "nelinet - the new england library information network." in congress of the international federation for information processing (ifip), 4th: proceedings, edinburgh, august 5-10, 1968 (amsterdam: north holland publishing co., 1968), g28-g32.
6. blair, john r.; snyder, ruby: "an automated library system: project leeds," american libraries, 1 (february 1970), 172-173.
7. warheit, i. a.: "design of library systems for implementation with interactive computers," journal of library automation, 3 (march 1970), 68-72.
8. overmyer, lavahn: library automation: a critical review (cleveland, ohio: case western reserve university, school of library science, december 1969), ed-034 107.
9. cunningham, jay l.; schieber, william d.; shoffner, ralph m.: a study of the organization and search of bibliographic holdings records in on-line computer systems: phase i (berkeley, california university: institute of library research, march 1969), ed-029 679, pp. 13-14.
10. meeker, james w.; crandall, n. ronald; dayton, fred a.; rose, g.: "os-3: the oregon state open shop operating system." in american federation for information processing societies: proceedings of the 1969 spring joint computer conference, boston, mass., may 14-16, 1969 (montvale, new jersey: afips press, 1969), 241-248.
11. spigai, frances; taylor, mary: a pilot-an on-line library acquisition system (corvallis, oregon: oregon state university, computer center, january 1968), cc-68-40, ed-024 410.
12. university of chicago.
library: development of an integrated, computer-based, bibliographical data system for a large university library (chicago, illinois: university of chicago, library, 1968), pb-179 426.
13. lefkovitz, david: file structures for on-line systems (new york: spartan books, 1969), pp. 98-104.
14. ames, james lawrence: an algorithm for title searching in a computer based file (pullman, washington: washington state university library, systems division, 1968).
15. kilgour, frederick g.: "retrieval of single entries from a computerized library catalog file," proceedings of the american society for information science, 5 (new york: greenwood publishing corp., 1968), 133-136.
16. ruecking, frederick h., jr.: "bibliographic retrieval from bibliographic input; the hypothesis and construction of a test," journal of library automation, 1 (december 1968), 227-238.
17. parker, edwin b.: spires (stanford physical information retrieval system). 1967 annual report (stanford, california: stanford university, institute for communication research, december 1967), 33-39.
18. kilgour, frederick g.: "effect of computerization on acquisitions," program, 3 (november 1969), 100-101.
19. "university library systems development projects undertaken at columbia, chicago and stanford with funds from national science foundation and office of education," scientific information notes, 10 (april-may 1968), 1-2.
20. "grants and contracts," scientific information notes, 10 (october-december 1968), 14.
21. "university of chicago to set up total integrated library system utilizing computer-based data-handling processes," scientific information notes, 9 (june-july 1967), 1.
22. "washington state university to make preliminary library systems study," scientific information notes, 9 (april-may 1967), 6.
lita president's message
facing what's next, together
emily morton-owens
information technology and libraries | june 2020
https://doi.org/10.6017/ital.v39i2.12383

emily morton-owens (egmowens.lita@gmail.com) is lita president 2019-20 and the acting associate university librarian for library technology services at the university of pennsylvania libraries.
when i wrote my march editorial, i was optimistically picturing some of the changes that we are now seeing for lita—while being scarcely able to imagine how the world and our profession would need to adapt quickly to the impacts on library services as a result of covid-19. it is a momentous and exciting change for us to turn the page on lita and become core, yet this suddenly pales in comparison to the challenges we face as professionals and community members. libraries’ rapid operational changes show how important the ingenuity and dedication of technology staff are to our libraries. since states began to shut down, our listserv, lita-l, has hosted discussions on topics like how to provide person-to-person reference and computer assistance remotely, how to make computer labs safe for re-occupancy, how to create virtual reading lists to share with patrons, and how to support students with limited internet access. there has been an explosion in practical problem-solving (ils experts reconfiguring our systems with new user account settings and due dates), ingenuity (repurposing 3d printers and conservation materials to make masks), and advocacy (for controlled digital lending). sometimes the expense of library technologies feels heavy, but these tools have the ability to scale services in crucial ways—making them available to more people at the same time, available to people who can only take advantage after hours, available across distances. technologists are focused on risk, resilience, and sustainability, which makes us adaptable when the ground rules change. our websites communicate about our new service models and community resources; ill systems regenerate around increased digital delivery; reservation systems for laptops now allocate the use of study seating. our library technology tools bridge past practices, what we can do now, and what we’ll do next. one of our values as ala members is sustainability. 
(we even chose this as the theme for lita’s 2020 team of emerging leaders.) sustainability isn’t about predicting the future and making firm plans for it; it’s about planning for an uncertain future, getting into a resilient mindset, and including the community in decision-making. although the current crisis isn’t climate-related per se, this way of thinking is relevant to helping libraries serve their communities. we will need this agile mindset as we confront new financial realities. our libraries and ala itself are facing difficult budget challenges, layoffs, reorganizations, and fundamental conversations about the vitalness of the services we provide. my favorite example from my own library of a covid-19 response is one where management, technical services, and it innovated together. our leadership negotiated an opportunity for us to gain access to digitized, copyrighted material from hathitrust that corresponds to print materials currently locked away in our library building. thanks to decades of careful effort by our technical services team, we had accurate data to match our print records with records for the digital versions. our it team had processes for loading the new links into our catalog almost instantaneously. the result was a swift and massive bolstering of our digital access precisely when our users needed it most. this collaboration perfectly illustrates how natural our merger with alcts and llama is. as threats to our profession and the ways we’ve done things in the past gather around us, i am heartened by the strengths and opportunities of core. it is energizing to be surrounded by the talent of our three organizations working together.
i hope more of our members experience that over the summer and fall, as we convene working groups and hold events together, including a unique social hour at ala virtual and an online fall forum. i close out my year serving as the penultimate lita president in a world with more sadness and uncertainty than we could have foreseen. we are facing new expectations and new pressures, especially financial ones. as professionals and community members, we are animated by our sense of purpose. while lita has been transformed by our vote to continue as core, the support and inspiration we provide each other in our association will carry on.

118 journal of library automation vol. 5/2 june, 1972

automation of acquisitions at parkland college

ruth c. carter: system librarian, university of pittsburgh libraries. when this article was in preparation, the author was head of technical services and automation, parkland college, champaign, illinois.

this paper presents a case study of the automation of acquisitions functions at parkland college. this system, utilizing batch processing, demonstrates that small libraries can develop and support large-scale automated systems at a reasonable cost. in operation since september 1971, it provides machine-generated purchase orders, multiple order cards, budget statements, overdue notices to vendors, and many cataloging by-products. the entire collection, print and nonprint, of the learning resource center is being accumulated gradually into a machine-readable data base.

introduction-background

parkland college, opened in 1967, is a two-year community college located in champaign, illinois. before the librarian-analyst, who combines a library degree with several years' experience as a computer systems analyst and six months of programming training, was hired by parkland, the administration decided that automation of some library procedures was feasible.
at the time the library decided to initiate automation planning (december 1970), it had a book collection just under 30,000 plus 1000 audio-visual items. the decision to automate would not have been possible unless a computer was available at the college. in the spring of 1970 when the librarian-analyst was hired, parkland owned an ibm 360/30 with 32k. before automation plans were under way, the college purchased an ibm 360/30 with 64k. the computer's increased capacity provided even more incentive for utilizing the computer for significant projects in addition to instructional and administrative functions. among the reasons in favor of automation was a general consensus indicating that automation was the way to go, and that the library with its many individual records is a natural for utilizing the computer.

the automation of library acquisitions at parkland is notable for several reasons. first, automation was done relatively easily and rapidly; actual systems design and programming were completed in six months. full implementation was achieved within nine months of the formal beginning of the project. second, documentation of the system is exhaustive and is based on a detailed method of communication between the system's librarian-analyst and the programmer. third, automation in this instance was accomplished economically. fourth, the entire system can be run on an ibm 360/30 with 32k having two disk drives and two tape drives, and a standard print chain consisting of just upper-case letters.

what to automate?

this, of course, is a crucial question. where out of the various alternatives of circulation, acquisitions, cataloging, and others does one begin? neither the librarian-analyst nor the rest of the library staff made any attempt to work out an answer during the fall of 1970.
the librarian-analyst, as head of technical services, spent the first four months concentrating on cataloging and learning the problems in the acquisitions area. by december she was ready to begin planning for automation. meetings were arranged with the director of the learning resource center and the director of the computer center. informal discussions with the library staff were held.

circulation was eliminated early from consideration, since parkland is in temporary quarters. it seemed more logical to develop the area of circulation with the move to the permanent campus. in addition, the volume of circulation did not appear to warrant the time and personnel commitment necessary to develop a comprehensive system at this time.

several possibilities remained: the acquisition of new materials, conversion of our whole catalog, and periodicals control, including automatic claim notices. periodicals seemed the least likely of the three, because our holdings numbered less than 700, and it was felt that the volume involved did not justify the effort and expense of going to a computer system, particularly the first computer system within the library. converting the whole catalog had some positive arguments. it would provide a data base for later circulation efforts and also make it possible to produce bibliographies and other service features for faculty members. however, this idea was discarded due to the large initial data-conversion problem, and because it did not provide relief for existing problems within the library.

the library staff concluded that acquisitions had first priority for automation. to this the director of the computer center heartily agreed on the grounds that it was a conventional data processing type of application, and it would dovetail with existing data bases already maintained for administrative purposes, in particular, the vendor file and financial reporting
furthermore, the library could then produce its encumbrance data to be entered into the budget programs for the business office accounting records. from the standpoint of the library staff, it was believed that by utilizing the computer in acquisitions we could improve the overall staff utilization in the area. probably the strongest point is that, while we did not expect clerical work time to be decreased, its nature would be changed. one specific function to be eliminated was the manual bookkeeping done, although a machine system would still require checking for accuracy. we expected that the acquisitions librarian, once freed from some routine responsibilities concerning the budget, would be able to devote that time to more professional activities. other advantages in automating acquisitions were: more accurate and up-to-date information, especially in regard to budget figures would be available; human errors in sending out orders would be cut down; and statistics on orders could be compiled automatically. at this point, as well as previously, the literature was searched for relevant discussions of acquisitions systems and/or mechanization applications in small libraries. relatively little had appeared in print describing library automation in junior colleges. those articles found to be helpful included: burgess, cage, corbin, dobb, dunlap, macpherson, morris, and vagianos (see references 1-5 and 7-9). also, hayes and becker's handbook of data processing for libraries ( 6) became available at this time. it was especially useful for the summary of features usually present within the scope of standard acquisitions applications. along with use of the literature, several visits to other libraries with operational systems were made. a visit of particular importance was made in january ( 1971) to study an established off-line acquisitions system. 
as soon as there was general agreement on proceeding with plans for acquisitions, a list was prepared of the criteria the library staff would expect from the automation of acquisitions. the list items included:

1. the system should be open-ended, i.e., it should be planned with other potential future systems in mind.
2. it should handle the preparation of outgoing forms such as purchase orders, book-order cards, notifications to faculty requestors, and overdue notices to vendors.
3. the system should perform bookkeeping functions and provide many different access points for inquiry into the data base.
4. there must be a status list of items in the acquisitions process, up to and including the point of receiving cataloging.
5. it should have as much automatic editing of input data as possible.
6. the system must have flexible updating and file maintenance routines.
7. it should provide the library staff with decision-aiding information including many of our previously manually maintained statistics.
8. it must be flexible.
9. it should maintain simplicity.
10. it should provide better service to the faculty through faster and more accurate ordering and notifications.

along with the criteria for an acquisitions system, a possible sequence of automation development was submitted. this was to provide a means for keeping clearly in mind that, while acquisitions would get first attention, this was only a starting point, and that the system should be planned in such a manner as to facilitate its compatibility with future developments. as originally stated, acquisitions, strictly speaking, represented phase 1, and materials added to the collection were phase 2. however, phases 1 and 2 were planned and programmed at the same time. thus, from the beginning, parkland college has included in its system cataloging information such as the complete call number, and up to three subject headings of fifty characters each.
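a fixed-length record of this kind can be read by slicing character positions, as in the sketch below. the 400-character record length and the positions of subject headings 2 and 3 (301-350 and 351-400) are read from the master record layout in figure 1; the position of subject heading 1 (251-300) is an assumption made by analogy, and the field names are hypothetical.

```python
RECORD_LENGTH = 400

# 1-based, inclusive character positions within the master record.
SUBJECT_FIELDS = {
    "subject_1": (251, 300),   # assumed; not legible in figure 1
    "subject_2": (301, 350),   # from figure 1
    "subject_3": (351, 400),   # from figure 1
}


def subject_headings(record: str) -> list[str]:
    """Return the non-blank subject headings stored in a master record."""
    if len(record) != RECORD_LENGTH:
        raise ValueError(f"record must be exactly {RECORD_LENGTH} characters")
    headings = []
    for start, end in SUBJECT_FIELDS.values():
        # Convert 1-based inclusive positions to a Python slice.
        text = record[start - 1:end].rstrip()
        if text:
            headings.append(text)
    return headings
```

blank padding is what makes "up to three" headings possible in a fixed layout: an unused field is simply fifty spaces.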
the decision regarding number and length of subject headings will be discussed later. (see master record layout at figure 1.)

time estimate-schedule

in january, 1971, a proposed time estimate (see figure 2) was submitted to the director of the computer center for his approval. this time estimate was prepared with the goal of automating acquisitions beginning with the fiscal year 1972 (i.e., july 1971). the proposed schedule also took into account the fact that most of the librarians were expected to be on vacation all (or at least most) of august, and also that during september, with the registration of students and other demands on the computer resulting from the beginning of a new academic year, computer time and personnel would be tight and probably could not provide the necessary support to a system still in its developmental stages.

the schedule called for the librarian-analyst to begin full-time work on analysis on february 15 with final implementation of the system by the end of july. preparation of this estimate was based on computer output if everything went right. it was an extremely rigorous schedule. considering that problems did arise, the implementation of this system during the first week of august is truly notable. of course, bugs remained after the system was actually in operation, and, as with all systems, changes were still being made several months later both in specifications for programming and in the programs conforming to the specifications.

when the time estimate was submitted, it was also necessary to make firm decisions regarding personnel to perform all the necessary tasks. the librarian-analyst assumed responsibility for all systems analysis and program definitions. the library staff supplied the keypunching support. one clerk had been hired previously because of her keypunch training. on july 1, an additional clerk was hired with this skill. the main problem was programming, because the computer center did not have the full-time personnel to support a major new effort. this was resolved by hiring a programmer on a special three-month contract running from april 15 to july 15, 1971.

[figure 1: tape layout form, prepared by r. carter. library master files: on order, in process, history. record length and block: 400 x 9. positions 301-350: subject heading no. 2; positions 351-400: subject heading no. 3. remainder of form not reproduced.]
fig. 1. master record layout.

prior to implementation, the library was forced to rely on the availability of keypunch machines at the computer center. in september 1971, an ibm model 129 keypunch and verifier was installed in the technical services department of the library. a model 129 was chosen for the library in conformance with the initial requirement set by the director of the computer center: that all library data for the computer be verified. this has proven to be a wise decision, as we have had relatively limited problems with invalid or erroneous data.

requirements specification phase (analysis)

three weeks were allowed for identification and specification of all output desired from the initial system. many of these requirements were alluded to in the preliminary list of criteria for the system.
to meet the library's needs we decided that the system must produce: purchase orders, individual order cards (including a copy used to order catalog cards from the library of congress), budget statements including all encumbrances and payments as well as other financial data, lists of all books on order or in process or cancelled, notices to vendors regarding items on order more than 120 days, notices to each faculty member of the additions to the collection of items they requested complete with call number, and a monthly accession list of all newly cataloged items that could be circulated to all faculty members.

development step                                                          time required   date to start   date to complete
i.   requirements specifications                                          3 weeks         feb. 15         march 5
ii.  detailed design-system flow                                          3 weeks         march 8         march 26
iii. detailed design-programming specifications                           10 weeks        march 29        june 4
iv.  programming-acquisitions                                             10 weeks        april 15        june 23
v.   programming-materials accessioned                                    3 weeks         june 24         july 14
vi.  computer program system test-acquisitions & materials accessioned    2 1/2 weeks     july 1          july 26
vii. implementation                                                       july 1971

fig. 2. time estimate for automation of acquisitions at parkland college as submitted in january 1971. a beginning and ending date for each phase is indicated and the actual time in weeks required is shown.

once it was known what forms were required, orders were placed for the necessary pre-printed forms. with some outside advice in the matter of forms suppliers, specifications for three new forms were delineated, two of which would be for use on the computer. the first form encountered in outlining the acquisitions process was a request form. the request form is used to make a record of all items ordered and to serve as a checklist in the searching process (see figure 3).
later, it is stamped with a six-position control number and serves as the source document for keypunching new orders, which require three input cards per item ordered. the request form is then retained in control-number sequence until the item has completed its way through the technical services process.

specifications for the purchase orders were drawn up by parkland's business manager. the machine-generated purchase orders used by parkland are almost identical to the conventional manual purchase orders used throughout the college. in this case, automation of the library's purchase orders is a likely precursor to automation of the purchase orders for the remainder of the college.

the most complicated form to design, from the library's viewpoint, was the individual order form. this was required in five parts, including a copy complying with library of congress specifications for use with ocr equipment. (this is illustrated in figure 4.)

[fig. 3. request form, used as a control record for each item ordered. fields shown include: searched in, fund, vendor, format code, author (last name first), title/vol., card catalog, publisher, other, year, no. copies, reviewed in, series/edition, lc card no., requester, control no., order code, price, sbn.]

[fig. 4. copies one and two of the multiple-part order form. the original copy is used to order catalog cards from the library of congress; the second copy is sent to the vendor. fields shown include: author, title, publisher, date, list price, no. copies, control number, order date, vendor, lc card number, p.o. no.]

it was important to determine forms requirements early, as it was anticipated that several months' time would elapse before they would be received. naturally, it was desired that the forms be on hand by the time the programs would be ready for testing, which was planned for late june or early july.

one of the most critical parts of the requirements specification phase was the determination of data elements to be included in the master records. perhaps the most perplexing of those possibilities considered was subject headings. since we wanted an open-ended system which would leave us some room for future development, without major modifications, a decision was made to include three 50-character subject headings in each record. here we were limited because of the decision made (for purposes of simplicity of design and programming) to confine the system to fixed-length records. it was considered desirable for storage purposes to keep the master record length within 400 characters. while the decision on subject headings may not prove to be adequate in the long run, it does give parkland's library a good starting point for some projects using subject headings, such as developing bibliographies on demand. despite possible future modifications to the data base, all items going into the history (master) file included headings as defined above.
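the fixed-length layout described above can be illustrated with a minimal python sketch (the system itself was written mostly in cobol, so this is only an illustration, not the original code). the positions of subject headings 2 and 3 follow figure 1; the position of heading 1 (251-300), the contents of the first 250 positions, and the function names are assumptions made for this example.

```python
RECORD_LENGTH = 400   # master record length stated in the text
HEADING_WIDTH = 50    # each subject heading is 50 characters
PREFIX_WIDTH = 250    # assumed: everything before the headings

# 1-based, inclusive positions. Headings 2 and 3 are taken from figure 1;
# heading 1 at 251-300 is an assumption inferred from the other two.
HEADING_POSITIONS = [(251, 300), (301, 350), (351, 400)]


def build_record(prefix: str, headings: list[str]) -> str:
    """Build a 400-character record, padding or truncating every field."""
    if len(headings) > len(HEADING_POSITIONS):
        raise ValueError("a record holds at most three subject headings")
    fields = [h[:HEADING_WIDTH].ljust(HEADING_WIDTH) for h in headings]
    # Unused heading slots are left blank.
    fields += [" " * HEADING_WIDTH] * (len(HEADING_POSITIONS) - len(fields))
    return prefix[:PREFIX_WIDTH].ljust(PREFIX_WIDTH) + "".join(fields)


def subject_headings(record: str) -> list[str]:
    """Return the non-blank subject headings of a fixed-length record."""
    if len(record) != RECORD_LENGTH:
        raise ValueError(f"expected {RECORD_LENGTH} characters, got {len(record)}")
    found = []
    for start, end in HEADING_POSITIONS:
        text = record[start - 1:end].rstrip()  # convert to 0-based slice
        if text:
            found.append(text)
    return found
```

the trade-off is the one the text describes: fixed positions keep programs simple, but a heading longer than fifty characters is truncated and a fourth heading cannot be stored.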
additional determinations made in the initial phase regarded files to be maintained. here a crucial factor was the physical limitations of the college's computer system. as only two tape drives and two disk drives comprised the primary storage facilities, the capability for performing sorts was limited. in fact, one of the disk drives was reserved strictly for systems programs, and could not be utilized directly by the library. this contributed to the decision to maintain separate on-order and in-process files, as well as a history file on tape. the college vendor file and the library budget file are maintained on disk.

a final area of effort in the initial phase was developing codes to be utilized throughout the system. naturally, many conditions would be indicated in the computer records by the use of a one- or two-position code. one example is the format code, a one-position code, which indicates the types of items used such as: b=book, r=record, and s=filmstrip.

design phase-system flow

three weeks were allotted to developing the overall systems flow chart. this time was spent working out each separate program that would be required, and flow-charting the entire series of programs. a flow chart of the system (without minor additions dating after september 1971) is shown in figure 5. however, it does not necessarily indicate the sequence in which programs are run. in general, maintenance of each of the separate files is run prior to new data. this procedure has proved to work well.

[fig. 5. system flow chart.]

in most cases, pre-sorting of card input is provided. this decision was not based on optimum efficiency but on the compatibility with routine procedures and facilities in the computer center.
design phase-program specifications one of the most significant parts of the development of parkland's automated library acquisitions system is the exhaustive documentation provided by detailed written specifications for each program in the system. each program, including utilities such as sorts, was assigned a job number and then described under each of the following topics: purpose, frequency, definitions (any unusual terms), input, output, and method. a format was provided for each input and output, whether it was a card, tape, disk, list, or other printed report or form. these accompanied each individual program specification. the method section is particularly important. here the librarian-analyst stated the procedure used to arrive at the given output based on the given input. any necessary constants were defined. because the librarian-analyst has had programming training, these specifications are detailed to the point where the programmer does not have to do much more than code the problem, making it possible for programming to proceed quickly. this thorough problem definition for each program by the librarian-analyst was one of the major factors (perhaps the primary key) in our success in acquisitions being accomplished rapidly and efficiently. it had the advantage of obviating the need for a senior programmer, or for having someone from the computer center become highly involved in the analysis of library details. furthermore, and perhaps most important is the fact that it provides the detailed documentation of the system. there should be no doubt as to the procedures within each program. an example of a specification for one of the programs in the parkland college library acquisition series is presented in the appendix. it should be mentioned that most of the programs are written in cobol. there are a few in assembler, and some minimal use is made of rpg. 
testing of the program

the original plans called for testing with test data which would proceed simultaneously with programming. however, as things developed, most coding was done prior to very much testing. as a result, the period originally devoted to live data testing of the whole system was instead devoted to testing the programs with test data. thus, in early july, we were about two weeks behind the original time estimate, and that is where it ended up.

the usual problems showed up in testing with test data. moreover, during the first week of july, it was learned that the business office was changing the length of the account numbers from 9 to 11 positions. fortunately, space had been planned for up to a 12-position field, so the lengthened number could be easily accommodated by the system. however, the changing of numbers required modification of any program which edited data for valid account numbers. this was a minor problem and easily resolved.

on july 15 the programmer completed the job for which he was hired, i.e., to complete a programming and systems test utilizing live data and to make appropriate changes as identified during testing. since not even test-data testing was complete on july 15, he stayed until july 20 and finished that work. meanwhile, the director of the computer center had already selected the individual to be the operator when the library's jobs were being run on a regular basis. this employee would also provide program maintenance. on july 21, this permanent staff member took over programming. for the next two weeks, while summer school classes were in session, most of the trial runs of the library series had to be done during evenings, nights, and on weekends. by the end of july, most of the major bugs appeared to be out of the programs.

impact on technical services

success on the first usable purchase order and order cards came on august 3.
within the next day or two, a workable budget statement was produced along with a wits list (work in technical services). by august 13, when the vacation time came, nearly one thousand books had been ordered via the automated system. while a few bugs remained to be dealt with in september, the system was accomplishing its basic mission essentially on time. it took less than eight months to identify requirements, and design, program, and test a system consisting of twenty-seven programs in its original design!

during the remainder of 1971, various bugs were found, and, it is to be hoped, eliminated from the system. more bugs occurred in the budget series than in any other single segment of the system. over a period of several months, these were worked out; as of march, 1972, the budget sequence of programs worked smoothly.

implementation

following the implementation of the automated technical services system, several effects were evident. an obvious effect was the saving of two to three days per month formerly spent on bookkeeping. on the other hand, one permanent staff member was added to technical services because of the keypunching workload. this addition had two causes: the keypunching load, and the fact that many more books were ordered directly from publishers with a consequent major increase in processing in-house. therefore, much of what was expended in salary for the extra clerk was saved by eliminating most prepaid processing costs.

for several months after implementation, some duplication of effort was required, especially by acquisitions personnel. thus, the total effect on changing the nature of work was not immediately obvious. by march 1972, duplication was essentially phased out, and more realistic assessments of the impact of automation in changing the nature of the workload are now being made. one of the most obvious changes is the increased number of bills to be approved for payment.
by utilizing the computer to batch purchase orders and order cards, almost all materials are now ordered directly from publishers, rather than pre-processed from a jobber. although the speed with which items are received and processed has increased substantially, there has been a corresponding increase in paper work in this regard.

additional services

besides the immediate effects of the automation of acquisitions within technical services, other parts of the library and the college felt the impact. this is especially true of reference, which now has a weekly updated listing of all items on order, in process, or cataloged within the last month, in both author/main entry and title sequence. budget statements are now available to the director of the learning resource center and other personnel on a weekly rather than monthly basis. not only are they received sooner, but they provide more information than is present in the statement originating from the computer center. a useful fringe benefit is the availability of overdue notices to vendors when items have been on order more than 120 days. a computer-generated notice is sent each week to faculty members regarding items requested, cancelled, or cataloged. the response of the library staff and the rest of the faculty to the automated system has been very favorable.

cost

at this date (march 1972), costs are difficult to assess, but certainly seem minimal. the only direct costs are the installation of a 129 keypunch, which rents for $170 per month, plus the salary of the extra staff member for keypunching. however, the extra salary is compensated for by no longer ordering items pre-processed at an average cost of $2.05 per item. naturally, there is some local cost for processing materials such as pockets and labels, but it is minor on a per-volume basis. in addition, by being processed locally, materials are available to the users much more rapidly.
among other costs, the learning resource center had to pay a three-month salary for a programmer. other computer support, whether personnel or machine time, has not been directly billed to the library. analyst time is absorbed, in part, in general library salaries as the librarian-analyst is also head of technical services and is responsible for original cataloging. about one-half of her time is devoted to automation activities.

as an indirect cost of automation, it is reasonable to include the cost of a special summer project contract of about $1500 for the reference librarian to catalog a-v materials. this was necessary because the librarian-analyst was directly involved with automation, thus not able to keep up with all media of materials to be cataloged. purchase-order forms previously covered by the business office budget cost the library $900. however, it was a two-year supply which was paid for by money the college, if not the library, would have expended anyway. the multiple-order forms for computer use exceed the cost of more standard forms by several hundred dollars per year. the library also expends about $400 per year to buy punch cards and magnetic tape.

some direct savings resulted from what are by-products of the automated system, but which were previously done manually. these include production of a monthly accession list and notices to faculty members of items they requested which were ordered, cancelled, or cataloged. the accession list was previously compiled by xeroxing in ten copies the shelflist card for all items added to the collection during a month. this involved both xerox charges and student assistant time. notices to faculty were previously sent out by both the order and processing sections. now these notices are consolidated, which produces savings in addressing time, as well as eliminating manual production of each notice.
overall, in calculating costs and savings, direct and indirect, it appears at this point that parkland has automated many library routines very inexpensively, although specific cost figures remain to be determined. with the availability of a similar computer, many other libraries should be able to undertake automation of certain basic functions without large expenditures of either money or personnel time. problems as with all automated efforts, some problems were encountered at almost every stage of development. taken as a whole, these were minor and, for the most part, few hitches were encountered. however, so that others may profit from the library automation experience at parkland, those problems will be discussed. the major problem was the original programmer of the series. this person was not a regular employee of parkland and was not concerned with being retained. since he was not part of the staff, he worked erratically and frequently was hard to get hold of. we were working on a tight time schedule, and it was very important to maintain close supervision of the progress being made, although sometimes this was difficult. in addition, even though it was strongly desired that tests be conducted throughout the three-month period, the programmer waited until all coding and compiling was completed before beginning even test-data testing with most programs. fortunately, it worked out satisfactorily, as the regular staff member of the computer center, who presently runs our jobs and does program maintenance, took over in mid-july and was available for live-data tests. all staff members directly involved with automation worked very hard the last two weeks of july and the first week of august to complete testing with live data. the programs were further refined during august and september, and most of the bugs were out by early fall . naturally, changes in specifications continued to be made, and our acquisitions system is definitely not static. 
the lesson we learned from the experience with the initial programmer is that, if a regular staff member of the institution can be assigned to the development of programs for the library, avoiding other assignments during that time period, a more satisfactory response can be achieved from the programmer. also, in such an operation it would be possible to monitor progress on a more regular basis.

another group of problems arose in connection with the new forms required for the automated system. fortunately, these were not serious. the forms arrived later than they were promised, and, without exception, their cost was about 25 percent more than the original estimates. because custom forms can take a long time to be completed, it is wise to identify output requirements early in the development of an automated system, so that the forms can be completed and delivered when the system is ready for final testing and implementation.

a few minor problems revolved around decisions made in file design. for conserving space and holding down the size of the master record, it was decided to pack numerical fields. this would have been satisfactory if packing had been limited to such fields as the julian date, such as 72001 rather than 01-01-72. (this form of the date was used to provide easy computation when calculating overdue orders.) unfortunately, fields such as the numerical part of the lc card number and the parkland college account numbers were also packed. no problem existed except when the lc card was blank at order time; then the lc number printed as zeros. of course, these could be suppressed once the problem was identified, although it was decided to make space to unpack the field.
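the overdue computation that the julian date form was meant to simplify can be sketched as follows. this is a minimal python illustration under stated assumptions, not the original cobol: the five-digit yyddd layout and the 120-day limit come from the text, while the function names and the fixed 1900 century are assumptions for the example.

```python
from datetime import date, timedelta


def parse_yyddd(value: str, century: int = 1900) -> date:
    """Convert a five-digit yyddd date (e.g. '72001') to a calendar date."""
    if len(value) != 5 or not value.isdigit():
        raise ValueError(f"expected five digits, got {value!r}")
    year = century + int(value[:2])
    day_of_year = int(value[2:])
    if day_of_year < 1:
        raise ValueError(f"day of year must be at least 1: {value!r}")
    result = date(year, 1, 1) + timedelta(days=day_of_year - 1)
    if result.year != year:
        raise ValueError(f"day of year out of range for {year}: {value!r}")
    return result


def days_on_order(order_yyddd: str, today_yyddd: str) -> int:
    """Number of days between the order date and today."""
    return (parse_yyddd(today_yyddd) - parse_yyddd(order_yyddd)).days


def is_overdue(order_yyddd: str, today_yyddd: str, limit: int = 120) -> bool:
    """True when an item has been on order more than `limit` days."""
    return days_on_order(order_yyddd, today_yyddd) > limit
```

within a single year the difference is a plain subtraction of the two numbers, which is presumably why the form was attractive; the sketch converts to calendar dates so that orders spanning a year boundary are also handled.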
it was learned that packed fields always print zero when unpacked, unless this is specifically suppressed, and also that it is impossible to debug packed fields on routine file dumps that are requested with provisions for unpacking and reformatting the dump. this is because packed fields print blank when they are dumped.

other minor difficulties included:

1. the print chain did not print colons or semi-colons, except as zero; therefore, the library's records all contain commas instead.
2. in the midst of programming the account numbers, all the college's funds were changed, thus requiring the change of constants and edit criteria in many programs.
3. as originally specified for input, the lc classification number did not sort in shelf list order; for instance, bf 8 sorted after bf 21. this was eventually remedied by left-justifying the letters and right-justifying the numbers within separate fixed fields.
4. routine delays for machine repair and maintenance were a concern, since it is necessary to adhere to a tight schedule in systems development.

future development

as is so frequently the case, now that parkland is committed to automated functions within the library, more and more applications are seen. even the former skeptics on the staff are enthusiastic, and all the professionals have made suggestions for the future.

several additions to the acquisitions system were made in the first six months following implementation of the system. these included a list of purchase orders sequenced by vendor and enlarging the machine-generated notices to faculty requestors to cover items ordered and cancelled. various additions have been made in several programs originally part of the system, which expand the services the system can provide for the library staff. many more minor modifications and supplementary features in acquisitions have been identified for inclusion in the system, and will be added as time permits.
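the shelf-list sorting remedy listed among the minor difficulties above (left-justified letters and right-justified numbers in separate fixed fields) can be sketched in python. the field widths, the function name, and the restriction to class letters followed by a whole number are assumptions for this illustration; decimals and cutter numbers are not handled.

```python
import re

LETTER_WIDTH = 3   # assumed width of the class-letter field
NUMBER_WIDTH = 4   # assumed width of the class-number field

_CLASS_PATTERN = re.compile(r"\s*([A-Za-z]{1,3})\s*(\d{1,4})\s*")


def lc_sort_key(class_number: str) -> str:
    """Build a fixed-width key: letters left-justified, digits right-justified."""
    match = _CLASS_PATTERN.fullmatch(class_number)
    if match is None:
        raise ValueError(f"unsupported class number: {class_number!r}")
    letters, digits = match.groups()
    return letters.upper().ljust(LETTER_WIDTH) + digits.rjust(NUMBER_WIDTH)


shelf = ["BF 21", "BF 8", "B 105", "BF 121"]

# A plain character sort reproduces the problem: 'BF 8' lands after 'BF 21'.
naive_order = sorted(shelf)

# Sorting on the fixed-field key gives shelf-list order.
shelf_order = sorted(shelf, key=lc_sort_key)
```

right-justifying pads short numbers on the left, so a character-by-character comparison of the keys agrees with numeric order as long as the pad character sorts below the digits, as a blank does here.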
the first additional area to benefit directly by the computer availability has been periodicals. without involving complicated programming, the periodicals holdings have been converted to a card file which is then listed directly, card by card, without changes, except for suppression of a control and sequence number. nothing more is planned for periodicals in the near future, because the new card file enables the master holdings list of 800 titles to be updated in technical services by the periodicals assistant, who also keypunches one-half time. the time-consuming retyping of the holdings list is now eliminated, and multiple copies of up-to-date holdings lists can be produced more frequently with less effort.

another new area for which programming specifications were released in december 1971 is reference. in this system it is hoped that subject bibliographies and holdings lists, based on library of congress classification, can be produced. this system will have a multitude of purposes, one of the primary ones being to give better service to our faculty members. we get many requests for copies of portions of our shelflist or other extracts of holdings. rather than filling these requests by xeroxing cards or tedious typing, a few extract specifications will permit computerized retrieval and printing. also, search time in the catalog will be cut down considerably. in the subject bibliographies, the library plans to be able to extract on any heading, stem of a heading, or any part of a heading, thus getting much more flexibility than in manual use of the card catalog. programming for this is currently under way, and after the system has been completed and is operational, some interesting results should be identified.

by including three subject headings of fifty characters in our original file design, it was possible to design and program the reference series as a spin-off of the acquisitions-technical services system with a minimum of additional effort.
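the three kinds of extract planned for the subject bibliographies (whole heading, stem of a heading, any part of a heading) can be sketched in python as follows. this is only an illustration of the matching idea, not the reference series itself; the `Item` record, the mode names, and the case-insensitive comparison are assumptions for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """Hypothetical catalog entry with up to three subject headings."""
    call_number: str
    title: str
    headings: tuple[str, ...]


def heading_matches(heading: str, term: str, mode: str) -> bool:
    """Compare one heading with a search term in the requested mode."""
    h = heading.strip().lower()
    t = term.strip().lower()
    if mode == "heading":   # the whole heading must equal the term
        return h == t
    if mode == "stem":      # the heading must begin with the term
        return h.startswith(t)
    if mode == "part":      # the term may appear anywhere in the heading
        return t in h
    raise ValueError(f"unknown mode: {mode!r}")


def extract(items: list[Item], term: str, mode: str = "heading") -> list[Item]:
    """Return the items having at least one heading that matches the term."""
    return [
        item for item in items
        if any(heading_matches(h, term, mode) for h in item.headings)
    ]
```

a request such as `extract(items, "psychology", mode="stem")` would gather every item whose heading starts with that word, which is the kind of flexibility the manual card catalog does not offer.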
even if it is eventually decided to lengthen either the number or size of the subject headings contained in parkland's file, useful services will have been provided under the original design, as well as simply having provided a base for further decisions and developments.

other projects which are being considered for future action are serials holdings (in parkland's case, mostly annuals and yearbooks which get cataloged), including an anticipation list, and management statistics consisting of holdings percentages by class letter versus collection additions and circulation figures by class letter. circulation itself will undoubtedly not be designed prior to actual residence on the permanent campus (anticipated for fall 1973), but all of the above are possibilities and some will receive attention in the immediate future.

by building a data base which includes subject headings and call numbers, many future projects will be practical to consider as the file maintenance programs and the data base will already exist. these, of course, may be modified from time to time to meet changing conditions and requirements. additionally, parkland's library staff has been following cooperative library automation efforts involving other libraries, and would happily consider participation in appropriate cooperative ventures.

conclusion

in the opinion of both the library and computer staff, the automation of acquisitions is a success. it was accomplished rapidly, essentially on time, and economically, with few costs higher than originally anticipated. now that the system is operating smoothly, with only an occasional bug cropping up, the extra workload caused by parallel operations has been phased out and the total efficiency of the system should continue to improve. the system to date has been running on a weekly basis, and this has proved satisfactory to both the computer center personnel and the library.
the library is among the first parts of parkland to be on a regular weekly schedule using the computer. most other processing is on a monthly and quarterly cycle.

in approaching any automated systems development, a general attitude of flexibility combined with thoroughness is very important and will probably bring the best long-term results. by being flexible and open-ended, regardless of what portion of a library's functions were originally automated, the way will be paved to provide a data nucleus for other applications in areas of the library. thoroughness in design and attention to initial detail are also important, as sometimes it is harder to find the time to make the changes than was expected. there is probably a tendency to get along with an operational system as it is, rather than making minor non-crucial modifications in it, although such changes do get worked in as time permits. nonetheless, it is very important that in the initial stages a system be as comprehensively planned as feasible. the parkland college learning resource center is fortunate in that original specifications (on the whole) were well thought out and provided a cohesive unit, which is also characterized by built-in flexibility, and as a result is adaptable to future growth.

acknowledgments

numerous individuals have participated in and supported library automation efforts at parkland college. david l. johnson, director of the learning resource center, provided the initial inspiration and determination. robert o. carr, director of the computer center, welcomed the library's commitment to automation and provided the technical advice where necessary. sandra lee meyer, acquisitions librarian, gave full cooperation, including tireless aid in clarification of requirements and debugging test results. since late july 1971, bill abraham has been the programmer-operator for the library system and has consistently given more than one hundred percent effort.
jim whitehead from western illinois university contributed valuable advice based on his prior experience in acquisitions automation. finally, kathryn luther henderson, an inspirational teacher and friend, voluntarily spent many hours writing test data and offering the opportunity for many fruitful discussions. references 1. thomas k. burgess, "criteria for design of an on-line acquisitions system at washington state university library," in proceedings of the 1969 clinic on library applications of data processing, edited by dewey e. carroll (urbana: university of illinois, graduate school of library science, 1970), p. 50-66. 2. alvin c. cage, "data processing applications for acquisitions at the texas southern university library," in proceedings, texas confe1·ence on library automation, 1969 (houston: texas library association, acquisitions round table, 1969), p. 35-57. 3. john b. corbin, "the district and its libraries-tarrant county junior college district, fort worth, texas," in proceedings of the 1969 clinic on library applications of data processing, edited by dewey e. carroll (urbana: university of illinois, graduate school of library science, 1970), p. 114-34. 4. t. c. dobb, "administration and organization of data processing for the library as viewed from the computing centre," in proceedings of the 1969 clinic on library applications of data processing, edited by dewey e. carroll (urbana: university of illinois, graduate school of library science, 1970), p. 75-80. 5. connie dunlap, "automated acquisitions procedures at the university of michigan library," library resources & technical se rvices 11: 192202 (spring 1967). 136 journal of library automation vol. 5 / 2 jun e , 1972 6. robert m . hayes and joseph becker, handbook of data processing for libraries (new york: wiley-becker and hayes, 1970). 7. john f. macpherson, "automated acquisition at the university of western ontario," in automation in libraries. papers presented at the c.a.c.u.l. 
workshop on library automation at the university of british columbia, vancouver, april 10-12, 1967 (ottawa, ontario: canadian library association, 1967).
8. ned c. morris, "computer-based acquisitions system at texas a & i university," journal of library automation 1:1-12 (march 1968).
9. louis vagianos, "acquisitions: policies, procedures, and problems," in automation in libraries. papers presented at the c.a.c.u.l. workshop on library automation at the university of british columbia, vancouver, april 10-12, 1967 (ottawa, ontario: canadian library association, 1967), p. 1-9.

158 information technology and libraries | december 2009

president’s message
michelle frisque

i know the president’s message is usually dedicated to talking about where lita is now or where we are hoping lita will be in the future, but i would like to deviate from the usual path. the theme of this issue of ital is “discovery,” and i thought i would participate in that theme. like all of you, i wear many hats. i am president of lita. i am head of the information systems department at the galter health sciences library at northwestern university. i also am a new part-time student in the masters of learning and organizational change program at northwestern university. as a student and a practicing librarian, i am now on both sides of the discovery process. as head of the information systems department, i lead the team that is responsible for developing and maintaining a website that assists our health-care clinicians, researchers, students, and staff with selecting and managing the electronic information they need when they need it. as a student, i am a user of a library discovery system. in a recent class, we were learning about the burke-litwin causal model of organizational performance and change. the article we were reading described the model; however, it did not answer all of my questions. i thought about my options and decided i should investigate further.
before i continue, i should confess that, like many students, i was working on this homework assignment at the last minute, so the resources had to be available online. this should be easy, right? i wanted to find an overview of the model. i first tried the library’s website using several search strategies and browsed the resources in metalib, the library catalog, and libguides with no luck. the information i found was not what i was looking for. i then tried wikipedia without success. finally, as a last resort, i searched google. i figured i would find something there, right? i didn’t. while i found many scholarly articles and sites that would give me more information for a fee, none of the results i reviewed gave me an overview of the model in question. i gave up. the student in me thought: it should not be this hard! the librarian in me just wanted to forget i had ever had this experience.

this got me to thinking: why is this so hard? libraries have “stuff” everywhere. we access “stuff,” like books, journals, articles, images, datasets, etc., from hundreds of vendors and thousands of publishers who guard their stuff and dictate how we and our users can access that stuff. that’s a problem. i could come up with a million other reasons why this is so difficult, but i won’t. instead, i would like to think about what could be.

in this same class we learned about appreciative inquiry (ai) theory. i am simplifying the theory, but the essence of ai is to think about what you want something to be instead of identifying the problems of what is. i decided to put ai to the test and tried to come up with my ideal discovery process. i put both my student and librarian hats on, and here is what i have come up with so far:

- i want to enter my search in one place and search once for what i need. i don’t want to have to search the same terms many times in various locations in the hopes one of them has what i am looking for. i don’t care where the stuff is or who provides the information. if i am allowed to access it i want to search it.
- i want items to be recommended to me on the basis of what i am searching. i also want the system to recommend other searches i might want to try.
- i want the search results to be organized for me. while perusing a result list can be loads of fun because you never know what you might find, i don’t always have time to go through pages and pages of information.
- i want the search results to be returned to me in a timely manner.
- i want the system to learn from me and others so that the results list improves over time.
- i want to find the answer.

i’m sure if i had time i would come up with more. while we aren’t there yet, we should continually take steps, both big and small, to perfect the discovery process. i look forward to reading the articles in this issue to see what other librarians have discovered, and i hope to learn new things that will bring us one step closer to creating the ultimate discovery experience.

michelle frisque (mfrisque@northwestern.edu) is lita president 2009–10 and head, information systems, northwestern university, chicago.

book reviews
information technology and libraries | march 2014 44

epub 3: best practices, by matt garrish and markus gylling. sebastopol, ca: o'reilly. 2013. 345 pp. isbn: 978-1-449-32914-3. $29.99.

there is much of value in this book (there aren't really that many books out right now about the electronic book markup framework, epub 3), yet i have a hard time recommending it, especially if you're an epub novice like me. so much of the book assumes a familiarity with epub 2. if you aren't familiar with this version of the specification, then you will be playing a constant game of catch-up. also, it's clear that the book was written by multiple authors; the chapters are sometimes jarringly disparate with respect to pacing and style. the book as a whole needs a good edit.
this is surprising since o'reilly is almost uniformly excellent in this regard.

the first three chapters form the core of the book. the first chapter, "package document and metadata," illustrates how the top-level container of any epub 3 book is the "package document." this document contains metadata about the book as well as a manifest (a list of files included in the package as a whole), a spine (a list of the reading order of the files included in the book), and an optional list of bindings (a lookup list similar to the list of helper applications contained in the configurations of most modern web browsers). the second chapter, "navigation," addresses and illustrates the creation of a proper table of contents, a list of landmarks (sort of an abbreviated table of contents), and a page list (useful for quickly navigating to a specific print-equivalent page in the book). the third chapter, "content documents," is the heart of the core of the book. this chapter addresses markup of actual chapters in a book, pointing out that epub 3 markup here is mostly a subset of html5, but also pointing out such things as the use of mathml for mathematical markup, svg (scalable vector graphics), page layout issues, use of css, and the use of document headers and footers. after reading these first three chapters, my sense is that one is ready to dive into a markup project, which is exactly what i did with my own project. that said, i think a reread of these core chapters is due, which i intend to do presently.

the rest of the book is devoted to specialty subjects such as how to embed fonts, use of audio and video clips, "media overlays" (epub 3 supports a subset of smil, the synchronized multimedia integration language, for creating synchronized text/audio/video presentations), interactivity and scripting (with javascript), global language support, accessibility issues, provision for automated text-to-speech, and a nice utility chapter on validation of epub 3 xml files.
of these, i found the chapter on global language support fascinating. for us native english speakers, some of the problems one will inevitably encounter when trying to create an electronic publication that can work in non-western languages are not immediately obvious. just consider languages that read vertically and from right to left, for one!

as an epub novice, my greatest desire would be for the book to provide, maybe in an appendix, a fairly comprehensive example of an epub 3 marked-up book. maybe this is a tall order? nevertheless, i would love to see an example of marked-up text including bidirectional footnotes, pagination, a table of contents, etc.; simple, foundational things, really. examples of each of these are included in the book, but not in one place. having such an example in one place would be something that could be used as a quick-start template for us epub beginners. to be fair, code examples of all of this are up on the accompanying website, and i am using these examples as i learn to code epub 3 for my own project. but having a single, relatively comprehensive example as an appendix to the book would be very useful.

as i read this book, something kept bothering me. epub 2 and epub 3 are so very different, with reading systems designed to render epub 3 documents being fairly rare at this point. so if different versions of the same spec are so different, with no guarantee that a future reading system will be able to read documents adhering to a previous version, then the prospect of reading epub documents into the future is pretty sketchy. are e-books, then, just convenient and cool mechanisms for currently reading longish narrative prose: convenient and cool, but transitory?

mark cyzyk is the scholarly communication architect in the sheridan libraries, johns hopkins university, baltimore, maryland, usa.

design principles for a comprehensive library system

tamer uluakar, anton r.
pierce, and vinod chachra: virginia polytechnic institute and state university, blacksburg, virginia.

this paper describes a project that takes a step-by-step or incremental approach to the development of an online comprehensive system running on a dedicated computer. the described design paid particular attention to present and predicted capabilities in computing as well as to trends in library automation. the resultant system is now in its second of three releases, having tied together circulation control, catalog access, and serial holdings.

perspective

the use of computers in libraries is no longer a speculative venture for the daring few. rather, library automation has become the accepted prerequisite for effective library service. the question faced is not "if," but rather "how" and "when." the reasons for this evolution are diverse, but fundamental is the recognition of online computer processing as the most effective means of simultaneously handling inventory control, information retrieval, and networking of large, complex, and volatile stores of data. most areas of current library practice could now benefit from effective computer-based control. mature and proven systems exist for cataloging, circulation, serials control, acquisitions, catalog access, and "reader guidance"; the latter by virtue of online literature searching facilities such as dialog, medlars, or brs. the challenge is to find or develop an optimal mix of capabilities.

two common limitations from which library automation projects suffer are the use of nonstandardized, incomplete records and the lack of functional integration of different tasks. in most cases these limitations are due to historic circumstances. the pioneering systems (say, those online systems introduced between 1967 and 1975) had to conserve carefully the available computing resources. a decade ago it was unthinkable for any library to store a million marc records online.
[manuscript received july 1980; accepted february 1981.] mass storage costs alone precluded that option. to best realize the benefits of automation, short records, usually of fixed length, were employed. there is little question that systems based on short records were helpful to their users. however, one characteristic of these systems was their proliferation within a particular library. after the first system was shown to be a success, it became compelling to try another. the problem was that these separate systems were usually not communicating directly with each other because of limitations imposed by program complexity and load on available resources. thus, the use of incomplete records breeds isolated, noncommunicating systems.

however, system users have come to demand that all relevant data be available at a single terminal from a single system. it is not enough to know that a particular title is due back in twenty-five days; the user must also know that copy two has just been received, and that copy three is expected to arrive from the vendor in one week. that is, the functions of catalog access, circulation, and acquisitions must be brought together at a single place: the user's terminal. and while the importance of functional integration has been recognized for some time, only a very few report successful implementations.[1,2] the kafkaesque alternative to functional integration becomes the library that has been "well computerized" but where the librarian must use five different terminals, one for each task.

as computer-based systems have grown to maturity, increasing stress has been placed on standardization. in library automation the measure of standardization is wide-scale use of the marc formats for documents and authorities; the use of bibliographic "registry" entries such as isbn, issn, or coden; the use of standard bibliographic description; and so forth.
however, the application of common languages and standardized protocols, data description, and definition has been less pervasive. we find many applications that eschew use of the common high-level languages, database management systems, and standard "off-the-shelf" or general-purpose hardware.

the emergence of powerful and easy-to-use database management systems, the spectacular price reductions in hardware, and the concomitant, and equally spectacular, improvements in system capabilities have made it clear that it is practical to think ambitiously. perhaps the major articulation of these developments has been the pervasive shift from a central computer shared with nonlibrary users to the utilization of dedicated minicomputers.[3]

our analysis of the requirements of a comprehensive system led to recognition of the key role played by serials in research libraries. serials form the most critical factor in automating library service because of the complexity of their bibliographic, order, and inventory records, and because of their importance to research.[4] a fundamental error in designing a comprehensive library system would involve focusing on the requirements of monographs and/or other "one-shot" forms of the literature. the reason is, simply, that monographs and other such publications can be treated as an easy limiting case of a continuing set of publications. this observation is borne out by christoffersson, who reports an application that extends the idea of seriality and develops a means to provide useful control and access to all classes of material.[5]

80 journal of library automation vol. 14/2 june 1981

design philosophy

the concerns outlined above mean that a viable library system should meet the following design criteria:

functional integration. functional integration is simply the ability to conduct all appropriate inquiries, updates, and transactions on any terminal.
this envisages a cradle-to-grave system wherein a title is ordered, has its bibliographic record added to the database, is received and paid, has its bibliographic record adjusted to match the piece, is bound, found by author, title, subject, series, etc., charged out, and, alas, flagged as missing. in this way a terminal linked to the system will be a one-stop place to conduct all the business associated with a particular title, subject, series, order, claim, vendor, or borrower.

completeness of data. if the system is to be functionally integrated, it is clear that it must carry the data required to support all functions. in particular, data completeness is required to satisfy the access and control functions. consider, for example, the problems associated with the cataloging function. a book is frequently known by several titles or authors. creating these additional access points is a large portion of the cataloger's responsibility. only systems that allow the user access to these additional entries utilize the effort spent in building the catalog record. such system capabilities must be present to allow the labor-intensive card catalog to be closed and, more important, to allow maintenance of the catalog within the system.

use of standardized data and networking. in an excellent article, silberstein reminds us that, in general, the primary rationale for adhering to standards is interchangeability.[6] we give great importance to being able to project our data to whatever systems may develop in the future. we believe this consideration is of the highest priority because, fundamentally, the only thing that will be preserved into the future is the data itself.* without interchangeability of data, sharing of resources is impossible.
data interchangeability is, of course, a basic assumption that has been made in speculation concerning the national bibliographic network[7] developing from the bibliographic utilities, notably oclc, inc., the research libraries group's rlin facility, the washington library network, and the university of toronto's utlas facility. [*this state of affairs seems to be true for all computer-based systems because their lifetime is, typically, no greater than ten years.] today, nearly all research libraries participate in some utility. while their participation is primarily directed to utilization of the cataloging support services, we find an increasing amount of interest and use of additional capabilities, notably interlibrary loan. we expect a steady and continual growth of these library networking capabilities.

however, networking is not problem free. perhaps the biggest single problem in using the network is the misalignment between the record as found on the bibliographic database and the requirements of individual libraries. while such variability between the resource database record and the user's needed version is well understood,[8] the local library frequently has a difficult time adjusting records to meet local needs. one example is oclc's inability to "remember" in the online database a particular library's version of a record. another example is the conser project's practice of "locking" very dynamic records as soon as they are authenticated. this locking frequently means that required updates cannot be made and users cannot share with one another corrections to the base record. after locking, each must, independently, go about bringing the record up to date. thus, as roughton notes, "the next library to call up the record loses the benefit of the previous library's work."[9] this inhospitable state of affairs forces individual libraries to maintain their own records if they wish to change bibliographic records after initial entry.

the problem of local adjustment of bibliographic records in no way conflicts with the goal of standardized bibliographic data. standardized data provides a quick means of delivering an intelligible package to a variety of users who will adapt the package to meet their particular needs. standardization does not mean making adaptation inefficient or more costly than it need be; rather, standards provide a framework around which the details are filled in. these observations on standardized data formats imply that the library's data must be based on marc records for books, serials, authorities, etc.; and on the ansi standards for summary serials holdings notation, book numbers, library addresses, and so forth.

microscopic data description. at this point, system administrators face a fundamental problem: many of the library's important records have no standard format. the most conspicuous example involves the notation for detailed serials holdings.[10] the only alternative one has when trying to build a system without standardized formats is to rely on "microscopic" description. that is, each and every distinct type of data element that makes up (or can make up) a field in a record must be accounted for and uniquely tagged. in this way, whatever standard format is ultimately set, it will be possible, in principle, to assemble by algorithm the data elements into an arrangement that will be in conformity with the standard. only if the library is using microscopic data description will the library be able to maintain its independence of particular lines of hardware or software. we are convinced that the use of untagged, free-form input will, in the long run, spell disaster.

use of general purpose hardware and software.
many strategies in dealing with library automation involve redesigning standard hardware or software. for example, one vendor has reported an interesting design of mass storage units that improved access time.[11] we feel that future applications should, as much as possible, steer clear of such customized implementations because the standard capabilities of most affordable systems allow sufficient processing power and storage economies even if these capabilities are suboptimal for a particular application. the use of general-purpose hardware and system software promotes system sharing between different installations. moreover, an application based on general-purpose hardware and system software will be easier to maintain and far less vulnerable to changes in personnel. for turnkey installations, the greater the degree of use of general-purpose hardware and software, the better shielded will the installation be against changes in product line or the vendor's ultimate demise. a noteworthy application of this principle of compatibility is seen in the system being developed by the national library of medicine.[12]

system description

the functional capabilities of the virginia tech library system (vtls) have been developed in two software releases, with the third release soon to appear. the initial release met the needs associated with circulation control and also provided rudimentary access to the catalog and serials holdings. the present release has benefited from the use of the marc format, and allows vastly improved catalog access and control. release iii, the comprehensive library system now being developed, will draw together acquisitions, authority control, and serials control with the current capabilities.

vtls release i

the initial release of the system was developed in 1976 to meet needs generated by rapid library growth.
circulation transactions had been increasing at about 10 percent annually for the previous decade and were straining the manually maintained circulation files beyond acceptable limits. the main library* at virginia tech is organized in subject divisions, each essentially "owning" one floor of a 100,000-square-foot facility. a 100,000-square-foot addition to the library had been approved. because virginia tech's library has only one card catalog, some means was necessary to distribute catalog information throughout a facility that was to double its size. after reviewing the alternative means of distributing the catalog (e.g., a duplicate card catalog, photographic reproduction of the catalog, or a com catalog), it was decided to attack both problems, circulation control and remote catalog access, within a single online system.

[*only two quite small branch libraries (architecture and geology) exist on campus. in addition there is a reserve collection located in the washington, d.c., area that supports off-campus graduate programs in the areas of education, business administration, and computer science. all these sites are linked to the system.]

vtls was installed on a full-time basis in august 1976. its first release ran continuously on the library's dedicated hewlett-packard 3000 minicomputer until december 1979. at that time the system held brief bibliographic data for approximately 325,000 monographs and 25,000 journals and other serial titles: records for about half the collection. while the first release ably met its goals, it became clear that it would prove to be an unsuitable host for additional modules involving acquisitions and serials control, primarily because of the brief, fixed-length bibliographic records.
as a result of highly favorable price reductions in computer hardware and improvements in capability, it was possible to think in terms of storing one million marc records online as well as supporting the additional terminals required for a comprehensive library system.

vtls release ii

vtls runs under a single online program for all real-time transactions. the major goals in the design of this program were the following:

1. two conflicting requirements had to be accommodated: first, the program had to be easy to use for library patrons. this is requisite for a system that will eventually replace the card catalog. second, the program had to be practical, efficient, and versatile for its professional users. the keystrokes required had to be minimal, and related screens had to be easily accessible from one to another.
2. the response time had to be good, especially for more frequent transactions.
3. the contents of all screens had to be balanced to provide enough information without being overcrowded and difficult to read or comprehend. further, each screen of vtls had to be arranged by some logical arrangement of the data it contains; for most screens this meant alphabetical sorting of the data according to ala rules.
4. the format of all screens, especially those to be viewed by the patrons, had to be visually pleasing. thus, the use of special symbols (which are so abundant on many computer system displays), nonstandard abbreviations, and locally (and often quite arbitrarily) defined terms were unacceptable.
5. the program had to have security provisions to restrict certain classes of users from addressing particular modules of the program.

considerable effort was spent to satisfy these goals. the first goal was achieved by the "network of screens" approach. the second goal, prompt system response, necessitated the use of the "data buffer method," which, in turn, proved to have other uses (both of these techniques are discussed below). to satisfy goals three and four, a committee of librarians and analysts spent months drafting and reviewing each screen until it was finally approved by the design group. goal five, security provisions, was reached without much difficulty.

network of screens

vtls's data-access system is designed to be used as easily as a road map. this is accomplished by the use of a "network of screens." the network of screens is much like a road map in which a set of related data (a screen displayed in one or more pages) acts as a "city," and the commands that lead from one set to another act as "highways." vtls has nineteen screens including various menu screens, bibliographic screens (see "the data buffer method" below), serial holdings screens, item (physical piece) screens, and screens for patron-related data. the user can "drive" from one "city" to another using system commands.

the system commands are either "global" or "local." global commands, as the name implies, may be entered at any point during the execution of the online program. a local command is peculiar to a given screen. global commands are of two types: search commands and processing commands. search commands are used to access the database by author, title, subject, added entries, call number, lc card number, isbn, issn, patron name, etc. processing commands, on the other hand, initiate procedures such as check-out, renewal, or check-in of items. the user first enters a global (search) command to access one of the screens in the network. from there, local commands that are specific to the current screen can be used. there are three different types of local commands: commands that take the user from one screen to another; commands that page within the current screen; and commands that update data related to the screen.
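as a rough illustration of the road-map idea, the sketch below models screens as nodes and commands as edges. it is a minimal python sketch under stated assumptions, not vtls code: the screen names, the command words, and the drive function are all hypothetical, and only screen-to-screen local commands are modeled (paging and update commands are left out).

```python
"""Toy model of a "network of screens": screens are nodes ("cities") and
commands are edges ("highways"). Every screen name and command word here is
a hypothetical illustration, not an actual VTLS command."""

# Global search commands may be entered from any screen; each one maps to
# the screen on which the search lands.
GLOBAL_SEARCH = {
    "author": "author_list",
    "title": "title_list",
    "patron": "patron",
}

# Local commands are valid only on the screen that defines them.
LOCAL = {
    "author_list": {"select": "bibliographic"},
    "title_list": {"select": "bibliographic"},
    "bibliographic": {"items": "item"},
    "item": {"borrower": "patron"},
    "patron": {"activity": "patron_activity"},
    "patron_activity": {},
}


def drive(commands, start=None):
    """Follow a sequence of commands and return the screens visited."""
    current = start
    visited = []
    for cmd in commands:
        if cmd in GLOBAL_SEARCH:
            # Global command: accepted no matter where the user is.
            current = GLOBAL_SEARCH[cmd]
        elif current is not None and cmd in LOCAL[current]:
            # Local command: meaning depends on the current screen.
            current = LOCAL[current][cmd]
        else:
            raise ValueError(f"command {cmd!r} not valid on screen {current!r}")
        visited.append(current)
    return visited


if __name__ == "__main__":
    route = drive(["author", "select", "items", "borrower", "activity"])
    print(" -> ".join(route))
```

keeping the local commands in a per-screen table mirrors the statement that a local command is peculiar to a given screen: the same command word is simply rejected anywhere else, while a global search can re-enter the network from any point.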
for example, it is possible to start by entering an author search command to access the network and then proceed not only to find what books the author has in the system but also the availability of each of the books. if the books are checked out, information about the patrons who have them can also be reached. this display is called the patron screen. from the patron screen, one can "drive" to the patron activity screen, which displays circulation information about the patrons. thus, each displayed screen leads to another. in fact, the searches can start at ten different screens and proceed in many different ways through the network.

database design

image/3000, hewlett-packard's database management system used by vtls, is designed to be used with fixed-length records. this fact, coupled with the need to sort entries on most screens, created serious problems in the early stages of the system design. but various techniques were devised to overcome these apparent road blocks. figure 1 illustrates the breakdown of the bibliographic record in the database and the way it is linked with piece-specific data. bibliographic data are stored in three distinct groups for subsequent retrieval:

1. controlled vocabulary terms. (authority data set)
2. title and title-like data. (title data set)
3. all remaining bibliographic data; i.e., data that is not indexed. (marc-other data set)

this grouping of the marc record extends to subfields, thus splitting mixed fields such as author-title added entries. when individual fields are parsed in this way, a single field may contribute more than one access point, such as variant forms of author, title, series name, subject, and added entries. access by the standard bibliographic control numbers is effected by use of inverted files (not shown in the figure). a fundamental characteristic of this layout involves the storage of controlled vocabulary terms (i.e., authors and subjects).
regardless of the number of references made to an authority term from different bibliographic records, the controlled vocabulary term is stored only once. the system assigns a unique number (authority id) to each such term and uses this number to keep records of the references made to it in a separate data set (authority bibliographic linkage data set). this particular structure makes an authority control subsystem possible, speeds up online retrieval and display, and economizes mass storage.

[fig. 1. bibliographic layout of the vtls database (simplified).]

the data buffer method

the system displays bibliographic records in two different formats. if the terminal used is designated for librarians, the records are displayed in the marc format (the resulting screen is referred to as the marc screen); otherwise, they are displayed in a screen that is formatted similar to a catalog card. before displaying these screens, the online program collects and formats the data to be displayed and stores it in one of the two "buffer" data sets. the records stored in the buffer data sets are called buffer records. buffer records can be edited, as required, by adding new lines, deleting, or modifying existing character strings. these updates can be executed quickly and without placing much load on the system since they involve little, if any, analysis, indexing, and sorting. thus, the buffer data sets store all bibliographic updates and new data entry of the day. at night, these records are transferred to the rest of the database by a batch program. the data buffer method has had several pronounced effects on the system. by transferring periods of heavy resource demand to off-hours, the system can work with full marc records in a library that has a heavy real-time load of data entry, inquiry, and circulation.
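the division of labor between daytime edits and the nightly batch run can be sketched as follows. this is a simplified python illustration under stated assumptions, not the vtls implementation: the class, its method names, and the use of plain dictionaries in place of image/3000 data sets are all hypothetical, and a counter stands in for the analysis, indexing, and sorting work.

```python
"""Toy model of the "data buffer method": daytime edits touch only a buffer,
and a nightly batch job folds them into the main database. All names and the
record layout are hypothetical, not VTLS internals."""


class BufferedCatalog:
    def __init__(self):
        self.main = {}      # record id -> record text in the indexed database
        self.buffer = {}    # record id -> record text entered or edited today
        self.reindexed = 0  # number of expensive indexing passes performed

    def edit(self, record_id, text):
        """Daytime update: cheap, because no indexing or sorting is done."""
        self.buffer[record_id] = text

    def display(self, record_id):
        """Prefer the buffer copy so that today's edits are visible at once."""
        if record_id in self.buffer:
            return self.buffer[record_id]
        return self.main[record_id]

    def nightly_batch(self):
        """Off-hours job: move every buffer record into the main database."""
        moved = len(self.buffer)
        for record_id, text in self.buffer.items():
            self.main[record_id] = text
            self.reindexed += 1  # stand-in for analysis, indexing, sorting
        self.buffer.clear()
        return moved


if __name__ == "__main__":
    catalog = BufferedCatalog()
    catalog.edit("rec-1", "245 $a design principles")
    catalog.edit("rec-1", "245 $a design principles for a library system")
    print(catalog.display("rec-1"), "| indexing passes so far:", catalog.reindexed)
    print("records moved overnight:", catalog.nightly_batch())
```

the point of the sketch is the cost profile: two edits during the day trigger no indexing at all, and the expensive step runs once per record during the batch window.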
the data buffer approach also improves access efficiency because once a buffer record is prepared for a screen, subsequent searches for the same record are satisfied by the buffer record.

data entry and the oclc interface

the most frequently encountered method of entering marc records into a local computer involves use of tape in the marc ii communications format. alternative methods include the use of microprocessors or digital recorders which "play back" a marc-tagged screen image from oclc or some other bibliographic utility. these alternative methods have the strong advantage of shortening the delay introduced while waiting for a tape to be delivered. we have been able to link the utility's terminal to the data buffer.13 data flows from the utility to the buffer in real time. no intervention in the utility's terminal was required for the local processor to be able to capture the marc-tagged screen. batch programs running on the hp 3000 read records from printer ports of oclc terminals and pass them directly to the data buffer. once a record gets into the data buffer, it is accessible by oclc number so that subsequent editing and linkage to piece-specific data or serial holdings can be made right away in the local system. buffer records can also be created by direct keyboarding of the full array of fixed and variable fields using the vtls terminals.

circulation

as with most other online circulation systems, vtls uses machine-sensible bar-code labels to identify books and borrowers to the system. all efforts have been made to humanize the system. one consequence is that the system does not make decisions better made by responsible staff. thus, two kinds of circulation stations reside side by side. the first is staffed by students who typically work a ten-to-twenty-hour week and historically have shown high turnover.
their circulation stations only deal with inquiries and with heavily used but nondiscretionary transactions: check-out, renewal, and check-in. should problems arise, the borrower is directed to the adjacent station staffed by a full-time employee who, using the system, can articulate circulation policy to borrowers and make decisions with regard to any questions concerning fines, lost books, or reinstatement of invalidated or blocked privileges.

start-up

we found system start-up to be a relatively easy task. it was convenient to use the so-called rolling conversion in which items were labeled upon their initial circulation through the system. the greatest benefit was seen in the first year when the probability that items brought to the circulation desk were already known to the system increased exponentially. after six months this probability had risen to 65 percent with only 10 percent of the circulating collection having been labeled. at the end of the year the probability increased linearly at 0.7 percent per month. after three years of operation, the probability was 90 percent, with approximately 50 percent of the circulating collection having been labeled.

reference use

the ability to distribute catalog access as well as circulation information provides a powerful information tool. a subset of all functions previously described is available to the nonlibrarian users of the system through user-cordial screens. a "help" function may also be initiated at any screen to guide users through the network of screens.

current development

critical to the overall design of vtls is the system's ability to treat serials and continuations. without this capability, the modules being developed to support acquisitions, serials check-in and claiming, and binding, will not function satisfactorily. equally important, the design lays the foundation for authority control by virtue of its use of a dictionary for all controlled vocabulary terms.
thus a name or subject entry is carried internally as a four-byte code, which is translated to the authority entry upon display. another internally coded data element, the bib-id, is designed to handle many of the linkage problems associated with serials and continuations. the bib-id is unique for each marc record.

prior to establishing the serials control modules governing receipt, claiming, and binding, the coded holdings module must be functioning. this module will allow automatic identification of volume (or binding unit) closure and automatic identification of gaps in holdings or overdue receipts. thus, highest priority has been given to the development of this module so that these other modules can, in turn, develop. the holdings module serves two functions: first, it allows the detailed recordings of serials holdings consistent with the principle stated earlier concerning microscopic data description; and second, these microscopic data are coded so that the system can recognize (and predict) particular pieces or binding units in terms of enumerative and chronological data.

the next three areas of development are modules for acquisitions and fund control, serials receipts and binding, and authority control. the final development will be comprehensive management reports. it should be noted that each one of these developments will result in a specific benefit to the user community. the project is incremental in that the development of area a does not mean that area b must be developed for a to have lasting value. this incremental approach offers designers and administrators the advantages associated with an orderly growth in complexity and budget requirements. further, the capabilities of the host hardware and software are stressed in smaller steps than would be the case if the comprehensive system were written and then turned on.
the key move appears to be predefining the scope and capabilities of each stage so that a useful product emerges at its completion, and so that it lays a foundation for the next.

references

1. velma veneziano and james s. aagaard, "cost advantages of total system development," in proceedings of the 1976 clinic on library applications of data processing (urbana, ill.: university of illinois press, 1976), p.133-44.
2. charles payne and others, "the university of chicago data management system," library quarterly 47:1-22 (jan. 1977).
3. audry n. grosch, minicomputers in libraries (new york: knowledge industry press, 1979), 142p.
4. richard degennaro, "wanted: a mini-computer serials system," library journal 102:878-79 (april 15, 1977).
5. john g. christoffersson, "automation at the university of georgia libraries," journal of library automation 12:23-38 (march 1979).
6. stephen m. silberstein, "standards in a national bibliographic network," journal of library automation 10:142-53 (june 1977).
7. network technical architecture group, "message delivery system for the national library and information service network: general requirements," in david c. hartmann, ed., library of congress network planning paper, no.4, 1978, 35p.
8. arlene t. dowell, cataloging with copy (littleton, colo.: libraries unlimited, 1976), 295p.
9. michael roughton, "oclc serials records: errors, omissions, and dependability," journal of academic librarianship 5:316-21 (jan. 1980).
10. tamer uluakar, "needed: a national standard for machine-interpretable representation of serial holdings," rtsd newsletter 6:34 (may/june 1981).
11. c.l. systems, inc., "the libs 100 system: a technological perspective," clsi newsletter, no.6 (fall/winter 1977).
12. lister hill national center for biomedical communications, national library of medicine, "the integrated library system: overview and status" (lhc/ctb internal documentation, bethesda, md., october 1, 1979), 55p.
13. francis j. galligan to pierce, 11 feb. 1980.

tamer uluakar is manager of the virginia tech library automation project. anton r. pierce is planning and research librarian at the university libraries. vinod chachra is director of computing resources and associate professor of industrial engineering.

communications

using the harvesting method to submit etds into proquest: a case study of a lesser-known approach

marielle veve

information technology and libraries | september 2020 https://doi.org/10.6017/ital.v39i3.12197

marielle veve (m.veve@unf.edu) is metadata librarian, university of north florida. © 2020.

abstract

the following case study describes an academic library’s recent experience implementing the harvesting method to submit electronic theses and dissertations (etds) into the proquest dissertations & theses global database (pqdt). in this lesser-known approach, etds are deposited first in the institutional repository (ir), where they get processed, to be later harvested for free by proquest through the ir’s open archives initiative (oai) feed. the method provides a series of advantages over some of the alternative methods, including students’ choice to opt-in or out from proquest, better control over the embargo restrictions, and more customization power without having to rely on overly complicated workflows. institutions interested in adopting a simple, automated, post-ir method to submit etds into proquest, while keeping the local workflow, should benefit from this method.

introduction

the university of north florida (unf) is a midsize public institution established in 1972, with the first theses and dissertations (tds) submitted in 1974.
since then, copies have been deposited in the library, where bibliographic records are created and entered in the library catalog and the online computer library center (oclc). during the period of 1999 to 2012, some tds were also deposited in proquest by the graduate school on behalf of students who decided to. this practice, however, was discontinued in the summer of 2012, when the institutional repository, digital commons, was established and submission to it became mandatory. five years later, in the summer of 2017, interest in getting unf tds hosted in proquest resurfaced. this renewed interest grew out from a desire of some faculty and graduate students to see the institution’s electronic theses and dissertations (etds) posted there, in addition to a recent library subscription to the proquest dissertations & theses global database (pqdt). a month later, conversations between the library and graduate school began on the possibility of resuming hosting unf etds in proquest. consensus was reached that the pqdt database would be a good exposure point for our etds, in addition to the institutional repository (ir), yet some concerns were raised. one of the concerns was cost of the service and who would be paying for it. neither the library nor the graduate school had allocated funds for this. the next concern was the possibility of proquest imposing restrictions that could prevent students, or the university, from posting etds in other places. it was important to make sure there were no such restrictions. another concern was expressed over students entering embargo dates in proquest that do not match the embargo dates selected for the ir. this is a common problem encountered by other libraries.1 for that reason, we wanted to keep the local workflow. the last concern expressed during the conversations was preserving students’ right to opt-in or out from distributing their theses in proquest. 
this is something both the graduate school and library have been adamant about. in higher education, requiring students to submit to proquest is a controversial issue which has raised ethical concerns and has been highly debated over the years.2 once conversations between the library and graduate school were held and concerns were gathered, the library moved ahead to investigate the available options to submit etds into proquest.

literature review

currently, there are three options to submit etds into proquest: (1) submission through the proquest etd administrator tool, (2) submission via file transfer protocol (ftp), and (3) submission through harvests performed by proquest.3

proquest etd administrator submission option

in this option, a proprietary submission tool called proquest etd administrator is used by students, or assigned administrators, to upload etds into proquest. inside the tool, a fixed metadata form is completed with information on the degree, subject terms are selected from a proprietary list, and keywords are provided. the whole administrative and review process gets done inside the tool. afterwards, zip packages with the etds and proquest’s extensible markup language (xml) files are sent to the institution via ftp transfers, or through direct deposits to the ir using the simple web-service offering repository deposit (sword) protocol. the etd administrator submission method presents several shortcomings.
first, the proquest xml metadata that is returned to the institutions must be transformed into ir metadata for ingest in the ir, a process that can be long and labor intensive.4 second, the subject terms supplied in the returned files come from a proprietary list of categories maintained by proquest, which does not match the library of congress subject headings (lcsh) used by libraries.5 third, control over the metadata provided is lost because the metadata form cannot be altered, plus customizations to other parts of the system can be difficult to integrate.6 fourth, there have been issues with students indicating different embargo periods in the proquest and ir publishing options, with instances of students choosing to embargo etds in the ir, while not in proquest.7 lastly, this method does not allow students’ choice, unless the etds are submitted separately in two systems in a process that can be burdensome. ultimately, for these reasons, we found the etd administrator not a suitable option for our institution.

ftp submission option

in this option, an administrator sends zip packages with the institution’s etd files and proquest xml metadata to proquest via ftp.8 at the time of this investigation, there was a $25 charge per etd submitted through this method.9 we did not want to pursue this option because of the charge and the tedious metadata transformations that would be needed between ir and proquest xml schemas. another way to go around this would have been to submit the etds through the vireo application. vireo is an open source, etd management system used by libraries to freely submit etds into proquest via ftp.10 this alternative, however, was not an option for us as our ir, digital commons, does not support the vireo application.

harvesting submission option

this is the latest method available to submit etds into proquest.
in this option, etds are submitted first into an ir, or other internal system, where they get processed to be later harvested by proquest through the ir’s existing open archives initiative (oai) feed.11 at the time of this writing, we were not able to find a single study that documents the use of this method. this option looked appealing and worth pursuing as it met most of our desired criteria. first, with this option, students’ choice would not be compromised as etds would be submitted to proquest after being posted in the ir. second, because the etd administrator would not be used, issues with conflicting embargo dates and unalterable metadata forms would be avoided. in addition, the local workflow would be retained, thus eliminating the need for tedious metadata transformations between proquest and ir schemas. from the available options, this one seemed the most feasible solution for our institution.

implementation of the harvesting method at unf

after research on the different submittal options was performed, the library approached proquest to express interest in depositing our future etds into their system by using a post-ir option. in the first communications, proquest suggested we use the etd administrator to submit etds because it is the most commonly used method. when we expressed interest in the harvesting option, they said “we have not been harvesting from bepress sites” (bepress is the company that makes digital commons) and suggested we use the ftp option instead.12 ten months later, they clarified the harvests could be performed from bepress sites and that the option is free, with the only requirement of a non-exclusive agreement between the university and proquest.
the news allayed both the library’s and the graduate school’s previous concerns, as we would be able to adopt a free method that would neither compromise students’ choice nor restrict students from posting in other places, while keeping the local workflow. after agreement on the submittal method was established, planning and testing of the harvesting method began. the library worked with proquest and bepress to customize the harvesting process while the university’s office of the general counsel worked with proquest on the negotiation process.

negotiation process

before proquest could harvest unf etds, two legal documents needed to be in place. the first document was the theses and dissertations distribution agreement, which specifies the conditions under which etds can be obtained, reproduced, and disseminated by proquest. the document had to be signed by the unf board of trustees and proquest. the agreement stipulated the following conditions:

• the agreement must be non-exclusive.
• the university must make the full-text uniform resource locators (urls) and abstracts of etds available to proquest.
• proquest must harvest the etds from the university’s ir.
• the university and students have the option to elect not to submit individual works or to withdraw them.
• no fees are due from the university or students for the service.
• proquest must include the etds in the pqdt database.

the second document that needed to be in place was the theses and dissertations availability agreement, which grants the university the non-exclusive right to reproduce and distribute the etds. this agreement between students and unf specifies the places where etds can be hosted and the embargo restrictions, if any.
unf already has been using this document as part of its etd workflow, but the document needed to be modified to include the additional option to submit etds into proquest. beginning with the spring 2019 semester, the revised version of the agreement provided students with two hosting alternatives: posting in the ir only or in the ir and proquest.

local steps performed before the harvesting

the workflow begins when students upload their etds and supplemental files (certificate of approval and availability agreements) directly into the digital commons ir. in there, students complete a metadata template with information on the degree and provide keywords related to the thesis. after this, the graduate school reviews the submitted etds and approves them inside the ir platform. next, the library digital projects’ staff downloads the native pdf files of etds, processes them, and creates public and archival versions for each etd. availability agreements are reviewed to determine which students chose to embargo their etds and which ones chose to host them in proquest, in addition to the ir. if students choose to embargo their etds, the embargo dates are entered in the metadata template. if students choose to publish their etds in proquest, a “proquest: yes” option is checked in their metadata template, while students who choose not to host in proquest get a “proquest: no” in their template. (the proquest field is a new administrative field that was added to the etd metadata template, starting with the spring 2019 semester, to assist with the harvesting process. it was designed to alert proquest of the etds that were authorized for harvesting. more detail on its functionality will be provided in the next section.)
the reason library staff enters the proquest and embargo fields on behalf of students is to avoid having students enter incorrect data on the template. following this review, the metadata librarian assigns library of congress subject headings to each etd and creates authority files for the authors. these are also entered in the metadata template. afterwards, the etds get posted in the digital commons’ public display, with the full-text pdf files available only for the non-embargoed etds. information that appears in the public display of digital commons will also appear immediately in the oai feed for harvesting. at this point, two separate processes take place:

1. the metadata librarian harvests the etds’ metadata from the oai feed and converts it into marc records that are sent to oclc, with the ir’s url attached. the workflow is described at https://journal.code4lib.org/articles/11676.
2. on the seventh of each month, proquest harvests the full-text pdf files, with some metadata, of the non-embargoed etds that were authorized for harvesting from the oai feed.

harvesting process (customized for our institution)

to perform the harvests, proquest creates a customized robot for each institution that crawls oai-pmh compliant repositories to harvest metadata and full-text pdf files of etds.13 the robot performs a date-limited oai request to pull everything that has been published or edited in an ir’s publication set during a specific timeframe. information to formulate the date-limited request is provided to proquest by the institution for the first harvest only; subsequently, the process gets done automatically by the robot.
the request contains the following elements:

• base url of the oai repository
• publication set
• metadata prefix or type of metadata
• date range of titles to be harvested

in the particular case of our institution, we needed to customize the robot to limit the harvests to authorized etds only. to achieve this, we worked with bepress to add a new, hidden field at the bottom of our digital commons’ etd metadata template. the field, called proquest, consisted of a dropdown menu with 2 alternatives: “proquest yes” or “proquest no” (see figure 1). the field was mapped to an element in the oai feed that displays the value of “proquest: yes” or “proquest: no,” thus alerting the robot of the etds that were authorized for harvesting and the ones that were not. the element used to map the proquest field in the oai feed is the , which is a qualified dublin core (qdc) element (figure 2). for that reason, the robot needs to perform the harvests from the qdc oai feed in order to see this field.

figure 1. display of the proquest field’s dropdown menu in the metadata template

figure 2. display of the proquest field in the qdc oai feed

after the etds authorized for harvesting have been identified with help from the “proquest: yes” field, the robot narrows down the ones that can be harvested at the present moment by using the element. this element, as the name implies, provides the date when the full text file of an etd becomes available. it also displays in the qdc oai feed (see figure 3). if the date is on or before the monthly harvest day, the etd is currently available for harvesting. if the date is in the future, the robot identifies that etd as embargoed and adds its title to a log of embargoed etds with some basic metadata (including the etd’s author and the last time it was checked).
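the request elements and the two checks described above can be sketched as follows. this is a hedged illustration, not proquest’s robot: the function names, the example base url, the set name, and the dictionary keys (`proquest`, `available`) are hypothetical, and the records are assumed to have been parsed from the feed already. only the query parameters (`verb`, `metadataPrefix`, `set`, `from`, `until`) come from the oai-pmh protocol itself.

```python
from datetime import date
from urllib.parse import urlencode


def build_request(base_url, publication_set, metadata_prefix, start, end):
    """Build a date-limited OAI-PMH ListRecords URL from the four request elements."""
    query = urlencode({
        "verb": "ListRecords",
        "metadataPrefix": metadata_prefix,
        "set": publication_set,
        "from": start.isoformat(),
        "until": end.isoformat(),
    })
    return f"{base_url}?{query}"


def classify(record, harvest_day):
    """Return 'skip', 'embargoed', or 'harvest' for one already-parsed feed record.

    `record` is a hypothetical dict: 'proquest' holds the authorization value
    and 'available' holds the full-text availability date.
    """
    if record.get("proquest") != "proquest: yes":
        return "skip"        # not authorized for harvesting
    if record["available"] > harvest_day:
        return "embargoed"   # authorized, but goes to the embargo log for now
    return "harvest"         # authorized and available on or before harvest day


# Example with made-up values (the URL and set name are placeholders).
url = build_request("https://ir.example.edu/oai", "etd", "qdc",
                    date(2019, 5, 7), date(2019, 6, 7))
```

the ordering of the checks follows the text: authorization is tested first, and the availability date is consulted only for authorized records.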
the log of embargoed etds is then pulled out in the future to identify the etds that come out of embargo so the robot can retrieve them.

figure 3. display of the element in the qdc oai feed

after the etds that are currently available for harvesting have been identified (because they have the “proquest: yes” field and a present or past availability date), the robot performs a harvest of their full-text pdf files by using the third element, which displays at the bottom of records in the oai feed (figure 4). the third element contains a url with direct access to the complete pdf file of etds that are currently not embargoed. etds that are currently on embargo contain a url that redirects the user to a webpage with the message: “the full-text of this etd is currently under embargo. it will be available for download on [future date]” (see figure 5).

figure 4. display of the third element at the bottom of records in the qdc oai feed

figure 5. message that displays in the url of embargoed etds

once the metadata and full-text pdf files of authorized, non-embargoed etds have been obtained by the robot, they get queued for processing by the proquest editorial team, who then assigns them international standard book numbers (isbns) and proquest’s proprietary terms. it takes an average of four to nine weeks for the etds to display in the pqdt database after being harvested. records in the pqdt come with the institutional repository’s original cover page and a copyright statement that leaves copyright to the author. afterwards, the process gets repeated once a month. this frequency can be set to quarterly or semi-annually if desired.

additional points on the harvesting method

handling of etds that come out of embargo.
when the embargo period of an etd expires, the full-text pdf of it becomes automatically available in the ir’s webpage, and consequently, in the third element that displays in the oai record. each month, when the robot prepares to crawl the oai feed, it will first check for the titles in the log of embargoed etds to determine if any of them have become fully available through the third element. the ones that become available are then pulled by the robot through this element. handling of metadata edits performed after the etds have been harvested and published in pqdt. edits performed to metadata of etds will trigger a change of date in the element that displays in the oai records. this change of date will alert the robot of an update that took place in a record, which is then manually edited or re-harvested, depending on the type of update that took place. sending marc records to oclc. as part of the harvesting process, proquest provides free marc records for the etds hosted in their pqdt database. these can be delivered to oclc on behalf of the institution on an irregular basis. records are machine-generated “k” level and come with urls that link to the pqdt database and with proquest’s proprietary subject terms. we requested to be excluded from these deliveries and continue our local practice of sending marc records to oclc with lcsh, authority file headings, and the ir’s urls. notifications of harvests performed by proquest and imports to the pqdt database. when harvests or imports to the pqdt have been performed by proquest, institutions do not get automatically notified. still, they can request to receive scheduled monthly reports of the titles that have been added to the pqdt. unf requested to receive these monthly reports. usage statistics of etds hosted in pqdt. usage statistics of an institution’s etds hosted in the pqdt can be retrieved from a tool called dissertation dashboard. 
this tool is available to the institution’s etd administrators and provides the number of times some aspect of an etd (e.g., citations, abstract viewings, page previews, and downloads) has been accessed through the pqdt database.

royalty payments to authors. students who submit etds through this method are also eligible to receive royalties from proquest.

obstacles faced

during the planning phase, we encountered some obstacles that hindered progress on the implementation. these were:

• amount of time it took to get the ball rolling. initially, we were misled by the assumption we would not be able to use the harvesting method to submit etds into proquest because we were bepress users, as we were originally told, but that ended up not being the case. ten months later, we were notified by the same source that the harvesting option for bepress sites would be possible and doable by proquest. these were ten months that delayed the implementation process.
• amount of time it took to get the paperwork finalized and signed before the harvesting. from the moment first contact was initiated with proquest, to the moment the last agreement was finalized and signed by both parties, 21 months went by. there was a lot of back and forth in the negotiation process and paperwork between the university and proquest.
• inconsistent lines of communication. there were multiple parties involved in the communication process and some of the emails began with one person only to be later transferred to someone else. this lack of consistency in the communication lines made it difficult to determine who was in charge of particular tasks at certain stages of the process.

conclusion and recommendations

although problems were encountered at the beginning, implementation of the harvesting process at unf was a complete success.
once the process started, it ran smoothly without complications. harvests were performed on schedule and no issues with unauthorized content being pulled from the oai were faced. fields used to alert the robot in the oai of the etds authorized for harvesting worked as planned, and so did the embargo log used to identify and pull the out-of-embargo etds. it should be noted that digital commons users who want to exclude embargoed etds from displaying in the oai can do so by setting up an optional yes/no button in their submission form. this button prevents metadata of particular records from displaying in the oai feed. we did not pursue this option because we have been using the etd metadata that displays in the oai to generate the marc records we send to oclc. in addition, we took the necessary precautions to avoid exposing full content of the embargoed etds in the oai feed. institutions planning to use this method should be very careful with the content they display in the oai so that embargoed etds are not mistakenly pulled by proquest. access restrictions can be set by either suppressing the metadata of embargoed etds from displaying in the oai or by suppressing the urls with full access to the embargoed etds. the same precaution should be taken if planning to provide students with the choice to opt-in or out from proquest. altogether, the harvesting option proved to be a reliable solution to submit etds into proquest without having to compromise on students’ choice or rely on complicated workflows with metadata transformations between ir and proquest schemas. institutions interested in adopting a simple, automated, post-ir method, while keeping the local workflow, should benefit from this method.
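the embargo log mentioned in the conclusion can be pictured with a short sketch. it is an assumption-laden illustration rather than a description of proquest’s software: the function name and the entry keys (`title`, `author`, `available`, `last_checked`) are hypothetical, chosen only to match the basic metadata the case study says the log holds.

```python
from datetime import date


def recheck_log(embargo_log, harvest_day):
    """Split a log of embargoed ETDs into released and still-waiting entries.

    Each entry is a hypothetical dict with 'title', 'author', 'available'
    (date the full text opens), and 'last_checked'.
    """
    released, waiting = [], []
    for entry in embargo_log:
        if entry["available"] <= harvest_day:
            released.append(entry)  # out of embargo: can be pulled this month
        else:
            # Still embargoed: keep it in the log and record this check.
            waiting.append({**entry, "last_checked": harvest_day})
    return released, waiting
```

run once per harvest cycle before the new crawl, a routine like this would hand the released titles to the harvester and carry the waiting entries forward to the next month.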
information technology and libraries september 2020 using the harvesting method to submit etds into proquest | veve 9 endnotes 1 dan tam do and laura gewissler, “managing etds: the good, the bad, and the ugly,” in what’s past is prologue: charleston conference proceedings, eds. beth r. bernhardt et al. (west lafayette, in: purdue university press, 2017), 200-04, https://doi.org/10.5703/1288284316661; emily symonds stenberg, september 7, 2016, reply to wendy robertson, “anything to watch out for with etd embargoes?,” digital commons google users group (blog), https://groups.google.com/forum/#!searchin/digitalcommons/embargo$20dates%7csort:da te/digitalcommons/rningtrarny/6byzt9apaqaj. 2 gail p. clement, “american etd dissemination in the age of open access: proquest, noquest, or allowing student choice,” college & research libraries news 74, no. 11 (december 2013): 562– 66, https://doi.org/10.5860/crln.74.11.9039; fuse, 2012-2013, graduate students re-fuse!, https://oaktrust.library.tamu.edu/bitstream/handle/1969.1/152270/graduate%20students %20re-fuse.pdf?sequence=25&isallowed=y. 3 “pqdt submissions options for universities,” proquest, http://contentz.mkt5049.com/lp/43888/382619/pqdtsubmissionsguide_0.pdf . 4 meghan banach bergin and charlotte roh, “systematically populating an ir with etds: launching a retrospective digitization project and collecting current etds,” in making institutional repositories work, eds. burton b. callicott, david scherer, and andrew wesolek (west lafayette, in: purdue university press, 2016), 127–37, https://docs.lib.purdue.edu/purduepress_ebooks/41/. 5 cedar c. middleton, jason w. dean, and mary a. gilbertson, “a process for the original cataloging of theses and dissertations,” cataloging and classification quarterly 53, no. 2 (february 2015): 234–46, https://doi.org/10.1080/01639374.2014.971997. 
6 wendy robertson and rebecca routh, “light on etd’s: out from the shadows” (presentation, annual meeting for the ila/acrl spring conference, cedar rapids, ia, april 23, 2010), http://ir.uiowa.edu/lib_pubs/52/; yuan li, sarah h. theimer, and suzanne m. preate, “campus partnerships advance both etd implementation and ir development: a win-win strategy at syracuse university,” library management 35, no. 4/5 (2014): 398–404, https://doi.org/10.1108/lm-09-2013-0093. 7 do and gewissler, “managing etds,” 202; banach bergin and roh, “systematically populating,” 134; donna o’malley, june 27, 2017, reply to andrew wesolek, “etd embargoes through proquest,” digital commons google users group (blog), https://groups.google.com/forum/#!searchin/digitalcommons/embargo$20proquest%7csort :date/digitalcommons/gadwi8infga/sg7de7sdcaaj. 8 gail p. clement and fred rascoe, “etd management & publishing in the proquest system and the university repository: a comparative analysis,” journal of librarianship and scholarly communication 1, no. 4 (august 2013): 8, http://doi.org/10.7710/2162-3309.1074. 9 “u.s. dissertations publishing services: 2017-2018 fee schedule,” proquest. 
10 “support: proquest export documentation,” vireo users group, https://vireoetd.org/vireo/support/proquest-export-documentation/. 11 “pqdt global submission options, institutional repository + harvesting,” proquest, https://media2.proquest.com/documents/dissertations-submissionsguide.pdf. 12 marlene coles, email message to author, january 19, 2018. 13 “proquest dissertations & theses global harvesting process,” proquest.
smartphones: a potential discovery tool | starkweather and stowers 187 smartphones: a potential discovery tool wendy starkweather and eva stowers the anticipated wide adoption of smartphones by researchers is viewed by the authors as a basis for developing mobile-based services. in response to the unlv libraries’ strategic plan’s focus on experimentation and outreach, the authors investigate the current and potential role of smartphones as a valuable discovery tool for library users. when the dean of libraries announced a discovery mini-conference at the university of nevada las vegas libraries to be held in spring 2009, we saw the opportunity to investigate the potential use of smartphones as a means of getting information and services to students. being enthusiastic users of apple’s iphone, we and the web technical support manager developed a presentation highlighting the iphone’s potential value in an academic library setting.
because wendy is unlv libraries’ director of user services, she was interested in the applicability of smartphones as a tool for users to more easily discover the libraries’ resources and services. eva, as the health sciences librarian, was aware of a long tradition of pda use by medical professionals. indeed, first-year bachelor of science nursing students are required to purchase a pda bundled with select software. together we were drawn to the student-outreach possibilities inherent in new smartphone applications such as twitter, facebook, and myspace.
■ presentation
our brief review of the news and literature about mobile phones in general provided some interesting findings and served as a backdrop for our presentation:
■ a total of 77 percent of internet experts agreed that the mobile phone would be “the primary connection tool” for most people in the world by 2020.1 the number of smartphone users is expected to top 100 million by 2013. there are currently 25 million smartphone users, with sales in north america having grown 69 percent in 2008.2
■ smartphones offer a combination of technologies, including gps tracking, digital cameras, and digital music, as well as more than fifty thousand specialized apps for the iphone and new ones being designed for the blackberry and the palm pre.3 the palm pre offered fewer than twenty applications at its launch, but one million application downloads had been performed by june 24, 2009, less than a month after launch.4
■ the 2009 horizon report predicts that the time to adoption of these mobile devices in the educational context will be “one year or less.”5
data gathered from campus users also was presented, providing another context. in march 2009, a survey of university of california, davis (uc-davis) students showed that 43 percent owned a smartphone.6 uc-davis is participating in apple’s university education forum.
here at unlv, 37 percent of students and 26 percent of faculty and staff own a smartphone.7 the presentation itself highlighted the mobile applications that were being developed in several libraries to enhance student research, provide library instruction, and promote library services. two examples were abilene christian university (http://www.acu.edu/technology/ mobilelearning/index.html), which in fall 2008 distributed iphones and ipod touches to the incoming freshman class; and stanford university (http://www.stanford .edu/services/wirelessdevice/iphone/) which participates in “itunes u” (http://itunes.stanford.edu/). if the libraries were to move forward with smartphone technologies, it would be following the lead of such universities. readers also may be interested in joan lippincott’s recent concise summary of the implications of mobile technologies for academic libraries as well as the chapter on library mobile initiatives in the july 2008 library technology report.8 n goals: a balancing act ultimately the goal for many of these efforts is to be where the users are. this aspiration is spelled out in unlv libraries’ new strategic plan relating to infrastructure evolution, namely, “work towards an interface and system architecture that incorporates our resources, internal and external, and allows the user to access from their preferred starting point.”9 while such a goal is laudable and fits very well into the discovery emphasis of the mini-conference presentation, we are well aware of the need for further investigation before proceeding directly to full-scale development of a complete suite of mobile services for our users. of critical importance is ascertaining where our users are and determining whether they want us to be there and in what capacity. the value of this effort is demonstrated in booth’s research report on student interest in emerging technologies at ohio state university. 
the report includes the results of an extensive environmental survey of their wendy starkweather (wendy.starkweather@unlv.edu) is director, user services division, and eva stowers (eva.stowers @unlv.edu) is medical/health sciences librarian at the university of nevada las vegas libraries. 188 information technology and libraries | december 2009 library users. the study is part of ohio state’s effort to actualize their culture of assessment and continuous learning and to use “extant local knowledge of user populations and library goals” to inform “homegrown studies to illuminate contextual nuance and character, customization that can be difficult to achieve when using externally developed survey instruments.”10 unlv libraries are attempting to balance early experimentation and more extensive data-driven decision-making. the recently adopted strategic plan includes specific directions associated with both efforts. for experimentation, the direction states, “encourage staff to experiment with, explore, and share innovative and creative applications of technology.”11 to that end, we have begun working with our colleagues to introduce easy, small-scale efforts designed to test the waters of mobile technology use through small pilot projects. “text-a-librarian” has been added to our existing group of virtual reference service, and we introduced a “text the call number and record” service to our library’s opac in july 2009. 
unlv libraries’ strategic plan helps foster the healthy balance by directing library staff to “emphasize data collection and other evidence based approaches needed to assess efficiency and effectiveness of multiple modes and formats of access/ownership” and “collaborate to educate faculty and others regarding ways to incorporate library collections and services into education experiences for students.”12 action items associated with these directions will help the libraries learn and apply information specific to their users as the libraries further adopt and integrate mobile technologies into their services. as we begin our planning in earnest, we look forward to our own set of valuable discoveries. references 1. janna anderson and lee rainie, the future of the internet iii, pew internet & american life project, http://www.pewinternet .org/~/media//files/reports/2008/pip_futureinternet3.pdf .pdf (accessed july 20, 2009). 2. sam churchill, “smartphone users: 110m by 2013,” blog entry, mar. 24, 2009, dailywireless.org, http://www.daily wireless.org/2009/03/24/smartphone-users-100m-by-2013 (accessed july 20, 2009). 3. mg siegler, “state of the iphone ecosystem: 40 million devices and 50,000 apps,” blog entry, june 8, 2009, tech crunch, http://www.techcrunch.com/2009/06/08/40-million-iphones -and-ipod-touches-and-50000-apps (accessed july 20, 2009). 4. jenna wortham, “palm app catalog hits a million downloads,” blog entry, june 24, 2009, new york times technology, http://bits.blogs.nytimes.com/2009/06/24/palm-app-cataloghits-a-million-downloads (accessed july 20, 2009). 5. larry johnson, alan levine, and rachel smith, horizon report, 2009 edition (austin, tex.: the new media consortium, 2009), http://www.nmc.org/pdf/2009-horizon-report.pdf (accessed july 20, 2009). 6. university of california, davis. “more than 40% of campus students own smartphones, yearly tech survey says,” technews, http://technews.ucdavis.edu/news2.cfm?id=1752 (accessed july 20, 2009). 7. 
university of nevada las vegas, office of information technology, “student technology survey report: 2008– 2009,” http://oit.unlv.edu/sites/default/files/survey/survey results2008_students3_27_09.pdf (accessed july 20, 2009). 8. joan lippincott, “mobile technologies, mobile users: implications for academic libraries,” arl bi-monthly report 261 (dec. 2008), http://www.arl.org/bm~doc/arl-br-261-mobile .pdf. (accessed july 20, 2009); ellyssa kroski, “library mobile initiatives,” library technology reports 44, no. 5 (july 2008): 33–38. 9. “unlv libraries strategic plan 2009–2011,” http://www .library.unlv.edu/about/strategic_plan09-11.pdf (accessed july 20, 2009): 2. 10. char booth, informing innovation: tracking student interest in emerging library technologies at ohio university (chicago: association of college and research libraries, 2009), http:// www.ala.org/ala/mgrps/divs/acrl/publications/digital/ ii-booth.pdf (accessed july 20, 2009); “unlv libraries strategic plan 2009–2011,” 6. 11. “unlv libraries strategic plan 2009–2011,” 2. 12. ibid. 76 information technology and libraries | june 2010 in this paper we discuss the design space of methods for integrating information from web services into websites. we focus primarily on client-side mash-ups, in which code running in the user’s browser contacts web services directly without the assistance of an intermediary server or proxy. to create such mash-ups, we advocate the use of “widgets,” which are easy-to-use, customizable html elements whose use does not require programming knowledge. although the techniques we discuss apply to any web-based information system, we specifically consider how an opac can become both the target of web services integration and also a web service that provides information to be integrated elsewhere. we describe three widget libraries we have developed, which provide access to four web services. these libraries have been deployed by us and others. 
our contributions are twofold: we give practitioners an insight into the trade-offs surrounding the appropriate choice of mash-up model, and we present the specific designs and use examples of three concrete widget libraries librarians can directly use or adapt. all software described in this paper is available under the lgpl open source license. ■■ background web-based information systems use a client-server architecture in which the server sends html markup to the user’s browser, which then renders this html and displays it to the user. along with html markup, a server may send javascript code that executes in the user’s browser. this javascript code can in turn contact the original server or additional servers and include information obtained from them into the rendered content while it is being displayed. this basic architecture allows for myriad possible design choices and combinations for mash-ups. each design choice has implications to ease of use, customizability, programming requirements, hosting requirements, scalability, latency, and availability. server-side mash-ups in a server-side mash-up design, shown in figure 1, the mash-up server contacts the base server and each source when it receives a request from a client. it combines the information received from the base server and the sources and sends the combined html to the client. server-side mash-up systems that combine base and mash-up servers are also referred to as data mash-up systems. such data mash-up systems typically provide a web-based configuration front-end that allows users to select data sources, specify the manner in which they are combined, and to create a layout for the entire mash-up. godmar back and annette bailey web services and widgets for library information systems as more libraries integrate information from web services to enhance their online public displays, techniques that facilitate this integration are needed. 
this paper presents a technique for such integration that is based on html widgets. we discuss three example systems (google book classes, tictoclookup, and majax) that implement this technique. these systems can be easily adapted without requiring programming experience or expensive hosting. t o improve the usefulness and quality of their online public access catalogs (opacs), more and more librarians include information from additional sources into their public displays.1 examples of such sources include web services that provide additional bibliographic information, social bookmarking and tagging information, book reviews, alternative sources for bibliographic items, table-of-contents previews, and excerpts. as new web services emerge, librarians quickly integrate them to enhance the quality of their opac displays. conversely, librarians are interested in opening the bibliographic, holdings, and circulation information contained in their opacs for inclusion into other web offerings they or others maintain. for example, by turning their opac into a web service, subject librarians can include up-to-the-minute circulation information in subject or resource guides. similarly, university instructors can use an opac’s metadata records to display citation information ready for import into citation management software on their course pages. the ability to easily create such “mash-up” pages is crucial for increasing the visibility and reach of the digital resources libraries provide. although the technology to use web services to create mash-ups is well known, several practical requirements must be met to facilitate its widespread use. first, any environment providing for such integration should be easy to use, even for librarians with limited programming background. this ease of use must extend to environments that include proprietary systems, such as vendor-provided opacs. 
second, integration must be seamless and customizable, allowing for local display preferences and flexible styling. third, the setup, hosting, and maintenance of any necessary infrastructure must be low-cost and should maximize the use of already available or freely accessible resources. fourth, performance must be acceptable, both in terms of latency and scalability.2 godmar back (gback@cs.vt.edu) is assistant professor, department of computer science and annette bailey (afbailey@vt.edu) is assistant professor, university libraries, virginia tech university, blacksburg. web services and widgets for library information systems | back and bailey 77 examples of such systems include dapper and yahoo! pipes.3 these systems require very little programming knowledge, but they limit mash-up creators to the functionality supported by a particular system and do not allow the user to leverage the layout and functionality of an existing base server, such as an existing opac. integrating server-side mash-up systems with proprietary opacs as the base server is difficult because the mash-up server must parse the opac’s output before integrating any additional information. moreover, users must now visit—or be redirected to—the url of the mash-up server. although some emerging extensible opac designs provide the ability to include information from external sources directly and easily, most currently deployed systems do not.4 in addition, those mash-up servers that do usually require server-side programming to retrieve and integrate the information coming from the mash-up sources into the page. the availability of software libraries and the use of special purpose markup languages may mitigate this requirement in the future. from a performance scalability point of view, the mash-up server is a bottleneck in server-side mash-ups and therefore must be made large enough to handle the expected load of end-user requests. 
on the other hand, the caching of data retrieved from mash-up sources is simple to implement in this arrangement because only the mash-up server contacts these sources. such caching reduces the frequency with which requests have to be sent to sources if their data is cacheable, that is, if realtime information is not required. the latency in this design is the sum of the time required for the client to send a request to the mashup server and receive a reply, plus the processing time required by the server, plus the time incurred by sending a request and receiving a reply from the last responding mash-up source. this model assumes that the mash-up server contacts all sources in parallel, or as soon as the server knows that information from a source should be included in a page. the availability of the system depends on the availability of all mash-up sources. if a mash-up source does not respond, the end user must wait until such failure is apparent to the mash-up server via a timeout. finally, because the mash-up server acts as a client to the base and source servers, no additional security considerations apply with respect to which sources may be contacted. there also are no restrictions on the data interchange format used by source servers as long as the mash-up server is able to parse the data returned. client-side mash-ups in a client-side setup, shown in figure 2, the base server sends only a partial website to the client, along with javascript code that instructs the client which other sources of information to contact. when executed in the browser, this javascript code retrieves the information from the mash-up sources directly and completes the mash-up. the primary appeal of client-side mashing is that no mash-up server is required, and thus the url that users visit does not change. consequently, the mash-up server is no longer a bottleneck. 
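the latency reasoning for the server-side design can be made concrete with a small sketch. the function below is illustrative only: its name, its parameters, and the use of `None` to mark a source that never answers are assumptions made for this example, not part of any system described here. it adds the client's round trip to the mash-up server, the server's processing time, and the wait for the slowest source, capped by the server's timeout.

```python
from typing import Optional, Sequence


def server_side_latency(
    client_round_trip: float,
    processing_time: float,
    source_round_trips: Sequence[Optional[float]],
    timeout: float,
) -> float:
    """Estimate end-user latency (seconds) for a server-side mash-up.

    Assumes the mash-up server contacts all sources in parallel, so only
    the slowest source matters. A source given as None never responds and
    costs the full timeout (hypothetical convention for this sketch).
    """
    waits = [
        timeout if rtt is None else min(rtt, timeout)
        for rtt in source_round_trips
    ]
    slowest = max(waits, default=0.0)  # no sources means no extra wait
    return client_round_trip + processing_time + slowest
```

under these assumptions a single unresponsive source dominates the total, which is the availability concern raised above: the end user waits for the full timeout even if every other source answered quickly.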
equally important, no maintenance is required for this server, which is particularly relevant when libraries use turnkey solutions that restrict administrative access to the machine housing their opac. on the other hand, without a mash-up server, results from mash-up sources can no longer be centrally cached. thus the mash-up sources themselves must be sufficiently scalable to handle the expected number of requests. as a load-reducing strategy, mash-up sources can label their results with appropriate expiration times to influence the caching of results in the clients’ browsers. availability is increased because the mash-up degrades gracefully if some of the mash-up sources fail, since the information from the remaining sources can still be displayed to the user. assuming that requests are sent by the client in parallel or as soon as possible, and assuming that each mash-up source responds with similar latency to requests sent by the user’s browser as to requests sent by a mash-up server, the latency for a client-side mash-up is similar to the server-side mash-up. however, unlike in the server-side approach, the page designer has the option to display partial results to the user while some requests are still in progress, or even to delay sending some requests until the user explicitly requests the data by clicking on a link or other element on the page.
figure 1. server-side mash-up construction
figure 2. client-side mash-up construction
because client-side mash-ups rely on javascript code to contact web services directly, they are subject to a number of restrictions that stem from the security model governing the execution of javascript code in current browsers. this security model is designed to protect the user from malicious websites that could exploit client-side code and abuse the user’s credentials to retrieve html or xml data from other websites to which a user has access.
such malicious code could then relay this potentially sensitive data back to the malicious site. to prevent such attacks, the security model allows the retrieval of html text or xml data only from sites within the same domain as the origin site, a policy commonly known as the same-origin policy. in figure 2, sources a and b come from the same domain as the page the user visits. the restrictions of the same-origin policy can be avoided by using the javascript object notation (json) interchange format.5 because client-side code may retrieve and execute javascript code served from any domain, web services that are not co-located with the origin site can make their results available using json. doing so facilitates their inclusion into any page, independent of the domain from which it is served (see source c in figure 2). many existing web services already provide an option to return data in json format, perhaps along with other formats such as xml. for web services that do not, a proxy server may be required to translate the data coming from the service into json. if the implementation of a proxy server is not feasible, the web service is usable only on pages within the same domain as the website using it. client-side mash-ups lend themselves naturally to enhancing the functionality of existing, proprietary opac systems, particularly when a vendor provides only limited extensibility. because they do not require server-side programming, the absence of a suitable vendor-provided server-side programming interface does not prevent their creation. oftentimes, vendor-provided templates or variables can be suitably adapted to send the necessary html markup and javascript code to the client. the amount of javascript code a librarian needs to write (or copy from a provided example) determines both the likelihood of adoption and the maintainability of a given mash-up creation.
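a proxy that translates a service's data into json for cross-domain use typically wraps the data in a call to a function named by the requester, a pattern often called jsonp. the sketch below shows that wrapping step only. `to_jsonp` and the callback-name check are assumptions made for illustration; fetching and converting the upstream data is left out.

```python
import json
import re

# Only plain JavaScript identifiers are accepted as callback names, so that
# a caller cannot inject arbitrary script through the callback parameter.
_CALLBACK_NAME = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*")


def to_jsonp(callback: str, data) -> str:
    """Wrap JSON-serializable data in a call to the named callback.

    Hypothetical helper: the resulting text is JavaScript that a page from
    any domain can load with a <script> tag and have executed.
    """
    if not _CALLBACK_NAME.fullmatch(callback):
        raise ValueError("callback must be a plain JavaScript identifier")
    return f"{callback}({json.dumps(data)});"


print(to_jsonp("process", {"preview": "partial"}))
```

the validation step is a design choice of this sketch: because the callback name is echoed into executable script, an unchecked value would let a requester place arbitrary code in the response.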
the less javascript code there is to write, the larger the group of librarians who feel comfortable trying and adopting a given implementation. the approach of using html widgets hides the use of javascript almost entirely from the mash-up creator. html widgets represent specially composed markup, which will be replaced with information coming from a mash-up source when the page is rendered. because the necessary code is contained in a javascript library, adapters do not need to understand programming to use the information coming from the web service. finally, html widgets are also preferable for javascript-savvy users because they create a layer of abstraction over the complexity and browser dependencies inherent in javascript programming.
■■ the google book classes widget library
to illustrate our approach, we present a first example that allows the integration of data obtained from google book search into any website, including opac pages. google book search provides access to google’s database of book metadata and contents. because of the company’s book scanning activities as well as through agreements with publishers, google hosts scanned images of many book jackets as well as partial or even full previews for some books. many libraries are interested in either using the book jackets when displaying opac records or alerting their users if google can provide a partial or full view of an item a user selected in their catalog, or both.6 this service can help users decide whether to borrow the book from the library. the google book search dynamic link api the google book search dynamic link api is a json-based web service through which google provides certain metadata for items it has indexed. it can be queried using bibliographic identifiers such as isbn, oclc number, or library of congress control number (lccn).
it returns a small set of data that includes the url of a book jacket thumbnail image, the url of a page with bibliographic information, the url of a preview page (if available), as well as information about the extent of any preview and whether the preview viewer can be embedded directly into other pages. table 1 shows the json result returned for an example isbn. widgetization to facilitate the easy integration of this service into websites without javascript programming, we developed a widget library. from the adapter’s perspective, the use of these widgets is extremely simple. the adapter places html <span> or <div> tags into the page where they want data from google book search to display. these tags contain an html <title> attribute that acts as an identifier to describe the bibliographic item for which information should be retrieved. it may contain its isbn, oclc number, or lccn. in addition, the tags also contain one or more html <class> attributes to describe which processing should be done with the information retrieved from google to integrate it into the page. these classes can be combined with a list of traditional css classes in the <class> attribute to apply further style and formatting control. examples as an example, consider the following html an adapter may use in a page:
<span title="isbn:0596000278" class="gbs-thumbnail gbs-link-to-preview"></span>
when processed by the google book classes widget library, the class “gbs-thumbnail” instructs the widget to embed a thumbnail image of the book jacket for isbn 0596000278, and “gbs-link-to-preview” provides instructions to wrap the <span> tag in a hyperlink pointing to google’s preview page. the result is as if the server had contacted google’s web service and constructed the html shown in example 1 in table 2, but the mash-up creator does not need to be concerned with the mechanics of contacting google’s service and making the necessary manipulations to the document. example 2 in table 2 demonstrates a second possible use of the widget. in this example, the creator’s intent is to display an image that links to google’s information page if and only if google provides at least a partial preview for the book in question. this goal is accomplished by placing the image inside the span and using style="display:none" to make the span initially invisible. the span is made visible only if a preview is available at google, displaying the hyperlinked image. the full list of features supported by the google book classes widget library can be found in table 3.
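the conditional classes can be read as a small decision rule. the sketch below is a python rendering of that rule for illustration only; the actual widget library runs as javascript in the browser. the function name, the use of `None` for a lookup that returned no book information, and the treatment of conditional classes in that case are assumptions.

```python
from typing import Optional

# Viewability values under which each conditional class keeps its element.
# Class names follow the "gbs-if-..." classes of the widget library.
_CONDITIONS = {
    "gbs-if-noview": {"noview"},
    "gbs-if-partial": {"partial"},
    "gbs-if-full": {"full"},
    "gbs-if-partial-or-full": {"partial", "full"},
}


def keep_element(class_attr: str, preview: Optional[str]) -> bool:
    """Decide whether a span/div stays on the page.

    class_attr is the element's class attribute; preview is the viewability
    reported for the item ("noview", "partial", "full"), or None when no
    book information was returned (assumed convention for this sketch).
    """
    classes = set(class_attr.split())
    conditional = classes & _CONDITIONS.keys()
    if preview is None:
        # Assumption: with no data, conditional elements cannot qualify.
        return not conditional and "gbs-remove-on-failure" not in classes
    return all(preview in _CONDITIONS[name] for name in conditional)


print(keep_element("gbs-link-to-info gbs-if-partial-or-full", "partial"))
```

ordinary css classes mixed into the same attribute are ignored by the rule, which mirrors the point that widget classes and styling classes can share one class list.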
integration with legacy opacs

the approach described thus far assumes that the mash-up creator has sufficient control over the html markup that is sent to the user. this assumption does not always hold if the html is produced by a vendor-provided system, since such systems automatically generate most of the html used to display opac search results or individual bibliographic records. if the opac provides an extension system, such as a facility to embed customized links to external resources, it may be used to generate the necessary html by utilizing variables (e.g., “@#isbn@” for isbn numbers) set by the opac software.

if no extension facility exists, accommodations by the widget library are needed to maintain the goal of not requiring any programming on the part of the adapter. we implemented such accommodations to facilitate the use of google book classes within a iii millennium opac.7 we used magic strings such as “isbn:millennium.record” in a title attribute to instruct the widget library to harvest the isbn from the current page via screen scraping. figure 3 provides an example of how a google book classes widget can be integrated into an opac search results page.

table 1. sample request and response for google book search dynamic link api

request:
http://books.google.com/books?bibkeys=isbn:0596000278&jscmd=viewapi&callback=process

json response:
process({
  "isbn:0596000278": {
    "bib_key": "isbn:0596000278",
    "info_url": "http://books.google.com/books?id=ezqe1hh91q4c\x26source=gbs_viewapi",
    "preview_url": "http://books.google.com/books?id=ezqe1hh91q4c\x26printsec=frontcover\x26source=gbs_viewapi",
    "thumbnail_url": "http://bks4.books.google.com/books?id=ezqe1hh91q4c\x26printsec=frontcover\x26img=1\x26zoom=5\x26sig=acfu3u2d1usnxw9baqd94u2nc3quwhjn2a",
    "preview": "partial",
    "embeddable": true
  }
});

80 information technology and libraries | june 2010

table 2. example of client-side processing by the google book classes widget library

example 1

html written by adapter:
<span title="isbn:0596000278" class="gbs-thumbnail gbs-link-to-preview"></span>

resultant html after client-side processing:
<a href="http://books.google.com/books?id=ezqe1hh91q4c&amp;printsec=frontcover&amp;source=gbs_viewapi">
  <span title="" class="gbs-thumbnail gbs-link-to-preview">
    <img src="http://bks3.books.google.com/books?id=ezqe1hh91q4c&amp;printsec=frontcover&amp;img=1&amp;zoom=5&amp;sig=acfu3u2d1usnxw9baqd94u2nc3quwhjn2a" />
  </span>
</a>

browser display: [image not reproduced]

example 2

html written by adapter:
<span style="display: none" title="isbn:0596000278" class="gbs-link-to-info gbs-if-partial-or-full">
  <img src="http://www.google.com/intl/en/googlebooks/images/gbs_preview_button1.gif" />
</span>

resultant html after client-side processing:
<a href="http://books.google.com/books?id=ezqe1hh91q4c&amp;source=gbs_viewapi">
  <span title="" class="gbs-link-to-info gbs-if-partial-or-full">
    <img src="http://www.google.com/intl/en/googlebooks/images/gbs_preview_button1.gif" />
  </span>
</a>

browser display: [image not reproduced]

table 3. supported google book classes

google book class | meaning
gbs-thumbnail | include an <img...> embedding the thumbnail image
gbs-link-to-preview | wrap span/div in link to preview at google book search (gbs)
gbs-link-to-info | wrap span/div in link to info page at gbs
gbs-link-to-thumbnail | wrap span/div in link to thumbnail at gbs
gbs-embed-viewer | directly embed a viewer for book’s content into the page, if possible
gbs-if-noview | keep this span/div only if gbs reports that book’s viewability is “noview”
gbs-if-partial-or-full | keep this span/div only if gbs reports that book’s viewability is at least “partial”
gbs-if-partial | keep this span/div only if gbs reports that book’s viewability is “partial”
gbs-if-full | keep this span/div only if gbs reports that book’s viewability is “full”
gbs-remove-on-failure | remove this span/div if gbs doesn’t return book information for this item

web services and widgets for library information systems | back and bailey 81

■■ the tictoclookup widget library

the tictocs journal table of contents service is a free online service that allows academic researchers and other users to keep up with newly published research by giving them access to thousands of journal tables of contents from multiple publishers.8 the tictocs consortium compiles and maintains a dataset that maps issns and journal titles to rss-feed urls for the journals’ tables of contents.

the tictoclookup web service

we used the tictocs dataset to create a simple json web service called “tictoclookup” that returns rss-feed urls when queried by issn and, optionally, by journal title. table 4 shows an example query and response.
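a lookup against this service can be sketched as follows. this is a minimal illustration in python rather than the javascript used by the widget library. the url pattern and the jsoncallback parameter follow the sample request in table 4; the function names build_lookup_url, parse_jsonp, and rss_feeds are hypothetical, and the sample response is hard-coded because the service may no longer be reachable.

```python
import json
import re
from urllib.parse import quote, urlencode

# base url as it appears in the sample request of table 4
BASE_URL = "http://tictoclookup.appspot.com"


def build_lookup_url(issn, title=None, callback=None):
    """build a lookup url for an issn, optionally adding a title and a jsonp callback."""
    params = {}
    if title:
        params["title"] = title
    if callback:
        params["jsoncallback"] = callback
    url = f"{BASE_URL}/{quote(issn)}"
    return f"{url}?{urlencode(params)}" if params else url


def parse_jsonp(text):
    """strip a jsonp wrapper such as process({...}); and decode the json payload."""
    match = re.fullmatch(r"\s*[\w$.]+\s*\((.*)\)\s*;?\s*", text, re.DOTALL)
    if match is None:
        raise ValueError("not a jsonp response")
    return json.loads(match.group(1))


def rss_feeds(response):
    """return the rss-feed urls of all records in a decoded response."""
    return [record["rssfeed"] for record in response.get("records", [])]


# hard-coded copy of the sample response, with the feed url quoted as valid json requires
SAMPLE_RESPONSE = """process({
  "lastmod": "wed apr 29 05:42:36 2009",
  "records": [{"title": "nature",
               "rssfeed": "http://www.nature.com/nature/current_issue/rss"}],
  "issn": "00280836"
});"""

if __name__ == "__main__":
    print(build_lookup_url("0028-0836", title="nature", callback="process"))
    print(rss_feeds(parse_jsonp(SAMPLE_RESPONSE)))
```

in a browser, the jsonp wrapper is what lets a page load the response through a script element; a server-side caller could omit the callback and decode the json directly.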
to accommodate different hosting scenarios, we created two implementations of the tictoclookup service: a standalone implementation and a cloud-based one.

the standalone version is implemented as a python web application conformant to the web server gateway interface (wsgi) specification. hosting this version requires access to a web server that supports a wsgi-compatible environment, such as apache’s mod_wsgi. the python application reads the tictocs dataset and responds to lookup requests for specific issns. a cron job periodically downloads the most up-to-date version of the dataset.

the cloud version of the tictoclookup service is implemented as a google app engine (gae) application. it uses the highly scalable and highly available gae datastore to store tictocs data records. gae applications run on servers located in google’s regional data centers so that requests are handled by a data center geographically close to the requesting client. as of june 2009, google hosting of gae applications is free, which includes a free allotment of several computational resources. for each application, gae allows quotas of up to 1.3 million requests and the use of up to 10 gb of bandwidth per twenty-four-hour period. although this capacity is sufficient for the purposes of many small and medium-size institutions, additional capacity can be purchased at a small cost.

widgetization

to facilitate the easy integration of this service into websites without javascript programming, we developed a widget library. like google book classes, this widget library is controlled via html attributes associated with html <span> or <div> tags that are placed into the page where the user decides to display data from the tictoclookup service. the html title attribute identifies the journal by its issn or by its issn and title. as with google book classes, the html class attribute describes the desired processing and may also contain traditional css classes.

figure 3. sample use of google book classes in an opac results page

table 4. sample request and response for tictocs lookup web service

request:
http://tictoclookup.appspot.com/0028-0836?title=nature&jsoncallback=process

json response:
process({
  "lastmod": "wed apr 29 05:42:36 2009",
  "records": [{
    "title": "nature",
    "rssfeed": "http://www.nature.com/nature/current_issue/rss"
  }],
  "issn": "00280836"
});

example

consider the following html, which an adapter may use in a page:

<span style="display:none" class="tictoc-link tictoc-preview tictoc-alternate-link" title="issn:00280836: nature">
  click to subscribe to table of contents for this journal
</span>

when processed by the tictoclookup widget library, the class “tictoc-link” instructs the widget to wrap the span in a link to the rss feed at which the table of contents is published, allowing users to subscribe to it. the class “tictoc-preview” associates a tooltip element with the span, which displays the first entries of the feed when the user hovers over the link. we use the google feeds api, another json-based web service, to retrieve a cached copy of the feed. the “tictoc-alternate-link” class places an alternate link into the current document, which in some browsers triggers the display of the rss feed icon in the status bar. the <span> element, which is initially invisible, is made visible if and only if the tictoclookup service returns information for the given pair of issn and title. figure 4 provides a screenshot of the display if the user hovers over the link.

figure 4. sample use of tictoclookup classes

as with google book classes, the mash-up creator does not need to be concerned with the mechanics of contacting the tictoclookup web service and making the necessary manipulations to the document. table 5 provides a complete overview of the classes tictoclookup supports.
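how a widget library might interpret this markup can be sketched as follows. the sketch is in python for illustration only (the actual library is javascript and is not reproduced in the paper); the names parse_title, plan_actions, and SIMPLE_ACTIONS are hypothetical, and the parsing rules are an assumption inferred from the example title “issn:00280836: nature” and the class names in table 5.

```python
import re

# tictoc classes without a numeric argument, mapped to a hypothetical action name
SIMPLE_ACTIONS = {
    "tictoc-link": "link",
    "tictoc-preview": "preview",
    "tictoc-alternate-link": "alternate-link",
    "tictoc-append-title": "append-title",
}


def parse_title(title):
    """split a title attribute such as 'issn:00280836: nature' into (issn, journal title or None)."""
    match = re.fullmatch(r"issn:\s*(\d{4}-?\d{3}[\dxX])\s*(?::\s*(.*))?", title.strip())
    if match is None:
        # magic strings such as 'issn:millennium.issnandtitle' are not handled in this sketch
        raise ValueError(f"unrecognized title attribute: {title!r}")
    issn = match.group(1).replace("-", "").upper()
    journal = (match.group(2) or "").strip() or None
    return issn, journal


def plan_actions(class_attr):
    """separate tictoc classes (returned as actions) from ordinary css classes."""
    actions, css = [], []
    for name in class_attr.split():
        embed = re.fullmatch(r"tictoc-embed-(\d+)", name)
        if embed:
            actions.append(("embed", int(embed.group(1))))
        elif name in SIMPLE_ACTIONS:
            actions.append((SIMPLE_ACTIONS[name], None))
        else:
            css.append(name)
    return actions, css


if __name__ == "__main__":
    print(parse_title("issn:00280836: nature"))
    print(plan_actions("tictoc-link tictoc-preview tictoc-alternate-link"))
```

keeping unrecognized class names in a separate list reflects the statement that the class attribute may also carry traditional css classes, which the widget should leave untouched.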
integration with legacy opacs

as with the google book classes widget library, we implemented provisions that allow the use of tictoclookup classes on pages over which the mash-up creator has limited control. for instance, specifying a title attribute of “issn:millennium.issnandtitle” harvests the issn and journal title from iii millennium’s record display page.

table 5. supported tictoclookup classes

tictoclookup class | meaning
tictoc-link | wrap span/div in link to table of contents
tictoc-preview | display tooltip with preview of current entries
tictoc-embed-n | embed preview of first n entries
tictoc-alternate-link | insert <link rel="alternate"> into document
tictoc-append-title | append the title of the journal to the span/div

■■ majax

whereas the widget libraries discussed thus far integrate external web services into an opac display, majax is a widget library that integrates information coming from an opac into other pages, such as resource guides or course displays. majax is designed for use with a iii millennium integrated library system (ils) whose vendor does not provide a web-services interface. the techniques we used, however, extend to other opacs as well. like many legacy opacs, millennium lacks not only a web-services interface but any programming interface to the records contained in the system, and it does not provide access to the database or file system of the machine housing the opac.

providing opac data as a web service

we implemented two methods to access records from the millennium opac using bibliographic identifiers such as isbn, oclc number, bibliographic record number, and item title. both methods provide access to complete marc records and holdings information, along with locations and real-time availability for each held item. majax extracts this information via screen scraping from the marc record display page.
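the scraping step can be illustrated with a short python sketch. the markup below is hypothetical (the paper does not reproduce millennium’s record display page), as are the css class names tag and data and the names MarcTableScraper and scrape. the sketch only shows the general idea of recovering field values by walking the html that the opac emits.

```python
from html.parser import HTMLParser


class MarcTableScraper(HTMLParser):
    """collect (marc tag, value) pairs from rows shaped like
    <tr><td class="tag">020</td><td class="data">...</td></tr> (hypothetical markup)."""

    def __init__(self):
        super().__init__()
        self.fields = []     # scraped (tag, value) pairs in page order
        self._cell = None    # "tag" or "data" while inside a matching <td>
        self._tag = None     # most recent marc tag awaiting its value
        self._buffer = []    # text fragments of the current cell

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            css = dict(attrs).get("class")
            if css in ("tag", "data"):
                self._cell = css
                self._buffer = []

    def handle_data(self, data):
        if self._cell:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell:
            text = "".join(self._buffer).strip()
            if self._cell == "tag":
                self._tag = text
            elif self._tag is not None:
                self.fields.append((self._tag, text))
                self._tag = None
            self._cell = None


def scrape(html):
    """return the list of (marc tag, value) pairs found in the page."""
    scraper = MarcTableScraper()
    scraper.feed(html)
    scraper.close()
    return scraper.fields


# hypothetical page fragment; the values come from the figure 6 example record
SAMPLE_PAGE = """
<table>
<tr><td class="tag">020</td><td class="data">1843341662 (hbk.)</td></tr>
<tr><td class="tag">245</td><td class="data">digital libraries : integrating content and systems</td></tr>
</table>
"""

if __name__ == "__main__":
    print(scrape(SAMPLE_PAGE))
```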
as with all screen-scraping approaches, the code performing the scraping must be updated if the output format provided by the opac changes. in our experience, such changes occur less than once per year.

the first method, majax 1, implements screen scraping using javascript code that is contained in a document placed in a directory on the server (/screens), which is normally used for supplementary resources, such as images. this document is included in the target page as a hidden html <iframe> element (see frame b in figure 2). consequently, the same-domain restriction applies to the code residing in it. majax 1 can thus be used only on pages within the same domain. for instance, if the opac is housed at opac.library.university.edu, majax 1 may be used on all pages within *.university.edu (not merely *.library.university.edu). the key advantage of majax 1 is that no additional server is required.

the second method, majax 2, uses an intermediary server that retrieves the data from the opac, translates it to json, and returns it to the client. this method, shown in figure 5, returns json data and therefore does not suffer from the same-domain restriction. however, it requires hosting the majax 2 web service. like the tictoclookup web service, we implemented the majax 2 web service using python conformant to wsgi. a single installation can support multiple opacs.

widgetization

the majax widget library allows the integration of both majax 1 and majax 2 data into websites without javascript programming. html <span> tags function as placeholders, and their title and class attributes describe the desired processing. majax provides a number of “majax classes,” more than one of which can be specified on the same tag. these classes allow a mash-up creator to insert a large variety of bibliographic information, such as the values of marc fields.
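the intermediary role played by majax 2 can be sketched as a small wsgi application. this is an illustration under stated assumptions, not the majax 2 code: the RECORDS dictionary stands in for data that the real service obtains from the opac, and the url layout, the jsoncallback parameter name (borrowed from the tictoclookup sample request), and the response fields are all hypothetical.

```python
import json
import re
from urllib.parse import parse_qs

# hypothetical stand-in for data that a real intermediary would scrape from the opac
RECORDS = {
    "1843341662": {"title": "digital libraries", "copies_available": 1},
}


def lookup(identifier):
    """return the record for an identifier, or None if it is unknown."""
    return RECORDS.get(identifier)


def application(environ, start_response):
    """wsgi entry point: /<identifier>[?jsoncallback=name] -> json or jsonp."""
    identifier = environ.get("PATH_INFO", "").strip("/")
    query = parse_qs(environ.get("QUERY_STRING", ""))
    callback = query.get("jsoncallback", [None])[0]

    body = json.dumps({"id": identifier, "record": lookup(identifier)})
    content_type = "application/json"

    # wrap the payload only if the callback looks like a plain javascript identifier
    if callback and re.fullmatch(r"[A-Za-z_$][\w$.]*", callback):
        body = f"{callback}({body});"
        content_type = "application/javascript"

    payload = body.encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", content_type),
        ("Content-Length", str(len(payload))),
    ])
    return [payload]
```

such an application can be hosted by any wsgi-compatible server, for example apache’s mod_wsgi; for local experimentation, the standard library’s wsgiref.simple_server.make_server can serve it.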
classes are also provided to insert fully formatted, ready-to-copy bibliographic references in harvard style, live circulation information, links to the catalog record, links to online versions of the item (if applicable), a ready-to-import ris description of the item, and even images of the book cover. a list of classes majax supports is provided in table 6.

examples

figure 6 provides an example use of majax widgets. four <span> tags expand into the book cover, a complete harvard-style reference, the value of a specific marc field (020), and a display of the current availability of the item, wrapped in a link to the catalog record. texts such as “copy is available” shown in figure 6 are localizable.

even though there are multiple majax <span> tags that refer to the same isbn, the majax widget library will contact the majax 1 or majax 2 web service only once per identifier, independent of how often it is used in a page. to manage the load, the majax client-side library can be configured to not exceed a maximum number of requests per second, per client.

all software described in this paper is available under the lgpl open source license. the majax libraries have been used by us and others for about two years. for instance, the “new books” list in our library uses majax 1 to provide circulation information. faculty members at our institution are using majax to enrich their course websites. a number of libraries have adopted majax 1, which is particularly easy to host because no additional server is required.

figure 5. architecture of the majax 2 web service

■■ related work

most ilss in use today do not provide suitable web-services interfaces to access either bibliographic information or availability data.9 this shortcoming is addressed by multiple initiatives.
the ils discovery interface task force (ils-di) created a set of recommendations that facilitate the integration of discovery interfaces with legacy ilss, but it does not define a concrete api.10 relatedly, the iso 20775 holdings standard defines an xml schema that describes the availability of items across systems, but it does not describe an api for accessing them.11 many ilss provide a z39.50 interface in addition to their html-based web opacs, but z39.50 does not provide standardized holdings and availability.12 nevertheless, there is hope within the community that ils vendors will react to their customers’ needs and provide web-services interfaces that implement these recommendations.

the jangle project provides an api and an implementation of the ils-di recommendations through a representational state transfer (rest)–based interface that uses the atom publishing protocol (app).13 jangle can be linked to legacy ilss via connectors. the use of the xml-based app prevents direct access from client-side javascript code, however. in the future, adoption and widespread implementation of the w3c working draft on cross-origin resource sharing may relax the same-origin restriction in a controlled fashion, and thus allow access to app feeds from javascript across domains.14

screen-scraping is a common technique used to overcome the lack of web-services interfaces.
for instance, oclc’s worldcat local product obtains access to availability information from legacy ilss in a similar fashion as our majax 2 service.15 whereas the web services used or created in our work exclusively use a rest-based model and return data in json format, interfaces based on soap (formerly simple object access protocol) whose semantics are described by a wsdl specification provide an alternative if access from within client-side javascript code is not required.16 html written by adapter <table width=“340”><tr><td> <span class=“majax-syndetics-vtech” title=“i1843341662”></span> </td><td> <span class=“majax-harvard-reference” title=“i1843341662”></span> <br /> isbn: <span class=“majax-marc-020” title=“i1843341662”></span> <br /> <span class=“majax-linktocatalogmajax-showholdings” title=“i1843341662”></span> </td></tr></table> display in browser after processing dahl, mark., banerjee, kyle., spalti, michael., 2006, digital libraries : integrating content and systems / oxford, chandos publishing, xviii, 203 p. isbn: 1843341662 (hbk.) 1 copy is available figure 6. example use of majax widgets oclc grid services provides rest-based web-services interfaces to several databases, including the worldcat search api and identifier services such as xisbn, xissn, and xoclcnum for frbr-related metadata.17 these services support xml and json and could benefit from widgetization for easier inclusion into client pages. the use of html markup to encode processing instructions is common in javascript frameworks, such as yui or dojo, which use <div> elements with customdefined attributes (so-called expando attributes) for this purpose.18 google gadgets uses a similar technique as well.19 the widely used context objects in spans (coins) specification exploits <span> tags to encode openurl table 6. 
selected majax classes majax class replacement majax-marc-fff-s majax-marc-fff majax-syndetics-* majax-showholdings majax-showholdings-brief majax-endnote majax-ebook majax-linktocatalog majax-harvard-reference majax-newline majax-space marc field fff, subfields concatenation of all subfields in field fff book cover image current holdings and availability information …in brief format ris version of record link to online version, if any link to record in catalog reference in harvard style newline space web services and widgets for library information systems | back and bailey 85 techniques for the seamless inclusion of information from web services into websites. we considered the cases where an opac is either the target of such integration or the source of the information being integrated. we focused on client-side techniques in which each user’s browser contacts web services directly because this approach lends itself to the creation of html widgets. these widgets allow the integration and customization of web services without requiring programming. therefore nonprogrammers can become mash-up creators. we described in detail the functionality and use of several widget libraries and web services we built. table 7 provides a summary of the functionality and hosting requirements for each system discussed. although the specific requirements for each system differ because of their respective nature, all systems are designed to be deployable with minimum effort and resource requirements. 
this low entry cost, combined with the provision of a high-level, nonprogramming interface, constitute two crucial preconditions for the broad adoption of mash-up techniques in libraries, which in turn has the potential to context objects in pages for processing by client-side extension.20 librarything uses client-side mash-up techniques to incorporate a social tagging service into opac pages.21 although their technique uses a <div> element as a placeholder, it does not allow customization via classes—the changes to the content are encoded in custom-generated javascript code for each library that subscribes to the service. the juice project shares our goal of simplifying the enrichment of opac pages with content from other sources.22 it provides a set of reusable components that is directed at javascript programmers, not librarians. in the computer-science community, multiple emerging projects investigate how to simplify the creation of server-side data mash-ups by end user programmers.23 ■■ conclusion this paper explored the design space of mash-up table 7. summary of features and requirements for the widget libraries presented in this paper majax 1 majax 2 google book classes tictoclookup classes web service screen scraping iii record display json proxy for iii record display google book search dynamic link api books.google.com tictoc cloud application tictoclookup .appspot.com hosted by existing millennium installation /screens wsgi/python script on libx.lib.vt.edu google, inc. google, inc. 
via google app engine data provenance your opac your opac google jisc (www.tictocs .ac.uk) additional cost n/a can use libx.lib.vt.edu for testing, must run wsgi-enabled web server in production free, but subject to google terms of service generous free quota, pay per use beyond that same domain restriction yes no no no widgetization majax.js: class-based: majaxclasses gbsclasses.js:classbased: gbs tictoc.js:class-based: tictoc requires javascript programming no no no no requires additional server no yes (apache+mod_wsgi) no no (if using gae), else need apache+mod_wsgi iii bibrecord display n/a n/a yes yes iii webbridge integration yes yes yes yes 86 information technology and libraries | june 2010 vastly increase the reach and visibility of their electronic resources in the wider community. references 1. nicole engard, ed., library mashups—exploring new ways to deliver library data (medford, n.j.: information today, 2009); andrew darby and ron gilmour, “adding delicious data to your library website,” information technology & libraries 28, no. 2 (2009): 100–103. 2. monica brown-sica, “playing tag in the dark: diagnosing slowness in library response time,” information technologies & libraries 27, no. 4 (2008): 29–32. 3. dapper, “dapper dynamic ads,” http://www.dapper .net/ (accessed june 19, 2009); yahoo!, “pipes,” http://pipes .yahoo.com/pipes/ (accessed june 19, 2009). 4. jennifer bowen, “metadata to support next-generation library resource discovery: lessons from the extensible catalog, phase 1,” information technology & libraries 27, no. 2 (2008): 6–19; john blyberg, “ils customer bill-of-rights,” online posting, blyberg.net, nov. 20, 2005, http://www.blyberg .net/2005/11/20/ils-customer-bill-of-rights/ (accessed june 18, 2009). 5. douglas crockford, “the application/json media type for javascript object notation (json),” memo, the internet society, july 2006, http://www.ietf.org/rfc/rfc4627.txt (accessed mar. 30, 2010). 6. 
google, “who’s using the book search apis?” http:// code.google.com/apis/books/casestudies/ (accessed june 16, 2009). 7. innovative interfaces, “millennium ils,” http://www.iii .com/products/millennium_ils.shtml (accessed june 19, 2009). 8. joint information systems committee, “tictocs journal tables of contents service,” http://www.tictocs.ac.uk/ (accessed june 18, 2009). 9. mark dahl, kyle banarjee, and michael spalti, digital libraries: integrating content and systems (oxford, united kingdom: chandos, 2006). 10. john ockerbloom et al., “dlf ils discovery interface task group (ils-di) technical recommendation,” (dec. 8, 2008), http://diglib.org/architectures/ilsdi/dlf_ils_ discovery_1.1.pdf (accessed june 18, 2009). 11. international organization for standardization, “information and documentation—schema for holdings information,” http://www.iso.org/iso/catalogue_detail .htm?csnumber=39735 (accessed june 18, 2009) 12. national information standards organization, “ansi/ niso z39.50—information retrieval: application service definition and protocol specification,” (bethesda, md.: niso pr., 2003), http://www.loc.gov/z3950/agency/z39-50-2003.pdf (accessed may 31, 2010). 13. ross singer and james farrugia, “unveiling jangle: untangling library resources and exposing them through the atom publishing protocol,” the code4lib journal no. 4 (sept. 22, 2008), http://journal.code4lib.org/articles/109 (accessed apr. 21, 2010); roy fielding, “architectural styles and the design of network-based software architectures” (phd diss., university of california, irvine, 2000); j. c. gregorio, ed., “the atom publishing protocol,” memo, the internet engineering task force, oct. 2007, http://bitworking.org/projects/atom/rfc5023.html (accessed june 18, 2009). 14. world wide web consortium, “cross-origin resource sharing: w3c working draft 17 march 2009,” http://www .w3.org/tr/access-control/ (accessed june 18, 2009). 15. 
oclc online computer library center, “worldcat and cataloging documentation,” http://www.oclc.org/support/ documentation/worldcat/default.htm (accessed june 18, 2009). 16. f. curbera et al., “unraveling the web services web: an introduction to soap, wsdl, and uddi,” ieee internet computing 6, no. 2 (2002): 86–93. 17. oclc online computer library center, “oclc web services,” http://www.worldcat.org/devnet/wiki/services (accessed june 18, 2009); international federation of library associations and institutions study group on the functional requirements for bibliographic records, “functional requirements for bibliographic records : final report,” http://www.ifla.org/files/ cataloguing/frbr/frbr_2008.pdf (accessed mar. 31, 2010). 18. yahoo!, “the yahoo! user interface library (yui),” http://developer.yahoo.com/yui/ (accessed june 18, 2009); dojo foundation, “dojo—the javascript toolkit,” http://www .dojotoolkit.org/ (accessed june 18, 2009). 19. google, “gadgets.* api developer’s guide,” http://code. google.com/apis/gadgets/docs/dev_guide.html (accessed june 18, 2009). 20. daniel chudnov, “coins for the link trail,” library journal 131 (2006): 8–10. 21. librarything, “librarything,” http://www.librarything .com/widget.php (accessed june 19, 2009). 22. robert wallis, “juice—javascript user interface componentised extensions,” http://code.google.com/p/juice-project/ (accessed june 18, 2009). 23. 
jeffrey wong and jason hong, “making mashups with marmite: towards end-user programming for the web” conference on human factors in computing systems, san jose, california, april 28–may 3, 2007: conference proceedings, volume 2 (new york: association for computing machinery, 2007): 1435–44; guiling wang, shaohua yang, and yanbo han, “mashroom: end-user mashup programming using nested tables” (paper presented at the international world wide web conference, madrid, spain, 2009): 861–70; nan zang, “mashups for the web-active user” (paper presented at the ieee symposium on visual languages and human-centric computing, herrshing am ammersee, germany, 2008): 276–77. 6 information technology and libraries | march 2010 sandra shores is [tk] sandra shores editorial board thoughts: issue introduction to student essays t he papers in this special issue, although covering diverse topics, have in common their authorship by people currently or recently engaged in graduate library studies. it has been many years since i was a library science student—twenty-five in fact. i remember remarking to a future colleague at the time that i found the interview for my first professional job easy, not because the interviewers failed to ask challenging questions, but because i had just graduated. i was passionate about my chosen profession, and my mind was filled from my time at library school with big ideas and the latest theories, techniques, and knowledge of our discipline. while i could enthusiastically respond to anything the interviewers asked, my colleague remarked she had been in her job so long that she felt she had lost her sense of the big questions. the busyness of her daily work life drew her focus away from contemplation of our purpose, principles, and values as librarians. 
i now feel at a similar point in my career as this colleague did twenty-five years ago, and for that reason i have been delighted to work with these student authors to help see their papers through to publication.

the six papers represent the strongest work from a wide selection that students submitted to the lita/ex libris student writing award competition. this year’s winner is michael silver, who looks forward to graduating in the spring from the mlis program at the university of alberta. silver entered the program with a strong library technology foundation, having provided it services to a regional library system for about ten years. he notes that “the ‘accidental systems librarian’ position is probably the norm in many small and medium sized libraries. as a result, there are a number of practices that libraries should adopt from the it world that many library staff have never been exposed to.”1 his paper, which details the implementation of an open-source monitoring system to ensure the availability of library systems and services, is a fine example of the blending of best practices from two professions. indeed, many of us who work in it in libraries have a library background and still have a great deal to learn from it professionals. silver is contemplating a phd program or else a return to a library systems position when he graduates. either way, the profession will benefit from his thoughtful, well-researched, and useful contributions to our field.

todd vandenbark’s paper on library web design for persons with disabilities follows, providing a highly practical but also very readable guide for webmasters and others. vandenbark graduated last spring with a master’s degree from the school of library and information science at indiana university and is already working as a web services librarian at the eccles health sciences library at the university of utah. like mr. silver, he entered the program with a number of years’ work experience in the it field, and his paper reflects the depth of his technical knowledge. vandenbark notes, however, that he has found “the enthusiasm and collegiality among library technology professionals to be a welcome change from other employment experiences,” a gratifying comment for readers of this journal.

ilana tolkoff tackles the challenging concept of global interoperability in cataloguing. she was “fascinated that a single database, oclc, has holdings from libraries all over the world. this is also such a recent phenomenon that our current cataloging standards still do not accommodate such global participation. i was interested to see what librarians were doing to reconcile this variety of languages, scripts, cultures, and independently developed cataloging standards.” tolkoff also graduated this past spring and is hoping to find a position within a music library.

marijke visser addresses the overwhelming question of how to organize and expose internet resources, looking at tagging and the social web as a solution. coming from a teaching background, visser has long been interested in literacy and life-long learning. she is concerned about “the amount of information found only online and what it means when people are unable . . . to find the best resources, the best article, the right website that answers a question or solves a critical problem.” she is excited by “the potential for creativity made possible by technology” and by the way librarians incorporate “collaborative tools and interactive applications into library service.” visser looks forward to graduating in may.

mary kurtz examines the use of the dublin core metadata schema within dspace institutional repositories. as a volunteer, she used dspace to archive historical photographs and was responsible for classifying them using dublin core. she enjoyed exploring how other institutions use the same tools and would love to delve further into digital archives, “how they’re used, how they’re organized, who uses them and why.” kurtz graduated in the summer and is looking for the right job for her interests and talents in a location that suits herself and her family.

finally, lauren mandel wraps up the issue exploring the use of a geographic information system to understand how patrons use library spaces. mandel has been an enthusiastic patron of libraries since she was a small child visiting her local county and city public libraries. she is currently a doctoral candidate at florida state university and sees an academic future for herself. mandel expresses infectious optimism about technology in libraries:

people forget, but paper, the scroll, the codex, and later the book were all major technological leaps, not to mention the printing press and moveable type. . . . there is so much potential for using technology to equalize access to information, regardless of how much money you have, what language you speak, or where you live.

big ideas, enthusiasm, and hope for the profession, in addition to practical technology-focused information, await the reader. enjoy the issue, and congratulations to the winner and all the finalists!

note

1. all quotations are taken with permission from private e-mail correspondence.

sandra shores (sandra.shores@ualberta.ca) is guest editor of this issue and operations manager, information technology services, university of alberta libraries, edmonton, alberta, canada.

editorial board thoughts | shores 7

a partnership for creating successful partnerships
continued from page 5

looking ahead, it seems clear that the pace of change in today’s environment will only continue to accelerate; thus the need for us to quickly form and dissolve key sponsorships and partnerships that will result in the successful fostering and implementation of new ideas, the currency of a vibrant profession. the next challenge is to realize that many of the key sponsorships and partnerships that need to be formed are not just with traditional organizations in this profession. tomorrow’s sponsorships and partnerships will be with those organizations that will benefit from the expertise of libraries and their suppliers while in return helping to develop or provide the new funding opportunities and means and places for disseminating access to their expertise and resources. likely organizations would be those in the fields of education, publishing, content creation and management, and social and community web-based software.

to summarize, we at ex libris believe in sponsorships and partnerships. we believe they’re important and should be used in advancing our profession and organizations. from long experience we also have learned there are right ways and wrong ways to implement these tools, and i’ve shared thoughts on how to make them work for all the parties involved. again, i thank marc for his receptiveness to this discussion and offer my even deeper appreciation for trying to address the issues. it serves as an excellent example of what i discussed above.

applying hierarchical task analysis method to discovery layer evaluation
merlen prommann and tao zhang

information technology and libraries | march 2015 77

abstract

while usability tests have been helpful in evaluating the success or failure of implementing discovery layers in the library context, the focus of usability tests has remained on the search interface rather than the discovery process for users.
the informal site- and context-specific usability tests have offered little to test the rigor of the discovery layers against the user goals, motivations and workflow they have been designed to support. this study proposes hierarchical task analysis (hta) as an important complementary evaluation method to usability testing of discovery layers. relevant literature is reviewed for the discovery layers and the hta method. as no previous application of hta to the evaluation of discovery layers was found, this paper presents the application of hta as an expert-based and workflow-centered (e.g., retrieving a relevant book or a journal article) method to evaluating discovery layers. purdue university’s primo by ex libris was used to map eleven use cases as hta charts. nielsen’s goal composition theory was used as an analytical framework to evaluate the goal charts from two perspectives: a) users’ physical interactions (i.e., clicks), and b) users’ cognitive steps (i.e., decision points for what to do next). a brief comparison of hta and usability test findings is offered by way of conclusion.

introduction

discovery layers are relatively new third-party software components that offer a google-like web-scale search interface for library users to find information held in the library catalog and beyond. libraries are increasingly utilizing these to offer a better user experience to their patrons. while discovery layers are popular in application, the discussion about their implementation and evaluation remains limited.[1][2]

a majority of reported case studies discussing discovery layer implementations are based on informal usability tests that involve a small sample of users in a specific context.
the resulting data sets are often incomplete and the scenarios are hard to generalize.[3] discovery layers have a number of technical advantages over the traditional federated search and cover a much wider range of library resources. however, they are not without limitations. questions about the workflow of discovery layers and how well they help users achieve their goals have rarely been asked.

merlen prommann (mpromann@purdue.edu) is user experience researcher and designer, purdue university libraries. tao zhang (zhan1022@purdue.edu) is user experience specialist, purdue university libraries.

beth thomsett-scott and patricia e. reese offered an extensive overview of the literature discussing the disconnect between what the library websites offer and what their users would like.[1] on the one hand, library directors deal with a great variety of faculty perceptions, in terms of what the role of the library is and how they approach research differently. the ithaka s+r library survey of not-for-profit four-year academic institutions in the us suggests a real diversity of american academic libraries as they seek to develop services with sustained value.[4] for the common library website user, irrelevant search results and unfamiliar library taxonomy (e.g., call numbers, multiple locations, item formats, etc.)
are the two most common gaps.[3] michael khoo and catherine hall demonstrated how users, primarily college students, have become so accustomed to the search functionalities on the internet that they are reluctant to use library websites for their research.[5] no doubt, the launch of google scholar in 2004 was another driver for librarians to move from the traditional federated searching to something faster and more comprehensive.[1] while literature encouraging google-like search experiences is abundant, khoo and hall have warned designers not to take users’ preferences towards google at face value. they studied users’ mental models, defining a mental model as “a model that people have of themselves, others, the environment, and the things with which they interact, such as technologies,” and concluded that users often do not understand the complexities of how search functions actually work or what is useful about them.[5]

a more systematic examination of the tasks that discovery layers are designed to support is needed. this paper introduces hierarchical task analysis (henceforth hta) as an expert method to evaluate discovery layers from a task-oriented perspective. it aims to complement usability testing. for more than 40 years, hta has been the primary methodology to study systems’ sub-goal hierarchies because it offers insights into key workflow issues. drawing on our expertise in applying hta and our experience as frequent users of the purdue university libraries website for personal academic needs, we mapped user tasks into several flow charts based on three task scenarios: (1) finding an article, (2) finding a book, and (3) finding an ebook.
jakob nielsen’s “goal composition” heuristics (generalization, integration, and user control mechanisms)[6] were used as an analytical framework to evaluate the user experience of an ex libris primo® discovery layer implemented at purdue university libraries. the goal composition heuristics focus on multifunctionality and the idea of servicing many possible user goals at once. for instance, generalization allows users to use one feature on more objects. integration allows each feature to be used in combination with other facilities. control mechanisms allow users to inspect and amend how the computer carries out the instructions. we discussed the key issues with other library colleagues to meet nielsen’s five-expert rule and avoid loss in the quality of insights.[7] nielsen studied the value of participant volume in usability tests and concluded that after the fifth user, researchers are wasting their time by observing the same findings and not learning much new. a comparison to usability study findings, as presented by fagan et al., is offered by way of conclusion.[3]

related work

discovery layers

the traditional federated search technology offers the overall benefit of searching many databases at once.[8][1] yet it has been known to frustrate users, as they often do not know which databases to include in their search.
emily alling and rachel naismith aggregated common findings from a number of studies involving the traditional federated search technology.[9] besides slow response time, other key causes of frustrating inefficiency were: limited information about search results, information overload due to the lack of filters, and the fact that results were not ranked in order of relevance (see also [2][1]).

new tools, termed “discovery,” “discovery tools,”[2][10] “discovery layers,” or “next generation catalogs,”[11] have become increasingly popular and have provided the hope of eliminating some of the issues with traditional federated search. generally, they are third-party interfaces that use pre-indexing to provide speedy discovery of relevant materials across millions of records of local library collections, from books and articles, to databases and digital archives. furthermore, some systems (e.g., ex libris primo central index) aggregate hundreds of millions of scholarly e-resources, including journal articles, e-books, reviews, legal documents and more that are harvested from primary and secondary publishers and aggregators, and from open-access repositories. discovery layers are projected to help create the next generation of federated search engines that utilize a single search index of metadata to search the rising volume of resources available for libraries.[2][11][10][1] although not yet systematic, results from a number of usability studies on these discovery layers point to the benefits they offer.

the most noteworthy benefit of a discovery layer is its seemingly easy-to-use unified search interface. jerry caswell and john d.
wynstra studied the implementation of ex libris metalib centralized indexes based on the federated search technology at the university of northern iowa library.[8] they confirmed how the easily accessible unified interface helped users to search multiple relevant databases simultaneously and more efficiently. lyle ford concluded that the summon discovery layer by serials solutions fulfilled students’ expectation of being able to search books and articles together.[12] susan johns-smith pointed out another key benefit to users: customizability.[10] the summon discovery layer allowed users to determine how much of the machine-readable cataloging (marc) record was displayed. the study also confirmed how the unified interface, aligning the look and feel among databases, increased the ease of use for end-users. michael gorrell described how one of the key providers, ebsco, gathered input from users and considered design features of popular websites to implement new technologies in the ebscohost interface.[13] some of the features that improve the usability of ebscohost are a dynamic date slider, an article preview hover, and expandable features for various facets, such as subject and publication.[2]

another key benefit of discovery systems is the speed of results retrieval. the primo discovery layer by ex libris has been complimented for its ability to reduce the time it takes to conclude a search session, while maximizing the volume of relevant results per search session.[14] it was suggested that in so doing the tool helps introduce users to new content types.
yuji tosaka and cathy weng reported how records with richer metadata tend to be found more frequently and lead to more circulation.[15] similarly, luther and kelly reported an increase in overall downloads, while the use of individual databases decreased.[16] these studies point to the trend of an enhanced distribution of discovery and knowledge.

with the additional metadata of item records, however, there is also the increased likelihood of inconsistencies across databases that are brought together in a centralized index. a study by graham stone offered a comprehensive report on the implementation process of the summon discovery layer at the university of huddersfield, highlighting major inconsistencies in cataloging practices and the difficulties they caused in providing consistent journal holdings and titles.[17] this casts a shadow on the promise of better findability.

jeff wisniewski[18] and williams and foster[2] are among the many who espouse discovery layers as a step towards a truly single search function that is flexible while allowing needed customizability. these new tools, however, are not without their limitations. the majority of usability studies reinforce similar results and focus on the user interface. fagan et al., for example, studied the usability of ebsco discovery service at james madison university (jmu).
while most tasks were accomplished successfully, the study confirmed previous warnings that users do not understand the complexities of search and identified several interface issues: (1) users desire single search, but willingly use multiple options for search, (2) lack of visibility for the option to sort search results, and (3) the difficulty in finding journal articles.[3]

yang and wagner offer one case where the aim was to evaluate discovery layers against a checklist of 12 features that would define a true ‘next generation catalogue’:

(1) single point of entry to all library information,
(2) state-of-the-art web interface (e.g. google and amazon),
(3) enriched content (e.g. book cover images, ratings and comments),
(4) faceted navigation for search results,
(5) simple keyword search on every page,
(6) more precise relevancy (with circulation statistics a contributing factor),
(7) automatic spell check,
(8) recommendations to related materials (common in commercial sites, e.g. amazon),
(9) allowing users to add data to records (e.g. reviews),
(10) rss feeds to allow users to follow top circulating books or topic related updates in the library catalogue,
(11) links to social networking sites to allow users to share their resources,
(12) stable urls that can be easily copied, pasted and shared.[11]

they used this list to evaluate seven open source and ten proprietary discovery layers, revealing how only a few of them can be considered true ‘next generation catalogs’ supporting the users’ needs that are common on the web.
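a checklist evaluation of this kind amounts to counting, for each discovery layer, how many of the twelve features it supports and then ranking the products by that count. the following python sketch illustrates the idea under stated assumptions: the short feature keys are our own abbreviations of the list above, and the product names and feature sets are hypothetical placeholders, not yang and wagner’s data.

```python
# the twelve checklist features, abbreviated from the list above
# (the short keys are illustrative names, not terms from the study)
FEATURES = [
    "single_point_of_entry",
    "modern_web_interface",
    "enriched_content",
    "faceted_navigation",
    "keyword_search_every_page",
    "precise_relevancy",
    "spell_check",
    "recommendations",
    "user_contributions",
    "rss_feeds",
    "social_links",
    "stable_urls",
]


def score(supported):
    """count how many checklist features a product supports."""
    unknown = set(supported) - set(FEATURES)
    if unknown:
        raise ValueError(f"not on the checklist: {sorted(unknown)}")
    return len(set(supported))


def rank(products):
    """return (name, score) pairs, highest score first; ties keep input order."""
    scored = [(name, score(feats)) for name, feats in products.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# hypothetical products, for illustration only
products = {
    "layer_a": ["single_point_of_entry", "faceted_navigation", "stable_urls"],
    "layer_b": ["faceted_navigation", "spell_check"],
    "layer_c": FEATURES[:10],
}
ranking = rank(products)
```

a plain count treats all twelve features as equally important; a library that values, say, relevancy ranking more highly would need a weighted score instead.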
all of the tools included in their study lacked precision in retrieving relevant search results, e.g., relevancy based on transaction data. the authors were impressed with open source discovery layers libraryfind and vufind, which had 10 of the 12 features, leaving vendors of proprietary discovery layers ranking lower (see figure 1).

figure 1. 17 discovery layers (x-axis) were evaluated against a checklist of 12 features expected of the next generation catalogue (y-axis). chart title: ranking of discovery layers (yang and wagner 2010, 707).

yang and wagner theorized that the relative lack of innovation among commercial discovery layers is due to practical reasons: vendors create their new discovery layers to run alongside older ones, rather than attempting to alter the proprietary code of the integrated library system’s (ils) online public access catalog (opac). they pointed to the need for “libraries, vendors and the open source community […] to cooperate and work together in a spirit of optimism and collegiality to make the true next generation catalogs a reality”.[11] at the same time, the university of michigan article discovery working group reported on vendors being more cooperative and allowing coordination among products, increasing the potential of web-scale discovery services.[19] how to evaluate and optimize user workflow across these coordinating products remains a practical challenge.
in this study, we propose hta as a potentially helpful method to evaluate user workflow through these increasingly complex products.

hierarchical task analysis

with roots in taylorism*, industrial psychology and system processes, task analyses continue to offer valuable insights into the balance of efficiency and effectiveness in human-computer interaction scenarios [20][21]. historically, frank and lillian gilbreth (1911) set forth the principle of hierarchical task analysis (hta), when they broke down and studied the individual steps involved in laying bricks. they reduced the brick laying process from about 18 movements down to four (cited in [21]). but it was john annett and keith d. duncan (1967) who introduced hta as a method to better evaluate the personnel training needs of an organization. they used it to break apart behavioral aspects of complex tasks such as planning, diagnosis and decision-making (see [22][21]).

hta helps break users’ goals into subtasks and actions, usually in the visual form of a graphic chart. it offers a practical model for goal execution, allowing designers to map user goals to the system’s varying task levels and evaluate their feasibility [23]. in so doing, hta offers the structure with which to learn about tasks and highlight any unnecessary steps and potential errors that might occur during task performance [24][25], whether cognitive or physical. its strength lies in its dual approach to evaluation: on the one hand, user interface elements are mapped at an extremely low and detailed level (to individual buttons), while on the other hand, each of these interface elements gets mapped to the user’s high-level cognitive tasks (the cognitive load).
this informs a rigorous design approach, where each detail accounts for the high-level user task it needs to support.

the main limitation of classical hta is its system-centric focus, which does not account for the wider context the tasks under examination exist in. the field of human-computer interaction has shifted our understanding of cognition from an individual information processing model to a networked and contextually defined set of interactions, where the task under analysis is no longer confined to a desktop but “extends into a complex network of information and computer-mediated interactions” [26]. the step-focused hta does not have the ability to account for the rich social and physical contexts that the increasingly mediated and multifaceted activities are embedded in. hta has been supplemented with additional theories and heuristics, so as to better account for the increasingly complete understanding of human activity.

advanced task models and analysis methods have been developed based on the principle of hta. stuart k. card, thomas p. moran and allen newell [27] proposed an engineering model of human performance, goms (goals, operators, methods, and selection rules), to map how task environment features determine what and when users know about the task [20]. goms has been expanded to cope with rising complexities (e.g. [28][29][30]), but the models have become largely impractical in the process [20]. instead of simplistically suggesting cognitive errors are due to interface design, cognitive task analysis (cta) attempts to address the underlying mental processes that most often give rise to errors [24]. given our lack of structural understanding of cognitive processes, the analysis of cognitive tasks has remained problematic to implement [20][31]. activity theory models people as active decision makers [20]. it explains how users convert goals into a set of motives and how they seek to execute those motives as a set of interactions in a given situational condition. these situational conditions either help or prevent the user from achieving the intended goal. activity theory is beginning to offer a coherent foundation to account for the task context [20], but it has yet to offer a disciplined set of methods to execute this theory in the form of a task analysis.

* taylorism is the application of scientific method to the analysis of work, so as to make it more efficient and cost-effective. modern task

even though task analysis has seen much improvement, adaptation and usage in its nearly 40-year-long existence, its core benefit, aiding an understanding of the tasks users need to perform to achieve their desired goals, has remained the same. until activity theory, cla and other contextual approaches are developed into more readily applicable analysis frameworks, classical hta with the additional layers of heuristics guiding the analysis remains the practical option [21]. nielsen’s goal composition [6] offers one such set of heuristics applicable for the web context.
it presents usability concepts such as reuse, multitasking, automated use, recovering and retrieving, to name a few, so as to systematically evaluate the hta charts representing the interplay between an interface and the user.

utility of hta for evaluating discovery layers

usability testing has become the norm in validating the effectiveness and ease of use of library websites. yet, thirteen years ago, brenda battleson, austin booth and jane weintrop [32] emphasized the need to support user tasks as the crucial element of user-centered design. in comparison to usability testing, hta offers a more comprehensive model for the analysis of how well discovery layers support users’ tasks in the contemporary library context. considering the strengths of the hta method and the current need for vendors to simplify the workflows in the increasingly complex systems, it is surprising that hta has not yet been applied to the evaluation of discovery layers.

this paper introduces hierarchical task analysis (hta) as a way to systematically evaluate the workflow of discovery layers as a technology that helps users accomplish specific tasks, herein, retrieving relevant items from the library catalog and other scholarly collections. nielsen’s [6] goal composition heuristics, designed to evaluate usability in the web context, are used to guide the evaluation of the user workflow via the hta task maps. as a process (vs. context) specific approach, hta can help achieve a more systematic examination of the tasks discovery layers should support, such as finding an article, a book or an ebook, and help vendors coordinate to achieve the full potential of web-scale discovery services.
method: applying hta to primo by ex libris

the object of this study was purdue university’s library website, which was re-launched with ex libris’ primo in january 2013 (figure 2) to serve the growing student and faculty community. its 3.6 million indexed records are visited over 1.1 million times every year. roughly 34% of these visits are to electronic books. according to sharon q. yang and kurt wagner [11], who studied 17 different discovery layers, primo ranked the best among the commercial discovery layer products, coming fourth after the open source tools libraryfind, vufind, and scriblio in the overall rankings. we will evaluate how efficiently and effectively the primo search interface supports the tasks of purdue libraries users.

figure 2. purdue library front page and search box

based on our three years of experience with user studies and usability testing of the library website, we identified finding an article, a book and an ebook as the three major representative scenarios of purdue library usage. to test how primo helps its users and how many cognitive steps it requires of them, each of the three scenarios was broken into three or four specific case studies. the case studies were designed to account for the different availability categories present in the current primo system, e.g. ‘full text available’, ‘partial availability’, ‘restricted access’ or ‘no access’. this is because the different availabilities present users with different possible frustrations and obstacles to task accomplishment.
this system-design perspective could offer a comparable baseline for discovery layer evaluation across libraries. a full list of the eleven case studies can be seen below:

find an article:
case 1. the library has only a full electronic text.
case 2. the library has the correct issue of the journal in print, which contains the article, as well as a full electronic text.
case 3. the library has the correct issue of the journal, which contains the article, only in print.
case 4. the library does not have the full text, either in print or electronically. a possible option is to use an interlibrary loan (henceforth ill) request.

find a book (print copy):
case 5. the library has the book and the book is on the shelf.
case 6. the library has the book, but the book is in a restricted place, such as the hicks repository. the user has to request the book.
case 7. the library has the book, but it is either on the shelf or in a repository. the user would like to request the book.
case 8. the library does not have the book. possible options are uborrow† or ill.

find an ebook:
case 9. the library has the full text of the ebook.
case 10. the ebook is shown in search results but the library does not have full text.
case 11. the book is not shown in search results. a possible option is to use uborrow or ill.

it is generally accepted that hta is not a complex analysis method, but since it offers general guiding principles rather than a rigorous step-by-step guide, it can be tricky to implement [24][20][21][23]. both authors of this study have expertise in applying hta and are frequent users of the purdue library’s website.
we are familiar with the library’s commonly reported system errors; however, all of our case studies result from a randomized topic search, not from specific reported items. to achieve consistent hta charts, one author carried out the identified use cases on a part-time basis over a two-month period. each case was executed on the purdue library website, using the primo discovery layer. an on-campus hewlett-packard (hp) desktop computer with internet explorer and a personal macbook laptop with safari and google chrome were used to identify any possible inconsistencies between user experiences on different operating systems. as per stanton’s [21] statement that “hta is a living documentation of the sub-goal hierarchy that only exists in the latest state of revision”, mapping the hta charts was an iterative process between the two authors.

† uborrow is a federated catalog and direct consortial borrowing service provided by the committee on institutional cooperation (cic). uborrow allows users to search for, and request, available books from all cic libraries, which include all universities in the big ten as well as the university of chicago, and the center for research libraries.

according to david embrey [24] “the analyst needs to develop a measure of skill [in the task] in order to analyze a task effectively” (2).
this measure of skill was developed in the process of finding real examples (via a randomized topic search) from the purdue library catalog to match the structural cases listed above. for instance ‘case 1. the library has only the electronic full text’ was turned into a case goal: ‘0 find the conference proceeding on network-assisted underwater acoustic communication’. a full list of referenced case studies is below:

find an article:
case 1. find the article “network-assisted underwater acoustic communication” (yang and kevin, 2012).
case 2. find the article “comparison of simple potential functions for simulating liquid water” (jorgensen et al., 1983).
case 3. find the journal design annual “graphis inc” (2008).
case 4. find the article “a technique for murine irradiation in a controlled gas environment” (walb, m. c. et al., 2012).

find a book (in print):
case 5. find the book show me the numbers: designing tables and graphs to enlighten (few, 2004).
case 6. find the book the love of cats and place a request for it (metcalf, 1973).
case 7. find the book the prince and place a request for it (machiavelli).
case 8. find the book the design history reader by maffei and houze (2010). (uborrow or ill).

find an ebook:
case 9. find the ebook handbook of usability testing. how to plan, design and conduct effective tests (rubin and chisnell, 2008)
case 10. find the ebook the science of awakening consciousness: our ancient wisdom (partly available via hathi trust)
case 11. find the ebook ancient awakening by matthew bryan laube (uborrow).

hta descriptions are generally diagrammatic or tabular.
since diagrams are easier to assimilate and promise the identification of a larger number of sub-goals [23], the diagrammatic description method was preferred (figure 2). each analysis started with the establishment of sub-goals, such as ‘browse the library website’ and ‘retrieve the article’, and followed with the identification of individual small steps that make the sub-goal possible, e.g. ‘press search’ and ‘click on 2, to go to page 2’ (figures 3-5). then, additional iterations were made to include: (1) cognitive steps, where users need to evaluate the screen in order to take the next step (e.g. identifying the correct url to open from the initial results set), and (2) cognitive decision points, where users must choose between multiple options. for instance, items can be requested either via interlibrary loan (ill) or uborrow, presenting the user with an a or b option, requiring cognitive effort to make a choice. such parallel paths were color coded in yellow (figure 2). both physical and cognitive steps were recorded in xmind‡, a free mind-mapping software. they were color-coded black and gray, respectively, helping visualize the volume of cognitive decision points and steps (i.e. cognitive load).

figure 3. full hta chart for ‘find a book’ scenario (case 5). created in xmind.

figure 4. zoom in to steps 1 and 2 of the hta map for ‘find a book’ scenario (case 5). created in xmind.
‡ xmind is a free mind mapping software that allows structured presentation of steps, multiple coding references, and the addition of images, links and extensive notes. http://www.xmind.net/

figure 5. zoom in to step 3 of the hta map for the ‘find a book’ scenario (case 5). created in xmind.

figure 6. zoom in to step 4 of the hta map of the ‘find a book’ scenario (case 5). created in xmind.

to organize the decision flow chart, the original hierarchical numbering scheme for hta, which requires every sub-goal to be uniquely numbered with an integer in numerical sequence [21], was strictly followed. visual (screen captures) and verbal notes on efficient and inefficient design factors were taken during the hta mapping process and linked directly to the tasks they applied to. steps where the interface design guided the user to the next step were marked ‘fluent’ with a green tick (figures 3 and 4). steps that were likely to mislead users from the optimal path to item retrieval and were a burden to the user’s workflow were marked with a red ‘x’ (see figures 4 and 5). one major advantage of the diagram format is its visual and structural representation of sub-goals and their steps in a spatial manner (see figures 2-5). this is useful for gaining a quick overview of the workflow [21].

when exactly to stop the analysis has remained undefined for hta [21].
it is at the discretion of the analyst to evaluate whether there is a need to re-describe every sub-goal down to the most basic level, or whether the failure to perform that sub-goal is, in fact, consequential to the study results. we decided to stop the evaluation at the point where the user located the sought item (a shelf number or reserve pick-up number) or received it via download. furthermore, steps that were perceived as possible when impossible in actuality were transcribed into the diagrams. article scenario case 1 offers an example: once the desired search result was identified, its green dot for ‘full text available’ was likely to be perceived as clickable, when in actuality it was not. the user is required to click on the title or open the tab ‘find online’ to access the external digital library and download the desired article (see figure 7).

figure 7. article scenario (case 1) two search results, where green ‘full text available’ may be perceived as clickable.

task analysis focuses on the properties of the task rather than the user. this requires expert evaluation in place of involving users in the study. as stated above, both of the authors are working experts in the field of user experience in the library context, thoroughly aware of the tasks under analysis and how they are executed on a daily basis. a group of 12 (librarians, reference service staff, system administrators and developers) were asked to review the hta charts on a monthly basis. feedback and implications of identified issues were discussed as a group.
according to nielsen [7], it takes five experts to avoid a significant loss of findings (see figure 8); an expert, or ‘double specialist’ in nielsen’s terms, is skilled in usability as well as in the particular technology employed by the software. based on this enumeration, the final versions of the hta charts offer accurate representations of the primo workflow in the three use scenarios of finding an article, finding a book and finding an ebook at purdue university libraries.

figure 8. average proportion of usability problems found as a function of number of evaluators in a group performing heuristic evaluation [7].

results

the reason for mapping primo’s workflows in hta charts was to identify key workflow and usability issues of a widely used discovery layer in the scenarios and contexts it was designed to serve. the resulting hta diagrams offered insights into fluent steps (green ticks), as well as workflow issues (red ‘x’) present in primo, as applied at purdue university libraries. due to space limitations, only the main findings of the hta are discussed. the full results are published on the purdue university research repository§. table 1 presents how many parallel routes (a vs. b route), physical steps (clicks), cognitive evaluation steps, likely errors and well guided steps each of the use cases had.

on average it took between 20 and 30 steps to find a relevant item within primo. even though no ideal step count has been identified for the library context, this is quite high in the general context of the web, where fast task accomplishment is generally expected.
paul chojecki [33] tested how too many options impact usability on a website. he found that the average step count associated with higher satisfaction levels is 6 (vs. 18,16 average steps at purdue libraries). in our study, the majority of the steps were physical, i.e. pressing a button or selecting a filter; however, cognitive steps took up just under half of the steps in nearly all cases. the majority of cases flow well, as the strengths (fluent, well guided steps) of primo outweigh its less guided steps that easily lend themselves to the chance of error.

§ task analysis cases and results for ex libris primo. https://purr.purdue.edu/publications/1738

measure | articles: case 1 | 2 | 3 | 4 | avg | books: case 5 | 6 | 7 | 8 | avg | ebooks: case 9 | 10 | 11 | avg
no. of decision points (between a & b), to retrieve an item | 5 | 8 | 4 | 4 | 5 | 4 | 5 | 5 | 2 | 4 | 6 | 3 | 2 | 4
minimum steps possible to retrieve an item (clicks + cognitive decisions) | 18 | 27 | 16 | 30 | 23 | 18 | 25 | 28 | 24 | 24 | 22 | 19 | 19 | 20
of these minimum steps, how many were cognitive (information evaluation was needed to proceed) | 4 | 8 | 9 | 13 | 9 | 6 | 9 | 7 | 7 | 7 | 4 | 6 | 4 | 5
maximum steps it can take to retrieve an item (clicks + cognitive decisions) | 26 | 35 | 23 | 36 | 30 | 22 | 31 | 33 | 28 | 29 | 32 | 23 | 22 | 26
of these maximum steps, how many were cognitive | 10 | 17 | 14 | 15 | 14 | 10 | 13 | 16 | 8 | 12 | 9 | 8 | 5 | 7
errors (steps that mislead from optimal item retrieval) | 3 | 15 | 4 | 8 | 8 | 2 | 2 | 4 | 3 | 3 | 13 | 1 | 2 | 5
fluent well guided steps to item retrieval | 11 | 11 | 9 | 8 | 10 | 7 | 8 | 7 | 5 | 7 | 6 | 4 | 3 | 5

table 1. each case’s key task measures, and each scenario’s averages.

among the three item search scenarios (articles, books and ebooks), the retrieval of articles was least guided and required the highest number of decisions from the user (5, vs. 4 for books and 4 for ebooks on average). retrieving an article (between 23-30 steps on average) or a book (24-29 steps on average) took more steps to accomplish than finding a relevant ebook (20-26 steps on average). the high volume of steps (max 30 steps on average) required to retrieve an article, as well as its high error rate (8), were due to the higher number of cognitive steps (12 steps on average) required to identify the correct article and to locate a hard copy (instead of the relatively easily retrievable online copy).
in the book scenario, the challenge was also two-fold: on the one hand, it was challenging to verify the right book when there were many similar results (this explains the high number of 12 cognitive steps on average); on the other hand, the flow to place a request for a book was also a challenge. the latter was a key contributor to the higher number of physical steps required for retrieving a book (max 29 on average).

common to all eleven cases, whether articles or books, was a process of four sub-goals: 1) browse the library website, 2) find results, 3) open the page of the desired item, and 4) retrieve, locate or order the item. the first two offered near identical experiences, no matter the search scenario or case. the third and fourth sub-goals, however, presented different workflow issues depending on the product searched and its availability, e.g. ‘in print’ or ‘online’. as such, general results will be presented for the first two sub-goals, while scenario-specific overviews will be provided for the latter two.

browsing the library website

browsing the library website was easy and supported different user tasks. the simple url (lib.purdue.edu) was memorable and appeared first in the search results. the immediate availability of sub-menus, such as databases and catalogs, offered speedy searching for frequent users. the choice between a) the general url, or b) a sub-menu, was the first key decision point users of primo at purdue libraries were presented with.

the purdue libraries’ home page (revisit figure 1) had a simple design with a clear, central and visible search box.
just above it were search filters for articles, books and the web. this was the second key decision point users were presented with: a) they could either type into the search bar without selecting any filters, or b) they could select a filter to focus their results on a specific item type. browsing the library website offers an efficient and fluent workflow, with ebooks being the only exception. it was hard to know whether they were grouped under the articles or the books & media filter. confusingly (at the time of the study), purdue libraries listed ebooks that had no physical copies under articles, while ebooks that purdue also held in a physical version (in addition to the digital one) were listed under books & media. this was not explained in the interface, nor was there a readily available tooltip.

finding relevant results

figure 9. search results for article (case 2) ‘comparison of simple potential functions for simulating liquid water’

primo presented the search results in an algorithmic order of relevance, offering an additional page for every 20 items appearing in the search results. the search bar was then minimized at the top of the page, available for easy editing. the page was divided into two key sections, where the first quarter contained filters (e.g. year of publishing, resource type, author, journal, etc.), and the other three quarters were left for search results (see figure 9). the majority of cognitive decisions across scenarios were made on this results page. this was due to the need to pick up the cues to identify and verify the accurate item being searched.
the value of these cognitive steps lies in leading the user to the next physical steps. as discussed in the next section, opening the page of the desired item, several elements succeeded or failed at guiding the user to their accurate search result.

search results were considered relevant when the search presented results in the general topic area of the searched item. most cases in most scenarios led to relevant results; however, book case 8 and ebook case 11 provided only unrelated results. generally, books and ebooks were easy to identify as available. this was due to their typically short titles, which took less effort to read. journal articles, on the other hand, have longer titles and required more cognitive effort to be verified.

article case 4, book case 6 and ebook case 10 had relevant but restricted results. the color-coding system that indicated the level of availability for the presented search results, with green (fully available), orange (partly available) or gray (not available) dots, was followed by an explanatory availability tag, e.g. ‘available online’ or ‘full text available’. tabs represented additional cues, offering additional information, e.g. ‘find in print’. these appeared in a supplementary way where applicable. for example, if an item was not available, its dot was gray and it had neither the ‘find in print’ nor the ‘find online’ tab. instead, it had a ‘request’ tab, guiding the user towards an available alternative action. restricted availability items, such as a book in a closed repository, had an orange indicator for partial availability.
for these, primo still offered the ‘find in print’ or ‘find online’ tab, whichever was appropriate. while the overall presentation of item availability was clear and the color-coding consistent, the mechanisms were not without their errors, as discussed below.

opening the page of the desired item

this sub-goal comprised two main kinds of steps: 1) information-driven cognitive steps, which help the user identify the correct item, and 2) user-interface-guided physical steps that resulted in opening the page of the desired item.

frequent strengths that helped the identification of relevant items across the scenarios were the clearly identifiable labels underneath the image icons (e.g. ‘book’, ‘article’, ‘conference proceeding’), hierarchically structured information about the items (title, key details, availability) and perceivably clickable links (blue with an underlined hover effect). the labels and hierarchically presented details (e.g. year, journal, issue, volume, etc.) helped the workflow to remain smooth, minimizing the need to use side filters. the immediate details reduced the need to open additional pages, cutting down the steps needed to accomplish the task. the hover effect of item titles made the link look and feel clickable, guiding the user closer to retrieving the item. color-coding all clickable links in the same blue was also an effective design feature, even though bolded availability labels were equally prominent and clickable. this was especially true for articles, where the ‘full text available’ tags correspond to the user’s goal to immediately download the sought item (figure 8).
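the availability cues described in the preceding section (a colored dot plus the tabs that accompany it) amount to a small rule set. the sketch below restates those rules in python for illustration only; it is not primo's code, and the `Availability` enum and `tabs_for` function are hypothetical names.

```python
from enum import Enum


class Availability(Enum):
    FULL = "green"      # fully available
    PARTIAL = "orange"  # partly available, e.g. a book in a closed repository
    NONE = "gray"       # not available


def tabs_for(availability, in_print=False, online=False):
    """Return the tabs shown next to a search result, per the rules in the text."""
    if availability is Availability.NONE:
        # unavailable items get neither 'find in print' nor 'find online'
        return ["request"]
    tabs = []
    if in_print:
        tabs.append("find in print")
    if online:
        tabs.append("find online")
    return tabs
```

written this way, the rules make one gap visible: nothing in them says whether the dot itself is an interactive element, which is where the perceived-but-not-actual clickability noted for article case 1 arises.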
the most frequent causes of errors were duplicated search results. generally, primo merges multiple versions of the same item into one search result and offers a link: ‘see all results’. in line with graham stone’s [17] study, which highlighted the problem of cataloging inconsistencies, primo struggled to consistently group all overlapping search result items. both book and article scenarios suffered from at least one duplicate search result case due to inconsistent details. article scenario case 2 offers an example, where jorgensen et al., “comparison of simple potential functions for simulating liquid water” (1983), had two separate results for the same journal article of the same year (first two results in figure 9). problematically, the two results offered different details for the journal issue and page numbers. this is likely to cause referencing problems for primo users.

duplicated search results were also an issue for book scenarios. the most frequent causes for this were instances where authors’ first and last names were presented in reverse order (see also figure 9 for article case 2), the books had different print editions, or the editor’s name was used in place of the author’s. book scenario case 7, machiavelli’s “the prince”, returned extremely varied results, requiring 16 cognitive steps and 33 physical steps before a desired item could be verified. this is where search filters were most handy. problematically, in case 7, machiavelli (the author) did not even appear in the author filter list, while ebrary inc was listed.
again, this points to the inconsistent metadata and the effects it can have on usability, as discussed by stone [17].

other workflow issues were presented by design details such as the additional information boxes underneath the item information, e.g. ‘find in print’, ‘details’ and ‘find online’. these opened a small scrollable box that maintained the overall page view but was difficult to scroll: the arrow kept slipping outside of the box, scrolling the entire page instead of the content inside the box. in addition, the information boxes did not work well with chrome. this was especially problematic on the macbook, where after a couple of searches the boxes failed to list the details and left the user with an unaccomplished task. by comparison, safari on a mac and internet explorer on a pc never had such issues.

retrieving the items (call number or downloading the pdf)

the last sub-goal was to retrieve the item of interest. this often comprised multiple decision points: whether to retrieve the pdf version online, identify a call number for the physical copy, or place a request, ordering it via inter library loan (ill) or uborrow. each option is briefly discussed below.

ebooks and articles, if available online, offered efficient online access. if an article was identified for retrieval, there were two options to access the link to the database, e.g. ‘view this record in acm’: a) via the full view of the item, or b) the small ‘find online’ preview box discussed above.
where more than one database was available, information about the publication range the library holds helped identify the right link to download the pdf on the link-resolver page. one of the key benefits of having links from within primo to the full texts was the fact that they opened in new browser windows or tabs, without interfering with other ongoing searches. while a few of the pdf links to downloadable texts were difficult to find through some external database sites, once found, they all opened in adobe reader with easy options to either ‘save’ or ‘print’ the material.

ebooks were available via ebrary or ebl libraries. while the latter offers some novel uses, such as audio (i.e. read aloud), neither of the two platforms was easy to use. while reading online was possible, downloading an ebook was challenging. the platform seemed to offer good options: a) download by chapter, b) download by page numbers, or c) download the full book for 14 days. in actuality, however, these were all unavailable. ebook case 9 had chapters longer than the 60-page limit per day. page numbers proved difficult to use, as the book’s numbers did not match the pdf’s page numbers. this made it hard to keep track of what was downloaded and where one left off to continue later (due to imposed time limits). the 14-day full access option was only available in adobe digital editions software (an ebook reader software by adobe systems built with adobe flash), which was available neither on most campus computers nor on personal laptops.

the least demanding and most fluent of all retrieval options was the process of identifying the location and call number for physical copies.
inconsistent metadata, however, posed some challenges. book case 5 offered a merged search result of two books, but listed them with different call numbers in the ‘find in print’ tab. libraries have many physical copies of the same book, but identifying consistency in call number is a cognitive step that helps verify the similarities or differences between the two results. the different call numbers raised doubts about which item to choose, slowing the workflow for the task and increasing the number of cognitive steps required to accomplish it.

compared to books, finding an article in print format was hardly straightforward. the main cause for error when looking up hard copies of journals was the fact that individual journal issues did not have individual call numbers at purdue libraries. instead, there was one call number per periodical, covering the entire journal series. article case 2, for example, offered the journal code 530.5 j821 in the ‘find in print’ tab. in general, the tab suffered from too much information, poor layout and an unhelpful information hierarchy, all of which slowed down the cognitive task of verifying whether an item was relevant or not. it listed ‘location’ and ‘holdings range’ as the first pieces of information, wherein ‘holdings range’ included not just hard copy related information, but listed digital items as well, even though this tab was for the physical version of the item. to illustrate, article case 2 claimed to have holdings for 1900-2013, whereas hard copies were only available for 1900-2000, and digital copies for 2001-2013.
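the holdings mismatch in article case 2 can be made concrete with a short sketch. it assumes, hypothetically, that each holdings statement carries a format label; the `Holding` class and the two helper functions are illustrative names, not part of primo or of the purdue catalog.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Holding:
    fmt: str    # "print" or "online" (assumed labels)
    start: int  # first year held
    end: int    # last year held, inclusive


def holdings_for_tab(holdings, tab_format):
    """Keep only the holdings that belong on a given tab, e.g. 'find in print'."""
    return [h for h in holdings if h.fmt == tab_format]


def covers(holdings, year):
    """True if any of the given holdings includes the year."""
    return any(h.start <= year <= h.end for h in holdings)
```

with the figures reported for case 2 (print 1900-2000, online 2001-2013), filtering by format before display would show a print range of 1900-2000 on the ‘find in print’ tab rather than the combined 1900-2013.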
each scenario had one or two cases where there were neither physical nor digital options available. the sub-goal commonly comprised a decision between three options: a) placing a request, b) ordering an item via inter library loan (ill), or c) ordering an item via uborrow. while the ‘signing in to request’ option and ill were easy to use with few required steps, there was a lack of guidance on how to choose between the three options. frequently, ill and uborrow appeared as equal options adjacent to one another, leaving the next step unguided. of all three, placing a request via uborrow was the hardest to accomplish. it often failed to present any relevant results on the first results page of the uborrow system, requiring the use of advanced search and filters. for instance, book case 6 was ‘not requestable’ via uborrow. when it did list the sought-for item in the search results, it looped back to purdue’s own closed repository (which remained unavailable).

discussion

the goal of this study was to utilize hta to examine the workflow of the primo discovery layer at purdue university libraries. nielsen’s [6] goal composition heuristics were used to extend the task-based analysis and understand the tasks in the context of discovery layers in libraries. three key usability domains (generalization, integration and user control mechanisms) were used as an analytical framework to draw usability conclusions about how primo was supporting, if at all, successful completion of the three scenarios. the next three sub-sections evaluate and offer design solutions on the three usability domains mentioned above.
overall, this study confirmed primo’s ability to reduce the workload for users to find their materials. primo is flexible and intuitive, permitting efficient search and successful retrieval of library materials, while offering the possibility of many search sessions at once [14]. a comparison to usability test results is offered by way of conclusion.

generalization mechanisms

primo can be considered a flexible discovery layer as it helps users achieve many goals with a minimum number of steps. it makes use of several generalization mechanisms that allow users to apply their work towards many goals at once. for instance, the library website result in google offers not only the main url but also seven sub-links to specialist library site locations, such as opening hours and databases. this makes primo accessible and relevant for a broader array of people who are likely to have different goals. for instance, some may seek to enter a specific database, instead of having to open primo’s landing page and enter the search terms. others may wish to utilize ‘find’, which guides the user, one step at a time, via a process of definition elimination, closer to the item they are looking for, while others may simply want to know the opening times.

similarly, the primo search function saves already typed information, both on its landing page and its results page. this facilitates search by requiring query entry only once, while allowing end users to click on different filters to narrow the results in different ways. as a part of the work done towards one search can be used towards another, e.g.
by content, journal, or topic type, the system can ease the work effort required of users. this is further supported by the system saving already typed keywords when returning to the main search page from the search results, which allows for a fluid search experience where the user adjusts a keyword thread with minimal typing until they find what they are looking for.

a key problem for primo is its inability to manage inconsistent meta-data. the tendency to group different versions of the same search results together is helpful as it reduces information noise. in an effort to enhance the speed it takes to evaluate the relevancy of search results, the system seeks to highlight any differences in the meta-data. if inconsistencies in meta-data cause the same search results to appear as separate items, this is likely to affect the cognitive steps and therefore the workload and efficiency with which the user is able to accomplish identification.

it is clear from previous studies that if discovery layers were to become the next generation catalogs [11], and were to enhance the speed of knowledge distribution as has been hoped by tosaka and weng [15] and luther and kelly [16], then mutual agreement is needed on how meta-data from disparate sources should be made consistent [17]. understanding that users’ cognitive workload should be minimized (by offering fewer options and more directive guidance) for more efficient decision-making, library items should have accurate details in their meta-data, e.g. consistent and thorough volume, issue and page numbers for journal articles, correct print and reprint years for books, and item type (conference proceeding vs. journal article).
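one of the duplicate causes reported in the results, author names given in reversed order, lends itself to a short illustration of what consistent meta-data would buy. the sketch below is an assumption-laden example, not primo's grouping logic: the record fields (`title`, `author`, `year`) and the function names are hypothetical, and the ascii-only token pattern is a simplification that would drop accented letters.

```python
import re
from collections import defaultdict


def normalize_author(name):
    """Map 'last, first' and 'first last' to the same order-independent key."""
    tokens = re.findall(r"[a-z]+", name.lower())  # also drops commas and periods
    return " ".join(sorted(tokens))


def normalize_title(title):
    """Lowercase the title and strip punctuation and extra whitespace."""
    return " ".join(re.findall(r"[a-z0-9]+", title.lower()))


def group_duplicates(records):
    """Group records that share a normalized title, author and year."""
    groups = defaultdict(list)
    for rec in records:
        key = (normalize_title(rec["title"]),
               normalize_author(rec["author"]),
               rec["year"])
        groups[key].append(rec)
    return list(groups.values())
```

a key of this kind only addresses name order and punctuation. records that disagree on issue or page numbers, as in article case 2, would still need agreed cataloging rules upstream.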
integration mechanisms

the discovery layer’s ability to increase the number of search sessions [14] at any one time is possible due to its flexibility to support multitasking. primo achieves this with its own individual features used in combination with other system facilities and external sources. for instance, primo’s design allows users to review and compare several search results at once via the ‘find in print’ or ‘details’ tabs. although not perfect, since the small boxes are hard to scroll within, the information can save the user the additional steps of opening many new windows and having to click between them just to review search results. instead, many ‘detail’ boxes of similar results may be opened and viewed at once, allowing for effective visual comparison. this integration mechanism allows a fluent transition from skimming the search results to another temporary action of gaining insight about the relevance of an item. most importantly, this is accomplished without requiring the user to open a new browser page or tab, where they would have to break from their overall search flow and remember the details (instead of visually comparing them), making it hard to resume from where they left off.

a contrasting integration mechanism that primo makes use of is its smooth automated connectivity to external sites, such as databases, ebrary, ill, etc. new browser pages are used to allow the continuation of a task outside of primo itself without forcing the user out of the system to the library service or full text.
primo users can skim search results, identify relevant resources, and open them in new browser pages for later review.

what is missing, however, is the opportunity to easily save and resume a search. retrieving a search result or saving it under one’s login details would benefit users who recall items of interest from previous searches and would like to reproduce the results without having to remember the keywords or search process they used. it is not obvious how to locate the save search session option in primo’s interface.

user control mechanisms

yang and wagner [11] ranked primo highest among the vendors, primarily for its good user control mechanisms, which allow users to inspect and change the search functions on an ongoing basis. primo does a good job of presenting search results in a quick and organized manner. it allows for the needed ‘undo’ functionality and the continued attachment and removal of filters, while saving the last keywords when the user clicks the back button from the search results. the continuously available small search box also gives the user the flexibility to change search parameters easily. in summary, primo offers agile searching while accounting for a few different discovery mental models.

however, if primo wants to preserve its current effectiveness and make the jump towards a single search function that is truly flexible and allows for much-needed customizability [18][2], it needs to allow for several similar user goals to be easily executable without confusion about the likely outcome.
the most prominent current system error for primo, as it has been applied in the purdue libraries, is its inability to differentiate ebooks from journal articles or books. it would support users’ goals to be able to start and finish an ebook-related task at the home page’s search box. currently, users bear the cognitive burden of considering whether ebooks are more likely to be found under ‘books & media’ or ‘journals’. primo, as implemented at purdue libraries at the time of this study, does not support the goal of searching by content type, e.g. for an ebook. this kind of search, however, is increasingly popular among students, who want ebooks on their tablets and phones instead of carrying heavy books in their backpacks.

another key pain point for current users is the identification of specific journals in physical form, say for archival research. currently, each journal issue is listed individually in the ‘find in print’ section, even though the journal has only one call number. listing all volumes and issues of each periodical overwhelms the user with too much information and hinders the task of locating a specific journal issue. since there is only one call number for the entire journal sequence, reducing the information may lead to better clarity and usability. instead of listing all possible journal issues, the system should list the range or ranges (if the set of issues is incomplete) that the library physically holds. in article case 2, for instance, there are five items for the year 1983. why lead the user to look at a range where there is no possible option?
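the range-based listing suggested above can be sketched in a few lines of python. this is only an illustration and not part of primo: the function names `collapse_to_ranges` and `format_ranges`, the `v.` prefix, and the sample holdings are assumptions made here, and holdings are treated as plain integer volume numbers.

```python
def collapse_to_ranges(held):
    """Collapse integer volume (or issue) numbers into inclusive ranges."""
    ranges = []
    for number in sorted(set(held)):
        if ranges and number == ranges[-1][1] + 1:
            # The number extends the current run of consecutive holdings.
            ranges[-1][1] = number
        else:
            # A gap in the holdings starts a new range.
            ranges.append([number, number])
    return [(start, end) for start, end in ranges]


def format_ranges(ranges):
    """Render ranges as a compact holdings statement."""
    parts = []
    for start, end in ranges:
        parts.append(f"v.{start}" if start == end else f"v.{start}-{end}")
    return ", ".join(parts)


# Hypothetical holdings with one gap (volume 4 is missing).
holdings = [1, 2, 3, 5, 6, 7, 8]
print(format_ranges(collapse_to_ranges(holdings)))  # v.1-3, v.5-8
```

a display built this way shows the user two ranges instead of seven separate lines, and the gap at volume 4 remains visible.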
comparing hta to a usability test

usability tests benefit from the invaluable direct input of the end user. at the same time, usability studies, as constructed conditions, offer limited opportunities to learn about users’ real motivations and goals and about how discovery layers support or fail to support their tasks. fagan et al. [3] conducted a usability test with eight students and two faculty members to learn about usability issues and user satisfaction with discovery layers. they measured time, accuracy, and completion rate for nine specific tasks, and obtained insights from task observations and post-test surveys. they reported issues with users not following directions ([3], p. 93), the prevalence of time-outs, users skipping tasks, and variable task times. these results all point to a mismatch between the user goals and the study tasks, and they offer an incomplete picture of the system’s ability to support user goals that are accomplished via specific tasks.

the expert-evaluation-based hta method does not require users’ direct input. hta offers a way to achieve a relatively complete evaluation of how low-level interface facets support users’ high-level cognitive tasks. hta measures the quality of the system design in supporting a specific task needed to accomplish a user goal. instead of measuring time, hta measures physical and cognitive tasks in number of steps. instead of accuracy and completion rate, it counts fluent workflow steps and mistaken steps. the two methods offer opposite strengths, making them good complements.
given hta’s system-centric approach, it can better inform which tasks would be useful in usability testing.

turning to how our findings compare with those of usability tests, fagan et al. [3] confirmed some previously established findings: that journal titles are difficult to locate via the library home page (vs. databases), that filters are handy when they are needed, and that users’ mental models favor a google-like single search box. for instance, students, and even librarians, struggle to understand what is being searched in each system and how results are ranked (see also [5]). the hta method applied in this study was likewise able to confirm that journal titles are more difficult to identify than books and ebooks, to confirm the flexibility benefit offered by filters, and to identify the single search box as a fluent system design. since hta does not rely on the user to explain why these results hold, hta, as applied in this study, helped expert evaluators understand the reasons for these findings via self-directed execution and later discussion with colleagues. depending on the task design, either usability testing or hta can identify cases such as confusion about how to start an ebook search in primo. taking a system design approach to task design offers a path to a systematic understanding of discovery layer usability, which lends itself to easier comparison and external validity.

in terms of specific interface features, usability tests are good for evaluating the visibility of specific features.
for example, fagan et al. [3] asked their participants to (1) search on speech pathology, (2) find a way to limit search results to audiology, and then (3) limit their search results to peer-reviewed (task 3 in [3], p. 95). by measuring completion rate, they were able to identify the relative failure of the ‘peer-reviewed’ filter compared with the ‘audiology’ filter, but they were left “unclear [about] why the remaining participants did not attempt to alter the search results to ‘peer reviewed,’” failing to accomplish the task [3]. in comparison, hta, as an analytical rather than observational methodology, leads to more synthesized results. in addition to insights into possible gaps between system design and mental models, hta, as a goal-oriented approach, concerns itself with issues of workflow (how well the system guides the user to accomplishing their task) and efficiency (minimizing the number of steps required to finish a task). these are harder to identify with usability tests, where participants are not affected by their routine goals and time pressures and may consequently be more patient.

the application of hta helped identify key workflow issues and map them to specific design elements. for instance, the lack of ebooks as a search filter meant that the current system did not support content-form-based searching well for two main forms: articles and books. compared to usability tests that focus on specific fabricated search processes, hta aims to map all possible routes the system’s design offers to accomplish a goal, allowing for their parallel existence during the analysis.
this system-centered approach to task evaluation, we argue, is the key benefit hta can offer towards a more systematic evaluation of discovery layers, where different user groups would have varying levels of assistance needs. hta allows for the nuanced understanding that results can differ as the context of use differs. that applies even to the contextual difference between user test participants and routine library users.

conclusion

discovery layers are advancing the search experiences libraries can offer. with increasing efficiency, greater ease of use, and more relevant results, scholarly search has become a far less frustrating experience. while google is still perceived as the holy grail of discovery experiences, in reality it may not be quite what scholarly users are after [5]. the application of discovery layers has focused on eliminating the limitations that plagued traditional federated search and on improving search index coverage and performance. usability studies have been effective in verifying these benefits and identifying key interface issues. moving forward, studies on discovery layers should focus more on the significance of discovery layers for user experience.

this study presents the expert-evaluation-based hta method as a complementary way to systematically evaluate popular discovery layers. it is the system design and goal-oriented evaluation approach that offers the prospect of a more thorough body of research on discovery layers than usability testing alone. using hta as a systematic preliminary study guiding formal usability testing offers one way to achieve more comparable study results on applications of discovery layers.
it is through comparisons that the discussion of discovery and user experience can gain more focused research attention. as such, hta can help vendors to achieve the full potential of web-scale discovery services.

to better understand discovery layers and ultimately design them to their full potential, systematic studies are needed. this study is the first attempt to apply hta towards systematically analyzing user workflow and interaction issues on discovery layers. the authors hope to see more work in this area, with the aim of achieving true next generation catalogs that can enhance knowledge distribution.

references

[1] beth thomsett-scott and patricia e. reese, “academic libraries and discovery tools: a survey of the literature,” college & undergraduate libraries 19, no. 2–4 (april 2012): 123–143. http://dx.doi.org/10.1080/10691316.2012.697009.
[2] sarah c. williams and anita k. foster, “promise fulfilled? an ebsco discovery service usability study,” journal of web librarianship 5, no. 3 (jul. 2011): 179–198. http://dx.doi.org/10.1080/19322909.2011.597590.
[3] jody condit fagan, meris a. mandernach, carl s. nelson, jonathan r. paulo, and grover saunders, “usability test results for a discovery tool in an academic library,” information technology and libraries 31, no. 1 (mar. 2012): 83–112. http://dx.doi.org/10.6017/ital.v31i1.1855.
[4] roger c. schonfeld and matthew p. long, “ithaka s+r us library survey 2013,” ithaka s+r, survey 2, mar. 2014. http://sr.ithaka.org/research-publications/ithaka-sr-us-library-survey-2013.
[5] michael khoo and catherine hall, “what would ‘google’ do? users’ mental models of a digital library search engine,” in theory and practice of digital libraries, ed. panayiotis zaphiris, george buchanan, edie rasmussen, and fernando loizides, 1–12 (berlin heidelberg: springer, 2012). http://dx.doi.org/10.1007/978-3-642-33290-6_1.
[6] jakob nielsen, “goal composition: extending task analysis to predict things people may want to do,” jan. 1, 1994. http://www.nngroup.com/articles/goal-composition/.
[7] jakob nielsen, “finding usability problems through heuristic evaluation,” in proceedings of the sigchi conference on human factors in computing systems, 373–380 (new york, ny: acm, 1992). http://dx.doi.org/10.1145/142750.142834.
[8] jerry v. caswell and john d. wynstra, “improving the search experience: federated search and the library gateway,” library hi tech 28, no. 3 (sep. 2010): 391–401. http://dx.doi.org/10.1108/07378831011076648.
[9] emily r. alling and rachael naismith, “protocol analysis of a federated search tool: designing for users,” internet reference services quarterly 12, no. 1/2 (2007): 195–210. http://dx.doi.org/10.1300/j136v12n01_10.
[10] susan johns-smith, “evaluation and implementation of a discovery tool,” kansas library association college and university libraries section proceedings 2, no. 1 (jan. 2012): 17–23.
[11] sharon q. yang and kurt wagner, “evaluating and comparing discovery tools: how close are we towards next generation catalog?,” library hi tech 28, no. 4 (nov. 2010): 690–709. http://dx.doi.org/10.1108/07378831011096312.
[12] lyle ford, “better than google scholar?,” presentation, advance program for internet librarian 2010, monterey, california, oct. 25, 2010.
[13] michael gorrell, “the 21st century searcher: how the growth of search engines affected the redesign of ebscohost,” against the grain 20, no. 3 (2008): 22, 24.
[14] sian harris, “discovery services sift through expert resources,” research information, no. 53 (apr. 2011): 18–20. http://www.researchinformation.info/features/feature.php?feature_id=315.
[15] yuji tosaka and cathy weng, “reexamining content-enriched access: its effect on usage and discovery,” college & research libraries 72, no. 5 (sep. 2011): 412–427. http://dx.doi.org/10.5860/.
[16] judy luther and maureen c. kelly, “the next generation of discovery,” library journal 136, no. 5 (march 15, 2011): 66–71.
[17] graham stone, “searching life, the universe and everything? the implementation of summon at the university of huddersfield,” liber quarterly 20, no. 1 (2010): 25–51. http://liber.library.uu.nl/index.php/lq/article/view/7974.
[18] jeff wisniewski, “web scale discovery: the future’s so bright, i gotta wear shades,” online 34, no. 4 (aug. 2010): 55–57.
[19] gaurav bhatnagar, scott dennis, gabriel duque, sara henry, mark maceachern, stephanie teasley, and ken varnum, “university of michigan library article discovery working group final report,” university of michigan library, jan. 2010. http://www.lib.umich.edu/files/adwg/final-report.pdf.
[20] abe crystal and beth ellington, “task analysis and human-computer interaction: approaches, techniques, and levels of analysis,” in amcis 2004 proceedings, paper 391. http://aisel.aisnet.org/amcis2004/391.
[21] neville a. stanton, “hierarchical task analysis: developments, applications, and extensions,” applied ergonomics 37, no. 1 (2006): 55–79.
[22] john annett and neville a. stanton, eds., task analysis, 1st ed. (london; new york: crc press, 2000).
[23] sarah k. felipe, anne e. adams, wendy a. rogers, and arthur d. fisk, “training novices on hierarchical task analysis,” proceedings of the human factors and ergonomics society annual meeting 54, no. 23 (sep. 2010): 2005–2009. http://dx.doi.org/10.1177/154193121005402321.
[24] d. embrey, “task analysis techniques,” human reliability associates ltd, vol. 1, 2000.
[25] j. reason, “combating omission errors through task analysis and good reminders,” quality & safety in health care 11, no. 1 (mar. 2002): 40–44. http://dx.doi.org/10.1136/qhc.11.1.40.
[26] james hollan, edwin hutchins, and david kirsh, “distributed cognition: toward a new foundation for human-computer interaction research,” acm transactions on computer-human interaction 7, no. 2 (jun. 2000): 174–196. http://dx.doi.org/10.1145/353485.353487.
[27] stuart k. card, allen newell, and thomas p. moran, the psychology of human-computer interaction (hillsdale, nj: l. erlbaum associates, 1983).
[28] stephen j. payne and t. r. g. green, “the structure of command languages: an experiment on task-action grammar,” international journal of man-machine studies 30, no. 2 (feb. 1989): 213–234.
[29] bonnie e. john and david e. kieras, “using goms for user interface design and evaluation: which technique?,” acm transactions on computer-human interaction 3, no. 4 (dec. 1996): 287–319. http://dx.doi.org/10.1145/235833.236050.
[30] david e. kieras and david e. meyer, “an overview of the epic architecture for cognition and performance with application to human-computer interaction,” human-computer interaction 12, no. 4 (dec. 1997): 391–438. http://dx.doi.org/10.1207/s15327051hci1204_4.
[31] laura g. militello and robert j. hutton, “applied cognitive task analysis (acta): a practitioner’s toolkit for understanding cognitive task demands,” ergonomics 41, no. 11 (nov. 1998): 1618–1641. http://dx.doi.org/10.1080/001401398186108.
[32] brenda battleson, austin booth, and jane weintrop, “usability testing of an academic library web site: a case study,” the journal of academic librarianship 27, no. 3 (may 2001): 188–198.
[33] paul chojecki, “how to increase website usability with link annotations,” in 20th international symposium on human factors in telecommunication. 6th european colloquium for user-friendly product information. proceedings, 2006, p. 8.

case study references:

find an article:
case 1. yang, t. c., and kevin d. heaney. "network-assisted underwater acoustic communications." in proceedings of the seventh acm international conference on underwater networks and systems, p. 37. acm, 2012.
case 2. jorgensen, william l., jayaraman chandrasekhar, jeffry d. madura, roger w. impey, and michael l. klein. "comparison of simple potential functions for simulating liquid water." the journal of chemical physics 79 (1983): 926.
case 3. “design annual.” graphis inc., 2008.
case 4. walb, m. c., j. e. moore, a. attia, k. t. wheeler, m. s. miller, and m. t. munley. "a technique for murine irradiation in a controlled gas environment." biomedical sciences instrumentation 48 (2012): 470.

find a book (physical):
case 5. few, stephen. show me the numbers: designing tables and graphs to enlighten. vol. 1, no. 1. oakland, ca: analytics press, 2004.
case 6. metcalf, christine. the love of cats. crescent books, 1973.
case 7. machiavelli, niccolò, and leo paul s. de alvarez. 1989. the prince. prospect heights, ill: waveland press.
case 8. lees-maffei, grace, and rebecca houze, eds. the design history reader. berg, 2010.

find an ebook:
case 9. rubin, jeffrey, and dana chisnell. handbook of usability testing: how to plan, design, and conduct effective tests. wiley technical communication library, 2008.
case 10. rubin, jeffrey, and dana chisnell. handbook of usability testing: how to plan, design, and conduct effective tests. wiley technical communication library, 2008.
case 11. laube, matthew bryan. ancient awakening. 2010.

an interactive computer-based circulation system: design and development

james s. aagaard: departments of computer sciences and electrical engineering, northwestern university, evanston, illinois.

an on-line computer-based circulation control system has been installed at the northwestern university library. features of the system include self-service book charge, remote terminal inquiry and update, and automatic production of notices for call-ins and books available. fine notices are also prepared daily and overdue notices weekly. important considerations in the design of the system were to minimize costs of operation and to include technical services functions eventually. the system operates on a relatively small computer in a multiprogrammed mode.
journal of library automation vol. 5/1 march, 1972

introduction

although the northwestern university library had given some consideration to the adoption of data processing techniques over a period of many years, it was not until planning for a new library building started that this consideration became serious. an associate university librarian and a systems analyst were added to the staff with specific responsibilities in the "automation" area. the recommendation of the systems analyst was that an on-line system should be designed to integrate all library functions. two areas were isolated for initial development: technical services, including ordering and cataloging, and circulation control.

several other decisions were made at about the same time (fall 1967). perhaps most important of these was the choice of computer. the acquisition of a dedicated library computer was ruled out on the basis of cost, leaving the choice to be made between a control data 6400 soon to be installed in the university's computing center, or an ibm 360/30 in the administrative data processing department. it was clear that the ibm 360 would have to be upgraded considerably to handle an on-line system, but the decision was made to use it, based on the facts that it was already installed and operating, that the machine itself was more adaptable to text processing applications, and that the library was an administrative application. a small programming staff was available, and it was decided to use that staff rather than have the library develop its own programming capability. the university's engineering and science libraries were administratively divorced from the rest of the evanston campus libraries to serve as a pilot location for development and testing.

one final decision was made by the programming staff.
since there was reason to believe that the use of a real-time system by the library might generate similar requests from other users of data processing services, the system should be capable of extension to other applications, if possible. design then began on a general-purpose file maintenance system. a detailed description of this system will be presented in another paper.

actual programming started in spring 1968, and in about a year the teleprocessing system was essentially complete and work was started on various subsidiary programs, to be run on a daily or weekly basis. these included programs for producing catalog cards, purchase orders, and similar materials. however, at this time the realization came that the opening of the new building was less than a year away (construction was on schedule, unlike the situation with several other buildings), and the new library administration felt very strongly that it would be desirable to have an operational circulation system. work then was suspended on the technical services part of the system, but it is important to note that the system developed up to that point provides the on-line inquiry capability to the circulation operation. it is also true that this capability is more sophisticated than would be needed for circulation applications only.

after the basic system design for the circulation system was completed in spring 1969, the massive job of preparing nearly a million punched cards for the books was started in the summer. this was done using student operators, working from the shelflist. the most expensive and time-consuming part of the job, however, proved to be the insertion of the cards in the books and this was not completely finished before the new building opened in january 1970. the computer circulation system was not ready for operation until december, so that it was tested in the pilot library for only a three-week period, hardly enough for a complete cycle of book charges and discharges.
operation in the new building was complicated by several factors besides a new and unfamiliar circulation system. the building itself was not quite finished, all of the books were not in place, all of the remote terminals were not installed, and there was a large backlog of work which had accumulated during the moving period. the most serious problem, however, was that a decision had been made to continue the old manual circulation system in parallel with the new one, and with the other problems this became too much of a burden on the library staff. when it appeared that there were no problems with the new system which could not be worked out, the manual system was quickly abandoned. after this point operations began to improve rapidly, and within a few months the system was running quite smoothly.

the systems and programming staff has now returned to the implementation of the technical services system.

general description

functionally the northwestern university library circulation system may be viewed as consisting of three parts. the first of these is a book charge/discharge operation using the ibm 1030 series of terminals. the second part is the general-purpose file maintenance system, originally developed for technical services; and the third part is a group of programs which are run in "batch" mode and thus have no direct interaction with the remote terminals.

the teleprocessing program operates in a partition of 36,864 bytes of storage on a 65,536-byte computer. the ibm disk operating system is used, which requires 8,240 bytes, leaving 18,432 bytes for batch programs. the basic telecommunications access method is used for remote terminal input-output operations. data storage is on an ibm 2321 data cell. the present terminal configuration consists of five pairs of 1031 input terminals and 1033 printers, and four 2740 model 2 typewriter terminals (two of which are used for technical services development).
the partition size is probably adequate for one or two more terminals. all of the 1030 terminals share a common telephone line, as do all of the 2740 terminals. two of the 1030 terminals are master units, connected directly to the telephone line, while three 1030s are satellites, operating through a master terminal. one master 1030 and one 2740 are located in the technological institute library; the remaining terminals are in the main university library.

book charge system

each of the 1031 terminals will accept an 80-column punched card which is kept in the book pocket, and also a punched plastic user identification badge. the three satellite terminals are located in the stack area of the library, one on each of three floors adjacent to the elevators. they are used for self-service charge of books. a master terminal, which also includes a manual entry keyboard, is located at the circulation desk on the main floor and is operated by library staff. the keyboard on this terminal allows the staff to perform additional functions, such as charging books to users without badges, charging for periods other than the standard loan period, processing renewals, and discharging books which have been returned.

the 1033 printer associated with each terminal is really just a modified electric typewriter. when a book is charged and the transaction is accepted by the computer, the printer creates a date due slip which shows the call number of the book, the identification number of the user, and the date due. this is placed in the book pocket and serves as the borrower's pass to carry the book past the exit guards. to make this a reasonably secure system, the guard must verify that the call number printed on the slip corresponds with the number on the book, and the user number on the slip corresponds with the number on a valid university identification badge.
note that this system permits a user to bring the book back into the library and take it out again as often as he wishes during the loan period.

the 1030 terminals are associated with a small, specialized part of the computer teleprocessing program which accepts the information from the terminal, reformats it so that it is compatible with the general-purpose file maintenance system, and enters it in the file, checking, of course, that the same book is not already in the file. (transactions which are invalid for one reason or another result in an "unprocessed" message on the printer, and the user must then go to the circulation desk to have the problem resolved.) this portion of the system also processes renewals and discharges from the terminal at the circulation desk.

inquiry system

inquiry requests use the general-purpose file maintenance system originally developed for technical services. this gives an operator at the circulation desk the capability to query the file about the status of any book, and also to make certain changes in records, such as indicating renewals and saves. the terminal used for this purpose is the ibm 2740, which is similar to a typewriter in operation.

in order to facilitate the inquiry operations, the key to each record is the actual call number of the book, rather than some arbitrary accession number. the call number is divided into two parts, which we call the search key and the key extension. the search key includes the dewey classification number, cutter number, and up to four work letters; the key extension includes other information such as edition number, volume number, or copy number. a few compromises were necessary with the key extension in order to adapt it to the limited character set that the 1030 terminals can process, but these changes have not caused any difficulty. when a record is first entered in the file, the search key is used to calculate a position in the file.
this is done by taking the alphanumeric characters in the search key, performing some mathematical operations to reduce the number of characters to four, and then treating the result as if it were a number. this resultant number is divided by a constant number which is chosen to be the largest prime number which is smaller than the number of tracks on the storage device for the file (the data cell). the remainder from this randomizing computation gives a location where an attempt is made to place the record. it is quite possible that this place in the file will be occupied by some other record, which might have the same search key or a quite different one. additional steps are provided to find an alternate location in such cases. additional information about the file organization is given in the appendix to this article. when a record must be located in the file, the same randomizing procedure is followed. then, the complete keys of any records which are found at the calculated location are examined to find the desired record. however, the operator at the inquiry terminal has the option of requesting a search for records based on search key only, and this is often useful. even though the individual records for each book are relatively short (sixty-seven characters), the cost of maintaining a record in the file for every book in the library would be prohibitive. for this reason the philosophy has been adopted that an entry will be made in the file only for a book which is not in its proper place on the shelves. thus the file includes entries for books which have been placed on reserve, loaned to library departments, or reported lost or missing. there are probably more records of this type in the file at any given time than records representing books borrowed by individuals. 
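the randomizing computation described above can be sketched in a few lines of python. this is only an illustration under stated assumptions: the article does not say which "mathematical operations" reduce the search key to four characters, so the xor folding in `fold_to_four` is a hypothetical stand-in, and the track count of 12,000 is borrowed from the appendix figure for the whole file. only the final step (divide by the largest prime below the number of tracks and keep the remainder) follows the text directly.

```python
TRACKS = 12_000  # assumption: total tracks allocated to the file, per the appendix


def largest_prime_below(n: int) -> int:
    """Return the largest prime strictly smaller than n (requires n > 2)."""
    def is_prime(k: int) -> bool:
        if k < 2:
            return False
        i = 2
        while i * i <= k:
            if k % i == 0:
                return False
            i += 1
        return True

    candidate = n - 1
    while not is_prime(candidate):
        candidate -= 1
    return candidate


def fold_to_four(search_key: str) -> int:
    """Hypothetical folding step: xor the key's characters into four bytes,
    then treat those four bytes as one unsigned number."""
    folded = [0, 0, 0, 0]
    for position, byte in enumerate(search_key.encode("ascii")):
        folded[position % 4] ^= byte
    return int.from_bytes(bytes(folded), "big")


def home_track(search_key: str, tracks: int = TRACKS) -> int:
    """Remainder of the folded key divided by the largest prime below `tracks`."""
    return fold_to_four(search_key) % largest_prime_below(tracks)


if __name__ == "__main__":
    # dewey number, cutter number, and work letters form the search key
    print(home_track("621.3819 A111 comp"))
```

the same function serves both for placing a new record and for finding an existing one, which is why the complete keys at the computed location must still be compared: two different search keys can produce the same remainder.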
unfortunately, the library does not have the terminal or personnel capacity to verify the status of all books before the user leaves the card catalog area (adjacent to the circulation desk) so that if he doesn't find a book on the shelves he must return to the circulation desk to request an inquiry. in many cases, of course, the user may be satisfied with a nearby book on the same subject, or perhaps he just intends to browse in a general subject area. it is hoped that at some future time it will be possible to obtain inquiry terminals which can be user-operated and located in the stack area. an additional feature of the system is the capability to place a "save" on a book which is on loan. this is done by the operator at the inquiry terminal, who adds the saver's identification number to the record of the book. the save request triggers a call-in notice, discussed below. this procedure has the minor disadvantage that a save cannot be entered so that the first copy of a particular book which is returned will be held; the operator must select a copy if more than one is out (or enter the save on all copies). this has not proved to be a serious problem. the overall teleprocessing system includes a file of records, called the "transaction file," which is written sequentially. records from other files which can be accessed by remote terminals are written in the transaction file under certain circumstances. in the case of circulation records accessed by the inquiry terminal, writing of the transaction file is under the control of the terminal operator, who will always request that the book record be written in the transaction file when entering a save on a record. records processed by the 1030 terminals are entered in the transaction file only when they represent books which are discharged and which possibly involve a fine or contain a save. 8 journal of library automation vol. 
5/1 march, 1972 batch processing the third part of the complete system includes a number of "batch" programs, which are run periodically and are independent of the real-time program. however, they may be, and usually are, run at times when the real-time program is also operating. these programs use either the transaction file or the main circulation file. each weekday morning, a series of programs is run which processes the data entered into the transaction file since the previous run. three types of printed notices are prepared by these programs as shown in the accompanying figures. one is a fine notice for books which were overdue when returned but for which a fine was not collected. (if a fine was collected the fact is indicated by a keyboard entry on the 1030 terminal when the book is discharged.) there also may be circumstances when the regular fine was collected, but a penalty fine is due because the book was called in but not returned in the specified time. the second type of notice is the call-in notice. this is prepared from records which were placed in the transaction file as a result of a save entered by the inquiry terminal operator. the third type of notice is the book-available notice, which results when a book with a save is discharged. (when the actual discharge is performed the real-time computer program prints a message on the 1033 printer so that the book will not be returned to the stacks.) after the selection of records from the transaction file, name and address information is added from student and personnel files maintained by the data processing department, and a four-part notice is prepared. the same printed form is used for all three types of notices; the additional copies of the fine notice are used in a manual follow-up system if the fine is not paid immediately. processing of these notices generally takes less than ten minutes of computer time a day. 
on a weekly basis, another series of programs is used to process the entire file of outstanding books. a considerably longer time is required because of the size of the file which must be examined. at this time all records in the file are also transferred to a backup file, which provides some protection in case of damage to the prime file (it has not yet been necessary to use it). information is also extracted about books which have had transactions in the past week, and a circulation statistics report is prepared. finally, records for books which are overdue are extracted and processed similarly to the daily notices, resulting in a one-part overdue notice. on a quarterly basis the entire file is examined and lists are prepared of all entries which otherwise do not qualify for overdue notices. these include charges to reserve, lost and missing, other libraries, carrels within the library, and to faculty (who are not charged fines). these lists are distributed for verification that the books are actually located where the file says they are and then returned to the circulation department for any further processing which might be necessary. backup system in designing a real-time circulation system it was obviously necessary to provide for those occasions when some equipment malfunction prevents normal processing. it was not felt necessary to provide any backup to the inquiry part of the system, and book discharges can be allowed to wait for several days if absolutely necessary. to process charges during a period when the computer system is not operational, a standard register source record punch is used. this device can accept the same plastic user badge and book card as the 1030 terminal, and transfers the punched information to a two-part form. 
the first part of the form serves as the date due slip, while the second part, a standard 80-column card, is used to enter the transaction through the 1031 when the real-time system is again operational. this system has the advantage that it is completely independent of the real-time system, except of course for the building electric power. if for some reason the standard register punch cannot be used for a transaction (or for a book without a book card under any circumstances), the missing information can be handwritten on the two-part form. the second card part is then keypunched and the transaction entered through the 1031 terminal. this ultimate backup system is avoided, however, as it is very susceptible to transcription errors. costs any attempt to determine the cost of the on-line circulation system is complicated by several factors. the cost of the terminals is an obvious item, and fairly easy to determine and allocate. however, the cost of the communications adapter which connects the telephone lines to the computer, as well as the cost of running the teleprocessing program, must be shared by all users. these now include technical services as well as circulation services, and in the future may include other nonlibrary university users. finally, even if this allocation could be made, there is still the problem of separating the costs of the teleprocessing program and the batch programs being run in the data processing department. however, since poor information may be better than none at all, the following figures are presented. they include monthly charges for the real-time program for both circulation and the part of technical services which is now operating, but do not include any charges for running of batch programs.

1030 terminals, master 1031a & 1033 with manual entry, 2 @ $251: $502
1030 terminals, satellite 1031b & 1033, 3 @ $155: $465
2740 model 2, 600 bps, 4 @ $170: $680
2701 data adapter with 2 lines: $450
telephone line charges: $25
data cell space, 5 cells (1 circulation; 4 technical services): $1,400
core storage allocated exclusively to teleprocessing: $1,700
estimated share of cpu and disk costs: $1,400
special operator charge: $350

future plans although the present real-time program includes a list containing a limited number of invalid user numbers (lost badges or users who are guilty of repeated violations of library rules), it would be more satisfactory to have a list of valid numbers instead. this might even be expanded into a self-regulating system, where users with a sufficient number of "demerits" would be prevented from charging additional books, and which might reduce the need for fines. another desirable addition to the system would be if some simple and inexpensive display terminals could be obtained and placed in the stack area to provide a self-service inquiry capability. this would relieve the circulation desk staff of some additional work, as well as reduce the number of trips the users must make to and from the stack area. acknowledgments the success of this circulation system is due in no small part to the constant support and encouragement from mr. howard f. smith, director of administrative data processing, and mr. john p. mcgowan, university librarian. mrs. 
velma veneziano was responsible for establishing the library's requirements and rendered invaluable help during the implementation of the system. [editor's note: mrs. veneziano is preparing an updated (1972) summary of the actual application of this design, which it is hoped will be published shortly.] appendix 1 detailed description of file structure the following information is included in each circulation file record:

key
dewey classification number: 11 characters. (the decimal point is included in the record but not punched on book cards.)
cutter number: 5 characters.
work letters: 4 characters.
key extension
volume, edition, copy: 17 characters.
location code: 2 characters. this code indicates whether the book is in the main library or in a branch. it also provides for a subsidiary location code if required.
large book indicator: 1 character.
charge code and date: 3 characters.
borrower number: 5 characters.
renewal code and date: 3 characters.
discharge code and date: 3 characters.
save date: 2 characters.
saver number: 5 characters.
due date: 4 characters.
terminal identification code: 1 character.
reserved for system use: 1 character.

all dates and user identification numbers are packed to save space. because of the random organization of the file it is necessary that storage space be allocated for considerably more records than the expected maximum file size. this allocation conceivably might have been based on a simulation study of the file operation, but this could not be justified since not even a good estimate of maximum file size was available. (an important unknown factor was the change in usage patterns expected to result from the move to a new building.) the only available estimates indicated a maximum file size of about 60-70,000 records, and on this basis sufficient space was allocated to hold 144,000 records. 
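the record layout listed in this appendix can be written down as a table of field widths and checked against the sixty-seven-character record length given in the body of the article. the sketch below is illustrative only: the field names are invented for the example, and it assumes that the large book indicator is 1 character and that each of the three code-and-date fields is 3 characters, which are the widths that make the total come out to sixty-seven.

```python
# (field name, width in characters); names are hypothetical labels for this sketch
FIELDS = [
    ("dewey_classification", 11),
    ("cutter_number", 5),
    ("work_letters", 4),
    ("key_extension", 17),       # volume, edition, copy
    ("location_code", 2),
    ("large_book_indicator", 1),
    ("charge_code_and_date", 3),  # assumed width
    ("borrower_number", 5),
    ("renewal_code_and_date", 3),  # assumed width
    ("discharge_code_and_date", 3),  # assumed width
    ("save_date", 2),
    ("saver_number", 5),
    ("due_date", 4),
    ("terminal_id", 1),
    ("reserved", 1),
]

RECORD_LENGTH = sum(width for _, width in FIELDS)


def split_record(record: bytes) -> dict:
    """Slice a fixed-length record into its named fields."""
    if len(record) != RECORD_LENGTH:
        raise ValueError(f"expected {RECORD_LENGTH} bytes, got {len(record)}")
    fields = {}
    offset = 0
    for name, width in FIELDS:
        fields[name] = record[offset:offset + width]
        offset += width
    return fields


if __name__ == "__main__":
    print(RECORD_LENGTH)
    sample = bytes(range(RECORD_LENGTH))
    print(split_record(sample)["cutter_number"])
```

the first three fields (20 characters) make up the search key used by the randomizing computation; the 17-character key extension is compared only after the home location has been found.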
with twelve records to a track on the ibm 2321 data cell, the file requires 12,000 tracks, or 60 percent of one cell. an equivalent would be about one pack on an ibm 2314 disk storage unit. of this total space, five percent (about 7,200 records) is set aside as an overflow area; the remainder is the prime area. when the randomizing algorithm leads to a cylinder (twenty tracks) which is completely filled, the record is written in the overflow area and the cylinder at the prime location is flagged. using this system, failure due to "running out of space" will be gradual; it is likely that system performance will be seriously degraded before the overflow area is completely filled. after more than a year of operation the file contains almost 63,000 records, of which about 230 are in the overflow area. consideration is being given to minor changes in the handling of overflow records which should reduce the time required to search the overflow area.

enhancing visibility of vendor accessibility documentation samuel kent willis and faye o’reilly information technology and libraries | september 2018 12 samuel kent willis (samuel.willis@wichita.edu) is assistant professor and technology development librarian and faye o’reilly (faye.oreilly@wichita.edu) is assistant professor and digital resources librarian at wichita state university. abstract with higher education increasingly being online or having online components, it is important to ensure that online materials are accessible for persons with print and other disabilities. library-related research has focused on the need for academic libraries to have accessible websites, in part to reach patrons who are participating in distance-education programs. a key component of a library’s website, however, is the materials it avails to patrons through vendor platforms outside the direct control of the library, making it more involved to address accessibility concerns. 
librarians must communicate the need for accessible digital files to vendors so they will prioritize it. in much the same way as contracted workers constructing a physical space for a federal or federally funded agency must follow ada standards for accessibility, so software vendors should be required to design virtual spaces to be accessible. a main objective of this study was to determine a method of increasing the visibility of vendor accessibility documentation for the benefit of our users. it is important that we, as service providers for the public good, act as a bridge between vendors and the patrons we serve. introduction the world wide web was developed late in 1989 but reached the public sector the following year and quickly gained prominence.1 around this same time (1990), the americans with disabilities act (ada) was also passed, so when it was written the role of the web had yet to take shape. websites and online content, while not included specifically in the ada, have been increasingly emphasized when institutions examine the accessibility of their resources for persons with disabilities. more recent legislation, as well as legal-settlement agreements (including with colleges and universities), have included—and even emphasized—the importance of accessible online content. researchers have argued that in requiring facilities to be accessible, ada must include digital accessibility.2 with higher education increasingly being online or having online components, it is important to ensure that online materials are accessible for persons with print and other disabilities, many of whom may have received more extensive support in primary and secondary schools. unless accessibility is pursued with purpose, the level of education and educational materials available for students with disabilities will be severely limited.3 literature review legislation and existing guidelines equal access to information for all patrons is a foundational goal of libraries. 
in higher education, accessible information and communications technology allows users of all abilities to focus on learning without undue burden.4 colleges and universities are required by law to provide enhancing visibility of vendor accessibility documentation | willis and o’reilly 13 https://doi.org/10.6017/ital.v37i3.10240 reasonable accommodations to allow an individual with a disability to participate fully in the programs and activities of the university. according to title ii of ada, discrimination on the basis of disability by any state or local government and its agencies is strictly prohibited.5 section 504 of the rehabilitation act of 1973 also prohibits discrimination on the basis of disability by any program or activity receiving federal assistance.6 the department of education stated, “public educational institutions that are subject to education’s section 504, regulations because they receive federal financial assistance from us are also subject to the title ii regulations because they are public entities (e.g., school districts, state educational agencies, public institutions of vocational education and public colleges and universities).” 7 this piece of legislation usually manifests itself in the physical learning space—wheelchair ramps, braille textbook options, interpreters, and more—but finds little application in the digital spaces of a university, especially in the library’s online research presence. this is an alarming revelation; much higher learning today takes place in an online environment, and inaccessible library resources are a contributing factor to challenges in higher education faced by users with disabilities. to be considered accessible, a digital space, such as a website, online-learning management system, or a research discovery layer, and any word documents, pdfs, and multimedia presented therein, should be formatted in such a way that it is compatible with assistive technologies, such as screen-reading software. 
a website should also be navigable without a mouse using visual or auditory clues. content on a website ought to be clearly and logically organized, with skip navigation links to bypass to the page’s main content. images should have alternative text descriptions, known as “alt text,” that is brief and informative, describing the content and role of the image. links should likewise have clear descriptions of the target page. these and similar considerations aim to help persons with impairments that may make reading a monitor or screen difficult.8 digital spaces like a research database are considered electronic information technology (eit). eit is defined as “information technology and any equipment or interconnected system or subsystem of equipment that is used in the creation, conversion or duplication of data or information.”9 recently this terminology has been converted to information and communications technology (ict) as per the final rule updating section 508 in early 2017, but the essence of what it means remains unchanged.10 legislation regarding digital accessibility exists, specifically section 508 of the rehabilitation act of 1973, but only federal agencies and institutions receiving federal aid are required to abide by these statutes. lawmakers considered technology as a growing part of daily life in 1998 and amended the rehabilitation act with section 508, requiring federal agencies to make their ict accessible to people with disabilities.11 in 2017, these standards were updated with a final rule that modernized guidelines for accessibility of future ict.12 any research databases or other applications used by college and university libraries to facilitate online learning would be considered ict and thereby subject to section 508 requirements. 
it is evident that libraries not only have legal reasons to comply with section 508, but ethical reasons as well because making library collections and services universally available is a core value of the library community.13 in addition to legislation, the world wide web accessibility initiative (wai) created the web content accessibility guidelines (wcag) in 1999 in response to the growing need for web accessibility and to promote universal design. these standards created for web-content creators and web-tool developers are continually updated as new technologies and capabilities emerge (version 2.0 was released in 2008) and apply specifically to web content and design. many of these guidelines were absorbed by the 2017 refresh of section 508 of the rehabilitation act of 1973.14 with fourteen guidelines assigned priority levels 1–3, wcag 2.0, and subsequent revisions to date, offer three levels of conformance with digital-accessibility guidelines: level a, the most basic level, meaning all mandatory level 1 guidelines are met; level aa, meaning priority levels 1 and 2 are met; and level aaa, meaning priority levels 1–3 are met. these conformance levels are important because many ict vendors will make their claims to conformance with wcag standards by using provided wai icons or using statements that refer to the level of conformance.15 wcag 2.0 guidelines alone are not enough to determine fully if a website or other digital content is truly accessible. 
it partly depends on it having an intuitive layout for a variety of users, which can only be achieved through usability testing.16 it is crucial that librarians understand what is required for a product or service to be considered accessible, and a firm grasp of wcag 2.0 and its conformance levels will enrich a librarian’s understanding of web accessibility and section 508 regulations.17 a voluntary product accessibility template (vpat) is a self-assessment document that vendors are required to complete only if they wish to sell their products to the federal government or any institution that chooses to require them. the quality of vpats varies, but essentially they will list section 508 standards and for each specify whether they fully or partially support it, do not support it, or if the standard is not applicable. there is then a space for the vendor to provide an explanation for limitations. since these are voluntary self-assessments, these documents can sometimes be brief and incomplete, but even brief statements can be specific enough to relatively easily verify the claims of support. because libraries are portals to online content, including e-books, e-journals, databases, streaming media, and more, which are provided largely by third-party vendors, libraries face unique struggles when attempting to comply with federal regulations. notions of equality and equal access are inherent to libraries and important for the maintenance of a democratic society, which makes accessibility within libraries’ digital content a concerning ethics issue.18 having little control over how ict is designed, libraries still must figure out how to address accessibility needs within third-party ict. 
in 2012, the association of research libraries (arl) joint task force on services to patrons with print disabilities encouraged libraries to require publishers to implement industry best practices, comply with legal requirements for accessibility, include language in publisher and vendor contracts to address accessibility, and request documentation like vpats.19 the task force’s report was vital in the creation and direction of this study. existing literature and studies as library professionals, we may often make assumptions of the accessibility of a third-party resource when the reality is that greater importance is placed on design of a product; accessibility components are either being added as special features or are being included once the design work is completed.20 tatomir and durrance conducted a study on the compatibility of thirty-two library databases with a set of guidelines for accessibility they called the tatomir accessibility checklist.21 this list included checking the usability of these databases with a screen reader and braille renewable display. they found that 44 percent of the databases were inaccessible, with an additional 28 percent being only “marginally accessible,” based on their criteria. this suggests major problems exist within vendor database platforms.22 building on this research, western kentucky university libraries conducted a study on vpats from vendors to determine how accessible seventeen of their databases were.23 the university libraries ran an accessibility scan on those databases and compared the results with the vendors’ vpats, finding that the templates from the vendors were accurate about 80 percent of the time. most of the vendors did not address the accessibility of portable document format (pdf) files in their vpat statements, though it was an important component of their services. 
pertinent to this study, western kentucky’s work looked for accessibility documentation on vendors’ websites, and when one was not found, contacted the vendors requesting this information. this study was unique for targeting vendor-supplied vpats rather than only examining the databases themselves or tutorials from vendors. as mentioned previously, this was only done for the libraries’ main database vendors. mune and agee published an article on the ebooks accessibility project (eap) funded by affordable learning solutions at the california state university system. in this project, the researchers compared academic e-book platforms to e-reader platforms used for popular trade publications. they gathered data on the top sixteen library e-book vendors at san jose state university based on patron usage and title and holdings counts. the results indicated that academic e-book platforms were less accessible than nonacademic platforms, largely because of hesitance in adopting the epub 3 format, which by default has superior navigation and document structure to pdf or html, common academic options.24 while this study focused solely on the accessibility of e-book materials, a method for contacting vendors used in the eap study was adapted for the current study, applied at a larger scale. the eap researchers attempted to locate the vendors’ vpats online, and they contacted the vendors at least twice to request a vpat or other accessibility statement when none was located. it is noteworthy that of the sixteen vendors, all but one (94 percent) provided eap with some form of accessibility documentation, though less than half (44 percent) had a vpat available.25 another study, by joanne oud, examined vendor-supplied database video tutorials. half of the twenty-four vendors examined in oud’s study had tutorials in formats that were not accessible by keyboard or screen reader. 
this was largely because many of these tutorials were flash-based.26 shockwave flash is neither accessible for persons with disabilities nor good for usability on modern browsers.27 oud’s findings suggest that tutorial content would be more widely accessible if they were placed in youtube or another platform that had transcripts and captions available. while the focus of the study was different from our own, it was similar in that oud examined the accessibility of vendor materials apart from the journals and collections. also, oud noted that to make use of vendor tutorials, the website on which they are housed must likewise be accessible and the videos easy to find, but this is often not the case.28 other studies suggest that vendor websites and platforms often impede access to information. vendor platforms often have inaccessible pdfs, or the links to the full-text options are not easily located. delancey’s study also found more than three-fourths of the vendors examined had images without alternative text and frames without titles, resulting in many users with visual impairments being left out of the content of these images and frames entirely. of particular note, however, was the finding that not one of the vendors in this study had all forms (buttons, search boxes, and other browser navigation tools) labeled correctly, leaving the sites difficult to navigate.29 beyond whether the information itself is accessible, the question inevitably arises, can the desired information even be reached? one way or another, the content on these platforms must be accessible and easy to find. 
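two of the problems reported above, images without alternative text and frames without titles, are simple enough to detect mechanically. the sketch below uses only python's standard-library `html.parser`; the class and function names are invented for this example, and it is not the method used in any of the studies cited. a scan like this catches only missing attributes, so it cannot replace usability testing with a screen reader.

```python
from html.parser import HTMLParser


class AccessibilityAudit(HTMLParser):
    """Collect two kinds of markup problems: <img> tags with no alt
    attribute, and <frame>/<iframe> tags with no (or an empty) title."""

    def __init__(self):
        super().__init__()
        self.problems = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "img" and "alt" not in attributes:
            # an empty alt="" is allowed: it marks a purely decorative image
            self.problems.append("img without alt text")
        elif tag in ("frame", "iframe") and not attributes.get("title"):
            self.problems.append(f"{tag} without title")


def audit(html: str) -> list:
    """Return the list of problems found in an html string."""
    parser = AccessibilityAudit()
    parser.feed(html)
    parser.close()
    return parser.problems


if __name__ == "__main__":
    page = (
        '<img src="cover.png">'
        '<img src="logo.png" alt="publisher logo">'
        '<iframe src="viewer.html"></iframe>'
    )
    for problem in audit(page):
        print(problem)
```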
part of the motivation behind the current study stems from what delancey put so well: “only one vendor (out of seventeen), project muse, had a publicly available vpat on their website, though 9 others supplied this documentation upon request in under a week.”30 the first step in improving accessibility of resources for our patrons is to discuss accessibility with them, to determine how accessible information resources are today and identify areas of need. if a vpat or, minimally, any form of an accessibility statement is not easily discoverable on a vendor’s website, even if it is available upon request, users with and without disabilities are not able to benefit from this information. are the vendors making it a priority in this case? additionally, since 41 percent of the vendors delancey examined had no vpat at all, what can be done before and aside from reaching out to vendors and stressing the importance of accessibility and of making statements on accessibility easy to find? from legal responsibilities to the dismal reality of digital accessibility, the task of improving library service for patrons with disabilities is daunting, even with the empowering ethical drivers of the library value system. ostergaard created “strategies for acquiring accessible electronic information sources,” an excellent guide, informed by her own accessibility work in her library, that helps librarians begin developing an accessibility plan. steps 3 and 4 of ostergaard’s strategies are particularly relevant to the current study. step 3, “communicating with vendors,” involves inquiring about the accessibility of electronic products in addition to asking about any future plans for accessibility of their product and requesting vpats or other vendor supplied accessibility documentation. 
step 3 also recommends that librarians request that vendors meet wcag 2.0 best practices and incorporate a clause in license agreements that clearly defines accessibility of their products as further demonstration of dedication to accessibility. such communication, it is hoped, would also lead to improved product development.31 once vendors are contacted, ostergaard outlines in step 4 the importance of documenting vendor communication regarding digital accessibility and further suggests assigning a person or team to review information received. ostergaard’s library changed the name of their acquisitions budget to “access budget,” reallocating a portion of their budget to review existing subscriptions, purchase accessible replacements, or in some cases, convert materials to an accessible format. the documentation review allowed the library to make informed decisions about collections and service availability on behalf of library users, but no mention was made of involving users in this process. the article provided a letter template that encompassed the aforementioned concepts and a request for assessment documentation, such as vpats and official statements of compliance. the ostergaard template served as a foundation for the language used in vendor communication for the current study, particularly the vpat or other accessibility documentation request.32 there have been no studies that suggest a way to implement easily discoverable vendor accessibility documentation, even when said documentation is not readily available to the public on the vendors’ sites. 
delancey suggested creating “an open repository for both vendor supplied documentation, and the results of any usability testing,” but this was suggested for internal library use, not public dissemination.33 if this documentation is made more easily available, we can increase patron involvement in the discussion of accessibility of vendor-supplied library resources.

research methods

library-related research has focused on the need for academic libraries to have accessible websites, in part to reach patrons who are participating in distance-education programs.34 a key component of a library’s website is the material it makes available to patrons from vendors, such as databases and database aggregators. because these materials are accessed via vendor platforms, however, they are outside the direct control of the library, making it more difficult to address accessibility concerns. some vendors have made significant efforts to address accessibility needs. some offer a built-in text-to-speech feature for html files or provide documents in a variety of formats, including txt and mp3 files, thereby offering a format that works well with common screen-reading programs or providing a sound file directly. this is of particular benefit to patrons with print disabilities.35 other vendors, such as ebook central (formerly ebrary), have worked to eliminate their flash dependencies. this is recognized as a positive step toward making vendor content usable for all. streaming video and other nonprint-based library materials must also be accessible. a person with visual impairments may be able to hear the soundtrack of the video, but unless an accurate description is provided of what is being presented visually, he or she will miss information such as the names of those speaking.
to complicate matters further, hearing-impaired users of these databases will not be privy to what is verbalized unless accurate captions and transcripts, or an interpreter, are made available for the videos. captions and transcripts are sometimes made available, but can easily be incomplete or incorrect. for example, alexander street press provided closed captioning and transcripts for some collections but not others. even when the captions or transcripts existed, as with a video we tested from ethnographic videos online, they were of low quality, rendering the word “object” as “old pics,” “house” as “mess,” and so forth. one vendor, docuseek, had subtitles to translate from spanish, but no closed captioning or transcript available. hearing-impaired users could not make full use of the video because the subtitles did not include all information presented in the sound track. (transcripts can also be useful to visually impaired users using screen readers.) films on demand had better captions and transcripts, but did not include all the words on the screen in the transcript, such as the title. regardless of the medium, there are multiple ways to provide accessible versions, but they are seldom automatic. librarians must communicate the need for accessible digital files to vendors so they will prioritize it. as long as libraries, one of their main customer groups, accept their offerings whether or not they are accessible to persons with disabilities, vendors have no reason to put great effort into making these improvements. as colker pointed out, commercial vendors are not required to comply with ada regulations under title ii or title iii.36 vendors may also face resource restrictions that hinder their ability to improve their platforms’ accessibility.
37 they are businesses, so it is natural that they would commit a concerted effort to reformat and enhance their platforms and records only if the benefits are expected to outweigh the costs; they must first be made aware of the issue and know that it is important to libraries and their patrons. in much the same way as contracted workers constructing a physical space for a federal or federally funded agency must follow ada standards for accessibility, software vendors should be required to design virtual spaces to be accessible. this comparison was made by the department of education more than twenty years ago, and designing for accessibility from the outset has the added benefit of greatly reducing the need for accommodation after the fact.38 according to cardenes, “at a minimum, a public entity has a duty to solve barriers to information access that the public entity’s purchasing choices create.”39 oswal stressed the importance of integrating the blind user experience into the development of databases from the beginning, as well as finding steps useful for guiding library users after the fact. merely following the rules set out in federal regulations is not enough to provide exemplary service to library patrons. the patrons must also be involved in the process to fully address accessibility needs.40

process and findings

the first objective of this study was to gain a better understanding of the accessibility of our library’s vendor-provided digital resources through the review of vendor-provided accessibility documentation. the second objective was to determine a method of increasing the visibility of accessibility documentation for the benefit of our users and to communicate to them our commitment to improving service to users with disabilities.
with a digital collection consisting of 270 databases, more than 750,000 e-books and e-journals, and more than 12 million streaming media titles, it was difficult to identify an appropriate sample. we needed a collection that would best serve as an illustrative swatch of our library’s digital holdings and, more importantly, a collection that would have the largest impact on our users. we also needed to establish a strategy for obtaining accessibility documentation regarding third-party content as well as create a delivery method for the vpats and other documentation we discovered in the course of our study. similar to other institutions, our library maintains a directory of the most used and most useful databases on the library’s homepage in the form of the a–z list (http://libresources.wichita.edu/az.php). determinations of usefulness are based on input from our reference librarians, who connect with user needs directly, whereas use is determined from annual usage statistics compiled as per standard library procedures. users can browse this directory by subject, search by title, and sort by database type (full-text, streaming media, etc.), and the a–z list is a convenient place for users to begin their research. the directory also served as a convenient place to begin this study, as it presented us with a sample that not only reflected the needs and habits of our patrons but also offered an excellent and diverse list of vendors to work with. beginning with a list of all subscribed databases (270 in 2016) exported directly from the a–z list’s backend, we sorted the list by vendor and determined that 74 vendors would be investigated. university materials indexed by the directory (i.e., the institutional repository and libguides) were excluded from this study. as visibility of accessibility documentation is of concern to this study, our investigation began by visiting the database or vendor’s site and conducting a web search to obtain any information about accessibility.
we were looking for mentions of the following keywords: “section 508” or “section 504,” “w3” or “wcag,” “vpat,” “ada,” and simply “accessibility.” some sites were intuitive: thirty-four vendors (45 percent) had statements that were found online. examples of commonly used documentation, which for the purposes of this study will be referred to as accessibility statements, included “accessibility policy,” “section 508 compliance,” and “accessibility statement.” of those thirty-four vendors who posted accessibility documentation online, eleven provided a vpat or a link thereto in their accessibility statements. if we could not find an accessibility statement on the site, vendors were contacted first via email requesting information and documentation regarding the accessibility of their product using a form letter inspired by the ostergaard template.41 the email address was either found online (typically the “contact us” or technical support email links) or, if no other contact could be found, taken from the list of vendor contacts maintained in the library management system. if a response was not received within thirty days, the vendors were contacted a second time, a suggestion gleaned from mune and agee’s work.42 after all vendors included in the study had been contacted, any who did not provide a vpat were contacted a final time with a specific request for a vpat. for vendors who responded they could not provide a vpat or other accessibility statement, we used a screenshot of their response as documentation. the form letter (see appendix a) used in the current study made it known to vendors that their response would be posted publicly for the benefit of our users.
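as an illustration only, the keyword check described in this section could be automated along the following lines. this is a minimal sketch, not the procedure used in the study, where vendor sites were searched by hand. the function name `find_accessibility_terms`, the regular expressions, and the sample text are all hypothetical; only the keyword list itself comes from the text.

```python
import re

# Keywords come from the study; the regular expressions are an assumption
# about how one might match them without false hits such as "canada".
PATTERNS = {
    "section 508": r"\bsection\s+508\b",
    "section 504": r"\bsection\s+504\b",
    "w3": r"\bw3c?\b",
    "wcag": r"\bwcag\b",
    "vpat": r"\bvpats?\b",
    "ada": r"\bada\b",
    "accessibility": r"\baccessibility\b",
}


def find_accessibility_terms(page_text):
    """Return the sorted list of study keywords that occur in page_text."""
    text = page_text.lower()
    return sorted(k for k, pattern in PATTERNS.items() if re.search(pattern, text))


# Hypothetical snippet of vendor page text, used only to exercise the function.
sample = "Our platform conforms to WCAG 2.0 AA. A VPAT is available on request."
print(find_accessibility_terms(sample))
```

word boundaries are used so that short keywords such as “ada” do not match inside unrelated words; a real check would still need human review of each hit.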
twelve of the remaining vendors responded to our email inquiries with vpats and seven vendors responded with other accessibility documentation. in total, eleven vpats (15 percent) were found online and vpats from twelve vendors (16 percent) were received in response to our emailed request. twenty-three vendors (31 percent) had other accessibility documentation available online, while seven vendors (9 percent) provided other accessibility documentation in response to email inquiries. eight vendors (11 percent) responded they had no official statements or documentation to offer, and thirteen vendors (18 percent) did not respond (see figure 1).

figure 1. results of vendor query for accessibility documentation. vpats found online, 15%; vpats received, 16% (vpats combined, 31%); other accessibility documentation found online, 31%; other accessibility documentation received, 9%; no official statement, 11%; did not respond, 18%.

with the documentation compiled, we needed to establish an appropriate delivery system that would make this accessibility information visible to library users and therefore further the accessibility efforts. our collection cross-section, the a–z list, was chosen because of its prominence in our library’s online research presence as a suitable location to not only store but to convey this documentation to users. we created a clickable icon to be embedded into the databases’ entries in our a–z list created in libguides (a springshare product). clicking the icon would take the user to the vendor’s statement page, directly to the vpat, or to a page we created in libguides to store screen captures of vendor emails and vpats we received as attachments. if a vpat was available, we linked to it above any other documentation because vpats present a more rigorous analysis of the accessibility of third-party-created ict.
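the percentages reported for the vendor query can be rechecked from the raw counts. the short sketch below does so; the counts are those given in the text, while the dictionary labels and variable names are illustrative choices.

```python
# Vendor counts as reported in the text; the labels are paraphrased.
counts = {
    "vpats found online": 11,
    "vpats received by email": 12,
    "other documentation found online": 23,
    "other documentation received by email": 7,
    "no official statement": 8,
    "did not respond": 13,
}

total = sum(counts.values())  # should equal the 74 vendors investigated

# Whole-number percentages, rounded to the nearest integer.
percentages = {label: round(100 * n / total) for label, n in counts.items()}

# Share of vendors for which a VPAT was obtained by either route.
vpat_share = round(
    100 * (counts["vpats found online"] + counts["vpats received by email"]) / total
)

for label, n in counts.items():
    print(f"{label}: {n} of {total} ({percentages[label]} percent)")
print(f"vpats combined: {vpat_share} percent")
```

the six categories account for all 74 vendors, and the rounded shares agree with the figures reported for each category.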
libguides was determined to be a suitable place to house this documentation not only because it made the information easy to find for patrons, but also because springshare built libguides in an increasingly accessible manner and has documented its efforts using vpats for each product (see appendix b).

further study

it is expected that some of the information provided by the vendors is incomplete or inaccurate, despite their best efforts, so the information we provide to patrons from and about the vendors might at times lead our patrons astray. we briefly examined the vpats acquired through this project to inform our work moving forward and found errors in at least half of them. some vendors claimed that skip navigation was available when none was found, while another would have benefitted from it but said it was “not applicable.” others were too brief to be useful, as no explanations were given for their claims. building on this current research, we intend, in collaboration with patrons with disabilities, to further verify the accuracy of key statements made by vendors in their vpats and other accessibility documentation. this analysis will give concrete feedback to vendors on how their sites could be further improved. as stated earlier, giving patrons access requires more than following a set of guidelines; it requires dialog to ensure their needs are fully met.43 it requires not only making the available documents accessible but also testing the accessibility of the platform used to retrieve the documents. as one author put it so well, “a lack of technological access is a solvable problem, but only if it is made a priority.”44 as vendors are not directly subject to enforcement of section 508 and other statutes regarding accessibility of the products they provide to libraries, vpats are truly voluntary. as such, the level of effort and detail of the product assessments are inconsistent and the accuracy of the documentation is questionable.
we intend to continue to be involved in the digital-accessibility initiative in part through our analysis of our digital-library presence, utilizing user input and expanding users’ role in improving the user experience. this would enable us to further improve our libraries’ service to users with disabilities. if we, as library professionals and institutions, stand together and each say our part, vendors will realize this is an important issue to address. also, it is important that we, as service providers for the public good, act as a bridge between these vendors, who at times do not make good service information available to their customers, and the patrons we serve. it may be a small step, but providing links to the vpats and other accessibility statements from vendors right where the patrons need them is an important step in meeting the patrons where they are and showing them help is available. we can show patrons we care and will work with them to improve the now limited accessibility of not only scholarly information itself, but even of the platforms in which it is housed.

appendix a: accessibility documentation request email template

subject line: vpat request

thank you for the information you provided answering our inquiry regarding the accessibility of your electronic product. wichita state university libraries has set a goal of improving the accessibility of the electronic and information technology we provide to our patrons. in accordance with section 504 of the rehabilitation act and title ii of the americans with disabilities act, do you happen to have a voluntary product accessibility template (vpat) available, or have you made plans to do further accessibility testing on your product? the vpat documentation can be found on the u.s.
department of state website: http://www.state.gov/m/irm/impact/126343.htm.

appendix b: vpat and other accessibility documentation urls used in the databases a–z list.

(list current as of october 20, 2017. library subscriptions may have changed. vendors may have updated urls or added additional documentation since october 20. research on this project is ongoing. please see http://libresources.wichita.edu/az.php for a current list of vendor accessibility documentation.)

vendor: urls
aapg (american association of petroleum geologists): no accessibility documentation available
abc-clio: no response
acls (american council of learned societies): http://www.humanitiesebook.org/about/for-librarians/#ada-compliance-and-accessibility
acm (association of computing machinery): https://www.acm.org/accessibility
acs (american chemical society): https://www.acs.org/content/acs/en/accessibility-statement.html
adam matthew digital: http://libresources.wichita.edu/c.php?g=583127&p=4026332
aiaa (american institute of aeronautics & astronautics): http://libresources.wichita.edu/ld.php?content_id=32264954
alexander street press: https://alexanderstreet.com/page/accessibility-statement
american institute of physics: http://www.scitation.org/faqs
american mathematical society: http://www.ams.org/about-us/vpat-mathscinet-2014-ams.pdf
apa (american psychological association): http://www.apa.org/about/accessibility.aspx
asm international: no response
asme (american society of mechanical engineers): no accessibility documentation available
astm: no accessibility documentation available
bioone: http://www.bioone.org/page/resources/accessibility
books 24x7: https://documentation.skillsoft.com/bkb/qrc/assistiveqrc.pdf
britannica: http://help.eb.com/bolae/accessibility_policy.htm
business expert press: http://media2.proquest.com/documents/ebookcentral_vpat.pdf
cabell’s: no response
cambridge crystallographic data centre: https://www.ccdc.cam.ac.uk/termsandconditions/
cambridge university press: http://www.cambridge.org/about-us/accessibility/
cas: no accessibility documentation available
clcd (children’s literature comprehensive database): no response
conference board: http://www.conferenceboard.ca/accessibility/resources.aspx?aspxautodetectcookiesupport=1
cq press: http://library.cqpress.com/cqresearcher/html/public/vpat.html
credo reference: https://credoreference.zendesk.com/hc/en-us/articles/201429069-accessibility
datazoa: http://libresources.wichita.edu/accessibilitystatements/datazoavpat
docuseek2: https://docuseek2.wikispaces.com/section+508+compliance+statement
ebsco: https://www.ebscohost.com/government/full-508-accessibility
ei engineering village: https://www.elsevier.com/solutions/engineering-village/features/accessibility
elsevier: https://www.elsevier.com/solutions/sciencedirect/support/web-accessibility
gale: https://support.gale.com/technical/618
google: https://www.google.com/accessibility/initiatives-research.html
hathitrust: https://www.hathitrust.org/accessibility
heinonline: https://www.wshein.com/accessibility/
ibisworld: no response
ieee: https://www.ieee.org/accessibility_statement.html
infobase learning: http://support.infobaselearning.com/index.php?/tech_support/knowledgebase/article/view/1318/0/ada-usability-statement
infogroup: http://libresources.wichita.edu/c.php?g=583127&p=4286285
institute of physics: http://iopscience.iop.org/page/accessibility
interdok: no response
jstor: https://about.jstor.org/accessibility/
kanopy: https://help.kanopystreaming.com/hc/en-us/articles/210691557-what-is-kanopy-s-position-on-accessibility
lexisnexis: http://www.lexisnexis.com/gsa/76/accessible.asp
library of congress: https://www.congress.gov/accessibility
mergent: no accessibility documentation available
national academies press: no response
national library of medicine: https://www.nlm.nih.gov/accessibility.html
naxos: http://libresources.wichita.edu/c.php?g=583127&p=4287131
ncjrs: https://www.justice.gov/accessibility/accessibility-information
newsbank: http://libresources.wichita.edu/c.php?g=583127&p=4457078
oclc: https://www.oclc.org/en/policies/accessibility.html
ovid: http://ovidsupport.custhelp.com/app/answers/detail/a_id/5909/~/is-the-ovid-interface-section-508-compliant%3f
oxford university press: https://global.oup.com/academic/accessibility/?cc=us&lang=en&
projectmuse: https://muse.jhu.edu/accessibility
proquest: http://media2.proquest.com/documents/proquest_academic_vpat.pdf, http://media2.proquest.com/documents/ebookcentral_vpat.pdf
readex: http://uniaccessig.org/lua/wp-content/uploads/2014/11/readex.pdf
sage: https://us.sagepub.com/en-us/nam/accessibility-0
salem press: no response
sbrnet: no response
springer: https://github.com/springernature/vpat/blob/master/springerlink.md
standard & poor’s: no response
swank: no accessibility documentation available (http://libresources.wichita.edu/accessibilitystatements/swankaccessibility)
taylor & francis: http://libresources.wichita.edu/c.php?g=583127&p=4539268
thomson reuters: https://clarivate.com/wp-content/uploads/2018/02/pacr_wos_5.27_jan-2018_v1.0.pdf
us department of commerce: http://osec.doc.gov/accessibility/accessibliity_statement.html
us department of education: https://www2.ed.gov/notices/accessibility/index.html
us government printing office: https://www.gpo.gov/accessibility
university of chicago: no accessibility documentation available
university of michigan: https://www.press.umich.edu/about#accessibility
uptodate: http://libresources.wichita.edu/c.php?g=583127&p=4691631
valueline: http://libresources.wichita.edu/accessibilitystatements/valuelineaccessibility
wrds (wharton research data services): https://wrds-www.wharton.upenn.edu/pages/wrds-508-compliance/
wiley: http://olabout.wiley.com/wileycda/section/id-406157.html

references

1 neil savage, “weaving the web,” communications of the acm 60, no. 6 (june 2017): 22.
2 ruth colker, “the americans with disabilities act is outdated,” drake law review 63, no. 3 (2015): 799.
3 colker, “the americans with disabilities act,” 817; joanne oud, “accessibility of vendor-created database tutorials for people with disabilities,” information technology and libraries 35, no. 4 (2016): 13–14.
4 laura delancey and kirsten ostergaard, “accessibility for electronic resources librarians,” serials librarian 71, no. 3–4 (2016): 181, https://doi.org/10.1080/0361526x.2016.1254134.
5 americans with disabilities act of 1990, pub. l. no. 101-336, 104 stat. 327 (1990).
6 rehabilitation act of 1973, pub. l. no. 93-112, 87 stat. 355 (1973).
7 discrimination on the basis of disability in federally assisted programs and activities, 77 fed. reg. 14,972 (march 14, 2012) (to be codified at 34 cfr pt. 104).
8 delancey and ostergaard, “accessibility for electronic resources,” 180.
9 architectural and transportation barriers compliance board, 65 fed. reg. 80,500, 80,524 (december 21, 2000) (to be codified at 36 cfr pt. 1194).
10 architectural and transportation barriers compliance board, 82 fed. reg. 5,790 (january 19, 2017) (to be codified at 36 cfr pt. 1193-1194).
11 29 usc §794d, at 289 (2016).
12 architectural and transportation barriers compliance board, 82 fed. reg. 5,790, 5,791 (january 19, 2017) (to be codified at 36 cfr pt. 1193-1194).
13 paul t. jaeger, “section 508 goes to the library: complying with federal legal standards to produce accessible electronic and information technology in libraries,” information technology and disabilities 8, no. 2 (2002), http://link.galegroup.com/apps/doc/a207644357/aone?u=9211haea&sid=aone&xid=4c7f77da.
14 architectural and transportation barriers compliance board, 82 fed. reg. 5,790, 5,791 (january 19, 2017) (to be codified at 36 cfr pt. 1193-1194).
15 ben caldwell et al., eds., “web content accessibility guidelines (wcag) 2.0,” last modified december 11, 2008, http://www.w3.org/tr/2008/rec-wcag20-20081211/.
16 laura delancey, “assessing the accuracy of vendor-supplied accessibility documentation,” library hi tech 33, no. 1 (2015): 108.
17 kirsten ostergaard, “accessibility from scratch: one library’s journey to prioritize the accessibility of electronic information resources,” serials librarian 69, no. 2 (2015): 159, https://doi.org/10.1080/0361526x.2015.1069777.
18 jaeger, “section 508.”
19 mary case et al., eds., “report of the arl joint task force on services to patrons with print disabilities,” association of research libraries, november 2, 2012, p. 29, http://www.arl.org/storage/documents/publications/print-disabilities-tfreport02nov12.pdf.
20 delancey and ostergaard, “accessibility for electronic resources,” 180.
21 jennifer tatomir and joan c. durrance, “overcoming the information gap: measuring the accessibility of library databases to adaptive technology users,” library hi tech 28, no. 4 (2010): 581, https://doi.org/10.1108/07378831011096240.
22 tatomir and durrance, “overcoming the information gap,” 584.
23 delancey, “assessing the accuracy,” 104–5.
24 christina mune and ann agee, “are e-books for everyone? an evaluation of academic e-book platforms’ accessibility features,” journal of electronic resources librarianship 28, no. 3 (2016): 172–75, https://doi.org/10.1080/1941126x.2016.1200927.
25 mune and agee, “are e-books for everyone?,” 175.
26 joanne oud, “accessibility of vendor-created database tutorials for people with disabilities,” information technology and libraries 35, no. 4 (2016): 12, https://doi.org/10.6017/ital.v35i4.9469.
27 mark hachman, “tested: how flash destroys your browser’s performance,” pc world, august 7, 2015, https://www.pcworld.com/article/2960741/browsers/tested-how-flash-destroys-your-browsers-performance.html.
28 oud, “accessibility of vendor-created database tutorials,” 12.
29 delancey, “assessing the accuracy,” 106–7.
30 delancey, “assessing the accuracy,” 105.
31 kirsten ostergaard, “accessibility from scratch: one library’s journey to prioritize the accessibility of electronic information resources,” serials librarian 69, no. 2 (2015): 162–65, https://doi.org/10.1080/0361526x.2015.1069777.
32 ostergaard, “accessibility from scratch,” 164.
33 delancey, “assessing the accuracy,” 111.
34 cynthia guyer and michelle uzeta, “assistive technology obligations for postsecondary education institutions,” journal of access services 6, no. 1/2 (2009): 29; oud, “accessibility of vendor-created database tutorials,” 7.
35 mune and agee, “are e-books for everyone?,” 173.
36 colker, “the americans with disabilities act,” 792–93.
37 delancey, “assessing the accuracy,” 107.
38 colker, “the americans with disabilities act,” 814; mune and agee, “are e-books for everyone?,” 182.
39 adriana cardenes to dr. james rosser, april 7, 1997, private collection, quoted in colker, “the americans with disabilities act is outdated,” 815.
40 sushil k. oswal, “access to digital library databases in higher education: design problems and infrastructural gaps,” work 48, no. 3 (2014): 316.
41 ostergaard, “accessibility from scratch,” 164.
42 mune and agee, “are e-books for everyone?,” 175.
43 delancey, “assessing the accuracy,” 108; mune and agee, “are e-books for everyone?,” 181.
44 colker, “the americans with disabilities act,” 817.
experiences of migrating to an open source integrated library system
vandana singh
information technology and libraries | march 2013 36
abstract
interest in migrating to open-source integrated library systems is continually growing in libraries. alongside this interest, the lack of empirical research and evidence for comparing migration processes causes considerable anxiety among interested librarians. in this research, twenty librarians who have worked in libraries that migrated to an open-source integrated library system (ils) or are in the process of migrating were interviewed. the interviews focused on their experiences and the lessons learned in the process of migration. the results from the interviews are used to create guidelines/best practices for each stage of the adoption process of an open-source ils. these guidelines will be helpful for librarians who want to research and adopt an open-source ils.
introduction
open-source software (oss) has become increasingly popular in libraries, and every year more libraries migrate to an open-source integrated library system.1 while there are many discrete open-source applications used by libraries, this paper focuses on the integrated library system (ils), which supports core operations at most libraries. the two most popular open-source ilss in the united states are koha and evergreen, and they are being positioned as alternatives to proprietary ilss.
2 as open-source software becomes more widely used, it is not enough to identify which software is most appropriate for libraries; it is also important to identify best practices, common problems, and misconceptions with the adoption of these software packages. the literature on open-source ilss is usually in the form of a case study from an individual library or a detailed account of one or two aspects of the process of selection, migration, and adoption. in our interactions with librarians from across the country, we found that there are no consolidated resources for researching different open-source ilss and for sharing the experiences of the people using them. librarians who are interested in open-source ilss cannot find one resource that gives them an overview of the necessary information.
in this research, we interviewed twenty librarians from different types and sizes of libraries and gathered their experiences to create generalized guidelines for the adoption of open-source ilss. these guidelines are at a broader level than one single case study and cover all the different stages of the adoption lifecycle. the experiences of librarians are useful for people who are evaluating open-source ilss as well as those who are in the process of adoption. learning from their experiences will help librarians avoid reinventing the wheel. this type of research helps librarians by empowering them with the information they need; it also helps us understand the current status of this popular software.
vandana singh (vandana@utk.edu) is assistant professor, school of information sciences, university of tennessee, knoxville, tennessee.
literature review
as mentioned earlier, most of the literature on open-source ils is practitioner-based and provides case studies or single steps in the process of adoption.
these research studies and resources are useful but do not address the broad information needs of the librarians who are researching the topic of open-source ilss. every library is different, so no two libraries are going to take the same path in the adoption process. the usefulness of these articles depends on whether the searcher can find one in a similar environment. another issue is the amount of information given in these resources. often these papers discuss only one aspect of moving to an open-source ils, for example choosing the open-source ils. if they do cover the whole process, there is usually not enough detail to know how they did it. for example, morton-owens, hanson, and walls organize their paper into five sections: motivation and requirements analysis, software selection, configuration, training, and maintenance.3 however, each section includes more main points than description.
another relevant stream of literature includes articles that compare different open-source ilss. these range from little more than links to different open-source projects to in-depth comparisons.4 for example, muller evaluated open-source communities for different ilss on forty criteria and then compared the ilss on over eight hundred functions and features.5 these types of articles are very useful for those who are trying to become acquainted with the different open-source ilss that are available and are in the evaluation phase of the process. again, they are not helpful in understanding the entire process of adoption. some best practices articles such as tennant may be a little older, but his nine tips are still valid and very useful as a good foundation for anyone thinking about making the switch to an open-source ils.6
what are the factors for moving to an open-source ils?
one reason why an open-source ils appeals to libraries is its underlying philosophy: “open source and open access are philosophically linked to intellectual freedom, which is ultimately the mission of libraries.”7 the other two common reasons are cost and functionality.
the literature covering the decision to move to an open-source ils makes it clear that there is a wide variety of ways that libraries come to this decision. in espiau-bechetoille, bernon, bruley, and mousin, the consortium made the decision in four parts.8 the article states that they initially determined that four open-source ilss met their needs (koha, emilda, gnuteca, and evergreen), although it is somewhat vague as to how they determined that koha was the best for their situation. indeed, most of the article is about how the three libraries involved had to work together, coordinating and dividing responsibilities.
bissels shares that money was the main reason that the complementary and alternative medicine library and information service (camlis) decided to migrate to koha.9 they explain the process of making that decision. camlis was being developed from nothing, which makes their situation different from that of most libraries, and hence the process is different as well.
michigan is an area known for its number of evergreen libraries. much of that is due to the michigan library consortium (mlc). dykhuis explains the long, involved process that led to a number of evergreen installations.10 mlc provides services to michigan libraries, such as training and support. when they started looking for an ils that all libraries could use, the main concerns were cost and functionality, which are the two key aspects that are mentioned in any discussion about choosing an ils.
kohn and mccloy state that they decided to migrate to a new ils due to frustration with their current ils and that they involved all six of their librarians in the decision-making process.11 dennison and lewis show another reason why people migrate to an open-source ils.12 they say that the proprietary system they were using was much more complicated than they needed. in addition, because of staff turnover no one really understood the system. this lack of expertise combined with increasing annual costs led to the decision to move to an open-source ils. an important lesson to take from this article is that they included all six of their librarians in the decision-making process. for a smaller library where everyone is an expert in their area of the library, it is important to get everyone involved in order to make sure that important functions or needed capabilities are not overlooked.
almost any library that chooses an open-source ils will name cost as one of its primary reasons. functionality is usually what determines which ils they choose. riewe conducted a study where he asked why each library chose its current ils.13 open-source libraries responded most often with the ability to customize, freedom from vendor lock-in, portability, and cost.
how does migration happen?
there are two general ways to do a migration: all at once or in stages. kohn and mccloy discuss a three-phase migration.14 the reason for this method was to spread the cost over several years. they did the public website and federated catalog as phase one and did the backend part during phases two and three. when multiple libraries are involved, phased migration is more like what is described in dykhuis.15 in that case, first a pilot program was created where a few libraries migrated over to the new system. when that was successful, then more libraries migrated.
in contrast to a phased migration, walls discusses a migration completed in three months.16 this time includes installation, testing, and configuration. one interesting decision they made was to migrate at the end of the fiscal year in order to limit the amount of acquisitions data to be migrated. dennison and lewis completed their migration in two months. in this migration, most of the work was done by the company that was hosting their system.17 this limited the amount of expertise that the library staff needed and made the migration much smoother from their perspective.
migration can also be an opportunity; for example, morton-owens, hanson, and walls mention that they used the migration to koha to synchronize circulation rules between the branches.18 it was also used to weed out inactive patrons (anyone who had not used the library in two years). data migration can be a problem, though. in the old system, the location code had been used for where the item was within the branch library, what kind of item it was, and how it circulated, but these are three separate fields in koha. however, to some extent these issues are true of any migration between different systems.
the migration experience is not always a smooth transition. one of the advantages of open source is the ability to customize and to develop functions that are specific to your library. in the case of the new york academy of medicine library (nyam) working with its consortium waldo (westchester academic library directors organization), it was the decision to have developments completed before migration that caused the problems.19 their migration schedule was delayed by a month, and even after the delay not all of the eleven key features were complete.
in addition, their migration took place when liblime (a proprietary vendor), with whom they were working, announced its separation from the koha open-source community, which caused additional confusion. there are a couple of lessons to take from this. first, if doing development, be sure that the time needed is built into the migration schedule. also, when choosing an ils, think about how many developments are going to be necessary to successfully run the ils in your environment. lastly, try to prioritize the developments to minimize the number needed before “going live.”
what does the literature say about training?
very little is available about the training process for open-source ils. in current studies, training can be done in two ways: either by buying training from a vendor, or doing it internally.20 dennison and lewis found that having staff work on the system together at first and then try it independently was the most successful.21 they had a demonstration system to practice on, which also helped. in addition to this self-training, they had onsite training done by module, which allowed staff to attend only the training that was relevant and needed for them.
in all of the articles discussed in this section, only one talks about ongoing maintenance.22 the two-paragraph section includes suggested methods and does not mention anything about the amount of time or expertise needed for ongoing maintenance.
in summary, in this literature review we found that there is research about open-source ils but that there is a need for much more work in this area. it was found that research articles and practitioner pieces are available and talk about different aspects of the adoption process. the main reasons for adoption are identified. there are also a few scattered individual articles about the process of migration, training, and maintenance.
there is a gap in the studies of open-source ils, and there is no comprehensive study that documents the process, explains the steps, and identifies best practices and challenges for librarians who are interested.
data sources
the objective of data collection was to gather a wide range of experiences from libraries of various types and sizes. e-mail invitations for interviews were sent to the koha and evergreen discussion lists and to several other library-related discussion lists. the e-mail requested volunteers for a telephone interview to share their experiences with open-source integrated library systems. potential participants identified themselves as being willing to be interviewed for the project via e-mail and were then contacted by researchers to set up times for phone interviews. the list of interview questions was e-mailed to the participants before the interviews so that they could review the questions and have enough time to reflect on their experiences.
the interviews were conducted with librarians working in a variety of libraries, including nine libraries using evergreen and one in the process of migrating to evergreen. seven libraries were using koha, two were using other open-source ilss, and one was using a proprietary ils while evaluating open source. public libraries were the most numerous with eleven respondents, while there were also four special libraries, three academic libraries, and one school library. researchers also requested information about the size of the library collection. seven libraries owned collections of less than 100,000 items, seven had collections of 100,001–999,999 items, and four libraries owned collections of over 1,000,000 items. geographically, the respondents ranged all over the united states and included one library located in afghanistan (although the ils was installed in the united states). table 1 details the description of the data.
data collection method
interviews were chosen as the primary means of data collection in order to gather rich information that could be analyzed using qualitative methods. researchers sought to interview professionals from a variety of library types and sizes in order to collect a variety of different experiences regarding the selection, implementation, and ongoing maintenance of open-source ils.
interviewing was the chosen methodology for several reasons. first, the goal was to go past the practitioner articles to see what kinds of trends there are in the migration process. this requires getting experiences from multiple librarians. interviews provide the in-depth “case-study description” that we were looking for.23 in addition, the most useful aspect of interviewing is the ability to follow up on an answer that the participant gives.24 this ensures that the same type of information is gathered from every interview. this is unlike surveys, where sometimes participants do not respond in a way that answers what the researcher really wants to know. in our case, we used telephone interviews due to the geographic dispersion of the participants. this allowed us to talk to librarians from all over the country instead of just within our area. the interview questions are listed in appendix a.
data analysis methodology
interviews were transcribed, and identifying information was then removed from each of the transcribed documents. the transcripts were then uploaded into dedoose (www.dedoose.com), a web-based analysis program supporting qualitative and mixed methods research. dedoose provides online excerpt selection, coding, and analysis by multiple researchers for multiple documents. the research team used an iterative process of qualitatively analyzing the resulting documents. this method used multiple reviews of the data to initially code large excerpts, which were then analyzed twice more to extract common themes and ideas.
researchers began by reviewing each document for quantitative information, including the library type, ils in use, number of it staff, and size of the collection. this information was added as metadata descriptors to each document in dedoose. upon review of the transcriptions and in discussions about the interview process, researchers began a content analysis of the qualitative data. codes were created based on this initial analysis to aid in categorizing the data from the interviews. two coders coded the entire dataset, assigning categories and themes to the excerpts of the interview transcriptions. all of the excerpts from each coder were used to create two tests. each coder then took the test of the other's codes by choosing their own codes for each excerpt. the two tests yielded scores of .96 and .95 on cohen’s kappa statistic, indicating very high inter-coder reliability.

table 1. description of libraries

| library size (number of items in collection) | library type | ils used |
| --- | --- | --- |
| under 100,000 | academic | koha |
| 100,000–1,000,000 | public | evergreen |
| under 100,000 | special | proprietary—considering open-source |
| under 100,000 | public | koha |
| | school | koha |
| 100,000–1,000,000 | public | millennium—in process of migrating to evergreen |
| 100,000–1,000,000 | public | evergreen |
| 100,000–1,000,000 | special | koha |
| under 100,000 | public | koha |
| | public | evergreen |
| 100,000–1,000,000 | academic | evergreen-equinox |
| under 100,000 | special | koha |
| over 1,000,000 | academic | kuali ole |
| 100,000–1,000,000 | public | evergreen-equinox |
| over 1,000,000 | public | evergreen |
| 100,000–1,000,000 | public | evergreen |
| under 100,000 | public | koha-bywater |
| over 1,000,000 | public | evergreen-equinox |
| under 100,000 | public | evergreen |
| over 1,000,000 | special | collective access |

results
results from the interview questions were divided into eight categories identified as stages of migration, starting with evaluation of the ils, creation of a demonstration
site, data preparation, identification of customization and development needs, data migration, staff training and user testing, and going live and long-term maintenance plans. best practices and challenges for each of the stages are presented below. this section begins with some general considerations gleaned from the responses.
general considerations when migrating to an open-source ils
• create awareness about open-source culture in your library—let them know what to expect.
• develop it skills internally even if you use a vendor.
• assess your staff’s abilities before committing. knowing what your staff can do will help determine whether you need to work with a vendor and to what degree or if you can do it alone. it is also a way to determine who is going to be on your migration team.
• have a demonstration system; pre-migration, it can be used to test and train, and after migration it can be used to help find solutions to problems. this will also help develop skills internally.
• communication is key.
  o if working with a vendor either as a single library or as a consortium, have a designated liaison with the vendor so all questions go through one person. in a consortium, ensure that everybody knows what is going on.
• be prepared to commit a significant amount of staff time for testing, development, and migration, especially if you are not hiring a proprietary vendor for support.
working with vendors
• read contracts carefully. do not be afraid to ask questions and request changes. sometimes the other party has a completely different meaning for a word than you do. make sure you are on the same page.
• ensure that there is an explicit timeline and procedure for the release of usable source code.
• see that you are guaranteed and entitled to access the source code in case you need to switch developers, bring additional developers on board, or try to fix problems in-house.
• provide specific examples when reporting problems.
specific examples will help the developers determine what the problem is and will help prevent any miscommunication.
• designate a liaison between library staff and developers. the liaison will have to be someone who understands or can learn enough about what the developers are doing so that he or she can translate any problems or complaints from one group to the other.
• set up regular meetings for those involved in the migration project. regular meetings keep everyone focused and on task. they also provide an opportunity for questions, concerns, and problems to be addressed quickly.
sample quote from interviews:
one of the main things that came up is working with equinox, it was amazing. to start with, they were very, very helpful. and i had made an assumption, and i think the rest of us had, too, that we were working with, that this was developed by librarians, and that the terminology used would be library jargon. but that was not the case. we had some stumbling points over, we would say, okay, we want this, or this is a transaction, or that’s a bill, but that’s not what they called it. they didn’t call it a transaction, or they didn’t call it a bill. and so when we wrote the contract, we wrote it so that none of the patrons’ current checkout record would migrate, which is a big issue. and we didn’t realize that we weren’t using the right terminology in order to put that in the contract so that those current checkouts would move over with the migration and not just the record.
stage 1—evaluation
when deciding whether to migrate to open source and which open-source ils is best for your library, start with two questions: who makes the decision, and on what basis?
in practice, who makes the decision?
• if a single library, one or two people make the decision, usually the library director and whoever is serving as the tech person.
• if in a consortium, a committee makes the decision, often either the library directors or tech people.
best practice suggestion: regardless of the size of the library system, even though these are the people making the decisions, you should always try to include as many groups as possible in the decision to move to open source.
which ils?
• make a list of requirements based on your current system and a wish list of requirements for the new system. this is one area where you can involve more than just the system staff. asking the different departments (cataloging, acquisitions, and circulation) what their needs are ensures that the final decision includes everyone.
• talk to other libraries that have made the move to open source. they are a great resource for seeing how the system actually works, asking questions about the migration process, and providing information about open-source problems. if available, talk to a library that migrated from your current proprietary system. some systems are easier to migrate from than others, so this would be an opportunity to find out about any specific problems.
stage 2—set up a demonstration site
• this is the most important guideline in the entire paper. create a demonstration site before making a final decision.
  o if there is still confusion in your team about which ils to use, setting up a demo site and installing koha and evergreen will be the best way to decide which one works for your situation.
  o doing at least one test migration will show what kind of data preparation needs to be done, usually by doing data mapping. data mapping is where you determine where the fields in your current system go when you move into the new system. another often-used term for data mapping is staging tables.
  o the demo site is also a good way to do staff training when needed.
  o the demo site also provides a way to determine what the best setup rules, policies, and settings are by testing them in advance.
  o it provides an opportunity to learn the processes of the different modules and how they differ from your library’s current practices.
  o most importantly, it serves as a test run for migration, which will make the actual migration go smoothly.
sample quotes from interviews:
do you think that the tests with the data and doing that really helped?
oh yes, we would have had a disaster if we hadn’t done three tests and test loads. the pals office has done conversions multiple times before so they have it done, and we have good tech people. so they knew that the three test loads would be a good thing.
we did discover some of the tools that should be used, like for example one of the things that’s recommended for evergreen patron migration is to have a staging table, so you dump all your records into a database that you can then use to create the records in the evergreen tables. and you know we found out why that was important by running into a couple, a few problems with not being able to line up the data in the multiple fields. but you know that’s the sort of thing we expect. that’s pretty, i classify it as pretty typical migration learning, is finding out what works one way, what doesn’t the other. but you know that was a good thing because all the documents were saying, “you should use a staging table.” and we had to figure out ourselves why that was such a good idea.
you should use a staging table for migration, i.e., move records into a database that is then used to create records in evergreen. it helps because some data doesn't line up in the same fields. it's a good idea to set up tables and rules far in advance in order to test before migration. it's very important to do data mapping very carefully because if you lose anything during migration it's difficult to get it back.
check it to make sure that all the fields will be transported correctly, and run tests while the old system is still up to make sure everything is there.
stage 3—data preparation
• clean up the data in advance. the better the data is, the easier it will transfer. this is also an opportunity to start fresh in a new system, so if there were inconsistencies or irritations in the old system this is a good time to fix them.
  o weeding—if you have records (either materials or patrons) that are out of date, get rid of them. the fewer the records, the easier migration will be. in addition, vendors often charge by record, so why pay for records you do not need?
• consistency in data is key. if multiple people are working on the data, make sure they are working based on the same standards.
• do a fine amnesty when migrating to a new system. depending on the systems (current and new), it is sometimes impossible or very difficult to transfer fine data into the new ils, so doing a fine amnesty will make the process simpler.
• spot check data (during testing, during migration, and after migration). catching problems early means there will be less work trying to fix problems later.
sample quotes from interviews:
i would say that if you’re considering converting to an ils software, that you’ve really got to do the data mapping very carefully with a fine-toothed comb because you don’t want to lose data. it’s too hard to get it back in.
the data needs to be normalized so that the numbers of fields are uniform, names are in the correct order, and data is displayed correctly. the library has had to decide whether it is worthwhile to do things like getting rid of old abbreviations, etc. to make the data more easily understood.
problems occur with old data if information such as note fields has been entered inconsistently. it's important to have procedures and to make sure everyone is following them.
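the data mapping and normalization advice above can be illustrated with a small sketch. the python below is a hypothetical example, not code from koha, evergreen, or any interviewed library: the legacy column names, the `FIELD_MAP` table, and the required-field rule are all assumptions made for illustration. it copies legacy patron rows into a staging list, renames fields according to the map, normalizes names to a single order, and sets aside any row that has an unmapped column or a missing required value so that nothing is lost silently.

```python
import re

# hypothetical mapping from legacy export columns to new-system field names
FIELD_MAP = {
    "PATRON NAME": "name",
    "BARCODE": "barcode",
    "LAST ACTIVE": "last_active",
}

# hypothetical rule: a staged record must have these fields filled in
REQUIRED = ("name", "barcode")


def normalize_name(raw):
    """Collapse whitespace and return the name as 'last, first'."""
    cleaned = re.sub(r"\s+", " ", raw).strip()
    if "," in cleaned:
        last, first = [part.strip() for part in cleaned.split(",", 1)]
    else:
        parts = cleaned.split(" ")
        first, last = " ".join(parts[:-1]), parts[-1]
    # strip(", ") removes the dangling separator for single-word names
    return f"{last}, {first}".strip(", ")


def stage(rows):
    """Map legacy rows into staged records and collect rows needing review.

    Returns (staged, problems). Each problem is a tuple of
    (row index, unmapped legacy columns, missing required fields).
    """
    staged, problems = [], []
    for index, row in enumerate(rows):
        # legacy columns with no destination would be lost; report them
        unmapped = sorted(set(row) - set(FIELD_MAP))
        record = {new: row.get(old, "").strip() for old, new in FIELD_MAP.items()}
        record["name"] = normalize_name(record["name"])
        missing = [field for field in REQUIRED if not record[field]]
        if unmapped or missing:
            problems.append((index, unmapped, missing))
        else:
            staged.append(record)
    return staged, problems
```

running `stage` against a test export while the old system is still available produces a list of problem rows to fix before the real migration, which is the kind of spot check the interviewees describe.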
often things are put in different places, which causes a lot of trouble.
they are doing a lot of cleanup of data, such as reducing the number of unique values in the case of some items that had a huge number of values in a drop down list. would like to spend more time on data cleanup but need to go ahead and get data migrated.
stage 4—development/customization
• one benefit of using an open-source ils is that any development done by any library comes back to the community, so often if you want something done, someone else might have already created that functionality and you can use it.
• develop partnerships. often if you want a specific development, someone else does too. if your staff does not have the expertise, then you could provide more of the funding and the partner could provide the tech skills or vice versa. partnerships mean the development will cost less than if you did it alone.
• grant money is also available for open-source development and may be another funding option.
sample quotes from interviews:
the library does its own minor customizations and uses equinox for major jobs. they will lay out and prepare everything then hire equinox to write and implement new code.
the library tries not to do things on its own but always looks for partnerships when doing any customizations. that way libraries that have similar needs can share resources.
stage 5—migration process
• write workflows and policies/rules beforehand. writing these when working on the demo site should provide step-by-step instructions on how to do the final migration.
• having regular meetings during the migration process ensures that everyone stays on the same page and prevents miscommunications that will slow down the process.
• if many libraries are involved, migration in waves will make things easier. this is generally a situation with a statewide consortium.
usually there is a pilot migration of four to eight libraries, then after that, each wave gets a little bigger as the process becomes more practiced. this can also be a useful model if the libraries involved in the consortium are accepting the migration at different rates.
• for a consortium that is coming from multiple ilss, having a vendor will make it easier. this is not to say that it could not be done without a vendor, but migrating from system a is going to be different than migrating from system b. this increases the complexity, which can make working with a vendor more cost effective.
stage 6—staff training and user testing
• who does the training? there are two main ways: by a vendor or internally.
  o if trained by a vendor, there are two options:
    ▪ the vendor sends someone to the library to conduct training.
    ▪ the library sends someone to the vendor for training and then he or she comes back and trains the rest of the staff.
  o if trained internally, there are a lot of training materials available. there are several libraries that have created their own materials and then made them available online. this is another time where having contacts with other libraries can help in using common resources.
• documentation is important for training. the best way is to find what documentation is already available and then customize it for your system.
• do training fairly close to the “go live” date.
• use a day or two for training. if a consortium is spread out geographically, use webinars and wikis.
• when doing training, have specific tasks to do. this can be done a few ways.
  o do the specific tasks at the training.
  o demonstrate the tasks at training and then give “homework” where the staff does the specific tasks independently. to implement this option, staff has to have access to a demo system.
o have staff try the tests on their own and use the training session for questions or problems they had. sample quotes from interviews: well we had, we hired equinox to come and do 2 days of training with us. so they’re here and did hands-on training with us. and then we also, they provided some packets of exercises that people could do on their own. and we had the system up and running in the background so that they could play with it about a week before we actually went live to the public so that they could get used to it, figure out how things worked, and work with it a little bit so they could answer questions before the public came and said, hey, how do i find my record, and i can’t get into this anymore. and the training was really good, but the hands-on was the best. and it’s not a difficult system to work, but you just need a little experience with it before it makes sense. evergreen runs a test server that anybody can download the staff client for that and work in their test server and just examine all of the records and how the system works, to figure out our workflows. we looked up documentation online—evergreen, indiana, pines, various places—copied the documentation they so graciously hosted online for everybody to use, went through it, found what worked for us. those couple staff members worked with other staff. we printed out kind of our little how-to guides for other people, depending on which worked, and told them they’re going to sit down, we’ve got terminals set up here, sit down and learn it. the admin person, she went through some quite detailed training. she went to atlanta and had training from equinox on a lot of aspects of evergreen. 
and then we also, she came back, and then she did training for all the libraries in the consortium, kind of an intensive day-long or half-day-long thing that she offered in several different central geographic locations so that all the libraries would have a chance to go and attend without having to drive too far. and we also did webinars, we got a couple webinars for the real outlying libraries. and we also have ongoing weekly webinars. and we have a wiki set up where we put all the information in the online manual and stuff like that.

all the training sessions were recorded, and so we had them on cd for new people coming on board.

marketing for patrons
• most libraries have not done anything elaborate, generally just announcements through posters, local papers, flyers, and on websites.
• if the migration is greatly changing the situation for patrons, then more marketing is needed.
• set up a demo computer for patrons to try or hold classes once the system is up.

training for patrons
• most libraries did not find this necessary. either the system is easy to use or it is set up to look like the old system.
• if training patrons, create online tutorials.

stage 7—“go live” and after
• if possible, have your old system running for a month or two until you are sure that all the data got migrated over properly.

sample quote from interviews:

check it to make sure that all the fields will be transported correctly, and run tests while the old system is still up to make sure everything is there.

maintenance—library staff
(this assumes a migration being done in-house with little to no vendor support.)
• staff has to have the technical knowledge (linux, sql, and coding).
• often the money saved from moving to open-source is used to pay for additional staff.
• most time is not spent on maintenance but on customization, updates, or problem-solving.
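as one illustration of the kind of coding work this implies, the stage 7 advice about checking that all fields were transported correctly while the old system is still running might be sketched as below. this is only a minimal sketch under stated assumptions: the function name `find_migration_gaps`, the record layout, and the sample data are hypothetical and do not come from evergreen, koha, or any interviewed library; a real check would read exports from both systems.

```python
def find_migration_gaps(old_records, new_records, key="id"):
    """Compare legacy and migrated records.

    Returns (missing_keys, mismatched_fields), where mismatched_fields is a
    list of (key, field_name) pairs whose values differ after migration.
    """
    new_by_key = {record[key]: record for record in new_records}
    missing_keys = []
    mismatched_fields = []
    for old in old_records:
        new = new_by_key.get(old[key])
        if new is None:
            # The record never arrived in the new system.
            missing_keys.append(old[key])
            continue
        for field, value in old.items():
            if new.get(field) != value:
                mismatched_fields.append((old[key], field))
    return missing_keys, mismatched_fields


# Hypothetical sample data standing in for exports from the two systems.
legacy = [
    {"id": 1, "title": "moby dick", "barcode": "0001"},
    {"id": 2, "title": "walden", "barcode": "0002"},
    {"id": 3, "title": "emma", "barcode": "0003"},
]
migrated = [
    {"id": 1, "title": "moby dick", "barcode": "0001"},
    {"id": 2, "title": "walden", "barcode": ""},  # barcode lost in transfer
]

missing, mismatched = find_migration_gaps(legacy, migrated)
print("missing records:", missing)
print("fields that changed:", mismatched)
```

running a comparison like this before the old system is switched off gives staff a concrete list of records and fields to investigate, rather than a general impression that the data "looks right."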
maintenance—vendor
• often start with higher vendor support, which lessens as the staff learns and develops expertise.

discussion and conclusion
interviews with twenty librarians from different settings provided insight into the process of the adoption of open-source ils and were used to develop the guidelines presented in this paper. these guidelines are not intended to serve as a complete guide to the process of adoption but are meant to give interested librarians an overview of the process. these guidelines can help libraries prepare themselves for the research and adoption well before they delve into the process. since these guidelines are all based on the real-life adoption experiences of libraries, they provide insight into the challenges as well as the opportunities in the process. these guidelines can be used to develop an adoption plan and requirements for the adoption process. in future research, we are working to create adoption blueprints and total cost of ownership assessments (with and without vendors) for libraries of different sizes and types. also, as part of this research we have developed an information portal that contains resources that will help librarians in each phase of the process of open-source ils adoption. the information portal along with these guidelines will fill a very important gap in the resources available for open-source ils adoption. the url for the portal is not being provided in this paper to ensure anonymous review.

references
1. marshall breeding, “automation marketplace 2012: agents of change,” library journal 137, no. 6 (april 1, 2012), http://lj.libraryjournal.com/2012/03/industry-news/automation-marketplace-2012-agents-of-change/ (accessed february 18, 2013).
2. tristan müller, “how to choose a free and open-source integrated library system,” oclc systems & services: international digital library perspectives 27, no.
1 (2011): 57–78, http://dx.doi.org/10.1108/10650751111106573 (accessed february 18, 2013).
3. emily g. morton-owens, karen l. hanson, and ian walls, “implementing open-source software for three core library functions: a stage-by-stage comparison,” journal of electronic resources in medical libraries 8, no. 1 (2011): 1–14, http://dx.doi.org/10.1080/15424065.2011.551486 (accessed february 18, 2013).
4. janet l. balas, “how they did it: ils migration case studies,” computers in libraries 31, no. 8 (2011): 37.
5. müller, “how to choose a free and open-source integrated library system.”
6. roy tennant, “technology decision-making: a guide for the perplexed,” library journal 125, no. 7 (2000): 30.
7. xan arch, “ultimate debate 2010: open source software—free beer or free puppy? a report of the lita internet resources & services interest group program, american library association annual conference, washington, dc, june 2010,” technical services quarterly 28, no. 2 (2011): 186–88, http://dx.doi.org/10.1080/07317131.2011.546268 (accessed february 18, 2013).
8. camille espiau-bechetoille, jean bernon, caroline bruley, and sandrine mousin, “an example of inter-university cooperation for implementing koha in libraries: collective approach and institutional needs,” oclc systems & services: international digital library perspectives 27, no. 1 (2011): 40–44, http://dx.doi.org/10.1108/10650751111106546 (accessed february 18, 2013).
9.
gerhard bissels, “implementation of an open-source library management system: experiences with koha 3.0 at the royal london homoeopathic hospital,” electronic library and information systems 42, no. 3 (2008): 303–14, http://dx.doi.org/10.1108/00330330810892703 (accessed february 18, 2013).
10. randy dykhuis, “michigan evergreen: implementing a shared open source integrated library system,” collaborative librarianship 1, no. 2 (2009): 60–65, http://collaborativelibrarianship.org/index.php/jocl/article/view/7/8 (accessed february 18, 2013).
11. karen kohn and eric mccloy, “phased migration to koha: our library’s experience,” journal of web librarianship 4, no. 4 (2010): 427–34, http://dx.doi.org/10.1080/19322909.2010.485944 (accessed february 18, 2013).
12. l. h. lyn dennison and a. f. lewis, “small and open-source: decisions and implementation of an open-source integrated library system in a small private college,” georgia library quarterly 48, no. 2 (2011): 6–8, http://digitalcommons.kennesaw.edu/glq/vol48/iss2/3 (accessed february 18, 2012).
13. linda m. riewe, “survey of open-source integrated library systems,” master’s theses, paper 3481, http://scholarworks.sjsu.edu/etd_theses/3481 (accessed february 18, 2013).
14. karen kohn and eric mccloy, “phased migration to koha: our library’s experience.”
15. randy dykhuis, “michigan evergreen: implementing a shared open source integrated library system.”
16. ian walls, “migrating from innovative interfaces’ millennium to koha: the nyu health sciences libraries’ experiences,” oclc systems & services: international digital library perspectives 27, no. 1 (2011): 51–56, http://dx.doi.org/10.1108/10650751111106564 (accessed february 13, 2013).
17. l. h. lyn dennison and a. f. lewis, “small and open-source: decisions and implementation of an open-source integrated library system in a small private college.”
18. emily g. morton-owens, karen l.
hanson, and ian walls, “implementing open-source software for three core library functions: a stage-by-stage comparison.”
19. lisa genoese and latrina keith, “jumping ship: one health science library’s voyage from a proprietary ils to open source,” journal of electronic resources in medical libraries 8, no. 2 (2011): 126–33, http://dx.doi.org/10.1080/15424065.2011.576605 (accessed february 18, 2013).
20. ian walls, “migrating from innovative interfaces’ millennium to koha: the nyu health sciences libraries’ experiences”; emily g. morton-owens, karen l. hanson, and ian walls, “implementing open-source software for three core library functions: a stage-by-stage comparison.”
21. l. h. lyn dennison and a. f. lewis, “small and open-source: decisions and implementation of an open-source integrated library system in a small private college.”
22. morton-owens, hanson, and walls, “implementing open-source software for three core library functions.”
23. laurel jizba mis, “an essay on our interviews, and a call for participation,” journal of internet cataloging 6, no. 2 (2003): 17–20, doi: 10.1300/j141v06n02_04 (accessed february 18, 2013).
24. golnessa galyani moghaddam and mostafa moballeghi, “how do we measure the use of scientific journals? a note on research methodologies,” scientometrics 76, no. 1 (2008): 125–33, doi: 10.1007/s11192-007-1901-y (accessed february 18, 2013).

appendix a. interview questions

library environment
1.
what is your library type (school, academic, public, special, etc.)?
2. what is your library size (how many employees, population served, and number of materials)?

evaluation (we would like as much info as possible about why the system was chosen over others, including any existing system.)
3. what open-source ils are you using and why did you choose it?
4. when choosing an open-source ils, where did you go for information (vendor/ils pages, community groups, personal contacts, etc)?
5. who was involved in deciding which ils to use?

adoption (we would like to document specific problems or issues that could be used by other libraries to ease their installation.)
6. were there any problems during migration?
7. what do you know now that you wish you had known before migration?
8. how long did migration take? were you on schedule?
9. if getting paid support, how did the vendors (previous and current) help with migration?

implementation (again, specific examples of the things that worked well or didn't work. how can other libraries learn from this experience?)
10. what kind of (and how much) training did your library staff receive?
11. did you do any kind of marketing to your patrons?
12. (if haven’t gotten to this part yet), what are your plans for implementation?
13. how much time did implementation take and were you on schedule?

maintenance (this information will be especially important when compared to the library type and size as a reference for other libraries. we would like to get answers that are as specific as possible).
14. how large is your systems staff? is it sufficient to maintain the system?
15. how much time do you spend each week doing system maintenance? how does this compare to your old system?
16. what resources (or channels) do you use to solve your technical support issues? what roles do paid vendors play in maintenance of your system?
advice for other libraries (these open-ended questions are an opportunity to learn more information that we might not have thought of asking about. responses could provide a valuable resource to other libraries as they plan their implementation).
17. what is the best thing and worst thing about having an open-source ils?
18. are there any lessons or advice that you would like to share with other librarians who are thinking about or migrating to an open-source ils?

acknowledgment
this research was funded by an early career imls grant.

abstract interest in migrating to open-source integrated library systems is continually growing in libraries. along with the interest, lack of empirical research and evidence to compare the process of migration brings a lot of anxiety to the interested librari...

public libraries leading the way
a collaborative approach to newspaper preservation
ana krahmer and laura douglas
information technology and libraries | september 2020 https://doi.org/10.6017/ital.v39i3.12596

ana krahmer (ana.krahmer@unt.edu) oversees the digital newspaper unit at unt. through this work, she manages the texas digital newspaper program collection on the portal to texas history, which is a gateway to historic research materials freely available worldwide. laura douglas (laura.douglas@cityofdenton.com) is the librarian in charge of the special collections at the denton public library, which houses the genealogy, texana, and local denton history collections as well as the denton municipal archives. in her work, she regularly assists patrons with newspaper research questions specifically related to denton newspapers. © 2020.

introduction
when we first proposed this column in january 2020, we had no idea how much the world would change between then and the july deadline.
while we have collaborated for many years on a variety of projects, the value of our collaboration has never proven itself more than in this covid-19 reality: collaboration leverages the strengths and resources of partners to form something stronger than either alone. in this world of covid-19, the collaboration between the denton public library (dpl) and the university of north texas libraries (unt) has allowed us to build open, online access to the first 16 years of the denton record-chronicle (drc). this newspaper is the city’s daily newspaper of record, and the collaboration between dpl and unt resulted in free, worldwide research access via the portal to texas history. the project was funded by a $24,820.00 grant through the imls library services and technology act (lsta), awarded from september 2019 to august 2020 by the texas state library and archives commission (tslac) as part of their textreasures program, to digitize 24,000 newspaper pages. this project has also resulted in a follow-up collaboration to build open access to further years of this daily newspaper title, through a 2021 textreasures award to digitize an additional 24,000 newspaper pages. the real question, though, is what recipe made this a successful collaboration.

background
the drc has been the community newspaper in denton for over 100 years. due to the sheer amount of material, digitizing a daily newspaper with such an extensive publication run is a long-term project that requires a lot of planning, time, and funding. since the dpl’s inception in 1937, the library has endeavored to collect items related to denton and texas history. with community support, the library has developed a well-rounded collection of local history, texana, and genealogical materials, all of which are housed in the special collections research area at the emily fowler central library. these materials support research, projects, and exhibits.
one major research resource is the archival collection of local newspapers, mainly the drc, maintained on 752 rolls of microfilm containing issues from 1908 to 2018. before this project, access to these newspapers was only available in the special collections research area, through microfilm readers or paid subscription services. in addition, although steps had been taken to preserve the film, many of the rolls show wear from years of use, while others have developed vinegar syndrome and soon will no longer be a usable resource. in 2018, unt obtained publisher permission to make the drc run freely accessible on the portal to texas history. laura had been exploring different avenues to digitize this microfilm and make it freely available to the public when ana contacted her with information about the texas state library and archives commission (tslac), which awards annual grants supported by library services & technology act funds, through the institute of museum & library services. lsta funding is annually provided to all fifty states through the institute of museum and library services, and the state library determines how this funding is expended. in texas, lsta funding is provided through a number of grant programs including textreasures, a competitive grant program for any texas library. as described by tslac, the “textreasures grant is designed to help libraries make their special collections more accessible for the people of texas and beyond.
activities considered for possible funding include digitization, microfilming, and cataloging.” libraries can apply to fund the same type of project up to three years in a row, and the drc project applied for $24,820.00 in 2019 to digitize 24,000 newspaper pages, representing the earliest years of microfilm available at the denton public library. to create a viable grant application, dpl partnered with the texas digital newspaper program (tdnp), available through unt’s portal to texas history, and decided to start by digitizing as many early years of microfilm as grant funding could cover. tdnp is the largest single-state, open access, digital newspaper preservation repository in the u.s., hosting just under 8 million newspaper pages at the time of this writing. in late 2018, unt received permission from the owner of the drc to include the newspaper run in the tdnp collection, which represented a very exciting opportunity for city and county researchers, as well as for the dpl. as thanks to the publisher for granting permission, unt built access to the 2014 to 2018 pdf eprint editions, which the tdnp preserves as a service to texas press association member publishers. after this, unt contacted the dpl to discuss applying for grant funding. once laura learned that the dpl had received the 2019 award, she prepared the local planning steps necessary to collaborate with the university.

the project becomes real
the denton record-chronicle digitization project grant contract and resolution for adoption went before the denton city council on october 8, 2019. the city of denton issued a press release that day, and the drc also published an article announcing the project. over the next few days the drc article appeared across social media, including the city of denton’s social media accounts, as well as through library-associated email newsletters.
after the first newspapers became available on the portal, both dpl and unt prepared blog posts about the project, which have also appeared on social media. these blog posts fulfilled publicity requirements specified by the grant, even while offering training to researchers in how to work with the online newspaper collection. one major convenience of this collaboration is that both organizations are in the same city. transfer of materials was arranged by email and accomplished by a trip across town. we completed the digitization process in batches, with the first 10 microfilm rolls going to unt on october 10, 2019, and unt uploading the first 854 issues in december 2019. the newspapers from the first microfilm set represented 1908-1916. dpl transferred the last set of microfilm in april 2020, with dates ranging from 1917 through september 1924, shortly after which unt completed and uploaded the grant-funded count of 24,000 newspaper pages. the grant proposal had estimated that the scans would run through 1938, but the page count on this newspaper proved to be much, much higher than originally estimated, and as a result, the funding only covered up to september 1924. dpl and unt will continue their partnership by digitizing further years of the drc, through a variety of methods. as we were in the midst of preparing this column, tslac contacted laura to inform her that dpl had received a second grant award, in the amount of $24,820.00 to digitize 24,000 additional newspaper pages, which will move the newspapers through 1954. as of july 23, 2020, the denton record-chronicle collection on the portal to texas history hosts 6,168 items and has been used 16,397 times. this includes 1,743 items that are pdf eprint editions of the paper from 2014 to 2018, which unt uploaded for long-term preservation and access.
unt uploads eprint editions without a charge, and digitally preserves these through an agreement with the texas press association; these pdfs were not a part of the funded grant, but they do enhance access to the collection and helped to build community interest in seeing earlier years available on the portal. the usage of the collection skyrocketed after the early editions became available. january 2020 saw the highest usage, with 3,105 collection uses. once this project is complete, it will include over 200,000 newspaper pages. neither dpl nor unt has the ability to tackle this project on its own, but through collaboration, it is possible.

recipe for your own collaboration success
these are planning recommendations as you prepare for your own collaboration, drawn from what we’ve learned as we worked on this project together.
1. communicate early and often: communicating needs enables partners to identify each other’s strengths. each partner will bring their strengths to the project, which in this case included actual archival materials from dpl and technological expertise on the unt side. in addition, be prepared to communicate with local groups who need to endorse or sign off on the project, including possibly the city council, the historical commission, or the city manager.
2. partner to write the grant: partnering in preparing the grant achieves two goals: first, it enables partners to develop a communication flow that will move forward throughout the collaboration; second, it ensures that partners know what each can realistically accomplish within the grant timeline. in this case, laura wrote most of the grant application herself, but she had very specific questions that ana had to answer, and she needed key elements from unt, including project budget, technological infrastructure, and a commitment letter.
communicating early and partnering on the grant application ensured that there were no surprises within the control of either partner.
3. work together to explain your partnership: with a grant of this size, we always spoke in advance to ensure we weren’t over-promising when newspapers would appear online. this also gave both laura and ana lead time for promoting the project: laura would share the years of the physical microfilm before sending them over, and ana would walk laura through the years that would get uploaded in a given month. this allowed them to plan publicity, training, and outreach efforts based on the dates of newspapers going online. in addition, laura regularly communicated with ana prior to submitting grant reports, and this was critical in preventing miscommunication going to the funding agency.
4. pad enough time for the unexpected: of course, we had no way of knowing a pandemic would occur when we began this project, and what saved us was that we’d started planning as soon as we learned about receiving the grant, rather than as soon as the grant started, which was in september 2019. planning two months in advance put us two months ahead of schedule, and we were able to start exchanging materials as soon as the grant period started. this gave us a few weeks of lead time so we successfully completed the project by the end of april 2020, at which point the microfilm page count had been scanned and unt staff could remote in to complete the digitization processes. extra time is only a benefit. even if the covid-19 pandemic had not occurred, we still might have had to address technological or film deterioration problems, and we could have resolved these earlier rather than later because we had given ourselves a few extra weeks of lead time.
5.
don’t be afraid to explain changes to your granting agency: your project may change due to unforeseen circumstances. in our project, for example, the uploaded total reached 24,000 pages before we had digitized the entire planned date range, because unt charges a per-page digitization fee and these newspaper issues proved to contain more pages than expected. laura contacted the representative at tslac to explain the situation and offer an alternative approach to cover the digitization of the remaining years. the important thing is to keep the granting agency informed of any changes, delays, or hiccups in the project.

we are both proud of having completed this project three months before the end of the grant period, but we know that without solid communication, planning, or flexibility, the covid-19 pandemic would have made the situation extremely difficult if not impossible. leveraging the portal’s technical infrastructure and tdnp’s newspaper expertise with the volume of material and collection expertise provided by the dpl has given us a model for success we plan to capitalize on in future projects. best of all, in the world of covid-19, our patrons can access these newspapers from the comfort of their own couches, without even taking off their pajamas!

application level security in a public library: a case study
richard thomchick and tonia san nicolas-rocca
information technology and libraries | december 2018 107

richard thomchick (richardt@vmware.com) is mlis, san josé state university. tonia san nicolas-rocca (tonia.sannicolas-rocca@sjsu.edu) is assistant professor in the school of information at san josé state university.
abstract
libraries have historically made great efforts to ensure the confidentiality of patron personally identifiable information (pii), but the rapid, widespread adoption of information technology and the internet has given rise to new privacy and security challenges. hypertext transfer protocol secure (https) is a form of hypertext transfer protocol (http) that enables secure communication over the public internet and provides a deterministic way to guarantee data confidentiality so that attackers cannot eavesdrop on communications. https has been used to protect sensitive information exchanges, but security exploits such as passive and active attacks have exposed the need to implement https in a more rigorous and pervasive manner. this report is intended to shed light on the state of https implementation in libraries, and to suggest ways in which libraries can evaluate and improve application security so that they can better protect the confidentiality of pii about library patrons.

introduction
patron privacy is fundamental to the practice of librarianship in the united states (u.s.). libraries have historically made great efforts to ensure the confidentiality of personally identifiable information (pii), but the rapid, widespread adoption of information technology and the internet has given rise to new privacy and security challenges. the usa patriot act, the rollback of the federal communications commission rules prohibiting internet service providers from selling customer browsing histories without the customer’s permission, along with electronic surveillance efforts by the national security agency (nsa) and other government agencies, have further intensified privacy concerns about sensitive information that is transmitted over the public internet when patrons interact with electronic library resources through online systems such as an online public access catalog (opac).
1 hypertext transfer protocol secure (https) is a form of hypertext transfer protocol (http) that enables secure communication over the public internet and provides a deterministic way to guarantee data confidentiality so that attackers cannot eavesdrop on communications. https has been used to protect sensitive information exchanges (e.g., e-commerce transactions and user authentication). in practice, however, security exploits such as man-in-the-middle attacks have demonstrated the relative ease with which an attacker can transparently eavesdrop on or hijack http traffic by targeting gaps in https implementation. there is little or no evidence in the literature that libraries are aware of the associated vulnerabilities, threats, or risks, or that researchers have evaluated the use of https in library web applications. this report is intended to shed light on the state of https implementation in libraries, and to suggest ways in which libraries can evaluate and improve application security so that they can better protect the confidentiality of pii about library patrons. the structure of this paper is as follows. first, we review the literature on privacy as it pertains to librarianship and cybersecurity. we then describe the testing and research methods used to evaluate https implementation. next, we discuss the findings. finally, we explain the limitations and suggest future research directions.

https://doi.org/10.6017/ital.v37i4.10405

literature review
the research begins with a survey of the literature on the topic of confidentiality as it pertains to patron privacy; the impact of information technology on libraries; and the use of https as a security control to protect the confidentiality of patron data when it is transmitted over the public internet.
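as a rough illustration of what a gap in https implementation can look like, the sketch below scans a page's markup for links, resources, and form targets that still point at plain http addresses. it is a minimal sketch under stated assumptions, not the testing method used in this study: the class and function names and the sample markup (including the `catalog.example.org` host) are hypothetical, and only python's standard `html.parser` module is used.

```python
from html.parser import HTMLParser


class InsecureLinkFinder(HTMLParser):
    """Collect attribute values that would send a request over plain http."""

    URL_ATTRS = {"href", "src", "action"}

    def __init__(self):
        super().__init__()
        self.insecure = []

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag and attribute names for us.
        for name, value in attrs:
            if name in self.URL_ATTRS and value and value.lower().startswith("http://"):
                self.insecure.append((tag, name, value))


def find_insecure_urls(html):
    """Return (tag, attribute, url) triples for every plain-http reference."""
    finder = InsecureLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.insecure


# Hypothetical markup standing in for a library login page.
sample_page = (
    '<form action="http://catalog.example.org/login">'
    '<img src="https://catalog.example.org/logo.png">'
    '<a href="/search">search</a>'
    "</form>"
)

for tag, attr, url in find_insecure_urls(sample_page):
    print(f"<{tag}> {attr} uses plain http: {url}")
```

a page served over https can still expose patron data if, as in the hypothetical form above, it submits credentials to an http address; a scan like this only flags candidates and does not replace a full evaluation.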
while there is ample literature on the topic of patron privacy, there appears to be a lack of empirical studies that measure the use of https to protect the privacy of data transmitted to and from patrons when they use library web applications.2

the primal importance of patron privacy

patron privacy has long been one of the most important principles of the library profession in the u.s. as early as 1939, the code of ethics for librarians explicitly stated, “it is the librarian’s obligation to treat as confidential any private information obtained through contact with library patrons.”3 the concept of privacy as applied to personal and circulation data in library records began to appear in the library literature not long after the passage of the u.s. privacy act of 1974.4 today, the american library association (ala) regards privacy as “fundamental to the ethics and practice of librarianship,” and has formally adopted a policy regarding the confidentiality of personally identifiable information (pii) about library users, which asserts, “confidentiality exists when a library is in possession of personally identifiable information about users and keeps that information private on their behalf.”5 this policy affirms language from the ala code of ethics, and states that “confidentiality extends to information sought or received and resources consulted, borrowed, acquired or transmitted including database search records, reference questions and interviews, circulation records, interlibrary loan records, information about materials downloaded or placed on ‘hold’ or ‘reserve,’ and other personally identifiable information about uses of library materials, programs, facilities, or services.”6 as libraries adopt new technologies to support information discovery, protecting patron privacy becomes more challenging.7

the impact of information technology on patron privacy

researchers have studied the impact of information technology on patron privacy for several
decades. early research by harter and machovec discussed the data privacy challenges arising from the use of automated systems in the library, and the associated ethical considerations for librarians who create, view, modify, and use patron records.8 fouty addressed issues regarding the privacy of patron data contained in library databases, arguing that online patron records provide more information about individual library users, more quickly, than traditional paper-based files.9 agnew and miller presented a hypothetical case involving the transmission of an obscene email from a library computer, and an ensuing fbi inquiry, as a method of examining privacy issues that arise from patron internet use at the library.10 in addition, merry pointed to the potential for violations of patron privacy brought about by tracking of personal information attached to electronic text supplied by publishers.11

information technology and libraries | december 2018 109

the consensus from the literature, as articulated by fifarek, is that technology has given rise to new privacy challenges, and that the adoption of technology in the library has outpaced efforts to maintain patron privacy.12 this sentiment was echoed and amplified by john berry, former ala president, who commented that there are “deeper issues that arise from the impact of converting information to digitized, online formats” and critiqued the library profession for having “not built protections for such fundamental rights as those to free expression, privacy, and freedom.”13 ala affirmed these findings and validated much of the prevailing research in a report from the library information technology association, which concluded, “user records have also expanded beyond the standard lists of library cardholders and circulation records as libraries begin to use electronic communication methods such as electronic mail for reference services, and as they provide access to computer, web and printing use.”14 in more recent
years, library systems have made increasing use of network communication protocols such as http, and the focus of the literature has shifted towards internet technologies in response to the growth of trends such as cloud computing and web 2.0. mavodza characterizes the relevance of cloud computing as “unavoidable” and expounds on the ways in which software as a service (saas), platform as a service (paas), infrastructure as a service (iaas), and other cloud computing models “bring to the forefront considerations about . . . information security [and] privacy . . . that the librarian has to be knowledgeable about.”15 levy and bérard caution that next-generation library systems and web-based solutions are “a breakthrough but need careful scrutiny” of security, privacy, and related issues such as data provenance (i.e., where the information is physically stored, which can potentially affect security and privacy compliance requirements).16

protecting patron privacy in the “library 2.0” era

“library 2.0” is an approach to librarianship that emphasizes engagement and multidirectional interaction with library patrons. although this model is “broader than just online communication and collaboration” and “encompasses both physical and virtual spaces,” there can be no doubt that “library 2.0 is rooted in the global web 2.0 discussion,” and that libraries have made increasing use of web 2.0 technologies to engage patrons.17 the library 2.0 model disrupts many traditional practices for protecting privacy, such as limited tracking of user activity, short-term data retention policies, and anonymous browsing of physical materials.
instead, as zimmer states, “the norms of web 2.0 promote the open sharing of information—often personal information—and the design of many library 2.0 services capitalize on access to patron information and might require additional tracking, collection, and aggregation of patron activities.”18 as ala cautioned in its study on privacy and confidentiality, “libraries that provide materials over websites controlled by the library must determine the appropriate use of any data describing user activity logged or gathered by the web server software.”19 the dilemma facing libraries in the library 2.0 era, then, is how to appropriately leverage user information while maintaining patron privacy.

many library systems require users to validate their identity through the use of a username, password, pin code, or another unique identifier for access to their library circulation records and other personal information.20 however, several studies suggest the authentication process itself spawns a trail of personally identifiable information about library patrons that must be kept confidential.21 there is discussion in the literature about the value of using https and ssl certificates to protect patron privacy and build a high level of trust with users, and general awareness about the importance of encrypting communications that involve sensitive information, such as “payment for fines and fees via the opac” or when “patrons are required to enter personal details such as addresses, phone numbers, usernames, and/or passwords.”22 however, as breeding observed, many opacs and other library automation software products “don't use ssl by default, even when processing these personalization features.”23 these observations call library privacy practices into question, and are concerning since “hackers have identified library ilss as vulnerable, especially when
libraries do not enforce strict system security protocols.”24 one of the challenges facing libraries is the perception that “a library's basic website and online catalog functions don't need enhanced security.”25 in fact, one of the most common objections to https implementation in libraries has been: “we don’t serve any sensitive information.”26 these beliefs may be based on the historical practice of using https selectively to secure “sensitive” information and operations such as user authentication. but in recent years, it has become clear that selective https implementation is not an adequate defense. the electronic frontier foundation (eff) cautions, “some site operators provide only the login page over https, on the theory that only the user’s password is sensitive. these sites’ users are vulnerable to passive and active attacks.”27

passive attacks do not alter systems or data. during a passive attack, a hacker will attempt to listen in on communications over a network. eavesdropping is an example of a passive attack.28 active attacks alter systems or data. during this type of attack, a hacker will attempt to break into a system to make changes to transmitted or stored data, or introduce data into the system. examples of active attacks include man-in-the-middle, impersonation, and session hijacking.29

http exploits

web servers typically generate unique session token ids for authenticated users and transmit them to the browser, where they are cached in the form of cookies.
session hijacking is a type of attack that “compromises the session token by stealing or predicting a valid session token to gain unauthorized access to the web server,” often by using a network sniffer to capture a valid session id that can be used to gain access to the server.30 session hijacking is not a new problem, but the release of the firesheep attack kit in 2010 increased awareness about the inherent insecurity of http and the need for persistent https.31 in the wake of firesheep’s release and several major security breaches, senator charles schumer, in a letter to yahoo!, twitter, and amazon, characterized http as a “welcome mat for would-be hackers” and urged the technology industry to implement better security as quickly as possible.32 these and other events prompted several major site operators, including google, facebook, paypal, and twitter, to switch from partial to pervasive https. today these sites transmit virtually all web application traffic over https. security researchers from these companies, as well as from several standards and advocacy organizations such as the electronic frontier foundation (eff), the internet engineering task force (ietf), and the open web application security project (owasp), have shared their experiences and recommendations to help other website operators implement https effectively.33 these include encrypting the entire session, avoiding mixed content, configuring cookies correctly, using valid ssl certificates, and enabling http strict transport security (hsts) to enforce https.

testing techniques used to evaluate https implementation

there is little or no evidence in the literature that libraries are aware of the associated vulnerabilities, threats, or risks, or that researchers have evaluated the use of https in library web applications. however, there are many methods that libraries can use to evaluate https and ssl/tls implementation, including automated software tools and heuristic evaluations.
these methods can be combined for deeper analysis.

automated software tools

among the most widely used automated analysis software tools is ssl server test from qualys ssl labs. this online service “performs a deep analysis of the configuration of any ssl web server on the public internet” and provides a visual summary as well as detailed information about authentication (certification and certificate chains) and configuration (protocols, key strength, cipher suites, and protocol details).34 users can optionally post the results to a central “board” that acts as a clearinghouse for identifying “insecure” and “trusted” sites. another popular tool is sslscan, a command-line application that, as the name implies, quickly “queries ssl services, such as https, in order to determine the ciphers that are supported.”35 however, these tools are limited in that they only report specific types of data and do not provide a holistic view of https implementation.

heuristic evaluations

in addition to automated software tools, librarians can also use heuristic evaluations to manually inspect the gray areas of https implementation, either to validate the results of automated software or to examine aspects not included in the functionality of these tools. one example is httpsnow, a service that lets users report and view information about how websites use https. httpsnow enables this activity by providing heuristics that non-technical audiences can use to derive a relatively accurate assessment of https deployment on any particular website or application. the project documentation includes descriptions of, and guidance for identifying, http-related vulnerabilities such as use of http during authenticated user sessions, presence of mixed content (instances in which content on a webpage is transmitted via https while other content elements are transmitted via http), insecure cookie configurations, and use of invalid ssl certificates.
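as an illustration of how one of these heuristics could be automated, the following python sketch scans the html of a page served over https and lists the subresources it requests over plain http. it is a minimal sketch under stated assumptions, not part of httpsnow or of this study: the class name `MixedContentFinder`, the function `find_mixed_content`, and the example urls are hypothetical, and only a small set of tag and attribute pairs is checked.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# (tag, attribute) pairs that make a browser fetch a subresource.
# Illustrative only; a real audit would cover more cases (CSS url(), srcset, ...).
SUBRESOURCE_ATTRS = {
    ("img", "src"),
    ("script", "src"),
    ("iframe", "src"),
    ("link", "href"),
    ("audio", "src"),
    ("video", "src"),
}


class MixedContentFinder(HTMLParser):
    """Collects subresource URLs that would be fetched over plain HTTP."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.insecure = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # Only <link> elements that load a stylesheet are treated as subresources;
        # canonical or alternate links are not fetched as page content.
        rel = (attributes.get("rel") or "").lower()
        if tag == "link" and "stylesheet" not in rel:
            return
        for name, value in attributes.items():
            if value and (tag, name) in SUBRESOURCE_ATTRS:
                # Resolve relative and protocol-relative URLs against the page URL.
                absolute = urljoin(self.page_url, value)
                if urlparse(absolute).scheme == "http":
                    self.insecure.append(absolute)


def find_mixed_content(page_url, html):
    """Return HTTP subresource URLs found in an HTTPS page (empty for HTTP pages)."""
    if urlparse(page_url).scheme != "https":
        return []  # mixed content is only defined for pages served over HTTPS
    finder = MixedContentFinder(page_url)
    finder.feed(html)
    return finder.insecure
```

a result list that is non-empty for a catalog page would correspond to the "presence of mixed content" heuristic; ordinary hyperlinks (`<a href>`) are deliberately ignored because they are not loaded as part of the page.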
research methodology

a combination of heuristic and automated methods was used to evaluate https implementation in a public library web application to determine how many security vulnerabilities exist in the application and to assess the potential privacy risks to the library’s patrons.

research location

this research project was conducted at a public library in the western u.s. that we will call west coast public library (wcpl). this library was established in 1908 and employs ninety staff and approximately forty volunteers. in addition, it has approximately 91,000 cardholders. as part of its operations, wcpl runs a public-facing website and an integrated library system (ils) that includes an opac with personalization for authenticated users.

test

to conduct the test, a valid wcpl library patron account was created and used to authenticate one of the authors for access to account information and personalized features of wcpl’s opac. next, the google chrome web browser was used to visit wcpl’s public-facing website. a valid patron name, library card number, and eight-digit pin were then used to gain access to online account information. several tasks were performed to evaluate https usage. a sample search query for the keyword “recipes” was performed in the opac while logged in. the description pages for two of the resources listed in the search engine results page (one printed resource and one electronic resource) were opened and viewed. the electronic resource was added to the online account’s “book cart” and the book cart page was viewed. during these activities, httpsnow heuristics were applied to individual webpages and to the user session as a whole. the browser’s address bar was inspected to determine whether some or all pages were transmitted via http or https.
the url icon in the browser’s address bar was clicked to view a list of the cookies that the application set in the browser. each cookie was inspected for the text, "send for: encrypted connections only," which indicates that the cookie is marked secure. individual webpages were checked for the presence of mixed (encrypted and unencrypted) content. information about individual ssl certificates was inspected to determine their validity and encryption key length. all domain and subdomain names encountered during these activities were documented. the google chrome web browser was then used to access the qualys ssl server test tool. each domain name encountered was submitted. test results were then examined to determine whether any authentication or configuration flaws exist in wcpl’s web applications.

results and discussion

given the recommendations suggested by several organizations (e.g., eff, ietf, owasp), we evaluated wcpl’s web application to determine how many security vulnerabilities exist in the application, and to assess the potential privacy risks to the library’s patrons. the results of the tests, as discussed below, suggest that wcpl’s web application has a number of vulnerabilities that could potentially be exploited by attackers to compromise the confidentiality of pii about library patrons. this is not surprising given the lack of research on https implementation, as well as the general consensus in the literature that technology adoption has outpaced efforts to maintain patron privacy.

based on the results of these tests, wcpl’s website and ils span several domains. some of these domains appear to be operated by wcpl, while others appear to be part of a hosted environment operated by the ils vendor. based on this information, it is reasonable to conclude that wcpl’s ils utilizes a “hybrid cloud” model. in addition, seemingly random use of https was observed in the opac interface during the testing process. this is discussed in the following sections.
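the cookie inspection described in the test procedure was done by hand in the browser; the same check can be approximated from raw `Set-Cookie` response headers. the python sketch below is a minimal illustration, not the procedure used in this study: the function names are hypothetical, and the two header values are invented examples modeled on the two kinds of session cookie reported in this paper (the exact cookie names `JSESSIONID` and `ASP.NET_SessionId` are assumptions).

```python
def parse_set_cookie(header_value):
    """Return (cookie_name, attribute_names) for one Set-Cookie header value."""
    parts = [part.strip() for part in header_value.split(";")]
    # The first part is always "name=value"; the remaining parts are attributes.
    name = parts[0].split("=", 1)[0].strip()
    attributes = {
        part.split("=", 1)[0].strip().lower() for part in parts[1:] if part
    }
    return name, attributes


def audit_cookies(set_cookie_headers):
    """Map each cookie name to True if it carries the Secure attribute."""
    report = {}
    for header_value in set_cookie_headers:
        name, attributes = parse_set_cookie(header_value)
        report[name] = "secure" in attributes
    return report


# Hypothetical header values, for illustration only.
example_headers = [
    "JSESSIONID=abc123; Path=/; Secure; HttpOnly",
    "ASP.NET_SessionId=xyz789; path=/; HttpOnly",
]

for cookie_name, is_secure in audit_cookies(example_headers).items():
    status = "secure connections only" if is_secure else "any connection"
    print(f"{cookie_name}: sent for {status}")
```

a cookie reported as "any connection" can be transmitted over plain http, which is the condition the browser-based inspection was looking for.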
use of http during authenticated user sessions

library patrons use wcpl’s website and opac to access and search for books and other material available through the library. the test results show that wcpl does not use https pervasively across its entire web application. during the test, we found that wcpl’s website is transmitted via http by default. even when we manually entered the url with an “https” prefix, the site redirected to the unencrypted “http” page. we continued to test wcpl’s website and opac by performing a query using the search bar located on the patron account page. we found that wcpl’s opac transmits some pages over http and others over https. for example, when a search query is performed in the search bar located on the patron account page, the search engine results page is sometimes served over https, and sometimes over http (see figure 1). this behavior is not limited to specific pages; rather, it appears to be random. this security flaw leaves library patrons vulnerable to passive and active attacks that exploit gaps in https implementation, allowing an attacker to eavesdrop on and hijack a user session and thereby gain access to private information.

figure 1. results of the library’s use of https.

presence of mixed content

when a library patron visits a webpage served over https, the connection with the web server is encrypted and therefore safeguarded from attack. if an https webpage includes content retrieved via http, the webpage is only partially encrypted, leaving the unencrypted content vulnerable to attackers. analysis of wcpl’s website did not reveal any explicit use of mixed content on the public-facing portion of the site. test results, however, detected unencrypted content sources on some pages of the library’s online catalog.
this, unfortunately, puts patron privacy at risk, as attackers can intercept the http resources when an https webpage loads content such as an image, iframe, or font over http. this compromises the security of what is perceived to be a secure site by enabling an attacker to exploit an insecure css file or javascript function, leading to disclosure of sensitive data, malicious website redirect, man-in-the-middle attacks, phishing, and other active attacks.36

insecure cookie management

cookies are small text files sent from a web server and stored on user computers via web browsers. cookies can be divided into two categories: session and persistent. persistent cookies are stored on the user’s hard drive until they are erased or expire. unlike persistent cookies, session cookies are stored in memory and erased once the user closes their browser. provided that computer settings allow for it, cookies are created when a user visits a website. cookies can be configured so that they are sent only over encrypted connections. they can be used to remember login credentials and information previously entered into forms, such as name, mailing address, and email address. cookies can also be used to monitor the number of times a user visits a website, the pages a user visits, and the amount of time spent on a webpage.

the results of the tests suggest that wcpl’s cookie policies are inconsistent. we found two types of cookies present. within one domain, the web application uses a jsession cookie that is configured to send for “secure connections only.” this indicates that the session id cookie is sent only over encrypted connections. another domain uses an asp.net session id that is configured to send for any connection, which means the session id could be transmitted in an unencrypted format.
cookies transmitted in an unencrypted format could be intercepted by an attacker in order to eavesdrop on or hijack user sessions. this leaves user privacy vulnerable given the type of information contained within cookies.

flawed encryption protocol support

transport layer security (tls) is a protocol designed to provide secure communication over the web. websites using tls, therefore, provide a secure communication path between their web servers and web browsers, preventing eavesdropping, hijacking, and other active attacks. this study employed the ssl server test from qualys ssl labs to perform an analysis of wcpl’s web applications. results of the qualys test (see figure 2) indicate that the site does not support tls 1.2, which means the server may be vulnerable to passive and active attacks, thereby providing hackers with access to data passed between a web server and web browser accessing the server. in addition, the application’s server platform supports ssl 2.0, which is insecure because it is subject to a number of passive and active attacks leading to loss of confidentiality, privacy, and integrity.

figure 2. qualys scanning service results.

the vulnerabilities discovered during the testing process may be a result of uncoordinated security. this is concerning because it is a by-product of the cloud computing approach used to operate wcpl’s ils. while libraries may have acclimated to the challenge of coordinating security measures across a distributed application, they now face the added complexity of coordinating security measures with their vendors, who themselves may also utilize additional cloud-based offerings from third parties.
as cloud technology adoption increases and cloud-based infrastructures become more complex and distributed, attackers will likely attempt to find and exploit systems with inconsistent or uneven security measures, and libraries will need to work closely with information technology vendors to ensure tight coordination of security measures.

unencrypted communication using http affects the privacy, security, and integrity of patron data. passive attacks such as eavesdropping, and active attacks such as hijacking, man-in-the-middle, and phishing can reveal patron login credentials, search history, identity, and other sensitive information that, according to ala, should be kept private and confidential. given the results of the testing done in this study, it is clear that wcpl needs to revisit and strengthen its web application security measures by following the recommendations of organizations within the security community: using https pervasively across the entire web application, avoiding mixed content, configuring cookies to be sent only over encrypted connections, using valid ssl certificates, and enabling hsts to enforce https. implementing these improvements will mitigate attacks by strengthening the integrity of wcpl’s web applications, which, in turn, will help protect the privacy and confidentiality of library patrons.

limitations and future research

this research was performed at a single public library in the western u.s. future research is therefore needed to study the implementation of https to protect patron privacy at other public libraries, at libraries in other parts of the u.s., and in other countries. it would also be valuable to conduct similar research at libraries of different types, including academic, law, medical, and other types of special libraries. ssl server test from qualys ssl labs and httpsnow were used to evaluate the use of https at wcpl. the use of other evaluation techniques may generate different results.
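as one example of a complementary technique, the protocol-support finding reported above could be spot-checked with python's standard `ssl` module by attempting a handshake pinned to a single protocol version. this is a hedged sketch, not the method used in this study: the function names and any hostname passed to them are hypothetical. it assumes a current python and openssl build, which no longer implements ssl 2.0 or ssl 3.0, so only tls 1.2 and tls 1.3 are probed here; a scanner such as the qualys ssl server test or sslscan is still needed to detect legacy protocols.

```python
import socket
import ssl

# Protocol versions this sketch probes. SSL 2.0/3.0 cannot be negotiated by
# current Python/OpenSSL builds, so they are out of scope here.
VERSIONS = {
    "TLS 1.2": ssl.TLSVersion.TLSv1_2,
    "TLS 1.3": ssl.TLSVersion.TLSv1_3,
}


def supports_version(host, version, port=443, timeout=5.0):
    """Return True if a handshake pinned to exactly `version` succeeds."""
    context = ssl.create_default_context()
    # Certificate validity is a separate check; only the protocol matters here.
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    try:
        context.minimum_version = version
        context.maximum_version = version
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return True
    except (OSError, ValueError):
        # ssl.SSLError is a subclass of OSError; ValueError covers a local
        # OpenSSL build that cannot be restricted to the requested version.
        return False


def probe_host(host):
    """Map each protocol label to whether the server accepted it."""
    return {label: supports_version(host, v) for label, v in VERSIONS.items()}


def flag_protocol_issues(supported):
    """Turn a {label: bool} probe result into a list of findings."""
    issues = []
    if not any(supported.values()):
        issues.append("no TLS 1.2 or TLS 1.3 handshake succeeded")
    elif not supported.get("TLS 1.2", False):
        issues.append("TLS 1.2 not supported")
    return issues
```

for a given catalog domain, `flag_protocol_issues(probe_host(domain))` would return an empty list when tls 1.2 is accepted. a failed handshake can also reflect a network problem rather than missing protocol support, so results from a sketch like this should be confirmed with a dedicated scanner.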
a major limitation of this study is that it evaluated https implementation at a single public library. a next phase of research should further investigate the policies in place to safeguard patron privacy. these include security education, training, and awareness programs, as well as access controls. furthermore, library 2.0 and cloud computing are fundamental to libraries, but create risks that could impact the ability to keep patron pii safeguarded. as such, future research should evaluate the impact library 2.0 and cloud computing applications have on maintaining the confidentiality of patron information.

conclusion

the library profession has long been a staunch defender of privacy rights, and the literature reviewed indicates strong awareness and concern about the rapid pace of information technology and its impact on the confidentiality of personally identifiable information about library patrons. much work has been done to educate librarians and patrons about the risks facing them and the measures they can take to protect themselves. however, the research and experimentation presented in this report strongly suggest that there is a need for wcpl and other libraries to reassess and strengthen their https implementations. https is not a panacea for mitigating web application risks, but it can help libraries give patrons the assurance of knowing they take security and privacy seriously, and that reasonable steps are being taken to protect them.
finally, this report concludes that further research on library application security should be conducted to assess the overall state of application security in public, academic, and special libraries, with the long-term objective of enabling ala and other professional institutions to develop policies and best practices to guide the secure adoption of library 2.0 and cloud computing technologies within a socially connected world.

references

1 jon brodkin, “president trump delivers final blow to web browsing privacy rules,” ars technica (april 3, 2017), https://arstechnica.com/tech-policy/2017/04/trumps-signature-makes-it-official-isp-privacy-rules-are-dead/.
2 shayna pekala, “privacy and user experience in 21st century library discovery,” information technology and libraries 36, no. 2 (2017): 48–58, https://doi.org/10.6017/ital.v36i2.9817.
3 american library association, “history of the code of ethics: 1939 code of ethics for librarians,” accessed may 11, 2018, http://www.ala.org/template.cfm?section=history1&template=/contentmanagement/contentdisplay.cfm&contentid=8875.
4 joyce crooks, “civil liberties, libraries, and computers,” library journal 101, no. 3 (1976): 482–87; stephen harter and charles c. busha, “libraries and privacy legislation,” library journal 101, no. 3 (1976): 475–81; kathleen g. fouty, “online patron records and privacy: service vs. security,” journal of academic librarianship 19, no. 5 (1993): 289–93, https://doi.org/10.1016/0099-1333(93)90024-y.
5 “code of ethics of the american library association,” american library association, amended january 22, 2008, http://www.ala.org/advocacy/proethics/codeofethics/codeethics; “privacy: an interpretation of the library bill of rights,” american library association, amended july 1, 2014, http://www.ala.org/advocacy/intfreedom/librarybill/interpretations/privacy.
6 american library association, “privacy: an interpretation of the library bill of rights,” amended july 1, 2014, http://www.ala.org/advocacy/intfreedom/librarybill/interpretations/privacy.
7 pekala, “privacy and user,” pp. 48–58.
8 harter and busha, “libraries and privacy legislation,” pp. 475–81; george s. machovec, “data security and privacy in the age of automated library systems,” information intelligence, online libraries, and microcomputers 6, no. 1 (1988).
9 fouty, “online patron records and privacy,” pp. 289–93.
10 grace j. agnew and rex miller, “how do you manage?,” library journal 121, no. 2 (1996): 54.
11 lois k. merry, “hey, look who took this out!—privacy in the electronic library,” journal of interlibrary loan, document delivery & information supply 6, no. 4 (1996): 35–44, https://doi.org/10.1300/j110v06n04_04.
12 aimee fifarek, “technology and privacy in the academic library,” online information review 26, no. 6 (2002): 366–74, https://doi.org/10.1108/14684520210452691.
13 john n. berry iii, “digital democracy: not yet!,” library journal 125, no. 1 (2000): 6.
14 american library association, “appendix—privacy and confidentiality in the electronic environment,” september 28, 2006, http://www.ala.org/lita/involve/taskforces/dissolved/privacy/appendix.
15 judith mavodza, “the impact of cloud computing on the future of academic library practices and services,” new library world 114, no. 3/4 (2012): 132–41, https://doi.org/10.1108/03074801311304041.
16 richard levy, “library in the cloud with diamonds: a critical evaluation of the future of library management systems,” library hi tech news 30, no. 3 (2013): 9–13, https://doi.org/10.1108/lhtn-11-2012-0071; raymond bérard, “next generation library systems: new opportunities and threats,” bibliothek, forschung und praxis 37, no. 1 (2013): 52–58, https://doi.org/10.1515/bfp-2013-0008.
17 michael stephens, “the hyperlinked library: a ttw white paper,” accessed may 13, 2018, http://tametheweb.com/2011/02/21/hyperlinkedlibrary2011/; michael zimmer, “patron privacy in the ‘2.0’ era,” journal of information ethics 22, no. 1 (2013): 44–59, https://doi.org/10.3172/jie.22.1.44.
18 zimmer, “patron privacy in the ‘2.0’ era,” p. 44.
19 “the american library association’s task force on privacy and confidentiality in the electronic environment,” american library association, final report july 7, 2000, http://www.ala.org/lita/about/taskforces/dissolved/privacy.
20 library information technology association (lita), accessed may 11, 2018, http://www.ala.org/lita/.
21 library information technology association (lita), accessed may 11, 2018, http://www.ala.org/lita/; pam dixon, “ethical issues implicit in library authentication and access management: risks and best practices,” journal of library administration 47, no. 3 (2008): 141–62, https://doi.org/10.1080/01930820802186480; eric p. delozier, “anonymity and authenticity in the cloud: issues and applications,” oclc systems and services: international digital library perspectives 29, no. 2 (2012): 65–77, https://doi.org/10.1108/10650751311319278.
22 marshall breeding, “building trust through secure web sites,” computers in libraries 25, no. 6 (2006), p. 24.
23 breeding, “building trust,” p. 25.
24 barbara swatt engstrom et al., “evaluating patron privacy on your ils: how to protect the confidentiality of your patron information,” aall spectrum 10, no. 6 (2006): 4–19.
25 breeding, “building trust,” p. 26.
26 tj lamana, “the state of https in libraries,” intellectual freedom blog, the office for intellectual freedom of the american library association (2017), https://www.oif.ala.org/oif/?p=11883.
27 chris palmer and yan zhu, “how to deploy https correctly,” electronic frontier foundation, updated february 9, 2017, https://www.eff.org/https-everywhere/deploying-https.
28 computer security resource center, “glossary,” national institute of standards and technology, accessed may 12, 2018, https://csrc.nist.gov/glossary/?term=491#alphaindexdiv.
29 computer security resource center, “glossary,” national institute of standards and technology, accessed may 12, 2018, https://csrc.nist.gov/glossary/?term=2817.
30 open web application security project, “session hijacking attack,” last modified august 14, 2014, https://www.owasp.org/index.php/session_hijacking_attack; open web application security project, “session management cheat sheet,” last modified september 11, 2017, https://www.owasp.org/index.php/session_management_cheat_sheet.
31 eric butler, “firesheep,” (2010), http://codebutler.com/firesheep/; audrey watters, “zuckerberg's page hacked, now facebook to offer ‘always on’ https,” accessed may 16, 2018, https://readwrite.com/2011/01/26/zuckerbergs_facebook_page_hacked_and_now_facebook/.
32 info security magazine, “senator schumer: current internet security ‘welcome mat for wouldbe hackers,’” (march 2, 2011), http://www.infosecurity-magazine.com/view/16328/senator-%20schumer-current-internet-%20security-welcome-mat-for-wouldbe-hackers/.
33 palmer and zhu, “how to deploy https correctly”; internet engineering task force, “recommendations for secure use of transport layer security (tls) and datagram transport layer security (dtls),” (may, 2015), https://tools.ietf.org/html/bcp195; open web application security project, “session management cheat sheet,” last modified september 11, 2017, https://www.owasp.org/index.php/session_management_cheat_sheet.
34 qualys ssl labs, “ssl/tls deployment best practices,” accessed may 18, 2018, https://www.ssllabs.com/projects/best-practices/.
35 sourceforge, “sslscan - fast ssl scanner,” last updated april 24, 2013, http://sourceforge.net/projects/sslscan/.
36 palmer and zhu, “how to deploy https correctly.”

core leadership column
making room for change through rest
margaret heller
information technology and libraries | june 2021 https://doi.org/10.6017/ital.v40i2.13513

i write this column from the vantage point of my current role as a member of the core technology section leadership team, and as a newly elected president-elect of core, with my term starting in july 2021.
the planning for core began years ago, but core became a real division of ala in the most chaotic of times. visions for the first year of core were set aside as we had to face the reality of all the work needing to be done remotely, without any conferences that would allow for in-person conversations, and with all the leadership and members under personal and professional strain. yet being forced to start up slowly and deliberately provides some advantages. settling into this new situation has allowed staff, leaders, and members to acclimate to a new division and learn how we want to do things in the future, rather than relying too much on how we did things in the past or feeling pressure to meet every demand. right now, we are all at a juncture in our personal and professional lives, and thinking about how to approach the coming months. summer offers the promise of growth and reinvention. the pause that a break implies allows us time, both as individuals to attend to what is important to us and as members or employees of institutions to reconsider our priorities. for people working in library technology, however, the “summer break” is often anything but. public libraries become a hub for activity as schools are closed, and school and academic libraries may use slow periods when classes are not in session for necessary systems upgrades or to roll out a new service. the summer of 2020 was one of the most challenging of my life, both professionally and personally, and meeting all the demands of the moment left hardly any time for a true break. this year, just like last year, feels like a summer in which we might not let ourselves rest for a moment. while many libraries have been open to some degree over the past year, the upcoming summer has the potential for a return to something like normal.
margaret heller (mheller1@luc.edu) is digital services librarian, loyola university chicago, and (as of july 1, 2021) president-elect of the core: leadership, infrastructure, futures division of ala. © 2021.

shutting down regular in-person services and buildings felt chaotic because it required new ways of providing those services and building up new technical infrastructure, without our having been able to plan for it in advance as we would a normal summer project. the return may also feel chaotic, but rather than approaching it as a series of tasks in a plan that requires lots of energy and work, i hope we can treat the time as a period of reflective practice and give ourselves time to understand what has changed. adapting to the realities of life since spring of 2020 has changed us all in various ways, and so too our library users have new needs and expectations. in some cases, they have embraced new services, though this has not been a smooth process for everyone. i have a family member who started using an e-reader for the first time during the pandemic to access library e-books when her public library was closed or had limited services. she was grateful for the option to access books this way, but occasionally struggled to follow the complex workflow from library app to vendor site to device. without the ability to visit a physical reference desk to ask for help, she asked me to assist with device troubleshooting on several occasions. that worked well for her, but not everyone has a digital services librarian in their quarantine bubble. i share this to illustrate that while some people will have adapted or gotten the help they need, for many, this time has been one of doing without or maladaptation. going back to “normal” will not help those who will need even more than they did pre-pandemic.
taking time to understand that fact, and to accept that it will not be a quick process of return for many people, will allow us to give each other space to find a way back to our lives as library users and library employees. while many of us feel uncomfortable when we see slow progress (i know i do), i am coming to realize the value of making space for slowness and for rest. rest comes in all forms. it could be physical rest, but it could be pursuing an artistic or athletic hobby, intentional social interactions, or spiritual practices. institutions might give extra time off or set healthy expectations for work hours and meeting-free days, while also discarding old practices and attitudes to create better future work environments. there are crises to which we must immediately react and respond, but without personal and institutional energy in reserve, we will not do as good a job when they occur. crises include political upheaval, public health emergencies, and other major events, but we can also appreciate how they unfold on a more mundane level. information technology work often requires odd hours, intense bursts of energy to complete projects in a small window of time, and unpredictable problems that require dropping everything else to address an emergency. it is natural to constantly look towards the most urgent and the newest problem. this tendency results in lengthy backlogs for requests and accumulates technical debt from deferred maintenance or refactoring. yet as we bring our libraries and other institutions out of pandemic mode over the next few years, allowing for reflective space can help us to be cautious about the choices we make. for example, during earlier stages of the pandemic, many of us probably had to set up systems for some type of surveillance to maintain social distancing and aid in contact tracing.
taking some time to review all those new procedures and systems, and purposefully dismantle those with negative privacy implications, will help us to go forward as more ethical and empathetic institutions. taking it slow is going to be the only way through the next period. summer 2021 should be about reflection on collective trauma. we responded to the events of the past year, whether that meant closing libraries, keeping libraries open as safely as possible, racial justice work, or election support, and now we must consider how to incorporate what we started into lasting change. to do that reflection will require rest. we know how important rest is, but finding space for it is not usually a high priority. rest allows us to integrate our experiences, and will build us back to make sure we can keep responding to what comes next. i am challenging myself to spend time in deliberate reflection at the cost of mindless productivity over the coming months so that i can keep helping my library and core succeed. i hope you will consider doing the same.

highlights of isad board meeting
1974 annual meeting
new york, new york

monday, july 8, 1974

the meeting was called to order by president frederick kilgour at 4:45 p.m. the following were present: board: frederick g. kilgour, lawrence w. s. auld, paul j. fasana, susan k. martin, ralph m. shoffner, donald p. hammer (isad executive secretary), and berniece coulter, secretary, isad. guests: henriette d. avram, roberto esteves, stephen salmon, merry sue smoller, and ruth l. tighe.

additions to the agenda. mrs. martin requested that the matter of commercial brochures being included in isad mailings be added to the agenda.

midwinter minutes approved. motion. it was moved by paul fasana that the minutes of the isad 1974 chicago midwinter meeting be approved. seconded by ralph shoffner. carried.

introduction of new officers. mr.
kilgour introduced to the board henriette avram, vice-president/president-elect, and ruth tighe, member-at-large of the isad board of directors, who would assume office at the close of the new york conference.

policy concerning materials used in isad dissemination or displays. motion. it was moved by susan martin that the isad board establish a policy that only material produced by ala units or related professional organizations be included in its disseminations or displays. seconded by paul fasana. carried.

video/cable section. mr. roberto esteves, chairman of the ala video/cable ad hoc study committee, solicited the interest of and activity by the isad board in getting video/cable incorporated into the isad structure. he reported that his committee had considered three alternatives as to where video/cable concerns could be situated within ala: (1) it could remain as a task force in srrt; (2) a separate round table on video/cable could be formed; or (3) a section on video/cable could be formed in isad. the committee had favored an isad section. a round table might have more appeal to members, but would be outside the ala divisional and political structure. he further made known the desire of the committee for coordination of the forty-nine existing groups involved with audiovisual in ala. he noted that it would be possible to create a committee on av within the isad section on video/cable if there were interest in that approach to solving the problem.

motion. mr. fasana moved that the isad board endorse in principle the ala video/cable ad hoc study committee's suggestion to create within isad a section devoted to video/cable. seconded by susan martin. carried.

misleading claims. mr. salmon indicated that some advertising over the last few years had appeared to be misleading, and that in some cases librarians' and libraries' names had been incorporated into advertising literature without their knowledge. two rtsd committees touch upon these problems as they relate to technical processing products and services: the bookdealer-library relations committee and the micropublishing projects committee. with adequate care, mr. salmon suggested, such a committee could be used by isad to ensure that its members are adequately informed. isad board members indicated an interest in and a need for a service of this nature, but reflected a hesitancy regarding the sensitivity of the issue. mr. kilgour asked that the matter be deferred until the wednesday board meeting, after the function statements of the two rtsd committees had been distributed to the board.

isad historian. mr. hammer reviewed the board's earlier decision to eliminate the history committee and appoint a historian if ala were going to publish a history of the association for the 1976 centennial celebration. as it has since been determined that ala will not produce such a publication, isad has no need to appoint a historian.

isa (information science abstracts). mr. kilgour felt that there were two obvious avenues open to the board at this time: (1) to pursue the evaluation of isa, or (2) to drop it altogether. mr. fasana said he believed that if ala were a sponsor, isa would abstract more library literature. because chemical information journals are among their sponsors, they cover chemical literature heavily. he felt that isad should look seriously into isa sponsorship, as there is nothing comparable to it in the united states. the isa board is interested, he said, because it would increase their subscriptions and also increase their scope if they obtained subscriptions from ala members. mr. hammer explained that isad at one time had attempted to organize a subscription campaign, but the response was poor. mr. kilgour said that his reaction had been in favor of sponsorship, but that had been before the evaluation, when it was his belief that isa was a current awareness service. mr. fasana recommended that the executive secretary write isa a letter informing them that the board cannot consider becoming a sponsor at this time.

asidic. mr. hammer informed the board that peter watson had talked with him about asidic liaison, and they had concluded that asidic is primarily interested in having an observer at isad board meetings. to accomplish this requires no action from the board.

journal of library automation vol. 7/3 september 1974

wednesday, july 10, 1974

the meeting was called to order at 4:40 p.m. by president frederick kilgour. those present were: board: frederick g. kilgour, lawrence w. s. auld, paul j. fasana, susan k. martin, ralph m. shoffner, donald p. hammer (isad executive secretary), and berniece coulter, secretary, isad. committee chairmen: brian aveney, brett butler, helen schmierer, velma veneziano. guests: henriette d. avram, gerald lazorick, ruth l. tighe.

sdi service for ala members. mr. lazorick (ohio state university mechanized information center) discussed the advantages to ala members if the osu selective dissemination of information service were available to them by subscription. the center would charge $50 per year for a profile, as opposed to the standard $300. the contract for sdi and retrospective searches (two services) would require ala to guarantee $17,000 per year ($10,000 for sdi and $7,000 for retrospective searches). also, mr. fasana estimated that advertising and publicity costs might be as much as $5,000. mr. lazorick further explained that the printing of the necessary materials and the mailing would be handled by the center. ala would be responsible for advertising, marketing, and billing. it was relayed to the board that mr. wedgeworth did not feel that ala would profit enough for the amount it would have to pay for the service. he felt that osu could provide the service directly to individuals without the intervention of ala. mr. kilgour said that he felt a need to know that the money paid would indeed return to ala. the board had in the past expressed an interest in this type of service for ala members, and mr. kilgour asked if this feeling still existed. there was agreement among the board that it would be a desirable service for ala members. mr. kilgour stated that it will be necessary to: (1) determine the actual costs; (2) find the least expensive way of informing the members of this opportunity; and (3) obtain a commitment from the membership. he further said he would talk with mr. wedgeworth to see if agreement could be reached after additional points were discussed. we could then determine the answers to the three questions mentioned above.

bylaws and organization committee. the newly appointed chairman of the isad bylaws and organization committee, helen schmierer, explained that she had found two versions of the isad bylaws extant, and there was a question as to which was current. minutes of the division did not reveal that any actual vote by the membership concerning various changes in the bylaws ever took place. she suggested that her committee use the original (1968) version of the bylaws as the basis on which to present all subsequent changes to the membership for a vote. she told the board she would have a new version ready by midwinter, and that it could then be published in jola and voted on at the san francisco annual conference. in answer to a question from ms. schmierer, mr. kilgour explained that it was the intent of the board that the bylaws and organization committees be combined, and the resulting committee should provide guidelines for each new committee established subsequently within isad. he also stated that it was necessary that a change be made in the present bylaws so that if a president did not complete a term, there would be a special election in order to elect another vice-president to take over the following year.
for other charges to the bylaws and organization committee, mr. kilgour referred ms. schmierer to the minutes of the 1974 midwinter meeting.

telecommunications committee report. mr. kilgour announced that david waite had resigned as chairman of the telecommunications committee and that he had appointed philip long as new chairman. mr. long presented a report of the committee (exhibit 1). he said that the areas of interest of the committee were networking, protocol, and standards. the following resolution was passed at their meeting: "that ala, via isad, join the committee of corporate telephone users (cctu) and thus support the effort to combat the at&t attempt to adversely modify the current wats tariff; should it not be legally or financially feasible for isad/ala to join cctu the committee will nonetheless attempt to follow and report on this and related regulatory items." mr. kilgour called for a motion recommending that ala become a member of the committee of corporate telephone users, an organization to combat the current revision of the wats tariff, providing money is available and no legal problems are connected with ala's so doing. however, several members of the board wanted further information as to what would be the position of ala with regard to the organization and in what sense would that position be an advantage to the members of ala. copies of the document produced by the cctu were also requested. mr. long said that he would contact a member of that committee in new york and get copies to the board members. mr. long requested that his committee be enlarged. mr. kilgour told him to appoint as many members as he needed.

program planning committee report. chairman brett butler reported on the new orleans institute on networking which he felt was very successful both topically and financially.
he said smaller libraries are beginning to consider automation, and therefore are sending staff to these institutes. he reported that $9,300 was received from registration fees, and expenses were approximately $6,100. in addition, $1,800 in expenses were paid by slice. mr. hammer will send a report to the board. mr. butler told of the committee's meeting in may in chicago. the minutes of that meeting, written by mr. hammer, had been approved by the committee and could be distributed to the board. he further related that the program at the new york annual conference had gone well, with approximately 400 in attendance. there were no plans for publication of the proceedings of the program, although it had been taped for sale by ala. mr. butler said the program planning committee desired liaison with each isad operating committee. they had appointed someone to tesla and hoped to do likewise with the telecommunications committee. at the suggestion of ms. avram, a serials institute has been planned for atlanta in october, preceding the asis meeting. josh smith (asis) and mr. hammer are the coordinators. mr. butler also announced that another institute on networking would be held in the spring in new orleans. with more advance publicity he felt there would be a greater response than to the institute of march 1974, which had an attendance of over 125. the 1975 institute will be a basic tutorial; james rizzolo is responsible for the content. plans for a series of cooperative programs with asis were laid out by the committee. this had been discussed with josh smith and had received his approval. mr. butler said he would prepare a statement which would describe the fiscal organization to be sent to the board for a mail vote. mr. kilgour expressed his opinion that with the new dues structure, the board must look at the financial gain involved in the institutes. in fact, any money-making venture must be considered at this time due to the dues structure change.
plans for the cable tv preconference at san francisco (1975) were dropped. a program for san francisco would center around reactions to the document produced by the national commission on libraries and information science, the final draft of which is to be published in january 1975. this program is to be analytical in nature. mr. butler explained that there is possible cosponsorship interest. also at the san francisco annual conference, the office of intellectual freedom will cosponsor with isad a panel on various aspects of privacy and data file security. mr. butler announced that fifteen deans of library schools had attended that morning's meeting of the committee. there is interest in cosponsorship of continuing education programs, but nothing has been made definite at this point. the committee will explore this further.

committee on representation in machine readable form of bibliographic information (marbi) report (exhibit 2). velma veneziano requested that mr. fasana report to isad, as he had prepared a summary of the meeting for the rtsd board of directors. she asked mr. kilgour if the board would approve her writing the canadian library association to grant permission to send an official observer to marbi, as requested. mr. kilgour suggested that the letter definitely state that this representative would be a nonvoting participant.

cola (exhibit 3).

thursday, july 11, 1974

the meeting was called to order by president frederick kilgour at 4:30 p.m. those present were: board: frederick g. kilgour, lawrence w. s. auld, paul j. fasana, susan k. martin, ralph m. shoffner, donald p. hammer (isad executive secretary), and berniece coulter, secretary, isad. guests: henriette d. avram, william summers.

report of jola editor. copies of the jola annual report were distributed to the board (exhibit 4). ms.
martin requested board reaction to changes suggested by the isad editorial board: (1) incorporate the issn on the cover of the journal, and drop the coden; (2) change the color of the cover of jola for each volume, beginning with the march 1975 issue; and (3) consider changing the title of the journal, in the light of possible incorporation of information technologies into isad, to the journal of library technology (jolt). the consensus of the board was that: (1) coden should remain on the cover; (2) a change in cover stock was quite appropriate; and (3) jola is a long-established title, and should remain.

committee reports. mr. kilgour suggested that committee reports to the board be discontinued to save time and that written reports be submitted in the future. motion. it was moved by ralph shoffner that all isad committee reports be submitted to the board in writing and that the chairman appear before the board only if the committee desired some board action, and that the board has previously received this request in writing. seconded by larry auld. carried. mr. kilgour suggested that committee appointments be sent to the board by carbon copies of letters rather than reported as an agenda item.

representative to ansi x-4. motion. it was moved by ralph shoffner that mr. hammer explore and obtain, if possible, ala representation to ansi x-4 committee and that the board conditionally appoint arthur brody to be that representative. seconded by larry auld. carried.

committee on technical standards for library automation (tesla) report (exhibit 5). motion. it was moved by paul fasana to turn over to helen schmierer, chairman of the bylaws and organization committee, the matter of a revised charge to tesla. seconded by susan martin. carried.

membership survey committee report. the final report of the membership survey ad hoc committee was distributed to the board members.
this completed the work of the committee and the committee was therefore disbanded. mr. william summers, a member of the committee, appeared before the board. he stated that they had the computer capability to run any data correlations desired by the board. mr. kilgour asked the board members to request any correlations they would want from don hammer by october 15. he will forward them to ms. pope by mid-november, and she will have the correlations ready by the midwinter meeting in 1975. the board noted that the survey showed that 25 percent of isad members are library directors, and that the most frequent age is over fifty. the number of people belonging to isad who have no contact with library automation was surprising to some. a significant number of isad members responded to the questionnaire.

misleading claims. it was the sense of the board that the establishment of a committee in isad to investigate misleading claims be referred to the bylaws and organization committee. the chairman is to contact william north, the ala attorney, concerning legal implications, and also steve salmon, who had shown interest in these problems, should be approached concerning the chairmanship.

general discussion. most of the discussion centered around the new dues structure of the association. there was a question of how funds would be distributed to the divisions from institutional membership dues. ms. martin said that she would send the board an analysis of the expenditures and income for jola. the need for cash capital should be considered for the continuing publication of the journal despite advertising fluctuations. mr. shoffner stated that he favored using jola funds to sponsor isad institutes and that there is a need for more introductory and elementary education in the isad institutes. participants in the institutes had shown interest in basic knowledge of automation in order to make decisions in their work even though not necessarily involved directly with automation.
exhibit 1
telecommunications committee report

progress report of activities to date:
1. committee decided to maintain an awareness of future possibilities of two-way cable for data transmission, but not to continue active role in broadcast cable area in view of ongoing work in the area elsewhere.
2. committee members extensively debated the directions to which its future efforts would be actively directed; these included education, network protocol standards, etc.
3. committee accepted reports from messrs. randel and long on current suppliers of bibliographic services via star networking, and on current ansi and eia (plus iso) standards activities related to present and future bibliographic data transmission.
4. committee resolved: to attempt to formulate methods for computer-to-computer interaction (protocols) by telecommunication links, such that a single terminal of arbitrary characteristics could access a variety of host services in a "user-transparent" fashion.
5. various members of the committee accepted assignments in gathering data and protocols in use in such networks as arpa, tym-share, ncic, etc. it was recognized that the membership of the committee must be enlarged and that more than two ala meeting forums yearly are needed for the task.

recommendations for division board action: the committee moved and unanimously passed a resolution that ala, via isad, join the committee of corporate telephone users (cctu) and thus support the effort to combat the at&t attempt to adversely modify the current wats tariff; should it not be legally or financially feasible for isad/ala to join cctu the committee will nonetheless attempt to follow and report on this and related regulatory items.

exhibit 2
committee on representation in machine readable form of bibliographic information (marbi) report

following is a summary of deliberations and actions of the committee:
1. jola editorial, vol. 7, no. 2, 1974.
the chairperson was asked to send a letter to the editor correcting the erroneous/ambiguous reference to marbi and its relationship (formal and otherwise) to lc, clr cembi, etc.
2. conser. the committee took note of and discussed recent developments of the conser project. the committee will review and comment on formal recommendations of conser affecting marc serials format when they are submitted through lc.
3. clr/nsf sponsored conference on national bibliographic control. formal "conclusions and recommendations" of the conference have been distributed. the committee decided to take note of this document, ask each member to comment on the substance, and to prepare a formal critique/reaction of the conclusions for clr.
4. character sets. a progress report (by h. avram) was presented of international activities. extended character sets for latin (i.e., roman), cyrillic, and greek have been agreed to by iso working group on character sets. a draft standard is being prepared. further work is being done on character sets for mathematical symbols and african languages.
5. iso 2709, format structure for marc records. progress report given. no action taken.
6. content designators. a progress report on international activities was given (by h. avram) as well as a summary of some working papers prepared to date. copies will be submitted to the marbi members.
7. iso filing standards. a progress report was given. discussion but no action.
8. authority record formats. copies of the lc proposal for "authorities: a marc format" were distributed. a description of the work in progress at lc was given. lc tentatively plans to initiate a service for authorities in machine-readable form in 1975. the service probably will include names new to lc with cross-references and names new to marc with cross-references.
9. microform experiment.
lc representative described a com microform experiment currently being defined/ set up at lc. the experiment will focus on lcsh 8th ed. in com format. 10. isbd-serials. the first formal publication of isbd-s was available at this conference. it was decided that each member would review the document and send comments to mr. fasana by august 15. mr. fasana was instructed to prepare a summary of the comments supplied. 11. catalog code revision committee. the need to establish liaison and input to this committee was discussed. arrangements were made with the chairperson of the ccrc (j. byrum) to establish input and liaison between the two committees. exhibit 3 cola discussion group report the isad cola discussion group met on july 7, 1974. brian aveney, chairman, mentioned that the subject of merger of cola with the marc users' discussion group had been informally raised, and invited comment from any members of the groups. discussion centered around the time needs and a suggestion was made that cola and mudg meet back to back. further discussion was deferred for later informal contacts. the program divided into two different sessions. the first consisted of a series of independent presentations on library automation activities around the counhy. those who reported were: helena rivoire (bucknell university); ron miller and bill mathews (nelinet); ann ekstrom (oclc); richard de gennaro (university of pennsylvania); james sokoloski (university of massachusetts); james dolby (r&d associates); howard harris (university of chicago); and stephen silberstein (university of california, berkeley). the second half of the program consisted of a panel presentation about the use of microform catalogs in libraries. richard jensen (university of texas, permian basin) described the use during the last year of a divided microfiche catalog produced under contract by richard abel co. no other form of access to the collection is provided for public use. 
a brief questionnaire about patron response indicated no great difficulties in use. some complaints about readers and filing were noted. mary fischer (los angeles public library) discussed the transition to com fiche for internal reports, for reasons of cost. a variety of reports can now be distributed to all branches which formerly did not have access to this information except at the central library. james rizzolo (new york public library) mentioned the dance collection catalog now available on film. user response has been very positive, but the fact that this is the first time any form of catalog has been available is probably a large factor in this response. a com marc character set has been developed with a new york vendor for use in internal fiche files, and samples were made available to the group.

exhibit 4
journal of library automation annual report

this report covers the eighteen months between january 1973 and june 1974. during this period nine issues of jola appeared, from the june 1972 to the june 1974 issues. these issues contained thirty-nine articles and twenty-three book reviews. in addition, jola/technical communications was incorporated into the journal with the march 1973 issue. with volume 7 (1974), an editorial or guest editorial appears in each issue.

in january 1973 the journal was eight months behind. ala's central production unit was to have taken over the technical editing with the 1973 volume, but due to the unforeseen delay in publication the staff was not familiar with the journal, or the printer. by march all the major problems had been sorted out, and the june 1972 issue was sent to the printer. at that time there was a backlog of thirty-five manuscripts, of which twenty were eventually published, nine were rejected, and six are still pending (either sent back to the author for revision, or still in the process of locating or identifying the author).

with volume 6 (1973), the contract for printing was given to the ovid bell press, inc. spencer-walker did not bid on a contract renewal. because of the increasing cost of paper and the narrower selection offered by paper manufacturers, the editorial board determined that at the same time it would be reasonable to change from use of permalife to another cheaper but acid-free stock. warren old english was selected; at the time (june 1973) it was $25.10 per hundredweight.

since february 1973, jola has received fifty manuscripts for consideration:

published            18
rejected             11
accepted              7
in review             4
pending               9
sent to tc editor     1

it is difficult to summarize the content of these nine issues. when categorized very broadly, the thirty-nine articles covered the following topics:

aspects of cataloging                 7
search keys and file structure        7
national automation and standards     7
isad topics                           5
circulation                           5
acquisitions                          2
serials                               2
information retrieval                 2
administration                        1
other                                 1

don bosseau continues, i am pleased to say, as editor of technical communications. peter simmons (university of british columbia) accepted the position of book review editor, and is also doing an excellent job. he reports that, in addition to the reviews already published, eleven reviews have been submitted and are awaiting publication, and six books are in the hands of reviewers.

the central production unit has been of invaluable assistance in bringing the journal up to date, in negotiating with the post office on our behalf, and in continuing to provide technical editing support. jola is now completely up to date; i hope that we shall continue to improve the standards for acceptance of articles, and that time will now permit us to examine the journal critically to determine where improvements could or should be made.

exhibit 5
committee on technical standards for library automation report

recommendations for division board action:
1. nominate mr. arthur brody as isad representative to ansi-x4.
2. approve revised charge to tesla.

the tesla met in three sessions.
1. minutes of previous meeting. approved.
2. charge to the committee. the charge to the committee had been revised and the reasons for each revision documented, and the revisions reviewed. it was voted that the charge as revised be approved by the isad board.
3. draft procedure. the tesla procedure for handling standards proposals was reviewed and the following changes recommended:
   a. proposal outline item viii be made optional.
   b. reactor ballot include three responses, e.g., for/against: need for standard; for/against: specification of standard; yes/no: available to work on specification.
   these changes will be made and published in the next issue of jola-tc.
4. publication of materials relating to standards. the article describing the committee's procedures and role and outlining the standards organization potentially impacting libraries was published in jola. the committee discerned that standards exist which would be of importance to the library community and that these be identified and reviewed in terms of their impact on libraries. as a first step, a listing of those standards will be drawn up and, on review of the committee, published in jola-tc.
5. representative to ansi-x4. the ala is currently not represented on ansi-x4. the committee recommends that the isad board nominate mr. arthur brody as the ala representative to ansi-x4.
6. metrication. the current movement to metric measure may impact libraries. a subcommittee of ms. madeline henderson (chairperson) and dr. ed bowles was formed to develop a position paper on the impact of metrication.
7. standards program at san francisco. the committee will present a 1½-hour program on standards at the next annual convention, in san francisco.
8. open meeting. reactor ballot responses to the potential standards areas and a general review of the committee's activities were held in its third session.
9. next meeting. tentatively the committee will meet at the asis conference in atlanta. time and date to be announced.

exhibit 6
isad/led committee on education for information science report

discussion: directions of committee. need for visibility at ala and follow up to denver (1971) meeting. possible tutorial or institute topics; cosponsors.

action:
1. plan program for san francisco 1975.
   speaker: ph.d. student from syracuse to design guidelines for module development
   panel: two to three modules presented
   reactors: discussion
2. work out subject outline based on questionnaire for distribution at san francisco for possible module development ready for committee approval by midwinter.

recommendations for division board action: program slot for san francisco

highlights: serious concern about lack of member participation. isad and led may want to reexamine purpose-need for committee and/or reorganization.

book reviews

a computer based system for reserve activities in a university library, by paul j. fasana (and others). new york: systems office, the libraries, columbia university, 1969. (final report, project no. 7-1129, u. s. office of education, bureau of research) iii, 50, (53) pp.

one opens this report wondering whether circulation of reserve books to readers is included in the computer based system, and assuming that such circulation would have to be handled on-line because the short duration of reserve loans, often on the order of one hour, would not seem to fit well with batch processing.
it is soon made clear that on-line circulation was set as a goal of the second phase of the system; only the first phase is described here, though somewhat tantalizingly it is stated that one of the aspects of phase two already developed or experimented with is "a fully operational off-line circulation system." what is reported here, however, in commendable fullness, is a system, called reserves processing, which greatly facilitates the processes of putting books on reserve, taking them off, and producing reference lists. emphasis has been placed on developing a generalized system that can be used in different units of the columbia university libraries, and, with necessary modifications, in other academic libraries. the preferred form of data entry is on-line with an ibm 2741 terminal. other functions (and backup systems for data entry) are off-line; the master reserve file is stored on an ibm 2311 disc pack. one section of the report describes the system for those who are not computer specialists; this includes copies of forms and form letters. other sections give technical documentation, including a flow chart, details of format, and actual listings of four programs written in f level cobol for os/360. the report will be valuable to anyone considering the problem of reserve books; its successor covering phase two will be eagerly awaited by all those interested in circulation as well. foster m. palmer involvement of computers in medical sciences, compiled by k. m. shahid, h. j. vander aa, and l. m. c. j. sicking. amsterdam: swets and zeitlinger, 1969. 227 pp. the compilers of this volume have brought together the significant abstracts of the literature that pertains to the use of the computer in present-day medicine. this volume will serve a valuable purpose for those interested in the computer and its applications in medical sciences as it will give a broad overview of computer usage in medicine and many closely allied fields. 
as computer uses grow in frequency and diversity, a review of this type becomes increasingly valuable to those interested in the field. john a. prior

translations journals; list of periodicals translated cover-to-cover, abstracted publications and periodicals containing selected articles, compiled by mrs. a. s. de groot-de rook. delft: european translations centre, 1970. 44 pp. $2.00.

this book is an updated bibliographical list and union catalog intended as a guide to scientific and technical journals in translation. entries, arranged alphabetically by original title, contain bibliographical details, publisher and price. the list includes both current and terminated periodicals (about 400 entries). there are cross references from the translated title to the original title. at the end of each entry selected locations and their holdings are listed. the holdings of national translation centres and/or libraries adhering to the european translation centre are also included. a "list of publishing houses," the agents from which to order, is included along with mailing addresses. there is also a "list of holding libraries" with addresses. only non-western language periodicals for which there are western language versions are included. no non-western journals that contain western language articles or journals originally published in western languages are included. irene braden hoadley

proceedings of the 1969 clinic on library applications of data processing, edited by dewey e. carroll. urbana: university of illinois graduate school of library science, 1970. 144 pp. $5.00.

the volume contains eleven invited papers presented at the seventh annual clinic on library applications of data processing held april 27-30, 1969, at urbana, illinois. as in the preceding volumes in this series, the purpose is to report actual experience in case history form of applications of data processing technology to areas of library operations. the book is a source of information on how particular problems were handled within a particular environment. library operations which receive particular attention are the usual ones: acquisitions, cataloging and circulation. "library networks: cataloging and bibliographic aspects," by ann curran presents actual problems encountered in the development of an operating network as well as many thought-provoking questions. stephen salmon's article on automation of the library of congress card division is very informative. also of interest are two articles dealing with pl/i as a programming language for library applications. several articles are beginning to describe on-line applications of data processing for libraries as well as batch processing and the optimal mixes of both. unlike some of the preceding volumes, this volume has a very fine overall index. there is an error in the name of one of the authors (james b. corbin should be john b. corbin). no participant discussion is included. kenneth j. bierman

journal of library automation vol. 3/4 december, 1970

techniques of information retrieval, by b. c. vickery. hamden, conn.: archon books, 1970. 262 pp. $11.00.

this book is a lucidly written text dealing primarily with manual indexing, and the manual construction of document profiles. there is a wealth of information about classification systems and their use for indexing purposes, and two particularly interesting chapters that give illustrations of some of the work going on at information centers, and of some of the basic concepts arising in systems evaluation, respectively. the present reviewer finds this book difficult to deal with, since the temptation continuously arises to substitute one's own aims for those of the author. to my mind, this book does not deal with the "techniques of information retrieval," as commonly understood.
the latter would surely include a thorough description of automatic indexing procedures, automatic classification, on-line search systems, modern storage allocation methods, fast search systems, and so on; and while some of these concepts are mentioned in passing, the reader surely cannot obtain an accurate picture in these areas. rather, the book deals with conventional indexing procedures, and will likely be of value for the conventional training of librarians and documentalists. the text is easy to read, and includes plenty of examples, as well as some examination questions and exercises. still, this reviewer wonders whether a more modern book might not have been published in 1970, particularly if the title includes the phrase "information retrieval." to this question, the author would likely answer (as on page 17) that the: " ... analysis and synthesis of information, though it may be aided by the machine can only be carried out effectively by skilled human labor;" or again (as on page 43): " ... if we cannot say for certain what is the optimum human selection of index terms in a particular situation, then one cannot evaluate a machine selection." statements such as these are easy to generate, particularly if one is not obliged to furnish any proof for one's assertions. in any case, they serve to illustrate the author's viewpoint and his particular choice of subject matter. to summarize, this text appears to be an excellent introduction to conventional documentation work, with emphasis on manual document analysis and indexing. it does not, unfortunately, give a reasonable preview of the fundamental changes which will inevitably occur in the information and documentation fields over the next ten or twenty years. g. salton

communications
information technology and libraries | june 2011

seeing the wood for the trees: enhancing metadata subject elements with weights

hong zhang, linda c. smith, michael twidale, and fang huang gao

subject indexing has been conducted in a dichotomous way in terms of what the information object is primarily about/of or not, corresponding to the presence or absence of a particular subject term, respectively. with more subject terms brought into information systems via social tagging, manual cataloging, or automated indexing, many more partially relevant results can be retrieved. using examples from digital image collections and online library catalog systems, we explore the problem and advocate for adding a weighting mechanism to subject indexing and tagging to make web search and navigation more effective and efficient. we argue that the weighting of subject terms is more important than ever in today’s world of growing collections, more federated searching, and expansion of social tagging. such a weighting mechanism needs to be considered and applied not only by indexers, catalogers, and taggers, but also needs to be incorporated into system functionality and metadata schemas.

hong zhang (hzhang1@illinois.edu) is phd candidate, graduate school of library and information science, university of illinois at urbana-champaign, linda c. smith (lcsmith@illinois.edu) is professor, graduate school of library and information science, university of illinois at urbana-champaign, michael twidale (twidale@illinois.edu) is professor, graduate school of library and information science, university of illinois at urbana-champaign, and fang huang gao (fgao@gpo.gov) is supervisory librarian, government printing office.

subjects as important access points have largely been indexed in a dichotomous way: what the object is primarily about/of or not. this approach to indexing is implicitly assumed in various guidelines for subject indexing. for example, the dublin core metadata element set recommends the use of controlled vocabulary to represent subject in “keywords, key phrases, or classification codes.”1 similarly, the library of congress practice, suggested in the subject headings manual, is to assign “one or more subject headings that best summarize the overall contents of the work and provide access to its most important topics.”2 a topic is only “important enough” to be given a subject heading if it comprises at least 20 percent of a work, except for headings of named entities, which do not need to be 20 percent of the work when they are “critical to the subject of the work as a whole.”3 although catalogers are aware of it when they assign terms, this weight information is left out of the current library metadata schemas and practice.

a similar practice applies in non-textual object subject indexing. because of the difficulty of selecting words to represent visual/aural symbolism, subject indexing for art and cultural objects is usually guided by panofsky’s three levels of meaning (pre-iconographical, iconographical, and post-iconographical), further refined by layne in “ofness” and “aboutness” in each level. specifically, what can be indexed includes the “ofness” (what the picture depicts) as well as some “aboutness” (what is expressed in the picture) in both pre-iconographical and iconographical levels.4 in practice, vra core 4.0 for example defines subject subelements as:

terms or phrases that describe, identify, or interpret the work or image and what it depicts or expresses. these may include generic terms that describe the work and the elements that it comprises, terms that identify particular people, geographic places, narrative and iconographic themes, or terms that refer to broader concepts or interpretations.5

here again, no weighting or differentiating mechanism is included in describing the multiple elements. what is addressed is the “what” problem: what is the work of or about? metadata schemas for images and art works such as vra core and cdwa focus on specificity and exhaustivity of indexing, that is, the precision and quantity of terms applied to a subject element. however, these schemas do not address the question of how much the work is of or about the item or concept represented by a particular keyword.

recently, social tagging functions have been adopted in digital library and catalog systems to help support better searching and browsing. this introduces more subject terms into the system. yet again, there is typically no mechanism to differentiate between the tags used for any given item, except for only a few sites that make use of tag frequency information in the search interfaces.

as collections grow and more federated searching is carried out, the absence of weights for subject terms can cause problems in search and navigation. the following examples illustrate the problems, and the rest of the paper further reviews and discusses the precedent research and practice on weighting, and further outlines the issues that are critical in applying a weighting mechanism.

examples of problems

exhaustive indexing: digital library collections

a search query of “tree” can return thousands of images in several digital library collections. the results include images with a tree or trees as primary components mixed with images where a tree or trees, although definitely present, are minor components of the image. figure 1 illustrates the point. these examples come from three different collections and either include the subject element of “tree” or are tagged with “tree” by users. there is no mechanism that catalogers or users have available to indicate that “tree” in these images is a minor component. note that we are not calling this out as an error in the professionally developed subject terms, nor indeed in the end user generated tags.
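the retrieval behaviour described above can be sketched in a few lines of python. this is only an illustrative sketch, not anything specified by the authors: the three records, the 0-to-1 weight scale, and the function names `dichotomous_search` and `weighted_search` are all hypothetical. it contrasts a presence/absence match on “tree” with a match that can also use an assumed per-term weight.

```python
# Hypothetical records: each maps subject terms to an ASSUMED prominence
# weight between 0 and 1. Neither the records nor the scale come from the
# article; they exist only to illustrate the idea.
RECORDS = {
    "oak study": {"tree": 0.9, "field": 0.3},
    "dog portrait": {"dog": 0.9, "tree": 0.1},
    "riverside walk": {"river": 0.6, "people": 0.5, "tree": 0.4},
}


def dichotomous_search(records, term):
    """Return every record that carries the term, ignoring weights."""
    return sorted(name for name, subjects in records.items() if term in subjects)


def weighted_search(records, term, min_weight=0.0):
    """Return records carrying the term at or above min_weight, heaviest first."""
    hits = [
        (subjects[term], name)
        for name, subjects in records.items()
        if term in subjects and subjects[term] >= min_weight
    ]
    # Sort by descending weight, then by name so ties are deterministic.
    return [name for weight, name in sorted(hits, key=lambda h: (-h[0], h[1]))]


if __name__ == "__main__":
    # Presence/absence matching returns all three records for "tree".
    print(dichotomous_search(RECORDS, "tree"))
    # With a threshold, only the record where "tree" is prominent remains.
    print(weighted_search(RECORDS, "tree", min_weight=0.5))
```

with no threshold, the weighted search still returns every tagged record, so the minor-component tags remain available for compound queries; the weight only adds an ordering and an optional cut-off.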
although particular images may have an incorrectly applied keyword, we want to talk about the vast majority where the keyword quite correctly refers to a component of the image. furthermore, such keywords referring to minor components of the image are extremely useful for other queries. this kind of exhaustive indexing of images enables the effective satisfaction of search needs, such as looking for pictures of “buildings, people, and trees” or “trees beside a river.” with large image collections, such compound needs become more important to satisfy by combinations of searching and browsing. to enable them, metadata about minor subjects is essential. however, without weights to differentiate subject keywords, users will get overwhelmed with partially relevant results. for example, a user looking for images of trees (i.e., “tree” as the primary subject) would have to look through large sets of results such as a photograph of a dog with a tiny tree out of focus in the background. for some items that include rich metadata, such as title or description, when people look at a particular item’s record, with the title and sometimes the description, we may very well determine that the picture is primarily of, say, a dog instead of trees. that is, the subject elements have to be interpreted based on the context of other elements in the record to convey the “primary” and “peripheral” subjects among the listed subject terms. however, in a search and navigation system where subject elements are usually treated as context-free, search efficiency will be largely impaired because of the “noise” items and inability to refine the scope, especially when the volume of items grows. lack of weighting also limits other potential uses of keywords or tags. 
for example, all the tags of all the items in a collection can be used to create a tag cloud as a low cost way to contribute to a visualization of what a collection is “about” overall.6 unfortunately, a laboriously developed set of exhaustive tags, although valuable for supporting searching and browsing within a large image collection, could give a very distorted overview of what the whole collection is about. extending our example, the tag “tree” may occur so frequently and be so prominent in the tag cloud that a user infers that this is mostly a botanical collection. selective indexing: lcsh in library catalogs although more extreme in the case of images in conveying the “ofness,” the same problem with multiple subjects also applies to text in terms of “aboutness.” the following example comes from an online library catalog in a faceted navigation web interface using library of congress subject headings in subject cataloging.7 the query “psychoanalysis and religion” returned 158 results, with 126 in “psychoanalysis and religion” under the topic facet. according to the subject headings manual, the first subject is always the primary one, while the second and others could be either a primary or nonprimary subject.8 this means that among these 126 books, there is no easy way to tell which books are “primarily” about “psychoanalysis and religion” unless the user goes through all of them. with the provided metadata, we do know that all books that have “psychoanalysis and religion” as the first subject heading are primarily about this topic, but a book that has this same heading as its second subject heading may or may not be primarily about this topic. there is no way to indicate which it is in the metadata, nor in the search interface. as this example shows, the library of congress manual involves an attempt to acknowledge and make a distinction between primary and nonprimary subjects. 
however in practice the attempt is insufficient to be really useful since apart from the first entry, it is ambiguous whether subsequent entries are additional primary subjects or nonprimary subjects. consequently, the search system and, further on, the users are not able to take full advantage of the care of a cataloger in deciding whether an additional subject is primary or not. other information retrieval systems the negative effect of current subject indexing without weighting on search outcomes has been identified by some researchers on particular information retrieval systems. in a study examining “the contribution of metadata to effective searching,”9 hawking and zobel found that the available subject metadata are “of little value in ranking answers” to search queries.10 their explanation is that “it is difficult to indicate via metadata tagging the relative importance of a page to a particular topic,”11 in addition to the problems in data quality and system implementation. the same problem : | zhang et al. 77seeing the wood for the trees | zhang et al. 77 authors compared with the automatic indexing systems, because human indexers should be better at weighting the significance of subjects, and be more able to distinguish between important and peripheral compared with computers that base significance on term frequency.13 indeed, while various weighting algorithms have been used in automatic indexing systems to approximate the distinguishing function, there is simply no such mechanism built in human subject the particular page harder to find.12 a similar problem is reported in a recent study by lykke and eslau. in comparing searching by controlled subject metadata, searching based on automatic indexing, and searching based on automatic indexing expanded with a corporate thesaurus in an enterprise electronic document management system, the authors found that the metadata searches produced the lowest precision among the three strategies. 
the problem of indiscriminate metadata indexing is “remarkable” to the of multiple tags without weights is described: in the kinds of queries we have studied, there is typically one page (or at most a small number) that is particularly valuable. there are many other pages which could be said to be relevant to the query—and thus merit a metadata match—but they are not nearly so useful for a typical searcher. under the assumption that metadata is needed for search, all of these pages should have the relevant metadata tag, but this makes a. subject: women; books; dresses; flowers; trees; . . . in: victoria & albert museum (accessed aug. 30, 2010), http://collections.vam.ac.uk/item/014962/oil-painting-the-day-dream b. tags: japanese; moon; nights; walking; tree; . . . in: brooklyn museum (accessed aug. 30, 2010), http://www.brooklynmuseum.org/opencollections/objects/121725/aoi_slope_outside_toranomon_gate_no._113_from_ one_hundred_famous_views_of_edo c. tags: japanese; birds; silk; waterfall; tree; . . . in: steve: the museum social tagging project (accessed aug. 30, 2010), http://tagger.steve.museum/steve/object/15?offset=2 figure 1. example images with “tree” as a subject item 78 information technology and libraries | june 2011 anderson in niso tr021997.20 in addition, researchers have noticed the limitations of this dichotomous indexing. in an opinion piece, markey emphasizes the urgency to “replace boolean-based catalogs with post-boolean probabilistic retrieval methods,”21 especially given the challenges library systems are faced with today. it is the time to change the boolean, i.e., dichotomous, practice of subject indexing and cataloging, no matter whether it is produced by professional librarians, by user tagging, or by an automatic mechanism. 
indeed, as declared by svenonius, “while the purpose of an index is to point, the pointing cannot be done indiscriminately.”22 needed refinements in subject indexing the fact that weighted indexing has become more prominently needed over the past decade may be related to the shift in the continuum from subject indexing as representation/ surrogate to subject indexing as access points, which is consistent with the shift from a small number of subject terms to more subject terms. this might explain why the weighting practice is applied in the above mentioned medline/pubmed system. with web-based systems, social tagging technology, federated searching, and the growing number of collections producing more subject terms, to distinguish between them has become a prominent problem. in reviewing information users and use from the 1920s to the present, miksa points out the trend to “more granular access to informational objects” “by viewing documents as having many diverse subjects rather than one or two ‘main’ subjects,” no matter what the social and technical environment has been.23 in recognizing this theme in the future development of information organization and retrieval systems, we argue that the subject indexing mechanism subject indexing has been discussed in the research area of subject analysis for some time. weighting gives indexing an increased granularity and can be a device to counteract the effect of indexing specificity and exhaustivity on precision and recall, as pointed out by foskett: whereas specificity is a device to increase relevance at the cost of recall, exhaustivity works in the opposite direction, by increasing recall, but at the expense of relevance. a device which we may use to counteract this effect to some extent is weighting. in this, we try to show the significance of any particular specification by giving it a weight on a pre-established scale. 
for example, if we had a book on pets which dealt largely with dogs, we might give pets a weight of 10/10, and dogs, a weight of 8/10 or less.16
anderson also includes weighting as a part of indexing in the guidelines for indexes and related information retrieval devices (niso tr02-1997): one function of an index is to discriminate between major and minor treatments of particular topics or manifestations of particular features.17 he also notes that a weighting scheme is “especially useful in high-exhaustivity indexing”18 when both peripheral and primary topics are indicated. similarly, fidel lists “weights” as one of the issues that should be addressed in an indexing policy.19 metadata indexing without weighting is related to the simplified dichotomous assumption in subject indexing (primarily about/of versus not primarily about/of), which further leads to the dichotomous retrieval result (retrieved versus not retrieved). weighting as a mechanism to break this dichotomy is noted by
metadata indexing even though human indexers are able to do the job much better than computers.
weighting: yesterday, today, and future
precedent weighting practices
written more than thirty years ago, the final report of the subject access project describes how the project researchers applied weights to the newly added subject terms extracted from tables of contents and back-of-the-book indexes. the criterion used in that project was that terms and phrases with a “ten-page range or larger” were treated as “major” ones.14 a similar mechanism was adopted in the eric database beginning in the 1960s, with indexes distinguishing “major” and “minor” descriptors as the result of indexing. while some search systems allowed differentiation of major and minor descriptors in formulating searches, others simply included the distinction (with an asterisk) when displaying a record. unfortunately, this distinguishing mechanism is no longer included in the later eric indexing data.
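a minimal python sketch may help make the weighting idea concrete. it stores foskett's pets/dogs weights on an assumed 0 to 10 scale and treats any term at or above an assumed threshold as “major”, loosely mirroring the major/minor distinction described above. the record identifiers, the second record, the threshold value, and the names WeightedTerm and search are hypothetical illustrations, not part of any system discussed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WeightedTerm:
    term: str
    weight: int  # assumed 0-10 scale, after Foskett's pets/dogs example


# Hypothetical records; only the pets/dogs weights echo Foskett's example.
RECORDS = {
    "book-on-pets": [WeightedTerm("pets", 10), WeightedTerm("dogs", 8)],
    "book-on-gardens": [WeightedTerm("gardens", 10), WeightedTerm("dogs", 2)],
}

MAJOR_THRESHOLD = 5  # assumed cut-off between "major" and "minor" terms


def search(term, records=RECORDS, major_only=False):
    """Return (record_id, weight) pairs for `term`, highest weight first."""
    hits = []
    for record_id, terms in records.items():
        for t in terms:
            if t.term != term:
                continue
            if major_only and t.weight < MAJOR_THRESHOLD:
                continue  # skip records that only treat the term in passing
            hits.append((record_id, t.weight))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

an unweighted index would return both records for “dogs” with nothing to separate them; with weights, the results can be ranked, or restricted to records where the term is a major topic.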
a system using weighted indexing and searching and still running today is the medline/pubmed interface. a qualifier [majr] can be used with a medical subject headings (mesh) term in a query to “search a mesh heading which is a major topic of an article (e.g., thromboembolism[majr]).”15 in the search result page, each major mesh topic term is denoted by an asterisk at the end.
weighting concept and the purpose of indexing
the weighting concept is connected with the fundamental purpose of indexing. the idea of weighting in
seeing the wood for the trees | zhang et al.
user tagging and machine-generated metadata, such weighting becomes more important than ever if we are to make productive use of metadata richness and still see the wood for the trees.
references
1. “dublin core metadata element set, version 1.1,” http://dublincore.org/documents/dces/ (accessed nov. 20, 2010).
2. library of congress, subject headings manual (washington, d.c.: library of congress, 2008).
3. ibid.
4. elaine svenonius, “access to nonbook materials: the limits of subject indexing for visual and aural languages,” journal of the american society for information science 45, no. 8 (1994): 600–606.
5. “vra core 4.0 element description,” http://www.loc.gov/standards/vracore/vra_core4_element_description.pdf (accessed mar. 31, 2011).
6. richard j. urban, michael b. twidale, and piotr adamczyk, “designing and developing a collections dashboard,” in museums and the web 2010: proceedings, ed. j. trant and d. bearman (toronto: archives & museum informatics, 2010), http://www.archimuse.com/mw2010/papers/urban/urban.html (accessed apr. 5, 2011).
7. “vufind at the university of illinois,” http://vufind.carli.illinois.edu (accessed nov. 20, 2010).
8. library of congress, subject headings manual.
9. david hawking and justin zobel, “does topic metadata help with web search?” journal of the american society for information science & technology 58, no.
5 (2007): 613–28. 10. ibid. 11. ibid. 12. ibid, 625. 13. marianne lykke and anna g. eslau, “using thesauri in enterprise settings: indexing or query expansion?” in the janus faced scholar. a festschrift in honour of peter ingwersen, ed. birger larsen et al. (copenhagen: royal school of library & information science, 2010): 87–97. 14. subject access project, books are for use: final report of the subject access project to the council on library resources (syracuse, n.y.: syracuse univ., 1978). 15. “pubmed,” http://www.nlm.nih more than three categories or using continuous scales instead of category rating.24 subject indexing involves a similar judgment of relevance when deciding whether to include a subject term. more sophisticated scales certainly enable more useful ranking of results, but the cost of obtaining such information may rise. after the mechanism of incorporating weights into subject indexing/ cataloging is developed, guidelines should be provided for indexing practice to produce consistent and good quality. weights in both indexing and retrieval system adding weights to subject indexing/ cataloging needs to be considered and applied in three parts: (1) extending metadata schemas by encoding weights in subject elements; (2) subject indexing/cataloging with weight information; and (3) retrieval systems that exploit the weighting information in subject metadata elements. the mechanism will not work effectively in the absence of any one of them. conclusion this paper advocates for adding a weighting mechanism to subject indexing and tagging, to enable search algorithms to be more discriminating and browsing better oriented, and thus to make it possible to provide more granular access to information. such a weighting mechanism needs to be considered and applied not only by indexers, catalogers, and taggers, but also needs to be incorporated into system functionality. 
as social tagging is brought into today’s digital library collections and online library catalogs, as collections grow and are aggregated, and the opportunity arises for adding more metadata from a variety of different sources, including end
should provide sufficient granularity to allow more granular access to information, as demonstrated in the examples in the previous section.
potential challenges
while arguing for the potential value of weights associated with subject terms, it is also important to acknowledge potential challenges posed by this approach.
human judgment
treating assigned terms equally might seem to avoid the additional human judgment and the subjectivity of the weight levels because different catalogers may give different weight to a subject heading. we argue that assigning subject headings is itself unavoidably subjective. we are already using professional indexers and subject catalogers to create value-added metadata in the form of subject terms. assigning weights would be a further enhancement. on the other hand, adding a weighting mechanism into metadata schemas is independent of the issue of human indexing. no matter who will do the subject indexing or tagging, either professional librarians or users or possibly computers, there is a need for weight information in the metadata records.
the weighting scale
in terms of the specific mechanism of representing the weight rating, we can benefit from research on weighting of index terms and on the relevance of search results. for example, the three categories of relevant, partially relevant, and nonrelevant in information retrieval are similar to the major, minor, and nonpresent subject indexing method in the examples above. borlund notes several retrieval studies proposing
22. svenonius, “access to nonbook materials,” 601.
23. francis miksa, “information organization and the mysterious information user,” libraries & the cultural record 44, no. 3 (2009): 343–70.
24. pia borlund, “the concept of relevance in ir,” journal of the american society for information science & technology 54, no. 10 (2003): 913–25.
18. ibid.
19. raya fidel, “user-centered indexing,” journal of the american society for information science 45, no. 8 (1994): 572–75.
20. anderson, guidelines for indexes and related information retrieval devices, 20.
21. karen markey, “the online library catalog: paradise lost and paradise regained?” d-lib magazine 13, no. 1/2 (2007).
.gov/bsd/disted/pubmedtutorial/020_760.html (accessed nov. 20, 2010).
16. a. c. foskett, the subject approach to information, 5th ed. (london: library association publishing, 1996): 24.
17. james d. anderson, guidelines for indexes and related information retrieval devices. niso-tr02–1997, http://www.niso.org/publications/tr/tr02.pdf (accessed nov. 20, 2010): 25.
editorial board thoughts
virtual reality: the next big thing for libraries to consider
breanne kirsch
information technology and libraries | december 2019
breanne kirsch (breanne.kirsch@briarcliff.edu) is university librarian, briar cliff university.
i had the pleasure of attending the educause annual conference from october 14-17, 2019. this was my first time at educause, but i was impressed with the variety of programs, vendors, and options for learning about technology and higher education. after recently completing my coursework for a second master’s in educational technology, i was curious to see what new technologies would be highlighted at educause. i found out about some new trends, such as the growth of esports in high schools and higher education. esports are when players or teams compete through computers in video game competitions.1 there were over 20 programs and sessions about virtual reality at educause.
since there were so many programs about virtual reality at educause, i wanted to share a little of what i learned including how some higher education institutions are creating vr content, using pre-created content, and vr in libraries. since virtual reality is still new to many higher education institutions, i wasn’t sure how many would be creating content, but i did attend a couple of sessions about how 360-degree content is being created. virtual reality content creation seems to happen most frequently in the medical field so students can practice different procedures that may not happen very frequently in their jobs, allowing them to experience a wider variety of procedures that they will eventually encounter in the workplace. health sciences libraries are generally ahead of the curve in providing vr services to patrons.2 additionally, stem areas are finding more uses for vr, such as vr laboratories so expensive lab equipment does not need to be purchased, but students can still participate in vr lab experiences. creating vr content using tools such as unity can be difficult and time-consuming. some educators are using 360-degree cameras to create virtual settings that can be used by students but are easier to create. tim fuller and rich kappel spoke about how they used a 360-degree camera and matterport scans to create 360-degree virtual environments for students to explore and engage with robotics technology. tags can then be added to include pictures, videos, or links to websites with more information. this creates a shareable link that can be used to share with students. i was able to use my iphone and the google street view app to create a 360-degree tour of my library. it is not high quality enough to view in virtual reality with an oculus go or other vr headset, but it is a great starting point for creating a 360-degree virtual tour of a library on a budget. this was free (since i already had an iphone).
there is a wide variety of freely available, 360-degree content that can be used by educators in the classroom and more is being created.
https://doi.org/10.6017/ital.v38i4.11847
what does this mean for libraries? while quick virtual tours can be created with smartphones, higher quality vr experiences can also be created by librarians using a 360-degree video camera. these experiences could be used to teach students information literacy skills or search strategies in a vr environment. while this would be harder to do right now with the technologies available, it could become easier down the road. meanwhile, librarians can create 360-degree virtual tours. libraries can offer vr services, such as a vr lab or checking out standalone vr headsets, such as oculus go or oculus quest. just like with the makerspaces trend, libraries are well situated to support virtual reality in education. our library circulates an oculus go and when we were considering adding a virtual reality headset, there were some risks we considered prior to purchasing it. there are health risks for some people when using virtual reality headsets, such as motion sickness, dizziness, and, in some cases, epileptic seizures. it is important to explain this to students before they check out the device, so they know to immediately quit using the oculus go if they have an adverse reaction. additionally, we keep cleaning wipes with the oculus go to help keep it sanitary when multiple people are using it. a tablet or smartphone needs to be associated with the oculus go in order to update apps or download new apps. therefore, a passcode needs to be added so students can’t purchase paid apps on the oculus go with the associated credit card. privacy can also be a concern, especially when using the social apps, which is why i decided not to download the social apps on the oculus go at this time.
some of the scary apps, such as the face your fear app, can cause students to scream, so it is important that students realize how realistic the experiences are before using them. one final consideration when offering vr services is staffing. there needs to be someone trained in the library who can help teach students how to use the vr headset and experiences. i’ve trained each of our student workers in how to use the headset so they can show other students. while these are some important considerations when deciding whether to offer vr services or not, i believe the benefits outweigh the risks. virtual reality is expected to continue to grow, especially with wireless headsets, such as the oculus go and oculus quest, available. it is important for libraries to be ready to offer support with virtual reality, just as we’ve offered support for prior technologies including tablets, laptops, computers, 3d printers, etc. libraries can start small, by circulating an oculus go or creating a 360-degree library tour. libraries with more resources could create a vr lab or provide support for creating vr content, such as 360-degree video cameras or tools like unity. it will be exciting to see how libraries can support vr in the future.
further readings
van arnhem, jolanda-pieta, christine elliott, and marie rose. augmented and virtual reality in libraries. lanham: rowman & littlefield, 2018.
varnum, kenneth j. beyond reality: augmented, virtual, and mixed reality in the library. chicago: ala editions, 2019.
endnotes
1 matthew a. pluss, kyle j. m. bennett, andrew r. novak, derek panchuk, aaron j. coutts and job fransen, “esports: the chess of the 21st century,” frontiers in psychology 10, no. 156, 2019, https://doi.org/10.3389/fpsyg.2019.00156.
2 susan lessick and michelle kraft, “facing reality: the growth of virtual reality and health sciences libraries,” journal of the medical library association 105, no. 4, 2017, https://doi.org/10.5195/jmla.2017.329.
reproduced with permission of the copyright owner. further reproduction prohibited without permission.
using server-side include commands for subject web-page management: ... northrup, lori;cherry, ed;darby, della information technology and libraries; dec 2004; 23, 4; proquest pg. 192
tutorial
using server-side include commands for subject web-page management: an alternative to database-driven technologies for the smaller academic library
lori northrup, ed cherry, and della darby
frustrated by the time-consuming process of updating subject web pages, librarians at samford university library (sul) developed a process for streamlining updates using server-side include (ssi) commands. they created text files on the library server that corresponded to each of 143 online resources. include commands within the html document for each subject page refer to these text files, which are pulled into the page as it loads on the user's browser. for the user, the process is seamless. for librarians, time spent in updating web pages is greatly reduced; changes to text files on the server result in simultaneous changes to the edited resources across the library's web site. for small libraries with limited online resources, this process may provide an elegant solution to an ongoing problem.
lori northrup (lanorthr@samford.edu) is reference librarian, ed cherry (cecherry@samford.edu) is the automation librarian, and della darby (dhdarby@samford.edu) is the coordinator of reference and government documents at samford university library, birmingham, alabama.
the migration of printed subject guides and pathfinders to web pages began almost concurrently with the creation of library web sites. dunsmore relates that this online migration during the 1990s was followed almost immediately by articles on the design, construction, usability, and maintenance of web-based subject guides.1 a scan of recent literature (for example, dean; roberts; davidson; grimes and morris; and galvan-estrada) suggests that online access to library resources has become the norm, and that librarians struggle with the time necessary to maintain these guides online.2
in an effort to reduce time spent maintaining subject guides to internet, print, and online resources, librarians are discovering more efficient methods of resource management for their web-mounted subject guides and pathfinders. roberts, davidson, and galvan-estrada described variations of database-driven technology that generate dynamic subject guides to library resources; a common database of resources that have been descriptively enhanced for retrieval provides the backbone for a system that creates subject guides for the user at the point of query.3 patrons can then search for materials that may cross disciplinary lines, and receive a more targeted result list than they would have received had they only combined two static subject bibliographies from related fields. the primary advantage to this type of retrieval system, for the librarian, is that updates can be done at one central location (within the database) and will then appear when the updated item is viewed on any portion of the library's web site generated from that database.
libraries that have adopted database-driven technologies have done so because their resource listings have exceeded manageable capacity. selected resources from the internet or from library electronic holdings have reached a number that is difficult for available staff to maintain, especially across hundreds of static web pages.
for one library, this might mean a collection of more than two hundred; for a larger library, this critical point might be reached only after eight hundred resources were gathered. at some point in the collection of resources, the amount of work necessary to create a database-driven system will be less than the potential workload for updating individual pages. for the creators of these database-driven systems of resource retrieval and display, the time invested in the database system greatly outweighs the potential time lost in updating what davidson has described as "hundreds of pages of html containing multiple occurrences of the same information, each of which needs to be checked and updated in response to even trivial changes in title or url."4
however, the investment of time and labor necessary to the creation of a populated database should not be downplayed. davidson also notes "the process of recreating or migrating an entire site to a database-driven platform is time- and labor-intensive."5 indeed, roberts suggests that the labor and time required of library staff to get the system running at full potential outweighs any technical issues involved in creating the database.6 in galvan-estrada's case, librarians created tools specific to the system to facilitate entry of database information, thereby increasing initial time and labor investments.7
this method for handling a multiplicity of web pages and resources may work well for colleges and universities with hundreds of resources to be repeated across searches or subject pages. for some libraries, however, the critical amount of resources that can spur that type of decision may never be reached. a closer look at one small academic library's efforts to reduce time spent in updating static html subject guides may be helpful to other libraries in similar situations.
samford's situation
samford university is a small- to medium-sized institution, with about 2,900 undergraduate and 1,500 graduate students. the campus has five
samford's situation samford univ ers ity is a smallto medium-sized institution , with about 2,900 undergr aduate and 1,500 gradu ate students. the campus h as five reproduced with permission of the copyright owner. further reproduction prohibited without permission. information units: the university library, the law library, the education curriculum center, the career development center, and the drug information center. the university library is the primary information center for all disciplines except law. within the reference department, which is one of several university library departments, four full-time librarians are responsible for general reference, government documents reference, and maintenance of the department's designated web pages on the library's web site. like the majority of academic libraries, samford university library (sul) provides subject access to its electronic resources. sul' s practice is to provide static-subject web pages that include a list made up of (1) peri odical databases of primary subject importance, databases of secondary importance, and general databases; (2) reference books; and (3) web sites. each reference librarian is responsible for the creation and maintenance of selected subject pages and departmental pages. the department also maintains a page with an alphabetical list of all sul subscription databases and some free databases. it is from this list of commonly used resources that librarians select materials to fill the top portion of the subject pages . subject pages have undergone several metamorphoses in recent years. one major change was to add brief descriptions to each title on the database list while maintaining links to pages that held more in-depth information about each of the databases. because a database may be listed on two or more subject pages, these changes required a considerabl e amount of repetitious updating . 
the initial changes were made to the alphabetical list, and then the html code was copied into each of the subject pages where the title was listed. copying and pasting in this way helped to ensure consistency across subject pages. when a recent review of sul's web-site statistics indicated that the description pages were receiving little or no activity, the reference librarians decided to enlarge the descriptions under each database title on the list and remove the links to the description pages. this would provide more initial information for the patron while eliminating the web team's maintenance of underutilized pages. making these changes across all the sul subject pages and some other affected pages resulted in hours of html correction. for each change made to the alphabetical list of resources, several other subject web pages had to undergo the same change. for example, librarians were copying the information for academic search elite (ase) and infotrac onefile across every subject page. recently, faced with another update of all of the subject pages due to a database name change, the reference librarians decided to find a method that would minimize the repetition of effort required for this and future overhauls of the sul database lists. some informal discussion among reference librarians and with the automation librarian had concerned recent literature on database-driven web sites; however, the general consensus was that sul had neither the time nor the need to take such a significant leap. the total number of items on the alphabetical list of resources at that time was 144. adopting a new platform of operation in a database-driven model seemed too large an undertaking for sul's small list of resources. the reference coordinator consulted with the automation librarian about the possibility of using include statements in place of each database title and description.
the web team was already using include statements for the headers and footers of all library web pages.
include commands
include commands are a type of ssi code. fagan explains that through using ssi codes, a web author with no knowledge of programming can insert set groups of data into an html web page. carefully constructed statements within the html page give the server a command to locate and insert a piece of information (a date, a text file, a program). when a user requests a page (clicks on a link for the page), the server loads the page, inserting the requested material in place of the ssi code on the page.8 the web server executes ssi code before the web page is transferred to the browser making the request. this means that the use of ssi is not browser-dependent. ssi directives work regardless of security or privacy controls in the browser, such as disabling javascript or cookies. mach notes that this type of retrieval and substitution can be especially helpful for material that is used repeatedly on several library web pages. the web author can create one file containing the information used on several pages; then, when a change is made to that file, it is repeated across all web pages that include that file.9 as mentioned above, sul web pages all include the same header and footer; these headers and footers do not appear in the original html code for the pages. instead, there is a command that tells the server to locate the header and footer files and include them when the page is loaded on a browser. the user does not know that ssi coding was used; for the user, the page appears complete. if a change needs to be made in a footer, then the change is made to the file containing that information. changes appear on all pages that refer to the edited footer file.
library literature on the use of ssi
somewhat slow to adopt ssi capabilities, librarians have recently made excellent use of this elegant resource.
current articles tend to be written with the understanding that many librarians who work on their library's web site may not have the experience or levels of access necessary to make server changes. articles focus more on the implementation of the syntax at the level of web-page maintenance and less on server technology. authors in the most recent publications have been more likely to offer examples, and to speculate on uses that, while they are not entirely innovative, have not been fully taken advantage of in the past. using ssi to include text files in a web page has been mentioned as a possibility since the first article, but later articles have tended to amplify the possibilities for this type of include command. in a slight shift of emphasis, later articles have tended to downplay the added server strain that was once a matter of great concern for ssi users. ssi statements have probably been in use in libraries since servers were capable of processing them; documentation accompanying servers includes information on their use, as do some html manuals for beginning web-page creators. reference to the use of ssis in a library environment does not appear until 2000. notess's article provides a good, concise overview, and points to several helpful resources for the webmaster. notess mentions common uses of ssi commands that will provide knowledge about the page, such as "current date and time, the url of the page, the directory in which the file is located, the kind of browser the user has, or pull in content from separate files to construct the page before it is delivered." he elaborates on this point by stating that "a simple text file can contain the content [for a web page], and people with no html experience can be given access to change that text."
10 while he mentions this possible use of the ssi include command for text files, notess's examples all concern the echo command. notess does mention that ssi can possibly cause server strain but states that "includes add very little extra load" [emphasis added].11 later articles tend to mirror notess's in describing situations appropriate for the use of ssi and in noting possible difficulties and concerns. mach's article assumes a bit more knowledge of servers and also access to server set-up features, but it is written in clear explanatory language and provides excellent examples of commonly used ssi commands, including figures to illustrate those uses. she elaborates on notess's suggestion about text files, and makes the suggestion that if non-html files are the targets of ssi commands, that they be given an extension such as .txt, so that they will not be indexed as web pages.12 also notable is her assertion that "most web servers should be able to handle the extra load of parsing all files and simply using the .html extension already in place."13 written with an eye to those who are not web or server administrators, fagan's article one year later provides many example screen shots and a step-by-step guide to implementing ssi. the information here is similar to that in mach's article, but the style is more accessible to the less-experienced web site developer. like mach and notess, fagan mentions the use of non-html files and clarifies the idea of having an "all-text file, which could be ftp'd to the web server."14 as with the other authors, she mentions that enabling ssi on a server can result in the loss of "at least microseconds of time" as pages are parsed and reconstructed for the browser.15 she explained, though, that "ssi is used on large web sites in some fairly complex ways without causing any discernible time lag.
the question is one of how busy your web server is; if it is not overburdened with requests, it will easily handle the additional load of parsing files."16 she rightly asserts that slow internet connections or older, slower patron pcs are more likely to cause delays than is ssi load on a server.17
these articles demonstrate an organic movement toward wider use of ssi capabilities. all of them mention the potential to use ssi for portions of pages that are the same, such as headers and footers. each of them also mentions the use of ssi to include text files that web authors without html experience can edit. what they do not cover is the use of ssi to include text files that are repeatedly used in the body of multiple web pages. examples given are for web pages with significant blocks of text that might be written at different times by different authors. the use of ssi include commands at sul to insert pieces of identical text across many pages, while not groundbreaking, is significant in light of libraries' recent efforts to handle large numbers of electronic resources.
sul's experience
as the literature and server technology have advanced over the last few years, there has been a noticeable move from emphasis on the strain that ssi can cause to acknowledgement that the technology has become more capable in handling that strain. this is not to say that strain on the server does not occur. the user manual for apache servers clearly states that "while this load increase is minor, in a shared server environment it can become significant."18 however, the increases in server speed and capability certainly make it more feasible to use ssi in quantity than it was in the past, and library literature supports that view. informal information from web discussions and web-development sites is much more emphatic.
recent discussion on thelist at evolt.org focused on this issue and elicited the following from baratta: "today's servers are super charged compared to just a few years ago, and most people won't see the traffic levels that give ssi overhead a chance to affect serving time."19

reproduced with permission of the copyright owner. further reproduction prohibited without permission.

the server statistics at sul bear out this theory. the library's six-year-old apache server last year averaged slightly more than 300,000 hits per month. the average increases somewhat to around 340,000 hits during the nine months of samford's fall and spring semesters. this total number of hits includes all catalog searches as well, as this server also hosts sul's automated library system. despite this seemingly large number of hits, the server usually reports that it is 80 percent to 90 percent idle even during the 8 a.m. to 5 p.m. time slot during which it does the major portion of its work. at the time these statistics were gathered, ssi was already in use for the inclusion of urls, last update dates, and some header and footer information. the commands directing the server to provide this information to the browser are more taxing to the server than commands employing the include command. after the addition of four to ten include commands on each of thirty-seven subject pages, and the construction of one alphabetical list consisting only of 143 include commands, the load on the server remains at less than 20 percent. this load is on a server with a 200 mhz powerpc processor, with 512 mb of ram. certainly each library has its own server situation, and this solution for handling occasional updates to electronic resources information on library web pages may not work for all.
consultations with the automation librarian or computer technology department should verify specific limitations and requirements for using ssi. sul also tested to see whether the use of ssi would result in longer page load times. the download of two identical (to the user) pages, one static and the other built from include directives, was timed. there was no visible difference between the two. a more rigorous test may have detected a difference measurable in milliseconds. however, sul librarians feel that the user's connection speed has a larger effect on the page load time than the use of ssi.

background

the evolution of this process for sul moved rather quickly from inception to application, involving only slight changes to the library's normal processes for updating web pages. the most time-consuming tasks involved creating the include .txt files at the outset, and then providing a database of the completed include commands. each time a database title is added to the collection, the automation librarian creates a persistent uniform resource locator (purl) for that database. this is done because database vendors tend to alter the url through which access is gained and because the library sometimes switches vendors for its databases. with this procedure in place, the automation librarian can make the change to the url in one place and every link to that database will connect properly. this saves considerable time and effort for the librarians who maintain the subject pages. librarians were in the habit of using that purl to create a section of html that presented the database link to the patron. figure 1 illustrates a portion of html code for ebsco's academic search elite (ase) as it appears in a text editor. in addition to appearing in subject pages on various topics, subscription databases and commonly used web resources are listed in one alphabetical listing. this page, updated first when changes are made, is where librarians came to copy the current code for a database.
in this way, sul attempted to keep pages consistent. text for a database could be copied and pasted into the html editor as a librarian worked to update the pages. while this seemed like a smooth process, the simple reality of having to copy and paste one database change to, potentially, all subject pages was too time-consuming to make it an efficient process. updates to the subject pages and alphabetical list were done as time permitted. often, one or two librarians were unable to make the changes until weeks had passed, resulting in a web site that was inconsistent and sometimes misleading. for instance, dates of database coverage might change and be documented on the alphabetical list, but not on all the subject pages listing that database.

the way sul does it now

when it was decided to try the include commands as an alternative for updating database information and streamlining the update process for librarians, changes were made over the course of three days, and occurred in three phases. in phase one, the alphabetical list was carved into 143 individual .txt files, each of which contained information for one database or web resource. each .txt file was saved on the server with a name corresponding to the purl that is used for that database, or with an easily recognizable name constructed from the title or url of the resource. the html code in figure 1 became a .txt file titled eb-ase.txt. using the purl for the title of the file makes it more easily recognizable and acquaints librarians with the purls already in use. in phase two, one librarian created an excel file containing the following: (1) an alphabetical list of resource names; (2) existing names for the corresponding .txt files; and (3) component parts of an include command for each resource. the final column concatenates these component parts together, resulting in a ready-made include command for each resource. figure 2 illustrates a portion of this table.
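the concatenation performed in the spreadsheet's final column can be sketched in a few lines of python. this is only an illustration under stated assumptions, not the workbook sul built: the function names are hypothetical, the directory is taken from the article's eb-ase.txt example, and the second file name is made up for the demonstration.

```python
# Hypothetical sketch: build SSI include commands the way the spreadsheet's
# final column does, by concatenating fixed parts around each file name.

INCLUDE_DIR = "/topics/includes/"  # directory used in the article's example


def include_command(txt_file, include_dir=INCLUDE_DIR):
    """Return the SSI include directive for one .txt fragment."""
    return '<!--#include virtual="' + include_dir + txt_file + '" -->'


def build_table(resources):
    """Return (resource name, file name, include command) rows.

    `resources` maps a resource name to its .txt file name; rows are sorted
    alphabetically by resource name, ignoring case.
    """
    return [
        (name, txt_file, include_command(txt_file))
        for name, txt_file in sorted(
            resources.items(), key=lambda item: item[0].lower()
        )
    ]


if __name__ == "__main__":
    rows = build_table({
        "academic search elite": "eb-ase.txt",  # file name from the article
        "example resource": "example.txt",      # illustrative only
    })
    for name, txt_file, command in rows:
        print(name, txt_file, command, sep="\t")
```

a librarian would then copy the third column into a subject page, exactly as with the spreadsheet; the script form only makes the rule behind that column explicit.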
in phase three, librarians used the table of include commands to select the resources they wanted on their pages. this entailed replacing a block of html code with one include command. the example of html code in figure 1 would be reduced to the following statement:

<!--#include virtual="/topics/includes/eb-ase.txt" -->

once the appropriate pieces of html code had been replaced with corresponding include statements, the changeover was complete. from this point forward, changes in such things as database names, coverage periods, and descriptive material will be made to one .txt file. the change will immediately be reflected across all subject pages with no additional work involved for the librarians responsible for those pages. note that the sul server has been configured so that it parses all web pages. this is necessary because most of the library's web pages have some ssi. this configuration means that the web page extensions remain .html. if the server is not configured in this manner, then all pages containing ssi must end in a .shtml extension. this is a subject that requires discussion with automation librarians or the department responsible for the library's server.

advantages

obviously, the biggest advantage to this method is the time saved for individual librarians. there is now no need for librarians to do any maintenance work for links to information housed in the alphabetical list. static html pages referencing gale's infotrac onefile database, for instance, would have required updates to approximately forty subject pages; now, one librarian can correct one .txt file and simultaneously update all forty subject pages.
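what the server does with these directives can be approximated offline. the sketch below is a simplified stand-in for server-side parsing, not apache's implementation: it handles only the include virtual form, reads fragments from an in-memory dictionary rather than the file system, and leaves unknown paths untouched. the page names and fragment text are hypothetical.

```python
import re

# Matches directives such as <!--#include virtual="/topics/includes/eb-ase.txt" -->
INCLUDE_RE = re.compile(r'<!--#include\s+virtual="([^"]+)"\s*-->')


def expand_includes(page, fragments):
    """Replace each include directive in `page` with the fragment it names.

    `fragments` maps a virtual path to the fragment's text and stands in for
    the files on the server. A directive whose path is not in `fragments`
    is left as-is (a simplification chosen for this sketch).
    """
    def substitute(match):
        return fragments.get(match.group(1), match.group(0))

    return INCLUDE_RE.sub(substitute, page)


if __name__ == "__main__":
    directive = '<!--#include virtual="/topics/includes/eb-ase.txt" -->'
    pages = {  # hypothetical subject pages sharing one fragment
        "business.html": "<table>\n" + directive + "\n</table>",
        "history.html": "<table>\n" + directive + "\n</table>",
    }
    fragments = {
        "/topics/includes/eb-ase.txt":
            "<tr><td>academic search elite (ebsco)</td></tr>",
    }

    # One change to the fragment is picked up by every page that includes it.
    fragments["/topics/includes/eb-ase.txt"] = (
        "<tr><td>academic search elite (ebsco), revised entry</td></tr>"
    )
    for name, source in pages.items():
        print(name)
        print(expand_includes(source, fragments))
```

because the subject pages store only the directive, the edit to the fragment is the sole change needed; the pages themselves are never touched, which is the time saving the section describes.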
time saved can be used in collecting and editing the list of web sites that are a part of each subject page; this is a task that has been pushed back in the past, in favor of making more urgent database information changes.

<tr><td align="center"><img src="/images/campus.gif" width="39" height="32" alt="su users" border="0"></td>
<td><a href="/eb-ase">academic search elite</a><font size="1">&nbsp;(ebsco)&nbsp;</font><img src="/images/fulltext.gif" alt="some full text" border="0">
<div class="description">multi-disciplinary database includes some scholarly articles</div></td></tr>

fig. 1. html code for academic search elite using a purl called eb-ase (screen shot from the coffeecup html editor)

resource name | .txt file name | include command
access science | access-science.txt | <!--#include virtual="/topics/includes/access-science.txt" -->
accessible archives | accessible.txt | <!--#include virtual="/topics/includes/accessible.txt" -->

fig. 2. database names, .txt file names, and resultant include commands (portion of the excel file alphabeticalresources.xls)

in addition, librarians who are using this simple technique do not need extensive training. the creation of the excel database of include commands allows for quick additions to an existing page, or the creation of new subject pages.
librarians using the include commands can simply copy and paste them; there is no need for them to understand the syntax or to be able to repeat it. this makes using ssi particularly attractive to staff who do not want the added burden of further training in html. the librarian responsible for creating the .txt files and the excel database of statements demonstrated the copying and pasting of the include statements to all the other librarians who edit html pages in a one-time ten-minute training session. the only additional training issue has involved page structure: since the library uses a table structure for the subject pages, all table tags are included in the database .txt files, and librarians need to understand that they do not have to recreate those tags. as librarians begin to use these commands, links to resources across subject pages will look the same and will provide the user with the same information. this increased uniformity results in a more professional appearance for the web site as a whole.

disadvantages

this revolution in the maintenance of subject pages has not been without its disadvantages. the primary complaint by librarians using ssi include commands is that they cannot preview their changes in their html editors. sul's department uses the coffeecup html editor, which allows previews, but the previews do not show items that are retrieved using ssi. this is because the page is not fully assembled until the server assembles it. when the librarian views the page in the editor, prior to uploading it to the server, the include commands are without targets. the target .txt files are on the server.
when a user requests a page, include commands pull in the missing pieces (the .txt files, or other files); then, the completed page is seamlessly presented to the user via his or her browser. as mach notes, "previewing a web page without crucial elements . . . can be disconcerting, especially to visually oriented designers."20 in sul's experience with this particular issue, librarians who are uncomfortable loading pages with locally invisible elements can load them into temporary folders on the server, check them for errors there, and then move them to their appropriate directories.

conclusion

situational factors have allowed sul to implement this change with surprising ease and speed. because the library has its own server, and because there is an automation librarian on staff, communication and change have been easy and efficient. library staff deduce that it is because the include command of ssi is being used more than other possible commands that the library is not experiencing an increase in loading time on its pages. of course, the size of sul's resource list makes this kind of solution feasible; certainly, if the library were working with hundreds of resources, it would be more likely that a database-driven strategy would be adopted. the simplicity and elegance of the ssi include command process has encouraged adoption, and sul has seen no ill effects from the user side of operations. librarian web authors quickly overcame any slight discomfort with the new process and are now able to devote a portion of editing time to other, less monotonous tasks.

references and notes

1. carla dunsmore, "a qualitative study of web-mounted pathfinders created by academic business libraries," libri 52, no. 3 (sept. 2002): 140-41.
2. charles w. dean, "the public electronic library: web-based subject guides," library hi tech 16, no.
3-4 (1998): 80-88; gary roberts, "designing a database-driven web site, or, the evolution of the infoiguana," computers in libraries 20, no. 9 (oct. 2000): 26-32; bryan h. davidson, "database-driven, dynamic content delivery: providing and managing access to online resources using microsoft access and active server pages," oclc systems and services 17, no. 1 (2001): 34-42; marybeth grimes and sara e. morris, "a comparison of academic libraries' webliographies," internet reference services quarterly 5, no. 4 (2001): 69-77; laura galvan-estrada, "moving towards a user-centered, database-driven web site at the ucsd libraries," internet reference services quarterly 7, no. 1-2 (2002): 49-61.
3. roberts, "infoiguana"; davidson, "database driven"; galvan-estrada, "user-centered, database-driven web site."
4. davidson, "database driven," under "introduction."
5. ibid., under "development considerations."
6. roberts, "infoiguana," 32.
7. galvan-estrada, "user-centered, database-driven web site," 55-56.
8. jody condit fagan, "server-side includes made simple," the electronic library 20, no. 5 (2002): 382-83.
9. michelle mach, "the service of server-side includes," information technology and libraries 20, no. 4 (2001): 213.
10. greg r. notess, "server side includes for site management," online 24, no. 4 (july 2000): 78, 80.
11. ibid.
12. mach, "service of server-side includes," 216.
13. ibid., 214.
14. fagan, "server-side includes made simple," 387.
15. ibid., 383.
16. ibid.
17. ibid.
18. apache httpd server project, "apache http server version 1.3: security tips for server configuration," the apache software foundation. accessed oct. 29, 2003, http://httpd.apache.org/docs/misc/security_tips.html.
19. anthony baratta, e-mail to thelist mailing list, may 16, 2003, accessed nov. 4, 2003, http://lists.evolt.org/archive/week-of-mon-20030512/140824.html.
20.
mach, "service of server-side includes," 217.

using server-side include commands | northrup, cherry, and darby

editorial board thoughts: critical technology
cinthya ippoliti
information technology and libraries | december 2018

cinthya ippoliti (cinthya.ippoliti@ucdenver.edu) is university librarian and director, auraria library, university of colorado.

critical librarianship has brought many changes in how libraries have examined their programs and services, created new positions dedicated to equity, inclusion, and diversity, and paved the way to challenge existing assumptions about our work and environment. technology also exists in a space that is not neutral, as library systems and services reflect specific perspectives in their content and focus as well as how they are made accessible (or not). i would like to briefly examine how we can begin to think about these issues within academic libraries, and offer some additional readings for further reflection for four technology-related areas: spaces, services/programming, systems, and engaging with our users.

technology spaces

we might assume that because we are seeing students using our classrooms, makerspaces, and study areas, we have been successful in meeting the needs of a wide variety of users. to a large extent that may be true, but we should also be asking ourselves who does not feel welcome in such a space and, more importantly, why not? there are two facets to this question. the first involves the degree to which libraries strive to create a welcoming environment. staff interactions, signage, hours, and institutional values are all part of a complex and broader environment that signals to users how these spaces function and how they are perceived by the organization. these same elements can also serve as deterrents through choices in layout, policy, or other intangible aspects so that they may in fact prevent individuals from entering these spaces in the first place.
the second revolves around the notion that each technology-rich space conveys its level of friendliness and intended purpose through its physical presence. ensuring that furniture, paint, and layout are compliant with ada standards, and integrating these features with each other as opposed to setting them apart so that they are not considered “special” or “different,” is one small and vital step in this direction. maggie beers and teggin summers cover these issues in an educause review article and discuss how asking questions about the ways power structures are reinforced by having a “front” of the room or other configurations can enrich planning and assessment efforts. similarly, developing a plan so that new technology in areas such as makerspaces rotates as much as possible will help to provide access for those who may not be able to utilize these resources outside of the library context, in order to accommodate differing skill levels, interests, and learning styles. in addition, students may not always be present on campus due to family, job, or other life circumstances, and planning with the assumption that everyone who could benefit from using a particular space is in fact taking advantage of that benefit is problematic. one way around that is to ensure that each space is as flexible as possible and (ideally) can be reconfigured for quiet reflection, collaborative work, or transformed into a sensory space or other type of specialized environment. the reservation process should be available both online and manually (as not everyone may have access to a computer and/or the internet), hour limitations should have several counter options, and the space should be available as much of the time as possible when it is not in use for a more formalized purpose.
any space usage assessments should also purposefully include non-users or perceived non-users and integrate questions about barriers to or about the space in their methodologies. finally, ensuring the right level of staffing to support both the intended, as well as perhaps the unintended, uses of the space and the activities that occur within it will help create a sense that not only is the space itself valued, but the experiences occurring within it are even more important. this is not easy to accomplish, as it is difficult to predict exactly how a space will be used unless there are very strict confines placed around its configuration and accessibility. but assuming that most spaces in libraries are designed to be malleable, keeping in constant communication with users via some of the methods described above should help.

technology services and programming

similarly, services and programs cannot be built around a one-size-fits-all model. this can prove to be quite challenging given the limited resources libraries face. engagement and learning lie not only in access to tools, but in the very process of sharing knowledge and experiences, whether for academic growth, social action, or simply personal enjoyment.
matt ratto, who coined the term “critical making,” defines it as the process “intended to highlight the interwoven material and conceptual work that making involves.” he argues that “critical making is dependent on open design technologies and processes that allow the distribution and sharing of technical work and its results.” ratto makes the further point that this process also has the capacity of “unpacking the social and technical dimensions of information technologies.” this in turn allows technology to become more than simply a cool resource: a mechanism for democratizing the creative work of making and designing while dealing with its messy, political, and uncomfortable aspects, which do not exist in a vacuum outside of the tools themselves. an approach in this instance might involve taking technology outside of library spaces, such as onto campus or into the community, offering as much for free as possible, and capitalizing on programs such as girls who code (https://girlswhocode.com/) and grow with google (https://grow.google/). capturing how these resources are used in all of their possible permutations enables stories of individuals to shine through. the impact of these programs takes on a personal element through showcases, speaker events, and hackathons that are designed to bring the community together and engage in sharing of knowledge, perspectives, and conversations. in addition, this will hopefully shrink the barriers for those who don’t see themselves as having a role in these activities.

integrated library systems

i do not have a background in systems, but simon barron and andrew preater have written a great chapter unpacking the inherent power structures which manifest themselves in library systems such as the integrated library system (ils), discovery interfaces, and the third-party resources we provide access to.
they suggest taking action by thinking about user privacy and ensuring that the information libraries are able to view, gather, and store is used ethically and that decisions for derivative services or actions are not made based on assumptions about gender identity, economic status, or other identifiers via access to these types of data. openness is another area the authors explore, as they discuss how libraries can use open source software whenever possible in order to balance the field against profit-based licensing models. barron and preater also raise a concern, however, that while crowdsourcing is in theory a good way to include the community in developing ways to help itself, it still does not recognize the limited resources marginalized populations can dedicate to these efforts. finally, they discuss how it is crucial for libraries to recognize and support the expertise needed in this arena in order to avoid overreliance on vendor systems that can prove alluring with out-of-the-box solutions, but which compromise things like privacy, autonomy, and customization that might otherwise benefit from equity, diversity, and inclusion-centered practices.

equity-driven design

engaging with users in developing shared solutions to challenges is an important aspect of the user experience, and can help pave the way for deeper conversations. taking a step back and making sure the assessment and design process itself is transparent for everyone is one of the first things that needs to be in place.
i would like to draw on the work of gretchen rossman and sharon rallis, who make a crucial distinction between user-centered design, in which the user seldom has a voice in what the final process or product looks like, and what they term “emancipatory design,” in which participants are “collaboratively producing knowledge to improve their work and their lives.” in addition, emancipatory design is one where “users are in charge; their power, their indigenous knowledge are more powerful and respected than those of the expert designer.” this approach can therefore be a means of bringing equity, diversity, and inclusion into technology work in libraries by focusing on the users’ voice as opposed to our own and working collaboratively to develop shared solutions to address their challenges. a specific example of how this framework might be applied comes from the stanford school of design, which is famous for its course in design thinking. stanford has recently taken that concept even further and integrated an equity focus into the first steps of the progression, where the designer not only identifies existing built-in biases but also raises questions such as who the users are, what equity challenges need to be addressed, who has institutional power, and how it is manifested in the decisions that drive the organization. the stanford model also provides specific methods focusing on human values and developing relational trust as a way to bookend the design thinking process, reflecting on the blind spots that were uncovered in order to inform action items and next steps and to ensure that the users are actively collaborating to develop the services and programs that in turn affect them. this version of the program is available at https://dschool.stanford.edu/resources/equity-centered-design-framework.
as a final thought, one idea to keep at the forefront in all of these areas is that of universal design, which is defined by the center for universal design at ncsu as “the design of products and environments to be useable by all people, to the greatest extent possible, without the need for adaptation or specialized design.” the first principle is that of equitable use and can be applied to many technology-related aspects whether they are physical or virtual:
• provide the same means of use for all users: identical whenever possible; equivalent when not
• avoid segregating or stigmatizing any users
• provisions for privacy, security, and safety should be equally available to all users
• make the design appealing to all users

critical technology | ippoliti
https://doi.org/10.6017/ital.v37i4.10810

further readings:
barron, s. and preater, a. j. “critical systems librarianship.” in the politics of theory and the practice of critical librarianship (sacramento: litwin books, 2018). https://repository.uwl.ac.uk/id/eprint/4512/1/2018-barron-and-preater-critical-systems-librarianship.pdf.
beers, m. and summers, t. “educational equity and the classroom: designing learning-ready spaces for all students,” educause review. may 7, 2018. https://er.educause.edu/articles/2018/5/educational-equity-and-the-classroom-designing-learning-ready-spaces-for-all-students.
north carolina state university center for universal design. “center for universal design.” https://projects.ncsu.edu/design/cud/ (accessed november 25, 2018).
ratto, m. “critical making,” open design now. http://opendesignnow.org/index.html%3fp=434.html (accessed november 7, 2018).
rossman, g. b., and rallis, s. f. learning in the field: an introduction to qualitative research (thousand oaks, ca: sage, 1998).
article

stateful library analysis and migration system (slam)
an etl system for performing digital library migrations

adrian-tudor pănescu, teodora-elena grosu, and vasile manta

information technology and libraries | december 2021
https://doi.org/10.6017/ital.v40i4.12035

adrian-tudor pănescu (tudor@figshare.com) is software engineer, figshare. teodora-elena grosu (teodora@figshare.com) is software engineer, figshare. vasile manta (vmanta@tuiasi.ro) is professor, faculty of automatic control and computer engineering, gheorghe asachi technical university of iași, romania. © 2021.

abstract

interoperability between research management systems, especially digital libraries or repositories, has been a central theme in the community for the past years, with the discussion focused on means of enriching, linking, and disseminating outputs. this paper considers a frequently overlooked aspect, namely the migration of records across systems, by introducing the stateful library analysis and migration system (slam) and presenting practical experiences with migrating records from dspace and digital commons repositories to figshare.
introduction

bibliographic record repositories are a central part of the research venture, playing a key role in both the dissemination and preservation of outcomes such as journal articles, conference papers, theses and dissertations, monographs, and, more recently, datasets. as the ecosystem of which these are a part has evolved at a sustained pace in the last decade, repositories also had to adapt while ensuring uninterrupted service to the research community. nevertheless, a number of developments, both at the local, repository level and at a more general, global scale, have created the necessity of considering the complete replacement of certain systems with new repository solutions which are better suited for their stakeholders’ requirements. the following are a few such developments:
• the need to consolidate both technological solutions and operational teams, in order to reduce running costs and provide a unified experience for end users, the research personnel.1
• various policies require researchers to provide not only traditional outputs, such as journal articles or conference papers, but also the datasets and other materials backing up scientific claims. for repositories, this means both adapting to larger amounts of stored data as well as ensuring that the metadata dissemination and preservation mechanisms are suited for the new output types (e.g., while full-text search is a common feature of literature repositories, it cannot be easily applied to numeric datasets).2
• apart from extending the set of stored outputs, policies have also created new requirements for existing record types.
for example, the research excellence framework (ref) in the uk mandates monitoring open access (oa) publishing of research articles; thus, institutional repositories are no longer only a facilitator of green open access (self-archiving of records) but also a means of monitoring compliance.3 this requires the implementation of new logic in existing repositories, which can frequently be difficult, especially when faced with legacy repository code bases or insufficient technological resources.
• commercial, contractual, or leadership changes can also create the need to replace repository systems, due to uncertainty (see the acquisition of bepress by elsevier) or preference for certain platforms.4

while these developments can generate the requirement to switch repositories in a very short span of time, such a venture needs to be properly planned and executed in order to ensure, on the one hand, that no records are lost or corrupted and, on the other hand, that minimal or no downtime is caused. ideally, migrations would also be an opportunity to curate and enrich the existing corpus by consolidating and correcting bibliographic records. between 2018 and 2019 the research team performed six digital library migrations from various source repository solutions (dspace, digital commons, custom in-house built systems) to the figshare software as a service (saas) repository platform. for this purpose, slam, an extract, transform, load (etl) system, was developed and successfully employed in order to migrate over 80,000 records. this article describes the rationale behind slam, its design and implementation, and the practical experiences with employing it for repository migrations. a number of future enhancements and open problems are also discussed.
motivation and background of slam

in early 2018 figshare started considering the suitability of its repository platform for storing content which is usually specific to institutional repositories (journal articles, theses, monographs), along with non-traditional research outputs (datasets or scientific software).5 while feature-wise this was validated by its hosted preprint servers, a new challenge was posed, as stakeholders choosing to use figshare as an institutional repository also had to transfer all content from their existing systems.6 thus, in the first half of 2018, a first migration was performed, transferring records from a bepress digital commons (dc) repository (https://www.bepress.com/products/digitalcommons/) to figshare (https://figshare.com). from a technical point of view, a python (https://www.python.org/) script was developed for this migration; this script parsed a comma-separated values (csv) report produced by dc which contained all metadata and links to the record files.7 using this information, records were created on the figshare repository using its application programming interface (api) (https://api.figshare.com). while this migration succeeded, the naive technical solution presented a number of issues:

• difficulties with the metadata crosswalk: while a crosswalk was initially set up, mostly based on the definition of the fields in the source and target repositories’ metadata schema, issues were discovered while migrating the records, mainly generated by inconsistencies in the values of the fields across the corpus. these issues were fixed on a case-by-case basis, in order to ensure a lossless migration, but it would have been preferable to surface them in the early phases, in order to have the migration script mitigate the issues in the final run.
• running the migration procedure multiple times: the migration script followed mostly an all or nothing approach, which, at each run, fully migrated all records between repositories. this is undesirable, as there was a need to run the script only for those records that failed to migrate (due, for example, to metadata crosswalk issues). after the full migration was completed, there was also a need to apply only some minor corrections to records, without following the full procedure. this was not possible, since the script would recreate all records to migrate from scratch on the target repository, as it did not have any memory of previous runs. this issue was also amplified by the fact that in the source repository records did not have any type of persistent identifier attached. thus, additional scripts, which only performed the corrections, had to be developed.
• ability to run the migration procedure with minimal supervision: like most migrations, this instance considered a large number of records (over 10,000) and, ideally, the process would run with minimal supervision from operators. while the script partially accomplished this, the need for better fault-tolerance and enhanced logging was identified.

given the lessons learned from the initial attempt and the requirement that five additional migrations were to be completed between october 2018 and december 2019, a more robust alternative to the naive migration script was required. this alternative had to adhere to three design principles:

1. reusability: the system should be usable for multiple migrations without extensive additions or modifications. thus, it should be able to adapt to the workflows of multiple repositories, metadata schemas, and other concerns specific to each migration.
2. statefulness: in software engineering, programs can either discard knowledge of past transactions or preserve it, allowing previous results and operations to be revisited. migration systems benefit from a stateful architecture, as the system should be able to perform the same migration multiple times, without creating duplicate records on the target repository, while allowing for incremental record improvements with each run. apart from allowing for corrections to be applied post-migration, this would also support the prototyping phase (where multiple test migrations are performed in order to validate the metadata crosswalks), checks that no information is lost, and other general workflow aspects.
3. fault tolerance: the system should implement fault tolerance mechanisms at all levels, allowing it to run migrations of large corpora with minimal supervision and, at the same time, implement sufficient logging and exception handling to allow operators to identify and correct potential issues.

several repository migrations are represented in the literature. in van tuyl et al., the authors describe the process of moving from a dspace (https://duraspace.org/dspace) to a samvera (https://samvera.org) system, while in the study from do van chau records were migrated from a solution developed in house to dspace.8 both instances offer valuable insight into the challenges posed by digital library migrations, especially at the level of bibliographic metadata; on the other hand, both works are focused mostly on a specific use-case and do not propose general technical solutions for other migrations. it is interesting to note that the migration presented by van tuyl et al. required two and a half years of work, while slam was employed to carry out five migrations in 14 months.

the bridge2hyku toolkit (https://bridge2hyku.github.io/toolkit) is a collection of tools, including a module for the hyku repository solution (https://hyku.samvera.org), aimed at facilitating the import of records into digital libraries based on this software.
similar to slam, it includes an analysis component, useful for surfacing and correcting potential metadata issues during the migration. slam provides two major improvements over this solution: it defines a generic architecture that can be used for migrating records between any two repositories, and it defines a procedural migration workflow to create a robust, fault-tolerant, and extensible solution.

pygrametl (http://chrthomsen.github.io/pygrametl/) and petl (https://github.com/petl-developers/petl) are two open-source frameworks which allow the defining of etl workflows; similar to slam, the processing steps are defined using python functions. these projects are targeted towards tabular and numeric data, making them unsuitable for the transfer of files and metadata across bibliographic repositories.

singer (https://www.singer.io/) is an etl framework similar in design to slam, which allows the composing of various data sources (or taps) and targets, in order to move data between them. the two downsides of this implementation are that it is focused on processing data specified in the javascript object notation (json) format, which is not always available for bibliographic metadata, and that it does not facilitate extending the pipeline with, for example, the analysis facilities targeted by slam.

hevo data (https://hevodata.com/), pentaho kettle (https://github.com/pentaho/pentaho-kettle) and talend open studio (https://www.talend.com/products/talend-open-studio/) are etl frameworks which employ graphical interfaces to allow users to define the processing workflows. while such functionality was not initially identified as a requirement for our planned migration projects, during testing it became obvious that providing such an interface could bring value by having repository administrators be more involved in defining and validating the processing applied to bibliographic records, as the administrators possess the most knowledge of the organisation of the repositories. a downside of the three solutions is that their usage requires commercial agreements, which did not line up with the business requirements of the considered migrations.

in their work, tešendić and boberić krstićev use the pentaho suite in order to implement the etl component of a business intelligence (bi) solution for reporting on bibliographic records.9 while the structure of the etl processing is different (the authors being mostly interested in only certain aspects of the metadata), this work provides insights into the types of analysis that could be performed while migrating records.

slam’s design and implementation

following the design principles previously mentioned, slam’s architecture was devised as presented in figure 1; as for most etl systems, the easiest way of understanding its operation is by examining the data flow.

the migration workflow proceeds by extracting all the required information from the source repository. this could be achieved in multiple ways, such as harvesting through an oai-pmh (https://www.openarchives.org/pmh/) endpoint or other types of api, using the bulk export functionality implemented by most repository systems, or even by crawling the html markup describing records, similar to what search engines do in order to discover web pages. once this mechanism has been established, practical experience proves that it is beneficial to move this raw data closer to the destination repository (to a staging area as depicted in figure 1). while this transfer might prove cumbersome, especially for large corpora, it is required only once.
moreover, having the data close to the destination repository allows faster prototyping and testing of the migration procedure, as network latency and throughput are improved, while also ensuring that the source repository’s functioning is not affected in any manner.

figure 1. main components and data flow in slam. areas in light blue are currently under development, while the components highlighted in green need to be adapted for each migration.

the system splits the data to be migrated into four logical slices: bibliographic metadata, record files (e.g., pdfs of journal articles), persistent identifiers of records (pids, such as digital object identifiers or handles), and usage data (views and downloads).

metadata is the first aspect to be considered. from the migration point of view, two dimensions are considered: the syntax and the semantics. metadata comes in various formats, such as csv or extensible markup language (xml) files, but most of these can be easily parsed by openly available software solutions. of more interest are the semantics of the metadata, which stem from the employed schemas or ontologies of field definitions; examples include dublin core (https://www.dublincore.org) or datacite (https://schema.datacite.org). a schema crosswalk, which describes how the fields in the target repository schema should be populated using the source data, needs to be set up when transferring records. while this should not be a concern if the two repositories use the same schema, for the performed migrations (described below) this was not the case.
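as a minimal sketch of such a crosswalk, assuming the metadata has already been parsed into python dictionaries, each target field can be paired with a source field and an optional altering operation. the field names, the `CROSSWALK` table, and the `apply_crosswalk` and `split_keywords` helpers are hypothetical and serve only as illustration; they are not slam's actual implementation.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

# Hypothetical crosswalk: target field -> (source field, optional transform).
Transform = Optional[Callable[[Any], Any]]


def split_keywords(value: str) -> List[str]:
    """Turn a ';'-separated string into a clean list of keywords."""
    return [part.strip() for part in value.split(";") if part.strip()]


CROSSWALK: Dict[str, Tuple[str, Transform]] = {
    "title": ("dc.title", str.strip),
    "description": ("dc.description.abstract", None),
    "keywords": ("dc.subject", split_keywords),
}


def apply_crosswalk(source_record: Dict[str, Any]) -> Tuple[Dict[str, Any], List[str]]:
    """Build the target metadata and list source fields the crosswalk ignores."""
    target: Dict[str, Any] = {}
    for target_field, (source_field, transform) in CROSSWALK.items():
        if source_field not in source_record:
            continue  # missing source fields are skipped, never invented
        value = source_record[source_field]
        target[target_field] = transform(value) if transform else value
    mapped = {source_field for source_field, _ in CROSSWALK.values()}
    unmapped = sorted(set(source_record) - mapped)
    return target, unmapped


# Illustrative record; the values are made up for this example.
record = {
    "dc.title": "  a study of repositories ",
    "dc.subject": "etl; migration ;",
    "dc.coverage.temporal": "2014",
}
metadata, unmapped_fields = apply_crosswalk(record)
```

reporting the source fields that the crosswalk does not cover is one possible way to make omissions visible before a full run.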
other reasons for setting up such a crosswalk include

• loosely defined schema in at least one of the repositories: certain repository systems do not specify a schema with clear field definitions, validations or applicability. by having the source repository administrators help with setting up a crosswalk, the migration team can avoid issues caused by incomplete understanding of the metadata.
• support for the review of bibliographic records: migrations can prove to be an opportunity for reviewing and amending the records’ metadata; for example, infrequently used fields can be completely removed, and values which tend to confuse end users can be moved to other fields.
• maintaining a record of how the migration was performed, from the metadata point of view: the crosswalk is considered an artefact of the migration and is preserved for future reference.

in slam, the crosswalk is tested using elasticsearch, “an open-source search and analytics engine for all types of data, including textual, numerical, geospatial, structured, and unstructured.”10 the setup uses the crosswalk to create elasticsearch documents which include all fields as they would be transferred to the destination repository. a kibana (https://www.elastic.co/products/kibana) dashboard is then used to inspect the records’ metadata and perform structured searches across the corpus. this can allow, for example, discovering fields which do not follow a consistent pattern for the values, as seen in figure 2. as the crosswalk includes, apart from the field mapping, altering operations that can be performed on each field, this analysis can facilitate the review process described by the second point above. while performing actual migrations, a number of inconsistencies that the source repository administrators were unaware of were surfaced by slam and corrected in the target repository.
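the kind of check that such a view supports can be approximated in a few lines of python. the sketch below is a stand-in that uses only the standard library rather than elasticsearch or kibana, and the format classes, `classify`, and `format_profile` are assumptions made for this example: it counts how many values of a date-like field follow each format, which is the kind of variation illustrated in figure 2.

```python
import re
from collections import Counter
from typing import Iterable

# Hypothetical format classes for a date-like field; the patterns are
# illustrative and would need extending for a real corpus.
PATTERNS = [
    ("full date", re.compile(r"\d{4}-\d{2}-\d{2}")),
    ("year and month", re.compile(r"\d{4}-\d{2}")),
    ("year only", re.compile(r"\d{4}")),
]


def classify(value: str) -> str:
    """Return the name of the first pattern that matches the whole value."""
    cleaned = value.strip()
    if not cleaned:
        return "empty"
    for name, pattern in PATTERNS:
        if pattern.fullmatch(cleaned):
            return name
    return "other"


def format_profile(values: Iterable[str]) -> Counter:
    """Count how many values of a field fall into each format class."""
    return Counter(classify(value) for value in values)


# Made-up sample values for a temporal coverage field.
temporal_coverage = ["2015-03-21", "1998", "2001-07", "circa 1950", "", "2010"]
profile = format_profile(temporal_coverage)
```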
such inconsistencies are commonplace, especially in large corpora spanning decades, where the repository metadata workflows and schemas changed multiple times.

two points should be noted about this component:

• this is the only component of the architecture for which we mention an actual solution chosen for the practical implementation, namely elasticsearch. while other solutions could have been chosen, such as the ones included in the bridge2hyku toolkit, elasticsearch proved to be the best fit for a highly automated system which requires analysis capabilities; it is a production-grade solution which can index a high number of documents and support complex queries, while also providing user-friendly analytical views via kibana.
• there are arguments for loading the metadata in the analysis component without having it processed through the crosswalk; such a workflow could provide further insights into various issues in the corpus which are possibly obscured by the crosswalk. our practical experiences did not fully justify this requirement, while the actual implementation provided a means to test the crosswalk, a major migration component; nevertheless, we are still considering the possibility of having to load the raw metadata for analysis in future migrations.

figure 2. a view examining the possible values of the temporal coverage field from the dublin core schema in an institutional repository corpus to be migrated. this shows variation in the format of the values (full date, year only) which can cause issues when migrating to a schema which applies strict validation on date/time values, and thus need to be handled by the migration harness. this view is generated using kibana from the elasticsearch stack, employed by slam for metadata analysis purposes.

with the crosswalk set up, the migration module can be completed. from a logical point of view, it comprises four components:

1. metadata processing: this component uses the crosswalk in order to transfer the metadata to the target repository.
2. file upload: this simply uploads all files associated with a bibliographic record to their new locations.
3. usage data transfer: most repositories implement counters for views and downloads of records, and this information, if available, is also transferred to the target repository.
4. persistent identifier update: if the records are using persistent identifiers, such as digital object identifiers (dois) (https://doi.org/) or handles (http://handle.net/), these are updated to resolve to the new locations in the target repository.

while employing slam for migrations, cases in which persistent identifiers were not employed on source repositories were encountered, with records being accessible only via uniform resource locators (urls). as these cannot always be transferred across repositories, because each software platform uses its own url scheme, it is advisable to implement persistent identifiers before migrations.

figure 3. a simplified process diagram describing the steps required for migrating a bibliographic record. each successful operation is recorded in a persistent database which is used in subsequent runs for resuming the workflow. for example, files will not be uploaded each time the script is run, thus avoiding duplication.

one of the architectural goals of slam is statefulness and this is implemented at this level, the migration module being designed as a state machine. a trivial example of such a state machine is shown in figure 3.
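a minimal sketch of this idea follows, assuming four steps named after the components listed above and a sqlite database as the persistent store (both are assumptions; the source does not specify slam's schema or storage engine). every completed step is recorded per record, and a later run skips whatever is already recorded. `StateRegistry`, `migrate_record`, and the handlers are hypothetical names.

```python
import sqlite3
from typing import Callable, Dict, List

# Step names mirror the four migration components; the names are assumed.
STEPS = ["metadata", "files", "usage", "pid"]


class StateRegistry:
    """Persists which steps completed for which record (hypothetical schema)."""

    def __init__(self, path: str = ":memory:") -> None:
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS done ("
            "record_id TEXT, step TEXT, PRIMARY KEY (record_id, step))"
        )

    def is_done(self, record_id: str, step: str) -> bool:
        row = self.db.execute(
            "SELECT 1 FROM done WHERE record_id = ? AND step = ?",
            (record_id, step),
        ).fetchone()
        return row is not None

    def mark_done(self, record_id: str, step: str) -> None:
        self.db.execute("INSERT OR IGNORE INTO done VALUES (?, ?)", (record_id, step))
        self.db.commit()


def migrate_record(
    record_id: str,
    handlers: Dict[str, Callable[[str], None]],
    registry: StateRegistry,
) -> List[str]:
    """Run only the steps not yet recorded; return the steps executed now."""
    executed = []
    for step in STEPS:
        if registry.is_done(record_id, step):
            continue
        handlers[step](record_id)  # an exception here stops this record only
        registry.mark_done(record_id, step)
        executed.append(step)
    return executed


# Usage: the file upload fails on the first run and is resumed on the second.
registry = StateRegistry()
calls = []
fail_once = {"files": True}


def make_handler(step: str) -> Callable[[str], None]:
    def handler(record_id: str) -> None:
        if fail_once.get(step):
            fail_once[step] = False
            raise RuntimeError(f"{step} failed for {record_id}")
        calls.append((record_id, step))
    return handler


handlers = {step: make_handler(step) for step in STEPS}
try:
    migrate_record("rec-1", handlers, registry)
except RuntimeError:
    pass  # a real harness would log the failure and move to the next record
second_run = migrate_record("rec-1", handlers, registry)
third_run = migrate_record("rec-1", handlers, registry)
```

in this sketch the metadata step is not repeated after the failed upload, and a third run performs no work at all, which is the behaviour that prevents duplicate records or files.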
the state machine status is serialised in a persistent database, with each migration run deserializing it in order to understand which operations still need to be applied for each record. maintaining such a registry provides several other benefits:

• facilitates testing and prototyping: this was the original reason behind the architecture, useful especially before the metadata analysis functionality was implemented. if one of the operations required for transferring a record fails, subsequent runs will not apply all steps, but only the ones that did not complete. as for each record a separate state section is maintained, this becomes especially useful when migrating multiple entries; records which failed to migrate can be easily isolated and subsequently reprocessed.
• allows creating reports on the migration: these are used, for example, to validate that all records were indeed transferred to the target repository.
• allows the migration module to be portable: if the state machine serialisation is accessible, the module can run from different locations and at different points in time.

the first architectural principle previously presented relates to the reusability of slam across migrations. the most common cause of divergence between migrations is related to the differences between repository solutions; slam isolates this concern by using two connectors, one for the source and one for the target repository. these connectors translate the information to be migrated to and from slam’s internal data model. thus, the source connector needs to be able to traverse the staging storage and provide slam with all the required record information, while the target connector will upload the records to the new repository (using a web-accessible api for example). this means that for each migration only three parts of slam need to be adapted (shown in green highlights in figure 1): the source and target connectors, and the metadata crosswalk.
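the connector idea can be sketched as two small interfaces around an internal record type. everything named below (`Record`, `SourceConnector`, `TargetConnector`, `StagedExportSource`, `InMemoryTarget`, `run_migration`) is hypothetical and stands in for slam's real data model and for a real repository api, neither of which the text details.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Protocol


@dataclass
class Record:
    """Hypothetical internal data model shared by both connectors."""
    source_id: str
    metadata: Dict[str, str]
    files: List[str] = field(default_factory=list)


class SourceConnector(Protocol):
    def records(self) -> Iterator[Record]: ...


class TargetConnector(Protocol):
    def create(self, record: Record) -> str: ...


class StagedExportSource:
    """Reads records from an already-parsed staging export (a list of dicts)."""

    def __init__(self, rows: List[dict]) -> None:
        self.rows = rows

    def records(self) -> Iterator[Record]:
        for row in self.rows:
            yield Record(row["id"], {"title": row["title"]}, row.get("files", []))


class InMemoryTarget:
    """Stand-in for a target repository api; it only stores records in a dict."""

    def __init__(self) -> None:
        self.stored: Dict[str, Record] = {}

    def create(self, record: Record) -> str:
        target_id = f"target-{len(self.stored) + 1}"
        self.stored[target_id] = record
        return target_id


def run_migration(source: SourceConnector, target: TargetConnector) -> Dict[str, str]:
    """Core loop: it sees only the two interfaces, never a concrete repository."""
    return {record.source_id: target.create(record) for record in source.records()}


# Made-up staging rows for illustration.
rows = [
    {"id": "a1", "title": "first"},
    {"id": "a2", "title": "second", "files": ["a2.pdf"]},
]
target = InMemoryTarget()
id_map = run_migration(StagedExportSource(rows), target)
```

under these assumptions, supporting a new repository means writing a new class with a `records` or `create` method, while `run_migration` stays as it is.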
all other components can remain unchanged, thus reducing the technical development time.

in the last step of slam’s workflow, the information that was used for the migration is sent to a long-term preservation storage, in order to ensure that it remains available for future reference. in our implementation, the following information is preserved:

• original metadata and files, as extracted from the source repository.
• metadata crosswalk from source to target repository.
• migration script state machine serialisation.

this information is sufficient for understanding the exact steps applied during the migration and, if required, for applying certain corrections to the migrated records at a future point in time.

employing slam for real-world migrations

slam was used for performing five repository migrations in one year, as described in table 1; the target repository in all five cases was figshare.

table 1. overview of repositories migrated to figshare using slam.

source repository identifier    repository type    software           number of records
ir1                             institutional      dspace             37,000
ir2                             institutional      dspace             25,605
d1                              data               custom             334 (105 gb)
ir3                             institutional      digital commons    2,275
ir4                             institutional      dspace             15,474

slam’s viability was assessed based on the design principles outlined above. reusability, the main rationale behind slam, relates to being able to reuse as much of the system as possible across migrations. the architecture isolated the parts that required adaptation from one migration to another (the connectors and the crosswalk); the time spent by a software engineer in order to set up these was monitored. the target here was to support the specialised staff in making domain-specific decisions, especially on the metadata crosswalk, by reducing the time needed to develop the three mentioned components. for example, the research excellence framework (ref) 2021 exercise in the united kingdom had strict metadata requirements, which required thorough testing in connection with current research information systems and open access monitoring solutions. between the first and fourth migrations, the engineering time needed for this setup was reduced from six person-weeks to only two; it is important to note that slam evolved between the migrations, based on the lessons learned from each instance.

statefulness, the property which allows re-processing already-migrated records, is covered in slam by the state machine implemented in the migration module, which is persistent and can be referenced in subsequent runs. all the migrations in table 1 required supplementary runs after all records were migrated, most frequently in order to fix metadata issues discovered after the full corpus was transferred. for example, ir1 required three such runs:

1. the first run fixed a number of issues caused by omissions in the metadata schema crosswalk.
2. the second run enriched the metadata using information taken from a current research information system (a source external to slam).
3. the last run corrected the usage statistics (views and downloads) which were incorrectly imported initially, due to incomplete understanding of the source repository’s database.

due to slam’s design, no issues were encountered while performing these runs, as no records were duplicated, removed, or erroneously modified; this was manually checked by the repository administrators, either by sampling the corpus or by inspecting each migrated record, depending on the repository size.

a key aspect highlighted by the requirement to reprocess migrated records relates to the granularity of the state machine.
as an example, in ir3 a second run required attaching supplementary files to a number of migrated records, and this posed a challenge because the state machine only recorded whether all files had been uploaded, and not which files were successfully added to the record. thus, the state machine was amended to record the complete list of record files, allowing for more granular control over this processing step.

the last concern, fault tolerance, was achieved by applying basic software engineering principles, such as fail-fast (report migration issues as soon as they manifest), the implementation of proper exception handling (so as not to ignore any potential issues), and the addition of enhanced logging in order to provide a complete record of the processing steps. for each of the five migrations, slam ran unsupervised, reporting at the end of each run the records for which an issue was encountered. as an example, in the ir4 migration, slam initially failed to migrate 300 records. these were reported to the operator, and after minor fixes were applied to the metadata crosswalk the migration completed successfully. fault-tolerance plays a central role in ensuring that during migrations no data is lost or corrupted, by surfacing any edge-case that might have been missed during the development of the metadata crosswalk, repository connectors, or core migration module, while also isolating such issues to the records exhibiting them, with no impact on the full corpus.

future directions

while slam has proven viable in real-world scenarios, a number of areas which can benefit from further improvements were identified through an analysis of the current implementation, based on the experiences of the five migrations.
first, the migration-specific components (connectors and metadata crosswalk, shown in green in figure 1) require further decoupling from the core migration module. for example, since all migrations considered figshare as a target repository, this connector is currently strongly interlinked with the core module, in order to save development time according to business requirements and migration timelines. further decoupling will ensure that the core migration module’s design is not influenced in any way by the repository’s architecture and capabilities. completing this work will also allow making the source code of our current implementation of slam publicly available, as in its current state it is making use of proprietary components which are employed across other parts of the figshare platform. aside from these, the source code includes straightforward python modules and makes use of open technologies such as elasticsearch, which will allow the larger community to adapt and use slam with other source or target repositories, or even enhance it with further functionality. nevertheless, the general architecture can already be implemented in any other way or using a different set of technologies.

further to this point, the metadata crosswalk is currently influenced by the logic and design of the migration module; for example, it uses the same procedural programming language, python, as all other components of slam. employing technologies such as extensible stylesheet language transformations (xslt, for metadata in xml formats) or sparql (for rdf) will help involve staff with in-depth domain knowledge further in the migration, to whom these technologies are more familiar; moreover, such a design does not require any knowledge of slam’s internal processes.

second, the five completed migrations highlighted the importance of reviewing, correcting, and enhancing records during the migration. for example, when migrating a journal article’s version of record in an open access context, special care needs to be given to its metadata (title, authors, journal name, publication date or persistent identifier), as mistakes can generate issues with scholarly search engines which will not be able to link the published version to the repository one. a possible input for comparing and correcting existing metadata is the information contained by current research information systems, which aggregate information from various databases, such as scopus (https://www.scopus.com/). if access to such systems is not available, it is possible to source metadata from open directories, such as crossref (https://www.crossref.org/). this component is included in the architectural overview presented in figure 1.

the third area in need of improvement relates to testing the outcome of the migrations. as mentioned in the previous section, this is currently a manual process and can be both cumbersome and error prone. while in line with slam’s philosophy of automating every step of the process, implementing a mechanism for validating the end migration result could also provide stronger assurances on the completeness and correctness of the migration.

finally, slam’s preservation module requires further development in order to ensure that it is fully automated; moreover, the possibility of adding a manifest explaining the migration artefacts needs to be considered, as knowledge of the organisation of the information, which is specific to each migration, might be lost in time.

it is important to note that architecture-wise, which was the main concern of this work, we did not identify any major shortcomings in slam; most issues discussed above concern the implementation.
slam’s modular design will facilitate any additions to the system required to support new use cases and migrations.

conclusions

this paper describes slam, the stateful library analysis and migration system, an etl software architecture for performing digital library migrations. what differentiates such transfers from other data migrations is the required domain knowledge, the particularities of the target and source repositories in the context of the scholarly communications ecosystem, and the structure of the migration package, which includes, among others, bibliographic metadata, record files, and usage data. digital libraries are an integral part of the cultural heritage; thus, any migration needs to ensure that no information is lost or corrupted in the process. the main contributions brought by slam are

1. it includes an analysis module based on an industry standard search engine, elasticsearch, which allows operators to analyse the metadata and schema crosswalk, facilitating the decisions required for properly migrating information between repositories;
2. it implements a serializable state machine in its migration module, which facilitates running the migration procedures multiple times without duplicating, removing, or corrupting records, while allowing for corrections to be applied to the corpus;
3. it follows a modular design, which enhances its reusability across multiple migrations, by reducing the development time required for adapting the system to new source and target repositories.

slam applies established software engineering principles in order to provide a trustworthy tool to digital library administrators who need to transfer content between systems. its design was both influenced and validated by real-world applications, having been used for five different migrations with various requirements and targeted repository solutions.
future work will consider enhancing slam’s metadata analysis and enrichment capabilities as well as the collection of further data points on its performance and possible improvement directions while using it for new digital library migrations. information technology and libraries december 2021 stateful library analysis and migration system (slam) | pănescu, grosu, and manta 13 endnotes 1 david scherer and dan valen, “balancing multiple roles of repositories: developing a comprehensive repository at carnegie mellon university,” publications 7, no. 2 (2019), https://doi.org/10.3390/publications7020030. 2 directorate-general for research & innovation, “h2020 programme—guidelines to the rules on open access to scientific publications and open access to research data in horizon 2020,” version 3.2, march 21, 2017, https://web.archive.org/web/20180826235248/http://ec.europa.eu/research/participants/ data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-pilot-guide_en.pdf; national institutes of health, “nih public access policy details,” last updated march 25, 2016, https://web.archive.org/web/20180421191423/https://publicaccess.nih.gov/policy.htm. 3 the ref, “research excellence framework,” https://web.archive.org/web/20191215143352/https://www.ref.ac.uk/. 4 roger c. schonfeld, “elsevier acquires bepress,” scholarly kitchen (blog), august 2, 2017, https://web.archive.org/web/20191212183253/https://scholarlykitchen.sspnet.org/2017/0 8/02/elsevier-acquires-bepress/. 5 alan hyndman, “announcing the figshare institutional repository… and data repository… and thesis repository… really just an all-in-one next gen repository,” figshare (blog), march 22, 2018, https://figshare.com/blog/announcing_the_figshare_institutional_repository_and_data_repos itory_and_thesis_repository_really_just_an_all-in-one_next_gen_repository/389. 
6 alan hyndman, “figshare to power chemrxiv™ beta, new chemistry preprint server for the global chemistry community,” figshare (blog), august 14, 2017, https://web.archive.org/web/20191218194210/https:/figshare.com/blog/_/322.
7 bepress, “digital commons dashboard,” https://web.archive.org/web/20191218192450/https://www.bepress.com/reference_guide_dc/digital-commons-dashboard/.
8 steve van tuyl et al., “are we still working on this? a meta-retrospective of a digital repository migration in the form of a classic greek tragedy (in extreme violation of aristotelian unity of time),” code{4}lib journal no. 41 (august 9, 2018), https://journal.code4lib.org/articles/13581; do van chau, “challenges of metadata migration in digital repository: a case study of the migration of duo to dspace at the university of oslo library” (master’s thesis, university of oslo, 2011), http://hdl.handle.net/10642/990.
9 danijela tešendić and danijela boberić krstićev, “business intelligence in the service of libraries,” information technology and libraries 38, no. 4 (2019), https://doi.org/10.6017/ital.v38i4.10599.
10 “what is elasticsearch?” elasticsearch bv, http://web.archive.org/web/20191207032247/https://www.elastic.co/whatis/elasticsearch.

it is our flagship: surveying the landscape of digital interactive displays in learning environments
lydia zvyagintseva
information technology and libraries | june 2018 50
lydia zvyagintseva (lzvyagintseva@epl.ca) is the digital exhibits librarian at the edmonton public library in edmonton, alberta.

abstract
this paper presents the findings of an environmental scan conducted as part of a digital exhibits intern librarian project at the edmonton public library in 2016. as part of the library’s 2016–2018 business plan objective to define the vision for a digital exhibits service, this research project aimed to understand the current landscape of digital displays in learning institutions globally.
the resulting study consisted of 39 structured interviews with libraries, museums, galleries, schools, and creative design studios. the environmental scan explored the technical infrastructure of digital displays, their user groups, various uses for the technologies within organizational contexts, the content sources, scheduling models, and resourcing needs for this emergent service. broader themes surrounding challenges and successes were also included in the study. despite the variety of approaches taken among learning institutions in supporting digital displays, the majority of organizations have expressed a high degree of satisfaction with these technologies.

introduction
in 2020, the stanley a. milner library, the central branch of the edmonton (alberta) public library (epl), will reopen after extensive renovations to both the interior and exterior of the building. as part of the interior renovations, epl will have installed a large digital interactive display wall modeled after the cube at queensland university of technology (qut) in brisbane, australia. to prepare for the launch of this new technology service, epl hired a digital exhibits intern librarian in 2016, whose role consisted of conducting research to inform the library in defining the vision for a digital display wall serving as a shared community platform for all manner of digitally accessible and interactive exhibits. as a result, the author carried out an environmental scan and a literature review related to digital displays and their service contexts. for the purposes of this paper, “digital displays” refers to the technology and hardware used to showcase information, whereas “digital exhibits” refers to content and software used on those displays.
wherever the service of running, managing, or using this technology is discussed, it is framed as “digital display service” and concerns both technical and organizational aspects of using this technology in a learning institution.

it is our flagship | zvyagintseva 51 https://doi.org/10.6017/ital.v37i2.9987

method
the data were collected between may 30 and august 20, 2016. a series of structured interviews were conducted by skype, phone, and email. the study population was assembled by searching google and google news for keywords such as “digital interactive and library,” “interactive display,” “public display,” or “visualization wall” to identify organizations that have installed digital displays. the list was expanded by reviewing websites of creative studios specializing in interactive experiences and through a snowball effect once the interviews had begun. a small number of vendors, consisting primarily of creative agencies specializing in digital interactive services, were also included in the study population. participants were then recruited by email. the goal of this project was to gain a broad understanding of the emergent technology, content, and service model landscape related to digital displays. as a result, structured interviews were deemed to be the most appropriate method of data collection because of their capacity to generate a large amount of qualitative and quantitative data. in total, 39 interviews were conducted. a list of interview questions prepared for the interviews is included in appendix a. a complete list of the study population can be found in appendix b. predominantly, organizations from canada, the united states, australia, and new zealand are represented in this study.

literature review

definitions
• public displays, a term used in the literature to refer to a particular type of digital display, can refer to “small or large sized screens that are placed indoor . . .
or outdoor for public viewing and usage” and which may be interactive to support information browsing and searching activities.”1 in public displays, a large proportion of users are passers-by and thus first-time users.2 in academic environments, these technologies may be referred to as “video walls” and have been characterized as display technologies with little interactivity and input from users, often located in high-traffic, public areas with content prepared ahead of time and scheduled for display according to particular priorities.3
• semi-public displays, on the other hand, can be understood as systems intended to be used by “members of a small, co-located group within a confined physical space, and not general passers-by.”4 in academic environments, they have been referred to as “visualization spaces” or “visualization studios,” and can be defined as workspaces with real-time content displayed for analysis or interpretation, often placed in libraries or research department units.5 for the purposes of this paper, “digital displays” refers to both public and semi-public displays, as organizations interviewed as part of this study had both types of displays, occasionally simultaneously.
• honeypot effect describes how people interacting with an information system, such as a public display, stimulate other users to observe, approach, and engage in interaction with that system.6 this phenomenon extends beyond digital displays to tourism, art, or retail environments, where a site of interest attracts attention of passers-by and draws them to participate in that site.

interactivity
the area of interactivity with public displays has been studied by many researchers, with three commonly used modes of interaction clearly identified: touch, gesture, and remote modes.
• touch (or multi-touch): this is the most common way users interact with personal mobile devices such as smartphones and tablets.
multi-touch interaction on public displays should support many individuals interacting with the digital screen simultaneously, since many users expect immediate access and will not take turns. for example, some technologies studied in this report support up to 30 touch points at any given time, while others, like qut’s the cube, allow for a near infinite number of touch points. though studies show that this technique is fast and natural, it also requires additional physical effort from the user.7 while touch interaction using infrared sensors has a high touch recognition rate, its identified shortcomings are its expense and its sensitivity to light interference, such as light around the touch screen.8
• gesture: this is interaction through movement of the user’s hands, arms, or entire body, recognized by sensors such as the microsoft kinect or leap motion systems. although studies show that this type of interaction is quick and intuitive, it also brings “a cognitive load to the users together with the increased concern of performing gestures in public spaces.”9 specifically, body gestures were found not to be well suited to passing-by interaction, unlike hand gestures, which can be performed while walking. hand gestures also have an acceptable mental, physical, and temporal workload.10 research into gesture-based interaction shows that “more movement can negatively influence recall” and is therefore not suited for informational exhibits.11 similarly, people consider gestures to be too much work “when they require two hands and large movements” to execute.12 not surprisingly, research suggests that gestures deemed to be socially acceptable for public spaces are small, unobtrusive ones that mimic everyday actions. they are also more likely to be adopted by users.
• remote: these are interactions using another device, such as mobile phones, tablets, virtual-reality headsets, game controllers, and other special devices.
connection protocols may include bluetooth, sms messaging, near-field communication, radio-frequency identification, wireless-network connectivity, and other methods. mobile-based interaction with public displays has received a lot of attention in research, media, and commercial environments because this mode allows users to interact from variable distance with minimal physical effort. however, users often find mobile interaction with a public display “too technical and inconvenient” because it requires sophisticated levels of digital literacy in addition to having access to a suitable device.13 some suggest that using personal devices for input also helps “avoid occlusion and offers interaction at a distance” without requiring multi-touch or gesture-based interactions.14 as well, subjects in studies on mobile interaction often indicate their preference for this mode because of its low mental effort and low physical demand. however, it is possible that these studies focused on users with high degrees of digital literacy rather than the general public with varying degrees of access and comfort with mobile technologies.

user engagement
attracting user attention is not necessarily guaranteed by virtue of having a public display. according to research, the most significant factors that influence user engagement with public digital displays are age, display content, and social context.

age
hinrichs found that children were the first to engage in interaction with public displays and would often recruit adults accompanying them toward the installation.15 on the other hand, hinrichs found adults to be more hesitant in approaching the installation: “they would often look at it from a distance before deciding to explore it further.”16 these findings suggest that designing for children first is an effective strategy for enticing interaction from users of all ages.
display content
studies on engagement in public digital display environments indicate that both passive and active types of engagement exist with digital displays. the role of emotion in the content displayed also cannot be overlooked. specifically, clinch et al. state that people typically pay attention to displays “only when they expected the content to be of interest to them” and that they are “more likely to expect interesting content in a university context rather than within commercial premises.”17 in other words, the context in which the display is situated affects user expectations and primes them for interaction.

the dominant communication pattern in existing display and signage systems has been narrowcast, a model in which displays are essentially seen as distribution points for centrally created content without much consideration for users. this model of messaging exists in commercial spaces, such as malls, but also in public areas like transit centers, university campuses, and other spaces where crowds of people may gather or pass by. observational studies indicate that people tend to perceive this type of content as not relevant to them and ignore it.18 for public displays to be engaging to end users, in other words, “there needs to be some kind of reciprocal interaction.”19 in public spaces, interactive displays may be more successful than noninteractive displays in engaging viewers and making city centers livelier and more attractive.20

in terms of precise measures of attention to such displays, studies of average attention time correlate age with responsiveness to digital signage.
children (1–14 years) are more receptive than adults, and men spend more time observing digital signage than women.21 studies also indicate significantly higher average attention times for observing dynamic content as compared to static content.22 scholars like buerger suggest that designers of applications for public digital displays should assume that viewers are not willing “to spend more than a few seconds to determine whether a display is of interest.”23 instead, they recommend presenting informational content with minimal text and in such a way that the most important information can be determined in two to three seconds. in a museum context, the average interaction time with the digital display was between two and five minutes, which was also the average time people spent exploring analog exhibits.24 dynamic, game-like exhibits at the cube incorporate all the above findings to make interaction interesting and short and to draw the attention of children first.

social context
social context is another aspect that has been studied extensively in the field of human-computer interaction, and it provides many valuable lessons for applying evidence-based practices to technology service planning in libraries. many scholars have observed the honeypot effect as related to interaction with digital displays in public settings. this effect describes how users who are actively engaged with the display perform two important functions: they entice passers-by to become actively engaged users themselves, and they demonstrate how to interact with the technology without formal instruction.

many argue that a conducive social context can “overcome a poor physical space, but an inappropriate social context can inhibit interaction” even in physical spaces where engagement with the technology is encouraged.25 this finding relates to use of gestures on public displays.
researchers also found that contextual social factors such as age and being around others in a public setting do, in fact, influence the choice of multi-touch gestures. hinrichs suggests enabling a variety of gestures for each action (accommodating different hand postures and a large number of touch points, for example) to support fluid gesture sequences and social interactions.26 a major deterrent to users’ interaction with large public displays has been identified as the potential for social embarrassment.27 consequently, the authors suggest positioning the display along thoroughfares of traffic and improving how the interaction principles of the display are communicated implicitly to bystanders, thus continually instructing new users on techniques of interaction.28

findings

technical and hardware landscape
the average age of public displays was around three years, indicating an early stage of development of this type of service among learning institutions. such technologies first appeared in europe more than 10 years ago (for example, the most widely cited early example of a public display is the citywall in helsinki in 2007).29 however, adoption in north america did not start until around 2013. the median year for the installation of these technologies among organizations studied in this report is 2014. among public institutions represented in the study population, such as public libraries and museums, digital displays were most frequently installed in 2015. while most organizations have only one display space, it was not unusual to find several within a single organization. for example, for the purposes of this study, the researcher has counted the cube as three display spaces, as documentation and promotional literature on the technology cites “3 separate display zones.” as a result, the average number of display spaces in the population of this study is 1.75.
the following modes of interaction beyond displaying video content with digital displays have been observed in the study population in descending order of frequency:
• sound (79%). while research on human-computer interaction is inconclusive about best practices related to incorporating sound into digital interactive displays, it is clear, among the organizations interviewed in the environmental scan, that sound is a major component of digital exhibits and should not be overlooked.
• touch or multi-touch (46%). this finding highlights that support for multi-user interaction is not consistent across the study population.
• gesture (25%). these include tools such as microsoft kinect, leap motion, or other systems for detecting movement for interaction.
• mobile (14%). while some researchers in the human-computer interaction field suggest mobile is the most effective way to bridge the divide between large public displays, personalization of content, and user engagement, mobile interactivity is not used frequently to engage with digital displays in the study population. one outlier is north carolina state university library, which takes a holistic, “massively responsive design” approach in which responsive web design principles are applied to content that can be displayed effectively at once online, on digital display walls, and on mobile devices while optimizing institutional resources dedicated to supporting visualization services.

further, as in the broader personal computing environment, the microsoft windows operating system dominates display systems, with 61% of the organizations choosing a windows machine to power their digital display. a fifth (21%) of all organizations have some form of networked computing infrastructure, such as the cube with its capacity to process exhibit content using 30 servers.
in contrast, the majority (79%) of organizations interviewed have a single computer powering the display. this finding is perhaps not surprising, given that few institutions have dedicated it teams to support a single technology service like the cube.

users and use cases
understanding primary audiences was also important for this study, as the organizational user base defines the context for digital exhibits. the breakdown of these audiences is summarized in figure 1. for example, the university of oregon ford alumni center’s digital interactive display focuses primarily on showcasing the success of its alumni, with a goal of recruiting new students to the university. however, the interactive exhibits also serve the general public through tours and events on the university of oregon campus. other organizations with digital displays, such as all saints anglican school and the philadelphia museum of art, also target specific audiences, so planning for exhibits may be easier in those contexts than in organizations like the university of waterloo stratford campus, with its display wall at the downtown campus that receives visitor traffic from students, faculty, and the public.

figure 1. audience types for digital displays in the study population. (chart legend as extracted: academic, public, both public and academic; values as extracted: 44%, 33%, 22%.)

digital displays serve various purposes, which depend on the context of the organization in which they exist, their technical functionality, their primary audience, their service design, and other factors. interview participants were asked about the various uses for these technologies at their institutions. a single display could have multiple functions within a single institution. the following list summarizes these multiple uses:
1. educational (67%), such as displaying digital collections, archives, historical maps, and other informational content.
these activities can be summarized in the words of one participant as “education via browse,” that is, self-guided discovery rather than formal instruction.
2. fun or entertainment (56%), including art exhibitions, film screenings, games, playful exhibits, and other engaging content to entice users.
3. communication (47%), which can be considered a form of digital signage to promote library or institutional services and marketing content. displays can also deliver presentations and communicate scholarly work.
4. teaching (42%), including formal and semi-formal instruction, workshops, student presentations, and student course-work showcases.
5. events (31%), such as public tours, conferences, guest speakers, special events, galas, and other social activities near or using the display.
6. community engagement (28%), including participation from community members through content contribution, showing local content, using the display technology as an outreach tool, and other strategies to build relationships with user communities.
7. research (22%), where the display functions as a tool that facilitates scholarly activities like data collection, analysis, and peer review. many study participants acknowledged challenges in using digital displays for this purpose and have identified other services that might support this use more effectively.

content types and management
in the words of deakin university librarians, “content is critical, but the message is king,” so it was particularly important for the author to understand the current digital display landscape as it relates to content.30 specifically, the research project encompassed the variety of content used on digital displays as well as how it is created, managed, shared, and received by the audiences of various organizations interviewed in this study.
as can be observed in figure 2, all organizations supported 2d content, such as images, video, audio, presentation slides, and other visual and textual material. however, dynamic forms of content, such as social media feeds, interactive maps, and websites were less prevalent.

figure 2. types of content supported by digital displays in the study population (static 2d: 100%; dynamic web: 61%; dynamic 3d: 57%).

discussions around interest in emergent, immersive, and dynamic 3d content such as games and virtual and augmented reality also came up frequently in the study interviews, and the researcher found that these types of content were supported in only 16 (57%) of the 28 total cases. this number is lower than the total number of interviewees because not all organizations interviewed had content to manage or display. in addition, many organizations recognized that they would likely be exploring ways to present 3d games or immersive environments through their digital display in the near future. not surprisingly, the creative agencies included in this study revealed an awareness and active development of content of this nature, noting “rising demand and interest in 3d and game-like environments.” furthermore, projects involving motion detection, the internet of things, and other sensor-based interactions are also seeing a rise in demand, according to study participants.

figure 3. content management systems for digital displays.

in terms of managing various types of content, 20 (71%) of the organizations interviewed had used some form of content management system (cms), while the rest did not use any tool to manage or organize content. of those organizations that used a cms, 15 (75%) relied on a vendor-supplied system, such as tools by fourwinds interactive, visix, or nec live.
the remaining 5 cms users (25% of cms users, or 18% of all organizations) created a custom solution without going to a vendor. this finding suggests that since the majority of content supported by organizations with digital displays is 2d, current vendor solutions for managing that content are sufficient for the study population at this point. it is unclear how the rise in demand for dynamic, game-like content will be supported by vendors in the coming years. table 1 reflects the distribution of approaches to managing content observed in the study population.

table 1. content management in study population
content management         responses   %
vendor-supplied system     15          54
in-house created system    5           18
no system                  5           18
unknown                    3           10

middleware, automation, and exhibit management
middleware can be described as the layer of software between the operating system and applications running on the display, especially in a networked computing environment. for example, most organizations studied in the environmental scan supported a windows environment with a range of exhibit applications, like slideshows, web browsers, and executable files, such as games. middleware can simplify and automate the process of starting up, switching between, and shutting off display applications on a set schedule. as figure 4 demonstrates, the majority of the organizations in the study population (17, or 61%) did not have a middleware solution. however, this group was heterogeneous: 14 organizations (50%) did not require a middleware solution because they ran content semi-permanently or relied on user-supplied content, in which case the display functioned as a teaching tool. the remaining three organizations (11%) manually managed scheduling and switching between exhibit content.
in such cases, a middleware solution would be valuable to management of content, especially as the number of applications grows, but it was not present in these organizations. comparatively, 10 organizations (36%) used a custom solution, such as a combination of windows or linux scripts to manage automation and scheduling of content on the display. one organization (3%) did not specify their approach to managing content. these findings suggest that no formalized solution to automating and managing software currently exists among the study population. in addition to organizing content, digital-exhibits services involve scheduling or automating content to meet user needs according to the time of day, special events, or seasonal relevance. as a result, the middleware technology solution supports sustainable management of displays and predictable sharing of content for end users. this environmental scan revealed that digital exhibits and interactive experiences are still in the early days of development. it is possible that new solutions for managing content both at the application and the middleware level may emerge in the coming years, but they are currently limited.

figure 4. middleware solutions in the study population.

sources of content
when finding sources of content to be displayed on digital displays, organizations interviewed used multiple strategies simultaneously. table 2 below brings together the findings related to this theme.

table 2. content sources for digital exhibits
content source               %
external/commissioned        64
user-supplied                64
internal/in-house            50
collaborative with partner   43

for example, many organizations rely on their users to generate and submit material (18, or 64%); others commission vendors to create exhibits for them (18, or 64%). in 50% of all cases, organizations also produce content for exhibits in-house.
in other words, most organizations used a combination of all sources to generate content for their digital displays. only a few use a single source of content, such as the semi-permanent historical exhibit at henrico county public library. others, like the duke media wall, which employs a “for students by students” model of content creation, rely entirely on their users to supply content. additionally, only 12 (43%) of the organizations interviewed had explored or established some form of partnership for creating exhibits. primarily, these partnerships existed with departments, centers, institutes, campus units, and/or students in academic settings, such as the computer science department, faculty of graduate studies, and international studies. other examples of partnerships were with similar civic, educational, cultural, and heritage organizations, such as municipal libraries, historical societies, art galleries, museums, and nonprofits. examples included study participants working with ars electronica, local symphony orchestras, harvard space science, and nasa on digital exhibits. clearly, a variety of approaches were taken in the study population to come up with digital exhibits content.

content creation guidelines
seven organizations (19%) in the study population publicly shared content guidelines that aim to simplify the process of engaging users in creating exhibits. these guidelines were analyzed to identify the key elements that users need to know in order to contribute in a meaningful way, thereby lowering the barrier to participation. these elements include resolution of the display screen(s), touch capability, ambient light around the display space, required file formats, and maximum file size.
a complete list of organizations with such guidelines, along with websites where these guidelines can be found, is included in appendix c. based on the analysis of this limited sample, the bare minimum for community participation guidelines would include clearly outlining
• the scope, purpose, audience, and curatorial policy of the digital exhibits service;
• the technical specifications, such as the resolution, aspect ratio, and file formats supported by the display;
• the design guidelines, such as colors, templates, and other visual elements;
• the contact information of the digital exhibits coordinator; and
• the online or email submission form.

it should be noted, however, that such specifications are primarily useful when a cms exists and the content solicited from users is at least somewhat standardized. for example, images, slides, or webpages may be easier for community partners to contribute than video games or 3d interactive content. no examples of guidelines for the latter were observed in the study.

content scheduling
whereas the middleware section of this study examined the technical approaches to content management and automation, this section explores the frequency of exhibit rotation from a service design perspective. as can be observed in figure 5, no consistent or dominant model for exhibit scheduling has been identified in the study population. generally, approaches to scheduling digital exhibits reflect organizational contexts. for example, museums typically design an exhibit and display it on a permanent basis, while academic institutions change displays of student work or scholarly communication once per semester.

figure 5. content scheduling distribution in the study population.

the following scheduling models have emerged in descending order of frequency in the study population.
1.
unstructured (29%): no formal approach, policy, or expectation is identified by the organization regarding displaying exhibits. this model is largely related to the early stage of service development in this domain, lack of staff capacity to support the service, and/or responsiveness to user needs. one study participant, for example, referred to this loose approach by noting that “no formalized approach and no official policy exists.” for example, institutions may have frameworks for what types of content are acceptable but no specific requirements on the content subjects. institutions adopting a lab space model (see figure 6) for digital displays largely belong to this category. in other words, content is created on the fly through workshops, data analysis, and other situations as needed by users. in this case, no formal scheduling is required apart from space reservations. 2. seasonal (29%), which can be defined as a period from three to six months and includes semester-based scheduling in academic institutions. many organizations operate on a quarterly basis, so it would seem logical that content refresh cycles reflect the broader workflow of the organization. 3. permanent (21%): in the cases of museums, permanent exhibits may mean displaying content indefinitely or until the next hardware refresh, which might reconfigure the entire interactive display service. no specific date ranges were cited for this model. 4. monthly (10%): this pattern was observed among academic libraries, with production of “monthly playlists” featuring curated book lists or other monthly specials. 5. weekly (7%): north carolina state university and deakin university libraries aim to have fresh content up once per week; they achieve this in part by formalizing the roles needed to support their digital display and visualization services. 
6. daily (4%): only griffith university ensures that new content is available every day on its #seemore display; it does this largely by relying on standardized external and internal inputs, such as weather updates and content from the university marketing department.

staffing and skills

one key element of the digital exhibits research project was investigating the staffing models required to support a service of this nature. not surprisingly, the theme of resource needs for digital exhibits emerged in most interviews conducted. several participants noted that one “can’t just throw up content and leave it,” while others advised to “have expertise on staff before tech is installed.” the data gathered shows that the average full-time equivalent (fte) needed to support digital display services in the organizations interviewed was 2.97, or around three full-time staff members. in addition, 74% of the organizations studied had maintenance or support contracts with various vendors, including av integrators, cms specialists, creative studios that produced original content, or hardware suppliers. hardware and av integrators typically provided a 12-month contract for technical troubleshooting, while creative studios provided a 3-month support contract for the digital exhibits they designed.

the average time to create an original, interactive exhibit was between 9 and 12 months, according to the data provided by creative agencies, the cube teams, and learning organizations that have in-house teams creating exhibits regularly. this length of time varies with the complexity of the interaction designed, the depth of the exhibit “narrative,” and the modes of input supported by the exhibit application.
additionally, it was important to understand the curatorial labor behind digital exhibits; the author did not necessarily speak with the curator of exhibits, and this work may be carried out by multiple individuals within organizations with digital displays or creative studios. in 20 (57%) of the cases, the person interviewed also curated some or all of the content for the digital display in their respective institutions. in five (14%) of the cases, the individual interviewed was not a curator for any of the content, because there was no need for curation in the first place. for example, displays in these cases were used for analysis or teaching and therefore did not require prepared content. in the rest of the cases (10, or 29%), a creative agency vendor, another member of the team, or a community partner was responsible for the curation of exhibit content. this finding suggests that, while a significant number of organizations outsource the design and curation of exhibits, the majority retain control over this process. therefore, dedicating resources to curation, organization, and management of exhibit content is deemed significant by the organizations represented in the study.

in terms of the capacity to carry out digital display services, skills that study participants identified as important to supporting work of this nature include the following:
1. technical skills (such as the ability to troubleshoot), general interest in technology, and flexibility and willingness to learn new things (74%)
2. design, visual, and creative sensibility (40%), as this type of work is primarily a visual experience
3. software-development or programming-language knowledge (31%)
4. communication, collaboration, and relationship-building (25%)
5. project management (20%)
6. audiovisual and media skills (14%), as digital exhibits are “as much an av experience as an it experience,” according to one study participant
7. curatorial, organizational, and content-management skills (11%)

the most frequent dedicated roles mentioned in the interviews are shown in table 3.

table 3. types of roles significant to digital exhibits work
position | responses | %
developer/programmer | 11 | 31
project manager | 8 | 23
graphic designer | 6 | 17
user experience or user interface designer | 4 | 11
it systems administrator | 4 | 11
av or media specialist | 4 | 11

the relatively low percentages in this table suggest that the skills mentioned above are either distributed among various team members or combined in a single role, as may be the case in small institutions or those without formalized services with dedicated roles. nevertheless, the presence of specific job titles indicates an understanding of the various skill sets needed to run a service that uses digital displays.

challenges and successes

many challenges were identified by study participants related to initiating and supporting a service that uses digital displays for learning. clearly, multiple challenges could be associated with the services related to digital displays within a single organization. however, many successes and lessons learned were also shared by interviewees, often overlapping with identified challenges. this pattern suggests that some organizations can pursue strategies that address challenges faced by their library or museum colleagues while perhaps lacking resources or capacity in other areas related to this type of service. for example, some organizations have observed a lack of user engagement because of limited interactivity of the technology solution they used. others have had successful user engagement largely by investing in technology solutions that provide a range of modes of interaction.
it is important to learn from both these areas to anticipate possible pain points and to be able to capitalize on successes that lead to industry recognition and engagement from library customers. table 4 summarizes the range of challenges identified.

table 4. challenges related to digital display services
challenge identified | responses | %
technical | 14 | 41
content | 11 | 33
costs | 11 | 33
user expectations | 11 | 33
workflow | 10 | 29
service design | 9 | 26
time | 8 | 24
organizational culture | 8 | 24
user engagement | 7 | 20

as reflected in table 4, several key challenges have been discussed:
1. technical, such as troubleshooting the technology, keeping up with new technologies or upgrades, and finding software solutions appropriate for the hardware selected.
2. content, such as coming up with original content or curating existing sources. in the words of one participant, “quality and refresh of content is key: it has to be meaningful, interesting, and new.” this clearly presents a resource requirement.
3. costs, such as the financial commitment to the service, the unseen costs in putting exhibits together, software licensing, and hardware upgrades.
4. user expectations, such as keeping the service at its full potential and using the maximum functionality of the hardware and software solutions. according to study participants, users “may not want what they think or they say they want,” and to some extent, “such technologies are almost an expectation now, and not as exciting for users.”
5. workflow or project-management strategies specifically related to emergent multimedia experiences that require new cycles of development and testing.
6. time to plan, source, create, troubleshoot, launch, and improve exhibits.
7. service design, such as thinking holistically about the functions of the technology within the larger organizational structure.
as one study participant stated, organizations “cannot disregard the reality of the service being tied to a physical space” in that these types of technologies are both a virtual and physical customer experience.
8. organizational culture and policy, in terms of adapting project-based approaches to planning and resourcing services, getting institutional support, and educating all staff about the purpose, function, and benefits of the service.
9. user engagement, particularly keeping users interested in the exhibits and continually finding new and exciting content. various participants have found that “linger time is between 30 seconds to few minutes” and content being displayed needs to be “something interesting, unique, and succinct, but not a destination in itself.”

despite the clear challenges with delivering digital exhibits services, organizations that participated in this study have identified keys to success (see table 5).

table 5. successes and lessons learned in using digital displays
successful approach or lesson identified | responses | %
user engagement and interactivity | 16 | 47
service design | 14 | 41
“wow” factor | 12 | 35
organizational leadership | 12 | 35
technology solution | 10 | 29
flexibility | 10 | 29
communication and collaboration | 10 | 29
project management | 9 | 26
team and skill sets | 9 | 26

as reflected in table 5, several approaches have been discussed:
• user engagement and interactivity, particularly for those institutions that invested in highly interactive and immersive experiences; the rewards are seen in interest and enthusiasm of their user groups.
• service design: organizations that have carefully planned the service have found that this technology was successfully serving the needs of their user communities.
• promotion and “wow factor” that has brought attention to the organization and the service.
it is not surprising that digital displays are central points on tours of dignitaries, political figures, and external guests. further, many have commented that they “did not imagine a library could be involved in such an innovative experiment,” and others have added that their digital displays have “created new conversations that did not exist before.”
• leadership and vision at the organizational level, which secures support and resources as well as defines the scope of the service to ensure its sustainability and success: “money is not necessarily the only barrier to doing this service, but risk taking, culture.”
• technology solution, where “everything works” and both the organization and users of the service are happy with the functionality, features, and performance of the chosen solution.
• flexibility and willingness to learn new things, including being open to agile project-management methods, taking risks, and continually learning new tools, technologies, and processes as the service matures.
• communication and collaboration, both internally among stakeholders and externally by building community partnerships, new audiences, and user participation in content creation. for example, one study participant noted that the technology “has contributed to giving the museum a new audience of primarily young people and families, a key objective held in 2010 at the commencement of the gallery refurbishments.”
• workflow and project management for those embracing new approaches required to bring multiple skill sets together to create engaging new exhibits.
as one participant has put it, “these types of approaches require testing, improvement, a new workflow and lifecycle for the projects.”
• having the right team with appropriate skills to support the service, though this theme was rated as being less significant than designing services effectively and securing institutional support for the technology service. in other words, study participants noted that having in-house programming or design skills is not enough without a proper definition of success for digital exhibits services.

perceptions

institutional and user reception of digital displays as a service to pursue in learning organizations has been identified as overwhelmingly positive, with 87% of the organizations noting positive feedback. for example, one study participant noted the positive attention the digital display received from the wider community, stating “it is our flagship and people are in general impressed by both the potential and some of the existing content.” some participants have gone as far as to say that the reception among users has been “through the roof” and they have “never had a negative feedback comment” about their display. this finding indicates a high degree of satisfaction with such technologies among organizations that pursued a digital display. table 6 further explores the range of perceptions observed in the study.

table 6. perception of digital display services
perception | responses | %
positive | 20 | 87
hesitation or uncertainty | 7 | 30
concerns about purpose | 4 | 17
concerns about user engagement | 4 | 17
concerns about costs | 3 | 13
negative | 3 | 13

a minority (13%) have noted some negative perceptions, largely related to concerns about costs or functionality of the technology; 30% have observed uncertainty and hesitation on the part of staff and users in terms of engagement, as well as questioning of the technology's purpose in the organization.
for example, one study participant summarized this mixed sentiment by saying, “the perception is that it’s really neat and worthwhile for exploring new ways of teaching, but that the same features and functions could be achieved with less (which we think is a good thing!).” it is helpful to note this trend in perception, as any new service will likely bring a mixture of excitement, hesitation, and occasional opposition. interestingly, these reactions have originated both from the staff of organizations interviewed and their communities of users.

discussion

the findings from this study indicate that the functions of the digital displays are highly dependent on the organizational context in which displays exist. this context, in turn, defines the nature of the services delivered through the digital display. for example, figure 6 can be useful in classifying the various ways digital displays appear in the study population, from research and teaching-oriented lab spaces to public spaces with passive messaging or active, immersive, game-like digital experiences.

figure 6. types of digital displays in the study population.

as such, visualization walls might belong in the “lab spaces” category, which typically appears in academic libraries or research units, and do not require content planning and scheduling. what we might call “digital interactive exhibits” tend to appear in museums and galleries with a primarily public audience and may have a permanent, seasonal, or monthly rotation schedule. however, despite the range of approaches taken to providing content and using these technologies, many organizations share resourcing needs and challenges, such as troubleshooting the technology solution, creating engaging content, and managing costs of interactive projects.
despite these common concerns, the digital-exhibits services were perceived as being overwhelmingly satisfactory in all types of organizations included in this study because they brought new audiences to the organization and were often seen as “showpieces” in the broader community.

the data gathered in the environmental scan demonstrates that there is currently little consistency among digital displays in learning environments. this lack of consistency is seen in content-development methods among study participants, their programming, content management, technology solutions, and even naming of the display (and, by extension, the display service). for example, this study revealed that no evident “open platform” for managing content at the application or the middleware level currently exists. a small number of software tools are used by organizations to support digital displays, but their use is in no way standardized, in contrast to nearly every other area of library services.

there is some indication that digital-display services may become more standardized in the coming years, and more tools, solutions, vendors, and communities of practice will be available. for example, many signage cmss are currently on the market, and the number of game-like immersive experience companies is growing, suggesting extension of these services to libraries in the coming years. only a few software tools exist for creating exhibits, such as intuiface and touchdesigner, though no free, open-source versions of exhibit software are currently available. as well, the growing number of digital exhibits and interactive media companies currently focuses on turnkey solutions rather than software-as-a-service or platform solutions.

in contrast, some consistency exists in staffing needs and skills required to support the digital-exhibits service.
a majority of organizations interviewed agreed that design, software development, systems administration, and project-management skills are needed to ensure digital-exhibits services run sustainably in a learning organization. in addition, lack of public library representation in this study makes it challenging to draw parallels to the library context. adapting museum practices is also not necessarily reliable, as there is rarely a mandate to engage communities and partner on content creation, as there is in libraries. for example, only the el paso (texas) museum of history engages the local community to source and organize content. these findings suggest that digital displays are a growing domain, and more solutions are likely to emerge in the coming years.

the cube, compared to the rest of the study population, is a unique service model because it successfully brings together most elements examined in the environmental scan. for example, to ensure continual engagement with the digital display, the cube schedules exhibits on a regular basis and employs user interface designers, systems administrators, software engineers, and project managers. it also extends the content through community engagement, public tours, and stem programming. it has created an in-house middleware solution to simplify exhibit delivery and has chosen unity3d as its platform for exhibit development.

limitations

only organizations from english-speaking countries were interviewed as part of the environmental scan. it is therefore unclear if access to organizations from non–english-speaking countries would have produced new themes and significantly different results. in addition, as with all environmental scans, the data is limited by the degree of understanding, knowledge, and willingness to share information of the individual being interviewed.
in particular, individuals with whom the author spoke may or may not have been technology or service leads for the digital display at their respective institutions. thus, the study participants had a range of understanding of hardware specifications, functionality, and service-design components associated with digital displays. for example, having access to technology leads would likely have provided more nuanced responses around the middleware solutions and the underlying technical infrastructure required to support this service.

a small number of vendors were also interviewed as part of the environmental scan even though vendors did not necessarily have digital displays or service models parallel to libraries or museums. they are included in appendix b. nevertheless, gathering data from this group was deemed relevant to the study, as creative agencies have formalized staffing models and clearly identified skill sets necessary to support services of this nature. in addition, this group possesses knowledge of best practices, workflows, and project-management processes related to exhibit development.

finally, this environmental scan also did not capture any interaction with direct users of digital displays, whose experiences and perceptions of these technologies may or may not support the findings gathered from the organizations interviewed. these limitations were addressed by increasing the sample size of the study within the time and resource constraints of the research project.

conclusion

the findings of this study show that the functions of digital-display technologies and their related services are highly dependent on the organizational context in which they exist.
however, despite the range of approaches taken to providing content and using these technologies, many organizations share resourcing needs and challenges, such as troubleshooting the technology solution, creating engaging content, and managing costs of interactive projects. despite these common concerns, digital displays were perceived as being overwhelmingly positive in all types of organizations interviewed in this study, as they brought new audiences to the organization and were often seen as “showpieces” in the broader community. the successes and lessons learned from the study population are meant to provide a broader perspective on this maturing domain as well as help inform planning processes for future digital exhibits in learning organizations.

appendix a. environmental scan questions

digital exhibits environmental scan interview questions: museums, libraries, public organizations
1. what are the technical specifications of the digital interactive technology at your institution?
2. who are the primary users of this technology (those interacting with the platform)? is there anyone you thought would use it and isn’t?
3. what are primary uses for the technology (events, presentations, analysis, workshops)?
4. what types of content are supported by the technology (video, images, audio, maps, text, games, 3d, all of the above)?
5. where is content created and how is this content managed?
6. what is the schedule for the content and how is it prioritized?
7. can you estimate the fte (full-time equivalent) of staff members involved in supporting this technology/service, both directly and indirectly? what does indirect support for this technology entail?
8. in your experience, what kinds of skills are necessary in order to support this service?
9. have partnerships with other organizations producing content to be exhibited been established or explored?
10. what challenges have you encountered in providing this service?
11. what have been some keys to the successes in supporting this service?
12. what has been the biggest success of this service and what has been the biggest disappointment?
13. what is the perception of this technology in your institution more broadly?
14. are there any other institutions you suggest we contact to learn more about similar technologies?

digital exhibits environmental scan interview questions: vendors
1. what is the relationship between creative studio and hardware/fabrication? do you do everything or work with av integrators instead to put together touch interactives?
2. who have been the primary users of the interactive exhibits and projects you have completed?
3. who writes the use cases when creating a digital interactive exhibit?
4. what types of content are supported by the technology (video, images, audio, maps, text, games, 3d, all of the above)? do you see a rise in interest for 3d and game-like environments and do you have internal expertise to support it?
5. where is content created for the exhibits and how is this content managed? who curates?
6. what timespan or lifecycle do you design for?
7. how big is your team? how long do projects typically take to create?
8. what types of expertise do you have in house? what might a project team look like?
9. to what extent is there a goal of sharing knowledge back with the company from clients or users?
10. what challenges have you encountered in providing this service?
11. what have been some keys to the successes in supporting this service?
appendix b: study population in environmental scan

organization | location | date interviewed
all saints anglican school | merrimac, australia | july 25, 2016
anode | nashville, tn | july 22, 2016
belle & wissell | seattle, wa | july 26, 2016
bradman museum | bowral, australia | july 10, 2016
brown university library | providence, ri | june 3, 2016
university of calgary library and cultural resources | calgary, ab | june 2, 2016
deakin university library | geelong, australia | june 14, 2016
university of colorado denver library | denver, co | june 24, 2016
duke university library | durham, nc | august 17, 2016
el paso museum of history | el paso, tx | june 24, 2016
georgia state university library | atlanta, ga | june 10, 2016
gibson group | wellington, new zealand | july 16, 2016
henrico county public library | henrico, va | august 9, 2016
ideum | corrales, nm | july 26, 2016
indiana university bloomington library | bloomington, in | may 31, 2016
interactive mechanics | philadelphia, pa | august 2, 2016
johns hopkins university library | baltimore, md | june 20, 2016
nashville public library | nashville, tn | july 22, 2016
north carolina state university library | raleigh, nc | june 8, 2016
university of north carolina at chapel hill library | chapel hill, nc | june 2, 2016
university of nebraska omaha | omaha, ne | june 16, 2016
omaha do space | omaha, ne | july 11, 2016
university of oregon alumni center | eugene, or | june 7, 2016
philadelphia museum of art | philadelphia, pa | august 10, 2016
queensland university of technology | brisbane, australia | june 30; july 29, 2016; august 16, 2016
société des arts technologiques | montreal, qc | august 8, 2016
second story | portland, or | july 28, 2016
st. louis university | st. louis, mo | july 4, 2016
stanford university library | stanford, ca | july 22, 2016
university of illinois at chicago | chicago, il | june 22, 2016
university of mary washington | fredericksburg, va | july 7, 2016
visibull | waterloo, on | august 12, 2016
university of waterloo stratford campus | stratford, on | june 22, 2016
yale university center for science and social science information | new haven, ct | july 13, 2016

appendix c: digital content publishing guidelines

organization name | guidelines website
deakin university library | http://www.deakin.edu.au/library/projects/sparking-true-imagination
duke university | https://wiki.duke.edu/display/lmw/lmw+home
griffith university | https://intranet.secure.griffith.edu.au/work/digital-signage/seemore
north carolina state university library | http://www.lib.ncsu.edu/videowalls
university of colorado denver | http://library.auraria.edu/discoverywall
university of calgary library and cultural resources | http://lcr.ucalgary.ca/media-walls
university of waterloo stratford campus | https://uwaterloo.ca/stratford-campus/research/christie-microtiles-wall
21 clinch et al., “reflections.” https://doi.org/10.1145/2470654.2470774 https://doi.org/10.1145/2817721.2817732 https://doi.org/10.1145/2750858.2807532 https://ieeexplore.ieee.org/document/7508836/ https://doi.org/10.1109/tvcg.2008.127 https://doi.org/10.1145/2030112.2030132 https://doi.org/10.1007/978-3-540-79576-6_14 https://doi.org/10.1145/2207676.2207718 https://doi.org/10.1145/2757710.2757727 it is our flagship | zvyagintseva 77 https://doi.org/10.6017/ital.v37i2.9987 22 robert ravnik and franc solina, “audience measurement of digital signage: qualitative study in real-world environment using computer vision,” interacting with computers 25, no. 3 (2013), https://doi.org/10.1093/iwc/iws023. 23 neal buerger, “types of public interactive display technologies and how to motivate users to interact,” media informatics advanced seminar on ubiquitous computing, 2011, hausen, doris, conradi, bettina, hang, alina, hennecke, fabiant, kratz, sven, lohmann, sebastian, richter, hendrik, butz, andreas and hussmann, heinrich (eds). university of munich, department of computer science, media informatics group, 2011. https://pdfs.semanticscholar.org/533a/4ef7780403e8072346d574cf288e89fc442d.pdf . 24 c. g. screven, “information design in informal settings: museums and other public spaces,” in information design, ed. robert e. jacobson (cambridge, ma: mit press, 2000), 131–192. 25 parra et al., “understanding engagement,” 181. 26 uta hinrichs and sheelagh carpendale, “gestures in the wild: studying multi-touch gesture sequences on interactive tabletop exhibits,” proceedings of the sigchi conference on human factors in computing systems, vancouver, british columbia, may 7–12, 2011, 3023–32, https://doi.org/10.1145/1978942.1979391. 
27 harry brignull and yvonne rogers, “enticing people to interact with large public displays in public spaces,” interact ’03, proceedings of the international conference on human-computer interaction, zurich, switzerland, september 1-5, 2003, 17-24, matthias rauterberg, marino menozzi, and janet wesson (eds.), tokyo: ios press, 2003. http://www.idemployee.id.tue.nl/g.w.m.rauterberg/conferences/interact2003/interact200 3-p17.pdf. 28 peltonen et al., “it’s mine, don't touch!” 29 peltonen et al., “it’s mine, don't touch!” 30 anne horn, bernadette lingham, and sue owen, “library learning spaces in the digital age,” proceedings of the 35th annual international association of scientific and technological university libraries conference, espoo, finland, june 2-5, 2014. http://docs.lib.purdue.edu/iatul/2014/libraryspace/2. https://doi.org/10.1093/iwc/iws023 https://pdfs.semanticscholar.org/533a/4ef7780403e8072346d574cf288e89fc442d.pdf https://doi.org/10.1145/1978942.1979391 http://www.idemployee.id.tue.nl/g.w.m.rauterberg/conferences/interact2003/interact2003-p17.pdf http://www.idemployee.id.tue.nl/g.w.m.rauterberg/conferences/interact2003/interact2003-p17.pdf http://docs.lib.purdue.edu/iatul/2014/libraryspace/2 abstract introduction method literature review definitions interactivity user engagement age display content social context findings technical and hardware landscape users and use cases figure 1. audience types for digital displays in the study population. content types and management middleware, automation, and exhibit management sources of content content creation guidelines content scheduling staffing and skills challenges and successes perceptions discussion figure 6. types of digital displays in the study population. limitations conclusion appendix a. 
article explainable artificial intelligence (xai) adoption and advocacy michael ridley information technology and libraries | june 2022 https://doi.org/10.6017/ital.v41i2.14683 michael ridley (mridley@uoguelph.ca) is librarian, university of guelph. © 2022. abstract the field of explainable artificial intelligence (xai) advances techniques, processes, and strategies that provide explanations for the predictions, recommendations, and decisions of opaque and complex machine learning systems. increasingly academic libraries are providing library users with systems, services, and collections created and delivered by machine learning. academic libraries should adopt xai as a tool set to verify and validate these resources, and advocate for public policy regarding xai that serves libraries, the academy, and the public interest. introduction explainable artificial intelligence (xai) is a subfield of artificial intelligence (ai) that provides explanations for the predictions, recommendations, and decisions of intelligent systems.1 machine learning is rapidly becoming an integral part of academic libraries. xai is a set of techniques, processes, and strategies that libraries should adopt and advocate for to ensure that machine learning appropriately serves librarianship, the academy, and the public interest. knowingly or not, libraries acquire and provide access to systems, services, and collections infused and directed by machine learning methods, and library users are engaged in information behavior (e.g., seeking, using, managing) facilitated or augmented by machine learning.
machine learning in library and information science (lis), as with many other fields, has become ubiquitous. however, this technology is often opaque and complex, yet consequential. there are significant concerns about bias, unfairness, and veracity.2 there are troubling questions about user agency and power imbalances.3 while lis has a long-standing interest in ai and intelligent information systems generally,4 it has only recently turned its attention to xai and how it affects the field and how the field might influence it.5 xai is a critical lens through which to view machine learning in libraries. it is also a set of techniques, processes, and strategies essential to influencing and shaping this still emerging technology: research libraries have a unique and important opportunity to shape the development, deployment, and use of intelligent systems in a manner consistent with the values of scholarship and librarianship. the area of explainable artificial intelligence is only one component of this, but in many ways, it may be the most important.6 dismissing engagement with xai because it is “highly technical and impenetrable to those outside that community” is neither acceptable nor increasingly possible.7 artificial intelligence is the essential substrate of contemporary information systems and xai is a tool set for critical assessment and accountability. the details matter and must be understood if libraries are to have a place at the table as xai, and machine learning, evolves and further deepens its effect on lis. this paper provides an overview of xai with key definitions, a historical context, and examples of xai techniques, strategies, and processes that form the basis of the field. it considers areas where xai and academic libraries intersect.
the dual emphasis is on xai as a toolset for libraries to adopt and xai as an area for public policy advocacy. what is xai? xai is plagued by definitional problems.8 some definitions are focused solely and narrowly on the technical concepts while others focus only on the broad social and political dimensions. lacking “a theory of explainable ai, with a formal and universally agreed definition of what explanations are,”9 the fundamentals of this field are still being explored, often from different disciplinary perspectives.10 critical algorithm studies position machine learning as socio-techno-informational systems.11 as such, a definition of xai must encompass not just the techniques, as important and necessary as they are, but also the context within which xai operates. the us defense advanced research projects agency (darpa) description of xai captures the breadth and scope of the field. the purpose of xai is for ai systems to have “the ability to explain their rationale, characterize their strengths and weaknesses, and convey an understanding of how they will behave in the future” 12 and to “enable human users to understand, appropriately trust, and effectively manage the emerging generation of artificially intelligent partners.”13 xai is needed to: 1. generate trust, transparency, and understanding; 2. ensure compliance with regulations and legislation; 3. mitigate risk; 4. generate accountable, reliable, and sound models for justification; 5. minimize or mitigate bias, unfairness, and misinterpretation in model performance and interpretation; and 6. validate models and validate explanations generated by xai.14 xai consists of testable and unambiguous proofs, various verification and validation methods that assess influence and veracity, and authorizations that define requirements or mandate auditing within a public policy framework. xai is not a new consideration. 
explainability has been a preoccupation of computer science since the early days of expert systems in the late twentieth century.15 however, the 2018 introduction of the general data protection regulation (gdpr) by the european union (eu) shifted explainability from a purely technical issue to one with an additional and urgent focus on public policy.16 while the presence of a “right to explanation” in the gdpr is highly contested,17 industry groups and jurisdictions beyond the eu recognized its inevitability, spurring an explosion in xai research and development.18 types of xai taxonomies of xai types are classified based on their scope and mechanism.19 local explanations interpret the decisions of a machine learning model used in a specific instance (i.e., involving data and context relevant to the circumstance). global explanations interpret the model more generally (i.e., involving all the training data and relevant contexts). in black-box or model-agnostic explanations, only the input and the output of the machine learning model are required while white-box or model-specific explanations require more detailed information regarding the processing or design of the model. another way to categorize xai is as proofs, validations, and authorizations. proofs are testable, traceable, and unambiguous explanations demonstrable through causal links, logic statements, or transparent processes. typically, proofs are only available for ai systems that use “inherently interpretable” techniques such as rules, decision trees, or linear regressions.20 validations are explanations that confirm the veracity of the ai system. these verifications occur through testing procedures, reproducibility, approximations and abstractions, and justifications.
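to make the idea of a proof concrete, the following is a minimal python sketch of a local explanation for an inherently interpretable linear model. the feature names, weights, and intercept are invented for illustration and do not come from any real library system; the point is only that each feature's contribution can be stated exactly and the contributions sum to the prediction.

```python
# hypothetical linear scoring model: the weights and feature names below are
# invented for illustration and are not taken from any real system.
WEIGHTS = {
    "times_borrowed": 0.5,
    "years_since_publication": -0.2,
    "on_course_reserve": 2.0,
}
INTERCEPT = 1.0


def predict(item):
    """Score an item as intercept + sum(weight * feature value)."""
    return INTERCEPT + sum(WEIGHTS[name] * item[name] for name in WEIGHTS)


def explain(item):
    """Return each feature's exact additive contribution to the score."""
    contributions = {name: WEIGHTS[name] * item[name] for name in WEIGHTS}
    contributions["(intercept)"] = INTERCEPT
    return contributions


item = {"times_borrowed": 4, "years_since_publication": 10, "on_course_reserve": 1}
score = predict(item)
parts = explain(item)

for name, value in parts.items():
    print(f"{name}: {value:+.2f}")
print(f"score: {score:.2f}")
```

because the explanation is the model's own arithmetic, it is testable and traceable in the sense used above; an opaque model offers no such decomposition, which is why validations are needed instead.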
authorizations are explanations because of processes in which third parties provide some form of standard, ratification, prohibition, or audit. authorizations might pertain to the ai model, its operation in specific instances, or even the process by which the ai was created. they can be provided by professional groups, nongovernmental organizations, governments and government agencies, and third parties in the public and private sector. academic libraries can adopt proofs and validations as means to interrogate information systems and resources. this includes collections which are increasingly machine learning systems themselves or developed with machine learning methods. the recognition of “collections as data” is an important shift in this direction.21 where appropriate, proofs and validations should accompany content and systems derived from machine learning. libraries must also engage with xai as authorizations to assess the public policy implications that exist, are emergent, or are necessary. library advocacy is currently lacking in this area. the requirement for policy and governance frameworks is a reminder that machine learning is “far from being purely mechanistic, it is deeply, inescapably human”22 and that while complex and opaque “the ‘black box’ is full of people.”23 prerequisites to an xai strategy three questions are important for any xai strategy: • what constitutes a good explanation? • who is the explanation for? • how will the explanation be provided? explanations are context specific. the “goodness” of an explanation is dependent on the needs and objectives of the explainee (a user) and the explainer (an xai). 
following research from the fields of psychology and cognitive science, keil suggests five reasons for why someone wants an explanation: (1) to predict similar events in the future, (2) to diagnose, (3) to assess blame or guilt, (4) to justify or rationalize an action, and (5) for aesthetic pleasure.24 for most people, explanations need not be complete or even fully accurate.25 as a result, who the explanation is for is critical to a good explanation. different audiences have different priorities. system developers are primarily interested in performance explanations while clients focus on effectiveness or efficacy, professionals are concerned about veracity, and regulators are interested in policy implications. nonexpert, lay users of a system want explanations that build trust and provide accountability. a good explanation is also affected by its presentation. there are temporal and format considerations. explanations can be provided or available in real time and continuously as the process occurs (hence partial explanations) or post hoc and in summary form. interactive explanations are widely preferred but are not always appropriate or actionable.26 studies have compared textual, visual, and multimodal formats with differing results. familiar textual responses or simple visual explanations such as venn diagrams are often most effective for nonexpert users.27 drawing from philosophy, psychology, and cognitive science, miller recommends four approaches for xai.28 explanations are contrastive. when people want to know the “why” of something, “people do not ask why event p happened, but rather why event p happened instead of some event q.” explanations are selected. “humans are adept at selecting one or two causes from a sometimes infinite number of causes to be the explanation.” explanations are social.
“they are a transfer of knowledge, presented as part of a conversation or interaction, and are thus presented relative to the explainer’s beliefs about the explainee’s beliefs.” finally, miller cautions against using probabilities and statistical relationships and encourages references to causes. burrell identifies three key barriers to explainability: concealment, the limited technical understanding of the user, and an incompatibility between the user (human) and algorithmic reasoning.29 while concealment is deliberate, it may or may not be justified. protecting ip and trade secrets is acceptable while obscuring processes to purposively deceive users is not. regulations are a tool to moderate the former and minimize the latter. the technical limitations of users and the incompatibility between users and algorithms suggest two remedies. first is enhancing algorithmic literacy. algorithmic literacy is “a set of competencies that enables individuals to critically evaluate ai technologies; communicate and collaborate effectively with ai; and use ai as a tool online, at home, and in the workplace.”30 libraries have a key role in advancing algorithmic literacy in their communities.31 just as libraries championed information literacy through the promulgation of standards and principles, the provision of diverse educational programming, and the engagement of the broad academic community, so too can libraries be central to efforts to enhance algorithmic literacy. second is a requirement that xai must be sensitive to the abilities and needs of different users. a survey of the key challenges and research direction of xai identified 39 issues, including the need to understand and enhance the user experience, match xai to user expertise, and explain the competencies of ai systems to users.32 this is the essence of human-centered explainable ai (hcxai).
among hcxai principles are the importance of context (regarding user objectives, decision consequences, timing, modality, and intended audience), the value of using hybrid explanation methods that complement and extend each other, and the power of contrastive examples and approaches.33 proofs and validations xai that provide proofs or validations can be adopted by libraries to assess and evaluate machine learning utilized in systems, services, and collections. since proofs pertain to already interpretable systems, the four examples provided focus on validations: feature audit, approximation and abstraction, reproducibility, and xai by ai. these techniques may require access to, or information about, the machine learning model. this would include such characteristics as the algorithms used, settings of the parameters and hyperparameters, optimization choices, and the training data. while all these may not be normally available, designers of machine learning systems in consequential settings should expect to provide, indeed be required to provide, such access. similarly, vendors of library content or systems utilizing machine learning should make explanatory proofs and validations available for library inspection. feature audit feature audit is an explanatory strategy that attempts to reveal the key features (e.g., characteristics of the data or settings of the hyperparameters used to differentiate the data) that have a primary role in the prediction of the algorithm. by isolating these features, it is possible to explain the key components of the decision. feature audit is a standard technique of linear regression, but it is made more difficult in machine learning because of the complexity of the information space (e.g., billions of parameters and high dimensionality).
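one simple, model-agnostic way to approximate a feature audit is permutation: shuffle one feature at a time and measure how much the model's output moves. the python sketch below is illustrative only; `black_box` is a hypothetical stand-in for an opaque model, the data are random, and permutation is just one of many possible audit techniques, not the method of any particular tool.

```python
import random


def black_box(row):
    """Stand-in for an opaque model; the audit only calls it, never inspects it."""
    return 3.0 * row[0] + 0.5 * row[1]  # row[2] is deliberately ignored


def permutation_audit(model, rows, n_features, seed=0):
    """Mean absolute change in model output when one feature column is shuffled."""
    rng = random.Random(seed)
    baseline = [model(r) for r in rows]
    scores = []
    for j in range(n_features):
        column = [r[j] for r in rows]
        rng.shuffle(column)
        # rebuild every row with feature j replaced by a shuffled value
        shuffled_rows = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
        changes = [abs(model(s) - b) for s, b in zip(shuffled_rows, baseline)]
        scores.append(sum(changes) / len(rows))
    return scores


data_rng = random.Random(42)
rows = [[data_rng.random() for _ in range(3)] for _ in range(200)]
importance = permutation_audit(black_box, rows, n_features=3)

for j, value in enumerate(importance):
    print(f"feature {j}: mean output change {value:.3f}")
```

a feature whose shuffling leaves the output unchanged plays no role in the prediction, while a large change flags a feature worth scrutinizing, for example one that acts as a proxy for a protected characteristic. shuffling one column at a time treats features as independent, so a sketch like this will not detect influence that operates through correlated features.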
there are various feature audit techniques34 but all of them are “decompositional” in that they attempt to reduce the work of the algorithm to its component parts and then use those results as an explanation.35 feature audit can highlight bias or inaccuracy by revealing incongruence between the data and the prediction. more advanced feature audit techniques (e.g., gradient feature auditing) recognize that features can indirectly influence other features and that these features are not easily detectable as separate, influential elements.36 this interaction among features challenges the strict decompositional approach to feature audit and will likely lead to an increased focus on the relational analysis among and between elements. approximation and abstraction approximation and abstraction are techniques that create a more simplified model to explain the more complex model.37 people seek and accept explanations that “satisfice”38 and are coherent with existing beliefs.39 this recognizes that “an explanation has greater power than an alternative if it makes what is being explained less surprising.”40 approaches such as “model distillation”41 or the “model agnostic” feature reduction of the local interpretable model-agnostic explanations (lime) tool create a simplified presentation of the algorithmic model.42 this approximation or abstraction may compromise accuracy, but it provides an accessible representation that enhances understandability. a different type of approximation or abstraction is a narrative of the machine learning processes utilized that provides sufficient documentation for a reader to act as an explanation of the outcomes. 
an exemplary case of this is lithium-ion batteries: a machine-generated summary of current research published by springer nature and written by beta writer, an ai or more accurately a suite of algorithms.43 a collaboration of machine learning and human editors, the full production cycle of the book is documented in the introduction.44 in lieu of being able to interrogate the system directly, this detailed account provides an explanation of the system allowing readers to assess the strengths, limitations, and confidence levels of the algorithmic processes and offers a model of what might be necessary for future ai generated texts.45 libraries can utilize this documentation in acquisition or licensing decisions and subsequently make it available as user guides when resources are added to the collection. reproducibility replication is a verification strategy fundamental to science. being able to independently reproduce results in different settings provides evidence of veracity and supports user trust. however, documented problems in reproducing machine learning studies have questioned the generalizability of these approaches and undermined their explanatory capacity.
for example, an analysis of text mining studies using machine learning for citation screening in the preparation of systematic reviews revealed a lack of key elements to enable replicability (e.g., access to research datasets, software environments used, randomization control, and lack of detail on new methods proposed or employed).46 in response, a “reproducibility challenge” was created by the international conference on learning representations (iclr) to validate 2018 conference submissions and has continued in subsequent meetings.47 more rigorous replication through the availability of all necessary components and the development of standards will be important to this type of verification.48 xai by ai the inherent complexity and opacity of unsupervised learning or reinforcement learning suggests, as xai researcher trevor darrell puts it, “the solution to explainable ai is more ai.”49 in this approach to explanation, oversight ai are positioned as intermediaries between an ai and its users: workers have supervisors; businesses have accountants; schoolteachers have principals. we suggest that the time has come to develop ai oversight systems (“ai guardians”) that will seek to ensure that the various smart machines will not stray from the guidelines their programmers have provided.50 while the prospect of ai guardians may be dystopic, oversight systems performing roles that validate, interrogate, and report are common in code checking tools. generative adversarial networks (gans) have been used to create counterfactual explanations of another machine learning model to enhance explainability.51 with strategic organizational and staffing changes to enhance capabilities, libraries can design and deploy such oversight or adversarial tools with objectives appropriate to the requirements and norms of libraries and the academy.
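the idea of a counterfactual explanation can be shown without a gan. the python sketch below swaps in a much simpler technique, a step-by-step search over one feature, purely to illustrate what a counterfactual is; the `classifier`, its threshold, and the applicant record are all hypothetical and assumed for the example.

```python
def classifier(applicant):
    """Hypothetical opaque decision rule, treated as a black box by the search."""
    score = 2 * applicant["gpa"] + applicant["references"]
    return "admit" if score >= 9 else "reject"


def counterfactual(model, instance, feature, step, max_steps=100):
    """Increase one feature in fixed steps until the model's decision changes."""
    original = model(instance)
    candidate = dict(instance)  # copy so the original record is untouched
    for _ in range(max_steps):
        candidate[feature] = candidate[feature] + step
        if model(candidate) != original:
            return candidate
    return None  # no counterfactual found within the search budget


applicant = {"gpa": 3.0, "references": 2}
flipped = counterfactual(classifier, applicant, "references", step=1)

print("original decision:", classifier(applicant))
print("counterfactual record:", flipped)
```

the result reads as a contrastive explanation of the kind miller describes: the decision was "reject" rather than "admit" because the record had two references instead of three. a search over a single feature is far cruder than the adversarial methods cited above and will miss counterfactuals that require changing several features at once.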
authorization xai that results from authorizations is an area where public policy engagement is needed to ensure xai, and machine learning, are appropriately serving libraries, the academy, and the public at large. three examples are provided: codes and standards, regulation, and audit. codes and standards one approach to explanation, supported by the ai industry and professional organizations, is the use of voluntary codes or standards that encourage explanatory capabilities. these nonbinding principles are a type of self-regulation and are widely promoted as a means of assurance.52 the association for computing machinery’s statement on algorithms highlights seven principles as guides to system design and use: awareness, access and redress, accountability, explanation, data provenance, auditability, and validation and testing. however, the language used is tentative and conditional. designers are “encouraged” to provide explanations and to “encourage” a means for interrogation and auditing “where harm is suspected” (i.e., a post hoc process). despite this, the statement concludes with a strong position on accountability if not explainability: “institutions should be held responsible for decisions made by the algorithms that they use, even if it is not feasible to explain in detail how the algorithms produce their results.”53 unfortunately, the optimism for self-regulation in explainability is undercut by the poor experience with voluntary mechanisms regarding privacy protection.54 in addition, library associations, library system vendors, and scholarly publishers have been slow to endorse any codes or standards regarding explainability. regulation the most common recommendation for ai oversight and authorization to ensure explainability is the creation of a regulatory agency.
specific suggestions include a “neutral data arbiter” with investigative powers like the us federal trade commission,55 a food and drug administration “for algorithms,”56 a standing “commission on artificial intelligence,”57 quasi-governmental agencies such as the council of europe,58 and a hybrid agency model combining certification and liability.59 such agencies would have legislated or delegated powers to investigate, certify, license, and arbitrate on matters relating to ai and algorithms, including their design, use, and effects. there are few calls for an international regulatory agency despite digitally porous national boundaries and the global reach of machine learning.60 that almost no such agencies have been created reveals the strength and influence of the large corporations responsible for developing and deploying most machine learning tools and systems.61 reports comparing regulatory approaches to ai among the european union, the united kingdom, the united states, and canada indicate significantly different approaches but with most proceeding with a “light touch” to avoid competitive disadvantages in a multitrillion dollar global marketplace.62 the introduction of the draft eu artificial intelligence act marks the first major jurisdiction to propose specific ai legislation.63 while the act is fulsome about high-risk ai, it is silent on any notion of “explainable” ai, preferring to focus on the less specific idea of “trustworthy artificial intelligence.” with this the eu appears to retreat from the idea of explainability in the gdpr. an exception to this inertia or backtracking is the development and use of algorithmic impact assessments in both governments and industry. 
these instruments help prospective users of an algorithmic decision-making system determine levels of explanatory requirements and standards to meet those requirements.64 canada has been a leader in this area with a protocol covering use of these systems in the federal government.65 some identify due process as a possible, if limited, remedy for explainability.66 however, a landmark us case suggests otherwise. in state v. loomis, regarding the use of compas, an algorithmic sentencing system, the court ruled on the role of explanation in due process:67 the wisconsin supreme court held that a trial court’s use of an algorithmic risk assessment in sentencing did not violate the defendant’s due process rights even though the methodology used to produce the assessment was disclosed neither to the court nor to the defendant.68 the petition of the loomis case to the us supreme court was denied, so a higher court ruling on this issue is unavailable.69 advocacy for regulations regarding explainability should be a central concern for libraries. without strong regulatory oversight requiring disclosure and accountability, machine learning systems will remain black boxes and the presence of these consequential systems in the lives of users will be obscured. audit a commonly recommended approach to ai oversight and explanation is third-party auditing.70 the use of audit and principles of auditing are widely accepted in a variety of areas.71 in a library context, auditing of ai can be thought of as a reviewing process to achieve transparency or to determine product compliance. auditing is typically done after system implementation, but it can be accomplished at any stage.
it is possible to audit design specifications, completed code, or cognitive models, or to conduct periodic audits of specific decisions.72 the keys to successful audit oversight are clear audit goals and objectives (e.g., what is being audited and for what purpose), acknowledged expertise of the auditors, authority of the auditors to recommend, and authorization of the auditors to investigate. any such auditing responsibility for xai would require the trust of stakeholders such as ai designers, government regulators, and industry representatives, as well as users themselves. critics of the audit approach have focused on lack of auditor expertise, algorithmic complexity, and the need for approaches that assess the algorithmic system prior to its release.73 while most audit recommendations assume a public agency in this role, an innovative suggestion is a crowdsourced audit (a form of audit study that involves the recruitment of testers to anonymously assess an algorithmic system; an xai form of the “secret shopper”).74 this approach resembles techniques used by consumer advocates and might indicate the entry of public activists into the xai arena. the complexity of algorithms suggests that a precondition for an audit is “auditability.”75 this would require that ai be designed in such a way that an audit is possible (i.e., inspectable in some manner) while, presumably, not impairing its predictive performance. sandvig et al. propose regulatory changes because “rather than regulating for transparency or misbehavior, we find this situation argues for ‘regulation toward auditability’.”76 auditing is not without its difficulties.
there are no industry standards for algorithmic auditing.77 a high-profile development was the recent launch of orcaa (orcaarisk.com), an algorithmic auditing company started by cathy o’neil, a data scientist who has written extensively about the perils of uncontrolled algorithms.78 however, the legitimacy of third-party auditing has been criticized as lacking public transparency and the capacity to demand change.79 while libraries may not be able to create their own auditing capacity, whether collectively or individually, they are encouraged to engage with the emerging algorithmic auditing community to shape auditing practices appropriate for scholarly communication.

xai as discovery

while xai is primarily a means to validate and authorize machine learning systems, another use of xai is gaining attention. since xai can find new information latent in large and complex datasets, discovery is promoted as “one of the most important achievements of the entire algorithmic explainability project.”80 alkhateeb asks “can scientific discovery really be automated” while invoking the earlier work of swanson, who mined the medical literature for new knowledge by connecting seemingly unrelated articles through search.81 an emerging reason for libraries to adopt xai may be its potential as a powerful discovery tool.

conclusion

our lives have become “algorithmically mediated”82 where we are “dependent on computational spectacles to see the world.”83 academic libraries are now sites where systems, services, and collections are increasingly shaped and provided by machine learning. the predictions, recommendations, and decisions of machine learning systems are powerful as well as consequential.
however, “the danger is not so much in delegating cognitive tasks, but in distancing ourselves from—or in not knowing about—the nature and precise mechanisms of that delegation.”84 taddeo notes that “delegation without supervision characterises the presence of trust.”85 xai is an essential tool to build that trust. geoffrey hinton, a central figure in the development of machine learning,86 argues that requiring an explanation from an ai system would be “a complete disaster” and that trust and acceptance should be based on the system’s performance, not its explainability.87 this is consistent with the view of many that “if algorithms that cannot be easily explained consistently make better decisions in certain areas, then policymakers should not require an explanation.”88 both these views are at odds with the tenets of critical thought and assessment, and both challenge norms of algorithmic accountability. xai is a dual opportunity for libraries. on one hand, it is a set of techniques, processes, and strategies that enable the interrogation of the algorithmically driven resources that libraries provide to their users. on the other hand, it is a public policy arena where advocacy is necessary to promote and uphold the values of librarianship, the academy, and the public interest in the face of powerful new technologies. many disciplines have engaged with xai as machine learning has impacted their fields.89 xai has been called a “disruptive force” in lis,90 warranting the growing interest in how xai affects the field and how the field might influence it.

endnotes

1 vijay arya et al., “one explanation does not fit all: a toolkit and taxonomy of ai explainability techniques,” arxiv:1909.03012 [cs, stat], 2019, http://arxiv.org/abs/1909.03012; shane t.
mueller et al., “explanation in human-ai systems: a literature meta-review, synopsis of key ideas and publications, and bibliography for explainable ai,” arxiv:1902.01876 [cs], 2019, http://arxiv.org/abs/1902.01876; ingrid nunes and dietmar jannach, “a systematic review and taxonomy of explanations in decision support and recommender systems,” user modeling and user-adapted interaction 27, no. 3 (2017): 393–444, https://doi.org/10.1007/s11257-017-9195-0; gesina schwalbe and bettina finzel, “xai method properties: a (meta-) study,” arxiv:2105.07190 [cs], 2021, http://arxiv.org/abs/2105.07190. 2 safiya noble, algorithms of oppression: how search engines reinforce racism (new york: new york university press, 2018); frank pasquale, the black box society: the secret algorithms that control money and information (cambridge, mass.: harvard university press, 2015); sara wachter-boettcher, technically wrong: sexist apps, biased algorithms, and other threats of toxic tech (new york: w. w. norton, 2017). 3 abeba birhane et al., “the values encoded in machine learning research,” arxiv:2106.15590 [cs], 2021, http://arxiv.org/abs/2106.15590; taina bucher, if ... then: algorithmic power and politics (new york: oxford university press, 2018); sarah myers west, meredith whittaker, and kate crawford, discriminating systems: gender, race, and power in ai (ai now institute, 2019), https://ainowinstitute.org/discriminatingsystems.html. 4 rao aluri and donald e. riggs, “application of expert systems to libraries,” ed. joe a. hewitt, advances in library automation and networking 2 (1988): 1–43; ryan cordell, machine learning + libraries: a report on the state of the field (washington dc: library of congress, 2020), https://labs.loc.gov/static/labs/work/reports/cordell-loc-ml-report.pdf; jason griffey, ed., “artificial intelligence and machine learning in libraries,” library technology reports 55, no. 
1 (2019), https://doi.org/10.5860/ltr.55n1; guoying liu, “the application of intelligent agents in libraries: a survey,” program: electronic library and information systems 45, no. 1 (2011): 78–97, https://doi.org/10.1108/00330331111107411; linda c. smith, “artificial intelligence in information retrieval systems,” information processing and management 12, no. 3 (1976): 189–222, https://doi.org/10.1016/0306-4573(76)90005-4. 5 jenny bunn, “working in contexts for which transparency is important: a recordkeeping view of explainable artificial intelligence (xai),” records management journal (london, england) 30, no. 2 (2020): 143–53, https://doi.org/10.1108/rmj-08-2019-0038; cordell, “machine learning + libraries”; andrew m. cox, the impact of ai, machine learning, automation and robotics on the information professions (cilip, 2021), http://www.cilip.org.uk/resource/resmgr/cilip/research/tech_review/cilip_–_ai_report_-_final_lo.pdf; daniel johnson, machine learning, libraries, and cross-disciplinary research: possibilities and provocations (notre dame, indiana: hesburgh libraries, university of notre dame, 2020), https://dx.doi.org/10.7274/r0-wxg0-pe06; sarah lippincott, mapping the current landscape of research library engagement with emerging technologies in research and learning (washington dc: association of research libraries, 2020), https://www.arl.org/wp-content/uploads/2020/03/2020.03.25-emerging-technologies-landscape-summary.pdf; thomas padilla, responsible operations. data science, machine learning, and ai in libraries (dublin, oh: oclc research, 2019), https://doi.org/10.25333/xk7z-9g97; michael ridley, “explainable artificial intelligence,” research library issues, no. 299 (2019): 28–46, https://doi.org/10.29242/rli.299.3. 6 ridley, “explainable artificial intelligence,” 42. 7 bunn, “working in contexts for which transparency is important,” 151. 8 sebastian palacio et al., “xai handbook: towards a unified framework for explainable ai,” arxiv:2105.06677 [cs], 2021, http://arxiv.org/abs/2105.06677; sahil verma et al., “pitfalls of explainable ml: an industry perspective,” in mlsys journe workshop, 2021, http://arxiv.org/abs/2106.07758; giulia vilone and luca longo, “explainable artificial intelligence: a systematic review,” arxiv:2006.00093 [cs], 2020, http://arxiv.org/abs/2006.00093. 9 wojciech samek and klaus-robert muller, “towards explainable artificial intelligence,” in explainable ai: interpreting, explaining and visualizing deep learning, ed. wojciech samek et al., lecture notes in artificial intelligence 11700 (cham: springer international publishing, 2019), 17. 10 mueller et al., “explanation in human-ai systems.” 11 isto huvila et al., “information behavior and practices research informing information systems design,” journal of the association for information science and technology, 2021, 1–15, https://doi.org/10.1002/asi.24611. 12 darpa, explainable artificial intelligence (xai) (arlington, va: darpa, 2016), http://www.darpa.mil/attachments/darpa-baa-16-53.pdf.
13 matt turek, “explainable artificial intelligence (xai),” darpa, https://www.darpa.mil/program/explainable-artificial-intelligence. 14 julie gerlings, arisa shollo, and ioanna constantiou, “reviewing the need for explainable artificial intelligence (xai),” in proceedings of the hawaii international conference on system sciences, 2020, http://arxiv.org/abs/2012.01007. 15 william j. clancey, “the epistemology of a rule-based expert system—a framework for explanation,” artificial intelligence 20, no. 3 (1983): 215–51, https://doi.org/10.1016/0004-3702(83)90008-5; william swartout, “xplain: a system for creating and explaining expert consulting programs,” artificial intelligence 21 (1983): 285–325; william swartout, cecile paris, and johanna moore, “design for explainable expert systems,” ieee expert-intelligent systems & their applications 6, no. 3 (1991): 58–64, https://doi.org/10.1109/64.87686. 16 european union, “regulation (eu) 2016/679 of the european parliament and of the council of 27 april 2016,” 2016, http://eur-lex.europa.eu/legal-content/en/txt/?uri=celex:32016r0679. 17 lilian edwards and michael veale, “slave to the algorithm?
why a ‘right to explanation’ is probably not the remedy you are looking for,” duke law & technology review 16 (2017): 18–84; bryce goodman and seth flaxman, “european union regulations on algorithmic decision making and a ‘right to explanation’,” ai magazine 38, no. 3 (2017): 50–57, https://doi.org/10.1609/aimag.v38i3.2741; margot e. kaminski, “the right to explanation, explained,” berkeley technology law journal 34, no. 1 (2019): 189–218, https://doi.org/10.15779/z38td9n83h; sandra wachter, brent mittelstadt, and luciano floridi, “why a right to explanation of automated decision-making does not exist in the general data protection regulation,” international data privacy law 7, no. 2 (2017): 76–99, https://doi.org/10.1093/idpl/ipx005. 18 amina adadi and mohammed berrada, “peeking inside the black-box: a survey on explainable artificial intelligence (xai),” ieee access 6 (2018): 52138–60, https://doi.org/10.1109/access.2018.2870052; mueller et al., “explanation in human-ai systems”; vilone and longo, “explainable artificial intelligence.” 19 schwalbe and finzel, “xai method properties.” 20 or biran and courtenay cotton, “explanation and justification in machine learning: a survey” (international joint conference on artificial intelligence, workshop on explainable artificial intelligence (xai), melbourne, 2017), http://www.cs.columbia.edu/~orb/papers/xai_survey_paper_2017.pdf. 21 padilla, responsible operations. 22 jenna burrell and marion fourcade, “the society of algorithms,” annual review of sociology 47, no. 1 (2021): 231, https://doi.org/10.1146/annurev-soc-090820-020800. 23 nick seaver, “seeing like an infrastructure: avidity and difference in algorithmic recommendation,” cultural studies 35, no. 4–5 (2021): 775, https://doi.org/10.1080/09502386.2021.1895248. 24 frank c. keil, “explanation and understanding,” annual review of psychology 57 (2006): 227– 54, https://doi.org/10.1146/annurev.psych.57.102904.190100. 25 donald a. 
norman, “some observations on mental models,” in mental models, ed. dedre gentner and albert l. stevens (new york: psychology press, 1983), 7–14. 26 ashraf abdul et al., “trends and trajectories for explainable, accountable, and intelligible systems: an hci research agenda,” in proceedings of the 2018 chi conference on human factors in computing systems, chi ’18 (new york: acm, 2018), 582:1–582:18, https://doi.org/10.1145/3173574.3174156; joachim diederich, “methods for the explanation of machine learning processes and results for non-experts,” psyarxiv, 2018, https://doi.org/10.31234/osf.io/54eub. 27 pigi kouki et al., “user preferences for hybrid explanations,” in proceedings of the eleventh acm conference on recommender systems, recsys ’17 (new york, ny: acm, 2017), 84–88, https://doi.org/10.1145/3109859.3109915. 28 tim miller, “explanation in artificial intelligence: insights from the social sciences,” artificial intelligence 267 (2019): 3, https://doi.org/10.1016/j.artint.2018.07.007. 29 jenna burrell, “how the machine ‘thinks’: understanding opacity in machine learning algorithms,” big data & society 3, no. 1 (2016), https://doi.org/10.1177/2053951715622512. 30 duri long and brian magerko, “what is ai literacy?
competencies and design considerations,” in proceedings of the 2020 chi conference on human factors in computing systems, chi ’20 (honolulu, hi: association for computing machinery, 2020), 2, https://doi.org/10.1145/3313831.3376727. 31 michael ridley and danica pawlick-potts, “algorithmic literacy and the role for libraries,” information technology and libraries 40, no. 2 (2021), https://doi.org/10.6017/ital.v40i2.12963. 32 waddah saeed and christian omlin, “explainable ai (xai): a systematic meta-survey of current challenges and future opportunities,” arxiv:2111.06420 [cs], 2021, http://arxiv.org/abs/2111.06420. 33 shane t. mueller et al., “principles of explanation in human-ai systems” (explainable agency in artificial intelligence workshop, aaai 2021), http://arxiv.org/abs/2102.04972. 34 sebastian bach et al., “on pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation,” plos one 10, no. 7 (2015): e0130140, https://doi.org/10.1371/journal.pone.0130140; biran and cotton, “explanation and justification in machine learning: a survey”; chris brinton, “a framework for explanation of machine learning decisions” (ijcai-17 workshop on explainable ai (xai), melbourne: ijcai, 2017), http://www.intelligentrobots.org/files/ijcai2017/ijcai-17_xai_ws_proceedings.pdf; chris olah, alexander mordvintsev, and ludwig schubert, “feature visualization,” distill, november 7, 2017, https://doi.org/10.23915/distill.00007. 35 edwards and veale, “slave to the algorithm?” 36 philip adler et al., “auditing black-box models for indirect influence,” knowledge and information systems 54 (2018): 95–122, https://doi.org/10.1007/s10115-017-1116-3. 37 alisa bokulich, “how scientific models can explain,” synthese 180, no. 1 (2011): 33–45, https://doi.org/10.1007/s11229-009-9565-1; keil, “explanation and understanding.” 38 herbert a. simon, “what is an ‘explanation’ of behavior?,” psychological science 3, no.
3 (1992): 150–61, https://doi.org/10.1111/j.1467-9280.1992.tb00017.x. 39 norbert schwarz et al., “ease of retrieval as information: another look at the availability heuristic,” journal of personality and social psychology 61, no. 2 (1991): 195–202, https://doi.org/10.1037/0022-3514.61.2.195; paul thagard, “evaluating explanations in law, science, and everyday life,” current directions in psychological science 15, no. 3 (2006): 141–45, https://doi.org/10.1111/j.0963-7214.2006.00424.x. 40 tania lombrozo, “explanatory preferences shape learning and inference,” trends in cognitive sciences 20, no. 10 (2016): 756, https://doi.org/10.1016/j.tics.2016.08.001. 41 sarah tan et al., “detecting bias in black-box models using transparent model distillation,” arxiv:1710.06169 [cs, stat], november 18, 2017, http://arxiv.org/abs/1710.06169. 42 marco tulio ribeiro, sameer singh, and carlos guestrin, “model-agnostic interpretability of machine learning,” arxiv:1606.05386 [cs, stat], 2016, http://arxiv.org/abs/1606.05386. 43 beta writer, lithium-ion batteries: a machine-generated summary of current research (heidelberg: springer nature, 2019), https://link.springer.com/book/10.1007/978-3-030-16800-1.
44 henning schoenenberger, christian chiarcos, and niko schenk, preface to lithium-ion batteries; a machine-generated summary of current research, by beta writer, (heidelberg: springer international publishing, 2019). 45 michael ridley, “machine information behaviour,” in the rise of ai: implications and applications of artificial intelligence in academic libraries, ed. sandy hervieux and amanda wheatley (association of college and university libraries, 2022). 46 babatunde kazeem olorisade, pearl brereton, and peter andras, “reproducibility of studies on text mining for citation screening in systematic reviews: evaluation and checklist,” journal of biomedical informatics 73 (2017): 1–13, https://doi.org/10.1016/j.jbi.2017.07.010; babatunde k. olorisade, pearl brereton, and peter andras, “reproducibility in machine learning-based studies: an example of text mining,” in reproducibility in ml workshop (international conference on machine learning, sydney, australia, 2017), https://openreview.net/pdf?id=by4l2pbq-. 47 joelle pineau, “reproducibility challenge,” october 6, 2017, http://www.cs.mcgill.ca/~jpineau/iclr2018-reproducibilitychallenge.html. 48 benjamin haibe-kains et al., “transparency and reproducibility in artificial intelligence,” nature 586, no. 7829 (2020): e14–e16, https://doi.org/10.1038/s41586-020-2766-y; benjamin j. heil et al., “reproducibility standards for machine learning in the life sciences,” nature methods, august 30, 2021, https://doi.org/10.1038/s41592-021-01256-7. 49 cliff kuang, “can a.i. be taught to explain itself?,” the new york times magazine, november 21, 2017, 50, https://nyti.ms/2hr1s15. 50 amitai etzioni and oren etzioni, “incorporating ethics into artificial intelligence,” the journal of ethics 21, no. 4 (2017): 403–18, https://doi.org/10.1007/s10892-017-9252-2. 51 kamran alipour et al., “improving users’ mental model with attention-directed counterfactual edits,” applied ai letters, 2021, e47, https://doi.org/10.1002/ail2.47. 
52 association for computing machinery, statement on algorithmic transparency and accountability (new york: acm, 2017), http://www.acm.org/binaries/content/assets/public-policy/2017_joint_statement_algorithms.pdf; alex campolo et al., ai now 2017 report (new york: ai now institute, 2017); ieee, ethically aligned design: a vision for prioritizing human wellbeing with artificial intelligence and autonomous systems (new york: ieee, 2019), https://standards.ieee.org/content/dam/ieee-standards/standards/web/documents/other/ead1e.pdf. 53 association for computing machinery, statement on algorithmic transparency and accountability, 2. 54 lilian edwards and michael veale, “enslaving the algorithm: from a ‘right to an explanation’ to a ‘right to better decisions’?,” ieee security & privacy 16, no. 3 (2018): 46–54. 55 kate crawford and jason schultz, “big data and due process: toward a framework to redress predictive privacy harms,” boston college law review 55, no. 1 (2014): 93–128. 56 andrew tutt, “an fda for algorithms,” administrative law review 69, no. 1 (2017): 83–123.
57 corinne cath et al., “artificial intelligence and the ‘good society’: the us, eu, and uk approach,” science and engineering ethics, march 28, 2017, https://doi.org/10.1007/s11948-017-9901-7. 58 edwards and veale, “slave to the algorithm?” 59 matthew u. scherer, “regulating artificial intelligence systems: risks, challenges, competencies, and strategies,” harvard journal of law & technology 29, no. 2 (2016): 353–400. 60 roger brownsword, “from erewhon to alphago: for the sake of human dignity, should we destroy the machines?,” law, innovation and technology 9, no. 1 (january 2, 2017): 117–53, https://doi.org/10.1080/17579961.2017.1303927. 61 birhane et al., “the values encoded in machine learning research”; ana brandusescu, artificial intelligence policy and funding in canada: public investments, private interests (montreal: centre for interdisciplinary research on montreal, mcgill university, 2021). 62 cath et al., “artificial intelligence and the ‘good society’”; law commission of ontario and céline castets-renard, comparing european and canadian ai regulation, 2021, https://www.lco-cdo.org/wp-content/uploads/2021/12/comparing-european-and-canadian-ai-regulation-final-november-2021.pdf. 63 european commission, “artificial intelligence act,” 2021, https://eur-lex.europa.eu/legal-content/en/txt/?uri=celex:52021pc0206. 64 dillon reisman et al., algorithmic impact assessment: a practical framework for public agency accountability (new york: ai now institute, 2018), https://ainowinstitute.org/aiareport2018.pdf. 65 treasury board of canada secretariat, “directive on automated decision-making,” 2019, http://www.tbs-sct.gc.ca/pol/doc-eng.aspx?id=32592.
66 danielle keats citron and frank pasquale, “the scored society: due process for automated predictions,” washington law review 89 (2014): 1–33; scherer, “regulating artificial intelligence systems.” 67 julia angwin et al., “machine bias,” propublica, may 23, 2016, https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing. 68 “state v. loomis,” harvard law review 130, no. 5 (2017), https://harvardlawreview.org/2017/03/state-v-loomis/. 69 “loomis v. wisconsin,” scotusblog, june 26, 2017, http://www.scotusblog.com/case-files/cases/loomis-v-wisconsin/. 70 brownsword, “from erewhon to alphago”; campolo et al., ai now 2017 report; ieee, ethically aligned design; pasquale, the black box society: the secret algorithms that control money and information; wachter, mittelstadt, and floridi, “why a right to explanation.” 71 michael power, the audit society: rituals of verification (oxford: oxford university press, 1997).
72 alfred ng, “can auditing eliminate bias from algorithms?,” the markup, february 23, 2021, https://themarkup.org/ask-the-markup/2021/02/23/can-auditing-eliminate-bias-from-algorithms. 73 joshua alexander kroll, “accountable algorithms” (phd diss., princeton university, 2015). 74 christian sandvig et al., “auditing algorithms: research methods for detecting discrimination on internet platforms,” data and discrimination: converting critical concerns into productive inquiry, 2014, http://www-personal.umich.edu/~csandvig/research/auditing%20algorithms%20--%20sandvig%20--%20ica%202014%20data%20and%20discrimination%20preconference.pdf. 75 association for computing machinery, statement on algorithmic transparency and accountability. 76 sandvig et al., “auditing algorithms,” 17. 77 ng, “can auditing eliminate bias from algorithms?” 78 cathy o’neil, weapons of math destruction: how big data increases inequality and threatens democracy (new york: crown, 2016). 79 emanuel moss et al., assembling accountability: algorithmic impact assessment for the public interest (data & society, 2021), https://datasociety.net/wp-content/uploads/2021/06/assembling-accountability.pdf. 80 david s. watson and luciano floridi, “the explanation game: a formal framework for interpretable machine learning,” synthese (dordrecht) 198, no. 10 (2020): 9214, https://doi.org/10.1007/s11229-020-02629-9.
81 ahmed alkhateeb, “science has outgrown the human mind and its limited capacities,” aeon, april 24, 2017, https://aeon.co/ideas/science-has-outgrown-the-human-mind-and-its-limited-capacities; don r. swanson, “undiscovered public knowledge,” the library quarterly 56, no. 2 (1986): 103–18; don r. swanson, “medical literature as a potential source of new knowledge,” bulletin of the medical library association 78, no. 1 (1990): 29–37. 82 jack anderson, “understanding and interpreting algorithms: toward a hermeneutics of algorithms,” media, culture & society 42, no. 7–8 (2020): 1479–94, https://doi.org/10.1177/0163443720919373. 83 ed finn, “algorithm of the enlightenment,” issues in science and technology 33, no. 3 (2017): 24.
84 jos de mul and bibi van den berg, “remote control: human autonomy in the age of computer-mediated agency,” in law, human agency, and autonomic computing, ed. mireille hildebrandt and antoinette rouvroy (abingdon: routledge, 2011), 59. 85 mariarosaria taddeo, “trusting digital technologies correctly,” minds and machines 27, no. 4 (2017): 565, https://doi.org/10.1007/s11023-017-9450-5. 86 cade metz, genius makers: the mavericks who brought ai to google, facebook, and the world (dutton, 2021). 87 tom simonite, “google’s ai guru wants computers to think more like brains,” wired, december 12, 2018, https://www.wired.com/story/googles-ai-guru-computers-think-more-like-brains/. 88 nick wallace, “eu’s right to explanation: a harmful restriction on artificial intelligence,” techzone, january 25, 2017, http://www.techzone360.com/topics/techzone/articles/2017/01/25/429101-eus-right-explanation-harmful-restriction-artificial-intelligence.htm. 89 mueller et al., “explanation in human-ai systems.” 90 bunn, “working in contexts for which transparency is important,” 143.
184 journal of library automation vol. 5/3 september, 1972

two types of designs for on-line circulation systems

rob mcgee: systems development office, university of chicago library

on-line circulation systems divide into two types. one type contains records only for charged or otherwise absent items. the other contains a file of records for all titles or volumes in the library collection, regardless of their circulation status. this paper traces differences between the two types, examining different kinds of files and terminals, transaction evidence, the quality of bibliographic data, querying, and the possibility of functions outside circulation. aspects of both operational and potential systems are considered.

introduction

a literature survey was made of on-line circulation systems (1). to qualify for study, a system needed to perform any major circulation function on-line. charging and querying were common. some systems were also found to perform some acquisitions, cataloging, and reference work. criteria used to examine systems have been presented in an earlier paper as key factors of circulation system analysis and design (2). this paper conceptualizes the survey findings, and goes on to consider general problems and alternatives of designing on-line circulation systems. the survey shows that on-line circulation systems divide into two types, according to the scope of their bibliographic records. we give the term "absence file" to a set of records for only those items that have been charged or otherwise removed from their assigned locations.
the name "item file" is given to what is, or approaches being, a comprehensive file of records for all titles or volumes in the library collection, regardless of their circulation status. each on-line circulation system either does or does not have an item file. systems without an item file must contain an absence file, and are therefore called "absence systems." systems with an item file are called "item systems." (an item system may also have an absence file, depending upon its design.) note that an "absence file" and an "item file" are each conceptually or logically defined as a single file, whereas in some operational systems either may be stored as more than one physical file. two other basic files generally appear in operational systems: a user file of complete records for users; and a transaction file that may be variously used for data collection, system update, system backup, and batch generation of notices. we can now generalize common but not exclusive file definitions for the two design types. absence systems usually contain three main logical files: 1) a user file; 2) an absence file that contains records only for charged or otherwise absent items; and 3) a transaction file. user identification number and complete item data (all the item data the system is to hold) are input at transaction time to create charge records. these data are typically collected from machine-readable sources such as punched cards or magnetic strips; the surveyed systems use punched cards. time data, such as charge date or due date, and circumstantial data, such as charging location, may also be collected. during batch processing, user records are accessed by identification number to obtain name, address, and so forth.
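the three logical files of an absence system can be sketched in a few lines of python. this is a minimal illustration under stated assumptions, not a reconstruction of any surveyed system: the class and field names (`User`, `ChargeRecord`, `AbsenceSystem`), the fixed 14-day loan period, and the use of call number as the absence-file key are all hypothetical. what it takes from the text is that the complete item data arrive at charge time, that the absence file holds records only for absent items, and that user records are consulted by identification number only in the batch step.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta


@dataclass
class User:
    user_id: str
    name: str
    address: str


@dataclass
class ChargeRecord:
    user_id: str
    item_data: dict  # all the item data the system will ever hold, read at charge time
    due: date


@dataclass
class AbsenceSystem:
    user_file: dict = field(default_factory=dict)         # user_id -> User
    absence_file: dict = field(default_factory=dict)      # call number -> ChargeRecord
    transaction_file: list = field(default_factory=list)  # log for update, backup, notices

    def charge(self, user_id, item_data, today, loan_days=14):
        """Create a charge record from data supplied at transaction time."""
        record = ChargeRecord(user_id, item_data, today + timedelta(days=loan_days))
        self.absence_file[item_data["call_number"]] = record
        self.transaction_file.append(("charge", user_id, item_data["call_number"], today))
        return record

    def discharge(self, call_number, today):
        """Match an existing record by a simple key and remove it."""
        record = self.absence_file.pop(call_number)
        self.transaction_file.append(("discharge", record.user_id, call_number, today))
        return record

    def overdue_notices(self, today):
        """Batch step: look up user records by identification number."""
        notices = []
        for call_number, record in self.absence_file.items():
            if record.due < today:
                user = self.user_file[record.user_id]
                notices.append((user.name, user.address, call_number, record.due))
        return notices
```

note that once an item is discharged the system holds no description of it at all; that is the property that separates this design from an item system.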
examples of absence systems are found at west sussex county library (3,4,5), illinois state library (6,7), midwestern university library (8,9,10), queen's university library (11,12,13,14,15), northwestern university library (16), and bucknell university library (17). item systems are characterized by three or four major files: 1) a user file; 2) an item file of bibliographic records for all library volumes or titles, or for as many as machine records can feasibly be created and stored; 3) a transaction file that may be used for update of the item file, data collection and analysis, and perhaps notice generation; and optionally 4) an absence file of records for circulating items, if transaction data for them are more efficiently kept here than in the item file. records in both the user and item files contain either full data or at least enough data to address messages to users and to adequately describe items. if an absence file is used in an item system, it may copy bibliographic data from the item file, or the two files may be linked to avoid data redundancy. item systems are in operation at bell laboratories library (18,19), eastern illinois university library (20), ohio state university libraries (21,22), and the technical library of the manned spacecraft center, houston (23). (the manned spacecraft center library, alone among the systems we have surveyed, does not have a user file. instead, user's last name, initials, and address code are input at charge time.) since the basic distinction between absence and item systems is whether descriptive data for an item are machine-held prior to its charge time, item file records are limited primarily by the costs of conversion and machine storage, but can have full, even marc-like formats; whereas absence system records are restricted by the quantity of data that can be input at charge time, i.e.
by the capacities of source record coding and data transfer techniques.

basic approaches to on-line circulation system development

three approaches to the design of on-line circulation systems have originated from different notions of circulation control (24). first is the view that circulation control is a separate library function, or one with minimal relationships to other library data processing. exclusive requirements for user and item data are formulated; the format of bibliographic data and the design of data management capabilities are developed explicitly for circulation control, to the exclusion of other library data processing requirements. absence systems have been developed with this approach, but thus far item systems have not. the second approach is to create a circulation system that is operationally independent of other library data processing activities, but designed with a view toward the possibility of shared usage of bibliographic and other data, and of general library data processing facilities. compatibility with other functions is provided, to aid later combination. either system design can take this approach, but item systems can take better advantage of the integration of functions. a third approach is to add circulation control to other large file processes (such as a cataloging system), or to develop them concurrently. this follows an integrated view of library data processing that sees a circulation system operating with many of the same data and processing requirements as other library functions, all of which are handled by a general library data management system. the broad range of library data processing activities needs to be addressed, and an item system design is likely to be preferred. two concepts underlie these approaches: 1) an integrated library system, and 2) a remotely accessible library catalog.
an integrated view of the library is one of a total operating unit with a variety of operations that are logically interrelated and interconnected by their mutual requirements for data and processing (25). the term "integrated system" usually implies a system in which centralized, minimally-redundant files undergo shared processing by different library functions. it is not clear exactly how the concept of a remotely accessible catalog should be defined, or exactly what the phrase means to various users. if we take it to mean the capability to access information from a given catalog at remote locations, then a variety of systems may qualify: e.g., telephone access to a group that performs manual card catalog lookups; multiple locations of book catalogs or microform catalogs; and terminal access to an on-line, computerized catalog. the last is pertinent to our discussion. how an integrated system is implemented determines if its central bibliographic file is accessible from multiple remote locations; how an on-line, remotely accessible catalog is used determines if the system is integrated. recently the ohio state university libraries circulation system has come to be explicitly called a remote catalog access system. we have not yet found reports of any on-line system that has integrated all of technical services and circulation. the addition of circulation control to existing on-line cataloging systems has been planned for the shawnee mission system (26) and mentioned for the system at western kentucky university, bowling green (27). the ohio college library center has not yet decided how it will handle circulation. as long as we define an integrated system on the basis of multiple uses of nonredundant data (among other characteristics), and a remote catalog access system upon physical accessibility, various systems may qualify as either or both.
recognizing these two concepts helps to show how the three approaches to on-line circulation system development capsulize broader trends in library systems. first, the redundancy of bibliographic data in operationally separate but conceptually related functions has characterized traditional manual systems, batch computer systems, and now some on-line systems. second, the construction of individual, independent subsystems, while planning for their eventual combination, has been called an evolutionary approach to a total system, by de gennaro (25). he has also defined a third, integrated approach, in which computerized systems are designed to take advantage of interrelationships among different subsystems, for example, by accepting one-time inputs of given data, and processing them for multiple library functions and outputs. these three trends have been widely experienced in changing relationships among traditional and innovative systems for acquisitions and cataloging. the pattern is repeating now in the evolution of on-line systems with respect to technical services processing and circulation control. large on-line systems are emerging to perform acquisitions, cataloging, circulation control, and reference functions with shared processing facilities and data bases.

terminal devices

most on-line circulation systems do not perform all functions on-line, although the following are possibilities: charge, discharge, inquiry, and other record creations and updates, such as reserving items in circulation, renewing loans, recording fines payments, and even converting files to machine-readable form. what do their input/output requirements imply for terminal devices? inputs for charges may be minimal user and item identification numbers, or full borrower and book descriptions. evidence of valid charges may be produced; printouts of user number, call number, short author and title, and due date are common.
there are also special "security systems" that switch two-state devices in books (such as sensitized plates or labels) to record valid charge, but as yet no such system has been coupled with on-line charging. discharge inputs need only to match existing records; simple access keys such as call number or accession number are adequate. querying, too, may be accomplished with simple search keys, or with bibliographic inputs such as author and title. all these functions can be performed with keyboard input and output display of alphanumeric data.

absence systems

although not a requirement by our definition, most absence systems feature machine-readable user and item cards (the queen's university system does not); the terminals used for charge and discharge must have card reading capabilities. thus for on-line tasks charge stations need card readers and the ability to produce charge evidence, usually in hard copy; querying by bibliographic search keys requires keyboard input and output display of alphanumeric data; discharge stations need a card-reading capability and a display mechanism to identify reserved items; and file creation requires inputs of alphanumeric data in a character set that may range from minimum to full. there are at least two problems in choosing terminal equipment for absence systems. first, any single terminal or configuration that satisfies input and output requirements for all basic functions may be too expensive to install at every library location of these activities. second, the combination of separate hardware units (such as keyboards, printers, and card readers) may require special hardware or software interfaces that prove difficult and expensive (16, 28). alternatively, separate circulation stations with different terminal devices can be established for specific functions. this solution may introduce problems of hardware and personnel redundancy and backup.
difficulties with terminal devices explain in part why most systems perform not all but only selected functions on-line.

item systems

it is possible, in systems with user and item files, to access records by using search keys that are either keystroked or machine-read from cards. the use of machine-readable cards involves the same problems as those described for absence systems. however, choosing keyboard entry of accession or call numbers eliminates card reading, and simplifies requirements so that keyboard devices with display capabilities can perform all basic functions. the feasibility of keyboarding the inputs at transaction time has been demonstrated by the systems at queen's university library (an absence system without machine-readable cards), ohio state university libraries, bell laboratories, and the technical library of the manned spacecraft center. a system based on a single terminal device that handles all real-time functions offers attractive simplifications for hardware and teleprocessing software. the primary disadvantages also center on the device itself. factors such as input error, transmission and printing rates, character set, special function keys, noise, and cost have various implications for system design and operations. obviously, in a system based on a single terminal device, the characteristics of that device are influential. kilgour has stated that the two most important factors in configuring a computer system for an item file design are, first, the nature of secondary memory, and second, the kind of terminal device to be employed (29). the need to quickly access large stores of data is basic. as for keyboard devices, one often finds that typewriter terminals may require far more computation of a central computer than do cathode ray tube terminals, because many crt systems have substantial computing power of their own, giving the effect of a satellite computer.
this can be important for systems that will run on time-shared machines, or transmit data over long distances. the problems we have described for circulation terminals can be overcome; appropriate devices can be built. too many library systems have been designed around unsuitable hardware; there has been little choice but to develop circulation systems (both on-line and batch) with data collection devices designed for industrial applications. their influence, frequently bad, is fundamental to the nature of resulting systems. in fairness, suppliers need both direction and marketing potential. the deeper fault is with librarians, who have inadequately documented requirements and not proven the existence of a market. the integrated approach to library automation ultimately visualizes all library functions using a single set each of bibliographic, user, and other kinds of records, although different pieces of data for different purposes. similarly, one can say that different sets of terminal requirements arise from the different input/output specifications among library tasks and not so much from the nature of bibliographic and other data. as functional requirements of different activities (e.g., acquisitions, cataloging, and circulation) overlap, the opportunity to use identical or similar terminals in a variety of library processes is enhanced. extending the integrated approach to library-related hardware fits well with the concepts of modular hardware design and add-on features. take, for example, a basic keyboard/display screen terminal to which modules can be added to read book and borrower identification and to produce hardcopy printout.

transaction evidence

a variety of transactions may occur between a library and its users: charging and discharging books, placing reserves on circulating material, paying fines, etc. evidence may be provided to verify transaction accuracy and to furnish receipts for users.
this evidence can be in various formats: it might be a hardcopy record (or worksheet) of transaction inputs, or a printout or screen display of system responses. printed charge evidence is a familiar example, and is sometimes used for inspection of items that library users carry from the building. two kinds of charge evidence may be defined (2). simple evidence contains no more user and item data than are input at transaction time. complex evidence contains user or item data other than charge time input, and requires the system to extract data from machine-held file(s). printed evidence typically contains an item due date that may be calculated from either user or item criteria, or both; or directly specified at the time of charge. let us look at the implications of printed charge evidence for the two system types.

absence systems

in most absence systems user identification number and full item data are transferred into the system from machine-readable cards at charge time. there are various ways of printing simple transaction evidence; the following are illustrative. one technique is to transmit data directly to a computer that formats them for output, calculates a due date, and returns them to a printing device. another method is to process source record data with a terminal system that can buffer and format them, select a due date, and output the evidence on a printer. shifting functions from the computer to a terminal system may simplify teleprocessing software, save time at the central processing unit, and permit nearly normal charge operations during computer downtime. if more elaborate user data than identification number are required, there are two obvious solutions. central user records may be accessed to provide complex evidence, possibly increasing central processing unit time and response time.
or, user cards (such as magnetically encoded ones) that contain fuller data could be employed with a terminal system that handles them independently of the central computer.

item systems

in item systems as in absence systems it is possible to use machine-readable cards, with the same implications for printing charge evidence. however, if 1) user and item numbers are keystroked, 2) these are considered sufficient borrower and book information, and 3) decision rules for loan periods are simple, then little or no computer response is required for charging. due date may be returned and printed to signal completed transaction, or predated date due slips may be used. alternatively, special terminal features may be added to select and print a due date. this complicates otherwise simple terminal requirements. sophistications such as status checks on the borrower (e.g., any outstanding fines?) and item (e.g., is it reserved for another user?) will of course require more extensive processing and responses. if charge time inputs are indeed keystroked, user and item numbers with check digits are desirable, to minimize the effects of input error. for complex evidence response time is important, especially if terminals are typewriter-like devices. the time required is determined by the sources of response data, their access times, how much data must be transmitted, and the transmission and terminal display rates. through careful design the time required to obtain charge evidence and complete the transaction can be minimized. for example, if the user number carries a code for borrower class, then a due date can be quickly selected and printed, while the item file is accessed for needed author/title data. it is clear that in an item system containing a user file, only very simple inputs are required to record a charge transaction.
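the two ideas in this passage, check digits on keystroked numbers and a borrower-class code that selects the due date without a file lookup, can be sketched as follows. the text does not name a check-digit scheme; the luhn mod-10 scheme is swapped in here only because it is widely known. the numbering convention (first digit of the user number is the borrower class), the `LOAN_DAYS` table, and the function names are all hypothetical.

```python
from datetime import date, timedelta


def luhn_check_digit(number: str) -> int:
    """Return the digit that makes number + digit pass the Luhn mod-10 test."""
    total = 0
    # Walk right to left; the rightmost payload digit is doubled first.
    for i, ch in enumerate(reversed(number)):
        d = int(ch)
        if i % 2 == 0:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return (10 - total % 10) % 10


def is_valid(number_with_check: str) -> bool:
    """True if the last digit is the correct check digit for the rest."""
    if not number_with_check.isdigit() or len(number_with_check) < 2:
        return False
    return luhn_check_digit(number_with_check[:-1]) == int(number_with_check[-1])


# Hypothetical table: borrower-class code (first digit of user number) -> loan days.
LOAN_DAYS = {"1": 14, "2": 28, "3": 90}


def due_date(user_number: str, today: date) -> date:
    """Select a due date from the class code alone, with no file access."""
    return today + timedelta(days=LOAN_DAYS[user_number[0]])


def charge(user_number: str, item_number: str, today: date) -> dict:
    """Record a charge from two keystroked numbers, rejecting mis-keyed input."""
    if not (is_valid(user_number) and is_valid(item_number)):
        raise ValueError("check digit mismatch; re-key the number")
    return {"user": user_number, "item": item_number, "due": due_date(user_number, today)}
```

because the due date depends only on the keyed user number, it could be printed at once while the slower item-file access for author/title data proceeds, which is the overlap the paragraph describes.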
the additional requirements for charge evidence, status checks on user and item, and so forth determine how elaborate and slow system responses may become.

availability, holdings, and absence information

one may take the view that a library should provide the following kinds of responses to users. if a title is requested, library holdings for it should be given. if a specific item is wanted, either its absence or presumed location should be reported. if the item cannot be immediately provided, the library should determine its future availability and inform the user. the terms "availability," "holdings," and "absence information" have special meanings. the availability of a specific item to a library's users is mapped onto the universe of items by the library's acquisitions, cataloging, circulation control, and interlibrary borrowing functions. availability information obtains from all these sources, but particularly from the public catalog of library holdings. absence information, in contrast, corresponds only to a subset of library holdings: it tells the locations of library-owned items when they are absent from the locations indicated by the catalog. absence information therefore corresponds to a subset of holdings information; holdings information is a subset of availability information. in the context of our discussion an absence system provides full absence information and only partial holdings and availability information. an item system can provide full holdings and absence information, but only partial availability information, since items not owned may be ordered, or borrowed from another library. such considerations strengthen the argument that circulation control shares a functional unity with other library processes, and should therefore be considered as one of several integrated functions. the provision of absence and availability information is the essence of circulation system querying requirements.
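the nesting of absence, holdings, and availability information can be made concrete with sets. the sketch below is purely illustrative: the call numbers are invented placeholders, and the two answer functions are hypothetical stand-ins for what each system type is able to report.

```python
# Invented identifiers, for illustration only.
holdings = {"QA76.A1", "Z699.M2", "Z678.9.K5"}   # items the library owns
absent = {"Z699.M2"}                             # owned items away from their catalog location
obtainable_elsewhere = {"TK5105.B3"}             # on order, or borrowable from another library
availability = holdings | obtainable_elsewhere   # everything the library could supply


def absence_system_answer(call_number: str) -> str:
    """An absence system holds records only for absent items."""
    return "absent" if call_number in absent else "no record"


def item_system_answer(call_number: str) -> str:
    """An item system knows full holdings as well as absences."""
    if call_number not in holdings:
        return "not held"
    return "absent" if call_number in absent else "presumed at catalog location"
```

for a shelved item the absence system can only say "no record," which is indistinguishable from its answer for an item the library does not own; the item system separates the two cases, but neither can speak for `obtainable_elsewhere`.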
figure 1 shows that different query keys access different subsets of availability information. note the wider utility of some keys than for just circulation control.

absence systems

in on-line circulation systems built around an absence file, the data representing each physical item may range from a simple accession or item call number (as in the queen's university and northwestern university systems) to larger records containing as much data as may be stored and transferred with a machine-readable card (for example, a hollerith-punched book card). if availability information is to be obtained from a library's public catalog, and not from the circulation system, then access to the file of absence information may be with any key shown in the catalog records: e.g., author and title, title and author, call number, accession number. only the simpler keys, item call number and accession number, have been used in absence systems developed so far. consequently their requirements for file organization and access software are minimal. in most systems these keys permit exact matches to only single records, but in the northwestern university system a call number query may cause display of a set of related records (16).

item systems

the query function in item systems is bound by different constraints than those of absence systems. the amount of bibliographic data is not restricted by the storage capacities of machine-readable cards, transaction response time, or the transfer rates of charge-time inputs. however, the following questions arise: how many and which records from the library's data base (e.g., its shelflist) must be converted to provide a sufficient item file? for each record converted, how much and which data are required? what functions shall such a data base ultimately support? what kinds of absence and availability information will be provided?
these deserve special discussion before we consider querying in item systems.

item system bibliographic data

how much data are required for item file records? in an integrated system full records are ultimately produced by the cataloging process. should one use full, variable-length, marc-like records for circulation control? the conversion, on-line storage, management, and access of a large file of full bibliographic records are expensive propositions. one may be compelled toward a lesser effort. under much of the popular data management software it is easier to organize and store fixed length records than variable length records with different combinations of fixed and variable length fields. the files of current item systems hold less-than-full bibliographic records: bell laboratories library utilizes two basic fixed length formats of 155 and 188 characters (18,19); the eastern illinois item file consists of 124-byte records (20); although the ohio state system contains variable length records, they are less-than-full bibliographically, averaging 103 bytes (22); the manned spacecraft center system has fixed length records of 168 characters (23). if not a full, marc-like record, then what? two questions may be asked: how much data should be converted for each record? and: how much of these data should be put initially into an item file? if one believes a fully integrated system may eventually take over some public catalog functions, then traditional author-title-subject accesses must be maintained, at least until proven unneeded. the minimum genuinely useful set of bibliographic data elements needed for futuristic information retrieval from library catalogs has not been proven; the safe but expensive answer is to convert full records.

[fig. 1. possible access keys to sets of availability information. the figure relates three kinds of access keys to the universe of all items: standard bibliographic description; standard but noncomprehensively applied single-element unique identifiers (e.g., isbn, ssn); and library-assigned keys such as item call number and accession number. note: one arrow style means the access key retrieves all members of a set; the other means the access key retrieves only some members of a set.]

initially, however, one might want no more data in an item file than are functionally justified. how much are actually needed? the four existing item systems provide traditional information in new ways that have dramatically improved services to users. they answer several basic kinds of questions on-line: does the library have book ? is it available now? what books do i have charged out? such queries can be answered by nonsubject, descriptive bibliographic data, and by circulation status information that shows if items are absent, and when they may become available. for this an item file needs records only for items that are used, in contrast to a comprehensive on-line shelflist. which records to include becomes a problem remarkably similar to deciding what books to put into low-access, compact storage, or to discard. the two university libraries with item systems chose comprehensive conversions: eastern illinois for 235,000 volumes (20), and ohio state for 800,000 titles (22). what are the potential advantages to users of an item system? if one only wants to know what books are charged, an absence system will suffice. both the penalties and promise of an item system lie in its bibliographic store: in the records it holds (scope), and in the data these records contain (content). unless real-time querying of an item file can substitute for at least some manual searches of the public catalog, and in an improved way, its bibliographic data offer no direct advantages to users of a circulation system; an item system will provide no direct circulation services that an absence system could not.
applying this as a test to the utility of a noncomprehensive item file (a file of records only for items that are, or are likely to be, in use), we find perhaps the key question for development of item systems among libraries with very large catalogs: to what extent may a noncomprehensive item file substitute for accesses to a comprehensive public catalog? although related, this is not the same question as what proportion of a library's book stock circulates. this is a question of how the public catalog is used: by whom, and for what? lipetz's study of the card catalog in yale university's sterling memorial library gives insight to at least that institution's catalog use (30). he found that 73 percent of the users attempting searches were looking for particular documents (known items). overall, users' approaches to catalog searches were: author, 62 percent; title, 28.5 percent; subject, 4.5 percent; and editor, 4 percent. this may encourage one to believe that an item file which is accessible by author and title can handle a significant portion of manual catalog lookups. if so, developers of item systems may want to consider strategies similar to the following. if it is shown that satisfactory author/title access can be provided by an item file, then perhaps a large library is justified in dividing its card catalog and retaining only the subject-access portion. the argument is that author/title access can be provided by an item file of partial records containing nonsubject descriptive data, whereas the requirements for subject access involve still more data that are likely to change as subject descriptions do. however, if a manual card catalog for subjects were maintained, this would facilitate updates of subject headings, and at the same time permit the most efficient format and smallest set of machine-held item file records to be kept.
through the use of machine-held subject authority files, maintenance instructions and replacement heading cards could be computer-produced for update of the manual subject catalog. (distribution of machine-readable subject headings is being considered by the library of congress marc development office.) reduction in the maintenance and use of a full manual author-title-subject catalog by library technical services departments could produce significant savings, aside from whatever direct improvements in access that machine files might provide. if the item file were noncomprehensive, or contained retrospective records only for those items that circulated, then author/title accesses would of course be limited to the contents of that file. this would require maintenance of full manual catalogs for noncirculating items. two general alternatives to a comprehensive item file of full records come to mind. one is to utilize records as they are created by the cataloging process, complemented by partial-record conversion (conveniently, in-house) for only those retrospective items that circulate. another is to create a special circulation-only item file of partial records. this kind of system would use an item file primarily as an alternative to machine-readable book cards. absence and holdings querying would be supported, but not acquisitions or cataloging functions. a system like this, with an item file of partial records, may be the most reasonable answer for large research libraries (28). it should be able to give the same circulation services as an absence system, in addition to satisfying certain kinds of public catalog searches. the simplifications for data conversion, data management software, and terminals are worth special evaluation as a middle or simple approach to on-line circulation system development, with an item system design.
item system querying, bibliographic data structure, and file organization

the querying capabilities featured by each item system differ somewhat, and are explained in part by differences in bibliographic data structure and file organization. the data and design of an information-providing system are fundamental to the kinds of services it can provide. a useful conceptual model is the traditional manual library system in which separate files are used for different functions: an in-process file for technical processing, a shelflist for the official holdings, a public catalog for reference, and a circulation file to control item absences. among these the file for circulation contains less bibliographical data than the others, since even a single data element such as the call number can uniquely identify a physical item and relate it to a fuller description, such as a shelflist record. a circulation file of this nature is in effect a manual absence file, and serves no major purpose other than circulation control. were the processing requirements not impractical, circulation status could be more usefully recorded in public catalog records, in the manner of an item file. the eastern illinois university system has an indexed sequential item file organized by item call number plus accession number. it may be queried by this key to get an exact match to a single record, or by a classification number to get a file scan of corresponding records. query by user number displays charges to the user. the ohio state system has a read-only item file that is randomized by item call number, but it may also be accessed by an author/title key that consists of the first four characters of the main entry plus the first five characters of the first significant word or words of the title. the second five characters can be blanked to provide author-only access. the file is also accessible by an item record number that is assigned sequentially to new records entering the file.
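
the ohio state author/title key described above can be sketched as follows. this is a minimal illustration under stated assumptions, not the system's actual routine: the function names, the list of leading words treated as non-significant, the lowercasing, and the dropping of punctuation are all hypothetical, since the article gives only the 4 + 5 character rule and the option of blanking the title half.

```python
# Hypothetical list of leading words to skip; the article does not define
# which title words count as non-significant.
LEADING_ARTICLES = {"a", "an", "the"}

AUTHOR_LEN = 4  # first four characters of the main entry
TITLE_LEN = 5   # first five characters of the significant title word(s)


def _letters(text: str) -> str:
    """Lowercase the text and keep only letters and digits (an assumption)."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def author_title_key(main_entry: str, title: str) -> str:
    """Build a fixed-width 4 + 5 character search key, blank-padded."""
    words = title.lower().split()
    while words and words[0] in LEADING_ARTICLES:
        words.pop(0)  # skip leading non-significant words
    author_part = _letters(main_entry)[:AUTHOR_LEN].ljust(AUTHOR_LEN)
    title_part = _letters("".join(words))[:TITLE_LEN].ljust(TITLE_LEN)
    return author_part + title_part


def author_only_key(main_entry: str) -> str:
    """Blank the title half of the key to search by author alone."""
    return author_title_key(main_entry, "")


def key_matches(query_key: str, record_key: str) -> bool:
    """Compare keys; a blank title half matches any title."""
    if query_key[AUTHOR_LEN:].strip() == "":
        return query_key[:AUTHOR_LEN] == record_key[:AUTHOR_LEN]
    return query_key == record_key


print(repr(author_title_key("Eliot", "The Waste Land")))  # 'eliowaste'
print(repr(author_only_key("Eliot")))                     # 'elio     '
```

a key this short will match several records at once, which is consistent with the article's later remark that a terminal operator must interpret a displayed set of matching records.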
the bell laboratories system provides access by item number to its item file, and uses a set of twelve query codes to obtain status and other factual information on users and items. user number and item number are the query keys. the item file is also used to produce a book catalog that gives the item numbers by which queries can be made. the item file of the manned spacecraft center library is sequentially organized by item number, and can also be queried by call number and user number. these systems demonstrate alternatives for bibliographic data structure and file organization and access methods that are summarized by the author in a separate work (1) and explained by the references for each system (18,19,20,21,22,23). briefly, the eastern illinois, bell laboratories, and manned spacecraft center systems use a fixed-length item record structure, and charge data are written directly to item record fields that are defined for this purpose. the ohio state system has a variable-length, read-only item record. transaction data are recorded in an absence file, and linked to the item file. in the bell laboratories system what is conceptually a single item file is actually two separately organized physical files with different record formats. fixed-length book records are organized sequentially, and each contains fields for three loans and two reserves; all copies and volumes are represented. journal records are organized by an indexed sequential method, and do not contain copy and volume data, which must be added at transaction time. in the eastern illinois and manned spacecraft center systems the item file contains a separate record for each physical volume in the library. the ohio state item file contains one record per title.
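
the two record designs contrasted above can be sketched side by side. this is only an illustration: the class and field names are hypothetical, the slot counts follow the three loans and two reserves reported for bell laboratories book records, and each record is treated as a single chargeable unit, which simplifies the one-record-per-title arrangement reported for ohio state.

```python
from dataclasses import dataclass, field
from types import MappingProxyType
from typing import Dict, List, Optional

LOAN_SLOTS = 3     # the article reports fields for three loans per book record
RESERVE_SLOTS = 2  # and for two reserves


@dataclass
class FixedItemRecord:
    """Fixed-length design: charge data are written into the item record."""
    item_number: str
    loans: List[Optional[str]] = field(
        default_factory=lambda: [None] * LOAN_SLOTS)
    reserves: List[Optional[str]] = field(
        default_factory=lambda: [None] * RESERVE_SLOTS)

    def charge(self, user_number: str) -> bool:
        """Write the user into the first free loan field, if any."""
        for i, slot in enumerate(self.loans):
            if slot is None:
                self.loans[i] = user_number
                return True
        return False  # every loan field is occupied

    def discharge(self, user_number: str) -> bool:
        """Clear the loan field held by this user, if present."""
        for i, slot in enumerate(self.loans):
            if slot == user_number:
                self.loans[i] = None
                return True
        return False


class AbsenceLinkedFile:
    """Read-only item file; transactions live in a separate absence file."""

    def __init__(self, item_file: Dict[str, str]) -> None:
        self.items = MappingProxyType(dict(item_file))  # cannot be modified
        self.absences: Dict[str, str] = {}  # record number -> user number

    def charge(self, record_number: str, user_number: str) -> bool:
        if record_number not in self.items:
            raise KeyError(record_number)
        if record_number in self.absences:
            return False  # already charged
        self.absences[record_number] = user_number
        return True

    def discharge(self, record_number: str) -> None:
        self.absences.pop(record_number, None)

    def status(self, record_number: str) -> str:
        user = self.absences.get(record_number)
        return "available" if user is None else f"charged to {user}"
```

in the first design a status query reads one record; in the second it reads the item record and then follows the link to the absence file, which leaves the bibliographic file untouched by circulation activity.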
although it is difficult to tell without detailed programming knowledge of these systems, the bell laboratories data structure seems to enable exact matches to single records for status queries (e.g., what is the status of title number ___? what is the status of copy ___?) in ways that the eastern illinois and ohio state systems can only accomplish through a terminal operator's interpretation of a displayed set of matching records. the bell laboratories system can therefore conduct queries of this nature with keyboard/printer terminals, whereas the eastern illinois and ohio state systems require crt devices to display large amounts of information. the bell laboratories system can also ask what overnight loans are still out, possibly a function of its journal file's data structure. the software implications of these various capabilities will not be discussed here. suffice it to say that absence systems require simpler accesses and data management than do the kind of item systems we have discussed, and that as item files are designed to replace all or selected public catalog functions, their data management and user interface requirements become greater.

special aspects of the charge function

two aspects of the charge function have special significance for on-line systems: patron self-charging, and a telephone and mail or delivery service. among the on-line systems we surveyed, only the one at northwestern university is reported to be self-charging. to have patron self-charging requires that charge transactions be simple and convenient. data transfer methods that require little effort are therefore preferred, and the use of machine-readable user and item identifications seems to be the best current choice. the northwestern system uses hollerith-punched user badges and book cards.
other methods of data entry such as magnetic card reading and optical scanning are often mentioned for circulation control, but as of december 1971 we know of none that has resulted in a practical terminal-based system for on-line charging. two of the item systems promote a telephone and delivery service: the bell laboratories system and the ohio state university system. in each system inquiries can be directed to operators, who may conduct on-line searches of library holdings and circulation information for specific items. the kinds of questions that can be asked are "does the library have ___?" and "is it charged?" we noted earlier that a catalog can be made "remotely accessible" in several ways: e.g., by a special group that performs manual card catalog lookups for telephoned requests, or by users' consulting multiple copies of book or microform catalogs. in principle, a variety of catalogs and circulation systems can be used together in a telephone inquiry system of this nature. for example, the library of the georgia institute of technology has recently implemented an "extended catalog access" and delivery service that is based on microfiche copies of its catalog at thirty-six campus locations, coupled with telephone inquiry to a manual circulation system (31). readers look up wanted items and telephone the library to request them. the manual circulation file is checked: available items are charged for delivery, or reserves may be placed for items that are already loaned. presumably, the currency of information and quickness of response times are better in an on-line circulation system than in any other type. an item system can furnish both holdings and absence information. an absence system needs to be coupled to another system to furnish holdings information: a requirement is that the holdings information must contain a key by which the corresponding absence records can be accessed.
these are basic considerations in providing a telephone and delivery service.

system backup

the problems considered here derive from two conditions: unexpected system downtimes and scheduled periods when the system is not in operation. at these times a system cannot execute on-line tasks. two classes of backup problems are: 1) provision of service to users during the downtimes; and 2) updating system files to record downtime transactions. the latter are termed recovery problems. one way to back up the query function is to periodically print a list of circulating items. the frequency and ease of access (e.g., number of copies, their locations, telephone access to them) of such a list can pose substantial problems. an alternative to scheduled printings is an arrangement for quick printouts of a frequently copied backup tape on a redundant computer system. the basic recovery problem is how to enter data into the system for transactions that took place during downtimes. presumably, if unexpected downtimes are not inordinately long, discharges and other file updates may be postponed. this simplification is helpful, since transaction sequences among different kinds of updates can become quite complicated, e.g., discharges undo charges, and confusing the sequence causes problems. although other kinds of system updates have their own special problems, the following paragraphs only briefly discuss the backup and recovery of charging activities.

absence systems

the provision of transaction evidence in off-line mode has already been suggested for absence systems that have the necessary terminal capabilities. similarly, there are configurations which, in off-line mode, read user and item cards and produce machine-readable transaction records that can be read in during post-downtime recovery procedures. the northwestern university system has a special backup terminal for this purpose.
the provision of automatic recovery facilities is an attractive feature. alternatively, multiple-part manual transaction records can be made for charges during downtimes. one part may serve as transaction evidence; the other can be used for manual input of recovery data, when the system is up again. exactly how this is done depends upon other details of the particular system.

item systems

since inputs of user and item identification numbers are sufficient to record charges in item systems, the recovery problem can be simpler than for absence systems. typewriter-like terminals with card or paper tape punches or magnetic recorders can be used to create machine-readable recovery data. the requirements for transaction evidence may be crucial. perhaps the solution to the worst case is the use of a two-part manual transaction form: one copy for transaction evidence, and the other, as above, for post-downtime recovery inputs. we can summarize three hardware solutions for transaction backup in either system type: 1) total system redundancy, 2) backup at the terminal level, and 3) a backup facility between terminal and computer. the cost of full system redundancy makes it unlikely. a facility to log transactions during downtimes is more feasible; there are several choices. one such alternative is to record transaction data off-line in machine-readable form at each data collection point: e.g., to punch paper tape or cards. another alternative is to record data from several terminals with a single device, such as a magnetic recorder, or a control unit that coordinates a multiterminal system. a third solution, a variation on the second, is a mini-computer which links terminals, and handles telecommunications with a larger machine that holds system files. this approach has been taken by bucknell university. it affords more comprehensive backup than merely capturing transaction data.
other functions, such as checks for user validity and reserved items, can be performed on a relatively reliable mini-computer dedicated to circulation.

conclusion

on-line library catalogs are now a reality, but not yet for the exotic information retrieval work once popularly projected. instead, relatively straightforward accesses by author, title, and call number are supporting circulation, reference, and technical processing functions. the needs for better circulation systems and network processing of shared cataloging data have stimulated developments of large-scale operational (not experimental) systems around resident files of on-line bibliographic records. developers have not waited for solutions to fundamental problems of automatic indexing and information retrieval; they have put large bibliographic files on-line and provided relatively simple, multiple access keys. the advances that have been made are in methods of physical access to bibliographic records, not in the intellectual or subject access to information. no new information is being retrieved, but familiar processes are being performed in better ways. improvements in the ease and time of accessing library files have dramatically upgraded the library's responses for its own routine work and to the public in general. we are experiencing the first of a new generation of practical systems that perform traditional functions with on-line rather than manual files, with as much benefit as possible short of better subject access. the new systems are transcending the barriers to convenient use that have been imposed by the size, complexities, and awkwardness of large manual systems. historically, it has been impractical to add circulation information to each record in the public catalog for an item. with on-line files of single records per item this is now possible.
state-of-the-art computing affords multiple access keys to a record, instead of duplicating it for additional entries as in manual catalogs. how many and which keys are furnished largely determines the extent to which an on-line catalog can replace a traditional one. difficult cost and technical problems explain the current approaches. full requirements of a public catalog have been avoided; simpler files have been built to handle explicit processing functions. the advantages are simplified records and fewer access points. full bibliographic records are variable length, often large, and sometimes eccentric, and therefore relatively expensive to handle in machine form. in principle the overhead for access is the same as for manual files: the more entries that are provided, the greater the storage, processing, and cost. systems with simpler files than the public catalog have therefore been built. there have been no machine equivalents of large library catalogs; so we have studied manual ones to theorize ideal characteristics. in some cases this model may have supplied a misleading bias. studies of the new on-line systems at work could possibly revise our notions of what is needed. the kinds of systems now emerging are answers for the foreseeable future. the tradition of separately organizing and managing public and technical services will be challenged by the integrated systems. their centralized files, data handling, and access methods transcend functional boundaries which grew between library tasks that used different but redundant manual files and evolved separate units and procedures to accomplish virtually the same basic data processing functions. the profession has yet to widely appreciate the new overview and managerial changes that are invited. reaction to them may be projected as a fourth and perhaps painful trend.
insofar as no fully integrated systems have yet been developed, it is likely that as they emerge they will force substantial changes to traditional patterns of library organization and management.

acknowledgments

this work was supported by the university of chicago library systems development office under clr/neh grant no. e0-262-70-4658 from the council on library resources and the national endowment for the humanities, for the development and operational testing of a library data management system.

references

1. rob mcgee, a literature survey of operational and emerging online library circulation systems (university of chicago library systems development office, feb. 1972). available as eric/clis ed 059 752. mf $0.65, hc $3.29.
2. ___, "key factors of circulation system analysis and design," college and research libraries 33:127-140 (mar. 1972).
3. h. k. g. bearman, "library computerisation in west sussex," program: news of computers in british libraries 2:53-58 (july 1968).
4. ___, "west sussex county library computer book issuing system," assistant librarian 61:200-202 (sept. 1968).
5. richard t. kimber, "an operational computerised circulation system with on-line interrogation capability," program: news of computers in british libraries 2:75-80 (oct. 1968).
6. homer v. ruby, "computerized circulation at illinois state library," illinois libraries 50:159-162 (feb. 1968).
7. robert e. hamilton, "the illinois state library 'on-line' circulation control system," in: proceedings of the 1968 clinic on library applications of data processing. (urbana, ill.: university of illinois graduate school of library science, 1969) p. 11-28.
8. ibm corp., on-line library circulation control system, moffet library, midwestern university, wichita falls, texas. application brief k-20-0271-0. (white plains, n.y.: ibm corp., data processing div., 1968) 14 p.
9. calvin j.
boyer and jack frost, "on-line circulation control: midwestern university library's system using an ibm 1401 computer in a 'time-sharing' mode," in: proceedings of the 1969 clinic on library applications of data processing. (urbana, ill.: university of illinois graduate school of library science, 1970) p. 135-145.
10. charles d. reineke and calvin j. boyer, "automated circulation system at midwestern university," ala bulletin 63:1249-1254 (oct. 1969).
11. belfast, queen's university, school of library studies, study group on the library applications of computers, first report of the working party (belfast university, july 1965) 18 p.
12. richard t. kimber, "studies at the queen's university of belfast on real-time computer control of book circulation," journal of documentation 22:116-122 (june 1966).
13. ___, "conversational circulation," libri 17:131-141 (1967).
14. ___, "the cost of an on-line circulation system," program: news of computers in british libraries 2:81-94 (oct. 1968).
15. ann h. boyd and philip e. j. walden, "a simplified on-line circulation system," program: news of computers in libraries 3:47-65 (july 1969).
16. velma veneziano and joseph t. paulukonis, "an on-line, real-time time circulation system." [this documentation of the northwestern university library system was made specially available to the author. a later version with the same title appears in larc reports 3:7-48 (winter 1970-71)].
17. h. rivoire and m. smith, library systems automation reports 1971a-2, bucknell library on-line circulation system (blocs). ellen clarke bertrand library (15 mar. 1971) 19 p.
18. r. a. kennedy, "bell laboratories' library real-time loan system (bellrel)," jola 1:128-146 (june 1968).
19. ___, "bell laboratories' on-line circulation control system: one year's experience," in: proceedings of the 1969 clinic on library applications of data processing.
(urbana, ill.: university of illinois graduate school of library science, 1970) p. 14-30.
20. paladugu v. rao and b. joseph szerenyi, "booth library on-line circulation system (bloc)," jola 4:86-102 (june 1971).
21. richard h. stanwood, "monograph and serial circulation control," a paper for the international congress of documentation, buenos aires, sept. 21-24, 1970. national council for scientific and technical research, buenos aires (1970) 23 p.
22. ibm corp., data processing division, functional specifications: a circulation system for the ohio state university libraries, gaithersburg, maryland (november 26, 1969) various paginations. [this and other technical documentation were made specially available to the author. this is now available through eric/clis as: on-line remote catalog access and circulation control system. part i: functional specifications. part ii: user's manual. november 1969. 151 p. ed 050 792. mf $0.65, hc $4.00]
23. edward e. shumilak, an online interactive book-library-management system. nasa technical note nasa tn d-7052. national aeronautics and space administration, washington, d.c. (march 1971) 40 p. [this document is available through the national technical information service under document number n71-20526]
24. university of chicago library, a proposal for the development and operational testing of a library data management system, herman h. fussler and fred h. harris, principal investigators. (chicago, ill.: 1970) 44 p.
25. richard de gennaro, "the development and administration of automated systems in academic libraries," jola 1:75-91 (mar. 1968).
26. ellen w. miller and b. j. hodges, "shawnee mission's on-line cataloging system," jola 4:13-26 (mar. 1971).
27. simon p. j. chen, "on-line and real-time cataloging," american libraries 3:117-119 (feb. 1972).
28.
university of chicago library, development of an integrated, computer-based, bibliographical data system for a large university library, annual report 1967/68. by herman h. fussler and charles t. payne. university of chicago library, chicago, illinois (1968) 17 p. + appendixes.
29. frederick g. kilgour, letter to the author 23 november 1971.
30. ben-ami lipetz, user requirements in identifying desired works in a large library, final report, grant no. sar/oeg-1-71071140-4427, u.s. department of health, education, and welfare, office of education, bureau of research. (new haven, conn.: yale university library, june 1970) 73 p. + appendixes.
31. "library extends catalog access and new delivery service," [4 p.] a brochure issued by price gilbert memorial library, georgia institute of technology, atlanta, georgia, 1972.

library use of web-based research guides
jimmy ghaphery and erin white
information technology and libraries | march 2012 21

abstract

this paper describes the ways in which libraries are currently implementing and managing web-based research guides (a.k.a. pathfinders, libguides, subject guides, etc.) by examining two sets of data from the spring of 2011. one set of data was compiled by visiting the websites of ninety-nine american university arl libraries and recording the characteristics of each site’s research guides. the other set of data is based on an online survey of librarians about the ways in which their libraries implement and maintain research guides. in conclusion, a discussion follows that includes implications for the library technology community.

selected literature review

while there has been significant research on library research guides, there has not been a recent survey either of the overall landscape or of librarian attitudes and practices. there has been recent work on the efficacy of research guides as well as strategies for their promotion.
there is still work to be done on developing a strong return on investment metric for research guides, although the same could probably be said for other library technologies including websites, digital collections, and institutional repositories. subject-based research guides have a long history in libraries that predates the web as a service-delivery mechanism. a literature-review article from 2007 found that research on the subject gained momentum around 1996 with the advent of electronic research guides, and that there was a need for more user-centric testing.1 by the mid-2000s, it was rare to find a library that did not offer research guides through its website.2 the format of guides has certainly shifted over time to database-driven efforts through local library programming and commercial offerings. a number of other articles start to answer some of the questions about usability posed in the 2007 literature review by vileno. in 2008, grays, del bosque, and costello used virtual focus groups as a test bed for guide evaluation.3 two articles from the august 2010 issue of the journal of library administration contain excellent literature reviews and look toward marketing, assessment, and best practices.4 also in 2010, vileno followed up on the 2007 literature review with usability testing that pointed toward a number of areas in which users experienced difficulties with research guides.5

jimmy ghaphery (jghapher@vcu.edu) is head, library information systems and erin white (erwhite@vcu.edu) is web systems librarian, virginia commonwealth university libraries, richmond, va.
in terms of cross-library studies, an interesting collaboration in 2008 between cornell and princeton universities found that students, faculty, and librarians perceived value in research guides, but that their qualitative comments and content analysis of the guides themselves indicated a need for more compelling and effective features.6 the work of morris and grimes from 1999 should also be mentioned; the authors surveyed 53 university libraries, finding that it was rare to find a library with formal management policies for their research guides.7 most recently, libguides has emerged as a leader in this arena, offering a popular software-as-a-service (saas) model, but it is not yet heavily represented in the literature. a multichapter libguides lita guide is pending publication and will cover such topics as implementing and managing libguides, setting standards for training and design, and creating and managing guides.

arl guides landscape

during the week of march 3rd, 2011, the authors visited the websites of 99 american university arl libraries to determine the prevalence and general characteristics of their subject-based research guides. in general, the visits reinforced the overarching theme within the literature that subject-based research guides are a core component of academic library web services. all 99 libraries offered research guides that were easy to find from the library home page. libguides was very prominent as a platform, in production at 67 of the 99 libraries. among these, it appeared that at least 5 libraries were in the process of migrating from a previous system (either a homegrown, database-driven site or static html pages) to libguides. in addition to the presence and platform, the authors recorded additional information about the scope and breadth of each site’s research guides.
for each site, the presence of course-based research guides was recorded. in some cases the course guides had a separate listing, whereas in others they were intermingled with the subject-based research guides. course guides were found on 75 of the 99 libraries visited. of these, 63 were also libguides sites. it is certainly possible that course guides are being deployed at some of the other libraries but were not immediately visible in visiting the websites, or that course guides may be deployed through a course management system. nonetheless, it appears that the use of libguides encourages the presence of public-facing course guides. qualitatively, there was wide diversity of how course guides were organized and presented, varying from a simple a-to-z listing of all guides to separately curated landing pages specifically organized by discipline. the number of guides was recorded for each libguides site. it was possible to append “/browse.php?o=a” to the base url to determine how many guides and authors were published at each site. this php extension was the publicly available listing of all guides on each libguides platform. the “/browse.php?o=a” extension no longer publicly reports these statistics; however, findings could be reproduced by manually counting the number of guides and authors on each site. the authors confirmed the validity of this method in the fall of 2011 by revisiting four sites and finding that the numbers derived from manual counting were in line with the previous findings. of the 63 libguides sites we observed, a total of 14,522 guides were counted from 2,101 authors for an average of 7 guides per author. on average, each site had 220 guides from 32 authors (median of 179 guides; 29 authors). at the high end of the scale, one site had 713 guides from 46 authors.
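
the guides-per-author figures reported above can be checked with a few lines of arithmetic. the counts are taken directly from the text; the variable names are ours and purely illustrative.

```python
# Totals reported in the text for the LibGuides sites observed.
total_guides = 14_522
total_authors = 2_101

# Overall ratio of guides to authors across the observed sites.
guides_per_author = total_guides / total_authors
print(round(guides_per_author, 2))  # 6.91, which rounds to the reported 7

# The largest site reported: 713 guides from 46 authors.
top_site_ratio = 713 / 46
print(top_site_ratio)  # 15.5
```

on these figures the largest site carried a little more than twice as many guides per author as the overall ratio.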
based on the volume observed, libraries appear to be investing significant time toward the creation, and presumably the maintenance, of this content. in addition to creation and ongoing maintenance, such long lists of topics raise a number of usability issues that libraries will also be wise to keep in mind.8

survey

the literature review and website visits call out two strong trends:
1. research guides are as commonplace as books in libraries,
2. libguides is the elephant in the room, so much so that it is hard to discuss research guides without discussing libguides.
based on preliminary findings from the literature review and survey, we looked to further describe how libraries are supporting, innovating, implementing, and evaluating their research guides. a ten-question survey was designed to better understand how research guides sit within the cultural environment of libraries. it was distributed to a number of professional discussion lists the week of april 19, 2011 (see appendix). the following lists were used in an attempt to get a balance of opinion from populations of both technical and public services librarians: code4lib, web4lib, lita-l, lib-ref-l, and ili-l. the survey was made available for two weeks following the list announcements. survey response was very strong, with 198 responses (188 libraries) received without the benefit of any follow-up recruitment. ten institutions submitted more than one response. in these cases only the first response was included for analysis. we did not complete a response for our own institution. the vast majority (155, 82%) of respondents were from college or university libraries. of the remaining 33, 24 (13%) were from community college libraries, with only 9 (5%) identifying themselves as public, school, private, or governmental.
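
the response handling described above (keeping only the first response from each institution, then reporting shares of the 188 libraries) can be sketched as follows. the counts come from the text; the function names and the (institution, answer) tuple layout are hypothetical, since the survey's actual data format is not described.

```python
def first_response_per_institution(responses):
    """Keep only the earliest response from each institution."""
    seen = set()
    kept = []
    for institution, answer in responses:
        if institution not in seen:
            seen.add(institution)
            kept.append((institution, answer))
    return kept


def percent(count, total):
    """Share of the total as a whole-number percentage."""
    return round(100 * count / total)


LIBRARIES = 188  # distinct libraries after duplicate responses were dropped

# Respondent counts by library type, as reported in the text.
counts = {
    "college or university": 155,
    "community college": 24,
    "public, school, private, or governmental": 9,
}
shares = {kind: percent(n, LIBRARIES) for kind, n in counts.items()}
print(shares)
```

the three counts sum to 188, and the computed shares agree with the 82%, 13%, and 5% given in the text.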
among the college and university libraries, 17 (9%) identified themselves as members of the arl, which comprises 126 members.9 for the question “what system best describes your research guides by subject?” the results were similar to the survey of arl websites. most libraries (129, 69%) reported libguides as their system, followed by “customized open source system” and “static html pages,” both at 20 responses (11% each). sixteen libraries (9%) reported using a homegrown system, with three libraries (2%) reporting “other commercial system.” in terms of initiating and maintaining a guides system, much of the work within libraries seems to be happening outside of library systems departments. when asked which statement best described who selected the guides system, 67 respondents (36%) indicated their library research guides were “initiated by public services,” followed closely by “more of a library-wide initiative” at 63 responses (34%). in the middle at 34 responses (18%) was “initiated by an informal cross-departmental group.” only 10 respondents (5%) selected “initiated by systems,” with the top-down approach of “initiated by administration” gathering 14 responses (7%). when narrowing the responses to those sites that are using libguides or campusguides, the portrait is not terribly different, with 36% library-wide, 35% public services, 18% informal cross-departmental, 7% administration, and systems trailing at 4%. likewise there was not a strong indication of library systems involvement in maintaining or supporting research guides.
sixty-nine responses (37%) indicated “no ongoing involvement” and an additional 35 (19%) indicated “n/a we do not have a systems department.” there were only 21 responses (11%) stating “considerable ongoing involvement,” with the balance of 63 responses (34%) for “some ongoing involvement.” not surprisingly, there was a correlation between the type of research guide and the amount of systems involvement. for sites running a “customized open source system,” “other commercial system,” or “homegrown system,” at least 80% of responses indicated either “considerable” or “some” ongoing systems involvement. in contrast, 37% of sites running libguides or campusguides indicated “considerable” or “some” technical involvement. further, the libguides and campusguides users recorded the highest percentage (43%) of “no ongoing involvement” compared to 37% of all respondents. interestingly, 20% of libguides and campus guides users answered “n/a we do not have a systems department,” which is not significantly higher than all respondents for this question at 19%. the level of interaction between research guides and enterprise library systems was not reported as strong. when asked “which statement best describes the relationship between your web content management system and your research guides?” 112 responses (60%) indicated that “our content management system is independent of our research guides” with an additional 51 responses (27%) indicating that they did not have a content management system (cms). only 12 respondents (6%) said that their cms was integrated with their research guides with a remaining 13 (7%) saying that their cms was used for “both our website and our research guides.” a similar portrait was found in seeking out the relationship between research guides and discovery/federated search tools. 
when asked “which statement best describes the relationship between your discovery/federated search tool and your research guides?” roughly half of the respondents (96, 51%) did not have a discovery system (“n/a we do not have a discovery tool”). only 12 respondents (6%) selected “we prominently feature our discovery tool on our guides,” whereas more than double that number, 26 (14%), said “we typically do not include our discovery tool on our guides.” fifty-four respondents (29%) took the middle path of “our discovery tool is one of many search options we feature on our guides.” in the case of both discovery systems and content management systems, it seems that research guides are typically not deeply integrated. when asked “what other type of content do you host on your research guides system?” respondents selected from a list of choices as reflected in table 1.
answer | total | percent | percent using libguides/campusguides
course pages | 127 | 68% | 74%
“how to” instruction | 123 | 65% | 77%
alphabetical list of all databases | 76 | 40% | 42%
“about the library” information (for example: hours, directions, staff directory, events) | 59 | 31% | 35%
digital collections | 34 | 18% | 19%
everything—we use the research guide platform as our website | 16 | 9% | 9%
none of the above | 17 | 9% | 2%
table 1. other types of content hosted on research guides system
these answers reinforce the portrait of integration within the larger library web presence. while the research guides platform is an important part of that presence, significant content is also being managed by libraries through other systems. this is also consistent with the findings from the arl website visits, where course pages were consistently found within the research guides platform. for sites reporting libguides or campusguides as their platform, inclusion of course pages and how-to instruction was even higher, at 74% and 77%, respectively.
another multi-answer question sought to determine what types of policies are being used by libraries for the management of research guides: “which of the following procedures or policies do you have in place for your research guides?” responses are summarized in table 2.
answer | total | percent | percent using libguides/campusguides
style guides for consistent presentation | 105 | 56% | 58%
maintenance and upkeep of guides | 94 | 50% | 53%
link checking | 87 | 46% | 50%
required elements such as contact information, chat, pictures, etc. | 78 | 41% | 56%
training for guide creators | 73 | 39% | 43%
transfer of guides to another author due to separation or change in duties | 72 | 38% | 41%
defined scope of appropriate content | 43 | 23% | 22%
allowing and/or moderating user tags, comments, ratings | 36 | 19% | 25%
none of the above | 36 | 19% | 19%
controlled vocabulary/tagging system for managing guides | 23 | 12% | 25%
table 2. management policies/procedures for research guides
while nearly one in five libraries reported none of the policies in place at all, the responses indicate that there is effort being applied toward the management of these systems. the highest percentage for any given policy was 56% for “style guides for consistent presentation.” best practices in these areas could be emerging or many of these policies could be specific to individual library needs. as with the survey question on content, the research-guides platform also has a role, with the libguides and campusguides users reporting much higher rates of policies for “controlled vocabulary/tagging” (25% vs. 12%) and “required elements” (56% vs. 41%). in both of these cases, it is likely that the need for policies arises from the availability of these features and options that may not be present in other systems.
based on this supposition, it is somewhat surprising that the libguides and campusguides sites reported the same lack of policy adoption (none of the above; 19%). the final question in the survey further explored the management posture for research guides by asking a free-text question: “how do you evaluate the success or failure of your research guides?” results were compiled into a spreadsheet. the authors used inductive coding to find themes and perform a basic data analysis on the responses, including a tally of which evaluation methods were used and how often. one in five institutions (37 respondents, 19.6%) looked only to usage stats, while seven respondents (4%) indicated that their library had performed usability testing as part of the evaluation. forty-four respondents (23.4%) said they had no evaluation method in place (“ouch! it hurts to write that.”), though many expressed an interest in, or plans to begin, evaluation. another emerging theme included ten respondents who quantified success in terms of library adoption and ease of use. this included one respondent who had adopted libguides in light of prohibitive it regulations (“we choose libguides because it would not allow us to create class specific research webpages”). several institutions also expressed frustration with the survey instrument because they were in the process of moving from one guides system to another and were not sure how to address many questions. most responses indicated that there are more questions than answers regarding the efficacy of their research guides, though the general sentiment toward the idea of guides was positive, with words such as “positive,” “easy,” “like,” and “love” appearing in 16 responses. countering that, five respondents indicated that their libraries’ research-guides projects had fallen through.
conclusion
this study confirms previous research that web-based research guides are a common offering, especially in academic libraries.
adding to this, we have quantified the adoption of libguides both through visiting arl websites and through a survey distributed to library listservs. further, this study did not find a consistent management or assessment practice for library research guides. perhaps the most interesting finding from this study is the role of library systems departments with regard to research guides. it appears that many library systems departments are not actively involved in either the initiation or ongoing support of web-based research guides. what are the implications for the library technology community and what questions arise for future research? the apparent ascendancy of libguides over local solutions is certainly worth considering and in part demonstrates some comfort within libraries for cloud computing and saas. time will tell how this might spread to other library systems. the popularity of libguides, at its heart a specialized content management system, also calls into question the vitality and adaptability of local content management system implementations in libraries. more generally, does the desire to professionally select and steward information for users on research guides indicate librarian misgivings about the usability of enterprise library systems? how do attitudes toward research guides differ between public services and technical services? hopefully these questions serve as a call for continued technical engagement with library research guides. what shape that engagement may have in the future is an open question, but based on the prevalence and descriptions of current implementations, such consideration by the library technology community is worthwhile.
references
1. luigina vileno, “from paper to electronic, the evolution of pathfinders: a review of the literature,” reference services review 35, no. 3 (2007): 434–51.
2.
martin courtois, martha higgins, and aditya kapur, “was this guide helpful? users’ perceptions of subject guides,” reference services review 33, no. 2 (2005): 188–96.
3. lateka j. grays, darcy del bosque, and kristen costello, “building a better m.i.c.e. trap: using virtual focus groups to assess subject guides for distance education students,” journal of library administration 48, no. 3/4 (2008): 431–53.
4. mira foster et al., “marketing research guides: an online experiment with libguides,” journal of library administration 50, no. 5/6 (july/september, 2010): 602–16; alisa c. gonzalez and theresa westbrock, “reaching out with libguides: establishing a working set of best practices,” journal of library administration 50, no. 5/6 (july/september, 2010): 638–56.
5. luigina vileno, “testing the usability of two online research guides,” partnership: the canadian journal of library and information practice and research 5, no. 2 (2010), http://journal.lib.uoguelph.ca/index.php/perj/article/view/1235 (accessed august 8, 2011).
6. angela horne and steve adams, “do the outcomes justify the buzz? an assessment of libguides at cornell university and princeton university—presentation transcript,” presented at the association of academic and research libraries, seattle, wa, 2009, http://www.slideshare.net/smadams/do-the-outcomes-justify-the-buzz-an-assessment-of-libguides-at-cornell-university-and-princeton-university (accessed august 8, 2011).
7. sarah morris and marybeth grimes, “a great deal of time and effort: an overview of creating and maintaining internet-based subject guides,” library computing 18, no. 3 (1999): 213–16.
8. mathew miles and scott bergstrom, “classification of library resources by subject on the library website: is there an optimal number of subject labels?” information technology & libraries 28, no. 1 (march 2009): 16–20, http://www.ala.org/lita/ital/files/28/1/miles.pdf (accessed august 8, 2011).
9.
association of research libraries, “association of research libraries: member libraries,” http://www.arl.org/arl/membership/members.shtml (accessed october 24, 2011).
appendix. survey
library use of web-based research guides
please complete the survey below. we are researching libraries’ use of web-based research guides. please consider filling out the following survey, or forwarding this survey to the person in your library who would be in the best position to describe your library’s research guides. responses are anonymous. thank you for your help! jimmy ghaphery, vcu libraries; erin white, vcu libraries
1) what is the name of your organization? __________________________________
note that the name of your organization will only be used to make sure multiple responses from the same organization are not received. any publication of results will not include specific names of organizations.
2) which choice best describes your library?
o arl
o university library
o college library
o community college library
o public library
o school library
o private library
o governmental library
o nonprofit library
3) what type of system best describes your research guides by subject?
o libguides or campusguides
o customized open source system
o other commercial system
o homegrown system
o static html pages
4) which statement best describes the selection of your current research guides system?
o initiated by administration
o initiated by systems
o initiated by public services
o initiated by an informal cross-departmental group
o more of a library-wide initiative
5) how much ongoing involvement does your systems department have with the management of your research guides?
o no ongoing involvement
o some ongoing involvement
o considerable ongoing involvement
o n/a we do not have a systems department
6) what other type of content do you host on your research guides system?
o course pages
o “how to” instruction
o alphabetical list of all databases
o “about the library” information (for example: hours, directions, staff directory, events)
o digital collections
o everything—we use the research guide platform as our website
o none of the above
7) which statement best describes the relationship between your discovery/federated search tool and your research guides?
o we typically do not include our discovery tool on our guides
o our discovery tool is one of many search options we promote on our guides
o we prominently feature our discovery tool on our guides
o n/a we do not have a discovery tool
8) which statement best describes the relationship between your web content management system and your research guides?
o our content management system is independent of our research guides
o our content management system is integrated with our research guides
o our content management system is used for both our website and our research guides
o n/a we do not have a content management system
9) which of the following procedures or policies do you have in place for your research guides?
o defined scope of appropriate content
o required elements such as contact information, chat, pictures, etc.
o style guides for consistent presentation
o allowing and/or moderating user tags, comments, ratings
o training for guide creators
o controlled vocabulary/tagging system for managing guides
o maintenance and upkeep of guides
o link checking
o transfer of guides to another author due to separation or change in duties
o none of the above
10) how do you evaluate the success or failure of your research guides? [free text]
development of a gold-standard pashto dataset and a segmentation app
yan han and marek rychlik
information technology and libraries | march 2021
https://doi.org/10.6017/ital.v40i1.12553
yan han (yhan@email.arizona.edu) is full librarian at the university of arizona libraries. marek rychlik (rychlik@math.arizona.edu) is full professor at the department of mathematics, university of arizona. © 2021.
abstract
the article aims to introduce a gold-standard pashto dataset and a segmentation app. the pashto dataset consists of 300 line images and corresponding pashto text from three selected books. a line image is simply an image consisting of one text line from a scanned page. to our knowledge, this is one of the first open access datasets which directly maps line images to their corresponding text in the pashto language. we also introduce the development of a segmentation app using textbox expanding algorithms, a different approach to ocr segmentation. the authors discuss the steps to build a pashto dataset and develop our unique approach to segmentation. the article starts with the nature of the pashto alphabet and its unique diacritics, which require special considerations for segmentation. needs for datasets and a few available pashto datasets are reviewed. criteria for the selection of data sources are discussed, and three books were selected by our language specialist from the afghan digital repository.
the authors review previous segmentation methods and introduce a new approach to segmentation for pashto content. the segmentation app and results are discussed to show readers how to adjust variables for different books. our unique segmentation approach uses an expanding textbox method which performs very well given the nature of the pashto scripts. the app can also be used for persian and other languages using the arabic writing system. the dataset can be used for ocr training, ocr testing, and machine learning applications related to content in pashto. background the ocr technology for printed modern latin scripts is a largely solved problem, as both character and word accuracies typically reach greater than 95%. most well-known commercial ocr systems include abbyy, omnipage, and adobe acrobat ocr engine (licensed from iris), while open source systems have tesseract, ocropus, and kraken. ocr technology for other languages and scripts, including arabic scripts and traditional chinese, is still not satisfactory despite the fact that ocr research on these languages has been ongoing since the 1980s. an east asian librarian in 2019 wrote to the author: i am just back from the annual aas (association for asian studies) and ceal (council on east asian libraries) meetings. this year, prof. peter bol of harvard hosted a 2-day digital tech expo there to promote digital humanities . . . i spent 1 day on the dh sessions, where scholars constantly mentioned chinese ocr as a conspicuous and serious block on their path to assessing “digitized” textual collections. 
if you and your team succeed, it will surely help the eas scholarly community a lot.1 sturgeon, who has directed the chinese text project since 2005, stated that ocr of premodern chinese texts presents challenges distinct from ocr of modern documents and premodern documents in other languages, because training data is typically not available and a natural approach to improving accuracy is to train using data extracted from real images of text in the same historical writing style.2 sturgeon both used imperfect ocr software and allowed users to manually key in corresponding text via a crowdsourcing approach to gradually improve the quality of transcriptions.3 in 2018, the authors received a grant award from the national endowment for the humanities (neh) to develop ocr and a software prototype for an open-source global language databank for pashto and traditional chinese. activities included fundamental research and software implementation of new ocr technology for the two languages. for the past two years, we have been engaged in all aspects of ocr research in pashto, persian, and chinese scripts, including assessing current technology and systems, reviewing and building datasets, and researching and implementing segmentation algorithms and machine learning models involving neural networks.
languages, scripts, and writing systems
people in the world read, write, and speak a handful of major languages. of those, reading and writing is accomplished through the use of several types of scripts: latin, chinese, arabic, and devanagari. languages and scripts are very complex topics in regard to origin, structure, and use. they evolve by influencing and being influenced by one another.
a script is defined as “a collection of letters and other written signs used to represent textual information in one or more writing systems,” where a writing system is a common communication method to allow people to exchange information through a medium such as paper.4 the first requirement in a writing system is letters or other written signs. a common writing system can use an alphabet, syllabary, or logography. specifically, the latin and arabic writing systems use alphabets, where an alphabet is a standardized set of letters. combination of letters makes a word. another approach is to use a logogram. chinese characters (including japanese kanji and korean hanja) are logograms. in the alphabet and syllabic systems, individual characters represent sounds only, while in the logographic system each logogram represents a word or a phrase. one script, such as latin and arabic, may be used for several different languages, while some languages use several scripts. latin script is used in western europe, most of eastern europe, and across north and south america. arabic script was adopted by the west asian, middle eastern, and near african regions. in contrast, the japanese use three scripts: the hiragana and katakana syllabaries and the kanji logogram. the next critical feature in a writing system is the order in which to read and write. a writing system has two directions: horizontal and vertical. almost all writing systems are written vertically from top to bottom (ttb). bottom-to-top (btt) writing systems do exist. the philippines traditional scripts, the tagalog (baybayin), hanunóo, buhid, and tagbanwa are in limited use today. they are written from btt.5 within the ttb method, four possibilities exist: 1. left to right (ltr) first and ttb: this method refers to writing a horizontal line starting from the top left of a page, continuing to the right, and returning to the next line all the way from top to bottom. the latin writing system uses this variation. 
the current chinese writing system uses this order as well.
2. right to left (rtl) first and ttb: this method refers to writing a horizontal line starting from the top right of a page, continuing to the left, and returning to the next line all the way from top to bottom. arabic writing systems, such as arabic, persian, and pashto, use this order.
3. ttb first and rtl: this method refers to writing a vertical line starting from the top right of a page, continuing to the bottom, and returning to the next line all the way from right to left. this method was widely used in traditional chinese (before the 1950s) and traditional japanese materials for thousands of years. it is still used in chinese calligraphy, and occasionally can be found in materials published in chinese.
4. ttb first and ltr: rarely used by a writing system. one of the examples is the manchu script.6
the nature of the scripts and the writing systems may require different algorithms and considerations when we deal with ocr technology, including preparing datasets, segmentation, and performing ocr in computer vision.
pashto
pashto (پښتو), alternatively spelled as pushto, pukhto, or pakhto, historically as afghani (افغاني), is one of the two official languages of afghanistan (the other is dari/farsi/persian). it is also spoken as a regional language in pakistan. pashto is spoken by 40 to 60 million people in afghanistan and pakistan.7 the arabic script writing system is used for writing the arabic, persian, and pashto languages in a cursive style. arabic, persian, and pashto are totally different languages, though they use almost the same alphabets within the same writing system. the pashto alphabet is a modified form of the arabic alphabet. it consists of 45 letters and four diacritic marks and includes all 28 letters from the arabic alphabet.
the pashto alphabet includes all 32 letters from the persian alphabet, of which 28 letters are from the arabic alphabet. the romanization of pashto consists of several standards including the american library association (ala) and library of congress (lc) ala-lc romanization, bgn/pcgn, din 31635, iso 233, and arabtex. details of romanization of pashto letters with their initial, medial, final forms, and the ala-lc rules are available at the library of congress’s website.8
the need for datasets
the authors are currently engaging in ocr research, and have applied machine learning (ml) models and methods such as convolutional neural networks (cnn) and recurrent neural networks (rnn). the advance of ml models and multiple methods has achieved great improvements in many fields. for instance, the most well-known event in ml occurred when an ai program named alphago defeated the top professional go player lee sedol in 2016. the open-source ocr systems ocropus and tesseract released versions using rnn models in 2014 and 2018, respectively. these models and methods rely heavily on datasets for training, improvement, and evaluation. similarly, alphago uses datasets for training and evaluations. good and comprehensive datasets are critical to the success of an ml model and/or method. the most well-known dataset is the mnist database, which contains a training set of 60,000 images and a test set of 10,000 images (28 × 28 pixels) of handwritten digits (0–9). the dataset is widely used for training and testing in ml as the gold-standard dataset for ml techniques and pattern recognition.
related datasets
currently, few pashto datasets are available as open access. while there are other pashto datasets mentioned in the literature, we have not found one that provides a one-to-one mapping of line images to texts.
a search on github has one result showing a raw text dataset containing content in pashto scraped from the web. however, this dataset is of little use in the case of training ml models for ocr, because it has no corresponding images. the computer science department of the national university of computer and emerging sciences (nuces) peshawar campus has been working on pashto ocr since 2006, and its research has created a pashto image-to-ligature dataset titled fast-nu dataset, containing 4,000 images of 1,000 unique ligatures in a variety of font sizes.9 the creators of this dataset have kindly sent us the pashto image-to-ligature dataset. a recent paper discussed the use of deep learning architectures for ocr in pashto with the development of a bigger dataset based on the fast-nu dataset including contours, negative, and rotated images.10 ali developed a database recording pashto digits from 25 male and 25 female native pashto speakers for automatic speech recognition. unfortunately, the authors had difficulties in downloading this dataset.11 khan et al. designed a database encompassing a total of 4,488 images (102 distinguishing samples for the 44 pashto letters). this approach is very close to that of the fast-nu dataset.12 we are not sure if they are very similar, as we have not found a way to download and evaluate the dataset. another article describes offline pashto ocr using ml which tested more than 5,000 images in the dataset.13 the article describes its “extraction of lines containing pashto content,” but these “lines containing pashto content” have no specific resource or link to check.
rawan and han compiled a pashto–english dictionary, which is openly accessible through its website and an android app.14 in the past decade of working with afghan materials, han and rawan found several existing pashto language dictionaries online but encountered several issues related to standardized spelling, pronunciation, romanization/transliteration, and limited content. this improved dictionary contains over 12,000 entries of pashto words; each entry has a pashto word and corresponding english meanings. the pashto–english dictionary has been created with the following objectives in mind: a) standardized spelling and vocabulary, b) standard pronunciation, and c) standardized romanization with the ala–lc romanization scheme. other published pashto dictionaries either use one of the above or a combination of a few romanization systems. this dataset is available for noncommercial use upon reasonable request. two datasets in other languages (arabic and persian) were produced by the open islamicate texts initiative, available in github (https://github.com/openiti/ocr_gs_data).15 both the arabic and persian datasets have scans of original books from the premodern period and corresponding texts.16 for example, its persian datasets came from page images from three persian books. these pages were segmented into separate line images and the line images were transcribed with corresponding persian texts.
building a pashto dataset
our dataset creation methodology consists of three phases:
• the first is to select pashto publications from our largest digital afghan collections. the focus was to have a language specialist who selected publications varying in fonts, original quality, and publication years.
• the second phase is to use our segmentation app to produce line images from page images of the selected titles. because of the nature of pashto alphabets, we took a different segmentation approach involving expanding textboxes. this approach produced positive outcomes. • the final phase is to generate gold-standard text from corresponding line images involving human key-in and final review. we originally hoped that ocr generated text could increase productivity. unfortunately, the text produced from the current open-source ocr system tesseract 4.x was not useful. a persian ph.d. student was hired to complete the one-to-one key-in. finally, the author and his colleague reviewed the dataset. data source rawan and han at the university of arizona libraries have been collaborating with the afghanistan centre at kabul university (acku), the de facto national library of afghanistan. the purpose of the 13-year-long collaboration is to preserve and provide open access to afghanistan’s unique materials from the acku’s physical collections. initially funded by a grant of $350,000 from the national endowment for the humanities (neh) for the period of 2008 to 2012, the project digitized 200,000 pages of materials from the modern period. the project continues to receive support from the university of arizona and the acku. the acku’s permanent collection is the most extensive in the region covering a time of war and social upheaval in the country, with most of the documents in the principal languages of pashto, dari (persian), and english with a variety of formats such as monographs, series, reports, yearbooks, videos, and newspapers. in addition, rawan and han also pursued related afghani scholars’ collections including those of ludwig w. adamec and m. mobin shorish. a repository (www.afghandata.org) has been openly accessible containing these unique materials dating from the 1950s to the present. 
the repository has grown from the initial 200,000 pages to 2 million, and is the biggest digital repository in the world covering afghanistan and its region with more than 200,000 active users viewing 400,000 pages per year. the wealth of the materials in terms of content, formats, and sources of information makes them undoubtedly the ultimate source of information for the studies of afghanistan and its region. from a data scientist’s point of view, the repository is a treasure trove for big data and ml purposes because it consists of a diversity of content from many sources in a variety of formats and document layouts.
selection
the selected books, published in 1986, 2002, and 2006 respectively, vary in fonts, printing, and digitization quality. ms. rawan, a language specialist, selected ten pashto books from the afghan digital repository. due to the limited funding available, only three books were used as the source for the dataset. more titles can be added if additional funding is available in the future.
1. “کرګر اکبر محمد څیړونکی / څیره اوفلسفی عرفانی روښان دبایزید کی حالنامه په” (mystic and philosophic profile of bayazid roshan as reflected in halnama), published in 2006 and digitized at 400 dpi in grayscale (www.doi.org/10.2458/azu_acku_bp189_5_bay29_pay23_1385).
2. “لیکنه او څیړنه معصومه رضاء سید” (women in life), published in 2002 and digitized at 600 dpi in black and white (www.doi.org/10.2458/azu_acku_bp173_4_ray62_1381).
3. “… تعلمیم القران او دینیات” (teaching the qur'an and theology), published in 1986 with lower-quality printing in a different pashto font and digitized at 600 dpi in black and white (www.doi.org/10.2458/azu_acku_bp45_tay67_1365).
image processing algorithms and the segmentation app
in a general traditional ocr system, the workflow of recognition consists of the following stages: preprocessing, document layout analysis, page segmentation, classification, and postprocessing. in our research, we refer to segmentation as the process of partitioning a digital image into one or more information blocks. the segmentation app is used during the preprocessing and segmentation stages. sturgeon’s paper discussed major methods for preprocessing and character segmentation.17 multiple papers discussed various methods to do arabic or pashto text segmentation, including
• horizontal projection18
• baseline19
• template matching20
• contour analysis21
• zoning,22 and
• a combination of one or more of the above methods, such as contour analysis and template matching.23
these methods have certain issues when dealing with letters with dots on the top or the bottom, and diacritics, specific to pashto scripts, as the pashto alphabet contains more letters and diacritics than its counterparts in arabic and persian. in addition, noise from original low-quality printing and digitization creates additional barriers. ullah et al.24 briefly mentioned text area detection and segmentation with the detection and removal of diacritics. their segmentation goes from line segmentation using the horizontal projection, to word, and then to character level progressively. the letters (e.g., څ ,ټ, ښ) are sensitive to noise randomly appearing in page images. our method has proven to be successful in getting accurate character and line segments with the benefits of simplicity and program efficiency. a detailed discussion of the method is beyond the scope of this paper and will be presented in another paper. the author and a postdoctoral researcher created the code to identify pashto/persian text lines from page images, where the page images are from our digitization master files.
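for context, the horizontal projection technique listed among the prior methods can be sketched in a few lines. this is a generic textbook illustration and not code from any of the cited papers: it assumes a binarized page stored as rows of 0/1 values, and the names `horizontal_projection`, `split_lines`, and `min_ink` are hypothetical. the toy page shows why dots and diacritics are troublesome: a dot separated from its letter body by one blank row is cut off as its own "line."

```python
def horizontal_projection(binary_image):
    """Count ink pixels (value 1) in each row of a binarized page."""
    return [sum(row) for row in binary_image]


def split_lines(binary_image, min_ink=1):
    """Return (top, bottom) row ranges, bottom exclusive, for runs of inked rows."""
    lines, start = [], None
    for y, ink in enumerate(horizontal_projection(binary_image)):
        if ink >= min_ink and start is None:
            start = y                      # a run of inked rows begins
        elif ink < min_ink and start is not None:
            lines.append((start, y))       # a blank row closes the run
            start = None
    if start is not None:
        lines.append((start, len(binary_image)))
    return lines


# Toy 8-row page: a dot above the first text line, then a second text line.
page = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],  # dot above a letter
    [0, 0, 0, 0, 0, 0],  # blank row between dot and letter body
    [1, 1, 1, 1, 1, 0],  # letter bodies, first text line
    [1, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0],  # second text line
    [0, 0, 0, 0, 0, 0],
]

line_ranges = split_lines(page)
print(line_ranges)
```

on this toy page the dot row is reported as a separate range from the letter bodies beneath it, which is the kind of error a diacritic-aware method has to avoid.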
our method takes a different approach from the above segmentation methods. algorithms and specific properties related to the characteristics of pashto letters have been implemented. we call it the “expanding textbox” method, which calculates the overlapping ratio of one textbox with the others and merges them based on a confidence level controlled by users. the confidence level of the overlapping ratio is controlled by properties such as textbox, overlaptype, overlapthreshold, maxdiacriticssize, and minlineheight. to support segmentation, the app also includes common image preprocessing algorithms such as binarization. commercial and open-source ocr systems give users few choices in page segmentation. we believe that the availability of flexible adjustments unique to the pashto/persian/arabic alphabets allows users to achieve accurate results, based on analysis of our large collections of pashto materials. these collections of printed materials, spanning the period from the 1950s to the 2010s, were published by governments, non-profit organizations, local companies, and individuals, and were printed in a diverse range of fonts and printing quality. the app allows users to adjust several variables to ensure accurate segmentation. segmentation parameters such as vertical expansion and horizontal expansion (see fig. 1) can be adjusted to expand the line vertically and/or horizontally. our experiments show that a vertical expansion of −0.15 and a horizontal expansion of 5 work for most of the page images from our collections.
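the merging idea described above can be illustrated with a short sketch. this is not the app's matlab implementation, whose details are not published here; it is a simplified illustration under stated assumptions. boxes are assumed to be (left, top, right, bottom) tuples, the vertical expansion is assumed to be a fraction of the box height (negative values shrink the box), horizontal expansion is left out, and the names `expand`, `vertical_overlap_ratio`, `merge_into_lines`, and `overlap_threshold` are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom), assumed layout


def expand(box: Box, v_exp: float) -> Box:
    """Grow (v_exp > 0) or shrink (v_exp < 0) a box vertically by a fraction of its height."""
    left, top, right, bottom = box
    dy = v_exp * (bottom - top)
    return (left, top - dy, right, bottom + dy)


def vertical_overlap_ratio(a: Box, b: Box) -> float:
    """Vertical overlap of two boxes divided by the height of the shorter box."""
    overlap = min(a[3], b[3]) - max(a[1], b[1])
    shorter = min(a[3] - a[1], b[3] - b[1])
    if overlap <= 0 or shorter <= 0:
        return 0.0
    return overlap / shorter


def union(a: Box, b: Box) -> Box:
    """Smallest box that contains both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def merge_into_lines(boxes: List[Box], v_exp: float = -0.15,
                     overlap_threshold: float = 0.5) -> List[Box]:
    """Greedy single pass: merge each box into the first line it overlaps enough."""
    lines: List[Box] = []
    for box in sorted(boxes, key=lambda b: b[1]):  # process top to bottom
        for i, line in enumerate(lines):
            ratio = vertical_overlap_ratio(expand(box, v_exp), expand(line, v_exp))
            if ratio >= overlap_threshold:
                lines[i] = union(line, box)  # keep original (unexpanded) extents
                break
        else:
            lines.append(box)  # no existing line matched, so start a new one
    return lines


# Two word boxes on one line, a detached dot just below it, and a second line.
boxes = [(10, 10, 50, 30), (60, 12, 100, 32), (70, 33, 74, 37), (10, 50, 100, 70)]

print(merge_into_lines(boxes, v_exp=-0.15))  # the dot stays separate
print(merge_into_lines(boxes, v_exp=0.5))    # the dot is absorbed into the first line
```

in this toy example the expansion value alone decides whether the detached dot is attached to its line, which is why such a parameter is worth exposing to users rather than fixing in code.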
however, both variables are subject to change if lines are not segmented correctly. figures 2 and 3 show a real-life example in which the vertical expansion had to be changed from −0.15 to −0.20 to get all of the correct lines. users can adjust these variables to achieve the desired output if diacritics and lines are not recognized correctly. the app was programmed in matlab; it can run within matlab or independently if packaged with matlab. the app can be exported to other platforms and run in batch mode if needed. the app has a simple gui (see fig. 1) providing previews of expanded ligatures, expanded diacritics, lines of text, and the binarized image. this allows users to adjust segmentation variables and verify results before outputting. figure 4 shows an example of the lines-of-text preview. when satisfied, users can output these lines as images (one image per line from a page image). these line images are ready for ocr or manual transcription.

figure 1. expanded diacritics (highlighted in red) and the app gui.
figure 2. vertical expansion set as −0.15 missing two lines.
figure 3. vertical expansion set as −0.20 producing correct results.
figure 4. text lines identified (lines in green).

finalizing the dataset

to build a truly 100% accurate dataset, we had a language specialist key in, verify, and double-check the corresponding pashto texts. a ph.d. student from the school of middle eastern and north african studies (with persian language fluency) was hired to complete this task.
initially, we tried to ocr these line images by using the open-source system tesseract 4.x with the hope that its output would speed up the key-in process. unfortunately, the majority of the ocr results from these line images was not usable. to ensure that the dataset has the gold-standard one-to-one mapping of a line image to a line text, the ph.d. student keyed in pashto texts line by line by viewing every individual line image. figure 5 shows a sample line image and its text. finally, ms. rawan and the author han reviewed these line images and their corresponding texts. the dataset is organized in a hierarchical structure consisting of directories, where each directory contains line subdirectories which hold line images and its texts. the dataset is openly available at github (https://github.com/yhan818/pashto-dataset). figure 5. sample line image and corresponding text. discussion the nature of the scripts and the writing systems may require different algorithms and considerations when we deal with ocr technology, including preparing datasets, segmentation, and performing ocr in computer vision. in our research in specific languages, we have tested this app with documents in pashto, persian/dari and arabic with successful results. our textbox extension method should work for any language using the arabic writing system beyond these above scripts. during our research, we are clearly aware of the following limitations of the ocr technology, techniques and systems: 1) lack of high accuracy in segmentation: a) while it is true that ocr on the character/word accuracy of the latin scripts can exceed 95% accuracy, one shall not believe that the accuracy of a document after ocr will be at the same level. depending on the nature of a document, segmentation accuracy varies among documents. 
ocring documents in simple layout (e.g., a monograph without columns and tables) generally reaches high accuracy, while ocring documents with complex layouts (e.g., newspapers and scientific articles) generates poor results. b) we have tested multiple popular commercial and open-source ocr systems specifically in the area of segmentation. on several samples, every ocr system failed completely. in other words, the text output of every ocr system is nonsense. in some cases, only abbyy recognized columns correctly; the remaining systems unexpectedly transposed text columns, which means potential indexing and searching errors, although the character and word accuracy reached 95% accuracy. c) we argue that segmentation accuracy shall be added as one of the most important evaluation criteria. 2) to date almost all ocr technology and systems are limited to text only: a) missing information in other formats: we agree that plain text in the writing systems is the most commonly used and a very important communication method. however, almost all materials contain information in other formats (e.g., illustrations, figures, tables, and formulas) that may be very difficult to describe in text. an individual page from a monograph, journal article, or newspaper may contain information in other formats beyond text. such information can be a table, mathematical formula, figure, picture, or drawing. one company, mathpix, recently started to provide ocr on simple tables and mathematical formulas for a fee. b) missing semantic information: in addition, current ocr simply outputs plain text, ignoring existing semantic information (e.g., bold highlighted text and section/subsection headings in different fonts and sizes). ocr refers to character recognition, which limits its own scope in theory.
in current practices, semantic information is totally ignored by every ocr system. scant research has been carried out for them too. conclusion and future work so far, we have created a pashto dataset containing 300 line images and their corresponding text from three pashto monographs published in 1986, 2002, and 2006, respectively. the dataset is openly available as a gold-standard pashto dataset from real books. when future funding is received, we will add more data to this dataset. the segmentation app produces accurate line images from page images for pashto and persian content. it will work for other languages using the arabic writing system. potential users of our prototype software will find it is relatively easy to modify with little knowledge of the underlying technology in other programming languages such as java. in addition, researchers who understand linear algebra in which matlab is used can modify the code for their needs. we are also using this dataset to train and evaluate our current ocr algorithms with rnn and other ml models. an initial report of our research and results can be found at arxiv.org.25 the authors will report and update future research results and available datasets via conferences and formal publications. the authors would like to thank the national endowment for the humanities for its grant (pr 263939-19) to our project development of image-to-text conversion for pashto and traditional chinese. the authors would like to thank riaz ahmad and saeeda naz for providing the nuces fast ligature dataset. the authors would also like to thank atifa rawan, sayyed m. vazirizade, and sharam parastesch for their valuable contributions. ms. rawan selected the sample pashto manuscripts and reviewed the lines. dr. vazirizade worked on segmentation algorithms and code. ph.d. student sharam parastesch keyed in and verified the dataset. 
glossary of terms

• alphabet: an alphabet is a standardized set of letters and symbols. the most popular are the latin alphabet (a–z) and the arabic alphabet.
• classification: in machine learning, classification is to assign a sample to one or more classes by supervised learning from examples.
• dataset: a set of data. a dataset can be in a variety of forms or formats (e.g., text, images, audio, videos, 3d objects, gps data, machine learning data), from one table, to a collection of images of handwritten digits (e.g., mnist database), to a collection of metadata of a digital repository (e.g., arxiv dataset).
• document layout analysis: in the ocr technology, document layout analysis is the process of identifying the layout and categorizing the information blocks in a digital image of a document. the goal is to segmentalize one information block from the other and arrange these information blocks in the correct reading order.
• language: a structured system of communication; the system of linguistic signs or symbols considered in the abstract (as opposed to speech).
• left to right (ltr) first and top to bottom (ttb) writing direction: writing a horizontal line starting from the top left of a page, continuing to the right, and returning to the next line all the way from top to bottom. the latin writing system uses this writing direction. the current chinese writing system uses this order as well.
• logogram: a written character that represents a word or phrase. the most popular are chinese (simplified and traditional) characters, kanji (japanese), and hanja (korean).
• optical character recognition (ocr): conversion of an image consisting of text (printed or handwritten) into digital text.
• page segmentation: segmentation process for a scanned page in a digital image file format.
• right to left (rtl) first and top to bottom (ttb) writing direction: writing a horizontal line starting from the top right of a page, continuing to the left, returning to the next line and all the way from top to bottom. the arabic writing systems such as arabic, persian, and pashto use this order.
• script: “[a] collection of letters and other written signs used to represent textual information in one or more writing systems.”26
• segmentation: segmentation is the process of partitioning a digital image of a document into multiple segments where each segment consists of a set of pixels. it aims at separating the digital image into one or more information blocks, where each information block contains logical information separated from the other information block. these information blocks shall be arranged in the correct reading order. (see document layout analysis)
• textbox: in an ocr system, a textbox is a box with (x,y) (identified in the computer source code) that contains one or more characters.
• top to bottom (ttb) first and left to right (ltr) writing direction: writing a vertical line starting from the top left of a page, continuing to the bottom, and returning to the next line all the way from left to right. this method is rarely used by any writing system.
• top to bottom (ttb) first and right to left (rtl) writing direction: writing a vertical line starting from the top right of a page, continuing to the bottom, and returning to the next line all the way from right to left. this method was widely used in traditional chinese (before 1950s) and traditional japanese materials for thousands of years. it is still used in chinese calligraphy, and occasionally can be found in materials published in chinese.
• writing system: a common communication method to allow people to exchange information through a medium such as paper.
endnotes

1 lu gan, email message to author, march 25, 2019.
2 donald sturgeon, “large-scale optical character recognition of pre-modern chinese texts,” international journal of buddhist thought and culture 28, no. 2 (2018): 11–44, https://dsturgeon.net/papers/large-scale-chinese-ocr.pdf.
3 donald sturgeon, “digitizing premodern text with the chinese text project,” journal of chinese history 4, no. 2 (2020): 486–98, https://doi.org/10.1017/jch.2020.19.
4 “glossary of unicode terms,” the unicode consortium, last updated may 20, 2020, http://www.unicode.org/glossary/.
5 the unicode standard version 13.0—core specification: chapter 17: indonesia and oceania (the unicode consortium: mountain view, ca, 2020), https://www.unicode.org/versions/unicode13.0.0/ch17.pdf#g26723.
6 britta-maria gruber and wolfgang kirsch, “writing manchu on a western computer (an interim report),” saksaha: a journal of manchu studies 3 (1998), https://doi.org/10.3998/saksaha.13401746.0003.008.
7 herbert penzl, a grammar of pashto: a descriptive study of the dialect of kandahar, afghanistan (new york: ishi press, 2009).
8 library of congress, pushto romanization tables (2013), https://www.loc.gov/catdir/cpso/romanization/pushto.pdf.
9 riaz ahmad et al., “robust optical recognition of cursive pashto script using scale, rotation and location invariant approach,” plos one 10, no. 9 (september 14, 2015): e0133648, https://doi.org/10.1371/journal.pone.0133648.
10 shizza zahoor et al., “deep optical character recognition: a case of pashto language,” journal of electronic imaging 29, no. 02 (march 4, 2020), https://doi.org/10.1117/1.jei.29.2.023002.
11 zakir ali et al., “database development and automatic speech recognition of isolated pashto spoken digits using mfcc and k-nn,” international journal of speech technology 18, no. 2 (june 2015): 271–75, https://doi.org/10.1007/s10772-014-9267-z.
12 sulaiman khan et al., “knn and ann-based recognition of handwritten pashto letters using zoning features,” international journal of advanced computer science and applications 9, no. 10 (2018), https://doi.org/10.14569/ijacsa.2018.091069.
13 sultan ullah et al., “offline pashto ocr using machine learning,” in 2019 7th international electrical engineering congress (ieecon), (hua hin, thailand, 2019): 1–4, https://doi.org/10.1109/ieecon45304.2019.8938859.
14 atifa rawan and yan han, the pashto-english dictionary (2014), http://www.pashtoenglish.org.
15 open islamicate texts initiative, open islamicate texts initiative (openiti): creating the digital infrastructure for the study of the premodern islamicate world (2016), https://iti-corpus.github.io/.
16 matthew thomas miller, maxim g.
romanov, and sarah bowen savant, “digitizing the textual heritage of the premodern islamicate world: principles and plans,” international journal of middle east studies 50, no. 1 (february 2018): 103–9, https://doi.org/10.1017/s0020743817000964.
17 sturgeon, “large-scale optical character recognition of pre-modern chinese texts,” 11–44.
18 mohamed attia and mohamed el-mahallawy, “histogram-based lines and words decomposition for arabic omni font-written ocr systems; enhancements and evaluation,” in computer analysis of images and patterns, ed. walter g. kropatsch, martin kampel, and allan hanbury, vol. 4673, lecture notes in computer science (berlin, heidelberg: springer berlin heidelberg, 2007), 522–30, https://doi.org/10.1007/978-3-540-74272-2_65; mahmoud a. a. mousa, mohammed s. sayed, and mahmoud i. abdalla, “arabic character segmentation using projection based approach with profile’s amplitude filter,” arxiv:1707.00800 [cs], july 3, 2017, http://arxiv.org/abs/1707.00800.
19 atallah al-shatnawi and khairuddin omar, “methods of arabic language baseline detection— the state of art,” international journal of computer science and network security 8, no. 10 (october 2008); tarik abu-ain et al., “a novel baseline detection method of handwritten arabic-script documents based on sub-words,” in soft computing applications and intelligent systems, ed. shahrul azman noah et al., communications in computer and information science 378 (springer: berlin, heidelberg, 2013), 67–77, https://doi.org/10.1007/978-3-642-40567-9_6; saeeda naz et al., “challenges in baseline detection of arabic script based languages,” in intelligent systems for science and information, ed. liming chen, supriya kapoor, and rahul bhatia, studies in computational intelligence (springer international publishing, 2014), 542: 181–96, https://doi.org/10.1007/978-3-319-04702-7_11.
20 majid ziaratban and karim faez.
“a novel two-stage algorithm for baseline estimation and correction in farsi and arabic handwritten text line,” in 2008 19th international conference on pattern recognition, tampa, fl, usa: ieee, 2008: 1–5, https://doi.org/10.1109/icpr.2008.4761822.
21 safwan wshah, zhixin shi, and venu govindaraju, “segmentation of arabic handwriting based on both contour and skeleton segmentation,” in 2009 10th international conference on document analysis and recognition, barcelona, spain: ieee, 2009: 793–97, https://doi.org/10.1109/icdar.2009.152; yusra osman, “segmentation algorithm for arabic handwritten text based on contour analysis,” in 2013 international conference on computing, electrical and electronic engineering (icceee), khartoum, sudan: ieee, 2013: 447–52, https://doi.org/10.1109/icceee.2013.6633980.
22 khan et al., “knn and ann-based recognition of handwritten pashto letters using zoning features.”
23 abdelhay zoizou, arsalane zarghili, and ilham chaker. “a new hybrid method for arabic multifont text segmentation, and a reference corpus construction.” journal of king saud university—computer and information sciences 32, no. 5 (june 2020): 576–82, https://doi.org/10.1016/j.jksuci.2018.07.003.
24 ullah, “offline pashto ocr using machine learning.”
25 marek rychlik et al., “development of a new image-to-text conversion system for pashto, farsi and traditional chinese,” arxiv:2005.08650 [cs], may 8, 2020, http://arxiv.org/abs/2005.08650.
26 “glossary of unicode terms,” http://www.unicode.org/glossary/.

communications

the evolution of an online acquisitions system

jenko lukac: lewis and clark college library, portland, oregon.

about two years ago a home-grown online acquisitions system was developed and implemented at pacific university. the program, written in basic for the data general nova computer, performs all the necessary functions such as ordering, receiving, fund accounting, etc.1 this program was offered to the library community, and about one hundred libraries from around the world have availed themselves of it. one of the libraries that obtained and adopted pacific's electronic acquisitions system (peas) was the watzek library at lewis and clark college. the advantage of a home-grown system is that it can be freely modified to suit the evolving needs of a particular library.
this communication describes some of the changes made by lewis and clark college to the peas program, in order to illustrate how software developed at one institution can be "imported" into and enhanced by another institution . although matters were particularly simplified by having the same person who developed peas at pacific be responsible for the enhancements at lewis and clark, the procedure and conclusions are still generally applicable. the first change made to the peas program was to rename it clas-the computerized library acquisitions system . the most important change, however, was to translate it from data general basic to digital equipment corporation basic, since the computer at lewis and clark is a dec vax-11 . (each hardware manufacturer implements a slightly different version of a programming language.) the translation requires changing things such as square brackets to parentheses, the word read to get, the word write to put, etc. these changes would have to have been done repeatedly throughout the program, but, in fact, were quite easily accomplished by using a text editor-a metaprogram that can be instructed to change all occurrences of, for example, the word read to the word get in a single pass. clas retained all of the features of peas , and became fully operational at lewis and clark in february of 1980. since then, new features have been added as the staff expressed a need for them. some are minor, such as having the computer recognize initial articles in titles . others are more significant : 1. searching for records in clas by author and title makes use of unlimited rightand left-handed truncation. this makes possible subject searching through k~y words in the title . for this purpose an extra terminal is provided at the reference desk. 2. clas permits the file to be searched by the name of the faculty member who requested the item, in addition to the eight other access points available in peas. 3. 
clas provides an activity report for any given period showing, for each fund, the amount ordered, the amount received, and the average cost per item . 4. clas can produce vendor reports showing for each vendor the average discount and the delivery schedule. 5. clas asks the operator to verify the cost of an item if the list price and cost differ by more than 30 percent. 6. clas allows the receipt of partial shipments. some of the enhancements to clas involved successive modifications. for example, one of the features of peas was the prevention of duplicate orders by matching new orders being input with records already in the database. a potential duplicate is reported if there is a match on both the author and the title fields . it was decided at the time of implementation at lewis and clark that this criterion was too restrictive, and clas was programmed to report a duplicate if only the title fields matched . after some months of experience, it turned out that even this requirement was excessively restrictive: a slight variation in the way a title was input would prevent a duplicate from showing up. the criterion was then further relaxed to signal duplicates if either the title or the author's last name matched. this, however, was too broad a net : although no duplicates were missed, ordering a book by wilson or smith produced a tedious list of potential duplicates. hence , the requirement was tightened slightly to look for a match in either the title or the author's last name and first initial. this final criterion is currently serving well the needs of the watzek library. what is important about this evolutionary process is that it illustrates the dynamic way in which a library can "fine-tune" an automated system that is receptive to user modifications. since peas is supposed to be a selfexplanatory system, it lacks any documentation. 
clas is still a self-explanatory system, but nevertheless a manual has been produced to describe all its features and to record programming information such as the structure of the files . one version of the documentation is kept in machine-readable form so that it can be easily updated to correspond to developments in the program . in conclusion, it can be stated that a library-application software package has been successfully transplanted from one institution to another, from one hardware environment to another, and in doing so has matured into a fuller and more flexible system, which it is hoped will, in turn, benefit other libraries contemplating the automation of their acquisitions operation .2 references 1. jenko lukac, "a no cost online acquisicommunications 101 tions system for a medium-size library," library journal 107:684-85 (march 15, 1980). 2. interested libraries can request a copy of the clas program ($80) or manual ($40) directly from the author. the significance of information in the ordinary conduct of life* robert newhard: torrance public library, torrance, california. the information benefit provided to the general public by the developing telecommunications systems will be highly dependent upon the provider's perception of the current and potential role of information in the ordinary interests of life. as sessing this role cannot easily be done by standard questionnaire or survey methods because information does not have a conscious function in people's lives. some paradigms from the past and present may, therefore, be of use in articulating the everyday importance of information. the tool paradigm: information as a link between man and his tools or repairing a lost confidence prior to the industrial revolution, most production was carried on in the home, using tools either made or repaired mainly at home. in this cottage industry, each person was very close to and secure in the use of his tools . 
with the advent of the industrial revolution and the factory system, the worker no longer owned his tools, but went to one place to use someone else's tools. man and his tools began to separate. many used the tools, fewer understood them. this process began to create the "expert." today most of the tools we use-the automobile, telephone, computer termi
* a version of this paper was delivered at the meeting on "public libraries and the remote electronic delivery of information (redi)," columbus, ohio, march 23-24, 1981.
contactless services: a survey of the practices of large public libraries in china

yajun guo, zinan yang, yiming yuan, huifang ma, and yan quan liu

information technology and libraries | june 2022 https://doi.org/10.6017/ital.v41i2.14141

yajun guo (yadon0619@hotmail.com) is professor, school of information management, zhengzhou university of aeronautics. zinan yang (yangzinan612@163.com) is master, school of information management, zhengzhou university of aeronautics. yiming yuan (yuanyiming361@163.com) is master, school of information management, zhengzhou university of aeronautics. huifang ma (mahuifang126@126.com) is master, school of information management, zhengzhou university of aeronautics. *corresponding author hamed yan quan liu (liuy1@southernct.edu) is professor, department of information and library science, southern connecticut state university. © 2022.

abstract

contactless services have become a common way for public libraries to provide services. as a result, the strategy used by public libraries in china will effectively stop the spread of epidemics caused by human touch and will serve as a model for other libraries throughout the world.
the primary goal of this study is to gain a deeper understanding of the contactless service measures provided by large chinese public libraries for users in the pandemic era, as well as the challenges and countermeasures for providing such services. the data for this study was obtained using a combination of website investigation, content analysis, and telephone interviews for an analytical survey study of 128 large public libraries in china. the study finds that touch-free information dissemination, remote resources use, no-touch interaction self-services, network services, online reference, and smart services without personal interactions are among the contactless services available in chinese public libraries. exploring the current state of contactless services in large public libraries in china will help to fill a need for empirical attention to contactless services in libraries and the public sector. up-to-date information to assist libraries all over the world in improving their contactless services implementation and practices is provided.

introduction

the spread of covid-19 began in 2020, and people all over the world are still fighting the severity of its spread, the breadth of its impact, and the extent of its endurance. the virus’s continued spread has had a wide-ranging impact on industry sectors worldwide, including libraries. the growth of public libraries has also seen significant changes as a result of covid-19, resulting in added patron services, including contactless services. contactless services are those that patrons can use without having to interact face to face with librarians. these services transcend time and geographical constraints, and they lower the danger of disease transmission through human interaction. since the covid-19 pandemic, contactless or touch-free interaction services have been emerging in chinese public libraries. this service model can also serve as a reference for other libraries.
this study evaluates and analyzes contactless service patterns in large public libraries in china, and then suggests a contactless service framework for public libraries, which is currently in the process of being implemented.

literature review

the available literature shows that the term “non-contact” appeared as early as 1916 in the article “identification of the meningococcus in the naso-pharynx with special reference to serological reactions” and described a patient’s infection in the context of medical research.1 in recent years, with the widespread application of “internet +” and the development and promotion of technologies such as the internet of things, cloud computing, and artificial intelligence, the contactless economy has grown by leaps and bounds, and so has the research on library contactless services.2 library contactless services encompass a wide range of services such as self-services, online reference, and smart services without personal interactions.

library self-service has become a major service model for contact-free services.
the self-service model was first adopted in american public libraries in the 1970s with the emergence of self-service borrowing and returning practices.3 many public libraries have since adopted stand-alone, fully automated self-service halls, self-service counters, etc.4 by the 1990s, a range of commercial self-service kiosks and self-service products had been introduced.5 currently, the most mature self-service type used by the library community is the circulation self-service product.6 in addition to self-service borrowing and returning of titles, libraries have launched self-service printing systems, self-service computer systems, and self-service booking of study spaces.7 as an example, patrons can complete printing operations using a self-service system and can pay by bank card, alipay, wechat, and other means.8 a face recognition system can also be used to borrow and return books, a solution for patrons who forget their library cards.9 these library self-service system elements are confined to simple, repetitive, and routine tasks such as conducting book inventories, book handling, circulating books, and the like, whose development stems from the widespread application of electronic magnetic stripe technology, radio frequency identification (rfid), optical character recognition (ocr) technology, and face recognition.10 new applications of technology continue to advance the development of contactless services in libraries. the overall work and service processes of the library have been made intelligent to varying degrees.

online reference is an important service in the contactless service program. researchers have started to study the current state of library reference services.
interactive online reference services support patrons using the library, including how to search for literature, locate and renew books, schedule a study or seminar room, and participate in other library activities, such as seminars, lectures, etc.11 in response to the problem of how patrons access various library service abilities, digital reference systems need functions such as automated semantic processing and automated scene awareness so that, through automatic calculation and adaptive matching, they can understand patrons’ interests, preferences, and needs and recommend the most suitable information resources for them.12 at present, most library reference services in china mainly include the use of telephone, email, wechat, robot librarians/interactive communication, microblogs, and qq, an instant messaging software popular in china. during the past two years, most public libraries in china have essentially implemented the use of the aforementioned reference tools to communicate and interact with patrons, with wechat having a 55.6% adoption rate when compared to other instant reference tools.13 the use of online chat in reference services has allowed librarians to help patrons from anywhere and at any time by embedding chat plug-ins into multiple pages of the library website and directing patrons to ask questions based on the specific page they are viewing, setting up automatic pop-up chat windows, and changing patrons’ passive waiting to active engagement.14 in terms of technology, emerging technologies such as patron profiling, natural language processing, and contextual awareness can support the development of reference advisory services in libraries.15 the online reference service provides a 24/7, high-quality, efficient, and personalized service that connects libraries more closely with society and is an important window in the future smart library service system.

smart services without personal interactions may become the most popular form of library services development for the future, and research on library smart services has gradually deepened. in terms of conceptual definition, the library community generally understands the concept of library smart services as mobile library services that are not limited by time and space and can help patrons find books and other types of materials in the library by connecting to the wireless internet.16 apart from this, there are two other ways to define library smart services. one discusses the meaning of smart services in an abstract way, for example holding that library smart services should be an advanced library form dedicated to knowledge services through human-computer interaction, a comprehensive ecosystem.17 the other concretizes the extension of this concept, expressed with the formula “smart library = library + internet of things + cloud computing + smart devices.”18

applied technology research is an important part of smart services in libraries. library smart services have three main features: digitization, networking, and clustering.
among them, digitization provides the technical basis, networking provides the information guarantee, and clustering provides the library management model of resources sharing, complementary advantages, and common development among libraries.19 the key breakthrough in the development of smart services is the deployment of smart technology applications to truly realize a new form of integration of online and offline, the virtual and the real.20 the integration of face recognition technology in traditional libraries, as well as its application to services like access control management, book borrowing and returning, and wallet payment, can help libraries build smart services faster.21 the integration of deep learning into a mobile visual search system for library smart services can play an important role in integrating multiple sources of heterogeneous visual data and the personalized preferences of patrons.22 blockchain technology, born out of the impact of the new wave of information technology, has also been applied to the construction of smart library information systems because of its decentralized and secure features.23 library smart services can leverage new technologies and smart devices to enhance the efficiency of library contact-free services and provide new opportunities for knowledge innovation, knowledge sharing, and universal participation, thereby enabling innovation in service models.

additional research on the development of contactless services in service areas such as library self-services, online reference, and smart services is discussed. in particular, the research and construction of smart library services have been enriched with the advent of big data and artificial intelligence. however, non-contact service has not been systematically researched and elaborated in domestic and international librarianship.
the emergence and prevalence of covid-19 has enabled libraries in many countries to practice various types of touch-free services, such as the introduction of postal delivery, storage deposit, and click-and-collect in australian libraries; curbside pickup service or build a book bag service in us public libraries; and book delivery to the building services in chinese university libraries.24 therefore, a systematic investigation and study of contactless services in public libraries in the pandemic is of great importance for the adaptation and innovation of library services.

methods

survey samples

the survey selected some of the most typical public libraries for the study. the selection criteria were those large public libraries in the more economically and culturally developed regions of china. a total of 128 large public libraries were identified, including the national library of china, 32 provincial public libraries, and the municipal public libraries in the top 100 cities by gdp ranking in 2020, of which five public libraries, including the capital library and nanjing library, are both top 100 city libraries and provincial libraries (that is, 1 + 32 + 100 - 5 = 128). these 128 large public libraries can more clearly reflect the current service level of the better developed public libraries in china, and represent the highest level of public library construction in china. (see table 1 for a list of the libraries studied.)

table 1. a list of the 128 public libraries that were studied

no. library
1. national library of china
2. hebei library
3. shanxi library
4. liaoning provincial library
5. jilin province library
6. heilongjiang provincial library
7. zhejiang library
8. anhui provincial library
9. fujian provincial library
10. jiangxi provincial library
11. shandong library
12. henan provincial library
13. hubei provincial library
14. hunan library
15. guangzhou library
16. hainan library
17. sichuan library
18. guizhou library
19. yunnan provincial library
20. shanxi library
21. gansu provincial library
22. qinghai library
23. guangxi library
24. inner mongolia library
25. tibet library
26. ningxia library
27. xinjiang library
28. shanghai library
29. capital library of china
30. shenzhen library
31. guangzhou digital library
32. chongqing library
33. tianjin library
34. suzhou library
35. chengdu public library
36. wuhan library
37. hangzhou public library
38. nanjing library
39. qingdao library
40. wuxi library
41. changsha library
42. ningbo library
43. foshan library
44. zhengzhou library
45. nantong library
46. dongguan library
47. yantai library
48. quanzhou library
49. dalian library
50. jinan library
51. xi’an public library
52. hefei city library
53. fuzhou library
54. tangshan library
55. changzhou library
56. changchun library
57. guilin library
58. harbin library
59. xuzhou library
60. shijiazhuang library
61. weifang library
62. shenyang library
63. wenzhou library
64. shaoxing library
65. yangzhou library
66. yancheng library
67. nanchang library
68. zibo library
69. kunming library
70. taizhou library
71. erdos city library
72. public library of jining
73. taizhou library
74. linyi library
75. luoyang library
76. xiamen library
77. dongying library
78. nanning library
79. zhenjiang library
80. jiaxing library
81. xiangyang library
82. jinhua library
83. yichang library
84. huizhou tsz wan library
85. cangzhou digital library
86. zhangzhou library
87. weihai library
88. digital library of handan
89. guiyang library
90. sun yat-sen library of guangdong province
91. ganzhou library
92. baotou library
93. huaian library
94. yulin digital library
95. dezhou network library
96. yuyang library
97. changde library
98. baoding library
99. the library of jiujiang city
100. taiyuan library
101. hohhot library
102. wuhu library
103. langfang library
104. national library of hengyang city
105. maoming library
106. nanyang library
107. heze library
108. urumqi library
109. zhanjiang library
110. zunyi library
111. shangqiu library
112. jiangmen library
113. liuzhou library
114. zhuzhou library
115. xuchang library
116. chuzhou library
117. lianyungang library
118. suqian library
119. mianyang library
120. zhuhai library
121. xinyang library
122. zhoukou library
123. zhumadian library
124. huzhou library
125. lanzhou library
126. fuyang library
127. xinxiang library
128. jiaozuo library

survey methods

web-based investigation, content analysis, and interviews with librarians were used to assess 128 public libraries in china. the survey was carried out between march 10 and september 15, 2021. first, the authors identified the media platforms for sharing information about each public library’s contactless services, including an official website, a social networking account on wechat, or a library-developed app. the authors investigated whether these media platforms were updated with information about the contactless services and whether they provided various information about these services. next, the authors searched these media platforms for the various contactless services offered by each library and recorded them. finally, the authors reviewed the data and findings from the survey to minimize errors and ensure the accuracy of the findings.

findings

touch-free information distribution

the distribution of library information is generally carried out in a touch-free manner. there are three commonly used information media in libraries: official website, wechat official account, and library-developed app.
the adoption rate of each information medium by libraries is determined by investigating whether libraries have opened information media platforms and whether the opened platforms are updated with service information. the results showed that the information medium with the highest adoption rate was the wechat official account, reaching 100%. the library’s official website showed an adoption rate of 94%. only 57% of libraries use apps to distribute contactless information (see fig. 1).

figure 1. percentage of touch-free information distribution platforms in large public libraries in china.

patron services must provide timely and convenient access if public libraries want to effectively expand their patron base or increase library usage. wechat is better adapted to user convenience than websites, which explains the greater utilization rate as a contactless information dissemination tool for libraries. as a public service institution, the chinese public library has an incomparable impact on politics, economy, and culture. libraries have a great influence on the cultural popularization and educational development of the public. therefore, touch-free information dissemination plays an important role in improving the efficiency of information dissemination. wechat has been fully integrated into china’s public library services as a communication tool, allowing libraries to better foster cultural growth. in the process of cultural growth, libraries need to emphasize interactive public participation and combine public culture, social topics, citizen interaction and media communication, bringing innovative value to promote urban vitality and urban humanism.
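the adoption rates reported for figure 1 can be read as a simple tally. the following is a minimal python sketch, not the authors’ actual procedure: it assumes each survey observation is stored as a (library, platform, opened, updated) record and counts a platform as adopted by a library only when it is both opened and updated with service information. the record layout, the function name adoption_rates, and the two sample libraries are hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class PlatformRecord(NamedTuple):
    library: str    # hypothetical library identifier
    platform: str   # e.g. "wechat official account", "official website", "app"
    opened: bool    # the library has opened this platform
    updated: bool   # the platform is updated with contactless-service information


def adoption_rates(records, total_libraries):
    """Return {platform: percentage of libraries that adopted it}."""
    adopters = defaultdict(set)
    for rec in records:
        # Touch the key so platforms nobody adopted still report 0%.
        adopters[rec.platform]
        if rec.opened and rec.updated:
            adopters[rec.platform].add(rec.library)
    return {p: 100.0 * len(libs) / total_libraries for p, libs in adopters.items()}


# Hypothetical sample data for two libraries (not taken from the survey).
sample_records = [
    PlatformRecord("library a", "wechat official account", True, True),
    PlatformRecord("library a", "official website", True, True),
    PlatformRecord("library a", "app", True, False),  # opened but not updated
    PlatformRecord("library b", "wechat official account", True, True),
    PlatformRecord("library b", "official website", False, False),
    PlatformRecord("library b", "app", True, True),
]

for platform, rate in adoption_rates(sample_records, total_libraries=2).items():
    print(f"{platform}: {rate:.0f}%")
```

applied to the 128 surveyed libraries, a tally of this kind is one way to arrive at percentages such as those in figure 1; using sets of library names keeps a library from being counted twice for the same platform.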
the widespread use of wechat helps users stay up to date on the newest information and access library resources services more conveniently.

remote resources services

restrictions on the use of digital resources are closely related to the frequency of patrons’ use. restrictive measures that posed obstacles to patrons using digital resources were identified. among the 128 large public libraries surveyed, 42% of libraries require reader card authentication by patrons before they can access remote resources services; 8% of libraries do not require users to have reader cards for services. patrons can use the remote resources services available in the remaining 49% of public libraries without needing to register for a user account or patron id on the library website.

to reduce the risk of infection between librarians and patrons, some libraries adopted non-contact paper document delivery services for users in urgent need of paper books during the pandemic. for example, the peking university library’s book delivery to building service (see fig. 2) and xiamen library and wenzhou library’s book delivery to home (see fig. 3) allow patrons to reserve books online, and librarians will express mail the books to patrons’ homes according to their needs.

figure 2. peking university library’s book delivery service to the building.

figure 3. book delivery service of xiamen library and wenzhou library.

contactless services have two outstanding advantages: they can be obtained without contact with people, and they are convenient. however, if the use of remote resources is restricted in many ways, it will lead to a decrease in the utilization of digital resources in libraries.
while intellectual property requirements and concerns must be appropriately managed, public libraries should strive to provide patrons with unlimited access to digital materials and physical print books.

no-touch interaction self-services

no-touch interaction self-services in chinese public libraries mainly include self-checkout, self-retrieval, self-storage, self-printing, self-card registration, and other self-services, such as self-payment and self-reservation of study rooms or seminar rooms (see fig. 4).

figure 4. percentage of large public libraries in china that provide contactless self-service.

the survey of large public libraries in china shows that the majority offer self-checkout and self-retrieval services. the percentage of public libraries offering self-storage, self-card registration, and self-printing is low, with only 50% or less usage. self-storage, as one of the earlier self-services, has a usage rate of 50%. only 34% of public libraries offered self-card registration. the self-service card registration machine has four main functions: reader card registration, payment, password modification, and renewal. for example, when patrons need to pay deposits or overdue fines, they can use the self-service card registration machine to swipe their cards and pay, which facilitates subsequent borrowing of various resources. the machine supports face recognition technology for card application and online deposit recharge, catering to the needs of patrons in many aspects of operation (see fig. 5). the proportion for self-printing is even lower: it is available at only 15% of libraries. self-card registration and self-printing are both emerging self-service options that require strong financial and technical support and are therefore not widely available.
[figure 4 values: self-checkout 99%, self-retrieval 98%, self-storage 50%, self-card registration 34%, self-printing 15%, others 5%]

figure 5. self-service card registration machine in chinese large public libraries.

in addition, to further promote library contactless services and enable users to enjoy self-service library services anytime, anywhere, most public libraries in china have set up dedicated self-service libraries or microservice halls on the wechat official account platform. for example, the changsha library (see fig. 6) and the taiyuan library (see fig. 7) have both set up a microservice hall column on their wechat official accounts, containing services such as personal appointment, book renewal, event registration, and digital resources. the emergence of online self-service library services has greatly contributed to the development of equalization and standardization of public library services.

figure 6. changsha library no-touch interaction self-service hall.

figure 7. taiyuan library no-touch interaction self-service hall.

24-hour self-service library

the 24-hour self-service library, a contactless phenomenon in china’s public libraries, was introduced in 2006 and officially launched in 2007 by dongguan library, followed by shenzhen library’s initial batch of ten self-service libraries. the success of the shenzhen model has sparked a boom in the construction of self-service libraries in china, with 77% of the chinese public libraries surveyed having opened self-service libraries. the development of self-service libraries is divided into two types of service models: space-based self-service libraries (see fig. 8), i.e., unattended libraries with a certain amount of space for use, in which patrons can freely select books and read for leisure, such as 24-hour city bookstores; and cabinet-type self-service libraries (see fig. 9), which resemble a bank atm with an operating panel, look like a bookcase, and allow real-time data interaction with the central library via the network. the eight self-service libraries in taiyuan library in shanxi can provide self-service book borrowing services through the new model of library + internet + credit, which allows patrons to apply for a reader’s card without a deposit, make reservations online, and have books delivered to the counter (see fig. 10). by cross-referencing the reader’s card with the patron’s face information, the guangzhou self-service library provides self-service borrowing and returning services for patrons through face recognition. there are many similar self-service libraries in china, which provide various types of patron services in different forms, largely reducing direct contact between patrons and librarians, and among patrons. for example, when the pandemic was most severe, data collected from the ningbo self-service library showed that 7,022 physical books were borrowed and returned from january to march 2020, 50% more than in a normal year.25

figure 8. space-based self-service libraries.

figure 9. cabinet type self-service library.

figure 10. taiyuan self-service library.

the popularity of 24-hour self-service libraries in china is first and foremost due to the strong support and financial investment of government departments in the construction of self-service libraries.
secondly, the features of self-service libraries, which are convenient, time-independent, time-saving, efficient, and diversified, are in line with modern lifestyles, integrating public library services into people’s lives, increasing the visibility and penetration of public library patron services, and meeting patrons’ reading needs as fully as possible.

network services

there is a wide range of network services, but the most common are seat reservation, online renewal, and overdue fee payment (see fig. 11). the survey found that 89% of chinese public libraries offer at least one of these network services, indicating a high adoption rate of network services. in 2002, online renewals began to appear in china and then gradually became popular. most of the public libraries in china provide this service in the personal library or wechat official account. the adoption rate of online renewal is as high as 85% in the 128 public libraries surveyed. the prevalence of seat reservation services is not high: only 28% of the public libraries surveyed offered them.

figure 11. percentage of large chinese public libraries that provide network services.

coverage of the online overdue fee payment service was even lower, with only 21% of public libraries providing access. however, some libraries have replaced the overdue fee system with other methods, such as the shantou library’s lending points system. in the system, the initial number of points on a patron’s account is 100, with two points added for each book borrowed and one point deducted for each day a book is overdue. when the number of points on the account reaches zero, the reader’s card will be frozen for seven days and cannot be used to borrow books.
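the point rules can be read as a small state machine. the sketch below is a hedged illustration in python, not shantou library’s actual software: the class, method, and constant names are hypothetical, it assumes the card is frozen when the balance reaches zero, and it includes the reset to 20 points after the freeze that the text describes next.

```python
from dataclasses import dataclass

# Values taken from the description of the lending points system.
INITIAL_POINTS = 100
POINTS_PER_BORROW = 2
POINTS_PER_OVERDUE_DAY = 1
FREEZE_DAYS = 7
POINTS_AFTER_FREEZE = 20


@dataclass
class PatronAccount:
    """Hypothetical model of one patron's lending-points account."""
    points: int = INITIAL_POINTS
    frozen_days_left: int = 0

    def can_borrow(self) -> bool:
        return self.frozen_days_left == 0

    def borrow_book(self) -> None:
        if not self.can_borrow():
            raise RuntimeError("reader's card is frozen")
        self.points += POINTS_PER_BORROW

    def record_overdue_days(self, days: int) -> None:
        # Assumption: the balance never goes below zero.
        self.points = max(0, self.points - POINTS_PER_OVERDUE_DAY * days)
        if self.points == 0 and self.frozen_days_left == 0:
            self.frozen_days_left = FREEZE_DAYS

    def advance_days(self, days: int) -> None:
        """Let time pass; lift the freeze and reset points when it ends."""
        if self.frozen_days_left == 0:
            return
        self.frozen_days_left = max(0, self.frozen_days_left - days)
        if self.frozen_days_left == 0:
            self.points = POINTS_AFTER_FREEZE


account = PatronAccount()
account.borrow_book()             # 100 + 2 points
account.record_overdue_days(102)  # balance reaches zero, card is frozen
account.advance_days(FREEZE_DAYS) # freeze lifted, balance reset
print(account.points, account.can_borrow())
```

modelling the freeze as a countdown keeps the three rules (earn, lose, freeze and reset) separate, so any one of them can be adjusted if the library’s real policy differs from these assumptions.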
after the freeze is lifted, the number of points will be reset to 20.26 in summary, contactless services in china’s public libraries are moving in a more humane direction.

online reference services

as a type of contactless service, online reference services are extremely helpful in developing access to documentary information resources. the survey shows that 94% of public libraries provide online reference services. online reference services are available by telephone, website, email, qq, and wechat. telephone reference and website reference are the earliest forms of contactless service, with the highest usage rates of 79% and 71% respectively among public libraries surveyed. this is followed by slightly lower coverage of email reference and qq reference at 55% and 48% respectively. wechat reference coverage is the lowest, at only 16% (see fig. 12). qq and wechat are both tencent’s instant messengers, but qq’s file function is slightly stronger than wechat’s. qq can send large files of over 1gb and files do not expire, making it easy for the reference librarians to communicate with patrons.

figure 12. percentage of large public libraries in china that provided online reference service tools.

other online reference methods such as microblog reference and intelligent robot reference are present in chinese large public libraries. real-time reference is labor-intensive and time-consuming, and where librarians may be unavailable to provide an immediate response, intelligent robotic referencing can make up for consultants not being online full time.
applying intelligent robots to library reference can also provide accurate and personalized consultation services according to patrons’ needs and behavioral patterns, greatly improving the quality, effectiveness, and satisfaction of consultation services. for example, the zhejiang library has an online reference service which includes online 24-hour robot reference and offline message modules. patrons can also choose expert reference and see available reference experts in the expert list and their details, including name, library, title, specialties, status, etc.27 in addition, the hunan library provides joint online reference, which is a public welfare platform of the hunan provincial literature and information resources common construction and sharing collaborative network, to provide online reference services to the public. eleven member units, including hunan library, hunan university library, and hunan science and technology information institute benefit from the rich literature resources, information technology, and human resources of the network, and all sites work together to provide free online reference advice and remote delivery of literature to a wide range of patrons, as well as advisory and tutorial services to guide patrons on how to use the library’s physical and digital resources.28 smart services without personal interactions driven by artificial intelligence, blockchain, cloud computing, and other technologies, libraries are evolving from physical and digital libraries to smart libraries. smart services without personal interactions are a fundamental capability of smart libraries. 
[figure 12 chart data: others 4%, telephone 79%, website 71%, email 55%, qq 48%, wechat 16%]

this survey found that the coverage of smart services was 52%, with virtual reality coverage at 21%, face recognition coverage at 20%, and swipe face to borrow books at 9%. face recognition can be used in library resource services, face gates, security monitoring, self-checkout, and other online and offline real-name identity verification scenarios, where it improves the efficiency of identity verification. the biggest advantage of face recognition is that it is contactless and easy to use, avoiding the health and safety risks associated with contact-based identification such as fingerprints. swipe face to borrow books is an application of face recognition technology that allows patrons to quickly borrow and return books by scanning their faces, even if they have forgotten their reader’s card. this technology also tracks the interests of patrons based on their borrowing habits and history records, providing them with corresponding reading recommendation services.

it is worth noting that chinese public libraries have a rich variety of smart service methods. in terms of vr technology applications, the national library of china launched the national library virtual reality system in 2008, the first service in china to bring vr technology to the public eye. the virtual reality system gives patrons the option to explore virtual scenes and interact with virtual resources available in the library. the virtual scenes are created by using computer systems to build realistic architectural structures and reading rooms, so that patrons can learn about the library in the library lobby with the help of vr equipment. virtual resources are digital resources presented in virtual form.
the technology combines flash and human gesture recognition systems, allowing patrons to flip through books touch-free at virtual reality reading stations, enhancing the reading style and interactive experience. in addition, the fuzhou library is concerned with the characteristics of different groups of people and has made virtual experiences a focus of its services, using vr technology to innovate reading methods, such as presenting animal images in 3d form on a computer screen, which has been welcomed by a large number of readers, especially children. shanghai library, tianjin library, shenzhen library, chongqing library, and jinan library have introduced vr technology into their patron services as to attract more users. in terms of blockchain applications, the national digital library of china makes use of the special features of blockchain technology in terms of distributed storage, traceable transmission, and high-grade encryption to provide full-time, full-domain, and full-scene copyright protection for massive digital resources and promotes the construction of intelligent library services. related to big data technology, the shanghai library provides personalized recommendation services for e-books based on the characteristics of the books borrowed by readers. patrons using a mobile phone can scan a code on borrowed books and click on the recommended book’s cover for immediate reading.29 conclusion & recommendations an in-depth analysis of the contactless service strategy will help to steadily improve the smart library development process in public libraries and to support their transition to smart libraries. this report provides a systematic framework for contactless services for public libraries based on a survey and assessment of the contactless service status of large public libraries in china. 
contactless patron services, contactless space services, contactless self-services, and contactless extension services are the four key components of the framework (see fig. 13).

figure 13. a systematic framework of contactless services for public libraries.

providing contactless patron services

patron services are the heart and soul of each public library. library services that involve no personal physical contact with patrons, or that connect with patrons touch-free, are referred to as contactless patron services. these include book lending, online reference, digital resources, and network reading promotion. at present, most chinese public libraries have few contactless lending options, making it difficult to meet the needs of patrons who cannot access the library because of covid-19 or transportation difficulties. therefore, public libraries can enrich their existing book lending methods by providing patrons with contactless services, such as book delivery and online lending, to create a convenient reading environment. a focus on digital resources is fundamental to achieving contactless patron services. at present, some public libraries in china neglect the management of digital resources because of their emphasis on paper resources. as a result, digital resources are not updated and maintained in a timely manner, and patrons cannot use them smoothly; the effective management of digital resources in libraries is therefore crucial. in addition, public libraries can carry out activities such as network reading promotion and reader education to effectively improve the utilization of library resources.

building contactless space services

contactless space services refer to the touch-free interaction between physical space and virtual space.
physical space services mainly include self-reservation of study rooms, discussion rooms, meeting rooms, as well as providing venues for public lectures or exhibitions, etc., to fulfill the space demands arising from patrons’ access to information. virtual space services mainly include building spaces for collaboration and communication, creative spaces, information sharing spaces, and cultural spaces, providing a virtual integrated environment for patrons’ needs for information exchange and acquisition in the online environment. public libraries can develop their activities through different channels according to the characteristics and elements of physical and virtual spaces, so that libraries can evolve from “library as a place” to “library as a platform.” the combination of an offline library space and an online library platform provides a more convenient and accessible library experience for patrons. implementing no-touch interaction self-services no-touch interactive self-service plays a pivotal role as one of the service forms of the contactless service strategy. it mainly includes no-touch interaction self-services such as information retrieval, resources navigation, self-checkout, and self-printing. public libraries can set up no-touch interaction self-service sections on their official websites or social media accounts to help patrons quickly access up-to-date information from anywhere and at any time. developing contactless extension services in the three dimensions of time, space, and approach, contactless extension services refer to the mutual extension of the library. public libraries can be open year round on a 24/7 basis or during holidays without librarians, allowing patrons to swipe their own cards to gain access. the traditional collection of paper books should not only be available in offline libraries but can extend to individual self-service libraries or city bookshops. 
libraries can approach patrons with a more individualized service strategy. for example, some public libraries provide a service called build a book bag, where librarians select books according to the patron’s personal interests and reading preferences and deliver them to a designated location.

limitations and prospects

after analyzing the current status of contactless services in large public libraries in china, this paper finds that contactless services such as reference and access to digital resources are well established in chinese public libraries. on the other hand, contactless applications such as no-touch interaction self-services, network services, and smart services without personal interaction are less well developed. despite the rapid development of touch-free services and their variety, public libraries in china have not yet implemented a system of contactless services. this paper proposes a systematic framework to improve the development and practice of contactless services in public libraries and interrupt the spread of covid-19. the framework includes four core modules: contactless patron services, contactless space services, contactless self-services, and contactless extension services. it is foreseeable that contactless services will become the mainstream of public library services in the future.

information technology and libraries june 2022 contactless services | guo, yang, yuan, ma, and liu 19

endnotes

1 fred griffith, “identification of the meningococcus in the naso-pharynx with special reference to serological reactions,” journal of hygiene 15, no. 3 (1916): 446–63, https://doi.org/10.1017/s0022172400006355.
2 “guiding opinions of the state council on actively promoting the ‘internet +’ action,” 2015, http://www.gov.cn/zhengce/content/2015-07/04/content_10002.htm.
3 d.
brooks, “a program for self-service patron interaction with an online circulation file,” in proceedings of the american society for information science 39th annual meeting (oxford, england, 1976).
4 beth dempsey, “do-it-yourself libraries,” library journal 135, no. 12 (2010): 86–93, https://doi.org/10.1016/j.lisr.2010.03.004.
5 jackie mardikian, “self-service charge systems: current technological applications and their implications for the future library,” reference services review 23, no. 4 (1995): 19–38, https://doi.org/10.1108/eb049262.
6 pan yongming, liu huihui, and liu yanquan, “mobile circulation self-service in u.s. university libraries,” library and information service 58, no. 12 (2014): 26–31, https://doi.org/10.13266/j.issn.0252-3116.2014.12.004.
7 chen wu and jang airong, “building a modern self-service oriented library,” journal of academic libraries, no. 3 (2013): 93–96, https://doi.org/cnki:sun:mrfs.0.2016-24-350.
8 rao zengyang, “innovative strategies for university library services in the era of smart libraries,” library theory and practice, no. 12 (2016): 75–76, https://doi.org/10.14064/j.cnki.issn1005-8214.2016.12.018.
9 wang weiqiu and liu chunli, “functional design and model construction of intelligent library services in china based on face recognition technology,” research on library science, no. 18 (2018): 44–50, https://doi.org/10.15941/j.cnki.issn1001-0424.2018.18.008.
10 cheng huanwen and zhong yuanxin, “a three-dimensional analysis of a smart library,” library tribune 41, no. 6 (2021): 43–45.
11 nahyun kwon and vicki l. gregory, “the effects of librarians’ behavioral performance and user satisfaction in chat reference services,” reference & user services quarterly, no. 47 (2007): 137–48, https://doi.org/10.5860/rusq.47n2.137.
12 w. uutoni, “providing digital reference services: a namibian case study,” new library world 119, no. 5 (2018): 342–56, https://doi.org/10.1108/ils-11-2017-0122.
13 zhu hui, liu hongbin, and zhang li, “an analysis of the remote service model of university libraries in response to public safety emergencies,” new century library, no. 5 (2021): 39–45, https://doi.org/10.16810/j.cnki.1672-514x.2021.05.007.
14 xiangming mu, alexandra dimitroff, jeanette jordan, and natalie burclaff, “a survey and empirical study of virtual reference service in academic libraries,” journal of academic librarianship 37, no. 2 (2011): 120–29, https://doi.org/10.1016/j.acalib.2011.02.003.
15 cheng xiufeng et al., “a study on a library’s intelligent reference service model based on user portraits,” research on library science, no. 2 (2021): 43–55, https://doi.org/10.15941/j.cnki.issn1001-0424.2021.02.012.
16 m. aittola, t. ryhänen, and t. ojala, “smart library-location-aware mobile library service,” in human-computer interaction with mobile devices and services, international symposium (2003).
17 chu jingli and duan meizhen, “from smart libraries to intelligent libraries,” journal of the national library of china, no. 1 (2019): 3–9, https://doi.org/10.13666/j.cnki.jnlc.2019.01.001.
18 yan dong, “iot-based smart libraries,” journal of library science 32, no. 7 (2010): 8–10, http://doi.org/10.14037/j.cnki.tsgxk.2010.07.034.
19 wang shiwei, “a brief discussion of the five relationships of smart libraries,” library journal 36, no.
4 (2017): 4–10, https://doi.org/10.13663/j.cnki.lj.2017.04.001.
20 morell d. boone, “unlv and beyond,” library hi tech 20, no. 1 (2002): 121–23, https://doi.org/10.1108/07378830210733981.
21 qin hong et al., “research on the application of face recognition technology in libraries,” journal of academic libraries 36, no. 6 (2018): 49–54, https://doi.org/10.16603/j.issn1002-1027.2018.06.008.
22 li mo, “research on a mobile visual search service model for smart libraries based on deep learning,” journal of modern information 39, no. 5 (2019): 89–96.
23 zhou jie, “study on the application of lora technology in smart libraries,” new century library, no. 5 (2021): 57–61, https://doi.org/10.16810/j.cnki.1672-514x.2021.05.010.
24 international federation of library associations and institutions, “the covid-19 and the global library community,” 2020, https://www.ifla.org/covid-19-and-the-global-library-field/; guo yajun, yang zinan, and yang zhishun, “the provision of patron services in chinese academic libraries responding to the covid-19 pandemic,” library hi tech 39, no. 2 (2021): 533–48, https://doi.org/10.1108/lht-04-2020-0098; peking university library, “book delivery service to the buildings where the patrons live,” (2020), https://mp.weixin.qq.com/s/eknyg_-_rjrcl6sjc-it-a.
25 hu bin ying yan, “study on the intelligent construction of ningbo library under the influence of epidemic,” jiangsu science & technology information 38, no. 24 (2021): 17–21, https://doi.org/10.3969/j.issn.1004-7530.2021.24.005.
26 shantou library, “come and be a book ‘saint’! city library changes lending rules, points system instead of overdue fees,” 2021, http://www.stlib.net/information/26182.
27 zhejiang library, “online reference services,” 2020, https://www.zjlib.cn/yibanwt/index.htm?liid=2.
28 hunan provincial collaborative network for the construction and sharing of literature and information resources, “reference union of public libraries in hunan province,” 2021, http://zx.library.hn.cn/.
29 ministry of culture and tourism of the people’s republic of china, “shanghai library launches personalized recommendation service for e-books,” 2021, https://www.mct.gov.cn/whzx/qgwhxxlb/sh/202101/t20210106_920497.htm.
an overview of the current state of linked and open data in cataloging

irfan ullah, shah khusro, asim ullah, and muhammad naeem

information technology and libraries | december 2018 47

irfan ullah (cs.irfan@uop.edu.pk) is a doctoral candidate, shah khusro (khusro@uop.edu.pk) is a professor, asim ullah (asimullah@uop.edu.pk) is a doctoral student, and muhammad naeem (mnaeem@uop.edu.pk) is an assistant professor at the department of computer science, university of peshawar.

abstract

linked open data (lod) is a core semantic web technology that makes knowledge and information spaces of different knowledge domains manageable, reusable, shareable, exchangeable, and interoperable. the lod approach achieves this through the provision of services for describing, indexing, organizing, and retrieving knowledge artifacts and making them available for quick consumption and publication. this is also aligned with the role and objective of traditional library cataloging. owing to this link, major libraries of the world are transferring their bibliographic metadata to the lod landscape.
some developments in this direction include the replacement of anglo-american cataloging rules 2nd edition by the resource description and access (rda) and the trend towards the wider adoption of bibframe 2.0. an interesting and related development in this respect are the discussions among knowledge resources managers and library community on the possibility of enriching bibliographic metadata with socially curated or user-generated content. the popularity of linked open data and its benefit to librarians and knowledge management professionals warrant a comprehensive survey of the subject. although several reviews and survey articles on the application of linked data principles to cataloging have appeared in literature, a generic yet holistic review of the current state of linked and open data in cataloging is missing. to fill the gap, the authors have collected recent literature (2014–18) on the current state of linked open data in cataloging to identify research trends, challenges, and opportunities in this area and, in addition, to understand the potential of socially curated metadata in cataloging mainly in the realm of the web of data. to the best of the authors’ knowledge, this review article is the first of its kind that holistically treats the subject of cataloging in the linked and open data environment. some of the findings of the review are: linked and open data is becoming the mainstream trend in library cataloging especially in the major libraries and research projects of the world; with the emergence of linked open vocabularies (lov), the bibliographic metadata is becoming more meaningful and reusable; and, finally, enriching bibliographic metadata with user-generated content is gaining momentum. 
conclusions drawn from the study include the need to focus on the quality of catalogued knowledge and to reduce the barriers to publishing and consuming such knowledge, and the need for the library community to learn from the successful adoption of lod in other application domains and to contribute collaboratively to the global-scale activity of cataloging.

current state of linked and open data in cataloging | ullah, khusro, ullah, and naeem 48 https://doi.org/10.6017/ital.v37i4.10432

introduction

with the emergence of the semantic web and linked open data (lod), libraries have been able to make their bibliographic data publishable and consumable on the web, resulting in increased understanding and utility for both humans and machines.1 additionally, the use of the linked data principles of lod has allowed related data on the web to be connected.2 traditional catalogs, as collections of metadata about library content, have served the same purpose for a long time.3 it is, therefore, natural to establish a link between the two technologies and exploit the capabilities of lod to enhance the power of cataloging services. in this regard, significant milestones have been achieved, including the use of linked and open data principles for publishing and linking library catalogs, bibframe, and the europeana data model (edm).4 however, the potential of linked and open data for building more efficient libraries and the challenges involved in that direction are mostly unknown, owing to the lack of a holistic view of the relationship between cataloging and the lod initiative and of the advances made in both areas. likewise, the possibility of enriching the bibliographic metadata with user-generated content such as ratings, tags, and reviews to facilitate the search for known items as well as exploratory search has not received much attention.
5 some studies of preliminary extent have, however, appeared in literature an overview of which is presented in the following paragraphs. several survey and review articles have contributed to different aspects of cataloging in the lod environment. hallo et al. investigated how linked data is used in digital libraries, how the major libraries of the world implemented it, and how they benefit from it by focusing on the selected ontologies and vocabularies. 6 they identified several specific challenges to applying linked data to digital libraries. more specifically, they reviewed the linked data applications in digital libraries by analyzing research publications regarding the major national libraries (obtaining five-stars by following linked data principles) and published from 2012 to 2016.7 tallerås examined statistically the quality of linked bibliographic data published by the major libraries including spain, france, the united kingdom, and germany. 8 yoose and perkins presented a brief survey of lod uses under different projects in different domains including libraries, archives, and museums.9 by exploring the current advances in the semantic web, robert identified the potential roles of libraries in publishing and consuming bibliographic data and institutional research output as linked and open data on the web.10 gardašević presented a detailed overview of semantic web and linked open data from the perspective of library data management and their applicability within the library domain to provide a more open and integrated catalog for improved search, resource discovery, and access.11 thomas, pierre-yves, and bernard presented a review of linked open vocabularies (lov), in which they analyzed the health of lov from the requirements perspective of its stakeholders, its current progress, its uses in lod applications, and proposed best practices and guidelines regarding the promotion of lov ecosystem.12 they uncovered the social and technical aspects of this 
ecosystem and identified the requirements for the long-term preservation of lov data. vandenbussche et al. highlighted the features, components, significance, and applications of lov and identified the ways in which lov supports ontology and vocabulary engineering in the publication, reuse, and data quality of lod.13 tosaka and park performed a detailed literature review of rda (2005–11) and identified its fundamental differences from aacr2, its relationship with the metadata standards, and its impact on metadata encoding standards, users, practitioners, and the training required.14 sprochi presented the current progress in rda, frbr (functional requirements for bibliographic records), and bibframe to predict the future of library metadata, the skills and knowledge required to handle it, and the directions in which the library community is heading.15 gonzales identified the limitations of marc21 and the benefits of and challenges in adopting the bibframe framework.16 taniguchi assessed bibframe 2.0 for the exchange and sharing of metadata created in different ways for different bibliographic resources.17 he discussed bibframe 1.0 from the rda point of view.18 he examined bibframe 2.0 from the perspective of rda to uncover issues in mapping rda to bibframe, including rda expressions in bibframe, mapping rda elements to bibframe properties, and converting marc21 metadata records to bibframe metadata.19 fayyaz, ullah, and khusro reported on the current state of lod and identified several prominent issues, challenges, and research opportunities.20 ullah, khusro, and ullah reviewed and evaluated different approaches for bibliographic classification of digital collections.21

looking at the above survey and review articles, one may observe that each of them targets a specific aspect of cataloging from the perspective of lod.
a holistic analysis and a complete picture of the current state of cataloging in its transition to the lod ecosystem are missing. this paper adds to the body of knowledge by filling this gap in the literature. more specifically, it attempts to answer the following research questions (rqs):

rq01: how are linked open data (lod) and linked open vocabularies (lov) transforming the digital landscape of library catalogs?
rq02: what are the major issues, challenges, and research opportunities in publishing and consuming bibliographic metadata as linked and open data?
rq03: what is the possible impact of extending bibliographic metadata with user-generated content and making it visible on the lod cloud?

the first section of this paper answers rq01 by discussing the potential role of lod and lov in making library catalogs visible and reusable on the web. the second section answers rq02 by identifying some of the prominent issues, challenges, and research opportunities in publishing, linking, and consuming library catalogs as linked data. it also identifies specific issues in rda and bibframe from the lod perspective and highlights the quality of lod-based cataloging. the third section answers rq03 by reviewing the state-of-the-art literature on socially curated metadata and its role in cataloging. the last section concludes the paper and is followed by the references cited in this article.

the role of linked open data and vocabularies in cataloging

catalogers, librarians, and information science professionals have long worked to define rules, guidelines, and standards for recording metadata about knowledge artifacts accurately, precisely, and efficiently. aacr2 is among the most widely used sets of rules and guidelines for cataloging.
however, it has several issues with the nature of authorship, the relationships between bibliographic metadata, the categorization of format-specific resources, and the description of new data types.22 in an attempt to produce its revised version, aacr3, the cataloging community concluded that a new framework, named rda, should be developed.23 based on frbr conceptual models, rda is a “flexible and extendible bibliographic framework” that supports data sharing and interoperability and is compatible with marc21 and aacr2.24 according to the rda toolkit, rda describes digital and non-digital resources by taking advantage of the flexibilities and efficiencies of modern information storage and retrieval technologies while remaining backward-compatible with legacy technologies used in conventional resource discovery and access applications.25 it is aligned with the conceptual models of authority and bibliographic metadata of the ifla (international federation of library associations and institutions): frbr, frad (functional requirements for authority data), and frsad (functional requirements for subject authority data).26 rda accommodates all types of content and media in digital environments with improved bibliographic control in the realm of linked and open data; however, its responsiveness to user requirements needs further research.27

the discussion of the cataloging rules and guidelines stays incomplete without the metadata encoding standards and formats that give practical shape to these rules in the form of library catalogs. the most common encoding formats include dublin core (dc) and marc21. dublin core (http://lov.okfn.org/dataset/lov/vocabs/dce) is a [general-purpose metadata encoding scheme and] vocabulary of fifteen properties with “broad, generic, and usable terms” for resource description in natural language.
it is advantageous as it presents relatively low barriers to repository construction; however, it lacks in standards to index subjects consistently as well as to offer a uniform semantic basis necessary for an enhanced search experience.28 the lack of uniform semantic basis is due to the individual interpretations and exploitations of dc metadata by the libraries, which in turn originated from its different and independent implementations at the element level.29 marc21 is the most common machine process-able metadata encoding format for bibliographic metadata. it can be mapped to several formats including dc, marc/xml (http://www.loc.gov/standards/marcxml/), mods (http://www.loc.gov/standards/mods), mads (http://www.loc.gov/standards/mads), and other metadata standards.30 however, marc21 has several limitations such as only library software and librarians understand it, it is semantically inexpressive and isolated from the web structure, and it lacks in expressive semantic connections to relate different data elements in a single catalog record.31 besides its limitations, marc metadata encoding format is vital for resource discovery especially within the library environment, and therefore, ways must be found to make visible the library collections outside the libraries and available through the major web search engines.32 one such effort is from the library of congress (http://catalog.loc.gov/) that introduced a new bibliographic metadata framework, bibframe 2.0, which will eventually replace marc21 and allow semantic web and linked open data to interlink bibliographic metadata from different libraries. 
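the mapping from marc21 to dublin core mentioned above can be illustrated with a minimal sketch. it is not the official library of congress crosswalk: the six-entry mapping table, the toy record layout (a list of tag and subfield pairs), and the function name `marc_to_dc` are all assumptions made for this example, and the sample record is hand-built.

```python
# Illustrative, simplified field mapping (an assumption for this sketch,
# not the official Library of Congress MARC-to-Dublin-Core crosswalk).
MARC_TO_DC = {
    ("100", "a"): "creator",     # main entry, personal name
    ("245", "a"): "title",       # title statement
    ("260", "b"): "publisher",   # publication: name of publisher
    ("260", "c"): "date",        # publication: date
    ("650", "a"): "subject",     # topical subject heading
    ("020", "a"): "identifier",  # ISBN
}


def marc_to_dc(record):
    """Convert a toy MARC-like record into a dict of Dublin Core elements.

    `record` is a list of (tag, subfields) pairs, where `subfields` is a
    list of (code, value) pairs. Dublin Core elements are repeatable, so
    each element maps to a list of values.
    """
    dc = {}
    for tag, subfields in record:
        for code, value in subfields:
            element = MARC_TO_DC.get((tag, code))
            if element is None:
                continue  # subfields with no mapping are dropped
            # Trim trailing ISBD punctuation that MARC stores with the data.
            dc.setdefault(element, []).append(value.rstrip(" /:;,"))
    return dc


# Hand-built toy record used only to exercise the function.
sample = [
    ("100", [("a", "Ranganathan, S. R.")]),
    ("245", [("a", "The five laws of library science /")]),
    ("260", [("b", "Madras Library Association,"), ("c", "1931")]),
    ("650", [("a", "Library science")]),
]
dublin_core = marc_to_dc(sample)
```

the sketch also shows why such a conversion is lossy: any marc subfield without an entry in the table is discarded, and the flat dublin core output no longer records which subject scheme or name authority a value came from.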
other metadata encoding schema and frameworks include schema.org, edm, and the international community for documentation (cidoc)’s conceptual reference model (cidoc-crm).33 today, the bibliographic metadata records are available on the web in several forms including marc21, online public access catalogs (opacs), and bibliographic descriptions from online catalogs (e.g., library of congress), online cooperative catalogs (e.g., oclc’s worldcat [https://www.oclc.org/en/worldcat.html program]), social collaborative cataloging applications (e.g., librarything [https://www.librarything.com]), digital libraries (e.g., ieee xplore digital library [https://ieeexplore.ieee.org/xplore/home.jsp]), acm digital library(https://dl.acm.org), book search engines such as google books, and commercial databases including e.g., amazon.com. most of these cataloging web applications use either marc or other legacy standards as metadata encoding and representation schemes. however, the majority of these applications are either considering or transiting to the emerging cataloging rules, frameworks, and encoding schemes so that the bibliographic descriptions of their holdings could be made visible and reusable as linked and open data on the web for the broader interests of libraries, publishers, and end-users. 
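to make concrete what it means for a bibliographic description to be visible and reusable as linked data, the following sketch expresses one record as subject-predicate-object statements and serializes them as n-triples. the property iris come from the dublin core terms namespace; the `http://example.org/` identifiers for the book and the person are placeholders, and the helper names are assumptions for this example rather than part of any library system described here.

```python
DCTERMS = "http://purl.org/dc/terms/"  # Dublin Core terms namespace
EX = "http://example.org/"             # placeholder namespace (hypothetical)


def iri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(value):
    """Quote a plain string literal, escaping backslash, quote, and newline."""
    escaped = (
        value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    )
    return f'"{escaped}"'


book = iri(EX + "book/1")

# Each statement is one (subject, predicate, object) triple. The creator is
# an IRI rather than a string, so other datasets can link to the same person.
triples = [
    (book, iri(DCTERMS + "title"), literal("The five laws of library science")),
    (book, iri(DCTERMS + "creator"), iri(EX + "person/ranganathan")),
    (book, iri(DCTERMS + "issued"), literal("1931")),
]


def to_ntriples(statements):
    """Serialize triples as N-Triples: one statement per line, ending in ' .'."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in statements)


ntriples = to_ntriples(triples)
print(ntriples)
```

the design point is the second triple: because the author is identified by an iri instead of a text string, a consuming application can follow that link to other descriptions of the same person, which is what a marc record confined to a library system cannot offer.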
the presence of high-quality reusable vocabularies makes the consumption of linked data more meaningful, which is made possible by linked open vocabularies (lov) that bring value-added extensions to the web of data.34 the following two subsections attempt to answer rq01 by highlighting how lod and lov are transforming the current digital landscape of cataloging.

linked and open data

the semantic web and linked open data have enabled libraries to publish and make visible their bibliographic data on the web, which increases the understanding and consumption of this metadata by both humans and machines.35 lod connects and relates bibliographic metadata on the web using linked data principles.36 publishing, linking, and consuming bibliographic metadata as linked and open data brings several benefits.
these include improvements in data visibility, linkage with different online services, interoperability through a universal lod platform, and credibility due to user annotations.37 other benefits include: the semantic modeling of entities related to bibliographic resources; ease in transforming topics into skos; ease in the usage of linked library data in other services; better data visualization according to user requirements; linking and querying linked data from multiple sources; and improved usability of library linked data in other domains and knowledge areas.38 different users including scientists, students, citizens, and other stakeholders of library data can benefit from adopting lod in libraries.39 linked data has the potential to make bibliographic metadata visible, reusable, shareable, and exchangeable on the web with greater semantic interoperability among the consuming applications. several major projects that advocate for this potential, including bibframe, lodlam (linked open data in libraries archives and museums [http://lodlam.net]), and ld4l (linked data for libraries [https://www.ld4l.org]), are in progress.40 similarly, library linked data (lld) comprises lod-based bibliographic datasets, available in mods and marc21, that could be used to make search systems more sophisticated and in lov datasets to integrate applications requiring library and subject domain datasets.41

bianchini and guerrini report on the current changes in the library and cataloging domains from the point of view of ranganathan’s trinity (library, books, staff), which states that changes in one element of this trinity undoubtedly affect the others.42 they found that several factors, including readers, collections, and services, influence this trinity and emphasize the need for change:
• readers have moved from libraries to the web; they want to save time but also want many capabilities, including searching and navigating the full text of resources by following links.
they want resources connected to similar and related resources. they want concepts interlinked so that they can perform an exploratory search and find serendipitous results to fulfill their information needs.
• collections are undergoing several changes, from their production to dissemination and from search and navigation to the representation and presentation of content. the ways users access them and catalogers describe them are changing. their management is moving beyond the boundaries of their corresponding libraries to the broader landscape of open access and exposure to the lod environment.
• services are moving from bibliographic data silos to the semantic web. this involves moving the bibliographic model toward a more connected, linked data model within the semantic web environment. the data is moving from bibliographic database management systems to a large lod graph, where millions of marc records are reused and converted to new encoding formats that are backward compatible with marc21, rda, and others and provide opportunities to be exploited fully by the linked and open data environment. thinking along this direction, new cataloging rules and guidelines, such as rda, are making us a part of the growing global activity of cataloging. therefore, catalogers should take a keen interest in and avail themselves of the opportunities that lie in linked and open data for cataloging. otherwise, they (as a service) might be forgotten or removed from the trinity, i.e., from collections and readers.43

several major libraries have been actively working to make their bibliographic metadata visible and reusable on the web. the library of congress through its linked data service (http://id.loc.gov) enables humans and machines to access its authority data programmatically.
44 it exposes and interconnects data on the web through dereferenceable uniform resource identifiers (uris).45 its scope includes providing access to the commonly found loc standards and vocabularies (controlled vocabularies and data values) for the list of authorities and controlled vocabularies that loc currently supports.46 according to the loc, the linked data service brings several benefits to users including: accessing data at no cost; providing granular access to individual data values; downloading controlled vocabularies and their data values in numerous formats; enabling linking to loc data values within the user metadata using linked data principles; providing a simple restful api and a clear license and usage policy for each vocabulary; accessing data across loc divisions through a unified endpoint; and visualizing relationships between concepts and values.47 however, to fully exploit the potential of lod, loc is mainly focusing on its bibframe initiative.48 bibframe is not only a replacement for the current marc21 metadata encoding format; it is a new way of thinking about how the large amount of available bibliographic metadata could be shared, reused, and made available as linked and open data.49 the bibframe 2.0 (https://www.loc.gov/bibframe/docs/bibframe2-model.html) model organizes information into three core levels of abstraction: work (the conceptual essence of the cataloged resource), instance (a material embodiment of a work, such as a particular published form), and item (an actual physical or electronic copy of an instance). bibframe 2.0 elaborates the roles of the persons in the specific work as agents, and the subject of the work as subjects and events.50 according to taniguchi, bibframe 2.0 brings bibliographic metadata standards to linked and open data with a model and vocabulary that make cataloging more useful both inside and outside the library community.51 to achieve this goal, it needs to fulfill two primary requirements.
these include (1) accepting and representing metadata created with rda, replacing marc21 and thereby serving as a means of creating, exchanging, and sharing rda metadata; and (2) accepting and accommodating descriptive metadata for bibliographic resources created by libraries, cultural heritage communities, and users for wide exchange and sharing. bibframe 2.0 should comply with the linked data principles including the use of rdf and uris.

in addition to the library of congress, oclc through its linked data research has also been actively involved in research on transforming and publishing its bibliographic metadata as linked data.52 under this program, oclc aims to provide a technical platform for the management and publication of its rdf datasets at a commercial scale. it models the key bibliographic entities including work and person and populates them with legacy and marc-based metadata. it extends models to efficiently describe the contents of digital collections, art objects, and institutional repositories, which are not very well described in marc. it improves the bibliographic description of works and their translations. it manages the transition from marc and other legacy encoding formats to linked data and develops prototypes for native consumption of linked data to improve resource description and discovery.
finally, it organizes teaching and training events.53 since 2012, oclc has been publishing bibliographic data as linked data with three major lod datasets including oclc persons, worldcat works, and worldcat.org.54 inspired by google research, oclc is currently working on a knowledge vault pipeline process to harvest, extract, normalize, weigh, and synthesize knowledge from bibliographic records, authority files, and the web to generate linked data triples that improve the exploration and discovery experience of end-users.55 worldcat.org publishes its bibliographic metadata as linked data by extracting a rich set of entities including persons, works, places, events, concepts, and organizations to make possible several web services and functionalities for resource discovery and access.56 it uses schema.org (http://schema.org) as the base ontology, which can be extended with different ontologies and vocabularies to model worldcat bibliographic data to be published and consumed as linked data.57 tennant presents a simple example of how this works. suppose we want to represent the fact “william shakespeare is the author of hamlet” as linked data.58 to do this, the important entities should be extracted along with their semantics (relationships) and represented in a format that is both machine-processable and human-readable.
using schema.org, virtual international authority file (viaf.org), and worldcat.org, the sentence can be represented as a linked data triple, as shown in figure 1 based on tennant.59

the digital bibliography & library project (dblp) is an online computer science bibliography that provides bibliographic information about major publications in computer science with the goal of providing free access to high-quality bibliographic metadata and links to the electronic versions of these publications.60 as of october 2018, it has indexed more than 4.3 million publications from more than 2.1 million authors, including more than 40,000 journal volumes, 38,000 conference/workshop proceedings, and more than 80,000 monographs.61 its dataset is available as lod, which allows for faceted search and faceted navigation to the matching publications. it uses growbag graphs to create topic facets and uses dblp++ datasets (an enhanced version of dblp) and additional data extracted from related webpages on the web.62 a mysql database stores the dblp++ dataset, which is accessible in several ways including (1) getting the database dump; (2) using its web services; (3) using d2r server to access it in rdf; and (4) getting the rdf dump available in n3 serialization.63

the above discussions on loc, oclc, and dblp make it clear that lod can potentially transform the cataloging landscape of libraries by making bibliographic metadata visible and reusable on the web. however, this potential can only be exploited to its fullest if relevant vocabularies are provided to make the linked data more meaningful. lov fulfills this demand for relevant and standard vocabularies and is discussed in the next subsection.

figure 1. an example of publishing a sample fact as linked data (based on tennant64).
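the shakespeare and hamlet fact behind figure 1 can also be written out as a minimal python sketch that serializes the statement in n-triples syntax. the identifiers PERSON-ID and WORK-ID are placeholders rather than real viaf or worldcat identifiers, the uri patterns and helper names are assumptions made for illustration, and the direction of the statement (the work points to its author) follows schema.org’s author property rather than the word order of the english sentence.

```python
# Property URIs from the schema.org vocabulary.
SCHEMA_AUTHOR = "http://schema.org/author"
SCHEMA_NAME = "http://schema.org/name"

# Placeholder entity URIs: the real VIAF and WorldCat identifiers are
# deliberately not filled in, and the URI patterns are assumptions.
PERSON_URI = "http://viaf.org/viaf/PERSON-ID"
WORK_URI = "http://worldcat.org/entity/work/id/WORK-ID"


def uri_triple(subject, predicate, obj):
    """Serialize a triple whose object is another resource."""
    return f"<{subject}> <{predicate}> <{obj}> ."


def literal_triple(subject, predicate, text):
    """Serialize a triple whose object is a plain string literal."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'<{subject}> <{predicate}> "{escaped}" .'


triples = [
    # The core fact: the work (Hamlet) has the person (Shakespeare) as author.
    uri_triple(WORK_URI, SCHEMA_AUTHOR, PERSON_URI),
    # Human-readable labels for both entities.
    literal_triple(WORK_URI, SCHEMA_NAME, "Hamlet"),
    literal_triple(PERSON_URI, SCHEMA_NAME, "William Shakespeare"),
]

for line in triples:
    print(line)
```

because both the subject and the object of the first statement are uris, a consuming application can follow them to further data about the work or the person, which is what distinguishes this representation from a plain text string in a catalog record.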
linked open vocabularies

linked open vocabularies (lov) are a “high-quality catalog of reusable vocabularies to describe linked and open data.”65 they assist publishers in choosing the appropriate vocabulary to efficiently describe the semantics (classes, properties, and data types) of the data to be published as linked and open data.66 lov interconnects vocabularies, supports version control, records the property type of values to be matched with a query to increase the score of the terms, and offers a range of data access methods including apis, a sparql endpoint, and a data dump. the aim is to make the reuse of well-documented vocabularies possible in the lod environment.67 the lov portal brings value-added extensions to the web of data, which is evident from its adoption in several state-of-the-art applications.68 the presence of a vocabulary makes the corresponding linked data meaningful; if the original vocabulary vanishes from the web, linked data applications that rely on it no longer function because they cannot validate against the authoritative source.
lov systems prevent vocabularies from becoming unavailable by providing redundant or back-up locations for these vocabularies.69 the lov catalog meets almost all types of search criteria including search using metadata, ontology, apis, rdf dump, and sparql endpoint, enabling it to provide a range of services regarding the reuse of rdf vocabularies.70 linked data should be accompanied by its meaning to achieve its benefits, which is possible using vocabularies, especially rdf vocabularies that are also published as linked data and linked with each other, forming an lov ecosystem.71 such an ecosystem defines the health and usability of linked data by making its meaningful interpretation possible.72 for an ontology or vocabulary to be included in the lov catalog, it must be of an appropriate size with low-level and normalized constraints and represented in rdfs or web ontology language (owl); it must allow creating instances and support documentation by permitting comments, labels, definitions, and descriptions to support end users.73 the ontology must also be described in a semantic web language such as owl, be published on the web with no limitations on its reuse, and support content negotiation using searchable content and namespace uris.74 the lov catalog offers four core functionalities that make it more attractive for libraries. aggregate accesses vocabularies through a dump file or a sparql endpoint. search finds classes/properties in a vocabulary or ontology. stat displays descriptive statistics of lov vocabularies.
finally, suggest enables the registration of new vocabularies.75 radio and hanrath uncovered the concerns regarding transitioning to lov, including how preexisting terms could be mapped while considering the potential semantic loss.76 they describe this transition in the light of a case study at the university of kansas institutional repository, which adopted oclc’s fast vocabulary and analyzed the outcomes and impact of exposing its data as linked data. to them, a vocabulary that is universal in scope and detail can become “bloated” and may result in an aggregated list of uncontrolled terms. however, such a diverse system may be capable of accurately describing the contents of an institutional repository. in this regard, adopting a linked data vocabulary may serve to increase the overall quality of data by ensuring consistency with greater exposure of the resources when published as lod. however, such a transition to a linked data vocabulary is not simple and gets complicated when the process involves reconciling the legacy metadata, especially when dealing with the issues of under- or misrepresentation.77 publishers, commercial entities, and data providers such as universities are taking a keen interest and participating through consortia, and therefore the library community must contribute to, benefit from, and consider this inevitable opportunity seriously.78 considering the core role of libraries in connecting people to information, they should come forward to make their descriptive metadata collections available as linked and open data for the benefit of the scholarly community on the web.
it is time to move from strings (descriptive bibliographic records) to things (data items) that are connected in a more meaningful manner for the consumption of both machines and humans.79 despite the numerous benefits of lov, there are some well-documented [and well-supported] vocabularies that are “not published or no longer available.”80 while focusing on the mappings between schema.org and lov, nogales et al. argue that the lov portal is limited as “some of the vocabularies are not available here.”81 in other words, the lov portal is growing but is currently in its infancy, and much work is needed to bring in all, or at least the missing, well-documented and well-supported vocabularies. the true benefits of lov can be exploited to the fullest only when such vocabularies are linked and made available for consumption and reuse by the broader audience and applications of the web of data.

challenges, issues, and research opportunities

to answer rq02, this section attempts to identify some of the key challenges and issues regarding publishing and consuming bibliographic metadata as linked and open data. the sheer scale and diversity of cataloging frameworks, metadata encoding schemes, and standards make it difficult to approach cataloging effectively and efficiently. the quality of the cataloging data is another dimension that needs proper attention.

the multiplicity of cataloging rules and standards

the importance and critical role of standards in cataloging are clear to everyone.
with standards, it becomes possible to identify authors uniquely; link users to the intended and required resources; assess the value and usage of the services a library or information system provides; efficiently operate different transactions regarding bibliographic metadata; link content; preserve metadata; generate reports; and enable the transfer of notifications, data, and events across machines.82 the success of these standards is due to community-based efforts, their utility for a person or organization, and their ease of adoption.83 however, we are living in a “jungle of standards” with massive scale and complexity.84 we are facing a flood of standards, schemas, protocols, and formats to deal with bibliographic metadata.85 it is necessary to come up with some uniform and widely accepted standard, schema, protocol, and format, which will make uniformity between bibliographic records possible and make way for the de-duplication of records on the web. also, because of the exponential growth of the digital landscape of document collections and the emerging yet widely adopted linked data environment, it becomes necessary for librarians to be part of this global-scale activity of making their bibliographic data available as linked and open data.86 therefore, all these standards need re-envisioning and reconsideration when libraries transition from the current implementations to a more complex lod-based environment.87 rda is easy to use, user-centric, and retrieval-supportive with a precise vocabulary.88 however, it has lengthier descriptions with a lot of technical terms, is time-consuming, needs re-training, and suffers from the generation gap.89 rda is transitioning from aacr2 to produce metadata for knowledge artifacts, and it will be adaptive to the emerging data structures of linked data.90 although librarians could potentially play a vital role in making rda successful, it is challenging to bring them onto the same page as publishers and vendors.91
while studying bibframe 2.0 from the rda point of view, taniguchi observed that:
• bibframe has no class correspondence with rda; in particular, making a distinction between work and expression is challenging.
• some rda elements have no corresponding properties in bibframe, and therefore, cannot be expressed in bibframe. in other cases, bibframe properties cannot be converted back to rda elements due to the many-to-one and many-to-many mappings between them.
• the availability of multiple marc21-to-bibframe tools results in a variety of bibframe metadata, which makes its matching and merging in the later stages challenging.92

to understand whether bibframe 2.0 is suitable as a metadata schema, taniguchi examined it closely for the domain constraints of properties and developed four additional methods for implementing such constraints, i.e., defining properties in bibframe.93 of the resulting five methods, method 1 is the strictest in defining such properties, method 2 is the one used by bibframe, and the remaining methods gradually loosen the constraints. method 1 defines the domain of individual properties as work or instance only, which is according to the method in rda. method 2 defines properties using a multi-class structure (work-instance-item) for descriptive metadata. method 3 introduces a new class bibres to accommodate work and instance properties. method 4 uses two classes, bibres and work, for representing a bibliographic resource. method 5 leaves the domain of any property unspecified and uses rdf:type to represent whether a resource is a work or an instance. he observed that:
• the multi-class structure used in bibframe (method 2) raises questions about the consistency between this structure and the domain definition of the properties.
• if the quality of the metadata is a concern, especially the matching among metadata converted from different sources, then method 1 works better than method 2.
• if metadata conversion from different sources is required, then method 4 or 5 should be applied.94

taniguchi concludes that bibframe’s domain constraint policy is unsuitable for a descriptive metadata schema intended to exchange and share bibliographic resources, and therefore, should be reconsidered.95 according to sprochi, bibliographic metadata is passing through a significant transformation.96 frbr, rda, and bibframe are three major, currently running programs that will affect the recording, storage, retrieval, reuse, and sharing of bibliographic metadata. ifla focuses on reconciling the frbr, frad, and frsad models into one model, namely the frbr-library reference model (frbr-lrm [https://www.ifla.org/node/10280]), published in may 2016.97 sprochi further adds that it is generally expected that by adopting this new model, rda will be changed and revised significantly. bibframe will also get substantial modifications to become compatible with frbr-lrm and the resulting rda rules.98 these initiatives, on the one hand, make bibliographic metadata visible on the web but, on the other hand, introduce several changes and challenges for the library and information science community.99 to cope with the challenges of making bibliographic data visible, available, reusable, and shareable on the web, sprochi argues that:100
• the library and information science community must think of bibliographic records in terms of data that is both human-readable and machine-understandable, which can be processed across different applications and databases with no format restrictions. also, this data must support interoperability among vendors, publishers, users, and libraries and therefore, should be thought of beyond the notion that “only library create quality metadata” (as quoted in coyle [2007] and cited by sprochi).101
• a shared understanding of the semantic web, lod, data formats, and other related technologies is necessary for the library and information science community to have more meaningful and fruitful conversations with software developers, integrated library system (ils) designers, and it and linked data professionals. at least some basic knowledge about these technologies will enable the library community to participate actively in publishing, storing, visualizing, linking, and consuming bibliographic metadata as linked and open data.
• the library community must show more ils vendors a strong commitment to “post-marc” standards such as bibframe or any other standard that is supportive of the lod environment. this way we will be in a better position to exploit linked data and the semantic web to their fullest.

the library community must be ready to adopt lod in cataloging. transitioning from marc to linked data needs collaborative efforts and requires addressing several challenges. these challenges include:
• committing to a single standard by getting all units in the library on board, so that the big data problem resulting from the use of multiple metadata standards by different institutions could be mitigated;
• bringing individual experts, libraries, universities, and governments to work together and organize conferences, seminars, and workshops to bring linked data into the mainstream;
• translating the bibframe vocabulary into other languages;
• involving different users and experts in the area; and
• obtaining funding from the public sector and other agencies to continue the journey towards linked data.102

in the current scenario of metadata practices, the interoperability for the exchange of metadata varies across different formats.103 the semantic web and lod support different library models such as frbroo, edm, and
bibframe. these conceptual models and frameworks suffer from interoperability issues, which make data integration difficult. currently, several options are available for encoding bibliographic data to rdf (and to lod), which further complicates interoperability and introduces inconsistency.104 existing descriptive cataloging methodologies and the bibliographic ontology descriptions in cataloging and metadata standards set the stage for redesigning and developing better ways of improving information retrieval and interoperability.105 amid the massive heaps of information on the web, the library community (especially digital libraries) has devised standards for metadata and bibliographic description to meet the interoperability requirements for this part of the data on the web.106 semantic web technologies could be exploited to make information presentation, storage, and retrieval more user-friendly for digital libraries.107 to achieve such interoperability among resources, castro proposed an architecture for semantic bibliographic description.108 gardašević emphasizes employing information system engineers and developers to understand the resource description, discovery, and access processes in libraries and then extend these practices by applying linked data principles.109 this way bibliographic metadata will be more visible, reusable, and shareable on the web.
godby, wang, and mixter stress collaborative efforts to bring a single, universal platform for cataloging rules, encoding schemas, and models to a higher level of maturity, which requires initiatives such as rda, bibframe, ld4l, and bibflow (https://bibflow.library.ucdavis.edu/about).110 the massive volume of metadata (available in marc and other legacy formats) makes data migration to bibframe challenging.111 although bibframe challenges the conventional ground of cataloging, which aims to record tangible knowledge containers, it is still in its infancy at both theoretical and practical levels.112 for bibframe to be more efficient, enhanced, and enriched, it needs the attention of librarians and information science experts who will use it to encode their bibliographic metadata.113 gonzales suggests that librarians must be willing to share metadata and upgrade metadata encoding standards to bibframe; they should train, learn, and upgrade their systems to efficiently use the bibframe encoding scheme and research new ways of bringing interoperability between bibframe and other legacy metadata standards; and they should ensure the data security of patrons and mitigate the legal and copyright issues in making their resources visible as linked and open data.114 also, lov must be exploited from the cataloging perspective by finding ways to create a single, flexible, adaptable, and representative vocabulary. such a vocabulary will bring together the cataloging data from different libraries of the world and make it accessible and consumable as a single library linked data to get free from the jungle of metadata vocabularies [and standards].

publishing and consuming linked bibliographic metadata

according to the findings of one survey, there are several primary motives for publishing an institution’s [meta]data as linked data.
these include (in order from most to least frequent/essential):115
• making data visible on the web;
• experimenting and finding the potentials of publishing datasets as linked data;
• exposing local datasets to understand the nature of linked data;
• exploring the benefits of linked data for search engine optimization (seo);
• consuming and reusing linked data in future projects;
• increasing the data reusability and interoperability;
• testing schema.org and bibframe;
• meeting the requirements of the project; and
• making available the “stable, integrated, and normalized data about research activities of an institution.”116

the survey also identified several reasons given by participants for consuming such data. these include (in order from most to least frequent/essential):117
• improving the user experience;
• extending local data with other datasets;
• effectively managing the internal metadata;
• improving the accuracy and scope of search results;
• trying to improve seo for local resources;
• understanding the effect of data aggregation from multiple datasets; and
• experimenting and finding the potentials of consuming linked datasets.

publishing and consuming bibliographic data on the lod cloud enables numerous applications. kalou et al. developed a semantic mashup by combining semantic web technologies, restful services, and content management services (cms) to generate personalized book recommendations and publish them as linked data.118 it allows for the expressive reasoning and efficient management of ontologies and has potential applications in the library, cataloging services, and ranking book records and reviews. this application exemplifies how we can use commercially [and socially] curated metadata with bibliographic descriptions for improved user experience in digital libraries using linked data principles.
however, publishing and consuming bibliographic metadata as linked and open data is not simple and requires addressing several prominent challenges and issues, which are identified in the following subsections along with some opportunities for further research.

publishing linked bibliographic metadata

the university of illinois library worked on publishing marc21 records of 30,000 digitized books as linked library data by adding links, transforming them to lod-friendly semantics (mods), and deploying them as rdf with the objective of their being used by a wider community.119 to them, using semantic web technologies, a book can be linked to related resources and multiple possible contexts, which is an opportunity for libraries to build innovative user-centered services for the dissemination and use of bibliographic metadata.120 in this regard, the challenge is to maximally utilize the existing book-related bibliographic and descriptive metadata in a manner that parallels the services (both inside the library and outside) as well as to exploit to the fullest the full-text search and semantic web technologies, standards, and lod services.121 regarding the publication of national bibliographic information as free open linked data, ifla identifies several issues including:122
• dealing with the negative financial impact on the revenue generated from traditional metadata services;
• the inability to offer consistent services due to the complexity of copyright and licensing frameworks;
• the confusion in understanding the difference between the terms “open” and “free”;
• remodeling library data as library linked data;
• the limited persistence and sustainability of linked data resources;
• the steep learning curve in understanding and applying linked data practices to library data;
• making choices between sites to link to; and
• creating persistent uris for
library data objects.
from the analysis of the relevant literature, hallo identified several issues in publishing bibliographic metadata as linked and open data. these include difficulties in cataloging and migrating data to new conceptual models; the multiplicity of vocabularies for the same metadata; the lack of agreements to share data; the lack of experts and tools for transforming data; the lack of applications and indicators for its consumption; mapping issues; providing useful links between datasets; defining and controlling data ownership; and ensuring dataset quality.123 libraries should adopt the linked data five-star model by adopting emerging non-proprietary formats to publish their data; linking to external resources and services; and participating actively in enriching and improving the quality of metadata to improve knowledge management and discovery.124 cataloging has a bright future, with more dataset providers; the involvement of citizens and end users in metadata enrichment and annotation; ranking and recommendation becoming part of library cataloging services; and the increased participation of the library community in the body of semantic web and linked data.125 publishing linked data poses several issues. these include data cleanup issues, especially when dealing with legacy data; technical issues such as data ownership; the software maturity to keep linked data up-to-date; managing its colossal volume; providing it support for data entry, annotation, and modeling; developing representative and widely applicable lovs; and handling the steep learning curve to understand and apply linked data principles.
126 bull and quimby stress understanding how the library community is transitioning its cataloging methods, systems, standards, and integrations to the lod for making them visible on the web and how it keeps backward compatibility with legacy bibliographic metadata.127 it is necessary for the lod data model to maintain the underlying semantics of the existing models, schemas, and standards, yet innovate and renew old traditions, where the quality of the conversion solely depends on the ability of this new model to cope with heterogeneity conflicts, maintain granularity and semantic attributes, and consequently prevent loss of data and semantics.128 the new model should be semantically expressive enough to support meaningful and precise linking to other datasets. viewed differently, these challenges are significant research opportunities that will enable us to be part of the linked and open data community in a more profound manner.
consuming linked bibliographic metadata
consuming linked data resources can be a daunting task and may involve resolving/mitigating several challenges.
these challenges include:129
• dealing with bulky or unavailable rdf dumps, no authority control within rdf dumps, and data format variations;
• identifying terms’ specificity levels during concept matching;
• the limited reusability of library linked data due to lack of contextual data;
• harmonizing classes and objects at the institution level;
• excessive handcrafting due to few off-the-shelf visualization tools;
• manual mapping of vocabularies;
• matching, aligning, and disambiguating library and linked data;
• the limited representation of several essential resources as linked data due to the non-availability of uris;
• the lack of sufficient representative semantics for bibliographic data;
• the time-consuming task of understanding the structure of linked data for reuse;
• the ambiguity of terms across languages; and
• the instability of endpoints and outdated datasets.
syndication is required to make library data visible on the web. also, it is necessary to understand how current applications including web search engines perceive and treat visibility, to what extent schema.org matters, and what the nature of the linked data cloud is.130 an influential work may be translated into several languages, which results in multiple metadata records. some of these are complete, and others are missing details. godby and smith-yoshimura suggest aggregating these multiple metadata records into a single record, which can be complete, links the work to its different translations and translators, and is publishable (and consumable) as linked data.131 however, such an aggregation demands a great deal of human effort to make these records visible and consumable as linked data. this also includes describing all types of objects that libraries currently collect and manage; translating research findings to best practices; and establishing policies to use uris in marc and other types of records.
132 to achieve the long-term goal of making metadata consumable as linked data, libraries, as well as individual researchers, should align their research with the work of the major players such as oclc, loc, and ifla and follow their best practices.133 the issues in lov need immediate attention to make lod more useful. these issues include the following:134
• lov publishes only a subset of rdf vocabularies with no inclusion of value vocabularies such as skos thesauri;
• it provides no or almost negligible support for vocabulary authors;
• it relies on third parties to get the information about vocabulary usage in published datasets;
• it has insufficient support for multilingualism;
• it should support multi-term vocabulary search, which ontology designers require to understand and employ the complex relationships among concepts;
• it should support vocabulary matching, vocabulary checking, and multilingualism to allow users to search and browse vocabularies using their native language. translation would also improve the quality of a vocabulary, as it allows the community to evaluate and collaborate; and
• efforts are required to improve and make possible the long-term preservation of vocabularies.
lod emerged to change the design and development of metadata, which has implications for controlled vocabularies, especially the person/agent vocabularies that are fundamental to data linkage but suffer from the issues of metadata maintenance and verification.
135 therefore, practical data management and the metadata-to-triples transition should be studied in detail to make the wider adoption of lod possible.136 to come out of the lab environment and make lod practically useful, the controlled vocabularies must be cleaned, and their cost should be reduced.137 however, achieving this is challenging and requires answering how knowledge artifacts could be uniquely identified and labeled across digital collections and what the standard practices for using them should be.138 linked data is still new to libraries.139 the technological complexities, the perceived risks of adopting new technology, and limitations due to the system, politics, and economy are some of the barriers to its usage in libraries.140 however, libraries can potentially overcome these barriers by learning from the use of linked data in other domains, e.g., google’s knowledge graph and facebook’s open graph.141 graph interfaces could be developed to link author, publisher, and book-related information, which in turn can be linked to other open and freely available datasets.142 it is time that library and information science professionals come out of the old, document-centric approach to bibliographic metadata and adopt a more data-centric way of thinking for a more meaningful consumption of bibliographic metadata by both users and machines.143
quality of linked bibliographic metadata
the use of cataloging data defines its quality.144 quality is essential for the discovery, usage, provenance, currency, authentication, and administration of metadata.145 cataloging data or bibliographic metadata is considered fit for use based on its accuracy, completeness, logical consistency, provenance, coherence, timeliness, conformance and accessibility.
146 data is commonly assessed by its quality to be used in specific application scenarios and use cases; however, low-quality data can sometimes still be useful for a specific application as long as its quality meets the requirements of that application.147 several factors, including availability, accuracy, believability, completeness, conciseness, consistency, objectivity, relevance, understandability, timeliness, and verifiability, determine the quality of data.148 the quality of linked data can be of two types: one is the inherent quality of linked data, and the other relates to its infrastructure aspects. the former can be further divided into aspects including domain, metadata, rdf model, links among data items, and vocabulary. the infrastructural aspects include the server that hosts the linked data, linked data fragments, and file servers.149 this typology introduces issues of its own. the issues related to the inherent quality include “linking, vocabulary usage and the provision of administrative metadata.”150 the infrastructural aspect introduces issues related to naming conventions, which include avoiding blank nodes and using http uris, linking through owl:sameas links, describing by reusing the existing terms, and dereferencing.151 the definitions of quality cataloging are mainly based on the experience and practices of the cataloging community.152 cataloging quality falls into at least four basic categories: (1) the technical details of the bibliographic records, (2) the cataloging standards, (3) the cataloging process, and (4) the impact of cataloging on the user.153 the cataloging community focuses mainly on the quality of bibliographic metadata.
however, it is not sufficient to consider only the accuracy, completeness, and standardization of bibliographic metadata; the community should also consider the information needs of the users.154 van kleeck et al. investigated issues in the quality management of metadata for electronic resources to assess how well it supports the user tasks of finding, selecting, and accessing library holdings as well as to identify the potential for increasing efficiencies in acquisition and cataloging workflows.155 they evaluated the quality of existing bibliographic records, mostly provided by their vendors, compared them with those of oclc, and found that the latter better support users in resource discovery and access.156 from the management perspective, the complexity and volume of bibliographic metadata and the method of ingesting it into the catalog emphasize the selection of the highest-quality records.157 from the perspective of digital repositories, the absence of well-defined theoretical and operational definitions of metadata quality, interoperability, and consistency is one of the issues for the quality of metadata.158 the national information standards organization (niso) identifies several issues in creating metadata.
159 these include the inadequate knowledge about cataloging in both manual and automatic environments leading to inaccurate data entry, inconsistency of subject vocabularies, and limitations of resource discovery, and the development of standardized approaches to structure metadata.160 the poor quality of linked data can greatly limit its usefulness.161 datasets are created at the data level, resulting in a significant variance in perspectives and underlying data models.162 this also leads to errors in triplication, syntax, and data; misleading owl:sameas links; and the low availability of sparql endpoints.163 library catalogs, because of their low quality, most often fail to communicate clear and correct information to the users.164 the reasons for such low quality include users’ inability to produce catalogs that are free from faults and duplicates as well as the low standards and policies that drive these cataloging practices.165 although rich collections of bibliographic metadata are available, these are rich in terms of the heaps of cataloging data and not in terms of quality, with almost no bibliographic control.166 these errors in and the low quality of bibliographic metadata are the result of misunderstanding the aims and functions of bibliographic metadata and adopting “unwise” cataloging standards and policies.167 still, there exist some high-quality cataloging efforts with well-maintained cataloging records, where the only quality warrant is to correctly understand the subject matter of the artifact and effectively communicate between librarians and experts in the corresponding domain knowledge.168 the demand for such high-quality and well-managed catalogs has increased on the web.
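some of the naming conventions and link problems mentioned in this section (blank nodes, non-http uris, misleading owl:sameas links) can be checked mechanically. the following is a minimal sketch, assuming the triples are already parsed into python tuples of strings; the function name, the sample data, and the rule that an owl:sameas link from a resource to itself is suspicious are illustrative assumptions, not a published quality metric.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def lint_triples(triples):
    """Return (index, message) pairs for simple naming and linking problems.

    Each triple is a (subject, predicate, object) tuple of strings.
    Blank nodes are written "_:label"; literals start with a double quote.
    """
    problems = []
    for index, (subject, predicate, obj) in enumerate(triples):
        for term in (subject, obj):
            if term.startswith("_:"):
                problems.append((index, f"blank node {term}"))
            elif not term.startswith('"') and not term.startswith(("http://", "https://")):
                problems.append((index, f"non-HTTP URI {term}"))
        # Hypothetical rule: a sameAs link that points back to its own subject adds nothing.
        if predicate == OWL_SAME_AS and subject == obj:
            problems.append((index, "owl:sameAs links a resource to itself"))
    return problems


# Made-up sample data for illustration only.
sample = [
    ("http://example.org/book/1", "http://purl.org/dc/terms/title", '"An Example Book"'),
    ("_:b0", "http://purl.org/dc/terms/creator", "urn:isbn:0000000000"),
    ("http://example.org/book/1", OWL_SAME_AS, "http://example.org/book/1"),
]
for index, message in lint_triples(sample):
    print(index, message)
```

checks of this kind only catch syntactic symptoms; whether an owl:sameas link between two different resources is actually correct still requires domain knowledge.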
although people are more accustomed to web search engines, quality catalogs will attract not only libraries but general web users as well (when published and consumed as linked data).169 the community must work together on metadata with publishers and vendors to approach cataloging from the user perspective and refine the skillset as well as produce quality metadata.170 as library and information science professionals, we should not only be users of the standards; instead, we must actively participate in and contribute to their development and improvement so that we may effectively and efficiently connect our data with the rest of the world.171 such collaboration is required not only from librarians and vendors but also from users in developing an efficient cataloging environment and more usable bibliographic metadata; this is discussed in the next section.
linking the socially curated metadata
this section addresses rq03 by reviewing the state-of-the-art literature from multiple but related domains including library sciences, information sciences, information retrieval, and semantic web. the section below discusses the importance and possible impact of making socially curated metadata part of the bibliographic or professionally curated metadata. the next section highlights why social collaborative cataloging approaches should be adopted by librarians to work with other stakeholders in making their bibliographic data available and visible as linked and open data and what the possible impact is of fusing the user-generated content with professional metadata and making it available as linked and open data.
the socially curated metadata matters in cataloging
conventional libraries have clear and well-established classification and cataloging schemes, but these are as challenging to learn, understand, and apply as they are slow and painful to consume.172 using computers to retrieve bibliographic records resulted in the massive usage of copy cataloging.173 however, adopting this practice is challenging, because these records are inconsistent; incomplete; less visible, granular, and discoverable; unable to integrate metadata and content into the corresponding records; difficult to preserve in new and usable formats for consumption by users and machines; and not supportive of integrating user-generated content into the cataloging records.174 the university of illinois library, through its vufind service, offers extra features to enhance the search and exploration experience of end users by providing a book’s cover image, table of contents, abstracts, reviews, comments, and user tags.175 users can contribute content such as tags, reviews, and comments and recommend books to friends. however, it is necessary to research whether this user-generated content should be integrated into or preserved alongside the bibliographic records.176 in their book, alemu and stevens mentioned several advantages of making user-generated content part of library catalogs.177 these include (i) enhancing the functionality of professionally-curated metadata by making information objects findable and discoverable; (ii) removing the limitations posed by the sufficiency and necessity principles of the professionally-curated metadata; (iii) bringing users closer to the library by “pro-actively engaging” them in ratings, tagging, and reviewing, etc., provided that users are also involved in managing and controlling metadata entries; and (iv) the resulting “wisdom of the crowd” would benefit all the stakeholders from this massively growing socially-curated metadata.
however, this combination can only be utilized optimally if we can semantically and contextually link it to internal and external resources; the resulting metadata is openly accessed, shared, and reused; and users are supported in easily adding the metadata and made part of the quality control by enabling them to report spamming activities to the metadata experts.178 librarything for libraries (ltfl) makes a library catalog more informative and interactive by enhancing the opac, providing access to professional and social metadata, and enabling users to search, browse, and discover library holdings in a more engaging way (https://www.librarything.com/forlibraries). it is one of the practical examples of enriching library catalogs with user-generated content. this trend of merging social and professional metadata innovates library cataloging by dissolving the borders between the “social sphere” and library resources.179 social media has expanded the library into social spaces by exploiting tags and tag clouds as navigational tools and enriching the bibliographic descriptions by integrating user-generated content.180 it bridges the communication gaps between the library and its users, where users actively participate in resource description, discovery, and access.
181 the potential role of the socially curated metadata in resource description, discovery, and access is also evident from the long-tail social book search research under the initiative for xml retrieval (inex), where both professionally curated bibliographic and user-generated social metadata are exploited for retrieval and recommendation to support both known-item and exploratory search.182 experiments with amazon/librarything datasets of 2.8 million book records, containing both professional and social metadata, conclude that enriching the professional metadata with social metadata, especially tags, significantly improves search and recommendation.183 koolen also noticed that social metadata, especially tags and reviews, significantly improve search performance, as professionally curated metadata is “often too limited” to describe books resourcefully.184 users add socially curated metadata with the intention of making a resource re-findable during a future visit, i.e., they add metadata such as tags to help themselves and other similar users in resource discovery and access, and thereby form a community around the resource.185 clements found user tags (social tagging) beneficial for librarians while browsing and exploring the library catalogs.186 to some librarians, tags are complementary to controlled vocabulary; however, training issues and lack of awareness of social tagging functionality in cataloging interfaces limit their perceived benefit.187
the socially curated metadata as linked data
metadata is socially constructed.188 it shapes and is shaped by the context in which it is developed and applied, and it demands community-driven approaches, where data should be looked at from a holistic point of view rather than being considered as discrete (individual) semantic units.189 the library is adopting the collaborative social aspect of cataloging that will take place between authors, repository managers, libraries,
e-collection consortiums, publishers, and vendors.190 librarians should improve their cataloging skills in line with the advances in technology to expose and make visible their bibliographic metadata as linked and open data.191 currently, linked library data is generated and used by library professionals. socially constructed metadata will add value in retrieving knowledge artifacts with precision.192 the addition of socially constructed and community-driven metadata to current metadata structures, controlled vocabularies, and classification systems provides a holistic view of these structures, as it adds the community-generated sense to the professionally-curated metadata structures.193 an example of the possibilities of making user-generated content part of cataloging and linked open data is the semantic book mashup (see “consuming linked bibliographic metadata” above), which demonstrates how the commercially [and socially] curated metadata could be retrieved and linked with bibliographic descriptions.194 while enumerating the possible applications of this mashup, the authors argue that book reviews from different websites could be aggregated using linked data principles by extending the review class of bibframe 2.0.195 from the analysis of twenty-one in-depth interviews with lis professionals, alemu discovered four metadata principles, namely metadata enrichment, linkage, openness, and filtering.196 this analysis revealed that the absence of socially curated metadata is sub-optimal for the potential of lod in libraries.197 their analysis advocates for a mixed-metadata approach, in which social metadata (tags, ratings, and reviews) augments the bibliographic metadata by involving users proactively and by offering a social collaborative cataloging platform.
the metadata principles should be reconceptualized, and linked data should be exploited to address the existing library metadata challenges. therefore, the current efforts in linked data should fully consider social metadata.198 library catalogs should be enriched by mixing the professional and social metadata and should be semantically and contextually interlinked to internal and external information resources to be optimally used in different application scenarios.199 to fully exploit this linkage, the duplication of metadata should be reduced. the metadata must be made openly accessible so that its sharing, reuse, mixing, and matching become possible. the enriched metadata must be filtered per user requirements using an interface that is flexible, personalized, contextual, and reconfigurable.200 their analysis suggests a “paradigm shift” in metadata’s future, i.e., from simple to enriched; from disconnected, invisible, and locked to well-structured, machine-understandable, interconnected, visible, and more visualized metadata; and from a single opac interface to reconfigurable and adaptive metadata interfaces.201 by involving users in the metadata curation process, the mixed approach will bring diversity in metadata and make resources discoverable, usable, and user-centric with the wider and well-supported platform of linked and open data.202 in conclusion, the fusion of socially curated metadata with the standards-based professional metadata is essential from the perspective of the user-centric paradigm of cataloging, which has the potential to aid resource discovery and access and open new opportunities for information scientists working in linked and open data as well as catalogers who are transitioning to the web of data to make their metadata visible, reusable, and linkable to other resources on the web.
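the mixed-metadata idea of enriching professional records with social metadata and then filtering per user requirements can be sketched in a few lines. this is only an illustration under stated assumptions: the record fields, the sample tags, and the minimum-count filter are hypothetical and are not taken from the studies discussed above.

```python
from collections import Counter


def enrich(record: dict, user_tags: list[str]) -> dict:
    """Return a copy of a professional record with counted social tags added."""
    # Normalize case and whitespace so variant spellings of a tag are merged.
    normalized = [tag.strip().lower() for tag in user_tags if tag.strip()]
    enriched = dict(record)
    enriched["social_tags"] = Counter(normalized)
    return enriched


def filter_tags(record: dict, min_count: int = 2) -> list[str]:
    """Keep tags used at least min_count times, most frequent first."""
    counts = record.get("social_tags", Counter())
    kept = [(tag, n) for tag, n in counts.items() if n >= min_count]
    return [tag for tag, _ in sorted(kept, key=lambda item: (-item[1], item[0]))]


# Hypothetical record and tags, for illustration only.
professional = {"title": "An Example Book", "subjects": ["Cataloging"]}
tags = ["Linked Data", "linked data", "metadata", "Metadata ", "to-read", ""]
mixed = enrich(professional, tags)
print(filter_tags(mixed))
```

the threshold stands in for the filtering principle: the enriched record keeps every tag, while each interface or user can decide how much of the social layer to show.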
from the analysis and scholarly discussions of alemu, stevens, farnel, and others as well as from the initial experiments of kalou et al.,203 it becomes apparent that the application of linked data principles to library catalogs is future-proof and promising for a more user-friendly search and exploration experience with efficient resource description, discovery, access, and recommendations.
conclusions
in this paper, we presented a brief yet holistic review of the current state of linked and open data in cataloging. the paper identified the potential of lod and lov in making the bibliographic descriptions publishable, linkable, and consumable on the web. several prominent challenges, issues, and future research avenues were identified and discussed. the potential role of socially-curated metadata for enriching library catalogs and the collaborative social aspect of cataloging were highlighted. some of the notable points include the following:
• publishing, linking, and consuming bibliographic metadata on the web using linked data principles brings several benefits for libraries.204 the library community should improve its skills regarding this paradigm shift and adopt the best practices from other domains.205
• standards have a key role in cataloging; however, we are living in a “jungle of metadata standards” with varying complexity and scale, which makes them difficult to select, apply, and work with.206 to be part of the global-scale activity of making bibliographic data available on the web as linked and open data, these standards should be considered and reenvisioned.207
• the quality of bibliographic metadata depends on several factors including accuracy, completeness, logical consistency, provenance, coherence, timeliness, conformance and accessibility.208 however, achieving these characteristics is challenging for several reasons including cataloging errors; limited bibliographic control;
misunderstanding the role of metadata; and “unwise” cataloging standards and policies.209 to ensure high quality and make data visible and reusable as linked data, the library community should contribute to developing and refining these standards and policies.210
• metadata is socially constructed and demands community-driven approaches and the social collaborative aspect of cataloging by involving authors, repository managers, librarians, digital collection consortiums, publishers, vendors, and users.211 this is an emerging trend, which is gradually dissolving the borders between the “social sphere” and library resources and bridging the communication gap between libraries and their users, where end users contribute to the bibliographic descriptions, resulting in a diversity of metadata and making it user-centric and usable.212
• adopting a “mixed-metadata approach” by considering bibliographic metadata and the user-generated content complementary and essential for each other suggests a “paradigm shift” in the metadata’s future from simple to enriched; from human-readable data silos to machine-understandable, well-structured, and reusable; from invisible and restricted to visible and open; and from single opac to reconfigurable interfaces on the web.213
several researchers, including the ones cited in this article, agree that the professionally curated bibliographic metadata supports mostly known-item search and has little value for open and exploratory search and browsing. they believe that not only the collaborative social efforts of the cataloging community are essential but also the socially curated metadata, which can be used to enrich bibliographic metadata and support exploration and serendipity.
this is not only evident from the wider usage of librarything and its ltfl but also from the long-tail inex social book search research, where both professionally curated bibliographic and user-generated social metadata are exploited for retrieval and recommendation to support both known-item and exploratory search.214 therefore, this aspect should be considered for further research to make cataloging more useful for all the stakeholders including libraries, users, authors, and publishers, and for general consumption as linked data on the web. the current trend of social collaborative cataloging efforts is essential to fully exploit the potential of linked open data. however, if we look closely, we find four groups, namely librarians; linked data experts; information retrieval (ir) and interactive ir researchers; and users, all going their separate ways with minimal collaboration and communication. more specifically, they are not benefiting from each other as much as they could, although doing so could result in better possibilities for resource description, discovery, and access. for example, the library community should consider the findings of the inex sbs track, which have demonstrated that professional and social metadata are essential for each other to facilitate end users in resource discovery and access and support not only known-item search but also exploration and serendipity. the current practices of librarything, ltfl, and the social web in general advocate for user-centric cataloging, where users are not only the consumers of bibliographic descriptions but also the contributors to metadata enrichment.
linked open data experts have achieved significant milestones in other domains, e.g., e-government; they should understand the cataloging and resource discovery & access practices in libraries to make the bibliographic metadata not only visible as linked data on the web but also shareable, re-usable, and beneficial to the end-users. the social collaborative cataloging approach, actively involving the four mentioned groups, is significant to make bibliographic descriptions more useful not only for the library community and users but also for their consumption on the web as linked and open data. together we can, and we must.
references
1 maría hallo et al., “current state of linked data in digital libraries,” journal of information science 42, no. 2 (2016): 117–27, https://doi.org/10.1177/0165551515594729.
2 tim berners-lee, “design issues: linked data,” w3c, 2006, updated june 18, 2009, accessed november 09, 2018, https://www.w3.org/designissues/linkeddata.html; hallo, “current state,” 117.
3 yuji tosaka and jung-ran park, “rda: resource description & access—a survey of the current state of the art,” journal of the american society for information science and technology 64, no. 4 (2013): 651–62, https://doi.org/10.1002/asi.22825.
4 hallo, “current state,” 118; angela kroeger, “the road to bibframe: the evolution of the idea of bibliographic transition into a post-marc future,” cataloging & classification quarterly 51, no. (2013): 873–90, https://doi.org/10.1080/01639374.2013.823584; martin doerr et al., “the europeana data model (edm),” paper presented at the world library and information congress: 76th ifla general conference and assembly, gothenburg, sweden, august 10–15, 2010.
5 getaneh alemu and brett stevens, an emergent theory of digital library metadata—enrich then filter, 1st edition (waltham, ma: chandos publishing, elsevier ltd. 2015).
6 hallo, “current state,” 118.
7 berners-lee, “design issues.”
8 kim tallerås, “quality of linked bibliographic data: the models, vocabularies, and links of data sets published by four national libraries,” journal of library metadata 17, no. 2 (2017): 126–55, https://doi.org/10.1080/19386389.2017.1355166.
9 becky yoose and jody perkins, “the linked open data landscape in libraries and beyond,” journal of library metadata 13, no. 2–3 (2013): 197–211, https://doi.org/10.1080/19386389.2013.826075.
10 robert fox, “from strings to things,” digital library perspectives 32, no. 1 (2016): 2–6, https://doi.org/10.1108/dlp-10-2015-0020.
11 stanislava gardašević, “semantic web and linked (open) data possibilities and prospects for libraries,” infotheca—journal of informatics & librarianship 14, no. 1 (2013): 26–36, http://infoteka.bg.ac.rs/pdf/eng/2013-1/infotheca_xiv_1_2014_26-36.pdf.
12 thomas baker, pierre-yves vandenbussche, and bernard vatant, “requirements for vocabulary preservation and governance,” library hi tech 31, no. 4 (2013): 657–68, https://doi.org/10.1108/lht-03-2013-0027.
13 pierre-yves vandenbussche et al., “linked open vocabularies (lov): a gateway to reusable semantic vocabularies on the web,” semantic web 8, no. 3 (2017): 437–45, https://doi.org/10.3233/sw-160213.
14 tosaka, “rda,” 651, 652.
15 amanda sprochi, “where are we headed? resource description and access, bibliographic framework, and the functional requirements for bibliographic records library reference model,” international information & library review 48, no. 2 (2016): 129–36, https://doi.org/10.1080/10572317.2016.1176455.
16 brighid m.gonzales, “linking libraries to the web: linked data and the future of the bibliographic record,” information technology and libraries 33, no. 4 (2014): 10, https://doi.org/10.6017/ital.v33i4.5631. 17 shoichi taniguchi, “is bibframe 2.0 a suitable schema for exchanging and sharing diverse descriptive metadata about bibliographic resources?,” cataloging & classification quarterly 56, no. 1 (2018): 40–61, https://doi.org/10.1080/01639374.2017.1382643. 18 shoichi taniguchi, “bibframe and its issues: from the viewpoint of rda metadata,” journal of information processing and management 58, no. 1 (2015): 20–27, https://doi.org/10.1241/johokanri.58.20. 19 shoichi taniguchi, “examining bibframe 2.0 from the viewpoint of rda metadata schema,” cataloging & classification quarterly 55, no. 6 (2017): 387–412, https://doi.org/10.1080/01639374.2017.1322161. 20 nosheen fayyaz, irfan ullah, and shah khusro, “on the current state of linked open data: issues, challenges, and future directions,” international journal on semantic web and information systems (ijswis) 14, no. 4 (2018): 110–28, https://doi.org/10.4018/ijswis.2018100106. 21 asim ullah, shah khusro, and irfan ullah, “bibliographic classification in the digital age: current trends & future directions,” information technology and libraries 36, no. 3 (2017): 48–77, https://doi.org/10.6017/ital.v36i3.8930. 22 tosaka, “rda,” 659. 
https://doi.org/10.1108/dlp-10-2015-0020 http://infoteka.bg.ac.rs/pdf/eng/2013-1/infotheca_xiv_1_2014_26-36.pdf https://doi.org/10.1108/lht-03-2013-0027 https://doi.org/10.3233/sw-160213 https://doi.org/10.1080/10572317.2016.1176455 https://doi.org/10.6017/ital.v33i4.5631 https://doi.org/10.1080/01639374.2017.1382643 https://doi.org/10.1241/johokanri.58.20 https://doi.org/10.1080/01639374.2017.1322161 https://doi.org/10.4018/ijswis.2018100106 https://doi.org/10.6017/ital.v36i3.8930 current state of linked and open data in cataloging | ullah, khusro, ullah, and naeem 70 https://doi.org/10.6017/ital.v37i4.10432 23 tosaka, “rda,” 651, 652, 659. 24 tosaka, “rda,” 653, 660. 25 the first author used the trial version of rda toolkit to report these facts about rda (https://access.rdatoolkit.org). rda toolkit is co-published by american library association (http://www.ala.org), canadian federation of library associations (http://cflafcab.ca/en/home-page), and facet publishing (http://www.facetpublishing.co.uk). 26 ifla, “ifla conceptual models,” the international federation of library associations and institutions (ifla), 2017, updated april 06, 2009, accessed november 12, 2018, https://www.ifla.org/node/2016. 27 tosaka, “rda,” 651, 652, 655. 28 michael john khoo et al., “augmenting dublin core digital library metadata with dewey decimal classification,” journal of documentation 71, no. 5 (2015): 976–98. https://doi.org/10.1108/jd-07-2014-0103; ulli waltinger et al., “hierarchical classification of oai metadata using the ddc taxonomy,” in advanced language technologies for digital libraries, edited by raffaella bernardi, frederique segond and ilya zaihrayeu. 
lecture notes in computer science (lncs), 29–40: springer, berlin, heidelberg, 2011; aaron krowne and martin halbert, “an initial evaluation of automated organization for digital library browsing,” paper presented at the proceedings of the 5th acm/ieee-cs joint conference on digital libraries, denver, co, usa, june 7–11, 2005 2005; waltinger, “ddc taxonomy,” 30. 29 khoo, “dublin core,” 977, 984 . 30 loc, “marc standards: marc21 formats,” library of congress (loc), 2013, updated march 14, 2013, accessed january 2, 2014, http://www.loc.gov/marc/marcdocz.html. 31 philip e schreur, “linked data for production and the program for cooperative cataloging,” pcc policy committee meeting, 2017, accessed may 18, 2018, https://www.loc.gov/aba/pcc/documents/facil-session-2017/pcc_and_ld4p.pdf. 32 sarah bull and amanda quimby, “a renaissance in library metadata? the importance of community collaboration in a digital world,” insights 29, no. 2 (2016): 146–53, http://doi.org/10.1629/uksg.302. 33 philip e. schreur, “linked data for production,” pcc policy committee meeting, 2015, accessed november 09, 2018, https://www.loc.gov/aba/pcc/documents/pcc-ld4p.docx. 34 vandenbussche, “linked open vocabularies,” 437, 438, 450. 35 hallo, “current state,” 120. 36 hallo, “current state,” 118. 37 hallo, “current state,” 120, 124. https://access.rdatoolkit.org/ http://www.ala.org/ http://cfla-fcab.ca/en/home-page http://cfla-fcab.ca/en/home-page http://www.facetpublishing.co.uk/ https://www.ifla.org/node/2016 https://doi.org/10.1108/jd-07-2014-0103 http://www.loc.gov/marc/marcdocz.html https://www.loc.gov/aba/pcc/documents/facil-session-2017/pcc_and_ld4p.pdf http://doi.org/10.1629/uksg.302 https://www.loc.gov/aba/pcc/documents/pcc-ld4p.docx information technology and libraries | december 2018 71 38 hallo, “current state,” 120, 124. 39 hallo, “current state,” 124. 40 bull, “community collaboration,” 147. 
41 sam gyun oh, myongho yi, and wonghong jang, “deploying linked open vocabulary (lov) to enhance library linked data,” journal of information science theory and practice 2, no. 2 (2015): 6–15, http://dx.doi.org/10.1633/jistap.2015.3.2.1. 42 carlo bianchini and mauro guerrini, “a turning point for catalogs: ranganathan’s possible point of view,” cataloging & classification quarterly 53, no. 3-4 (2015): 341–51, http://doi.org/10.1080/01639374.2014.968273. 43 bianchini, “turning point,” 350. 44 loc, “library of congress linked data service,” the library of congress, accessed march 24, 2018, http://id.loc.gov/about/. 45 loc, “linked data service.” 46 loc, “linked data service.” 47 loc, “linked data service.” 48 loc, “linked data service.” 49 margaret e dull, “moving metadata forward with bibframe: an interview with rebecca guenther,” serials review 42, no. 1 (2016): 65–69, https://doi.org/10.1080/00987913.2016.1141032. 50 loc, “overview of the bibframe 2.0 model,” library of congress, april 21, 2016, accessed november 09, 2018, https://www.loc.gov/bibframe/docs/bibframe2-model.html. 51 taniguchi, “bibframe 2.0,” 388; taniguchi, “suitable schema,” 40. 52 oclc. 2016, “oclc linked data research,” online computer library center (oclc), https://www.oclc.org/research/themes/data-science/linkeddata.html. 53 oclc, “linked data research.” 54 jeff mister, “turning bibliographic metadata into actionable knowledge,” next blog—oclc, february 29, 2016, http://www.oclc.org/blog/main/turning-bibliographic-metadata-intoactionable-knowledge/. 55 mister, “turning bibliographic metadata.” 56 george campbell, karen coombs, and hank sway, “oclc linked data,” oclc developer network, march 26, 2018, https://www.oclc.org/developer/develop/linked-data.en.html. 
57 campbell, “oclc linked data.” http://dx.doi.org/10.1633/jistap.2015.3.2.1 http://doi.org/10.1080/01639374.2014.968273 http://id.loc.gov/about/ https://doi.org/10.1080/00987913.2016.1141032 https://www.loc.gov/bibframe/docs/bibframe2-model.html https://www.oclc.org/research/themes/data-science/linkeddata.html http://www.oclc.org/blog/main/turning-bibliographic-metadata-into-actionable-knowledge/ http://www.oclc.org/blog/main/turning-bibliographic-metadata-into-actionable-knowledge/ https://www.oclc.org/developer/develop/linked-data.en.html current state of linked and open data in cataloging | ullah, khusro, ullah, and naeem 72 https://doi.org/10.6017/ital.v37i4.10432 58 roy tennant, “getting started with linked data,” next blog—oclc, february 8, 2016, http://www.oclc.org/blog/main/getting-started-with-linked-data-3/. 59 tennant, “linked data.” 60 dblp, “dblp computer science bibliography: frequently asked questions,” digital bibliography & library project (dblp), updated november 07, 2018, accessed 08 november 2018. http://dblp.uni-trier.de/faq/. 61 dblp, “frequently asked questions.” 62 jörg diederich, wolf-tilo balke, and uwe thaden, “demonstrating the semantic growbag: automatically creating topic facets for faceteddblp,” paper presented at the proceedings of the 7th acm/ieee-cs joint conference on digital libraries, vancouver, canada, june 17–22, 2007. 63 jörg diederich, wolf-tilo balke, and uwe thaden, “about faceteddblp,” 2018, accessed november 09, 2018, http://dblp.l3s.de/dblp++.php. 64 tennant, “linked data.” 65 in this section, lov catalog or portal refers to the lov platform available at http://lov.okfn.org/dataset/lov/, whereas the abbreviation lov, when used alone (without the term catalog/portal), refers to linked open vocabularies in general; vandenbussche, “linked open vocabularies,” 437. 66 vandenbussche, “linked open vocabularies,” 443, 450. 67 vandenbussche, “linked open vocabularies,” 437. 
68 vandenbussche, “linked open vocabularies,” 437, 438, 450. 69 vandenbussche, “linked open vocabularies,” 438. 70 vandenbussche, “linked open vocabularies,” 437, 438, 443–46. 71 baker thomas, pierre-yves vandenbussche, and bernard vatant, “requirements for vocabulary preservation and governance,” library hi tech 31, no. 4 (2013): 657–68, https://doi.org/10.1108/lht-03-2013-0027. 72 thomas, “vocabulary preservation,” 658. 73 oh, “deploying,” 9. 74 oh, “deploying,” 9. 75 oh, “deploying,” 9, 10. http://www.oclc.org/blog/main/getting-started-with-linked-data-3/ http://dblp.uni-trier.de/faq/ http://dblp.l3s.de/dblp++.php http://lov.okfn.org/dataset/lov/ https://doi.org/10.1108/lht-03-2013-0027 information technology and libraries | december 2018 73 76 erik radio and scott hanrath, “measuring the impact and effectiveness of transitioning to a linked data vocabulary,” journal of library metadata 16, no. 2 (2016): 80–94, https://doi.org/10.1080/19386389.2016.1215734. 77 radio, transitioning,” 81. 78 robert, “strings to things,” 2. 79 robert, “strings to things,” 2, 4, 6. 80 vandenbussche, “linked open vocabularies,” 438. 81 as of april 23, 2018, the schema.org vocabulary is now available at http://lov.okfn.org/dataset/lov/; alberto nogales et al., “linking from schema.org microdata to the web of linked data: an empirical assessment,” computer standards & interfaces 45 (2016): 90-99. https://doi.org/10.1016/j.csi.2015.12.003. 82 bull, “community collaboration,” 146. 83 bull, “community collaboration,” 146. 84 bull, “community collaboration,” 147. 85 bull, “community collaboration,” 147. 86 bull, “community collaboration,” 147, 148. 87 schreur, 2015. linked data for production. 88 yhna therese p. santos, “resource description and access in the eyes of the filipino librarian: perceived advantages and disadvantages,” journal of library metadata 18, no. 1 (2017): 45–56, https://doi.org/10.1080/19386389.2017.1401869. 89 santos, “filipino librarian,” 51–55. 90 philomena w. 
mwaniki, “envisioning the future role of librarians: skills, services and information resources,” library management 39, no. 1, 2 (2018): 2–11, https://doi.org/10.1108/lm-01-2017-0001. 91 mwaniki, “envisioning the future,” 7, 8. 92 taniguchi, “bibframe 2.0,” 410, 411 . 93 taniguchi, “suitable schema,” 52–58 . 94 taniguchi, “suitable schema,” 59, 60. 95 taniguchi, “suitable schema,” 60. 96 sprochi, “where are we headed?,” 129, 134. https://doi.org/10.1080/19386389.2016.1215734 http://lov.okfn.org/dataset/lov/ https://doi.org/10.1016/j.csi.2015.12.003 https://doi.org/10.1080/19386389.2017.1401869 https://doi.org/10.1108/lm-01-2017-0001 current state of linked and open data in cataloging | ullah, khusro, ullah, and naeem 74 https://doi.org/10.6017/ital.v37i4.10432 97 sprochi, “where are we headed?,” 129. 98 sprochi, “where are we headed?,” 134. 99 sprochi, “where are we headed?,” 134. 100 sprochi, “where are we headed?,” 134, 135. 101 sprochi, “where are we headed?,” 134. 102 caitlin tillman, joseph hafner, and sharon farnel, “forming the canadian linked data initiative,” paper presented at the the 37th international association of scientific and technological university libraries 2016 (iatul 2016) conference, dalhousie university libraries in halifax, nova scotia, june 5–9, 2016. 103 carol jean godby, shenghui wang, and jeffrey k mixter, library linked data in the cloud: oclc's experiments with new models of resource description. vol. 5, synthesis lectures on the semantic web: theory and technology, san rafael, california (usa),morgan & claypool publishers, 2015, https://doi.org/10.2200/s00620ed1v01y201412wbe012. 104 sofia zapounidou, michalis sfakakis, and christos papatheodorou, “highlights of library data models in the era of linked open data,” paper presented at the the 7th metadata and semantics research conference, mtsr 2013, thessaloniki, greece, november 19 –22, 2013; timothy w. 
cole et al., “library marc records into linked open data: challenges and opportunities,” journal of library metadata 13, no. 2–3 (2013): 163–96, https://doi.org/10.1080/19386389.2013.826074; kim tallerås, “from many records to one graph: heterogeneity conflicts in the linked data restructuring cycle, information research 18, no. 3 (2013) paper c18, accessed november 10, 2018. 105 fabiano ferreira de castro, “functional requirements for bibliographic description in digital environments,” transinformação 28, no. 2 (2016): 223–31. https://doi.org/10.1590/231808892016000200008. 106 castro, “functional requirements,” 223, 224. 107 castro, “functional requirements,” 224, 230. 108 castro, “functional requirements,” 223, 228–30. 109 gardašević, “possibilities and prospects,” 35. 110 godby, oclc's experiments, 112. 111 gonzales, “the future,” 17. 112 karim tharani, “linked data in libraries: a case study of harvesting and sharing bibliographic metadata with bibframe,” information technology and libraries 34, no. 1 (2015): 5–15. https://doi.org/https://doi.org/10.6017/ital.v34i1.5664. 113 tharani, “harvesting and sharing,” 16. https://doi.org/10.2200/s00620ed1v01y201412wbe012 https://doi.org/10.1080/19386389.2013.826074 https://doi.org/10.1590/2318-08892016000200008 https://doi.org/10.1590/2318-08892016000200008 https://doi.org/https:/doi.org/10.6017/ital.v34i1.5664 information technology and libraries | december 2018 75 114 gonzales, “the future,” 16. 115 karen smith-yoshimura, “analysis of international linked data survey for implementers,” dlib magazine, 2016, july/august 2016. 116 smith-yoshimura, “analysis.” 117 smith-yoshimura, “analysis.” 118 aikaterini k. kalou, dimitrios a. koutsomitropoulos, and georgia d. solomou, “combining the best of both worlds: a semantic web book mashup as a linked data service over cms infrastructure,” journal of library metadata 16, no. 3–4 (2016): 228–49, https://doi.org/10.1080/19386389.2016.1258897. 119 cole, “marc,” 163, 165, 175. 
120 cole, “marc,” 163, 164, 191. 121 cole, “marc,” 164, 191. 122 ifla, “linked open data: challenges arising,” the international federation of library associations and institutions (ifla), 2014, accessed march 03, 2018, https://www.ifla.org/book/export/html/8548. 123 hallo, “current state,” 124. 124 hallo, “current state,” 126. 125 hallo, “current state,” 124. 126 karen smith-yoshimura, “linked data survey results 4–why and what institutions are publishing (updated),” hanging together the oclc research blog, september 3, 2014, accessed november 12, 2018, https://hangingtogether.org/?p=4167. 127 bull, “community collaboration,” 148. 128 tallerås, “one graph.” 129 karen smith-yoshimura, “linked data survey results 3–why and what institutions are consuming (updated),” hanging together the oclc research blog, september 1, 2014, accessed november 12, 2018, http://hangingtogether.org/?p=4155. 130 godby, oclc’s experiments, 116. 131 carol jean godby and karen smith‐yoshimura, “from records to things: managing the transition from legacy library metadata to linked data,” bulletin of the association for information science and technology 43, no. 2 (2017): 18–23, https://doi.org/10.1002/bul2.2017.1720430209. 132 godby, “from records to things,” 23. https://doi.org/10.1080/19386389.2016.1258897 https://www.ifla.org/book/export/html/8548 https://hangingtogether.org/?p=4167 http://hangingtogether.org/?p=4155 https://doi.org/10.1002/bul2.2017.1720430209 current state of linked and open data in cataloging | ullah, khusro, ullah, and naeem 76 https://doi.org/10.6017/ital.v37i4.10432 133 godby, “from records to things,” 22. 134 vandenbussche, “linked open vocabularies,” 449, 450. 135 silvia b. southwick, cory k lampert, and richard southwick, “preparing controlled vocabularies for linked data: benefits and challenges,” journal of library metadata 15, no. 3–4 (2015): 177–190, https://doi.org/10.1080/19386389.2015.1099983. 136 southwick, “controlled vocabularies,” 177. 
137 southwick, “controlled vocabularies,” 189, 190. 138 southwick, “controlled vocabularies,” 183. 139 robin hastings, “feature: linked data in libraries: status and future direction,” computers in libraries (magzine article), 2015, http://www.infotoday.com/cilmag/nov15/hastings-linked-data-in-libraries.shtml. 140 hastings, “status and future.” 141 hastings, “status and future.” 142 hastings, “status and future.” 143 hastings, “status and future.” 144 tallerås, “national libraries,” 129 (by quoting from van hooland 2009; wang and strong 1996). 145 jung-ran park, “metadata quality in digital repositories: a survey of the current state of the art,” cataloging & classification quarterly 47, no. 3–4 (2009): 213–28, https://doi.org/10.1080/01639370902737240. 146 tallerås, “national libraries,” 129 (by quoting from bruce & hillmann, 2004). 147 park, “metadata quality,” 213, 224; tallerås, “national libraries,” 129, 150. 148 park, “metadata quality,” 213, 215, 218–21, 224, 225; tallerås, “national libraries,” 141. 149 tallerås, “national libraries,” 129. 150 tallerås, “national libraries,” 129. 151 tallerås, “national libraries,” 129. 152 karen snow, “defining, assessing, and rethinking quality cataloging,” cataloging & classification quarterly 55, no. 7–8 (2017): 438–55, https://doi.org/10.1080/01639374.2017.1350774. 153 snow, “quality cataloging,” 445. 154 snow, “quality cataloging,” 451, 452. https://doi.org/10.1080/19386389.2015.1099983 http://www.infotoday.com/cilmag/nov15/hastings--linked-data-in-libraries.shtml http://www.infotoday.com/cilmag/nov15/hastings--linked-data-in-libraries.shtml https://doi.org/10.1080/01639370902737240 https://doi.org/10.1080/01639374.2017.1350774 information technology and libraries | december 2018 77 155 david van kleeck et al., “managing bibliographic data quality for electronic resources,” cataloging & classification quarterly 55, no. 7-8 (2017): 560–77, https://doi.org/10.1080/01639374.2017.1350777. 
156 van kleeck, “data quality,” 560, 575, 576. 157 van kleeck, “data quality,” 575. 158 park, “metadata quality,” 214, 216–18, 225. 159 niso, a framework of guidance for building good digital collections, ed. niso framework advisory group, 3rd ed (baltimore, md: national information standards organization, 2007), https://www.niso.org/sites/default/files/2017-08/framework3.pdf. 160 park, “metadata quality,” 214, 215; niso. guidance; jane barton, sarah currier, and jessie mn hey, “building quality assurance into metadata creation: an analysis based on the learning objects and e-prints communities of practice,” paper presented at the proceedings of the international conference on dublin core and metadata applications: supporting communities of discourse and practice—metadata research & applications, seattle, washington, september 28–october 2, 2003. 161 pascal hitzler and krzysztof janowicz, “linked data, big data, and the 4th paradigm,” semantic web 4, no. 3 (2013): 233–35, https://doi.org/10.3233/sw-130117. 162 hitzler, “4th paradigm,” 234. 163 hitzler, “4th paradigm,” 234. 164 alberto petrucciani, “quality of library catalogs and value of (good) catalogs,” cataloging & classification quarterly 53, no. 3–4 (2015): 303–13. https://doi.org/10.1080/01639374.2014.1003669. 165 petrucciani, “quality,” 303, 305. 166 petrucciani, “quality,” 303, 309, 311. 167 petrucciani, “quality,” 303, 309. 168 petrucciani, “quality,” 309, 310. 169 petrucciani, “quality,” 310. 170 bull, “community collaboration,” 147. 171 bull, “community collaboration,” 148. 172 han, myung-ja, “new discovery services and library bibliographic control,” library trends 61, no. 1 (2012):162–72, https://doi.org/10.1353/lib.2012.0025. 173 han, “bibliographic control,” 162. 
https://doi.org/10.1080/01639374.2017.1350777 https://www.niso.org/sites/default/files/2017-08/framework3.pdf https://doi.org/10.3233/sw-130117 https://doi.org/10.1080/01639374.2014.1003669 https://doi.org/10.1353/lib.2012.0025 current state of linked and open data in cataloging | ullah, khusro, ullah, and naeem 78 https://doi.org/10.6017/ital.v37i4.10432 174 han, “bibliographic control,” 169–71. 175 han, “bibliographic control,” 163. 176 han, “bibliographic control,” 167–70. 177 alemu, emergent theory, 29–33, 43–65. 178 alemu, emergent theory, 29–65. 179 lorri mon, social media and library services, synthesis lectures on information concepts, retrieval, and services, ed. gary marchionini, 40, san rafael, california (usa), morgan & claypool publishers, 2015), https://doi.org/10.2200/s00634ed1v01y201503icr040. 180 mon, social media, 50. 181 mon, social media, 24. 182 marijn koolen et al., “overview of the clef 2016 social book search lab,” paper presented at the 7th international conference of the cross-language evaluation forum for european languages, évora, portugal, september 5–8, 2016; koolen et al., “overview of the clef 2015 social book search lab,” paper presented at the 6th international conference of the crosslanguage evaluation forum for european languages, toulouse, france, september 8–11, 2015; patrice bellot et al., “overview of inex 2014,” paper presented at the international conference of the cross-language evaluation forum for european languages, sheffield, uk, september 15–18, 2014; bellot et al., “overview of inex 2013,” paper presented at the international conference of the cross-language evaluation forum for european languages, valencia, spain, september 23–26, 2013. 
183 bo-wen zhang, xu-cheng yin, and fang zhou, “a generic pseudo relevance feedback framework with heterogeneous social information,” information sciences 367–68 (2016): 909–26, https://doi.org/10.1016/j.ins.2016.07.004; xu-cheng yin et al., “isart: a generic framework for searching books with social information,” plos one 11, no. 2 (2016): e0148479, https://doi.org/10.1371/journal.pone.0148479; faten hamad and bashar alshboul, “exploiting social media and tagging for social book search: simple query methods for retrieval optimization,” in social media shaping e-publishing and academia, edited by nashrawan tahaet al., 107–17 (cham: springer international publishing, 2017). 184 marijn koolen, “user reviews in the search index? that’ll never work!” paper presented at the 36th european conference on ir research (ecir 2014), amsterdam, the netherlands, april 13–16, 2014. 185 alemu, emergent theory, 29–33, 43–65. 186 lucy clements and chern li liew, “talking about tags: an exploratory study of librarians’ perception and use of social tagging in a public library,” the electronic library 34, no. 2 (2016): 289–301, https://doi.org/10.1108/el-12-2014-0216. 187 clements, “talking about tags,” 291, 297-99. https://doi.org/10.2200/s00634ed1v01y201503icr040 https://doi.org/10.1016/j.ins.2016.07.004 https://doi.org/10.1371/journal.pone.0148479 https://doi.org/10.1108/el-12-2014-0216 information technology and libraries | december 2018 79 188 sharon farnel, “understanding community appropriate metadata through bernstein’s theory of language codes,” journal of library metadata 17, no. 1 (2017): 5–18, https://doi.org/10.1080/19386389.2017.1285141. 189 farnel, “bernstein’s theory,” 5, 6. 190 mwaniki, “envisioning the future,” 8. 191 mwaniki, “envisioning the future,” 8, 9. 192 getaneh alemu et al., “toward an emerging principle of linking socially-constructed metadata,” journal of library metadata 14, no. 2 (2014): 103–29, https://doi.org/10.1080/19386389.2014.914775. 
193 farnel, “bernstein’s theory,” 15–16. 194 kalou, “book mashup.” 195 kalou, “book mashup,” 242, 243. 196 alemu, “socially-constructed metadata,” 103, 107. 197 alemu, “socially-constructed metadata,” 103. 198 alemu, “socially-constructed metadata,” 103, 104, 120, 121. 199 getaneh alemu, “a theory of metadata enriching and filtering: challenges and opportunities to implementation,” qualitative and quantitative methods in libraries 5, no. 2 (2017): 311–34, http://www.qqml-journal.net/index.php/qqml/article/view/343 200 alemu, “metadata enriching and filtering,” 311. 201 alemu, “socially-constructed metadata,” 125. 202 alemu, “metadata enriching and filtering,” 319, 320. 203 alemu, “metadata enriching and filtering”; alemu, emergent theory; alemu, “sociallyconstructed metadata”; farnel, “bernstein's theory”; kalou, “book mashup.” 204 hallo, “current state,” 120. 205 alemu, “socially-constructed metadata,” 125; hastings, “status and future.” 206 bull, “community collaboration,” 147. 207 bull, “community collaboration,” 152; bull, “community collaboration,” 152; schreur, 2015. linked data for production. 208 tallerås, “national libraries,” 129. 209 petrucciani, “quality,” 303, 309. https://doi.org/10.1080/19386389.2017.1285141 https://doi.org/10.1080/19386389.2014.914775 http://www.qqml-journal.net/index.php/qqml/article/view/343 current state of linked and open data in cataloging | ullah, khusro, ullah, and naeem 80 https://doi.org/10.6017/ital.v37i4.10432 210 bull, “community collaboration,” 147, 152. 211 farnel, “bernstein's theory,” 5, 6, 12, 13, 15, 16; mwaniki, “envisioning the future,” 8. 212 mon, social media, 3; alemu, “metadata enriching and filtering,” 320. 213 alemu, “socially-constructed metadata,” 125. 
214 koolen, “clef 2016”; koolen, “clef 2015”; bellot, “inex 2014”; bellot, “inex 2013.” abstract introduction the role of linked open data and vocabularies in cataloging linked and open data linked open vocabularies challenges, issues, and research opportunities the multiplicity of cataloging rules and standards publishing and consuming linked bibliographic metadata publishing linked bibliographic metadata consuming linked bibliographic metadata quality of linked bibliographic metadata linking the socially curated metadata the socially curated metadata matters in cataloging the socially curated metadata as linked data conclusions references from our readers | eden 93 bradford lee edenfrom our readers the new user environment: the end of technical services? editor’s note: “from our readers” is an occasional feature highlighting ital readers’ letters and commentaries on timely issues. technical services: an obsolete term used to describe the largest component of most library staffs in the twentieth century. that component of the staff was entirely devoted to arcane and mysterious processes involved in selecting, acquiring, cataloging, processing, and otherwise making available to library users physical material containing information content pieces (incops). the processes were complicated, expensive, and time-consuming, and generally served to severely limit direct service to users both by producing records that were difficult to understand and interpret, even by other library staff, and by consuming from 75–80 percent of the library’s financial and personnel resources. in the twenty-first century, the advent of new forms of publication and new techniques for providing universal records and universal access to information content made the organizational structure obsolete. 
that change in organizational structure, more than any other single factor, is generally credited as being responsible for the dramatic improvement in the quality of library service that has occurred in the first decade of the twenty-first century.

there are many who would say that i was the one who wrote this quotation. i didn’t, and it is, in fact, more than twenty-five years old!1

while i was beginning to research and prepare for this article, i began as most users today start their search for information: i started with google. granted, i rarely go beyond the first page of results (as most user surveys indicate), but the paucity of links made me click to the next screen. there, at number 16, was a scanned article. jackpot! i thought as i started perusing the contents of this resource online, thinking to myself how the future had changed so dramatically since 1984, with the emergence of the internet and the laptop, all of the new information formats, and the digitization of information. ahh, the power of full text! after reading through the table of contents, introduction, and the first chapter, i noticed that some of the pages were missing. mmmm, obviously some very shoddy scanning on the part of google. but no, i finally realized that only part of this special issue was available on google. obviously, i missed the statement at the bottom of the front scan of the book: “this is a preview. the total pages displayed will be limited. learn more.” and thus the issues regarding copyright reared their ugly head.

when discussing the new user environment, there are many demands facing libraries today.
in a report by martha bates, citing the principle of least effort first attributed to philologist george zipf and quoted in the calhoun report to the library of congress, she states:

people do not just use information that is easy to find; they even use information that they know to be of poor quality and less reliable—so long as it requires little effort to find—rather than using information they know to be of high quality and reliable, though harder to find . . . despite heroic efforts on the part of librarians, students seldom have sufficiently sustained exposure to and practice with library skills to reach the point where they feel real ease with and mastery of library information systems.2

according to the final report of bibliographic services task force of the university of california libraries, users expect the following:

■ one system or search to cover a wide information universe (e.g., google or amazon)
■ enriched metadata (e.g., onix, tables of contents, and cover art)
■ full-text availability
■ to move easily and seamlessly from a citation about an item to the item itself—discovery alone is not enough
■ systems to provide a lot of intelligent assistance
  ❏ correction of obvious spelling errors
  ❏ results sorting in order of relevance to their queries
  ❏ help in navigating large retrievals through logical subsetting or topical maps or hierarchies
  ❏ help in selecting the best source through relevance ranking or added commentary from peers and experts or “others who used this also used that” tools
  ❏ customization and personalization services
■ authenticated single sign-on
■ security and privacy
■ communication and collaboration
■ multiple formats available: e-books, mpeg, jpeg, rss and other push technologies, along with traditional, tangible formats
■ direct links to e-mail, instant messaging, and sharing
■ access to online virtual communities
■ access to what the library has to offer without actually having to visit the library3

bradford lee eden (eden@library.ucsb.edu) is associate university librarian for technical services & scholarly communication, university of california, santa barbara.

information technology and libraries | june 2010

what is there in this new user environment for those who work in technical services? as indicated in the opening quote, would a dramatic improvement in library services occur if technical services were removed from the organizational structure? even in 1983, the huge financial investment that libraries made in the organization and description of information, inventory, workflows, and personnel was recognized; today, that investment comes under intense scrutiny as libraries realize that we no longer have a monopoly on information access, and to survive we need to move forward more aggressively into the digital environment than ever before. as marcum stated in her now-famous article,

■ if the commonly available books and journals are accessible online, should we consider the search engines the primary means of access to them?
■ massive digitization radically changes the nature of local libraries. does it make sense to devote local efforts to the cataloging of unique materials only rather than the regular books and journals?
■ we have introduced our cataloging rules and the marc format to libraries all over the world. how do we make massive changes without creating chaos?
■ and finally, a more specific question: should we proceed with aacr3 in light of a much-changed environment?4

there are larger internal issues to consider here as well. the budget situation in libraries requires the application of business models to workflows that have normally not been questioned nor challenged. karen calhoun discusses this topic in a number of her contributions to the literature:

when catalog librarians identify what they contribute to their communities with their methods (the cataloging rules, etc.)
and with the product they provide (the catalog), they face the danger of “marketing myopia.” marketing myopia is a term used in the business literature to describe a nearsighted view that focuses on the products and services that a firm provides, rather than the needs those products and services are intended to address.5 for understanding the implementation issues associated with the leadership strategy, it is important to be clear about what is meant by the “excess capacity” of catalogs. most catalogers would deny there is excess capacity in today’s cataloging departments, and they are correct. library materials continue to flood into acquisitions and cataloging departments and staff can barely keep up. yet the key problem of today’s online catalog is the effect of declining demand. in healthy businesses, the demand for a product and the capacity to produce it are in balance. research libraries invest huge sums in the infrastructure that produces their local catalogs, but search engines are students and scholars’ favorite place to begin a search. more users bypass catalogs for search engines, but research libraries’ investment in catalogs—and in the collections they describe—does not reflect the shift in user demand.6 i have discussed this exact problem in recent articles and technical reports as well.7 there have to be better, more efficient ways for libraries to organize and describe information not based on the status quo of redundant “localizing” of bibliographic records. a good analogy would be the current price of gas and the looming transportation crisis. for many years, americans have had the luxury of being able to purchase just about any type of car, truck, suv, hummer, etc., that they wanted on the basis of their own preferences, personalities, and incomes, not on the size of the gas tank or on the mileage per gallon. why not buy a mercedes over a kia? 
but with gas prices now well above the average person’s ability to consistently fill their gas tank without mortgaging their future, the market demands that people find alternative solutions in order to survive. this has meant moving away from the status quo of personal choice and selection toward a more economic and sustainable model of informed fuel-efficiency transportation, so much so that public transportation is now inundated with more users than it can handle, and consumers have all but abandoned the truck and suv markets. libraries have long worked in the mercedes arena, providing features such as authority control, subject classification, and redundant localizing of bibliographic records that were essential when libraries held the monopoly on information access but are no longer cost-efficient—nor even sane—strategies in the current information marketplace. users are not accessing the opac anymore; well-known studies indicate that more than 80 percent of information seekers begin their search on a web search engine. libraries are investing huge resources in staffing and priorities fiddling with marc bibliographic records in a time when they are struggling to survive and adapt from a monopoly environment to being just one of many players in the new information marketplace. budgets are stagnant, staffing is at an all-time low, new information formats continue to appear and require attention, and users are no longer patient nor comfortable working with our clunky opacs.8 why do libraries continue to support an infrastructure of buying and offering the same books, cds, dvds, journals, etc., at every library, when the new information environment offers libraries the opportunity to showcase and present their unique information resources and one-of-a-kind collections to the world? 
special collections materials held by every major research and public library in the world can now be digitized, and sparse library resources need to be adjusted to compete and offer these unique collections and their services to our users and the world.

from our readers | eden 95

the october 2007 issue of computers in libraries is devoted solely to articles related to the enhancement, usability, appropriateness, and demise of the library opac. interesting articles include “fac-back-opac: an open source solution interface to your library system,” “dreaming of a better ils,” “plug your users into library resources with opensearch plug-ins,” “delivering what people need, when and where they need it,” “the birth of a new generation of library interfaces,” and “will the ils soon be as obsolete as the card catalog?” an especially interesting quote is given by cervone, then assistant university librarian for information technology at northwestern university:

what i’d like to see is for the catalog to go away. to a great degree, it is an anachronism. what we need from the ils is a solid, business-process back end that would facilitate the functions of the library that are truly unique such as circulation, acquiring materials, and “cataloging” at the item level for what amounts to inventory-control purposes. most of the other traditional ils functions could be rolled over into a centralized system, like oclc, that would be cooperatively shared. the catalog itself should be treated as just another database in the world of resources we have access to.
a single interface to those resources that would combine our local print holdings, electronic text (both journal and ebook), as well as multimedia material is what we should be demanding from our vendors.9

one book that should be required reading for all librarians, especially catalogers, is weinberger’s everything is miscellaneous.10 he describes the three orders of order (self organization, metadata, and digital); provides an extensive history of how western civilization has ordered information, specifically the links to nineteenth-century victorianism; and explains the concepts of lumping and splitting. in the end, weinberger argues that the digital environment allows users to manipulate information into their own organization system, disregarding all previous organizational attempts by supposed experts using outdated and outmoded systems. in the digital disorder of information, an object (leaf) can now be placed on many shelves (branches), figuratively speaking, and this new shape of knowledge brings out four strategic principles:
1. filter on the way out, not on the way in.
2. put each leaf on as many branches as possible.
3. everything is metadata and everything can be a label.
4. give up control.

it is this last principle that libraries struggle with. whether we agree with this principle or not, it has already happened. arguing about it, ignoring it, or just continuing to do business as usual isn’t going to change the fact that information is user-controlled and user-initiated in the digital environment. so, where do we go from here?

the future of technical services (and its staff)

far be it from me to try to predict the future of libraries as viable, and more importantly marketable, information organizations in this new environment. one has only to examine the quotations from the first issues of technical services quarterly to see what happens to predictions and opinions.
titles of some of the contributions (from 1983, mind you) are worthy of mention: “library automation in the year 2000,” “musings on the future of the catalog,” and “libraries on the line.” there are developments, however, that require reexamination and strategic brainstorming regarding the future of library bibliographic organization and description. the appearance of worldcat local will have a tremendous impact on the disappearance of proprietary vendor opacs. there will no longer be a need for an integrated library system (ils); with worldcat local, the majority of the world’s marc bibliographic records are available in a library 2.0 format. the only things missing are some type of inventory and acquisitions module that can be formatted locally and a circulation module. if oclc could focus their programming efforts on these two services and integrate them into worldcat local, library administrators and systems staff would no longer have to deal with proprietary and clunky opacs (and their huge budgetary lines), but could use the power of web 2.0 (and hopefully 3.0) tools and services to better position themselves in the new information marketplace. another major development is the google digitization project (and other associated ventures). while there are some concerns about quality and copyright,11 as well as issues related to the disappearance of print and the time involved to digitize all print,12 no one can deny the gradual and inevitable effect that mass digitization of print resources will have in the new information marketplace. just the fact that my research explorations for this article brought up digitized portions of the 1983 technical services quarterly articles is an example. more and more, published print information will be available in full-text online. what effect will this have on the physical collection that all libraries maintain, not only in terms of circulation, but also in terms of use of space, preservation, and collection development? 
no one knows for sure, but if the search strategies and information discovery patterns of our users are any indication, then we need to be strategically preparing and developing directions and options.

automatic metadata generation has been a topic of discussion for a number of years, and jane greenberg’s work at the university of north carolina–chapel hill is one of the leading examples of research in this area.13 while there are still valid concerns about metadata generation without any type of human intervention, semiautomatic and even nonlibrary-facilitated metadata generation has been successful in a number of venues. as libraries grapple with decreased budgets, multiplying formats, fewer staff to do the work, and more retraining and reprofessional development of existing staff, library administrators have to examine all options to maximize personnel as well as budgetary resources. incorporating new technologies and tools for generating metadata without human intervention into library workflows should be viewed as a viable option. user tagging would be included in this area. even intner, a long-time proponent of traditional technical services, has written that generating cataloging data automatically would be of great benefit to the profession, and that more tools and more programming ought to be focused toward this goal.14

so, with print workflows being replaced by digital and electronic workflows, how can administrators assist their technical services staff to remain viable in this new information environment? how can technical services staff help not only themselves but also their supervisors and administrators to incorporate their unique talents, expertise, education, and experience toward the type of future scenarios indicated above?

competencies and challenges for technical services staff

there are some good opinions available for assisting technical services staff with moving into the new environment.
names have power, whether we like to admit it or not, and changing the name from “technical services” to something more understandable to our users, let alone our colleagues within the library, is one way to start. names such as “collections and data management services” or “reference data services” have been mentioned.15 an interesting quote sums up the dilemma: it’s pretty clear that technical services departments have long been the ugly ducklings in the library pond, trumped by a quintet of swans: reference departments (the ones with answers for a grateful public); it departments (the magicians who keep the computers humming); children’s and youth departments (the warm and fuzzy nurturers); other specialty departments (the experts in good reads, music, art, law, business, medicine, government documents, av, rare books and manuscripts, you-name-it); and administrative groups (the big bosses). part of the trouble is that the rest of our colleagues don’t really know what technical services librarians do. they only know that we do it behind closed doors and talk about it in language no one else understands. if it can’t be seen, can’t be understood, and can’t be discussed, maybe it’s all smoke and mirrors, lacking real substance. it’s easy to ignore.16 ruschoff mentions competencies for technical services librarians in the new information environment: comfortable working in both print and digital worlds, specialized skills such as foreign languages and subject area expertise, comfortable working in both digital and web-based technologies (suggesting more computing and technology skills), expertise in digital asset management, and problem-solving analytical skills.17 in a recent blog posting summarizing a presentation at the 2008 ala annual conference on this topic, comparisons between catalogers going extinct or retooling are provided. 
the following is a summary of that post:

converging trends
■■ more catalogers work at the support-staff level than as professional librarians.
■■ more cataloging records are selected by machines.
■■ more catalog records are being captured from publisher data or other sources.
■■ more updating of catalog records is done via batch processes.
■■ libraries continue to deemphasize processing of secondary research products in favor of unique primary materials.

what are our choices?
■■ behind door number one—the extinction model.
■■ behind door number two—the retooling model.

how it’s done
■■ extinction
■❏ keep cranking about how nobody appreciates us.
■❏ assert over and over that we’re already doing everything right—why should we change?
■❏ adopt a “chicken little” approach to envisioning the future.
■■ retooling
■❏ consider what catalogers already do.
■❏ look for support.
■❏ find a new job.

what catalogers do
■■ operate within the boundaries of detailed standards.
■■ describe items one-at-a-time.
■■ treat items as if they are intended to fit carefully within a specific application—the catalog.
■■ ignore the rest of the world of information.

what metadata librarians do
■■ think about descriptive data without preconceptions around descriptive level, granularity, or descriptive vocabularies.
■■ consider the entirety of the discovery and access issues around a set or collection of materials.
■■ consider users and uses beyond an individual service when making design decisions—not necessarily predetermined.
■■ leap tall buildings in a single bound.

what new metadata librarians do
■■ be aware of changing user needs.
■■ understand the evolving information environment.
■■ work collaboratively with technical staff.
■■ be familiar with all metadata formats and encoding metadata.
■■ seek out tall buildings—otherwise jumping skills will atrophy.

the cataloger skill set
■■ aacr2, lc, etc.

the metadata librarian skill set
■■ views data as collections, sets, streams.
■■ approaches the task as designing data to “play well with others.”

characteristics of our new world
■■ no more ils
■■ bibliographic utilities are unlikely to be the central node for all data.
■■ creation of metadata will become more decentralized.
■■ nobody knows how this will all shake out, but metadata librarians will be critical in forging solutions.18

while the above summary focuses on catalogers and their future, many of the directions also apply to any librarian or support staff member currently working in technical services. in a recent educause review article, brantley lists a number of mantras that all libraries need to repeat and keep in mind in this new information environment:
■■ libraries must be available everywhere.
■■ libraries must be designed to get better through use.
■■ libraries must be portable.
■■ libraries must know where they are.
■■ libraries must tell stories.
■■ libraries must help people learn.
■■ libraries must be tools of change.
■■ libraries must offer paths for exploration.
■■ libraries must help forge memory.
■■ libraries must speak for people.
■■ libraries must study the art of war.19

you will have to read the article to find out about that last point. the above mantras illustrate that each of these issues must also be aligned with the work done by technical services departments in support of the rest of the library’s services. and there definitely isn’t one right way to move forward; each library with its unique blend of services and staff has to define, initiate, and engender dialogue on change and strategic direction, and then actively make decisions with integrity and vigor toward both its users and its staff.
as calhoun indicates, there are a number of challenges to feasibility for next steps in this area, some technically oriented but many based on our own organizational structures and strictures:
■■ difficulty achieving consensus on standardized, simplified, more automated workflows.
■■ unwillingness or inability to dispense with highly customized acquisitions and cataloging operations.
■■ overcoming the “not invented here” mindset preventing ready acceptance of cataloging copy from other libraries or external sources.
■■ resistance to simplifying cataloging.
■■ inability to find and successfully collaborate with necessary partners (e.g., ils vendors).
■■ difficulty achieving basic levels of system interoperability.
■■ slow development and implementation of necessary standards.
■■ library-centric decision making; inability to base priorities on how users behave and what they want.
■■ limited availability of data to support management decisions.
■■ inadequate skill set among library staff; unwillingness or inability to retrain.
■■ resistance to change from faculty members, deans, or administrators.20

moving forward in the new information world

in a recent discussion on the autocat electronic discussion list regarding the client-business paradigm now being impressed on library staff, an especially interesting quote puts the entire debate into perspective:

the irony of this discussion is that our patrons/users/clients [et al.] expect to be treated as well as business customers. they pay tuition or taxes to most of our institutions and expect to have a return in value. and a very large percentage of them care about the differences between the government services vs. business arguments we present. what they know is that when they want something, they want it. more library powers-that-be now come from the world of business rather than libraries because of the pressure on the bottom line.
business administrators are viewed, even by those in public administration, as being more fiscally able than librarians. i would recommend that we fuss less about titles and semantics and develop ways to show the value of libraries to the public.21 wheeler, in a recent educause review article, documents a number of “eras” that colleges and universities have gone through in recent history.22 first is the “era of publishing,” followed by the “era of participation” with the appearance of the internet and its social networking tools. the next era, the “era of certitude,” is one in which users will want quick, timely answers to questions, along with some thought about the need and context of the question. wheeler espouses five dimensions that tools of certitude must have: reach, response, results, resources, and rights. he explains these dimensions in regards to various tools and services that libraries can provide through human–human, human–machine, and machine–machine interaction.23 wheeler sees extensive rethinking and reengineering by libraries, campuses, and information technology to assist users to meet their information needs. are there ways that technical services staff can assist in these efforts? 
although somewhat dated, calhoun’s extensive article on what is needed from catalogers and librarians in the twenty-first century expounds a number of salient points.24 in table 1, she illustrates some of the many challenges facing traditional library cataloging, providing her opinion on what the challenges are, why they exist, and some solutions for survivability and adaptability in the new marketplace.25 one quote in particular deserves attention: at the very least, adapting successfully to current demands will require new competencies for librarians, and i have made the case elsewhere that librarians must move beyond basic computer literacy to “it fluency”—that is, an understanding of the concepts of information technology, especially applying problem solving and critical thinking skills to using information technology. raising the bar of it fluency will be even more critical for metadata specialists, as they shift away from a focus on metadata production to approaches based on it tools and techniques on the one hand, and on consulting and teamwork on the other. as a result of the increasing need for it fluency among metadata specialists, they may become more closely allied with technical support groups in campus computing centers. the chief challenges for metadata specialists will be getting out of library back rooms, becoming familiar with the larger world of university knowledge communities, and developing primary contacts with the appropriate domain experts and it specialists.26 getting out of the back room and interacting with users seems to be one of the dominant themes of evolving technical services positions to fit the new information marketplace. putting web 2.0 tools and services into the library opac has also gained some momentum since the launch of the endeca-based opac at north carolina state university. 
as some people have stated, however, putting “lipstick on a pig” doesn’t change the fundamental problems and poor usability of something that never worked well in the first place.27 in their recent article, jia mi and cathy weng tried to answer the following questions: why is the current opac ineffective? what can libraries and librarians do to deliver an opac that is as good as search engines to better serve our users?28 of course, the authors are biased toward the opac and wish to make it better, given that the last sentence in their abstract is, “revitalizing the opac is one of the pressing issues that has to be accomplished.” users’ search patterns have already moved away from the opac as a discovery tool; why should personnel and resource investment continue to be allocated toward something that users have turned away from? in their recommendations, mi and weng indicate that system limitations, the failure to fully exploit the functionality already made available by ilss, and the unsuitability of marc standards to online bibliographic display are the primary factors in the ineffectiveness of library opacs. exactly. debate and discussion on autocat after the publication of their article again show the line drawn between conservative opinions (added value, noncommercialization, and overall ideals of the library profession and professional cataloging workflows) and the newer push for open-source models, junking the opac, and learning and working with non-marc metadata standards and tools.

conclusion

from an administrative point of view, there are a number of viable options for making technical services as efficient as possible, in its current incarnation:
■■ conduct a process review of all current workflows, following each type of format from receipt at loading dock to access by user. revise and redesign workflows for efficiency.
■■ eliminate all backlogs, incorporating and standardizing various types of bibliographic organization (from brief records to full records, using established criteria of importance and access).
■■ as much as possible, contract with vendors to make all print materials shelf-ready, establishing and monitoring profiles for quality and accuracy. establish a rate of error that is amenable to technical services staff; once that error rate is met, review incoming print materials only once or twice a year.
■■ assure technical services staff that their skills, experience, and attention to detail are needed in the electronic environment, and provide training and professional development to assist them in scanning and digitizing unique collections, learning non-marc metadata standards, improving project management, and performing consultation training to interact with faculty and students who work with data sets, metadata, and research planning. support and actively work for revised job reclassification of library support staff positions.

most libraries are forced to work with fewer staff, and it is essential that current personnel are valued for their institutional knowledge and skill sets (knowledge management philosophy). library administrations need to emphasize to their staff that the organization has a vested interest in providing them with the tools and training they need to assist the organization in the new information marketplace. the status quo of technical services operations is no longer viable or cost-effective; all of us must look at ways to regain market share and restructure our organizations to collaborate and consult with users regarding their information and research needs. no longer is it enough to just provide access to information; we must also provide tools and assistance to the user in manipulating that information.
to end, i would like to quote from a few of the articles from that 1983 issue of technical services quarterly i have alluded to throughout this chapter: like all prognostications, predictions about cataloging in a fully automated library may bear little resemblance to the ultimate reality. while the future cataloging scenario discussed here may seem reasonable now, it could prove embarrassing to read 10–20 years hence. still, i would be pleasantly surprised if, by the year 2000, ts operations are not fully integrated, ts staff has not been greatly reduced, there has not been a large-scale jump in ts productivity accompanied by a dramatic decline in ts costs, and if most of us are not cooperating through a national database.29 in conclusion, i will revert to my first subject, the uncertain nature of predictions. in addition to the fearless predictions already recorded, i predict that some of these predictions will come true and perhaps even most of them. some of them will come true, but not in the time anticipated, while others never will. let us hope that the influences not guessed that will prevent the actualization of some of these predictions will be happy ones, not dire. however they turn out, i predict that in ten years no one will remember or really care what these predictions were.30 technical services as we know them now may well not exist by the end of the century. the aims of technical services will exist for as long as there are libraries. the technical services quarterly may well have changed its name and its coverage long before then, but its concerns will remain real and the work to which many of us devote our lives will remain worthwhile. there can be few things in life that are as worth doing as enabling libraries to fulfill their unique and uniquely important role in culture and civilization.31 twenty-five years have come and gone; some of the predictions in this first issue of technical services quarterly came true, many of them did not. 
there have been dramatic changes in those twenty-five years, most of which were unforeseen, as they always are. what is a certainty is that libraries can no longer sustain or maintain the status quo in technical services. what also is a certainty is that technical services staff, with their unique skills, talents, abilities, and knowledge in relation to the organization and description of information, are desperately needed in the new information environment. it is the responsibility of both library administrators and technical services staff to work together to evolve and redesign workflows, standards, procedures, and even themselves to survive and succeed into the future.

references

1. norman d. stevens, “selections from a dictionary of libinfosci terms,” in “beyond ‘1984’: the future of technical services,” special issue, technical services quarterly 1, no. 1–2 (fall/winter 1983): 260.
2. marcia j. bates, “improving user access to library catalog and portal information: final report” (paper presented at the library of congress bicentennial conference on bibliographic control for the new millennium, june 1, 2003): 4, http://www.loc.gov/catdir/bibcontrol/2.3batesreport6-03.doc.pdf (accessed apr. 7, 2009). see also karen calhoun, “the changing nature of the catalog and its integration with other discovery tools,” final report to the library of congress, mar. 17, 2006, 25, http://www.loc.gov/catdir/calhoun-report-final.pdf (accessed apr. 7, 2009).
3. university of california libraries bibliographic services task force, “rethinking how we provide bibliographic services for the university of california,” final report, dec. 2005, 8, http://libraries.universityofcalifornia.edu/sopag/bstf/final.pdf (accessed apr. 7, 2009).
4. deanna b. marcum, “the future of cataloging,” library resources & technical services 50, no. 1 (jan. 2006): 9, http://www.loc.gov/library/reports/catalogingspeech.pdf (accessed apr. 7, 2009).
5. karen calhoun, “being a librarian: metadata and metadata specialists in the twenty-first century,” library hi tech 25, no. 2 (2007), http://www.emeraldinsight.com/insight/viewcontentservlet?filename=published/emeraldfulltextarticle/articles/2380250202.html (accessed apr. 7, 2009).
6. calhoun, “the changing nature of the catalog,” 15.
7. bradford lee eden, “ending the status quo,” american libraries 39, no. 3 (mar. 2008): 38; eden, introduction to “information organization future for libraries,” library technology reports 44, no. 8 (nov./dec. 2007): 5–7.
8. see karen schneider’s “how opacs suck” series on the ala techsource blog, http://www.techsource.ala.org/blog/2006/03/how-opacs-suck-part-1-relevance-rank-or-the-lack-of-it.html, http://www.techsource.ala.org/blog/2006/04/how-opacs-suck-part-2-the-checklist-of-shame.html, and http://www.techsource.ala.org/blog/2006/05/how-opacs-suck-part3-the-big-picture.html (accessed apr. 7, 2009).
9. h. frank cervone, quoted in ellen bahr, “dreaming of a better ils,” computers in libraries 27, no. 9 (oct. 2007): 14.
10. david weinberger, everything is miscellaneous: the power of the new digital disorder (new york: times, 2007).
11. for a list of these concerns, see robert darnton, “the library in the new age,” the new york review of books 55, no. 10 (june 12, 2008), http://www.nybooks.com/articles/21514 (accessed apr. 7, 2009).
12. see calhoun, “the changing nature of the catalog,” 27.
13. see the metadata research center, “automatic metadata generation applications (amega),” http://ils.unc.edu/mrc/amega (accessed apr. 7, 2009).
14. sheila s. intner, “generating cataloging data automatically,” technicalities 28, no. 2 (mar./apr. 2008): 1, 15–16.
15. sheila s. intner, “a technical services makeover,” technicalities 27, no. 5 (sept./oct. 2007): 1, 14–15.
16. ibid., 14 (emphasis added).
17. carlen ruschoff, “competencies for 21st century technical services,” technicalities 27, no. 6 (nov./dec. 2007): 1, 14–16.
18. diane hillman, “a has-been cataloger looks at what cataloging will be,” online posting, metadata blog, july 1, 2008, http://blogs.ala.org/nrmig.php?title=creating_the_future_of_the_catalog_aamp_&more=1&c=1&tb=1&pb=1 (accessed apr. 7, 2009).
19. peter brantley, “architectures for collaboration: roles and expectations for digital libraries,” educause review 43, no. 2 (mar./apr. 2008): 31–38.
20. calhoun, “the changing nature of the catalog,” 13.
21. brian briscoe, “that business/customer stuff (was: letter to al),” online posting, autocat, may 30, 2008.
22. brad wheeler, “in search of certitude,” educause review 43, no. 3 (may/june 2008): 15–34.
23. ibid., 22.
24. karen calhoun, “being a librarian.”
25. ibid.
26. ibid. (emphasis added).
27. andrew pace, quoted in roy tennant, “digital libraries: ‘lipstick on a pig,’” library journal, apr. 15, 2005, http://www.libraryjournal.com/article/ca516027.html (accessed apr. 7, 2009).
28. jia mi and cathy weng, “revitalizing the library opac: interface, searching, and display challenges,” information technology & libraries 27, no. 1 (mar. 2008): 5–22.
29. gregor a. preston, “how will automation affect cataloging staff?” in “beyond ‘1984’: the future of technical services,” special issue, technical services quarterly 1, no. 1–2 (fall/winter 1983): 134.
30. david c. taylor, “the library future: computers,” in “beyond ‘1984’: the future of technical services,” special issue, technical services quarterly 1, no. 1–2 (fall/winter 1983): 92–93.
31. michael gorman, “technical services, 1984–2001 (and before),” in “beyond ‘1984’: the future of technical services,” special issue, technical services quarterly 1, no. 1–2 (fall/winter 1983): 71.

guest editorial | hirst 179

organization structure and reorganization are never exciting topics.
the world rarely pauses to take a deep breath or offer a round of applause when an organization adds a new committee or decides to split into subgroups. however, organizations frequently inform the patterns and processes of change, as well as of no change. recently, the ex libris users of north america (eluna) group reorganized. processes and outcomes were similar to those i observed many years before when the library information and technology association (lita) restructured, and i labeled the process litaish. john webb subsequently asked me to elaborate through an information technology and libraries (ital) editorial.

■ lita: an organizational recap

in 1981, lita launched a bold reorganization. sections and committees were abolished and a new structure, the interest group, was created with the hope of significant benefits to the organization. the final report of the long-range plan implementation committee of may 29, 1984, stated:

the main thrust of the reorganization . . . was the establishment and encouragement of interest groups, which were intended to reflect topics of current interest to members and to have a structure which allows for easy creation and easy elimination as interests and technology change. interest groups could be formed . . . from as few as ten lita members and were empowered to plan and present programs, institutes, and preconferences . . .

linda knutson, who became executive director of lita in february 1987, “has . . . been impressed by the increase in the level of participation and by the tremendous energy that the players have; they want to contribute, and they plunge in with both feet.” these comments are from conversations with linda knutson quoted in “lita’s first twenty-five years: a brief history,” by stephen r. salmon in the march 1993 silver anniversary issue of ital.
twenty years later, the lita organization and, specifically, the lita interest groups (igs) continue to provide forums for discussion and to create conference programs, institutes, and preconferences. the igs hold the content of the organization with minimal administrative overhead, irregular leadership, and virtually no bylaws.

■ naaug: the deconstruction of a classic model

aleph, the ex libris integrated library solution (ils) software, is used in numerous countries. the north american aleph user’s group (naaug) existed from 1999 to 2006. the organization had a reasonably classic structure with a steering committee and ad hoc groups to work on annual software enhancements, focus groups, and conference planning. the organization was very centralized with all appointments to subgroups made by the steering committee.

developments outside the ils put pressure on naaug to reorganize. ex libris was offering numerous new products, some of which complemented, some of which were independent of the ils. as with any organization, there was some pressure to retain all or part of the status quo from those who were hesitant to change or change radically. leaders, including myself, were cautious, always questioning whether new developments would work and be effective.

■ eluna emerges

the new ex libris users’ organization, eluna, is composed of the steering committee, product groups (pgs), and interest groups (igs). i was intrigued with the formation of eluna igs and believe that this structure was an offspring of the lita igs. the eluna igs have very little bureaucracy to hinder the creativity and energy that lita wanted to capture. there is no minimum number of participants in an eluna ig, the creation of which can be proposed by any single individual. each group must write a brief annual report, have a contact person whose name and e-mail is posted on the web site, and may have an optional electronic discussion list.
the groups can meet at the annual conference or anywhere they choose and a virtual ig is not discouraged. the igs may get involved in product enhancements, but it is fine to leave this work to the pgs. currently, igs are organized around such areas as function, type of library, and particular software. some examples:

■ data representation (special scripts)
■ law
■ edi
■ music
■ government publications
■ shared systems (consortia)
■ ill
■ sql
■ large research libraries
■ z39.50

■ what happens next?

the eluna structures of steering committee, pgs, and igs are off to a good start. because each of these is empowered to work independently, a communication matrix needs to be put into place so that all interested or affected parties are adequately informed. in the future, a process will need to be created to identify groups that need to be disbanded. lita solved this problem with the periodic renewal process. in eluna, the contact person may be able to assume this responsibility.

we live in an age where “opening” offers a context for change. opening implies new possibilities and few restrictions. open systems . . . open access . . . open source. it appears to me that eluna is continuing a tradition that lita began twenty-five years ago with an open organization. put people into a group, stir lightly, and watch what comes out of the pot. ■

guest editorial
organizational structure: yesterday informs the present
donna hirst

donna l. hirst (donna-hirst@uiowa.edu) is project coordinator, library information technology, university of iowa libraries, and a member of the ital editorial board.

180 information technology and libraries | december 2006

184 information technology and libraries | december 2009

thomas sommer
unlv special collections in the twenty-first century

university of nevada las vegas (unlv) special collections is consistently striving to provide several avenues of discovery to its diverse range of patrons.
specifically, unlv special collections has planned and implemented several online tools to facilitate unearthing treasures in the collections. these online tools incorporate web 2.0 features as well as searchable interfaces to collections.

the university of nevada las vegas (unlv) special collections has been working toward creating a visible archival space in the twenty-first century that assists its patrons’ quest for historical discovery in unlv’s unique southern nevada, gaming, and las vegas collections. this effort has helped patrons ranging from researchers to students to residents. special collections has created a discovery environment that incorporates several points of access, including virtual exhibits, a collection-wide search box, and digital collections. unlv special collections also has added web 2.0 features to aid in the discovery and enrichment of this historical information. these new features range from a what’s new blog to a digital collection with interactive features.

the first point of discovery within the unlv special collections website began with the virtual exhibits. staff created the virtual exhibits as static html pages that showcased unique materials housed within unlv special collections. they showed the scope and diversity of materials on a specific topic available to researchers, faculty, and students. one virtual exhibit is “dino at the sands” (figure 1), a point of discovery for the history not only of dean martin but of many rat pack exploits.1 the photographs in this exhibit come from the sands collection. it is a static html page, and it provides information and pictures regarding one of las vegas’ most famous entertainers. this exhibit contains links to rat pack information and various resources on dean martin, including photographs, books, and videotapes.

a second mode of discovery within the unlv special collections website is its new “search special collections” google-like search box (figure 2).
this is located on the homepage and searches the manuscript, photograph, and oral history primary source collections.2 the purpose is to aid in the discovery of material within the collections that is not yet detailed in the public online catalog. in the past researchers would have to work through the special collections website to locate the resources. they can now go to one place to search for various types of material: a one-stop shop. the search results are easy to read and highlight the search term (see figure 3).3

the third point of access is the digital collection. these collections are digital copies of original materials located within the archives. the digital copies are presented online, described, and organized for easy access. each collection offers full-text searches, browsing, zoom, pan, side-by-side comparison, and exporting for presentation and reuse. the newest example of a digital collection is “southern nevada: the boomtown years” (figure 4).4 this collection brings together a wide range of original materials from various collections located within unlv special collections, the nevada state museum, the historical society in las vegas, and the clark county heritage museum. it even provides standards-based activities for elementary and high school students. this project was funded by the nevada state library and archives under the library services and technology act (lsta) as amended through the institute of museum and library services (imls). unlv special collections director peter michel selected the content. the team included fourteen members, four of whom were funded by the grant. christy keeler, phd, created the educator pages and designed the student activities.

new collections are great, but users have to know they exist. to announce new collections and displays, special collections first added a what’s new blog that includes an rss feed to keep patrons up-to-date on new messages (figure 5).5 another avenue of interaction was implemented in april 2009 when special collections created its own facebook page (figure 6).6 students and researchers are encouraged to become fans. status updates with images and links to southern nevada and las vegas resources lead the fans back to the main website where the other treasures can be discovered.

special collections has implemented various web 2.0 features within its newest digital collections. specifically, it added a comments section, a “rate it” feature, and an rss feature to its latest digital collections (figures 7, 8, and 9). these latest trends enrich the collections’ resources with patron-supplied information.7

as is apparent, unlv special collections implemented several online tools to allow patrons to discover its extensive primary resources. these tools range from virtual exhibits and digital collections with web 2.0 features to blogs and social networking sites. special collections has endeavored to stay on top of the latest trends to benefit its patrons and facilitate their discovery of historical materials in the twenty-first century.

figure 1. “dino at the sands” exhibit
figure 2. unlv special collections search box
figure 3. hoover dam search results
figure 4. “southern nevada: the boomtown years” digital collection
figure 5. “what’s new” blog
figure 6. unlv special collections facebook page
figure 7. comments section for aerial view of hughes aircraft plant photograph
figure 8. “rate it” feature for aerial view of hughes aircraft plant photograph
figure 9. rss feature for the index to the “welcome home howard” digital collection

thomas sommer (thomas.sommer@unlv.edu) is university and technical services archivist in special collections at the university of nevada las vegas libraries.

continued on page 190

as previously mentioned, these easy-to-use tools can allow screencast videos and screenshots to be integrated into a variety of online spaces. a particularly effective type of online space for potential integration of such screencast videos and screenshots is the library “how do i find . . .” research help guide. many of these “how do i find . . .” research help guides serve as pathfinders for patrons, outlining processes for obtaining information sources. currently, many of these pathfinders are in text form, and experimentation with the tools outlined in this article can empower library staff to enhance their own pathfinders with screencast videos and screenshot tutorials.

reference
1. “unlv libraries strategic plan 2009–2011,” http://www.library.unlv.edu/about/strategic_plan09-11.pdf (accessed july 30, 2009): 2.

unlv special collections continued from page 186

references
1. peter michel, “dino at the sands,” unlv special collections, http://www.library.unlv.edu/speccol/dino/index.html (accessed july 28, 2009).
2. peter michel, “unlv special collections search box,” unlv special collections, http://www.library.unlv.edu/speccol/index.html (accessed july 28, 2009).
3. unlv special collections search results, “hoover dam,” http://www.library.unlv.edu/speccol/databases/index.php?search_query=hoover+dam&bts=search&cols[]=oh&cols[]=man&cols[]=photocoll&act=2 (accessed october 27, 2009).
4. unlv libraries, “southern nevada: the boomtown years,” http://digital.library.unlv.edu/boomtown/ (accessed july 28, 2009).
5. unlv special collections, “what’s new in special collections,” http://blogs.library.unlv.edu/whats_new_in_special_collections/ (accessed july 28, 2009).
6.
unlv special collections, “unlv special collections facebook homepage,” http://www.facebook.com/home.php?#/pages/las-vegas-nv/unlv-special-collections/70053571047?ref=search (accessed july 28, 2009).
7. unlv libraries, “comments section for the aerial view of hughes aircraft plant photograph,” http://digital.library.unlv.edu/hughes/dm.php/hughes/82 (accessed july 28, 2009); unlv libraries, “‘rate it’ feature for the aerial view of hughes aircraft plant photograph,” http://digital.library.unlv.edu/hughes/dm.php/hughes/82 (accessed july 28, 2009); unlv libraries, “rss feature for the index to the welcome home howard digital collection,” http://digital.library.unlv.edu/hughes/dm.php/ (accessed july 28, 2009).

statement of ownership, management, and circulation

information technology and libraries, publication no. 280-800, is published quarterly in march, june, september, and december by the library information and technology association, american library association, 50 e. huron st., chicago, illinois 60611-2795. editor: marc truitt, associate director, information technology resources and services, university of alberta, k adams/cameron library and services, university of alberta, edmonton, ab t6g 2j8 canada. annual subscription price, $65. printed in u.s.a. with periodical-class postage paid at chicago, illinois, and other locations. as a nonprofit organization authorized to mail at special rates (dmm section 424.12 only), the purpose, function, and nonprofit status for federal income tax purposes have not changed during the preceding twelve months.

extent and nature of circulation (average figures denote the average number of copies printed each issue during the preceding twelve months; actual figures denote actual number of copies of single issue published nearest to filing date: september 2009 issue). total number of copies printed: average, 5,096; actual, 4,751. mailed outside county paid subscriptions: average, 4,090; actual, 3,778.
sales through dealers and carriers, street vendors, and counter sales: average, 430; actual, 399. total paid distribution: average, 4,520; actual, 4,177. free or nominal rate copies mailed at other classes through the usps: average, 54; actual, 57. free distribution outside the mail (total): average, 127; actual, 123. total free or nominal rate distribution: average, 181; actual, 180. total distribution: average, 4,701; actual, 4,357. office use, leftover, unaccounted, spoiled after printing: average, 395; actual, 394. total: average, 5,096; actual, 4,751. percentage paid: average, 96.15; actual, 95.87.

statement of ownership, management, and circulation (ps form 3526, september 2007) filed with the united states post office postmaster in chicago, october 1, 2009.

president’s message: ux thinking and the lita member experience
rachel vacek

information technology and libraries | september 2014 1

my mind has been occupied lately with user experience (ux) thinking in both the web world and in the physical world around me. i manage a web services department in an academic library, and it’s my department’s responsibility to contemplate how best to present website content so students can easily search for the articles they are looking for, or so faculty can quickly navigate to their favorite database. in addition to making these tasks easy and efficient, we want to make sure that users feel good about their accomplishments. my department has to ensure that the other systems and services that are integrated throughout the site are located in meaningful places and can be used at the point of need. additionally, the site’s graphic and interaction design must not only contribute to but also enhance the overall user experience.
we care about usability, graphic design, and the user interfaces of our library’s web presence, but these are just subsets of the larger ux picture. for example, a site can have a great user interface and design, but if a user can’t get to the actual information she is looking for, the overall experience is less than desirable.

jesse james garrett is considered to be one of the founding fathers of user-centered design, the creator of the pivotal diagram defining the elements of user experience, and author of the book the elements of user experience. he believes that “experience design is the design of anything, independent of medium, or across media, with human experience as an explicit outcome, and human engagement as an explicit goal.”1 in other words, applying a ux approach to thinking involves paying attention to a person’s behaviors, feelings, and attitudes about a particular product, system, or service. someone who does ux design, therefore, focuses on building the relationship between people and the products, systems, and services with which they interact.

garrett provides a roadmap of sorts for us by identifying and defining the elements of a web user experience, some of which are the visual, interface, and interaction design, the information architecture, and user needs.2 in time, these come together to form a cohesive, holistic approach to impacting our users’ overarching experience across our library’s web presence. paying attention to these more contextual elements informs the development and management of a web site.

let’s switch gears for a moment. prior to winning the election and becoming the lita vice-president/president-elect, i reflected on my experiences as a new lita member, before i became really engaged within the association. i endeavored to remember how i felt when i had joined lita in 2005. was i welcomed and informed, or did i feel distant and uninformed?
was the path clear to getting involved in interest groups and committees, or were there barriers that prevented me from getting engaged? what was my attitude about the overall organization? how were my feelings about lita impacted?

rachel vacek (revacek@uh.edu) is lita president 2014-15 and head of web services, university libraries, university of houston, houston, texas.

luckily, there were multiple times when i felt embraced by lita members, such as participating in bigwig’s social media showcase, teaching pre-conferences, hanging out at the happy hours, and attending the forums. i discovered ample networking opportunities and around every corner there always seemed to be a way to get involved. i attended as many lita programs at annual and midwinter conferences as i could, and in doing so, ran into the same crowds of people over and over again. plus, the sessions i attended always had excellent content and friendly, knowledgeable speakers. over time, many of these members became some of my friends and most trusted colleagues.

unfortunately, i’m confident that not every lita member or prospective member has had similar, consistent, or as engaging experiences as i’ve had, or as many opportunities to travel to conferences and network in-person. we all have different expectations and goals that color our personal experiences in interacting with lita and its members.

one of my goals as lita president is to enhance the member experience. i want to apply the user experience design concepts that i’m so familiar with to effect change and improve the overall experience for current members and those who are on the fence about joining. to be clear, when i say lita member, i am including board members, committee members and chairs, interest group members and chairs, representatives, and those just observing on the sidelines.
we are all lita members and deserve to have a good experience no matter the level within the organization.

so what does “member experience” really mean? don norman, author of the design of everyday things and the man credited with coining the phrase user experience, explains that “user experience encompasses all aspects of the end-user’s interaction with the company, its services, and its products.”3 therefore, i would say that the lita member experience encompasses all aspects of a member’s interaction with the association, including its programming, educational opportunities, publications, events, and even other members.

i believe that there are several components that define a good member experience. first, we have to ensure quality, coherence, and consistency in programming, publications, educational opportunities, communications and marketing, conferences, and networking opportunities. second, we need to pay attention to our members’ needs and wants as well as their motivations for joining. this means we have to engage with our members more on a personal level, discover their interests and strengths, and help them get involved in lita in ways that benefit the association as well as assist them in reaching their professional goals.

third, we need to be welcoming and recognize that first impressions are crucial to gaining new members and retaining current ones. think about how you felt and what you thought when you received a product that really impressed you, or when you started an exciting new job, or even used a clean and usable web site. if your initial impression was positive, you were more likely to connect with the product, environment, or website. if prospective and relatively new lita members experience a good first impression, they are more likely to join or renew their membership. they feel like they are part of a community that cares about them and their future.
that experience became meaningful.

finally, the fourth component to a good member experience is that we need to stop looking at the tangible benefits that we provide to users as the only things that matter. sure, it’s great to get discounts on workshops and webinars or be able to vote in an election and get appointed to a committee, but we can’t continue to focus on these offerings alone. we need to assess the way we communicate through email, social media, and our web page and determine if it adds to or detracts from the member experience. what is the first impression someone might have in looking at the content and design of lita’s web page? do the presenters for our educational programs feel valued? does ital contain innovative and useful information? is the process for joining lita, or volunteering to be on a committee, simple, complex, or unbearable? what kinds of interactions do members have with the lita board or the lita staff? these less tangible interactions are highly contextual and can add to or detract from our current and prospective members’ abilities to meet their own goals, measure satisfaction, or define success.

as lita president, and with the assistance of the board of directors, there are several things we have done or intend to do to help lita embrace ux thinking:

• we have implemented a chair and vice-chair model for committees so that there is a smoother transition and the vice-chair can learn the responsibilities of the chair role prior to being in that role.
• we have established a new communications committee that will create a communication strategy focused on communicating lita’s mission, vision, goals, and relevant and timely news to lita membership across various communication channels.
• we are encouraging our committees to create more robust documentation.
• we are creating richer documentation that supports the workings of the board.
• we are creating documentation and training materials for lita representatives to complement the materials we have for committee chairs.
• we have disbanded committees that no longer serve a purpose at the lita level and whose concerns are now addressed in groups higher within ala.
• the assessment and research committee is preparing to do a membership survey. the last one was done in 2007.
• we are going to be holding a few virtual and in-person lita “kitchen table conversations” in the fall of 2014 to assist with strategic planning and to discuss how lita’s goals align with ala’s strategic goals of information policy, professional development, and advocacy.
• the membership development committee is exploring how to more easily and frequently reach out, engage, appreciate, acknowledge, and highlight current and prospective members. they will work closely with the communications committee.

i believe that we’ve arrived at a time where it’s crucial that we employ ux thinking at a more pragmatic and systematic level and treat it as our strategic partner when exploring how to improve lita and help the association evolve to meet the needs of today’s library and information professionals. garrett summarizes my argument nicely. he says, “what makes people passionate, pure and simple, is great experiences. if they have great experience with your product [and] they have great experiences with your service, they’re going to be passionate about your brand, they’re going to be committed to it. that’s how you build that kind of commitment.”4

i personally am very passionate about and committed to lita, and i truly believe that our ux efforts will positively impact your experience as a lita member.

references
1. http://uxdesign.com/events/article/state-of-ux-design-garrett/203, garrett made this statement in a presentation entitled “state of user experience” that he gave during ux week 2009, a very popular conference for ux designers.
2.
http://www.jjg.net/elements/pdf/elements.pdf
3. http://www.nngroup.com/articles/definition-user-experience/
4. http://www.teresabrazen.com/podcasts/what-the-heck-is-user-experience-design, garrett made this statement in a podcast interview with teresa brazen, “what the heck is user experience design??!! (and why should i care?)”

information technology and libraries at 50: the 1990s in review
steven k. bowers

information technology and libraries | december 2018 9

steven k. bowers (sbowers@wayne.edu) is executive director, detroit area library network (dalnet).

i played some computer games, stored on data cassette tapes, in the 1980s. that was entertaining, but i never imagined the greater hold that computers would have on the world by the mid-1990s. i can remember getting my first email account in 1993, and looking at information on rudimentary web pages in 1996. i remember my work shifting from an electric typewriter to a bulky personal computer with dial-up internet access. eventually, this new computing technology became a prevalent part of my everyday life. this shift to a computer-driven reality had a major effect on libraries too. i was amazed by the end of the 1990s to be doing research on a university library catalog system connected with other institutions of higher education throughout the region, wondering at the expanded access to, and reach of, information. in my mind, due to computers and the internet, libraries were really connected at that time more than they had ever been.

as i prepared this review of what we were writing about in ital in the 1990s, i had some fond memories of the advent of personal computers in my daily life and in the libraries i had access to.
as we take a look back, i think it is interesting to see what we were doing then and how it is connected to what we are still working on today.

along with the eventual disruption that the internet was to libraries, computers and online access also had the effect of greatly changing how libraries constructed our core research tools, especially the catalog. prior to the 1990s libraries had begun automation projects to move their catalogs to computer-based terminals, creating connections and access that were not previously possible with a card catalog. if we are still complaining about the design and function of the online public access catalog (opac) today, in the early 1990s we were discussing what that design and function should be, in a positive and optimistic way. in some ways it seems hard to recall the discussions of how to format data and display it to users. in other ways it seems like we are still having the same discussions, but our work has become more complex as we continue to restructure library data to become more open and accessible.

while we were contemplating the design of online library catalogs, libraries were also discussing the implementation of networking and other information technology infrastructures. nevins and learn examined the changes in hardware, software, and telecommunications at the time and predicted a more affordable cost model with distributed personal computers connected through networks, along with enhanced cooperation in library automation.1 they expanded the discussion to include consideration of copyright and intellectual property, security, authorization, and a need for information literacy in the form of user navigation, all key to what we are doing today.

beyond catalogs, there was the real adoption of the internet itself. by the early 1990s there was growing enthusiasm for accessing and exploring the internet.2 this created a need for libraries to learn about the internet and instruct others on how to use it.
as late as 1997, however, even search engines were still being introduced and defined, and using the internet or searching the world wide web was still a new concept that was not fully understood by many people. at their most basic, search engines were simply defined as indexing and abstracting databases for the web.3 it is interesting that library catalogs were developed separately from the development of search engines and we are still trying to get our metadata out of our closed systems and open to the rest of the web.

https://doi.org/10.6017/ital.v37i4.10821

in 1991, kibirige examined the potential impact of this new connectivity on library automation. he posited that “one of the most significant change agents that will pervade all other trends is the establishment and regular use of high-speed, fiber optic communication highways.”4 his article in ital provides a prescient overview of much of what has played out in technology, not just in libraries. he noted the need for disconnected devices to become tools to access full-text information remotely.5 perhaps most important, he noted the need for librarians to become experts in non-library technology, to keep pace with developments outside of the profession. this admonition is still important to keep in mind today. at the time, however, libraries were working on the basics of converting records from online bibliographic utility systems running on mainframes to a more useful format for access on a personal computer, let alone thinking about transforming library metadata into linked data that can be accessed by the rest of the internet. so we keep moving forward.

later in the decade, libraries began to think about the library catalog as a “one stop shop” for information. in 1997, caswell wrote about new work to integrate local content, digital materials, and electronic resources, all into one search interface.
initially the discussion was more technical in nature, but caswell provided an early concept for providing a single access point to all of the content that the library has, print and electronic, which was a step forward from just listing the books in the catalog.6 at the time we were still far away from our current concept of a full discovery system with access to millions of electronic resources that may well surpass the print collections of a library. eventually more discussion developed around the importance of user experience and usability for the design of catalogs and websites. catalogs were examined in parallel with the structure of library metadata, and both were seen as important to the retrievability of library materials. human-machine interaction was starting to be examined on the staff side of systems, and this would eventually become part of examining the public interface usability as well. outlining an agenda for redesigning online catalogs, buckland summarized this new technological development work for libraries by noting that “sooner or later we need to rethink and redesign what is done so that it is not a mechanization of paper but fully exploits the capabilities of the new technology.”7 more exciting, by the end of the 1990s we were seeing usability studies for specific populations and those with accessibility difficulties. systems were in wide enough use that libraries began to examine their usefulness to more audiences. beyond our systems, the technology of our actual collections was changing. new network connectivity combined with new hardware led to new formats for library resources, specifically digital and electronic resources. 
in 1992, geraci and langschied summarized these changes, stating that “what is new for the 1990s is the complication of a greater variety of electronic format, software, hardware, and network decisions to consider.”8 they also expanded the conversation to include data in all forms, and data sets of various kinds, well beyond traditional library materials. this is an important evolution as libraries worked to shift their operations, identities, and curatorial practices. geraci and langschied defined data by type, including social data, scientific data, and humanities data. most importantly, they called for libraries to include access to this varied data, continuing the role of libraries in providing access to information, as they cautioned that information seekers were already beginning to bypass libraries and look for such information from other sources. libraries were beginning to lose ground as the gatekeepers of information and needed to shift to providing online access and open data themselves. the early 1990s were an exciting time for preservation, as discussion was moving from converting materials to microforms to digitization. in 1990, lesk compared the two formats and had hope for a promising digital future.9 thank goodness he was on target for sharing resources and creating economical digital copies, even if he did not completely predict the eventual shift to reliance on electronic resources that many research libraries have now made. lesk also noted the importance of text recognition, optical character recognition (ocr), and text formatting in ascii. others focused on digital file formats and the planning and execution of creating digital collections. digitization practices were developing and the need to formalize practice was becoming evident.
the same year, lynn outlined the relationship between digital resources and their original media, highlighting preservation, capture, storage, access, distribution.10 by the late 1990s there were more targeted discussions about the benefits of digitizing resources to provide not only remote access, but access to archival materials specifically. in 1996, alden provided a good primer on everything to consider when doing digitization projects, within budget constraints. 11 by the mid-1990s, karen hunter was excited to extol the promises of the dissemination of information electronically, calling the high performance computing and high speed networking applications act of 1993 “[a] formidable vision and goal. real-time access to everything and a laser printer in every house. the 1990s equivalent to a chicken in every pot.”12 hunter’s article is a good overview of where libraries were at working with electronic publications and online access in the early 1990s. halcyon enssle’s piece on moving reserves to online access opened with a great summary of where much of library access was headed: “the virtual library, libraries without walls, the invisible user . . . these are some of the terms getting used to describe the library of the future . . . .”13 eventually, by the end of the decade we even learned to start tracking how our new online libraries were being used, applying our knowledge of print resource usage to our new online collections. in 1995, laverna saunders had already developed a new definition of what a library was, and how the transformation of libraries from physical warehouses to providing access to online content would affect workflows in libraries. as defined by saunders, “the virtual library is a metaphor for the networked library, consisting of electronic and digital resources, both local and remote.”14 not a bad definition more than 20 years later. saunders asked pertinent questions such as which resources would be best in print vs. 
online, what print materials should be retained, and which resources and collections libraries should digitize themselves. the broader view provided was that these changes would affect not just collections but the entire operation of libraries. there would still be work to do in libraries, but changes in the work were necessary to address shifting technology and the composition of collections. by the end of the decade there was new work to assess use of electronic resources, extended virtual reference services, and information literacy extending to technology instruction. in 1998, kopp wrote about the promising future of library collaborations. consortia were well established in prior decades and they were seeing a resurgence. kopp noted that just as consortia had been built around support for new shared utilities in the 1970s and 1980s, in the 1990s they were finding a new purpose in the new networking of the internet and possibilities of greater connectivity and collaborations in the online environment.15 beyond cataloging and automation technology, it is interesting to note that even in the new online environment that was forming in the 1990s, many consortia formed at the time to share print resources. this may have been conversely related to libraries shifting from complete print collections to online holdings that many may have felt were more ephemeral, or maybe money was spent on new technological infrastructures and less on library materials. resource sharing of print materials is still an important part of libraries working together to provide access to information, and since the time that kopp wrote about consortia and growing networked collaborations, there has also been a growing development of sharing electronic resources.
a large part of the work of many consortia today revolves around purchasing of electronic resources, but in the late 1990s libraries were just beginning to get into purchasing commercial electronic resources.16 there were lots of ital articles in the 1990s looking at the future of libraries and technology, and some specific articles dedicated to prognostication. in 1991, looking into the future, kenneth e. dowlin shared a vision for public libraries in 2001. he predicted that libraries would still exist but it is noteworthy that at the time the future existence of libraries was questioned by many. dowlin did predict change for libraries, including the confluence of new media formats, computing, and yes, still books. he stated what time has now confirmed: “the public wants them all.”17 he had lots of other interesting ideas as well; his article is worth a second look. another fun take on the future was a special section on science fiction from 1994 considering future possibilities in information technology and access. in one piece, david brin noted, “nobody predicted that the home computer would displace the mega-machine and go on to replace the rifle over the fireplace as freedom’s great emancipator, liberating common citizens as no other technology has since the invention of the plow.”18 an interesting observation, even if the computer has now been replaced by phones in our pockets or other fantastic wearable technologies. by the end of the 1990s, libraries had been greatly transformed by technology. many libraries had automated, workflows continued to adjust in all areas of library work, and most libraries had at least partially incorporated elements of using the internet along with providing computer access to library users. some libraries were already moving through the change from print to electronic library resources. specific web applications and websites were also being developed and used for and by libraries. 
these eventually have matured into smarter systems that can provide better access to our collections and smarter assessment of our resource usage, for both print and electronic materials. as a whole, the 1990s are an exciting time to review when looking at the intersection of information technology and libraries. as information dissemination moved to an online environment, within and outside of the profession, the future existence of libraries began to be questioned. as we now know, libraries still play an important role in providing access to information.

notes

1 kate nevins and larry l. learn, “linked systems: issues and opportunities (or confronting a brave new world),” information technology and libraries 10, no. 2 (1991): 115.
2 constance l. foster, cynthia etkin, and elaine e. moore, “the net results: enthusiasm for exploring the internet,” information technology and libraries 12, no. 4 (1993): 433-6.
3 scott nicholson, “indexing and abstracting on the world wide web: an examination of six web databases,” information technology and libraries 16, no. 2 (1997): 73-81.
4 harry m. kibirige, “information communication highways in the 1990s: an analysis of their potential impact on library automation,” information technology and libraries 10, no. 3 (1991): 172.
5 kibirige, “information communication highways in the 1990s,” 175.
6 jerry v. caswell, “building an integrated user interface to electronic resources,” information technology and libraries 16, no. 2 (1997): 63-72.
7 michael k. buckland, “agenda for online catalog designers,” information technology and libraries 11, no. 2 (1992): 162.
8 diane geraci and linda langschied, “mainstreaming data: challenges to libraries,” information technology and libraries 11, no. 1 (1992): 10.
9 michael lesk, “image formats for preservation and access,” information technology and libraries 9, no. 4 (1990): 300-308.
10 m. stuart lynn, “digital imagery, preservation, and access--preservation and access technology: the relationship between digital and other media conversion processes: a structured glossary of technical terms,” information technology and libraries 9, no. 4 (1990): 309-336.
11 susan alden, “digital imaging on a shoestring: a primer for librarians,” information technology and libraries 15, no. 4 (1996): 247-50.
12 karen a. hunter, “issues and experiments in electronic publishing and dissemination,” information technology and libraries 13, no. 2 (1994): 127.
13 halcyon r. enssle, “reserve on-line: bringing reserve into the electronic age,” information technology and libraries 13, no. 3 (1994): 197.
14 laverna m. saunders, “transforming acquisitions to support virtual libraries,” information technology and libraries 14, no. 1 (1995): 41.
15 james j. kopp, “library consortia and information technology: the past, the present, the promise,” information technology and libraries 17, no. 1 (1998): 7-12.
16 international coalition of library consortia, “guidelines for statistical measures of usage of web-based indexed, abstracted, and full text resources,” information technology and libraries 17, no. 4 (1998): 219-21; charles t. townley and leigh murray, “use-based criteria for selecting and retaining electronic information: a case study,” information technology and libraries 18, no. 1 (1999): 32-9.
17 kenneth e. dowlin, “public libraries in 2001,” information technology and libraries 10, no. 4 (1991): 317.
18 david brin, “the good and the bad: outlines of tomorrow,” information technology and libraries 13, no. 1 (1994): 54.

editorial board thoughts: appreciation for history cynthia porter information technology and libraries | september 2012 2

the future looks exciting for ital, with our new open access and online only journal.
as i look forward, i have been thinking about librarians and the changes i have witnessed in library technology. i would like to thank judith carter for her work on ital for over 13 years. she encouraged me to volunteer for the editorial board. i will miss her. i believe that lessons from the past can help us. ital’s first issue appeared in 1982—the same year that i graduated from high school. i typed all my school papers with a typewriter except for my last couple of papers in college. my father bought an early macintosh computer (called lisa). he had a daisy wheel printer—if we wanted to change fonts, we changed out the daisy wheel. i am thankful for the editing capabilities and font choices i have now when i create documents. as an undergraduate student, i worked on dedicated oclc terminals in the interlibrary loan (ill) department at my college library. i was hired because i had the two hours open when ill usually used mail. i thought our ill service was a big help for our students. i could not imagine then that electronic copies of articles could be delivered to ill customers within one day. today’s ill staff doesn’t have to worry about paper cuts now, either. i graduated from library school in 1989. when i first started working as a cataloger, we were able to access oclc on pc’s (an improvement from the dumb terminals) in the libraries. our subject heading lists were in the big red books from the library of congress. i tried to use the red books as an example for today’s students and they had no idea what i was talking about. even though “subject headings” are a foreign concept to many students today, i will always value them and fight for their continuation. i worked on several retrospective conversion projects when i worked for a library contractor until 1991. the libraries still had card catalogs and we converted these physical catalogs to online catalogs. nicholson baker’s article “discards1,” published in 1994, fondly remembered card catalogs. 
this article was discussed fervently in library school, but it seems quaint now. i grew up with card catalogs and i liked being able to browse through the subject listings. browsing online does not provide the same satisfaction, but i would never give up the ability to keyword search an electronic document. i liked browsing the classification schemes, too. i did like easily seeing where your chosen number appeared within the scheme. it’s harder to do the same thing online.

cynthia porter (cporter@atsu.edu) is distance support librarian at a.t. still university of health sciences, mesa, arizona.

in 1991 i worked at an academic library where we were still converting catalog cards. we all had computers on our desks by then and we were comfortable with regular use of e-mail. the internet was still young and gophers were the new technology. even though gophers were text-based, i thought it was amazing how easy it was to access information from a university on the other side of the country. the internet was the biggest technology development for me. i currently work with distance students who rely on their internet connections to use our online library. i could not imagine even having distance students if we weren’t connected with computers as we are now. a 2009 issue of ital was dedicated to discovery tools. in judith carter’s introduction to the issue she cites the browsing theory of shan-ju lin chang. browsing is an old practice in libraries and i am very happy to see that discovery tools use this classic library practice. bringing like items together has been a helpful organization method for ages. when i studied s.r. ranganathan and his colon classification scheme, i realized that faceted classification would work very well on the web. i found his ideas to be fascinating, but difficult to implement on book labels for classification numbers.
some discovery tools even identify “facets” in searching and limiting. ranganathan’s work is a beautiful example of an old idea blossoming years after its conception. classification, facets, and browsing are old ideas that are still helping us organize information in our libraries. we can’t see the heavily used subjects by how dirty the cards are, but getting exact statistics on search terms is more useful anyway. i would also like to thank marc truitt for his time and contributions to ital. marc recently finished serving for four years as ital editor. he helped me remember library technology. i wanted to know about his collaboration with judith carter. he said that he “thought no one this side of pluto could do as well as she” as managing editor. we are lucky to have had brave librarians like ranganathan, carter, and truitt. although i enjoy remembering the past, i am very happy to utilize modern technology in my library. i don’t want to live in the past, but i definitely don’t want to forget it either. thank you library technology pioneers. references 1. nicholson baker, “discards,” the new yorker, april 4, 1994, vol. 70, no. 7, p. 64-85. fulfill your digital preservation goals with a budget studio yongli zhou information technology and libraries | march 2016 26 abstract to fulfill digital preservation goals, many institutions use high-end scanners for in-house scanning of historical print and oversize materials. however, high-end scanner prices do not fit in many small institutions’ budgets. as digital single-lens reflex (dslr) camera technologies advance and camera prices drop quickly, a budget photography studio can help to achieve institutions’ preservation goals. this paper compares images delivered by a high-end overhead scanner and a consumer-level dslr camera, discusses pros and cons of using each method, demonstrates how to set up a cost-efficient shooting studio, and presents a budget estimate for a studio. 
introduction

colorado state university libraries (csul) are regularly engaged in a variety of digitization projects. materials for some projects are digitized in-house, while items from selected projects are sometimes outsourced. most fragile materials that require professional handling are digitized in-house using an expensive overhead scanner. however, the overhead scanner has been occasionally unstable since it was purchased, and this has delayed some of our digitization projects. as digital photography technologies advance, image quality delivered by digital single-lens reflex (dslr) cameras is improving, and camera prices have lowered to an affordable level. in this paper, i will compare images produced by a scanner and a camera side-by-side, list pros and cons of using each method, illustrate how to establish a shooting studio, and present a budget estimate for that studio.

literature review

there are many online guidelines and manuals for digitizing print materials. some universities and museums have information about their digitization equipment online. most articles focus on either high-end scanners or customized scanning stations. these articles are very helpful for universities and museums that are relatively well funded. however, there is almost no literature discussing how to use inexpensive digital cameras and photography equipment to produce high-quality digitized images. this article will use a case study to show that a low-budget studio can produce high-quality digitized images.

comparison of scanned and photographed images

the test camera set was chosen because it was the one the author used for general purposes. the camera was also chosen by many professional photographers because of its quality and affordability. to avoid dispute, the overhead scanner’s make and model are not revealed.

yongli zhou (yongli.zhou@colostate.edu) is digital repositories librarian, colorado state university libraries, fort collins, colorado.

fulfill your preservations goals with a budget studio | zhou doi: 10.6017/ital.v35i1.5704

test equipment

budget studio
• nikon d800
• nikon af micro-nikkor 60mm f/2.8d lens
• manfrotto 055cxpro3 3-section carbon fiber tripod legs
• really right stuff bh-40 lr ii ballhead
• nonreflective glass
• book cradles
• x-rite original colorchecker card
• natural daylight
• total cost: $4,500 and no maintenance fees (priced in 2014)

overhead scanner
• our overhead scanner
• nonreflective glass
• book cradles
• purchase price: $55,000 (purchased in 2007)
• $8,000 annual maintenance (2013 price)

table 1. test equipment

focus and sharpness

a quality digitized image needs to have a good focus. a well-focused image shows details better and can produce better optical character recognition (ocr) results for text-based documents. at csul, we have no control over the automatic focus on our overhead scanner and have noticed that sometimes one page is sharply focused but the next page is slightly out-of-focus. during the scanning process, our overhead scanner does not indicate if a shot is focused or not. a dslr camera can beep or display a flashing dot on the viewfinder when in focus.

illustration

the following two figures compare images produced by our test dslr and overhead scanner. both images are originals and have not been enhanced by software. in addition to this image, we tested nine other illustrations. following our comparison study, we concluded that a semiprofessional dslr camera produces sharper images than our expensive overhead scanner. in figure 1, at 100 percent zoom, the left image has a better focus, contains more details, and has colors closer to the original. the left image was taken using a nikon d800 + nikkor 60mm macro lens and under natural lighting. the right image was produced by our overhead scanner.
in figure 2, at 200 percent zoom, the left image (taken using the dslr) shows much more detail than the image on the right (taken with the overhead scanner).

figure 1. comparative images from dslr (left) and overhead scanner (right), at 100 percent zoom. image from samuel m. janney, the life of william penn; with selections from his correspondence and auto-biography (philadelphia: hogan perkins & co, 1852), plate between pages 296 and 297.

figure 2. comparative images from dslr (left) and overhead scanner (right), at 200 percent zoom. image from samuel m. janney, the life of william penn; with selections from his correspondence and auto-biography (philadelphia: hogan perkins & co, 1852), frontispiece, print.

at csul, the process of digitizing a text document includes scanning pages, converting them into portable document format (pdf) files, and applying an ocr process. in general, a well-focused image of text produces better ocr results, although software such as adobe acrobat can tolerate fuzzy images and produce reasonably accurate ocr text. our ocr tests on a slightly out-of-focus image and a well-focused image showed no significant difference; however, from preservation and usability standpoints, we prefer well-focused images.

figure 3. the left image was produced by our test dslr camera and has a better focus. the right image was produced by our overhead scanner. samuel m. janney, the life of william penn; with selections from his correspondence and auto-biography (philadelphia: hogan perkins & co, 1852), 300, print.

figure 4. we ran the ocr process on the above two images. the top image was produced by our test dslr camera and the bottom image was produced by our overhead scanner. samuel m.
janney, the life of william penn; with selections from his correspondence and auto-biography (philadelphia: hogan perkins & co, 1852), 300, print.

generated from the image by camera:
" on one or two points of high importance, he had notions more correct than were, in his day, common, even among men of e1~larged minds, and he had the rare good fortune of being able to carry his theories into practice without any compromise." yet, "he was not a man of stron sense."

generated from the image by scanner:
" on one or two points of high importance, he bad notions more correct than were, in his day, common, even arnong men of e1~larged minds, and he had the rare good fortune of being able to carry his theories into practice without any compromise." yet, "he was not a man of strong sense."

table 2. ocr results comparison

these test results are very close because of the forgiveness of the adobe acrobat software. however, we have seen that for some other pages, a better-focused image generates improved ocr results.

photograph

a 6.5 inches by 4.5 inches silver print was used for this test. our tests show that the test dslr camera produced a sharper image of this historic photograph.

figure 5. tested 6.5 inches by 4.5 inches photograph. the red square indicates the enlarged area for figure 6. historical photograph from colorado state university archives and special collections.

figure 6. screen view at 100 percent zoom of a silver print. the top image was produced by the test dslr camera and the bottom one was produced by our overhead scanner. historical photograph from colorado state university archives and special collections.

oversize materials

for oversized materials, overhead scanners and dslr cameras have their drawbacks, so we do not think either option is ideal for them.
our library uses a map scanner to scan oversize maps and posters. however, a map scanner is expensive and may not fit many libraries’ budgets. a map scanner also is not suitable for fragile maps or posters. our overhead scanner’s maximum scanning area is 24 inches by 17 inches, and the test map’s size is 25 inches by 26 inches. we had to scan the map in four sections and stitch them together using adobe photoshop. each section image has a file size of 313 mb. because of the large file sizes, the stitching process is extremely slow. also, stitching images is not recommended because lens distortion always creates some degree of mismatch between sections. a camera can capture any material size, but the detail in the photographed image diminishes as the material’s size increases. the photo of the entire map taken by our test dslr has a file size of 35.8 mb. the image produced by the camera has a lower resolution and less detail.

figure 7. oversized materials screen view at 100 percent zoom. the top image was photographed by the test dslr. the bottom image was scanned by our overhead scanner. historical map from colorado state university archives and special collections.

small prints

one big advantage of a dslr camera is that it can be set farther away to take pictures of oversized materials or very close to smaller objects to take close-up pictures. comparatively, the distance between the lens and the scanning platform on our overhead scanner is fixed, so no close-up images can be produced, and everything is reproduced at a scale of 1:1. for the following example, we used a 5.5 inches by 3.5 inches drawing as our test subject.

figure 8. a 5.5 inches by 3.5 inches fine drawing. a historical booklet from colorado state university archives and special collections.

figure 9. small prints screen view at 100 percent zoom.
the left image is produced by a dslr with a macro lens and the right image was scanned by our overhead scanner. a historical booklet from colorado state university archives and special collections.

the image produced by our overhead scanner has a resolution of 3,427 pixels by 2,103 pixels. the camera produces a 6,776 pixels by 4,240 pixels image. the higher pixel count allows users to see more details at the same zoom level. the image produced by the camera is not only sharper but also contains more details. it also is good for making enlarged prints for promotion materials. for smaller maps, a dslr camera also produces superior images. for the following sample, we tested a 15 inches by 9.5 inches map.

figure 10. a 15 inches by 9.5 inches map. the blue square indicates the enlarged area for figure 11. historical map from colorado state university archives and special collections.

figure 11. small map screen views at 100 percent zoom. the left image was photographed by a dslr camera with a macro lens and the right image was produced by our overhead scanner. historical map from colorado state university archives and special collections.

post-processing

use of a sharpening filter

our tests showed that a main drawback of our overhead scanner is that the images it produces are out-of-focus. some digitization guidelines recommend minor post-processing of delivered image files to improve image quality. one might argue that sharpening can be applied to fix our overhead scanner’s out-of-focus problem. technical guidelines for digitizing cultural heritage materials: creation of raster image master files recommends doing minor post-scan adjustment to optimize image quality and bring all images to a common rendition.1 this is good advice, but it is difficult to apply in real-world practice.
to get the best result, each image would need to be evaluated and have a sharpening filter applied separately, because when an improper sharpening setting is applied to an image, it often creates haloing artifacts and an unnatural look. applying a sharpening filter to each image individually would be extremely time-consuming. the haloing artifact is also called chromatic aberration (ca) effect. ca appears as unsightly color fringes near high contrast edges. chromatic aberrations are typically only visible when viewing the image on-screen at higher zoom levels or on large prints. the following example shows that the ca may not appear at lower zoom levels, such as 50 percent or 100 percent. the left image has no sharpening filter applied and the right image has a sharpening filter applied. at 100 percent zoom, chromatic aberration is barely identifiable, and the right image appears to be superior in terms of sharpness.

figure 12. sharpening filter comparison sample at 100 percent zoom. the left image has no sharpening filter applied and the right image has had a sharpening filter applied. historical map from colorado state university archives and special collections.

at a higher zoom level, we see ca, visible in the right image of figure 13. the extra colors are introduced by the software.

figure 13. comparison of sharpening filter applied to images and at 500 percent zoom. the left image has no sharpening filter applied and the right image has a sharpening filter applied. historical map from colorado state university archives and special collections.

we recommend not applying sharpening filters to original scanned images; instead, attempt to obtain well-focused images from the beginning. for this reason, the test dslr camera outperformed our overhead scanner for most materials.
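the haloing described above can be illustrated with a minimal sketch. it assumes a simple unsharp mask (the paper does not name the filter that was used) and applies it to a single row of pixel values crossing a dark-to-light edge; the function names and the sample values are hypothetical. the point is only that a strong setting pushes values past the original dark and light levels on either side of the edge, which is what appears as a halo.

```python
def box_blur(signal, radius=1):
    """Blur a 1-D row of pixel values with a moving average.

    Border pixels are handled by repeating the nearest edge value.
    """
    n = len(signal)
    blurred = []
    for i in range(n):
        window = [signal[min(max(j, 0), n - 1)]
                  for j in range(i - radius, i + radius + 1)]
        blurred.append(sum(window) / len(window))
    return blurred


def unsharp_mask(signal, amount=1.0, radius=1):
    """Unsharp mask: sharpened = original + amount * (original - blurred)."""
    blurred = box_blur(signal, radius)
    return [s + amount * (s - b) for s, b in zip(signal, blurred)]


# Hypothetical row of pixels: dark ink (50) next to lighter paper (200).
edge = [50, 50, 50, 50, 200, 200, 200, 200]

# An aggressive setting (amount is an illustrative value, not a recommendation).
sharpened = unsharp_mask(edge, amount=1.5)

# Pixels next to the edge overshoot the original range; after clipping to
# 0-255 they become a darker-than-ink and a brighter-than-paper band.
clipped = [min(255, max(0, round(v))) for v in sharpened]
print(clipped)
```

in a color image the same overshoot happens in each channel, and where the channels differ near an edge the overshoot can show up as colored fringes. a lower amount reduces the overshoot, which is why a single setting rarely suits every image in a batch.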
color balance

have you seen a scanned color image or color photograph with colors very different from the original image? for example, a white area appears to be bluish, or it has an orange cast? when scanning or photographing an image under different lighting, the output image can have very different colors. in the following figure, the left image was shot at a correct white balance (wb) setting. wb is the process of removing unrealistic color casts so that objects that appear white in person are rendered white in your photo.2 the center image has a blue color cast, which was caused by a lower kelvin setting, and the right image was shot at a higher kelvin setting. a camera may create images with the wrong colors, but so will a scanner if it is not calibrated correctly.

figure 14. images shot under different white balance settings.

we pay an $8,000 annual service fee for overhead scanner maintenance, which includes scanner color calibration. in general, image colors rendered by the machine are close to the original colors but not exact. we have noticed that some images have a very light green cast and others are overly yellow; sometimes images appear darker than they should be. because we are not certified to calibrate the overhead scanner, we only use the settings prescribed by technicians. also, we have no control over a fading light bulb, which affects exposure.

wb adjustment on photographs taken in a studio can be very precise. most dslrs contain a variety of preset white balances. in general, auto wb works well, but does not deliver the best results. custom wb allows fine-tuning of colors. if a shooting studio is set up properly, the lighting should be consistent, so ideally the one setting found most desirable can be used repeatedly. however, professional photographers do test shots at the beginning of each shooting session.
once they find the optimal test shot, they will use the exact settings for the batch. later, they will do minor color adjustment on the chosen test shot to ensure precise color representation, and then apply the adjustment settings to all other photos of the same batch. because many small variations can be present in each shooting session, they do not reuse the settings from the previous session. it may seem arduous to do test shots for each session, but it ensures accurate color reproduction.

many professional photographers use colorchecker passport,3 a commercial product that helps with quick and easy capture of accurate colors. i will briefly demonstrate a useful trick i learned from a professional photography seminar: how to use colorchecker passport to apply the correct white balance to a group of images.4

step 1: place an 18 percent gray card or a colorchecker passport card on top of a page. choose the correct exposure and take the photo. use the same exposure setting to take additional photos. for demonstration purposes, we deliberately used a very low and a very high kelvin setting for the sample images. the low kelvin setting created cool, blue tones and the high kelvin setting created a tone that was too warm. note that the test shot with the colorchecker board was not taken with exactly the correct white balance setting.

figure 15. sample images for white balance adjustment. rocky mountain collegian 3–4 (1893), 118, colorado state university archives and special collections.

step 2: in adobe lightroom, select the test target image and switch to “develop” mode. select the white balance tool, move the cursor over a gray area, and try to find a spot where the red, green, and blue (rgb) values are close to one another. a spot with equal rgb values is ideal. this simple click will set the test image’s white balance to an almost perfect setting.
figure 16. applying a white balance in adobe lightroom 4

step 3: synchronize other images’ settings with the target image. select the target image and all other images, click the sync button, and select the settings you would like to synchronize. make sure the wb button is checked.

figure 17. synchronize settings in adobe lightroom 4

figure 18. synchronized images with correct white balance. rocky mountain collegian 3–4 (1893), 118, colorado state university archives and special collections.

recently, i had the opportunity to visit the spencer museum of art’s digitization lab. they have a different workflow to ensure even more scientifically correct colors. if you are interested in their approach, you can contact their information technology manager or photographer.

color space

one very important thing to understand when you use a dslr camera is color space. many dslr cameras support adobe rgb and srgb. srgb reflects the characteristics of the average cathode ray tube (crt) display. this standard space is endorsed by many hardware and software manufacturers, and it is becoming the default color space for many scanners, low-end printers, and software applications. it is the ideal space for web work but not recommended for prepress work because of its limited color gamut. adobe rgb (1998) was designed to encompass most of the colors achievable on cmyk printers, but only by using rgb primary colors on a device such as your computer display.5 it is recommended to use this color space if you need to do print production work with a broad range of colors. many scanning vendors deliver images in adobe rgb color space. prophoto rgb contains all colors that are in adobe rgb, and adobe rgb contains nearly every color that is in srgb. prophoto rgb covers more colors than the human eye can see.
it can only be used for images in raw format and in 16-bit mode. common file formats that support 16-bit images are tiff and psd. most printers do not support 16-bit format. this color space normally is used by photographers who have a specific workflow and who print on specific high-end inkjet printers.

when converting from 16-bit to 8-bit, some images will have banding or posterization problems. banding is a digital imaging artifact. a picture with a banding problem shows horizontal or vertical lines.

figure 19. an example of colour banding, visible in the sky in this photograph.6

posterization of an image entails conversion of a continuous gradation of tone to several regions of fewer tones, with abrupt changes from one tone to another.7

figure 20. an example of posterization.8

while it is a good idea to capture images using adobe rgb to preserve a wide range of colors, you should convert images to srgb when delivering to unknown users and displaying on the web. currently, srgb is the only appropriate choice for images uploaded to the web, since most web browsers don’t support any color management. adobe rgb images that are uploaded to websites without conversion to srgb generally appear dark and muted.9 if they are printed on printers that do not support adobe rgb, the colors will be dull too.

setting up a budget studio

commercial approach

bookdrive pro is a commercially available digitization unit. it uses two digital cameras and built-in flash lights. it may be the optimal solution for your projects, but it also may not fit your library’s budget. the unit also is not suitable for oversized material such as large maps and posters. for more information about this product, please visit http://pro.atiz.com/.
sample budget studio setup

a digitization lab can have three rooms or areas: one for oversized materials, one for smaller prints or 3-d objects, and one for computers. the area for shooting oversized materials should have black walls and floor. you can either use one flash light to bounce light off the ceiling or use two flash lights to shine light directly onto the materials. for fragile materials, the first approach is more appropriate. the area for shooting smaller prints or 3-d objects should have a stable table and black or white background paper. for this room or area, black walls and floor are not required.

for shooting equipment, i will use the set chosen by the photographer from the university of kansas spencer museum of art as my example.

item name | sample item | purchasing url | price
dslr camera | nikon d810 | http://www.bhphotovideo.com/c/search?atclk=camera+model_nikon+d810&ci=6222&n=4288586280+3907353607 | $2,996.95
macro lens | nikon af micro-nikkor 60mm f/2.8d lens | http://www.bhphotovideo.com/c/product/66987-grey/nikon_1987_af_micro_nikkor_60mm_f_2_8d.html | $429.00
heavy duty mono stand | arkay 6jrcw mono stand jr with counter weight, 6' | http://www.bhphotovideo.com/c/product/2727-reg/arkay_605138_6jrcw_mono_stand_jr.html | $678.50
strobe | broncolor g2 pulso, 1600 watt/second focusing lamphead with 16' cord | http://www.bhphotovideo.com/c/product/259745-reg/broncolor_32_115_07_g2_pulso_with_16.html | $3,053.68
power pack | broncolor senso a4 2,400w/s power pack | http://www.bhphotovideo.com/c/product/745060-reg/broncolor_31_051_07_senso_a4_2_400w_s_power.html | $3,629.92
reflector | broncolor p65 reflector, 65 degrees, 11" diameter, for broncolor pulso 8, twin and hmi | http://www.bhphotovideo.com/c/product/7162-reg/broncolor_33_106_00_p65_reflector_65_degrees.html | $513.52
reflector | broncolor softlight reflector, 20" diameter, for broncolor primo, pulso 2/4 & hmi heads | http://www.bhphotovideo.com/c/product/7167-reg/broncolor_33_110_00_softlight_reflector_20_for.html | $501.76
light stand | impact air-cushioned light stand | http://www.bhphotovideo.com/c/product/253067-reg/impact_ls10ab_air_cushioned_light_stand.html | $44.99
light meter | sekonic l-308s flashmate, digital incident, reflected and flash light meter | http://www.bhphotovideo.com/c/product/368226-reg/sekonic_401_309_l_308s_flashmate_light_meter.html | $199.00
book cradle | book exhibition cradles | http://www.universityproducts.com/cart.php?m=product_list&c=1115&primary=1&parentid=1271&navtree[]=1115 | $30.00
background paper | savage seamless background paper (both white and black) | http://www.bhphotovideo.com/c/product/45468-reg/savage_1_12_107_x_12yds_background.html | $45.00 x 2 = $90.00
nonreflective glass | 1/4" optiwhite starphire purified tempered single lite clear glass | can be purchased at a local glass store | $75.00
white balancing accessory | x-rite original colorchecker card | http://www.bhphotovideo.com/c/product/465286-reg/x_rite_msccc_original_colorchecker_card.html | $69.00
software | adobe lightroom 5 | http://www.adobe.com/products/photoshop-lightroom.html | $150.00

table 3. list of items needed to prepare for a budget studio

the total cost for a “budget” shooting studio ranges from $10,000 to $15,000, and there is no annual maintenance expense.
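as a quick check on the total, the prices listed in table 3 can be summed directly. the sketch below keeps prices in whole cents to avoid floating-point rounding; the dictionary keys and the function names are illustrative only, and the background paper is counted as two rolls, as in the table.

```python
# Prices from table 3, in US cents.
TABLE_3_PRICES_CENTS = {
    "dslr camera": 299695,
    "macro lens": 42900,
    "heavy duty mono stand": 67850,
    "strobe": 305368,
    "power pack": 362992,
    "reflector (p65)": 51352,
    "reflector (softlight)": 50176,
    "light stand": 4499,
    "light meter": 19900,
    "book cradle": 3000,
    "background paper (2 rolls)": 9000,
    "nonreflective glass": 7500,
    "white balancing accessory": 6900,
    "software": 15000,
}


def studio_total_cents(prices):
    """Return the sum of all item prices, in cents."""
    return sum(prices.values())


def format_usd(cents):
    """Format an integer number of cents as a dollar string."""
    return f"${cents // 100:,}.{cents % 100:02d}"


print(format_usd(studio_total_cents(TABLE_3_PRICES_CENTS)))  # $12,461.32
```

with the listed prices the sum comes to $12,461.32, which falls inside the $10,000 to $15,000 range quoted above; substituting a different camera, lighting kit, or number of light stands moves the total within that range.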
figure 21. the university of kansas spencer museum of art digitization lab setup for oversized materials

figure 22. steelworks museum of industry and culture’s digitization lab setup for oversized materials

figure 23. the university of kansas spencer museum of art digitization lab setup for smaller prints and 3-d objects

figure 24. steelworks center of the west’s digitization lab setup for 3-d objects

functions of some elements in the sample shooting studio

1. macro lens: it allows close-up shooting of objects. it is especially useful when photographing small prints and small 3-d objects. it can also be used to photograph regular and oversized materials.
2. heavy-duty mono stand: it replaces a traditional tripod. it is very stable and allows quick adjustment of camera height and location.
3. strobe, power pack, and reflector: together they generate consistent and homogeneous light distribution. recommended further reading: “introduction to off-camera flash: three main choices in strobe lighting.”10
4. light stand: it holds the strobe and reflector.
5. light meter: a hand-held exposure meter measures light falling onto a light-sensitive cell and converts it into a reading that enables the correct shutter speed and/or lens aperture settings to be made.11
6. book cradles: they help to minimize the stress on bookbindings and minimize page curvature problems.
7. nonreflective glass: it helps to flatten a photographed page and reduce reflection. however, it does not completely eliminate glass reflection. one very useful trick to reduce glass reflection is to place a black board with a hole above a page and shoot through the hole. this approach does not actually eliminate reflection but reflects black onto the photograph. when the photograph is reviewed on a computer, it will appear as if no reflection has occurred.

figure 25. the university of kansas spencer museum of art digitization lab setup for materials that need to be pressed down by glass.

many librarians believe that digitizing print materials using a digital camera requires a professional photographer, but this is not necessarily true. a professional photographer or even an art student can act as a consultant to help set up a shooting studio and provide basic training. also, many museums have professional photographers and have set up shooting studios for digitization. they are very willing to share their experience and even provide training. i believe the learning curve for operating a shooting studio is no greater than the learning curve for operating an overhead scanner and its software.

pros and cons

no digitization equipment or system is perfect. they all have trade-offs in image quality, speed, convenience of use, quality of accompanying software, and cost. our tests show that for most archival materials a dslr camera will do a better job than an overhead scanner.
pros of overhead scanner

• the scanner is a complete scanning station. it can be connected to a computer and start scanning immediately. materials can be placed on the scanning surface, so no equipment adjustments are required while scanning.
• it can scan and save images in bitmap format directly, while a dslr camera can only shoot in grayscale or color.
• built-in book cradles help to scan thick books and those that cannot be fully opened.
• book curve correction functionality is provided by the accompanying software.

cons of overhead scanner

• high cost. the overhead scanner we have cost more than $50,000, with an annual maintenance contract of $8,000.
• high replacement cost. when a scanner is outdated or broken, the entire machine has to be replaced.
• instability. our overhead scanner is unstable even when placed on a sturdy table and handled only by professionals. from april 2010 to october 2010, the scanner was down for a total of forty-two working days (sixty calendar days). the company fixed the machine onsite many times, but it continues to have minor problems and has not been completely reliable.
• the autofocus feature does not work consistently.
• special training is needed to operate the machine and associated software.
• file formats supported are limited. most scanners only support tiff, jpeg, jpeg 2000, windows bmp, and png.
• unsupported outdated software. our overhead scanner’s software can only be run on an older operating system (windows xp) because there is no updated software for this model.

pros of budget studio

• stable. under normal use dslr cameras are much less likely to break down than scanners. for example, i have had an older dslr, a nikon d200, for seven years. it has survived numerous backpacking trips, multiple drops, and extreme weather conditions. the camera still functions as needed.
• fast and accurate focus. dslr cameras are designed to focus quickly, and their focus indicators provide instant feedback to the operators so they know that the image is focused. if operated properly, dslr cameras can deliver sharper images than scanners.
• less expensive. a good-quality dslr camera and a lens can be purchased for less than $4,000 and last for years. as technologies advance, dslr cameras’ prices will continue to drop.
• ability to save files in more formats. in addition to tiff and jpeg formats, most dslr cameras can save photos in raw file format. some cameras can directly save images in digital negative (dng) format, and others deliver images in proprietary formats that can be converted into dng using a computer program. editing raw images is nondestructive, while editing of tiff and jpeg images is irreversible.
• accurate wb and exposure. by using the right shooting and post-processing techniques, photographs can have exact color reproduction. on the other hand, calibrating an overhead scanner most likely can only be performed by the company’s trained technician. proper exposure and wb are not guaranteed.
• the raw file format usually provides more dynamic range. overexposed and underexposed images can be fixed by adjusting exposure compensation via software; thus lost shadow or highlight detail can be restored.
• can photograph 3-d objects. archival collections often have materials other than books, such as art pieces. these materials are better photographed than scanned.
• versatile. cameras can perform on-site digitization, while overhead scanners are too bulky to be moved around.
• faster and better preview. images can be viewed instantly on a computer when proper software, such as adobe lightroom, is used. operators can compare multiple shots on a screen side-by-side and decide which photo to retain.
• more accessible technical support. the number of dslr camera users is much higher than the number of overhead scanner users. technical questions can often be answered through online forums.
• easy to find replacement parts. when a piece of equipment in a shooting studio breaks down, it is easy to find a replacement, and staff can install it themselves.
• easy software updates. software used in a studio is independent from equipment.

cons of budget studio

• there is a learning curve for setting up a shooting studio, operating the studio, and mastering new image processing techniques.
• a dslr camera with a low pixel count will not be sufficient for digitizing large-format materials, such as posters and maps.
• no built-in book curve correction is provided by adobe photoshop or lightroom. however, our experience shows that automatic book curve correction does not always work well. we normally use a home-made book cradle to help lay a page flat and use one or two weights to hold down the other side of the book. for some books, if flatness is hard to achieve, we place a piece of glass on top to keep the page flat.
• security concern. since a dslr camera is highly portable, it can be stolen easily.

figure 26. scanning setup using a book cradle.

conclusion

the technology of dslr cameras has advanced very quickly in the past ten years. newer dslr cameras can handle higher resolutions and have very little image noise even at a high iso setting. the higher demand for dslr cameras and accompanying image-editing software results in more rapid technology advances compared to low-demand, high-end overhead scanners. high consumer demand drives dslr camera prices much lower than prices for overhead scanners. in addition, the wide range of consumers purchasing dslr cameras and software prompts companies to offer more user-friendly interfaces. as you can see from our tests, for most library materials a dslr camera can produce superior images.
if you do not have a budget for high-end overhead scanners, you can still fulfill your digitization and preservation goals with a budget studio.

acknowledgement

i would like to thank robert hickerson and ryan waggoner, the university of kansas spencer museum of art, tim hawkins, and steelworks center of the west for showing me their digitization labs and sharing their experience with me.

references

1. federal agencies digitization guidelines initiative, “technical guidelines for digitizing cultural heritage material: creation of raster image master files,” august 2010, http://www.digitizationguidelines.gov/guidelines/digitize-technical.html.
2. “tutorials: white balance,” cambridge in colour, accessed march 9, 2016, http://www.cambridgeincolour.com/tutorials/white-balance.htm.
3. “colorchecker passport user manual,” x-rite incorporated, accessed march 9, 2016, http://www.xrite.com/documents/manuals/en/colorcheckerpassport_user_manual_en.pdf.
4. scott kelby, “scott kelby's editing essentials: how to develop your photos,” pearson education, peachpit, accessed march 9, 2016, http://www.peachpit.com/articles/article.aspx?p=2117243&seqnum=3.
5. “srgb vs. adobe rgb 1998,” cambridge in colour, accessed march 9, 2016, http://www.cambridgeincolour.com/tutorials/srgb-adobergb1998.htm.
6. “colour banding,” wikipedia, accessed march 9, 2016, http://en.wikipedia.org/wiki/colour_banding.
7. “posterization,” wikipedia, accessed march 9, 2016, http://en.wikipedia.org/wiki/posterization.
8. “image posterization,” cambridge in colour, accessed march 9, 2016, http://www.cambridgeincolour.com/tutorials/posterization.htm.
9. richard anderson and peter krogh, “color space and color profiles,” american society of media photographers, accessed march 9, 2016, http://dpbestflow.org/color/color-space-and-color-profiles.
10. tony roslund, “introduction to off-camera flash: three main choices in strobe lighting,” fstoppers (blog), accessed march 9, 2016, https://fstoppers.com/originals/introduction-camera-flash-three-main-choices-strobe-lighting-40364.
11. “introduction to light meters,” b & h foto & electronics corp., accessed march 9, 2016, http://www.bhphotovideo.com/find/product_resources/lightmeters1.jsp.

178 information technology and libraries | december 2006

leadership: what is it? ala president leslie burger has me thinking about it a lot these days. as i write, the lita board is in the process of determining who lita will sponsor in the ala emerging leaders program. the task is difficult. lita has many new librarians who have strong potential for leadership.
consequently i feel assured that lita has a strong future because what is an association, if not its members? so one of the questions the board asked was what does it mean to be an emerging leader? when has one emerged? personally, i feel that i am still emerging because there is always more to learn. lifelong learning, isn’t that what librarians are all about? in preparation for my presidency, i attended an american society for association executives seminar facilitated by tecker consultants. they defined four types of influential leadership: servant, visionary, expert, and catalytic. i see all four types of influential leaders within lita and they are all important. the servant leader provides service to others. in a volunteer organization like lita, a lot of servant leadership is being exhibited. these are the people who keep the organization humming, making sure we have the programs and education opportunities that make lita relevant to its members. the most obvious place we see visionary leaders in lita is at our top technology trends; however, it is not the only place where visionary thinking occurs. lita members are often cutting-edge, applying new technologies to solve problems or to provide better solutions and services. visionary leadership is where one sees what the future could look like. lita programs are filled with expert leaders who share their technical expertise and lead the profession in applying those technologies. however, we also have many expert leaders who have important insights into what the association can be. the catalytic leader brings people together and leverages their capabilities. the lita board works with other lita leadership to ensure that our goals are reached and to bring together all of the lita offerings to make membership a comprehensive professional benefit.
my challenge as the current lita president with the ala emerging leaders program is to ensure that our sponsored member has a meaningful opportunity to become a superb leader both within lita and within the profession. in addition to attending the leadership training workshops for all of the emerging leaders, each sponsored person will be appointed to some service role within ala or one of its units. the lita board has elected to have our sponsored emerging leader work closely with the officers, in particular our vice president/president-elect mark beatty, on strategic planning for the next two years. i am hopeful that we will learn a great deal from our emerging leader regarding what new members are seeking out of the organization. when i think about a good leader, i think about someone who listens, who allows others to think creatively and to take risks, who inspires, who sees the big picture, who can make decisions and make others understand the reasons for a decision, and who communicates well. john buchan put it this way: “the task of leadership is not to put greatness into people, but to elicit it, for the greatness is there already.” my goal this year, in conjunction with, but not limited to, the ala emerging leaders program, is to grow our new members into future lita leaders. i have been rewarded in all of my work within lita to witness rising stars take on exciting roles and projects. i hope everyone reaps the joys of mentoring new professionals at some point in their careers. in my own leadership role, i take seriously the need to implement lita’s strategic plan. in that vein, the board has created an assessment and research task force that will make recommendations on gathering assessment data and feedback from members. with the appropriate knowledge base, we can ensure that value is being received. 
the board has also created a working group consisting of the chairs of the education committee, the regional institutes committee, and the program planning committee to make recommendations on our education programs. i have been working with that group to identify new modes of delivering our programs and to ensure that they maintain their relevancy to lita members. lita continues to implement new communication technologies to reach out to its members. the lita blog has now been up for over a year and the new lita wiki is available for use by interest groups and others to allow experts to collaborate in the building of topic-specific resources. sir john harvey-jones framed the question thusly: “how do you know you have won? when the energy is coming the other way and when your people are visibly growing individually and as a group.” i see this happening in lita. what an energizing and fulfilling sight it is!

president’s column bonnie postlethwaite

bonnie postlethwaite (postlethwaiteb@umkc.edu) is lita president 2006/2007 and associate dean of libraries, university of missouri–kansas city.

reports and working papers

cable library survey results

public service satellite consortium: washington, d.c.

the following paper was distributed to pssc members in may 1981, and is reproduced here to bring it to the attention of a wider audience.

background

the public service satellite consortium (pssc) conducted a survey of academic libraries in july 1980 to study their data communications needs and services. results of that study, coupled with library interest generated by that study, convinced pssc that: (1) libraries have a wide variety of communications needs which could be addressed with appropriate uses of telecommunications; (2) all types of libraries are affected, not just academic libraries; and (3) data transfer was but one of many types of library services in need of better communications. 
this information motivated pssc to take a broader look at library communications. that second look resulted in the identification of the "cable library" (catvlib) phenomenon and video library services. in december 1980, pssc launched a second survey directed to cable libraries; that is, libraries of all types which are connected to local cable companies. this study was aimed at determining to what extent, if any, a national satellite cable library network might be already in technical existence. how many libraries are presently connected to cooperative cable companies with satellite hardware and excess satellite receiver capacity? and of that number, how many cable libraries would be interested in participating in satellite-assisted library services and video-teleconferences? to answer these questions, pssc mailed questionnaires to 101 libraries that had been identified as potential cable libraries. in order to allow the participation of unidentified cable libraries, pssc also advertised the survey in various library periodicals, including american libraries, cable-libraries, and jola. that ad resulted in an additional 97 cable libraries requesting to participate in the survey, raising the total number of libraries receiving the questionnaire to 198. as of april 1981, 86 libraries have responded, yielding a 43% return. follow-up phone calls have indicated that more surveys are forthcoming, or that the questionnaire proved to be irrelevant to present library conditions. in some cases, copies of the survey were requested and distributed for informational purposes only.

the survey instrument

the questionnaire incorporated explanations of terminology and was eight pages long. additional enclosures furnished more specific information about pssc and video-teleconferencing. the respondent was not only questioned about his/her library facilities, but also was asked to interview the cable company for necessary technical information. 
though contributing to slower returns, this two-tiered approach did succeed in establishing contact between the library and the cable company, as well as providing all the data required to profile each library as a potential network participant.

survey participants

since a national network is being pursued, an attempt was made to reach as many of the states as possible. thirty-seven states received copies of the survey, while thirty-one had at least one responding library. all types of libraries were surveyed. those surveyed included elementary school libraries, high school libraries, vocational school libraries, academic libraries, public libraries, regional library networks, state libraries, library systems, special libraries, and libraries that also double as their local community access center for cable television. of the 86 who responded, 63 were public, 18 were academic, 4 were school, and one was a special library. responding libraries have been categorized according to their ability to be an active member of the network:

uf (usable facility): those libraries that have met all the technical requirements for network participation. the library must be currently connected to an operational cable system which has a satellite receiving station and excess receiver capacity. in addition, the cable system and the library must have indicated an interest in participating in and hosting occasional satellite-transmitted events.

nxc (no excess ro capacity): libraries that meet all technical cable connectivity requirements, but whose cable system cannot presently accommodate any more activity on its satellite receiver(s), are grouped here. should time become available in the future, these libraries are then technically able to advance to the usable facility group.

nro (no catv ro): here are placed those libraries that are connected to an operational cable system. however, the cable system has no satellite receiving station and, therefore, no satellite access. 
in order to become a usable facility, these cable systems must install a satellite receiving station and be able to offer excess receiver capacity.

ncc (no catv connection): while a cable system with all the satellite hardware requirements may be operating in the library's area, these libraries are not connected to the cable system. reasons given in the survey are varied, including logistics, economics, and lack of interest. depending upon the technical status of the cable system, a simple link may be all that is needed for the library to become a usable facility.

nca (no catv in area): libraries in this group are located in areas that presently have no operational cable system. some areas are now in the franchising process, some have awarded franchises but are not operational, and others have no idea if and when cable service will come to their areas. libraries here have the advantage of knowing what requirements are necessary for network participation and can use this information when franchising negotiations begin.

ni (no interest): here are grouped those libraries that are at various stages of technical capability, but have no desire to participate in a national satellite cable library network.

table 1 illustrates responses according to geographical location. (numbers refer to the quantity of libraries from each state that fit into the above-defined categories.) exactly half of these respondents are usable facilities. the largest hindrance to network participation is lack of connectivity between the library and the cable system.

library/cable connectivity

part one of this survey established the degree of connectivity between libraries and their local cable companies. pssc's major concern was to find libraries wired to at least receive cable programming. pssc also discovered that the highest percentage of libraries had two-way connection, usually for the purpose of cablecasting. 
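the six categories defined above amount to a short sequence of yes/no checks. the sketch below is one possible reading of those definitions, not pssc's actual procedure: the record type, its field names, and the order in which the checks are applied are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class LibraryProfile:
    """Hypothetical record of one survey response (field names are assumptions)."""
    interested: bool              # library wants to take part in the network
    catv_in_area: bool            # an operational cable system serves the area
    connected: bool               # library is wired to that cable system
    has_satellite_receiver: bool  # cable system has a satellite receiving station
    excess_capacity: bool         # receiver has spare time for occasional events


def classify(p: LibraryProfile) -> str:
    """Return one of the six category codes; the check order is an assumption."""
    if not p.interested:
        return "ni"   # no interest, whatever the technical status
    if not p.catv_in_area:
        return "nca"  # no operational cable system in the area
    if not p.connected:
        return "ncc"  # cable exists, but the library is not linked to it
    if not p.has_satellite_receiver:
        return "nro"  # connected, but the cable system has no satellite access
    if not p.excess_capacity:
        return "nxc"  # receiver exists, but has no spare time
    return "uf"       # usable facility: every requirement is met


if __name__ == "__main__":
    print(classify(LibraryProfile(True, True, True, True, False)))
```

checking interest first reflects the statement that "no interest" libraries sit at various stages of technical capability; a different ordering would only change how uninterested libraries are labeled.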
connectivity among the 86 respondents was broken down as follows (all percentages have been rounded off):

33 (39%) two-way interconnection (transmit and receive video)
29 (34%) one-way catv drop (receive only; regular subscriber)
14 (16%) no catv connection
9 (10%) no catv in my area or presently operational in my area
1 (1%) no answer to question

table 1. responses according to geographical location (journal of library automation vol. 14/4 december 1981, p. 306)
columns: state; total state respondents; uf; nxc; nro; ncc; nca; ni
states listed: alabama, alaska, arizona, california, colorado, connecticut, florida, georgia, hawaii, idaho, illinois, indiana, iowa, kansas, kentucky, maryland, massachusetts, michigan, minnesota, missouri, nevada, new jersey, new york, north carolina, north dakota, ohio, oklahoma, oregon, pennsylvania, tennessee, texas, utah, vermont, virginia, washington, wisconsin, wyoming
totals: respondents 86; uf 43; nxc 7; nro 9; ncc 14; nca 8; ni 5
(six states are marked "0, no response"; the remaining per-state counts are not recoverable from this copy.)

other questions in this section profiled the technical capabilities of the cable system. specific hours of each day of the week a satellite receiver was available for occasional use were charted. weekday mornings proved to be the most available time block. it is also imperative for pssc to know what transponders (channels) of the satellite cable systems can access. there are twenty-four transponders on satcom i, the main satellite used by cable. when pssc coordinates a satellite telecast, time on a satellite transponder must be secured. each transponder is leased to someone, such as home box office (hbo), ted turner's cable news network, or the appalachian community service network (acsn), to name a few, for the carriage of their programming. time needed by pssc for a two-hour satellite event, for example, can be sublet from a transponder lessee, subject to availability. 
however, finding time slots on satcom i transponders is becoming increasingly difficult as many lessees are expanding the number of hours of their own programming. as a result, pssc must know which transponders each cable system can receive so that an attempt can be made, where possible, to accommodate the majority of survey facilities. the ideal situation is for catvs to own "frequency agile" satellite receivers; that is, receivers that can access any of the transponders. some receivers can get only even-numbered transponders or odd-numbered transponders; others can access only certain individual transponders. transponder accessibility is usually related to the type of programming the cable operator offers or plans to offer to the local cable subscribers, or to the age of the system. (older systems often use twelve-channel receivers, tunable to only even- or odd-numbered transponders on satcom i.) for example, if a cable operator does not anticipate offering anything besides hbo now or in the future from satcom i, often he/she cannot justify the need for a frequency agile receiver. table 2 outlines transponder accessibility for usable facilities only. the abundance of frequency agile receivers (30 of the 43 usable facilities) will provide the connected libraries with a greater amount of flexibility in receiving programming since their participation will not be dependent upon a certain transponder. another question probed the availability of provisions for closed-circuit, discrete delivery of satellite transmissions from the cable system's receiver into the library. being able to provide closed-circuit capabilities would ensure the privacy of a satellite telecast. some pssc clients insist that their transmissions be safeguarded through closed-circuit delivery. as expected, closed-circuit arrangement does not exist between very many libraries and their catvs. unless part of an institutional cable loop, most libraries cannot presently be singled out for closed-circuit cable reception. 
under normal conditions, what is transmitted from the head end of the cable system travels to everyone subscribing to the cable service. eleven of the forty-three usable facilities claimed closed-circuit capabilities are currently available. those thirty-two without described what technical considerations must be present before such provisions could be offered. these technical requirements included scrambling devices, mid-band channel usage, modulators and demodulators. such upgrading of the cable company's hardware was quoted as costing from hundreds to several thousands of dollars. no catv indicated willingness to assume the expenses for such special capabilities, but a few did offer to investigate the possibility of temporary special links on a per-occasion basis.

table 2. number of usable facilities able to access each transponder

transponder #      # of facilities
1                  2
2                  2
3                  1
4                  1
5                  1
6                  4
7                  3
8                  3
9                  6
10                 3
11                 0
12                 2
13                 1
14                 3
15                 0
16                 3
17                 1
18                 1
19                 0
20                 2
21                 3
22                 4
23                 0
24                 5
frequency agile    30
not sure           4

note: these figures are for transponder accessibility on satcom i. numbers for the specific transponders were tabulated from those surveys that indicated their satellite receivers were not frequency agile, but rather could access only those transponders they had listed.

library facilities

the survey also asked about the library's facilities. information in part two centered on library accommodations and equipment. answers here provided a description of each library, which gave pssc an idea of how adaptable to hosting satellite teleconferences each might be. a basic satellite program viewing facility consists of the viewing area, equipped with chairs and tables, at least one television monitor (wired to receive the cable programming), and, for interactive programs, a telephone. survey libraries reported they had conference rooms, auditoriums, and classrooms available for viewing satellite telecasts. 
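the table 2 figures can be used to estimate how many usable facilities a telecast on a given satcom i transponder could reach. the sketch below assumes, as the text states, that a frequency agile receiver can access any transponder; it leaves out the four "not sure" facilities, and the function names are illustrative only.

```python
# Counts from table 2: usable facilities with non-agile receivers that
# listed each satcom i transponder (1-24) as accessible.
FIXED_ACCESS = {
    1: 2, 2: 2, 3: 1, 4: 1, 5: 1, 6: 4, 7: 3, 8: 3,
    9: 6, 10: 3, 11: 0, 12: 2, 13: 1, 14: 3, 15: 0, 16: 3,
    17: 1, 18: 1, 19: 0, 20: 2, 21: 3, 22: 4, 23: 0, 24: 5,
}
FREQUENCY_AGILE = 30  # facilities whose receivers can tune any transponder


def reachable(transponder: int) -> int:
    """Usable facilities able to receive a telecast on this transponder."""
    return FREQUENCY_AGILE + FIXED_ACCESS[transponder]


def best_transponders() -> list[int]:
    """Transponder number(s) that reach the largest number of facilities."""
    top = max(FIXED_ACCESS.values())
    return [t for t, n in FIXED_ACCESS.items() if n == top]


if __name__ == "__main__":
    for t in sorted(FIXED_ACCESS):
        print(f"transponder {t:2d}: {reachable(t)} facilities")
    print("widest reach:", best_transponders())
```

because the frequency agile group dominates every total, the choice of transponder changes the reach by at most six facilities, which is consistent with the flexibility the text attributes to those receivers.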
the number of viewers able to be accommodated at one time ranged from 6 to 400, with the average facility holding 75 people. some libraries could provide simultaneous viewing in more than one room, which increased the total number of people they could accommodate for a single event. a majority of the libraries had more than one monitor; some as many as fifteen monitors. three libraries indicated they owned a large-screen television projector. forty-four percent of the usable facilities have no phones in the viewing rooms, but many explained that phones were either nearby or could be temporarily installed for an interactive event. in response to a question about the location and accessibility of the library within its community, the general comments described the majority of the libraries as being in a convenient part of town, with ample parking and barrier-free design. when given enough advance notice, most libraries were willing to schedule an event at any time, even during hours and on days the library was normally closed to the public. traditionally, as a part of its standard networking service, pssc rents viewing facilities for the client, whether they are public television stations, hotels, or other facilities. libraries, as another type of viewing resource, would be entitled to receive payment for use of their facilities. obviously, this fact treads on controversial "fee or free" waters. being aware of this, pssc asked the libraries whether they could accept money for these purposes; and, if not, whether they might have some other mechanism, such as a "friends of the library" group, to which the money could be given instead. those libraries that said they could accept money directly for the use of their facilities numbered thirty-four. oddly enough, thirty-four libraries also said they could not accept money directly for the use of their facilities. 
of that group, thirty-one indicated they did have a "friends of the library" or similar group to which money could be given for indirect channeling back into the library. eighteen libraries did not answer this question (many due to libraries not completing the entire survey once they felt the cable information made them technically ineligible for participation). only three libraries might have a problem with financial arrangements for an event.

program interests

the final section of the survey (part three) gave each respondent the opportunity to list topics of interest to the library and community that could be presented via a satellite video-teleconference. general comments identified continuing education, organizational conferences, training, seminars, workshops, media distribution, and information dissemination as major activities suitable for satellite-assisted delivery and distribution. special target audiences included the following:

1. senior citizens
2. handicapped
3. minorities
4. the disadvantaged (economically, educationally, socially)
5. the abused (drug addicts and alcoholics; abused children and spouses, teachers and students; victims of crime; and the sexually harassed)
6. the institutionalized (in hospitals, prisons, nursing homes, mental health centers, hospices)

these special patrons are often served through outreach programs and were named here as potential beneficiaries of satellite programming. the most frequently named special population was the elderly, with suggestions for retirement, social services, nursing-home care, insurance, and other senior-oriented programming. three major classes of other potential users of satellite video-teleconferencing in the library were identified: 1. 
education-oriented: preschool and nursery students; elementary, middle, junior high, and high school students; postsecondary and graduate students; vocational, technical, extension, and cooperative education students; special education students; adult and continuing education students; educational administrators, faculties, and staff
2. government-oriented: federal, regional, state, county, and local government officials and employees
3. employment-oriented: professional/nonprofessional; salaried/hourly; union/nonunion; management/staff; public/private sectors; employed/unemployed; full/part-time; permanent/temporary; big/small business; human services/trade

particular topics of interest felt to be ideal satellite program areas within each library's community included the following (appearing in no rank order): energy (solar and natural resources); consumerism; community services; environment; historic preservation/oral history; legal aid; librarianship; computers, data processing technology; communications/telecommunications; fund raising; safety; recreation, physical education, sports, parks; language (bilingual, sign, foreign, literacy); economics and finance (investment, banking, inflation, budgeting); conservation; genealogy; religion; business and industry; civil defense; agriculture and forestry; health and medicine; mental health; arts and humanities; curriculum sharing; therapy and rehabilitation; real estate.

several local associations, which have affiliates or branches located nationally, were listed as potential users of satellite video-teleconferencing (in order of popularity):

1. american association of retired persons
2. league of women voters
3. historical societies
4. american library association
5. chamber of commerce
6. american association of university women
7. parent/teacher associations
8. councils of government
9. jaycees
10. boy scouts
11. 
friends of the library

three questions concerning interest and ability to participate in future satellite video-teleconferencing activities were asked. the questions, vital to the outcome of this survey, are reiterated here with their respective answers:

1. would you be interested in helping set up one or more of these specialized teleconferences?
yes 63 (73%); no 10 (12%); maybe 5 (6%); no answer 8 (9%)
2. would you be interested in doing a local follow-up program after a national teleconference that is of interest to your community?
yes 65 (76%); no 6 (7%); maybe 8 (9%); no answer 7 (8%)
3. periodically, nationally based organizations sponsoring teleconferences or special programs enlist promotional and site arrangement support from local site facilitators. would you like to be listed as available to provide this support?
yes 54 (63%); no 18 (21%); maybe 3 (3%); no answer 11 (13%)

the interest of the libraries surveyed is well documented in questions one and two. however, their ability to presently participate is limited by financial and personnel resources, as demonstrated by question three's responses.

general conclusions and recommendations

the majority of surveyed libraries recognize the need for libraries to expand their community service roles through some use of telecommunications. many of the 86 libraries indicated the concept of libraries becoming satellite program viewing facilities through their cable connectivity was an idea so new to them that they could not fully understand or visualize what would be expected of the library in this novel role. yet the general consensus was that if joining with their cable systems to provide satellite program receiving locations was a method of improving community library services, while not making demands on the library's budget, then the concept was worth exploring individually on an operational basis. 
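the percentages reported for the three participation questions above can be rechecked from the raw counts and the 86 respondents. the sketch below is only an arithmetic check; the dictionary keys and the function name are illustrative, and the counts are copied from the answers as reported.

```python
TOTAL_RESPONDENTS = 86

# Raw answer counts for the three participation questions, as reported.
ANSWERS = {
    "q1": {"yes": 63, "no": 10, "maybe": 5, "no answer": 8},
    "q2": {"yes": 65, "no": 6, "maybe": 8, "no answer": 7},
    "q3": {"yes": 54, "no": 18, "maybe": 3, "no answer": 11},
}


def percentages(counts: dict[str, int], total: int = TOTAL_RESPONDENTS) -> dict[str, int]:
    """Convert raw counts to whole-number percentages of all respondents."""
    return {answer: round(100 * n / total) for answer, n in counts.items()}


if __name__ == "__main__":
    for question, counts in ANSWERS.items():
        # every question should account for all 86 respondents
        assert sum(counts.values()) == TOTAL_RESPONDENTS
        print(question, percentages(counts))
```

each set of counts sums to 86, and the rounded values match the percentages printed with the questions.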
to illustrate this concept of the catvlib as a satellite program viewing facility, a typical scenario would find participating catvlibs contacted by an organization or networking agent who wishes to reach the general community or a special segment with its satellite-transmitted programming. the catvlib, as the community contact, would have the option to respond negatively or positively. if the catvlib is interested, it must begin performing local coordination duties, most important of which is garnering the agreement of its cable system. catvlib and cable system discussions will determine five things:

1. can the cable system access the satellite transponder on which the programming will be carried?
2. will the cable system have a satellite receiver available on the date and time of the program?
3. will the catvlib have its viewing facility available on the date and time of the program?
4. if desired by the program's sponsor, will the catvlib contact the local group who is to participate in the program and work with them prior to the satellite telecast to the extent needed by the requesting organization?
5. can the cable system and/or the catvlib handle special program considerations, if any? for example:
- provide closed-circuit capability in the catvlib?
- tape the program?
- provide telephone(s) for interactive programs?
- provide local site facilitation?
- coordinate local follow-up activities?
- provide refreshments?
- coordinate advance publicity within the community?

once the catvlib has determined whether or not it is able and desires to offer its services, the catvlib would be recorded as a satellite program "receive site." the catvlib will then assume the degree of local responsibility requested and contracted by the requesting organization, including all negotiations necessary with the cable system. while there were survey indications of general support for such a national satellite cable library network, what are the pros and cons of its operation? 
pros

pre-existing conditions. catvlibs need no investment for hardware, but merely take advantage of pre-existing cable connectivity.
community service. such catvlib participation potentially offers service to every member of the community.
outreach to new patrons. those community residents not previously using the library may find this new service applicable to their needs.
economics. catvlibs could recoup any charges incurred through this service, as well as expect payment as a rented receive site.
program interaction. live satellite programming has the advantage over taped programming of allowing the option of offering viewers the opportunity to interact with the program's presenter(s).
resource-sharing potential. this service has the future potential of providing catvlibs with an alternative method of accessing new information resources and data bases. human resources can be shared now through this service.
potential catv expansion. more catvs are expanding and upgrading their satellite access capabilities as usage of satellites by cable programming vendors increases. some catvs have already purchased westar iii hardware in addition to their satcom i hardware.
future implications. if satellite-related services become valued by the community, the residents might decide the catvlib should have its own satellite hardware so that the community could take advantage of more programming available directly from satellite.

cons

lack of satcom i occasional time. it is becoming increasingly difficult to sublease transponder time on this satellite for occasional satellite programs.
dependency. the catvlib must depend entirely on the cable system to be able to be a network participant and offer this service. catvlib participation is dependent upon the cable system's satellite access capabilities, which generally means satcom i only.
lack of cctv. 
generally, most catvlibs cannot offer closed-circuit capability, so absolute privacy cannot be guaranteed to the program's sponsor.
catvlib policies. some catvlibs will have to make decisions about various controversial items, such as:
- accepting money for use of facilities.
- allowing some clients the right to limit viewing to only registrants.
- hosting controversial groups.
range of catvlib capabilities. the survey demonstrated that catvlibs cannot all offer the same degree of service due to the wide range of technical capabilities. at present, each satellite event would have to be judged individually to determine which catvlibs were equipped to participate.

a glance at the pros and cons of marrying libraries and satellite communications through cable connectivity suggests a national satellite catvlib network is a presently available and usable resource with potential for future expanded capabilities and unlimited programming uses. the obstacles imposed by the cons, however, are cause for a serious and objective look at the present and future viability of such a network. popular present uses of satellite video-teleconferencing are for telecasting continuing education and organizational conference interactive programming to special audiences. some pssc clients will often request to:
- charge their special audience for participating (course or conference fees, for example).
- have the satellite-transmitted event closed-circuit telecasted to the receiving locations only.
- reach specific geographical locations (often large urban areas, such as new york or los angeles).

charging special audiences for closed-circuit satellite event

the first two client requests are often related. if the client intends to charge the registrant-viewer a fee, he/she often expects the program to be viewed only at designated receive sites that are hosting the paying participants. 
(why should a viewer pay if he/she could watch the same program at home on a cable channel for free?) obviously, those clients interested in a "box office" approach to their event, that is, to make a profit rather than offer a service, are not suited for catvlib network use. however, how can the catvlibs accommodate those public service groups which must recoup expenses in order to offer such satellite program services? client-designed incentives such as giving the phone number for viewer interaction in a program only to the catvlibs rather than displaying or announcing the number during the program; requiring participants to have special materials and/or integrating local pre- or post-event activities in the catvlibs with the program; even offering course credit to registrants only are manageable alternatives for those catvlibs that cannot terminate the program in their facilities only. some catvlibs may be able to negotiate with their catv for the provision of the necessary equipment to provide closed-circuit capabilities. however, this survey did not identify many catvs that were willing to cooperate with the libraries to that extent. for those catvlibs whose policies restrict their involvement with financial transactions, particularly money exchange among library patrons, advance registration fees paid directly to the client could enable the libraries to avoid being required by the client to "collect at the door." most libraries, however, by their very nature, cannot prohibit anyone from viewing a program within their facilities, thereby making it generally impossible for them to guarantee the client their requested selective audience.

size, location, and distribution of receive sites

video-teleconference users generally want to reach as many of their members or special populations as possible, yet they must pay to rent each receive site. 
economics influence their attempt to reach more people at fewer locations, not necessarily those most in need of the program. therefore, it is no surprise that popular receive sites are located in heavily populated cities. while cable television is finally coming to urban areas, present conditions find a lack of operational catvs available. the typical catvlib now is located in a smaller city or rural area. large states, such as california and texas, have little or no catvlib representation. only twenty-three states currently have a usable catvlib facility, which makes the network descriptor "national" not quite accurate. expanding the catvlib network to include more and larger cities and all states is a must to make it competitive with other satellite networks available to a client. but even if the network is able to expand, the previously mentioned inability of catvlibs to provide closed-circuit capabilities will lessen its desirability as a resource when that capability is offered by another satellite ground facility in the same city. one competitive alternative a catvlib can consider is rental cost. clients expect to pay a reasonable rate for the use of each facility. this rate differs among different types of satellite networks, and even within the same network. for example, renting a public television station is generally less expensive than booking a hotel. yet the rate for two public television stations can vary in the hundreds of dollars. if a catvlib chooses to offer its facilities for free, asking only for compensation for any expenses it might incur because of the satellite event, or charges a minimal amount, its facility becomes economically attractive. one factor the catvlibs must not overlook when contemplating such a decision is the cable system. will the cable system expect remuneration for its services, especially if the catvlib is receiving payment? 
libraries must remember they have entered into a cooperative arrangement with their catvs in order to become a satellite program viewing facility.

toward future independence

while a skeletal cable library network does technically exist, it is imperative that libraries work toward their own future independence before they can truly establish themselves as a viable satellite network. evolution of a catvlib network to a satellite library network might include the following two steps:

1. expanded catvlib network. the survey instrument should now evolve into an interview tool for profiling additional libraries to become part of this network. efforts should be made to encourage libraries within poorly represented states to join the network if technically feasible. expansion is urged for two main reasons:
- to allow libraries the opportunity to experience being a satellite program viewing facility without financial obligations.
- to allow community residents the opportunity to experience a library service with great potential for all local population segments.
once the library is regarded as the logical place for community communications, it will be much easier to begin a community drive toward supporting the outfitting of the library with the proper hardware necessary to function in that capacity. requirements for becoming part of the expanded catvlib network include:
- at least one-way connectivity between the library and the catv. (a typical subscription for basic service will suffice.)
- the catv must have a satellite receiving station.
- the catv must have excess capacity available on its satellite receiver.
- catv must be willing to cooperate with the library in providing satellite reception of occasional satellite telecasts.
- library must have at least one viewing room available to seat those viewing the satellite program.
- library must have at least one television monitor, wired to receive cable programming, available in the viewing room. 
-the library must be willing to assume the role of community contact to the extent requested by the client. (need is for library interest in participating in these occasional satellite telecasts; degree of local responsibility can be negotiated.) even though this network is designed to be a temporary method of allowing library participation in satellite communications, future implications could find these libraries expanding, improving, or beginning cablecasting on a library-designated cable channel. thus, libraries deciding whether they should become involved with a temporary network might contemplate the related activities available from library/cable system cooperation. 2. satellite library network. at some point in the not too distant future, libraries will be faced with the decision of becoming independent from their cable system and obtaining their own satellite hardware. a library with its own satellite receiving station will become more desirable to more users as a receive site for a satellite video-teleconference since it will be more flexible and autonomous. besides satellite video-teleconferences, libraries could investigate other uses of their satellite hardware including: -direct satellite access (with permission recommended) for cable television fare; -reception of nationwide satellite distribution of taped video programming for library use; -facilitation of various library data communications. if the library is able to prove the value and practicality of having community satellite access capabilities located at its facilities to the residents through participation in the catvlib network, local funding of a satellite library project might be realistic. if corporations are made aware of how such a satellite library facility could benefit their own communications needs, a corporate grant could prove to be another funding route. other sources of support must also be explored.
final word as a result of this survey, pssc has profiled cable libraries of all technical capabilities for input into a database of network resources. however, the limitations of a catvlib network have been noted. effort will be made by pssc where appropriate to use this network for client satellite telecasts. pssc will continue to profile interested cable libraries for addition to the network, upon request of the library.
statement of ownership and management journal of library automation is published quarterly by the american library association, 50 e. huron st., chicago, il 60611. annual subscription price, $15. american library association, owner; brian aveney, editor. second class postage paid at chicago, illinois. printed in u.s.a. as a nonprofit organization authorized to mail at special rates (section 132.122, postal service manual), the purpose, function, and nonprofit status of this organization and the exempt status for federal income tax purposes have not changed during the preceding twelve months. extent and nature of circulation ("average" figures denote the number of copies printed each issue during the preceding twelve months; "actual" figures denote number of copies of single issue published nearest to filing date - the june 1981 issue.) total number of copies printed: average, 6,869; actual, 7,345. paid circulation: not applicable (i.e., no sales through dealers, carriers, street vendors, and counter sales). mail subscriptions: average, 6,076; actual, 6,308. total paid circulation: average, 6,076; actual, 6,308. free distribution by mail, carrier, or other means, samples, complimentary, and other free copies: average, 432; actual, 446. total distribution: average, 6,508; actual, 6,754. copies not distributed: office use, left over, unaccounted, spoiled after printing: average, 361; actual, 591. returns from news agents: not applicable.
total (sum previous three entries): average, 6,869; actual, 7,345. statement of ownership, management and circulation (ps 3526, june 1980) for 1981 filed with the united states post office postmaster in chicago, september 30, 1981.
54 information technology and libraries | june 2011 recreation, law enforcement and public safety, and social services available in the community ■■ access to electronic encyclopedias, local libraries’ catalogs, full-text articles online, and document delivery.”2 at the time we were asking the question, will an information infrastructure be built? the answer? most assuredly. indeed, librarians stepped up to the table and ensured that the public had access to information-related services at their local library. the information the public asked for in 1994, as listed above, is widely available today. there are numerous examples in which librarians and libraries have served as leaders in the ongoing sustainability of local, regional, and national information networks. it was pointed out at the time, and remains true today, that in an era of ever-shrinking resources, libraries cannot and should not compete with telecommunications, entertainment, and computer companies. they need to “join them as equals in the information arena.”3 lita has a viable role in the development of the twenty-first-century skills that will firmly put the information infrastructure into place. a lita member is appointed as a liaison to the office for information technology policy (oitp) and serves on the lita technology and access committee, which addresses similar issues. the lita transliteracy interest group explores, develops, and promotes the role of libraries in all aspects of literacy. working with the oitp provides lita membership with the opportunity to participate in current issues, such as digital literacy. the information infrastructure has come a long way in the last twenty-some years. there is still much to be done.
robert bocher, technology consultant with the wisconsin state library and oitp fellow, will present “building the future: addressing library broadband connectivity issues in the 21st century” at the lita president’s program from 4 p.m. to 5:30 p.m. on sunday, june 26, at the ala annual conference in new orleans. i look forward to seeing you at the program and to hearing about the successes and the work that remains to be done to address the broadband needs we all face in the country. references 1. federal communications commission, the national broadband plan: chapter 2: goals for a high performance america, http://www.broadband.gov/plan/2-goals-for-a-high-performance-america/ (accessed apr. 2, 2011). 2. karen starr, “the american public, the public library, and the internet; an ever-evolving partnership” in the cybrarian’s manual, ed. pat ensor (chicago: ala, 1997): 23–24. 3. ibid., 31.
twenty years ago, librarians became involved in the implementation of the internet for the use of the public across the country. those initiatives were soon followed by the bill and melinda gates foundation projects supporting public libraries, which included funding hardware grants to implement public computer labs and connectivity grants to support high-speed internet connections. in 2008, the institute of museum and library services (imls) convened a task force to define twenty-first-century skills for museums and libraries, which became an ongoing national initiative (http://www.imls.gov/about/21stcskills.shtm). the one-year anniversary of the release of the national broadband plan was march 16, 2011.
as described on broadband.gov, the plan is intended “to create a high-performance america—a more productive, creative, efficient america in which affordable broadband is available everywhere and everyone has the means and skills to use valuable broadband applications.”1 in 1994, the idaho state library’s development division cosponsored eight focus groups in which 179 people participated. the participants were asked several questions, including the types of information they would like to see on the internet. the results reflected the public’s interest at that time in the following: ■■ “expert advice on a variety of topics including medicine, law, car repair, computer technology, animal husbandry, and gardening ■■ economic development, investment, bank rates, consumer product safety, and insurance ■■ community-based information such as events, volunteers, local classified advertisements, special interest groups, housing information, public meetings, transportation schedules, and local employment opportunities ■■ computer training, foreign language programs, homework service, teacher recertification, school activities, school scheduling, and adult education ■■ electronic mail and the ability to transfer files locally as well as worldwide ■■ access to public records, voting records of legislators, absentee voting, the ability to renew a driver’s license, the rules and regulations from governmental agencies, and taxes ■■ information about hunting and fishing, environmental quality, the local weather, road advisories, sports, karen j. starr (karen.j.starr@gmail.com) is lita president 2010-11 and assistant administrator for library and development services, nevada state library and archives, carson city. karen j. starr president’s message: 21st century skills, 21st century infrastructure reproduced with permission of the copyright owner. further reproduction prohibited without permission. 
in the beginning...was the command line zillner, tom information technology and libraries; jun 2000; 19, 2; proquest pg. 103 book reviews in the beginning ... was the command line by neal stephenson. new york: avon books, inc., 1999. 151p. $10 (isbn 0-380-81593-1) neal stephenson is best known for his cyberfiction, including snow crash and most recently cryptonomicon. in the beginning . . . was the command line is a quite different kettle of fish. command line is a short book with a succinct message: the command line is a good thing, because the full power of the computer is only available to those who can access the command line and type in the magic commands that make things happen. stephenson learned this lesson the hard way, after first spending much time as a macintosh-devoted guihead. the revelation came when he lost a document he was editing on his powerbook, completely and without a trace, forever irretrievable. actually, i say the book has a succinct message, but it has many messages and many metaphors, all artfully constructed by a master of prose. stephenson constructs his arguments along multiple lines, providing a discursive tour through windows, macintosh, and unix history, offering personal history as well as his own take on the economics of the software industry. for example, he believes that microsoft would be better off as an applications company rather than carrying the millstone of a family of operating systems. as for apple, he suggests that they have been doing their best to destroy themselves for years, so far unsuccessfully (but give them time).
the real meat of the book is whether, in fact, it is better to offer to people the flash of metaphor with the recognition that power and certain levels of choice are lost, as with graphical user interfaces exemplified by windows and the macintosh, or whether it is better to have at least some access to the command line interface, which ms-dos offered and members of the unix family (e.g., linux) afford. this is, in fact, both a silly and important question at the same time. silly because many people would wonder why anyone would want command line access to any software. silly because others might wonder why you couldn't have both. important, or at least apparently important, because we seem to have become, without much warning, a world wrapped in guis of one sort or another. important in the library automation world, because end-user tools are moving increasingly toward gui-based or web-based interfaces without text-based alternatives (except, perhaps, lynx or similar web browsers, which have their own problems). for much of the book, stephenson dances around the question, among others, of why not both gui and text-based interfaces, and finally finds the answer in the be operating system. my question is, why not as many interfaces as it takes, of whatever sort? to repeat the trite saw, there are two kinds of people in the world, those who divide the world into two kinds of people and those who don't. stephenson has a lot of fun trying to make the division in this case, then ultimately comes out from behind the posturing and admits that he believes in the availability of both worlds. there are many people who do, indeed, want hard things hidden from them, at least some of the time. when i am dealing with an automated teller machine, i don't want to have to use mechanical levers or pedals as i might have needed were atms invented in an earlier age, nor do i want to type in commands, although i am comfortable using a command line environment in my workplace.
i just want to be prompted through a minimal number of steps to walk away with some cash from my checking account. the world is a complicated and challenging place to navigate. some people would like to be helped by other people in this navigation, although many have found that they would far rather deal with the dumbed-down interface of an atm machine than interact with not-so-friendly, underpaid bank tellers. similarly, many people want to accomplish a particular task requiring the use of a computer and don't mind having the details hidden from them, no matter how much power knowing the details would provide. or, they want to do that at least some of the time. as an example in the library world, let's consider a naive patron who enters the library desiring to perform a known-item search. such a user might be quite comfortable with an interface with a single type-in box and a set of clickable buttons labeled title, author and subject. or maybe just a single button "click to start search." although naive users may consult library staff, who are most often more friendly than bank tellers, many people want to find their own materials. at the same time, more sophisticated users want more sophisticated capabilities and interfaces from the same catalogs. although vendors have gotten better at providing a couple of levels of complexity and corresponding user interfaces, why not go further? there aren't just two kinds of people. there are lots of kinds of people, with lots of kinds of information needs, representing lots of experience levels. why the restrictions at the user interface? in the history of microcomputing, stephenson points to the evolution of two major players, microsoft and apple, with linux coming on strong and be representing an interesting offshoot. i think the important insight implicit in what stephenson discusses is that much of the appearance and behavior of windows and the macintosh desktop are historically based artifacts.
in order to maintain backward compatibility with existing applications, the windows and macintosh operating systems have picked up a great deal of "cruft," computer code that allows multitasking and other improvements cobbled on to the fragile inner shell of ancient code required for compatibility with older applications. at the same time, stephenson invokes the familiar refrain that the user interfaces of both platforms are tied to a tired set of metaphors that attempt to mimic the real-world office (e.g., desktop, folder) but do not do so with any kind of useful fidelity. in the library world, i think a similar kind of lineage might be traced from command line interfaces to the current windows- and web-based front-ends. although many libraries and librarians have faced painful conversion processes over the years in moving through generations of automated systems, it might be interesting to see if there are still traces of underlying code that owe their existence to backward compatibility. where does stephenson turn in the face of the inelegance of the windows and macintosh worlds? he finds solace in the power and integrity of linux. it may take a long time to successfully install the operating system and get it to function with all of the hardware components of a particular computer configuration, but it has all that power, and all of those cool applications carefully constructed by people who care. bugs are fixed quickly. it's a community effort. that's all very appealing, particularly when compared to the appalling response (or lack of it) to windows or macintosh bugs. the problem is that so far most of us aren't equipped to deal with the steep curve required to install linux on personal computers, and the corporate or library environment usually isn't politically prepared for linux to be adopted as an institutionwide standard.
so, while linux boxes are frequent choices for servers, they are not widespread personal pc choices. nor should they be until easy installation tools are available. again, stephenson is ambivalent. on the one hand, he recognizes that there are many people who don't want the kind of power offered by being so close to the machine if it means becoming experts in arcane commands and codes. even though he wants the power and simplicity, and decries the limitations imposed by the gui, he recognizes that linux is not for everyone. he's right. most people use computers to get some work done (or to play). to the extent that the software gets in the way, it isn't operating properly. by that criterion, none of the three environments described are particularly useful in a desktop world. in spite of the fact that the old metaphors have been rightly criticized for years for their tiredness, there doesn't seem to be much movement beyond them, except in limited research operating environments and applications. similarly, it seems, in the library and information world, at least in most people's routine interactions with opacs and databases. yes, i am waffling, because i'm sure that someone could point out the "snarfle™ virtual reality interface to the lc catalog that affords a walkthrough browsing experience," but of course only six computer science researchers have actually experienced the snarfle™ interface, and it requires a $25,000 workstation and $10,000 in virtual reality gear to work, plus it is s-l-o-w. pardon the sarcastic riff, but there is a lot of wonderful user interface work that is certainly not finding its way onto mainstream computer users' desktops, or to the library or information center. so what's the answer? criticism is fun, because critics don't necessarily have to provide a positive account to match their nay-saying function.
if things are bleak in the world of the user interface, both on the average user desktop and on the library desktop as well, what is to be done? for a taste of what is to come in the library world, take a look at mylibrary (http://my.lib.ncsu.edu/), which allows profiling of user preferences and customization based on academic discipline. similarly, there are a number of web portals and other sites that allow customization for users (e.g., my yahoo, my excite, etc.). suppose that these first steps in customization are carried further, so that each user's unique profile generates a unique user interface experience across all databases he or she deals with in a session. the interface unification could be accomplished across heterogeneous databases in a couple of different ways. a simple initial step that many libraries already employ is to obtain databases from a single aggregator, so that a uniform interface is presented to the user. for example, oclc's firstsearch offers a single interface to a number of commercial databases. this type of solution is not possible for libraries that need access to a diverse array of databases not available through a single aggregator or vendor. of course, this situation can present patrons and staff with a bewildering array of interfaces and search methods. a more elaborate solution is to employ z39.50 to access the databases and build a single interface at the front end. there may be aggregators that already use this strategy with the databases they provide, but in the future perhaps there would be an incentive to offer unified interfaces with fine-grained customization possible by users. getting back to stephenson's more generalized view of the user interface, i think there are also opportunities here for more fine-grained customization.
stephenson points to the beos, which apparently allows both command-line and gui-based interactions, as an example of what can be done when an operating system is constructed anew, from the bottom up, with no pre-existing audience to satisfy. at the same time, and in contrast, stephenson extols the power of open software development, which he believes is most apparent in operating systems, the production of which he describes as money-losing propositions. yet, linux is tremendously successful without, for the most part, commercial gain for developers. can this same model be applied to interface and other development in the library world? in this example, might not some group of librarian coders (or coder librarians) work together to put mylibrary together with z39.50 capabilities and customization of interfaces to produce a little slice of paradise for library patrons? promising moves are being made within the library community to get open source efforts off the ground. this could be one of many especially useful and fruitful projects to come out of open software development for libraries. although his book is ostensibly about a few issues that elicit yawns from most of the world, stephenson is really using in the beginning . . . was the command line to look at a much bigger picture than simply the command line versus the gui at its microscopic level. stephenson looks at the cloaking, obfuscation or replacement of underlying text by images and multimedia as contributing to the decline of civilization. that seems like a radical claim, but at heart it is the one that stephenson makes in his discussion of the disney-ification of the world-that visual metaphors and explanations oversimplify and obscure the truth.
in fact, stephenson goes further, discussing this trend toward anti-word as our attempt at an antidote for the kind of intellectualism that resulted in a lot of death, pain, and suffering for people in the twentieth century. he, as a person who lives by words and loves the intellectual life, thinks we've gone too far, reaching a state of cultural relativism where there is neither good nor bad remaining. this discussion includes my favorite quote of the book: the problem is that once you have done away with the ability to make judgments as to right and wrong, true and false, etc., there's no real culture left. all that remains is clog dancing and macrame. the ability to make judgments, to believe things, is the entire point of having a culture. i think this is why guys with machine guns sometimes pop up in places like luxor and begin pumping bullets into westerners .... when their sons come home wearing chicago bulls caps with the bills turned sideways, the dads go out of their minds. (p. 56) it's a pretty startling move to try to connect up the decline in use of the command line to an anti-intellectualism following world war ii that resulted in cultural relativism. i think it actually has some merit, although in the case of visual interfaces versus the command line the ethical import is minimal, i.e., i don't believe my decision to accomplish certain tasks using visual metaphors contributes to the decline of civilization, and i think the fact that i like to work on other tasks utilizing a command line won't serve to save our written culture. it's too much of a stretch. i think that something stephenson misses in his discussion of the replacement of the written word by visual images is that there is still a creative force and judgment involved in the creation of the images. there is still script writing. isn't this, after all, what a writer does in any case, creating images, metaphorically, through his or her work?
certainly, we are moving through a perilous time, when the world really is changing from a reliance on the written word to more dependence on the visual. there will be many things lost in this transition. plato had some major, well-founded doubts about the transition from greece's oral cultural tradition to a written one. the change happened anyway. civilization has been declining for a long time. my fearless prediction is that it will continue to decline for a long time. i think stephenson has done a masterful job of writing a brief glimpse of the overall picture that represents the state of culture and intellectual life in the world today, and has also made some important points about the economics and character of the world of software and operating environments. his writing skills make this fairly short book a pleasurable read and a worthwhile one. as i did, i think you might find this long essay a useful starting point for thoughts about issues large and small.-tom zillner, wils
the cathedral & the bazaar: musings on linux and open source by an accidental revolutionary by eric s. raymond. sebastopol, calif.: o'reilly, 1999. 288p. $19.95 (isbn 1-56592-724-9) this short essay examines, in the guise of a book review, the concept of a "gift culture" and how it may or may not be related to librarianship. as a result of this examination, and with a few qualifications, i believe my judgements about open source software and librarianship are true: open source software development and librarianship have a number of similarities-both are examples of gift cultures. i have recently read a book about open source software development by eric raymond. the cathedral & the bazaar describes the environment of free software and tries to explain why some programmers are willing to give away the products of their labors. it describes the "hacker milieu" as a "gift culture":
gift cultures are adaptations not to scarcity but to abundance. they arise in populations that do not have significant material scarcity problems with survival goods. we can observe gift cultures in action among aboriginal cultures living in ecozones with mild climates and abundant food. we can also observe them in certain strata of our own society, especially in show business and among the very wealthy.1 raymond alludes to the definition of "gift cultures," but not enough to satisfy my curiosity. being the good librarian, i was off to the reference department for more specific answers. more often than not, i found information about "gift exchange" and "gift economies" as opposed to "gift cultures." (yes, i did look on the internet but found little.) probably one of the earliest and more comprehensive studies of gift exchange was written by marcel mauss.2 in his analysis he says gifts, with their three obligations of giving, receiving, and repaying, are in aspects of almost all societies. the process of gift giving strengthens cooperation, competitiveness, and antagonism. it reveals itself in religious, legal, moral, economic, aesthetic, morphological, and mythological aspects of life.3 as gregory states, for the industrial capitalist economies, gifts are nothing but presents or things given, and "that is all that needs to be said on the matter." ironically for economists, gifts have value and consequently have implications for commodity exchange.4 he goes on to review studies about gift giving from an anthropological view, studies focusing on tribal communities of various american indians, cultures from new guinea and melanesia, and even ancient roman, hindu, and germanic societies: the key to understanding gift giving is apprehension of the fact that things in tribal economics are produced by nonalienated labor.
this creates a special bond between a producer and his/her product, a bond that is broken in a capitalistic society based on alienated wage-labor.5 ingold, in "introduction to social life," echoes many of the things summarized by gregory when he states that industrialization is concerned exclusively with the dynamics of commodity production. clearly in non-industrial societies, where these conditions do not obtain, the significance of work will be very different. for one thing, people retain control over their own capacity to work and over other productive means, and their activities are carried on in the context of their relationships with kin and community. indeed their work may have the strengthening or regeneration of these relationships as its principal objective.6 in short, the exchange of gifts forges relationships between partners and emphasizes qualitative as opposed to quantitative terms. the producer of the product (or service) takes a personal interest in production, and when the product is given away as a gift it is difficult to quantify the value of the item. therefore, along with the product or service, less tangible elements-such as obligations, promises, respect, and interpersonal relationships-are exchanged. as i read raymond and others i continually saw similarities between librarianship and gift cultures, and therefore similarities between librarianship and open source software development. while the summaries outlined above do not necessarily mention the "abundance" alluded to by raymond, the existence of abundance is more than mere speculation. potlatch, "a ceremonial feast of the american indians of the northwest coast marked by the host's lavish distribution of gifts or sometimes destruction of property to demonstrate wealth and generosity with the expectation of eventual reciprocation," is an excellent example.7 libraries have an abundance of data and information.
(i won't go into whether or not they have an abundance of knowledge or wisdom of the ages. that is another essay.) libraries do not exchange this data and information for money; you don't have to have your credit card ready as you leave the door. libraries don't accept checks. instead the exchange is much less tangible. first of all, based on my experience, most librarians simply take pride in their ability to collect, organize, and disseminate data and information in an effective manner. they are curious. they enjoy learning things for learning's sake. it is a sort of platonic end in itself. librarians, generally speaking, just like what they do and they certainly aren't in it for the money. you won't get rich by becoming a librarian. information is not free. it requires time and energy to create, collect, and share, but when an information exchange does take place, it is usually intangible, not monetary, in nature. information is intangible. it is difficult to assign it a monetary value, especially in a digital environment where it can be duplicated effortlessly: an exchange process is a process whereby two or more individuals (or groups) exchange goods or services for items of value. in library land, one of these individuals is almost always a librarian. the other individuals include taxpayers, students, faculty, or in the case of special libraries, fellow employees. the items of value are information and information services exchanged for a perception of worth-a rating valuing the services rendered. this perception of worth, a highly intangible and difficult thing to measure, is something the user of library services "pays," not to libraries and librarians, but to administrators and decision-makers. ultimately, these payments manifest themselves as tax dollars or other administrative support. as the perception of worth decreases so do tax dollars and support.
8 therefore, when information exchanges take place in libraries, librarians hope their clientele will support the goals of the library to administrators when issues of funding arise. librarians believe that "free" information ("think free speech, not free beer") will improve society. it will allow people to grow spiritually and intellectually. it will improve humankind's situation in the world. libraries are only perceived as beneficial when they give away this data and information. that is their purpose, and they, generally speaking, do this without regard to fees or tangible exchanges. in many ways i believe open source software development, as articulated by raymond, is very similar to the principles of librarianship. first and foremost they are similar in the idea of sharing information. both camps put a premium on open access. both camps are gift cultures and gain reputation by the amount of "stuff" they give away. what people do with the information, whether it be source code or journal articles, is up to them. both camps hope the shared information will be used to improve our place in the world. just as jefferson's informed public is necessary for democracy, open source software is necessary for the improvement of computer applications. second, human interactions are a necessary part of the mixture in both librarianship and open source development. open source development requires people skills by source code maintainers. it requires an understanding of the problem the computer application is intended to solve, since the maintainer must be able to "patch" the software, both to add functionality and to repair bugs. this, in turn, requires interactions both with other developers and with users who request repairs or enhancements. similarly, librarians understand that information-seeking behavior is a human process. 
while databases and many "digital libraries" house information, these collections are really "data stores" and are only manifested as information after the assignment of value is given to the data and interrelations between data are created. third, it has been stated that open source development will remove the necessity for programmers. yet raymond posits that no such thing will happen. if anything, there will be an increased need for programmers. similarly, many librarians feared the advent of the web because they believed their jobs would be in jeopardy. ironically, librarianship is flowering under new rubrics such as information architects and knowledge managers. it has also been brought to my attention by kevin clarke (kevin_clarke@unc.edu) that both institutions use peer-review: your cultural take (gift culture) on "open source" is interesting. i've been mostly thinking in material terms but you are right, i think, in your assessment. one thing you didn't mention is that, like academic librarians, open source folks participate in a peer-review type process. all of this is happening because of an information economy. it sure is an exciting time to be a librarian, especially a librarian who can build relational databases and program on a unix computer. acknowledgements thank you to art rhyno (arhyno@server.uwindsor.ca) who encouraged me to post the original version of this text. eric lease morgan, north carolina state university, raleigh, north carolina references 1. the cathedral & the bazaar: musings on linux and open source by an accidental revolutionary, 99. 2. m. mauss, the gift: forms and functions of exchange in archaic societies (new york: norton, 1967). 3. s. lukes, "mauss, marcel," in international encyclopedia of the social sciences, d. l. sills, ed. (new york: macmillan), vol 10, 80. 4. c. a. gregory, "gifts," in the new palgrave: a dictionary of economics, j. eatwell and others, eds. (new york: stockton pr., 1987), vol.
4, 524. 5. ibid. 6. t. ingold, "introduction to social life," in companion encyclopedia of anthropology, t. ingold, ed. (new york: routledge, 1984), 747. 7. the merriam-webster online dictionary, http://search.eb.com/cgi-bin/dictionary?va=potlatch 8. e. l. morgan, "marketing future libraries." accessed apr. 27, 2000, www.lib.ncsu.edu/staff/morgan/cil/marketing.

japanese military “comfort women” knowledge graph: linking fragmented digital records haram park and haklae kim information technology and libraries | march 2023 https://doi.org/10.6017/ital.v42i1.15799 haram park (haram9553@gmail.com) is a master's student, library and information science, chung-ang university, haklae kim (haklaekim@cau.ac.kr) is associate professor, library and information science, chung-ang university. © 2023.

abstract materials related to japanese military “comfort women” in korea are managed by several institutions. each digital archive has its own metadata schema and management policies. so far, a standard or a common guideline for describing digital records has not been formalized. we propose a japanese military “comfort women” knowledge graph to semantically interlink the digital records from distributed digital archives. to build a japanese military “comfort women” knowledge graph, digital records and descriptive metadata were collected from existing digital archives. a list of metadata was defined by analyzing commonly used properties, and a knowledge model was designed by reusing standard vocabularies. knowledge was constructed by interlinking the collected records, external data sources, and enriching data. the knowledge graph was evaluated using the fair data maturity model.
introduction in december 1991, kim hak-sun (a korean) became the first woman to disclose and identify as a former “comfort woman.”1 in february 1992, ms. itoh hideko discovered three telegrams in the japanese defense agency stating that not only korean but also taiwanese women had been dispatched as “comfort women.”2 between 1931 and 1945, the imperial japanese army forced approximately 200,000 girls and young women from korea, china, and other countries, known as “comfort women,” into sexual slavery. these women came from all over east asia, but the majority, over 80 percent, were from south korea.3 it was not until the early 1990s that survivors began to share their stories and demand justice. many international organizations and volunteers continue to participate in advocacy and campaigns to solve the japanese military sexual slavery.4 however, the japanese government has never accepted legal responsibility or agreed to pay reparations.5 regardless of political interpretation, we believe it is critical to reveal the historical truth. the records of japanese military “comfort women” serve as objective evidence to prove the fact that the japanese military indulged in sexual slavery. as there are now only 13 elderly survivors left in south korea, the records could serve as one of the key pieces of evidence for understanding the japanese military “comfort women.” in korea, materials related to japanese military “comfort women” are managed by the national archives of korea and some private organizations, and some of this material is being provided as digital archives.6 digital archives systematically describe digital resources so that users can effectively search and view the materials.7 in general, digital archives describe digital resources based on guidelines for expressing standard metadata elements and data values that are mainly used in the domain. 
for example, the us library of congress is creating digital resources with varying levels and types of descriptive metadata, providing an increasingly coordinated and standardized approach to the creation and management of descriptive metadata.8 however, for the digital archives related to japanese military “comfort women,” there are no recommendations or agreed guidelines on metadata for describing digital records. even when metadata standards such as dublin core are used, there remain variations in describing metadata elements of digital records. therefore, linking or integrating the digital records with different metadata structures and values is difficult. to solve this problem, a metadata model to describe digital records related to japanese military “comfort women” should be developed, and digital records should be systematically described. if the various pieces of information contained in the digital record are expressed in a format that a machine can understand, a precise search is possible based on the meaning and relationship of the data. a knowledge graph can be applied to define the relationships between the various entities included in japanese military “comfort women” records. in particular, the records existing in a distributed digital archive can be expressed as objects that can be identified on the web, so that different records can be linked at a semantic level.9 this study proposes a method to interlink and search digital records of the digital archives of japanese military “comfort women.” for describing and linking distributed digital records, a set of metadata elements was proposed, and a knowledge model was defined by examining the common metadata model and the existing rdf vocabulary. the collected digital records were constructed as a knowledge graph, using a knowledge model.
the knowledge graph was evaluated by applying the fair data maturity model.10 the remainder of this paper is organized as follows. the literature review introduces the japanese military “comfort women” issue and describes the concepts and research trends related to knowledge graphs. we then introduce the case of korean digital archives containing materials about the japanese military’s use of “comfort women.” next, we describe the process of developing a knowledge graph in detail and define sparql queries, comparing the search results of existing digital archives and knowledge graphs, and describing differences in fair data maturity. finally, the research results are summarized, and future research directions are described. literature review japanese military “comfort women” the japanese military “comfort women” issue was made official in 1991 when the korean council for the women drafted for military sexual slavery by japan and the korean victims appealed to solve the problem themselves,11 through activities such as the testimony of victims,12 and the activities of individual researchers and civic groups,13 raising issues through the international community and through domestic and international judicial procedures.14 through these efforts, the japanese military “comfort women” issue has been seen as a problem of forced mobilization, human trafficking, sexual exploitation, and extreme human rights violations by the ruling state targeting women in the colonized state.15 however, the japanese military “comfort women” were a cause of conflict and confrontation between victims and their families, private organizations, and the south korean and japanese governments. 
for example, mark ramseyer defined the japanese military “comfort women” in his paper as prostitutes (ianfu) who, based on game theory, engaged in prostitution to the japanese military for high wages during the pacific war.16 this sparked a debate about historical distortion.17 some argue that the “comfort women” issue is not viewed as a conflict between korea and japan but as a women’s and a universal human rights issue.18 from a political and social point of view, research on the japanese military “comfort women” is active, but insufficient research has been conducted on archives and records management due to licensing of records, data sharing, and a lack of qualified personnel. various licensing policies and sharing limitations apply to the records kept by different institutions. as a result, the preservation and exchange of documents are nominal, and they are administered with a minimal amount of personnel. records are essential evidence for discussing historical truths. fifteen organizations, from eight countries, have tried to list the records of the japanese military “comfort women” as unesco world’s documentary heritage.19 a total of 2,744 records have been requested, including materials that prove the japanese military’s “comfort women” system or materials produced by “comfort women” victims. however, the decision to list japanese military “comfort women” records as unesco documentary heritage has been postponed due to tensions between south korea and japan.20 the national archives of korea has selected materials related to the “comfort women” of the japanese military as a nation-designated record and is integrating and managing these records.21 however, most records are scattered in various university research institutes, nongovernmental organizations, and institutions, and it is difficult to systematically preserve and manage them.
reuse of ontology vocabularies and fair data principles the records of the japanese military “comfort women” are not systematically managed, and existing digital archives tend not to contain sufficient information contained in the original records. a previous study suggested a metadata schema for the integrated management of the records of japanese military “comfort women.”22 however, although most studies suggest common metadata elements, they do not include methods for representing and processing records in a machine-readable format.23 reusing vocabularies is recommended to foster interoperability and facilitate knowledge use by interlinking new datasets to existing resources. some previous efforts demonstrate a way of interlinking digital resources on the web by using several ontology vocabularies.24 in particular, freire et al. propose a mapping from schema.org metadata to the europeana data model. the proposed method is suitable for metadata aggregation in the area of cultural heritage by enriching the semantics of the schema.org model.25 the fair data principles are designed to reinforce the reusability of research data and are defined as four principles: findable, accessible, interoperable, and reusable.26 in particular, the fair principles emphasize the ability of machines to find and use data on their own, in accordance with the research data management environment.27 initially, the fair principles were recognized as a tool to enhance the reusability of research data in the context of open science; however, they are now being extended to a universal framework for preserving and managing data in the long term.28 representative examples include fair metrics,29 the data maturity model of the rda (research data alliance) working group,30 and fairsfair.31 fair metrics presents an evaluation framework that can measure fair indices using an automated tool. 
discussions on the fair principle are also expanding in digital archives and libraries.32 koster and woutersen-windhouwer propose the fair principle suitable for lam (libraries, archives, museums) collections and suggest a practical method to increase the reusability of digital cultural heritage.33

digital archives of japanese military “comfort women”

the records or documents of the japanese military “comfort women” are managed in the form of digital archives by national and private institutions. table 1 summarizes the status of digital archives held by each institution as representative digital archives. the wednesday demonstration archive is a digital archive operated by the korean council. it contains a record of the “regular demand demonstration to solve the japanese military’s sexual slavery problem” that began in january 1992. the archive contains 1,085 records, and each record is described with 17 metadata elements. archive 814, named for the annual day of remembrance of the japanese military “comfort women” observed on august 14, aims to develop efforts and research results surrounding the “comfort women” issue. archive 814 has 596 records, including domestic and foreign legal records, official documents, collections by subject, chronological tables, and book lists. the seoul archives provides documents proving the existence of japanese military “comfort women” and comfort stations from documents produced by the allied forces during world war ii. in total, 137 records were provided, and each record consisted of 25 descriptive metadata elements. the gender archive provides documents on the issue of “military sexual slavery by japan” and “the women’s international war crimes tribunal on japan’s military sexual slavery.” a total of 408 records were provided, with 88 metadata elements describing each record. the national archives of korea has designated records related to japanese military “comfort women” as nation-designated archives no. 8. among the records (approximately 3,060 cases) owned by house of sharing (http://www.nanum.org/eng/main/index.php) and daegu citizen forum for halmuni (http://www.1945815.or.kr/), 27 records are selected as major records, and digitized records including 20 metadata elements are provided.

table 1. status of records by archives
archives | organization | number of digital records | number of descriptive metadata | url
wednesday demonstration | the korean council | 1,085 | 17 | https://womenandwarmuseum.net
archive 814 | research institute on japanese military sexual slavery | 596 | 20 | https://www.archive814.or.kr/
digital collection of “comfort women” | seoul metropolitan archives | 137 | 25 | https://archives.seoul.go.kr/class/cc0003
gender archive | seoul foundation of women and family | 408 | 88 | http://genderarchive.or.kr/
nation-designated archives no. 8 | national archives of korea | 27 | 20 | https://theme.archives.go.kr//next/nationalarchives/subpage/nationalarchives7.do
note: archive names in the following sections are abbreviated for readability: wed: wednesday demonstration; a814: archive 814; sma: digital collection of “comfort women”; gen: gender archive; nak: nation-designated archives no. 8.

development of japanese military “comfort women” knowledge graph

data preprocessing

a total of 2,253 records and metadata were collected from the five digital archives. excluding records with insufficient information (a814 and nak had three and two documents, respectively), 2,248 records were constructed as a knowledge graph.
metadata values in the collected records are not consistently expressed. for example, the seoul archives indicates the institution in the form “[organization/group] jinseong jeong research team, seoul national university, 2015,” whereas “kunji takei, governor of yamagata prefecture” in archive 814 has a combination of person, organization, and his position together. these values are separated into relevant categories and described in the corresponding metadata elements (e.g., “kunji takei, governor of yamagata prefecture” is divided into “kunji takei” (name) and “yamagata prefecture” (his position)). the units for expressing metadata values such as “production date” and “language” are also unified, and errors in some data values are corrected directly (e.g., “gabrelle kirk mcdonald” is changed to “gabrielle kirk mcdonald”, restoring the “i” to her first name). in addition, a new classification system is defined by aligning and integrating existing categories, since the digital archives use different categories (e.g., book/publication, document). a model of designing a knowledge graph two tasks are performed to transform the collected data into a knowledge graph. since the metadata elements used in digital archives are different, metadata properties commonly used in archives are extracted. for common metadata, the scope of reuse is determined by investigating the existing rdf vocabularies and adding to the proposed knowledge model. common metadata elements among the selected archives are defined by the following two criteria: 1. metadata elements commonly used in all archives were extracted. metadata elements present in all five archives, such as title, description, identifier, license, and url are mandatory. metadata elements defined in two or more archives, such as “production date” and “language,” are optional properties.
even if the metadata name written in korean is different, it is regarded as the same metadata element if its purpose is to indicate the same data value. 2. metadata elements not used in the actual data were excluded from the model. for example, gen has 88 metadata elements. however, there were no data values for 60 of these elements. table 2 summarizes a list of metadata elements for describing the records of digital archives. a proposed model should be able to represent the context of individual records and their own properties. after investigating semantic relationships between common metadata elements and existing vocabularies, the proposed model is defined. the model reuses existing vocabularies, such as dcmi (dublin core metadata initiative) metadata terms for describing online resources, skos (simple knowledge organization system) for representing taxonomies, ric-o (records in contexts – ontology) for describing digital records, and schema.org for supporting universal search on the web. the basic structure of the japanese military “comfort women” knowledge model is illustrated in figure 1. all records that are digital resources (“#record”) are instances of schema:archivecomponent and represent records provided by each archive. the individual records contain information on several people and organizations. for example, the schema:creator property describes a creator who creates a record, the schema:contributor can be used to represent a person who contributes a record, and the schema:mentions is to represent a thing related to a record. an archive manager who holds or maintains a record can be described using the schema:holdingarchive property, and the archive manager is represented by the schema:archiveorganization class.
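the two selection criteria for common metadata elements described above can be sketched in python as follows. this is only an illustration under stated assumptions: the per-archive element sets are hypothetical toy values rather than the archives' real schemas, the element names are assumed to be already normalized to shared labels, and the function name classify_elements is introduced here for the example.

```python
from collections import Counter

# hypothetical element names per archive, already normalized to shared labels
ARCHIVE_ELEMENTS = {
    "wed": {"title", "identifier", "url", "production date"},
    "a814": {"title", "identifier", "url", "production date", "language"},
    "sma": {"title", "identifier", "url", "language"},
    "nak": {"title", "identifier", "url"},
    "gen": {"title", "identifier", "url", "unused field"},
}

# hypothetical: elements that never carry a value in the collected data
EMPTY_ELEMENTS = {"unused field"}


def classify_elements(archive_elements, empty_elements):
    """Label each element as mandatory or optional following the two criteria."""
    counts = Counter(
        element
        for elements in archive_elements.values()
        for element in elements
    )
    total_archives = len(archive_elements)
    model = {}
    for element, n_archives in counts.items():
        if element in empty_elements:
            continue  # criterion 2: drop elements with no data values
        if n_archives == total_archives:
            model[element] = "mandatory"  # present in every archive
        elif n_archives >= 2:
            model[element] = "optional"  # shared by two or more archives
    return model


MODEL = classify_elements(ARCHIVE_ELEMENTS, EMPTY_ELEMENTS)
```

with these toy inputs, title, identifier, and url come out as mandatory, production date and language as optional, and the empty element is dropped.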
if the value of each property is a type of organization, then the value of rdfs:range is the schema:organization class.

figure 1. abstract structure of the japanese military “comfort women” knowledge graph.

table 2. mapping results of both metadata elements and models of the knowledge graph
source elements (wed, a814, sma, nak, gen) | property | entity | value | mandatory
title; title; title; title; dc:title | schema:title | schema:archivecomponent | xsd:string | yes
identifier; registration number; identification number; management number; dc:identifier | schema:identifier | schema:archivecomponent | xsd:string | yes
description; scope and content; description; dc:description | schema:description | schema:archivecomponent | xsd:string | yes
production date; production date; year of production; itm:date | schema:datecreated | schema:archivecomponent | xsd:datetime | no
creator; creator; production institution; itm:creator | schema:creator | schema:archivecomponent | schema:person; schema:organization | yes
license; license; rights statement; license | cc:license | schema:archivecomponent | cc:license | yes
management organization; management organization; service provider; management organization | schema:holdingarchive | schema:archivecomponent | schema:archiveorganization | yes
url; url; url; url | schema:sameas | schema:archivecomponent | schema:url | yes
attachment view; attachment view; attachment view; attachment view; file | schema:mainentityofpage | schema:archivecomponent | schema:url | no
attachment download; download | schema:downloadurl | schema:archivecomponent | schema:url | no
record type; record type; record type; record type; itm:typeofrecord | rico:hascontentoftype | schema:archivecomponent | skos:concept | yes
format; type of document; itm:formatofrecord | rico:hasdocumentaryformtype | schema:archivecomponent | skos:concept | no
number of pages; number of pages; itm:size/amount | schema:numberofpages | schema:archivecomponent | xsd:nonnegativeinteger | no
language; itm:langauage | schema:inlanguage | schema:archivecomponent | schema:language | no
periodic classification; temporal coverage | schema:temporalcoverage | schema:archivecomponent | xsd:string | no
related terms; related information; itm:relatedperson; itm:relatedorganization; itm:relatedevent | schema:mentions | schema:archivecomponent | schema:person; schema:organization; schema:event | no
donor/collector; contributor, collector/provider; itm:donor | schema:contributor | schema:archivecomponent | schema:person; schema:organization | no

data enrichment and transformation

data enrichment refers to the process of appending or otherwise enhancing the collected data with the relevant context obtained from additional sources. in the collected digital records, the entities of person and organization are linked to wikidata (http://wikidata.org) and the enriched information is expanded to a knowledge graph using the rdf extension of openrefine (http://openrefine.org). a total of 654 terms were extracted from the existing archives for people and organizations. after removing duplicates, the dictionary contained 150 people and 312 organizations. for each term in the dictionary, a matching entity is searched for in wikidata. if the entity name matches completely, the uri of wikidata is assigned automatically. thirty-eight percent of people (57) and 28 percent of organizations (88) matched between the dictionary and wikidata. matched entities can be added to the knowledge graph by extracting the properties and values of wikidata. for example, kim bok-dong is linked to wikidata (q16175111), and citizenship, occupation, place of birth, and gender, which did not exist in the collected data, are added to the knowledge graph.
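the exact-match linking step and the reported match rates can be illustrated with a small python sketch. it makes several assumptions that are not in the original: the lookup table stands in for a wikidata label search (no network call is made), names are compared after trimming and lowercasing, and the helper names link_exact and match_rate are hypothetical. only the identifier q16175111 and the counts 57 of 150 and 88 of 312 come from the text.

```python
# hypothetical local lookup standing in for a wikidata label search
WIKIDATA_LABELS = {
    "kim bok-dong": "Q16175111",  # identifier given in the text
}

# standard prefix of wikidata entity uris
WIKIDATA_ENTITY_PREFIX = "http://www.wikidata.org/entity/"


def link_exact(term, labels):
    """Return a wikidata uri only when the whole name matches a known label."""
    # assumption: comparison ignores surrounding spaces and letter case
    qid = labels.get(term.strip().lower())
    return WIKIDATA_ENTITY_PREFIX + qid if qid else None


def match_rate(matched, total):
    """Share of dictionary terms that received a uri, as a whole percentage."""
    return round(100 * matched / total)


dictionary = ["Kim Bok-dong", "an unmatched name"]
links = {term: link_exact(term, WIKIDATA_LABELS) for term in dictionary}

people_rate = match_rate(57, 150)          # people matched / people in dictionary
organization_rate = match_rate(88, 312)    # organizations matched / in dictionary
```

unmatched terms simply keep no uri here; how the study handled partial matches is not stated, so the sketch leaves them unlinked.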
as a result, six properties are fetched from the person and three attributes are obtained from the organization. a total of nine properties were expanded by data enrichment, and vocabularies for representing the extended properties were mapped (e.g., citizenship is mapped to schema:nationality). the constructed knowledge graph had 47,499 triples for 3,069 entities. the collected records and information contained in the records included 2,560 objects. the number of entities linked to wikidata was 145 (57 people and 88 organizations), and these were added to the knowledge graph. the enriched entities contained 2,114 explicit statements and 102 inferred statements. as shown in table 3, the total number of triples was 47,327 for explicit statements and 172 for inferred statements. the knowledge graph is published on github (https://github.com/hike-lab/comfort-women-archives).

table 3. statistics of the constructed knowledge graph
 | entities | explicit statements | implicit statements | sum of statements
collected entities | 2,560 | 45,213 | 70 | 45,283
enriched entities | 509 | 2,114 | 102 | 2,216
sum | 3,069 | 47,327 | 172 | 47,499

figure 2 shows the information about “jan ruff o’herne” in the knowledge graph. she is a dutch-australian who was sexually enslaved by the japanese military and has been active as a human rights activist since she disclosed in 1992 that she had been sexually enslaved by the japanese army. the knowledge graph links several records produced or contributed by o’herne. wed’s record (wednes-demo-368) links jan ruff o’herne with related information (schema:mentions), and a814’s record (a814-107) links jan ruff o’herne as the record’s creator (schema:creator).
existing digital archives do not provide specific information about the person, organization, or event described in metadata. if a reader does not know that jan ruff o’herne was a victim of the japanese military “comfort women” system, it is difficult to fully understand the record of “letter from jan ruff-o’herne in support of us congress resolution 121 in 2007” provided by a814. however, as shown in figure 2, the knowledge graph provides a rich context for understanding her and her associated records. figure 2. semantic relationships of jan ruff o’herne on the knowledge graph. evaluation the evaluation of the constructed knowledge graph was carried out in two ways: 1) discoverability among five archives and the knowledge graph is compared by using several semantic queries, and 2) the fair data evaluation was applied to the knowledge graph and existing digital archives. discoverability all queries aim to find all digital records across the five digital archives by using search conditions and are written in the rdf standard query language (sparql). table 4 is an example query (q3), and the records produced from 1990 to 1994 in digital resources are sorted in ascending order. at this time, the values of all the objects must exactly match the rdf:type, and regardless of the physical location, the object is identified based on the uri and included in the search result. table 4.
a sparql query example (q3)

    prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
    prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    prefix schema: <http://schema.org/>
    prefix xsd: <http://www.w3.org/2001/XMLSchema#>
    select ?title ?date ?archiveorganizationname
    where {
      ?record rdf:type schema:ArchiveComponent ;
              schema:name ?title ;
              schema:dateCreated ?date ;
              schema:holdingArchive ?archiveorganization .
      ?archiveorganization rdfs:label ?archiveorganizationname .
      filter (?date >= '1990-01-01'^^xsd:date && ?date <= '1994-12-31'^^xsd:date)
    }
    order by ?date

table 5. list of sparql queries
queries | description | number of results
q1 | select all records of japanese military “comfort women” | 2,248
q2 | select all records whose record type is ‘document’ | 1,793
q3 | select records produced between 1990 and 1994, and sort in ascending order | 345
q4 | select all information about ‘ministry of gender equality and family’ | 480
q5 | select all information about ‘jan ruff-o’herne’ | 120

table 5 summarizes the queries constructed to search the knowledge graph, and figure 3 shows the results of the comparison between the search of the existing archives and the query of the knowledge graph. the existing archives provide keyword-based search without considering the meaning and relationship of search keywords. furthermore, they do not share any common categories or classifications with one another. a knowledge graph that semantically links records in different digital archives also enables accurate and relevant discovery. q1, q2, and q3 find all digital records matching the query condition and information semantically linked to those records. for example, gen had 169 records produced between 1990 and 1994. since the archive did not support the search for a type of a record, it is not possible to specifically search for the record type in q2 and q3.
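to make the logic of q3 concrete, the following python sketch reproduces its three steps (join each record with the label of its holding archive, keep records created between 1990 and 1994, sort by date) over a few in-memory rows. it is a plain-python analogue rather than an execution of the sparql query, and every record, uri, and label in it is a hypothetical placeholder.

```python
from datetime import date

# hypothetical records: (record uri, title, date created, holding archive uri)
RECORDS = [
    ("ex:r1", "record a", date(1993, 5, 1), "ex:wed"),
    ("ex:r2", "record b", date(1991, 1, 8), "ex:a814"),
    ("ex:r3", "record c", date(2007, 2, 15), "ex:a814"),
]

# hypothetical labels of the holding archives, keyed by uri
ARCHIVE_LABELS = {
    "ex:wed": "wednesday demonstration",
    "ex:a814": "archive 814",
}


def q3(records, labels, start=date(1990, 1, 1), end=date(1994, 12, 31)):
    """Mimic q3: join on the holding archive, filter by date range, sort ascending."""
    rows = [
        (title, created, labels[archive])
        for _uri, title, created, archive in records
        if start <= created <= end
    ]
    return sorted(rows, key=lambda row: row[1])


RESULT = q3(RECORDS, ARCHIVE_LABELS)
```

because records are joined through the archive uri rather than a display name, rows from different archives end up in one result list, which is the property the knowledge graph relies on.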
however, in the knowledge graph, the record type is expressed with rico:hascontenttypeof; thus, information is expressed at the semantic level, and the 169 related records can be retrieved. q4 and q5 discover entities based on their semantic relations. “ministry of gender equality and family” in q4 is an organization, and each government has used a slightly different name for the department (e.g., “ministry of gender equality”). q5 discovers different entities in the existing archives. the knowledge graph semantically defines the variants of entities and their types. as a result, the knowledge graph provided 104 more search results in q4 and nine more search results in q5 than the existing archives.

figure 3. search results of knowledge graph and existing digital archives.

fair data evaluation for the knowledge graph

the fair data evaluation for the constructed knowledge graph reveals a clear improvement compared to the existing archives. findable, accessible, and interoperable follow the fair data principles. all objects of the constructed knowledge graph can be identified by uri, and metadata elements are described with a standard vocabulary, so that machines can search for digital resources. digital resources in the existing archives are accessible over the web; therefore, accessible received a relatively good score. however, access to the metadata of individual records was restricted, as the majority of metadata elements were described as simple strings instead of machine-readable forms. the knowledge graph improves accessibility by providing uris for metadata elements. in addition, to avoid being linked to the resources of the existing archive, standardized vocabulary, such as schema.org and dublin core, was applied to increase the connectivity between data, and rich contextual information was provided through semantic linkage with wikidata. as shown in figure 4, the evaluation score of reusable is 0.7, which is 2.9 times the score of the existing archives.
the metadata elements in the knowledge graph clearly describe a license for reuse. in particular, the creative commons license and the korea open government license provide machine-readable uri information to enhance reusability. however, data for which licensing information is unclear or not provided are left blank. in summary, the constructed knowledge graph semantically connects digital resources fragmented across different archives, enables a rich search, and satisfies all fair data indicators.

figure 4. results of fair data evaluation of the knowledge graph and existing digital archives.

conclusion

this study proposed a method for linking and searching digital records from the japanese military “comfort women” digital archives. in korea, materials related to japanese military “comfort women” are managed by several institutions, some of which are provided as digital archive services. however, the existing digital archives describe digital records without common standards or guidelines, and the metadata of individual records are expressed as text in html documents without explicitly expressing their structure and meaning. therefore, digital records that exist in different digital archives cannot be connected even if they share the same context, such as subject, event, person, or institution. this study proposed a common metadata model for the descriptive metadata of digital records and constructed a knowledge graph in which digital records are semantically interlinked. furthermore, the fair data maturity model was used to evaluate the constructed knowledge graph.
the constructed knowledge graph semantically defines the relationships between the various entities included in the records of japanese military “comfort women.” in particular, records held in distributed digital archives are expressed as objects that can be identified on the web, so that different records can be explored at a semantic level. the knowledge model proposed herein is the first attempt to describe digital records related to japanese military “comfort women”; thus, it can serve as a starting point for discussing a comprehensive model for describing fragmented digital records worldwide. we also apply an open license to disclose all the collected records and the constructed knowledge graphs for further collaboration.

however, there are also considerations for the construction and management of high-quality digital records. first, the records must contain accurate and rich semantic information. the collected digital archives have an average of 16 metadata elements, but because the metadata elements and values differ among institutions, the data accuracy needs to be improved. second, it is necessary to clearly state the conditions for the use of records. most records do not provide a clear license or terms of use. it is important to explicitly express and provide international or korean standard licenses for digital resources. finally, it is necessary to discuss the records of japanese military “comfort women” using open data. opening the records can facilitate both the sharing of records and the exchange of information between domestic and international scholars, and it can also play a significant role in the long-term preservation of records.
as a majority of records are fragmented and difficult to discover and manage, it is necessary to find an effective method to preserve the records by opening and sharing them and to lead research cooperation at home and abroad.

enhancing opac records for discover | griffis and ford 191

patrick griffis and cyrus ford

enhancing opac records for discovery

this article proposes adding keywords and descriptors to the catalog records of electronic databases and media items to enhance their discovery. the authors contend that subject liaisons can add value to opac records and enhance discovery of electronic databases and media items by providing searchable keywords and resource descriptions.
the authors provide an examination of opac records at their own library, which illustrates the disparity in useful keywords and descriptions within the notes field for media item records versus electronic database records. the authors outline methods for identifying useful keywords for indexing opac records of electronic databases. also included is an analysis of the advantages of using encore’s community tag and community review features to allow subject liaisons to work directly in the catalog instead of collaborating with cataloging staff.

at the university of nevada las vegas (unlv) libraries’ discovery mini-conference, a wide range of initiatives and ideas was presented. some were large-scale initiatives that focused on designing search platforms and systems as well as information architecture schemas that would enhance library resource discovery. but there was not much focus on enhancing the representation of library resources within the construct of bibliographic records in the opac. since searching platforms can only be as useful as the information available for searching, and since opac records are the method for representing the majority of library resources, we thought it important that the prominence of opac records and how they represent library resources be considered in the mini-conference. to that end, our presentation focused on enhancing the opac records for nonbook items to support their discoverability as opposed to focusing on search systems and information architecture schemas.

our proposition was that subject liaisons’ expertise could be used to enhance opac records by including their own keyword search terms and descriptive summaries in opac records for electronic databases as well as records of media items. this proposition acts as a moderate approach to initiatives that call for opac records to be opened for user-generated content, in that it provides subject liaison mediation and expertise to modify records.
as such, this approach may serve as an effective stopgap in cases where there is resistance toward permitting social tagging and user descriptions within opac records. such an initiative also is scalable, allowing liaisons to provide as few or as many terms as they want. such an initiative would require collaboration between cataloging staff and subject liaisons.

patrick griffis (patrick.griffis@unlv.edu) is business librarian and cyrus ford (cyrus.ford@unlv.edu) is special formats catalog librarian, university of nevada las vegas libraries.

192 information technology and libraries | december 2009

disparity between media and database records

at unlv libraries, terms included in the notes fields of bibliographic records are indexed for keyword searching. in the case of media items, there is extensive use of notes to include descriptive terms that enhance discoverability for users. for example, notes for films indicate any awards the film has won as well as festivals in which it has been featured (see figure 1). as a result, users can discover films through keyword searches of film awards or film festivals. a film student who searches for “cannes film festival” by keyword will retrieve results that include films owned by unlv libraries that have been featured at that festival. these keyword-searchable notes add value and discoverability for this type of material, and subject liaisons can be a source for such information.

figure 1. the notes field in an opac record of a film item

while it appears that notes in media records are heavily populated with a variety of user-centric information, there is relatively little use of descriptive notes for electronic databases (see figure 2). for databases, notes traditionally include information about access restrictions and mode of access while overlooking information representing the content of the resource.
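the way keyword-indexed notes support discovery can be sketched in a few lines of python. this is a minimal illustration, not a description of the unlv catalog software: the record identifiers, note texts, and function names are hypothetical. each word in a notes field points back to the records that contain it, and a search returns the records whose notes contain every query word.

```python
import re
from collections import defaultdict

# Hypothetical catalog records: record id -> text of the notes field.
NOTES = {
    "film-001": "Winner, Palme d'Or, Cannes Film Festival.",
    "film-002": "Featured at the Sundance Film Festival.",
    "db-001": "Access restricted to campus users.",
}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_index(notes):
    """Map every word in a notes field to the set of records containing it."""
    index = defaultdict(set)
    for record_id, text in notes.items():
        for word in tokenize(text):
            index[word].add(record_id)
    return index


def keyword_search(index, query):
    """Return ids of records whose notes contain every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


INDEX = build_index(NOTES)
```

in this toy data the database record is found only through access-related words, which mirrors the gap described above: a term that never appears in the notes cannot be matched by a keyword search.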
these fields could be utilized for specific terms relating to database content not adequately covered by the library of congress subject headings (lcsh). subject liaisons have specialized knowledge of which databases work best for unique content areas, class assignments, and information needs. this user-centric knowledge could be used to enhance database discovery if liaisons were to provide catalogers with information and descriptors to add to the record. as an example, at unlv libraries there is one particular database that provides a strengths, weaknesses, opportunities, and threats (swot) analysis for companies, but that natural language term isn’t found anywhere in the general database summary listing or subject headings. if it were added to a note field as part of a description or as a labeled descriptor, then students could easily find this database to complete their assignments.

this proposal is scalable, allowing liaisons to provide as few or as many key terms as they want, depending on their preference or on the vagaries of a particular database. subject liaisons could opt to add a few major terms from their own knowledge and expertise that they feel will add value for patrons searching the opac. subject liaisons also could mine the index and thesaurus terms of individual databases to identify prominent content areas and find useful keywords.

mining electronic database index descriptors

electronic databases typically have subject matter taxonomies developed by experts who assign descriptors to journal articles. subject liaisons could mine these taxonomies to identify predominant descriptors for individual databases to add to the database catalog records. predominance of a subject descriptor could be determined by examining the relative number of articles that are assigned to that descriptor.
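one way to make “predominance” concrete is to compute, for each descriptor, the share of sampled articles that carry it. the python sketch below assumes hypothetical descriptor lists and an arbitrary 50 percent threshold; neither comes from the article, and a liaison would choose a sample and cutoff suited to the database at hand.

```python
from collections import Counter

# Hypothetical sample: each article is represented by its descriptor list.
ARTICLES = [
    ["swot analysis", "company profiles"],
    ["swot analysis", "market share"],
    ["company profiles"],
    ["swot analysis"],
]


def descriptor_shares(articles):
    """Map each descriptor to the share of articles that carry it."""
    # set() guards against a descriptor listed twice on the same article.
    counts = Counter(d for descriptors in articles for d in set(descriptors))
    total = len(articles)
    return {d: n / total for d, n in counts.items()}


def predominant(articles, min_share=0.5):
    """List descriptors at or above min_share, most frequent first."""
    shares = descriptor_shares(articles)
    ranked = sorted(shares.items(), key=lambda item: (-item[1], item[0]))
    return [d for d, share in ranked if share >= min_share]


top_terms = predominant(ARTICLES)
```

the same counting could be applied to the articles authored by a research group, with the resulting list serving as candidate keywords for the catalog record.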
such a strategy of indexing key predominant subject descriptors identified from database subject matter taxonomies could serve to uncover unique content areas not served by lcsh. a different application of this strategy could be employed for identifying predominant and emerging research areas for particular groups. subject liaisons could conduct a citation analysis of articles authored by members of a particular research group to record and codify the subject descriptors of each article. once codified, an analysis could determine the most predominant subject descriptors for articles authored by that particular group. this could serve as a baseline for identifying emerging research areas and their terms. both types of analysis have the potential to provide useful keyword terms for database records.

figure 2. the notes field in an opac record of an electronic database

using encore’s community features

in 2008, unlv libraries purchased and implemented innovative interfaces’ encore discovery platform, which provides a google-like interface for searching the public catalog and the ability to narrow results using facets such as location, year, language, and format. encore also includes many display features that showcase the information provided in the bibliographic records. two of encore’s web 2.0 features provide users with the ability to contribute data to records via community tags and community reviews. unlv requires users to enter a valid library barcode number and pin to contribute. subject liaisons could use the community reviews feature to add descriptive summaries of items to encore records independently, without the need for cataloging staff to edit a marc record. however, the content of community reviews is not indexed for searching and thus only adds value at the point when a user is determining whether the resources they have retrieved are valuable for them. on the other hand, if a community tag is added to an item, that tag is included in the community tags section of the encore result display and becomes an indexed keyword for searches in encore (see figure 3). if that tag term is searched in encore’s keyword search, the bibliographic record attached to that tag term will be included in the results list under the community tags facet. since these community tags are searchable, subject liaisons can add keywords to encore records without collaboration with cataloging staff. however, this provides limited success because the keyword is included and indexed only in encore records, not in the opac records. also, the community tags facet must be selected from the results display for the encore record tags to be searchable.

the case for collaboration

as described above, keywords and descriptions added by subject liaisons into encore records have inherent discovery limitations when compared to a cataloger adding the same information directly to the marc bibliographic record. the advantages of collaboration between subject liaisons and catalogers are clear, and subject librarians at unlv libraries have experienced similar collaborative efforts in the past. in 2006, subject librarians at unlv libraries were offered the opportunity to create their own descriptions of electronic resources through an initiative to update the summary descriptions for the electronic databases portion of the libraries’ website. at that time, all existing electronic database summaries were those used by the database publishers. the project provided subject liaisons the option to create custom summary descriptions to represent electronic databases in their own terms. each subject liaison had a document file for their descriptions, and the website editors used them to update the electronic databases list on the libraries’ website.
this particular initiative serves as one example of the willingness of subject liaisons to share their subject expertise to enhance the representation of library resources through collaboration with technical services staff. as such, collaboration between subject liaisons and catalogers to allow liaisons to add terms to opac records of electronic databases and media items could prove to be both effective and feasible as an initiative toward enhancing the discovery of library resources.

figure 3. encore community tag
creation of computer input in an expanded character set

donald v. black: system development corporation, santa monica, california (formerly, university of california, santa cruz, calif.)

keypunching of an expanded character set for library catalog data is described. the set included 101 different characters. source documents were shelf list cards, the master record at the university of california library, santa cruz. at the end of february, 1967, some 50 million characters, representing more than 110,000 separate titles, had been punched. some of the considerations leading to the adoption of this method for the creation of machine readable input are given, along with details on costs and production rates.

for manipulation by a computer, data must be converted to machine readable form. there are still only a few reasonably flexible means of creating machine readable records, especially if the data include an expanded character set. five possible methods utilize one of the following: standard keypunch, paper tape-producing typewriter, optical character reader, keyboard device that encodes directly onto magnetic tape, or a keyboard terminal that inputs directly into a computer. descriptions of some of these methods are available in the literature. the johns hopkins university (1) used optical character recognition which can handle a full alphanumeric representation, whereas southern illinois (2) used mark sense scanning to convert only a limited amount of information. cartwright (3) and ibm (4) discuss direct computer input from a keyboard terminal. buckland (5) discusses the use of the paper tape-producing typewriter. hammer (6) and kilgour (7) discuss keypunching. patrick (8) discusses several methods of conversion, but only in the abstract. chapin (9) presents the results of a comparison test of the first three methods above.

this paper does not discuss the relative merits of these methods, but presents the details of a system that has converted approximately 50 million characters of library catalog data on more than 110,000 titles to machine language with a set of 101 characters.

the university of california at santa cruz is one of three university campuses recently established by the state. it opened for business in the fall of 1965 with a core collection of some 55,000 titles in approximately 75,000 volumes. early in the operation of the library, it was decided to mechanize as much as possible; therefore the existing catalog records had to be converted if the original collection were to be a part of the future machine system. the creation of the core collection for the three new campuses of the university of california has been described in the literature (10).

methods

bids were sought to convert the catalog records during the summer of 1965.
the shelf list record produced by the new campuses' project was the master record and was to be the source for conversion. unfortunately, the shelf list consisted of both printed library of congress cards and cards produced at the new campuses' project from typewritten multilith masters. no editing was to be done on the shelf list cards. the only addition was the stamping of an arbitrary number using a five-digit automatic numbering machine, the purpose of the number being to keep individual punch cards together for each entry.

weighing the responses to the request for bids was a disheartening experience. only four responses were received from a total of 15 requests sent out. the bid request did not specify the method to be used to convert to machine readable form, but only the resulting machine readable record. since the specifications had used punch cards as an example, perhaps this limited the thinking of some of the organizations involved, with the result that they did not choose to bid.

three bids were based on keypunching. one was from florida and the complexities of the task made the choice of such a distant company impossible. if problems had arisen during the course of the conversion, travel costs would have been excessive. another response estimated the cost to be about $1.50 per record. clearly, this was too costly, and since bids of this nature are apt to be conservative in the matter of ultimate total costs, we felt the choice of such an organization to do the job would, indeed, result in a figure that would be too high.

only one bid used optical scanning as the method of conversion. unfortunately, the bid was for the scanning only, and library staff members would have had to retype the records for the scanner. since the cost of the scanning alone was close to 30¢ a title, that bid was also rejected as being ultimately more costly.

choice of a keypunching service in san francisco was made on the basis of its proximity to santa cruz, on the enthusiasm of the bidder for the task to be undertaken, and on a reasonable cost estimate. the bidder completed the task in slightly more than three months. verification was done on an ibm/056 verifier during this mass conversion.

the expanded character set employed is shown in figure 1. it was adopted because it was available on an ibm/1401 computer at the los angeles campus of the university (ucla). at that time, it was the only one with such a printer on the west coast. the character set had been selected by librarians at ucla from characters offered by ibm in the summer of 1964 for the 1403 printer.

[figure 1. expanded character set]

as figure 1 shows, digits and lower-case letters are punched as usual; upper-case letters are preceded by the word-separator character (0-5-8). coding for the special characters is described in the tables. there are two minus signs: the one printed from an 11-punch is below the center of the character; obtaining a centered minus requires a multiple punch (11. ). the underscore prints in a space by itself, just as do other characters. it requires special programming to overprint this character by suppressing paper spacing. the virgule overprint requires two columns to punch. sharp-eyed readers will notice that the virgule appears twice in figure 1, and it has been counted twice for the total of 101 characters. the blank has also been counted as a character, but the black square, which was not used at santa cruz, was not counted.

all data elements were encoded in fixed card fields; that is, the field for each type of information had a fixed length, generally 300 characters. it was not necessary, however, to use the entire field or to fill it with zeros or other codes. no terminating characters were used to separate the fields. each type of information was included on one or more cards bearing a code which would tell the computer precisely what type of information the card(s) contained. all of this is illustrated in table 1.

there are basically two ways that information can be encoded into cards. this is discussed in references (3) and (6) especially. to use a completely variable format it is necessary to have field delimiting codes. if a fixed sequence of data elements is established (e.g., author, title, publisher, etc.), one code will suffice to separate each field. fixed sequence necessitates, however, making provision for establishing sequence for every possible data element that can occur in a catalog entry; and if one or more elements do not occur for some specific entry, then the keypunch operator must remember to add the required field-delimiting code for every data element not present, and to add the code in the proper sequence. if a number of individual codes are to be used to delimit fields, it then becomes difficult to find codes that are not otherwise used for legitimate data.

journal of library automation vol. 1/2 june, 1968

table 1. punch card input format

field               cols.    comments

shelf key card, no. 000
call no.            1-20     as desired
year                21-24    year of publication
copy                25-26    (blank or 01-99)
series              27       (either alphabetic or numeric)
volume no.          28-30    (blank or 001-999)
part no.            31-32    (blank or 01-99)
donor no.           33-36    no. of donor on gift list (may be blank)
date rec'd          37-40    month, year received at library
location            41-42    alpha code designating location on campus
type                43       0=book, 1=serial, 2=reference, 3=gov't pub., 4=see auth., 5=see subj., 6=see also subj.
language            44-47    from 1 to 4 one-letter codes indicating lang.
suppress            48       if "s", entry appears on shelf list only
                    49-69    unused
card no.            70-72    must be '000'
accession no.       73-80    8-digit no. which sequences a batch of accessions in call no. sequence (generally only final five digits are used)

personal author, no. 100-104 (limit: 0-5 cards) [a]
author              1-60     name of author, left justified
                    61-68    unused
special code        69       see table 2
card no.            70-72    100 through 104
accession no.       73-80    same as shelf key card

corporate author, no. 110-119 (limit: 1 or 2 cards per author; 0-5 authors) [a]
corporate author    1-60     may be continued on second card
                    61-67    unused
cont. indicator     68       is '-' if author is cont. on second card
special code        69       see table 2
card no.            70-72    110-119
accession no.       73-80    same as shelf key card

title card, no. 200-224 (limits: 1-5 cards per title; 1-5 titles)
title               1-60     may be cont. on up to 4 additional cards
                    61-67    unused
cont. indicator     68       is '-' if continued on next card
special code        69       see table 3
card no.            70-72    200-224
accession no.       73-80    same as shelf key card

publisher/source card, no. 300-305 (limit: 1-2 cards per publ./source, 3 publ./sources total)
publ./source        1-60     may be continued on a second card
                    61-67    unused
cont. indicator     68       is '-' if publ./source is cont. on next card
special code        69       p=publisher, s=source
card no.            70-72    300-305
accession no.       73-80    same as shelf key card

collation card, no. 400 (1 card only)
collation           1-40     as desired
                    41-69    unused
card no.            70-72    400
accession no.       73-80    same as shelf key card

commentary card, no. 500-509 (limit: 1-5 cards per comment; 2 commentaries, 1 of each type) [b]
commentary          1-60     may be cont. on up to 4 more cards
                    61-67    unused
cont. indicator     68       is '-' if commentary cont. on next card
special code        69       's' for commentary to appear on shelf list only
card no.            70-72    500-509
accession no.       73-80    same as shelf key card

subject card, no. 600-604 (limit: 1 card per subject; 5 subjects)
subject             1-60     as desired
                    61-68    unused
special code        69       may be used to indicate level of subject [c]
card no.            70-72    600-604
accession no.       73-80    same as shelf key card

[a] the first author to be processed by the computer is considered the main author. the main author appears on all catalogs and is represented in the title by "00".
[b] if 2 commentary entries are used, at least 1 must have an 's' code.
[c] the special code is ignored by the system at the present time.
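the column layout in table 1 can be made concrete with a short sketch. the python below is only an illustration of how an 80-column card image for the text-bearing cards (numbers 100 to 604) might be split into its fields; it is not the program used at santa cruz. the names `parse_card`, `decode_shift`, and `Card` are hypothetical, the shelf key card (000) is deliberately left out because it packs many short fields into columns 1-48, and the `$` character is an assumed stand-in for the 0-5-8 word-separator punch that precedes an upper-case letter.

```python
from dataclasses import dataclass

SHIFT = "$"  # assumption: printable stand-in for the 0-5-8 shift punch

# card-number ranges for the text-bearing cards listed in table 1
CARD_TYPES = [
    (100, 104, "personal author"),
    (110, 119, "corporate author"),
    (200, 224, "title"),
    (300, 305, "publisher/source"),
    (400, 400, "collation"),
    (500, 509, "commentary"),
    (600, 604, "subject"),
]


@dataclass
class Card:
    kind: str
    text: str
    continued: bool
    special_code: str
    card_no: int
    accession_no: str


def decode_shift(field: str, shift: str = SHIFT) -> str:
    """Upper-case the single character that follows each shift code."""
    out = []
    upper_next = False
    for ch in field:
        if ch == shift:
            upper_next = True
            continue
        out.append(ch.upper() if upper_next else ch)
        upper_next = False
    return "".join(out)


def card_kind(card_no: int) -> str:
    """Map a card number (cols 70-72) to its card type."""
    for low, high, name in CARD_TYPES:
        if low <= card_no <= high:
            return name
    raise ValueError(f"unsupported card number {card_no}")


def parse_card(raw: str) -> Card:
    """Split one 80-column card image into the fields of table 1."""
    card = raw.ljust(80)[:80]          # pad short images to 80 columns
    card_no = int(card[69:72])         # cols 70-72
    return Card(
        kind=card_kind(card_no),
        text=decode_shift(card[0:60].rstrip()),  # cols 1-60
        continued=card[67] == "-",     # col 68, continuation indicator
        special_code=card[68].strip(), # col 69
        card_no=card_no,
        accession_no=card[72:80],      # cols 73-80
    )
```

because every card carries its own type number and accession number in the last eleven columns, a deck can be sorted back into order and grouped by entry without any field-delimiting codes, which is the property the fixed-field design relies on.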
it seemed easier to input with fixed fields and simply waste a few card columns. that is, if a particular field ends anywhere in the body of the card before the final column (for example, 60), the operator simply stops and feeds in a new card. it is possible that a particular card may have only one character of data on it, in addition to a card-sequence number and item-type number. in practice it would seem that the loss of time in card feeding is not significant, and blank hollerith cards are very cheap indeed.

training for the mass conversion effort at the keypunch service in san francisco proved relatively easy. an operator's guide was produced showing the codes and conventions for each data element found on the typical catalog card. only the shelf list card was used for the conversion. tables 2 and 3 show the various elements that were coded. there were two forms of shelf list cards, as mentioned above: library of congress cards and cards produced by the new campuses' program at san diego. on library of congress cards everything was encoded except roman numeral pagination and size information, and the information at the bottom of the card: the call number used in the library of congress itself, the dewey number, the lc card number, and the name of the originating library if any. on the home-made cards, roman numeral pagination and size were not encoded. everything in roman characters was punched. cards with only a small amount of information in roman characters had a legend punched, "for complete entry see shelf list." it is simply not possible in a short article to give all the fine points of conversion. rules for all contingencies were devised and most proved easy to follow.

twenty operators, working in two shifts of ten each, converted the 55,000 titles that existed in june, 1965, in about three months' working time.
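the fixed-field convention described at the start of this passage (stop where the data end, feed a new card, and let the card number carry the sequence) can be sketched from the writing side as well. the helper below is hypothetical and makes two assumptions that the article does not state: that a long field is cut exactly at column 60 rather than at a word boundary, and that a short accession number is padded with leading zeros. shift codes, where present, would count toward the 60 columns.

```python
from typing import List

TEXT_WIDTH = 60  # cols 1-60 carry the data on text-bearing cards


def build_cards(text: str, first_card_no: int, accession_no: str,
                max_cards: int = 5) -> List[str]:
    """Split one field across continuation cards (illustrative only)."""
    if len(accession_no) > 8:
        raise ValueError("accession number is limited to 8 digits")
    if len(text) > TEXT_WIDTH * max_cards:
        raise ValueError("field does not fit on the allowed number of cards")

    # an empty field still produces a single card
    chunks = [text[i:i + TEXT_WIDTH]
              for i in range(0, len(text), TEXT_WIDTH)] or [""]

    cards = []
    for offset, chunk in enumerate(chunks):
        continued = offset < len(chunks) - 1
        cards.append(
            chunk.ljust(TEXT_WIDTH)             # cols 1-60, left justified
            + " " * 7                           # cols 61-67, unused
            + ("-" if continued else " ")       # col 68, continuation indicator
            + " "                               # col 69, special code left blank
            + f"{first_card_no + offset:03d}"   # cols 70-72, card number
            + accession_no.rjust(8, "0")        # cols 73-80, accession number
        )
    return cards
```

a 130-character title starting at card number 200 would come out as three cards numbered 200, 201, and 202, with the continuation indicator set on the first two. the unused columns on the last card are the "few card columns" the text accepts wasting.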
all data elements to be used later for sorting purposes on the computer were key verified, but for the first month of the conversion the entire record for each title was verified.

beginning in december, 1965, the library at santa cruz began keypunching operations. after a training period of a week and operational experience of four months, the local operators achieved a rate of 7,000 to 8,000 keystrokes per hour, with a net error rate of only 12 errors in approximately 24,000 keystrokes; that is, the operators recognized a number of errors and corrected them at the time of initial punching. the 12 remaining errors should be caught during proofreading, which we substituted for key verification in the ongoing production system. it was felt desirable to combine proofreading for transcription accuracy with the typical library practice known as "revision," which implies that the catalog copy be reviewed for content as well as accuracy. this is true even for text taken from library of congress catalog copy. elements such as the form of entry, the form of series note if any, number of subject headings and form, etc., are all reviewed by a cataloger other than the one who initially prepared the copy. proofreading and revision was done from a printout on a line printer having only 64 characters available in the character set. this number of characters suffices, however, since there are only 64 usable card codes, i.e., the pattern of holes in each column of a hollerith card. there are only 64 valid combinations which can be read by the computer equipment. as illustrated in figure 1, some characters with diacriticals require three punched columns to produce one character in the ultimate printout.

table 2. special codes for authors

code   meaning          notation

type 1: to create added notation on author catalog:
       joint author     joint auth.
       compiler         comp.
e      editor           ed.
g      joint editor     joint ed.
i      illustrator      illus.
p      publisher        publ.
t      translator       trans.

type 2: to specify a substitute sort key:
x      use this author as a substitute sort key for previous author. previous author will appear on appropriate catalog but this author will not.

table 3. special codes for titles

code   meaning
x      suppress listing this title in title catalog
t      title is a transliterated title
s      title is a series title
p      partial title
d      standard title or conventional title

in all cases, the first title encountered when processing a given entry will be the only title which appears in the author and subject catalogs.

results

for the mass conversion which took approximately three months during the summer of 1965, the total cost was slightly less than $34,000, or approximately 60¢ per title. in a discussion of the project, after its conclusion, with the two supervisors of the service bureau in san francisco, it was agreed that in all likelihood the service bureau operators had just reached peak efficiency about the time the project terminated. that is, had the project continued, the cost per title would have decreased. from work records maintained by the service bureau, it was apparent that the first two months of the project was a learning period, as the output of the operators rose continually during that period of time. during the last month the productivity curve leveled off considerably.

table 4 shows the cost of the production operation established in santa cruz. the costs used are somewhat arbitrary. for example, the "key punch operator" classification at the university had six steps. in producing an average cost should the actual rate being earned by the keypunch operators be used, or the beginning rate, or some other? the amount used in table 4 represents an average of the pay being received by the 2.3 full-time equivalent operators, rounded upward to an even amount. hollerith card costs can also vary slightly. table 4 uses $1.00 per thousand as a reasonable price and one which could probably be obtained anywhere in the country. costs are based on rates obtaining in february, 1967.

table 4. cost per title to produce machine readable catalog data

keypunch rental, $65.00/mo. (one-shift operation)                            $ .026
keypunch operator, $2.10/hr. + 20% overhead                                    .168
blank hollerith cards, $1.00/thousand                                          .009
machine listing of cards for proofreading (printing at 390 cards per minute)   .002
proofreading, $5.00/hr., 120 titles/hr.                                        .042
correction of errors                                                           .020
total                                                                        $ .267

discussion

there are, of course, hidden costs in the ongoing production at santa cruz that are difficult to fix because the university does not charge for them. for example, there is the cost of space occupied by keypunch operators and by equipment, the cost of air conditioning and electrical supply, the cost of adding internal partitions, doors, etc. spread over a yearly total of some 30,000 new titles, these unknown costs and the additional costs of supervision could not be very great per title, assuming that the rate of keypunching production remains relatively constant. however, labor costs may prove to be a key factor in some geographic areas. in santa cruz good keypunch operators were available at a reasonable cost, but in large metropolitan areas this may not be true. since the operator cost is over 60 percent of the total per title, it obviously can be a critical factor.

what happens after catalog copy is converted to machine readable form by punch cards or any other method, depends on computer equipment available and the programs written to process the catalog card data. this is an easy statement to make, yet the road to the production of either catalog cards or book-form catalogs is not easy.
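the per-title figures in table 4 can be checked with a few lines of arithmetic. the sketch below takes the six cost lines and the stated rates from the table; the derived quantities (titles per operator hour, cards per title, titles per keypunch month) are back-calculations made here for illustration and are not figures reported in the article.

```python
# per-title costs in dollars, as listed in table 4
COSTS_PER_TITLE = {
    "keypunch rental": 0.026,
    "keypunch operator": 0.168,
    "blank hollerith cards": 0.009,
    "machine listing for proofreading": 0.002,
    "proofreading": 0.042,
    "correction of errors": 0.020,
}

# the six lines should add up to the printed total
total = round(sum(COSTS_PER_TITLE.values()), 3)

# share of the total taken by the keypunch operator
operator_share = COSTS_PER_TITLE["keypunch operator"] / total

# proofreading: $5.00 per hour at 120 titles per hour
proofreading_per_title = round(5.00 / 120, 3)

# operator: $2.10 per hour plus 20% overhead
operator_hourly = 2.10 * 1.20
implied_titles_per_operator_hour = (
    operator_hourly / COSTS_PER_TITLE["keypunch operator"]
)

# cards: $1.00 per thousand
implied_cards_per_title = COSTS_PER_TITLE["blank hollerith cards"] / (1.00 / 1000)

# rental: $65.00 per month
implied_titles_per_month = 65.00 / COSTS_PER_TITLE["keypunch rental"]

if __name__ == "__main__":
    print(f"total per title:          ${total:.3f}")
    print(f"operator share of total:  {operator_share:.1%}")
    print(f"titles per operator hour: {implied_titles_per_operator_hour:.1f}")
    print(f"cards per title:          {implied_cards_per_title:.1f}")
    print(f"titles per rental month:  {implied_titles_per_month:.0f}")
```

under these assumptions the lines sum to the printed $.267, the operator accounts for a little under 63 percent of it (in line with "over 60 percent"), and the rental line implies about 2,500 titles per machine-month, which agrees with the yearly total of some 30,000 titles if a single keypunch is assumed.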
even the data themselves can cause problems. for example, the reader will note in figure 1 that the character used to indicate an up-shift for capital letters has a card code of 0-5-8 fixed by the manufacturer, ibm, and necessary to print with the expanded character set chain. in retrospect it would have been better to convert the final code configurations as part of a computer processing step than to punch them from the beginning. the 0-5-8 shift code is a special code used within the 1400 series computers, and is known as a word mark or word separator. in normal operation of the 1401 computer these marks are used to delimit fields within the memory of the machine. certain program commands use these marks to detect when the beginning or the end of a field has been reached. use of such a code in the data can raise havoc with a program unless the programmer is constantly alert to the problem and takes great pains to circumvent it. some other code such as the $ sign might have been used and then converted, prior to the final run on the 1401, as part of the computer processing to the code needed for printing purposes.

while the number of articles on catalog conversion has not yet been overwhelming, it is apparent that there is a great deal of interest in the field. one might ask, "should every library proceed to convert its own catalog?" degennaro has addressed this problem (11) and the reader is referred to that discussion. perhaps the question has no ideal answer. it does seem unfortunate, however, that the pre-1955 national union catalog is not to be published from a record that is machine readable. it would seem possible, however, that in the future methods could be devised to use machine readable records produced by the larger libraries and some procedure whereby the smaller libraries could check their holdings against those of the larger libraries.
by some fairly simple method, a subset of the master machine records could be selected for use in the catalogs of smaller libraries.

the santa cruz project began before the library of congress had announced results of preliminary plans for the marc project (12). to a certain extent the catalog record at santa cruz could be converted into the marc format, although marc goes far deeper in coding discrete elements of data within the catalog record than does santa cruz. however, to the extent that discrete data elements are encoded and identified properly in the machine record, any catalog format can be transformed into any other catalog format by the computer. the key is the proper identification of each data element.

acknowledgment

the author wishes to thank his colleagues at system development corporation, in particular mrs. ann luke, for helpful comments on this paper.

references

1. the johns hopkins university. the milton s. eisenhower library: progress report on an operations research and systems engineering study of a university library (baltimore: johns hopkins, 196-?).
2. southern illinois university. office of systems and procedures: automated circulation control system for the delyte w. morris library; the system and its progress in brief (carbondale, ill.: southern illinois university, 1963).
3. cartwright, kelley l.; shoffner, ralph m.: catalogs in book form (berkeley: institute of library research, 1967).
4. international business machines. federal systems division: report on pilot project for converting the pre-1952 national union catalog to a machine readable record (rockville, maryland: ibm, 1965).
5. buckland, l. f.: recording of library of congress bibliographical data in machine form rev. ed. (washington, d. c.: council on library resources, 1965).
6. hammer, donald p.: "problems in the conversion of bibliographical data-a keypunching experiment," american documentation, 19 (january 1968), 12-17.
7. kilgour, frederick g.: "development of computerization of catalogs in medical and scientific libraries," clinic on library applications of data processing, university of illinois, 2nd, 1964, proceedings (champaign: illini union bookstore, 1965), p. 25-35.
8. patrick, robert l.; black, donald v.: "index files; their loading and organization for use," libraries and automation; proceedings of the conference on libraries and automation, airlie foundation, warrenton, virginia, may 26-30, 1963 (washington, d. c.: library of congress, 1964), p. 29-53.
9. chapin, richard e.; pretzer, dale h.: "comparative costs of converting shelf list records to machine-readable form," journal of library automation, 1 (march 1968), 66-74.
10. voigt, melvin j.; treyz, joseph h.: "new campuses program (ucsd, uci, and ucsc)," library journal, 90 (may 15, 1965), p. 2204-08.
11. degennaro, r. a.: "a strategy for the conversion of research library catalogs to machine readable form," college & research libraries, 28 (july 1967), p. 253-257.
12. u. s. library of congress, information systems office: a preliminary report on the marc (machine-readable catalog) pilot project (washington, d. c.: library of congress, 1966).

costs of library catalog cards produced by computer

frederick g. kilgour, ohio college library center, columbus, ohio

production costs of 79,831 cards are analyzed. cards were produced by four variants of the columbia-harvard-yale procedure employing an ibm 870 document writer and an ibm 1401 computer. costs per card ranged from 8.8 to 9.8 cents for completed cards.

early in september, 1964, the yale medical library put into routine operation the columbia-harvard-yale computerized technique for catalog card manufacture (1), and during the following three years yale produced over 87,000 cards. the principal objective of the chy project was an on-line, computerized, bibliographic information retrieval system. however, the route selected for attaining the objective included manufacture of cards from machine readable data to keep up the manual catalog while machine readable records were being inexpensively accumulated for computerized subject retrieval. catalog cards were only one product of the system, but their production was designed to be as efficient as possible within constraints of the system. nevertheless, this paper will examine chy card production costs as though this segment of the system were an isolated procedure, yielding but one product, as is the case in classical library procedures. costing will disregard other benefits, such as accession lists and machine readable data produced for little, or no, additional expense.

the columbia medical library and harvard medical library also installed ibm 870 document writers and tested the programs for card production, but neither library routinely produced cards. however, co-

metadata provenance and vulnerability

timothy robert hart and denise de vries

information technology and libraries | december 2017

timothy robert hart (tim.hart@flinders.edu.au) is phd researcher and denise de vries (denise.devries@flinders.edu.au) is lecturer of computer science, college of science and engineering, flinders university, adelaide, australia.

abstract

the preservation of digital objects has become an urgent task in recent years as it has been realised that digital media have a short life span. the pace of technological change makes accessing these media increasingly difficult. digital preservation is primarily accomplished by two main methods, migration and emulation. migration has been proven to be a lossy method for many types of digital objects.
emulation is much more complex; however, it allows preserved digital objects to be rendered in their original format, which is especially important for complex types such as those comprising multiple dynamic files. both methods rely on good metadata to maintain change history or construct an accurate representation of the required system environment. in this paper, we present our findings that show the vulnerability of metadata and how easily they can be lost and corrupted by everyday use. furthermore, this paper aspires to raise awareness and to emphasise the necessity of caution and expertise when handling digital data by highlighting the importance of provenance metadata. introduction unesco recognised digital heritage in its “charter on the preservation of digital heritage,” adopted in 2003, stating, “the digital heritage consists of unique resources of human knowledge and expression. it embraces cultural, educational, scientific and administrative resources, as well as technical, legal, medical and other kinds of information created digitally, or converted into digital form from existing analogue resources. where resources are ‘born digital’, there is no other format but the digital object.” 1 born-digital objects are at risk of degradation, corruption, loss of data, and becoming inaccessible. we combat this through digital preservation to ensure they remain accessible and useable. the two main approaches to preservation are migration and emulation. migration involves migrating digital objects to a different and currently supported file type. emulation involves replicating a digital environment in which the digital object can be accessed in its original format. both methods have advantages and disadvantages. migration is the more common method because it is simpler than emulation and the risks can often be neglected. these risks include potential data loss or change, in which the effects are permanent. 
emulation is complex, but it offers the better means to access preserved objects, especially complex file types comprising multiple dynamic files that must be constructed correctly. emulation also allows users to handle digital objects with a “look and feel” as close as possible to that originally intended. 2

https://doi.org/10.6017/ital.v36i4.10146

accurate and complete metadata are central to both migration and emulation; thus, they are the focus of this paper. metadata are needed to record the migration history of a digital object and to record contextual information. they are also necessary to accurately render digital objects in emulated environments. emulated environments are designed around a digital object’s dependencies, which typically include, but are not limited to, drivers, software, and hardware. 3 the metadata describe the attributes of the digital object from which we can derive the type of system in which it can run (e.g., the operating system), the versions of any software dependencies, and other criteria that are crucial for accurate creation of an emulated environment.

while metadata are being used to support the preservation of digital objects, there is another equally important role they should be playing. it is not enough to preserve the object so it can be accessed and used in the future. what of the history and provenance of the digital object? what about search and retrieval functionality within the archive or repository the digital object is held in? one must consider how these preserved objects will be used in the future, and by whom. preserving digital objects is difficult if adequate metadata are not present, especially if the item is outdated and no longer supported. looking to the future, we should try to ensure metadata are processed correctly for the lifecycle of the digital object.
this means care must be taken at the time of creation and curation of any digital objects because although some metadata are typically generated automatically, many elements that will play a pivotal role later must be created manually. digital objects also commonly go through many changes, which must be captured because the change history reveals what has happened to the object over its lifecycle. the changes may include how the object has been modified, migrations to different formats, and what software created or changed the object—all of which are considered when emulating an appropriate environment. examples of these changes can be found in the case studies presented in this paper. metadata types the common and more widely used metadata types include, but are not restricted to, administrative, descriptive, structural, technical, transformative, and preservation metadata. each metadata type describes a unique set of characteristics for digital objects. administrative metadata include information on permissions as well as how and when an object was created. transformative metadata include logs of events that have led to changes to a digital object. 4 structural metadata describe the internal structure of an object and any relationships between components. technical metadata describe the digital object with attributes such as height, weight, format, and other technical details. 5 preservation metadata support digital preservation by maintaining authenticity, identity, renderability, understandability, and viability. they are not bound to any one category as they comprise multiple types of metadata, not including descriptive or contextual metadata. however, preservation metadata are distinct from the other metadata types and are often ambiguous.
6 in 2012, the developers of version 2.2 of the premis data dictionary for preservation metadata saw descriptive metadata as less crucial for preserving digital objects; however, they did state it was important for discovery and decision making. 7 while version 2.2 allowed descriptive metadata to be handled externally through existing standards such as dublin core, the latest version (2017) of the dictionary allows for “intellectual entities” to be created within premis that can capture descriptive metadata. 8 thus, while digital preservation does not require all types of metadata, the absence of contextual metadata limits the future possibilities for the preserved object. hart writes that because multimedia objects are dynamic and interactive, and often composed of multiple image, audio, video, and software files, descriptive metadata are increasingly important because they can be used to describe, organise, and package the files. 9 smith and schirling also stress that content description is of great importance because digital objects are not self-describing, which makes identifying semantic-level content difficult; without descriptive metadata, context is lost. 10 for example, without descriptive metadata to provide context, an image’s subject information and search and retrieval functionality are lost. without this information, verifying whether an object is the original, a copy, or a fabricated or fraudulent item is impossible in most cases. information technology and libraries | december 2017 26 metadata vulnerability—case studies digital objects that are currently being created often go through several modifications, making it difficult to identify the original or authentic copy of the object. verifying and validating authenticity is important for preserving, conserving, and archiving objects. the digital preservation coalition defines authenticity as follows: the digital material is what it purports to be.
in the case of electronic records, it refers to the trustworthiness of the electronic record as a record. in the case of “born digital” and digitised materials, it refers to the fact that whatever is being cited is the same as it was when it was first created unless the accompanying metadata indicates any changes. confidence in the authenticity of digital materials over time is particularly crucial owing to the ease with which alterations can be made. 11 tests were undertaken to discover how vulnerable metadata can be in digital files that are subject to change, which can lead to loss, addition, and modification. the tests were conducted using the file types jpeg, pdf, and docx (word 2007). the tests revealed what metadata can be extracted and what metadata could be present in the selected file types. furthermore, they revealed how specific metadata can verify and validate the authenticity of a file such as an image. for each test, the metadata were extracted using exiftool (http://owl.phy.queensu.ca/~phil/exiftool/). alternative browser-based tools were tested and provided similar results; however, exiftool was selected as the primary testing tool because it produced the best results and had the best functionality. some of the files tested provided extensive sets of metadata that are too large to include, but subsets can be found in hart (2009). note that only subsets are included because some metadata was removed for privacy and relevance reasons. 
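the extraction step described above can be sketched in python. this is a minimal sketch under stated assumptions, not the authors' procedure: it assumes exiftool is installed and on the system path and relies on exiftool's -json option for machine-readable output; the helper names (exiftool_command, parse_exiftool_json, extract_metadata) and the sample record are hypothetical.

```python
import json
import subprocess


def exiftool_command(path):
    """Build the command line; -json asks ExifTool for JSON output."""
    return ["exiftool", "-json", path]


def parse_exiftool_json(text):
    """ExifTool's JSON output is a list with one object per file,
    each carrying a "SourceFile" key. Index the records by that key."""
    records = json.loads(text)
    return {record["SourceFile"]: record for record in records}


def extract_metadata(path):
    """Run ExifTool on one file and return its parsed metadata.
    Assumes the exiftool executable is installed and on PATH."""
    result = subprocess.run(
        exiftool_command(path), capture_output=True, text=True, check=True
    )
    return parse_exiftool_json(result.stdout)


# Hypothetical sample output; a real file yields many more tags.
sample = '[{"SourceFile": "photo.jpg", "FileType": "JPEG", "Make": "ExampleCam"}]'
metadata = parse_exiftool_json(sample)
print(metadata["photo.jpg"]["FileType"])
```

keeping the parsing separate from the subprocess call means the same function can be applied to saved exiftool output, which is convenient when the original file is no longer at hand.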
each test was conducted in the following manner: • case study 1—jpeg o original metadata extracted for comparison o image copied, metadata extracted from copy and examined for changes o file uploaded to social media, downloaded from social media, metadata extracted and examined against original • case study 2—jpeg (modified) o original metadata extracted for comparison o image opened and modified in photo editing software (adobe photoshop), metadata extracted from new version and examined against original • case study 3—pdf o basic metadata extraction performed to establish what metadata are typically found in pdf files and what types of metadata could be possible • case study 4—docx o original metadata extracted for comparison o file saved as pdf through microsoft word and metadata compared to original o file converted to pdf through adobe acrobat and metadata compared to original case study 1 this case study investigated two everyday uses of digital files, the first being simply copying a file. copying a file produced an exact copy of the original, with no changes in metadata aside from the creation and modification time/date. thus, the copy could not be distinguished from the original unless the original creation time/date was known. the second everyday use was uploading an image to facebook. the metadata-extraction tests revealed that the original file had approximately 265 metadata elements. (the approximation is caused by the ambiguity of certain elements that may be read as singular or multiple entries.)
these elements included, but were not limited to, the following: • dates • technical metadata • creator/author information • color data • image attributes • creation-tool information • camera data • change • software history many of the metadata elements had useful information for a range of situations. even so, several metadata elements that require user input for creation were missing. once the file had been uploaded to and then downloaded from social media, approximately 203 of the roughly 265 metadata elements were lost, including date, color, creation-tool information, camera data, change, and software history. it can be argued that removing some of this metadata would help keep user information private, but certain metadata should be retained, such as change and software history. these metadata make it easier to differentiate fabricated images from authentic images and to know which modifications have been made to a file. for preservation purposes, the missing metadata may be exactly what is needed to establish authenticity. this case study aims to make users aware of the significant risk of metadata loss when dealing with digital objects. if metadata are not identified and captured before the object is processed within a repository, the loss could be irreversible. case study 2 the second case study revealed how the change and software history metadata can be used to easily identify when a file has been modified. in the test conducted, it was evident by visually comparing the images that changes were made; however, modifications are not always obvious as some changes can be subtle, such as moving an element in the image that completely changes what the image is conveying.
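the before-and-after comparisons used in these case studies can be sketched as a simple difference between two extracted tag sets. this is a minimal sketch, assuming each extraction has already been reduced to a flat dictionary of tag names and values; the function name compare_metadata and the sample tags are hypothetical and far smaller than the sets reported in the case studies.

```python
def compare_metadata(original, derived):
    """Return the tags lost, added, and changed between two extractions."""
    original_tags = set(original)
    derived_tags = set(derived)
    lost = sorted(original_tags - derived_tags)
    added = sorted(derived_tags - original_tags)
    changed = sorted(
        tag for tag in original_tags & derived_tags
        if original[tag] != derived[tag]
    )
    return {"lost": lost, "added": added, "changed": changed}


# Hypothetical tag sets standing in for an original image and the copy
# downloaded again from social media.
original = {
    "FileType": "JPEG",
    "Make": "ExampleCam",
    "Software": "Adobe Photoshop CS4 Windows",
    "HistoryAction": "saved, converted, derived, saved",
    "FileModifyDate": "2010:02:11 22:12:01",
}
downloaded = {
    "FileType": "JPEG",
    "FileModifyDate": "2017:01:01 09:00:00",
}

report = compare_metadata(original, downloaded)
for kind, tags in report.items():
    print(kind, tags)
```

reporting lost, added, and changed tags separately matters because the case studies show all three outcomes: uploading removes elements, editing adds software-specific elements, and copying changes only the dates.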
the following example displays the change history from the image used in case study 1, revealing how the metadata can easily identify modification: • history action—saved, saved, saved, saved, converted, derived, saved • history when—the first saved was at 2010:02:11 21:59:05, the last saved was at 2010:02:11 22:12:01 with each action having its own timestamp • history software agent—adobe photoshop cs4 windows for each action • history parameters—converted from tiff to jpeg further testing was conducted with simple photo manipulation using an original image to see firsthand the issues described in the initial test. the image contained approximately 178 metadata elements, including the typical metadata that were found in the first case study. once the image was processed and modified with adobe photoshop cs5, the metadata were no longer identical. the modified image had approximately 201 metadata elements. the new elements included photoshop-specific data, change, and software history. however, extensive camera data were lost. it can be argued that the camera data are not important for digital preservation because the lack of it will not hinder the preservation process. however, once the file is preserved and those data are lost, important technical and descriptive information can never be regained. for example, consider a spectacular digital image that captures an important moment in history. if that image is preserved for twenty years, in that time cameras and perhaps photography itself will have advanced dramatically. how digital images are captured and processed might be completely different and will most likely provide different results. should someone wish to know how that preserved image was captured, they would need to know what camera was used, lens and shutter speed data, lighting data, and other technical information. 
preserving those metadata can be almost as important as preserving the file itself because each metadata element has importance and meaning to someone. as most viewers of online media are aware, photos are often modified, especially on social media. this is often performed on “selfies,” pictures taken of oneself. these can be modified to make the person in the photo look better or to hide features they see as flawed. small modifications, such as covering some blemishes or improving the lighting, have little effect on the image’s context, but some modifications and manipulations can mislead people. these manipulated images often take the form of viral hoax images circulating around the web. for example, figure 1 displays how two images can be combined into a composite image that changes the context of the image. figure 1. composite image. “photo tampering throughout history,” fourandsix technologies, 2003, http://pth.izitru.com/2003_04_00.html. the two images side by side are original photos taken in basra of a british soldier gesturing to iraqi civilians to take cover. in the right image, the iraqi man is holding a child and seeking help from the soldier; as you can see, this soldier does not interpret this as a hostile act. the image above is a composite of the two that changes the story. in this image, the soldier appears to be responding with hostility toward the man approaching. with basic photo manipulation, this soldier who is protecting innocent civilians is portrayed holding them against their will. images like this circulate through media of all types, and although the exchangeable image file format (exif) metadata may not identify what has been done to the image, it would eliminate any doubt that the image has been modified. unfortunately, these data are not made available.
making users aware of this vulnerability may improve detection of file manipulation at the time of ingest to better ensure only accurate and authentic material is being considered for preservation. donations received by digital repositories such as libraries must be scrutinised by trained individuals. with this awareness and knowledge of metadata, they can perform their duties to a much higher standard. case study 3 the pdf metadata extraction provided interesting results. over a range of tests on academic research papers, the main metadata identified consisted of pdf version, author, creator, creation date, modification date, and xmp (adobe extensible metadata platform) data. these metadata were not present in every pdf tested; in fact, the majority of pdf files seemed to be lacking important metadata. the author and creator fields were generally listed as “administrator” or “user” and bibliographic metadata were usually missing. however, pdf openly supports xmp embedding; therefore, bibliographic metadata could be embedded into the pdf. through further testing, bibliographic metadata linked to the pdfs were discovered stored in online databases. bibliographic software such as endnote and zotero allows metadata extraction, which enables users to import pdf files and automatically generate the appropriate bibliographic metadata. for example, zotero performs this extraction by first searching for a match for the pdf on google scholar. if this search does not return a match, zotero uses the embedded digital object identifier (doi) to perform the match. this method is not consistent: it often fails to retrieve any data, and in rare cases it retrieves the wrong data, which leads to incorrect references.
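the two-step lookup described above (a search first, then a fallback to the embedded doi) can be sketched as follows. this is an illustration of the fallback order only and is not zotero's code: the function names, the doi pattern, and the two search callables are assumptions, and the callables stand in for whatever online service performs the actual match.

```python
import re

# A common heuristic for DOIs: the "10." prefix, a registrant code, a slash,
# and a suffix. It is an assumption of this sketch, not a complete grammar.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[-._;()/:a-z0-9]+", re.IGNORECASE)


def find_doi(text):
    """Return the first DOI-like string in the text, or None."""
    match = DOI_PATTERN.search(text)
    if match is None:
        return None
    # Drop sentence punctuation that may trail the identifier.
    return match.group(0).rstrip(".,;")


def resolve_reference(pdf_text, search_by_title, search_by_doi):
    """Try the search first; fall back to the embedded DOI if it fails."""
    record = search_by_title(pdf_text)
    if record is not None:
        return record
    doi = find_doi(pdf_text)
    if doi is None:
        return None
    return search_by_doi(doi)


def no_match(text):
    """Hypothetical search that finds nothing."""
    return None


def echo_doi(doi):
    """Hypothetical DOI lookup that simply returns the identifier."""
    return {"doi": doi}


page = "metadata provenance and vulnerability https://doi.org/10.6017/ital.v36i4.10146"
print(resolve_reference(page, no_match, echo_doi))
```

when a pdf carries neither a matching title nor a recognisable doi the function returns nothing, which corresponds to the failure to retrieve any data noted above.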
given what we saw happen to metadata when a file is uploaded, as in case study 1, and the nature of a pdf’s journey through template selection, editing, and publishing, it is no surprise that metadata are lost or diluted along the way. case study 4 the fourth case study, conducted on docx files, provided an extensive set of metadata, some of which are unique to this file type. creating a new word document via the file explorer context menu and attempting to extract metadata resulted in an error as there were no readable metadata to extract until the file was accessed and saved. once the file had some user input and was saved, the metadata were created and could be extracted. microsoft office files contain xml files that hold information about the document, such as formatting data, user information, edit history, and information about the document’s page count, word count, etc. a docx file can be pictured as a directory of such files packaged in a zip archive, so they are normally hidden from the user. using exiftool on the docx file nevertheless allowed retrieval of the metadata from all the hidden files. the metadata included creation, modification, and edit information, such as number of edits and total edit time. every element within the document (e.g., text, images, tables, etc.) has its own metadata attached that are crucial for preserving the format of the document. the next step in the test involved converting the docx file into pdf using the following two methods: (1) converting the document via the “publish” save option within microsoft word; and (2) “right clicking” the document and selecting the option to convert to an adobe pdf. the results of the two methods varied slightly. method 1 stripped all the metadata from the document and generated only default pdf metadata consisting of system metadata (file size, date, time, permissions) and the pdf version, author details, and document details. method 2 behaved the same way except that some xmp metadata were created.
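the xml files inside a docx package can also be read directly, which helps to show where the document metadata reside. the following is a minimal sketch using only the python standard library, not the exiftool-based method used in the tests; it assumes the usual docProps/core.xml and docProps/app.xml parts, and the function names and the tiny in-memory package are hypothetical stand-ins for a real document.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the XML namespace, e.g. '{uri}creator' becomes 'creator'."""
    return tag.rsplit("}", 1)[-1]


def read_docx_properties(source):
    """Collect simple property values from the docProps parts of a package."""
    properties = {}
    with zipfile.ZipFile(source) as package:
        names = package.namelist()
        for part in ("docProps/core.xml", "docProps/app.xml"):
            if part not in names:
                continue
            root = ET.fromstring(package.read(part))
            for element in root:
                if element.text and element.text.strip():
                    properties[local_name(element.tag)] = element.text.strip()
    return properties


# Hypothetical minimal parts; a document saved by Word holds many more fields.
CORE = (
    '<cp:coreProperties xmlns:cp="http://schemas.openxmlformats.org/'
    'package/2006/metadata/core-properties" '
    'xmlns:dc="http://purl.org/dc/elements/1.1/">'
    "<dc:creator>example author</dc:creator>"
    "<cp:revision>4</cp:revision>"
    "</cp:coreProperties>"
)
APP = (
    '<Properties xmlns="http://schemas.openxmlformats.org/'
    'officeDocument/2006/extended-properties">'
    "<TotalTime>12</TotalTime><Words>250</Words>"
    "</Properties>"
)

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as package:
    package.writestr("docProps/core.xml", CORE)
    package.writestr("docProps/app.xml", APP)
buffer.seek(0)

properties = read_docx_properties(buffer)
print(properties)
```

the revision and total-time values in this sample stand for the number of edits and total edit time mentioned above. the function accepts a file path as well as an in-memory buffer.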
both methods resulted in no informative metadata remaining as the majority of the xmp elements were empty fields or contained generic values such as the computer name as the author. all formatting and metadata unique to microsoft word were lost. this case study is an enlightening example of what can happen to metadata when a file is changed from one format to another. human intervention the human element is a requirement in digital preservation as certain metadata, such as descriptive and administrative metadata, can only be created by humans. in fact, as hart notes, user input is needed to record the majority of the digital preservation metadata. 12 the process can be tedious, as described by wheatley. 13 one of the examples he describes follows the processes in a repository from ingest to access, beginning with the creation of metadata and the managerial tasks that are necessary. these tasks include using extraction tools and automation where possible. using frameworks to record changes to metadata is required, and in some cases metadata must be stored externally to their digital objects. this allows multiple objects of the same type to utilise a generic set of metadata to avoid redundant data. however, although using a generic metadata set is convenient, a large collection of digital objects could be affected if the metadata are lost or damaged. the human element increases the risk of error drastically because there are numerous steps to metadata creation. misconduct is also possible. therefore, the less digital preservation is reliant on humans (and the easier the tasks are that require human input), the better. this can only be achieved by automating most processes and training people to ensure they handle their responsibilities accurately, consistently, and completely.
learning from the results of case studies like those described in this paper will better prepare users working with digital objects. discussion to achieve the most authentic, consistent, and complete digital preservation, institutions must revise their preservation workflows and processes. this entails ensuring the initial processes within workflows are correct before processing digital content. the content must come from a credible source and have its authenticity approved. participation from the donor of the digital content might be beneficial if they can provide information and metadata about the content. this information could provide additional context for the content as well as identify its history (e.g., format migration or modification). this is not always possible as the donor is not always the creator of the digital content. if the original source is no longer available, as much information as possible should be gathered from the donor about the acquisition of the content and about the original source. this should be considered and carefully monitored throughout the lifecycle of digital content. granted, if no changes are needed, devices such as write blockers can ensure this as they restrict users and any systems from making unwanted changes or “writes.” however, changes are sometimes unavoidable and, although they may not affect the content, can be detrimental. when changes are required, it is crucial to maintain the digital history by capturing all metadata added, removed, or modified during processing, commonly known as the “change history.” donor participation should be stipulated in a donor agreement, something that each institution offers to all donors, sometimes in the form of agreements through communication and often with a structured document.
donor-agreement policies differ for each institution: some are quite detailed, allowing donors to carefully stipulate their conditions, whereas others place most of the responsibility on the receiving institution. when dealing with sensitive or historic data of importance, policies should be in place to capture adequate data from the donor. when the content does not fall into this category, standard procedures, which should be present in all donor agreements and institution policies, can be followed. institutions must also consider when to apply these steps as some transactions between donor and institution can follow standard protocol; others are more complex, such as donations of content with diverse provenance issues. conclusion we have presented four case studies that illustrate how vulnerable digital-object metadata are. these examples show that common methods of handling files can cause irretrievable loss of important information. we discovered significant loss of metadata when uploading photos to social media and when converting a file to another format. the digital footprint left behind from photo manipulation was also exposed. we shed light on how bibliographic metadata for pdf files are generated and obtained, and on the surrounding issues. action is needed to ensure proper metadata creation and preservation for born-digital objects. librarians and archivists must place a greater emphasis on why digital objects are preserved as well as how and when users may need to access them. therefore, all types of metadata must be captured to allow users from all disciplines to take advantage of historical data for many years to come. given the rate of technological change, we must be prepared; observing first-hand the vulnerability of metadata is a step toward a safer future for our digital history.
references 1 “charter on the preservation of digital heritage,” unesco, october 15, 2003, http://portal.unesco.org/en/ev.php-url_id=17721&url_do=do_topic&url_section=201.html. 2 k. rechert et al., “bwfla—a functional approach to digital preservation,” pik—praxis der informationsverarbeitung und kommunikation 35, no. 4 (2012), 259–67. 3 k. rechert et al., design and development of an emulation-driven access system for reading rooms, archiving conference, 2014, 126–31, society for imaging science and technology, 2014. 4 m. phillips et al., the ndsa levels of digital preservation: explanation and uses, archiving conference, 2013, 216–22, society for imaging science and technology, 2013. 5 “premis: preservation metadata maintenance activity,” library of congress, accessed march 10, 2016, http://www.loc.gov/standards/premis/. 6 r. gartner and b. lavoie, preservation metadata (2nd edition) (york, uk: digital preservation coalition, 2013), 5–6. 7 premis editorial committee, premis data dictionary for preservation metadata, version 2.2 (washington, dc: library of congress, 2012), http://www.loc.gov/standards/premis/v2/premis-2-2.pdf. 8 premis editorial committee, premis schema, version 3.0 (washington, dc: library of congress, 2015), http://www.loc.gov/standards/premis/v3/premis-3-0-final.pdf. 9 timothy hart, “metadata standard for future digital preservation” (honours thesis, flinders university, adelaide, australia, 2015). 10 j. r. smith and p. schirling, “metadata standards roundup,” ieee multimedia 13, no. 2 (april-june 2006): 84–88. 11 “glossary,” digital preservation coalition, accessed august 5, 2016, http://handbook.dpconline.org/glossary.
12 timothy hart, “metadata standard for future digital preservation” (honours thesis, flinders university, adelaide, australia, 2015). 13 paul wheatley, “institutional repositories in the context of digital preservation,” microform & digitization review 33, no. 3 (2004): 135–46. editor’s comments bob gerrity information technology and libraries | june 2015 1 library discovery circa 1974 our ongoing project to digitize back issues of information technology and libraries (ital) and its predecessor, journal of library automation (jola), provides frequent reminders of what’s changed (and what hasn’t) in library technology in the past several decades. the image above is from a 1974 advertisement in jola for the “rom ii book catalog on microfilm” from information design in menlo park, ca. the ad copy speaks for itself: all the advantages of a printed book catalog…none of the disadvantages. your staff and patrons can use the catalog simultaneously in many different locations. the user can scan a number of related titles on the same page, in contrast to the one-at-a-time viewing of catalog cards in trays. manual filing routines and maintenance are eliminated. easy to use…requires no instruction. an automatic index pointer shows your patron his position in the file. at the touch of a button he can scan forward or back at high speed. average look-up time is about twelve seconds.
a staff member can insert an updated catalog totally cumulated on a single reel of microfilm in about one minute. your patrons never touch the film—your complete library catalog “locked-in”! bob gerrity (r.gerrity@uq.edu.au) is university librarian, university of queensland, australia. editor’s comments | gerrity doi: 10.6017/ital.v34i2.8805 2 my favorite bit is the sign on the front of the machine, proudly proclaiming: these are all the books in the library. this month’s issue of ital looks at the current state of library discovery from a number of angles. will owen and sarah michalak describe efforts at unc chapel hill and partners within the triangle research libraries network to enhance the utility of the library catalog as a core tool for research, taking advantage of web-based search technologies while retaining many of the unique attributes of the traditional catalog. joseph deodato provides a useful step-by-step guide to evaluating web-scale discovery services for libraries. david nelson and linda turney analyze faceted navigation capabilities in library discovery systems and offer suggestions for improving their usefulness and potential. julia bauder and emma lange describe a new approach to subject searching, using an interactive, visual approach. yan quan liu and sarah briggs report on the current state of mobile services among the top 100 us university libraries. unrelated to discovery but certainly relevant to issues around library provision of access to information, jill ellern, robin hitch, and mark stoffan report on user authentication policies and practices at academic libraries in north carolina.
selecting a web content management system for an academic library website | black 185 others. the osu libraries needed a content management system (cms). web content management is the discipline of collecting, organizing, categorizing, and structuring information that is to be delivered on a website. cmss support a distributed content model by separating the content from the presentation and giving the content provider an easy to use interface for adding content. but not just any cms would work. it was important to select a system that would work for the organization. the focus of this article is the process followed by osu libraries in the selection of a web cms. other aspects of the project, such as the creation of the user focused information architecture, the redesign of the site, the implementation of the cms, and the management of the project are outside the scope of this article. ■■ literature review content and workflow management for library web sites: case studies, a set of case studies edited by holly yu, a special issue of library hi tech dedicated to content management, and other articles effectively outlined the need for libraries to move from static websites, dominated by html webpages, to dynamic database and cms driven websites.1 each of these works noted the messy, unmanageable situation of the static websites in which the content is inconsistently displayed and impossible to maintain. seadle summarizes the case well when he wrote “a content management system (cms) offers a way to manage large amounts of web-based information that escapes the burden of coding all of the information into each page in html by hand.”2 a cms provides an interface for content providers to add their contributions to the website without requiring knowledge of html; it separates the layout and design of the webpages from the content and provides the opportunity for reuse of both content and the code running the site. 
these features of a cms permit a library to professionalize its website by enforcing a consistency of design across all pages while at the same time increasing efficiency by making the maintenance of the content itself less technically challenging.3 the potential of the cms is powerful, yet it is not an easy process to select and implement a cms. one challenge is that the process of selecting and implementing a cms is not a fully technical one. the selection must be tied to the goals and strategy of the library and parent elizabeth l. black selecting a web content management system for an academic library website this article describes the selection of a web content management system (cms) at the ohio state university libraries. the author outlines the need for a cms, describes the system requirements to support a large distributed content model and shares the cms trial method used, which directly included content provider feedback side-by-side with the technical experts. the selected cms is briefly described. imagine a city that has been inhabited consistently for hundreds, perhaps thousands of years. those arriving in the city’s main port follow clear, wide paths that are easy to navigate. soon, however, the visitor notices that the signs change. they look similar but the terms are different and the spaces have an increasingly different look. continuing further, the visitor is lost. some sections look drastically different, as if they belong in an entirely different city. other sections are abandoned. the buildings at first seem occupied, but upon closer inspection all is old and neglected. the visitor tries to head back to the main, clear sections but cannot find the way. in frustration, the visitor leaves the city and moves on, often giving up the mission that led to the city in the first place. this metaphor describes the state of the ohio state university (osu) libraries’ website at the beginning of this project.
the website had many content providers, more than 150 at one point. these content providers were given accounts to ftp files to the web server and a variety of web editors with which to manage their files. the site consisted of more than 100,000 files of many types: html, php, image files, microsoft office formats, pdf, etc. the files with content were primarily static html files. in 2005, the osu libraries began to implement a php-based template that included three php statements that called centrally maintained files for the header, the main navigation, and the footer. the template also called a series of centrally controlled style sheets. the goal was to have the content providers add the body of the pages and leave the rest to be managed by these central files. this didn’t work as intended. because of a combination of page editing practices learned with static html and varying levels of skill with cascading style sheets (css), many pages lost the central control of the header, menu, and footer. also, the template was confusing for many because they had to wade through a lot of code they didn’t understand. one part of this content model was right—giving the person with the content knowledge the power to update the content while centrally controlling parts that should remain consistent throughout the website. unfortunately, the technical piece of the model didn’t support this goal. it required too much technical knowledge from the content providers. the real solution was a system that would allow the content providers to focus on their content and leave the technical knowledge to elizabeth l. black (black.367@osu.edu) is head, web implementation team, ohio state university libraries, columbus, ohio. 186 information technology and libraries | december 2011 and interviews/focus groups with the current content providers. the research was similar to that described previously in the literature review section of this article.
the most helpful for this project was a 2006 issue of library hi tech focused on cmss.10 the most useful of these articles was wiggins, remley, and klingler’s article about the work done at kent state university, particularly the way in which they organized their requirements.11 a working group of four served as interviewers for the focus groups with current web content providers. they worked in pairs, with one serving as a recorder and the other as the facilitator, who asked the questions. fifteen interview sessions were held over a period of three months. the focus group participants were invited to participate in like groups as much as possible, so for example the foreign language librarians were interviewed together in a different session from the instruction librarians. however, no one participated more than once in an interview. the same set of guiding questions was used for each interview. they are included in the appendix. the results of these interviews became the basis for the requirements document to which the technical team added the technical requirements. ■■ the cms requirements the requirements were gathered into five categories: content creation and ownership, content management, publishing, presentation, and administration/technical. these categories were modeled after those used for the project at kent state university.12 the full list is detailed below by category. 
content creation and ownership requirements ■■ separation of content and presentation: the content owners can add and edit content without impact on the presentation ■■ web-based gui content-editing environment that is intuitive and easy to learn without knowledge of html ■■ metadata created and maintained for each webpage or equivalent content level that contains: ■❏ owner ■❏ subject terms or tags describing the content ■■ multi-user authoring without overwriting ■■ can handle a large number of content providers (approximately 200) ■■ can integrate rss and other dynamic content from other sources ■■ can handle different content types, including: ■❏ text ■❏ images organization, must meet specific local requirements for functionality, and must include revision of the content management environment, meaning new roles for the people involved with the website.4 karen coombs noted that “the implementation of a content management system dramatically changes the role of the web services staff” and requires training for the librarians and staff who are now empowered to provide the content.5 another challenge was and continues to be a lack of a turn-key library cms.6 several libraries that did a systematic requirements gathering process generally found that the readily available cmss did not meet their requirements, and they ended up writing their own applications.7 building a cms is not a project to take lightly, so only a select few libraries with dedicated in-house programming staff are able to take on such an endeavor. the sharing of the requirements of these in-house library specific cmss is valuable for other libraries in identifying their own requirements. in the past few years, the field of open-source cmss has increased, making it more likely that a library will find a viable cms in the existing marketplace that will meet the organization’s needs. 
drupal is an open-source cms that was one of the first viable options for libraries and so is widely used in the library community. it was the subject of an edition of library technology reports in 2008.8 since drupal opened the door for open-source cmss in libraries, others have entered the market as well. in 2009 john harney noted, “there are few technologies as prolific as web content management systems. some experts number these systems in the 80-plus range, and most would concede there are at least 50.”9 the cms selection process described here builds on those described in the literature by integrating their requirements and methods to address the needs of a very large decentralized website. it builds on the increased emphasis on user involvement in technology solution building and selection by fully incorporating the cms users in the selection process. further, the process described here took place after those described in the literature, after the open-source cms field had significantly improved. the options were much greater at the time of this study and this article describes the increased possibilities of second-generation cmss. while the perfect library-ready turn-key cms still does not exist, there are many excellent, robust open-source cmss available. this article describes one process for selecting among them, including an in-depth trial of three major systems: drupal, modx, and silverstripe.
■■ gathering requirements there were two parts to the requirements gathering process undertaken at osu libraries: research of the literature selecting a web content management system for an academic library website | black 187 ■■ meets established usability standards ■■ dynamic navigation generated for main and subsections of website that includes breadcrumbs and menus ■■ searching ■❏ search engine for website ■❏ can pass searches on to other library web servers ■■ mobile device friendly version (optional) ■■ delivery presentable in the browsers used most heavily by osu libraries websites visitors ■■ page load time is acceptable ■■ easy search engine optimization administration ■■ lamp (linux, apache, mysql, php) platform ■■ good documentation for technical support and end users ■■ scalable in terms of both content and traffic ■■ skills required to maintain system are available at osul the next step was to take this extensive requirements list and identify cmss that would be appropriate for a side-by-side test with both content providers and systems engineers. ■■ cms trial the web cms would become a critical part of the web infrastructure so it was important to ensure selection of the best system for both the content providers and the it team. between may 21 and august 29, 2008, two groups worked with the cmss, testing them on criteria taken from the initial requirements documents. the first team included fourteen content providers with diverse content areas and diverse technical skills; this group rated each system on a content providers set of criteria. the second team, which included the systems engineer and a technical support specialist, rated each system on a set of criteria that was more technical in nature. each participant used a microsoft excel spreadsheet containing requirements condensed from the full list. they rated each system on a scale of 1 to 3 for each criterion, where 1 was difficult, 2 was moderate, and 3 was easy. 
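the 1-to-3 rating scale lends itself to a simple aggregation. the sketch below is only an illustration of one way such spreadsheets could be totaled and averaged per group; the function name, the rater and system labels, and every number in it are hypothetical and are not the trial's data, and the exact arithmetic used in the trial is not specified in this article.

```python
from statistics import mean

VALID_SCORES = {1, 2, 3}  # 1 = difficult, 2 = moderate, 3 = easy


def group_scores(ratings):
    """Total each rater's criterion scores per system, then average the totals.

    `ratings` maps rater -> system -> criterion -> score. This layout is an
    assumption made for illustration, not the spreadsheet used in the trial.
    """
    totals = {}
    for per_system in ratings.values():
        for system, scores in per_system.items():
            if not set(scores.values()) <= VALID_SCORES:
                raise ValueError(f"scores for {system} must be 1, 2, or 3")
            totals.setdefault(system, []).append(sum(scores.values()))
    return {system: mean(values) for system, values in totals.items()}


# Invented example ratings for two hypothetical raters and two hypothetical systems.
example_ratings = {
    "rater_a": {
        "system_x": {"editor ease of use": 3, "ability to preview content": 2},
        "system_y": {"editor ease of use": 1, "ability to preview content": 2},
    },
    "rater_b": {
        "system_x": {"editor ease of use": 3, "ability to preview content": 3},
        "system_y": {"editor ease of use": 2, "ability to preview content": 1},
    },
}
scores = group_scores(example_ratings)
```

running the same function once over the content providers' ratings and once over the technical team's ratings would give one score per system for each group.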
the project manager in the it web team led the trial. the criteria given to the content providers were: ■■ web gui intuitiveness ■■ media integration ■■ editor ease of use ■■ ability to add content ■■ ability to preview content ■■ ability to publish content ■■ metadata storage ■❏ videos ■❏ camtasia/captivate tutorials ■❏ flash files ■■ content owners can create tables and/or databases for display of tabular data in the web gui interface ■■ content owners can create forms in the web gui interface ■■ option for faculty and professional staff to have webpages featuring their work and their profiles ■❏ all staff must have some control over the personal information available about them on the public website content management ■■ link maintenance ■❏ does not allow internal pages to be deleted if linked to by another cms page ■❏ can regularly check the viability of external links ■❏ periodic reminders to content owners to check their content ■■ way to repurpose content elements to multiple pages for content such as: ■❏ descriptions of article and research databases ■❏ highlight or feature content elements ■■ access controls ■❏ that allows content owners to only edit their content ■❏ that allow web liaisons to provide first line support for their departments ■❏ integrates into our existing security structures (shibboleth) ■■ robust reporting features ■❏ integration with quality web analytics software ■❏ content update tracking ■❏ system usage ■❏ customized report creation publishing ■■ ability to preview before publishing ■■ cms can produce rss feeds for dynamic sections of content ■■ page templates and style sheets are used to control page layout and design centrally ■■ display non-roman scripts using unicode ■■ extensible—can incorporate non-cms content into the site ■■ ability to add personalization options for site users presentation ■■ meets ada and w3c accessibility requirements ■■ code validates to current html specifications 188 information technology and libraries 
| december 2011 on the technical requirement that the system be easy to extend and integrate. a simple website served as the hub for the cms trial. the site included links to each cms instance, a link to the project blog for updates from the project team, and a link to a wiki space where trial participants shared ideas and thoughts with one another. the web team’s issue tracking system was integrated into this site so participants could easily ask questions of the technical team and report problems. time was set aside each week to handle all reported issues. of the sixteen participants who started the trial, thirteen completed a criteria spreadsheet. the project manager totaled and then averaged the scores provided by each group to determine the overall content provider score and the overall technical score (see figure 1). in the end, both the content providers and the systems engineers agreed that silverstripe was the cms that best met the requirements. ■■ silverstripe silverstripe was released as an open-source cms in november 2006 by silverstripe limited. they had developed the cms as part of their business of creating websites for clients. the company was founded in 2000 and is headquartered in wellington, new zealand. the company continues to use the cms for their website business and also offers paid support for the cms. the testers agreed that silverstripe provided the best match in the areas of easy content creation by multiple authors, handling multilingual content, management of different types of content and content files, search engine optimization, and meeting web standards. a strong and growing open-source community and strong documentation were additional keys to the selection of silverstripe.14 use by high-profile clients, such as the 2008 democratic national convention, provided proof that silverstripe could handle high traffic. 
the content providers praised silverstripe for the intuitive user interface, the system’s ease of use, specifically the ease of previewing and publishing content. they also noted that silverstripe handled the metadata supporting the pages as well as tabular and form page content better than the other systems. the technical evaluators noted silverstripe’s modular structure, which makes it flexible enough to integrate easily with existing web applications and accommodate local customizations without modifying the core system. silverstripe includes a template language, which fully separates the content from the presentation. in practice, this means that even informed users cannot spot a silverstripe website through simple web browsing, as is common with other cmss. ■■ ability to “feature items” ■■ ability to add rss feeds ■■ ability to enter tabular data ■■ ability to create forms ■■ testing area for new features the criteria given to the systems engineers were: ■■ installation ■■ maintainability ■■ technical documentation ■■ active developer community ■■ structure management (subsites/trees) ■■ access control/permissions ■■ link management ■■ ease of extensibility ■■ interoperability (data portability and web services) the cms requirements document was used in conjunction with the cmsmatrix (http://cmsmatrix.org) website to select five cmss to participate in a trial.13 the five systems selected were drupal 6.2, modx 0.9.6, silverstripe 2.2.2, plone 3.0.6, and typo3. the systems engineer installed all five cmss on a development server and did a simple configuration to make each operational for testing. it was at this stage that plone and typo3 were dropped from the trial because they took too long to configure and set up. the goal was to do a simple installation of the base cms, without any modules, but some systems were not functional as a cms without some modules so we added modules selectively. 
at the point of the selection of the systems for the trial, the project leaders noted that the entire list of requirements could not be met by an existing cms. they also noted that the majority of the key needs could be met with an existing system. therefore the goal remained to select an existing open-source web cms with the emphasis figure 1. cms trial scores selecting a web content management system for an academic library website | black 189 web guides in a content management system,” library hi tech 24, no. 1 (2006): 29–53; yan han, “digital content management: the search for a content management system,” library hi tech 22, no. 4 (2004): 355–65; david kane and nora hegarty, “new web site, new opportunities: enforcing standards compliance within a content management system,” library hi tech 25, no. 2 (2007): 276–87; ed salazar, “content management for the virtual library,” information technology & libraries 25, no. 3 (2006): 170–75. 2. seadle, “content management systems,” 5. 3. huttenlock, beaird, and fordham, “untangling a tangled web”; kane and hegarty, “new web site, new opportunities”; salazar, “content management for the virtual library.” 4. holly yu, “library web content management: needs and challenges,” in content and workflow management for library web sites: case studies, ed. holly yu, 1–21 (hersey, pa.: information science, 2005). 5. karen coombs, “navigating content management,” library journal 133 (winter 2008): 24. 6. yu, “library web content management,” 10. 7. goans, leach, and vogel, “beyond html”; salazar, “content management for the virtual library”; rick wiggins, jeph remley, and tom klingler, “building a local cms at kent state,” library hi tech 24, no. 1 (2006): 69–101; regina beach and miqueas dial, “building a collection development cms on a shoe-string,” library hi tech 24, no. 1 (2006): 115–25. 8. andy austin and christopher harris, library technology reports 44, no. 4 (may/june 2008). 9. 
john harney, “are open-source web content management systems a bargain?” infonomics 23, no. 3 (may/june 2009): 59–62. 10. library hi tech 24, no. 1 (2006). 11. wiggins, remley, and klingler, “building a local cms at kent state.” 12. ibid. 13. cms matrix, “the content management comparison tool,” http://cmsmatrix.org/ (accessed aug. 16, 2010). 14. silverstripe.org, “open source help & support,” http:// silverstripe.org/help-and-support/ (accessed aug. 16, 2010). ■■ conclusion an academic library website is a complex operation. the best ones use the strengths of the organization to their fullest: give web content authors direct access to maintain their content without burdening them with the requirement of technical expertise in html. excellent sites also offer a consistent user experience facilitated by centrally managed presentation. a web cms facilitates this model. the selection of a web cms is not solely a technical decision; it is most effective when made in partnership with the web content providers. the process followed by osu libraries described here provides an example of one such selection process. ■■ acknowledgements the author thanks james muir and jason thompson for their thoughtful contributions to this article and their exceptional work on the project. none of it would have been possible without them. references 1. holly yu, ed., content and workflow management for library web sites: case studies (hersey, pa.: information science, 2005); michael seadle, “content management systems,” library hi tech 24, no. 1 (2006): 5–7; terry l. huttenlock, jeff w. beaird, and ronald w. fordham, “untangling a tangled web: a case study in choosing and implementing a cms,” library hi tech 24, no. 1 (2006): 61–68; doug goans, guy leach, and teri m. vogel, “beyond html: developing and re-imagining library appendix. 
content provider focus interview questions

each group interview included a series of questions, which could be modified depending on the direction in which the interviews progressed. these are the questions provided to the interviewers:
1. who is your audience?
2. how do you teach/communicate with each audience?
3. what types of information are you trying to communicate?
4. how dynamic or static is the information?
5. what are the most important resources in your discipline?
6. who do you teach most frequently? undergrads, grads?
7. where do you start your instruction: with library.osu.edu or the department?
8. how do you connect the users/audience to your resources?
9. what message do you want to deliver?
10. what is unique about your discipline/needs/department?
11. what would make things easier for you?

the use of automatic indexing for authority control

martin dillon: university of north carolina at chapel hill; rebecca c. knight: wichita state university, wichita, kansas; margaret f. lospinuso: university of north carolina at chapel hill; and john ulmschneider: national library of medicine.

thesaurus-based automatic indexing and automatic authority control share common ground as word-matching processes. to demonstrate the resemblance, an experimental system utilizing automatic indexing as its core process was implemented to perform authority control on a collection of bibliographic records. details of the system are given and results discussed. the benefits of exploiting the resemblance between the two systems are examined.

introduction

it is not often realized how close the relationship is between automatic indexing using a thesaurus, on the one hand, and automatic authority control, on the other. making the connection is worthwhile for many reasons. the first has to do with terminology.
though one would be naive to hope for a reduction in specialized vocabulary, it is helpful to appreciate that what is called a thesaurus in one application is referred to as an authority file in the other; that the two have virtually the same structure, similar working parts, and play the same role in controlling the content of fields in a bibliographic file in their creation and, at least potentially, during retrievals by users. a second reason emerges in system development. below we discuss the various ways that a library can implement authority control. they range from a fully manual system, where the authority file exists only in card form, to online, automatic authority management. there are intermediate points as well. for each of the automated implementations, the system investment in software can be great. recognition of the close parallel in function of these two library needs allows for parallel development of software for any of these stages. a third reason looks to the future. successful system-patron interaction ought not to depend upon a patron's knowledge of the authorized entry forms currently in use for a library. first, the concept of a controlled vocabulary is far too narrow: authority control should encompass all fields available for searching. but the patron need not be aware of complicating details: substitutions of recognized variants for authorized forms ought to be carried out automatically during patron retrievals (with due regard, of course, for the intent of the patron).

manuscript received april 1981; accepted september 1981.

automatic indexing/dillon, et al. 269

this article describes a project in authority control in a specialized system environment, one that is increasingly typical in many of its features. the file of records is relatively small, currently below 10,000, and has a potential for growth not exceeding 100,000.
the collection, derived from the annabel morris buchanan collection of american religious tune books at the university of north carolina (chapel hill) music library, has many similarities with standard book collections, but its details vary greatly and cataloging conventions have been developed locally. its use for scholarly research is similar to that for any standard collection of bibliographic records. a great many such nonstandard collections exist: the morgue file in a newspaper, machine-readable data files, even properties marketed by cooperatives of real estate agencies. developing an automated retrieval system for any such collection is a similar enterprise, with similar goals and problems. in particular, all require extensive authority control similar to that required by a tune-book collection. the important feature of the method of authority control described here, one that makes it likely to be of interest to others, is its use of the same structures and software that are used for general vocabulary control. the three major software components we will refer to below are: thesaurus maintenance, automatic indexing, and automatic updating. these components antedated our effort to implement a similar system for authority control. when the problems that dealt with authority control per se were investigated, it was discovered that the system already available for subject control could be used exactly as it stood for authority control as well. initial experiments confirmed this relationship.1

authority control and automatic indexing

automatic authority control has been approached largely as a unique problem requiring special software development for its implementation. but authority control shares common ground with automatic subject indexing. both are term-matching activities based on a list of preferred terms plus a much larger list of match terms. each preferred term is tied to a number of match terms, but each match term is tied to only one preferred term.
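the one-to-many relationship between preferred terms and match terms can be pictured as a lookup table. the following is a minimal sketch, not the project's software: the function name is hypothetical, and treating the preferred form itself as a match term is an assumption made for illustration. the name forms are the holdroyd variants mentioned in this article.

```python
def build_match_index(preferred_to_variants):
    """Invert {preferred term: [match terms]} into {match term: preferred term}.

    Raises ValueError if a match term would be tied to two different preferred
    terms, since each match term may point to only one preferred term.
    """
    match_index = {}
    for preferred, variants in preferred_to_variants.items():
        # Assumption: the preferred form also matches itself.
        for term in [preferred, *variants]:
            existing = match_index.get(term)
            if existing is not None and existing != preferred:
                raise ValueError(f"{term!r} is tied to both {existing!r} and {preferred!r}")
            match_index[term] = preferred
    return match_index


index = build_match_index({"holdroyd, israel": ["holdrad", "holdrayd"]})
```

a plain dictionary enforces the "only one preferred term" rule naturally, because each key can hold a single value; the explicit check only turns a silent overwrite into an error.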
in the indexing environment, document text is examined for certain terms; these "free text" (uncontrolled vocabulary) terms are tied to equivalent (controlled vocabulary) terms in a thesaurus. when an uncontrolled vocabulary term is encountered in a document, its associated controlled vocabulary term is posted to the document as a descriptor. in authority control, document text is also examined for certain terms, e.g., author names. these "free-text" author names (i.e., names just as they appear on a title page) are tied to their authoritative name form (controlled vocabulary) in an authority file. when a "free-text" author name is encountered, the authoritative name is posted to the document or book (i.e., assigned as a heading or entry point). an automatic authority control system, then, is realizable by applying standard automatic subject-indexing software, which exploits the resemblance between the two processes. the input would consist of a thesaurus (in this case, an authority file) and bibliographic records; the indexing discovers matches between the list of possible terms in the thesaurus (variants of author names) and the "free-text" terms (title-page author names), and posts the appropriate controlled thesaurus terms (authoritative author name form) whenever a match occurs. (see figure 1.)

270 journal of library automation vol. 14/4 december 1981

[fig. 1. authority control by indexing: the thesaurus (authority file) and the bibliographic records are input to matching and posting, which produces updated records.]

the tune-book project

an experimental version of an authority control system using automatic indexing was implemented to test the feasibility of automatic indexing as the core process for authority control. the goal was automatic authority control for the buchanan collection index, the first step in work on a more comprehensive project, an index of american religious tune books, in particular, the shape-note tune books.
for the study of american cultural and musical history it is important to be able to trace the dissemination of these hymn tunes and texts, but the absence of a comprehensive index of american hymn tune books severely constrains such studies. many factors have discouraged scholars from constructing an index, among them the magnitude of the repertory . using computers to sort, file, and print reduces many of the problems associated with the size of the repertory, but does not address those created by the diverse forms of names and texts used by the tune-book compilers. correct hymn titles and especially accurate composer attributions were not important to the compilers of the tune books. consequently, although many tune-book compilers did attempt to indicate who had composed the work, the names of the composers appeared in various forms. for example, the name "israel holdroyd" might appear as simply "holdrad" or "holdrayd" with no first name given, or a first initial might be added, or an abbreviated first name, such as "is." might be used with one of several forms of the family name. automatic authority control over these names is necessary to the study of this collection, since only automatic means can address the problems of magnitude encountered in approaching the index as a whole. the database now contains about 6,000 records for these tune books. they are stored in marc format with variable-length fields giving a variety of information about each tune . creation of the authority file a thesaurus of authority records for the buchanan collection was manually created and placed in an online file. the initial authority file comprises a selection of composers whose names are present in conflicting forms in the present database. these were obtained by analyzing the file sorted by tune names, noting those tunes for which it appeared that the name of the same composer was given in more than one form. 
all forms of the name found were entered on cards along with the name of the tune (or tunes) through which the relationship was established. we used an explicit algorithm as a guide in determining which names were actually forms of the same name (see appendix for details). this process resulted in a list of 266 distinct composers, each with one to four different name forms. all were compared with the list sorted by composers, noting additional forms. these names were then checked in several reference works, and authoritative forms (with dates) were established when possible.

implementation

software systems

file processing for the tune records and the authority thesaurus was accomplished using a local software product, bibliographic/marc processing system (bps). bps is a general-purpose software package for the manipulation of marc-format records. this experiment used bps subsystems for creation of marc-format records, sorting and formatting, and file updating (i.e., updating a master file with the contents of a transaction file). the automatic indexing program used here was intended as part of a thesaurus-based document query system.2 it is compatible with bps, but utilizes generalized automatic indexing principles; its compatibility depends only on properly formatted thesaurus and bibliographic records. it includes file-processing programs for the thesaurus (authority file) and the bibliographic records (tune records) and a matching program that performs the indexing. posting of the authoritative name forms to the proper marc record is done with standard bps updating procedures using output from the matching program.

automatic authority control process

as input the system uses a thesaurus and the text of fields selected from marc-format document records.
the thesaurus consists of pairs of terms: the first of each pair is the term searched for in a document, the second is the authority term assigned to the document whenever the first term is found. figure 2 gives examples. the text may be abstracts, titles, or the contents of any field selected from the documents for authority control. in this case, the text is derived from the composer field; for authority work in general, any field requiring authority control would be input.

[fig. 2. thesaurus/authority file format: each entry pairs a variant name form with its authority form, e.g., "cole, j. / cole, john 1774-1855".]

the first step in authority control is as follows. the text sample and a stop-word list are input to the initial text-processing program. the incoming text (in this case, composer names) is separated into individual words. the stop-word list is used to remove designated words from the input, which in authority control might be titles of address and so on: terms such as "miss," "elder," or "reverend." (automatic indexing uses the stop-word list to eliminate similarly noncontributory terms, such as conjunctions and prepositions.) the processing program can also convert plurals to singulars if desired. the purpose of this option in automatic indexing is to pare down variants in order to increase matches by standardizing term forms. however, plurals are not converted in authority control, since names are usually distinguished from one another by their full forms. the processing produces a list of individual terms. each term is given once along with the number of words in the term, then broken up with the document number attached to each piece. the thesaurus authority records are edited by the thesaurus processing program into specially formatted matched pairs of variant and authoritative forms.
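the text-processing step can be sketched roughly as follows. this is an illustration under stated assumptions, not the program used in the experiment: the function names, the tokenization rule (lowercase runs of letters and digits), and the output layout are all hypothetical. the stop words are the three titles of address given as examples in the text, and the sample name and document number are used only for illustration.

```python
import re

# Stop words named in the text as examples; a real list would be longer.
STOP_WORDS = frozenset({"miss", "elder", "reverend"})


def tokenize(text):
    """Split text into lowercase words, dropping punctuation (assumed rule)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def process_text(doc_id, text, stop_words=STOP_WORDS):
    """Remove stop words and attach the document number to each remaining word."""
    words = [w for w in tokenize(text) if w not in stop_words]
    return {
        "term": " ".join(words),                   # the term, given once
        "word_count": len(words),                  # number of words in the term
        "pieces": [(w, doc_id) for w in words],    # each piece tagged with its document
    }


processed = process_text("aa-1353", "Elder White, B. F.")
```

no plural-to-singular conversion is applied here, in line with the remark that names are distinguished by their full forms.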
input is the match-term/variant-term file (figure 2) and the same stop-word list used for document processing. the stop-word list eliminates all unwanted words in the list of variant name forms. output is a file containing all possible name forms (variants), the number of terms in each name and their positions in the name, and the authoritative name form, as in figure 3. next the two files are used as input to a matching program that creates an inverted file of the processed document text, then compares each match term from the prepared thesaurus with the inverted file. a match is discovered according to one of the following criteria:
1. exact match: match term and document term are the same words, in the same order, and adjacent.
2. stop word exact match: words are the same in match term and in document term, and in order, but deleted stop words may intervene between words in the document term.
3. any order match: term must be the same words and adjacent (i.e., without intervening words) and may be in any order.
4. stop word any order match: terms must have the same words and in any order, but intervening stop words are ignored.
5. any match: any word of the match term may be in any part of the document text in any order.

[fig. 3. processed authority file: each row gives a variant name form, the number of words in it and their positions, and the authoritative name form.]

these match criteria are similar in intent to the criteria for deciding composer-variant forms/composer-authority form match mentioned above and presented in the appendix.
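the five criteria above can be expressed compactly over word lists. the sketch below is one possible reading of them, offered as an illustration only: the function names are hypothetical, it scans the document words directly rather than building the inverted file the matching program used, and it assumes stop words are still present in the document word list so that criteria 2 and 4 can ignore them.

```python
from collections import Counter


def _windows(words, size):
    """Yield every run of `size` adjacent words."""
    return (words[i:i + size] for i in range(len(words) - size + 1))


def matches(match_words, doc_words, criterion, stop_words=frozenset()):
    """Return True if `match_words` matches `doc_words` under criterion 1-5.

    Both arguments are lists of words.
    """
    if not match_words:
        return False
    if criterion in (2, 4):
        # Stop-word variants: intervening stop words in the document are ignored.
        doc_words = [w for w in doc_words if w not in stop_words]
    if criterion in (1, 2):
        # Same words, same order, adjacent.
        return any(w == match_words for w in _windows(doc_words, len(match_words)))
    if criterion in (3, 4):
        # Same words, adjacent, any order.
        target = Counter(match_words)
        return any(Counter(w) == target for w in _windows(doc_words, len(match_words)))
    if criterion == 5:
        # Any word of the match term anywhere in the document text.
        return any(w in doc_words for w in match_words)
    raise ValueError("criterion must be an integer from 1 to 5")
```

for example, with "elder" as a stop word, the document words `["walker", "elder", "william"]` fail criterion 1 for the match term `["walker", "william"]` but satisfy criterion 2.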
an interesting possibility is to use such match criteria to discover variant author name forms in creating the authority file, since many variant forms result only from misspellings, title attributions, and so on. pseudonyms would not be detected, but such a procedure would be useful in collating forms morphologically similar. the experiment used criterion two, one of the most restrictive; the "free-text" composer name must match exactly and with its parts in the same order (except that stop words, such as "miss" or "elder," may intervene) as the variant author name before an authoritative form is posted. this seems the most reasonable choice for this project; presumably more flexibility could be achieved by adding criteria to the match process or by allowing boolean combinations of criteria analogous to those outlined in the appendix. the final output from the match module is three files: a print file of all match terms, a file of all unmatched authority names, and a file constructed for the update of the bibliographic records, giving the document and field to be updated and the update term. the print file is a record printed for each term matched. the record gives the variant form matched, its field type, the proper authoritative name form as given in the thesaurus, and the identifier numbers of the documents in which the term is found. field type is an identifying code assigned to each term in the prepared thesaurus, not necessarily the same as those identifiers in the marc-format authority file; here, the field type is preferred composer name (pcn). an example of the printed output file is in figure 4. the update file is for use in an update program that posts the authoritative name form assigned by the indexing. it contains the document identifier number in which a match was found, the field type of term found (pcn), and the authoritative name form.
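the update file described above amounts to a list of (document identifier, field type, authoritative name) entries. a minimal sketch of how such entries might be posted to records follows; it is not the bps update procedure. the function name and the representation of a record as a flat dictionary of fields are assumptions made for illustration (repeatable marc fields are not modeled), and the sample records are illustrative only.

```python
def post_authority_terms(records, update_entries):
    """Add each authoritative name to its record under the given field type.

    `records` maps document id -> {field name: value}; `update_entries` is an
    iterable of (document id, field type, authoritative name) tuples.
    Returns the updated records and the ids of records that received no new field.
    """
    updated = {doc_id: dict(fields) for doc_id, fields in records.items()}
    touched = set()
    for doc_id, field_type, authority_name in update_entries:
        if doc_id in updated:
            updated[doc_id][field_type] = authority_name  # posted as a new field
            touched.add(doc_id)
    unmatched = sorted(set(updated) - touched)
    return updated, unmatched


# Illustrative records and update entry only.
records = {
    "doc-1": {"composer": "walker, wm."},
    "doc-2": {"composer": "anon"},
}
updates = [("doc-1", "pcn", "walker, william 1809-1875")]
updated_records, not_updated = post_authority_terms(records, updates)
```

returning the ids that received no new field alongside the updated records makes it easy to review names that still need an authority entry.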
the update program uses this file to add the authoritative composer name form as a new bibliographic data field to the appropriate bibliographic record, assigning as a field identifier the field type identifier accompanying it. figure 5 gives the new records with added fields. during the update process, a file containing all records not receiving a new authority-name field is generated. these records may contain a new variant of an authoritative name already in the file or a name altogether new to the file; in either case the unmatched author name would have to be added to the authority file and tied to an authoritative name form. the output also assists in tracing erroneous name-form assignments.

automatic indexing/dillon, et al. 275

[fig. 4. update file. sample entries, each giving a match term, type (pcn), authority term, and document locations.]

results

table 1 gives some statistics on the experimental runs. in the 5,788 bibliographic records, 760 distinct composer names were present, the remainder (one composer per record) being duplicate forms; many of these are simply "anon," where the composer was not known. earlier test runs on a subset of the file had fewer duplicates, and additions to the full database show few new composer name forms. thus the database is nearing a stable state with an exhaustive list of composers; this stability contributes to decreasing errors and fewer unmatched composer names in the automated authority control process.

table 1. implementation statistics
file statistics:
  total number of bibliographic records: 5,788
  number of composer names in biblio records: 760
  average number of compositions per composer: 13.2
  total number of authority name forms (in authority file): 266
  total number of variant and authority names (in authority file): 599
run statistics:
  total number of variant thesaurus names matched: 372
  total number of variant thesaurus names unmatched: 213
  average number of documents per matched term: 5.87
  average number of documents per term: 3.61
  total number of records updated by authority form: 2,110

[fig. 5. updated records. sample records showing doc id, anthology, imprint, tune name, first line, pcn, and composer fields.]

the total number of thesaurus records matched applies to variant forms, authoritative forms (matching occurs for these also), and for those few forms that have no variants. the unmatched terms (213) are largely variants not in the database but gleaned from reference sources in anticipation of their occurrence, and authority forms, most of which do not occur in the database.
the 2,110 matched represent the total number of composer names matched of the original 5,788 names. most of the unmatched names are the "anon" entries (more than 2,000); the remainder are unanticipated forms not detected in the initial manual construction of the authority file. these unanticipated forms become new variants added to the authority file as described above.

conclusions

automated authority control as presented here has a number of advantages, either for libraries with their own processing facilities or for the management of information collections outside the standard library environment. unifying the processes of subject control and authority control by using the same procedures and software for both simplifies the tasks of systems personnel and information managers. where catalog access is online, the patron benefits by applying subject access facilities to other searches. ideally, substitutions for all variants would occur automatically, accompanied by an alert to the patron where it was felt necessary. at a minimum, the same command structure would be available for referencing names as would be normally available for consulting an online thesaurus. in either case, the difficulties of the patron are reduced, both in comprehending how the system works, and in acquiring a facility for using system commands.

references

1. gordon ellyson jessee, "authority control: a study of the concept and its implementation using an automated indexing system" (master's paper, school of library science, university of north carolina at chapel hill, 1980).
2. margaret s. strode, "automatic indexing using a thesaurus" (master's thesis, department of computer science, university of north carolina at chapel hill, 1977).
appendix

rules for decisions on similar names

the following conditions may exist:

a = identical tune name
b = identical surname
c = identical first initial
d = same first letter of surname and close match of the rest of the surname. (55 percent match of letters in content, not in order. such a similarity is presumed to represent a similarity in sound.)
e = similar tune name (same criteria as in d for percentage of match). exception: words "new" and "old" cancel any presumed relation between similar tune names.
f = information in cmp subfield x field is identical in content

the following combinations of conditions indicate the same person, expressed in decreasing order of reliability:

1. a&b
2. b&c
3. a&d
4. c&d
5. b&e
6. c&d&e
7. d&e
8. f&(b or d)

note: points seven and eight are regarded as tentative, and matches using these combinations are flagged for later checking.

martin dillon is associate professor of library science at the university of north carolina at chapel hill. rebecca c. knight is administrative services librarian at wichita state university, wichita, kansas. margaret f. lospinuso is music librarian at the university of north carolina at chapel hill. john ulmschneider is library associate at the national library of medicine.

examining attributes of open standard file formats for long-term preservation and open access

eun g. park and sam oh

information technology and libraries | december 2012 44

abstract

this study examines the attributes that have been used to assess file formats in literature and compiles the most frequently used attributes of file formats to establish open-standard file-format-selection criteria. a comprehensive review was undertaken to identify the current knowledge regarding file-format-selection criteria. the findings indicate that the most common criteria can be categorized into five major groups: functionality, metadata, openness, interoperability, and independence. these attributes appear to be closely related.
additional attributes include presentation, authenticity, adoption, protection, preservation, reference, and others.

introduction

file format is one of the core issues in the fields of digital content management and digital preservation. as many different types of file formats are available for texts, images, graphs, audio recordings, videos, databases, and web applications, the selection of appropriate file formats poses an ongoing challenge to libraries, archives, and other cultural heritage institutions. some file formats appear to be more widely accepted: tagged image file format (tiff), portable document format (pdf), pdf/a, office open xml (ooxml), and open document format (odf), to name a few. many institutions, including the library of congress (lc), possess guidelines on file format applications for long-term preservation strategies that specify requisite characteristics of acceptable file formats (e.g., they are independent of specific operating systems, are independent of hardware and software functions, conform to international standards, etc.).1 the format descriptions database of the global digital format registry is an effort to maintain a detailed representation of information and sustainability factors for as many file formats as possible (the pronom technical registry is another such database).2 despite these developments, file format selection remains a complex task and prompts many questions that range from a general interest (“which selection criteria are appropriate?”) to more specific (“are these international standard file formats sufficient for us to ensure long term preservation and access?” or “how should we define and implement standard file formats in harmony with our local context?”). in this study, we investigate the definitions and features of standard file formats and examine the major attributes of assessing file formats. we discuss relevant issues from the viewpoint of open-standard file formats for long-term preservation and open access.

eun g. park (eun.park@mcgill.ca) is associate professor, school of information studies, mcgill university, montreal, canada. sam oh (samoh@skku.edu) is corresponding author and professor, department of library and information science, sungkyunkwan university, seoul, korea.

background on standard file formats

the term file format is generally defined as what “specifies the organization of information at some level of abstraction, contained in one or more byte streams that can be exchanged between systems.”3 according to interpares 2, file format is “the organization of data within files, usually designed to facilitate the storage, retrieval, processing, presentation, and/or transmission of the data by software.”4 the premis data dictionary for preservation metadata observes that, technically, file format is “a specific, pre-established structure for the organization of a digital file or bitstream.”5 in general, file format can be divided into two types: an access format and a preservation format. an access format is “suitable for viewing a document or doing something with it so that users access the on-the-fly converted access formats.”6 in comparison, a preservation format is “suitable for storing a document in an electronic archive for a long period”7; it provides “the ability to capture the material into the archive and render and disseminate the information now and in the future.”8 while the ability to ensure long-term preservation focuses on the sustainability of preservation formats, the document in its access format tends to emphasize that it should be accessible and available by users, presumably all of the time. many researchers have discussed file formats and long-term preservation in relation to various types of resources.
for example, folk and barkstrom describe and adopt several attributes of file formats that may affect the long-term preservation of scientific and engineering data (e.g., the ease of archival storage, ease of archival access, usability, data scholarship enablement, support for data integrity, and maintainability and durability of file formats).9 barnes suggests converting word processing documents in digital repositories, which are unsuitable for long-term storage, into a preservation format.10 the evaluation by rauch, krottmaier, and tochtermann illustrates the practical use of file formats for 3d objects in terms of long-term reliability.11 others have developed and/or applied numerous criteria in different settings. for instance, sullivan uses a list of desirable properties of a long-term preservation format to explain the purpose of pdf/a from an archival and records management perspective.12 sullivan cites device independence, self-containment, self-describing, transparency, accessibility, disclosure, and adoption as such properties. rauch, krottmaier, and tochtermann’s study applies criteria that consist of technical characteristics (e.g., open specification, compatibility, and standardization) and market characteristics (e.g., guarantee duration, support duration, market penetration, and the number of independent producers). rog and van wijk propose a quantifiable assessment method to calculate composite scores of file formats.13 they identify seven main categories of criteria: openness, adoption, complexity, technical protection mechanism, self-documentation, robustness, and dependencies.
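a composite score of the kind just mentioned can be illustrated with a weighted average over the seven categories. the sketch below is an assumption-laden example and not rog and van wijk's published method: the 0 to 2 scoring scale, the weights, the scores, and the function name `composite_score` are all invented for illustration.

```python
# the seven categories named in the text
CRITERIA = (
    "openness",
    "adoption",
    "complexity",
    "technical_protection_mechanism",
    "self_documentation",
    "robustness",
    "dependencies",
)


def composite_score(scores, weights, max_score=2):
    """weighted average of per-criterion scores, rescaled to the range 0-100."""
    missing = [c for c in CRITERIA if c not in scores or c not in weights]
    if missing:
        raise ValueError(f"missing criteria: {missing}")
    total_weight = sum(weights[c] for c in CRITERIA)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted = sum(scores[c] * weights[c] for c in CRITERIA)
    return 100 * weighted / (max_score * total_weight)


if __name__ == "__main__":
    # hypothetical institutional weights and hypothetical scores for one format
    weights = dict(zip(CRITERIA, (3, 2, 1, 3, 1, 2, 2)))
    scores = dict(zip(CRITERIA, (2, 2, 1, 2, 1, 1, 2)))
    print(round(composite_score(scores, weights), 1))
```

separating weights from scores lets each institution express its own preservation priorities while scoring formats on a shared scale.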
sahu focuses on the criteria developed by the uk’s national archives, which include open standards, ubiquity, stability, metadata support, feature set, interoperability, and viability.14 a more comprehensive evaluation by the lc reveals three components—technical factors, quality, and functionality—while placing a particular emphasis on the balance between the first two.15 hodge and anderson use seven criteria for sustainability, which are similar to the technical factors of the lc study: disclosure, adoption, transparency, self-documentation, external dependencies, impact of patents, and technical protection mechanisms.16 some institutions adopt another term, standard file formats, to differentiate accepted and recommended file formats from others. according to the david project, “standard file formats owe their status to (official) initiatives for standardizing or to their widespread use.”17 standard may be too general to specify the elements of file formats. however, there is a recognition that only those file formats accepted and recommended by national or international standard organizations (such as the international standardization organization [iso], international industry imaging association [i3a], www consortium, etc.) are genuine standard file formats. for example, iso has announced several standard file formats for images: tiff/it (iso 12639:2004), png (iso/iec 15948:2004), and jpeg 2000 (iso/iec 15444:2003, 2004, 2005, 2007, 2008). for document file formats, pdf/a-1 (iso standard 19005-1. document file format for long-term preservation) is one example. this format is proprietary to maintain archival and records-management requirements and to preserve the visual appearance and migration needs of electronic documents. office open xml file format (iso/iec 29500–1:2008.
information technology—document description and processing languages) is another open standard that can be implemented from microsoft office applications on multiple platforms. odf (iso/iec 26300:2006. information technology—open document format for office applications [opendocument] v1.0) is an xml-based open file format. regardless of iso-announced standards, some errors in these file formats have been reported. for example, although pdf/a-1 is for longterm preservation of and access to documents, studies reveal that the feature-rich nature of pdf can create difficulties in preserving pdf information over time.18 to overcome the barriers of pdf and pdf/a-1, xml technology seems prevalent for digital resources in archiving systems and digital preservation.19 the digital repository community is treating xml technology as a panacea and converting most of their digital resources to xml. the netherlands institute for scientific information service (nisis) adopts another noteworthy definition of standard file formats. it observes that standard image file formats “are widely accepted, have freely available specifications, are highly interoperable, incorporate no data compression and are capable of supporting preservation metadata.”20 this definition implies specific and advanced ramifications for cost-free interoperability and metadata, which closely relates to open access. open standard is another relevant term to consider in file formats. although perspectives vary greatly between researchers, open standards can be acquired and used without any barrier or cost.21 in other words, open standard products are free from restrictions, such as patents, and are independent of proprietary hardware or software. since the 1990s, open standard has been broadly adopted in many fields and is now an almost compulsory feature in information services. 
to follow the national archives’ definition, open standard formats are “formats for which the technical specifications have been made available in the public domain.”22 in comparison, folk and barkstrom approach open standards from institutional support perspectives, relying on user communities for standards that are widely available and used.23 on a more specific level, stanescu emphasizes independence as the basic selection criterion for file formats.24 others, such as todd, propose determining whether a standard should be more open than others by applying criteria: adoption, platform independence, disclosure, transparency, and metadata support.25 other factors considered by todd include reusability and interoperability; robustness, complexity, and viability; stability; and intellectual property (ip) and rights management.26 echoing the lc, hodge and anderson also suggest a list of selection criteria that have been grouped under the banner of “technical factors”: disclosure, adoption, transparency, self-documentation, external dependencies, impact of patents, and technical protection mechanisms.27 researchers agree that open standard file formats are less obsolete and more reliable than proprietary formats.28 close examination of the nisis definition mentioned above reveals that standard file formats are in reality not free, nor do they allow unrestricted access to resources. the three file formats that iso has announced (pdf/a, ooxml, and odf) are proprietary and sometimes costly. they also prohibit the purchase of access to a proprietary standard, although there is an assumption that a standard should be free from legal and financial restrictions. the iso-announced file formats, in short, are only standard file formats, not open standard file formats.
for cultural heritage institutions, questions regarding appropriate selection criteria and the sufficiency of existing international standard file formats for long-term preservation and access remain unanswered. there exists neither a uniform method to compare the specifications of different file formats nor an objective approach to assess format specifications that would ensure long-term preservation and persistent access.

objectives of the study

in this study, we attempt to better define and establish open-standard file-format-selection criteria. to that end, we assess and compile the most frequently used attributes of file formats.

method

we performed a comprehensive review of published articles, institutional reports, and other literature to identify the current knowledge regarding file-format-selection criteria. we included literature that deals with the three standard file formats (pdf, pdf/a, and xml) but excluded the recently announced odf format due to the scarcity of literature on odf. among the more than thirty articles initially reviewed, only twenty-five that use their own clear attributes were included in this study. all of the attributes that we have employed are listed by frequency and grouped according to similarities in meaning (see appendix). the original definitions or descriptions that we used are listed in the second column. the file formats that we assessed by their attributes are listed in the third column. when we give attributes without specific definitions or descriptions, “no definite term” is inserted.

findings

as illustrated in the appendix, the criteria identified by the studies vary.
although the requirements and context of the studies may differ, the most common criteria can be divided into five categories: functionality, metadata, openness, interoperability, and independence. first, functionality refers to the ability of a format to do exactly what it is supposed to be doing.29 it is important to distinguish between two broad uses: preservation of document structure and formatting and preservation of useable content. to preserve document formatting, a “published view” of a given piece of content is critical for distribution. other content, such as database information or device-specific documents, needs to be preserved as well. functionality criteria include various attributes related to formats and structure or physical and technical specifications of files (e.g., robustness, feature set, viability, color maintenance, clarity, compactness, modularity, compression algorithms, etc.). second, metadata indicates that a format allows rich descriptive and technical metadata to be embedded in files. metadata can be expressed as metadata support, self-documentation (selfdocumenting), documentation, content-level (as opposed to presentation-level) description, selfdescribing, self-describing files, formal description of format, etc. third, openness refers to specifications of a file format that are publicly available and accessible and formats that are not proprietary. whether seen as a single definition or as a set of criteria, the characteristic that appears to be at the core of the open standard movement is its independence from outside proprietary or commercial control. openness also may refer to the autonomy of a file format, which relies on several factors. first, the document should be self-contained in terms of the content information (e.g., the text), the structural information (i.e., for those documents that are structured), the formatting information (e.g., fonts, colours, styles, etc.), and the metadata information. 
self-containment does not necessarily mean that an archivist will only have one document to deal with. it does mean, however, that they will have documents that will provide them with all the information to access and process the content, structure, formatting, and metadata. openness is expressed as open availability by some researchers.30 other researchers adopt the term disclosure for expressing that specification is publicly available.31 fourth is the independence of a document from proprietary or commercial hardware and software configurations, especially to prevent any issues resulting from different versions of software, hardware, and operating systems. this aspect is expressed in the appendix as open standards, open source software or equivalent, standard/proprietary, etc. this also closely relates to independence, one of the five categories in the appendix, expressed as device independencies, independent implementations, no external dependency, no external dependencies, portability, and monitoring obsolescence. having documents in a proprietary format controlled by a third party implies that, at one time or another, this format may no longer be supported, or that a change in the user agreement may lead to restricted access, access to outdated material, or patent and copyright issues. this fact means that the document must be freely accessible, without password restrictions or protection, and without any digital rights management scheme. blocking access to a document with a password can lead to serious problems if the password gets lost. in addition, the size and compactness of the document will influence the selection of a file format.
fifth, interoperability primarily refers to the ability of a file format to be compatible with other formats and to exchange documents without loss of information.32 specifically, it refers to the ability of a given software to open a document without requiring any special application, plug-in, codec, or proprietary add-on. adherence to open source standards is usually a good indication of the interoperability of a format. in general, an open standard is released after years of bargaining and agreements between major players. supervision by an international standard (such as iso or the w3c) commonly helps propagate the format. in addition to the five categories mentioned above, other attributes are often used. presentation, authenticity, adoption, protection, preservation and reference are such examples. among these attributes, authenticity, although this is the seventh in the appendix, is one of the most important attributes in archives and records management. it refers to the ability to guarantee that a file is what it originally was without any corruption or alteration.33 specific to authenticity is data integrity, which assesses the integrity of the file through an internal mechanism (e.g., png files include byte sequences to validate against errors). another method of validating the authenticity of a document is to look at its traceability,34 that is, the traces left by the original author and those who modified or opened a file. one example is the difference between the creation date, modification date, and access date of any file on a personal computer. these three dates correspond to a moment when someone (often a different person each time) opened the file. other mechanisms may require log information, which is external to the file. another good indication of authenticity is the stability of a format.35 a format that is widely used is more likely to be stable. 
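the internal integrity mechanism mentioned above for png can be made concrete. in the png format each chunk carries a crc-32 computed over its type and data, and a reader can recompute it to detect corruption. the sketch below is a minimal check under stated assumptions: the function names `png_chunk` and `check_png_integrity` are hypothetical, the tiny test image is built in memory, and only the signature and chunk checksums are verified, not the image content itself.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_chunk(chunk_type, data):
    """build one chunk: 4-byte length, 4-byte type, data, crc-32 of type + data."""
    body = chunk_type + data
    crc = zlib.crc32(body) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + body + struct.pack(">I", crc)


def check_png_integrity(blob):
    """return True if the signature is present and every chunk crc matches."""
    if not blob.startswith(PNG_SIGNATURE):
        return False
    pos = len(PNG_SIGNATURE)
    while pos + 8 <= len(blob):
        (length,) = struct.unpack(">I", blob[pos:pos + 4])
        chunk_type = blob[pos + 4:pos + 8]
        data_end = pos + 8 + length
        if data_end + 4 > len(blob):
            return False  # truncated chunk
        (stored_crc,) = struct.unpack(">I", blob[data_end:data_end + 4])
        if zlib.crc32(blob[pos + 4:data_end]) & 0xFFFFFFFF != stored_crc:
            return False  # chunk was altered after it was written
        if chunk_type == b"IEND":
            return True
        pos = data_end + 4
    return False  # no closing IEND chunk found


if __name__ == "__main__":
    # a 1x1 grayscale image: header chunk, one compressed scanline, end chunk
    header = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    tiny_png = (
        PNG_SIGNATURE
        + png_chunk(b"IHDR", header)
        + png_chunk(b"IDAT", zlib.compress(b"\x00\x00"))
        + png_chunk(b"IEND", b"")
    )
    damaged = bytearray(tiny_png)
    damaged[41] ^= 0xFF  # flip one byte inside the image data
    print(check_png_integrity(tiny_png), check_png_integrity(bytes(damaged)))
```

a check of this kind detects accidental corruption only; it says nothing about who created or modified the file, which is why the text treats traceability as a separate question.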
a stable format is also more likely to cause less data loss and corruption; hence it is a better indicator of authenticity. presentation includes attributes related to presenting and rendering data, expressed as distributing a page image, normal rendering, self-containment, self-contained, and beyond normal rendering. adoption indicates how popular and widely a file format is adopted by user communities, also represented as popularity, widely used formats, ubiquity, or continuity. protection includes the technical protection mechanism or source verification to protect with security skills. preservation means long-term preservation, institutional support, or ease of transformation and preservation. reference indicates citability, or referential extensibility. among other attributes, transparency is interesting to note because it indicates the degree to which files are open to direct analysis with basic tools and human readability. another important aspect across these criteria is that the terminologies used in the studies may be quite different yet describe the same or similar concepts from different angles. for instance, rog and van wijk use openness for standardization and specification without restrictions,36 while several other researchers use open availability to convey the same thing.37 they in turn adopt the term disclosure to express that specification is publicly available.38

discussion and conclusion

functionality, metadata, openness, interoperability, and independence appear to be the most important factors when selecting file formats. when file formats for long-term preservation and open access are under discussion, cultural heritage institutions need to consider many issues. despite several efforts, it is still tricky for them to identify the most appropriate file format or even to discern acceptable formats from unacceptable formats.
where it is difficult to prevent the creation of a new file format, format selection is not an easy task, both in theory and in practice. it is critical, however, to base the decision on a clear understanding of the purpose for which the document is preserved: access preservation or repurposing preservation. cultural heritage institutions and digital repository communities need to guarantee long-term preservation of digital resources in selected file formats. additionally, users find it necessary to have access to digital information in these file formats. additional consideration involves the level of access users may enjoy (e.g., long-term access, permanent access, open access, persistent access, etc.). when determining international standard file formats, an aspect of open access should be included because it is a well-liked topic. it is necessary to develop a scale or measurement to assess open-standard format specifications to ensure long-term preservation and open access. identifying which attributes are required to be an open-standard file format and which digital format is most apt for the use and sustainability of long-term preservation is a meaningful task. the outcome of our study provides a framework for appropriate strategies when selecting file formats for long-term preservation and access to digital content. we hope that the criteria described in this study will benefit librarians, preservers, record creators, record managers, archivists, and users. 
we are reminded of todd’s remark that “the most important action is to align the recognition and weighting of criteria with a clear preservation strategy and keep them under review using risk management techniques.”39 the question of how to adopt and implement these attributes can only be answered in the local context and decisions of each cultural heritage institution.40 each institution should consider implementing a file format throughout the entire life cycle of digital resources, with a holistic approach to managerial, technical, procedural, archival, and financial issues for the purpose of long-term preservation and persistent access. the criteria may change over time, as is necessary for any format to adequately serve its purpose. maintaining its quality may be an ongoing task that cultural heritage institutions should take into account at all times. even more importantly, cultural heritage institutions need to establish and implement a set of standard guidelines specific to each context for the selection of open-standard file formats. note: this research was supported by the sungkyunkwan university research fund (2010-2011). information technology and libraries | december 2012 51 references and notes 1. library of congress, “sustainability of digital formats: planning for library of congress collections,” www.digitalpreservation.gov/formats/intro/intro.shtml (accessed november 21, 2011). 2. global digital format registry, www.gdfr.info (accessed november 17, 2011); the technical registry pronom, www.nationalarchives.gov.uk/aboutapps/pronom (accessed november 21, 2011). 3. mike folk and bruce r. barkstrom, “attributes of file formats for long-term preservation of scientific and engineering data in digital libraries” (paper presented at the joint conference on digital libraries (jcdl), houston, tx, may 27–31, 2003), 1, www.larryblakeley.com/articles/storage_archives_preservation/mike_folk_bruce_barkstrom2 00305.pdf (accessed november 21, 2011). 4. 
interpares 2 project glossary, p. 24, www.interpares.org/ip2/ip2_term_pdf.cfm?pdf=glossary (accessed november 21, 2011). 5. premis editorial committee, premis data dictionary for preservation metadata, ver. 2.0, march 2008, p. 195, www.loc.gov/standards/premis/v2/premis-2-0.pdf (accessed november 21, 2011). 6. ian barnes, “preservation of word processing documents,” july 14, 2006, p. 4, http://apsr.anu.edu.au/publications/word_processing_preservation.pdf (accessed november 21, 2011). 7. ibid. 8. gail hodge and nikkia anderson, “formats for digital preservation: a review of alternatives and issues,” information services & use 27 (2007): 46. 9. folk and barkstrom, “attributes of file formats.” 10. barnes, “preservation of word processing documents.” 11. carl rauch, harald krottmaier, and klaus tochtermann, “file-formats for preservation: evaluating the long-term stability of file-formats,” in proceedings of the 11th international conference on electronic publishing 2007 (vienna, austria, june 13–15, 2007): 101–6. 12. susan j. sullivan, “an archival/records management perspective on pdf/a,” records management journal 16, no. 1 (2006): 51–56. 13. judith rog and caroline van wijk, “evaluating file formats for long-term preservation,” 2008, www.kb.nl/hrd/dd/dd_links_en_publicaties/publicaties/kb_file_format_evaluation_method_2 7022008.pdf (accessed november 21, 2011). 
14. d. k. sahu, “long term preservation: which file format to use” (paper presented in workshops on open access & institutional repository, chennai, india, may 2–8, 2004), http://openmed.nic.in/1363/01/long_term_preservation.pdf (accessed november 21, 2011).
15. cendi digital preservation task group, “formats for digital preservation: a review of alternatives and issues,” www.cendi.gov/publications/cendi_presformats_whitepaper_03092007.pdf (accessed november 21, 2011).
16. hodge and anderson, “formats for digital preservation.”
17. david 4 project (digital archiving, guideline and advice 4), “standards for fileformats,” 1, www.expertisecentrumdavid.be/davidproject/teksten/guideline4.pdf (accessed november 21, 2011).
18. sullivan, “an archival/records management perspective on pdf/a”; john michael potter, “formats conversion technologies set to benefit institutional repositories,” http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.124.7881&rep=rep1&type=pdf (accessed november 21, 2011).
19.
eva müller et al., “using xml for long-term preservation: experiences from the diva project,” in proceedings of the 6th international symposium on electronic theses and dissertations (may 20–24, 2003): 109–16, https://edoc.hu-berlin.de/conferences/etd2003/hansson-peter/html/index.html (accessed november 21, 2011).
20. rene van horik, “image formats: practical experiences” (paper presented in erpanet training, vienna, austria, may 10–11, 2004), 22, www.erpanet.org/events/2004/vienna/presentations/erpatrainingvienna_horik.pdf (accessed november 21, 2011).
21. open standard is related to open access, which comes from the open access movement that allows resources to be freely available to the public and permits any user to use those resources (e.g., mainly electronic journals, repositories, databases, software applications, etc.) without financial, legal, or technical barriers. see amy e. c. koehler, “some thoughts on the meaning of open access for university library technical services,” serials review 32, no. 1 (march 2006): 17–21; budapest open access initiative, “read the budapest open access initiative,” www.soros.org/openaccess/read.shtml (accessed november 21, 2011).
22. national archives, “selecting file formats for long-term preservation,” 6, www.kb.nl/hrd/dd/dd_links_en_publicaties/publicaties/kb_file_format_evaluation_method_27022008.pdf (accessed november 21, 2011).
23.
folk and barkstrom, “attributes of file formats.”
24. andreas stanescu, “assessing the durability of formats in a digital preservation environment: the inform methodology,” d-lib magazine 10, no. 11 (november 2004), www.dlib.org/dlib/november04/stanescu/11stanescu.html (accessed november 21, 2011).
25. malcolm todd, “technology watch report: file formats for preservation,” www.dpconline.org/advice/technology-watch-reports (accessed november 21, 2011).
26. ibid.
27. hodge and anderson, “formats for digital preservation.”
28. edward m. corrado, “the importance of open access, open source, and open standards for libraries,” issues in science & technology librarianship (spring 2005), www.library.ucsb.edu/istl/05-spring/article2.html (accessed november 21, 2011); carl vilbrandt et al., “cultural heritage preservation using constructive shape modeling,” computer graphics forum 23, no. 1 (2004): 25–41; marshall breeding, “preserving digital information,” information today 19, no. 5 (2002): 48–49.
29. eun g.
park, “xml: examining the criteria to be open standard file format” (paper presented at the interpares 3 international symposium, oslo, norway, september 17, 2010), www.interpares.org/display_file.cfm?doc=ip3_isym04_presentation_3–3_korea.pdf (accessed november 21, 2011).
30. adrian brown, “digital preservation guidance note: selecting file formats for long-term preservation,” www.nationalarchives.gov.uk/documents/selecting-file-formats.pdf (accessed november 21, 2011); barnes, “preservation of word processing documents”; sahu, “long term preservation”; potter, “formats conversion technologies.”
31. stephen abrams et al., “pdf-a: the development of a digital preservation standard” (paper presented at the 69th annual meeting for the society of american archivists, new orleans, louisiana, august 14–21, 2005), www.aiim.org/documents/standards/pdf-a.ppt (accessed november 21, 2011); sullivan, “an archival/records management perspective on pdf/a”; cendi, “formats for digital preservation”; and hodge & anderson, “formats for digital preservation.”
32. the national archives, http://www.kb.nl/hrd/dd/dd_links_en_publicaties/publicaties/kb_file_format_evaluation_method_27022008.pdf (accessed november 21, 2011); ecma international, “office open xml file formats—ecma-376,” www.ecma-international.org/publications/standards/ecma-376.htm (accessed november 21, 2011).
33.
christoph becker et al., “systematic characterisation of objects in digital preservation: the extensible characterisation languages,” www.jucs.org/jucs_14_18/systematic_characterisation_of_objects/jucs_14_18_2936_2952_becker.pdf (accessed november 21, 2011); national archives, www.kb.nl/hrd/dd/dd_links_en_publicaties/publicaties/kb_file_format_evaluation_method_27022008.pdf (accessed november 21, 2011).
34. folk and barkstrom, “attributes of file formats.”
35. national archives, www.kb.nl/hrd/dd/dd_links_en_publicaties/publicaties/kb_file_format_evaluation_method_27022008.pdf (accessed november 21, 2011); rog and van wijk, “evaluating file formats for long-term preservation.”
36. rog and van wijk, “evaluating file formats for long-term preservation.”
37. see brown, “digital preservation guidance note: selecting file formats for long-term preservation,” www.nationalarchives.gov.uk/documents/selecting-file-formats.pdf (accessed november 21, 2011); barnes, “preservation of word processing documents”; sahu, “long term preservation”; potter, “formats conversion technologies.”
38.
stephen abrams et al., “pdf-a: the development of a digital preservation standard” (paper presented at the 69th annual meeting for the society of american archivists, new orleans, louisiana, august 14–21, 2005), www.aiim.org/documents/standards/pdf-a.ppt (accessed november 21, 2011); sullivan, “an archival/records management perspective on pdf/a”; cendi, “formats for digital preservation”; and hodge & anderson, “formats for digital preservation.”
39. todd, “technology watch report,” 33.
40. evelyn peters mclellan, “selecting digital file formats for long-term preservation: interpares 2 project general study 11 final report,” www.interpares.org/display_file.cfm?doc=ip2_file_formats(complete).pdf (accessed november 21, 2011).

appendix: file format attributes

no. | attribute | definition/description | assessed file format

1. functionality
robustness | robust against single point of failure, support for file corruption detection, file format stability, backward compatibility and forward compatibility (rog & van wijk, 2008; wijk & rog, 2007) | pdf/a-1 (limited); microsoft word (limited)
| a robust format contains several layers of defense against corruption (frey, 2000). | n/a
feature set | formats supporting the full range of features and functionality (brown, 2003) | n/a
| not defined (sahu, 2006) | n/a
viability | error-detection facilities to allow detection of file corruption (brown, 2003). | png format (yes)
| not defined (sahu, 2006) | n/a
support for graphic effects and typography | not defined (cendi, 2007; hodge & anderson, 2007) | tiff_g4 (no)
color maintenance | not defined (cendi, 2007; hodge & anderson, 2007) | tiff_g4 (limited)
clarity | support for high image resolution (cendi, 2007; hodge & anderson, 2007) | tiff_g4 (yes)
quality | this pertains to how well the format fulfills its task today: (1) low space costs, (2) highly encompassing, (3) robust, (4) simplicity, (5) highly tested, (6) loss-free, (7) supports metadata (clausen, 2004). | n/a
compactness | to minimize storage and i/o costs (folk & barkstrom, 2003) | n/a
simplicity | ease of implementing readers (folk & barkstrom, 2003) | n/a
file corruption detection | to be able to detect that a file has been corrupted; to provide error correction (folk & barkstrom, 2003) | n/a
raw i/o efficiency | formats that are organized for fast sequential access (folk & barkstrom, 2003) | n/a
availability of readers | to maintain ease of data access for readers (folk & barkstrom, 2003) | n/a
ease of subsetting | to process only part of data files (folk & barkstrom, 2003) | n/a
size | to transfer data in large blocks (folk & barkstrom, 2003) | n/a
ability to aggregate many objects in a single file | to maintain as small an archive “name space” as possible (folk & barkstrom, 2003) | n/a
ability to embed data extraction software in the files | the files come with read software embedded (folk & barkstrom, 2003). | n/a
ability to name file elements | to work with data based on manipulating the element names instead of binary offsets, or other references (folk & barkstrom, 2003) | n/a
rigorous definition | to be defined in a sufficiently rigorous way (folk & barkstrom, 2003) | n/a
multilanguage implementation of library software | to have multiple implementations of readers for a single format (folk & barkstrom, 2003) | n/a
memory | some formats emphasize the presence or absence of memory (frey, 2000). | tiff (yes)
accuracy | in some cases, the accuracy of the data can be decreased to save memory, e.g., through compression. in the case of a digital master, however, accuracy is very important (frey, 2000). | n/a
speed | the ability to access or display a data set at a certain speed is critical to certain applications (frey, 2000). | n/a
extendibility | a data format can be modified to allow for new types of data and features in the future (frey, 2000). | n/a
modularity | a modular data set definition is designed to allow some of its functionality to be upgraded or enhanced without having to propagate changes through all parts of the data set (frey, 2000). | n/a
plugability | related to modularity, this permits the user of an implementation of a data set reader or writer to replace a module with private code (frey, 2000). | n/a
interpretability | not binary formats (barnes, 2006) | rtf (yes); ms word (no); xml (yes)
| the standard should be written in characters that people can read (lesk, 1995). | n/a
complexity | human readability, compression, variety of features (rog & van wijk, 2008; wijk & rog, 2007). | n/a
| simple raster formats are preferred (puglia et al., 2004). | n/a
compression algorithms | the format uses standard algorithms (puglia et al., 2004). | n/a
accessibility | to prohibit encryption in the file trailer (sullivan, 2006) | pdf/a (yes)
component reuse | not defined (sahu, 2006) | pdf (no); html (limited); sgml (excellent); xml (excellent)
repurposing | not defined (sahu, 1999) | pdf (limited); html (limited); sgml (excellent); xml (excellent)
packaging formats | in general, packaging formats should be acceptable as transfer mechanisms for image file formats (puglia et al., 2004). | zip (yes)
significant properties | the format accommodates high-bit, high-resolution (detail), color accuracy, and multiple compression options (puglia et al., 2004). | n/a
processability | the requirement to maintain a processable version of the record to have any reuse value (brown, 2003) | conversion of a word-processed document into pdf format (no)
searching | not defined (sahu, 2006) | pdf (limited); html (good); sgml (excellent); xml (excellent)
no definite term | to support the automatic validation of document conversions and the evaluation of conversion quality by hierarchically decomposing documents from different sources and representing them in an abstract xml language (becker et al., 2008a; becker et al., 2008b) | n/a; xcl (yes)
| to make transferring data easy (johnson, 1999) | n/a; xml (yes)
| a format that is easy to restore and understand by both humans and machines (müller et al., 2003) | n/a; xml (yes)
| inability to be backed out into a usable format (potter, 2006) | pdfs (no)

2. metadata
self-documentation | self-documenting digital objects that contain basic descriptive, technical, and other administrative metadata (cendi, 2007; hodge & anderson, 2007) | pdf (yes); pdf/a (yes); tiff_g4 (yes); xml (yes)
| metadata and technical description of format embedded (rog & van wijk, 2008; wijk & rog, 2007) | pdf/a-1 (limited); microsoft word (limited)
| the ability of a digital format to hold (in a transparent form) metadata beyond that needed for basic rendering of the content (arms & fleischhauer, 2006) | n/a
self-documenting | to contain its own description (abrams et al., 2005) | n/a
documentation | deep technical documentation is publicly and fully available. it is maintained for older versions of the format (puglia et al., 2004). | n/a
metadata support | file formats making provision for the inclusion of metadata (brown, 2003) | tiff (yes); microsoft word 2000 (yes)
| not defined (kenney, 2001) | tiff 6.0 (yes); gif 89a (yes); jpeg (yes); flashpix 1.0.2 (yes); imagepac, photo cd (no); png 1.2 (yes); pdf (yes)
| not defined (sahu, 2006) | n/a
metadata | the format allows for self-documentation (puglia et al., 2004). | n/a
content-level description | not presentation-level description; structural markup, not formatting (barnes, 2006) | pdf (no); docbook (yes); tei (yes); xhtml (yes); xml (yes)
content-level, not presentation-level, descriptions | where possible, the labeling of items should reflect their meaning, not their appearance (lesk, 1995). | sgml (yes)
self-describing | many different types of metadata are required to decipher the contents of a file (folk & barkstrom, 2003). | n/a
self-describing files | embed metadata in pdf files (sullivan, 2006) | pdf/a (adobe extensible metadata platform required)
formal (bnf- or xml-like) description of format | to create new readers solely on the basis of formal descriptions of the file content (folk & barkstrom, 2003) | n/a
no definite term | its self-describing tags identify what your content is all about (johnson, 1999). | n/a; xml (yes)
| a format for strong descriptive and administrative metadata and the complete content of the document (müller et al., 2003) | n/a; xml (yes)

3. openness
disclosure | authoritative specification publicly available (abrams et al., 2005) | pdf/a (yes); microsoft word (no)
| the degree to which complete specifications and tools for validating technical integrity exist and are accessible to those creating and sustaining digital content (cendi, 2007; hodge & anderson, 2007; arms & fleischhauer, 2006) | pdf (yes); pdf/a (yes); tiff_g4 (yes); xml (yes)
| authoritative specification is publicly available (sullivan, 2006). | pdf/a (yes)
open availability | no proprietary formats (barnes, 2006) | odf (yes); gif (no); pdf (no); rtf (no); microsoft word (no)
| any manufacturer or researcher should have the ability to use the standard, rather than having it under the control of only one company (lesk, 1995). | kodak photocd (no); gif (no)
openness | standardization, restrictions on the interpretation of the file format, reader with freely available source (rog & van wijk, 2008; wijk & rog, 2007) | pdf/a-1 (yes); ms word (no)
| a standard is designed to be implemented by multiple providers and employed by a large number of users (frey, 2000). | n/a
| formats that are described by publicly available specifications or open-source source code can, with some effort, be reconstructed later: (1) open publicly available specification, (2) specification in public domain, (3) viewer with freely available source, (4) viewer with gpl’ed source, (5) not encrypted (clausen, 2004). | n/a
open-source software or equivalent | to move toward obtaining open-source arrangements for all parts of the file format and associated libraries (folk & barkstrom, 2003) | n/a
open standard | formats for which the technical specification has been made available in the public domain (brown, 2003) | jpeg (yes); pdf (limited); ascii (limited)
| not defined (sahu, 2006) | n/a
standard/proprietary | not defined (kenney, 2001) | tiff 6.0 (yes); gif 89a (yes); jpeg (yes); flashpix 1.0.2 (yes); imagepac, photo cd (no); png 1.2 (yes); pdf (yes)
nonproprietary formats | the specification is independent of a particular vendor (public records office of victoria, 2004). | n/a
no definite term | to avoid vendor-lock (potter, 2006) | odf (yes)

4. interoperability
interoperability | is the format supported by many software applications/os platforms or is it linked closely with a specific application (puglia et al., 2004)? | n/a
| the ability to exchange electronic records with other users and it systems (brown, 2003) | n/a
| not defined (sahu, 2006) | n/a
data interchange | not defined (sahu, 2006) | pdf (no); html (limited); sgml (excellent); xml (excellent)
compatibility | compatibility with prior versions of data set definitions often is needed for access and migration considerations (frey, 2000). | n/a
stability | compatibility between versions (folk & barkstrom, 2003) | n/a
| stable, not subject to constant or major changes over time (brown, 2003) | n/a
| the format is supported by current applications and backward compatible, and there are frequent updates to the format or the specification (puglia et al., 2004). | n/a
| not defined (sahu, 2006). | n/a
scalability | the design should be applicable both to small and large data sets and to small and large hardware systems (frey, 2000). | n/a
markup compatibility and extensibility | to support a much broader range of applications (ecma, 2008) | n/a; xml (yes)
suitability for a variety of storage technologies | the format should not be geared toward any particular technology (folk & barkstrom, 2003). | n/a
no definite term | to allow data to be shared across information systems and remain impervious to many proprietary software revisions (potter, 2006) | openoffice (yes)

5. independence
device independencies | can be reliably and consistently rendered without regard to the hardware/software platform (abrams et al., 2005) | pdf/a (yes); tiff (no)
static visual appearance | can be reliably and consistently rendered and printed without regard to the hardware or software platform used (sullivan, 2006). | pdf/a (yes); pdf/x (yes)
| this is a very important aspect for master files because they will be most likely used on various systems (frey, 2000). | n/a
independent implementations | independent implementations help ensure that vendors accurately implement the specification (public records office of victoria, 2004). | n/a
external dependency | degree to which the format is dependent on specific hardware, operating system, or software for rendering or use and the complexity of dealing with those dependencies in future technical environments (arms & fleischhauer, 2006) | n/a
external dependencies | the degree to which a particular format depends on particular hardware, operating system, or software for rendering or use and the predicted complexity of dealing with those dependencies in future technical environments (cendi, 2007; hodge & anderson, 2007) | pdf (limited); pdf/a (no); tiff_g4 (no); xml (no)
portability | a format that makes extensive use of specific hardware or operating system features is likely to be unusable when that hardware or operating system falls into disuse. a format that is defined in an independent way will be much easier to use in the future: (1) independent of hardware; (2) independent of operating system; (3) independent of other software; (4) independent of particular institutions, groups, or events; (5) widespread current use; (6) little built-in functionality; and (7) single version or well-defined versions (clausen, 2004). | n/a
monitoring obsolescence | information gathered through regular web harvesting can give us some information about what file types are approaching obsolescence, at least for the more frequently used types (clausen, 2004). | n/a
no definite term | a human-readable text format and internationalized character sets are supported (müller et al., 2003). | n/a; xml (yes)
| not dependent on specific hardware, not dependent on specific operating systems, not dependent on one specific reader, not dependent on other external resources (rog & van wijk, 2008; wijk & rog, 2007) | pdf/a-1 (limited); microsoft word (little)
| the format requires a plug-in for viewing if appropriate software is not available or relies on external programs to function (puglia et al., 2004). | n/a

6. presentation
distributing page image | not defined (sahu, 2006) | pdf (excellent); html (good); sgml (good); xml (good)
normal rendering | not defined (cendi, 2007; hodge & anderson, 2007). | pdf (yes); pdf/a (limited); tiff_g4 (yes); xml (yes)
presentation | preservation of its original look and feel (brown, 2003) | n/a
self-containment | everything that is necessary to render or print a pdf/a file must be contained within the file (sullivan, 2006). | pdf/a (yes)
self-contained | to contain all resources necessary for rendering (abrams et al., 2005) | n/a
beyond normal rendering | not defined (cendi, 2007; hodge & anderson, 2007). | pdf (yes); pdf/a (yes); tiff_g4 (yes); xml (limited)

7. authenticity
authenticity | the format must preserve the content (data and structure) of the record and any inherent contextual, provenance, referencing and fixity information (brown, 2003). | n/a
provenance traceability | ability to trace the entire configuration of data production (folk & barkstrom, 2003) | n/a
integrity of layout | not defined (cendi, 2007; hodge & anderson, 2007) | pdf (yes); pdf/a (yes); tiff_g4 (n/a); xml (yes)
integrity of rendering of equations | not defined (cendi, 2007; hodge & anderson, 2007) | pdf (yes); pdf/a (yes); tiff_g4 (n/a); xml (limited)
integrity of structure | not defined (cendi, 2007; hodge & anderson, 2007) | pdf (limited); pdf/a (limited); tiff_g4 (n/a); xml (yes)

8. adoption
adoption | degree to which the format is already used by the primary creators, disseminators, or users of information resources (cendi, 2007; hodge & anderson, 2007) | pdf (yes); pdf/a (yes); tiff_g4 (yes); xml (yes)
| worldwide usage, usage in the cultural heritage sector as archival format (rog & van wijk, 2008; wijk & rog, 2007) | pdf/a-1 (yes); microsoft word (limited)
| the degree to which the format is already used by the primary creators, disseminators, or users of information resources (arms & fleischhauer, 2006) | n/a
| widespread use may be the best deterrent against preservation risk (abrams et al., 2005). | tiff (yes)
| the format is widely used by the imaging community in cultural institutions (puglia et al., 2004). | n/a
flexibility of implementation | to promote its wide adoption (sullivan, 2006) | pdf/a (yes)
popularity | a format that is widely used (folk & barkstrom, 2003) | n/a
widely used formats | it is far more likely that software will continue to be available to render the format (public records office of victoria, 2004). | n/a
ubiquity | popular formats supported by as much software as possible (brown, 2003) | n/a
| not defined (sahu, 2006) | n/a
continuity | the file format is mature (puglia et al., 2004) | n/a

9. protection
technical protection mechanism | password protection, copy protection, digital signature, printing protection and content extraction protection (rog & van wijk, 2008; wijk & rog, 2007) | pdf/a-1 (limited); microsoft word (limited)
| implementation of a mechanism such as encryption that prevents the preservation of content by a trusted repository (cendi, 2007; hodge & anderson, 2007) | pdf (yes); pdf/a (no); tiff_g4 (no); xml (no)
| it must be able to replicate the content on new media, migrate and normalize it in the face of changing technology, and disseminate it to users at a resolution consistent with network bandwidth constraints (arms & fleischhauer, 2006). | n/a
| no encryption, passwords, etc. (abrams et al., 2005) | n/a
protection | the format accommodates error detection, correction mechanisms, and encryption options (puglia et al., 2004). | n/a
source verification | cryptographic encoding of files or digital watermarks without overburdening the data centers or archives (folk & barkstrom, 2003) | n/a

10. preservation
preservation | the format contains embedded objects (e.g., fonts, raster images) or links to external objects (puglia et al., 2004). | n/a
long-term institutional support | to ensure the long-term maintenance and support of a data format by placing responsibility for these operations on institutions (folk & barkstrom, 2003) | n/a
ease of transformation/preservation | the format will be supported for fully functional preservation in a repository setting, or the format guarantee can currently only be made at the bitstream (content data) level (puglia et al., 2004). | n/a
no definite term | to create files with either a very high or very low preservation value (becker et al., 2008a, becker et al., 2008b) | pdf (no); tiff (no)

11. reference
citability | a machine-independent ability to reference or “cite” the individual data element in a stable way (folk & barkstrom, 2003) | n/a
referential extensibility | ability to build annotations about new interpretations of the data (folk & barkstrom, 2003) | n/a
no definite term | an open and established notation (müller et al., 2003) | n/a; xml (yes)
| data is easily repurposed via tags or translated to any medium (johnson, 1999) | n/a; xml (yes)
| creating, using, and reusing tags is easy, making it highly extensible (johnson, 1999). | n/a; xml (yes)

12. others
transparency | degree to which the digital representation is open to direct analysis with basic tools, such as human readability using a text-only editor (cendi, 2007, hodge & anderson, 2007). | pdf (limited); pdf/a (limited); tiff_g4 (limited); xml (yes)
| in natural reading order (sullivan, 2006). | pdf/a (yes); microsoft notepad (yes)
| the degree to which the format is already used by the primary creators, disseminators, or users of information resources (arms & fleischhauer, 2006) | n/a
| amenable to direct analysis with basic tools (abrams et al., 2005) | n/a
| ample comment space to allow rich metadata (barnes, 2006) | n/a
| items should be labeled, as far as possible, with enough information to serve for searching or cataloging (lesk, 1995). | tiff (yes)
| a digital format may inhibit the ability of archival institutions to sustain content in that format (arms & fleischhauer, 2006). | n/a

table bibliography

abrams, stephen et al. 2005. “pdf-a: the development of a digital preservation standard.” paper presented at the 69th annual meeting for the society of american archivists, new orleans, louisiana, august 14–21, http://www.aiim.org/documents/standards/pdf-a.ppt (accessed november 21, 2011).
arms, caroline r. and carl fleischhauer. 2006.
“sustainability of digital formats: planning for library of congress collections.” http://www.digitalpreservation.gov/formats/sustain/sustain.shtml (accessed november 21, 2011).
barnes, ian. 2006. “preservation of word processing documents.” http://apsr.anu.edu.au/publications/word_processing_preservation.pdf (accessed november 21, 2011).
becker, christoph et al. 2008. “a generic xml language for characterising objects to support digital preservation.” in proceedings of the 2008 acm symposium on applied computing, fortaleza, ceara, brazil, march 16–20.
becker, christoph et al. 2008. “systematic characterization of objects in digital preservation: the extensible characterization language.” journal of universal computer science 14, no. 18: 2936–2952.
brown, adrian. 2003. “the national archives. digital preservation guidance note: selecting file formats for long-term preservation.” http://www.nationalarchives.gov.uk/documents/selecting-file-formats.pdf (accessed november 21, 2011).
cendi digital preservation task group. 2007. “formats for digital preservation: a review of alternatives and issues.” http://www.cendi.gov/publications/cendi_presformats_whitepaper_03092007.pdf (accessed november 21, 2011).
clausen, lars r. 2004. “handling file formats.” http://netarchive.dk/publikationer/fileformats-2004.pdf (accessed november 21, 2011).
ecma. 2008. “office open xml file formats—part 1.” 2nd ed. http://www.ecma-international.org/publications/standards/ecma-376.htm (accessed november 21, 2011).
folk, mike, and bruce barkstrom. 2003. “attributes of file formats for long-term preservation of scientific and engineering data in digital libraries.” paper presented at the joint conference on digital libraries, houston, tx, may 27–31. http://www.hdfgroup.org/projects/nara/sci_formats_and_archiving.pdf (accessed november 21, 2011).
frey, franziska. 2000. “5. file formats for digital masters.” in guides to quality in visual resource imaging, research libraries group and digital library federation. http://imagendigital.esteticas.unam.mx/pdf/guides.pdf (accessed november 21, 2011).
hodge, gail and nikkia anderson. 2007. “formats for digital preservation: a review of alternatives and issues.” information services & use 27: 45–63.
johnson, amy helen. 1999. “xml xtends its reach: xml finds favor in many it shops, but it’s still not right for everyone.” computerworld 33, no. 42: 76–81.
lesk, michael e. 1995. “preserving digital objects: recurrent needs and challenges.” in proceedings of the 2nd npo conference on multimedia preservation. brisbane, australia. http://www.lesk.com/mlesk/auspres/aus.html (accessed november 21, 2011).
müller, eva et al. 2003. “using xml for long-term preservation: experiences from the diva project.” in proceedings of the sixth international symposium on electronic theses and dissertations. berlin, may: 109–116, https://edoc.hu-berlin.de/conferences/etd2003/hansson-peter/pdf/index.pdf (accessed december 8, 2012).
potter, john michael. 2006.
“formats conversion technologies set to benefit institutional repositories.” http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.124.7881&rep=rep1&type=pdf (accessed november 21, 2011).
public records office of victoria (australia). 2006. “advice on vers long-term preservation formats pros 99/007 (version2) specification 4.” department for victorian communities. http://prov.vic.gov.au/wp-content/uploads/2012/01/vers_advice13.pdf (accessed november 21, 2011).
puglia, steven, jeffrey reed, and erin rhodes. 2004. “technical guidelines for digitizing archival materials for electronic access: creation of production master files—raster images.” us national archives and records administration. http://www.archives.gov/preservation/technical/guidelines.pdf (accessed november 21, 2011).
rog, judith, and caroline van wijk. 2008. “evaluating file formats for long-term preservation.” national library of the netherlands. http://www.kb.nl/hrd/dd/dd_links_en_publicaties/publicaties/kb_file_format_evaluation_method_27022008.pdf (accessed november 21, 2011).
sahu, d.k. 2004. “long term preservation: which file format to use.” presentation at workshops on open access & institutional repository, chennai, india, may 2–8, http://openmed.nic.in/1363/01/long_term_preservation.pdf (accessed november 21, 2011).
sullivan, susan j. 2006. “an archival/records management perspective on pdf/a.” records management journal 16, no. 1: 51–56.
van wijk, caroline, and judith rog. 2007. “evaluating file formats for long-term preservation.” presentation at international conference on digital preservation, beijing, china, oct 11–12. http://ipres.las.ac.cn/pdf/caroline-ipres2007-11-12oct_cw.pdf (accessed november 21, 2011).

prevention of duplicate orders by matching new orders being input with records already in the database. a potential duplicate is reported if there is a match on both the author and the title fields. it was decided at the time of implementation at lewis and clark that this criterion was too restrictive, and clas was programmed to report a duplicate if only the title fields matched. after some months of experience, it turned out that even this requirement was excessively restrictive: a slight variation in the way a title was input would prevent a duplicate from showing up. the criterion was then further relaxed to signal duplicates if either the title or the author's last name matched.
this, however, was too broad a net: although no duplicates were missed, ordering a book by wilson or smith produced a tedious list of potential duplicates. hence, the requirement was tightened slightly to look for a match in either the title or the author's last name and first initial. this final criterion is currently serving well the needs of the watzek library. what is important about this evolutionary process is that it illustrates the dynamic way in which a library can "fine-tune" an automated system that is receptive to user modifications.

since peas is supposed to be a self-explanatory system, it lacks any documentation. clas is still a self-explanatory system, but nevertheless a manual has been produced to describe all its features and to record programming information such as the structure of the files. one version of the documentation is kept in machine-readable form so that it can be easily updated to correspond to developments in the program.

in conclusion, it can be stated that a library-application software package has been successfully transplanted from one institution to another, from one hardware environment to another, and in doing so has matured into a fuller and more flexible system, which it is hoped will, in turn, benefit other libraries contemplating the automation of their acquisitions operation.2

references
1. jenko lukac, "a no cost online acquisitions system for a medium-size library," library journal 107:684-85 (march 15, 1980).
2. interested libraries can request a copy of the clas program ($80) or manual ($40) directly from the author.

the significance of information in the ordinary conduct of life*
robert newhard: torrance public library, torrance, california.

the information benefit provided to the general public by the developing telecommunications systems will be highly dependent upon the provider's perception of the current and potential role of information in the ordinary interests of life.
assessing this role cannot easily be done by standard questionnaire or survey methods because information does not have a conscious function in people's lives. some paradigms from the past and present may, therefore, be of use in articulating the everyday importance of information.

* a version of this paper was delivered at the meeting on "public libraries and the remote electronic delivery of information (redi)," columbus, ohio, march 23-24, 1981.
102 journal of library automation vol. 14/2 june 1981

the tool paradigm: information as a link between man and his tools or repairing a lost confidence

prior to the industrial revolution, most production was carried on in the home, using tools either made or repaired mainly at home. in this cottage industry, each person was very close to and secure in the use of his tools. with the advent of the industrial revolution and the factory system, the worker no longer owned his tools, but went to one place to use someone else's tools. man and his tools began to separate. many used the tools, fewer understood them. this process began to create the "expert." today most of the tools we use (the automobile, telephone, computer terminal, etc.) we cannot repair. this has led to a set of latter-day "high priests" upon whom, because of their specific knowledge, we are dependent. i suggest that this trend toward information experts is inimical to a democratic society because of the dependency it creates and because of the pervasive hopelessness it engenders in the public mentality regarding matters as diverse as appliance repair and politics. this process, in a milieu of rapidly developing technology, may seem irreversible. i would suggest, however, that well-packaged and targeted information could do much to reduce frustration, restore the judgmental effectiveness and the self-confidence of the ordinary citizen (we are all ordinary for more purposes than not), and to improve citizen confidence in society.
for example: i am told i need a clutch job on my car. i can check the flat-rate manual in the library to determine the amount of time that job should take for my make, model, and year of car. the manual will even give the price, but, being a book, it is out of date. in california, each garage must post its hourly rate. suppose the flat-rate manual indicated the clutch job should take three hours and the posted rate is $20 per hour. if the estimate comes back at $150 instead of $60, i know something is wrong. either there is more to the job than clutch repair or it is a rip-off. in either case, even though i cannot repair my car, i can, because of information, make a rational judgment. i am effective in dealing with this problem despite my technological incompetence. the flat-rate manual is a packaged set of information targeted on a specific range of problems, and can function as an imperfect paradigm for what information development commensurate to technological development should be.

the word "they" as a paradigm

another indicator of the "information gap" in this society is a particular use of the word "they." if one listens to the frequency with which people say "they do this," "they don't care," "they're all politicians," etc., one can grasp the pervasiveness of the "information gap." i suggest that the word "they," so used, almost always indicates an absence of information. this absence is frequently accompanied by suspicion and distrust.

the yellow pages paradigm

another measure of the importance of information to people in general consists of imagining what would happen if the yellow pages of the telephone book were suddenly withdrawn. there would, i suggest, be a minor revolution.

freedom as a paradigm

a final perspective on the importance of information may be found in its bearing on human freedom.
in the earlier phases of this society's development, freedom consisted of enough space as in horace greeley's "go west, young man," or in frederick turner's observations on the frontier as a release valve for social pressure in the eastern united states, or daniel boone needing elbow room. today we live on top of each other and this aspect of freedom is rapidly diminishing. one might view time as a delineator of freedom, as we often say: "if only i had enough time." the absence of the time found in simpler societies, the temporal pressure cooker of today where one's days off are filled with running one's personal business (errands, bill paying, etc.), suggests we have lost much of this temporal freedom. i would suggest that the basic de facto support of freedom now lies with information. information, like knowledge, as observed by francis bacon, is power, and distributed information is distributed power.

information awareness

contrast these indicators of the public importance of information with a lack of conscious awareness of the significance of information. we do not have an information-prone society. when faced with a problem or interest, i suggest, we are more prone to ask, "what do i have to do?" rather than, "what do i have to know?" part of this reaction is probably due to the fact that when we ask "what do i have to know?" we are faced with another problem in addition to the initial one; i.e., where to get the information. this added effort simply confirms in us our indifference to information, and we take our best shot at solving the problem through decision and action. i sometimes think we have made a virtue of the information incapacity by the way we laud decision making as an indicator of ability. if the foregoing examples are reasonably accurate, we are then faced with a situation in which information is fundamentally important to societal and individual well-being, but is not perceived to be so by people in the conduct of their daily affairs.
computer-supported telecommunications systems can be the instrument for accelerating information control by a few (this has been much of the trend, so far, as indicated by corporate, research, and technical use of these systems), or they can be used to build information confidence, use, and desire throughout society. this option, i suggest, is central to the significance of telecommunications systems for a democratic society. if the latter option is to be obtained, i suggest that information will have to be packaged and targeted so well on people's everyday problems and interests that it will be easier and more productive to say "what do i have to know?" before saying "what do i have to do?" a basic approach to articulating an information service of this kind consists of the following steps:
1. determine and prioritize the individual and societal problems and interests of a given community.
2. ascertain the information parameters of those problems and interests.
3. locate and obtain the information necessary to address those problems and interests.
4. organize this information so as to optimally target the specified problem or interest and to be as easily retrievable as possible. this requires an understanding of the context in which the information is used so that it is optimally relevant, and an understanding of the language and problem articulation common to the individuals in the community in order to ensure rapid retrieval.

a lesson in interactive television programming: the home book club on qube
w. theodore bolton: oclc, inc., columbus, ohio.

on december 1, 1977, warner communications christened what has become the most publicized and talked about technological development in the field of cable television: qube, its two-way interactive cable system. publicity posters claimed that this would be "a day you'll tell your grandchildren about," and broadcasters added the word "interactive" to their cocktail-party vocabulary.
academicians who ten years ago forecast a technological revolution initiated by the marriage of computer to cable television smugly grinned and saw their dreams turn into reality. response to qube, however, has been mixed. participatory television brings, to some, futuristic images of instant democracy; others warn of its potential demagogic power.1 regardless of your critical persuasion, there now exists what former cbs executive turned warner amex2 consultant mike dann calls "a whole new utility."3 this whole new utility, whether in the form of qube cable television, or some other combination of computer, cable television, telephone, and standard over-the-air broadcasting, will change the way we conduct our lives and interact with other people.

the history of the home book club

early in 1979, the oclc, inc., research staff appraised the nature and context of the qube facilities (located in co

missing items: automating the replacement workflow process | smith et al. 93

tutorial
cheri smith, anastasia guimaraes, mandy havert, and tatiana h. prokrym
missing items: automating the replacement workflow process

academic libraries handle missing items in a variety of ways. the hesburgh libraries of the university of notre dame recently revamped their system for replacing or withdrawing missing items. this article describes the new process that uses a customized database to facilitate efficient and effective communication, tracking, and selector decision making for large numbers of missing items.

though missing books are a ubiquitous problem affecting multiple aspects of library services and workflows, policies and procedures for handling them have not generated a great deal of buzz in library literature. for the purpose of this article, missing books (and other collection items) refers to items that were not returned from circulation or have otherwise gone missing from the collection and cannot be located.
significant staff time may be invested in the missing-book process by departments such as collection development, circulation, acquisitions, database management, systems, and public services. more importantly, user experiences can be negatively affected when missing books are not handled efficiently and effectively. while most libraries have procedures for replacing or suppressing catalog records for items that are missing from the stacks or have been checked out and never returned, few have made these procedures public. this article describes the procedure developed by the hesburgh libraries of the university of notre dame to replace missing items or to withdraw them from the catalog. hesburgh libraries’ procedure offers streamlined, paperless routing of records for missing materials, accounts for “nondecisions” by subject librarians, and results in a shortened turnaround time for acquisitions and catalog-maintenance workflows.

hesburgh libraries’ experience

in 2005, hesburgh libraries recognized its need to develop a streamlined method of processing missing items. because of personnel changes and competing demands on staff time, the routine handling of missing materials had been suspended for roughly five years. during this period, circulation staff continued to perform searches. when staff declared an item officially missing, the item’s catalog record was updated to the item process status “missing” (mi) and paper records were routed to the collection development department office, but no further action was taken. the mounting backlog of missing items in the catalog became a recurring source of frustration to patrons and public-services employees alike. searches for books that were popular among undergraduates often led to items with a “missing” status. to compound the problem, budgetary constraints resulted in the suspension of spending from the fund earmarked for the replacement of missing items.
subject librarians were forced to use their own discipline-specific funds to replace items in their areas, but because there was no systematic means of notifying subject librarians of missing items, they replaced items very rarely and on a case-by-case basis—primarily when faculty or graduate students asked a selector to purchase a replacement for an item critical to their teaching or research. also in 2005, a library-wide fund to replace materials was made available. unfortunately, by that time, the tremendous backlog of catalog records for missing items rendered the existing paper-based system unworkable. as a result, a small task force was formed to manage the backlog and to develop a new method for handling future missing items. hesburgh libraries’ solution the missing items task force was initially composed of eight members representing all departments affected by changes in the procedures for handling missing books. the task force was chaired by the subject librarian for psychology and education. other members represented the circulation, collection development, cataloging, catalog and database maintenance (cadm), monograph acquisitions, and systems departments. during the initial meeting, each member described their portion of the workflow and communicated their requirements for effectively completing their parts of the process. because most items with the status “missing” were ones that a patron or patrons had either recently used or requested and could therefore be considered relatively high-use material, the task force quickly determined that the search time for missing books should be shortened from one year to six months. 
task force members from monograph acquisitions were particularly interested in making this change because newer books are more easily replaced if requests are made sooner; many books, especially in the sciences, go out of print quickly and become difficult to replace. the systems task force member supplied a spreadsheet containing the roughly three thousand missing items. this initial spreadsheet included all fields that might be useful for staff in monograph acquisitions, cataloging, cadm, and collection development. various strategies for disseminating the spreadsheet to subject librarians were discussed, but all ideas for how the subject librarians might interact with the spreadsheet seemed laborious and inevitably required that someone sort through each item on the list to determine whether the records needed to be sent to monograph acquisitions or cadm for further processing. the process seemed feasible for a onetime effort, but the task force did not see it as a suitable permanent solution. the task force then considered the feasibility of developing a customized database to manage all of the information necessary for library employees (primarily subject librarians and monograph acquisitions and cadm staff) to participate in the processing of missing books.

cheri smith (cheryl.s.smith.454@nd.edu) is coordinator for instructional services, anastasia guimaraes (aguimara@nd.edu) is supervisor of catalog and database maintenance, mandy havert (mhavert@nd.edu) is head of the monograph acquisitions department, and tatiana h. prokrym (tprokrym@nd.edu) is senior technical consultant at hesburgh libraries of the university of notre dame, notre dame, indiana.
94 information technology and libraries | june 2009

the database

once the task force determined that a database would serve hesburgh libraries’ needs more efficiently than a spreadsheet- or paper-based system, the task force enlisted the help of an applications developer.
hesburgh libraries had previously created a database for handling journal cancellations, and the task force decided to base the replacement application upon this model. the application is therefore written in php and uses a mysql database. the first step in designing the database was to determine which bibliographic metadata (such as call number, isbn, issn, imprint, etc.) would be required by subject librarians to specify replacement or withdrawal decisions, including whether the item was to be replaced with the same edition, any edition, or the newest available edition. because replacement funds may not always be available, the task force wanted to enable the selector to identify other funds to use for the replacement purchase. finally, the task force felt that, no matter how easy the system was to use, there would always be a few subject librarians who choose not to use it. it was therefore important that the database could also account for “nondecisions” from subject librarians. other general database requirements included that it be available through any web browser and accessible to only those people who are part of the replacement-book process. with those requirements in mind, the task force created a list of metadata elements to be included in the database (see table 1). on a quarterly basis, the application pulls the database fields—title, author, call number, sub library, imprint, isbn or issn, barcode, previous fund, local cost, description, item status, update date, bib system number, and system number—from hesburgh libraries’ ils (aleph v18) and imports into the replacements database. for each item, bibliographic, circulation, and acquisitions information is retrieved from aleph and combined to generate the export data file. procedurally, a list of all items with an item process status of “missing” is first retrieved into a temporary table from the item record (z30) table. 
this temporary table consists of the system number, status field, sublibrary, collection, barcode, description, and the last date the item was modified (z30-update-date in aleph). a second temporary table is then created that includes the purchase price and fund code originally used to purchase the item. the two temporary tables are joined and their information merged, creating a single list of missing items and related acquisitions information. this list is then linked to the bibliographic tables to obtain key bibliographic information such as title, author, imprint, isbn or issn, the ils bibliographic number, and the barcode. these combined results are converted into an ascii text file for import into the mysql replacements database. upon the import of the ascii file, an e-mail is sent to the collection development e-mail list, informing subject librarians that data has been loaded and is ready for their review and input. table 2 lists the purpose of each of the nine tables within the replacements database. figure 1 illustrates the relationships and linking fields between the tables.

table 1. fields for the replacements database
database field | data type
title | varchar(200)
author | varchar(150)
call number | varchar(30)
sub library | varchar(12)
imprint | varchar(150)
isbn or issn | varchar(150)
barcode | varchar(30)
previous fund | varchar(20)
local cost | decimal(10,2)
description | varchar(50)
item status | char(2)
update date | date
bib system number | int(9) unsigned zerofill
system number | varchar(50)
new database fields:
action to take | tinyint(1)
new fund code | int(10)
modified date | date
modified by | varchar(50)
notes | longtext
system-used fields:
transfer date | date
record id | int(10) (auto)

the database provides two “pick lists” for subject librarians. the first pick list is the action to take field.
primary choices are “any edition,” “newest edition only,” “micro format only,” and “do not replace.” the second pick list is the new fund field. the default choice for this field is hesburgh libraries’ replacement fund code, although any acquisitions funds may be selected. both pick lists provide data integrity and assurance that all input from the subject librarians is standardized. two internal fields, record id and transfer date, facilitate programming and identification. these fields are very important for auditing and tracking replacement records through the replacement process. rollbacks are easily handled through the manipulation of these two fields.

programmatic process

for the initial implementation of this application, the task force decided that batch loads would be performed on an as-needed basis. after the initial phase of the project, the task force implemented a quarter-based schedule. for each data load, the exported records are written to a text file, which is then imported into the replacements database through an import script. the import script archives the previous group of processed records, appending them to a set of historical tables stored within the database. the import script further processes the aleph data by eliminating duplicate records and ensuring there is only one record per barcode and system number. the historical tables are checked to see if a missing item has already been loaded into the database and processed. if a record has already been processed, it is automatically deleted from the newly imported item list. after the successful completion of the data load, an e-mail is automatically generated notifying subject librarians that the replacements database is ready for their review and input. the verified missing item records are then transferred to the main database table, “tblreplacements,” and are ready for updating.
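the duplicate and history checks in the import step can be sketched as follows. this is only an illustration in python, not the php/mysql code the application uses; the record layout, the function name filter_import, and the use of (barcode, system number) pairs as the matching key are assumptions made for the example.

```python
from __future__ import annotations

from typing import Iterable

# hypothetical record shape: a dict with "barcode" and "system_number" keys
Record = dict


def filter_import(
    imported: Iterable[Record],
    history_keys: set[tuple[str, str]],
) -> list[Record]:
    """keep one record per (barcode, system number) and drop records
    whose key already appears in the historical tables."""
    seen: set[tuple[str, str]] = set()
    kept: list[Record] = []
    for rec in imported:
        key = (rec["barcode"], rec["system_number"])
        if key in seen or key in history_keys:
            continue  # duplicate in this load, or processed in an earlier cycle
        seen.add(key)
        kept.append(rec)
    return kept


# illustrative data only
batch = [
    {"barcode": "b1", "system_number": "001", "title": "first title"},
    {"barcode": "b1", "system_number": "001", "title": "first title"},   # duplicate row
    {"barcode": "b2", "system_number": "002", "title": "second title"},  # already processed
    {"barcode": "b3", "system_number": "003", "title": "third title"},
]
history = {("b2", "002")}
new_records = filter_import(batch, history)
print([r["barcode"] for r in new_records])  # ['b1', 'b3']
```

in a mysql setting the same effect might be had with a unique key and an anti-join against the history table; the in-memory version above is only meant to make the rule explicit.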
included in the e-mail to subject librarians is a link that directs them to a search window allowing them to take action on the missing items (see figure 2). once the subject librarians update the records, the application provides a mechanism to distribute missing book records to the appropriate departments for further processing. a collection development staff member runs a series of reports, each one creating a microsoft excel spreadsheet. the first report lists missing-book records marked for replacement and is sent to monograph acquisitions for processing. missing books that have been marked “do not replace” or have had no action taken on them after a certain time period are exported to a separate excel spreadsheet that is sent to cadm for suppression or removal of cataloging records. for each report that is run, the application generates an e-mail message, notifying all necessary departments that there is information to be processed. a list of processed records is available for viewing and distribution to cadm and acquisitions as illustrated in figure 3. the application also provides customized manipulation of the data records that are exported to each of the departments. this customization pulls together only the specific fields of interest to each department such that each export template is unique to each department’s needs. at the end of each replacement cycle, the application automatically creates backups and archives the missing book records.

table 2. tables and their purposes within the database
table | description
alephdump | stores imported aleph data before processing.
tbltempreplacemetns | stores aleph data from the alephdump table. this data is processed and sent through verification and truncation programs.
tblreplacments | post-processed aleph records. primary table for all activities, actions, and fund codes selected by the subject librarians.
tblactions | a reference list of valid actions that can be taken by the subject librarians.
tblfunds | a reference list of valid fund codes; originally imported from aleph.
tblacqrecords | temporary table that stores processed records that should be sent to monographic acquisitions.
tblcadmrecords | temporary table that stores processed records that should be sent to cadm.
tblcadmnullrecord | temporary table that stores records where no action has been taken by a subject librarian.
historytblreplacements | an archiving table.

figure 1. relationship diagram for the nine database tables that were created for this application. the aleph system number is used as the primary linking field for most of the tables.

subject librarian workflow

when subject librarians receive a message indicating a new replacement list is ready for review, their job is surprisingly simple. after entering their network id and password to gain access to the database, they can select how they wish to view the list of missing books: by selected call number ranges, by the budget code with which the books were originally purchased, or by system number (the last two options are rarely used). subject librarians can also view items that have already been processed, and they are able to sort this list by subject librarian, action taken, new budget code, or call number. initially, subject librarians encounter a list of brief records for each item in the database. the brief records include system numbers, titles, authors, volume numbers (if applicable), call numbers, sublibraries, and isbns or issns. if a record has already been reviewed by a subject librarian, the list will include actions taken and the names of the subject librarians who took the action.
to take action on an item, subject librarians select the system number, displaying the full record (see figure 4), and may then choose to replace the book with the same edition, any edition, the newest edition available, or a microform version. by using a drop-down menu, the selector can elect to pay for the replacement with replacement funds or with their own subject funds. subject librarians who choose to replace books with their own funds are rewarded at the end of the quarter when their replacement requests appear at the top of the queue for processing by monograph acquisitions. additional functionality includes the ability to directly link to and browse opac records for items in the database. replacement funds cannot be used for second copies of books, so quick access to opac records is often useful. it also facilitates determining if the library owns other editions of the item before taking action. a notes field allows subject librarians to communicate special instructions for monograph acquisitions or cadm, and records can be e-mailed to other librarians for additional input with just a few clicks. subject librarians are able to return to the database at any time during a given quarter to continue making decisions on their missing books and make any adjustments to prior decisions as necessary. if a subject librarian takes no action on an item by the end of the quarter, it is assumed that it is not to be replaced, and these untouched items are sent to cadm for removal or suppression.

figure 2. replacements application search window
figure 3. processed book records ready to be sent to monograph acquisitions and cadm. notification and data transmission to these units are achieved through the send buttons on this webpage.

monograph acquisitions workflow

once the quarterly database processing completes, a comma-separated file is delivered to the shared monograph acquisitions e-mail address.
monograph acquisitions staff format, sort, and begin searching the spreadsheet, giving priority to the orders designated for replacement by subject librarian funds over those funded from the library replacement fund. staff members routinely search the library catalog for duplicate titles or review orders in process for the same title prior to searching with our library materials vendors. staff members ensure that replacement funds are not used to purchase second copies.

material that is not available for purchase is referred by monograph acquisitions to the subject librarian for direction. sometimes the materials may be kept on order with a vendor to continue searching for out-of-print or aftermarket availability. other times it is necessary for staff to cancel the order and remove the record from the system completely. likewise, the missing edition may have been subsumed by a newer, revised edition. subject librarians are contacted by search and order staff in the monograph acquisitions department regarding availability of different editions when they did not specify that any edition would be acceptable.

when the monograph acquisitions department places a replacement-copy order, the search-and-order unit adds an ils library note field code designating the item is a replacement (rplc), the bibliographic system number of the item being replaced, and any typical order notes such as the initials of the staff member placing the order. the rplc code alerts the receipt unit to route new items to the cataloging supervisor, who then reviews and directs the items to either cataloging or cadm for processing.

catalog and database maintenance (cadm) workflow

cadm is usually the last unit to edit records in the missing books workflow. the unit receives two reports from the database: a “do not replace” list and a “no action taken” list. both reports get the same treatment: all catalog records for titles listed are removed from the catalog.
removal of catalog records is accomplished either by suppression/deletion of the bibliographic records or complete deletion of all records (item, holdings, bibliographic, and administrative) from the server. for titles that have order or subscription records attached to bibliographic records, a suppression/deletion procedure allows the record to be suppressed from patrons’ view while preserving the title’s order and payment history for internal staff use. records are completely deleted when no such information exists (e.g., a gift copy or an older record that has no such data attached).

because it takes a long time to review each batch newly loaded from the catalog into the database, some records that come to cadm for deletion no longer need to be deleted, because the missing books have since been found and returned to the shelves. it is very important for staff working on the cleanup of records to check the item process status and not delete any items that have been cleared of the “missing” status. fortunately, aleph allows staff to look up an item’s history and view prior changes made to the record. this item history feature eliminates unnecessary shelf checks for items appearing on cadm reports that are no longer listed as “missing” in the catalog.

occasionally, cadm receives requests to delete records directly from monograph acquisitions and cataloging staff because of a revised selector decision. this often occurs when a replacement item is only available in a different edition from the one originally sought, or when an item is ultimately unable to be replaced because it has gone out of print or a vendor backs out of a purchase agreement.

figure 4. full record for a missing book in the replacement database
when a different edition is received to replace a missing item, the replacement copy is sent by the receipt unit in monograph acquisitions to cataloging for copy or original cataloging, and cadm is alerted by either monograph acquisitions or cataloging staff if the record for the missing item needs to be deleted. because monograph acquisitions often orders the replacement on its own record with appropriate bibliographic information (we keep the original record just in case the missing piece is found while we wait for replacement), the record for the missing book does not come to cadm on either of the two reports. perhaps in a library with a different makeup of technical services the process would be more streamlined, but because hesburgh libraries has separate cataloging and database maintenance units, we have created such partnerships to make sure nothing falls through the cracks. so far it has worked well, and every party in the process knows and carries out their responsibilities.

issues

while the initial implementation successfully brought a large backlog of missing records into the database, subsequent loads included duplicate records of some items processed in earlier batches. this duplication occurred, for example, if an item was identified for replacement in a prior database review cycle, but a replacement request had not yet been processed by monograph acquisitions staff. because such an item is still identified as “missing” in the catalog, it was again included in data loaded from the catalog into the missing-books database, creating confusion for selectors, cadm, and monograph acquisitions. to resolve this problem, the import process was revised to include a search for previously loaded items, deleting them before records are viewed by collection managers.

a second issue involved the timing of the data load from the catalog into the replacements database.
for various reasons, the data load file was not fully generated for several of the scheduled processing dates. to remedy this problem, the application now automatically e-mails the collection development department staff to confirm a successful data load. there is continued debate as to whether the missing-items file should be created on a daily basis, providing the capability for collection development to import new data at one time rather than periodically.

results

since implementing our new system, hesburgh libraries has processed records for 5,141 missing items. since the database's creation, twenty-five librarians have consulted it and twenty-three of thirty subject librarians have used it to request replacements. of the 5,141 records loaded into the database, 2,537 items (49 percent) have been selected for replacement, and 2,604 items (51 percent) have either been suppressed or deleted from our catalog. replacement funds are renewed on an annual basis and have not yet run out.

as a reflection of the collection strengths at hesburgh libraries, the largest share of the missing books (21 percent) fell in the theology/religion call number range. language and literatures accounted for the second-largest share of missing items (17 percent). other collections with significant numbers of missing books are history (15 percent), social sciences (17 percent), science (12 percent), and philosophy (10 percent).

conclusion

although the process could certainly be further developed and refined, the hesburgh libraries missing books application is an amazing improvement over the extremely outdated paper-based method of dealing with missing library materials. the process works; it is both efficient and effective, and employees who engage in the process have reported satisfaction with it. it has not only allowed hesburgh libraries to catch up on its backlog but, more importantly, to stay current and organized, keeping the catalog more accurate and patrons more satisfied.
furthermore, should the libraries opt to do a full inventory in the future, the current system will prove invaluable. the authors are pleased to have the opportunity to share our experiences with interested libraries. feel free to contact any of the authors for further information.

technical note

help: the automated binding records control system

an interesting new aspect of library automation has been the appearance of commercial ventures established to provide for an effective use of the new ideas and techniques of automation and related fields. some of these ventures have offered the latest in information science research and development techniques, such as systems analysis, management planning, and operations research. others have offered services based on new procedures, for example, computer-produced book catalogs, selective dissemination of information services, indexing and abstracting activities, mechanized acquisitions, and catalog card production systems.

one innovation is a new technique devised for libraries to reduce the clerical effort required to prepare materials for binding and to maintain the necessary related records. the technique is called help, the heckman electronic library program. it was developed by the heckman bindery of north manchester, indiana, with the cooperation of the purdue university libraries.

it was recognized by heckman's management that the processing of 10,000 to 20,000 periodicals weekly and the maintenance of over 250,000 binding patterns would soon become too unwieldy and costly unless more efficient procedures were developed. it was additionally realized that any new system should also be designed as a means to aid libraries with their interminable record-keeping problems.
the latter purpose could be accomplished by providing a library with detailed and accurate information regarding each periodical it binds, and by simplifying the library's method of preparing binding slips for the bindery.

in the fall of 1969, after a detailed analysis, the heckman bindery management began the development and programming of a computerized binding pattern system. this system was a result of a team effort involving management, sales, and production departments. john pilkington, data processing manager, directed the installation of the system and earl beal performed the necessary programming functions. in december of 1971 approximately 700 libraries were using the system, and about 100,000 binding patterns were in the data file.

138 journal of library automation vol. 5/2 june, 1972

as the system was developed, a library's binding pattern data were converted to machine-readable form which then made it possible for the bindery automatically to provide nearly complete binding slips for each periodical title bound. in addition, the system provides an up-to-date pattern record for the libraries' files, and the bindery maintains the resultant data bank of pattern records as the library notifies it of additions, changes, and deletions. in this manner, the bindery expects to establish an efficient method for purging the file of out-of-date information.

the system revolves around four forms: the binding pattern index card, the binding slip, the variable posting sheet, and the binding historical record.

the binding pattern index card (figure 1) is a 5" x 8½" card, pink in color, which is a computer printout. one of these cards is retained in the library as its pattern record for each set of each periodical bound by the library.
the data given on the card are essentially the same as those maintained by most libraries in their manual pattern files, except that more detail is provided by the help system, and the library does not maintain the record (the bindery does, in machine-readable form). as changes are made to the patterns, the library clerk simply crosses out the old data on the appropriate binding slip and writes in the new data. when the bindery receives the binding slip, a new index card is produced, among other records, and forwarded to the library with the returned shipment of bound volumes. the system also provides for one-time changes that do not affect the pattern record.

the data contained on the index cards include the library account number, the library branch or department code, the pattern number, color, type size, stamping position, title (vertical or horizontal spine positions), labels, call number, library imprint, and collating instructions. the collating instructions, which are listed in the instruction manual provided by the bindery, are given as a series of numeric codes. asterisks are used to indicate the end of a print line.

the binding slips are also 5" x 8½" forms, but they are four-part multiple forms, of which three parts are sent to the bindery with the periodical to be bound, and one part, a card form, is retained by the library as its "at bindery" record. the information required by the binding slip is essentially the same as that included on the index card. the library, however, must provide the variable data such as volume number(s), date(s), month(s), or whatever information is required to identify a specific volume.

the variable posting sheet (figure 2) is an 8½" x 11" form that is used by the library when it sends several volumes or copies of a volume to the bindery at the same time.
since the bindery cannot determine beforehand the number of physical volumes of a title a library will want to send for binding at a given time, it sends to the library only one printed-out binding slip to be used for the next volume of a given serial. if multiple volumes of a set are to be bound, the library clerk provides the variable information for the first volume by using the single binding slip, and the variable data for each additional volume of the same title are posted by the clerk on the posting sheet. the bindery will automatically produce from its pattern data bank the binding slips necessary for binding the additional volumes that are listed on the posting sheet.

fig. 1. binding pattern index card.

fig. 2. variable posting sheet.
the binding historical record (figure 3) is a form provided for the use of the library if it desires a permanent record of every volume bound. the use of this form is not required by the system; it is simply a convenience record for the library binding staff. the form is printed on the back of the pattern index card. spaces are provided for volume, year, and date sent to the bindery, and most of the back of the card is available for posting.

all data fields are of fixed length with the maximum size of the records at 328 characters. some of the data formats are shown in figure 4. a few of the data fields in the example need additional explanation. the fifth field labeled "print" refers to the color of the spine stamping, i.e., gold, black, or white. the "trim #1 & 2" fields are for bindery use only, and indicate volume size within certain groups for printing purposes. the "spine" field is also for bindery use, and it indicates the size of type that can be used according to the width of the spine. "product no." refers to certain types of publications such as magazines, matched sets, or items which will be pamphlet (inexpensively) bound.

fig. 3. binding historical record.

fig. 4. data formats.

fig. 5. pattern printing setup.

one additional form used in the system is for heckman's internal operations. that is a data input form known as the "pattern printing setup" (figure 5). this form is used by the bindery's input clerks to prepare new binding patterns for conversion to machine-readable form. the data prescribed by the form is much like that required by the binding pattern index card, except that data tags are shown for keypunching purposes.

the system operates on an ibm system 3 computer with two 5445 disk drives and a 1403 n1 printer. the disk drives provide a total of 40,000,000 characters of on-line storage in addition to the 7,500,000 usable characters provided by the system 3 itself. five 5496 data recorders are used for data conversion. the programs are written in rpg2.

the development of computer-oriented commercial services for libraries suggests that, perhaps if librarians wait long enough, they will not have to automate their libraries as commercial ventures will do it for them. the rapid appearance of systems-analysis firms, commercial and societal abstracting and indexing services, management and planning consulting groups, and data processing service bureaus tends to bear this theory out. at the very least, libraries will not be able to automate internally without providing for the incorporation of such ready services into their systems. when a service such as help is made available at no additional charge, there is no way for libraries to avoid automation.

donald p. hammer

donald p. hammer is associate director for library and information systems, university of massachusetts library, amherst. at the time the system described in this article was developed, mr. hammer was the head of libraries systems development at purdue university.
a marc based sdi service

kenneth john bierman: data processing coordinator, oklahoma department of libraries; and betty jean blue: programmer, information and management services division, state board of public affairs, oklahoma city, oklahoma.

an operating sdi service utilizes the weekly marc ii tapes distributed by the library of congress. the history, creation, operation, uses, advantages, disadvantages, cost and future plans for the sdi service are discussed, and flow charts (system and detail) and sample output given.

introduction

sdi (selective dissemination of information) is the distribution of new information to individuals or groups according to their expressed interests. sdi as a service of libraries is not a new concept, for libraries have been providing such specialized current awareness services for years both formally and informally in such ways as routing proof slips to interested persons, departments, etc. such services have been provided most commonly in special libraries, but are not uncommon in public and academic libraries as well (1). "although the practice of sdi is not new, its application in libraries has been generally irregular, informal, and very limited, depending variously on the memory, willingness and free time of the librarian and contingent on the desire and ability of the patron to make his interest known" (2).

with the interest in library applications of data processing has come an interest in automated sdi services. "all computer based sdi systems work on the same principle and include two basic elements: subject interest profiles for the users and a machine readable file of indexed bibliographic records of current materials." (3) the annual review of information science, volume 4, presents a summary of many different types of sdi systems (4) as well as an excellent bibliography (5). additional recent automated sdi services are described in the literature (6-12).
the purpose of this article is to describe an operating marc-based sdi system, the environment within which it operates, and some of the thinking which led to its creation.

background information

the oklahoma department of libraries is the designated state library agency in oklahoma. as such it has two primary statutory responsibilities: 1) provision of library services to state government, including the executive, legislative and judicial branches of government, and the agencies of state government and 2) state-wide responsibility for total library development, including the development of multi-county library systems. the department provides a great variety of library services to fulfill these functions, one of which is the maintenance of a collection of materials with three subject specialty areas: 1) law (primarily of use to the judicial and legislative branches), 2) political science (primarily of use to the executive and legislative branches), and 3) library science (primarily of use to the department's own staff and the librarians throughout the state). in addition, a general collection and a general reference collection are maintained primarily for use by the executive and legislative branches of state government and as back-up and supportive collections to the libraries throughout the state.

with the beginning of the marc distribution service march 29, 1969, the department implemented a service for other libraries around the state by creating and maintaining a marc data base for the use of all libraries within the state (13). after the data base had been created and was working satisfactorily, the department considered what it could do with marc to help its own operation. the five following paragraphs discuss projects suggested and considered.

the first was design and implementation of the original input of a selected portion of the collection (the law collection, for example) in marc format.
the project would be a beginning of putting the entire collection in marc format and would yield interim useful products as well (a book catalog, for example). however, it was decided that this project would be premature for two reasons. if retrospective conversion were going to be done nationally (14), it would be foolish for the department to duplicate the work at the local level; and before the department should expend money putting material into marc format, it should demonstrate the usefulness of marc with the already existing records being distributed by the library of congress.

306 journal of library automation vol. 3/4 december, 1970

a second project considered was design and implementation of conversion of the storage of the marc data base from the sequential (tape) system (13) to a direct-access system (disk, for example). certainly from the standpoint of economic use of the data base such conversion is desirable, and for providing multiple access points to the data base (author, title, subject, etc.) it is essential. it was decided, however, that before additional funds and energy should be expended to improve the storage and retrieval of marc records, the usefulness of the presently available individual records themselves should be demonstrated. direct access storage and retrieval was deferred until the completion of the sdi system. work has recently begun on the direct access project (15).

design and implementation of an acquisitions module for the department was considered because the department was preparing to re-design its acquisition system. however, to have a meaningful automated marc-based acquisitions system it would be necessary to search the data base by a number of entry points (author, title, etc.), which would require the direct access system described above.

a fourth project considered was design and implementation of a catalog card set and processing aids (label, etc.) production module for the department.
because the department does centralized processing for the library systems throughout the state, the processing center is a critical area within the library; therefore, this alternative had the most immediate appeal. it was decided that the department was not financially prepared to undertake so ambitious a project yet. it was felt that a less ambitious project should be undertaken first to gain knowledge and experience which would be essential in a successful catalog card and processing aids production system.

the project ultimately selected was design and implementation of a subject current-awareness service based upon the weekly marc tapes. the service would be immediately useful, both to the department and its clientele; was not dependent upon the maintenance of a large data base; and could be set up and operated quickly and economically. further, the experience gained in manipulating the marc records in the print portion of an sdi system would be valuable experience for manipulating the marc records for printing catalog cards and processing aids at a later date.

overview of the sdi system

first, the subject interests of a particular user (perhaps an individual, but more likely a state agency) are profiled in dewey and/or l.c. classification numbers by a reference librarian from the department. for example, the library school could use a listing of marc records on each weekly tape dealing with library science. table 1 is a library science profile. the dewey and lc classification numbers of each marc record on a tape are compared with the profiled dewey and lc classification numbers.

table 1. library science profile

subject                        dewey numbers
library science                020-029
manuscripts & rare books       090-099
ethics of librarianship        174.902
library manpower               331.76102
library study techniques       371.30
audio visual instruction       371.33
films in adult education       374.27
printing & binding             655
bookselling                    658.809655
management of libraries        658.9102
city planning & libraries      711.57
architecture & libraries       727.8
book illustration              741.64
motion pictures                791.43

l.c. numbers: z1-z1000

when either the dewey or lc number matches, that record is pulled from the marc tape and copied onto a detail tape. after the entire marc tape has been searched, the detail tape is rewound and the selected records are printed.

description of the sdi programs

the sdi system consists of two programs: the first, odl-07, pulls the appropriate records from a marc tape, and the second, odl-07x, prints these records in a readable form and appropriate sequence. figure 1 is a system flow chart.

odl-07 program

inputs are: 1) a control card giving program identification and date; 2) header cards containing list codes and headers (the list code is a one-character code that uniquely identifies the list, e.g., "z" for library science; the header will appear on each page of output, e.g., "library science" for the library science list); 3) classification number cards, which contain the proper list code, a selector code ("d" for dewey and "c" for library of congress), and the lc or dewey classification number or range of numbers to be selected; and 4) the marc tape to be searched. outputs are: 1) a header tape containing all the information from the header cards and the date, and 2) a detail tape containing all selected records with a list code for each record.

fig. 1. sdi system flow chart.

figure 2 is a detail flow chart for odl-07. the control and header cards are the first to be read.
a header table is constructed for editing and records are written on the header tape. the classification number cards are then read. these cards are edited first for such errors as invalid list code (for each list code on a classification number card, there must be a corresponding header record), invalid selector code (must be "c" for lc or "d" for dewey), and invalid characters in the lc or dewey numbers (dewey may not contain any alphabetic characters and the only valid special characters for dewey are the period and dash; the first character of lc must be alphabetic).

fig. 2. odl-07 detail flow chart.

if the classification cards pass edits, they are used to construct lc and/or dewey table entries. each table entry consists of three items: the lowest acceptable value, the highest acceptable value, and the list code.

dewey classification numbers can be input into the system without reformatting and are converted by the program to table entries. table 2 presents some dewey numbers as they might be keypunched and input into the system and the corresponding table entries which would be created. dewey numbers are converted from the free form to a fixed-length 10-position all-numeric form.

table 2. dewey classification number table

keypunched classification number cards      corresponding table entry
list   selector   classification            list   lowest       highest
code   code       number                    code   value        value
z      d          174.902                   z      1749020000   1749029999
z      d          020-029                   z      0200000000   0299999999
z      d          331.76102                 z      3317610200   3317610299
l      d          34                        z      3400000000   3499999999
l      d          331.11-331.898            z      3311100000   3318989999

key: z = library science; l = law; d = dewey classification

lc classification numbers are more difficult. these cannot be entered into the system without reformatting as can the dewey numbers; rather, low and high values are entered into the system and put directly into the lc table for searching of the marc tape. table 3 presents some lc classification numbers as they might be keypunched and entered into the system and a brief explanation of what records will be pulled as hits (matches). lc table entries are in the form of aannnn, where aa stands for the two possible initial letters and nnnn stands for the four numbers following the initial letter(s) and immediately preceding the first decimal point or next alphabetic character. zero is the lowest value and z is the highest.

table 3. lc classification number table

keypunched classification number cards
list   selector   lowest   highest
code   code       value    value     explanation
p      c          hv7231   hv9920    records with lc classification number between hv7231 and hv9920 will be hits.
p      c          j00000   jkzzzz    records with lc classification number beginning with j and ja-jk will be hits; jl-jz will not be hits.
l      c          k00000   kzzzzz    records with lc classification number beginning with k (including ka-kz) will be hits.
z      c          z00001   z01000    records with lc classification number between z1 and z1000 will be hits.

key: z = library science; l = law; p = political science; c = lc classification

after all classification number cards have been converted to table entries, the marc tape is read, the lc and dewey numbers are pulled from each record, and both tables are searched for hits (matches). the dewey classification number from the marc record is read and converted into a fixed-length 10-position numeric field. for example, the classification number 020/.6234/5456 from the marc tape would be converted to 0206234545 and the number 025.3/02 would be converted to 0253020000 before dewey table searching.
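the table construction and key conversion described above can be sketched in a few lines. this is a minimal illustration in python, not the original cobol; the function names (dewey_key, dewey_entry, lc_key, find_hits) are hypothetical, and the sketch assumes lowercase ascii input, where digits sort before letters so that zero is the lowest value and z the highest.

```python
import re


def dewey_key(number: str, pad: str = "0") -> str:
    """Reduce a free-form Dewey number to a fixed 10-position digit string."""
    digits = re.sub(r"\D", "", number)   # drop periods, slashes, dashes, spaces
    return digits[:10].ljust(10, pad)    # truncate or pad on the right


def dewey_entry(card_number: str, list_code: str):
    """Turn a keypunched Dewey number or range into (low, high, list code)."""
    low, _, high = card_number.partition("-")
    # A single number stands for the whole range beneath it:
    # pad the low bound with zeros and the high bound with nines.
    return dewey_key(low, "0"), dewey_key(high or low, "9"), list_code


def lc_key(number: str) -> str:
    """Reduce an LC class number to the six-position aannnn form."""
    m = re.match(r"([a-z]{1,2})(\d{0,4})", number.strip().lower())
    if m is None:
        raise ValueError(f"not an lc class number: {number!r}")
    letters, digits = m.groups()
    # A missing second letter is filled with "0"; digits are right-aligned.
    return letters.ljust(2, "0") + digits.rjust(4, "0")


def find_hits(key: str, table):
    """Return the list codes of every table entry whose range contains key."""
    return [code for low, high, code in table if low <= key <= high]


# Entries taken from tables 2 and 3.
dewey_table = [dewey_entry("020-029", "z"), dewey_entry("34", "l")]
lc_table = [("z00001", "z01000", "z"), ("j00000", "jkzzzz", "p")]
```

padding the low bound with zeros and the high bound with nines lets one string comparison per entry cover a whole range, which is why ranges and single numbers can share the same table.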
if a classification number card had been 020-029 (see table 2), both of these records (020/.6234/5456 and 025.3/02) would have been hits.

the lc classification number read from the marc record is first converted to the form aannnn and then searched against the lc table. for example, the classification number z665.h45 from the marc tape would be converted to z00665 and z678.3.k39 would be converted to z00678 and then searched against the lc table. if the last entry in table 3 had been input into the system, these records would both be hits, as their lc numbers lie between z00001 and z01000.

if a match is found in either table, the marc record is transferred in the original marc format to the output tape with the list code. after odl-07 is completed, control passes to odl-07x.

odl-07x program

inputs are the header tape from the previous run and the detail tape containing the selected records from the previous (odl-07) run. outputs are the sdi listings by subject areas (list code).

figure 3 is a detail flow chart for odl-07x. the first record is read from the header tape and the detail tape is then searched for matching list codes. when a match is found, the marc record is formatted and printed. when the entire tape has been searched, the next header is read, the detail tape is rewound and the process is repeated. this continues until all header and detail records have been matched and printed. the result is a series of sdi lists, each in lc card number sequence. see figure 4 for a sample of two printed records from a library science list. presently, the weekly lists are being printed on two-up, three-part, perforated teletype size (8½" x 5½") paper, one record (sdi notice) to each separable form.

fig. 3. odl-07x detail flow chart.

discussion

the sdi system was written with flexibility as one of the main considerations.
dewey classification number cards in almost any format can be machine converted to the intended table entry. both ranges and individual classification numbers are allowed. any number of dewey and lc entries and any number of lists can be handled simultaneously, the only limit being core size. the selection tables, not being built into the programs, can be changed at any time, weekly if desired.

the print format generally follows traditional catalog card arrangement, the major difference being that each subject heading and added entry appears on a new line and is not numbered. the print program can be easily adapted to any conversion table desired; delimiters, field terminators, etc. are referred to symbolically. there is an optional feature which allows any character or characters to be deleted and the resulting gap closed; this is desirable for diacriticals until better techniques for handling them are devised. both line and page length are referred to symbolically and can be easily changed to fit any form desired. line spacing and indentation are built into the present program, but even these can be changed.

fig. 4. sample sdi notices. (two library science notices dated 09/03/70: stevens, mary elizabeth, automatic indexing, a state-of-the-art report, washington, u.s. national bureau of standards, 1970; and librarianship and literature, essays in honour of jack pafford, edited by a. t. milne, london, athlone p., 1970.)

the major disadvantage of the sdi system as it now exists is that it allows selection by classification numbers only. the marc i experimental sdi system at indiana university (16), by contrast, allowed for selection by weighted terms (both classification number and subject heading). programming difficulties, expense, and the necessity for additional processing time inhibit searching on subject headings. for selection of detailed subjects, subject heading searching is essential; however, for making subject searches in subject areas, classification number searching seems more expedient, as it would be difficult to determine, and expensive to input, all of the subject headings for the field of law, for example. ideally, a marc-based sdi system would be able to provide selection based on classification numbers and/or subject descriptors.
computer, language and cost

the computer for which the programs were written was an ibm 360/30, 32k core size, one card read/punch, four tape drives, two disk drives and one printer. the programs have also been successfully run on an ibm 360/25 with one card read/punch, two tape drives and one printer. in the latter case, the first program was modified slightly because only two tape drives were available, whereas the sdi system normally requires three. modification was easily accomplished by having the header records punched rather than written in odl-07.

the programs are written in cobol for the 360, operating under dos. very little modification would be required to operate under os. being written in cobol, the programs are easily adapted from one machine to another; they have been successfully run on an rca spectra, for example. they also are easily adapted and changed, the symbolic names and procedure division paragraph headings having been carefully selected to build in as much documentation as possible.

following is a breakdown of the charges to the department of libraries for programming and machine time for development; department of libraries' staff time, overhead costs, and operating costs are not included.

programming and debugging                  $2,941.00
machine & operator costs for testing          452.00

operating costs are more difficult to determine and nearly impossible to evaluate meaningfully. the total amount of computer time required (and therefore the major cost) is primarily a function of the number of records on the marc tape being searched and the number of selected and printed records. if the marc tape contains 1,200 records, it takes about twelve minutes (clock time) of computer time (ibm 360/30, 32k) to select the desired records (odl-07).
as the total of classification numbers being searched increases (that is, as the dewey and lc tables grow), the computer time for selection does not appear to increase significantly. the time for the print program (odl-07x) is directly a function of the number of lists being produced (the number of times the detail tape must be rewound and re-read) and the total number of records being printed. as an example, if six different lists are being produced and a total of 375 records are being printed out, the computer time is 25 minutes. therefore, producing six weekly lists with an average of 62 records for each list takes approximately 37 minutes (clock time) each week. at the rate of $60.00 an hour, this is $37.00, or approximately 10¢ per record selected and sdi notice printed.

table 4 presents a detailed analysis of five weekly runs. the total computer time is the number of minutes which were charged to the department of libraries by the computer center. since the department is charged one dollar per minute, this is also the dollar cost to the department for computer and operator costs for that weekly run. unfortunately, the total time given includes time for set-up and other factors. therefore, meaningful patterns are difficult to discern, as one week it may take several minutes longer to get the forms inserted and lined up in the printer, forms may break another week, etc. the remainder of table 4 is exactly accurate.

it is interesting to note how much variance there is from week to week in the number of sdi notices for each subject list. for example, out of 889 marc records on the marc tape run on july 23, 16 were library science titles. however, the marc tape run on august 6 contained 1,201 records but only 12 were library science titles. in addition, notice that the library science list was reprinted seven times, and for the last two weeks reprinted five times, to get the total number of copies needed for the 25 subscribers to the list.
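the weekly cost estimate above is simple arithmetic and can be checked directly. the sketch below is only an illustration in python; the function name weekly_cost is hypothetical, and the inputs are the figures quoted in the text (twelve minutes for selection, 25 minutes for printing 375 notices, $60.00 an hour).

```python
def weekly_cost(select_minutes, print_minutes, rate_per_hour):
    """Return (total clock minutes, dollar cost) for one weekly run."""
    total_minutes = select_minutes + print_minutes
    return total_minutes, total_minutes * rate_per_hour / 60


# Figures quoted in the text for a 1,200-record tape and six lists.
total_minutes, dollars = weekly_cost(12, 25, 60.00)

# Cost per notice for the 375 notices printed in the example.
cents_per_notice = 100 * dollars / 375

print(total_minutes, dollars, round(cents_per_notice, 1))
```

the per-notice figure comes out just under ten cents, which agrees with the approximate value given in the text.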
table 4. sample run times and list lengths.

date of        total computer   tape-input count   tape-out count     print-out count        total
computer run   time (minutes)   (records on        (records copied    (sdi notices printed   number
                                marc tape)         in odl-07)         in odl-07x)            of lists
july 9         65               1,467              525                (illegible)            15
july 16        57               1,118              456                600                    15
july 23        49               889                377                473                    16
july 30        56               1,078              437                529                    14
august 6       60               1,201              543                591                    14

library science list
date of        number of     number of marc     number of sdi
computer run   print runs    records selected   notices printed
july 9         7             39                 (illegible)
july 16        7             24                 168
july 23        7             16                 112
july 30        5             23                 115
august 6       5             12                 60

(the table gives the same three counts for each of the other lists; those columns are not legible here.)

current uses

the uses to which the system is presently being put are in three general areas: 1) sdi lists for internal use of the department, 2) sdi lists for state government, and 3) sdi lists for other libraries.

the department currently produces subject lists primarily for its own use in the areas of law and political science. since the department maintains specialty collections in these two subject areas, it is anxious to obtain the most current information on materials published in them for selection purposes. because the marc record comes out before the corresponding proof slip is distributed (17), use of the marc file has been a most successful means of obtaining complete and verified bibliographic information for the purpose of ordering new books. in addition, complete lc cataloging information is available should the proof not have arrived at the time the book is received. because the lists are currently being printed on three-part teletype paper, one record per sheet, it is easy to separate the record to be ordered and send one copy to acquisitions, retaining one copy for the files, and sending one to the interested individual in state government with a note that the book is on order.

the department also produces a special list of many different subjects which are of interest to the legislature for the legislative reference division of the governmental services branch. the legislative reference division can then order particularly useful materials quickly and route a copy of the sdi printout to the interested legislator or legislative committee.

the department has prepared profiles of the state agencies having a large planning and research role. lists are prepared weekly for the department of education, department of corrections, department of vocational-technical education, department of welfare, industrial development commission, department of highways, and several small agencies, and are sent to the person responsible for planning and research within the department.
he can then request books from the lists by returning one copy of the sdi notice to the department of libraries with a note to order, retaining the other copy for his files or routing it to a researcher particularly interested in the subject. certain lists are being produced and shared with libraries around the state. the law and political science lists are being sent to two law schools in oklahoma. the library science and bibliography lists are being sent to the library school and the two largest public library systems, as well as the two state universities. over 25 libraries outside oklahoma are receiving weekly library science, political science or law lists ( 18). a cooperative acquisitions program is evolving whereby certain libraries agree to specialize in certain subject areas so that every subject area would be covered by one library for specialized materials not needed by all libraries. currently, the program involves the two major public libraries and the department of libraries wherein the state teletype network (otis) is used to transmit rapidly information on expensive materials for cooperative acquisitions. selected lists in the specialized subject areas can be produced each week for each of the cooperating libraries to aid them in their selection, acquisition and cataloging of the materials. the uses currently being made have excited the imagination of many people, both within and without the department of libraries. a great deal has been accomplished since the system became operational early in february 1970; however, the possibilities have barely been identified. as mentioned above, one can envision this being the foundation of a cooperative acquisitions program. such a system could form a node of library service to business and industry; currently, some thought is being given to producing weekly lists of materials in automation and computer science (systems analysis, etc.) 
both for the many state agencies which have automated equipment and for businesses and industries around the state which utilize computer technology.

conclusion

marc is an exciting and potentially valuable new tool available to the library community, useful to improve both its own internal operations and, more importantly, its service to others. nonetheless, before extensive meaningful use of marc will occur, its potential uses must be identified and explored. this article has attempted to give a picture of one such experimental project to improve library service for others within the framework of a particular institution's resources and functions.

much more research is needed on potential and operating uses of marc and the results of this research need to be disseminated to the library community. in addition, it is the opinion of the authors that for reasons both of available financial resources and expertise much of the research and development with marc must be a cooperative venture among many different libraries. some work has been done with marc cooperatively throughout the country (nelinet (19), oclc (20), clsd (21), for example) but much more needs to be done. the future of meaningful uses of marc is bright; however, much research and development remains, and it can best be done as a cooperative effort.

programs and additional information

sdi computer programs and services available from the department of libraries to other libraries are described in a publication called "sdi services and costs," available from the oklahoma department of libraries, 109 state capitol, oklahoma city, oklahoma 73105. additional progress reports on the sdi project, as well as other automation projects in oklahoma, are reported in the bi-monthly oklahoma department of libraries automation newsletter, which is available on request.

references

1. cuadra, carlos a., editor: annual review of information science and technology, 4 (chicago: encyclopedia britannica, 1969), 249-258.
2. studer, william joseph: computer-based selective dissemination of information (sdi) service for faculty using library of congress machine-readable catalog (marc) records (ph.d. dissertation, graduate library school, indiana university, september, 1968), 1.
3. studer, william j.: "book-oriented sdi service provided for 40 faculty." in avram, henriette d.: the marc pilot project; final report on a project sponsored by the council on library resources, inc. (washington: library of congress, 1968), 180.
4. cuadra: op. cit., 243-258.
5. ibid., 263-270.
6. bloomfield, masse: "current awareness publications; an evaluation," special libraries, 60 (october 1969), 514-520.
7. bottle, robert t.: "title indexes as alerting services in the chemical and life sciences," journal of the american society for information science, 21 (january-february 1970), 16-21.
8. brannon, pam barney; et al.: "automated literature alerting system," american documentation, 20 (january 1969), 16-20.
9. brown, jack e.: "the can/sdi project; the sdi program of canada's national science library," special libraries, 60 (october 1969), 501-509.
10. davis, charles h.; hiatt, peter: "an automated current-awareness service for public libraries," journal of the american society for information science, 21 (january-february 1970), 29-33.
11. housman, edward m.: "survey of current systems for selective dissemination of information (sdi)." in proceedings of the american society for information science, 6 (westport, connecticut: greenwood publishing corporation, 1969), 57-61.
12. martin, dohn h.: "marc tape as a selection tool in the medical library," special libraries, 61 (april 1970), 190-193.
13. bierman, kenneth john; blue, betty jean: "processing of marc tapes for cooperative use," journal of library automation, 3 (march 1970), 36-64.
14. recon working task force: conversion of retrospective catalog records to machine-readable form; a study of the feasibility of a national bibliographic service (washington d.c.: library of congress, 1969).
15. bierman, kenneth john: "marc-oklahoma data base maintenance project," oklahoma department of libraries automation newsletter, 2 (october 1970).
16. studer, william j.: (op. cit., note 2), 23-37.
17. payne, charles t.; mcgee, robert s.: "comparisons of lc proofslip and marc tape arrival dates at the university of chicago library," journal of library automation, 3 (june 1970), 115-121.
18. bierman, kenneth john: "marc-oklahoma cooperative sdi project report no. 1," oklahoma department of libraries automation newsletter, 2 (june & august 1970), 10-14.
19. nugent, william r.: nelinet: the new england library information network. paper presented at the international federation for information processing, ifip congress 68, edinburgh, scotland, august 6, 1968. (cambridge, mass: inforonics, inc., 1968).
20. kilgour, frederick g.: "a regional network - ohio college library center," datamation, 16 (february 1970), 87-89.
21. the collaborative library systems development project (clsd): chicago-columbia-stanford. unpublished paper presented at the marc ii special institute, san francisco, september 29-30, 1969.
supporting faculty’s instructional video creation needs for remote teaching: a case study on implementing eglass technology in a library multimedia studio space

hanwen dong

information technology and libraries | june 2023
https://doi.org/10.6017/ital.v42i2.15201

hanwen dong (hanwendong@uidaho.edu) is instructional technology librarian, university of idaho. © 2023.

abstract

in 2021, alongside seven colleges at the university of idaho campus, the university of idaho library received an eglass system (https://eglass.io) with funding from the governor’s emergency education relief grant to expand faculty’s capacity to create instructional videos. the eglass is a transparent glass whiteboard that allows instructors to write, draw, and annotate. it comes with a built-in camera that can capture instructors’ facial expressions and gestures while facing their remote students and allow better engagement. the eglass is suitable for creating asynchronous instructional videos for flipped classrooms and integrating zoom for synchronous online classes. this article details the eglass equipment setup, studio space optimization, outreach efforts and initiatives, usage examples of early adopters, lessons learned during the first year of the eglass deployment, and future considerations.

introduction

in 2021, the university of idaho library (library) received a transparent glass whiteboard called the eglass for faculty to record video-based lectures. the eglass was based on a similar glass whiteboard technology, called the lightboard, that the library already owned. initially built by university of idaho engineering students and later gifted to the library, the lightboard presented challenges to library staff, as properly supporting the technology required a significant amount of time.
offering similar functionalities, the eglass had the potential to also address the issues that the lightboard presented. similar to the lightboard, the eglass allowed instructors to write and draw on the glass while facing their audience, typically students who would be watching the recorded videos later, to provide better engagement. the eglass could also be used for creating asynchronous instructional videos for flipped classrooms and integrating zoom for synchronous online classes. to implement the eglass, it was necessary to consider factors such as the functionality, the space to be occupied, and faculty interest. a year after the original deployment of this tool, the author reports on the lessons learned in this article. the eglass equipment setup, multimedia studio space optimization, outreach efforts and initiatives, usage examples of early adopters, lessons learned, and future considerations are explored later in this article.

background

the studio in the university of idaho library provides space and audiovisual equipment to students, faculty, and staff to pursue curricular, personal, and creative multimedia projects. originally converted from a 200-square-foot meeting room, the studio is equipped with a 27-inch imac, a 32-inch full-hd victek monitor, a scarlett 18i20 audio interface, a dbx 266xs 2-channel compressor/gate, two krk rokit 5 g3 powered studio monitors, two shure sm58 dynamic vocal microphones with microphone arm stands and pop filters, several portable lights, a green screen, and more. software installed on the imac includes audacity, camtasia, and the essential adobe creative cloud applications such as photoshop, premiere pro, indesign, etc.
patrons can use the studio software and equipment to record voice-over narrations and podcasts as well as to edit multitrack audio clips and videos. in addition to using the studio equipment, patrons can also borrow other multimedia equipment, such as video camcorders, audio recorders, tripods, a usb microphone, and a dslr camera, at the circulation desk. initially managed by two library support staff, both of whom left the organization to pursue other opportunities, the studio operations were taken over by the author in 2020. due to the covid-19 pandemic and the lack of air ventilation in the space, the studio was closed in march 2020 and did not reopen until august 2021. while any university-affiliated patron is welcome to use the studio, first-time users were expected to complete an orientation with the author to become familiar with the equipment setup and the audio workflow. to use the studio, patrons had to make reservations, up to two weeks in advance, for up to two hours per day. reservations were made from the studio’s webpage and managed through springshare’s libcal product. patrons who frequented the studio pursued various personal, creative, instructional, and curriculum-related projects, including video recording with the green screen, video editing, podcast recording, voice-over narration recording, etc. the studio was used by patrons several times a week. according to the libcal space statistics, in fall semester 2021, the studio had 48 unique users, 147 total bookings, 211 hours booked, and the average reserved time block was 86 minutes. in spring semester 2022, the studio had 30 unique users, 64 total bookings, 103 hours booked, and the average reserved time block was 97 minutes. a noticeable usage drop in the spring semester was likely due to a reduced number of advertised studio orientations provided to the campus community and fewer classroom assignments that required or promoted studio use. 
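the libcal figures quoted above are internally consistent: dividing the hours booked by the number of bookings reproduces the reported average time blocks. the short python check below is only an illustration; the function name is hypothetical, and the hour totals are assumed to be rounded as reported.

```python
def average_block_minutes(hours_booked, total_bookings):
    """Average reserved time block, in minutes, from the reported totals."""
    return hours_booked * 60 / total_bookings


fall_2021 = average_block_minutes(211, 147)    # fall semester 2021
spring_2022 = average_block_minutes(103, 64)   # spring semester 2022

# Relative drop in total bookings from fall to spring.
booking_drop = (147 - 64) / 147

print(round(fall_2021), round(spring_2022), f"{booking_drop:.0%}")
```

on these numbers, bookings fell by a little over half between the two semesters while the average block grew by about ten minutes.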
for several years, the studio was home to a lightboard for faculty to record class lectures. designed as open-source hardware by dr. michael peshkin from the mccormick school of engineering at northwestern university, the lightboard was a transparent glass whiteboard illuminated with a built-in light, and the ink would glow in low-light environments. instructors could write and draw on the glass with neon markers while facing the viewers, and the writings and drawings along with the instructor could all be captured in the same frame using a separate camera.1 dr. peshkin provided two solutions for those who were interested in acquiring a lightboard: buying a commercially-produced one or building one from scratch.

the lightboard in the studio was built by a group of students in a mechanical engineering class for a senior capstone project as part of a design challenge in partnership with the center for excellence in teaching and learning (cetl), and the students later gifted the lightboard to the library. the lightboard that the studio received came with a steel frame and wheels. the unit’s overall dimensions were 75 inches long, 45 inches wide, and 78 inches high. the glass board itself measured 71.5 by 47.5 inches (see figure 1).

figure 1. the lightboard that the library received.

the lightboard was used by a few instructors who frequented the studio over the years. during fall 2019, one faculty member from the college of natural resources regularly used the lightboard two to three times per week for about 45 minutes to an hour per session. another engineering faculty member, whose students built the lightboard, also used the lightboard several times but did not have a regularly scheduled appointment. there had not been any regular users since then.

recording videos using the lightboard required a complicated setup.
first, instructors would need to gather several pieces of equipment. for instance, they would need to check out a video camera and a tripod at the circulation desk downstairs and a lavalier microphone at the room adjacent to the studio. the setup required the lightboard to be positioned between the instructor and the camera. it was necessary to change the camera setting to flip the video horizontally; otherwise, any writings or drawings in the final recording would be displayed backward. additional steps included starting and stopping the camera recordings, checking throughout the recording process to ensure the instructor’s writing on the lightboard stayed within the camera’s frame of capture, and transferring the media from the camera’s sd card to an external hard drive or to cloud storage. as a result, recording a session using the lightboard required assistance from at least one other individual, usually a library staff or faculty member, from start to finish. the many different moving parts made the whole experience time-consuming and labor-intensive both for the library staff and the lightboard users.

literature review

lightboard technology has been implemented at various higher education institutions since 2014. thanks to dr. peshkin, who made the lightboard an open-source technology and provided the building instructions on his website, many institutions built their own versions of lightboards with variable setups. due to the nature of the lightboard requiring a controlled lighting environment and the writing being backwards from the perspective of those facing the glass (including the camera), the lightboards were used almost exclusively in dedicated studio spaces where the videos were to be recorded.
for instance, similar to the university of idaho library studio setup, the complete setup at the university of western australia consists of a lightboard, a camera, lights, markers, a lapel microphone, and a black canvas.2 a budget setup that cost as little as $100 as a removable, tabletop version was also developed.3 cornell university came up with a lightboard and projector setup that can be used in a live 500-person auditorium.4 needless to say, the lightboard technology was adaptable enough to meet various needs on many campuses. several studies show that, among the various types of instructional videos for asynchronous learning, students favor lightboard videos. one unique feature of the lightboard technology, for example, is that it enables instructors to incorporate their gaze and gestures into the instruction. according to a 2015 study, combining gaze and gestures with traditional instructional materials proved to be more effective in directing students’ attention.5 in a 2019 study, several researchers analyzed various lightboard cases in the context of learning theories and theoretical frameworks, such as cognitive load theory, cognitive theory of multimedia learning, and social learning theory. the researchers concluded that while more empirical research was needed, the lightboard videos could improve student learning and engagement.6 in another study conducted by researchers at the university of illinois urbana-champaign, students watched two types of recorded lectures—picture-in-picture with the instructor appearing in a corner of the video, or an overlay of the instructor without the background. study results showed that the overlay videos where the instructor interacted with the content had more views and were preferred by the students, likely thanks to the gaze and gestures of the instructor increasing accessibility. 7 in classes in which the instructors opted to use the lightboard, students generally responded positively to the lightboard videos. 
for example, in two online classes at clayton state university, most students preferred the lightboard lecture over the traditional narrated powerpoint lecture, and “students described it as engaging, more personable, appealing to visual learners, easier to follow and retain the information, and more similar to a conventional live lecture.”8 at bond university, in queensland, australia, in a chemistry class where the lightboard videos were incorporated as a learning aid, researchers reported that over a four-year period, students scored higher on exams in courses in which lightboard videos were incorporated as instructional materials.9 in another example, students enrolled in a physics class at san diego state university were exposed to the learning glass, a commercial product that was based on the lightboard technology. students responded in a post-course assessment that they felt more connected to their instructor when the instructor utilized the learning glass, and thus the researchers argued that the learning glass could positively impact stem students’ retention rates.10 lastly, at georgia southern university, two researchers conducted a mixed-method study to assess different groups of students’ perceptions of lightboard videos. the findings showed that while performing equally well when comparing test scores, the students in the class that incorporated lightboard videos had better understanding, engagement, and satisfaction based on the assessment measures.11 lightboards are not without their drawbacks given the requirements and the limitations of the equipment and the recording conditions.
in an engineering class where students used lightboard for a problem-solving assignment to demonstrate their learning, researchers identified the various requirements including a room with sufficient size, the need for filming equipment, and long postproduction processing time.12 other disadvantages of the lightboard included immobility, limited writing surface, and a more rigorous cleaning process.13 the type of content being presented in lightboard videos also required consideration. in a study comparing different types of lecture videos, students showed a strong preference to the learning glass videos and “suggested that this style be used to supplement lecture videos (in the form of practice problems and follow-up videos).”14 this conclusion corroborated another study that a lightboard was useful for step-by-step problem-solving explanations.15 lastly, in a study that examined three different styles of lightboard videos (interview style, multipresenter, and multimedia-enriched), the researchers identified the benefits along with the drawbacks of each style.16 for example, while interview videos highlighted interactions between the presenter and the interviewer, the presenter experienced “difficulty in multitasking between writing notes on the lightboard and attending to the interviewer’s questions.” having several presenters could also limit the amount of space for them to move around and write on the glass while remaining in frame and created possible distractions of having too many people as well as too much writing on the glass. another potential issue is that not all presenters could be wearing darker-colored clothing for better contrast with the writing. eglass context in spring 2021, the manager at the collaboration & classroom technology services (ccts) department at the university of idaho informed the author that they were planning on purchasing several eglass units for the campus to support faculty’s instructional video creation. 
the funds came from the governor’s emergency education relief (geer) grant to address the covid-19 pandemic’s impact on higher education. initially, the grant was written by several individuals who intended to purchase commercially-made lightboards to enhance distance teaching options. while researching for the grant, the team stumbled upon the eglass, which seemed to be easier to use than the lightboard. the pricing was reasonable, so the team decided to purchase several of these devices instead of the two lightboards that were originally recommended. if interested, the library could receive one unit alongside eight other colleges on campus. the author checked out the demo unit at ccts and reported the first impressions as a user to the dean of university of idaho libraries. the latter reasoned that due to the lightboard and eglass’s duplicating functionalities and the fact that the eglass had more perceived ease of use given its all-in-one package without the lighting and camera being separate, it would be best to replace the lightboard with the eglass. the author contacted the lightboard capstone project faculty member, who chose to rehome the lightboard to the engineering outreach department at the college of engineering. removing the lightboard paved the way for welcoming the eglass to the studio by reclaiming needed room space. the eglass came in two sizes: a 35-inch and a 50-inch diagonal writing surface. the library received a 50-inch unit with the writing surface measured at 45.64 inches long and 27.40 inches tall. the height of the overall unit could be adjusted to 29.37 inches, 31.33 inches, or 33.31 inches.
additional accessories that the library received included a desktop computer, two height-adjustable desks, a touchscreen monitor, a webcam, a ring light, peripherals, neon pens, and white cloths for wiping down the writings. once the order of the eglass came through, a ccts team that consisted of several individuals brought the eglass along with two height-adjustable tables to assemble (see figure 2). the assembling of all the equipment took about an hour.
figure 2. ccts team assembling the eglass; disclosure: the shirt logo does not represent any affiliations.
description
similar to the lightboard, the eglass was made of a sheet of glass and a frame, and the instructors could write on the glass using neon markers. however, the eglass had several distinct features and advantages over the lightboard. first and foremost, the eglass had a built-in camera and the recording function that enabled the instructors to start, pause, and stop the recording on their own with a touch of a button. in addition, the eglass internal system flipped the image automatically in real time so that instructors did not need to write backward. therefore, using the eglass would not require additional support from library personnel since the separate camera setup was no longer needed. the eglass’s built-in lights were also an improvement over the lightboard’s lights. the lightboard came with one set of lights on the frame that illuminated the writings on the glass, but it was necessary to set up additional portable lights to ensure the instructors were illuminated as well. the eglass came with two sets of lights: the instructor light illuminated the instructor, and the blue glass lights ensured the ink on the glass would glow for better visibility. each set of lights was controlled by a separate knob to adjust the intensity.
moreover, the eglass could be used as a standalone unit for simple tasks that involved writing and drawing on the glass. for example, instructors could start, pause, and stop the recording using the touch buttons located below the writing surface on the frame. instructors could also use the free-to-download eglassfusion software to access additional features, such as taking snapshots; importing powerpoint slides, word documents, pdfs, and other types of media files; removing the imported media’s background color; zooming in and out; and annotating by typing texts and drawing rectangles or arrows.
figure 3. a faculty member recording a video with an application overlay.
while the eglass was connected to a desktop computer via a usb cable, instructors could bring their own devices to connect to the eglass, which supports windows, macos, and chromebook operating systems. with a laptop connected to the eglass, instructors could use the downloaded and installed eglassfusion software to control what they were sharing on their screens. for instance, on their devices, instructors could use video conference software such as zoom and microsoft teams for synchronous online instruction via screen sharing and could switch from their laptops’ camera to the eglass camera as the output video. in addition, students could see the writings and drawings on the glass, the instructor’s face, body, gestures, and any programs opened on the instructor’s laptop on the same screen (see example in figure 3). lastly, instructors could choose to use the eglass while sitting down or standing up as the eglass was placed on a height-adjustable desk. the desktop computer, touchscreen monitor, webcam, and ring light enabled a one-button studio setup.
instructors could open any video recording software when pressing the button to start a recording and use the touchscreen monitor for zoom whiteboard and camtasia for screencast recording with annotating.
outreach
the new equipment setup was completed a few weeks before the start of fall semester 2021. ccts sent out an announcement to the university daily newsletter targeted to faculty and staff to advertise that the eglass had been set up at various locations on campus. the author also provided 20 in-person studio orientation sessions, scheduled at 10:00 a.m. and 2:00 p.m. monday–friday during the first two weeks of classes, to campus students, faculty, and staff. prior sign-ups were not necessary, so patrons could simply show up at the orientation time. these orientations provided an overview to patrons unfamiliar with the studio or any pieces of the existing or new equipment. among the 36 patrons who showed up to the orientations, three faculty members were introduced to the eglass and one-button studio. several additional informational and educational workshops were conducted to promote awareness of the eglass. in the fall semester, ccts hosted a workshop introducing the eglass. due to the limited physical space in the studio that could only comfortably accommodate fewer than five people, the workshop was hosted in a hybrid format with the in-person location in a room adjacent to the studio. participants could choose to attend either via zoom or in person. if attending in person, participants could visit the studio after the workshop to check out the eglass setup and try out the equipment. workshop attendees noticed that the writing on the eglass was difficult to differentiate from the white wall, which served as the background. after the workshop, the author ordered some black wallpaper and applied it to the wall facing the eglass to help improve the contrast.
in the 2022 spring semester, the author facilitated an online library workshop to introduce the eglass, its core features, advantages over the traditional white/blackboard or zoom instruction, examples of applicable disciplines to use eglass for instruction, and best practices to five faculty and two staff attendees. another event to promote the eglass was the engineering design expo at the university of idaho college of engineering, an annual event that showcases design projects created by students. this event attracted regional k–12 students, community college students, industry partners, and community partners. the makerfaire, an event that featured makerspace technologies and a drone demonstration, took place on the same day as the expo. due to the perceived impact of eglass and its application to stem instruction, marketing eglass to the stem audience seemed to be a natural fit. thanks to the ease of assembly, the author staffed a table at the makerfaire with a smaller eglass unit loaned from another campus location. the author demoed the eglass to passersby, including students, faculty, and community members. lastly, the active learning symposium is an annual event hosted by ccts and cetl at the university of idaho. in 50-minute presentations, instructors shared their teaching strategies to promote active learning in their classrooms. the author reached out to one eglass regular user, the computer science department chair, to co-present at the symposium to introduce the eglass and showcase some eglass videos created for a computer science class.
usage
in the 2021–2022 academic year, two faculty members regularly reserved the studio to use the eglass. one faculty member was the chair of the computer science department, and the other person was in the animal, veterinary and food sciences department.
after attending an orientation to the equipment, setup, and software, the faculty members reserved the space and recorded on their own a few more times without the need for support from the author or a staff member. one of the initial goals of replacing the lightboard with the eglass was to free up library staff time to support faculty recording lectures, and the author believed that having this new equipment reached this goal. about halfway through the fall semester in 2021, the author added a checkbox for patrons to indicate their intended studio usage when making a reservation on the library website. based on the statistics generated by libcal, in addition to the two faculty members, five students booked the studio to create instructional videos. however, since none of the students reached out to the author directly and the studio was not staffed, it was not possible to confirm if the students used the eglass or any other pieces of equipment in the studio for video creation. regardless, the overall usage of eglass was lower than anticipated, and the author believed that there were several contributing factors. first, the equipment was not properly set up until the end of summer. several faculty who heard of the eglass expressed interest in using it to prepare for fall instruction, but shipping delays prevented the equipment from being delivered and set up in time. moreover, since several other colleges also received the eglass, faculty members who could access a unit at their colleges chose not to check out the library studio location despite the additional equipment and the optimized space to help improve the user experience. lastly, despite the marketing efforts, the author suspected that the majority of campus was still not aware of the existence of the eglass technology, so additional outreach was probably still needed. 
lessons learned
after overseeing the studio with the new eglass equipment for two semesters, the author found that the work required to promote the eglass had been underestimated; the saying that “if you build it, they will come” does not always ring true. ensuring that the eglass was adopted by more faculty members required a lot of dedicated effort. identifying several early adopters who saw the value of the technology and were willing to advocate for it by spreading the word to their colleagues was key. even then, the author noticed that the two faculty members who had been using the eglass had stopped coming to the studio regularly after several sessions. keeping faculty engaged despite their diminishing interest in using the equipment was an issue that the author did not anticipate or resolve. in the 2022–2023 academic year, the library underwent a library-wide reorganization that halted several existing and anticipated work priorities, one of which was conducting studio space and service assessments. in the 2023–2024 academic year, through a collaborative effort with the new department administrator, the author hopes to improve the studio and eglass usage by planning promotional initiatives and resuming assessment activities. the space to place the equipment, on the other hand, was another consideration. while it was decided to put the eglass in the studio so that the lightboard could be replaced, the physical unit of the 50-inch eglass took more space than the original lightboard. occasionally, the author received requests from patrons who wanted to use the studio to record videos using a green screen. while it was still manageable to set up a green screen in the remaining space, the lack of room made patrons’ recording experience feel cramped and awkward.
overall, for a 200-square-foot studio that already held computer desks, audio equipment racks, and portable lights, housing the eglass was less ideal than anticipated. moreover, in order for the studio to be optimized for using the eglass, the lighting, sound, and background required permanent adjustments. for example, after the initial setup, the eglass was facing a white wall in the studio. ideally, the background needed to be dark to contrast with the lighter neon-colored writings on the glass. possible solutions included installing a black backdrop, painting the wall black, or applying black wallpaper. installing a backdrop with curtains was the most expensive and time-consuming option, and painting the wall would require temporary closure of the studio. the author opted to order black wallpaper from amazon.com to minimize the disruption to studio operations during the regular semester. the wallpaper cost less than a hundred dollars and applying it to the wall only required an hour, but eventually the adhesive started to wear off. the author decided to remove the wallpaper over the summer and contacted the facilities department to paint the wall black, which took time for removing and restoring the equipment in addition to the time for the wall to dry. lighting was another challenge since the eglass required a light-controlled environment. ideally, all the lights in the room should be turned off for patrons who wanted to use the eglass so that the writings and drawings on the glass were highly visible. some fluorescent lights in the studio were emergency lights that could not be turned off by flipping the light switches. the author had to manually disable some of the lights for the eglass users. the last space-related challenge was sound. the eglass came with a built-in microphone that did not require a separate microphone setup.
however, the eglass was placed close to the walls in the studio due to a lack of space, which caused some reverberations, lowering the overall sound quality. the sound could be improved if patrons used a headset with microphone and connected the headset to the computer dedicated to the eglass. installing acoustic wall panels was another viable option, and the author might consider such an approach if the usage of the eglass grew to justify the equipment purchase.
conclusion
the eglass technology at the university of idaho library offered an improved instructional video creation experience to the campus community. thanks to the eglass’s easier setup compared to the lightboard and the studio space improvement in terms of the controlled lighting and the black wall, faculty benefited greatly from having access to a tool that enabled them to create engaging videos for classes delivered in online and hybrid modalities. however, additional dedicated outreach efforts are needed for a wider campus adoption. at the university of idaho, seven other colleges on campus owned eglass alongside the library, and there has not been any coordinated communication to promote the technology among all locations. while marketing emails and newsletters would work well for most new services, it is the author’s opinion that potential users would better understand the applicability of the eglass to their instruction when they are able to see the physical unit in person. more in-person outreach, such as inviting faculty to the studio or attending departmental faculty meetings to show videos made using eglass, would be of help. for other institutions that might be interested in acquiring an eglass or a similar technology, the author would suggest conducting an environmental scan first to determine the campus need.
are there faculty on campus who could benefit from this type of technology to achieve their instructional goals? are there any existing spaces on campus that offer comparable services or resources? if the library administration was interested in acquiring the technology for the library, is there an existing space that would be suitable for placing the equipment? would the library invest in the room so that the lights could be fully controlled, sounds could be proofed or dampened, and a background could be darkened? would there be a staff member to be assigned as the dedicated person to support and maintain the technology? the author hopes that this case study presents a myriad of ideas for those considering adopting a technology similar to an eglass at their libraries.
endnotes
1 michael peshkin, “lightboard.info,” https://www.lightboard.info/.
2 timothy r. corkish et al., “a how-to guide for making online pre-laboratory lightboard videos,” in advances in online chemistry education, acs symposium series, vol. 1389 (washington, dc: american chemical society, 2021), 77–91, https://doi.org/10.1021/bk-2021-1389.ch006.
3 katrina hay and zachary wiren, “do-it-yourself low-cost desktop lightboard for engaging flipped learning videos,” the physics teacher 57, no. 8 (november 1, 2019): 523–25, https://doi.org/10.1119/1.5131115.
4 erik s. skibinski et al., “a blackboard for the 21st century: an inexpensive light board projection system for classroom use,” journal of chemical education 92, no. 10 (october 13, 2015): 1754–56, https://doi.org/10.1021/acs.jchemed.5b00155.
5 kim ouwehand, tamara van gog, and fred paas, “designing effective video-based modeling examples using gaze and gesture cues,” journal of educational technology & society 18, no. 4 (2015): 78–88.
6 mark lubrick, george zhou, and jingsheng zhang, “is the future bright? the potential of lightboard videos for student achievement and engagement in learning,” eurasia journal of mathematics, science and technology education 15, no. 8 (april 11, 2019): em1735, https://doi.org/10.29333/ejmste/108437.
7 suma bhat, phakpoom chinprutthiwong, and michelle perry, “seeing the instructor in two video styles: preferences and patterns” (paper, international conference on educational data mining, madrid, spain, june 26–29, 2015), https://eric.ed.gov/?id=ed560520.
8 sheryne southard and karen young, “an exploration of online students’ impressions of contextualization, segmentation, and incorporation of light board lectures in multimedia instructional content,” the journal of public and professional sociology 10, no. 1 (january 5, 2018), https://digitalcommons.kennesaw.edu/jpps/vol10/iss1/7.
9 stephanie s. schweiker and stephan m. levonis, “a quick guide to producing a virtual chemistry course for online education,” future medicinal chemistry 12, no. 14 (july 1, 2020): 1289–91, https://doi.org/10.4155/fmc-2020-0103.
10 shawn firouzian, chris rasmussen, and matthew anderson, “adaptations of learning glass solutions in undergraduate stem education,” in proceedings of the 19th annual conference on research in undergraduate mathematics education (pittsburgh, pennsylvania: special interest group of the mathematical association of america on research in undergraduate mathematics education, 2016), 751–60, http://sigmaa.maa.org/rume/rume19v3.pdf.
11 peter d. rogers and diana t. botnaru, “shedding light on student learning through the use of lightboard videos,” international journal for the scholarship of teaching and learning 13, no. 3 (2019), https://eric.ed.gov/?id=ej1235871.
12 kenneth r. hite et al., “effects of lightboard usage on circuit problem skills,” in 2017 ieee frontiers in education conference (fie) proceedings (ieee, 2017), 1–4, https://doi.org/10.1109/fie.2017.8190529.
13 weibing ye, “lightboard and chinese language instruction,” journal of technology and chinese language teaching 7, no. 2 (december 31, 2016): 97–112.
14 ronny c. choe et al., “student satisfaction and learning outcomes in asynchronous online lecture videos,” cbe—life sciences education 18, no. 4 (december 2019): ar55, https://doi.org/10.1187/cbe.18-08-0171.
15 julia vandermolen, kristen vu, and justin melick, “use of lightboard video technology to address medical dosimetry concepts: field notes,” current issues in emerging elearning 4, no. 1 (june 13, 2018), https://scholarworks.umb.edu/ciee/vol4/iss1/6.
16 christoph dominik zimmermann et al., “utilizing the power of blended learning through varied presentation styles of lightboard videos,” in technology-enabled blended learning experiences for chemistry education and outreach, ed. fun man fung and christoph zimmermann (elsevier, inc., 2021), 31–40, https://doi.org/10.1016/b978-0-12-822879-1.00003-2.

oclc search key usage patterns in a large research library
kunj b. rastogi: oclc; and ichiko t. morita: ohio state university, columbus.
many libraries use the oclc online union catalog and shared cataloging subsystem to perform various library functions, such as acquisitions and cataloging of library materials. as an initial part of the operations, users must search and retrieve a bibliographic record for the desired item from the large oclc database. various types of derived search keys are available for retrieval. this study of actual search keys entered by users of the oclc online system was conducted to determine the types of search keys users prefer for performing various library operations and to find out whether the preferred search keys are effective.
introduction
in the last decade, many information systems have been developed that use search keys to retrieve bibliographic records from large databases. the oclc online union catalog and shared cataloging subsystem in particular is one of the larger of these systems. 1--u there are currently more than 7 million bibliographic records in the oclc database. the oclc online system uses search keys to access various index files that locate bibliographic records in the database. index files are maintained for name/title, title, personal author, corporate author, coden, isbn, and lccn indexes. the first four of the above index files contain search keys that are derived from information (e.g., author, title) present in the piece or citation. search keys in these four indexes are in general not unique, because the derived key could be the same for different bibliographic records. the last three indexes (coden, isbn, and lccn) contain search keys or identifiers that are unique in general. a user enters a search key consisting of characters (letters, numbers, symbols, commas, hyphens) formatted according to specific rules that identify to the system which index file to search.
for example, to search the name/title index, the user enters a search key consisting of the first four characters of the author's last name and the first four characters of the first nonarticle word of the title of the work, separated by a comma. to search the title index, the user enters a search key consisting of the first three characters of the first nonarticle word in the title, the first two characters of the second word, the first two characters of the third word, and the first character of the fourth word, each separated by a comma.7 the system compares the user-entered search key with the search keys contained in that index file. this comparison results in one of three possible cases:
1. only one index file search key matches the user-entered search key.
2. more than one index file search key matches the user-entered search key.
3. no index file search key matches the user-entered search key.
in the first case, the system retrieves the unique bibliographic record corresponding to the search key and displays it on the user's terminal screen. in the second case, the system retrieves all records that correspond to the search key, prepares truncated entries (consisting of author, title, imprint data, etc.) for those records, and displays the truncated entries on the user's terminal screen. the user then selects the truncated entry that corresponds to the desired record and requests the system to display the full record for that item. in the third case, the system responds with the reply that a record matching the user-entered search key was not present (a "not found" response) in the index. in the oclc online system, 2,500 member libraries using 3,800 terminals search the oclc database to perform various library functions such as acquisitions, monograph cataloging, and serials cataloging.
manuscript received october 1980; accepted december 1980.
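the two derived-key formats and the three match outcomes described above can be sketched in a few lines of python. this is only an illustration under stated assumptions, not the oclc implementation: the function names, the article list, the lowercasing, the padding of short titles, and the in-memory dictionary standing in for an index file are all hypothetical, and punctuation handling is ignored.

```python
from collections import defaultdict

ARTICLES = {"a", "an", "the"}  # assumption: only a leading English article is skipped


def title_words(title):
    """Lowercase the title and drop one leading article, if present."""
    words = title.lower().split()
    if words and words[0] in ARTICLES:
        words = words[1:]
    return words


def name_title_key(last_name, title):
    """4,4 key: four characters of the surname, four of the first nonarticle title word."""
    words = title_words(title)
    first_word = words[0] if words else ""
    return f"{last_name.lower()[:4]},{first_word[:4]}"


def title_key(title):
    """3,2,2,1 key built from the first four title words after any leading article."""
    words = title_words(title) + [""] * 4  # assumption: short titles give empty segments
    return ",".join(word[:n] for word, n in zip(words, (3, 2, 2, 1)))


def build_title_index(titles_by_id):
    """Hypothetical stand-in for an index file: derived key -> list of record ids."""
    index = defaultdict(list)
    for record_id, title in titles_by_id.items():
        index[title_key(title)].append(record_id)
    return index


def search(index, key):
    """Return the outcome for one user-entered key, following the three cases."""
    matches = index.get(key, [])
    if not matches:
        return "not found", []
    if len(matches) == 1:
        return "full record", matches
    return "truncated entries", matches


titles = {
    1: "The Journal of Library Automation",
    2: "Journal of Library Administration",
    3: "Hamlet",
}
index = build_title_index(titles)
print(search(index, "jou,of,li,a"))  # two records share this key
print(search(index, "ham,,,"))       # exactly one record has this key
print(search(index, "xyz,,,"))       # no record has this key
```

the first lookup shows why derived keys are in general not unique: two different titles collapse to the same eight characters, so the searcher has to choose from a list of truncated entries.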
users can choose to enter any type of search key from the various types of search keys permitted by the system. users' preferences to enter a particular type of search key will depend in part upon the kind of information they have about the item to be searched and the type of library function they wish to perform. if users receive a "not found" response after entering a particular type of search key, they may then try a different type of search key that they consider next best.

the purpose of this study was to determine what types of search keys are preferred to perform various library functions and whether the preferred search keys are effective. the study also investigated what type of search key is used next when particular types of search keys are unable to retrieve the desired record, to determine if there are any discernible search patterns.

journal of library automation vol. 14/2 june 1981

materials and methods

for conducting this study, data were needed on the pattern of search-key use in oclc member libraries. further, the data had to include the actual time of day when work was performed for a particular library function on a specific terminal. this requirement would permit identification, in the online system use data collected by oclc, of search keys entered to perform specific library functions. ideally, a library with several oclc terminals, each used exclusively for only one library function, was desired. the ohio state university (osu) library met this requirement. the osu library has eleven terminals: two of the eleven terminals are used exclusively for performing acquisition functions, seven are used for monographic cataloging, and one terminal each is used for serials cataloging and public use. the terminal assigned for serials cataloging is used for monograph cataloging after 5 p.m. library staff at osu use all the terminals exclusively, except for the public-use terminal.
this public-use terminal can be used by anyone, including faculty, students, and library staff. two full days' transactions for each of the osu terminals were obtained from the oclc online system use statistics (olsus) file. during the online operation, the system writes a record on the olsus file for each message entered by the user. this record includes the institution number, a number identifying the terminal from which the message came, the time of the transaction, and the first nonblank sixteen characters of the message. if the user-entered message is a search key, the system response is either a "not found" response or a "found" response. with the "found" response, the system displays the bibliographic record (if unique) or displays a truncated entry screen. however, a "found" response does not necessarily mean that the truncated entry screen includes information about the bibliographic record the user was actually seeking.

for the study, a program was written to scan the records in the olsus file for two full days in october 1978. the program extracted all the records for messages that came from the eleven osu terminals and wrote the records on two tapes, one for each day's activity. these tapes were sorted first by the terminal number and then within each terminal number by the time of transaction. each sorted tape was fed to another program that printed, for each terminal, the actual messages in chronological order and the associated system response. from this printout, it was possible to go manually through the complete sequence of messages entered to search a single bibliographic item. the printout for an entire day's activity for each terminal was thus divided into sections, each section containing all transactions that were performed to search for a single item. for each section, the type of search key first entered and the system response were noted.
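the extraction and sorting steps described above can be sketched as follows. this is only an illustration under stated assumptions: the `OlsusRecord` layout, the zero-padded `HH:MM:SS` time string, and the function name are hypothetical, since the real olsus record format is not given here. dividing each terminal's sequence into per-item sections was done by hand in the study and is not automated in the sketch.

```python
from dataclasses import dataclass
from itertools import groupby
from operator import attrgetter


@dataclass(frozen=True)
class OlsusRecord:
    """Hypothetical layout holding the four fields the text lists."""
    institution: int
    terminal: int
    time: str      # assumption: zero-padded "HH:MM:SS", so string order is time order
    message: str   # first nonblank sixteen characters of the user's message


def records_by_terminal(records, terminals):
    """Keep records from the chosen terminals, sorted by terminal and then time."""
    selected = [r for r in records if r.terminal in terminals]
    selected.sort(key=attrgetter("terminal", "time"))
    # groupby needs sorted input; each group is one terminal's chronological log.
    return {
        terminal: list(group)
        for terminal, group in groupby(selected, key=attrgetter("terminal"))
    }
```

the returned dictionary plays the role of the per-terminal printout: one chronological list of messages per terminal, ready to be read through and split into single-item searches.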
in case of a "not found" response, the type of search key next entered (if the search process was continued for the item) also was noted. the results were combined for all the terminals used to perform a specific library function (e.g., acquisitions) and for the two days.

results and discussion

table 1 and figure 1 show the different types of search keys used as the first choice to perform various library functions. note that at the time of data collection for this study, the interlibrary loan subsystem was not operational.

table 1. different types of searches for various applications

                   acquisitions        monograph cataloging  serials cataloging   public use
type of search     items      % of     items      % of       items      % of      items      % of
                   searched   total    searched   total      searched   total     searched   total
name/title         111        37.5     313        51.7       15         15.9      77         48.7
title              49         16.6     48         7.9        72         76.6      44         27.8
personal author    0          0.0      9          1.5        0          0.0       16         10.1
lccn               122        41.2     201        33.2       1          1.1       13         8.2
isbn               14         4.7      34         5.6        1          1.1       3          1.9
issn               0          0.0      0          0.0        5          5.3       3          1.9
coden              0          0.0      0          0.0        0          0.0       2          1.3
total              296        100.0    605        100.0      94         100.0     158        100.0

fig. 1. number of different types of search keys for various applications (panels: acquisitions, monograph cataloging, serials cataloging, public use).

during the two-day period, a total of 605 items were searched for monograph cataloging, 296 items were searched for acquisitions operations, and 94 items were searched for serials cataloging. a total of 158 items were searched on the public-use terminal. most types of search keys were used to some extent. the use of isbn and issn search keys was quite limited for all types of library functions. the coden search key was used only twice, and both times through the public-use terminal. the corporate author search key was not used at all. the use of the personal-author search key was much smaller than expected.
this was probably because at the time of the study the system did not permit use of personal author keys during peak hours (9 a.m. to 5 p.m.) of online system operation. for the acquisitions function, the lccn search key was used most often, followed by the name/title key. these two types of keys together were used for about 80 percent of the acquisitions items searched. for the monograph cataloging function, the most frequently used search key was the name/title key. this key was entered for about 52 percent of items searched. the next most frequently used key for monograph cataloging was the lccn key, used for about 33 percent of the items searched. for the serials cataloging function, the title key was used most often, for more than 75 percent of the items searched. searches performed through the public-use terminal included all types of search keys. the name/title key was used most frequently, followed by the title key.

before performing an actual search, a user must choose, from among the various types of search keys available in the oclc system, the particular search key to use. if the search key used for a first try (primary choice of search key) results in a "not found" response from the system, a second key may be entered (secondary choice of search key). this sequence may continue through many search-key choices until the user retrieves the desired record ("found" response) or decides to abandon the search at some point upon obtaining a "not found" response. for this study, the investigation was confined to only primary and secondary choices of search keys. the results of the "found" responses for the primary choice of key and for the secondary search key entered after receiving the first "not found" response are presented in tables 2 through 5.

for the acquisitions function (table 2), the most frequently used primary search key was the lccn key, which retrieved the desired record about 89 percent of the time.
when the lccn key could not retrieve the record, the user chose mostly the name/title key as his/her secondary choice or abandoned the search. the next most frequently used primary search key was the name/title key, which retrieved the desired record about 51 percent of the time. when the name/title key was unsuccessful, the users entered as their secondary search key a title key about 41 percent of the time, or a different name/title key about 31 percent of the time. approximately 26 percent of the time they abandoned the search.

table 2. number of primary and secondary choices of search keys for acquisitions

type of search    items     found      % found    not-found  types of search key used after the first not-found response              search discontinued after
key used first    searched  responses  responses  responses  name/title   title        personal author  lccn         isbn         the first not-found response
name/title        111       57         51.3%      54         17 (31.5%)   22 (40.7%)   0 (0.0%)         1 (1.9%)     0 (0.0%)     14 (25.9%)
title             49        17         34.7%      32         6 (18.8%)    11 (34.4%)   0 (0.0%)         2 (6.2%)     1 (3.1%)     12 (37.5%)
personal author   0
lccn              122       109        89.3%      13         5 (38.4%)    1 (7.7%)     0 (0.0%)         2 (15.4%)    1 (7.7%)     4 (30.8%)
isbn              14        1          7.1%       13         8 (61.5%)    3 (23.1%)    0 (0.0%)         0 (0.0%)     0 (0.0%)     2 (15.4%)
issn              0
coden             0
total             296       184        62.2%      112        36 (32.1%)   37 (33.0%)   0 (0.0%)         5 (4.5%)     2 (1.8%)     32 (28.6%)

note: to calculate the percentage given in parentheses, the number of "types of search key used after the first not-found response" was divided by the number of "not-found responses."

it seems that acquisitions users mostly try the lccn key first if available (the lccn is not present in all the records) and the name/title key first if the lccn is not available. thus, users adopted the right approach since the lccn key has the highest hit rate. furthermore, the lccn key is more efficient than other keys because it results, on the average, in fewer replies.
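the arithmetic behind table 2 can be reproduced with a short sketch. the counts are transcribed from the table; the dictionary layout and function names are illustrative assumptions. the printed table appears to truncate some percentages rather than round them, so the last digit can differ by 0.1.

```python
# Counts transcribed from table 2 (acquisitions). "next" holds the key type
# tried after the first not-found response, plus abandoned searches.
TABLE_2 = {
    "name/title": {"searched": 111, "found": 57,
                   "next": {"name/title": 17, "title": 22, "personal author": 0,
                            "lccn": 1, "isbn": 0, "discontinued": 14}},
    "title":      {"searched": 49, "found": 17,
                   "next": {"name/title": 6, "title": 11, "personal author": 0,
                            "lccn": 2, "isbn": 1, "discontinued": 12}},
    "lccn":       {"searched": 122, "found": 109,
                   "next": {"name/title": 5, "title": 1, "personal author": 0,
                            "lccn": 2, "isbn": 1, "discontinued": 4}},
    "isbn":       {"searched": 14, "found": 1,
                   "next": {"name/title": 8, "title": 3, "personal author": 0,
                            "lccn": 0, "isbn": 0, "discontinued": 2}},
}


def hit_rate(row):
    """Percentage of first-choice searches that returned a found response."""
    return 100 * row["found"] / row["searched"]


def next_choice_shares(row):
    """Percentages in parentheses: each follow-up count over not-found responses."""
    not_found = row["searched"] - row["found"]
    return {choice: 100 * count / not_found for choice, count in row["next"].items()}


for key_type, row in TABLE_2.items():
    shares = next_choice_shares(row)
    print(key_type, round(hit_rate(row), 1),
          {choice: round(share, 1) for choice, share in shares.items()})
```

in every row the follow-up counts add up to the number of not-found responses, which is a useful consistency check when transcribing tables 3 through 5 in the same way.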
for the monograph cataloging function (table 3), the name/title key was used most often as the primary search key, resulting in retrieval of the desired record about 57 percent of the time. when the name/title key could not retrieve the record, the users next attempted a title key (52 percent of the time) or a different name/title key (21 percent of the time). about 23 percent of the time they discontinued the search. the lccn key was the second most frequently used primary search key and successfully retrieved the record about 79 percent of the time. when the lccn key was unsuccessful, the users tried the name/title key (58 percent of the time) as their secondary choice or abandoned the search. unlike the search-key usage pattern for acquisitions, the use of the lccn key for monograph cataloging was lower than use of the name/title key, although here also the hit rate was highest for the lccn key. the reason the lccn use was lower is that ohio state university, being a research institution, processes a large number of items from various sources other than regular acquisitions channels, and many of these sources do not have lccn information.

table 3. number of primary and secondary choices of search keys for monograph cataloging

type of search    items     found      % found    not-found  types of search key used after the first not-found response              search discontinued after
key used first    searched  responses  responses  responses  name/title   title        personal author  lccn         isbn         the first not-found response
name/title        313       180        57.5%      133        28 (21.1%)   69 (51.9%)   1 (0.7%)         4 (3.0%)     1 (0.7%)     30 (22.6%)
title             48        24         50.0%      24         9 (37.5%)    2 (8.3%)     1 (4.2%)         3 (12.5%)    2 (8.3%)     7 (29.2%)
personal author   9         3          33.3%      6          4 (66.6%)    0 (0.0%)     0 (0.0%)         0 (0.0%)     1 (16.7%)    1 (16.7%)
lccn              201       158        78.6%      43         25 (58.1%)   4 (9.3%)     0 (0.0%)         2 (4.7%)     1 (2.3%)     11 (25.6%)
isbn              34        3          8.8%       31         20 (64.5%)   4 (12.9%)    1 (3.2%)         1 (3.2%)     3 (9.7%)     2 (6.5%)
issn              0
coden             0
total             605       368        60.8%      237        86 (36.3%)   79 (33.3%)   3 (1.3%)         10 (4.2%)    8 (3.4%)     51 (21.5%)

note: to calculate the percentage given in parentheses, the number of "types of search key used after the first not-found response" was divided by the number of "not-found responses."

for the serials cataloging function (table 4), the title key was the first primary choice and retrieved the desired records 44 percent of the time. if this key failed to retrieve the desired records, the users entered as their secondary key a different title key 55 percent of the time and a name/title key 17 percent of the time. approximately 23 percent of the time, users decided to discontinue the search. although for serials cataloging the title key was used most frequently, its hit rate was less than 45 percent. on the other hand, the issn key was used very little, but its hit rate was as high as 80 percent. the use of the issn key is likely to increase in the future, however, because the united states postal service now requires the issn to be present on serials (8). therefore, the issn will be more readily available to the user.
table 4. number of primary and secondary choices of search keys for serials cataloging

type of search    items     found      % found    not-found  types of search key used after the first not-found response              search discontinued after
key used first    searched  responses  responses  responses  name/title   title        personal author  lccn         isbn         the first not-found response
name/title        15        3          20.0%      12         6 (50.0%)    4 (33.3%)    1 (8.3%)         0 (0.0%)     0 (0.0%)     1 (8.3%)
title             72        32         44.4%      40         7 (17.5%)    22 (55.0%)   2 (5.0%)         0 (0.0%)     0 (0.0%)     9 (22.5%)
personal author   0
lccn              1         0          0.0%       1          0 (0.0%)     1 (100.0%)   0 (0.0%)         0 (0.0%)     0 (0.0%)     0 (0.0%)
isbn              1         0          0.0%       1          1 (100.0%)   0 (0.0%)     0 (0.0%)         0 (0.0%)     0 (0.0%)     0 (0.0%)
issn              5         4          80.0%      1          0 (0.0%)     1 (100.0%)   0 (0.0%)         0 (0.0%)     0 (0.0%)     0 (0.0%)
coden             0
total             94        39         41.5%      55         14 (25.5%)   28 (50.9%)   3 (5.4%)         0 (0.0%)     0 (0.0%)     10 (18.2%)

note: to calculate the percentage given in parentheses, the number of "types of search key used after the first not-found response" was divided by the number of "not-found responses."

among the searches performed through the public-use terminal (table 5), the most frequently used primary search key was the name/title key, which resulted in a successful search about 29 percent of the time. when patrons encountered a "not found" response, they tried as their secondary choice a different name/title key 29 percent of the time, or a title key 29 percent of the time. they abandoned the search 38 percent of the time. as mentioned earlier, the public-use terminal can be used by anyone, including faculty and students. the hit rate for the name/title key at this terminal was rather low. from this study, it is not possible to say whether this was due to patrons' lack of knowledge in key construction or lack of sufficient information needed for the construction of the key.

table 5. number of primary and secondary choices of search keys for public use

type of search    items     found      % found    not-found  types of search key used after the first not-found response              search discontinued after
key used first    searched  responses  responses  responses  name/title   title        personal author  lccn         isbn         the first not-found response
name/title        77        22         28.6%      55         16 (29.1%)   16 (29.1%)   0 (0.0%)         2 (3.6%)     0 (0.0%)     21 (38.2%)
title             44        20         45.4%      24         11 (45.8%)   9 (37.5%)    0 (0.0%)         0 (0.0%)     0 (0.0%)     4 (16.7%)
personal author   16        5          31.3%      11         0 (0.0%)     0 (0.0%)     3 (27.3%)        0 (0.0%)     0 (0.0%)     8 (72.7%)
lccn              13        5          38.5%      8          2 (25.0%)    2 (25.0%)    0 (0.0%)         1 (12.5%)    1 (12.5%)    2 (25.0%)
isbn              3         2          66.7%      1          0 (0.0%)     0 (0.0%)     0 (0.0%)         0 (0.0%)     1 (100.0%)   0 (0.0%)
issn              3         1          33.3%      2          0 (0.0%)     0 (0.0%)     0 (0.0%)         0 (0.0%)     0 (0.0%)     2 (100.0%)
coden             2         0          0.0%       2          0 (0.0%)     1 (50.0%)    0 (0.0%)         0 (0.0%)     0 (0.0%)     1 (50.0%)
total             158       55         34.8%      103        29 (28.2%)   28 (27.2%)   3 (2.9%)         3 (2.9%)     2 (1.9%)     38 (36.9%)

note: to calculate the percentage given in parentheses, the number of "types of search key used after the first not-found response" was divided by the number of "not-found responses."

summary and conclusions

among various types of search keys available to the users, the name/title, lccn, and title search keys were entered most frequently. the use of personal-author, isbn, issn, and coden search keys was very limited for all library functions. corporate-author search keys were not used at all. for the acquisitions function, system users most frequently entered the lccn key, followed by the name/title key. for monograph cataloging, the users entered the name/title key most frequently, followed by the lccn key. for serials cataloging, the use of the title key was the most common. persons using public-use terminals entered mostly name/title and title search keys. for acquisitions and monograph cataloging functions, the lccn key was most successful in retrieving the desired records. the next most successful key was the name/title key.
for both of these functions, when the name/title key failed to retrieve the record, users next tried the title key most of the time. for serials cataloging, the title key was used most frequently but was not very successful in retrieving serial records. on the other hand, the issn key was the most successful but it was used very little. individual identifiers such as lccn, issn, isbn, and coden are very efficient search keys because they retrieve, on the average, far fewer replies than other types of search keys. with the exception of lccn, the individual identifiers were used only to a small extent. from this study, it is not possible to answer questions such as: why weren't individual-identifier search keys used more often? did a searcher use a name/title key even when the lccn was available? to answer such questions, data will have to be collected concerning what kind of information is available to the searcher when constructing the search keys.

acknowledgments

the authors wish to thank william h. hochstettler for programming assistance, and peggy zimbeck for editorial assistance with the manuscript.

references

1. f. g. kilgour, p. l. long, and e. b. leiderman, "retrieval of bibliographic entries from a name-title catalog by use of truncated search keys," proceedings of the american society for information science 7:79-82 (1970).
2. f. g. kilgour and others, "title-only entries retrieved by truncated search keys," journal of library automation 4:207-10 (dec. 1971).
3. p. l. long and f. g. kilgour, "a truncated search key title index," journal of library automation 5:17-20 (march 1972).
4. a. l. landgraf and f. g. kilgour, "catalog records retrieved by personal author using derived search keys," journal of library automation 6:103-8 (june 1973).
5. a. l. landgraf, k. b. rastogi, and p. l. long, "corporate author entry records retrieved by use of derived truncated search keys," journal of library automation 6:15161 (sept. 1973).
6. j. d. smith and j. e. rush, "the relationship between author names and author entries in a large on-line union catalog as retrieved using truncated keys," journal of the american society for information science 28, no. 2:115-20 (march 1977).
7. oclc, inc., searching the on-line union catalog (columbus, ohio: oclc, inc., 1979).
8. library of congress information bulletin 37:35 (1 sept. 1978).

kunj b. rastogi is a research scientist at oclc. ichiko morita is assistant professor at the ohio state university libraries.

orthographic error patterns of author names in catalog searches

renata tagliacozzo, manfred kochen, and lawrence rosenberg: mental health research institute, the university of michigan, ann arbor, michigan

an investigation of error patterns in author names based on data from a survey of library catalog searches. position of spelling errors was noted and related to length of name. probability of a name having a spelling error was found to increase with length of name. nearly half of the spelling mistakes were replacement errors; following, in order of decreasing frequency, were omission, addition, and transposition errors.

computer-based catalog searching may fail if a searcher provides an author or title which does not match with the required exactitude the corresponding computer-stored catalog entry (1). in designing computer aids to catalog searching, it is important to build in safety features that decrease sensitivity to minor errors. for example, compression coding techniques may be used to minimize the effects of spelling errors on retrieval (2, 3, 4). preliminary to the design of good protection devices, the application of error-correction coding theory (5, 6, 7) and data on error patterns in actual catalog searches (8, 9) may be helpful.
a recent survey of catalog use at three university libraries yielded some data of the above-mentioned kind (10). the aim of this paper is to present and analyze those results of the survey which bear on questions of error control in searching a computer-stored catalog.

journal of library automation vol. 3/2 june, 1970

in the survey, users were interviewed at random as they approached the catalog. of the 2167 users interviewed, 1489 were searching the catalog for a particular item ("known-item searches"). of these, 67.9% first entered the catalog with an author's or editor's name, 26.2% with a title, and 5.9% with a subject heading. approximately half the searchers had a written citation, while half relied on memory for the relevant information. paradoxically, though most known-item searchers tried to match primarily an author and only secondarily a title, there were in the sample of searches many more cases of exact title citation than of exact author citation.

imperfect recall of author name

of the 1489 "known-item" searches, 1356 could be verified against the actual item. from the total number of searches (1260) in which the catalog user had provided an author's (or editor's) name, those works were subtracted which did not have a personal authorship (208) or had multiple authors or multiple editors (127). this left 925 searches, of which 470 had complete and correct author entries, while 455 contained various degrees of imperfection in the author citation. table 1 gives the distribution of incorrect and/or incomplete author citations. in the study an author's name was defined as incomplete when the first name, or the two initials, or one out of two initials was missing.

table 1. incorrect and/or incomplete author names

university of michigan    categories
libraries                 i       ii      iii     total
general library           144     25      6       175
undergraduate library     94      35      4       133
medical library           110     27      10      147
total                     348     87      20      455

in category i (the most numerous) the author's last name was correct, but the author citation as a whole was either incomplete or incorrect; i.e., there were mistakes and/or omissions in the first and middle name or initials. most of the searches in category i were incomplete rather than incorrect. since in category i there is nothing wrong with the author's last name, the searcher's ability to gain access to the right location in the catalog is presumably not impaired as long as the last name is not too common. once the searcher has entered the catalog, he will make use of other clues, such as title or knowledge of the topic, to identify the right item. but if the name is smith or brown or johnson, and the catalog is a large one, to have an incomplete author's name may be equivalent to having no name at all. (in the university of michigan general library catalog, which contains over four million cards, the entry "smith" extends over eight drawers, and the entries "brown" and "johnson" over four drawers each.) in an automated catalog it is easy to limit the set of entries from which the right item has to be selected by intersecting the last name of the author with some other clues. incompleteness of the author name may then not be a serious handicap.

category iii includes all searches in which the searcher had an author that turned out to be wrong. the error in this case was not in incompleteness or misspelling of the author's name, but in the identity of the author. no further analysis of this group was conducted.

category ii is the one which forms the object of the present report. the analysis concerns mainly position and type of errors, and the incidence of errors as related to name length.
position of errors in author names

the location of errors in the author citation is important for manual systems, such as traditional library card catalogs, as well as for automated systems. table 2 shows the distribution of e in the sample of incorrect author citations from all three libraries, where e is the position of the letter, counting from left to right, in which an error appeared. in the fourteen cases in which more than one error occurred in the same name, only the first error was considered. in a few cases the error involved a string of letters (e.g., friedman for friedberg). in such cases the position of the first letter of the string determined the location of the error.

table 2. position of error in last name of author (incorrect names)

e       no.    %       cumulative %
1       2      2.3     2.3
2       11     12.6    14.9
3       11     12.6    27.6
4       19     21.8    49.4
5       13     14.9    64.4
6       12     13.8    78.2
7       7      8.0     86.2
8       6      6.9     93.1
9       3      3.4     96.6
10      2      2.3     98.9
11      1      1.1     100.0
total   87

table 2 shows that about half the incorrect author names had errors in one of the first four letters, while the other half had errors in one of the following letters, from the fifth to the eleventh position. the most frequently misspelled is the fourth letter, which is responsible for 21.8% of the total number of errors occurring in the sample.

the ordinal number indicating the position of the error is not, by itself, a sufficient indicator of the area where the error occurred. an error in the third letter, for instance, is close to the beginning of the name if the name is 9 letters long, but close to the end if the name is 4 letters long. in table 3 l indicates the length (the number of letters) of the author name and pe the location of the error, i.e., the position of the first letter, counting from left to right, where an error appears. the incorrect author names of the sample (87) have a length of between 3 and 12 letters.
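the position measure and the cumulative column of table 2 can be illustrated with a small sketch. the function names are hypothetical, and the letter-by-letter comparison is an assumed reading of "the position of the first letter where an error appears"; the study's own coding was done by hand.

```python
# Error counts by position, transcribed from table 2 (positions 1 to 11).
TABLE_2_COUNTS = [2, 11, 11, 19, 13, 12, 7, 6, 3, 2, 1]


def first_error_position(correct, typed):
    """1-based position of the first letter at which the two spellings differ.

    Returns None when the names are identical. A letter added at the end of
    the correct name yields a position just past the name's own length.
    """
    for position, (c, t) in enumerate(zip(correct, typed), start=1):
        if c != t:
            return position
    if len(correct) != len(typed):
        return min(len(correct), len(typed)) + 1
    return None


def cumulative_percent(counts):
    """Running share of all errors up to each position, in percent."""
    total = sum(counts)
    running = 0
    shares = []
    for count in counts:
        running += count
        shares.append(round(100 * running / total, 1))
    return shares
```

with the paper's own example, `first_error_position("friedberg", "friedman")` gives 6, the first letter of the replaced string, and `first_error_position("halle", "haller")` gives 6 for a five-letter name, which is the "beyond the name itself" case discussed for table 3.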
the column on the right of the table, el, indicates the distribution of names of a given length. the row at the bottom of the table gives the distribution of errors occurring in a given position. mistakes are shown to occur anywhere from the first letter to the eleventh letter. when the error consists in the addition of a letter to the end of the correct name, pe is beyond the name itself. the figures which appear next to the diagonal line, on the right, indicate mistakes of this sort.

table 3. position of error vs. length of name

length (l)   errors (pe), positions 1 2 3 4 5 6 7 8 9 10 11   frequency (el)
3            1                                                1
4            1 3                                              5
5            1 2 1                                            7
6            1 3 6                                            21
7            4 2 6                                            19
8            2 3 2                                            16
9            2 1 1 1 1 1 1                                    8
10           1 1 2 1 2                                        7
11           1 1                                              2
12           1                                                1
total        2 11 11 19 13 12 7 6 3 2 1                       87

a summary inspection of the table produces the impression that errors are clustered toward the end of the names, or at least that they are more prevalent in the second half of the name than in the first half. this seems to be a direct consequence of the fact that the first column of the table (errors in position 1) is almost empty. it is tempting to say that errors very rarely occur in the first letter of a proper name. but is this really so? it is true that english-speaking people place particular emphasis on initials, to the extent that initials are often sufficient for identifying well-known figures. the special attention given to the first letter of a name would certainly contribute to the scarcity of errors in such a letter. but it is also possible that when errors in the first letter occur, they so transform the name that it becomes unrecognizable. several such authors may have ended up in the category of non-verified authors necessarily excluded from the analysis.

it would be interesting to verify whether the "serial-position effect" that some authors found in the spelling of common nouns is present also in the spelling of proper names.
according to jensen and to kooi et al., the distribution of spelling errors in relation to letter position closely approximates the serial-position curve for errors found in serial rote learning (11, 12). to ascertain if this is the case for author names, a data base much larger than that used for this study would be needed.

distribution of errors and length of names

is the probability of a catalog searcher misspelling the name of an author dependent to any extent on the length of the name? table 3 shows the frequency of occurrence of names of a given length in the 87 misspelled names (column el). the next step was to calculate the distribution of the length of author names in the whole group of verified author citations provided by the catalog searchers. this group, it should be remembered, does not include multiple authors, multiple editors or nonpersonal authors. the ratio of the corresponding figures in the two distributions will give the percentage of names of a given length having spelling mistakes (table 4).

table 4. probability of errors in recall of author names of a given length

length     frequency of       frequency of    percentage of      percentage by
of name    incorrect names    all names       incorrect names    length group
2          -                  1               -
3          1                  9               11.1%
4          5                  87              5.7%               4.9% (short names)
5          7                  169             4.1%
6          21                 215             9.8%
7          19                 191             9.9%               10.5% (medium names)
8          16                 127             12.6%
9          8                  59              13.6%
10         7                  36              19.4%              14.3% (long names)
11         2                  26              7.7%
12         1                  5               20.0%
total      87                 925

there is an observable trend toward an increase of mistakes with length of name. of course, the two extremes of length distribution are scarcely represented, and this is probably responsible for inconsistencies in the percentage distribution. grouping names into three length categories (i.e., short names, middle-length names, and long names) makes the differences in percentages of incorrect names more apparent. the differences are significant at the .01 level of confidence.
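the grouped percentages in table 4 and the significance statement can be checked with a short sketch. the paper does not name the test it used; a chi-square test of independence on the three length groups is assumed here as one plausible check, and the single two-letter name is left out because the table assigns it to no group. the closed-form p-value `exp(-x/2)` holds only for two degrees of freedom.

```python
import math

# name length -> (incorrect names, all names), transcribed from table 4.
TABLE_4 = {3: (1, 9), 4: (5, 87), 5: (7, 169), 6: (21, 215), 7: (19, 191),
           8: (16, 127), 9: (8, 59), 10: (7, 36), 11: (2, 26), 12: (1, 5)}

# Inclusive length ranges for the three groups shown in the table.
GROUPS = {"short": (3, 5), "medium": (6, 8), "long": (9, 12)}


def grouped(table, groups):
    """Sum incorrect and total counts over each inclusive length range."""
    result = {}
    for name, (low, high) in groups.items():
        lengths = range(low, high + 1)
        result[name] = (sum(table[n][0] for n in lengths),
                        sum(table[n][1] for n in lengths))
    return result


def chi_square(groups):
    """Pearson chi-square statistic for a groups-by-(incorrect, correct) table."""
    incorrect_total = sum(bad for bad, _ in groups.values())
    names_total = sum(n for _, n in groups.values())
    incorrect_share = incorrect_total / names_total
    statistic = 0.0
    for bad, n in groups.values():
        for observed, share in ((bad, incorrect_share), (n - bad, 1 - incorrect_share)):
            expected = n * share
            statistic += (observed - expected) ** 2 / expected
    return statistic


by_group = grouped(TABLE_4, GROUPS)
statistic = chi_square(by_group)
p_value = math.exp(-statistic / 2)  # exact survival function for 2 degrees of freedom
```

under these assumptions the group rates come out as 4.9%, 10.5%, and 14.3%, matching the table, and the p-value falls below .01, which agrees with the significance level reported in the text.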
type of error in author names

errors which occurred in the spelling of the last names of authors were grouped into four broad categories: replacement errors, omission errors, addition errors, and transposition errors. while it is true, especially in badly mangled words, that an error can often be said to be of any of several types, it was generally easy to identify the simplest necessary transformation of the letters, and to assign the incorrect name to the type of error corresponding to that kind of transformation. in some cases this meant adding a string of letters or replacing one string by another. altogether the sample of 87 incorrect authors contained 104 errors. eleven names exhibited two errors each, three had three errors, and the remaining just one error.

of the 104 errors, 50 were replacement errors; these are cases in which one letter or string of letters of the correct name has been replaced by a different letter or string of letters (e.g., hoiser for hoijer, friedman for friedberg). the most common replacement errors appear in table 5, in order of decreasing frequency.

table 5. single-letter replacement errors

no. of errors   correct letter   incorrect letter
6               o                a, a, a, a, p, r
5                                a, e, y, y, y
4               y                a, i, u, z
3               a                i, o, o
3               s                c, r, z
3               v                b, f, w
2               e                i, o
2               g                c, r
28

not included in the table are the 10 letters which were each replaced just once and the 12 strings of letters. in four cases, the replaced letter was the second of a double letter.

there were 34 omission errors in all. four of these involved a string of letters; all the rest were single-letter omissions. eleven single-letter omissions occurred in the last letter of the name (e.g., abbot instead of abbott), and 19 in the middle of the name (e.g., brent instead of brendt). table 6 gives the frequency distribution of the omitted letters. the asterisk indicates that the omitted letter was the second of a double letter. table 6.
single-letter omission errors no. of error in middle error in final letter errors position position omitted 8 5 3 e 4 4 a 4 40 t 3 1 20 n 2 2 h 2 2 i 2 20 1 2 1 1 s 1 1 c 1 1 d 1 1 r 30 addition errors totaled 18. in one case the addition consisted of a string of letters, while in the others only one letter was added. addition errors can occur in the middle of a name (e.g. berelison for berelson) or at the end of it (e.g. haller for halle). in the latter case, the added letter is found beyond the last letter of the correct name (these were the errors on the right of the diagonal in table 3). the distribution of addition errors is shown in table 7. the asterisk indicates that the added letter duplicated the previous letter. table 7. single-letter addition errors no. of error in middle errors position 5 2 2 2 1 1 1 1 l 1 17 error in final position 4 1 1 1 .l added letter s c e i a f 1 m n z 100 journal of library automation vol. 3/2 june, 1970 there were two transposition errors: ie for ei and ai for ia. in cases of second and third errors in the name, there were five replacement errors, seven omission errors, and five addition errors. table 8 summarizes the type of errors encountered in the sample of incorrect authors. figures in this table include strings as well as single letters, and second and third errors, as well as first errors. table 8. distribution of types of errors middle position replacement errors omission errors addition errors transposition errors conclusion four trends could be observed: 44 21 10 2 final total position 6 50 13 34 8 18 2 104 1) vowels usually replaced vowels, and consonants usually replaced consonants. apparently the probability of misspelling a single letter was slightly higher for vowels than for consonants. with the latter, there is some indication that the substitution was guided by phonetic similarity ( " » • 1 d b "b" "f" " ") e.g., v is rep ace y , or , or w . 
2) most omissions in which the correct name had a double letter occurred at the end of the word.
3) replacement errors tended to come earlier in words than did omissions and additions. (this is not due to the fact that addition and omission errors contained a disproportionately high number of final errors; even when these final errors are excluded, replacement errors still come earlier than other types.)
4) second and third errors in a name have comparatively few replacement errors.

acknowledgment

this work was supported in part by the national science foundation, grant gn 716.

references

1. kilgour, f. g.: "retrieval of single entries from a computerized library catalog file," proceedings of the american society for information science, 5 (1968), 133-136.
2. nugent, william r.: "compression word coding techniques for information retrieval," journal of library automation, 1 (december 1968), 250-260.
3. ruecking, frederick h., jr.: "bibliographic retrieval from bibliographic input; the hypothesis and construction of a test," journal of library automation, 1 (december 1968), 227-238.
4. dolby, james l.: "an algorithm for noisy matches in catalog searching." in: a study of the organization and search of bibliographic holdings records in on-line computer systems: phase i. (berkeley, cal.: institute of library research, university of california, march 1969), 119-136.
5. peterson, william w.: error correcting codes (new york: wiley, 1961).
6. alberga, cyril n.: "string similarity and misspellings," communications of the acm, 10 (1967), 302-313.
7. galli, enrico j.; yamada, hisao m.: "experimental studies in computer-assisted correction of unorthographic text," ieee transactions on engineering writing and speech, ews-11 (august 1968), 75-84.
8. tagliacozzo, r., et al.: "patterns of searching in library catalogs." in: integrative mechanisms in literature growth. vol iv.
(university of michigan, mental health research institute, january 1970). report to the national science foundation, gn 716.
9. university of chicago graduate library school: requirements study for future catalogs (chicago: university of chicago graduate library school, 1968).
10. tagliacozzo, renata; rosenberg, lawrence; kochen, manfred: access and recognition: from users' data to catalog entries (ann arbor, mich.: the university of michigan, mental health research institute, october 1969, communication no. 257).
11. jensen, arthur r.: "spelling errors and the serial-position effect," journal of educational psychology, 53 (june 1962), 105-109.
12. kooi, beverly y.; schutz, richard e.; baker, robert l.: "spelling errors and the serial-position effect," journal of educational psychology, 56 (1965), 334-336.

foreword

the editorial board of the journal of library automation is pleased to pay tribute to frederick g. kilgour who, with the able assistance of his assistant editor, eleanor m. kilgour, so firmly established this periodical and set its standards so high. especially in view of the fact that in these first years of journal publication, mr. kilgour was also designing and implementing the complex system which is the ohio college library center, his achievement as first editor was remarkable. to him the information science and automation division of the american library association owes a great debt.

as library automation moves further into the seventies, the context of its existence changes. ever-increasing fiscal pressures have required economic justification for every alteration of traditional practice. the mere availability of equipment, of programs and tested system design, even of skilled and experienced manpower can no longer be considered enough. novelty, the magic word "innovation," seldom now casts a spell on those who control institutional budgets.
increasingly, in the issues of this journal, we hope that emphasis will be placed on reviews of experience, retrospective evaluations of operation rather than optimistic projections made in the first bright mornings of system design. we must have reports if not of failures at least of alterations and accommodations enforced on operational systems by experience and the heavy hand of time. it is our further hope that the journal will receive more reports from public and school libraries which indicate an increasing dedication, in automation applications, to the social and educational goals of those institutions. -ajg

principles of format design

henriette d. avram and lucia j. rather: marc development office, library of congress

this paper is a summary of several working papers prepared for the international federation of library associations (ifla) working group on content designators. the first working paper, january 1973, discussed the obstacles confronting the working group, stated the scope of responsibility for the working group, and gave definitions of the terms tag, indicator, and data element identifier, as well as a statement of the function of each.¹ the first paper was submitted to the working group for comments and was subsequently modified (revised april 1973) to reflect those comments that were applicable to the scope of the working group and to the definition and function of content designators. the present paper makes the basic assumption that there will be a supermarc and discusses principles of format design.

this series of papers is being published in the interest of alerting the library community to international activities. all individual working papers are submitted to the marbi interdivisional committee of ala by the chairman of the ifla working group for comments by that committee.
journal of library automation vol. 7/3 september 1974

introduction

in order to have this paper stand alone, the scope and the definition and functions of the content designators as agreed to by the working group are summarized below:

1. the scope of responsibility for the ifla working group is to arrive at a standard list of content designators for different forms of material for the international interchange of bibliographic data.
2. the definition and function of each content designator are given as:
   a. a tag is a string of characters used to identify or name the main content of an associated data field. the designation of main content does not require that a data field contain all possible data elements all the time.
   b. an indicator is a character associated with a tag to supply additional information about the data field or parameters for the processing of the data field. there may be more than one indicator per data field.
   c. a data element identifier is a code consisting of one or more characters used to identify individual data elements within a data field. the data element identifier precedes the data element which it identifies.
   d. a fixed field is one in which every occurrence of the field has a length of the same fixed value regardless of changes in the contents of the fixed field from occurrence to occurrence. the content of the fixed field can actually be data content, or a code representing data content, or a code representing information about the record.

basic assumption-supermarc

there appears to be little doubt that the format used for international exchange will not be the format presently in use in any national system. the first working paper addressed the obstacles that preclude complete agreement on any single national format, and a study of the matrix of the content designators assigned by various national agencies substantiates the above conclusion.
consequently, we are concerned with the development of a supermarc whereby national agencies would translate their local format into that of the supermarc format and conversely, each agency would accept the supermarc format and translate it into a format for local processing.²,³

supermarc, therefore, is an international exchange format with the principal function that of transferring data across national boundaries. it is not a processing format (although if desired, it could be used as such) and in no way dictates the record organization, character bit configuration, coding schemes, etc., to be used within processing agencies. the supermarc format, however, should conform to certain conventions, namely the format structure should be iso 2709 and the character representation should be an eight-bit extension of iso 646.* the latter convention means that data cannot be in any other configuration than a character-by-character representation.

supermarc assumes not only agreement on the value of content designators but, equally as important, on the level of application of these content designators. whatever the agreed upon level of content designation is, those agencies with formats more detailed will be able to translate to supermarc but will be in the position of having to upgrade all records entered into their local system from other agencies. likewise, local formats consisting of less detailed content designation than supermarc must upgrade to the supermarc level for communication purposes.

where the actual content of the record is concerned, i.e., the fields and/or data elements to be included, it is highly probable that the decision of the content designator working group will be that data, if included in the record, are assigned supermarc content designators, but that not all data will always be present. this permits the flexibility required to bypass some of the substantive problems of different cataloging rules and cataloging systems. for example, one agency may supply printer and place of printing while another may not. it may be assumed, however, that all agencies will conform to the specifications prescribed by the isbd and other such standard descriptions as they become available.

* iso/tc 46/sc4 wg1 is presently engaged in the definition of extended characters for roman, cyrillic, and greek alphabets and mathematics and control symbols.

principles of format design

prior to any deliberation regarding the actual value of content designators, the working group realized it must agree on a set of basic principles for the design of the international format. the first working paper set forth, in the form of questions, some of the issues that must be taken into account in arriving at the principles. several members of the working group expressed their opinions and these were considered in the formulation of the principles. the principles were discussed at the grenoble meeting in august 1973. five of the principles were adopted and the sixth was deferred for further analysis based on working papers to be written by some of the members. the sixth principle was adopted at the brussels meeting in february 1974. the six basic principles are stated below with a discussion following each principle:

1. the international format should be designed to handle all media.

it would be ideal if at this time all forms of material had been fully analyzed. this is currently not the case. agreement on data fields and the assignment of content designators can realistically only be accomplished if there is a foundation upon which to build.
therefore, the forms of material have been limited to those listed below because, to the best of our knowledge, these are the only forms where either experience has been gained in the actual conversion to machine-readable form or in-depth analysis has been performed to define the elements of information for the material.

books: all monographic printed language materials.
serials: all printed language materials in serial form.
maps: printed maps, single maps, serial maps, and map collections.
films: all media intended for projection in monographic or serial form.
music and sound recordings: music scores and music and nonmusic sound recordings.

at the meeting in brussels, the decision was made to use the isbd as the foundation for the definition of functional areas for the formats. since at the present time an isbd exists only for monographs and serials, these materials will receive first priority by the ifla working group. still under consideration is the question whether manuscripts should be included in the forms of material within the scope of the working group. pictorial representations and computer mediums have not as yet been analyzed. when these forms have been analyzed, they should be added to the generalized list.

2. the international format should accept single-level and multilevel structures.

there is a requirement to express the relationship of one bibliographic entity to another. this relationship may take many forms. a hierarchical relation is expressed for works which are part of a larger bibliographic entity (such as the chapter of a book, a single volume of a multivolume set, a book within a series). a linear relation is expressed for works which are related to other works such as a book in translation. this discussion is concerned with hierarchical relationships and the need to describe this relationship in machine-readable records.
there are a number of ways in which hierarchical relationships may be expressed. one method is to place the information on the related work in a single field within the record. for example, the different volumes of a multivolume set may be carried in a contents field. when a book is in a series, the series may be carried in a series field. this may be termed using a single-level record to show a hierarchical relationship. another method is to use a multilevel record made up of subrecords.† the concept of a subrecord directory and a subrecord relationship field was discussed in appendix ii to the ansi standard z39.2-1971.⁴ the appendix illustrated a possible method of handling subrecords and expressing relationships within a bibliographic record but was not part of the american standard. similarly, in 1968 the library of congress published as part of its marc ii format a proposal to provide for the bibliographic descriptions of more than one item in a single record, and represented this capability as "levels" of bibliographic description.⁵ the international standard (iso 2709) defines a subrecord technique without an explicit statement of a method to describe relationships.⁶ more recently, a level structure was proposed in a document by john e. linford,⁷ and an informal paper by richard coward⁸ gave the following example of a level structure:

level              record

collection                        1 subrecord
sub-collection                    1 subrecord
document                          1 subrecord
                     +-----------------+-----------------+
analytical      1 subrecord       1 subrecord       1 subrecord

† a subrecord is a "group of fields within a bibliographic record which may be treated as a logical entity." when a bibliographic record describes more than one bibliographic unit, the descriptions of the individual bibliographic units may be treated as subrecords.
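the idea of a multilevel record made up of subrecords can be pictured with a small python sketch. it is only an illustration of the hierarchy, not the iso 2709 subrecord technique nor the mechanism of the ansi z39.2 appendix: the class `Subrecord`, the helper `walk`, and the sample data are all hypothetical, and the shape of the example (one subrecord at each of the collection, sub-collection, and document levels, with three analytical subrecords beneath the document) is an assumption about the level structure cited above.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Tuple


@dataclass
class Subrecord:
    """A group of fields treated as one logical entity (hypothetical model)."""
    level: str                      # e.g. "collection", "document"
    data: Dict[str, str]            # field name -> content; names are made up
    children: List["Subrecord"] = field(default_factory=list)


def walk(node: Subrecord, depth: int = 0) -> Iterator[Tuple[int, Subrecord]]:
    """Yield every subrecord with its depth, parents before children."""
    yield depth, node
    for child in node.children:
        yield from walk(child, depth + 1)


# Assumed shape: three analytical subrecords hang below a single document.
analytics = [Subrecord("analytical", {"title": f"part {i}"}) for i in (1, 2, 3)]
document = Subrecord("document", {"title": "a document"}, analytics)
sub_collection = Subrecord("sub-collection", {"title": "a sub-collection"}, [document])
record = Subrecord("collection", {"title": "a collection"}, [sub_collection])

# Flatten the hierarchy into (depth, level) pairs for inspection.
levels = [(depth, node.level) for depth, node in walk(record)]
```

a single-level record would instead carry the related items as ordinary fields (a contents field or a series field) of one `Subrecord` with no children; either form describes the same relationship.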
several national agencies have expressed concern regarding the efficiency of the iso 2709 subrecord technique and have suggested that a modification be made to the subrecord statement. there are alternative techniques which could be incorporated in the international exchange format to build in level capability. methods have been suggested that would cause a revision (specifically the number of characters in each directory entry) to the iso standard; other alternatives might not. regardless of the final technique agreed upon, national agencies should maintain the authority to record their cataloging data to reflect their catalog practices, i.e., either describing the items related to an item cataloged as fields within a single-level record or as subrecords of a multilevel record.

3. tags should identify a field by type of entry as well as function by assigning specific values to the character positions.

assigning values to the characters of the tags allows the flexibility to derive more than a single kind of information from the tag. for example, it should be possible by an inspection of the tags to retrieve all personal names from a machine-readable record regardless of the function of the name in the record, i.e., principal author, secondary author, name used as subject, etc.

4. indicators should be tag dependent and used as consistently as possible across all fields.

indicators should be tag dependent because they provide both descriptive and processing information about a data field. if the value assigned to an indicator is used as consistently as possible across all fields, where the situation warrants this equality, the machine coding is simplified to process different functional fields containing the same type of entry.

5. data element identifiers should be tag dependent, but, as far as possible, common data elements should be identified by the same data element identifiers across fields.
the principle has been adopted that the format will handle all types of media and consequently the projected number of unique tags may be quite large. in addition, since all types of media are not yet fully analyzed, the number of unique fields is an unknown factor. while it is undeniable that making data element identifiers tag independent would be desirable, the limited number of alphabetic, numeric, and symbolic characters would restrict the number of data elements to the number of unique characters. this constraint on future expansion seems to be more important than any advantages gained from making data element identifiers tag independent.

if data element identifiers are tag dependent, then additional refinements could be added in one of two ways: (1) the principle of identifying common data elements by the same identifiers across fields could be followed as far as possible, or (2) the identifiers could be given a value to aid in filing. the two refinements appear to be mutually exclusive since a data element in one field may have a different filing value from the same data element in another field. since the first refinement should be useful for many types of processing, and the second would be useful only in filing, the former seems to be the better option.

6. the fields in a bibliographic record are primarily related to broad categories of information relating to "subject," "description," "intellectual responsibility," etc., and should be grouped according to these fundamental categories.

the first working paper discussed as an obstacle the lack of agreement on the organization of data content in machine-readable records in different bibliographic communities. a subsequent paper consisting of comments made by staff of the library of congress on the proposed eudised format discussed in greater detail the analytic versus traditional arrangement.⁹,† the majority of the national formats designed to date are arranged by using the function as the primary grouping and the type of entry as the secondary grouping. several working papers produced by committee members supported the arrangement by function on the grounds that it followed the traditional order of elements in the bibliographic record and therefore simplified input procedures. grouping of the fields first by function and then by type of entry was agreed to at the brussels meeting.

references

1. henriette d. avram and kay d. guiles, "content designators for machine readable records," journal of library automation 5:207-16 (dec. 1972).
2. r. e. coward, "marc: national and international cooperation," in international seminar on the marc format and the exchange of bibliographic data in machine-readable form, berlin, 1971, the exchange of bibliographic data and the marc format (munich: pullach, 1972), p.17-23.
3. roderick m. duchesne, "marc: national and international cooperation," in international seminar on the marc format and the exchange of bibliographic data in machine-readable form, berlin, 1971, the exchange of bibliographic data and the marc format (munich: pullach, 1972), p.37-56.
4. american national standards institute, american national standard for bibliographic information interchange on magnetic tape (washington, d.c.: 1971) (ansi z39.2-1971). appendix, p.15-34.
5. henriette d. avram, john f. knapp, and lucia j. rather, the marc ii format; a communications format for bibliographic data (washington, d.c.: library of congress, 1968), appendix iv, p.147-49.
6. international organization for standardization, documentation-format for bibliographic information interchange on magnetic tape. 1st ed. international standard iso 2709-1973(e). 4p.
7. council for cultural cooperation. ad hoc committee for educational documentation and information. working party on eudised formats and standards, 3d meeting, luxembourg, 26-27 april 1973, draft eudised format (second revision). prepared by john e. linford.
8. paper sent from richard coward to henriette d. avram, "notes on marc subrecord directory mechanism."
9. henriette d. avram, "comments on draft eudised format (second revision)," unpublished paper.

† in an analytic tagging scheme, the first character of the tag describes the type of entry and subsequent characters describe function; in a traditional tagging scheme, the first character describes function and subsequent characters describe type of entry.

an interactive computer-based circulation system for northwestern university: the library puts it to work

velma veneziano: systems analyst, northwestern university library, evanston, illinois

northwestern university library's on-line circulation system has resulted in dramatic changes in practices and procedures in the circulation services section. after a hectic period of implementation, the staff soon began to adjust to the system. over the past year and a half, they have devised ways to use the system to maximum advantage, so that manual and machine systems now mesh in close harmony. freed from time-consuming clerical chores, the staff have been challenged to use their released time to best advantage, with the result that the "service" in "circulation services" is much closer to being a reality.

the transition from a manual to an automated system is never easy. northwestern university library's experience with an automated circulation system was no exception.
the first three months of operation were especially harrowing; there were times when only the realization that the bridges back to the old system were burned kept the staff plugging away with a system which often seemed in imminent danger of collapse. that they survived this period is a tribute to their persistence and optimism as well as to the merit of the system.

the impressive array of obstacles was offset by a number of positive factors. even though there were mechanical problems with terminals, the on-line computer programs worked flawlessly from the first. the climate for change was favorable. the automation project had the complete support of library administration; the head of circulation services, although new to the department and untrained in automation, was completely committed to the system and was able to transmit his enthusiasm to his staff.

journal of library automation vol. 5/2 june, 1972

within three months, the systems analyst, who had been available for advice and trouble-shooting, began to fade from the scene. only an occasional minor refinement is now necessary. maintenance problems, both in programs and procedures, are minimal. basically the system has proved itself workable.

in a previous paper by dr. james s. aagaard (jola, mar. 1972), the development of the system is traced and the system is described in terms of its logical design, program, and hardware components. the present paper will describe how the system operates in the library environment.

the system accomplishes the traditional library tasks connected with circulation, but the methods used have changed radically. the development of effective procedures must in large part be credited to the circulation staff. these procedures have in a real sense spelled the difference between an adequate system and a good one. it is these procedures on which we will concentrate.

the author wishes to thank the head of circulation services, rolf erickson, and his assistants, mrs.
eleanor pederson and mrs. lillian hagerty, for supplying the information to bring her up-to-date on procedures as they have evolved over the past three years.

book identification

almost 100 percent of the 900,000 books in the main library's circulation collection contain punched cards. accurately punched book cards, available in all books, can make the difference between success and failure of a circulation system.

the book cards contain only the call number and location code. there is no doubt that, if conversion funds had been less limited, we might have elected to capture author/title data. however an analysis of the amount of data which could be carried on an 80-column card, added to the fact that this would quadruple the cost, led to the decision to omit author/title. as a result, key punch costs were exceptionally low: 1.1 cents per card. in spite of our fears, the complaints by users because overdue and other notices do not contain author/title have been surprisingly few.

cards for new books are, with a few exceptions, produced automatically as output from the technical services module. all book cards are also on magnetic tape and constitute a physical inventory of the entire circulating collection, which is updated at intervals and listed.

user identification

the system requires a unique numeric identification number for each borrower. for faculty and evanston campus students, this is their social security number; for special users it is a five-digit number assigned by the library from a list of sequential numbers. the number is supplemented by a one-digit code which identifies the type of user.

the university's division of student finance has responsibility for issuance of punched plastic badges for students. each spring at preregistration time, data are gathered and pictures taken for students planning to return to the university in the fall. badges are ready for distribution as soon as school opens.
for incoming freshmen, transfers, and returnees, data are gathered and badges punched at registration time in the fall. a temporary paper badge is used during the several weeks required for badge preparation. an outside contractor prepares and punches the badges. there were initial problems with the accuracy of punching but these have been resolved.

the library now has a small ibm 015 badge punch, which it uses for punching special user badges and badges for carrel holders.

student badges are valid for one year. the user code is changed each year to prevent use of an expired badge. faculty and staff badges are issued by the personnel department of the university, and are good for three years. these are also produced by an outside firm.

book security

exit guards examine all books taken from the library to ensure that they are properly charged. the call number on the book and on the date-due slip are compared; the user number on the date-due slip and the user's badge are compared. this need not be a character-by-character comparison. a few selected characters will suffice. student badges contain their pictures, which should bear at least a resemblance to the holder of the badge.

initially, students were not required to show their badges. after a rash of book thefts resulting from the use of lost or stolen badges, this policy was changed. the book-check routine sometimes slows exit from the building during peak periods; however it is considered a necessary security measure.

the problem of lost badges is a serious one. users tend to leave badges in the terminals. usually such badges are turned in at the main circulation desk by the next user; the owner is notified to come in and pick it up. if a student loses his badge, he must report it to the circulation desk as soon as possible. he is issued a special user badge, and the computer center is notified to "block" his regular user number.
if someone then tries to use the badge, an "unprocessed" message will appear in lieu of a valid date-due slip. the problem is timing. "blocking" is done only once a day. a determined thief can charge out a considerable number of books before the number can be blocked. for this reason, a check of the photograph on the badge is important.

the maximum number of user numbers which can be blocked is fifty. fortunately, except for faculty/staff badges which are good for three years, student badges automatically become invalid at the end of each school year, and special user badges expire at the end of each quarter.

behind the decision to go on-line was the belief that a university library, to effectively serve its patrons, needs to be able to determine the status of a book without delay. all books which are not in their places on the shelves as indicated by the card catalog are, in theory, reflected in the computer circulation file. out of a circulating collection of 900,000 items, the number of records in the file at any one time will range from 30,000 to 60,000. this includes books temporarily located in the reserve room, books being bound, and books which have been sent to the catalog department. it also includes books which are lost or missing but which have not yet been withdrawn from the catalog.

a single 2740 typewriter terminal, located at the main circulation desk, is used for inquiry into the circulation file. a library user, having obtained the call number of a book from the catalog, looks for it in the stacks. if he is unable to find it, he inquires at the terminal. the operator enters a command "search," followed by the abbreviated call number of the book (the key). if one or more records with this key are in the file, the file address, plus the balance of the call number (the key extension), are typed back from the computer for each such record.
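the search by abbreviated call number described above can be sketched in python as follows. this is a minimal illustration and not the northwestern programs: the record layout, the nine-character key length, the way call numbers are normalized, the sample call numbers, and every name in the block are assumptions, since the text does not say how the key is derived from the call number.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

KEY_LENGTH = 9  # hypothetical; the source does not give the key length


@dataclass
class ChargeRecord:
    address: int          # position of the record in the circulation file
    call_number: str
    due_date: str
    charge_type: str
    user_number: str
    saver_number: str = ""


def split_call_number(call_number: str) -> Tuple[str, str]:
    """Return (key, key extension) for a call number (assumed normalization)."""
    compact = call_number.replace(" ", "").upper()
    return compact[:KEY_LENGTH], compact[KEY_LENGTH:]


def build_index(records: List[ChargeRecord]) -> Dict[str, List[ChargeRecord]]:
    """Group the records in the file by their abbreviated key."""
    index: Dict[str, List[ChargeRecord]] = {}
    for record in records:
        key, _ = split_call_number(record.call_number)
        index.setdefault(key, []).append(record)
    return index


def search(index: Dict[str, List[ChargeRecord]], abbreviated: str) -> List[Tuple[int, str]]:
    """List (file address, key extension) for every record sharing the key."""
    key, _ = split_call_number(abbreviated)
    return [(r.address, split_call_number(r.call_number)[1]) for r in index.get(key, [])]


# Made-up sample file: two volumes of one title and one unrelated book.
FILE = [
    ChargeRecord(0, "973.7 L736 v.1", "1972-06-29", "regular", "123456789"),
    ChargeRecord(1, "973.7 L736 v.2", "1972-07-03", "regular", "987654321"),
    ChargeRecord(2, "510 A123", "1972-06-30", "reserve", "00042"),
]
INDEX = build_index(FILE)
```

because every volume and copy of a title shares one key, a single search lists them together, which is the "browsing" effect the operators rely on; displaying a record is then a lookup by the address returned.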
if one of the listed records is the desired one, the operator then asks for a display of the record. the display includes the due date, type of charge, user number, and, if there is one, the saver number. the ability to use an abbreviated call number to access the file has proved invaluable. the operator can in effect "browse" among all the various editions, copies, and volumes of a particular book which are in circulation. the technique also facilitates finding a record, such as a volume in a serial, where the format is often quite variable, and not always obvious from the call slip supplied by the user. if a large number of books all with the same key are in the file, there is sometimes a considerable wait while the typewriter types out the addresses and key extensions for all the records. once such a listing begins, there is no way at present to cut it off in mid-point. this is a minor inconvenience; it could be remedied quite easily if computer core were not such a precious commodity. the single 2740 terminal is heavily used and plans are under way to substitute a cathode ray tube in the near future.

interactive circulation system/veneziano 105

book locate procedures

if a search on the 2740 terminal reveals that a book is not in circulation, the individual may ask that it be "located." a form is filled out and the book is searched nightly in the stacks. (it is also again searched in the 2740 since it may have been charged out to another user after the inquiry.) if it is found, it is brought down and placed on the "save" shelf, and the inquirer is notified that it is available. if it is not found, the form is held for two weeks and searched again, both in the 2740 and the stacks. if it is not found on the second search, it is entered into the file as a "missing book."
the circulation section has found that entering missing books into the file as soon as possible saves them time, because a search for a single book is often duplicated needlessly for a number of different individuals.

save procedures

when a user is informed that a book is in the circulation file, he may ask that it be called in for him, provided it is not on loan to the reserve room and provided it is not already "saved" for someone else. the 2740 operator calls in the record and adds the saver's identification number to the record. each weekday morning, "book needed notices" are sent over from the computer center for books "saved" since the last notice run. the notices are stuffed in window envelopes and mailed. even though the number of saves is small, in relation to the total number of books charged out, this feature has contributed to the library's and the user's satisfaction with the system.

initially there was some consideration given to providing for multiple saves on the same book. a study of the frequency of multiple saves indicated that the increased system complexity did not warrant it. moreover, a student usually cannot wait too long for his turn at a book. a better solution in a university library is either to buy more copies or place high demand books in the reserve room, or both.

the standard loan period is four weeks. a save on a book causes the due date to be recalculated either to two weeks, or to five days from the date of the save, whichever is later. this variable loan period increases the number of users who can use a book in high demand, without inconveniencing the user of a book which no one else needs. to succeed, such a call-in policy must be backed with enough force to ensure that a called-in book is returned promptly. if a book is returned after the revised due date, the user incurs a penalty fine of $1.00 per day in addition to the regular 10 cents per day fine. expired call-ins result in a weekly computer-generated reminder.
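The due-date rule for saved books can be written out as a short calculation. The Python sketch below is one reading of that rule, not the system's actual routine. It assumes that "two weeks" is counted from the date of the original charge, which the article does not state outright, and the function and constant names are invented. The article also does not say whether the recalculated date may fall after the original four-week due date, so the sketch does not cap it.

```python
from datetime import date, timedelta

STANDARD_LOAN = timedelta(weeks=4)    # standard loan period, from the text
MINIMUM_LOAN = timedelta(weeks=2)     # assumption: measured from the charge date
GRACE_AFTER_SAVE = timedelta(days=5)  # measured from the date of the save


def standard_due_date(charged_on: date) -> date:
    """Due date for an ordinary charge with no save on the book."""
    return charged_on + STANDARD_LOAN


def revised_due_date(charged_on: date, saved_on: date) -> date:
    """Due date after a save: the later of charge + 2 weeks and save + 5 days."""
    return max(charged_on + MINIMUM_LOAN, saved_on + GRACE_AFTER_SAVE)


if __name__ == "__main__":
    charged = date(1972, 1, 3)
    print(standard_due_date(charged))
    # A save placed early leaves the borrower the full two weeks.
    print(revised_due_date(charged, saved_on=date(1972, 1, 5)))
    # A save placed later gives five days' notice from the save date.
    print(revised_due_date(charged, saved_on=date(1972, 1, 20)))
```

Taking the later of the two dates means a borrower is never asked to return a book with less than five days' notice, and never before two weeks have passed since the charge.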
when a book which is saved is discharged, the terminal printer issues a message to this effect, and the book can be placed on the "save" shelf instead of being sent to the stacks. each night "book available notices" are produced for all such books discharged since the last notice run. the first copy of the notice is mailed to the saver; the second part is inserted in the book. the saver is given five weekdays to pick up the book.

book charges

self-service charges

during the regular school year, from 1000 to 1200 books per day are charged out through the system. most of these charges are processed by the users, on the self-service terminals.

a basic objective in the design of the system was to make it easy for the user to charge out books. initially it was planned to have manned chargeout terminals. however, as the design of the system progressed, it became evident that the vast bulk of charge-out transactions would consist of three simple steps: (1) insert the user badge, (2) insert the book card, and (3) tear off the date-due slip. the idea arose: if the procedure was so simple, why not let the user himself do it, thus saving the cost of terminal operators? there was some concern over user reaction, but it was decided it was worth the risk.

a simple set of illustrated instructions is attached to the terminal. since the terminal will not accept badges or book cards unless they are inserted in the proper direction, the user soon gets the idea. the terminal will also refuse a seriously off-punched badge or book card. if everything is done properly, the printer produces a date-due slip containing the user number, the book call number, and the date due. this is detached and placed in the book pocket. if, instead of a valid date-due slip, the user receives a slip from the printer containing the word "unprocessed," he is instructed to take all materials to the main circulation desk.
this condition will occur if the individual tries to take out a book which is already charged out (perhaps to the reserve room or a carrel). it also happens if the badge or book card has fewer than the required complement of characters or if the user code on the badge has expired. it also happens if the user's number has been "blocked."

although readers had no difficulty mastering the technique of using the 1031 badge/card reader, the 1033 printer was another story. despite the printed and illustrated injunction to "tear the slip forward," the users insisted on pulling the paper upward. the result: the continuous roll of paper would start to skew and the paper would eventually jam. to alleviate the skewing problem, we had pin-feed platens installed in the typewriters. these prevented the skew, but the upward pull on the paper caused the pin-feed holes to tear and get out of alignment with the pins. the result: a paper jam. the ibm field engineers valiantly tried to overcome the condition but to no avail. ibm was unwilling to make any major modification of the paper feed mechanism, and no amount of argument that such an improvement would increase their sales to other libraries had any effect.

in desperation, the library finally took its problem to the physics shop in northwestern university's technological institute. the technicians there designed and built a hooded feed to channel the paper upward and forward at the desired angle. a hand-actuated knife blade was installed to cut and dispense the ticket-type slip. in spite of these heroic efforts, paper jams still occur with enough frequency to be annoying. since the terminals are isolated in the stacks, a jam often goes undetected until a user comes down to the main circulation desk with a complaint. for this reason, we have plans to install a "ticket printer," which will automatically cut and eject a ticket with no user intervention.
unlike the 1033 printer, there has been very little down-time due to malfunctioning of the 1031 badge/card reader. due to their isolation on the stack floors, there was some early tampering with the terminals. now that the newness has worn off, the terminals seem to have lost their appeal to pranksters, except that the photographs used to illustrate procedures have a way of disappearing. everything taken into consideration, the self-service concept has proved completely feasible. it saves staff time and user time. the time required to charge out a book ranges from ten to fifteen seconds.

carrel charges

each quarter, the circulation section assigns carrels to individuals, mostly graduate students and faculty. carrel holders may charge out books for use in their carrels. a special loan code is entered which results in the date-due slip bearing the word "carrel." the user cannot take these books from the building. carrel charges are subject to call-in after two weeks but are not subject to fines. at the end of each quarter, unless the carrel has been reassigned to the same individual, any remaining books in the carrel are picked up and discharged. once a quarter, the carrel user receives a computer-printed list of books charged to his carrel. carrel holders tend to charge large numbers of books. for saving time on their part and on the part of staff, plastic badges are issued. these will contain the carrel number, the carrel code, and an expiration date. carrel holders may then use the self-service terminals in the stacks.

charges to the reserve room

the reserve room does not use the circulation system for charges to individuals, since the loan period is so limited. however the circulation file contains a record of all books located in the reserve room. when a book is charged to the reserve room, the identification number of the reserve room is entered in the 1031 slides, together with a loan code indicating an indefinite loan period.
processing of large batches of books is speeded up by suppressing the printing of date-due slips for all intralibrary charges. after charging, the punched book card is removed and held until the book is ready to be returned to the stacks, at which time the book is discharged in the regular manner.

if a book needed for reserve cannot be found in the stacks, it is searched in the 2740 terminal. if it is in the file, a save is placed on the record which generates a book-needed notice. the user is given five days to return the book. when the book is returned and discharged, a printer message alerts the discharger, who places the book on the shelf for pick-up by the reserve room. if the book is not in the file, it goes through the "book locate procedure," after which, if it is not found, it is processed as a "missing" book. if such a missing book turns up, it can be immediately identified as needed by the reserve room. a quarterly listing, in call number order, is received from the computer center for all books charged out to the reserve room. this list serves as the reserve room's shelf list.

bindery charges

if a book is found to be in bad condition, it is set aside for a bindery decision. if it is beyond repair, it is charged out to the catalog department to be replaced or withdrawn. (after it is withdrawn it is deleted from the file.) if it can be repaired in-house, it is charged out to the mending section. if it must be sent to a commercial binder, it is charged to the bindery. the bindery section prepares an extra copy of the bindery ticket for all periodicals and unbound items, which it sends to the bindery. this ticket is used to keypunch a book card, which is then used to charge the book to the bindery. whenever a book is back from mending or binding, it is discharged before sending to the stacks.

renewals

all renewals are processed at the main circulation desk.
the procedure is identical to a regular charge except that a slide on the terminal is set to "renew." the new date-due slip will contain the phrase "renew to." in theory, the self-service terminals could be used for renewals. in practice, unless elaborate precautions were taken, a user could renew a book before it became due and then return it for discharge, leaving one slip in the book and keeping the other. after the book reached the stacks, the user could insert the extra date-due slip and walk out undetected. as protection against this, the original date-due slip must be in the book when it is renewed.

phone renewals are not accepted. however, if the user mails or brings in his date-due slips, the renewal is processed on the 2740. in the renewal of a book via the 2740, the record is called in and modified to change the date due and enter the correct renewal code. the original date due slip is stamped with the new date and the phrase "renewed." the slip is mailed to the user.

although record modification via the 2740 is a valuable and necessary feature, it must be used with discretion, since the generalized file management system governing the 2740 does not have the controls contained in the circulation-specific portions of the program which handle data from the 1030's; for example, automatic calculation of date due, rejection of renewals on saved books, validation of codes, etc.

book discharge

returned books are left in book bins, one inside the building and one outside. it became very evident during the implementation phase of the system that the success of the system depended on a thorough screening before discharge for purposes of detecting and deflecting potential problems before they got to the discharge terminal. books are first placed on dated trucks and then screened.
books without punched book cards

if the punched book card is missing, there will usually be a hand-written date slip in the book (the result of a manual charge). the screener pulls the matching book card from the "book-cards-pending" file. (after a manual charge, book cards are punched and filed in this file to await the return of the book.) the book is then ready for regular discharge. if there is no book card waiting, the book must be held until a card is ready. this is done to avoid the charge being made after the discharge.

books with incorrect book cards

all book cards are checked to see that they match the call number on the book pocket. sometimes cards get switched between two books by the user when he charges them; sometimes the error was made when the card was originally matched with the book. if a book is found to contain an incorrect card, sometimes the correct card will be found in the "cards-pending" file. if so, it is pulled and inserted and the book sent for regular discharge. (the incorrect card becomes a "snag".) if the correct card is not found, the record is searched in the 2740 under both call numbers (the one on the card and the one on the book). if the record is under both call numbers, the record which matches the book is deleted; the book is sent to keypunching; the unmatched book card is filed in the "cards-pending" file to await the return of the book which matches it. if the record is found under only one of the two call numbers, it is deleted. the book is sent to keypunching; the unmatched book card becomes a "snag." "snag" cards will be searched in the shelf list and, if they represent valid books, will be searched in the stacks. this is done to determine if a matching book can be found.

books without date due slips

the presence of a date due slip in a book usually indicates that the book should be in the circulation file.
a slip will be missing if the user never charged it out or if he lost (or removed) the slip after charging it out. such books are searched on the 2740. if no record is found, the book is sent to the stacks. if a record is found, it is deleted. however, we wish to guard against the user returning to insert the date due slip and walk out with it; thus, the book is not sent to the stacks until the date due is past.

regular 1031 discharges

the speed and accuracy of discharge are features which have contributed much to the success of the system. a book with a date-due slip and book card which matches the book go to the 1030 terminal at the main circulation desk for discharge. one slide is set to either "fine paid" or "fine not paid." (if the user paid a fine at the time he returned the book, a "fine paid" flag will be in the book.) another slide is set to "book returned today," or "book returned yesterday," or "book returned prior to yesterday." if the last condition applies, the date of return is also set in the slides. once set, these slides need not be reset until there is a change of date or fine condition. for minimizing the resetting of slides, books are segregated into groups all of the same type.

discharging is the essence of simplicity. the book card is inserted in the reader; it feeds down and out and is replaced in the book. the date-due slip is discarded and the book is ready for shelving. for the purpose of speeding up discharge, no printer message is received unless there is an error (record not in file), or unless the book has a save on it, or is a "found" lost book. one operator can discharge five to six books per minute. books are almost always discharged within one day of return and usually within three or four hours. if a large number of books should pile up after a period of computer down-time (fortunately rare), a massive discharge campaign is launched.
two operators, working together on the terminal, can discharge books at the rate of one every eight seconds.

if at the time of discharge a "save" message appears on the printer, the book is placed on the save shelf instead of being sent to the stacks. if a lost book is "found," the message alerts the operator to send the book to the staff member in charge of lost and missing books. if a message is received to the effect that no record exists in the file, the book is routed to the 2740 operator.

occasionally the 1031 terminal will misread a card, usually due to improper folding. if a card is folded outside the punched area it causes no trouble. unfortunately, some of the original cards were folded in the middle which sometimes results in a punch being missed. this, in some cases, cannot be detected by the computer program. if the error resulted from a mis-read card, the terminal operator can usually determine, from the date-due slip and the error-message slip, the key under which the record exists. the record is deleted and the book sent to have a replacement card punched.

an occasional cause of the "record-not-in-file" condition results when the charge was processed on the standard register punch (the mechanized back-up system). this punch has a disconcerting habit of dropping a punch from badges which have a slight defect. there is no warning when this happens, and the error is often not detected until the transaction is later processed through the 1030 terminal. since it is impossible to identify the user with certainty, such cards are simply discarded without processing on the assumption that most users are basically honest and will return the book. the 2740 operator, seeing a date-due slip with a short identification number, is safe in assuming the record never got in the file. sometimes the "record-not-in-file" condition is the result of a discharger absent-mindedly discharging a book twice.
if the 2740 operator cannot find a record, she gives up and sends the book to the stacks.

during the early days of operation, when much of the charging was being done on the source record punch, the "record-not-in-file" condition was often due to the book being "discharged" before the charge was processed. the very small amount of down-time now, coupled with careful scheduling when it does occur, has almost eliminated this source of error.

overdue books

overdue notices for students and special individual users are prepared once a week. to avoid sending out large numbers of notices for books only a few days overdue, an overdue notice is prepared only if the book is at least four days overdue. a second notice is prepared two weeks after the first; a third and final notice is prepared two weeks after that. if there is no response to the final notice within two weeks, a "delinquent" notice is prepared which is not sent out but is used to prepare a bill for a "lost" book. the overdue-notice run also produces reminders of expired call-ins.

fines and fine collection

faculty and staff are fine-exempt. students and other individual users pay a 10 cents per day fine for books overdue more than three days. in addition, if a reader does not respond to a call-in by the revised due date, he is charged a $1.00 per day penalty fine. a user may elect to pay a fine on an overdue book at the time he returns it, in which case a "fine-paid" flag is inserted to alert the discharger to set the proper slide. no fine notice will result if this slide is set. for all other books returned late, fine notices are computer-prepared each weekday. these are on four-part forms; one copy is inserted in a window envelope and mailed; the other three parts are filed alphabetically by name. when the user pays his fine, the extra slips are discarded. if the fine is not paid in a reasonable period, one of the extra copies is sent as a follow-up notice.
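The fine rules stated so far can be sketched as a small Python function. This is only an illustration under stated assumptions, not the library's program. It assumes that once a book is more than three days overdue the 10-cent fine is charged for every overdue day, that the $1.00 call-in penalty has no grace period, and that a fine-exempt borrower pays neither; the article does not spell out any of these points. Amounts are kept in whole cents, and all names are invented.

```python
from datetime import date

REGULAR_FINE_CENTS = 10   # per day, from the text
GRACE_DAYS = 3            # fine applies to books "overdue more than three days"
PENALTY_FINE_CENTS = 100  # per day, for a call-in not honored by the revised due date


def fine_cents(due: date, returned: date,
               called_in: bool = False, fine_exempt: bool = False) -> int:
    """Fine owed in cents; `due` is the revised due date if the book was called in."""
    days_late = (returned - due).days
    if fine_exempt or days_late <= 0:
        return 0
    total = 0
    if days_late > GRACE_DAYS:
        # Assumption: every overdue day is charged once the grace period is exceeded.
        total += REGULAR_FINE_CENTS * days_late
    if called_in:
        # Assumption: the call-in penalty starts on the first day past the revised date.
        total += PENALTY_FINE_CENTS * days_late
    return total


if __name__ == "__main__":
    due = date(1972, 1, 10)
    print(fine_cents(due, date(1972, 1, 12)))                  # within the grace period
    print(fine_cents(due, date(1972, 1, 14)))                  # past the grace period
    print(fine_cents(due, date(1972, 1, 14), called_in=True))  # called-in book
```

Working in integer cents avoids floating-point rounding when the two per-day rates are added together.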
if no response to the follow-up is received, and if the total bill exceeds $3.00, the bill is sent to the department of student finance for collection.

sometimes the receiver of an overdue notice will come in to report that he (1) returned the book, (2) lost it, or (3) never had it. in such cases the book is searched in the 2740 because the book may have been returned since the overdue notice was prepared. if the record is still in the file, the item is verified in the shelf list. in some instances an incorrectly punched card is responsible for the item not being properly discharged. if a call number on a notice cannot be found in the shelf list, there is no alternative except to delete the record and absolve the reader of responsibility. if the call number on the notice represents a valid book, it is searched in the stacks and if found, is brought down and discharged, with a resultant fine notice. when the book cannot be found, the reader is usually held responsible for it, unless it was a case of a lost badge which was reported promptly, in which case the library is usually lenient. if no lost badge was involved, the book is processed as a "lost" book and the user is billed. a book is also considered lost (and the user billed) if the user does not respond to three overdue notices.

weekly overdue notices are not prepared for faculty. instead a once-a-quarter computer-produced memo is prepared informing the individual of the books charged to him. he is asked to return them or notify the library by carbon copy of the list that he wishes to retain them. if a faculty member does not return the list, the library calls in the books individually. as part of this quarterly memo run, listings of books charged to carrels and to departments (reserve room, bindery, cataloging, etc.) are produced. these listings have proved very valuable in maintaining control over books charged out on a long-term basis.
lost books

when a book is determined to be "lost," a duplicate book card is prepared. the history of the loss, including the name and address of the individual involved, is entered on the card. if the reader is held responsible, the book is priced and a bill is prepared. the original record is left in the file until all the documents are prepared. then it is deleted via the 2740, and a duplicate card is immediately used to charge the book out to the "lost" category. the duplicate card is then filed in the "lost/missing" file, by call number.

another category of books is known as "missing." these are books which, although not charged out to anyone, cannot be found in the stacks. a duplicate card is prepared and used to charge the item out to the "missing" category. the card is filed in the "lost/missing" file.

once a quarter, a computer-produced listing of lost/missing books is received. using this list, the stacks can be searched to see if the books have turned up. the list of books lost or missing for more than two years is turned over to the catalog department for withdrawal. after official withdrawal, the record will be deleted from the file.

the fact that all lost/missing books are reflected in the file has aided in detecting them if they turn up. if such a book is discharged, a printed message alerts the operator who routes the book to the person in charge of lost/missing books. the duplicate card is then pulled from the lost/missing file. since the card contains the name of the responsible individual, it is possible to trace down the original bill in case an adjustment is necessary. lost/missing books also turn up if someone tries to charge them out. the "unprocessed" message which is printed instead of a date-due slip will usually cause the reader to bring the book to the main circulation desk where the proper action can be taken to reinstate it in the collection.
manual charges

the system had to be designed so a book could be charged out even if it did not have a punched book card. such books are brought to the main circulation desk where a two-part form is hand-prepared. one part becomes a date-due slip; the other part goes to keypunching. a composite card containing the call number, the user number, and the loan code is punched, which is then fed through the 1030 to create a charge record. also keypunched at this time is a regular book card, which is filed in the "cards-pending" file to await the return of the book. such manual charges are very unsatisfactory. call numbers and identification numbers are often illegible or miscopied. keypunch errors are not uncommon. care must be taken that the composite cards are processed through the 1030 before the book is returned for discharge. fortunately, books without cards are now a rarity.

mechanized charges

although the amount of computer down-time is very slight, some means had to be devised to charge out books during such periods. the manual charge procedure could have been used; however the high error rate in copying and punching, coupled with the delay in keypunching any substantial volume of cards, caused us to reject this as a back-up system. a standard register source record punch is used. this punch reads the badge and book card and transfers the data, plus data from a series of internal slides, to produce a printed date-due slip and a punched composite card. when the computer comes back on, the composite card is fed through the 1031 to set up the charge record.

since only one machine can be justified from a cost standpoint, the process of charging books out in this fashion is slow. long lines of people often form, waiting for service. resetting the internal slides between one loan code and another is awkward and error-prone. the machine is extremely sensitive to badge quality and often misses a punch.
however, as with manual charges, the most significant disadvantage is that charges are made "blind." there is no way to determine whether a book is not already in the file, or, if it is being renewed, that it has a save on it. the user's number may be one of those "blocked" from use; this fact is not detected until it is too late. as with manual charges, care must be taken that all such mechanized charges are processed through the 1030 before any discharging is done.

in spite of its defects, the source record punch has proved useful as a system back-up. the error rate in transfer, while higher than on the 1030, is significantly less than the error rate of manually prepared and keypunched charges. although slow, records get into the file much faster than if they had to be keypunched.

the impact of the system on the library

the new system has had a profound impact on the operation of the circulation services section, but other departments have also been affected, particularly technical services. tighter control of cataloging is now maintained. no longer is it feasible for small uncataloged collections or collections with off-beat cataloging to exist in virtual isolation from the rest of the library. regulations as to depth of classification have had to be adopted; the formation of the cutter number and work letters must be carefully regulated; the assignment of volume and edition numbers must be uniform. location symbols require careful control; no longer can books be casually passed from one collection to another without official transfer. withdrawal of lost and missing books must be systematically performed. the system gives maximum flexibility: books may circulate on lc class numbers or document numbers as well as on a dewey number. ways of handling non-standard cutter numbers and work letters have been improvised. at the same time, the system operates to prevent unnecessary haphazard and shortsighted practices.
within the circulation services section, the computerization of circulation has not resulted in fewer personnel; it has, however, resulted in the same number of staff members being able to handle a much larger volume of circulation and to handle it more efficiently. in addition, circulation services has taken on a number of tasks which in the past were either not its responsibility or, if they were, were given only perfunctory attention. a comprehensive inventory of the entire collection of 1,200,000 books in the main library is in progress. errors both on books and in the catalog are being corrected. the physical condition of the collection is being attended to. the content and quality of the collection are receiving increased attention. incomplete serial holdings are being brought to light for possible acquisition. books in the stacks which are candidates for inclusion in the "special collections" department are being detected.

so far as circulation proper is concerned, it can be said without reservation that the system saves a great deal of clerical effort. staff time spent in charging out books is very small. discharging an average day's books requires three or four man-hours. filing has almost disappeared as has most of the typing formerly required. a 2740 operator is required for inquiry and processing of mail renewals for the better part of the day and evening. the collection and follow-up on fines and bills is still a time-consuming job, although the extra forms available for follow-up have supplied some relief.

the system is not perfect. there are certain improvements, such as on-line validation of users and automatic regulation of loan privileges, which would be made if the time and money were available for them. however, considering the modest cost of developing and operating the system, the imperfections are bearable.

not the least of the benefits derived from the system is a somewhat intangible one.
the role of the circulation librarian, and that of his staff, has changed. no longer are they chained to mountains of cards which, as soon as they are filed, must be unfiled. staff members have been challenged to use their released time to the best advantage. much thought and ingenuity has gone into setting up procedures to achieve maximum efficiency and accuracy. for the first time, perfection is seen as an attainable goal. each day the staff develops more sophistication and gets a step closer to that goal.

fig. 1. user inserts identification badge and punched book card in self-service circulation terminal.

fig. 2. specially designed attachment is used to cut off printed

fig. 3. user inserts date-due slip in book pocket, completing charge procedure.

fig. 4. terminal at circulation desk has manual entry unit, which can be set to process charges without an identification badge, renewals, or discharges.

fig. 5. typewriter terminal is used for inquiry into file, placing saves on books, and occasionally for renewals.

free culture: how big media uses technology and the law to lock down culture and control creativity. coyle, karen. information technology and libraries; dec 2004; 23, 4; proquest pg. 198

book review

free culture: how big media uses technology and the law to lock down culture and control creativity, by lawrence lessig. new york: penguin, 2004. 240p. $24.95 (isbn 1594-20006-8).

this is the third book by stanford law professor larry lessig, and the third in which he furthers his basic theme: that the ancient regime of intellectual property owners is locked in a battle with the capabilities of new technology.
lessig used his first book, code and other laws of cyberspace (basic books, 1999), to explain that the notion of cyberspace as free, open, and anarchic is simply a myth, and a dangerous one at that: the very architecture of our computers and how they communicate determine what one can and cannot do within that environment. if you can get control of that architecture, say by mandating filters on content, you can get substantial control over the culture of that communication space. in his second book, the future of ideas: the fate of the commons in a connected world (random, 2001), lessig describes how the change from real property to virtual property actually means more opportunity for control, not less. the theme that he takes up in free culture is his concern that certain powerful interests in our society (read: hollywood) are using copyright law to lock down the very stuff of creativity: mainly, past creativity. lessig himself admits in his preface that his is not a new or unique argument. he cites richard stallman's writings in the mid-1980s that became the basis for the free software movement as containing many of the same concepts that lessig argues in his book. in this case, it serves as a kind of proof of concept (that new ideas build on past ideas) rather than a criticism of lack of originality. stallman's work is not, however, a substitute for lessig's; not only does lessig address popular culture where stallman addresses only computer code, but lessig has one key thing in his favor: he is a master story-teller and a darned good writer, not something one usually expects in an academic and an expert in constitutional law. his book opens with the first flight of the wright brothers and the death of a farmer's chickens, followed by buster keaton's film steamboat bill and disney's famous mouse.
the next chapter traces the history of photography and how the law once considered that snapping a picture could require prior permission from the owners of any property caught in the viewfinder. later he tells how an improvement to a search engine led one college student to owe the recording industry association of america $15 million. throughout the book lessig illustrates copyright through the lives of real people and uses history, science, and the arts to make this law come to life for the reader. lessig explains that intellectual property differs from real property in the eye of the law. unlike real property, where the property owner has near total control over its uses, the only control offered to authors originally was the control over who could make copies of the work and distribute them. in addition, that right, the "copy right," lasted only a short time. the original length of copyright in the united states was fourteen years, with the right to renew for another fourteen years. so a total of twenty-eight years stood between an author's rights and the public domain, and those rights were limited to publishing copies. others could quote from a work, even derive other works from it (such as turning a novel into a play), all within a law that was designed to promote science and the arts. fast forward to the present day and we have a very different situation. not only has there been a change in the length of time that copyright applies to a work; a major change in copyright law in 1976 extended copyright to works that had not previously been covered. in the earliest u.s. copyright regimes of the late 18th century, only works that were registered with the copyright office were afforded the protection of copyright law, and only about five percent of works produced were so registered. the rest were in the public domain.
tom zillner, editor
later, actual registration with the copyright office was unnecessary but the author was required to place a copyright notice on a work (e.g., "© 2004, karen coyle") in order to claim copyright in it. copyright holders had to renew works in order to make use of the full term of protection, and renewal rates were actually quite low. in 1976, all such requirements were removed, and the law was amended to state that any work in a fixed medium automatically receives copyright protection, and for the full term. that is true even if the author does not want that protection. so although many saw the great exchange of ideas and information on the internet as being a huge commons of knowledge, to be shared and shared alike, all of it has, in fact, always been covered by copyright law: every word out there belongs to someone. that change, combined with a much earlier change that gave a copyright holder control over derivative works, puts creators into a deadlock. they cannot safely build on the work of others without permission (thus lessig's argument that we are becoming a "permission culture"). yet, we have no mechanism (such as registration of works that would result in a database of creators) that would facilitate getting that permission. if you find a work on the internet and it has no named author or no contact information for the author, the law forbids you to reuse the work without permission, but there is nothing that would make getting that permission a manageable task. of course, even if you do know who the rights holder is, permission is not a given. for example, you hear a great song on the radio and want to use parts of that tune in your next rap performance. you would need to approach the major record label that holds the rights and ask permission, which might not be granted.
you could go ahead and use the sample and, if challenged, claim "fair use." but being challenged means going to court in a world where a court case could cost you in the six digits, an amount of money that most creators do not have. lessig, of course, spends quite a bit of time in his book on the length of copyright, now life of the author plus seventy years. it was exactly this issue that he and eric eldred took to the supreme court in 2003. lessig argued before the court that if congress can seemingly arbitrarily increase the length of copyright, as it has eleven times since 1962, then there is effectively no limit to the copyright term. yet "for a limited time" was clearly mandated in the u.s. constitution. lessig lost his case. you might expect him to spend his efforts explaining how the supreme court was wrong and he was right, but that is not what he does. right or wrong, they are the supreme court, and his job was to convince them to decide in favor of his client. instead, lessig revises his estimation of what can be accomplished with constitutional arguments and spends a chapter outlining compromises that might, just might, be possible in the future. to the extent that eldred v. ashcroft had an effect on lessig's thinking, and there is evidence that the effect was profound, it will have an effect on all of us because lessig is one of the key actors in this arena. throughout the book, lessig points out the difference between copyright law and the actual market for works. there is a great irony in the fact that copyright law now protects works for a century or more while most books are in print for one year or less. it is this vast storehouse of out-of-print and unexploited works that makes a strong argument for some modification of our copyright law. he also recognizes that there are different creative cultures in our society, with different views of the purpose of creation.
here he cites academic movements like the public library of science as solutions for the sector of society that has a low or nonexistent commercial interest but a need to get its works as widely distributed as possible. for these creators, and for "sharers" everywhere, lessig promotes the creativecommons solution (at www.creativecommons.org), a simple licensing scheme that allows creators to attach a license to their work that lets others know how they can make use of it. in a sense, creativecommons is a way to opt out of the default copyright that is applied to all works. when i first received my copy of free culture, i did two things: i looked up libraries in the index, and i looked up the book online to see what other reviewers had said. online, i found a web site for the book (http://free-culture.org) that pointed to two very interesting sites: one that lists free, downloadable full-text copies of the book in over a dozen different formats; and one that allows you to listen to the chapters being read aloud by volunteers and admirers. (i did listen to a few chapters and generally they are as listenable as most nonfiction audio books. in the end, though, i read the hard copy of the book.) lessig is making a point by offering his work outside the usual confines of copyright law, but in fact the meaning of his gesture is more economic than legal. although he, and cory doctorow before him (down and out in the magic kingdom, tor books, 2003), brokered agreements with their publishers to publish simultaneously in print with free digital copies, few authors and publishers today will choose that option for fear of loss of revenue, not because of their belief in the sanctity of intellectual property. if there were sufficient proof that free online copies of works increased sales of hard copies, this would quickly become the norm, regardless of the state of copyright law. as for libraries, unfortunately, they do not fare well.
he dedicates a short chapter to brewster kahle and his way-back machine as his example of the need to archive our culture for future access. i admit that i winced when lessig stated:
but kahle is not the only librarian. the internet archive is not the only archive. but kahle and the internet archive suggest what the future of libraries or archives could be. (114)
lessig also mentions libraries in his arguments about out-of-print and inaccessible works, but in this case he actually gets it wrong:
after it [a book] is out of print, it can be sold in used book stores without the copyright owner getting anything and stored in libraries, where many get to read the book, also for free. (113)
since we know that lessig is very aware that books are sold and lent even while they are still in print, we have to assume that the elegance of the argument was preferred over precision. but he makes this error more than once in the book, leaving libraries to appear to be a home for leftovers and remaindered works. that is too bad. we know that lessig is aware of libraries; anyone active in the legal profession depends on them. he has spoken at library-related conferences and events. yet he does not see libraries as key players in the battle against overly powerful copyright interests. more to the point, libraries have not captured his imagination, or given him a good story to tell. so here is a challenge for myself and my fellow librarians: whether it means chatting up lessig after one of his many public performances, becoming active in creativecommons, or stopping by palo alto to take a busy law professor to lunch, we need to make sure that we get on, and stay on, lessig's radar. we need him; he needs us.
karen coyle, digital libraries consultant, http://kcoyle.net
editorial: computing in the “cloud”: silver lining or stormy weather ahead?
marc truitt
cloud computing. remote hosting. software as a service (saas). outsourcing.
terms that all describe various parts of the same it elephant these days. the sexy ones—cloud computing, for example—emphasize new age-y, “2.0” virtues of collaboration and sharing with perhaps slightly mystic overtones: exactly where and what is the “cloud,” after all? others, such as the more utilitarian “remote hosting” and “outsourcing,” appeal more to the bean counters and sustainability-minded among us. but they’re really all about the same thing: the tradeoff between cost and control. that the issue increasingly resonates with it operations at all levels these days can be seen in various ways. i’ll cite just a few:
• at the meeting of the lita heads of library technology (holt) interest group at the 2009 ala annual conference in chicago, two topics dominated the list of proposed holt programs for the 2010 annual conference. one of these was the question of virtualization technology, and the other was the whole white hat–black hat dichotomy of the cloud.1 practically everyone in the room seemed to be looking at—or wanting to know more about—the cloud and how it might be used to benefit institutions.
• my institution is considering outsourcing e-mail. all of it—to google. times are tough, and we’re being told that by handing e-mail over to the googleplex, our hardware, licensing, evergreening, and technical support fees will total zero. zilch. with no advertising. heady stuff when your campus hosts thirty-plus central and departmental mail servers, at least as many blackberry servers, and total costs in people, hardware, licensing, and infrastructure are estimated to exceed can$1,000,000 annually.
• in the last couple of days, library electronic discussion lists such as web4lib have been abuzz—or do we now say a-twitter?—about amazon’s orwellian kindle episode, in which the firm deleted copies of 1984 and animal farm from subscribers’ kindle e-book readers without their knowledge or consent.2 indeed, amazon’s action was in violation of its own terms of service, in which the company “grants [the kindle owner] the non-exclusive right to keep a permanent copy of the applicable digital content and to view, use, and display such digital content an unlimited number of times, solely on the device or as authorized by amazon as part of the service and solely for [the kindle owner’s] personal, noncommercial use.”3
all of this has me thinking back to the late 1990s marketing slogan of a manufacturer of consumer-grade mass storage devices—remember removable hard drives? iomega launched its advertising campaign for the 1 gb jaz drive with the catch-line “because it’s your stuff.” ultimately, whether we park it locally or send it to the cloud, i think we need to remember that it is our stuff. what i fear is that in straitened times, it becomes easy to forget this as we struggle to balance limited staff, infrastructure, and budgets. we wonder how we’ll find the time and resources to do all the sexy and forward-looking things, burdened as we are with the demands of supporting legacy applications, “utility” services, and a huge and constantly growing pile of all kinds of content that must be stored, served up, backed up (and, we hope, not too often, restored), migrated, and preserved. the buzz over the cloud and all its variants thus has a certain siren-like quality about it. the notion of signing over to someone else’s care—for little or no apparent cost—our basic services and even our own content (our stuff) is very appealing.
the song is all the more persuasive in a climate where we’ve moved from just the normal bad news of merely doing more with less to a situation where staff layoffs are no longer limited to corporate and public libraries, but indeed extend now to our greatest institutions.4 at the risk of sounding like a paranoid naysayer to what might seem a no-brainer proposition, i’d like to suggest a few test questions for evaluating whether, how, and when we send our stuff into the cloud:
1. why are we doing this? what do we hope to gain?
2. what will it cost us? bear in mind that nothing is free—except, in the open-source community, where free beer is, unlike kittens, free. if, for example, the borg offer to provide institutional mail without advertisements, there is surely a cost somewhere. the borg, sensibly enough, are not in business to provide us with pro bono services.
3. what is the gain or loss to our staff and patrons in terms of local customization options, functionality, access, etc.?
4. how much control do we have over the service offered or how our content is used, stored, repurposed, or made available to other parties?
5. what’s the exit strategy? what if we want to pick up and move elsewhere? can we reclaim all of our stuff easily and portably, leaving no sign that we’d ever sent it to the cloud?
we are responsible for the services we provide and for the content with which we have been entrusted. we cannot shrug off this duty by simply consigning our services and our stuff to the cloud.
marc truitt (marc.truitt@ualberta.ca) is associate university librarian, bibliographic and information technology services, university of alberta libraries, edmonton, alberta, canada, and editor of ital.
information technology and libraries | september 2009
to do so leaves us vulnerable to an irreparable loss of credibility with our users; eventually some among them would rightly ask, “so what is it that you folks do, anyway?” we’re responsible for it—whether it’s at home or in the cloud—because it’s our stuff. it is our stuff, right?
references and notes
1. i should confess, in the interest of full disclosure, that it was eli neiburger of the ann arbor district library who suggested “hosted services as savior or slippery slope” for next year’s holt program. i’ve shamelessly filched eli’s topic, if not his catchy title, for this column. thanks, eli. also, again in the interest of full disclosure, i suggested the virtualization topic, which eventually won the support of the group. finally, some participants in the discussion observed that virtualization technology and hosting are in many ways two sides of the same topical coin, but i’ll leave that for others to debate.
2. brad stone, “amazon erases orwell books from kindle,” new york times, july 17, 2009, http://www.nytimes.com/2009/07/18/technology/companies/18amazon.html?_r=1 (accessed july 21, 2009).
3. amazon.com, “amazon kindle: license agreement and terms of use,” http://www.amazon.com/gp/help/customer/display.html?nodeid=200144530 (accessed july 21, 2009).
4. “budget cutbacks announced in libraries, center for professional development,” stanford university news, june 10, 2009, http://news.stanford.edu/news/2009/june17/layoffs-061709.html (accessed july 22, 2009); “harvard libraries cuts jobs, hours,” harvard crimson (online edition), june 26, 2009, http://www.thecrimson.com/article.aspx?ref=528524 (accessed july 22, 2009).
letter from the editors
kenneth j. varnum and marisha c.
kelly information technology and libraries | june 2022 https://doi.org/10.6017/ital.v41i2.15225 editorial board update i would like to open with a message of gratitude to the editorial board members who have helped shape the direction and focus of the journal over the past four years. steve bowers, kevin ford, cinthya ippoliti, ida joiner, michael sauers, and laurie willis have been fantastic colleagues, providing sage advice and thoughtful opinions through their tenures. together, they have reviewed dozens of articles for the journal but, more importantly, have helped shape the policies and directions we hope to take. together, we thought through and instituted our name change policy, a policy for revision of published articles, and ongoing efforts to identify sources of bias in editorial and reviewing practice. this work lays the foundation for future improvements. even as we say farewell to these editorial board members, it is my pleasure to welcome these individuals to the editorial board on july 1: ashlea green, mary a. guillory, dana haugh, shanna hollich, and cynthia schwarz. they were selected from an impressive pool of applicants. we are grateful for all who applied. we welcome submissions related to the intersection of cultural memory institutions (libraries, archives, and museums) and technology. our call for submissions outlines the topics and process for submitting an article for review. if you have questions or wish to bounce ideas off the editor and assistant editor, please contact either of us at the email addresses below. this issue’s contents the june “public libraries leading the way” column is contributed by julie lane at the county of prince edward public library and archives. lane describes how the covid-19 pandemic not only led to immediate changes to serve a geographically distributed community, but also increased the library’s horizons in terms of advocating for and promoting equitable access to learning materials . 
our peer-reviewed content this month showcases topics including collection analysis, user-learner profiles, topic modeling, copyright bots, intangible cultural heritage, contactless services, and explainable artificial intelligence.
1. rarely analyzed: the relationship between digital and physical rare books collections / allison mccormack and rachel wittmann
2. ontology for the user-learner profile personalizes the search analysis of online learning resources: the case of thematic digital universities / marilou kordahi
3. applying topic modeling for automated creation of descriptive metadata for digital collections / monika glowacka-musial
4. classical musicians v. copyright bots: how libraries can aid in the fight / adam eric berkowitz
5. research on knowledge organization of intangible cultural heritage based on metadata / qing fan, guoxin tan, chuanming sun, and panfeng chen
6. contactless services: a survey of the practices of large public libraries in china / yajun guo, zinan yang, yiming yuan, huifang ma, and yan quan liu
7. explainable artificial intelligence (xai): adoption and advocacy / michael ridley
kenneth j. varnum, editor marisha c.
kelly, assistant editor
varnum@umich.edu
marisha.librarian@gmail.com
https://ejournals.bc.edu/index.php/ital/name-change-policy
https://ejournals.bc.edu/index.php/ital/call-for-submissions
https://ejournals.bc.edu/index.php/ital/article/view/13415
https://ejournals.bc.edu/index.php/ital/article/view/13601
https://ejournals.bc.edu/index.php/ital/article/view/13799
https://ejournals.bc.edu/index.php/ital/article/view/14027
https://ejournals.bc.edu/index.php/ital/article/view/14093
https://ejournals.bc.edu/index.php/ital/article/view/14141
https://ejournals.bc.edu/index.php/ital/article/view/14683
article
topic modeling as a tool for analyzing library chat transcripts
hyunseung koh and mark fienup
information technology and libraries | september 2021 https://doi.org/10.6017/ital.v40i3.13333
hyunseung koh (hyunseung.koh@uni.edu) is an assessment librarian and assistant professor of library services, university of northern iowa. mark fienup (mark.fienup@uni.edu) is an associate professor in the computer science department, university of northern iowa. © 2021.
abstract
library chat services are an increasingly important communication channel to connect patrons to library resources and services. analysis of chat transcripts could provide librarians with insights into improving services. unfortunately, chat transcripts consist of unstructured text data, making it impractical for librarians to go beyond simple quantitative analysis (e.g., chat duration, message count, word frequencies) with existing tools.
as a stepping-stone toward a more sophisticated chat transcript analysis tool, this study investigated the application of different types of topic modeling techniques to analyze one academic library’s chat reference data collected from april 10, 2015, to may 31, 2019, with the goal of extracting the most accurate and easily interpretable topics. in this study, topic accuracy and interpretability—the quality of topic outcomes—were quantitatively measured with topic coherence metrics. additionally, qualitative accuracy and interpretability were measured by the librarian author of this paper depending on the subjective judgment on whether topics are aligned with frequently asked questions or easily inferable themes in academic library contexts. this study found that from a human’s qualitative evaluation, probabilistic latent semantic analysis (plsa) produced more accurate and interpretable topics, which is not necessarily aligned with the findings of the quantitative evaluation with all three types of topic coherence metrics. interestingly, the commonly used technique latent dirichlet allocation (lda) did not necessarily perform better than plsa. also, semi-supervised techniques with human-curated anchor words of correlation explanation (corex) or guided lda (guidedlda) did not necessarily perform better than an unsupervised technique of dirichlet multinomial mixture (dmm). last, the study found that using the entire transcript, including both sides of the interaction between the library patron and the librarian, performed better than using only the initial question asked by the library patron across different techniques in increasing the quality of topic outcomes. 
introduction
with the rise of online education, library chat services are an increasingly important tool for student learning.1 library chat services have the potential to support student learning, especially for distance learners who lack opportunities to come and learn about library and research skills in person. in addition, unlike traditional in-person reference services, whose use has declined drastically, library chat services have become an important communication channel that connects patrons to library resources, services, and spaces.2
quantitative and qualitative analysis of chat transactions could provide librarians with insights into improving the quality of these resources, services, and spaces. for example, in order to maximize patrons’ satisfaction, librarians could identify or evaluate quantitative and qualitative patterns of chat reference data (e.g., busiest days and times of nondirectional, research-focused questions) and develop a better staffing plan for assigning librarians or student employees to the most appropriate days and times. furthermore, these insights could be used to help demonstrate library value by showing external stakeholders how successfully library chat services support students’ needs, which is increasingly in demand for higher education.3
in practice, it is burdensome for librarians to go beyond simple quantitative analysis (e.g., chat duration, message count, word frequencies) with existing chat software tools, such as libraryh3lp, questpoint, springshare’s libchat, and liveperson.4 currently, in order to obtain rich and hidden insights from large volumes of chat transcripts, librarians need to conduct manual qualitative analysis of chat transcripts with unstructured text data, which requires a lot of time and effort.
in an age when library patrons' information needs have been changing, the lack of chat analysis tools that handle large volumes of transcripts hinders librarians’ ability to respond to patrons’ wants and needs in a timely manner.5 in particular, small and medium-sized academic libraries have seen a shortage of librarians and need to hire and train student employees , so librarians’ capabilities for real-time quick and easy analysis and assessment will become critical in helping them take appropriate actions to best meet user needs.6 as part of an effort to develop a quick and easy analysis tool for large volumes of chat transcripts, this study applied topic modeling, which is a statistical technique “for learning the latent structure in document collections” or “a type of statistical model for finding hidden topical patterns of words.”7 we compared outcomes of different types of topic modeling techniques and attempted to propose topic modeling techniques that would be most appropriate in the context of chat reference transcript data. literature review to identify the most appropriate research methods that would facilitate analyzing a vast amount of chat transcripts, this section first introduces literature in relation to research methods used in analyzing chat transcript data in library settings and nonlibrary settings. it follows by discussing different types of topic modeling techniques that have high potential for quick and easy analysis of chat transcripts and their strengths and weaknesses. 
chat transcript analysis methods in library settings
in analyzing library chat transcripts, which are one major data source of library chat service research, researchers have used variants of quantitative and qualitative research methods.8 coding-based content analysis with or without predefined categories is one type of qualitative method.9 the other type of qualitative research method is conversation or language usage analysis, but it is not a dominant type of research method compared with coding-based qualitative content analysis.10 the most common quantitative methods are simple descriptive count- or frequency-based analyses that are accompanied by qualitative coding-based content analyses.11 in some recent research, advanced quantitative research methods, such as cluster analysis and topic modeling techniques, have been used, but they have not been fully explored yet with a wide range of techniques.12
chat transcript analysis methods in nonlibrary settings
as shown in table 1, researchers in nonlibrary settings also used research methods in analyzing chat data from diverse technology platforms or contexts, ranging from qualitative manual coding methods to data mining and machine learning techniques. topic modeling techniques are one of the chat analysis methods, but again, it seems that they have not been fully explored yet in chat analyses in nonlibrary settings, even though they have been used in a wide range of contexts.13
table 1.
chat transcript analysis applications in non-library settings
disciplines | platforms/sources of chat transcript data | chat transcript analysis methods/tools/techniques
education | chat rooms and text chat14 | qualitative content analysis
health | social media15 | qualitative & quantitative content analysis
business | in-game chat features and chatbots16 | a spell-checker, readability scores, the number of spelling and grammatical errors, linguistic inquiry and word count (liwc) program, logistic regression analysis, decision tree, support vector machine (svm)
criminology | instant messengers, internet relay chat (irc) channels, internet-based chat logs, and social media17 | liwc program, cluster analysis, latent dirichlet allocation (lda)
topic modeling techniques and their strengths and weaknesses
as a quantitative and statistical method appropriate for analyzing a vast amount of chat transcript data, researchers from both library and nonlibrary settings used topic modeling. as shown in table 2, conventional topic modeling techniques include latent semantic analysis, probabilistic latent semantic analysis, and latent dirichlet allocation, each of which has its unique strengths and weaknesses.18 in order to overcome weaknesses of the conventional techniques, researchers have developed alternative techniques. for example, dirichlet multinomial mixture (dmm) has been proposed to overcome data sparsity problems in short texts.19 as another example, correlation explanation (corex) has been proposed to avoid time and effort to identify topics and their structure ahead of time.20 last, guided lda (guidedlda) has been proposed to improve performance of infrequently occurring topics.21
table 2.
strengths and weaknesses of conventional topic modeling techniques

technique | acronym | definition | strengths | weaknesses
latent semantic analysis | lsa | a document is represented as a vector of numbers found by applying dimensionality reduction (specifically, truncated svd) to summarize the frequencies of co-occurring words across documents. | can deal with polysemy (multiple meanings) to some extent. | is hard to obtain and to determine the optimal number of topics.
probabilistic latent semantic analysis | plsa | a document is represented as vectors, but these vectors have nonnegative entries summing to 1 such that each component (topic) represents the relative prominence of some probabilistic mixture of words in the corpus. topics in a document are “probabilistic instead of the heuristic geometric distances.”22 | can deal with polysemy issues; provides easy interpretation in terms of word, document, and topic probabilities. | has over-fitting problems.
latent dirichlet allocation | lda | a bayesian extension of plsa that adds assumptions about the relative probability of observing different documents' distributions over topics. | prevents overfitting problems; provides a fully bayesian probabilistic interpretation. | does not show relationships among topics.

data, preprocessing, analysis, and evaluation

this section first introduces the data used for this study. next, it explains the procedures of each stage, from preprocessing to analyzing chat transcript data using different types of conventional and alternative topic modeling techniques. last, it discusses quantitative and qualitative evaluation of the quality of topic outcomes across the different topic modeling techniques. for more details, including python scripts, please visit our github page at https://github.com/mfienup/uni-library-chat-study.
data

this study collected the university of northern iowa’s rod library chat reference data dated from april 10, 2015, to may 31, 2019 (irb#18-0225). this raw chat data was downloaded from libchat in the form of an excel spreadsheet with 9,942 english chat transcripts, with each transcript as a separate row.

preprocessing

as the first step, this study removed unnecessary components of each chat transcript using a custom python script. components removed were timestamps, patron and librarian identifiers, http tags (e.g., urls), and non-ascii characters. next, it processed the resulting text words using python’s natural language toolkit (https://www.nltk.org/) and its wordnetlemmatizer function (https://www.nltk.org/_modules/nltk/stem/wordnet.html) to normalize words for further analyses. as the final step, it prepared four types of data sets to identify which type would produce better topic outcomes. the four types of data sets were as follows:

• question-only: consists of only the initial question asked by the library patron in each chat transcript. only the latter 10.7% of the chats recorded in the excel spreadsheet contained an initial question column entry. the remaining chats were assumed to contain their initial question in the patron’s first response if it was longer than a trivial welcome message.
• whole-chat: consists of the whole chat transcripts from the library patron and librarians.
• whole-chat with nouns and adjectives: consists of only nouns and adjectives as parts of speech (pos) from the whole chat transcripts.
• whole-chat with nouns, adjectives, and verbs: consists of only nouns, adjectives, and verbs as pos from the whole chat transcripts.
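the cleaning step and the question-only fallback described above can be sketched roughly as follows. this is a minimal illustration, not the study's script: the regular expressions, the speaker labels "patron" and "librarian", the function names, and the min_tokens threshold are all assumptions, since the libchat export format and the exact rule for a "trivial welcome message" are not given in the text. lemmatization and part-of-speech filtering with nltk are left out so that the sketch runs with the standard library alone.

```python
import re

# Hypothetical patterns: the real LibChat export layout is not shown in the article.
URL_RE = re.compile(r"https?://\S+")
TIME_RE = re.compile(r"\b\d{1,2}:\d{2}(?::\d{2})?\b")
SPEAKER_RE = re.compile(r"^\s*(?:patron|librarian)\s*:\s*",
                        re.IGNORECASE | re.MULTILINE)
TOKEN_RE = re.compile(r"[a-z]+")


def clean_transcript(raw):
    """Strip URLs, timestamps, speaker labels, and non-ASCII characters,
    then return lowercase alphabetic tokens."""
    text = URL_RE.sub(" ", raw)
    text = TIME_RE.sub(" ", text)
    text = SPEAKER_RE.sub(" ", text)
    text = text.encode("ascii", errors="ignore").decode("ascii")
    return TOKEN_RE.findall(text.lower())


def question_only(initial_question, patron_messages, min_tokens=4):
    """Return tokens of the initial question.

    Falls back to the first patron message with at least `min_tokens`
    tokens (an assumed stand-in for "longer than a trivial welcome
    message") when the initial-question field is empty.
    """
    if initial_question and initial_question.strip():
        return clean_transcript(initial_question)
    for message in patron_messages:
        tokens = clean_transcript(message)
        if len(tokens) >= min_tokens:
            return tokens
    return []


if __name__ == "__main__":
    chat = ("10:42:07 Patron: How do I renew a book? https://example.org/renew\n"
            "10:43:15 Librarian: Log in to your account.")
    print(clean_transcript(chat))
    print(question_only("", ["hi", "how do i renew a book"]))
```

the two whole-chat data sets restricted by part of speech would need an additional tagging pass over these tokens, which is not shown here.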
the first two data sets were prepared to examine whether the first question initiated by each patron or the whole chat transcript would help produce better topic outcomes. the last two data sets were prepared to examine which retained parts of speech would help produce better topic outcomes.

data analysis with conventional topic modeling techniques

this study first analyzed chat reference data using three conventional topic modeling techniques: latent semantic analysis (lsa), probabilistic latent semantic analysis (plsa), and two versions of latent dirichlet allocation (lda), as shown in table 3. all three techniques are examples of unsupervised topic modeling techniques that automatically analyze text data from a set of documents (in this study, a set of chat transcripts) to infer predominant topics or themes across all documents without human help. a key challenge, or a key parameter to be determined, for unsupervised topic modeling techniques is the optimal number of topics. the study ran the commonly used lda technique on the whole-chat data set with various numbers of topics. fifteen was chosen as the optimal number of topics for this study by calculating and comparing the log-likelihood scores across the various numbers of topics.

table 3.
conventional topic modeling techniques and their sources

technique | programming language | implementation source | version used in the study
latent semantic analysis | python | https://pypi.org/project/gensim/ | 3.8.1
probabilistic latent semantic analysis | python | https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.nmf.html | 0.21.3
latent dirichlet allocation (with sklearn) | python | https://scikit-learn.org/stable/modules/generated/sklearn.decomposition.latentdirichletallocation.html | 0.21.3
latent dirichlet allocation (with pymallet) | python | https://github.com/mimno/pymallet | dated february 26, 2019

also, before analyzing chat transcript data using lsa and plsa, this study performed a term frequency–inverse document frequency (tf–idf) transformation. tf–idf is a measure of how important a word is to a document (i.e., a single chat transcript) compared to its relevance in a collection of all documents.

data analysis with alternative topic modeling techniques

in addition to conventional topic modeling techniques, this study analyzed chat reference data using three alternative techniques: dirichlet multinomial mixture (dmm), anchored correlation explanation (corex), and guided lda (guidedlda), as shown in table 4. this study selected dmm as an alternative unsupervised topic modeling technique that has been developed for short texts. also, this study selected anchored corex and guidedlda as semi-supervised topic modeling techniques that require human-curated sets of words, called anchors or seeds, which nudge topic models toward including the suggested anchors. this is based on the assumption that human-curated techniques would help produce better-quality topics than the unsupervised techniques. for example, the three words “interlibrary,” “loan,” and “request,” or the two words “article” and “database,” are possible anchor words in the context of library chat transcripts. such anchor words can appear anywhere within a chat in any order.
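as a rough illustration of the tf–idf transformation applied before lsa and plsa, the sketch below computes the textbook weighting (relative term frequency times the log of inverse document frequency) over tokenized chats. it is an assumption-laden example rather than the study's code: the function name tf_idf and the sample chats are hypothetical, and library implementations such as scikit-learn's tfidfvectorizer use a smoothed, normalized variant by default, so their numbers will differ.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document.

    weight = (count of term in doc / doc length) * ln(N / document frequency)
    This is the plain textbook form, chosen for illustration only.
    """
    n_docs = len(documents)
    doc_freq = Counter()
    for tokens in documents:
        doc_freq.update(set(tokens))  # count each term once per document

    weights = []
    for tokens in documents:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


if __name__ == "__main__":
    # Hypothetical, already-preprocessed chats.
    chats = [
        ["renew", "book", "online"],
        ["reserve", "study", "room"],
        ["renew", "interlibrary", "loan", "book"],
    ]
    for row in tf_idf(chats):
        print({term: round(weight, 3) for term, weight in row.items()})
```

a word that appears in every chat receives a weight of zero under this form, which is the sense in which tf–idf discounts words that are common across the whole collection.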
table 4. alternative topic modeling techniques and their sources

unsupervised vs. semi-supervised | technique | programming language | implementation source | version used in the study
unsupervised | dirichlet multinomial mixture (dmm) | java | https://github.com/qiang2100/sttm | 9/27/2019
semi-supervised | anchored correlation explanation (corex) | python | https://github.com/gregversteeg/corex_topic | 1/21/2020
semi-supervised | guided lda using collapsed gibbs sampling | python | https://guidedlda.readthedocs.io/en/latest/ | 10/5/2017

given that a known set of anchor words associated with academic library chats seems unavailable in the literature, this study decided to obtain a list of most meaningful anchor words by combining outcomes of the unsupervised techniques with a human’s follow-up curation, as follows:

step 1. execute unsupervised topic modeling techniques
step 2. combine resulting topics from all unsupervised topic modeling techniques
step 3. identify a list of all possible pairs of words (bi-occurrences), e.g., 28 pairs of words if each topic has 8 words, and all possible combinations of tri-occurrences of words
step 4.
identify the most common bi-occurrences and tri-occurrences of words across all topics by sorting them in descending order of frequency
step 5. select a set of anchors from these bi-occurrences and tri-occurrences of words by human judgment

to select a set of anchor words, the librarian author of this paper judged whether the combinations of words in each row from step 4 were aligned with frequently asked questions or easily inferable themes in academic library contexts. as shown in table 5, the set “interlibrary,” “loan,” and “request” was selected as anchor words because it is aligned with one frequently asked question about interlibrary loan requests, whereas the set “access,” “librarian,” and “research” was not selected because multiple themes, such as access to resources and asking for research help from librarians, can be inferred from it. additionally, the set “hour,” “time,” and “today” was selected over the set “time,” “tomorrow,” and “tonight” as better or clearer anchor words.

table 5.
examples of anchor words that were selected and not selected

examples of tri-occurrences of words (note: strikethrough denotes a set of words that were not selected as anchor words)
1 interlibrary loan request
2 hour time today
3 time tomorrow tonight
4 time today tomorrow
5 floor librarian research
6 access librarian research
7 camera digital hub
8 digital hub medium
9 access article journal
10 access article database
11 access account campus
12 research source topic
13 paper research topic

quantitative evaluation with topic coherence metrics

comparing the quality of topic outcomes across various topic modeling techniques is tricky. purely statistical and quantitative evaluation techniques, such as held-out log-likelihood measures, have proven to be unaligned with human intuition or judgment with respect to topic interpretability and coherency.23 thus, this study adopted the three topic coherence metrics of tc-pmi (normalized pointwise mutual information), tc-lcp (normalized log conditional probability), and tc-nz (number of topic word pairs never observed together in the corpus) that have been introduced by boyd-graber, mimno, and newman; bouma; and lau, newman, and baldwin.24 these three metrics are based on the assumption that two words that co-occur in a topic are also likely to co-occur within a corpus.

to utilize the three topic coherence metrics, the study counted term co-occurrences with a binarized choice (e.g., does a transcript contain two words?) instead of a sliding window of fixed size (e.g., do two words appear within a fixed window of 10 consecutive words?). this decision was made because each chat transcript is relatively short, and a fixed window size seemed inconsistent across different types of data sets that included different parts of speech.
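the binarized co-occurrence counting described above can be illustrated with the sketch below, which scores a single topic against a set of tokenized transcripts. the exact formulas and edge-case handling used in the study are not spelled out in the text, so everything here is an assumption made for illustration: tc-pmi is taken as the mean normalized pmi over word pairs, tc-lcp as the mean log conditional probability over pairs that co-occur at least once, tc-nz as the count of pairs that never share a transcript, and never-co-occurring pairs are given an npmi of -1. the function name topic_coherence is hypothetical.

```python
import math
from itertools import combinations


def topic_coherence(topic_words, documents):
    """Score one topic using binarized, transcript-level co-occurrence."""
    doc_sets = [set(tokens) for tokens in documents]
    n_docs = len(doc_sets)

    def doc_count(*words):
        # Number of transcripts containing every given word at least once.
        return sum(1 for doc in doc_sets if all(w in doc for w in words))

    npmi_scores, lcp_scores, never_together = [], [], 0
    for w1, w2 in combinations(topic_words, 2):
        c1, c12 = doc_count(w1), doc_count(w1, w2)
        c2 = doc_count(w2)
        if c12 == 0:
            never_together += 1
            npmi_scores.append(-1.0)  # assumed convention for unseen pairs
            continue
        p1, p2, p12 = c1 / n_docs, c2 / n_docs, c12 / n_docs
        pmi = math.log(p12 / (p1 * p2))
        # NPMI is undefined when the pair is in every transcript; use 1.0.
        npmi_scores.append(pmi / -math.log(p12) if p12 < 1.0 else 1.0)
        lcp_scores.append(math.log(p12 / p1))

    return {
        "tc_pmi": sum(npmi_scores) / len(npmi_scores) if npmi_scores else 0.0,
        "tc_lcp": sum(lcp_scores) / len(lcp_scores) if lcp_scores else 0.0,
        "tc_nz": never_together,
    }


if __name__ == "__main__":
    # Hypothetical, already-preprocessed transcripts.
    transcripts = [
        ["renew", "book"], ["renew", "book"],
        ["print", "color"], ["print", "color"],
    ]
    print(topic_coherence(["renew", "book"], transcripts))
    print(topic_coherence(["renew", "print"], transcripts))
```

an eight-word topic yields 28 word pairs under this scheme, and a score for a whole model would be obtained by averaging the per-topic values over its topics.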
as for the other decision required to apply the three topic coherence metrics, this study chose a training corpus of all the chat transcripts instead of an external corpus, such as the entire collection of english wikipedia articles, which has little in common with typical library chat transcripts.

qualitative evaluation with human judgment

in addition to quantitative evaluation with topic coherence metrics, qualitative accuracy and interpretability were judged by the librarian author of this paper based on whether topics were aligned with frequently asked questions or easily inferable themes in academic library contexts. for example, “find or access book or article” was inferred, from the set of words in topic 1 on lsa in table 6, as an accurate and easily interpretable theme. from the set of words in topic 3 on lda, “reserve study room” and “check out laptop computer” were inferred as two separable, easily interpretable themes. from the set of words in topic 15 on corex with nine anchors, no easily interpretable theme was inferred. (see table 10 in the results section for all themes inferred from table 6.)

table 6. examples of topics found by topic modeling techniques

topic modeling technique | topics (top 15 topics with eight words per topic) note: parenthetical additions are explanations or descriptions and not part of the topic.

latent semantic analysis (lsa)
topic 1. article book search find access link will check
topic 2. renew book article room reserve search journal check
topic 3. room renew reserve book study scheduler loan online
topic 4. renew request loan interlibrary search room review peer
topic 5. loan floor renew access interlibrary request log book
topic 6. book open print request search loan renew interlibrary
topic 7. print floor open printer color hour research pm
topic 8.
open hour print search review close peer floor
topic 9. print access renew research book loan librarian open
topic 10. floor article open book renew print locate database
topic 11. article book attach file print database floor check
topic 12. check book desk laptop answer print shortly open
topic 13. answer desk shortly place room database circulation pick
topic 14. review peer search reserve log access campus database
topic 15. database file attach collection access journal research reserve

probabilistic latent semantic analysis (plsa)
topic 1. collection special youth contact email number archive department
topic 2. book title hold online check pick number reserve
topic 3. room reserve study scheduler reservation group rodscheduler (software) space
topic 4. search bar click type journal onesearch (a discovery tool) result homepage
topic 5. request loan interlibrary link illiad (system) submit inter instruction
topic 6. renew online account book today number circulation item
topic 7. access link log campus click work online sign
topic 8. article journal attach file title access google scholar
topic 9. research librarian paper appointment consultation source topic question
topic 10. open hour today close pm tomorrow midnight tonight
topic 11. check answer place shortly desk laptop student long
topic 12. print color printer computer printing mobile release black
topic 13. floor locate desk stack main fourth number section
topic 14. database az subject ebsco (database) list business topic access
topic 15. review peer journal topic sociology study article result

latent dirichlet allocation (lda) with sklearn
topic 1.
file attach cite citation link article author pdf
topic 2. check book renew student item today time member
topic 3. room reserve computer laptop study check reservation desk
topic 4. book request loan interlibrary check title online copy
topic 5. search article database review result type google bar
topic 6. student class access iowa course university college fall
topic 7. research librarian source paper topic good appointment specific
topic 8. email contact chat good librarian work question address
topic 9. open hour today check pick hold desk close
topic 10. link access click log work campus sign database
topic 11. floor locate desk main art music circulation section
topic 12. medium digital check video hub desk rent camera
topic 13. article journal access title online link education amp
topic 14. print printer color card scan document charge job
topic 15. answer check place collection shortly special question number

dirichlet multinomial mixture (dmm)
topic 1. room reserve how will study check floor what
topic 2. request loan book interlibrary how article will link
topic 3. article access find journal link how search full
topic 4. book how find check what online link will
topic 5. article find attach file what how will link
topic 6. how check open today desk hour will what
topic 7. find article what search how research source database
topic 8. how print will cite printer link what citation
topic 9. search article find how review will database journal
topic 10. book find floor how will where call number
topic 11. book check how renew will today request what
topic 12. research how librarian find what article will email
topic 13. find how will contact collection what special email
topic 14. access article link log how campus database work
topic 15. article find will search what link book how

anchored correlation explanation (corex) with nine anchor words
topic 1. request loan interlibrary illiad (system) form submit inter fill
topic 2.
study reserve room scheduler hub medium equipment digital
topic 3. search review peer bar result type onesearch (a discovery tool) homepage
topic 4. today open hour pm assist close window midnight
topic 5. locate floor main where third fourth desk stack
topic 6. print printer color printing black white mobile release
topic 7. number collection special call phone youth archive xxx
topic 8. research librarian appointment consultation paper set xxx transfer
topic 9. access database journal article campus full az text
topic 10. email will contact work when good who student
topic 11. education read school class professor amp teacher child
topic 12. topic source cite write apa start citation recommend
topic 13. find attach file google what scholar title specific
topic 14. click log link left side catid button hand
topic 15. shortly place answer check cedar fall iowa northern

guidedlda with nine anchor words and confidence 0.75
topic 1. book request loan interlibrary will how check link
topic 2. room reserve how check will desk study medium
topic 3. search article find how will database book review
topic 4. book check how renew today will hour open
topic 5. book floor find how check where call locate
topic 6. print how computer will printer color desk student
topic 7. contact collection will find email special how check
topic 8. research librarian find how what will email article
topic 9. article access link how log click database find
topic 10. article find how access what link attach file
topic 11. find chat copy how good online what will
topic 12. article find file attach what journal will work
topic 13. how check book answer place shortly what find
topic 14.
book how find what sport link video textbook
topic 15. how cite what find citation author article source

results

this section first reports which topic modeling techniques, as well as which type of data set, performed the best on each of the three topic coherence metrics. it then reports which technique was the best according to human qualitative judgment.

quantitative evaluation with topic coherence metrics

given that larger values of the tc-pmi metric mean more coherent topics, table 7 and its corresponding figure 1 show that corex with anchor words on the whole-chat data set performed best on tc-pmi. tf–idf & plsa on the whole-chat data set performed better than lda on the whole-chat data set.

given that larger values of the tc-lcp metric mean more coherent topics, table 8 and its corresponding figure 2 show that dmm on the whole-chat data set performed best on tc-lcp. tf–idf & plsa on the whole-chat data set performed better than lda (sklearn), even though lda (pymallet) on the whole-chat data set performed better than tf–idf & plsa on the whole-chat data set.

given that smaller values of the tc-nz metric mean more coherent topics, table 9 and its corresponding figure 3 show that tf–idf & plsa, lda (sklearn), and lda (pymallet) on the whole-chat data set performed best on tc-nz.

table 7.
tc-pmi comparison of topic modeling techniques on the four types of data sets (with top 15 topics with eight words per topic)

topic modeling technique | whole-chat | whole-chat (noun, adjective, verb) | whole-chat (noun, adjective) | question-only
tf–idf & lsa | -0.066 | -0.061 | -0.063 | -0.429
tf–idf & plsa | 0.508 | 0.321 | 0.494 | -0.122
lda (sklearn) | 0.378 | 0.261 | 0.099 | -0.995
lda (pymallet) | 0.218 | 0.262 | 0.271 | -0.091
dmm | 0.136 | 0.22 | 0.285 | 0.109
corex without anchor words | 0.47 | 0.497 | 0.396 | -0.584
corex with nine anchor words | 0.522 | 0.534 | 0.558 | -0.401
guidedlda with nine anchor words and confidence 0.75 | 0.133 | 0.216 | 0.262 | 0.069

figure 1. tc-pmi comparison of topic modeling techniques on the four types of data sets.

table 8. tc-lcp comparison of topic modeling techniques on the four types of data sets (with top 15 topics with eight words per topic)

topic modeling technique | whole-chat | whole-chat (noun, adjective, verb) | whole-chat (noun, adjective) | question-only
tf–idf & lsa | -1.114 | -1.124 | -1.204 | -1.675
tf–idf & plsa | -0.751 | -0.793 | -0.893 | -1.956
lda (sklearn) | -0.789 | -0.979 | -1.263 | -2.827
lda (pymallet) | -0.637 | -0.767 | -0.918 | -1.626
dmm | -0.546 | -0.645 | -0.731 | -1.159
corex without anchor words | -0.868 | -0.853 | -1.062 | -2.618
corex with nine anchor words | -0.82 | -0.791 | -0.884 | -2.348
guidedlda with nine anchor words and confidence 0.75 | -0.637 | -0.686 | -0.792 | -1.143

figure 2. tc-lcp comparison of topic modeling techniques on the four types of data sets.

table 9.
tc-nz comparison of topic modeling techniques on the four types of data sets (with top 15 topics with eight words per topic)

topic modeling technique | whole-chat | whole-chat (noun, adjective, verb) | whole-chat (noun, adjective) | question-only
tf–idf & lsa | 0.267 | 0.267 | 0.333 | 1.8
tf–idf & plsa | 0 | 0 | 0.067 | 3.8
lda (sklearn) | 0 | 0.467 | 1.2 | 7.067
lda (pymallet) | 0 | 0.133 | 0.267 | 1.8
dmm | 0.067 | 0 | 0 | 0.267
corex without anchor words | 0.333 | 0.067 | 0.6 | 7.067
corex with nine anchor words | 0.133 | 0 | 0.133 | 5.267
guidedlda with nine anchor words and confidence 0.75 | 0.2 | 0.067 | 0 | 0.133

figure 3. tc-nz comparison of topic modeling techniques on all four data sets.

last, tables 7 to 9 and their corresponding figures 1 to 3 show that the whole-chat data set with all parts of speech was generally the best data set across the techniques.

qualitative evaluation with human judgment

as shown in table 10, all techniques had relatively high accuracy and interpretability in terms of straightforward topics or themes in italicized text, such as “interlibrary loan,” “technology,” “hours,” and “room reservations,” where one keyword could represent a whole theme. however, in terms of less-straightforward topics or themes, plsa performed better than the other techniques. in other words, plsa had the highest number of topics that are aligned clearly with frequently asked questions or are easily inferable themes in academic library contexts. also, plsa had a lower number of unrelated or multiple themes within one topic, whereas other techniques had a higher number of unrelated or multiple themes within one topic. as an example, topic 8 on dmm shows that “print” and “citation” can be inferred as two unrelated themes within one topic.

table 10.
examples of themes qualitatively inferred from a list of words (a topic) identified by each topic modeling technique

topic modeling technique | themes inferred from table 6 (note: italics denotes straightforward themes; and strikethrough denotes themes with no interpretability or unrelated, multiple themes within one topic)

latent semantic analysis (lsa)
topic 1. find or access book or article
topic 2. renew book or article; reserve a room; search journal
topic 3. renew book online; reserve room; loan
topic 4. renew; interlibrary loan; search; room
topic 5. renew book; interlibrary loan; floor
topic 6. renew; interlibrary loan print book; search
topic 7. print color; floor; hours; research
topic 8. hours; print; search; peer review; floor
topic 9. print; renew book; librarian; open hours
topic 10. renew book and article, print, floor and locate; database
topic 11. print; database; floor
topic 12. check out book or laptop; print; open
topic 13. circulation desk; room; database
topic 14. not clear
topic 15. not clear

probabilistic latent semantic analysis (plsa)
topic 1. contact information of special collection and youth
topic 2. not clear
topic 3. room reservation
topic 4. journal search and onesearch
topic 5. interlibrary loan request
topic 6. how to renew book online
topic 7. working from off campus (not clear)
topic 8. journal article via google scholar
topic 9. appointment with librarians for research consultations
topic 10. open hours
topic 11. not clear
topic 12. printing
topic 13. stack on the fourth floor
topic 14. databases a-z for business including ebsco
topic 15.
peer reviewed journals for sociology

latent dirichlet allocation (lda) with sklearn
topic 1. not clear
topic 2. not clear
topic 3. reserve study room; check out laptop computer
topic 4. interlibrary loan online
topic 5. search article via databases
topic 6. not clear
topic 7. appointment with research librarians
topic 8. contact librarian via email
topic 9. open hours
topic 10. database access from off campus
topic 11. floor for art and music circulation desk
topic 12. rent camera
topic 13. access journal article
topic 14. printing and charge
topic 15. special collection

dirichlet multinomial mixture (dmm)
topic 1. reserve study room and floor
topic 2. interlibrary loan
topic 3. search and access article
topic 4. find book online
topic 5. find article (not clear)
topic 6. open hours
topic 7. find article and database
topic 8. print; citation
topic 9. find article & database
topic 10. find book with call number
topic 11. renew book (not clear)
topic 12. email librarians for research help
topic 13. special collection (not clear)
topic 14. access article/database from on campus
topic 15. find article (not clear)

anchored correlation explanation (corex) with nine anchor words
topic 1. interlibrary loan
topic 2. reserve study room; equipment
topic 3. peer-reviewed and onesearch
topic 4. open hours
topic 5. floor location
topic 6. printing
topic 7. special collection and phone number
topic 8. research consultation appointment
topic 9. access database a-z
topic 10. not clear
topic 11. not clear
topic 12. apa citations
topic 13. google scholar (not clear)
topic 14. log in
topic 15.
not clear

guidedlda with nine anchor words and confidence 0.75
topic 1. interlibrary loan
topic 2. reserve study room & medium
topic 3. search and find article; databases
topic 4. renew book; hours
topic 5. find book with call number
topic 6. printing
topic 7. special collection
topic 8. email to research librarian
topic 9. access article and databases
topic 10. access article; attach file (not clear)
topic 11. not clear
topic 12. find article and journal; file attach (not clear)
topic 13. not clear
topic 14. find book, video, and textbook about sport
topic 15. citation

discussion

given that different topic modeling techniques performed the best depending on the topic coherence metric, it is not possible to conclude firmly that one technique is better than the others. interestingly, the commonly used technique lda, tested with both sklearn and pymallet in this study, did not consistently outperform tf–idf & plsa. in addition, the semi-supervised techniques of anchored correlation explanation (corex) and guided lda (guidedlda) did not necessarily outperform the unsupervised technique of dirichlet multinomial mixture (dmm). last, according to human qualitative judgment, plsa performed the best, which is aligned with the findings on tc-nz. this might imply that tc-nz is a more appropriate metric than the other metrics for measuring topic coherence in the context of academic library chat transcripts.

in terms of different types of data sets, all three of the whole-chat data sets substantially outperformed the question-only data set. at the outset of the study, it was conjectured that the initial question of each chat transaction might concentrate the essence of each chat, thereby leading to better performance. clearly this was not the case, possibly because the rest of each chat transcript reinforces a topic by standardizing the vocabulary of the chat’s initial question.
it was somewhat interesting that varying the parts of speech (pos) retained in the three whole-chat data sets had little benefit for the topic modeling analyses. it might imply that topic modeling techniques are sensitive enough to differentiate across different parts of speech, thereby leading to good performance regardless of the type of data set.

conclusion

this study showed that conventional techniques should also be examined, to avoid errors arising from the assumption that newly developed techniques such as lda would always outperform them regardless of context. also, both quantitative and qualitative evaluations indicate that unsupervised techniques should be weighted equally with semi-supervised techniques that involve human intervention. as a future study, like other similar research, it would be meaningful to compare human qualitative judgment with the scores of each metric more rigorously, along with more librarians’ input, to confirm (or disconfirm) our preliminary conclusion that tc-nz is the most appropriate topic coherence metric in the context of library chat transcripts.25 it would also be interesting to examine semi-supervised techniques with different types of anchoring approaches, such as tandem anchoring.26 last, to overcome the limitations of this study, it would be valuable to collect more, and more diverse, chat reference data and to compare the resulting topics across different types of institutions (e.g., teaching versus research institutions).

acknowledgments

this project was made possible in part by the institute of museum and library services [national leadership grants for libraries, lg-34-19-0074-19].

endnotes

1 christina m. desai and stephanie j. graves, “cyberspace or face-to-face: the teachable moment and changing reference mediums,” reference & user services quarterly 47, no.
3 (spring 2008): 242–55, https://www.jstor.org/stable/20864890; megan oakleaf and amy vanscoy, “instructional strategies for digital reference: methods to facilitate student learning,” reference & user services quarterly 49, no. 4 (summer 2010): 380–90, https://www.jstor.org/stable/20865299; shu z. schiller, “chat for chat: mediated learning in online chat virtual reference service,” computers in human behavior 65 (july 2016): 651–65, https://doi.org/10.1016/j.chb.2016.06.053; mila semeshkina, “five major trends in online education to watch out for in 2021,” forbes, february 2, 2021, https://www.forbes.com/sites/forbesbusinesscouncil/2021/02/02/five-major-trends-in-online-education-to-watch-out-for-in-2021/?sh=3261272521eb.

2 maryvon côté, svetlana kochkina, and tara mawhinney, “do you want to chat? reevaluating organization of virtual reference service at an academic library,” reference & user services quarterly 56, no. 1 (fall 2016): 36–46, https://www.jstor.org/stable/90009882; sarah lemire, lorelei rutledge, and amy brunvand, “taking a fresh look: reviewing and classifying reference statistics for data-driven decision making,” reference & user services quarterly 55, no. 3 (spring 2016): 230–38, https://www.jstor.org/stable/refuseserq.55.3.230; b. jane scales, lipi turner-rahman, and feng hao, “a holistic look at reference statistics: whither librarians?,” evidence based library and information practice 10, no. 4 (december 2015): 173–85, https://doi.org/10.18438/b8x01h.
3 pamela j. howard, “can academic library instant message transcripts provide documentation of undergraduate student success?,” journal of web librarianship 13, no. 1 (february 2019): 61–87, https://doi.org/10.1080/19322909.2018.1555504.

4 côté, kochkina, and mawhinney, “do you want to chat?”; sharon q. yang and heather a. dalal, “delivering virtual reference services on the web: an investigation into the current practice by academic libraries,” journal of academic librarianship 41, no. 1 (november 2015): 68–86, https://doi.org/10.1016/j.acalib.2014.10.003.

5 feifei liu, “how information-seeking behavior has changed in 22 years,” nn/g nielsen norman group, january 26, 2020, https://www.nngroup.com/articles/information-seeking-behavior-changes/; amanda spink and jannica heinström, eds., new directions in information behavior (bingley, uk: emerald group publishing limited, 2011).

6 kathryn barrett and amy greenberg, “student-staffed virtual reference services: how to meet the training challenge,” journal of library & information services in distance learning 12, no. 3–4 (august 2018): 101–229, https://doi.org/10.1080/1533290x.2018.1498620; robin canuel et al., “developing and assessing a graduate student reference service,” reference services review 47, no. 4 (november 2019): 527–43, https://doi.org/10.1108/rsr-06-2019-0041.
7 bhagyashree vyankatrao barde and anant madhavrao bainwad, “an overview of topic modeling methods and tools,” in proceedings of international conference on intelligent computing and control systems, 2018, 745–50, https://doi.org/10.1109/iccons.2017.8250563; jordan boyd-graber, david mimno, and david newman, “care and feeding of topic models: problems, diagnostics, and improvements,” in handbook of mixed membership models and their applications, eds. edoardo m. airoldi et al. (new york: crc press, 2014), 225–54.

8 miriam l. matteson, jennifer salamon, and lindy brewster, “a systematic review of research on live chat service,” reference & user services quarterly 51, no. 2 (winter 2011): 172–89, https://www.jstor.org/stable/refuseserq.51.2.172.

9 kate fuller and nancy h. dryden, “chat reference analysis to determine accuracy and staffing needs at one academic library,” internet reference services quarterly 20, no. 3–4 (december 2015): 163–81, https://doi.org/10.1080/10875301.2015.1106999; sarah passonneau and dan coffey, “the role of synchronous virtual reference in teaching and learning: a grounded theory analysis of instant messaging transcripts,” college & research libraries 72, no. 3 (2011): 276–95, https://doi.org/10.5860/crl-102rl.

10 paula r. dempsey, “‘are you a computer?’ opening exchanges in virtual reference shape the potential for teaching,” college & research libraries 77, no. 4 (2016): 455–68, https://doi.org/10.5860/crl.77.4.455; jennifer waugh, “formality in chat reference: perceptions of 17- to 25-year-old university students,” evidence based library and information practice 8, no. 1 (2013): 19–34, https://doi.org/10.18438/b8ws48.

11 robin brown, “lifting the veil: analyzing collaborative virtual reference transcripts to demonstrate value and make recommendations for practice,” reference & user services quarterly 57, no.
1 (fall 2017): 42–47, https://www.jstor.org/stable/90014866; sarah maximiek, elizabeth brown, and erin rushton, “coding into the great unknown: analyzing instant messaging session transcripts to identify user behaviors and measure quality of service,” college & research libraries 71, no. 4 (2010): 361–73, https://doi.org/10.5860/crl-48r1.

12 christopher brousseau, justin johnson, and curtis thacker, “machine learning based chat analysis,” code4lib journal 50 (february 2021), https://journal.code4lib.org/articles/15660; ellie kohler, “what do your library chats say?: how to analyze webchat transcripts for sentiment and topic extraction,” in brick & click libraries conference proceedings (maryville, mo: northwest missouri state university, 2017), 138–48, https://files.eric.ed.gov/fulltext/ed578189.pdf; megan ozeran and piper martin, “‘good night, good day, good luck,’” information technology and libraries 38, no. 2 (june 2019): 49–57, https://doi.org/10.6017/ital.v38i2.10921; thomas stieve and niamh wallace, “chatting while you work: understanding chat reference user needs based on chat reference origin,” reference services review 46, no.
4 (november 2018): 587–99, https://doi.org/10.1108/rsr-09-2017-0033; nadaleen tempelman-kluit and alexa pearce, “invoking the user from data to design,” college & research libraries 75, no. 5 (2014): 616–40, https://doi.org/10.5860/crl.75.5.616.

13 jordan boyd-graber, yuening hu, and david mimno, “applications of topic models,” foundations and trends in information retrieval 11, no. 2–3 (2017): 143–296, https://mimno.infosci.cornell.edu/papers/2017_fntir_tm_applications.pdf.

14 ewa m. golonka, medha tare, and carrie bonilla, “peer interaction in text chat: qualitative analysis of chat transcripts,” language learning & technology 21, no. 2 (june 2017): 157–78, http://hdl.handle.net/10125/44616; laura d. kassner and kate m. cassada, “chat it up: backchanneling to promote reflective practice among in-service teachers,” journal of digital learning in teacher education 33, no. 4 (august 2017): 160–68, https://doi.org/10.1080/21532974.2017.1357512.

15 eradah o. hamad et al., “toward a mixed-methods research approach to content analysis in the digital age: the combined content-analysis model and its applications to health care twitter feeds,” journal of medical internet research 18, no. 3 (march 2016): e60, https://doi.org/10.2196/jmir.5391; janet richardson et al., “tweet if you want to be sustainable: a thematic analysis of a twitter chat to discuss sustainability in nurse education,” journal of advanced nursing 72, no. 5 (january 2016): 1086–96, https://doi.org/10.1111/jan.12900.

16 shuyuan mary ho et al., “computer-mediated deception: strategies revealed by language-action cues in spontaneous communication,” journal of management information systems 33, no. 2 (october 2016): 393–420, https://doi.org/10.1080/07421222.2016.1205924; mina park, milam aiken, and laura salvador, “how do humans interact with chatbots?: an analysis of transcripts,” international journal of management & information technology 14 (2018): 3338–50, https://doi.org/10.24297/ijmit.v14i0.7921.
17 abdur rahman, m. a. basher, and benjamin c. m. fung, “analyzing topics and authors in chat logs for crime investigation,” knowledge and information systems 39, no. 2 (march 2014): 351–81, https://doi.org/10.1007/s10115-013-0617-y; michelle drouin et al., “linguistic analysis of chat transcripts from child predator undercover sex stings,” journal of forensic psychiatry & psychology 28, no. 4 (february 2017): 437–57, https://doi.org/10.1080/14789949.2017.1291707; da kuang, p. jeffrey brantingham, and andrea l. bertozzi, “crime topic modeling,” crime science 6, no. 12 (december 2017): 1–12, https://doi.org/10.1186/s40163-017-0074-0; md waliur rahman miah, john yearwood, and siddhivinayak kulkarni, “constructing an inter‐post similarity measure to differentiate the psychological stages in offensive chats,” journal of the association for information science and technology 66, no. 5 (january 2015): 1065–81, https://doi.org/10.1002/asi.23247.

18 charu c. aggarwal and chengxiang zhai, eds.
mining text data (new york: springer, 2012); rubayyi alghamdi and khalid alfalqi, “a survey of topic modeling in text mining,” international journal of advanced computer science and applications 6, no. 1 (2015): 146–53, https://doi.org/10.14569/ijacsa.2015.060121; leticia h. anaya, “comparing latent dirichlet allocation and latent semantic analysis as classifiers” (phd diss., university of north texas, 2011); barde and bainwad, “an overview of topic modeling”; david m. blei, “topic modeling and digital humanities,” journal of digital humanities 2, no. 1 (winter 2012), http://journalofdigitalhumanities.org/2-1/topic-modeling-and-digital-humanities-by-david-m-blei/; tse-hsun chen, stephen w. thomas, and ahmed e. hassan, “a survey on the use of topic models when mining software repositories,” empirical software engineering 21, no. 5 (september 2016): 1843–919, https://doi.org/10.1007/s10664-015-9402-8; elisabeth günther and thorsten quandt, “word counts and topic models: automated text analysis methods for digital journalism research,” digital journalism 4, no. 1 (october 2016): 75–88, https://doi.org/10.1080/21670811.2015.1093270; gabe ignatow and rada mihalcea, an introduction to text mining: research design, data collection, and analysis (new york: sage, 2017); stefan jansen, hands-on machine learning for algorithmic trading: design and implement investment strategies based on smart algorithms that learn from data using python (birmingham: packt publishing limited, 2018); lin liu et al., “an overview of topic modeling and its current applications in bioinformatics,” springerplus 5, no. 1608 (september 2016): 1–22, https://doi.org/10.1186/s40064-016-3252-8; john w. mohr and petko bogdanov, “introduction—topic models: what they are and why they matter,” poetics 41, no. 6 (december 2013): 545–69, https://doi.org/10.1016/j.poetic.2013.10.001; gerard salton, anita wong, and chung-shu yang, “a vector space model for automatic indexing,” communications of the acm 18, no.
11 (november 1975): 613–20, https://doi.org/10.1145/361219.361220; jianhua yin and jianyong wang, “a dirichlet multinomial mixture model-based approach for short text clustering,” in proceedings of the twentieth acm sigkdd international conference on knowledge discovery and data mining (new york: acm, 2014), 233–42, https://doi.org/10.1145/2623330.2623715; hongjiao xu et al., “exploring similarity between academic paper and patent based on latent semantic analysis and vector space model,” in proceedings of the twelfth international conference on fuzzy systems and knowledge discovery (new york: ieee, 2015), 801–5, https://doi.org/10.1109/fskd.2015.7382045; chengxiang zhai, statistical language models for information retrieval (williston, vt: morgan & claypool publishers, 2018).

19 neha agarwal, geeta sikka, and lalit kumar awasthi, “evaluation of web service clustering using dirichlet multinomial mixture model based approach for dimensionality reduction in service representation,” information processing & management 57, no.
4 (july 2020), https://doi.org/10.1016/j.ipm.2020.102238; chenliang li et al., “topic modeling for short texts with auxiliary word embeddings,” in proceedings of the thirty-ninth international acm sigir conference on research and development in information retrieval (new york: acm, 2016), 165–74, https://doi.org/10.1145/2911451.2911499; jipeng qiang et al., “short text topic modeling techniques, applications, and performance: a survey,” ieee transactions on knowledge and data engineering 14, no. 8 (april 2019): 1–17, https://doi.org/10.1109/tkde.2020.2992485.

20 ryan j. gallagher et al., “anchored correlation explanation: topic modeling with minimal domain knowledge,” transactions of the association for computational linguistics 5 (december 2017): 529–42, https://doi.org/10.1162/tacl_a_00078.
21 jagadeesh jagarlamudi, hal daumé iii, and raghavendra udupa, “incorporating lexical priors into topic models,” in proceedings of the thirteenth conference of the european chapter of the association for computational linguistics (stroudsburg, pa: acl, 2012), 204–13, https://www.aclweb.org/anthology/e12-1021; olivier toubia et al., “extracting features of entertainment products: a guided latent dirichlet allocation approach informed by the psychology of media consumption,” journal of marketing research 56, no. 1 (december 2019): 18–36, https://doi.org/10.1177/0022243718820559.

22 nan zhang and baojun ma, “constructing a methodology toward policy analysts for understanding online public opinions: a probabilistic topic modeling approach,” in electronic government and electronic participation, eds. efthimios tambouris et al. (amsterdam, netherlands: ios press bv, 2015): 72–9, https://doi.org/10.3233/978-1-61499-570-8-72.

23 jonathan chang et al., “reading tea leaves: how humans interpret topic models,” in proceedings of the twenty-second international conference on neural information processing systems (new york: acm, 2009), 288–96, https://dl.acm.org/doi/10.5555/2984093.2984126.

24 gerlof bouma, “normalized (pointwise) mutual information in collocation extraction,” in proceedings of the international conference of the german society for computational linguistics and language technology (tübingen, germany: gunter narr verlag, 2009), 43–53; boyd-graber, mimno, and newman, “care and feeding of topic models,” in handbook of mixed membership models and their applications, eds. edoardo m. airoldi, david m. blei, elena a. erosheva, and stephen e.
fienberg (boca raton: crc press, 2014), 225–54; jey han lau, david newman, and timothy baldwin, “machine reading tea leaves: automatically evaluating topic coherence and topic model quality,” in proceedings of the fourteenth conference of the european chapter of the association for computational linguistics (stroudsburg, pa: acl, 2014), 530–39, https://doi.org/10.3115/v1/e14-1056.

25 lau, newman, and baldwin, “machine reading tea leaves”; david newman et al., “automatic evaluation of topic coherence,” in proceedings of human language technologies: the 2010 annual conference of the north american chapter of the association for computational linguistics (new york: acm, 2010), 100–108, https://dl.acm.org/doi/10.5555/1857999.1858011.

26 jeffrey lund et al., “tandem anchoring: a multiword anchor approach for interactive topic modeling,” in proceedings of the fifty-fifth annual meeting of the association for computational linguistics (stroudsburg, pa: acl, 2017), 896–905, https://doi.org/10.18653/v1/p17-1083.
president’s message
twitter nodes to networks: thoughts on the #litaforum
rachel vacek
information technologies and libraries | december 2014 1

one thing that never ceases to amaze me is the technological talent and creativity of my library colleagues. the lita forum is a gathering of intelligent, fun, and passionate people who want to talk about technology and learn from one another. i suppose many conferences have lots of opportunities to network, but the size and friendliness of the forum makes it feel more like a comfortable place among friends. however, the utilization of technology always inspires me, and the networking and reconnecting with friends is rejuvenating.
so many more people are sharing their research and their presentations through twitter, and it’s fantastic in so many ways. so no matter what concurrent session you were in, or if you couldn’t even make it to albuquerque this year, you can still view most of the presentations, listen to the keynotes, see pictures of attendees, follow the backchannel, and engage with everyone on twitter. with libraries having tighter budgets, it’s extremely important that we continue to learn virtually. there are plenty of online workshops and webinars, but they often still cost money and don’t usually encourage much communication between attendees. “attending” the lita forum only through twitter is not only free, but the learning and sharing is more organic. you have the opportunity to engage with attendees, observers, and even the presenters themselves. structured workshops have their place for focused, more in-depth learning on a particular topic, and they are definitely still needed and very popular. i enjoy our lita educational programs and highly recommend them. however, interacting with twitter throughout the forum was like a giant social playground for me, and i could engage as much or as little as i liked. it’s a different user experience than so many other more traditional learning environments.

twitter was born in mid 2006 and the paradigm shift started happening a few years later, but the ways people are socially engaging with one another through twitter have changed drastically since then.1 people aren’t just regurgitating what the presenters are saying, but are responding to speakers and others in the physical and virtual audience.
people are talking more in depth about what they are learning and supplementing talks with links to sites, videos, images, and reports that might have been mentioned. they are coding and sharing their code while at the conference. they are blogging about their experiences and sharing those links. they are extending their networks.

the conference theme this year was “from node to network” and, reflecting on my own conference experience and reviewing all the twitter data, i don’t think the 2014 lita forum planning committee, led by ken varnum from the university of michigan, could have chosen a better theme.

rachel vacek (revacek@uh.edu) is lita president 2014-15 and head of web services, university libraries, university of houston, houston, texas.

as previously mentioned, the ways in which we are using twitter have been significantly changing the way we learn and interact. when combing through the #litaforum tweets for the gems, i found many links to tools that analyze and visually display unique information about tweets from the forum. the love of data is not uncommon in libraries, and neither is the analysis of that data.

the tagsarchive2 contains lots of twitter data from the forum. as you can see in image 1, between november 1, 2013, and november 17, 2014 (the same tag was used for the 2013 forum), there were 5,454 tweets, 4,390 of which were unique, not just retweets. there were 1,394 links within those tweets, demonstrating that we aren’t just repeating what the speakers are saying; we are enriching our networks with more easily accessible information.

image 1. archive of #litaforum tweets through tags

the data also tells stories. for example, @cm_harlow by far tweeted more than everyone else with 881 tweets, @thestackscat had the highest retweet rate at 90%, and @varnum had the lowest retweet rate at 1%. i was able to look at every single tweet in a google spreadsheet, complete with timestamps and links to user profiles. all this is rich data and quite informative, but tagsexplorer, developed by @mhawksey, is also quite an impressive data visualization tool that shows connections between the twitter handles. (see image 2.)

image 2. tagsexplorer data visualization and top conversationalists

additionally, you can see whom you retweeted and who retweeted you,3 again demonstrating the power of rich, structured data. (see image 3.) all of these tools improve our ability to share, reflect, archive, and network within lita and beyond our typical, often comfortable library boundaries.

tweets also don’t last forever on the web, but they do when they are archived.4 one conference attendee, @kayiwa, used a tool called twarc (https://github.com/edsu/twarc), a command-line tool for archiving json twitter search results before they disappear. looking through the tweets, you will learn that a great number of attendees experienced altitude sickness due to albuquerque’s elevation, which is around 5,000 feet above sea level. the most popular and desired food was enchiladas with green chili. many were impressed with the scenery, mountains, and endless blue skies of the city, as evidenced by the number of images of outdoor landscapes and sky shots.
image 3. connections between @vacekrae’s retweets and who she was retweeted by

there were two packed pre-conferences at the lita forum. dean krafft and jon corson-rikert from cornell university library taught attendees about a very hot topic: linked data and “how libraries can make use of linked open data to share information about library resources and to improve discovery, access, and understanding for library users.” the hashtag #linkeddata was used 382 times across all the forum’s tweets – clearly conversation went beyond the workshop. also, francis kayiwa, of kayiwa consulting, and eric phetteplace from the california college of arts, helped attendees “learn python by playing with library data” in the second, equally popular pre-conference. (see image 4.)

image 4

the forum this year also had three exceptional keynote speakers. annmarie thomas, @amptmn, an engineering professor from the university of st. thomas in minnesota, kicked off the forum and shared her enthusiasm and passion for makerspaces, squishy circuits, and how to engage kids in engineering and science in incredibly creative ways. i was truly inspired by her passion for making and sharing with others. she reminded us that all children are makers, and as adults we need to remember to be curious, explore, and play. there are 129 tweets that capture not only her fun presentation but also her vision for making in the future. (see image 5.)

image 5

the second keynote speaker was lorcan dempsey, @lorcand, the vice president, oclc research and chief strategist.
he’s known primarily for the research he presents through his weblog, http://orweblog.oclc.org, where he makes observations on the way users interact with technology and the discoverability of all that libraries have to offer, from collections to services to expertise. he wants to make library data more usable. in his talk, he explained how some technologies such as mobile devices and irs are having huge effects on user behaviors. “the network reshapes society and society reshapes the network.” what was also nice is that lorcan’s talk complemented annmarie’s talk about making and sharing. users are going from consumption to creation, and we, as libraries, need to be offering our services and content in the users’ workflows. we need to share our resources and make them more discoverable. why? “discovery often happens elsewhere.” check out the 123 posts on the twitter archive, which include links to his presentation. (see image 6.)

image 6

kortney ryan ziegler, @fakerapper, is the founder of trans*h4ck and was the closing keynote speaker. his work focuses on supporting trans-created technology, trans entrepreneurs, and trans-led startups. he’s led hackathons and helped create safe spaces for the trans community. his work is so important and many of the apps help to address the social inequalities that the trans community still faces. for example, he mentioned that it’s still legal in 36 states to fire someone for being trans. there are 174 tweets captured at the forum that give examples of the web tools created, and ideas about how libraries can be inclusive and more supportive of the trans community. (see image 7.)
image 7

the sessions themselves were excellent, and many sparked conversations long after the presentation. lightning talks were engaging, fast, and fun. posters were both beautiful and informative. overarching terms that i heard repeatedly and saw among the tweets were: open graph, openrefine, social media, makerspaces, bibframe, library labs, leadership, support, community, analytics, assessment, engagement, inclusivity, diversity, agile development, open access, linked data, vivo, dataone, discovery systems, discoverability, librarybox, islandora, and institutional repositories. below are some highlights:

there were so many opportunities to network at sessions, on breaks, at the networking dinners, and even at game night. i see networking as a huge benefit of a small conference, and networking can lead to some pretty amazing things. for example, whitni watkins, @nimblelibrarian and one of lita’s invaluable volunteers for the forum, was so inspired by a conversation on openrefine that she created a list where people could sign up to learn more and get some hands-on playing time with the tool. on her blog,5 whitni says, “…most if not all of those who came left with a bit more knowledge of the program than before and we opened a door of possibility for those who hadn’t any clue as to what openrefine could do.”

another example of great networking is where tabby farney, @sharebrarian, and cody behles, @cbehles, decided to create a lita metrics interest group.
at one of the networking dinners, they discussed their passion for altmetrics and web analytics but noticed that there wasn’t an existing group, and felt spurred to create one.

the technology and information sharing, the networking, the collaborating, and the strategizing – these are all components that make up the lita forum. twitter is just another technology platform to help us connect with one another. we are all just nodes, and technology enables us to both become the network and to network more effectively.

finally, i want to acknowledge and thank our sponsors, many of which are also lita members. we could not have run the forum without the generous funds from ebsco, springshare, @mire, innovative, and oclc. on behalf of lita, i truly appreciate their support.

i want to leave you with one more image that was created by @kayiwa using the most tweeted words from all the posts.6 next year’s forum is in minneapolis, and i hope to see you there.

references

1. http://consumercentric.biz/wordpress/?p=106
2. https://docs.google.com/spreadsheet/pub?key=0asyivmoyhk87dfnfx196v1e2m2zqtvlhq2jvs2fsdee&output=html
3. http://msk0.org/lita2014/litaforum-directed-retweets.html
4. http://msk0.org/lita2014/lita2014.html
5. http://nimblelibrarian.wordpress.com/2014/11/14/lita-forum-2014-a-recap/
6. http://msk0.org/lita2014/litaforum-wordcloud.html

president’s message: for the record
aimee fifarek
information technologies and libraries | june 2017

this is my final column as lita president. having just finished the 2016/17 annual report, i must admit i’m a little tapped out.
over the last year i’ve written on the events of the ala annual and midwinter conferences, a lita forum, a new strategic plan, information ethics, and advocacy. even for an english major and a librarian that’s a lot of words. as i work with executive director jenny levine and the rest of the lita board to prepare the agenda for our meetings at annual, the temptation is to focus on all the work that is yet to be done. but with the end of school and fiscal years approaching, it is the ideal time to celebrate everything that has been accomplished over the last 12 months.

first off, at some magical point during the year we completed the lita staff transition period. jenny has truly made the executive director position her own, and although she and mark beatty have more than enough work for six people, they are well on their way to guiding lita to a bright new future. with her knowledge of the inner workings of ala and her desire to make everything easier, faster and better, jenny is truly the right person for this job.

next, we have a great new set of people coming in to lead lita. andromeda yelton is going to be a fabulous lita president. she is an eloquent speaker, has more determination than anyone i know, and is a kick-ass coder to boot. bohyun kim has an amazing talent for organizing and motivating people, and as president-elect will work wonders with the new appointments committee. our new directors-at-large lindsay cronk, amanda goodman, and margaret heller are all devoted litans who will be great additions to the board. i’m glad i get to work with them all in their new roles as i transition to past-president.

and last but certainly not least we have started to make inroads on our advocacy and information policy strategic focus.
the privacy interest group has already raised lita’s profile by supplementing ala’s intellectual freedom committee’s privacy policies with privacy checklists.1 a group of board members along with office for information technology policy liaison david lee king and advocacy coordinating committee liaison callan bignoli are working on a new task force proposal to outline strategies for effectively collaborating with the ala washington office. these are just the first steps towards a future in which lita is not only relevant but necessary.

with all that hard work accomplished, it must be time to toast to our successes. i hope that everyone who will be at ala annual in chicago (http://2017.alaannual.org/) later this month will join us as we conclude our 50th anniversary year. sunday with lita promises to be amazing, with hugo award winner kameron hurley (http://www.kameronhurley.com) speaking at the president’s program, followed by what is sure to be a spectacular lita happy hour at the beer bistro (http://www.thebeerbistro.com/). we are still working on our goal to raise $10,000 for professional development scholarships. we’re only halfway there, so please donate at: https://www.crowdrise.com/lita-50th-anniversary.

aimee fifarek (aimee.fifarek@phoenix.gov) is lita president 2016-17 and deputy director for customer support, it and digital initiatives at phoenix public library, phoenix, az. https://doi.org/10.6017/ital.v36i2.10019

being lita president during the association’s 50th anniversary year has been both an honor and a challenge. during a milestone year like this you become acutely aware of all of the hard work and innovation that was required for the association to thrive for half a century, and feel more than a little pressure to leave an extraordinary legacy that will ensure another fifty years of success. it’s a tall order, especially in an era of rapid political and societal change.
but as i navigated through my presidential year i realized that i didn’t have to do anything more than ensure that people who already want to work hard for the greater good have a welcoming place to do just that. after fifty years, lita still has the thing that made it a success in the first place: a core group of volunteers committed to the belief that new technologies can empower libraries to do great things. the talented and passionate people i have worked with on the board, in the committee and interest group leadership, and throughout the membership are the best legacy that an association can have. now more than ever the people in libraries who “do tech” can be leaders in their communities and on the national stage. now more than ever it is lita’s time to shine.

references

1. http://litablog.org/2017/02/new-checklists-to-support-library-patron-privacy/

public libraries leading the way
delivering: automated materials handling for staff and patrons
carole williams
information technology and libraries | september 2021
https://doi.org/10.6017/ital.v40i3.13xxx

carole williams (williamscar@ccpl.org) is amh and self-service coordinator, charleston county public library. © 2021.

“you’ve made libraries cool again!” “wow, super techie! we’re fascinated by the book return!” with enthusiastic comments like these from our visitors, the staff at charleston county public library (ccpl) knew that we were delivering patron engagement while providing an effective book return system. thanks to county residents overwhelmingly approving a referendum to build five new libraries and renovate thirteen others, from june 2019 to november 2020 charleston county opened four new library branches, each with an automated materials handler (amh), and moved our support and administrative staff into a renovated support services building that now houses a 32-chute amh central sorter with smart transit check-in technology.
(side note: yes, i know what you’re thinking and yes, we did, we opened two of the new branches during the pandemic—definitely fodder for another article.) the branch amhs have interior and exterior return windows and sit along a glass wall so patrons can watch their items ride the conveyor belts and drop into sorting bins. the staff side has an inductor for items being returned from or sent to other branches, so there is almost always something for the public to watch (see figure 1). men, women, children, young and old enjoy watching the amh and asking questions. some patrons bring their out of town guests (even a nun visiting from ireland) to see the amh in action. this spontaneous interaction bolsters our connection with visitors and subconsciously reinforces the concept of “library as safe exploration.” a frequent question is “how does this work?” our explanation of tags and coding is the perfect opportunity to suggest books, point out games, and promote upcoming classes. we follow a roving customer service model. because an amh is an efficient tool that checks in items and deposits them in pre-determined bins for easy shelving, we have freed up hours of staff time that can now be spent in the stacks, helping patrons find items and answering questions as needed. delivering an excellent amh experience for staff has been more complicated. as befitting a port county, we went full steam ahead with new technology, new locations, and increased services. this required all staff to simultaneously learn new systems and change many of our in-house procedures while continuing with daily operations. every detail, from how to sort for shelving to labeling shipments, needed to be re-examined. the biggest changes came with bringing the central sorter online. some of the changes were technical. for example, we use rfid tags as an identification number and place a matching barcode on each item. 
rfid is excellent technology; tagging all our items has completely changed and streamlined our process. most items come pre-processed and the amhs are set to only read the rfid. an unintended but useful consequence is that we have become more aware of vendor processing errors where tags and barcodes don’t match. (side note: we are working on some system-wide solutions to locate discrepancies between barcode numbers and rfid tags in the ils; rfid is another topic entirely, so stay tuned.)

figure 1. children returning books to the automated materials handler at a branch of the charleston public library.

another benefit of the amh is that our library collections acquisitions and technical services (lcats) department realized that as they processed new orders, they could now send those items out daily instead of waiting to accumulate enough individual branch materials for a separate shipment: a win for patrons (new materials every day) and lcats (storage space). the unexpected twist: our adult, ya, and children’s librarians are accustomed to receiving new materials separately from returns so they can familiarize themselves with the titles before the items are shelved. with the central sorter, new items go out daily mixed in with the rest of the daily shipment. spine tape makes it easy for circulation staff to separate the new adult items, but we still needed a solution for children and ya. after several sort changes and many discussions, we went old school, recycling used paper into book flags. the flagging doesn’t cause a problem with the amh, is quick for technical services to place in each new book, and is easy for circulation to spot and put aside at the receiving location.

some of the changes were electrical. only the four new branch locations and support services facility have an amh, while the other fourteen branches check in items by hand.
we added a tote check-in server (tcs) system to the central sorter. this feature creates a manifest of the items in each crate. branches can now receive the contents of each crate by entering a 4-digit barcode instead of scanning individual items. a consequence of our new internet-dependent system that we had not anticipated was its reliance on electricity. the coast has frequent thunderstorms that can cause power outages and flooding. if the power is out, there is no way to sort or receive the items in delivery. luckily this doesn’t happen often, and so far, power has been restored quickly.

some of the changes were physical. our delivery drivers also process the shipment when they return each day. in their previous workflow, most of the shipment was delivered to the downstream libraries. the parts of the shipment that they did process had printed routing slips placed in each item, so staff could all be sorting the shipment at the same time. now their department has become logistics, which is a more encompassing title and better covers the wider variety of tasks the staff have added to their day. in addition to delivery and mail duties, logistics also manages and maintains the amh and tcs equipment, troubleshooting problems that arise, scanning barcodes, and processing an average of 3000 items daily with the amh. most of the shipment is now coming through the central sorter; staff handle an average of 157 crates each weekday, moving items from support services and to/from branches. we have electric forklifts that hold three crates at a time to help with the increase in physically shifting the crates. now one person inducts the shipment while others scan and stack the crates on the loading dock. this procedure is much faster than the previous paper slip method and processing is usually finished in a couple of hours.

other changes were mental and emotional.
new locations, renovations, technologies, and procedures can be exciting, but can also lead to change fatigue. fortunately, everyone retained their job this past year, but in order to operate a new branch built in a previously unserved community, we had to reassign staff from locations closed for renovation. ccpl’s vision is for our library to be the path to our cultural heritage, a door to resources of the present, and a bridge to opportunities in the future. we are doers, creators, servers, and teammates, not only to the community, but to our coworkers. we are all in for our shared vision, but whew. . . some days we all experience mental eye rolling and collective sighs of “another change?” our director’s mantra is “we are the calm.” and it is true. by fall we will have three of the renovated branches reopened, three more under renovation and another staffing shift. with some grace and encouragement to one another, we will handle whatever comes next.

article
mitigating bias in metadata: a use case using homosaurus linked data
juliet l. hardesty and allison nolan
information technology and libraries | september 2021
https://doi.org/10.6017/ital.v40i3.13053

juliet l. hardesty (jlhardes@iu.edu) is metadata analyst, indiana university. allison nolan (anolan147@gmail.com) is library and information science graduate student, indiana university. © 2021.

abstract

controlled vocabularies used in cultural heritage organizations (galleries, libraries, archives, and museums) are a helpful way to standardize terminology but can also result in misrepresentation or exclusion of systemically marginalized groups. library of congress subject headings (lcsh) is one example of a widely used yet problematic controlled vocabulary for subject headings. in some cases, systemically marginalized groups are creating controlled vocabularies that better reflect their terminology.
when a widely used vocabulary like lcsh and a controlled vocabulary from a marginalized community are both available as linked data, it is possible to incorporate the terminology from the marginalized community as an overlay or replacement for outdated or absent terms from more widely used vocabularies. this paper provides a use case for examining how the homosaurus, an lgbtq+ linked data controlled vocabulary, can provide an augmented and updated search experience to mitigate bias within a system that only uses lcsh for subject headings.

introduction

controlled vocabularies are a vital part of how individuals and communities are understood and discussed in scholarly discourse and research. controlled vocabularies are also a way to standardize terminology and allow items to be grouped by common subjects for easier discovery and access points. while larger, more universally recognized vocabularies like the library of congress subject headings (lcsh) exist, they are often slow to be updated and they reflect a largely white, heterosexual, cisgender, male, christian-centric point of view.1 when the terminology used to define a systemically marginalized group is determined by those outside of the group, often the terms are outdated or reflect a biased perspective.2 the prevalence and continued use of outdated metadata and vocabularies in discovery systems creates a cycle of biased search practices that can be difficult to break without the help of information professionals and outside resources. controlled vocabularies that have been created by or have the input of marginalized communities tend to be more inclusive and up to date. unfortunately, these vocabularies often are not known to the public or to researchers not well versed in metadata practices.
providing access to controlled vocabularies created by marginalized communities and linking them to existing vocabularies such as lcsh can help make the search process more representative of the people who are using discovery systems and can connect them to resources that better represent themselves and their needs in a complex information world. lcsh terms are available as linked data, a format that enables online machine-readable connections between concepts and terms, and there needs to be an effort to make systems using lcsh terms more inclusive and representative of marginalized communities. the project described in this article built and gathered feedback on a proof-of-concept javascript application to show how defined connections between vocabularies can be used to provide alternative and often enhanced access to library catalog resources. in this instance, simple knowledge organization system (skos) relationships link lcsh subject terms to the homosaurus linked data vocabulary, an “international linked data vocabulary of lgbtq terms that supports improved access to lgbtq resources within cultural institutions.”3 skos is “a common data model [from the w3c] for sharing and linking knowledge organization systems via the web.”4 this project uses skos:exactmatch relationships defined by the homosaurus to enable researchers to use homosaurus terms to search a library catalog and retrieve relevant results based on the connected lcsh terms that are already in the catalog record.5 subject searches are conducted when the homosaurus term and the lcsh term match exactly, since the lcsh term’s presence in the library catalog record indicates a specific grouping of records could have this subject term applied.
if the homosaurus term does not match exactly to the lcsh term, a keyword search is conducted using the homosaurus term to retrieve library catalog results where the homosaurus term appears in any indexed field in the catalog record, including creator-supplied title and abstract information. using a vocabulary like the homosaurus this way helps to connect researchers to resources that more accurately reflect systemically marginalized communities and potentially more accurately reflect the researchers themselves. by providing connections for users that they would otherwise have difficulty finding without the help of a librarian or other information professional, projects such as this one hope to combat the cycle of biased metadata and biased research practices that has dominated academic research.

literature review

students in higher education who identify as members of systemically marginalized communities can continue to experience marginalization within higher educational institutions, and the academic library setting is no exception.
brook, ellenwood, and lazzaro provide analysis of multiple studies showing the effect of mostly white staffing in academic libraries, the impact this can have on reference services provided to patrons from marginalized communities, and the overwhelming and intimidating spaces in sizable academic libraries that can be “compounded for students who already feel that they do not belong on campus on the basis of their race.” 6 when considering how this experience impacts using an online library catalog or digital repository system for conducting research, these same students can find themselves not well represented.7 additionally, crossing disciplines to capture intersectionalities of an identity can be complicated by narrow controlled vocabulary terms which compound problems that already make interdisciplinary research difficult.8 drabinski proposes that the library catalog should be treated as a biased text that requires critical thinking to understand.9 subject headings from authorities such as the library of congress will never be unbiased as attitudes, perspectives, and identities change over time. it is therefore important to leverage information literacy competency standards put forward by the association of college & research libraries and teach students how to critically engage the library catalog as another information source. library instruction is one way to ease the challenges faced by marginalized researchers in higher education, helping researchers effectively use a system like a library catalog that incorporates biased subject headings. however, with interdisciplinary research, materials are often dispersed across information systems and physical locations, and there is still the challenge to identify and locate everything relevant to the research topic.10 using available fields within the library catalog record itself (the 590 in marc, for example) can identify cross-disciplinary resources. 
examples are provided by hogan for black lgbtq resources and latina lesbian literature.11 what all of these efforts seem to point to is what hannah buckland proposes: changing the framing of catalog records from “aboutness” to “fromness,” providing “culturally-responsive metadata” that j. l. colbert recognizes can create an “equitable subject access” experience that “center[s] the information needs and information seeking behaviors of those whom our systems disenfranchise.”12 these changes can often only be implemented locally due to language variation and localized community relevance; but colbert then considers how linked open data might prove useful to combine or relate different subject or community vocabularies. “when we decenter the idea that for every concept there is one controlled term to describe it, we allow the play of seemingly opposite ways of thinking. . . . a linked open data catalog allows libraries to complement, replace, or even reject the standards that have been decided for us and our patrons.”13 librarians and archivists have suggested and tried other methods to mitigate the impact of systemic marginalization. these efforts go beyond the use of controlled vocabularies in the creation of catalog records. one of the earliest and most significant examples of this is dorothy porter’s work in organizing the collections she managed at howard university. up to that point in the 1930s and 1940s, dewey decimal classification (ddc) was used to organize works on the shelf.
many libraries of the time were predominantly white institutions and dorothy porter remembered them using ddc to shelve anything by a black author or about the black experience under the ddc heading for colonization (325) or slavery (326).14 porter instead organized her collections based on subject matter, genre, and author, categorizing the work based on what it was about rather than the race of the author or the race of any people mentioned in the work. this subtle yet fundamental shift shows the real impact that libraries have on access to collections for their audiences. hope a. olson and dennis ward created a proof-of-concept microsoft access database interface connecting mary ellen capek’s a women’s thesaurus to the dewey decimal classification scheme to offer an end user interface for searching a ddc system using the thesaurus terminology. the idea, initially from joan mitchell (then editor of ddc), was to develop “a means of making ddc accessible from the point of view of a marginalized knowledge domain—in particular, creating a means of browsing ddc from a feminist/women’s studies perspective.”15 variables were defined from characteristics of different classifications to enable a systematic match to thesaurus terms. dorothy berry’s work at university of minnesota libraries to gather and digitize african american-related materials from across archival collections for aggregating in umbra search african american history shows an option for pulling a collection from other collections and highlighting what would otherwise remain marginalized items from marginalized communities.16 discovering these materials required searching with a variety of terms used over time to refer to african americans. 
adding collection level context at the folder level for these materials allows aggregation without losing original place and context, while at the same time centering the marginalized communities represented in these materials by gathering them from these various and marginalized original locations. archives for black lives in philadelphia is “a loose association of archivists, librarians, and allied professionals in the philadelphia and delaware valley area responding to the issues raised by the black lives matter movement.” within this group, the anti-racist description working group has compiled an annotated bibliography and metadata recommendations to address racist and antiblack archival description.17 the recommendations focus on the black community but can be applied more broadly when describing records by and about any marginalized community. the recommendations include decentering “neutrality” and “objectivity” for “respect” and “care,” particularly when deciding on controlled vocabulary terms to use in archival description. specific recommendations to use “terminology that black people use to describe themselves,” to recognize that this “terminology changes over time, so description will be an iterative process,” and to consult “alternative cataloging schemes created by the subjects of the records being described when and if they are available” provide an approach that looks for descriptive terms from within the community and moves away from terms applied to a community by others.18 paying attention to the controlled vocabularies applied to archival description helps to change the narrative and the power structure of the historical record, centering those who have been marginalized and oppressed and increasing discoverability and access to their stories and perspectives.
allowing for changes in controlled vocabulary terms keeps systems flexible enough to accommodate changes in a community’s terminology over time. linked data relationships can connect term changes for more comprehensive searching while also identifying the current controlled vocabulary term to use. the lavender library, archives, and cultural exchange (llace) community archives in sacramento, california is an archive for a marginalized community.19 in developing archival and circulating library collections that serve the queer community, the library collections use a thesaurus of queer terms from dee michel for classification and the archival collections use subject headings from michel’s thesaurus along with lcsh.20 the focus, again, begins with the community being served and recognizes that widely used controlled vocabularies like lcsh do not serve these collections or communities well. starting with a community-specific vocabulary and then connecting lcsh terms centers the collections and community first and then makes connections to the larger library and archives community possible. other efforts have used alternatives or supplements to common vocabularies and schemes. the xwi7xwa library’s use of the brian deer classification system at the university of british columbia incorporates names and terminology from the first nations community to better represent that community beyond what something like library of congress classification provides. using accurate names of nations and peoples, according to the head librarian, ann doyle, helps create identity among users of the collection and “shapes the research and types of questions that people ask.”21 the national indian law library began cataloging using local terminology only. 
as it moved records online and sought to be more discoverable and cooperative with other libraries, this local terminology was synchronized with lcsh and specialized terms for federal indian law and tribal law were kept as a supplement.22 doing this work is not only about changing terms on catalog records but also learning and making connections with communities who have been marginalized by these systems. farnel et al. explain the process of decolonizing both the library catalog and digital collections description at university of alberta libraries through investigation, analysis, partnering with other institutions doing this work and, most importantly, reaching out to indigenous communities represented in these records to engage and learn about the most appropriate terminology to use.23 different methods and attempts to center the marginalized in cataloging and collection description show it is possible and essential to voice the concerns of those least represented in order to have the most impact on all researchers using these resources. widely used controlled vocabularies like lcsh continue to be a major way to aggregate collections and provide common access points. groups like the association for library collections and technical services’ cataloging and metadata management section subject analysis committee continue to work to change terms in these vocabularies to provide better and more accurate representation for systemically marginalized communities, but the process is slow and will likely never be enough.24 incorporating vocabularies from systemically marginalized communities for use either on the cataloging/description side or for researchers to use for search and discovery offers possibilities for more inclusive experiences that center marginalized voices and expand the options for research questions to ask and answer.
methodology

to test this idea that connections provided between a systemically marginalized community’s controlled vocabulary and a more generalized vocabulary like lcsh could be helpful, a proof-of-concept information retrieval aid was conceived. the idea was to create a lightweight javascript application that could use a select set of terms from the homosaurus (http://homosaurus.org), an lgbtq+ vocabulary originally created by ihlia lgbt heritage (https://www.ihlia.nl/?lang=en) and now also used in its linked data form by the digital transgender archive (https://www.digitaltransgenderarchive.net), to connect to lcsh terms and provide search links against a library catalog (iucat, https://iucat.iu.edu, indiana university’s online library catalog) that uses lcsh for subject headings. homosaurus version 1 was used initially and did not identify connections to lcsh terms. analysis of homosaurus terms against lcsh terms suggested some connections could be made, and these were used for initial construction of the proof-of-concept application, with the recognition that these connections were not coming from the community vocabulary. this was a problem since the point in mitigating bias is to use the community’s definitions, and any outside interpretations are necessarily not going to reflect the community’s intentions. as the application concept continued to form and the initial term comparison work continued, homosaurus version 2 was released containing explicit connections to lcsh terms, using skos:exactmatch for mapping those connections. those connections in version 2 are not expressed as linked data but are provided in the vocabulary’s site for each term.
the proof-of-concept work switched to using select terms from homosaurus version 2 in order to make use of the lcsh connections now being provided by the community.25 the proof-of-concept application used the select set of homosaurus version 2 terms downloaded as json-ld and added in the lcsh terms using the supplied skos:exactmatch relationship. the user interface provided visual connections from the selected homosaurus term to its narrower, broader, and related terms within homosaurus. any exact matches to lcsh terms were presented together with any use for terms (terms that homosaurus indicates should be replaced by the selected term). the visual layout for the application is directly influenced by the ihlia lgbt heritage collections browse interface.26 in ihlia’s system, after searching for a term (“love,” for example), the interface provides broader, narrower, related, and used for terms in a visually connected bubble layout surrounding the search term, as suggestions for other ways to discover items in these collections. those connections are linked and can be used to navigate ihlia’s controlled vocabulary, which also happens to be powered by a local non-linked data form of the homosaurus vocabulary. in the proof-of-concept application, for terms where there is an lcsh exact match, the lcsh term was used for the connection to search iucat and was only revealed on screen if the exact match (lcsh) bubble was clicked by the user (see fig. 1).
figure 1. information retrieval aid showing the homosaurus term “transgenderism” linked to search iucat. exact match (lcsh) shows the lcsh term “gender nonconformity” (also linked to search in iucat) along with narrower, broader, and related homosaurus terms.
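the grouping of a term’s relationships into interface categories, as described above, can be sketched as follows. this is a minimal illustration, not the authors’ code: the record shape, the placeholder broader and related labels, and the use of skos:altLabel to stand in for use for terms are all assumptions, and the actual homosaurus json-ld export may be structured differently. only the pairing of “transgenderism” with the lcsh term “gender nonconformity” comes from the article.

```javascript
// Hypothetical, simplified record; the real Homosaurus JSON-LD may differ.
const term = {
  "skos:prefLabel": "transgenderism",
  "skos:exactMatch": ["gender nonconformity"], // LCSH label reported in the article
  "skos:broader": ["placeholder broader term"], // placeholder, not real data
  "skos:narrower": [],
  "skos:related": "placeholder related term", // single value, not an array
  "skos:altLabel": [] // ASSUMPTION: stands in for "use for" terms
};

// JSON-LD values may be absent, a single value, or an array; normalize to an array.
function asArray(value) {
  if (value === undefined || value === null) return [];
  return Array.isArray(value) ? value : [value];
}

// Group one term's relationships into the categories shown as bubbles.
function groupRelations(record) {
  return {
    label: record["skos:prefLabel"],
    exactMatch: asArray(record["skos:exactMatch"]),
    broader: asArray(record["skos:broader"]),
    narrower: asArray(record["skos:narrower"]),
    related: asArray(record["skos:related"]),
    useFor: asArray(record["skos:altLabel"])
  };
}

const bubbles = groupRelations(term);
console.log(bubbles.exactMatch); // [ 'gender nonconformity' ]
```

normalizing every relationship to an array keeps the display code uniform whether a term has zero, one, or many connections in a given category.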
the initial proof-of-concept information retrieval aid javascript application was shared with and tested by olivia adams, a graduate student at indiana university working as the library coordinator for the lgbtq+ culture center library at indiana university (https://lgbtq.indiana.edu/programs-services/library/index.html). this library has adapted the llace classification system, the shelving organizational scheme developed by the lavender library in sacramento, california (http://lavenderlibrary.com), for organizing its own physical collection of resources. the lgbtq+ culture center library also has its own online library catalog that makes use of an established local list of tags for items included in that system (https://www.librarycat.org/lib/iuglbtlibrary/). the information retrieval aid application was first presented to the lgbtq+ culture center library coordinator for general impressions and feedback. additionally, specific tasks were proposed. please note that the proposed tasks use as an example a vocabulary term that is offensive and outdated. the results of this testing, along with feedback from the homosaurus editorial board, clarified the need to change the information retrieval aid to supply this additional contextual information (available in homosaurus as a description for the term). the tasks presented for trying the information retrieval aid were the following:
• you want to find resources at iu about transgenderism. what do you think of the resources that iucat is offering through this information retrieval aid?
• how do the homosaurus terms you are seeing here compare to the llace classification terms or the tags/subjects you use in the lgbtq+ library catalog?
• what is the importance of transparency for the lcsh terms in relation to community values (for terms that are different and only shown in the hidden section right now)?
“transgenderism” is a term homosaurus connects to lcsh’s term “gender nonconformity” with an exact match relationship (http://homosaurus.org/v2/transgenderism). to provide results for answering the first question, the proof-of-concept information retrieval aid interface showed the homosaurus term with a linked search in iucat that provided results using the lcsh term as a subject search.27 the second question was asked to get a sense of the relevance of the homosaurus terms to the collections organized and housed in the lgbtq+ culture center library. the third question, about the importance of transparency for the lcsh terms in relation to community values, was meant to investigate how a system like this proof-of-concept information retrieval aid might be used by the community of researchers and patrons using the culture center’s library, and whether the mechanism to mask the lcsh term in favor of the homosaurus term is useful. the code for this javascript web application in its current state is available on github at https://github.com/jlhardes/metadatabias. the initial proof-of-concept application was developed by justina kaiser, at the time an information and library science graduate student at indiana university. the current code is a fork of her project, also available on github (https://github.com/juskaise/metadatabias).
discussion
sharing this proof-of-concept information retrieval aid using homosaurus terms with the lgbtq+ culture center librarian revealed the importance of usability testing and being receptive to a community’s needs. an introduction and explanation of the controlled vocabulary and the community it represents was a recommended addition, since the term list presented was not initially easy to identify.
additionally, the interface terminology of narrower/related/broader/exact match/use for is familiar in the library world but not necessarily for the casual user. this terminology is still in use by the information retrieval aid but is under review for updated labels that are easier to understand. this initial version kept any use for terms hidden unless the user clicked on that bubble in the interface to see them. the reasoning was to give more emphasis to the homosaurus term and to keep any potentially derogatory or harmful terms still in use by lcsh out of the way of researchers (even though the searches conducted against the catalog might need to use those terms if no other linked data connection is available). feedback here was helpful: hiding terms that homosaurus does not recommend might hinder discovering results if the researcher wants to search on a term that is no longer used by the community or is considered derogatory or harmful. this is a useful lesson in that covering up the past is not helpful to those in a marginalized community who have experience with that marginalization or those trying to learn about the past experiences of a marginalized community. also, being able to find all relevant resources can mean a variety of terms (both current in the community and no longer current) might be necessary. the homosaurus editorial board also explained that use for terms are sometimes slang terms and are not always considered derogatory. this information is helpful in figuring out how to present lcsh terms in the interface in the context of the homosaurus terms. additionally, moving use for terms next to related terms connected these sets of terms better than placing use for terms with exact match terms.
further feedback from the homosaurus editorial board regarding the example term used for testing showed the terms and their connections to other terms do not supply enough information to express the full meaning of the term within the community. without supplying the homosaurus description for the term “transgenderism” (“pathologizing term often used in the medicalization of transgender people; use only in historical context,” see http://homosaurus.org/v2/transgenderism), the term can come across in the information retrieval aid as a preferred term from the community when, in fact, it is not. this was a critical update needed for the information retrieval aid to be effective as a research tool. in using the proof-of-concept interface to search against iucat, it was noted by the lgbtq+ culture center librarian that using the lcsh term to conduct a subject search against the catalog might not produce useful results if the homosaurus term is not an actual exact match to the lcsh term. in this case the homosaurus term should be searched in the catalog as a keyword instead of a subject, so the search is conducted on all indexed fields in the catalog record. in the example tried for the term “transgenderism” the skos:exactmatch relationship is defined as the lcsh term “gender nonconformity” (see fig. 1). even though the relationship is identified in homosaurus as an exact match, searching for “gender nonconformity” as a subject term in the catalog (267 results) and “transgenderism” as a keyword in the catalog (289 results) arrives at different result sets with different types of entries (see figs. 2 and 3). use for terms, while not always representative of the community providing the vocabulary, do have possible historical relevance if present in supplied information (such as a title) and can be connected to the catalog via keyword searching as well. 
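the two search strategies compared here can be sketched as a pair of catalog links. this is an illustration under stated assumptions rather than the application’s actual code: the base url and the search_field=subject and q parameters follow the iucat subject-search link cited in endnote 27 (its utf8 parameter is omitted), while the keyword field name all_fields and the function names are hypothetical and would need to be checked against the catalog.

```javascript
// Base URL and the "search_field" / "q" parameter names follow the
// subject-search link cited in the article's endnote 27.
const CATALOG_BASE = "https://iucat.iu.edu/";

// ASSUMPTION: "all_fields" is a hypothetical name for the keyword index.
const KEYWORD_FIELD = "all_fields";

// Build one catalog search URL for a query against a given index.
function buildSearchUrl(query, searchField) {
  const params = new URLSearchParams({ search_field: searchField, q: query });
  return `${CATALOG_BASE}?${params.toString()}`;
}

// Always offer a keyword search on the community term; add a subject
// search only when the vocabulary supplies an LCSH exact match.
function searchLinks(communityTerm, lcshExactMatch) {
  const links = { keyword: buildSearchUrl(communityTerm, KEYWORD_FIELD) };
  if (lcshExactMatch) {
    links.subject = buildSearchUrl(lcshExactMatch, "subject");
  }
  return links;
}

const links = searchLinks("transgenderism", "gender nonconformity");
console.log(links.subject);
// https://iucat.iu.edu/?search_field=subject&q=gender+nonconformity
console.log(links.keyword);
// https://iucat.iu.edu/?search_field=all_fields&q=transgenderism
```

offering both links side by side lets a researcher see that the subject search and the keyword search return different result sets, which is the difference figures 2 and 3 illustrate.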
it is important to reveal these differences within the library catalog and to provide results that reflect the terms used by the community. the library’s applied terminology via subjects organizes a different set of resources compared to searching for terminology available via titles or other information supplied by authors and creators. when considering who is part of a community and who is not in this scenario, there are benefits to trying to work around or in addition to the library’s applied organizational scheme. subject searching in the catalog provides another view (and set of results) for those familiar with the community’s terminology. those approaching a research topic from outside of a community are able to learn more about how to find resources most effectively, moving from the catalog’s terminology to the community’s terminology. after trying the proof-of-concept information retrieval aid, the lgbtq+ culture center librarian provided feedback that this could be useful for people new to studying the lgbtq+ community and unfamiliar with the community’s terminology. with an introduction and explanation of the controlled vocabulary in place and an easy-to-follow interface to guide users through the vocabulary terms, effective searches against the catalog that also reveal terminology used by the community and differences between that terminology and the catalog’s terminology can be both educational and useful for research.
figure 2. searching indiana university’s online library catalog (iucat) for the lcsh term “gender nonconformity” as subject shows 267 results.
figure 3. searching indiana university’s online library catalog for the homosaurus term “transgenderism” as keyword shows 289 results.
one of the largest obstacles to connecting marginalized communities to reliable, representative controlled vocabularies is the lack of controlled vocabularies that are readily available as linked data. unless an individual or organization has made the effort to establish connections between a community’s vocabulary and lcsh, the representative vocabularies stand alone and remain difficult to discover or use. the proof-of-concept testing of this project illustrates not only the need for connections to community-created controlled vocabularies, but also that having access to those vocabularies can result in more accurate and effective searches and usage of catalog resources. although vocabularies like lcsh contain outdated terms, having access to a variety of terms that are acceptable at different points in a community’s history can be useful for researchers who may not be as informed about certain systemically marginalized communities and whether certain terms have been completely eliminated, reclaimed, or replaced by more accurate terminology. efforts to mitigate bias in metadata via linked data are representative of a larger effort to correct a long-standing issue in libraries and other fields where the voices and perspectives of marginalized individuals have been overshadowed by the voices and needs of the majority. in addition to working to update large, generalized vocabularies and trying to incorporate these voices and perspectives, this change in method is meant to add those voices and center their importance. by linking community-created vocabularies and placing them front and center in the search process, metadata can become a tool with which to center the voices of marginalized communities and move toward a more equitable method of searching, finding, and using resources.
conclusion
the information retrieval aid is still progressing beyond a proof-of-concept, but it has seen significant updates since its initial implementation. figure 1 shows the initial proof-of-concept that was tested. introductory information has been added to explain the homosaurus vocabulary and the information retrieval aid tool itself. more terms are available (although still not the full set of homosaurus version 2 terms) and the term list in json-ld is being used to automatically populate the term list in the interface. if available, the term description is provided for more complete context. additionally, no terms are hidden in the bubble navigation and use for is located with related terms now. future work for this project includes incorporating the full list of homosaurus terms; reconsidering the category names (narrower/related/broader/exact match/use for) to determine if there are better labels to use for these categories that will be easier to understand for a general research audience; and testing the tool with researchers new to lgbtq+ terminology as well as those more knowledgeable about the lgbtq+ community and its terminology and history. additional areas of work that welcome investigation include automating the term list generated for use with the information retrieval aid (via api calls, for example) to help reflect any changes or updates made to the community vocabulary over time; the technical implications of connecting this information retrieval aid to a search engine beyond indiana university’s online library catalog; and using this tool with controlled vocabularies from other systemically marginalized communities, such as the bc first nations subject headings, the glossary of disability terms from the north carolina council on developmental disabilities, or atria: women’s thesaurus from the institute on gender equality and women’s history.28 what difference does it make to use a different search engine that incorporates lcsh terms?
likewise, is it possible to connect other linked data (or non-linked data) controlled vocabularies from systemically marginalized communities, and is that effective for retrieving information and improving research outcomes? the work so far shows the possibility of centering systemically marginalized voices by using the system more effectively, making linked data work to connect and update the terminology and search terms available for research.
acknowledgements
the authors would like to thank the lgbtq+ culture center librarian at indiana university for spring 2020, olivia adams, for her helpful review and feedback of the initial proof-of-concept information retrieval aid. we would also like to thank brian m. watson, editorial board member of homosaurus.org, for their help with using homosaurus version 2 terms and the homosaurus editorial board, particularly k. j. rawson, for reviewing and supplying article feedback. the authors also acknowledge the work of justina kaiser who created the initial code behind the information retrieval aid.
endnotes
1 hope a. olson, “mapping beyond dewey’s boundaries: constructing classificatory space for marginalized knowledge domains,” library trends 47, no. 2 (fall 1998): 238.
2 the term “systemically marginalized group,” used recently by dr. nicki washington from duke university at the september 3, 2020, indiana university center for women in technology talk, “‘bring a folding chair’: understanding and addressing issues of race in the context of stem,” was revealing to the authors as a better term to use than “historically marginalized communities.” this is significant in that it emphasizes the continued oppression and marginalization of these communities, rather than viewing these communities’ struggles as something of the past that has been overcome/surmounted.
3 “mission, history, editorial board,” homosaurus vocabulary site, accessed march 2, 2021, http://homosaurus.org/about.
4 “skos simple knowledge organization system reference,” w3, published august 18, 2009, https://www.w3.org/tr/skos-reference/.
5 “skos:exactmatch,” skos simple knowledge organization system namespace document - html variant, 18 august 2009 recommendation edition, w3, last modified august 6, 2011, https://www.w3.org/2009/08/skos-reference/skos.html#exactmatch.
6 freeda brook, david ellenwood, and althea eannace lazzaro, “in pursuit of antiracist social justice: denaturalizing whiteness in the academic library,” library trends 64, no. 2 (fall 2015): 259, https://muse.jhu.edu/article/610078.
7 holly tomren, “classification, bias, and american indian materials” (san jose state university, 2003), http://ailasacc.pbworks.com/f/biasclassification2004.pdf.
8 amelia koford, “how disability studies scholars interact with subject headings,” cataloging & classification quarterly 52, no. 4 (2014), https://doi.org/10/gf542p.
9 emily drabinski, “queering the catalog: queer theory and the politics of correction,” library quarterly: information, community, policy 83, no. 2 (april 2013), https://www.jstor.org/stable/10.1086/669547.
10 sara a. howard and steven a. knowlton, “browsing through bias: the library of congress classification and subject headings for african american studies and lgbtqia studies,” library trends 67, no. 1 (summer 2018), https://doi.org/10.1353/lib.2018.0026.
11 kristen hogan, “‘breaking secrets’ in the catalog: proposing the black queer studies collection at the university of texas at austin,” progressive librarian 34 (2010), http://www.progressivelibrariansguild.org/pl/pl34_35/050.pdf.
12 j. l. colbert [https://orcid.org/0000-0001-5733-5168], “patron-driven subject access: how librarians can mitigate that ‘power to name’,” in the library with the lead pipe, november 15, 2017, http://www.inthelibrarywiththeleadpipe.org/2017/patron-driven-subject-access-how-librarians-can-mitigate-that-power-to-name/.
13 j. l. colbert, “patron-driven subject access.”
14 avril johnson madison and dorothy porter wesley, “dorothy burnett porter wesley: enterprising steward of black culture,” public historian 17, no. 1 (winter 1995): 25, https://www.jstor.org/stable/3378349; janet sims-wood, dorothy porter wesley at howard university: building a legacy of black history (charleston, sc: the history press, 2014), 39; zita cristina nunes, “cataloging black knowledge: how dorothy porter assembled and organized a premier africana research collection,” perspectives on history: the news magazine of the american historical association (november 20, 2018), https://www.historians.org/publications-and-directories/perspectives-on-history/december-2018/cataloging-black-knowledge-how-dorothy-porter-assembled-and-organized-a-premier-africana-research-collection.
15 hope a. olson and dennis b. ward, “feminist locales in dewey’s landscape: mapping a marginalized knowledge domain,” in knowledge organization for information retrieval: proceedings of the sixth international study conference on classification research (the hague, netherlands: international federation for information documentation, 1997), 129.
16 dorothy berry, “digitizing and enhancing description across collections to make african american materials more discoverable on umbra search african american history,” the design for diversity learning toolkit, northeastern university libraries, august 2, 2018, https://des4div.library.northeastern.edu/digitizing-and-enhancing-description-across-collections-to-make-african-american-materials-more-discoverable-on-umbra-search-african-american-history/.
17 alexis a. antracoli et al., anti-racist description resources (philadelphia, pa: archives for black lives in philadelphia, 2019), i, https://archivesforblacklives.files.wordpress.com/2019/10/ardr_final.pdf.
18 antracoli et al., “anti-racist description resources,” 5.
19 diana k. wakimoto, debra l. hansen, and christine bruce, “the case of llace: challenges, triumphs, and lessons of a community archives,” american archivist 76, no. 2 (fall/winter 2013), http://www.jstor.org/stable/43490362.
20 according to the article, “the word queer is used throughout this article as the most general, over-arching term to describe communities and individuals who support llace and make it possible.” diana k. wakimoto et al., “case of llace,” 439; dee michel, ed., gay studies thesaurus, rev. ed. (urbana, il, 1990).
21 catelynne sahadath, “classifying the margins: using alternative classification schemes to empower diverse and marginalized users,” feliciter 59, no. 3 (june 2013): 16.
22 monica martens, “creating a supplemental thesaurus to lcsh for a specialized collection: the experience of the national indian law library,” law library journal 98, no. 2 (spring 2006).
23 sharon farnel et al., “rethinking representation: indigenous peoples and contexts at the university of alberta libraries,” international journal of information, diversity, & inclusion 2, no. 3 (2018), https://doi.org/10.33137/ijidi.v2i3.32190.
24 alcts is a division of the american library association, http://www.ala.org/alcts/mgrps/camms/cmtes/ats-ccssac; sac working group, “report of the sac working group on alternatives to lcsh ‘illegal aliens,’” american library association institutional repository, submitted june 19, 2020, https://alair.ala.org/bitstream/handle/11213/14582/sac20-ac_report_sac-working-group-on-alternatives-to-lcsh-illegal-aliens.pdf.
25 this is a moment to acknowledge the work of several homosaurus editorial board members, including brian m. watson, who is studying and working with linked data at university of british columbia; chloe noland from american jewish university; and walter “cat” walker from the william h. hannon library and one national gay and lesbian archives. there was never a request to add these lcsh term connections, but the timing was incredibly helpful, and the effort greatly appreciated.
26 example search for term “love” that results in browsable terms in a visual interface: https://www.ihlia.nl/search/index.jsp?q%3asearch=love&q%3azoekterm.row1.field3=&lang=en.
27 “gender nonconformity,” (search results, iucat, indiana university, accessed march 2, 2021), https://iucat.iu.edu/?utf8=%26%2310004%3b&search_field=subject&q=gender+nonconformity.
28 bc first nations subject headings (vancouver, bc: xwi7xwa library first nations house of learning, march 2, 2009), http://branchxwi7xwa.sites.olt.ubc.ca/files/2011/09/bcfn.pdf; “glossary of disability terms,” north carolina council on developmental disabilities, accessed march 8, 2021, https://nccdd.org/welcome/glossary-and-terms/category/glossary-of-disability-terms; “search in the women’s thesaurus,” atria: institute on gender equality and women’s history, accessed march 8, 2021, https://institute-genderequality.org/library-archive/collection/thesaurus.
information technology and libraries | december 2009
information discovery insights gained from multipac, a prototype library discovery system
alex a. dolski
at the university of nevada las vegas libraries, as in most libraries, resources are dispersed into a number of closed “silos” with an organization-centric, rather than patron-centric, layout. patrons frequently have trouble navigating and discovering the dozens of disparate interfaces, and any attempt at a global overview of our information offerings is at the same time incomplete and highly complex. while consolidation of interfaces is widely considered to be desirable, certain challenges have made it elusive in practice.
multipac is an experimental “discovery,” or metasearch, system developed to explore issues surrounding heterogeneous physical and networked resource access in an academic library environment. this article discusses some of the reasons for, and outcomes of, its development at the university of nevada las vegas (unlv).
the case for multipac
fragmentation of library resources and their interfaces is a growing problem in libraries, and unlv libraries is no exception. electronic information here is scattered across our innovative webpac, our main website, our three branch library websites, remote article databases, local custom databases, local digital collections, special collections, other remotely hosted resources (such as libguides), and others. the number of these resources, as well as the total volume of content offered by the libraries, has grown over time (figure 1), while access provisions have not kept pace in terms of usability.
figure 1. “silos” in the library
figure 2. organization-centric resource provisioning
alex a. dolski (alex.dolski@unlv.edu) is web & digitization application developer at the university of nevada las vegas libraries.
in light of this dilemma, the libraries and various units within have deployed finding and search tools that provide browsing and searching access to certain subsets of these resources, depending on criteria such as
• the type of resource;
• its place within the libraries’ organizational structure;
• its place within some arbitrarily defined topical categorization of library resources;
• the perceived quality of its content; and
• its uniqueness relative to other resources.
these tools tend to be organization-centric rather than patron-centric, as they are generally provisioned in relative isolation from each other without placing as much emphasis on the big picture (figure 2). the result is, from the patron’s perspective, a disaggregated mass of information and scattered finding tools that, to varying degrees, each accomplishes its own specific goals at the expense of macro-level findability. currently, a comprehensive search for a given subject across as many library resources as possible might involve visiting a half-dozen interfaces or more, each one predicated upon awareness of each individual interface, its relation to the others, and the characteristics of its specific coverage of the corpus of library content. our library website serves as the de facto gateway to our electronic, networked content offerings. yet usability studies have shown that findability, when given our website as a starting point, is poor. undoubtedly this is due, at least in part, to interface fragmentation.
test subjects, when given a task to find something and asked to use the library website as a starting point, fail outright in a clear majority of cases.1

multipac is a technical prototype that serves as an exploration of these issues. while the system itself breaks no new technical ground, it brings to the forefront critical issues of metadata quality, organizational structure, and long-term planning that can inform future actions regarding strategy and implementation of potential solutions at unlv and elsewhere. yet it is only one of numerous ways that these issues could be addressed.2

in an abstract sense, multipac is biased toward principles of simplification, consolidation, and unification. in theory, usability can be improved by eliminating redundant interfaces, consolidating search tools, and bringing together resource-specific features (e.g., opac holdings status) in one interface to the maximum extent possible (figure 3). taken to an extreme, this means being able to support searching all of our resources, regardless of type or location, from a single interface; abstracting each resource from whatever native or built-in user interface it might offer; and relying instead on its data interface for querying and result-set gathering. thus multipac is as much a proof-of-concept as it is a concrete implementation.

background: how multipac became what it is

multipac came about from a unique set of circumstances. from the beginning, it was intended as an exploratory project, with no serious expectation of it ever being deployed. our desire to have a working prototype ready for our discovery mini-conference meant that we had just six weeks of development time, which was hardly sufficient for anything more than the most agile of development models. the resulting design, while foundationally solid, was limited in scope and depth because of time constraints.

another option, instead of developing multipac, would have been to demonstrate an existing open-source discovery system. the advantage of this approach is that the final product would have been considerably more advanced than anything we could have developed ourselves in six weeks. on the other hand, it might not have provided a comparable learning opportunity.

table 1. some popular existing library discovery systems
name | company/institution | commercial status
aquabrowser | serials solutions | commercial
blacklight | university of virginia | open-source (apache)
encore | innovative interfaces | commercial
extensible catalog | university of rochester | open-source (mit/gpl)
libraryfind | oregon state university | open-source (gpl)
metalib | ex libris | commercial
primo | ex libris | commercial
summon | serials solutions | commercial
vufind | villanova university | open-source (gpl)
worldcat local | oclc | commercial

table 2. some existing back-end search servers
name | company/institution | commercial status
endeca | endeca technologies | commercial
idol | autonomy | commercial
lucene | apache foundation | open-source (apache)
search server | microsoft | commercial
search server express | microsoft | free
solr (superset of lucene) | apache foundation | open-source (apache)
sphinx | sphinx technologies | open-source (gpl)
xapian | community | open-source (gpl)
zebra | index data | open-source (gpl)

survey of similar systems

were its development to continue, multipac would find itself among an increasingly crowded field of competitors (table 1). a number of library discovery systems already exist, most backed by open-source or commercially available back-end search engines (table 2), which handle the nitty-gritty, low-level ingestion, indexing, and retrieval.
these lists of systems are by no means comprehensive and do not include notable experimental or research systems, which would make them much longer.

architecture

in terms of how they carry out a search, meta-search applications can be divided into two main groups: distributed (or federated) search, in which searches are “broadcast” to individual resources that return results in real time (figure 4); and harvested search, in which searches are carried out against a local index of resource contents (figure 5).3 both have advantages and disadvantages beyond the scope of this article. multipac takes the latter approach. it consists of three primary components: the search server, the user interface, and the metadata harvesting system (figure 6).

figure 3. patron-centric resource provisioning

figure 4. the federated search process

figure 5. the harvested search process

figure 6. the three main components of multipac

search server

after some research, solr was chosen as the search server because of its ease of use, proven library track record, and http-based representational state transfer (rest) application programming interface (api), which improves network-topological flexibility, allowing it to be deployed on a different server than the front-end web application—an important consideration in our server environment.4 jetty—a java web application server bundled with solr—proved adequate and convenient for our needs.

the metadata schema used by solr can be customized. we derived ours from the unqualified dublin core metadata element set (dcmes),5 with a few fields removed and some fields added, such as “library” and “department,” as well as fields that support various multipac features, such as thumbnail images and primary record urls. dcmes was chosen for its combination of generality, simplicity, and familiarity.
in practice, the solr schema is for finding purposes only, so whether it uses a standard schema is of little importance.

user interface

the front-end multipac system is written in php 5.2 in a model-view-controller design based on classical object design principles. to support modularity, new resources can be added as classes that implement a resource-class interface. the multipac html user interface is composed of five views: search, browse, results, item, and list, which exist to accommodate the finding process illustrated in figure 7. each view uses a custom html template that can be easily styled by nonprogrammer web designers. (needless to say, judging by figures 8–12, they haven’t been.) most dynamic code is encapsulated within dedicated “helper” methods in an attempt to decouple the templates from the rest of the system.

output formats, like resources, are modular and decoupled from the core of the system. the html user interface is one of several interfaces available to the multipac system; others include xml and json, which effectively add web services support to all encompassed resources—a feature missing from many of the resources’ own built-in interfaces.6

search view

search view (figure 8) is the simplest view, serving as the “front page.” it currently includes little more than a brief introduction and search field. the search field is not complicated; it is, in fact, possible to include search forms on any webpage and scope them to any subset of resources on the basis of facet queries. for example, a search form could be scoped to las vegas–related resources in special collections, which would satisfy the demand of some library departments for custom search engines tailored to their resources without contributing to the “interface fragmentation” effect discussed in the introduction. (this would require a higher level of metadata quality than we currently have, which will be discussed in depth later.)
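the scoped search form described above can be sketched as a function that builds a solr query url. this is a minimal illustration in python rather than multipac's php, and it makes several assumptions the article does not confirm: the base url is hypothetical, the field names `department` and `format` are placeholders, and the scoping is expressed with solr's standard `fq` (filter query) parameter.

```python
from urllib.parse import urlencode

# hypothetical address; the article does not give multipac's solr url
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def scoped_search_url(query, scope=None, facet_fields=()):
    """Build a Solr select URL for `query`, narrowed by optional filter queries.

    `scope` maps field names to required values (hypothetical field names);
    each pair becomes one `fq` parameter, so the user's query text is untouched.
    """
    params = [("q", query), ("wt", "json")]
    for field, value in (scope or {}).items():
        params.append(("fq", f'{field}:"{value}"'))
    if facet_fields:
        # ask solr to return facet counts for the listed fields
        params.append(("facet", "true"))
        params.extend(("facet.field", name) for name in facet_fields)
    return f"{SOLR_SELECT_URL}?{urlencode(params)}"


url = scoped_search_url(
    "hoover dam",
    scope={"department": "special collections"},
    facet_fields=["format"],
)
```

a department-specific search form would differ from the global one only in the `scope` it passes, which is why such forms need not add new interfaces. as the text notes, the scoping is only as reliable as the field values in the index.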
because search forms can be added to any page, this view is not essential to the multipac system. to improve simplification, it could be easily removed and replaced with, for example, a search form on the library homepage.

figure 7. the information-finding process supported by multipac

figure 8. the multipac search view page

browse view

browse view (figure 9) is an alternative to search view, intended for situations in which the user lacks a “concrete target” (figure 7). as should be evident by its appearance, this is the least-developed view, simply displaying facet terms in an html unordered list. notice the facet terms in the format field; this is malprocessed, marc-encoded information resulting from a quick-and-dirty extensible stylesheet language (xsl) transformation from marcxml to solr xml.

results view

the results page (figure 10) is composed of three columns:

1. the left column displays a facet list—a feature generally found to be highly useful for results-gathering purposes.7 the data in the list is generated by solr and transformed to an html unordered list using php. the facets are configurable; fields can be made “facetable” in the solr schema configuration file.
2. the center column displays results for the current search query that have been provided by solr. thumbnails are available for resources that have them; generic icons are provided for those that do not. currently, the results list displays item title and description fields. some items have very rich descriptions; others have minimal descriptions or no descriptions at all. this happens to be one of several significant metadata quality issues that will be discussed later.
3. the right column displays results from nonindexed resources, including any that it would not be feasible to index locally, such as google, our article databases, and so on.
multipac displays these resources as collapsed panes that expand when their titles are clicked and initiate an ajax request for the current search query. in a situation in which there might be twenty or more “panes” to load, performance would obviously suffer greatly if each one had to be queried each time the results page loaded. the on-demand loading process greatly speeds up the page load time.

currently, the right column includes only a handful of resource panes—as many as could be developed in six weeks alongside the rest of the prototype. it is anticipated that further development would entail the addition of any number of panes—perhaps several dozen. the ease of developing a resource pane can vary greatly depending on the resource. for developer-friendly resources that offer a useful javascript object notation (json) api, it can take less than half an hour. for article databases, which vendors generally take great pains to “lock down,” the task can entail a two-day marathon involving trial-and-error http-request-token authentication and screen-scraping of complex invalid html. in some cases, vendor license agreements may prohibit this kind of use altogether. there is little we can do about this; clearly, one of multipac’s severest limitations is its lack of adeptness at searching these types of “closed” remote resources.

item view

item view (figure 11) provides greater detail about an individual item, including a display of more metadata fields, an image, and a link to the item in its primary context, if available. it is expected that this view also would include holdings status information for opac resources, although this has not been implemented yet. the availability of various page features is dependent on values encoded in the item’s solr metadata record. for example, if an image url is available, it will be displayed; if not, it won’t.
an effort was made to keep the view logic separate from the underlying resource to improve code and resource maintainability. the page template itself does not contain any resource-dependent conditionals.

list view

list view (figure 12), essentially a “favorites” or “cart” view, is so named because it is intended to duplicate the list feature of unlv libraries’ innovative millennium opac. the user can click a button in either results view or item view to add items to the list, which is stored in a cookie. although the view is currently not feature-rich, it would be reasonable to expect the ability to send the list as an e-mail or text message, as well as other features.

figure 9. the multipac browse view page

metadata harvesting system

for metadata to be imported into solr, it must first be harvested. in the harvesting process, a custom script checks source data and compares it with local data. it downloads new records, updates stale records, and deletes missing records. not all resources support the ability to easily check for changed records, meaning that the full record set must be downloaded and converted during every harvest. in most cases, this is not a problem; most of our resources (the library catalog excluded) can be fully dumped in a matter of a few seconds each. in a production environment, the harvest scripts would be run automatically every day or so.

in practice, every resource is different, necessitating a different harvest script. the open archives initiative protocol for metadata harvesting (oai-pmh) is the protocol that first jumps to mind as being ideal for metadata harvesting, but most of our resources do not support it. ideally, we would modify as many of them as possible to be oai-compliant, but that would still leave many that are out of our hands. either way, a substantial number of custom harvest scripts would still be required.
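the compare step of the harvest (download new records, update stale ones, delete missing ones) can be sketched as a set comparison. this is an illustrative python sketch, not the multipac harvest script: it assumes each side can be summarized as a mapping from record id to some change marker (a timestamp or checksum), and the function name is hypothetical.

```python
def plan_harvest(source, local):
    """Decide what a harvest run must do.

    `source` and `local` map record id -> change marker (for example a
    last-modified timestamp or a checksum). Returns three sorted lists:
    ids to add, ids to update, and ids to delete from the local index.
    """
    to_add = sorted(rid for rid in source if rid not in local)
    to_update = sorted(
        rid for rid in source if rid in local and source[rid] != local[rid]
    )
    to_delete = sorted(rid for rid in local if rid not in source)
    return to_add, to_update, to_delete


source_records = {"a": 1, "b": 2, "c": 5}
local_records = {"b": 2, "c": 4, "d": 9}
plan = plan_harvest(source_records, local_records)
```

for a resource that cannot report changes, the same function still applies after a full dump: the change marker can be a checksum computed over each converted record, at the cost of downloading everything on every run.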
for demonstration purposes, the multipac prototype was seeded with sample data from a handful of diverse resources:

1. a set of 16,000 marc records from our library catalog, which we converted to marcxml and then to solr xml using xsl transformations
2. our locally built las vegas architects and buildings database, a mysql database containing more than 10,000 rows across 27 tables, which we queried and dumped into xml using a php script
3. our locally built special collections database, a smaller mysql database, which we dealt with the same way
4. our contentdm digital collections, which we downloaded via oai-pmh and transformed using another custom xsl stylesheet

there are typically a variety of conversion options for each resource. because of time constraints, we simply chose what we expected would be the quickest route for each, and did not pay much attention to the quality of the conversion.

how multipac answers unlv libraries’ discovery questions

multipac has essentially proven its capability of solving interface multiplication and fragmentation issues. by adding a layer of abstraction between resource and patron, it enables us to reference abstract resources instead of their specific implementations—for example, “the library catalog” instead of “the innopac catalog.” this creates flexibility gains with regard to resource provision and deployment.

figure 10. the multipac results view page

this kind of “pervasive decoupling” can carry with it a number of advantages. first, it can allow us to provide custom-developed services that vendors cannot or do not offer. second, it can prevent service interruptions caused by maintenance, upgrades, or replacement of individual back-end resources. third, by making us less dependent on specific implementations of vendor products—in other words, reducing vendor “lock-in”—it can potentially give us leverage in vendor contract negotiations.
because of the breadth of information we offer from our website gateway, we as a library are particularly sensitive about the continued availability of access to our resources at stable urls. when resources are not persistent, patrons and staff need to be retrained, expectations need to be adjusted, and hyperlinks—scattered all over the place—need to be updated. by decoupling abstract resources from their implementations, multipac becomes, in effect, its own persistent uri system, unifying many library resources under one stable uri schema. in conjunction with a url rewriting system on the web server, a resource-based uri schema (figure 13) would be both powerful and desirable.8

lessons learned in the development of multipac

the lessons learned in the development of multipac fall into three main categories, listed here in order of importance.

metadata quality considerations

quality metadata—characterized by unified schemas; useful crosswalking; and consistent, thorough description—facilitates finding and gathering. in practice, a surrogate record is as important as the resource it describes. below a certain quality threshold, its accompanying resource may never be found, in which case it may as well not exist. surrogate record quality influences relevance ranking and can mean the difference between the most relevant result appearing on page 1 or page 50 (relevance, of course, being a somewhat disputed term). solr and similar systems will search all surrogates, including those that are of poor quality, but the resulting relevancy ranking will be that much less meaningful.

figure 13. example of an implementation-based vs. resource-based uri
implementation-based: http://www.library.unlv.edu/arch/archdb2/index.php/projects/view/1509
resource-based (hypothetical): http://www.library.unlv.edu/item/483742

figure 11. the multipac item view page

figure 12.
the multipac list view page

metadata quality can be evaluated on several levels, from extremely specific to extremely broad (figure 14). that which may appear to be adequate at one level may fail at a higher level. using this figure as an example, multipac requires strong adherence to level 5, whereas most of our metadata fails to reach level 4. a “level 4 failure” is illustrated in table 3, which compares sample metadata records from four different multipac resources. empty cells are not necessarily “bad”—not all metadata elements apply to all resources—but this type of inconsistency multiplies as the number of resources grows, which can have negative implications for retrieval.

suggestions for improving metadata quality

the results from the multipac project suggest that metadata rules should be applied strictly and comprehensively according to library-wide standards that, at our libraries, have yet to be enacted. surrogate records must be treated as must-have (rather than nice-to-have) features of all resources. resources that are not yet described in a system that supports searchable surrogate records should be transitioned to one that does; for example, html webpages should be transitioned to a content management system with metadata ascription and searchability features (at unlv, this is planned).

however, it is not enough for resources to have high-quality metadata if not all schemas are in sync. there exist a number of resources in our library that are well-described but whose schemas do not mesh well with other resources. different formats are used; different descriptive elements are used; and different interpretations, however subtle, are made of element meanings.

figure 14. example scopes of metadata application and evaluation, from broad (top) to specific

table 3. comparing sample crosswalked metadata from four different unlv libraries resources
library catalog digital collections special collections database las vegas architects & buildings database
title goldfield: boom town of nevada map of tonopah mining district, nye county, nevada 0361 : mines and mining collection flamingo hilton las vegas
creator paher, stanley w. booker & bradford
call number f849.g6p34
contents (item-level description of contents)
format digital object photo collections database record
language eng eng eng
coverage tonopah mining district (nev.) ; ray mining district (nev.)
description (omitted for brevity)
publisher nevada publications university of nevada las vegas libraries unlv architecture studies library
subject (lcsh omitted for brevity) (lcsh omitted for brevity)

despite the best intentions of everyone involved with its creation and maintenance, and despite the high quality of many of our metadata records when examined in isolation, in the big picture, multipac has demonstrated—perhaps for the first time—how much work will be needed to upgrade our metadata for a discovery system. would the benefits make the effort worthwhile? would the effort be implementable and sustainable given the limitations of the present generation of “silo” systems? what kind of adjustments would need to be made to accommodate effective workflows, and what might those workflows look like? these questions still await answers. of note, all other open-source and vendor systems suffer from the same issues, which is a key reason that these types of systems are not yet ascendant in libraries.9 there is much promise in the ability of infrastructural standards like frbr, skos, rda, and the many other esoteric information acronyms to pave the way for the next generation of library discovery systems.
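the kind of cross-resource inconsistency illustrated in table 3 can be measured with a simple field-coverage count. the python sketch below is an illustration only: the records are hypothetical stand-ins rather than the table's actual cells, and counting non-empty fields is just one possible indicator, not a method the multipac project reports using.

```python
def field_coverage(records, fields):
    """Return, per field, how many of the records carry a non-empty value."""
    return {
        field: sum(1 for rec in records if str(rec.get(field) or "").strip())
        for field in fields
    }


# hypothetical crosswalked records, one per resource
sample_records = [
    {"title": "a book", "creator": "someone", "language": "eng"},
    {"title": "a map", "creator": "", "language": "eng"},
    {"title": "a collection", "format": "database record"},
]
coverage = field_coverage(sample_records, ["title", "creator", "format", "language"])
```

a field populated by every resource (here `title`) is a safe basis for search and faceting, whereas a field populated by only one resource silently excludes the others whenever it is used as a facet or filter.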
organizational considerations

electronic information has so far proved relatively elusive to manage; some of it is ephemeral in existence, most of it is constantly changing, and all of it is from diverse sources. attempts to deal with electronic resources—representing them using catalog surrogate records, streamlining website portals, farming out the problem to vendors—have not been as successful as they have needed to be and suffer from a number of inherent limitations.

multipac would constitute a major change in library resource provision. our library, like many, is for the most part organized around a core 1970s–80s ils-support model that is not well adapted to a modern unified discovery environment. next-generation discovery is trending away from assembly-line-style acquisition and processing of primarily physical resources and toward agglomerating interspersed networked and physical resource clouds from on- and offsite.10 in this model, increasing responsibilities are placed on all content providers to ensure that their metadata conforms to site-wide protocols that, at our library, have yet to be developed.

conclusion

in deciding how to best deal with discovery issues, we found that a traditional product matrix comparison does not address the entire scope of the problem, which is that some of the discoverability inadequacies in our libraries are caused by factors that cannot be purchased. sound metadata is essential for proper functioning of a unified discovery system, and descriptive uniformity must be ensured on multiple levels, from the element level to the institution level. technical facilitators of improved discoverability already exist; the responsibility falls on us to adapt to the demands of future discovery systems. the specific discovery tool itself is only a facilitator, the specific implementation of which is likely to change over time. what will not change are library-wide metadata quality issues that will serve any tool we happen to deploy.
the multipac project brought to light important library-wide discoverability issues that may not have been as obvious before, exposing a number of limitations in our existing metadata as well as giving us a glimpse of what it might take to improve our metadata to accommodate a next-generation discovery system, in whatever form that might take.

references

1. unlv libraries usability committee, internal library website usability testing, las vegas, 2008.
2. karen calhoun, “the changing nature of the catalog and its integration with other discovery tools,” report prepared for the library of congress, 2006.
3. xiaoming liu et al., “federated searching interface techniques for heterogeneous oai repositories,” journal of digital information 4, no. 2 (2002).
4. apache software foundation, apache solr, http://lucene.apache.org/solr/ (accessed june 11, 2009).
5. dublin core metadata initiative, “dublin core metadata element set, version 1.1,” jan. 14, 2008, http://dublincore.org/documents/dces/ (accessed june 25, 2009).
6. lorcan dempsey, “a palindromic ils service layer,” lorcan dempsey’s weblog, jan. 20, 2006, http://orweblog.oclc.org/archives/000927.html (accessed july 15, 2009).
7. tod a. olson, “utility of a faceted catalog for scholarly research,” library hi tech 25, no. 4 (2007): 550–61.
8. tim berners-lee, “hypertext style: cool uris don’t change,” 1998, http://www.w3.org/provider/style/uri (accessed june 23, 2009).
9. jennifer bowen, “metadata to support next-generation library resource discovery: lessons from the extensible catalog, phase 1,” information technology and libraries 27, no. 2 (june 2008): 6–19.
10. calhoun, “the changing nature of the catalog.”

using dpla and the wikimedia foundation to increase usage of digitized resources

dominic byrd-mcdevitt and john dewees

information technology and libraries | march 2022
https://doi.org/10.6017/ital.v41i1.13659

dominic byrd-mcdevitt (dominic@dp.la) is data fellow, digital public library of america. john dewees (john.dewees@toledolibrary.org) is supervisor, digitization services, toledo lucas county public library. © 2022.

abstract

the digital public library of america has created a process by which rights-free or openly licensed resources that have already been harvested can be copied over into wikimedia commons, thus creating a simple path for including those digital collections materials in wikipedia articles. by meeting internet users where they already are, rather than relying on them to navigate to individual digital libraries, access to and usage of digital assets are dramatically increased, in particular among user groups that might otherwise not have a reason to interact with such digitized resources.

introduction

a dpla-sponsored webinar given by dominic byrd-mcdevitt, dpla data fellow, and sandra fauconnier, glam-wiki specialist at the wikimedia foundation, on april 21, 2020, entitled “dpla intro to wikimedia: increased discoverability and use” introduced a workflow by which records harvested by the digital public library of america (dpla) could be automatically copied over into wikimedia commons with their accompanying metadata.1 the major benefit of this migration is the ease with which assets can then be added to wikipedia articles, exposing resources to a large audience of general internet users who might otherwise have no reason to interact with a given repository’s resources.
the gains from making digitized resources available in wikipedia articles are substantial, providing incredibly high usage statistics while requiring very little time commitment to execute the work.

this dpla project, launched in early 2020, was a result of grant funding provided by the alfred p. sloan foundation and ongoing consultation from the wikimedia foundation. dpla’s interest in designing this system stemmed from an exploration of new ways to increase usage of materials. while previous bulk uploads to wikimedia commons by cultural institutions have required technical expertise and steep learning curves in navigating the wikimedia community, this project was designed to reduce these barriers by taking advantage of dpla’s role as an aggregator (more information is available at https://commons.wikimedia.org/wiki/commons:partnerships).

with the workflow developed by dpla’s technology team in mid-2020, an authorized bot account on wikimedia commons (user:dpla_bot, https://commons.wikimedia.org/wiki/user:dpla_bot) uploads assets from dpla institutions. using data provided by contributing institutions, dpla applies filters to identify eligible items from participating institutions, then for each of these generates wiki markup from descriptive metadata and downloads media files to a server. these files are uploaded by a script that interacts with wikimedia’s api using the pywikibot framework (https://www.mediawiki.org/wiki/manual:pywikibot).
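the "generate wiki markup from descriptive metadata" step can be sketched as plain string templating. this python example is a stand-in, not dpla's code: the article does not say which template the bot emits, so the sketch uses wikimedia commons' general-purpose `{{Information}}` template, and the metadata keys (`description`, `date`, `source_url`, `creator`) are hypothetical. the upload itself, which the article attributes to pywikibot, is left out.

```python
def build_wikitext(item):
    """Render a metadata dict as an {{Information}} block of wiki markup.

    Keys on the right of each pair are hypothetical field names; missing
    keys yield an empty parameter rather than an error.
    """
    mapping = [
        ("description", "description"),
        ("date", "date"),
        ("source", "source_url"),
        ("author", "creator"),
    ]
    lines = ["{{Information"]
    for param, key in mapping:
        # a literal pipe would end the template parameter, so escape it
        value = str(item.get(key, "")).replace("|", "{{!}}")
        lines.append(f"|{param}={value}")
    lines.append("}}")
    return "\n".join(lines)


# hypothetical record used only to exercise the function
wikitext = build_wikitext(
    {
        "description": "example map",
        "source_url": "https://example.org/item/1",
        "creator": "first | second",
    }
)
```

generating the markup separately from uploading it keeps the metadata mapping easy to inspect and test without touching the live wiki.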
by centralizing all of the dpla network’s wikimedia commons uploads, dpla was able to upload over 2.25 million files (or 2.5 tb of total storage) from 780,000 items in under a year and a half, becoming the largest single contribution to wikimedia commons ever (by more than quadruple the previous record).2,3 this approach to the problem provides a simple on-ramp to participation in wikipedia for dpla institutions—especially the many that would otherwise lack the resources or expertise to do so—by requiring of them only those tasks that need their local knowledge, such as describing their own collections prior to aggregation and then making editorial decisions on wikipedia about them once uploaded.

this project required a chain of partnerships between separate organizations, as well as a variety of metadata and technical requirements that needed to be satisfied: records of digitized resources are created by an organization locally and are then harvested by dpla. the eligible records in dpla are then copied over into wikimedia commons. once images are in wikimedia commons it is a straightforward process to embed the images in wikipedia articles, thus achieving the goal of expanded use and access to digitized resources.

john dewees, supervisor, digitization services at the toledo lucas county public library (tlcpl), was in attendance at the april 21, 2020 webinar and subsequently met with dominic byrd-mcdevitt on april 30, 2020 to discuss the feasibility of using tlcpl collections as a pilot project for this workflow.
the copying of records from tlcpl’s repository into wikimedia commons was actually started in the course of that first conversation on april 30. a map from page 96 of the book geography of ohio (see figures 1 and 2), previously digitized by dewees, will be used to illustrate the process of how records move through the various tools and platforms discussed.4 tlcpl makes digitized resources available through ohio memory, a shared contentdm instance for libraries, archives, and museums in ohio maintained by the state library of ohio and the ohio history connection.

figure 1. digitized image of geography of ohio, page 96, as seen in ohio memory.

figure 2. record metadata for geography of ohio, page 96, as seen in ohio memory.

dpla harvest

dpla is a discovery portal that aggregates records of digitized resources from over 4,000 libraries, archives, and museums from around the united states. this creates a single search interface allowing millions of digital records to be searched simultaneously without having to navigate through a wide variety of different digital libraries. the aggregation of these records is accomplished by working with two different types of partners: content hubs and service hubs.5 content hubs are either organizations large enough to contribute to dpla directly, such as the library of congress or harvard library, or large digital libraries that work with partner institutions of their own, such as hathitrust or the internet archive.
service hubs, on the other hand, act as mediators between the national aggregation service and individual organizations in states (such as ohio and its service hub, the ohio digital network) or regions (such as utah, idaho, and nevada, who have collectively formed a service hub in the mountain west digital library). service hubs ensure that the technical and metadata requirements for harvesting into dpla are satisfied and act as consultants and facilitators to prospective contributors. as dpla has grown over time, the metadata requirements and possibilities have also evolved and have varied depending on which service hub a contributing organization is working with. the ohio digital network (odn) is the service hub for our example page from ohio memory. odn’s metadata requirements for contributors in march 2021 included a title and a standardized rights statement in the metadata application profile for the contributing collection. more information on the dpla harvest process for the ohio digital network is available at https://ohiodigitalnetwork.org/contributors/getting-started. the nature of these requirements has also evolved since odn’s first harvest in march 2018. initially, the standardized rights statement was required to be one of the options from rightsstatements.org but through the work of dpla and odn, now creative commons licenses and the cc0 public domain dedication can be utilized as well. standardized rights statements must be formatted as machine-readable uris rather than textual descriptions. finally, the technical backend that supports the harvest of a digital collection is via an oai-pmh feed. other hubs operate in very different ways—such as some that actually host all their contributors’ collections in a single domain—but in all cases the end result is providing a data set that dpla can harvest and ingest. 
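for hubs that expose an oai-pmh feed, a harvester pages through records with the protocol's `ListRecords` verb. the python sketch below only builds the request urls; the base url and set name are hypothetical, and the article does not describe how dpla's own harvester is written. per the oai-pmh specification, a follow-up request carries the `resumptionToken` on its own, without repeating `metadataPrefix` or `set`.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None,
                     resumption_token=None):
    """Build an OAI-PMH ListRecords request URL.

    The first request names a metadata format and, optionally, a set.
    Later requests pass only the resumption token returned by the server.
    """
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec:
            params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# hypothetical feed address and set name
first_page = list_records_url("https://example.org/oai", set_spec="maps")
next_page = list_records_url("https://example.org/oai", resumption_token="abc123")
```

a harvest loop would fetch `first_page`, read the `resumptionToken` element from the xml response, and keep requesting pages until the server returns an empty token.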
figures 3 and 4 illustrate this process, showing the geography of ohio represented as a record in dpla (available at https://dp.la/item/aaba7b3295ff6973b6fd1e23e33cde14) with associated metadata.

figure 3. geography of ohio as seen in dpla, specifically focusing on the thumbnail, link to the original record, and initial metadata fields.

figure 4. geography of ohio as seen in dpla, specifically focusing on the remaining metadata fields harvested.

this process achieves the first level of aggregation: harvesting thumbnail images (full-sized images suitable for research are not harvested in this process) and metadata from local digital repositories and making them available for a unified search experience in dpla. dpla’s aggregation currently contains over 42 million items, with the majority of these containing standardized rights uris; 18 million items have rights compatible with upload to wikimedia commons (as can be seen at https://dp.la/search?rightscategory=%22unlimited%20re-use%22). once dpla has access to the records, the code authored by dpla staff can be used to integrate the resources into wikimedia commons.

wikimedia commons harvest

wikimedia commons is part of the larger network of services and tools under the umbrella of the wikimedia foundation.
a wide variety of tools is available, such as wikidata, a portal for open structured data; wikipedia, a collaboratively edited open encyclopedia; and wikimedia commons. this last portal uses the same software platform that powers wikipedia to create an open file and media server that can interoperate with the other tools. wikimedia commons is capable of hosting digital still images, audio files, and video files. anyone can contribute to this open repository so long as the work is in the public domain or openly licensed. users may either release a work for which they own the rights under an open license at upload time or may upload any other works by providing evidence in the metadata that the work is out of copyright or openly licensed (more information on copyright and licensing in wikimedia commons is available at https://commons.wikimedia.org/wiki/commons:licensing).

with this in mind, in order for records in dpla to be eligible for harvest into wikimedia commons, they first must have one of the five specific standardized rights statements available at the following links:6

• http://rightsstatements.org/vocab/noc-us/1.0/
• https://creativecommons.org/publicdomain/zero/1.0/
• https://creativecommons.org/publicdomain/mark/1.0/
• https://creativecommons.org/licenses/by/4.0/
• https://creativecommons.org/licenses/by-sa/4.0/

the uris above indicate the most recent version of each of the associated copyright descriptions or licenses, though being published under the most recent version is not a requirement for harvest into wikimedia commons.
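the eligibility rule above lends itself to a simple machine check. the sketch below is an illustration only, not dpla’s code: it compares a record’s rights uri against the five statements listed, ignoring scheme, letter case, and the trailing version number (on the reading that older versions are also acceptable). the function names are hypothetical, and jurisdiction-specific creative commons uris (for example, ones ending in a country code) are not handled.

```python
import re
from urllib.parse import urlparse

# The five rights statements listed above, in their canonical letter case.
ELIGIBLE_RIGHTS = (
    "http://rightsstatements.org/vocab/NoC-US/1.0/",
    "https://creativecommons.org/publicdomain/zero/1.0/",
    "https://creativecommons.org/publicdomain/mark/1.0/",
    "https://creativecommons.org/licenses/by/4.0/",
    "https://creativecommons.org/licenses/by-sa/4.0/",
)

# Matches a trailing version segment such as "/4.0/" or "/1.0".
_VERSION = re.compile(r"/\d+\.\d+/?$")


def rights_key(uri):
    """Reduce a rights URI to (host, path without version), lowercased."""
    parts = urlparse(uri.strip().lower())
    path = _VERSION.sub("", parts.path).rstrip("/")
    return parts.netloc, path


_ELIGIBLE_KEYS = {rights_key(uri) for uri in ELIGIBLE_RIGHTS}


def is_eligible(uri):
    """True when the URI names one of the five statements, in any version."""
    return rights_key(uri) in _ELIGIBLE_KEYS


for candidate in (
    "https://creativecommons.org/licenses/by/3.0/",
    "https://creativecommons.org/licenses/by-nc/4.0/",
    "http://rightsstatements.org/vocab/InC/1.0/",
):
    print(candidate, is_eligible(candidate))
```

noncommercial licenses and in-copyright statements fall outside the set, which matches the reuse requirements described in the surrounding text.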
while standardized rights statements are not a requirement for contributing to dpla generally, they are being used as a requirement for wikimedia commons upload so that the software has a machine-readable way to determine the compatibility of rights. though it is a non-profit educational resource, wikimedia commons does not utilize media under fair use or materials only licensed for noncommercial/educational use, in order to ensure its users may reuse the media for any purpose. as a result one thing to keep in mind is that while a given organization may include in their gift or accession agreement a statement that digitized versions of physical resources are allowed to be shared through channels decided by the organization, this does not necessarily extend to wikimedia commons users outside your organization, because of the requirement to be able to reuse materials with little restriction past attribution and the need to share alike, depending on the standardized rights statement. dpla locates the asset to upload by using urls explicitly provided by the service hub; the urls can be provided in one of two ways. one is to provide the iiif manifest url (via the iiif presentation api), from which the dpla-developed software queries the manifest for the list of assets which are listed by the presentation api in the form of iiif image api urls. the other way the media location can be identified is by providing a list of direct urls to the media in the field dpla calls mediamaster during the initial harvest process. unlike the iiif manifest url, this is a multivalued field that can accommodate a list of urls. the reason for this approach is to allow any institution to contribute assets via the pipeline, regardless of whether they actually have implemented iiif in their repository or not. not all organizations have adopted the iiif suite of apis so it is important to be able to provide more than one avenue for wikimedia commons harvest. 
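a minimal sketch of the two avenues just described follows. it assumes a record is a plain dictionary with a hypothetical `manifest` key holding an already-fetched iiif presentation manifest and a `mediaMaster` key holding a list of direct urls; both the record shape and the preference for the manifest over `mediaMaster` are assumptions for illustration, not dpla’s implementation. the traversal follows the published iiif presentation 2 and 3 manifest layouts.

```python
def manifest_image_urls(manifest):
    """Collect image URLs from a IIIF Presentation 2 or 3 manifest dict."""
    urls = []
    # Presentation 3: canvases -> annotation pages -> annotations -> body
    for canvas in manifest.get("items", []):
        for page in canvas.get("items", []):
            for annotation in page.get("items", []):
                body = annotation.get("body", {})
                if isinstance(body, dict) and body.get("id"):
                    urls.append(body["id"])
    # Presentation 2: sequences -> canvases -> images -> resource
    for sequence in manifest.get("sequences", []):
        for canvas in sequence.get("canvases", []):
            for image in canvas.get("images", []):
                resource = image.get("resource", {})
                if resource.get("@id"):
                    urls.append(resource["@id"])
    return urls


def media_urls(record):
    """Use the manifest when it yields images; otherwise use mediaMaster."""
    manifest = record.get("manifest")  # hypothetical key, see lead-in
    if manifest:
        found = manifest_image_urls(manifest)
        if found:
            return found
    return list(record.get("mediaMaster", []))


# Invented example records; the URLs are placeholders.
with_manifest = {
    "manifest": {
        "items": [
            {"items": [{"items": [
                {"body": {"id": "https://example.org/iiif/p96/full/max/0/default.jpg"}}
            ]}]}
        ]
    },
    "mediaMaster": ["https://example.org/files/p96.tif"],
}
without_manifest = {"mediaMaster": ["https://example.org/files/p1.tif",
                                    "https://example.org/files/p2.tif"]}

print(media_urls(with_manifest))
print(media_urls(without_manifest))
```

because a manifest is queried at upload time while `mediaMaster` values date from the last harvest, falling back only when no manifest is present keeps the list of assets as fresh as the source allows.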
however, providing a iiif manifest when and if it becomes available has benefits over the mediamaster field. the manifest will always be current when queried, whereas the mediamaster values are only accurate to the last harvest, which may be a month or more out of date.

figure 5. the dashboard developed by dpla displaying, for pine river library, percent of records that have open rights statements and percent of files with media access.

a dashboard has been developed for dpla content hub and service hub administrators to analyze how many records in a given collection conform to the standardized rights statement and iiif api requirements (see figure 5). harvest of a collection into wikimedia commons from dpla necessitates that all eligible records in the collection be harvested into wikimedia commons; it is not possible for a participating institution to hand-curate which of the eligible items will be included. that is, all records in a given collection with the aforementioned standardized rights statements will be harvested into wikimedia commons. an additional signed agreement or memorandum of understanding has not been required between dpla and participating organizations due to the open nature of the works being transferred. since the works have been identified as in the public domain or openly licensed, users can already freely use the resources for any purpose they want, so long as it conforms to the appropriate creative commons license.
resource presentation in wikimedia commons

each portion of the migration process presents the resource in different ways. the original instance of geography of ohio is made available in contentdm as a complex digital object: multiple images (or more specifically in this case, pages) associated with a single metadata record. dpla presents this resource only in terms of its metadata along with a thumbnail image of the resource itself; to view the contents of the resource the user is directed back to the original repository for full access to the digital object. the migration process into wikimedia commons actually copies the image assets themselves along with the metadata. in this example, both the image assets and the metadata are drawn from contentdm. wikimedia commons is not able to accommodate complex digital objects, and any that are imported via this process are broken out into discrete simple digital objects in wikimedia commons, for example, page 96 of geography of ohio (see figures 6, 7, and 8; page 96 can be viewed in wikimedia commons at https://commons.wikimedia.org/wiki/file:geography_of_ohio_-_dpla_-_aaba7b3295ff6973b6fd1e23e33cde14_(page_96).jpg).

figure 6. geography of ohio, page 96, as seen in wikimedia commons, with a focus on the file name, image, and viewing options.

figure 7. geography of ohio, page 96, as seen in wikimedia commons, with a focus on the record metadata.

figure 8. geography of ohio, page 96, as seen in wikimedia commons, with a focus on the derivative images created from the original record and administrative metadata.
the filename is programmatically generated and embeds a great deal of information; the following example illustrates the various components of the filename.

example filename: file:geography of ohio - dpla - aaba7b3295ff6973b6fd1e23e33cde14 (page 96) (cropped).jpg

1. the prefix for all items in wikimedia commons, “file:”
2. the title of the work, in this case “geography of ohio” followed by a hyphen
3. the source of the digital object, universally “dpla” for this project, followed by a hyphen
4. the unique identifier assigned by dpla, in this case “aaba7b3295ff6973b6fd1e23e33cde14”
5. in the case of complex objects, the page number, in this case “(page 96)”
6. if the file was cropped using wikimedia commons’ built-in image editing tool, “(cropped)” will be included between the page number and file extension to indicate the image is a derivative of an original
7. the file format extension, in this case “.jpg”

even if the complex object being imported is not actually a book, the individual item records in wikimedia commons still use the “(page x)” nomenclature to differentiate the individual objects.

the summary section of the wikimedia commons record displays how the metadata is crosswalked into this environment. the dublin core creator, title, description and date fields are copied verbatim from the local metadata application profile (map). to identify the contributing institution, and to differentiate between similarly named institutions, dpla maintains a json file mapping all dpla institutions with their wikidata identifiers.7 this document also indicates which hubs/institutions are participating in the project at any given time through a true/false field that is toggled when an institution authorizes upload.
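the filename components enumerated earlier in this section can be assembled mechanically. the sketch below is a hypothetical reconstruction based only on that list and on the example page’s url; the function names are invented, and details such as the handling of very long titles or characters not permitted in commons filenames are left out.

```python
def commons_filename(title, dpla_id, page=None, cropped=False, extension="jpg"):
    """Assemble a filename from the components listed in the text."""
    name = f"{title} - DPLA - {dpla_id}"
    if page is not None:
        name += f" (page {page})"  # only present for complex objects
    if cropped:
        name += " (cropped)"       # marks a derivative of an original
    return f"File:{name}.{extension}"


def commons_url(filename):
    """Page URLs use underscores where the filename has spaces."""
    return "https://commons.wikimedia.org/wiki/" + filename.replace(" ", "_")


example = commons_filename(
    "Geography of Ohio", "aaba7b3295ff6973b6fd1e23e33cde14", page=96
)
print(example)
print(commons_url(example))
```

passing `cropped=True` yields the “(cropped)” variant shown in the example filename above.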
the mapping of institutions to wikidata identifiers enables distinct category pages for each contributing institution and analytics to be tracked and provided to dpla, relevant hubs, and contributing institutions. the source/photographer field is one of the most important as it ensures that attribution for the contributing institution is clear. the field contains a narrative description of how dpla facilitated this resource to be available in wikimedia commons. it also makes available information on the original contributing institution with links to the record as it is originally displayed (in ohio memory in this case) as well as in dpla. proper attribution of items was a topic that came up continuously when discussing this project with other organizations, so it should be reassuring to know that credit and direct links back to resources are enabled in this workflow. the permission and standardized rights statement fields leverage the aforementioned uris to be able to provide information to the user on the copyright status of the work as well as concrete information on how exactly they are able to use it responsibly for their own purposes. finally, an interesting aspect of this record is the links provided to derivative images. in this case we can see the map displayed on this book page has two cropped derivative images.

use in wikipedia articles

all of the work described above is in service of one goal: to enable higher usage and exposure of digitized resources in wikipedia articles. while it is possible to do this work manually, inserting images into articles without being a dpla contributor or even having a digital repository to speak of, the automated process is a clear advantage, especially when talking about large collections.
for the map on page 96 in geography of ohio, we can see that the map of limestone distribution in ohio has been included in an image gallery on the limestone wikipedia article at https://en.wikipedia.org/wiki/limestone (see figures 9 and 10).

figure 9. the wikipedia article on limestone displaying the introduction and one image (but not the worked-example image).

figure 10. the image gallery in the limestone wikipedia article, with the map from geography of ohio included and seen at the bottom right.

figure 11. the source view editing option for the limestone article in wikipedia, allowing direct editing of the wikitext.

once images are in wikimedia commons, embedding the images in wikipedia articles is a simple process. one option for wikipedia editors is to use a what you see is what you get (wysiwyg) html editor that should be familiar to most users. alternately, there is also a source view editing option which uses the custom markup language called wikitext to format pages in wikipedia (see figure 11; documentation is available at https://en.wikipedia.org/wiki/help:wikitext). source view editing allows more precision when inserting images into wikipedia articles and makes it easier to understand how they will ultimately be displayed in the article. the way in which different page elements flow around one another in articles can be surprising when using the wysiwyg editing option, as images assumed to show up where you placed the cursor can ultimately be placed in very different locations than expected.8

usage analytics

analytics tools are available that allow organizations to track the articles containing their assets, showing what image was embedded in an article and how many views the article received. tlcpl’s initial ingest added a total of 129,725 discrete image assets to wikimedia commons.
from that pool, images were added to a total of 227 wikipedia articles between may 2020 and february 2021 (see figure 12). in that time period the articles had a total of 11.7 million page views (see figure 13).9 in february 2021 alone, the 227 enriched articles received 1.87 million page views. by comparison, the total number of records tlcpl had available in ohio memory was 129,395 in february 2021, and those records received 26,602 unique page views. the major strength of this project is to display locally created digitized resources where researchers would already be on the open web and take advantage of that much wider level of exposure.10

figure 12. a graph displaying the cumulative total number of articles with inserted images from tlcpl resources from may 2020 to february 2021.

figure 13. a graph displaying the monthly total number of page views of wikipedia articles with inserted images from tlcpl resources from may 2020 to february 2021.

there is a valid discussion to be had about the comparison of these metrics, as comparing page views to unique page views is not a one-to-one match, but no matter the measurement it is fairly clear this audience is an order of magnitude beyond what might conventionally be available.

ultimately what might be one of the most exciting metrics for an organization looking to implement this work is the amount of time it actually took to execute this project. since tlcpl’s records already satisfied the requirements to be copied over to wikimedia commons, the actual import process was able to begin during the april 30, 2020 zoom call between tlcpl and dpla staff that was set up to discuss the project; from the perspective of the contributing organization, this process takes essentially no time or effort.
once the process is started, staff at the contributor institution will be informed when the records have finished being copied. the actual work of locating images for inclusion into articles and inserting them took roughly an hour of work a week, or roughly ten minutes per article, sometimes less. approximately 38 hours of work was spent identifying images and inserting them into articles between may 2020 and february 2021. while not of central concern to the project or its usage, the editorial work is also interesting and uses enough creativity and problem solving to be an enjoyable activity. because the resources in wikimedia commons are available to be used by anyone (as in, anyone with a device and an internet connection), this work is a wonderful opportunity for interns or volunteers to contribute. volunteers could work on the editorial portion of this project remotely with no real barriers. while all the effort of getting tlcpl digitized images into wikipedia described here has been using previously existing articles, this work could make an excellent partnership opportunity with schools to write and create whole new articles for which there is an abundance of already digitized resources to support.

conclusion

the work of remediating and writing metadata to participate in large consortial efforts such as dpla is always going to be a major undertaking, but projects like this that can leverage automation and partnerships show just how powerful these relationships can be. making locally digitized resources available through dpla, copying them over to wikimedia commons, and then embedding those images into wikipedia articles is an excellent opportunity to meet users where they already are: online.
this work provides exceptionally high usage statistics and is fertile ground for outreach and programming opportunities to get partners, volunteers, and interns involved with making those digitized resources available in wikipedia.

acknowledgements

special thanks to jen johnson, library consultant at the state library of ohio, and virginia dressler, digital projects librarian at kent state university, for their support in enabling this work and article.

endnotes

1 this presentation is available on youtube at https://youtu.be/0bsoksybcbi. information on all past dpla webinars and programming can be found at https://pro.dp.la/events/workshops.

2 the entire collection of all resources contributed to wikimedia commons via dpla can be found at https://commons.wikimedia.org/wiki/category:media_contributed_by_the_digital_public_library_of_america.

3 statistics related to contributor totals were created from a wikimedia database query published at https://quarry.wmflabs.org/query/51256.

4 geography of ohio was published as part of a series of bulletins by the ohio state geological survey. the book was authored by roderick peattie, an assistant professor of geology at ohio state university, in 1923. this item was digitized by the toledo lucas county public library and uploaded as part of public domain day 2019. the digitized version of this book is available through ohio memory at https://ohiomemory.org/digital/collection/p16007coll33/id/115214.

5 more information, including a complete list of content hubs and service hubs and their geographic distribution, is available on the dpla website at https://pro.dp.la/hubs/our-hubs.

6 as shared by dominic byrd-mcdevitt in a webinar on march 18, 2021 entitled “dpla + wikimedia: one year in + ten million views,” available at https://www.youtube.com/watch?v=jloj0gvvsnu.

7 the json file is available for view on dpla’s github page at https://github.com/dpla/ingestion3/blob/develop/src/main/resources/wiki/institutions_v2.json.
8 for step-by-step instructions on adding images into wikipedia articles after harvest, see the blog post at https://johndewees.com/2021/03/18/adding-images-to-wikipedia-articles-via-dpla/.

9 all statistics on wikipedia page views are drawn from the baglama 2 utility available at https://glamtools.toolforge.org/baglama2/#gid=430.

10 up-to-date statistics and data are available at the digitization statistics dashboard created to communicate about major projects in digitization services at the toledo lucas county public library and available at https://docs.google.com/spreadsheets/d/1jv0zzt6h_jl1tq8v2zdxmf5ygn0dfbnhqbffifcepzm/edit?usp=sharing.
public libraries leading the way

intro to coding using python at the worcester public library

melody friedenthal

information technology and libraries | june 2020 https://doi.org/10.6017/ital.v39i2.12207

melody friedenthal (mfriedenthal@mywpl.org) is a public services librarian, worcester public library.

abstract

the worcester public library (wpl) offers several digital learning courses to our adult patrons, and among them is “intro to coding using python”. this 6-session class teaches basic programming concepts and the vocabulary of software development. it prepares students to take more intensive, college-level classes. the bureau of labor statistics predicts a bright future for software developers, web developers, and software engineers. wpl is committed to helping patrons increase their “hireability” and we believe our python class will help patrons break into these lucrative and gratifying professions… or just have fun.

history and details of our class

i came to librarianship from a long career in software development, so when i joined the worcester public library in january 2018 as a public services librarian, my manager proposed that i teach a class in programming. she asked me to research what language would be best.
python got high marks for ease of use, flexibility, growing popularity, and a very active online community. once i selected a language, i had to choose an environment to teach it in – or so i thought. i had absolutely no experience in front of a classroom, and few pedagogical skills, so i sought out an online python course within which to teach.

i decided to use the code academy (ca) website as our programming environment. ca has self-guided classes in a number of subjects and the free beginning python course seemed to be just what we needed. i went through the whole class myself before using it as courseware. my intent was to help students register for ca, then, each day, teach them the concepts in that day’s ca lesson. they would then be set to do the online lesson and assignments. we first offered python in june 2018.

problems with ca came up right from the start: students registered for the wrong class (despite the handout explicitly naming the correct class) and ca frequently tried to upsell to a not-free python class. since ca’s classes are moocs (massive open online courses), the developers built in an automated way of correcting student code: embedded behind each web page of the course, there’s code that examines the student’s code and decides whether it is acceptable or not. good in theory, not so good in practice. ca’s “code-behind” is flawed and sometimes prevented students from advancing to the next lesson.

information technology and libraries june 2020 intro to coding using python at the worcester public library | friedenthal 2

moreover, some of the ca tasks were inane. for example, one lesson incorporated a kind of mad libs game. this is where the instructions ask, for example, for 13 nouns and 11 adjectives, and these are combined with set sentences to generate a silly story. this assignment turned out to be too long and difficult to fulfill, preventing students from advancing.
although i used ca the first few times i offered the class, i subsequently abandoned it and wrote my own classroom material. after determining that ca wasn’t appropriate, i chose an online ide where the students could code independently. this platform worked well when i tested it ahead of time, but when the whole class tried to log on at once, we received denial-of-service error messages. hurriedly moving on to plan c, i chose thonny, a free python ide which we downloaded to each pc in the lab (see https://thonny.org/).

each student receives a free manual (see figure 1), which i wrote. every time i’ve offered this class i’ve edited the manual, clarifying those topics the students had a hard time with. i’ve also added new material, including commands students have shown me. it is now 90 pages long, written in microsoft word, and printed in color. we use soft binders with metal fasteners.

figure 1. intro to coding using python manual developed for the course.

the manual consists of the following sections:

• cover: course name, dates we meet, time class starts and ends, location, instructor’s name, manual version number, and a place for the student to write their own name.
• syllabus: goals for each of the six sessions. this is aspirational.
• basic information about programming, including an online alternative to thonny, for students who don’t have a computer at home and wish to use our public computers for homework.
• lessons 1 – 17: “hello world” and beyond.
• lesson 18: object oriented design, which i consider to be advanced, optional material. skipped if time is pressing or the class isn’t ready for it.
• lesson 19: wrap-up:
  o how to write good code.
  o how to debug.
  o list of suggested topics for further study.
  o online resources for python forums and community.
• list of wpl’s print resources on python and programming.
• relevant comic strips and cartoons. in march 2019, my manager asked me to start assigning homework. if a student attends all six sessions and makes a decent attempt at each assignment, at the sixth session they receive a certificate of completion. the certificate has the wpl name & logo, the student’s name, and my signature. typically three or four students earn a certificate. homework is emailed to me as an attachment. this class meets on tuesday evenings and i tell students to send me their homework as soon as possible. inevitably, several students don’t email me until the following monday. while i don’t give out grades, i do spend considerable time reviewing homework, line by line, and i email back detailed feedback. when the january 2020 course started, i found that between october’s class and january, outlook implemented a security protocol which removes certain file extensions from incoming email. and – you can see where this is going – the .py python extension was one of them. i told students to rename their python code files from xxxx.py to xxxx.py.doc, where “xxxx” is their program name. this fools outlook into thinking the file is a microsoft word document and the email is delivered to me intact. when it arrives, i remove the .doc extension from the attachment and save it to a student-specific file. then i open the file in thonny and review it. physically, our computer lab contains an instructor’s computer and twelve student computers (see figure 2). it also has a projector which projects the active window from the instructor’s computer onto a screen: usually the class manual. i use dry erase markers in a variety of colors to illustrate the concepts on a whiteboard. there is also a supply of pencils on hand for student notetaking use. the class is offered once per season. although the classroom can accommodate twelve students, we set our maximum registration to fourteen, which allows us to maximize attendance even if patrons cancel or don’t show up. 
and if all fourteen do attend the first class, we have two laptops i can bring into the lab. we also maintain a small waitlist, usually of five spots. we’ve offered this class seven times and the registration and waitlists have been full every time. sometimes we have to turn students away.

figure 2. classroom at worcester public library.

however, we had a problem with registered patrons not showing up, so last spring we implemented a process where, about a week before class starts, i email each student, asking them to confirm their continued interest in the class. i tell them that if they are no longer interested, or don’t respond, i will give the seat we reserved for them to another interested patron (from the waitlist). in this email i also outline how the course is structured and that they can each earn a certificate of completion. i tell them class starts promptly at 5:30 and to please plan accordingly. some students don’t check their email. some patrons show up without ever registering; they are told registration is required and to try again in a few months. i keep track of attendance on an excel spreadsheet. here in worcester, ma, weather is definitely a factor for our winter sessions.

over time i’ve made the class more dynamic. i have a student read a paragraph in the manual aloud. i’ve switched around the order of some lessons, in response to student questions. i have them play a game to teach boolean logic: “if you live in worcester and you love pizza, stand up!”… then: “if you live in worcester or you love pizza, stand up!” students range from experienced programmers (of other languages), to people with no experience but great aptitude, to people who just never seem to “get it”.
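the stand-up game maps directly onto python’s `and` and `or` operators. the following is a minimal sketch of that idea, not an excerpt from the course manual; the student names and answers are made up.

```python
# Each made-up student answers the two questions from the game.
students = [
    {"name": "ana", "lives_in_worcester": True, "loves_pizza": True},
    {"name": "ben", "lives_in_worcester": True, "loves_pizza": False},
    {"name": "cy", "lives_in_worcester": False, "loves_pizza": True},
    {"name": "di", "lives_in_worcester": False, "loves_pizza": False},
]

# "and": both conditions must be true for a student to stand.
stand_for_and = [
    s["name"] for s in students
    if s["lives_in_worcester"] and s["loves_pizza"]
]

# "or": one true condition is enough for a student to stand.
stand_for_or = [
    s["name"] for s in students
    if s["lives_in_worcester"] or s["loves_pizza"]
]

print("and:", stand_for_and)
print("or:", stand_for_or)
```

the second round always has at least as many students standing as the first, since anyone who satisfies both conditions also satisfies either one.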
this material is technical and i try hard to communicate the concepts but i lose a few students every time. we ask our patrons for feedback on all of our programs. our python students have written:
• “… the classes were formatted in an organized manner that was beginner friendly”
• “the manual is a big help. i'm thankful that the program is free.”
• “… coding is fun and i learned a new skill.”
• “this made me think critically and helped me understand where my errors in the programs were.”
wpl is proud to offer classes that make a difference in our patrons’ lives.

149 an integrated computer based technical processing system in a small college library
jack w. scott: kent state university library, kent, ohio (formerly lorain county community college, lorain, ohio)

a functioning technical processing system in a two-year community college library utilizes a model 2201 friden flexowriter with punch card control and tab card reading units, an ibm 026 key punch, and an ibm 1440 computer, with two tape and two disc drives, to produce all acquisitions and catalog files based primarily on a single typing at the time of initiating an order. records generated by the initial order, with slight updating of information, are used to produce, via computer, manual and mechanized order files and shelf lists, catalogs in both the traditional 3x5 card form and book form, mechanized claiming of unfilled orders, and subject bibliographies.

the lorain county community college, a two-year institution designed for 4000 students, opened in september 1964, with no librarian and no library collection. when the librarian was hired in october 1964, lack of personnel, both professional and clerical, forced him to examine closely traditional ways of ordering and preparing materials, his main task being the controlled building of a collection as quickly as possible.
no library having been established, there were no inflexible rules governing acquisitions or cataloging and no catalogs or other files enforcing their pattern on future plans. the librarian was free to experiment and adapt as much as he desired; and adapt and experiment he did, remembering, at least most of the time, the primary reasons for designing the system. these were 1) to notify the vendor about what material was desired; 2) to have readily available information about when material had been ordered and when it might arrive; 3) to provide a record of encumbrances; 4) to make sure that material received was the material which had been ordered; 5) to initiate payment for material received; 6) to provide catalog copy for technical processes to use in producing card and book catalogs; 7) to provide inexpensive control cards for a circulation system; and 8) to provide whatever other statistics might be needed by the librarian.

journal of library automation vol. 1/3 september, 1968

the librarian attended the purdue conference on library automation (october 2-3, 1964) and an ibm conference on automation held in cleveland (december 1964), and visited libraries with data processing installations, such as the decatur public library. then an extensive literature search was run on the subject of mechanization of libraries and the available material thoroughly reviewed. it was the consensus of the president, the librarian, and the manager of data processing that, as white said later, "the computer will play a major part in how libraries are organized and operated because libraries are a part of the fabric of society and computers are becoming a daily accepted part of life." (1) moreover, it was agreed that the use of data processing equipment would be justified only if it made building a collection more efficient and more economical than manual methods could do.
method

after careful consideration of the ibm 870 document writing system (2) and the system described by kraft (3) as input techniques for the college library, it was decided to use the friden flexowriter, recommended both at purdue and, in european applications, by bernstein (4). its most attractive feature was the use of paper tapes to generate various secondary records without the necessity of proofreading each one. the college, by mid-1965, had the following equipment available for library use: one friden flexowriter (model 2201) with card punch control unit and tab card reading unit, one ibm 026 key punch with alternate programming, and guaranteed time on the college-owned ibm 1440 8k computer with two tape and two disc drives. to produce punched paper tape and tab cards with only one keyboarding, an electrical connection between the flexowriter and the keypunch was especially designed and installed.

it was fortunate for the library that the college also had an excellent data processing manager who was interested in seeing data processing machines and techniques utilized in as many ways as possible. with his enthusiastic support, aid in programming and preparation of flow charts, and patient cooperation, it was not surprising that the automation of library processes was completely successful.

at this time it was decided that since the college was likely to remain a single-campus institution it would be uneconomical to rely solely on a book catalog, even though the portability of such a device was most attractive to librarian and faculty alike. therefore, it was planned to have the public catalog, as well as the official shelf list, in card form, permitting both to be kept current economically. these two files were to be supplemented with crude book catalogs which would be a by-product, among others, of the typing of the original book orders.
these book catalogs were not to replace the card catalog but simply to extend and facilitate use of the collection. it was also decided to design a system which would duplicate as few as possible of the manual aspects of normal technical processing systems, but one which would, at the same time, permit the return to a manual system from a machine system with a minimum of trouble and tribulation if support for the library's automated system should be withdrawn. concern about such withdrawal of support had originally been voiced by durkin and white in 1961, when they said: "there have been a number of unfortunate examples of libraries that abandoned their home-grown catalogs for a machine retrieval program because there was some free computer time, only to lose their machine time to a higher priority project and to be left with information storage to which they no longer have access. many of these librarians, and others who have heard about their plight, are determined not to burn their bridges behind them by abandoning their reliable, if old-fashioned, 3x5 card catalogs." (5) although the necessity of returning to an inefficient manual system has not, to date, raised its ugly head, there were times when it was most comforting to know that routes of retreat and reformation were available.

under the present system there is only one manual keyboarding of descriptive catalog main entries for most titles. all other records are generated from these main entries. this integrated system was adopted on the assumption that cataloging information in some form (6) would be available for a high percentage of books. experience showed that about 95 percent of acquisitions did have catalog copy readily available. of 4029 titles processed in a 5-month period, catalog copy was available for 3824.
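as a quick check of the figures just quoted, the share of titles with catalog copy can be computed directly. this is only a worked example of the arithmetic; the variable names are illustrative and are not part of the system described.

```python
# figures reported for the 5-month period
titles_processed = 4029
titles_with_copy = 3824

# titles for which no catalog copy could be located
titles_without_copy = titles_processed - titles_with_copy

# fraction of processed titles that had catalog copy available
share_with_copy = titles_with_copy / titles_processed

print(f"titles without catalog copy: {titles_without_copy}")
print(f"share with catalog copy: {share_with_copy:.1%}")
```

the result is 94.9 percent (205 titles without copy), which agrees with the rounded figure of about 95 percent.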
after verification that a requested title is neither in the library nor on order, a copy of a catalog entry is located in a source such as the national union catalog, library of congress proofsheets, or publisher's weekly, etc. the catalog information is manually typed in its entirety (including subject headings) onto five-part multiple request forms, using the friden flexowriter. output from the friden consists of the multiple order, a punched paper tape containing the full bibliographic entry but no order information, and tab cards, punched by the slave ibm key punch, which contain full order information but only abbreviated bibliographic data (figure 1). the tab cards, containing full order information, are used as input to the 1440 computer to create an "on order" file arranged by order number and stored on magnetic tape, from which an "on order" printout is produced weekly (figure 2). at any given time this magnetic tape order file can be used to total the dollar amount of outstanding orders to any given vendor, or the total amount outstanding to all vendors (figure 3).

fig. 1. on order creation routine. (flowchart; labels: start, flexowriter, 026 key punch, typed multiple book orders, mail copies to vendor, on order cards, on order tape.)

fig. 2. on order update. (flowchart; labels: start, on order cards for one week, cpu on order update, scratch after update.)

the punched paper tape and two copies of the request form are stored in a standard 3x5 card file arranged by main entry. one copy of the request form is to be used as a work slip when material is received. the original and one copy of the request form are sent to the vendor, with instructions to return one copy with shipment.
in the event the vendor does not comply, the main entry can be located readily by checking the order number or order date on the "on order" printout and using the abbreviated bibliographic information which appears there. if the material requested has not been shipped within three months, the magnetic tape order file is used to prepare tab cards containing all original order information and the cards are sent to the library with a notice stating that shipment is overdue. these tab cards are used as input to the flexowriter tab card reader unit which activates the flexowriter itself and prepares "overdue, ship or cancel" notices to the vendor (figure 4).

fig. 3. on order cost tally.

fig. 4. late on order routine.

products

when material is received, the paper tape and one copy of the main entry work slip are pulled from the card order file and sent to the cataloger who notes on the work slip the call number to be used as well as any changes. the work slip, punched paper tape and book then pass to the technician who does the shelf listing. at this point the original output paper tape containing full bibliographic information is used as input for the flexowriter to create a standard 3x5 hard-copy shelf list card containing full bibliographic information, as well as inventory data such as vendor, date of receipt and cost. the last three items and the call number are added manually as "changes." simultaneously a new paper tape is produced as output which contains bibliographic information from the first tape and all revisions deemed necessary by the cataloger. the revised paper tape is used on the flexowriter to prepare 3x5 card sets for the public catalog. at the same time the slave keypunch prepares a set of tab cards containing full acquisitions information: cost, vendor, date of receipt; and abbreviated bibliographic information: short author, short title, full call number (including copy, year, part and volume), accession number and short edition statement (figure 5).

fig. 5. shelf list creation routine.

the tab cards are used first to delete the item from the magnetic tape "on order" file and second as input to create a magnetic tape shelf list of abbreviated information arranged by call number (figure 6). the magnetic tape shelf list is used to create 1) eight copies of author, title, and classified catalogs which are updated semi-annually; 2) printouts of weekly acquisitions; 3) subject printouts on demand; and 4) tab cards which serve as circulation cards for books, films, drawings, tape and disc recordings, filmstrips and any other materials. the tab cards can be used with the ibm 357 circulation system or any similar system.

fig. 6. weekly shelf list update.

discussion

the efficiency of this system is most dramatically demonstrated by the amount of work accomplished per person per year. one technician can process over one thousand orders per month. over fifteen thousand fully cataloged volumes per year (approximately eleven thousand titles) are added to the collection by a technical processing department which consists solely of one full-time cataloger and two full-time technicians. one technician spends one half of her time typing orders and the other half preparing the shelf list. at present the limiting factor in processing material is not the personnel time available but rather time on the flexowriter-keypunch combination, which runs continuously for sixty hours per week.
the cataloger feels if some thirty hours more per week were available for running the machines, or if a second flexowriter were available to handle catalog card output, it would then be possible to order, receive, and fully process fifteen thousand titles per year (eighteen to twenty thousand volumes) with only the present technical processing staff.

references
1. white, herbert s.: "to the barricades! the computers are coming!" special libraries 57 (november, 1966), 631.
2. general information manual: mechanized library procedures (white plains, n.y.: ibm, n.d.).
3. kraft, donald h.: library automation with data processing equipment (chicago: ibm, 1964).
4. bernstein, hans h.: "die verwendung von flexowritern in dokumentation und bibliothek", nachrichten für dokumentation 12 (june, 1961), 92.
5. durkin, robert e.; white, herbert s.: "simultaneous preparation of library catalogs for manual and machine applications", special libraries 52 (may, 1961), 231.
6. kaiser, walter h.: "new face and place for the catalog card", library journal 88 (january, 1963), 186.

reproduced with permission of the copyright owner. further reproduction prohibited without permission.

beyond information architecture: a systems integration approach to web-site design maloney, krisellen;bracke, paul j information technology and libraries; dec 2004; 23, 4; proquest pg. 145

beyond information architecture: a systems integration approach to web-site design
krisellen maloney and paul j. bracke

users' needs and expectations regarding access to information have fundamentally changed, creating a disconnect between how users expect to use a library web site and how the site was designed. at the same time, library technical infrastructures include legacy systems that were not designed for the web environment. the authors propose a framework that combines elements of information architecture with approaches to incremental system design and implementation.
the framework allows for the development of a web site that is responsive to changing user needs, while recognizing the need for libraries to adopt a cost-effective approach to implementation and maintenance.

the web has become the primary mode of information seeking and access for users of academic libraries. the rapid acceptance of web technologies is due, in part, to the ubiquity of the web browser, which presents a user interface that is recognized and understood by a broad range of users. as libraries increase the amount of content and broaden the range of services available through their web sites, it is becoming evident that it will take more than a well-designed user interface to completely support users' information-seeking and access needs. the underlying technical infrastructure of the web site must also be organized to logically support the users' tasks. library technical infrastructures, largely designed to support traditional library processes, are being adapted to provide web access. as part of this adaptation process, they are not necessarily being reorganized to meet the changing expectations of web-savvy users, particularly younger users who are not familiar with traditional library organization methods such as the card catalog, print indexes, or other legacy tools. libraries must harness the power of the highly structured information systems that have long been a part of libraries and integrate these systems in new ways to support users' goals and objectives. part of this challenge will be answered by the development of new systems and technical standards, but these are only a partial solution to the problem. an important part of making library systems and web sites function as powerful discovery tools is to modernize the systems that provide existing services and content to support the changing needs and expectations of the user.
emerging concepts of information architecture (ia) describe the system requirements from the user perspective but do not provide a mechanism to conceptually integrate existing functions and content, or to inform the requirements necessary to modernize and integrate the current system architecture. the authors propose a framework for approaching a comprehensive web-site implementation that combines components of ia and system modernization that have been successful in other industries. within this framework, those components are tailored for the unique aspects of information provision that characterize a library. the proposed framework expands the concept of ia to include functional and content requirements for the web site. this expansion identifies points within the conceptual and physical design where user requirements are constrained by the existing infrastructure. identification of these constraints begins an iterative design process in which some user requirements inform changes to the underlying system architecture. conversely, when the required changes to the underlying system architecture cannot be achieved, the constraints inform the conceptual design of the web site. the iterative nature of this approach acknowledges the usefulness of much of the existing infrastructure but provides an incremental approach to modernizing installed systems. this framework describes aspects of the conceptual and physical design elements that must be considered together and balanced to produce a web site that supports the goals and objectives of the user but is cost-effective and practical to implement.

information architecture and the problem of libraries

ia is both a characteristic of a web site and an emerging discipline. a number of authors have attempted to develop a formal definition of ia.
wodtke presents a simple task-based definition, stating that an information architect "creates a blueprint for how to organize the web site so that it will meet all (business, end user) these needs." 1 rosenfeld and morville present a four-part definition in which two parts focus on the practice, and two parts define ia as a characteristic. the first characteristic defines ia as a combination of "organization, labeling, and navigation schemes" while the second describes it as "the structural design of an information space to facilitate task description and intuitive access to content." 2 there is general agreement that ia provides a specification of the web site from the perspective of the user. the specification usually describes the organization, navigational elements, and labeling required to completely structure a user's web-site experience. ia is not synonymous with web-site design, but rather provides the conceptual foundation upon which a presentation design is based. web-site design adds presentation and graphical elements to ia to create the user experience.

krisellen maloney (maloneyk@u.library.arizona.edu) is director of technology at the university of arizona libraries, tucson. paul j. bracke (paul@ahsl.arizona.edu) is head of systems and networking at the arizona health sciences library, tucson.

library web sites provide a display platform by which library content and services can be accessed through a common user interface. most of the tools and services have been available for decades and, in response to user demand, are increasingly being made web-accessible in digital formats (virtual reference, full-text databases). despite this new access medium and format, the conceptual design of the underlying systems has not changed much.
the library technical infrastructure is made up of many loosely coupled systems optimized to perform a single function or to support the work of a library department. library web sites do not present a sufficiently unified interface design or level of technical integration to match current users' mental models of information seeking and access. 3 the systems have not been integrated to support users' overarching goals or meet the expectation of seamless access that they have developed when using other web sites (such as google or amazon). in many cases, users are still expected to understand aspects of the library that are now obsolete (card catalogs) in order to navigate the library's web site. for example, the process of finding a journal article using a typical library web site is based on a print paradigm and has changed little despite the advent of online discovery tools. in a print environment, users first looked at an index to identify an article of interest, then wrote down the citation, went to the card catalog, and there looked up the journal containing the article. if the library owned the journal, the user would then write down the call number and go to the shelves to find the article. this process has not necessarily changed much for many libraries, even though indexes, card catalogs, and journals are often available online. even more confusing is that the end result of some search processes within a library web site is not necessarily content, but a metadata representation of content that must be entered into another search box. although the first search is representative of the search of a traditional index and the second search is representative of the search of the card catalog, many of our users have no mental model for this multistep search process. users accustomed to the simple keyword search available through internet search engines may have great difficulty in understanding the need for the many steps involved in library use. 
there is an expectation that search systems and online content will be linked, regardless of the economic, legal, and technical factors that make these links difficult. while linking options in vendor databases and openurl resolvers have begun to simplify the electronic version of the process by automating some of the steps, the multistep process is still valid in many instances in most libraries. it is clear that library web sites must undergo a fundamental change in order to be responsive to the needs of the user. because library web sites appear to be similar to conventional web sites, it is tempting to adopt a general approach to ia to address users' needs. there are, however, several areas in which the general approach to ia does not adequately support the design needs for library web sites. generalized ia approaches, such as those provided by rosenfeld and morville, do not provide adequate guidance regarding the organization and display of content from external sources. there is an unstated assumption that external sources will provide information in the format specified by the web-site architect. ia approaches suggest methods to completely describe the user experience, from the time a user first accesses a site to the point at which a user task is complete, regardless of the origin of the content or service accessed. for example, the content from each of amazon.com's commercial partners is packaged to operate like a part of the amazon.com site. in contrast, libraries often only have control of a user's experience up to the point at which they leave a library's servers. libraries guide users not only to local services and digitized collections, but to databases, journals, and more that are licensed from external sources and the appearances of which are controlled by external sources.
even when using a technical standard such as z39.50 to provide a local look and feel to remote resources, libraries do not necessarily have full control over the data format or elements of the content that is returned. this lack of local control over content is a limitation to libraries adopting common definitions of ia. another design area that is not well supported by generalized approaches to ia is the integration of previously installed systems, such as library catalogs. these legacy systems provide important services that represent decades of development and collaboration, and are essential to the future of libraries. for example, libraries provide access to unique resources and systems ranging from online catalogs to abstracting and indexing databases to interlibrary loan (ill) networks. libraries are using web technologies to provide new access methods to library content and services. these technologies provide a thin veneer on systems that function in a manner unfamiliar to many users. the challenge then becomes to change what lies beneath the surface, the underlying functionality of the site, to support the needs of the user. using a generalized approach to ia, as applied in other settings, libraries would assess the needs of the user and develop a new, complete system that supports those needs. such an approach ignores the extensive, existing infrastructure of legacy systems in libraries that is still useful and that serves purposes beyond the user's web interface. what is needed is a standard reference model for library services that provides a framework for access to services and content.
this is a long-term goal that requires cooperation and agreement among libraries, and that would allow legacy systems to be repackaged in ways that are more flexible, meet changing user needs, and can be integrated into changing technology environments. because there are currently no such reference models, librarians need to develop other approaches to integrate existing legacy systems into a modernized web site.

extending the ia framework

in this paper, the general definition of ia that has been proposed by several authors has been extended to incorporate the additional constraints that characterize library web sites. 4 extended information architecture (eia) is the first half of the framework, and provides a complete conceptual design of the web site from the users' perspective. figure 1 depicts the elements and relationships within eia. the coordinating structure provides an overarching framework for the integration of the multiple service elements that provide much of the underlying functionality of the web site. the relationship between the coordinating structure and the service elements is iterative, with service elements constraining the coordinating structure and the coordinating structure informing the design of the service elements.

the coordinating structure

the coordinating structure contains many of the design elements that are found in generalized approaches to ia, including the organization, navigational structure, and labeling. these are the elements of a web site that, in concert, define the structure of the user interface without specifying the functionality and content underlying that interface. the framework emphasizes aspects of the generalized approaches that are most relevant to libraries and places them in relation to the service elements that specify the content and functionality of the site. the first element of the coordinating structure is the organization of the web site.
organization refers to the logical groupings of the content and services that are available to the user. these groupings are not necessarily representative of physical-system implementations, but may be task- or subject-based instead. for example, many academic library web sites have primary groupings that include information resources, services, and user guides. although the information resources may include information from a range of systems (for instance, the catalog, abstracting and indexing databases, full-text databases, locally-developed exhibits), the logical grouping of information resources unifies the concept for the user. a site's organization scheme will often serve as the foundation for the primary navigational choices on a site's main menu or primary navigational bar. another component of the coordinating structure is the navigational structure of the site. navigational structures define the relationships between content and service elements of a site, and between groupings in the site's organization. these structures also include search tools and other link-management tools that help users locate needed content and services. there are usually two types of relationships that form a navigational structure. first is the definition of a global relationship scheme that outlines the primary navigational structure of the site. these often define relationships between sections of a site's organization, but may also provide access to key pieces of functionality from any point within a site. in addition to the overarching global relationship scheme, there are often several locally or functionally defined relationship schemes that are used throughout the site. these local relationship schemes are usually located within a service or content grouping and provide logical connections within their defined grouping. both sets of relationships are designed to support a task and provide pathways for the user to move among the various elements of the site.
other relationship schemes may be topic oriented, allowing the user to move easily among similar content sources. these logical relationships are later implemented within a user interface as tools such as menus, navigation bars, and navigation tabs when combined with labels and a visual design. customization and personalization are navigational structures that have gained a fair amount of attention in the library literature. both strategies allow a web site to be displayed differently, based on user characteristics. customization allows the user to create the relationships most suitable for his or her needs. this strategy has been explored by a number of libraries, although there is little convincing evidence that users implement such strategies in an intense or repeated manner. 5 personalization allows a system designer to bring together a set of pages in a relationship that is meaningful for a user or a user group. labels, the third element of the coordinating structure, provide signposts that communicate an integrated view of a web site's design to those who use it. it is important to define a labeling system that consistently and clearly communicates the meaning of the site to the user. accordingly, the labels should be constructed in the user's language, not the librarian's. for example, a user may not understand that an abstracting and indexing database will provide them with information regarding journal articles that are relevant to a topic of interest. in that case, the label "find an article" is more useful than "indexes."

extended information architecture

coordinating structure
• organization: the grouping and specification of the function and content that is necessary to support the site.
• navigational structure: the associations among the service and content elements of the site. these relationships provide the conceptual foundation for navigation and include global and local navigational concepts, site index and search, customizable and personalized structures.
• labeling: a consistent naming scheme that presents options and choices to users in terms that they will understand.

service elements
• functional requirements: the description of the functional elements that are necessary to support the user.
• content requirements: the description of the content elements that are necessary to support the user.
• content specifications: the description of the content elements that are already available to support the user.
• functional specifications: the description of the functional elements that are present in a previously installed system.

figure 1. an extended information architecture for developing a conceptual design of library web sites

labels are used to describe individual service or content units, but may also be used as headings to provide structural elements to augment the navigational scheme. the consistent use of labels as headings within the site not only increases user understanding of the site, but may also be explicitly constructed to support user tasks. an example of labeling to support tasks can be seen on the university libraries web site of the university of louisville where, under the main heading for articles, the first subheading is "step 1: search article databases" and the second subheading is "step 2: search (the catalog) by journal title."

service elements

service elements are the second major component of extended information architecture, and represent the content and functionality of the web site. in this framework, the service elements serve a dual purpose. the definition of service elements involves defining both the ideal requirements for functionality and content as well as the specifications of what is currently available.
the definition process can then be used to identify points in the web site where new functions and content need to be added, or where existing functionality must be modernized. these additions and modifications may be achievable immediately, but in many cases an incremental plan for change may need to be developed. the service-element requirements, labeled as functional requirements and content requirements in figure 1, express the users' needs and expectations for the functional or content elements of the web site. the purpose of the requirements definitions is to describe the service elements that are necessary to allow a user to meet his or her goals or objectives in using the site. these requirements are a representation of the ideal composition of a web site, and inform not only the immediate implementation of the site but also the development of future systems and the modernization of existing systems. it is also important to note that the requirements should be developed to express user needs, not a particular implementation option. for example, it might be tempting to specify the implementation of a particular vendor's openurl resolver. this does not, however, describe how the system would function ideally from a user perspective. instead, an appropriate requirement would be that users should be able to link to full text from all citations in an abstracting and indexing database. more specifically, content requirements describe the content that is necessary to meet the users' goals and objectives. access to content is often the primary emphasis of a library web site, and the content requirements describe the intellectual content that should be accessible through a web site. examples of content that might be required are article citations, full-text articles, and multimedia objects. 
normally, these requirements will be closely connected with library-wide collection-development policies and priorities, and should be driven by subject specialists rather than systems personnel. these requirements inform the development of systems to meet the needs of the users. the content specifications describe the content that is available within the current systems. there are many reasons why content requirements and content specifications do not match, including the inability or choice of a library to acquire a particular piece, the unavailability of specified content, or technical incompatibilities between content and the library's infrastructure.

although content is sometimes viewed as the core component of a library web site, there is also a great deal of additional functionality that is provided to users. the functional requirements describe the users' needs and expectations of the functionality in the context of completing tasks on the web site. for example, ill forms found on many sites are easy for the user to fill out, although the most effective interface to ill for the user might not involve a form-based user interface at all. it might be a direct system-to-system interface from an openurl resolver to the ill software in which all citation data are transmitted for the user. this requirement is not necessarily obvious when considering ill in isolation, but is evident when considering it in the larger context of the users' goals and objectives for the entire web site.

148 information technology and libraries | december 2004

the functional specifications describe the functions as they exist in the installed base of systems and expose the functionality that is available to the user. when the specifications do not match the requirements, the users' expectations regarding the system will not be fully achieved. 
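the comparison of requirements with specifications described above can be sketched as a simple set difference. this is only an illustration under stated assumptions: the class name, the function name, and the sample elements are hypothetical and do not come from the framework itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceElement:
    """One functional or content element, named in the user's terms (hypothetical type)."""
    kind: str  # "functional" or "content"
    name: str


def find_gaps(requirements, specifications):
    """Return required elements that the installed systems do not yet provide."""
    missing = set(requirements) - set(specifications)
    return sorted(missing, key=lambda element: (element.kind, element.name))


# Invented sample data: what users need versus what the current systems offer.
requirements = [
    ServiceElement("content", "article citations"),
    ServiceElement("content", "full-text articles"),
    ServiceElement("functional", "link to full text from every citation"),
    ServiceElement("functional", "ill request without re-keying citation data"),
]
specifications = [
    ServiceElement("content", "article citations"),
    ServiceElement("content", "full-text articles"),
    ServiceElement("functional", "ill request via web form"),
]

gaps = find_gaps(requirements, specifications)
for gap in gaps:
    print(f"{gap.kind}: {gap.name}")
```

in this sketch the two content requirements are already met, so only the two functional requirements are reported as gaps. a real comparison would rarely be an exact match on names, so the matching rule here should be read as a placeholder.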
the economic and technical limitations of system implementation and modernization often reduce the speed at which the large base of previously installed systems can be modified to meet users' changing needs and expectations. it is thus critical to identify gaps between existing systems and desired systems and discover areas where a web site will have characteristics that are not completely aligned with what the user needs or expects. when the service-element requirements do not match the service-element specifications of existing systems, an iterative design process begins. this process will be intertwined with the evaluation of the system architecture. gaps that can be addressed immediately should be incorporated into an implementation plan for the new web site. longer-term migration or development plans can be developed to fill gaps that cannot be addressed immediately. it is also important to acknowledge that developing and meeting service-element requirements is an iterative process. they will need to be revisited over time as user needs change, and requirements that are met now become the specifications that are evaluated in the future.

interrelationships within eia

when the service-element requirements cannot be used to modify the service-element specifications, the service elements constrain the design of the web site and influence the design of the coordinating structure. the upward arrow in figure 1 labeled constrains indicates that the user experience is constrained by the specifications of content or functional elements that are not currently changeable. in such situations, the coordinating structure must be designed to provide additional context for the user to understand the purpose of the existing service elements. this explanatory role can be seen in the implementation of many web sites as formal parts of the organizational structure designed to explain the idiosyncrasies of the web site to the user. 
for example, many academic library web sites have tutorials, faqs, or sections labeled "how do i . . . ?" that provide tips on using aspects of the site that are not always evident to users. it is necessary to acknowledge the usefulness of the explanatory role of the coordinating structure in the iterative and incremental processes of web-site design. just as bibliographic instruction and adequate signage have allowed the user to navigate aspects of the traditional library that were not intuitive, the coordinating structure provides the conceptual signposts and other guidance required for users to effectively navigate the web site. at the same time, it is important to realize that the explanatory role would not be necessary if the web site's architecture and design were intuitive to the user. as the design of the service elements changes to accommodate the larger goals of the user, the explanatory function of the coordinating structure will be diminished. the main goal of library web site design should be to reduce the explanatory role of the coordinating structure and to develop service elements that seamlessly support the goals and objectives of the user. until all service elements have been modernized to meet the needs of the user, the conceptual design of web sites will represent a compromise between what users require and what it is possible for users to do within the current legacy information infrastructure.

system architecture

while the conceptual design of the web site describes the needs of the user apart from the technical details of the implementation, the system architecture is the description of the system as it exists. in the case of library web sites, the system architecture is not limited to the functionality and data on the library's web server. instead, it is also inclusive of all core infrastructure, individual systems, and data access and storage mechanisms that provide the blueprint of the web site's backend as it has been built. 
the individual systems in the architecture may include locally controlled ones (for instance, an online catalog), but will also include remote systems such as abstracting and indexing databases mounted by a vendor. a definition of the design of the existing system plays a key role in the evolutionary specification of the system because it provides developers with a greater understanding of the possibilities and constraints of the existing infrastructure. in describing a system architecture, several formal representations can be used that capture various aspects of the system's capabilities at different levels of granularity. these include module views that provide static specifications of individual components; component and connector views that provide dynamic views of processes; and deployment views that incorporate hardware elements. 7 the selection of representations is beyond the scope of this paper. typical elements of a system architecture can be seen in figure 2. for this paper, three classes of components are being considered, although more may be introduced if applicable locally. the core-infrastructure components are fundamental services and information that support one or more systems or subsystems. in a typical library environment this includes authentication services, web platforms, and the network. in some library environments, external units may maintain some or all of these components. for example, many college campuses maintain an authentication infrastructure in the campus computing office. overall, core infrastructure provides the glue for tying together the many applications that libraries attempt to integrate in their web sites. the system architecture should include details regarding the standards and interfaces that are used within the library technical infrastructure. 
many of the applications in the library environment are off-the-shelf components that have been developed by external vendors. these off-the-shelf components may include the catalog, ill modules, electronic-course reserves, and virtual-reference systems. although individual libraries may have some control over configuration options in these applications, they are likely to have little influence over the basic functionality or data formats provided by these systems. core functionality tends to change based on the demands of many libraries looking for similar functionality. despite the lack of functional control over these systems, components developed by external vendors may provide standards-based system interfaces to their functionality. these usually take the form of industry-supported standards or vendor-supplied application programming interfaces and give libraries some flexibility in working with these components. explicit descriptions of the available standard and proprietary interfaces should be included within the system architecture. other applications may have been developed within the library and so can be changed more easily. examples of locally developed applications typically include subject pages, information about the library, and digital web exhibits and collections. although local development does provide more control over the appearance and functionality of a piece of software, it is not without problems. local development is often conducted using a bricolage approach, solving specific problems singularly, without giving consideration to the larger networks of systems in which the solutions operate. when such approaches do not take into account larger issues of systems architecture, opportunities to solve a broader range of problems may be missed and subsequent repackaging of these solutions may be limited or impossible. 
libraries frequently also have a limited number of programmers, often remedied by pulling librarians or staff from other duties. while this certainly can allow libraries to meet some user needs, the lack of software-engineering skills in libraries may result in local solutions that are inflexible and that do not support standards for data storage or interchange. because the internal design of these applications is accessible and modifiable, the system architecture should include more extensive descriptions of the internal features and relationships that they contain. although this will not completely alleviate the problems of software maintenance, it will provide a better foundation for decisions regarding future migration.

system architecture
applications (off-the-shelf and locally developed)
specification of the access mechanisms and standards for previously installed systems including:
• catalog
• interlibrary-loan
• electronic reserves
• abstracting and indexing databases
• content management systems
• legacy web content
core infrastructure
• authentication: the validation of a user's identity based on credentials. increasingly a part of a campus-wide infrastructure.
• web platforms: operating systems, server software, and application software that provide the general foundation for the web site.
• network: the communication infrastructure within the library system and connecting to the internet.
information storage and access
• storage: the definition of storage structures including relational or hierarchical schema. character format specifications.
• standards: standards available for access to the data. these include formats like marc and dublin core, and mechanisms like z39.50 and odbc.
figure 2. elements of a system architecture

finally, typical library architectures consist of links to resources that are licensed or organized on behalf of the user. 
these include abstracting and indexing databases, full-text content provided by publishers outside of the library, and general vetted internet sites. linking the user to the system usually provides access to these systems, and libraries have no control over the technical implementations of such resources. newer federated search technologies are integrating into the library infrastructure the users' access to the site and to results from the sites, and linking tools make the interrelationships between these systems more easily understood. nevertheless, integrating these resources into a web site in a manner that makes sense to library users is a challenge. the access mechanisms and information formats required to communicate with the site should be clearly documented within this system architecture.

interrelationship of the information and system architectures

reacting to the rapid pace of change can result in an ad hoc or haphazard approach to web-site design. the sections above describe a systematic approach to include and evaluate changes to the web site. in order to implement the changes and create a web site that is scalable and made of reusable components, it is necessary to evaluate, plan, and document all changes to the system. figure 3 graphically depicts the interrelationship between eia and system architecture. user needs, as described by ia, should inform the development of technical infrastructure. the informs arrow indicating that eia informs the design and development of the system architecture depicts this interrelationship. the constrains arrow designates the reality that some aspects of the existing infrastructure cannot be changed within this planning cycle and will limit the library's ability to immediately change the underlying content and function of the web site. 
when mapping the conceptual design to the physical design, there will be gaps that represent functionality that cannot be supported, either fully or in part, by the current system architecture and thus constrain the full implementation of the conceptual design. if ia is then to be implemented as fully as possible, these gaps identify the modifications and additions that must be carefully evaluated, designed, and implemented within the underlying system architecture. gaps can be addressed in a variety of ways. if there is a total gap in functionality, a system can be developed or implemented to provide the desired functionality as part of the larger system architecture. this may result in a complete development project or in the specification of an off-the-shelf application to meet the newly identified demand. in the case where an existing system has some of the required functionality but is not completely suitable for the users' goals and objectives, an incremental approach of modernization can be adopted. modernization surrounds "the legacy system with a software layer that hides the unwanted complexity of the old system and exports a modern interface." 8 this is done to provide integration with a modern operating environment while retaining the data and exposing the functions of the existing system, if desired. techniques range from screen scraping to the implementation of web services to export access to functions that are still relevant within the new context. all of these changes become part of the system architecture for future iterations of change. gaps that cannot be immediately added or changed to meet the specified requirements become constraints in the next iteration of conceptual design. in the absence of a plan, the underlying systems will continue to undergo constant evolutionary changes, ostensibly to meet the changing needs and workflows of both users and staff. 
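the wrapping approach to modernization quoted above can be sketched in a few lines. this is a minimal illustration, not a description of any real catalog: the legacy class, its screen layout, and every method name are hypothetical stand-ins chosen for the example.

```python
class LegacyCatalog:
    """Stand-in for an old system that only returns fixed-format screen text."""

    def screen_for(self, call_number):
        # Hypothetical screen layout: one "LABEL: value" pair per line.
        return "TITLE: modernizing legacy systems\nSTATUS: ON SHELF\nLOC: stacks 3\n"


class CatalogWrapper:
    """Software layer that hides the screen format and exports a modern interface."""

    def __init__(self, legacy):
        self._legacy = legacy

    def item_status(self, call_number):
        """Scrape the legacy screen and return a plain dictionary."""
        screen = self._legacy.screen_for(call_number)
        fields = {}
        for line in screen.splitlines():
            label, separator, value = line.partition(":")
            if separator:  # skip lines that are not "LABEL: value" pairs
                fields[label.strip().lower()] = value.strip()
        return {
            "title": fields.get("title", ""),
            "available": fields.get("status", "").upper() == "ON SHELF",
            "location": fields.get("loc", ""),
        }


wrapper = CatalogWrapper(LegacyCatalog())
status = wrapper.item_status("qa76.76")
print(status)
```

callers see only the dictionary returned by item_status, so the legacy system could later be replaced without changing them, which is the point of exporting a modern interface while retaining the existing data and functions.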
change comes from many sources, including local implementations and modifications, external vendors, and industry-wide changes in standards. this rapid but incremental change can produce a system that is very difficult to maintain and that provides few reusable modules. having a well-documented implementation and integration plan will not guarantee that the library will not experience the negative effects of technological change, but it does allow a library to better manage change in meeting the needs of its users. the more explicitly and clearly the modifiable features are documented within the system architecture, the easier it will be to plan to fill the gaps.

extended information architecture
system architecture
core infrastructure
• authentication
• web platforms
• network
figure 3. the interrelationship between the conceptual and physical design of the library web site

conclusion

library users' mental models of library processes have fundamentally changed, creating a serious disconnect between how users expect to use a library web site and how the site was designed. in particular, user expectations regarding the number of steps that must be completed have changed. at the same time, library technical infrastructures are composed, in part, of legacy systems that provide great value and facilitate interlibrary resource sharing, but were not designed for the web environment. it is essential that libraries develop new approaches to the conceptual design of web sites that support current and future changes to both user behaviors and to library systems architectures. in the long run, these approaches should contribute to the development of a reference model for the description of library services. 
the authors have proposed a complete framework for conceptual design and physical implementation that is responsive to changing user needs while recognizing the need for libraries to adopt an efficient and cost-effective approach to web-site design, implementation, and maintenance. functional and content needs of the user are identified and molded into a conceptual design based on a broadened perspective of the users' objectives. mapping conceptual requirements to physical architectures is an important part of this framework, using an architectural representation in combination with descriptions of integration elements that have been developed to support the incremental and iterative change. the ability to respond is essential, necessitated by the rapid change in the technical and user environments in which libraries operate. the framework is designed to allow logical and informed decisions to be made throughout the process regarding when to create new systems, when to replace or modernize existing systems, and when to improve the conceptual signage of the web site.

references

1. christina wodtke, information architecture: blueprints for the web (indianapolis: new riders, 2003),
2. louis rosenfeld and peter morville, information architecture for the world wide web, 2nd ed. (cambridge, mass.: o'reilly, 2002), 4.
3. bob gerrity, theresa lyman, and ed tallent, "blurring services and resources: boston college's implementation of metalib and sfx," reference services review 30, no. 3 (2002): 229-41; barbara j. cockrell and elaine anderson jayne, "how do i find an article? insights from a web usability study," journal of academic librarianship 28, no. 3 (may 2002): 122-32.
4. jesse james garrett, elements of user experience (indianapolis: new riders, 2002); rosenfeld and morville, information architecture. 
5. james s. ghaphery and dan ream, "vcu's my library: librarians love it ... users? well maybe," information technology and libraries 19, no. 4 (dec. 2000): 186-90; james s. ghaphery, "my library at virginia commonwealth university: third year evaluation," d-lib magazine 8, no. 7/8 (july/aug. 2002). accessed july 16, 2003, www.dlib.org/dlib/july02/ghaphery/07ghaphery.html.
6. university of louisville libraries web site (2003). accessed july 16, 2003, http://library.louisville.edu.
7. craig larman, applying uml and patterns: an introduction to object-oriented analysis and design (new jersey: prentice hall ptr, 1998); martin fowler, analysis patterns: reusable object models (boston: addison-wesley, 1997); james rumbaugh, ivar jacobson, and grady booch, the unified modeling language reference manual (boston: addison-wesley, 1999); robert c. seacord, daniel plakosh, and grace a. lewis, modernizing legacy systems: software technologies, engineering processes, and business practices (boston: addison-wesley, 2003).
8. seacord, plakosh, and lewis, modernizing legacy systems, 9.

an automated music programmer (musprog)

david f. harrison, music director, wsui-ksui, and randolph j. herber, applications programmer, university computer center, the university of iowa, iowa city, iowa

a system to compile programs of recorded music for broadcast by the university of iowa's radio stations. the system also provides a permanent catalog of all recorded music holdings and an accurate inventory control. the program, which operates on an ibm 360/65, is available in fortran iv, cobol and pl/1, with assembly language subroutines and external maintenance programs.

the state university of iowa (iowa city) owns and operates two broadcasting stations, wsui, at 910 kc, and ksui, at 91.7 mc. 
wsui was the first educational radio station in operation west of the mississippi, and ranks among the oldest stations in the country; ksui was among the earliest of the frequency modulation outlets in the area to offer programming in multiplex stereo. in the spring of 1967, when it became necessary to completely reorganize their recorded music libraries, an investigation was simultaneously underway to determine the feasibility of utilizing automated data processing (a.d.p.) techniques in the discographic operations of the stations. at the time there were several working bibliographic applications (1), ranging from relatively simple record-keeping (where is ... ?) to more ambitious cross-referencing and indexing operations, one of which uses the kwic (keyword-in-context) computer program to classify musical recordings (1). on the basis of the awareness of these applications, and a belief that the intrinsic principles could be utilized and extended to cover somewhat different needs, it was proposed that the facilities of the university computer center be employed in the selection and updating of recorded music programs.

2 journal of library automation vol. 2/1 march, 1969

in designing a coded set of instructions to perform these tasks, it was deemed necessary that any attempt at the selection or compilation of a series of music programs should be made in accordance with certain criteria supplied to the system by the user, and that these selection specification parameters should closely parallel those which would be employed were such an extraction from the total libraries to be performed manually. additional requisites were that provision be made for updating and enlarging the master file as new items were acquired, and that the coding of the programmed instructions should be sufficiently flexible to permit inclusion of supplemental criteria as they became desirable. 
the above proposal met with a certain degree of opposition, the main bone of contention being that such an application would necessarily "dehumanize" music programming. there have been, and will continue to be, similar objections raised by those who are unaware of the advantages offered by a.d.p. and concomitantly unaware of the mental processes which result in what is commonly referred to as "artistic judgment." it is not the purpose of this article to attempt an exhaustive analysis of such processes, nor to castigate the objectors; it is rather simply to bring forth several basic observations dealing with the problem under discussion. a contemporary composer-theorist interested in the applications of a.d.p. techniques to the process of musical composition has observed that no paradoxical "almighty force" exists in science, which, in actual fact, progresses by discrete steps which are at once limited but unpredictable (3, 4). the following list of conclusions, although relating specifically to the problems of machine-"created" music, find no less an application to the current problem: creative human thought is an aggregate of restrictions or choices in all fields of human activity, including the arts. certain aspects of these judgments can be mechanized and simulated by certain physical mechanisms currently extant, including the computer. the rapidity of calculation or decision by computer frees human beings from the long and arduous task of manually selecting, compiling and checking of programmed works. the time thus saved can be better spent on such amenities as scripting, with complete performance information and record data, and the always-too-necessary pronouncing aids. moreover, the computer program can be "exported" to any place similarly equipped to be used by other individuals, or where other programmers are able to alter the algorithm to meet their specific needs. 
the automated music programmer (musprog) was interpreted as being a series of steps, the first of which specifies that complete music programs are to be selected in accordance with a table of specifications introduced as data, each card containing information pertinent to a discrete program. the second step requires that each and every entry in the catalog be checked for availability by any program in the tables established in the preceding step, this status to be determined on the basis of a satisfactory comparison with the individual criteria supplied on the selection specification card. among these are "tests" (note that a failure to meet the requirements in any step disqualifies the item) to determine when the item was last selected, as well as the number of times selected; a check for allowable time length; a check for duplication of composer and/or title; a statement that stereo recordings are to be used only for fm; a check for acceptable period, style and type of composition; and the decision to update the master file. in the final operation of the program, each duplicate title of a work selected is also updated, simulating selection to prevent its selection during the next month. if each duplicate were given the date factor of the item actually selected, the latter would tend to appear much more frequently than its companions because the program would continue to select the longest available item, and it is reasonably safe to assume that the selected item is the longest version of the title in question. it was necessary, therefore, to devise a means by which each version of a given work (indicated by both title and composer) be given equal weight for "fair" selection. a unidimensional array called item was constructed with ten positions as follows: item (10) /'0', '0', '0', '0', '1', '1', '1', '2', '2', '3'/. 
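a minimal sketch of how such a ten-position table can spread the date factors of duplicate titles. it assumes a uniformly distributed index from one through ten and an integer date factor counted in months; both are assumptions made for the illustration, as are the python names, and this is not the original fortran, cobol, or pl/1 code.

```python
import random
from collections import Counter

# Ten-position table from the text: four 0s, three 1s, two 2s, one 3.
ITEM = [0, 0, 0, 0, 1, 1, 1, 2, 2, 3]


def duplicate_date_factor(selected_date_factor, rng=random):
    """Date factor for a duplicate title: the selected item's factor plus a table delay."""
    index = rng.randint(1, 10)  # random position, one through ten inclusive
    return selected_date_factor + ITEM[index - 1]


# Share of each delay implied by the table itself (no randomness involved).
shares = {delay: count / len(ITEM) for delay, count in Counter(ITEM).items()}
print(shares)

# One simulated update for a duplicate of an item whose date factor is 7.
print(duplicate_date_factor(7))
```

counting the entries shows why the table favours short delays: four of the ten positions hold 0, three hold 1, two hold 2, and one holds 3.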
the index of the array was then selected by referencing a routine which generates random, positive integers in the range one through ten. the contents of that position in item are added to the date factor of the record selected, and the result placed in the corresponding field of the duplicate title under scrutiny. thus there exists a 40% probability that the duplicate will have the same "weight" as the selected item, a 30% chance that the duplicate will be "pushed back" one month, 20% for two months, and a 10% probability that the date factor of the duplicate title will be increased by three months. when all the titles have been thus read or updated, the run concludes.

figure 1 is the flowchart that is the basic design of musprog and from which the computer program was coded. the program runs on an ibm 360/65 and is available in fortran iv, cobol, and pl/1 with assembly language subroutines and external maintenance programs. copies of these programs may be obtained from the national auxiliary publications service (naps #00278).

fig. 1. flowchart for musprog.

the machine readable catalog system currently employed by the university's radio stations is, on the whole, independent of the record's origin or manufacture. (the catalog number could be considered as nothing more than an indication of a discrete shelf space.) the system was designed to facilitate maximally efficient use of the 80 columns available on a punch card. by utilizing one or two alphabetic and two decimal characters ranging from a00 through zz99, provision is made for identification of records and tapes in quantities somewhat in excess of seventy thousand individual discs or reels. the total of actual single titles possible to catalog in this manner is at least twice that number.

the card catalog is made up along more or less standard, triple-reference lines on the familiar 3x5-inch card. these remain in the master card file, but are used only for reference purposes, rather than for actual selection. the "real" master library exists in the form of punched cards (later transferred to magnetic tape). each card image contains the following information, with blank columns separating contiguous fields:

columns
1-10 composer, or first ten characters if abbreviation necessary.
12-27 title, abbreviations standardized.
29-33 duration of work in seconds.
35-37 period of composition.
39-40 type of composition.
42-45 catalog number.
47-57 physical location of item on cataloged disc or tape.
59-64 date fields, used for updating and usage factors.
66-69 seasonal key, a blank indicating general usefulness.
71-80 field used by musprog for internal record-keeping.

operation

selection of music by the system is performed in accordance with a table of program specifications which includes information pertinent to the length of the desired program and maximum permissible length of any single work within it, the type of music desired, and additional information, such as date, time and title of the program to be aired and an indication of the station for which the program is to be selected.
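the 80-column card image laid out in the catalog description above lends itself to a fixed-width parser. the sketch below slices a card into named fields using the column ranges given in the text; the field names, the dictionary representation, and the sample values used to exercise it are hypothetical and are not taken from musprog.

```python
# 1-based, inclusive column ranges as listed for the master card image.
FIELDS = {
    "composer": (1, 10),
    "title": (12, 27),
    "duration_seconds": (29, 33),
    "period": (35, 37),
    "type": (39, 40),
    "catalog_number": (42, 45),
    "location": (47, 57),
    "date_fields": (59, 64),
    "seasonal_key": (66, 69),
    "internal": (71, 80),
}


def parse_card(card):
    """Split one 80-column card image into a dict of stripped fields."""
    card = card.ljust(80)[:80]  # pad or trim to exactly 80 columns
    record = {
        name: card[start - 1:end].strip()
        for name, (start, end) in FIELDS.items()
    }
    # Duration is stored in seconds; treat a blank field as zero.
    record["duration_seconds"] = int(record["duration_seconds"] or 0)
    return record
```

a blank seasonal key comes back as an empty string, which corresponds to the "general usefulness" case in the layout.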
all the selections for ksui (fm) are required to be stereophonic. classification into stereophonic and monophonic groups is a function of the catalog number, a00 through z99 being stereophonic and aa00 through zz99 being monophonic. a program selection card contains the following data:

columns
1 station code: w for wsui, blank for ksui.
2-6 duration of program in seconds.
7-11 maximum duration of each item to be selected (0 or blank indicates program may consist of but a single work equivalent in length to program duration).
12 number of types being specified.
13-27 three three-plus-two character fields to specify period and type (modia equals "twentieth-century, orchestra"). if any field is blank, musprog assumes anything acceptable.
28-79 title of program to be selected, day and time.

as an example, the following specifications were made for a program called "aubade" which was aired at 10:00 a.m. on tuesday, july 30, by wsui. program duration was to be 3400 seconds (56:40), allowing 3:20 for continuity. maximum length of any single work within the program was to be 900 seconds (15:00). music could be chosen from the contemporary orchestral repertoire, any instrumental work from the classic period, or any type "3" work, i.e., soloist and piano, or chorus a cappella. figure 2 shows a printout of selections for two programs.

fig. 2. printout of selections for two wsui "evening concert" programs (5:30 pm thursday, july 25, and 5:30 pm tuesday, july 30).

an additional feature of musprog is provision for a periodic summary of library usage, affording the librarian a concise account of frequently played items, as well as an indication of those works which have been selected infrequently or ignored altogether. this report allows the programmer to assess more accurately the maximum number of times a selection may be programmed before it is declared unacceptable. the system also puts out printed lists of works extracted from the library in accordance with a user-specified table of reference fields: e.g., all symphonies, all works by bach, all works of under ten minutes in length, all christmas music; or conceivably, any symphonies by bach which are suitable for christmas and less than ten minutes long. this latter step could also include, with minor alterations in the computer program, provision for performances by one specific ensemble or artist only. an external program allows adding items to the master tape, deleting those no longer needed and correcting any of the various fields within individual records; thus if mis-timings or other inaccuracies are noted, it becomes a relatively simple matter to correct them.

discussion

it can readily be seen that "the machine" neither possesses nor displays "taste" in any conventional sense of that word, since it can select only those types of music which the programmer has declared acceptable.
it does not, indeed cannot, show any predilection toward certain types of music to the detriment or exclusion of others, save those which have been removed from the list of potential selections by the programmer. it performs no independent judgments. without doubt, then, there is no logical basis for the cry of "dehumanization," since the program was originally designed by human minds and is, at each step of the process of selection, governed by the human-designed control parameters and program specifications; therefore it cannot select music willy-nilly, but must be told what to do and how to do it. it also has been found that specifications cannot be "plugged in" at random, for the programs thus selected would prove little more than a conglomerate of sundry works bearing no relation to one another. organization and logic must be designed into each program if any coherent programming is to result. the machine does not "know" what to do unless told.

it should be brought out that because of a built-in logic and the order of titles on the master file the program will tend to select the longest works available to fill the specified program time, making up the difference, if any, with progressively shorter pieces until the time is filled, or until no work of acceptable type and sufficient brevity can be located. since the longer works tend to occur among certain types and/or styles of music, there may be some tenuous grounds for a suspicion of bias.

it will be observed that musprog does not include information pertinent to performer, conductor, etc. one of the several reasons for this apparent oversight is that such information would, at the outset, have required the use of one to four additional data cards per title. since this information was not deemed absolutely essential to the immediate functions of the program, it was decided to postpone inclusion of such a refinement to some future date.
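the longest-first tendency noted in this discussion can be sketched as a simple greedy fill. this is an illustration under stated assumptions, not the musprog code: the function name and the sample durations are hypothetical, the tests for type, period, date and duplication are left out, and a maximum item length of zero is read as "no separate limit."

```python
def fill_program(pieces, program_seconds, max_item_seconds=0):
    """Greedy longest-first fill of a program slot.

    pieces: list of (title, duration_seconds) tuples.
    Returns (chosen_pieces, unused_seconds).
    """
    # Assumption: 0 (or blank) means a single work may fill the program.
    limit = max_item_seconds or program_seconds
    remaining = program_seconds
    chosen = []
    # Consider the longest works first, then progressively shorter ones.
    for title, duration in sorted(pieces, key=lambda p: p[1], reverse=True):
        if duration <= limit and duration <= remaining:
            chosen.append((title, duration))
            remaining -= duration
    return chosen, remaining


# Hypothetical catalog, using the "aubade" limits quoted earlier
# (3400-second program, 900-second maximum per work).
catalog = [("a", 1200), ("b", 900), ("c", 850), ("d", 800),
           ("e", 700), ("f", 400), ("g", 150)]
selection, unused = fill_program(catalog, 3400, 900)
```

with these made-up durations the fill skips the 1200-second work, takes the four longest admissible works, and closes the gap with the shortest one, which is the pattern behind the "suspicion of bias" toward longer pieces.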
conclusion

musprog has been utilized by the state university of iowa since march, 1968, and has resulted in considerable time-saving. for example, the july, 1968, programming required one hundred and two programs varying in length from thirty minutes to somewhat over four hours, consisting of a variety of musical styles, and representing a diversity of programming difficult to achieve efficiently by ordinary means. in three minutes and twelve seconds, musprog selected the programs, updated the catalog, checked for duplication of selections, timed each program, and printed out the resultant copy properly headed. at an approximate cost of $250.00 per hour, this comes to less than fifteen dollars per month to perform tasks which might normally require two persons, at perhaps two or three dollars per hour, to work an entire week or more. it is doubtful that even then each catalog entry could be examined and an accurate record of usage be kept.

acknowledgments

a staff research grant from the graduate college, university of iowa, partially supported development and operation of this system. dean duane c. spriestersbach of the graduate college, professor gerard p. weeg, chairman of the department of computer science, and program supervisor robert e. irwin gave generous support and encouragement to the development of musprog.

references

1. wilhoit, g. cleveland: "computerized indexing for broadcast music libraries," journal of broadcasting, 11 (fall, 1967) 325-337.
2. brook, barry s.: "rilm, repertoire internationale de la litterature musicale," notes; the quarterly journal of the music library association, 23 (march, 1967) 462-467.
3. xenakis, iannis: "in search of a stochastic music," gravesano review, 11 (1958).
social contexts of new media literacy: mapping libraries elizabeth thorne-wallington information technology and libraries | december 2013 53 abstract this paper examines the issue of universal library access by conducting a geospatial analysis of library location and certain socioeconomic factors in the st. louis, missouri, metropolitan area. framed around the issue of universal access to internet, computers, and technology (ict) for digital natives, this paper demonstrates patterns of library location related to race and income. this research then raises important questions about library location, and, in turn, how this impacts access to ict for young people in the community. objectives and purpose the development and diffusion of new media and digital technologies has profoundly affected the literacy experiences of today’s youth.1 young people today develop literacy through a variety of new media and digital technologies.2 the dissemination of these resources has also allowed for youth to have literacy-rich experiences in an array of different settings. ernest morrell, literacy researcher, writes, as english educators, we have a major responsibility to help future english teachers to redefine literacy instruction in a manner that is culturally and socially relevant, empowering, and meaningful to students who must navigate a diverse and rapidly changing world.3 this paper will explore how mapping and geographic information systems (gis) can help illuminate the cultural and social factors related to how and where students access and use new media literacies and digital technology. libraries play an important role in encouraging new media literacy development;4 yet access to libraries must be understood through social and cultural contexts. the objective of this paper is to demonstrate how mapping and gis can be used to provide rigorous analysis of how library location in st. 
louis, missouri, is correlated with socioeconomic factors defined by the us census including median household income and race. by using gis, the role of libraries in providing universal access to new media resources can be displayed statistically, both challenging and confirming previously held beliefs about library access. this analysis raises new questions about how libraries are distributed across the st. louis area and whether they truly provide universal and equal access.

elizabeth thorne-wallington (ethornew@wustl.edu) is a doctoral student in the department of education at washington university in st. louis.

literature review

advances in technologies are transforming the very meaning of literacy.5 traditionally, literacy has been defined as the ability to understand and make meaning of a given text.6 the changing global economy requires a variety of digital literacies, which schools do not provide.7 instead, young people acquire literacy through a multitude of in- and out-of-school experiences with new media and digital technology.8 libraries play a vital role in supporting new media literacy by offering out-of-school access and experiences.

to understand the role that libraries play in offering access to new media literacy technologies, a few key concepts must be defined. first is the concept of the digital native. those born around 1980, who have essentially grown up with technology, are known as digital natives.9 digital natives are expected to have a base knowledge of technology and to be able to pick up and learn new technology quickly because of that base knowledge. digital natives have been exposed to technology from a young age and are adept at using a variety of digital technologies. the suggestion is that young people can quickly learn to make use of the new media and technology available in a specific location.
key to any discussion of digital natives is the concept of the digital divide. the digital divide has been a central issue of education policy since the mid-1990s.10 early work on the digital divide was concerned primarily with equal access.11 more recently, however, the idea of a “binary digital divide” has been replaced by studies focusing on a multidimensional view of the digital divide.12 hargittai asserts that even among digital natives, there are large variations in internet skills and uses correlated with socioeconomic status, race, and gender.13 these variations call for a nuanced study examining social and cultural factors associated with new media literacy, including out-of-school contexts.

the concept of literacy and learning in out-of-school contexts has a strong historical context. hull and schultz provide a review of the theory and research on literacy in out-of-school settings.14 a variety of studies, including self-guided literacy activities, after-school programs, and reading programs were reviewed, and the significance of out-of-school learning opportunities was supported by these studies. importantly for the research here, research has also been done on the use of digital technology in out-of-school settings. lankshear and knobel examine out-of-school practices extensively with their work on new literacies.15 lankshear and knobel also make clear the complexity of out-of-school experiences among young people. students participate in nontraditional literacy activities such as blogging and remix in a variety of out-of-school contexts, from home computers to community-based organizations to libraries. most importantly, lankshear and knobel found that the students did connect what they learned in the classroom with these out-of-school activities.

the connection between out-of-school literacies and in-school learning has also been studied. education policy researcher allan luke writes, the redefined action of governments . . .
is to provide access to combinatory forms of enabling capital that enhance students’ possibilities of putting the kinds of practices, texts, and discourses acquired in schools to work in consequential ways that enable active position taking in social fields.16

collins writes about this relationship between in- and out-of-school literacies. in her case study, collins writes that there are a variety of “imports” and “exports” in terms of practices. that is, skill transaction works in both directions, with skills learned out of school used in school, and skills learned in school used out of school.17 skerrett and bomer make this connection even more explicit when looking at adolescent literacy practices.18 their article examines how a teacher in an urban classroom drew on her students’ out-of-school literacies to inform teaching and learning in a traditional literacy classroom. the authors found that the teacher in their study was able to create a curriculum that engaged students by inviting them to use literacies learned in out-of-school settings. however, the authors write that this type of literacy study was taxing and time-consuming for both the teacher and the student. still, it is clear that connections between in- and out-of-school literacies can be made.

the role libraries play in making this connection has not been studied as extensively. yet it is clear that young people do use libraries to access technology. becker et al. found that nearly half of the nation’s 14- to 18-year-olds had used a library computer within the past year. becker et al.
additionally found that for poor children and families, libraries are a “technological lifeline.” among those below the poverty line, 61 percent used public library computers and the internet for educational purposes.19 tripp writes that libraries have long played an important role in helping people gain access to digital media tools, resources, and skills.20 tripp writes that libraries should capitalize on the potential of new media to engage young people. additionally, tripp argues that librarians need to develop skills to train young people to use new media. the idea that libraries are important in meeting the need is further supported by the recent grants, totaling $1.2 million, by the john d. and catherine t. macarthur foundation to build “innovative learning labs for teens” in libraries. this grant making was a response to president obama’s “educate to innovate” campaign, a nationwide effort to bring american students to the forefront in science and math.21 this literature review demonstrates that the body of research currently available focuses on digital natives and the digital divide, but that the research lacks the nuance needed to capture the complexity of social and cultural contexts surrounding the issue. this literature review further demonstrates both the importance of new media literacy and out-of-school learning, as well as the key role that libraries play in supporting these learning opportunities. the study provided here uses gis analysis to demonstrate important socioeconomic and cultural factors that surround libraries and library access. first, i describe the role of gis in understanding context. next, i describe the methods used in this paper. finally, i analyze the results and implications for the study. 
geographic information systems analysis in education

there is a burgeoning body of research which uses geographic information systems (gis) to better understand socioeconomic and cultural contexts of education and literacy issues.22 there are several key works that link geography and social context. lefebvre defines space as socially produced, and he writes that space embodies social relationships shaped by values and meanings. he describes space as a tool for thought and action or as a means of control and domination. lefebvre writes that there is a need for spatial reappropriation in everyday urban life. the struggle for equality, then, is central to the “right to the city.”23 the unequal distributions of resources in the city help to maintain social and economic advantaged positions, which is important to the analysis here of library access.

this unequal distribution of resources continues today. de souza briggs and others write that there is clear geographical segregation in american cities today.24 this is seen in housing choice, racial attitudes, and discrimination, as well as metropolitan development and policy coalitions. in the conclusion of his book, de souza briggs writes that housing choice is limited for low-ses minorities, and these limitations produce myriad social effects. again, this finding is important to the contexts of where libraries are located. jargowsky writes of similar findings.25 like de souza briggs, jargowsky focuses on the role that geography plays in terms of neighborhood and poverty. jargowsky even finds social characteristics of these neighborhoods: there is a higher prevalence of single-parent families, lower educational attainment, a higher level of dropouts, and more children living in poverty.
important here, though, is that all such characteristics can be displayed geographically, which means that varying housing, economic, and social conditions can be displayed with library locations.

soja goes beyond the geographic analysis offered by de souza briggs and jargowsky and writes that space should be applied to contemporary social theory.26 soja found that spatiality should be used in terms of critical human geography to advance a theory of justice on multiple levels. he writes that injustice is spatially construed and that this spatiality shapes social injustice as much as social injustice shapes a specific geography. this understanding, then, shapes how i approach the study of new media literacies as influenced by cultural and social factors.

these factors are particularly prevalent in the st. louis, missouri, area. colin gordon reiterates the arguments of lefebvre, jargowsky, and de souza briggs in arguing that st. louis is a city in decline.27 by providing maps that project housing policies, gordon is able to provide a clear link between historical housing policies such as racial covenants and current urban decline. gordon is able to show that vast populations are moving out of st. louis city and into the county, resulting in a concentration of minority populations in the northern part of the city. gordon argues that the policies and programs offered by st.
louis city have only exacerbated the problem and led to greater blight.28

in terms of literacy, morrell makes the most explicit connection between literacy and mapping with a study that used a community-asset mapping activity to make the argument that teachers need to make an explicit connection between literacy at school and the new literacies experienced in the community.29 the significance of this is that gis can be used to illuminate the social and economic contexts of new media literacy opportunities as well, which in turn could help inform social dialogue about the availability of and access to informal education opportunities for new media literacy.

methods and data

the gis analysis performed here concerns library locations in the st. louis metropolitan area, including st. louis city and st. louis county. the st. louis metropolitan area was chosen because of past research mapping the segregation of the city, largely because the city and county are so clearly segregated racially and economically along the north–south line. this segregation is striking when displayed geographically and illuminating when mapped with library location.

maps were created using tiger files (www.census.gov/geo/maps-data/data/tiger.html) and us census data (http://factfinder2.census.gov/faces/nav/jsf/pages/index.xhtml), both freely available to the public via internet download. libraries were identified using the st. louis city library’s “libraries & hours” webpage (www.slpl.org/slpl/library/article240098545.asp), the st. louis county library “locations & hours” webpage (www.slcl.org/about/hours_and_locations), google maps (www.maps.google.com), and the yellow pages for the st. louis metropolitan area (www.yellowpages.com). the address of each library was entered into itouchmap (http://itouchmap.com) to identify the latitude and longitude of the library.
a spreadsheet containing this information was then loaded into the gis software and displayed as x–y data. the maps were then displayed using median household income, african american population, and latino and hispanic population as obtained from the us census at census tract level. for median household income, the data was from 1999. for all other census data, the year was 2010.

for district-level data, communication arts data from the missouri department of elementary and secondary education (modese) website (http://dese.mo.gov/dsm) was entered into microsoft excel and then displayed on the maps. the data is district level, representing all grades tested for communication arts across all district schools. the modese data was from 2008, the most recent year available at the time the analysis was performed. the communication arts data was taken from the missouri assessment program (map) test. this test is given yearly across the state to all public school students. the state then collects the data and makes it available at the state, district, and school level. the data used here is district-level data. scores are broken into four categories: advanced, proficient, basic, and below basic. the groups for proficient and advanced were combined to indicate the district’s success on the map test. these are the two levels generally considered acceptable or passing by the state.30

before looking at patterns of library location and these socioeconomic and educational factors, density analysis was performed on the library locations using esri arcgis software, version 9.0, to analyze whether clustering was statistically significant. this analysis was used to demonstrate whether libraries were clustered in a statistically significant pattern, or if location was random. the nearest neighbor tool of arcgis was used to determine if a set of features, in this case the libraries, shows a statistically significant level of clustering.
this was done by measuring the distance from each library to its single nearest neighbor and calculating the average distance of all the measurements. the tool then created a hypothetical set of data with the same number of features, but placed randomly within the study area. then an average distance was calculated for these features and compared to the real data. that is, a hypothetical random set of locations was compared to the set of actual library locations. a near-neighbor index was produced, which expresses the ratio of the observed distance divided by the distance from the hypothetical data, thus comparing the two sets.31 this score was then standardized, producing a z-score, reported below in the results section.

results and conclusions

using the nearest neighbor tool produced a z-score of -3.08, showing that the data is clustered beyond the 0.01 significance level. this means that there is a less than 1 percent chance that library location would be clustered to this degree based on chance. knowing, then, that library location is not random, we can now examine socioeconomic patterns of the areas where libraries are located.

figure 1 shows library location and population of individuals under the age of 18 at the census tract level for st. louis city and county, using data from the 2010 us census. to clarify, the city and county are divided by the bold black line crossing the middle of the map, the only such boundary in figure 1, where the county is the larger geographic area.
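the nearest neighbor comparison behind the z-score reported above can be sketched in a few lines. this is an illustration only, not the arcgis tool: it substitutes the closed-form expected distance for a random pattern (the clark-evans form) for a simulated hypothetical data set, it assumes points in projected planar coordinates rather than latitude and longitude, and the function names are hypothetical.

```python
import math


def average_nearest_neighbor(points, area):
    """Return (index, z) for a list of (x, y) points in a study area.

    index < 1 suggests clustering, index > 1 suggests dispersion.
    """
    n = len(points)
    # Observed mean distance from each point to its single nearest neighbor.
    observed = sum(
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ) / n
    # Expected mean distance and standard error under complete randomness.
    expected = 0.5 / math.sqrt(n / area)
    std_error = 0.26136 / math.sqrt(n * n / area)
    return observed / expected, (observed - expected) / std_error


def two_sided_p(z):
    """Two-sided p-value for a standard normal z-score."""
    return math.erfc(abs(z) / math.sqrt(2))
```

applying `two_sided_p` to the reported z-score of -3.08 gives a value below 0.01, consistent with the significance level stated in the results.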
library location is important because previous research shows that young people use informal learning environments to access new media technologies,32 and libraries are a key informal learning environment.33 this map demonstrates, however, that libraries are not located in census tracts with the highest populations of individuals under the age of 18 in st. louis city and county. in fact, none of the tracts with the highest number of individuals under the age of 18 contains a library. this is especially concerning given that young people may have less access to transportation, so their access to facilities in neighboring census tracts may be quite limited.

figure 1. number of individuals under the age of 18 by census tract and library location in st. louis city and st. louis county. source: 2010 us census.

figure 2 includes maps showing library locations in st. louis city and county in terms of poverty and race by census tract level, as well as act score by district, represented by the bold lines, where st. louis city is represented by a single district, the st. louis public school district. median household income is indicated by the gray shading, with white areas not having data available. first, census tracts with low median household income are clustered in the northern part of the city and county. there are four libraries in the northern half of the city, and eleven libraries in the central and southern parts of the city. there are fewer libraries in the census tracts with low median household income.

figure 2. median household income, act score, and library location, st. louis city and county. source: 2010 us census and missouri department of elementary and secondary education, 2010, www.modese.gov.
while the nearest neighbor analysis has already demonstrated the libraries are significantly clustered, the maps seem to suggest the pattern of that clustering. this is especially concerning given the report by becker that 61 percent of those living below the poverty line use libraries to access the internet.34 first, in terms of median household income, it does appear that many libraries are located in higher income areas of the city and county. while the libraries appear to be clustered centrally, and particularly near major freeways, there appear to be libraries in many of the higher income census tracts. adding to the concern of location is that of access to these library locations. for those living below the poverty line, transportation is often a prohibitive cost, so access from public transportation should also be a major concern for libraries. additionally, in a pattern repeated in figure 4, the location of libraries does not appear to have any effect on act scores, but there are clearly higher act scores in wealthier areas of the city and county. this is not to say that there is a statistical relationship between act score and library location, but rather to look at the spatial patterns of each in order to note similarities and differences in these patterns.

figure 3 shows library location by race, including african american or black and hispanic or latino. first, it is important to note that patterns of race in st. louis have been carefully documented by gordon.35 the st. louis area is clearly a highly segregated region, which makes the social contexts of libraries in the st. louis area even more important. this map demonstrates that while there are many libraries in the northern parts of st.
louis city and county, none of these libraries is located in the census tracts with the highest populations of those identifying themselves as african american or black in either the city or county. this raises questions about the inequality of access to the libraries. on the other hand, the densest populations of those identifying themselves as hispanic or latino are in the southern part of the city, but not the county. there is a library located in one of those tracts. it appears the areas with higher concentrations of african americans or blacks have fewer libraries, while areas with the higher concentrations of latinos or hispanics are located in the southern parts of the city that do have libraries. it is important to note, however, that the concentrations of latinos and hispanics are quite low, and those areas are majority white census tracts.

as noted above, beyond location, access from public transportation is also an important issue. at the same time, the clustering and patterns shown on these maps raise key issues about access based on income and race. libraries are not located in areas with low median household income or in areas with high concentrations of african americans or blacks. this raises serious questions about why libraries are located where they are, and whether the individuals located in these areas have equal access to library resources, particularly new media technologies.

figure 3. african american or black and hispanic, library location, st. louis city and county. source: 2010 us census.

the final map raises a slightly different issue, one of test scores and student achievement. figure 4 shows library location by percent proficient or advanced on the missouri assessment program test by district.
beyond the location of the libraries, one factor that stands out is that the areas with the lowest percent proficient or advanced are also the areas with the lowest median household income and the highest percentage of those identifying as african american or black. here an interesting pattern emerges. while there are many libraries in the city and northern part of the county, the percent proficient or advanced on the communication arts portion of the exam is quite low (20–30 percent). on the other hand, in the western part of the county, there are few libraries, but the percent proficient or advanced is at its highest level. this suggests that there may not be a strong connection between achievement on the map exam and library location, similar to the lack of relationship seen between act average score and library location in figure 2. at the same time, there does appear to be a correlation between race, income, and test scores. this correlation is noted throughout the literature on student achievement.36 clearly, these maps raise important questions, such as how and why libraries are located in a certain area, who uses libraries in a given area, and what other informal learning environments and community assets exist in these areas. what is made clear by the maps, though, is that gis can be used as a tool to help understand the context of new media literacy.

figure 4. proficient or advanced, communication arts map by district, 2009, and library location. source: missouri department of elementary and secondary education, 2010, www.modese.gov.

significance
these results demonstrate that gis can be used to illuminate the social, cultural, and economic complexity that surrounds informal learning environments, particularly libraries. this can help demonstrate not only where young people have the opportunity to use new media literacy, but also the complex contextual factors surrounding those opportunities.
paired with traditional qualitative and quantitative work, gis can provide an additional lens for understanding new media literacy ecologies, which can help inform dialogue about this topic. the results of this study suggest that there is a relationship between library location and both race and income. this study illuminates the complex contextual factors affecting libraries. because of the important role that libraries can play in offering young people out-of-school learning opportunities, particularly in terms of access to new media resources, these contextual factors are important to ensuring equal access and opportunity for all.

references
1. ernest morrell, “critical approaches to media in urban english language arts teacher development,” action in teacher education 33, no. 2 (2011): 151–71, doi: 10.1080/01626620.2011.569416.
2. mizuko ito et al., hanging out, messing around, and geeking out: kids living and learning with new media (cambridge: mit press/macarthur foundation, 2010).
3. morrell, “critical approaches to media in urban english language arts teacher development.”
4. lisa tripp, “digital youth, libraries, and new media literacy,” reference librarian 52, no. 4 (2011): 329–41, doi: 10.1080/02763877.2011.584842.
5. gunther kress, literacy in the new media age (london: routledge, 2003).
6. ibid.
7. donna e. alvermann and alison h. heron, “literacy identity work: playing to learn with popular media,” journal of adolescent & adult literacy 45, no. 2 (2001): 118–22.
8. colin lankshear and michele knobel, new literacies: everyday practices and classroom learning (maidenhead: open university press, 2006).
9. john palfrey and urs gasser, born digital: understanding the first generation of digital natives (new york: perseus, 2009).
10. karin m. wiburg, “technology and the new meaning of educational equity,” computers in the schools 20, no. 1–2 (2003): 113–28, doi: 10.1300/j025v20n01_09.
11. rob kling, “learning about information technologies and social change: the contribution of social informatics,” information society 16, no. 3 (2000): 212–24.
12. james r. valadez and richard p. durán, “redefining the digital divide: beyond access to computers and the internet,” high school journal 90, no. 3 (2007): 31–44, http://www.jstor.org/stable/40364198.
13. eszter hargittai, “digital na(t)ives? variation in internet skills and uses among members of the ‘net generation,’” sociological inquiry 80, no. 1 (2010): 92–113, doi: 10.1111/j.1475-682x.2009.00317.x.
14. glynda hull and katherine schultz, “literacy and learning out of school: a review of theory and research,” review of educational research 71, no. 4 (2001): 575–611, http://www.jstor.org/stable/3516099.
15. colin lankshear and michele knobel, new literacies.
16. allan luke, “literacy and the other: a sociological approach to literacy research and policy in multilingual societies,” reading research quarterly 38, no. 1 (2003): 132–41, http://www.jstor.org/stable/415697.
17. stephanie collins, “breadth and depth, imports and exports: transactions between the in- and out-of-school literacy practices of an ‘at risk’ youth,” in cultural practices of literacy: case studies of language, literacy, social practice, and power (mahwah, nj: lawrence erlbaum, 2007).
18. allison skerrett and randy bomer, “borderzones in adolescents’ literacy practices: connecting out-of-school literacies to the reading curriculum,” urban education 46, no. 6 (2011): 1256–79, doi: 10.1177/0042085911398920.
19. samantha becker et al., opportunity for all: how the american public benefits from internet access at u.s. libraries (washington, dc: institute of museum and library services).
20. lisa tripp, “digital youth, libraries, and new media literacy.”
21. nora fleming, “museums and libraries awarded $1.2m to build learning labs,” education week (blog), december 7, 2012, http://blogs.edweek.org/edweek/beyond_schools/2012/12/museums_and_libraries_awarded_12_million_to_build_learning_labs_for_youth.html.
22. see william f. tate iv and mark hogrebe, “from visuals to vision: using gis to inform civic dialogue about african american males,” race ethnicity and education 14, no. 1 (2011): 51–71, doi: 10.1080/13613324.2011.531980; mark c. hogrebe and william f. tate iv, “school composition and context factors that moderate and predict 10th-grade science proficiency,” teachers college record 112, no. 4 (2010): 1096–1136; robert j. sampson, great american city: chicago and the enduring neighborhood effect (chicago: university of chicago press, 2012).
23. henri lefebvre, the production of space (oxford: blackwell, 1991).
24. xavier de souza briggs, the geography of opportunity: race and housing choice in metropolitan america (washington, dc: brookings institution press, 2005).
25. paul jargowsky, poverty and place: ghettos, barrios, and the american city (new york: russell sage foundation, 1997).
26. edward w. soja, postmodern geographies: the reassertion of space in critical social theory (new york: verso, 1989).
27. colin gordon, mapping decline: st. louis and the fate of the american city (university of pennsylvania press, 2008).
28. ibid.
29. ernest morrell, “critical approaches to media in urban english language arts teacher development.”
30. missouri department of elementary and secondary education, http://dese.mo.gov/dsm/.
31. david allen, gis tutorial ii: spatial analysis workbook (redlands, ca: esri press, 2009).
32. becker et al., opportunity for all: how the american public benefits from internet access at u.s. libraries (washington, dc: institute of museum and library services).
33. lisa tripp, “digital youth, libraries, and new media literacy.”
34. becker et al., opportunity for all: how the american public benefits from internet access at u.s. libraries (washington, dc: institute of museum and library services).
35. colin gordon, mapping decline: st. louis and the fate of the american city.
36. see mwalimu shujaa, beyond desegregation: the politics of quality in african american schooling (thousand oaks, ca: corwin, 1996); william j. wilson, the truly disadvantaged: the inner city, the underclass, and public policy (chicago: university of chicago press, 1987); gary orfield and mindy l. kornhaber, raising standards or raising barriers: inequality and high-stakes testing in public education (new york: century foundation, 2010).

communication
design and development of a himalayan studies information system for india: a proposed model
anil singh

the ever-increasing need for information, with its complexity and escalating costs; the enormous growth in publications; and the emergence of subject specialization have compelled librarians to share resources through information networks and systems. this paper describes the necessity of networking among the himalayan studies and research centers in india, allowing the sharing of information originating from the himalayan studies information system (himis). the paper also discusses in brief the definition of information systems, as well as the objectives and needs of a proposed himis.

recent advancements in computer, communications, and networking technologies have brought about three paradigm shifts.
there is the shift of information resources from print to electronic media, the shift in the role of information providers from passive to proactive, and the shift from manual to automated information delivery. this has presented library and information professionals with a tremendous challenge, that of playing a proactive role not only in their routine activities (acquisition, processing, and dissemination), but also in the actual learning process of their clientele. library and information professionals have to learn to scan, filter, interpret, analyze, repackage, and deliver information from a variety of sources in ways that are meaningful to their users.1 himalaya, the greatest physical feature of the earth, is not only an integral part of history and heritage, but it has also assumed the form of a social, cultural, and geo-political reality that cannot be ignored or underestimated. this mountain range makes an enormous contribution to our contemporary life, even as it influenced our history and mythology. furthermore, this influence promises to extend even to the future.2 the himalayas have always remained a source of fascination and inspiration for people from all walks of life, and have been deemed by the peoples of the subcontinent to be the cradle of human civilization. the variety of cultures, terrains, forest physiographies, flora, and fauna of this region has lured the intelligentsia of the world since time immemorial. in recent years, however, the himalayas have become the focus of attention of scientists and government alike. 
efforts are underway for a better understanding of their highly complex environmental and ecological systems, and to bring about an all-around development of the region, which has remained backward throughout the centuries.3 there has been a tremendous explosion of data and information in recent years, particularly in the field of himalayan resources, with an increase in the number of research and development institutions at the national and international level. while most of these research institutions and universities possess excellent libraries and information centers, there is at present no information network by which coordination and sharing of resources could be effected for the mutual benefit of each of the existing libraries. because there is little access to the right information at the right time, it has become difficult for one single organization to collect the data and information that are required by policy makers, administrators, and research scientists. the other scientific departments of the government of india have already begun planning information networks in their respective areas. some of these projects are completed and others are in the process of implementation.4 during the last decades, india has been active in setting up information systems and networks, and considerable progress has been achieved in this area (figure 1). since most of the information related to the himalayas is scattered in different research-and-studies centers all over india, there is a need to develop a himalayan studies information system (himis). himis will be a computer-communication network for linking various libraries and information centers of research and development (r&d) institutions, universities, and nongovernmental organizations (ngos) working on the himalayas. the system, therefore, has to take into account the specific information requirements of each development sector so far as its relevance to the himalayas is concerned.
anil singh (rathoreas@hotmail.com) is professional assistant, division of library, documentation and information, national council of educational research and training, sri aurobindo marg, new delhi, india.
information technology and libraries | march 2005

defining information systems
the purpose of an information system is to provide accurate and relevant information to users at the right time and at the appropriate level of detail. this will help ensure that the corporate information resource is utilized fully.5 buckingham defines an information system as a system which assembles, stores, processes, and delivers information relevant to an organization (or to society), in such a way that the information is accessible and useful to those who wish to use it, including managers, staff, clients, and citizens. an information system is a human activity (social) system which may or may not involve the use of a computer system.6 harrod’s librarians’ glossary defines an information system as an organized procedure for collecting, processing, storing, and retrieving information to satisfy a variety of needs.7 according to bowman, four different components can be identified: (1) a store of useful information that has been accumulated over a period of time; (2) a series of techniques used for adding material to and retrieving information from the store upon demand; (3) a group of people who operate the system and are responsible for selecting information to be added, answering questions, organizing the store, and for implementing and modifying the techniques for both storage and retrieval; and (4) the user.
the ultimate test of any such system is the degree of satisfaction it gives the user who has specific information needs.8

setting up himis
the volume of himalayan information plus the number of users and their various requirements have created a situation where it is almost impossible for any single library to provide information services single-handedly. it is only through a network of all these information centers that some viable control of himalayan information is feasible, not only at a local level, but also at the national and international level, to ensure effective use of information resources to the best advantage at a minimum cost. these days, information is being regarded as a national resource, and this awareness has led to computer-based information services such as indexing and abstracting at the national and international levels in the field of himalayan studies.9 considering the importance of himis, the government of india, through the agency of the national information system for science and technology (nissat), should aid and support the proposed himis generously. himis is eminently suitable for several spheres of national activities, including planning and research. reliable and timely information for decision-making becomes increasingly important for india, where a concept of social welfare has developed over the past three decades. many organizations have successfully developed their own information systems to plan, monitor, or control their research activities, and these have yielded increased research proficiency. the government should surely benefit from these methods.
apart from suggesting suitable solutions to the problems of planning, monitoring, allocation, control, and coordination of the departmental programs, one has to consider the special distribution of these programs.10 the need to set up himis must, therefore, be considered in the context of the rapid development of himalayan information as well as the increasing awareness of its relevance to societal development.11

objectives of himis
himis will be fully computerized so that an efficient system of storage and retrieval can be organized through networks linking all the himalayan-studies centers with each other. the main objectives of the proposed himis are listed in appendix a.

users of the proposed system
the proposed information system is planned to meet the needs of the specialists who are directly or indirectly concerned with himalayan studies and research as a subject or as an activity. the following are the categories of those to whom the information would be supplied in a meaningful form within a reasonable time through himis.
1. planners, policy makers, decision makers, and administrators with respect to himalayan development at government and nongovernment levels;
2. major institutions devoted to himalayan study and research as a discipline;
3. international organizations such as the united nations educational, scientific, and cultural organization (unesco) and the international centre for integrated mountain development (icimod);
4. scientists engaged in the implementation and execution of plans and policies;
5. scientists and researchers engaged in himalayan study and research;
6. teachers engaged in teaching about the himalayas;
7. communicators who attempt to convey the information about development policies, plans, programs, and projects; and
8. ngos working on projects about the himalayas.

figure 1. some of the current networking systems in india: agricultural information system; ahmedabad library network; biodiversity conservation information system; calcutta library network; developing library network; education information system; environmental information system; industrial information system; information and library network; management library network; medical information system; mysore library network; national information system for science and technology; nutrition information system; patent information system; project information system; pune library network; rural information system.

components of the proposed himis
for designing himis, six essential components have been identified and proposed: (1) national resource center on himalayan studies; (2) library consortia; (3) computerization of himalayan institutes’ libraries; (4) information networking between himalayan institutes; (5) digitization of information material on the himalayas; and (6) himalayan information gateway. figure 2 depicts the proposed components.

national resource center on himalayan studies
as a part of the information system, a national-level resource center for himalayan studies is also proposed on the pattern of other information centers in india, such as the national information center for management information (nicman), ahmedabad, and the national information center for food science and technology (nicfos), mysore, to name but two.
these information centers have been established with the financial help of nissat, and at present are working as sectoral information centers of nissat. figure 3 describes various functions and activities of the proposed center. the planned objectives of the resource center are:
• to create a user-need-based information-technology (it) resource center;
• to develop a strong collection of different types of information sources;
• to develop a user-friendly information-retrieval mechanism, such as an online facility for having access to international databases;
• to create an it-based infrastructure;
• to develop a liaison with all himalayan studies and research centers of india and other international centers for better information service through resource sharing and networking; and
• to provide bibliographies on selected topics on demand or even in advance of demand.

himalayan-studies libraries consortium
the concept of library consortia has been floating around for quite some time in india. though it is the need of the hour, indian libraries have yet to move in a definite direction in this regard. strong resource-sharing activity among libraries, a prerequisite for the right attitude towards consortia activities, has not been as present in india as could be desired. the sudden influx of electronic information is forcing library consortia to materialize.13 traditionally, the primary purpose of establishing a library consortium is to share physical resources amongst members. access to resources is now considered more important than collection building, especially if the access is perpetual in nature. a library consortium helps libraries to get the benefit of wider access to electronic resources at an affordable cost and at the best terms of license.
a consortium with the collective strength of resources of the various institutions available to it is in a better position to address and resolve the problems of managing, organizing, and archiving electronic resources.14 the indian national digital library in science and technology (indest) consortium, which is set up by india’s ministry of human resource development, and ugc-infonet, which is set up by the university grants commission of india, are the best examples of library consortia in india. therefore, it is also proposed that a consortium of himalayan institutes’ libraries may be formed to share the electronic resources of other libraries.

computerization of himalayan-studies institutes’ libraries
adherence to standard techniques, procedures, and methods is an essential prerequisite for the effective functioning of a network. participating libraries will have to follow certain procedures and practices, without which the resources held by them cannot be effectively and meaningfully shared. in the context of library computerization, standardization is very necessary in such areas as classification, subject heading, and cataloging of various types of library materials; bibliographic description; and standard interchange of bibliographic data.15 the himalayan institutes have started the computerization of their libraries, beginning with creation of computerized databases of books, journals, reports, conference proceedings, annual reports, and monographs. but not all the housekeeping activities have been computerized. it is, therefore, suggested that every library should begin the computerization of all of its activities. in india, libraries are generally using dewey decimal classification (ddc) for classification, anglo-american cataloguing rules (aacr) for cataloging rules, and library of congress (lc) for subject headings.
machine-readable cataloging (marc) format is being used in the majority of libraries for creation of the database.

himalayan institutes’ network
networking among the himalayan institutes’ libraries and information centers in the country is also proposed for optimum resource sharing. resource sharing and networking in libraries are powerful tools both for increasing productivity and enhancing services to meet the changing needs of library users. the proposed network ensures effective bibliographic control, document delivery, cooperative acquisition of serials and other literature in the field, and dissemination of relevant information. all the libraries and information centers of himalayan-studies centers of india (see appendix b) will be linked to the national resource center on himalayan studies. these libraries have been identified as regional centers of the proposed himis. a network, in the first instance, envisages a physical structure of links among the libraries and information centers established by means of computer and telecommunication links. resource sharing is based on the concept that the collective strength and effectiveness of a group of libraries is greater than the sum of its individual members.

digitization of himalayan-information sources
research publications are vital for any professional discipline, and it is crucial to preserve and provide access to them. due to the amount of important information published in journals, conference proceedings, and reports, these efforts must be on a par with similar initiatives in other countries. permanent preservation and enhanced access must be ensured vigorously now by the applications of new technology for digitizing and electronic access to valuable contents.
digitization provides unhindered access to information via computer and communication networks, justifying the need for using it for studying himalayan literature. at present, very few copies of conference volumes, journals, and reports are published, and these are exclusively distributed to subscribers. the institutions involved rarely encourage readers to subscribe personally to such literature, thus placing it out of the reach of a large number of readers. the paper used for publishing these types of sources is of inferior quality, significantly reducing shelf life. in addition to providing increased and easy access to these publications, the most visible benefit of digitization is the fact that it preserves them.16 some of the purposes of digitization, identified by different ongoing projects, are to:
• collect, store, and organize information and knowledge in digital form;
• promote economic and efficient delivery of information;
• leverage considerable investments in computing and communication infrastructures;
• strengthen communication and collaboration between research, government, and educational communities; and
• contribute to lifelong learning opportunities.

figure 2. components of himis
figure 3. national resources center on himalayan studies in india

keeping all of these points in mind, the digitization of himalayan literature available in the himalayan studies centers of india is very necessary, particularly back volumes of journals, conference proceedings, annual reports, monographs, reports, and research papers published by himalayan scientists in various journals. it is one of the important aspects of the proposed himis.

himalayan-studies information and subject gateway
one of the major problems in accessing information from the internet is that it is very difficult and time consuming to get reliable and relevant information in the shortest possible time.
the effective and efficient way to provide easy access to quality information on the internet is developing subject gateways in specific areas. “subject gateways are online services and sites that provide searchable and browseable catalogues of internet-based resources. subject gateways will typically focus on a related set of academic subject areas.”17 to meet the information requirements of the scientific and academic communities in the digital era, various departments in india have developed or are still in the process of developing subject gateways in their respective areas.18 in recent years, there has been a tremendous explosion of data and information on the internet, particularly in the field of himalayan resources. as a result, it has become difficult for an individual, and also for an organization, to collect data and information. thus, access to the right information at the right time has become very difficult.19 since most of the information related to the himalayas is scattered on the web, it is necessary to develop a himalayan-information subject gateway. this gateway will provide links to various libraries and information centers of r&d institutions, universities, and ngos working in the himalayan region. this gateway will be developed on the pattern of the syama prasad mookerjee information gateway of social science (spmigss) developed by the indian council of social science research (icssr).20 the system, therefore, has to take into account the specific information requirements of each development sector insofar as its relevance to the himalayas is concerned.

institutions engaged in himalayan studies and research
to reduce the quantum of information illiteracy, it is essential that information about the agencies that generate and publish himalayan information is readily available to an individual, which is a huge task. the main hurdle has been the lack of appreciation of the role and importance of this type of institutional activity on a continuing basis.
bearing in mind the concern for setting up himis, the first step is to identify the agencies and institutions that generate himalayan information. the information generated by these institutions may be contained in the form of files, computerized databases, reports, institutional publications, dissertations and theses, articles in journals, and conference and seminar papers. the information also is accumulated through state-of-the-art reports, serials, and yearbooks.21 india has a reasonably good institutional setup for himalayan research. figure 5 lists agencies currently involved in diverse fields of r&d on the himalayas. appendix b lists some of the institutions that are engaged in himalayan studies and research.22

figure 4. urls of organizations, networks, and systems of india
adinet (www.alibnet.org/)
agris (www.fao.org/agris/)
calibnet (www.calibnet.org.in/)
csir (www.csir.res.in)
delnet (www.delnet.nic.in/)
desidoc (http://drdo.nic.in/labindex.shtml)
drdo (http://drdo.nic.in)
dst (www.dst.gov.in/)
envis (http://envis.nic.in/)
hellis (www.hellis.org/)
icar (www.icar.org.in)
icfre (www.icfre.org)
icssr (www.icssr.org)
indest (http://paniit.iitd.ac.in/indest/)
inflibnet (www.inflibnet.ac.in)
moef (www.envfor.nic.in)
mylibnet (www.mylibnet.org/)
nassdoc (www.icssr.org/doc_mail.htm)
nicchem (www.dsir.nic.in/division/nissat/nisnat/nics/mh.html)
nicfos (www.cftri.com/department/fostis.htm)
niclai (www.clri.org/)
nicman (www.iimahd.ernet.in/library/)
niscair (www.niscair.res.in)
nissat (www.dsir.nic.in/division/nissat/)
punenet (www.punenet.com/)
ugc (www.ugc.ac.in)
ugc-infonet (http://web.inflibnet.ac.in/info/ugcinfonet/ugcinfonet.jsp)

conclusion
all himalayan studies and research centers have to assume the major responsibility for developing himis.
the government of india needs to be convinced of the usefulness of such a system. it is necessary to emphasize that in the absence of such an information system, a large amount of research talent and resources will be wasted in duplicated efforts. it is hoped that the himalayan institutions and scientists engaged in himalayan studies and research will be able to impress upon the government of india the need for an early formulation and implementation of himis.23 this information system will also supplement the resources and services of the participating libraries, as “libraries acting together can more effectively satisfy user needs and thus meet the objectives at reduced costs.” the success of the venture will depend upon the financial support, guidance, and encouragement received from the government of india.24

council of scientific and industrial research (csir)
defence research and development organization (drdo)
department of science and technology (dst)
indian council of agricultural research (icar)
indian council of forests research and education (icfre)
ministry of environment and forests (moef)
nongovernmental organizations (ngos)
universities centers under the network of university grant commission (ugc)
science, technology, and environment departments in various himalayan states
figure 5. agencies involved in himalayan research

references

1. r. l. raina and i. v. malhan, eds., business librarianship and information services: proceedings of the iiml-manlibnet 3rd annual national convention, march 12–14, 2001, lucknow (lucknow: international book distributing co., 2002), v–vii.
2. shekhar pathak and anup sah, kumaon himalaya, temptations (nainital: published for kumaon mandal vikas nigam ltd. by gyanodaya prakashan, 1993).
3. n. k. shah, s. d. bhatt, and r. k. pande, himalaya: environment resources and development (almora: shree almora book depot, 1990), iii.
4. p. c. bose, “national agricultural research information system,” in national information policies and programmes, seminar papers thirty-seventh all-india library conference (delhi: indian library association, 1991), 177.
5. d. e. avison and g. fitzgerald, information systems development: methodologies, techniques and tools (oxford: blackwell scientific, 1988), 7.
6. ibid., 8.
7. raymond prytherch and l. m. harrod, harrod’s librarians’ glossary of terms used in librarianship, documentation, and the book crafts, and reference book, 6th ed. (aldershot, u.k.: gower pub., 1987), 385.
8. c. m. bowman, “the development of chemical information systems,” in chemical information systems, j. e. ash and ernest hyde, eds. (chichester, u.k.: ellis horwood, 1975), 6.
9. n. k. goil, “need for a social science information system: guidelines for a model for india,” library herald 17, nos. 1–4 (1975–1979): 81.
10. s. p. agrawal, “national information system in social sciences in india: a review,” in twenty-eighth all-india library conference, october 20–23, 1982, lucknow: seminar papers of planning for national information system, j. l. sardana et al., eds. (delhi: indian library association, 1982), 273–74.
11. s. p. agrawal, “national information system in social sciences,” in handbook of libraries, archives and information centers in india, vol. 3, information policy systems and networks, b. m. gupta et al., eds. (new delhi: information industry pub., 1986), 183.
12. roshan raina, “national information system for geoscience in india,” in twenty-eighth all-india library conference, october 20–23, 1982, lucknow: seminar papers of planning for national information system, j. l. sardana et al., eds. (delhi: indian library association, 1982), 262–63.
13. swati bhattacharyya, “library consortia: towards an action plan,” in business librarianship and information services: proceedings of the iiml-manlibnet 3rd annual national convention, march 12–14, 2001, lucknow, r. l. raina and i. v. malhan, eds. (lucknow: international book distributing co., 2002), v–vii.
14. jagdish arora and pawan agrawal, “indian national digital library in science and technology (indest) consortium: consortium-based subscription to electronic resources for technical education system in india,” in mapping technology on libraries and people: proceedings of the second international conference automation of libraries in education and research institutions (ahmedabad: inflibnet, 2003), 272–73.
15. md. hanif uddin and md. haru-orrashid, “networking of agricultural information systems in bangladesh (bd-agrinet): a model,” library herald 40, no. 1 (2002): 11.
16. v. k. j. jeevan, “digitizing of indian library science journals,” university news 39, no. 34 (2001): 5–13.
17. desire subject gateways, accessed july 30, 2003, www.desire.org/html/subjectgateways/community/imesh/.
18. anil singh and j. n. gautam, “himalayan information subject gateway in digital era: a proposal for its development,” desidoc bulletin of information technology 23, no. 2 (mar. 2003): 3–9.
19. p. c. bose, “national agricultural research information system,” 177.
20. icssr newsletter 23, no. 4 (jan.–mar. 2002): 24.
21. goil, “need for a social science information system,” 72–73.
22. p. pushpangadan and k. narayanan nair, “future of systematics and biodiversity research in india: need for a national consortium and national agenda for systematic biology research,” current science 80, no. 5 (2002): 633.
23. n. k. goil, “need for a social science information system,” 92.
24. amritpal kuar, “networking of the libraries of agricultural universities and research institutes in the states of punjab, haryana, and himachal pradesh (phhalnet): a proposal,” library herald 33, nos. 3–4 (1995–1996): 113.

appendix a.
main objectives of himis

the main objectives of the proposed himis are as follows:

• to identify, study, and survey the existing himalayan information infrastructure in the country;
• to function as an information base so that policy makers, administrators, and scientists can access the computer-based information in special fields and build up their expertise;
• to function as a computer-based information storage-and-retrieval system database that collects structured information generated by research institutions, continuously updating and making the information available to users;
• to provide a communications link with international databanks and databases for selective bibliographic information to scientists and other users;
• to examine, promote, and develop existing information services and resources to meet the information requirements of scientists working in the area of himalayan research;
• to establish and maintain links with other national information centers and systems in the country;12
• to create a linked collection of internet-based, high-quality himalayan resources;
• to convert core indian himalayan journals, research reports, dissertations, and working papers into digital format;
• to keep indian databases of himalayan journals and newsletters of himalayan institutes online;
• to establish a network of all himalayan-studies research centers situated in different parts of the country for sharing research resources;
• to provide online information on forthcoming conferences, seminars, and training workshops in himalayan-research-and-studies centers in india;
• to provide details of completed and ongoing himalayan-research projects;
• to connect web sites of himalayan studies, hill studies departments of major universities, and himalayan-research institutes;
• to provide for discussions, chat groups, and video-conferencing facilities for himalayan scientists;
• to share the resources of other libraries to supplement a library’s own collection;
• to share scientific efforts and expertise;
• to ensure effective bibliographic control of the literature;
• to facilitate and promote document-delivery and library-lending services;
• to develop a common collection-development policy;
• to share catalog service and to create a computerized union database;
• to share database services such as abstracting, indexing, and full-text services;
• to collect, store, organize, and retrieve information on all aspects of himalaya and its interdisciplinary areas contained in various recorded media; and
• to coordinate the existing resources, services, and facilities within india in the field of himalayan studies.

appendix b. institutions engaged in himalayan research

universities

center for environmental sciences, himachal pradesh university, shimla (http://hpuniv.nic.in/envstu.htm)
center for himalayan studies, university of north bengal, darjeeling (www.nbu.ac.in/)
center for interdisciplinary studies of mountain and hill environment, university of delhi, delhi
g. b. pant university of agriculture and technology, ranichauri, tehri-garhwal (www.gbpuat.ac.in/)
north east hill university, shillong (www.nehu.ac.in/)
high altitude plant physiology research center, h. n. b. garhwal university, srinagar-garhwal
dr. y. s. parmar university of horticulture and forestry, solan (www.yspuniversity.ac.in/)
institute of integrated himalayan studies (iihs), (ugc center of excellence) himachal pradesh university, shimla (www.hpuniv.nic.in/)
institute of himalayan studies and regional development, garhwal university, srinagar-garhwal

r&d institutions

central institute of higher tibetan studies, sarnath, varanasi (www.smith.edu/cihts/)
central institute of himalayan culture studies, dahung, arunachal pradesh
defence agricultural research laboratory (drdo) pithoragarh (http://drdo.nic.in/labindex.shtml)
snow and avalanche study establishment, (drdo) chandigarh (http://drdo.nic.in/labindex.shtml)
temperate forest research institute (icfre), shimla (www.envfor.nic.in/icfre/tfris/tfris.html)
himalayan forest research institute, shimla (www.icfre.org/institues/hfri.htm)
forest research institute (icfre), dehradun (www.envfor.nic.in/icfre/fri/fri.html)
institute of himalayan bioresources technology, (csir) palampur (www.csir.res.in/ihbt/)
regional research laboratory, (csir) jammu tawi (www.rrljammu.org/)
g. b. pant institute of himalayan environment and development, with its headquarters at almora; and regional units at tadong-gangtok; srinagar-garhwal; shamshi-kullu; itanagar (http://gbpihed.nic.in/)
icar research complex for neh region, (icar) umroi road: barapani, meghalaya (http://dare.nic.in/icarneh.htm)
vivekananda parvatiya krishi anushandhanshala (icar), almora (http://vpkas.nic.in/)
central institute of temperate horticulture (icar), srinagar, jammu and kashmir (http://dare.nic.in/cith.htm)
indian veterinary research institute, regional station, palampur (icar) (http://ivri.nic.in/)
indian veterinary research institute, regional station, muketeswar (icar), nainital (http://ivri.nic.in/)
national bureau of plant genetic resources, regional station, bhowali–niglat, nainital (http://nbpgr.delhi.nic.in/)
north eastern regional institute of science and technology, nirjuli, itanagar (http://agni.nerist.ac.in/)
wadia institute of himalayan geology, (dst) dehradun (www.himgeology.com/)
wildlife institute of india, dehradun (www.wii.gov.in/)

international center

international center for integrated mountain development (icimod), kathmandu, nepal (www.icimod.org/)

ngos

center of himalayan development and policy studies, dehradun
himalayan environmental studies and conservation organization, kotdwara (garhwal), uttaranchal
society for integrated development of himalayas (sidh), landour cantt., musoorie
himalayan action research center (harc), dehradun
himalayan study circle, pithoragarh
people’s association for himalayan area research (pahar), nainital
the himalaya trust, dehradun
the himalayan foundation, nandprayag, chamoli distt
research, advocacy, and communication in himalayan areas (rachna), dehradun
central himalayan environment association, nainital
himalayan region study and research center institute, new delhi
himalayan seva sangh, new delhi
himalayan research group, nainital
himalayan research and cultural foundation, new delhi
himalayan institute of action research and development, dehradun

comparisons of lc proofslip and marc tape arrival dates at the university of chicago library

charles t. payne: systems development librarian, and robert s. mcgee: assistant systems development librarian; university of chicago library, chicago, illinois

journal of library automation vol. 3/2 june, 1970

a comparison of arrival dates of 5020 lc proofslips and corresponding marc magnetic tape records reveals that four-fifths of the marc records were received the same week as, or earlier than, the proofslips.

the purpose of this study is to determine the timeliness of marc ii records' arrival dates in comparison to the arrival dates of matching lc proofslips. the acquisitions department of the university of chicago library receives a complete set of cut and punched lc proofsheets (or "lc proofslips") that is used primarily for selection and ordering. in examining potential uses of marc records in acquisitions processing, the library systems development office felt that a critical determinant would be the timeliness of marc records in comparison to the arrival dates of the matching lc proofslips. accordingly, the study described below was designed to gather data upon which appropriate system design questions might be considered.

it was decided that "arrival date" would be defined as the week in which an arrival occurred, since the initial processing and distribution of incoming lc proofslips is framed within weekly, rather than daily, periods. "week" was defined as the monday through friday workweek. "arrivals" were defined as deliveries of marc tapes and lc proofslips by the library mail service. no attempt was made to influence the normal delivery procedures, or to specialize or hasten identification of these materials for priority handling. arrival weeks were numbered consecutively, the week of march 31-april 4, 1969, being designated week 1.
marc tape numbers correspond to arrival week numbers; i.e., marc tape #4 arrived during week 4. table 1 presents these correspondences.

table 1. week numbers for 15 weeks of study

week number   arrival week dates
 1            march 31-april 4
 2            april 7-april 11
 3            april 14-april 18
 4            april 21-april 25
 5            april 28-may 2
 6            may 5-may 9
 7            may 12-may 16
 8            may 19-may 23
 9            may 26-may 30
10            june 2-june 6
11            june 9-june 13
12            june 16-june 20
13            june 23-june 27
14            june 30-july 3
15            july 7-july 11

data collection

proofslip collection began in week 2, but in that week only a partial collection was made. in subsequent weeks, complete collections of proofslips bearing the marc acronym (marc proofslips) were attempted, so that proofslip data beginning with week 3 (april 14-18) are more complete. proofslip collection was terminated in week 15. discrepancies between the counts of marc records and the numbers of marc proofslips collected have not been accounted for, but possible reasons are discussed in the following section.

data collection was based upon comparisons of: 1) the weekly printed indexes, in lc card number order, that came with marc ii tapes; and 2) weekly lists of marc proofslip arrivals. in each incoming batch of lc proofslips, those with marc notes were separated and their arrival date noted. the marc proofslips for each week were put in primary order by the first two digits (series number) of the card number, and were secondarily ordered within each series by the serial number following the hyphen, thereby matching the order of lc card numbers in the marc indexes. these numbers were transcribed to create a weekly list of proofslip arrivals. two new lists of lc card numbers were derived each week: 1) a marc index; and 2) a proofslip list. weekly, each new list was compared with all lists of the other type to identify card number matches.
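the ordering and matching steps described above were carried out by hand in the study. as an illustration only, they can be sketched in python as follows; the card numbers, week numbers, and function names are hypothetical and are not taken from the study's data.

```python
from collections import defaultdict


def card_key(card_number):
    """Split an LC card number such as '68-12345' into (series, serial) integers."""
    series, serial = card_number.split("-")
    return int(series), int(serial)


def earliest_week(lists_by_week):
    """Map each card number to the first week in which it appears on any list."""
    first_seen = {}
    for week in sorted(lists_by_week):
        for number in lists_by_week[week]:
            first_seen.setdefault(number, week)
    return first_seen


def cross_tabulate(proofslips_by_week, marc_by_week):
    """Count matches as counts[proofslip_week][tape_week]."""
    tape_week = earliest_week(marc_by_week)
    counts = defaultdict(lambda: defaultdict(int))
    for ps_week, numbers in proofslips_by_week.items():
        for number in numbers:
            if number in tape_week:  # unmatched proofslips are simply not counted
                counts[ps_week][tape_week[number]] += 1
    return counts


# Hypothetical sample data: week number -> LC card numbers received that week.
proofslips_by_week = {3: ["69-100", "68-2500", "70-15"], 4: ["68-31", "69-7"]}
marc_by_week = {2: ["68-2500"], 3: ["69-100", "69-7"], 5: ["68-31"]}

# Primary order by series number, secondary order by serial number.
ordered = sorted(proofslips_by_week[3], key=card_key)
counts = cross_tabulate(proofslips_by_week, marc_by_week)
```

sorting on the (series, serial) pair as integers, rather than on the raw string, keeps a serial such as 7 ahead of 100 within the same series, which is the order the printed marc indexes are described as using.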
thus, each of the two types of lists was cross-tabulated with all the lists of the other type, showing on all lists which card numbers had been matched, and the week numbers of these matches. counts were made of the matches tabulated on each list, and were entered into table 2. matches made during a given week are subcounted by series groups 65-68, 69, and the 7 series. the cumulative percentages of marc record and proofslip matches were entered into tables 3 and 4. table 3 contains the percentages of matches for any week's proofslips with successive marc tapes. for example, of the 340 proofslips received in week 4, 71.2% matched marc records received the same week, or earlier, i.e., tapes 4, 3, and 2. table 4 contains the percentage of matches for any marc index on successive proofslip lists. for example, of the 768 records on marc tape number 5 (received in week 5) 23% were matched by proofslips received the same week, or earlier, i.e., weeks 5, 4, 3, and 2.

analysis of results

some patterns of marc and proofslip arrivals are indicated by the tables. the results in table 2 show that there is not a one-for-one weekly relationship between proofslip and marc record arrivals. for example, the 340 marc proofslips received in week 4 matched tape records received from week 2 through week 10, although the highest number of matches was also in the tape received in week 4. in later proofslip weeks, however, the highest number of proofslip matches was with tape records received at least one week earlier. a summary of table 2 would show that of the 5020 marc proofslips received during weeks 3-10, 4004, or 79.8% were matched to marc records received the same week or earlier. in table 3, the cumulative percentages of proofslip matches with successive marc indexes indicate, for several of the weeks, more than a 90% match with tape records two weeks after proofslip arrivals.
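the cumulative percentages reported in tables 3 and 4 are running sums of match counts divided by the number of items received. a minimal python sketch of that arithmetic follows, using the week 4 "total" row of table 2; the function and variable names are illustrative and do not come from the study, which was tabulated by hand. the fifth count is garbled in the scanned table and is taken here as 50, an assumption that reproduces the 85.9% printed in table 3.

```python
from itertools import accumulate


def cumulative_percentages(match_counts, total_received):
    """Return the running percentage of items matched after each successive list."""
    return [round(100 * running / total_received, 1)
            for running in accumulate(match_counts)]


# Week 4 proofslips: matches with tapes 1-10 ("total" row of table 2).
# The fifth value is an assumed reading (50) of a garbled cell.
week4_matches = [0, 14, 79, 149, 50, 24, 2, 4, 4, 6]
week4_received = 340

week4_cumulative = cumulative_percentages(week4_matches, week4_received)
same_week_or_earlier = week4_cumulative[3]  # cumulative through tape 4

# Summary figure quoted in the text: 4004 of 5020 proofslips.
overall = round(100 * 4004 / 5020, 1)
```

under these inputs, `same_week_or_earlier` corresponds to the 71.2% quoted for week 4, and `overall` to the 79.8% quoted for weeks 3-10.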
table 3 shows that the percentage of matches for a set of proofslips received in one week with the marc indexes received the same week or earlier ranges from 48.9% to 91.6%. table 4 shows that the percentage of matches for a marc tape received in a given week with the proofslips received the same week or earlier ranges from 7.1% to 49.8%. for the period of weeks corresponding to tape numbers 3-10, 6335 tape records (from table 4) and 5020 proofslips (from table 2 or 3) were received. the reason for the discrepancy between the number of marc records and the number of marc proofslips is not clear, but is possibly due to the combined effects of basic factors such as the limited period of the study, the difficulties of collecting proofslips in a working environment, and the nature of the manual effort required to list lc card numbers and compare proofslip lists and marc indexes.

table 2. number of proofslip matches with marc indexes by arrival week and by lc card number subseries

proofslip     lc        marc tape number
week          series        1    2    3    4    5    6    7    8    9   10   11   12   13   14   15
ps 2 (#88)    65-68         5   25   13    4    4    0    0    0    0    0    0    0    0    0    0
              69            2    8    5    3    3    0    1    0    0    0    0    0    0    0    0
              7 series      1    2    7    1    2    1    0    0    0    0    0    0    0    0    0
              total         8   35   25    8    9    1    1    0    0    0    0    0    0    0    0
ps 3 (#497)   65-68         6   42   77   37   17    9    2    1    0    0    0    0    0    0    0
              69            7   25   65   32   30    9    5    0    1    0    0    0    0    0    0
              7 series      0   16   36   41    8    2    0    0    2    0    0    0    0    0    0
              total        13   83  178  110   55   20    7    1    3    0    0    0    0    0    0
ps 4 (#340)   65-68         0   14   35   57   19   12    1    3    2    1    0    0    0    0    0
              69            0    0   26   56   12    3    0    1    1    3    0    0    0    0    0
              7 series      0    0   18   36   19    9    1    0    1    2    0    0    0    0    0
              total         0   14   79  149   50   24    2    4    4    6    0    0    0    0    0
ps 5 (#398)   65-68         0    0   14   56   33   35    0    4    7    4    0    0    0    0    0
              69            0    0    7   62   21    8    0    4    3    2    0    0    0    0    0
              7 series      0    0    9   49    9    9    1    5    2    6    1    0    0    0    0
              total         0    0   30  167   63   52    1   13   12   12    1    0    0    0    0
ps 6 (#653)   65-68         0    0    0   29  108   77    3    5    2    3    1    0    0    0    0
              69            0    0    1   55   95   50    4    2    2    4    0    0    0    0    0
              7 series      0    0    2   28   72   52    6    0    5    5    2    1    0    0    0
              total         0    0    3  112  275  179   13    7    9   12    3    1    0    0    0
ps 7 (#711)   65-68         0    0    0    9   68  128   29    6    4    2    0    0    0    0    0
              69            0    0    0    2   29  133   54    9    1    8    3    0    0    1    0
              7 series      0    0    0    5   33   92   33   10    3    6    1    2    0    0    0
              total         0    0    0   16  130  353  116   25    8   16    4    2    0    1    0
ps 8 (#503)   65-68         1    0    0    0    5   87   46   29    6    4    0    1    0    0    0
              69            0    0    0    0    2   54   49   20   11    4    0    0    0    0    0
              7 series      0    0    0    0    2   37   46   17    1    2    0    0    1    0    0
              total         1    0    0    0    9  178  141   66   18   10    0    1    1    0    0
ps 9 (#933)   65-68         0    0    0    0    0   10   86  122   52   34    5    1    1    0    0
              69            0    0    0    0    1    3   75  107   53   39    0    0    0    1    0
              7 series      0    0    0    0    0    1   49  115   73   42   10    4    0    1    0
              total         0    0    0    0    1   14  210  344  178  115   15    5    1    2    0
ps 10 (#985)  65-68         0    0    0    0    0    0    5   36  159   91    8    1    0    5    1
              69            0    0    0    0    0    0    1   40  180   96    5    1    1    1    0
              7 series      0    0    0    0    0    0    5   23  165  101    7    1    0    1    1
              total         0    0    0    0    0    0   11   99  504  288   20    3    1    7    2

table 3. cumulative percentages of matches of each week's proofslips received with each additional marc ii tape index

proofslip     cumulative % of tape number
week             1     2     3     4     5     6     7     8     9    10    11    12    13    14    15
ps 2 (#88)     9.1  48.9  77.3  86.4  96.6  97.7  98.6  98.6  98.6  98.6  98.6  98.6  98.6  98.6  98.6
ps 3 (#497)    2.6  19.3  55.1  77.2  88.5  92.3  93.7  94.0  94.6  94.6  94.6  94.6  94.6  94.6  94.6
ps 4 (#340)    0.0   4.1  27.4  71.2  85.9  92.9  93.5  94.7  95.9  97.6  97.6  97.6  97.6  97.6  97.6
ps 5 (#398)    0.0   0.0   7.5  49.7  65.3  78.4  78.6  81.9  84.9  87.9  88.2  88.2  88.2  88.2  88.2
ps 6 (#653)    0.0   0.0   0.4  17.6  59.7  87.1  89.1  90.2  91.6  93.4  93.9  94.0  94.0  94.0  94.0
ps 7 (#711)    0.0   0.0   0.0   2.2  20.5  70.2  86.5  90.0  91.1  93.5  94.9  94.2  94.2  94.4  94.4
ps 8 (#503)    0.2   0.2   0.2   0.2   2.0  37.4  65.4  78.5  82.1  84.1  84.1  84.5  84.7  84.7  84.7
ps 9 (#933)    0.0   0.0   0.0   0.0   0.1   1.6  24.1  61.0  80.0  92.4  94.0  94.5  94.6  94.8  94.8
ps 10 (#985)   0.0   0.0   0.0   0.0   0.0   0.0   1.1  11.2  62.3  91.6  93.6  93.9  94.0  94.7  94.9

table 4. cumulative percentages of matches of each marc ii tape index with each additional week's proofslips received

tape            cumulative % of proofslip week number
                  ps 2  ps 3  ps 4  ps 5  ps 6  ps 7  ps 8  ps 9 ps 10 ps 11 ps 12 ps 13 ps 14 ps 15
tape 1 (#648)      1.2   3.2   3.2   3.2   3.2   3.2   3.4   3.4   3.4   3.4   3.4   3.4   3.4   3.4
tape 2 (#495)      7.1  23.8  26.6  26.6  26.6  26.6  26.6  26.6  26.6  26.6  26.6  26.6  26.6  26.6
tape 3 (#494)      5.1  41.1  57.1  63.2  63.7  63.7  63.7  63.7  63.7  63.7  63.7  63.7  63.7  63.7
tape 4 (#791)      1.0  14.9  33.1  54.7  67.9  71.0  71.0  71.0  71.0  71.0  71.0  71.0  71.0  71.0
tape 5 (#768)      1.2   8.3  14.8  23.0  58.8  75.6  77.0  77.1  77.1  77.1  77.1  77.1  77.1  77.1
tape 6 (#1136)     0.1   1.9   4.0   8.5  24.2  55.4  71.0  72.3  72.3  72.4  72.4  72.4  72.4  72.4
tape 7 (#694)      0.1   1.2   1.4   1.6   3.5  20.2  40.5  70.7  72.3  72.6  72.6  72.6  72.6  72.6
tape 8 (#658)      0.0   0.2   0.7   2.7   3.8   7.6  17.6  69.9  84.9  86.2  86.5  86.5  86.5  86.5
tape 9 (#892)      0.0   0.3   0.8   2.1   3.1   4.0   6.1  26.0  82.5  86.0  86.3  86.3  86.3  86.3
tape 10 (#922)     0.0   0.0   0.7   2.0   3.3   5.0   6.2  18.5  49.8  79.5  89.9  89.9  90.5  90.5

conclusion

the data collected to date indicate that the arrivals of marc records generally precede those of the corresponding proofslips. thus, marc records seem to be timely enough to be used in book selection and ordering processes, where proofslips are now used, as well as to supply bibliographic data for cataloging.

cost analysis of an automated and manual cataloging and book processing system

joselyn druschel: washington state university, pullman.
a comparative cost analysis of an automated network system (wln) and a local manual system of cataloging and book processing at washington state university libraries indicates that the automated system is about 20 percent less costly than the manual system. a per-unit cost approach was used in calculating the monthly cost of each system based on the average number of items processed per month under the automated system. the process and the results of the analysis are presented in a series of charts which detail the tasks, items processed, unit and total monthly costs of both the manual and automated systems. the higher costs of the manual system were essentially staff costs.

the technical services division (tsd) of washington state university libraries (wsul) has had considerable experience in the use of automated techniques in selected areas of technical processing. an in-house automated acquisitions system was developed and implemented in 1967; that in-house system was eventually replaced by the acquisitions component of the washington library network (wln). since november 1977, the technical services division of wsul has used the wln bibliographic component for data verification (searching) and cataloging of materials. although the library has generally known its total automation expenditures, it has lacked a more precise breakdown of cost data on automated processing. moreover, the library has practically no cost data on manual processing. this report deals only with the costs of using the wln bibliographic system, not the wln acquisitions component.

an analysis was made of the total costs of both the automated and manual book processing systems. the objectives in undertaking the cost analysis were threefold: (1) to identify the essentially unknown costs of manual processing; (2) to provide more exact cost data on automated processing; and (3) to develop comparable data on the costs of each system.

manuscript received october 1980; accepted december 1980.
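the per-unit cost approach summarized above reduces to a short calculation: a unit cost per task, multiplied by the average number of items processed per month. the python sketch below illustrates it with the idc microfiche search figures reported in table 1(a) of this article; the function names and the dictionary layout are illustrative assumptions, not part of the original analysis.

```python
def unit_cost(rate_per_minute, minutes_per_item, subscription_per_item=0.0):
    """Cost of one item for one task: staff time plus any per-item subscription charge."""
    return rate_per_minute * minutes_per_item + subscription_per_item


def monthly_cost(cost_per_item, items_per_month):
    """Monthly cost of a task, rounded to whole dollars."""
    return round(cost_per_item * items_per_month)


# IDC microfiche search rows from table 1(a): (staff rate/min, min/item, items/month).
microfiche_rows = {
    "lt i": (0.084, 3, 2484),
    "lt ii": (0.094, 3, 496),
    "lt iii": (0.117, 3, 992),
}
idc_subscription = 0.21  # dollars per search, from table 1(a)

monthly = {
    grade: monthly_cost(unit_cost(rate, minutes, idc_subscription), items)
    for grade, (rate, minutes, items) in microfiche_rows.items()
}
subtotal = sum(monthly.values())
```

with these inputs the three monthly figures and their sum agree with the microfiche search subtotal of $1,949 given in table 1(a).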
methodology

the methodology used in this cost analysis was a per-unit cost approach. first, each process or task in which the staff were engaged in cataloging and book processing was identified. second, the per-unit cost (e.g., staff, data base, materials) of each process was calculated. finally, monthly costs were determined by multiplying the average number of items processed per month by the unit cost per task. the cost analysis charts (tables 1(a)-1(e), manual system; tables 2(a)-2(d), automated system), which detail the tasks, items processed, and unit and total costs, form the body of the analysis. equipment costs (purchase, lease, maintenance) were calculated separately, and are included in the summary cost data for each system (table 3).

identification of processes

the staff of the tsd cataloging and book processing unit perform the following functions: bibliographic verification, bibliographic record production, bibliographic record maintenance, the marking of materials, binding preparation and receipt (for most of the library system), and the preparation of book cards.

table 1(a). cost analysis: manual cataloging and book processing system
process / staff costs per item
bibliographic searching
idc microfiche search (lc and cip copy)
lt i (.084/min @ 3 min/item) $ .252
lt ii (.094/min @ 3 min/item) .282
lt iii (.117/min @ 3 min/item) .351
subscription costs-idc ($10,000/yr ÷ 47,664 searches/yr = .21/search)
microfiche search subtotal
national union catalog, etc., search
lt i (.084/min @ 15 min/item) $1.26
lt ii (.094/min @ 15 min/item) 1.41
lt iii (.117/min @ 20 min/item) 2.34
lt iii (.117/min @ 40 min/item) 4.68
subscriptions ($2,940/yr ÷ 15,300 searches/yr = .19/search)
manual search subtotal
bibliographic searching total
data base costs/ item subscription costs/item $ .21 .21 .21 $ .19 .19 .19 .
19 materials costs/item average total number total cost processed cost per per per item month month $ .462 2484 $1,148 244 557 .492 496 .561 992 $1.45 1.60 2.53 4.87 3972 $1,949 588 $ 853 169 270 418 1,058 100 487 1275 $2,668 5247 $4,617 26 journal of library automation vol. 14/1 march 1981 table l(b). cost analysis: manual cataloging and book processing system ave rage staff data total number total costs base cost processed cost per costs/ subscription materials per per per process item item costs/item costs/item item month month bibliographic record production-processing and products l. cataloging with lc microfiche copy type abbreviated fanfold (4-part 3x5 slips) timeslip (.03/min@ min/item x 72) $ .803 $.02/ fanfold lt i (.084/min @ 10 min/item x 985) check series lt i (.084/min@ 2 min/item) .168 revise fanfold supervisor i (.126/min @ 3 min/item) .38 check fanfold against book; separate fanfold lt ii (.094/min@ 2 min/item x 540) .22 supervisor i (.126/min @ 2 min/item x 517) arrange and file shelflist copy of fanfold timeslip (. 03/min @ 1.5 min/slip) .045 revise shelflist filing of fanfold slips lt ii (.094/min @ 1 min/slip) .094 verify authorities ( subject and name ) (1057x4 ) timeslip (.03/min@ 4 min/item) .12 type multilith master for card production lt i (. 084/min @ 6 min/master) .504 .061 master revise typed multilith master lt i (.084/min@ 3 min/master) .252 run multilith masters multilith operator (.13/min@ 3.5 min/set) .455 (cost of cards see below) microfiche copy cataloging subtotal $ 3.04 $ 081 $3.12 1057 $3,298 item 2. cataloging with modified copy (nuc/ lc) type fanfolds (4-part 3x5 slips ) lt i (.084/min@ 15 min/item) $ 1.26 $.021 fanfold cost analysis/druschel 27 table l(b) ( cont.) check series lt i (.084/min@ 2 min/item) .168 revise fanfold supervisor i (.126/min @ 5 min/item) .63 review fanfold cataloging librarian (.155/min @ 5 min/item) .775 separate fan folds lt ii (.094/min @ 30 sec/item) .047 . 
arrange and file sheljlist copy of fanfold timeslip (.03/min@ 1.5 min/slip) .045 revise filing of shelflist copy lt ii (.094/min @ l min/slip) .094 verify authorities (984 x 4) timeslip (.03/min@ 4 min/item) .12 type multilith master for card production lt i (.084/min @ 6 min/master) .504 .06/ master revise typed multilith master lt i (.084/min @ 3 min/master) .252 run multilith masters multilith operator (.13/min @ 3. 5 min/set) .455 (cost of oards see below) modified copy cataloging subtotal $ 4.35 $.08/ $4.43 984 $4,359 item 3. original cataloging catalog material librarian (. 155/min @ 60 min/item x 200) $ 9.60 librarian (.205/min @ 60 min/item x 22) revise cataloging librarian (.205/min @ 5 min/item) 1.03 type fan folds ( 4-part 3x5 slips) lt i (.084/min@ 15 min/item) 1.26 $.02/ fanfold check series lt i (.084/min@ 2 min/item .168 revise fanfold supervisor i (.126/min @ 5 min/item) .63 separate fan folds lt ii (.094/min @ 30 sec/item) .047 arrange and file sheljlist copy of fanfold timeslip (.03/min @ 1.5 min/item) .045 revise filing of sheljlist copy lt ii (.094/min @ 1 min/slip) .094 28 journal of library automation vol. 14/1 march 1981 table i(b) (c ont .) staff costs per process item 4. type multilith master for card production lt i (.084/min @ 6 min/master) .504 revise typed master lt i (.084/min@ 3 min/master) .252 run multilith master multilith operator (.13/min @ 3.5 min/set) .455 original cataloging subtotal $14.085 catalog cards (7 cards/set @ .055/card) subtotal cataloging total miscellaneous bibliographic record production assign class numbers to theses supervisor i (. 
126/min @ 2 min/item) assign subject headings for audio visual materials librarian (.155/min @ 2 min/set) type multilith masters for catalog cards for a-v materials lt i (.084/min @ 6 min/master) revise multilith masters lt i (.084/min @ 3 min /master) run multilith masters multilith operator (.13/min@ 3.5 min/set) (20 cards) resolve problems; general supervision supervisor i (7.56/hr x 52 hrs/mo) librarian (9.32/hr @ 22 hrs/mo) miscellaneous bibliographic record production subtotal bibliographic record production total $ .252 .31 .504 .252 .455 data base costs/ item subscription costs/item materials costs/item .06/ master (cost of cards see below) total cost per item average number total processed cost per per month month $. 081 $14.165 222 $ 3,145 item $.385/ set $.06/ master 1.10/ set $ .252 .31 .564 .252 1.555 4297 $ 1,6.54 2263 $12,456 30 30 30 30 30 8 9 17 8 47 393 205 $ 687 $13,143 cost analysis/druschel 29 table i( c). cost analysis: manual cataloging and book processing system average staff data total number total costs base cost processed cost per costs/ subscription mate rials per per per process item item costs/ item c osts/ltem i tern month month bibliographic record maintenance count sets of cards and match against cataloging copy lt i (.084/min @ 2 sets/min) $ .042 $ .042 4297 $ 180 type subject and added entries on card sets timeslip (.03/min @ 3 min/set) .09 .09 4297 387 revise card sets lt ii (.094/min @ 3 min/set) .282 .282 2520 711 lt iii (.117/min@ 3 min/set) .351 .35 1 1803 633 type subject and name authority slips timeslip (.03/min @ 1 min/slip) .03 .03 4526 136 file subject and name authority slips timeslip (.03/min @: 1 min/slip) .03 .03 4526 136 separate card sets lt i (.084/min@ 2 sets/min) .042 .042 4297 180 file subject catalog cards (2263x2) lt ii (.094/m in @ 1 min/card ) .094 .094 4526 425 file ait catalog cards (2263 x3 ) lt i (.084/min @ 1 min/card) .084 .084 6789 570 file shelflist cards (2) timeslip (.03/min @ 1 min/card ) .03 
.03 4526 136 retoise subject card filing lt iii (.117/min@ 1 min/card ) .117 .117 .4526 530 retoise ait card filing lt iii (.117/min@ 1 min/card) .117 .117 6789 794 revise sheljlist filing (2) lt ii (.094/min @ 1 min/card ) .094 .094 2340 220 supervisor i (.126/min @ 1 min/card ) .126 .126 2186 275 alphabetize and date works/ips lt i (.084/min @ 4 slips/min) .021 .021 2263 48 pull card sets (withdrawals and card corrections timeslip (.03/min @ lo min/set) .30 .30 100 30 revise card pulling ( 100 sets/month) supervisor i (.126/min @ 2 min/set) .252 .252 100 25 correct card sets (50 sets/month ) lt ii (.094/min@ 5 min/set) .465 .465 50 23 revise card corrections supervisor i (.126/min @ 2 min/set) .252 .252 50 13 process added copies (record accession # on shelflist; record call # in book; type slip for marking) lt ii (.094/min @ 15 min/item) 1.41 1.41 50 71 30 journal of library automation vol. 14/1 march 1981 table 1( c) (cont.) average staff data total number total costs base cost per costs/ subscription materials per process item locate materials in process lt ii (.094/min @ 15 min/item) $1.41 prepare books for binding decision supervisor i (. 126/min @ 1 min/item) .126 general supervision librarian ($12.34/hr @ 65 hours/month) bibliographic record maintenance total item costs/item costs/item item $1.41 . 126 table i( d ) . 
cost analysis: manual cataloging and book processing system staff data total costs base cost per costs/ subscription materials per process item item costs/item costs/item item marking sort materials for processing (marking) oa ii-typing (.105/min@ 30 sec/item) $ .053 $ .053 place materials on table oa 11-typing (.105/min@ 20 items/min) .005 .005 process materials (type and paste labels, pockets, & date due slips; type book cards) timeslip (.03/min@ 20 min/item) .60 $.029/ .629 label ; pocket ; date due slip; book card process materials with tab book cards (type and paste labels, pockets, & date due slips) timeslip (.03/min @ 16 min/item) .48 .032/ .512 oa ii-typing (.105/min@ 16 min/item) 1.68 label; 1.712 pocket; date due slip; book card processed cost per per month month 50 $ 71 50 6 802 $6,402 average number total processed cost per per month month 2263 $ 120 2263 ll 400 252 1555 796 308 527 cost analysis/druschel 31 table 1( d) (cont.) keypunch bookcards lt i (.084/min @ 2.4 min/card) .201 verify book cards lt iii (.117/min@ 1.6 min/card) .187 revise processing lt i (.084/min @ 2 min/item) .168 lt iii (. 117/min@ 2 min/item) .234 sort materials for delivery oa ii-typing (.105/min@ 1.5 items/min) .07 unpack bindery materials, pull slips lt i (.084/min @ 1 min/item) .084 verify bindery slips; check price lt iii (.117/min@ 2 min/item) general supervision, bindery account & statistical data lt iii (7.04/hr@ 15 hrs/mo) supervisor ii (8.97/hr @ 128 hrs/mo) librarian (12.34/hr @ 15 hrs/mo) marking total cataloging and book processing total .234 table i( e). total monthly costs (summary ) staff costs per month $25,775 data base costs/month subscription costs per month $1,076 .201 1863 374 .187 1863 348 .168 1500 252 .234 763 179 .07 2263 158 .084 550 46 .234 550 129 106 1,148 185 $ 4,631 $28,793 material costs per month total cost per month $1,942 $28,793 table 2(a). 
cost analysis: automated cataloging and book processing system average staff data total number total costs base cost processed cost per costs/ subscription materials per per per process item item costs/item costs/item item month month bibliographic searching l. wln data base search items searched, no inquiry charges lt ii (.094/min @ 1 min/item) terminal use (4 @ .06) $ .094 $ .24 $ .334 2443 $ 816 terminal use (3@ .06) .094 .18 .274 100 27 32 journal of library automation vol. 14/1 march 1981 table 2( a) (cont.) average staff data total number total costs base cost processed cost per costs/ subscription materials per per per process item item costs/item costs/item item month month items searched, inquiry charges assessed lt ii (.094/min@ 1 min/item) inquiry (3 @ .069) .094 .39 .484 1429 692 terminal use (3@ .06) data base search subtotal 3972 $1,535 2. national union catalog, etc. search (manual) lt ii (.094/min @ 10 min/item ) .94 .31 1.25 508 635 subscriptions ($1,860/yr + 6096 searches/yr) manual search subtotal 508 $ 635 bibliographic searching total 4480 $2, 170 table 2(b ). cost analysis: automated cataloging and book processing system average staff data total number total costs base cost processed cost per costs/ subscription materials per per per process item item costs/item costs/item item month month bibliographic record productionprocessing and products l. materials cataloged via wln a. cataloging with wln data base copy attach holdings; order cards lt ii (.094/min @ 6 min/item) $ .564 data base costs inquiry costs (no charge) cost per record use $1.60 cost per request .15 shelllist cards (4 @ .055) .22 com (cost pe r record ) .43 terminal use (1 @ .06/use) .06 -----wln data base copy subtotal $ .564 $2.46 $3.024 1376 $4,161 b. cataloging with cip copy upgrade data base copy lt ii (. 
094/m in @ 11 min/item) $1.034 revise upgraded copy librarian (.155/min @ 5 min/item) .775 attach holdings order cards lt ii (.094/min @ 6 min/item ) .564 table 2(b) (cont.) data base costs cost per record use cost per request shelflist cards (4 @ .055) com (cost per record) terminal use (1 @ .06/use) $1.60 .15 .22 .43 .06 cip copy subtotal $2.373 $2.46 c. cataloging with modified copy (e.g., nuc/lc copy) prepare cataloging worksheets lt ii (.094/min @ 15 min/item) $1.41 revise cataloging worksheets lt ii (.094/min @ 10 min/item) . 94 marc tag worksheets supervisor ii (.15/min @ 15 min/item) 2.25 revise marc tagged worksheets librarian (.155/min @ 8 min/item) 1.24 input cataloging data; attach holdings; order cards timeslip (.03/min @ 25 min/item) . 75 revise data input and verify authorities librarian (. 155/min @ lo min/item) 1.55 data base costs cost of input per record cost of authority checks (7 checks @ .069/entry) shelflist cards (4 @ .055) com (cost per record) terminal use (7 @ .06/use) $ .14 .48 .22 .43 .42 modified copy subtotal $8.14 $1.69 d. original cataloging catalog and marc tag material librarian (.155/min @ 60 min/item) $ 9.30 revise cataloging and marc tagging librarian (.205/min @ 5 min/item) 1.03 input cataloging data; attach holdings; order cards lt ii (.094/mi n @ 25 min x 104) cost analysis/druschel 33 $4.833 153 $ 739 $9.83 95 $ 934 34 journal of library automation vol. 14/1 march 1981 table 2(b ) (cont.) 
average staff data total number total costs base cost processed cost per costs/ subscription materials per per per process item item costs/item costs/item item month month timeslip (.03/min @ 25 min x 118) 1.49 revise input; verify authorities librarian (.155/min @ 10 min/item) 1.55 data base costs cost of input per record $ .14 cost of authority checks (7 checks @ .069/entry) .48 shelflist cards (4 @ .055) .22 com (cost per record) .43 terminal use (7 @ .06/use) .42 ----subtotal $13.37 $1.69 $15.06 222 $ 3,343 wln cataloging total 1846 $ 9,177 2. materials cataloged via other methods a. microform cataloging from publisher's copy review and revise copy; complete processing; revise card sets librarian (.25/min@ 2. 7 min/item) $ .675 xerox card sets (1 0 cards/set) timeslip (.03/min @ 1 min/title) .03 $.551 set microform subtotal $ .705 $.551 $1.255 407 $ 511 set b. cataloging music scores catalog scores; prepare for card production; revise card sets librarian (.25/min @ 28 min/item) 7.00 xerox card sets (14 cards/se t) timeslip (.03/min@ 2 min/title) .06 .77/ set music score subtotal $ 7.06 $. 77/ $7.83 10 $ 78 set non-wln cataloging total 417 $ 589 cataloging total 2263 $9,766 cost analysis/druschel 35 table 2(b ) (cont .) 3. miscellaneous costs assign class numbers to theses supervisor ii (.15/m in @ 2 min/item) $ .30 $ .30 30 $ 9 retrieve "rush" monographs supervisor ii (. 15/min @ 15 min/item) 2.25 2. 25 75 169 correct/update wln data base information lt ii (.094/min @ 10 min/item) terminal use (1 @ .06/use ) .94 $.06 1.00 360 360 assign subject headings for audio visual materials librarian (.155/min @ 2 min/set) .31 .31 30 9 file subject authority slips for microform mat erials librarian (.155/min @ 1.15 min/slip) .18 .18 55 10 resolve problems; general supervision lt ii (5 .68/hr x 13 hrs/mo) 74 supervisor ii (8. 97 hrs x 89 hrs/mo) 798 librarian ($12. 
34 hr x 52 hrs/mo) 642 miscellaneous costs subtot~l $ 2,071 bibliographic record production total $11 ,837 table 2(c). cost analysis: automated cataloging and book processing system average staff data total number total costs base cost processed cost per costs/ subscription materials per per per process item item costs/item costs/item item month month bibliographic record maintenancets d collate card sets from wln (7384 cards) lt i (. 083/min @ 30 sec/card) $ .042 $ .042 7384 $ 310 insert card sets in books lt ii (.094/min @ 1.6 min/item) process new books (1846 ) 1.51 . 151 1846 279 review cards against books; add accession number and stamp date on shelflist card; carrect series (when needed); separate card sets and distribute timeslip (.03/min @ 10 min/item) .30 .30 145 44 38 journal of library automation vol. 14/1 march 1981 table 2( c) (cont.) average staff data total number total costs base cost processed cost per costs/ subscription materials per per per process item item costs/item costs/item item month month lt i (.083/ min @ 10 min/item) $.83 $ .83 1701 $1 ,412 revise book processing (1846) lt iii (.117/min@ 1 min/item) .117 .117 1846 216 file central and holland sheljlist timeslip (.03/min @ 1 min/card) .03 .03 4526 136 revise central and holland sheljlist lt i (.083/min @ 30 sec/card) .042 .042 4526 190 separate and alphabetize microform card sets timeslip (.03/min @ 1 min/set) .03 .03 2000 60 file author/title/subject microform cards in general catalog timeslip (.03/min@ 1 min/card) .03 .03 2000 60 revise filing general catalog lt iii (.117/min@ 1 min/card) .117 .117 2000 234 pull card sets (withdrawals and card corrections--40 setslmo) timeslip (.03/min@ 10 min/set) .30 .30 20 6 lt i (.083/min @ 6 min/set) .498 .498 20 10 revise card pulling lt iii (.117/min@ 2 min/set) .234 .234 40 10 correct card sets (20 sets/mo) lt i (.083/min@ 6 min/set) .498 .498 20 10 revise card corrections lt iii (.117/min@ 2 min/set) .234 .234 20 5 process added copies 
(record accession number on shelflist; record call number in book; type slip for marking) lt i (.083/min @ 15 min/item ) 1.25 1.25 50 63 locate materials in process lt i (.083/min@ 15 min/item) 1.25 1.25 33 41 prepare books for bindery decision lt iii (.117/min @ 1 min/item) .117 .117 50 6 supervise staff and timeslip lt iii (7.04/hr @ 68 hrs/mo) 479 librarian (12.34/hr@ 50.5 hrs/mo ) 623 bibliographical record maintenance total $4,194 cost analysis!druschel 39 table 2(d). cost analysis: automated cataloging and book processing system average staff data total number total costs base cost processed cost per costs! subscription materials per per per process item item costs/item costs/item item month month marking sort materials for processing ( marking ) oa ii-typing (. 105/ min@ 30 sec/item) $ .053 $ .053 2263 $ 120 place materials on table oa iityping (.105/min@ 20 items/min) .005 .005 2263 11 process materials (type and paste labels, pockets and date due slips; type book cards ) timeslip (.03/min @ 16 min/item) .60 $.029/ .629 400 252 date due slip; label pocket ; process materials with tab book card book cards (type and paste labels, pockets and date due slips) timeslip (.03/min @ 16 min/item) .48 .032/ .512 1555 796 oa ii-typing (. 105/min @ 16 min/item) 1.68 date due 1.712 308 527 slip ; label; pocket ; book card keypunch bookcards lt i (.083/min @ 2.4 min/card) .20 .20 1863 373 verify book cards lt iii (. 117/min @ 1.6 min/card) . 187 .187 1863 348 revise processing lt i (.083/min @ 2 min/item) .166 .166 1500 249 lt iii (.117/min@ 2 min/item) .234 .234 763 179 sort materials for delivery oa ii-typing (.105/min@ 1.5 items/min ) .07 .07 2263 158 unpack bindery materials; pull slips lt i (.083/min @ 1 item/min) .083 .083 550 46 verify bindery slips and check price lt iii (. 
117/min @ 2 min/item) .234 .234 550 129
general supervision; bindery accounts and statistical data
  lt iii (7.04/hr @ 36 hrs/mo)                                253
  supervisor ii (8.97/hr @ 128 hrs/mo)                      1,148
marking total                                             $ 4,589
cataloging and book processing total                      $22,790

table 2(e). total monthly costs (summary)
  staff costs per month            $16,849
  data base costs/month            $ 5,480
  subscription costs per month     $   157
  materials costs per month        $   304
  total cost per month             $22,790

table 3. cataloging and book processing system: summary comparison costs
  category          manual system costs/month    automated system costs/month
  staff             $25,775                      $16,849
  data base              --                        5,480
  subscriptions       1,076                          157
  materials           1,942                          304
  equipment             462                          890
  total             $29,255/month                $23,680/month

  cost comparison-difference: manual $29,255/month; automated $23,680/month; difference $5,575/month ($66,900/year)

since 1978 this unit, as well as all other units in the technical services division (tsd), has periodically analyzed unit activities and recorded the data collected on work assignment/staffing profile sheets (see table 4 for sample profile sheet). the primary purpose of the profiles was to develop a detailed account of work distribution throughout tsd in order to determine the staffing requirements necessary for each unit to maintain an even workflow. in the cost analysis, the cataloging and book processing (cbp) profile was used to identify each unit process, as well as to provide the basic data on the number and level of staff and the time required to perform each process. additionally, for the automated system, the cbp profile sheets, together with wln invoices (see figure 1 for sample invoice) and wln monthly activity reports (see figure 2 for sample activity report), were used to determine the average number of items processed per month.
for example, since about 85 percent of the cataloging done in tsd is via wln, it was possible to derive exact figures from wln invoices for the average number of items cataloged per month. the wln invoices also differentiated between data-base copy cataloging and original data entry. the cbp profile sheets were used to determine the average number of non-wln items cataloged. using a combination of wln invoice and profile data, a chart was constructed of the average number of items searched and cataloged per month under the automated system (see table 5). in order to make costs comparable, an assumption was made that the same average number of items was searched and cataloged under the previous manual system, and a similar chart was made for it (see table 6). in reality, the available staff under the manual system could not process the same amount of material per month.

table 4. technical services division work assignment/staffing profile: november 1978
unit: cataloging and book processing. subunit: lc copy editing.

  tasks or processes                                         average number     average time   average number   total staff    level of staff: staff hours
                                                             of items received  per item       of items         hours needed   available at designated
                                                             for processing                    processed        per task       staff level
  order card sets, check item against data base,
    enter holdings (monos)                                   2100/mo            6 min/item     10/hr            210/mo         lt i 124.1; lt ii 85.9
  order card sets, check item against data base,
    enter holdings (serials)                                 63/mo              6 min/item     10/hr            6.3/mo         lt iii 6.3
  prepare worksheets                                         210/mo             10 min/item    6/hr             35/mo          lt i 17.5; lt ii 17.5
  prepare tsd series cards                                   126/mo             2 min/item     30/hr            4.2/mo         lt i 4.2
  do series check                                            350/mo             2 min/item     30/hr            11.7/mo        lt i 22.3
  update cip records                                         134/mo             10 min/item    6/hr             22.3/mo        lt ii 22.3
  input original cataloging data (mono)                      210/mo             25 min/item    2.4/hr           87.5/mo        lt ii 87.5
  input original cataloging data (serials)                   21/mo              25 min/item    2.4/hr           8.75/mo        lt iii 8.75
  process "rush" monographs                                  168/mo             15 min/item    4/hr             42/mo          lt ii 42
  process corrections, data base information                 360/mo             10 min/item    6/hr             60/mo          lt ii 30; sup ii 30
  receive materials, sort series                             2100/mo            3 items/min    180/hr           11.6/mo        lt ii 11.6
  resolve problems; locate materials                         na                 na             na               31.2/mo        lt ii 18.6; lt iii 13
  prepare and sort series decisions materials                168/mo             8 min/item     7.5/hr           22.4/mo        lt iii 22.4
  sort mail                                                  na                 na             na               42/mo          lt iii 42

fig. 1. washington library network customer invoice.

  washington library network customer invoice (rbsbill rpt b1041)
  agency invoiced: 0002 washington state university, holland library, pullman wa 99164 (allene f schnaitter)
  billing date 12/31/79; ref. invoice no. 000001311; page no. 0001
  invoiced expenditure breakdown: account number/system 4000 00; recurring charges-bib system

  services charges                      quantity     units           total charges   credits   net charges
  com catlg processing w/s hold         18,750.00    @ 4¢ a title         750.00                    750.00
  com catlg fiche copies                   459.00    @ 15¢ a copy          68.85                     68.85
  online-attach sum hold col 1             810.00    @ $1.60 recrd      1,296.00                  1,296.00
  online-req cat cards-col 1             1,003.00    @ 15¢ each           150.45                    150.45
  online input of bib rec col 1            378.00    @ 14¢ each            52.92                     52.92
  online inquiry into database           5,335.00    @ 6.9¢ each          368.11                    368.11
  catalog cards                          5,541.00    @ 5.5¢ card          304.75                    304.75
  total services charges                                                2,991.08                  2,991.08
  total charges                                                         2,991.08 *                2,991.08 *

fig. 2. washington library network monthly activity report (selective sample).

  monthly activity report for period 11/01/79 to 11/30/79
  columns: library; total holdings as of 11/30/79; holdings added; records input; contribution factor; rcps from acq 11/01 to 11/28; orders created; inquiry transactions
  wapac                2,059   38.0%   311
  wap1p                41,549   416   385   92.5%   588   1,472   6,607
  wapoh                33,801   566   89   15.7%   616   5,243
  waps (wsu library)   44,866   1,630   197   12.0%   2013   1,674   1 9 013

in the cost analysis of the automated system, the monthly wages for staff members of the cataloging and book processing unit were based on current monthly salaries (as of february 1980) plus estimated fringe benefits (21 percent). the total wages were added together for each level of staff and divided by the number of staff at that level to give an average monthly wage. this average was then divided by 174 (the standard figure for university staff hours per month) to determine the average hourly rate. to calculate staff costs per minute, it was necessary to carry the per-minute costs to the third decimal to approximate the total dollars expended for staffing (see table 7). no other indirect costs, e.g., breaks, annual leave, or holidays, were included in staff wages; however, in order to determine the staff hours available to perform the functions being analyzed, nonproductive hours or staff hours devoted to other assignments had to be calculated and deducted.
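the per-minute derivation described above can be sketched in python. this is a minimal illustration rather than the author's own worksheet: the function names are hypothetical, and the sample figures (lt iii salaries of $2,024 per month for two staff, and $1,456 per month for 809 timeslip hours) are taken from table 7. the printed tables appear to cut some per-minute values off at the third decimal, so other staff levels may differ from this sketch in the last digit.

```python
STANDARD_HOURS_PER_MONTH = 174  # standard university staff hours per month, per the article
FRINGE_RATE = 0.21              # estimated fringe benefits, per the article


def monthly_cost_with_fringe(monthly_salaries, fringe_rate=FRINGE_RATE):
    """Total monthly cost of one staff level: salaries plus fringe benefits."""
    return monthly_salaries * (1 + fringe_rate)


def per_minute_rate(monthly_cost, staff_count=1,
                    hours_per_month=STANDARD_HOURS_PER_MONTH):
    """Average cost per staff minute for one staff level (hypothetical helper)."""
    average_monthly_wage = monthly_cost / staff_count
    hourly_rate = average_monthly_wage / hours_per_month
    return hourly_rate / 60


# lt iii: two staff, $2,024 in monthly salaries before fringe (table 7)
lt_iii = per_minute_rate(monthly_cost_with_fringe(2024), staff_count=2)

# timeslip: $1,456 per month spread over the 809 hours actually worked (table 7)
timeslip = per_minute_rate(1456, hours_per_month=809)

print(f"lt iii: {lt_iii:.3f}/min, timeslip: {timeslip:.3f}/min")
```

timeslip staff are costed against hours actually worked rather than the 174-hour standard, which is why the helper takes the hours as a parameter.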
the nonproductive hours were calculated according to the following formula:

  committee assignment (varied)                                  __ hours/year
  unit meetings (varied)                                         __ hours/year
  breaks (standard)                                             120 hours/year
  annual leave (varied)                                          __ hours/year
  holidays (standard)                                            88 hours/year
  sick leave (standardized, based on hours earned per month)     96 hours/year
  total                                                          __ hours/year ÷ 12 = __ hours/month

the primary reasons for variation in the nonproductive hours were length of service and whether a staff member was faculty or classified.

staff costs under the manual system were based on current monthly wages; however, the number and level of staff are essentially those that existed at the time the manual system was functioning (see table 8). timeslip costs were not based on the minimum hourly wage, since a large number of hours were work/study during the period of the analysis. the total monthly expenditure was divided by the total hours worked, and then by 60, to derive the per-minute timeslip costs. no effort was made to reconstruct actual timeslip costs under the manual system, but the same per-minute timeslip costs were used in order to avoid unnecessary skewing of staff costs under the manual system.

table 5. type and average number of items searched/cataloged per month on automated system (based on wln invoice data and cbp work assignment/staffing profile)

                                    searched       found/month    not            nuc
                                    (wln)/month                   found/month    searched/month
  book approvals                     600           420 (70%)       180
  firm orders                        700           406 (58%)       294
    form approvals                                 244 (60%)
    regular                                        162 (40%)
  new acquisitions (re-searched)     295            90 (30%)       205
  precats                           1380           414 (30%)       966
  documents                          125            25 (20%)       100            50
  serials                            100            10 (10%)        90            30
  rush                                75            32 (42%)        43            43
  gifts                              100             5 (5%)         95            95
  monographic series                 300           120 (40%)       180
  originals                          222             0 (0%)        222           222
  reinstates                          75             7 (10%)        68            68
  total                             3972          1529 (38.5%)    2443           508

  type and quantity of bibliographic data found in data base
    lc copy                 1376
    cip copy (10%)           153
    total                   1529
  type and quantity of original data entry
    monographs               192
    serials                   30
    nuc/lc                    95
    total                    317
  total materials cataloged
    wln data base copy      1529
    wln original data entry  317
    non-wln microform        407
    non-wln music             10
    total                   2263

data base costs

the per-unit costs of using the wln bibliographic system, both for performing processes and securing products, were based on the 1979-80 wln schedule of charges. the average number of items processed was derived from the wln invoices. the per-record cost of the com catalog was calculated by taking the total costs of producing the com catalog from july 1979 to february 1980 and dividing these costs by the number of titles contained in the com catalog.

although the wln schedule of charges stipulates a charge of $.069 (6.9 cents) per data-base inquiry, three kinds of processes allow a given number of inquiries without charge. since not all allowable inquiries are always used for these processes, there are generally a number of inquiries which can be made without charges being assessed. between july 1979 and february 1980, the average number of monthly inquiries for which there was a charge was 11,800; the average number per month for which there was no charge assessed was 8,044. for this reason, in the cost analysis of the automated system (table 2(a)), there appears a category "items searched, no inquiry charges" under the bibliographic searching section.
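the effect of the no-charge inquiries can be illustrated with a short python sketch. it uses only the averages quoted above (11,800 charged and 8,044 uncharged inquiries per month) and the 6.9-cent rate shown on the sample invoice; the function names are hypothetical, and the result is an illustration of the arithmetic, not a figure reported in the article.

```python
INQUIRY_CHARGE = 0.069  # dollars per charged data-base inquiry (6.9 cents)


def monthly_inquiry_cost(charged_inquiries, rate=INQUIRY_CHARGE):
    """Dollars billed for the inquiries that exceed the free allowance."""
    return charged_inquiries * rate


def share_without_charge(charged_inquiries, free_inquiries):
    """Fraction of all inquiries for which no charge was assessed."""
    return free_inquiries / (charged_inquiries + free_inquiries)


charged, free = 11_800, 8_044  # monthly averages, july 1979 to february 1980

billed = monthly_inquiry_cost(charged)
free_share = share_without_charge(charged, free)

print(f"billed: ${billed:,.2f}/month; {free_share:.1%} of inquiries carried no charge")
```

because a sizable share of inquiries carries no charge, searches have to be costed in two groups, which is what the separate category in table 2(a) does.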
(note: part of the "no charge" inquiries are generated and used by the acquisitions unit and are therefore not included in this analysis.)

table 6. type and average number of items searched/cataloged per month on manual system (based on cbp work assignment/staffing profile)

                                    searched       found/month    not            nuc
                                    (idc)/month                   found/month    searched/month
  book approvals                     600           300 (50%)       300
  firm orders                        700           280 (40%)       420           420
  new acquisitions (re-searched)     295            59 (20%)       236
  precats                           1380           276 (20%)      1104
  documents                          125            12 (10%)       113           113
  serials                            100             5 (5%)         95            95
  rush                                75            23 (30%)        52            52
  gifts                              100             5 (5%)         95            95
  monographic series                 300            90 (30%)       210           210
  originals                          222             0 (0%)        222           222
  reinstates                          75             7 (10%)        68            68
  total                             3972          1057 (26.5%)    2915          1275

  type and quantity of materials cataloged
    idc copy                1057
    modified copy            984
    original cataloging      222
    total                   2263

although the terminal service and line charges might simply have been added as a total amount to the data-base costs, it seemed more meaningful to distribute these costs on a per-use basis. the method used to distribute these charges was to identify each use of the bibliographic data base, and to divide the total monthly costs of terminals and lines by the total monthly units of use (see table 9). this method of distributing terminal service and line charges not only provided per-unit terminal use costs, but also served to categorize kinds and quantity of data-base use.

subscription and material costs

subscription costs include only those bibliographic tools purchased for use in tsd for the purpose of bibliographic searching. as a result of the increased growth of the bibliographic data base, fewer tools are being used for searching under the automated system than under the manual system. prior to the implementation of wln, the library subscribed to bibliographic data (lc and cip copy) on microfiche supplied by the information dynamics corporation (idc).
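both the terminal charges and the subscription costs are spread over units of use in the same way: a total cost divided by the number of uses. the python sketch below shows that arithmetic with the figures from table 9 (terminal and line charges) and table 2(a) (subscriptions for the manual nuc search). the helper name and the dictionary layout are assumptions made for illustration.

```python
def per_unit_cost(total_cost, units_of_use):
    """Spread a total cost evenly over the units that consumed it."""
    if units_of_use <= 0:
        raise ValueError("units_of_use must be positive")
    return total_cost / units_of_use


# terminal service and line charges per month (table 9): 5.5 terminals and 5.5 lines
terminal_and_line_charges = 5.5 * 140 + 5.5 * 40

# monthly units of data-base use by category (table 9)
data_base_use = {
    "searching": 10688,
    "cataloging (data base copy)": 1529,
    "cataloging (original data entry)": 317,
    "authority verification": 2219,
    "bibliographic changes/corrections": 360,
    "ill, ref, general": 537,
}
total_units = sum(data_base_use.values())

cost_per_terminal_use = per_unit_cost(terminal_and_line_charges, total_units)

# subscriptions used for manual searching: $1,860/yr over 6,096 searches/yr (table 2(a))
subscription_cost_per_search = per_unit_cost(1860, 6096)

print(f"{cost_per_terminal_use:.2f} per terminal use, "
      f"{subscription_cost_per_search:.2f} per manual search")
```

counting each kind of use separately, as the dictionary does, is what lets the same tally double as a profile of how the data base is used.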
the per-unit costs of all subscriptions are presented in the cost analysis charts (tables 1(a) and 2(a)).

material costs include only those materials unique to cataloging and book processing; general supplies, such as pencils and paper, are not included. the calculation of the per-unit cost of most materials is generally straightforward. it should be noted, however, that under the automated system, products, i.e., materials, are included in the data-base costs, and only those materials used independent of the data base, e.g., book pockets and book cards, are listed as material costs on the charts. under the manual system, due to the divisional arrangement of the library system and the number of card catalogs being maintained, the formula for producing sets of cards for a single title was complex. for this reason, the costs and number of cards produced for the titles cataloged per month are listed as a separate line item.

table 7. staff costs: automated cataloging and book processing system

  staff costs/month
                                    salaries/month   plus 21% (fringe benefits)   costs/month
  classified staff
    oa ii                           $  912           $192                         $ 1,104
    lt i (4)                         2,888            606                           3,494
    lt ii (4)                        3,269            686                           3,955
    lt iii (2)                       2,024            425                           2,449
    supervisor ii (2)                2,578            541                           3,119
    subtotal                                                                      $14,121
  faculty
    catalogers (3½) (monos)         $4,691           $985                         $ 5,676
    unit head                        1,774            373                           2,147
    subtotal                                                                      $ 7,823

  staff costs/minute
    timeslip             $1,456/mo ÷ 809 hrs = 1.80/hr ÷ 60 = .03/min
    oa ii                1,104/mo ÷ 174 = 6.34/hr ÷ 60 = .105/min
    lt i (4)             3,494/mo ÷ 4 = $874/mo ÷ 174 = 5.02/hr ÷ 60 = .083/min
    lt ii (4)            3,955/mo ÷ 4 = $989/mo ÷ 174 = 5.68/hr ÷ 60 = .094/min
    lt iii (2)           2,449/mo ÷ 2 = $1,225/mo ÷ 174 = 7.04/hr ÷ 60 = .117/min
    supervisor ii (2)    3,119/mo ÷ 2 = $1,560/mo ÷ 174 = 8.97/hr ÷ 60 = .15/min
    catalogers (3½)      5,676/mo ÷ 3.5 = $1,622/mo ÷ 174 = 9.32/hr ÷ 60 = .155/min
    unit head            2,147/mo ÷ 174 = 12.34/hr ÷ 60 = .205/min

  total staff costs/month
    timeslip (809 hrs @ $1,456/mo)    $ 1,456
    special projects librarian            345*
    classified staff                   14,121
    faculty                             7,823
    total (all staff)                 $23,745
  *amount of time (wages) assigned to cataloging.

equipment costs

equipment costs include only equipment unique to cataloging and book processing, i.e., required for processing or products. general equipment, such as desks, book trucks, and typewriters, is not included.

equipment-automated system

during the period covered by the cost analysis, november 1977 to february 1980, the following equipment was purchased for the automated system:

  7 bibliographic terminals                  $24,360
  10 modems or modem contention units          5,433
  2 printers                                   6,500
                                             $36,293
  tax                                          1,887
                                             $38,180

two pieces of equipment are currently being leased (maintenance included):

  keypunch      @ $ 92.61
  verifier      @  101.12
                  $193.73/month

summary of monthly equipment costs

  purchases (5-year amortization)    $636.33
  maintenance                          60.00
  leased equipment                    193.73
                                     $890.06/month

equipment-manual system

if the automated system had not been implemented, the following equipment would have been purchased during this period:

  2 card catalogs       $ 3,755
  5 kardex units          4,475
  2 linedex units         2,944
                        $11,174
  tax                       581
                        $11,755

although the anticipated life span of this equipment should be considerably greater than that of terminals and modems, it has also been amortized over a five-year period. the rationale for this period of amortization is that the rate of growth of the files for which the equipment is used results in the purchase of additional equipment equivalent to the expected replacement of electronic equipment. therefore, the initial cost of these purchases amortized would have been $196/month. since the multilith has been owned by the library for more than twenty years, its purchase price is not applicable to this analysis. however, maintenance on the multilith is $72.24/month. two pieces of equipment were being leased under the manual system (maintenance included):

  keypunch      @ $ 92.61
  verifier      @  101.12
                  $193.73/month

summary of monthly equipment costs

  purchases (5-year amortization)    $196.00
  maintenance                          72.27
  leased equipment                    193.73
                                     $462.00/month

table 8. staff costs: manual cataloging and book processing system (based on the 1977 staffing levels at current staff costs)

  staff costs/month
                                    salaries/month   plus 21% (fringe benefits)   costs/month
  classified staff
    oa ii-typing                    $  912           $  192                       $ 1,104
    lt i (11)                        7,950            1,670                         9,622
    lt ii (3)                        2,434              511                         2,945
    lt iii (5)                       5,060            1,063                         6,123
    supervisor i (2)                 2,175              457                         2,632
    supervisor ii                    1,289              271                         1,560
    offset duplicator operator       1,135              238                         1,373
    subtotal                                                                      $25,359
  faculty
    catalogers (3.5)                 4,691              985                         5,676
    unit head                        1,774              373                         2,147
    subtotal                                                                      $ 7,823

  staff costs/minute
    timeslip                      $2,174/mo ÷ 1208 hrs = 1.80/hr ÷ 60 = .03/min
    oa ii-typing                  1,104/mo ÷ 174 = 6.34/hr ÷ 60 = .105/min
    lt i (11)                     9,622/mo ÷ 11 = 875/mo ÷ 174 = 5.03/hr ÷ 60 = .084/min
    lt ii (3)                     2,945/mo ÷ 3 = 982/mo ÷ 174 = 5.64/hr ÷ 60 = .094/min
    lt iii (5)                    6,123/mo ÷ 5 = 1,225/mo ÷ 174 = 7.04/hr ÷ 60 = .117/min
    supervisor i (2)              2,632/mo ÷ 2 = 1,316/mo ÷ 174 = 7.56/hr ÷ 60 = .126/min
    supervisor ii                 1,560/mo ÷ 174 = 8.97/hr ÷ 60 = .149/min
    offset duplicator operator    1,373/mo ÷ 174 = 7.89/hr ÷ 60 = .13/min
    catalogers (3.5)              5,846/mo ÷ 3.5 = 1,670/mo ÷ 174 = 9.60/hr ÷ 60 = .155/min
    unit head                     2,147/mo ÷ 174 = 12.34/hr ÷ 60 = .205/min

  total staff costs/month
    timeslip (1208 hrs @ $2,174/mo)   $ 2,174
    classified staff                   25,359
    faculty                             7,823
    total (all staff)                 $35,356

table 9. bibliographic data base use per month (one unit = one access to or process in data base)

  category                                   quantity of terminal use
  searching                                  10688
  cataloging (data base copy)                 1529
  cataloging (original data entry)             317
  authority verification (317 x 7)            2219
  bibliographic changes/corrections            360
  ill, ref, general                            537
  total units                                15650

  wln terminal service and telecommunication line charges/month
    5½ terminals @ $140/mo = $770/mo
    5½ lines @ $40/mo      =  220/mo
                             $990/mo
  $990 ÷ 15650 = $.06/terminal use for cataloging and book processing system

table 10. cataloging and book processing system: summary comparison costs by function (excluding equipment costs)

  function                                             number of items    costs per month
  manual system
  1. bibliographic searching                           5247               $ 4,617
  2. bibliographic record production
     (cost of catalog cards distribution)              [2263]*            [$13,143]†
       lc copy cataloging                              1057                 4,092
       modified copy cataloging                         984                 5,021
       original cataloging                              222                 3,343
       miscellaneous                                    na                    687
  3. bibliographic record maintenance                   na                  6,402
  4. marking                                            na                  4,631
  total                                                                   $28,793

  automated system
  1. bibliographic searching                           4480               $ 2,170
  2. bibliographic record production
     (cost of catalog cards included)                  [2263]*            [$11,837]†
       lc and cip copy cataloging                      1529                 4,900
       modified copy cataloging                         512                 1,523
       original cataloging                              222                 3,343
       miscellaneous                                    na                  2,071
  3. bibliographic record maintenance                   na                  4,194
  4. marking                                            na                  4,589
  total                                                                   $22,790

  *total of items listed below.
  †total of costs listed below.

summary and conclusion

the cost analysis clearly indicates that at washington state university libraries the automated cataloging and book processing system is less expensive than its previous manual system. by using the bibliographic component of the washington library network, the library has reduced the costs of searching, cataloging, and record maintenance by almost 20 percent (see table 10, summary comparison costs by function). the higher costs of the manual system are essentially staff costs. under that system, eleven more staff and 1,365 more timeslip hours were needed per month to process the same amount of materials as is processed under the automated system.
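the equipment amortization and the overall comparison can be checked with a few lines of python. this is a sketch of the arithmetic only: the category totals are copied from table 3 and the equipment figures from the equipment section, while the function and variable names are hypothetical.

```python
def monthly_amortization(purchase_cost, years=5):
    """Straight-line monthly cost of a purchase amortized over a number of years."""
    return purchase_cost / (years * 12)


# automated system equipment: amortized purchases + maintenance + leased equipment
automated_equipment = monthly_amortization(38_180) + 60.00 + 193.73

# monthly costs by category (table 3); the manual system has no data base charges
manual = {"staff": 25_775, "subscriptions": 1_076,
          "materials": 1_942, "equipment": 462}
automated = {"staff": 16_849, "data base": 5_480, "subscriptions": 157,
             "materials": 304, "equipment": 890}

manual_total = sum(manual.values())
automated_total = sum(automated.values())
monthly_saving = manual_total - automated_total
saving_share = monthly_saving / manual_total

print(f"equipment: ${automated_equipment:.2f}/month")
print(f"saving: ${monthly_saving:,}/month, ${monthly_saving * 12:,}/year "
      f"({saving_share:.1%} of the manual total)")
```

with equipment included, the saving works out to a little over 19 percent of the manual system's monthly total, in line with the "almost 20 percent" reduction stated above.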
in fact, compared to the staff costs of both the manual and automated systems, the costs of equipment, data-base use (including products), terminal service, and telecommunication lines of the automated system are a relatively small percentage (27 percent) of the total cataloging and book processing costs. this analysis serves to underscore a basic reality of the current library organization: personnel is one of its largest expenditures and staff-intensive systems are very costly.

this cost analysis has not directly addressed the issue of the quality of processing and products of either the manual or automated systems. the analysis suggests, however, that the automated system is more efficient in terms of staff time. moreover, the tsd staff has found that not only can more be done with fewer staff, but the automated system also provides more accurate data and has the flexibility to accommodate with relative ease the many corrections and changes that must be made to the library's bibliographic files.

joselyn druschel is assistant director for automation and technical support at the washington state university libraries. she is currently chairing a staff task force which is developing specifications for the libraries' on-line catalog.

164 information technology and libraries | december 2009

“discovery” focus as impetus for organizational learning
jennifer l. fabbi

the university of nevada las vegas libraries’ focus on the concept of discovery and the tools and processes that enable our users to find information began with an organizational review of the libraries’ technical services division. this article outlines the phases of this review and subsequent planning and organizational commitment to discovery. using the theoretical lens of organizational learning, it highlights how the emerging focus on discovery has provided an impetus for genuine learning and change.
the university of nevada las vegas (unlv) libraries’ focus on the concept of discovery and the tools and processes that enable our users to find information stemmed from the confluence of several initiatives. however, a significant path that is directly responsible for the increased attention on discovery leads through one unit in unlv libraries—technical services. this unit, consisting of the materials ordering and receiving (acquisitions) and bibliographic and metadata services (cataloging) departments, had been without a permanent director for three years when i was asked to take the interim post in april 2008. while the initial expectation was that i would work with the staff to keep technical services functioning while we performed our third search for a permanent director, it became clear after three months that, because of nevada’s budgetary limitations, we would not be able to go forward with a search at that time. as all personnel searches in unlv libraries were frozen, managers and staff across the divisions moved quickly to reassign staff with the aim of mitigating the effects of staff vacancies. the library administrators were divided over what the solution would be for technical services: split up the division—for which we had trouble recruiting and retaining a leader in the past—and divvy up its functions among other divisions in the libraries, or continue to hold down the fort while conducting a review of technical services that would inform what it might become in the future. other organizations have taken serious looks at, and provided roadmaps of, how their organizations’ focus of technical services will change in the future.1 the latter route was chosen, and the review—eventually dubbed revisioning technical services—led directly to the inquiries and activities documented in this ital special issue.
detailing the process of revisioning technical services and using the theoretical lens of organizational learning, i will demonstrate how the libraries’ emerging focus on the concept of discovery has provided an impetus for genuine learning and change.

jennifer l. fabbi (jennifer.fabbi@unlv.edu) is special assistant to the dean at the university of nevada las vegas libraries.

organizational learning

in images of organization, morgan devotes a chapter to theories of organizational development that characterize organizations using the metaphor of the brain.2 based on the principles of modern cybernetics, argyris and schön provide a framework for thinking about how organizations can learn to learn.3 while many organizations have become adept at single-loop learning—the ability to scan the environment, set objectives, and monitor their own general performance in relation to existing operating norms—these types of systems are generally designed to keep the organization “on course.” double-loop learning, on the other hand, is a process of learning to learn, which depends on being able to take a “double look” at the situation by questioning the relevance of operating norms (see figure 1). bureaucratized organizations have fundamental organizing principles, including management hierarchy and subunit goals that are seen as ends in themselves, which can actually obstruct the learning process.

figure 1. single- and double-loop learning. source: learning-org discussion pages, “single and double loop learning,” learning-org dialog on learning organizations, http://www.learning-org.com/graphics/lo23374singledll.jpg (accessed aug. 11, 2009).
to become skilled in the art of double-loop learning, organizations must avoid getting trapped in single-loop processes, especially those created by “traditional management control systems” and the “defensive routines” of organizational members.4 according to morgan, cybernetics suggests that learning organizations must develop capacities that allow them to do the following:5

■ scan and anticipate change in the wider environment to detect significant variations by
  □ embracing views of potential futures as well as of the present and the past;
  □ understanding products and services from the customer’s point of view; and
  □ using, embracing, and creating uncertainty as a resource for new patterns of development.
■ develop an ability to question, challenge, and change operating norms and assumptions by
  □ challenging how they see and think about organizational reality using different templates and mental models;
  □ making sure strategic development does not run ahead of organizational reality; and
  □ developing a culture that supports change and risk taking.
■ allow an appropriate strategic direction and pattern of organization to emerge by
  □ developing a sense of vision, norms, values, limits, or “reference points” to guide behavior, including the ability to question the limits being imposed;
  □ absorbing the basic philosophy that will guide appropriate objectives and behaviors in any situation; and
  □ placing as much importance on the selection of the limits to be placed on behavior as on the active pursuit of desired goals.

unlv libraries’ revisioning technical services process and the resulting organizational focus on discovery are outlined below, and the elements identifying unlv libraries as a learning organization throughout this process are highlighted (see appendix a).
revisioning technical services

this review of technical services was a process consisting of several distinct steps over many months, and each step was informed by the data and opinions gained in the prior steps:

phase 1: technical services baseline, focusing on the nature of technical services work at unlv libraries, in the library profession, and factors that affect this work now and in the future
phase 2: organizational call to action, engaging the entire organization in shared learning and input
phase 3: summit on discovery, shifting significantly away from technical services and toward the concept of discovery of information and the experience of our users

technical services baseline

the first phase of the process, which i called the “technical services baseline,” included a face-to-face meeting with me and all technical services staff. we talked openly about the challenges that we faced, options on the table for the division and why i thought that taking on this review would be the best course to pursue, and goals of the review. outcomes of the process were guided by the dean of libraries, were written by me, and received input from technical services staff, resulting in the following goals:

1. collect input about the kinds of skills and leadership we would like to see in our new technical services director. (while creating these goals, we were given the go-ahead to continue our search for a new director.)
2. investigate the organization of knowledge at a broad level—what is the added value that libraries provide?
3. increase overall knowledge of professional issues in technical services and what is most meaningful for us at unlv.
4. encourage technical services staff to consider current and future priorities.

after establishing these goals, i began to document information about the process on unlv libraries’ staff website (figure 2) so that all staff could follow its progress.
with the feedback i received at the face-to-face meeting and guided by the stated goals of the process, i gave technical services staff a series of three questions to answer individually:

1. what do you think the major functions of technical services are? examples are “cataloging physical materials” and “ordering and paying for all resources purchased from the collections budget.”
2. what external factors—in librarianship and otherwise—should we be paying the most attention to in terms of their effect on technical services work? examples are “the ways that users look for information” and “reduction of print book and serials budgets.” feel free to do a little research on this question and provide the sources of the information that you find.
3. what are the three highest priority/most important tasks on your to-do list right now?

eighteen of twenty staff members responded to the questions. i then analyzed the twenty pages of feedback according to two specific criteria: (1) i paid special attention to phrases that indicated an individual’s beliefs, values, or philosophies to identify potential sources of conflict as we moved through the process; and (2) i looked for priority tasks listed that are not directly related to the individual’s job duties, as many of them were indicators of work stress or anxiety related to perceived impending change.

during this phase, organizational learning was initiated through the process of challenging how technical services staff and others viewed technical services as a unit in the organization, and through the creation of shared reference points to guide our future actions.
while beginning a dialogue about a variety of future management options for technical services work functions may have raised levels of anxiety within the organization, it also invited administration and staff to question the status quo and consider alternative modes of operation within the context of efficiency.6 in addition to thinking about current realities and external influences, staff were asked to participate in generating outcomes to guide the review process. these shared goals helped to develop a sense of coherence for what started out as a very loose assignment—a review that would inform what the unit might become in the future.

organizational call to action

the next phase of the process, “a call to action,” required library-wide involvement and input. while i knew that this phase would involve a library staff survey, i also wanted all staff responding to the survey to have a basic knowledge of some of the issues that are facing library technical services today. using input from the two technical services department heads, i selected two readings for all library staff: bothmann and holmberg’s chapter on strategic planning for electronic resource management addresses many of the planning, policy, and workflow issues that unlv libraries has experienced;7 and coyle’s article on information organization and the future of the library catalog offers several ideas for ensuring that valuable information is visible to our users in the information environments they are using.8 i also asked the library staff to visit the university of nebraska–lincoln’s “encore catalog search” (http://iris.unl.edu) and go through the discovery experience by performing a guided search and a search on a topic of their choice. they were then asked to ponder what collections of physical or digital resources we currently own at the libraries that are not available from the library catalog.
after completing these steps, i directed library staff to a survey of questions related to the importance of several items referenced in the articles in terms of the following unlv libraries priorities:

■ creating a single search interface for users pulling together information from the traditional library catalog as well as other resources (e.g., journal articles, images, archival materials)
■ considering non–marc records in the library catalog for the integration of nontraditional library and nonlibrary resources into the catalog
■ linking to access points for full-text resources from the catalog
■ creating ways for the catalog to recommend items to users
■ creating metadata for materials not found in the catalog
■ creating “community” within the library catalog
■ implementing an electronic resource management system (erms) to help manage the details related to subscriptions to electronic content
■ implementing federated searching so that users can search across multiple electronic resource interfaces at once
■ making electronic resource license information available to library staff and patrons

there also were several questions asking library staff to prioritize many of the functions that technical services already undertakes to some extent:

■ cataloging specialized or unique materials
■ cataloging and processing gift collections
■ ensuring that full-text electronic access is represented accurately in the catalog
■ claiming and binding print serials
■ ordering and receiving physical resources
■ ordering and receiving electronic resources
■ maintaining and communicating acquisitions budget and serials data

figure 2. project’s wiki page on staff website

the survey asked technical services staff to “think of your current top three priority to-do items.
in light of what you read and what you think is important for us to focus on, how do you think your work now will have changed in five years?” all other library staff members were asked to respond to the following: 1. please list two ways that technical services supports your work now. 2. please list two things you would like technical services to start doing in support of your work now. 3. please list two things you think technical services can stop doing now. 4. please list two things technical services will need to begin doing to support your work in the next five years. finally, the survey included ample opportunity for additional comments. fifty-eight staff members (over half of all library staff) completed the readings, activity, and survey. i analyzed the information to inform the design of subsequent phases of revisioning technical services. the dean of libraries’ direct reports then reviewed the design. in addition, many library staff contributed additional readings and links to library catalogs and other websites to add to the revisioning technical services staff webpage. throughout this phase, the organization was invited into the learning process through engagement with shared reference points, the ability to question the status quo, and the ability to embrace views of potential futures as well as of the present and the past.9 the careful selection of shared readings and activities created coherence among the staff in terms of thinking about the future, but these ideas also raised many questions about the concept of discovery and what route unlv libraries might take. the survey allowed library staff to better understand current practices in technical services, to prioritize new ideas against these practices, and to think about future options and their potential impact on their individual work as well as the collective work of the libraries. 
summit on discovery

in the third phase of this process, “the discovery summit,” focus began to shift significantly from technical services as an organizational unit to the concept of discovery and what it means for the future of unlv libraries. during this half-day event, employing a facilitator from off campus, the dean of libraries and i designed a program to fulfill the following desired outcome: through a process of focused inquiry, observation, and discussion, participants will more fully understand the discovery experience of unlv libraries users. the event was open to all library staff members; however, individuals were required to rsvp and complete an activity before the day of the event. (the facilitator worked specifically with the technical services staff at a retreat designed to prepare for upcoming interviews for technical services director candidates.) participants were each sent a “summit matrix” (see appendix b) ahead of time, which asked them to look for specific pieces of information by doing the following:

1. search for the information requested with three discovery tools as your starting points: the libraries’ catalog, the libraries’ website, and a general internet search engine (like google).
2. for each discovery tool, rate the information that you were able to find in terms of “ease of discovery” on a scale of 1 (lowest ease—few results) to 5 (highest ease—best results).
3. document the thoughts and feelings you had and/or process you went through in searching for this information.
4. answer this question: do you have other preferred starting points when looking for information that the libraries own or provide access to?

the information that staff members were asked to search for using each discovery tool was mostly specific to the region of southern nevada, such as, “i heard that henderson (a city in southern nevada) started as a mining community.
does unlv libraries have any books about that?” and “find any photograph of the gay pride parade in las vegas that you can look at in unlv libraries.” during the summit, the approximately sixty participants were asked to discuss their experiences searching for the matrix information, including any affective component to their experience, and they were asked to specify criteria for their definition of “ease of discovery.” next, we showed video footage from end-user usability testing of a unlv professor, a human resources employee, and a unlv librarian going through similar discovery exercises. after each video, we discussed these users’ experiences—their successes, failures, and frustrations—and the fact that even our experts were unable to discover some of this information. finally, we facilitated a robust brainstorming session on initiatives we could undertake to improve the discovery experience of our users. [editor’s note: read more about this usability testing in “usability as a method for assessing discovery” on page 181 of this issue.]

during the wrap-up of the discovery summit, the final phase of this initial process, the discovery mini-conference, was introduced. a call for proposals for library staff to introduce or otherwise present discovery concepts to other library staff was distributed. this call tied together the revisioning technical services process to date and also connected the focus on discovery to the libraries’ upcoming strategic planning process. this strategic planning process, outlining broad directions for the libraries to focus on for the next two years, would be the first time we would use our newly created evaluation framework. we focused on the concepts of discovery, access, and use, all tied together through an emphasis on the user.
all library staff members were invited to submit a poster session or other visual display on various themes related to discovery of information to add to our collective and individual knowledge bases and to better understand our colleagues’ philosophies and positions on discovery. in addressing one of six mini-conference themes listed below, all drawn directly from the revisioning technical services survey results, potential participants were asked to consider the question, “what are your ideas for ways to improve how users find library resources?”

■ single search interface (federated searching, harvester-type platform, etc.)
■ open source vs. vendor infrastructure
■ information-seeking behavior of different users
■ social networking and web 2.0 features as related to discovery
■ describing primary sources and other unique materials for discovery
■ opening the library catalog for different record types and materials

proposals could include any of these perspectives:

■ an environmental scan with a summary of what you learn
■ a visual representation of what you would consider improvement or success
■ a position for a specific approach or solution that you advocate

ultimately, we had seventeen distinct projects involving twenty-four staff members for the afternoon mini-conference. it was attended by approximately seventy additional staff members from unlv libraries as well as representatives from institutions who share our innovative system. we collected feedback on each project in written form and electronically after the mini-conference. mini-conference content was documented on its own wiki pages and in this special issue of ital.
during this phase of the revisioning technical services process, there was an emphasis on understanding our services from the customers’ point of view, a hallmark of a learning organization.10 during the discovery summit, we aimed to transform frustration and uncertainty over the user experience of the services we are providing into a motivation to embrace potential futures. the mini-conference utilized the discovery themes that had evolved throughout the revisioning technical services process to provide a cohesive framework for library staff members to share their knowledge and ideas about discovery systems and to question the status quo.

organizational ownership of discovery: strategic planning and beyond

through the phases of the revisioning technical services process outlined above, it should be evident how the concept of discovery, highlighted during the process, moved from being focused on technical services to being owned by the entire organization. while the vocabulary of discovery had previously been owned by pockets of staff throughout unlv libraries, it has now become a common lexicon for all. the libraries’ evaluation framework, which includes discovery, had set the stage for our upcoming organizational strategic plan. just prior to the discovery summit, the dean of libraries’ direct reports group began to discuss how it would create a strategic plan for the 2009–11 biennium. it became increasingly apparent how important a focus on discovery would be in this process, and that we needed to time our planning right, allowing the organization and ourselves time to become familiar with the potential activities we might commit to in this area before locking into a strategic plan.

the dean’s direct reports group first spent time crafting a series of strategic directions to focus on in the two-year time period we were planning for.
rather than give the organization specific activities to undertake, the strategic directions were meant to focus our new initiatives—and in a way to limit that activity to those that would move us past the status quo. of the sixteen directions, one stemmed directly from the organization’s focus on discovery: “improve discoverability of physical and electronic resources in empowering users to be self sufficient; work toward an interface and system architecture that incorporates our resources, internal and external, and allows the user to access them from their preferred starting point.” an additional direction also touched on the discovery concept: “monitor and adapt physical and virtual spaces to ensure they respond to and are informed by next-generation technologies, user expectations, and patterns in learning, social interactions, and research collaboration; encourage staff to experiment with, explore, and share innovative and creative applications of technology.” through their division directors and standing committees, all library staff members were subsequently given the opportunity to submit action items to the strategic plan within the framework of the strategic directions. the dean of libraries made an effort to have this part of the process coincide with the discovery mini-conference, a time when many library staff members were being exposed to a wide variety of potential activities that we might take as an organization in this area. one of the major action items that made it into the strategic plan was for the dean’s direct reports to charge an oversight task force with the investigation and recommendation of a system or systems that would foster increased, unified discovery of library collections.
the charge of this newly created discovery task force includes a set of guiding principles for the group in recommending a discovery solution that

■ creates a unified search interface for users pulling together information from the library catalog as well as other resources (e.g., journal articles, images, archival materials);
■ enhances discoverability of as broad a spectrum of library resources as possible;
■ is intuitive: minimizes the skills, time, and effort needed by our users to discover resources;
■ supports a high level of local customization (such as accommodating branding and usability considerations);
■ supports a high level of interoperability (easily connecting and exchanging data with other systems that are part of our information infrastructure);
■ demonstrates commitment to sustainability and future enhancements; and
■ is informed by preferred starting points of the user.

in setting forth these guiding principles, the work of the discovery task force is informed by the organization’s discovery values, which have evolved over a year of organizational learning. in the timing of the strategic planning process and the emphasis of the plan, we made sure that the organization’s strategic development did not run ahead of organizational reality and also have worked to develop a culture that supports change and risk taking.11 the strategic discovery direction and pattern of organizational focus has been allowed to emerge throughout the organizational learning process. as evidenced in both the strategic plan directions and guiding principles laid out in the charge of the discovery task force, the organization has begun to absorb the basic philosophy that will guide appropriate objectives in this area and has focused more on this guiding philosophy than on the active pursuit of one right answer as it continues to learn.
conclusion

using the theoretical lens of organizational learning, i have documented how unlv libraries’ emerging focus on the concept of discovery has provided an impetus for learning and change (see appendix a). our experience throughout this process supports the theory that organizational intelligence evolves over time and in reference to current operating norms.12 argyris and schön warn that a top-down approach to management focusing on control and clearly defined objectives encourages single-loop learning.13 had unlv libraries chosen a more management-oriented route at the beginning of this process, it most likely would have yielded an entirely different result. in this case, genuine organizational learning proved to be action based and ever-emerging, and while this is known to introduce some level of anxiety into an organization, the development of the ability to question, challenge, and potentially change operating norms has been worth the cost.14 i believe that while any single idea we have broached in the discovery arena may not be completely unique, it is the entire process of organizational learning that is significant and applicable to many information and technology-related areas of interest.

references

1. karen calhoun, the changing nature of the catalog and its integration with other discovery tools (washington, d.c.: library of congress, 2006), http://www.loc.gov/catdir/calhoun-report-final.pdf (accessed aug. 12, 2009); bibliographic services task force, rethinking how we provide bibliographic services for the university of california (univ. of california libraries, 2005), http://libraries.universityofcalifornia.edu/sopag/bstf/final.pdf (accessed aug. 12, 2009).
2. gareth morgan, images of organization (thousand oaks, calif.: sage, 2006).
3. chris argyris and donald a. schön, organizational learning ii: theory, method, and practice (reading, mass.: addison wesley, 1996).
4. morgan, images of organization, 87.
5. morgan, images of organization, 87–97.
6. ibid.
7. robert l. bothmann and melissa holmberg, “strategic planning for electronic management,” in electronic resource management in libraries: research and practice, ed. holly yu and scott breivold, 16–28 (hershey, pa.: information science reference, 2008).
8. karen coyle, “the library catalog: some possible futures,” the journal of academic librarianship 33, no. 3 (2007): 414–16.
9. morgan, images of organization.
10. ibid.
11. ibid.
12. ibid.
13. argyris and schön, organizational learning ii.
14. morgan, images of organization.

appendix a. tracking unlv libraries’ discovery focus across characteristics of organizational learning

scan and anticipate change in the wider environment to detect significant variations by
■ embracing views of potential futures as well as of the present and the past (revisioning phase 1: technical services questions);
■ understanding products and services from the customer’s point of view (revisioning phase 3: summit); and
■ using, embracing, and creating uncertainty as a resource for new patterns of development (revisioning phase 1: meeting; phase 3: summit).

develop an ability to question, challenge, and change operating norms and assumptions by
■ challenging how they see and think about organizational reality using different templates and mental models (revisioning phase 2: survey);
■ making sure strategic development does not run ahead of organizational reality (strategic planning process; discovery task force charge); and
■ developing a culture that supports change and risk taking (strategic planning process).

allow an appropriate strategic direction and pattern of organization to emerge by
■ developing a sense of vision, norms, values, limits, or “reference points” to guide behavior, including the ability to question the limits being imposed (revisioning phase 1: outcomes; phase 2: shared readings, activity; strategic planning process; discovery task force charge);
■ absorbing the basic philosophy that will guide appropriate objectives and behaviors in any situation (strategic planning process, discovery task force charge); and
■ placing as much importance on the selection of the limits to be placed on behavior as on the active pursuit of desired goals (strategic planning process, discovery task force charge).

appendix b. summit matrix

please complete the following and bring to the summit on discovery—february 24:

1. search for the information requested in each row of the table below with three discovery tools as your starting points: the libraries catalog, the libraries website, and a general internet search engine (like google).
2. for each discovery tool, rate the information that you were able to find in terms of “ease of discovery” on a scale of 1 (lowest ease) to 5 (highest ease).
3. document the thoughts and feelings you had and/or process you went through in searching for this information in the space provided.
4. answer this question: do you have other preferred starting points when looking for information that the libraries own or provide access to?

column headings: what am i looking for? / libraries catalog / libraries website / google / thoughts, etc., on what i discovered

rows (what am i looking for?):
■ what’s all the fuss about frazier hall? why is it important? does unlv libraries have any documents about the history of the university that reference it?
■ it’s black history month and my professor wants me to find an oral history about african americans in las vegas that is available in unlv libraries.
■ i heard that henderson started as a mining community. does unlv libraries have any books about that?
■ find any photograph of the gay pride parade in las vegas that you can look at in unlv libraries.
on-line serials system at laval university library

rosario de varennes: director, library analysis and automation, laval university library, cite universitaire, quebec, canada

description of a system, operational since june 1968, that provides control of all serials holdings in nine campus libraries, permits updating of the complete file every two or three days, and produces various outputs for library users and library staff from data in variable fields on disks (listings, statistics, etc.). the program, presently operating on an ibm 360/50 and utilizing an ibm 2314 disk-storage facility and three ibm 226 crt terminals, is written in ibm system/360 operating system assembler language and in pl/i; it could encompass a file of no more than 10 million records of variable length limited to 127/255 characters and subdivided in 25 or fewer fields.

l'universite laval, the oldest french university in america, around 1950 began a move from the original location in historic old quebec to a new campus in suburban sainte-foy; the general plan calls for a total investment of $235,000,000. this private institution, subsidized by the provincial government at about 75%, had an operating budget in 1968/69 of $32,000,000 (research not included); of this sum, $2,300,000 was appropriated for the library system. the enrollment of full-time students was 10,145 and the total registration 22,726. the regular teaching staff amounted to 1,016 and the total figure was 1,691.

the library serving this community constitutes a unified system under one administration, with centralized technical processing, but with nine physical locations, one of which is still in the old city, and four auxiliary services: documentation center, rare books and archives, map library and film library. the most recent addition to it is the main library building, dedicated in june 1969, a $10,000,000 seven-story complex of 424,000 square feet (1).
the library staff consists of 269 employees, of which 78 are professional librarians or specialists. the serials department totals fifteen employees, of which three are professional librarians. the collections as of august 1969 represent 815,966 physical units, or 433,407 cataloging units of books, periodicals, government publications, pamphlets and microtexts; and 88,734 physical units of special collections (maps, photos, films, fixed films, music records, manuscripts, archives). the serials alone account for 189,440 bound volumes and 16,335 titles, of which 12,396 are received currently and 7,934 are subscriptions. the figures for serials titles will probably reach the 20,000 mark with the completion in 1970 of an inventory started in 1964. library automation venture at the library goes back to the autumn of 1963, when an off-line serials system and a subject headings list program were begun. along the road, there was developed in the documentation center a special technique of information storage and retrieval utilizing the recordak miracode (microfilm retrieval access code) system and a program called asyvol 2 (analyse synthetique par vocabulaire libre/synthetical analysis by free vocabulary) by means of which various indexes and research projects are currently processed. recently the first on-line real-time program with the new serials system went into successful operation. some literature, mostly in french, has been issued concerning these realizations and projects, but has been little publicized (2-11). it is also worth noting that the library, except for some peripheral equipment, mostly input devices, does not own any machinery and is utilizing instead the programming staff and the computer facilities of the polyvalent laval university's computing center. 
in the library itself the author of this article is mainly responsible for preliminary analysis of projects, for coordination of activities between the library and the computing center, for the supervision of work done in library automation units integrated into library services, and for the administration of the budget appropriated for library automation. this last item, research projects not included, is $170,000 for the year under discussion.
system design
contents of the file
in its present organization, the serials file is accessible only by an accession number limited to seven digits and ordered corresponding to the alphabetized entries of records. there is a distinct entry for every title and every reference and for each duplication of any one title or reference. all records fall within two main divisions: humanities, represented by h, and sciences, represented by s, and are further identified by subdivisions of these main classes to a limit of three letters (for example: hh, main library; hmu, music library; sa, agricultural library; scc, science library, department of chemistry). there is a possibility of 25 fixed/variable fields for any record, but only 18 are currently used (figure 1). as of september 2, 1969, the statistical figures for the complete file were as follows: 22,530 entries, of which 16,335 were titles; 6,192 references; three entries were unspecified by error.
journal of library automation vol. 3/2 june, 1970
[fig. 1. input sheet. information sheet for transmission to system (back-up). fields: a titre; b sigle; c periodicite; d repertoires de depouillement; e ab./don/ech.; f lieu de publ.; g pays/langue; h cote; i date de parution initiale; j editeur et son adresse; k prix; l note historique; m titre direct; n vedettes-matiere 1, 2; o renouv.; p etat de collection de l'annee cour.; q etat retrospectif de la collection; r etat retrospectif de ce qui manque; n.b. matricule no. sigle, columns 1 to 6: 1 courant / 0 non courant; 1 publication officielle / 0 non officielle; 1 annuel ou continuation / 0 les autres; 1 voir / 0 titre du periodique; 1 collection non complete / 0 collection complete; 1 periodiques d'hopitaux.]
[fig. 2. serials file updating.]
hardware
the system is operational with an ibm 360/50, an ibm 2314 disk-storage facility and ibm 2316 disk packs, three ibm 2260 crt display units, an ibm 2848 display control unit, an ibm 2401 tape transport and control unit and an ibm 1403 high-speed printer. the program system occupies a 56,320-byte region of core memory.
software
the system developed at laval provides essentially for two things: the record display on crt terminals for questioning or modifying the records; and the updating of the serials file (figure 2). it is not affected by the bibliographic contents of records; the control of this part is the responsibility of the serials department. the system could encompass any file of no more than 10 million records of variable length limited to 127/255 characters and subdivided into 25 or fewer fields. the program is written in ibm system/360 operating system assembler language, except for the output and printing routines written in pl/i, and it is conceived for an ibm 360/30 model or one of a higher number, matched at a minimum with one magnetic tape, one disk, one 2848 control unit and one 2260 crt terminal; it operates under the control of operating system os/360, version 14 or subsequent (12, 13).
[fig. 3. communication between modules.]
the system is roughly subdivided into three subsystems, the first being the control routine for the system and crt terminals, developed by the
computing center of laval university and called racine (root). the second is a subsystem that consists of display and updating routines, a group of 18 modules falling under two control sections (csect): linkages and marchand (family name of an analyst-programmer from b.i.r.o.). each is again constituted of various subprograms, and marchand includes also all literals of the program. all these modules communicate in various ways, as illustrated in figure 3. modules not within the large box constitute the csect marchand; other csect are within the large box. the third subsystem is a modification routine of records on disk called modossif (modification des dossiers du fichier/modification of records on file). the ibm/linkage editor links these routines, and they communicate 1) by way of specific registers; and 2) by way of working space areas, some common to all terminal stations and some restricted to one in particular. the main purpose of the system being to give the user up-to-date information, it is implied that information concerning modified records should be available as soon as the transaction is performed. the ibm/indexed sequential system seems at first sight to ideally answer this need. nevertheless, the library was forced to elaborate a more complex system for the sake of security.
data sets (figure 4)
[fig. 4. data sets.]
there exists the master file in direct access on disk, with a backup file on tape. when a record is asked for, the accession number of the record is transmitted to the program and searched for in the file; if found it is duplicated completely in the working space area on disk corresponding to a particular terminal, and is displayed on the crt nine lines at a time.
in case of a modification being asked for, it is on this copy in that particular area that the program modossif applies. in switching to another demand, the program checks to see if any modification occurred. if not, the copy is destroyed; otherwise the amended record is transferred to the temporary common working space area on disk called bpam (buffer periodiques amendes/buffer amended serials), where all modifications accumulate from one updating of the master file to the other. if queried anew before updating, the same amended record will be retrieved from the bpam file and duplicated as before. moreover, any instruction concerning modifications is chronologically recorded on tape as given and constitutes the log (figure 2). if any down time occurs, it is then possible to simulate all the transactions performed since the last updating. updating, normally a daily process, is basically the merger of the master file with the bpam file, resulting in the creation of a new master file on disk and a new backup on tape.
record in the file
as mentioned before, any record in the file is identified by an accession number of seven digits. number 0000000 identifies the system's messages and is always displayed first, and number 9999999, indicating that a working space area is not occupied, is not to be used. otherwise all numbers are symmetrical and interchangeable. any record may cover up to 25 fields or blocks of logical information. these fields are identified by letters a to y and put into alphabetical order. they vary in length from three to many thousand characters. each field is divided into three elements: identifying letter; information to display; and end of field or record control tag. this tag is fd (fin du dossier/end of field) for all fields except the last one, which is tagged fe (fin de l'entree/end of record). the information to be displayed is submitted to various restrictions, exemplified in detail in the instructions manual (13).
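the three-element field layout described above (identifying letter, information to display, closing tag) can be sketched in a few lines of python. this is only an illustration under stated assumptions, not the laval assembler code: it assumes that the fd and fe tags are the single hexadecimal bytes 0xFD and 0xFE, and it uses ascii text where the ibm 360 would have used ebcdic. the function names encode_record and decode_record are hypothetical.

```python
# Illustrative sketch only. Assumptions not stated in the article:
#   - the FD / FE tags are the single bytes 0xFD and 0xFE
#   - text is ASCII here (the original system would have used EBCDIC)
FIELD_END = 0xFD   # closes every field except the last
RECORD_END = 0xFE  # closes the last field, and so the record


def encode_record(fields: dict[str, str]) -> bytes:
    """Serialize fields (keys 'a'..'y') in alphabetical order."""
    out = bytearray()
    letters = sorted(fields)
    for position, letter in enumerate(letters):
        if len(letter) != 1 or not "a" <= letter <= "y":
            raise ValueError(f"field letter must be a..y, got {letter!r}")
        out += letter.encode("ascii")          # identifying letter
        out += fields[letter].encode("ascii")  # information to display
        is_last = position == len(letters) - 1
        out.append(RECORD_END if is_last else FIELD_END)
    return bytes(out)


def decode_record(data: bytes) -> dict[str, str]:
    """Split a serialized record back into its lettered fields."""
    fields: dict[str, str] = {}
    start = 0
    for position, byte in enumerate(data):
        if byte in (FIELD_END, RECORD_END):
            chunk = data[start:position]
            fields[chr(chunk[0])] = chunk[1:].decode("ascii")
            start = position + 1
            if byte == RECORD_END:
                break  # anything after FE is padding
    return fields


record = {
    "a": "tokyo bunrika daiguru, science reports",
    "b": "000010 sc",
    "w": "vide",
}
raw = encode_record(record)
restored = decode_record(raw)
```

because the tag bytes lie outside the ascii range, they cannot collide with the displayed text in this sketch; how the real system avoided such collisions is not described in the article.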
the manual, in fact, puts in action the main program modossif. physically any record on file is subdivided into many subrecords of fixed length (l equ nnn) optimized at 239 bytes to a maximum of 127 per record. each subrecord is addressed in three sections as follows: 235 bytes of information, three bytes representing the accession number in binary code and one byte giving the sequence number of the subrecord under this particular accession number. this way, the last four bytes give the key to the subrecord in the master file and the last byte the key of access in the working-space area, making it possible to execute modossif and various print-out routines. to facilitate the retrieval of any particular field in a record, the first 26 bytes of the first subrecord are set aside for an index to the fields. the first 25 bytes represent fields a to y; in each position, a binary zero points to a nonexistent field and a positive value indicates the sequence of the subrecord where the field starts. the 26th byte gives the total number of subrecords in the record. the 27th byte gives the name of the first existing field, etc. figure 5 is an example of a complete record. underlined sections indicate hexadecimal notation. each row in the figure is a subrecord here given an unreal value of 40. the remainder is in alphanumeric characters, except that space is compressed and indicated by a dollar sign. the information on figure 5 appears as follows on the crt screen:
a tokyo bunrika daiguru, science reports
b 000010 sc
d annual reports of sciences
q no 50-67, 83, 97
t chimie chimie industrielle
u a et b c
v a-b c
w vide
x revue annuelle de chimie
y ce dossier est dresse a titre d'exemple seulement.
varia
the program provides also the parameters for each of the lines displayed on the screen, that is, nine screen-parameters called parmec (parametres-ecran).
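the subrecord key and the 26-byte field index described above lend themselves to a short worked example. the python sketch below is an illustration, not the original assembler routines: the function names subrecord_key and decode_index are hypothetical, and big-endian byte order for the three-byte accession number is an assumption (it is consistent with the hexadecimal keys 21e88e01 to 21e88e08 printed in figure 5, which correspond to accession number 2222222). the index bytes are those printed in hexadecimal at the head of the first subrecord in figure 5.

```python
import string

# Fields are lettered a to y (25 possible fields per record).
FIELD_LETTERS = string.ascii_lowercase[:25]


def subrecord_key(accession: int, sequence: int) -> bytes:
    """Build the 4-byte key: 3 bytes of accession number + 1 sequence byte.

    Big-endian order is an assumption, chosen to match figure 5.
    """
    if not 0 <= accession <= 9_999_999:
        raise ValueError("accession number is limited to seven digits")
    if not 1 <= sequence <= 127:
        raise ValueError("a record holds at most 127 subrecords")
    return accession.to_bytes(3, "big") + bytes([sequence])


def decode_index(index: bytes) -> tuple[dict[str, int], int]:
    """Read the 26-byte index at the head of the first subrecord.

    Returns (field letter -> subrecord where the field starts,
    total number of subrecords). A zero byte means the field is absent.
    """
    if len(index) != 26:
        raise ValueError("the index is exactly 26 bytes long")
    starts = {
        letter: seq
        for letter, seq in zip(FIELD_LETTERS, index[:25])
        if seq != 0
    }
    return starts, index[25]


# Index bytes as printed (in hexadecimal) in figure 5.
FIG5_INDEX = bytes.fromhex(
    "01 02 00 03 00 00 00 00 00 00 00 00 00 00 00 00 "
    "03 00 00 04 05 05 05 05 06 08"
)

starts, total = decode_index(FIG5_INDEX)
first_key = subrecord_key(2222222, 1)
```

decoding the figure 5 index yields fields a, b, d, q, t, u, v, w, x and y, the same ten fields that appear on the crt screen for that record, spread over eight subrecords.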
[fig. 5. example of complete record. each row is one subrecord; the index and the keys are in hexadecimal, and a dollar sign marks compressed space.
index, one byte per field a to y followed by the subrecord count: 01 02 00 03 00 00 00 00 00 00 00 00 00 00 00 00 03 00 00 04 05 05 05 05 06 08
row 1: (index) a tokyo bun | key 21e88e01
row 2: rika daiguru, science reports fd b 000010 | key 21e88e02
row 3: $sc fd d annual reports of sciences fd q no | key 21e88e03
row 4: 50-67, 83, 97 fd t chimie chimie industrie | key 21e88e04
row 5: lle fd u a et b $c fd v a-b, $c fd w $vide fd x revu | key 21e88e05
row 6: e annuelle de chimie. fd y ce dossier es | key 21e88e06
row 7: t dresse a titre d'exemple seulement | key 21e88e07
row 8: . fe, then zero fill | key 21e88e08]
the analyst-programmers at the computing center completed the program by various printing subprograms from the data on variable fields on disks, by a statistics subprogram and by a control routine of the indexes to the file. recently another addition occurred to the system when les presses de l'universite laval, the library's subscription agent for serials, decided to utilize the file to initiate a computerized ordering process. the programming for this project was tested during october and the program was successfully run during the first week of december 1969.
implementation
as soon as it was confirmed that the computing center would receive by summer 1967 a third-generation computer (ibm 360/40) it was deemed advisable to contemplate an on-line system to replace the already saturated off-line serials system on the ibm 1410 inaugurated in 1964 (8). an optimistic target date having been set for january 1968, the author transmitted in april 1967 to the computing center for study a working hypothesis concerning the automatic conversion of holdings data and the automatic claiming of missing issues (14). in answer to it, in august 1967, mr.
jean lachance, analyst-programmer, proposed a first draft of an automatic control system for serials (15). in fact the draft envisaged only a semi-automatic conversion of data and the on-line system for current entries only, the non-current being managed off line. then, on account of various restrictions befalling the computing center, it was decided to call upon an external firm, b.i.r.o., inc., located in quebec city. the contract, signed at the end of november 1967, provided basically for: 1) the conversion of the master file on magnetic tapes, containing records in fixed fields, to a random access file on disks, with records in variable fields; 2) the programming of record display on crt terminals; 3) the updating of the file via coded input procedures; 4) the provision of transitory working space areas for current transactions; 5) the possibility of questioning and amending both the master file and the transitory file; 6) the writing of the appropriate technical documentation and the initial training of the operators of the terminals. the contract was to be in conformity with the standards of operating system os for an ibm 360 computer and subject to the acceptance of the computing center of laval university. at the same time a working schedule was established as follows: 1) beginning of work as soon as the contract was signed; 2) operational program sixty days after delivery to the contractor of terminals in good working condition according to manufacturer's specifications; and 3) termination of contract thirty days after acceptance of the finished product. the terminals were ready by january 25, 1968. the program was declared operational by april 11, 1968, and the technical report describing it deposited the week after. meanwhile a last updating of the master file with the off-line system was performed at the beginning of april.
on april 29, 1968, the conversion of the file to ibm 2311 disks connected to an ibm 360/40 was realized. everything was then ready for the final test. unhappily, mr. lachance left laval at the end of april at the most crucial moment, and it was not before june 12, 1968, that the first updating succeeded and the program became operational. from then on, apart from various technical problems, there were other difficulties: a moving of the main library during august, a moving of the computing center that precluded any activity from september 26 to october 25, and a switch to an ibm 360/50 and to an ibm 2314 disk-storage facility during october, not to mention a turnover of staff in the serials department. these prevailing conditions explain why the program was not officially accepted by the library before december 16, 1968.
operation
the first of a series of turning points in the refinement of the program occurred in november 1968 with a normalized run of updatings. in january 1969 the first two printing programs ran successfully; these were a daily checking list with calendar, and a statistics subprogram (figure 6). in february 1969 the program produced almost error-free updatings (0.8% and 0.63%), and in june 1969 the system was finally debugged, eliminating a particular recurring anomaly accounting for most of the errors in the system (a display of preceding instruction bearing accession number and code in some field of subsequent record). some other technical difficulties were encountered along the road and diversely overcome; for example, incomplete reading of log, down time, overlapping of information, duplication of fields, unaccepted records, display of information into frame or display of frame of nonexistent field, unsettled screen, etc.
[fig. 6. some outputs: tableau des statistiques (fichier des periodiques), secteur s, 2 septembre 1969.]
in the fifteen months after june 1968, the input capability per person improved from an average of 30-35 transactions a day during the first three months to an average of 100-200 transactions a day, depending on the length of records; the down time was reduced from 50% to 10%-15% and even less than 5% for some weeks; the error percentage attributable to the program, as distinct from errors of transcription, has declined from 10%-5% to 1% or less. to familiarize library staff with the new system and build confidence in it, a round of systematic demonstrations of the system at terminal points was organized as part of an in-training program. two printed guides (16, 17) were previously distributed; then from february 5 to march 14, 1969, all employees, in groups of six, were directed in questioning the file individually by sitting down at the consoles and pressing the right tabs. later the experience was completed by a guided tour in larger groups to the machine room of the computing center. members of the staff were enthusiastic, especially the professional people, and the experiment on both sides quite a success.
an initial feature of the new system, but one which has not yet materialized, is the automatic claiming of missing issues. a first step in this direction will be initiated with the recent subprogram prepared for automatic ordering of serials at les presses de l'universite laval. coden may be introduced into the system and a question-answer subprogram devised that would give a more selective approach to the file; in fact, an application of ibm/dps (document processing system) to a sample of some 200 records has already been positively tested. as for the hardware, there is a good possibility of installing terminals for users in main libraries in 1970/71. it also seems quite realistic that the system will be switched again to an ibm 360/67 in the near future.
discussion
it may rightly be said that at the start of the 70's, the library field has come half way to the materialization of the "on-line, total library system" concept as defined by robert m. hayes in 1965, and also that the manufacturers are on the verge of offering libraries the "computer service centers" fitted to their particular needs as forecast by the same author (18). on the one hand, indeed, numerous examples of operational on-line applications to various library functions have been described in the journal of library automation, as well as in other specialized publications (the larc reports, program, special libraries, etc.). on the other hand, pertinent literature on library automation issued by people or firms from the data processing and computer fields (for example, warheit (19), ibm, sdc (20)) and diversified services under contract executed by computer firms for library organizations are sufficient proof that manufacturers are now heeding this new market of computer applications.
it is only to be hoped that these firms will also realize the linguistic dimensions of their new market and will accelerate the extension of their publications program to languages such as french, spanish, etc., with special emphasis on french for the french-speaking population of canada. concerning laval university's on-line serials system, it is still too soon to be in a position to evaluate its impact on the library's environment. the system is still in the building stage and not yet really exploited, especially from the user's point of view. there are other shortcomings of the present program. its only access, by accession number, is far too limiting for on-line dialogue. it is questionable whether it was advisable to have divided the holding statement in two parts (summary notation of what is in field q, to conform to regular practice, and detailed notation of missing issues in field r, for acquisition purposes) on account of the heavy routine it imposes on library staff; questionable also is the double entry for titles in a (regular librarians' practice) and m (direct title) on account of the complexities it brings with it for references. a lack of consultation with b.i.r.o., inc., in the initial phase of the contract led to the establishment of a field sequence not ideally compatible with the requirements of the serials department's daily operations, though adjusted to optimum storage capacity. a more complete mastery of downtime periods and a more effective and more reliable routine of restart should have been developed by now by the staff of the computing center. finally, there has not been a total shift from off-line procedures to procedures more compatible with the new system, mostly because of psychological reluctance on the part of library staff. on the other hand, a far more reliable system keeps serials under better control.
the serials file is updated twice or thrice a week, and is indeed "updatable" anytime, without any inconvenience to the serials department, whereas in the off-line system the annual or semi-annual updatings compelled a stoppage of regular operations for weeks, or even required overtime. there need be no more distinction between current and non-current records, nor with oversized records, since any record of any length is readily acceptable by the system. the opportunity exists of producing more adequate listings of all kinds, and reliable statistics more quickly and at a better price. the input procedure is so simplified that that fact alone would almost justify the system. finally, "humiliter dico," this program has brought more fame to the library (or is it only curiosity?) than any other achievement, inquiries about it coming from canada, the u.s.a., england, france, africa, and even ussr. a year or so from now, it should be possible to evaluate the impact of the system on laval library users, by which time additional terminals will have been put at their disposal.
programs
microfiches and photocopies of the following may be obtained from national auxiliary publications service of asis: "on-line serials system at laval university library: basic programs" (naps 00966) and "on-line serials system at laval university library: printing programs" (naps 00967).
acknowledgments
the sustained collaboration, expert advice and friendly understanding of people from the computing center of laval university were important to the library program from the beginning. the director, mr. louis p. a. robichaud, scientist of international repute, the assistant director, mr. pierre ardouin, and the analysts assigned to the serials program, mr. conrad bourdon and mr. richard desrosiers, skillfully maintain the system and work continuously towards its potential development.
special thanks must also deservedly go to the firm b.i.r.o., inc., for implementation of the basic programs. encouragement in this venture by the librarian-in-chief, father joseph marie blanchet, was constant.
references
1. varennes, rosario de: "the siamese twins or the new building of laval university library", (quebec, june 17, 1969) 8 pp. 2. journee d'etude sur la mecanisation de certains services de la bibliotheque de l'universite laval; communiques (sainte-foy, quebec: 5 juin 1964), 93 p. 3. communications presentees a une reunion de bibliothecaires d'universites canadiennes sur l'automatisation des services de bibliotheque tenue a l'universite laval (quebec: 21 et 22 mars 1966), 69 p. 4. varennes, rosario de: summary report, committee for automation of library services, laval univ. library, (quebec, mar. 18, 1966) 5 pp. 5. stuart-stubbs, basil: conference on computers in canadian libraries, universite laval, quebec, march 21-22, 1966; a report prepared for the canadian association of college and university libraries (vancouver: university of british columbia library, june 1966), 13 p. 6. programme d'automatisation des services de la bibliotheque de l'universite laval (montreal: acblf, aout 1967), 74 p. 7. forget, guy: "rapport sur asyvol", in dolan, f. t.: information retrieval in canada; a preliminary survey. (calgary, alberta: imperial oil ltd., producing dept., western region, august 1967), iprc-4 mir-67, pp. 163-173. 8. varennes, rosario de: "computerized serials record at laval university: a progress report", canadian library, 24 (september 1967), 122-123. 9. leclerc, rita: "le centre de documentation de la bibliotheque de l'universite laval (quebec)", revue de l'aupelf, 5 (automne 1967), 27-32. 10. varennes, rosario de: bird's-eye view of the library's computer programs and projects as of february 21, 1969.
prepared for the members of the aucc (association of universities and colleges of canada) library automation committee (quebec: bibliotheque de l'universite laval, analyse et automatisation des services: february 25, 1969), 9 p. 11. forget, guy: "the university library and information center: a new dimension". in clinic on library applications of data processing, university of illinois, 6th, 1968. proceedings (urbana, ill.: university of illinois graduate school of library service), pp. 1-10. 12. b.i.r.o. inc., quebec: systeme d'affichage et de mise a jour du fichier des periodiques; documentation technique (quebec: b.i.r.o. inc., avril 1968), 33 p. 13. b.i.r.o. inc., quebec: systeme d'affichage du fichier des periodiques, bibliotheque de l'universite laval. manuel d'operation (quebec: centre de traitement de l'information, universite laval, juin 1968), 58 p. 14. varennes, rosario de: programme des periodiques 1968. hypothese de travail: conversion mecanisee des donnees de l'etat de collection et rappels automatiques. schema preliminaire a l'intention du centre de traitement de l'information (quebec: bibliotheque de l'universite laval, services des periodiques, 26 avril 1967), 6 p. 15. lachance, jean: etudes preliminaires d'un systeme de controle automatique des periodiques, universite laval (quebec: centre de traitement de l'information, universite laval: 14 aout 1967), 8 p. 16. varennes, rosario de: guide succinct pour l'interrogation du systeme de periodiques en temps reel avec l'ordinateur (ibm 360-50/2314/2260). a l'intention du personnel de la bibliotheque (quebec: bibliotheque de l'universite laval, analyse et automatisation des services: 15 janvier 1969), 7 p. 17. bourdon, conrad; varennes, rosario de: description technique sommaire du systeme de periodiques en temps reel avec l'ordinateur (ibm 360-50/2314/2260). complement au "guide succinct ..." deja paru.
a l'intention du personnel de la bibliotheque (quebec: bibliotheque de l'universite laval, analyse et automatisation des services: 24 avril 1969), 8 p. 18. hayes, robert m.: "the concept of an on-line, total library system", library technology reports, section: data processing (may 1965), 13 p. 19. warheit, i. a.: "file organization of library records", journal of library automation, 2 (march 1969), 20-30. 20. black, donald v.: library information system time-sharing on a large, general-purpose computer (santa monica, calif.: system development corporation, september 20, 1968) sp-3135, 21 p.
judith carter
editorial board thoughts: issue introduction
discovery: what do you mean by that?
mwuah ha ha ha haaa! finally it’s my turn. i hold the power of the editorial. (can you tell i’m writing this around halloween?) seriously now, i’ve been intimately and extensively involved with information technology and libraries for eleven years, yet this is the first time i’ve escaped from behind the editing scenes to address the readership directly. as managing editor for seven of the eleven volumes (18–22 and 27–28) and an editorial board member reviewing manuscripts (vols. 23–26), i am honored marc agreed to let me be guest editor for this theme issue. this issue is a compilation of presentations from the discovery mini-conference held at the university of nevada las vegas (unlv) libraries in the spring of 2009. the first article by jennifer fabbi gives the full chronology and framework of the project, but i have the pleasure of introducing this issue and topic by virtue of my role as guest editor, as well as my own participation in the mini-conference before i left unlv in july 2009.
what is discovery?
when the dean of libraries, patricia iannuzzi, announced that unlv would have a libraries-wide, poster-session style discovery mini-conference, jennifer fabbi and i decided we wanted to be part of it.
we had already been exploring various aspects of discovery as part of an organizational focus as well as following up on a particular event that happened earlier in the year. while serving on a search committee, we posed a question to all the candidates: “what do you see the library catalog looking like in the future? what do you see as the relationship between the library catalog and other access or discovery tools?” one of the candidates had such a unique answer that it got us thinking: are we all talking about the same thing when we discuss discovery? the mini-conference gave us the opportunity to explore the idea further. an all-library summit that preceded the mini-conference announcement had focused on users finding known items. we knew that discovery was so much more and that it depended on the users’ needs. of course, first we went to multiple online dictionaries to look up the meanings of “discovery” and found the following definitions:
- something learned or found; something new that has been learned or found
- the process of learning something; the fact or process of finding out about something for the first time
- the process of finding something; the process or act of finding something or somebody unexpectedly or after searching
we also looked at famous quotes about discovery. these were some of our favorites:
a discovery is said to be an accident meeting a prepared mind. (albert szent-gyorgyi)
education is a progressive discovery of our own ignorance. (will durant)
next, a colleague recommended we look at chang’s browsing theory.1 this theory covered the broad spectrum of how users seek information and showed a more serendipitous view than the former focus on known-item search. obviously, browsing implies a physical interaction with a collection, so we reframed the themes to fit discovery in the “every-library” electronic information environment.
chang’s five browsing themes, adapted to discovery:

- looking for a specific item, to locate
- looking for something with common characteristics, to find “more like this”
- keeping up-to-date, to find out what’s new in a field, topic or intellectual area
- learning or finding out, to define or form a research question
- goal-free, to satisfy curiosity or be entertained.2

all interesting information, but a little theoretical for a visual presentation. to make these themes more concrete and visual, i suggested we apply them to personas as described in one of my favorite books, the inmates are running the asylum.3 the book encourages programmers to create a user with a full backstory and then design a product for that user's needs. to do this in an entertaining way, we identified five types of users we’ve encountered in our libraries and described an information-seeking need for each. i then created some colorful and representational characters using a well-known, alliteratively named candy’s website. our five characters were

1. mina, who is stylishly dressed and always carries a cell phone, is an undergraduate who rarely uses the library. she has a sociology class library assignment to find information on the cell phone habits of generation x.
2. ms. lvite lives in the las vegas area and contributes to the library. she is a regular from the community who likes to dig into everything the library owns about small mining towns in nevada.
3. dr. prof is a faculty member with a slightly outdated wardrobe but a thirst for knowledge. he wants to know what books have been published in his field of quantum bowtie mechanics by any of his colleagues across the country.
4. phdead tired is a slightly mussed grad student who is always in the library clutching a cup of coffee. he needs to narrow down his dissertation topic.
5. duuuuude is an energetic, sociable young man who likes to hang out in the library with his friends. he has some time to kill on the computer.

judith carter (jcarter.mls@gmail.com) is head of technical services at marquette university raynor memorial libraries, milwaukee, wi and managing editor of ital.
162 information technology and libraries | december 2009

on our poster, we asked the discovery mini-conference attendees to place cutouts of our personas on a pie chart divided into the five themes of discovery. jennifer and i expected certain placements and were pleasantly surprised when our attendees challenged our assumptions with alternate possibilities.

another section of the poster related discovery behaviors to specific electronic discovery tools. we provided a few and asked the attendees to add others (see table 1). while talking with each attendee, we provided a bookmark listing the five discovery behaviors (with colorful character personas) and suggested they keep them in mind as they visited the other conference sessions. we challenged them to identify what user behaviors the other presenters’ systems or services were targeting.

table 1. relating discovery behaviors to electronic discovery tools

| user wants . . . | provide the user . . . | other tools?* |
| to find a specific item | search by title, author, or call number (e.g., libraries’ webopac) | search a database; worldcat; flickr; google books |
| to find items with common characteristics | items linked by subject headings, format, or other elements; tag clouds; federated search for article databases (e.g., webopac, encore, article databases) | flickr; summon; twine; delicious |
| to be kept up-to-date | recently added items by subject; integration of blogs for news or updates (e.g., new books list, libguides, encore “recently added”) | blogs; rss feeds; apple itunes; amazon; readers advisory; authors/musicians websites; newspapers online |
| to learn more about something | general information that provides context, reviews (e.g., wikipedia, google, encore community reviews) | dissertation abstracts; encyclopedias; database of databases (for context); peer to peer: delicious, social tagging |
| to satisfy curiosity or be entertained | surfing the web, multimedia, social networking (e.g., google, youtube, facebook) | myspace; world of warcraft; second life; podcasts; wikipedia “random article” feature |

* ideas generated at the discovery mini-conference

the message jennifer and i hoped to convey with our poster was this: the way we think about discovery, or the users’ goals in finding information, drives the discovery systems we have or will create. as you read through this issue, i hope you’ll see some new ways to think about discovery and that those ways will fuel this audience’s potential to create new tools.

what follows is a textual walk around our mini-conference. taken as individual articles, each might not look like what you are used to seeing in ital. taken as a whole that grew out of the process, these articles are what makes this a special issue. as i said before, jennifer fabbi provides the background and process for the discovery mini-conference. then, alex dolski describes a prototype multipac discovery system he created and demonstrated, and he discusses the issues surrounding the design of such a system.
tom ipri, michael yunkin, and jeanne brown, as members of the usability working group, had already been conducting testing on unlv libraries’ website. they share their methods, findings, and results with us. thomas sommer presents a look at what the special collections department has implemented to aid discovery of their unique materials. wendy starkweather and eva stowers used the mini-conference as an opportunity to research how other libraries are providing discovery opportunities to students via smartphones. patrick griffis describes his work with free screen capture tools to build pathfinders to promote resource discovery. patrick griffis and cyrus ford each looked at enhancing catalog records, so they combined their two presentations here to describe ways to enrich the online catalog to better aid our users’ success.

references
1. shan-ju chang, “chang’s browsing,” in theories of information behavior, by karen e. fisher, sanda erdelez, and lynne mckechnie (medford, n.j.: information today, 2005): 69–74.
2. ibid., 71–72.
3. alan cooper, the inmates are running the asylum (indianapolis, ind.: sams, 1999). personas are described in chapter 9.

figure 1. “initial thoughts” and “five general themes of discovery behavior” panel from the discovery mini-conference poster

letter from the editors
kenneth j. varnum and marisha c. kelly
information technology and libraries | september 2022
https://doi.org/10.6017/ital.v41i3.15559

as summer turns to fall (too soon for the editor of this journal), the northern hemisphere’s academic calendar is getting started. after the past two years of covid-directed activities, it feels good to be returning to a more typical start to the year. while we’re not out of the pandemic woods yet, by any means, it does feel as if we’ve turned a corner. some of us have returned to pre-covid modes of working and socializing, while others are finding the return to the status quo ante a bit more challenging.
if the pandemic has shown us one thing, it’s the power, and limitations, of technology to adapt to a changed world, and of our human ability to adapt, or not, to new habits of technology. we’ve covered many of these responses of adaptation in the pages of this journal over the past two years, and expect that many more innovations and lessons learned will be shared here in the years to come.

technological change is a never-ending process. information technology and libraries will continue to share the ways cultural memory institutions adapt, respond, and react to changes in the technological tools we use. as always, if you have lessons learned about technologies and their effect on our mission, we’d like to hear from you. our call for submissions outlines the topics and process for submitting an article for review. if you have questions or wish to bounce ideas off the editor and assistant editor, please contact either of us at the email addresses below.

this issue’s contents

the september “public libraries leading the way” column, “the first 500 mistakes you will make while streaming on twitch.tv” by chris markman, kasper kimura, and molly wallner of the palo alto (california) public library, is all about lessons learned. the authors summarize the many things they discovered while launching, managing, and sustaining a library presence on twitch.

our peer-reviewed content this month showcases topics including public library broadband connectivity, two articles on aspects of chat reference, digitization, library management, and a learning object repository to support a cross-institutional, land-based, multidisciplinary academic initiative.

1. measuring library broadband networks to address knowledge gaps and data caps / chris ritzo, colin rhinesmith, and jie jiang
2. perceived quality of whatsapp reference service: a quantitative study from user perspectives / yan guo, apple hiu ching lam, dickson k. w. chiu, and kevin k. w. ho
3. library management practices in the libraries of pakistan: a detailed retrospective / asim ullah, shah khusro, and irfan ullah
4. navigating uncharted waters: utilizing innovative approaches in legacy theses and dissertations digitization at the university of houston libraries / annie wu, taylor davis-van atta, bethany scott, santi thompson, anne washington, jerrell jones, andrew weidner, a. laura ramirez, and marian smith
5. using machine learning and natural language processing to analyze library chat reference transcripts / yongming wang
6. an omeka s repository for place- and land-based teaching and learning / neah ingram-monteiro and ro mckernan

kenneth j. varnum, editor (varnum@umich.edu)
marisha c. kelly, assistant editor (marisha.librarian@gmail.com)

https://ejournals.bc.edu/index.php/ital/call-for-submissions
https://ejournals.bc.edu/index.php/ital/article/view/15475
https://ejournals.bc.edu/index.php/ital/article/view/13775
https://ejournals.bc.edu/index.php/ital/article/view/14325
https://ejournals.bc.edu/index.php/ital/article/view/14433
https://ejournals.bc.edu/index.php/ital/article/view/14719
https://ejournals.bc.edu/index.php/ital/article/view/14967
https://ejournals.bc.edu/index.php/ital/article/view/15123

166 book reviews

proceedings of the 1968 clinic on library applications of data processing, edited by dewey e. carroll. urbana: university of illinois, 1969. 235 pp. $3.00.

for all except inveterate institute participants, it must be difficult to decide to spend yet another week listening to a widely mixed series of papers and discussions on data processing in libraries, in the hope of finding something new or useful.
to attract a wide audience, the offerings tend to range from simple introductions to technical discussions of specific programs or projects. the value of gathering the papers of such institutes into volumes of proceedings is questionable. material from the introductory papers would certainly find greater use in a comprehensive monograph, while the papers which report new developments or technical problems would have a better chance of reaching their proper audiences if published in journals. the repetitive "how-we-did-it" reports might best be left unpublished.

the proceedings of the 1968 illinois clinic does have a number of articles which deserve wide readership. frederick g. kilgour's paper on initial system design for the ohio college library center is excellent, not so much for solutions, but because he raises the questions on the purpose of college libraries and the nature of regional systems which need to be raised before embarking on design. those who have had experience with automated operations will appreciate lawrence auld's listing of ten categories of library automation failure. (he omits one of the most common: lack of computer stability.) a technical article of considerable interest is alan r. benenfeld's paper on generation and encoding of the data base for intrex.

those looking for reports of successful computer applications may find useful information in the papers by robert hamilton, of the illinois state library, on circulation; by james w. thomson and robert h. muller, of the university of michigan, on the u. of m. order system; by michael m. reynolds, of indiana university, on centralized technical processing for the university's regional campus libraries; by john p. kennedy, of georgia tech, on production of catalog cards; and by robert k. kozlow, of the university of illinois, on a computer-produced serials list.

melvin j. voigt

planning library services.
proceedings of a research seminar held at the university of lancaster, 9-11 july, 1969. edited by a. graham mackenzie and ian m. stuart. lancaster, england: university of lancaster library, 1969. 30 shillings.

this volume offers fifteen papers presented in six sessions; each session had one or more papers and some discussion. the papers range from very general mathematical models to local problems of british legal codes and re-organization of local governments.

the first session introduces the problems and some theoretical notions of how to deal with them. the next three sessions deal with analysis techniques. morely introduces some simple techniques of maximizing benefits for given resources. brookes presents a good quick introduction to statistics and distributions which occur frequently in information science. leimkuhler develops cost models for storage policies and woodburn analyzes the costs in hierarchical library systems. the mathematics in these latter papers, although not difficult, will probably put off a good many librarians and administrators. both are practitioners and impressed by results, not complex models; the equations developed by leimkuhler or woodburn are probably too complex to be successfully used by most librarians. this might reflect the state of the librarian and not of the art, however, to quote cloote (from the paper by duchesne): "with only a very few notable exceptions, successful models have been so simple that an operational research specialist would disown them."

the fifth session covers data collection and evaluation. duchesne comments on management information systems and operations research for librarians. conventional techniques of data collection are reviewed by ford, including sample forms and a note of warning about too many surveys. in the final session leimkuhler presents an overview which includes several choice comments on progress (or lack of it) in libraries.
during the discussion period, mackenzie suggests that libraries should use up to five percent of their budgets for research. this reviewer feels that unless this suggestion is taken more seriously, most of the theory will never find an application.

these proceedings would make an excellent companion to burkhalter's case studies in library systems analysis as more theoretically oriented readings for a course in operations research or administration in librarianship. some of the techniques presented could be adapted for immediate application in analyzing present systems. thus this collection of papers can be useful to both student and practitioner interested in research and development of library systems.

arvo tars

168 journal of library automation vol. 3/2 june 1970

libraries at large, edited by douglas m. knight and e. shepley nourse. new york: r. r. bowker, 1969. 664 p. $14.95.

libraries at large is based on the materials which the national advisory commission on libraries employed in its deliberations. the commission appraised the adequacy of libraries and made recommendations designed "to ensure an effective, efficient library system for the nation." these materials are also useful to those engaged in the enrichment of present library programs and to those developing new library projects.

the materials consist of papers and reports written for the commission and include essays, original investigations, and literature reviews, as well as reprints of material that has appeared elsewhere. some papers are of top quality; some are poor. nevertheless, the appearance of these materials in one volume adds a convenient source of information that will be useful to librarians for years to come.

approximately half the book is devoted to problems related to the use of libraries and to the users of libraries. the second half contains discussions of government relationships of libraries and a series of useful appendixes. perhaps the most novel section of the book is william j.
baumol's "the cost of library and informational services." this study investigates the economics of libraries in depth and the results are of great interest. this chapter on economics contains new material and brings together that which existed heretofore, so that it constitutes the major resource on library economics. this chapter alone is so valuable as to justify the recommendation that all libraries and most librarians should acquire libraries at large.

the section on copyright is equally important, for it brings together data on a topic possessing cataclysmic potentials for librarianship. verner clapp's "copyright: a librarian's view" is the best statement that has appeared on the subject, and it is hoped that clapp's dissertation will awaken librarians to the peril that confronts them.

on the other hand, the chapter entitled "some problems and potentials of technology as applied to library informational services" is somewhat less than satisfying. the section starts off with mathews and brown's "research libraries and the new technology," which originally appeared in on research libraries. it is still an inadequate exposition. there follows a reprint of "the impact of technology on the library building," which educational facilities laboratories published in 1967. the statement is adequate, but more useful information exists. the last section of the chapter is a study, "technology in libraries," which the system development corporation produced. this paper is a useful review of technologies employed by libraries and recommends five important network and systems projects to be undertaken.

the chapters on government relationships include discussions of those with the federal government and those at local, state and regional levels. germaine krettek and eileen d. cooke have provided a worthwhile appendix listing and abstracting library-related legislation at the national level.
libraries at large is indeed a resource book, and those papers containing original investigations and literature reviews are of such high quality as to insure usefulness of this work to all thoughtful librarians. frederick g. kilgour computers and their potential applications in museums. a conference sponsored by the metropolitan museum of art. new york: arno press, 1968. 402 pp. $12.50. computers and their potential applications in museums contains the published proceedings of a conference which was held in new york, 1968. sponsored by the metropolitan museum of art and supported by ibm, the conference was another attempt to involve art and related fields in computer technology. this book covers a broad range of issues and problems from information retrieval to creativity. experts from museums, educators, librarians and computer specialists discussed the possible uses and the implications of computers for the museum field. the diversity of the participants seems to represent the components of an exceedingly complex problem which is as monumental as the museum field itself. as an overall document it gives evidence of concern and insight into the many technical problems which some researchers have encountered. in many instances the non-technical experts were too global in their thinking, while the technologists were too local in their area of concern to communicate to anyone but technologists. this disparity between approaches, with the obvious difficulties presented, is a typical one whenever non-technical groups attempt to make use of computer technology. an ambitious conference in scope, there were excellent participants and several of the papers were stimulating and provocative. the interaction among the people who attended the conference may have been useful and it may have generated important ideas. 
a reader of the published proceedings wishes there had been a final chapter which could have provided some guidelines for research and education in this field. there was an opportunity for the organizers of the conference or a small group of the participants to summarize the problems and to give some direction to solutions. several years and many conferences later we in the humanities have made little progress in use of the computer. it seems that we are still better at rhetoric than at problem solving.

charles csuri

books for junior college libraries. pirie, james w., comp. chicago: american library assoc., 1969. 452 pp. $35.00.

during the recent period of rapid growth and development of junior and community colleges, a bibliographic guideline has been long awaited. james w. pirie's books for junior college libraries, with its healthy potential for developing many basic collections and extending and updating others, fills that void. though it does not boast to be the single ideal bibliographic tool, it is a welcome addition to (and perhaps replacement for some of) its predecessors: frank bertalan's books for junior college libraries; charles l. trinkner's basic books for junior college libraries; hester hohman's readers adviser; helen wheeler's a basic book collection for the community college library; bro-dart foundation's the junior college library collection, edited by dr. bertalan; and the ever-present subject guide to books in print and books in print, from bowker.

books for junior college libraries represents the cooperative efforts of some 300 expert consultants (subject specialists, faculty members and librarians) charged with the responsibility of producing a publication to serve as a book selection guide for new or established junior and community college libraries.
approximately 20,000 titles are arranged by subject, broadly interpreted, with entries consisting of author, title, subtitle, edition (if other than the first), publisher and place of publication, date, pagination, price and l.c. number. easy access is provided by the inclusion of an author and subject index. a comparative "table of subject coverage" appearing in the preface, tabulating the percentage of subject distribution to total volume for the lamont, michigan, and the more recent books for college libraries lists, indicates that books for junior college libraries maintains a comparable subject percentage distribution to total volume. only book titles have been included; foreign entries and out-of-print titles have been limited to a few major works, in favor of titles readily available. paperbacks were listed, in the absence of card copy.

though limited in its coverage of terminal and vocational courses, with emphasis toward the transfer or liberal arts program, books for junior college libraries does embrace all fields of knowledge that tend to be challenging and useful for the general education programs. it has been endorsed by the joint committee on junior colleges of aajc, ala, and the junior college section of acrl, and moves toward the recommendations of the ala standards for junior college libraries.

this bibliographic guideline for junior college libraries should be welcomed by public schools as well as junior and community colleges for its assistance in developing new collections, as well as expanding and updating old collections, with quantity, quality, and economy working together.

james i. richey

agricultural sciences information network development plan. educom research report, august 1969. 74 pp.

the national agricultural library wants to implement its old plan of an agricultural science information network "based on the assumption that the land-grant libraries in the states are the natural nodes to this network."
educom undertook a study which was submitted to and discussed by a symposium held in washington, d. c., on february 10-12, 1970, with the participation of all agricultural libraries interested in "new and improved ways of exchanging information in support of agricultural research and education." the goal is "to develop a long-range plan for strengthening information, communication, and exchange among the libraries of land-grant institutions and the nal." according to the report, the network concept would constitute a "network of networks" and three basic components are envisioned: 1) land-grant libraries, 2) information analysis centers, and 3) telecommunications. all these components have their own aims and objectives described in this report. "nal's first course of action in the establishment of a system of information analysis centers is to develop a directory of existing analysis centers of interest to the agricultural community. the directory should be supported with a catalog detailing the services and products offered by these centers. nal should then establish cooperative agreements with these centers which would make them responsive to the needs of the users of the agricultural sciences information network. this should be supported with the installation of communications equipment to encourage and facilitate the use of a center." no doubt, the participants of the symposium will have thoroughly investigated and discussed this plan with serious consideration to its practical implementation. a new approach and improvement of information exchange is not only a necessity, but also long overdue, for those in agriculture. 
this information development plan would provide service for research workers at the experiment stations, scientists and teachers at the colleges, agricultural extension people at the land-grant institutions, and, last but not least, for the farmers who provide us with food and fibers in order to bring a fuller and better life on the farm and in rural and city homes.

a detailed analysis of the performance, an evaluation and revision of this gigantic scientific information system, can only be made after it has been in operation for a few years. it is very promising that the national agricultural library, among its many objectives, has again taken the initiative.

john de gara

cornell university libraries. manual of cataloging procedures. 2d ed. ithaca, n.y.: cornell university libraries, 1969. $18.00.

editor robert b. slocum and his associates have produced a valuable manual useful to catalogers and persons involved in the administration of policies and procedures in technical services. as stated in the preface the manual is a supplement, not a substitute, for the anglo-american cataloging rules and its predecessors, lc list of subject headings and the lc classification schedules. the following directive is basic: "the revisers are always open for consultation on particularly difficult problems, but it must be assumed that a professional cataloger will have a thorough knowledge of the basic tools of his profession. . . . if this knowledge is in any way lacking, the cataloger has the obvious responsibility of acquiring it through diligent study and experience. he should not come to the reviser with questions whose answers are available in the aforementioned tools and in this manual."

the format is loose-leaf, so that additions and revisions may be made easily to reflect new developments and techniques.
the sections include pre-cataloging procedures; general cataloging and classification procedures; recataloging and reclassification; cornell university college and department libraries-special collections and special catalogs; . . . serials and binding department; files and filing; typing, card production, book preparation; statistics; appendix (including abbreviations, romanization tables, etc.); and index. the procedures and practices described are those adopted by a research library "conscious of the need for both quality and quantity in the work of its staff."

this publication, weighing five pounds, is a great achievement and with its full index an indispensable contribution to the collection of worthwhile cataloging manuals. descriptions of local procedures may seem detailed but basic principles and policies are well covered. the final touch is the inclusion of a catalog card for the manual!

margaret oldfather

a comparative analysis of the effect of the integrated library system on staffing models in academic libraries
ping fu and moira fitzgerald
information technology and libraries | september 2013 47

abstract

this analysis compares how the traditional integrated library system (ils) and the next-generation ils may impact system and technical services staffing models at academic libraries. the method used in this analysis is to select two categories of ilss—two well-established traditional ilss and three leading next-generation ilss—and compare them by focusing on two aspects: (1) software architecture and (2) workflows and functionality. the results of the analysis suggest that the next-generation ils could have substantial implications for library systems and technical staffing models in particular, suggesting that library staffing models could be redesigned and key librarian and staff positions redefined to meet the opportunities and challenges brought on by the next-generation ils.
introduction

today, many academic libraries are using well-established traditional integrated library systems (ilss) built on the client-server computing model. the client-server model aims to distribute applications that partition tasks or workloads between the central server of a library automation system and all the personal computers throughout the library that access the system. the client applications are installed on the personal computers and provide a user-friendly interface to library staff. however, this model may not significantly reduce workload for the central servers and may increase overall operating costs because of the need to maintain and update the client software across a large number of personal computers throughout the library.1

since the global financial crisis, libraries have been facing severe budget cuts, while hardware maintenance, software maintenance, and software licensing costs continue to rise. the technology adopted by the traditional ils was developed more than ten years ago and is evidently outdated. the traditional ils does not have sufficient capacity to provide efficient processing for meeting the changing needs and challenges of today’s libraries, such as managing a wide variety of licensed electronic resources and collaborating, cooperating, and sharing resources with different libraries.2

ping fu (pingfu@cwu.edu), a lita member, is associate professor and head of technology services in the brooks library, central washington university, ellensburg, wa. moira fitzgerald (moira.fitzgerald@yale.edu), a lita member, is access librarian and assistant head of access services in the beinecke rare book and manuscript library, yale university, new haven, ct.
today’s libraries manage a wide range of licensed electronic resource subscriptions and purchases. the traditional ils is able to maintain the subscription records and payment histories but is unable to manage details about trial subscriptions, license negotiations, license terms, and use restrictions. some vendors have developed electronic resources management system (erms) products as standalone products or as fully integrated components of an ils. however, it would be more efficient to manage print and electronic resources using a single, unified workflow and interface. to reduce costs, today’s libraries not only band together in consortia for cooperative resource purchasing and sharing, but often also want to operate one “shared ils” for managing, building, and sharing the combined collections of members.3 such consortia are seeking a new ils that exceeds traditional ils capabilities and uses new methods to deliver improved services. the new ils should be more cost effective, should provide prospects for cooperative collection development, and should facilitate collaborative approaches to technical services and resource sharing. one example of a consortium seeking a new ils is the orbis cascade alliance, which includes thirty-seven universities, colleges, and community colleges in oregon, washington, and idaho. as a response to this need, many vendors have started to reintegrate or reinvent their ilss.
library communities have expressed interest in the new characteristics of these next-generation ilss; their ability to manage print materials, electronic resources, and digital materials within a unified system and a cloud-computing environment is particularly welcome.4 however, one big question remains for libraries and librarians: what implications will the next-generation ils have for libraries’ staffing models? little on this topic has been presented in the library literature. this comparative analysis intends to answer this question by comparing the next-generation ils with the traditional ils from two perspectives: (1) software architecture, and (2) workflows and functionality, including the capacity to facilitate collaboration between libraries and engage users. scope and purpose the purpose of the analysis is to determine what potential effect the next-generation ils will have on library systems and technical services staffing models in general. two categories of ilss were chosen and compared. the first category consists of two major traditional ilss: ex libris’s voyager and innovative interfaces’ millennium. the second category includes three next-generation ilss: ex libris’s alma, oclc’s worldshare management services (wms), and innovative interfaces’ sierra. voyager and millennium were chosen because they hold a large share of the current market and because the authors have experience with these systems. yale university library is currently using voyager, while central washington university library is using millennium. alma, wms, and sierra were chosen because these three next-generation ilss are produced by market leaders in the library automation industry. the authors have learned about these new products by reading and analyzing literature and vendors’ proposals, as well as attending vendors’ webinars and product demonstrations.
information technology and libraries | september 2013 49
in the long run, yale university library must look for a new library service platform to replace voyager, verde, metalib, sfx, and other add-ons. central washington university library is affiliated with the orbis cascade alliance mentioned above. the alliance is implementing a new library management service to be shared by all thirty-seven members of the consortium. ex libris, innovative interfaces, oclc, and serials solutions all bid for the alliance’s shared ils. after an extensive rfp process, in july 2012 the orbis cascade alliance chose ex libris’s alma and primo as its shared library services platform. the system will be implemented in four cohorts of approximately nine member libraries each over a two-year period, beginning in january 2013. the central washington university library is in the fourth migration cohort, and its new system will be live in december 2014. it is important to emphasize that the next-generation ils has no local online public access catalog (opac) interface. vendors use additional discovery products as the discovery-layer interfaces for their next-generation ilss. specifically, ex libris uses primo as the opac for alma, while oclc’s worldcat local provides the front-end interface for wms. innovative interfaces offers encore as the discovery layer for sierra. as front-end systems, these discovery platforms provide library users with one-stop access to their library resources, including print materials, electronic resources, and digital materials. while these discovery platforms will also impact library organization and librarianship, they will have more impact on the way that end-users, rather than library staff, discover and interact with library collections. in this analysis, we focus on the effects that back-end systems such as alma, wms, and sierra will have on library organizational structure and staffing, rather than the end-user experience.
as our sample only includes five ilss, the scope of the analysis is limited, and the findings cannot be generalized to all academic libraries. however, readers will gain some insight into what challenges any library may face when migrating to a next-generation ils. literature review a few studies have been published on library staffing models. patricia ingersoll and john culshaw’s 2004 book about systems librarianship describes vital roles that systems librarians play, with responsibilities in the areas of planning, staffing, communication, development, service and support, training, physical space, and daily operations.5 systems librarians are the experts who understand both library and information technology and can put the two fields together in context. the authors point out that systems librarians are the key players who ensure that a library stays current with new information technology. the daily and periodic operations for systems librarians include ils administration, server management, workstation maintenance, software and applications maintenance and upgrades, configuration, patch management, data backup, printing issues, security, and inventory. all of these duties together constitute the workloads of systems librarians. ingersoll and culshaw also emphasize that systems librarians must be proactive in facing constant changes and keep abreast of emerging library technologies. edward iglesias et al., based on their own experiences and observations at their respective institutions, studied the impact of information technology on systems staff.6 their book covers concepts such as the client-server computing model, web 2.0, electronic resource management, open-source, and emerging information technologies.
their 2010 study shows that, though there are many challenges inherent in the position, there are also many ways for systems staff to improve their knowledge, skills, and abilities to adapt to the changing information technologies. janet guinea has also studied the roles of systems librarians at an academic library.7 her 2003 study shows that systems librarians act as bridge-builders between the library and other university units in the development of library-initiated projects and in the promotion of information technology-based applications across campus. another relevant study was conducted by marshall breeding at vanderbilt university in an investigation of the library automation market. his 2012 study compares the well-established, traditional ilss that dominate the current market (and are based on client-server computing architecture developed more than a decade ago) to the next-generation ilss deployed through multitenant software-as-a-service (saas) models, which are based on service-oriented architecture (soa).8 through this comparison, breeding indicates that next-generation ilss will differ substantially from existing traditional ilss and will eliminate many hardware and maintenance investments for libraries. the next-generation ils will bring traditional ils functions, erms, digital asset management, link resolvers, discovery layers, and other add-on products together into one unified service platform, he argues.9 he coined a new term for the next-generation ils: library services platform.10 this term signifies that a conceptual and technical shift is happening: the next-generation ils is designed to realign traditional library functions and simplify library operations through a more inclusive platform designed to handle different forms of content within a single unified interface.
breeding concludes that the next-generation ils provides significant innovations, including management of print and electronic library materials, reliance on global knowledge bases instead of localized databases, deployment through multitenant saas based on a service-oriented architecture, and the provision of a suite of application programming interfaces (apis) that enable greater interoperability and extensibility.11 he also predicts that the next-generation ils will trigger a new round of ils migration.12 method our method narrowed down the analysis of the implications of ilss for library systems and technical services staffing models to two major aspects: (1) software architecture, and (2) workflows and functionality, including facilitation of collaborations between libraries and user engagement. first, we analyzed two traditional ilss, voyager and millennium, which are built on a client-server computing model, deliver modular workflow functionality, and are implemented in our institutions. through the analysis, we determined how these two aspects affect library organizational structure and librarian positions designed for managing these modular tasks. then, based on information we collected and grouped from vendors’ documents, rfp responses, product demonstrations, and webinars, we examined the next-generation ilss alma, wms, and sierra, which are based on soa and intended to realign traditional library functions and simplify library operations, to evaluate how these two factors will impact staffing models. to provide a more in-depth analysis, particularly for systems staffing models, we also gathered and analyzed online systems librarian job postings, particularly for managing the voyager or millennium system, for the past five years.
the purpose of this compilation is to cull a list of typical responsibilities of systems librarians and then determine what changes may occur when they must manage a next-generation ils such as alma, wms, or sierra. data on job postings were gathered from online job banks that keep an archive of past listings, including code4lib jobs, ala joblist, and various university job listing sites. duplicates and reposts were removed. the responsibilities and duties described in the job descriptions were examined for similarities to determine a typical list. the data from all sources were gathered together in a single database to facilitate its organization and manipulation. specific responsibilities, such as administering an ils, were listed individually, while more general responsibilities for which descriptions may vary from one posting to another were grouped under an appropriate heading. to ensure complete coverage, all postings were examined a second time after all categories had been determined. we also used our own institutions as examples to support the analysis. the implications of ils software architecture on staffing models voyager and millennium are built on client-server architecture. libraries that use these ilss also use add-ons, such as erms and link resolvers, to manage their print materials and licensed electronic resources. the installation, configuration, and updates of the client software require a significant amount of work for library it staff. many libraries must allocate substantial staff effort and resources to coordinating the installation of the new software on all computers throughout the library that access the system. those libraries that allow staff to work remotely have experienced additional costs and it challenges. in addition, server maintenance, backups, upgrades, and disaster recovery also require excessive time and effort of library it staff. 
administering ilss, erms, and other library hardware, software, and applications is one of the primary responsibilities for a library systems department. positions such as systems librarian, electronic resource librarian, and library it specialist were created to handle this complicated work. at a very large library, such as yale university library, the systems group of library it is only responsible for voyager’s configuration, operation, maintenance, and troubleshooting. two other it support groups (a library server support group and a workstation support group) are responsible for installation, maintenance, and upgrade of the servers and workstations. specifically, the library server support group deals with the maintenance and upgrade of ils servers and the software and relational database running on the servers, while the workstation support group takes care of the installation and upgrade of the client software on hundreds of workstations throughout twenty physical libraries. at a smaller library, such as central washington university library, on the other hand, one systems librarian is responsible for the administration of millennium, including configuration, maintenance, backup, and upgrade on the server. another library it staff member helps install and upgrade the millennium client on about forty-five staff computers throughout its main library and two center campus libraries. comparatively, the next-generation ilss alma, wms, and sierra have a saas model designed by soa principles and deployed through a cloud-based infrastructure. oclc defines this model as “web-scale management services.”13 using this innovation, service providers are able to deliver services to their participating member institutions on a single, highly scalable platform, where all updates and enhancements can be done automatically through the internet.
the different participating member institutions using the service can configure and customize their views of the application with their own brandings, color themes, and navigational controls. the participating member institutions are able to set functional preferences and policies according to their local needs. web-scale services reduce the total cost of ownership by spreading infrastructure costs across all the participating member institutions. the service providers have complete control over hardware and software for all participating member institutions, dramatically eliminating capital investments on local hardware, software, and other peripheral services. service providers can centrally implement applications and upgrades, integration across services, and system-wide infrastructure requirements such as performance reliability, security, privacy, and redundancy. thus participating member institutions are relieved from this burdensome responsibility that has traditionally been undertaken by their it staff.14 from this perspective, the next-generation ils will have a huge impact on library organizational structure, staffing, and librarianship. since the next-generation ils is implemented through the cloud-computing model, there is no requirement for local staff to perform the functions traditionally defined as “systems” staff activities, such as server and storage administration, backup and recovery administration, and server-side network administration. for example, the entire interfaces of alma and wms are served via web browser; there is no need for local staff to install and maintain clients on local workstations. therefore, if an institution decided to migrate to a next-generation ils, the responsibilities and roles of systems staff within the institution would need to be readdressed or redefined. 
we have learned from attending oclc’s webinars and product demonstrations that library systems staff would be required to prepare and extract data from their local systems during new systems implementation. they also would be required to configure their own settings such as circulation policies. however, after the migration, a systems staff member would likely serve as a liaison with the vendor. this would require, according to oclc’s proposal, only 10 percent of the systems staff’s time on an ongoing basis. through attending ex libris’s webinars and product demonstrations, we have learned that a local system administrator may be required to take on basic management processes, such as record-loading or integrating data from other campus systems. similarly, we have learned from innovative interfaces’ webinars and product demonstrations that sierra would still need local systems expertise to perform the installations of the client software on staff workstations. sierra would require library it staff to perform administrative tasks like the user account administration and to support sierra in interfacing with local institution-specific resources. in general, as shown in table 1, local systems staff could be freed from the burdensome responsibility of administering the traditional ils because of the software architecture of the next-generation ils.

| systems librarian responsibilities | workload percentage | traditional ils | next-gen ils |
| managing ils applications, including modules and the opac | 10 | x | |
| managing associated products such as discovery systems, erms, link resolver, etc. | 10 | x | |
| day-to-day operations including management maintenance, troubleshooting, and user support | 10 | x | x |
| server maintenance, database maintenance and backup | 10 | x | |
| customizations and integrations | 5 | x | x |
| configurations | 5 | x | x |
| upgrades and enhancements | 5 | x | |
| patches or other fixes | 5 | x | |
| design and coordination of statistical and managerial reports | 5 | x | x |
| overall staff training | 5 | x | x |
| primary representative and contact to the designated library system vendors | 5 | x | x |
| keeping abreast of developments in library technologies to maintain current awareness of information tools | 5 | x | x |
| engaging in scholarly pursuit and other professional activities | 10 | x | x |
| serving on various teams and committees | 5 | x | x |
| reference and instruction | 5 | x | x |
| total | 100 | 100% | 60% |

table 1. systems librarian responsibilities comparison for traditional ils and next-generation ils.

note: the systems librarian responsibilities and the approximate percentage of time devoted to each function are slightly readjusted based on the compiled descriptions of the systems librarian job postings we collected and analyzed from the internet and from vendors’ claims. a total of 47 position descriptions were gathered. the workload percentage is adopted from the job description of the systems librarian position at one of our institutions.

our analysis shows that systems staff might reduce their workload by approximately 40 percent. therefore library systems staff could use their time to focus on local applications development and other library priority projects.
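The totals in table 1 can be checked with a few lines of arithmetic. The sketch below is only an illustration: the row labels are abbreviated, and it assumes that a row carrying a single "x" applies to the traditional ils only, which is the reading consistent with the printed totals of 100% and 60%.

```python
# Rows of table 1 as (responsibility, workload %, traditional ILS, next-gen ILS).
# Assumption: a row with a single "x" in the source applies to the traditional ILS only.
TABLE_1 = [
    ("managing ils applications, including modules and the opac", 10, True, False),
    ("managing associated products (discovery, erms, link resolver)", 10, True, False),
    ("day-to-day operations, troubleshooting, and user support", 10, True, True),
    ("server maintenance, database maintenance and backup", 10, True, False),
    ("customizations and integrations", 5, True, True),
    ("configurations", 5, True, True),
    ("upgrades and enhancements", 5, True, False),
    ("patches or other fixes", 5, True, False),
    ("statistical and managerial reports", 5, True, True),
    ("overall staff training", 5, True, True),
    ("primary contact to library system vendors", 5, True, True),
    ("keeping abreast of library technologies", 5, True, True),
    ("scholarly pursuit and professional activities", 10, True, True),
    ("teams and committees", 5, True, True),
    ("reference and instruction", 5, True, True),
]


def total_workload(rows, column):
    """Sum the workload percentages of the rows marked in the given column (2 or 3)."""
    return sum(row[1] for row in rows if row[column])


traditional_total = total_workload(TABLE_1, 2)
nextgen_total = total_workload(TABLE_1, 3)
freed = traditional_total - nextgen_total

# Responsibilities that no longer fall to local staff under this reading of the table.
vendor_side = [row[0] for row in TABLE_1 if row[2] and not row[3]]

print(traditional_total, nextgen_total, freed)  # 100 60 40
```

Under that reading, the five responsibilities that drop out (application management, associated products, server and database maintenance, upgrades, and patches) account for the approximately 40 percent reduction reported in the text.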
however, it is important to emphasize that library systems staff should reengineer themselves by learning how to use apis provided by the next-generation ils so that they will be able to support the customization of their institutions’ discovery interfaces and the integration of the ils with other local enterprise systems, such as financial management systems, learning management systems, and other local applications. the implications of ils workflows and functionality on staffing models the typical workflow and functionality of both voyager and millennium are built on a modular structure. major function modules, called client modules, include systems administration, cataloging, acquisitions, serials, circulation, and statistics and reports. additionally, the traditional ils provides an opac interface for library patrons to access library materials and manage their accounts. millennium has an erms module built in as a component of their ils while ex libris has developed an independent erms as an add-on to voyager. the systems administration module is used to add system users and to set up locations, patron types, material types, and other library policies. the cataloging module supports the functions of cataloging resources, managing the authority files, tagging and categorizing content, and importing and exporting bibliographic records. the sophistication of the cataloging module depends primarily on the ils. the acquisitions module helps in the tracking of purchases and acquisition of materials for a library by facilitating ordering, invoicing, and data exchange with serial, book, and media vendors through electronic data interchange (edi). the circulation module is used to set up rules for circulating materials and for tracking those materials, allowing the library to add patrons, issue borrowing cards, and form loan rules. it also automates the placing of holds, interlibrary loan (ill), and course reserves. self-checkout functionality can be integrated as well. 
the serials module is essentially a cataloging module for serials. libraries are often dependent on the serials module to help them track and check-in serials. the statistics and reports module is used to generate reports such as circulation statistics, age of collection, collection development, and other customized statistical reports. a typical traditional ils comprises a relational database, software to interact with that database, and two graphical user interfaces—one for patrons and one for staff. it usually separates software functions into discrete modules, each of them integrated with a unified interface. the traditional ils’s modular design was a perfect fit for a traditional library organizational structure. the staff at central washington university library, for example, under the library administration, are organized into the following three major groups: public services, including the reference and circulation departments; technical and technology services, including the cataloging, collection development, serials & electronic resource, and systems departments; and other library services and centers, including the government documents department, the music library, two center campus libraries, the academic and research commons, and the rare book collection & archive. each department has at least one professional librarian and other library staff members responsible for their daily operations. for example, the collection development librarian is responsible for the acquisition of print monographs and serials, while the electronic resource librarian is responsible for purchasing and managing licensed databases or e-journals. however, the next-generation ils significantly enhances and reintegrates the workflow of traditional ils functions. the functionality is quite different from the traditional ils’s modular structure.
the design of the functionality stresses two principles: modularity and extensibility. it brings together the selection, acquisition, management, and distribution of the entire library collection. it provides a centralized data-services environment to its unified workflows for all types of library assets. one of the big enhancements of the next-generation ils is the acquisitions module, which enables the management of both print and electronic materials within a single unified interface, with no need to move between modules or multiple systems for different formats and related activities. for example, according to oclc, wms streamlines selection and acquisition processes via built-in access to worldcat records and publisher data. vendor, local, consortium, and global library data share the same workflows. wms automatically creates holdings for both physical and electronic resources. the worldcat knowledge-base simplifies electronic resource management and delivery. order data from external systems can be automatically uploaded. for consortium users, wms’s unified workflow and interface fosters efficient resource-sharing between different institutions whose holdings share a common format. similarly, ex libris’s alma has an integrated central knowledge base (ckb) that describes available electronic resources and packages, so there is no need to load additional descriptive records when acquiring electronic resources based on the ckb. the purchasing workflow manages orders for both print and electronic resources in a very similar way and handles some aspects unique to electronic resources, such as license management and the identification of an access provider. staff users can start the ordering process by searching the ckb directly and ordering from there. this search is integrated into the repository search, allowing a staff user to perform searches both in his or her institution as well as in the community zone, which holds the ckb. 
the next-generation ils provides unified data services and workflows, and a single interface to manage all physical, electronic, and digital materials. this will require libraries to rethink their acquisitions staffing models. for example, small libraries could merge the acquisition librarian position and the electronic resource librarian position or reorganize the two departments. another functionality enhancement of the next-generation ils provides the authoritative ability for consortia users to manage local holdings and collections as well as shared resources. for example, wms’s single shared knowledge base eliminates the need for each library to maintain a copy of a knowledge base locally, because all consortia members can easily see what is licensed by other members of the consortia. cataloging records are shared at the consortium and global levels in real time. each institution immediately benefits from original cataloging records added to the system and from enhancements to existing records. authority control is built into worldcat, so there is no need to do authority processing against local bibliographic databases. with real-time circulation between libraries’ collections, there is no need to re-create bibliographic and item data in separate local systems. similarly, sierra enhances the traditional technical services workflows by providing a shared bibliographic database. whenever a member library performs selection or ordering, the library is able to determine if other consortia members have already selected, ordered, and cataloged the title. this may impact a local selection, allowing consortia members to more collectively develop their individual collections and reduce duplication.
alma’s centralized metadata management service (mms) takes a very similar approach to wms and sierra, allowing several options for local control and shared cataloging, depending on an institution’s needs, while ex libris maintains authority files. very large institutions, for example, might manage some records in the local catalog and most records in a shared bibliographic database, while smaller institutions might manage all of their records in the shared bibliographic database. all these approaches require more collaboration and cooperation between consortia members. according to vendors’ claims in their proposals to the orbis cascade alliance, small institutions might not need to have a professional cataloger, since the cataloging process is simplified and it is therefore easier for paraprofessional staff to operate and copy bibliographic records from the knowledge bases of these ilss. in addition, the next-generation ils also allows library users to actively engage with ils software development. for example, by adding opensocial containers to the product, wms allows library developers to use apis to build social applications called gadgets and add these gadgets to wms. one example highlighted by oclc is a gadget in the acquisitions area of wms that will show the latest new york times best sellers and how many copies the library has available for each of those titles. similarly, sierra’s open developer community will allow library developers to share ideas, reference code samples, and build a wide range of applications using sierra’s web services. also, sierra will provide a centralized online resource called sierra developer sandbox to offer a comprehensive library of documented apis for library-developed applications. all these enhancements provide library staff with new opportunities to redefine their roles in a library.
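To make the best-seller gadget described above more concrete, the sketch below shows only the core lookup such a gadget might perform: matching a list of titles against local holdings and reporting available copies. It does not use the actual opensocial, wms, or new york times interfaces; the `Holding` class, the `bestseller_availability` function, and the sample titles are hypothetical and exist purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Holding:
    """Hypothetical local holdings record; not a WMS data structure."""
    title: str
    copies_total: int
    copies_on_loan: int

    @property
    def copies_available(self) -> int:
        return self.copies_total - self.copies_on_loan


def bestseller_availability(bestsellers, holdings):
    """Return (title, copies available) for each best-seller title, in list order.

    Titles the library does not hold are reported with zero copies.
    """
    by_title = {h.title.lower(): h for h in holdings}
    rows = []
    for title in bestsellers:
        holding = by_title.get(title.lower())
        rows.append((title, holding.copies_available if holding else 0))
    return rows


# Placeholder data standing in for a best-seller feed and the local catalog.
sample_bestsellers = ["Title A", "Title B", "Title C"]
sample_holdings = [
    Holding("Title A", copies_total=3, copies_on_loan=1),
    Holding("Title C", copies_total=2, copies_on_loan=2),
]

for title, available in bestseller_availability(sample_bestsellers, sample_holdings):
    print(f"{title}: {available} available")
```

In a real gadget the two inputs would come from a best-seller feed and the ils’s own web services; here they are hard-coded so the matching logic can be read on its own.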
conclusions and arguments in summary, compared to the client-server architecture and modular design of the traditional ils, the next-generation ils has an open architecture and is more flexible and unified in its workflow and interface, which will have a huge impact on library staffing models. the traditional ils specifies clear boundaries between staff modules and workflows while the next-generation ils has blurred these boundaries. the integration and enhancement of the functionality of the next-generation ils will help libraries streamline and automate workflows and processes for managing both print and electronic resources. it will increase libraries’ operational efficiency, reduce the total cost of ownership, and improve services for users. particularly, it will free approximately 40 percent of library systems staff time from managing servers, software upgrades, client application upgrades, and data backups. moreover, the next-generation ils provides a new way for consortial libraries to collaborate, cooperate, and share resources. in addition, the web-scale services provided by the next-generation ils allow libraries to access an infrastructure and platforms that enable them to reach a broad, geographically diverse community while simultaneously focusing their services on meeting the specific needs of their end-users. thus the more integrated workflows and functionality allow library staff to work with more modules, play multiple roles, and back up each other, which will bring changes to traditional staffing models. however, the next-generation ils also brings libraries new challenges along with its clear advantages. librarians and library staff might have concerns pertaining to their job security and can be fearful of new technologies.
they may feel anxious about how to reengineer their business processes, how to get training, how to improve their technological skills, and how to prepare for a transition. we argue here that library directors might think about these staff frustrations and find ways to address their concerns. libraries should provide staff more opportunities and training to help them to improve their knowledge and skills. redefining job descriptions and reorganizing library organizational structures might be necessary to better adapt to the changes brought about by the next-generation ils. systems staff might invest more time in local application developments, other digital initiatives, website maintenance, and other library priority projects. technical staff might reconsider their workflows and cross-train themselves to expand their knowledge and improve their work efficiency. they might spend more time on data quality control and special collection development or interact more with faculty on book and e-resource selections. we hope this analysis will provide some useful information and insights for those libraries planning to move to the next-generation ils. the shift will require academic libraries to reconsider their organizational structures and rethink their manpower distribution and staffing optimization to better focus on library priorities, projects, and services critical to their users.
references
1. marshall breeding, “a cloudy forecast for libraries,” computers in libraries 31, no. 7 (2011): 32–34.
2. marshall breeding, “current and future trends in information technologies for information units,” el profesional de la información 21, no. 1 (2012): 11.
3. jason vaughan and kristen costello, “management and support of shared integrated library systems,” information technology & libraries 30, no. 2 (2011): 62–70.
4. marshall breeding, “agents of change,” library journal 137, no. 6 (2012): 30–36.
5. patricia ingersoll and john culshaw, managing information technology: a handbook for systems librarians (westport, ct: libraries unlimited, 2004).
6. edward g. iglesias, an overview of the changing role of the systems librarian: systemic shifts (oxford, uk: chandos, 2010).
7. janet guinea, “building bridges: the role of the systems librarian in a university library,” library hi tech 21, no. 3 (2003): 325–32.
8. breeding, “agents of change,” 30.
9. ibid.
10. ibid., 33.
11. ibid., 33.
12. ibid., 30.
13. sally bryant and grace ye, “implementing oclc’s wms (web-scale management services) circulation at pepperdine university,” journal of access services 9, no. 1 (2012): 1.
14. gary garrison et al., “success factors for deploying cloud computing,” communications of the acm 55, no. 9 (2012): 62–68.
226 technical communications reports-library projects and activities light-pen technology at the university of south carolina-the south carolina circulation system
for some years at the university of south carolina studies have been underway to perfect a computer-based circulation system, and every avenue of input has been explored. about three years ago the new light-pen inventory control system which was being developed for use in the retail trade became known to the library. this device, using a light-readable label about one inch square in size, seemed to be a far better input device for an inventory control system utilizing identification cards and books than the traditional punched card. after as much research as could be done in such a new field, the library staff examined light-pen systems marketed by ncr, checkpoint-plessey, and the monarch marking system, a subsidiary of pitney-bowes. all of these systems were similar in technology but the hardware and interest of the companies varied.
monarch and plessey were very interested and willing to cooperate with libraries. plessey (through checkpoint) has developed and is marketing a library circulation system. at carolina the decision was made to develop an in-house batch system using the monarch light-pen and its technology coupled to a digital equipment corporation pdp-11/10, 16k, dual dectape system. the basis for the functioning of the system is the light-pen stations at the circulation desk where the books are charged and discharged by running the light-pen rapidly over the light-readable label on the patron's id card and the label in the front inside cover of the book. one quick pass over the id card sets the system, followed by a pass over each book label in succession. at present the library has three light-pen stations and anticipates adding a fourth, which will be sufficient for meeting all of the main library's needs for the foreseeable future. the light-pen control boxes were constructed on campus. they turn the system on, and have message lights, trouble lights, and charge points for up to five different due dates. any number of light-pen stations can be attached. a hazeltine 2000 crt is used to show all transactions on the screen as they occur and to serve as the system console. a decwriter, used as a line printer, insures a backup system and gives a printout of transactions. the decwriter was selected because its thirty cps speed is adequate, it is highly reliable, and the price is right. the pdp-11 is used as a batch controller. it does not convert the label data to human-readable data; this is done at the central computer center. each night after the circulation desk closes, a telephone link is made to the university's central computer and the day's scanned data are pumped into the big computer. while the system is a batch design, it incorporates the features of an on-line system without the high cost.
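the charging sequence described above (one pass over the id card, then one pass over each book label) can be sketched as a small grouping routine of the kind a nightly batch run might apply to the day's scans. this is only an illustrative sketch: the text does not describe the actual label encoding or record layout, so the ("ID" / "BOOK", label) tuples, the Charge record, and the function name group_scans are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class Charge:
    """One book charged to one patron (hypothetical record layout)."""
    patron_label: str
    book_label: str


def group_scans(scans: Iterable[Tuple[str, str]]) -> List[Charge]:
    """Turn a day's stream of light-pen scans into charge records.

    Each scan is a (kind, label) pair, where kind is "ID" for a patron
    card or "BOOK" for a book label. An ID scan sets the current patron;
    every BOOK scan that follows is charged to that patron until the
    next ID scan arrives.
    """
    charges: List[Charge] = []
    current_patron: Optional[str] = None
    for kind, label in scans:
        if kind == "ID":
            current_patron = label
        elif kind == "BOOK":
            if current_patron is None:
                raise ValueError("book label scanned before any id card")
            charges.append(Charge(current_patron, label))
        else:
            raise ValueError(f"unknown scan kind: {kind!r}")
    return charges


if __name__ == "__main__":
    day = [("ID", "P1"), ("BOOK", "B1"), ("BOOK", "B2"),
           ("ID", "P2"), ("BOOK", "B3")]
    for charge in group_scans(day):
        print(charge.patron_label, charge.book_label)
```

keeping the desk-side logic this simple is consistent with the division of labor the text describes, where the minicomputer only collects scans and the conversion to readable records happens later at the central computer.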
if a patron inquires about an item, a glance at the updated patron report and/or an inquiry into the current activity file through the system console can answer questions on the location of all books in the system. unlike some systems, this one has not required that the library change its hours of operation or data input. when the library opens in the morning, all reports are distributed. the library gets its in-process, or charge, file, in clear text without borrower information for use by patrons to see which books are charged out. complete circulation files with borrower information are furnished to the circulation staff on com fiche. a total of 10,335 charge records is on each four-by-six-inch fiche. the record includes patron name, status, and social security number, call number of the book, item number, date checked out, date due, author, and title. in addition, there is a field for charging to graduate carrels within the library. notices are periodically written for overdue books and there is an indication in the charge file showing how many notices have been sent. personal reserves or holds can be placed on a book by simply keying the book number into the crt. when a book is returned, a message light on the controller lights so that the staff member will know that the book has a hold on it. similar procedures will put a "hold" on any borrower whom the library needs to reach. the printouts are generally by call number; however, it is possible to get lists of all books by borrower, or in other formats. statistics are obtainable in almost any configuration, including number of books checked out to different categories of borrowers, by individual title, etc. the total cost for the hardware in this system was $38,381.04. maintenance contracts, telephone charges, and miscellaneous operational costs add up to about $357.04 monthly. labels cost $1.70 per thousand. the total cost of the system amortized over a five-year period is no more than $975.00 per month.
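as a rough check on the cost figures just quoted, the sketch below spreads the hardware cost evenly over sixty months and adds the stated monthly operating costs. the text does not say how its amortized figure was derived, so straight-line amortization is an assumption here, and the function names are illustrative only. with the figures as printed, this method gives about $639.68 per month for hardware and about $996.72 per month in total, somewhat above the "no more than $975.00" stated, so the result should be read as an approximation rather than a reconstruction of the original calculation.

```python
# Figures as printed in the report.
HARDWARE_COST = 38_381.04          # total hardware cost, dollars
MONTHLY_OPERATING = 357.04         # maintenance, telephone, miscellaneous
LABEL_COST_PER_THOUSAND = 1.70     # dollars per 1,000 labels
AMORTIZATION_MONTHS = 5 * 12       # five-year period


def monthly_hardware_share(cost: float = HARDWARE_COST,
                           months: int = AMORTIZATION_MONTHS) -> float:
    """Straight-line share of the hardware cost per month (assumed method)."""
    return cost / months


def monthly_total(operating: float = MONTHLY_OPERATING) -> float:
    """Hardware share plus the stated monthly operating costs."""
    return monthly_hardware_share() + operating


def label_cost(n_labels: int) -> float:
    """Cost of n_labels light-readable labels at the quoted rate."""
    return n_labels / 1000 * LABEL_COST_PER_THOUSAND


if __name__ == "__main__":
    print(f"hardware per month: ${monthly_hardware_share():.2f}")
    print(f"total per month:    ${monthly_total():.2f}")
    print(f"20,000 labels:      ${label_cost(20_000):.2f}")
```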
after that the only continuing costs will be maintenance contracts and labels. additional light-pen stations can be added in the same building for about $1,200 each. light-pen stations can be added in branch libraries for about $2,400 each. not included in the cost figures is central computer time, which is held to a minimum by the batch features and software development. at the university of south carolina computer services are not charged to the individual department but are treated as a campus-wide service. all of the com fiche is produced on the campus. the entire project was developed and put into operation between april 1973 and january 1974. the first books were officially charged out on january 15, 1974 and the system has been in continuous operation since then. needless to say, problems of a special nature had to be solved, for example, issuing id cards with light-readable labels to 20,000 students in two days, acquiring labels which are permanent for the books, constructing the control boxes which are unique, and solving the usual telephone and computer difficulties. suffice it to say, the entire project was planned and became operational in less than eight months without employing additional staff. although the system is still being refined, the performance has been spectacular.

kenneth f. toombs, director of libraries, university of south carolina.

public access cable tv information center at new york public library

the new york public library (nypl), recognizing public access cable television as an important social tool, has assembled in the mid-manhattan library one of the most extensive collections, and possibly the first, on the subject. noncommercial public access television has been characterized as a people-oriented television system that can respond to and reflect society in terms of culture, language, history, experience, and race.
the collection is designed for readers seeking information on all aspects of public access cable tv, both practical and theoretical, with a significant portion devoted to television as a community tool. the mid-manhattan collection includes books, pamphlets, periodicals, and microforms. related materials are also collected to document television activities of programming intentions similar to public access tv. it is hoped that the mid-manhattan project will provide a prototype for other libraries beginning collections of public access cable tv.

journal of library automation vol. 7/3 september 1974

the collection's book materials emphasize three main areas of interest: programming and the audience, the educational potential of public access television, and legislative controls. pamphlet materials include information on ethnic involvement, women's groups, conferences and conventions, library activities in video, bibliographies, and other current topics, and are accessible through a vertical file index. the subject headings in the file reflect the "rule of probable association" whereby the first meaningful word (after the assumed word television) is used. if the reader knows what he wants, aided by the cross-reference system, he can easily identify the proper subject file. the collection also includes periodical indexes leading to a wide variety of journal articles on different aspects of video. the eric (educational resources information center) microfiche series is available from 1971 to date and includes much published and unpublished research. a special feature of the collection is the card file which lists hard-to-find information concerning organizations, associations, and periodicals in the field. such commercial groups as video cassette manufacturers as well as alternative groups making tapes can all be located in the file.
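the "rule of probable association" mentioned above for the vertical file can be illustrated with a short sketch that picks the filing word for a heading: skip the assumed word television, skip connecting words, and file under the first word that remains. the text does not define what counts as a "meaningful" word, so the stopword list below is an assumption, as are the function name and the sample headings.

```python
ASSUMED_WORD = "television"

# Assumed list of words that carry no filing value; the library's
# actual practice is not described in the text.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "on", "for", "to", "by"}


def filing_word(heading: str) -> str:
    """Return the first meaningful word of a subject heading.

    The assumed word "television" and common connecting words are
    skipped; the first remaining word is the one to file under.
    """
    for raw in heading.lower().split():
        word = raw.strip(".,:;()\"'")
        if not word or word == ASSUMED_WORD or word in STOPWORDS:
            continue
        return word
    raise ValueError(f"no meaningful word in heading: {heading!r}")


if __name__ == "__main__":
    for heading in ("television and the community",
                    "television: legislative controls",
                    "the educational potential of television"):
        print(heading, "->", filing_word(heading))
```

a reader who knows the topic wanted can apply the same rule mentally to guess the file heading, which is the point of the cross-reference system the text describes.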
contact: richard hecht, history and social science department, mid-manhattan library, 8 e. 40th st., new york, ny 10018.

reports on regional projects and activities

salinet: satellite library information network

this project is designed to experiment in the extension of library services to sparsely populated regions of the rocky mountain and northern plains states. the project has been awarded "designated user" status on the communications technology satellite to be launched by nasa in late 1975. salinet is one of the first attempts to experiment in delivering library services via satellite. a group meeting in denver described plans to use the world's most powerful communications satellite as an extension of local library resources for residents of twelve mountain and plains states. the national space agency, the multistate federation of rocky mountain states, and several library oriented groups and agencies serving the area will pool their expertise and resources in the program, which will begin planning late this year. the library information and development program is a new passenger on the educational satellite which will demonstrate new means of helping to teach residents of far-flung portions of the rocky mountain states and assist them in their information needs during a two-year period beginning next fall. four interests are represented in the library oriented project, which bears the acronym of salinet (satellite library information network). the university of denver graduate school of librarianship, the university of kansas libraries, the wyoming state library, and natrona county (wyoming) library are the principals in the consortium. each institution is responsible for certain portions of the library program, which will benefit both libraries and their patrons in the mountain and plains states. dr. margaret knox goggin, dean of the du graduate school of librarianship, is principal investigator on the library program.
her co-workers representing other members of the consortium include kenneth e. dowlin, director of natrona county library, casper, wyoming; william williams, wyoming state librarian; and robert malinowsky, assistant director for public service, education, and statistics at the university of kansas libraries. also taking part in the salinet program are the bibliographical center for research, rocky mountain region, inc.; the federation of rocky mountain states; and the mountain-plains library association. these groups will assist with programming, broadcast, and engineering requirements, utilization, and research. the proposed program will utilize fifty-six satellite ground stations which will be in place as part of the federation of rocky mountain states' satellite technology demonstrations. twenty participating libraries in the states of north dakota, south dakota, nebraska, and kansas will

the student as "physician" has selected in managing the case. the learning center offers this and other kinds of audiovisual programs designed to enhance textbook and classroom learning. computers, video cassettes, slide projectors, and models enable the student to experience close up and at his or her own speed areas of medicine that often cannot be presented as well in lectures or textbooks. in addition to the audiovisual materials, the learning center provides students with periodicals, lecture notes, and reference texts. students can watch dissections, see examples of blood cell abnormalities, hear the sounds of healthy and defective heartbeats, and examine oversize plastic models of the brain, the heart, and other parts of the human anatomy. and, with the computer learning programs, the students can participate in a case and make choices to guide its outcome.
medical students learning about iron metabolism last year were split into two groups by their instructor so that half attended traditional classroom lectures and half learned the unit from the computer. final examinations showed no difference between the two groups, according to their teacher, dr. james mcarthur, formerly associate professor of medicine. the students preferred human teachers in small group tutorial sections for this unit, mcarthur said, but generally they were in favor of computer instruction. mcarthur, now assistant director of the health sciences learning resources center at the university of washington in seattle, said that the crowd of students usually found at the learning center is an indication of its success.

announcements

resolution

whereas, the american library association is the chief advocate for librarians and laymen seeking to provide citizens of the united states with the highest quality library and information service, and

whereas, a major effort will be required of this association and of all supporters of libraries in the next few years as the country's leaders determine long-range national positions in such matters as copyright, intellectual freedom, federal support of libraries, and a national plan for libraries and information services, and

whereas, the effectiveness of this effort will depend on the concerted effort of all those concerned with library service, including library users, citizens groups, government officials and librarians themselves from all aspects and ranks of the profession;

therefore let it be resolved that all the committees, chapters and divisions of the american library association take definite steps to increase mutual efforts within the association and with other associations seeking ways to strengthen the common effort toward the provision of quality library service to all people.
and let it be further resolved that chapter councilors, division officers, the legislation assembly and chairpersons of committees and round tables, affiliated organizations and related groups transmit this resolution to members of their respective units.

adopted by the ala legislation committee on july 9, 1974.

isad president receives award

frederick g. kilgour, director of the ohio college library center, received the margaret mann citation for 1974, on july 9 at the program meeting of the resources and technical services division of the american library association, during the annual conference of ala in new york city, the week of july 7-13, 1974. the award recognizes outstanding professional achievement in the areas of cataloging or classification. mr. kilgour received his a.b. from harvard and studied library service at columbia while working at the harvard college library. he worked at the office of strategic services in washington, d.c. and later became deputy director of the

-loo, ontario) and edwin buchinski (universite d'ottawa): canadian marc, canadian cataloging task group, union lists of serials, prospects for cooperation, unique cataloging problems (e.g., dual language requirements), large serial data bases.

registration will be $70.00 to members of either ala or asis, $85.00 to nonmembers, and $20.00 to library school students. registration includes one lunch, a reception, and a copy of the marc serials manual. for hotel reservation information and a registration blank, write to donald p. hammer, isad, american library association, 50 e. huron st., chicago, il 60611.

washington university school of medicine library: book catalog

the washington university school of medicine library, st. louis, announces the publication of its catalog of books 1970-1973, containing all entries for monographs cataloged at this library from january 1, 1970 to december 31, 1973.
the first part, the register, consists of the complete citations arranged in the order cataloged. the second part consists of three alphabetical indexes to the register: name, title and series, and subject indexes. the catalog is on thirty microfiche, 24x reduction, and the price is $15.00. orders can be filled or additional information obtained from doris bolef, assistant librarian for technical and informational services.

proceedings of informatics/ucla symposium available on tapes

high fidelity recordings of proceedings from the annual data processing symposium held march 27-29, 1974, at the university of california, los angeles, are now available on cassette tapes. the subject of this year's conference, cosponsored by informatics inc., los angeles, and ucla, was "information systems and networks: the new world of information retrieval available to your organization through computer networks." the complete program, recorded by convention seminar cassettes, north hollywood, can be ordered by session or in total for review whenever convenient. cassettes one and two cover session one, the evolution of interactive information systems; cassettes three and four include session two, data bases; cassettes five and six, session three, on-line information retrieval systems; cassettes seven and eight, session four, cost effectiveness of information retrieval systems and networks; and cassettes nine and ten, session five, information networks in the 1980s. each set of two cassettes covering one session is priced at $10.95. the entire series can be purchased for $49.95 in an easy-to-store cassette album file. prices include postage and handling. to order, contact convention seminar cassettes, 13356 sherman way, north hollywood, ca 91605; tel: (213) 765-2777.

nonprint media institute

a nonprint media institute will be held in galveston, texas on october 15, 1974, southwestern library association's annual conference registration day.
the one-day institute, sponsored by swla, will feature morning speakers, including pearce grove discussing progress in resolving differences among three cataloging standards for nonprint media, and vivian schrader, head of the av section of library of congress, reporting on the progress of lc's nonprint cataloging standards. afternoon informal discussion forums will focus on technical service handling of art prints, microforms, films, kits, phonorecords, and audiotape. the nonprint media institute is open to members and nonmembers of swla, but is limited to 150 registrants. registration fee is $20.00. for registration, hotel reservations, and transportation information, write: ann adams, head cataloger, houston public library, 500 mckinney, houston, tx 77002.

international standards for cataloging: an institute on isbd, issn, nsdp and chapter 6, aacr

the seventh annual institute of the li-

zurkowski, president, information industry association, 4720 montgomery lane, bethesda, md 20014; tel: (301) 654-4150.

commercial services and developments

10,000 computer program abstracts in ncpas data base

the national computer program abstract service (ncpas), a clearinghouse for computer program abstracts, has categorized over 10,000 abstracts into 142 subject areas in its latest newsletter. these abstracts of simulation models, application and computational programs, and information retrieval systems are derived from business, government, industry, military, and universities. all fields of knowledge are included and are grouped into the following general categories: biosciences, medical sciences, business, manufacturing, management, education, libraries, environment, ecology, nature, government (federal, state, local), urban affairs, legal, humanities, specific industries, public utilities, military, science, and engineering.
this service should be of value to a present or potential user of computer programs, a vendor with a program to sell, or a professor developing programs in the academic community. programs can be listed in the data base free of charge. the service is problem oriented. the program abstract information is disseminated in two forms: (1) a program index newsletter, which includes a detailed index of the available subjects and the number of abstracts available for each subject (updated quarterly); the newsletter cost is $10.00 per year ($5.00 additional for foreign airmail); and (2) a subject abstract report, which includes all the abstracts available in the ncpas data base on a particular subject identified in the program index newsletter; the abstract report cost is $10.00 for the first 200 abstracts and $5.00 for up to each additional 200 abstracts ($5.00 additional for foreign airmail). for additional information contact: ncpas, p.o. box 3783, washington, dc 20007.

communications by telephone

a lightweight communications system about the size of a suitcase is now being introduced that can take the travel, and the cost, out of meetings. the new solid state system is the darome edu-com, a portable self-contained communications unit with four microphones, that uses regular telephone lines. edu-com enables groups of people in different places across the country to confer together as easily as if they were all in the same room. the cost of a one-hour meeting with participants located coast-to-coast is a few hundred dollars. manufactured by darome, inc., of harvard, illinois, makers of modular sound systems equipment, the edu-com unit plugs into an inexpensive, standard telephone coupler, a device supplied by the telephone company. the number of locations that can be included in a darome edu-com conference is practically unlimited. to participate, each location need only be equipped with a darome unit and a telephone coupler.
then, rather than having just one meeting at a time, it is possible to hold any number of meetings in any number of places at the same time. before an edu-com session begins, the organizer of the meeting telephones a special conference call telephone operator and gives the names, locations, and telephone numbers of the groups to be reached and the time of the meeting. charges begin only when all locations have been tied together by the operator and the conference is ready to start. the rate for the darome edu-com meetings is much lower than for direct dialing the places individually. the rate is equal to the cost of calling only the farthest city participating in the conference. for example, a one-hour meeting that originates in chicago and includes groups of people who participate in new york, newark, huntington, greensboro, atlanta, orlando, detroit, denver, and san diego would cost $280.

the wall street journal and barron's magazine. anthony a. barnett, senior vice-president of bunker ramo, said test installations in five stockbrokerage firms over the past month "have allowed us to shake down the system and prepare for nationwide marketing." mr. barnett said fifty of the news retrieval systems were sold before formal introduction. "we're encouraged by the marketing prospects among stockbrokerage firms and financial institutions but believe the market among corporations may have even greater potential," he added. the news retrieval system permits instantaneous recall of stories on 6,000 companies listed on the new york and american stock exchanges and traded over-the-counter. users also are able to retrieve news of twenty-five industry groups, fifteen government agencies, and several general categories. mr. barnett said that at the outset customers will be able to recall from the file any story that has appeared in the last three months.
dj news-recall was developed as a joint venture by bunker ramo and dow jones, which publishes the wall street journal, barron's, and the dow jones news service. the joint venture, dow jones-bunker ramo news retrieval service inc., in turn will market the data base to distributors for resale. bunker ramo's information systems division is the charter distributor for dj news-recall. mr. barnett said the basic charge for dj news-recall to users of bunker ramo's system 7 will be $175 a month per office plus $25 for each video terminal having access to the news retrieval service.

on-line access to the compendex data base

engineering index, inc., announces the availability of its computer-readable data base, compendex, through on-line access. two organizations are currently providing this direct mode of bibliographic search: lockheed information systems and system development corporation. using the latest in data communications services, users requiring access to the compendex files may interact with the system via their own in-house terminal, thus providing the convenience and speed of "on-demand" searches. compendex is the machine-readable version of ei's monthly and provides abstracts/bibliographic citations covering worldwide developments in all fields of engineering. both s.d.c. and lockheed, utilizing the most modern system technology, afford the user the opportunity to maintain an actual "dialog" with major bases. this is done without imposing an overly complicated or difficult command language on those addressing the system. on-line access now adds a new dimension to those requiring searches of the ei data base. for further information interested individuals and organizations may contact: lockheed information service, 3251 hanover st., palo alto, ca 94304; tel: (415) 493-4411 (east coast office: lockheed, 405 lexington ave., new york, ny 10017; tel: (212) 697-7171), or s.d.c.
search service, system development corporation, 2500 colorado ave., santa monica, ca 90406; tel: (213) 393-9411 (east coast office: s.d.c., 5827 columbia pike, falls church, va 22041; tel: (703) 820-2220).

standards

the isad committee on technical standards for library automation invites your participation in the standards game

editor's note: use of the following guidelines and forms is described in the article by john kountz in the june 1974 issue of jola. the tesla reactor ballot will also appear in subsequent issues of technical communications for reader use, and the tesla standards scoreboard will be presented as cumulated results warrant its publication. to use, photocopy or otherwise duplicate the forms presented in jola-tc, fill out these copies, and mail them

be added to complete a twelve-state test bed representing all categories of libraries. with the involvement of all these points, half of which will be in two-way communication with other points via the satellite, the library information project hopes to accomplish three primary goals: 1. improving individual and organization capacities for getting information. 2. demonstrating and testing cost effectiveness in using technological advances to disseminate information. 3. developing user "markets" for information utilizing satellite distribution. the program will try to help individual users of information and community-level groups such as governmental agencies, businesses, and other organizations. on a regional level, bibliographic information will be transmitted to libraries in a "compressed data format." with such a format, a library in a remote area of north dakota may have access to most needed information about resources available from large and specialized centers, such as the denver public library's special conservation library or western history collection. the proposed satellite information program will also be used to train librarians, both at a professional and paraprofessional level.
the in-service program will be aimed at helping librarians to better assist their patrons in getting information. all these major aspects (public information programming at the individual level, technology dissemination at the community level, compressed bibliographical data transmission, and in-service training) will be accomplished in a total of fifty hours per year of programming, reports dr. william e. rapp, vice-president of the federation of rocky mountain states. the limited time available for this programming in coordination with other programs planned for the satellite project places a premium on solid advance preparation of material to be transmitted, and speed of transmission, he notes. for example, the transmission of the compressed bibliographical data would be in two- to three-minute segments at the end of other programming. technology dissemination, a community-level program, would be handled in a total of fifteen hours of satellite use a year, an average of fifteen to twenty minutes per week. the largest segment of time, for in-service training of librarians, is twenty hours per year, which breaks down to less than half an hour a week on the average. but if the available time on the satellite is used to its full potential, dean goggin believes the population of the entire rocky mountain and plains region will benefit tremendously. the combined resources of major libraries and two major universities could be shared instantly with communities and residents of the region.

new horizons

cure by computer: a learning experience

it has been a long day for the physician, and at 10:30 p.m. he is getting ready to go home and have his first meal since breakfast. but the phone rings and the caller from university of minnesota hospital tells him that an infant has just been brought in from outstate minnesota for diagnosis.
the baby's hometown physician has noted that the baby hasn't eaten well for several days and he can't decide what's wrong. the case apparently isn't urgent, so the physician can either go for a bite to eat and return later or go straight to the hospital to look at the baby. this is the first of a series of choices presented to medical students in this imaginary case history. it's offered as a computerized learning program for medical and health sciences students at the university of minnesota's learning center. for a student playing the part of the physician in this case, each choice he or she makes presents new difficulties in the case, which call for more choices. ultimately, the imaginary infant either dies or survives, depending on what options

office of intelligence collection at the department of state in washington. he was librarian at the yale medical library and then associate librarian for research and development of the yale university library. he has been active in library and library-related organizations since the beginning of his career and has served on many committees. he was managing editor of the yale journal of biology and medicine, and has written numerous articles for professional journals. his professional interests are computerization of libraries and information retrieval. the text of the citation reads in part: "... awarded in 1974 to frederick g. kilgour for his success in organizing and putting into operation the first practical centralized computer bibliographical center. he has been the principal influence behind an emerging trend toward cooperation in technical services. ... as director of the ohio college library center he has made the library of congress marc data base a practical and useful product, stimulating interest throughout the country and the profession. ...
his tireless efforts represent an outstanding contribution to the technical improvements of cataloging and classification and the introduction of new techniques of recognized importance."

institute on automated serials control

the information science and automation division (isad) of the american library association and the american society for information science will cosponsor a preconference institute on "automated serials control: national and international considerations." the institute will be held on october 11 and 12, 1974 in atlanta, georgia immediately before the asis annual conference, which begins on october 13. the institute and the conference will both be held in the atlanta regency hyatt house. the purposes of the institute will be: (1) to present in-depth discussions on the new and dramatic developments in the serials field and their implications for the library and the library systems development communities; and (2) to provide a survey of the progress made to date in automated serials systems. formal presentations by acknowledged experts actively involved in the field will be provided, and ample opportunity will be available for informal discussion between the participants and a panel of concerned professionals. the panel will represent various views related to current and future developments as well as the national and international consequences. among other things, the program will include the following:

elaine wood (lc marc development office): marc serials format and serials processing at lc, a tutorial. in addition, this session will include discussion on national and international standards, and will emphasize the difference between marc serial and marc monograph formats.

joseph howard (lc serials record division): cataloging considerations. the proposed changes to the cataloging rules: isbd, aacr, various points concerning entry and other unique cataloging problems.
henriette avram and lucia rather (lc marc development office): international considerations. prospects for international exchange of data, mechanisms for exchange, problems posed by differing practices and conventions, international developments in machine-readable cataloging.

paul fasana (new york public library): impact of national developments and of automation on library services. general consideration of automation's impact on library services with emphasis on serials control. recommendations concerning national developments.

linda crismond (university of southern california): review of serials systems and system considerations. acquisitions, cataloging, check-in, claiming, etc. problems posed by holdings notations, volatility of data, linking entries, etc.

lois upham (university of minnesota): conser (consolidated serials project).

joseph price (nsdp): international serials data system and nsdp.

cynthia pugsley (university of water

library institutes planning committee will be held october 18-19, 1974, at rickey's hyatt house hotel, palo alto, california. paul w. winkler, principal descriptive cataloger, library of congress, will speak on the application of the international standard bibliographic description to monographs and on related topics. the establishment of bibliographic control of serials through international standard serial numbers, chapter 6 of the anglo-american cataloging rules, and the national serials data program will be presented by richard anable, coordinator, conser project. the program is designed to be of particular interest to technical services librarians, serials librarians, bibliographers, and administrators. registration for the two-day meeting is limited; the fee is $20.00 and includes two luncheons. further information, including a list of hotel accommodations, will be mailed to applicants. registrants of the 1972 and 1973 institutes will automatically receive registration forms. others may obtain forms by writing joseph e.
ryus, 2858 oxford ave., richmond, ca 94806, or by telephoning him during weekday hours at the university of california, berkeley, (415) 642-4144. all registration forms will be mailed early in september. the library institutes planning committee is a nonprofit organization composed of eight librarians from county, special, and university libraries in northern california. previous institutes have featured ralph ellsworth, j. mcree elrod, seymour lubetzky, ellsworth mason, daniel melcher, john c. rather, joseph a. rosenthal, and paul w. winkler.

information industry association's expanded micropublishing and data base programs

major policy steps have been taken by the information industry association (iia), making the work of the association more understandable and more relevant to information industry companies. it changed the title of the government micropublishing committee to micropublishing and it directed the establishment of a data base committee. "regardless of the media information companies and other publishers currently use in delivering information," iia president, paul g. zurkowski, said, "competition and rising costs are forcing them into consideration of alternative methods. iia member companies will be able to focus their energies most effectively on the industry-wide problems through these new committees." micropublishing committee chairman, henry powell, bell & howell, bethesda, at a recent meeting of the committee spelled out several areas of concern to micropublishers which will be the subject of committee action: 1. how can micropublishers protect their investment from unfair competition of unscrupulous competitors who misappropriate the micropublishers' work product and market essentially "reprinted" versions of the original microfilm? 2. library relations. joint library-industry steps toward mutual understanding and cooperation. 3.
z-39 standards committee recommended standards covering what micropublishers can say about their products. what is a volume equivalent in microform? what information should be included on each microfiche and where on the header or title section of the microfilm product? 4. a program to educate users as to the operational benefits of micropublished materials. the data base committee is being formed with the participation of both data base creating companies and those offering public access to various data bases. the area of interest to this committee will embrace the status of data bases under existing proprietary rights laws, communications capabilities and rates as controlled by the fcc, unfair competition legislation pending in congress, and such other problems as those created for the industry by university computer centers marketing access to similar data bases, but without full cost recovery. for further information contact paul g.

simply by pressing the lever on one of the edu-com microphones and speaking into it, anyone in any of those cities will be heard in the meeting in every other city as if he were right there in the room. in addition, an automatic slide projector can also be plugged into the unit. during a presentation, the speaker can change the slides simultaneously in all the locations equipped with slides. a cassette player-recorder can be plugged into the darome edu-com unit, either to provide the program or record the session. when used alone, the darome edu-com can also serve as a public address system for a single meeting. for further information, contact darome, inc., 711 e. diggins st., harvard, il 60033.

automated news clipping, indexing and retrieval system (ancirs)

image systems, inc. of culver city, california, has developed an automated system for the indexing and retrieval of news clippings.
while ancirs (pronounced answers) is geared for use in the newspaper library, terminals located at remote sites provide access to the system for business, industry, education, and law enforcement and other government agencies. the microfiche terminals, which are controlled by a minicomputer, are each capable of storing 325,000 clippings and 1 million lines of index and search terms. ancirs has a capacity in excess of 1.25 million listings. access to a page of index listings or to the full text of any clipping requires less than four seconds. paper copy of any or all of the selected clippings can be produced at the terminal at the touch of a button. multiple terminals can share the same minicomputer. a unique off-line/on-line indexing system generates subject term lists from story headlines and other key words, names, and places in the story as selected by the indexer. when indexing a story, the indexer keys in the first letters of the subject terms to be assigned. this causes the terms currently in use to be displayed at the terminal, allowing the indexer to automatically assign the appropriate terms to the story being indexed. if the term is not already in use it may be entered by completing the typing of the term. after new terms have been entered and old terms assigned, a magnetic tape is produced for the off-line program. the off-line program prepares three lists for computer output to microfiche: 1. headlines permuted by the key words in each headline interleaved with subject terms selected from the stories. 2. a category list by classification and subclass. 3. each story headline in date order. to perform a search, the user keys in the first few letters of the search term. this causes the appropriate portion of list one to be displayed. each item on the page has a line number. keying the line number(s) selects the desired term(s) and causes the most recent clipping to be displayed.
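the search steps just described (key in the first letters of a term, view the matching portion of the permuted headline list, select by line number, see the most recent clipping) can be sketched in a few lines of python. this is only an illustration under stated assumptions, not a description of how ancirs was implemented: the sample clippings, the stop-word list, and the function names `build_permuted_index`, `lookup_prefix`, and `most_recent` are all hypothetical.

```python
from bisect import bisect_left
from datetime import date

# Hypothetical sample clippings: (headline, publication date).
CLIPPINGS = [
    ("city council approves transit budget", date(1974, 3, 2)),
    ("transit strike enters second week", date(1974, 5, 9)),
    ("council debates library funding", date(1974, 4, 1)),
]
# Assumed stop words; the article does not say which words were skipped.
STOP_WORDS = {"a", "an", "and", "in", "of", "the"}


def build_permuted_index(clippings):
    """Return sorted (key word, headline, date) entries, one per key word."""
    entries = []
    for headline, published in clippings:
        for word in headline.split():
            if word not in STOP_WORDS:
                entries.append((word, headline, published))
    return sorted(entries)


def lookup_prefix(index, prefix):
    """Return the run of index entries whose key word starts with prefix."""
    keys = [entry[0] for entry in index]
    start = bisect_left(keys, prefix)
    page = []
    for entry in index[start:]:
        if not entry[0].startswith(prefix):
            break
        page.append(entry)
    return page


def most_recent(page, line_numbers):
    """Select entries by 1-based line number and return the newest one."""
    chosen = [page[n - 1] for n in line_numbers]
    return max(chosen, key=lambda entry: entry[2])


index = build_permuted_index(CLIPPINGS)
page = lookup_prefix(index, "tra")
for number, (word, headline, published) in enumerate(page, start=1):
    print(number, word, "|", headline, "|", published.isoformat())
newest = most_recent(page, [1, 2])
print("displayed:", newest[1])
```

keeping the index sorted by key word is what lets a short prefix locate the right "page" of list one with a single binary search; the line numbers then stand in for the terms themselves, so the user never has to type a full term.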
if the selected term is too general, i.e., a category heading, the appropriate portion of list two is automatically displayed so that a more precise selection may be made. the selection in this instance is also made by entering the line number(s) of the desired term(s). these selections may be combined logically with other selections to further narrow the search. once the terms have been selected and the most recent story is displayed it is then possible to page back through previous related stories. the hard copy of any story can be requested at any time and is produced by the unit in ten seconds. ancirs' low cost makes it the ideal tool for researchers and decision makers who must have at their fingertips complete facts on world, national, and local events.

machine-readable data bases

news retrieval service

bunker ramo corporation and dow jones & co. inc. announced the start of dj news-recall, a computerized news retrieval service based on stories appearing on the dow jones news service and in

to the tesla chairman, mr. john c. kountz, associate for library automation, office of the chancellor, the california state university and colleges, 5670 wilshire blvd., suite 900, los angeles, ca 90036.

the procedure

this procedure is geared to handle both reactive (originating from the outside) and initiative (originating from within ala) standards proposals to provide recommendations to ala's representatives on existing, recognized standards organizations. to enter the procedure for an initiative standards proposal you must complete an "initiative standards proposal" using the outline which follows:

initiative standard proposal outline

the following outline and forms are designed to facilitate review by both the isad committee on technical standards for library automation (tesla) and the membership of initiative standards requirements and to expedite the handling of the initiative standard proposal through the procedure.
since the outline will be used for the review process, it is to be followed explicitly. where an initiative standard requirement does not require the use of a specific outline entry, the entry heading is to be used followed by the words "not applicable" (e.g., where no standards exist which relate to the proposal, this is indicated by: vi. existing standards. not applicable). note that the parenthetical statements following most of the outline entry descriptions relate to the ansi standards proposal section headings to facilitate the translation from this outline to the ansi format. all initiative standards proposals are to be typed, double spaced on 8½" x 11" white paper (typing on one side only). each page is to be numbered consecutively in the upper right-hand corner. the initiator's last name followed by the key word from the title is to appear one line below each page number.

i. title of initiative standard proposal (title).
ii. initiator information (foreword).
a. name
b. title
c. organization
d. address
e. city, state, zip
f. telephone: area code, number, extension
iii. technical area. describe the area of library technology as understood by initiator. be as precise as possible since in large measure the information given here will help determine which ala official representative might best handle this proposal once it has been reviewed and which ala organizational component might best be engaged in the review process.
iv. purpose. state the purpose of standard proposal (scope and qualifications).
v. description. briefly describe the standard proposal (specification of the standard).
vi. relationship of other standards. if existing standards have been identified which relate to, or are felt to influence, this standard proposal, cite them here (expository remarks).
vii. background.
describe the research or historical review performed relating to this standard proposal (if applicable, provide a bibliography) and your findings (justification).
viii. specifications. specify the standard proposal using record layouts, mechanical drawings, and such related documentation aids as required in addition to text exposition where applicable (specification of the standard).

kindly note that the outline is designed to enable standards proposals to be written following a generalized format which will facilitate their review. in addition, the outline permits the presentation of background and descriptive information which, while important during any evaluation, is a prerequisite to the development of a standard.

journal of library automation vol. 7/3 september 1974

the reactor ballot is to be used by members to voice their recommendations relative to initiative standards proposals. the reactor ballot permits both "for" and "against" votes to be explained, permitting the capture of additional information which is necessary to document and communicate formal standards proposals to standards organizations outside of the american library association.

tesla reactor ballot
reactor information
name
title
organization
address
city state ___ zip __
telephone
identification number for standard requirement
for    against
reason for position: (use additional pages if required)

as you, the members, use the outline to present your standards proposals, tesla will publish them in jola-tc and solicit membership reaction via the reactor ballot. throughout the process tesla will insure that standards proposals are drawn to the attention of the applicable american library association division or committee. thus, internal review usually will proceed concurrently with membership review. from the review and the reactor ballot tesla will prepare a "majority recommendation" and a "minority report" on each standards proposal.
the majority recommendation and "minority report" so developed will then be transmitted to the originator, and to the official american library association representative on the appropriate standards organization where it should prove a source of guidance as official votes are cast. in addition, the status of each standards proposal will be reported by tesla in jola-tc via the standards scoreboard. the committee (tesla) itself will be nonpartisan with regard to the proposals handled by it. however, the committee does reserve the right to reject proposals which after review are not found to relate to library automation.

an invitation from tesla

during the formative period of tesla the list of potential standards areas for library automation, below, was developed. you are invited to review the list below and voice your opinion of any or all areas indicated by means of the reactor ballot. or, if you have a requirement for a standard not included in this list, use the initiative standard proposal outline to collect and present your thoughts.

potential technical standards areas

1. codes for library and library network, including network hierarchy structures.
2. documentation for systems design, development, implementation, operation, and postimplementation review.
3. minimum display requirements for library crts, keyboards for terminals, and machine-readable character or code set to be used as label printed in book.
4. patron or user badge physical dimension(s) and minimum data elements.
5. book catalog layout (physical and minimum data elements), a. off-line print b. photocomposed c. microform
6. communication formats for inventory control (absorptive of interlibrary loan and local circulation).
7. data element dictionary content, format, and minimum vocabulary, and inventory identification minimum content.
8. inventory labels or identifiers (punched cards, labels, badges, or . . . ) physical dimensions and minimum data elements.
9.
model/minimum specifications relating to hardware, software, and services procurement for library applications.
10. communications formats for library material procurement (absorptive of order, bid, invoice, and related follow-up).

input

to the editor: i have reviewed mr. joe rosenthal's incisive survey of the marc types which appear to be eminent. unfortunately, in his studies he seems to have overlooked the one marc type which will pose the greatest problem and relates to "unmarced" books: nonmarc - the grand universe of records which yet remain to be placed in this most noble of formats and its international counterpart, originating in holland, nedermarc - which stems from, "i nedermarc, you nedermarc, all them systems nedermarc .... " hopefully, someone will solve the problem explicit in these forms of marc. as a result, until someone solves this problem, we are all without marc.

john kountz
associate for library automation
california state university and colleges

patrick griffis
building pathfinders with free screen capture tools

this article outlines freely available screen capturing tools, covering their benefits and drawbacks as well as their potential applications. in discussing these tools, the author illustrates how they can be used to build pathfinding tutorials for users and how these tutorials can be shared with users. the author notes that the availability of these screen capturing tools at no cost, coupled with their ease of use, provides ample opportunity for low-stakes experimentation from library staff in building dynamic pathfinders to promote the discovery of library resources.
one of the goals related to discovery in the university of nevada las vegas (unlv) libraries’ strategic plan is to “expand user awareness of library resources, services and staff expertise through promotion and technology.”1 screencasting videos and screenshots can be used effectively to show users how to access materials using finding tools in a systematic, step-by-step way. screencasting and screen capturing tools are becoming more intuitive to learn and use and can be downloaded for free. as such, these tools are becoming an efficient and effective method for building pathfinders for users. one such tool is jing (http://www.jingproject.com), freeware that is easy to download and use. jing allows for short screencasts of five minutes or less to be created and uploaded to a remote server on screencast.com. once a jing screencast is uploaded, screencast.com provides a url for the screencast that can be shared via e-mail or instant message or on a webpage. another function of jing is recording screenshots, which can be annotated and shared by url or pasted into documents or presentations. jing serves as an effective tool for enabling librarians working with students via chat or instant messaging to quickly create screenshots and videos that visually demonstrate to students how to get the information they need. jing stores the screenshots and videos on its server, which allows those files to be reused in subject or course guides and in course management systems, course syllabi, and library instructional handouts. moreover, jing’s file storage provides an opportunity for librarians to incorporate tutorials into a variety of spaces where patrons may need them in such a manner that does not require internal library server space or work from internal library web specialists. trailfire (http://www.trailfire.com) is another screen capturing tool that can be utilized in the same manner.
trailfire allows users to create a trail of webpage screenshots that can be annotated with notes and shared with others via a url. such trails can provide users with a step-by-step slideshow outlining how to obtain specific resources. when a trail is created with trailfire, a url is provided to share. like jing, trailfire is free to download and easy to learn and use. wink (http://debugmode.com/wink) was originally created for producing software tutorials, which makes it well suited for creating tutorials about how to use databases. although wink is much less sophisticated than expensive software packages, it can capture screenshots, add explanation boxes, buttons, titles, and voice to your tutorials. screenshots are captured automatically as you use your computer on the basis of mouse and keyboard input. wink files can be converted into very compressed flash presentations and a wide range of other file types, such as pdf, but do not support avi files. as such, wink tutorials converted to flash have a fluid movie feel similar to jing screencasts, but wink tutorials also can be converted to more static formats like pdf, which provides added flexibility. slideshare (http://www.slideshare.net) allows for the conversion of uploaded powerpoint, openoffice, or pdf files into online flash movies. an option to sync audio to the slides is available, and widgets can be created to embed slideshows onto websites, blogs, subject guides, or even social networking sites. any of these tools can be utilized for just-in-time virtual reference questions in addition to the common use of just-in-case instructional tutorials. such just-in-time screen capturing and screencasting offer a viable solution for providing more equitable service and teachable moments within virtual reference applications. these tools allow library staff to answer patron questions via e-mail and chat reference in a manner that allows patrons to see processes for obtaining information sources. 
demonstrations that are typically provided in face-to-face reference interactions and classroom instruction sessions can be provided to patrons virtually. the efficiency of this practice is that it is simpler and faster to capture and share a screencast tutorial when answering virtual reference questions than to explain complex processes in written form. additionally, the fact that these tools are freely available and easy to use provides library staff the opportunity to pursue low-stakes experimentation with screen capturing and screencasting. the primary drawback to these freely available tools is that none of them provides a screencast that allows for both voice and text annotations, unlike commercial products such as camtasia and captivate. however, tutorials rendered with these freely available tools can be repurposed into a tutorial within commercial applications like camtasia studio (http://www.techsmith.com/camtasia.asp) and adobe captivate (http://www.adobe.com/products/captivate/). patrick griffis (patrick.griffis@unlv.edu) is business librarian, university of nevada las vegas libraries.

190 information technology and libraries | december 2009

as previously mentioned, these easy-to-use tools can allow screencast videos and screenshots to be integrated into a variety of online spaces. a particularly effective type of online space for potential integration of such screencast videos and screenshots is the library “how do i find . . .” research help guide. many of these “how do i find . . .” research help guides serve as pathfinders for patrons, outlining processes for obtaining information sources. currently, many of these pathfinders are in text form, and experimentation with the tools outlined in this article can empower library staff to enhance their own pathfinders with screencast videos and screenshot tutorials. reference 1. “unlv libraries strategic plan 2009–2011,” http://www.library.unlv.edu/about/strategic_plan09-11.pdf (accessed july 30, 2009): 2.
unlv special collections continued from page 186 references 1. peter michel, “dino at the sands,” unlv special collections, http://www.library.unlv.edu/speccol/dino/index.html (accessed july 28, 2009). 2. peter michel, “unlv special collections search box,” unlv special collections, http://www.library.unlv.edu/speccol/index.html (accessed july 28, 2009). 3. unlv special collections search results, “hoover dam,” http://www.library.unlv.edu/speccol/databases/index.php?search_query=hoover+dam&bts=search&cols[]=oh&cols[]=man&cols[]=photocoll&act=2 (accessed october 27, 2009). 4. unlv libraries, “southern nevada: the boomtown years,” http://digital.library.unlv.edu/boomtown/ (accessed july 28, 2009). 5. unlv special collections, “what’s new in special collections,” http://blogs.library.unlv.edu/whats_new_in_special_collections/ (accessed july 28, 2009). 6. unlv special collections, “unlv special collections facebook homepage,” http://www.facebook.com/home.php?#/pages/las-vegas-nv/unlv-special-collections/70053571047?ref=search (accessed july 28, 2009). 7. unlv libraries, “comments section for the aerial view of hughes aircraft plant photograph,” http://digital.library.unlv.edu/hughes/dm.php/hughes/82 (accessed july 28, 2009); unlv libraries, “‘rate it’ feature for the aerial view of hughes aircraft plant photograph,” http://digital.library.unlv.edu/hughes/dm.php/hughes/82 (accessed july 28, 2009); unlv libraries, “rss feature for the index to the welcome home howard digital collection,” http://digital.library.unlv.edu/hughes/dm.php/ (accessed july 28, 2009). statement of ownership, management, and circulation information technology and libraries, publication no. 280-800, is published quarterly in march, june, september, and december by the library information and technology association, american library association, 50 e. huron st., chicago, illinois 60611-2795.
editor: marc truitt, associate director, information technology resources and services, university of alberta, k adams/cameron library and services, university of alberta, edmonton, ab t6g 2j8 canada. annual subscription price, $65. printed in u.s.a. with periodical-class postage paid at chicago, illinois, and other locations. as a nonprofit organization authorized to mail at special rates (dmm section 424.12 only), the purpose, function, and nonprofit status for federal income tax purposes have not changed during the preceding twelve months. extent and nature of circulation (average figures denote the average number of copies printed each issue during the preceding twelve months; actual figures denote actual number of copies of single issue published nearest to filing date: september 2009 issue). total number of copies printed: average, 5,096; actual, 4,751. mailed outside country paid subscriptions: average, 4,090; actual, 3,778. sales through dealers and carriers, street vendors, and counter sales: average, 430; actual, 399. total paid distribution: average, 4,520; actual, 4,177. free or nominal rate copies mailed at other classes through the usps: average, 54; actual, 57. free distribution outside the mail (total): average, 127; actual, 123. total free or nominal rate distribution: average, 181; actual, 180. total distribution: average, 4,701; actual, 4,357. office use, leftover, unaccounted, spoiled after printing: average, 395; actual, 394. total: average, 5,096; actual, 4,751. percentage paid: average, 96.15; actual, 95.87. statement of ownership, management, and circulation (ps form 3526, september 2007) filed with the united states post office postmaster in chicago, october 1, 2009.
article algorithmic literacy and the role for libraries michael ridley and danica pawlick-potts information technology and libraries | june 2021 https://doi.org/10.6017/ital.v40i2.12963 abstract artificial intelligence (ai) is powerful, complex, ubiquitous, often opaque, sometimes invisible, and increasingly consequential in our everyday lives. navigating the effects of ai as well as utilizing it in a responsible way requires a level of awareness, understanding, and skill that is not provided by current digital literacy or information literacy regimes. algorithmic literacy addresses these gaps. in arguing for a role for libraries in algorithmic literacy, the authors provide a working definition, a pressing need, a pedagogical strategy, and two specific contributions that are unique to libraries. introduction algorithms, in one form or another, are as old as human problem solving and as simple as “a sequence of computational steps that transform the input into the output.”1 for centuries they have been effective, and uncontroversial, methodologies.
however, the rise of artificial intelligence (the integration of big data, enhanced computation, and advanced algorithms) with its human and greater-than-human performance in many areas has positioned algorithms as transformational and a “major human rights issue in the twenty-first century.”2 algorithmic literacy is important given the prevalence of algorithmic decision-making in many aspects of everyday life and because “the danger is not so much in delegating cognitive tasks, but in distancing ourselves from—or in not knowing about—the nature and precise mechanisms of that delegation.”3 as a result, david lankes warns of a new type of digital divide with “a class of people who can use algorithms and a class used by algorithms.”4 in a 2019 deloitte survey “only 4 percent reported they were confident explaining what ai is and how it works.”5 while a 2019 edelman survey indicated general awareness of ai, it also revealed a similar lack of knowledge about the details of ai.6 an informed, algorithmically literate public is better able to negotiate and employ the complexities of ai.7 identifying and acting upon algorithms as a literacy makes them as “fundamental as reading, writing, and arithmetic.”8 however, the uncritical use of the term literacy should make one suspicious of extending it to algorithms. increasingly “literacy” has come to mean merely a body of knowledge or a set of domain-specific skills.9 various literacies have been described, such as health, death, financial, physical, ocean, religious, visual, dancing, spatial, screen, and porn. this includes a dozen different technology-related literacies.10 the case for algorithmic literacy, and the role for libraries in advancing it, must rest on a clear definition, a recognized problem and need, a pedagogical strategy, and a unique (or at least supportive) contribution libraries can provide.
michael ridley (mridley@uoguelph.ca) is librarian emeritus, mclaughlin library, university of guelph, ontario, canada. danica pawlick-potts (dpawlic@uwo.ca) is phd candidate, faculty of information and media studies, western university, ontario, canada. © 2021. algorithms and literacy while the term “algorithmic literacy” is recent, it has antecedents that cover similar if not equivalent ground. the general terms computer literacy or digital literacy have spawned more specific terms such as cyber literacy, computational thinking, and algorithmic thinking.11 most of these arise from the field of computer science, where algorithms are central, and focus on the computational nature of algorithms as a “matter of mathematical proof” where “other knowledge about algorithms—such as their applications, effects, and circulation—is strictly out of frame.”12 the implications of algorithms in everyday life suggest that a deeper and broader interpretation is required. whether a literacy, a mode of thinking, or merely a set of skills, discussions about computation and algorithms have been plagued by “ambiguity and vagueness” and “definitional confusion” resulting in ongoing challenges in establishing core pedagogy in both k–12 and higher education.13 without a clear, acknowledged, and actionable definition that differentiates it from concepts such as digital literacy, computational thinking, and algorithmic thinking, algorithmic literacy will be relegated to a buzz phrase and the urgency of its recognition and application will be lost.
the relationship between algorithms and artificial intelligence might recommend the adoption of “ai thinking” or “ai literacy” as the more appropriate term.14 however, algorithmic literacy is both more foundational than the broader concept of ai and more actionable than just thinking. algorithms are not a technology like ai or, more generally, computers. algorithms provide a structure that frames—and constrains—how we express ourselves. they are a way of seeing and acting in the world and “need to be understood as relational, contingent, [and] contextual.”15 while the technical and operational aspects of algorithms are important to understand and use (as they are for the technologies and processes of reading and writing in a new language), they are complemented by a broader awareness: literacy is not a set of generic skills or something we do or do not possess, it’s a sociocultural practice, it’s something that we do, and what we do with literacy depends on the social, cultural, and historical contexts in which we do it. literacy looks different in different contexts and communities. literacy is not neutral, it’s ideological. there are dominant and marginalized literacies.16 this perspective is the essence of critical algorithm studies, where algorithms are viewed as sociotechnical systems that are “intrinsically cultural . . . constituted not only by rational procedures, but by institutions, people, intersecting contexts, and the rough-and-ready sensemaking that obtains in ordinary cultural life.”17 algorithms as part of increasingly ubiquitous ai, such as machine learning and deep learning systems, reflect and promulgate certain ideologies and have impacts and influences in the full range of human society. 
cautions about algorithmic decision-making have identified the far-reaching implications for bias, fairness, privacy, and democratic processes.18 at the same time, numerous national strategies to support ai development have highlighted the substantial economic impact, anticipated to be $15.7 trillion (us) by 2030.19 the idea of algorithmic literacy must encompass multiple perspectives and contexts.

“literacies of the digital”

computer, internet, information, computation, and algorithmic are all “literacies of the digital.”20 while each of these has its own domain and focus, they share common ideas and are generally symbiotic with each other. there is an especially strong and complementary connection between computational literacy and information literacy.21 computational thinking and algorithmic literacy are closely related even if most definitions of the former fail to fully acknowledge the broader social, economic, and political implications. however, the extensive literature on computational thinking is useful in helping to articulate aspects of algorithmic literacy. wing’s foundational article about computational thinking describes the key characteristics in terms that closely resemble a literacy:

1. conceptualizing, not programming
2. fundamental, not a rote skill
3. a way that humans, not computers, think
4. complements and combines mathematical and engineering thinking
5. ideas, not artifacts
6. for everyone, everywhere.22

jacob and warschauer make a strong case for computational thinking as a literacy.
their three-part framework identifies computational thinking as a new literacy embedded in modern sociocultural practices (computational thinking as literacy), discusses how literacy development can be leveraged to foster computational thinking (computational thinking through literacy), and explores ways in which computational thinking can facilitate literacy development (literacy through computational thinking).23 this analysis of computational thinking informs the larger context and broader implications of algorithmic literacy.

defining algorithmic literacy

scribner and cole define a literacy as “socially organized practices [that] make use of a symbol system and a technology for producing and disseminating it.”24 therefore, literacy = practices + symbol system + technology. to this definition, steiner adds a more aspirational and humanistic definition:

by “literacy” i mean the ability to engage with, to respond to, what is most challenging and creative in our societies. to experience and contribute to the energies of informed debate. to distinguish the “news that stays news,” as ezra pound put it, from the tidal waves of ephemeral rubbish, superstition, irrationalism, and commercial exploitation.25

literacy is about knowing and meaning making through the processes of internalizing and externalizing information. literacy enables a reflective, critical, and integrative approach to information that utilizes a broad knowledge base for both understanding and communicating ideas.
finn calls for an algorithmic literacy “that builds from a basic understanding of computational systems, their potential and their limitations, to offer us intellectual tools for interpreting the algorithms shaping and producing knowledge” and thereby provides “a way to contend with both the inherent complexity of computation and the ambiguity that ensues when that complexity intersects with human culture.”26 referring more broadly to “ai literacy,” long and magerko provide an operational view defining it as “a set of competencies that enables individuals to critically evaluate ai technologies; communicate and collaborate effectively with ai; and use ai as a tool online, at home, and in the workplace.”27

following an exhaustive analysis of different, and often contradictory, definitions of literacy, information literacy, and digital literacy, bawden suggests “explaining, rather than defining, terms.”28 this provisional description of algorithmic literacy acknowledges that advice. algorithmic literacy is the skill, expertise, and awareness to

• understand and reason about algorithms and their processes
• recognize and interpret their use in systems (whether embedded or overt)
• create and apply algorithmic techniques and tools to problems in a variety of domains
• assess the influence and effect of algorithms in social, cultural, economic, and political contexts
• position the individual as a co-constituent in algorithmic decision-making.

this description recognizes two overarching concepts: “creativity and critical analysis.”29 creativity involves building, creating, and using algorithms for specific purposes. critical analysis involves recognizing the application of algorithms in decision-making and the implications of their use in a variety of settings and within certain contexts.

why algorithmic literacy?
the need for algorithmic literacy arises from two key and equally important perspectives, both of which essentially focus on power: control and empowerment. algorithms, especially those using machine learning and deep learning, are complex, opaque, invisible, shielded by intellectual property protection, and most importantly, consequential in the everyday lives of people.30 control is held by those who build and deploy algorithms, not those who use them. in part because of these characteristics, people hold significant misconceptions about algorithms, their use, and their effect. in a 2019 global survey of consumers, 72% said they understood what ai was. however, despite ai being used in a wide variety of consumer-facing applications (e.g., email, search, social media), 64% said they had never used ai.31 a study of facebook users found that 62% were unaware that the news feed is algorithmically constructed and, even when told this, 12% concluded that it is, as a result, completely random.32

bias, discrimination, and unfairness in ai have been well documented.33 it is clear that poor data combined with underspecified algorithms and uncritical interpretations of the ai model outcomes can lead to abuses in a variety of ways. there is no quick fix, no automated solution to these problems. accordingly, those creating algorithms and those using them must be able to question the source of training data, the strengths and weaknesses of learning algorithms, the metrics for success, and how (and for whom) the systems are being optimized. the overarching objectives are accountability and transparency.

perhaps most critically, the prevalence of algorithms in our lives has changed the way we interact with and use those systems, and the ways we behave in personal and social contexts. we conduct
we conduct information technology and libraries june 2021 algorithmic literacy and the role for libraries | ridley and pawlick-potts 5 ourselves to be “algorithmically recognizable” allowing us to become “increasingly legible to machines for capture and calculation.”34 the danger is that this will “lead users to internalize their [algorithm’s] norms and priorities.”35 at the same time the power of algorithmic technology is abused and misused, it remains a powerful technology to enhance human capabilities and insight. algorithms are attributable to dramatic advances in health care and science as well as more mundane (but appreciated) applications such as spam filters. anti-science sentiments, typified by anti-vaxxers, should not be allowed to undermine the opportunities for algorithms that materially improve the human condition and the natural world. those opportunities now extend beyond the well-funded, technology-rich research and corporate ai departments. increasingly more consumer-friendly tools and applications allow a broader and more diverse population to create algorithmic solutions. the rise of mlaas (machine learning as a service) brings together powerful cloud-based machine learning environments with accessible toolsets.36 algorithmic literacy is needed to acknowledge both the technology’s power (control) over people and power (empowerment) for people. recognizing the need for protection and encouragement, many governments have enacted protective legislation and training initiatives. emblematic of the former is the general data protection regulation (gdpr) of the european union with its “right to explanation” for algorithmic decisions.37 exemplary of the latter is finland’s initiative to educate a large portion of their population through “elements of ai,” a free online course.38 despite these advances there remain power imbalances that require vigilance on the part of 21 st century digital citizens. 
understanding the power and politics of algorithms recognizes their ontological impact in “new ways of ordering the world.”39 effects this profound suggest a deeper and more comprehensive understanding of algorithms is needed:

efforts to help people understand algorithms need to continue moving away from a focus on building awareness of algorithms—people increasingly know about “those things called algorithms”—and toward explaining algorithms in such a way that people have a more consistent conceptualization of what algorithms are, what algorithms do, and—what often is overlooked—what algorithms cannot do.40

algorithmic literacy, like all literacies, is not about mastery but levels of competence appropriate to age, circumstance, and need. understood simply as recipes or visual decision trees, algorithms are accessible to even those with minimal digital literacy. public institutions, and specifically libraries, can and must take a lead role in addressing the challenges of this “new world.”

the library role in algorithmic literacy

libraries have traditionally played a central role in making emerging technologies accessible to their communities whether those be online systems, makerspaces, interactive media, virtual reality, or a host of others. advancing digital access, digital literacy, and digital inclusion has long been acknowledged by governments and public agencies as a role of the public library, even if libraries are not appropriately funded to do so.41 recently, libraries have begun addressing their role in relation to ai and algorithmic literacy.
the urban libraries council (ulc) conducted an informal poll about ai and public libraries.42 of the responding libraries (83 of its 150 member library systems), 45% identified ai as important to their leadership with 23% having a staff person dedicated to ai and 27% providing programming to help the public learn about ai. in response to a question of how libraries could best serve their community in this area, 79% said by framing and building awareness of ai, 68% recommended providing continuous education opportunities for the public, and 61% supported the provision of experiential programming. in 2019, the ulc formed a working group to advance the public library role in ai awareness, education, and experiences.

in 2018 the canadian federation of library associations (cfla) held a national forum in part focused on artificial intelligence.43 participant discussions yielded three key priorities with respect to ai: training for library staff, educational materials for the public, and advocacy initiatives regarding privacy, bias, and transparency. a fourth priority was the inclusion of ai literacy and awareness in mis and mlis curricula to facilitate a leadership role for the profession in this area.

algorithmic literacy programs have two general audiences: members of the community the library serves and the staff of the libraries themselves. for the community, these programs center on awareness and implications, skill development, and application and use.44 through workshops, hands-on laboratories and makerspaces, consumer checklists, and a variety of informational tools, libraries can provide, or partner in providing, resources in an age- and context-appropriate setting.
for library staff, an additional focus is required on advocacy with respect to regulatory issues, system development, and the evolution of the local and national information infrastructures. library staff can lead, and participate in, advocacy programs that seek to influence government, public agencies, commercial system and service providers, and others about algorithmic literacy.

it is a misconception to think of algorithms, and ai more generally, as arcane topics beyond the ability of library staff to understand and teach. while the technical details of ai are complex, this is not the level of understanding required of staff or needed by the library’s community. for example, ai programming at the frisco public library (https://friscolibrary.com/) introduced ai maker kits and ran basic ai classes. the toronto public library (https://www.torontopubliclibrary.ca/), through its digital innovation hubs, has offered learning circles in basic ai (using the finnish elements of ai course, https://www.elementsofai.com/, as a foundation) and hosted presentations on various aspects of algorithms in everyday life. by abstracting algorithms to higher level concepts related directly to daily experience (using facebook is illustrative of many key ideas regarding algorithmic literacy), staff can obtain a sufficient overview from a variety of accessible, introductory texts or videos. perhaps most importantly, given the new and evolving nature of this technology, library staff should view themselves as co-learners.

no matter the setting or context, an active learning approach is recommended with learners situated as makers as well as consumers.45 a review of the k–12 curricula regarding computational literacy identified active learning strategies based on projects, problem solving, cooperation, and games.
the researchers recommend augmenting these with scaffolding strategies, storytelling, and aesthetic approaches.46 while intended for algorithmic literacy initiatives involving children, four design principles from dasgupta and mako are relevant for any demographic:

1. make data analysis central and ensure the data is relevant to the learner,
2. manage risk by using sandboxes for experimentation,
3. respect community values about technology that may differ, and
4. support authenticity with real-world examples and scenarios.47

long and magerko document a set of 17 core competencies and 15 associated learning design considerations regarding ai literacy.48 taken together these represent the basis for an algorithmic literacy program for any demographic and any context.

libraries are encouraged to seek partnerships and collaborations with schools (k–12 and higher education) as well as with non-profit advocacy and training groups.49 examples include the algorithmic literacy project (algorithmliteracy.org) and a.i. for anyone (https://aiforanyone.org). many technology companies also offer high quality programs and resources. however, a report from the public policy forum notes that digital literacy campaigns are “too often funded by the very companies that are contributing to the problem.”50

a key issue is the lack of assessment instruments. there are none for algorithmic literacy and few for computational thinking.
the most prominent of the latter is skills based, focusing on concepts and operational practices and very little on the wider social and cultural implications.51 library experience with information literacy assessment can inform algorithmic literacy assessment by helping to balance skills and operational concerns with a wider focus on concepts and contextual awareness.

information literacy and explainable ai (xai): unique library contributions

while libraries can make contributions to algorithmic literacy through a variety of programs, resources, and advocacy initiatives, two specific areas suggest opportunities for unique contributions: algorithmic literacy as a part of information literacy and algorithmic literacy in support of “explainable ai” (xai).

algorithmic literacy and information literacy

annemaree lloyd describes the opacity and ubiquity of algorithms as “a wicked problem for librarians and archivists who have a vested interest in equitable access, informed citizenry and the maintenance of public memory” and insists that information literacy “provides resistance to the expansionist claims of algorithms, while at the same time ensuring that people harness the power of this culture to their advantage.”52 information literacy programs championed by libraries have been instrumental in raising awareness and skill building among their user communities. using information literacy programs as a scaffold, algorithmic literacy can be incorporated into these successful initiatives. however, given the current needs, “machine learning and algorithms present frontstage in the information literacy constellation.”53

head et al., in their important 2020 study of algorithms and information literacy, present a view of student perspectives that is both troubling and optimistic.54 the students expressed “a tangle of resignation and indignation” about the effects of algorithms on their lives.
for them, algorithms obscure more than they reveal, privacy is compromised, “trust is dead,” and skepticism is total. the authors conclude that we face an “epistemological crisis” where algorithms are “stripping individuals of the responsibility to interpret the facticity of the information these systems give us when that interpretation has been performed by the algorithms themselves.” however, students also employed “defensive practices” against algorithms, utilized “multiple selves” to preserve their privacy, and were keen to learn how to “fight back” against surveillance and algorithmic decision-making. this is a reminder that “while algorithms certainly do things to people, people also do things to algorithms.”55 people have “algorithmic capital” which they can use in “negotiation with algorithmic power.”56

with these findings, it seems clear that status quo information literacy programs will not address the unique challenges presented by algorithms. jason clark, scott young, and lisa janicke hinchliffe took up this challenge with a project funded by an imls grant.57 calling “algorithmic awareness” a “new competency,” these researchers identified a gap in the acrl framework for information literacy that revealed “a lack of an understanding around the rules that govern our software and shape our digital experiences.”58 those rules are the “invisible logic” of algorithms that need to be made transparent for users and library staff.
deliverables from this project include an integrated curriculum, syllabus, and software prototype that respond uniquely to the pedagogical challenges of algorithmic literacy.59

in promoting ml (machine learning) literacy, ryan cordell also calls for a specific pedagogical approach that would “emphasize the situated-ness of ml training data and experiments, including the biases or oversights that influence the outcomes of academic, economic, and governmental ml processes.”60 recommendations from this report provide guidelines for developing staff expertise, running pilot projects, and creating toolsets and checklists supportive of responsible machine learning.

algorithmic literacy and explainable ai (xai)

perhaps a less obvious way for libraries to contribute to algorithmic literacy is through explainable ai (xai).61 difficulties in interrogating algorithms to assess bias, discrimination, and unfairness (as well as other deficiencies such as veracity and generalizability) have led to widespread interest in xai. the purpose of xai is to “enable human users to understand, appropriately trust, and effectively manage the emerging generation of artificially intelligent partners” and to deploy ai systems that have “the ability to explain their rationale, characterize their strengths and weaknesses, and convey an understanding of how they will behave in the future.”62

there is complementarity between the objectives of xai and algorithmic literacy. both seek transparency, promote understanding, and facilitate accountability. both recognize the primacy of human agency in human-machine interaction. xai is accomplished through a variety of techniques, strategies, and processes. these can involve unambiguous proofs, technical and statistical interventions for verification and validation, and authorizations that rely on standards, audits, and policy directives.63 explanations are contextual.
system designers, professionals, regulators, end users, and the general public need explanations specific to their objectives and tailored to their skills and knowledge.

as algorithmic decision-making is increasingly embedded in the information tools, services, and resources provided by libraries and promoted to users, xai and algorithmic literacy can operate in close association. libraries can incorporate aspects of xai into algorithmic literacy programming and the principles of algorithmic literacy (and more generally information literacy) can inform how xai is sensitive and responsive to different explanatory needs. xai is still an emergent field but it has had, and will continue to have, a profound impact on the development of machine learning systems. the opportunity for library involvement is immediate:

librarians need to become well versed in these technologies, and participate in their development, not simply dismiss them or hamper them. we must not only demonstrate flaws where they exist but be ready to offer up solutions. solutions grounded in our values and in the communities we serve.64

a repeated message from lis researchers is that library-developed tools to interrogate ai systems are essential components in advancing algorithmic literacy.65 these tools can address the complexity and opacity of machine learning systems and provide levels of explainability and transparency in contextually appropriate ways. one such tool, either as a stand-alone system or embedded in an existing discovery system, might provide a user with access to the nature, and potential bias, of the training data, the general efficacy of the learning algorithm(s) used, and the generalizability of the trained model to different contexts. this xai scorecard would integrate the objectives of xai, algorithmic literacy, and information literacy.
by leveraging and developing library staff skills and by partnering with ai research and industry groups “libraries can become ideal sites for cultivating responsible and responsive ml.”66 padilla views this engagement as not just a technical initiative but a library-wide effort to promulgate “responsible operations” with ai, noting that library practices “that embed transparency and explainability increase the likelihood of organizational accountability.”67

conclusion

algorithms are “the new power brokers in society” and “we are growing increasingly dependent on computational spectacles to see the world.”68 lash argues that this development has altered the rules by which society operates. constitutive rules (e.g., rules that define the boundaries of society) and regulative rules (e.g., the rules that define how we operate in society) are now joined by “algorithmic, generative rules.” these rules are “compressed and hidden and we do not encounter them in the way that we encounter constitutive and regulative rules. yet this third type of generative rules is more and more pervasive in our social and cultural life of the post-hegemonic order.”69

algorithmic literacy is a means to understand this new set of rules and to encourage the skills and abilities so people can use algorithms and not be used by them. libraries have typically championed accessible technology and its effective use. the ubiquity of algorithmic decision-making and its profound impact on everyday lives makes the recognition and promotion of algorithmic literacy a critical new challenge and imperative for libraries of all types.

endnotes

1 thomas h. cormen et al., introduction to algorithms, 3rd ed. (cambridge ma: mit press, 2009), 13.
2 yoav shoham et al., “ai index 2017 report” (stanford, ca: human-centered ai initiative, stanford university, 2017), http://cdn.aiindex.org/2017-report.pdf; safiya noble, algorithms of oppression: how search engines reinforce racism (new york: new york university press, 2018), 1.
3 jos de mul and bibi van den berg, “remote control: human autonomy in the age of computer mediated agency,” in law, human agency, and autonomic computing, ed. mireille hildebrandt and antoinette rouvroy (abingdon: routledge, 2011), 58.
4 lee rainie and janna anderson, “code-dependent: pros and cons of the algorithmic age” (pew research center, february 2017), http://www.pewinternet.org/wp-content/uploads/sites/9/2017/02/pi_2017.02.08_algorithms_final.pdf.
5 “canada’s ai imperative: from predictions to prosperity” (toronto: deloitte, 2019), 16, https://www.canada175.ca/en/reports/ai-imperative?&id=ca:2el:3or:awa_2019_fcc_omnia1:from_dca_fccomnia2.
6 “2019 edelman ai survey,” edelman, 2019, https://www.edelman.com/sites/g/files/aatuss191/files/2019-03/2019_edelman_ai_survey_whitepaper.pdf.
7 jenna burrell, “how the machine ‘thinks’: understanding opacity in machine learning algorithms,” big data & society 3, no. 1 (2016), https://doi.org/10.1177/2053951715622512; rainie and anderson, “code-dependent.”
8 jeannette wing, “computational thinking, 10 years later,” communications of the acm 59, no. 7 (2016): 10, https://doi.org/10.1145/2933410.
9 loanne snavely and natasha cooper, “the information literacy debate,” journal of academic librarianship 23, no. 1 (1997): 9–14, https://doi.org/10.1016/s0099-1333(97)90066-5.
10 alfred thomas bauer and ebrahim mohseni ahooei, “rearticulating internet literacy,” cyberspace studies 2, no. 1 (2018): 29–53, https://doi.org/10.22059/jcss.2018.245833.1012.
11 evelyn stiller and cathie leblanc, “from computer literacy to cyber-literacy,” journal of computing sciences in colleges 21, no. 6 (2006): 4–13; peter j.
denning and matti tedre, computational thinking (cambridge ma: mit press, 2019); z. katai, “the challenge of promoting algorithmic thinking of both sciences- and humanities-oriented learners,” journal of computer assisted learning 31, no. 4 (2015): 287–99, https://doi.org/10.1111/jcal.12070.
12 nick seaver, “what should an anthropology of algorithms do?” (american anthropological association, chicago, 2013), 1–2, http://nickseaver.net/papers/seaveraaa2013.pdf.
13 jesús moreno-león and marcos román-gonzález, “on computational thinking as a universal skill,” in ieee global engineering education conference (educon, santa cruz de tenerife, spain: ieee, 2018), 1684–89; shuchi grover and roy pea, “computational thinking in k–12: a review of the state of the field,” educational researcher 42, no. 1 (2013): 38–43, https://doi.org/10.3102/0013189x12463051; betul c. czerkawski and eugene w.
lyman iii, “exploring issues about computational thinking in higher education,” techtrends 59, no. 2 (2015): 57–65.
14 daniel zeng, “from computational thinking to ai thinking,” ieee intelligent systems (november/december, 2013), 2–4; duri long and brian magerko, “what is ai literacy? competencies and design considerations,” in proceedings of the 2020 chi conference on human factors in computing systems, chi ’20 (honolulu, hi: association for computing machinery, 2020), 1–16, https://doi.org/10.1145/3313831.3376727.
15 rob kitchin, “thinking critically about and researching algorithms,” information, communication & society 20, no. 1 (2017): 18, https://doi.org/10.1080/1369118x.2016.1154087.
16 karen nicholson, “information into action? reflections on (critical) practice” (workshop on instruction in library use (wilu), university of ottawa, 2018), 7–8, https://ir.lib.uwo.ca/fimspres/51/.
17 nick seaver, “algorithms as culture: some tactics for the ethnography of algorithm systems,” big data & society 4 (2017): 10, https://doi.org/10.1177/2053951717738104.
18 virginia eubanks, automating inequality: how high-tech tools profile, police, and punish the poor (new york: st. martin’s press, 2018); noble, algorithms of oppression; cathy o’neil, weapons of math destruction: how big data increases inequality and threatens democracy (new york: crown, 2016); frank pasquale, the black box society: the secret algorithms that control money and information (cambridge, ma: harvard university press, 2015).
19 tim dutton, “building an ai world: report on national and regional ai strategies” (toronto: cifar, 2018), https://www.cifar.ca/docs/default-source/ai-society/buildinganaiworld_eng.pdf?sfvrsn=fb18d129_4; pricewaterhousecoopers, “sizing the prize: what’s the real value of ai for your business and how can you capitalise?,” 2017, https://www.pwc.com/gx/en/issues/analytics/assets/pwc-ai-analysis-sizing-the-prize-report.pdf.
20 allan martin and jan grudziecki, “digeulit: concepts and tools for digital literacy development,” innovation in teaching and learning in information and computer sciences 5, no. 4 (2006): 249–67, https://doi.org/10.11120/ital.2006.05040249.
21 rosanne cordell, “information literacy and digital literacy: competing or complementary?,” communications in information literacy 7, no. 2 (2013): 177–83, https://doi.org/10.15760/comminfolit.2013.7.2.150; andreas dengel and ute heuer, “a curriculum of computational thinking as a central idea of information & media literacy,” in proceedings of the 13th workshop in primary and secondary computing education (wipsce’18) october 4-6, 2018, potsdam, germany (new york: acm, 2018), https://doi.org/10.1145/3265757.3265777; sarah gretter and aman yadav, “computational thinking and media & information literacy: an integrated approach to teaching twenty-first century skills,” techtrends 60 (2016): 510–16, https://doi.org/10.1007/s11528-016-0098-4.
22 jeannette wing, “computational thinking,” communications of the acm 49, no. 3 (2006): 35.
23 sharin rawhiya jacob and mark warschauer, “computational thinking and literacy,” journal of computer science integration 1, no. 1 (2018): 3, https://doi.org/10.26716/jcsi.2018.01.1.1. 24 sylvia scribner and michael cole, the psychology of literacy, acls humanities e-book (series) (cambridge, ma: harvard university press, 1981), 99. 25 george steiner, “school terms: redefining literacy for the digital age,” lapham’s quarterly 1, no. 4 (2008): 198. 26 ed finn, “algorithm of the enlightenment,” issues in science and technology 33, no. 3 (2017): 25; ed finn, what algorithms want: imagination in the age of computing (cambridge, ma: mit press, 2017), 2. 27 long and magerko, “what is ai literacy?,” 2. 28 david bawden, “information and digital literacies: a review of concepts,” journal of documentation 57, no. 2 (2001): 233. 29 gretter and yadav, “computational thinking,” 510. 30 pasquale, the black box society; o’neil, weapons of math destruction. 31 “what consumers really think about ai: a global study,” pega, 2019, https://www.ciosummits.com/what-consumers-really-think-about-ai.pdf. 32 motahhare eslami et al., “first i ‘like’ it, then i hide it: folk theories of social feeds,” in proceedings of the 2016 chi conference on human factors in computing systems, chi ’16 (san jose, ca: association for computing machinery, 2016), 2371–82, https://doi.org/10.1145/2858036.2858494. 33 julia angwin et al., “machine bias,” propublica, may 23, 2016, https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing; eubanks, automating inequity; noble, algorithms of oppression; pasquale, the black box society; ruha benjamin, race after technology: abolitionist tools for the new jim code (polity press, 2019); o’neil, weapons of math destruction. 34 tarleton gillespie, “the relevance of algorithms,” in media technologies: essays on communication, materiality, and society, ed. tarleton gillespie, pablo j. boczkowski, and kirsten a. 
foot (cambridge, ma: mit press, 2014), 184; sun-ha hong, technologies of speculation: the limits of knowledge in a data-driven society (new york: new york university press, 2020), 2. 35 gillespie, “the relevance of algorithms,” 187. https://doi.org/10.1007/s11528-016-0098-4 https://doi.org/10.26716/jcsi.2018.01.1.1 https://www.ciosummits.com/what-consumers-really-think-about-ai.pdf https://doi.org/10.1145/2858036.2858494 https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing information technology and libraries june 2021 algorithmic literacy and the role for libraries | ridley and pawlick-potts 13 36 altexsoft, “comparing machine learning as a service: amazon, microsoft azure, google cloud ai, ibm watson,” data science (blog), september 27, 2019, https://www.altexsoft.com/blog/datascience/comparing-machine-learning-as-a-serviceamazon-microsoft-azure-google-cloud-ai-ibm-watson/. 37 european union, “regulation (eu) 2016/679 of the european parliament and of the council of 27 april 2016,” 2016, http://eur-lex.europa.eu/legalcontent/en/txt/?uri=celex:32016r0679; bryce goodman and seth flaxman, “european union regulations on algorithmic decision making and a ‘right to explanation,’” ai magazine 38, no. 3 (2017): 50–57, https://doi.org/10.1609/aimag.v38i3.2741. 38 finland, “work in the age of artificial intelligence: four perspectives on economy, employment, skills and ethics” (helsinki: ministry of economic affairs and employment, 2018), http://urn.fi/urn:isbn:978-952-327-313-9. 39 taina bucher, if . . . then: algorithmic power and politics (new york: oxford university press, 2018), 20. 40 alison j. head, barbara fister, and margy macmillan, “information literacy in the age of algorithms: student experiences with news and information, and the need for change” (project information literacy, 2020), 41, https://www.projectinfolit.org/uploads/2/7/5/4/27541717/algoreport.pdf. 41 paul t. 
jaeger et al., “the intersection of public policy and public access: digital divides, digital literacy, digital inclusion, and public libraries,” public library quarterly 31, no. 1 (2012): 1, https://doi.org/10.1080/01616846.2012.654728. 42 “ulc snapshot: artificial intelligence,” urban libraries council weekly newsletter, july 18, 2018. 43 canadian federation of library associations, “artificial intelligence and intellectual freedom: key policy concerns for canadian libraries” (ottawa: cfla, 2018), http://cfla-fcab.ca/wpcontent/uploads/2018/07/cfla-fcab-2018-national-forum-paper-final.pdf. 44 martin and grudziecki, “digeulit.” 45 b. alexander, s. adams becker, and m. cummins, “digital literacy: an nmc horizon project strategic brief” (austin, tx: the new media consortium, 2016), https://www.nmc.org/publication/digital-literacy-an-nmc-horizon-project-strategic-brief/. 46 ting-chia hsu, shao-chen chang, and yu-ting hung, “how to learn and how to teach computational thinking: suggestions based on a review of the literature,” computers & education 126 (2018): 296–310, https://doi.org/10.1016/j.compedu.2018.07.004. 47 sayamindu dasgupta and benjamin mako hill, “designing for critical algorithmic literacies,” arxiv:2008.01719 [cs], 2020, http://arxiv.org/abs/2008.01719. 
48 long and magerko, “what is ai literacy?” 49 alexander, adams becker, and cummins, “digital literacy.” https://www.altexsoft.com/blog/datascience/comparing-machine-learning-as-a-service-amazon-microsoft-azure-google-cloud-ai-ibm-watson/ https://www.altexsoft.com/blog/datascience/comparing-machine-learning-as-a-service-amazon-microsoft-azure-google-cloud-ai-ibm-watson/ http://eur-lex.europa.eu/legal-content/en/txt/?uri=celex:32016r0679 http://eur-lex.europa.eu/legal-content/en/txt/?uri=celex:32016r0679 https://doi.org/10.1609/aimag.v38i3.2741 http://urn.fi/urn:isbn:978-952-327-313-9 https://www.projectinfolit.org/uploads/2/7/5/4/27541717/algoreport.pdf https://doi.org/10.1080/01616846.2012.654728 http://cfla-fcab.ca/wp-content/uploads/2018/07/cfla-fcab-2018-national-forum-paper-final.pdf http://cfla-fcab.ca/wp-content/uploads/2018/07/cfla-fcab-2018-national-forum-paper-final.pdf https://www.nmc.org/publication/digital-literacy-an-nmc-horizon-project-strategic-brief/ https://doi.org/10.1016/j.compedu.2018.07.004 http://arxiv.org/abs/2008.01719 information technology and libraries june 2021 algorithmic literacy and the role for libraries | ridley and pawlick-potts 14 50 edward greenspon and taylor owen, “democracy divided: countering disinformation and hate in the digital public sphere” (ottawa: public policy forum, 2018), 19, https://ppforum.ca/wpcontent/uploads/2018/08/democracydivided-ppf-aug2018-en.pdf. 51 marcos román-gonzález, juan-carlos pérez-gonzález, and carmen jiménez-fernández, “which cognitive abilities underlie computational thinking? criterion validity of the computational thinking test,” computers in human behavior 72 (2017): 678–91, https://doi.org/dx.doi.org/10.1016/j.chb.2016.08.047. 52 annemaree lloyd, “chasing frankenstein’s monster: information literacy in the black box society,” journal of documentation 75, no. 6 (2019): 1476, https://doi.org/10.1108/jd-022019-0035. 
53 head, fister, and macmillan, “information literacy in the age of algorithms,” 42. 54 head, fister, and macmillan, “information literacy in the age of algorithms.” 55 taina bucher, “the algorithmic imaginary: exploring the ordinary affects of facebook algorithms,” information, communication & society 20, no. 1 (2017): 42, https://doi.org/10.1080/1369118x.2016.1154086. 56 tanya kant, making it personal: algorithmic personalization, identify, and everyday life (oxford: oxford university press, 2020), 152. 57 jason clark, lisa janicke hinchliffe, and scott young, “unpacking the algorithms that shape our ux” (washington, dc: imls, 2017), https://www.imls.gov/sites/default/files/grants/re-7217-0103-17/proposals/re-72-17-0103-17-full-proposal-documents.pdf. 58 association of college and university libraries, “framework for information literacy for higher education,” 2015, http://www.ala.org/acrl/standards/ilframework; jason clark, “building competencies around algorithmic awareness” (washington, dc: code4lib, 2018), https://www.lib.montana.edu/~jason/talks/algorithmic-awareness-talk-code4lib2018.pdf. 59 jason clark, algorithmic awareness (2018; repr., github, 2020), https://github.com/jasonclark/algorithmic-awareness. 60 ryan cordell, “machine learning + libraries: a report on the state of the field” (washington dc: library of congress, 2020), 31, https://labs.loc.gov/static/labs/work/reports/cordellloc-ml-report.pdf. 61 michael ridley, “explainable artificial intelligence,” research library issues, no. 299 (2019): 28– 46, https://doi.org/10.29242/rli.299.3. 62 matt turek, “explainable artificial intelligence (xai)” (arlington, va: darpa, 2016), https://www.darpa.mil/program/explainable-artificial-intelligence; darpa, “explainable artificial intelligence (xai)” (arlington, va: darpa, 2016), http://www.darpa.mil/attachments/darpa-baa-16-53.pdf. 
63 ashraf abdul et al., “trends and trajectories for explainable, accountable, and intelligible systems: an hci research agenda,” in proceedings of the 2018 chi conference on human https://ppforum.ca/wp-content/uploads/2018/08/democracydivided-ppf-aug2018-en.pdf https://ppforum.ca/wp-content/uploads/2018/08/democracydivided-ppf-aug2018-en.pdf https://doi.org/dx.doi.org/10.1016/j.chb.2016.08.047 https://doi.org/10.1108/jd-02-2019-0035 https://doi.org/10.1108/jd-02-2019-0035 https://doi.org/10.1080/1369118x.2016.1154086 https://www.imls.gov/sites/default/files/grants/re-72-17-0103-17/proposals/re-72-17-0103-17-full-proposal-documents.pdf https://www.imls.gov/sites/default/files/grants/re-72-17-0103-17/proposals/re-72-17-0103-17-full-proposal-documents.pdf http://www.ala.org/acrl/standards/ilframework https://www.lib.montana.edu/~jason/talks/algorithmic-awareness-talk-code4lib2018.pdf https://github.com/jasonclark/algorithmic-awareness https://labs.loc.gov/static/labs/work/reports/cordell-loc-ml-report.pdf https://labs.loc.gov/static/labs/work/reports/cordell-loc-ml-report.pdf https://doi.org/10.29242/rli.299.3 https://www.darpa.mil/program/explainable-artificial-intelligence http://www.darpa.mil/attachments/darpa-baa-16-53.pdf information technology and libraries june 2021 algorithmic literacy and the role for libraries | ridley and pawlick-potts 15 factors in computing systems, chi ’18 (new york: acm, 2018), 582:1–582:18, https://doi.org/10.1145/3173574.3174156; wojciech samek and klaus-robert muller, “towards explainable artificial intelligence,” in explainable ai: interpreting, explaining and visualizing deep learning, ed. wojciech samek et al., 2019., lecture notes in artificial intelligence 11700 (cham: springer international publishing, 2019), 5–22; alejandro barredo arrieta et al., “explainable artificial intelligence (xai): concepts, taxonomies, opportunities and challenges toward responsible ai,” arxiv:1910.10045 [cs], 2019, http://arxiv.org/abs/1910.10045. 
64 r. david lankes, “decoding ai and libraries,” r. david lankes (blog), july 3, 2019, https://davidlankes.org/decoding-ai-and-libraries/. 65 catherine coleman, “artificial intelligence and the library of the future, revisited,” digital library blog (blog), november 3, 2017, https://library.stanford.edu/blogs/digital-libraryblog/2017/11/artificial-intelligence-and-library-future-revisited; head, fister, and macmillan, “information literacy in the age of algorithms”; cordell, “machine learning + libraries”; clark, hinchliffe, and young, “unpacking the algorithms.” 66 cordell, “machine learning + libraries,” 2. 67 thomas padilla, responsible operations. data science, machine learning, and ai in libraries (dublin, oh: oclc research, 2019), 10, https://doi.org/10.25333/xk7z-9g97. 68 nicholas diakopoulos, “algorithmic accountability reporting: on the investigation of black boxes” (new york: tow center for digital journalism, columbia university, 2014), 2, https://doi.org/10.7916/d8zk5tw2; finn, “algorithm of the enlightenment,” 24. 69 scott lash, “power after hegemony: cultural studies in mutation?,” theory, culture & society 24, no. 3 (2007): 71, https://doi.org/10.1177/0263276407075956. https://doi.org/10.1145/3173574.3174156 http://arxiv.org/abs/1910.10045 https://davidlankes.org/decoding-ai-and-libraries/ https://library.stanford.edu/blogs/digital-library-blog/2017/11/artificial-intelligence-and-library-future-revisited https://library.stanford.edu/blogs/digital-library-blog/2017/11/artificial-intelligence-and-library-future-revisited https://doi.org/10.25333/xk7z-9g97 https://doi.org/10.7916/d8zk5tw2 https://doi.org/10.1177/0263276407075956 abstract introduction algorithms and literacy “literacies of the digital” defining algorithmic literacy why algorithmic literacy? 
marcive: a cooperative automated library system

virginia m. bowden: systems analyst, the university of texas health science center at san antonio, and ruby b. miller: head cataloger, trinity university, san antonio, texas.

journal of library automation vol. 7/3 september 1974

the marcive library system is a batch computer system utilizing both the marc tapes and local cataloging to provide catalog cards, book catalogs, and selective bibliographies for five academic libraries in san antonio, texas. the development of the system is traced and present procedures are described. batch retrieval from the marc records plus the modification of these records costs less than twenty cents per title. computer costs for retrieval, modification, and card production average sixty-six cents per title, between seven and ten cents per card. the attributes and limitations of the marcive system are compared with those of the oclc system.

in san antonio, texas, a unique cooperative effort in library automation has developed, involving the libraries of five diverse institutions: trinity university, the university of texas health science center at san antonio (uthscsa), san antonio college (sac), the university of texas at san antonio (utsa), and st. mary's university. these institutions are utilizing the marcive library system, which was developed by and for one library, that of trinity university. the marcive system is a batch, disc oriented computer system utilizing both local cataloging and the marc tapes to produce catalog cards, book catalogs, selective bibliographies, and other products.

development

the trinity university library has been involved in library automation since 1966.1 when the library reclassified its collection from dewey to the library of congress classification in 1966, a simplified machine-readable format was developed and used for storage on computer. this format contained the following bibliographic elements: accession number, call number, author, title, and imprint date. in 1969 the library decided to reformat the computer data base into a marc ii compatible format in order to build a data base of bibliographic records that could be the basis for all future automated systems within the library. the resulting system, marcive, was designed jointly by the head cataloger, ruby b. miller, and the library programmer, paul jackson, a graduate student in trinity's department of computer science. since in 1969 literature on completed library automation projects was sparse, no other system was used as a guide. the marcive format was based on the designers' interpretation of the 1969 edition of the marc manual.

the name, marcive, evolved when the programmer facetiously claimed that his format was so advanced he would call it the marc iv format. the computer room operating staff, ignoring the space between the marc and iv, combined the two, producing marciv. an e was added later for ease of pronunciation.

the marcive system was designed initially as a system for data storage and retrieval. the update, select, and acquisitions list programs were operative in september 1970. the next month uthscsa inquired as to the possibility of producing catalog cards as part of the marcive system. within the brief span of three months, by january 1971, trinity university library produced 4,289 catalog cards and uthscsa produced 1,719 catalog cards via marcive. in february 1974, the five participating libraries produced a total of 29,000 catalog cards, with trinity accounting for 10,740 cards.
continued development of the marcive system was delayed in 1971 by changes in computer center personnel and equipment. in 1972 new programs were developed to incorporate the marc tapes into the marcive system. the size of the marc data base, which is now held on three discs, was a major problem. modifications were included to accept input from magnetic tape and typewriter terminals using the apl language as well as keypunched cards. the original restriction of the system to classifications with one to three alphabetic letters followed by numbers, such as used by lc and nlm, was modified to accept dewey decimal classification to accommodate san antonio college. this restriction had been incorporated in an attempt to insure that the call number would be properly formatted, thus simplifying retrieval in the select program and grouping in the acquisitions list and update programs.

computer configuration

the marcive system is a disc oriented system which was programmed for an ibm 360/44 using the mft operating system. this computer model was designed for scientific programming and was manufactured in limited quantities. the programs were written in basic assembly language since adequate higher level language compilers for the 360/44 were not available at the trinity computer center. in 1971 the programs were converted to run under dos, and in 1972 they were converted for processing on the ibm 370/155 using the os processing system. since the initial programs were written in basic assembly language, the subsequent programs have also been written this way.

marcive format

the marcive format is an adaptation of the marc ii format. the definition of the marc ii format is a "... format which is intended for the interchange of bibliographic records on magnetic tape. it has not been designed as a record format for retention within the files of any specific organization ...
[it is] a generalized structure which can be used to transmit between systems records describing all forms of material capable of bibliographic descriptions . . . the methods of recording and identifying data should provide for maximum manipulability leading to ease of conversion to other formats for various uses."2

adaptation of the marc ii format is common among users. an analysis by the recon task force found much variation among the use of the fixed fields, tags, indicators, and subfields.3 the oclc system can regenerate marc ii records from oclc records although they contain only 78 percent of the number of characters in the original marc ii record.4

the developers of the marcive system studied the marc manual and decided that the leader and directory were not necessary for program manipulation. such information can be generated by a conversion program. the marc mnemonic codes were chosen instead of the numeric ones because all bibliographic data were being coded locally and it was felt that mnemonics would be easier to work with. the mnemonic codes are the ones designated in the marc manuals except that "si" was substituted for "se." rules for assigning indicators, subfields, and delimiters are those described by marc.

the basic structure of the marcive format is illustrated in figure 1. the differences between marcive and marc are as follows:

1. marcive's leader consists of three fields: length of disc space, status code, and length of record. in converting marc the following elements of the marc leader are incorporated in the marcive leader fields: length of disc space, status code, and length of record.
2. marcive does not contain the marc record directory, but rather places the tags and subfield codes in front of the actual data.
3. in the conversion from marc ii to marcive, fixed fields such as date of publication are omitted.
4. all data elements in marcive are treated as variable tags even though they contain fixed field data.
5. marcive uses the mnemonic code names for the input of data rather than the numeric marc codes. for example "mep" is used for coding a person as main entry rather than "110." the mnemonic tag names are stored in the machine format and not the numeric marc tags.
6. all first indicators are input except for the first indicator in the contents note.
7. most of the second indicators are not input, except for the filing indicators which are included in the marcive format.
8. marcive adds one variable tag to the marc format called "fin." it serves the function of the marc 090 local holdings tag. the fin tag must be the first variable tag in each marcive record and must contain four data elements: (1) accession number; (2) type of material code (monograph, serial, etc.); (3) location of material within library (reference, reserve, etc.); (4) local call number.

[fig. 1. marcive format structure.] the fields shown in the figure are described as follows:

length of disc space. this identifies the number of seventy-two byte blocks a record uses. the marcive records average 350 characters or three to six blocks.
blank. this field is used by the update program.
length of record. identifies the actual number of characters a record contains.
fin tag. this is the marcive control tag and must precede each record. it contains four subfields: accession number, type of material, location of material, and call number.
tag name. after the fin tag, any of the marcive tags may be input as long as they conform to the proper sequence (i.e., main entry must precede title). each tag is followed by its subfield codes and the data elements.
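the record layout described above (three leader fields, a mandatory fin tag with four data elements, then mnemonic tags whose subfield codes sit in front of the data) can be sketched in python. this is only an illustration under stated assumptions: the class and attribute names, the comma-delimited text form, and the rounding-up of the block count are not taken from the article, which gives no byte-level encoding.

```python
from dataclasses import dataclass, field
from math import ceil

BLOCK_SIZE = 72  # the text gives disc space in seventy-two byte blocks


def blocks_needed(record_length: int) -> int:
    """Number of 72-byte blocks for a record (rounding up is an assumption)."""
    return ceil(record_length / BLOCK_SIZE)


@dataclass
class Fin:
    """The four data elements required in the fin tag (attribute names are illustrative)."""
    accession_number: str
    material_type: str   # e.g. monograph, serial
    location: str        # e.g. reference, reserve
    call_number: str


@dataclass
class Tag:
    """A variable tag: mnemonic name and subfield codes placed in front of the data."""
    name: str
    subfield_codes: str
    data: list[str]


@dataclass
class MarciveRecord:
    fin: Fin                                   # kept apart so it is always the first tag
    tags: list[Tag] = field(default_factory=list)
    status: str = " "

    def serialized(self) -> str:
        """Hypothetical comma-delimited text form, used only to count characters."""
        parts = ["fin", self.fin.accession_number, self.fin.material_type,
                 self.fin.location, self.fin.call_number]
        for tag in self.tags:
            parts.append(f"{tag.name} {tag.subfield_codes}")
            parts.extend(tag.data)
        return ",".join(parts)

    def leader(self) -> tuple[int, str, int]:
        """Return (length of disc space in blocks, status code, length of record)."""
        length = len(self.serialized())
        return blocks_needed(length), self.status, length


# Sample values are illustrative only.
record = MarciveRecord(
    Fin("6950564", "m", "rp", "qs 4 k49t 1961"),
    [Tag("meps", "a", ["kimber, diana clifford"]),
     Tag("til", "ac", ["anatomy and physiology",
                       "[by] diana clifford kimber [et al.]"])],
)
print(record.leader())
```

under the rounding-up assumption, a record of the stated average length of 350 characters needs `blocks_needed(350)`, that is five blocks, which falls inside the three-to-six block range the text reports.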
even though marcive is not a pure marc format, there has been an attempt to code most of the data elements into marcive. a marcive to marc conversion is being written by one of the marcive libraries in order to merge its marcive data base with a purchased marc data base.

marcive master data bases

each of the marcive users maintains a separate data base of its holdings, which is called its marcive master. this master file contains a complete bibliographic record for each title cataloged by the library, including marc cataloging and local cataloging. when a library modifies a marc record, the modified record is recorded in that library's marcive master. the various libraries' marcive masters have not been merged, although this is being considered. each library has prefaced all of its accession numbers with a unique library code just in case a merged data base is desired.

marc-con data base

the largest data base in the system is the marc-converted data base, hereafter referred to as marc-con. this data base contains only pure marc data that have been converted into marcive machine format. no original cataloging or local modifications of marc are contained in the marc-con data base.

marcive programs

convert: this program reformats the weekly marc tapes into the marcive machine format.

marc-update: this program merges the weekly converted marc tape with the marc-con disc file. an index sequential (isam) file containing lc card number, fifty characters of the title, and the disc address of the marc record is generated. the isam file is in lc card number order. in 1974 the marc-con data base filled three 3330 disc packs. there are three tape back-up files: one file consisting of original marc records, one of the marc-con records, and a third with the isam file. deleted records and replaced records are annually purged from the marc-con files.
a new set of back-up tapes for the disc packs is created every three months in order to facilitate regeneration of the disc packs should damage occur.

marc-list: this program lists marc records in title sequence from the tape. once every six to eight weeks the list is cumulated and printed. these lists are used for searching until the annual cumulation of the nuc is received. this provides current listings of records on the marc tapes that are not easily available in the national union catalog. this listing will be eliminated in 1974, when access by title to the marc-con data base is available.

marc-search: this program searches for lc numbers on the marc-con file using the isam file. a file of the matched records is produced on tape or disc as specified along with a listing of these records. this listing contains the marc-con complete bibliographic entry (figure 2). although access is currently only by lc card number, access by title algorithm (3, 1, 1) is expected in 1974.

[fig. 2. search listing of marc-con data.]

replace: the purpose of this program is to modify marc-con records to fit the needs of the individual library. these modifications can be done automatically to all records or on a single record basis by the library. the automatic changes are specified on a control card and include twenty-two options such as assignment of accession number, usage of dewey class number instead of lc, and changing "u.s." in subject headings to "united states." an example of a single modification would be the changing of a series entry from traced to untraced. most marcive participants use a combination of automatic and single changes. the output from the replace program may be input to all other marcive programs, such as edit, catalog card, update, etc.

edit: this program verifies the format of the input. valid tags and subfields as well as correct sequence of tags are checked. multiple spaces are compressed to one, implied subfields are added, and a limited number of punctuation marks are generated. actual bibliographic data are not checked so spelling errors are not detected by the program. those titles which do not conform to specifications are rejected and an explanatory message is generated. a library may choose one of three forms of listings of output: (1) full-edit, (2) mini-edit, or (3) error-edit. the full-edit lists every title processed in the following forms: the input data, the data as retained in the marcive file, the data in catalog card format, and the tracings (figure 3). the mini-edit lists the input for each title and any error messages. the error-edit lists only the input form and the error message of those titles which do not conform to specifications. a library might use the error-edit for cataloging data from the marc-con file as the proofreading has occurred after the search phase. however, for original cataloging in which all data have been input locally, the full-edit would be most beneficial.

[fig. 3. full-edit listing from the edit program showing the input, keypunched data, data in marcive file format, data in catalog card format, and tracings.]

acquisitions list: this program produces listings arranged by classification in a format suitable for printing on 8 1/2-by-11-inch paper. these listings include the following data elements for each title: main entry, title, imprint, collation, call number. the items are sorted by author within classification.

catalog card: this program produces catalog cards on one-up stock. the ibm tn train is used and the printing is eight lines to an inch. the program has many options including number of cards produced by type of entry, and whether or not the tracings will print on the tops of the cards. the cards may be printed in filing order in the following arrays:

1. shelflist arranged by call number.
2. main entry-title-series-added entries in one alphabet arranged alphabetically by the first eight characters of the first word, excluding designated prefixes.
3. subject entries in one alphabet arranged alphabetically by the first eight characters of the first word.

the alphabetizing is intended to be a prefiling aid and not to be used as the absolute filing arrangement since each library adapts the filing rules for its own collection. the cards may also be printed in set order. examples of typical catalog cards are shown as figure 4.

book catalog i: this is a modification of the acquisitions list program in which the classification does not print. it is used to list faculty publications.

book catalog ii: this is a modification of the catalog card program in which the cataloging information is printed with all blank lines compressed. since it is a variation of the catalog card, any type of catalog can be created by specifying the type of entry and then having it prefiled accordingly. trinity produces an author-title book catalog which is used in the interim period between the production and the filing of the catalog cards. thus, the public has a listing of new titles added to the library.

book catalog iii: this program generates a book catalog arranged by classification. it is similar to book catalog ii but tracings are not printed (figure 5). indexes by added entry, title, and subject are also generated.
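the prefiling order used by the catalog card program for its alphabetical arrays (the first eight characters of the first word, excluding designated prefixes) can be sketched as a sort key. this is a rough python illustration, not the original assembly-language routine: the prefix list, the punctuation stripping, and the function names are assumptions, since the article does not say which prefixes were designated.

```python
PREFIXES = ("a", "an", "the")  # assumed list; the source does not name the designated prefixes


def prefiling_key(heading: str, prefixes=PREFIXES) -> str:
    """First eight characters of the first filing word of a heading."""
    words = heading.lower().split()
    if len(words) > 1 and words[0] in prefixes:
        words = words[1:]                      # skip one designated prefix
    first = words[0].strip(".,;:") if words else ""
    return first[:8]


def prefile(headings):
    """Stable sort: headings that share a key keep their input order."""
    return sorted(headings, key=prefiling_key)


cards = [
    "finkle, robert b.",
    "the physical examination",
    "assessing corporate talent",
    "employees, rating of",
]
for heading in prefile(cards):
    print(prefiling_key(heading).ljust(8), "|", heading)
```

because only eight characters of one word are compared, headings such as "genetics" and "genetics, human" receive the same key and stay in input order, which is consistent with the article's description of the arrangement as a prefiling aid rather than an absolute filing order.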
update-this program merges the additions and deletions to the marcive master file, producing an updated master and a listing in classification sequence of the additions. the marcive master is in accession number order. duplicate accession numbers are rejected.
select-this program generates bibliographies from the marcive master files (figure 6). any tag and its subfields can be searched. the output can be sorted by call number, main entry, title, or any other bibliographic element in marcive. the program can be a powerful search tool for a library. however, the program's method of retrieval is by comparing input data with each record on the file, which can be a very expensive process. there have been discussions of building isam files for various points of entry.
[fig. 4. sample set of catalog cards. the same card (finkle and jones, assessing corporate talent, call number hf 5549.5 r3 f54, accession number 245701) is shown as main entry, under the subject heading "employees, rating of," and under the title.]
[fig. 5. page from uthscsa video tape catalog produced by book catalog iii.]
[fig. 6. select listing arranged by main entry. retrieval was based on call number qh 431, subject "genetics" and "genetics" as first word in title.]
build misam-this program builds an isam file to the marcive data bases by accession number. this isam file is then used by the marcive search program. this isam file is built only on demand.
marcive search-this program searches for accession numbers on the marcive isam file. a disc file and a listing of the marcive records is generated.
this file can be corrected by using the replace program, thus creating a corrected disc file which can be input into the various marcive programs.
using the marcive system
the original marcive programs used only keypunched card input. the cataloged data were coded with the marc mnemonic codes and then keypunched. the resulting decklet of cards was the input to several programs. if in the edit program an error was detected, the appropriate punched card could be corrected and the decklet resubmitted to the edit program, or that program could be bypassed and the decklet input into the catalog card program. these same decklets were saved for the monthly acquisition list. they were not stored on disc until recorded on the marcive master disc file in the update program. although this may seem awkward, it was easier for the library staff, who thus did not worry about change codes and deletes, but could work with their prime input, the keypunched cards.
the uthscsa library still uses the keypunched card method of input, since its cataloging is based on the national library of medicine's cataloging. citations are manually coded with the marc mnemonic codes at a rate of one to three minutes per title. keypunching takes approximately five minutes per title and is much easier than typing a catalog card because placement on the catalog card is not a consideration. the keypunched data are input to the edit program and the full-edit listing is used for proofreading. correct data are input to the catalog card program, then saved for input to the monthly acquisitions list and the update programs.
the other libraries in the marcive system use the marc-con portion of the marcive system. with the addition of the marc tapes it became possible to bypass the coding and keypunching steps, thus saving both time and effort and reducing the chance of error.
at trinity university library when books arrive for processing, a library clerk keypunches the lc card number for each book published after 1968. these keypunched cards are then submitted to the marc search program. the books are matched against the search listing of marc data and any cataloging changes in addition to certain automatic changes are noted on the listing. when the books on a search listing are checked and classified, the books are sent for further processing for the shelves. unmatched books are held for later input to the search program. the corrected search listing is sent to the ibm 2741 typewriter terminal operator. the operator types an input file of changes to the marc records, such as a series entry change.
trinity uses the following automatic options in the replace program: (1) the call number used is the lc call number without the period; (2) a sequential accession number is generated; (3) the date entered on the file is added to the record; (4) library location code is generated; (5) "u.s." and "gt. brit." in subject headings are spelled out. the change file plus the automatic options control data are transmitted via telephone line to the computer and submitted to the replace program. the replace program modifies the records on the search file and creates a file of records which can be used by any of the marcive programs to produce catalog cards, book catalogs, or updates to the library marcive master. since many automatic options have been included in the replace program, the correction required on marc search records is minimal and a great many catalog cards can be created with little or no input required from the library.
books published before 1968, unless one of the marc popular titles, must be fully coded and input by the trinity university library staff.
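the five automatic options that trinity applies in the replace program can be illustrated with a short python sketch. it is not the replace program itself: the record layout, the function name apply_automatic_options, the location code "TU", and the sample subject headings are all hypothetical, and "without the period" is assumed to mean the period before the cutter letter (the decimal point is kept, as on the sample cards).

```python
import re
from dataclasses import dataclass
from datetime import date
from itertools import count

# Only the two abbreviations named in the article.
SUBJECT_EXPANSIONS = {"U.S.": "United States", "Gt. Brit.": "Great Britain"}


@dataclass
class CatalogRecord:  # hypothetical stand-in for a MARC search record
    lc_call_number: str
    subjects: list
    call_number: str = ""
    accession_number: int = 0
    date_entered: str = ""
    location_code: str = ""


def apply_automatic_options(record, accession_numbers, entered, location_code):
    # (1) LC call number without the period; assumed here to mean the
    #     period that precedes a letter, not the decimal point.
    record.call_number = re.sub(r"\.(?=[A-Za-z])", "", record.lc_call_number)
    # (2) next sequential accession number
    record.accession_number = next(accession_numbers)
    # (3) date the record was entered on the file
    record.date_entered = entered.isoformat()
    # (4) library location code
    record.location_code = location_code
    # (5) spell out abbreviations in subject headings
    expanded = []
    for subject in record.subjects:
        for short, full in SUBJECT_EXPANSIONS.items():
            subject = subject.replace(short, full)
        expanded.append(subject)
    record.subjects = expanded
    return record


record = CatalogRecord("HF5549.5.R3F54", ["Labor supply--U.S.", "Art--Gt. Brit."])
apply_automatic_options(record, count(245701), date(1973, 9, 1), "TU")
print(record)
```

because every option is applied without operator input, a record that needs no other change can go straight to card production, which is the saving the article attributes to the replace program.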
this represents about 25 percent of the books cataloged by the library, with the percentage varying from month to month depending on the cataloging priorities in the department.
trinity university library is the only marcive user that inputs via an on-line terminal. this is a much more expensive though versatile method of input. all of the other marcive users who modify marc search records follow the same procedure except all input is via punched cards.
one potential marcive user is experimenting with producing catalog cards without any interim editing of the marc records other than the automatic options available through the replace program. this method of catalog card production is quicker and less expensive and would be useful for many libraries.
benefits
the benefits of the marcive system must be compared to the manual system which it replaced. clerical staff time to type tracings and call numbers has been eliminated. trinity has effected major time savings by eliminating the maintenance of proof slips. uthscsa has reduced the typical three-week period during which the unit card was reproduced by a commercial firm to one week. the production of the monthly acquisitions list which formerly took days is now accomplished in a matter of minutes. production of bibliographies is also now an easy task, whereas in the manual phase it was done only by manually searching and copying the card catalog.
uthscsa has also utilized the marcive system for the cataloging of local products-the videotapes produced by its television department, the citations of publications of its faculty and staff, and most recently the cataloging of computer assisted instruction programs, a project funded by the national library of medicine. the marcive system was used to produce book catalogs for each of the above.
costs
the costs of developing the marcive system were borne by trinity university library and the trinity university computer center. there was no outside funding for the development. no records were kept of the computer time used to test the marcive programs since all computer time was charged to a university-wide academic budget. during the various development periods of the marcive system there were never more than 1.5 full-time employees engaged in the project. table 1 is an estimate of the manpower and time spent in the various phases of the system. the library paid the salary of the librarian and the computer center paid the salary of the programmer. marcive evolved as the result of the cooperation of both departments within the university and would not have been possible if the administration had not supported the project.
table 1. estimated time and manpower involvement
period       phase                          months
1969-1970    development                    12
1971         hiatus                         6
1971-1972    conversion to dos & os         9
1972-1973    addition of marc capability    12
1973-        maintenance
librarian and programmer/analyst (values as printed): .5 .5 .25 .25 .25 .5 1.25 1.00 .25
production costs
charges to a marcive library are presently determined by the number of programs that the library uses and the method of input. a program's cost is based on cpu time, cards read, number of lines printed, and online data storage. commercial rates which reflect overhead and salaries are used. method of input can be keypunched cards, typewriter terminals, or magnetic tape. the computer costs for producing a set of catalog cards depend on the method of input, whether the marc tapes are searched, which edit listing is used, and the length and number of cards produced. an additional $0.02 per card is charged for the cost of card stock and the maintenance of the system. in 1974 this charge will be increased to $0.03 per card to cover the rising costs of paper. uthscsa has kept records of the cost of each computer job since 1971.
the average cost per card for 65,217 cards in finished form produced between april 1972 and august 1973 was $0.024 per card. considering that the average set had ten cards and a surcharge of $0.02 per card was added, the cost for producing a set of cards was $0.44. for the average 400 titles cataloged by uthscsa in each month, the marcive system charge would be $246.00. the same input can be used to produce the monthly acquisitions list at a cost of $0.015 per title, or $6.00 per list. the addition of a title to the marcive master disc file is an additional $0.03 per title. an average monthly bill for marcive computer services is $262.00. to this should be added the $40.00 prorated cost of the uthscsa keypunch which is also used for other projects, giving a total of $302.00, or $0.755 per title. a library assistant codes and keypunches the data as one half of the job assignment. an average of fifteen minutes per title is involved in the coding, keypunching, proofreading, and handling of data. at a salary rate of $4.00 per hour, this amounts to $1.00 per title. transportation costs for delivering data average $17.00 per month at $0.10 per mile.
trinity university retrieves 75 percent of its cataloging data from the marc tapes via the search program. the per title cost of this retrieval varies according to the number of items searched and reflects the fact that the more records a computer processes, the less expensive the process becomes. during september 1973 a search of 723 lc numbers resulted in a $0.025 computer charge per retrieved title. a search of 10 lc numbers cost $0.05 per retrieved item. trinity edits each retrieved title to make local changes and to add an accession number. the average cost for this using the replace and edit programs is $0.12. thus batch retrieval from the marc tapes combined with the modification of these records costs from $0.145 to $0.17.
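several of the per-title figures above follow directly from the quoted rates. the python sketch below reworks that arithmetic with exact decimal values; the variable names are illustrative only, and every number is taken from the text.

```python
from decimal import Decimal as D

# Cost of one set of catalog cards at uthscsa.
cards_per_set = 10
computer_cost_per_card = D("0.024")
surcharge_per_card = D("0.02")
cost_per_set = cards_per_set * (computer_cost_per_card + surcharge_per_card)

# Monthly machine cost spread over the average 400 titles.
titles_per_month = 400
acquisitions_list = titles_per_month * D("0.015")
monthly_computer_bill = D("262.00")
keypunch_share = D("40.00")
machine_cost_per_title = (monthly_computer_bill + keypunch_share) / titles_per_month

# Labor: fifteen minutes per title at $4.00 per hour.
labor_per_title = D("4.00") * 15 / 60

# Trinity: batch retrieval plus the replace and edit programs.
edit_per_title = D("0.12")
retrieval_low = D("0.025") + edit_per_title   # search of 723 lc numbers
retrieval_high = D("0.05") + edit_per_title   # search of 10 lc numbers

print("set of cards:", cost_per_set)
print("acquisitions list:", acquisitions_list)
print("machine cost per title:", machine_cost_per_title)
print("labor per title:", labor_per_title)
print("trinity retrieval and edit:", retrieval_low, "to", retrieval_high)
```

decimal arithmetic is used instead of floating point so that the results match the printed dollar amounts exactly.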
the marcive system and the oclc system
it is useful to compare the attributes and limitations of the marcive system with those of the ohio college library center (oclc) system. the two were developed fully independently of one another during the same period of time. 5 oclc is an on-line system with access to the cataloging input of its member libraries in addition to the marc tapes. this access is by author-title and title algorithms in addition to lc number. marcive is a batch system with access to the marc tapes by lc card number. even though it is a batch system, the libraries in the marcive system enjoy high priority which enables immediate usage of the computer, and jobs are executed throughout the day. the turnaround time for jobs other than search is one to two hours. because of the large size of the marc-con file, search programs are executed only at night, so the turnaround could be as much as twenty-four hours.
the importance of the access to the oclc member libraries' original cataloging in addition to the marc tapes has not been evaluated. oclc reports that 71 to 76 percent of new requests run against the marc file produce hits. an eventual success rate approaching 100 percent was predicted. 6 it would seem that the marc tapes would be sufficient for all but the larger or more esoteric libraries.
printed copy is generated from a marcive search of the marc tapes, thus allowing a library which does not accept lc cataloging unaltered to do its checking and revising off line. oclc displays the retrieved item on a crt screen. printed copy would be generated only if a special attachment at additional cost could be hooked to the terminal. if a library accepted lc data as displayed, only one retrieval would be necessary. if, however, the procedure was similar to that described by walsh college library, two retrievals would be made.
in this procedure, some manual transcribing from the oclc crt screen is made in order to check the entry in the library catalog. 7 however, oclc does not presently charge for a retrieval with no other transaction.
the costs for catalog cards produced by oclc and marcive are similar. kilgour has reported that oclc cost for formatting but not printing catalog cards is $0.0221 using commercial rates. a printing cost of $0.0033 was also given. 8 uthscsa reports a cost of $0.024 for the average catalog card, including formatting and printing. oclc charges a fixed price of $0.035 per catalog card. 9 marcive takes the commercial cost of producing the cards and adds an additional $0.02 per card for card stock and system maintenance, thus causing the average card to cost $0.044. in 1974 the surcharge will increase to $0.03 per card to cover the increased cost of card stock, thus causing the average card to cost $0.054.
it is in retrieval charges that a significant difference occurs between the marcive and oclc systems. it has been difficult for the authors to know the precise costs of oclc to participants as these vary according to the structure of the agreement and the location. we have chosen to compare the marcive costs to those of the iuc/oclc system which has recently negotiated a contract. these costs do differ in structure from the oclc member libraries. in the agreement between oclc and the interuniversity council (iuc) of the north texas area, the charge for service is based upon the calls made upon the oclc system for catalog card production where the cataloging data requested are found within the oclc data bank. such a call, referred to as a "hit," is charged at the rate of $0.875. to this must be added such items as leased line costs, terminal hardware and maintenance costs, local training and administrative costs, etc. the total of these costs is approximately $1.70 per hit.
10 no charge is made for retrieval from the oclc system unless catalog cards are requested, or for material input by the requesting library. if a library were to catalog 1,000 books using retrieved data from the oclc system, the charge would be $1,700 plus the cost of catalog cards and postage. if a library were to catalog an additional 200 titles not found on the oclc data bank and input these titles into oclc, the charge for the 1,200 titles would still be $1,700 plus the cost of catalog cards and postage.
in the marcive system there is a charge each time the marc tapes are accessed, regardless of whether catalog cards are produced. retrieval from the marc tapes via the marcive system costs between $0.025 and $0.05 per title, depending upon the quantity and percentage of matches in a batched search. costs vary according to the mode of input from the $80.00 monthly rental for a keypunch machine to the $120.00 rate for an on-line typewriter terminal. using the maximum $0.05 per title retrieval cost and the $120.00 terminal cost and adding another $0.12 for the replace and edit programs, the cost for 1,000 titles is $290.00 plus online data costs. using the minimum $0.025 per title and $80.00 for the keypunch machine and adding the $0.12, the cost for 1,000 titles is $225.00.
although all the factors which account for this charge variation are not known, the following appear to be significant. first, the costs inherent in an on-line system may be higher than in a batch procedure. second, the size of the oclc data base, which in 1972 had 181,209 member records in addition to the 229,807 marc records, may increase search time and thus costs. third, overhead costs of the oclc staff and their developmental projects in other areas may be a factor, although the substantial grant support oclc has received should have offset some of such costs.
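the 1,000-title comparison above can be reproduced with a short python sketch. the function names marcive_cost and oclc_cost are illustrative only; the rates are those quoted in the text, and the figures exclude catalog cards, postage, and online data costs, as the article does.

```python
from decimal import Decimal as D

EDIT_PER_TITLE = D("0.12")   # replace and edit programs
OCLC_PER_HIT = D("1.70")     # iuc/oclc estimate, including line and terminal costs


def marcive_cost(titles, retrieval_per_title, monthly_input_device):
    """Batch retrieval and editing plus one month's input equipment."""
    return titles * (retrieval_per_title + EDIT_PER_TITLE) + monthly_input_device


def oclc_cost(hits):
    """Only hits are charged; titles input by the library are not."""
    return hits * OCLC_PER_HIT


marcive_high = marcive_cost(1000, D("0.05"), D("120.00"))   # on-line terminal
marcive_low = marcive_cost(1000, D("0.025"), D("80.00"))    # keypunch machine
oclc_1000 = oclc_cost(1000)
# 1,200 titles cataloged, of which 200 were not found and were input locally.
oclc_1200 = oclc_cost(1200 - 200)

print("marcive:", marcive_low, "to", marcive_high)
print("oclc, 1,000 hits:", oclc_1000)
print("oclc, 1,200 titles with 1,000 hits:", oclc_1200)
```

the marcive input-device charge is a fixed monthly amount, so its share of the per-title cost falls as the monthly volume rises, whereas the oclc charge in this agreement scales only with the number of hits.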
marcive presently serves five libraries within a twenty-minute driving area that are responsible for picking up their own output. marcive has minimal overhead costs. the present pricing structure of the computer center has covered its overhead costs. all new development is separately funded. mrs. miller's salary has been totally absorbed by trinity university library even though she is the marcive liaison for the computer center in addition to her responsibility as head of trinity library's cataloging department. it is probable that if marcive were to expand the number of participating libraries, additional administrative support would be required. the present pricing structure used by marcive would then be reevaluated.
current developments
continued development of the marcive system is in progress. one of the marcive libraries is programming a marcive to marc conversion which will allow it to merge its marcive holdings with a purchased marc data base. programming is underway to create book labels. additional access indexes to the marc files are also being programmed. some procedures which are feasible now were not feasible when the original programs were written. for example, it would be logical for the update program to generate both an updated master and an isam file, instead of the latter requiring the build misam program. the catalog card program will be rewritten for output on two-up card stock using the ala print train. a union catalog and a joint circulation system for the participating libraries are also possibilities.
in january 1974 the marcive users' group was formed as a special interest group of the council of research and academic libraries. its purposes are: (1) to share information among users regarding problems, procedures, and needs pertaining to the marcive library system; (2) to establish guidelines, standards, and manuals for input, output, and reporting procedures of user libraries; (3) to maintain a sound financial policy for the marcive system; (4) to explore and develop new ideas for programs, techniques, and procedures to further enhance the marcive system; (5) to recommend changes and/or new programs for the marcive system; and (6) to assist libraries in the installation of any marcive programs. a major project of the users' group has been the writing of a detailed procedure manual for the marcive system.
conclusion
the marcive system is an excellent example of library cooperation in which the sharing of facilities, interest, and expertise has had great benefit for all concerned. in 1971 only trinity university and the university of texas health science center at san antonio used the marcive system. with the addition of accessibility to the marc tapes, san antonio college joined in 1972 and the university of texas at san antonio and st. mary's followed in 1973. one factor which made this cooperation possible was the formation in 1968 of a consortium of libraries in the san antonio area, the council of research and academic libraries (coral). its membership is comprised of the academic, public, and special libraries in the area. another factor has been the trinity university library and computer center staffs' enthusiasm and graciousness, which have been of great importance in fostering the desire to cooperate. too many projects fail because of lack of communication between participants. such communication can be as important a consideration as economic benefit in the decision to enter into a consortium.
references
1. ruby b. miller and robert a. houze, "new horizons in computer applications at trinity university library," texas library journal 48:227-29, 254-55 (nov. 1972).
2. "usa standard for a format for bibliographic information interchange on magnetic tape," journal of library automation 2:53-54 (march 1969).
3. recon working task force, "levels of machine readable records," journal of library automation 3:124-25 (june 1970).
4. frederick g. kilgour et al., "the shared cataloging system of the ohio college library center," journal of library automation 5:157-83 (sept. 1972).
5. kilgour et al., "shared cataloging system," p.159, 164.
6. frederick g. kilgour, "evolving, computerizing, personalizing," american libraries 3:146 (feb. 1972).
7. patricia lyons and margaret northcraft, "oclc: a user's viewpoint," catholic library world 44:265-68 (dec. 1972).
8. kilgour, "evolving, computerizing, personalizing," p.146.
9. ohio college library center, newsletter 53:2 (sept. 29, 1972).
10. interuniversity council of the north texas area, memorandum (aug. 24, 1973).

john carlo bertot
public access technologies in public libraries: effects and implications
public libraries were early adopters of internet-based technologies and have provided public access to the internet and computers since the early 1990s. the landscape of public-access internet and computing was substantially different in the 1990s as the world wide web was only in its initial development. at that time, public libraries essentially experimented with public-access internet and computer services, largely absorbing this service into existing service and resource provision without substantial consideration of the management, facilities, staffing, and other implications of public-access technology (pat) services and resources. this article explores the implications for public libraries of the provision of pat and seeks to look further to review issues and practices associated with pat provision resources.
while much research focuses on the amount of public access that public libraries provide, little offers a view of the effect of public access on libraries. this article provides insights into some of the costs, issues, and challenges associated with public access and concludes with recommendations that require continued exploration.
public libraries were early adopters of internet-based technologies and have provided public access to the internet and computers since the early 1990s.1 in 1994, 20.9 percent of public libraries were connected to the internet, and 12.7 percent offered public-access computers. by 1998, internet connectivity in public libraries grew to 83.6 percent, and 73.3 percent of public libraries provided public internet access.2 the landscape of public-access internet and computing was substantially different in the 1990s, as the world wide web was only in its initial development. at that time, public libraries essentially experimented with public-access internet and computer services, largely absorbing this service into existing service and resource provision without substantial consideration of the management, facilities, staffing, and other implications of public-access technology (pat) services and resources.3
using case studies conducted at thirty-five public libraries in five geographically dispersed and demographically diverse states, this article explores the implications for public libraries of the provision of pat. the researcher also conducted interviews with state library agency staff prior to visiting libraries in each state.
the goals of this article are to
- explore the level of support pat requires within public libraries;
- explore the implications of pat on public libraries, including management, building planning, staffing, and other support issues;
- explore current pat support practices;
- identify issues and challenges public libraries face in maintaining and supporting their pat infrastructure; and
- identify factors that contribute to successful pat practices.
this article seeks to look beyond the provision of pat by public libraries and review issues and practices associated with pat-provision resources. while much research focuses on the amount of public access that public libraries provide, little offers a view of the effect of public access on libraries. this article provides insights into some of the costs, issues, and challenges associated with public access, and it concludes with recommendations that require continued exploration.
literature review
quickly over time, public libraries increased their public-access provision substantially (see figures 1 and 2). connectivity grew from 20.9 percent in 1994 to nearly 100 percent in 2006.4 moreover, nearly all libraries that connected to the internet offered public-access internet services. simultaneously, the average number of public-access computers grew from 1.9 per public library in 1996 to 12 per public library in 2007.5 accompanying and in support of the continual growth of basic connectivity and computing infrastructure was a demand for broadband connectivity. indeed, since 1994, connectivity has progressed from dial-up phone lines to leased lines and other forms of high-speed connectivity.
the extent of the growth in public-access services within public libraries is profound and substantive, leading to the development of new internet-based service roles for public libraries.6 and public access to the internet through public libraries provides a number of community benefits to different populations within served communities.7 overlaid onto the public-access infrastructure is an increasingly complex service mix that now includes access to digital content (e.g., databases and digital libraries), integrated library systems (ilss), voice over internet protocol (voip), digital reference, and a host of other services and resources—some for public access, others for back-office library operations. and patrons do use these services in increasing amounts—both in the library and in everyday life.8 in fact, 82.5 percent of public libraries report that they do not have an adequate number of public-access computers some or all of the time and have resorted to time limits and wireless access to extend public-access services.9
john carlo bertot (jbertot@umd.edu) is professor and director of the center for library innovation in the college of information studies at the university of maryland, college park.
information technology and libraries | june 2009
by 2007, as connectivity and public-access computer infrastructure grew, so ensued the need to provide a range of publicly available services and resources:
- 87.7 percent of public libraries provide access to licensed databases
- 83.4 percent of public libraries offer technology training
- 74.1 percent of public libraries provide e-government services (e.g., locating government information and helping patrons complete online applications)
- 62.5 percent of public libraries provide digital reference services
- 51.8 percent of public libraries offer access to e-books10
the list is not exhaustive, but illustrative, since libraries do offer other services such as access to homework resources, video content, audio
content, and digitized collections. as public libraries expanded these services, management realized that they needed to plan and evaluate technology-based services. over the years, a range of technology management, planning, and evaluation resources emerged to help public libraries cope with their technology-based resources—those both publicly available and for administrative operations.11 but increasingly, public libraries report the strain that pat services promulgate. this centers on four key areas:
- maintenance and management. the necessary maintenance and management requirements of pat place an additional burden on existing staff, many of whom do not possess technology expertise to troubleshoot, fix, and support internet-based services and resources that patrons access.
- staff. libraries consistently cite staff expertise and availability as a barrier to the addition, support, and management of pat. indeed, as described in previous sections, some libraries have experienced a decline in library staff.
- finances. there is evidence of stagnant funding for libraries at the local level as well as a shift in expenditures from staff and collections to operational costs such as utilities and maintenance.
- buildings. the buildings are inadequate in terms of space and infrastructure (e.g., wiring and cabling) to support additional public access.12
this article explores these four areas through a site-visit method in an effort to go beyond a quantitative assessment of pat within the public library community. though related in terms of topic area and author, this study was conducted separately from the public library internet surveys conducted since 1994 and offers insights into the provision of pat services and resources that a national survey cannot explore in such depth.
figure 1. public-access internet connectivity from 1994 through 2008
figure 2.
public-access internet workstations from 1996 through 2008
method
the researcher visited thirty-five public libraries in five geographically and demographically diverse states between october 2007 and may 2008. the states were in the west, southwest, southeast, and mid-atlantic regions. the libraries visited included urban, suburban, rural, and native american public libraries that served populations ranging from a few hundred to more than half a million. the communities that the libraries served varied in terms of poverty, race, income, age, employment, and education demographics. prior to visiting the public library sites, the researcher conducted interviews with state library agency staff to better understand the public library context within each state and to explore overall pat issues, strategies, and other factors within the state.
the following research questions guided the site visits:
- what are the community and library contexts in which the library provides pat?
- what are the pat services and resources that the library makes available to its community?
- what pat services and resources does the library desire to provide to its community?
- what is the relationship between provided and desired pat and the effect on the library (e.g., staff, finances, the building, and management)?
- what are the perceived benefits that the library and its community gain through pat in the library?
- what are the issues and barriers that the library encounters in providing pat services and resources?
- how does the library manage and maintain its pat?
the researcher visited each library for four to six hours. during that time, he interviewed the library director and/or branch manager and technology support staff (either a specific library position, designated library employee, or city or county it staff person), toured the library facility, and conducted a brief technology inventory.
at some libraries, the researcher was able to meet with community partners that in some way collaborated with the library to provide pat services and resources (e.g., educational institutions that collaborated with libraries to provide access to broadband or volunteers who conducted technology training sessions). interviews were recorded and transcribed, and the technology inventories were entered into a microsoft excel spreadsheet for analysis. the transcripts were coded using thematic content analytic schemes to allow for the identification of key issues regarding pat areas.13 this approach enabled the researcher to use an iterative site-visit strategy that used findings from previous site visits to inform subsequent visits.

to ensure valid and reliable data, the researcher used a three-stage strategy:

1. site-visit reports were completed and sent to the libraries for review. corrections from libraries were incorporated into a final site-visit report.
2. a final state-based site-visit report was compiled for distribution to state library agency staff, and their corrections were also incorporated. this provided a state-level reliability and validity check.
3. a summary of key findings was distributed to six experts in the public library technology environment, three of whom were public library technology managers and three of whom were technology consultants who worked with public libraries.

in combination, this approach provided three levels of data quality checks, thus providing both internal (library and state) and external (technology expert) support for the findings.

the findings in this article are limited to the libraries visited and interviews conducted with public librarians and state library agency staff. however, themes emerged early during the site-visit process and were reinforced through subsequent interviews and visits across the states and libraries visited.
in addition, the use of external reviewers of the findings lends additional, but limited, support to the findings.

findings

this section presents the results of the site visits and interviews with state library agency staff and public librarians. the article presents the findings by key areas surrounding pat in public libraries.

the public-access context

public libraries have a range of pat installed in their libraries for patron use. these technologies include public-access computers, wireless (wifi) access, ilss, online databases, digital reference, downloadable audio and video, and others. many of these services and resources are also available to patrons from outside library buildings, thus extending the reach (and support issues) of the library beyond the library’s walls. in addition, when libraries do not provide direct access to resources and services, they serve as access points to those services, such as online gaming and social networking.

while libraries can and do deploy a number of technologies for public use, it is possible to group these technologies broadly into two overlapping categories:

- hardware. library pat hardware can include public-access computers, public-access computing registration (i.e., reservation) systems, self-checkout stations, printers, faxes, laptops, and a range of other devices and systems. some of these technologies may have additional devices, such as those required for persons with disabilities. within the hardware grouping are networking technologies that include a range of hardware and software to enable a range of library networks to run (e.g., routers, hubs, switches, telecommunications lines, and networking software).
- software.
software can include device operating system software (e.g., microsoft windows, mac os, and linux), device application software (e.g., microsoft office, openoffice, graphics software, audio software, e-book readers, assistive software, and others), and functional software (e.g., web browsers, online databases, and digital reference).

in short, public libraries make use of a range of technologies that the public uses in some way. each type of technology requires skills, management, implementation, and maintenance, all of which are discussed later. in the building, all of these products and services come together at the library’s public-access computers, or at the patron’s mobile device if wifi is available. moreover, patrons increasingly want to use their portable devices (e.g., usb drives, ipods, and others) with library technology. this places pressure on libraries not just to offer public-access computers, but also to support a range of technologies and services.

thus the environment in which libraries offer pat is complex and requires substantial technical expertise, support, and maintenance in key areas of applications, computers, and networking. moreover, as discussed below, patrons are increasingly demanding market-based approaches to pat. these demands, which are largely about single-point access to a range of information services and resources, are often at odds with library technology that is based on stove-piped approaches (e.g., ils, e-books, and licensed resources) and that does not necessarily lend itself to seamless integration.

external pressures on pats

the advent and increased use by the public of google, amazon, itunes, youtube, myspace, second life, and other networked services affects public libraries in a number of ways. this article discusses these services and resources from the perspective of an information marketplace of which the public library is one entrant.
interviewed librarians overwhelmingly indicated that users now expect library services to resemble those in the marketplace. users expect the look and feel, integration, service capabilities, interactivity, and personalization and customization that they experience while engaging in social networking, online searching, online purchasing, or other online activities. and within the library building, patrons expect the services to integrate at the public-access computer entry point—not distributed throughout the library in a range of locations, workstations, or devices. said differently, they expect to have a “mylibrary.com” experience that allows for seamless integration across the library’s services but also facilitates the use of personal technologies (e.g., ipods, mp3 players, and usb devices). thus users expect the library’s services to resemble those services offered by a range of information service providers. importantly, however, librarians indicated that library systems on which their services and resources reside by and large do not integrate seamlessly—nor were they designed to do so. public-access computers are gateways to the internet; the ils exists for patrons to search for and locate library holdings; and online databases, e-books, audiobooks, etc., are extensions of the library’s holdings but are not physical items under a library’s control and thus subject to a vendor’s information and business models. while library vendors and the library community are working to develop more integrated products that lead users to the information they seek, the technology is under development. there are three significant issues that libraries face because of market pressures: (1) the pressures all come together at a single point—the public-access computer; (2) users want a customized experience while using technology designed for the general public, not the individual user; and (3) users have choices in the information marketplace. 
one participant indicated, “if the library cannot match what users have access to on the outside, users will and do move on.”

managing and maintaining public access

managing the public-access computer environment for public libraries is a growing challenge. there are a number of management areas with which public librarians contend:

- public-access computers: the computers and laptops (if applicable) themselves, which can involve anything from keyboards and mice to troubleshooting a host of computer problems (it is important to note that these computers often vary in age and composition, come from a range of vendors, run different operating systems, and often have different application software versions).
- peripheral management: the printers, faxes, scanners, and other equipment that are part of the library’s overall public access infrastructure.
- public-access management software or systems: these may include online or in-building computer-based reservations (which encompass specialized reservations such as teen machines, gaming computers, computers for seniors, and so on), time management (set to the library’s decided-upon time allotment), filtering, security, logins, virtual machines, etc.
- wireless access: this may include logins and configurations for patrons to gain access to the library’s wireless network.
- bandwidth management: this may include the need to allocate bandwidth differently as needs increase and decrease in a typical day.
- training and patron assistance: for a vast array of services such as databases, online searching, e-government (e.g., completing government forms and seeking government information), and others. training can take place formally through classes, but also through point-of-use tutorials requested by patrons.
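as a rough illustration of the reservation and time-management items in the list above, the sketch below models a single workstation with a sign-up queue, a fixed time allotment, and a warning period before the session ends. it is not drawn from any product named in this article; the class names, the minute-based clock, and the default values (a 60-minute allotment and a 5-minute warning) are assumptions made only for the example.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Session:
    """One patron's turn at a workstation (hypothetical model)."""
    patron_id: str
    started_at: int        # minutes since the library opened (assumed clock)
    allotment_min: int     # the library's decided-upon time allotment
    warn_before_min: int   # how early the patron is warned

    def remaining(self, now: int) -> int:
        # Minutes left in the session, never negative.
        return max(0, self.allotment_min - (now - self.started_at))

    def status(self, now: int) -> str:
        left = self.remaining(now)
        if left == 0:
            return "expired"
        if left <= self.warn_before_min:
            return "warning"
        return "active"


class Workstation:
    """A single public-access computer with a first-come, first-served queue."""

    def __init__(self, allotment_min: int = 60, warn_before_min: int = 5):
        self.allotment_min = allotment_min
        self.warn_before_min = warn_before_min
        self.queue = deque()   # patron ids waiting for a turn
        self.current = None    # the Session in progress, if any

    def sign_up(self, patron_id: str) -> None:
        self.queue.append(patron_id)

    def tick(self, now: int) -> str:
        # End an expired session, then seat the next patron in line.
        if self.current is not None and self.current.status(now) == "expired":
            self.current = None
        if self.current is None and self.queue:
            self.current = Session(
                self.queue.popleft(), now, self.allotment_min, self.warn_before_min
            )
        return self.current.status(now) if self.current else "idle"


if __name__ == "__main__":
    station = Workstation(allotment_min=30, warn_before_min=5)
    station.sign_up("patron-a")
    station.sign_up("patron-b")
    for minute in (0, 25, 30, 60):
        print(minute, station.tick(minute))
```

a fixed allotment keeps the logic simple and spares staff from manual sign-ups, but it also means a session ends whether or not the patron has finished; a real system would need to decide how to handle extensions when no one is waiting.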
to some extent, librarians commented that, while they do have issues with the public-access computers themselves from time to time, the real challenges that they face concern the actual management of the public-access environment: sign-ups, time limits, cost recovery for print jobs, helping patrons, and so on. one librarian commented that “the computers themselves are pretty stable. we don’t really have too many issues with them per se. it’s everything that goes into, out from, or around the computer that creates issues for us.”

as a result of the management challenges, several libraries have adopted turn-key solutions, such as public-access management systems (e.g., comprise technology’s smart access manager [http://www.comprisetechnologies.com/product_29.html]) and all-encompassing public computing management systems that include networking and desktops (e.g., userful’s discoverstations [http://userful.com/libraries/]). these systems allow for an all-in-one sign-up, print cost recovery, filtering (if desired), and security approach. also, the discoverstations are a linux-based, all-encompassing public-access management environment. a clear advantage to the discoverstation approach is that the discoverstation is connected to the internet and is accessible by userful staff remotely to update software and perform other maintenance functions. they also use open-source operating and application software.

while these solutions do provide efficiencies, they also can create limitations. for example, the discoverstations are a thin-client system and are dependent on the server for graphics and memory, thus limiting their ability to access gaming and social-networking sites. the smart access manager, and similar programs, can rely on smart cards or other technology that users must purchase to print. another limitation is that the time limits are fixed, and, while users get warnings as time runs out, the session can end abruptly.
these approaches are by and large adopted by libraries to ease the management associated with public-access computers and let staff concentrate on other duties and responsibilities. one librarian indicated that “until we had our management system, we would spend most of the day signing people up for the computers, or asking them to finish their work for the next person in line.”

planning for pat services and resources

public libraries face a number of challenges when planning for pat services and resources. this is primarily because pat planning involves more than computers. any planning needs to encompass

- building needs, requirements, limitations, and design;
- technology assessment that considers the library’s existing technology, technology potential, current practices, and future trends;
- planning for and supporting multiple technology platforms;
- telecommunications and networking;
- services and resources available in the marketplace, both those specifically for libraries and those more broadly available to consumers and used by patrons;
- specific needs and requirements of technology (e.g., memory, disk space, training, other);
- requirements of other it groups with which the library may need to integrate, for example, city or county technology mandates;
- support needs, including the need to enter into maintenance agreements for computer, network, and other equipment and software;
- staff capabilities, such as current staff skill sets and their ability to handle the technologies under review or purchased; and
- policy, such as requirements to filter because of local, state, or federal mandates.

the above list may not be exhaustive; rather, it is based on the main items that librarians identified during the site visits, and it serves to indicate the challenges that those planning library it initiatives face.
the endless upgrade and planning

one librarian likened the pat environment to “being a gerbil on a treadmill. you go round and round and never really arrive,” a reference to the fact that public libraries are in a perpetual cycle of planning and implementing various pat services and resources. either hardware needs to be updated or replaced, or there is a software update that needs to be installed, or libraries are looking to the next technology coming down the road. in short, the technology planning-to-implementation cycle is perpetual.

the upgrade and replacement cycle is further exacerbated by the funding situation in which most public libraries find themselves. increasingly, public library local and state funding, which combined can account for more than 90 percent of library funding, is flat or declining.14 the most recent series of public library internet studies indicates an increase in reliance by public libraries on fees and fines, fundraising, private foundations, and grant funding to finance collections and technology within libraries.15 this places key aspects of library operations in the realm of unreliable and one-time funding sources, thus making it difficult for libraries to develop multiyear plans for pat.

multiple support models

to cope with pat management and maintenance issues, public libraries are developing various support strategies. the site visits found a number of technology-support approaches in effect, ranging from no it support to highly centralized statewide approaches. the following list describes the technology-support models encountered during the site visits:

1. no technology support. libraries in this group have neither technology-support staff nor any type of organized technology-support mechanism with existing library staff. nor do they have access to external support providers such as county or city it staff.
libraries in this group might rely on volunteers or engage in ad hoc maintenance, but by and large have no formal approach to supporting or maintaining their technology.

2. internal library support without technology staff. in this model, the library provides its own technology support but does not necessarily have dedicated technology staff. rather, the library has designated one or more staff members to serve as the it person. usually this person has an interest in technology but has other primary responsibilities within the library. there may be some structure to the support, such as updating software (e.g., windows patches) once a week at a certain time, but it may be more ad hoc in approach. also, the library may try to provide its designated it person(s) with training to develop his or her skills further over time.

3. internal library support with technology staff. in this model, the library has at least one dedicated it staff person (part- or full-time) who is responsible for maintaining and planning the library’s pat environment. the person may also have responsibilities for network maintenance and a range of technology-based services and resources. at the higher end of this approach are libraries with multiple it staff with differing responsibilities, such as networking, telecommunications, public-access computers, the ils, etc. libraries at this end of the spectrum tend to have a high degree of technology sophistication but may face other challenges (e.g., staffing shortages in key areas).

4. library consortia. over the years, public libraries have developed consortia for a range of services: shared ilss, resource sharing, resource licensing, and more. as public-library needs evolve, so too do the roles of library consortia. consortia increasingly provide training and technology-support services, and may be funded through membership fees, state aid, or other sources.

5. technology partners.
while some libraries may rely on consortia for their technology support, others are seeking libraries that have more technology expertise, infrastructure, and abilities with which to partner. this can be a fee-for-service arrangement that may involve sharing an ils, a maintenance agreement for network and public-access computer support, and a range of services. these arrangements allow the partner libraries to have some input into the technology planning and implementation processes without incurring the full expense of testing the technologies, having to implement them first, or hiring necessary staff (e.g., to manage the ils). the disadvantage to this model is that the smaller partner libraries are dependent on the technology decisions that the primary partner makes, including upgrade cycles, technology choices, migration time frames, etc.

6. city, county, or other agency it support. as city or county government agencies, some libraries receive technology support from the city or county it department (or in some cases the education department). this support ranges from a full slate of services and support available to the library to support only for the staff network and computers. even at the higher end of the support spectrum, librarians gave mixed reviews for the support received from it agencies. this was primarily because of competing philosophies regarding the pat environment, with public librarians wanting an open-access policy to allow users access to a range of information services and resources and it agency staff wanting to essentially lock down the public-access environment and thus severely limit the functionality of the public-access computers and network services (e.g., wireless). other limitations might include prescribed pat, specified vendors, and bidding requirements.

7. state library support.
one state library visited provides a high degree of service through its statewide approach to supporting public-access computing in the state’s public libraries. the state library has it staff in five locations throughout the state to provide support on a regional level but also has additional staff in the capital. these staff offer training, in-house technical support, and phone support, and can remotely access the public-access computers in public libraries to troubleshoot, update, and perform other functions. moreover, this state built a statewide network through a statewide application to the federal e-rate program, thus providing broadband to all libraries. this model extends the availability of qualified technical support staff to all public libraries in the state, by phone as well as in person if need be. as a result, this enables public libraries to concentrate on service delivery to patrons.

it is important to note that there are combinations of the above models in public libraries. for example, some libraries support their public-access networks and technology while the county or city it department supports the staff network and technology. it is clear, however, that there are a number of models for technology support in public libraries, and likely more than are presented in this article. the key issue is that public libraries are engaging in a broad spectrum of strategies to support, maintain, and manage their pat infrastructure. also of significance is that there are public libraries that provide pat services and resources yet have no technology-support services. these libraries tend to serve populations of fewer than ten thousand, are rural, have fewer than five full-time equivalents (ftes), and are unlikely to be staffed by professional librarians.

staff needs and pressures

the study found a number of issues related to the effect of pat on library staff.
this section of the findings discusses the primary factors affecting library staff as they work in the public-access context.

multiple skills needed

not only is the pace of technological change increasing, but the change requires an ever-increasing array of skills because of the complexity of applications, technologies, and services. an example of such complexity is the library opac or ils. visited libraries indicated that such systems are becoming so complex and technologically sophisticated that there is a need for a full-time staff person to run and maintain the library ils.

given the range of hardware, software, and networking infrastructure, as well as planning and pat management requirements, public librarians need a number of skills to successfully implement and maintain their pat environments. moreover, the skill needs depend on the librarian’s position, for example, an actual it staff person versus a reference librarian who does double duty by serving as the library’s it person. the skills required fall into technology, information literacy, service and facilities planning, management, and leadership and advocacy areas:

- technology
  - general computer troubleshooting
  - basic maintenance, such as mouse and keyboard cleaning
  - basic computer repair, such as memory replacement, floppy drive replacement, disk defragmentation, etc.
  - basic networking, such as troubleshooting an “internet” issue versus a computer problem
  - telecommunications so as to understand the design and maintenance of broadband networks
  - integrated library systems
  - web design
- information literacy
  - searching and using internet-based resources
  - searching and using library licensed resources
  - training patrons on the use of the public-access computers, general internet resources, and library resources
  - designing curriculum for various patron training courses
- services and facilities planning
  - technology plan development and implementation (including budgeting)
  - telecommunications planning (including e-rate plan and application development)
  - building design so as to accommodate the requirements of public access technologies
- management
  - license and contract negotiation for licensed resources, various public-access software and licenses, and maintenance agreements (service and repair agreements)
  - integration of pat into library operations
  - troubleshooting guidelines and process
  - policy development, such as acceptable use, filtering, filtering removal requests by patrons, etc.
- leadership and advocacy
  - grant writing and partnership development so as to fund pat services and resources and extend out into the community that the library serves
  - advocacy so as to be able to demonstrate the value of pat in the library as a community good
  - leadership so as to build a community approach to public access with the library as one of the foundational institutions

these items provide a broad cross section of the skills that public library staff may need to offer a robust pat environment. in the case of smaller, rural libraries, these requirements in general fall to the library director, along with all other duties of running the public library.
in libraries that have separate technology, collections development, and other specialized staff, the skills and expertise may be dispersed throughout various areas in the library.

training

public librarians receive a range of technology training, including none at all. in some cases, this might be a basic workshop on some aspect of technology at a state library association annual meeting or a regional workshop hosted by the library’s consortium. it could be an online course through webjunction (http://www.webjunction.org/). it could be a one-on-one session with a vendor representative or colleague. or it could be a formal, multiday class regarding the latest release of an ils. if available, public librarians have access to technology training that can take many forms, has a wide array of content (basic to expert), and can enhance staff knowledge about it with varying degrees of success.

an issue raised by librarians was that having access to training and being able to take advantage of training are two separate things. regardless of the training delivery medium, librarians indicated that they were not always able to get release time to attend a training session. this was particularly the case for small, rural libraries that had fewer than five ftes spread out over several part-time individuals. for these staff to take advantage of training would require a substitute to cover public-service hours or shutting down the library.

funding information technology

as one might expect, there was a range of technology budgets in the public libraries visited or interviewed, from no technology budget to a substantial technology budget, and many points in between. some libraries had a dedicated it budget line item; others had only an operating budget out of which they might carve some funds for technology.
libraries with dedicated it budgets by and large had at least one it staff person; libraries with no it budget largely relied on a staff person responsible for other library functions to manage their technology. in the smallest libraries, the library director served as the technology specialist in addition to being the general library operation manager. some libraries have established foundations through which they can raise funds for technology, among other library needs. many seek grants and thus devote substantial effort to seeking grant initiatives and writing grant proposals. some libraries held fundraisers and worked with their library friends groups to generate funds. other libraries engage in all of the above efforts to provide for their pat infrastructure, services, and resources. in short, there are several budgetary approaches public libraries use to support their pat environment. critical to note is that a number of libraries are increasingly relying on nonrecurring funds to support pats, a fact corroborated by the 2007 and 2008 public library internet surveys.16 the buildings when one visits public libraries, one is immediately struck by the diversity in design, functionality, and architecture of the buildings. public libraries often reflect the communities that they serve not only in the collection and service, but also in the facilities. this diversity serves the public library community well because it allows for a custom approach to libraries and their community. the building design, however, can also be a source of substantial challenge for public libraries. the increased integration of technology into library service places a range of stresses on buildings—physical space for workstations and other equipment and specialized furniture, power, server rooms, and cabling, for example. 
along with the library-based technology requirements come those of patrons, particularly the need for power so that patrons may plug in their laptops or other devices. also important to note is that the building limitations also extend to staff and their access to computing and networked technologies.

a number of librarians commented that they are “simply at capacity.” one librarian summed it up by stating that “there’s no more room at the inn. unless we start removing parts of our collection, we don’t have any more room for workstations.” another said that, “while we do have the space to add more computers, we don’t have enough power or outlets to support them. and, with our building, it’s not a simple thing to add.” in short, many libraries are reaching, or have reached, a saturation point as to just how much pat they can support.

discussion and implications

over time, pat services have become essential services that public libraries provide their communities. with nearly all public libraries connected to the internet and offering public-access computers, the high percentage of libraries that offer internet-based services and resources, the overall usage of these resources by the public,17 and 73 percent of public libraries reporting that they are the only free provider of pat in their communities, it is clear that the provision of pat services is a key and critical service role that public libraries offer.18 it is also clear, however, that the extent to which public libraries can continue to absorb, update, and expand their pat depends on the resolution of a number of staffing, financial, maintenance and management, and building barriers. in a time of constrained budgets, it is unlikely that libraries will receive increased operational funding.
indeed, reports of library funding cuts are increasing in the current economic downturn, which affects the ability of libraries to increase, or significantly update, staff, particularly in the areas of technology, licensing additional resources, procuring additional and new computers, and purchasing and offering expanded services such as digital photography, gaming, or social networking.19 moreover, the same financial constraints can affect the ability of libraries to raise capital funds for building improvements and new construction.

funding also has an effect on the training that public libraries can offer or develop for their staff. and training is becoming increasingly important to the success of pat services and resources in public libraries, but not just training regarding the latest technologies. rather, there is a need for training that provides instruction on the relationship between the level of pat services and resources a library can or desires to provide and advocacy; broadband, computing, and other needs; technology planning and management; collaboration and partnering; and leadership. the public library pat environment is complex, encompasses a number of technologies, and has ties to many community services and resources. training programs need to reflect this complexity.

the continued provision of pat services in public libraries is increasingly burdensome on the public library community, and the pressures to expand their pat services and resources continue to grow, particularly as libraries report their status as the “sole provider” of free pat in their communities.
the libraries visited that were successful in terms of pat services and resources had staff who could

- understand pat (both in terms of functionality and potential);
- think creatively across the technology and library service spectrum;
- integrate online content, pat, and library services;
- articulate the value of pat as an essential community need and public library service;
- articulate the role of the perception of the library by its community as a critical bridge to online content;
- demonstrate leadership within the community and library;
- form partnerships and extend pat services and resources into the community; and
- raise funds and develop other support mechanisms to enhance pat services and resources in the library and throughout the community.

in short, successful pat in libraries was being redefined in the context of communitywide pat service and resource provision. this approach not only can lead to a more robust community pat infrastructure, but it also lessens the library’s burden of pat service and resource provision.

but equally important to note is that the extent to which all public libraries can engage in these activities on their own is unclear. indeed, several libraries visited were struggling to maintain basic pat service levels and indicated that increasing pat services came at the expense of other library services. “we’re trying to meet demand,” one librarian said, “but we have too few computers, too slow a connection, and staff don’t always know what to do when things go wrong or someone comes in talking about the latest technology or website.” for some libraries, therefore, quality pat services that meet community needs are simply out of reach.
thus another implication and finding of the study is the need for libraries to explore other models of support for their pat environments: for example, using the services of a regional cooperative, if available; if none is available, libraries could form their own cooperative for resource sharing, technology support, and other aspects of pat service provision. the same approach could be taken within a city or county to enhance technology support throughout a region. another approach would be to outsource a library’s pat support and maintenance to a nearby library with support staff in a fee-for-service approach. there are a number of approaches that libraries could take to support their pat infrastructure. a key point is that libraries need to consider pat service provision in a broader community, regional, or state context, and the study found some libraries doing so. the need to equip staff with the skills required to truly support pat was a recurring theme throughout the site visits. approaches and access to training varied. for example, some state libraries provided, either directly or through the hiring of consultants and instructors, a number of technology-related courses taught in regional locations. an example of this approach is california’s infopeople project (http://www.infopeople.org/). some state libraries subscribed to webjunction (http://www.webjunction.org/), which provides access to online instructional content. online manuals provided by compumentor through a grant funded by the bill and melinda gates foundation aimed at helping rural libraries support their pat (www.maintainitproject.org) are another resource. beyond technology skills training, however, is the need for technology planning, effective communication, leadership, value demonstration, and advocacy. the extent to which leadership, advocacy, and library marketing, for example, can be taught remains a question.
information technology and libraries | june 2009
all of these issues take place against the backdrop of an economic downturn and budgetary constraints. increased operating costs created through inflation and higher energy costs place substantial pressures on public libraries simply to maintain current levels of service, much less engage in the additional levels of service that the pat environment brings. indeed, as the 2008 public library funding and technology access study demonstrated, public libraries are increasingly funding their technology-based services through non-recurring funds such as fines and fundraising activities.20 thus, the ability of public libraries to provide robust pat services and resources is increasingly limited unless such service provision comes at the expense of other library services. alone, the financial pressures place a high burden on public libraries. combined with the building, staffing, skills, and other constraints reported by public libraries, however, the emerging picture for library pat services and resources is one of significant challenge.

three key areas for additional exploration

the findings from the study point to the need for additional research and exploration of three key service areas and issues related to pat support and services:
1. develop a better understanding of success in the pat environment. this study and the 2006 study by bertot et al. point to what is required for libraries to be successful in a networked environment.21 in fact, the 2007 public libraries and the internet report contained a section entitled “the successfully networked public library,” which offered a range of checklists for public libraries (and others) to consider as they planned and implemented their networked services.22 this study identified additional success factors and considerations focused specifically on the public access technology environment.
together, these efforts point to the need to better understand and articulate the critical success factors necessary for public libraries to plan, implement, and update their pat given current service contexts. this is particularly necessary in the context of meeting user expectations and needs regarding networked technologies and services.
2. further identify technology-support models. this study uncovered a number of different technology-support models implemented by public libraries. undoubtedly there are additional models that require identification. but, more importantly, there is a need to further explore how each technology-support model assists libraries, under what circumstances, and in what ways. some models may be more or less appropriate on the basis of the service context of the library, and that is not clearly understood at this time.
3. levels of service capabilities. an underlying theme throughout this research, and one that is increasingly supported by the public libraries and the internet studies, is that the pat service context is essentially a continuum from low service and capability to high service and capability. there are a number of factors contributing to where libraries may lie on the success continuum: funding, management, leadership, attitude, skills, community support, and innovation, to name a few. this continuum requires additional research, and the research implications could be profound. emerging data indicate that there are public libraries that will be unable to continue to evolve and meet the increased demands of the networked environment, both in terms of staff and infrastructure. public libraries will have to make choices regarding the provision of pat services and resources in light of their ability to provide high-quality services (as defined by their service communities). for better or worse, the technology environment continually evolves and requires new technologies, management, and support.
that is, and will continue to be, the nature of public access to the internet. though there are likely other issues worthy of exploration, these three are critical to further our understanding of the pat environment and public library roles and issues associated with the provision of public access.

conclusion

the pat environment in which public libraries operate is increasingly complex and continues to grow in funding, maintenance and management, staffing, and building demands. public libraries have navigated this environment successfully for more than fifteen years; however, stresses are now evident. libraries rose quickly to the challenge of providing public-access services to the communities that they serve. the challenges libraries face are not necessarily insurmountable, and there is a range of tools designed to help public libraries plan and manage their public-access services. these tools, however, place the burden of public access, or assume that the burden of public access is placed, on the public library. given increased operating costs because of inflation, the continual need to innovate and upgrade technologies, staff technology skills requirements, and other factors discussed in this article, libraries may not be in a position to shoulder the burden of public access alone. thus there is a need to reconsider the extent to which pat provision is the sole responsibility of the library; perhaps there is a need to integrate and expand public access throughout a community. such an approach can benefit a community through an integrated and broader access strategy, and it can also relieve the pressure on the public library as the sole provider of public access.

acknowledgement

this research was made possible in part through the support of the maintainit project (http://www.maintainitproject.org/), an effort of the nonprofit techsoup web resource (http://www.techsoup.org/).
references

1. charles r. mcclure, john carlo bertot, and douglas l. zweizig, public libraries and the internet: study results, policy issues, and recommendations (washington, d.c.: national commission on libraries and information science, 1994).
2. john carlo bertot and charles r. mcclure, moving toward more effective public internet access: the 1998 national survey of public library outlet internet connectivity (washington, d.c.: national commission on libraries and information science, 1998), http://www.liicenter.org/reports/1998_plinternet_study.pdf (accessed apr. 22, 2009).
3. charles r. mcclure, john carlo bertot, and john c. beachboard, internet costs and cost models for public libraries (washington, d.c.: national commission on libraries and information science, 1995).
4. charles r. mcclure, john carlo bertot, and douglas l. zweizig, public libraries and the internet: study results, policy issues, and recommendations (washington, d.c.: national commission on libraries and information science, 1994); john carlo bertot, charles r. mcclure, paul t. jaeger, and joe ryan, public libraries and the internet 2006: study results and findings (tallahassee, fla.: information institute, 2006), http://www.ii.fsu.edu/projectfiles/plinternet/2006/2006_plinternet.pdf (accessed mar. 5, 2009).
5. john carlo bertot, charles r. mcclure, carla b. wright, elise jensen, and susan thomas, public libraries and the internet 2007: study results and findings (tallahassee, fla.: information institute, 2008), http://www.ii.fsu.edu/projectfiles/plinternet/2007/2007_plinternet.pdf (accessed sept. 10, 2008).
6. charles r. mcclure and paul t. jaeger, public libraries and internet service roles: measuring and maximizing internet services (chicago: ala, 2008).
7. george d’elia, june abbas, kay bishop, donald jacobs, and eleanor jo rodger, “the impact of youth’s use of the internet on the use of the public library,” journal of the american society for information science & technology 58, no. 14 (2007): 2180–96; george d’elia, corinne jorgensen, joseph woelfel, and eleanor jo rodger, “the impact of the internet on public library use: an analysis of the current consumer market for library and internet services,” journal of the american society for information science & technology 53, no. 10 (2002): 802–20.
8. national center for education statistics (nces), public libraries in the united states: fiscal year 2005 [nces 2008301] (washington, d.c.: national center for education statistics, 2007); pew internet and american life project, “internet activities,” http://www.pewinternet.org/trends/internet_activities_2.15.08.htm (accessed mar. 5, 2009).
9. bertot et al., public libraries and the internet 2007.
10. ibid.
11. cheryl bryan, managing facilities for results: optimizing space for services (chicago: public library association, 2007); joseph matthews, strategic planning and management for library managers (westport, conn.: libraries unlimited, 2005); joseph matthews, technology planning: preparing and updating a library technology plan (westport, conn.: libraries unlimited, 2004); diane mayo and jeanne goodrich, staffing for results: a guide to working smarter (chicago: public library association, 2002).
12. ala, libraries connect communities: public library funding & technology access study (chicago: ala, 2008), http://www.ala.org/ala/aboutala/offices/ors/plftas/0708report.cfm (accessed mar. 5, 2008).
13. charles p. smith, ed., motivation and personality: handbook of thematic content analysis (new york: cambridge univ. pr., 1992); klaus krippendorff, content analysis: an introduction to its methodology (beverly hills, calif.: sage, 1980).
14. ala, libraries connect communities.
15. bertot et al., public libraries and the internet 2006; bertot et al., public libraries and the internet 2007.
16. ibid.
17. nces, public libraries in the united states.
18. bertot et al., public libraries and the internet 2007.
19.
american libraries, “branch closings and budget cuts threaten libraries nationwide,” nov. 7, 2008, http://www.ala.org/ala/alonline/currentnews/newsarchive/2008/november2008/branchesthreatened.cfm (accessed nov. 17, 2008).
20. ala, libraries connect communities.
21. bertot et al., public libraries and the internet 2006.
22. bertot et al., public libraries and the internet 2007.

scientific serial lists

dana l. roth: central library, indian institute of technology, kanpur, u.p., india

this article describes the need for user-oriented serial lists and the development of such a list in the california institute of technology library. the results of conversion from eam to edp equipment and subsequent utilization of com (computer-output-microfilm) are reported.

introduction

prior to the dedication of the millikan memorial library, which houses the divisional collections in chemistry, biology, mathematics, physics, engineering, and humanities, the libraries at the california institute of technology were largely autonomous, reflecting the immediate needs of each division, and exhibited little attempt at interdivisional coordination of library purchases. with centralization of the major science collections, it became apparent that any efforts to reduce duplication, promote more effective library usage, and provide assistance in interdisciplinary research efforts would require a published list of serials and journals (1).

scientists vs librarians

it is certainly a truism that serial publications constitute the backbone of a library's research collection. particularly in the sciences, where serial publications serve as the primary record of past accomplishments, studies have shown that over 80 percent of the references cited in basic source journals are to serials (see table 1).
citation of serials rather than monographs was greater in chemistry than in other sciences and the overall order may reflect the efficiency of the respective abstracting/indexing services. in spite of the scientist's heavy dependence on serials, it appears that in most libraries little attempt has been made to reconcile the library record with practices found in the scientific literature. this is in part due to the general acceptance of the library of congress dictum that serials should be cataloged according to the general principles laid down for monographs. fortunately, monographs are generally cited in the scientific literature by entries (author/title) which invariably appear in the library catalog. serials, however, present the special problems of so-called indistinctive titles, frequent title changes, and common reference to the abbreviated form of their title. most american libraries have followed the library of congress/union list practices and as a result have long suffered user complaints about the use of corporate entries for so-called indistinctive titles, entries under the latest form of title, and the treatment of prepositions and conjunctions as filing elements.

journal of library automation vol. 5/1 march, 1972

table 1. percentage of citations to serials found in basic source journals for various scientific disciplines*

discipline                      percentage of citations to serials
chemistry                       93.6
physiology                      90.8
physics                         88.8
[row(s) illegible in source]
mathematics                     76.8

*c. h. brown, scientific serials (chicago: association of college and research libraries, 1956).
these practices have been defended as attempts to extend the reference value of the catalog but in doing so they create a number of problems and ambiguities which are only partially resolved by the annoying use of see references. the recent surge of interest in making the library "relevant" and more intimately involved with its users' needs must take into account that in the minds of scientists it is presumptuous to require them to remember cataloging rules when the library could just as well accommodate the scientific form. in recognition of the long-standing scientific tradition of describing serials by their titles (which considerably predates the corporate entry syndrome), the logical solution would be to provide title added entries for those serials whose main entry is in corporate form (2).

specific problems

1. even if scientists were to remember the basic rules for society publications and similar corporate entries, how are the exceptions shown in table 2 to be reconciled?
2. the practice of cataloging serials under their latest title at best serves as an obstruction to determining the library holdings, since references given in the scientific literature and citations obtained from abstracting/indexing services are obviously to the title currently in use. another important factor, that is sometimes overlooked, is the requirement of a classified shelf arrangement. otherwise, since the title of the bound volume corresponds to the title in use at the time of binding, you have the ambiguity of the catalog referring to the latest title and the shelf locator referring back to the earlier title. these problems are further complicated by the long delays and backlogs in recataloging. in many large libraries this is a major function of serials catalogers and it is estimated that it takes 50 percent longer to recatalog than to catalog originally (3).
3. the jargon of scientists when discussing or requesting information about various periodicals is replete with acronyms and abbreviated forms. jacs, pnas, berichte, comptes rendus, annalen all have well-defined meanings in scientific literature and conversation because of the well-developed title entries and abbreviations given in physics abstracts, chemical abstracts, and the world list of scientific periodicals. the use of prepositions and conjunctions as filing elements constrains these scientists to being able to translate these abbreviations only into title entries where the omitted words are obvious, e.g., journal of the american chemical society, but often causes problems with titles like journal of the less-common metals.

table 2. an example of the difficulties encountered in translating abbreviations of scientific journal titles into lc entries

abbreviation              scientific form of title                   union list entry
bull acad pol sci         bulletin de l'academie ...                 polska akademia nauk
pnas                      proceedings of the national academy ...    national ...
jacs                      journal of the american chemical ...       american ...
berichte                  berichte der deutschen chemischen ...      deutsche ...
comp. rend.               comptes rendus ...                         academie des sciences ...
ber. bunsen...            berichte der bunsen...                     deutsche bunsen ...
bull. soc. chim. belges   bulletin des societes ...                  bulletin des societes ...
bull. soc. chim. france   bulletin de la societe ...                 societe chimique de france

the cal tech serials list: objectives and procedures

the publication of a serials list oriented to the needs of scientists must then provide for: scientific title entries for corporate and society publications, treatment of each title change as the cessation of the old title, and omitting prepositions and conjunctions as filing elements.
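the third of these practices, omitting prepositions and conjunctions as filing elements, can be sketched in a few lines of python. this is an illustration only, not the procedure used for the cal tech list: the function name filing_key and the stop-word set are assumptions (the set also drops articles and a few french and german words, which the article does not specify), and a real list would need stop words for every language in the collection.

```python
import re

# Hypothetical stop-word set (an assumption, not taken from the article):
# prepositions, conjunctions, and articles ignored when filing titles.
STOP_WORDS = {
    "a", "an", "the", "of", "for", "in", "on", "and", "or", "to",
    "de", "des", "der", "la", "le", "und", "et",
}


def filing_key(title):
    """Return the significant words of a title, lowercased, in order.

    Hyphens and punctuation act as word breaks, so "less-common"
    contributes the two filing words "less" and "common".
    """
    words = re.findall(r"[^\W\d_]+", title.lower())
    return tuple(w for w in words if w not in STOP_WORDS)


titles = [
    "journal of the less-common metals",
    "journal of the american chemical society",
    "journal of chemical physics",
]

# File the titles by their significant words only.
for title in sorted(titles, key=filing_key):
    print(title)
```

with this key a title files under the words a scientist actually remembers from the abbreviation, so the omitted words no longer have to be guessed before the entry can be found.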
these practices will increase the number of entries by about 40 percent over the number of current titles but in terms of user appreciation the extra expense is amply justified. the list can then be a logical extension of the library's reference service and offers the opportunity of facilitating the research efforts of its users by obviating the need to remember cataloging rules or visit the library to determine its holdings. input to the serials list was derived from the library's serials card catalog. the information was typed on oversize card stock and included the full main entry, holdings, and divisional library location, with additional data cards, as required, to reflect title changes. with this data base, an extensive search of the world list of scientific periodicals and list of periodicals abstracted by chemical abstracts was made to determine the additional scientific title entries to be incorporated in the list. (each departmental library provides a shelf locator which relates the various forms of entry in the serials list to that chosen for the bindery title and subsequent shelf location.) prepositions and conjunctions were replaced with ellipses in the final typing of multilith stencils required for the manual publication of the first edition of the cal tech serials list (4). during the spring of 1969, the decision was made to employ edp techniques in the publication of the second edition of the list. as an interim housekeeping device between editions, the author maintained an in-house supplement on punch cards using a single card format. this experience indicated an unacceptable severity of title abbreviation which was obviated by adopting a two-card format.
this is consistent with the ibm 360 system wherein input records are read two cards at a time, and thus, the unit record may be thought of as a "super" card of 160 columns (of which only a maximum of 131 columns can be printed on a given line, the remaining 29 columns being used for internal records). the unit serials record consists of the title, holdings, divisional library, serial number, and spacing command (see table 3). the unit records were created directly from the existing serial list and the cumulated supplement by in-house clericals. this obviated the usual requirement of coding the data for keypunch operators. subsequent to the preparation of the unit records, having an alphabetical sequence of punched cards, it was a simple matter to program the computer to serially number each second card, using one letter and six digits. an example of the distribution of titles one might expect is given in table 4.

table 3. the unit serials record

card no.    columns    field designation
1           1-75       title
2           1-27       holdings
2           29-32      divisional library
2           72-78      serial no.
2           80         spacing command

table 4. distribution of titles by initial letter

letter    number of title entries
a         1,024
b-d       1,126
e-i       1,199
j-m       1,272
n-r       1,413
s-z       1,471

while the data conversion was being performed, a series of programs was written. these programs were designed to create a master tape, update the tape, and to produce a variety of listings. these listings, in addition to the required 131-column printout for the serial list, include the 160-column printout (in sequential 80-column units) and printouts for individual divisional libraries which can be annotated with shelf locations. the data base was then transferred from punch cards to magnetic tape and subsequent additions and changes involve punch cards and tape one onto tape two operations. as a protective device tape one and tape two are the current and previously current tapes, respectively.
thus in the case of accident the preceding tape can again be updated. as a further precaution the original punch card data base and update decks are on file. the economic justification for the use of edp equipment in libraries is based upon the necessity of maintaining current records that can be published at regular intervals. in the special case of serial lists this involves the periodic merging of small numbers of new and corrected unit records with the much larger number of unit records in the existing data base. the use of serially numbered unit records allows the relatively easy machine function of merging numbered items in contrast with the difficulties involved in merging large alphabetical fields. recent advances in reprographic technology suggested that com (computer-output-microfilm) could be utilized to produce a quality catalog, free of the normal objections to "computer printout." the flexibility of currently available com units allows the acceptance, as input, of a normal print tape from most computer systems (ibm, burroughs, univac) without reformatting (6). the print processors resident in the front-end computer of the fr-80, for example, allow for upper- and lowercase, bold characters, column format, pagination, and sixty-four character sizes. variation in character size allows a maximum density of 170 characters per line and 120 lines per (8½ x 11) page.

table 5. data presentation and relative spacing

title                          holdings     divisional library
faraday society, london
  discussions                  1,1947+      chem
                               10,1951+     c eng
  symposia                     1,1968+      chem
                               1,1968+      c eng
  transactions                 1,1905+      chem
                               46,1950+     c eng

farber-zeitung                 1889-1918    chem

the application of com equipment requires the production of a "print tape." this is simply a coded version of the current tape which contains the additional instructions necessary for spacing the unit records, defining the page size, and inserting "continued on next page" statements.
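the two-card unit record of table 3 and the merge of serially numbered records described above can be sketched together in python. this is a minimal sketch under stated assumptions, not the original ibm 360 programs: the names UnitRecord, parse_unit_record, make_cards, and merge_by_serial are hypothetical, the serial numbers and the changed holdings are invented sample data, and the rule that an update replaces a master record carrying the same serial number is an assumption.

```python
from typing import NamedTuple


class UnitRecord(NamedTuple):
    title: str
    holdings: str
    division: str
    serial_no: str  # one letter and six digits, e.g. "F000100"
    spacing: str


def parse_unit_record(card1, card2):
    """Slice two 80-column cards using the column ranges of table 3."""
    c1, c2 = card1.ljust(80), card2.ljust(80)
    return UnitRecord(
        title=c1[0:75].rstrip(),      # card 1, columns 1-75
        holdings=c2[0:27].rstrip(),   # card 2, columns 1-27
        division=c2[28:32].rstrip(),  # card 2, columns 29-32
        serial_no=c2[71:78],          # card 2, columns 72-78
        spacing=c2[79],               # card 2, column 80
    )


def make_cards(title, holdings, division, serial_no, spacing="1"):
    """Build a sample card pair (helper for this illustration only)."""
    card1 = title.ljust(80)
    card2 = (holdings.ljust(27) + " " + division.ljust(4)
             + " " * 39 + serial_no + " " + spacing)
    return card1, card2


def merge_by_serial(master, updates):
    """Two-way merge of sequences already sorted by serial number.

    Comparing short fixed-width serial numbers stands in for comparing
    long alphabetical title fields. An update with the same serial
    number replaces the master record (assumed rule).
    """
    m, u = iter(master), iter(updates)
    rec_m, rec_u = next(m, None), next(u, None)
    while rec_m is not None or rec_u is not None:
        if rec_u is None or (rec_m is not None and rec_m.serial_no < rec_u.serial_no):
            yield rec_m
            rec_m = next(m, None)
        elif rec_m is None or rec_u.serial_no < rec_m.serial_no:
            yield rec_u
            rec_u = next(u, None)
        else:
            yield rec_u
            rec_m, rec_u = next(m, None), next(u, None)


master = [
    parse_unit_record(*make_cards("faraday society, london. discussions", "1,1947+", "chem", "F000100")),
    parse_unit_record(*make_cards("farber-zeitung", "1889-1918", "chem", "F000200")),
]
updates = [
    parse_unit_record(*make_cards("faraday society, london. symposia", "1,1968+", "chem", "F000150")),
    parse_unit_record(*make_cards("farber-zeitung", "1889-1920", "chem", "F000200")),
]

merged = list(merge_by_serial(master, updates))
for rec in merged:
    print(rec.serial_no, rec.title, rec.holdings, rec.division)
```

the sample serial numbers leave gaps between consecutive records so that a new title can be slotted in without renumbering; the article does not say how the original numbering handled insertions, so that spacing is also an assumption.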
the use of the spacing command instruction, as an integral part of the unit record, allows all the information on a given title to remain in one unit and easily provides for a blank line before the next title (see table 5). the additional problem of keeping the information on one title together on a given page or providing a "continued on next page" statement was solved by analyzing the information in the eighty-ninth line of each page to determine whether to print another line, insert the "continued on next page" instruction, or begin the title on the next page. once the film is generated, it is a simple matter to produce plates for the multilith production of hard copy (7). the choice of a ninety-lines-per-page format was influenced, in part, by our desire to use the serials list to break down the reluctance shown by faculty and students toward microformats. this format results in a one-third reduction of the 112-column computer printout and enables our 5,000 current titles to be accommodated on two microfiches (152 pages).

footnotes

1. for the purposes of this article, periodical and serial are synonymous and refer to publications which may be suspended or cease but never conclude. the term "serials list" should be restricted to publications which record only serial titles (and supplementary information to distinguish between similar titles), holdings, and internal records. library catalogs and union lists are quite sufficient sources for relating a title to its successor or precedent, and providing full bibliographic detail.
2. p. a. richmond and m. k. gill, "accommodation of nonstandard entries in a serials list made by computer," journal of the american society for information science 11:240 (1970); dana l. roth, "letters to the editor; comments on the 'accommodation of nonstandard entries . . . ,'" journal of the american society for information science (in press).
3. andrew d.
osborn, serial publications (chicago: american library association, 1955).
4. e. r. moser, serials and journals in the c.i.t. libraries (pasadena: california institute of technology, 1967).
5. dana l. roth, serials and journals in the c.i.t. libraries (2nd ed.; pasadena: california institute of technology, 1970).
6. robert f. gildenberg, "technology profile; computer output microfilm," modern data 3:78 (1970).
7. computer micrographics, inc., los angeles, california.

the british library's approach to aacr2*

lynne brindley: british library, bibliographic services division, london, england.

the formal commitment of the british library to aacr2 and dewey 19 entailed substantial changes to the u.k. marc format, the blaise filing rules, and a variety of products produced for the british library itself and for other libraries, including the british national bibliography. the british library file conversion involved not only headings but also algorithmic conversion of the descriptive cataloguing.

along with the u.s. library of congress and the national libraries of australia and canada, the british library was formally committed to the adoption of the anglo-american cataloguing rules, second edition (aacr2) and decimal classification, 19th edition (dc19) in 1981. this entailed fairly substantial changes to the marc format as published in the u.k. marc manual, 2nd edition as well as the implementation of the new and more sophisticated blaise (british library automated information service) filing rules.1 there is, of course, never an ideal time for making major changes, whether politically, economically, or technically; and the bibliographic services division (bsd) found itself having a large number of preexisting separate systems, particularly for our batch processing work, which had grown up over a long period of time and had in most cases been tailor-made to the individual products.
whilst relatively small, bsd is nonetheless responsible for a multiplicity of products and services, almost all of which were to be affected to some extent by the change to aacr2/dc19. briefly, then, a comment on the different services and the degree to which they were affected, thus setting the scene for our decisions on machine conversion.

*based on a talk given at the library association seminar "library automation and aacr2," held in london on january 28, 1981. the views expressed in this paper do not necessarily represent those of the british library or the bibliographic services division. manuscript received june 1981; accepted june 1981.

services and impacts

printed publications

the major printed publication of the division is the british national bibliography. it is arguable that for the printed publications (especially the weeklies) there would have been little justification for retrospective conversion. the files could have been cut off at the end of 1980 and started afresh for 1981; it might, however, have precluded, or certainly have made more messy, the possibility of any multiannual cumulations across this period.

microform products

these are mostly individual com catalogues, both within the bl, especially the reference division, and externally, provided through locas (bsd's local catalogue service) to some sixty libraries in the u.k. in many ways those libraries that plunged into automation early, building up files of records derived from central u.k. and lc marc, were likely to be worst affected. individual machine-readable files had grown very large and exploited not only relatively current cataloguing data, but also full retrospective u.k. holdings back to 1950. also we foresaw no lessening of use by libraries taking our catalogue service of the u.k. retrospective 1950-80 file after aacr2 implementation. therefore the grounds for attempting automatic retrospective conversion of records were indisputable.
tape services

u.k. exchange tapes, either as a weekly service or through the selective record service, are supplied to nearly one hundred organisations. the same arguments that there will be continuing selection from the retrospective files apply; therefore, for compatibility and ease of use we needed to consider conversion. the weekly exchange tape service makes a clean aacr1/aacr2 break, but obviously libraries have back files of aacr1 records. mindful of our responsibility to other organisations and agencies utilising our records, we decided to make our own converted tapes of lc and u.k. marc records available to tape-service customers to aid their own conversions.

online services

regarding the blaise online information retrieval system for u.k. and lc marc, our concern was to ensure continued easy searching and printing across the total span of files. without automatic conversion it would have been difficult, if not impossible, to ensure consistency in search elements and index entries (e.g.: in u.k. marc, series fields 400, 410, and 411 no longer exist, so without conversion a searcher would have to remember specific search qualifiers for pre-1981 records, and different ones thereafter). without conversion the searcher would need a lot more knowledge of marc and the history of cataloguing practices to formulate effective strategies.

journal of library automation vol. 14/3 september 1981

outside users of marc

last and very much not least was a consideration of what we could do to help the now large community of u.k. marc users in coping with the changeover. this is now a very large and diverse group relying on bsd for the provision of bibliographic records for whatever purpose. our own conversion enabled us to provide a multiplicity of aids to libraries. of particular note are (1) u.k.
and lc exchange tapes of converted records, and (2) machine-readable and microfiche versions of our own name conversion file, which is being used as the basis for the new name authority fiche. so, in the context of the variety of our services the case for conversion was strong.

retrospective conversion

the extent of the retrospective conversion exercise is discussed below. in conjunction with this work we were faced with the necessity of rationalising our com and print product software (library software package), both to enable it to drive each of the previously separate print applications and to ensure that it had sufficiently sophisticated output facilities to cope with the complexity of aacr2/u.k. marc 2 records (with their increased numbers of subfields, the repeatability of all or some of them, and their varying sequences) and so produce the specified layout and punctuation across our services.

extent of conversion

we are now in a position to discuss the retrospective conversion exercise. having decided in principle to become involved with conversion, we had to establish the extent of our involvement. british libraries have never had the tradition of building and utilising name authority files, and certainly the concepts fit more easily in the north american, primarily online, system context than in the predominantly batch cataloguing systems established in the u.k. the bl therefore found itself without a machine-readable authority file and began to create one from scratch to enable the important heading changes required by aacr2 to be handled automatically. again because of the overriding importance of com catalogues in the u.k., considerable attention was paid not only to automatic heading changes but also to automatic marc coding and text conversions, bringing the descriptive cataloguing elements into line with aacr2/u.k. marc 2 as well, so that catalogue records could be consistent on output whether derived from the conversion or newly created.
the third consideration for conversion was our library of congress file (books all 1968), used in the u.k. as part of our cataloguing services and as a file in the blaise online system. we had always performed certain conversions on lc records to bring them more into line structurally with the u.k. marc format. however, u.k. libraries using these records for cataloguing purposes still had to undertake substantial editing. it was therefore decided to use the opportunity to enhance this conversion and bring lc records into line with u.k. marc 2 to make them of maximum use to british librarians.

to summarise, then, the retrospective conversion comprised three main parts:

1. that part which utilised information stored in the name conversion file (ncf), which records the aacr2 and aacr1 forms of names. this enabled the automatic conversion of major, commonly occurring personal and corporate headings.

2. automatic marc coding and text conversions. this consisted of specifications at marc tag and subfield level of algorithms for automatic marc coding and some bulk text conversions. it resulted in records being converted to a pseudo-aacr2/u.k. marc 2 format, so that all output specifications, whether by profile or by online inversion, had only to cater for the new format.

these two parts of the conversion are inextricably linked, both conceptually and in programming terms, with frequent references to alternative courses of action dependent on whether a match has been found on the ncf. the details of conversion are in "specification for retrospective conversion of the uk marc files 1950-1980" (ref. 2), prepared in the computer services department.

3. the third facet of conversion was to our library of congress files (books all 1968), to bring records in line with u.k. marc 2 as far as possible.
only conversions of tags, indicators, subfield marks, punctuation, and order of data elements have been included; no attempt has been made to bring textual data into conformity with bsd practice. the converted records are therefore in aacr2 form to the extent that lc applies aacr2 to a particular record.

the next section highlights major points of each part of the conversion, commenting particularly on aspects of programming and testing.

name conversion

the name conversion file was built up by bsd's descriptive cataloguing section over nine months of 1980 and comprises authenticated aacr2 headings with the aacr1 form where different. it will form the basis of an authority file of headings and references for future bsd cataloguing and will be the first publicly available u.k. authority file. the file was maintained using existing locas facilities. pseudo-marc records were created recording the aacr1 and aacr2 forms of headings in the format shown in example 1.

field
001 (control number)
049 (source code)
110.1 $a great britain $c accidents investigation branch (name heading in aacr2 form)
710.1 $a great britain $c department of trade $c accidents investigation branch (name heading in aacr1 form)
910.1 $a great britain $c department of trade $c accidents investigation branch $x see $a great britain $x accidents investigation branch (reference for aacr2 name heading)

example 1. name conversion file record

the file being used for conversion comprised some 12,000 records, of which 4,000 had aacr2 heading changes. the remaining records were authenticated by bsd as correct aacr2 headings without alteration. of the changed headings most were prolific personal and corporate (particularly u.k. government) headings. the first stage of the conversion process for u.k.
marc records (1950-80) involved all records being processed against the name conversion file to replace aacr1 with aacr2 headings and associated references. in programming terms, the name conversion was relatively easy (relatively, that is, in the context of bibliographic programming). the matching program used was not particularly sophisticated. it took each ncf record, identified the 7xx (aacr1) field, created a key of fifty characters by stripping out all blanks, embedded punctuation, and diacriticals, and then tried to match the key against each 1xx heading in whatever file was being converted. if there was a match on the key, then the program proceeded to match character by character through the data looking for an exact match. if this was not found, then the ncf record was not processed. example 2 shows this procedure more clearly. of course, this file has not converted all aacr1 headings, but it has ensured that the majority of headings likely to recur (i.e., of any significance in catalogue collocation of headings) have been automatically changed.

automatic marc coding and text conversions

this is commonly known as the format conversion program and forms the bulk of the "specification for retrospective conversion." the original specification was extremely complex, particularly bearing in mind the tight time scales that we were working to. the major difficulty throughout all parts of this facet of conversion was having to specify procedures to accommodate the variety of usage of marc across thirty years, including previously automatically converted 1950-68 u.k.
marc records; it has been almost impossible to verify absolutely that any of the automatic changes would cover all cases.

ncf record
710 (aacr1) $a great britain $c civil service department $c central computer agency#
110 (aacr2) $a central computer agency#
910 (aacr2) $a great britain $c civil service department $c central computer agency $x see $a central computer agency#
key: 10$agreatbritain$ccivilservicedepartment$ccentralc
matching on data: would match central computer agency; would not match central cataloguing agency
n.b. key equals 50 characters (upper case)

ncf record
700 (aacr1) $a walker $h david esdaile#
100 (aacr2) $a walker $h david e. $q david esdaile $r 1907-#
900 (aacr2) $a walker $h david $c 1907 $x see $a walker, david e.#
key: 10$awalker$hdavidesdaile

book record
before: 100 walker $h david esdaile#
        900
after:  100 $a walker $h david e. $q david esdaile $r 1907-#
        900 $a walker $h david $c 1907 $x see $a walker, david e. $z 100#
n.b. addition of new reference

example 2. name conversion matching

not surprisingly, this was an extremely complex program. it had to allow for manipulating in fairly precise ways nonstandard and variable data, and had to be designed to cope with occurrences in many different combinations. the programmer had to code for these combinations, some of which may possibly never have been used. it is probably the case that certain combinations do not exist, but this could not be guaranteed over such a large number of records until the total file had been converted. a good example of the complex logic of this kind of processing is found in the 245 field, where seven complex conditions were allowed for (shown here in outline only, with the detail left blank):

field 245
(1) if $e ___ then ___ else ___
(2) if $£ ___ then ___ else ___
(3) if $d or $e ___ or ___ or ___ or ___ or ___ or ___ or ___ then ___ else if $d or $e ___ or ___ or ___ or ___ or ___ or ___ or ___ or ___ then ___
(4) if tags ___ then ___
(5) if 008 ___ and ___ or ___ or ___ then ___
(6) if $h ___ then ___ and ___
(7) if $e ___ then ___ else if first $e ___ then ___ else ___ else ___
repeat for all levels of 245.

another variation on this theme is that the specification catered for what it expected to find. again, because of the volume and span of data the expected was not always found. for example, a lot of processing of references is dependent on the presence of a $x. what do you do when you find a record accidentally without one?

a third problem was that of interdependency of fields and subsequent actions. a good example of this is found in 110s and related 910s. if a 110 is changed, you may have to create a 910, replace a 910 with another one, or reorganise existing subfields. then you may have to reorder the field and also flag the action to come back to later in the program. hence you are switching back and forth across fields throughout the program. you cannot simply start at field one, process sequentially, and then stop. clearly this makes program testing that much more complicated.

however, those were the problems, and really a very small percentage of the whole. from all that has been seen of the converted files so far it has been a highly successful exercise. all of the major marc changes and many of less significance have been converted automatically by this program (treaties, laws, statutes, series, conferences, multipart works), the resulting records being consistent in marc tagging structure and in significant headings and areas of text.

library of congress file conversion

it has already been stressed that the automatic marc coding and text conversions for u.k. marc were very complex programs. perhaps even more complicated was the conversion program written to transform lc into u.k. marc format. the main reason for this is that the u.k.
and ncf conversions are one-off programs and a great number of the manipulations could be hard-coded. however, it is intended that the lc conversion program will be used on an ongoing basis against each weekly lc tape. thus each conversion has been treated as a separate parameter to the program so that it is general purpose and easily alterable in the light of changes of practice by lc. to give you some idea of the complexity, there are well over 600 separate parameters to the program. i say separate, but in fact they are interrelated parameters, so that if a minor change is made to one it can potentially affect many others.

many of the problems relating to this program could again only become really apparent in volume testing, not in writing. each parameter written and tested in isolation was satisfactory, but when they began to be put together in modular form, the problem of unusual combinations began to show. although the conversion parameters for lc records are extensive, they cannot touch the cataloguing data, certainly not nearly as much as in the u.k. marc conversion. there are added problems in the fact that the records coming to us from lc do not show the clean aacr1/aacr2 break that bsd is adopting. we are having to allow for mixed records from lc at least in the foreseeable future. details of the lc-to-u.k. marc conversion are published in a detailed specification (ref. 3).

common issues in conversion testing

it is possible to draw out common problems applicable across all the conversion work, particularly in testing. they are as follows:

1. variability of records;
2. complexity of records;
3. volume of data;
4. nonstandard data;
5. repercussions throughout system.

variability

this is an obvious problem in the handling of marc records, but particularly pertinent when trying to do such complex manipulations.
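returning briefly to the name conversion stage described above, the two-step match (a fifty-character key, then an exact comparison of the data) can be illustrated with a minimal python sketch. it is not the bsd program: the record layout, the function names, and the treatment of the two leading key characters as field indicators are all assumptions made for illustration, guided only by the keys shown in example 2. where the article describes scanning the file once per ncf record, the sketch swaps in a dictionary lookup by key, which should give the same matches.

```python
import unicodedata
from collections import defaultdict
from typing import Dict, List, NamedTuple, Optional

KEY_LENGTH = 50  # the article specifies a key of fifty characters


class NcfEntry(NamedTuple):
    """One name conversion file record (hypothetical layout)."""
    indicators: str  # assumed source of the leading "10" in the example keys
    aacr1: str       # old heading from the 7xx field, e.g. "$a great britain $c ..."
    aacr2: str       # authenticated replacement heading from the 1xx field


def make_key(indicators: str, heading: str) -> str:
    """Upper-case key with blanks, punctuation and diacritics stripped.

    The "$" subfield mark is kept because it appears in the example keys.
    """
    decomposed = unicodedata.normalize("NFD", heading)  # split accents from letters
    kept = "".join(ch for ch in decomposed if ch == "$" or ch.isalnum())
    return (indicators + kept).upper()[:KEY_LENGTH]


def squeeze(heading: str) -> str:
    """Collapse runs of whitespace so the exact comparison ignores spacing."""
    return " ".join(heading.split())


def build_index(ncf: List[NcfEntry]) -> Dict[str, List[NcfEntry]]:
    """Group ncf entries by key so that each heading needs a single lookup."""
    index: Dict[str, List[NcfEntry]] = defaultdict(list)
    for entry in ncf:
        index[make_key(entry.indicators, entry.aacr1)].append(entry)
    return index


def convert_heading(indicators: str, heading: str,
                    index: Dict[str, List[NcfEntry]]) -> Optional[str]:
    """Return the aacr2 heading, or None when no exact match exists."""
    for entry in index.get(make_key(indicators, heading), []):
        if squeeze(entry.aacr1) == squeeze(heading):  # exact match on the data
            return entry.aacr2
    return None
```

with the first ncf record of example 2 loaded, a heading ending in central computer agency is replaced, while one ending in central cataloguing agency shares the same truncated key but fails the exact comparison and is left alone, which is the behaviour the example describes.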
the record format itself is of course variable: there are very few essential fields or data elements; most need not be present at all; if they are present, they can be there once or ten times. standards of cataloguing, and therefore marc coding, have changed considerably over the period in question, adding to the variability. in some exceptional cases bsd practices are different from those prescribed in the marc manual, e.g., nonstandard use of title references. all of this results in additional difficulties from specification through programming and testing. on average we found that one conversion process took two to three times the amount of coding required for more normal computer processing.

complexity

this is linked with variability and was manifest particularly in the fact that it was extremely difficult to ensure that the programs catered for all conditions. we found that testing threw up oddities not allowed for in the original specification. in an ideal situation with no time constraints a totally tailored and comprehensive test file should have been drawn up for each facet of conversion. this exercise alone would have taken a good year and would still not have catered for the unexpected data problems. in practice, whilst bsd's descriptive cataloguing staff were able to provide several hundred records that tested the majority and most important of the conversions, we always faced the possibility of coming across exceptions. this soon became apparent when volume testing commenced and each new file threw up another combination and a different program route not previously tested.

volume

the third major factor adding to the complexity of the whole operation was the sheer volume of data to be processed. approximate figures are as follows: u.k.
marc, 0.7 million records; lc marc, 1.4 million records; locas, 2.5 million records.

the combination of these three factors (variability, complexity, and volume of data) made testing extremely difficult and expensive in machine terms, in that large test batches of material had to be processed.

nonstandard data

like any large file, u.k. marc has its share of incorrect data, most of it of no particular significance. however, some problems arose in conversion testing, resulting occasionally in corrupted records. one example that springs to mind was the incorrect spelling of months in treaties, giving problems in the 110 $b conversion to 240.

repercussions throughout system

a cautionary note, really: we made a decision that postconversion records should not be put back to overwrite existing master files until they had been through validation programs (i.e., those used for validating new input for bnb and locas); it was felt that this was a necessary safeguard against reintroducing any structurally incorrect records postconversion. it was here again that testing threw up timely reminders of just how much the validation programs had been upgraded and changed since many of the original records had been input through the system.

scheduling

the scheduling of such a large, complex exercise was extremely difficult, with interdependency of processing related to the success or otherwise of overnight runs. a lot of time was spent before the conversion period in discussion with our computer bureau to ensure maximum cooperation throughout the difficult time. they were extremely helpful in ensuring operator coverage throughout weekends and priority for our work. one of the problems we encountered was having to forecast the approximate number of machine hours that would be required throughout january 1981, when the bulk of conversion work was carried out.
at the time the figures were needed we were still in the early stages of programming, so no volume tests could be run. equally, although we were experienced in large-volume processing, it was difficult to draw any direct comparisons with production work. additionally, we had to allow for a heavier than normal production work load towards the end of the year, which always sees annual volumes, cumulations, online file reorganisation, and so on. scheduling therefore was a fine art: to ensure correct priorities for production, the bureau's own work, and conversion, and to minimise contentions for files and peripherals.

staffing

of interest is a picture of the human resources involved in this project. what is striking is the magnitude of the task achieved by very few people. the overall management of the project was taken on by existing line management within bsd's computer services department. two project leaders were appointed, one a librarian and one a systems analyst. the librarian had a team of four temporarily seconded staff who were totally responsible for all output profile specifications (printed products and com), testing, and implementation. they also did a considerable amount of checking of test file conversion runs. the systems analyst was project leader for three analyst-programmers and one jcl writer. between them they were responsible for lc and u.k. conversion programming and the new filing rules. existing operations staff and others as appropriate within the division were called upon for other tasks.

disruption to services

whilst disruption to our normal production services was kept to an absolute minimum, it was decided that it would be necessary to temporarily suspend certain services through the month of january 1981 while the bulk of the file conversion took place.
throughout the period, the blaise online information retrieval system continued to be operational; associated online facilities that would normally allow the despatch of marc records to catalogue files were suspended to avoid any non-aacr2 or nonconverted records inadvertently updating converted locas files. the production of com catalogues through locas was suspended for a single month, and the first issue of bnb for 1981 was not scheduled until early in february. the schedule for the conversion exercise was adhered to with no major slippage except in the case of our lc file conversion; this exercise stretched on into the spring for a variety of technical reasons largely concerned with the characteristics of the lc data.

conclusions

having been so closely involved in this project it is difficult to draw out general conclusions as yet. however, there are some already obvious benefits both for bsd and the wider library community: the rationalisation of our software for com/printed products will lead to easier maintenance and future upgrading; the introduction of the blaise filing rules across all our products is an improvement; the new lc conversion will make our lc files much more easily usable by the british library community; we have the basis of a u.k. name authority file for the first time.

this was a vast and sophisticated conversion exercise and will result in u.k. marc files probably more uniform in structure than they have ever been. it forms an excellent basis for the continuation of bsd services, especially those based on utilising records across the whole time span, e.g., blaise information retrieval, selective record and cataloguing services. equally, because our conversion has been so extensive we have been able to share it: the specification, the name conversion file, and the converted u.k. and lc files were all available at minimal cost to libraries in the u.k.
of course, it is not the 100 percent solution (it was never intended to be), so if you look hard enough you will find inconsistencies. however, it has proved that very extensive automatic conversion is possible even with today's state of the art of computing, and that bsd has led the way, indeed eased the path of transition to aacr2 for british libraries.

references

1. british library, filing rules committee, blaise filing rules (bl, 1980).
2. british library, bibliographic services division, computer services department, "specification for retrospective conversion of the uk marc files 1950-1980" (unpublished with limited distribution).
3. british library, bibliographic services division, "specification for conversion of lc marc records to uk marc" (unpublished with limited distribution).

lynne brindley is head of customer services for the british library automated information service (blaise).

lita award, 1980: maurice j. freedman

s. michael malinconico

this is the third presentation of the lita award for outstanding achievement. the first two honored individuals whose achievements can be said to have created the discipline we know as library automation. the first award went to fred kilgour, whose vision, daring, and entrepreneurial and managerial skills changed the way libraries operate almost overnight, and may in the increasingly stringent economic times ahead have helped ensure the economic viability of libraries. the second award went to henriette avram, whose untiring efforts on behalf of the marc formats and their promulgation are only just short of legendary.

this year's winner distinguished himself in a somewhat different manner. his contributions did not lead to the development of new automated systems or services. rather, his outstanding achievement lies in the creative and pioneering use he made of technology in support of a clear vision of effective library service.
his contribution comes from the depth of sensitivity and understanding he brought to the application of technology to library service. much to our good fortune, he has chosen to share with us through his many writings the insights he has found in his study of the fit between technology and the delivery of effective library service.

this year's winner shares the distinction, with the two previous winners, of being a former president of the division. in fact, he presided over the change from the venerable acronym isad to the new name of the division: library and information technology association (lita).

it gives me particular pleasure to present this year's award, as it goes not simply to an esteemed colleague but to a valued friend. i first met maurice (mitch) freedman at the first ala conference i attended, the midwinter meeting of 1972. the first session i attended at that conference was a meeting of the committee on library automation (cola). i had gone to that meeting to report on nypl's automated cataloging system, which had that month become fully operational with the publication of the book catalogs of the research libraries and of the mid-manhattan library. following the cola program, mitch approached me, introduced himself, and inquired about the possibility of using the nypl system to produce hennepin county's catalog. the consequences of that afternoon were most salutary both for the hennepin county library (hcl) and for me personally. hcl acquired at no cost an automated bibliographic control system, and i gained a friendship that has endured for nearly a decade.

thus, rather than dwelling on mitch's professional accomplishments, which are already well known to you, i would prefer to say a few words about the man himself.

[photograph: maurice freedman (left) receiving 1980 lita award presented by s. michael malinconico (right).]

perhaps the best way to characterize him is to describe to you his office at columbia university. prominently displayed on the walls are two enormous posters, one of bertrand russell and another of lenny bruce. a perhaps odd pair, until one realizes that these men had one important attribute in common: neither of them accepted, without incontrovertible proof, truths supported by conventional wisdom alone. mitch, like the philosopher and satirist whose images grace the walls of his office, is an iconoclast who insists on more than the endorsement of reigning authority before he will embrace an idea; and he will work tirelessly to change the prevailing wisdom if he finds that it serves to frustrate rather than aid the delivery of the kind and quality of library service to which he feels the patrons of libraries are entitled.

likewise, though he was among the pioneers who helped introduce sophisticated technologies such as automation and micrographics into the operation of libraries, he has always maintained a healthy skepticism, which has prevented him from being seduced by the dry voices of the hollow men who proclaim marvels that are in reality only gilded figures of straw.

just as lenny bruce refused to accept contemporary conventions regarding language and behavior, mitch freedman has refused to accept the sanctity of lc subject terminology. he, sanford berman, and joan marshall have served for more than a decade as lc's conscience, prodding our phlegmatic, de facto national library to action. just as bertrand russell returned to the axioms of giuseppe peano in an attempt to secure the foundation of mathematics in formal logic and to free that discipline of fuzzy thinking, mitch has returned to the principles articulated by antonio panizzi and seymour lubetzky as the tests by which to judge the claims of the self-assured mountebanks who regale us with newly coined bibliographic wisdom.

lita award 235
in this regard i anxiously await the completion of his doctoral dissertation, in which he explores the philosophical underpinnings of theories of bibliographic control (a work that would have proved most useful during the protracted emotional debate that surrounded aacr2).

i expect that it must be particularly gratifying for mitch to accept his award in this particular city. although his physical roots are in the northeast, i rather think his intellectual and spiritual roots are here, or more precisely, in the city across the bay, berkeley. it was just about twenty years ago that mitch, after graduating from rutgers university, newark, enrolled as a graduate student in philosophy at the university of california, berkeley. while at berkeley, his sense of social justice and utter disdain for unsupported dogma (could one expect less of a student of philosophy?) led him to become active in the free speech movement. thus, we find very early in his career a concern for social issues, a concern that reemerged in his active involvement with the social responsibilities round table shortly after joining the library profession.

before leaving berkeley, mitch earned his degree in library science. thus, he earned his degree from one of the most prestigious library schools on the west coast, and now plies his trade as associate professor at one of the most prestigious library schools on the east coast, the columbia university school of library service. if he is only moderately successful in conveying to his students his dedication to the delivery of quality library service, his steadfast conviction that technical services is in reality the first step in the provision of effective public service, and a respect for the supremacy of principle over expedience, his graduating classes will constitute a more lasting and meaningful award than this simple gesture conferred upon him by his professional colleagues.
communications
฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ 
฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ 
฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ 
฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ 
oclc's database conversion: a user's perspective

arnold wajenberg and michael gorman: university of illinois library, urbana-champaign

this article describes the experience of a large academic library with headings in the oclc database that have been converted to aacr2 form. it also considers the use of lc authority records in the database. specific problems are discussed, including some resulting from lc practices. nevertheless, the presence of the authority records, and especially the conversion of about 40 percent of the headings in the bibliographic file, has been of great benefit to the library, significantly speeding up the cataloging operation. an appendix contains guidelines for the cataloging staff of the university of illinois, urbana-champaign in the interpretation and use of lc authority records and converted headings.

the library of the university of illinois, urbana-champaign, is the largest library of a publicly supported academic institution, and the fifth largest library of any kind, in the united states. in the last year for which figures are available (1979-80), the library added more than 180,000 volumes representing more than 80,000 titles. the library is currently cataloging more than 8,000 titles a month; more than 80 percent of the records for these titles are derived from the oclc database (library of congress and oclc member copy).
because our cataloging is of such volume and because we are actively engaged in the development of an online catalog, we decided to use the second edition of the anglo-american cataloguing rules (aacr2) earlier than the "official" starting date of january 1981. we began to use aacr2 for all our cataloging in november 1979. this early use of aacr2 has led to two consequences. first, we now have oclc archival tapes representing about 150,000 titles cataloged according to aacr2. this represents a valuable and continuously growing bibliographic resource that can be used without modification in our future online catalog. second, we have a considerable and unique collective experience in the practical application of aacr2. the minor problems of working with aacr2 in an aacr1 plus superimposition environment (until january 1981) were more than compensated for by these two positive results.

manuscript received june 1981; accepted june 1981.

oclc conversion

with our practical background in the use of aacr2 and our continuing need for a high volume of cataloging, we were, naturally, keenly interested in the (to our mind) progressive decision of oclc to use machine matching techniques to convert the form of name and title headings in its database, the online union catalog (oluc), to conform to aacr2. we recognized the limitations of the project, essentially those defined by the capabilities of the computer for matching character by character, but felt that this was a major venture that would, when completed, produce major benefits. what follows is an assessment and analysis of the results of the project in the light of the experience of a library that is dedicated to achieving high-volume, quality cataloging. we deal with the lc authority file as well as the oclc headings because the lc file was the basis of the project and because, from the practical point of view, the two files are complementary aspects of the same service.
the greatest value of the conversion, and its greatest claim to uniqueness, lies in the sheer size of the project in terms of headings checked and changed. our catalogers, and others who work with current materials, estimate that more than 40 percent of the name and title fields we use in our current cataloging have a w subfield indicating that the name or title has been changed to its aacr2 form. since oclc estimates that 39 percent of the name and title fields were affected by the conversion, it would appear that the headings that were changed are the headings that we are more likely to use. in other words, the project has brought us more than a 39 percent benefit. we are also greatly encouraged to find that the number of headings coded dn (meaning aacr2 "compatible," or, more bluntly, lc's modifications of the provisions of aacr2) is a very tiny minority of all converted headings. this means that when, in the future, this policy of "compatibility" is lessened or dropped, there will be relatively few changes to be made.

lc authority records

we also benefit from the presence of lc authority records in the oclc database when we establish headings that are new to our catalogs. there is one problem with the use of these records, which was revealed by a sample of new university of illinois authority records (see table 1). this sample of 368 new university of illinois records reveals that lc authority records are available relatively rarely for new headings. this is not surprising, as these new headings are established most often as part of the process of original cataloging, which, almost without exception, occurs in our library only when oclc copy is not available. it seems to us to be unfortunate that member libraries cannot contribute their authority records to the oclc database. our experience suggests that the online authority file would grow very rapidly if they could.

journal of library automation vol. 14/3 september 1981

table 1. recently established headings

                                           no authority   record     record     record
                                           record         coded c*   coded d*   coded n*
given name headings                        13             5          0          1
single surname headings                    212            26         2          2
  (number of this category with
  initialisms expanded in parentheses)     (132)          (7)        (1)        (2)
compound surname headings                  29             12         0
  (number of this category with
  initialisms expanded in parentheses)     (2)            (0)        (0)        (0)
single surnames plus uniform titles        3              0          0          0
general corporate headings                 34             12         0          0
general headings with subdivisions         7              2          0          0
government headings                        4              2          0
total                                      302            59         2          5

*key: c in subfield w indicates an aacr2 form, as established by library of congress. d in subfield w indicates an aacr2 "compatible" form, as established by library of congress. n in subfield w indicates that the input operator could not determine which set of rules governed the form of the heading.

to put it another way, the oclc conversion provides an enormous and valuable resource of aacr2 headings. it did not, and could not, provide new authority information. oclc will be complementing its valuable work in upgrading the retrospective file when it devises and implements a scheme for making available authority records for new headings derived from a wide range of sources.

since so many headings were converted to aacr2, it may seem churlish and ungrateful to complain that more was not done. the following descriptions are not intended to form part of an attack on oclc's project or to minimize its achievement.

form subdivisions

the project failed to delete form subdivisions (such as "liturgy and ritual" and "laws, statutes, etc.") from added entry headings and subjects.
the program correctly deleted them from main entry headings, but the inconsistencies resulting from their retention elsewhere make the job of ensuring consistency in a large copy cataloging operation that much harder. this inconsistency in treatment is illustrated by examples 1 and 2. example 1 originally was entered under

110 10 illinois. k laws, statutes, etc.

the program correctly changed the main entry heading to

110 10 illinois

and added a subfield w, coded mn (the m indicates a conversion by machine to the aacr2 form; the n means "not applicable," and indicates that there is no title element in the heading). example 2 has as main entry

110 20 illinois community college board

but has as added entry

710 10 illinois. k laws, statutes, etc. t illinois public community college act

under aacr2, the subfield k, "laws, statutes, etc.," should not be present in the heading. unfortunately, the program looked only at 110 fields, not at 710 fields, and so the heading was not corrected in the conversion. it must therefore be edited manually by every library that uses the record.

program problems

our direct use of the online authority file is somewhat hampered by the programming oversight that makes it impossible to search uniform titles. of course, uniform titles that are accompanied by a 100 field (notably in music) can be retrieved by an author search, but those without 100 fields (anonymous classics, sacred scriptures, etc.) are virtually inaccessible. there were a handful of specific instances in which the specifications were inadequate or the programs seem to have malfunctioned. these resulted in some oddities such as the conversion of the subject "jesus christ" to "sermon on the mount" and the (surely not politically motivated) switch from "u.s. department of state" to "voice of america." oclc has been scrupulous in identifying and publicizing these errors.
they are few in number and, though conspicuous, have rarely caused us many problems. as can be seen, the problems caused by what we see as failures on oclc's part are few and affect few cataloging circumstances. the remaining problems either result from the decisions and actions of the library of congress and, hence, are wholly or mostly out of oclc's control, or are of such a nature that they cannot be solved by computer matching techniques without extensive editorial intervention. whether such human intervention is possible and, if possible, cost-beneficial is not for us to say, though it must be recognized that to transform the oluc to pure aacr2 conformity would be a herculean task. that task would undoubtedly involve many of the hundreds of thousands of records that are seldom or never used.

serials

the most troublesome example of the kind of problem that cannot be resolved by machine matching is that of serials. the oclc conversion project was, quite properly, not concerned with choice of entry (aacr2, chapter 21). this seems a simple and clearly defined decision. when we come to consider serials, this clear distinction between choice and form of entry becomes blurred. the major change brought about by aacr2 (rule 21.1b2) is that many serials previously entered under the heading for a corporate body are to be entered under their titles. in fact, the great majority of serials will now be entered under title. the upshot of this is that the citation (or form of heading) for a serial changes from, for example,

national society for medical research. bulletin

to

bulletin / national society for medical research

the restriction of the oclc project to forms of heading means that most serials in oluc will be found under headings whose form may be correct but which are inappropriate for citations.
this problem, which, of course, cannot be resolved by computer matching, has led to difficulties for us in copy cataloging, because a degree of expertise is needed to apply aacr2 rule 21.1b2 and to distinguish between the majority of serials where the 110 field should be changed to a 710 and the small minority where the 110 field should remain as it is. since most serials are to be entered under their titles, it occurs to us to suggest that the oclc conversion project could have changed all 110 fields in records identified as relating to serials to 710. by that method, the majority of serials would be correctly entered and the potential for mistaken citations greatly reduced.

multiple personal names

persons who write under more than one name (real names, pseudonyms, etc.) and who are not primarily identified by one of those names (aacr2 22.2c3) pose a special problem. under the provisions of aacr2, such persons are to be represented in the catalog (and the database) under two or more names. despite the fact that "creasey, john" and "marric, j. j." and "ashe, gordon" are all names used by the same man, they will appear as separate headings from now on. under aacr1 plus superimposition one of those names ("creasey, john") was used as the heading for all works. within the confines of the oclc project, there was no method available to distribute the various records under the various headings. it occurs to us that some method based on matching the name found in the 245 $c subfield with the 100 field might, at least, have resulted in the project recognizing probable cases calling for multiple headings. for example:

100 a hibbert, eleanor
245 a bride of satan / $c jean plaidy

could alert the system to a case for change. we recognize that this would call for more sophisticated computer matching techniques and that it would call for editorial intervention.
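the 245 $c comparison suggested above can be sketched in a few lines. what follows is a minimal illustration in python, not the oclc conversion program: it assumes that the 100 heading and the 245 $c statement of responsibility are already available as plain strings, and the names used (name_tokens, probable_multiple_heading, STOPWORDS) are hypothetical. it flags a record when the surname in the 100 field does not occur anywhere in the statement of responsibility.

```python
from __future__ import annotations

import re
import unicodedata

# Hypothetical list of words that commonly appear in a statement of
# responsibility but are not part of a personal name.
STOPWORDS = {"by", "and", "edited", "the"}


def name_tokens(text: str) -> set[str]:
    """Return the lowercase alphabetic words in text, without accents or stopwords."""
    decomposed = unicodedata.normalize("NFKD", text)
    plain = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    words = re.findall(r"[a-z]+", plain.lower())
    return {word for word in words if word not in STOPWORDS}


def probable_multiple_heading(heading_100: str, responsibility_245c: str) -> bool:
    """Flag a record whose 100 surname is absent from the 245 $c text.

    The surname is assumed to be everything before the first comma of the
    100 heading, as in "hibbert, eleanor".
    """
    surname = heading_100.split(",")[0]
    return not name_tokens(surname) <= name_tokens(responsibility_245c)


if __name__ == "__main__":
    samples = [
        ("hibbert, eleanor", "jean plaidy"),
        ("blixen, karen, 1885-1962", "by isak dinesen"),
        ("creasey, john", "john creasey"),
    ]
    for heading, responsibility in samples:
        flagged = probable_multiple_heading(heading, responsibility)
        print(f"{heading!r} / {responsibility!r}: review needed = {flagged}")
```

a flag of this kind only marks a record as a probable case for editorial review; it cannot decide by itself which heading is the correct one for the work in hand.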
a good example of the problem this has caused for us is the case of the danish author karen blixen. she wrote under that name and under the pseudonyms isak dinesen and pierre andrezel. records in the database that were added before 1981 will use "blixen, karen, 1885-1962" as the heading for all her works including those published under pseudonyms. since the blixen heading is a perfectly acceptable aacr2 form, the conversion program codes it as an aacr2 heading, which it is for the blixen books but is not for those published under other names. the authority record (example 3) includes a note identifying both pseudonyms as valid aacr2 headings, but, of course, the programs as written cannot interpret such a note and match them with appropriate records.

[example 3. oclc screen displays: the lc authority record for "blixen, karen, 1885-1962" and a bibliographic record entered under that heading.]

corporate name changes

corporate bodies present a similar problem when one is dealing with those that have changed their name. until 1967, the library of congress used the latest name of such bodies with see references from the earlier names. both editions of aacr require that works issued under the earlier names be entered under those names and works issued under the latest name be entered under that name, the various names being connected by see also references. however, records in the oluc for earlier works cataloged before 1967 will show those works entered under a later name. because those later names are valid aacr2 headings in terms of their form, they are coded en (i.e., aacr2 validated) by the program, even though they may not be the right headings for the records to which they are attached. a good example of this problem is that of the "lutheran church--missouri synod." an earlier name of this religious body is "deutsche evangelisch-lutherische synode von missouri, ohio, und andern staaten." unfortunately, the authority record (example 4) does not even show that the earlier name is valid according to aacr2. the conversion program, on encountering the earlier name used as a heading, would change it to the later name and code that form as being the aacr2 heading. another example of the problem is:

chamber of commerce of the united states of america. international department

this is identified as the aacr2 form (example 5) but, in fact, the department has changed its name to "international division."
[example 4. oclc screen display (two screens): the lc authority record for "lutheran church--missouri synod."]

[example 5. oclc screen display (two screens): the lc authority record for "chamber of commerce of the united states of america. international dept."]

lc practice

another problem we have encountered is that of the literal-mindedness of the computer programs in matching like with like. this problem is compounded by inconsistencies in cataloging practice resulting from variations in lc cataloging practice. an example of this problem is that of the nigerian author chinua achebe. the heading "achebe, chinua" is marked as being aacr2 despite the fact that the authority record shows that he was born in 1930. lc's announced policy is to give dates "whenever the information is readily available," but only for headings established after december 1980. this restriction creates inconsistencies in lc practice that are hard to predict.
the result is that we often establish a personal heading with a date, only to discover that lc is not using it. the definition of "readily available" is clearly elastic and does not provide clear guidance to other libraries. it is irritating and occasionally burdensome but does not create a quantitatively serious problem.

one unfortunate result of lc's machine conversion of its authority file to aacr2 forms has been to make notes on the authority records harder to understand. this is because only headings and references were changed; notes were not affected. this means that the wording of the notes may refer to a state of affairs that has altered as a result of the aacr2 conversion. example 6 is the authority record for the aacr2 form of heading for the university of illinois prior to the change of name in 1966. the history note (field 667) incorrectly says that the heading for works published before 1966 is "illinois. university" (the pre-aacr2 form). since the aacr2 form as established by lc looks very much like the new name, "university of illinois at urbana-champaign," the authority card is very difficult to understand. nothing short of revising the note, and/or the use by lc of a less confusing qualifier than "(urbana-champaign campus)," will make the authority record intelligible.

an example of how lc practice has affected the oclc program adversely is in the area of the so-called compatible headings. these are instances of when lc has chosen to depart from the provisions of aacr2 for one reason or another. leaving aside the utility and morality of such a policy, it presents a considerable problem to those of us who use oclc records. the example that follows is of the worst of these "compatible" practices. lc has decided to ignore the common form of name for persons who are not "famous or published under an american imprint." 1 thus, the writer p. c. boeren would be recorded as "boeren, p. c.
(petrus cornelis), 1909" under the provisions of aacr2, but, because boeren is neither famous nor american, the "compatible" heading will be "boeren, petrus cornelis, 1909." this heading is not acceptable in an aacr2 catalog.

[example 6: oclc display, in four screens, of the lc authority record for the university of illinois (urbana-champaign campus). the fixed fields, the record number, and the subfield w codes are not reliably recoverable and are omitted, as are the subfield a markers within the note; the year of the first chicago name change is illegible. the legible content is as follows.]
110 20 university of illinois (urbana-champaign campus)
410 20 illinois industrial university
410 20 university of illinois
410 10 illinois. b university
410 10 urbana (ill.). b university of illinois (urbana-champaign campus)
410 10 illinois. b university (urbana-champaign campus)
510 20 university of illinois at chicago circle
510 20 university of illinois at the medical center
510 20 university of illinois (system)
510 20 university of illinois at urbana-champaign
510 20 university of illinois at congress circle, chicago
665 the illinois industrial university was chartered in 1867. in 1885 the name was changed to university of illinois and in 1966 to university of illinois at urbana-champaign. works by this body published before the change of name in 1966 are found under university of illinois (urbana-champaign campus). works published after that change of name are found under university of illinois at urbana-champaign. the chicago undergraduate division of the university of illinois was established in 1946. in [year illegible] the name was changed to university of illinois at congress circle, chicago, and in 1964 to university of illinois at chicago circle. works by this body published before the change of name in 1964 are found under university of illinois at congress circle, chicago. works published after that change of name are found under university of illinois at chicago circle. in 1966, the university of illinois at urbana-champaign, the university of illinois at chicago circle, and the university of illinois at the medical center, were reorganized into equal administrative campuses within a university system with a central administrative staff in urbana. works published by these bodies after the reorganization in 1966 are found under university of illinois at urbana-champaign. university of illinois at chicago circle. university of illinois at the medical center. university of illinois (system). subject entry: works about these bodies are entered under the name or names in existence during the latest period for which subject coverage is given. in the case where the required name is represented in this catalog only under a later form of the name, entry is made under the later form.
667 illinois industrial university.
example 6

furthermore, it is quite possible that if boeren's works are published in america or if lc suddenly decides that boeren is "famous," the heading will be changed.
this is an infrequently encountered problem for us but one where lc's peculiar policies have created problems that have nothing to do with oclc or aacr2.

conclusion

the problems that we have cited above are real but not numerically significant (except in the case of serials and multiple personal names, neither of which is under oclc's control). they are far outweighed by the tremendous value of the more than 40 percent of oclc headings that have been converted to their aacr2 form. the oclc conversion has made it possible for us to do aacr2 cataloging more quickly than in the period november 1979-december 1980. we have issued guidelines to our professional, paraprofessional, and clerical cataloging staff who deal with all the headings we encounter in using oclc (see appendix). problems such as those we have described are dealt with in our guidelines and, in practical terms, are now handled in day-to-day work. they may take some extra time, but overall our cataloging operation has been greatly speeded by oclc's conversion.

reference
1. cataloging service bulletin, no. 6:6 (fall 1979).

appendix
university of illinois library at urbana-champaign
copy cataloguing guidelines
authority records

lc authority records, now available on oclc, can be very helpful in determining the correct aacr 2 form of headings, and should be cited on authority cards we prepare, when we use them in establishing headings. the tag numbers used on authority records sometimes have different meanings from the numbers used on bibliographic records. the meanings are:
1xx heading
4xx see reference (i.e. from the form in this field to the form in the 1xx field)
5xx see also reference (i.e. from the form in this field to the form in the 1xx field)
6xx notes (e.g. the authority used by the lc cataloguer)
each field concludes with a w subfield, consisting of 24 characters indicating in coded form various types of information about the heading.
the 13th character, the 3rd past the six-character date, consists of one of five letters indicating the rules governing the form of heading in that field. the codes are:
c aacr 2
d compatible with aacr 2
b aacr, 1967 ed.
a earlier rules (e.g., ala rules of 1949, etc.)
n not applicable or not applied
here is an example of an lc authority record, omitting the fixed field and some of the references:
010 n 790558820
110 20 state university of new york at buffalo. w n008801115aacann----nnnn
410 10 buffalo. b university w n002791105aaaann----nnna
410 10 new york (state). b state university, buffalo. w n009801115aaaann----nnna
667 the following heading for an earlier name is a valid aacr 2 heading: university of buffalo. w n007791105aanann----nnnn
when oclc carried out its aacr 2 conversion project, the data about the rules encoded in subfield w was added to headings in bibliographic records, if those headings were altered by the conversion. for bibliographic records in oclc, subfield w contains 2 characters, each of which must be one of the following:
c (for aacr 2 heading)
d (for aacr 2 compatible heading)
m (for machine converted heading)
n (not applicable or not applied)
the first character applies to the name portion of the heading; the second, to the title portion. obviously, in many cases there is no title portion, in which case the second character will be n. the code m (machine converted heading) is used when a heading is altered directly by program, rather than being extracted from an authority record. an example would be the elimination of subfield k laws, statutes, etc.

1. use of subfield w in cataloguing
since oclc does not want member libraries to apply the letter codes in subfield w for their original input, the presence of a letter code in subfield w should always indicate an lc decision identifying an aacr 2 or aacr 2 compatible heading. supply subfield w for all cataloguing to be added to oclc's data base.
the codes to be used are given in illinet's information bulletin #92, from which this table is copied:
1 aacr 2 form found in on-line lc name-authority file
2 aacr 2 compatible form in on-line lc name-authority file
3 aacr 2 form supplied by inputting institution with copy in hand and piece not in hand
4 aacr 2 form supplied by inputting institution with piece in hand
5 author or title portion of heading not converted to aacr 2 form.
this subfield (#w) is always the last subfield in the field. it must contain a two character code. the first character applies to the name portion of the heading; the second character applies to the title portion of the heading. if the heading is a name heading and does not include a title portion, use "n" as the second part of the code. if the heading is a uniform title heading, use "n" as the first part of the code. examples:
700 10 day lewis, c. #q (cecil), #d 1904-1972 #w 1n
600 10 schmidt, h. r. #q (heinrich rudolf) #w 4n
130 00 bible. #p n.t. #s authorized. #f 1974. #w n4
accept headings coded c in subfield w as correct aacr 2 headings, unless the heading is for an author entered under surname who writes in a non-roman alphabet language. for such authors, use the form given only if it is a standard romanization of the name in the original alphabet. if a form other than the standard romanization is used, substitute the standard romanization, and trace an x ref. from the form coded c.

2. lc author headings without dates
lc recently announced that it will not add dates to a heading already established without dates, unless the dates are needed to resolve a conflict. when there is no conflict, the dates will be recorded in the authority record in a 6xx field, but will not be added to the heading. dates will be routinely added to newly established headings at the time the headings are established, if the information is readily available.
lc codes such headings c, not d, because aacr 2 does not require that a date be added to the heading, except to resolve a conflict. if such an lc authority record is available when a heading is being established, use the lc form, without adding dates to the heading, unless dates are needed to resolve a conflict in the new catalogue. record the dates on an authority card. if lc authority is not available when a heading is being established, use dates in the heading if the information is readily available. if, later, lc authority is found that omits date from the heading, do not change the heading as already established for the uiuc new catalogue. since records in oclc may contain headings without dates for persons we have established with dates, some conflicts will be generated. these should be resolved by catalogue maintenance staff, who will add dates in pencil to headings on new cards that lack dates, but are otherwise identical with headings in the new catalogue. such conflicts in the machine record will be cleaned up gradually, after fbr is up.

3. acceptable dn forms
headings coded d in authority records (dn in bibliographic records) are the aacr 2 "compatible" forms. in many cases, the difference from aacr 2 is trivial, and the form can therefore be used. in such cases, if lc authority is available, use the form as established by lc, and record the information on an authority card. if lc authority is not available when a heading is being established, follow aacr 2. if, later, lc authority is found that establishes a "compatible" form, do not change the form in the uiuc new catalogue to the lc "compatible" form. it will sometimes happen that "compatible" forms will be found on records in oclc (coded dn, usually). such headings may be used only if they fall into one of the categories listed below. this will sometimes result in "compatible" forms and true aacr 2 forms both being used in the new catalogue.
in some cases, the two forms can be interfiled; in other cases, catalogue maintenance staff will need to correct "compatible" headings in pencil. acceptable dn forms are:
a. lc will omit hyphens between forenames if the heading has been established without hyphens, even though rule 22.1d2 would require hyphens. use the lc form, if found. catalogue maintenance will interfile headings identical except for the presence or absence of hyphens.
b. lc will continue to place the abbreviation ca. after a date in the heading for a person, if the heading has already been established in that form, even though rule 22.18 specifies that the abbreviation should precede the date. use the lc form, if found. catalogue maintenance will interfile headings identical except for the placement of the abbreviation ca.
c. lc will not correct the language of an addition to a personal name heading; i.e. will not change to the language used in the person's works. (e.g., a heading already established as louis antoine, father will not be changed to louis antoine, pere, even though the latter is the author's usage.) use the lc form, if found. catalogue maintenance will correct conflicts in pencil, to the lc form.
d. lc will not change a personal name heading to a fuller form of the name, even if the shorter form is not predominant. use the lc form, if found. catalogue maintenance will correct conflicts created by personal name headings that vary in fullness to the form to which a "see" reference has been made. if there is no "see" reference, catalogue maintenance will refer the conflict to the appropriate cataloguing service.
e. lc will continue to use additions to surname headings supplied by cataloguers, for headings already established with such additions. use the lc form, if available.
catalogue maintenance will resolve conflicts by adding qualifiers in pencil to headings that are otherwise identical with the forms with qualifiers.
f. lc will continue to use titles of honor, address, or nobility with headings that have already been established with such titles, even though the authors do not use such titles. use the lc form, if found. catalogue maintenance will resolve conflicts by adding qualifiers in pencil to headings that are otherwise identical to the forms with the qualifiers.
g. lc will not use initial articles in uniform title and corporate headings, even when they are required by aacr 2. we will follow lc practice in this, and use the lc form when found. catalogue maintenance will interfile uniform title and corporate headings that are identical except for the presence or absence of initial articles.
h. lc will continue to use the abbreviations bp. and abp. for personal name headings that have already been established with those abbreviations used as qualifiers, instead of spelling out the qualifiers in full. use the lc form, if found. otherwise, follow aacr 2 and spell out "bishop" and "archbishop". catalogue maintenance will resolve conflicts by correcting in pencil to the form spelled out in full.
i. lc will not add terms of incorporation to corporate headings already established without them, nor delete them from corporate headings already established with them, even though lc interpretation of aacr 2 would require such adjustment. use the lc form, if available. otherwise, retain terms of incorporation in corporate name headings only if the term is an integral part of the name, or if, without the term, it would not be apparent that the heading is the name of a corporate body. catalogue maintenance will resolve conflicts by adding, in pencil, terms of incorporation to headings identical to established forms except for the absence of such terms.
j.
lc will not add geographic qualifiers to corporate headings established previously without such qualifiers, even though they have chosen to apply the option in rule 24.4 that allows qualifiers to be added when there is no conflict. use the lc form, if available. catalogue maintenance will resolve conflicts by adding qualifiers in pencil to headings identical to established headings except for the absence of such qualifiers.
k. lc will not reduce the hierarchy of far eastern corporate headings, established before 1981, even though aacr 2 rules would require that intervening superior bodies would be omitted from the heading. use the lc form, if available. catalogue maintenance will refer conflicts to the appropriate cataloguing agency for resolution. the asian library cataloguer is the final authority for such headings.
l. lc will not change the capitalization of acronyms and initialisms to conform to the usage of the corporate body, if the acronym has already been established with a different capitalization. use the lc form, if available. catalogue maintenance will resolve conflicts by interfiling acronyms and initialisms that are identical except for variations in capitalization.
m. lc will not supply quotation marks around elements in a corporate heading that has already been established without quotation marks, even though this varies from the usage of the body. use the lc form, if available. catalogue maintenance will resolve conflicts by interfiling headings identical except for the presence or absence of quotation marks.
n. if lc is attempting to resolve a conflict (i.e. two different people with identical author statements), and neither dates nor expanded initials are available to resolve the conflict, lc will add an unused name in parentheses to the heading if the information is available. e.g.:
established heading: smith, elizabeth
new author: elizabeth smith (new author's full name, ann elizabeth smith, is available)
lc heading for new author: smith, elizabeth (ann elizabeth)
use lc forms if found in name authority file. catalogue maintenance will refer problems to the appropriate cataloguing agency.

4. unacceptable dn forms
in a few cases, the aacr 2 "compatible" forms, coded d in authority records and dn in bibliographic records, are unacceptable in the uiuc library. instead, we will follow aacr 2 in constructing these headings, and record the lc form on authority cards when they are found. we will also make references from the lc forms, if they would file differently from the forms we use. for many of these, catalogue maintenance will have to refer conflicts to the appropriate cataloguing agency. in a few cases, catalogue maintenance can make the corrections on the cards. the unacceptable dn forms are:
a. lc will sometimes, but not always, continue to use headings established prior to 1981 with names spelled out in full, when the authors represent some of those names with initials. follow aacr 2 in constructing headings for these names. use initials in conformity with the authors' usage, and add the corresponding full names in parentheses, in subfield q, when the information is available. whenever an element in a compound surname or a first forename is represented by an initial, make a reference from the fuller form. usually, a reference will not be needed if a forename other than the first is represented by an initial.
b. lc will continue to add "pseud." to personal name headings already established with that qualifier. do not use the qualifier "pseud." when establishing personal name headings, and delete the term from oclc records that use it, including records added by lc. catalogue maintenance will resolve conflicts by lining out the qualifier "pseud." in headings.
c. lc will continue to add 20th century fl. dates to personal name headings already established with such dates.
do not use 20th century fl. dates when establishing personal name headings, and delete such dates from oclc records that use them, including records added by lc. catalogue maintenance will resolve conflicts by lining out 20th century fl. dates in headings.

5. 87x fields
one part of the aacr 2 conversion project by oclc was the addition of fields tagged 870, 871, 872, or 873. these fields contain the pre-aacr 2 forms of headings that were changed by the conversion. oclc participants can add 87x fields to records they enter into the data base. however, we will not supply these fields in our cataloguing.

6. authority cards
prepare authority cards whenever references are needed, and whenever an lc authority record for the heading is found, even if we do not use the lc form. citation of the authority record takes the form: "lc auth. rec." followed by the record number and the indication, in parentheses, of the code for rules given in subfield w. example:
akademie der wissenschaften und der literatur (mainz, germany)
lc auth. rec. 80076417 (cn)
if the lc form differs from the form used as the heading in uiuc, give the lc form in parentheses, following the subfield w code. example:
abrahamson, max w. (max william)
lc auth. rec. 78064817 (dn) (abrahamson, max william)
it will sometimes happen, when establishing the heading for a corporate body, that an lc authority record for a subdivision of the body you are establishing will give you the aacr 2 form of the body you are setting up. precede the citation to the authority record with the word "from". example:
united states. environmental protection agency. region v.
from lc auth. rec. 80159375 (cn) (the lc authority record is for the water division of region v)

7.
references
the basic rule for making references is given in aacr 2, rule 26.1: "whenever the name of a person or corporate body or the title of a work is, or may reasonably be, known under a form that is not the one used as a name heading or uniform title, refer from that form to the one that has been used. do not make a reference, however, if the reference is so similar to the name heading or uniform title or to another reference as to be unnecessary." ultimately, this decision depends on the cataloguer's judgement. usually, make a reference only if it would file differently from the established heading and from all other references. refer from variant forms found in works catalogued for this library, and in standard reference sources. lc authority records will often suggest useful references. however, we may need references not traced by lc, and we may not need all of the references lc traces. notice especially that lc authority records will often give a reference from the pre-aacr 2 form, even when it would file with the aacr 2 form. for example, the authority record for akademie der wissenschaften und der literatur (mainz, germany) traces a reference from akademie der wissenschaften und der literatur, mainz (the pre-aacr 2 form). these two forms would file together, so we do not need the reference. we will trace "see also" references from forms that can legitimately be used as headings, whether or not they have been used yet in the uiuc library. we will no longer observe the former restriction, which allowed "see also" references to be made only if both headings had been used. for further information on authority records and references, see the cataloguing manual, section a79.
aw:lgo

arnold wajenberg is principal cataloger and michael gorman is director, technical services, at the university of illinois library.
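the subfield w conventions described in the appendix can be sketched in a few lines of python. this is only an illustrative sketch, not part of the guidelines or of any oclc or lc software: the function and table names are hypothetical, and the only assumptions about the data are the ones stated in the appendix (a 24-character authority subfield w whose 13th character is the rules code, and a 2-character bibliographic subfield w whose first character applies to the name portion and whose second applies to the title portion).

```python
# Code tables copied from the appendix; all names below are hypothetical.
AUTHORITY_RULES = {
    "c": "aacr 2",
    "d": "compatible with aacr 2",
    "b": "aacr, 1967 ed.",
    "a": "earlier rules",
    "n": "not applicable or not applied",
}
BIB_CODES = {
    "c": "aacr 2 heading",
    "d": "aacr 2 compatible heading",
    "m": "machine converted heading",
    "n": "not applicable or not applied",
}


def authority_rules(subfield_w: str) -> str:
    """Decode the rules code of a 24-character authority-record subfield w."""
    if len(subfield_w) != 24:
        raise ValueError("authority subfield w must have 24 characters")
    code = subfield_w[12]  # 13th character, the 3rd past the six-character date
    if code not in AUTHORITY_RULES:
        raise ValueError(f"unknown rules code: {code!r}")
    return AUTHORITY_RULES[code]


def bib_codes(subfield_w: str) -> tuple[str, str]:
    """Decode a 2-character bibliographic subfield w as (name, title) meanings."""
    if len(subfield_w) != 2:
        raise ValueError("bibliographic subfield w must have 2 characters")
    for code in subfield_w:
        if code not in BIB_CODES:
            raise ValueError(f"unknown code: {code!r}")
    name_code, title_code = subfield_w
    return BIB_CODES[name_code], BIB_CODES[title_code]


if __name__ == "__main__":
    # 110 field of the state university of new york at buffalo example
    print(authority_rules("n008801115aacann----nnnn"))
    # a "compatible" name heading with no title portion
    print(bib_codes("dn"))
```

applied to the buffalo example record, the 110 field decodes as an aacr 2 heading and the two 410 references as earlier-rules forms, which agrees with the way the guidelines read that record. the numeric codes 1 to 5 used for original input are a separate local scheme and are not handled here.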
gathering strength to combat access inequality: how a small rural public library supported virtual access for public school students, staff, and their families
public libraries leading the way
julie lane
information technology and libraries | june 2022 https://doi.org/10.6017/ital.v41i2.15161
julie lane (jlane@peclibrary.org) is technology resource centre coordinator and educational resource consultant, county of prince edward public library and archives. © 2022.

prince edward county (pec) is located east of toronto and covers approximately 1,050 square kilometers. pec is a part of the hastings prince edward district school board (hpedsb) and has a total of 6 public schools, one catholic school, and one private school. the other county serviced by our school board is hastings county. the county of prince edward public library (cpepl) system of 6 branches services just under 25,000 residents and countless seasonal visitors during the tourism season. our public school board services approximately 15,000 students across 7,220 square kilometers and 39 in-person schools and a k-10 virtual school across the two counties. starting off a technology column with a bunch of statistics is not exactly how i figured i would write this. however, context is key when discussing equity and access; and in this piece, i intend to highlight how both of those are made significantly easier to achieve for community stakeholders, with the presence of technology and education. when the stay-at-home orders were announced in march 2020 due to the covid-19 pandemic, we knew that we would not be able to hold our scheduled and planned public library programs. we turned to live streaming story times, maker programs, and author visits, all using what equipment we had on hand: tablets, laptops, and the internet.
once it became clear that students in the public schools would not return to in-person learning within any short amount of time, all school boards in ontario ensured that enough chromebooks were purchased so that every student had their own dedicated device, with the assumption that providing a device meant all students could participate in remote learning. teachers rushed to transition their teaching plans to an online format; school administrators scrambled to schedule safe device pick-ups for students; and parents were not only juggling professional responsibilities and parenthood, but now teaching and tech support. although school boards provided tools to meet the “classroom” requirements, they could not ensure that every single student had access to a high-speed internet connection, nor could they offer school library access remotely. this is where the cpepl was able to offer support. the global shutdown had a significant impact on the relationship that the cpepl had with the schools in our county. a large focus of mine was to rebuild those working relationships to support students, staff, and families, and ultimately demonstrate in actionable ways how the local public library system was there for them. one immediate way i thought we could demonstrate support was through lending our wi-fi hotspots. hotspot lending programs through public libraries have gained popularity over the last few years. although our program had been in place for nearly 5 years, i am always surprised at the number of people that do not realize it is an available resource. with that in mind, i persistently reached out to the school administrators in our area and set up meetings to discuss how our borrow the internet program could benefit those working remotely without reliable internet.
wait lists for our 9 available hotspot devices drastically increased, but our patron community was incredibly supportive of our students and would frequently request that their loan, which is at maximum 7 days in length, be passed to a student. though connecting families with internet hotspots was helpful for the required online learning, we could not fill the gap completely. if we had an unlimited communications budget, the situation would have been easily remedied, but, as we all know in the library world, budgets can be very tight. this fact pushed us to find creative ways to bring as many resources as possible to the students, staff, and families in our community. to broaden the reach to individual schools (and staying persistent with that outreach), i focused on not only ensuring that school communities knew what physical resources the library had, but also what electronic resources were available. these conversations and emails with school administrators led me to get in contact with the curriculum coordinator at the board office. this connection was a complete game changer. instead of us, as a public entity outside of the school community, contacting individual schools and trying to build relationships with teachers, librarians, and administrators, we had the person who oversaw all of the school librarians, library technicians, and curriculum development for the k-8 grades on our side. the coordinator was on board to help us make the desired connections with the schools in a number of ways. she put us in contact with the curriculum coordinator for the secondary grades (9-12) and our program and service list was sent from the board office to every teacher, principal, school librarian, and library technician in prince edward county.
we were then able to set up a meeting with the coordinator of assistive technologies for the board, which set us on a track to completely revamp how we marketed and allocated our resources to schools. it became clear in our first conversation that we needed to get students connected with their public libraries as quickly and efficiently as possible. with students split between in-person learning, virtual learning, or a combination of the two, with still minimal to no access to school library borrowing, the online resources of the public library system seemed like the perfect solution. not only would connecting students, staff, and their families with their local public library be a way to get everyone reading, but we were fulfilling the opportunity to ensure that everyone had genuine and equitable access. what the school board had observed was that the required shift to remote learning made the inequality of literature access glaringly obvious. students who relied on their school library for reading were not getting that opportunity and students who had individual education plans were jumping through hoops to get digital copies of material. so though everyone had a school supplied chromebook, not everyone had the same access. this is where public library subscriptions to hoopla and libby came to the rescue for providing current and popular literature in a variety of electronic formats for students to immediately access for both course reading and leisure enjoyment. connecting with like-minded, growth- and education-oriented people is incredibly empowering. the curriculum coordinators at the board office were so enthusiastic about connecting students, staff, and families in our school board with their public library that it made the next parts of the process not only successful, but fun as well! the curriculum coordinators and i created a presentation that we brought first to school administrators in prince edward county.
having public library advocacy come from the school board was incredibly influential and a big step toward issuing library cards to students. once we had buy-in from the school administrators, we circulated registration forms for families to fill out and get everyone in their household public library access. we found that the easiest way to do this was using google forms. it was simple for parents to fill out and easy for library staff to glean the required information for card registration. since the library was also working with the virtual school, we needed to be able to issue library cards even if some students were not in our catchment area. it was common for virtual classes to consist of students from the smallest village in pec and all the way up to the northernmost part of hastings county, a full 3 hours’ drive away. cpepl was able to accommodate this need. pec is a tourist destination and frequently issues cards for visitors staying in the area for an extended period of time under the rule of if you “work, live, or play” in pec, you are eligible for a public library card. once library cards were set up or renewed for all families who requested them through the google form, i got to work teaching students and staff how to access library resources. after communicating with the curriculum staff and public school administrators, it was decided that creating an information presentation on getting started with hoopla was the best course of action. hoopla is an incredibly intuitive application in regards to the format possibilities (ebooks and audiobooks) as well as adjustable features within each format. the available settings and adjustment options make the reading experience as comfortable and accessible as possible for users.
also, since there is no wait time to borrow materials, this allowed entire classes learning remotely to all check out the same title and read together. the material presented to students was easy to understand and interactive. the session provided ample time for students to follow along and test each feature in the hoopla app with their own individual book selections.
the best part? this presentation was just the starting point. while we were only able to schedule and virtually deliver this presentation at two in-person schools, the other five schools in pec and a number of primary classes in the virtual school still participated in the google form for library card registration. teachers started asking what else the public library had to offer to enhance the curriculum delivery with additional resources. many community teachers were reminded of the public library’s services and resources (beyond just hoopla) and reached out for class visits or access to materials. other schools outside of our prince edward county catchment reached out and connected with their local public libraries, or vice versa.
we are still working to develop ways to meet the needs of students, staff, and their families through the public library. some schools in the northern area of the region have students coming from multiple different public library catchment areas, and most of these libraries do not have the same resources as others, especially in the case of smaller systems. this posed an issue of equitable access for students: why should some students in the class have access to library online resources, and some not because they come from different/smaller communities? we were able to mitigate this issue with the virtual school, but for students attending in-person learning, we could not give library cards to every student in the school board.
thankfully, another public library system in our area stepped up their access to offer virtual library access to any student or teacher in hastings county (so everywhere except prince edward county). this recognition of the importance of equitable access enabled students to not only regain access to a public library system, but it also ensured that all students could access books in the way that best suited them.
when i ask a class if listening to an audiobook counts as reading, it amazes me that the majority of the class say “no.” or if i ask students whether they have ever read an ebook, some say it is not a “real” book. these comments and notions are not only untrue, but they are also exclusionary. countless students need other formats than just printed materials. how many would benefit from listening to an audiobook along with reading a printed version? how many students dislike reading because it is just hard to see the words, but if the text were more spaced out, or in a different font, it would make all the difference? how many times is a student not able to access a book they want because all available copies are already checked out at their school library? these are issues students in the classes i work with face. having a public library card can significantly ease these barriers to access.
all in all, we processed hundreds of card requests and renewals and were able to powerfully illustrate to teachers how they could meaningfully integrate public library resources into their classrooms, either virtually or physically. our requests for library visits came back up to pre-pandemic levels, but we were working with more schools than we had previously. teachers were, and still are, reaching out and asking if we can get extra copies of books, or if we can lead virtual novel studies.
one of our more popular pieces of progress is the integration of our coding programs with other subjects. currently, i am running a ukulele program where students are writing group arrangements using binary code as the basis for composition. we have classes doing art projects with robotics and integrating math learning objectives. we have done virtual story time and connected the story to creating scratch programs. the possibilities are endless, and now that we once again have the interest from teachers, we are working with them to support their students and all the learning that comes with incorporating technology and maker-thinking into a classroom environment. the momentum has not let up, and we are beyond thrilled.
our communities and local school board have embraced the reality that public libraries are more than just books. public libraries are a critical part of any community and have the power to be a meaningful component of education at all levels. having schools and all educational stakeholders using public library services not only broadens the reach of a public library, but also broadens our advocacy potential. we know there is still a long way to go in terms of genuine equitable access, especially when it comes to technology. internet connectivity and technology literacy are just the tip of the iceberg, but when organizations support each other to truly serve their community, collectively, that is how you make change.
articles
business intelligence in the service of libraries
danijela tešendić and danijela boberić krstićev
information technology and libraries | december 2019 | https://doi.org/10.6017/ital.v38i4.10599
danijela tešendić (tesendic@uns.ac.rs) is associate professor, university of novi sad. danijela boberić krstićev (dboberic@uns.ac.rs) is associate professor, university of novi sad.
abstract
business intelligence (bi) refers to methodologies, analytical tools, and applications used for data analysis of business information. this article aims to illustrate an application of bi in libraries, as reporting modules in library management systems are usually inadequate for a comprehensive business analysis. the application of bi technology is presented as a case study of libraries using the bisis library management system in order to overcome shortcomings of an existing reporting module. both user requirements regarding reporting in bisis and already existing transactional databases are analysed during the development of a data warehouse model. based on that analysis, three data warehouse models have been proposed. also, examples of reports generated by an olap tool are given. by building the data warehouse and using olap tools, users of bisis can perform business analysis in a more user-friendly and interactive manner. they are not limited to predefined types of reports. librarians can easily generate customized reports tailored to the specific needs of the library.
introduction
organizations usually have a vast amount of data which increases on a daily basis. the success of an organization is directly related to its ability to provide relevant information in a timely manner. an organization must be able to transform raw data into valuable information that will enable better decision-making.1 for this reason, it is impossible to imagine an organization without an efficient reporting module as a part of its management information system.
if we put libraries in a business context, they are very similar to any other organization. the common characteristic of each is that they have high demand for a variety of statistical reports in order to support their business. a library management system uses a transactional database to store and process relevant data. this database is designed in accordance with the main functionalities of the system.
information used to make strategic decisions is usually obtained from historical and summarized data. however, the database model may have a complex structure and may not be suitable for performing analytical queries that are often very complex and involve aggregations. execution of those queries may be a time-consuming and resource-intensive process that can decrease performance as well as the availability of the system itself. also, creating such queries can require advanced it knowledge.
these problems can be overcome by developing business intelligence systems. business intelligence (bi) refers to methodologies, analytical tools, and applications used for data analysis of business information. bi gives business managers and analysts the ability to conduct appropriate analyses. by analyzing historical and current data, decision-makers get valuable insights that enable them to make better, more-informed decisions. bi systems rely on a data warehouse as an information source. the data warehouse is a repository of data usually structured to be available in a form ready for analytical processing activities.2 business intelligence systems do not exist as ready-made solutions for each organization, but need to be built in accordance with the characteristics of each organization using the appropriate methodology.
this article proposes a data warehouse architecture and usage of olap tools in order to support bi in libraries. the application of bi technology is illustrated through a case study of libraries using the bisis library management system. the first step in implementation of bi was the creation of a data warehouse model considering data that exist in bisis and requirements regarding reporting.
after the data warehouse model had been created, data were loaded into the data warehouse using olap tools. olap tools were also used for visualization of data stored in the data warehouse.
reporting in bisis
the bisis library management system has been in development since 1993 by the university of novi sad, serbia. currently, the bisis community comprises over forty medium-sized libraries in serbia.3 the primary modules of the bisis system include cataloguing, reporting, circulation, opac, bibliographic data interchange, and administration. bisis supports cataloguing according to unimarc and marc 21 formats using an xml editor for bibliographic material processing.4 the bisis search engine is implemented with a lucene engine.5 bisis supports z39.50 and sru protocols for the search and retrieval of bibliographic records.6 those protocols are also used for developing a bisis service for searching and downloading electronic materials by the audio library system for visually impaired people.7 in addition, bisis allows sharing of bibliographic records with the union catalogue of the university of novi sad.8 the circulation module features all standard activities for managing users: registration, charging, discharging, searching users and publications, and generating different kinds of reports, as well as user reminders.9
the reporting module of bisis is implemented using the jasperreports tool.10 however, this module has some limitations due to the fact that bisis works only with a transactional database and does not cope well with complex reports. firstly, in order to generate reports regarding library collections, it is necessary to process all bibliographic records stored in that transactional database. this activity significantly burdens the system and reduces its performance. to avoid this, reports are prepared in advance outside working hours, usually at night. consequently, those reports include only data collected before report generation.
creating reports in this manner greatly reduces system load and speeds up presentation of the reports because they are already generated. however, some reports, such as those related to the financial aspects of the library (e.g., the number of new members and the balance at the end of the day), need to be created in real time. because they are executed in real time, those reports are inefficient and affect the performance of the entire system.
the next limitation of this reporting module is that it has a set of predefined reports and the creation of new reports requires additional development. in the current deployment it is not possible to add new reports without engaging software developers. also, an additional obstacle is the fact that the data for generating reports are obtained from two different data sources (described in more detail in the following sections). for example, the report regarding the number of borrowed books by the udc (universal decimal classification) groups requires data about the udc groups from xml documents and data about book borrowing from the relational database. generating this kind of report cannot be done in a timely and efficient manner.
taking into account these shortcomings of the reporting module, it can be concluded that the application of business intelligence, primarily data warehouse and olap tools, could improve analytical data processing in the libraries using bisis.
related work
one of the basic components of the business intelligence system is a data warehouse. a data warehouse is a centralized database that stores historical data. those data are in principle unchangeable and they are obtained by collecting and processing data from various data sources. data warehouses are used as support for making business decisions.11 the data sources for a data warehouse can be diverse and may include transactional databases and different file formats.
the process of integrating data from different data sources into a single database is called data warehousing. data warehousing includes extracting, transforming, and loading (etl) data into a data warehouse.12 the goal of data warehousing is to extract useful data for further analysis from the huge amount of data that is potentially available.
there are different approaches to modeling a data warehouse. these approaches can be classified in three different paradigms according to the origin of the information requirements: (1) supply-driven, (2) demand-driven, and (3) hybrids of these. a supply-driven approach is based on data that exist in the transactional database. these data are analyzed to determine which data are the most relevant for making business decisions, or which data should be part of the data warehouse. alternatively, a demand-driven approach is based on the end-user requirements, which means that the data warehouse is modeled in a way that makes it possible to get answers to the questions asked by the users. the third approach is a hybrid approach and it combines the previous two approaches in the process of data warehouse modelling. the hybrid approach attempts to diminish the shortcomings of the previous two approaches. in the case of a supply-driven approach, the data warehouse will probably not meet the requirements of the end users, while in the demand-driven approach there may be no data to fill the created data warehouse. in an article published in 2009, romero and abelló gave an overall view of the research in the field of dimensional modeling of data warehouses.13
various examples of implementation of data-warehouse solutions in libraries can be found in the literature. in 2014, siguenza-guzman et al. described the design of a knowledge-based decision support system based on data-warehouse techniques that assists library managers in making tactical decisions about the optimal use and leverage of their resources and services.
when designing the data warehouse, the authors started from the requirements of the end users (demand-driven approach) and extracted data from heterogeneous sources.14 a similar approach has been used by yang and shieh, who started from the reports needed by public libraries in taiwan and through an iterative methodological approach modeled a data warehouse that meets all their reporting requirements.15
unlike the previously described articles where a demand-driven approach was used, we applied a hybrid approach to modeling the data warehouse. we analyzed data sources that exist in bisis following a supply-driven approach, but we also analysed user requirements to identify the facts and dimensions for the dimensional data warehouse model.
modeling the data warehouse
in order to implement a data warehouse solution, the first step is to design a data model suitable for analytical data processing. a data warehouse usually stores data in a relational database and organizes them in so-called dimensional models. unlike standard relational database models, those models are denormalized and provide easier data visualization. data can be presented as a cube with three, four, or n dimensions. analyzing such data is more intuitive and user-friendly. the dimensional model contains the following concepts: dimensions, facts, and measures. dimensions represent the parameters for data analysis while facts represent business entities, business transactions, or events that can be used in analyzing business processes.
the most commonly used model in dimensional modeling is the star model. after identifying the facts and dimensions, a dimensional model almost always resembles a star, with one central fact and several dimensions that surround it. dimensions and facts are usually implemented as tables in the relational database.
dimension tables contain primary keys and other attributes. fact tables contain numerical data as well as the keys of the dimension tables. the measure is a numerical attribute of the fact table and can be obtained by aggregating data by certain dimensions.
there are several approaches to modeling a data warehouse and we followed a hybrid approach to design the dimensional models presented in this article. this implies that both the existing data sources and the user requirements were considered while designing the final data-warehouse models. that modeling process involved the following activities:
1. analysis of existing data sources in bisis with identification of possible facts and dimensions,
2. analysis of user requirements regarding reporting,
3. refactoring of the facts and dimensions in accordance with the user requirements, and
4. design of dimensional models.
analysis of data sources in bisis
the first step in creating a data warehouse is an analysis of existing data sources. the bisis system uses two different data sources. bibliographic records are stored in xml documents, while circulation data, as well as holdings data regarding the items that are circulated, are stored in a relational database.
in 2009, tešendić et al. described the bisis circulation database model.16 that model describes data about individual and corporate library members. data about members include personal data, membership fees, and information about a member’s borrowed and returned items.
bibliographic data in bisis are presented in unimarc format. dimić and surla in 2009 described the model for bibliographic records used in bisis.17 a bibliographic record is modeled as a list of fields and subfields. a field contains a name, values of the indicators and a list of subfields. a subfield contains a name and a value of that subfield.
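the record structure just described (a record is a list of fields; a field has a name, indicators, and subfields; a subfield has a name and a value) can be sketched in a few lines of python. this is only an illustrative sketch and not the bisis implementation: the class names, the `to_publication_row` helper, and the sample record are hypothetical, and the tags used (200$a for the title, 101$a for the language, 675$a for the udc number) are unimarc tags picked here for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Subfield:
    name: str   # one-character subfield code, e.g. "a"
    value: str


@dataclass
class Field:
    name: str        # three-character field tag, e.g. "675"
    ind1: str = " "  # first indicator
    ind2: str = " "  # second indicator
    subfields: List[Subfield] = field(default_factory=list)


@dataclass
class Record:
    record_id: int
    fields: List[Field] = field(default_factory=list)

    def first_value(self, field_name: str, subfield_name: str) -> Optional[str]:
        """Return the first matching subfield value, or None if absent."""
        for fld in self.fields:
            if fld.name != field_name:
                continue
            for sub in fld.subfields:
                if sub.name == subfield_name:
                    return sub.value
        return None


def to_publication_row(record: Record) -> Dict[str, Optional[object]]:
    """Flatten a nested record into the few attributes a report might need."""
    return {
        "record_id": record.record_id,
        "title": record.first_value("200", "a"),     # assumed tag for title
        "language": record.first_value("101", "a"),  # assumed tag for language
        "udc": record.first_value("675", "a"),       # assumed tag for udc number
    }


# hypothetical sample record, not taken from any real catalogue
sample = Record(
    record_id=42,
    fields=[
        Field("101", "0", " ", [Subfield("a", "srp")]),
        Field("200", "1", " ", [Subfield("a", "sample title")]),
        Field("675", " ", " ", [Subfield("a", "821.163.41")]),
    ],
)
row = to_publication_row(sample)
```

flattening a nested record into a single row like this is the kind of transformation that is needed before bibliographic data can be placed in a relational dimension table.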
the data described by that model are stored in xml documents because the bibliographic record structure is not suitable for relational modeling. that structure is more in line with the document-oriented data storage approach.
analysis of user requirements
one of the essential functionalities of information systems, including library management systems, is to provide various statistical reports that should help the management of the library to make better business decisions. user requirements related to analytical processing in bisis can be grouped into several categories.
the first category consists of requirements regarding reports on the library collections. examples of reports from this category are:
• number of publications per language for a certain period of time;
• number of publications by departments;
• number of new publications for a certain period of time; and
• number of publications by udc groups.
the second category consists of requirements related to the circulation of library resources. examples of such reports are:
• number of borrowed items by member category;
• number of borrowed items by language of publication;
• number of borrowed items by departments;
• the most popular books; and
• the most avid readers for a certain period.
the third category consists of requirements related to the reports on financial elements of the library's business. some of the reports are:
• number of new members on a daily basis with a financial balance;
• number of members by membership category and gender; and
• number of members per department.
analyzing user requirements, it was perceived that a new data warehouse has to be created using data from both data sources. this means that appropriate transformations of data from the relational database as well as from the bibliographic records documents need to be performed.
data warehouse models
taking into account the reporting requirements as well as the data that exist in bisis, appropriate dimensional models were designed. the proposed dimensional models were designed to meet all the needs for analytical processing, as well as to enable flexibility of the reporting process in bisis. for each of the observed groups of reports, a dimensional model was created as described below.
model describing library collection data
a dimensional model of the bisis data warehouse used for analytical processing of the library collection data is shown in figure 1. the data from this model are used to generate reports on the library collection. examples of such reports are accessions register, number of items by udc group, number of items by departments, etc. in generating all these reports, an acquisition number of an item has the main role and all reports are created either by counting the acquisition numbers or by displaying the acquisition numbers along with other data related to that item. therefore, the acquisition number represents the measure in this dimensional model.
the central table in the model is the item table and it presents a fact table. this table contains the acquisition number and foreign keys from dimension tables. all other tables in the model are dimension tables. the publication table represents a dimension table containing bibliographic data from bibliographic records. only data that are needed for reports are extracted from bibliographic records and stored in this table. those data refer to the name of the author, the title of the publication, the publication’s isbn and udc number, the number of pages, keywords, and an identification number for the bibliographic record in the transactional database.
the acquisition table represents a dimension that describes the publication's acquisition data such as a retail price, the name of the supplier, and the invoice number. the location table describes departments within the library where an item is stored. the status, publisher, language, and udc_group tables relate to information about the status of an item, publisher, language of the publication, and udc group to which an item belongs. the date and year tables represent the time dimensions. data in the date table are extracted from the date of an item acquisition and data in the year table are extracted from the publishing year.
figure 1. dimensional model describing library collection data.
model describing library circulation data
a dimensional model of the bisis data warehouse used for the analytical processing of library circulation data is shown in figure 2. data from this model are used for generating statistical reports regarding usage of library resources. examples of such reports are the number of borrowed publications according to different criteria (such as user categories, language of publication, departmental affiliation of the user who borrowed the publication, etc.). these data can answer questions about the most popular books or the readers with the highest number of borrowed books. similar to the previous reporting group, the acquisition number of the item which was borrowed has the main role in generating those reports. all reports from this group are created by counting acquisition numbers of borrowed items and displaying data related to those checkouts. therefore, in this dimensional model, the acquisition number is a measure.
the central table in the model is the lending table and is presented as a fact table. this table contains the acquisition number of the borrowed item and foreign keys from the dimension tables. all other tables in the model are dimension tables.
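to make the star layout concrete, the following is a minimal sketch of a lending fact table surrounded by a few dimension tables, built with python's standard sqlite3 module. it is a simplified illustration under stated assumptions rather than the actual bisis schema: only three dimensions are shown, and all column names, the `date_dim` table name, and the sample rows are hypothetical. the report is produced in the way described above, by counting acquisition numbers grouped by one dimension.

```python
import sqlite3

# hypothetical, simplified star schema: one fact table (lending) and
# three dimension tables; the real model has many more dimensions
SCHEMA = """
CREATE TABLE category (category_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE language (language_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE date_dim (date_id INTEGER PRIMARY KEY, day TEXT);
CREATE TABLE lending (
    acquisition_number TEXT,
    category_id INTEGER REFERENCES category(category_id),
    language_id INTEGER REFERENCES language(language_id),
    date_id     INTEGER REFERENCES date_dim(date_id)
);
"""

# dimensions that the report function is allowed to group by
REPORT_DIMENSIONS = {"category", "language"}


def build_demo_warehouse() -> sqlite3.Connection:
    """Create an in-memory warehouse with a handful of made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO category VALUES (?, ?)",
                     [(1, "student"), (2, "pensioner")])
    conn.executemany("INSERT INTO language VALUES (?, ?)",
                     [(1, "serbian"), (2, "english")])
    conn.executemany("INSERT INTO date_dim VALUES (?, ?)",
                     [(1, "2019-03-01"), (2, "2019-03-02")])
    conn.executemany("INSERT INTO lending VALUES (?, ?, ?, ?)",
                     [("A-1", 1, 1, 1), ("A-2", 1, 2, 1), ("A-1", 2, 1, 2)])
    return conn


def borrowed_items_by(conn: sqlite3.Connection, dimension: str):
    """Count borrowed items (acquisition numbers) grouped by one dimension."""
    if dimension not in REPORT_DIMENSIONS:
        raise ValueError(f"unsupported dimension: {dimension}")
    query = (
        f"SELECT d.name, COUNT(l.acquisition_number) "
        f"FROM lending l JOIN {dimension} d "
        f"ON d.{dimension}_id = l.{dimension}_id "
        f"GROUP BY d.name ORDER BY d.name"
    )
    return conn.execute(query).fetchall()


warehouse = build_demo_warehouse()
by_category = borrowed_items_by(warehouse, "category")
by_language = borrowed_items_by(warehouse, "language")
```

because every report in this group joins the fact table to one or more dimensions and aggregates, a new report of this kind needs only a different join and group by, not a new program.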
the publication, publisher, year, acquisition, udc_group, status, and language tables contain data from bibliographic records and the content of these tables has already been explained. the member, membershiptype, category, education, and gender tables represent the dimension tables containing information about library users. these data are only a subset of circulation data from the transactional database. the location table describes departments within the library where items are borrowed. the date table represents the time dimension. the data in the date table are derived from the date of borrowing and the date of discharge of an item.
figure 2. dimensional model describing library circulation data.
model describing members’ data
a dimensional model of the bisis data warehouse used for the analytical processing of members’ data is shown in figure 3. data from this model are used for generating statistical reports on library members, as well as for generating financial reports based on membership fees. examples of such reports are the number of members according to different criteria (such as department of registration, member category, type of membership, gender, or education level). also, this report group contains reports that include a financial balance (for example, a list of members with membership fees in a certain time period). the membership fee has the main role in generating these reports. all reports from this group are generated by counting or displaying members who have paid a membership fee or summarizing membership fees. therefore, in this dimensional model, membership fee is a measure.
the main table in the model is the membership table and it presents a fact table. it contains the membership fee, which is the measure, and foreign keys from the dimension tables.
all other tables in the model are dimension tables. the member, membershiptype, category, education, and gender tables represent the dimension tables that contain information about library members and the content of these tables was previously described. the location table describes departments within the library where user registration is performed. the date table represents the time dimension. data in the date table are based on the registration date and the date of the membership expiration.
figure 3. dimensional model describing library members.
true value of a data warehouse
in the previous sections, we presented models of a data warehouse, but those models are unusable if they are not implemented and populated with data needed for business analysis. extracting, transforming, and loading (etl) processes are responsible for reshaping the relevant data from the source systems into useful data to be stored in the data warehouse. etl processes load data into a data warehouse, but that data warehouse is still only storage for those data. a real-time and interactive visualisation of those data will show the true benefits of data warehouse implementation in various organisations including libraries.
to load as well as to analyze and visualize large volumes of data in data warehouses, various online analytical processing (olap) tools can be used.18 the usage of olap tools does not require a lot of programming knowledge in comparison to tools used for querying transactional databases. the interface of olap tools should provide a user with a comfortable working environment to perform analytical operations and to visualize query results without knowing programming techniques or the structure of the transactional database.
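the extract, transform, and load steps mentioned above can be illustrated with a small hand-written sketch that fills a membership fact table from a transactional source. this is not how the authors implemented their etl, and every table name, column name, code mapping, and sample value below is an assumption made for illustration; the date dimension is also simplified to a plain text column.

```python
import sqlite3


def make_source() -> sqlite3.Connection:
    """Build a tiny stand-in for a transactional circulation database."""
    src = sqlite3.connect(":memory:")
    src.execute(
        "CREATE TABLE memberships "
        "(user_id INTEGER, category TEXT, gender TEXT, fee REAL, paid_on TEXT)"
    )
    src.executemany(
        "INSERT INTO memberships VALUES (?, ?, ?, ?, ?)",
        [
            (1, " Student", "F", 500.0, "2019-01-10"),
            (2, "student", "m", 500.0, "2019-01-10"),
            (3, "Pensioner", None, 300.0, "2019-01-11"),
        ],
    )
    return src


def make_warehouse() -> sqlite3.Connection:
    """Create an empty, simplified membership star (hypothetical columns)."""
    dw = sqlite3.connect(":memory:")
    dw.executescript(
        """
        CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE gender   (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE membership (
            member_id   INTEGER,
            category_id INTEGER REFERENCES category(id),
            gender_id   INTEGER REFERENCES gender(id),
            paid_on     TEXT,
            fee         REAL
        );
        """
    )
    return dw


def get_or_create(dw: sqlite3.Connection, table: str, name: str) -> int:
    """Return the key of `name` in a dimension table, inserting it if missing."""
    row = dw.execute(f"SELECT id FROM {table} WHERE name = ?", (name,)).fetchone()
    if row:
        return row[0]
    return dw.execute(f"INSERT INTO {table} (name) VALUES (?)", (name,)).lastrowid


def run_membership_etl(source: sqlite3.Connection, dw: sqlite3.Connection) -> int:
    """Extract source rows, clean them, and load them into the fact table."""
    gender_labels = {"m": "male", "f": "female"}
    # extract
    rows = source.execute(
        "SELECT user_id, category, gender, fee, paid_on FROM memberships"
    ).fetchall()
    loaded = 0
    for user_id, category, gender, fee, paid_on in rows:
        # transform: normalise free-text values into dimension members
        category_id = get_or_create(dw, "category", category.strip().lower())
        gender_name = gender_labels.get((gender or "").lower(), "unknown")
        gender_id = get_or_create(dw, "gender", gender_name)
        # load: one fact row per paid membership fee
        dw.execute(
            "INSERT INTO membership VALUES (?, ?, ?, ?, ?)",
            (user_id, category_id, gender_id, paid_on, fee),
        )
        loaded += 1
    dw.commit()
    return loaded


source, warehouse = make_source(), make_warehouse()
loaded = run_membership_etl(source, warehouse)
fees_by_category = warehouse.execute(
    "SELECT c.name, SUM(m.fee) FROM membership m "
    "JOIN category c ON c.id = m.category_id "
    "GROUP BY c.name ORDER BY c.name"
).fetchall()
```

the transform step is where inconsistencies in the source (stray whitespace, mixed case, missing codes) are resolved, so that each distinct category or gender appears only once in its dimension table.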
there are various olap tools available on the market.19 when choosing an olap tool to be used in an organization, there are several important criteria to consider: the duration of query execution, user-oriented interface, the possibility of interactive reports, price of tool, automation of the etl process, etc.20pentaho bi system is one of the open-source olap tools which satisfies most of those criteria. among various features, pentaho supports creation of etl processes, data analysis, and reporting.21 implementation of etl processes can be a challenging task primarily because of the nature of the source systems. we used pentaho tool to transform data from bisis to the data warehouse, as well as to visualize data and generate statistical reports. etl processes modeling after creating a data-warehouse model, it is necessary to load data into the data warehouse. the first step in that process is to extract data from the data sources. those data may not be in accordance with the newly created data-warehouse model and appropriate transformations of data may be needed before loading. regarding the structure of the data sources, transformations can be implemented from scratch, or by using dedicated olap tools. both techniques are used in the development of our data warehouse. transformations that required data from bibliographic records were implemented from scratch because of complex data structure, while transformations that processed data from relational database are implemented using pentaho data integration (pdi) tool. pdi is a graphical tool that enables designing and testing etl processes without writing programming code. figures 4 and 5 show an example of transformations created and executed by that tool. those transformations have been applied to load members’ data from bisis relational database into the data warehouse. figure 4. transformations for loading members data. information technology and libraries | december2019 108 figure 5. 
the membershiptransformation process.

an issue that may arise after the initial loading of a data warehouse is how to keep it up to date. to avoid degrading the performance of the transactional databases, updates of the data warehouse should not be performed in real time. in the case of library management systems, those updates can be performed outside of working hours, so the data in the data warehouse are refreshed daily. an update algorithm can be defined as an etl process using olap tools, or it can be implemented from scratch.

data visualization

the basic task of olap tools is to enable visualization of data stored in a data warehouse. olap tools use a multidimensional data representation, known as a cube, which allows a user to analyze data from different perspectives. olap cubes are built on the dimensional models of a data warehouse and consist of dimensions and measures. dimensions form the cube structure, and each cell of the cube holds a measure. measures are derived from the records in the fact table, and dimensions are derived from the dimension tables. olap tools allow a user to select a part of the olap cube by setting an appropriate query, and that part can be further analyzed along different dimensions. this is done by applying common operations on the cube, which include slice and dice, drill down, roll up, and pivot.22 the data resulting from operations on the cube can be visualized in the form of tables, charts, graphs, maps, etc.

the main advantage of olap tools is that end users can do their own analyses and reporting very efficiently. users can extract and view data from different points of view on demand. olap tools are valuable because they provide an easy way to analyze data using various graphical wizards. by analyzing data interactively, users receive feedback that can guide the direction of further analysis. in order to visualize data from our data warehouse, we used the pentaho olap tool.
we used it to create the predefined reports identified during the analysis of user requirements, as well as some interactive reports based on operations on the olap cube. examples of generated reports are presented below in order to illustrate some features of the pentaho olap tool.

the report shown in figure 6 was obtained with a dice operation on the cube. the dice operation selects two or more dimensions from a given cube and provides a new sub-cube. in this particular example, we selected three dimensions: gender, member category, and registration date. additionally, we analyzed only the data from 2014 to 2018. the result of this operation is presented in the form of nested pie charts. however, other forms of visualisation can be applied to the same data set very easily.

figure 6. example of dice operation performed on the olap cube.

figure 7. example of roll-up and drill-down operations performed on the olap cube.

in figure 7, a more complex report is presented. that report is obtained by performing a combination of roll-up and drill-down operations. the roll-up operation aggregates the cube by reducing its dimensions. in our example, we aggregated the number of newly acquired publications for certain years, ignoring all dimensions except the date dimension. a user can select a particular year, quarter, and month and analyze details of the publications purchased in that period, such as the title and author of each publication. this is an example of using the drill-down operation on the cube. the result of that operation is presented as a table, as shown in figure 7.
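the dice, roll-up, and drill-down operations described above can be sketched in a few lines of plain python. this is only an illustration under stated assumptions, not how pentaho implements them: the fact rows, the dimension names, and the helper functions `dice` and `aggregate` are all hypothetical.

```python
from collections import defaultdict

# Hypothetical fact rows (not data from the article): each row carries
# dimension values plus one measure, "count".
FACTS = [
    {"gender": "f", "category": "student", "year": 2014, "quarter": 1, "count": 3},
    {"gender": "m", "category": "student", "year": 2014, "quarter": 2, "count": 2},
    {"gender": "f", "category": "retired", "year": 2015, "quarter": 1, "count": 1},
    {"gender": "m", "category": "employed", "year": 2013, "quarter": 4, "count": 5},
    {"gender": "f", "category": "student", "year": 2015, "quarter": 3, "count": 4},
]


def dice(facts, **allowed):
    """Return the sub-cube whose rows match the allowed values on every named dimension."""
    return [
        row for row in facts
        if all(row[dim] in values for dim, values in allowed.items())
    ]


def aggregate(facts, dims, measure="count"):
    """Sum the measure per combination of the given dimensions.

    Passing fewer dimensions corresponds to a roll-up, passing more to a drill-down.
    """
    totals = defaultdict(int)
    for row in facts:
        totals[tuple(row[d] for d in dims)] += row[measure]
    return dict(totals)


# Dice: restrict two dimensions (year and gender) to chosen values.
sub_cube = dice(FACTS, year={2014, 2015}, gender={"f", "m"})

# Roll-up: keep only the year level of the date dimension.
by_year = aggregate(sub_cube, ["year"])

# Drill-down: go one level deeper, from year to quarter.
by_year_quarter = aggregate(sub_cube, ["year", "quarter"])

print(by_year)
print(by_year_quarter)
```

in an olap tool the same steps are carried out through the graphical interface; the sketch only makes explicit that a roll-up and a drill-down are the same aggregation performed at different levels of detail.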
this report is interactive, because user can investigate data in more detail by performing other operations on the cube that are placed on the toolbar of the report. conclusion this article aims to illustrate an application of business intelligence in libraries, as reporting modules in library management systems are usually inadequate for a comprehensive business analysis. development of a data warehouse, which is the base of any business intelligence system, as well as usage of olap tools are presented. both user requirements regarding reporting in bisis and already-existing transactional databases are analyzed during the development of a datawarehouse model. based on that analysis, three data-warehouse models have been proposed. also, examples of reports generated by an olap tool are given. by building the data warehouse and using olap tools, users of bisis can perform business analysis in a more user-friendly manner than with other processes. users are not limited to predefined types of reports. librarians can easily generate customized reports tailored to the specific needs of the library. in this way, librarians work in a more comfortable environments, performing analytical operations interactively and visualizing query results without additional programming knowledge. the article presents the usage of pentaho olap tool, but the proposed data-warehouse model is independent of olap tools selection and any other tool can be integrated with the proposed data warehouse. references 1 ralph stair and george reynolds, fundamentals of information systems (cengage learning, 2017). 2 ramesh sharda, dursun delen, and efraim turban, business intelligence, analytics, and data science: a managerial perspective (pearson, 2016). 3 “bisis,” library management system bisis, accessed july 8, 2019, http://www.bisis.rs/korisnici/. 4 bojana dimić and dušan surla, “xml editor for unimarc and marc 21 cataloguing,” the electronic library 27, no. 
3 (2009): 509-28, https://doi.org/10.1108/02640470910966934; bojana dimić surla,“eclipse editor for marc records,” information technology and libraries 31, no. 3 (2012): 65-75, https://doi.org/10.6017/ital.v31i3.2384; bojana dimić surla, “developing an eclipse editor for marc records using xtext,” software: practice and experience 43, no. 11 (2013): 1377-92, https://doi.org/10.1002/spe.2140. http://www.bisis.rs/korisnici/ https://doi.org/10.1108/02640470910966934 https://doi.org/10.6017/ital.v31i3.2384 https://doi.org/10.1002/spe.2140 information technology and libraries | december2019 112 5 branko milosavljević, danijela boberić, and dušan surla, “retrieval of bibliographic records using apache lucene,” the electronic library 28, no. 4 (2010): 525-39, https://doi.org/10.1108/02640471011065355. 6 danijela boberić and dušan surla,“ xml editor for search and retrieval of bibliographic records in the z39. 50 standard,” the electronic library 27, no. 3 (2009): 474-95; danijela boberić krstićev, “information retrieval using a middleware approach,” information technology and libraries 32, no. 1 (2013): 54-69, https://doi.org/10.6017/ital.v32i1.1941; miroslav zarić, danijela boberić krstićev, and dušan surla, “multitarget/multiprotocol client application for search and retrieval of bibliographic records,” the electronic library 30, no. 3 (2012): 351-66, https://doi.org/10.1108/02640471211241636. 7 danijela tesendic and danijela boberic krsticev, “web service for connecting visually impaired people with libraries,” aslib journal of information management 67, no. 2 (2015): 230-43, https://doi.org/10.1108/ajim-11-2014-0149. 8 danijela boberić-krstićev and danijela tešendić,“ mixed approach in creating a university union catalogue,” the electronic library 33, no. 6 (2015): 970-89, https://doi.org/10.1108/el-022014-0026. 9 danijela tešendić, branko milosavljević, and dušan surla, “a library circulation system for city and special libraries,” the electronic library 27, no. 
1 (2009): 162-86, https://doi.org/10.1108/02640470910934669; branko milosavljević and danijela tešendić, “software architecture of distributed client/server library circulation system,” the electronic library 28, no. 2 (2010): 286-99, https://doi.org/10.1108/02640471011033648; danijela tešendić, “data model for consortial circulation in libraries,” in proceedings of the fifth balkan conference in informatics, novi sad, serbia, september, 16-20, 2012. 10 danijela boberic and branko milosavljevic, “generating library material reports in software system bisis,” in proceedings of the 4th international conference on engineering technologies icet, 2009: 133-37. 11 william h. inmon, building the data warehouse (indianapolis: john wiley & sons, 2005); ralph kimball, the data warehouse toolkit: practical techniques for building dimensional data warehouses (ny: john willey & sons, 1996), 248, no. 4. 12 ralph kimball and joe caserta, the data warehouse etl toolkit: practical techniques for extracting, cleaning, conforming, and delivering data (indianapolis: john wiley& sons, 2004), 528. 13 oscar romero and alberto abelló, “a survey of multidimensional modeling methodologies,” international journal of data warehousing and mining (ijdwm) 5, no. 2 (2009): 1-23. 14 lorena siguenza guzman, victor saquicela, and dirk cattrysse,“design of an integrated decision support system for library holistic evaluation,”in proceedings of iatul conferences (2014):112. 
https://doi.org/10.1108/02640471011065355 https://doi.org/10.6017/ital.v32i1.1941 https://doi.org/10.1108/02640471211241636 https://doi.org/10.1108/ajim-11-2014-0149 https://doi.org/10.1108/el-02-2014-0026 https://doi.org/10.1108/el-02-2014-0026 https://doi.org/10.1108/02640470910934669 https://doi.org/10.1108/02640471011033648 business intelligence in the service of libraries |tešendić and krstićev 113 https://doi.org/10.6017/ital.v38i4.10599 15 yi-ting yang and jiann-cherng shieh, “data warehouse applications in libraries—the development of library management reports,” in advanced applied informatics (iiai-aai), 2016 5th iiai international congress on advanced applied informatics, 88-91. ieee, 2016, https://doi.org/10.1109/iiai-aai.2016.129. 16 tešendić, milosavljević, and surla, “a library circulation system,”162-86. 17 dimić and surla, “xml editor for unimarc,” 509-28. 18 paulraj ponniah, data warehousing fundamentals for it professionals (hoboken, nj: john wiley & sons, 2011). 19 “top 10 best analytical processing (olap) tools,” software testing help, https://www.softwaretestinghelp.com/best-olap-tools/. 20 rick sherman, “how to evaluate and select the right bi tools,” https://searchbusinessanalytics.techtarget.com/buyersguide/a-buyers-guide-to-choosingthe-right-bi-analytics-tool. 21 doug moran, “pentaho community wiki,” https://wiki.pentaho.com/. 22 ponniah, “data warehousing,” 382-93. 
article

text analysis and visualization research on the hetu dangse during the qing dynasty of china

zhiyu wang, jingyu wu, guang yu, and zhiping song

information technology and libraries | september 2021
https://doi.org/10.6017/ital.v40i3.13279

zhiyu wang (mikemike248@gmail.com) is phd candidate, school of management, harbin institute of technology and associate professor, school of history, liaoning university. jingyu wu (734665532@qq.com) is graduate student, school of history, liaoning university. guang yu (yug@hit.edu.cn) is professor, school of management, harbin institute of technology. zhiping song (1367123893@qq.com) is graduate student, school of history, liaoning university. © 2021.

abstract

in traditional historical research, interpreting historical documents subjectively and manually causes problems such as one-sided understanding, selective analysis, and one-way knowledge connection. in this study, we aim to use machine learning to automatically analyze and explore historical documents from a text analysis and visualization perspective.
this technology solves the problem of large-scale historical data analysis that is difficult for humans to read and intuitively understand. in this study, we use the historical documents of the qing dynasty hetu dangse, preserved in the archives of liaoning province, as data analysis samples. china’s hetu dangse is the largest qing dynasty thematic archive with manchu and chinese characters in the world. through word frequency analysis, correlation analysis, co-word clustering, word2vec model, and svm (support vector machines) algorithms, we visualize historical documents, reveal the relationships between functions of the government departments in the shengjing area of the qing dynasty, achieve the automatic classification of historical archives, improve the efficient use of historical materials as well as build connections between historical knowledge. through this, archivists can be guided practically in historical materials’ management and compilation. introduction china has a long history documented in numerous archives. at present, various local archive departments preserve large numbers of historical documents from different periods. owing to the development of china’s archive digitization, archive management departments at all levels have established digital archive abstracts, catalogs, and subject indexes of historical documents in their collections realizing online retrieval of historical archives. with in-depth research on chinese history, simple catalog retrieval cannot satisfy researchers’ demand for related knowledge in historical archives. owing to the limitations of the catalog retrieval system, complex catalog data still need to be read manually. however, it is difficult to view the overall picture of the recorded content and impossible to easily distinguish important information in historical materials; this leads to various difficulties, such as the compilation of historical materials for chinese historical researchers. 
thus, in this study, we aim to use text analysis and visualization methods in machine learning to conduct data mining analysis of historical document data. these methods will help us discover the logical relationships of historical records and their purposes, accomplish visual presentations of historical entities and knowledge discovered in historiography, improve knowledge representation and automatic classification of historical data, and provide valuable information for historical archive researchers.

when analyzing traditional manual methods for interpreting historical documents, we find the following problems: description only at the macro level, a single angle of view, selective analysis, and one-way knowledge connection, among others. for example, the hetu dangse preserved in the liaoning archives contains a total of 1,149 volumes and 127,000 pages, making it difficult to fully grasp and understand the overall content of such documents. relying on manual reading and analysis of entire archives is unrealistic. therefore, this paper proposes using machine learning, natural language processing (nlp), and other technologies to address various problems that arise from traditional manual reading. first, information from historical documents can be revealed from different angles, which allows the content of the documents to be displayed more comprehensively and scientifically through visual charts. second, the use of objective quantitative analysis methods, such as text analysis and nlp, prevents subjective interpretations of the same content.
third, nlp and other technologies can solve the problem of calculating massive text training data sets while forming systematic knowledge that avoids the omission and one-sided understanding of knowledge in the historical archive. the application of machine learning in historical data analysis has attracted the attention of researchers in management, history, and computer science. tao used the latent dirichlet allocation (lda) topic modeling algorithm to analyze the themes of documents from 1700 to 1800 included in the german archives, providing a more three-dimensional interpretation and explanation of the spiritual world of germany during the eighteenth century.1 chinese scholars kaixu et al. proposed a method of automatic sentence punctuation based on conditional random fields in ancient chinese.2 this method was proved to better solve the problem of automatic punctuation processing compared with the single-layer conditional random field strategy in ancient chinese as tested on the two corpora of the analects and records of the grand historian. swiss and south african scholars stauffer, fischer, and riesen, and chinese scholars wu, wang, and ma used the kws technology and deep reinforcement learning to automatically recognize handwritten pictures in historical documents.3 solar and radovan used the national and university library of slovenia’s historical pictures and maps as research data. using gis technology, they created a novel display method, and interdisciplinary data resource web application to access and research the data.4 chinese scholars dong et al. and polish scholars kuna and kowalski used the webgis technology to conduct efficient management and visualization research on historical data of natural disasters in ancient china and russia. 5 meanwhile, latvian scholars ivanovs and varfolomeyev and dutch scholars schreiber et al. 
used web technology to develop a web service platform and explored the intelligent environment of cultural heritage service utilization.6 korean scholars kim et al. used machine learning technology to determine the complex relationships between tasks of various classes in a specific historical period through the network of historical figures.7 judging from results in related fields, the intelligent semantic analysis and visualization of historical archives is gradually moving from statistical description to knowledge mining. these results provide theoretical feasibility and practical technical experience for this study.

at present, research on historical documents mainly focuses on the retrieval and utilization of historical material databases. since the words, semantics, grammar, and sentence patterns recorded in historical materials differ from those of modern texts, using data mining technologies such as machine learning and nlp to intelligently identify historical documents and organize historical data will help us more than traditional methods. this requires the cooperation of artificial intelligence and historical researchers to establish an effective method of historical big data analysis, so as to achieve the transformation from traditional manual historical document analysis to automatic artificial intelligence analysis methods. in this paper, we use machine learning and data visualization as tools to identify the content of historical documents in a way that differs from traditional literature reading, reveal valuable information in that content, and promote a more systematic, efficient, and detailed understanding of the literature.
related technology definition to perform text analysis and visualization of the hetu dangse, we use machine learning technology such as word vector processing, the svm (support vector machines) model and network analysis. word vector is a numerical vector representation of a word’s literal and implicit meaning.8 we segmented the hetu dangse’s catalog data and used the word2vec model to transform the segmented data’s word vector form into a set of 50-dimensional numerical vectors representing a catalog’s vector data set. to accurately visualize historical document records’ relationship features, we reduced the vector data set’s dimensionality. dimensionality reduction, or dimension reduction, is data’s transformation from a highinto a low-dimensional space so that the representation retains some of the original data’s meaningful properties, ideally close to its intrinsic dimension.9 after dimensionality reduction, each catalog data in the vector data set is reduced from 50 to 2 dimensions to facilitate flat display. we used the svm model and network analysis technology to analyze the vector data set. the svm model is a set of supervised learning methods used for classification, regression, and outlier detection.10 it is given a vector data set as training to represent historical document records as points in space, and learns independently through the kernel algorithm. using the algorithm, it maps the separated new records to the same space, and predicts their category based on which side of the interval they fall. network analysis techniques derive from network theory, a computer science system demonstrating social networks’ powerful influences. 
the characteristics of network analysis technology make it suitable for visualizing books and historical archives in the library and information science field, because the visualization technique involves mapping entities’ relationships based on the symmetry or asymmetry of their relative proximity.11 thus, it helps to discover the knowledge relevance of historical documents. for example, citation network analysis can identify emerging relationships in healthcare domain journals.12

sample data preprocessing and classification

this study uses the catalog of the qing dynasty historical archives from the hetu dangse collected by the liaoning archives as the research sample to conduct text analysis and visualization research. china’s hetu dangse is the largest qing dynasty thematic archive with manchu and chinese characters, both within china and internationally. the hetu dangse is the official document of communication between shengjing general yamen, the wubu of shengjing and fengtian office, and the document communicated between the beijing internal affairs office in charge and the liubu of beijing during the qing dynasty. the hetu dangse was published from 2015 to 2018, including the hetu dangse·kangxi period (56 volumes), hetu dangse·yongzheng period (30 volumes), hetu dangse·qianlong period (24 volumes), hetu dangse·qianlong period (17 volumes), hetu dangse·daoguang period (52 volumes), hetu dangse·jiaqing period (58 volumes), hetu dangse·qianlong period official documents (46 volumes), hetu dangse·qianlong period official documents (46 volumes), and hetu dangse·general list (16 volumes).13 the hetu dangse is an
owing to the special status of shengjing in the qing dynasty, it has a unique historical significance as the companion capital of beijing and the hometown of the qing royal family. this provides original evidence from this time for studying politics, economy, culture, history, and natural ecology in northeast china. in this study, we preprocess the catalog data of the hetu dangse by performing text segmentation, creating a corpus, and labeling data before using text analysis and visualization technology to analyze the catalog data of hetu dangse. first, we use word frequency analysis and statistics to study the functions of institutions. second, we use the co-word clustering algorithm to quantify and visualize the institutional relationships. finally, we use the svm model to automatically classify and explore the catalog data of the hetu dangse. figure 1 illustrates this process. figure 1. text analysis flowchart. data preparation and preprocessing we collected 95,680 catalog data items in the hetu dangse of the liaoning archives, including 25,148 items from the kangxi period; 1,096 items from the yongzheng period; 23,819 items from the qianlong period; 20,730 items from the jiaqing period; and 15,887 items from the daoguang period. the content of each catalog data includes three parts: title information, time of publication (chinese lunar calendar), and responsible agency. the proportion for each period was not evenly distributed in the catalog data of the hetu dangse with the kangxi period catalog data having the highest proportion (26.2%). through the catalog data information, we can perform an in -depth analysis of the content of the hetu dangse from the three perspectives: institutional functions, institutional relationships, and topic classification. data cleaning as the text recorded in the archives of the hetu dangse are manchu and ancient chinese, using chinese word segmentation tools (jieba, snownlp, thulac, etc.) 
based on modern chinese will cause errors. therefore, it is necessary to construct a special text corpus for word segmentation. first, we construct a stop vocabulary list to remove words with little impact on semantics in the hetu dangse, such as for (为), please (请), and of (之). second, we use the word segmentation tools mentioned above for preliminary word segmentation and then perform part-of-speech tagging and word segmentation corrections based on the word segmentation results.

the title part of the catalog data of the hetu dangse mainly contains three dimensions of information: the record title of the catalog, issuing institution, and receiving institution. accordingly, we set a total of four types of tags in the text corpus: issuing institution, receiving institution, record type, and keywords. the receiving institution and the issuing institution correspond to the institutions at the beginning and the end of the catalog, respectively, such as the words shengjing zhangguan fang zuoling and shengjing ministry of justice. the record type is the front word of the receiving institution, such as counseling (咨) and please (请). the keywords are words that can represent the overall semantics in the record title of the catalog, such as arrest (缉拿) and advance (进送). table 1 presents the corpus we developed.

table 1.
hetu dangse corpus

num | word | property 1 | property 2
1 | 盛京掌关防佐领 | organization | noun
2 | 为 | stop_words | preposition
3 | 缉拿 | keywords | verb
4 | 逃人 | keywords | noun
5 | 舒廷 | name | noun
6 | 官事 | stop_words | noun
7 | 咨 | keywords | verb
8 | 盛京刑部 | organization | noun
9 | 正白旗佐领 | organization | noun
10 | 兆麟 | name | noun
11 | 呈 | stop_words | preposition
12 | 为 | stop_words | preposition
13 | 交纳 | keywords | verb
14 | 壮丁 | keywords | noun
15 | 银两事 | keywords | noun
┋ | ┋ | ┋ | ┋
61047 | 收讫事 | keywords | noun
61048 | 盛京佐领 | organization | noun

label data

to improve the utilization efficiency of the hetu dangse and show the document content information from multiple angles, we use a supervised machine learning method to automatically classify the catalog data of the hetu dangse. therefore, the original catalog data set must be labeled. we determine the classification and label of the hetu dangse catalog according to the chinese archives classification law, chapter 12. table 2 presents the 11 categories of the catalog. with this, we complete the hetu dangse catalog sampling classification and labeling, laying the foundation for automatic catalog classification.

the hetu dangse has a total of 95,680 catalog records involving five periods: kangxi, yongzheng, qianlong, jiaqing, and daoguang. we randomly select 500 records from each period and manually label these 2,500 records as the sample data set. the data classification after manual labeling is shown in figure 2. the overall distribution is relatively even, making it suitable for machine learning processing.

table 2.
data labels

num | category
1 | type of official documents (政务种类)
2 | palace, royal family and eight banners affairs (宫廷、皇族及八旗事务)
3 | bureaucracy, officials (职官、吏役)
4 | military (军事)
5 | politics and law (政法)
6 | sino-foreign relations (中外关系)
7 | culture, education, health and scientific cultural study (文化、教育、卫生及科学文化研究)
8 | finance (财政)
9 | agriculture, water conservancy, animal husbandry (农业、水利、畜牧业)
10 | building (建筑)
11 | transportation, post and telecommunication (交通、邮电)

figure 2. percentage of the hetu dangse catalog data label chart.

results

in this study, we used the catalog data of the hetu dangse as a sample to analyze and reveal the hetu dangse catalog data from three perspectives: institutional function, institutional relationship, and automatic classification. this will improve usage efficiency of the hetu dangse, thus improving researchers’ mastery of relevant information about the document. to achieve the functional requirements of text analysis, we adopted four methods: word vector conversion, word frequency analysis, co-word clustering, and the svm model.

word vector conversion of text catalog data

the automatic classification of machine-learning technology is based on vector data sets. thus, the hetu dangse text catalog data set must be vectorized before automatic classification. currently, word vector conversion technology mainly includes methods such as one-hot, word2vec, and glove. hetu dangse records the history of the qing dynasty for more than 200 years. there are inevitable relationships among the contents recorded in the documents, indicating that they are not isolated from each other. the word2vec model provides an efficient implementation of cbow and skip-gram architectures for computing vector representations of words, both of which are simple neural network models with one hidden layer.
the word2vec model produces word vectors as outputs from inputting the text corpus. this method generates a vocabulary from the input words and then learns the word vectors via backpropagation and stochastic gradient descent.14 this makes the word2vec model more suitable for catalog data from hetu dangse. word2vec includes the cbow model and the skip-gram model, which can enrich the semantic relevance depending on the context, and it is more suitable for the semantic relevance of historical documents such as the hetu dangse. therefore, we adopt the skip-gram model to analyze the catalog data of hetu dangse. we extracted the features of word vectors in catalog data from the corpus, input them into the word2vec model, imported the gensim library in python, trained the vector embeddings, and obtained the htd.model.bin vector file and htd.text.model model file. the correlation between each word in the hetu dangse catalog can be found by implementing the model. for example, if the word bannerman (旗人) is input into the model, the most relevant words are minren (民人, with 0.84726 relevance), accused (被控, with 0.812017), and robbery (抢 劫, with 0.795359). to visualize the ethnic relationships recorded in the hetu dangse catalog, we input the first 300 words of the word vector into the trained word2vec model and performed dimensionality reduction to realize a planar graph. to understand the structure of the data intuitively, we used the t-sne algorithm to reduce the dimensions of the word vector. the t-sne is a type of nonlinear dimensionality reduction used to ensure that similar data points in high-dimensional space are as close as possible in low-dimensional space. we set the embedded space dimension parameter of tsne to 2 and the initialization parameter as pca. this makes it more globally stable than random initialization. the maximum number of optimization iterations is 5,000. figure 3 presents the results. 
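the relevance scores quoted above for bannerman (旗人) are the kind of value a trained word2vec model returns when asked for a word's nearest neighbours. as a minimal sketch, assuming the scores are cosine similarities (the measure gensim uses for this kind of query), the ranking can be reproduced in plain python. the three-dimensional vectors and the helper names `cosine` and `most_similar` below are hypothetical and stand in for the 50-dimensional vectors of the trained model.

```python
import math

# Hypothetical 3-dimensional embeddings for illustration only;
# the article's word2vec vectors have 50 dimensions.
VECTORS = {
    "bannerman": [0.9, 0.1, 0.3],
    "minren": [0.8, 0.2, 0.4],
    "accused": [0.7, 0.5, 0.1],
    "mausoleum": [0.1, 0.9, 0.2],
}


def cosine(u, v):
    """Cosine similarity: dot product divided by the product of the vector norms."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word, vectors, topn=2):
    """Rank every other word by its cosine similarity to the query word."""
    query = vectors[word]
    scores = [
        (other, cosine(query, vec))
        for other, vec in vectors.items()
        if other != word
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)[:topn]


ranked = most_similar("bannerman", VECTORS)
for neighbour, score in ranked:
    print(neighbour, round(score, 4))
```

with real embeddings the same ranking step is what surfaces related terms such as minren (民人) for a query word; the toy values here only show the mechanics.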
in figure 3, the terms sanling, yongling, zhaoling, prime minister, and fuling form clusters. in shengjing, the qing set up the sanling prime minister's office, and the post of minister of mausoleum affairs was held concurrently by the shengjing general. the sanling prime minister's office was established near fujinmen. in the 30th year of guangxu, the office was renamed the prime minister's office of shengjing mausoleum affairs, and the governor of the three provinces served in the post concurrently. under the sanling prime minister’s office, the sanling office was set up to undertake the sacrifice and repair affairs of the three tombs (xinbin yongling, shenyang fuling, and zhaoling).15 therefore, the clustering in figure 3 verifies the close relationship between the sanling prime minister's office and the tombs.

figure 3. 2d t-sne visualization of word2vec vectors.

analysis of the relationship between the documents received and sent by institutions

with the statistics of the text data obtained after word segmentation, we can find the quantitative relationship between the documents received and sent by an institution, using the pearson correlation coefficient to judge whether there is a correlation between the number of documents received and the number of documents sent by the same institution.

$$\rho(R,S) = \frac{\operatorname{cov}(R,S)}{\sigma_R \, \sigma_S} \tag{3.1}$$

we denote the pearson correlation coefficient between the number of documents received and the number of documents sent by $\rho(R,S)$. here, $R = \{r_1, r_2, r_3, \ldots, r_{11}\}$ is the variable set of documents received by the institutional sample, and $S = \{s_1, s_2, s_3, \ldots, s_{11}\}$ is the variable set of documents sent by the institutional sample.
by dividing the covariance of r and s by the product of their respective standard deviations, we can obtain the value of the correlation coefficient of the documents sent and received by the same institution.

mining the relationship between institutions’ sending and receiving documents based on co-word clustering

to mine the relationship between the institutions’ sending and receiving documents, we adopt a co-word clustering algorithm to generate a visualized network map of institutional relationships. the global co-occurrence rate represents the probability of two words appearing together in all the data sets. in large-scale data sets, if two words often appear together in the text, these two words are considered to be semantically strongly related.16 clustering is a method that places objects into a group by similarity or dissimilarity. thus, keywords with high correlation to each other tend to be placed in the same cluster. social network analysis, which evaluates the unique structure of interrelationships among individuals, has been extensively used in social science, psychological science, management science, and scientometrics.17

we can obtain a sociogram from the institutional function analysis. the main purpose of the sociogram is to provide information about the relationship between institutions’ sending and receiving documents. in the sociogram, each member of a network is described by a “vertex” or “node.” vertices represent high-frequency words, and the sizes of the nodes indicate the occurrence frequency. the smaller the size of a node, the lower the occurrence frequency. lines depict the relationships between two institutions. they exist between two keywords, indicating that they received or sent documents to each other. the thickness is proportional to the correlation between the keywords.
the thicker the line between the two keywords, the stronger the connection. using this rationale, the map visualization and network characteristics (centrality, density, core-periphery structure, strategic diagram, and network chart) were obtained by analyzing pearson’s correlation matrix or other similarity matrices.18

in this study, we conducted network analysis on a binary matrix to display the relationships between the documents sent and received by the institutions in the shengjing area during the qing dynasty recorded in the hetu dangse. further, we extracted the receiving institution and issuing institution from each record of catalog data in the hetu dangse, and then we composed a new data set with the following fields: receiving institution, issuing institution, and title content. we used python to convert the new data set to endnote format and imported it into vosviewer 1.6.15 to calculate and draw a visual map of the new data set. van eck and waltman of the netherlands’ leiden university developed vosviewer, a bibliometric analysis software used for constructing and visualizing network graphs.19 although the software’s development principle is based on documents’ co-citation principles, it can be applied to the construction of data network knowledge graphs in various fields. combined with the co-word clustering algorithm, we can create an entity connection network map for historical documents through vosviewer software to reflect the recorded content.

automatic classification method of historical archives catalog based on the svm model

we used the svm model in machine learning for automatic classification. the svm model has the advantages of strong generalization, low error rate, strong learning ability, and support for small sample data sets, making it suitable for historical archive catalog data samples with small sample characteristics. therefore, we attempted to classify the catalog data set of hetu dangse using the svm model.
first, we divided the vectorized labeled data set into a training set and a testing set. the training set accounts for 70% of the data, and the testing set accounts for 30%. to ensure the accuracy of the model prediction, we adopted a random division method to avoid overfitting. second, we used a linear kernel in the svm model and grid search to find the best parameter. various combinations of the penalty coefficient (c) and gamma parameter in the svm model were tested based on their accuracy ranked from high to low. we then determined the best parameter combination. after the model was established, we validated the predictive performance of the model from multiple perspectives such as precision, recall, and f1 score to ensure the generalization ability and availability of the model.

we set the penalty coefficients to 10, 100, 200, and 300, while the gamma parameters are set to 0.1, 0.25, 0.5, and 0.75. we used the precision evaluation criteria to find the optimal parameter combination of the model and then imported them. the penalty coefficient is set to the x-axis, the gamma parameter set to the y-axis, and the precision set to the z-axis. we implemented the model to obtain the visualization that is shown in figure 4. clearly, the optimal parameter combination is a penalty coefficient of 10 and a gamma parameter of 0.075.

figure 4. svm grid search parameter tuning diagram.

discussion

the history of a nation is the foundation on which it is built. historical documents are the witnesses and recorders of history. through the study of historical documents, we can go back to the past, cherish the present, and look forward to the future. an increasing number of scholars have studied these documents in recent years due to their importance.
the hetu dangse records the document communications between institutions in shengjing (now shenyang) and beijing during the qing dynasty. it is an important historical document that cannot be ignored when studying the history of northeast china during the qing dynasty. here, we use the catalog data of the hetu dangse as the sample data to test the machine learning methods previously mentioned. we explore the results from the perspectives of institutional function, institutional relationship, and automatic classification to determine the feasibility of our methods.

functions of institutions

the number of institutions involved in the hetu dangse is over 150. these functional departments formed the governance system of the shengjing area during the qing dynasty. to gain a deeper understanding of the qing dynasty’s ruling system in the shengjing area, the functions of these institutions should be examined. this study analyzes the functions of the institutions in the shengjing area through the number of documents and the frequency of content of the sending and receiving institutions.

analysis of the number of documents received and sent by institutions

by sorting and statistically analyzing the catalog data of hetu dangse, we obtained data on the number of documents received and sent by institutions in the shengjing area recorded in the hetu dangse. we set the vertical axis as the total number of communicated documents, number of issued documents, and number of received documents. we set the horizontal axis as the names of the institutions and then drew a histogram.
this study analyzes the number of institutional archives of the hetu dangse catalog from three perspectives: total number of sent and received documents, number of received documents, and number of issued documents to find the institutions with the highest research value in the shengjing area. in the histogram shown in figure 5(a), the top three institutions in total number of communicated documents are shengjing internal affairs office, shengjing zuoling, and shengjing ministry of revenue. we can also observe that the top 10 institutions differ in how their totals split between documents received and documents sent. therefore, the ranking of the total number of communicated documents is not directly related to the respective rankings of the number of documents received and the number of documents sent. in figure 5(b), we can observe that the top three institutions in number of documents received in the hetu dangse are shengjing internal affairs office, shengjing ministry of revenue, and shengjing general yamen. figure 5(c) shows that the top three institutions in number of documents sent in the hetu dangse are shengjing internal affairs office, shengjing zuoling, and shengjing general yamen. the total number of communicated documents, number of documents sent, and number of documents received by the shengjing internal affairs office all rank first; this indicates that the shengjing internal affairs office was the most important department of the ruling system in the shengjing area during the qing dynasty.

figure 5. number of documents received and sent by institutions.
by using the number of documents received and sent by the institutions, we calculated the pearson correlation coefficient to determine whether the numbers of documents received and sent by the same institution are correlated. as institutional samples, we selected the shengjing internal affairs office, shengjing ministry of revenue, (beijing) internal affairs office in charge, shengjing zuoling, shengjing ministry of works, shengjing ministry of justice, shengjing general yamen, shengjing close defense zuoling, shengjing ministry of war, fengtian general yamen, and shengjing ministry of rites. the resulting pearson correlation coefficient is 0.69 (rounded to two decimal places), so there is a correlation between the number of sent and received documents, as shown in figure 6.

figure 6. scatter plot of pearson correlation coefficient.

the hetu dangse is a copy of official documents dealing with the royal affairs of the shengjing internal affairs office during the qing dynasty. it contains the official documents between the shengjing internal affairs office and the beijing internal affairs office in charge, the liubu, etc. and the local shengjing general yamen, fengtian office, the wubu of shengjing, and other yamens.16 thus, there exists a large stock of documents with the shengjing internal affairs office as the sending and receiving agency. the wubu of shengjing, shengjing general yamen, shengjing zuoling, and other institutions are important hubs for the operation of institutions in shengjing. they played an important role in maintaining and stabilizing the society of shengjing. their document counts are second only to those of the shengjing internal affairs office.
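the pearson calculation used in this subsection (covariance divided by the product of the standard deviations) can be sketched in plain python. the counts below are invented placeholders, not the document counts of the 11 sample institutions, so the printed value is not the 0.69 reported here; the function name is likewise only illustrative.

```python
import math


def pearson(r, s):
    """Pearson correlation: cov(R, S) / (sigma_R * sigma_S)."""
    if len(r) != len(s) or len(r) < 2:
        raise ValueError("need two equal-length samples with at least 2 values")
    n = len(r)
    mean_r = sum(r) / n
    mean_s = sum(s) / n
    # The 1/n factors of the covariance and of the two standard
    # deviations cancel, so plain sums of deviations are enough.
    cov = sum((a - mean_r) * (b - mean_s) for a, b in zip(r, s))
    var_r = sum((a - mean_r) ** 2 for a in r)
    var_s = sum((b - mean_s) ** 2 for b in s)
    return cov / math.sqrt(var_r * var_s)


# Hypothetical received/sent counts for five institutions.
received = [120, 340, 90, 410, 230]
sent = [150, 300, 60, 280, 310]

rho = pearson(received, sent)
print(round(rho, 2))
```

a value near 1 means institutions that receive many documents also tend to send many; a value near 0 means the two counts are largely unrelated.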
analysis of the frequency of documents received and sent by institutions

to further explore the functions of institutions with research value, we extracted the contents of the catalogs from the top three institutions in total number of documents sent and received: shengjing internal affairs office, shengjing ministry of revenue, and shengjing zuoling. we then classified the catalogs of the aforementioned institutions according to receipts and postings. subsequently, we used word segmentation and word frequency statistics to process the two types of catalog information and draw comparison diagrams to explore their specific functions in the hetu dangse.

as shown in figure 7, we can roughly divide the obtained segmentation words into two categories. one is the name of the communicated official document institutions, such as the ministry of revenue, the ministry of justice, and the ministry of rites on the side of the word frequency (see fig. 7[a]). the other is the name of the official document content and the words zhuangtou (庄头), dimu (地亩), and zhuangding (壮丁) on the side of the frequency of the words in the documents sent. through a comparative analysis of the top 10 words received and sent by the same institution, we conclude that the institutions with a close relationship between receiving and sending documents are not the same. for example, the ministry of revenue of shengjing internal affairs office ranks first in the frequency of documents sent by institutions, while the shengjing zuoling ranks first for receiving institutions (see fig. 7[b]). the contents of documents sent and received by the same institution are different.
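the word frequency comparison can be sketched as follows, assuming the catalog titles have already been segmented into tokens (the segmentation tool is not shown). the token lists are invented for illustration, and top_words is a hypothetical helper name rather than part of any library.

```python
from collections import Counter


def top_words(segmented_titles, topn=10):
    """Count tokens across all titles and return the `topn` most frequent."""
    counts = Counter(token for title in segmented_titles for token in title)
    return counts.most_common(topn)


# Hypothetical, pre-segmented catalog titles for one institution.
sent_titles = [
    ["zhuangtou", "dimu"],
    ["zhuangtou", "zhuangding"],
    ["dimu", "zhuangtou"],
]
received_titles = [
    ["ministry of revenue", "ministry of justice"],
    ["ministry of revenue"],
]

top_sent = top_words(sent_titles, topn=2)
top_received = top_words(received_titles, topn=2)
print("sent:", top_sent)
print("received:", top_received)
```

placing the two ranked lists side by side is what allows the vocabulary of sent documents to be compared with that of received documents for the same institution.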
figure 7(c) shows how the affairs sent by shengjing zuoling to ula (乌拉), forage (粮草), and license (执照) differ from those represented by the zhuangtou (庄头), accounting (会计), and close defense (关防) in the frequency of documents sent and frequency of receipts, respectively.

based on previous research on the functions of shengjing’s institutions, the shengjing internal affairs office was set up in the companion capital of shengjing during the qing dynasty to be in charge of shengjing cemetery, sacrifice, organization of staff transfer, and other matters.20 this relates to the meaning of words such as sacrifice (祭祀) in figure 7(a). the functions of the shengjing ministry of revenue were represented in guangxu’s great qing huidian. the cashiers in charge of taxation in shengjing, number of annual losses in official villages, and banner land were carefully recorded. the expenditures were distinguished and the accounting obeyed the regulations according to the beijing ministry of revenue at the end of the year.21 this is related to the meaning of words such as dimu (地亩), land sale (卖地), and money and grain (钱粮) in figure 7(b). in fu yonggong and guan jialu’s research of shengjing zuoling’s functions, shengjing zuoling handled the transfer of communicated documents; supervised and urged the various departments of guangchu, duyu, zhangyi, accounting, construction, and qingfeng to undertake matters; managed officials and various people; maintained the shengjing palace and the warehouse; selected women to send to beijing for inspection; heard all types of cases; undertook the emperor’s general letter; managed the ula people and tributes; and accepted the emperor or the internal affairs office in charge, among other tasks.22 this is connected to the meaning of words such as ula (乌拉), close defense (关防), and license (执照) in figure 7(c).

figure 7.
word frequency comparison of documents received (in blue) and sent (in orange) by institutions.

institutional relationship analysis

to further study the governance structure of the shengjing area, we not only need to understand the functions of each institution but also explore the overlap between functions of institutions. the catalog data of the hetu dangse consist of three parts: receiving institutions, issuing institutions, and record title of the catalog. a document often includes two institutions, the receiving institution and the issuing institution, and it is certain that the content of a document relates closely to the functions between the two institutions. by observing the closeness between institutions through visualizations, we conducted a quantitative analysis of the catalog data shared by each pair of receiving and issuing institutions in the hetu dangse to provide reliable data for further research in the intersection of institutional functions in the shengjing area.

results of institutional connection analysis

using the co-word clustering algorithm, we counted the catalog records shared by each pair of receiving and issuing institutions. we set the vertical axis as the issuing institution and the horizontal axis as the receiving institution to obtain figure 8. the numbers inside the boxes represent the quantity of catalog records for the corresponding pair of issuing and receiving institutions. to facilitate measurements in the statistical process, pairs with 50 or fewer communicated documents between the receiving institution and the issuing institution have been zeroed out.
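the counting and zeroing-out step described above can be sketched in a few lines of python. the institution names and record counts are placeholders invented for the example, count_pairs is a hypothetical helper name, and the threshold of 50 follows the rule stated in the text.

```python
from collections import Counter


def count_pairs(records, threshold=50):
    """Count (issuing, receiving) pairs and zero out small counts.

    A pair whose count is less than or equal to `threshold` is set to 0.
    """
    counts = Counter(records)
    return {pair: (n if n > threshold else 0) for pair, n in counts.items()}


# Hypothetical catalog records as (issuing institution, receiving institution).
toy_records = (
    [("institution a", "institution b")] * 60
    + [("institution b", "institution a")] * 50
    + [("institution c", "institution a")] * 3
)

matrix = count_pairs(toy_records)
for (issuer, receiver), n in matrix.items():
    print(f"{issuer} -> {receiver}: {n}")
```

keeping the direction of each pair (issuing first, receiving second) is what lets the resulting matrix be read with issuing institutions on one axis and receiving institutions on the other.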
as shown in figure 8, the institutions having close relations with the documents recorded in the hetu dangse are concentrated in the issuing institutions shengjing zuoling and shengjing internal affairs office, and the receiving institutions shengjing internal affairs office and shengjing zuoling. among the receiving institutions, the number of documents received by the shengjing internal affairs office from shengjing general yamen reached as high as 11,936. the top three sources of documents received by shengjing zuoling were fengtian general yamen (2,265 pieces), shengjing ministry of revenue (1,527 pieces), and shengjing ministry of justice (1,520 pieces). it is worth noting that fewer than 50 documents from shengjing zuoling were received by the shengjing internal affairs office.

the overlapping functions of the institutions in the shengjing area enabled individual offices to play bureaucratic games, passing responsibility to other offices, leading to low efficiency in handling affairs. for example, the military and political power in the shengjing area was jointly controlled by the shengjing general office and the shengjing ministry of war. the shengjing area’s tax power was controlled by the shengjing ministry of revenue and fengtian office and their subordinate offices. this phenomenon ran through the entire qing dynasty. research on the cross-functionality of institutions has always been a hot topic in qing historiography. by analyzing the official documents exchanged between institutions, we can further explore the overlap as well as the advantages and disadvantages of the qing dynasty shengjing ruling system, study the history of shengjing institutions in the qing dynasty more thoroughly, and provide a reference for the design of current institutions.

figure 8.
relationship of communicated documents by the hetu dangse institutions diagram.

visualization of institutional network map

we used the hetu dangse catalog as sample data and the co-word clustering algorithm to obtain the close relationship between institutions and the appearance frequency of institutions. we drew a visual network diagram with vosviewer 1.6.15 to obtain figure 9. in figure 9, institutions are represented by default as a circle with their names. the size of the label and the circle of an institution are determined by the weight of the item. the higher the weight of an item, the larger the label and the circle of the item. for some items, labels may not be displayed to avoid overlapping labels. the color of an institution is determined by the cluster the institution belongs to, and lines between items represent links.

as shown in figure 9, the relationships between the institutions and departments in the hetu dangse form three core groups: the shengjing internal affairs office (in charge), shengjing zuoling, and beijing internal affairs office in charge. however, the relationships between the three groups are not similar; the distance between the group (beijing) internal affairs office in charge and the two other groups is relatively large. the group at the core of shengjing internal affairs office and the group at the core of shengjing zuoling are closely connected to each other through the wubu of shengjing (shengjing ministry of revenue, shengjing ministry of rites, shengjing ministry of war, shengjing ministry of justice, and shengjing ministry of works). further, there are two larger individuals: fengtian general yamen and shengjing general yamen.
fengtian general yamen and shengjing zuoling are closely related to each other, and the relationship between shengjing general yamen and shengjing internal affairs office is relatively close.

figure 9. co-occurrence of institutions network map.

the city of shengjing was the companion capital of the qing dynasty. the qing government implemented special governance measures in these areas that differed greatly from those of direct inland provinces.23 to ensure the stable rule of the shengjing area, the qing dynasty performed the following tasks. first, the qing dynasty set up a general garrison as the highest military and political chief in the shengjing area to be responsible for all military and political affairs within its jurisdiction. second, they established the fengtian office, a capital of the same level as the shuntian office, to rule the common people of the shengjing area. the states and counties, as well as the garrison banner officer, which was under the rule of general garrison, were local administrative institutions under the fengtian office. these institutions implemented the dual management rule of the bannerman and common people. third, as the companion capital, the shengjing area followed the ming dynasty companion capital system to set up the wubu of shengjing to maintain power. in addition, the shengjing internal affairs office, which was in charge of palace affairs, communicated with the beijing internal affairs office in charge.

results of automatic classification analysis

catalogs are important information resources in the field of historical archives. the classification of archival catalogs can not only link relevant information in archives or archive fonds, improve researchers’ utilization efficiency, and save time to search for required archives, but it can also be shown to readers in clusters.
as the hetu dangse catalog is a series of historical documents stored for a long period of time, its original classification system is not well suited to existing archival management methods. the hetu dangse has a total of 1,149 volumes and 127,000 pages. each volume contains a different number of documents, and the ink characters on chinese art paper are in manchu and chinese. reading and categorizing the full text of the hetu dangse not only requires a lot of manpower, material, and financial resources but also places extremely high requirements on the classification staff. they need to possess a good knowledge of manchu, archival science, document taxonomy, and other related disciplines. therefore, sorting and organizing the content of the hetu dangse by manual reading and comprehension alone is impractical. to address this problem, we used the svm model of machine learning to automatically classify and explore the catalog data of the hetu dangse. this model further demonstrates the relevance of the knowledge between documents in the hetu dangse and facilitates an in-depth analysis.

we imported the vectorized labeled data set into the svm model and selected the optimal parameter combination to run the model. to visualize the data results, the 50-dimensional word vector is reduced to a 2-dimensional word vector using the t-distributed stochastic neighbor embedding (t-sne) algorithm. we used the svm model to establish a hyperplane visualized in 2-dimensional form. owing to the large number of categories, the legend in figure 10 shows only the data distribution of the six categories with the highest proportion. to test the classification effect of the svm model, we used precision and recall as metrics and calculated the f1 score to validate the model. the results are presented in table 3. based on the created svm model, 95,680 catalog records of the hetu dangse were predicted and classified. the results are shown in figure 11.
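the validation metrics named above can be computed per class and then averaged. the sketch below uses macro averaging (an unweighted mean over classes); which averaging scheme underlies the values in table 3 is not stated in the text, so this choice is an assumption, and the labels are invented toy data.

```python
def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label seen."""
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def macro_average(scores):
    """Unweighted mean of precision, recall, and f1 over all classes."""
    n = len(scores)
    return tuple(sum(s[i] for s in scores.values()) / n for i in range(3))


# Hypothetical true and predicted category labels.
y_true = ["a", "a", "a", "b", "b", "c"]
y_pred = ["a", "a", "b", "b", "c", "c"]

macro_p, macro_r, macro_f1 = macro_average(per_class_scores(y_true, y_pred))
print(f"precision={macro_p:.3f} recall={macro_r:.3f} f1={macro_f1:.3f}")
```

with class-averaged metrics, the averaged f1 score is the mean of the per-class f1 values and generally differs from the harmonic mean of the averaged precision and recall.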
although there exist certain deficiencies in accuracy and other aspects, it has a positive impact on the content research, management, utilization, and retrieval discovery of hetu dangse.

table 3. svm model validation

parameters  result
precision   0.736
recall      0.717
f1 score    0.716

figure 10. svm decision region boundary.

figure 11. hetu dangse catalog data prediction classification.

conclusion

in this study, we used machine learning to analyze and visualize the catalog data of the hetu dangse, revealing the functional relationships of the qing dynasty's shengjing regional institutions recorded in this historical document and showing the communication relationships between institutions. using the svm model, we achieved automatic classification of the hetu dangse catalog from the category perspective. owing to the massive archives of historical materials in ancient china, the fonts of many historical materials cannot be recognized by computers or humans. the digitization of catalogs has become a digital bridge between researchers and historical documents. this not only achieves a concise summary and refinement of the documents but also greatly improves the utilization efficiency by researchers. the svm model can “learn” through the labeled sample data and realize automatic classification of large amounts of unlabeled catalog data. by automatic classification of catalog data, historical data researchers and archive managers can use and manage a large number of historical documents and catalog data more effectively, greatly increasing their utilization.
the co-occurrence algorithm can reveal the regularities contained in the catalog data itself, discover the distance between the catalog data, and form clusters, providing a clearer direction for researchers to use historical documents. the algorithm also saves researchers the time otherwise spent browsing documents without a clear target, making content presentation of historical documents to readers clearer.

this paper improves archivists’ awareness of archive data compilation and management. first, data is observed, topics are identified, and potential relationships between these are found and established to improve historical archives’ compilation. second, the visual presentation method and carrier are chosen, and the established relationships are visualized in the web browser for users to access and utilize. it can be said that scientometric research methods can promote the transformation of historical research and archives management and compilation research from traditional explanatory scholarship to truth-seeking scholarship.

currently, the application of machine learning technology has gradually extended from applied disciplines to traditional fields of literature, art, and sociology. however, there are still many opportunities in the field of historical research. this study used methods in the field of artificial intelligence to conduct text mining and visualize the presentation of historical archive document catalog data, and it proposes a new digital and intelligent solution for researching chinese historical documents. with the development of science and technology, research methods for historical documents are undergoing constant changes from the traditional manual subjective analysis of historical data to relying on quantitative analysis represented by deep learning and data mining technology.
it is an irreversible trend to research historical documents more comprehensively, accurately, and scientifically by means of artificial intelligence and other technologies on the scientific frontier. for future work, we plan to conduct research on the qing dynasty historical documents from a deeper semantic analysis level, construct a knowledge graph through the method of named entity recognition, and construct an ontological model transforming historical documents into a structured knowledge base to discover new knowledge from historical documents in an automated manner.

acknowledgments

funding statement

this work was supported by the general program of the national natural science foundation of china [grant number 72074060], the research foundation of the ministry of education of china [grant number 20jhq012], and the national social science fund of china [grant number 16btq089].

data accessibility

the data sets supporting this article have been uploaded as part of the supplementary material. https://drive.google.com/drive/folders/1bzs17otruyva_qkbshmf836ygdti40y0?usp=sharing

competing interests

we have no competing interests.

endnotes

1 wang tao, “data mining of german historical documents in the 18th century, taking topic models as examples,” xuehai 1, no. 20 (2017): 206–16, https://doi.org/10.16091/j.cnki.cn32-1308/c.2017.01.021.

2 kaixu zhang and yunqing xia, “crf-based approach to sentence segmentation and punctuation for ancient chinese prose,” journal of tsinghua university (science and technology) 10, no. 27 (2009): 39–49, https://doi.org/10.16511/j.cnki.qhdxxb.2009.10.027.

3 michael stauffer, andreas fischer, and kaspar riesen, “keyword spotting in historical handwritten documents based on graph matching,” pattern recognition 81 (2018): 240–53, https://doi.org/10.1016/j.patcog.2018.04.001; wu sihang et al., “precise detection of chinese characters in historical documents with deep reinforcement learning,” pattern recognition 107 (2020): 107503, https://doi.org/10.1016/j.patcog.2020.107503.

4 renata solar and dalibor radovan, “use of gis for presentation of the map and pictorial collection of the national and university library of slovenia,” information technology and libraries 24, no. 4 (2005): 196–200, https://doi.org/10.6017/ital.v24i4.3385.

5 shaochun dong et al., “semantic enhanced webgis approach to visualize chinese historical natural hazards,” journal of cultural heritage 14, no. 3 (2013): 181–89, https://doi.org/10.1016/j.culher.2012.06.009; jakub kuna and łukasz kowalski, “exploring a non-existent city via historical gis system by the example of the jewish district ‘podzamcze’ in lublin (poland),” journal of cultural heritage 46 (2020): 328–34, https://doi.org/10.1016/j.culher.2020.07.010.

6 aleksandrs ivanovs and aleksey varfolomeyev, “service-oriented architecture of intelligent environment for historical records studies,” procedia computer science 104 (2017): 57–64, http://doi.org/10.1016/j.procs.2017.01.062; guus schreiber et al., “semantic annotation and search of cultural-heritage collections: the multimedian e-culture demonstrator,” journal of web semantics 6, no. 4 (2008): 243–49, https://doi.org/10.1016/j.websem.2008.08.001.

7 m kim et al., “inference on historical factions based on multi-layered network of historical figures,” expert systems with applications 161 (2020): 113703, http://doi.org/10.1016/j.eswa.2020.113703.

8 hobson lane, cole howard, and hannes hapke, natural language processing in action: understanding, analyzing, and generating text with python (new york: manning publications, 2019), 165.
9 laurens van der maaten, eric postma, and jaap van den herik, “dimensionality reduction: a comparative review,” tilburg university technical report, ticc-tr 2009-005 (2009), https://lvdmaaten.github.io/publications/papers/tr_dimensionality_reduction_review_2009.pdf. 10 gavin hackeling, mastering machine learning with scikit-learn (birmingham: packt publishing, 2017). 11 richard smiraglia, domain analysis for knowledge organization: tools for ontology extraction (oxford: chandos publishing, 2015). 12 kuo-chung chu, hsin-ke lu, and wen-i liu, “identifying emerging relationship in healthcare domain journals via citation network analysis,” information technology and libraries 37, no. 1 (2018): 39–51, https://doi.org/10.6017/ital.v37i1.9595. 13 archives of liaoning province in china, “the hetu dangse series archives publication,” qing history research 6, no. 2 (2009): 1.
14 amit kumar sharma, sandeep chaurasia, and devesh kumar srivastava, “sentimental short sentences classification by using cnn deep learning model with fine tuned word2vec,” procedia computer science 167 (2020): 1139–47, https://doi.org/10.1016/j.procs.2020.03.416. 15 b hongxi, “research on the sanling management institutions of the qing dynasty outside the pass,” manchu minority research 4, no. 12 (1997): 38–56. 16 guangli zhu et al., “building multi-subtopic bi-level network for micro-blog hot topic based on feature co-occurrence and semantic community division,” journal of network and computer applications 170 (2020): 102815, https://doi.org/10.1016/j.jnca.2020.102815. 17 s. ravikumar, ashutosh agrahari, and s. n. singh, “mapping the intellectual structure of scientometrics: a co-word analysis of the journal scientometrics (2005–2010),” scientometrics 102 (2015): 929–55, https://doi.org/10.1007/s11192-014-1402-8. 18 jiming hu and yin zhang, “research patterns and trends of recommendation system in china using co-word analysis,” information processing and management 51, no. 4 (2015): 329–39, https://doi.org/10.1016/j.ipm.2015.02.002. 19 nees jan van eck and ludo waltman, “software survey: vosviewer, a computer program for bibliometric mapping,” scientometrics 84, no. 2 (2010): 523–38, https://doi.org/10.1007/s11192-009-0146-3. 20 z yanchang and l xinzhu, “the study of the function of shengjing office from the use of the official communication: an academic investigation based on hetu dangse,” shanxi archives 8, no. 12 (2020): 179–88. 21 shengjing ministry of revenue, guangxu's great qing huidian volume 25 (zhonghua book company, 1991), 211–12. 22 f yonggong and g jialu, “brief introduction of shengjing upper three banners baoyi zuoling,” historical archives 9, no. 30 (1992): 93–7. 23 wangyue, “research on the yamens and their affair relationships in shengjing area,” shenyang palace museum journal 1, no. 31 (2011): 67–77.
the development and administration of automated systems in academic libraries

richard de gennaro: harvard university library, cambridge, mass.

the first part of this paper considers three general approaches to the development of an automation program in a large research library. the library may decide simply to wait for developments; it may attempt to develop a total or integrated system from the start; or it may adopt an evolutionary approach leading to an integrated system. outside consultants, it is suggested, will become increasingly important. the second part of the paper deals with important elements in any program regardless of the approach.
these include the building of a capability to do automation work, staffing, equipment, organizational structure, selection of projects, and costs.

journal of library automation vol. 1/1 march, 1968

since most computer-based systems in academic libraries at the present time are in the developmental or early operational stages, when improvements and modifications are frequent, it is difficult to make a meaningful separation between the developmental function and the administrative or management function. development, administration, and operations are all bound up together and are in most cases carried on by the same staff. this situation will change in time, but it seems safe to assume that automated library systems will continue to be characterized by instability and change for the next several years. in any case, this paper will not attempt to distinguish between developmental and administrative functions but will instead discuss in an informal and non-technical way some of the factors to be considered by librarians and administrators when their thoughts turn, as they inevitably must, to introducing computer systems into their libraries or to expanding existing machine operations. alternative approaches to library automation will be explored first. there will follow a discussion of some of the important elements that go into a successful program, such as building a capability, a staff, and an organization. the selection of specific projects and the matter of costs will also be covered briefly.

approaches to library automation

devising a plan for automating a library is not entirely unlike formulating a program for a new library building. while there are general types of building best suited to the requirements of different types of library, each library is unique in some respects, and requires a building which is especially designed for its own particular needs and situation.
as there are no canned library building programs, so there are no canned library automation programs, at least not at this stage of development; therefore the first task of a library administration is to formulate an approach to automation based on a realistic assessment of the institution's needs and resources. certain newly-founded university libraries such as florida atlantic, which have small book collections and little existing bibliographical apparatus, have taken the seemingly logical course of attempting to design and install integrated computer-based systems for all library operations. certain special libraries with limited collections and a flexible bibliographical apparatus are also following this course. project intrex at m.i.t. is setting up an experimental library operation parallel to the traditional one, with the hope that the former will eventually transform or even supersede the latter. several older university libraries, including chicago, washington state, and stanford, are attempting to design total systems based on on-line technology and to implement these systems in modules. many other university libraries (british columbia, harvard, and yale to name only a few) approach automation in an evolutionary way and are designing separate, but related, batch-processing systems for various housekeeping functions such as circulation, ordering and accounting, catalog input, and card production. still other libraries (princeton is a notable example) expect to take little or no action until national standardized bibliographical formats have been promulgated, and some order or pattern has begun to emerge from the experimental work that is in progress. only time will tell which of these courses will be most fruitful. meanwhile the library administrator must decide what approach to take; and the approach to automation, like that to a building program, must be based on local requirements and available resources (1, 2).
for the sake of this discussion the principal approaches will be considered under three headings: 1) the wait-for-developments approach, 2) the direct approach to a total system, and 3) the evolutionary approach to a total system. the use of outside consultants will also be discussed.

the wait-for-developments approach

this approach is based on the premise that practically all computer-based library systems are in an experimental or research-and-development stage with questionable economic justification, and that it is unnecessary and uneconomical for every library to undertake difficult and costly development work. the advocates of this approach suggest that library automation should not be a moon race and say that it makes sense to wait until the pioneers have developed some standardized, workable, and economical systems which can be installed and operated in other libraries at a reasonable cost. for many libraries, particularly the smaller ones, this is a reasonable position to take for the next few years. it is a cautious approach which minimizes costs and risks. for the larger libraries, however, it overlooks the fact that soon, in order to cope with increasing workloads, they will have to develop the capability to select, adapt, implement, operate, and maintain systems that were developed elsewhere. the development of this capability will take time and will be made more difficult by the absence of any prior interest and activity in automation within the adapting institution. the costs will be postponed and perhaps reduced because the late-starters will be able to telescope much of the process, like countries which had their industrial revolution late. however, it will take some courage and political astuteness for a library administrator to hold firmly to this position in the face of the pressures to automate that are coming from all quarters, both inside and outside the institution (3).
a major error in the wait-for-developments approach is the assumption that a time will come when the library automation situation will have shaken down and stabilized so that one can move into the field confidently. this probably will not happen for many years, if it happens at all, for with each new development there is another more promising one just over the horizon. how long does one wait for the perfect system to be developed so that it can be easily "plugged in," and how does one recognize that system when one sees it? there is real danger of being left behind in this position, and a large library may then find it difficult indeed to catch up.

the direct approach to a total system

this approach to library automation is based on the premise that, since a library is a total operating unit and all its varied operations are interrelated and interconnected, the logic of the situation demands that it be looked upon as a whole by the systems designers and that a single integrated or total system be designed to include all machinable operations in the library. such a system would make the most efficient and economical use of the capabilities of the computer. this does not require that the entire system be designed and implemented at the same time, but permits treating each task as one of a series of modules, each of which can be implemented separately, though designed as part of a whole. several large libraries have chosen this method and, while a good deal of progress is being made, these efforts are still in the early development stage. the university of chicago system is the most advanced (4). unlike the evolutionary approach, which assumes that much can be done with local funds, home-grown staff, batch processing and even second generation computers, the total systems approach must be based on sophisticated on-line as well as batch-processing equipment.
this equipment is expensive; it is also complex, requiring a trained and experienced staff of systems people and expert programmers to design, implement, and operate it effectively. since the development costs involved in this approach are considerable, exceeding the available resources of even the larger libraries, those libraries that are attempting this method have sought and received sizable financial backing from the granting agencies. the total systems approach has logic in its favor: it focuses on the right goal and the goal will ultimately be attainable. the chief difficulty, however, is one of timing. the designers of these systems are trying to telescope the development process by skipping an intermediate stage in which the many old manual systems would have been converted to simple batch-processing or off-line computer systems, and the experience and knowledge thus acquired utilized in taking the design one step further into a sophisticated, total system using both on-line and batch-processing techniques. the problem is that we neither fully understand the present manual systems nor the implications of the new advanced ones. we are pushing forward the frontiers of both library automation and computer technology. it may well be that the gamble will pay off, but it is extremely doubtful that the first models of a total library system will be economically and technically viable. the best that can be hoped for is that they will work well enough to serve as prototypes for later models. while bold attempts to make a total system will unquestionably advance the cause of library automation in general, the pioneering libraries may very well suffer serious setbacks in the process, and the prudent administrator should carefully weigh the risks and the gains of this approach for his own particular library. 
the evolutionary approach to a total system

this approach consists basically of taking a long-range, conservative view of the problem of automating a large, complex library. the ultimate goal is the same as that of the total systems approach described in the preceding section, but the method of reaching it is different. in the total systems approach, objectives are defined, missions for reaching those objectives are designed, and the missions are computerized, usually in a series of modules. in the evolutionary approach, the library moves from traditional manual systems to increasingly complex machine systems in successive stages to achieve a total system with the least expenditure of effort and money and with the least disruption of current operations and services (5). in the first stage the library undertakes to design and implement a series of basic systems to computerize various procedures using its own staff and available equipment. this is something of a bootstrap operation, the basic idea of which is to raise the level of operation (circulation, acquisitions, catalog input, etc.) from existing manual systems to simple and economical machine systems until major portions of the conventional systems have been computerized. in the process of doing this, the library will have built up a trained staff, a data processing department or unit with a regular budget, some equipment, and a space in which to work: in short, an in-house capability to carry on complex systems work. during this first stage the library will have been working with tried and tested equipment and software packages, probably of the second generation variety; meanwhile, third generation computers with on-line and time-sharing software are being debugged and made ready for use in actual operating situations.
at some point the library itself, computer hardware and software, and the state of the library automation art will all have advanced to a point where it will be feasible to undertake the task of redesigning the simple stage-one systems into a new integrated stage-two system which builds upon the designs and operating experience obtained with the earlier systems. these stage-one systems will have been, for the most part, mechanized versions of the old manual systems; but the stage-two systems, since they are a step removed from the manual ones, can be designed to incorporate significant departures from the old way of doing things and take advantage of the capabilities of the advanced equipment and software that will be used. the design, programming, and implementation of these stage-two systems will be facilitated by the fact that the library is going from one logical machine system to another, rather than from primitive unformalized manual systems to highly complex machine systems in one step. because existing manual systems in libraries produce no hard statistical data about the nature and number of transactions handled, stage-one machine systems have had to be designed without benefit of this essential data. however, even the simplest machine systems can be made to produce a wide variety of statistical data which can be used to great advantage by the designers of stage-two systems. the participation of non-library-oriented computer people in stage-two design will also be facilitated by the fact that they will be dealing with formalized machine systems and records in machine readable form with which they can easily cope. while the old stage one of library automation was one in which librarians almost exclusively did the design and programming, it is doubtful that stage-two systems can or should be done without the active aid of computer specialists.
in stage one it was easier for librarians to learn computing and to do the job themselves than it was to teach computer people about the old manual systems and the job to be done to convert them. this may no longer be the case in dealing with redesign of old machine systems into very complex systems to run on third or fourth generation equipment in an on-line, time-sharing environment. there is now a generation of experienced computer-oriented librarians capable of specifying the job to be done and knowledgeable enough to judge the quality of the work that has been done by the experts. there is no reason why a team of librarians and computer experts should not be able to work effectively together to design and implement future library systems. as traditional library systems are replaced by machine systems, the specialized knowledge of them becomes superfluous, and it was this type of knowledge that used to distinguish the librarian from the computer expert. just as there is a growing corps of librarians specializing in computer work, so there is a growing corps of computer people specializing in library work. it is with these two groups working together as a team that the hope of the future lies. the question of who is to do library automation, librarians or computer experts, is no longer meaningful; library automation will be done by persons who are knowledgeable about it and who are deeply committed to it as a specialty; whether they have approached it through a background of librarianship or technology will be of little consequence. experience has shown that computer people who have made a full-time commitment to the field of library automation have done some of the best work to date. stage-two, or advanced integrated library systems, may be built by a team of library and computer people of various types working as staff members of the library, as has been suggested in the preceding discussion, but this approach also has its weaknesses.
for example, let us assume that a large library has finally brought itself through stage one and is now planning to enter the second stage. it may have acquired a good deal of the capability to do advanced work, but its staff may be too small and too inexperienced in certain aspects of the work to undertake the major task of planning, designing, and implementing a new integrated system. additional expert help may be needed, but only on a temporary basis during the planning and design stages. such people will be hard to find, and also hard to hire within some library salary structures. they will be difficult to absorb into the library's existing staff, administrative, and physical framework. they may also be difficult to separate from the staff when they are no longer needed.

use of outside consultants

there are alternative approaches to creating advanced automated systems. the discussion that follows will deal with one of the most obvious: to contract much of the work out to private research and development firms specializing in library systems. what comes to mind here is an analogy with the employment of specialized talents of architects, engineers, and construction companies in planning and building very large, complex and costly library buildings, which are then turned over to librarians to operate. when a decision has been made to build a new building, the university architect is not called in to do the job, nor is an architect added to the library staff, nor are librarians on the staff trained to become architects and engineers qualified to design and supervise the construction of the building. most libraries have on their staffs one or two librarians who are experienced and knowledgeable enough to determine the over-all requirements of the new building, and together they develop a building program which outlines the general concept of the building and specifies various requirements.
a qualified professional architect is commissioned to translate the program into preliminary drawings, and there follows a continuing dialogue between the architect and the librarians which eventually produces acceptable working drawings of a building based on the original program. for tasks outside his area of competence, the architect in turn engages the services of various specialists, such as structural and heating and ventilating engineers. both the architect and the owners can also call on library consultants for help and advice if needed. the architect participates in the selection of a construction company to do the actual building and is responsible for supervising the work and making sure that the building is constructed according to plans and contracts. upon completion, the building is turned over to the owners, and the librarians move in and operate it and see to its maintenance. in time, various changes and additions will have to be made. minor ones can be made by the regular buildings staff of the institution, but major ones will probably be made with the advice and assistance of the original architect or some other. in the analogous situation, the library would have its own experienced systems unit or group capable of formulating a concept and drawing up a written program specifying the goals and requirements of the automated system. a qualified "architect" for the system would be engaged in the form of a small firm of systems consultants specializing or experienced in library systems work. their task, like the architect's, would be to turn the general program into a detailed system design with the full aid and participation of the local library systems group. this group would be experienced and competent enough to make sure that the consultants really understood the program and were working in harmony with it.
after an acceptable design had emerged from this dialogue, the consultant would be asked to help select a systems development firm which would play a role similar to that of the construction company in the analog: to complete the very detailed design work and to do the programming and debugging and implementation of the system. the consultant would oversee this work, just as the architect oversees the construction of a building. the local library group will have actively participated in the development and implementation of the system and would thus be competent to accept, operate, maintain and improve it. success or failure in this approach to advanced library automation will depend to a large extent on the competence of the "architect" or consultant who is engaged. until recently this was not a very promising route to take for several reasons. there were no firms or consultants with the requisite knowledge and experience in library systems, and the state of the library automation art was confused and lacking in clear trends or direction. it was generally felt that batch-processing systems on second and even third generation computing equipment could and should be designed and installed by local staff in order to give them necessary experience and to avoid the failures that could come from systems designed outside the library. library automation has evolved to a point where there is a real need for advanced library systems competence that can be called upon in the way that has been suggested, and individuals and firms will appear to satisfy that need. it is very likely, however, that the knowledge and the experience that is now being obtained in on-line systems by pioneering libraries such as the university of chicago, washington state university and stanford university, will have to be assimilated before we can expect competent consultants to emerge.
the chief difficulty with the architect-and-building analog is that while the process of designing and constructing library buildings is widely understood, there being hundreds of examples of library buildings which can be observed and studied as precedents, the total on-line library system has yet to be designed and tested. there are no precedents and no examples; we are in the position of asking the "architect" to design a prototype system, and therein lies the risk. after this task has been done several times, librarians can begin to shop around for experienced and competent "architects" and successful operating systems which can be adapted to their needs. the key problem here, as always in library automation, is one of correct timing: to embark on a line of development only when the state of the art is sufficiently advanced and the time is ripe for a particular new development.

building the capability for automation

regardless of the approach that is selected, there are certain prerequisites to a successful automation effort, and these can be grouped under the rubric of "building the capability." to build this capability requires time and money. it consists of a staff, equipment, space, an organization with a regular budget, and a certain amount of know-how which is generally obtained by doing a series of projects. success depends to a large extent on how well these resources are utilized, i.e. on the overall strategy and the nature and timing of the various moves that are made. much has already been said about building the capability in the discussion on the approaches to automation, and what follows is an expansion of some points that have been made and a recapitulation of others.

staff

since nothing gets done without people, it follows that assembling, training, and holding a competent staff is the most important single element in a library's automation effort.
the number of trained and experienced library systems people is still extremely small in relation to the ever-growing need and demand. to attract an experienced computer librarian and even to hold an inexperienced one with good potential, libraries will have to pay more than they pay members of the staff with comparable experience in other lines of library work. this is simply the law of supply and demand at work. to attract people from the computer field will by the same token require even higher salaries. in addition, library systems staff, because of the rate of development of the field and the way in which new information is communicated, will have to be given more time and funds for training courses and for travel and attendance at conferences than has been the case for other library staff. the question of who will do library automation, librarians or computer experts, has already been touched upon in another context, but it is worth emphasizing the point that there is no unequivocal answer. there are many librarians who have acquired the necessary computer expertise and many computer people who have acquired the necessary knowledge of library functions. the real key to the problem is to get people who are totally committed to library automation whatever their background. computer people on temporary loan from a computing center may be poor risks, since their professional commitment is to the computer world rather than that of the library. they are paid and promoted by the computing center and their primary loyalty is necessarily to that employer. computer people, like the rest of us, give their best to tasks which they find interesting and challenging, and by and large, they tend to look upon the computerization of library housekeeping tasks as trivial and unworthy of their efforts.
on the other hand, a first-rate computer person who has elected to specialize in library automation and who has accepted a position on a library staff may be a good risk, because he will quickly take on many of the characteristics of a librarian yet without becoming burdened by the full weight of the conventional wisdom that librarians are condemned to carry. the ideal situation is to have a staff large enough to include a mixture of both types, so that each will profit by the special knowledge and experience of the other. to bring in computer experts inexperienced in library matters to automate a large and complex library without the active participation of the library's own systems people is to invite almost certain failure. outsiders, no matter how competent, tend to underestimate the magnitude and complexity of library operations; this is true not only of computing center people but also of independent research and development firms. a library automation group can include several different types of persons with very different kinds and levels of qualifications. the project director or administrative head should preferably be an imaginative and experienced librarian who has acquired experience with electronic data processing equipment and techniques, and an over-all view of the general state of the library automation art, including its potential and direction of development. there are various levels of library systems analysts and programmers, and the number and type needed will depend on the approach and the stage of a particular library's automation effort. the critical factor is not numbers but quality. there are many cases where one or two inspired and energetic systems people have far surpassed the efforts of much larger groups in both quality and quantity of work.
some of the most effective library automation work has been done by the people who combine the abilities of the systems analyst with those of the expert programmer and are capable of doing a complete project themselves. a library that has one or two really gifted systems people of this type and permits them to work at their maximum is well on the way to a successful automation effort. as a library begins to move into development of on-line systems, it will need specialist programmers in addition to the systems analysts described above. these programmers need not be, and probably will not be, librarians. other members of the team, again depending on the projects, will be librarians who are at home in the computer environment but who will be doing the more traditional types of work, such as tagging and editing machine catalog records. in any consideration of library automation staff, it would be a mistake to underestimate the importance of the role of keypunchers, paper tape typists, and other machine operators; it is essential that these staff members be conscientious and motivated persons. they are responsible for the quality and quantity of the input, and therefore of the output, and they can frequently do much to make or break a system. a good deal of discussion and experimentation has gone into the question of the relative efficiency of various keyboarding devices for library input, but little consideration is given to the human operators of the equipment. experience shows that there can be large variations in the speed and accuracy of different persons doing the same type of work on the same machine.

equipment

one of the lessons of library automation learned during the last few years is that a library cannot risk putting its critical computer-based systems onto equipment over which it has no control. this does not necessarily mean that it needs its own in-house computer.
however, if it plans to rely on equipment under the administrative control of others, such as the computer center or the administrative data processing unit, it must get firm and binding commitments for time, and must have a voice in the type and configuration of equipment to be made available. the importance of this point may be overlooked during an initial development period, when the library's need for time is minimal and flexible; it becomes extremely critical when systems such as acquisitions and circulation become totally dependent on computers. people at university computing centers are generally oriented toward scientific and research users and in a tight situation will give the library's needs second priority; those in administrative data processing, because they are operations oriented, tend to have a somewhat better appreciation of the library's requirements. in any case, a library needs more than the expressed sympathy and goodwill of those who control the computing equipment; it needs firm commitments. for all but the largest libraries, the economics of present-day computer applications in libraries make it virtually impossible to justify an in-house machine of the capacity libraries will need, dedicated solely or largely to library uses. even the larger libraries will find it extremely difficult to justify a high-discount second generation machine or a small third generation machine during the period when their systems are being developed and implemented a step or a module at a time. eventually, library use may increase to a point where the in-house machine will pay for itself, but during the interim period the situation will be uneconomical unless other users can be found to share the cost. in the immediate future, most libraries will have to depend on equipment located in computing or data processing centers.
the recent experience of the university of chicago library, which is pioneering on-line systems, suggests that this situation is inevitable, given the high core requirements and low computer usage of library systems. experience at the university of missouri (6) suggests that the future will see several libraries grouping to share a machine dedicated to library use; this may well be preferable to having to share with research and scientific users elsewhere within the university. a clear trend is not yet evident, but it seems reasonable to suppose that in the next few years sharing of one kind or another will be more common than having machines wholly assigned to a single library; and that local situations will dictate a variety of arrangements. while it is clear that the future of library automation lies in third-generation computers, much of their promise is as yet unfulfilled, and it would be premature at this point to write off some of the old, reliable, second-generation batch-processing machines. the ibm 1401, for example, is extremely well suited for many library uses, particularly printing and formatting, and it is a machine easily mastered by the uninitiated. this old workhorse will be with us for several more years before it is retired to majorca along with obsolete paris taxis.

organization

when automation activity in a library has progressed to a point where the systems group consists of several permanent professionals and several clericals, it may be advisable to make a permanent place for the group in the library's regular organizational structure. the best arrangement might be to form a separate unit or department on an equal footing with the traditional departments such as acquisitions, cataloging, and public services.
this systems department would have a two-fold function: it would develop new systems and operate implemented systems; and it would bring together for maximum economy and efficiency most of the library's data processing equipment and systems staff. it will require adequate space of its own and, above all, a regular budget, so that permanent and long-term programs can be developed and sustained on something other than an ad hoc basis. there are other advantages to having an established systems department or unit. it gives a sense of identity and esprit to the staff; and it enables them to work more effectively with other departments and to be accepted by them as a permanent fact of life in the library, thereby diminishing resistance to automation. let there be no mistake about it: the systems group will be a permanent and growing part of the library staff, because there is no such thing as a finished, stable system. (there is a saying in the computer field which goes "if it works, it's obsolete.") the systems unit should be kept flexible and creative. it should not be allowed to become totally preoccupied with routine operations and submerged in its day-to-day workload, as is too frequently the case with the traditional departments, which consequently lose their capacity to see their operations clearly and to innovate. part of the systems effort must be devoted to operational systems, but another part should be devoted to the formulation and development of new projects. the creative staff should not be wasted running routine operations. there has never been any tradition for research and development work in libraries; they were considered exclusively service and operational institutions. the advent of the new technology is forcing a change in this traditional attitude in some of the larger and more innovative libraries which are doing some research and a good deal of development.
it is worth noting that a concomitant of research and development is a certain amount of risk but that, while there is no such thing as change without risk, standing pat is also a gamble. not every idea will succeed and we must learn to accept failures, but the experiments must be conducted so as to minimize the effect of failure on actual library operations. automated systems are never finished; they are open-ended. they are always being changed, enlarged, and improved; and program and system maintenance will consequently be a permanent activity. this is one of the chief reasons why the equipment and the systems group should be concentrated in a separate department. the contrary case, namely dispersion of the operational aspects among the departments responsible for the work, may be feasible in the future as library automation becomes more sophisticated and peripheral equipment becomes less expensive, but the odds at this time appear to favor greater centralization. the harvard university library has created, with good results, a new major department along the lines suggested above, except that it also includes the photo-reproduction services. the combination of data processing and reprography in a single department is a natural and logical relationship and one which will have increasingly important implications as both technologies develop concurrently and with increasing interdependence in the future. even at the present time, there is sufficient relationship between them so that the marriage is fruitful and in no way premature. while computers have had most of the glamour, photographic technology in general, and particularly the advent of the quick-copying machine, during the last seven years has so far had a more profound and widespread impact on library resources and services to readers than the entire field of computers and data processing.
within the next several years, computer and reprographic technology will be so closely intertwined in libraries as to be inseparable. it would be a mistake to sell reprography short in the coming revolution.

project selection

no academic library should embark on any type of automation program without first acquiring a basic knowledge of the projects and plans of the library of congress, the national library of medicine, the national library of agriculture, and certain of their joint activities, such as the national serials data program. as libraries with no previous experience with data processing systems move into the field of automation, they frequently select some relatively simple and productive projects to give experience to the systems staff and confidence in machine techniques to the rest of the library staff. precise selection will depend on the local situation, but projects such as the production of lists of current journals (not serials check-in), lists of reserve books, lists of subject headings, circulation, and even acquisitions ordering and accounting systems are considered to be the safest and the most productive type of initial projects. since failures in the initial stage will have serious psychological effects on the library administration and entire staff, it is best to begin with modest projects. until recently it was fashionable to tackle the problem of automating the serials check-in system as a first project on the grounds that this was one of the most important, troublesome, and repetitive library operations and was therefore the best area in which to begin computerization. fortunately, a more realistic view of the serials problem has begun to prevail: that serial receipts is an extremely complex and irregular library operation and one which will probably require some on-line updating capabilities, and complex file organization and maintenance programs.
in any case, it is decidedly not an area for beginners. a major objection to all of the projects mentioned is that they do not directly involve the catalog, which is at the heart of library automation. now that the marc ii format has been developed by the library of congress and is being widely accepted as the standardized bibliographical and communications format, the most logical initial automation effort for many libraries will be to adapt to their own environments the input system for current cataloging which is now being developed by the library of congress. the logic of beginning an integrated system with the development of an input sub-system for current cataloging has always been compelling for this author, far more compelling than beginning in the ordering process, as so many advocate. the catalog is the central record, and the conversion of this record into machinable form is the heart of the matter of library automation. it seems self-evident that systems design should begin here with the basic bibliographical entry upon which the entire system is built. having designed this central module, one can then turn to the acquisitions process and design this module around the central one. circulation is a similar secondary problem. in other words, systems design should begin at the point where the permanent bibliographical record enters the system and not where the first tentative special-purpose record is created. unfortunately, until the advent of the standardized marc ii format, it was not feasible, except in an experimental way, for libraries to begin with the catalog record, simply because the state of the art was not far enough advanced. the development and acceptance of the marc ii format in 1967 marks the end of one era in library automation and the beginning of another.
in the pre-marc ii period every system was unique; all the programming and most of the systems work had to be done by a library's own staff. in the post-marc ii period we will begin to benefit from systems and programs that will be developed at the library of congress and elsewhere, because they will be designed around the standard format and for at least one standard computer. as a result of this, automation in libraries will be greatly accelerated and will become far more widespread in the next few years (7). an input system for current cataloging in the marc ii format will be among the first packages available. it will be followed shortly by programs designed to sort and manipulate the data in various ways. a library will require a considerable amount of expertise on the part of its staff to adapt these procedures and programs to its own uses (we are not yet at the point of "plugging-in" systems), but the effort will be considerably reduced and the risks of going down blind alleys with homemade approaches and systems will be nearly eliminated for those libraries that are willing to adopt this strategy. the development and operation of a local marc ii input system with an efficient alteration and addition capability will be a prerequisite for any library that expects to learn to make effective use of the magnetic tapes containing the library of congress's current catalog data in the marc ii format, which will be available as a regular subscription in july, 1968. in addition to providing the experience essential for dealing with the library of congress marc data, a local input system will enable the library to enter its own data both into the local systems and into the national systems which will begin to emerge in the near future.
since the design of the marc ii format is also hospitable to other kinds of library data, such as subject-headings lists and classification schedules, the experience gained with it in an input system will be transferable to other library automation projects.

costs

the price of doing original development work in the library automation field comes extremely high, so high that in most cases such work cannot be undertaken without substantial assistance from outside sources. even when grants are available, the institution has to contribute a considerable portion of the total cost of any development effort, and this cost is not a matter of money alone; it requires the commitment of the library's limited human resources. in the earlier days of library automation attention was focused on the high cost of hardware, computer and peripheral equipment. the cost of software, the systems work and programming, tended to be underestimated. experience has shown, however, that software costs are as high as hardware costs or even higher. the development of new systems, i.e., those without precedents, is the most costly kind of library automation, and most libraries will have to select carefully the areas in which to do their original work. for those libraries that are content to adopt existing systems, the costs of the systems effort, while still high, are considerably less and the risks are also reduced. these costs, however, will probably have to be borne entirely by the institution, as it is unlikely that outside funding can be obtained for this type of work.
the justification of computer-based library systems on the basis of the costs alone will continue to be difficult because machine systems not only replace manual systems but generally do more and different things, and it is extremely difficult to compare them with the old manual systems, which frequently did not adequately do the job they were supposed to do and for which operating costs often were unknown. generally speaking, and in the short run at least, computer-based systems will not save money for an institution if all development and implementation costs are included. they will provide better and more dependable records and systems, which are essential to enable libraries simply to cope with increased intake and workloads, but they will cost at least as much as the inadequate and frequently unexpansible manual systems they replace. the picture may change in the long run, but even then it seems more reasonable to expect that automation, in addition to profoundly changing the way in which the library budget is spent, will increase the total cost of providing library service. however, that service will be at a much higher level than the service bought by today's library budget. certain jobs will be eliminated, but others will be created to provide new services and services in greater depth; as a library becomes increasingly successful and responsive, more and more will be demanded of it.

conclusion

the purpose of this paper has been to stress the importance of good strategy, correct timing, and intelligent systems staff as the essential ingredients for a successful automation program. it has also tried to make clear that no canned formulas for automating an academic library are waiting to be discovered and applied to any particular library. each library is going to have to decide for itself which approach or strategy seems best suited to its own particular needs and situation.
on the other hand, a good deal of experience with the development and administration of library systems has been acquired over the last few years and some of it may very well be useful to those who are about to take the plunge for the first time. this paper was written with the intention of passing along, for what they are worth, one man's ideas, opinions, and impressions based on an imperfect knowledge of the state of the library automation art and a modest amount of first-hand experience in library systems development and administration.

references

1. wasserman, paul: the librarian and the machine (detroit: gale, 1965). a thoughtful and thorough review of the state of the art of library automation, with some discussion of the various approaches to automation. essential reading for library administrators.
2. cox, n. s. m.; dews, j. d.; dolby, j. l.: the computer and the library (newcastle upon tyne: university of newcastle upon tyne, 1966). american edition published by archon books, hamden, conn. extremely clear, well-written and essential book for anyone with an interest in library automation.
3. dix, william s.: annual report of the librarian for the year ending june 30, 1966 (princeton: princeton university library, 1966). one of the best policy statements on library automation; a comprehensive review of the subject in the princeton context, with particular emphasis on the "wait-for-developments" approach.
4. fussler, herman h.; payne, charles t.: annual report 1966/67 to the national science foundation from the university of chicago library; development of an integrated, computer-based, bibliographical data system for a large university library (chicago: university of chicago library, 1967). appended to the report is a paper given may 1, 1967, at the clinic on library application of data processing conducted by the graduate school of library science, university of illinois. mr. payne is the author, and the paper is entitled "an integrated computer-based bibliographic data system for a large university library: progress and problems at the university of chicago."
5. kilgour, frederick g.: "comprehensive modern library systems," in the brasenose conference on the automation of libraries, proceedings. (london: mansell, 1967), 46-56. an example of the evolutionary approach as employed at the yale university library.
6. parker, ralph h.: "not a shared system: an account of a computer operation designed specifically and solely for library use at the university of missouri," library journal, 92 (nov. 1, 1967), 3967-3970.
7. annual review of information science and technology (new york: interscience publishers), 1 (1966). a useful tool for surveying the current state of the library automation art and for obtaining citations to current publications and reports is a chapter on automation in libraries which appears in each volume.

public libraries leading the way
vr hackfest
chris markman, m ryan hess, dan lou, and anh nguyen
information technology and libraries | december 2019

chris markman (chris.markman@cityofpaloalto.org) is senior librarian, information technology & collections, palo alto city public library. m ryan hess (ryan.hess@cityofpaloalto.org) is library services manager, digital initiatives, information technology & collections, palo alto city public library. dan lou (dan.lou@cityofpaloalto.org) is senior librarian, information technology & collections, palo alto city public library. anh nguyen (anh.nguyen@cityofpaloalto.org) is library specialist, information technology & collections, palo alto city public library.

we built the future of the internet…today! the elibrary team at the palo alto city library held a vr hackfest weaving together multiple emerging technologies into a single workshop.
during the event, participants had hands-on experience building vr scenes, which were loaded to a raspberry pi and published online using the distributed web. throughout the day, participants discussed how these technologies might change our lives, for good and for ill. and afterward, an exhibit showcasing the participants’ vr scenes was placed at our mitchell park branch to stir further conversation.

multiple emerging technologies explored

the workshop was largely focused around the a-frame code, a framework for publishing 3d scenes to the web (https://aframe.io/). however, we also integrated a number of other technologies, including a raspberry pi, qr codes, a twitter-bot, and the inter-planetary file system (ipfs), which is a distributed web technology.

virtual reality built with a-frame code

in the vr hackfest, participants first learned how to use a-frame code to render 3d scenes that can be experienced through a web browser or vr headset. a-frame is a new framework that web publishers and 3d designers can use to design web sites, games and 3d art. a-frame is an extension of html, the code used to build web pages. anyone who is familiar with html will pick up a-frame very quickly, but it is simple enough even for beginners. for example, here is some raw a-frame code:

    <!doctype html>
    <html>
      <head>
        <script src="https://aframe.io/releases/0.9.1/aframe.min.js"></script>
      </head>
      <body style='margin: 0px; overflow: hidden;'>
        <a-scene>
          <a-box position="0 1 -2" rotation="0 45 45" scale="1 1 1" color="blue">
          </a-box>
        </a-scene>
      </body>
    </html>

https://doi.org/10.6017/ital.v38i4.11877

figure 1. try this code example! https://tinyurl.com/ipfsvr02.
save the above code as an html file and open it with a webvr-compatible browser like chrome, and you will see a blue cube in the center of your screen. by just changing the values of a few parameters, novice coders can easily change the shape, size, color and location of primitive 3d objects, add 3d backgrounds and more. advanced users can also insert javascript code to make the 3d scenes more interesting. for example, in the workshop, we provided javascript that animated a 3d robot head (see figure 1) pre-loaded into the codepen (https://codepen.io) interface for quicker editing and iteration.

the inter-planetary file system (ipfs)

the collection of 3d scenes created in the vr hackfest was published to the internet using the inter-planetary file system (ipfs), an open source distributed web technology originally created in palo alto by protocol labs in 2014 and now actively improved by a global network of software developers. ipfs allows anyone to publish to the internet without a server, through a peer-to-peer network that can also work seamlessly with the regular internet through http “gateways”. in november 2019, brave browser (https://brave.com) became the first to offer seamless ipfs integration, capable of spawning its own background process or daemon that can upload and download ipfs content on the fly without the need for an http gateway or separate browser extension installation. unlike p2p technologies such as bittorrent, ipfs is best suited for distributing small files available for long periods of time rather than the quick distribution of large files over a short period of time.
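one way to picture how a peer-to-peer network can serve files without a central server is content addressing: a file is requested by a cryptographic hash of its bytes rather than by its location, so any peer holding the bytes can answer and the requester can check what it received. the sketch below is a toy illustration only. the class name, the use of a plain sha-256 hex digest as the address, and the in-memory dictionary are all assumptions made for the example; real ipfs identifiers and storage are more involved.

```python
import hashlib


class ContentStore:
    """Toy content-addressed store (hypothetical; not the IPFS API)."""

    def __init__(self):
        self._blocks = {}  # address -> bytes

    def put(self, data: bytes) -> str:
        # The address is derived from the content itself, not from a location.
        address = hashlib.sha256(data).hexdigest()
        self._blocks[address] = data
        return address

    def get(self, address: str) -> bytes:
        data = self._blocks[address]
        # Re-hash on retrieval so altered content is detected.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("content does not match its address")
        return data


store = ContentStore()
scene = b'<a-box position="0 1 -2" color="blue"></a-box>'

address = store.put(scene)
same_address = store.put(scene)  # identical bytes give the identical address
edited_address = store.put(scene.replace(b"blue", b"red"))  # any edit gives a new address
```

because an address always names exactly one sequence of bytes, an edited scene gets a new address instead of overwriting the old one.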
this is an oversimplification of what is really happening behind the scenes (part of the magic involves content-addressable storage and asynchronous communication methods based on pub/sub messaging, to name a few), but the ability to share and publish 3d environments and 3d objects in a way that can instantly scale to meet demand could have far-reaching consequences for future technologies like augmented reality.

figure 2. workshop attendees.

ipfs can load content much faster and more securely (through features like automated cryptographic hash checking), and it allows people to publish directly to the internet without the need of a third-party host. google, facebook, and amazon web services need not apply. the same technology has already been used to overcome censorship efforts by governments, but like any technology it has its downsides. content on ipfs is essentially permanent, allowing free speech to flourish, but it could also make undesirable content, like hate speech or child pornography, all but impossible to control.

toward 21st century literacy

like our other technology programs, the vr hackfest was designed to engage customers around new forms of literacy, particularly around understanding code and thinking critically about emerging communication technologies. in 2019, we are already seeing how technologies like machine learning and social media are impacting social relations, politics and the economy. it is no longer enough to know how to read and write code that underlies the web. true literacy must also understand how these technologies interface with each other and how they impact people and society.

figure 3. the free-standing exhibit.
to this end, the vr hackfest sought to take participants on a journey, both technological and sociological. once the initial practice with the code was completed, we moved on to a discussion of the consequences of using these technologies. with the distributed web, for example, we explored questions like:

• what are the implications for permanent content on the web which no one can take down?
• what power do gatekeepers like the government and private companies have over our online speech?
• what does a 3d web look like and how will that change how we communicate, tell stories and learn?

after the workshop ended, we continued the conversation with the public through an exhibit placed at our mitchell park branch (see figure 3). in this exhibit, we showcased the vr scenes participants had created and introduced the technologies underlying them. but we also asked people to reflect on the future of the internet and to share their thoughts by posting on the exhibit itself. public comments reflected the current discourse around the internet. responses (see figure 5) were generally positive: most of our customers mentioned better download speeds or other efficiency increases, but a few also highlighted online privacy and safety improvements. we recorded an equal number of pessimistic and technical responses to the same question; these often demonstrated either knowledge of similar technology (e.g. “how is this different than napster?”) or displeasure with the current state of the world wide web (e.g. “less human connections” or “more spyware and less freedom”).

outcomes

one surprise outcome was that our project reached the attention of the developers of ipfs, who happen to live a few blocks away from the library. after reading about the exhibit online, their whole team visited our team at the library. in fact, one of their team turned out to be a former child customer of our library!
the workshop itself, which was featured as a summer reading program activity, also brought in record numbers. originally open to 20 participants and later expanded to 30, the workshop grew a waitlist that more than quadrupled our initial room capacity. clearly, people were interested in learning about these two emerging technologies. we also want to take a moment to highlight the number of design iterations this project went through before making its way into the public eye. the free-standing vr hackfest exhibit was originally conceived as a wall mounted computer kiosk that encouraged users to publish a short message directly to the web with ipfs, but this raised too many privacy concerns and ultimately our building design does not make mounting a computer on the wall an easy task. our workshop also initially focused much more on command line skills working directly with ipfs, but user testing with library staff showed learning a-frame was more than enough.

figure 4. building the exhibit.

figure 5. exhibit responses. (bar chart; categories: optimistic, pessimistic, technical, spam, illegible; vertical axis: number of post-it notes, 0 to 20.)

figure 6. visit from protocol labs co-founders.

the vr hackfest was also a win because it combined so many different skills into a single project. we were not only working with open source tools and highlighting new technologies, but also building an experience for workshop attendees and showcasing their work to thousands of people.

future work

our immediate plans include re-use of the exhibit frame for future public technology showcases and offering another round of vr hackfest workshops, perhaps in a smaller group so participants have the chance to view their work while wearing a vr headset.

figure 7.
3d mock-up.

beyond this, we also think libraries have the opportunity to harness the distributed web for digital collections, potentially undercutting the cost of alternative content delivery networks or file hosting services. through this project we have already tested things like embedded ipfs links in marc records and building a 3d object library. essentially, all the pieces of the “future web” are already here and it is just a matter of time before all modern web browsers offer native support for these new technologies. in general, our project demonstrated the popularity of 21st-century literacy programs. but it also demonstrated the significant technical difficulties of conducting cutting edge technology workshops in public libraries. clearly, the demand is there, and our library will continue to strive to re-imagine library services.

design of library systems for implementation with interactive computers

i. a. warheit: program administrator, information systems marketing, international business machines, san jose, california

in the development of library systems, the movement today is toward the so-called "total" or integrated system. this raises certain design and implementation questions, such as: what functions should be on-line, real time and what should be done off line in a batch mode; should one operate in a time-share environment or is a dedicated system preferred; is it practical to design and implement a total system or is the selective implementation of a series of applications to be preferred. although it may not be feasible in most cases to design and install a total system in a single operation, it is shown how a series of application programs can become the incremental development of such a system.

currently library mechanization is entering a new phase.
the first phase, extending from 1936 to the mid-fifties, saw the development of a number of small, scattered, and essentially experimental automatic data processing ( adp) library applications. these were punch card systems for purchasing, serials holdings lists and circulation control. during the second phase, which has been running now about 15 years, a large number of library applications have been mechanized. these include the production of catalog cards, book catalogs, periodical check-in, serials holdings, circulation control systems, acquisitions programs and searching of files, or 66 journal of library automation vol. 3/ 1 march, 1970 information retrieval. systems librarians have been busy designing individual programs, building special computer stored files, implementing conversion of records and developing operating procedures for these various applications. more importantly, they have been studying the library from a systems point of view in order to have a better understanding of the individual tasks performed and how they can be best accomplished with the available tools. at first concern was limited to individual applications in the library. gradually some of the more perceptive systems analysts began to be concerned about integrating these various applications. some simple examples are the generation of book cards for process control and circulation control as a by-product during the order-receiving cycle; the combination of subscription renewal, claims, and binding control with the serials holding program; the development of authority lists in book catalog programs; the simultaneous updating of accession files and circulation control files, etc. the purpose of many of these partially integrated programs was to reduce redundancy and make multiple use of single inputs. the next step was to look at the library as a whole and consider it as a "total" or single, integrated system. 
rather than building a series of independent applications programs, a number of libraries began to plan total systems in which the individual applications would be integrated segments. in the past year or two such efforts have been undertaken by the university of chicago, stanford university, redstone arsenal, the national library of medicine, washington state university, university of toronto, system development corporation, ibm and others (1, 2, 3, 4, 5, 6). it is this total systems concept which is the new and current development of library electronic data processing (edp). at first, a total integrated system was conceived as a series of separate application programs utilizing separate files, but whose records have similar formats and field designators allowing for the multiple use of single inputs. a more advanced concept, however, calls for the construction of a single logical file, even though, physically, the individual record elements may be distributed over a number of tracks and storage devices. operating on this central file are a series of program modules performing functions involving file building, searching, computation, display and printing. as each application is called for (that is, as the librarian prepares an order, receives an invoice, checks in a periodical, adds a call number, does some cataloging, charges out a book, etc.), the appropriate program functions are called into use. attached to the file are a number of indexes or access points. one such program, for example, provides some eighteen indexes: author, permuted title, subject heading, descriptor, call number, invoice number, publisher, serial i.d., l.c. card number, borrower, etc. it is not just coincidental that the total integrated library system developed at the same time that computer hardware became available that made it practical, especially in an economic sense, to operate a total library system.
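a minimal sketch of the single logical file with several access points, written in python under loudly stated assumptions: the class name `CentralFile`, the field names and the choice of indexes are hypothetical, and the "permuted title" index is simplified here to one access point per title word rather than a full permutation. it is meant only to illustrate one file of records served by several indexes, not to reproduce any of the systems cited.

```python
from collections import defaultdict


class CentralFile:
    """One logical file of records plus several indexes (access points).

    All names here are illustrative assumptions, not taken from the paper.
    """

    # fields indexed by their whole value (hypothetical selection)
    INDEXED_FIELDS = ("author", "call_number", "publisher")

    def __init__(self):
        self.records = {}  # record_id -> full record
        # index name -> key -> set of record ids
        self.indexes = defaultdict(lambda: defaultdict(set))

    def add(self, record_id, record):
        """File-building module: store the record once, index it many ways."""
        self.records[record_id] = record
        for name in self.INDEXED_FIELDS:
            if name in record:
                self.indexes[name][record[name].lower()].add(record_id)
        # simplified stand-in for a permuted title index:
        # every word of the title becomes an access point
        for word in record.get("title", "").lower().split():
            self.indexes["title_word"][word].add(record_id)

    def search(self, index, key):
        """Search module: return matching record ids in sorted order."""
        return sorted(self.indexes[index].get(key.lower(), set()))
```

because every application reads the same stored record, a new index is one more entry in `indexes` rather than a new file.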
one of the basic elements of this hardware was the development of real-time, on-line, terminal-oriented, time-shared systems. at present, orders for on-line systems are increasing at such a rate that it is estimated in the june 23, 1969, edp weekly "that half of the computers installed by 1975 will be on-line systems." although there are a number of reasons why on-line, time-sharing and terminal-oriented equipment made it feasible to build total library systems, the fundamental ones were that now the librarians could interact with their system and records and could, essentially simultaneously, perform a great variety of tasks. the scientific and business communities have been quick to take advantage of these new capabilities. a number of computer manufacturers, software firms and service companies soon started to provide terminal-oriented, commercial time-share services. by the beginning of 1969 there were some 35 such services in existence, serving over 10,000 customers; by the end of 1969 it is estimated there will be over 30,000 users. although these systems are often used essentially for remote job entry, their main attraction for users has been their on-line, conversational, real-time capabilities. the interactive, man-computer techniques made possible by commercial time-sharing services have been extremely valuable for problem-solving applications, especially engineering and programming. however, the wide availability of text editing packages has also opened up these services for libraries. one of the first academic libraries to use such a service for preparing bibliographic records was the state university of new york at buffalo (7, 8). many universities and industrial firms have developed their own time-sharing systems. a number of special libraries, notably those in ibm, were quick to take advantage of their in-house, time-share system to implement acquisitions, catalog input and library bulletin programs (9).
the defense documentation center over three years ago began preparing its bibliographic inputs on line. the suny biomedical network based in syracuse does the same (10). the washington state university library was one of the first academic libraries to implement an on-line acquisition program (11), and midwestern university (12) and bell laboratories (13) now have on-line circulation control systems. with the advent of time-shared, on-line capabilities and the potentiality of building total, integrated systems, librarians today who are planning edp systems are faced with a number of design decisions:
1) should the system be a real-time, on-line system or an off-line, batch mode operation, or a combination of both?
2) is it desirable to operate in a time-share environment or is a dedicated system to be preferred?
3) should one design a total, integrated system or should one selectively implement a number of individual applications?
4) if the decision is for an integrated system, how can it be incrementally implemented?
it is recognized that a program must be tailored to fit the available resources and that it is not always possible to build an ideal system. nevertheless, design objectives must be established even though they cannot be immediately realized. if the ultimate objectives are understood, then the program development will be orderly and later reconversions will be kept to a minimum. therefore, even though the design objectives may not be achieved for a number of years, they should be established so that current implementation can be carried out in a rational manner with some assurance that the system will grow and develop.
real time or batch
library operations have always involved a variety of interactive real time and batch mode procedures. most operations dealing directly with the library patrons are, of course, in real time; reference question handling and charging of books are typical examples.
some technical processes, such as cataloging and searching for acquisition, are also essentially interactive, real-time operations. this means that the librarian completely processes each item by creating or updating a record or servicing an inquiry, one at a time, with little or no attempt to batch the identical operations for a number of items or inquiries. other processing, however, such as preparing and mailing orders to vendors, sorting and filing charge-out cards, sending overdues, filing into the catalog, checking in periodicals, labeling, preparing binding, etc., is essentially done in the batch mode. in other words, batch and real-time operations complement each other, for whereas it is more effective to do some operations in real time, batching is more effective for other operations. librarians, therefore, expect and need both modes of operation. the actual distinction between these two modes is often lost in certain mechanized systems where everything is done in a non-interactive batch mode while interactive, real-time services are provided from printouts. many current library mechanized systems are really nothing more than processing techniques for producing the standard, hard-copy, bibliographic tools such as catalog cards, serials lists, book catalogs, orders, overdue notices and the like. whenever the librarian wants to use the information generated by these programs, he consults the hard-copy files or lists. he does not interrogate a computer file directly. this approach has been typical of many other computer-based information systems. when the first direct access devices (ramac) were made available for commercial and industrial inventory control, they were used primarily to update the records and to produce the inventory lists and card files which the user would consult for information.
later, as confidence developed in the machines, and terminals became available, the printout lists and files were abandoned and the user began consulting the computer store directly. typically today in libraries using computer systems, inputs are processed in batches and outputs are produced in batches. real-time services are provided from the print-outs: the catalogs, the on-order file, serials lists and so on. even circulation control has been an off-line, batch operation. although the charge-out may be made through a data entry unit, all that is actually accomplished at the time is that the transaction is recorded. it is only later that the transactions are batched and processed, the files set up for the loans, the discharges pulled from the file and the delinquencies handled. although librarians will not, in the immediate future at least, as readily give up their card catalogs and printed lists as business and industry are doing and as some enthusiasts believe librarians will (14, 15) (the queuing problem alone where the public must use the files would be very severe), some hard-copy files could be dispensed with in an on-line system. certainly hard-copy files of circulation records, periodical check-in records, authority lists, on-order records and the like need not be maintained when these files are available via terminals. until now, practically all library machine processing, with a few exceptions, has been batched, off line and not interactive. in a non-interactive system, records are created and modified by manual preparation of work sheets followed by keypunching for data entry. in a library environment, for example, this means that the acquisitions librarian fills out an order work sheet that is given to a keypunch operator, who either prepares a decklet of punch cards or punches a paper tape or makes a magnetic record on tape.
the cards or tape are then fed into the computer, the input is edited and errors noted and a proof copy is printed. the error messages and proof copy come back to the order librarian, who makes the necessary corrections. these are handed to the keypunch operator, who corrects and updates the record and inputs it again into the computer. if the operator has not introduced some new errors, the record is then processed. if she has, the record loops back again to the order librarian. the same story can, of course, be told about catalog records, journal and report records, and so on. in an interactive on-line system, the originator of the information (in this example the order librarian) could key his data directly into the computer or could prepare a work sheet for operator input. the editing would occur at once by the terminal responding to each entry and verification or error messages would be returned immediately. the librarian or operator would enter the necessary corrections and upon acceptance of the record by the system would signal entry of the record into the file and the print queues as required. also, during the preparation of the entry, the librarian would be using the terminal (presumably a display-type terminal) to consult the files he needs, such as shelf list, orders outstanding, authority lists, etc. a simplified flow chart comparison of an off-line and an on-line cataloging process would look something like that shown in figure 1.
fig. 1. cataloging process: off-line and on-line. (flow chart; box labels recoverable from the original. off-line: catalog worksheet, kp, input, edit, proof, revision, correction, output. on-line: cataloger or catalog worksheet, input, edit, error correction, revision, output.)
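the two flows compared in figure 1 can be sketched in python as follows. this is an illustration under assumptions, not a reconstruction of any system described here: the required fields, the function names `edit`, `batch_run` and `online_entry`, and the `correct` callback (a stand-in for the person at the terminal) are all hypothetical.

```python
REQUIRED = ("author", "title", "vendor")  # hypothetical required fields


def edit(record):
    """Edit step shared by both modes: list the errors found in one record."""
    return [f"missing {name}" for name in REQUIRED if not record.get(name)]


def batch_run(deck):
    """Off-line mode: the whole deck is edited in one run.

    Clean records are accepted; faulty ones go onto a proof listing and
    must wait for correction, re-keying and a later run.
    """
    accepted, proof_listing = [], []
    for record in deck:
        errors = edit(record)
        if errors:
            proof_listing.append((record, errors))
        else:
            accepted.append(record)
    return accepted, proof_listing


def online_entry(record, correct, max_rounds=5):
    """On-line mode: errors are returned at once and fixed in the same session.

    `correct(record, errors)` returns a corrected copy of the record.
    """
    for _ in range(max_rounds):
        errors = edit(record)
        if not errors:
            return record  # accepted into the file
        record = correct(record, errors)
    raise ValueError("record still has errors after max_rounds corrections")
```

in the batch function a faulty record leaves the run unresolved, while in the on-line function the same edit routine sits inside a loop with the originator, which is the difference in turn-around the comparison turns on.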
although only a few library applications and no total library system are as yet on-line operations, a number of analogous operations are being carried out in other industries, such as order entry, inventory control, production scheduling, insurance policy information, freight waybilling, etc., so that one can make a few tentative assessments (16, 17, 18). to begin with, in an on-line system a work sheet does not have to be prepared, and so the keypunch operator is eliminated. because of the interaction of the originator and the system, all corrections and editing are accomplished at once, so that the turn-around time is very much less. preparation of printed error messages and proof copy is eliminated and the total error rate is greatly reduced. thus, although the reading-in of the individual records is slower in the on-line mode than in the batch mode, appreciably fewer messages need be read to complete a record in the on-line mode, making for more economical machine time. to this, however, must be added terminal and communication costs as well as the terminal supervisor program and the fact that most on-line work is done during the prime shift, so that actual machine costs tend to be higher with the on-line system. some, however, dispute this, claiming that, on balance, machine costs are equal. labor costs, however, are very much lower with the on-line system. as a general rule, computer input costs are 85% labor and 15% machine. not only can a transcription clerk be eliminated, but the order librarian who prepares the original inputs on the terminal works very much more efficiently. consulting hard-copy files and lists is more time consuming and less informative than interrogating machine files. in an on-line system, the librarian's necessary tools are brought directly to him and displayed rapidly and efficiently. he does not have to walk to the shelf list, the catalog or the on-order file and copy information.
in a well developed, sophisticated system some of the heavily used tools, such as the subject heading authority lists and class tables, would also be available from the terminal. not only does the librarian not have to spend time going to the physical files, but since the information is computer stored, it is brought to him in a greater variety of forms and sequences than is available in the hard-copy files. for example, titles are fully permuted so that incomplete title information can be searched. some systems librarians are proposing the use of codes and ciphers to search for entries, especially those with garbled titles ( 19, 20). all entries, including added authors, editors, vendors, etc., are immediately available even for uncataloged on-order items, so that searching is not restricted to main entries. it is not surprising, therefore, that clerks preparing computer inputs prefer working on line rather than off line. one interesting discovery is that since operators can do so much more with on-line systems they tend to take more time to turn out a better product. indications are "that significantly lower costs would have resulted if the time-sharing users had stopped work (i.e. gone to the next task) when they reached a performance level equal to that of batch users" ( 17). even with a circulation control system, there is higher system efficiency with an on-line operation. every transaction, such as a charge-out or a discharge, is an actual inquiry into the file as to the status of the book and borrower and the answer is immediately available; therefore controls and audit procedures can be simpler. elaborate error correction routines do not have to be provided in the program to identify improper inputs as has to be done with an off-line system. incorrect loans are not made of restricted material, such as holds and reserves, or to delinquent borrowers. the system also acts as a locator tool for determining the location and availability of volumes. 
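the on-line circulation behavior just described, in which every charge is an inquiry into the file and improper loans are refused at once, might be sketched as follows. the class name `Circulation`, its attributes and the wording of the refusal messages are assumptions made for illustration only; they do not describe bellrel or any other cited system.

```python
class Circulation:
    """Minimal on-line circulation file (illustrative assumption only)."""

    def __init__(self):
        self.on_loan = {}        # book_id -> borrower_id
        self.restricted = set()  # books on hold or reserve
        self.delinquent = set()  # borrowers barred from charging

    def charge(self, book_id, borrower_id):
        """Each charge-out queries book and borrower status before recording."""
        if book_id in self.on_loan:
            return "refused: already charged out"
        if book_id in self.restricted:
            return "refused: restricted (hold or reserve)"
        if borrower_id in self.delinquent:
            return "refused: delinquent borrower"
        self.on_loan[book_id] = borrower_id
        return "charged"

    def discharge(self, book_id):
        """Return True if the book was on loan and has now been discharged."""
        return self.on_loan.pop(book_id, None) is not None

    def locate(self, book_id):
        """Locator tool: who has the book, or 'on shelf' if nobody does."""
        return self.on_loan.get(book_id, "on shelf")
```

since the refusal happens at the moment of the transaction, no later error-correction pass over a batch of recorded charges is needed in this sketch.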
as a final note, on-line systems are necessary if effective networks are to be developed and decentralized services provided (21, 22). the basic conclusion is that an on-line system can handle more work and provide more services at greater machine costs but lower labor costs than a manual or an off-line machine system. in view of the fact that machine costs are coming down rapidly, while labor costs and throughput demands are forever rising, the future of the on-line machine system in the library looks very promising.
time share or dedicated system
a number of librarians have had very unhappy experiences with data processing departments over which they had no control. machines have been changed, schedules dropped, library jobs delayed or dropped for "higher priority jobs" and so on. one tendency, therefore, has been to try to get a library's own computer facility. but, as de gennaro so succinctly summarizes it, "the economics of present day computer applications in libraries make it virtually impossible to justify an in-house machine of the capacity libraries will need dedicated solely or largely to library uses ... eventually, library use may increase to a point where the in-house machine will pay for itself, but during the interim period the situation will be uneconomical unless other users can be found to share the cost. in the immediate future, most libraries will have to depend on equipment located in computing or data processing centers . . . experience at the university of missouri suggests the future will see several libraries grouping to share a machine dedicated to library use . . . it seems reasonable to suppose that in the next few years sharing of one kind or another will be more common than having machines wholly assigned to a single library . . ." (23).
it is true that the small computers are getting more powerful and it is quite possible the day will come when small stand-alone computers will have the capacity to do all the jobs required by the library. for the time being, however, an on-line system supporting a number of terminals for a variety of tasks in the library requires a computer of a size which cannot be economically justified except for the very large libraries. also, one thing that is often overlooked is that implementing a large library system requires data processing technical support that is very seldom available on the library's staff. one need only look at the information systems office of the library of congress, or the system analysis and data processing office of the new york public library to have some appreciation of the requirements for such technical support. also, a large central system often has backup capabilities which provide insurance against breakdowns and interruptions. the question really is not whether a library should time share or have a dedicated system, but rather whether or not the library has the necessary control over its segment of the total system. this segment is the library's property and its services are available to the library as set forth in the agreement made when the library became part of the data processing services. again, it must be emphasized that all this applies to systems which have to perform all library functions. most libraries, however, in order to get started and develop their programs, are beginning with small, stand-alone computers or are submitting batch jobs to a data processing center. later, as their programs develop, they will have to upgrade their computer capabilities. in view of the ultimate needs of a system which will support most of the major processing functions of a library, most libraries will have to have access to computer facilities whose full support they cannot economically justify.
time sharing, certainly for the immediate future, will be required for any on-line library system.
total integrated system or individual application
it is more economical to handle a variety of library applications by using a single file and a standard set of functional programs, than it is to provide a separate file and a separate set of application programs for each application. not only is it more economical, but this total, integrated approach is, in its essential modularity, extremely flexible. functions can be added, changed, or removed, and sequences can be re-ordered, so that the system can grow and change with changing needs and capabilities. also, since the full record is available, if needed, for every application, added services, normally not feasible, are practical. for example, a circulation control system that, instead of having separate circulation files, keeps charge records in its central bibliographic file, can set a hold on all copies of a book, no matter where the copies are kept, as in the bellrel system (13). also, from a total record one can select various subsets and make different orderings to provide a variety of services. the library systems currently being designed are essentially mechanized versions of existing manual systems. however, as experience is gained with these new systems, as more advanced equipment is made available, and as research and development provide new insights, these systems will evolve and change. for example, in some cases a major part of descriptive cataloging is becoming a part of acquisitions. the former compartmentalization in libraries is already breaking down. one should, therefore, be prudent and not lock up the system into tightly compartmentalized segments on the assumption that current file subsets will remain unchanged. it is advisable that each library activity have potential access to all system functions and to all records.
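the hold example above, where charge records live in the central bibliographic file so that a hold can be set on every copy wherever it is kept, can be sketched in python. the record layout below (`TitleRecord`, `Copy`, and their fields) is a hypothetical illustration and is not the bellrel record format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Copy:
    """One physical copy, stored inside the title's central record."""
    location: str
    charged_to: Optional[str] = None  # borrower holding the copy, if any
    held_for: Optional[str] = None    # borrower waiting for the copy, if any


@dataclass
class TitleRecord:
    """Bibliographic record that also carries its circulation data."""
    title: str
    copies: List[Copy] = field(default_factory=list)


def place_hold(record: TitleRecord, borrower: str) -> int:
    """Set a hold on every copy of the title, whatever its location.

    Returns the number of copies on which the hold was set.
    """
    for copy in record.copies:
        copy.held_for = borrower
    return len(record.copies)


def available_copies(record: TitleRecord) -> List[str]:
    """Locations of copies that are neither charged out nor held."""
    return [
        c.location
        for c in record.copies
        if c.charged_to is None and c.held_for is None
    ]
```

with separate circulation files per location, the same hold would need one update per file; here a single pass over one record reaches every copy.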
in the present context, an activity may have no need for all functions, nor does it need the total record, but as the system develops it might very well need these added capabilities. the problem, however, is that for a total, integrated system one must first build a complete structure including the file and all the functions (such as file building, search, compute, compare, display, print, etc.) as well as set up all the access points, which are essentially indexes. in addition, all the overhead necessary for supervising the programs, managing the files, and monitoring the terminals must be provided for. to use an analogy, one must first build foundation, walls and roof and install all plumbing and wiring before building any rooms. consequently, the start-up or initial investment is far higher than for implementing a single application program. some who have undertaken the development of total systems did not fully appreciate this at first and have, as a result, had to replan their development programs. even if one could bring in a fully debugged program for a total system, there would still be the tasks of converting records, training staff, setting up operating manuals and working out procedures. only as machinable records became available and the file grew and developed could various applications become operable. from a practical point of view, the implementation of a total system would have to be incremental; that is, once the basic system is installed, applications would have to be implemented one at a time and in some rational order. this is even more true where the programs for a total system have not been written as yet or where the library's resources are such that it can only undertake one job at a time. from a practical point of view, one can develop and implement only one application at a time.
furthermore, as is often the case, the available equipment is limited and cannot do everything the library will ultimately want. it is necessary, therefore, to develop single applications and to design them in such a way that they can become part of an integrated system. it is also necessary to have a strategy and a plan to move up through the various levels of mechanization. today there are many who, although accepting a total, on-line system as a desirable goal, feel that it is impractical to consider because of costs and unavailability of equipment. a full analysis of economic change in terms of wage-cost rise and machine-cost decrease, of technologic improvement and of demand for added services, goes far beyond the limits of this paper. there is developing, moreover, a literature on these subjects (24, 25, 26, 27, 28, 29). suffice it to say that an increasing number of librarians are becoming convinced that library mechanization is inevitable, that it will affect all operations of the library, that it will provide the highest level of service through direct, on-line, interactive systems and that, whatever today's limitations may be, these changes are coming so fast that plans must be made now. these individuals are also convinced that whatever is now undertaken in the way of mechanization will evolve into an integrated system with many basic functions operated in a real-time, on-line mode.
implementation of an integrated library system
typically a library mechanization project will start with a single, relatively uncomplicated application that will not impact library operations very much, will require only a small amount of systems design and programming, and will run in a batch mode on a small equipment configuration. a typical example is the preparation of a serials holdings list.
from this first job, the librarian and his staff will become acquainted with data processing, will introduce the data processing personnel to some library requirements and will, hopefully, begin to develop procedures for working with the computing center. having passed this introductory stage, many librarians continued, as a rule, simply by developing the next application. today, however, the more prescient ones are first assessing the total impact of mechanization and, having decided that their library will be mechanized, try to plan what their foreseeable goals are, then work out a plan to achieve these goals. having decided that the ultimate goal is a total integrated system for the whole library, which will provide real-time services and therefore must operate on line, the library planner will set priorities and work out a strategy to reach these goals. in some instances he can start designing a total system. in other situations, he does not have the resources to do so, but plans to make use of programs being developed for other libraries or of so-called standard, commercial packages, or programs which may be developed jointly with other libraries. he should realize that he can't just sit and wait for d-day when a total complete program will be wheeled in and a turnkey operation will be installed overnight. the lead time necessary for planning, training, conversion and installation is too often grossly underestimated, so that these preliminary preparations are neglected to the detriment of orderly growth and development. having established certain long-range goals, the librarian will tailor his current programs so that the library system will develop as smoothly as possible. he will try to keep the various subsystems and program segments as generalized and as modular as possible. he will structure his records so that they can ultimately be fitted together into a full bibliographic record.
he will try to avoid using records so truncated that they will have to be discarded and recorded again later. he may, in fact, actually start with a full record that is comparable to his present shelf list or catalog card, even though there may be no need of the whole record for the current application. he will provide for a variety of print options, such as line width, number of lines, number of columns, etc., so that a separate print program will not have to be written for each product or to accommodate every change in style. he will try to organize his files so that the file structures and the record formats will not have to be radically changed when the system goes on line. he may store some of his records (his active on-order file, for example) on direct access storage devices. if he can, he will create access points to his large bibliographic file and store them on disk files too, even though he is currently operating off-line. such direct access storage of indexes makes economic sense when very large files (and library files are large and grow very fast) must be searched or sorted. aside from these immediate benefits, such a file organization requires little or no restructuring or record reformatting when the system ultimately goes on line and becomes terminal oriented. as early as possible, he will put his circulation control system on line. this is by far the cheapest and easiest on-line operation, requiring the least investment and yet producing the most immediate benefits. again, aside from the immediate benefits, this on-line operation represents an important building block for the ultimate total system. aside from the current improved services, the experience of working on line and the opportunity to develop and refine processes and procedures will pay important dividends in the design and implementation of the total on-line system.
with knowledge of how he wants his system to develop, the librarian is now able to establish priorities and allocate his resources. the emphasis will be on file building, on capturing the record. acquisitions programs or circulation control systems will come first. work on the display terminal and communication will come later after searchable files have been built up. in other words, an attempt is made to have a controlled growth through several levels of mechanization. a start is made with a simple, off-line, batch job. then a beginning is made on building what is to become the main, central bibliographic file, the catalog. as soon as possible, parts of it are stored on direct access devices, so that it can be used more effectively and so that its structure will conform to the requirements of an ultimate on-line system. a simple on-line process is adopted as soon as feasible. each application program uses standard functional modules in macro form and so on. all this, of course, is highly oversimplified and may seem truistic to many. nevertheless, there has been too much evidence of programs undertaken without adequate planning and of programs that have lacked continuity because adequate guide lines have not been established. such failures are too often ascribed to changes in personnel or hardware. a project should be designed so that inevitable changes in personnel and hardware can be tolerated without its being wrecked. therefore, the establishment of long-range goals can have a profound effect on the shape and success of current operations. more and more librarians and systems personnel engaged in library projects are beginning to think in terms of total integrated systems. they are looking ahead and planning. they are designing and implementing their present applications not in a simple ad hoc way but as part of what is to become a total system. references 1. alexander, r. 
w.: "toward the future integrated library system," 33rd conference of fid and international congress on documentation, (tokyo: 1967). 2. redstone scientific information center: automation in libraries (first atlis workshop) 15-17 november 1966, huntsville, ala.: redstone arsenal, (june 1967). report rsic625. 3. black, donald v.: "library information system time sharing: system development corporation's lists project," california school libraries, (march 1969), 121-6. 4. black, donald v.: library information system time-sharing on a large, general purpose computer. (system development corporation report sp-3135, 20 september 1968). 5. bruette, vernon r.; cohen, joseph; kovacs, helen: an on-line computer system for the storage and retrieval of books and monographs (brooklyn, new york: state university of new york downstate medical center, 1967). 6. fussler, herman h.; payne, charles t.: development of an integrated computer-based bibliographical data system for a large university library. (chicago: chicago university, 1968). clearinghouse report pb 179 426. 7. balfour, frederick m.: "conversion of bibliographic information to machine readable form using on-line computer terminals," journal of library automation, 1 (december 1968), 217-26. 8. lazorick, gerald j.: "computer/communications system at suny buffalo," educom. the bulletin of the interuniversity communications council, 4 (february 1969), 1-3. 9. bateman, betty b.; farris, eugene h.: "operating a multilibrary system using long-distance communications to an on-line computer," proceedings of asis, 5 (1968), 155-62. 10. pizer, i. h.: "regional medical library network," medical library association bulletin, 57 (april 1969), 101-15. 11. burgess, t.; ames, l.: lola library on-line acquisitions subsystem. (pullman, wash.: washington state university library, july 1968). 12. reineke, charles d.; boyer, calvin j.
: "automated circulation system at midwestern university," ala bulletin, 63 (october 1969), 1249-54. 13. kennedy, r. a.: "bell laboratories' library real-time loan system (bellrel)," journal of library automation, 1 (june 1968), 128-46. 14. licklider, j. c. r.: libraries of the future (cambridge, massachusetts: m.i.t. press, 1965). 15. swanson, don r.: "dialogues with a catalog," library quarterly, 34 (january 1964), 113-25. 16. brown, robert r.: "cost and advantages of on-line dp," datamation, 14 (march 1968), 40-3. 17. gold, michael m.: "time-sharing and batch-processing; an experimental comparison of their values in a problem-solving situation," communications of the acm, 12 (may 1969), 249-59. 18. sackman, h.: "time sharing versus batch processing: the experimental evidence," afips conference proceedings, 32, 1968 spring joint computer conference, 1-10. 19. nugent, william r.: "compression word coding techniques for information retrieval," journal of library automation, 1 (december 1968), 250-60. 20. ruecking, frederick h.: "bibliographic retrieval from bibliographic input; the hypothesis and construction of a test," journal of library automation, 1 (december 1968), 227-38. 21. grosch, audrey n.: "implications of on-line systems techniques for a decentralized research library system," college & research libraries, 30 (march 1969), 112-18. 22. rayward, w. boyd: "libraries as organizations," college & research libraries, 30 (july 1969), 312-26. 23. de gennaro, richard: "the development and administration of automated systems in academic libraries," journal of library automation, 1 (march 1968), 75-91. 24. "the costs of library and informational services." knight, douglas m.; nourse, e. shepley, eds.: in libraries at large (new york: r. r. bowker, 1969), 168-227. 25. cuadra, carlos a.: "libraries and technological forces affecting them," ala bulletin, 63 (june 1969), 759-68. 26.
culbertson, don s.: "the costs of data processing in university libraries: in book acquisition and cataloging," college & research libraries, 24 (november 1963), 487-89. 27. dolby, j. l.; forsyth, v.; and resnikoff, h. l.: an evaluation of the utility and cost of computerized library catalogs. final report project no. 7-1182, u. s. department of health, education and welfare. 10 july 1968, eric ed 022517. 28. kilgour, frederick g.: "the economic goal of library automation," college & research libraries, 30 (july 1969), 307-11. 29. knight, kenneth e.: "evolving computer performance," datamation, 14 (january 1968), 31-5.
reproduced with permission of the copyright owner. further reproduction prohibited without permission.
the impact of information technology on library anxiety: the role of computer attitudes jiao, qun g;onwuegbuzie, anthony j information technology and libraries; dec 2004; 23, 4; proquest pg. 138
the impact of information technology on library anxiety: the role of computer attitudes
qun g. jiao and anthony j. onwuegbuzie
over the past two decades, computer-based technologies have become dominant forces to shape and reshape the products and services the academic library has to offer. the application of library technologies has had a profound impact on the way library resources are being used. although many students continue to experience high levels of library anxiety, it is likely that the new technologies in the library have led to them experiencing other forms of negative affective states that may be, in part, a function of their attitude towards computers. this study investigates whether students' computer attitudes predict levels of library anxiety.
computers and information technologies have experienced considerable growth over the past two decades. as such, familiarity with computers is rapidly becoming a basic skill and a prerequisite for many tasks.
although not every college student is equally prepared for the rising demand for computer skills in the information age, computer literacy is increasingly becoming a gatekeeper for students' academic success. 1 gaps in computer literacy and skills can leave many students behind not only in their academic achievement but also in their future job-market success. the unprecedented pace of technological change in the development of digital information networks and electronic services in recent years has helped to expand the role of the academic library. once only a storehouse of printed materials, it is now a technology-laden information network where students can conduct research in a mixed print and digital-resource environment, experience the use of advanced information technologies, and hone their computer skills. yet, many students are struggling to cope with the changes brought on by the rapid advances of information technologies. academic libraries of various sizes have spent a large percentage of their material budget on electronic commercial content, and the trend will continue. 2 these days, college students are faced with the choices of ever-changing modes of electronic accessing tools, interfaces, and protocols along with the traditional print resources in the library. the fact that the same journal article may be available in multiple vendors' aggregator sites (such as ebscohost and gale group) makes the navigation through these bibliographic databases more complex and challenging. relevant sources must be identified and navigation protocols must be learned before appropriate information and contents can be found.
qun g. jiao (gerryjiao@baruch.cuny.edu) is reference librarian and associate professor at newman library, baruch college, city university of new york, and anthony j. onwuegbuzie (tonyonwuegbuzie@aol.com) is associate professor at the college of education, university of south florida, tampa.
furthermore, having located a citation, students still have to search the library online catalog to find out if the journal or book is available in the library and, if not, know how to make an interlibrary loan request either on paper or electronically. 3 anxiety levels can be high and patience levels can be low at varying times of conducting library research. 4 that students experience various levels of apprehension when using academic libraries is not a new phenomenon. indeed, the phenomenon is prevalent among college students in the united states and many other countries, and is widely known as library anxiety. mellon first coined the term in her study in which she noted that 75 percent to 85 percent of undergraduate students described their initial library experiences in terms of anxiety. 5 according to mellon, feelings of anxiety stem from either the relative size of the library; a lack of knowledge about the location of materials, equipment, and resources of the library; how to initiate library research; or how to proceed with a library search. 6 library anxiety is an unpleasant feeling or emotional state with physiological and behavioral concomitants that come to the fore in library settings. typically, library-anxious students experience negative emotions, including ruminations, tension, fear, and mental disorganization, which prevent them from using the library effectively. 7 a student who experiences library anxiety usually undergoes either emotional or physical discomfort when faced with any library or library-related task. 8 library anxiety may arise from a lack of self-confidence in conducting research, lack of prior exposure to academic libraries, the inability to see the relevance of libraries to one's field of interest, and lack of familiarity with library equipment and technologies.
library anxiety is often accorded special attention because of its debilitating effects on students' academic achievement. 9 although many students continue to experience high levels of library anxiety, it is likely that the new technologies and electronic databases in libraries have led to students experiencing other forms of negative affective states. in particular, it is likely that library anxiety experienced by students is, in part, a function of their attitudes toward computers. consistent with this assertion, mizrachi and shoham and mizrachi reported a statistically significant relationship between library anxiety and computer attitudes. 10 they noted in their research that home and work usage of computers, computer games, word processors, computer spreadsheets, and the internet are all related to the dimensions of library anxiety found among israeli students to varying degrees. similarly, jerabek, meyer, and kordinak found levels of computer anxiety to be related to levels of library anxiety for both men and women. 11 these studies focused exclusively on undergraduate students. however, no study has examined this relationship among graduate students, a population that uses the academic library more than any other student population. over the past fifteen years, a large body of research literature on computer attitudes has been generated. in particular, many researchers have studied the relationship between computer attitudes and computer use. 12 the importance of beliefs and attitudes towards computers and technologies is widely acknowledged. 13 students' computer attitudes arguably impact their willingness to engage in computer-related activities in colleges and universities where effectively using library electronic resources represents an increasingly important part of college education.
negative computer attitudes may inhibit students' interests in learning to use the library resources and thereby weaken their academic performance levels, while at the same time elevating levels of library anxiety. mcinerney, mcinerney, and sinclair observed that negative perceptions about computers among student teachers may accompany feelings of anxiety, including worries about being embarrassed, looking foolish, and even damaging the computer equipment. 14 further, there is often a negative relationship between prior experience with computers and computer anxiety experienced by individuals. 15 until recently, library anxiety has only been interpreted in the context of the library setting, that is, as a phenomenon that occurs while students are undertaking library tasks. jiao, onwuegbuzie, and lichtenstein defined library anxiety as "an uncomfortable feeling or emotional disposition, experienced in a library setting, which has cognitive, affective, physiological, and behavioral ramifications." 16 at the same time, unprecedented technological advancement has had a profound impact on the products and services offered by academic libraries. students now are able to conduct sophisticated library searches from the comfort of their homes. it is clear that the construct of library anxiety needs to be expanded in the new library and information environment, incorporating into its definition other variables that are relevant for the changing library and information context. because many library users spend a significant portion of their time using computer-based technologies to conduct information searches, it is natural to ask, to what extent does library anxiety stem from students' prior attitudes and experiences with computers and library technologies? however, with the exception of the studies conducted by mizrachi and shoham and mizrachi on israeli undergraduate students, this link has not been examined.
17 thus, the present study investigated the relationship between computer attitudes and library anxiety in the rapidly changing library and information environment. as such, the current inquiry replicated the works of mizrachi, shoham and mizrachi, and jerabek, meyer, and kordinak by examining the degree to which computer attitudes predict levels of library anxiety among graduate students in the united states. 18 it was expected that findings from this study would help to increase the understanding of the construct of library anxiety. indeed, research in this area has become critical in higher education where educators are responsible for graduating students with the skills necessary to thrive and to lead in a rapidly changing technological environment in the twenty-first century.
method
participants
participants were ninety-four african american graduate students enrolled in the college of education at a historically black college and university in the eastern u.s. all participants were solicited in either a statistics or a measurement course at the time that the investigation took place. in order to participate in the study, students were required to sign an informed-consent document that was given during the first class session of the semester. the majority of the participants were female. ages of the participants ranged from twenty-two to sixty-two years (mean = 30.40, sd = 8.75).
instruments and procedure
all participants were administered two scales, namely, the computer attitude scale (cas) and the library anxiety scale (las). the cas, developed by loyd and gressard, contains forty likert-type items that assess individuals' attitudes toward computers and the use of computers. 19 this instrument consists of the following four scales, which can be used separately: (1) anxiety or fear of computers; (2) confidence in the ability to use computers; (3) liking or enjoying working with computers; and (4) computer usefulness.
loyd and gressard reported coefficient alpha reliability coefficients of .86, .91, .91, and .95 for scores pertaining to computer anxiety, computer confidence, computer liking, and total scales, respectively. for the present study, the score reliabilities were as follows:
• computer anxiety, .84 (95 percent confidence interval [ci] = .79, .88);
• computer confidence, .81 (95 percent ci = .75, .86);
• computer liking, .89 (95 percent ci = .85, .92); and
• computer usefulness, .76 (95 percent ci = .68, .83).
the las, developed by bostick, contains forty-three 5-point likert-format items that assess levels of library anxiety experienced by college students. 20 it also contains the following five subscales: 1. barriers with staff; 2. affective barriers; 3. comfort with the library; 4. knowledge of the library; and 5. mechanical barriers. a high score on any subscale represents high levels of anxiety in that area. jiao and onwuegbuzie, in their examination of the score reliability reported on las in the extant literature, found that it has typically been in the adequate to high range for the subscale and total-scale scores. 21 based on their analysis, onwuegbuzie, jiao, and bostick concluded that "not only does the [las] produce scores that yield extremely reliable estimates, but also these estimates are remarkably consistent across samples with different cultures, nationalities, ages, years of study, gender composition, educational majors, and so forth."
22 for the current investigation, the subscales generated scores for the combined sample that had a classical theory alpha reliability coefficient of .89 (95 percent ci = .85, .92) for barriers with staff, .84 (95 percent ci = .79, .88) for affective barriers, .53 (95 percent ci = .37, .66) for comfort with the library, .62 (95 percent ci = .48, .73) for knowledge of the library, and .70 (95 percent ci = .58, .79) for mechanical barriers.
analysis
a canonical correlation analysis was conducted to identify a combination of library anxiety dimensions (barriers with staff, affective barriers, comfort with the library, knowledge of the library, and mechanical barriers) that might be simultaneously related to a combination of computer-attitude dimensions (computer anxiety, computer liking, computer confidence, and computer usefulness). canonical correlation analysis is used to examine the relationship between two sets of variables whereby each set contains more than one variable. 23 in the present investigation, the five dimensions of library anxiety were treated as the dependent multivariate set of variables, and the four dimensions of computer attitudes formed the independent multivariate profile. the number of canonical functions (factors) that can be produced for a given dataset is equal to the number of variables in the smaller of the two variable sets. because the library-anxiety set contained five dimensions and the computer-attitude set contained four variables, four canonical functions were generated. for any significant canonical coefficient, the standardized canonical-function coefficients and structure coefficients were then interpreted. standardized canonical-function coefficients are computed weights that are applied to each variable in a given set in order to obtain the composite variate used in the canonical correlation analysis.
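the two mechanical points in this passage, the number of canonical functions and the way weights turn a variable set into a composite variate, can be sketched as follows. this is a minimal illustration and not the authors' procedure: the function names are hypothetical, the example scores are invented, and it only shows how weights are applied to standardized variables once they are known (estimating the weights requires a full canonical correlation routine, which is not shown).

```python
from statistics import mean, stdev


def n_canonical_functions(n_vars_set_a, n_vars_set_b):
    # The number of canonical functions equals the size of the smaller set.
    return min(n_vars_set_a, n_vars_set_b)


def standardize(column):
    # Convert raw scores to z-scores using the sample standard deviation.
    m, s = mean(column), stdev(column)
    return [(x - m) / s for x in column]


def composite_variate(rows, weights):
    """Apply one weight per variable to standardized scores.

    rows: one list of raw scores per participant (invented data here).
    weights: standardized canonical-function coefficients, one per variable.
    Returns one composite score per participant.
    """
    z_columns = [standardize(list(col)) for col in zip(*rows)]
    return [sum(w * z for w, z in zip(weights, person))
            for person in zip(*z_columns)]


# Five library-anxiety dimensions and four computer-attitude dimensions.
functions = n_canonical_functions(5, 4)

# Invented two-variable example, used only to show the arithmetic.
scores = composite_variate([[1, 10], [2, 20], [3, 30]], [0.5, 0.5])
```

each participant's composite score is a weighted sum of z-scores, which is why the weights behave like beta coefficients in a regression.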
as such, standardized canonical-function coefficients are equivalent to factor-pattern coefficients in factor analysis or to beta coefficients in a regression analysis. 24 conversely, structure coefficients represent the correlations between a given variable and the scores on the canonical composite (latent variable) in the set to which the variable belongs. 25 thus, structure coefficients indicate the degree to which each variable is related to the canonical composite for the variable set. indeed, structure coefficients are essentially bivariate correlation coefficients that range in value between -1.0 and +1.0 inclusive. 26 the square of the structure coefficient yields the proportion of variance that the original variable shares linearly with the canonical variate.
results
table 1 presents the intercorrelations among the five dimensions of library anxiety and the four dimensions of computer attitude. of particular interest were the twenty correlations between the library-anxiety subscale scores and the computer-attitude subscale scores. it can be seen that, after applying the bonferroni adjustment, four of these relationships were statistically significant. specifically, computer liking was statistically significantly related to affective barriers, knowledge of the library, and comfort with the library. using cohen's criteria of .1, .3, and .5 for small, medium, and large relationships, respectively, the first two relationships (involving affective barriers and knowledge of the library) were medium, and the third relationship (between computer liking and comfort with the library) was large. 27 in addition to these three relationships, the association between computer usefulness and knowledge of the library also was statistically significant, with a medium effect size. the correlation matrix in table 1 was used to examine the multivariate relationship between library anxiety and computer attitudes.
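the two rules used in this paragraph can be written out as a short sketch. the function names are hypothetical, and the nominal alpha of .05 is an assumption (the article does not state the unadjusted level); the correlation values are the four significant coefficients reported in table 1.

```python
def bonferroni_alpha(alpha, n_tests):
    # Per-test significance level after the Bonferroni adjustment.
    return alpha / n_tests


def cohen_label(r):
    # Cohen's criteria as cited in the text: .1 small, .3 medium, .5 large.
    size = abs(r)
    if size >= 0.5:
        return "large"
    if size >= 0.3:
        return "medium"
    if size >= 0.1:
        return "small"
    return "negligible"


# Assumed nominal alpha of .05 spread over the twenty cross-set correlations.
per_test_alpha = bonferroni_alpha(0.05, 20)

# Significant correlations from table 1.
labels = {
    "computer liking / affective barriers": cohen_label(-0.37),
    "computer liking / knowledge of the library": cohen_label(-0.37),
    "computer liking / comfort with the library": cohen_label(-0.55),
    "computer usefulness / knowledge of the library": cohen_label(-0.32),
}
```

under that assumption each of the twenty correlations is tested at roughly .0025, and the labels agree with the medium and large descriptions given in the text.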
this relationship was assessed via a canonical correlation analysis. the canonical analysis revealed that the four canonical correlations combined were statistically significant (p < .0001). also, when the first canonical root was removed, the remaining three canonical roots were not statistically significant. in fact, removal of subsequent canonical roots did not lead to statistical significance. together, these results suggested that only the first canonical function was statistically significant. this first canonical root also was practically significant (rc1 = .63), contributing 40.8 percent (rc1²) to the shared variance, which represents a large effect size. 28 data pertaining to the first canonical root are presented in table 2, which provides both standardized function coefficients and structure coefficients. using a cutoff correlation of 0.3, the standardized canonical-function coefficients revealed that affective barriers, comfort with the library, and knowledge of the library made important contributions to the library-anxiety set, with affective barriers and comfort with the library making similarly large contributions. 29 with regard to the computer-attitude set, computer anxiety, computer liking, and computer confidence made noteworthy contributions, with the latter two dimensions making the most noteworthy contributions. the structure coefficients revealed that all five dimensions of library anxiety made important contributions to the first canonical variate.
the square of the structure coefficient indicated that barriers with staff, affective barriers, comfort with the library, and knowledge of the library made similarly large contributions, explaining 67.2 percent, 72.3 percent, 72.3 percent, and 60.8 percent of the variance, respectively. with regard to the computer-attitude set, computer liking and computer usefulness made important contributions. these variables explained 64.0 percent and 16.8 percent of the variance, respectively. comparing the standardized and structure coefficients indicated that computer anxiety and computer confidence served as suppressor variables because the standardized coefficients associated with these variables were large, whereas the corresponding structure coefficients were relatively small. 30 suppressor variables are variables that assist in the prediction of dependent variables due to their correlation with other independent variables. 31 thus, the inclusion of computer anxiety and computer confidence in the canonical correlation model strengthened the multivariate relationship between library anxiety and computer attitudes.
discussion
the purpose of this study was to investigate the relationship between computer attitudes and library anxiety among african american graduate students. specifically, the multivariate link between these two constructs was examined. a canonical correlation analysis revealed a strong multivariate relationship between library anxiety and computer attitudes. the library-anxiety subscale scores and computer-attitudes subscale scores shared 40.82 percent of the common variance. specifically, computer liking and computer usefulness were related simultaneously to the following five dimensions of library anxiety: barriers with staff, affective barriers, comfort with the library, knowledge of the library, and mechanical barriers. computer anxiety and computer confidence served as suppressor variables. thus, computer attitudes predict levels of library anxiety.
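the arithmetic behind these statements can be checked with a small sketch that uses the computer-attitude coefficients printed in table 2. the function names are hypothetical, and the suppressor rule shown (standardized coefficient at or above the .3 cutoff while the structure coefficient falls below it) is only one assumed way of operationalizing "large" versus "relatively small"; the article does not give a formal rule. decimal arithmetic is used so that values such as .85 squared round the same way as in the printed table.

```python
from decimal import Decimal, ROUND_HALF_UP

# (standardized coefficient, structure coefficient) as printed in table 2.
COMPUTER_ATTITUDES = {
    "computer anxiety": ("-0.31", "-0.22"),
    "computer confidence": ("0.98", "0.13"),
    "computer liking": ("-1.25", "-0.80"),
    "computer usefulness": ("-0.13", "-0.41"),
}

CUTOFF = Decimal("0.3")  # cutoff used in the text for noteworthy coefficients


def percent_variance(structure):
    # Squared structure coefficient expressed as a percentage, one decimal.
    r = Decimal(structure)
    return (r * r * 100).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


def is_suppressor(standardized, structure, cutoff=CUTOFF):
    # Assumed rule: noteworthy weight but weak correlation with the variate.
    return abs(Decimal(standardized)) >= cutoff and abs(Decimal(structure)) < cutoff


variance = {name: percent_variance(struct)
            for name, (_, struct) in COMPUTER_ATTITUDES.items()}
suppressors = [name for name, (std, struct) in COMPUTER_ATTITUDES.items()
               if is_suppressor(std, struct)]
```

with these inputs the sketch reproduces the 64.0 and 16.8 percent figures and flags computer anxiety and computer confidence, the two variables the text identifies as suppressors.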
as such, the present findings are consistent with those of mizrachi and shoham and mizrachi, who found a statistically significant relationship between computer attitudes and the following seven dimensions of the hebrew library-anxiety scale, a modified version of las developed by the authors for their israeli sample: 1. staff; 2. knowledge; 3. language; 4. physical comfort; 5. library computer comfort; 6. library policies and hours; and 7. resources. 32 according to its authors, the staff factor refers to students' attitudes towards librarians and library staff and their perceived accessibility. the knowledge factor pertains to how students rate their own library expertise. the language factor relates to the extent to which using english-language searches and materials yields discomfort. physical comfort evaluates how much the physical facility negatively affects students' satisfaction and comfort with the library. library computer comfort assesses the perceived trustworthiness of library computer facilities and the quality of directions for using them. library policies and hours concerns students' attitudes toward library rules, regulations, and hours of operation. finally, resources refers to the perceived availability of the desired material in the library collection. the correlations between the dimensions of library anxiety and computer attitudes ranged from .11 (physical comfort) to .47 (knowledge). the current results also replicate those of jerabek, meyer, and kordinak, who found levels of computer anxiety to be related to levels of library anxiety for both men and women. 33 nevertheless, caution should be exercised in generalizing the current findings to all graduate students. though the present study examined the association between library anxiety and computer attitudes among african american graduate students, it should not be assumed that this relationship would hold for other racial groups.
jiao, onwuegbuzie, and bostick found that african american students attending a research-intensive institution reported statistically significantly lower levels of library anxiety associated with barriers with staff, affective barriers, and comfort with the library than did caucasian american graduate students enrolled at a doctoral-granting institution, with effect sizes ranging from moderate to large. 34 in a follow-up study, jiao and onwuegbuzie compared african american and caucasian american students with respect to library anxiety, controlling for educational background by selecting both racial groups from the same institution. 35 no statistically significant racial differences were found in library anxiety for any of the five dimensions of las. however, across all five library-anxiety measures, the african american sample reported lower scores than did the caucasian american sample. in fact, using the test of trend by onwuegbuzie and levin, they found that the consistency with which the african american graduate students had lower levels of library anxiety than did the caucasian american students was both statistically and practically significant. 36 thus, jiao and onwuegbuzie's results, alongside those of jiao, onwuegbuzie, and bostick, suggest that racial differences in library anxiety prevail. 37

table 1. intercorrelations among the library-anxiety subscales and computer-attitude subscales
subscale                       2      3      4      5      6      7      8      9
1. barriers with staff        .64*   .63*   .49*   .46*   .02    .05   -.27   -.09
2. affective barriers                .56*   .52*   .40*  -.05    .02   -.37*  -.23
3. comfort with the library                 .56*   .44*  -.19   -.20   -.55*  -.16
4. knowledge of the library                        .39*  -.21   -.11   -.37*  -.32*
5. mechanical barriers                                   -.13   -.01   -.18    .04
6. computer anxiety                                              .77*   .48*   .46*
7. computer confidence                                                  .67*   .36*
8. computer liking                                                             .43*
9. computer usefulness
*indicates a statistically significant relationship after the bonferroni adjustment.

table 2. canonical solution for first function: relationship between library-anxiety subscales and computer-attitude subscales
theme                          standardized coefficient   structure coefficient   structure² (%)
library-anxiety subscale
  barriers with staff                  .17                       .82*                 67.2
  affective barriers                   .40*                      .85*                 72.3
  comfort with the library             .39*                      .85*                 72.3
  knowledge of the library             .31*                      .78*                 60.8
  mechanical barriers                 -.12                       .39*                 15.2
computer-attitude subscale
  computer anxiety                   -0.31*                     -.22                   4.8
  computer confidence                 0.98*                      .13                   1.7
  computer liking                    -1.25*                     -.80*                 64.0
  computer usefulness                -0.13                      -.41*                 16.8
*loadings with effect sizes larger than .3.

future research should therefore investigate whether the relationship between library anxiety and computer attitudes found in the present study among african american graduate students also exists among caucasian american graduate students, as well as among other racial groups. further, the causal direction of the relationship found in the current study should be investigated. that is, future studies should investigate whether library anxiety places a person more at risk for experiencing poor computer attitudes, or whether the converse is true. more research also is needed to determine how computer attitudes might play a role in the library context. notwithstanding, it appears that the construct of library anxiety can be expanded to include the construct of computer attitudes. indeed, one implication of the findings is that bostick's las should be modified to include dimensions of computer attitudes. 38 such a modification likely would facilitate the identification of library-anxious students.
by identifying students with high levels of library anxiety and poor computer attitudes, library educators and others could help them improve their dispositions and provide them with the skills necessary to negotiate the rapidly changing technological environment, thereby putting them in a better position to be lifelong learners.
references
1. susan m. piotrowski, computer training: pathway from extinction (eric document reproduction service, ed 348955, 1992). 2. thomas h. hogan, "drexel university moves aggressively from print to electronic access for journals (interview with carol hansen montgomery, dean of libraries)," computers in libraries 21, no. 5 (may 2001): 22-27. 3. m. claire stewart and h. frank cervone, "building a new infrastructure for digital media: northwestern university library," information technology and libraries 22, no. 2 (june 2003): 69-74. 4. carol c. kuhlthau, "longitudinal case studies of the information search process of users in libraries," library and information science research 10 (july 1988): 257-304; carol c. kuhlthau, "inside the search process: information seeking from the user's perspective," journal of the american society for information science 42, no. 5 (june 1991): 361-71; carol c. kuhlthau, seeking meaning: a process approach to library and information services (norwood, n.j.: ablex, 1993); carol c. kuhlthau, "students and the information search process: zones of intervention for librarians," advances in librarianship 18 (1994): 57-72; carol c. kuhlthau et al., "validating a model of the search process: a comparison of academic, public, and school library users," library and information science research 12, no. 1 (jan.-mar. 1990): 5-31. 5. constance a. mellon, "library anxiety: a grounded theory and its development," college & research libraries 47, no. 2 (mar. 1986): 160-65. 6. ibid. 7. qun g. jiao, anthony j.
onwuegbuzie, and art lichtenstein, "library anxiety: characteristics of 'at-risk' college students," library and information science research 18 (spring 1996): 151-63.
8. constance a. mellon, "attitudes: the forgotten dimension in library instruction," library journal 113 (sept. 1, 1988): 137-39; constance a. mellon, "library anxiety and the nontraditional student," in reaching and teaching diverse library user groups, ed. teresa b. mensching (ann arbor, mich.: pierian, 1989), 77-81; anthony j. onwuegbuzie, "writing a research proposal: the role of library anxiety, statistics anxiety, and composition anxiety," library and information science research 19, no. 1 (1997): 5-33.
9. anthony j. onwuegbuzie and qun g. jiao, "information search performance and research achievement: an empirical test of the anxiety-expectation model of library anxiety," journal of the american society for information science and technology (jasist) 55, no. 1 (2004): 41-54; anthony j. onwuegbuzie, qun g. jiao, and sharon l. bostick, library anxiety: theory, research, and applications (lanham, md.: scarecrow, 2004).
10. diane mizrachi, "library anxiety and computer attitudes among israeli b.ed. students" (master's thesis, bar-ilan university, israel, 2000); snunith shoham and diane mizrachi, "library anxiety among undergraduates: a study of israeli b.ed. students," journal of academic librarianship 27, no. 4 (july 2001): 305-11.
11. ann j. jerabek, linda s. meyer, and thomas s. kordinak, "'library anxiety' and 'computer anxiety': measures, validity, and research implications," library and information science research 23, no. 3 (2001): 277-89.
12. muhamad a. al-khaldi and ibrahim m. al-jabri, "the relationship of attitudes to computer utilization: new evidence from a developing nation," computers in human behavior 9, no. 1 (jan.
1998): 23-42; margaret cox, valeria rhodes, and jennifer hall, "the use of computer-assisted learning in primary schools: some factors affecting uptake," computers in education 12, no. 1 (1988): 173-78; gayle v. davidson and scott d. ritchie, "attitudes toward integrating computers into the classroom: what parents, teachers, and students report," journal of computing in childhood education 5, no. 1 (1994): 3-27; donald g. gardner, richard l. dukes, and richard discenza, "computer use, self-confidence, and attitudes: a causal analysis," computers in human behavior 9, no. 4 (winter 1993): 427-40; robin h. kay, "predicting student teacher commitment to the use of computers," journal of educational computing research 6, no. 3 (1990): 299-309.
13. deborah bandalos and jeri benson, "testing the factor structure invariance of a computer attitude scale over two grouping conditions," educational and psychological measurement 50, no. 1 (spring 1990): 49-60; frank m. bernt and alan c. bugbee jr., "factors influencing student resistance to computer administered testing," journal of research on computing in education 22, no. 3 (spring 1990): 265-75; michel dupagne and kathy a. krendl, "teacher's attitudes toward computers: a review of the literature," journal of research on computing in education 24, no. 3 (spring 1992): 420-29; elizabeth mowrer-popiel, constance pollard, and richard pollard, "an analysis of the perceptions of preservice teachers toward technology and its use in the classroom," journal of instructional psychology 21, no. 2 (june 1994): 131-38; jennifer d. shapka and michel ferrari, "computer-related attitudes and actions of teacher candidates," computers in human behavior 19, no. 3 (may 2003): 319-34.
14. valentina mcinerney, dennis m. mcinerney, and kenneth e. sinclair, "student teachers, computer anxiety, and computer experience," journal of educational computing research 11, no. 1 (1994): 27-50.
15. susan e. jennings and anthony j.
onwuegbuzie, "computer attitudes as a function of age, gender, math attitude, and developmental status," journal of educational computing research 25, no. 4 (2001): 367-84.
16. jiao, onwuegbuzie, and lichtenstein, "library anxiety," 152.
17. mizrachi, "library anxiety and computer attitudes"; shoham and mizrachi, "library anxiety among undergraduates."
18. mizrachi, "library anxiety and computer attitudes"; shoham and mizrachi, "library anxiety among undergraduates"; jerabek, meyer, and kordinak, "'library anxiety' and 'computer anxiety.'"
the impact of information technology on library anxiety | jiao and onwuegbuzie 143
19. brenda h. loyd and clarice gressard, "the effects of sex, age, and computer experience on computer attitudes," aeds journal 18, no. 2 (1984): 67-77.
20. sharon l. bostick, "the development and validation of the library anxiety scale" (ph.d. diss., wayne state university, 1992).
21. qun g. jiao and anthony j. onwuegbuzie, "reliability generalization of the library anxiety scale scores: initial findings" (unpublished manuscript, 2002).
22. onwuegbuzie, jiao, and bostick, library anxiety, 22.
23. norman cliff and david j. krus, "interpretation of canonical analyses: rotated versus unrotated solutions," psychometrika 41, no. 1 (mar. 1976): 35-42; richard b. darlington, sharon l. weinberg, and herbert j. walberg, "canonical variate analysis and related techniques," review of educational research 42, no. 4 (fall 1973): 131-43; bruce thompson, "canonical correlation: recent extensions for modeling educational processes" (paper presented at the annual meeting of the american educational research association, boston, mass., apr.
7-11, 1980) (eric, ed 199269); bruce thompson, canonical correlation analysis: uses and interpretations (newbury park, calif.: sage, 1984); bruce thompson, "canonical correlation analysis: an explanation with comments on correct practice" (paper presented at the annual meeting of the american educational research association, new orleans, la., apr. 5-9, 1988) (eric, ed 295957); bruce thompson, "variable importance in multiple regression and canonical correlation" (paper presented at the annual meeting of the american educational research association, boston, mass., april 16-20, 1990) (eric, ed 317615).
24. margery e. arnold, "the relationship of canonical correlation analysis to other parametric methods" (paper presented at the annual meeting of the southwest educational research association, new orleans, la., jan. 1996) (eric, ed 395994).
25. thompson, "canonical correlation: recent extensions."
26. ibid.
27. jacob cohen, statistical power analysis for the behavioral sciences (new york: wiley, 1988).
28. ibid.
29. zarrel v. lambert and richard m. durand, "some precautions in using canonical analysis," journal of marketing research 12, no. 4 (nov. 1975): 468-75.
30. anthony j. onwuegbuzie and larry g. daniel, "typology of analytical and interpretational errors in quantitative and qualitative educational research," current issues in education 6, no. 2 (feb. 2003). accessed nov. 13, 2003, http://cie.ed.asu.edu/volume6/number2/.
31. barbara g. tabachnick and linda s. fidell, using multivariate statistics, 3rd ed. (new york: harper, 1996).
32. mizrachi, "library anxiety and computer attitudes"; shoham and mizrachi, "library anxiety among undergraduates."
33. jerabek, meyer, and kordinak, "'library anxiety' and 'computer anxiety.'"
34. qun g. jiao, anthony j. onwuegbuzie, and sharon l. bostick, "racial differences in library anxiety among graduate students," library review 53, no. 4 (2004): 228-35.
35. qun g. jiao and anthony j.
onwuegbuzie, "library anxiety: a function of race?" (unpublished manuscript, 2003).
36. anthony j. onwuegbuzie and joel r. levin, "a proposed three-step method for assessing the statistical and practical significance of multiple hypothesis tests" (paper presented at the annual meeting of the american educational research association, san diego, calif., apr. 12-16, 2004).
37. jiao, onwuegbuzie, and bostick, "racial differences in library anxiety."
38. bostick, "the development and validation of the library anxiety scale."

gaps in it and library services at small academic libraries in canada
jasmine hoover
information technology and libraries | december 2018 15
jasmine hoover (jasmine_hoover@cbu.ca) is scholarly resources librarian, cape breton university, sydney, nova scotia, canada.

abstract
modern academic libraries are hubs of technology, yet the gap between the library and it is an issue at several small university libraries across canada that can inhibit innovation and lead to diminished student experience. this paper outlines results of a survey of small (<5,000 fte) universities in canada, focusing on it and the library when it comes to organizational structure, staffing, and location. it then discusses higher-level as well as smaller-scale solutions to this issue.

introduction
modern academic libraries are hubs of technology, yet existing staffing, organizational structures, physical proximity, and traditional ways of doing things in higher education have maintained a gap between the library and it, which is an issue at several small university libraries across canada. libraries today are largely online, which means managing access to resources, using online tools for reference and research, designing websites, and more.
the physical space in libraries is now a place to interact with new technologies and visualize data, and a place for research support, including open access repositories, data management, and other digital research initiatives.1 these library functions often require a staffing complement with a level of specialization in information technology (it). however, though the offerings of the library have changed drastically over the years, smaller university libraries have struggled to support the growing need for it services.

larger universities (over 5,000 fte) have managed this influx of demand and usage of new technologies in libraries by having their own library it services to manage software and technologies that support research, teaching, and learning. many also offer student- and user-facing technical support with it help desks within the library. smaller universities (below 5,000 fte) often do not have the resources for their libraries to have their own it department or staff, and find themselves unable to help researchers with modern digital scholarship, unable to support new systems and software, and not working as closely with it as they would like or need. also, the it department is generally not responsible for this kind of work, as it is outside of institution-wide software support.

this paper outlines the current status of it and the library when it comes to organizational structure, physical location, and collaboration in small academic libraries across canada. it then outlines strategies that can be used in smaller libraries to help bridge the gap, as well as recommendations for administrators when considering organizational changes to better serve a modern research atmosphere.
https://doi.org/10.6017/ital.v37i4.10596

current status at small canadian universities
the technologies behind modern library services are often complex, as libraries need to securely manage access to online resources (both on and off campus); support faculty as they research and teach using new software and technologies; and support new models for publishing that include open-access repositories, data management, open education resources, and more. library staff deal with technology issues that come up daily, with several non-it library staff members troubleshooting and solving various issues that arise. library users run into all kinds of technical issues and reach out for help. in nova scotia, our library consortium offers live help, an online library chat service distributed throughout eleven academic institutions in nova scotia. statistics kept on type of question asked on this service from january 2010 to march 2018 show that 26 percent of the over 68,000 questions asked are technical in nature, with topics including difficulty accessing online resources, login troubles, and other technical issues.2

for this study, 18 out of the 21 universities with fte > 1,000 and < 5,000 in canada were surveyed. excluded were universities that were “sister” institutions of larger universities which utilized the same library system and french-only-speaking universities. twelve university libraries responded to an online survey which asked questions concerning organization and collaboration focused on it, the library, and educational technology. results (see figure 1) show that organizational reporting structures in higher education vary when it comes to it and the library.
fifty percent of the survey respondents reported that their it department reports to the ceo/cfo or vp administrative, 25 percent of it departments report to a cio, 17 percent report to a provost/vp academic, and 8 percent report to a vp finance.

figure 1. which of the following best describes how your it organization reports? (reports to ceo/cfo/vp admin, 50%; reports to provost/vp academic, 17%; reports to cio, 25%; reports to the vp finance, 8%.)

all of the libraries in this survey, on the other hand, report to a provost or vp academic. this makes sense, as libraries are generally considered academic while it is usually associated with operations. however, there have been recent changes to some university library structures in canada that might indicate new thinking when it comes to organizational structure and the relationship between these units. in 2018, it was announced that there would be restructuring at brandon university which removed the university librarian position altogether (as well as the director of it services), and placed the library under a chief information officer. this would bring the library and it under one reporting structure.3 in an opposite move, mount allison university recently proposed to eliminate the top librarian position and have an academic dean split the responsibility of the library and their academic unit.4 after local outcry, this move was reversed and the job ad is out for a head librarian. it is hard to say if these are signals of upcoming change in the future of library reporting, or a temporary solution in a time of budget restrictions. however, half of the survey respondents mentioned that there has been some recent reorganization or planned reorganization related to it and the library at their institutions.

only 33 percent of small university libraries surveyed have their own it department or staff.
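the figure 1 percentages can be checked against a simple tally of the twelve responses. the sketch below is illustrative only: the raw answers were not published, so the per-option counts (6, 3, 2, 1) are an assumption inferred from the reported percentages, and the function name percent_breakdown is hypothetical.

```python
from collections import Counter

# Hypothetical raw answers to survey question 3, one per responding library.
# ASSUMPTION: the counts are inferred from the reported percentages and the
# twelve responses; the paper does not publish the raw answers.
responses = (
    ["ceo/cfo/vp admin"] * 6
    + ["cio"] * 3
    + ["provost/vp academic"] * 2
    + ["vp finance"] * 1
)


def percent_breakdown(answers):
    """Return {answer: share of all answers, rounded to a whole percent}."""
    counts = Counter(answers)
    total = len(answers)
    return {answer: round(100 * n / total) for answer, n in counts.items()}


breakdown = percent_breakdown(responses)
for answer, pct in breakdown.items():
    print(f"{answer}: {pct}%")
```

rounding each share to a whole number is why 2 of 12 responses appears as 17 percent and 1 of 12 as 8 percent.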
one of those libraries has an it specialist who splits time between the library and the it department. the other 67 percent have no it department or staff in the library (see figure 2).

figure 2. does your library have its own it department? (yes, they are library employees, 25%; no, 67%; 8%, legend label not recovered.)

when asked, “is there anything you would like changed about the current organization when it comes to it and the library?,” all of the libraries without in-library it support mentioned a desire for either a position in the library responsible for it; greater collaboration between it and the library; or a specific person within the it department whom they could contact regarding it.

student experience, including students' experience with technology, is important, according to a 2017 educause study that outlines the importance of it and of student support for wi-fi and other technical needs.5 one recommendation from this report is to have it help desks more visible and available. not only is the library a convenient location, but as we have already seen, students are increasingly using technologies in the library and often run into issues. it makes sense then to have an it help desk within the library, as the majority of larger university libraries in canada already do. when asked about it help desks in the library, three of the responding university libraries (25 percent) have help desks staffed by it services, one (8 percent) has a help desk staffed by library staff, and another (8 percent) has an after-hours help desk staffed by it services. the remaining 59 percent have no it help available in the library (see figure 3).

figure 3. does your library have an it help desk?

the physical location of the two units is also important.
in this survey, 75 percent of respondents replied that the library and the it department are in separate spaces while 25 percent share a common space. studies have shown that physical proximity in the workplace can lead to greater collaboration. an mit study showed that physical proximity drives collaboration between researchers on university campuses.6 as one of the common themes in the survey was the desire for more collaboration, a physical change of location could have a great impact. when asked about changes people would like to see with the current organization of the it and library, many mention a need for more collaboration due to interrelated responsibilities. common suggestions included library it staff, having an it help desk in the library, or a specific person in it they could contact directly for help or who had shared responsibilities between it and the library. another suggestion was a committee that would bring together members from both units to strengthen communication. what can be done? in the larger view, university administrations need to look for outdated governance and organizational structures that are in place. as universities shift their goals and focus over time, they need to adapt structures and staffing accordingly. 
chong and tan describe it governance as being of utmost importance, claiming there needs to be strategic alignment between it and organizational strategies and objectives.7 carraway describes universities with a high level of it governance maturity and effectiveness as those where “it initiatives are aligned with the institution’s strategic priorities and prioritized among the university’s portfolio projects.”8 effective it governance, focused on collaboration and communication, is associated with greater integration of innovation into institutional process. also, it governance was found to be more effective under a delegated model that empowers it governance bodies than under a cio-centric model.

[figure 3 legend: no, 59%; yes, employed by it services, 25%; yes, employed by the library, 8%; yes, employed by it services, only after hours, 8%]

the majority of universities surveyed showed common governance structures of it, with most as separate units reporting to a cfo/vp admin or similar. the inclusion of faculty, students, and business units in it governance committees was associated with a stronger innovation culture.9 stakeholder inclusion is an important characteristic of it governance maturity. students, as consumers of it, and faculty should both have a seat at the table when it comes to it governance. carraway found that an increased level of student engagement in it governance correlates with a high level of innovation culture.10 university administration should take a good look at how it is governed, who has input, and how it is affecting the university’s objectives. the reporting structure of libraries has generally gone unchanged, with most respondents confirming that their library reports to an academic vice president.
budget constraints at two canadian universities have recently affected library structure; however, there has been little research on the ideal governance structure of libraries in higher education. both it and the library in smaller canadian universities could consider governance committees that include students, faculty, and other stakeholders, in order to be more innovative and effective.

it is an interesting unit, where the model in higher education has moved back and forth between three main structures: centralized, decentralized, and federated. centralized, where there is a central hub that runs it services for the university, is the most common structure found at the surveyed universities. decentralized, where it services are spread throughout the organization, would automatically mean the library (and other units) had it staff. a federated model would also lead to local library it work being done by specific people, who work for and out of a central it office but are assigned to specific areas. federated structures offer centralized control, with decentralized functions in faculties and units. chong and tan believe that federated structures are more appropriate for a collaborative network, such as a university.11 their study found that a federated structure, combined with coordinated communication, led to higher effectiveness. nugroho maintains that decentralized organizations such as universities need to regularly review their it governance structure, as both technology and the organization itself change.12 he maintains that effective governance does not happen by coincidence, and it governance is not a static concept.

library staffing also needs to change based on needs of the users and goals of the organization.
some even suggest that libraries reorganize every few years to keep staff flexible, take advantage of new opportunities, and foster growth.13 in 2011, we saw bell and shank’s work on the blended librarian, which advocated for librarianship with educational technology and instructional design skills.14 according to the 2015 arl statistics, we continue to see nontraditional professional jobs increasing in the library. in 2015, the top three new hire categories included two nontraditional categories: digital specialists and functional specialists.15 acrl statistics from 2016 showed that over the previous five years, 61 percent of libraries repurposed or cross-trained staff to better support new technologies or services.16 we saw earlier that, of the more than 68,000 questions fielded through the live help service across nova scotia since 2010, just over one quarter were technical in nature. library administration at smaller universities, looking at these numbers, should respond by writing technical knowledge and skills, which are increasingly in demand, into job ads, or by ensuring that staff are trained appropriately.

physical location is also important. we’ve seen from the survey results that there is a lack of physical connectedness between the library and it in smaller canadian universities. wineman et al. studied various organizations and their physical proximity. they state: “social networks play important roles in structuring communication, collaboration, access to knowledge and knowledge transformation.”17 they suggest that innovation is a process that occurs at the crossroad between social and physical space.
cramton points out that “maintaining mutual knowledge is a central problem of geographically dispersed collaboration.”18 if it is not possible to change the organizational structure or governance to ensure more communication and knowledge sharing, a shared physical space, such as an it desk in the library, is another way for the library and it staff to be in regular contact. a 2017 mit study recommended that institutions keen to support the cross-disciplinary collaborative activity that is vital to research and practice may need to adopt “a new approach to socio-spatial organisation that may ultimately enrich the design and operation of places for knowledge creation.”19 we could apply the same thinking to institutions interested in supporting collaborative activity between the library, it, and newer-yet-related initiatives such as educational technology and digital research centers. proximity to collaborators should be considered as one option to enhance outcomes and innovation between the library and it.

organizational structures and models, physical locations, and governance are all large-scale factors that should be considered when looking at the relationship between it and the library. there are also smaller-scale practical ideas that can help. these ideas will be discussed below.

an important first step is to start the conversation. the author’s institution has begun thinking about the gaps in our services and support for research, especially when it comes to support for technologies needed for modern research and publication that are often housed in the library. factors which have helped start this conversation include: funding mandates related to open access and data management; new services or initiatives that researchers or units would like to start, which require it and library specialization; and planning for a future in higher education that increasingly relies on up-to-date technologies to support research, publishing, and teaching.
a conversation is beginning between researchers, administration, the library, and other stakeholders which will lead to a collaborative solution to some of these issues. it’s important that there is interest and initiation from administration, but also that other stakeholders are involved from the onset. many universities have developed new positions, new units, or worked these positions into it or the library to fill this gap, but the solution needs to fit each institution and its goals.

often when there is no it staff in the library, technical issues are managed by one or two technical-minded staff members. equipping frontline service providers may help alleviate some of this work by enabling many staff to solve common technical issues. here at the author’s institution, the librarian in charge of access has begun presenting common technical/access issues during a monthly reference meeting. the goal is to have all staff who field questions from users have a basic understanding of how the systems work in the library, what to do if they see issues, and whom they can contact. in libraries where there is not a strong it presence, it is important to enable staff to be comfortable with basic issues that will come up. this also ensures that there is not just one person who can answer common technical/access issues. if someone staffing the reference or circulation desk encounters users with these issues, they can explain why they are happening and what the library is going to do to help them. the plan is to create a library technical manual out of these quick presentations that can act as a resource for all staff or as a training manual for new staff. at each of these presentations, a survey is administered. the survey has four questions and asks participants about their comfort level dealing with technical/access questions both before and after the presentation.
one hundred percent of staff answered that after the presentation, they felt more comfortable when encountering the issues described. this is not a suitable replacement of the specialized it skills needed in libraries; however, it can alleviate some of the pressure put on select people in smaller academic libraries. library staff can, and do, actively work to learn new skills through formal training and professional development. we saw from the acrl survey that many libraries are working to cross-train staff in order to keep up with technological demands. encouraging learning new skills and educational opportunities can go a long way and should be encouraged by library administration. the benefit of having it staff dedicated to the library is obvious, and libraries should continually push for this. results of the survey showed that library staff would prefer to have a person to contact with issues specific to the library. issues can be dealt with promptly, it personnel working in or assigned to the library will have an understanding of the systems involved, communication is easier, as there is a point person to contact, and the library has control over the products and services they offer. however, if that is not possible within the organization, a good system of communication is important. a timely system of contacting it and resolving issues can go a long way. chong and tan maintain that a coordinated communication system is key for it in an organization.20 a commonly used system for technical issues is the ticket system, where issues can be submitted by users, and answered and tracked by it. this is a very useful system for it staff, however the users often cannot track their own ticket, see a timeline for completion, or know who is on the other end to contact with more information. 
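because users of a ticket system often cannot follow their own tickets, a library may want a lightweight record of its own. the following is a minimal sketch, assuming a hypothetical log with one entry per issue sent to it; the field names, categories, and timestamps are invented for illustration and do not reflect any particular product's export format.

```python
from datetime import datetime
from statistics import median

# Hypothetical log of issues the library forwarded to IT.
# ASSUMPTION: field names, categories, and timestamps are illustrative only.
issues = [
    {"category": "login", "sent": "2018-03-01 09:00", "resolved": "2018-03-01 15:00"},
    {"category": "proxy access", "sent": "2018-03-02 10:00", "resolved": "2018-03-05 10:00"},
    {"category": "login", "sent": "2018-03-06 13:00", "resolved": "2018-03-07 13:00"},
    {"category": "printing", "sent": "2018-03-08 11:00", "resolved": None},  # still open
]

FMT = "%Y-%m-%d %H:%M"


def hours_to_resolve(issue):
    """Hours between sending an issue to IT and its resolution, or None if open."""
    if issue["resolved"] is None:
        return None
    sent = datetime.strptime(issue["sent"], FMT)
    resolved = datetime.strptime(issue["resolved"], FMT)
    return (resolved - sent).total_seconds() / 3600


def summarize(log):
    """Count total and open issues and report the median hours to resolution."""
    durations = [h for h in map(hours_to_resolve, log) if h is not None]
    return {
        "total": len(log),
        "open": len(log) - len(durations),
        "median_hours": median(durations) if durations else None,
    }


summary = summarize(issues)
print(summary)
```

the median is used here rather than the mean so that one long-running ticket does not dominate the summary; that choice is a suggestion, not something the survey data prescribes.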
it is a good idea to meet regularly with it, formally or informally, to be able to discuss issues, build a relationship with colleagues, and get a better sense of how each unit works. on the library end, it is important to keep statistics on technical issues sent to it and the time elapsed before the issues are resolved. these statistics can be used to demonstrate the need for library-specific it staff, encourage better communication between departments, or demonstrate a problem with the current way issues are communicated. having statistics will help libraries if and when the time comes that new positions can be created. at the author’s institution we use springshare’s libanswers software to track all technical issues, including those sent on to it. this software records dates and times, important details, and resolutions of technical issues, and exports useful statistics.

in smaller canadian university libraries there is a growing need for it support. however, there has been little done by way of organizational structure, staffing, or physical proximity between these two units to allow universities to better serve their students and faculty. this paper outlined the current situation in several smaller university libraries in canada and provided some high-level as well as local solutions to this problem.

appendix a: it, library, and educational technology organization
*required
1. institution name *
2. total student population
3. which of the following best describes how your it organization reports? mark only one oval.
    reports to ceo/cfo/vp admin
    reports to cio
    reports to provost/vp academic
    reports to dean of library/head of library
    other:
4. which of the following best describes how the dean/head of library/university librarian reports? mark only one oval.
    reports to the ceo/cfo/vp admin
    reports to provost/vp academic
    reports to university president
    other:
5. which of the following best describes it's relationship to the library? mark only one oval.
    it and the library are not at all part of the same reporting structure
    it is a part of the library reporting structure
    it and the library report to the same person, but are separate departments
    other:
6. which of the following describes the physical location of it and the library? mark only one oval.
    located in separate spaces
    share a physical location
    other:
7. does your library have its own it department? mark only one oval.
    yes, they are library employees
    yes, they are employed by it services and work in the library
    no
    other:
8. does your library have an it help desk? mark only one oval.
    yes, they are library employees
    yes, they are employed by it services
    no
    other:
9. have there been any major reorganizations (that you are aware of) related to it and library services in the last ten years?
10. is there anything you would like changed about your current organization when it comes to it and the library?
11. who is in charge of educational technology/academic technology at your university? mark only one oval.
    library
    it
    educational technology is separate unit/office
    educational technology duties are split up among the library/it/other
    other:
12. which of the following describes the physical location of educational technology? mark only one oval.
    ed tech is located in or shares space with the library
    ed tech is located in or shares space with it
    ed tech has its own space
    no ed tech unit
    other:
13. what would you include as roles of an educational technology unit? mark all that apply.
• media design/production
• research and development (testing technologies, emerging tech)
• instructional design and development
• faculty development
• learning spaces
• assessment (learning outcomes, course evaluations)
• distance/online learning support
• training on course software/technologies related to teaching and learning
• managing classroom technologies
• other:

14. have there been any changes (that you know of) related to educational technology services in the last ten years?

15. is there anything you would like changed about your current organization when it comes to educational technology services and the library?

16. may i use direct quotes in my research/publication? (no names or institutions will be attributed to a quote.) mark only one oval.
• yes
• no

references

1 tibor koltay, “are you ready? tasks and roles for academic libraries in supporting research 2.0,” new library world 117, no. 1/2 (january 11, 2016): 94–104, https://doi.org/10.1108/nlw-09-2015-0062.
2 “instant messaging service—statistics data entry page,” novanet, accessed june 5, 2018, https://util.library.dal.ca/livehelp/liveh3lp/admin/livehelp/chatentry.php.
3 “brandon university will eliminate 15% of senior administration to help tackle budget cut,” brandon university, march 15, 2018, https://www.brandonu.ca/news/2018/03/15/brandon-university-will-eliminate-15-of-senior-administration-to-help-tackle-budget-cut/.
4 joseph tunney, “mount a proposal to phase out top librarian makes students, staff want to make noise,” cbc news, january 18, 2018, https://www.cbc.ca/news/canada/new-brunswick/mount-allison-university-librarian-1.4492297.
5 d. christopher brooks and jeffrey pomerantz, “ecar study of undergraduate students and information technology,” educause, october 18, 2017, accessed june 7, 2017, https://library.educause.edu/resources/2017/10/ecar-study-of-undergraduate-students-and-information-technology-2017.
6 matthew claudel et al., “an exploration of collaborative scientific production at mit through spatial organization and institutional affiliation,” plos one 12, no. 6 (2017), https://doi.org/10.1371/journal.pone.0179334.
7 josephine chong and felix b. tan, “it governance in collaborative networks: a socio-technical perspective,” pacific asia journal of the association for information systems 4, no. 2 (2012).
8 deborah louise carraway, “information technology governance maturity and technology innovation in higher education: factors in effectiveness” (master’s diss., the university of north carolina at greensboro, 2015), 113.
9 ibid., 89.
10 ibid.
11 chong and tan, “it governance in collaborative networks: a socio-technical perspective,” 44.
12 heru nugroho, “conceptual model of it governance for higher education based on cobit 5 framework,” journal of theoretical and applied information technology 60, no. 2 (february 2014): 6.
13 gillian s. gremmels, “staffing trends in college and university libraries,” reference services review 41, no. 2 (2013): 233–52, https://doi.org/10.1108/00907321311326165.
14 john d. shank and steven bell, “blended librarianship,” reference & user services quarterly 51, no. 2 (winter 2011): 105–10.
15 stanley wilder, “hiring and staffing trends in arl libraries,” association of research libraries, october 2017, https://www.arl.org/storage/documents/publications/rli-2017-stanley-wilder-article2.pdf.
16 “new acrl publication: 2016 academic library trends and statistics,” news and press center, july 20, 2017, http://www.ala.org/news/member-news/2017/07/new-acrl-publication-2016-academic-library-trends-and-statistics.
17 jean wineman et al., “spatial layout, social structure, and innovation in organizations,” environment and planning b: planning and design 41, no. 6 (december 1, 2014): 1,100–112, https://doi.org/10.1068/b130074p.
18 catherine durnell cramton, “the mutual knowledge problem and its consequences for dispersed collaboration,” organization science 12, no. 3 (may–june 2001): 346–71, https://doi.org/10.1287/orsc.12.3.346.10098.
19 claudel et al., “an exploration of collaborative scientific production at mit through spatial organization and institutional affiliation,” 2.
20 chong and tan, “it governance in collaborative networks: a socio-technical perspective,” 44.

article

expanding and improving our library’s virtual chat service: discovering best practices when demand increases

parker fruehan and diana hellyar

information technology and libraries | september 2021
https://doi.org/10.6017/ital.v40i3.13117

parker fruehan (fruehanp1@southernct.edu) is assistant librarian, hilton c. buley library, southern connecticut state university. diana hellyar (hellyard1@southernct.edu) is assistant librarian, hilton c. buley library, southern connecticut state university. © 2021.

abstract

with the onset of the covid-19 pandemic and the ensuing shutdown of the library building for several months, there was a sudden need to adjust how the hilton c. buley library at southern connecticut state university (scsu) delivered its services. overnight, the library’s virtual chat service went from a convenient way to reach a librarian to the primary method by which library patrons contacted the library for help. in this article, the authors will discuss what was learned during this time and how the service has been adjusted to meet user needs. best practices and future improvements will be discussed.
background the buley library started using springshare's libchat service in january 2015. the chat service was accessible as a button in the header of all the library webpages, and the wording would change depending on the availability of a librarian. at buley library, the chat service is only staffed by our faculty librarians. there were other chat buttons on various individual libguides for either specific librarians or for the general library chat. chat was monitored at the research & information desk by the librarian on duty. the first librarian of the day would log into the shared chat account on the reference desk computer. while each librarian had their own account, using a shared account meant that the librarians could easily hand off a chat interaction during a shift change. while the reference desk was typically busy, librarians would only receive a small number of chats per day. between 2015 and 2019, the library saw an average of 250 chats per year. due to the low usage, there was little focus on libchat training for librarians. for more complicated questions, librarians would often recommend that chat users call, email, or schedule an in-person appointment. since libchat was only monitored while librarians were at the reference desk, it was easy to let it become a secondary mode of reference interaction, particularly if there was a surge of in-person reference questions at any given time. due to the covid-19 pandemic, the library quickly shifted from mostly in-person to solely online services. suddenly, libchat was the virtual reference desk and the main mode of patron interaction. despite this change in how the library interacted with the campus, there was only a slight increase in chat usage in the first two months of the closure. in april 2020, we started to explore our options with libchat in the hopes of increasing visibility and usage. 
evaluating chat widget options

considering technical implementation

the publicly accessible chat interface is made available completely within a webpage, requiring no clients, external applications, or plugins to make it functional. springshare calls this component the libchat widget, and provides a prepackaged set of website code necessary to create the chat interface. within the libchat system there are a few options for widget placement and presentation. at the time of writing, springshare offers four widget types in its libchat product: in-page chat, button pop-out, slide-out tab, and floating.1 when the service is offline, the system replaces the chat interface with a link to library faqs and the option to submit a question for follow-up. at buley library, prior to the covid-19 pandemic shutdown, the button pop-out was the main widget type used to enter a chat session (see fig. 1).

figure 1. previous library website header with chat pop-out button in upper right-hand corner.

the pop-out button works by opening a separate pop-up window with the chat interface. this allows the user to navigate to other pages in the previous window without disconnecting from the session. one challenge to the pop-up window method is that many web browsers block pop-up windows by default, requiring a user to recognize and override this setting.

another option used mainly on librarian profiles and subject guides is the in-page chat, which embeds the chat interface directly on an existing webpage. many times, these chat widgets are connected to a particular user rather than the queue monitored by all librarians. the user will interact with the chat operator in this dedicated section of the webpage.
if a user navigates to a different page in the same window or tab, they will be disconnected from the chat session.

these widget options are the easiest to implement in terms of web design expertise and time commitment. both the button pop-out and in-page chat can be added by anyone with access to a what you see is what you get (wysiwyg) editor on the webpage and the ability to copy and paste a few lines of html code. neither requires any custom <script> elements to be placed in the page <head> or footer area.

choosing the floating widget

when the library shifted to all virtual services, there was concern that the chat button could easily be missed by library patrons. it was decided at this point to investigate alternative options to invite patrons to chat with librarians. the floating chat widget was chosen as the best option and was integrated into the website during a theme update in may 2020.

the floating chat widget was made widely available to springshare libchat users in 2017.2 this widget operates by placing a chat icon at the bottom right of a webpage. this icon remains visible in this location while users scroll down a page. another option is to implement this icon as a proactive widget which displays a message to users after a set number of seconds to invite them to start a chat session (see fig. 2).

figure 2. floating chat widget on the redesigned library homepage with the proactive setting enabled (pop-up modal) to invite users to begin a chat. a pop-out button is also on the library homepage for patrons who were accustomed to the previous version.

implementing this type of widget requires an administrator level of access to the website content management system if it is to be implemented across an entire website.
a single <script> tag in the <head> section of the site template will activate it across an entire site.

the floating chat widget is a common feature widely seen in the business world and on retailer websites. the hesitation in implementing this type of widget was that it would be perceived by patrons as intrusive or annoying. however, one study reported finding that college students found it useful.3 additionally, several other libraries have written previously about their success in implementing a proactive chat widget.4 it was decided that the library would implement this proactive chat widget on a trial basis and then evaluate whether to continue.

shortly after, it was decided to add another proactive chat widget to southernsearch, the library’s discovery platform, built on ex libris primo (see fig. 3). the primo new ui, built on nodejs, is more complex to implement as it requires building a javascript function to insert the chat widget script code into the application for nodejs to render on the front-end. this process is well documented by laura guy of the colorado school of mines.5

figure 3. embedded chat widget in southernsearch discovery platform. the pop-up modal is hidden in this view.

libchat in action

in active chats

the ask a librarian chat became the library’s virtual reference desk when the university closed its campus in march 2020 due to the covid-19 pandemic. with the new focus on chat, the librarians decided it was time for a refresher on libchat. training was also necessary due to an update to the librarian libchat dashboard. a virtual training session was held to show librarians how to use the new dashboard and to remind everyone how best to use chat since it was infrequently used prior to closing in-person services.
the training was recorded, and a link was provided to everyone so librarians could watch the training again as needed.

the libchat widget change caused a huge increase in daily chat numbers. the library received 47 chats in april with the chat button. the total spiked to 160 in may with the floating chat widget. historically, there is a decrease in both virtual and in-person reference interactions during the month of may. librarians were answering more chats than ever before. however, some librarians still had questions or technical troubles that did not get asked until after a chat ended. there was a need for a place to address these questions ahead of time so answers could be provided at the point of need. in response, a libchat best practices guide was created soon after the widget change to house tips for effective chat interactions and quick troubleshooting answers. the best practices guide was well received by librarians; however, it was underutilized. libchat administrators hoped to be able to add a link to it directly within the dashboard for quick reference if librarians had an issue during a live chat, but they were unable to do so.

the librarians added new canned messages to increase efficiency and consistency in responding to common patron needs. canned messages are prewritten statements that an administrator creates and that any librarian can easily add to an active chat. for example, librarians created messages for bookstore contact information, how to request electronic delivery of journal articles, and the library’s hours.

privacy considerations

the library’s chat system is capable of capturing data that comprises personally identifiable information (pii) for a patron.
an intentional decision was made to limit this collection as much as possible in keeping with the principles of ala’s stance on patron privacy in the library bill of rights, which as interpreted says:

the right to privacy includes the right to open inquiry without having the subject of one’s interest examined or scrutinized by others, in person or online. confidentiality exists when a library is in possession of personally identifiable information about its users and keeps that information private on their behalf.6

the chat login form asks a patron for information before starting a chat; that information can include name, contact information (such as email or phone numbers), and up to three customized questions. additionally, the system automatically captures the referrer url and the user’s public ip address, browser, and operating system. while it would be helpful to have more information up front, this could lead to the collection of a wealth of pii. with these concerns in mind, it was decided to make pii such as name, email address, and phone numbers an optional entry. while knowing a name and email address is useful when the librarian would like to follow up on a question or if a chat session is unintentionally disconnected, it is not necessary. the librarian may request this information later in the chat if needed.

handling harassment and inappropriate behavior

in june, one chat patron caused an immediate change in the operation of libchat at the library. on this day, a librarian answered an incoming chat, and a hostile patron threatened the librarian. before the situation could be addressed, another question from an anonymous patron came into the queue. a second librarian, who happened to be one of this article’s authors, claimed the chat despite being aware of the situation and suspecting this was the same difficult patron. it began as a normal interaction, but the patron eventually became inappropriate.
the librarian warned the patron multiple times before finally ending the chat and blocking their ip address. in response, the patron left an inappropriate comment on the ratings and comments survey after the chat interaction finished.

this one patron interaction made the librarians realize that they were vulnerable to harassment and other hostile behavior. changes were made to help create a safe environment for librarians and patrons. first, training and instructions were provided so every librarian knew how to ban a user’s ip address. it was made clear that those ip addresses could be reinstated if a mistake was made. the list is checked on occasion to determine if any decision needs to be reverted or if a university ip address was inadvertently banned. this is especially important if one bans the ip address assigned to the university edge router, a critical network device from which all campus http traffic originates. this could block all users on campus from accessing libchat.

additionally, some librarians chose to change their display name from the default option, which is full first and last name. many changed it to a combination of their first name and their subject-liaison job title.

a decision was also made to turn off the chat rating and comment survey. it was not useful for the library’s own assessment purposes as most chat interactions were left unrated. leaving it in place could lead to more inappropriate behavior as comments can be left by a patron even after being banned.

most importantly, the librarians created a virtual reference policy. while there was an existing general reference policy in place, it did not focus on, nor specifically mention, online conduct. librarians agreed that it was time to create an additional policy since the increase in usage identified new concerns.
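the periodic review of the ban list described above can be sketched in a few lines of python. this is only an illustration under stated assumptions: the ban list is typed in by hand rather than exported from libchat, the function name find_risky_bans is hypothetical, and the addresses are reserved documentation ranges, not scsu's real network.

```python
import ipaddress


def find_risky_bans(banned_ips, campus_networks):
    """Return banned addresses that fall inside any campus network.

    Both arguments are lists of strings; campus_networks uses CIDR
    notation. The values are supplied by the caller (hypothetical here).
    """
    networks = [ipaddress.ip_network(cidr) for cidr in campus_networks]
    risky = []
    for text in banned_ips:
        address = ipaddress.ip_address(text)
        # Membership is False when the IP versions differ, so mixed
        # IPv4/IPv6 lists are safe to check.
        if any(address in network for network in networks):
            risky.append(text)
    return risky


if __name__ == "__main__":
    campus = ["192.0.2.0/24"]              # assumed campus range (TEST-NET-1)
    banned = ["203.0.113.5", "192.0.2.1"]  # assumed ban list
    for ip in find_risky_bans(banned, campus):
        print(f"review ban on {ip}: address belongs to a campus network")
```

a ban that lands inside a campus range is not necessarily a mistake, but it is the kind of entry worth a second look, since a shared address such as the edge router would affect every on-campus user.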
the policy gave librarians grounds to ban patrons if needed, informing them that inappropriate conduct was against the virtual reference policy, and disclosed the librarian’s right to report inappropriate behavior to campus authorities. a canned message was created to allow librarians to quickly inform a patron before banning the ip address and ending the chat. the message warns the patron that their behavior violates the virtual reference policy and links to the library’s research & instruction policies page where the policy is included.

observing patron usage

at the beginning of the fall 2020 semester students returned to school, both on campus and virtually. as shown in figure 4, there was a huge spike in chat questions at the beginning of the semester. typical questions included those about borrowing textbooks, using study rooms, and research questions. librarians could sometimes tell if chats were from scsu students or employees if the patron provided their school email address.

librarians started seeing another interesting trend with the referrer url. it was noticed that subject guides see more chat patrons who provide non-university email addresses or provide no contact information. when asked, patrons responded that they were not affiliated with scsu. it is suspected that non-scsu patrons find the library’s subject guides through a search engine and while exploring the page the floating chat prompts them to chat with a librarian. the floating chat widget has helped to inform us about outside traffic to some of our guides. this has encouraged the library to take a more proactive approach to libguide maintenance among subject librarians, especially the small number of guides with heavy chat traffic.

figure 4. total number of chats per month from january to december 2020.
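one way to spot the guides with heavy chat traffic mentioned above is to tally chats by referrer url. the python sketch below assumes the referrer urls have already been collected into a plain list; the function name chats_by_page and the example.edu addresses are hypothetical and are not part of libchat.

```python
from collections import Counter
from urllib.parse import urlparse


def chats_by_page(referrer_urls):
    """Count chats per referring page, most frequent first.

    Query strings and trailing slashes are dropped so that variants of
    the same guide URL are counted together.
    """
    counts = Counter()
    for url in referrer_urls:
        parts = urlparse(url)
        page = parts.netloc.lower() + parts.path.rstrip("/")
        counts[page] += 1
    return counts.most_common()


if __name__ == "__main__":
    # Hypothetical referrer URLs, one per chat.
    sample = [
        "https://libguides.example.edu/nursing?tab=2",
        "https://libguides.example.edu/nursing/",
        "https://library.example.edu/",
    ]
    for page, total in chats_by_page(sample):
        print(f"{total:3d}  {page}")
```

grouping by host and path is a deliberate simplification; a library that wants to separate individual tabs within a guide would need to keep part of the query string.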
best practices

the lessons learned from the increased chat usage have helped shape a better approach to virtual reference services and create better experiences for the librarians and patrons using the service. with that, here are some considerations and recommendations to expand virtual reference or chat services at other libraries:
• use a persistent chat widget with pop-up notifications on all library pages to remind patrons that immediate help is available.
• consider patron privacy when making decisions about chat settings.
• have policies in place that cover virtual services, such as chat, and that include language on harassment and other inappropriate behavior.
• use the policy language for a canned message so librarians can reply quickly if there is an issue.
• be prepared for non-university affiliated patrons using chat services, especially from subject guides easily found through search engines.

what’s next at buley library

there are a few anticipated changes for the buley library chat service in the future. the library’s student advisory board praised the library chat, sharing that they found it to be a great way for students to contact librarians for help. however, they did suggest the library consider the branding of the service. their recommendation was to make it clear before a patron starts an interaction that a real scsu librarian would be answering the chat. they felt as though students may think that the answerer would be a chat bot or from a call center and not a real librarian that works for the school.

additionally, librarians will re-evaluate chat coverage schedules based on usage data. the current schedule is based on in-person desk coverage.
however, in-person historical statistics do not translate well to online reference services; periods that are busy for in-person reference may not be busy online. the trend appears to show that weekend chat transactions exceed those of in-person transactions during the same period a year previously. given this trend, chat scheduling may need to be adjusted to better align demand with librarian availability. these adjustments could include longer shifts scheduled less frequently rather than daily hour-long shifts. chat allows for more multitasking, so a longer shift may be preferable. it might help the librarian adjust their own schedule in preparation for a longer chat shift.

chat coverage may need to be evaluated again when the library eventually goes back to operating an in-person reference desk. one pressing question is, will chat demand go back to pre-covid levels, or will it remain high when an in-person reference desk returns? while an in-person reference desk is likely to come back in the future, if chat traffic remains high, a separate schedule may be needed for chat in addition to the reference desk, so the same person is not operating both at the same time.

conclusion

it may never be known if the increase in chat volume came from the widget updates or was due solely to the pandemic, but it is likely a bit of both. increased chat volume seems to correlate strongly with implementing the floating chat widget, even though there was already an increase due to the effects of the pandemic shutdown. it is planned to look in depth at all available chat data and compare it to in-person reference statistics. another area for further research is the textual analysis of chat transcripts to find trends and make recommendations for greater effectiveness.

endnotes

1 “springboard tutorials: create libchat widgets,” springshare help center, accessed january 11, 2021, https://ask.springshare.com/springboards/faq/1880.
2 “libanswers 2.15 update—redesigned chat widgets!,” springshare blog, july 6, 2017, https://blog.springshare.com/2017/07/06/libanswers-2-15-update/.
3 bonnie brubaker imler, kathryn rebecca garcia, and nina clements, “are reference pop-up widgets welcome or annoying? a usability study,” reference services review 44, no. 3 (january 1, 2016): 282–91, https://doi.org/10.1108/rsr-11-2015-0049.
4 jan h. kemp, carolyn l. ellis, and krisellen maloney, “standing by to help: transforming online reference with a proactive chat system,” journal of academic librarianship 41, no. 6 (november 1, 2015): 764–70, https://doi.org/10.1016/j.acalib.2015.08.018; michael epstein, “that thing is so annoying: how proactive chat helps us reach more users,” college & research libraries news 79, no. 8 (2018), https://doi.org/10.5860/crln.79.8.436.
5 laura guy, “embedding springshare libchat widget into the primo nu,” ex libris developer network tech blog, updated december 17, 2018, https://developers.exlibrisgroup.com/blog/embedding-springshare-libchat-widget-into-the-primo-nu/.
6 “privacy: an interpretation of the library bill of rights,” american library association, amended june 24, 2019, https://www.ala.org/advocacy/intfreedom/librarybill/interpretations/privacy.
processing of marc tapes for cooperative use

kenneth john bierman: data processing coordinator, oklahoma department of libraries, and betty jean blue: programmer, information and management services division, state board of public affairs, oklahoma city, oklahoma

a centralized data base of marc ii records distributed by the library of congress is discussed. the data base is operated by the oklahoma department of libraries and is available to any library that can make use of it. the history, creation, operation, uses, advantages, disadvantages, cost and future plans of the data base are included, as well as flowcharts (both system and detail) and sample outputs.

background information

early in 1966, college, university and public librarians in oklahoma began meeting irregularly to discuss library automation. the incentive for such meetings was clear: libraries in oklahoma could not justify the financial expenditure necessary to "go their own road" in library automation. secondarily, they realized that at some time in the future, cooperative automation projects begun now would pay big dividends.

with the coming of the library of congress marc ii distribution service in april 1969, interest in library automation once again came to the forefront in oklahoma library circles. after several general meetings, primarily to find out what others were doing, planned to do, had done, or had failed to do, a marc planning meeting was called by the oklahoma department of libraries. representatives (both administrative and technical) of the three colleges, two public library systems, and two universities that were most likely to be doing anything with marc in the immediate future, were invited.

the feeling expressed at the meeting was that if economic use of marc were to be made in oklahoma, there would have to be cooperative effort so that marc data could be used at the least total cost. at the same time, libraries had planned different uses of marc; therefore, allowance for local autonomy and creative use of marc was important. since libraries were planning varying applications of marc, it was decided that the best place to begin a cooperative effort was in the centralized maintenance of a marc data base.

four libraries in oklahoma had placed subscriptions with the library of congress for the marc ii tapes when they became available: one public library, one college library, one university library, and the department of libraries. the cost of maintaining four complete data bases would be large compared to the cost of maintaining one complete data base in the state for everyone's use. the money saved could then be used for utilization of marc records rather than for housekeeping maintenance of marc records.

mr. ralph funk, director of the oklahoma department of libraries, committed the department to obtaining and maintaining a complete file of all cataloging information sent out by the library of congress in marc ii machine readable format (both current and, when available, retrospective) which would always be available on demand (either in part or in whole) to any library in the state.
this report describes the cooperative system developed by the department of libraries to maintain and make available marc ii records to any library in the state that has the computer equipment to make use of them. unlike nelinet (1) and the washington state system (2), which are processing marc tapes to produce final hard-copy products for the cooperating libraries, the oklahoma system provides the machine readable marc records (not final products) a library needs; then that library can process these records in any way it wishes on its own equipment. none of the marc i participants was primarily concerned with the central distribution of selected machine readable records (3). possible future state-wide cooperative ventures with marc (including useful products) are also discussed.

journal of library automation vol. 3/1 march, 1970

overview of the system

the system can be thought of as two sub-systems: 1) merging and maintaining a marc master file of all records sent out by the library of congress in marc ii format, and 2) retrieving, i.e., withdrawing selected records by lc card number from the marc master file for specific libraries on demand.

the maintenance sub-system has four distinct programs: 1) odl-01, which merges marc tapes; 2) odl-03, which drops or transfers to another tape any record or combination of records on a given marc tape; 3) odl-04, which prints a marc tape in upper-case ebcdic; 4) odl-06, which prints the lc numbers (300/page) from any given marc tape. the retrieval sub-system has one program, odl-05, which selects and copies specified lc card number records from the marc master file to a blank magnetic tape to be sent to the requesting library for its use.

the maintenance sub-system

the programs discussed in this section are used to merge and maintain the marc data base (odl-01 and odl-03) and produce hard-copy by-products which are of occasional use for various purposes (odl-04 and odl-06).
system, input and output descriptions are included for each program.

record merge program

this program takes tapes in marc format and code and merges them in lc card number sequence. during processing, messages print if any unusual conditions occur, such as a new record with status other than "n" (new), a matched record with a code other than "c" (corrected) or "d" (deleted), etc. the messages also indicate the action taken. in general, any new record is merged onto the file regardless of code, a match with code "d" causes deletion, and a match with code "c" (and any other match) results in replacement by the new record.

this occasionally causes "invalid" codes to be merged onto the file, but this approach was taken for three major reasons. first, it is usually easier, in cases of error, to remove a bad record from the master than to retrieve it from its source and then merge it onto the master. secondly, as files become larger, it is feasible to make minor merges of a few tapes, then merge the result onto the actual master. during the minor merge, many apparently new records with codes "c" and "d" will appear, but as the final merge is made, appropriate action is taken. thirdly, a library obtaining marc records from the department of libraries may also want to use the odl-01 merge for its own internal use. some of the records requested by the local library may have been corrected at some time and are therefore coded "c". although coded "c", these records are new to the individual library and are perfectly valid from that library's point of view. since the odl-01 merge always merges a new record onto the file regardless of the code, this program can be used by the individual library without modification.
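the merge rules just described can be sketched in a few lines of python. this is only an illustration of the stated rules, not the odl-01 cobol program: the record layout (a tuple of lc card number, status code, and payload), the function name `merge`, and the message wording are all assumptions, and the sketch further assumes that each tape holds a given lc card number at most once and that card numbers compare correctly as strings.

```python
from typing import Iterable, List, Optional, Tuple

# Hypothetical record shape: (lc_card_number, status_code, payload).
Record = Tuple[str, str, str]


def merge(master: Iterable[Record], items: Iterable[Record]) -> Tuple[List[Record], List[str]]:
    """Merge new items into a master file; both inputs sorted by LC card number."""
    out: List[Record] = []
    log: List[str] = []
    m_iter, i_iter = iter(master), iter(items)
    m: Optional[Record] = next(m_iter, None)
    last_key: Optional[str] = None
    for item in i_iter:
        key, status, _ = item
        # Halt if the items tape is out of sequence (duplicates also rejected here).
        if last_key is not None and key <= last_key:
            raise ValueError(f"items tape out of sequence at {key}")
        last_key = key
        # Copy master records that sort before this item.
        while m is not None and m[0] < key:
            out.append(m)
            m = next(m_iter, None)
        if m is not None and m[0] == key:
            if status == "d":
                log.append(f"{key}: deleted")  # match with "d": drop the record
            else:
                note = "corrected" if status == "c" else f"matched with code {status!r}, replaced"
                log.append(f"{key}: {note}")
                out.append(item)  # any other match: new record replaces old
            m = next(m_iter, None)
        else:
            if status != "n":
                log.append(f"{key}: new record with code {status!r}, merged")
            out.append(item)  # unmatched records are always merged
    # Copy whatever remains of the master.
    while m is not None:
        out.append(m)
        m = next(m_iter, None)
    return out, log
```

because an unmatched record is merged whatever its code, the same routine serves both a central master and a local library's own file, which is the third reason given above.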
inputs are 1) a marc master (a tape in marc format and code containing all records merged to date), which is in lc card number sequence; and 2) marc "items" (a tape(s) in marc format and code containing the new records to be merged). processing halts if this tape(s) is out of sequence.

fig. 1. record merge program system flowchart.

outputs are 1) a marc master, which is a tape containing records from all inputs, with appropriate corrections and deletions made; and 2) a merge listing, which contains notices of all corrections and deletions and notices of all unusual conditions. if desired, this listing can be expanded to print certain desirable information from all records merged and thus can be used as a valuable reference and check list. it will contain the lc card number, with prefixes and suffixes, and status code, and will indicate if a match was found on the master tape and the action taken. figure 1 explains the overall flow of the program. figure 2 gives the program details as of september 1969.

fig. 2. record merge program detail flowchart.
fig. 3. print record program output (sample print-out of four marc ii records in upper-case ebcdic).
drop and transfer records program

this is a utility program that enables any number of lc card numbers to be entered on cards, with the option in each case of dropping the record entirely or transferring it to another tape for future action. it has proven useful for removing out-of-sequence records, purging files, etc.

inputs are two in number: 1) any tape in marc code and format (sequence is not checked); and 2) detail cards, each of which contains a 12-position lc card number and a code indicating if this marc record is to be dropped or transferred to another tape. these cards must be in sequence.

there are three outputs: 1) an updated tape containing all marc records on which no action was taken; 2) a transferred tape containing, in sequence, all records transferred; and 3) a listing showing the lc number and the action taken, which is useful for verification of results.

print record program

this program prints in readable form any tape in marc code and format. the translation table, which produces a form of upper-case ebcdic, is the same as that used for other department of libraries programs. it is a character-for-character translation, which, for the present, is useful for many and varied applications. input is any tape in marc code and format. output is an upper-case ebcdic translation of the tape. figure 3 shows a sample output.

figure 4 shows how the oklahoma department of libraries is handling the marc expanded character set with a small printer (ibm 1404, 48 characters). simply stated, the problem is that there are many more characters coded in the marc ascii character set than are available on the particular printer that the department of libraries is using. (this is a local limitation of the printer that happens to be available; it is not a limitation of computer technology, as printers with expanded character sets are readily available.)
in general, rarely used punctuation and special punctuation marks not in the printer's character set print as an "*", the lower-case letters print as their upper-case equivalents, and diacriticals and foreign language symbols print as "=". this translation table is used for in-house lists (for checking purposes, etc.). for production purposes, a slightly different translation table is used. characters, particularly punctuation marks, not available on the printer are translated to their closest equivalent or left blank, whichever is more appropriate.

at the oklahoma department of libraries, all translations at this time are internal and do not affect the marc tapes, which are being left in the original ascii code. it seemed unreasonable to centrally translate the tapes to ebcdic until agreement among all the users could be reached as to a mutually useful translation table.

there is a good possibility that in the near future the information and management services division will make available an off-line printer with an expanded character set (upper- and lower-case letters, additional punctuation, etc.). if this does happen, then print-outs in an expanded character set would be economically possible.

fig. 4. conversion table. notes to the table: marc hex 1c = tape mark; marc hex 1d = end of record; marc hex 1e = field terminator; marc hex 1f = delimiter.
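the in-house translation rule described in the text (lower case folded to upper case, unavailable punctuation replaced by a fill character, diacriticals and foreign language symbols replaced by "=") can be sketched as follows. this is a minimal python illustration and not the department's table: the printable set `PRINTABLE` is a hypothetical stand-in for the printer's repertoire, the fill character is taken here to be an asterisk, and any code point above 7f hex is simply treated as a diacritical or foreign symbol.

```python
import string

# Hypothetical stand-in for the small printer's character set; the real
# repertoire is not listed in the text.
PRINTABLE = set(string.ascii_uppercase + string.digits + " .,-/$*&()+=")

FILL = "*"       # assumed fill for punctuation the printer lacks
DIACRITIC = "="  # fill for diacriticals and foreign-language symbols


def to_print_form(text: str) -> str:
    """Character-for-character translation to an upper-case print form."""
    out = []
    for ch in text:
        if ord(ch) >= 0x80:
            out.append(DIACRITIC)      # outside basic ASCII: treat as diacritical
        elif ch.upper() in PRINTABLE:
            out.append(ch.upper())     # letters fold to upper case
        else:
            out.append(FILL)           # punctuation the printer cannot show
    return "".join(out)


print(to_print_form("Grabar, André [1968]"))
```

the translation is one output character per input character, so field positions on a print-out stay aligned with positions in the record.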
print card numbers program

this program prints all lc card numbers on any tape in marc code and format. numbers are 12 positions in length and print 300/page in translated form. occasionally, a cumulative list of lc card numbers on the master tape is produced as a reference tool. its input is any tape in marc code and format, its output a printed listing of all library of congress numbers on the tape (figure 5). a record count will appear at the end of the listing. figure 6, the detail flowchart for this program, appears on the following pages.

fig. 5. library of congress number listing (sample lc number listing as of 07/03/69).
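a page of 300 twelve-position numbers can be laid out as in the python sketch below. the layout itself is an assumption: six columns of fifty lines, filled down each column before moving to the next, which is consistent with the 300/page figure but is not spelled out in the text. the names `paginate`, `COLUMNS` and `ROWS` are hypothetical.

```python
from typing import List, Sequence

COLUMNS = 6                 # assumed number of columns per page
ROWS = 50                   # assumed number of print lines per page
PER_PAGE = COLUMNS * ROWS   # 300 numbers per page


def paginate(numbers: Sequence[str]) -> List[List[str]]:
    """Lay out card numbers column by column, 300 to a page."""
    pages: List[List[str]] = []
    for start in range(0, len(numbers), PER_PAGE):
        hold = list(numbers[start:start + PER_PAGE])
        hold += [""] * (PER_PAGE - len(hold))  # pad a short last page
        lines = []
        for row in range(ROWS):
            # Entry for column c on this row sits ROWS positions further on.
            cells = [hold[col * ROWS + row].ljust(12) for col in range(COLUMNS)]
            lines.append("  ".join(cells).rstrip())
        pages.append(lines)
    return pages
```

with this layout a sorted tape reads in ascending order down each column, and a short final page simply leaves its right-hand columns empty.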
fig. 6. print card numbers program detail flowchart.

retrieval sub-system

withdrawing records program

this program withdraws records selected by lc card number and copies the complete marc ii records onto another tape. a library sends the department of libraries a magnetic tape containing the lc card numbers for the records it wants copied from the data base. the data base is searched and the requesting library is sent back three tapes and three hard copies. the tapes are: 1) the original finder tape, 2) an item tape containing the records which matched, and 3) a tape containing the lc card numbers of the records which did not match. the three hard copies are: 1) a list in lc card number order of the records which matched containing on the first line information from the finder tape and on the second line information from the marc tape; 2) a listing of the card numbers and other information on the finder tape which did not match any card number in the data base; 3) a listing of card numbers and other information on the finder tape that were invalid.
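since both the master file and the finder tape are in lc card number sequence, the search can be done in a single pass over each tape. the python sketch below illustrates that idea only; it is not the odl-05 program. the record shape (card number, payload), the function name `withdraw`, and the use of plain string comparison for sequence are assumptions, and validation of finder records is left out.

```python
from typing import Iterable, List, Optional, Tuple

# Hypothetical record shape: (lc_card_number, payload).
MasterRecord = Tuple[str, str]


def withdraw(master: Iterable[MasterRecord],
             finders: Iterable[str]) -> Tuple[List[MasterRecord], List[str]]:
    """Single pass over two sorted inputs: copy matches, list non-matches."""
    matched: List[MasterRecord] = []
    unmatched: List[str] = []
    m_iter = iter(master)
    m: Optional[MasterRecord] = next(m_iter, None)
    for key in finders:
        # Skip master records that sort before the requested number.
        while m is not None and m[0] < key:
            m = next(m_iter, None)
        if m is not None and m[0] == key:
            matched.append(m)       # goes to the item tape
        else:
            unmatched.append(key)   # goes to the unmatched tape and listing
    return matched, unmatched


items, misses = withdraw(
    [("66000001", "r1"), ("66000003", "r3"), ("66000005", "r5")],
    ["66000001", "66000002", "66000005"],
)
```

neither tape is ever rewound, which is why the finder tape must be sorted before it is submitted.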
there are three inputs to the system, the first being a marc master, which is the latest merged master at the department of libraries; its records are in the original code and format. the second consists of finder records, which come from the individual library. input is originally on card in the format specified in table 1, then put on tape, blocked 5, and sorted (no tape labels are used at this time) on all 12 positions of the lc number. the tapes are unlabeled upper-case ebcdic 1600 bpi. the third is a card that enters the appropriate date and library code into the system.

table 1. original card input format to odl-05

card columns   field contents and special instructions
1              local library code (assigned by dept. of libraries)
2-4            lc card number prefix (upper case alpha or blank)
5-12           lc card number (numeric)
13             lc card supplement indicator (may be blank)
14-28          local use (may be blank)
29-48          local use or first 20 positions of author (may be blank)
49-76          local use or first 28 positions of title (may be blank)
77-80          local use or publication date (may be blank)

the system gives the following five outputs:

1) matched records, a listing of records that matched and were transferred to the individual library's item tape. this listing shows all information from the finder record, and immediately below, the following information from the marc record: lc card number, the first 20 characters of the author, the first 28 characters of the title, and the publication date. information pulled is as follows: author (first tag beginning with 1), which will usually be 100 or 110; title, which will always be 245; and date, which will be positions 7-10 under tag 008. figure 7 shows sample output. the first line is data from the finder tape and the second line data from the marc master tape.
fig. 7. matched records listing (sample for library code x, processed 06/15/69; columns: lc number, local use, author, title, date; total matched records: 15).

2) items tape, containing all records requested from the master tape.
they are in marc format and code, and the number of logical records should match the matched record count.

3) unmatched finders listing, showing all valid finder records that did not match the marc master tape. figure 8 shows sample output.

4) unmatched finders tape, containing all valid finder records that did not match the marc master tape.

fig. 8. unmatched records listing (sample for library code x, processed 06/15/69; total unmatched records: 9).

fig. 9. errors listing (sample for library code x, processed 06/15/69; messages shown: invalid lc number, duplicate lc number, invalid library code, invalid lc prefix; total errors: 14).

5) errors listing, showing all 80 columns of invalid finder records and the appropriate error message.
finder records are invalid if one of the following errors occurs: 1) blank or invalid library code; 2) prefix containing any character other than blank or upper-case alpha; 3) lc card number not pure numeric. invalid finder records are not processed but are placed on an error listing. figure 9 shows sample output. no edits will be made on columns 14-80, which are for local use entirely; all data from these fields will be transmitted to printed listings for any desired local use or for verification.

record counts are included at numerous points to facilitate accurate record control. for the purposes of this particular program, counts should check as follows:

matched records + errors + unmatched records - generated errors = original count

the matched records count appears at the end of the listing of the same name, the errors count at the end of the listing of the same name, and the unmatched records count at the end of the listing of the same name. the generated errors count appears at the end of the matched records listing. a generated error indicates more than one error in a single card, and this count is included only for control purposes. the original count is expected to be maintained by the submitting library for maximum accuracy. these counts are checked immediately, and any discrepancies cleared up as soon as possible. figure 10 gives the overall view of the program and figure 11 a detailed flowchart.

the odl-05 program was written to provide the greatest flexibility possible to the user libraries. the only information absolutely required for the finder tape is the local library code and the complete lc card number. however, the remaining 67 card columns are available to the local library for any use it may wish to make of them.
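the three validity checks and the count reconciliation can be sketched in python as follows. this is a hedged illustration, not the odl-05 edit routine: the function names, the message texts, and the set of valid library codes are hypothetical, the column positions follow table 1, and the tally assumes that every error on a card is counted and that each error beyond the first on the same card is a "generated" error, which is one reading of the text.

```python
from typing import Iterable, List, Set, Tuple


def card_errors(card: str, valid_codes: Set[str]) -> List[str]:
    """Return the error messages for one 80-column finder card."""
    card = card.ljust(80)
    code, prefix, number = card[0], card[1:4], card[4:12]  # columns 1, 2-4, 5-12
    errors: List[str] = []
    if code == " " or code not in valid_codes:
        errors.append("invalid library code")
    if any(not (c == " " or "A" <= c <= "Z") for c in prefix):
        errors.append("invalid lc prefix")
    if not (number.isascii() and number.isdigit()):
        errors.append("invalid lc number")
    return errors  # columns 14-80 are local use and are never edited


def tally(cards: Iterable[str], valid_codes: Set[str]) -> Tuple[int, int, int]:
    """Return (valid cards, error count, generated-error count)."""
    valid = errors = generated = 0
    for card in cards:
        found = card_errors(card, valid_codes)
        if found:
            errors += len(found)
            generated += len(found) - 1  # extra errors on the same card
        else:
            valid += 1
    return valid, errors, generated


def counts_balance(matched: int, errors: int, unmatched: int,
                   generated: int, original: int) -> bool:
    """Check the control total given in the text."""
    return matched + errors + unmatched - generated == original
```

subtracting the generated errors keeps a card with two faults from being counted twice against the submitting library's original count.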
if the local library would like a quick method of sight checking to make sure that the records copied were the records wanted, it can keypunch the first 20 characters of the author in columns 29-48, the first 28 characters of the title in columns 49-76, and the date of publication in columns 77-80. if this is done, the matched records listing will contain the author, title and date from the finder tape, immediately followed underneath in the same position on the page by the corresponding information from the marc record. figure 7 shows sample output. thus, the library can quickly sight check what it thought it was getting at the time of request with what it actually got from the marc record. of course, the local library is free to put no information, or other information, in columns 29-80; the operation of the system will not be affected and whatever information is included in columns 29-80 will appear on the three output listings (matched records, unmatched finders and errors).

fig. 10. withdrawing records program system flowchart (phase 1: edit, pull matches and list; phase 2: print errors; phase 3: print unmatched listing).

fig. 11. withdrawing records program detail flowchart.
another convenience for the local library is that it has to do no original programming to use the system. all that is needed are standard sort, merge and card-to-tape programs. any of the programs written by the department of libraries is available to users on demand. they may find the merge or lc card number print programs useful.

another consideration for the user is the ease with which invalid finder records and unmatched finder records can be resubmitted into the system. to correct finder records in error, the library simply repunches cards from the error listing, with necessary corrections, and resubmits them in the next cycle with new cards. unmatched finder records can be merged with any new finder records in the next cycle and resubmitted, no repunching being necessary.

what is presently being done

the variety of applications for marc presently being worked on in oklahoma libraries is most interesting. central state college, edmond, oklahoma, is currently subscribing to the weekly marc tapes and producing an index of available materials which cumulates for two months and then drops off the older entries. the library is receiving its own subscription to the marc tapes for this purpose but does not plan to maintain a complete file of marc records. the tulsa city-county library system, tulsa, oklahoma, is currently using marc records from the state data base for bibliographic information for its machine produced book catalog. it originally had a subscription to the marc tape service, but with the operation of the state-wide data base, is dropping it.
the university of oklahoma, oklahoma state university, and oklahoma county libraries have no immediate plans for utilization of the marc records as distributed by the library of congress; however, when they do move in this area it will probably be for use in their technical processing departments and the state marc data base will form a basis for their use. computer and language used the computer being used for the department of libraries marc program is an ibm 360/ 30 located in the state budget bureau but under the administrative control and operation of the information and management services division of the board of affairs (the centralized state computer center for the capitol complex) . the computer has 32k core size, one on-line card read/ punch, model 2540, four magnetic tape drives, model 2415, two magnetic disk drives, model 2311, and one on-line printer model 1404. the programs are written in cobol for the 360/ 30, operating under dos, with a cobol compiler. very little modification would be required to operate under os. the merge program ( odl-01) requires three tape drives. the withdrawing program ( odl-05) requires four tape drives but could be modified to operate with only three tape drives. in agreement with henriette avram and julius droz ( 4), the department of libraries has found that cobol can easily be used to process marc records. the information and management services division has assigned a programmer to the department of libraries who has done, and will do, all the marc programming. she is actually employed by the imsd and the department of libraries contracts with them for her services. presently, the department is being charged about $7.00 an hour for programming time. the planning, system design, actual programming, and production are all closely supervised by the data processing coordinator of the library, and he is on the department of libraries' staff. 
the relationship between the imsd and the odl has been extremely beneficial for the library. thus far, the centralized computer center has provided fast and excellent service at a minimum cost. having a fulltime data processing coordinator on the staff of the library has negated the communication barrier which so often exists between a computer service center and a user library.

cost

cost figures for use of marc are very difficult to find. few of the marc i participants (3) give anything but a fleeting reference to cost. the reason is clear: cost figures are difficult to determine and even more difficult to evaluate meaningfully. table 2 is a breakdown of the charges to the department of libraries for programming and machine time; it does not include department of libraries' staff time or overhead costs. the figures are accurate through the end of february 1970.

table 2. costs

system design                                              $1,102.00
programming                                                 2,467.00
machine cost for program testing, debugging and machine
  and operator cost for merging through 2/28/70             2,026.00
total                                                      $5,595.00

for the first year, the department of libraries is absorbing all the costs of merging and maintaining the marc master file, as well as the costs of all programming, as a form of state aid to libraries. the machine costs of comparing a finder tape with the master file, copying the desired records, and printing the various hard-copy lists, is being absorbed by the user library. the user also supplies the two blank tapes which are needed for each run. the machine time costs are based on the rate of $80.00 an hour of cpu time.

plans for the future of the state-wide marc master file

two major problems are apparent in the system as it is now set up.
the system was initially created as a sequential tape system because this was the easiest and quickest way to establish a working system, and because it was felt that this would be practical for at least the first year of operation. one problem is that the sequential file will become expensive to maintain and does not allow direct access to a particular record without a sequential search. another problem is that the present system allows entry into the file only by lc card number and does not allow entry directly with bibliographic information.

in accordance with present plans, in march 1970 work will begin on converting the storage medium from tape to a direct access device (disk or data cell) as the recon study suggests (5). at that time the file will cease to be maintained in lc card number order and will be maintained in the order in which the records are received from the library of congress. various indices to the marc data base will be produced; author and title indices will enable the data base to be searched by bibliographic information when the lc card number is not known. in this way, only the indices (which would be comparatively much smaller), and not the complete data base, would have to be merged and searched. in terms of the data base itself, this will be the next major change.

in the long run, it will be desirable for libraries that want access to the marc data base to have such access directly via terminals. at the present time, the cost of this kind of access is not worth the increased speed of access, nor is the money presently available; however, in the future, the cost of such a system will surely be reduced by technological improvements and increased importance of instantaneous access to the data base. when need balances with cost, such a set-up will be feasible.

the geographical expansion of the system is a possibility.
economically, this is most desirable, because the more ways the cost of maintaining the data base is split, the cheaper it is for all involved. some preliminary investigation along these lines with bordering states is being made and hopefully at some time in the future there will be a regional data base which many libraries can use.

plans for future cooperative use of marc

the cooperative use of marc thus far in oklahoma only affects the larger libraries which have access to computers and automation personnel. essentially, each library is autonomous and is free to use marc in any manner it wishes. it will remain true in oklahoma that individual libraries will always be free to use the data base to retrieve part or all of the data base for any purpose. however, plans are under way for more cooperative use of marc with libraries that do not have automation capabilities that would result in useful hard copy products for such libraries. two such cooperative plans have been proposed for immediate implementation.

the first of these is a current awareness service. selected subjects would be compared against the data base on a bi-weekly (or other period) basis and complete bibliographic information for books representing the selected subjects would be printed as a personalized current awareness service. for example, all law titles on the marc tapes for two weeks could be pulled and listed, and the listing distributed to the county and state law offices, attorney firms, the law school library, etc., for selection and order purposes. the same could be done with library science or any other subject. subject lists of interest to various agencies of state government could be produced and sent to them.
another possibility is a profile of a legislative session by subject and then weekly or monthly lists of current materials available on these subjects for ordering by the department of libraries and possible lists to be made available to the legislative members. there are many possible uses for such a system which could be done fairly inexpensively. work began on this project in october 1969, and the service became operational on a cost basis in february 1970. a second possibility is catalog card and processing aids production. this would probably be done as a pilot project with several libraries throughout the state and then, if successful, expanded to any library in the state wanting to use the service. catalog card sets with subject headings printed at the top, and call numbers printed if the library accepts lc or lc dewey classification (there would be several options available within the system), spine labels, and book and circulation card labels would be provided. a by-product of such a state-wide operation would be the maintenance of book location information in machine readable form in a central place for future use as a basis for a machine readable state-wide union catalog. a project not in the immediate future but certainly being considered is that of cooperative retrospective conversion. that is, several libraries in the state would like to have bibliographic information in marc format for all books in their collections. whether the department of libraries would go ahead with such an ambitious project or wait for it to be done nationally ( recon study) would depend on timeliness on the national scene, need on the local scene, and available financial resources. eventually, oklahoma would like to have in machine readable form a complete union catalog of the entire library resources of the state that could be used for cooperative acquisitions programs, for strengthening subjects which are weak within the state, and as a location tool for interlibrary loan. 
such a data base would later be used also for reference functions. needless to say, such an ambitious project as this is not in the immediate future.

conclusion

early in the game, oklahoma libraries learned that the most economical means to library automation was cooperative automation. the creation of a state-wide marc data base is an important step toward cooperative library automation, while still allowing each local library to maintain its individuality for uses of the data. many areas of cooperation still remain untouched. the future success of library automation in oklahoma lies in the imaginative and creative projects that could be designed and implemented cooperatively to the mutual cost savings and benefit of all.

programs

copies of the programs mentioned in this paper may be obtained from national auxiliary publications service of asis as follows: 1) "a program to merge all marc ii tapes received from the library of congress onto a single tape" (naps 00815); 2) "a program to drop given records or to transfer them to a separate tape" (naps 00816); 3) "a program to print marc tapes in readable form" (naps 00817); 4) "a program to pull selected records from the marc master tape for a single library" (naps 00818); and 5) "a program to print a listing of all library of congress card numbers on a given marc tape" (naps 00819).

references

1. nugent, william r.: "nelinet: the new england library information network." a paper presented at the international federation for information processing, ifip congress 68, edinburgh, scotland, august 6, 1968. (cambridge, mass.: inforonics, inc., 1968), 4 pp.
2. pulsifer, josephine s.: "washington state library." in avram, henriette d.: the marc pilot project; final report on a project sponsored by the council on library resources, inc. (washington: library of congress, 1968), pp. 149-165.
3.
avram, henriette d.: the marc pilot project; final report on a project sponsored by the council on library resources, inc. (washington: library of congress, 1968), pp. 89-183.
4. avram, henriette d.; droz, julius r.: "marc ii and cobol," journal of library automation, 1 (december 1968), 261-72.
5. recon working task force: conversion of retrospective catalog records to machine-readable form; a study of the feasibility of a national bibliographic service. (washington, d. c.: library of congress, 1969).

search across different media | buckland, chen, gey, and larson 181

digital technology encourages the hope of searching across and between different media forms (text, sound, image, numeric data). topic searches are described in two different media, text files and socioeconomic numeric databases, as is transverse searching, whereby retrieved text is used to find topically related numeric data and vice versa. direct transverse searching across different media is impossible. descriptive metadata provide enabling infrastructure, but usually require mappings between different vocabularies and a search-term recommender system. statistical association techniques and natural-language processing can help. searches in socioeconomic numeric databases ordinarily require that place and time be specified.

a hope for libraries is that new technology will support searching across an increasing range of resources in a growing digital landscape. the rise of the internet provides a technological basis for shared access to a very wide range of resources. the reality is that network-accessible resources, like the contents of a well-stocked reference library, are quite heterogeneous, especially in the variety of indexing, classification, categorization, and other forms of metadata.
however, the use of digital technology implies a degree of technical compatibility between different media, sometimes referred to as “media convergence,” and these developments encourage the prospect of being able to search across and between different media forms—notably text, images, sound, and numeric data sets—for different kinds of material relating to the same topic. to examine the practical problems involved, the authors undertook to demonstrate searching between and across two different media forms: text files and socioeconomic numeric data sets.1 two kinds of search are needed. first, it should be possible to do a topical search in multiple media resources, so that one can find, for example, both pertinent factual numeric data and relevant discussion. (one difficulty is that the vocabulary used to classify the numeric data is ordinarily quite different from the subject headings used for books, magazine articles, and newspaper stories about the same topic.) second, when intriguing data values are encountered, one would like to move directly to topically relevant texts. likewise, when a questionable statement is read, one would like to be able to find relevant statistical evidence. therefore, there needs to be search support that facilitates such transverse searching among resources, establishing connections, transferring data, and invoking appropriate utilities in a helpful way. both problems were addressed through the design and demonstration of a gateway providing search support for both text and socioeconomic numeric databases. first, the gateway should help users conduct searches in databases of different media forms by accepting a query in the searcher’s own terms and then suggesting the specialized categorization terms to search for in the selected resource. second, if something interesting was found in a socioeconomic database, the gateway would help the searcher to find documents on the same topic in a text database, and vice versa. 
selection of the best search terms in target databases is supported by the use of indexes to the categories (entries, headings, class numbers) in the system to be searched. these search-term recommender systems (also known as “entry vocabulary indexes”) resemble dewey’s “relativ index,” but are created using statistical association techniques.2

four characteristics of this investigation need to be noted:

1. searching independent sources: the authors were not concerned with ingesting resources from different sources into a consolidated local data repository and searching within it. the interest lay, instead, in being able to search effectively in any accessible resource as and when one wants. this implies that interoperability issues in dealing with the native query languages and metadata vocabularies of remote repositories can be solved.
2. search for independent content: numeric data sets commonly have associated text in the form of documentation, code books, and commentary. however, the authors were interested in finding topical content that had no such formal or literary connection. independent means, for example, a newspaper article written by someone unaware that relevant statistical data existed or had been written before the author’s article existed. in the other direction, having found statistical data of interest, could topically related text created independently of this particular data point be found?
3. two different media forms were chosen: text and numeric data sets. they look similar because they both use arabic numerals, but the traditional reliance on information retrieval in a text environment of using any character string from the corpus as a query, although technically feasible, cannot be expected to be useful here. one can copy a number expressing quantity, such as 12,941, from a numeric data cell, use it as a query in a text search engine such as google, and retrieve a large and eclectic retrieved set, usually involving “12941” as an identifying number for a postal code, a memorandum, a part number, software bug report, and so on, but the relationship is spurious. it requires great faith in numerology to expect anything topically meaningful to the original data cell one started with. with other combinations of media forms, not even spurious results are feasible: one cannot submit a musical fragment or some pixels from an image as a text query.
4. the authors’ interest was in how to achieve a better return on existing investments in well-formed, edited resources with descriptive metadata. this project built directly on prior work on how to make more effective use of existing, expertly developed metadata, rather than creating or replacing metadata.

search across different media: numeric data sets and text files
michael buckland, aitao chen, fredric c. gey, and ray r. larson
michael buckland (buckland@sims.berkeley.edu) is emeritus professor, school of information, university of california, berkeley; aitao chen (aitao@yahoo-inc.com) is a researcher at yahoo!, sunnyvale, california; fredric c. gey (gey@berkeley.edu) is an information scientist, uc data archive and technical assistance at the university of california, berkeley; and ray r. larson (ray@sims.berkeley.edu) is a professor, school of information at the university of california, berkeley.
information technology and libraries | december 2006

search of multiple resources comes in two forms:

1. parallel search is when a single query is sent to two or more resources at more or less the same time. for example, a researcher interested in the import of shrimp would like to see pertinent newspaper articles and trade statistics. thus, one might send a query to the census bureau’s united states (u.s.) imports and exports numeric data series and look at sic 0913 for shrimp and prawn and note a dramatic increase in imports from vietnam through los angeles from 1995 onwards.
one would also search newspaper indexes for articles such as “normalizing ties to vietnam important steps for u.s. firms; california stands to profit handsomely when barriers fall to trade with fast-growing country.”3 different sources are likely to use different index terms or categories, so the challenge is how to express the searcher’s query in terms that will be effective for searching in the target resources, which, most likely, will use different vocabularies. as one example, the term for “automobiles” is 3711 in the standard industrial classification; tl 205 in the library of congress (lc) classification; 180/280 in the u.s. patent classification; and, in the census bureau’s u.s. imports and exports data series, pass mot veh, spark ign eng.4

2. transverse search is when an item of interest found in one resource is used as the basis for a query to be forwarded to a different resource. the challenge here, again, is that when a query using the topical metadata in one resource needs to be expressed in the vocabulary of the target resource, the metadata vocabularies in the two resources will usually be different from each other, and, quite likely, both are unfamiliar to the searcher.

when searching within a single media form, it may be possible to use content itself directly as a query: a fragment of text in a source-text database is commonly used as a query in a target-text database. similarly, one might start with an image and seek images that are measurably similar. however, because such direct search cannot be done when searching across different media forms, an indirect approach relying on the use of interpretive representations becomes necessary. as the network environment expands, mapping between vocabularies will be increasingly important.
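The vocabulary problem behind both kinds of search can be made concrete with a small Python sketch. The four codes for “automobiles” are the ones quoted above; the table layout, resource labels, and function names are assumptions introduced purely for illustration, and a working gateway would derive such mappings statistically rather than from a hand-built table.

```python
from typing import Dict, Optional

# Illustrative only: the four codes for "automobiles" are quoted from the text;
# the table layout and every function name here are hypothetical.
VOCABULARY_MAP: Dict[str, Dict[str, str]] = {
    "automobiles": {
        "standard industrial classification": "3711",
        "library of congress classification": "tl 205",
        "u.s. patent classification": "180/280",
        "u.s. imports and exports": "pass mot veh, spark ign eng",
    },
}


def translate(term: str, resource: str) -> Optional[str]:
    """Express the searcher's term in the vocabulary of one target resource."""
    return VOCABULARY_MAP.get(term.lower(), {}).get(resource)


def parallel_queries(term: str) -> Dict[str, str]:
    """Parallel search: one searcher term becomes one query per resource."""
    return dict(VOCABULARY_MAP.get(term.lower(), {}))


def transverse_query(code: str, source: str, target: str) -> Optional[str]:
    """Transverse search: carry a category found in one resource to another
    by pivoting through the topic that both categories describe."""
    for codes in VOCABULARY_MAP.values():
        if codes.get(source) == code:
            return codes.get(target)
    return None


if __name__ == "__main__":
    print(parallel_queries("automobiles"))
    print(transverse_query("3711",
                           "standard industrial classification",
                           "library of congress classification"))
```

The sketch shows why direct reuse of a category fails: the string 3711 means nothing to a catalog organized by lc classification until it has been mapped through a shared topic.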
■ text and numeric resources

text resource

a library catalog (a special case of text file) was chosen for use as a text file rather than a corpus of “full text.” the reasons were practical: in this exploratory investigation, it was important to start with resources that had rich metadata; it needed to be a resource that was sufficiently controllable to enable experimentation with it. a library catalog was in the spirit of the project in that it would lead to additional text resources; and a suitable resource was available, which was intended for metadata mapping: a set of several million marc records, derived from melvyl, the university of california online library catalog.

socioeconomic numeric data set

initially, and in prior work, the authors had worked on access to u.s. federal data series, especially import and export statistics and county business reports. although some progress was made with interfaces to these data series, it became clear that the investment needed to craft interoperable access was high relative to the available staff. crafting access to individual data series did not appear to be a scalable way to demonstrate variety within the authors’ limited resources, so attention was turned to a single collection comprising many diverse numeric tables, the counting california database.5

■ mapping topical metadata

well-edited, high-quality databases typically have topical metadata expertly assigned from a vocabulary (thesaurus, classification, subject-heading system, or set of categories). but there is a babel of different vocabularies. not only do the names of topics vary, but the underlying concepts or categories may also differ.
effective searching requires expert familiarity with a system’s vocabulary; but as access to digital resources expands, the diversity of vocabularies increases and accessible resources are decreasingly likely to use vocabularies familiar to any individual searcher. the best answer is twofold: first, it is desirable to have an index (a “mapping”) from the natural language of each group of searchers to the entries used in each metadata vocabulary. such a mapping provides an index from a vocabulary familiar to the searcher to the vocabulary used in entries of the target system and so is called a search-term recommender system. (the authors called it an “entry-vocabulary index,” or evi.) dewey’s “relativ index” to his decimal classification is a familiar example. when searching across databases, one also wants a second kind of mapping: between pairs of system vocabularies. unfortunately, mappings between different vocabularies are rare, expensive, time-consuming, and hard to maintain. (the unified medical language system is a notable example.)6 it is the authors’ impression that this problem is worse in searching across different media forms because data bases in different media forms tend to be created by different communities, increasing the chances that they will use different categories, vocabularies, and ways of thinking. fortunately where data containing two forms of vocabulary are available, they can be used as training sets for statistical-association techniques to generate evis automatically, and this is the approach that was used. (more details can be found in the appendix.) from text words to library subject headings an evi from ordinary english words to library of congress subject headings (lcsh) was created by taking catalog records containing at least one subject heading (6xx field in the marc bibliographic format). 
from each of the 4,246,510 records used, main subject headings were extracted (subfield a from fields 600, 610, 611, 630, 650, and 651), along with the fields containing text: titles (245a), subtitles (245b), and summaries describing the scope and general content of the material (520a). the underlying assumption is that for each record, the words in the “text” fields (245a,b and 520a) tend to be characteristic of discourse on the subject (6xxa). two examples, with identifying lccns in the <001> field are:

<001>73180254 //r86</001>
<245><a>a study of operant conditioning under delayed reinforcement in early infancy</a></245>
<650><a>infant psychology</a></650>
<650><a>operant conditioning</a></650>

<001>73180255 </001>
<245><a>reptilian disease</a><b>recognition and treatment</b></245>
<650><a>reptiles</a><x>diseases</x></650>

the words in the text fields (245a, 245b, and 520a) were extracted. stop words were removed and the remainder normalized. then the degree to which each word is associated with each subject heading (by co-occurring in the same records) was computed using a maximum likelihood ratio-based measure. natural-language processing can be used to identify adjective-noun phrases to support more precise searching using phrases as well as individual words. a very large matrix shows the association of each text word (or phrase) with each subject heading; so, for any given word (or combination of words), a list of the most closely associated headings, ranked by degree of association, can be derived from the matrix.

queries

a query, which can be a single word, a phrase, a set of keywords, a book title, and so on, is normalized in the same way and looked up in the matrix to produce a ranked list of the most closely associated subject headings as candidate lcsh search terms. for example, entering the textual query words “peanut” and “butter” generates the following ranked list of lcsh main headings as candidates for searching:

rank lcsh (subfield 650a)
1. peanut
2. cookery (peanut butter)
3. cookery (peanuts)
4. peanut industry
5. peanut butter
6. butter
7. schulz, charles m.

this display is an important departure from traditional fully automatic searching. the list is, in effect, a prompt, indicating probably suitable query terms in the vocabulary of the target resource. it introduces the searcher to the categories and terminology of the system and enables the searcher to use expert judgment to select the heading that seems best for the search.

from text words to the metadata vocabularies in numeric data sets

a training set of records containing both descriptive words and topical metadata is often not readily available for numeric data sets. the authors’ first effort was to create an evi to the standard industrial classification (sic), widely used over many years in numeric data sets. (sic codes were associated with words by using, as a training set, the titles in a bibliographic database that used sic codes.) but by the time the sic evi was completed, sic had been discontinued and replaced by the north american industry classification system (naics), so a mapping was created from sic codes to naics codes. figures 1–3 show stages in an interface that accepts a searcher’s query “car” (figure 1), prompts with a ranked list of naics codes (figure 2), then extends the search with the selected naics code to retrieve numeric data (figure 3).

by this time, however, it had become apparent that, with the current low level of interoperability in software and in data formats, the labor required to create evis and interfaces to each large traditional numeric data series was enormous. therefore, attention was turned to a collection of different numeric data sets available through a single interface, counting california, made available by california digital library at http://countingcalifornia.cdlib.org.
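The construction of an entry-vocabulary index from text words to subject headings, as described earlier in this section, can be sketched in Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the text says only that a maximum likelihood ratio-based measure was used, so a Dunning-style log-likelihood ratio over a 2x2 co-occurrence table is swapped in here as one plausible instance. The stop list, the normalization, the summing of scores across query words, and the last two training records are all invented for illustration; the first two records paraphrase the marc examples shown earlier.

```python
import math
from collections import Counter, defaultdict

STOP_WORDS = {"a", "of", "in", "and", "the", "under"}  # tiny illustrative stop list


def tokens(text):
    """Assumed normalization: lowercase, strip punctuation, drop stop words."""
    words = (w.strip(".,;:()\"'").lower() for w in text.split())
    return {w for w in words if w and w not in STOP_WORDS}


def _xlogx(x):
    return x * math.log(x) if x > 0 else 0.0


def llr(k11, k12, k21, k22):
    """Log-likelihood ratio (G^2) for a 2x2 table of record counts."""
    n = k11 + k12 + k21 + k22
    rows = _xlogx(k11 + k12) + _xlogx(k21 + k22)
    cols = _xlogx(k11 + k21) + _xlogx(k12 + k22)
    cells = _xlogx(k11) + _xlogx(k12) + _xlogx(k21) + _xlogx(k22)
    return max(0.0, 2.0 * (cells + _xlogx(n) - rows - cols))


def build_evi(records):
    """records: list of (text, [headings]); returns word -> {heading: score}."""
    n = len(records)
    word_df, head_df, pair_df = Counter(), Counter(), Counter()
    for text, headings in records:
        ws, hs = tokens(text), set(headings)
        word_df.update(ws)
        head_df.update(hs)
        pair_df.update((w, h) for w in ws for h in hs)
    evi = defaultdict(dict)
    for (w, h), k11 in pair_df.items():
        k12 = word_df[w] - k11            # records with the word, not the heading
        k21 = head_df[h] - k11            # records with the heading, not the word
        k22 = n - k11 - k12 - k21         # records with neither
        if k11 * n > word_df[w] * head_df[h]:   # keep positive associations only
            evi[w][h] = llr(k11, k12, k21, k22)
    return evi


def recommend(evi, query, top=7):
    """Rank headings by summed association with the query words (an assumption)."""
    scores = Counter()
    for w in tokens(query):
        for h, s in evi.get(w, {}).items():
            scores[h] += s
    return [h for h, _ in scores.most_common(top)]


RECORDS = [
    ("a study of operant conditioning under delayed reinforcement in early infancy",
     ["infant psychology", "operant conditioning"]),
    ("reptilian disease recognition and treatment", ["reptiles"]),
    ("operant conditioning of animal behavior", ["operant conditioning"]),  # invented
    ("diseases of pet reptiles", ["reptiles"]),                             # invented
]

if __name__ == "__main__":
    print(recommend(build_evi(RECORDS), "operant conditioning"))
```

With a training set this small the scores are not meaningful in themselves; the point is the shape of the computation: count co-occurrences per record, score each word-heading pair, then rank headings for a normalized query.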
counting california is a collection of some three thousand numeric tables containing statistics related to a range of topics. the numeric data sets are mainly from the california department of health services, the california department of finance, and the federal bureau of the census. the tables are organized under a two-level classification scheme. there are sixteen topics at the top level, which are subdivided into a total of 184 subtopics. all the numeric tables were assigned to one or more subtopics and each table has a caption. at the counting california web site, a searcher can browse for tables by selecting a higher-level topic, then a lower-level subtopic, and then a table. two additional ways were created to access the tables: probabilistic retrieval, and an evi to the topical categories. the captions, topics, and subtopics were extracted for each of the three thousand tables, and xml records were created in the following form:

<table>
<topic> education </topic>
<subtopic> libraries </subtopic>
<caption> library statistics, statewide summary by type of library california 1992–93 to 1997–98 </caption>
</table>

[figure 1. query interface for search-term recommender system for the north american industry classification system]
[figure 2. display of naics code search-term recommendations for “car”]
[figure 3. display of numeric data retrieved using selected naics code]

retrieval

two search methods were used:

direct probabilistic retrieval. an in-house implementation was used of a probabilistic full-text retrieval algorithm developed at berkeley.7 this search engine takes a free-form text query and returns a ranked list of captions of tables ranked according to their relevance scores. for example, the five top-ranked captions returned to the query “public libraries in california” were:

1. library statistics, statewide summary by type of library california, 1992–93 to 1997–98 table f6.
2. library statistics, statewide summary by type of library california, 1993–94 to 1998–99 table f6yr0-0.
3. number of california libraries, 1989 to 1999 table f5yr00
4. number of california libraries, 1989 to 1998, as of september table f5.
5. california public schools, grades k–12, 1989 to 1998 table f4.

each entry in the retrieved set list is linked to a numeric table maintained at the counting california web site and, by clicking on the appropriate link, a user can display the table as an ms excel file or as a pdf file.

mediated search. from the same extracted records the words in the captions were used to create an evi to the subtopics in the topic classification using the method already described. as an example, the query “personal individual income tax,” when submitted to the evi, generated the following ranked list of subtopics:

1. income
2. government earnings and tax revenues
3. personal income
4. property tax
5. personal income tax
6. corporate income tax
7. per capita income

a user can click on any selected subtopic to retrieve the captions of tables assigned that subtopic. for example, clicking on the fifth subtopic, personal income tax, retrieves:

■ personal income tax returns: number and amount of adjusted gross income reported by adjusted gross income class california, 1998 taxable year. table d10yr00
■ personal income tax returns: number and amount of adjusted gross income reported by adjusted gross income class california, 1997 taxable year. table d9
■ personal income statistics by county, california 1997 taxable year. table d10
■ personal income statistics by county, california 1998 taxable year. table d11yr00

■ transverse searching between text and numeric-data series

to demonstrate the searching capability from a bibliographic record to numeric-data sets, the first step is to retrieve and display a bibliographic record from an online catalog.
a web-based interface for searching online catalogs was implemented using an in-house implementation of the z39.50 protocol. besides the z39.50 protocol, an important component that makes searching remote online catalogs feasible is the gateway between http (hypertext transfer protocol) and z39.50. while http is a connectionless protocol, z39.50 is a connection-oriented protocol. the gateway maintains connections to remote z39.50 servers. all search requests to any remote z39.50 server go through the gateway.

searching from catalog records to numeric data sets

having selected some text (for the purposes of this study, a catalog record), how could one identify the facts or statistics in a numeric database that are most closely related to the topic? clicking on a “formulate query” button placed at the end of a displayed full marc record creates a query for searching a numeric database. the initial query contains the words extracted from the title, subtitle, and subject headings and is placed in a new window where the user can modify or expand the query before submitting it to the search engine for a numeric database. so, for example, the following text extracted from a catalog record:

library laws of the state of california, library legislation. california. public libraries

when submitted as a query, retrieves a ranked list of table names, of which two, covering different time periods, are entitled library statistics, statewide summary by type of library, california.

searching from numeric data sets to catalog records

transverse search in the other direction, starting from a data table, is achieved by forwarding the caption of a table to the word-to-lcsh evi to generate a prompt list of the seven top-ranked lcshs, any one of which can be used as a query submitted to the catalog.

■ architecture

figure 4 shows the structure of the implementation. the boxes shown in the figure are:

1. a search interface for accessing bibliographic/textual resources through a word-to-lcsh evi.
2. a word-to-lcsh evi.
3. a ranked list of lcshs closely associated with the query.
4. an online catalog.
5. results of searching the online catalog using an lcsh.
6. a full marc record displayed in tagged form.
7. a new query formed by extracting the title and subject fields from the displayed full marc record.
8. a numeric database.
9. a list of captions of numeric tables ranked by relevance score to the query.
10. numeric table displayed in pdf or ms excel format.
11. a search interface for numeric databases based on a probabilistic search algorithm.

a user can start a search using either interface (boxes 1 or 11) and, from either starting point, find records on the same topic of interest in a textual (here bibliographic) database and a socioeconomic database.

■ conclusions and further work

enhanced access to numeric data sets

the descriptive texts associated with numeric tables, such as the caption, headers, or row labels, are usually very short. they provide a rather limited basis for locating the table in response to queries, or for describing a data cell sufficiently to form a usefully descriptive query from it. sometimes the title (caption) of a table may be the only searchable textual description about the content of the table, and the titles are sometimes very general. for example, one of the titles, library statistics, statewide summary by type of library california, 1992–93 to 1997–98, is so general that neither the kinds of statistics nor the types of libraries are revealed.
if a user posed the question, “what are the total operating expenditures of public libraries in california?” to a query system that indexes table titles only, the search may well be ineffective since the only word in common between the table title and the user’s query is “california” and, if the plurals of nouns have been normalized, to the singular form, “library.” table column headings and row headings provide additional information about the content of a numeric table. however, the column and row headings are usually not directly searchable. for example, a table named “language spoken at home” in counting california databases consists of rows and columns. the column headings list the languages spoken at home, while the row headings show the county names in california. each cell in the table gives the number of people, five years of age and older, who speak a specific language at home. to answer questions such as “how many people speak spanish at home in alameda county, california?” using the table title alone may not retrieve the table that contains the answer to the example question. it is recommended that the textual descriptions of numeric tables be enriched. automatically combining the table title and its column and row headings would be a small but practical step toward improved retrieval. geographic search socioeconomic numeric data series refer to particular areas and, in contrast to text searching, the geographical aspect ordinarily has to be specified. to match the geographical area of the numeric data, a matching text search may also have to specify the same place. the authors found that this was hard to achieve for several reasons. place names are ambiguous and unstable: a search for data relating to trinidad might lead to trinidad, west indies, instead of trinidad, california, for example. the problem is compounded because, in numeric data series, specialized geopolitical divisions, such as census tracts and counties, are commonly used. 
these divisions do not match conveniently with searchers’ ordinary use of place names. also, the granularity of geographical coverage may not match well. data relating to berkeley, for example, may be available only in aggregated data for alameda county. it was eventually concluded that reliance on the names of places could never work satisfactorily. the only effective path to reliable access to data relating to places would be to use geospatial coordinates (latitude and longitude) to establish unambiguously the identity and location of any place and the relationship between places. this means that gazetteers and map visualizations become important. gazetteers relate named places to defined spaces, and thereby reveal spatial relationships between places, e.g., the city of alameda is on alameda island within alameda county. this problem has been addressed in a subsequent study entitled “going places in the catalog: improved geographical access.”8

figure 4. architecture of the prototype

temporal search

searches of text files and of socioeconomic numeric data series also differ substantially with respect to time periods: numeric data searches ordinarily require the years of interest to be specified; text searches rarely specify the period. an additional difficulty arises because in text, as in speech, a period is commonly referred to by a name derived metaphorically from events used as temporal markers, rather than by calendar time, as in “during vietnam,” “under clinton,” or “in the reign of henry viii.” named time periods have some of the characteristics of place names: they are culturally based and tend to be multiple, unstable, and ambiguous. it appears that an analogous solution is indicated: directories of named time periods mapped to calendar definitions, much as a gazetteer links place names to spatial locators.
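a minimal sketch of such a time period directory follows. it is only an illustration of the idea, not the design of the later study: the names `PeriodEntry`, `PERIOD_DIRECTORY`, `lookup_periods`, and `years_for` are hypothetical, the three entries are illustrative, and the dating of the vietnam entry is just one of several datings in use, which is exactly the ambiguity described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeriodEntry:
    """one named period mapped to a calendar definition (hypothetical schema)."""
    name: str
    start_year: int
    end_year: int
    scope: str  # qualifier to tell apart same-named or culturally specific periods


# illustrative entries only; a real directory would be curated and far larger
PERIOD_DIRECTORY = [
    PeriodEntry("reign of henry viii", 1509, 1547, "england"),
    PeriodEntry("clinton", 1993, 2001, "u.s. presidency"),
    PeriodEntry("vietnam", 1955, 1975, "vietnam war, one common dating"),
]


def lookup_periods(text):
    """return every directory entry whose name occurs in the query text."""
    lowered = text.lower()
    return [entry for entry in PERIOD_DIRECTORY if entry.name in lowered]


def years_for(text):
    """expand named periods into the year ranges a numeric-data search needs."""
    return [(e.start_year, e.end_year) for e in lookup_periods(text)]
```

a text query such as “unemployment under clinton” could then be passed through `years_for` to obtain the year range to send to the numeric database, in the same way a gazetteer would supply coordinates for a place name.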
this problem is being addressed in a subsequent study entitled “support for the learner: what, where, when, and who.”9 media forms the paradox, in an environment of digital “media convergence,” that it appears impossible to search directly across different media forms invites closer attention to concepts and terminology associated with media. a view that fits and explains the phenomena as the authors understand them, distinguishes three aspects of media: ■ cultural codes: all forms of expression depend on some shared understandings, on language in a broad sense. convergence here means cultural convergence or interpretation. ■ media types: different types of expression have evolved: texts, images, numbers, diagrams, art. an initial classification can well start with the five senses of sight, smell, hearing, taste, and feel. ■ physical media: paper; film; analog magnetic tape; bits; . . . being digital affects directly only this aspect. anything perceived as a meaningful document has cultural, type, and physical aspects, and genre usefully denotes specific combinations of code, type, and physical medium adopted by social convention. genres are historically and culturally situated. convergence can be understood in terms of interoperability and is clearly seen in physical media technology. the adoption of english as a language for international use in an increasingly global community promotes convergence in cultural codes. nevertheless, the different media types are fundamentally distinct. metadata as infrastructure it is the metadata and, in a very broad sense, “bibliographic” tools that provide the infrastructure necessary for searches across and between different media—thesauruses, mappings between vocabularies, place-name gazetteers, and the like. in isolation, metadata is properly regarded as description attached to documents, but this is too narrow a view. 
collectively, the metadata forms the infrastructure through which different documents can be related to each other. it is a variation on the role of citations: individually, references amplify an individual document by validating statements made within it; collectively, as a citation index, references show the structure of scholarship to which documents are attached.

■ summary

a project was undertaken to demonstrate simultaneous search of two different media types (socioeconomic numeric data series and text files) without ingesting these diverse resources into a shared environment. the project objective was eventually achieved, but proved harder than expected for the following reasons: access to these different media types has been developed by different communities with different practices; the systems (vocabularies) for topical categorization vary greatly and need interpretative mappings (also known as relative indexes, search-term recommender systems, and evis); and specification of geographical area and time period is necessary for search in socioeconomic data series and, for this, existing procedures for searching text files are inadequate.

■ acknowledgement

this work was partially supported by the institute of museum and library services through national library leadership grant no. 178 for a project entitled “seamless searching of numeric and textual resources,” and was based on prior research partially supported by darpa contracts n66001-97-c-8541; ao# f477: “search support for unfamiliar metadata vocabularies” and n66001-00-18911, to# j290: “translingual information management using domain ontologies.”

references

1. michael k. buckland, fredric c. gey, and ray r. larson, seamless searching of numeric and textual resources: final report on institute of museum and library services national leadership grant no. 178 (berkeley, calif.: univ.
of california, school of information management and systems, 2002), http:// metadata.sims.berkeley.edu/papers/seamlesssearchfinal report.pdf (accessed july 18, 2006); michael buckland et al., “seamless searching of numeric and textual resources: friday afternoon seminar, feb. 14, 2003,” http://metadata.sims .berkeley.edu/papers/seamlessfri.ppt (accessed july 18, 2006). 2. michael buckland et al., “mapping entry vocabulary to unfamiliar metadata vocabularies,” d-lib magazine 5, no. 1 (jan. 1999), www.dlib.org/dlib/january99/buckland/01buckland .html (accessed july 18, 2006); michael buckland, “the significance of vocabulary,” 2000, http://metadata.sims.berkeley .edu/vocabsig.ppt (accessed july 18, 2006); fredric c. gey et al., “entry vocabulary: a technology to enhance digital search,” in proceedings of the first international conference on human language technology, san diego, mar. 2001 (san francisco: morgan kaufmann, 2001), 91–95, http://metadata.sims.berkeley.edu/ papers/hlt01-final.pdf (accessed july 18, 2006). 3. los angeles times, july 12, 1995: d1. 4. michael buckland, “vocabulary as a central concept in library and information science,” in digital libraries: interdisciplinary concepts, challenges, and opportunities. proceedings of the third international conference on conceptions of library and information science (colis3), dubrovnik, croatia, may 23–26, 1999, ed. t. arpanac et al. (lokve, croatia: benja pubs., 1999), 3–12, www .sims.berkeley.edu/~buckland/colisvoc.htm (accessed july 18, 2006); buckland et al., “mapping entry vocabulary.” 5. counting california, http://countingcalifornia.cdlib.org (accessed july 18, 2006). 6. “factsheet: unified medical language system,” www .nlm.nih.gov/pubs/factsheets/umls.html (accessed july 18, 2006). 7. william s. cooper, aitao chen, and fredric c. gey, “fulltext retrieval based on probabilistic equations with coefficients fitted by logistic regression,” in d. k. 
harman, ed., the second text retrieval conference (trec-2), march 1994, 57–66 (gaithersburg, md.: national institute of standards and technology, 1994), http://trec.nist.gov/pubs/trec2/papers/txt/05.txt (accessed july 18, 2006).
8. “going places in the catalog: improved geographical access,” http://ecai.org/imls2002 (accessed july 18, 2006).
9. vivien petras, ray larson, and michael buckland, “time period directories: a metadata infrastructure for placing events in temporal and geographic context,” in opening information horizons: joint conference on digital libraries (jcdl), chapel hill, n.c., june 11–15, 2006, forthcoming, http://metadata.sims.berkeley.edu/tpdjcdl06.pdf (accessed july 18, 2006); “support for the learner: what, where, when, and who,” http://ecai.org/imls2004 (accessed july 18, 2006).

appendix: statistical association methodology

a statistical maximum likelihood ratio weighting technique was used to construct a two-way contingency table relating each natural-language term (word or phrase) with each value in the metadata vocabulary of a resource, e.g., lcsh, lccns, u.s. patent classification numbers, and so on.1 an associative dictionary that will map words in natural languages into metadata terms can also, in reverse, return words in natural language that are closely associated with a metadata value. training records containing two different metadata vocabularies can be used to create direct mappings between the values of the two metadata vocabularies. for example, u.s. patents contain both u.s. and international patent classification numbers and so can be used to create a mapping between these two quite different classifications. multilingual training sets, such as catalog records for multilingual library collections, can be used to create multilingual natural language indexes to metadata vocabularies and, also, mappings between natural language vocabularies.
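the weighting can be sketched as follows. this is a hedged illustration rather than the project's code: it assumes the widely used g-squared form of the likelihood ratio statistic for a two-by-two contingency table, in the spirit of the reference cited above, and the function names `log_likelihood_ratio` and `rank_values`, the toy record layout, and the decision to keep only positively associated values are all assumptions made for the example.

```python
import math
from collections import Counter


def log_likelihood_ratio(k11, k12, k21, k22):
    """g-squared statistic for a 2x2 contingency table.

    k11: records containing both the word and the metadata value
    k12: records containing the word but not the value
    k21: records containing the value but not the word
    k22: records containing neither
    """
    total = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    observed = ((k11, k12), (k21, k22))
    g2 = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # empty cells contribute nothing to the sum
                expected = rows[i] * cols[j] / total
                g2 += o * math.log(o / expected)
    return 2.0 * g2


def rank_values(word, records):
    """rank metadata values by association with a word.

    records is a list of (set_of_words, set_of_metadata_values) pairs,
    a simplified stand-in for the training records described in the text.
    """
    total = len(records)
    with_word = sum(1 for words, _ in records if word in words)
    value_counts = Counter(v for _, values in records for v in values)
    both = Counter(v for words, values in records if word in words for v in values)
    ranked = []
    for value, with_value in value_counts.items():
        k11 = both[value]
        k12 = with_word - k11
        k21 = with_value - k11
        k22 = total - k11 - k12 - k21
        # keep only values that co-occur with the word more often than chance
        if k11 * total > with_word * with_value:
            ranked.append((value, log_likelihood_ratio(k11, k12, k21, k22)))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)
```

the top few entries returned by `rank_values` for each query word play the role of the prompt list of metadata values that an evi presents to the searcher.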
in addition to the maximum likelihood ratio-based association measure, there are a number of other association measures, such as the chi-square statistic, mutual information measure, and so on, that can be used in creating association dictionaries.

the training set used to create the word-to-lcsh evi was a set of catalog records with at least one assigned lcsh (i.e., at least one 6xx field). natural language terms were extracted from the title (field 245a), subtitle (245b), and summary note (520a). these terms were tokenized; the stopwords were removed; and the remaining words were normalized. a token here can contain only letters and digits. all tokens were then changed to lower case. the stoplist has about six hundred words considered not to be content bearing, such as pronouns, prepositions, coordinators, determiners, and the like. the content words (those not treated as stopwords) were normalized using a table derived from an english morphological analyzer.2 the table maps plural nouns into singular ones; verbs into the infinitive form; and comparative and superlative adjectives to the positive form. for example, the plural noun printers is reduced to printer, and children to child; the comparative adjective longer and the superlative adjective longest are reduced to long; and printing, printed, and prints are all reduced to the same base form print. when a word belonging to more than one part-of-speech category can be reduced to more than one form, it is changed to the first form listed in the morphological analyzer table. as an example, the word saw, which can be a noun or the past tense of the verb to see, is not reduced to see.

subject headings (field 6xxa) were extracted without qualifying subdivisions. the inclusion of foreign words (alcoholismo, alcoolisme, alkohol, and alcool), derived from titles in foreign languages, demonstrates that the technique is language independent and could be adopted in any country. it could also support diversity in u.s.
libraries by allowing searches in spanish or other languages, so long as the training set contains sufficient content words. evis are accessible at http://metadata.sims.berkeley.edu/prototypesi.html. fuller descriptions of the project methodology can be found in the literature.3

■ references

1. ted dunning, “accurate methods for the statistics of surprise and coincidence,” computational linguistics 19 (march 1993): 61–74.
2. daniel karp et al., “a freely available wide coverage morphological analyzer for english,” in proceedings of coling-92, nantes, 1992 (morristown, n.j.: association for computational linguistics, 1992), 950–55, http://acl.ldc.upenn.edu/c/c92/c92-3145.pdf (accessed july 18, 2006).
3. michael k. buckland, fredric c. gey, and ray r. larson, seamless searching of numeric and textual resources: final report on institute of museum and library services national leadership grant no. 178 (berkeley, calif.: univ. of california, school of information management and systems, 2002), http://metadata.sims.berkeley.edu/papers/seamlesssearchfinalreport.pdf (accessed july 18, 2006); youngin kim et al., “using ordinary language to access metadata of diverse types of information resources: trade classification and numeric data,” in knowledge: creation, organization, and use. proceedings of the american society for information science annual meeting, oct. 29–nov. 4, 1999 (medford, n.j.: information today, 1999), 172–80.

lc/marc on molds; an experiment in computer-based, interactive bibliographic storage, search, retrieval, and processing

pauline atherton, associate professor, school of library science, and karen b. miller, research associate, syracuse university, syracuse, new york

a project at syracuse university utilizing molds, a generalized computer-based interactive retrieval program, with a portion of the library of congress marc pilot project tapes as a data base. the system, written in fortran, was used in both a batch and an on-line mode.
it formed part of a computer laboratory for library science students during 1968-1969. this report describes the system and its components and points out its advantages and disadvantages.

introduction

the somewhat intimidating title of this report becomes less so when translated from jargon into more familiar phrases. the lc/marc on molds experimental project conducted at syracuse university school of library science utilizes a computer: 1) to store bibliographic reference (library catalog) data, 2) to search the data for items that meet a searcher's criteria, 3) to retrieve items the searcher wishes retrieved, and 4) to process or manipulate items as required. a dialog or interaction between man and his data, via the machine, is established when a searcher makes a request in a query language and the computer responds immediately to the request.

the lc/marc on molds system consists of two major components. the first is the data base, which is a slightly modified subset of the library of congress marc pilot project records (1). the second component is the computer programming system written in fortran known as molds (acronym for management on-line data system). molds provides the computer routines required to store and maintain the data base, and the query language (also known generally as molds) that a searcher uses to interact with his data stored in the computer.

lc marc on molds/atherton and miller 143

the lc/marc on molds system was originally implemented in april 1968 on the ibm 360/50 at the syracuse university computing center. this system is part of an experiment to determine how on-line interactive retrieval systems could be used to greatest advantage in the information gathering process. the molds system, developed in 1966 by the syracuse university research corporation (2) for management purposes, was readily available for use in the research reported in this paper. molds has been used with several data bases, including the marc records.
the system has not been made available to a large user population. preliminary work with the system and a few demonstrations to students have already provided considerable insight into the desirable and undesirable features in both the marc data base and the molds query language, an insight that has already resulted in both data-base and query-language modification. work with the system on the computer at syracuse university has raised many crucial questions extending beyond the original research plan about system and data base design, questions for which there are as yet no answers.

even at its early stage of experimentation the work should be of interest to librarians because of its use of the marc pilot project records and its use of an available retrieval program with features suitable for reference retrieval. to the authors' knowledge, this is the first computer-based project in which the library of congress marc records were used in an interactive retrieval environment. the query language (molds) was not specifically designed for reference retrieval, but its design features make its use for this purpose quite feasible. it differs from the usual interactive system designed for bibliographic reference retrieval and therefore deserves attention for comparative purposes. molds gives a user the ability to process as well as retrieve data, something very few search and retrieval systems are designed to do.

the contribution of lc/marc on molds to the world of information retrieval, promising though it appears, cannot be assessed until all experiments are run. this report on its features, both good and bad, is offered in order to make those concerned with the design and application of interactive systems aware of its unique aspects and potential. hopefully, this work will contribute another ingredient to the synthesis of ideas and methods that will bring the state of the art ever closer to the optimum and ideal.
144 journal of library automation vol.
3/2 june, 1970 table i. some features of interactive retrieval systems (circa 1968) #docs. in data base system name data base structure access points 1. audacious 2330 tree structure udc descriptions (alp) threaded list euratom key words 2. bold 6000 threaded list astia subject category (sdc) index terms accession numbers 3. colex 2000 inverted index descriptor} subject micro tree structure author qualified country (sdc) index subject by ~ent, subject area, date 4. grins > 1000 serial document index terms (lehigh u) inverted index 5. multilist varies threaded list any chosen key term to fit aaplitree structure cation ( e! author, subject, ate, directory title wor , subject headings) 6. marc/ 2000 molds cell-matrix any discrete data block 7. nasa/recon 270,000 ? subject l { author corporate qualified date source by report# contract# 8. tip > (mit) 25,000 list structure author(s) location ( where work done) citation identification ( i.v-p.) article title ( entire, keyword) citation index bibliographic coupling 9. suny biomed > 20,000 inverted index auilior } { comm. title qualified date, ne1work subject by lang. •each command is a subroutine. commands are tailored to application. a ccess to authority files on-line udc schedules subject category list, index term file no index term no optional no no no lc marc on molds/atherton and miller 145 related terms or #commands cross refs in query given language yes 11 yes 14 ( 11 light pen) no (conversation) yes (conversation) no 0 optional 35 yes 16 function keys no 9 also various mac commands no 10 (?) computer instruction in language use optional optional optional no no no optional no no comj:ier ai query formulation (conversation) limited yes yes yes no no yes no yes root communiword cation search link yes crt yes crt with light pen yes teletype yes teletype no crt no crt yes teletype no ibm 2740 console 146 journal of library automation vol. 
3/ 2 june, 1970

background

a number of interactive retrieval systems have been designed and implemented within the last few years. the features and potential of lc/marc on molds are best viewed in relation to what has been done in the field up to now. to gain some perspective, the major features of data base structures and query languages of other interactive systems are summarized in table 1. this table presents those features of most interest to librarians who may wish to compare searching on a computer with searching in the card catalog or other bibliographic reference tools. references 3-12 document sources for the data in this table.

molds data base structure

the general structure of the data base with which molds operates is, in comparison with the threaded lists and inverted indexes found in many retrieval systems, extremely simple and unsophisticated. the data base can be composed of from one to ten distinct files of 1000 records each. a record is equal to the bibliographic description on a card in a library catalog. each record may be up to 300 computer words (1200 characters) long and may be subdivided into 80 blocks. originally, there was a 200-word (800-character) limitation on record size, but this has now been expanded. the total file size (limit of 10,000 records) is adequate for testing purposes, but expansion beyond the present limitations is planned in order to make the system more practical for actual use.

the structure of a file is essentially a simple matrix. each row contains all the elements of a single complete record; each column contains all like discrete items of all the records in the file. the columns are called blocks in the molds system, block and field being used synonymously in this report. for example, a library catalog card for one publication would be a record in a file composed of library catalog records. the main entries in the file constitute a block and the dates of publication constitute another block.
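the matrix arrangement can be pictured with a short modern sketch. it is purely illustrative and is not how the fortran system was written: the class name `MatrixFile`, its methods, the treatment of block numbers as 1-based column positions, and the exact-match search are all assumptions; only the three size limits are taken from the description above.

```python
MAX_RECORDS = 1000  # records per file, as stated in the text
MAX_BLOCKS = 80     # blocks per record, as stated in the text
MAX_CHARS = 1200    # characters per record (300 computer words), as stated


class MatrixFile:
    """a file as a simple matrix: rows are records, columns are blocks."""

    def __init__(self, block_names):
        if len(block_names) > MAX_BLOCKS:
            raise ValueError("too many blocks for one file")
        self.block_names = list(block_names)
        self.rows = []

    def add_record(self, values):
        """append one record, enforcing the size limits quoted above."""
        if len(values) != len(self.block_names):
            raise ValueError("record must supply one value per block")
        if len(self.rows) >= MAX_RECORDS:
            raise ValueError("file is full")
        if sum(len(v) for v in values) > MAX_CHARS:
            raise ValueError("record exceeds the character limit")
        self.rows.append(list(values))

    def block(self, name_or_number):
        """return one whole column, addressed by name or by 1-based number."""
        if isinstance(name_or_number, int):
            index = name_or_number - 1  # assumption: numbers are positions
        else:
            index = self.block_names.index(name_or_number)
        return [row[index] for row in self.rows]

    def find(self, name_or_number, value):
        """return the records whose chosen block exactly equals value."""
        column = self.block(name_or_number)
        return [self.rows[i] for i, v in enumerate(column) if v == value]
```

because every column can be addressed directly, every block is an access point, which is the contrast with the card catalog that the text draws.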
figure 1 illustrates the data base structure, as of 1968. in this illustration the maximum number of files is 10 (1000 records each) and the maximum number of blocks 80. each file and each block in a file is given a name and/or number. a user can reference or call up any file or data block within a file by using its name or number in a molds query language command. there are as many access points to a file as there are blocks in that file. this is in contrast to a conventional card catalog, for example, where the only access points are filing entries: main entry, title, subject(s), added entries, series, and analytics.

no specific provision is made within the molds system for the storage of authority files, cross reference lists, or other intermediate keys to the records. such files are not absolutely necessary for effective operation of the system since every block can be accessed and can serve as its own authority file. for more efficient system operation, however, it is intended to explore the possibility of creating authority files as part of the data base, beginning with portions of the seventh edition of the library of congress list of subject headings.

figure 1. section of general molds data base structure (as of 1968, the maximum number of files is 10, each of 1000 records, with a maximum number of blocks of 80).

provision is made for temporary user storage areas in which the user places the results of his retrieval and processing operations. data in the user area is retained only during the session in which it is created. although it cannot be saved for use at a later date, all or part of it can be printed out on the on-line printer for the user's later reference.
while the general structure of the data base is formalized within the molds system, the content and specific organization of a particular data base is determined by its originator. this feature, plus the simplicity of molds' own structure, introduces a great deal of flexibility into the data base and the use that can be made of it. the originator of the data base may designate as a block any discrete data item he wishes. if the user population is dissatisfied with results using one content and arrangement of blocks, the base can be reformatted and restructured in a fairly simple maintenance run. no problems of linking records or modifying authority lists arise, as neither is part of the system. the first version of the lc/marc data base has in fact been modified by addition of three blocks and division of one block in half to form two blocks, giving access to smaller units of data.

the lc/marc data base in molds format

library of congress marc pilot project tapes containing some 40,000 records of english language books cataloged in 1966-67 became available for this project in the fall of 1967. because of the molds data base limitations, a subset of these catalog records was selected for use with molds. the original plan was to have each file in the data base consist of as complete a set as possible of all marc pilot project records from a single library of congress classification schedule. the candidate for the first file was class r (medicine) which contained just under 1000 records. later molds files were formed for two other lc classes: t (technology) and z (bibliography and library science). in mid-1969 two stratified sample files of the marc data base were created, one in the humanities, another in the social sciences. in all, syracuse has a marc/molds data base of 10,000 records.
the record format of the marc tape was first analyzed to determine which fields should be included in the data base, and which might be omitted. the criterion for selection was probable usefulness to searchers of the data base, a conception that should undoubtedly be modified as searches are monitored. appropriate changes would not be difficult.

toward the end of january 1969, a programming project was begun which entailed the design and implementation of a computer program to perform format conversion of the library of congress marc i bibliographic file to satisfy molds data base requirements. the project represented a three man-month effort and was completed by june 1969. the data-base converter program represents an attempt to provide a user-oriented facility for creating a molds data base from marc information. essentially, the user of the program describes each molds file to be produced by specifying:

1) the number of (fixed) fields per molds record;
2) the name and size (in characters) of each field in the molds record;
3) the name of the marc i field from which the data are to be taken;
4) selection criteria according to which marc i records are to be chosen for conversion;
5) for any marc i field, a data conversion procedure to be applied prior to transferring the information to the appropriate molds field;
6) whether or not diacritical codes should be stripped from the marc i field prior to transferring the information to the molds field;
7) whether or not character translation from lower-case to upper-case codes should be performed on the data prior to transfer from the marc i to the molds field.

although the program has not yet been refined to the extent originally intended, nevertheless it contains all the features indicated above and has been used to create ten molds files since its completion.
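the seven specifications above can be pictured as a small declarative description that drives the conversion. the sketch below is a modern illustration only and does not reproduce the original program: `FieldSpec`, `strip_marks`, and `convert` are hypothetical names, the input record is modeled as a plain dictionary rather than a marc i tape record, and unicode combining marks stand in for the marc i diacritical codes.

```python
import unicodedata
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class FieldSpec:
    """description of one fixed-length output field (items 2, 3, 5, 6, 7)."""
    name: str                       # name of the output field
    size: int                       # fixed size in characters
    source: str                     # name of the input field to copy from
    transform: Optional[Callable[[str], str]] = None  # conversion procedure
    strip_diacritics: bool = False
    upper_case: bool = False


def strip_marks(text):
    """drop combining marks; a stand-in for stripping diacritical codes."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def convert(record, specs, select=lambda rec: True):
    """build one fixed-field record, or return None if not selected (item 4)."""
    if not select(record):
        return None
    parts = []
    for spec in specs:  # the number of specs fixes the field count (item 1)
        value = record.get(spec.source, "")
        if spec.transform is not None:
            value = spec.transform(value)
        if spec.strip_diacritics:
            value = strip_marks(value)
        if spec.upper_case:
            value = value.upper()
        # truncate or pad with blanks so every field has its declared size
        parts.append(value[:spec.size].ljust(spec.size))
    return "".join(parts)
```

padding every field to its declared size is what makes a converted record longer than its variable-length original, and the upper-case option corresponds to the character translation listed as item 7.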
the program is written in pl/i and is more fully documented in a report available from the national auxiliary publication service of asis.

molds requires fixed-field input for its data base, but many of the fields or data blocks on the marc tape are variable in length. therefore, the field lengths of 200 records in the class r (medicine) subset were examined to determine the maximum size which would produce a molds record within the original 200 computer-word (800-character) limitation and still retain all the desired data. this limitation was easily expanded to 300 words, allowing addition of new fields and expansion of existing fields as new marc/molds files were generated. a record whose original variable length was 500 characters or less expanded to about 800 characters when converted to fixed-field form. in the first data base only records of 500 characters or less were considered for inclusion, which gave a total of 620 records in the first marc/molds file. by mid-1969 this data base was greatly enlarged using the program described above.

the names of the present marc/molds files are: ss01, ss02, ss03, ss04, ssoz, and ssoh. the first files generated were called marc and marz. the marc/molds format now in use is given in table 2. the additions made to the original format are noted. marc/molds block names can be used instead of block numbers; for ease of searching both name and number are given in the table. the molds block number corresponds to marc pilot project field tags whenever possible. after this second revision had been completed, the marc ii format (13), with new field tags, appeared. interestingly, there were remarkably few differences.

creating an information retrieval system from other data bases can present some major headaches.
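The word and character limits quoted above are related by simple arithmetic, sketched here in Python. The factor of four characters per word is inferred from the text's own equivalence of 200 words and 800 characters; the function name `fits` is hypothetical, and 848 is the record total printed in table 2.

```python
CHARS_PER_WORD = 4  # inferred from the text: 200 computer words = 800 characters


def fits(total_chars: int, word_limit: int):
    """Return (limit in characters, whether a fixed-field record fits)."""
    limit_chars = word_limit * CHARS_PER_WORD
    return limit_chars, total_chars <= limit_chars


print(fits(848, 200))  # current format against the original limit
print(fits(848, 300))  # current format against the expanded limit
```

Under these assumptions the current 848-character format would not have fit the original 200-word record, which is consistent with the limit having been raised to 300 words.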
during the first test session with the marc/molds data base, it was discouraging to find that successful retrieval operations could not be performed on such vital items as main entry or subject (blocks main and suba, respectively). the problem lay in the fact that the lower-case character codes employed on the marc tape had not been converted to the all-upper-case codes required by molds. once discovered, the problem was easily remedied.

other problems were not so easy to solve. the marc data base had been received in a "raw form", i.e., there were typographical errors in the original tapes, irregular spacing, and incorrect punctuation, spelling and abbreviations. there was no way to detect these errors, and the retrieval program would only work on direct matches of query and document information elements. the molds language (to be discussed subsequently) required a good deal of standardization and regularity of the records to take full and effective advantage of its retrieval capabilities.

table 2. marc/molds data base format description

field name                      molds   molds   chars.   marc i fixed-    marc fixed fields:
                                block   block   in       field position   values or
                                name    no.     block    or tag no.       explanation
lc card no.                     ldn~    80      11       9-19
type of main entry              type    81      1        21               a-g
form of work                    f0rm    82      1        22               mis
bibliographies indicator        bib     83      1        23               xb
illustrations indicator         illu    84      1        24               xb
maps                            map     85      1        25
conferences                     c0nf    86      1        26
juvenile                        juv     87      1        27
languages                       lang    88      4/4      29-36            both languages
language 1                      lan1    1       4        29-32
language 2                      lan2    2       4        33-36
publication dates               date    89      4/4      38-45            both dates
height in cm.                   hite    90      2        59-60
uniform tracing indicator       unif    91      1        66               xlb
series tracing indicator        sert    92      1        69               xlb
place of publication code       plcd    18      4        46-49
publisher code                  pucd    19      4        50-53
lc call no.                     lcn~    98      20       90
dewey class no.                 dew1    99      20       92
dewey class no. (edited)        dew2    39      8        92               ooddd.dd
lc class no. (edited)           lccl    97      8        90               e.g. 00351.2352
main entry                      main    10      68       10
title statement                 titl    20      80       20
subtitle statement              stit    21      80       20
edition statement               edit    25      12       25
place (imprint statement)       plce    30      28       30
publisher (imprint statement)   publ    31      28       30
collation                       c0ll    40      48       40
series note                     sers    50      44       50/51
note                            n0ta    60      44       60
note                            n0tb    61      44       60
subject tracing                 suba    68      48       70
subject tracing                 subb    69      48       70
subject tracing                 subc    70      48       70
personal author tracing         paua    71      40       71
personal author tracing         paub    72      40       71
corporate author tracing        c0rp    73      1        72
lc card suffix                  lcff    94      3        94
total marc/molds characters                     848

the molds system

functionally, the molds system consists of utility routines to store a data base, a well-defined query language, a language interpreter, and a set of logical procedures which allow the user to operate on a data base. the molds system is a set of fortran iv subroutines which perform the maintenance functions, interpret the commands in the query language and perform the desired logical procedures. the subroutines render the system modular and open. it is therefore relatively easy for a programmer skilled in fortran iv to add, modify and delete commands and functions as required. this feature of the system is quite desirable. user feedback invariably points up weaknesses in the language or suggests useful features which might be incorporated. molds was continually modified in response to user requirements, and each modification was implemented within a short time without requiring major programming changes throughout the system. the system has already grown since it was first implemented with the marc data base, and commands have been added or modified as required.

hardware configuration

marc/molds was run at syracuse university computing center on an ibm 360/50 computer. originally, the on-line mode required full dedication of the computer during execution.
the molds system requires some 150,000 bytes of main memory and a disk storage unit to hold the entire data base, as well as intermediate data generated by the user. the molds system has been implemented on other computers (2). interaction with the system in the on-line version was carried on through an ibm 2260 display station consisting of a keyboard and crt (cathode ray tube) display screen. although two or more consoles have not as yet been operated simultaneously, the system is intended to be time-shared.

effort was made to alter the system to operate in a 50,000-byte (50k) upper partition, so that it could be accessible at all times rather than on a scheduled basis. this involved reorganizing the program into an overlay structure in which the basic or root segments are resident in a fixed portion of memory throughout execution, while the remainder of the program is divided into a set of smaller segments which can overlay each other, being brought into memory only when needed. this task required a careful analysis of each subroutine for its dependence upon others, breaking the program into mutually exclusive segments, while ensuring that any given set of segments which occupied memory simultaneously did not exceed 50k bytes of storage. many of the larger segments which had to be further subdivided required considerable reprogramming.

the first attempt at executing the new overlay version failed. due to a general lack of experience with the 2260 display units, it had not been anticipated that system software would not allow the console to be accessed from outside of the root segment, and the 2260 software package had been placed in an overlay area. as a result the original overlay configuration had to be altered. the console input/output (i/o) package was moved into the root segment, increasing its size by several hundred bytes and similarly decreasing the amount of storage available for the overlay portions.
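The storage constraint behind the overlay structure can be illustrated with a minimal Python sketch. It assumes the simplest possible arrangement, one resident root plus a single overlay region holding one segment at a time; the segment names and byte counts are invented for the example and are not the sizes of the actual molds segments. Only the 50,000-byte partition comes from the text.

```python
PARTITION = 50_000  # bytes; the upper partition size given in the text


def overlay_fits(root_size: int, segment_sizes: dict, partition: int = PARTITION):
    """Return the space left for overlays and any segments too large to load.

    Assumes the root is always resident and exactly one overlay segment
    occupies the remaining storage at any moment.
    """
    available = partition - root_size
    too_big = {name: size for name, size in segment_sizes.items() if size > available}
    return available, too_big


# Hypothetical sizes, chosen only to show the effect described above.
segments = {"retrieval": 27_500, "processing": 26_000, "display": 20_000}

print(overlay_fits(22_000, segments))        # before moving the console package
print(overlay_fits(22_000 + 700, segments))  # root grown by several hundred bytes
```

In this illustration a segment that fit comfortably under the first configuration no longer fits once the root grows, which is the kind of knock-on effect that forces a new configuration.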
therefore, it was necessary to develop yet another configuration to conform to these new storage limitations. while the necessary changes were being made, the computing center began operating a limited time-sharing system which itself required full dedication of the 360/50 machine. projected dates for returning to normal computer operations within a multi-partition environment were far enough in the future to suggest the efficacy of creating a new version of molds which could function off line, with cards and printer instead of the 2260 consoles. in this batch, or off-line, mode molds jobs could be submitted through the regular queue and run by computer center staff during batch processing time.

with the on-line source program as a starting point, all references to 2260's were replaced with card reader and printer statements, and the molds language instructions which depended on the console for their use were deleted. after all changes had been made and compilation was completed successfully, the off-line molds was exercised against a sample data base until it was satisfactorily debugged. since it was known that the computing center would eventually return to partitioned operation, it was next undertaken to overlay the off-line molds into a 50k partition. this was accomplished with little difficulty since the problems encountered in working with the on-line version were largely due to the consoles. the end result of the entire task, therefore, was an off-line molds which could operate either in core or in overlay structure at the discretion of the user.

the molds query language

the molds query language includes some 34 distinct commands which must be entirely formulated by the user according to precise syntactical rules.
the large number of commands is in part a reflection of the fact that this system provides the user with the ability to perform more operations of a greater variety on a data base than other interactive information retrieval systems. it provides for retrieval of records from the data base according to data value descriptors, processing of data values by arithmetic and logical operations, sorting of retrieved records, and display of retrieved records in full or in part.

operationally, the molds system regards a file of records as a set of parallel lists of blocks (figure 1). with the marc data base, these blocks were the 38 fields of catalog data (such as dewey class number, title, author, etc.). the commands in the molds query language are geared to list processing operations. in general, most of the molds commands will result in the formation of lists which are either identical in format to the original file, or are an independent list of alpha or numeric constants not subdivided into blocks.

despite its surface complexity, the query language was designed specifically for users with absolutely no computer experience. the fixed format commands are easy to learn and use, even for the novice in computer based systems. they are mnemonic enough so that a little use soon brings an easy familiarity with them.

commands in the molds query language

there are six categories of commands in the language: retrieval, processing, display, storage, utility, and language augmentation. the commands are listed below with a brief explanation of each.

retrieval commands:

find: forms a temporary subfile consisting of records from the data base for which the value in a specified block is equal, not equal, greater, greater or equal, less, less or equal to an input value.
extract: forms a temporary subfile consisting of records from an argument subfile for which the value in a specified block is equal, not equal, greater, greater or equal, less, less or equal to an input value.
fetch: forms a temporary file which duplicates an existing file in the data base (added to original molds commands during this project).
define: forms a temporary subfile from two argument subfiles based on logical relationships and, or, not.
chain: forms a temporary subfile consisting of records from an argument subfile for which the value in a specified block is equal to any of the values in a specified block from a second argument subfile.
select: forms a temporary subfile consisting of records from an argument subfile for which the value in a specified block is equal to any of the values in an argument list.

these six retrieval commands allow the user to extract selected data from the data base. selection is based on 1) a simple algebraic relationship (e.g., equal, not equal, greater than, etc.) between block values and a value specified by the user in the command (value may be alphanumeric or numeric), or 2) a simple logical relationship (e.g., and, or, not) between block values in two lists. all retrievals from molds files are based on exact-match correspondences between input descriptors and data values as they occur in records. each file is treated as distinct regardless of the fact that for the marc/molds data base the second file may simply be a continuation of the first, etc.

any block in a file may be used as an argument in a retrieval process. thus, the usual range of access points (author, title, subject, classification number) is considerably extended to include such unorthodox access points as juvenile literature, language, illustrations, and bibliographies.
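The behaviour of the retrieval commands can be sketched in a few lines of Python. This is an illustration of the list-processing idea, not the fortran iv implementation. Records are modelled as dictionaries keyed by block name; the relation codes `e` and `ne` appear in the search examples in this article, while `g`, `ge`, `l` and `le`, the reading of "not" as "in the first subfile but not the second", and all sample records are assumptions.

```python
import operator

# "e" and "ne" occur in the article's examples; the other codes are assumed.
RELATIONS = {"e": operator.eq, "ne": operator.ne, "g": operator.gt,
             "ge": operator.ge, "l": operator.lt, "le": operator.le}


def find(records, block, relation, value):
    """find / extract: keep records whose block value satisfies the relation."""
    test = RELATIONS[relation]
    return [rec for rec in records if test(rec[block], value)]


def define(first, logic, second):
    """define: combine two subfiles of the same file with and / or / not."""
    ids_first = {id(rec) for rec in first}
    ids_second = {id(rec) for rec in second}
    if logic == "and":
        return [rec for rec in first if id(rec) in ids_second]
    if logic == "or":
        return first + [rec for rec in second if id(rec) not in ids_first]
    if logic == "not":  # assumed meaning: in `first` but not in `second`
        return [rec for rec in first if id(rec) not in ids_second]
    raise ValueError(f"unknown logical relationship: {logic}")


def select(records, block, values):
    """select: keep records whose block value equals any value in the list."""
    wanted = set(values)
    return [rec for rec in records if rec[block] in wanted]


# Invented sample records; block names follow the article's examples.
marc = [
    {"titl": "A", "bib": "X", "lang": "ENG", "subj": "PRINTING"},
    {"titl": "B", "bib": " ", "lang": "ENG", "subj": "PRINTING"},
    {"titl": "C", "bib": "X", "lang": "FRE", "subj": "TYPE-SETTING"},
    {"titl": "D", "bib": "X", "lang": "ENG", "subj": "TYPE-FOUNDING"},
    {"titl": "E", "bib": "X", "lang": "ENG", "subj": "PRINTING."},
]
bibl = find(marc, "bib", "e", "X")
engl = find(marc, "lang", "e", "ENG")
both = define(bibl, "and", engl)
all_ = select(both, "subj", ["PRINTING", "TYPE-SETTING", "TYPE-FOUNDING"])
print([rec["titl"] for rec in all_])
```

Record E is missed because of its trailing period: exact matching gives precise control over the search but makes retrieval sensitive to small inconsistencies in the data.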
for example, one can retrieve all documents on a given subject or subjects which are juvenile books with bibliographies and illustrations published by a given publisher in 1966. the user can define his search limits with a degree of specificity not found in most interactive systems. however, the price he must pay is exactness in specifying the values used as retrieval criteria. the system will not retrieve on root words or key letter combinations, although such capability could be added. the block values must, therefore, be consistent and the user must have a precise knowledge of what they may be. this knowledge can be gained by examining the values and having them printed out as needed. (molds does have the capability of selecting unique values from a list, ordering them, and printing them out at any time during system operation.)

processing commands:

count: counts the number of records in an argument subfile or items in an argument list.
order (reverse): arranges the records of an argument subfile in ascending (descending) order according to the values in a specified block or similarly sorts the values in an argument list. may be applied to alphabetic, numeric, and chronological data.
maximum (minimum): selects the record containing the maximum (minimum) value in a specified block from an argument subfile, or the maximum (minimum) value in an argument list. may be applied to numeric or chronological data.
total: calculates the sum of the values in a specified block of an argument subfile or of a list of numbers.
average: calculates the average of the values in a specified block of an argument subfile or of a list of numbers.
median: calculates the median of the values in a specified block of an argument subfile or of a list of numbers.
variance: calculates the variance (standard deviation squared) of the values in a specified block of an argument subfile or of a list of numbers.
squareroot: calculates the square root of each value in a block of an argument subfile or of a list of numbers.
difference: calculates successive differences in the values of a specified block in an argument subfile or of a list of numbers.
add (subtract, multiply, divide): adds (subtracts, multiplies, divides) the values from a specified block from an argument file (or list) to the corresponding values from a specified block from a second argument file (or list).
firstelement: selects the first record from an argument subfile or list.
reduce: deletes the first record from an argument subfile or list.
compress: forms a temporary list composed of all the unique values in a specified block of an argument subfile or in an argument list.

the eighteen processing commands allow the user to manipulate the data in the lists he has retrieved. he may count the number of elements in a list; arrange them in ascending or descending order; form the sum, average, variance, median and square root of a list of numbers; add, subtract, multiply, and divide one list by another; and select all unique elements from a list. the ability to process data as well as retrieve it may be unique to molds as compared to other interactive systems, and gives the language a useful added power.

display commands:

display: outputs on the crt (cathode ray tube) each complete record in an argument subfile (added to original molds commands during this project).
show: outputs in columnar fashion on the crt selected blocks from up to three argument subfiles or lists (deleted in batch or off-line mode).
print: outputs in columnar fashion on the printer selected blocks from up to three argument subfiles or lists (added to original molds commands during this project).

the three display commands allow the user to display entire documents, or display selected blocks of information or records in columnar format. in the on-line version of molds this may be done on the crt, or a printout made of selected blocks or lists of documents on the high speed printer. there is much flexibility and versatility in output format, which is completely determined by the user. the command show is not used in the batch mode of molds.

storage commands:

set: stores a single numeric value.
store: stores an alphabetic, chronological, or numeric list of arbitrary length.

the two storage commands allow the user to insert independent lists of constants into the storage area. such lists do not become part of the data base, but are used in conjunction with retrieval and processing commands.

utility commands:

clear: deletes from storage a temporary subfile or list created during the session.
delete: deletes from storage all temporary subfiles or lists created during the session.
dump: displays on the crt in tabular fashion the names, file origins, and number of items in each subfile and list created by the user during the session (deleted in batch or off-line mode).
recall: displays on the crt the command which resulted in the creation of a specified temporary subfile or list (added to original molds commands during this project).
list: produces printed copy of all commands issued during the session. may be used with stop at end of search (added to original molds commands during this project).

the five utility commands allow the user to perform housekeeping operations, such as the clearing of storage areas, reinitialization of the system, and termination of execution. the command dump is not used in the batch mode of molds.

language augmentation command:

program: allows the user to create new commands consisting of a sequence of basic commands and to store them for future sessions.

the language augmentation command, program, is one of the most important features of the language. it allows the user to create new commands tailor-made to his own needs.
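One way to picture the program command is as a macro facility: a new command name stands for a stored sequence of basic commands, with the caller's operands substituted for the formal ones. The Python sketch below works under that assumption; how molds actually binds arguments is not described in this article, and the function names `define_macro` and `expand` are hypothetical. The command text is taken from the article's batch example.

```python
import re


def define_macro(header: str, body: list):
    """Parse a header such as 'program tally a/' and keep its body lines."""
    _, name, params = header.split(None, 2)
    formals = [p for p in params.strip().split("/") if p]
    return name, (formals, body)


def expand(call: str, macros: dict) -> list:
    """Expand a user-defined command into basic commands.

    Lines that do not name a macro pass through unchanged.
    Assumption: arguments are bound by plain textual substitution.
    """
    name, _, rest = call.partition(" ")
    if name not in macros:
        return [call]
    formals, body = macros[name]
    actuals = [a for a in rest.strip().split("/") if a]
    binding = dict(zip(formals, actuals))

    def substitute(line: str) -> str:
        # Replace whole alphanumeric tokens that match a formal parameter.
        return re.sub(r"[A-Za-z0-9]+",
                      lambda m: binding.get(m.group(0), m.group(0)), line)

    return [substitute(line) for line in body]


name, macro = define_macro("program tally a/", ["count b a/", "print b//"])
macros = {name: macro}
print(expand("tally zny/", macros))
print(expand("find zny ssoz/plcd/e/nyny/", macros))
```

With this reading, every later use of tally costs the user one short line instead of a count followed by a print.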
the program command is shown in the first molds search query, which follows.

search request formulation in marc/molds

molds search query-example 1 (batch mode)

    program tally a/
    count b a/
    print b//
    end
    find zny ssoz/plcd/e/nyny/
    tally zny/
    print zny/plce/plcd/pucd//
    find p67 ssoz/date/e/1967/
    tally p67/
    define ny67 zny/and/p67/
    tally ny67/
    average avht ny67/hite/
    print avht/
    stop

the above example shows an off-line or batch-mode search. this sequence of commands would be keypunched and submitted as a job deck in the regular queue and run by the computer center staff, the searcher receiving the results as a printout from the high speed printer. ssoz is the name of one of the marc/molds files. this particular interaction shows the use of the operator program to augment the language in the subsequent search by adding tally to the list of commands.

the following example shows a search query which is a sequence of some typical molds commands along with an explanation of the effect of each. each command has three parts. the first part (find, define, etc.) is the imperative which tells what operation is to be performed. the second part (bibl, engl, both, etc.) is the label of the place in storage where the result of the operation is to be stored. this label is made up by the user when he gives a command. the third part of the command is the operand. in some cases the operand gives the criteria for retrieval (as in find, define). it always gives the name or label of the list to be operated on, and in some cases specifies a particular block of that list.

the request shown in this example was handled by molds to retrieve, display, and process all english language books on printing, or typesetting, or type founding which have bibliographies. the sequence illustrates the flexibility of molds, the many types of processing which can be done, and the relatively easy-to-use command format.
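The three-part command layout described above (imperative, label, operand) is regular enough to be split mechanically. The Python sketch below is a hypothetical illustration, not the molds interpreter; the `Command` type and `parse` function are invented names, and only commands that have all three parts, such as find and define, are handled.

```python
from typing import NamedTuple


class Command(NamedTuple):
    imperative: str   # operation to perform, e.g. "find"
    label: str        # user-chosen name of the storage location for the result
    operands: tuple   # slash-separated operand parts


def parse(line: str) -> Command:
    """Split a three-part command; trailing slashes are treated as terminators."""
    imperative, label, rest = line.split(None, 2)
    operands = tuple(part.strip() for part in rest.rstrip().rstrip("/").split("/"))
    return Command(imperative, label, operands)


print(parse("find bibl marc/bib/e/x/"))
print(parse("define both bibl/and/engl/"))
```

For a find command the operands read as file, block, relation and value; for define they read as first subfile, logical relationship and second subfile.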
the sequence in example 2 was performed in the on-line version, with a chance for user-system interaction after each command.

molds search query-example 2 (on-line mode)

each molds command is followed by its explanation.

    find bibl marc/bib/e/x/
find all records in the file named marc for which the block named bib contains a value equal to (e) x (x in the block indicates presence of bibliographies). the list of selected records is to be stored in a location called bibl.

    find engl marc/lang/e/eng/
find all documents in the file named marc for which the block named lang contains a value equal to (e) eng, i.e. english language books. the list of selected records is to be stored in a location called engl.

    define both bibl/and/engl/
define a new list called both which consists of the documents common to both bibl and engl, i.e., all english language books with bibliographies.

    store subs 3/alpha/13/
inform the system that the user wishes to store, via the console, a list of values which will be called subs. the list will contain 3 elements which will be alphanumeric (alpha) as opposed to strictly numeric. the longest element will not exceed 13 characters.

    element 1 =
(system responds with these words.)

    printing/
user inserts first value by typing it on the console.

    element 2 =
(system responds with these words.)

    type-setting/
user inserts second value.

    element 3 =
(system responds with these words.)

    type-founding/
user inserts third value. user has now created an independent list of three distinct values (printing, type-setting, type-founding) and stored them in a location called subs.

    select all both/subj/subs/
select all records from the list called both for which the values in the block named subj are equal to any of the values in the list called subs, i.e. those records for which the subject heading is printing, type-setting, or type-founding. the selected records are stored in a location called all.

    count no. all/
count the number of records in the list called all. the count is stored in a location called no.

    show no.//
display the contents of no. on the crt.

    print all/main/titl/lcno/ / all/publ/plce/ /
produce a 5-column printed listing consisting of the values in the blocks named main (main entry), titl (title), lcno (library of congress classification number), publ (publisher), plce (place of publication) from each record of the list called all.

    maximum big all/hite/
from the list called all, select the record containing the maximum value in the block named hite (height). the record is stored in a location called big.

    average ave all/hite/
calculate the average of the values in the block named hite (height) of the list called all. the value is stored in a location called ave.

the following example records another interaction and the results in the off-line or batch mode. notice the error message, which did not interrupt the search. this result also includes a report on the length of central processing unit (cpu) time each operation takes, in hours, minutes, seconds and tenths of seconds. any line preceded by c indicates that the line was printed by the computer; any line minus the c indicates that the information was typed in by the user.

molds retrieval-example 3 (batch mode)

    c please enter your program
    c line 1
    oooooooopauline athertonooooooooo
    c invalid command name
    c set in at 185 day of 1969 16-01-17.1
    c line 1
    program tally a/
    c line 1
    count b a/
    c line 2
    print b//
    c line 3
    end
    c set in 185 day of 1969 16-01-17.5
    c line 2
    find d2 ssoz/dew2/ne/o?
    c set in at 185 day of 1969 16-02-38.7
    c line 3
    find d1 ssoz/dewl/ne/ i
    c set in at 185 day of 1969 16-03-56.7
    c line 4
    tally d2/
    c 950.00
    c set in at 185 day of 1969 16-03-57.3
    c line 5
    tally d1/
    c 905.00
    c line 6
    stop

comments on marc/molds

thus far this report has been confined to a more or less factual description of the components of the marc/molds system. no doubt the reader has asked himself many questions about the system, and made his own critical comparisons between this system and others. what follows are preliminary and necessarily subjective comments based on a few demonstrations given to students in the school of library science and on the authors' own observations and reflections.

system design

response time

response time (i.e. the time between transmission of a command in the on-line version and its execution) has ranged from about 90 seconds for a search of 620 records to 20 seconds for an arithmetic operation involving the same number of records. when one thinks of these times in comparison with the time required to perform the same operations manually, they seem rapid. however, 90 seconds appears to be an unreasonably long period of time in a computer-based interactive retrieval environment. viewers of demonstrations often asked why it took the computer "so long" to perform a search. a user's tolerance for delay appears to vary a great deal with the type of retrieval system he is using. this has been observed on other occasions, but no determination has yet been made of tolerable limits in different environments, a determination that would be important in designing computer-based systems.

man-system interaction

a design goal of most other existing interactive retrieval systems seems to be to give the computer certain anthropomorphic qualities and make it into a teacher or a responsive friend. such systems offer computer-aided query formulation and/or a friendly conversation with the computer.
the molds on-line system does not include either of these features. the user must first master a marc/molds manual which is an explanation of the system and the data base. he then goes on line and gives his command. molds responds by performing that command or by putting out a brief error message if the command format was improper.

apparently the objective of conversation with the computer as found in most systems is to make it easier for the user to achieve desired results or to make him feel more at ease with the system. the person who plays with an interactive system once or twice probably finds conversations with a computer amusing, novel, and helpful in his first attempts. however, for a serious and steady user, carrying on the same conversation with the computer during each and every session can be tedious, repetitive, time consuming and sometimes circular. the optimum mix of computer-aided and independent user-formulated query is yet to be studied and found. perhaps molds, because it is a poor conversationalist, could aid in this search. at any rate, the automatic assumption of conversational features as a design goal for computer-based retrieval systems may not be based on sound knowledge of what suits the serious user.

molds repertory of commands

the processing commands in the molds query language are a welcome and valuable addition to the usual repertory of search and display commands common to most interactive systems. although the marc data base does not lend itself to a great deal of processing, we have found some commands useful, particularly count, order, maximum, minimum, and compress.

processing times

when individual commands of a single search take seconds of cpu time, it is certain that a retrieval system will be expensive if it is employed by a great many users as a general purpose system. some of the molds commands operating on the marc data base took whole minutes of cpu time!
the authors have learned a great deal about interactive retrieval systems by using molds experimentally, but because of the excessive cost of certain runs, may not be able to continue research with it. modifications will have to be made to make it more efficient (i.e. cheaper to run) before it could be recommended for general use in the syracuse university library school or anywhere else. if the molds system can be designed to yield good results for certain types of searches with a realistic file size, it will be a boon to the library or educational institution seeking to automate some part of its searching procedures.

data base

noah prywes (14) has commented, "the effectiveness in retrieving documents is highly dependent on the amount of labor and processing invested in the storage of documents." the minimum amount of processing done on the marc tapes has, in fact, limited the effectiveness of retrieval.

the extreme simplicity of the general molds data base structure is worthy of study. the efficiency and cost of retrieval using this structure need to be compared very carefully with more sophisticated threaded lists. one extremely important factor to consider will undoubtedly be the effect of increasing the size of the file.

as pointed out before, the molds system requires an exact match of punctuation and spelling between retrieval criteria and stored data items, a match difficult to achieve. to be sure, this is partially a limitation in the molds system that may be relaxed by incorporating a capability to search for root words and key letter combinations. however, the many inconsistencies in abbreviations, punctuation, and spelling that appear in bibliographic records when information on title pages is transcribed, as on the marc tapes, can enormously complicate effective retrieval. marc or non-marc bibliographic records will always contain some "author" variations that such a system as molds may have to accommodate. this is a very knotty problem.
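The difference between exact matching and the relaxed matching suggested above can be made concrete with a small Python sketch. It illustrates the general idea only and is not a feature of molds; the function names and the particular normalization (upper-casing, dropping ASCII punctuation, collapsing spaces) are assumptions chosen for the example.

```python
import string

_PUNCT = str.maketrans("", "", string.punctuation)


def exact_match(query: str, value: str) -> bool:
    """The behaviour described in the text: character-for-character equality."""
    return query == value


def normalize(text: str) -> str:
    """Upper-case, drop ASCII punctuation, and collapse runs of spaces."""
    return " ".join(text.upper().translate(_PUNCT).split())


def normalized_match(query: str, value: str) -> bool:
    """Compare after normalization, so punctuation and case no longer matter."""
    return normalize(query) == normalize(value)


def root_match(root: str, value: str) -> bool:
    """True if any word of the stored value begins with the given root."""
    stem = normalize(root)
    return any(word.startswith(stem) for word in normalize(value).split())


print(exact_match("TYPE-SETTING", "TYPESETTING."))
print(normalized_match("TYPE-SETTING", "TYPESETTING."))
print(root_match("print", "Printing  and typography."))
```

Dropping hyphens merges "type-setting" into "typesetting", which helps here but could conflate distinct headings elsewhere; any such normalization trades precision for recall and would need testing against real records.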
these comments are not to be construed as a criticism of the fine work the library of congress has done in its marc pilot project. the marc pilot project record format, with sometimes indistinct data elements (special punctuation marks and symbols), was not specifically designed for computer-based interactive search systems. hopefully, the use herein described to which the marc data base has been put, and the experience derived from that use, will be of value as future modifications of the marc format are made. after all, reference retrieval, using bibliographic information, automated or manual, is natural to libraries and is, indeed, one of the purposes for which that information is recorded in the first place. since one of the true values of a computer-based file lies in making multiple use of the records, it becomes imperative to test the various uses to which these records can be put.

the future use of marc/molds at syracuse university

the marc/molds system has undergone continual modification in data base structure and query language during the first year of work on it. a computer-based system must be capable of such flexibility, for changes should be accomplished easily and smoothly. no system is perfect, especially in its early days, least of all molds. it is intended to continue investigation into information-seeking behavior, and to use marc/molds occasionally along with other retrieval systems. another paper describes use of the marc file with the ibm/document processing system (15).

summary

this report has tried to describe, not sell, marc/molds as fairly as possible in the belief that some of its features should be considered by persons designing interactive systems, and by those responsible for refinement of the marc format. the searching capability is valuable as it increases the access points to the data.
the arithmetic and logical operations provide an opportunity to perform certain studies of the marc data base. the marc files will eventually have many applications beyond technical processing functions in libraries. these applications would be more practically implemented if the marc format were modified to accommodate them and if librarians would use systems such as molds during their exploration of alternatives. marc/molds as a computer-based system has many weaknesses. outnumbering and to some extent overshadowing the concrete statements about its faults is its great potential. many questions have been raised which remain unanswered. questions dealing with the basic design of the system and data base are indicative of the development and experimentation which must be done before computer-based interactive retrieval in libraries is a practical reality. acknowledgments the work on this project has been supported by rome air development center (contracts. u. no. af30 (602)-4283). related work, supported by a grant from the u. s. office of education, provided an education in understanding of the marc tapes. the authors gratefully acknowledge the comments made by phyllis a. richmond and frank martel on the original manuscript. mrs. sharon stratakos, programmer most responsible for molds, contributed a great deal to the authors' understanding of this retrieval program and its potential use with a bibliographic reference file such as marc. program microfiches and photocopies of the following may be obtained from national auxiliary publications service of asis: "rome project program description: molds support package" (naps 00884). references 1. avram, henriette: the marc pilot project, final report (washington, d. c.: library of congress, 1968). 2. a user-oriented on-line data system (syracuse, n. y.: syracuse university research corp., 1966). 2 v. 3.
freeman, robert r.; atherton, pauline: audacious - an experiment with an on-line interactive reference retrieval system using the universal decimal classification as the index language in the field of nuclear science (new york: american institute of physics, april 25, 1968) (aip/udc-7). 4. burnaugh, h. p.; et al: the bold user's manual (revised) (santa monica, cal.: jan. 16, 1967) (tm-2306/004/01). 5. cegala, l.; waller, e.: colex user's manual (falls church, va.: system development. feb., 1969) (tm-wd-(l)-405/000/00). 6. smith, j. l.: micro: a strategy for retrieving ranking and qualifying document references (santa monica, cal.: jan. 15, 1966) (sp 2289). 7. green, james sproat: grins: an on-line structure for the negotiation of inquiries (bethlehem, pa.: lehigh university, center for the information sciences, september 1967). 8. computer command and control company: description of the multilist system (philadelphia, pa.: july 31, 1967). 9. national aeronautics and space administration, scientific and technical information division: nasa/recon user's manual (washington, d. c.: october 1966). 10. kessler, m. m.: tip user's manual (cambridge, mass.: massachusetts institute of technology, dec. 1, 1965). 11. biomedical communication network: user's training manual (syracuse, new york: december 1968). 12. welch, noreen o.: a survey of five on-line retrieval systems (washington, d. c.: mitre corp., august 1968) (mtp-322). 13. avram, henriette d.; knapp, john f.; rather, lucia j.: the marc ii format (washington, d. c.: library of congress, 1968). 14. prywes, noah s.: on-line information storage and retrieval (philadelphia, pa.: university of pennsylvania, moore school of electrical engineering, june 1968). 15. atherton, p.; wyman, j.: "searching marc project tapes using ibm/document processing system," proceedings of american society for information science, 6 (1969), 83-88.
comparative costs of converting shelf list records to machine readable form richard e. chapin and dale h. pretzer: michigan state university library, east lansing, michigan a study at michigan state university library compared costs of three different methods of conversion: keypunching, paper-tape typewriting, and optical scanning by a service bureau. the record converted included call number, copy number, first 39 letters of the author's name, first 43 letters of the title, and date of publication. source documents were all of the shelf list cards at the library. the end products were a master book tape of the library collections and a machine readable book card for each volume to be used in an automated circulation system. the problems of format, cost and techniques in converting bibliographic data to machine readable form have caused many libraries to defer the automation of certain routine operations. the literature offers little for the administrator facing the decisions of what to convert and how to convert it. automated circulation systems require at least partial conversion of the accumulated bibliographic record. the university of missouri, like many libraries, has been converting the past record only for books as they are circulated (1). southern illinois university (2) and johns hopkins (3), on the other hand, have converted the record for their entire collections. the southern illinois program is based upon converting only the call number. johns hopkins has converted the call number, main entry, title, pagination, size, and number of copies. and missouri has recorded call number, accession number, and abbreviated author and title. several methods of converting the record have been described.
missouri employed keypunching; southern illinois marked code sheets which were scanned electronically and converted to magnetic tape; johns hopkins, working from microfilm copy of the shelf list, used special type font and typed the records for optical scanning. an ibm report on converting the national union catalog recommended an on-line terminal as the best method of conversion (4). studies at michigan state university led to the conclusion that acquisition, serials, circulation, and card production contained certain routines that might well be automated. once automation of circulation was decided upon as our initial effort, decisions were necessary as to the conversion. it was recommended that a portion of the bibliographic record for all items in the shelf list should be converted. information other than the call number is being used for other programs (5). cost figures for converting library records are scarce. in only two instances are figures available. the ibm report on the national union catalog shows that the average entry in nuc contains 277 characters, with an estimated conversion cost ranging from $0.3531 to $0.417 per entry. the proposed conversion method employs an on-line terminal, a technique not available to most libraries. the johns hopkins conversion of "about 300,000 cards" was accomplished by optical scanning and cost $18,170 (3, p. 4). this figures out at about $.06 per record. later in the report it is stated that the conversion "is at a rate of $.0038 per character converted" (3, p. 25). at $.06 per card and $.0038 per character, the converted record would consist of 16 characters! in the study herewith reported every effort was made to arrive at comparative cost figures for the three methods of conversion that are readily available to most research libraries: keypunching, paper-tape typewriting, and optical scanning as accomplished through a service bureau.
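the arithmetic behind the johns hopkins comparison above can be rechecked with a few lines of python. the dollar figures are the ones quoted in the text; the variable names are illustrative only, and the per-character figure for the national union catalog is a derived value that the source does not state.

```python
# Figures quoted in the text from the Johns Hopkins report (reference 3).
total_cost = 18_170.00   # dollars for the whole conversion
cards = 300_000          # "about 300,000 cards"
rate_per_char = 0.0038   # dollars per character converted

cost_per_card = total_cost / cards
implied_chars = cost_per_card / rate_per_char

print(f"cost per card: ${cost_per_card:.4f}")
print(f"characters implied by the two rates: {implied_chars:.1f}")

# Figures quoted from the IBM report on the National Union Catalog (reference 4).
nuc_chars = 277
nuc_low, nuc_high = 0.3531, 0.417

# Derived here for comparison; not a figure given in the source.
print(f"NUC cost per character: ${nuc_low / nuc_chars:.5f} to ${nuc_high / nuc_chars:.5f}")
```

the two johns hopkins rates are consistent only if the converted record is about 16 characters long, which is the discrepancy the authors point out.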
methods of study the shelf list records of the michigan state university library were divided into three sections by numbering catalog drawers in sequence: 1,2,3; then 2,3,1; then 3,1,2. all the drawers marked with number one became one sample group; those marked two and three made up the other groups. this method of numbering the drawers gave samples from each area of the classification schedule for each method of conversion. the bibliographic data were taken directly from the shelf list without transferring information to worksheets. a sample of the shelf list shows that 74 per cent of the cards are library of congress cards or copies of library of congress proof slips. of those cards produced in the library, only 12 per cent of the total were abbreviated records. [journal of library automation vol. 1/1 march, 1968] the keypunch operators, the typists, and the service bureau were instructed to extract information from the shelf list record. all differences in type (capitals, italics, etc.) were to be ignored; transliterated titles were to be used in those cases where entries were in a non-roman alphabet; accents and diacritical marks were ignored, except where it made a difference in filing, as with umlauts; all numbers in title and author fields were to be spelled as if written.
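the drawer-numbering scheme described at the start of this section can be sketched as a rotating assignment. the source gives only the first three triples (1,2,3; 2,3,1; 3,1,2); the sketch assumes the rotation simply repeats after that, and the function name and the drawer count of 90 are hypothetical.

```python
from collections import Counter


def drawer_group(drawer_index: int) -> int:
    """Return the sample group (1, 2, or 3) for a zero-based drawer index.

    Drawers are taken three at a time; each successive triple starts one
    group later, giving 1,2,3 then 2,3,1 then 3,1,2 (assumed to repeat).
    """
    triple, position = divmod(drawer_index, 3)
    return (position + triple) % 3 + 1


# First nine drawers reproduce the sequence given in the text.
print([drawer_group(i) for i in range(9)])

# With a hypothetical 90 drawers, count how many land in each group.
counts = Counter(drawer_group(i) for i in range(90))
print(dict(counts))
```

rotating the starting label means that no group is always drawn from the first (or last) drawer of a triple, which is how the scheme spreads every method of conversion across the classification schedule.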
fig. 1. shelf list cards. information that was transcribed is marked in the example, figure 1. the complete call number (1) was included. author (2) was typed through 39 spaces, including dates, if possible. in cases where author entry was lengthy the operators were instructed to stop at the end of 39 spaces. title (3) was recorded as completely as possible through 43 spaces, but not to extend beyond the first major punctuation. date (4) was included as shown. only one copy (5) was shown on each entry. in the example of abbreviated form in figure 1, five separate records were required, with change only in copy number. the master book tape includes the call number, which occupies 32 spaces; 3 spaces are allowed for copy number, 39 for author, 43 for title, and 4 for date of publication. on the book card, figure 2, which was generated by the computer from the master book tape, the format is as follows: 32 spaces for call number, 3 for copy number, 11 for author, 26 for title and 4 for the year published. the remainder of the card is for machine codes used in the circulation system. fig. 2. book pocket card. the book card alone can be created directly by the keypunch. however, if a library has equipment available for a more complete program, it is useful to prepare information in a format to create a master book tape. programs have been written so that the master tape can be added to or deleted from at a later date. four operators worked on the project at michigan state university.
two of them were average keypunch operators with little typing skill, one was an expert typist, and the other was an expert keypunch operator. the first two operators were trained to use both the keypunch and the flexowriter. the purpose in using a variety of typists and operators for the job was to arrive at average figures for the conversion project. the data show great variance of output among operators. the outline of the methods used is shown in figure 3. the keypunch method recorded the bibliographic data by use of an ibm 026 keypunch. the punch cards were transferred to a magnetic tape and the book cards were generated by the computer. the paper-tape typewriter information was punched in paper tape by the use of a 2201 flexowriter. a portion of the sample was converted directly to magnetic tape. since some libraries will not have a paper-tape to magnetic-tape converter, the remainder of the paper-tape sample was converted to punch cards and then to magnetic tape. fig. 3. flowchart of shelf list record. the optical scanning method was handled by farrington corporation's service bureau, input services, in dayton, ohio. the service bureau assigned 10 to 15 employees to transcribe the shelf list. they used ibm selectric typewriters, with special type font. special symbols were used to designate end of field. the data were recorded on continuous-form paper. the typed record was then edited and scanned, producing a magnetic tape. after the tape was used for production of book cards, it was added to the master book tape. the first batch of cards sent to dayton was gone from the library for approximately four weeks. after the personnel at dayton became accustomed to the format and to library terminology, the turnaround time was approximately two weeks. the 255,000 records which were converted by the service bureau were sent off campus in four separate batches.
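the two fixed-field layouts described earlier (master book tape: 32, 3, 39, 43, and 4 spaces; book card: 32, 3, 11, 26, and 4 spaces) can be sketched as follows. this is only an illustration of fixed-width packing, not the programs written at michigan state; the field names, helper functions, and sample record are hypothetical, and the rule about stopping the title at the first major punctuation is not modeled.

```python
# Field widths as given in the text; the field names are illustrative.
MASTER_LAYOUT = [("call_number", 32), ("copy_number", 3), ("author", 39), ("title", 43), ("date", 4)]
CARD_LAYOUT = [("call_number", 32), ("copy_number", 3), ("author", 11), ("title", 26), ("date", 4)]


def pack(record: dict, layout: list) -> str:
    """Truncate each field to its width and pad with blanks on the right."""
    return "".join(record.get(name, "")[:width].ljust(width) for name, width in layout)


def unpack(line: str, layout: list) -> dict:
    """Slice a fixed-width line back into named fields, dropping trailing blanks."""
    fields, position = {}, 0
    for name, width in layout:
        fields[name] = line[position:position + width].rstrip()
        position += width
    return fields


# A made-up record, not taken from the shelf list.
record = {
    "call_number": "qd 941 .a413",
    "copy_number": "1",
    "author": "doe, jane, 1900-",
    "title": "an example title that runs well past forty-three characters",
    "date": "1966",
}

master_line = pack(record, MASTER_LAYOUT)
card_line = pack(unpack(master_line, MASTER_LAYOUT), CARD_LAYOUT)
print(len(master_line), len(card_line))
print(unpack(card_line, CARD_LAYOUT))
```

generating the book card from the unpacked master record mirrors the flow in the text, where the shorter card format is derived by the computer from the master book tape rather than keyed separately.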
machine verification of the record was not required. each operator was instructed to proofread her own copy. machine verification was considered, but the idea was discarded because of the extra cost involved. also, since book cards were to be inserted in all volumes, final verification would result when the books and cards were matched. results in the conversion keypunching cost 6.63 cents per record. paper-tape ran slightly higher, at 7.07 cents; this higher cost was due to the added cost of machinery and the added cost of going from paper tape to magnetic tape. optical scanning, through a service bureau, was exactly the same as keypunching, 6.63 cents, including the programming costs. cost details are shown in table 1.
table 1. average cost per shelf list record converted
keypunch: labor (1) $.04073 (salary .03723, fringe benefits .00350); equipment .00322 (rental .00280, computer (2) .00042); supplies .00003; overhead (4) .02232; total $.06630.
paper-tape typewriter: labor (1) $.03960 (salary .03620, fringe benefits .00340); equipment .00888 (rental .00840, computer (2) .00048 (3)); supplies .00052; overhead (4) .02172; total $.07072.
scanning, service bureau: $.00030; contractual services .06600 (5); total $.06630.
(1) average costs for all operators based upon salary of $2.10 per hour, and fringe benefits of 9.4 per cent. (2) rental time to library of ibm 1401 computer is $30.00 per hour, including personnel costs. (3) includes $.000089 for tape-to-tape conversion and $.000091 for tape to card to magnetic tape conversion. (4) university charge of 54.87 per cent of salaries, for space, utilities, maintenance, etc. this figure does not include cost of training and supervision. (5) $.057 per record plus $.009 per record for programming costs.
late in the study we observed that a seemingly inordinate amount of the flexowriter time was consumed by the automatic movement of the typewriter carriage to the pre-determined fixed fields. in order to circumvent this the operator was instructed to strike one key to indicate end of field, and then she no longer had to wait for the carriage movement. by using the manual field markers, as opposed to automatic fixed fields, the cost of the flexowriter operation was reduced to 6.672 cents per record. the disadvantage of the manual field-marking system was the increased chance of operator error, which amounted to 3.13 per cent more than the fixed-field method. for this reason, and in spite of the economy of the manual method, the use of pre-determined fixed fields for flexowriter conversion is to be preferred. in the comparison of the salary costs for keypunching and for the use of flexowriter, great variations were shown among operators. two participants were asked to use both the keypunch and the flexowriter on varying days, with tallies of their output accounted for throughout the entire project. operator 1 was essentially a skilled keypunch operator who had some background in typing. her salary cost per record during keypunching was 3.98 cents; her salary cost for the paper-tape typewriter was 7.92 cents. operator 2 was a skilled keypunch operator who was also sent to typing class for one term to raise her typing skill. her salary cost was 3.92 cents per record on the keypunch and 3.79 cents per record on the paper-tape machine. operator 3, who was a skilled keypunch operator, averaged 2.32 cents per record for salary cost. operator 4, who was a typist and not a keypunch operator, produced records on the flexowriter at a cost of 3.56 cents per record. the above figures indicate salaries only, and do not include overhead, fringe benefits, and other expenses which are reflected in the total conversion cost shown. a letter from farrington service corporation stated the following information about the scanning operation: "1) our typists produced an approximate total of 7,950 typing pages in the course of this conversion.
2) each typist averaged from 3.6 to 3.8 pages per hour. 3) we processed an average of 800-1,000 (shelf list) cards, per girl, per day. 4) the total man hours expended in this project was 2,144. 5) the amount of error detected as a result of sight verification varies significantly from girl to girl. the average, however, ran approximately 2.8 per cent (of records to be corrected)." comparison was made of actual records converted per eight-hour day by each of the methods. the service bureau, with skilled typists, was able to convert approximately 100 records an hour for each typist. the most efficient keypunch operator averaged about 75 records per hour, which was noticeably more than the average. the paper-tape typist, using pre-programmed fixed fields, reached 65 records per hour, but was able to produce 73 records per hour by manually typing the field markers. a short-run sample was stop-watch-timed to give an indication of the differences in results for each method when only minimum changes in certain fields, such as copy number or volume number, were required. on the keypunch machine an operator consumed 34.6 seconds in typing the initial record and 20.4 seconds in duplicating the basic information and changing data in one given field. the operator with the automatic program flexowriter consumed 47.2 seconds typing the initial record, including 13.2 seconds in shifting fields and automatically firing the record marks, and 24 seconds duplicating the record. when she manually indicated the field information, she was able to convert the initial record in slightly less time, 30 seconds; and she took 22.8 seconds to duplicate the data with a change in one field. final verification will be completed only when all cards are matched with the proper books. for those books that do not circulate, this may never be accomplished. a sample of cards was selected to reflect the three methods of conversion.
the service bureau cards contained fewer errors than those produced by keypunching and paper-tape typewriting. production of records that were not acceptable to the computer in an edit program occurred in 1.75 per cent of the sample for keypunching, 0.93 per cent for paper-tape typewriting, and 0.16 per cent for service bureau. operator errors, discovered while matching cards with books, showed a higher percentage: 4.62 per cent for keypunching, 3.60 per cent for flexowriter, and 0.35 per cent for service bureau. conclusions and recommendations 1. converting a portion of the bibliographic record is relatively inexpensive when compared to the total cost of automated library programs. one reason for our delay in entering into the field of an automated circulation program was that of making the book cards. now that this task has been completed, it is obvious that conversion is a one-time cost that can well be absorbed. if the library cannot afford the original conversion, at a cost of 6 or 7 cents a record, then the library cannot afford to proceed with automated programs. 2. there is no difference in cost between keypunching a machine readable record and having the project undertaken by a service bureau. the use of paper-tape typewriter for conversion costs more than the other two methods. 3. large scale conversion of records to machine readable form might well be done by an outside organization. in order to get the task completed in a short period of time, a library would be required to hire a number of short-term clerical employees. in the case of michigan state, situated in the small community of east lansing, recruiting and training a large number of employees for short-term projects is most difficult. it is rather certain that the overhead for such a program would bring the cost beyond that of using a service bureau. on the basis of our experience it is recommended that the conversion be sent to a service bureau.
4. a library can get along without portions of a shelf list for short periods of time. one of the predicted problems of sending material off campus to be converted was that of losing the availability of the shelf list records. although there were some inconveniences, it was found that the library could carry on its operations and function without the shelf list. certainly, this could not be done if the shelf list cards were gone for any length of time. acknowledgment a grant from the council on library resources, inc., made possible the study described in this paper. references 1. parker, ralph h.: "development of automatic systems at the university of missouri library," in university of illinois graduate school of library science, proceedings of the 1963 clinic on library applications of data processing. (champaign, ill.: illini union bookstore, 1964), 43-55. 2. southern illinois university. office of systems and procedures: an automated circulation control system for the delyte w. morris library; the system and its progress in brief. (carbondale, ill.: southern illinois university, 1963). 3. the johns hopkins university. the milton s. eisenhower library: progress report on an operations research and systems engineering study of a university library. (baltimore: johns hopkins, 1965). 4. international business machines. federal systems division: report on a pilot project for converting the pre-1952 national union catalog to a machine readable record. (rockville, maryland: ibm, 1965). 5. chapin, richard e.: "administrative and economic considerations for library automation," in university of illinois graduate school of library science, proceedings of the 1967 clinic on applications of data processing. (in press).
extending im beyond the reference desk: a case study on the integration of chat reference and library-wide instant messaging network ian chan, pearl ly, and yvonne meulemans information technology and libraries | september 2012 4 abstract openfire is an open-source instant messaging (im) network and a single unified application that meets the needs of chat reference and internal communication. in fall 2009, the california state university san marcos (csusm) library began using openfire and other jive software im technologies to simultaneously improve our existing im-integrated chat reference software and implement an internal im network. this case study describes the chat reference and internal communications environment at the csusm library and the selection, implementation, and evaluation of openfire. in addition, the authors discuss the benefits of deploying an integrated im and chat reference network. introduction instant messaging (im) has become a prevalent contact point for library patrons to get information and reference help, commonly known as chat reference or virtual reference. however, im can also offer a unique method of communication between library staff. librarians are able to rapidly exchange information synchronously or asynchronously in an informal way. im provides another means of building relationships within the library organization and can improve teamwork. many different chat-reference software packages are widely used by libraries, including questionpoint, meebo, and libraryh3lp. less commonly used is openfire (www.igniterealtime.org/projects/openfire), an open-source im network and a single unified application that uses the extensible messaging and presence protocol (xmpp), a widely adopted open protocol for im. since 2009, the california state university san marcos (csusm) kellogg library has used openfire for chat reference and internal im communication. 
openfire was relatively easy for the web development librarian to set up and administer. librarians and library users have found the im interface to be intuitive. in addition to helpful chat reference features such as statistics capture, queues, transfer, and linking to meebo widgets, openfire offers the unique capability to host an internal im network within the library. ian chan (ichan@csusm.edu) is web development librarian, california state university san marcos, pearl ly (pmly@pasadena.edu) is access services & emerging technologies librarian, pasadena community college, pasadena, and yvonne meulemans (ymeulema@csusm.edu) is information literacy program coordinator, california state university san marcos, california. in this article, the authors present a literature review on im as a workplace communication tool and its successful use in libraries for chat reference services. a case study on the selection, implementation, and evaluation of openfire for use in chat reference and as an internal network will be discussed. in addition, survey results on the library staff use of the internal im network and its implications for collaboration and increased communication are shared. literature review although there is a great deal of literature on im for library reference services, publications on the use of im in libraries for internal communications do not appear in the professional literature. a review of library and information science (lis) literature has revealed very limited work on this aspect of instant messaging. however, a wider literature review in the fields of communications, computer science, and business, indicates there is growing interest in studying the benefits of im within organizations. instant messaging in the workplace in the workplace, im can offer a cost-effective means of connecting in real-time and may increase communication effectiveness between employees.
it offers a number of advantages over email, telephone, and face-to-face that we will discuss further in the following section. within the academic library, im offers the possibility of not only improving access to librarians for research help but also provides the opportunity to enhance communication and collaboration throughout the entire organization. research findings indicate that im allows coworkers to maintain a sense of connection and context that is different from email, face-to-face (ftf), and phone conversations.1 each im conversation is designed to display as a single textual thread with one window per conversation. the contributions from each person in the discussion are clearly indicated and it is easy to review what has been said. this design supports the intermittent reconnection of conversation and in contrast to email, “intermittent instant messages were thought to be more immersive and to give more of a sense of a shared space and context than such email exchanges.”2 through the use of im, coworkers gain a highly interactive channel of communication that is not available via other methods of communication.3 phone and ftf conversations are two of the most common forms of interruption within the workplace.4 however, garrett and danziger found that “instant messaging in the workplace simultaneously promotes more frequent communications and reduces interruptions.”5 participants reported they were better able to manage disruptions using im and that im did not increase their communication time. the findings of this study revealed that some communication that otherwise may have occurred over email, by telephone, or in-person were instead delivered via im. this likely contributed to the reduced interruptions because im does not require full and immediate attention unlike a phone call or face-to-face communication. 
in addition, im study participants reported the ability to negotiate their availability through postponing conversations, and these findings support earlier studies suggesting im is less intrusive than traditional communication methods for determining availability of coworkers.6 a number of research studies show that im improves teamwork and is useful for discussing complex tasks. huang, hung, and chen compared the effectiveness of email and im and the number of new ideas; they found that groups utilizing im generated more ideas than the email groups.7 they suggested that the spontaneous and rapid interchanges typical of im facilitate brainstorming between team members. the information that is uniquely visible through im and the ease of sending messages help create opportunities for spontaneous dialog. this is supported by a study by quan-haase, cothrel, and wellman, which found im promotes team interaction by indicating the likelihood of a faster response.8 ou et al. also suggest im has “potential to empower teamwork by establishing social networks and facilitating knowledge sharing among organizational members.”9 im can enhance the social connectedness of coworkers through its focus on contact lists and instant, opportunistic interactivity. the informal and personalized nature of im allows workers to build relationships while promoting the sharing of information. cho, trier, and kim suggest that the use of im as a communication tool encourages unplanned virtual hallway discussions that may be difficult for those located in different parts of a building, campus, or in remote locations.10 im can build relationships between teams and organizations where members are in physically separated locations. however, cho, trier, and kim also note that im is more successful in building relationships between coworkers who already have an existing relationship. wu et al.
argue that by helping to build the social network within the organization, instant messaging can contribute to increased productivity.11 several studies have cautioned that im, like other forms of communication, requires organizational guidelines on usage and best practices. mahatanankoon suggests that productivity or job satisfaction may decrease without policies and workplace norms that guide im use.12 other research indicates that personality, employee status, and working style may affect the usefulness of im for individual employees.13 some workers may find the multitasking nature of im to work in their favor while those who prefer sequential task completion may find im disruptive. the hierarchy of work relationships and the nature of managerial styles are likely to have an impact on the use of im as well. while there are no research findings associated with the use of im for internal communication within libraries, there are articles encouraging its use. breeding writes of the potential for im to bring about “a level of collaboration that only rarely occurs with the store-and-forward model of traditional e-mail.”14 fink provides a concise introduction to the advantages of using internal im for communication between library staff.15 in addition, he provides an overview of the implementation and success of the openfire-based im network at mcmaster university. success of chat reference in libraries im-based chat reference gives libraries the means to more easily offer low-cost delivery of synchronous, real-time research assistance to their users, commonly referred to as “chat reference.” although libraries have used im for the last decade and many currently subscribe to questionpoint, a collaborative virtual reference service through oclc, two newer online services helped propel the growth of im-based chat reference.
first available in 2006, the web-based meebo (www.meebo.com) made it much easier to use im for localized chat reference because library patrons were no longer required to have accounts on a proprietary network, such as aol or yahoo, to communicate with librarians.16 instead, meebo provided web widgets that allowed users to chat via the web browser. libraries could easily embed these widgets throughout their website and unlike questionpoint, meebo is free and does not require a subscription. librarians could answer questions using either their account on meebo’s website or by logging-in with a locally installed instant messaging client. in comparison to im-based chat reference, a number of libraries also found questionpoint difficult to use due to its complexity and awkward interface.17 in 2008, libraryh3lp (http://libraryh3lp.com) pushed the growth of im-based chat reference even further because it offered a low-cost, library-specific service that required little technical expertise to implement and operate. libraryh3lp improved on the meebo model by adding features such as queues, multi-user accounts, and assessment tools.18 im adds a more informal means of interaction that helps librarians build relationships with their users. several recent studies have shown that users respond positively to the use of im for chat reference. 
the illinois state university milner library found that switching from its older chat reference software to im increased transactions by 161 percent within one year.19 with the introduction of web-based im widgets, pennsylvania state university library’s im-based chat reference grew from 20 percent to 60 percent of all virtual reference (vr), which includes email reference, in one year.20 a 2010 study of vr and im service at the university of guelph library found 71 percent user satisfaction with im compared to 70 percent satisfaction with vr overall.21 im use in academic libraries has become ubiquitous, and other types of libraries also use im to communicate with library patrons.

case study

california state university, san marcos (csusm) is a mid-size public university with approximately 9,500 students. csusm is a commuter campus with the majority of students living in north county san diego and offers many online or distance courses at satellite campuses. the csusm kellogg library has a robust chat reference service that is used by students on and off campus. the library has about forty-five employees including librarians, library administrators, and library assistants. the following section will discuss the meebo chat reference pilot, selection of openfire to replace meebo, implementation and customization of openfire, and evaluation of openfire for chat reference by librarians and as an internal network for all library personnel.

meebo chat reference pilot

to examine the feasibility of using im for chat reference at csusm, the reference librarians initiated a pilot program using meebo (2008–9). a meebo widget was placed on the library’s homepage, the ask a librarian page, and on library research guides.
within the first year of the pilot project, chat reference grew to more than 41 percent of all reference transactions.22 based on responses to user satisfaction surveys, 85 percent indicated they would recommend chat reference to other students, and 69 percent said they preferred it to other forms of reference services. chat reference is now an integral part of the library’s research assistance program, and im has become a permanent access point for students to contact reference librarians. although the new im service was successful, the pilot program uncovered a number of key shortcomings with meebo when used for chat reference; these shortcomings are documented in a case study by meulemans et al.23 these findings matched problems reported by other libraries who used meebo in their reference services.24 meebo is most suited for individual users who communicate one-to-one via im. for example, meebo chat widgets are specific to each meebo user, and it is not possible to share a single widget between multiple librarians. in addition, features such as message queues and message transfers, invaluable for managing a heavily used chat reference service, are not available in meebo. those features are essential for working with multiple, simultaneous incoming im messages, a common occurrence in virtual reference. other missing features included the lack of built-in transcript retention and lack of automated usage statistics.25 selecting openfire based on the need for a more robust chat reference system, the csusm reference librarians and the web development librarian explored other im options, especially open-source software. the web development librarian had previous experience using openfire at the university of alaska anchorage, for an internal library im network and investigated its capabilities to replace meebo as a chat reference tool. the desire to replace meebo for chat reference at csusm also provided the opportunity to pilot an internal im network. 
openfire, part of the suite of open-source instant messaging tools from jive software, was the only application that could easily fulfill both roles and offered a number of features that made it highly preferable when compared to other im-based chat reference systems. of its many features, one of the most valuable was the integration between openfire user accounts and our campus email system. being able to tap into the university’s email system meant automated configuration and updating of all staff accounts and contact lists. this removed the burden of individual account maintenance associated with external services such as meebo, libraryh3lp, and questionpoint. openfire supports internal im networks at educational institutions such as the university of pennsylvania, central michigan university, and university of california, san francisco.

openfire could meet our im chat reference needs because it includes the fastpath plugin, a complete web-based chat management system available at www.igniterealtime.org/projects/openfire/plugins.jsp. this robust system incorporates important features such as message queues, message transfer, statistics, and canned messages. james cook university library in australia also chose to use openfire with the fastpath plugin as its chat reference solution based on its need for those features.26 other institutions using fastpath and openfire in the role of chat reference or support include the university of texas, the oregon/ohio multistate virtual reference consortium, mozilla.com, and the university of wisconsin.

when reviewing chat reference solutions, we considered the possibility of using chat modules available through drupal (http://drupal.org), the web content management system (cms) for our library website. the primary advantage of that option was complete integration with the library website and intranet.
further analysis of the drupal option revealed that the available chat modules were too basic for our needs and that reconfiguration of our intranet and website to incorporate a workable chat reference system would require extensive time. in comparison to the implementation time associated with deploying the openfire system, using drupal-based chat modules did not provide a favorable cost-benefit ratio.

while the proprietary libraryh3lp offered similar functionality for chat reference, its inability to integrate with our email system was clearly a deficit when compared to openfire. in libraryh3lp, it is necessary to create accounts for all library personnel in chat reference. fastpath does not have that requirement if you integrate openfire with your organization’s lightweight directory access protocol (ldap) directory. instead, the system will automatically create accounts for all library staff. furthermore, the administrative options and interface for libraryh3lp did not compare favorably with those of fastpath. the fastpath interface for assigning users is more intuitive and the system generates a customizable chat initiation form for each workgroup (figures 1 and 2). oregon’s l-net and ohio’s knowitnow24x7 offer information about software requirements and an online demonstration of spark/fastpath.27

figure 1. fastpath chat initiation form for csusm research help desk

figure 2. fastpath chat initiation form for csusm media library

for our requirements, openfire was clearly superior to the available systems for chat reference. its relatively simple deployment requirements and ease of setup helped make it our first choice for building a combined im network and chat reference system. in the following section, we will discuss the installation, customization, and assessment of our openfire implementation.
openfire installation and configuration

the openfire application is a free download from ignite realtime, a community of jive software. the program will run on any web server that has a windows, linux, or macintosh operating system. if configured as a self-contained application, openfire only requires java to be available on your web server. installation of the software is an automated process and system configuration is through a web-based setup guide. after the initial language selection form, the next step in the server configuration process is to enter the web server url and the ports through which the server will communicate with the outside world (figure 3). the third step provides fields for selecting the type of database to use with openfire and for inputting any information relating to your selection (figure 4).

figure 3. openfire server settings screen

figure 4. openfire database configuration form

openfire uses a database to store information such as im network settings, user account information, and transcripts. database options include using an embedded database or connecting to an external database server. using the embedded database is the simpler option and is helpful if you do not have access to a database server. connecting to an external database server offers more control of the data generated by openfire and provides additional backup options. openfire works with a number of the more commonly used database servers such as mysql, postgresql, and microsoft sql server. in addition, oracle and ibm’s db2 are database options with additional free plugins from these vendors. we chose to use mysql because of our experience using it with other library web applications. if using the external database option, creating and configuring the external database before installing openfire is highly recommended.
after choosing a database, the openfire configuration requires the selection of an authentication method for user accounts. one option is to use openfire’s internal authentication system. while the internal system is robust, it requires additional administrative support to manage the process of creating and maintaining user accounts. the recommended option is to connect openfire with your organization’s lightweight directory access protocol (ldap) directory (figure 5). ldap is a protocol that allows external systems to interact with the user information stored in an organization’s email system. using ldap with openfire is highly preferable because it simplifies access for your librarians and staff by automatically creating user accounts based on the information in your organization’s email system. library staff simply log in with their work email or network account information; they are not required to create a new username and password.

figure 5. openfire ldap configuration form

the last step in the configuration process is to grant system administrator access to the appropriate users. if using the ldap authentication method, you are able to select one or more users in your organization by entering their email id (the portion before the @ sign). the selected users will have complete access to all aspects of the openfire server. once the setup and configuration process is complete, the server is ready to accept im connections and route messages. reviewing the settings and options within the openfire system administration area is highly recommended. most libraries will likely want to adjust the configurations within the sections for server settings and archives.

connecting the im network

the second phase of the implementation process connected our library personnel with the im network using im software installed on their workstations.
the openfire im server works with any multiprotocol im client (“multiprotocol” refers to support for simultaneous connections to multiple im networks) that provides options for configuring an xmpp or jabber account. some of the more popular im clients that offer this functionality include spark, trillian, miranda, and pidgin. based on our chat reference requirements, we chose to use spark (www.igniterealtime.org/projects/spark), an im client program designed to work specifically with the fastpath web chat service. spark comes with a fastpath plugin that enables users to receive and send messages to anyone communicating through the web-based fastpath chat widgets (more information on fastpath configuration is in the next section of this article). this plugin provides a tab for logging into a fastpath group and for viewing the status of the group’s message queues (figure 6). spark also includes many of the features offered by other im clients including built-in screen capture, message transfer, and group chat.

figure 6. the fastpath plugin for spark

library personnel were able to install spark on their own by downloading it from the ignite realtime website and launching the software’s installation package. the installation process is very simple and user-specific information is only required when spark is started for the first time. the fields required for login include the username and password of the user’s organizational email and the address of the im server. as part of our implementation process, we also provided library staff with recommendations regarding the selection and configuration of optional settings that might enhance their im experience. recommendations included auto-start of spark when logging in to the computer and the activation of incoming message signals, such as sound effects and pop-ups.
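the login fields described above can be illustrated with a short python sketch. it is only an illustration under stated assumptions: the function names, the sample address jdoe@your_org.edu, and the server name library.your_org.edu are hypothetical, and the sketch assumes that the im username is the email id (the portion of the address before the @ sign) and that the xmpp domain matches the im server address. it does not talk to openfire or spark.

```python
def email_id(address: str) -> str:
    """Return the part of an email address before the @ sign."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not a full email address: {address!r}")
    return local


def spark_login_fields(address: str, im_server: str) -> dict:
    """Collect the values a staff member would type into the login form.

    Hypothetical helper: the password is deliberately left out, and the
    assumption that the IM username equals the email ID is ours.
    """
    username = email_id(address)
    return {
        "username": username,
        "server": im_server,
        # XMPP addresses take the form user@domain; we assume the domain
        # is the same as the IM server address.
        "xmpp_address": f"{username}@{im_server}",
    }


if __name__ == "__main__":
    print(spark_login_fields("jdoe@your_org.edu", "library.your_org.edu"))
```

if a site’s xmpp domain differs from the host name of its im server, the last field would need to be adjusted accordingly.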
on our openfire server, we also installed the kraken gateway (http://kraken.blathersource.org) plugin to enable connections to external im networks. the gateway plugin works with spark to integrate library staff accounts on chat networks such as google talk, facebook, and msn (an example of integrated networks is shown in figure 6). by integrating meebo as well, librarians were able to continue using the meebo widgets they had embedded into their research guides and faculty profile pages. this allowed them to use spark to receive im messages rather than logging on to the meebo website.

configuring the fastpath plugin for chat reference

a primary motivation for using openfire was the feature set available in the fastpath plugin. fastpath is a complete chat messaging system that includes workgroups, queues, chat widgets, and reporting. fastpath actually consists of two plugins that work together: fastpath service for managing the chat system and fastpath webchat for web-based chat widgets. both plugins are available as free downloads from the openfire plugins section of the ignite realtime website: www.igniterealtime.org/projects/openfire/plugins.jsp. to install fastpath, upload its packages using the form in the plugins section of the openfire administrative interface. the plugins will automatically install and add a fastpath tab to the administrative main menu.

the first step in getting started with the system is to create a workgroup and add members (figure 7). within each new workgroup, one or more queues are required to process and route incoming requests and each queue requires at least one “agent.” in fastpath, the term agent refers to those who will receive the incoming chat requests.

figure 7. workgroup setup form in fastpath

as workgroups are created, the system automatically generates a chat initiation form, which by default includes fields for name, email, and question.
administrators can remove, modify, and add any combination of field types including text fields, dropdown menus, multiline text areas, radio buttons, and check boxes. you may also configure the chat initiation form to require completion of some, all, or none of the fields. at csusm, our form (figures 1 and 2) includes fields for name, email, and question, plus a dropdown menu for selecting the topic area of the user’s research. the information in these fields allows us to quickly route incoming questions to the appropriate subject librarian. fastpath includes the ability to create routing rules that use the values submitted in the form to send messages to specific queues within a workgroup. in the future, we may use the dropdown menu to automatically route questions to the subject specialist based on the student’s topic.

there are two methods to make the fastpath chat widget available to the public. the standard approach embeds a presence icon on your webpage and provides automatic status updates. clicking on the icon displays the chat initiation form. for our needs we chose to embed the chat initiation form in our webpages (see appendix b for sample code). when the user submits the form, openfire routes the message to the next available librarian. on the librarian’s computer, the spark program plays a notification sound and displays a pop-up dialog. the pop-up dialog remains open until the librarian accepts the message, passes it on, or the time limit for acceptance is reached, in which case the message returns to the queue for the next available librarian.

evaluation of openfire for enhanced chat reference

the csusm reference librarians found fastpath and openfire to be much more robust than meebo for chat reference.
the ability to keep chat transcripts and to retain metadata such as time stamps, duration of chats, and topic of research for each conversation is very helpful for analyzing the effectiveness of chat research assistance and for statistical reporting. the automated recording of transcripts and metadata saved time when compared to meebo. using meebo, transcripts were manually copied into a microsoft word document and the tracking statistics of im interactions were kept in a shared excel spreadsheet. other useful features of fastpath were the capability of transferring patrons to other librarians and having more than one librarian monitor incoming questions. furthermore, access to the database holding the fastpath data allowed us to build an intranet page to monitor real-time incoming im messages and their responses.

however, some issues were encountered with the fastpath plugin when initiating chat connections. we experienced intermittent, random instances of dropped im connections and lost messages. while many of these lost connections were likely the result of user actions (accidentally closing the chat pop-up, walking away from the computer, etc.), others appear to have been due to problematic connections between the server and the user’s browser. to address these issues, we are now asking users to provide their email when they initiate a chat session. with user emails and our real-time chat monitoring system, we are able to follow up with reference patrons who experience im connection issues and provide research assistance via email.

evaluation of openfire as an internal communication tool

while the adoption of im as an internal communication tool was highly encouraged, its use was not mandatory for all library personnel. based on the varied technical background of our staff and librarians, we recognized that some might find im difficult to integrate within their workflow or communication style and chose a soft launch for our network.
information technology and libraries | september 2012 16 in summer 2011, we conducted a survey of csusm library personnel (44 respondents, 99 percent of total staff) to evaluate im as an internal communication tool. (see appendix a for survey questions.) we found that 59 percent of staff use the internal im network while 85 percent use some type of im for web-based chat for work. of those who use internal im, 30 percent used it daily. while the survey was anonymous, anecdotal discussions indicate adoption rates are higher among library units where the work is technically oriented or instructional in nature, such as library systems and the information literacy program/reference. among the respondents who use im, 45 percent of library staff indicated they use it because it allows quick communication between those in the library and 39 percent like its informal nature of communication. twenty percent of total respondents preferred im to email and phone communications. two respondents use the internal im network but were dissatisfied with it and indicated it did not work well while one found it too difficult to use. an additional survey question was geared for staff members who do not use the internal im network at all (“why do you not use the library im network?”). this question was designed to find areas of possible improvement within our system to encourage greater use. survey respondents were allowed to select more than one reason. the most common reasons given by those who do not use the library im network were that they don’t feel the need (34 percent of nonusers), they mainly communicate with staff members who are also not utilizing the im network (18 percent), im does not work for their communication style (14 percent), and privacy concerns (14 percent). we believe more in-depth analysis is necessary to learn more regarding the perceived usefulness of im within our organization and to further its adoption. 
conclusion

through additional training and user education, we hope to promote greater use of the openfire internal im network among those who work in the library. while 100 percent adoption of im as a communication tool is not a stated goal of our project, we believe that some staff have not realized the full potential of im for collaboration and productivity due to a lack of experience with this technology. in hindsight, additional training sessions beyond the initial introductory workshop to set up the spark im client may have increased the usage of im by staff. for example, providing more information on the library’s policies regarding internal im tracking and the configuration of our system may have alleviated concerns regarding privacy. in addition, we need to lead more discussions on the benefits of im for collaboration, lowering disruptions, and increasing effectiveness in the workplace.

openfire and fastpath have brought many new features that were previously unavailable to chat reference at csusm. the addition of queues, message transfer, and transcripts has enhanced the effectiveness of this service and eased its management. compared to the prior chat reference implementations that used questionpoint and meebo, this new system is more user friendly and robust.

furthermore, the internal im network and its connection to web-based chat widgets offer the opportunity for building a library that is more open to users. library users could feasibly contact any library staff member, not just reference librarians, via im for help. we are testing this concept with a pilot project involving the csusm media library. they are staffing their own chat workgroup and a chat widget is now available on their website. in the future, we also hope to employ a chat widget for circulation and ill services, another public services area that frequently works with library users.
it is important to note that the success of openfire and im in the library attracted the attention of other csusm instructional and student support areas. in spring 2011, instructional and information technology services (iits), which provides campus-wide technology services for faculty, staff, and students piloted an openfire-based im helpdesk service to assist users with technology questions and problems. as of fall 2011, the “ask an it technician” service is fully implemented and available on all campus webpages. discussions on the adoption of im for other campus student services, such as financial aid and counseling, have also occurred. in addition to being a contact point for students, im has potential to improve the internal communication within the organization. references 1. hee-kyung cho, matthias trier, and eunhee kim, “the use of instant messaging in working relationship development: a case study,” journal of computer-mediated communication 10, no. 4 (2005), http://onlinelibrary.wiley.com/doi/10.1111/j.1083-6101.2005.tb00280.x/full (accessed aug. 1, 2011). 2. bonnie a. nardi, steven whittaker, and erin bradner, “interaction and outeraction: instant messaging in action,” in proceedings of the 2000 acm conference on computer supported cooperative work (new york, new york: acm press, 2000),79–88. 3. ellen isaacs et al., “the character, functions, and styles of instant messaging in the workplace,” in proceedings of the 2002 acm conference on computer supported cooperative work (new york, new york: acm press, 2002), 11–20. 4. victor m. gonzález and gloria mark, “constant, constant, multi-tasking craziness: managing multiple working spheres,” in proceedings of the sigchi conference on human factors in computing systems (new york, new york: acm press, 2004), 113–20. 5. r. kelly garrett and james n. danziger, “im = interruption management? instant messaging and disruption in the workplace,” journal of computer-mediated communication 13, no. 
1 (2007), http://jcmc.indiana.edu/vol13/issue1/garrett.html (accessed jun. 15, 2011). 6. nardi, whittaker, and bradner, “interaction and outeraction,” 83. 7. albert h. huang, shin-yuan hung, and david c. yen, “an exploratory investigation of two internet-based communication modes,” computer standards & interfaces 29, no. 2 (2006): 238–43. http://onlinelibrary.wiley.com/doi/10.1111/j.1083-6101.2005.tb00280.x/full http://jcmc.indiana.edu/vol13/issue1/garrett.html information technology and libraries | september 2012 18 8. anabel quan-haase, joseph cothrel, and barry wellman, “instant messaging for collaboration: a case study of a high-tech firm,” journal of computer-mediated communication 10, no. 4 (2005), http://jcmc.indiana.edu/vol10/issue4/quan-haase.html (accessed jun. 12, 2011). 9. carol x. j. ou et al., “empowering employees through instant messaging,” information technology & people 23, no. 2 (2010): 193–211. 10. cho, trier, and kim, “instant messaging in working relationship development.” 11. lynn wu et al., “value of social network—a large-scale analysis on network structure impact to financial revenue of information technology consultants” (paper presented at winter information systems conference, salt lake city, ut, feb. 5, 2009). 12. pruthikrai mahatanankoon, “28p. exploring the impact of instant messaging on job satisfaction and creativity,” conf-irm 2010 proceedings (2010). 13. ashish gupta and han li, “understanding the impact of instant messaging (im) on subjective task complexity and user satisfaction,” in pacis 2009 proceedings. paper 10, http://aisel.aisnet.org/pacis2009/1; and stephanie l. woerner, joanne yates, and wanda j. orlikowski, “conversational coherence in instant messaging and getting work done,” in proceedings of the 40th annual hawaii international conference on system sciences, http://www.computer.org/portal/web/csdl/doi/10.1109/hicss.2007.152 (2007). 14. 
marshall breeding, “instant messaging: it’s not just for kids anymore,” computers in libraries 23, no. 10 (2003): 38–40. 15. john fink, “using a local chat server in your library,” feliciter 56, no. 5 (2010): 202–3. 16. william breitbach, matthew mallard, and robert sage, “using meebo’s embedded im for academic reference services: a case study,” reference services review 37, no. 1 (2009): 83–98. 17. cathy carpenter and crystal renfro, “twelve years of online reference services at georgia tech: where we have been and where we are going,” georgia library quarterly 44, no. 2 (2007), http://digitalcommons.kennesaw.edu/glq/vol44/iss2/3 (accessed aug. 25, 2011); and danielle theiss-white et al., “im’ing overload: libraryh3lp to the rescue,” library hi tech news 26, no. 1/2 (2009): 12–17. 18. theiss-white et al., “im’ing overload,” 12–17. 19. sharon naylor, “why isn’t our chat reference used more?” reference & user services quarterly 47, no. 4 (2008): 342–54 20. sam stormont, “becoming embedded: incorporating instant messaging and the ongoing evolution of a virtual reference service,” public services quarterly 6, no. 4 (2010): 343–59. http://jcmc.indiana.edu/vol10/issue4/quan-haase.html http://www.computer.org/portal/web/csdl/doi/10.1109/hicss.2007.152 http://digitalcommons.kennesaw.edu/glq/vol44/iss2/3 extending im beyond the reference desk | chan, ly, and meulemans 19 21. lorna rourke and pascal lupien, “learning from chatting: how our virtual reference questions are giving us answers,” evidence based library & information practice 5, no. 2 (2010): 63–74. 22. pearl ly and allison carr, “do u im?: using evidence to inform decisions about instant messaging in library reference services” (poster presented at the 5th evidence based library and information practice conference, stockholm, sweden, june 29, 2009), http://blogs.kib.ki.se/eblip5/posters/ly_carr_poster.pdf (accessed august 1, 2011). 23. 
yvonne nalani meulemans, allison carr, and pearl ly, “from a distance: robust reference service via instant messaging,” journal of library & information services in distance learning 4, no. 1 (2010): 3–17. 24. theiss-white et al., “im’ing overload,” 12–17. 25. meulemans, carr, and ly, “from a distance,” 14–15 26. nicole johnston, “improving the reference and information experience of students in regional areas—does an instant messaging service make a difference?” (paper presented at 4th alia new librarians symposium, december 5–6, 2008, melbourne, australia), http://eprints.jcu.edu.au/2076(accessed august 17, 2011); and alan cockerill, “open source for im reference: openfire, fastpath and spark” (workshop presented at fair shake of the open source bottle, griffith university, queensland college of art, brisbane, australia, november 20, 2009), http://www.quloc.org.au/download.php?doc_id=6932&site_id=255 (accessed august 4, 2011). 27. oregon state multistate collaboration, “multi-state collaboration: home,” http://www.oregonlibraries.net/multi-state (accessed august 16, 2011). http://blogs.kib.ki.se/eblip5/posters/ly_carr_poster.pdf http://eprints.jcu.edu.au/2076 http://www.quloc.org.au/download.php?doc_id=6932&site_id=255 http://www.oregonlibraries.net/multi-state information technology and libraries | september 2012 20 appendix a library instant messaging (im) usage survey the information you submit is confidential. your name and campus id are not included with your response. which of the following do you use . . . 
for work for personal library’s im network (spark) meebo msn yahoo gtalk facebook or other website-specific chat system im app on my phone trillian, pidgin or other im aggregator skype i don’t use im or web-based chat other if you selected other, please describe: ____________________________________________________________________ extending im beyond the reference desk | chan, ly, and meulemans 21 on average, how often do you communicate via im or web-based chat at work? ● several times a day ● almost daily ● several times a week ● several times a month ● never how often do you use im or web-based chat to . . . 5—often 4 3— sometimes 2 1—never discuss work-related topic socialize with co-worker answer questions from library users talk about non-work related topic request tech support other if you selected other, please describe: ____________________________________________________________________ if you use im to communicate at work, what do you like about it? ● allows for quick communication with others in the library ● facilitates informal conversation ● students like to use it to ask library related questions ● i prefer im over phone or email ● other: information technology and libraries | september 2012 22 why do you not use the library im network? ● don’t feel the need ● the people i usually talk to aren’t on it ● does not work well ● never get around to it . . . but would like to ● it doesn’t work for my communication style ● the system is too difficult to use ● privacy concerns ● other: additional comments? ____________________________________________________________________ extending im beyond the reference desk | chan, ly, and meulemans 23 appendix b iframe code for embedding fastpath chat widget <iframe scrolling= “no” frameborder= “0” src=“http://library.your_org.edu: 7070/webchat/userinfo.jsp?workgroup=<workgroupname>@workgroup.library.your_org.edu”> your browser does not support inline frames or is currently configured not to display inline frames. 
content can be viewed at actual source page: http://library.your_org.edu:7070/webchat/userinfo.jsp?workgroup=<workgroupname>@workgroup.library.your_org.edu
</iframe>

a fast algorithm for automatic classification
r. t. dattola: department of computer science, cornell university, ithaca, new york

an economical classification process of order $n \log n$ (for $n$ elements), which does not employ n-square procedures. convergence proofs are given and possible information retrieval applications are discussed.

many methods exist for ordering or classifying the elements of a file. the elements are usually clustered into groups based on the similarities of the attributes of the elements. in information retrieval, the elements are frequently documents, and the attributes are words or concepts characterizing the documents. classification of document files may be divided into two basic categories: 1) an a priori classification already exists and each document is placed into the cluster whose centroid is most similar to that document; 2) no a priori classification is specified and clusters are formed only on the basis of similarities among documents. classification schemes that fall into the first class are very common and often involve manual methods.
for example, new acquisitions of a library are classified by placing them into the clusters of a standard a priori classification. problems of the second type are usually more difficult to handle, and automatic or semi-automatic methods are often used. methods of this type are widely used in statistical programs, but the number of elements in the file is limited to several hundred, or at most, a few thousand items. in information retrieval applications, the number of elements may approach several hundred thousand or even a million documents, as in the case of a large library. in the present study, a method is described which is suitable for classification of very large document collections.

journal of library automation vol. 2/1 march, 1969

the $n^2$ problem
current methods of automatic document classification usually require the calculation of a similarity matrix. this matrix specifies the correlation, or similarity, between every pair of documents in the collection. thus, if the collection contains $n$ documents, $n^2$ computations are required for calculation of the similarity matrix; however, the similarity matrix is often symmetric, so the number of computations is reduced to $n^2/2$. this immediately poses two serious problems: the storage space necessary to store the matrix increases as the square of the number of documents, and the time required to calculate the matrix also increases quadratically. fortunately, document-document similarity matrixes are normally only about ten percent dense, and only the non-zero elements need be stored (1). however, as $n$ increases, auxiliary storage must eventually be used, and although this solves the space problem, it also magnifies the time problem. to illustrate the magnitude of this problem, suppose that it takes one hour of computer time to classify a one-thousand document collection. then for $n = 10^4$, the time is approximately one hundred hours, and for $n = 10^6$, the time needed is about 120 years!
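these estimates follow from proportional scaling against the one-hour, one-thousand-document reference run. the sketch below is only an illustration: it assumes running time is exactly proportional to $n^2$, or to $n \log_{10} n$ for the scheme introduced in the next paragraph, and the function names are hypothetical. under those assumptions the quadratic figure for $n = 10^6$ comes out at roughly 114 years, which the text rounds to about 120.

```python
import math

BASE_DOCS = 1_000   # reference collection size used in the text
BASE_HOURS = 1.0    # assumed time to classify the reference collection


def quadratic_hours(n):
    """Estimated hours if time grows in proportion to n squared."""
    return BASE_HOURS * (n / BASE_DOCS) ** 2


def n_log_n_hours(n):
    """Estimated hours if time grows in proportion to n * log10(n)."""
    reference = BASE_DOCS * math.log10(BASE_DOCS)
    return BASE_HOURS * n * math.log10(n) / reference


for n in (10**4, 10**6):
    q = quadratic_hours(n)
    l = n_log_n_hours(n)
    print(f"n={n}: n^2 -> {q:.0f} h ({q / 8760:.1f} years), "
          f"n log n -> {l:.1f} h ({l / 24:.1f} days)")
```

the gap widens with collection size: a factor of about 7.5 at ten thousand documents becomes a factor of 500 at one million.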
the classification scheme described in this paper is an adaptation of the one proposed by doyle, and the time required is of the order of $n \log n$ (2). for example, assuming the logarithm has base 10, and the time required for a one-thousand document collection is again one hour, then for $n = 10^4$ the time is 13 hours, and for $n = 10^6$, the time required is about 83 days.

doyle's algorithm
the $n^2$ problem is avoided in this classification scheme, because a similarity matrix is never computed. assume the document set is arbitrarily partitioned into $m$ clusters, where $s_j$ is the set of documents in cluster $j$. associated with each set $s_j$ are a corresponding concept vector $c_j$ and frequency vector $f_j$. the concept vector consists of all the concepts occurring in the documents of $s_j$, and the frequency vector specifies the number of documents in $s_j$ in which each concept occurs. every concept in $c_j$ is assigned a rank according to its frequency; i.e., concepts with the highest frequency have a rank of 1, concepts with the next highest frequency receive a rank of 2, etc. given an integer $b$ (base value), every concept in $c_j$ is assigned a rank value equal to the base value minus the rank of that concept. the vector of rank values is called the profile $p_j$ of the set $s_j$. tables 1 and 2 illustrate the concept and frequency vectors, and the corresponding profiles for a sample document collection (base value = 6).

starting from a partition of the document set into $m$ clusters, the profiles are generated as described. every document $d_i$ in the document space is now scored against each of the $m$ profiles by a scoring function $g$, where $g(d_i, p_j)$ = the sum of the rank values of all the concepts from $d_i$ which occur in $c_j$. tables 3 and 4 show the results of scoring the documents in the sample collection against the profiles from table 1 (cutoff = 10).

table 1.
concept vectors dl d2 da d-4 d5 d6 d1 c1 ci c1 c1 c1 cs c6 c2 c2 c7 c2 cs cs c5 c4 cs cs c5 cn
table 2. initial clusters, profiles, and frequencies s~ c1 f. pl s2 c2 f2 p2 sa cs fs dl c1 3 5 d2 c1 2 5 d6 cs 1 da c2 1 3 d4 c2 2 5 d7 . c6 1 d5 cn 1 3 cs 1 4 cs 1 c1 1 3 c4 1 4 cs 2 4 c5 2 5
table 3. document scoring document profile of highest score score ~ 2 m ~ 2 w ~ 1 ~ ~ 2 w d5 1 9 ds 3 5 ~ 3 ro
table 4. clusters resulting from document scoring s1' s2' sa' l da d1 d1d5 d2 d6 d4 pa 5 5 5

given a cut-off value $t$, a new partition of the document set into $m+1$ clusters is made by the following formula:

$$s_j' = [\, d_i \mid g(d_i, p_j) \ge g(d_i, p_k) \text{ and } g(d_i, p_j) \ge t, \text{ for } k = 1, \ldots, m \,].$$

thus, $s_j'$ consists of all the documents that score highest against profile $p_j$, provided that the score is at least as great as $t$. in cases where a document scores highest against two or more profiles, say $p_{r_1}, \ldots, p_{r_n}$, the following tie-breaking rule is used: if $d_i \in s_j$ and, a) $j = r_k$ for some $k$, $1 \le k \le n$, then $d_i$ is assigned to $(s_{r_k})'$; b) $j \ne r_k$ for $1 \le k \le n$, then $d_i$ is arbitrarily assigned to $(s_{r_1})'$. those documents which do not fall into any of the $m$ clusters $s_j'$ are called loose documents, and they are assigned to a special class $l$. the process is now repeated after replacing $p_j$ by $p_j'$. the iteration continues until $p_j$ satisfies the termination condition, which states that $p_j' = p_j$ for $j = 1, \ldots, m$, i.e., the profiles are unchanged after two consecutive iterations.

satisfaction of termination condition
a) non-convergence of doyle's algorithm
doyle's algorithm as described is not guaranteed to terminate. to illustrate this, consider the following document collection (concepts contained in each document; concept 13 in d9 is the one implied by the frequency and score tables of this example):

d1: 1, 2, 3, 4, 5, 11
d2: 1, 5, 6, 11, 12
d3: 3, 7, 8, 9, 12
d4: 3, 4, 7, 8
d5: 1, 5, 6, 7
d6: 2, 3, 4, 9
d7: 2, 3, 4, 7, 8, 9, 10
d8: 2, 4, 5, 6, 7, 8, 10
d9: 1, 2, 4, 5, 6, 7, 10, 13

let $s_1 = [d_1 - d_6]$, $s_2 = [d_7 - d_9]$, and let $p_1$ = profile of $s_1$ and $p_2$ = profile of $s_2$.
the two profiles are as follows (base value = 7):

concept   d1-d6 frequency   d1-d6 profile   d7-d9 frequency   d7-d9 profile
1         3                 5               1                 4
2         2                 4               3                 6
3         4                 6               1                 4
4         3                 5               3                 6
5         3                 5               2                 5
6         2                 4               2                 5
7         3                 5               3                 6
8         2                 4               2                 5
9         2                 4               1                 4
10        0                 -               3                 6
11        2                 4               0                 -
12        2                 4               0                 -
13        0                 -               1                 4

now assume that $t = 0$, and partition the document set by the formulae: $s_1' = [\, d_i \mid g(d_i, p_1) \ge g(d_i, p_2) \,]$ and $s_2' = [\, d_i \mid g(d_i, p_2) \ge g(d_i, p_1) \,]$. the results are summarized in the following table:

document   g(d_i, p_1)   g(d_i, p_2)
d1         29            25
d2         22            14
d3         23            19
d4         20            21
d5         19            20
d6         19            20
d7         28            37
d8         27            39
d9         28            42

therefore, $s_1' = [d_1 - d_3]$ and $s_2' = [d_4 - d_9]$. according to doyle's algorithm, $p_1$ is replaced by $p_1'$ and $p_2$ by $p_2'$. the new profiles are:

concept   d1-d3 frequency   d1-d3 profile   d4-d9 frequency   d4-d9 profile
1         2                 6               2                 3
2         1                 5               4                 5
3         2                 6               3                 4
4         1                 5               5                 6
5         2                 6               3                 4
6         1                 5               3                 4
7         1                 5               5                 6
8         1                 5               3                 4
9         1                 5               2                 3
10        0                 -               3                 4
11        2                 6               0                 -
12        2                 6               0                 -
13        0                 -               1                 2

now the document set is again partitioned and the results are:

document   g(d_i, p_1)   g(d_i, p_2)
d1         34            22
d2         29            11
d3         27            17
d4         21            20
d5         22            17
d6         21            18
d7         31            32
d8         31            33
d9         32            34

therefore, $s_1' = [d_1 - d_6]$ and $s_2' = [d_7 - d_9]$. these are the original sets, so that the algorithm will never terminate for this example.

b) termination of modified algorithm
although doyle's algorithm is not guaranteed to terminate, needham proved that similar types of iterative methods are guaranteed to terminate in a finite number of steps (3). a small change in doyle's method produces an algorithm that is guaranteed to terminate. the modification occurs after the calculation of the $s_j'$. instead of automatically replacing the old $p_j$ by $p_j'$, the following condition must also be satisfied:

$$\sum_{i \in s_j'} g(d_i, p_j') > \sum_{i \in s_j'} g(d_i, p_j)$$

if the above condition is not satisfied, $p_j$ is left unchanged.
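the oscillation in example a) can be checked mechanically. the following sketch is an illustration, not the author's program: the concept lists are read off the example collection (with concept 13 placed in d9, as the frequency and score tables imply), concepts of equal frequency are assumed to share a rank, and the helper names `profile`, `score`, and `repartition` are hypothetical.

```python
from collections import Counter

# Concepts contained in documents d1..d9 of the non-convergence example.
DOCS = {
    1: {1, 2, 3, 4, 5, 11},
    2: {1, 5, 6, 11, 12},
    3: {3, 7, 8, 9, 12},
    4: {3, 4, 7, 8},
    5: {1, 5, 6, 7},
    6: {2, 3, 4, 9},
    7: {2, 3, 4, 7, 8, 9, 10},
    8: {2, 4, 5, 6, 7, 8, 10},
    9: {1, 2, 4, 5, 6, 7, 10, 13},
}
BASE = 7  # base value used in the example


def profile(cluster):
    """Map each concept in the cluster to its rank value (base minus rank)."""
    freq = Counter(c for d in cluster for c in DOCS[d])
    # The highest frequency gets rank 1, the next highest rank 2, and so on.
    distinct = sorted(set(freq.values()), reverse=True)
    rank_of = {f: r for r, f in enumerate(distinct, start=1)}
    return {c: BASE - rank_of[f] for c, f in freq.items()}


def score(doc, prof):
    """g(d, p): sum of rank values of the document's concepts found in p."""
    return sum(prof.get(c, 0) for c in DOCS[doc])


def repartition(clusters):
    """One iteration with two clusters and cut-off 0."""
    p1, p2 = (profile(c) for c in clusters)
    s1 = frozenset(d for d in DOCS if score(d, p1) >= score(d, p2))
    return s1, frozenset(DOCS) - s1


start = (frozenset(range(1, 7)), frozenset(range(7, 10)))
first = repartition(start)
second = repartition(first)
print(sorted(first[0]), sorted(first[1]))
print(second == start)
```

with these inputs the first repartition yields d1-d3 and d4-d9, and the second returns to d1-d6 and d7-d9, so the unmodified iteration cycles with period two.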
before proving that this new algorithm is guaranteed to terminate, it is desirable first to make the algorithm more general by allowing overlap between the clusters. the following theorem proves the termination of a method which allows overlapping clusters.

theorem: let the subscript $n$ designate the $n$th iteration. let $d$ represent the document space and let $p_{0,1}, \ldots, p_{0,m}$ represent $m$ initial profiles corresponding to an arbitrary distribution $s_{0,1}, \ldots, s_{0,m}$ of documents in $d$. given a cut-off value $t$, the $n$th iteration is defined as follows:

1) generate the sets $s_{n,1}, \ldots, s_{n,m}$ and $l_n$ by

$$s_{n,j} = [\, d_i \mid g(d_i, p_{n-1,j}) \ge t \,], \qquad l_n = (\text{loose documents})$$

2) let

$$p_{n,j} = \begin{cases} p_{n,j}' & \text{if } \sum_{i \in s_{n,j}} g(d_i, p_{n,j}') > \sum_{i \in s_{n,j}} g(d_i, p_{n-1,j}) \\ p_{n-1,j} & \text{otherwise} \end{cases}$$

where $p_{n,j}'$ is the profile generated from $s_{n,j}$.

this algorithm is guaranteed to terminate in a finite number of iterations, where termination occurs when $p_{n,j} = p_{n-1,j}$ for all $j$.

proof: extend the document space $d$ to a new document space $d^{\#}$ containing $m$ distinguishable copies of every document in $d$. also, add the condition that $s_{n,j}$ can never contain more than one copy of each document. clearly, any $s_{n,j}$ defined on $d^{\#}$ in this manner can also be represented on $d$ as defined in the theorem. conversely, any $s_{n,j}$ defined on $d$ can also be represented on $d^{\#}$ as defined above. thus, it suffices to prove the theorem on $d^{\#}$ under the added condition. define a function $f_n$, which will be shown to be monotone increasing in $n$, by the following:

$$f_n = \sum_{j=1}^{m} f_{n,j} + t \cdot z_n, \quad \text{where } f_{n,j} = \sum_{i \in s_{n,j}} g(d_i, p_{n-1,j}) \text{ and } z_n = \text{number of documents in } l_n.$$

after step 2 of the iteration, $f_n$ is replaced by $f_n'$, where

$$(f_{n,j})' = \sum_{i \in s_{n,j}} g(d_i, p_{n,j}).$$

if for any $j$, $p_{n,j} \ne p_{n-1,j}$, then $(f_{n,j})' > f_{n,j}$ (this statement is not necessarily true in doyle's algorithm) and therefore $f_n' > f_n$. if termination occurs, i.e., $p_{n,j} = p_{n-1,j}$ for all $j$, then $f_n' = f_n$. for the $(n+1)$th iteration,

$$f_{n+1} = \sum_{j=1}^{m} f_{n+1,j} + t \cdot z_{n+1}, \quad \text{where } f_{n+1,j} = \sum_{i \in s_{n+1,j}} g(d_i, p_{n,j}).$$

consider the relation between the contribution of $d_i$ to $f_n'$ and $f_{n+1}$, and note that each $d_i$ (where copies of a document are distinct) contributes once and only once to both $f_n'$ and $f_{n+1}$. this relation is summarized in the following table, which gives, for each case, the relation between the contribution of $d_i$ to $f_{n+1}$ and its contribution to $f_n'$:

a) document $d_i$ was assigned to $s_{n,j}$ and now
   1) to $s_{n+1,j}$ ($d_i$ did not change clusters): =
   2) to $s_{n+1,k}$, $k \ne j$ ($d_i$ did change clusters): > ($g(d_i, p_{n,j}) < t$; otherwise $d_i$ would be in $s_{n+1,j}$. also, $g(d_i, p_{n,k}) \ge t$)
   3) to $l_{n+1}$: > ($g(d_i, p_{n,j}) < t$. now $d_i$ contributes $1 \cdot t = t$ to $f_{n+1}$)
b) document $d_i$ was assigned to $l_n$ and now
   1) to $l_{n+1}$: = (contributes $t$ to both)
   2) to $s_{n+1,j}$: $\ge$ (gave $t$ for $f_n'$, and now gives $g(d_i, p_{n,j}) \ge t$ for $f_{n+1}$)

therefore, $f_{n+1} \ge f_n'$. if $p_{n,j} = p_{n-1,j}$ for all $j$, then from a) 1) and b) 1) $f_{n+1} = f_n'$. therefore, if the termination condition is satisfied, then $f_{n+1} = f_n$. on the other hand, if $f_{n+1} = f_n$, then $f_n = f_n'$, which occurs only when termination occurs. thus, $f_n$ is a monotone increasing function, where $f_{n+1} = f_n$ if the termination condition is satisfied. given $m$ and $t$, $f$ depends only on the distribution of the documents of $d^{\#}$ in $s_1, \ldots, s_m$. since there are only a finite number of distributions, there are only a finite number of values for $f$. therefore, at some iteration $n$, $f_{n+1}$ must equal $f_n$.

implementation
the algorithm described in the preceding section is not implemented. instead, experiments are performed using an algorithm which differs from the preceding one in four important respects:

1) the extra condition necessary for convergence that is mentioned in section b above is not implemented; i.e., $p_j$ is always replaced by $p_j'$;
2) termination occurs when $s^{*}_{n,j} = s^{*}_{n+1,j}$ for all $j$, where $s^{*}_{n,j}$ is the subset of $s_{n,j}$ consisting of all those documents that score highest against profile $p_{n,j}$;
3) let $h_{n,i} = \max_{1 \le j \le m} g(d_i, p_{n,j})$, and define $s_{n,j}$ as $s_{n,j} = [\, d_i \mid g(d_i, p_{n-1,j}) \ge t_{n-1,i} \,]$, where

$$t_{n-1,i} = \begin{cases} h_{n-1,i} - a \cdot (h_{n-1,i} - t), & \text{if } h_{n-1,i} > t, \text{ where } 0 \le a \le 1 \\ t & \text{otherwise} \end{cases}$$

4) if any $s_{n,j}$ contains fewer than two documents, then $s_{n,j}$ is eliminated, thereby reducing the number of clusters by one.

the advantages of this method over the one defined in the theorem are discussed in the present section; the disadvantage is, of course, that termination is not guaranteed. to show this, note that conditions 1) and 2) above are equivalent to the termination condition in doyle's algorithm, since in doyle's method $p_{n,j}$ always corresponds to the new partition $s_{n,j}$, and $s_{n,j} = s^{*}_{n,j}$ (no overlap is allowed). also, if $a = 0$ in condition 3, then $t_{n-1,i} = h_{n-1,i}$. thus, only those documents $d_i$ that score highest against $p_{n-1,j}$, where $h_{n-1,i} \ge t$, are assigned to $s_{n,j}$. therefore, with $a = 0$ this method is equivalent to doyle's algorithm.

the first two modifications are implemented to improve the efficiency of the program. although convergence is no longer guaranteed, all the experiments tried so far have in fact always terminated. programs without these two modifications run about twice as slow. also, in cases where the overlap is not too high ($s^{*}_{n,j} = s_{n,j}$), the new termination condition is usually equivalent to the one used in the theorem. that is, when $s^{*}_{n,j} = s^{*}_{n+1,j}$, then very often $s_{n,j} = s_{n+1,j}$.

the third modification does not improve efficiency, but it allows a more flexible, and intuitively, a more desirable method for creating overlap. the algorithm described in the theorem assigns a document $d_i$ to a cluster $s_{n,j}$ if $g(d_i, p_{n-1,j}) \ge t$.
this has two major disadvantages: 1) the overlap cannot be increased independently of the number of loose documents; increasing the overlap by lowering $t$ in general decreases the percentage of loose documents; 2) the difference between $d_i$'s highest score and $d_i$'s second highest score is ignored; e.g., if $t = 50$, $g(d_i, p_1) = 200$, and $g(d_i, p_2) = 50$, then $d_i$ is assigned to both $s_1$ and $s_2$. the first problem decreases the flexibility of the algorithm, since the amount of overlap and percentage of loose documents cannot be varied independently. the example in the second part illustrates the other problem. it seems desirable that a document should be assignable to two or more clusters when it scores equally (or almost equally) as high against all of them. the previous method does not take this fact into account. in the new algorithm, documents are assigned to more than one cluster on the basis of how close the score is to the highest score, relative to the cut-off value $t$. the parameter $a$ determines how close to the highest score the other scores must be. when $a = 0$, no overlap occurs, while $a = 1$ generates the maximum amount of overlap. with $a = 1$, the formula reduces to $t_{n-1,i} = t$; hence, it is the same definition of $s_{n,j}$ as in the theorem.

the last modification increases the efficiency of the program, and also avoids forming clusters around documents which should be classified as loose. when $s_{n,j}$ contains only one document, and that document is contained in no other clusters, then it has the same status as a loose document.

experimental results
the algorithm described in the preceding section is used to cluster the 82-document adi collection and the 200-document cranfield word stem collection (4).
the results of the classification indicate three important problems: 1) the scoring function $g$ tends to give higher scores to documents containing a larger number of concepts; thus, many of the documents containing very few concepts are classified as loose; 2) the documents do not move freely enough from one profile to another; i.e., the final clusters are quite similar to the initial ones; 3) the initial clusters cannot be chosen arbitrarily.

the scoring function
the first problem is due to the fact that $g$ scores a document $d_i$ against a profile $p_j$ by simply adding up the rank values of all the concepts in $d_i$ which appear in $p_j$. if $d_i$ contains a larger number of concepts than $d_k$, the chances are greater for $d_i$ to receive a higher score. figure 1 is a plot of the score of the document against its final profile vs. the number of concepts in the document for one of the adi runs.

[fig. 1. initial scoring function. score vs. no. of concepts/document.]
[fig. 2. modified scoring function. score vs. no. of concepts/document.]

although there are a few exceptions, the graph indicates that the documents with a larger number of concepts generally receive higher scores. in fact, the average number of concepts in a loose document is eleven, while the average number of concepts per document for the entire collection is twenty. the solution to this problem is to weight the score inversely by the number of concepts in the document. the obvious answer is to divide the score by the number of concepts, but this overcompensates and gives many of the smaller documents the highest scores. dividing by the square root of the number of concepts in the document does not solve the original problem; i.e., larger documents still receive higher scores.
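the normalizations discussed here differ only in the exponent applied to the number of concepts. a minimal sketch makes that exponent a parameter; the function name is hypothetical, the two sample documents are invented purely for illustration, and the default of 7/8 is the value the text settles on in the next paragraph.

```python
def normalized_score(raw_score, n_concepts, exponent=7 / 8):
    """Divide a raw score g by (number of concepts) ** exponent."""
    if n_concepts <= 0:
        raise ValueError("a document must contain at least one concept")
    return raw_score / n_concepts ** exponent


# Hypothetical documents: (raw score g, number of concepts).
small_doc = (40, 10)
large_doc = (120, 40)

for exponent in (1.0, 0.5, 7 / 8):
    small = normalized_score(*small_doc, exponent)
    large = normalized_score(*large_doc, exponent)
    print(f"exponent={exponent:.3f}: small={small:.2f}, large={large:.2f}")
```

an exponent of 1 favors the small document and an exponent of 1/2 favors the large one, which is the trade-off the text describes; intermediate exponents sit between the two.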
satisfactory results are obtained when the score is divided by (# of concepts per document)$^{7/8}$. figure 2 represents the same adi sample as figure 1, except that the new scoring function $h = g / (\text{\# concepts per document})^{7/8}$ is used. unlike the function $g$, $h$ seems to be independent of the number of concepts in the document.

movement of documents
the second problem is clearly indicated by examination of the results of the classification. table 5 shows the initial and final clusters for the adi collection. the problem occurs because the documents tend to "stick" to the clusters that they are already in. this problem is solved by a method similar to that used by doyle.

table 5. final results of adi classification

cluster   initial documents   final documents
1         1-12                1-11, 13, 21, 30, 33, 34, 40, 43, 51, 68
2         13-24               3, 10, 13-24, 26, 33, 34, 53, 69, 79
3         25-36               9, 11, 13, 20, 22, 23, 25-28, 30-34, 36, 47, 51, 55, 65, 75
4         37-48               4, 7-9, 14, 20, 30, 37-48, 51, 69
5         49-60               1, 5, 7, 20, 30, 32, 45, 47, 51-53, 55-59, 79, 80
6         61-71               2, 9, 27, 30, 47, 51, 61, 62, 64-71
7         72-82               10, 40, 51, 72-75, 77-81
loose                         12, 29, 35, 49, 50, 54, 60, 63, 76, 82

during the first few iterations, documents should be allowed to move freely from cluster to cluster, until a nucleus is formed within each cluster. the nucleus consists of those documents that are most highly correlated to one another. once the nucleus is formed, these documents will probably not move from their present clusters. clusters can be forced to contain only very highly correlated documents by raising the cut-off value $t$, assuming that documents with the highest scores are most similar to the other documents in the cluster. this assumption is investigated later. however, raising the cut-off value results in a larger number of loose documents.
this is resolved by repeating the classification for a lower value of $t$, but using the clusters from the first classification as the initial clusters. this raises the question of how to determine the initial value of $t$, and how much to decrement it each time the classification is repeated. the initial value of $t$ should be high enough so that only those documents which score very highly against profile $p_j$ are assigned to $s_j$. one method of achieving this is to pick $t$ so that the clusters after the first iteration average $q$ documents, where $q$ is small compared to the total number of documents. in the experiments run so far, $q$ is arbitrarily set at 4. after termination of the first classification, a nucleus is formed within each cluster. $t$ is now chosen so that a certain percentage of the loose documents are assigned to clusters after the first iteration of the second classification. assuming it is desirable to have approximately $x$ percent of the documents loose after the final clusters are formed, two approaches are possible: 1) $t$ is lowered far enough so that only $x$ percent of the documents remain loose after the first iteration; thus, after termination of the second classification, the clusters represent the final results; 2) $t$ is lowered just enough to allow a certain percentage of the loose documents to be assigned to clusters after the first iteration; thus, the classification is repeated until approximately $x$ percent of the documents remain loose. experiments performed using both methods indicate that the second approach allows greater control of the loose documents, with only slightly greater execution times. after the first classification, a large proportion of the documents still remain loose. therefore, if $x$ is not too high, method 1) decreases $t$ by a large amount. this injects many new documents into the clusters, and several iterations are necessary before termination occurs.
also, $t$ is chosen so that the percentage of loose documents is $x$ at the end of the first iteration, but it is impossible to know beforehand the percentage of loose documents after the final iteration. in general, the more iterations, the more the final percent varies from the percent after the first iteration. in method 2), $t$ is lowered just enough to allow a fairly small percentage (20% in the present experiments) of the loose documents to be assigned to clusters. this normally results in only a few iterations before termination occurs; therefore, the final percent of loose documents does not vary much from the percent loose after the first iteration.

the adi collection is reclassified using the procedures described above, where it is desired that about 25% of the documents remain loose. once again seven initial clusters are used, and the initial value of $t$ is calculated to be 28.2 so that the clusters after the first iteration average four documents. however, in this case cluster 3 is assigned ten documents, while clusters 1, 5, and 6 contain only one document. thus, these three clusters are eliminated, and the documents within them become loose. after termination occurs, the final clusters are used as initial clusters for the next classification, where $t$ is set to 19.1. the process is repeated again for $t = 16.8$, and after termination 17% of the documents remain loose. table 6 shows the final results of this classification. compared with table 5, many more of the documents have moved from their initial clusters.

table 6.
final results of new adi classification

initial documents (clusters 1-7): 1-12; 13-24; 25-36; 37-48; 49-60; 61-71; 72-82
final documents (the four clusters that remain):
  3, 5, 9, 10, 14-17, 20-28, 30, 34, 37, 43, 45, 48, 53, 57-59, 64, 68, 69, 72, 79, 80
  1, 2, 5, 6, 8, 11, 13, 20, 21, 24, 27, 28, 30, 36, 39, 41, 43, 47, 51, 53, 55, 56, 58, 61, 62, 65-68, 70, 71, 79, 80
  7, 31, 42, 44, 46
  4, 9, 19, 32, 40, 51, 73-75, 78, 81
loose: 12, 18, 29, 35, 38, 49, 50, 52, 54, 60, 63, 76, 77, 82

table 7. score vs. average correlation for adi classification

cluster 1                        cluster 2
document  score  avg. corr.      document  score  avg. corr.
25        19.1   .08             8         18.7   .09
5         19.6   .12             5         18.8   .12
64        20.1   .10             20        18.8   .12
23        20.2   .13             68        18.9   .10
27        20.3   .10             2         19.2   .12
34        20.6   .11             70        19.2   .13
15        20.6   .11             39        19.2   .14
37        20.7   .14             28        19.3   .11
48        20.8   .12             58        19.4   .12
58        20.9   .12             36        19.5   .14
28        21.0   .12             61        19.6   .11
53        21.0   .14             56        19.6   .14
20        21.0   .14             66        19.7   .12
68        21.1   .12             67        19.9   .12
80        21.2   .14             80        20.0   .14
57        21.2   .13             43        20.0   .14
59        21.3   .15             33        20.0   .15
14        21.4   .13             21        20.0   .15
16        21.5   .13             11        20.0   .14
79        21.5   .15             65        20.0   .15
43        21.6   .16             27        20.0   .13
24        21.6   .15             71        20.1   .14
69        21.7   .14             41        20.2   .15
26        21.7   .16             79        20.4   .16
17        21.8   .15             24        20.4   .15
72        21.8   .17             13        20.5   .15
21        22.0   .17             51        20.6   .12
3         22.0   .17             53        20.7   .16
9         22.1   .17             6         21.0   .18
30        22.2   .17             62        21.4   .18
22        22.3   .17             55        21.5   .17
45        22.4   .18             1         21.7   .20
10        22.4   .17             30        21.8   .18

cluster 3                        cluster 4
document  score  avg. corr.      document  score  avg. corr.
31        31.0   .21             32        24.5   .10
46        31.2   .05             51        25.0   .15
44        31.3   .14             74        26.0   .16
7         31.8   .24             4         26.3   .18
42        33.2   .28             75        26.4   .16
                                 9         26.5   .19
                                 19        26.9   .17
                                 73        27.0   .19
                                 78        27.1   .18
                                 40        27.4   .24
                                 81        27.6   .21

table 8. score vs. average correlation for cranfield classification

cluster 1                        cluster 2
document  score  avg. corr.      document  score  avg. corr.
26        22.0   .12             38        19.9   .12
6         22.3   .13             97        20.2   .13
7         22.4   .13             15        20.3   .11
117       23.0   .14             1         20.3   .13
2         23.1   .14             34        20.4   .13
121       23.2   .15             145       20.4   .14
13        23.3   .15             171       20.8   .12
19        23.4   .16             172       20.9   .13
60        23.7   .15             30        20.9   .14
23        23.8   .17             4         21.1   .15
18        23.8   .17             140       21.1   .15
44        24.1   .18             72        21.2   .15
183       24.2   .17             138       21.3   .15
116       24.3   .18             143       21.3   .15
128       24.5   .18             141       21.3   .14
61        24.6   .18             27        21.6   .13
9         24.6   .18             36        21.7   .13
197       24.7   .19             157       21.7   .17
16        24.7   .20             59        21.8   .16
198       24.7   .17             156       21.8   .16
3         24.8   .18             200       21.9   .16
25        24.9   .20             32        22.0   .18
28        25.0   .21             137       22.4   .15
115       25.1   .21             29        22.5   .19
58        25.6   .21             148       22.5   .17
181       25.6   .20             57        22.8   .18
56        25.9   .21             128       22.8   .15
160       26.6   .23             44        23.1   .19
                                 31        23.2   .19
                                 139       23.9   .19
                                 56        23.4   .18
                                 160       24.3   .18
                                 58        25.3   .21

cluster 3
document  score  avg. corr.
179       31.4   .19
154       32.5   .24
79        32.8   .27
133       32.9   .29
134       33.4   .28
77        33.7   .27
132       33.8   .32
78        34.1   .30
76        34.3   .34
74        34.4   .34
75        34.5   .36

initial clusters
in the present study, the initial clusters are determined by assigning the first $p$ (or possibly $p+1$) documents to cluster 1, the next $p$ (or $p+1$) to cluster 2, . . . , and the final $p$ to cluster $m$, where $p$ = (total number of documents)/$m$. since the nucleus of each cluster depends quite strongly on the initial clusters, it is not surprising that different initial clusters lead to different results. if the initial clusters are chosen at random, it is unlikely that the documents within each cluster are very similar. thus, the nucleus of each cluster might not be very tight. this problem is solved by ensuring that the initial clusters contain at least a few documents that are highly correlated. in the adi and cranfield collections, the order of the documents is such that many adjacent documents are quite similar; therefore, most of the initial clusters contain a few highly correlated documents. in collections where the order of the documents is random, a simple, fast clustering scheme can be used to determine the initial clusters.
this type of algorithm need only perform document-document correlations within a fraction of the document space, and therefore should not take up much time.

evaluation of results
the assumption was made earlier that those documents of a cluster $s_j$ that score highest against the corresponding profile $p_j$ are most similar to the other documents in the cluster. the phrase "most similar" is used to mean "correlate most highly", where a standard correlation function is used. table 7 compares the score of each document to the average correlation (unweighted cosine function) of each document with every other document in the cluster. the documents are arranged in ascending order by scores, and hopefully, the correlations will also appear in ascending order. as the table indicates, there is a strong tendency for the higher scores to correspond to the higher correlations. table 8 illustrates the same results for three out of seven final clusters from the cranfield collection.

so far nothing has been said about how to choose the base value that is used to compute rank values. this integer has an important effect on the type of clusters produced. recall that the rank value of a concept equals the base value $b$, minus its rank. suppose a cluster $s_1$ contains four documents $d_1$-$d_4$, and a total of twenty different concepts. the lowest possible rank value for any concept = $b - 4$, since 4 is the lowest possible rank. if $b = 20$, then the lowest rank value is 16, while if $b = 5$, the lowest rank value is 1. consider a document $d_i$ which is the same as $d_1$ except for one concept, and assume this concept does not occur in $p_1$. with $b = 20$, $g(d_i, p_1)$ is between 16 and 20 points less than $g(d_1, p_1)$; with $b = 5$, $g(d_i, p_1)$ is only 1 to 5 points less. since large clusters have profiles containing many concepts, the chances of a random document $d_i$ having concepts in the profile of a large cluster are greater than the chances of $d_i$ having concepts in the profile of a small cluster.
therefore, if $b$ is high, $d_i$ will score much lower against the profile of the small cluster, and large clusters will tend to capture all the remaining loose documents at the expense of the smaller clusters. experimental results support this hypothesis, i.e., a large base value produces a few clusters with many documents, and many clusters with only a few documents. if, on the other hand, $b$ is set so that the lowest rank value in an average cluster is 1, then there is a tendency for small clusters to get larger and large clusters to get smaller. in smaller than average clusters, all the rank values are high, since there are only a few different ranks. in larger than average clusters, the rank value as defined might become zero or even less than zero. in these cases, it is redefined to be 1, but then it is possible for many concepts to have a rank value of 1. thus, a document often scores higher against the profiles of smaller clusters.

the results of the cranfield classification clearly indicate the ability of a document to score higher against profiles of smaller clusters. during the classification, nine clusters are generated, and cluster 9 starts to grow much larger than average (average = 22 documents). it keeps growing until it contains 27 documents, and then it starts to oscillate. the following numbers indicate the number of documents in cluster 9 on successive iterations: 27, 21, 34, 17, 56, 0. thus, cluster 9 is eliminated. the same thing happens to cluster 8 on the next few iterations. although this tends to keep the size of the clusters somewhat uniform, it is not desirable to throw away a cluster which might contain many highly correlated documents. one solution which might be implemented is to split up large clusters into several smaller ones; i.e., classify the documents within a single cluster.
if the number of documents in the cluster is not too large, it might be practicable to use an n² algorithm to do this.

conclusion

the classification algorithm that has been described in the preceding two sections requires the following parameters as input:
1) maximum number of clusters desired;
2) approximate percentage of loose documents desired;
3) decision on whether or not loose documents should be "blended" into the nearest cluster at the end of the classification;
4) amount of overlap desired.

the first parameter specifies the number of initial clusters that are formed. if no clusters are eliminated during the evaluation, then the maximum number is actually generated. the experiments run so far indicate that the number of clusters produced is usually only about 60% of the maximum. the next two parameters determine the "tightness" of the final clusters; the higher the percentage of loose documents, the tighter the clusters. if no loose documents are desired, parameter 2 can be set to 0, but very low percentages increase the running time of the program. almost identical results are obtained in less time by specifying about 15% loose, and then asking for all loose documents to be assigned to the cluster to which they score highest.

48 journal of library automation vol. 2/1 march, 1969

the last parameter determines the amount of overlap. this number corresponds to a in the formula

$$
T_{n,i} =
\begin{cases}
H_{n-1,i} - a\,(H_{n-1,i} - T), & \text{if } H_{n-1,i} > T \\
T, & \text{otherwise}
\end{cases}
$$

which was mentioned under implementation. when a = 0, no overlap is produced, and with a = 1, the maximum amount of overlap is produced. the actual percentage of overlap for a given value of a depends on the collection, but results indicate about 10% overlap for a = 0.4 and about 20% for a = 0.6. although the algorithm is not guaranteed to terminate, convergence has always been obtained in practice.
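the role of the overlap parameter a can be illustrated with a short sketch. it rests on assumptions not stated in this section: h is read as the highest score a document obtained against any profile on the previous iteration, t as the base cutoff score, and a document is taken to join every cluster whose profile score reaches its threshold. the function names and the sample scores are hypothetical, and for simplicity the same set of scores stands in for both iterations.

```python
# Illustrative sketch only; the meanings given to H and T are assumptions.


def document_threshold(highest_prev, cutoff, a):
    """Per-document threshold; a in [0, 1] controls the amount of overlap."""
    if not 0.0 <= a <= 1.0:
        raise ValueError("a must lie between 0 and 1")
    if highest_prev > cutoff:
        # Slide the threshold from the top score (a = 0) down to the cutoff (a = 1).
        return highest_prev - a * (highest_prev - cutoff)
    return cutoff


def assigned_clusters(scores, cutoff, a):
    """Clusters whose profile score reaches the document's threshold.

    Simplification: the scores passed in are also used as the
    'previous iteration' scores when finding the highest score.
    """
    threshold = document_threshold(max(scores.values()), cutoff, a)
    return sorted(c for c, s in scores.items() if s >= threshold)


if __name__ == "__main__":
    # Hypothetical scores of one document against three cluster profiles.
    scores = {1: 40, 2: 34, 3: 22}
    for a in (0, 0.4, 1):
        print(a, assigned_clusters(scores, cutoff=20, a=a))
```

under these assumptions, a = 0 keeps the document only in its best-scoring cluster, while a = 1 admits it to every cluster it scores at or above the cutoff against; a document whose best score falls below the cutoff is assigned nowhere and stays loose.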
in order to prevent the program from looping in cases of non-convergence, the algorithm can be modified to permit a maximum of n iterations, whether or not convergence is obtained. the results indicate that clusters change very little after about four or five iterations, so that this modification would not make much difference in the final clusters. the true evaluation of the final clusters can only be made by actually performing two-level searches on the clustered document space. however, the algorithm is sufficiently general to allow for the evaluation of many different types of clusters.

references

1. jones, s. k.; jackson, d.: "current approaches to classification and clump-finding at the cambridge language research unit," the computer journal, 10 (may 1967).
2. doyle, l. b.: breaking the cost barrier in automatic classification, sdc paper sp-2516 (july 1966).
3. needham, r. n.: the termination of certain iterative processes. the rand corporation memorandum rm-5188-pr (november 1966).
4. salton, g.; lesk, m. e.: computer evaluation of indexing and text processing, information storage and retrieval; report isr-12 to the national science foundation, section iii. (ithaca, n.y.: cornell university, department of computer science, august, 1967).

editorial: and now for something (completely) different

marc truitt

marc truitt (marc.truitt@ualberta.ca) is associate university librarian, bibliographic and information technology services, university of alberta libraries, edmonton, alberta, canada, and editor of ital.

the issue of ital you hold in your hands—be that issue physical or virtual; we won’t even go into the question of your hands!—represents something new for us. for a number of years, ex libris (and previously, endeavor information systems) has generously sponsored the lita/ex libris (née lita/endeavor) student writing award competition.
the competition seeks manuscript submissions from enrolled lis students in the areas of ital’s publishing interests; a lita committee on which the editor of ital serves as an ex-officio member evaluates the entries and names a winner. traditionally, the winning essay has appeared in the pages of ital. in recent years, perhaps mirroring the waning interest in publication in traditional peer-reviewed venues, the number of entrants in the competition has declined. in 2008, for instance, there were but nine submissions, and to get those, we had to extend the deadline six weeks from the end of february to mid-april. in previous years, as i understand it, there often were even fewer. this year, without moving the goalposts, we had—hold onto your hats!—twenty-seven entries. of these, the review committee identified six finalists for discussion. the turnout was so good, in fact, that with the agreement of the committee, we at ital proposed to publish not only the winning paper but the other finalist entries as well. we hope that you will find them as stimulating as have we. even more importantly, we hope that by publishing such a large group of papers representing 2009’s best in technology-focused lis work, we will encourage similarly large numbers of quality submissions in the years to come. i would like to offer sincere thanks to my university of alberta colleague sandra shores, who as guest editor for this issue worked tirelessly over the past few months to shepherd quality student papers into substantial and interesting contributions to the literature. she and managing editor judith carter—who guest-edited our recent discovery issue—have both done fabulous jobs with their respective ital special issues. bravo!

■ ex libris’ sponsorship

in one of those ironic twists that one more customarily associates with movie plots than with real life, the lita/ex libris student writing award recently almost lost its sponsor.
at very nearly the same time that sandra was completing the preparation of the manuscripts for submission to ala production services (where they are copyedited and typeset), we learned that ex libris had notified lita that it had “decided to cease sponsoring” the student writing award. a brief round of e-mails among principals at lita, ex libris, and ital ensued, with the outcome being that carl grant, president of ex libris north america, graciously agreed to continue sponsorship for another year and reevaluate underwriting the award for the future. we at ital and i personally are grateful. carl’s message about the sponsorship raises some interesting issues on which i think we should reflect. his first point goes like this: it simply is not realistic for libraries to continue to believe that vendors have cash to fund these things at the same levels when libraries don’t have cash to buy things (or want to delay purchases or buy the product for greatly reduced amounts) from those same vendors. please understand the two are tied together. point taken and conceded. money is tight. carl’s argument, i think, speaks as well to a larger, implied question. libraries and library vendors share highly synergistic and, in recent years, increasingly antagonistic relationships. library vendors—and i think library system vendors in particular—come in for much vitriol and precious little appreciation from those of us on the customer side. we all think they charge too much (and by implication, must also make too much), that their support and service are frequently unresponsive to our needs, and that their systems are overly large, cumbersome, and usually don’t do things the way we want them done. 
at the same time, we forget that they are catering to the needs and whims of a small, highly specialized market that is characterized by numerous demands, a high degree of complexity, and whose members—“standards” notwithstanding—rarely perform the same task the same way across institutions. we expect very individualized service and support, but at the same time are penny-pinching misers in our ability and willingness to pay for these services. we are beggars, yet we insist on our right to be choosers. finally, at least for those of us of a certain generation—and yep, i count myself among its members—we chose librarianship for very specific reasons, which often means we are more than a little uneasy with concepts of “profit” and “bottom line” as applied to our world. we fail to understand the open-source dictum that “free as in kittens and not as in beer” means that we will have to pay someone for these services—it’s only a question of whom we will pay. carl continues, making another point: i do appreciate that you’re trying to provide us more recognition as part of this. frankly, that was another consideration in our thought of dropping it—we just didn’t feel like we were getting much for it.

4 information technology and libraries | march 2010

i’ve said before and i’ll say again, i’ve never, in all my years in this business had a single librarian say to me that because we sponsored this or that, it was even a consideration in their decision to buy something from us. not once, ever. companies like ours live on sales and service income. i want to encourage you to help make librarians aware that if they do appreciate when we do these things, it sure would be nice if they’d let us know in some real tangible ways that show that is true. . . .
good will does not pay bills or salaries unless that good will translates into purchases of products and services (and please note, i’m not just speaking for ex libris here, i’m saying this for all vendors). and here is where carl’s and my views may begin to diverge. let’s start by drawing a distinction between vendor tchotchkes and vendor sponsorship. in fairness, carl didn’t say anything about tchotchkes, so why am i? i do so because i think that we need to bear in mind that there are multiple ways vendors seek to advertise themselves and their services to us, and geegaws are one such. trinkets are nice—i have yet to find a better gel pen than the ones given out at iug 14 (would that i could get more!)—but other than reminding me of a vendor’s name, they serve little useful purpose. the latter, vendor sponsorship, is something very different, very special, and not readily totaled on the bottom line. carl is quite right that sponsorship of the student writing award will not in and of itself cause me to buy aleph, primo, or sfx (oh right, i have that last one already!). these are products whose purchase is the result of lengthy and complex reviews that include highly detailed and painstaking needs analysis, specifications, rfps, site visits, demonstrations, and so on. due diligence to our parent institutions and obligations to our users require that we search for a balance among best-of-breed solutions, top-notch support, and fair pricing. those things aren’t related to sponsorship. what is related to sponsorship, though, is a sense of shared values and interests. 
of “doing the right thing.” i may or may not buy carl’s products because of the considerations above (and yes, ex libris fields very strong contenders in all areas of library automation); i definitely will, though, be more likely to think favorably of ex libris as a company that has similar—though not necessarily identical—values to mine, if it is obvious that it encourages and materially supports professional activities that i think are important. support for professional growth and scholarly publication in our field are two such values. i’m sure we can all name examples of this sort of behavior: in addition to support of the student writing award, ex libris’ long-standing prominence in the national information standards organization (niso) comes to mind. so too does the founding and ongoing support by innovative interfaces and the library consulting firm r2 for the taiga forum (http://www.taigaforum.org/), a group of academic associate university librarians. to the degree that i believe ex libris or another firm shares my values by supporting such activities—that it “does the right thing”—i will be just a bit more inclined to think positively of it when i’m casting about for solutions to a technology or other need faced by my institution. i will think of that firm as kin, if you will. with that, i will end this by again thanking carl and ex libris—because we don’t say thank you often enough!—for their generous support of the lita/ex libris student writing award. i hope that it will continue for a long time to come. that support is something about which i do care deeply. if you feel similarly—be it about the student writing award, niso, taiga, or whatever—i urge you to say so by sending an appropriate e-mail to your vendor’s representative or by simply saying thanks in person to the company’s head honcho on the ala exhibit floor. 
and the next time you are neck-deep in seemingly identical vendor quotations and need a way to figure out how to decide between them, remember the importance of shared values.

■ dan marmion

longtime lita members and ital readers in particular will recognize the name of dan marmion, editor of this journal from 1999 through 2004. many current and recent members of the ital editorial board—including managing editor judith carter, webmaster andy boze, board member mark dehmlow, and i—can trace our involvement with ital to dan’s enthusiastic period of stewardship as editor. in addition to his leadership of ital, dan has been a mentor, colleague, boss, and friend. his service philosophy is best summarized in the words of a simple epigram that for many years has graced the wall behind the desk in his office: “it’s all about access!!” because of health issues, and in order to devote more time to his wife diana, daughter jennifer, and granddaughter madelyn, dan recently decided to retire from his position as associate director for information systems and digital access at the university of notre dame hesburgh libraries. he also will pursue his personal interests, which include organizing and listening to his extensive collection of jazz recordings, listening to books on cd, and following the exploits of his favorite sports teams, the football irish of notre dame, the indianapolis colts, and the new york yankees. we want to express our deep gratitude for all he has given to the profession, to lita, to ital, and to each of us personally over many years. we wish him all the best as he embarks on this new phase of his life.

president’s message
cindi trainor
information technologies and libraries | september 2013 1

it's fall already, and for lita that means some exciting things! the program planning committee is hard at work evaluating the sixty-plus program proposals that have been submitted.
the new conference format specified by ala means that we have 20 slots for lita programs at annual, including the president's program and top tech trends. the ppc certainly has its work cut out! well before midwinter in philadelphia will be national forum 2013. i'm so excited that this year's forum is going to be in my home state of kentucky. join us from november 7-10 in "luhvuhl" (louisville) for great keynotes, preconference workshops, concurrent sessions, and of course, networking opportunities. travis good from make magazine will deliver our opening keynote, nate hill from chattanooga public library is up on saturday, and emily gore from the digital public library of america closes out the forum on sunday. i hope you'll join us for an exciting forum in the bluegrass state. in governance news, the board of directors identified three goal areas on which we will concentrate this year: stabilizing the budget, engaging members, and growing membership. we are eagerly awaiting the final report of the financial stability task force, led by tom wilson and andrew pace; they presented preliminary findings at the board meeting in chicago in july. we are currently updating the strategic plan, which was last updated in 2010—watch the lita-l list and the blog for more. we would love your input on them! if you're interested in lita governance, check out the board's space on ala connect. many discussion posts are open, and we welcome your comments! you can also complete this form (http://www.ala.org/lita/about/board/contact) to reach out to the board anytime. the board will be having several meetings this fall: the executive committee is tentatively meeting september 30; the forum steering committee will meet after forum 2013; the budget review committee will be meeting at the ala joint boards meeting in late october; and the entire board will have an online meeting before we convene in person in philadelphia. 
watch lita-l for the announcements; we welcome you as guests to all our meetings. finally, for those of you interested in leadership but not necessarily ready to run for board, i want to point you to documents put together by the lita emerging leaders team in 2013: http://connect.ala.org/node/197839

our team of three ala emerging leaders, margaret heller, zach coble, and katie heidgerken-greene, surveyed lita leaders, worked with committee chairs coordinator michelle frisque and ig chairs coordinator paul keith, and synthesized tons of information and many documents into the leadership guide for new chairs of committees and interest groups (http://connect.ala.org/node/209032). they also created a sample leadership game (http://www.gloriousgeneralist.com/leadership.html) to test our leadership knowledge, and presented their work at the emerging leaders poster session in chicago. the leadership guide will inform future orientation activities for committee and ig chairs and will then be handed over to the bylaws & organization committee to be incorporated into the lita manual.

cindi trainor (cindiann@gmail.com) is lita president 2013-14 and community specialist & trainer for springshare, llc.

thank you for being a lita member! i hope to see you at a future event. my fellow board members and i welcome your comments and suggestions for program ideas, workshop or online class ideas, and for how we can keep lita awesome.
:)

62 information technology and libraries | june 2011

jason vaughan and kristen costello
management and support of shared integrated library systems

the second major hardware migration occurred, and an initial memorandum of understanding (mou) was drafted by the unlv libraries. this mou is still used by the libraries. the mou was discussed with all partners and ultimately signed by the director of each library. since the mou was signed nearly a decade ago, the system has continued to grow by all measures—size of the database, number of users, number of software modules comprising the complete system, and the financial and staff commitment toward support and maintenance. despite the emergence of a large number of other network-based technologies critical to library operations and services, the ils remains a critical system that supports many library operations. the research described in this paper developed in part because there is a dearth of published survey-based research of shared ils management and financial support. this article interweaves local existing practices with research findings. for brevity’s sake, the system shared by the unlv university libraries and four additional partners will be referred to as unlv’s system. to provide a relative sense of the footprint of each partner on the system, various measures can be used (see figure 1).

■■ survey method

in april 2010, the authors administered a 20-question survey to the innovative user’s group (iug) via the group’s listserv. the survey focused on libraries that are part of a consortial or otherwise shared innovative ils. the innovative user’s group is the primary user’s group associated with the innovative ils and suite of products.
the iug hosts a busy listserv, coordinates the annual north american conference devoted solely to the innovative system, and provides innovative customer-driven enhancement requests. to prevent multiple individuals from the same consortium responding to the survey, instructions indicated that only one individual from the main institution hosting the system should officially respond. given the anonymity of the survey and the desire to provide confidentiality, there is the possibility that some survey responses refer to the same system. the survey consisted primarily of multiple choice, “select all that apply,” and free-text response questions. the survey was divided into four broad topical areas: (1) background information; (2) funding; (3) support; and (4) training, professional development, and planning. the survey was open for a period of three weeks. because respondents could choose to skip questions, the number of responses received per question varied. on average, 43 individual responses were received for each question. innovative currently has more than 1,200 millennium ils installations.2 not all of those installations support multiple, administratively separate library entities. it is unknown

the university of nevada, las vegas (unlv) university libraries has hosted and managed a shared integrated library system (ils) since 1989. the system and the number of partner libraries sharing the system have grown significantly over the past two decades. spurred by the level of involvement and support contributed by the host institution, the authors administered a comprehensive survey to current innovative interfaces libraries. research findings are combined with a description of unlv’s local practices to provide substantial insights into shared funding, support, and management activities associated with shared systems.

since 1989, the university of nevada, las vegas university libraries has hosted and managed a shared integrated library system (ils).
currently, partners include the university of nevada, las vegas university libraries (consisting of one main and three branch libraries, and hereafter referred to as unlv libraries); the administratively separate unlv law library; the college of southern nevada (a community college system consisting of three branch libraries); nevada state college; and the desert research institute. the original ils installation included just the unlv libraries and the clark county community college (now known as the college of southern nevada). the desert research institute joined in the early 1990s, the unlv law library joined with the establishment of the william j. boyd school of law in 1998, and, finally, nevada state college joined upon its creation in 2002. over time, the technological underpinnings of the ils have changed tremendously and have migrated firmly into a web-based environment unknown in 1989. the system was migrated to innovative interfaces’ current java-based platform, millennium, beginning in 1999. since the original installation, there have been three major full hardware migrations, in 1997, 2002, and 2009. over time, regular innovative software updates, as well as additional purchased software modules, have greatly extended both the staff and end user functionality of the ils. in early 2001, unlv and its partners conducted a marketplace assessment of ils vendors catering to academic customers.1 the assessment reaffirmed the consortium’s commitment to innovative interfaces. shortly thereafter,

jason vaughan (jason.vaughan@unlv.edu) is director, library technologies, university of nevada las vegas.
kristen costello (kristen.costello@unlv.edu) is systems librarian, university of nevada las vegas.
partners originally purchased the system together; 20 (38.5 percent) indicated they purchased the system with some of their current existing partners, while 9 (17.3 percent) indicated they as the main institution originally and solely purchased the system. several of the entities sharing the unlv libraries’ system did not even exist when the ils was originally purchased; only two of the current partners shared the original purchase cost of the system. another background question sought to understand how partners potentially individualize the system despite being on a shared platform. innovative, and likely other similar ils vendors, offers several products to help libraries better manage and control their holdings and acquisitions. of potential benefit to staff operations and workflow, innovative offers the option to have multiple acquisitions and/or serials control units, which provide separate fund files and ranges of order records for different institutions sharing the ils system. of 51 responses received, 44 respondents (86.3 percent) indicated they had multiple acquisitions and serials units and 7 (13.7 percent) do not. innovative offers two web-based discovery interfaces for patrons: the traditional online public access catalog, known as webpac, and their version of a next-generation discovery layer, known as encore. of potential benefit to staff as well as patrons, innovative offers “scoping” modules that help patrons using one of the web-based discovery interfaces, as well as staff using the millennium staff modules. the scoping module allows holdings segmentation by location or material type. scopes allow libraries to define their collections and offer their patrons the option to search just the collection of their applicable library. forty-six (88.5 percent) of the 52 respondents indicated they use scoping and 6 (11.5 percent) do not. unlv
unlv how many shared innovative library systems exist. while a true response rate cannot be determined, such a measure is not critical for this research. the survey questions with summarized results are provided in appendix a. ■■ survey background unlv’s system, with only five unique library entities, is a “small” system when compared with survey responses. survey respondents indicated a range from 2 to 80 unique members sharing their system. of the 48 responses received for this background question, 26 (54 percent) indicated 10 or fewer partners on the system. seven (14.6 percent) indicated 40 or more partners. the average number of partners sharing an ils implementation was 18 and the median was 8.5. there can be varying levels of partnership within a shared ils system. unlv’s instance is a rather informal partnership. some survey respondents indicated the existence of a far more structured or dedicated support group not directly associated with any particular library. one respondent noted they have a central office comprised of an executive director and two additional staff, responsible for ils administration; this central office reports to a board of directors, comprised of library directors for each member library. another indicated they have a central office responsible not only for the ils, but for other things such as wide and local area networks and workstation support. one respondent indicated that they are actually a consortium of consortia, with 9 hosts each comprised of anywhere from 4 to 11 libraries. 
twenty-three respondents out of 52 (44.2 percent) indicated that they and all of their current existing

| | full-time library staff | bibliographic records | item records | order records | patron records | staff login licenses |
|---|---|---|---|---|---|---|
| unlv libraries | 105 (70.9%) | 1,494,890 (78.2%) | 1,906,225 (81.1%) | 74,223 (58.4%) | 40,788 (59.6%) | 85 (69.1%) |
| unlv law library | 13 (8.8%) | 246,678 (12.9%) | 243,788 (10.4%) | 29,921 (23.5%) | 2,034 (3%) | 13 (10.6%) |
| college of southern nevada | 27 (18.2%) | 146,118 (7.6%) | 175,862 (7.5%) | 22,142 (17.4%) | 23,876 (34.9%) | 20 (16.3%) |
| nevada state college | 1 (.7%) | 17,787 (.9%) | 17,979 (.8%) | 841 (.7%) | 1,718 (2.5%) | 3 (2.4%) |
| desert research institute | 2 (1.4%) | 5,396 (.3%) | 5,361 (.2%) | 0 (0%) | 24 (<.1%) | 2 (1.6%) |

figure 1. various measures of ils footprints for unlv’s shared ils (percentage of overall system)
note: “staff login licenses” refers to the number of simultaneous staff users each institution can have on the system at any given time.

share of funding toward annual maintenance based on their number of staff licenses, as shown in figure 1.

■■ funding support from partners

mous appear to include funding and budgeting information more than any other discrete topic. direct support costs can include the maintenance support costs paid to one or more vendors, costs for additional vendor authored software modules purchased in addition to the base software, and, perhaps, licensing costs associated with a database or operating system used by the ils (e.g., an oracle license for oracle based ils systems). there are many parameters by which costs could be determined for partners, and, given the dearth of published research on the topic, a chief focus of this research sought more information on what factors were used by other consortia. the authors brainstormed 10 elements that could potentially figure into the overall cost sharing method.
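unlv’s license-based allocation of annual maintenance can be sketched in a few lines. the license counts below come from figure 1; the invoice amount is a made-up figure for illustration, and the function name is hypothetical rather than anything used by the authors.

```python
# Illustrative sketch: split an annual maintenance invoice in proportion to
# each partner's simultaneous staff login licenses (counts from figure 1).

staff_licenses = {
    "unlv libraries": 85,
    "unlv law library": 13,
    "college of southern nevada": 20,
    "nevada state college": 3,
    "desert research institute": 2,
}


def maintenance_shares(licenses, invoice):
    """Return each partner's share of the invoice, proportional to licenses."""
    total = sum(licenses.values())
    if total == 0:
        raise ValueError("at least one license is required")
    return {partner: invoice * count / total for partner, count in licenses.items()}


if __name__ == "__main__":
    # Hypothetical invoice amount; the article does not report actual costs.
    for partner, share in maintenance_shares(staff_licenses, 100_000.0).items():
        print(f"{partner}: {share:,.2f}")
```

because the shares are strictly proportional, each partner's percentage of the invoice matches the staff login license percentages reported in figure 1.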
thirty-eight respondents provided information on factors playing a role in their cost sharing arrangements, illustrated in figure 2. respondents could mark more than one answer for this question, as more than one factor could be involved. the top two factors relate directly to vendor costs—whether annual support costs or acquisition of new vendor software. hardware placed third in overall frequency; for innovative and likely for other ils systems, ils hardware can be purchased from the vendor or an approved platform can be sourced from a reseller directly. support costs from third parties and the number of staff login ports were each identified as a factor by more than a third of all respondents.

■■ software purchases

depending on the software, additional modules extending the system capabilities can benefit a single partner, or, in unlv’s experience, all partners on the system. traditionally, the unlv libraries have had the largest operating budget of the group, and a majority of new software requests have come internally from unlv libraries staff. over the past 20 years, the unlv libraries have fully funded the initial purchase costs of a majority of the software extending the system, regardless of whether it benefits just the unlv libraries or all system partners. there are numerous exceptions where the partner libraries have contributed funding, including significant start-up costs associated with the unlv law library joining the system in 1998 and the addition of nevada state college in 2002. in both instances, those bodies funded required and recommended software directly applicable

has multiple serials and acquisitions units as well as multiple scopes configured to help segment the records for each entity’s particular collection. innovative offers various levels of maintenance support.
unlv’s level of support includes the vendor supplying services such as application troubleshooting resolution, software updates, and some degree of operating system and hardware configuration and advice. unlv also contracts with the hardware vendor for hardware maintenance and underlying operating system support. the unlv libraries have had the opportunity to hire fully qualified and capable technical staff to provide a high level of support for the ils. unlv’s level of vendor support has evolved from an original full turnkey installation with innovative providing all support to a present level of more modest support. nearly half of all survey respondents, 25 of 52 (48.1 percent) indicated they had a turnkey arrangement with innovative; the remaining 27 respondents had a lesser level of support. maintenance and support obviously carry a cost with one or more third party providers. the majority of the respondents, 40 of 51 (78.4 percent), indicated there is a cost-sharing structure in place where maintenance support costs related to the ils are spread across partner libraries. six respondents (11.8 percent) indicated the main institution fully funds the maintenance support costs. the unlv libraries drafted the first and current mou in 2002 for all five entities sharing the ils system. thirty-five of 51 survey respondents (68.6 percent) indicated they, too, have a mou in place. unlv’s mou is a basic document, two pages in length, split into the following sections: background; acquisition of new or additional hardware; acquisition of new or additional software; annual maintenance associated with the primary vendor and third party suppliers and, importantly, the associated cost allocation method for how annual support costs are split between the partners; how new products are purchased from the vendor; and management and support responsibilities of the hosting institution. 
many of the survey respondents provided details on items contained in their own mous, which can be clustered into several broad categories. these include budgeting, payments, funding formulas; general governance and voting matters; support (e.g., contractual service responsibilities, responsibilities of member libraries); equipment (e.g., title and use of equipment, who maintains equipment); and miscellaneous. this latter category includes items such as expectations for record quality; network requirements/ restrictions; fine collection; and holds management. the majority of unlv’s mou addresses shared costs for annual maintenance. unlv’s cost-sharing structure is simple. the system has a particular number of associated staff (simultaneous login) licenses, which have gradually increased as the libraries have grown. logins are separated by institution, and each member is assessed their management and support of shared integrated library systems | vaughan and costello 65 annual maintenance bill and all partners help maintain new software acquisitions by contributing toward the annual maintenance. regarding new software acquisitions, cost-sharing practices varied between 44 respondents providing information in the survey. eight (18.2 percent) indicated there is consultation with other partners and there is some arrangement to share costs between the majority or all partners sharing the system. two respondents (4.5 percent) indicated the institution expressing the initial interest in the product fully funds the purchase. nineteen respondents (43.2 percent) indicated that they have had instances of both these scenarios (shared funding and sole funding). two respondents (4.5 percent) indicated they could not recall ever adding any additional software. thirteen respondents (29.5 percent) offered details to their operation such as additional serials and accounting units (for the law library), check-in and order records, and staff licenses. 
in addition, when the system was migrated from the aging text-based system (innopac) to the current millennium java-based gui system in 1999, the current partners contributed toward the upgrade cost based on number of staff licenses. partner institutions have continued to fund items of sole benefit to their operation, such as adding staff licenses or required network port interfaces associated with patron self-check stations installed at their facilities. during the 2000s, the unlv libraries have fully funded a majority of software of potential benefit to all partners, such as the electronic resource management module, the encore next generation discovery platform, and various opac/encore enhancements. software additions typically increase the

[figure 2. cost-sharing formula factors. factors charted: the amount of the overall yearly innovative interfaces maintenance/support invoice; the amount of any additional 3rd party maintenance/support agreements associated with the innovative system (such as contracts with the hardware manufacturer: hp, sun microsystems [oracle], etc.); the purchase cost(s) for newly acquired innovative modules/products; the purchase cost(s) for newly acquired hardware associated with the innovative system (such as a server, additional disk space, backup equipment, etc.); the number of incident reports (or time spent), by personnel at the main institution related to research, troubleshooting, etc. support issues reported by partner institutions; the “size” of the partner institution’s portion of the innovative system, as measured by institution fte; by number of bib or item records the partner’s institution has in the innovative database; by number of staff login ports dedicated to the partner library; by number of user searches conducted from ip ranges associated with the partner institution; and by the number of patron records whose home library is associated with the partner institution.]

applied, the number of staff users has increased significantly, and the system was migrated to an underlying oracle database in 2004. since the original system was purchased in 1989 and fully installed in 1990, the central, locally hosted server has been replaced three times, in 1997, 2002, and 2009. partners contributed toward the costs of the server upgrades in 1997 and 2002, while the unlv libraries fully funded the 2009 upgrade. software and hardware components comprising the backup system have been significantly enhanced with a modern system capable of the speed, capacity, and features needed to perform appropriately in the short backup window available each night.
unlv funded the initial backup software and hardware, and the partner institutions contribute toward the annual maintenance associated with the backup equipment and software. one survey question focused on major central infrastructure supporting the ils (defined as items exceeding $1,000 and with several examples listed). the question did not focus on hardware that could be provided by ils vendors benefiting a single partner, such as self-check stations or inventory devices. fourteen (31.8 percent) of the 44 respondents indicated that if major new hardware was needed, there was consultation with other partners, and, if purchased, a cost-sharing agreement was arranged. two respondents (4.5 percent) indicated the institution expressing the initial interest fully funds the purchase and seven respondents indicated they’ve had instances in the past of both these scenarios. three respondents (6.8 percent) indicated their shared system hardware had never been replaced or upgraded to their knowledge. nineteen respondents provided information on alternate scenarios or otherwise more details as to local practice. several indicated a separate fund is maintained solely for large ils system-related improvements or ils related purchases. revenue for these funds can be built up over time through maintenance and use payments by partner libraries or by a small additional fee earmarked for future hardware replacement needs collected each year. one respondent indicated they have been able to get grant funds to cover major purchases. with few exceptions, the majority of free text responses indicated that costs for major purchases were shared by partners or otherwise funded by the central consortium or cooperative agency. as with regular annual maintenance and new software purchases, various elements can determine what portion of hardware replacement costs are borne by partner libraries. 
this includes number of staff licenses (21.9 percent of responses), institutional fte count (15.6 percent), number of bibliographic or item records (15.6 percent), and number of patron records (9.4 percent). twenty respondents provided additional information. several indicated that the costs are split evenly across all partners. several indicated that population served was a factor. others reiterated that costs for central hardware on other scenarios. several indicated that if a product is directly applicable to only one library, such as self-check interfaces and additional acquisition units, then the library in need fully funds the purchase, which mirrors the local practice at unlv. several respondents indicated that if a product benefits all libraries, then costs are shared equally. one respondent indicated that the partner libraries discuss the potential item, and collectively they may choose not to purchase, even if one or more partners are very interested. in such cases, those partners have the option to purchase the product and must agree to make it available to all partners. several respondents indicated that, as the largest entity using the shared system, they generally always purchased new software for their operation as needed, with the associated benefit that the other partners of the system were allowed to use the software as well. three respondents reiterated that a central office funds add-on modules, in one case from funding set aside each year for system improvements. a fourth respondent indicated that a “joiners fee” fund, built up from new members joining the system, allows for the purchase of new software. clearly there are many scenarios of how new software is funded. generally, regardless of funding source, sole or share, if a product can benefit all partners, it’s allowed to do so. thirty-six survey respondents provided details on what factors determine how much each partner contributes toward new software purchases. 
seven respondents (19.4 percent) indicated the number of staff licenses plays a role (as in the unlv model). three respondents (8.3 percent) indicated that institution fte played a role, while three other respondents indicated that the number of partner bibliographic/item records played a role. the majority of respondents, 25 (69.4 percent) provided alternate scenarios or otherwise more information. nine of these 25 respondents indicated costs were split evenly across all partners. several indicated that the formula used for determining maintenance costs was also applied to new software purchases. four respondents indicated that the library service population was a factor. two indicated that circulation counts were a factor. one indicated that it’s negotiated on a per purchase basis, based on varying factors. ■■ hardware purchases hardware needs related to the underlying infrastructure, such as server(s), disk space, and backup equipment increases as the ils grows. unlv’s ils installation has grown tremendously. new software modules have been purchased, application architecture changes occurred with the release of the millennium suite in the late 1990s, regular annual updates to the system software have been management and support of shared integrated library systems | vaughan and costello 67 each partner institution. each module coordinator served as the contact person charged with maintaining familiarity with the functions and features of a particular module, testing enhancements within new releases, keeping other staff informed of changes, and alerting the system vendor of any problems with the module. annually, module coordinators were to consider new software and prioritize and recommend ils software the library should consider purchasing. module coordinators were tasked to maintain a system-wide view of the ils and alert others if they discovered problems or made changes to the ils that could affect other areas of the system. 
in addition, module coordinators were encouraged to subscribe to the iug listserv to monitor discussions and to maintain awareness of overall system issues. all staff had access to the system’s user manual but if they had questions on system features or functions, the module coordinator served as an additional resource. in addition, any bug reports were provided to the most appropriate module coordinator, who would contact innovative. the unlv systems staff, which has grown over time and is now part of the library technologies division, was responsible for all hardware and networking problems, and for scheduling and verifying nightly data backups. the systems department coordinated any new software installations with the module coordinators group, library staff, and library partners. in 2006, the unlv libraries reorganized and hired a dedicated systems librarian focused on the ils. the systems librarian’s principal job responsibility is to serve as the central administrator and site coordinator of the unlv libraries’ shared ils. responsibilities include communicating with colleagues regarding current system capabilities, monitoring vendor software developments, monitoring how other libraries utilize their innovative systems, and recommending enhancements. the systems librarian is the site contact with innovative and coordinates and monitors support calls, software and patch upgrades, and new software module installations. the position serves as the contact person for the shared institutions whenever they have questions or issues with the ils. the systems librarian has taken over much of the work previously coordinated through the module coordinators group. while the formal module coordinators group no longer exists, module experts still provide assistance as needed, and consultation always occurs with partners on system-wide issues as they arise. unlv is not unique in how it manages their ils. 
in the survey results, 36 respondents (87.8 percent) indicated there is a dedicated individual at the main institution who has a primary responsibility of overseeing the ils. to help clarify the responses, “primary responsibility” is defined as individuals spending more than half their time devoted to support, research, troubleshooting, and system administration duties related to the ils. the authors replacements are determined by the same formula used for assessing the share of annual maintenance. ■■ additional purchases the last funding-related survey question asked if ongoing content enrichment services were subscribed to, and if so, to describe how the cost share amount is determined for partner libraries. content enrichment services can provide additional evaluative content such as book cover images, table of contents (toc), and book reviews. unlv subscribes to a toc service as well as an additional service providing book covers, reviews, and excerpts. partner institutions contribute to the annual service charge associated with the toc service and pay for each record enhanced at their library. unlv fully funds the book cover/review/excerpt service that benefits all partners. fourteen of the 43 survey respondents (32.6 percent) indicated they did not subscribe to enrichment services. twelve respondents (27.9 percent) indicated they had one or more enrichment services and that the costs were fully funded by the main institution. seventeen respondents (39.5 percent) subscribe to enrichment services and that the costs are shared. several indicated the existing cost-sharing formula used for other assessments (annual maintenance, hardware, or nonsubscription-based software) is also used for the ongoing enrichment services. one respondent indicated they maintain a collective fund for enrichment services and estimate the cost of all shared subscriptions; this figure is integrated into the share each institution contributes to the central fund annually. 
one respondent indicated that their system only uses free enrichment services. ■■ support the next section of the survey addressed staff support efforts related to management of the ils. twenty years ago when unlv installed its ils, staff support included one librarian and one additional staff; both focused on various aspects of system support, from maintaining hardware to working with the vendor, in addition to having other primary job responsibilities completely unrelated to the ils. in addition, over time, functional experts developed for particular modules of the system, such as cataloging, acquisitions, circulation, and serials control. this group of functional experts eventually became known as the unlv innovative module coordinators group, which was chaired by the head of the library systems department. this group met quarterly and included experts from unlv as well as one representative from 68 information technology and libraries | june 2011 solely by the main library. typical system administration activities include managing and executing mid-release and major release software upgrades (95.2 percent of all respondents indicated the main library is solely responsible); managing, coordinating, and scheduling new products for installation (95.2 percent); monitoring disk space (95 percent); and scheduling and monitoring backups (92.9 percent). unlv’s ils support model is very similar to the survey results. the systems librarian at unlv manages all software upgrades, as well as coordinating and scheduling new ils software product and module installs. the library technologies division monitors and schedules the nightly backups and diskspace usage. certain unlv libraries staff and selected individuals from the partner libraries are authorized to open support calls with the system vendor, although the systems librarian often handles this activity herself. 
other functions, such as maintaining the year-to-date and last year circulation statistics, are also performed by the unlv libraries systems librarian. updating circulation parameters are tasks best performed by each of the

created a list of 20 duties related to ils system administration and asked respondents to indicate whether: the main library or a central consortial or cooperative office dedicated to the ils handles this particular duty; the duty is shared between the main library and partner libraries; or the duty is handled by just a partner library. as illustrated in figure 3, the survey results overwhelmingly show that the main library in a shared system provides the majority of system administration support. only two tasks were broadly shared between the main library and partner libraries: maintenance of the institution’s records (bibliographic, item, patron, order, etc.) and maintaining network and label printers. other shared tasks included changes to the circulation parameters tables (e.g., configuring loan rules and specifying open hours and days closed tables for materials they themselves circulate), with 40.5 percent of the respondents indicating this as a shared responsibility; opening support calls with the vendor (38.1 percent); monitoring bounced export and fts mail (33.3 percent); and account management (31 percent). the more typical system administration activities are done

[figure 3. systems administration / support responsibilities. duties charted: account management (create new/delete accounts; millennium authorizations); manage and execute innovative mid-release and major release software upgrades; manage, coordinate and schedule new innovative software product installations; schedule and monitor backups; write scripts to automate processes (i.e., circulation overrides report, system status reports, etc.); perform review file maintenance and take action should all files fill; open support calls with innovative; monitor status of open calls and serve as liaison with innovative for resolution of support calls; maintain year-to-date/last year circulation statistic counters; monitor system messages; monitor disk space usage; monitor bounced export and fts mail; maintain code tables (fixed length, variable length, etc.); update circulation parameters tables (loan rules, hours open, days closed, etc.); set up, monitor and troubleshoot notices issues; write or modify load tables for new record loading; maintain system printers (label, networked laser printers); provide maintenance on records (patron, bib, item, etc.); manage system security through innovative system settings and/or host based or network based firewalls; provide emergency (off hours) response to reports of innovative downtime or server hardware failures.]

and definition of policies and procedures. some groups provide recommendations to a larger executive board for the consortia. the meeting frequency of these groups is as varied as the libraries. some groups meet quarterly (33.3 percent) or monthly (20 percent) but the majority meet at other frequencies (40 percent), such as every other month or twice a year. some libraries use e-mail to communicate as opposed to having regular in-person meetings.
in addition to a standing committee focused on the ils, and similar to unlv’s experience, libraries may have finite working groups to implement particular products. ■■ training, professional development, and planning the survey also focused on training, professional development, and planning activities related to the ils. there are many methods that library staff can use to stay current with their ils. most training methods typically include in-person workshops or online tutorials, as well as other venues for professional development, such as conference attendance. the authors were interested in how libraries sharing an ils determined training needs and who was responsible for the training. the survey results showed that libraries value a variety of training opportunities, partner institutions, with advice and assistance as necessary provided by the systems librarian. the authors were interested if an ils oversight body exists with other shared systems, and, if so, what issues are discussed. responses indicated that a variety of groups exist, and, in some instances, multiple groups may exist within one consortia (some groups have a more specific ils focus and others a more tangential involvement). as illustrated in figure 4, a minority of respondents, 11 of 41 (26.8 percent), indicated that they do not have a group providing ils oversight. if such a group exists, respondents were allowed to select various predefined duties performed by that group. twenty-three respondents indicated the group discusses purchasing decisions. respondents also indicated that such a group also discusses the impact of the vendor enhancements offered by mid-release and regular full-releases (19), and when to schedule the upgrades (12). the absence of an oversight group doesn’t imply that consultation doesn’t occur, rather, it may be the responsibility of an individual as opposed to an effort coordinated by a group. 
some libraries also have module-driven committees, which disseminate information, introduce new ideas, and try to promote cohesiveness throughout the consortium. other duties that such an oversight group may focus on include workflow issues, discussion of system issues,

[figure 4. issues discussed by ils oversight body. categories charted: updates on unresolved problem calls with innovative; discussion on enhancements offered by mid-release and regular full release software upgrades and their impact (positive/negative) on users of the system; scheduling mid-release/full release software upgrades; prioritizing and selecting choices related to the innovative user’s group enhancements ballot for your installation; discussion of potential new software/modules to purchase from innovative; n/a (an oversight group, body, or committee does not exist related to the oversight of the innovative system); other.]

specifically regarding cost sharing, support, and rights and responsibilities. in conducting this background research, a paucity of published literature was observed, and thus the authors hope the findings above may help other established consortia, who may be interested in reviewing or tweaking their current mous or more formalized agreements likely in place. it may also provide some considerations for libraries considering initiating a shared ils instance, something that, given the current recession, may be a topic to consider. given that nearly a decade has passed since the original unlv mou was drafted and agreed to, several revisions will be proposed and drafted. this includes formalization of how costs are divided for enrichment services (new since the original mou), and formalization in writing of the coordination role of the systems librarian in her capacity as chief manager of the ils.
other ideas gathered from survey responses are worth consideration, such as a base additional fee contributed each year (above and beyond the fee assessed as determined by staff licenses). such a fee could help recoup real, sometimes significant costs associated with the system, such as the purchase of additional software benefitting all players (often, in practice, funded solely by the main library). such a fee could also help recoup more tangential (but still real) expenses, such as replacement of backup media. however, at the time of writing, tweaking (increasing) the fee assessed to partner institutions is a delicate issue. as with many other institutions of learning and their associated libraries, the nevada system of higher education has been particularly hard hit with funding cuts, even when compared against serious cuts experienced by colleagues nationwide. by all measures (unemployment, state budget shortfall, foreclosures, etc.) nevada has been one of the hardest hit states in the current recession. while knowledge gained from this survey was useful (and current), what effect it will have in changing the cost structure is, for now, on hold. in the spirit of support among the libraries in the same system of higher education, and in continuing to demonstrate serious shared efficiencies (by maintaining one joint system as opposed to five individual systems), no new fee structure will be implemented in the short term. at the appropriate time, different costing structures such as those elicited in the survey results will merit closer attention.

references

1. jason vaughan, “a library’s integrated online library system: assessment and new hardware implementation,” information technology and libraries 23, no. 2 (june 2004): 50–57.
2. innovative interfaces, “about us: history,” http://www.iii.com/about/history.shtml (accessed may 17, 2010).

regardless of the library’s status.
the easiest and cheapest method of awareness involves having someone monitor the iug electronic discussion list, with 29 respondents (70.7 percent) indicating that both the main library and one or more partner libraries participate in this activity. attendance at the national and regional iug meetings was also valued highly by libraries with 26 respondents (66.7 percent) indicating both the main libraries and their partner libraries having a staff member attend such meetings in the past 5 years. sixteen respondents (64 percent) indicated both the main library and their partner libraries regularly send staff to the american library association annual conference and midwinter meeting. iug typically has a meeting the friday before the midwinter meeting. attendance at training workshops held at the vendor headquarters, as well as online training, is an activity in which the main library participates more frequently than the partner libraries (61.1 percent). complete survey results are provided in appendix a, available at http://www.lita.org/ala/mgrps/divs/lita/ ital/302011/3002jun/pdf/vaughan_app.pdf. ■■ research summary and future directions integrated library systems shared by multiple partners hold the promise of shared efficiencies. given a rather significant number of responses, shared systems appear to be quite common, ranging from a few partners to systems with many partners. perhaps reflecting this, shared systems range from loose federations of library partners to shared systems managed by a more formalized, official consortium. a majority of libraries with shared systems have a mou or other official documents to help define the nature of the relationship, focusing on such topics as budgeting, payments, and funding formulas; general governance and voting matters; support; and equipment. most libraries sharing a system have a method or funding formula outlining how the ils is funded on an annual basis and the contributions provided by each partner. 
such methods can include not only annual maintenance, but also the procurement of new hardware and software extending the system capabilities. while many support functions are carried out by a central office or staff at the main library hosting the shared system, partner libraries often participate in annual user group and library association conferences where they help stay abreast of vendor ils developments. the research above describes the authors’ investigations into management of shared integrated library systems. in particular, the authors were interested in how other consortia sharing an ils managed their system,

high-performance annotation tagging over solr full-text indexes

michele artini, claudio atzori, sandro la bruzzo, paolo manghi, marko mikulicic, and alessia bardi

information technology and libraries | september 2014 22

abstract

in this work, we focus on the problem of annotation tagging over information spaces of objects stored in a full-text index. in such a scenario, data curators assign tags to objects with the purpose of classification, while generic end users will perceive tags as searchable and browsable object properties. to carry out their activities, data curators need annotation tagging tools that allow them to bulk tag or untag large sets of objects in temporary work sessions where they can virtually and in real time experiment with the effect of their actions before making the changes visible to end users. the implementation of these tools over full-text indexes is a challenge because bulk object updates in this context are far from being real-time and in critical cases may slow down index performance. we devised tagtick, a tool that offers to data curators a fully functional annotation tagging environment over the full-text index apache solr, regarded as a de facto standard in this area.
tagtick consists of a tagtick virtualizer module, which extends the api of solr to support real-time, virtual, bulk-tagging operations, and a tagtick user interface module, which offers end-user functionalities for annotation tagging. the tool scales optimally with the number and size of bulk tag operations without compromising the index performance.

michele artini (michele.artini@isti.cnr.it), claudio atzori (claudio.atzori@isti.cnr.it), sandro la bruzzo (msandro.labruzzo@isti.cnr), paolo manghi (paolo.manghi@iti.cnr.it), and marko mikulicic (mmark.mikulicic@isti.cnr.it) are researchers at istituto di scienza e tecnologie dell’informazione “alessandro faedo,” consiglio nazionale delle ricerche, pisa, italy. alessia bardi (mallessia.bardi@for.unipit.it) is a researcher at the dipartimento di ingegneria dell’informazione, università di pisa, italy.

introduction

tags are generally conceived as nonhierarchical terms (or keywords) assigned to an information object (e.g., a digital image, a document, a metadata record) in order to enrich its description beyond the one provided by object properties. the enrichment is intended to improve the way end users (or machines) can search, browse, evaluate, and select the objects they are looking for. examples are qualificative terms, i.e., terms associating the object to a class (e.g., biology, computer science, literature), or qualitative terms, i.e., terms associating the object to a given measure of value (e.g., rank in a range, opinion).1 approaches differ in the way tags are generated. in some cases users (or machines)2 freely and collaboratively produce tags,3 thereby generating so-called folksonomies.
the natural heterogeneity of folksonomies calls for solutions to harmonise and make more effective their usage, such as tag clouds.4 in other approaches users can pick tags from a given set of values (e.g., vocabulary, ontology, range) or else find hybrid solutions, where a degree of freedom is still permitted.5,6 a further differentiation is introduced by semantically enriched tags, which are tags contextualized by a label or prefix that provides an interpretation for the tag.7 for example, in the digital library world, the annotation of scientific article objects with subject tags could be done according to the tag values of the tag interpretations of acm scientific disciplines and “dewey decimal classification,” whose term ontologies are different.8 the action of tagging is commonly intended as the practice of end users or machines of assigning or removing tags to the objects of an information space. an information space is a digital space a user community populates with information objects for the purpose of enabling content sharing and providing integrated access to different but related collections of information objects.9 the effect of tagging information objects in an information space may be private, i.e., visible to the users who tagged the objects or to a group of users sharing the same right, or public, i.e., visible to all users.10 many well-known websites allow end users to tag web resources. for example delicious11 (http://delicious.com) allows users to tag web links with free and public keywords; stack overflow (http://stackoverflow.com), which lets users ask and answer questions about programming, allows tagging of question threads with free and public keywords; gmail 12 (http://mail.gmail.com) allows users to tag emails—at the same time, tags are also transparently used to encode email folders. 
in the digital library context, the portal europeana (http://www.europeana.eu) allows authenticated end users to tag metadata records with free keywords to create a private set of annotations. in this work we shall focus on annotation tagging—that is, tagging used as a manual data curation technique to classify (i.e., attach semantics to) the objects of an information space. in such a scenario, tags are defined as controlled vocabularies whose purpose is classification.13,14 unlike semantic annotation scenarios, where semantic tags may be semiautomatically generated and assigned to objects,15 in annotation tagging authorized data curators are equipped with search tools to identify the sets of objects they believe should belong or not belong to a given category (identified by a tag), and to eventually perform the tagging or untagging actions required to apply the intended classification. in general, such operations may assign or remove tags to and from an arbitrarily large subset of objects of the information space. it is therefore hard to predict the quality and consistency of the combined effect of a number of such actions. as a consequence, data curators must rely on virtual tagging functionalities which allow them to bulk (un)tag sets of objects in temporary work sessions, where they can in real-time preview and experiment (do/undo) the effects of their actions before making the changes visible to end users. examples of scenarios that may require annotation tagging can be found in many fields of application. this is the case, for example, in several data infrastructures funded by the european commission fp7 program, which share the common goal of populating very large information spaces by aggregating textual metadata records collected from several data sources. 
examples are the data infrastructures for driver,16 heritage of the people’s europe (hope),17 european film gateway (efg and efg1914),18 openaire19 (http://www.openaire.eu), and europeana. in such contexts, the aggregated records are potentially heterogeneous, not sharing common classification schemes, and annotation tagging becomes a powerful means to make the information space more effectively consumable by end users. there are two significant challenges to be tackled in the realization of annotation tagging tools. first is the need to support bulk-tagging actions in almost real time so that data curators need not wait long for their actions to complete. second, bulk-tagging actions need to be virtualized over the information space, so that data curators can verify the quality of their actions before committing them, and access to the information space is unaffected by such actions. naturally, the feasibility and quality of annotation tagging tools strictly depend on the data management system adopted to index and search objects of the information space. in general, so as not to compromise information space availability, bulk updates are based either on efficient offline strategies, which minimize the update’s delay,20 or on virtualisation techniques, which perform the update in such a way that users have the impression it was completed.21 in this work, we target the specific problem of annotation tagging of information spaces whose objects are documents in a solr full-text index (v3.6).22 solr is an open-source apache project delivering a full-text index whose instances can scale up to millions of records and can benefit from horizontal clustering, replica handling, and production-quality performance for concurrent queries and bulk updates.
the index is widely adopted in the literature and often in contexts where annotation tagging is required, such as the aforementioned aggregative data infrastructures. the implementation of virtual and bulk-tagging facilities over solr information spaces is a challenge, since bulk updates of solr objects are fast, but far from being real-time when large sets of objects are involved. in general, independently of the configuration, a re-indexing of millions of objects may take up to some hours, while for real-time previews even minutes would not be acceptable. moreover, in critical cases, update actions may also slow down index performance and compromise access to the information space. in this paper, we present tagtick, a tool that implements facilities for annotation tagging over solr with no remarkable degradation of performance with respect to the original index. tagtick consists of two main modules: the tagtick virtualizer, which implements functionalities for real-time bulk (un)tagging in the context of work sessions for solr, and the tagtick user interface, which implements user interfaces for data curators to create, operate and commit work sessions, so as to produce newly tagged information spaces. tagtick software can be demoed and downloaded from http://nemis.isti.cnr.it/product/tagtick-authoritative-tagging-apache-solr.

annotation tagging

annotation tagging is a process operated by data curators whose aim is improving the end user’s search experience over an information space. specifically, the activity consists in assigning searchable and browsable tags to objects in order to classify and logically structure the information space into further (and possibly overlapping) meta-classes of objects.
moreover, when ontologies published on the web are used, for example ontologies available as linked data such as the geonames ontology (http://www.geonames.org/ontology/documentation.html) or the dbpedia ontology (http://dbpedia.org/ontology), then tags are means to link objects in the information space to external resources. in this section, we shall describe the functional requirements of annotation tagging in order to introduce assumptions and nomenclature to be used in the remainder of the paper. information space: objects, classes, tags, and queries we define an information space as a set of objects of different classes c1 . . . ck. each class ci has a structure (l1 : v1, . . . ,ln : vn), where lj’s are object property labels and vj are the types of the property values. types can be value domains, such as strings, integers, dates, or controlled vocabularies of terms. in its general definition, annotation tagging has to do with semantically enriched tagging, where a tag consists of a pair (i, t), made of a tag interpretation i and a tag value t from a term ontology t; as an example of interpretation consider the acm subject classification scheme (e.g., i = acm), where t is the set of acm terms. in this context, tagging is de-coupled from the information space and can be configured aposteriori. typically, given an information space, data curators set up the annotation tagging environment by: (i) defining the interpretation/ontology pairs to be used for classification, and (ii) assigning to each class c the interpretations to be used to tag its objects. as a result, class structures are enriched with a set of interpretations (i1:t1 . . . im:tm), where ij are tag interpretation labels and tj the relative ontologies. unless differently specified, an object may be assigned multiple tag values for the same tag interpretation, e.g. scientific publication objects may cover different scientific acm disciplines. 
finally, the information space can reply to queries q formed according to the abstract syntax in table 1, where op is a generic boolean operator (dependent on the underlying data management system, e.g. “=,” “<,” “>”) and c ∈ {c1, . . . ,ck}. tag predicates (i = t) and class predicates (class = c) represent exact matches, which mean “the object is tagged with the tag (i, t)” and “the object belongs to class c.”

q ∷= (q and q) | (q or q) | (l op v) | (i = t) | (class = c) | v | ε

table 1. solr query language.

virtual and real-time tagging

in annotation tagging data curators apply bulk (un)tagging actions with respect to a tag (i, t) over arbitrarily large sets of objects returned by queries q over the information space. due to the potential impact that such operations may have over the information space, tools for annotation tagging should allow data curators to perform their actions in a protected environment called work session. in such an environment curators can test sequences of bulk (un)tagging actions and incrementally shape up an information space preview: they may view the history of such actions, undo some of them, add new actions, and pose queries to test the quality of their actions. to offer a usable annotation tagging tool, it is mandatory for such actions to be performed in (almost) real time. for example, curators should not wait more than a few seconds to test the result of tagging 1 million objects, an action which they might undo immediately after. moreover, such actions should not conflict with (e.g., slow down) the activities of end users running queries on the information space. finally, when data curators believe the preview has reached its maturity, they can commit the work session, i.e., materialise the preview in the information space, and make the changes visible to end users.
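the abstract query syntax of table 1 can be made concrete with a small sketch. the following python model is only an illustration under stated assumptions: the class names, the dictionary layout of an object (`classes`, `props`, `tags`), and the reading of ε as "matches every object" are hypothetical choices and are not taken from solr or from tagtick.

```python
import operator
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class And:            # (q and q)
    left: "Query"
    right: "Query"


@dataclass(frozen=True)
class Or:             # (q or q)
    left: "Query"
    right: "Query"


@dataclass(frozen=True)
class Prop:           # (l op v)
    label: str
    op: str           # one of "=", "<", ">"
    value: object


@dataclass(frozen=True)
class Tag:            # (i = t), exact match on a tag
    interpretation: str
    term: str


@dataclass(frozen=True)
class Class:          # (class = c), exact match on a class
    name: str


@dataclass(frozen=True)
class Keyword:        # v, a free keyword
    text: str


@dataclass(frozen=True)
class Empty:          # ε, assumed here to match every object
    pass


Query = Union[And, Or, Prop, Tag, Class, Keyword, Empty]
OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}


def matches(q, obj):
    """Return True if the object (a plain dict) satisfies the query q."""
    if isinstance(q, And):
        return matches(q.left, obj) and matches(q.right, obj)
    if isinstance(q, Or):
        return matches(q.left, obj) or matches(q.right, obj)
    if isinstance(q, Prop):
        props = obj["props"]
        return q.label in props and OPS[q.op](props[q.label], q.value)
    if isinstance(q, Tag):
        return (q.interpretation, q.term) in obj["tags"]
    if isinstance(q, Class):
        return q.name in obj["classes"]
    if isinstance(q, Keyword):
        # keywords are searched in property values only, not in tag values
        return any(q.text in str(v) for v in obj["props"].values())
    return True  # Empty
```

a query such as `And(Keyword("napoleon"), Tag("acm", "history"))` then reads as "objects mentioning napoleon that carry the tag (acm, history)".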
apache solr and annotation tagging as mentioned in the introduction, our focus is on annotation tagging for apache solr (v3.6). this section describes the main information space features and functionalities of the solr full-text index search platform. in particular, it explains the issues arising when using its native apis to implement bulk real-time tagging as described previously. solr information spaces: objects, classes, tags, and queries solr is one of the most popular full-text indexes. it is an apache open source java project that offers a scalable, high performance and cross-platform solution for efficient indexing and querying of information spaces made of millions of objects (documents in solr jargon).23 a solr index stores a set of objects, each consisting in a flat list of possibly replicated and unordered fields associated to a value. each object is referable by a unique identifier generated by the index at indexing time. the information spaces described previously can be modelled straightforwardly in solr. each object in the index contains field-value pairs relative to the properties and tag interpretations of all classes they belong to. moreover, we shall assume that all objects share one field named class whose values indicate the classes (e.g. c1, . . . ,ck) to which the object belongs. such an assumption does not restrict the application domain, since classes are typically encoded in solr by a dedicated field. the solr api provides methods to search objects by general keywords, field values, field ranges, fuzzy terms and other advanced search options, plus methods for the bulk addition and deletion of objects. in our study, we shall restrict to the search method query(q, qf), where q and qf are cql queries respectively referred as the “main query” and the “filter query”. 
in particular, in order to match the query language requirements described previously, we shall assume that q and qf are expressed according to the cql subset matching the query language in table 1.

getdocset : rs → ds
    returns the docset relative to a result set
intersectdocsets : ds × ds → ds
    returns the intersection of two docsets
intersectsize : ds × ds → integer
    returns the size of the intersection of two docsets
unifydocsets : ds × ds → ds
    returns the union of two docsets
andnotdocsets : ds × ds → ds
    given two docsets ds1 and ds2, returns the docset {d | d ∈ ds1 ∧ ¬ d ∈ ds2}
searchondocset : q × ds → rs
    executes a query q over a docset and returns the relative result set

table 2. solr docset management low-level interface.

to describe the semantics of query(q, qf) it is important to make a distinction between the solr notions of result set and docset. in solr, the execution of a query returns a result set (i.e., queryresponse in solr jargon) that logically contains all objects matching the query. in practice, a result set is conceived to be returned at the time of execution to offer instant access to the query result, which is meanwhile computed and stored in memory into a low-level solr data structure called docset. docsets are internal solr data structures, which contain lists of object identifiers and allow for efficient operations such as union and intersection of very large sets of objects to optimize query execution. table 2 illustrates some of the methods used internally by solr to handle docsets. method names have been chosen to be self-explanatory and therefore do not match the ones used in the libraries of solr.
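as a reading aid, the interface of table 2 can be modelled with python frozensets of object identifiers. this is a toy model, not solr code: the function names simply follow the self-explanatory names of table 2, the index is assumed to be a list of dictionaries with an `id` key, and the query is assumed to be a python predicate.

```python
def get_docset(result_set):
    """getdocset: the identifiers of the objects in a result set."""
    return frozenset(doc["id"] for doc in result_set)


def intersect_docsets(ds1, ds2):
    """intersectdocsets: identifiers present in both docsets."""
    return ds1 & ds2


def intersect_size(ds1, ds2):
    """intersectsize: size of the intersection of two docsets."""
    return len(ds1 & ds2)


def unify_docsets(ds1, ds2):
    """unifydocsets: identifiers present in either docset."""
    return ds1 | ds2


def and_not_docsets(ds1, ds2):
    """andnotdocsets: identifiers in ds1 that are not in ds2."""
    return ds1 - ds2


def search_on_docset(match, docset, index):
    """searchondocset: run a query (here a predicate) restricted to a docset.

    `index` is a hypothetical in-memory list of objects standing in for solr.
    """
    return [doc for doc in index if doc["id"] in docset and match(doc)]
```

the point of the model is that every operation works on identifiers only, which is why solr can combine very large docsets without touching the stored objects.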
\[
\llbracket \mathrm{query}(q, qf) \rrbracket_{solr} =
\begin{cases}
\{\, d \mid id(d) \in \llbracket q \rrbracket_{ds} \,\} & \text{if } qf = null \\
\mathrm{searchondocset}(q, \llbracket qf \rrbracket_{cache}(\phi)) & \text{if } qf \neq null
\end{cases}
\]
\[
\llbracket qf \rrbracket_{cache}(\phi) =
\begin{cases}
ds & \text{if } \phi(qf) = ds \\
\llbracket qf \rrbracket_{cache}(\phi[qf \leftarrow \llbracket qf \rrbracket_{ds}]) & \text{if } \phi(qf) = \bot
\end{cases}
\]
\[
\begin{aligned}
\llbracket (q_1 \text{ and } q_2) \rrbracket_{ds} &= \llbracket q_1 \rrbracket_{ds} \cap \llbracket q_2 \rrbracket_{ds} \\
\llbracket (q_1 \text{ or } q_2) \rrbracket_{ds} &= \llbracket q_1 \rrbracket_{ds} \cup \llbracket q_2 \rrbracket_{ds} \\
\llbracket (l \; op \; v) \rrbracket_{ds} &= \{\, id(d) \mid d.l \; op \; v \,\} \\
\llbracket (i = t) \rrbracket_{ds} &= \{\, id(d) \mid d.i = t \,\}
\end{aligned}
\]

table 3. semantic functions.

informally, query(q, qf) returns the result set of objects matching the query q intersected with the objects matching the filter query qf, i.e. its semantics is equivalent to that of the command query(q and qf, null). in practice, the usage of a filter query qf is intended to efficiently reduce the scope of q to the set of objects whose identifiers are in the docset of qf. to this aim, solr keeps in memory a filter cache ϕ : q → ds. the first time a filter query qf is received, solr executes it and stores the relative docset ds in ϕ, where it can be accessed to optimize the execution of query(q, qf). once the docset ϕ(qf) = ds is available, query(q, qf) invokes the low-level method searchondocset(q, ds) (see table 2). the method executes q to obtain its docset, efficiently intersects such docset with ds, and populates the result set relative to the query. due to the efficiency of docset intersection and in-memory data structures, query execution time is close to the time needed to execute q alone. table 3 shows the semantic functions ⟦.⟧solr : q × q → rs, ⟦.⟧ds : q → ds, and ⟦.⟧cache : q × ℘(q × ds) → ds. the first yields the result set of query(q, qf); the second the docset relative to a query q (where d is an object); and the third resolves queries into docsets by means of a filter cache ϕ.

limits to virtual and real-time tagging in solr

whilst solr is a well-known and established solution for full-text indexing over very large information spaces, it poses challenges for higher-level applications willing to expose to users private, modifiable views of the same index.
this is the case for annotation tagging tools, which must provide data curators with work sessions where they can update with tagging and untagging actions a logical view of the information space, while still providing end users with search facilities over the last committed information space. since the solr api does not natively provide “view management” primitives, the only approach would be that of materializing tagging and untagging actions in the index while making sure that such changes are not visible to end users. prefixing tags with work session identifiers, cloning of tagged objects, or keeping index replicas may be valuable techniques, but all fail to deliver the real-time requirement described previously. this is due to the fact that when very large sets of objects are involved the re-indexing phase is generally far from being real-time. in general, independently of the configuration, processing such requests may take up to some hours for millions of objects, while for real-time previews even minutes would not be acceptable.

tagtick virtualizer: virtual real-time tagging for solr

this section presents the tagtick virtualizer module, the solution devised to overcome the inability of apache solr to support out-of-the-box real-time virtual views over information spaces. the virtualizer api, shown in table 4, supports methods for creating, deleting and committing work sessions, and, in the context of a work session: (1) performing tagging/untagging actions and (2) querying the information space modified by such actions. in the following we will describe both functional semantics and implementation of the api, given in terms of a formal symbolic notation. the semantics defines the expected behaviour of the api and is provided in terms of the semantics of solr.
the implementation defines the realization of the api methods in terms of the low-level docset management library of solr. the right side of figure 1 illustrates the layering of functionalities required to implement the tagtick virtualizer module. as shown, the realization of the module required exposing the solr low-level docset library through an api.

figure 1. tagtick virtualizer: the architecture.

tagtick virtualizer api: the intended semantics

the command createsession() creates a new session s, intended as a sequence of (un)tagging actions over an initial information space i. the command deletesession(s) removes the session s from the environment. we denote the virtual information space obtained by modifying i with the actions in s as i(s); note that i(ε) = i.

createsession()
    creates and returns a work session s
deletesession(s)
    deletes a work session s
commitsession(s)
    commits a work session s
action(a, rs, (i, t), s)
    applies the action a with (i, t) to all objects in rs in s
virtquery(q, s)
    executes q over the information space i(s)

table 4. tagtick virtualizer api: the methods.

the command action(a, rs, (i, t), s), depending on the value of a being tag or untag, applies the relative action for the tag (i, t) to all objects in rs and in the context of the session s. (un)tagging actions occur in the context of a session s, hence update the scope of the information space i(s). the construction of such rs takes place in the annotation tagging tool user interface and may require several queries before all objects to be bulk (un)tagged are collected. annotation tagging tools may for example provide web-basket mechanisms to support curators in this process. the command commitsession(s) makes the virtual information space i(s) persistent, i.e., materializes the bulk updates collected in session s. once this operation is completed, the session s is deleted.
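the intended meaning of i(s) can be pinned down with a deliberately naive reference sketch, which replays the actions of a session over a copy of the information space. this is only a specification aid: copying and updating objects is exactly the cost a real-time tool has to avoid. the function name and the data layout (a dictionary from object identifiers to sets of tags, and a session given as a list of actions) are assumptions made for the example.

```python
def apply_session(space, session):
    """Return i(s): the tags of every object after replaying the session.

    space:   dict mapping object id -> set of (interpretation, term) tags
    session: list of (action, object ids, tag), with action "tag" or "untag"
    The original space is left untouched, as a preview requires.
    """
    view = {obj_id: set(tags) for obj_id, tags in space.items()}
    for action, obj_ids, tag in session:
        for obj_id in obj_ids:
            if action == "tag":
                view[obj_id].add(tag)
            elif action == "untag":
                view[obj_id].discard(tag)
            else:
                raise ValueError(f"unknown action: {action}")
    return view
```

with an empty session the function returns a copy equal to the initial space, which mirrors the identity i(ε) = i.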
the command virtquery(q, s) executes a virtual search whose semantics is that of the solr method query(q, null) executed over i(s). more formally, let us extend the semantic function ⟦.⟧solr with the information space over which the query is executed, writing \(\llbracket \mathrm{query}(q, qf) \rrbracket_{solr}^{i}\) for the semantics of query(q, qf) over a given information space i. then, we can define:

\[
\llbracket \mathrm{virtquery}(q, s) \rrbracket_{tv} = \llbracket \mathrm{query}(q, null) \rrbracket_{solr}^{i(s)}
\]

tagtick virtualizer api: the implementation

to provide its functionalities in real time, the tagtick virtualizer avoids any form of update action into the index. the module emulates the application of bulk (un)tagging actions over the information space by exploiting the solr low-level library for docset management, whose methods are shown in table 2. the underlying intuition is based on two considerations: (1) the action action(a, rs, (i, t), s) can be encoded in memory as an association between the tag (i, t) and the objects in the docset ds relative to rs in the context of s; and (2) the subset of objects ds should be returned to the query (i = t) if executed over i in the scope of s (i.e., as if i was updated with such an action). by following this approach, the module may rewrite and execute calls of the form virtquery(q and (i = t), s) into calls searchondocset(q, ds), thereby emulating the real-time execution of the query over the information space i(s). more generally, any query of the form q and qtag predicates, where qtag predicates is a query combining tag predicates relative to tags touched in the session, can be rewritten as searchondocset(q, ds). in such cases, ds is obtained by combining the docsets relative to tag predicates by means of the low-level methods intersectdocsets and unifydocsets.
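the rewriting idea can be illustrated with a toy sketch. it assumes that the tag predicates are given as nested tuples, that the in-memory association between tags and docsets is a dictionary `session_tags`, and that the index is a list of dictionaries; these names and encodings are hypothetical and serve only to show how a docset is assembled and then used to restrict the remaining query.

```python
def tags_docset(q_tags, session_tags):
    """Combine the docsets of tag predicates touched in the session.

    q_tags is ("tag", i, t), ("and", a, b) or ("or", a, b);
    session_tags maps (i, t) to a frozenset of object identifiers.
    """
    kind = q_tags[0]
    if kind == "tag":
        return session_tags[(q_tags[1], q_tags[2])]
    left = tags_docset(q_tags[1], session_tags)
    right = tags_docset(q_tags[2], session_tags)
    if kind == "and":
        return left & right   # plays the role of intersectdocsets
    if kind == "or":
        return left | right   # plays the role of unifydocsets
    raise ValueError(f"unknown node: {kind}")


def virt_query(match, q_tags, session_tags, index):
    """Emulate virtquery(q and q_tags, s) without updating the index.

    `match` is a predicate standing in for the untouched part q of the query.
    """
    ds = tags_docset(q_tags, session_tags)
    return [doc for doc in index if doc["id"] in ds and match(doc)]
```

no object is rewritten at any point: the session only contributes sets of identifiers, which is what keeps the preview fast.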
the tagtick virtualizer module implements the aforementioned session cache by means of an in-memory map ρ : s × i × t → ds, which caches the tagging status of all active work sessions. to this aim, ρ maps triples (s, i, t) onto docsets ds that are defined as the set of objects tagged with the tag (i, t) in the context of s at the time of the request. the tagtick virtualizer is stateless with regard to the specific tags and session identifiers it is called to handle; such information is typically held in applications using the module to take advantage of real-time, virtual tagging mechanisms.

tagging and untagging actions

the method action(a, rs, (i, t), s) has the effect of changing the status ρ to reflect the action of tagging or untagging the objects in the result set rs with the tag (i, t) in the session s. table 5 describes the effect of the command over the status ρ in terms of the semantic function ⟦.⟧m : c × ℘(s × i × t) → ℘(s × i × t) that takes a command c and a status ρ and returns the status ρ affected by c. in order to limit memory heap usage, ρ is populated following a lazy approach, according to which a new entry for the key (s, i, t) is created when the first tagging or untagging action with respect to the tag (i, t) is performed in the scope of s. when the user adds or removes a tag (i, t) for the first time in the session s (case ρ(s, i, t) = ⊥), the value of the entry ρ(s, i, t) is initialized to the docset relative to the query i = t:

\[
ds = \mathrm{getdocset}(\llbracket \mathrm{query}((i = t), null) \rrbracket_{solr}^{i})
\]

the function init(ρ, s, i, t) returns such new ρ over which the tag or untag action is eventually executed. if the action involves a tag (i, t) for which an entry ρ(s, i, t) = ds exists (case ρ(s, i, t) ≠ ⊥), the commands return the new ρ obtained by adding or removing the docset getdocset(rs) to or from ds. such actions are performed in memory with minimal execution time.
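the lazy population of ρ described above can be sketched as follows. the sketch assumes that ρ is a python dictionary keyed by (s, i, t) and that a caller-supplied function `index_docset_for(i, t)` stands in for getdocset(query((i = t), null)) on the committed index; both are hypothetical stand-ins. unlike the functional definition, which returns a new status, the sketch updates the dictionary in place for brevity.

```python
def action(rho, a, rs_ids, tag, session, index_docset_for):
    """Apply a tag/untag action to the in-memory status rho.

    rho:              dict mapping (s, i, t) -> frozenset of object ids
    a:                "tag" or "untag"
    rs_ids:           identifiers of the objects in the result set rs
    tag:              the pair (i, t)
    index_docset_for: hypothetical lookup of the objects tagged (i, t)
                      in the committed information space
    """
    i, t = tag
    key = (session, i, t)
    if key not in rho:
        # lazy initialisation: first time (i, t) is touched in this session
        rho[key] = frozenset(index_docset_for(i, t))
    if a == "tag":
        rho[key] = rho[key] | frozenset(rs_ids)
    elif a == "untag":
        rho[key] = rho[key] - frozenset(rs_ids)
    else:
        raise ValueError(f"unknown action: {a}")
    return rho
```

since each action is a single set union or difference, its cost depends on the size of the docsets involved and not on the time needed to re-index the objects.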
\[
\llbracket \mathrm{action}(a, rs, (i, t), s) \rrbracket_{m}(\rho) =
\begin{cases}
\mathrm{updatetag}(\rho, rs, (i, t), s) & \text{if } a = tag \text{ and } \rho(s, i, t) \neq \bot \\
\mathrm{updateuntag}(\rho, rs, (i, t), s) & \text{if } a = untag \text{ and } \rho(s, i, t) \neq \bot \\
\llbracket \mathrm{action}(a, rs, (i, t), s) \rrbracket_{m}(\mathrm{init}(\rho, s, i, t)) & \text{if } \rho(s, i, t) = \bot
\end{cases}
\]
\[
\begin{aligned}
\mathrm{init}(\rho, s, i, t) &= \rho[\rho(s, i, t) \leftarrow \mathrm{getdocset}(\llbracket \mathrm{query}(i = t, null) \rrbracket_{solr}^{i})] \\
\mathrm{updatetag}(\rho, rs, (i, t), s) &= \rho[\rho(s, i, t) \leftarrow \rho(s, i, t) \cup \mathrm{getdocset}(rs)] \\
\mathrm{updateuntag}(\rho, rs, (i, t), s) &= \rho[\rho(s, i, t) \leftarrow \rho(s, i, t) \setminus \mathrm{getdocset}(rs)]
\end{aligned}
\]

table 5. semantics of tag/untag commands.

queries over a virtual information space

as mentioned above, the command virtquery(q, s) is implemented by executing the low-level method searchondocset(q′, ds). informally, q′ is the subpart of q whose predicates are not affected by actions in s, while ds is the subset of objects matching tag predicates affected by actions in s, to be calculated by means of the map ρ. to make this statement precise, two main issues must be addressed. the first one is syntactical: how to extract from q the sub-query q′ and the sub-query to be filtered by ρ to generate ds. the second issue is semantic: the misalignment between the objects in the original information space i, where searchondocset is executed, and the ones in i(s), to be virtually queried over and returned by virtquery.

syntactic issue: to obtain q′ and ds from q, the tagtick virtualizer module includes a query rewriter module that is in charge of rewriting q as a query:

q′ and qtags in session    (1)

both queries are compliant with the query grammar in table 1, but the second is a query that groups all tag predicates in q which are affected by s. the reason for this restriction is that the method searchondocset(q′, ds) performs an intersection between the docset ds and the docset obtained from the execution of q′. in principle, qtags in session may contain arbitrary combinations of tag predicates (i = t) combined with and and or operators.
to get a better understanding, refer to the examples in table 6, where we assume two tag interpretations A with terms {a1, a2} and B with terms {b1, b2}, where ρ(s, A, a1) and ρ(s, B, b1) are defined in ρ; note that keyword searches, e.g., “napoleon,” are not run over tag values. the first two queries can be executed, while the last one is invalid. indeed there is no way to factor out the tag predicate (A = a1) so that it can be separated and joined with the rest of the query using an and operator. clearly, the ability of the query rewriter module to rewrite the query independently of its complexity may be crucial to increase the usability level of the tagtick virtualizer. in its current implementation, the tagtick virtualizer assumes that q is provided to virtquery as already satisfying the expected query structure (1). as we shall see in the next section, this assumption is very reasonable in the realization of our annotation tagging tool tagtick and, more generally, in the definition of tools for annotation tagging. indeed, such tools typically allow data curators to run google-like free-keyword queries to be refined by a set of tags selected from a list. such queries fall within our assumption and also match the average requirements of this application domain.

q = "napoleon" and (A = a1 or B = b1)
    where: q′ = "napoleon"
           qtags in session = (A = a1 or B = b1)
q = (A = a2 or "napoleon") and (A = a1 or B = b1)
    where: q′ = (A = a2 or "napoleon")
           qtags in session = (A = a1 or B = b1)
q = (A = a1 or B = b2) and "napoleon"

table 6. query rewriting.

semantic issue: the command searchondocset(q′, ds) does not match the expected semantics of virtquery(q, s). the reason is that searchondocset is executed over the original information space i and objects in the returned result set may not reflect the new tagging imposed by actions in s.
for example, consider an untagging action for the tag (i, t) and the result set rs in s. although the objects in rs would never be returned for a query virtquery((i = t), s), they could be returned for queries regarding other properties and in this case they would still display the tag (i, t). to solve this problem, the function patchresultset in table 7 intercepts the result set returned by searchondocset and “patches” its objects, by properly removing or adding tags according to the actions in s. to this aim, the function exploits the low-level function intersectsize, which efficiently computes and returns the size of the intersection between two docsets. for each object d in a given result set rs, the function verifies if d belongs to the docsets ρ(s, i, t) relative to the tags touched by the session s: if this is the case (intersectsize returns 1), the object should be enriched with the tag (add(d, (i, t))), otherwise the tag should be removed from the object (remove(d, (i, t))).

\[
\mathrm{patchresultset}(rs, \rho, s) = \{\, \mathrm{patchdocument}(d, (i, t)) \mid d \in rs \wedge \rho(s, i, t) \neq \bot \,\}
\]
\[
\mathrm{patchdocument}(d, (i, t)) =
\begin{cases}
\mathrm{add}(d, (i, t)) & \text{if } \mathrm{intersectsize}(\{d\}, \rho(s, i, t)) = 1 \\
\mathrm{remove}(d, (i, t)) & \text{if } \mathrm{intersectsize}(\{d\}, \rho(s, i, t)) = 0
\end{cases}
\]

table 7. patching result sets.

the tagtick virtualizer also implements patching of results for browse queries. a solr browse query is a cql query q followed by the list of object properties l for which a group-by operation (in the sense of relational databases) is demanded. the query returns two responses: the query result set rs and the group-by statistics (l, v, k(l, v)) calculated over the result set and for the given properties, where k(l, v) is the number of objects featuring the value v for the property l in rs. as in the case of standard queries, the semantic issue affects browse queries when a group-by is applied over a tag interpretation i touched in the current work session.
indeed, the relative stats would be calculated over the information space i rather than the intended i(s). to solve this issue, when a browse query demands stats over a tag interpretation i, the relative triples (i, t, k(i, t)) are patched as follows:

1. if (i, t, k(i, t)) is such that ρ(s, i, t) = ⊥, i.e. the tag was not affected by the session, then k(i, t) is left unchanged;
2. if (i, t, k(i, t)) is such that ρ(s, i, t) = ds, then k(i, t) = intersectsize(ds, getdocset(rs)); the operation returns the number of objects currently tagged with (i, t) which are also present in the result set rs.

query execution: the implementation of virtquery can therefore be defined as

⟦virtquery(q, s)⟧tv = patchresultset(searchondocset(q′, ds), ρ, s)

where q is rewritten in terms of q′ and qtags in session by the query rewriter module, and ds is the docset obtained by applying the function ⟦.⟧v : q × s × ℘(s × i × t) → ds defined in table 8 to qtags in session. the function, given a query of tag predicates, a session identifier, and the status map ρ, returns the docset of objects satisfying the query in the session’s scope.

\[
\begin{aligned}
\llbracket q_1 \text{ or } q_2 \rrbracket_{v}(s, \rho) &= \mathrm{unifydocsets}(\llbracket q_1 \rrbracket_{v}(s, \rho), \llbracket q_2 \rrbracket_{v}(s, \rho)) \\
\llbracket q_1 \text{ and } q_2 \rrbracket_{v}(s, \rho) &= \mathrm{intersectdocsets}(\llbracket q_1 \rrbracket_{v}(s, \rho), \llbracket q_2 \rrbracket_{v}(s, \rho)) \\
\llbracket (i = t) \rrbracket_{v}(s, \rho) &= \rho(s, i, t)
\end{aligned}
\]

table 8. evaluation of qtags in session in session s.

the definition of ρ, the query rewriter module, the semantics of the commands action and virtquery, the definition of searchondocset, and the function ⟦.⟧v guarantee the validity of the following claim, crucial for the correctness of the tagtick virtualizer:

claim (search correctness). given an information space i, a map ρ, and a session s, for any query q such that

1. q = q′ and qtags in session
2. ds = ⟦qtags in session⟧v(s, ρ)

we can claim that

\[
\llbracket \mathrm{virtquery}(q, s) \rrbracket_{tv} = \llbracket \mathrm{query}(q, null) \rrbracket_{solr}^{i(s)}
\]

hence the implementation of the command virtquery matches its expected semantics.
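the two patching steps described above, for result sets and for browse statistics, can be sketched in a few lines. the sketch assumes that ρ is a dictionary keyed by (s, i, t), that each object in a result set is a dictionary with `id` and `tags`, and that browse statistics for tags are given as a dictionary from (i, t) to a count; these layouts and function names are illustrative assumptions. as a further assumption, the count of every tag touched in the session is recomputed, including tags that had no entry in the original statistics.

```python
def patch_result_set(rs, rho, session):
    """Return copies of the objects in rs whose tags reflect the session."""
    patched = []
    for doc in rs:
        tags = set(doc["tags"])
        for (s, i, t), ds in rho.items():
            if s != session:
                continue              # other sessions never leak into this view
            if doc["id"] in ds:       # intersectsize({d}, rho(s, i, t)) = 1
                tags.add((i, t))
            else:                     # intersectsize({d}, rho(s, i, t)) = 0
                tags.discard((i, t))
        patched.append({**doc, "tags": tags})
    return patched


def patch_counts(counts, rs, rho, session):
    """Recompute k(i, t) for tags touched in the session; keep the others."""
    rs_ids = frozenset(doc["id"] for doc in rs)
    patched = dict(counts)
    for (s, i, t), ds in rho.items():
        if s == session:
            patched[(i, t)] = len(ds & rs_ids)   # intersectsize(ds, getdocset(rs))
    return patched
```

both functions work on the result set only, so their cost grows with the number of returned objects and touched tags rather than with the size of the index.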
making a virtual information space persistent
the commitsession(s) command is responsible for updating the initial information space i with the changes applied in s, i.e., adding and removing tags on objects in i according to the actions in s. to this aim, the module relies on the map ρ, which associates each tag (i, t) to the set of objects virtually tagged by (i, t) in s, and on the low-level function andnotdocsets. by properly matching the set of objects tagged by (i, t) in i and i(s), the function derives the sets of objects to tag and untag in i. overall, the execution of commitsession(s) consists of:
1. identifying the set of tags affected by tagging or untagging actions in the session s: changedtags(s) = {(i, t) | ρ(s, i, t) ≠ ⊥}
2. for each (i, t) ∈ changedtags(s):
a) fetching the result set relative to all objects in i with tag (i = t): rs = query((i = t), null);
b) keeping in memory the relative docset ds = getdocset(rs);
c) calculating in memory the set of objects in i to be untagged by (i = t): tobeuntagged = andnotdocsets(ds, ρ(s, i, t));
d) calculating in memory the set of objects in i to be tagged with (i = t): tobetagged = intersectdocset(ρ(s, i, t), ds);
e) updating the index to tag and untag all objects in the two sets; and
f) removing session s.
the tagtick virtualizer module is also responsible for managing conflicts on commits and for avoiding index inconsistencies. to this aim, only the first commit action is executed, and once the relative actions are materialized into the index, all other sessions are invalidated, i.e., deleted.
tagtick user interface: annotation tagging for solr
the tagtick user interface module implements the functionalities presented previously over a solr index equipped with the tagtick virtualizer module described in the section on solr and annotation tagging (see figure 2).
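the commitsession steps listed in the previous section can be sketched as follows. the sketch rests on several assumptions: docsets are python sets, index_tags is a hypothetical dictionary standing in for the result of query((i = t), null) on the index, and the function names are invented for illustration. note also one interpretation: the objects to tag are computed here as those virtually tagged in the session but not yet tagged in i (a set difference), whereas step d of the text names an intersection function; readers should check this against the original implementation.

```python
def plan_commit(s, rho, index_tags):
    """Compute, per changed tag, the ids to tag and to untag in the index.

    rho maps (session, interpretation, tag) -> set of virtually tagged ids.
    index_tags maps (interpretation, tag) -> set of ids tagged in the index
    (a hypothetical stand-in for query((i = t), null)).
    """
    plan = {}
    for (sess, i, t), virtual in rho.items():
        if sess != s:
            continue  # only tags touched by the committing session
        ds = index_tags.get((i, t), set())
        to_untag = ds - virtual  # andnotdocsets(ds, rho(s, i, t))
        to_tag = virtual - ds  # assumption: set difference, see lead-in
        plan[(i, t)] = (to_tag, to_untag)
    return plan


def commit_session(s, rho, index_tags):
    """Materialize session s into the index and drop every pending session."""
    for (i, t), (to_tag, to_untag) in plan_commit(s, rho, index_tags).items():
        current = index_tags.get((i, t), set())
        index_tags[(i, t)] = (current | to_tag) - to_untag
    # the committed session is removed and all other sessions are invalidated
    rho.clear()


# Hypothetical data: two sessions touched the same tag; "s1" commits first.
rho = {("s1", "theme", "war"): {2, 3}, ("s2", "theme", "war"): {9}}
index_tags = {("theme", "war"): {1, 2}}
plan = plan_commit("s1", rho, index_tags)
commit_session("s1", rho, index_tags)
```

after the commit, the set of objects tagged in the index coincides with the set that was virtually tagged in the session, which is the intended effect of the procedure.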
the user interface offers authenticated data curators an annotation tagging environment where they can open work sessions, do and undo sequences of (un)tagging actions, and eventually commit the session into the current solr information space. when data curators log out from the tool, the module stores on disk their pending work sessions and the relative (un)tagging actions. such sessions are restored at the next access to the interface, so that data curators can continue their work.
figure 2. tagtick: user interface.
the tagtick user interface is a general-purpose module that can be configured to adapt to the classes and to the structure of objects residing in the index. to this aim, the module acquires this information from xml configuration files where data curators can specify:
1. the names of the different classes, the values used to encode such classes in the index, and the index field used to contain such values;
2. the list of tag interpretations together with the relative ontologies: in the current implementation ontologies are flat sets of terms, which can be optionally populated by curators during the tagging step; and
3. the intended use of interpretations: the association between classes and interpretations.
once instantiated, the tagtick user interface allows users to search for objects of all classes by means of free keywords and to refine such searches by class and by the tags relative to such class. this combination of predicates, which matches the query structure q = q′ and qtags in session expected by the tagtick virtualizer, is then executed by the module and the results are presented in the interface. users can then add or remove tags to the objects; the interface makes sure that the right interpretations are used for the given class.
as an example, we shall consider the real-case instantiation of tagtick in the context of the hope project, whose aim is to deliver a data infrastructure capable of aggregating metadata records describing multimedia objects relative to labour history and located across several data sources.24 such objects are collected, cleaned, and enriched to form an information space stored in a solr index. the index stores two main classes of objects: descriptive units and digital resources. descriptive unit objects contain properties describing cultural heritage objects (e.g., a pin). digital resource objects instead describe the digital material representing the cultural heritage objects (e.g., the pictures of a pin). tagtick is currently used in the hope project to classify the aggregated objects according to two tag interpretations: “historical themes,” to tag descriptive units with an ontology of terms describing historical periods, and “export mode,” to tag digital resources with an ontology that describes the different social sites (e.g., youtube, facebook, flickr) from which the resource must be made available. in particular, figure 3 illustrates the hope tagtick user interface. in the screenshot, a new tag “communism . . .” of the tag interpretation “historical themes” is being added to a set of descriptive units obtained by a query. the tagtick user interface offers the possibility to access the history of actions, in order to visualize their sequence and possibly to undo their effects. figure 4 shows the history of actions that led to the tag virtualization of the current work session. curators can only roll back the last action they performed. this is because virtual tagging actions may depend on each other; e.g., an action may be based on a query that includes tag predicates whose tag has been affected by previous actions.
other approaches may infer the interdependencies between the queries behind the tagging actions and expose dependency-based undo options.
figure 3. tagtick user interface: bulk tagging action.
figure 4. tagtick user interface: managing history of actions.
stress tests
the motivations behind the realization of tagtick are to be found in the annotation tagging requirements of bulk and real-time tagging. in general, the indexing speed of solr highly depends on the underlying hardware, on the number of threads used for feeding, on the average size of the objects and their property values, and on the kind of text analysis adopted.25 however, even assuming the most favorable scenario, bulk indexing in solr is comparatively slow with respect to other technologies, such as relational databases,26 and far from being real-time. in this section, we present the results of stress tests conceived to provide concrete measures of query performance, i.e., the real-time effect, the scalability of the tool, and how many tagging actions can be handled in the same session. the idea of the tests is to re-create worst-case scenarios and give evidence of the ability of tagtick to cope and scale in terms of response time and memory consumption. the experiments were run on a machine with an intel(r) xeon(r) cpu e5630 @ 2.53ghz processor (4 cores), 4 gb of total memory, and 100 gb of available disk (used at around 52 percent). the machine runs the ubuntu 10.04.2 lts operating system, with a java virtual machine configured as -Xmx1800m -XX:MaxPermSize=512m. in simpler terms, a medium-low server for a production index.
the index was fed with 10 million randomly generated objects with the following structure: [identifier: string, title: string, description: string, publisher: string, url: string, creator: string, date: date, country: string, subject: terms]. the tag interpretation subject can be assigned values from an ontology terms of scientific subjects, such as “agricultural biotechnology,” “automation,” “biofuels,” “biotechnology,” and “business aspects.” the objects are initially generated without tags. each test defines a new session s with k tagging actions of the form action(tag, virtquery(identifier <> id, null), t, s), where id is a random identifier and t is a random tag (subject, term). in practice, the action adds the tag t to all objects in the index, thereby generating docsets of size 10 million. once the k actions are executed, the test returns the following measures:
1. the size of the heap space required to store k tags in memory.
2. the minimal, average, and maximum time required to reply to two kinds of stress queries to the index (calculated out of 100 queries):
a. the query that combines identifier <> id with the conjunction (and) of the predicates (i = t) over all tags (i, t) ∈ s: the query returns the objects in the index which feature all tags touched by the session.
b. the query that combines identifier <> id with the disjunction (or) of the predicates (i = t) over all tags (i, t) ∈ s: the query returns the objects in the index which feature at least one of the tags assigned in the session.
in both cases, since tagging actions were applied to all objects in the index, the result will contain the full index. however, in one case the response will be calculated by intersecting docsets, while in the other case by unifying them. note that by selecting a random identifier value (id), the test makes sure that low-level solr optimization by caching is not fired, as this would compromise the validity of the test.
3. the minimal, average, and maximum time required to reply to browse queries which involve all tags used in the session (calculated out of 100 queries).
4. the time required to reconstruct the session in memory whenever the data curator logs into tagtick.
the results presented in figure 5 show that the average time for the execution of search and browse queries always remains under 2 seconds, which we can consider under the “real-time” threshold from the point of view of the users. user tests have been conducted in the context of the hope project, where curators were positively impressed by the tool. hope curators can today apply sequences of tagging operations over millions of aggregated records by means of a few clicks. moreover, independently of the number of tagging operations, queries over the tagged records take about 2 seconds to complete. the execution time has a major increase from 0 tags to 1 tag. this behavior is expected because when there is 1 tag in the session, the 10 million records must be “patched.” from 1 tag onwards the execution time increases as well, but not at the same rate as in the previous case. this means that in the average case patching 10 million records with 100 tags does not cost much more than tagging them with 1 tag.
figure 5. stress test for tagtick search and browse functionality.
the results in figure 6 show that the amount of memory to be used does not exceed the limits expected on reasonable servers running a production system. the time required to reconstruct the sessions is generally long, starting from 20 seconds for 50 tags up to 1.5 minutes for 200 tags. on the other hand, this is a one-time operation, required only when logging in to the tool.
figure 6. stress test for heap size growth and session restore time.
conclusions
in this paper, we presented tagtick, a tool devised to enable annotation tagging functionalities over solr instances.
the tool allows a data curator to safely apply and test bulk tagging and untagging actions over the index in almost real time and without compromising the activities of end users searching the index at the same time. this is possible thanks to the tagtick virtualizer module, which implements a layer over solr that enables real-time and virtual tagging by keeping in memory the inverted list of objects associated with a (un)tagging action. the layer is capable of parsing user queries to intercept the usage of tags kept in memory and, in this case, of manipulating the query response to deliver the set of objects expected after tagging. future developments may regard the ability to enable more complex query parsing to handle rewriting of a larger set of queries beyond the google-like queries currently handled by the tool. another interesting challenge is tag propagation. curators may be interested in having the action of (un)tagging an object propagated to objects that are somehow related to that object. handling this problem requires the inclusion into the information space model of relationships between classes of objects and the extension of the tagtick virtualizer module for the specification and management of propagation policies.
acknowledgements
the work presented in this paper has been partially funded by the european commission fp7 econtentplus-2009 best practice networks project hope (heritage of the people’s europe, http://www.peoplesheritage.eu), grant agreement 250549.
references
1. arkaitz zubiaga, christian körner, and markus strohmaier, “tags vs shelves: from social tagging to social classification,” in proceedings of the 22nd acm conference on hypertext and hypermedia, 93–102 (new york: acm, 2011), http://dx.doi.org/10.1145/1995966.1995981.
2. meng wang et al., “assistive tagging: a survey of multimedia tagging with human-computer joint exploration,” acm computing surveys 44, no. 4 (september 2012): 25:1–24, http://dx.doi.org/10.1145/2333112.2333120.
3. lin chen et al., “tag-based web photo retrieval improved by batch mode re-tagging,” in 2010 ieee conference on computer vision and pattern recognition (cvpr) (june 2010), 3440–46, http://dx.doi.org/10.1109/cvpr.2010.5539988.
4. emanuele quintarelli, andrea resmini, and luca rosati, “information architecture: facetag: integrating bottom-up and top-down classification in a social tagging system,” bulletin of the american society for information science & technology 33, no. 5 (2007): 10–15, http://dx.doi.org/10.1002/bult.2007.1720330506.
5. stijn christiaens, “metadata mechanisms: from ontology to folksonomy . . . and back,” in lecture notes in computer science: on the move to meaningful internet systems 2006: otm 2006 workshops (berlin heidelberg: springer-verlag, 2006).
6. m. mahoui et al., “collaborative tagging of art digital libraries: who should be tagging?” in theory and practice of digital libraries, ed. panayiotis zaphiris et al., 162–72, vol. 7489, lecture notes in computer science (springer berlin heidelberg, 2012), http://dx.doi.org/10.1007/978-3-642-33290-6_18.
7. alexandre passant and philippe laublet, “meaning of a tag: a collaborative approach to bridge the gap between tagging and linked data,” in proceedings of the linked data on the web (ldow2008) workshop at www2008, http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.142.6915.
8. michael khoo et al., “towards digital repository interoperability: the document indexing and semantic tagging interface for libraries (distil),” in theory and practice of digital libraries, ed. panayiotis zaphiris et al., 439–44, vol. 7489, lecture notes in computer science (springer berlin heidelberg, 2012), http://dx.doi.org/10.1007/978-3-642-33290-6_49.
9. leonardo candela et al., “setting the foundations of digital libraries: the delos manifesto,” d-lib magazine 13, no. 3/4 (march/april 2007), http://dx.doi.org/10.1045/march2007-castelli.
10. jennifer trant, “studying social tagging and folksonomy: a review and framework,” journal of digital information (january 2009), http://hdl.handle.net/10150/105375.
11. cameron marlow et al., “ht06, tagging paper, taxonomy, flickr, academic article, to read,” in proceedings of the seventeenth conference on hypertext and hypermedia, 31–40 (new york: acm, 2006), http://dx.doi.org/10.1145/1149941.1149949.
12. andrea civan et al., “better to organize personal information by folders or by tags? the devil is in the details,” proceedings of the american society for information science and technology 45, no. 1 (2008): 1–13, http://dx.doi.org/10.1002/meet.2008.1450450214.
13. marianne lykke et al., “tagging behaviour with support from controlled vocabulary,” in facets of knowledge organization, ed. alan gilchrist and judi vernau, 41–50 (bingley, uk: emerald group, 2012).
14. guus schreiber et al., “semantic annotation and search of cultural-heritage collections: the multimedian e-culture demonstrator,” web semantics: science, services and agents on the world wide web 6, no. 4 (2008): 243–49, http://dx.doi.org/10.1016/j.websem.2008.08.001.
15. diana maynard and mark a. greenwood, “large scale semantic annotation, indexing and search at the national archives,” in proceedings of lrec vol. 12 (2012).
16. martin feijen, “driver: building the network for accessing digital repositories across europe,” ariadne 53 (october 2007), http://www.ariadne.ac.uk/issue53/feijen-et-al/.
17. heritage of the people’s europe (hope), http://www.peoplesheritage.eu/.
18. european film gateway project, http://www.europeanfilmgateway.eu.
19. paolo manghi et al., “openaireplus: the european scholarly communication data infrastructure,” d-lib magazine 18, no. 9–10 (september 2012), http://dx.doi.org/10.1045/september2012-manghi.
20. panagiotis antonopoulos et al., “efficient updates for web-scale indexes over the cloud,” in 2012 ieee 28th international conference on data engineering workshops (icdew), 135–42, april 2012, http://dx.doi.org/10.1109/icdew.2012.51.
21. chun chen et al., “ti: an efficient indexing mechanism for real-time search on tweets,” in proceedings of the 2011 acm sigmod international conference on management of data, 649–60 (new york: acm, 2011), http://dx.doi.org/10.1145/1989323.1989391.
22. rafal kuc, apache solr 4 cookbook (birmingham, uk: packt, 2013).
23. david smiley and eric pugh, apache solr 3 enterprise search server (birmingham, uk: packt, 2011).
24. the hope portal: the social history portal, http://www.socialhistoryportal.org/timeline-map-collections.
25. this assumes that a stand-alone instance of solr is operated, hence not relying on solr sharding techniques with parallel feeding.
26. whyusesolr, solr wiki, http://wiki.apache.org/solr/whyusesolr.

in the name of the name: rdf literals, er attributes, and the potential to rethink the structures and visualizations of catalogs
manolis peponakis
abstract
the aim of this study is to contribute to the field of machine-processable bibliographic data that is suitable for the semantic web. we examine the entity relationship (er) model, which has been selected by ifla as a “conceptual framework” in order to model the fr family (frbr, frad, and rda), and the problems er causes as we move towards the semantic web. subsequently, while maintaining the semantics of the aforementioned standards but rejecting the er as a conceptual framework for bibliographic data, this paper builds on the rdf (resource description framework) potential and documents how both the rdf and linked data’s rationale can affect the way we model bibliographic data. in this way, a new approach to bibliographic data emerges where the distinction between description and authorities is obsolete. instead, the integration of the authorities with descriptive information becomes fundamental so that a network of correlations can be established between the entities and the names by which the entities are known. naming is a vital issue for human cultures because names are not random sequences of characters or sounds that stand just as identifiers for the entities; they also have socio-cultural meanings and interpretations. thus, instead of describing indivisible resources, we could describe entities that appear in a variety of names on various resources. in this study, a method is proposed to connect the names with the entities they represent and, in this way, to document the provenance of these names by connecting specific resources with specific names.
introduction
the basic aim of this study is to contribute to the field of machine-processable bibliographic data. as to what constitutes “machine processable” we concur with the clarification of antoniou and van harmelen, who state, “in the literature the term machine-understandable is used quite often. we believe it is the wrong word because it gives the wrong impression. it is not necessary for intelligent agents to understand information; it is sufficient for them to process information effectively, which sometimes causes people to think the machine really understands.”1 also, in the bibliography used, the term “computationally processable” is used as a synonym for “machine-processable.”
manolis peponakis (epepo@ekt.gr) is an information scientist at the national documentation centre, national hellenic research foundation, athens, greece.
information technology and libraries | june 2016 19
with regard to machine-processable bibliographic data, we have taken into consideration both the practice and theory of library and information science (lis) and computer science. from lis we have chosen the functional requirements for bibliographic records (frbr) and the functional requirements for authority data (frad) while making comparisons with the resource description and access (rda) standard. from the computer science domain we have chosen the resource description framework (rdf) as a basic mechanism for the semantic web. we examine the entity relationship (er) model (selected by ifla as a “conceptual framework” for the development of frbr),2 as well as the potential problems that may arise as we move towards the semantic web. having rejected the er model as a conceptual framework for bibliographic data, we have built on the potential of rdf and document how its rationale affects the modeling process. in the context of the semantic web and uniform resource identifiers (uris), the identification process has been transformed.
for this reason we have performed an analysis of appellations and names as identifiers and also explored how we could move on from an era where controlled names play the role of identifiers to one of the uri dominion: “while it is self-evident that labels and comments are important for constructing and using ontologies by humans, the owl standard does not pay much attention to them. the standard focuses on the syntax, structure and reasoning capabilities. . . . if the semantic web is to be queried by humans, there will be no other way than dealing with the ambiguousness of human language.”3 it is essential to build on the “library's signature service, its catalog,”4 and use it to provide added-value services. but to get there, first there has to be “a shift in perspective, from locked-up databases of records to open data shared on the web.”5 this requires a transition from descriptions aimed at human readers to descriptions that put the emphasis on computational processes, in order to escape the rationale of records being a condensed description in textual form and move towards more flexible and fruitful representations and visualizations.
background
frbr and rda
the fr family has been growing for more than a decade. the first member of the family was the functional requirements for bibliographic records (frbr),6 the first version of which was published towards the end of the last century. subsequently, ifla decided to extend the model in order to cover authorities. during this process, the task of modeling the names was separated from the task of modeling the subjects. thus two new members were added to the family: the “functional requirements for authority data: a conceptual model” (frad) and the “functional requirements for subject authority data” (frsad).7,8 during the same period, the “resource description and access” (rda) standard was established as a set of cataloging rules to replace the aacr standard.
according to its creators, the alignment with the fr family was crucial. as stated, “a key element in the design of rda is its alignment with the conceptual models for bibliographic and authority data developed by the international federation of library associations and institutions (ifla): functional requirements for bibliographic records [and] functional requirements for authority data.”9
in the name of the name: rdf literals, er attributes, and the potential to rethink the structures and visualizations of catalogs | peponakis |doi:10.6017/ital.v35i2.8749 20
this paper uses the fr family and the rda as a starting point but detects some problems and inconsistencies between these models. it retains the basic semantics of these standards but rejects their structural formalism because it finds that it is quite problematic and lacks effectiveness in expressing highly machine-processable data. the effective processability of the data will be discussed in detail in the section “the impact of the representation scheme’s selection: rdf versus er.” among the fr family, the terminology is inconsistent and, as we pass from the frbr to frad and frsad, even the perception angle of the general model undergoes change. in frbr (the first in order), there is no notion of the name as an entity. frad introduces this perception (frad also adds family as a new entity) and frsad makes a step forward and introduces the concept of nomen instead of the concept of name.
hence, despite the fact that each of the members of the fr family of models has been represented in rdf,10 there is no established consolidated edition yet that combines the different angles using a common model and terminology (vocabulary).11 these representations (one for each model) are available at ifla’s website.12 on the other hand, in the context of rda there may be more consistency regarding terminology, but, as is well established in the relevant literature, there are significant differences between the two models, i.e., the fr family and rda.13,14,15 due to these differences, there are no uris, not even in the rda registry, in the examples of our study.16 given the above, the terms appearing in the figures are a selection from the three texts of the fr family. thus, nomen (from frsad) is used instead of name (from frad) as a more abstract notion, and the attribute (property in the context of rdf) “has string” (from frad) is used to assign a specific literal to a nomen. in figures 2–5 we have used the “has appellation” (reversed “is appellation of”) relationship of frad.17
notes about terminology and graphs: how to read the figures
in this paper two different sorts of figures appear. this covers the need to compare two different models and pinpoint the differences between them and the problems that arise from selecting the er model to express frbr. an explanation of the two major models follows in the next subsection.
the first figure type follows the diagrams of the entity–relationship model and is used in figure 1. in this case:
• the rectangles represent entities.
• the oval shapes represent attributes.
• the diamond-shaped boxes represent relationships.
the second figure type has been created according to the rdf graphical representations and is used in figures 2–5.
in these cases: • the oval shapes represent nodes that are identified by a uri and they could serve as objects or subjects for further expansion of the network. in figures 3–5 all the names were derived from the fr entities. • the line connectors between nodes represent the predicates (i.e., they are properties) and should also serve as uris. • the rectangle shapes represent literals consisting of lexical form. language code could apply in these cases. with or without language codes, these are the end points and they could not be subject to new connections. we follow the common modeling of the language in rdf in which the literal itself contains a language code, for example "example"@en in standard turtle syntax, or <rdfs:label xml:lang="en"> in rdfs xml coding. we must note that this kind of modeling is quite a simplistic way of language modeling because there is no mechanism to declare more information about language, such as multiple scripts, which could apply in the context of the same language. the impact of the representation scheme’s selection: rdf versus er nowadays, all the information on library catalogs is created through and stored in computers. this technological infrastructure provides specific methods and dictates limitations for the catalog’s data management. hence, every model must take into consideration the basic rationale of the technological infrastructure that will curate and process the data. depending on the syntax capabilities of the representation model, the expression of what we want to express becomes reasonably easy and accurate since “semantics is always going to have a close relationship with the field of syntax.”18 this establishes a vital relationship between what we want to do and how computers can do it. in this section we emphasize the limitations of the entity relationship (er) implementation, which frbr proposes, and denote how syntax affects expressiveness and, accordingly, functionality. 
finally, we demonstrate how the selection of one implementation or another (in our case er vs. rdf) has serious implications, both for cataloging rules and for cataloging practice.
why do we compare these two specific models? the er model is the base that has been selected by ifla as a “conceptual framework”19 for the development of frbr, while frbr is the conceptual model upon which rda has been founded. subsequently, rda is also affected by the choice of er model. on the other hand, rdf is the current conceptualization for resource description in the web of data. so, what kind of problems and conflicts arise from the implementations of each of these models? the basic rationale of er comprises three fundamental elements. there are entities; entities have attributes; and there are relationships between entities. it is also possible to declare cardinality constraints upon which the fr family builds. then again, rdf implies quite a different model. “the core structure of the abstract syntax is a set of triples, each consisting of a subject, a predicate and an object. a set of such triples is called an rdf graph. an rdf graph can be visualized as a node and directed-arc diagram, in which each triple is represented as a node-arc-node link. . . . there can be three kinds of nodes in an rdf graph: iris, literals, and blank nodes.”20 “linking the object of one statement to the subject of another, via uris, results in a chain of linked statements, or linked data. this avoids the ambiguity of using natural language strings as headings to match statements.
as a result, a literal object terminates a linked data chain, and literals are generally used for human-readable display data such as labels, notes, names, and so on.”21 as a representative example of the differences between the two models, let us consider “place of publication.” peponakis counts nine attributes of place and notices that, due to the fact that the er model does not allow links between attributes, there is no way to define explicitly whether these attributes address the same place or not.22 taking this problem into consideration, we demonstrate the transition from the er attributes approach to rdf implementations in figures 1–2. let us assume that there is person (x), who was born in london, is named john smith and works at publisher (y). this publisher is located in london, where book (1), entitled history of london, has been published. for this specific book, person x was the lithographer. if we create a strict mapping to frbr entities, attributes, and relations, then we have the situation illustrated in figure 1. due to the fact that there is no way to link the four occurrences of london (inasmuch as there is no option to define relations between attributes in the er model), there is no way to be certain that london is the same in all cases. judging only by the name, it could stand for london in england, in ontario, in ohio, or elsewhere.
figure 1. example of “place” as attribute of several entities
the ifla working group has faced the problem with place and noted the following. the model does not, however, parallel entity relationships with attributes in all cases where such parallels could be drawn. for example, “place of publication/distribution” is defined as an attribute of the manifestation to reflect the statement appearing in the manifestation itself that indicates where it was published.
inasmuch as the model also defines place as an entity it would have been possible to define an additional relationship linking the entity place either directly to the manifestation or indirectly through the entities person and corporate body which in turn are linked through the production relationship to the manifestation. to produce a fully developed data model further definition of that kind would be appropriate. but for the purposes of this study it was deemed unnecessary to have the conceptual model reflect all such possibilities.23

finally, they seem to avoid the problem and repeat their position in frad as well.

in certain instances, the model treats an association between one entity and another simply as an attribute of the first entity. for example, the association between a person and the place in which the person was born could be expressed logically by defining a relationship (“born in”) between person and place. however, for the purposes of this study, it was deemed sufficient to treat place of birth simply as an attribute of person.24

for some reason the creators of the fr family have chosen not to “upgrade” the attributes of place into one and only one entity. furthermore, the same problem exists for many attributes, not only for place. thus, the problem has to do with the selection of er as “conceptual framework” and not with the specific entity of place. if we accept that “place of publication” need not be recorded as it appears on the resource, an rdf-based approach makes things clearer, as figure 2 shows. in this case, all attributes of place are promoted to the same rdf node and, instead of four repeats of the attribute with the value “london,” we reduce it to one and only one node with four connections to it.
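the contrast between the attribute view of figure 1 and the node view of figure 2 can be sketched with plain python data structures. this is only an illustration under stated assumptions: the identifiers (ex:person/x, ex:place/london), the property names, and the choice of the book's subject as the fourth occurrence of london are hypothetical and are not taken from frbr, rda, or any published rdf vocabulary; triples are modeled as tuples rather than with an rdf library.

```python
# er-style view: every entity carries its own copy of the string "london".
er_attributes = [
    ("person x", "place of birth", "london"),
    ("publisher y", "place", "london"),
    ("book 1", "place of publication", "london"),
    ("book 1", "subject", "london"),  # assumed fourth occurrence
]

# rdf-style view: one node stands for the place and four triples point to it.
LONDON = "ex:place/london"  # hypothetical uri for one specific london
rdf_triples = [
    ("ex:person/x", "ex:placeOfBirth", LONDON),
    ("ex:publisher/y", "ex:locatedIn", LONDON),
    ("ex:book/1", "ex:placeOfPublication", LONDON),
    ("ex:book/1", "ex:subject", LONDON),
    (LONDON, "ex:label", '"london"'),  # the literal is only a display label
]


def place_nodes(triples):
    """Return the distinct non-literal objects, i.e. the nodes that are linked to."""
    return {obj for (_, _, obj) in triples if not obj.startswith('"')}


def links_to(node, triples):
    """Return the (subject, predicate) pairs whose object is the given node."""
    return [(s, p) for (s, p, obj) in triples if obj == node]


# in the er view the four values are equal strings, but nothing states that
# they denote the same place; in the rdf view identity is explicit.
distinct_er_strings = {value for (_, _, value) in er_attributes}
print(len(distinct_er_strings), len(place_nodes(rdf_triples)), len(links_to(LONDON, rdf_triples)))
```

the er list can only be compared by string equality, which cannot distinguish london in england from london in ontario, whereas the triple list names one node and lets every statement about the place attach to it.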
then, as illustrated by figure 2, we can be sure that all instances refer to the same london.

figure 2. rdf-based representations of figure 1

in figure 2, it is assumed that there is no need to transcribe the literal of “place of publication” from the resource; i.e., we did not follow rule 2.8.1.4 of rda: “transcribe places of publication and publishers' names as they appear on the source of information.” for cataloging rules that require recording the place as it appears on the resource, readers can consult the subsection “place names” in this study.

last but not least, rdf has another significant advantage compared to the er model: data coded in rdf are packed ready for use in the semantic web. by contrast, data coded in er must undergo conversion, with all its implications, in order to be published in the semantic web.

names, entities, and identities

in this section, the significance of names as carriers of meaning is outlined and the importance of documenting the relations of names with the entities and identities they refer to is established. additionally, the basic approaches are presented for metadata generation for managing names. these approaches resulted in the dissociation of authorities from bibliographic records, which in turn left both frbr/frad and rda without a way to link, explicitly, the entity with the names it goes by. this linking, as presented later in this text, is fundamental for the description and interpretation of the entity.

in everyday communication, the usage of a name in a sentence plays the role of the identifier for the entity that this specific name indicates. if the speakers share a common background, there is no need for qualifiers other than the name in order to disambiguate information such as whether nick is person x or person y, or if the word “london” indicates the city in ohio or in england, etc.
thus, the common background leads to a very limited context in which the interpretation of the name and the assignment to the appropriate entity is sufficient and accurate. however, the context of the internet is extended into a variety of possibilities, so there is need of a more precise way to identify specific entities.

in this regard, an essential issue is the distinction between the properties of the name and the properties of the entity that is represented by the specific name. the word “john” could be recognized as an english name, but we commit a logical fallacy if we assume that john knows english. a representative example of this kind of inference (syllogism) can be found in rayside and campbell.25 statement: “man is a species of animal. socrates is a man. therefore, socrates is a species of animal. . . . ‘man' is a three-lettered word. socrates is a man. therefore, socrates is a three-lettered word.”

therefore the authorities of a catalog should embody a two-level modeling of the information they represent. the first has to do with the entities and the second with the names of these entities. consequently, there is the need to find a way to pass from names to the entities they indicate; and, from entities, to the various appellations that these entities have.

in catalogs, it is somewhat vague whether the change of a name signifies a new identity. niu states: “for example: the maiden name and the married name of an agent are normally not considered two separate identities, yet one pseudonym used for writing fiction and another pseudonym used for writing scientific works are often considered two different identities of an agent.”26 then there can be one individual with many identities.
but there can also be one identity which incorporates many individuals: for example, a shared pseudonym for a group of authors. to deal with these problems, frad introduces the notion of persona, rejecting at the same time the idea that a person is equal to an individual. frad defines a person as an “individual or a persona or identity established or adopted by an individual or group.”27 the question that arises here is when the persona must be conceived as a new identity. yet, frad does not make a sufficient judgment; instead, they refer to cataloguing rules. “under some cataloguing rules, for example, authors are uniformly viewed as real individuals, and consequently specific instances of the bibliographic entity person always correspond to individuals. under other cataloguing rules, however, authors may be viewed in certain circumstances as establishing more than one bibliographic identity, and in that case a specific instance of the bibliographic entity person may correspond to a persona adopted by an individual rather than to the individual per se.”28 so there is no specific guidance if, for example, in the case of “religious relationship,”29 there must be one identity created with two alternative names or two different identities. rule 9.2.2.8 in rda does not elaborate further. still, even with the problem of identities solved, the matter of appellations itself could be extremely complicated, and this is widely addressed in relevant literature.30,31,32 the viaf project confirms this with an extremely huge data set .33 assigning all appellations as attributes is an easy way to model the variants of a name, but it is very simplistic because it “does not allow these appellations to have attributes of their own and neither does it allow the establishing of relationships among the appellations. . . . 
frad makes a big step forward: all appellations are defined as entities in their own right, thus allowing full modeling.”34 of course, frad’s approach is not a novelty in the domain of lis since library catalogs have been modeling names since the era of marc. in unimarc authorities,35 the control subfield $5 contains a coded value to indicate the relations between the names with values such as “k = name before the marriage,” “i = name in religion,” “d = acronym,” etc., and in marc 21 there is the corresponding subfield $w.36 frad puts these values on a more consistent and abstract level. frad also defines “relationships between persons, families, corporate bodies, and works” in section 5.3 and “relationships between their various names” in section 5.4.37

the distinction between authorities and descriptive information

since the days of card catalogs and for as long as marc and aacr have been used, bibliographic records have been grounded in the dichotomy between descriptive information and controlled access points. the various types of headings stand for controlled access points. the original purpose of headings was alphabetical sorting. with the advent of computers, they were used as string identifiers to cluster and retrieve relevant bibliographic records. these bibliographic records had a body of descriptive information that was transcribed from the resource and remained unchanged. so the headings were the keys to the records and the records were surrogates for documents. “the elements of a bibliographic record . . . were designed to be read and comprehended by human beings, not by machines”38; established headings are not an exception. one of their basic characteristics was the precondition that they were unique in the context of a specific catalog, thereby avoiding ambiguity.
in every case of synonymy, qualifiers (such as date of birth or profession) were added to disambiguate, while the names also played the role of a unique identifier. from this process, an issue emerges: the information that appears on the document has changed and the controlled name may be completely different from the name on the resource. this means that the cataloger performs a transformation of the information, and this transformation carries two dangers. first, by changing the name, there is the possibility of assigning the entity behind the name to a wrong entity. second, by disturbing the correspondence between the information on the resource and the information on the record of the resource, the record becomes a problematic surrogate of the resource. to surpass this obstacle, traditional catalogs split the information into two different areas: one with the established forms, i.e., the headings; and the second with the purely descriptive information, i.e., the information that must be transcribed from the resource. this is the reason why traditional library catalogs put much effort into transcribing information from resources and very detailed guidelines have been developed. on the other hand, current approaches on metadata creation (such as dublin core) seem to underestimate the importance of descriptive information while concentrating on the established forms of names. but how can we be sure that different literals communicate the same meaning? does this kind of simplification, perhaps, cause problems regarding the integrity of the information? the names are not just sequences of characters (i.e., strings), but they carry latent information. it is known that there are women who wrote using male names (for example mary ann evans wrote as george eliot) and men who wrote by using female names. there are also nicknames for groups (e.g., “richard henry” is a pseudonym for the collaborative works of richard butler and henry chance newton), etc. 
therefore, it is important not to ignore names and the forms in which they appear on the resources, but to model them in such a way that integration between authorities and descriptive information is feasible, and the names are efficiently machine-processable.

integrating authorities with descriptive information

as we have already stated, traditional library catalogs are built on the dichotomy between description and access points. this analysis aims to bring descriptive information and authorities closer, i.e. to connect the access point of catalogs with the description of the resource. the basic principle of the model presented in this section is to promote each verbal (lexical) representation of a name to a nomen, whether this form of the name derives from a controlled vocabulary or not. in the cases that this form appears in a specific vocabulary, appropriate properties could be used to indicate such a relation. in this section, some representative examples are presented.

it is important to note, once again, that every node and relation in the following figures could (and must, in the context of the semantic web) be identified by a uri, except for the values in rectangles, which are rdf simple literals and therefore cannot be the subjects of further expansion. thus, the chain is the following: every individual (instance of the relevant class) acquires a uri. every individual is connected through the “has appellation” property (which also acquires a uri) to a nomen (which also acquires a uri), and these nomens end up connected to a plain rdf literal, which is in natural language wording and cannot be subjected to further analysis.
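the chain from individual to nomen to literal can be sketched as follows. this is a minimal sketch, not the model's actual vocabulary: the uris and the property names ex:hasAppellation, ex:nomenString, and ex:statedResponsibility are hypothetical placeholders, and the graph is a plain python list of tuples rather than a real rdf store. the example reuses the case of mary ann evans, who wrote as george eliot.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


def is_literal(term: str) -> bool:
    """Literals are written in double quotes; everything else is treated as a uri."""
    return term.startswith('"')


GRAPH: List[Triple] = [
    # the individual is connected to each of its nomens
    Triple("ex:person/1", "ex:hasAppellation", "ex:nomen/1"),
    Triple("ex:person/1", "ex:hasAppellation", "ex:nomen/2"),
    # each nomen ends in a plain literal
    Triple("ex:nomen/1", "ex:nomenString", '"mary ann evans"'),
    Triple("ex:nomen/2", "ex:nomenString", '"george eliot"'),
    # a manifestation can point at the specific nomen it carries
    Triple("ex:manifestation/1", "ex:statedResponsibility", "ex:nomen/2"),
]


def appellations(entity: str, graph: List[Triple]) -> List[str]:
    """Follow entity -> nomen -> literal and return the literal strings."""
    nomens = {t.obj for t in graph
              if t.subject == entity and t.predicate == "ex:hasAppellation"}
    return sorted(t.obj.strip('"') for t in graph
                  if t.subject in nomens and t.predicate == "ex:nomenString")


def entity_behind(nomen: str, graph: List[Triple]) -> List[str]:
    """Go the other way: from a nomen to the entities it names."""
    return sorted(t.subject for t in graph
                  if t.predicate == "ex:hasAppellation" and t.obj == nomen)


print(appellations("ex:person/1", GRAPH))
print(entity_behind("ex:nomen/2", GRAPH))
# no literal ever appears as a subject, so the chain always ends at the literal
print(any(is_literal(t.subject) for t in GRAPH))
```

because the nomen is a node of its own, the manifestation can refer to the exact form of the name it carries while the person node still gathers every appellation.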
place names

the problem of place as an attribute in frbr and frad has also been analyzed in the background analysis of the current paper, specifically in the subsection “the impact of the representation scheme’s selection: rdf versus er.” here, a solution to this problem that is compatible with the frbr/rda solution is proposed. by promoting every nomen of a place to an rdf node, there is the option of referring to the entity of place as a whole or to a specific appellation of this entity. so, the relation (property in the context of rdf) between the subjects of a work could be indicated by connecting work x with place z. on the other hand, according to rule 2.8.1.4 of rda, the place of publication for the manifestation must be transcribed as it appears on the source of information. but following the connections presented in figure 3, it is easy to infer that this specific nomen corresponds to the same entity, i.e., to the same place.

figure 3. place

personal names

in the section “names, entities and identities,” we analyzed many of the problems associated with personal names. here, a model is presented where the work (and expression) is connected directly with the author, whereas manifestation is connected with a specific appellation, i.e., nomen, of this author.

figure 4. statements of responsibility

rda rule 2.4.1.4 states, “transcribe a statement of responsibility as it appears on the source of information.” but occasionally the statement of responsibility may contain phrases and not just names. in these cases, a solution similar to the metadata object description schema (mods) could be implemented where, if needed, the statement of responsibility is included in the note element using the attribute type="statement of responsibility."

titles

the management of titles in frbr and rda indicates a different point of view between the two standards.
according to rda there is no title for the expression,39 and, as taniguchi states, this is a “significant difference between frbr and rda.”40 bibframe abides by the same principle of downgrading expression, since it entangles expression with work in an indivisible unit. in this regard, bibframe is closer to rda than to frbr. the notion of work has nothing to do with specific languages, even in the case when the work is a written text. therefore the assignment of the title of work to a specific appellation is an unnecessary limitation. on the contrary, the title of a manifestation is derived from a specific resource. we argue that between these two poles there is the title of expression, which could stand as a uniform title per language.

figure 5. titles

visualization of bibliographic records and cataloging rules

resource description in the domain of lis, from cutter’s era to the present day, emphasizes static linear textual representations. according to the rda “0.1 key features,” “in rda, there is a clear line of separation between the guidelines and instructions on recording data and those on the presentation of data. this separation has been established in order to optimize flexibility in the storage and display of the data produced using rda. guidelines and instructions on recording data are covered in chapters 1 through 37; those on the presentation of data are covered in appendices d and e.” but the tables in the relevant appendices (d and e) contain guidelines that are mainly concentrated on punctuation issues, and they do not take into consideration the dynamics of current interactive user interface capabilities.
as coyle and hillmann comment, “there are instructions for highly structured strings that are clearly not compatible with what we think of today as machine-manipulable data.”41 it is rather like producing high-tech cards: rda is faithful to the classical text-centric approaches that produce bibliographic records as a linear enumeration of attributes; thus, rda can be likened to a new suit that is quite old fashioned.

traditional catalogs (from card catalogs to opacs and repository catalogs) were built upon the principle of creating autonomous records. frbr called this principle, i.e. one record for each resource, into question, while linked data abolishes it. this way, a gigantic graph of statements is created, while a certain part of these statements (not always the same) responds to or describes the desired information. thus, a more sophisticated method for showing the results emerges, if it does not impose itself. therefore, the issue is not to present a record that describes a specific resource, since this conceptualization tends to be obsolete altogether. consequently, the visualization has to be different, depending on the data structure as well as on the interface available to the searcher.

in this context, the analysis of this study tries to keep in balance the machine-processable character of rdf that builds on identifiers (uris), while paying attention to the linguistic representation of entities. we argue that the balance between them will result in highly accurate and efficient representations for both humans and software agents. let us consider the model for titles that has been introduced in this study.
according to frbr, “if the work has appeared under varying titles (differing in form, language, etc.), a bibliographic agency normally selects one of those titles as the basis of a ‘uniform title’ for purposes of consistency in naming and referencing the work.”42 rda treats the case in a very similar way: rule 5.1.3 states, “the term ‘title of the work’ refers to a word, character, or group of words and/or characters by which a work is known. the term ‘preferred title for the work’ refers to the title or form of title chosen to identify the work. the preferred title is also the basis for the authorized access point representing that work”. in this study, we consider the aforementioned statements as a projection that springs from the days when records were static textual descriptions independent of interfaces. nowadays we are moving towards a much clearer distinction between the entity and its names. this is reflected in figure 5, in which the connection between a work and its author has nothing to do with specific names (appellations) but is based on uris. the selection of the appropriate name as a title for the specific work could be based on certain criteria such as the language of the interface: in this case, the title of the work will be the title of the user interface language, and if this is not possible (i.e. there is no title label in this language), then it could be the title of the catalog’s default language. following the kind of modeling proposed in the current study, the visualizations of data become more flexible and efficient in a variety of dynamic ways. hence, we can isolate and display nodes and their connections, correlate them with the interface language or screen size (i.e., mobile phone or pc), create levels relative to the desired depth of analysis, personalize them upon the user’s request or habits, and so on. also, it becomes possible to display the data in forms other than textual. 
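the language-based selection of a display title can be sketched as a small fallback function. this is a hedged illustration only: the function name select_title, the language codes, and the sample titles are assumptions made for the example and do not come from frbr, rda, or the model's figures.

```python
from typing import Dict, Optional


def select_title(titles: Dict[str, str],
                 ui_language: str,
                 default_language: str) -> Optional[str]:
    """Pick the title label to display for a work.

    Prefer the label in the language of the user interface; if there is no
    label in that language, fall back to the catalog's default language.
    Returns None when neither label exists.
    """
    if ui_language in titles:
        return titles[ui_language]
    return titles.get(default_language)


# hypothetical title labels attached to one work, keyed by language code
titles_by_language = {
    "en": "crime and punishment",
    "fr": "crime et châtiment",
}

print(select_title(titles_by_language, ui_language="fr", default_language="en"))
print(select_title(titles_by_language, ui_language="el", default_language="en"))
```

the work itself is identified by its uri in both calls; only the label shown to the user changes with the interface language.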
“as a result, humans, with their great visual pattern recognition skills, can comprehend data tremendously faster and more effectively through visualization than by reading the numerical or textual representation of the data.”43

as we have already mentioned, the syntax and the semantics are always going to have a close relationship, but it is crystal clear that, now more than ever, the current semantic web standards allow for greater flexibility. as dunsire et al. put it,

the rdf approach is very different from the traditional library catalog record exemplified by marc21, where descriptions of multiple aspects of a resource are bound together by a specific syntax of tags, indicators, and subfields as a single identifiable stream of data that is manipulated as a whole. in rdf, the data must be separated out into single statements that can then be processed independently from one another; processing includes the aggregation of statements into a record-based view, but is not confined to any specific record schema or source for the data. statements or triples can be mixed and matched from many different sources to form many different kinds of user-friendly displays.44

in this framework, cataloging rules must reexamine their instructions in light of the new opportunities offered by technological advancements.

discussion

naming is a vital issue for human cultures. names are not random sequences of characters or sounds that stand just as identifiers for the entities, but they also have socio-cultural meanings and interpretations. recently, out of “political correctness” and fear of triggering racism, sweden changed the names of bird species that could potentially offend, such as “gypsy bird” and “negro.”45 therefore we cannot treat names just as random identifiers.
in this study we examined how, instead of describing indivisible resources, we could describe entities that appear in a variety of names on various resources. we proposed a method for connecting the names to the entities they represent and, at the same time, we documented the provenance of these names by connecting specific resources with specific names. we illustrated how to establish connections between entities, connections between an entity and a specific name of another entity, as well as connections between one name and another name concerning one or two entities. in the proposed framework, we maintain the linguistic character of naming while modeling the names in a machine-processable way. this formalism allows for a high level of expressiveness and flexible descriptions that do not have a static, text-centric orientation, since the central point is not the establishment of the text values (i.e., heading) but the meaning of our statements.

this study has shown that it is important to have the possibility to establish relationships both between entities and between specific appellations (nomens in the context of this study) of these entities. to achieve this we promoted every appellation to an rdf node. this is not something unheard of in the domain of rdf since this approach has also been adopted by w3c for the development of skos-xl.46 frbroo, which is another interpretation of increasing influence in the wider context of the fr family, adopts the same perspective.47 frbroo also gives the option to connect a specific name with a resource through the property “r64 used name (was name used by)” or to connect a name with someone who uses this specific name through the property “r63 named (was named by).”

murray and tillett state that “cataloging is a process of making observations on resources”48; hence, the production of records is the result of the judgments during this process.
but in the context of traditional descriptive cataloging, the cataloger was not required to judge information in any way other than its category, i.e. to characterize whether the x set of characters corresponded to the name of an author, publisher, or place and so on. there was no obligation of assigning a particular name to a specific author, publisher, or place. in our approach, the cataloger interprets the information and supports the catalog’s potential to deliver added-value information. moreover, the initial information remains undifferentiated; hence, there is always the option of going back in order to generate new interpretations or validate existing ones. in recent years, there has been a significant increase in the attention given to multi-entity models of resource description.49 in this new environment, “the creation of one record per resource seems a deficient simplification.”50 rdf allows the transformation of universal bibliographic control to a giant global graph.51 in this manner, current approaches on resource description “cannot be considered as simple metadata describing a specific resource but more like some kind of knowledge related to the resource.”52 indeed, this knowledge can be computationally processable and exploitable. yet, to achieve this, “catalogers can only begin to work in this way if they are not held bound by the traditional definitions and conceptualizations of bibliographic records.”53 one critical issue is the isolation of parts (sets of statements) of this “giant graph” and the linking of these parts with something else; indeed, theory on this topic is starting to emerge.54 this is very essential because it allows for the creation of ad hoc clusters (i.e. the usage of a specific identity for an entity with all the names that have been assigned to this identity, in our context), which could be used as a set to link to some other entity. as a final remark, we could say that authorities manage controlled access points. 
in the semantic web, every uri is a controlled access point, and hence, the distinction between description and authorities acquires a new meaning. in the context of machine-processable bibliographic data, the aim is to connect these two, i.e. the authorities with the description, and examine how one can support the other. however, since the emphasis is not on their individual management, we are drawn away from a mentality of “descriptive information versus access points” and towards one of “descriptive information as an access point.”

acknowledgement

the author wishes to thank henry scott who assisted in the proofreading of the manuscript.

references and notes

1. grigoris antoniou and frank van harmelen, a semantic web primer, 2nd ed. (cambridge, ma: mit press, 2008), 3. 2. ifla, functional requirements for bibliographic records: final report, as amended and corrected through february 2009, ifla series on bibliographic control, vol. 19 (munich: k.g. saur, 1998), 6. 3. daniel kless et al., “interoperability of knowledge organization systems with and through ontologies,” in classification & ontology: formal approaches and access to knowledge: proceedings of the international udc seminar 19–20 september 2011, the hague, the netherlands, organized by udc consortium, the hague, edited by aida slavic and edgardo civallero (würzburg: ergon, 2011), 63–64. 4. karen coyle and diane hillmann, “resource description and access (rda): cataloging rules for the 20th century,” d-lib magazine 13, no. 1/2 (january 2007): para. 2, doi:10.1045/january2007-coyle. 5. cory k. lampert and silvia b. southwick, “leading to linking: introducing linked data to academic library digital collections,” journal of library metadata 13, no. 2–3 (2013): 231, doi:10.1080/19386389.2013.826095. 6.
ifla, functional requirements for bibliographic records. 7. ifla, functional requirements for authority data: a conceptual model, edited by glenn e. patton, ifla series on bibliographic control (munich: k.g. saur, 2009). 8. ifla, “functional requirements for subject authority data (frsad): a conceptual model” (ifla, 2010), http://www.ifla.org/files/assets/classification-and-indexing/functional-requirements-for-subject-authority-data/frsad-final-report.pdf. 9. ala, “rda toolkit: resource description and access,” sec. 0.3.1, accessed june 18, 2014, http://access.rdatoolkit.org/. 10. gordon dunsire, “representing the fr family in the semantic web,” cataloging & classification quarterly 50, no. 5–7 (2012): 724–41, doi:10.1080/01639374.2012.679881. 11. while this paper was under review, ifla released the draft “frbr-library reference model” (frbr-lrm), which is a consolidated edition for the fr family standards. it is developed according to the respective individual standards following the principles of the entity relationship modeling, which is challenged in this paper. taking into account the er modeling and the statement (available on p.5 of the standard) that “the model is comprehensive at the conceptual level, but only indicative in terms of the attributes and relationships that are defined,” this consolidated edition could not be perceived as a standard that could be implemented directly as a property vocabulary qualifying for use in the rdf environment. 12.
main page (for all fr) at http://iflastandards.info/ns/fr/; “frbr model" available at http://iflastandards.info/ns/fr/frbr/frbrer/; “frad model” available at http://iflastandards.info/ns/fr/frad/; “frsad model” available at http://iflastandards.info/ns/fr/frsad/. an addition to the previous is frbroo: the element set is available at http://iflastandards.info/ns/fr/frbr/frbroo/. 13. manolis peponakis, “conceptualizations of the cataloging object: a critique on current perceptions of frbr group 1 entities,” cataloging & classification quarterly 50, no. 5–7 (2012): 587–602, doi:10.1080/01639374.2012.681275. 14. pat riva and chris oliver, “evaluation of rda as an implementation of frbr and frad,” cataloging & classification quarterly 50, no. 5–7 (2012): 564–86, doi:10.1080/01639374.2012.680848. 15. shoichi taniguchi, “viewing rda from frbr and frad: does rda represent a different conceptual model?,” cataloging & classification quarterly 50, no. 8 (2012): 929–43, doi:10.1080/01639374.2012.712631. 16. rda registry is available at http://www.rdaregistry.info/. 17. the nomen entity and the “has appellation” (reversed “is appellation of”) property are also used by the frbr-lrm. 18. paul h. portner, what is meaning?: fundamentals of formal semantics (malden, ma: blackwell, 2005), 34. 19. ifla, functional requirements for bibliographic records, 19:6. 20. w3c, “rdf 1.1 concepts and abstract syntax: w3c recommendation,” february 25, 2014, http://www.w3.org/tr/2014/rec-rdf11-concepts-20140225/. 21. gordon dunsire, diane hillmann, and jon phipps, “reconsidering universal bibliographic control in light of the semantic web,” journal of library metadata 12, no. 2–3 (2012): 166, doi:10.1080/19386389.2012.699831. 22. manolis peponakis, “libraries’ metadata as data in the era of the semantic web: modeling a repository of master theses and phd dissertations for the web of data,” journal of library metadata 13, no. 4 (2013): 333, doi:10.1080/19386389.2013.846618. 23. 
ifla, functional requirements for bibliographic records, 19:32.
24. ifla, functional requirements for authority data: a conceptual model, 36–37.
25. derek rayside and gerard t. campbell, “an aristotelian understanding of object-oriented programming,” in proceedings of the 15th acm sigplan conference on object-oriented programming, systems, languages, and applications, oopsla ’00 (new york: acm, 2000), 350, doi:10.1145/353171.353194.
in the name of the name: rdf literals, er attributes, and the potential to rethink the structures and visualizations of catalogs | peponakis | doi:10.6017/ital.v35i2.8749 36
26. jinfang niu, “evolving landscape in name authority control,” cataloging & classification quarterly 51, no. 4 (2013): 405, doi:10.1080/01639374.2012.756843.
27. ifla, functional requirements for authority data: a conceptual model, 24.
28. ibid., 20.
29. “religious relationship” is the “relationship between a person and an identity that person assumes in a religious capacity”; for example the “relationship between the person known as thomas merton and that person’s name in religion, father louis” (ifla, 2009, 61–62).
30. junli diao, “‘fu hao,’ ‘fu hao,’ ‘fuhao,’ or ‘fu hao’? a cataloger’s navigation of an ancient chinese woman’s name,” cataloging & classification quarterly 53, no. 1 (2015): 71–87, doi:10.1080/01639374.2014.935543.
31.
on byung-won, sang choi gyu, and jung soo-mok, “a case study for understanding the nature of redundant entities in bibliographic digital libraries,” program: electronic library and information systems 48, no. 3 (july 1, 2014): 246–71, doi:10.1108/prog-07-2012-0037.
32. neil r. smalheiser and vetle i. torvik, “author name disambiguation,” annual review of information science and technology 43, no. 1 (2009): 1–43, doi:10.1002/aris.2009.1440430113.
33. thomas b. hickey and jenny a. toves, “managing ambiguity in viaf,” d-lib magazine 20, no. 7/8 (2014), doi:10.1045/july2014-hickey.
34. martin doerr, pat riva, and maja žumer, “frbr entities: identity and identification,” cataloging & classification quarterly 50, no. 5–7 (2012): 524, doi:10.1080/01639374.2012.681252.
35. ifla, unimarc manual: authorities format, 2nd revised and enlarged edition, ubcim publications—new series, vol. 22 (munich: k.g. saur, 2001).
36. library of congress, “marc 21 format for authority data” (library of congress, april 18, 1999), http://www.loc.gov/marc/authority/.
37. ifla, functional requirements for authority data: a conceptual model.
38. martha m. yee, “frbrization: a method for turning online public findings lists into online public catalogs,” information technology and libraries 24, no. 2 (2005): 81, doi:10.6017/ital.v24i2.3368.
39. see frbr-rda mapping from joint steering committee for development of rda, available at http://www.rda-jsc.org/docs/5rda-frbrrdamappingrev.pdf.
40. taniguchi, “viewing rda from frbr and frad,” 934.
41. coyle and hillmann, “resource description and access (rda): cataloging rules for the 20th century,” sec. 8.
42. ifla, functional requirements for bibliographic records, 19:33.
43. leonidas deligiannidis, amit p. sheth, and boanerges aleman-meza, “semantic analytics visualization,” in intelligence and security informatics, edited by sharad mehrotra et al., lecture notes in computer science 3975 (springer berlin heidelberg, 2006), 49, http://link.springer.com/chapter/10.1007/11760146_5.
44. dunsire, hillmann, and phipps, “reconsidering universal bibliographic control in light of the semantic web,” 166.
45. rick noack, “out of fear of racism, sweden changes the names of bird species,” washington post, february 24, 2015, http://www.washingtonpost.com/blogs/worldviews/wp/2015/02/24/out-of-fear-of-racism-sweden-changes-the-names-of-bird-species/.
46. w3c, “skos extension for labels (skos-xl) namespace document—html variant,” 2009, http://www.w3.org/tr/2009/rec-skos-reference-20090818/skos-xl.html.
47. chryssoula bekiari et al., frbr object-oriented definition and mapping from frbrer, frad and frsad, version 2.0 (draft), 2013, http://www.cidoc-crm.org/docs/frbr_oo/frbr_docs/frbroo_v2.0_draft_2013may.pdf.
48. robert j. murray and barbara b. tillett, “cataloging theory in search of graph theory and other ivory towers,” information technology and libraries 30, no. 4 (january 12, 2011): 171, http://dx.doi.org/10.6017/ital.v30i4.1868.
49. thomas baker, karen coyle, and sean petiya, “multi-entity models of resource description in the semantic web,” library hi tech 32, no.
4 (2014): 562–82, http://dx.doi.org/10.1108/lht-08-2014-0081.
50. peponakis, “libraries’ metadata as data in the era of the semantic web,” 343.
51. kim tallerås, “from many records to one graph: heterogeneity conflicts in the linked data restructuring cycle,” information research 18, no. 3 (2013), http://informationr.net/ir/18-3/colis/paperc18.html.
52. peponakis, “conceptualizations of the cataloging object,” 599.
53. rachel ivy clarke, “breaking records: the history of bibliographic records and their influence in conceptualizing bibliographic data,” cataloging & classification quarterly 53, no. 3–4 (2015): 286–302, doi:10.1080/01639374.2014.960988.
54. gianmaria silvello, “a methodology for citing linked open data subsets,” d-lib magazine 21, no. 1/2 (2015), doi:10.1045/january2015-silvello.

introduction background frbr and rda notes about terminology and graphs: how to read the figures the impact of the representation scheme’s selection: rdf versus er
names, entities, and identities the distinction between authorities and descriptive information integrating authorities with descriptive information place names personal names titles visualization of bibliographic records and cataloging rules discussion references and notes

the lc/marc record as a national standard

the desire to promote exchange of bibliographic data has given rise to a rather cacophonous debate concerning marc as a "standard," and the definition of a marc compatible record. much of the confusion has arisen out of a failure to carefully separate the intellectual content of a bibliographic record, the specific analysis to which it is subjected in an lc/marc format, and its physical representation on magnetic tape. in addition, there has been a tendency to obscure the different requirements of users and creators of machine-readable bibliographic data.

in general, the standards making process attempts to find a consensus among both groups based on existing practice. the process of standardization is rarely one which relies on enlightened legislation. rather, a more pragmatic approach is taken based on an evaluation of the costs to manufacturers weighed against costs to consumers. even this modest approach is not invested with lasting wisdom. ansi standards, for example, are subject to quinquennial review.

standards, as already pointed out, have as their basis common acceptance of conventions. thus, it might prove useful to examine the conventions employed in an lc/marc record. the most important of these is the anglo-american cataloging rules as interpreted by lc. the use of these rules for descriptive cataloging and choice of entry is universal enough that they may safely be considered a standard. similar comments may be made concerning the subject headings used in the dictionary catalog of the library of congress.
the physical format within which machine-readable bibliographic data may be transmitted is accepted as a codified national and international standard (ansi z39.2-1971 and iso 2709-1973 (e)). this standard, which is only seven pages in length, should be carefully read by anyone seriously concerned with the problems of bibliographic data interchange. ansi z39.2 is quite different from the published lc/marc formats. it defines little more than the structure of a variable length record. simply stated, ansi z39.2 specifies only that a record shall contain a leader specifying its physical attributes, a directory for identifying elements within the record by numeric tag (the values of the tags are not defined), and optionally, additional designators which may be used to provide further information regarding fields and subfields. this structure is completely general. within this same structure one could transmit book orders, a bibliographic record, an abstract, or an authority record by adopting specific conventions regarding the interpretation of numeric tags.

160 journal of library automation vol. 7/3 september 1974

thus, we come to the crux of the problem, the meanings of the content designators. content designators (numeric tags, subfields, delimiters, etc.) are not synonymous with elements of bibliographic description; rather, they represent the level of explicitness we wish to achieve in encoding a record. it might safely be said that in the most common use of a marc record, card production, scarcely more than the paragraph distinctions on an lc card are really necessary. if we accept such an argument, then we can simply define compatibility with lc/marc by defining compatibility in terms of a particular class of applications, e.g., card, book, or crt catalog creation.
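the generality of the record structure described above can be illustrated with a short sketch. the python below is only an illustration under stated assumptions, not the text of the standard: it assumes the 24-character leader, three-character tags, and the field and record terminator characters as they are commonly implemented for iso 2709 records, and it assumes plain ascii data so that character counts equal byte counts. the function names build_record and parse_record are hypothetical.

```python
# Sketch of a leader + directory + variable-length fields record.
# Assumptions (not taken from the editorial): 24-character leader, 3-character
# tags, 4-digit field lengths, 5-digit starting positions, ASCII-only data.

FIELD_TERMINATOR = "\x1e"
RECORD_TERMINATOR = "\x1d"
LEADER_LENGTH = 24


def build_record(fields):
    """Pack a list of (tag, data) pairs into a single record string."""
    directory = ""
    body = ""
    for tag, data in fields:
        field = data + FIELD_TERMINATOR
        # Directory entry: tag, field length, offset of the field in the body.
        directory += f"{tag:0>3}{len(field):04d}{len(body):05d}"
        body += field
    directory += FIELD_TERMINATOR
    base_address = LEADER_LENGTH + len(directory)
    total_length = base_address + len(body) + len(RECORD_TERMINATOR)
    # Leader: record length (0-4), filler (5-11), base address of data (12-16),
    # filler (17-19), then the widths of the directory's length and start parts.
    leader = f"{total_length:05d}" + " " * 7 + f"{base_address:05d}" + " " * 3 + "4500"
    return leader + directory + body + RECORD_TERMINATOR


def parse_record(record):
    """Unpack a record into (tag, data) pairs using only leader and directory."""
    leader = record[:LEADER_LENGTH]
    base_address = int(leader[12:17])
    length_width = int(leader[20])
    start_width = int(leader[21])
    entry_width = 3 + length_width + start_width
    # The directory sits between the leader and the base address,
    # minus its own one-character terminator.
    directory = record[LEADER_LENGTH:base_address - 1]
    fields = []
    for i in range(0, len(directory), entry_width):
        entry = directory[i:i + entry_width]
        tag = entry[:3]
        length = int(entry[3:3 + length_width])
        start = int(entry[3 + length_width:])
        begin = base_address + start
        fields.append((tag, record[begin:begin + length - 1]))  # drop terminator
    return fields


if __name__ == "__main__":
    # The tags below are arbitrary: the structure assigns them no meaning.
    order = build_record([("100", "book order 42"), ("245", "a sample title")])
    print(parse_record(order))
```

note that parse_record never consults the meaning of a tag. the same routine unpacks a book order, an abstract, or a bibliographic record, and it is only the conventions attached to the tags that distinguish one from another.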
a record may be said to be compatible with lc/marc if a system which accepts a record as created by lc produces from the compatible record products not discernibly different from those created from an lc/marc record. thus, what is called for is a family of standards all downwardly compatible with lc/marc, employing ansi z39.2 as a structural base. this represents the only rational approach. the alternative is to accept lc/marc conventions as worthy of veneration as artistic expression.

s. michael malinconico

news and announcements

redi or not...

"public libraries and the remote electronic delivery of information (redi)," a working meeting, was held in columbus, ohio, on monday and tuesday, march 23 and 24, 1981. the meeting, jointly sponsored by the public library of columbus and franklin county (ohio) and oclc, inc., considered the issues that public libraries must examine before becoming involved in electronic information services. subjects explored included technology, communications, information providers, information users, social implications, and financial, legal, and regulatory responsibilities. tom harnish, program director of oclc's home delivery of library services program, was moderator of the two-day event.

participants at the conference represented a variety of public libraries from throughout the u.s., including new york, georgia, texas, california, colorado, and illinois. don hammer represented lita at the meeting; mary jo lynch of the ala office for research also attended.

"geographic distances," said harnish, "were the only points of separation among the meeting participants. there was an overwhelming agreement on the concerns for the future of libraries and universal access to information in the electronic age."

on the second day of the conference it became apparent that the redi agenda could not be properly dealt with in two days. "we need an organization which will address these issues on an ongoing basis," said richard sweeney, executive director of plcfc. "librarians at the conference agreed to promote and lead the development of the electronic library. to that end, this group is seeking recognition by ala as a membership initiative group with a special interest in the electronic library."

the group's founders prepared the following mission statement for the membership initiative group: to ensure that information delivered electronically remains accessible to the general public, the electronic library association shall promote participation and leadership in the remote electronic delivery of information* (redi) by publicly supported libraries and nonprofit organizations.

goals of the organization are to:
• identify services and information that are best suited to remote electronic delivery;
• plan, fund, and develop working demonstrations of library redi services;
• communicate the availability of electronic library services to the user community;
• inform the library profession of trends, specific events, and future directions of redi;
• create coalitions with organizations in allied fields of interest.

public libraries and nonprofit organizations with information interests, such as information and referral groups, are invited to join the electronic library association. the group plans to meet at the ala annual conference in san francisco. meeting details will be announced as soon as they are available.

it was the goal of the "public libraries and the remote electronic delivery of information" meeting to provide the framework within which to address the myriad issues in redi. the electronic library group will validate the role of libraries in technology.... redi or not here we come.

*information delivered electronically where and when it is needed, in the library and elsewhere (home/office/off-site).

122 journal of library automation vol. 14/2 june 1981

arl adopts plan for improving access to microforms

a plan aimed at improving bibliographic access to materials in microform by building a nationwide database of machine-readable records for individual titles in microform sets was approved in principle by the arl board of directors on january 30, 1981. the plan concentrates on monograph collections, and is aimed at providing records for individual titles in both current and retrospective sets. records added to the database will also aid cooperative efforts in preservation microfilming.

elements of the plan include:
• inputting of records conforming to accepted north american standards to the major bibliographic utilities by libraries and microform publishers;
• development of "profile matching" by the bibliographic utilities permitting the cataloging of all individual titles in a series or microform collection with a single operation;
• cooperative cataloging of current and retrospective microform sets by libraries and publishers;
• compensation for publishers who input acceptable bibliographic records to the bibliographic utilities to offset loss of revenue from card set sales.

cooperation among libraries, publishers, networks, and others has been stressed throughout the development of the plan, and initiatives on a number of fronts are necessary and encouraged in order to accomplish the goal of improved bibliographic access to microforms. arl will seek outside funding for a program coordinator to facilitate implementation of the elements outlined above, and recruitment for the one-year position will begin shortly. the coordinator, advised by a committee of librarians (from arl and non-arl institutions) and microform publishers, will work with libraries, publishers, and the bibliographic utilities to help get the plan off the ground.
the plan is the result of a one-year study funded by a grant from the national endowment for the humanities and conducted for arl by richard boss of information systems consultants, inc. during the course of the year, he interviewed librarians, microform publishers, representatives of the bibliographic utilities, and others interested in bibliographic access to microforms, gradually building the plan from elements on which there was agreement and discarding ideas that were not widely accepted. the effort to build a consensus among the various interested parties was aided by the advisory committee, comprising both arl librarians and microform publishers, which assisted and advised throughout the course of the project. arl will publish the study this spring.

arl sponsorship of this project and its follow-up reflects the long-standing commitment the association has had to improving access to microforms. two earlier arl studies on improving bibliographic access contributed to the development of standards for descriptive cataloging of microforms, reinforced the importance of microforms for preserving and disseminating scholarly materials, and identified some of the problem areas that the current study has addressed. today, as the amount of materials in microform in arl libraries continues to grow (arl libraries hold more than 146,660,000 units of microform), improving access to these materials has taken on even greater urgency.

the association of research libraries is an organization of major research libraries in the united states and canada. members include the larger university libraries, the national libraries of both countries and a number of public and special libraries with substantial research collections. there are at present 111 institutional members.
battelle studies using computers to access unpublished technical information

engineers may be able to use computers to store, call up, and otherwise display some technical information not currently published in professional journals as a result of a study recently begun by battelle's columbus laboratories. in a four-month study sponsored by the american society of mechanical engineers (asme), battelle researchers are examining ways to use computers as an alternative to publications for communicating with the technical community.

asme is a technical and educational organization with a membership of 100,000 individuals, including 17,000 student members. it conducts one of the largest technical publishing operations in the world, which includes codes, standards, and operating principles for industry.

according to battelle's gabor j. kovacs, certain types of information traditionally are not covered in monthly or quarterly technical journals, yet they often have widespread appeal among engineers. "recent advances in computer and telecommunications technologies, coupled with rapidly rising publication costs and postal rates, have created an ideal environment for organizations to consider using computers as an alternative mode of communication," kovacs said. "data bases can be used to maintain information that is impractical for conventional publication, and it is now possible to use them for many other types of communication as well."

during the study, researchers will determine the feasibility of using a computer database to disseminate to asme members such information as short articles dealing with design and applications data, catalog data, and teleconference messages. with the help of the asme, battelle specialists will define the information requirements for such a system.
while technology is sufficiently advanced to accommodate virtually any type of information, costs can become prohibitive unless practical compromises are made, kovacs said. as part of the study, battelle researchers also will analyze the costs associated with systems of varying capabilities. researchers then will define several alternative database systems, which will include such attributes as:
• online, interactive retrieval features
• simple-to-use retrieval language
• user-aid features
• a minimum of seventy-five simultaneous users
• ability to send, store, and broadcast messages
• compatibility with a variety of hard copy and crts (cathode ray tube terminals)
• sixteen or more hours per day availability to accommodate different time zones
• a minimum of thirty-characters-per-second transmission rates

two of these alternative system designs, one representing a minimum capability and the other a maximum capability, then will be selected for further evaluation by battelle and the asme.

communications

how long the wait until we can call it television

jerry borrell: congressional research service, library of congress, washington, d.c.*

this brief article will review videotex and teletext. there is little need to define terminology because new hybrid systems are being devised almost constantly (hats off to oclc's latest buzzword, viewtel). most useful of all would be an examination of the types of technology being used for information provision. the basic requirement for all systems is a data base, i.e., data stored so as to allow its retrieval and display on a television screen. the interactions between the computer and the television screens are means to distinguish technologies. in teletext and videotex a device known as a decoder uses data encoded onto the lines of a broadcast signal (whatever the medium of transmission) to generate the display screen.
in videotex, voice grade telephone lines or interactive cable are used to carry data communications between two points (usually 1200 baud from the computer and 300 baud or less from the decoder and the television screen). in teletext the signal is broadcast over airwaves (wideband) or via a time-sharing system (narrowband). the numerous configurations possible make straightforward classification of systems questionable.

a review of the systems currently available is useful to illustrate these terms, videotex and teletext. compuserve, the columbus, ohio-based company, provides on-line searching of newspapers to about 4,000 users. reader's digest recently acquired 51 percent of the source, a time-sharing service that provides more than 100 different (nonbibliographic) data bases to about 5,000 users. the warner and american express joint project, qube (also columbus-based), utilizes cable broadcast with a limited interactive capability. it does not allow for on-demand provision of information; rather, it uses a polling technique. antiope, the french teletext system, used at ksl in st. louis last year and undergoing further tests in los angeles at knxt in the coming year, is only part of a complex data transmission system known as didon. antiope is also at an experimental stage in france, with 2,500 terminals scheduled for use in 1981. ceefax and oracle, broadcast teletext by the bbc and ibc in britain, have an estimated 100,000 users currently. two thousand adapted television sets are being sold every month. prestel, bbc's videotex system, currently has approximately 5,000 users, half of whom are businesses. all other countries in europe are conducting experiments with one of the technologies. in canada, telidon, the most technically advanced system, has 200 users.

*the views expressed in this paper do not necessarily represent those of the library of congress or of the congressional research service.
experiments involving telidon are being conducted nationwide due to government interest in telecommunications improvements. telidon will also be used in washington in the spring of 1981 for consumer evaluation.

these cursory notes should indicate the breadth of interest in alternative means of information provision. video and electronic publishing newsletters (see references) keep track of the number of users and are the best way to keep informed of activities and developments.

several important trends are becoming evident. perhaps the most evident is the realization that videography is being developed in countries other than the u.s. as a result of strong support by the national posts and telecommunications (ptt) authorities. until recently there was a feeling that the u.s. was technically behind europe. what is now evident is that in the free market system of the u.s., manufacturers or other potential system providers have had insufficient impetus to provide videotex/teletext technology. the technology of information display (see borrell, journal of library automation, v.13 (dec. 1980), p.277-81) in the u.s. is an order of magnitude more sophisticated than in europe. the point being that in the absence of strong ptt pressure, videography in the u.s. developed for specialized markets in which telecommunications were not a central need. in the one area of great demand, teletext services for the hearing impaired, decoders were developed and have been employed for a number of years (about 25,000 are currently in use). as the high cost of telecommunications bandwidth is eased by data compression, direct broadcasting by satellite, enhanced cable services, and fiber optic networks, then videotex and teletext will become available on a wide scale in the u.s. the computer inquiry ii decision by the fcc involving reinterpretation of the communications act of 1934 has given at&t permission to enter the data processing market.
in fact, at&t, in its third experiment with videotex, is taking such an aggressive stance that it seems to be doing everything that its critics have feared: providing updatable classified ads (dynamic yellow pages), allowing users to place information into the system memory, and providing voice mail services, thereby taking on the newspapers, home computer manufacturers, and the u.s. postal service. in addition, banking services will be offered. as the largest company in the u.s., this stance cannot be ignored. at&t supplies about 80 percent of the phone service in the u.s., and has the potential, if allowed, to become a broadcaster, data processor, publisher, and banker; cross-ownership was never allowed up to this time.

the trend toward specialized services provision is also exemplified by the french and british systems. prestel, which was originally targeted for a home market, is now promoted with the tacit policy of being a special business service allowing financial and private data to be provided to subscribers. sofratev, the marketers of the french teletext system, are acknowledging the importance of transactional markets in two ways, based on technology they have named "smart card," a credit card-size (in one configuration) plate with a built-in microprocessor or chip. the card will allow system users to access material that will have controlled readership. an example would be a magazine of financial data provided to those who need such information (or, more importantly, are willing to pay for it). in a more complex effort, the largest retailer in paris will advertise material via teletext and system users will be able to make acquisitions with their smart card, which can be programmed with financial data. nor is this the end of the effort by the french to market information display technology.
the electronic phone directory, being offered by bell in austin, is replicated in a more modest way by the french, who plan to produce a six-by-eight-inch black-and-white display unit that will provide phone directory information (both white and yellow pages) to all of france by the 1990s. developed as part of the "telematique" program of the french government, the terminals represent to some (the parent company of the source has tendered an offer for up to 250,000 of the terminals) a low-cost alternative for providing videotex to a mass market. the tandy home computer in its videotex configuration seems to fill the same market slot.

perhaps the most disturbing trend, at least from a librarian's point of view, is the fact that contemporary data systems are being created which could benefit greatly from the experience of librarians and libraries. for instance, research into the methods of access (keyword, phonetic, and geographical) by the french is intended to provide a flexible and easily used system for untrained persons searching for directory information, and is being performed by an advertising and yellow pages publishing firm. with a feeling of deja vu i listened to an explanation of how difficult it is to develop a system for the novice; one proposed solution is to allow only the first four letters of a word to be entered (one of the search methods used at the library of congress, which does suggest some cross-fertilization).

52 journal of library automation vol. 14/1 march 1981

whatever the trends, the reality is that librarians and information scientists are playing decreasing roles in the growth of information display technology. hardware systems analysts, advertisers, and communications specialists are the main professions that have an active role to play in the information age. perhaps the answer is an immediate and radical change in the training of library schools of today.
our small role may reflect our penchant to be collectors, archivists, and guardians of the information repositories. have we become the keepers of the system? the demand today is for service, information, and entertainment. if we librarians cannot fulfill these needs our places are not assured.

should the american library association (ala) be ensuring that libraries are a part of all ongoing tests of videotex, at least in some way, either as organizers, information providers, or in analysis? consider the force of the argument given at the ala 1980 new york annual conference that cable television should be a medium that librarians become involved with for the future. certainly involvement is an important role, but we, like the industrialists and marketers before us, must make smart decisions and choose the proper niche and the most effective way to use our limited resources if we are to serve any part of society in the future.

bibliography
1. electronic publishing review. oxford, england: learned information ltd. quarterly.
2. home video report. white plains, new york: knowledge industry publications. weekly.
3. ieee transactions on consumer electronics. new york: ieee broadcast, cable, and consumer electronics society. five times yearly.
4. international videotex/teletext news. washington, d.c.: arlen communications ltd. monthly.
5. videodisc/teletext news. westport, conn.: microform review. quarterly.
6. videoprint. norwalk, conn.: videoprint. two times monthly.
7. viewdata/videotex report. new york: link resources corp. monthly.

data processing library: a very special library

sherry cook, mercedes dumlao, and maria szabo: bechtel data processing library, san francisco, california.

the 1980s are here and with them comes the ever broadening application of the computer. this presents a new challenge to libraries. what do we do with all these computer codes? how do we index the material?
and most importantly, how do we make it accessible to our patrons or computer users? bechtel's data processing library has met these demands. the genesis for the collection was bechtel's conversion from a honeywell 6000 computer to a univac 1100 in 1974. all the programs in use at that time were converted to run on the univac system. it seemed a good time to put all of the computer programs together from all of the various bechtel divisions into a controlled collection. the librarians were charged with the responsibility of enforcing standards and control of bechtel's computer programs. the major benefits derived from placing all computer programs into a controlled library were:
1. company-wide usage of the programs.
2. minimized investment in program development through common usage.
3. computer file and documentation storage by the library to safeguard the investment.
4. central location for audits of program code and documentation.
5. centralized reporting on bechtel programs.
developing the collection involved basic cataloging techniques which were greatly modified to encompass all the information that computer programs generate, including actual code, documentation, and list
book reviews
more joy of contracts: an epicurean approach to negotiation, by kevin hegarty. tacoma, wash.: tacoma public library, 1981. 66p. $10. order from: administrative offices, tacoma public library, 1102 tacoma ave. south, tacoma, wa 98402.
hegarty's book, the second edition of his original joy of contracts (american library association, dallas, texas, june 1979), has both strengths and weaknesses. the basic strength is one heck of a lot of information about how to negotiate and write a contract that will assure a library that it gets what it pays for from a turnkey automation system. the weaknesses involve the organization of the text, the writing style, the specific focus on automated circulation systems, and the physical format of the document.
first, the author has clearly fought his way through a contract negotiation for a turnkey "computerized library circulation system." the first edition of this book was produced soon after that negotiation was completed. this second edition seems to be augmented on the basis of experience gained in living with the contract. the main text walks the reader through each element of a contract (e.g., terms of agreement, specification of governing law, schedule, acceptance testing, etc.), provides sample contract language, and adds comments and recommendations for how to cope with specific problems (e.g., negotiation of system reliability standards, p. 3-4). while the contract structure and the specification of contract elements may be useful, the real value of the book lies in the comments (e.g., the difference between two percent downtime and five percent downtime over one year is a system that is disabled for 140 additional hours). the practical value of these comments may be measured in wasted dollars, wasted staff hours, or frustrated library patrons. the section on system maintenance (p. 13-15) alone may be worth the cost of the book. on the negative side of the ledger, the book is somewhat difficult to use because of its organization. it is composed of a primary section (in outline form) on the elements of a contract between a library and a vendor, and seven secondary sections, including examples of plans, sub-agreements, and schedules (and a seventeen-item bibliography). that is all that appears in the table of contents, and there is neither an introduction, an overview, nor an index. it is very difficult to find a specific topic of interest without skimming through the text itself.
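the downtime comparison quoted from hegarty can be checked with a short calculation. the sketch below is illustrative only: the function name and the figures for scheduled hours are assumptions, not taken from the book. it shows that the quoted 140 hours is consistent with roughly 4,680 scheduled operating hours a year (about 90 hours a week), whereas a system expected to run around the clock would lose considerably more.

```python
def extra_downtime_hours(scheduled_hours, low_pct, high_pct):
    """Additional hours of downtime when the downtime rate rises
    from low_pct to high_pct of the scheduled operating hours."""
    if high_pct < low_pct:
        raise ValueError("high_pct must not be smaller than low_pct")
    return scheduled_hours * (high_pct - low_pct) / 100.0


# Assumption: the system is scheduled for about 90 hours a week.
library_hours = 90 * 52   # 4,680 scheduled hours per year
# Assumption: the system is expected to be available 24 hours a day.
full_year = 24 * 365      # 8,760 hours per year

print(round(extra_downtime_hours(library_hours, 2, 5), 1))
print(round(extra_downtime_hours(full_year, 2, 5), 1))
```

the point for a contract negotiator is that a reliability percentage means little until the contract also states the hours against which it is measured.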
second, the body of the text is a mixture of sample (or recommended; it isn't clear) contract language (identifiable by use of the word "shall"), comments on the language of particular portions of the contract (sometimes labeled "comment" and sometimes not), and cross-references within the book itself (sometimes labeled "note:"). the mixture of different elements (contract language, narrative, etc.) is sometimes confusing. moreover, there are a number of small grammatical garbles which are slightly distracting. a bit of professional editing would make this document both more readable and more useful. third, hegarty focuses on (or uses as an example; it isn't clear) automated circulation systems. this would be very useful if that is what the reader intends to buy. however, with a variety of other turnkey automated systems and sources for libraries on the market or soon to be made available (e.g., acquisitions, book fund accounting, cataloging, online bibliographic access), some language about how the contract should be redesigned or revised to account for different systems and services would have made the book more immediately useful to more readers. last, the book comes as a photocopy of a typed original, with a velo binding. the binding of the reviewer's copy broke apart the first time it was opened. however, it should be possible to rebind or staple it together if this turns out to be a persistent problem. on balance, for those about to negotiate a contract with a vendor of automated systems and services, the strengths of more joy of contracts probably outweigh the weaknesses. one gets what a contract says one will get; any help in writing a thorough, comprehensive, and airtight contract will be of use.-donald thompson, university of california systemwide administration, office of the president, berkeley, california.
journal of library automation vol. 14/4 december 1981
computer science resources: a guide to professional literature.
compiled and edited by darlene myers. white plains, n.y.: knowledge industry publications, 1981. 346p. $59.50 (asis members: $47.60), paperback. isbn 0-914236-80-6.
this comprehensive guide to the english-language literature of computer sciences covers books, journals, technical reports, indexing and abstracting resources, directories, dictionaries, handbooks, newsletters, software resources, proceedings, programming languages, and publishers. its appendixes give information relating to career and salary trends, societies and associations, academic computer-center libraries, commercial fairs and shows, and myers' draft of a proposed expansion of the library of congress classification for the computer sciences. as myers states in the preface, "the work is designed to serve the needs of researchers, managers, librarians, consultants and systems analysts in academic, corporate and governmental data processing centers." computer science resources, divided into ten main sections with five appendixes at the back, is on the whole easy to use. since the book does not have an index, its table of contents becomes the key to information access. its wide margins together with fairly large print make it very readable. however, its unconventional arrangement of entries (letter by letter, ignoring conjunctions and prepositions, instead of word by word) can be misleading. for instance, "computers and urban society" is arranged ahead of "computer survey." the word "and" and spaces between words are ignored, resulting in "computersurban..." filing before "computersurvey." this practice does not follow the traditional library principle "nothing files before something." the explanation of the idiosyncratic entry arrangement is only given in the preface. when people use a book for quick reference, they usually skip the preface and the introduction; some users will probably miss many terms as a result.
the book is international in scope, relatively up-to-date, and informative. english titles published overseas, foreign publishers, and trade fairs and shows pertaining to the computer industry are included in the directory. most titles mentioned have been published since 1970 and many citations are as recent as 1980. the annotations for each entry in "indexing and abstracting resources," "directories, dictionaries, handbooks," and "software resources" are very informative. it would have been ideal if titles in the "current books" and "computer-related journals" were also annotated to aid users in selecting the materials. subject headings and cross-references used in various sections of the book are not always consistent. for example, in the section "current books," there is a see reference from "a.i. (artificial intelligence)" to "cybernetics/artificial intelligence/robots," but none from "artificial intelligence." however, in the section "computer-related journals," the heading is "artificial intelligence" with a see also reference to "cybernetics; robots," but no reference from "a.i. (artificial intelligence)." in the "current books" section, "careers/vocational guidance" is used as a subject heading. in the "computer-related journals" section, "employment" becomes the subject. there is no cross reference from either heading to the other in either section. in the "computer-related journals" section, preceding and succeeding titles are linked by cross-references. the history of title changes is outlined whenever applicable under the entries for the current titles. this information is invaluable, especially for librarians in identifying variant journal titles.
although there are see references under most former titles to current titles, some entries are omitted for previous titles. for example, infosystems was formerly called management and business automation and later changed to business automation with the merging of international business automation and international edition business automation. then there was business automation news analysis edition published as a supplement to business automation. surprisingly there are no see references under "business automation" and "international edition business automation" to "infosystems." maybe it is because "business automation" is quite similar to "business automation news analysis edition" and "international edition business automation" is similar to "international business automation" and would have appeared close together if not adjacent to one another. again some users may miss the links to the current titles. it might have been better to include a separate list for ceased journals. computer science resources is the result of monumental effort and years of thorough research and careful planning. its compiler and editor, darlene myers, has been very active in the computer and information science field, and is the manager of the computing information center at the university of washington. the wealth of information in the book and the currency of cited materials are its prominent strengths. the flaws mentioned earlier are minor if users read the preface and the introduction in each section first. this reference tool is strongly recommended for computer industry libraries as well as for medium-sized and large public and academic libraries. although more current, it does not wholly supplant ciel carter's guide to reference sources in the computer sciences (new york: macmillan, 1974). carter's entries are all annotated, and some of the citations are not included in the newer work.-frances lau, blackwell north america, beaverton, oregon.
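the filing-order problem raised in this review can be reproduced in a few lines. the sketch below is an illustration, not the compiler's actual filing rules: the list of ignored words is an assumption, and the two sort keys are hypothetical helpers. it contrasts letter-by-letter filing (ignored words dropped, spaces removed) with word-by-word filing, where a shorter word files before a longer word that begins with it ("nothing files before something").

```python
# Assumption: a small sample of conjunctions and prepositions to ignore.
IGNORED_WORDS = {"and", "or", "of", "in", "for", "to", "on", "with"}


def letter_by_letter_key(title):
    """Drop ignored words, then file on the remaining letters run together."""
    words = [w for w in title.lower().split() if w not in IGNORED_WORDS]
    return "".join(words)


def word_by_word_key(title):
    """File one word at a time; a list of words compares word by word."""
    return title.lower().split()


titles = ["computer survey", "computers and urban society"]

# Letter by letter: "computersurbansociety" sorts before "computersurvey".
print(sorted(titles, key=letter_by_letter_key))
# Word by word: "computer" sorts before "computers".
print(sorted(titles, key=word_by_word_key))
```

a reader who expects word-by-word order would look for "computer survey" first and could miss it under the letter-by-letter arrangement, which is the risk the reviewer describes.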
the benefits of enterprise architecture for library technology management: an exploratory case study sam searle information technology and libraries | december 2018 27 sam searle (samantha.searle@griffith.edu.au) is manager, library technology services, griffith university, brisbane, australia. abstract this case study describes how librarians and enterprise architects at an australian university worked together to document key components of the library’s “as-is” enterprise architecture (ea). the article covers the rationale for conducting this activity, how work was scoped, the processes used, and the outputs delivered. the author discusses the short-term benefits of undertaking this work, with practical examples of how outputs from this process are being used to better plan future library system replacements, upgrades, and enhancements. longer-term benefits may also accrue in the future as the results of this architecture work inform the library’s it planning and strategic procurement. this article has implications for practice for library technology specialists as it validates views from other practitioners on the benefits for libraries in adopting enterprise architecture methods and for librarians in working alongside enterprise architects within their organizations. introduction griffith university is a large comprehensive university with multiple campuses located across the south east queensland region in australia. library and information technology operations are highly converged and from 1989–2017 were offered within a single division of information services. scalable, sustainable, and cost-effective it is seen as a key strategic enabler of the university’s core business in education and research. 
“information management and integration” and “foundation technology” are two of four key areas outlined in the griffith digital strategy 2020, which highlights enterprise-wide decision-making and proactive moves to take advantage of as-a-service models for delivering applications.1 from late 2016 through to early 2018, library and learning services (“the library”) and it architecture and strategy (itas) worked iteratively to document key components of the library’s “as-is” enterprise architecture (ea). around fifty staff members have participated in the process at different points. the process has been very positive for all involved and has led to a number of benefits for the library in terms of improved planning, decision-making, and strategic communication. as manager, library technology services, the author was well placed to act as a participant-as-observer with the objective of sharing these experiences with other library practitioners. the author actively participated in the processes described here and has been able to informally discuss the benefits of this work with the architects and some of the library staff members who were most involved.
benefits of enterprise architecture for library technology management | searle 28 https://doi.org/10.6017/ital.v37i4.10437
literature review
enterprise architecture (ea) emerged over twenty years ago and is now a well-established it discipline. like other disciplines such as project management and change management, there are a number of best practice frameworks in common use, including the open group architecture framework (togaf).2 a global federation of member professional associations has been in place since 2011, with aims including the formalization of standards and promotion of the value of ea.3 educational qualifications, certifications, and professional development pathways for enterprise architects are available within universities and the private training sector.
according to the international higher education technology association educause, ea is relatively new within universities but is growing in importance. as a set of practices, “ea provides an overarching strategic and design perspective on it activities, clarifying how systems, services, and data flows work together in support of business processes and institutional mission.”4 yet despite this growing interest in our parent organizations, individual academic libraries applying ea principles and methods are notably absent from the scholarly literature and library practitioner information sharing channels. the fullest account to date of the experience and impacts of enterprise architecture practice in a library context is a case study from the canada institute for scientific and technical information (cisti). at the time of the case study’s writing in 2008, cisti was already well underway in its adoption of ea methods in an effort to address the challenges of “legacy, isolated, duplicated, and ineffective information systems” and to “reduce complexity, to encourage and enable collaborations, and, finally, to rein in the beast of technology.”5 the author of this case study concludes that while getting started in ea was complex and resource-intensive, this was more than justified at cisti by the improvements in technology capability, strategic planning, and services to library users. broader whole-of-government agendas are a driver for ea adoption in non-university research libraries. the national library of finland’s ea efforts were guided by a national information society policy and the ea architecture design method for finnish government. 6 a 2009 review of the it infrastructure at the u.s. library of congress (lc) argued lc was lagging behind other federal agencies in adoption of government-recommended ea frameworks. 
the impact of this included: inadequate linking of it to the lc mission; potential system interoperability problems; difficulties assessing and managing the impact of changes; poor management of it security; and technical risk due to non-adherence to industry standards and lack of future planning.7 a follow-up review in 2015 noted that lc had since developed an architecture, but that it had still fallen short by not gathering data from management and validating the work with stakeholders.8 there is little discussion in the literature about the ea process as a collaborative effort. in their 2016 discussion of emerging roles for librarians, parker and mckay proposed ea as a new area for librarians themselves to consider moving into, rather than as a source of productive partnerships.9 they argued that there are many similarities in the skillsets and practices of enterprise architects and information professionals (in particular, systems librarians and corporate information managers). areas of crossover identified included: managing risks, for example, related to intellectual property and data retention; structured and standardized approaches to (meta)data and information; technical skills such as systems analysis, database design, and vendor management; and understanding and application of information standards and internal information flows.
while not a research library, within a broader information management context state archives and records nsw has promoted the benefits to records managers of working with enterprise architects, including improved program visibility, strategic assistance with business case development, and the embedding of recordkeeping requirements within the organization’s overall enterprise architecture.10 getting started: context and planning library technology services context in 2015–16, the awareness of enterprise and solution architecture expanded significantly within griffith university’s library technology services (lts) team. in 2015, some members of the team participated in activities led by external consultants to document griffith’s overall enterprise architecture at a high level. in 2016, the author became a member of the university’s solution architecture board (sab). lts submitted several smaller solution architectures to this group for discussion and approval, and team members found this process useful in identifying alternative ways to do things that we may not have otherwise considered. as a small team looking after a portfolio of high-use applications, lts was seeking to align itself as much as possible with university-wide it governance and strategy. these broader approaches included aggressively seeking to move services to cloud hosting, standardizing methods for transferring data between systems, complying with emerging requirements for greater it security, and participating in large-scale disaster recovery planning exercises. the author also needed to improve communication with senior it stakeholders. there was little understanding outside of the library of the scale and complexity involved in delivering online library services to a community of over 50,000 people. 
in a resource-scarce environment, it was increasingly important to make business cases not just in formal project documents but also opportunistically in less formal situations (the “elevator pitch”). existing systems were definitely hindering the library in making progress toward an improved online student experience and more efficient usage of staff resources. a complex ecosystem of more than a dozen library applications had developed over time. the library had selected these at different times based on requirements for specific library functions rather than alignment with an overall architectural strategy. our situation mirrored that described at cisti: “a complex and ‘siloed’ legacy infrastructure with significant vendor lock-in” combined with “reactionary” projects that “extended or redesigned [existing infrastructure] to meet purported needs, without consideration for the complexity that was being added to overcomplicated systems.”11 complex data flows between local systems and third-party providers that were critical to library services were not always well-documented. while lts staff members were extremely experienced, much of their knowledge was tacit. as in many libraries, staff could be observed sharing in informal, organic ways focused on the tasks at hand, but less effort was spent on capturing knowledge systematically. building a more explicit shared understanding about the library’s application portfolio would help address risks associated with staff succession. improved internal documentation would also address emerging requirements for team members to both develop their own understanding in new areas (upskilling) as well as become more flexible in terms of taking up broader roles and responsibilities across the team (cross-skilling). 
there was also a sense that the time was right to take stock and evaluate the current state of affairs before embarking on any major changes. the team was supporting several applications, including the library management system and the interlibrary loans system, that were end-of-life. we needed to make decisions, and these needed to not only address our current issues but also provide a firm platform for the future. it was in this context that in 2016 library technology services approached the information technology architecture and solutions group for assistance. information technology architecture and solutions context in 2014, griffith university embarked on a new approach to enterprise architecture. the chief technology officer was given a mandate by the senior leadership of the university to ensure that it architecture was managed within an architecture governance framework, and the information services ea team was tasked with developing and maintaining an ea and providing services to support the development of solution architectures for projects and operational activities. two new boards were established to provide governance: the information and technology architecture board (itab) would control architectural standards and business technology roadmaps, while the solution architecture board (sab) would “support the development and implementation of solution architecture that is effective, sustainable and consistent with architectural standards and approaches.” project teams and operational areas were explicitly given responsibility to engage with these boards when undertaking the procurement and implementation of it systems.
sets of architectural, information, and integration principles were developed, which promoted integration mechanisms that minimized business impact and were future-proof, loosely coupled, reusable, and shared services.12 our enterprise architects saw their primary role as maximizing the value of the university’s total investment in it by promoting standards and frameworks that could potentially improve consistency and reduce duplication across the whole organization. in order to do this , they would need to work with and through other business units. from the architects’ perspective, a collaboration with the library offered an opportunity to exercise skillsets and frameworks that were in place but still relatively new. griffith was still maturing in this area and attempting to move from the hiring of consultants as the norm to building more internal capability. working with the library would be a good learning experience for a junior architect, who was on a temporary work placement from another part of information services as a professional development opportunity. she could build her skills in a friendly environment before embarking on other engagements with potentially less open client groups. determining scope in a statement of architecture work once the two teams had decided that the process could have benefits on both sides, the next step was to jointly develop a statement of architecture work outlining what the process would include and how we would work together. a formal document was eventually endorsed at the director level, but prior to that, the librarians and the architects had a number of useful informal conversations in which we discussed our expectations, as well as the amount of time that we could reasonably contribute to the process. in developing the statement of work, the two teams agreed to focus on the current “as-is” environment and on assessment of the maturity of the applications already in use (see figure 1). 
this would help us immediately with developing business cases and roadmaps, without necessarily committing either team to the much greater effort required to identify an ideal “to-be” (i.e., future) state to work towards.
figure 1. overview of the architecture statement of work. full size version available at https://doi.org/10.6084/m9.figshare.6667427.
the open group architecture framework (togaf) supports the development of enterprise architectures through four subdomains: business architecture, data architecture, application architecture, and technology architecture.13 the work that we decided to pursue maps to two of these areas: data architecture, which “describes the structure of an organization’s logical and physical data assets and data management resources;” and application architecture, which “provides a blueprint for the individual applications to be deployed, their interactions, and their relationships to the core business processes of the organization.”
enterprise architecture process and outputs
once the architecture statement of work had been agreed on, the two teams embarked on the process of working together over an extended period. while the elapsed time from approval of the statement of work through to endorsement of the architecture outputs by the solution architecture board was approximately fourteen months, the bulk of the work was undertaken within the first six months. following an intense period of information gathering involving large numbers of staff, a smaller subset of people then worked iteratively to refine the outputs for final approval. several times architecture activities had to be placed on hold in favor of essential ongoing operational work and higher priority projects, such as a major upgrade of the institutional repository. the process involved four main activities, which are described in more detail in the following sections.
data asset and application inventory the first activity consisted of a series of three workshops to review information held about library systems in the ea management system, orbus software’s iserver. this is the tool used by the griffith ea team to develop and store architectural models, and to produce artifacts such as architecture diagrams (in microsoft visio format) and documentation (in microsoft word, excel, and powerpoint formats).14 the architects guided a group of librarians who use and support library systems through a process of mapping the types of data held against an existing list of enterprise data entities. in this context, a data entity is a grouping of data elements that is discrete and meaningful within a particular business context. for library staff, meaningful data entities included all the data relating to a person, to items and metadata within a library collection, and to particular business processes such as purchasing. we also identified the systems into which data were entered (system of entry), the systems that were considered the “source of truth” (system of record), and the systems that made use of data downstream from those systems of record (reference systems). the main output of this process was a workbook (figure 2) showing a range of relationships: between systems and data entities; between internal systems; and between internal systems and external systems. the first two columns in the worksheet contain a list of all the data entities and sub-entities stored in library systems (as expressed in the enterprise architecture). along the top of the worksheet is a list of all the products in our portfolio along with a range of systems they are integrated with. each of the orange arrows in this spreadsheet represents the flow of data from one system to another.
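the relationships captured in the workbook can be pictured as a small data structure. the sketch below is a hypothetical illustration, not the structure used in iserver or in the griffith workbook: the class name, field names, and the example systems are all assumptions. it records, for one data entity, the system of entry, the system of record, and the reference systems, and derives the data flows (the "arrows") between them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityMapping:
    """One row of a hypothetical data asset inventory."""
    entity: str                      # e.g. "person" or "collection item"
    system_of_entry: str             # where the data is first entered
    system_of_record: str            # the "source of truth"
    reference_systems: tuple = ()    # downstream consumers of the data


def data_flows(mappings):
    """Return (source, target, entity) triples, one per arrow."""
    flows = []
    for m in mappings:
        # Data moves from the system of entry to the system of record
        # only when they are different systems.
        if m.system_of_entry != m.system_of_record:
            flows.append((m.system_of_entry, m.system_of_record, m.entity))
        # Each reference system receives data from the system of record.
        for target in m.reference_systems:
            flows.append((m.system_of_record, target, m.entity))
    return flows


# Illustrative values only; these system names are invented.
inventory = [
    EntityMapping("person", "student system", "identity management",
                  ("library management system", "reading lists")),
    EntityMapping("bibliographic record", "library management system",
                  "library management system", ("discovery layer",)),
]

for source, target, entity in data_flows(inventory):
    print(f"{source} -> {target}: {entity}")
```

keeping the roles explicit per entity makes it easy to list every system that depends on a given system of record, which is the kind of question a replacement project needs to answer.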
the workbook in this raw form is definitely messy and the data within it is not really meant to be widely consumed in this format. the workbook’s main role is as the data source for the application communication diagram that is described in a later section.
figure 2. part of the data asset and application inventory worksheet. full size version available at https://doi.org/10.6084/m9.figshare.6667430.
as a result of this data asset inventory, the management system used by our architects now contains a far more comprehensive and up-to-date view of the library’s architectural components than before:
• the data entities better reflect library content. for example, while iserver already had a collection item data entity, we were able to add new data entity subtypes for bibliographic records, authority records, and holdings records.
• library systems are now captured in ways that make more sense to us. workshopping with the architects led to the breakdown of several applications into more granular architectural components. for example, the library management system is now represented not just as a single system, but rather as a set of interconnected modules that support different business functions, such as cataloguing and circulation. similarly, our reading lists solution was broken down into its two main components: one for managing reading lists and one for managing digitized content. this granularity has enabled us to build a clearer picture of how systems (and modules within systems) interface with each other.
• the wide range of technical interfaces we have with third parties, such as publishers and other libraries, is now explicitly expressed.
feedback from the architects suggested that the library was very unusual compared to other parts of the organization in terms of the number of critical external systems and services that we use as part of our service provision. previously iserver did not contain a full picture of these critical services, including:
  o the web-based purchasing tools that we use to interact with publishers, such as ebsco’s gobi;15
  o the library links program that we use to provide easier access to scholarly content via google scholar;16 and
  o various harvesting processes that enable us to share metadata with content aggregators, such as the national library of australia’s trove service and the australian national data service’s research data australia portal.17
application maturity survey
the second activity was an application maturity assessment. this involved forty-four staff members from all areas of the library with different viewpoints (technical, non-technical, and management) answering a series of questions in a spreadsheet format. the survey contained questions about:
• how often a system was used;
• how easy it was to use;
• how well it supported the business processes that person carried out;
• how well it performed, for example, in terms of response times;
• how quickly changes/enhancements were implemented in the product;
• how easily the system could be integrated with other systems;
• the level of compliance with industry standards; and
• overall supportability (including vendor support).
as different respondents were assigned multiple systems depending on their level of support and/or use, the survey produced a total of 144 responses relating to eleven different systems. the outputs of this process were a summary table and a series of four graphs. the summary table (see figure 3) presents aggregated scores on a scale of one (low) to five (high) for each application as well as recommended technical and management strategies.
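the aggregation behind such a summary table can be sketched briefly. the code below is a minimal illustration under stated assumptions, not the method used by the griffith architects: the measure names, the midpoint threshold of 3.0, and the mapping from quadrants to the four management strategies named in the article are all hypothetical. the article says only that the management strategy is derived from the business fit and technical fit scores.

```python
from collections import defaultdict
from statistics import mean


def aggregate_scores(responses):
    """Average 1-5 survey scores per application and measure.

    responses: iterable of (application, measure, score) tuples.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for application, measure, score in responses:
        if not 1 <= score <= 5:
            raise ValueError(f"score out of range: {score}")
        grouped[application][measure].append(score)
    return {
        app: {measure: mean(scores) for measure, scores in measures.items()}
        for app, measures in grouped.items()
    }


def management_strategy(business_fit, technical_fit, threshold=3.0):
    """Assumed quadrant mapping; the article does not define it."""
    good_business = business_fit >= threshold
    good_technical = technical_fit >= threshold
    if good_business and good_technical:
        return "optimise"
    if good_technical:
        return "implementation review"
    if good_business:
        return "technology refresh"
    return "replace"


# Invented responses for two unnamed systems.
responses = [
    ("system a", "business_fit", 4), ("system a", "business_fit", 5),
    ("system a", "technical_fit", 4),
    ("system b", "business_fit", 2), ("system b", "technical_fit", 5),
]

summary = aggregate_scores(responses)
for app, scores in summary.items():
    strategy = management_strategy(scores["business_fit"],
                                   scores["technical_fit"])
    print(app, scores, strategy)
```

a simple mean hides disagreement between technical and non-technical respondents, so a fuller version might also report the spread of scores for each measure.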
it is interesting, and somewhat disheartening, to note that scores for the business criticality of the applications are generally much higher than the scores for fitness. there is also some variation in the strategies required; some systems need to be replaced, but there are others where the issues seem to be less technical. the third row of the table shows a product that is scored as highly business-critical and perfectly suited to the job from a technical perspective, yet the product still scores much more poorly for business fit, which could indicate that something has gone wrong in the way that this product has been implemented.

figure 3. table summarizing the results of the application maturity assessment [product names redacted]. applications are rated on a scale of one to five, and one of four management strategies (technology refresh, not shown here; optimise; implementation review; or replace) is recommended. full size version available at https://doi.org/10.6084/m9.figshare.6667433.

figure 4. two of the four graph types produced from the application maturity survey results, for a product [name redacted] that is performing well. full size version available at https://doi.org/10.6084/m9.figshare.6667436.

figures 4 and 5 show the four graph types produced automatically from the survey results. on the left in figure 4 is a view displaying the business criticality, business fit, and technical fit for an individual application (shown in pink) as compared to the overall portfolio (shown in blue). on the right is a graph showing scores for the range of measures covered by the survey.
this particular product is doing well; technical and business fit are high in the graph on the left, and most measures are above average in the graph on the right.

figure 5 shows the remaining two graphs for the same product. the graph on the left plots the scores for business criticality and application suitability (fitness for purpose) to produce a recommended technical strategy. the graph on the right plots the scores for business fit and technical fit to produce a recommended management strategy. in both graphs, it is possible to see how the specific application is performing (the red square) compared to the portfolio overall (the blue diamond). placement within the quadrant with the green optimize label is preferred, as in this case.

figure 5. the remaining two graph types from the application maturity survey results, for a system [product name redacted] that is performing well. the specific system’s location is shown by the red square, while the blue diamond maps the average for all systems in the application portfolio. full size version available at https://doi.org/10.6084/m9.figshare.6667442.

figures 6 and 7 present the same set of graphs for an end-of-life system. in figure 6 the graph on the left shows that the product is very business-critical but that its scores for technical fit and business fit (the lower corners of the pink triangle) are lower than the average across all applications (the lower corners of the blue triangle).
the graph on the right shows that supportability and the time to market for changes and enhancements (the least prominent “points” in the pink polygon) are below the portfolio average (shown in blue along the same axes), while scores for the other measures (criticality, standards compliance, information quality, and performance) were more in line with the portfolio average.

figure 6. the first and second (of four) graphs for a system [product name redacted] that is end-of-life. full size version available at https://doi.org/10.6084/m9.figshare.6667478.

in figure 7, this application is placed well within the quadrant suggesting replacement.

figure 7. the third and fourth (of four) graphs for a system [product name redacted] that is end-of-life. the placement of the red square within the replace quadrant indicates that this product is a strong candidate for decommissioning. this is a marked difference from the portfolio as a whole (the blue diamond), which could be reviewed for possible implementation improvements. full size version available at https://doi.org/10.6084/m9.figshare.6667484.

the graphs are also useful for highlighting anomalies. figure 8 shows a product that is assessed as better-than-average in the portfolio on most measures. however, the survey results quite clearly show that information quality is a major issue.

figure 8. graph from application maturity survey showing a specific area of concern (data quality) for an otherwise well-performing application [product name redacted]. full size version available at https://doi.org/10.6084/m9.figshare.6667487.
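the kind of anomaly shown in figure 8 can also be surfaced programmatically. the following is a minimal sketch, assuming hypothetical measure names and scores and an arbitrary tolerance of one point; it lists the measures on which an application trails the portfolio average, and it is not the method used to produce the figures.

```python
# Hypothetical mean scores (1-5 scale) for each survey measure.
portfolio_average = {
    "usability": 3.4,
    "performance": 3.5,
    "information_quality": 3.6,
    "supportability": 3.2,
}
application = {
    "usability": 4.1,
    "performance": 4.0,
    "information_quality": 2.0,
    "supportability": 3.9,
}


def flag_anomalies(app_scores, portfolio_scores, tolerance=1.0):
    """Return measures where the application trails the portfolio mean
    by more than `tolerance` points (the tolerance is an assumed value)."""
    return sorted(
        measure
        for measure, score in app_scores.items()
        if measure in portfolio_scores
        and portfolio_scores[measure] - score > tolerance
    )


print(flag_anomalies(application, portfolio_average))
```

a wider tolerance reports fewer measures; the right value depends on how much spread the survey responses show.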
this type of finding will help library technology services to target our continuous improvement efforts and work through our relationships with user groups and vendors to get a better result.

application communication diagram

the third major activity was the production of an application communication diagram (see figure 9). this is a visual representation of all of the information that was collated through the workshops using the workbook described above.

figure 9. application communication diagram [simplified view]. full size version available at https://doi.org/10.6084/m9.figshare.6667490.

the diagram includes a number of things to note.

• key applications that make up the library ecosystem. an example of this is the large blue box on the top left. this represents the intota product suite from proquest, which contains multiple components, including our link resolver, discovery layer, and electronic resource manager.
• physical technology. self-checkout machines appear as the small green box mid-right.
• other internal systems that connect to library system components. examples of these are throughout and include: corporate systems, such as peoplesoft for human resources and finances; identity management systems like metadirectory and ping federate; the learning management system blackboard; and research systems, including the research information management system and the researcher profiles system.
• external systems that connect to our systems. these are mostly gathered into the large grey box bottom right.
• actors who access the systems. this includes administrators, staff, students, and the general public. actors are identified using a small person icon.
• interfaces between components.
  each line in the diagram represents a unique connection into another system or interface. captions on these lines indicate the nature of the connection, e.g. manual data entry, z39.50 search, export scripts, and lookup lists.

the production of this diagram has been an iterative process that has taken place over a long time period. the number of components involved in the diagram is quite large, so it is worth noting that the version presented here has actually been simplified. the architects’ tools can present information in different ways and this particular “view” was chosen to balance the need for detail and accuracy with the need to communicate meaningfully with a variety of stakeholders.

production of interactive visualizations

in the fourth and final work package, the data entity and application inventory spreadsheet was used as a data source to provide an interactive visualization (see figure 10). a member of the architecture team converted the workbook (see figure 2) from microsoft excel .xls into a .csv file. he developed a php script to query the file and return a json object based on the parameters that were passed. the data driven documents javascript library (d3.js) was used to produce a force graph that uses shapes, colors, and lines to visually present the spreadsheet information in a more interactive way.18 this tool enables navigation through the library’s network of data entities (shown as orange squares) and applications (shown as blue dots). in the example being displayed, the data entity “bibliographic records—marc” has been selected. it is possible to see both in the visualization and in the popup box on the left how marc records are captured, stored, and used across our entire ecosystem of applications.

this visualization was very much an experiment and the value of this in the long term is something we are still discussing. in the short term, other outputs have proven to be more useful for planning purposes.
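the conversion step described above can be sketched as follows. the original used a php script and d3.js; this python version only illustrates the same idea, and the column names (data_entity, application, relationship), the sample rows, and the function name build_graph are hypothetical. it reads csv rows and returns a nodes-and-links structure of the kind commonly passed to a force-directed layout, optionally filtered to one data entity.

```python
import csv
import io
import json

# Hypothetical extract of the inventory workbook after export to CSV.
SAMPLE_CSV = """data_entity,application,relationship
bibliographic records - marc,library management system,stored
bibliographic records - marc,discovery layer,used
holdings records,library management system,stored
"""


def build_graph(csv_text, entity=None):
    """Return {"nodes": [...], "links": [...]}, optionally for one data entity."""
    nodes = {}
    links = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if entity is not None and row["data_entity"] != entity:
            continue
        # Register each endpoint once, tagged by kind so a front end
        # could draw data entities and applications differently.
        nodes.setdefault(
            row["data_entity"], {"id": row["data_entity"], "type": "data_entity"}
        )
        nodes.setdefault(
            row["application"], {"id": row["application"], "type": "application"}
        )
        links.append(
            {
                "source": row["data_entity"],
                "target": row["application"],
                "relationship": row["relationship"],
            }
        )
    return {"nodes": list(nodes.values()), "links": links}


graph = build_graph(SAMPLE_CSV, entity="bibliographic records - marc")
print(json.dumps(graph, indent=2))
```

filtering on the server side, as assumed here, keeps the payload small when only one data entity is selected; returning the whole graph and filtering in the browser would be an equally plausible design.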
figure 10. interactive visualization of library architecture, showing relationships between a single data subentity (bibliographic records—marc) and various applications. full size version available at https://doi.org/10.6084/m9.figshare.6667493.

discussion

the process described above was not without its challenges, including establishing a common language. enterprise architecture and libraries are both fertile breeding grounds for jargon and acronyms. there was also a disconnect in our understandings of who our users were, with the architects tending to concentrate on internal users, while the librarians were keen to include the perspectives of the academic staff and students who make up our core client base.

these were minor challenges, and the experience of working with the enterprise architects was overall an interesting and positive one for the library. our collaboration validated mckay and parker’s view that there is much crossover in the skillsets and mindsets of librarians and enterprise architects.19 both groups tended to work in systematic and analytical ways, which was helpful in removing some of the more emotive aspects that might have arisen through a more judgmental “assessment” process. the enterprise architects’ job was to promote conformance with standards that are aspirational in many respects for the library. however, the collaborative nature of the process and the immediate usefulness of its outputs helped us to approach this as an opportunity to improve our internal practices as well as the services that we offer to library customers. the architects observed in return that library staff were very open-minded about the process; this had not necessarily always been their experience with other groups in the university.
one reason for this may have been lts’s efforts to communicate early with other library staff. before embarking on this work, we sent emails and provided verbal updates to all participants and their supervisors. these communications were clear about both the time commitment needed for workshops and surveys and the benefits we hoped to achieve.

short-term impacts in the library domain

the level of awareness and understanding in library technology services about ea concepts and methods is much higher than it was previously. our capacity to self-identify architectural issues is better as a result and this is enabling us to be proactive rather than reactive. a recent example of this is a request from our solution architecture board (sab) to seek an exemption from our it advisory board (itab) for our proposed use of the niso circulation interchange protocol (ncip) to support interlibrary loan. while ncip is a niso standard that is widely used in libraries, it is not one of the integration mechanisms incorporated into the architecture standards. as a result of this request, we plan to develop a document for these it governance groups about all the library-specific data transfer protocols that we use: not just ncip, but also z39.50, the open archives initiative protocol for metadata harvesting (oai-pmh), the edifact standard for transferring purchasing information, and possibly others. it is in our interests to educate these important governance groups about integration methods commonly used in the library environment, since these are not well understood outside of our team.

the baseline as-is application architecture diagram gives us a much better grasp on the complexity we are faced with. understanding this complexity is a prerequisite to controlling it. the diagram, and the process worked through to populate it, make it easier to identify manual processes that should be automated and integrations that might be done more efficiently or effectively.
for example, like most libraries, we still have many scheduled batch processes that we could potentially replace in the future with web services to provide real-time updates.

the iserver platform is now an important source of data to support our decision-making, in terms of arriving at broad recommendations for replacing, reimplementing, or optimizing our systems as well as highlighting specific areas of concern. importantly, the process produced relative results, so that we can see across our application portfolio which systems are underperforming compared to others. this makes it easier to determine where the team should be putting its efforts and highlights areas where firmer approaches to vendor management could be applied. a practical example of this was our decision in late 2017 to review (and ultimately unbundle and replace) an e-journal statistics module that was underperforming compared to other modules within the same suite.

the outputs from this process are also helping library technology services communicate, both within our own team and also with other stakeholders. the results of the application maturity assessment were included as part of a business case seeking project funding to upgrade our library management system and replace our interlibrary loans system. that funding bid was successful. while it is possible that the business case would have been approved regardless, a recommendation from the architects that the system needed to be replaced was likely more persuasive than the same recommendation coming solely from a library perspective. in our organizational context, enterprise architects are trusted by very senior executives; they are perceived as neutral and objective, and the processes that they use are understood to be systematic and data-driven.

longer-term impacts in an enterprise context

there are a number of longer-term impacts that may arise from this work.
seeing the library’s applications in a broader enterprise context is likely to lead to more questioning of the status quo and to a desire to investigate new ways to do things. in large organizations like universities, available enterprise systems can offer better functionality and more standardized ways of operating than library systems. financial systems are an obvious example, as are business intelligence tools. the canned and custom reports and dashboards within library systems meet a narrow set of requirements, but for increasingly complex analytics they compare poorly with enterprise data warehousing, emerging “data lake” technologies for less structured data, and sophisticated reporting tools.

an enterprise approach also highlights where the same process is being done across different systems. for example, oai-pmh harvesting is a feature of multiple systems at griffith. traditionally each system provides its own feeds: our data repository, publications repository, and researcher profile system all provide oai-pmh harvesting endpoints for sending metadata to different aggregators. an alternative solution to explore could be to harvest all publications data from multiple systems into our corporate data warehouse (particularly if this evolved to provide more linked data functionality) and provide a single oai-pmh endpoint that could then be managed as a single service.

the ea process has further raised our already high level of concern with the current library systems market. there has been a move in recent years towards larger, highly-integrated “black box” solutions. while there have been some moves towards openness, for example through the development of apis, these are often rhetorical rather than practical. the pricing structures for products mean that we continue to pay for functionality that would not be required if we could integrate library applications with non-library enterprise tools in smarter ways.
at griffith, the products that scored most highly in our maturity assessment in terms of business and technical fit were the less expensive, lightweight, browser-based, cloud-native tools designed to do one or two things really well. this suggests that strategies around a more loosely coupled microservices approach, such as that being developed through the folio open source library software initiative, will be worth exploring in future.20

conclusion

there are few documented examples of librarians working closely with enterprise architects in higher education or elsewhere. the goal of this case study is to encourage other librarians to learn more about architects’ work practices and to seek opportunities to apply ea methods in the library systems space for the benefit not just of the library but also of the organization as a whole.

because this is a single-institution case study, the applicability of this work may be limited in other environments. griffith has a long tradition of highly converged library and it operations; other organizations may have more structural barriers to entry if the library and it areas are not as naturally cooperative.

a further obvious limitation relates to resourcing. the author of the cisti case study cautions that getting started in ea can be complex and resource-intensive. few libraries are likely to be in the position of cisti in having dedicated library architects, so working with others will be required. in many universities, work of this nature is outsourced to specialist consultants because of a lack of in-house expertise. at griffith university, we conducted this exercise entirely with in-house staff. a downside of this was that, despite our best efforts at the scoping stage, competing priorities in both areas meant that this work took far longer than we expected.
in theory, external consultants could have guided the library through similar activities to produce similar outputs, and probably in a shorter timeframe. however, we would observe that the process has been just as important as the outputs; the knowledge, skills, and relationships that have been built will continue into the future. at cisti, investments in ea were assessed by the library as justified by the improvements in technology capability, strategic planning, and services to library users. the griffith experience validates this perspective.

it is also important to note that ea work can and should be done in an iterative way. our experience suggests that some outputs can be delivered earlier than others and useful insights can be gleaned even from drafts. our local “ecosystem” of library applications, enterprise applications, and integrations between these different components must respond to changes in technologies; legal and regulatory frameworks; institutional policies and procedures; and other factors. it is therefore unrealistic to expect outputs from a process like this to remain current for long. assuming that the library’s data and application architecture will always be a work-in-progress, it will continue to be worth the effort involved to build and maintain positive working relationships with the enterprise architects, who now have a deeper understanding of who we are and what we do.

acknowledgements

thank you to anna pegg, associate it architect; jolyon suthers, senior enterprise architect; colin morris, solution consultant; the library technology services team; all our library and learning services colleagues who participated in this initiative; and joanna richardson, library strategy advisor, for support and feedback during the writing of this article. this work was previously presented at theta (the higher education technology agenda) 2017, auckland, new zealand.
references

1 griffith university, “griffith digital strategy 2020,” 2016, https://www.griffith.edu.au/__data/assets/pdf_file/0026/365561/griffithuniversity-digital-strategy.pdf.
2 the open group, “togaf®, an open group standard,” accessed june 4, 2018, http://www.opengroup.org/subjectareas/enterprise/togaf.
3 federation of enterprise architecture professional associations, “a common perspective on enterprise architecture,” 2013, http://feapo.org/wp-content/uploads/2013/11/common-perspectives-on-enterprise-architecture-v15.pdf.
4 judith pirani, “manage today’s it complexities with an enterprise architecture practice,” educause review, february 16, 2017, https://er.educause.edu/blogs/2017/2/manage-todays-it-complexities-with-an-enterprise-architecture-practice.
5 stephen kevin anthony, “implementing service oriented architecture at the canada institute for scientific and technical information,” the serials librarian 55, no. 1–2 (july 3, 2008): 235–53, https://doi.org/10.1080/03615260801970907.
6 kristiina hormia-poutanen, “the finnish national digital library: national library of finland developing a national infrastructure in collaboration with libraries, archives and museums,” accessed march 24, 2018, http://travesia.mcu.es/portalnb/jspui/bitstream/10421/6683/1/fndl.pdf.
7 karl w. schornagel, “information technology strategic planning: a well-developed framework essential to support the library’s and future it needs. report no. 2008-pa-105,” may 2, 2009, https://web.archive.org/web/20090502092325/https://www.loc.gov/about/oig/reports/2009/final%20it%20strategic%20planning%20report%20mar%202009.pdf.
8 joel willemssen, “information technology: library of congress needs to implement recommendations to address management,” december 2, 2015, https://www.gao.gov/assets/680/673955.pdf.
9 rebecca parker and dana mckay, “it’s the end of the world as we know it . . . or is it?
looking beyond the new librarianship paradigm,” in marketing and outreach for the academic library, ed. bradford lee eden (lanham, md: rowman and littlefield, 2016): 81–106.
10 new south wales state archives and records authority, “recordkeeping in brief 59—an introduction to enterprise architecture for records managers,” 2011, https://web.archive.org/web/20120502184420/https://www.records.nsw.gov.au/recordkeeping/government-recordkeeping-manual/guidance/recordkeeping-in-brief/recordkeeping-in-brief-59-an-introduction-to-enterprise-architecture-for-records-managers.
11 anthony, “implementing service oriented architecture,” 236–37.
12 jolyon suthers, “information and technology architecture,” 2016, accessed april 6, 2018, https://www.caudit.edu.au/system/files/media%20library/resources%20and%20files/communities/enterprise%20architecture/ea2016%20joylon%20suthers%20caudit%20ea%20symposium%202016%20-%20it%20architecture%20v2_0.pdf.
13 the open group, “togaf® 9.1,” 2011, 2018, http://pubs.opengroup.org/architecture/togaf9-doc/arch/index.html, part 1: introduction, section 2: core concepts.
14 orbus software, “iserver for enterprise architecture,” accessed march 26, 2018, https://www.orbussoftware.com/enterprise-architecture/capabilities/.
15 ebsco, “gobi®,” accessed june 5, 2018, https://gobi.ebsco.com/gobi.
16 google scholar, “google scholar support for libraries,” accessed june 5, 2018, https://scholar.google.com/intl/en/scholar/libraries.html.
17 national library of australia, “trove,” accessed june 5, 2018, https://trove.nla.gov.au/; australian national data service, “research data australia,” accessed june 5, 2018, https://researchdata.ands.org.au/.
18 mike bostock, “d3.js—data-driven documents,” accessed april 3, 2018, https://d3js.org/.
19 parker and mckay, “it’s the end of the world,” 88.
20 marshall breeding, “five key technology trends for 2018,” computers in libraries 37, no. 10 (december 2017), http://www.infotoday.com/cilmag/dec17/breeding--five-key-technology-trends-for-2018.shtml.
fagan 140 information technology and libraries | september 2006

visual search interfaces have been shown by researchers to assist users with information search and retrieval.
recently, several major library vendors have added visual search interfaces or functions to their products. for public service librarians, perhaps the most critical area of interest is the extent to which visual search interfaces and text-based search interfaces support research. this study presents the results of eight full-scale usability tests of both the ebscohost basic search and visual search in the context of a large liberal arts university.

like the web, online library research database interfaces continue to evolve. even with the smaller scope of library research databases, users can still suffer from information overload and may have difficulty in processing large results sets. web search-engine research has shown that the proportion of searchers viewing only the first results page increased from 29 percent in 1997 to 73 percent in 2002 among united states-based web search-engine users.1 additionally, the mean number of results viewed per query in 2001 was 2.5 documents.2 this may indicate either increasing relevance in search results or an increase in simplistic web interactions.

visual alternatives to search interfaces attempt to address some of the problems of information retrieval within large document sets. while research and development of visual search interfaces began well before the advent of the web, current research into visual web interfaces has continued to expand.3 within librarianship, most visual interface research seems to focus on interfaces that could be applied to large-scale digital library projects.4 although library products often have more metadata and organizational structure than the web, search engine-style interfaces adapted for field searching and boolean operators are still the most frequent approach to information retrieval.5 yet research has shown that visual interfaces to digital libraries offer great benefit to the user.
zaphiris emphasizes the advantage of shifting the user’s mental load “from slow reading to faster perceptual processes such as visual pattern recognition.”6 according to borner and chen, visual interfaces can help users better understand search results and the interrelation of documents within the result set, and refine their search.7 in their discussion of the function of “overviews” in visual interfaces, greene and his colleagues say that overviews can help users make better decisions about potential relevance, and “extract gist more accurately and rapidly than traditional hit lists provided by search engines.”8 several library database vendors are implementing visual interfaces to navigate and display search results. serials solutions’ new federated search product, centralsearch, uses technology from vivisimo that “organizes search results into titled folders to build a clear, concise picture for its users.”9 ulrich’s fiction connection web site has used aquabrowser to help one “discover titles similar to books you already enjoy.”10 the queens library has also implemented aquabrowser to provide a graphical interface to its entire library’s collections.11 xreferplus maps search results to topics by making visual connections between terms.12 comabstracts, from cios, uses a similar concept map, although one cannot launch a search directly from the tool. groxis chose a circular style for its concept-mapping software, grokker. partnerships between groxis and stanford university began as early as 2004, and grokker is now being implemented at stanford university libraries academic and information resources.13 ebsco and groxis announced their partnership in march 2006.14 the ebscohost interface now features a visual search tab as an option that librarians can choose to leave on (by default) or turn off in ebsco’s administrator module. figure 1 shows a screenshot of the visual search interface. 
within the context of library research databases, visual searching likely provides a needed alternative to traditional, text-based searching. to test this hypothesis, james madison university libraries (jmu libraries) decided to conduct eight usability sessions with ebscohost’s new visual search, in coordination with ebsco and groxis. while this is by no means the first published usability test of vendor interfaces, the literature understandably reveals a far greater number of usability tests on in-house projects such as library web sites and customized catalog interfaces than on library database interfaces.15 it is hoped that by observing users try both the ebsco basic search and visual search, more understanding will be gained about user search behavior and the potential benefits of a visual approach.

method

the usability sessions were conducted at jmu, a large liberal arts university whose student population is mostly drawn from virginia and the northeastern region. only 10 percent of the students are from minority groups. jmu requires that all freshmen pass the online information-seeking skills test (isst) before becoming sophomores, and the libraries developed a web tutorial, “go for the gold,” to prepare students for the isst. therefore, usability-test participants were largely white, from the northeastern united states, and had exposure to basic information literacy instruction.

usability testing of a large, multidisciplinary library database: basic search and visual search
jody condit fagan

jody condit fagan (faganjc@jmu.edu) is digital services librarian at carrier library, james madison university, harrisonburg, virginia.

usability testing of a large, multidisciplinary library database | fagan 141

jmu libraries’ usability lab is a small conference room with one computer workstation equipped with morae software.16 audio and video recordings of user speech and facial expressions, along with “detailed application and computer system data,” are captured by the software and combined into a searchable recording session for the usability tester to review. a screenshot of the morae analysis tool is shown in figure 2. the usability test script was developed in collaboration with representatives of ebsco and groxis. ebsco provided access to the beta version of visual search for the test, and groxis provided financial incentives for student participants. the test sessions and the results analysis, however, were conducted solely by the researcher and librarian facilitators. the visual search development team was provided with the results and video clips after analysis. usability study participants were recruited by posting an announcement to the jmu students’ web portal. a $25 gift certificate was offered as an incentive, and more than 140 students submitted a participation interest form. these were sorted by the number of years the student(s) had been at jmu to try to get as many novice users as possible. because so much of today’s student work is conducted in groups, four groups of two, as well as four individual sessions, were scheduled, for a total of twelve students. jmu librarians who had received both human-subjects training and an introduction to facilitation served as facilitators to the usability sessions. their role was to watch the time and ask open-ended questions to keep the student participants talking about what they were doing. the major research question it was hoped would be answered by the tests was, “to what extent does ebsco’s basic search interface and visual search interface support student research?” since the tests could not evaluate the entire research process, it was decided to focus on the development of the research topic. 
specifically, the goal was to find out how well each interface supported the intellectual process of the students in coming up with a topic, narrowing their topic, and performing searches on their chosen subtopics. an additional goal was to determine how well users were able to find and use the interface widgets and how satisfied the students felt after using the interfaces. the overall session was structured in this order: a pretest survey about the students’ research experience; a series of four tasks performed with ebscohost’s basic search; a series of three tasks performed with ebscohost’s visual search; and a posttest interview. both basic and visual search interfaces were used with academic search premier. each of the eight sessions was recorded in its entirety by the morae software, and each recording was viewed in its entirety. to try to gain some quantitative data, the researcher measured the time it took to complete each task. however, due to variables such as facilitator involvement and interaction between group members, the numbers did not lend themselves to comparison. also, it would not have been clear whether longer times were a positive or a negative sign. taking longer to come up with subtopics, for example, could as easily be a sign of exploration and interested inquiry as it might be of frustration or failure. as such, the data are mostly qualitative in nature.

figure 1. screenshot of ebscohost’s visual search

figure 2. screenshot of morae recorder analysis tool

142 information technology and libraries | september 2006

results

the student participants were generally underclassmen. two of the students, group 2, were in their third year at jmu. all others were in their first or second year. while students were drawn from a wide variety of majors, it is regrettable that there was not stronger representation from the humanities.
when asked, “what do you normally use to do research?” six students answered an unqualified “google.” three other students mentioned internet search engines in their response. only two students gave the brand or product names of library research databases: one said, “pubmed, wilsonomnifile, and ebsco,” while the other, a counseling major, mentioned psycinfo and cinahl. when shown a screenshot of basic search, half of the students said they had used an ebsco database before. all of the participants said they had never before used a visual search interface. the full results from the individual pretest interviews are shown in figures 3 and 4. to begin the usability test, the facilitator started internet explorer and loaded the ebscohost basic search, which was set to have a single input box. the scripts for each task are listed in figure 5. note that task 4 was only featured in the basic search portion of the test. for task 1 on the basic search—coming up with a general topic—all of the participants began by using their own topics rather than choosing from the list of ideas. also, although they were asked to “spend some time on ebsco to come up with a possible general topic,” all but group 6 fulfilled this by simply thinking of a topic (sometimes after some discussion within the groups of two) and typing it in. with the exception of group 6, the size of the result set did not inspire topic changes. figure 6 summarizes the students’ searches and relative success on task 1. in retrospect, the tests might have yielded more straightforward findings if the students had been directed to choose from the provided list of topics, or even to use the same topic. however, part of the intention was to determine whether either interface was helpful in guiding the students’ topic development. it was hoped that by defining the scenario as writing a paper for class, their topic selection would reflect the realities of student research. 
however, it probably would have been better to have used the same topic for each session.

figure 3. results from pretest interview, groups 1–4

figure 4. results from pretest interview, groups 5–8

task 2 asked participants to identify three subtopics, and task 3 asked them to refine their search to one subtopic and limit it to the past two years. a summary of these tasks appears in figure 7. a surprising finding during task 2 was that students did go past the first page of results. four groups went past the first page of results, while two groups did not get enough results for more than one page. the other two groups did not choose to look past the first page of results. this contrasts with jansen and spink’s findings, in which 73 percent of web searchers only view the first results page.17 another pleasant surprise was that students spent some time actually reading through results when they were searching for ways to narrow their topic. five groups scanned through both titles and abstracts, which requires clicking on the article titles to display the citation view. one of these five additionally chose to open full-text articles and look at the references to determine relevance. two groups scanned through the results pages only, but looked at both article titles and the subjects in the left-hand column. group 5 seemed to only scan the titles in the results list. this user behavior is also quite different from that found with web search-engine users. in one recent study by jansen and spink, more than 90 percent of the time, search-engine users viewed five or fewer documents per query.18 the five groups that chose to view the citation/abstract view by clicking on the title (groups 1, 2, 3, 4, and 6) identified subtopics that were significantly more interesting and plausible than the general topic they had come up with.
from looking at their results, these groups were clearly identifying their subtopics from reading the abstracts and titles rather than just brainstorming. although group 2 had the weakest subtopics, going from the world baseball classic to specific players’ relationships to the classic and the home-run derby, they were working with a results set of only eleven items. the three groups that relied on scanning only the results list succeeded to an extent, but as a whole, the new subtopics would be much less satisfying to the scenario’s hypothetical professor. after scanning the titles on two pages of results, group 5 (an individual) ended up brainstorming her subtopics (prevention, intervention, and what an eating disorder looks like) based on her knowledge of the topic rather than drawing from the results. group 7 (a group of two) identified their subtopic (sand dunes) from the left-hand column on the results list. group 8 (an individual) picked up his subtopics (steroids in sports, president bush’s stance on steroids, and softball) from reading keywords in the article titles on the first page of results. since the subjects in the left-hand column were a new addition to basic search, the use of this area was also noted. four groups used the subjects in the left-hand column without prompting. two groups saw the subjects (i.e., ran the mouse over them) but did not use them. the remaining two groups took no action related to the subjects. a worrisome finding of tasks 2 and 3 was that most students had trouble with the default search being set to phrase searching rather than to a boolean and. this can easily be seen in looking at the number of results the students came up with when they tried to refine their topics (figure 7). even though most students had some limiter still in effect (full text, last two years) when they first tried their new refined search, it was the phrase searching that really hurt them. luckily, this is a customizable setting in ebsco’s administrator module, and it is recommended that libraries set the “proximity” expander to “on” by default, which will automatically combine search terms with and.

figure 5. tasks posed for each portion of the usability test.

figure 6. task 1, coming up with a general topic using basic search

task 4, finding a “recent article in the economist about the october earthquake in kashmir,” was designed to test the usability of the ebscohost publication search and limiter. it was listed as optional in case the facilitator was worried that time was an issue. four of the student groups—1, 2, 5, and 7—were posed the task. of these four groups, three relied entirely on the publication limiter on the refine search panel. group 1 chose to use the publication search. all four groups quickly and successfully completed this task.

additional questions during basic search tasks

at various points during the three tasks in ebsco’s basic search, the students were asked to limit their results set to only full-text results, to find one peer-reviewed article, and to limit their search to the past two years. seven out of the eight student groups had no problem finding and using the ebscohost “refine search” panel, including the full-text check box, date limiter, and peer-reviewed limiter. group 7 did not find the refine search panel or use its limiters until specifically guided by the facilitator near the end. this group had found other ways to apply limits: they used the “books/monographs” tab on the results list to limit to full text, and the results-list sorting function to limit to the past two years. after having seen the refine search panel, group 7 did use the “peer reviewed” check box to find their peer-reviewed article. toward the end of the basic search portion, students were asked to “save three of their results for later.” three groups demonstrated full use of the folder.
an additional three groups started to use the folder and viewed the folder but did not use print, save, or e-mail. it is unclear whether they knew how to do so and just did not follow through, or whether they thought they had safely stored the items. two students did not use the folder at all, acting individually on items. one group used the “save” function but did not save each article.

figure 7. basic search, task 2 and 3, coming up with subtopics.

visual search

as in task 1 with basic search, students did not discover general topics by using the interface, but simply typed in a topic of interest. only two groups, 1 and 8, chose to try the same topic again. in the interests of processing time, visual search limits the search to the first 250 results retrieved. since jmu has set results to display in chronological order by default, the most recent 250 results were returned during these usability tests. figure 8 shows the students’ original search terms using visual search, the actions they took while looking for subtopics, and the subtopics they identified. additionally, if the subtopics they identified matched words on the screen, the location of those words is noted. three of the groups (1, 2, and 5) identified subtopics when looking at the labels on topic and subtopic circles. group 3 identified subtopics while looking at article titles as well as the subtopic circles. the members of group 6 identified subtopics while looking at the citation view and reading the abstract and full text, as well as rolling over article titles with their mice. it was not entirely clear where the student in group 4 got his subtopics from. two of the three subtopics did not seem to be represented in the display of the results set. his third subtopic was one of the labels from a subtopic circle. groups 7 and 8 both struggled with finding their subtopics.
group 7 simply had a narrow topic (“jackalope”), and group 8 misspelled “steroids” and got few results for that reason. lacking many clusters, both groups tried typing additional terms into the title keyword box on the filter panel, resulting in fewer or zero results. for task 3, students were asked to limit their search to the last two years and to refine their search to a chosen subtopic (figure 9). particularly because the results set is limited to 250, it would have been better to have separated these two tasks: first to have them limit the content, then perhaps the date of the search. three groups, all groups of two, used the date limit first (2, 6, and 8). three groups (1, 3, and 6) narrowed the content of their search by typing a new search or additional keywords into the main search box. groups 2 and 4 narrowed the content of their search by clicking on the subtopic circles. note that this does not change the count of the number of results displayed in the filter panel. groups 5 and 7 tried typing keywords into the title keyword filter panel and also clicking on circles. both groups fared better with the latter approach. group 8 typed an additional keyword into the filter panel box to narrow his search. while five of the groups announced the subtopic to which they wanted to narrow their search before beginning to narrow their topic, groups 2, 7, and 8 began to interact with the interface and experiment with subtopics before choosing one. while groups 2 and 8 arrived at a subtopic and identified it, group 7 tried many experiments, but since their original topic (jackalope) was already narrow, they were not ultimately successful in identifying or searching on a subtopic. as with basic search, students were asked to save three articles for later. five of the groups (2, 4, 5, 6, and 8) used the “add to folder” function which appears in the citation view on the right-hand side of the screen. 
of these, three groups proceeded to “folder has items.” of these groups, two chose the “save” function. two groups used either “save” or “e-mail” to preserve individual items, rather than using the folder. one group experienced system slowness and was not able to load the full-record view in time to determine whether they would be able to save items for later. a concern is that students may not realize that, in folder view or individually, the “save” button really just formats the records. the user must still use a browser function to save the formatted page. no student performed this function.

figure 8. visual search, task 1 and 2, coming up with a general topic

figure 9. visual search, task 3, searching on subtopic (before date limit, if possible)

several students had some trouble with the mechanics of the filter panel, shown in figure 10. seven of the eight groups found and used the filter panel, originally hidden from view, without assistance. however, some users were not sure how the title keyword box related to the main search box. at least two groups typed the same search string into the title keyword box that they had already entered into the main search box. also, users were not sure whether they needed to click the search button after using the date limiter. however, in no case was a student unable to quickly recover from these areas of confusion.

results of posttest interview

at the end of the entire usability session, participants were asked several questions while looking at screenshots of each interface. a full list of posttest interview questions can be found in figure 11. when speaking about the strengths of basic search, seven of eight groups talked about the search options, such as field searching and limiters.
the individual in group 1 mentioned “the ability to search in fields, especially for publications and within publications.” one of the students in group 3 mentioned that “i thought it was easier to specify the search for the full text and the peer reviewed—it had a separate page for that.” the student in group 4 added, “they give you all the filter options as opposed to the other one.” five of the eight groups also mentioned familiarity with the type of interface as a strength of basic search. since jmu has only had access to ebsco databases for less than a year, and half of the students admitted they had not used ebsco, it seemed their comments were with the style of interface more than their experience with the interface. the student in group 1 commented, “seems like the standard search engine.” group 2 noted, “it was organized in a way that we’re used to more,” and group 3 said, “it’s more traditional so it’s more similar to other programs.” half of the groups mentioned that basic search was clear or organized. group 6 explained, “it was nice how it was really clearly set out . . . like, everything’s in a line.” not surprisingly, visual search’s strengths surrounded the grouping of subtopics: seven of eight groups made some comment about this. the student in group 4 said, “it groups the articles for you better. it kinda like gives you the subtopics when you get into it and search it and that’s pretty cool.” the student in group 8 stated, “you can look and see an outline of where you want to go . . . it’s easy to pinpoint it on screen like that’s where i want to go with my research.” some of the other strengths mentioned about visual search were: showing a lot of information on one screen without scrolling (group 7) and the colorful nature of the interface. 
a student in group 2 added, “i like the circles and squares—the symbols register easily.” the only three weaknesses listed for basic search in response to the first question were: “not having a spot to put in words not to search for” (group 1); that, like internet search engines, basic search should have “a clip from the article that has the keyword in it, the line before and the line after” (group 6); and that basic search might be too broad, because “unless you narrow it, [you have to] type in keywords to narrow it down yourself” (group 7).

figure 10. visual search filter panel

figure 11. posttest interview questions

with regard to weaknesses of visual search, half of the groups had some confusion about the content, partially due to the limited number of results. a student from group 7 declared, “it may not have as many results. . . . if you typed in ‘school’ on the other one, it might have . . . 8,000 pages [but] on this you have . . . 50 results.” the student in group 5 agreed, saying that with visual search, “they only show you a certain number of articles.” the student in group 1 said, “it’s kind of confusing when it breaks it up into the topics for you. it may be helpful for some other people, but for the way my mind works i like just having all my results displayed out like on the regular one.” half of the groups also made some comment that they were just not used to it. six of the groups were asked which one they would choose if they had class in one hour. (it is not clear why the facilitator did not ask this question of groups 3 and 8.) four groups (1, 2, 5, and 7) indicated basic search.
one student in group 2 said, “i think it’s easier to use, but i don’t trust it.” the other in group 2 added, “it’s new and we’re not quite sure because every other search engine is you just type in words and it’s not graphical.” both students in group 7 commented that the familiarity of basic search was the reason they would use it for class in one hour. both groups 2 and 7 would later say that they liked the visual search interface better. two groups (4 and 6) chose visual search for the “class in one hour” scenario. the student in group 4 commented, “because it does cool things for you, makes it easier to find. otherwise you’re going through by title.” both these groups would later also say that they liked the visual search interface better. the students were also asked to describe two scenarios, one in which they would use basic search and one in which they would use visual search. four of the groups (1, 3, 5, and 6) said they would use basic search when they knew what information they needed. seven of the eight groups said they would use visual search for broad topics. all the students’ responses are given in figure 12. when asked which interface they preferred, the groups split evenly. comments from the four who preferred basic search (1, 3, 5, and 8) centered on the familiarity of the interface. the student in group 5 added, “the regular one . . . i like to get things done.” all four of these students had said they had used an ebsco database before. the two students who could list library research databases by name were both in this group. of the four who preferred visual search (2, 4, 6, and 7), three groups had never used ebsco before, though one of the students in group 7 thought he’d used it in the library web tutorial. group 2 commented, “it seemed like it had a lot more information . . . cool . . . futuristic.” the student in group 4 said, “it’s kind of like a little game. . . . 
like you’re trying to find the hidden piece.” group 7 commented that visual search was colorful and intriguing. the students in group 6 both stated “the visual one” in unison. one student said that visual search was more “[eye-catching] . . . it keeps you focused at what you are doing, i felt, instead of . . . words . . . you get to look at colors” and added later that it was “fun.” the other student in group 6 said, “i’m a very visual learner. so to see instead of having to read the categories, and say oh this is what makes sense, i see the circles like ‘abilities test’ or ‘academic achievement’ and i automatically know that’s what it is . . . and i can see how many articles are in it . . . and you click on it and it zooms in and you have all of them there.” the second student went on to add, “i’ve been teaching my mom how to use technology and the visual search would be so much easier for her to get, because its just looks like someone drew it on there like this is a general category and then it breaks it down.” other suggestions given during the free-comment portion of the survey were to have the filters from basic search appear on visual search (especially peer-reviewed); curiosity about when visual search would become available (at the time it was in beta test); and a suggestion to have general-education writing students write their first paper using visual search.

figure 12. examples of two situations: one in which you would be more likely to use visual search, and one in which you would be more likely to use ebsco

discussion

this evaluation is limited both because most students chose different topics for each search interface, and because they only had time to research one topic in each interface. therefore, there could be any number of scenarios in which they would have performed differently.
however, this study does show that, for some students, or for some search topics, visual search will help students in a way that basic search may not. one hypothesis of this study was that within the context of library research databases, visual searching would provide a needed alternative from traditional, text-based searching. the success of the students was observed in three areas: the quality of the subtopics they identified after interacting with their search results; the improvement of the chosen subtopic over their chosen general topic, and the quality of the results they found for their subtopic search. the researcher made a best effort to compare topics and results sets and decide which interface helped the student groups to perform better. in addition, qualities that each interface seemed to contribute to the students’ search process were noted (figure 13). these qualities were determined by reviewing the video recordings and examining the ways in which either interface seemed to support the attitudes and behaviors of the students as they conducted their research tasks. when considering all three of these areas, four groups did not, overall, require visual search as an alternative to basic search (1, 3, 4, and 7). two of these groups (4 and 7) seemed to benefit from more focus when using the basic search interface. although visual search lent them more interaction and exploration (which may be why they said they preferred visual search), it seems the focus was more important to their performance. for the other two groups (1 and 3), basic search really supported the depth of inquiry and high interest in finding results. these two groups confirmed that they preferred basic search. for two groups (6 and 8), visual search seemed an equally viable alternative to basic search. for group 6, both interfaces seemed to support the group’s desire to explore; they said they preferred visual search. 
for the student in group 8, basic search seemed to orient him to the goal of finding results, while visual search supported a more exploratory approach. since, in his case, this exploratory approach did not turn out well in the area of finding results, it is not surprising that he ended up preferring basic search. the remaining two groups (2 and 5) performed better with visual search, upholding the hypothesis that an alternative search is needed. group 2 seemed bored and uninterested in the search process when using basic search even though they chose a topic of personal interest: “world baseball classic.” visual search caught their attention and sparked interest in the impersonal topic “global warming.” group 2 spent more time exploring while using the visual search interface, and in the posttest survey admitted that they preferred the visual search interface. the student in group 5 said she preferred basic search, and as a self-described psycinfo user, seemed comfortable with the interface. yet for this test scenario, visual search made her think of new ideas and supported more real exploration during the search process. of the three areas, basic search appeared to have the upper hand in two: the quality of the subtopics identified by the students, and the improvement of the chosen subtopics over the general topics. this is at least partially explained by the limitation of visual search to the most recent 250 results. that is, as the students explored the visual search results, choosing subtopics would not relaunch a search on that subtopic, which would have engendered more and perhaps better subtopics. in the third area, the quality of the results set for the chosen topic, visual search seemed to have the upper hand if only because of the phrase-searching limitation present in jmu’s administrative settings for basic search. that is, students were often finding few or no results on their chosen subtopics in basic search.
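
the effect of the two default behaviors on result counts can be illustrated with a minimal sketch. the python below is not ebsco's implementation; the matching functions and the three titles are hypothetical, invented only to show why a two-word query treated as an exact phrase can return nothing while the same words combined with a boolean and return several records.

```python
def tokenize(text):
    """Lowercase a string and split it into word tokens."""
    return text.lower().split()


def matches_phrase(query, document):
    """True only if the query words appear adjacent and in order."""
    q, d = tokenize(query), tokenize(document)
    return any(d[i:i + len(q)] == q for i in range(len(d) - len(q) + 1))


def matches_and(query, document):
    """True if every query word appears somewhere in the document."""
    return set(tokenize(query)) <= set(tokenize(document))


# Hypothetical titles, invented for illustration only.
titles = [
    "steroids and the future of professional sports",
    "testing for steroids in college sports programs",
    "sports medicine and steroids policy",
]

query = "steroids sports"
phrase_hits = [t for t in titles if matches_phrase(query, t)]
and_hits = [t for t in titles if matches_and(query, t)]

print("phrase search hits:", len(phrase_hits))
print("boolean and hits:", len(and_hits))
```

in this toy collection every title mentions both words, yet none contains them side by side, so the phrase interpretation finds nothing. this mirrors the pattern reported for the students' refined searches, although the real database's matching rules are certainly more elaborate.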
figure 13. strengths of basic search and visual search in quality of subtopics, most improved topic, and result sets

this study also had findings that seem to transcend these interfaces and the underlying database. first, libraries should strongly consider changing their database default searching from phrase searching to a boolean and, if possible. (this is possible in ebsco using the administrative module.) second, most students did not have trouble finding or using the interface widgets to perform limiting functions, with the one exception being some confusion about the relationship between the visual search filters and main search box. in contrast to some findings on web search behavior, students may well travel beyond the first page of results and view more than just a few documents when determining relevance. finally, the presence of subject terms in both interfaces proved to be an aid to understanding results sets. this study also pointed out some improvements that could be made to visual search. first, it would be helpful if visual search returned more than 250 results in the initial set, or at least provided an overview of the size, type, and extent of objects using available metadata.19 however, even with today’s high-speed connections, result-set size will need to be balanced with performance. perhaps, as students click on subtopics, the software could rerun the search so that the results set does not stay limited to the original 250. on a minor note, for both basic and visual search, greater care should be taken to make sure users understand how the save function works and to alert users to the need to use the browser function to complete the process. it should be noted that ebsco has not stopped developing visual search, and many of these improvements may well be on their way.
ebsco says it will be adding more support for limiters, display preferences, and contextual text result-list viewing at some point in the future. these feature sets can currently be viewed on grokker.com. an important area for future research is user behavior in library subscription databases. while these usability tests provide a qualitative evaluation of a specific interface, it would be worthwhile to have a more reliable understanding about students’ searching behavior in library databases across similar interfaces. since public service librarians deal primarily with users who have self-identified as needing help, their experience does not always describe the behavior of all users. furthermore, studies of web search behavior may not apply directly to searching in research databases. specifically, students’ use of subject terms in both interfaces could be explored. half of the student groups in this study chose to use the basic search subject clusters in the left-hand column on the results page, despite the fact that they had never seen them before (this was a beta-test feature). is this typical? would this strategy hold up to a variety of research topics? another interesting question is the use of a single search box versus several search boxes arrayed in rows (to assist in constructing boolean and field searching). in the ebsco administrative module, librarians can choose either option. based on research rather than anecdotal evidence, which is best? another option is the default sort: historically, at jmu libraries, this has been a chronological sort. does this cause problems for relevance-thinking students? finally, the issue of collaboration in student research using library research databases would be a fascinating topic. certainly, these usability recordings could be reviewed with a mind to capturing the differences between individuals and groups of two, but there may be better designs for a more focused study of this topic. 
conclusion
if you take away one conclusion from this study, let it be this: do not hesitate to try visual search with your users! information providers must balance investments in cutting-edge technology with the demands of their users. libraries and librarians, of course, are a key user group for information providers. a critical need in librarianship is to become familiar with the newest technology solutions, particularly with regard to searching, in order to provide vendors with informed feedback about which technologies to pursue. by using and teaching new visual search alternatives, librarians will be poised to influence the further development of alternatives to text-based searching.
references and notes
1. bernard j. jansen and amanda spink, “how are we searching the world wide web? a comparison of nine search engine transaction logs,” special issue, information processing and management 42, no. 1 (2006): 257.
2. bernard j. jansen and amanda spink, “an analysis of web documents retrieved and viewed,” in proceedings of the 4th international conference on internet computing (las vegas, 2003), 67.
3. aravindan veerasamy and nicholas j. belkin, “evaluation of a tool for visualization of information retrieval results,” sigir forum (acm special interest group on information retrieval) (1996): 85–93; katy börner and javed mostafa, “jodl special issue on information visualization interfaces for retrieval and analysis,” international journal on digital libraries 5, no. 1 (2005): 1–2; ozgur turetken and ramesh sharda, “clustering-based visual interfaces for presentation of web search results: an empirical investigation,” information systems frontiers 7, no. 3 (2005): 273–97.
4. stephen greene et al., “previews and overviews in digital libraries: designing surrogates to support visual information seeking,” journal of the american society for information science 51, no. 4 (2000): 380–93; panayiotis zaphiris et al., “exploring the use of information visualization for digital libraries,” new review of information networking 10, no. 1 (2004): 51–69.
5. katy börner and chaomei chen eds., visual interfaces to digital libraries, 1st ed. (berlin; new york: springer, 2003), 243.
6. zaphiris et al., “exploring the use of information visualization for digital libraries,” 51–69.
7. börner and chen, visual interfaces to digital libraries, 243.
8. greene et al., “previews and overviews in digital libraries,” 380–93.
9. “vivisimo corporate profile,” in vivisimo, http://vivisimo.com/html/about (accessed apr. 19, 2006).
10. “aquabrowser library—fiction connection,” www.fictionconnection.com/ (accessed apr. 19, 2006).
11. “queens library—aquabrowser library,” http://aqua.queenslibrary.org/ (accessed apr. 19, 2006).
12. “xrefer—research mapper,” www.xrefer.com/research (accessed apr. 19, 2006).
13. “stanford ‘groks,’” http://speaking.stanford.edu/back_issues/soc67/library/stanford_groks.html (accessed apr. 19, 2006); “grokker at stanford university,” http://library.stanford.edu/catdb/grokker/ (accessed apr. 19, 2006).
14. “ebsco has partnered with groxis to deliver an innovative visual search feature as part of ebsco,” www.groxis.com/service/grokker/pr29.html (accessed apr. 19, 2006).
15. michael dolenko, christopher smith, and martha e. williams, “putting the user into usability: developing customer-driven interfaces at west group,” in proceedings of the national online meeting 20 (medford, n.j.: learned information, 1999), 81–90; e. t. morley, “usability testing: the silverplatter experience,” cd-rom professional 8, no. 3 (1995); ron stewart, vivek narendra, and axel schmetzke, “accessibility and usability of online library databases,” library hi tech 23, no. 2 (2005): 265–86; nicholas tomaiuolo, “deconstructing questia: the usability of a subscription digital library,” searcher 9, no. 7 (2001): 32–39; b. hamilton, “comparison of the different electronic versions of the encyclopaedia britannica: a usability study,” electronic library 21, no. 6 (2003): 547–54; heather l. munger, “testing the database of international rehabilitation research: using rehabilitation researchers to determine the usability of a bibliographic database,” journal of the medical library association (jmla) 91, no. 4 (2003): 478–83; frank cervone, “what we’ve learned from doing usability testing on openurl resolvers and federated search engines,” computers in libraries 25, no. 9 (2005): 10–14; alexei oulanov and edmund f. y. pajarillo, “usability evaluation of the city university of new york cuny+ database,” electronic library 19, no. 2 (2001): 84–91; steve brantley, annie armstrong, and krystal m. lewis, “usability testing of a customizable library web portal,” college & research libraries 67, no. 2 (2006): 146–63; carole a. george, “usability testing and design of a library web site: an iterative approach,” oclc systems & services 21, no. 3 (2005): 167–80; leanne m. vandecreek, “usability analysis of northern illinois university libraries’ web site: a case study,” oclc systems & services 21, no. 3 (2005): 181–92; susan goodwin, “using screen capture software for web-site usability and redesign buy-in,” library hi tech 23, no. 4 (2005): 610–21; laura cobus, valeda frances dent, and anita ondrusek, “how twenty-eight users helped redesign an academic library web site,” reference & user services quarterly 44, no. 3 (2005): 232–46.
16. “morae usability testing for software and web sites,” www.techsmith.com/morae.asp (accessed apr. 19, 2006).
17. jansen and spink, “an analysis of web documents retrieved and viewed,” 67.
18. ibid.
19. greene et al., “previews and overviews in digital libraries,” 381.
user experience testing in the open textbook adaptation workflow: a case study
camille thomas, kimberly vardeman, and jingjing wu
information technology and libraries | march 2021
https://doi.org/10.6017/ital.v40i1.12039
camille thomas (cthomas5@fsu.edu) is scholarly communications librarian, florida state university. kimberly vardeman (kimberly.vardeman@ttu.edu) is user experience librarian, texas tech university. jingjing wu (jingjing.wu@ttu.edu) is web librarian, texas tech university. © 2021.
abstract
as library publishers and open education programs grow, it is imperative that we integrate practices in our workflows that prioritize and include end users. although there is information available on best practices for user testing and accessibility compliance, more can be done to give insight into the library publishing context. this study examines the user and accessibility testing workflow during the modification of an existing open textbook using pressbooks at texas tech university.
introduction
as library publishers and open education programs grow, there is an opportunity to integrate into our workflows practices that prioritize and include end users. although there is information available on best practices for user testing and accessibility compliance, more can be done to give insight into the library publishing context. there are currently no case studies that examine the user and accessibility testing workflow during the modification of an existing open textbook. this study examines user experience testing as a method to improve oer interfaces, learning experience, and accessibility during the oer production process using pressbooks at a large research university.
literature review
user experience (ux) is a “momentary, primarily evaluative feeling (good–bad) while interacting with a product or service” that can go beyond simple usability evaluations to consider “qualities such as meaning, affect and value.”1 ux evaluations are generally applied to library websites, spaces, and interfaces and are not currently a common element in library publishing workflows. open educational resources (oer) are defined as teaching, learning, and research resources that reside under an intellectual property license that permits their free use and repurposing by others.2 whitfield and robinson make a distinction between teaching vs. learning resources, instructional vs. interface usability, and ease of modification for creators.3 this select literature review considers usability testing of e-books, oer workflows, and accessibility evaluations and how they apply to local contexts. along with incentives for instructors to engage with oer, the ability to adapt oer is often highlighted as a benefit. walz shares common workflows for oer production, including broad steps for design and development during creation of original oer.4 in the case of reuse, the design stage in walz’s workflow includes review, redesign, redevelopment, and adoption.
open university models for transforming oer include the integrity model, in which the new oer remains close to the original material; the essence model, in which material is transformed by reducing some features and adding new activities for interactivity; and the remix model, in which content is redesigned to be optimal for web viewing.5 student participation in oer production is often seen in open pedagogy, but these cases look at student frustrations and feedback with the objective of experiential learning, not for usability or evaluation.6 now that oer production has grown in scale, librarians and advocates seek the most effective and sustainable workflows.
figure 1. illustrations of two oer lifecycles. this work was adapted by camille thomas from an original by anita walz (2015) under a cc by 4.0 international license.7
in his workflow and framework analysis meinke recommends the inclusion of more discrete steps and believes each institution’s ideal workflow will be based on local context.8 usability testing is a discrete workflow step that gives us human-centered insight about how users are affected by interfaces and how they value systems.9 libraries favor collections-based assessment that measures how many end users are using digital items, without prioritizing who users are or how and why they use resources.10 demonstrating and assessing value is essential for scholar buy-in and content recruitment, for example, which are central to all types of open resources.
in the case of educational materials, lack of engagement and breakdowns in learning can be attributed to barriers and marginalization of learners.11 additionally, critiques of oer include assumptions that access to information equates to meaningful access to knowledge, but without context there is no guarantee that there will be meaningful transference or learning.12 harley believes defining a target audience and considering reasons for use and nonuse of resources in specific educational contexts beyond measuring anecdotal demand (e.g., website page views or survey responses, which harley does not see as indicators of value but rather of popularity) may address challenges to effectively measuring outcomes for content that is freely available on the web.13 meaningful evaluation of learning resources requires deep understanding of contextualized educator and student needs, not just content knowledge.14 to address these barriers and assumptions, openstax, a leading oer publisher, has ux experts on staff, but this model is exceptional and rare at a university or library. many universities and libraries publishing oer do not have full-time personnel dedicated to review. some library user experience departments have hired content strategists for auditing, analyzing, defining, and designing website content, content-related projects, and overall content strategy.15 currently, oer work is rarely included in the scope of library user experience departments. however, limited literature does show the use of ux research methods in library publishing contexts. libraries and support units with few resources can also perform user testing.16 user experience practitioners have established that a low number of test participants (three to five) is enough to identify a majority of usability issues.17 borsci et al.
suggest the important aspect is not securing a high-volume sample, but rather finding behavior diversity to make reliable evaluations.18 the number of users required to find diverse behavior can depend on what is being tested. following this standard, the consistent inclusion of user evaluations in oer workflows will not necessarily require large amounts of funding, participants, resources, or time. oer, in particular, are well suited to cumulative, early, frequent, and specific user testing. with their open copyright licensing and availability, oer are an example of the mutable digital content needed for collaboration, cumulative production, and support of networked communities.19 several studies assert that complete information behavior analysis should be carried out before or during development, not after.20 meinke concludes his workflow analysis by encouraging iterative release in oer production workflows, which aligns with lean and iterative “guerilla” approaches used in libraries to sustainably improve usability.21 iteration is a process of development that involves cyclical inquiry or repetitive testing, providing multiple opportunities for people to revisit ideas or deliverables in order to improve an evolving design until the end goal is reached.22 in the context of design, it is a method of creating high-quality deliverables at a faster rate.23 a cyclical approach also reflects walz’s as well as blakiston and mayden’s workflow visualizations.24 walz asserts that incentives for instructors and the quality of the resources are key factors in advancing adoption, adaptation, and creation of oer.25 harley uncovered disconnects between what faculty say they need in undergraduate education and what those who produce digital educational resources imagine is an ideal state.
26 influence on faculty resource use, including oer, varied by discipline, teaching style and philosophy, and digital literacy levels, with personal preferences having the most influence on decision-making. in the evaluation or tracking stage found in most oer production workflows, we can see the impact of the quality assurance stage. the study by woodward et al. on student voice in evaluating textbooks found that incorporating multiple stakeholders into the process resulted in deeper exploration of students’ expectations when learning. students ranked one oer and one conventional textbook the highest based on content, design, pedagogy, and cost. multiauthored options ranked higher, and texts with examples were seen as more beneficial for distance learners.27 meinke believes unless discrete parts of the development process are identified, it is not useful to signal others to contribute to a project.28 an example of an oer production workflow containing usability considerations is the content, openness, reuse & repurpose, and evidence (corre) framework by the university of leicester (see fig. 2).29 the openness phase of the corre 2.0 framework includes “transformation for usability,” which is assigned to the oer evaluator, editorial board, or head of department.30 versions of the corre workflow were adapted by the university of bath and the university of derby in the united kingdom for their institutional contexts. for example, the university of derby assigned “transformation for usability” to a developer. by building usability as a discrete step in oer production workflows, publishers and collaborators can make improvements on pain points, make changes in context, and create clear guidelines for partnerships based on local needs.
betz and hall’s study supports considerations for how user microinteractions, or individual actions within a process, can be improved to make them scalable and commonplace in library workflows.31 this can include publishing workflows. for example, a study of oer on mobile devices in brazil found problems related to performance, text legibility, and trigger actions in small objects.32 other guidelines for oer and usability include using high-quality multimedia source material, advanced support from educational technologists, and balancing high and low technology in order to avoid assumptions about learners’ internet connection or devices.33 although usability testing alone is an important part of evaluating a website or product, because the user experience is multifaceted, it is also important to ensure that the product is accessible, meets user needs, and has an appealing design.34 figure 2. corre framework for oer development at the university of leicester.35 accessibility studies also encourage integrating user interactions into the creation workflow. accessibility impacts usability, findability, and holistic user experience. 
36 creators and supporting advocates have relied on universal design, web standards, and ada compliance when creating accessible digital content, emphasizing that accessible content for those with disabilities means more accessible content for all users.37 areas addressed can include text style and formatting, linking, and alternative text considerations, as detailed in the bccampus open education accessibility toolkit.38 for example, kourbetis and boukouras drew from universal design to create a bilingual oer for students with hearing impairments in greece, incorporating contextual considerations for vernacular languages and other local user needs.39 early efforts toward accessible oer, such as a 2010 framework, prompted critiques from members of the accessibility community and impeded adoption.40 while guides based in universal design offer a starting place and consistent reference, oer advocates could create workflows that support adaptive changes seen in inclusive design. universal design and web standards are fixed, while inclusive design seeks to support adaptive changes as needs evolve and does not treat non-normative users as a homogenous, segregated group.41 treviranus et al. go on to state that compliance is not achieved by providing a set of rules, guidelines, or regulations to be followed.42 beyond lack of awareness of accessibility best practices, librarians and creators tend to have little control over customizing proprietary digital content platforms to add local context.
43 the flexible learning for open education (floe) project, for example, aims to integrate automatic and unconscious inclusive oer design through open source tools, but many institutions may not be able to develop such tools to incorporate local contexts.44 both librarians and e-resources vendors have been interested in the features and usability of e-books to fine-tune their collection development strategies as well as improve the user experience of their platforms. literature shows that most studies about e-books have focused on features or the interface design of e-book reading applications. the recent academic student ebook experience survey showed that three-quarters of survey respondents considered it extremely important or important for e-books to have page numbers for citation purposes.45 this survey and other studies suggest that search, navigation, zoom, annotation, mobile compatibility, as well as offline availability including downloading, printing, and saving, were the most expected features.46 other features, such as citation generation and emailing, were mentioned or tested in some research.47 while using e-books and using e-textbooks may involve the same functionality, the purpose, devices, and user types differ because knowledge transfer is needed in learning. jardina and chaparro evaluated eight e-textbook applications with four tasks: bookmarking, note-taking, note locating, and word searching.48 they found that the interfaces to these common features varied on the different applications. standardization, or at least following general web convention when designing these interfaces, may reduce distractions that keep students from learning. the e-textbook user interface can be critical to the future success of e-textbook adoption. although limited research on usability of e-textbooks or open textbooks has been conducted, a considerable number of findings from studies on e-books are relevant and applicable to e-textbook projects.
usability evaluation methods and results for e-book and e-textbook applications can be borrowed to understand oer user needs. libraries can apply these e-book usability evaluations to the basic infrastructure of oer, but leverage the local contexts of students, instructors, and institutional culture when adapting the material. the more normalized usability, prototyping, and collaboration are in oer production workflows, the richer the resources and community investment. this approach can address diverse and evolving oer user needs, locally and sustainably, as they arise. our study contributes to the literature by examining the impacts of integrating usability testing in an inaugural oer adaptation project at a large research university in the united states.
case study
the project to adapt the open textbook college success, published by the university of minnesota libraries, for use in the raiderready first-year seminar course, was brought to the texas tech university libraries’ scholarly publishing office by the head of the outreach department in march 2018. the program was deciding between a commercial textbook and the adapted open textbook. the course was offered during each fall semester and had an average enrollment of over 1,600 since 2016. an initial meeting took place and regular weekly meetings were set up afterward to review edits and ensure communication within the original 30-day timeline. it was the first oer production project for the libraries’ publishing program, which had previously focused on open access journals and materials in the repository. originally, we sought to use open monograph press because we had a local instance already installed. however, a platform with more robust formatting capabilities was needed in order to reach the desired product within the timeline.
we decided to use the pressbooks pro (a wordpress plugin) sandbox for one project through our open textbook network membership. a rough draft of edits to the original work was already completed. we used a mediated service model, in which librarians performed the formatting, quality assurance, and publishing. this was in contrast to self-service models in which creators work independently and consult with support specialists. the digital publishing librarian and scholarly communication librarian formatted the edits, with html/css customization and platform troubleshooting from the web librarian. other library staff involved in the project included communications & marketing (cover design), the user experience librarian, and the electronic resources librarian (cataloging in the catalog). campus stakeholders and partners included the libraries, the raiderready program, editors, copytech (printers affiliated with the university), the campus bookstore, and the worldwide elearning unit. program partners were enthusiastic about usability and accessibility testing for the textbook. the initial testing took place in the middle of the adaptation project timeline, once initial content was formatted and ready for testing. the bccampus accessibility toolkit and the pressbooks user guide were used as primary guides throughout the process. the scholarly publishing librarian and the user experience librarian met to develop the testing method and identify users who would reflect the audience using the textbook. a second round of tests was conducted a year after the initial project when the editors made updates to the text. while the resulting changes were minor, this further testing allowed us to seek more feedback on the most recent version of the textbook and apply some lessons learned from the first round of testing. we did not use personas or identify user needs beforehand.
we planned to recruit first-year students and students who took the raiderready course in a previous semester. however, we decided to instead recruit from existing pools of student volunteers for library usability tests in order to get three to six students in a short amount of time. for the second round of testing, we planned to recruit on-campus students, distance students, and students with diverse abilities. we recruited from newly established pools of volunteers for distance students as well as existing volunteer pools. during the first iteration, we requested that worldwide elearning, texas tech university’s distance learning unit on campus, test the textbook pilot content in pdf and epub formats using screen reader software. the user experience librarian conducted a first round of four usability tests in march 2018 and a second round of two usability tests in april 2019. a sample test script from steve krug provided a solid foundation for conducting our own tests.49 in each test, participants were asked to answer two pretest questions, complete four tasks, and answer four posttest questions (see appendix for script). tasks included finding the textbook, exploring the textbook itself, locating activities for a specific chapter, and searching for the student code of conduct. participants were instructed to think aloud as they worked through the tasks. the think-aloud protocol is a commonly used test method, where participants are asked to verbalize their thoughts and actions as they perform tasks on a website. the observation tasks are set beforehand, and the facilitator follows a script to ensure consistency among testers.50 the combination of observing and recording participants’ comments and reactions provides insight into a site’s usability.
testers were invited to comment on their experience at the end of the session. each usability test was recorded using morae software to track on-screen activity such as mouse movements and clicks, typing, and the verbal comments of the facilitator and participant. we conducted tests using a laptop running windows 10 with a 15.6-inch display. in the first round of testing, we also showed students the book on an ipad mini, both in adobe reader and ibooks. while we asked them to briefly view the textbooks, we did not ask them to complete specific tasks while using the tablet.
limitations
the biggest limitation was that we did not test with users of screen readers or other assistive technology. the user experience librarian built a pool of on-campus students who volunteered to participate in user research in 2018, and a relationship with a pool of distance student users was established in 2019. however, a pool of other types of non-normative learners had not yet been established for either round of testing. another limitation of the study was that we primarily tested on campus servers, so we did not have data on rural or distance learner experiences with the textbook until the second round of testing. in addition, we used only a few devices: a windows 10 laptop for formal testing and an ipad that students briefly viewed afterward. we also did not have an educational technologist as a partner throughout the process.
results
once testing was complete, the scholarly communications librarian and the user experience librarian analyzed the notes and identified areas of common concern and confusion among participants. all participants were familiar with online textbooks from other courses. participants cited cost as a major consideration when deciding between purchasing print or electronic texts. more than one participant said that electronic textbooks can be cheaper but can be more frustrating to use. participants had more experience viewing textbooks on laptops.
the ability to download texts for reading on a phone was not always available due to publisher restrictions.
content and navigation
participants liked pictures and visuals to break up the blocks of text. however, one participant expressed a dislike for too many slideshows or other media. another in the first round of testing liked that there were not “too many” links that brought you out of the textbook, stating it was “annoying to split screen in order to see text plus activity/homework assignment.” in the second round of testing, one participant felt the lack of interactive content was best for the first-year students compared to videos and activities in textbooks for advanced courses. that participant also thought the simpler language of the text was more welcoming to first-year students. a participant said an ipad would be better than a laptop for viewing this book, because scrolling was easier. several users did not find features such as bookmarked sections in the in-browser pdf viewers or adobe reader. participants who did not see or use the table of contents (toc) continually relied on scrolling through pages to locate content. only one participant, unprompted, used the ctrl+f shortcut to keyword search the text. a few other participants viewed the toc, then entered the desired page number in the top toolbar navigation field. most of them expected the code of conduct information in one of the tasks to be in the front matter. the emphasis on content reflects blakiston and mayden’s experience that without a content strategy, it becomes difficult to search and to demonstrate credibility, and it is a challenge to create a coherent, user-centered information architecture.51 all participants navigated to the toc several times to complete tasks, making it a relevant feature.
in the second round of testing, one participant preferred the statements and questions at the beginning of the first chapter to learning objectives typically listed in textbooks.

discovery and access

participants took varied approaches to finding textbooks. one would get links from the professor via email or the syllabus. others would use the campus bookstore for purchases. one would use the student information system (raiderlink) to locate information about the textbook. potential access points to make the raiderready textbook discoverable included the institutional repository, the open textbook library, and the local library catalog. the open textbook library was ruled out mostly due to campus-specific adaptations, which were not more substantial for public use than the original college success. thinktech, the institutional repository, was the most viable option and allowed for permanent linking, which worked well with the access points student users mentioned. in the second round of testing, one participant searched for the textbook via the library catalog/discovery system, google search, and the raiderready department website. the course description on the department website listed an open textbook, but the user pointed out that it was not actually linked there.

discussion

user testing changed our actions during the project. interactions with students did not occur during any other stage of the adaptation process before the resource was adopted in the course. many insights from the testing were indicative of self-reported preferences such as requesting more visuals, preferring print for reading and exercises, and auditory screen reading. we also learned ways that cost impacted how students used textbooks. for example, when we followed up on a participant’s comment and asked if they liked to highlight books, the student responded that they try not to mark their books because they want to resell them.
testing also helped us observe actual behaviors among similar users in a way oer toolkits and guidelines alone did not. we learned more about how oer fits into the culture of learning and resources at texas tech university and how that may differ from other institutions. for a visual representation of our workflow, we adapted billy meinke’s oer production workflow (targeted to creators) because it was an openly available, editable workflow with comprehensive discrete steps. similar to the corre framework adaptations, meinke’s workflow was adapted by others, including the southern alberta institute of technology (sait), lansing community college, and the university of houston, to fit their institutional contexts.52 our process did not include an external peer review process; instead, review was done by the editors. the priming and preproduction phases in the workflow were relatively quick, occurring in the first two weeks. the bulk of the time, about four weeks, was spent in the development phase. the quality assurance and publishing phases took about two weeks, with most of the time spent on finalizing edits and formatting. the first round of user testing took about two hours total, and redux (revisiting the prototype and implementing changes), along with the format finalization, took about two weeks. finalization for formatting and redux changes in the first iteration of the text involved pressbooks troubleshooting. the original timeline for the project was 30 days, but the actual time for the project was 60 days. the second round of user testing took about two hours total and occurred at the halfway point within a new 60-day deadline for an updated version of the textbook.
we acknowledge that even though the actual time spent with users in the first round was limited to two hours, the process also required time for drafting recruitment emails, communicating with volunteers, scheduling testing, and debriefing after sessions. figure 3 shows our workflow diagram, including a new quality assurance phase (see fig. 4 for detail) based on our case study. it includes prototyping (content and format draft), user testing, and implementing user feedback on the oer prototype.

figure 3. discrete production workflow including quality assurance phase. this workflow is an adaptation of a workflow by billy meinke and university of hawai’i at manoa under a cc-by license.

figure 4. quality assurance phase with user testing.

we addressed several suggestions participants made during testing to improve the textbook’s navigation functionality. we were able to address requests for a linked toc in the second round of testing. in the first round of testing, formatting was tailored to print pdf format because the editors wanted a print version to be available. we were able to create a linked toc in the digital pdf format, but not the printable pdf format. based on the documentation available to us in the first round of testing, we were not aware that the toc could be changed, but we were able to successfully troubleshoot this issue in the second round of tests. we were not able to do any customization on the search feature, which was built in. for customization, pressbooks allows styling through css files for three formats (pdf, web page, epub) separately. we customized them for look and feel. many of the requests were constrained by our working knowledge of pressbooks and by its technical limits, so we added a tips for pdf readers and printing section in the front matter of the textbook during the first round.
it is important to note that although these were not major changes to the interface, they gave us insight for iterative changes. upon reflection, it would have been preferable to involve someone with pressbooks developer experience at the outset. because we had not led a similar project before, nor worked with the software previously, we were more limited in the changes we could make as a result of testing than we expected. however, after this experience, we know what areas to test and are better prepared to effect actual changes. we made chapters available separately in the institutional repository to cut down on excessive scrolling, because scrolling through an entire textbook slowed students’ ability to study and quickly reference the text. also, the editors requested digital as well as print access to the textbook through the campus bookstore. the raiderready textbook was also added to the library’s local records in the catalog. we did not make a web version available through pressbooks. a web version was not a priority because of the institution-specific customization and because the editors did not request one. usage statistics from the repository between march 2018 and february 2019 peaked during midterms and at the end of semesters in which the class was taught. chapters 5, 6, 7, and 8 had the most downloads (these are the last chapters of the book, likely the chapters students were tested on for the final), with the majority of downloads (1,368) taking place during october 2018. this indicates that the option to download individual chapters appealed to students.

accessibility

testing the textbook with screen readers confirmed the need for an epub format of the text.
hyde gives the following guidance for educators using pressbooks: “pdf accessibility is known to be somewhat limited, and in particular some files are not able to be interpreted by screen readers. the pdf conversion tool that pressbooks uses does support pdf tagging, which improves screen reader compatibility, but often the best solution is to offer web and e-book versions of a textbook as well as the pdf, so readers can find the format that best suits their needs.”53 for pdfs, issues included lack of alt tags, headings not set, and tables and images lacking tags. adding alt tags was planned early, after they were lost when uploading the wxr (wordpress extended rss) file, a wordpress export file in xml format, in pressbooks, and loss of the alt tags was confirmed during testing at the midpoint of the process. however, due to deadlines and pressbooks functionality, we were not able to address more of the tagging issues. epubs worked much better in tests with screen readers, apple devices, and e-readers. editors preferred that a pdf be used as the primary version and wanted an epub for screen readers upon request. our partners’ preference was likely based on the common use of pdfs, but it did not comply with the principles of universal or inclusive design. regarding e-book accessibility, pressbooks documentation says, “ebook accessibility is somewhat dictated by the file format standards, which focus on font sizes and screen readers, and improvements are also being made with dynamic content. the international digital publishing forum has a checklist to prompt ebook creators on accessibility functions they can incorporate while creating their content.”54 we made a decision to include multiple formats to take multiple types of use into consideration. in the first round of changes, we included an epub alongside the pdf in the repository, so users with disabilities would not have to self-identify by making a request in order to gain access.
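missing alt text of the kind described above can be flagged by a short script before a chapter is published. the following is a minimal sketch, not part of the authors' workflow: it assumes a chapter is available as html or xhtml (for example, a content file unpacked from an epub) and uses only python's standard library. the names AltTextChecker and find_images_missing_alt and the sample markup are hypothetical, chosen for illustration.

```python
from html.parser import HTMLParser


class AltTextChecker(HTMLParser):
    """Collect <img> tags whose alt attribute is absent or blank."""

    def __init__(self):
        super().__init__()
        self.missing = []  # src values of images without usable alt text

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <img ... /> are routed here as well.
        if tag != "img":
            return
        attributes = dict(attrs)
        alt = attributes.get("alt")
        # A blank alt can be correct for purely decorative images, so the
        # result is a list to review by hand, not a list of definite errors.
        if alt is None or not alt.strip():
            self.missing.append(attributes.get("src", "(no src)"))


def find_images_missing_alt(markup):
    """Return the src of every image in markup that lacks alt text."""
    checker = AltTextChecker()
    checker.feed(markup)
    checker.close()
    return checker.missing


# Hypothetical chapter fragment, used only to demonstrate the check.
sample = """
<h1>chapter 1</h1>
<img src="campus-map.png" alt="map of campus buildings">
<img src="figure-1.png">
<img src="divider.png" alt="" />
"""

print(find_images_missing_alt(sample))  # ['figure-1.png', 'divider.png']
```

a check like this only confirms that alt text is present; it cannot judge whether the text is meaningful, so it would supplement rather than replace testing with screen readers.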
upon learning more about inclusive design after the pilot, we realized we were treating users as a homogeneous group and segregating the more accessible version. in the second round, when we realized the epub was not available by separate chapters as was the pdf version, we then made it available by chapter as well. we recommend that evaluating oer according to the international digital publishing forum checklist be incorporated into the qa part of the workflow.

conclusion

there is room for future research on iterative testing for oer and testing with more emphasis on mobile devices, testing with deeper investigation into microinteractions concerning accessibility, and testing in workflows that use other publishing platforms. as the creators of the floe project suggest, many more customizations can be made to points of user interactions if the software platform for adaptation is open source. future research may also examine regional and cultural influence on learning and interface preferences. one change that may support future adaptation projects at texas tech university would be modifying internal guidelines to take into consideration previous testing and local context. we also recommend keeping detailed documentation, particularly of steps for changes that are not included in existing guides on oer production. creating a memorandum of understanding with partners that clearly outlined responsibilities could have prevented some of the misunderstandings that occurred. for example, when stakeholders discussed producing print copies of the textbook, it wasn’t clear what the library’s role was. with a short timeline and more work involved than expected, the library was in a position of overpromising and underdelivering.
it was apparent that the workflows themselves needed to be open and adaptable to support resources, communities, and processes in local contexts. it was important throughout the process to be aware of our partners’ priorities (e.g., instructional preferences, cost to students, departmental buy-in), because we had to balance these priorities with user feedback. we recommend having specific roles for content strategists, educational technologists, and developers in workflows during oer production. the work of creating workflows, assigning roles, and creating standards for oer content currently falls on librarians, instructional designers, and creators. as librarians seek the most sustainable workflows, it will be beneficial to emphasize investing in the quality assurance stages of oer production and evenly distributing responsibilities. this can be done through collaborative partnerships or by hiring for additional positions. if other institutions were to scale the practices from our case study, ideally, librarians would take responsibility for adding roles or formalized work to the scope of either ux or oer departments so that it becomes normalized in oer workflows. we recommend working with editors to advocate for one textbook format that addresses a variety of learning needs. we plan to use these experiences, along with existing resources, to include inclusive and user-friendly recommendations in policies and guidelines for oer adaptation. conducting user testing did challenge assumptions that librarians and editing instructors held about student use of oer. while we referred to toolkits, guidelines, and best practices, internal testing allowed us to make improvements to several specific microinteractions students encountered while using the text. it was feasible to incorporate testing into the workflow. we were able to directly observe user information behavior from members of the community that the resource was intended to serve.
appendix: usability test method

pretest questions

1. what is your academic classification? (undergraduate, graduate, faculty)
2. have you ever used an e-textbook or a digital textbook in one of your classes? (if yes, ask for course details.)

tasks to observe

1. imagine you needed to get a copy of the digital textbook raider ready: unmasking the possibilities of college success. how would you go about finding it? it will help us if you think out loud as you go along: tell us what you’re looking at, what you’re trying to do, what you’re thinking.
2. [if the tester is unable to locate the digital textbook, the moderator will open it.] please take a couple of minutes to look at this textbook. explore and click on a link or two.
3. for the next task, imagine an instructor asked you to locate the chapter activities for chapter 1. could you show us how you would locate those?
4. for the final task, could you find the student code of conduct?

posttest questions

1. what were your impressions of this resource?
2. what did you like? dislike? what would you change?
3. how easy or difficult was it to find what you wanted? please explain.
4. is there anything else about your experience using this textbook today that you’d like to tell us?

endnotes

1 sonya betz and robyn hall, “self-archiving with ease in an institutional repository: microinteractions and the user experience,” information technology and libraries 34, no. 3 (september 21, 2015): 44–45, https://doi.org/10.6017/ital.v34i3.5900.
2 anita r. walz, “open and editable: exploring library engagement in open educational resource adoption, adaptation and authoring,” virginia libraries 61 (january 2015): 23, http://hdl.handle.net/10919/52377.
3 stephen whitfield and zoe robinson, “open educational resources: the challenges of ‘usability’ and copyright clearance,” planet 25, no. 1 (2012): 52, https://doi.org/10.11120/plan.2012.00250051.
4 walz, “open and editable,” 24.
5 andy lane, “from pillar to post: exploring the issues involved in repurposing distance learning materials for use as open educational resources” (working paper, uk open university, december 2006), accessed august 1, 2018, http://kn.open.ac.uk/public/document.cfm?docid=9724.
6 andy arana et al. eds., “open logic project,” university of calgary faculty of arts and the campus alberta oer initiative, accessed april 26, 2019, http://openlogicproject.org/; robin derosa, the open anthology of earlier american literature (public commons publishing, 2015), https://openamlit.pressbooks.com/; timothy robbins, “case study: expanding the open anthology of earlier american literature,” in a guide to making open textbooks with students, ed. elizabeth mays (the rebus community for open textbook creation, 2017), https://press.rebus.community/makingopentextbookswithstudents/chapter/case-study-expanding-open-anthology-of-earlier-american-literature/.
7 walz, “open and editable,” 24.
8 billy meinke, “discovering oer production workflows,” uh oer (blog), university of hawai’i, december 23, 2016, https://oer.hawaii.edu/discovering-oer-production-workflows/.
9 betz and hall, “self-archiving with ease,” 44.
10 beth st. jean et al., “unheard voices: institutional repository end-users,” college & research libraries 72, no. 1 (january 2011): 23, https://doi.org/10.5860/crl-71r1.
11 jutta treviranus et al., “an introduction to the floe project,” in international conference on universal access in human-computer interaction, universal access to information and knowledge, ed. constantine stephanidis and margherita antona, uahci 2014 (june 2014), lecture notes in computer science 8514: 454, https://doi.org/10.1007/978-3-319-07440-5_42.
12 sarah crissinger, “a critical take on oer practices: interrogating commercialization, colonialism, and content,” in the library with the lead pipe, october 21, 2015, http://www.inthelibrarywiththeleadpipe.org/2015/a-critical-take-on-oer-practices-interrogating-commercialization-colonialism-and-content/; diane harley, “why understanding the use and users of open education matters,” in opening up education: the collective advancement of education through open technology, open content, and open knowledge, ed. toru iiyoshi and m.s. vijay kumar (cambridge, ma: the mit press, 2008), 197–212.
13 harley, “why understanding,” 208.
14 tom carey and gerard l. hanley, “extending the impact of open educational resources through alignment with pedagogical content knowledge and institutional strategy: lessons learned from the merlot community experience,” in opening up education: the collective advancement of education through open technology, open content, and open knowledge, ed. toru iiyoshi and m.s. vijay kumar (cambridge, ma: the mit press, 2008), 238.
15 rebecca blakiston and shoshana mayden, “how we hired a content strategist (and why you should too),” journal of web librarianship 9, no. 4 (2015): 202–6, https://doi.org/10.1080/19322909.2015.1105730; “our team,” openstax, rice university, accessed december 9, 2019, https://openstax.org/team.
16 maria nuccilli, elliot polak, and alex binno, “start with an hour a week: enhancing usability at wayne state university libraries,” weave: journal of library user experience 1, no. 8 (2018), https://doi.org/10.3998/weave.12535642.0001.803.
17 jakob nielsen and thomas k. landauer, “a mathematical model of the finding of usability problems,” in proceedings of the interact’93 and chi’93 conference on human factors in computing systems (may 1993): 211–12, https://doi.org/10.1145/169059.169166.
18 simone borsci et al., “reviewing and extending the five-user assumption: a grounded procedure for interaction evaluation,” in acm transactions on computer-human interaction 20, no. 5, article 29 (november 2013), 18–19, http://delivery.acm.org/10.1145/2510000/2506210/a29-borsci.pdf.
19 treviranus et al., “floe project,” 454.
20 laura icela gonzález-pérez, maría-soledad ramírez-montoya, and francisco j. garcía-peñalvo, “user experience in institutional repositories: a systematic literature review,” international journal of human capital and information technology professionals 9, no. 1 (january–march 2018): 79, 84, https://doi.org/10.4018/ijhcitp.2018010105; betz and hall, “self-archiving with ease,” 45; st. jean et al., “unheard voices,” 23, 36–37, 40.
21 meinke, “discovering oer production workflows”; nuccilli, polak, and binno, “start with an hour.”
22 steven d. eppinger, murthy v. nukala, and daniel e. whitney, “generalised models of design iteration using signal flow graphs,” research in engineering design 9, no. 2 (1997): 112; helen timperley et al., teacher professional learning and development (wellington, new zealand: ministry of education, 2007), http://www.oecd.org/education/school/48727127.pdf.
23 eppinger, nukala, and whitney, “design iteration,” 112–13.
24 walz, “open and editable,” 23; blakiston and mayden, “how we hired a content strategist,” 203.
25 walz, “open and editable,” 28.
26 harley, “why understanding,” 201–6.
27 scott woodward, adam lloyd, and royce kimmons, “student voice in textbook evaluation: comparing open and restricted textbooks,” international review of research in open and distributed learning 18, no. 6 (september 2017), 150–63, https://doi.org/10.19173/irrodl.v18i6.3170.
28 meinke, “discovering oer production workflows.”
29 samuel k. nikoi et al., “corre: a framework for evaluating and transforming teaching materials into open educational resources,” open learning: the journal of open, distance and e-learning 26, no. 3 (2011), 194–99, https://doi.org/10.1080/02680513.2011.611681.
30 “corre 2.0,” institute of learning innovation, university of leicester, accessed april 25, 2019, https://www2.le.ac.uk/departments/beyond-distance-research-alliance/projects/ostrich/corre-2.0.
31 betz and hall, “self-archiving with ease,” 45–46.
32 andré constantino da silva et al., “portability and usability of open educational resources on mobile devices: a study in the context of brazilian educational portals and android-based devices” (paper, international conference on mobile learning 2014, madrid, spain, february 28–march 2, 2014), 198, https://eric.ed.gov/?id=ed557248.
33 sarah morehouse, “oer bootcamp 3-3: oers and usability,” youtube video, 3:16, march 2, 2018, https://www.youtube.com/watch?v=cncxbcs-2gm.
34 krista godfrey, “creating a culture of usability,” weave: journal of library user experience 1, no. 3 (2015), https://doi.org/10.3998/weave.12535642.0001.301; peter morville, “user experience design,” semantic studios, june 21, 2004, http://semanticstudios.com/user_experience_design/.
35 meinke, “discovering oer production workflows.”
36 cynthia ng, “a practical guide to improving web accessibility,” weave: journal of library user experience 1, no. 7 (2017), https://doi.org/10.3998/weave.12535642.0001.701; whitney quesenbery, “usable accessibility: making web sites work well for people with disabilities,” ux matters, february 23, 2009, http://www.uxmatters.com/mt/archives/2009/02/usable-accessibility-making-web-sites-work-well-for-people-with-disabilities.php.
37 ng, “improving web accessibility.”
38 amanda coolidge et al., accessibility toolkit 2nd edition (victoria, b.c.: bccampus, 2018), 1–71, https://opentextbc.ca/accessibilitytoolkit/.
39 vassilis kourbetis and konstantinos boukouras, “accessible open educational resources for students with disabilities in greece,” in universal access in human-computer interaction, universal access to information and knowledge, ed. constantine stephanidis and margherita antona, uahci 2014 (june 2014), lecture notes in computer science 8514: 349–57, https://doi.org/10.1007/978-3-319-07440-5_32.
40 treviranus et al., “floe project,” 455–56.
41 treviranus et al., 456–57.
42 treviranus et al., 456–57.
43 ng, “improving web accessibility”; treviranus et al., “floe project,” 460–61.
44 treviranus et al., “floe project,” 461.
45 2018 academic student ebook experience survey report (library journal research, 2018): 6, accessed may 3, 2019, https://mediasource.formstack.com/forms/2018_academic_student_ebook_experience_survey_report.
46 michael gorrell, “the ebook user experience in an integrated research platform,” against the grain 23, no. 5 (december 2014): 38; robert slater, “why aren’t e-books gaining more ground in academic libraries? e-book use and perceptions: a review of published literature and research,” journal of web librarianship 4, no. 4 (2010): 305–31; joelle thomas and galadriel chilton, “library e-book platforms are broken: let’s fix them,” academic e-books: publishers, librarians, and users (2016): 249–62; christina mune and ann agee, “ebook showdown: evaluating academic ebook platforms from a user perspective,” in creating sustainable community: the proceedings of the acrl 2015 conference (2015): 25–28; laura muir and graeme hawes, “the case for e-book literacy: undergraduate students’ experience with ebooks for course work,” the journal of academic librarianship 39, no. 3 (2013): 260–74; esta tovstiadi, natalia tingle, and gabrielle wiersma, “academic e-book usability from the student’s perspective,” evidence based library and information practice 13, no. 4 (2018): 70–87.
47 erin dorris cassidy, michelle martinez, and lisa shen, “not in love, or not in the know? graduate student and faculty use (and non-use) of e-books,” the journal of academic librarianship 38, no. 6 (2012): 326–32; gorrell, “the ebook user experience,” 36–40.
48 jo r. jardina and barbara s. chaparro, “investigating the usability of e-textbooks using the technique for human error assessment,” journal of usability studies 10, no. 4 (2015): 140–59.
49 steve krug, rocket surgery made easy (berkeley, ca: new riders, 2010), 146–53.
50 danielle a. becker and lauren yannotta, “modeling a library website redesign process: developing a user-centered website through usability testing,” information technology and libraries 32, no. 1 (march 2013): 9–10.
51 blakiston and mayden, “how we hired a content strategist,” 194.
52 jessica norman, sait oer workflow, may 2019, accessed july 14, 2020, https://docs.google.com/drawings/d/1xvjpu9s4bb32k3gblnvw4uy1ely9rtxnr8bkfdm5-yk/; regina gong, oer production workflow, accessed july 14, 2020, http://libguides.lcc.edu/oer/adopt; ariana e. santiago, oer adoption workflow visual overview, april 2019, accessed july 14, 2020, https://docs.google.com/drawings/d/1czqhpgpqyrr46vm5iytoemyqj-s1zr0p-m-lj16rtto/; meinke, “discovering oer production workflows.”
53 zoe wake hyde, “accessibility and universal design,” in pressbooks for edu guide (pressbooks.com, 2016), https://www.publiconsulting.com/wordpress/eduguide/.
54 hyde.

editorial

i think that writing editorials in my job as the new editor of information technology and libraries (ital) is going to be a real piece of cake.
all i have to do, dear readers, is to quote (with proper attribution) walt crawford, the title of whose book i repeat as the title of this, my inaugural editorial.1 and then quote other sages of our profession, using only as many of their words as is fitting and proper to make my editorials relevant to the concerns of our membership and readers and as few of my own words as i can to repay the confidence that the library information and technology association (lita) has placed in me— and to avoid muddling the ideas of those to whom i shall be indebted. those of you reading this will note that i have already fallen prey to the conceit of all scholarly journal editors: that their readers, of course, after surveying the tables of contents, dive wide-eyed first into the editorials. of course. to paraphrase a technologist of an earlier era, “when in the course of human events, it becomes necessary for” a new editor to take on the responsibility for the stewardship of ital, “a decent respect to the opinions of mankind requires that” he “should declare the causes which impel” him to accept that responsibility and, further, to write editorials. i quote, of course, from the first paragraph of the declaration of independence adopted by the “thirteen united states of america” july 4, 1776. in this, my first editorial, i, too, shall put forth for the examination of the members of lita and the readers of ital my goals and hopes for the journal that i am now honored to lead. these goals and hopes are shared by the members of the ital editorial board, whose names appear in the masthead of this journal. ital is a double-blind refereed journal that currently has a manuscript acceptance rate of 50 percent. it began in 1968 as the journal of library automation (jola), the journal of the information science and automation division (isad) of ala, and its first editor was fred kilgour. 
in 1978 isad became lita, and in 1982, the journal title was changed to reflect the expanding role of information technology in libraries, an expansion that continues to accelerate so that ital is no longer the only professional journal within ala whose pages are now dominated by our accelerating use of information technologies as tools to manage the services we provide our users and as tools we use ourselves to accomplish our daily duties. i write part of this editorial in the skies over the middle section of the united states as i return home from the seventh national lita forum held in st. louis, october 7–10. at the forum, i heard presentations, visited poster sessions, and talked with colleagues from forty-four states and six countries who had something to say and said it well. i hope that some of them may submit manuscripts to ital so that all the members of lita and all the readers of the journal will profit as well from some of what the attendees of the forum heard and saw. i attended the forum forewarned by previous ital editors to carry plenty of business cards, and i went armed with a pocketful. i think i distributed enough that, if pieced together, their blank sides would provide sufficient writing space for at least one manuscript! in an attempt to fulfill the jeffersonian promise above, i hereby list a few of my goals for the beginning of my term as editor. i must emphasize that these goals of mine supplement but do not supplant the purposes of the journal as stated on the first page and on the ital web site (www.ala.org/lita/litapublications/ital/italinformation. htm); likewise, they do not supplant the goals of my predecessors. in no particular order: i hope to increase the number of manuscripts received from our library and information schools. their faculty and doctoral students are some of the incubators of new and exciting information technologies that may bear fruit for future library users. 
however, not all research turns up maps on which “x marks the spot.” exploration is interesting, even vital, for the journey, for the search itself, and our graduate faculties and students have something to say.
i hope to increase the submission of manuscripts that describe relevant sponsored research. in the earlier volumes, jola had an average of at least one article per issue, maybe more, describing the results of funded research. ital can and should be a source that information-technology researchers consider as a vehicle for the publication of their results. two articles in this issue result from sponsored research.
in fact, i hope to increase the number of manuscripts that describe any relevant research or cutting-edge developments. much of the exploration undertaken by librarians improving and strengthening their services involves research or problems solved on both small scales and large. neither the officers of lita, the referees, the readers, nor i are interested in very many “how i run my library good” articles. we all want to read a statement of the problem(s), the hypotheses developed to explore the issues surrounding the problem(s), the research methods, the results, the assessment of the outcomes, and, when feasible, a synthesis of how the research methods or results may be generalized.
i hope to increase the number of articles with multiple authors. libraries are among society’s most cooperative institutions and librarians, members of one of the most cooperative of professions. the work we do is rarely that of solitary performers, whether it be research or the design and implementation of complex systems to serve our users. writing about that should not be solitary either.
i hope to publish think-pieces from leaders in our field.
i hope to publish more articles on the management of information technologies.
i hope to increase the number of manuscripts that provide retrospectives. libraries have always been users of information technologies, often early adopters of leading-edge technologies that later become commonplace. we should, upon occasion, remember and reflect upon our development as an information-technology profession.
i hope to work with the editorial board, the lita publications committee, and the lita board to find a way, and soon, to facilitate the electronic publication of articles without endangering, but in fact enhancing, the absolutely essential financial contribution that the journal provides to the association.
in short, i want to make ital a destination journal of excellence for both readers and authors, and in doing so reaffirm the importance of lita as a professional division of ala. to accomplish my goals, i need more than an excellent editorial board, more than first-class referees to provide quality control, and more than the support of the lita officers. i need all lita members to be prospective authors, prospective referees, and prospective literary agents acting on behalf of our profession to continue the almost forty-year tradition begun by fred kilgour and his colleagues, who were our predecessors in volume 1, number 1, march 1968, of our journal.
reference
1. walt crawford, first have something to say: writing for the library profession (chicago: ala, 2003).
editorial: first have something to say | john webb
john webb (jwebb@wsu.edu) is assistant director for digital services/collections, washington state university libraries, pullman, and editor of information technology and libraries.
wireless networks in medium-sized academic libraries | barnett-ellis and charnigo 21
__problems with unauthorized people accessing the internet through the wireless network
__problems with restricted parts of the network being accessed by unauthorized users
__other
3. how were security problems resolved?
benefits of use of network
1. what have been the biggest benefits of wireless technology? check all that apply.
__user satisfaction
__increased access to the internet and online sources
__flexibility and ease due to lack of wires
__has improved technical services (use for library functions)
__has aided in bibliographic instruction
__provides access beyond the library building
__allows students to roam the stacks while accessing the network
__other
2. how would you describe current usage of the network?
__heavy
__moderate
__low
3. in your opinion, has this technology been worth the benefit-cost ratio thus far?
__yes
__no
__not sure
4. what advice would you give to librarians considering this technology?
article
off-campus access to licensed online resources through shibboleth
francis jayakanth, ananda t. byrappa, and raja visvanathan
information technology and libraries | june 2021
https://doi.org/10.6017/ital.v40i2.12589
abstract
institutions of advanced education and research, through their libraries, invest substantially in licensed online resources. only authorized users of an institution are entitled to access licensed online resources. seamless on-campus access to licensed resources happens mostly through internet protocol (ip) address authentication. increasingly, licensed online resources are accessed by authorized users from off-campus locations as well. libraries will, therefore, need to ensure seamless off-campus access to authorized users. libraries have been using various technologies, including proxy servers, virtual private network (vpn) servers, and single sign-on, to facilitate seamless off-campus access to licensed resources.
in this paper, the authors share their experience in setting up a shibboleth-based single sign-on (sso) access management system at the jrd tata memorial library, indian institute of science, to enable authorized users of the institute to seamlessly access licensed online resources from off-campus locations.
francis jayakanth (francis@iisc.ac.in) is scientific officer, j.r.d. tata memorial library, indian institute of science. ananda t. byrappa (anandtb@iisc.ac.in) is librarian, j.r.d. tata memorial library, indian institute of science. raja visvanathan (raja@inflibnet.ac.in) is scientist c (computer science), inflibnet centre, gandhinagar, india. © 2021.
introduction
the internet has both necessitated and offered options for libraries to enable remote access to an organization’s licensed online content: journals, e-books, technical standards, bibliographical and full-text databases, and more. in the absence of such an option for remote access, faculty, students, and researchers have limited and constrained access to the licensed online content from off-campus locations. as scholarly resources transitioned from print to online in the mid-1990s, libraries and their vendors had to start identifying user affiliations in order to grant access to licensed online resources to the authorized users of an institution. the ip address was an obvious mechanism to do that. allowing or denying access to online resources based on a user’s ip address was simple, it worked, and, in the absence of practical alternatives, it became the universal means of authentication for gaining access to licensed library content.1 to facilitate seamless access to licensed online resources from off-campus sites, libraries have been using various technologies, including proxy servers, vpn servers, remote desktop gateways, federated identity management, or a combination of these. in our institute, the on-campus ip-based access to the licensed content is supplemented by vpn technology for off-campus access. the covid-19 pandemic has required academic and scientific staff to work from home, which demands smooth and seamless access to the organization’s licensed content. the sudden surge in demand for seamless off-campus access to the licensed online resources had an impact on the institute’s vpn server. also, not all authorized users of the institute are entitled to get vpn access. to mitigate the situation, the library had to explore a secure, reliable, and cost-effective solution to facilitate seamless off-campus access to all the licensed online resources for all the authorized users of the institute. after exploring the possibilities, the library decided to implement a single sign-on solution based on shibboleth. shibboleth software implements the security assertion markup language (saml) protocol, separating the functions of authentication (undertaken by the library or university, which knows its community of end users) and authorization (undertaken by the resource provider, which knows which libraries have licenses for their users to access the resource in question).2
about the indian institute of science (iisc)
the indian institute of science (iisc, or “the institute”) was established in 1909 by a visionary partnership between the industrialist jamsetji nusserwanji tata, the maharaja of mysore, and the government of india. over the 109 years since its establishment, iisc has become the premier institute for advanced scientific and technological research and education in india. since its inception, the institute has laid a balanced emphasis on the pursuit of fundamental knowledge in science and engineering, and the application of its research findings for industrial and social benefit.
during 2017–18, the institute initiated the practice of undergoing international peer academic reviews over a 5-year cycle. each year, a small team of invited international experts reviews a set of departments. the experts spend 3 to 4 days at the institute. during this period, they interact closely with the faculty and students of these departments and tour the facilities, aiming to assess the academic work against international benchmarks. iisc has topped the ministry of human resource development (mhrd), government of india’s nirf (national institutional ranking framework) rankings not only in the universities category but also overall among all ranked institutions. times higher education has placed iisc at the 8th position in its small university rankings (that is, among universities with fewer than 5,000 students), at the 13th position in its ranking of universities in the emerging economies, and in the range 91–100 in its world reputation rankings. in the qs world university rankings, iisc is ranked 170. in the same ranking system, on the metric of citations per faculty, iisc is placed in second position. iisc publishes about 3,000 papers per year in scopus and web of science indexed journals and conferences and, each year, the institute awards around 400 phd degrees.
about the iisc library
jrd tata memorial library (https://www.library.iisc.ac.in), popularly known as the indian institute of science library, is one of the best science and technology libraries in india. started in 1911 as one of the first three departments in the institute, it has become a precious national resource center in the field of science and technology. the library annually receives a grant of 10–12% of the total budget of the institute. the library spends about 95% of its budget toward periodical subscriptions, which is unparalleled in this part of the globe.
with a collection of nearly 500,000 volumes of books, periodicals, technical reports and standards, the jrd tata memorial library is one of the finest in the country. currently, it subscribes to over 13,000 current periodicals. the library also maintains the iisc’s research publications repository, eprints@iisc (http://eprints.iisc.ac.in), and its theses and dissertations repository, etd@iisc (https://etd.iisc.ac.in).
off-campus access to licensed online resources
in a typical research library, licensed scholarly resources comprise research databases, electronic journals, e-books, standards, and more. a library licenses these resources through publishers/vendors. these license agreements limit access to the resources to the authorized users of an institute. in our case, authorized users include faculty members, enrolled students, current staff, contractual staff, and walk-in users of the library. seamless access to the licensed resources from on-campus sites is predominantly ip-address authenticated, which is a simple and efficient model for users physically located on the institute campus. these users expect a similar experience while accessing licensed online resources from off-campus locations. therefore, the challenge to the libraries is to ensure that such off-campus accesses are secure, seamless, and restricted to authorized users of an institute. libraries have been using various technologies including proxy servers, vpn servers, or single sign-on to facilitate seamless off-campus access to licensed resources. our institute has been using vpn technology to enable off-campus access to licensed online resources.
a virtual private network (vpn) is a service offered by many organizations to its members to enable them to remotely connect to the organization’s private network. a vpn extends a private network across a public network and allows users to send and receive data across shared or public networks as if their computing devices were directly connected to the private network. applications running across a vpn may therefore benefit from the functionality, security, and management of the private network. encryption is common, although not an inherent, part of a vpn connection.3 in our institute, faculty members and students are provided access to the vpn service when their institute email address is created. users follow four steps to use a vpn client to get connected to the campus network: • install vpn client software on their computer system. cisco anyconnect (https://www.cisco.com/c/en/us/products/collateral/security/anyconnect-securemobility-client/at-a-glance-c45-578609.html) is one such software. • start the vpn client software every time there is a need to connect to the private network. • enter the address of the institute’s vpn server, and click connect in the anyconnect window. • log in to the vpn server using their institutional email credentials. an authorized user of the institute can use any of the ip authenticated network services, including the licensed online resources, after a successful login to the vpn server. the vpn technology has been serving the purpose well, but the service is, by default, available only to the institute’s faculty and students. other categories of employees such as project assistants, project associates, research assistants, post-doctoral fellows, and others, who constitute a good percentage of iisc staff, are provided vpn access on a case-by-case basis. during the covid-19 lockdown, the library received several enquiries about accessing the online resources from off-campus sites. 
realizing the importance of the situation, the library quickly assessed the various possibilities for facilitating seamless off-campus access to the subscribed online resources apart from the vpn-based access. federated access through shibboleth identity provider (idp) service emerged as a possible solution to facilitate seamless off-campus access to the entire institute community.
federated access
federated access is a model for access control in which authentication and authorization are separated and handled by different parties. if a user wishes to access a resource controlled by a service provider (sp), the user logs in via an identity provider (idp). more complex forms of federated access involve the use of attributes (information about the user passed from the idp to the sp, which can be used to make access decisions) and can include extra services such as trust federations and discovery services (where the user selects which idp to use to connect to the sp).4 examples of this federated access model include shibboleth and openathens. shibboleth is open-source software that offers single sign-on infrastructure. openathens is a commercial product delivered as a cloud-based solution. it supports many of the same standards as shibboleth. an institution could therefore pay to join the openathens federation, which provides technical support to set up, integrate, and operationalize federated access using openathens. we decided to go with shibboleth for the following reasons:
• to avoid the recurring cost associated with the openathens solution.
• the existence of a shibboleth-based infed federation in the country. infed manages the trust between the participating institutions and publishers (http://infed.inflibnet.ac.in/). • infed is part of the edugain inter-federation, which enables our users to gain access to the resources of federations of other countries. what is shibboleth? shibboleth is a standards-based, open-source software package for web single sign-on across or within organizational boundaries. it allows sites to make informed authorization decisions for individual access of protected online resources in a privacy-preserving manner. the shibboleth software implements widely used federated identity standards, principally the oasis security assertion markup language (saml), to provide a federated single sign-on and attribute exchange framework. a user authenticates with their organizational credentials, and the organization (or identity provider) passes the minimal identity information necessary to the service provider to enable an authorization decision. shibboleth also provides extended privacy functionality allowing a user and their home site to control the attributes released to each application (https://www.shibboleth.net/index/). shibboleth has two major components: (1) an identity provider (idp), and (2) a service provider (sp). the idp supplies required authorizations and attributes about the users to the service providers (for example, publishers). the service providers make use of the information about the users sent by the idp to make decisions on whether to allow or deny access to their resources. interaction between a shibboleth identity provider and service provider. when a user attempts to access licensed content on the service provider’s platform, the service provider generates an authentication request and then directs the request and the user to the user’s idp server. the idp prompts for the login credentials. 
in our setup, the idp server communicates the login credentials to the institute’s active directory (ad) using the secure lightweight directory access protocol (ldap). ad is a directory service provided by microsoft. in a directory service, objects (such as a user, a group, a computer, a printer, or a shared folder) are arranged in a hierarchical manner facilitating easy access to the objects. organizations primarily use ad to perform authentication and authorization. once the authenticity of a user is verified, ad helps in determining if a user is authorized to use a specific resource or service. access is granted to a user only if the user checks out on both counts. the ad authenticates a user, and the response is sent back to the idp along with the required attributes. the idp then releases only the required set of attributes to the service provider. based on the attributes released by the idp, which amount to the user’s entitlement, the sp grants access to the resource. figure 1 illustrates the functioning of the two components of shibboleth.
figure 1. a shibboleth workflow involving a user, identity provider, and service provider.
identity federation
the interaction between a service provider and identity provider happens based on mutual trust. the trust is established by providing idp metadata as encrypted keys and the idp url that the sp uses to send and request information from the idp. the exchange of metadata between idp and sp can be informal if an institution licenses online resources from only a few publishers. however, research libraries license content from hundreds of sps. therefore, the role of federations is significant.
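the idp/sp interaction described above (authenticate against the directory, release only the required attributes, let the sp decide) can be sketched in a few lines of python. this is a minimal illustration and not the shibboleth implementation: the in-memory `DIRECTORY`, the `RELEASED_ATTRIBUTES` policy, the attribute names, and all three functions are hypothetical stand-ins, and no ldap or saml traffic is involved.

```python
# Hypothetical in-memory stand-in for the institute's directory service.
# Plain-text passwords are used only to keep the illustration short.
DIRECTORY = {
    "asha": {
        "password": "s3cret",
        "mail": "asha@example.org",
        "department": "physics",
        "affiliation": "member",
    },
}

# Hypothetical release policy: the only attribute the IdP passes to an SP.
RELEASED_ATTRIBUTES = ("affiliation",)


def authenticate(username, password):
    """Return the user's full directory record, or None if the login fails."""
    record = DIRECTORY.get(username)
    if record is None or record["password"] != password:
        return None
    return record


def idp_release(username, password):
    """Authenticate, then release only the attributes named in the policy."""
    record = authenticate(username, password)
    if record is None:
        return None
    return {name: record[name] for name in RELEASED_ATTRIBUTES}


def sp_authorize(attributes):
    """The SP grants access based solely on the released attributes."""
    return attributes is not None and attributes.get("affiliation") == "member"


if __name__ == "__main__":
    released = idp_release("asha", "s3cret")
    print(released)                # the SP never sees mail or department
    print(sp_authorize(released))
```

the point of the sketch is the separation of duties: the sp never receives the password or the full directory record, only the minimal set of attributes it needs for an authorization decision.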
in the absence of a federation, each identity provider and service provider must individually communicate with each other about their existence and configuration, as illustrated in figure 2.
figure 2. individual communication between idps and sps.
a federation is merely a list of metadata entries aggregated from its member idps and sps. our institute is a member of infed (information and library network access management federation). infed was established as a centralized agency to coordinate with member institutions in the process of implementing user authentication and access control mechanisms across all member institutions. infed manages the trust relationship between the idps and sps (publishers) in india. therefore, individual idps that intend to facilitate access to subscribed online resources through shibboleth will share their metadata with infed. infed, in turn, will share the metadata of the idps with respective service providers, as illustrated in figure 3. other regions have their own federations. for example, in the us, incommon (https://www.incommon.org/) serves as the federation, and in the uk, it is the uk access management federation (http://www.ukfederation.org.uk/).
figure 3. role of a federation as a trust manager between idps and sps.
how does one gain access to shibboleth-enabled resources?
a federation manages the trust between identity providers and service providers. the sps enable shibboleth-based access to subscribed resources to the idps based on the metadata shared by a federation.
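a rough count shows why a federation matters as the number of participants grows. the sketch below assumes the worst case, in which every idp needs a trust relationship with every sp; real institutions only register with the publishers they license, so the pairwise figure is an upper bound. the function names and the sample numbers are illustrative and do not come from infed.

```python
def pairwise_exchanges(num_idps, num_sps):
    """Metadata exchanges when every IdP contacts every SP directly."""
    return num_idps * num_sps


def federated_exchanges(num_idps, num_sps):
    """Metadata exchanges when each party registers once with a federation."""
    return num_idps + num_sps


if __name__ == "__main__":
    idps, sps = 50, 200  # illustrative sizes only
    print(pairwise_exchanges(idps, sps))   # 10000
    print(federated_exchanges(idps, sps))  # 250
```

under this assumption the direct model grows with the product of the two counts, while the federated model grows with their sum.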
once the sps allow access, users can access such resources by using the institutional login option via the athens/shibboleth link found on the service provider’s platform. alternatively, a library can create a simple html page listing all the shibboleth-enabled licensed resources, as shown in figure 4.
figure 4. partial screenshot of shibboleth-enabled resources of our institute.
each of the links in figure 4 is a wayfless url. a wayfless url is specific to an institution (idp), and it enables users of that institution to gain federated access to a service or resource in a way that bypasses where are you from (wayf), or the institutional login (discovery service) steps on the sp’s platform. since the institutional login or the discovery service step can be confusing to end users, wayfless links to the resources will facilitate an improved end-user experience in accessing licensed resources. a user needs to follow a link from the list of resources. the link will take the user to the sp. the sp will redirect the user to the idp server for authentication. after successful authentication, the user will gain access to the resource. there are two ways to get a wayfless url to a service: (1) the service provider can share the url or (2) one can make use of a wayfless url generator service like wugen (https://wugen.ukfederation.org.uk/wugen/login.xhtml).
benefits of shibboleth-based access
shibboleth-based single sign-on can effectively address several requirements of the libraries in ensuring secure and seamless on-campus and off-campus access to subscribed online resources. there are other benefits of shibboleth-based sso:
1.
it is open-source software that provides single sign-on infrastructure. 2. it enables organizations to use their existing user authentication mechanism to facilitate seamless access to licensed online resources. 3. being a single sign-on system, for the end users, it eliminates the need to have individual credentials for each online resource. 4. it uses security assertion mark-up language (saml) to securely transfer information about the process of authentication and authorization. 5. it is used by most of the publishers, who facilitate shibboleth-based access through shibboleth federations. 6. it requires a formal federation as a trusted interface between the institutions as an identity provider (idp) and publishers as service providers (sp) thereby ensuring the use of uniform standards and protocols while transmitting attributes of authorised users to publishers. inflibnet’s access management federation, infed, plays this role (https://parichay.inflibnet.ac.in/objectives.php). idp server configuration we installed the shibboleth idp software version 3.3.2 on a virtual machine on the azure platform. the vm system is configured with two virtual cpus, 4 gb of ram, 300 gib of os disk (standard hdd), and ubuntu linux os version 18.04.4 lts. coordination with the organization’s network support team is essential. the network support team handles the domain name service resolution of the idp server and facilitates the idp server to communicate with the organization’s active directory and to open non-standard communication ports on the idp server. shibboleth idp usage statistics the infed team has developed a beta version of the usage analysis tool called infedstat to analyse the use of federated access to gain access to licensed resources. we have implemented the tool on the idp server. figure 5 shows the redacted screenshot of the infedstat dashboard. 
it shows
• date-wise usage details of logged-in users along with ip address, time logged in, and the publishers’ platforms accessed,
• number of times the publishers’ platforms were accessed during a specific period,
• number of times users logged in for a specific period,
• unique users for a specific period, and
• unique publishers accessed during a specific period.
figure 5. idp usage dashboard.
conclusions
the implementation of federated access to subscribed online resources has ensured that all the authorized users of the institute can access almost all the licensed resources from wherever they are. the counter 5 usage analysis of subscribed resources for the period of january 2020 to october 2020 indicates that usage of online resources has increased by nearly 20 percent over the last year for the same period. the enhanced use could be partly because of ease of accessing online resources facilitated by federated access. to assess the reasons for enhanced usage of online resources, the library is planning to conduct a survey to understand how convenient and useful federated access to online resources has been, especially while being off campus. federated access through single sign-on is useful not just for accessing licensed online resources. a typical research library offers various other services to its users, including the institutional repository service, learning management system, online catalogue, etc. the library intends to integrate such services with sso, thereby freeing the end users from service-specific credentials.
endnotes
1 thomas dowling, “we have outgrown ip authentication,” journal of electronic resources librarianship 32, no. 1 (2020): 39–46, https://doi.org/10.1080/1941126x.2019.1709738.
2 john paschoud, “shibboleth and saml: at last, a viable global standard for resource access management,” new review of information networking 10, no. 2 (2004): 147–60, https://doi.org/10.1080/13614570500053874.
3 andrew g. mason, ed., cisco secure virtual private network (cisco press, 2001): 7, https://www.ciscopress.com/store/cisco-secure-virtual-private-networks-9781587050336.
4 masha garibyan, simon mcleish, and john paschoud, “current access management technologies,” in access and identity management for libraries: controlling access to online information (london, uk: facet publishing, 2014): 31–38.
the efficient storage of text documents in digital libraries
przemysław skibiński and jakub swacha
przemysław skibiński (inikep@ii.uni.wroc.pl) is [qy: title?], institute of computer science, university of wrocław, poland. jakub swacha (jakubs@uoo.univ.szczecin.pl) is [qy: title?], institute of information technology in management, university of szczecin, poland.
in this paper we investigate the possibility of improving the efficiency of data compression, and thus reducing storage requirements, for seven widely used text document formats.
we propose an open-source text compression software library, featuring an advanced word-substitution scheme with static and semidynamic word dictionaries. the empirical results show an average storage space reduction as high as 78 percent compared to uncompressed documents, and as high as 30 percent compared to documents compressed with the free compression software gzip.

it is hard to expect that the continuing rapid growth of global information volume will not affect digital libraries.1 the growth of stored information volume means growth in storage requirements, which poses a problem in both technological and economic terms. fortunately, the digital libraries’ hunger for resources can be tamed with data compression.2 the primary motivation for our research was to limit the data storage requirements of the student thesis electronic archive in the institute of information technology in management at the university of szczecin. the current regulations state that every thesis should be submitted in both printed and electronic form. the latter facilitates automated processing of the documents for purposes such as plagiarism detection or statistical language analysis. considering the introduction of the three-cycle higher education system (bachelor/master/doctorate), there are several hundred theses added to the archive every year. although students are asked to submit microsoft word–compatible documents such as doc, docx, and rtf, other popular formats such as tex script (tex), html, ps, and pdf are also accepted, both in the case of the main thesis document, containing the thesis and any appendixes that were included in the printed version, and the additional appendixes, comprising materials that were left out of the printed version (such as detailed data tables, the full source code of programs, program manuals, etc.). some of the appendixes may be multimedia, in formats such as png, jpeg, or mpeg.3 notice that this paper deals with text-document compression only.
although the size of individual text documents is often significantly smaller than the size of individual multimedia objects, their collective volume is large enough to make the compression effort worthwhile. the reason for focusing on text-document compression is that most multimedia formats have efficient compression schemes embedded, whereas text document formats usually either are uncompressed or use schemes with efficiency far worse than the current state of the art in text compression. although the student thesis electronic archive was our motivation, we propose a solution that can be applied to any digital library containing text documents. as the recent survey by kahl and williams revealed, 57.5 percent of the examined 1,117 digital library projects consisted of text content, so there are numerous libraries that could benefit from implementation of the proposed scheme.4 in this paper, we describe a state-of-the-art approach to text-document compression and present an open-source software library implementing the scheme that can be freely used in digital library projects. in the case of text documents, improvement in compression effectiveness may be obtained in two ways: with or without regard to their format. the more nontextual content in a document (e.g., formatting instructions, structure description, or embedded images), the more it requires format-specific processing to improve its compression ratio. this is because most document formats have their own ways of describing their formatting, structure, and nontextual inclusions (plain text files have no inclusions). for this reason, we have developed a compound scheme that consists of several subschemes that can be turned on and off or run with different parameters. the most suitable solution for a given document format can be obtained by merely choosing the right schemes and adequate parameter values.
experimentally, we have found the optimal subscheme combinations for the following formats used in digital libraries: plain text, tex, rtf, text annotated with xml, html, as well as the device-independent rendering formats ps and pdf.5

first we discuss related work in text compression, then describe the basis of the proposed scheme and how it should be adapted for particular document formats. the section “using the scheme in a digital library project” discusses how to use the free software library that implements the scheme. then we cover the results of experiments involving the proposed scheme and a corpus of test files in each of the tested formats.

przemysław skibiński (inikep@ii.uni.wroc.pl) is associate professor, institute of computer science, university of wrocław, poland. jakub swacha (jakubs@uoo.univ.szczecin.pl) is associate professor, institute of information technology in management, university of szczecin, poland. information technology and libraries | september 2009

text compression

there are two basic principles of general-purpose data compression. the first one works on the level of character sequences, the second one works on the level of individual characters. in the first case, the idea is to look for matching character sequences in the past buffer of the file being compressed and replace such sequences with shorter code words; this principle underlies the algorithms derived from the concepts of abraham lempel and jacob ziv (lz-type).6 in the second case, the idea is to gather frequency statistics for characters in the file being compressed and then assign shorter code words for frequent characters and longer ones for rare characters (this is exactly how huffman coding works; what arithmetic coding assigns are value ranges rather than individual code words).7 as the characters form words, and words form phrases, there is high correlation between subsequent characters.
to produce shorter code words, a compression algorithm either has to observe the context (understood as several preceding characters) in which the character appeared and maintain separate frequency models for different contexts, or has to first decorrelate the characters (by sorting them according to their contexts) and then use an adaptive frequency model when compressing the output (as the characters’ dependence on context becomes dependence on position). whereas the former solution is the foundation of prediction by partial match (ppm) algorithms, burrows-wheeler transform (bwt) compression algorithms are based on the latter.8 witten et al., in their seminal work managing gigabytes, emphasize the role of data compression in text storage and retrieval systems, stating three requirements for the compression process: good compression, fast decoding, and feasibility of decoding individual documents with minimum overhead.9 the choice of compression algorithm should depend on what is more important for a specific application: better compression or faster decoding. 
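the trade-off between compression ratio and decoding speed can be checked empirically. the following python sketch is not part of the original study: it compresses a synthetic sample (an assumption made for illustration; real documents will behave differently) with three codecs from the python standard library, namely deflate via zlib (lz-type), bzip2 via bz2 (bwt-based), and lzma (lz-type), and reports the bitrate and decompression time of each. no ppm implementation ships with the standard library, so that family is left out. the names `VOCABULARY`, `CODECS`, and `measure` are hypothetical.

```python
import bz2
import lzma
import random
import time
import zlib

# Synthetic sample text: an assumption for illustration only, not a real corpus.
VOCABULARY = ("the of and to in a is that for it as was with be by on not he "
              "library document text compression dictionary word scheme").split()
rng = random.Random(0)
sample = " ".join(rng.choices(VOCABULARY, k=200_000)).encode("ascii")

# Standard-library codecs: (compress, decompress) pairs.
CODECS = {
    "deflate": (zlib.compress, zlib.decompress),  # LZ-type
    "bzip2": (bz2.compress, bz2.decompress),      # BWT-based
    "lzma": (lzma.compress, lzma.decompress),     # LZ-type
}


def measure(data):
    """Return {codec: (bits per input character, decompression seconds)}."""
    results = {}
    for name, (compress, decompress) in CODECS.items():
        packed = compress(data)
        start = time.perf_counter()
        restored = decompress(packed)
        seconds = time.perf_counter() - start
        assert restored == data  # every codec here is lossless
        results[name] = (8 * len(packed) / len(data), seconds)
    return results


results = measure(sample)
for name, (bitrate, seconds) in results.items():
    print(f"{name:8s} {bitrate:6.3f} bits/char  decode {seconds * 1000:8.2f} ms")
```

the absolute numbers depend on the machine and on the input, so only the relative ordering of the codecs on one's own documents is worth reading from such a run.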
an early work of jon louis bentley and others showed that a significant improvement in text compression can be achieved by treating a text document as a stream of space-delimited words rather than individual characters.10 this technique can be combined with any general-purpose compression method in two ways: by redesigning character-based algorithms as word-based ones or by implementing a two-stage scheme whose first step is a transform replacing words with dictionary indices and whose second step is passing the transformed text through any general-purpose compressor.11 from the designer’s point of view, although the first approach provides more control over how the text is modeled, the second approach is much easier to implement and upgrade to future general-purpose compressors.12 notice that the separation of the word-replacement stage from the compression stage does not imply that two distinct programs have to be used: as long as an appropriate general-purpose compression software library is available, a single utility can use it to compress the output of the transform it first performed.

an important element of every word-based scheme is the dictionary of words that lists character sequences that should be treated as single entities. the dictionary can be dynamic (i.e., constructed on-line during the compression of every document),13 static (i.e., constructed off-line before the compression stage and once for every document of a given class; typically, the language of the document determines its class),14 or semidynamic (i.e., constructed off-line before the compression stage but individually for every document).15 semidynamic dictionaries must be stored along with the compressed document. dynamic dictionaries are reconstructed during decompression (which makes the decoding slower than in the other cases).
when the static dictionary is used, it must be distributed with the decoder; since a single dictionary is used to compress multiple files, it usually attains the best compression ratios, but it is only effective with documents of the class it was originally prepared for.

the basic compression scheme

the basis of our approach is a word-based, lossless text compression scheme, dubbed compression for textual digital libraries (ctdl). the scheme consists of up to four stages:
1. document decompression
2. dictionary composition
3. text transform
4. compression

stages 1–2 are optional. the first is for retrieving textual content from files compressed poorly with general-purpose methods. it is only executed for compressed input documents. it uses an embedded decompressor for files compressed using the deflate algorithm,16 but an external tool, precomp, is used to decode natively compressed pdf documents.17 the second stage is for constructing the dictionary of the most frequent words in the processed document. doing so is a good idea when the compressed documents have no common set of words. if there are many documents in the same language, a common dictionary fares better: it usually does not pay off to store an individual dictionary with each file because they all contain similar lists of words. for this reason we have developed two variants of the scheme. the basic ctdl includes stage 2; therefore it can use a document-specific semidynamic dictionary in the third stage. the ctdl+ variant uses a static dictionary common for all files in the same language; therefore it can omit stage 2. during stage 2, all the potential dictionary items that meet the word requirements are extracted from the document and then sorted according to their frequency to form a dictionary.
the requirements define the minimum length and frequency of a word in the document (by default, 2 and 6 respectively) as well as its content. only the following kinds of strings are accepted into the dictionary:
• a sequence of lowercase and uppercase letters (“a”–“z”, “A”–“Z”) and characters with ascii code values from range 128–255 (thus it supports any typical 8-bit text encoding and also utf-8)
• url address prefixes of the form “http://domain/,” where domain is any combination of letters, digits, dots, and dashes
• e-mails, that is, patterns of the form “login@domain,” where login and domain are any combination of letters, digits, dots, and dashes
• runs of spaces

stage 3 begins with parsing the text into tokens. the tokens are defined by their content; as four types of content are distinguished, there are also four classes of tokens: words, numbers, special tokens, and characters. every token is then encoded in a way that depends on the class it belongs to. the words are those character sequences that are listed in the dictionary. every word is replaced with its dictionary index, which is then encoded using symbols that are rare or nonexistent in the input document. indexes are encoded with code words that are between one and four bytes long, with lower indexes (denoting more frequent words) being assigned shorter code words. the numbers are sequences of decimal digits, which are encoded with a dense binary code and, like letters, placed in a separate location in the output file. the special tokens can be decimal fractions, ip numerical addresses, dates, times, and numerical ranges. as they have a strict format and differ only in numerical values, they are encoded as sequences of numbers.18 finally, the characters are the tokens that do not belong to any of the aforementioned groups.
they are simply copied to the output file, with the exception of those rare characters that were used to construct code words; they are copied as well, but have to be preceded with a special escape symbol. the specialized transform variants (see the next section) distinguish three additional classes from the character class: letters (words not in the dictionary), single white spaces, and multiple white spaces.

stage 4 could use any general-purpose compression method to encode the output of stage 3. for this role, we have investigated several open-licensed, general-purpose compression algorithms that differ in speed and efficiency. as we believe that document access speed is important to textual digital libraries, we have decided to focus on lz-type algorithms because they offer the best decompression times. ctdl has two embedded back-end compressors: the standard deflate and lzma, well known for its ability to attain high compression ratios.19

adapting the transform for individual text document formats

the text document formats have individual characteristics; therefore the compression ratio can be improved by adapting the transform for a particular format. as we noted in the introduction, we propose a set of subschemes (modifications of the original processing steps or additional processing steps) that can help compression, provided the issue that a given subscheme addresses is valid for the document format being compressed. there are two groups of subschemes: the first consists of solutions that can be applied to more than one document format.
it includes
• changing the minimum word frequency threshold (the “minfr” column in table 1) that a word must pass to be included in the semidynamic dictionary (notice that no word can be added to a static dictionary);
• using the spaceless word model (“wdspc” column in table 1), in which a single space between two words is not encoded at all; instead, a flag is used to mark two neighboring words that are not separated by a space;
• run-length encoding of multiple spaces (“spruns” column in table 1);
• letter containers (“letcnt” column in table 1), that is, removing sequences of letters (belonging to words that are not included in the dictionary) to a separate location in the output file (and leaving a flag at their original position).

table 1 shows the assignment of the mentioned subschemes to document formats, with “+” denoting that a given subscheme should be applied when processing a given document format. notice that we use different subschemes for the same format depending on whether a semidynamic (ctdl) or static (ctdl+) dictionary is used.

the remaining subschemes are applied for only one document format. they attain an improvement in compression performance by changing the definition of acceptable dictionary words, and, in one case (ps), by changing the definition of number strings. the encoder for the simplest of the examined formats, plain text files, performs no additional format-specific processing.

the first such modification is in the tex encoder. the difference is that words beginning with “\” (tex instructions) are now accepted in the dictionary.

the modification for pdf documents is similar. in this case, bracketed words (pdf entities), for example “(abc)”, are acceptable as dictionary entries. notice that pdf files are internally compressed by default, so the transform can be applied only after decompressing them into textual format. the precomp tool is used for this purpose.
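the word-acceptance rules described so far lend themselves to a compact illustration. the python sketch below is not the ctdl implementation; it is a minimal approximation, under stated assumptions, of how a semidynamic dictionary could be composed: candidate words are extracted with a format-dependent pattern (plain letters for txt, an optional leading “\” for tex, optional surrounding brackets for pdf), filtered by minimum length and minimum frequency, and sorted by descending frequency. the function name `build_dictionary`, the pattern table, the way length is counted, and the tie-breaking order are all hypothetical, and url, e-mail, and space-run entries are omitted for brevity.

```python
import re
from collections import Counter

# Letters plus byte values 128-255, as in the word definition quoted above.
WORD = rb"[A-Za-z\x80-\xff]+"

# Hypothetical per-format patterns approximating the rules in the text.
PATTERNS = {
    "txt": re.compile(WORD),
    "tex": re.compile(rb"\\?" + WORD),                 # optional leading backslash
    "pdf": re.compile(rb"\(" + WORD + rb"\)|" + WORD),  # optional surrounding brackets
}


def build_dictionary(data, fmt="txt", min_len=2, min_freq=6):
    """Return accepted words, most frequent first.

    Assumptions: length counts the whole token (including "\\" or brackets),
    and ties are broken by byte order; the source does not specify either.
    """
    counts = Counter(PATTERNS[fmt].findall(data))
    kept = [(word, n) for word, n in counts.items()
            if len(word) >= min_len and n >= min_freq]
    kept.sort(key=lambda item: (-item[1], item[0]))
    return [word for word, _ in kept]


tex_sample = rb"\section{intro} " * 3 + b"the cat " * 6 + b"a " * 10
print(build_dictionary(tex_sample, "tex", min_freq=3))
print(build_dictionary(tex_sample, "txt"))
```

in the sample, the one-letter word “a” is rejected by the length rule despite being the most frequent token, and lowering the frequency threshold to 3 (the tex value of “minfr” in table 1) admits the tex instruction that the default threshold of 6 would exclude.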
the subscheme for ps files features two modifications: its dictionary accepts words beginning with “/” and “\” or ending with “(“, and its number tokens can contain not only decimal but also hexadecimal digits (though a single number must have at least one decimal digit). the hexadecimal number must be at least 6 digits long, and is encoded with a flag: a byte containing its length (numbers with more than 261 digits are split into parts) and a sequence of bytes, each containing two digits from the number (if the number of digits is odd, the last byte contains only one digit).

for rtf documents, the dictionary accepts the “\”-preceded words, as it does for tex files. moreover, the hexadecimal numbers are encoded in the same way as in the ps subscheme so that rtf documents containing images can be significantly reduced in size.

specialization for xml is roughly the transform described in our earlier article, “revisiting dictionary-based compression.”20 it allows for xml start tags and entities to be added to the dictionary, and it replaces every end tag respecting the xml well-formedness rule (i.e., closing the element opened most recently) with a single flag. it also uses a single flag to denote xml attribute value begin and end marks.

html documents are handled similarly. the only difference is that the tags that, according to the html 4.01 specification, are not expected to be followed by an end tag (base, link, xbasehref, br, meta, hr, img, area, input, embed, param and col) are ignored by the mechanism replacing closing tags (so that it can guess the correct closing tag even after the singular tags were encountered).21

using the scheme in a digital library project

many textual digital libraries seriously lack text compression capabilities, and popular digital library systems, such as greenstone, have no embedded efficient text compression.22 therefore we have decided to develop ctdl as an open-source software library.
the library is free to use and can be downloaded from www.ii.uni.wroc.pl/~inikep/research/ctdl/ctdl09.zip. the library does not require any additional nonstandard libraries. it has both the text transform and back-end compressors embedded. however, compressing pdf documents requires them to be decompressed first with the free precomp tool. the compression routines are wrapped in a code selecting the best algorithm depending on the chosen compression mode and the input document format. the interface of the library consists of only two functions: ctdl_encode and ctdl_decode, for, respectively, compressing and decompressing documents.

ctdl_encode takes the following parameters:
• char* filename: name of the input (uncompressed) document
• char* filename_out: name of the output (compressed) document
• efiletype ftype: format of the input document, defined as: enum efiletype { html, pdf, ps, rtf, tex, txt, xml};
• edictionarytype dtype: dictionary type, defined as: enum edictionarytype { static, semidynamic };

ctdl_decode takes the following parameters:
• char* filename: name of the input (compressed) document
• char* filename_out: name of the output (decompressed) document

table 1. universal transform optimizations
ctdl settings ctdl+ settings
format minfr wdspc spruns letcnt wdspc spruns letcnt
html 3 + + + + +
pdf 3
ps 6 + +
rtf 3 + + +
tex 3 + + + + + +
txt 6 + + + + + +
xml 3 + + + + +

the library was written in the c++ programming language, but a compiled static library is also distributed; thus it can be used in any language that can link such libraries. currently, the library is compatible with two platforms: microsoft windows and linux. to use static dictionaries, the respective dictionary file must be available.
the library is supplied with an english dictionary trained on a 3 gb text corpus from project gutenberg.23 seven other dictionaries (german, spanish, finnish, french, italian, polish, and russian) can be freely downloaded from www.ii.uni.wroc.pl/~inikep/research/dicts. there also is a tool that helps create a new dictionary from any given corpus of documents, available from skibiński upon request via e-mail (inikep@ii.uni.wroc.pl).

the library can be used either to reduce the storage requirements or to reduce the time needed to deliver a requested document to the library user. in the first case, the decompression must be done on the server side. in the second case, it must be done on the client side, which is possible because stand-alone decompressors are available for microsoft windows and linux. obviously, a library can support both options by providing the user with a choice whether a document should be delivered compressed or not. if documents are to be decompressed client-side, the basic ctdl, using a semidynamic dictionary, seems handier, since it does not require the user to obtain the static dictionary that was used to compress the downloaded document. still, the size of such a dictionary is usually small, so it does not disqualify ctdl+ from this kind of use.

experimental results

we tested ctdl experimentally on a benchmark set of text documents. the purpose of the tests was to compare the storage requirements of different document formats in compressed and uncompressed form.
in selecting the test files we wanted to achieve the following goals: n test all the formats listed in table 1 (therefore we decided to choose documents that produced no errors during document format conversion) n obtain verifiable results (therefore we decided to use documents that can be easily obtained from the internet) n measure the actual compression improvement from applying the proposed scheme (apart from the rtf format, the scheme is neutral to the images embedded in documents; therefore we decided to use documents that have no embedded images) for these reasons, we used the following procedure for selecting documents to the test set. first, we searched the project gutenberg library for tex documents, as this format can most reliably be transformed into the other formats. from the fifty-one retrieved documents, we removed all those containing images as well as those that the htlatex tool failed to convert to html. in the eleven remaining documents, there were four jane austen books; this overrepresentation was handled by removing three of them. the resulting eight documents are given in table 2. from the tex files we generated html, pdf, and ps documents. then we used word 2007 to transform html documents into rtf, doc, and xml (thus this is the microsoft word xml format, not the project gutenberg xml format). the txt files were downloaded from project gutenberg. the tests were conducted on a low-end amd sempron 3000+ 1.80 ghz system with 512 mb ram and a seagate 80 gb ata drive, running windows xp sp2. for comparison purposes, we used three generalpurpose compression programs: n gzip implementing deflate n bzip2 implementing a bwt-based compression algorithm table 2. test set documents specification file name title author tex size (bytes) 13601-t expositions of holy scripture: romans corinthians maclaren 1,443,056 16514-t a little cook book for a little girl benton 220,480 1noam10t north america, v. 
1 trollope 804,813 2ws2610 hamlet shakespeare 194,527 alice30 alice in wonderland carroll 165,844 cdscs10t some christmas stories dickens 127,684 grimm10t fairy tales grimm 535,842 pandp12t pride and prejudice austen 727,415 148 information technology and libraries | september 2009 n ppmvc implementing a ppm-derived compression algorithm24 tables 3–10 show n the bitrate attained on each test file by the deflatebased gzip in default mode, the proposed compression scheme in the semidynamic and static variants with deflate as the back-end compression algorithm, 7-zip in lzma mode, the proposed compression scheme in the semidynamic and static variants with lzma as the back-end compression algorithm, bzip2 and ppmvc; n the average bitrate attained on the whole test corpus; and n the total compression and decompression times (in seconds) for the whole test corpus, measured on the test platform (they are total elapsed times including program initialization and disk operations). bitrates are given in output bits per character of an uncompressed document in a given format, so a smaller table 3. compression efficiency and times for the txt documents deflate lzma bzip2 ppmvc file name gzip ctdl ctdl+ 7-zip ctdl ctdl+ 13601-t 2.944 2.244 2.101 2.337 2.057 1.919 2.158 1.863 16514-t 2.566 2.150 1.969 2.228 1.993 1.838 2.010 1.780 1noam10t 2.967 2.337 2.109 2.432 2.151 1.958 2.160 1.946 2ws2610 3.217 2.874 2.459 2.871 2.659 2.312 2.565 2.343 alice30 2.906 2.533 2.184 2.585 2.360 2.056 2.341 2.090 cdscs10t 3.222 2.898 2.298 2.928 2.721 2.192 2.694 2.436 grimm10t 2.832 2.275 2.090 2.357 2.079 1.931 2.112 1.886 pandp12t 2.901 2.251 2.097 2.366 2.061 1.930 2.032 1.835 average 2.944 2.445 2.163 2.513 2.260 2.017 2.259 2.022 comp. time 0.688 1.234 0.954 6.688 2.640 2.281 2.110 3.281 dec. time 0.125 0.454 0.546 0.343 0.610 0.656 0.703 3.453 table 4. 
compression efficiency and times for the tex documents deflate lzma bzip2 ppmvc file name gzip ctdl ctdl+ 7-zip ctdl ctdl+ 13601-t 2.927 2.233 2.092 2.328 2.049 1.913 2.146 1.852 16514-t 2.277 1.904 1.794 1.957 1.744 1.645 1.746 1.534 1noam10t 2.976 2.370 2.142 2.445 2.186 1.986 2.195 1.976 2ws2610 3.206 2.906 2.482 2.864 2.674 2.323 2.562 2.340 alice30 2.897 2.526 2.183 2.573 2.350 2.048 2.332 2.085 cdscs10t 3.224 2.931 2.328 2.941 2.759 2.222 2.723 2.466 grimm10t 2.831 2.304 2.120 2.364 2.113 1.960 2.143 1.910 pandp12t 2.881 2.239 2.090 2.346 2.049 1.916 2.013 1.817 average 2.902 2.427 2.154 2.477 2.241 2.002 2.233 1.998 comp. time 0.688 1.250 0.969 6.718 2.703 2.406 2.140 3.329 dec. time 0.109 0.453 0.547 0.360 0.609 0.672 0.703 3.485 the efficient storage of text documents in digital libraries | skibiński and swacha 149 bitrate (of, e.g., rtf documents compared to the plain text) does not mean the file is smaller, only that the compression was better. uncompressed files have a bitrate of 8 bits per character. looking at the results obtained for txt documents (table 3), we can see an average improvement of 17 percent for ctdl and 27 percent for ctdl+ compared to the baseline deflate implementation. compared to the baseline lzma implementation, the improvement is 10 percent for ctdl and 20 percent for ctdl+. also, ctdl+ combined with lzma compresses txt documents 31 percent better than gzip, 11 percent better than bzip2, and slightly better than the state-of-the-art ppmvc implementation. in case of tex documents (table 4), the gzip results were improved, on average, by 16 percent using ctdl and by 26 percent using ctdl+; the numbers for lzma are 10 percent for ctdl and 19 percent for ctdl+. in a cross-method comparison, ctdl+ with lzma beats gzip by 31 percent, bzip2 by 10 percent, and attains results very close to ppmvc. 
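the percentages quoted for the txt documents can be rechecked from the “average” row of table 3. the python sketch below assumes that “improvement” means the relative reduction in bitrate with respect to the baseline; this definition is inferred rather than stated in the text, but it reproduces the quoted figures after rounding. the dictionary keys and the function name `improvement` are hypothetical labels.

```python
# Average bitrates (bits per character) from the "average" row of table 3 (txt).
avg = {
    "gzip": 2.944, "ctdl_deflate": 2.445, "ctdlplus_deflate": 2.163,
    "7zip": 2.513, "ctdl_lzma": 2.260, "ctdlplus_lzma": 2.017,
    "bzip2": 2.259, "ppmvc": 2.022,
}


def improvement(baseline, candidate):
    """Relative bitrate reduction in percent (assumed definition)."""
    return 100 * (baseline - candidate) / baseline


comparisons = [
    ("gzip", "ctdl_deflate"), ("gzip", "ctdlplus_deflate"),
    ("7zip", "ctdl_lzma"), ("7zip", "ctdlplus_lzma"),
    ("gzip", "ctdlplus_lzma"), ("bzip2", "ctdlplus_lzma"),
]
for base, cand in comparisons:
    print(f"{cand} vs {base}: {improvement(avg[base], avg[cand]):.1f} percent")

# Uncompressed text costs 8 bits per character, so the same formula gives
# the storage saved relative to the uncompressed txt documents.
print(f"ctdlplus_lzma vs uncompressed: {improvement(8.0, avg['ctdlplus_lzma']):.1f} percent")
```

the same function applied to the “average” rows of the other tables gives the corresponding figures for the remaining formats.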
on average, deflate-based ctdl compressed xml documents 20 percent better than the baseline algorithm (table 5), and with ctdl+ the improvement rises to 26 percent. ctdl improves lzma compression by 11 percent, and ctdl+ improves it by 18 percent. ctdl+ with lzma beats gzip by 33 percent, bzip2 by 8 percent, and loses only 4 percent to ppmvc. similar results were obtained for html documents (table 6): they were compressed with ctdl and deflate 18 percent better than with the deflate algorithm alone, and 27 percent better with ctdl+. lzma compression efficiency is improved by 11 percent with ctdl and 20 percent with ctdl+. ctdl+ with lzma beats gzip by 33 percent, bzip2 by 9 percent, and loses only 2 percent to ppmvc. for rtf documents (table 7), the gzip results were improved, on average, by 18 percent using ctdl, and 25 percent using ctdl+; the numbers for lzma are respectively 9 percent for ctdl and 17 percent for ctdl+. in a cross-method comparison, ctdl+ with lzma beats gzip by 34 percent, bzip2 by 7 percent, and loses 5 percent to ppmvc. although there is no mode designed especially for doc documents in ctdl (table 8), the basic txt mode was used, as it was found experimentally to be the best choice available. the results show it managed to improve deflate-based compression by 9 percent using ctdl, and by 21 percent using ctdl+, whereas lzma-based compression was improved respectively by 4 percent for ctdl and 14 percent for ctdl+. combined with lzma, ctdl+ compresses doc documents 30 percent better than gzip, 13 percent better than bzip2, and 1 percent better than ppmvc. in case of ps documents (table 9), the gzip results were improved, on average, by 5 percent using ctdl, and by 8 percent using ctdl+; the numbers for lzma improved 3 percent for ctdl and 5 percent for ctdl+. in a cross-method comparison, ctdl+ with lzma beats gzip by 8 percent, losing 5 percent to bzip2 and 7 percent to ppmvc. 
finally, ctdl improved deflate-based compression of pdf documents (table 10) by 9 percent using ctdl and 10 percent using ctdl+ (compared to gzip; the numbers are table 5. compression efficiency and times for the xml documents deflate lzma bzip2 ppmvc file name gzip ctdl ctdl+ 7-zip ctdl ctdl+ 13601-t 2.046 1.551 1.514 1.585 1.405 1.339 1.451 1.242 16514-t 0.871 0.698 0.670 0.703 0.612 0.590 0.599 0.552 1noam10t 2.383 1.870 1.736 1.914 1.711 1.575 1.724 1.515 2ws2610 0.691 0.539 0.497 0.561 0.474 0.440 0.461 0.422 alice30 1.477 1.258 1.140 1.248 1.131 1.034 1.116 0.999 cdscs10t 2.106 1.892 1.576 1.862 1.741 1.462 1.721 1.538 grimm10t 1.878 1.485 1.422 1.521 1.337 1.276 1.337 1.198 pandp12t 1.875 1.404 1.349 1.465 1.263 1.207 1.252 1.105 average 1.666 1.337 1.238 1.357 1.209 1.115 1.208 1.071 comp. time 0.750 1.844 1.390 10.79 4.891 5.828 7.047 3.688 dec. time 0.141 0.672 0.750 0.421 0.859 0.953 1.140 3.907 150 information technology and libraries | september 2009 much higher if compared to the embedded pdf compression—see “native” column in table 10); the numbers for lzma are respectively 7 percent for ctdl and 10 percent for ctdl+. combined with lzma, ctdl+ compresses pdf documents 28 percent better than gzip, 4 percent better than bzip2, and 5 percent worse than ppmvc. the results presented in tables 3–10 show that ctdl manages to improve compression efficiency of the general-purpose algorithms it is based on. the scale of improvement varies between document types, but for most of them it is more than 20 percent for ctdl+ and 10 percent for ctdl. the smallest improvement is achieved in case of ps (about 5 percent). figure 1 shows the same results in another perspective: the bars show how much better compression ratios were obtained for the same documents using different compression schemes compared to gzip with default options (0 percent means no improvement). 
compared to gzip, ctdl offers a significantly better compression ratio at the expense of longer processing time. the relative difference is especially high in case of decompression. however, in absolute terms, even in the worst case of pdf, the average delay between ctdl+ and gzip is below 180 ms for compression and 90 ms for decompression per file. taking into consideration the low-end specification of the test computer, these results certainly seem good enough for practical applications. compared to lzma, ctdl offers better compression and a shorter compression time at the expense of longer decompression time. notice that the absolute gain in compression time is several times the loss in decompression time, and the decompression time remains short, noticeably shorter than bzip2’s and several times shorter than ppmvc’s. ctdl+ beats bzip2 (with the sole exception of ps documents) in terms of compression ratio and achieves results that are mostly very close to the resource-hungry ppmvc.

table 6. compression efficiency and times for the html documents

            deflate                 lzma
file name   gzip    ctdl    ctdl+   7-zip   ctdl    ctdl+   bzip2   ppmvc
13601-t     2.696   2.054   1.940   2.121   1.868   1.751   1.932   1.670
16514-t     1.726   1.405   1.310   1.436   1.258   1.180   1.257   1.113
1noam10t    2.768   2.159   1.972   2.244   1.979   1.815   1.973   1.785
2ws2610     2.084   1.747   1.504   1.743   1.525   1.344   1.499   1.303
alice30     2.451   2.124   1.829   2.128   1.929   1.701   1.888   1.684
cdscs10t    2.880   2.593   2.084   2.597   2.410   1.966   2.348   2.131
grimm10t    2.603   2.074   1.916   2.138   1.883   1.752   1.889   1.688
pandp12t    2.640   2.037   1.891   2.120   1.826   1.717   1.777   1.596
average     2.481   2.024   1.806   2.066   1.835   1.653   1.820   1.621
comp. time  0.750   1.438   1.078   8.203   3.421   3.328   2.672   3.500
dec. time   0.140   0.515   0.594   0.359   0.688   0.750   0.812   3.672

table 7. compression efficiency and times for the rtf documents

            deflate                 lzma
file name   gzip    ctdl    ctdl+   7-zip   ctdl    ctdl+   bzip2   ppmvc
13601-t     1.882   1.431   1.372   1.428   1.267   1.200   1.300   1.120
16514-t     0.834   0.701   0.696   0.662   0.601   0.591   0.568   0.529
1noam10t    2.244   1.774   1.637   1.765   1.594   1.462   1.601   1.404
2ws2610     0.784   0.630   0.581   0.629   0.545   0.500   0.520   0.485
alice30     1.382   1.196   1.065   1.134   1.046   0.948   0.995   0.922
cdscs10t    2.059   1.882   1.558   1.784   1.704   1.432   1.645   1.488
grimm10t    1.618   1.301   1.227   1.285   1.150   1.082   1.149   1.010
pandp12t    1.742   1.340   1.264   1.336   1.169   1.115   1.142   1.012
average     1.568   1.282   1.175   1.253   1.135   1.041   1.115   0.996
comp. time  0.766   2.047   1.500   12.62   6.500   7.562   8.032   3.922
dec. time   0.156   0.688   0.766   0.469   0.875   0.953   1.312   4.157

the efficient storage of text documents in digital libraries | skibiński and swacha 151

conclusions

in this paper we addressed the problem of compressing text documents. although individual text documents rarely exceed several megabytes in size, their entire collections can have very large storage space requirements. although text documents are often compressed with general-purpose methods such as deflate, much better compression can be obtained with a scheme specialized for text, and even better if the scheme is additionally specialized for individual document formats. we have developed such a scheme (ctdl), beginning with a text transform designed earlier for xml documents and modifying it for the requirements of each of the investigated document formats. it has two operation modes: basic ctdl and ctdl+ (the latter uses a common word dictionary for improved compression) and uses two back-end compression algorithms: deflate and lzma (differing in compression speed and efficiency). the improvement in compression efficiency, which can be observed in the experimental results, amounts to a significant reduction of data storage requirements, giving the reasons to use the library in both new and existing digital library projects instead of general-purpose compression programs. to facilitate this process, we implemented the scheme as an open-source software library under the same name, freely available at http://www.ii.uni.wroc.pl/~inikep/research/ctdl/ctdl09.zip. although the scheme and the library are now complete, we plan future extensions aiming both to increase the level of specializations for currently handled document formats and to extend the list of handled document formats.

table 8. compression efficiency and times for the doc documents

            deflate                 lzma
file name   gzip    ctdl    ctdl+   7-zip   ctdl    ctdl+   bzip2   ppmvc
13601-t     2.798   2.183   2.062   2.181   1.976   1.854   2.115   1.818
16514-t     2.226   2.213   2.073   1.712   1.712   1.652   1.919   1.686
1noam10t    2.851   2.250   2.025   2.289   2.057   1.869   2.113   1.870
2ws2610     2.497   2.499   2.210   2.095   2.095   1.890   2.251   1.999
alice30     2.744   2.714   2.270   2.345   2.345   2.038   2.348   2.058
cdscs10t    2.916   2.891   2.231   2.559   2.560   2.062   2.475   2.196
grimm10t    2.691   2.677   2.059   2.179   2.179   1.856   2.075   1.833
pandp12t    2.761   2.171   2.050   2.189   1.955   1.843   1.983   1.770
average     2.686   2.450   2.123   2.194   2.110   1.883   2.160   1.904
comp. time  0.718   1.312   1.031   7.078   4.063   3.001   2.250   3.421
dec. time   0.125   0.375   0.547   0.344   0.547   0.718   0.735   3.625

table 9. compression efficiency and times for the ps documents

            deflate                 lzma
file name   gzip    ctdl    ctdl+   7-zip   ctdl    ctdl+   bzip2   ppmvc
13601-t     2.847   2.634   2.589   2.213   2.105   2.074   2.011   1.778
16514-t     3.226   3.129   3.039   2.730   2.707   2.699   2.613   2.505
1noam10t    2.718   2.551   2.490   2.147   2.060   2.015   1.892   1.694
2ws2610     3.064   2.922   2.795   2.600   2.521   2.450   2.336   2.186
alice30     3.224   3.154   3.026   2.750   2.745   2.691   2.553   2.400
cdscs10t    3.110   3.029   2.890   2.657   2.683   2.579   2.447   2.276
grimm10t    2.833   2.664   2.597   2.288   2.200   2.162   2.074   1.863
pandp12t    2.814   2.533   2.468   2.193   2.049   1.998   1.858   1.644
average     2.980   2.827   2.737   2.447   2.384   2.334   2.223   2.043
comp. time  1.328   3.015   2.500   14.23   10.96   11.09   4.171   5.765
dec. time   0.203   0.688   0.781   0.609   1.063   1.125   1.360   6.063

table 10. compression efficiency and times for the (uncompressed) pdf documents

                    deflate                 lzma
file name   native  gzip    ctdl    ctdl+   7-zip   ctdl    ctdl+   bzip2   ppmvc
13601-t     3.443   2.624   2.191   2.200   1.986   1.708   1.656   1.852   1.659
16514-t     4.370   2.839   2.836   2.810   2.422   2.422   2.328   2.378   2.241
1noam10t    3.379   2.522   2.103   2.094   1.924   1.659   1.603   1.770   1.587
2ws2610     3.519   2.204   2.346   2.248   1.781   1.947   1.860   1.625   1.480
alice30     3.886   2.863   2.753   2.668   2.429   2.308   2.216   2.315   2.137
cdscs10t    3.684   2.835   2.688   2.557   2.399   2.276   2.164   2.260   2.079
grimm10t    3.543   2.557   2.135   2.120   2.008   1.713   1.661   1.858   1.696
pandp12t    3.552   2.684   2.267   2.256   2.071   1.831   1.769   1.870   1.705
average     3.672   2.641   2.415   2.369   2.128   1.983   1.907   1.991   1.823
comp. time  n/a     1.594   3.672   3.250   19.62   13.31   16.32   5.641   7.375
dec. time   n/a     0.219   0.844   0.969   0.719   1.219   1.360   1.765   7.859

figure 1. compression improvement relative to gzip

acknowledgements

szymon grabowski is the coauthor of the xml-wrt transform, which served as the basis for the ctdl library.

references

1. john f. gantz et al., the diverse and exploding digital universe: an updated forecast of worldwide information growth through 2011 (framingham, mass.: idc, 2008), http://www.emc.com/collateral/analyst-reports/diverse-exploding-digital-universe.pdf (accessed may 7, 2009). 2. timothy c. bell, alistair moffat, and ian h. witten, “compressing the digital library,” in proceedings of digital libraries ‘94 (college station: texas a&m univ. 1994): 41. 3. ian h. witten and david bainbridge, how to build a digital library (san francisco: morgan kaufmann, 2002). 4. chad m. kahl and sarah c. williams, “accessing digital libraries: a study of arl members’ digital projects,” the journal of academic librarianship 32, no. 4 (2006): 364. 5. donald e.
knuth, tex: the program (reading, mass.: addison-wesley, 1986); microsoft technical support, rich text format (rtf) version 1.5 specification, 1997, http://www.biblioscape .com/rtf15_spec.htm (accessed may 7, 2009); tim bray et al., eds., extensible markup language (xml) 1.0 (fourth edition), 2006, http://www.w3.org/tr/2006/rec-xml-20060816 (accessed may 7, 2009); dave raggett, arnaud le hors, and ian jacobs, eds., w3c html 4.01 specification, 1999, http://www.w3.org/ tr/rec-html40/ (accessed may 7, 2009); postscript language reference, 3rd ed. (reading, mass.: addison-wesley, 1999), http://www.adobe.com/devnet/postscript/pdfs/plrm.pdf (accessed may 7, 2009); pdf reference, 6th ed., version 1.7, 2006, http://www.adobe.com/devnet/acrobat/pdfs/pdf_ reference_1-7.pdf (accessed may 7, 2009). 6. jacob ziv and abraham lempel, “a universal algorithm for sequential data compression,” ieee transactions on information theory 23, no. 3 (1977): 337. 7. ian h. witten, alistair moffat, and timothy c. bell, managing gigabytes: compressing and indexing documents and images, 2nd ed. (san francisco: morgan kaufmann, 1999). 8. john g. cleary and ian h. witten, “data compression using adaptive coding and partial string matching,” ieee transactions on communication 32, no. 4, (1984): 396; michael burrows and david j. wheeler, “a block-sorting lossless data compression algorithm,” digital equipment corporation src research report 124, 1994, www.hpl.hp.com/techreports/ compaq-dec/src-rr-124.pdf (accessed may 7, 2009). 9. witten, moffat, and bell, managing gigabytes. 10. jon louis bentley et al., “a locally adaptive data compression scheme,” communications of the acm 29, no. 4 (1986): 320; r. nigel horspool and gordon v. cormack, “constructing word-based text compression algorithms,” proceedings of the data compression conference (snowbird, utah, 1992): 62. 11. see for example andrei v. kadach, “text and hypertext compression,” programming & computer software 23, no. 
4 (1997): 212; alistair moffat, “word-based text compression,” software—practice & experience 2, no. 19 (1989): 185; przemysław skibiński, szymon grabowski, and sebastian deorowicz, “revisiting dictionary-based compression,” software— practice & experience 35, no. 15 (2005): 1455. 12. przemysław skibiński, jakub swacha, and szymon grabowski, “a highly efficient xml compression scheme for the web,” proceedings of the 34th international conference on current trends in theory and practice of computer science, lncs 4910 (2008): 766. 13. jon louis bentley et al., “a locally adaptive data compression scheme,” communications of the acm 29, no. 4 (1986): 320. 14. skibiński, grabowski, and deorowicz, “revisiting dictionary-based compression,” 1455. 15. skibiński, swacha, and grabowski, “a highly efficient xml compression scheme for the web,” 766. 16. peter deutsch, “deflate compressed data format specification version 1.3,” rfc1951, network working group, 1996, www.ietf.org/rfc/rfc1951.txt (accessed may 7, 2009). 17. christian schneider, precomp—a command line precompressor, 2009, http://schnaader.info/precomp.html (accessed may 7, 2009). 18. the technical details of the algorithm constructing code words and assigning them to indexes, and encoding numbers and special tokens, are given in skibiński, swacha, and grabowski, “a highly efficient xml compression scheme for the web,” 766. 19. david solomon, data compression: the complete reference, 4th ed. (london: springer-verlag, 2006). 20. skibiński, swacha, and grabowski, “a highly efficient xml compression scheme for the web,” 766. 21. dave raggett, arnaud le hors, and ian jacobs, eds., w3c html 4.01 specification, 1999, http://www.w3.org/tr/rec -html40/ (accessed may 7, 2009). 22. ian h. witten, david bainbridge, and stefan boddie, “greenstone: open source dl software,” communications of the acm 44, no. 5 (2001): 47. 23. project gutenberg, 2008, http://www.gutenberg.org/ (accessed may 7, 2009). 24. 
przemysław skibiński and szymon grabowski, “variable-length contexts for ppm,” proceedings of the ieee data compression conference (snowbird, utah, 2004): 409.

editorial | truitt 159

marc truitt

editorial: reflections on what we mean by “forever”

what do we mean when we tell people that we want or intend to preserve content or an object “forever”? a couple of weeks ago, i attended the fall meeting of the preservation and archiving special interest group (pasig) in san francisco. the group, generously sponsored by sun microsystems, is the brainchild of art pasquinelli of sun and michael keller of stanford.

first, a confession on my part. since the university of alberta (ua) was one of the founding members of pasig, i had occasion to attend the first several pasig meetings. in the beginning, there were just a handful of (perhaps fewer than ten) institutions represented. it seemed at the first couple of meetings, when the group was still finding its direction, that the content was slim, repetitious, and overly focused on sun’s own solutions in the digital preservation and archiving (dpa) arena. since we had other attendees ably representing ua, i stayed away from the following several meetings.

well, pasig has grown up. the attendee list for this meeting boasted nearly two hundred persons representing more than thirty institutions. among the attendees were many of the leading lights in dpa and the profession generally. institutions represented included several north american and european national libraries, as well as arls, memory institutions, and a host of companies and consultants offering a range of dpa solutions. yes, pasig has arrived, and we have art, mike, and sun to thank for this.

if i have one real remaining complaint about pasig, it’s that the group is still overly focused on sun’s solutions.
true, other vendors such as exlibris and vtls attended, but their solutions don’t compete; rather, they build on sun’s offerings. and while microsoft also was in attendance for the first time, its presentation focused not so much on dpa solutions (it has none) as on a raft of interesting and useful plug-ins whose purpose is to facilitate preservation of content created in microsoft products such as word, excel, powerpoint, etc. other large vendors of dpa solutions (think ibm, for one) remain conspicuously absent. it’s time for sun to do the “right thing” and “open source” pasig. if sun wishes to continue to sponsor pasig by lending administrative and organizational expertise, that would be great. indeed, a leading but not controlling role in pasig would be entirely consistent with the company’s new focus on support of open-source efforts such as mysql, openoffice, and opensolaris.

so, what about the title of this editorial? when we talk of digital preservation, just how long are we thinking of preserving an object? ask any twenty specialists in dpa, and chances are that you’ll get at least ten different answers. for some, the timeframe can be as short as five to twenty years. for others, it’s fifty or perhaps one hundred years. at pasig, at least one presenter described an organizational business model that envisions preserving content for five hundred years. and there are even some in our profession who glibly use what one might call “the dpa f-word,” although fortunately none of them seemed to be in attendance at this fall’s pasig.

what does this mean in a very practical, nuts-and-bolts it sense?
chris wood of sun gave a presentation at the 2008 pasig spring meeting in which he estimated that the cost to supply power and cooling alone to maintain a petabyte (1,000 tb) of disk-based digital content for a mere ten years would easily exceed $1 million.1 refining his figures downward somewhat, wood noted a few months later at the following pasig meeting that for a 1 tb drive, the five-year estimated power and cooling for 2008–12 could be estimated at approximately $320, or $640,000 per petabyte over ten years, still a considerable sum.2 add to this the costs of migration (consider that a modern spinning disk is generally thought to have a useful lifespan of about five years, and tape may have two or three decades) and the need for regular integrity-checking of digital content for “bit-rot,” and you have the stuff of a sustainability nightmare.

these challenges don’t even include the messy question of preserving an object so that it is usable in a century or five. while we probably will be able to read word and excel files for the foreseeable future, there are already countless files created with now-defunct pc applications of the 1980s and 1990s; many are stored on all kinds of obsolete media and today are skating on the edge of inaccessibility. already we are seeing concern expressed at institutions with significant digital library and digitization commitments that curating, migrating, and ensuring the integrity and usability of growing petabytes of content over centuries may be unsustainable in both dollars and staff.3 can we even imagine the possible maintenance burden for our descendants, say, 250 or 500 years from now?

in 2006, alexander stille observed that “one of the great ironies of the information age is that, while the late twentieth century will undoubtedly have recorded more data than any other period in history, it will also almost certainly have lost more information than any previous era.”4 how are we to deal with this?
can we meaningfully plan for the preservation of digital content over centuries given our poor track record over just the past few decades? perhaps we’re thinking too big when we speak of “forever.” maybe we need to begin by conceptualizing and implementing on a more manageable scale. or, to adopt a phrase that seemed to become the informal mantra of both this year’s pasig and the immediately preceding ipres meeting, “to get to forever you have to get to five years first.”5

marc truitt (marc.truitt@ualberta.ca) is associate university librarian, bibliographic and information technology services, university of alberta libraries, edmonton, alberta, canada, and editor of ital.

160 information technology and libraries | december 2009

about this issue of ital

a few months ago, while she was still working at the university of nevada las vegas, ital’s longtime managing editor, judith carter, shared with me the program for the discovery mini-conference that had just been held at unlv. the presentations, originally cast as poster sessions, suggested a diverse and fascinating collection of insights deserving of wider attention. i suggested to judith that she and her colleagues had the makings of a great ital theme issue, and i’m pleased that they accepted my invitation to rework the presentations into a form suitable for publication here. i hope that you will find the results of their work interesting; i certainly do. they’ve done a superb job! bravo to judith and the presenters at the unlv discovery mini-conference!

corrigenda

in our september issue, in an article by kathleen carlson, we inadvertently characterized camtasia studio as an open-source product. it is not. camtasia studio is published by techsmith corporation. you can find out more at the product website (http://www.techsmith.com/camtasia.asp). also, in the same article, we provided a url to a flash tutorial titled “how to order an article that asu does not own.” ms.
carlson has recently advised us that the tutorial in question is no longer available. references and notes 1. chris wood, “the billion file problem and other archive issues” (presentation, spring meeting of the sun preservation and archiving special interest group [pasig], san francisco, california, may 28, 2008), http://events-at-sun.com/ pasig_spring/presentations/chriswood_massivearchive.pdf (accessed oct. 22, 2009). 2. chris wood, “archive and preservation: emerging storage: technologies & trends” (presentation, fall meeting of pasig, baltimore, maryland, nov. 19, 2008), http://events -at-sun.com/pasig_fall08/presentations/pasig_wood.pdf. (accessed oct. 22, 2009). 3. consider, for example, the following extract from a recent posting to the syslib-l electronic discussion list by the head of library systems at the university of north carolina at chapel hill: i’m exaggerating a little in my subject line, but it’s been less than 4 years since we purchased our first large (5tb) storage array. we now have a raw 65tb online, and 84tb on order—although a considerable chunk of that 84 is going to replace storage that’s going out of warranty/maintenance and is more cost effective to replace (apple xraids, for instance). in the end, though we’ll net out with 100tb or thereabouts by the end of next year. a great deal of this space is going to digitization projects—no surprise there. we have over 20tb now in our “digital archive,” storage i consider dim, if not dark. we need a heck of a lot of space for staging backups, givien [sic] how much we write to tape in a 24-hour period. individual staff aren’t abusing our lack of quotas—it’s really almost all legitimate, project-driven work that’s eating us up. what’s scarier is that we’re now talking seriously about moving from project-driven work to programmatic work: the latest large photographic archive we acquired is being scanned as part of the acquisition/processing workflow. 
we’re looking at ways to prioritize the scanning of our manuscript collections. donors increasingly expect to see their gifts online. and we’re not even yet supporting an “institutional repository.” will owen, “0 to 60 in three years: mass storage management,” online posting, dec. 8, 2008, syslib-l@listserv.indiana.edu, https://listserv.indiana.edu/cgi-bin/wa-iub.exe?a0=syslib-l (account required; accessed oct. 22, 2009). 4. alexander stille, “are we losing our memory? or, the museum of obsolete technology,” lost magazine, no. 3 (feb. 2006), http://www.lostmag.com/issue3/memory.php (accessed oct. 22, 2009). while stille was referring in this quotation to both digital and nondigital materials, his comments are but part of a larger debate positing that the latter half of the twentieth century could well come to be known in the future as a “digital dark age” because of the vast quantity of at-risk digital content, recently estimated by one expert at some 369 exabytes (369 billion gb) worth of data. physorg.com, “‘digital dark age’ may doom some data,” http://www.physorg.com/news144343006.html (accessed oct. 22, 2009). 5. ed summers, “ipres, iipc, pasig roundup/braindump,” online posting, oct. 14, 2009, inkdroid, http://inkdroid.org/journal/2009/10/14/ipres-iipc-pasig-roundupbraindump/ (accessed oct. 22, 2009).

the binary vector as the basis of an inverted index file

donald r. king: rutgers university, new brunswick, new jersey.

the inverted index file is a frequently used file structure for the storage of indexing information in a document retrieval system. this paper describes a novel method for the computer storage of such an index. the method not only offers the possibility of reducing storage requirements for an index but also affords more rapid processing of query statements expressed in boolean logic.

introduction

the inverted index file is a frequently used file structure for the storage of indexing information in document retrieval systems.
an inverted index file may be used by itself or with a direct file in a so-called combined file system. the inverted index file contains a logical record for each of the subject headings or index terms which may be used to describe documents in the system. within each logical record there is a list of pointers to those documents which have been indexed by the subject heading in question. the individual pointers are usually in the form of document numbers stored in fixed-length digital form. obviously, the length of the lists will vary from record to record.

the purpose of this paper is the presentation of a new technique for the storage of the lists of pointers to documents. it will be shown that this technique not only reduces storage requirements, but that in many cases the time required to search the index is reduced. the technique is useful in systems which use boolean searches. the relative merits of boolean and weighted term searches are beyond the scope of this paper, as are the relative merits of the various possible file structures.

the binary vector as a storage device

the exact form of each document pointer is immaterial to the user of a document retrieval system as long as he is able to obtain the document he desires. the standard form for these pointers in most automated systems is a document number. note that each pointer is by itself a piece of information. however, if one thinks of a "peek-a-boo" system, the document pointer becomes simply a hole punched in a card. in this case the position of the pointer, not the pointer itself, conveys the information.

308 journal of library automation vol. 7/4 december 1974

the new technique presented in this paper is an extension of the "peek-a-boo" concept. a vector or string of binary zeroes is constructed equal in length to the number of documents expected in the system. the position of each vector element corresponds to a document number.
that is, the first position in a vector corresponds to document number one and the tenth vector position corresponds to document number ten. a vector is constructed for each subject heading in the system. as a document enters the system, ones are inserted in place of the zeroes in the positions corresponding to the new document number in the vectors for the subject headings used to describe the document.

as an example, assume the following document descriptions are presented to a system using binary vectors:

document number   subject headings
1                 a, b, d
2                 c, e
3                 a, c

the binary vectors for terms a, b, c, d, and e before the insertion of the indexing data would be as follows:

subject heading   vector
a                 000 ... 0
b                 000 ... 0
c                 000 ... 0
d                 000 ... 0
e                 000 ... 0

after the insertion of the indexing information, the same vectors would appear as follows:

subject heading   vector
a                 101 ... 0
b                 100 ... 0
c                 011 ... 0
d                 100 ... 0
e                 010 ... 0

the binary vector seems to have several advantages over the standard form of storage of document numbers in an inverted file. first, the records are of fixed length since the vectors are all equal in length to the expected number of documents in the system. space may be left at the end of each vector for the addition of new documents. periodic copying of the file may be used to expand the index records with additional zeroes added at the end of each record during the process. consequently, unless there are limitations of size imposed by the equipment, only one access to the storage device will be needed to retrieve the index record for a term.

binary vector/king 309

the second advantage offered by the binary vector method appears in the search process. most modern computers have a built-in capability of performing boolean logical manipulations on binary digit vectors or strings.
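the construction just described, and the boolean manipulation it makes possible, can be sketched in a few lines of python. this is only an illustration under stated assumptions: the function names (`build_vectors`, `doc_numbers`, `vec_and`, `vec_or`, `vec_not`) are invented here, the vectors are held as python byte strings rather than ibm 360 storage, and the leftmost bit of the first byte is taken to be document number one.

```python
from typing import Dict, Iterable, List


def build_vectors(descriptions: Dict[int, Iterable[str]], capacity: int) -> Dict[str, bytearray]:
    """Build one bit vector per subject heading.

    Bit p (1-based, leftmost bit first) is set when document p carries
    the heading. `capacity` is the expected number of documents.
    """
    size = (capacity + 7) // 8  # bytes needed for `capacity` bits
    vectors: Dict[str, bytearray] = {}
    for doc_number, headings in descriptions.items():
        byte_index, bit_index = divmod(doc_number - 1, 8)
        for heading in headings:
            vector = vectors.setdefault(heading, bytearray(size))
            vector[byte_index] |= 0x80 >> bit_index
    return vectors


def doc_numbers(vector: bytes) -> List[int]:
    """List the document numbers whose bits are set."""
    return [
        8 * i + bit + 1
        for i, byte in enumerate(vector)
        for bit in range(8)
        if byte & (0x80 >> bit)
    ]


def vec_and(a: bytes, b: bytes) -> bytes:
    return bytes(x & y for x, y in zip(a, b))


def vec_or(a: bytes, b: bytes) -> bytes:
    return bytes(x | y for x, y in zip(a, b))


def vec_not(a: bytes) -> bytes:
    # Also sets unused positions past the last document; combining the
    # result with AND against another vector masks them out again.
    return bytes(~x & 0xFF for x in a)


# The three document descriptions from the example in the text.
vectors = build_vectors({1: ["a", "b", "d"], 2: ["c", "e"], 3: ["a", "c"]}, capacity=16)
a_and_c = doc_numbers(vec_and(vectors["a"], vectors["c"]))
a_not_b = doc_numbers(vec_and(vectors["a"], vec_not(vectors["b"])))
```

with the example data, the first byte of the vector for term a is 10100000 and that for term c is 01100000, so "a and c" selects only document 3.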
thus, when boolean operations are specified as part of a query, the implementation of the operations within the computer is considerably easier and faster for binary vectors than for the standard form of inverted files.

other investigators of the use of the binary digit patterns or vectors have not fully explored its advantages and disadvantages. bloom suggests, without an explanation or evaluation, the use of bit patterns as the storage technique for inverted files in large data bases in the area of management information systems.1 davis and lin, again in the area of management information systems, propose bit patterns as the means of locating pertinent records in a master file.2 they do not compare the method with other possible techniques. sammon discusses briefly the use of binary vectors as a storage technique, but dismisses it on the basis that the two-valued approach obviates the possible assignment of weights to index terms in describing documents.3

gorokhov discusses the use of a modified binary vector approach in a document retrieval system implemented on a small soviet computer.4 faced with the need to minimize storage requirements for his inverted file, gorokhov concentrated on developing a technique for locating and removing strings of zeroes occurring in the binary vectors used within the system. since these zeroes represent the absence of information they could be removed if there were a way to indicate the position in the original vector of the ones that remained. he proposed the removal of strings of zeroes and the inclusion of numeric place values with the remaining vector elements. his result is a file with variable-length index records. the abandoning of the pure binary vector obviates the process, and gorokhov found it necessary to expand the vector elements into the original vector before logical operations could be applied.
even though he does not state so explicitly, gorokhov seems to have found his method more efficient than the standard inverted file. gorokhov's suggestion has led to the development of an algorithm for the compression of binary vectors. heaps and thiel have also discussed the use of compressed binary vectors as the basis of an inverted index file.5, 6 aside from a brief description of the method for implementing the concept, they offer no comparison of the binary vector with the standard inverted file.

storage requirements

an immediate reaction to the concept of binary vectors is to state that they will obviously take more storage space than the standard inverted file. a closer study shows that this is not always the case. the storage requirements for the two types of files may be calculated as follows:

1. $m_{bv} = \frac{d \cdot n}{8}$ bytes (binary vector file)
2. $m_{sr} = d \cdot i \cdot k$ (standard inverted file)

where:
m = storage requirements in bytes
d = number of documents in the system
n = number of index terms in the system
i = average depth of indexing in the system
k = size in bytes of a document number stored in the file

using equations 1 and 2 we find that the storage requirements for the binary vector file are, in fact, less than the requirements for the standard inverted file if $n < 8 \cdot i \cdot k$.

it is well known that the distribution of the use of index terms follows a logarithmic curve. in simple terms, one might say that a few terms are used very frequently and many terms are used infrequently. this condition implies that in a binary vector file the records for many terms will contain segments in which there are no "ones" in any byte. a method for removing these "zero" bytes is called compression.

compression algorithm

the technique for the compression of binary vectors as described here is designed specifically for the ibm 360 family of computers and similar machines. the extension to other machines should be obvious.
within the ibm 360 the byte, which contains eight binary digits, is the basic storage unit, and with the eight binary digits it is possible to store a maximum integer value of 255. for the purpose of describing a proposed compression algorithm for the binary vector in the ibm 360, the term subvector will be defined as a string of contiguous bytes chosen from within the binary vector. a zero subvector will be a subvector each of whose bytes contains eight binary zeroes. a nonzero subvector will be a subvector each of whose bytes contains at least one binary one.

to compress a binary vector in the ibm 360 the following steps may be taken:

1. divide the binary vector into a series of zero subvectors and nonzero subvectors. subvectors of either type may have a maximum length of 255 bytes. for zero subvectors longer than 255 bytes, the 256th byte is to be treated as a nonzero byte, thus dividing the long zero subvector.
2. each nonzero subvector is prefixed with two bytes. the first of the prefix bytes contains the count of zero bytes which precede the nonzero subvector in the uncompressed vector. the second prefix byte contains a count of the bytes in the nonzero subvector.
3. the compressed vector then consists of only the nonzero subvectors together with their prefix bytes.
4. a two byte field of binary zeroes will end the compressed vector.

the compression of the vectors creates variable-length records and removes the advantage of having records which are directly amenable to boolean manipulation. the effect of file compression on such manipulation in the search process is not as severe as it might appear. for the search process, the compressed vector may be expanded into its original form. the process of expansion of the binary vectors is relatively simple, and since only those index term records which are used in a query need to be expanded at the search time, the search time is not significantly affected.
as an example of the use of the compression algorithm consider the following binary vector.

01100000/10000000/ seven zero bytes /00000001/10000000/ ...

the slashes indicate the division of the vector into bytes. the vector might be read as indicating the following list of document numbers: 2, 3, 9, 80, and 81. in a standard inverted file with each document number assigned three bytes of storage, fifteen bytes would be required to store these numbers. the compressed vector which results from the application of the algorithm is the following:

00000000/00000010/01100000/10000000/00000111/00000010/00000001/10000000/ ...

again the slashes separate the vector into bytes. for the purpose of the following discussion consider each byte in a vector to be numbered sequentially beginning with byte one at the left. in the uncompressed vector bytes one and two form a nonzero subvector. consequently, the first four bytes in the compressed vector can be interpreted as follows:

byte one. binary zero indicating that no zero bytes were removed preceding this subvector.
byte two. binary two indicating that the following nonzero subvector is two bytes long.
bytes three, four. bytes one and two of the original vector.

bytes three through nine of the original vector are a zero subvector, and bytes ten and eleven form a second nonzero subvector. consequently, the second four bytes of the compressed vector are interpreted as follows:

byte five. binary seven indicating that a zero subvector of seven bytes has been removed.
byte six. binary two indicating that the following two bytes are a nonzero subvector.
bytes seven, eight. bytes ten and eleven of the original vector.

thus the binary vector has been reduced from eleven bytes to eight bytes while the space required to record the document numbers in the standard inverted file remains fifteen bytes.
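the four compression steps and the expansion step can be sketched as follows. this is a minimal python illustration, not the pl/1 code of the model systems: the names `compress` and `expand` are invented here, trailing zero bytes are simply dropped (so `expand` must be told the original record length), and a nonzero subvector longer than 255 bytes is split into a second subvector preceded by a zero count of zero; both of these last points are assumptions, since the text does not spell them out.

```python
def compress(vector: bytes) -> bytes:
    """Compress a byte-aligned binary vector using the four steps above."""
    out = bytearray()
    pos, end = 0, len(vector)
    while pos < end:
        # Step 1: skip a zero subvector of at most 255 bytes.
        zeros = 0
        while pos < end and vector[pos] == 0 and zeros < 255:
            zeros += 1
            pos += 1
        if pos == end:
            break  # assumption: trailing zero bytes are dropped
        # The next byte opens a nonzero subvector. After 255 skipped zero
        # bytes it may itself be zero (the "256th byte" rule of step 1).
        start = pos
        pos += 1
        while pos < end and vector[pos] != 0 and pos - start < 255:
            pos += 1
        # Step 2: two prefix bytes; step 3: the subvector itself.
        out += bytes((zeros, pos - start))
        out += vector[start:pos]
    out += b"\x00\x00"  # Step 4: two zero bytes end the compressed vector.
    return bytes(out)


def expand(compressed: bytes, length: int) -> bytes:
    """Rebuild the original vector of `length` bytes from its compressed form."""
    out = bytearray()
    pos = 0
    while True:
        zeros, count = compressed[pos], compressed[pos + 1]
        pos += 2
        if zeros == 0 and count == 0:
            break  # end marker reached
        out += bytes(zeros)  # restore the removed zero bytes
        out += compressed[pos:pos + count]
        pos += count
    out += bytes(length - len(out))  # restore dropped trailing zero bytes
    return bytes(out)


# The eleven-byte vector from the worked example (documents 2, 3, 9, 80, 81).
example = bytes([0b01100000, 0b10000000]) + bytes(7) + bytes([0b00000001, 0b10000000])
packed = compress(example)
```

for the example vector, `packed` consists of the eight bytes shown in the text followed by the two-byte end marker of step 4, which the text's count of eight bytes leaves out.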
memory requirements for the standard inverted file and the binary vector file to compare memory requirements for the standard inverted file and the compressed binary vector file, we base our comparison on the total number of postings in the file. in the standard inverted file the storage space for the postings is equal to the number of postings times the length of a single posting, which is usually two, three, or five bytes. memory requirements for the compressed binary vector file are more difficult to estimate because the distribution of document numbers within the record for each index term is not known. the fact that a single byte in the binary vector file may contain between zero and eight postings is extremely important. the worst possible case occurs if the postings in the binary vector are spaced in such a way that each nonzero byte contains only one posting, and these bytes are separated by zero bytes. consider the following example: ... /00000000/00010000/00000000/00000100/ ... in this case the compression algorithm will remove the zero bytes, but will add two bytes (the prefix bytes) for each nonzero byte. the resulting compressed vector will be essentially the same length as the standard inverted file record if each posting is three bytes long in the standard inverted file. it might seem that the distribution of one posting per byte for the entire vector represents an even worse situation. it is clear that the compression algorithm will, in this case, not reduce the size of the vector. however, it must be remembered that in the standard inverted file each posting will require at least two bytes and perhaps three bytes. thus, the length of the record in the standard inverted file is two or three times longer than the corresponding binary vector regardless of compression. in data used in two model retrieval systems prepared to compare the standard inverted file and the binary vector file there are 6,121 documents with a total of 94,542 postings. 
an examination of the binary inverted file for the model systems discloses that there are only 55,311 nonzero bytes in the binary vector file. thus there seems to be some form of clustering of the document numbers in each index term record. if each nonzero byte in this binary vector is isolated by zero bytes, two prefix bytes would be added for each byte. thus the total memory requirements for the postings in the compressed file would be 165,933 bytes. less storage space is required if some nonzero bytes are contiguous. on the other hand, the standard inverted file will require 189,084 bytes if a two-byte posting is used, or 283,626 bytes if a three-byte posting is used. further study of the clustering phenomenon is needed.

model retrieval systems

to test some of the conjectures about the differences between the standard inverted file and the binary vector file, two model systems were prepared for operation on an ibm 360/67. details of the systems and pl/1 program listings are available elsewhere.7 the data base used was obtained from the institute of animal behavior at rutgers university. in the data base 6,121 documents were indexed by 1,484 index terms. a total of 94,542 postings in the system gives an average depth of indexing of 15.4 terms per document. both inverted files were stored on ibm 2314 disc storage devices. to ease the problem of handling variable-length records in both files the logical records for each index term were divided into chains of fixed-length physical records. for the standard inverted file a physical record size of 331 bytes was chosen. the entire file required 702,713 bytes including record overhead. for the uncompressed binary vector file a physical record size of 1,286 bytes was chosen to include overhead and space for up to 10,216 document numbers.
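the storage figures quoted above follow from simple arithmetic on the counts reported for the model systems, as the following python sketch shows. the constant names are hypothetical; the worst-case figure assumes every nonzero byte is isolated and ignores the two-byte terminators and record overhead, and the postings-per-nonzero-byte ratio is a derived illustration that does not appear in the text.

```python
DOCUMENTS = 6_121        # documents in the data base
POSTINGS = 94_542        # total postings
NONZERO_BYTES = 55_311   # nonzero bytes in the binary vector file

# Worst case: each isolated nonzero byte carries two prefix bytes.
worst_case_compressed = NONZERO_BYTES * (1 + 2)

# Standard inverted file: one fixed-length entry per posting.
standard_two_byte = POSTINGS * 2
standard_three_byte = POSTINGS * 3

# Average depth of indexing (terms per document).
indexing_depth = POSTINGS / DOCUMENTS

# Derived illustration of clustering: average postings per nonzero byte.
postings_per_nonzero_byte = POSTINGS / NONZERO_BYTES

print(worst_case_compressed)      # 165933
print(standard_two_byte)          # 189084
print(standard_three_byte)        # 283626
print(round(indexing_depth, 1))   # 15.4
```

a ratio above one posting per nonzero byte is consistent with the clustering the text describes, since a uniform scatter of postings over the vectors would tend to leave most nonzero bytes with a single posting.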
when the compression algorithm was applied, with a physical record length of 130 bytes, the memory requirements for the binary vector file were reduced to 281,450 bytes, or 41 percent of the space required to store the standard inverted file. a series of forty searches of varying complexities was run against both files. the "time" function of pl/1 made it possible to accumulate timing statistics which excluded input/output functions. search times for the binary vector file include expansion of the compressed vectors, boolean manipulation of the vectors, and conversion of the resultant vector into digital document numbers. the times for the standard inverted file are for the boolean manipulation of the lists. the following points were noted in the analysis of the times:

1. in twenty-two of the forty queries for which comparative timings were obtained, the search of the binary vector file was faster, in one case by a factor of thirty-five. in the eighteen cases in which the search of the standard inverted file was faster, it was at most 6.17 times faster.
2. the range of the total times for the binary vector file was .79 seconds to 9.72 seconds. the range for searching the standard inverted file was .15 seconds to 202.98 seconds. the fact that the search times for the binary vector file are within a fairly narrow range, in contrast to the wider range of times for searching the standard inverted file, has important implications for the design of an on-line interactive document retrieval system. in such a system it is important that the computer respond to users' requests not only rapidly but consistently. the narrower range of the search times provided by the binary vector file will assist in producing consistent times.
3. the search times for the binary vector file, exclusive of expansion and conversion times, are unaffected by the number of postings contained in the index terms used in a query. on the other hand, the number of postings in the records used from the standard inverted file appears to cause the differences in search times for that file.

to test the conjectures that (1) search times for the binary vector file are related to the number of index terms in the query, and (2) search times for the standard inverted file are related to the number of postings in the index terms in the query, a correlation analysis was performed. the following correlation coefficients were obtained:

variables                                                                        r
number of terms in query and search times for the binary vector file            .960
number of postings in query terms and search times for standard inverted file   .979

the relationships indicated above are significant at the .001 level. no attempt was made to compute an average search time per term for the binary vector file or average search time per posting for the standard inverted file. such times would have meaning only for the model systems.

summary

the binary vector is suggested as an alternative to the usual method of storing document pointers in an inverted index file. the binary vector file can provide savings in storage space, search times, and programming effort.

references

1. burton h. bloom, "some techniques and trade-offs affecting large data base retrieval times," proceedings of the acm 24 (1969).
2. d. r. davis and a. d. lin, "secondary key retrieval using an ibm 7090-1310 system," communications of the acm 8:243-46 (april 1965).
3. john w. sammon, some mathematics of information storage and retrieval (technical report radc-tr-68-178 [rome, new york: rome air development center, 1968]).
4. s. a. gorokhov, "the 'setka-3' automated irs on the 'minsk-22' with the use of the socket associative-address method of organization of information" (paper presented at the all-union conference on information retrieval systems and automatic processing of scientific and technical information, moscow, 1967. translated and published as part of ad 697 687, national technical information service).
5. h. s. heaps and l. h. thiel, "optimum procedures for economic information retrieval," information storage & retrieval 6:131-53 (1970).
6. l. h. thiel and h. s. heaps, "program design for retrospective searches on large data bases," information storage & retrieval 8:1-20 (1972).
7. d. r. king, "an inverted file structure for an interactive document retrieval system" (ph.d. dissertation, rutgers university, 1971).

emergency remote library instruction and tech tools: a matter of equity during a pandemic

kathia ibacache, amanda rybin koob, and eric vance

information technology and libraries | june 2021
https://doi.org/10.6017/ital.v40i2.12751

abstract

during spring 2020, emergency remote teaching became the norm for hundreds of higher education institutions in the united states due to the covid-19 pandemic. librarians were suddenly tasked with moving in-person services and resources online. for librarians with instruction responsibilities, this online mandate meant deciding between synchronous and asynchronous sessions, learning new technologies and tools for active learning, and vetting these same tools for security issues and ada compliance. in an effort to understand our shared and unique experiences with emergency remote teaching, the authors surveyed 202 academic instruction librarians in order to answer the following questions: (1) what technology tools are academic librarians using to deliver content and engage student participation in emergency remote library sessions during covid-19? (2) what do instruction librarians perceive as the strengths and weaknesses of these tools? (3) what digital literacy gaps are instruction librarians identifying right now that may prevent access to equitable information literacy instruction online?
this study will deliver and discuss findings from the survey as well as make recommendations toward best practices for utilizing technology tools and assessing them for equity and student engagement.

kathia salomé ibacache oliva (kathia.ibacache@colorado.edu) is romance languages librarian, assistant professor, university of colorado boulder. amanda rybin koob (amanda.rybinkoob@colorado.edu) is literature and humanities librarian, assistant professor, university of colorado boulder. eric vance (eric.vance@colorado.edu) is associate professor of applied mathematics and director of lisa (laboratory for interdisciplinary statistical analysis), university of colorado boulder. © 2021.

introduction

the worldwide covid-19 pandemic has had important repercussions for university libraries. all library services, including information literacy instruction, moved online in a matter of days, creating a wave of needs that required immediate response. with the closure of university campuses all around the world, academic libraries encountered an unprecedented test of their adaptation abilities. although online education has been around for many years, widespread use of the remote classroom may have been unprecedented for many librarians until the spring of 2020. this type of online learning, as charles hodges et al. explain, is significantly different from the otherwise established domains of online and distance learning because it is unplanned, rushed, and happening in the midst of a crisis.1 as they note, “emergency remote teaching has emerged as a common alternative term” to differentiate from standard online education prior to the pandemic.2 the authors recognize the different and sometimes overlapping personal and professional impacts covid-19 has had on our communities, both inside and outside of the classroom. rather than broadly assessing emergency remote teaching, the authors are looking at what jody greene, referring to teaching during the covid-19 pandemic, calls “specific technological tools and flexible teaching practices.”3 this paper is concerned with issues of equity, student engagement, and technology tools that could be used to facilitate library instruction during emergency remote teaching. the authors seek to answer the following questions: (1) what technology tools are academic librarians using to deliver content and engage student participation in emergency remote library sessions during covid-19? (2) what do instruction librarians perceive as the strengths and weaknesses of these tools? (3) what digital literacy gaps are instruction librarians identifying since covid-19 that may prevent equitable access to information literacy instruction online?

literature review

technology tools facilitated a quick transition online in march 2020, enabling librarians to interact with students despite the move to emergency remote teaching. however, this fast transition and its associated learning curve accentuated issues of student engagement including equity and accessibility. there is a dearth of existing literature on teaching and learning online during times of great societal stress, with some notable exceptions, including a recent piece about university closures and moving to online classes during student-led protests in south africa from 2015 to 2017.4 as such, this literature review considers some of the barriers that contribute to inequitable information access in online learning, as well as digital literacy definitions. here we consider both ongoing challenges to equitable online access and specific challenges for the current covid-19 pandemic.
barriers to equitable student access in online learning equity in academic libraries is widely represented in the scholarship through topics including disability, race, class, and salary gaps among librarians.5 however, as our ongoing pandemic illustrates, there is a strong need for more literature regarding students’ equitable online access to information during times that call for emergency remote teaching. the issue of equity may be considered in terms of external and internal challenges, which affect students differently. external barriers include low bandwidth and lack of devices. some researchers advise letting students communicate through chat instead of a webcam, since webcam use increases bandwidth consumption.6 understandably, colleges may need to provide computers and wireless hotspots to students who lack access to computers or to the internet.7 moreover, a 2018 pew fact tank publication noted that 15 percent of homes with school-age students (6–17 years old) do not have access to high-speed connection, and this digital divide particularly affects teens and their ability to be involved with homework.8 although this data focused on school-age students, these issues probably affected some college students during the pandemic. students may also be experiencing internal barriers such as language differences, lack of self regulation, lack of previous educational experience, and stress, all of which may affect academic performance. 
for example, one study found that language barriers challenged international students during remote web conferences with librarians.9 another study of international students showed that their academic success relied significantly on a variety of internal characteristics, such as self-regulation.10 additionally, a survey of students taking online courses showed that previous educational experience, including with online learning or within a given discipline, supported completion of those courses.11 moreover, stress is an internal barrier for students that may have external causes and is likely affecting librarians, faculty, and students during covid -19. scholars note that stress changes peoples’ use of technology, and this stress manifests differently depending on individual identity markers, such as gender and experience.12 information technology and libraries june 2021 emergency remote library instruction and tech tools | ibacache, rybin koob, and vance 3 technology tools, digital literacy, student engagement in addition to barriers to equitable access, the digital age that has characterized the late 20th and 21st centuries, has prompted the advent of multiple technology tools that may be used in online library sessions, including emergency remote library instruction. these tools are meant to facilitate instruction and engagement, but they require students and instructors to be comfortable with technology. in the case of higher education, this level of comfort involves digital literacy competencies that surpass what is known as traditional textual literacy. 
the american library association’s (ala) digital literacy task force defines digital literacy as “the ability to use information and communication technologies to find, evaluate, create, and communicate information, requiring both cognitive and technical skills.”13 during the pandemic, the technical and cognitive skills of library instructors and students may be compromised due to stress as well as individual situations and specific environments. one of the technical challenges for remote library sessions stems from the need for instructor s to use tools to achieve flexibility and hybridity. librarians steven j. bell and john shank, addressing the challenges of new technologies for librarianship, coined the term “blended librarian” in 2004 to denote a librarian who combines traditional skills with those involving knowledge of hardware and software as applied in the teaching and learning process.14 the concept of the “blended librarian” may be outdated, but it encompasses the notion that librarians are expected to be comfortable with technology. again, librarians are now facing the mandate of presenting information literacy and library resources online, navigating between and facilitating the use of multiple technology tools and formats. it is worth considering how well our tools meet this mandate. although remote learning may be more amenable to some learners than others, there is consensus on the benefits of using technology for teaching and learning even if a learning curve exists for instructors. for example, researchers examining school support for classroom technology found that teachers supported enhanced technology integration even if it surpassed their own technology skills.15 notwithstanding the benefits perceived by teachers, there are also some drawbacks in the use of technology in the classroom, especially for distance learning. 
digital technologies researcher jesper aagaard, reporting part of a study on “technological mediation in the classroom” refers to two processes: “outside in,” where students use educational technologies to acquire knowledge in the classroom, and “inside out,” where students use technology tools to withdraw from the classroom visiting non-related websites.16 for instruction librarians, student engagement is paramount; therefore, redirecting students who leave the digital classroom is important, though it can be difficult to know when this occurs. a number of reasons could explain why students may disengage in a distance learning setting, one of them being the lack of digital literacy. moreover, the belief that higher education students in the 21st century are technologically savvy may be misleading. citing mark prensky, who originated the terms “digital native” and “digital immigrant,” wan ng explains that the phrase “digital natives” describes those people born in 1980 and after whose lives have been shaped by technology.17 ng found that while the students in his study were very comfortable with technologies such as word processor software, youtube, and facebook, they were not as comfortable using technologies to create content.18 there may be a digital literacy divide between information technology and libraries june 2021 emergency remote library instruction and tech tools | ibacache, rybin koob, and vance 4 knowing and using a technology for social media and using a technology to create online content such as web pages and blogs. similarly, ng found that when presented with unfamiliar technology, students spent less time learning the new technology and instead focused on preparation of content.19 this finding may be of concern to instruction librarians who use a myriad of tools during emergency remote teaching. it is important to consider that these digital literacy divides could stem from factors not related to a student’s age group. 
researchers ellen johanna helsper and rebecca envon question the notion that a person may be called a digital native if they were born after 1980. these authors state that there are variables other than generational differences that could define a person as a digital native, such as gender, education, experience, and interaction with technology.20 therefore, even when people grow up in technological environments, they may not be considered digital natives. to minimize a gap in equity, lecture design, even for one-time library sessions, offers an opportunity to think of technology tools that could increase students’ participation and prompt learning. david ellis, studying classroom resources to enhance student engagement, notes that padlet, a web 2.0 technology, supports interaction and learning.21 seyed abdollah shahrokni, reviewing another web 2.0 technology, playposit, as a video tool for language instruction, states that it “can support learning in language classrooms” if used in a lecture design that includes relevant questions.22 lecture design applies to all types of settings: in person, flipped, and distance learning. approaches should be applied consistently to help students become more digitally literate and bridge equity issues where possible. jurgen schulte et al., providing examples of “new” librarian roles in a science curriculum, note that digital literacy enables better learning. 23 in the case of emergency remote teaching, instruction librarians may promote digital literacies through the use of technologies that increase students’ engagement and their “outside in” participation in the teaching and learning process. considering these challenges, the authors seek to identify the strengths and weaknesses of technology tools used by librarians and the digital literacy gaps that may prevent access to equitable library instruction. 
methods

instrument

the authors used a six-question qualtrics survey approved by the institutional review board at the university of colorado boulder. the survey was open for two weeks, between may 10 and may 24, 2020. it is worth noting that the questions were specific to this timeframe, and some responses indicated that instruction librarians were still finishing up spring semester 2020. the survey received 202 responses. however, the number of responses to each question varied as answers were not required. the data collected were both quantitative and qualitative, reflecting respondents’ practices, perceptions, and personal knowledge. respondents answered two multiple-choice and four free-text questions. for the multiple-choice questions, participants could choose all the options that applied and enter their own choice as well. the multiple-choice questions gathered data on the technology tools that librarians used to deliver content or to engage with students during covid-19. these questions distinguished between content delivery platforms (like zoom) and technology tools used for student engagement (like padlet). the technology tools included in the multiple-choice questions were chosen based on the authors’ knowledge of their potential relevance to instruction librarians. the final four qualitative questions collected information about respondents’ perceptions of strengths and weaknesses of technology tools, as well as digital literacy gaps identified during covid-19 and other challenges to equitable instruction. qualtrics provided a report, which the authors organized in a spreadsheet used to analyze the data and create the figures. the following survey questions were asked:

1. what content delivery technology have you used to create your distance learning library sessions during covid-19?
2. what technology tools have you used to enhance student engagement in your distance learning library sessions during covid-19?
3. what are the strengths of the technology tools you’re using right now?
4. what are the weaknesses of the technology tools you’re using right now?
5. what digital literacy gaps have you identified in your students since covid-19 closures? ala’s digital literacy task force defines digital literacy as “the ability to use information and communication technologies to find, evaluate, create, and communicate information, requiring both cognitive and technical skills.”
6. what other challenges exist in your ability to effectively provide equitable information literacy instruction during this time?

please see appendix a for the complete survey instrument.

participants

the survey was distributed through email to five listservs associated with academic libraries and library organizations: the seminar on the acquisition of latin american library materials (salalm) listserv, the information literacy instruction discussion listserv (https://lists.ala.org/sympa/info/ili-l), the library instruction round table (lirt) listserv (https://lists.ala.org/sympa/info/lirt-l), the lita instructional technologies interest group listserv (https://lists.ala.org/sympa/info/lita-insttechig), and the literature in english discussion list (https://lists.ala.org/sympa/info/les-l). these organizations were chosen due to their connection with library instruction in academic libraries and the authors’ subject specialty affiliations (romance languages and english and american literature).

grounded theory approach

the data for questions 3, 4, 5, and 6 were analyzed using a basic grounded theory approach, where the authors collected themes and patterns from the responses rather than approaching the data with pre-existing hypotheses.24 based on their observations, the authors categorized responses according to an agreed-upon set of keywords. in addition, after coding the data separately, the researchers examined every answer together to ensure consistency and reliability. a mixed-methods survey with a grounded theory approach to analysis allowed for a larger number of responses than qualitative interviews. the survey format also allowed for quicker solicitation and analysis of data, given the urgency of the topic and the authors’ desire to provide recommendations to colleagues in a timely manner.

findings

popularity of technology tools

figure 1 shows respondent selections from the list of content delivery tools provided by the authors. a large number of respondents used libguides as a content delivery tool during covid-19, followed closely by the video conferencing tool zoom. however, although libguides and zoom displayed a substantial amount of concurrence among the respondents, fewer than half of the respondents used each of the other technology tools shown in figure 1. these data suggest that a large number of the respondents were able to deliver library instruction via synchronous learning through zoom or by providing resources asynchronously via libguides, and thus had the opportunity to have at least some engagement with students. figure 1 also shows that more respondents used snagit and screencast-o-matic to create videos than playposit. similarly, a little over one-eighth of respondents used the graphic design tool canva to create content, although this tool had better usage than adobe illustrator, which was only used by one respondent. in addition, the communication software google hangouts was largely not used by respondents. the authors listed formative and pear deck in the survey options as well, but these were not selected by any respondents (not shown in figure 1).

figure 1.
respondent selections to question 1: what content delivery technology have you used to create your distance learning library sessions during covid-19? figure 2 represents the tools used by the 95 respondents who selected other and entered additional tools in a free-text box in question 1. tools mentioned, such as webex, camtasia, panopto, and kaltura capture, were used for video conferencing but to a lesser extent than zoom. similarly, only six respondents reported using narrated powerpoint. interestingly these tools were still used by more people than playposit. respondents mentioned a wide array of other technology tools in the free-text box (see appendix b); however, none of these tools were used individually by more than three respondents. information technology and libraries june 2021 emergency remote library instruction and tech tools | ibacache, rybin koob, and vance 7 figure 2. other content delivery technology used to create distance learning library sessions during covid-19. the survey also asked about technology tools used for student engagement in distance learning library sessions during covid-19. the authors distinguish these tools from content delivery tools, as they are often utilized in conjunction with some of the tools mentioned in figure 1 to facilitate student interaction. figure 3 shows that, among the tools listed by the authors, respondents preferred the application google forms, found in google drive and google classroom, as more than one-third of respondents indicated they used this application to enhance student engagement. although representing fewer than half of the respondents, 18 more people selected google forms over poll everywhere, the tool with the second-best representation. moreover, poll everywhere and padlet, two online tools that enable student participation through custom-made polls and post-it boards, were each utilized by about one-fourth of participants. 
the game-based learning platform kahoot was used by nearly one-fifth of respondents, and mentimeter, another interactive platform allowing students to answer multiple-choice and openended questions, was used by 11 respondents. less than five percent of the respondents used the interactive technology tools flipgrid, answergarden, jamboard, mural, slido, and socrative. no respondents indicated they used pear deck, google drawings, quizalize, gosoapbox, and yo teach! (not shown in figure 3). in addition, 42 respondents entered the names of technology tools they used to enhance student engagement in the other free-text option. similar to the responses in the free-text answer for question 1, respondents provided a broad list of technology tools. two of the tools listed displayed a higher number of concurrences: eight respondents mentioned zoom polls and four mentioned information technology and libraries june 2021 emergency remote library instruction and tech tools | ibacache, rybin koob, and vance 8 springshare libwizard. an additional 20 tools were used by just one participant each (see appendix c). figure 3. respondent selections to question 2: what technology tools have you used to enhance student engagement in your distance learning library sessions during covid-19? strengths and weaknesses of technology tools instruction librarians also described the perceived strengths of the technology tools they used. figure 4 shows that a little less than half of the respondents agreed “easy to use” was an important consideration for technology tools, making it the most frequently mentioned strength. responses showed interest in ease of use for librarians, students, and faculty alike. 
for example, respondents included the phrases “our learners were comfortable with them,” “it’s easy to get started,” and “everyone already knows zoom.” in addition, nearly one-fourth of participants selected the strength “interactive/collaborative” followed at a distance by the strength “flexible,” which dropped dramatically to 15 percent. in fact, the number of respondents who noted “interactive/collaborative” was almost quadruple the number of respondents who mentioned the less popular choices “supported by it” or “captioning functionality.” fewer than 19 participants acknowledged that it was important for the technology tools to enable remote instruction, include recording functionality and screen-sharing functionality, and to be able to enhance communication. only 11 participants wrote that it was important for the tool to be readily available. respondents referred to other strengths not included in figure 4 due to their infrequency. nonetheless, some of these strengths offer unique insights. for example, four respondents noted information technology and libraries june 2021 emergency remote library instruction and tech tools | ibacache, rybin koob, and vance 9 that they favor free tools. in addition, three respondents stated that it was beneficial to repurpose content created with technology tools. two respondents mentioned that they preferred tools that do not require download and/or account creation. another respondent mentioned that mo bilefriendly tools were most helpful for engaging students. figure 4. respondent answers to question 3: what are the strengths of the technology tools you’re using right now? respondents also shared their observations around technology tool weaknesses. figure 5 shows that several perceived weaknesses were the inverse of strengths from figure 4, including that tools were “difficult to use” or “not interactive or engaging.” figure 5 also indicates that respondents were divided as to the most significant weaknesses. 
in fact, not even one-fourth of respondents selected the most frequent response, “not interactive or engaging,” displaying a lack of concurrence. the second most-repeated weakness referred to bandwidth requirements, with 27 respondents worrying about the lack of requisite internet access. the authors grouped seven weaknesses mentioned by respondents under “other functional limitations.” these weaknesses included “lack of screen capture,” “connection failures,” “lack of captioning,” “lack of recording capabilities,” “limited sharing screen,” “freezing video,” and “video quality.” each of these specific limitations was only mentioned a couple of times, but together these functional limitations were mentioned by 17 respondents. again, there were some specific weaknesses mentioned by only a few respondents. some of the highlights included tech overload or too many tools to choose from (two respondents), computer storage requirements (three respondents), and that the tools are not flexible enough or easy to integrate into other systems such as canvas or libguides (four respondents). two people observed that the tech tools they used had no weaknesses. interestingly, 18 respondents included keywords and phrases in their answers (not shown in figure 5) that were not directly related to tool weaknesses, but rather described other issues affecting teaching and learning in a remote setting. these included students lacking computers or having only cell phones (seven respondents), students’ limited technology skills or attitudes about remote learning (six respondents), students’ home setups (three respondents), and limited familiarity with the tools among teaching faculty (two respondents).
these kinds of responses illustrate the wide range of interconnected factors impacting librarians’ experiences engaging with students and technology during covid-19. finally, 26 percent of librarians answering this question mentioned some weaknesses related to zoom (not shown in figure 5). to illustrate, some comments included “active learning in zoom [sic] is difficult . . .”; “zoom recordings take up a lot of space and our college is running out of room . . .”; “zoom doesn’t work as well when using wifi [sic], as opposed to connecting through a network”; “it is easy to zone out and not pay attention to zoom [sic]”; “with zoom it is difficult to interact with students on a one-to-one basis as they breakout [sic] to conduct research”; and “students tend to not have cameras on . . . and it’s hard to tell if they are actually paying attention.” these observations may show that while respondents favor using tools like zoom, they are also aware of important limitations.

figure 5. respondent answers to question 4: what are the weaknesses of the technology tools you’re using right now?

digital literacy gaps

beyond describing the technology tools used, respondents were asked to identify digital literacy gaps that they noticed in students during covid-19 closures. as stated above, the authors defined digital literacy in the survey question according to the ala digital literacy task force definition. still, answers to this question provoked a wide range of responses as seen in figure 6. the most frequently recurring response was that digital literacy gaps were the same as those perceived before the pandemic, although only 25 respondents agreed on this.
digital literacy gaps observed by respondents included “lack of tech skills in general,” “problems evaluating information online,” “ineffective search strategies,” “difficulty communicating online,” “problems using library resources in general,” “problems using online resources,” “problems using library databases,” and “understanding citation and plagiarism.” the second-biggest category, “lack of tech skills in general,” included varied responses such as “some of my students lack a basic understanding of . . . browsers, upload/download, url versus link, activate/enable a feature etc.”; “students have trouble navigating multiple windows”; and “students are having a hard time trying something new which involves more than a single click or two.” eleven other respondents noted that it was too early to evaluate digital literacy gaps during emergency remote teaching. one respondent offered insight about the possibility that librarians missed gaps because they were not able to meet with all students. as they stated, “students who have access and are in contact with librarians seem to have adequate skills. i don’t know how many students simply lack internet access, and i don’t know how many need the library and don’t figure out how to access it. . . .” ideas for reaching more students who may not have access to in-class library sessions are mentioned in the recommendations section below. when asked about digital literacy gaps, some respondents mentioned student experiences during covid-19 that were not directly related to digital literacy and therefore are not included in figure 6. however, the authors considered this information relevant because it provided insight into perceived challenges students faced. the authors separated such responses into two groups: external challenges and internal challenges. external challenges mostly involved technology access rather than digital literacy per se, with 22 respondents mentioning lack of tech access as a barrier or gap.
it is worth noting that respondents mentioned this lack more than any individual digital literacy gap shown in figure 6. fifteen respondents also noted that students may lack internet access at home, while five percent mentioned a home environment that was not ideal or conducive to learning. although these external challenges are not explicitly related to digital literacy, the fact that they are mentioned here may indicate that respondents perceived these challenges as interrelated during the covid-19 pandemic. internal challenges included concepts that may be seen as related to digital literacy but are not explicitly included in the ala task force definition. in fact, many of these challenges had to do with pandemic-specific difficulties such as “emotional issues” arising during covid-19 (10 respondents). five respondents worried about information overload, while two respondents each mentioned that students were less likely to ask for help and more likely to have problems following directions during emergency remote learning.

figure 6. respondent answers to question 5: what digital literacy gaps have you identified in your students since covid-19 closures?

the last survey question asked respondents to reflect on any other challenges that may have impacted their ability to effectively provide equitable library instruction during emergency remote learning. figure 7 displays an array of responses, including some related to technology tools, home environments, and institutional support. nonetheless, technology access from home was perceived as the most important challenge (39 respondents), followed closely by internet issues (35 respondents). for both challenges, the authors included responses that specified lack of tech access for students, teaching faculty, or instruction librarians.
many respondents did not specify who lacked access. however, one could argue that lack of access by any of those three groups may impede connection and student engagement. other challenges such as home environment, fewer library instruction sessions, communication barriers with students, lack of student engagement, no time to plan, emotional distress, and issues with synchronous or asynchronous instruction affected 11 percent or less of respondents each. additionally, the data indicated that librarians perceived more communication barriers with students (14 respondents) than with faculty (nine respondents). in figure 7, “asynchronous/synchronous” refers to problems encountered by respondents that had to do, in general, with the unique challenges of presenting content online either asynchronously or synchronously. for example, respondents mentioned being unsure whether students were engaging with asynchronous content. they also mentioned being asked by faculty to use one format over another, despite librarian preferences. one respondent focused specifically on the need for flexibility when addressing equity: “asynchronous instruction does not allow the real time adaptation to student needs (cognitive and technical).” even though figure 7 relates to challenges experienced in providing equitable library instruction, respondents showed that there was also an emotional factor surrounding these challenges.
two revealing responses to the question about challenges included “my kids running around in the background, not having an actual office, being expected to work 40 hours a week while homeschooling and running a household” and “some students [are] more or less in shock from the pandemic; some students have illness in the family; some students have economic issues, some students just don’t learn well with online learning only.” other comments stated personal challenges, such as the “stress of living in [the] epicenter of [a] global pandemic” and “my own mental and emotional capacity.”

figure 7. respondent answers to question 6: what other challenges exist in your ability to effectively provide equitable information literacy instruction during this time?

because there was often little consensus among responses, the authors created word clouds for all four qualitative questions (figure 8). each of these questions showed students at the center of instruction librarians’ responses, which is not surprising given their roles and the subject of this survey. the purpose of emergency remote teaching and learning is, at its core, to continue to connect students with resources and to engage them in their learning, even and especially when it is challenging to do so. still, it is meaningful to see students at the heart of these data.

figure 8. word cloud visualization for each qualitative question answer set.

challenges and limitations

many of the challenges encountered while analyzing data had to do with creating meaningful keyword codes for the qualitative survey questions. this coding was challenging because respondents expressed varied experiences and opinions and there was no significant consensus regarding tools used, tool weaknesses, digital literacy gaps, or other challenges.
in contrast, respondents frequently referred to students’ lack of technology and internet access, even when the question at hand did not explicitly address this. these challenges speak both to the varied experiences of and institutional responses to covid-19, as well as perceived lack of tech or internet access among students as a primary barrier to effective emergency remote teaching. further, while some questions signaled a clear answer, others required interpretation. to illustrate, respondents used the term “accessibility” inconsistently. some respondents used this term to refer to accessibility for students with disabilities, and others used it to refer to “availability.” therefore, the authors employed contextual clues to determine meaning. if the meaning remained unclear, these answers were not considered for coding. similarly, respondents didn’t always use the same language to describe the same concepts. for example, a participant noted that “the technology we have is limited to lecturing and answering questions and providing documents and videos online. we don’t have polls enabled . . . .” the authors interpreted this to mean that the technology tools didn’t allow for robust engagement with students, though the respondent didn’t specifically mention the word “engagement.” again, if context or meaning was unclear, those responses were not coded. another challenge occurred in analyzing responses to question 5: what digital literacy gaps have you identified in your students since covid-19 closures? some respondents appeared to be unfamiliar with the term “digital literacy,” even though a definition was provided within the question.
some respondents referred to hardware access, home environment, tech access, or psychological stress rather than explicitly reflecting on digital literacy gaps as included in ala’s task force definition. these responses could indicate either confusion around the definition of digital literacy or, as suggested above, the perception of all these factors being codependent or interrelated. limitations of the study included the design of the survey itself. for example, respondents received a list of tools for questions 1 and 2, which may have meant that they were more likely to select these than to remember other tools that they used and add them to the “other” category accordingly. questions 3 through 6, in contrast, did not include any multiple-choice options, which may have limited the thoroughness of responses. for example, responses to question 3 averaged 2.08 strengths mentioned per respondent. we think it is likely that respondents would have indicated more strengths had they been presented with a list rather than only a free-text box. the authors also did not define the difference between content delivery tools and tools for student engagement in the survey. for this reason, there was some overlap noted in the responses for questions 1 and 2. also, respondents mentioned tools for engagement that were sometimes features of content delivery tools, such as webex whiteboards and lms discussion forums. the vast landscape of tools used meant that our survey could not account for all possible manifestations of technology for content delivery and student engagement.

discussion

questions 1 and 2: technology tools instruction librarians are using to deliver content and engage students

the instruction librarians who answered the survey have widely used technology tools such as libguides and zoom in their library seminars during covid-19.
however, as data show, librarians have also used many other technology tools to create and deliver emergency remote library sessions during covid-19, due perhaps to the wide array of tools available. while libguides and zoom exhibit a high percentage of usage, this result was expected because libguides is a well-known tool used by academic librarians and, according to the company, zoom became more prominent as a tool during covid-19.[25] the relatively low usage of adobe illustrator is also somewhat predictable because this tool not only requires a subscription, but also may have a steeper learning curve than other free graphic editor and design programs. data raised some further questions about the role of information technology (it) departments. are instruction librarians reaching out to their respective universities’ it departments to learn about technology tools available to them and vice versa? are it offices willing and able to provide training via video conferencing if in-person training is not available due to the pandemic? do it departments offer enough promotion to advertise these tools? these questions are not addressed in this manuscript but are important avenues for further research. only six percent of respondents recorded “supported by it” as a strength of the technology tools they were using. this low percentage may appear striking but could be understood under the premise that as the pandemic set in across the united states and instruction librarians rushed to prepare and present online sessions, librarians relied on the tools that were most familiar to them instead of learning a new technology tool. the data from this survey would seem to corroborate this, as so many respondents chose “ease of use” as an important strength.
one interesting detail that is worth addressing about the tools is that respondents mentioned “other” more than they selected the options the authors provided, which may imply that either the authors did not include the most-used tools or that the number and/or variety of tools is so wide that it is difficult to reach a consensus. one wonders whether, had the tools mentioned in “other” been included as part of the listed options, the number of respondents using these tools would be higher.

questions 3 and 4: the strengths and weaknesses of tools as they affect student engagement

the authors wanted to know what tools respondents had used to enhance student engagement. data show that google forms is the tool that most of the respondents have used for this purpose. however, fewer respondents used tools that are purposely designed to increase interaction in online sessions, such as kahoot and mentimeter, which do not even require a fee for using their basic features. respondents’ perceptions of the strengths and weaknesses of tools they have used provided useful information. in terms of tools strengthening student engagement, the responses were not as conclusive, as 40 of 150 respondents found these tools interactive or collaborative, and an even lower number of these respondents thought of these tools as flexible or helpful to enable remote instruction. one could argue that ada capabilities are features that may facilitate student engagement. however, when respondents were asked about the strengths of technology tools they had used, accessibility was not often mentioned as a strength. moreover, data showed that only eight respondents referred to ada problems and three of them voiced concern over captioning capabilities, which is considered a relevant ada feature. there was no mention of alt text for images, screen-readable and software-neutral file formats, or the importance of user ability to change the color and font setting in their devices to see the content.
in fact, only three respondents specifically mentioned issues with videos in terms of their audio quality, lack of auto closed-captioning, and freezing images. respondents noted wide-ranging effects of tool weaknesses on both instruction librarians and students. to illustrate, the weaknesses “time intensive,” “not designed for teaching,” and “no feedback or assessment” likely affected instruction librarians at a personal level as they prepared for and assessed their teaching. in contrast, the weaknesses “ada problems,” “not interactive or engaging,” “difficult to use,” and “makes communication difficult” might primarily impact students. other concerns respondents stated that may influence student engagement included poor bandwidth, which affects internet access and causes connection issues. for example, even if librarians try to improve the video quality in zoom by disabling the higher definition option, or start a session with audio conferencing only, which will decrease the amount of bandwidth needed, students with poor bandwidth may still not be able to engage. therefore, in situations of emergency remote learning, if students lack bandwidth or an appropriate home environment, learning and engaging may become a challenge.

questions 5 and 6: digital literacy gaps and equitable instruction

a number of factors affect the ability to provide equitable library instruction on the librarian side and to engage with equitable library instruction on the students’ side. one of these factors is home environment, including access to computers, good bandwidth, or an appropriate working station. our data show that 15 respondents perceived “home environment” as a challenge when providing equitable library instruction.
in addition, some respondents noticed home environment issues when asked about digital literacy gaps in relation to shared spaces and lack of computers and access. it’s possible that equity issues increased during the covid-19 shutdown, which raises the question of whether there is a correlation between the issues affecting students, librarians, and faculty. data showed that of 139 participants, a little over one-fourth of them considered “tech access from home” and “internet issues” as challenges in the ability to provide equitable library instruction. these challenges, along with “emotional issues,” were perceived to affect not only students but also librarians and faculty. although responses recorded librarians’ perceptions on equity issues, often including their own experiences, data revealed that respondents presumed faculty and students were having similar issues. data also exhibited that respondents perceived other challenges, such as “fewer library sessions” and “student lack of engagement,” that may affect students directly. fewer library sessions are a challenge that may be further addressed forthwith. however, students’ lack of engagement is a difficulty that may require thoughtful outreach, collaboration with mental health offices at the campus level, and reflective and inclusive lecture design. these challenges may have a negative impact on receiving an equitable learning experience. in fact, less commonly acknowledged gaps may, in some ways, be more important than those frequently mentioned. a well-known gap can be addressed because there is consensus that the gap exists and poses a barrier to equity in education. unnoticed or overlooked gaps, in contrast, are more difficult to address but may be no less important as barriers to equitable education. equity issues may also arise as a result of lack of digital literacy skills in students. 
students with higher digital literacy are expected to perform better in an emergency remote library instruction setting, may be more likely to stay attuned to and engaged with the lesson, and may experience less emotional stress because they feel confident. however, as wan ng explains above, those recognized as digital natives may not necessarily have digital literacy, even if they are comfortable with social media tools.[26] data do not tell us the age of students; regardless, digital literacy gaps were detected by respondents. these perceived gaps in digital literacies (evaluating information, communicating online, applying search strategies, using library resources and databases, understanding plagiarism and citation, and using online resources) are important for librarians to address during emergency remote learning. last, the lack of consensus may be explained by the complexity of the concept of digital literacy. it is possible that many of these gaps existed before, but librarians recognized them as new during emergency remote learning. one response illustrates this idea: “the closure has prompted many more students to request help in every step of the digital literacy process. i’m not sure if students typically ask each other, or their professors/instructors. regardless, it’s exposed that not all students know things i’d assumed they did.” whether these gaps are new or not remains unclear, as evidenced by another respondent who stated, “nothing new to the covid era.”

recommendations

these recommendations seek to address some of the issues that arose in the data, especially those regarding equity and emergency remote library instruction.
to illustrate, one respondent summed up the current situation while also posing a question that appears valuable: “not all of our students have the same access to stable technology and internet, nor do they all respond to online teaching strategies in the same ways. how do we create equitable and accessible learning opportunities?” while the authors do not have all the answers, based on the analysis of the data and emerging themes, some recommendations may help instruction librarians move forward through the covid-19 crisis.

technology and equity

the authors realize that a budget is essential for the implementation of recommendations that may reduce both inequitable access to information and lack of digital literacy. nonetheless, the recommendations below intend to offer guidance on ways to improve equitable access, digital literacy, and student engagement during emergency remote library sessions. one external digital barrier for students engaging in emergency remote library sessions was the lack of equipment at home, possibly due to economic hardship. university libraries could provide kits containing a chromebook, webcam, microphone, wi-fi hotspot, and headphones to increase equitable access. access to this equipment may help students feel supported and understood, with a sense of dignity. these offerings should be coordinated with other campus units that may provide similar services, such as student affairs and it departments. likewise, a coordinated marketing and outreach effort at the campus level may enhance the visibility of equipment available for student use. as stated above, “ease of use” rose to the top as the most-frequently mentioned technology tool strength, which is understandable given the many stressors educators and students may be experiencing during covid-19. however, it is important to keep in mind that tools should be “easy to use” not just for librarians and teaching faculty, but especially for students.
nonetheless, given the difficulty of assessing instructional technology and library information literacy sessions right now, it is challenging to know whether students find the technology tools that librarians choose truly “easy to use.” compounding the perception that tools are easy to use is the possibility that tools may not be ada accessible. though the survey did not ask about accessibility explicitly, and while the authors did not vet the tools listed in the survey for their accessibility features, the authors wonder how many tools are fully accessible to all learners. instead of choosing tools for their perceived ease of use, a further recommendation is to move beyond valuing what’s easy to critically reflect on whether tools are fully accessible to students with visual or hearing impairments or learning differences. if the answer is no or unclear, perhaps using basic content delivery tools that are vetted for accessibility features is the better option. it is recommended to follow best practices for using those tools (for example, by referring to guidance from campus it departments). if instruction librarians consider themselves “blended,” or perhaps even so well-versed in technology that a term like “blended librarian” is no longer needed, they should also prioritize flexible, responsive, and intentional use of technology in their lecture design. if a tool that they assumed would be easy to use for all students is proving challenging for some, librarians should have alternative options and extra support at the ready. they may also ask themselves whether use of a technology tool furthers the learning process and outcomes of the course, or if technology is added for its own sake.
in addition, avoiding use of extra tools and technology that do not genuinely enhance lecture goals and priorities may help students avoid stress related to technology, which could further students’ emotional well-being during this fraught time. being clear with students about which tools will be used and for what purpose may help students who would otherwise struggle with layered content delivery and engagement tools. a glossary of these tools, along with when and how they’ll be used and links to technical support, could be a helpful support document for students.

communication and equity

it is worth exploring librarian, student, and faculty communication not explicitly focused on technology. some respondents mentioned outreach and connection challenges that have less to do with technology and more to do with other stressors and limitations. for example, some librarians reported receiving fewer requests for information literacy sessions or library support than usual, and some speculated that this was because of the quick move to emergency remote learning, lack of time to plan, and the possibility that a library session was “extra” and faculty were trying to simplify. there are several ways to address this challenge. librarians can attempt to meet students and faculty where they are by offering multimodal learning opportunities, including both synchronous and asynchronous offerings (zoom meetings, prerecorded videos, tutorials/quizzes, canvas discussion posts, and libguides are a few options). it is also paramount to make sure librarians are reachable at the point-of-need, which may mean extended weekend and evening hours on the virtual ask-a-librarian desk. also imperative is ensuring that virtual services, as well as consultation request links and/or email addresses, are clear and visible to students and faculty on the library’s website.
some survey respondents mentioned that communication with faculty was difficult, and this may have contributed to fewer instruction requests. while it is understandable that faculty may have been less responsive to librarian outreach for a variety of reasons, there are some ways to encourage faculty communications. for example, librarians could provide simple, bulleted lists with updated information on services and offerings, individual attention (focused on specific classes and topics), and options, acknowledging that some faculty will simply not want to share classroom time during emergency remote teaching. librarians can also work to bridge the disconnect between it and their departments by proactively reaching out to learn about best practices not only for technology use, but also for ada accommodations. even when information literacy sessions are requested, faculty may not always share student accommodation needs. librarians can ask for help from it or other units on campus (such as centers for teaching and learning) to make sure that their communication techniques are aligned with inclusive, user-centered approaches to teaching and learning with technology. as professionals in a unique role serving both students and faculty, librarians may also check in on a person-to-person basis with both groups. acknowledging that we are people with mental and physical health needs working together in difficult circumstances is one way of connecting with students and faculty in an authentic way. emergency remote teaching and learning is different from typical remote or online learning, and being clear about that might also help everyone adjust expectations and extend compassion.
professional development and personal support

while emergency remote teaching and learning may not seem like the best time for professional development, it is important to acknowledge that librarians deserve support in navigating this unprecedented time. even as we clearly want to help students who may be especially vulnerable during covid-19, there is a sense of being overwhelmed, and librarians may not always know where to start. while there are online webinars and discussions that provide advice about how to best help students during covid-19, the authors recommend a more specific approach targeting digital literacy gaps and support systems for librarians. in reviewing survey responses to perceived digital literacy gaps and other challenges, it became clear that not all librarians are well-versed in digital literacy concepts. if librarians have time to take one approach to professional development as it relates to instruction and information literacy, the authors recommend learning more about digital literacy competencies and thinking critically about how emergency remote library instruction design can address those competencies and potential gaps. of course, stresses of the pandemic are impacting librarians as well as faculty and students. it is important that we connect with colleagues and support systems during this time. one option might be to form a community with colleagues to determine best practices for use of technology in instruction among other relevant topics (examples at the authors’ library include anti-racist actions and a caregiver’s support group). librarians should also prioritize their own health (mental and physical) and stress management. the recommendations are everywhere but bear repeating: connect with family and friends, exercise, take time away from the computer, and make sure to rest. librarians should be kind to themselves and their colleagues and offer or ask for support when needed.
conclusion
as of spring 2021, the covid-19 pandemic is not yet over. it remains unclear whether and when academic library instruction will return to the old normal. the data collected and analyzed for this paper, as well as the discussion and recommendations, can inform how instruction librarians respond to student needs and challenges as everyone continues to cope with life during emergency remote learning. especially compelling are the data shared about the strengths and weaknesses of technology tools used to enhance student engagement in library instruction. these data provide parameters that may help other instruction librarians make decisions when choosing a technology tool and be prepared to troubleshoot when issues arise. one concerning finding is that digital literacy, as defined by ala’s digital literacy task force, is a subject that may not be widely understood by instructors. although our pool of respondents was small, instruction librarians may need a broader understanding of what digital literacies look like in practice when dealing with emergency remote teaching and a diverse student population. while instruction librarians’ experiences and perceptions are one important piece of the puzzle, especially in acknowledging shared challenges, it is important to recognize that students may have needs, digital literacy or otherwise, that educators are missing. though assessment is difficult right now, reflection and attention to the whole student experience are necessary. working with colleagues on campus to provide technology, including laptop computers and wi-fi hotspots, as well as evaluating our content delivery and engagement tech tools for ada accessibility, are examples of ways that instruction librarians can connect students with unmet needs to resources during this difficult time.
examining instruction librarians’ ongoing response to the pandemic, while challenging, will help libraries become more emergency-responsive and better able to meet the needs of diverse students in the 21st century.

acknowledgement
we would like to thank moria woodruff from the university of colorado boulder writing center for her help revising this manuscript.

appendix a: survey instrument
distance learning during a pandemic: a matter of equity
we are curious to hear about your experiences of library instruction during the abrupt shift to online learning. in particular, we are researching librarians’ use of technology tools for online content delivery and student engagement during covid-19. this survey should take less than ten minutes to complete. your answers will be anonymous. please do not include personally identifiable information. participation in the survey indicates your consent for us to use the data collected in a forthcoming research paper about using online technology tools to teach information literacy or library seminars during covid-19. the survey will be open through sunday, may 24th. thank you for your participation!

q1 what content delivery technology have you used to create your distance learning library sessions during covid-19? select as many as apply.
zoom; microsoft teams; libguides; course management system (e.g., canvas); formative; pear deck; adobe illustrator; snagit; screencast-o-matic; playposit; google hangouts; google classrooms; canva (graphic design tool); other

q2 what technology tools have you used to enhance student engagement in your distance learning library sessions during covid-19? select as many as apply.
padlet; answergarden; kahoot!; mentimeter; flipgrid; slido; socrative; jamboard; pear deck; mural; google drawings; google forms; quizalize; gosoapbox; poll everywhere; yo teach!; other

q3 what are the strengths of the technology tools you’re using right now?

q4 what are the weaknesses of the technology tools you’re using right now?

q5 what digital literacy gaps have you identified in your students since covid-19 closures? ala’s digital literacy task force defines digital literacy as “the ability to use information and communication technologies to find, evaluate, create, and communicate information, requiring both cognitive and technical skills.”

q6 what other challenges exist in your ability to effectively provide equitable library instruction during this time?

appendix b: tools mentioned by three or fewer respondents, question 1, option other
ninety-five respondents answered other to question 1: what content delivery technology have you used to create your synchronous and asynchronous distance learning library sessions during covid-19?
tool name | type of tool | number of respondents
bluejeans | online meetings | 3
google meet | online meetings | 3
jing / techsmith capture | screen capture | 3
blackboard ensemble | video creation | 2
imovie | video editing | 2
guide on the side | interactive tutorials | 2
kapwing | video and image editing | 2
libchat | communications service | 2
piktochart | graphics editing | 2
techsmith relay | video creation | 2
thinglink | multimedia editing | 2
adobe indesign | desktop publishing | 1
adobe photoshop | graphics editing | 1
adobe premiere pro | video editing | 1
amazon chime | communications service | 1
audacity | audio editing | 1
chat (in general) | communications service | 1
clideo | video editing | 1
faststone capture | screen capture | 1
genially | interactive content creation | 1
google sheets | web-based spreadsheets | 1
gotomeeting | online meetings | 1
microsoft bookings | scheduling | 1
microsoft stream | video sharing | 1
powtoons | video creation | 1
pressbooks | content management | 1
prezi video | video creation | 1
qualtrics | surveys | 1
quicktime | multimedia editing | 1
screenflow | video editing and screen capture | 1
springshare libwizard | interactive tutorials and forms | 1
telephone | communications service | 1
videoscribe | animated video creation | 1
vimeo | video sharing | 1
whatsapp | communications service | 1

appendix c: tools mentioned by one respondent, question 2, option other
forty-two respondents answered other to question 2: what technology tools have you used to enhance student engagement in your distance learning library sessions or courses during covid-19? each tool was used by only one respondent.
tool name | type of tool
articulate storyline | interactive e-learning modules
calendly | scheduling
camtasia | video editing and screen recording
canva quizzes | quizzes
google voice | communications service
h5p | programming language for websites
handout | (not a technology tool)
html/css | programming language for websites
knight lab tools | storytelling
lms discussion forums | discussions
microsoft powerpoint | presentation platform
microsoft word | word processor
nearpod | interactive lessons
parlay | discussions
qualtrics | surveys
remind | communications service
speakpipe | communications service
twine | storytelling
voicethread | video, voice, and text commenting
webex whiteboard | drawing tool

endnotes
1 charles hodges et al., “the difference between emergency remote teaching and online learning,” educause review (2020), https://er.educause.edu/articles/2020/3/the-difference-between-emergency-remote-teaching-and-online-learning.
2 hodges et al., “the difference.”
3 jody greene, “how (not) to evaluate teaching during a pandemic,” chronicle of higher education (2020), https://www-chronicle-com.colorado.idm.oclc.org/article/how-not-to-evaluate-teaching/248434.
4 laura czerniewicz, “what we learnt from ‘going online’ during university shutdowns in south africa,” philoned (2020), https://philonedtech.com/what-we-learnt-from-going-online-during-university-shutdowns-in-south-africa/.
5 for scholarship on equity and librarianship see joanne oud, “systematic workplace barriers for academic librarians with disabilities,” college & research libraries 80, no. 2 (2019), https://doi.org/10.5860/crl.80.2.169; amanda l. folk, “reframing information literacy as academic cultural capital: a critical and equity-based foundation for practice, assessment, and scholarship,” college & research libraries 80, no. 5 (2019), https://doi.org/10.5860/crl.80.5.658; scott seaman, carol krismann, and nancy carter, “salary market equity at the university of colorado at boulder libraries: a case study follow-up,” college & research libraries 64, no. 5 (2003), https://doi.org/10.5860/crl.64.5.390; freeda brook, dave ellenwood, and althea eannace lazzaro, “in pursuit of antiracist social justice: denaturalizing whiteness in the academic library,” library trends 64, no. 2 (2015), https://doi.org/10.1353/lib.2015.0048; isabel gonzalez-smith, juleah swanson, and azusa tanaka, “unpacking identity: racial, ethnic, and professional identity and academic librarians of color,” in the librarian stereotype: deconstructing perceptions and presentations of information work, ed. nicole pagowsky and miriam rigby (chicago: association of college and research libraries, 2014), 149–73.
6 tom riedel and paul betty, “real time with the librarian: using web conferencing software to connect to distance students,” journal of library & information services in distance learning 7, no. 1–2 (2013): 101, https://doi.org/10.1080/1533290x.2012.705616.
7 keith shaw, “colleges expand vpn capacity, conferencing to answer covid-19,” network world (online) (2020): 1.
8 monica anderson and andrew perrin, “nearly one-in-five teens can’t always finish their homework because of the digital divide,” pew research center fact tank news in the numbers, october 26, 2018, https://www.pewresearch.org/fact-tank/2018/10/26/nearly-one-in-five-teens-cant-always-finish-their-homework-because-of-the-digital-divide/.
9 julie arnold lietzau and barbara j. mann, “breaking out of the asynchronous box: using web conferencing in distance learning,” journal of library & information services in distance learning 3, no. 3–4 (2009): 113, https://doi.org/10.1080/15332900903375291.
10 aek phakiti, david hirsh, and lindy woodrow, “it’s not only english: effects of other individual factors on english language learning and academic learning of esl international students in australia,” journal of research in international education 12, no. 3 (2013): 248.
11 t. v. semenova and l. m. rudakova, “barriers to taking massive open online courses,” russian education & society 58, no. 3 (2016): 242, https://doi.org/10.1080/10609393.2016.1242992.
12 xinghua wang, seng chee tan, and lu li, “technostress in university students’ technology-enhanced learning: an investigation from multidimensional person-environment misfit,” computers in human behavior 105 (2020): 2, https://doi.org/10.1016/j.chb.2019.106208.
13 “digital literacy,” ala literacy clearinghouse, accessed may 16, 2021, https://literacy.ala.org/digital-literacy/.
14 steven j. bell and john shank, “the blended librarian: a blueprint for redefining the teaching and learning role of academic librarians,” college & research libraries news 65, no. 7 (2004): 374, https://doi.org/10.5860/crln.65.7.7297.
15 vanessa w. vongkulluksn, kui xie, and margaret a. bowman, “the role of value on teachers’ internalization of external barriers and externalization of personal beliefs for classroom technology integration,” computers & education 118 (2018): 79, https://doi.org/10.1016/j.compedu.2017.11.009.
16 jesper aagaard, “breaking down barriers: the ambivalent nature of technologies in the classroom,” new media & society 19, no. 7 (2017): 1138, https://doi.org/10.1177/1461444816631505.
17 wan ng, “can we teach digital natives digital literacy?” computers & education 59, no. 3 (2012): 1065, https://doi.org/10.1016/j.compedu.2012.04.016.
18 ng, “can we teach,” 1071–72.
19 ng, “can we teach,” 1072.
20 ellen helsper and rebecca eynon, “digital natives: where is the evidence?,” british educational research journal 36, no. 3 (2010): 515, https://doi.org/10.1080/01411920902989227.
21 david ellis, “using padlet to increase student engagement in lectures,” in proceedings of the 14th european conference on e-learning (ecel 2015), ed. amanda jefferies and marija cubric (reading, uk: academic conferences and publishing international limited): 195.
22 seyed abdollah shahrokni, “playposit: using interactive videos in language education,” teaching english with technology 18, no. 1 (2018): 106.
23 jurgen schulte et al., “shaping the future of academic libraries: authentic learning for the next generation,” college & research libraries 79, no. 5 (2018): 688, https://doi.org/10.5860/crl.79.5.685.
24 chiara faggiolani, “perceived identity: applying grounded theory in libraries,” jlis.it: italian journal of library and information science 2, no. 1 (2011): 4592, https://doi.org/10.4403/jlis.it-4592.
25 “over 700 universities and colleges now use zoom!” zoom blog, july 15, 2013, https://blog.zoom.us/over-700-universities-and-colleges-now-use-zoom-video-conferencing/.
26 ng, “can we teach,” 1071–72.
tv white spaces in public libraries: a primer
kristen radsliff rebmann, emmanuel edward te, and donald means
information technology and libraries | march 2017

abstract
tv white space (tvws) represents one new wireless communication technology that has the potential to improve internet access and inclusion. this primer describes tvws technology as a viable, long-term access solution for the benefit of public libraries and their communities, especially for underserved populations. discussion focuses first on providing a brief overview of the digital divide and the emerging role of public libraries as internet access providers. next, a basic description of tvws and its features is provided, focusing on key aspects of the technology relevant to libraries as community anchor institutions.
several tvws implementations are then described, with discussion of how tvws was deployed in several public libraries. finally, consideration is given to first steps that library organizations must take when contemplating new tvws implementations supportive of wi-fi applications and crisis response planning.

introduction
tens of millions of people rely wholly or in part on libraries to provide access to the internet. many lack access to the federal communications commission (fcc) recommended standard of 25 mbps (megabits per second) download speed and 3 mbps upload speed.1 though the fcc reclassified high-speed internet in 2015 as a public utility under title ii of the telecommunications act to ensure that broadband networks are “fast, fair, and open,”2 the “digital divide” remains. one in four community members does not have access to the internet at home. accounting for age and education level, households with the lowest median incomes have service adoption rates of around 50%, compared with rates of 80 to 90% for those with higher incomes.3 a recent pew research center survey on home broadband adoption found that 43% of those surveyed reported cost being their main reason for non-adoption.4 individuals with low quality or no access are more likely to be digitally disadvantaged, tend to use library computers more frequently, and are less equipped to interact and compete economically as more services and application processes move online.5

kristen radsliff rebmann (kristen.rebmann@sjsu.edu) is associate professor, san jose state university school of information, san jose, ca.
emmanuel edward te (emmanueledward.te@sjsu.edu) is a graduate student, san jose state university school of information, san jose, ca.
donald means (don@digitalvillage.com) is co-founder and principal of digital village associates, sausalito, ca.
tv white spaces in public libraries: a primer | rebmann, te, and means | https://doi.org/10.6017/ital.v36i1.9720

this article highlights tv white space (tvws), a new wireless communication technology with the potential to assist libraries in addressing digital access and inclusion issues. this primer first provides a brief overview of the digital divide and the emerging role of public libraries as internet access providers, highlighting the need for cost-efficient, technological solutions. we then provide a basic description of tvws and its features, focusing on key aspects of the technology relevant to libraries as community anchor institutions. several tvws implementations are described with discussion of how tvws was set up in several public libraries. finally, we consider the first steps library organizations must take when contemplating new implementations, including everyday applications and crisis response planning.

digital access and inclusion
the term “digital divide” describes the gap between people who can easily access and use technology and the internet, and those who cannot.6 as kinney observes, “there has not been one single digital divide, but rather a series of divides that attend each new technology.”7 digital divides are exacerbated by various factors including socioeconomic status, education, geography, age, ability, language, and especially availability and quality.8 in recent years, the language describing this issue has changed, but the inequalities stay consistent and widen among different dimensions with each emerging technology. the most recent public policy term “digital inclusion” promotes digital literacy efforts for unserved and underserved populations.9 the progression from the term “digital divide” to “digital inclusion” represents a shift in focus from issues of access exclusively toward contexts and quality of participation and usage.
along these lines, the language of digital inclusion reframes the issue by making visible that simply focusing on internet access can obscure the fact that divides associated with quality and effectiveness remain.10 in response to the digital divide, public libraries have become the “unofficial” providers of internet access, stemming from libraries’ access to broadband infrastructure, maintenance of publicly available computers, and services providing assistance and training.11 a pew research center survey on perceptions of libraries found that most respondents reported viewing public libraries as important parts of their communities, providing resources and assisting in decisions regarding what information to trust.12 however, many public libraries are facing an “infrastructure plateau” of internet access due to too few computer workstations and broadband connection speeds too slow to support a growing number of users,13 on top of insufficient funding, physical space, and staffing.14 previous surveys show that although public libraries are connected to the internet and provide public access workstations and wireless access, nearly 50% of public libraries only offer wireless access that shares the same bandwidth as their workstations.15 this increased usage strains existing network connections and infrastructure, resulting in slower connections for everyone connected to the public library’s network. many public libraries cannot accommodate more workstations, support the power requirements of both workstations and patrons’ laptops, or afford workstation upgrades and bandwidth increases to move past their insufficient connectivity speeds. libraries often lack the it skills, time, and funds to upgrade their infrastructure.16 typical wireless access via wi-fi is limited to distances within library buildings (sometimes extending to exterior spaces) and is available only during operating hours.
despite these challenges, public libraries continually provide access and “at-the-point-of-need” training and support for their patrons, especially for those who do not have easy access to the internet and computers.17 subsidized by federal funding, libraries represent key access providers and technology trainers for the public without internet access.18 the fcc classifies libraries as “community anchor institutions” (cais), organizations that “facilitate greater use of broadband by vulnerable populations, including low-income, the unemployed, and the aged.”19 recent surveys show that users have a positive view of libraries, which provide opportunities to spend time in a safe space, pursue learning, and promote a sense of community. librarians offer internet skills training programs more often than other community organizations, though around 75% of the time this training occurs informally.20 in particular, 29% of respondents to a library use survey reported going to libraries to use computers, the internet, or the wi-fi network; 7% have also reported using libraries’ wi-fi signals outside when libraries are closed.21 the majority of these users are more likely to be young, black, female, and lower income, utilizing library technology resources for school or work (61%), checking email or sending texts (53%), finding health information (38%), and taking online courses or completing certifications (26%).22 public libraries are already exploring creative approaches to providing internet access for these underserved communities. the mobile hotspot lending programs in public library systems in new york city and kansas city are just two examples.23 yet libraries must do more by supporting innovation and providing leadership by partnering with other community organizations and their stakeholders to enhance resilience in addressing access and inclusion.
the emergence of tvws wireless technology presents an opportunity for libraries to explore expanding the reach of their wireless signals beyond library buildings and extend 24/7 library wi-fi availability to community spaces such as subsidized housing, schools, clinics, parks, senior centers, and museums.

tvws basics
tv white space (tvws) refers to the unoccupied portions of spectrum in the vhf/uhf terrestrial television frequency bands.24 television broadcast frequency allocations traditionally assumed that tv station transmissions operating at high power needed wide spectrum separation to prevent interference between broadcasting channels, which led to the specific spectrum allocation of these frequency “guard bands.”25 research discovered that low-power devices can operate within these spaces, which led the fcc to field test tvws applications to wireless communications and (ultimately) promote tvws neutrality.26 in 2015, the fcc made a portion of these very valuable tvws bands of spectrum available for open, shared public use, like wi-fi. yet, unlike wi-fi, with a reach measured in 10s of meters, the range of tvws is measured in 100s or even 1000s of meters. tvws has good propagation characteristics, which makes it an extremely valuable license-exempt radio spectrum.27 it is a relatively stable frequency that does not change over time, allowing for spectrum availability estimates to remain reliable and valid, which in turn promotes its various applications.28 radio spectrum is considered a “common heritage of humanity,”29 as radio waves “do not respect national borders.”30 the fcc recently made a portion of these tvws bands of spectrum available for open, shared public use.31 tvws availability and application are contextual and dependent on many key factors.
availability is influenced by frequency (the idle channels purposely planned in tv bands, varying across regions), deployment (the height and location of the tvws transmit antenna and its installation sites in relation to nearby surrounding tv broadcasting reception), space and distance (geographical areas outside the current planned tv coverage, including no present broadcasting signals), and time (off-air availability of licensed broadcasting transmitters during specific periods of time, subject to change by the broadcaster).32 as tvws existed as fragmented “safety margins” between broadcast services, tvws is typically more abundant in rural areas that have less broadcast coverage and in larger contiguous blocks rather than in highly dense urban areas.33 assigned spectrum is not always used efficiently and effectively by licensees, and exclusive or nonexclusive sharing can alleviate pressure on these resources.34 this “spectrum crunch” of the inefficient use of scarce spectrum resources can be alleviated with dynamic spectrum access (dsa) and spectrum sharing. tvws availability is small where digital television has been deployed, with the potentials for aggregate interference (from tvws users in relation to primary tv service) and self-interference (within the tvws network), which may lead to a “mismatch situation” where there is high demand for bandwidth but very low tvws bandwidth supply.35 as most spectrum frequencies have been organized through some form of exclusive access in which only the licensee can use the specific spectrum, technologies such as cognitive radios can enable new modes of spectrum access, supporting autonomous, self-configuring, self-planning networks which rely on up-to-date tvws availability databases. 
the limited distribution (in many areas) of basic broadband infrastructure and relatively high cost of access often prevent individuals with lower incomes from participating in the digital revolution of information access and its opportunities.36 despite these challenges to broadband availability, tvws excels in areas with low broadband coverage. rural regions possess greater frequency availability due to lower density of spectrum licensing. in comparison to other frequencies operating higher up on the spectrum band, tvws does not require direct line-of-sight between devices for operation, and has lower deployment costs. equipment market costs are comparable to wi-fi equipment currently on the market.37 importantly, tvws can address access and inclusion by having relatively low start-up costs and no ongoing service fees. as a public resource, it can work with existing services to create new, potentially mobile connections to the internet that ensure the continuation of vital services in the event of service interruptions.38 in urban areas with fewer channels available, new efficient spectrum sharing policies will be necessary.
assigned spectrum is not always used efficiently and effectively by licensees, and exclusive or non-exclusive sharing or “recycling” of bands for more effective spectrum use by multiple parties with changing spectrum needs can alleviate pressure on these resources.39

tvws for public libraries
tvws is a viable medium for applications including internet access, content distribution within a given location, tracking (people, animals, and assets), task automation, and public safety and security,40 as well as remote patient monitoring and other telemedicine applications.41 tvws complements existing networks that use other parts of the spectrum for access points, mobile communications, and home media devices.42 analyses of a recent digital inclusion survey suggest that technology upgrades can have significant impact on the ability of libraries to expand programs and services.43 as community anchor institutions (cais), public libraries can use tvws systems to expand and improve access to their services for their users, especially for underserved populations. library-led collaborations to deploy tvws networks in other cais and public spaces have numerous benefits. in conjunction with building-centered wi-fi, tvws can redistribute network users from congested library spaces to other community sites, thereby distributing network usage across the community. from an existing broadband connection, libraries can extend their networks of internet access strategically across their communities. yet, unlike networks which solely use limited-range wi-fi, far-reaching tvws can improve the coverage and inclusion of patrons in accessing library programs, services and the broader internet.44 the portability of the access points allows libraries to extend their reach by providing wireless connections in the short term, for cultural or civic events like fairs, markets, or concerts, and in the long term, for use at popular public areas.
recent tvws pilot installations have proven to be very stable in kansas, colorado, mississippi, and delaware. the tvws project at manhattan public library (kansas) began in fall 2013. though there were a few delays in the installation and testing process, the tvws equipment was successfully implemented and welcomed by the community in early 2014. it staff report that their remote locations have shown that this library service fills a community need, especially for underserved populations.45 delta county libraries (colorado) are conducting trials with two public hotspots to support “guest” access and potentially provide library patrons with more bandwidth access.46 tvws implementations in the pascagoula school district (mississippi)47 and delaware public libraries48 show successful initial pilot usage in providing wireless internet service directly to community-distributed access points. though there are contextual differences across these sites, the strength of public libraries as cais providing internet access via tvws systems is evident and promising.

first steps

any library can take the initiative in setting up a tvws network on its own. the first step is to assess availability of spectrum in the library’s geographic location. access to tvws frequencies is free and requires no subscription fees other than the initial equipment investment. public databases of tvws availability are easily accessible and have been tested by the fcc since 2011;49 google has also posted its own spectrum database.50 from this setup, the library gains access to public tvws frequencies by which it can broadcast and receive internet connections from paired tvws-enabled remote hotspots. 
once it is determined that spectrum (that is, one or more channels) is available in the desired area, libraries can then explore how their current broadband and wireless connections might be expanded to include several community spaces where internet access is needed. next, the library works with a tvws equipment supplier to design and install a tvws network consisting of a base station that is integrated with their wired connection to the internet. finally, the library places tvws-enabled remote hotspots in (previously identified) community-based spaces where wi-fi access is needed by underserved populations. given a high-quality backhaul (i.e., a high-speed fiber optic connection), tvws can extend that connection outward from the library with a signal that propagates through and around multiple barriers and geographical features and is up to 10 times stronger than current wi-fi. depending on the context (geographical features, tvws availabilities, etc.), hotspots can be installed up to six miles (10 km) away and do not require line-of-sight between the base station and hotspots. this reach is superior to that of current wi-fi networks, which only cover patrons in the immediate vicinity of the library. these tvws remote hotspots can also be easily (and strategically) moved to support occasional community needs (such as neighborhood-wide or city events) or in response to crisis situations. 
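as a rough planning aid for the hotspot placement step, the following sketch filters candidate community sites by great-circle distance from the base station. it is an illustration only: the coordinates and site names are hypothetical, the 10 km cutoff is simply the upper range quoted above, and the check does not model terrain, channel availability, or equipment, all of which affect real coverage.

```python
import math

# Upper range quoted in the text; actual reach varies by site (assumption).
MAX_RANGE_KM = 10.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    earth_radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to guard against tiny floating-point overshoot above 1.0.
    return 2 * earth_radius_km * math.asin(math.sqrt(min(1.0, a)))


def sites_in_range(base, candidates, max_km=MAX_RANGE_KM):
    """Return (name, distance_km) pairs within max_km of base, nearest first."""
    in_range = []
    for name, (lat, lon) in candidates.items():
        distance = haversine_km(base[0], base[1], lat, lon)
        if distance <= max_km:
            in_range.append((name, distance))
    return sorted(in_range, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Hypothetical base station (the library) and candidate hotspot sites.
    library = (39.0, -96.0)
    candidate_sites = {
        "community center": (39.05, -96.0),
        "fairgrounds": (39.2, -96.0),
    }
    for name, distance in sites_in_range(library, candidate_sites):
        print(f"{name}: {distance:.1f} km from the base station")
```

a distance check like this only narrows the list of candidate sites; a site survey with the equipment supplier would still be needed before installation.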
tvws, libraries, and emergency response

public libraries provide leadership as “ready access point, first choice, first refuge, and last resort” for community services in everyday matters and in emergencies.51 they have assisted residents in relief efforts during hurricanes katrina and rita, and other natural and man-made disasters.52

…the provision of access to computers and the internet was a wholly unique and immeasurably important role for public libraries… the infrastructure can be a tremendous asset in times of emergencies, and should be incorporated into community plans.53

they have likewise provided immediate and long-term assistance to communities and aid workers, providing physical space for recovery operations for emergency agencies, communication technologies, and emotional support for the community. in previous library internet usage surveys, nearly one-third of libraries reported that their computers and internet services would be used by the public in emergencies to access relief services and benefits.54 such activities include finding and communicating with family and friends, completing online fema forms and insurance claims, and checking news sites for information about their affected homes.55 yet, despite the admirable and successful efforts of many public libraries, their infrastructures are not always built to meet the increased demand of user needs and e-government services in emergency contexts.56 jaeger, shneiderman, fleischmann, preece, qu, and wu propose the concept of community response grids (crgs), which utilize the internet and mobile communications devices so that emergency responders and residents in a disaster area can communicate and coordinate accurate, appropriate responses.57 this concept relies on social networks, both in person and online, to enable residents and emergency responders to work together in a multi-directional communication scheme. 
crgs provide residents tailored, localized information and a means to report pertinent disaster-related information to emergency responders, who in turn can synthesize and analyze submitted information and act accordingly.58 due to their existing role as community anchor institutions (cais), public libraries are uniquely positioned for crg involvement. libraries can assist in facilitation of internet access with portable tvws network connection points. by virtue of their portability, tvws hotspots can provide essential digital access in times of crisis by moving along with their affected populations. emergency operations and communications in a crisis occur throughout networks composed of various technologies. information management before, during, and after a disaster affects how well a crisis is managed.59 broadband internet can be one access route in the event that phone and radio transmissions are affected, and vice versa, as part of a “mixed media approach” to get messages to those who need them in an emergency.60 yet one must remember that internet communications are double-edged: the internet provides relevant material on demand and near-instant sharing and collaborating, but these very features can compound a crisis with misinformation.61 despite these concerns, the integration of wireless devices and other technologies into a multi-technology, collaborative response system has the potential to solve the problem of existing communication structures that lack coordination and quality control.62 the proliferation of smartphones, laptops, and other portable wireless devices makes such technology ideal for emergency communications, especially because users’ familiarity with their own devices will help them navigate crg communications while under stress.63

conclusion

supporting internet access and inclusion in public libraries and having equal, affordable, and available access to information is a necessary component of bridging the digital divide. 
technology has become “an irreducible component of modern life, and its presence and use has significant impact on an individuals’ ability to fully engage in society.”64 as cohron argues, this principle represents more than providing people with internet access: it is about “leveling the playing field in regards to information diffusion. the internet is such a prominent utility in peoples’ lives that we, as a society, cannot afford for citizens to go without.”65 broadband access is the first step; digital literacy training is also a necessity. access alone is not enough to ensure quality and effective use, however, as the digital divide is representative of broader social inequalities that computer and internet access cannot fully remedy.66 this is a complex problem that requires a multi-faceted solution. as kinney states, “the digital divide is a moving target, and new divides open up with new technologies. libraries help bridge some inequities more than others, and substantial disparities exist among library systems.”67 internet access also becomes a necessity when the internet is to play a role in emergency communications.68 it is problematic to promote public libraries as the solution to digital divide issues while they face cuts to funding. policy makers, community advocates, and the community members themselves are stakeholders in the success of their communities, and must also take responsibility for access and inclusion via public libraries.69 as public agencies automate to increase equality and save money, they exacerbate digital divides by excluding those without access. suggesting that community members simply visit the library to ensure access to public services places additional pressure on libraries, yet these efforts may go unsupported and unacknowledged. 
public libraries are already valuable community access points to resources, especially in emergencies, though many suffer from a lack of concerted disaster planning. along similar lines, many libraries are ill-equipped to accommodate the bandwidth needs of growing and oftentimes sparsely connected populations. as communications and government services move increasingly online, it becomes imperative to build strong, cost-effective information infrastructures. tvws connections can arguably help in breaking down the barriers that challenge ubiquitous access and inclusion. tvws-enabled remote access points in daily use around communities are ideally situated to provide everyday wi-fi and for rapid redeployment to damaged areas (as pop-up hotspots) to provide essential communication and information resources in times of crisis. in short, tvws can augment the technological infrastructure of public libraries toward further developing their roles as cais and leaders serving their communities well into the future.

references

1. wireline competition bureau, “2016 broadband progress report,” federal communications commission, january 29, 2016, https://www.fcc.gov/reports-research/reports/broadband-progress-reports/2016-broadband-progress-report.
2. office of chairman wheeler, “fcc adopts strong, sustainable rules to protect the open internet,” federal communications commission, february 26, 2015, https://apps.fcc.gov/edocs_public/attachmatch/doc-332260a1.pdf.
3. “here's what the digital divide looks like in the united states,” the white house, july 15, 2015, https://www.whitehouse.gov/share/heres-what-digital-divide-looks-united-states.
4. john b. horrigan and maeve duggan, “home broadband 2015,” pew research center, december 21, 2015, http://www.pewinternet.org/files/2015/12/broadband-adoption-full.pdf. this 43% is further divided between 33% reporting the monthly subscription cost as their main reason, while the other 10% report the expensive cost of a computer as their reason for non-adoption.
5. bo kinney, “the internet, public libraries, and the digital divide,” public library quarterly 29, no. 2 (2010): 104-161, https://doi.org/10.1080/01616841003779718.
6. madalyn cohron, “the continuing digital divide in the united states,” the serials librarian 69, no. 1 (2015): 77-86, https://doi.org/10.1080/0361526x.2015.1036195.
7. kinney, “the internet, public libraries, and the digital divide.”
8. paul t. jaeger, john carlo bertot, kim m. thompson, sarah m. katz, and elizabeth j. decoster, “the intersection of public policy and public access: digital divides, digital literacy, digital inclusion, and public libraries,” public library quarterly 31, no. 1 (2012): 1-20, https://doi.org/10.1080/01616846.2012.654728.
9. brian real, john carlo bertot, and paul t. jaeger, “rural public libraries and digital inclusion: issues and challenges,” information technology and libraries 33, no. 1 (2014): 6-24, https://doi.org/10.6017/ital.v33i1.5141.
10. jaeger et al., “the intersection of public policy and public access.”
11. john carlo bertot, paul t. jaeger, lesley a. langa, and charles r. mcclure, “public access computing and internet access in public libraries: the role of public libraries in e-government and emergency situations,” first monday 11, no. 9 (2006), https://doi.org/10.5210/fm.v11i9.1392.
12. john b. horrigan, “libraries 2016,” pew research center, september 9, 2016, http://www.pewinternet.org/2016/09/09/libraries-2016/.
13. real et al., “rural public libraries and digital inclusion.”
14. john carlo bertot, charles r. mcclure, and paul t. jaeger, “the impacts of free public internet access on public library patrons and communities,” library quarterly 78, no. 3 (2008): 285-301, https://doi.org/10.1086/588445.
15. charles r. mcclure, paul t. jaeger, and john carlo bertot, “the looming infrastructure plateau? space, funding, connection speed, and the ability of public libraries to meet the demand for free internet access,” first monday 12, no. 12 (2007), https://doi.org/10.5210/fm.v12i12.2017.
16. ibid.
17. bertot et al., “public access computing and internet access in public libraries.”
18. ibid.; jaeger et al., “the intersection of public policy and public access.”
19. wireline competition bureau, “wcb cost model virtual workshop 2012 community anchor institutions,” federal communications commission, june 1, 2012, https://www.fcc.gov/news-events/blog/2012/06/01/wcb-cost-model-virtual-workshop-2012-community-anchor-institutions.
20. jennifer koerber, “ala and ipac analyze digital inclusion survey,” library journal 141, no. 1 (2016): 24-26.
21. horrigan, “libraries 2016.”
22. ibid.
23. timothy inklebarger, “bridging the tech gap,” american libraries, september 11, 2015, https://americanlibrariesmagazine.org/2015/09/11/bridging-tech-gap-wi-fi-lending.
24. andrew stirling, “white spaces – the new wi-fi?,” international journal of digital television 1, no. 1 (2010): 69-83, https://doi.org/10.1386/jdtv.1.1.69/1; cristian gomez, “tv white spaces: managing spaces or better managing inefficiencies?,” in tv white spaces a pragmatic approach, eds. ermanno pietrosemoli and marco zennaro (trieste: abdus salam international centre for theoretical physics t/ict4d lab, 2013), 67-77.
25. steve song, “spectrum and development,” in tv white spaces a pragmatic approach, eds. ermanno pietrosemoli and marco zennaro (trieste: abdus salam international centre for theoretical physics t/ict4d lab, 2013), 35-40.
26. robert horvitz, “geo-database management of white space vs. open spectrum,” in tv white spaces a pragmatic approach, eds. ermanno pietrosemoli and marco zennaro (trieste: abdus salam international centre for theoretical physics t/ict4d lab, 2013), 7-17.
27. julie knapp, “fcc announces public testing of first television white spaces database,” federal communications commission, september 14, 2011, https://www.fcc.gov/news-events/blog/2011/09/14/fcc-announces-public-testing-first-television-white-spaces-database.
28. horvitz, “geo-database management of white space vs. open spectrum.”
29. ryszard strużak and dariusz więcek, “regulatory issues for tv white spaces,” in tv white spaces a pragmatic approach, eds. ermanno pietrosemoli and marco zennaro (trieste: abdus salam international centre for theoretical physics t/ict4d lab, 2013), 19-34.
30. horvitz, “geo-database management of white space vs. open spectrum,” 8.
31. engineering & technology bureau, “fcc adopts rules for unlicensed services in tv and 600 mhz bands,” federal communications commission, august 11, 2015, https://apps.fcc.gov/edocs_public/attachmatch/fcc-15-99a1_rcd.pdf.
32. gomez, “tv white spaces: managing spaces or better managing inefficiencies?,” 68.
33. stirling, “white spaces – the new wi-fi?.”
34. linda e. doyle, “cognitive radio and africa,” in tv white spaces a pragmatic approach, eds. ermanno pietrosemoli and marco zennaro (trieste: abdus salam international centre for theoretical physics t/ict4d lab, 2013), 109-119.
35. gomez, “tv white spaces: managing spaces or better managing inefficiencies?,” 72.
36. mike jensen, “the role of tv white spaces and dynamic spectrum in helping to improve internet access in africa and other developing regions,” in tv white spaces a pragmatic approach, eds. ermanno pietrosemoli and marco zennaro (trieste: abdus salam international centre for theoretical physics t/ict4d lab, 2013), 83-89.
37. song, “spectrum and development.”
38. ibid.
39. doyle, “cognitive radio and africa,” 113.
40. stirling, “white spaces – the new wi-fi?.”
41. afton chavez, ryan littman-quinn, kagiso ndlovu, and carrie l. kovarik, “using tv white space spectrum to practice telemedicine: a promising technology to enhance broadband internet connectivity within healthcare facilities in rural regions of developing countries,” journal of telemedicine and telecare 22, no. 4 (2015): 260-263, https://doi.org/10.1177/1357633x15595324.
42. stirling, “white spaces – the new wi-fi?.”
43. koerber, “ala and ipac analyze digital inclusion survey.”
44. chavez et al., “using tv white space spectrum to practice telemedicine.”
45. kerry ingersoll, june 22, 2015, google+ comment to the gigabit libraries network, https://plus.google.com/107631107756352079114/posts/l4y8ci8sg5y.
46. delta county libraries, “super wi-fi pilot,” accessed november 1, 2016, http://www.deltalibraries.org/super-wi-fi-pilot/.
47. pascagoula tv white spaces facebook group, accessed november 1, 2016, https://www.facebook.com/psdtvws/.
48. “delaware libraries white space pilot update, january 2015,” accessed november 1, 2016, http://lib.de.us/files/2015/01/delaware-libraries-white-space-pilot-update-jan-2015.pdf.
49. knapp, “fcc announces public testing of first television white spaces database.”
50. see https://www.google.com/get/spectrumdatabase/.
51. bertot et al., “public access computing and internet access in public libraries.”
52. bertot et al., “the impacts of free public internet access.” see also horrigan, “libraries 2016.”
53. paul t. jaeger, lesley a. langa, charles r. mcclure, and john carlo bertot, “the 2004 and 2005 gulf coast hurricanes: evolving roles and lessons learned for public libraries in disaster preparedness and community services,” public library quarterly 25, no. 3/4 (2007): 199-214.
54. ibid.
55. bertot et al., “public access computing and internet access in public libraries.”
56. ibid.
57. paul t. jaeger, ben shneiderman, kenneth r. fleischmann, jennifer preece, yan qu, and philip fei wu, “community response grids: e-government, social networks, and effective emergency management,” telecommunications policy 31 (2007): 592-604, https://doi.org/10.1016/j.telpol.2007.07.008.
58. ibid., 595.
59. laurie putnam, “by choice or by chance: how the internet is used to prepare for, manage, and share information about emergencies,” first monday 7, no. 11 (2002), https://doi.org/10.5210/fm.v7i11.1007.
60. ibid.
61. ibid.
62. jaeger et al., “community response grids,” 598. jaeger et al. describe how the internet combines the best of one-to-one, one-to-many, many-to-one, and many-to-many in terms of the flow and quality of information. one-to-one communication is slow; many-to-one only benefits the central network, while outsiders reporting emergencies do not learn what others are reporting; one-to-many is inefficient, limited, and assumes the broadcaster has the appropriate information and can get it to those that need it most; many-to-many can create “information overload” of questionable content.
63. ibid., 599.
64. jaeger et al., “the intersection of public policy and public access,” 3.
65. cohron, “the continuing digital divide in the united states,” 84.
66. kinney, “the internet, public libraries, and the digital divide,” 120.
67. ibid., 148.
68. jaeger et al., “community response grids,” 599.
69. bertot et al., “the impacts of free public internet access,” 299.

324 journal of library automation vol. 7/4 december 1974

book reviews

current awareness and the chemist, a study of the use of ca condensates by chemists, by elizabeth e. duncan. metuchen, n.j.: scarecrow press, 1972. 150p. $5.00.

this book starts with a five-page foreword by allen kent entitled "kwic indexes have come a long way-or have they?" kent is always interesting but when one detects that his foreword is becoming almost an apologia, one wonders just what is to come. 
the remainder of the book (apart from the index) appears already to have been presented as dr. duncan's ph.d. thesis at the university of pittsburgh. the first two chapters are the usual sort of stuff, taking us from alexandria in the third century to columbus, ohio in 1970, with undistinguished reviews of user studies and the history of the chemical abstracts service. the remaining sixty-four pages of text report and discuss a study of the use of ca condensates by quite a small sample of academic and industrial chemists in the pittsburgh area. the objective appears to have been to compare profile hits with periodical holdings and interlibrary loan requests at the client's library so that a decision model for the acquisition of periodicals could be developed. on the author's own admission, this objective was not achieved. a certain amount of data is presented but it is difficult to draw many conclusions from it, other than the fact that chemists do not appear to follow up the majority of profile hits that they receive nor do they use the current issues of chemical abstracts very frequently. it is difficult to understand why this material was published in book form. it could have been condensed to one or possibly two papers for j.chem.doc. or perhaps even left for the really diligent seeker to find on the shelves of university microfilms-but, as the old testament scribe bemoaned, "of making many books there is no end." at the bottom of page 118 a reference is made to the paper by abbott et al. in aslib proceedings (feb. 1968); at the top of page 119 the same paper's date is given as january 1968. other errors are less obvious, but one really questions whether the provision of a short foreword and an index makes even a good thesis worth publishing in hard covers.

r. t. bottle
the city university
london, u.k.

computer-based reference service, by m. lorraine mathies and peter g. watson. chicago: american library assn., 1973. 200p. $9.95. 
the archetypal title and model for all works of explication is ....... without tears. lorraine mathies and peter watson have attempted the praiseworthy task of explaining computer-produced indexes to the ordinary reference librarian, but for a number of reasons, some of them probably beyond the control of the authors, the tears will remain. perhaps one difficulty is that this book was, in its beginnings at least, the product of a committee. back in 1968 the information retrieval committee of the reference services division of the ala wanted to present to "working reference librarians the essentials of the reference potential of computers and the machine-readable data they produce" (p.xxix). the proposal worked its way (not untouched, of course) through several other groups and eventually resulted in a preconference workshop on computer-based reference service being given at the dallas convention of 1971. the present book is based on the tutor's manual which mathies and watson prepared for that workshop but incorporates revisions suggested by the ala publishing services as well as changes initiated by the authors themselves. with so many people getting into the planning act, it is not surprising that the various parts of the book should end up by working at cross purposes to each other. unfortunately, the principal conflicts come at just those points where a volume of exposition needs to be most definite and precise: just what is the book trying to do and for whom? at the original workshop, the eric data base was chosen as a "model system" since educational terminology was more likely to be understood than that of the sciences. and because the participants were to learn by doing, they were told a great deal about eric so as to be able to "practice" on it. the trouble is that these objectives do not translate well from workshop to print. the details about eric, which may have been necessary as tutors' instructions, seem misplaced in book form. 
almost half the present book is devoted to a laborious explanation of how eric works and this is a great deal more than most workaday reference librarians will want to know about it. moreover, it is no longer clear whether mathies and watson aim to train "producers" or "consumers." the welter of detail suggests that they expect their readers to learn hereby to construct profiles and to program searches but it is highly doubtful that skills of this kind can or should be imparted on a "teach yourself" basis. once mathies and watson leave eric behind, they seem on surer ground. part ii (computer searching: principles and strategies) begins with a fairly routine chapter on binary numeration which is perhaps unnecessary since this material is easily available elsewhere. however, the section quickly moves on to an excellent explanation of boolean logic and weighting, describes their application in the formulation of search strategies, and ends with an admirably succinct and demystifying account of how one evaluates the output (principles of relevance and recall). the reader might well have been better served if the book had indeed begun with this part. the last section (part iii: other machine readable data bases) is also very useful, particularly for the "critical bibliography" (p.153) in which the authors describe and evaluate ten of the major bibliographic data bases. this critical bibliography is apparently a first of its kind, which makes the authors' perceptive and frank comments all the more welcome. part iii also contains chapters on marc and the 1970 census but, strangely enough, does not include a final resume and conclusions. it is true that in each chapter there is a paragraph or so of summary but this is hardly a satisfactory substitute for the overall recapitulation one would expect. in the final analysis, indeed, one's view of the book will depend on just that-what one expects of it. 
if "working reference librarians" expect to read this book in order to be no longer "intimidated by these electronic tools" (p.ix), they are apt to be disappointed. the inordinate emphasis on eric, the rather dense language, and the fact that the main ideas are never pulled together at the end will all prevent easy enlightenment. however, if our workaday reference librarians are willing to work their way through a fairly difficult manual on computer-based indexing as in effect a substitute for a workshop on the subject, they will find this book a worthwhile investment of their time-and tears.

samuel rothstein
school of librarianship
university of british columbia

the circulation system at the university of missouri-columbia library: an evolutionary approach. sue mccollum and charles r. sievert, issue eds. the larc reports, vol. 5, issue 2, 1972. 101p.

in 1958 the university of missouri-columbia library was one of the first libraries to mechanize circulation by punching a portion of the charge slip with book and borrower and/or loan information. in 1964 an ibm 357 data collection system utilizing a modified 026 keypunch was installed, but not until 1966 was 026 output processed on the library owned and operated ibm 1440 computer. however, budgetary constraints forced a transfer of operations in 1970 to the data processing center, which undertook rewriting of library programs in 1971. after explanation of hardware changes and an overview of the circulation department organization and data processing center operation, this report deals in depth with the major files of the circulation system-circulation master file and location master file-and the main components of the circulation system-edit, update, overdues, fines, interlibrary loans, address file, location file, reserve book, listing of files, special requests, and utility programs. 
many examples of report layouts are included, particularly those accomplished by utilizing data gathered from main collection and reserve book loans. although this off-line batch processing circulation system is limited in that it does not handle any borrower reserve or lookup (tracer) routines, both of which are possible in off-line systems, the university of missouri-columbia system has merit as a pioneer system which influenced other university library circulation system designs in the 1960s. detailed reference given throughout the report to changes in the original library programs not only makes it of value as a case history for any library interested in circulation automation but also indicates the important fact that library programs do change and evolve in response to new demands and technological capabilities.

lois m. kershner
university of pennsylvania libraries

national science information systems, a guide to science information systems in bulgaria, czechoslovakia, hungary, poland, rumania, and yugoslavia, by david h. kraus, pranas zunde, and vladimir slamecka. (national science information series) cambridge, mass.: the m.i.t. press, 1972. 325p. $12.50.

as indicated by the title, this volume provides a comparative description and analysis of the various organizational or political structures which have been adopted by six countries of central and eastern europe in their attempts to develop effective national systems for the dissemination of scientific and technical information. for each country there is a detailed account of the national information system now existing, with a brief outline of its antecedents, a directory of information or documentation centers, a list of serials published by these centers, and a bibliography of recent papers dealing with the development of information systems in that country. 
this main section of the book is preceded by a brief review of the common characteristics of the six national systems and an outline of steps being taken to achieve international cooperation for the exchange of information in specific subjects. of particular interest is the description of the international center of scientific and technical information established in moscow in 1969, and which is now linked to five of these national systems. no attempt is made to describe the techniques being used to store, retrieve, and disseminate information. the authors point out that the six countries being examined "have experimented intensely with organizational variants of national science information systems." unfortunately, they do not attempt to indicate which of these organizational structures was most effective in bringing about the desired results. undoubtedly, this would have been an impossible task and probably not worth the effort, since a successful type of organization in a socialist country would not necessarily be effective in a democracy. the book will be of interest to political scientists and to those seeking the most effective ways of coordinating the information processing efforts of all types of government bodies. it will be only of academic interest to the information specialist concerned primarily with information processing techniques.

jack e. brown
national science library of canada
ottawa

information retrieval: on-line, by f. w. lancaster and e. g. fayen. los angeles: melville publishing co., 1973. 597p. lc: 73-9697. isbn: 0-471-51235-4.

have you been reading the asis annual review of information science and technology year after year and wishing for a compendium of the best information and examples of the latest systems, user manuals, cost data, and other facts so that you would not have to go searching in a library for the interesting reports, journal articles, and books? 
well, if you have (and who hasn't), your prayers have been answered if you are interested in on-line bibliographic retrieval systems. the authors of this handy reference book have collected and reprinted, among other things, the complete dialog terminal users reference manual, the supars user manual, the user instructions for aim-twx, obar, and the caruso tutorial program. each of these systems, and several others (arranged alphabetically from aim-twx [medline] to toxicon [toxline]), is described and illustrated. features and functions of on-line systems, such as vocabulary control and indexing, cataloging, instruction of users, equipment, and file design, are all covered in a straightforward manner, simply enough for the uninformed and carefully enough so that a system operator could compare his system's features and functions with the data provided. richly illustrated with tables, charts, graphs, and figures, up-to-date bibliographies (only serious omission noticed was the afips conference proceedings edited by d. walker), and subject and author indexes, this volume will stand as another landmark in the state-of-the-art review series which the wiley-becker & hayes information science series has come to represent. emphasis has been placed on the design, evaluation, and use of on-line retrieval systems rather than the hardware or programming aspects. several of the chapters have a broader base of interest than on-line systems, covering as they do performance criteria of retrieval systems, evaluating effectiveness, human factors, and cost-performance-benefits factors. in a book that is easy to use and as up to date and balanced as any in a rapidly changing field can be, lancaster and fayen have given students of information studies and planners and managers of information services a very valuable reference aid.
pauline a. atherton school of information studies syracuse university
national library of australia. australian marc specification.
canberra: national library of australia, 1973. 83p. $2.50. isbn: 0-642-99014-x
for those readers who are familiar with the library of congress marc format, the australian marc specification will be, for the most part, self-explanatory. the intent of the document is to describe the basic format structure and to list the various content designators that are used in the format. no effort was made to include any background information or explanation of data elements. because of this, the reviewer found it necessary to refer to other documents, e.g., precis: a rotated subject index system, by derek austin and peter butcher, in order to complete a comparative analysis of the australian format with other similar formats. perhaps the value of reviewing a descriptive document of this type lies in discovering how the format it describes compares to other existing formats developed for the same purpose. the international organization for standardization published a format for bibliographic information interchange on magnetic tape in 1973, international standard iso 2709. the australian format structure is the same throughout as the international standard. the only variance is in character positions 20 and 21 of the leader, which the australian format left undefined. a comparison of content designators cannot be made with the international standard because it specifies only the position and length of the identifiers in the structure of the format, but not the actual identifier (except for the three-digit tags 001-999 that identify the data fields). the best comparison of content designators can be made with the lc marc format, since the australian format uses many of the same tags, indicators, and subfield codes for the same purposes.
the australian format has assigned to the same character positions the same fixed-length data elements as the lc format except for position 38, which is the periodical code in the australian format and the modified record code in the lc format. in the fixed-length character positions for form of contents, publisher (government publication in lc marc), and literary text (fiction in lc marc), the australian format assigned different codes than lc.
328 journal of library automation vol. 7/4 december 1974
in general, the australian format uses the same three-digit tags as lc to identify the primary access fields in their records, e.g., 100, 110, 111 for main entries; 400, 410, 411, 440, 490 for series notes; 600, 610, 611, 650, 651 for subject headings; and 700, 710, 711 for added entries. for the remaining bibliographic fields there are some variations in tagging between the two formats. the australian marc has chosen a different method of identifying uniform titles, and has identified five more note fields in the 5xx series of tags than has lc. the australians have also added some manufactured fields to their record. these fields do not contain actual data from the bibliographic record, but rather are fields consisting of data created by program for control and manipulation purposes, or from lists such as the precis subject index. the australian format has also included, as part of its record, a series of cross-reference fields identified by 9xx tags. lc has reserved the 9xx block of tags for local use.
the use of indicators differs in most instances between the two formats. both allow for two indicator positions in each field as specified by the international standard format structure. however, the information conveyed by the indicators differs except where the first indicator conveys form of name for personal and corporate name headings. within each block of tags, lc has made an effort to remain consistent in the use of indicators, e.g., in the 6xx block for subject headings, the first indicator specifies form of name where a form of name can be discerned. where no form of name is discernable such as in a topical subject heading (tag 650), a null indicator or blank is used, which means no intelligence carried in this position. in the australian format the indicators in the 6xx block of tags have three different patterns. inconsistency of this kind does not tend to destroy compatibility with other coding systems using the same format structure, as long as sufficient explanation and examples are given from which conversion tables may be developed by the institutions with whom one wants to exchange, or interchange, bibliographic data.
an even greater degree of difference exists between the two formats in the subfield codes used to identify data elements. the australian marc has identified some data elements that lc has not, e.g., in personal name main entries, the australian record identifies first names with subfield code "h," whereas lc does not identify parts of a personal name, only the form of the name, i.e., forename form, single surname, family name, etc. in most of the fields the two formats have defined some of the same data elements, but each uses a different subfield code to represent the element. in the australian document, under each field heading, the subfield codes are listed alphabetically with a data element following each code. this arrangement causes the data elements to fall out of their normal order of occurrence in the field, i.e., name, numeration, titles, dates, relator, etc. for example:
personal name main entry (tag 100)
subfield code    australian marc                           lc marc
a                entry element (name)                      entry element (name)
b                relator                                   numeration
c                dates                                     titles (honorary)
d                second or subsequent additions to name    dates
e                numeration                                relator
f                additions to name other than date         date (of a work)
the example demonstrates the need for precise definition and documentation of data elements for the purpose of conversion or translation when interchanging data with other institutions. the australian format has included the capability of identifying analytical entries by using an additional digit (called the level digit) placed between the tag and the indicators to identify the analytical entries. a subrecord directory (tag 002) is present in each record containing data for analytical entries. the australian document includes appendixes for the country of publication codes, language codes, and geographical area codes that were developed by the library of congress. their only deviation from lc marc usage is in the country of publication codes, where the australians have added entities and codes for australian first-level administrative subdivisions.
patricia e. parker marc development office library of congress
articles
no need to ask: creating permissionless blockchains of metadata records
dejah rubel
information technology and libraries | june 2019 1
dejah rubel (rubeld@ferris.edu) is metadata and electronic resource management librarian, ferris state university.
abstract
this article will describe how permissionless metadata blockchains could be created to overcome two significant limitations in current cataloging practices: centralization and a lack of traceability. the process would start by creating public and private keys, which could be managed using digital wallet software. after creating a genesis block, nodes would submit either a new record or modifications to a single record for validation. validation would rely on a federated byzantine agreement consensus algorithm because it offers the most flexibility for institutions to select authoritative peers.
only the top tier nodes would be required to store a copy of the entire blockchain thereby allowing other institutions to decide whether they prefer to use the abridged version or the full version. introduction several libraries and library vendors are investigating how blockchain could improve activities such as scholarly publishing, content dissemination, and copyright enforcement. a few organizations, such as katalysis, are creating prototypes or alpha versions of blockchain platforms and products.1 although there has been some discussion about using blockchains for metadata creation and management, only one company appears to be designing such a product. therefore, this article will describe how permissionless blockchains of metadata records could be created, managed, and stored to overcome current challenges with metadata creation and management. limitations of current practices metadata standards, processes, and systems are changing to meet twenty-first century information needs and expectations. there are two significant limitations, however, to our current metadata creation and modification practices that have not been addressed: centralization and traceability. although there are other sources for metadata records, including the open library project, the largest and most comprehensive database with over 423 million records is provided by the online computer library center (oclc).2 oclc has developed into a highly centralized operation that requires member fees to maintain its infrastructure. oclc also restricts some members from editing records contributed by other members. one example of these restrictions is the program for cooperative cataloging (pcc). 
although there is no membership fee for pcc, catalogers from participating libraries must receive additional training to ensure that their institution contributes high quality records.3 requiring such training, however, limits opportunities for participation and can create bottlenecks when non-pcc institutions identify errors in a pcc record. decentralization would help smaller, less-well-funded institutions overcome such barriers to creating and contributing their records and modifications to a central database.
no need to ask | rubel 2 https://doi.org/10.6017/ital.v38i2.10822
the other significant limitation to our current cataloging practices is the lack of traceability for metadata changes. oclc tracks record creation and changes by adding an institution’s oclc symbol to the 040 marc field.4 however, this symbol only indicates which institution created or edited the record, not what specific changes they made. oclc also records a creation date and a replacement date in each record, but a record may acquire multiple edits between those two dates. recording the details of each change within a record would help future metadata editors to understand who made certain changes and possibly why they were made. capturing these details would also mitigate concerns about the potential for metadata deletion because every datum would still be recorded even if it is no longer part of the active record.
information science blockchain research
many researchers and institutions are exploring blockchain for information science applications. most of these applications can be categorized as either scholarly publishing, content dissemination and management, or metadata creation and management. one of the most promising applications for blockchain is coordinating, endorsing, and incentivizing research and scholarly publishing activities.
in “blockchain for research,” rossum from digital science describes benefits such as data colocation, community self-correction, failure analysis, and fraud prevention.5 research activity support and endorsement would use an academic endorsement points (aep) currency to support work at any level, such as blog posts, data sets, peer reviews, etc. the amount credited to each scientist is based on the aep received for their previous work. therefore, highly endorsed researchers will have a greater impact on the community. one benefit of this system is that such endorsements would accrue faster than traditional citation metrics.6 one detriment to this system is its reliance on the opinions of more experienced scientists. the current peer review process assumes these experts would be the best to evaluate new research because they have the most knowledge. breakthroughs often overturn the status quo, however, and consequently may be overlooked in an echo chamber of approved theories and approaches. micropayments using aep could “also introduce a monetary reward scheme to researchers themselves,” bypassing traditional publishers.7 unfortunately, such rewards could become incentives to propagate unscientific or immoral research on topics like eugenics. in addition, research rewards might increase the influence of private parties or corporations to science and society’s detriment. blockchains might also reduce financial waste by “incentivizing research collaboration while discouraging solitary and siloed research.”8 smart contracts could also be enabled that automatically publish any article, fund research, or distribute micropayments based on the amount of endorsement points.9 to support these goals, digital science is working with katalysis on the blockchain for peer review project. 
it is hard to tell exactly where they are in development, but as of this writing, it is probably between the pilot phase and the minimum viable product.10 the decentralized research platform (deip) serves as another attempt “to create an ecosystem for research and scientific activities where the value of each research…will be assessed by an experts’ community.”11 the whitepaper authors note that the lack of negative findings and of unmediated or open access to research results and data often leads to scientists replicating the same research.12 they also state that 80 percent of publishers’ proceeds are from university libraries, which spend up to 65 percent of their entire budget on journal and database subscriptions.13 this financial waste is surprising because universities are the primary source of published research. therefore, deip’s goals include research and resource distribution, expertise recognition, transparent grant processes, skill or knowledge tracking, preventing piracy, and ensuring publication regardless of the results.14 the second most propitious application of blockchain to information science is content dissemination and management.15 blockchain is an excellent way to track copyright. several blockchains have already been developed for photographers, artists, and musicians. examples include photochain, copytrack, binded, and dotbc.16 micropayments for content support the implementation of different access models, which can provide an alternative to subscription-based models.17 micropayments can also provide an affordable infrastructure for many content types and royalty payment structures. blockchain could also authenticate primary sources and trace their provenance over time.
this authentication would not only support archives, museums, and special collections, but it would also ensure law libraries can identify the most recent version of a law.18 finally, blockchain could protect digital first sale rights, which are key to libraries being able to share such content.19 “while drm of any sort is not desirable, if by using blockchain-driven drm we trade for the ability to have recognized digital first sale rights, it may be a worthy bargain for libraries.”20 to support such restrictions, another use for blockchain developed by companies such as libchain is open, verifiable, and anonymous access management to library content.21 another suitable application for blockchain is metadata creation and management.22 an open metadata archive, information ledger, or knowledgebase is very appealing because access to high quality records often requires a subscription to oclc.23 some libraries cannot afford such subscriptions. therefore, they must rely on records supplied by either a vendor or a government agency, like the library of congress. unfortunately, as of this writing, there is little research on how these blockchains could be constructed at the scale of large databases like those of oclc and the library of congress. in fact, the only such project is demco’s private, invitation-only beta.24 demco does not provide any information regarding their new product, but to make its development profitable, it is most likely a private, permissioned blockchain. creating permissionless blockchains for metadata records this section will describe how to create permissionless blockchains for metadata records including grouping transactions, an appropriate consensus algorithm, and storage options. please note that these blockchains are intended to augment current metadata record creation and modification practices and standards, not supersede them. 
the author assumes that record creation and modification will still require content (rda) and encoding (marc) validation prior to blockchain submission. validation in this section will refer solely to blockchain validation.
generating and managing public and private keys
all distributed ledger participants will need a public key or address for blocks of transactions to be sent to them and a private key for digital signatures. one way to create these key pairs is to generate a seed, which can be a group of random words or passphrases. the sha-256 algorithm can then be applied to this seed to create a private key.25 next, a public key can be generated from that private key using the elliptic curve mathematics that underlies the elliptic curve digital signature algorithm (ecdsa).26 for additional security, the public key can be hashed again using a different cryptographic hash function, such as ripemd-160, or multiple hash functions, as bitcoin does to create its addresses.27 these key pairs could be managed with digital wallet software. “a bitcoin wallet is an organized collection of addresses and their corresponding private keys.”28 larger institutions, such as the library of congress, could have multiple key pairs with each pair designated for the appropriate cataloging department based on genre, form, etc.
creating a genesis block
every blockchain must start with a “genesis block.”29 for example, a personal name authority blockchain might start with william shakespeare’s record. a descriptive bibliographic blockchain might start with the king james bible. this genesis block includes a block header, a recipient’s public key or address, a transaction count, and a transaction list.30 being the first block, the block header will not contain a hash of the previous block header. it will contain, however, a hash of all of the transactions within that block to verify that the transactions list has not been altered.
the block header will also include a timestamp and possibly a difficulty level and nonce.31 then the block header is hashed using the sha-256 algorithm, and that hash is signed (encrypted) with the creator’s private key to produce a digital signature. this digital signature will be appended to the end of the block so validators can verify that the creator made the block by using their (the creator’s) public key.32 finally, the recipient’s public key or address, the transaction count, and transaction list are appended to the block header.33
block header
• hash of previous block header
• hash of all transactions in that block
• timestamp
• difficulty level (if applicable)
• nonce (if applicable)
block
• recipient public key or address
• transaction count
• transaction list
• digital signature
in her master of information security and intelligence thesis at ferris state university, amber snow investigated the feasibility of using blockchain to add, edit, and validate changes to woodbridge n. ferris’ authority record.34 as shown in figure 1, she began by creating a hash function using the sha-256 algorithm to hash (in the thesis’s wording, “encrypt”) the previous hash, the timestamp, the block number, and the metadata record. “the returned encrypt value is significant because the returned data is the encrypted data that is being committed as [a] mined block transaction permanently to ledger.”35 the ledger block, however, “contains the editor’s name, the entire encrypted hash value, and the prior blocks [sic] hashed value.”36
figure 1. creating a sha-256 hash.
next, as shown in figures 2 and 3, she created a genesis block with a prior hashed value of zero by ingesting ferris’ authority record as “a single line file that contains the indicator signposts for cataloging the record.”37
figure 2. ingesting woodbridge n. ferris' authority record.38
figure 3. woodbridge n. ferris' authority record as a genesis block. note the previoushash value is zero.
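the hashing and genesis-block steps described above can be sketched in a few lines of python. this is a minimal illustration, not the code from snow's thesis: the field names, the json serialization, and the sample record string are assumptions made for the example. note that sha-256 is a one-way hash rather than encryption, so the digest cannot be turned back into the record.

```python
import hashlib
import json


def block_hash(previous_hash, timestamp, block_number, metadata):
    """Return the SHA-256 hex digest of the four values named in the text."""
    # Assumption: the values are serialized as sorted-key JSON before hashing.
    payload = json.dumps(
        {
            "previousHash": previous_hash,
            "timestamp": timestamp,
            "blockNumber": block_number,
            "metadata": metadata,
        },
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def make_genesis_block(editor, metadata, timestamp):
    """Build a first block whose previous hash is zero."""
    previous_hash = "0"  # a genesis block has no predecessor
    return {
        "blockNumber": 0,
        "editor": editor,
        "timestamp": timestamp,
        "metadata": metadata,
        "previousHash": previous_hash,
        "hash": block_hash(previous_hash, timestamp, 0, metadata),
    }


# Hypothetical single-line record, not the actual authority record.
record = "100 1_ $a ferris, woodbridge n."
genesis = make_genesis_block("tester 1", record, "2019-06-01T00:00:00Z")
print(genesis["previousHash"], genesis["hash"])
```

because the digest covers the record text, changing even one character of the metadata yields a different hash, which is what lets later blocks detect tampering.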
snow noted that “the understanding and interpretation of the marc authority record’s signposts is not inherently relevant for the blockchain data processing.”39 to keep the scope narrow, she also avoided using public and private key pairs to exchange records between nodes. “the ri [research institution] blockchain does not necessarily require two users to agree…instead the ri blockchain is looking to commit and track single user edits to the record.”40
creating and submitting new blocks for validation
once a genesis block has been created and distributed, any node on the network can submit new blocks to the chain. for metadata records, new blocks should contain either new records or multiple modifications to the same record with each field being treated as a transaction. when a second block is appended, the new block header will include the hash of the previous block header, a hash of all of the new transactions, a new timestamp, and possibly a new difficulty level and/or nonce. the block header will then be hashed using sha-256 and encrypted with the submitter’s private key to become a digital signature for that block. finally, another recipient’s public key or address, a new transaction count, and a new transaction list will be appended to the block header. additional blocks can then be securely appended to the chain ad infinitum without losing any of the transactional details.
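as a rough sketch of how a second block links to the first, the python below builds block headers from the elements listed above and checks the links. it is illustrative only: the dictionary layout, the field-level transaction format, and the sample values are assumptions, and the digital signature step is omitted because it needs an elliptic curve library outside the standard library.

```python
import hashlib
import json


def sha256_hex(obj):
    """Hash any JSON-serializable object with SHA-256."""
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode("utf-8")).hexdigest()


def make_block(previous_header_hash, transactions, timestamp):
    """Build a block; each transaction is one field-level change (assumed shape)."""
    header = {
        "previousHeaderHash": previous_header_hash,  # None for the genesis block
        "transactionsHash": sha256_hex(transactions),
        "timestamp": timestamp,
    }
    return {
        "header": header,
        "transactionCount": len(transactions),
        "transactions": transactions,
    }


def chain_is_valid(chain):
    """Check that every block matches its transactions and its predecessor."""
    for i, block in enumerate(chain):
        if block["header"]["transactionsHash"] != sha256_hex(block["transactions"]):
            return False
        if i > 0:
            expected = sha256_hex(chain[i - 1]["header"])
            if block["header"]["previousHeaderHash"] != expected:
                return False
    return True


# Hypothetical field-level transactions for one authority record.
genesis = make_block(None, [{"tag": "100", "value": "shakespeare, william"}], "t0")
second = make_block(
    sha256_hex(genesis["header"]),
    [{"tag": "400", "value": "shakspere, william"},
     {"tag": "670", "value": "hypothetical source note"}],
    "t1",
)
chain = [genesis, second]
print(chain_is_valid(chain))
```

editing a transaction in an earlier block after the fact makes `chain_is_valid` return false, which is the traceability property the article is after.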
if two validators approve the same block at the same time, then the fork where the next block is appended first becomes the valid chain while the other chain becomes orphaned.41 although snow’s method does not include exchanging records using public keys or addresses, she was able to change a record, add it to the blockchain, and successfully commit those edits using the proof of work consensus algorithm.42 as shown in figure 4, after creating and submitting a genesis block as “tester 1,” she added a modified version of woodbridge n. ferris’ record as “tester 2.” this version appended the string “testerchanged123” to woodbridge n. ferris’ authority record. then she validated or “mined” the second block to commit the changes.
figure 4. submitting and validating an edited record.
figure 5 shows that the second block is chained to the genesis block because the “previoushash” value of the second block matches the “hash” of the genesis block. this link is what commits the block to the ledger. the appended string in the second block is at the end of the “metadata” variable.
figure 5. the new authority record blockchain.
a more sophisticated method to append a second block would require key pairs. as described previously, a block would include a recipient’s public key or address, which would route the new and modified records to large, known institutions like the library of congress. although every node on the network can see the records and all of the changes, large institutions with well-trained and authoritative catalogers may be the best repository for metadata records and could store a preservation or backup copy of the entire chain. they are also the most reliable for validating records for content accuracy and correct encoding.
achieving algorithmic consensus
once a block has been submitted for validation, the other nodes use a consensus algorithm to verify the validity of the block and its transactions.
“consensus mechanisms are ways to guarantee a mutual agreement on a data point and the state…of all data.”43 the most well-known consensus algorithm is bitcoin’s proof of work, but the most suitable algorithm for permissionless metadata blockchains is a federated byzantine agreement.
proof of work
proof of work (pow) relies on a one-way cryptographic hash function to create a hash of the block header. this hash is easy to calculate, but it is very difficult to reverse, that is, to recover the inputs from the hash.44 to solve a block, nodes must compete to calculate the hash of the block header. to calculate the hash of a block header, a node must first separate it into its constituent components. the hash of the previous block header, the hash of all of the transactions in that block, the timestamp, and the difficulty target remain fixed for a given block. the validator, however, changes the nonce or random value appended to the block header until the hash has been solved.45 in bitcoin this process is called “mining” because every new block creates new bitcoins as a reward for the node that solved the block.46 bitcoin also includes a mechanism to ensure the average number of blocks solved per hour remains constant. this mechanism is the difficulty target. “to compensate for increasing hardware speed and varying interest in running nodes over time, the proof-of-work difficulty is determined by a moving average targeting an average number of blocks per hour. if they’re generated too fast, the difficulty increases.”47 adjusting the difficulty target within the block header keeps bitcoin stable because its block rate is not determined by its popularity.48 in sum, validators are trying to find a nonce that generates a hash of the block header that is less than the predetermined difficulty target.
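the nonce search can be illustrated with a toy python loop. this is a simplified sketch, not bitcoin's implementation: it hashes a plain string once with sha-256 (bitcoin hashes a binary header twice), and the header string and the very easy target are assumptions chosen so the example finishes quickly.

```python
import hashlib


def mine(header_fields, target, max_nonce=10_000_000):
    """Try nonces until the header hash, read as an integer, is below target."""
    for nonce in range(max_nonce):
        digest = hashlib.sha256(f"{header_fields}|{nonce}".encode("utf-8")).digest()
        if int.from_bytes(digest, "big") < target:
            return nonce, digest.hex()
    return None  # no solution found within max_nonce attempts


# Fixed header parts: previous header hash, transactions hash, timestamp (hypothetical values).
header_fields = "prevhash|txhash|2019-06-01T00:00:00Z"
# Toy difficulty: the hash must start with 16 zero bits, about 65,000 tries on average.
target = 1 << 240
result = mine(header_fields, target)
print(result)
```

lowering `target` makes valid hashes rarer, so more nonces must be tried; that is the lever the difficulty adjustment pulls, and it is also why the energy cost grows with the difficulty.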
unfortunately, proof of work requires immense and ever-increasing computational power to solve blocks, which poses a sustainability and environmental challenge. bitcoin and other financial services may need to rely on proof of work because “the massive amounts of electricity required helps to secure the network. it disincentivizes hacking and tampering with transactions…”49 because an attacker would need to control over 51 percent of the entire network to convince the other nodes that a faulty ledger is correct.50 metadata blockchains would rely on public information and therefore would not need the same level of security as private financial, medical, or personally identifiable information. unlike bitcoin, metadata blockchains also would not need a difficulty target because fluctuations in block production rates would not affect a metadata block’s value the same way cryptocurrency inflation would. therefore, despite its incredible security, proof of work would be computationally excessive for metadata record blockchains. federated byzantine agreement byzantine agreements are “the most traditional way to reach consensus. 
[…] a byzantine agreement is reached when a certain minimum number of nodes (known as a quorum) agrees that the solution presented is correct, thereby validating a block and allowing its inclusion on the blockchain.”51 byzantine fault-tolerant (bft) state machine replication protocols support consensus “despite participation by malicious (byzantine) nodes.”52 this support ensures consensus finality, which “mandates that a valid block…never be removed from the blockchain.”53 in contrast, proof of work does not satisfy consensus finality because there is still the potential for temporary forking even if there are no malicious nodes.54 the “absence of consensus finality directly impacts the consensus latency of pow blockchains as transactions need to be followed by several blocks to increase the probability that a transaction will not end up being pruned and removed from the blockchain.”55 this latency increases as block size increases, which may also increase the number of forks and possibility of attack.56 “with this in mind, limited performance is seemingly inherent to pow blockchains and not an artifact of a particular implementation.”57 bft protocols, however, can sustain tens of thousands of transactions at nearly network latency levels.58 a bft consensus algorithm is also superior to one based on proof of work because “users and smart contracts can have immediate confirmation of the final inclusion of a transaction into the blockchain.”59 bft consensus algorithms also decouple trust from resource ownership, allowing small organizations to oversee larger ones.60 to use bft, every node must know and agree on the exact list of participating peer nodes. ripple, a bft protocol, tries to ameliorate this problem by publishing an initial membership list and allowing members to edit that list after implementation. 
unfortunately, users are often reluctant to edit the membership list, thereby placing most of the network’s power in the person or organization that maintains the list.61 federated byzantine agreement (fba), however, does not require each node to agree upon and maintain the same membership list. “in fba, each participant knows of others it considers important. it waits for the vast majority of those others to agree on any transaction before considering the transaction settled.”62 theoretically, an attacker could join the network enough times to outnumber legitimate nodes, which is why quorums by majority would not work. instead, fba creates quorums using a decentralized method that relies on each node selecting its own quorum slices.63 “a quorum slice is the subset of a quorum convincing one particular node of agreement.”64 a node may have many slices, “any one of which is sufficient to convince it of a statement.”65 the system constructs quorums based on individual node decisions, thereby generating consensus without every node being required to know about every other node in the system.66 one example of quorum slices that might be good for metadata blockchains is a tiered system as shown in figure 6. the top tier would be structured like a bft system where the nodes can tolerate a limited number of byzantine nodes at the same level. this level would include the core metadata authorities, such as the library of congress or pcc members. members of this tier would be able to validate any record. the second or middle tier nodes would depend on the top tier because, in this example, a middle tier node requires two top tier nodes to form a quorum slice. these middle tier nodes would be authoritative, known institutions, such as universities, that already rely on the core metadata authorities on the top tier to validate and distribute their records.
finally, a third tier, such as smaller institutions, would, in this example, rely on at least two middle tier nodes for their quorum slice.

figure 6. tiered quorum example.

using an fba protocol to validate a transaction requires each node to exchange two sets of messages. the first set of messages gathers validations and the second set of messages confirms those validations. “from each node’s perspective, the two rounds of messages divide agreement…into three phases: unknown, accepted, and confirmed.”67 the unknown status becomes an acceptance when the first validation succeeds. acceptance is not sufficient for a node to act on that validation, however, because acceptance may be stuck in an indeterminate state or blocked for other nodes.68 the accepting node may also be corrupted and validate a transaction the network quorum rejects. therefore, the confirmation validation “allows a node to vote for one statement and later accept a contradictory one.”69

figure 7. validation process of statement a for a single node v.

fba would lessen concerns about sharing a permissionless blockchain, but it can “only guarantee safety when nodes choose adequate quorum slices.”70 after discovery, byzantine nodes should be excluded from quorum slices to prevent interference with validation. one example of such interference is tricking other nodes into validating a bad confirmation message. “in such a situation, nodes must disavow past votes, which they can only do by rejoining the system under new node names.”71 theoretically, this recovery process could be automated to include “having other nodes recognize reincarnated nodes and automatically update their slices.”72 therefore, the key limitation to using an fba algorithm is continuity of participation.
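the tiered quorum example described earlier can be sketched in a few lines of python. this is only an illustration under stated assumptions: the node names are hypothetical, each slice is assumed to contain the node itself, and the top tier threshold (a node plus two of the three other top tier nodes) is chosen for the example rather than taken from the article. the check follows the usual fba definition that a set of nodes is a quorum when every member has at least one quorum slice contained in the set.

```python
from itertools import combinations

def slices_for(node, peers, k):
    """Build every slice made of the node itself plus k of its chosen peers."""
    return [frozenset((node,) + combo) for combo in combinations(peers, k)]

# Hypothetical node names for the three tiers (assumptions, not from the article).
TOP = ["loc", "pcc_a", "pcc_b", "pcc_c"]
MIDDLE = ["univ_a", "univ_b", "univ_c"]
BOTTOM = ["small_lib"]

slices = {}
for node in TOP:
    # Assumed threshold: itself plus any 2 of the other 3 top tier nodes.
    others = [p for p in TOP if p != node]
    slices[node] = slices_for(node, others, 2)
for node in MIDDLE:
    # A middle tier node requires two top tier nodes.
    slices[node] = slices_for(node, TOP, 2)
for node in BOTTOM:
    # A third tier node relies on two middle tier nodes.
    slices[node] = slices_for(node, MIDDLE, 2)

def is_quorum(candidate, slices):
    """A non-empty set is a quorum if each member has a slice inside the set."""
    members = frozenset(candidate)
    if not members:
        return False
    return all(any(s <= members for s in slices[n]) for n in members)

print(is_quorum({"loc", "pcc_a", "pcc_b"}, slices))
print(is_quorum({"univ_a", "loc", "pcc_a"}, slices))
print(is_quorum({"small_lib", "univ_a", "univ_b", "loc", "pcc_a", "pcc_b"}, slices))
```

in this sketch a middle tier node cannot form a quorum with only two top tier nodes, because those top tier nodes are not themselves satisfied; agreement at the lower tiers always pulls in a quorum of the top tier.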
if too many nodes leave the network, reengineering consensus would require centralized coordination, whereas proof of work algorithms could operate after losing many nodes without substantial human intervention.73

storing the blockchain

storing a large blockchain, such as bitcoin’s, is a significant challenge. one method to facilitate that storage would be to rely on top tier nodes to retain a complete copy of the blockchain and allow smaller, lower tier nodes to retain an abridged version. in bitcoin, these methods are known as full payment verification (fpv) and simplified payment verification (spv). fpv requires a complete copy of the blockchain to “verify that bitcoins used in a transaction originated from a mined block by scanning backward, transaction by transaction, in the blockchain until their origin is found.”74 unfortunately, as one might expect, fpv consumes many resources and can take a long time to initialize. for example, downloading bitcoin’s blockchain can take several days. this long installation period is partly due to the size of the blockchain, but if proof of work is used as the consensus algorithm, then the new node must also connect to other full nodes “to determine whose blockchain has the greatest proof-of-work total (by definition, this is assumed to be the consensus blockchain).”75 using fba instead of proof of work would eliminate this time- and resource-consuming step.

in contrast, spv only allows a node “to check that a transaction has been verified by miners and included in some block in the blockchain.”76 a node does this by downloading the block headers of every block in the chain. in addition to retaining the hash of the previous block header, these headers also include root hashes derived from a merkle tree.
a merkle tree is a tree of hashes in which each parent is the hash of its children, so a single root hash commits to every transaction in a block; as a result, “the spent transactions…can be discarded to save disk space.”77 as shown in figure 8, combining transaction hashes for the entire block into a single root hash in the block header saves a considerable amount of storage capacity because the interior hashes can be eliminated or “pruned” off the merkle tree.

figure 8. using a merkle tree for storage.

as shown in figure 9, to verify that a transaction was included in a block, a node “obtains the merkle branch linking the transaction to the block it’s timestamped in.”78 although it cannot check the transaction directly, “by linking it to a place in the chain he can see that a network node has accepted it and blocks after it further confirm the network has accepted it.”79

figure 9. verifying a transaction using a merkle root hash.

compared to fpv, spv “requires only a fraction of the memory that’s needed for the entire blockchain.”80 this small amount of storage enables spv ledgers to sync and become operational in less than an hour.81 spv is limited, however, only allowing nodes to manage addresses or public keys that they maintain, whereas fpv ledgers are able to query the entire network. thus, an spv ledger must rely “on its network peers to ensure its transactions are legit.”82 theoretically, an attacker could overpower the entire network and convince nodes using spv to accept fraudulent transactions, but such an attack is very unlikely for metadata blockchains. for additional security, an spv node could also “accept alerts from network nodes when they detect an invalid block, prompting the user’s software to download the full block and alerted transactions to confirm the inconsistency.”83 adding such a feature to metadata blockchain software would further reduce the slight risk of it being contaminated by malicious actors.
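the merkle root and merkle branch checks described above can be illustrated with a minimal python sketch. it is a simplification under stated assumptions: it uses a single sha-256 hash per step (bitcoin itself applies sha-256 twice and has its own byte-ordering rules), it duplicates the last hash when a level has an odd number of entries, and the function names and sample records are hypothetical.

```python
import hashlib

def h(data: bytes) -> bytes:
    """Single SHA-256 digest (a simplification; Bitcoin hashes twice)."""
    return hashlib.sha256(data).digest()

def build_levels(transactions):
    """Return the tree as a list of levels, from leaf hashes up to the root."""
    if not transactions:
        raise ValueError("at least one transaction is required")
    level = [h(tx) for tx in transactions]
    levels = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level = level + [level[-1]]  # pad odd levels by repeating the last hash
        levels.append(level)
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    levels.append(level)  # the final level holds only the root hash
    return levels

def merkle_root(transactions):
    return build_levels(transactions)[-1][0]

def merkle_branch(transactions, index):
    """Collect the sibling hashes needed to recompute the root from one leaf."""
    branch = []
    for level in build_levels(transactions)[:-1]:
        sibling = index ^ 1  # the neighbour within the same pair
        branch.append((level[sibling], sibling < index))  # (hash, sibling_is_left)
        index //= 2
    return branch

def verify(transaction, branch, root):
    """Recompute the root from one transaction and its branch, then compare."""
    current = h(transaction)
    for sibling, sibling_is_left in branch:
        current = h(sibling + current) if sibling_is_left else h(current + sibling)
    return current == root

# Hypothetical metadata records standing in for transactions.
records = [b"record-1", b"record-2", b"record-3", b"record-4", b"record-5"]
root = merkle_root(records)
branch = merkle_branch(records, 4)
print(verify(b"record-5", branch, root))
print(verify(b"forged-record", branch, root))
```

a node using simplified payment verification would hold only the root (from the block header) and request the branch from a peer, so the check needs a number of hashes that grows with the logarithm of the block size rather than with the block itself.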
thus, spv offers the ability for smaller institutions to participate in creating and maintaining a metadata blockchain without requiring them to have the storage capacity for the entire blockchain.

conclusion and future directions

this article described how permissionless metadata blockchains could be created to overcome two significant limitations in current cataloging practices: centralization and a lack of traceability. the process would start by creating private keys using a seed and the sha-256 algorithm and public keys using the elliptic curve digital signature algorithm. after creating the genesis block, nodes would submit either a new record or modifications to a single record for validation. validation would rely on a federated byzantine agreement (fba) consensus algorithm because it offers the most flexibility for institutions to select authoritative peers. quorum slices would be chosen using a tiered system where the top tier institutions would be the core metadata authorities, such as the library of congress. only the top tier nodes would be required to store a copy of the entire blockchain (fpv), thereby allowing other institutions to decide whether they prefer to use spv or fpv.

future directions for research could start with investigating whether this theoretical design will work. fba has not been heavily promoted as an option for a consensus algorithm, but its quorum slices create trust between recognized authorities and smaller institutions. another area of study could be whether there is a significant demand for metadata blockchains. many institutions appear frustrated at the costs and limitations of working with a vendor, but they also view such relationships as necessary for metadata record creation and maintenance. a metadata blockchain would reduce such dependence, but some institutions may be leery of using open source software.
other institutions might be hesitant to adopt blockchain because they believe it is merely another “fad” or an unnecessary addition to metadata exchange systems. a third area for research could be a cost-benefit analysis for implementing metadata blockchains that weighs current vendor fees and labor costs against the potential storage and labor costs. such an analysis may create a tipping point where long-term return on investment outweighs the short-term challenges. endnotes 1 “about the project,” blockchain for peer review, digital science and katalysis, accessed nov. 29, 2018, https://www.blockchainpeerreview.org/about-the-project/. 2 “marc record services,” marc standards, library of congress, accessed nov. 29, 2018, https://www.loc.gov/marc/marcrecsvrs.html; “open library data,” open library, internet archive, accessed nov. 29, 2018, https://archive.org/details/ol_data ; oclc, 2017-2018 annual report. 3 “join the pcc,” program for cooperative cataloging, library of congress, accessed nov. 29, 2018, http://www.loc.gov/aba/pcc/join.html. 4 “040 cataloging source (nr),” oclc support & training, oclc, accessed nov. 29, 2018, https://www.oclc.org/bibformats/en/0xx/040.html. 5 dr. joris van rossum, “blockchain for research,” accessed nov. 29, 2018, https://www.digitalscience.com/resources/digital-research-reports/blockchain-for-research/. 6 van rossum, 11. 7 van rossum, 12. 8 van rossum, 12. 9 van rossum, 16. 10 digital science and katalysis, “about the project.” 11 “decentralized research platform,” deip, accessed nov. 29, 2018, https://deip.world/wpcontent/uploads/2018/10/deip-whitepaper.pdf. 12 deip, 13. 13 deip, 14. 14 deip, 16. no need to ask | rubel 14 https://doi.org/10.6017/ital.v38i2.10822 15 jason griffey, “blockchain for libraries,” feb. 26, 2016, https://speakerdeck.com/griffey/blockchain-for-libraries. 16 “e-services,” concensum, accessed nov. 29, 2018, https://concensum.org/en/e-services; “about,” binded, accessed nov. 
29, 2018, https://binded.com/about; “faq,” dot blockchain media, accessed nov. 29, 2018, http://dotblockchainmedia.com/. 17 van rossum, “blockchain for research,” 10. 18 debbie ginsberg, “law and the blockchain,” blockchains for the information profession, nov. 22, 2017, https://ischoolblogs.sjsu.edu/blockchains/law-and-the-blockchain-by-debbieginsberg/. 19 griffey, “blockchain for libraries.” 20 “ways to use blockchain in libraries,” san josé state university, accessed nov. 29, 2018, https://ischoolblogs.sjsu.edu/blockchains/blockchains-applied/applications/. 21 “libchain: open, verifiable, and anonymous access management,” libchain, accessed nov. 29, 2018, https://libchain.github.io/. 22 griffey, “blockchain for libraries.” 23 san josé state university. “ways to use blockchain in libraries.” 24 “demco software blockchain,” demco, accessed nov. 29, 2018, http://blockchain.demcosoftware.com/. 25 jordan baczuk, “how to generate a bitcoin address—step by step,” coinmonks, accessed nov. 29, 2018, https://medium.com/coinmonks/how-to-generate-a-bitcoin-address-step-by-step9d7fcbf1ad0b. 26 “elliptic curve digital signature algorithm,” bitcoin wiki, accessed nov. 29, 2018, https://en.bitcoin.it/wiki/elliptic_curve_digital_signature_algorithm. 27 conrad barski and chris wilmer, bitcoin for the befuddled (san francisco: no starch pr., 2015), 139. 28 barski and wilmer, 12-13. 29 barski and wilmer, 11. 30 barski and wilmer, 172-73. 31 barski and wilmer, 172-73. 32 satoshi nakamoto, “bitcoin: a peer-to-peer electronic cash system,” accessed nov. 29, 2018, https://bitcoin.org/bitcoin.pdf. 33 barski and wilmer, bitcoin for the befuddled, 170-72. information technology and libraries | june 2019 15 34 amber snow, “the design and implementation of blockchain technology in academic resource’s authoritative metadata records: enhancing validation and accountability” (master’s thesis, ferris state university, 2018), 34. 35 snow, 40. 36 snow, 40. 37 snow, 37, 40. 38 snow, 42. 
39 snow, 37. 40 snow, 39. 41 barski and wilmer, bitcoin for the befuddled, 23. 42 snow, “the design and implementation of blockchain technology,” 37. 43 “9 types of consensus mechanisms you didn’t know about,” daily bit, accessed nov. 29, 2018, https://medium.com/the-daily-bit/9-types-of-consensus-mechanisms-that-you-didnt-knowabout-49ec365179da. 44 barski and wilmer, bitcoin for the befuddled, 138. 45 barski and wilmer, 171. 46 barski and wilmer, 138. 47 nakamoto, “bitcoin,” 3. 48 barski and wilmer, bitcoin for the befuddled, 171. 49 helen zhao, “bitcoin and blockchain consume an exorbitant amount of energy. these engineers are trying to change that,” cnbc, feb. 23, 2018, https://www.cnbc.com/2018/02/23/bitcoinblockchain-consumes-a-lot-of-energy-engineers-changing-that.html. 50 barski and wilmer, bitcoin for the befuddled, 23. 51 shaan ray, “federated byzantine agreement,” towards data science, accessed nov. 29, 2018, https://towardsdatascience.com/federated-byzantine-agreement-24ec57bf36e0. 52 marko vukolić, “the quest for scalable blockchain fabric: proof-of-work vs. bft replication,” ibm research – zurich, accessed nov. 29, 2018, http://vukolic.com/inetsec_2015.pdf 53 vukolić, “the quest for scalable blockchain fabric,” [5]. 54 vukolić, [6]. no need to ask | rubel 16 https://doi.org/10.6017/ital.v38i2.10822 55 vukolić, [6]. 56 vukolić, [7]. 57 vukolić, [7]. 58 vukolić, [7]. 59 vukolić, [6]. 60 david mazières, “the stellar consensus protocol: a federated model for internet-level consensus,” stellar development foundation, accessed nov. 29, 2018, https://www.stellar.org/papers/stellar-consensus-protocol.pdf. 61 mazières, 3. 62 mazières, 1. 63 mazières, 4. 64 mazières, 4. 65 mazières, 4. 66 mazières, 5. 67 mazières, 11. 68 mazières, 11. 69 mazières, 13. 70 mazières, 28. 71 mazières, 29. 72 mazières, 29. 73 mazières, 29. 74 barski and wilmer, bitcoin for the befuddled, 191. 75 barski and wilmer, 191. 76 barski and wilmer, 192. 77 nakamoto, “bitcoin,” 4. 
78 nakamoto, 5. 79 nakamoto, 5. 80 barski and wilmer, bitcoin for the befuddled, 192. 81 barski and wilmer, 193. 82 barski and wilmer, 193. 83 nakamoto, “bitcoin,” 5.

the open access citation advantage: does it exist and what does it mean for libraries?

colby lewis

information technology and libraries | september 2018 50

colby lewis (colbyllewis@gmail.com), a second year master of science in information student at the university of michigan school of information, is winner of the 2018 lita/ex libris student writing award.

abstract

the last literature review of research on the existence of an open access citation advantage (oaca) was published in 2011 by philip m. davis and william h. walters. this paper reexamines the conclusions reached by davis and walters by providing a critical review of oaca literature that has been published since 2011 and explores how increases in open access publication trends could serve as a leveraging tool for libraries against the high costs of journal subscriptions.

introduction

since 2001, when the term “open access” was first used in the context of scholarly literature, the debate over whether there is a citation advantage (ca) caused by making articles open access (oa) has plagued scholars and publishers alike.1 to date, there is still no conclusive answer to the question, or at least not one that the premier publishing companies have deemed worthy of acknowledging. there have been many empirical studies, but far fewer with randomized controls. the reasons for this range from data access to the numerous potential “methodological pitfalls” or confounding variables that might skew the data in favor of one argument or another. the most recent literature review of articles that explored the existence (or lack thereof) of an open access citation advantage (oaca) was published in 2011 by philip m. davis and william h. walters.
in that review, davis and walters ultimately concluded that “while free access leads to greater readership, its overall impact on citations is still under investigation. the large access-citation effects found in many early studies appear to be artifacts of improper analysis and not the result of a causal relationship.”2 this paper seeks to reexamine the conclusions reached by davis and walters in 2011 by providing a critical review of oaca literature that has been published since their review.3 this paper will examine the methods and conclusions that have provoked criticism of these studies and whether the studies address those criticisms. i will begin by identifying some of the top confounders in oaca studies, in particular the potential for self-archiving bias. i will then examine articles from july 2011, when davis and walters published their findings, to july 2017. there will be a few exceptions to this time frame, but the studies cited in figures 4 and 5 are entirely from this period. in addition to reviewing oaca studies since davis and walters’ 2011 study, i will explore the implications of an oaca on the future of publishing and the role of librarians in the subscription process.
as antelman points out in her association of college and research libraries conference paper, “leveraging the growth of open access in library collection decision making,” it is the responsibility of libraries to use the newest data and technology available to them in the interest of best serving their patrons and advancing scholarship.4 in connecting oaca studies and the potential bargaining power an oaca could bring libraries, i assess the current roles that universities and university libraries play in promoting (or not) oa publications and the implications of an oaca for researchers, universities, and libraries, and i provide suggestions on how recent research could influence the present trajectory. i conclude by summarizing what my findings tell us about the existence (or lack thereof) of an oaca, and what these findings imply for the future of library journal subscriptions and the publish-or-perish model for tenure. lastly, i will suggest some alternative metrics to citations that could be used by libraries in determining future journal subscriptions and general collection management.

the open access citation advantage | lewis https://doi.org/10.6017/ital.v37i3.10604

self-archiving bias and why it doesn’t matter

the idea of a self-archiving bias is based upon the concept that, if faced with a choice, authors will always opt to make their best work more widely available. effectively, when open access is not mandated, these articles may be specifically chosen to be made open access to increase readership and, hypothetically, citations.5 this biased selection method has the potential to confound the results of oaca studies because of the intuitive notion that an author’s best work is much more likely to be cited than any of their other work.
making this work available oa amplifies that effect, but it also prevents studies in which articles were self-archived from convincingly claiming that the citation advantage these articles received was due to oa rather than to their inherent quality and consequent likelihood of being cited anyway. in a 2010 study, gargouri et al. determined that articles by authors whose institutions mandated self-archiving (such as in an institutional repository [ir]) saw an oaca just as great for articles that were mandated to be oa as for articles that were self-selected to be oa.6 this by no means proves a causal relationship between oa and ca, but does counter the notion that self-archived articles are an uncontrollable confounder that automatically compromises the legitimacy of oaca studies.7 ottaviani affirms this conclusion in a 2016 study in which he writes, “in the long run better articles gain more citations than expected by being made oa, adding weight to the results reported by gargouri et al.”8 in short, claiming that articles self-selected for self-archiving irreparably confound oaca studies ignores the fact that these authors have accounted for the likelihood that articles of higher quality will inherently be cited more. as gargouri et al. put it, “the oa advantage [to self-archived articles] is a quality advantage, rather than a quality bias” (italics in original).9

gold versus green and their effect on oaca analyses

many critics of oaca studies have argued that such studies do not distinguish between gold oa, green oa, and hybrid (subscription journals that offer the option for authors to opt-in to gold oa) journals in their sample pool, thus skewing the results of their studies. in fact, there are many acknowledged subcategories of oa, but for the purposes of this paper, i will primarily focus on gold, green, and hybrid oa.
figure 1, provided by elsevier as a guide for their clients, distinguishes between gold and green oa.10 while the chart provided applies specifically to those looking to publish with elsevier, it highlights the overarching differences between gold oa and green oa. a comprehensive list of oa journals is available through the directory of open access journals (doaj) website (https://doaj.org/).

figure 1. elsevier explains to potential clients their options for publishing oa with elsevier and the differences between publishing with gold oa versus green oa.

the argument that not distinguishing between gold oa and green oa in oaca studies distorts study results primarily stems from the potential for skew in green oa journals. green oa journals allow authors to self-archive their articles after publication, but the articles are often not made full oa until an embargo period has passed. this problem was addressed in a recent study conducted by science-metrix and 1science, who manually checked and coded approximately 8,100 top-level domains (tlds).11 it is important to note that this study was made available as a white paper on the 1science website and has not been published in a peer-reviewed journal. additionally, 1science is a company built on providing oa solutions to libraries, which means they have a vested interest in proving the existence of an oaca. however, publishers such as elsevier have an equally vested interest in a substantial oaca not existing, so this alone should not prevent us from examining their data. for their study, 1science did not distinguish hybrid journals as being in a distinct journal category.
critics, such as the editorial director of journals policy for oxford university press, david crotty, were quick to fixate on this lack of distinction as a means of discrediting the study.12 employees of elsevier were similarly inclined to criticize the study, declaring that it, “like many others [studies] on this topic, does not appear to be randomized and controlled.”13 however, archambault et al., acknowledging that their study “does not examine the overlap between green and gold,” have provided an extremely comprehensive sample pool, examining 3,350,910 oa papers published between 2007 and 2009 in 12,000 journals.14 this paper examines the notion that “the advantage of oa is partly due to citations having a chance to arrive sooner . . . and concludes that the purported head start of oa papers is actually contrary to observed data.”15

in a more recent study published in february 2018, piwowar et al. examine the prevalence of oa and average relative citation (arc) based on three sample groups of one hundred thousand articles each: “(1) all journal articles assigned a crossref doi, (2) recent journal articles indexed in web of science, and (3) articles viewed by users of unpaywall, an open-source browser extension that lets users find oa articles using oadoi.”16 unlike the 1science study, piwowar et al. had a twofold purpose: to examine the prevalence of oa articles available on the web and whether an oaca exists based on their sample findings. i do not include their results in my literature review because of the dual focus of their study, although i do compare their results with those of archambault et al. and analyze the implications of their findings.

bronze: neither gold nor green

in their article, piwowar et al. introduce a new category of oa publication: bronze.
if gold oa refers to complete open access at the time of publication, and green oa refers to articles published in a paywalled journal but ultimately made oa either after an embargo period or via an ir, bronze oa refers to oa articles that somehow don’t fit into either of these categories. piwowar et al. define bronze oa articles as “free to read on the publisher page, but without any clearly identifiable license.”17 however, as crotty points out in a scholarly kitchen article reflecting on the preprint version of piwowar et al.’s article, “bronze” already exists as an oa category, but has simply been called “public access.”18 while coining “bronze” as a new term for “public access” is helpful in connecting it to oa terms such as “green” and “gold,” it is not quite the new phenomenon it is touted to be.

arc as an indication of an oaca

both archambault et al. and piwowar et al. provide the arc as a means of establishing a paper’s impact on the larger research community.19 within their arc analyses, archambault et al. distinguish between non-oa and oa, within which they differentiate between gold and green oa (figure 2). piwowar et al. group papers by closed (non-oa) and oa, with the following oa subcategories: bronze, hybrid, gold, and green oa (figure 3). an arc of 1.0 corresponds to the expected number of citations an article will receive “based on documents published in the same year and [national science foundation (nsf)] specialty.”20 based on this standard, an arc above or below 1.0 represents a citation impact that is the corresponding percentage above or below the expected citation impact of like articles. for example, an article with an arc of 1.23 has received 23 percent more citations than expected for articles published in the same year and specialty. this scale can be incredibly useful in determining the presence of a citation advantage, and it can enable researchers to determine overall ca patterns.
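the arithmetic behind the arc can be made concrete with a short python sketch. it assumes the simplest reading of the definition quoted above: each article’s citation count is divided by the average citation count for articles of the same year and nsf specialty, and those ratios are then averaged over a group of articles. the function names, baseline values, and citation counts below are hypothetical and are not taken from either study.

```python
from statistics import mean

def relative_citation(citations, expected):
    """Citations received divided by the expected count for the same year and specialty."""
    if expected <= 0:
        raise ValueError("expected citations must be positive")
    return citations / expected

def average_relative_citation(papers, baselines):
    """Average the relative citation scores of a group of papers."""
    scores = [
        relative_citation(p["citations"], baselines[(p["year"], p["specialty"])])
        for p in papers
    ]
    return mean(scores)

def percent_vs_expected(arc):
    """Express an arc as a percentage above (+) or below (-) the expected impact."""
    return (arc - 1.0) * 100.0

# Hypothetical baselines: average citations per article by (year, specialty).
baselines = {(2012, "ecology"): 10.0, (2012, "dentistry"): 4.0}

# Hypothetical group of oa articles.
oa_papers = [
    {"citations": 12, "year": 2012, "specialty": "ecology"},
    {"citations": 15, "year": 2012, "specialty": "ecology"},
    {"citations": 3, "year": 2012, "specialty": "dentistry"},
]

arc = average_relative_citation(oa_papers, baselines)
print(f"arc = {arc:.2f} ({percent_vs_expected(arc):+.0f} percent vs. expected)")
```

because each article is normalized against its own year and specialty before averaging, a group that mixes highly cited and sparsely cited fields can still be compared on one scale.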
figure 2. research impact of paywalled (not oa) versus open access (oa) papers “computed by science-metrix and 1science using oaindx and the web of science.” archambault et al., “research impact of paywalled versus open access papers,” white paper, science-metrix and 1science, 2016, http://www.1science.com/1numbr/.

critics’ fixation on the “randomized and controlled” nature of the 1science study ignores the fact that the authors do not claim causation. rather, their findings suggest the existence of an oaca when comparing oa (in all forms) and non-oa (in any form) articles (see figure 2). the authors ultimately conclude that “in all these fields, fostering open access (without distinguishing between gold and green) is always a better research impact maximization strategy than relying on strictly paywalled papers.”21 unlike archambault et al., piwowar et al. found that gold oa articles had a significantly lower arc, and that the average arc of all oa balances out to 1.18 because of the high arcs of bronze (1.22), hybrid (1.31), and green (1.33). however, both studies found that non-oa (referred to by piwowar et al. as “closed”) articles had an arc below 1.0, suggesting a definitive correlation between oa (without specifying type) and an increase in citations.

figure 3. “average relative citations of different access types of a random sample of world of science (wos) articles and review with a digital object identifier (doi) published between 2009 and 2015.” heather piwowar et al., “the state of oa: a large-scale analysis of the prevalence and impact of open access articles,” peerj, february 13, 2018, https://doi.org/10.7717/peerj.4375.
six years and what has changed in oaca research

between july 2011 and the publication of piwowar et al.’s work in february 2018, nine new oaca studies have been published in peer-reviewed journals. of these, five only look at the oaca in one field, such as cytology or dentistry. the other four are multidisciplinary studies, two of which are repository-specific and only use articles from deep blue and academia.edu, respectively. this is important to note because of critics’ earlier stated objections to the use of studies that are not randomized controlled studies. however, the deep blue study can still be considered a randomized controlled sample group because the authors are not self-selecting articles to upload to the repository as they are with academia.edu. rather, articles were made accessible through deep blue “via blanket licensing agreements between the publishers and the [university of michigan] library.”22 some of the field-specific studies use sample sizes that may not reflect a general oaca, but rather one only for that field, and in certain cases, only for a single journal.

field-specific studies

between july 2011 and july 2017, five field-specific studies were conducted to determine whether an oaca existed in those fields. i summarize the scope and conclusions of these studies in table 1. as you can see from the table, the article sample size varied vastly between studies, but that can likely be accounted for by considering the specific fields studied, since there are only five major cytopathology journals and nearly fifty major ecology journals. piwowar et al. acknowledge this in their study, noting that the nsf assigns all science journals “exactly one ‘discipline’ (a high-level categorization) and exactly one ‘specialty’ (a finer-grained categorization).”23 the more deeply nested in an nsf discipline a subject is, the more specialized the field becomes and the fewer journals there are on the subject.
this alone is reason not to extrapolate from the results of these studies and project their results on the existence of oaca across all fields.

only two of these studies, those focused on an oaca in dentistry and ecology, can be considered truly randomized controlled studies. both the cytopathology and marine ecology studies chose a specific set of journals from which to draw their entire sample pool. while the dentistry and ecology studies can be considered randomized controlled in nature, they still only reflect the occurrence (or lack thereof) of an oaca in those specific fields. it would be irresponsible to allow the results from studies in a single field of a single discipline to represent oaca trends across all disciplines. therefore, it is surprising that elsevier employees use the dentistry study to make such a claim. hersh and plume write, “another recent study by hua et al (2016) looking at citations of open access articles in dentistry found no evidence to suggest that open access articles receive significantly more citations than non-open access articles.”24 the key phrase missing from the end of this analysis is in dentistry. one might question whether a claim about multidisciplinary oaca can effectively be extrapolated from a single-field analysis. the authors do, two sentences later, qualify their earlier statement by saying, “in dentistry at least, the type of article you publish seems to make a difference but not oa status.”25 that is indeed what this study seems to show, and is therefore a logical claim to make. likewise, two of the three empirical studies in table 1 show that, for those respective fields, oa status does correlate with a citation advantage. in the case of the ecology study, the authors are confident enough in their randomized controlled methodology to claim causation.
26 the ecology study is the most recently published oaca study, and its authors were able to learn from similar past studies about the necessary controls and potential confounders in oaca studies. with this knowledge, tang et al. determined that:

by comparing oa and non-oa articles within hybrid journals, our estimate of the citation advantage of oa articles sets controls for many factors that could confound other comparisons. numerous studies have compared articles published in oa journals to those in non-oa journals, but such comparison between different journals could not rule out the impacts of potentially confounding factors such as publication time (speed) and quality and impact (rank) of the journal. these factors are effectively controlled with our focus on hybrid journals, thereby providing robust and general estimates of citation advantages on which to base publication decisions.27

summary of key field-specific studies

author | study design | content | number of articles | controls | results, interpretation, and conclusion
clements 2017 | empirical | 3 hybrid-oa marine ecology journals | all articles published in these journals between 2009 and 2012; specific number not provided | jif; article type; self-citations | “on average, open access articles received more peer-citations than non-open access articles.” oaca found.
frisch et al. 2014 | empirical | 5 cytopathology journals; 1 oa and 4 non-oa | 314 articles published between 2007 and 2011 | jif; author frequency; publisher neutrality | “overall, the averages of both cpp and q values were higher for oa cytopathology journal (cytojournal) than traditional non-oa journals.” oaca found.
gaulé and maystre 2011 | empirical | 1 major biology journal | 4,388 articles published between 2004 and 2006 | last author; characteristics; article quality | “we find no evidence for a causal effect of open access on citations. however, a quantitatively small causal effect cannot be statistically ruled out.” oaca not found.
hua et al. 2016 | randomized controlled | articles randomly selected from pubmed database, not specific dentistry journals | 908 articles published in 2013 | randomized article selection; exclusion of articles unrelated to dentistry; multidatabase search to determine oa status | “in the present study, there was no evidence to support the existence of oa ‘citation advantage’, or the idea that oa increases the citation of citable articles.” oaca not found.
tang et al. 2017 | randomized controlled | 46 hybrid-oa ecology journals | 3,534 articles published between 2009 and 2013 | gni of author country; randomized article pairing; article length | “overall, oa articles received significantly more citations than non-oa articles, and the citation advantage averaged approximately one citation per article per year and increased cumulatively over time after publication.” oaca found.

table 1. scope, controls, and results of field-specific oaca studies since 2011. based on a chart in stephan mertens, “open access: unlimited web based literature searching,” deutsches ärzteblatt international 106, no. 43 (2009): 711. jif, journal impact factor; cpp, citations per publication; q, q-value (see frisch, nora k., romil nathan, yasin k. ahmed, and vinod b. shidham. “authors attain comparable or slightly higher rates of citation publishing in an open access journal (cytojournal) compared to traditional cytopathology journals: a five year (2007–2011) experience.” cytojournal 11, no. 10 (april 2014). https://doi.org/10.4103/1742-6413.131739 for specific equation used.)
https://doi.org/10.4103/1742-6413.131739 information technology and libraries | september 2018 58 summary of key multidisciplinary studies author study design content number of articles controls results, interpretation, and conclusion mccabe and snyder 2014 empirical 100 journals in ecology, botany, and multidisciplinary science all articles published in these journals between 1996 and 2005; specific number not provided jif; journal founding year “we found that open access only provided a significant increase for those volumes made openly accessible via the narrow channel of their own websites rather than the broader pubmed central platform.” oaca found. niyazov et al. 2016 empirical unspecified number of journals across 23 academic divisions 31,216 articles published between 2009 and 2012 field; jif; publication vs. upload date “we find a substantial increase in citations associated with posting an article to academia.edu. . . . we find that a typical article that is also posted to academia.edu has 49% more citations than one that is only available elsewhere online through a non-academia.edu venue.” oaca found for academia.edu. ottaviani 2016 randomized controlled unspecified number of journals who have blanket licensing agreements between the publishers and the university of michigan library 93,745 articles published between 1990 and 2013 self-selection “even though effects found here are more modest than reported elsewhere, given the conservative treatments of the data and when viewed in conjunction with other oaca studies already done, the results lend support to the existence of a real, measurable, open access citation advantage with a lower bound of approximately 20%.” oaca found. sotudeh et al. 
2015 empirical 633 apc-funded oa journals published by springer and elsevier 995,508 articles published between 2007 and 2011 journals who adopted oa policies after 2007 journals with non– article processing charge oa policies “the apc oa papers are, also, revealed to outperform the ta ones in their citation impacts in all the annual comparisons. this finding supports the previous results confirming the citation advantage of oa papers.” oaca found. table 2. scope, controls, and results of multi-disciplinary oaca studies since 2011. jif, journal impact factor; apc, article processing charge; ta, toll access the open access citation advantage | lewis 59 https://doi.org/10.6017/ital.v37i3.10604 based on the randomized controlled methodology that tang et al. found hybrid journals to provide, it is possible that this study may serve as an ideal model for future larger oaca studies across multiple disciplines. however, more field-specific hybrid journal studies will have to be conducted before determining if this model would be the most accurate method for measuring oaca across multiple disciplines in a single study. multidisciplinary studies the multidisciplinary oaca studies conducted since 2011 include a single randomized control study and three empirical studies (table 2). all these studies found an oaca; in the case of niyazov et al., an oaca was found specifically for articles posted to academia.edu. i included this study because it is an important contribution to the premise that a relationship exists between self selection and oaca. niyazov et al. highlight this point in the section “sources of selection bias in academia.edu citations,” explaining that “even if academia.edu users were not systematically different than non-users, there might be a systematic difference between the papers they choose to post and those they do not. as [many] . . . 
have hypothesized, users may be more likely to post their most promising, ‘highest quality’ articles to the site, and not post articles they believe will be of more limited interest.”28 to underscore this point, i refer to gargouri et al., who stated that “the oa advantage [to self archived articles] is a quality advantage, rather than a quality bias” (italics in original).29 again, it is unsurprising that articles of higher caliber are cited more and that making such articles more readily available increases the number of citations they would likely already receive. similar to my conclusion in the field-specific study section, we simply need more randomized controlled studies, such as ottaviani’s, to determine the nature and extent of the relationship between oa and ca across multiple disciplines.

conclusions

critics of some of the most recent studies, specifically archambault et al. and ottaviani, have argued that authors of oaca studies are too quick to claim causation. while a claim of causation does indeed require strict adherence to statistical methodology and control of potential confounders, few of the authors i have examined actually claim causation. they recognize that the empirical nature of their studies is not enough to prove causation and can only provide insight into the correlation between open access and a citation advantage. in all their conclusions, these authors acknowledge that further studies are needed to prove a causal relationship between oa and ca. the recent work published by piwowar et al. provides a potential model for replication by other researchers, and ottaviani offers a replicable method for other large research institutions with non-self-selecting institutional repositories. alternatively, field-specific studies conducted in the style of tang et al. across all fields would serve to provide a wider array of evidence for the occurrence of field-specific oaca and therefore of a more widespread oaca.
recent developments in oa search engines have created alternative routes to many of the same articles offered by subscriptions, but at a fraction (if any) of the cost. antelman proposed that libraries use an oa-adjusted cost per download (oa-adj cpd), a metric that “subtracts the downloads that could be met by oa copies of articles within subscription journals,” as a tool for negotiating the price of journal subscriptions.30 by calculating an oa-adj cpd, libraries could potentially leverage their ability to access journal articles through means other than traditional subscription bundles to save money and encourage oa publication. while antelman suggests using oa-adj cpd as a leveraging tool when making deals with publishers for journal subscriptions, i suggest that libraries use the data-gathering methods of piwowar et al. via unpaywall to determine whether enough articles from a specific journal can be found oa via unpaywall. by using metrics such as those collected by piwowar et al. through unpaywall, the potential confounding variable of articles found through illegitimate means (such as sci-hub) is alleviated. instead, piwowar et al.’s metrics focus on tracking the percentage of material searched by library patrons that can be found oa through the unpaywall browser extension. according to unpaywall’s “libraries user guide” page, libraries “can integrate unpaywall into their sfx, 360 link, or primo link resolvers, so library users can read oa copies in cases where there's no subscription access. over 1000 libraries worldwide are using this now.”31 ideally, scholars will also be more willing to publish papers oa, and institutions will be more supportive of providing the necessary costs for making publications oa.
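the oa-adjusted cost per download described above can be sketched in a few lines. this is only one reading of antelman's description (cost divided by the downloads that remain after subtracting those an oa copy could have met); the function names and the subscription figures are hypothetical and are not taken from antelman or from any library's data.

```python
def cost_per_download(subscription_cost: float, downloads: int) -> float:
    """Conventional cost per download: cost divided by all recorded downloads."""
    if downloads <= 0:
        raise ValueError("downloads must be positive")
    return subscription_cost / downloads


def oa_adjusted_cpd(subscription_cost: float, downloads: int, oa_downloads: int) -> float:
    """OA-adjusted CPD (assumed form): drop downloads that an OA copy could have met."""
    if downloads <= 0:
        raise ValueError("downloads must be positive")
    if not 0 <= oa_downloads <= downloads:
        raise ValueError("oa_downloads must be between 0 and downloads")
    paid_only = downloads - oa_downloads
    if paid_only == 0:
        # Every download could have been served by an OA copy.
        return float("inf")
    return subscription_cost / paid_only


# Hypothetical journal: 5,000 per year, 2,000 downloads, 500 of them available OA.
plain = cost_per_download(5000.0, 2000)
adjusted = oa_adjusted_cpd(5000.0, 2000, 500)
print(f"cpd: {plain:.2f}, oa-adj cpd: {adjusted:.2f}")  # cpd: 2.50, oa-adj cpd: 3.33
```

under these assumed numbers the adjusted figure is higher than the plain one, which is the point of the metric: the more of a journal's use that oa copies could satisfy, the more expensive each remaining subscription-only download looks.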
though the publish-or-perish model still reigns in academia, there is great potential in encouraging tenured professors to publish oa by supplementing the costs through institutional grants and other incentives wrapped into a tenure agreement. perhaps through this model, as gargouri et al. have suggested, the longstanding publish-or-perish doctrine will give way to an era of “self-archive to flourish.”32

bibliography

antelman, kristin. “leveraging the growth of open access in library collection decision making.” acrl 2017 proceedings: at the helm, leading the transformation, march 22–25, baltimore, maryland, ed. dawn m. mueller (chicago: association of college and research libraries, 2017), 411–22. http://www.ala.org/acrl/sites/ala.org.acrl/files/content/conferences/confsandpreconfs/2017/leveragingthegrowthofopenaccess.pdf.
archambault, éric, grégoire côté, brooke struck, and matthieu voorons. “research impact of paywalled versus open access papers.” white papers, science-metrix and 1science, 2016. http://www.1science.com/1numbr/.
calver, michael c. and j. stuart bradley. “patterns of citations of open access and non-open access conservation biology journal papers and book chapters.” conservation biology 24, no. 3 (may 2010): 872–80. https://doi.org/10.1111/j.1523-1739.2010.01509.x.
chua, s. k., ahmad m. qureshi, vijay krishnan, dinker r. pai, laila b. kamal, sharmilla gunasegaran, m. z. afzal, lahri ambawatta, j. y. gan, p. y. kew, et al. “the impact factor of an open access journal does not contribute to an article’s citations” [version 1; referees: 2 approved]. f1000 research 6 (2017): 208. https://doi.org/10.12688/f1000research.10892.1.
clarivate analytics. “incites journal citation reports.” dataset updated september 9, 2017. https://jcr.incites.thomsonreuters.com/.
clements, jeff c. “open access articles receive more citations in hybrid marine ecology journals.” facets 2 (january 2017): 1–14. https://doi.org/10.1139/facets-2016-0032.
crotty, david. “study suggests publisher public access outpacing open access; gold oa decreases citation performance.” scholarly kitchen, october 4, 2017. https://scholarlykitchen.sspnet.org/2017/10/04/study-suggests-publisher-public-access-outpacing-open-access-gold-oa-decreases-citation-performance/.
crotty, david. “when bad science wins, or ‘i’ll see it when i believe it.’” scholarly kitchen, august 31, 2016. https://scholarlykitchen.sspnet.org/2016/08/31/when-bad-science-wins-or-ill-see-it-when-i-believe-it/.
davis, philip m. “open access, readership, citations: a randomized controlled trial of scientific journal publishing.” faseb journal 25, no. 7 (july 2011): 2129–34. https://doi.org/10.1096/fj.11-183988.
davis, philip m., and william h. walters. “the impact of free access to the scientific literature: a review of recent research.” journal of the medical library association 99, no. 3 (july 2011): 208–17. https://doi.org/10.3163/1536-5050.99.3.008.
elsevier. “your guide to publishing open access with elsevier.” amsterdam, netherlands: elsevier, 2015. https://www.elsevier.com/__data/assets/pdf_file/0020/181433/openaccessbooklet_may.pdf.
evans, james a. and jacob reimer. “open access and global participation in science.” science 323, no. 5917 (february 2009): 1025. https://doi.org/10.1126/science.1154562.
eysenbach, gunther. “citation advantage of open access articles.” plos biology 4, no. 5 (may 2006): e157.
https://doi.org/10.1371/journal.pbio.0040157.
fisher, tim. “top-level domain (tld).” lifewire, july 30, 2017. https://www.lifewire.com/top-level-domain-tld-2626029.
frisch, nora k., romil nathan, yasin k. ahmed, and vinod b. shidham. “authors attain comparable or slightly higher rates of citation publishing in an open access journal (cytojournal) compared to traditional cytopathology journals—a five year (2007–2011) experience.” cytojournal 11, no. 10 (april 2014). https://doi.org/10.4103/1742-6413.131739.
gaulé, patrick, and nicolas maystre. “getting cited: does open access help?” research policy 40, no. 10 (december 2011): 1332–38. https://doi.org/10.1016/j.respol.2011.05.025.
gargouri, yassine, chawki hajjem, vincent larivière, yves gingras, les carr, tim brody, and stevan harnad. “self-selected or mandated, open access increases citation impact for higher quality research.” plos one 5, no. 10 (october 2010). https://doi.org/10.1371/journal.pone.0013636.
hajjem, chawki, stevan harnad, and yves gingras. “ten-year cross-disciplinary comparison of the growth of open access and how it increases research citation impact.” ieee data engineering bulletin 28, no. 4 (december 2005): 39–46.
hall, martin. “green or gold? open access after finch.” insights 25, no. 3 (november 2012): 235–40. https://doi.org/10.1629/2048-7754.25.3.235.
hersh, gemma, and andrew plume. “citation metrics and open access: what do we know?” elsevier connect, september 14, 2016. https://www.elsevier.com/connect/citation-metrics-and-open-access-what-do-we-know.
houghton, john, and alma swan. “planting the green seeds for a golden harvest: comments and clarifications on ‘going for gold.’” d-lib magazine 19, no. 1/2 (january/february 2013). https://doi.org/10.1045/january2013-houghton.
hua, fang, heyuan sun, tanya walsh, helen worthington, and anne-marie glenny. “open access to journal articles in dentistry: prevalence and citation.” journal of dentistry 47 (april 2016): 41–48. https://doi.org/10.1016/j.jdent.2016.02.005.
internet corporation for assigned names and numbers. “list of top-level domains.” last updated september 13, 2018. https://www.icann.org/resources/pages/tlds-2012-02-25-en.
jump, paul.
“open access papers ‘gain more traffic and citations.’” times higher education, july 30, 2014. https://www.timeshighereducation.com/home/open-access-papers-gain-more-traffic-and-citations/2014850.article.
mccabe, mark j., and christopher m. snyder. “identifying the effect of open access on citations using a panel of science journals.” economic inquiry 52, no. 4 (october 2014): 1284–1300. https://doi.org/10.1111/ecin.12064.
mccabe, mark j., and christopher m. snyder. “does online availability increase citations? theory and evidence from a panel of economics and business journals.” review of economics and statistics 97, no. 1 (march 2015): 144–65. https://doi.org/10.1162/rest_a_00437.
mertens, stephan. “open access: unlimited web based literature searching.” deutsches ärzteblatt international 106, no. 43 (2009): 710–12. https://doi.org/10.3238/arztebl.2009.0710.
moed, hank. “does open access publishing increase citation or download rates?” research trends 28 (may 2012). https://www.researchtrends.com/issue28-may-2012/does-open-access-publishing-increase-citation-or-download-rates/.
niyazov, yuri, carl vogel, richard price, ben lund, david judd, adnan akil, michael mortonson, josh schwartzman, and max shron. “open access meets discoverability: citations to articles posted to academia.edu.” plos one 11, no. 2 (february 2016): e0148257. https://doi.org/10.1371/journal.pone.0148257.
ottaviani, jim. “the post-embargo open access citation advantage: it exists (probably), it’s modest (usually), and the rich get richer (of course).” plos one 11, no. 8 (august 2016): e0159614. https://doi.org/10.1371/journal.pone.0159614.
pinfield, stephen, jennifer salter, and peter a. bath. “a ‘gold-centric’ implementation of open access: hybrid journals, the ‘total cost of publication,’ and policy development in the uk and beyond.” journal of the association for information science and technology 68, no. 9 (september 2017): 2248–63. https://doi.org/10.1002/asi.23742.
piwowar, heather, jason priem, vincent larivière, juan pablo alperin, lisa matthias, bree norlander, ashley farley, jevin west, and stefanie haustein. “the state of oa: a large-scale analysis of the prevalence and impact of open access articles.” peerj (february 13, 2018): 6:e4375. https://doi.org/10.7717/peerj.4375.
research information network. “nature communications: citation analysis.” press release, 2014. https://www.nature.com/press_releases/ncomms-report2014.pdf.
riera, m. and e. aibar. “¿favorece la publicación en abierto el impacto de los artículos científicos? un estudio empírico en el ámbito de la medicina intensiva” [does open access publishing increase the impact of scientific articles? an empirical study in the field of intensive care medicine]. medicina intensiva 37, no. 4 (may 2013): 232–40. http://doi.org/10.1016/j.medin.2012.04.002.
sotudeh, hajar, zahra ghasempour, and maryam yaghtin.
“the citation advantage of author-pays model: the case of springer and elsevier oa journals.” scientometrics 104 (june 2015): 581–608. https://doi.org/10.1007/s11192-015-1607-5.
swan, alma, and john houghton. “going for gold? the costs and benefits of gold open access for uk research institutions: further economic modelling.” report to the uk open access implementation group, june 2012. http://wiki.lib.sun.ac.za/images/d/d3/report-to-the-uk-open-access-implementation-group-final.pdf.
tang, min, james d. bever, and fei-hai yu. “open access increases citations of papers in ecology.” ecosphere 8, no. 7 (july 2017): 1–9. https://doi.org/10.1002/ecs2.1887.
unpaywall. “libraries user guide.” accessed september 13, 2018. https://unpaywall.org/user-guides/libraries.
wray, k. brad. “no new evidence for a citation benefit for author-pay open access publications in the social sciences and humanities.” scientometrics 106 (january 2016): 1031–35. https://doi.org/10.1007/s11192-016-1833-5.

endnotes

1 elsevier, “your guide to publishing open access with elsevier” (amsterdam, netherlands: elsevier, 2015), 2, https://www.elsevier.com/__data/assets/pdf_file/0020/181433/openaccessbooklet_may.pdf.
2 philip m. davis and william h. walters, “the impact of free access to the scientific literature: a review of recent research,” journal of the medical library association 99, no. 3 (july 2011): 213, https://doi.org/10.3163/1536-5050.99.3.008.
3 davis and walters, “the impact of free access,” 208.
4 kristin antelman, “leveraging the growth of open access in library collection decision making,” acrl 2017 proceedings: at the helm, leading the transformation, march 22–25, baltimore, maryland, ed. dawn m. mueller (chicago: association of college and research libraries, 2017): 411, 413, http://www.ala.org/acrl/sites/ala.org.acrl/files/content/conferences/confsandpreconfs/2017/leveragingthegrowthofopenaccess.pdf.
5 research information network, “nature communications: citation analysis,” press release, 2014, https://www.nature.com/press_releases/ncomms-report2014.pdf.
6 gargouri et al., “self-selected or mandated, open access increases citation impact for higher quality research,” plos one 5, no. 10 (october 2010): 17, https://doi.org/10.1371/journal.pone.0013636.
7 david crotty, “when bad science wins, or ‘i’ll see it when i believe it,’” scholarly kitchen, august 31, 2016, https://scholarlykitchen.sspnet.org/2016/08/31/when-bad-science-wins-or-ill-see-it-when-i-believe-it/.
8 jim ottaviani, “the post-embargo open access citation advantage: it exists (probably), it’s modest (usually), and the rich get richer (of course),” plos one 11, no. 8 (august 2016): 9, https://doi.org/10.1371/journal.pone.0159614.
9 gargouri et al., “self-selected or mandated,” 18.
10 elsevier, “your guide to publishing,” 2.
11 top-level domain (tld) refers to the last string of letters in an internet domain name (e.g., the tld of www.google.com is .com). for more information on tlds, see tim fisher, “top-level domain (tld),” lifewire, july 30, 2017, https://www.lifewire.com/top-level-domain-tld-2626029. for a full list of tlds, see “list of top-level domains,” internet corporation for assigned names and numbers, last updated september 13, 2018, https://www.icann.org/resources/pages/tlds-2012-02-25-en.
12 crotty, “when bad science wins.”
13 hersh and plume, “citation metrics and open access: what do we know?,” elsevier connect, september 14, 2016, https://www.elsevier.com/connect/citation-metrics-and-open-access-what-do-we-know.
14 archambault et al., “research impact of paywalled versus open access papers,” white paper, science-metrix and 1science, 2016, http://www.1science.com/1numbr/.
15 archambault et al., “research impact.”
16 heather piwowar et al., “the state of oa: a large-scale analysis of the prevalence and impact of open access articles,” peerj, february 13, 2018, https://doi.org/10.7717/peerj.4375.
17 piwowar et al., “the state of oa,” 5.
18 david crotty, “study suggests publisher public access outpacing open access; gold oa decreases citation performance,” scholarly kitchen, october 4, 2017, https://scholarlykitchen.sspnet.org/2017/10/04/study-suggests-publisher-public-access-outpacing-open-access-gold-oa-decreases-citation-performance/.
19 archambault et al., “research impact”; piwowar et al., “the state of oa,” 15.
20 piwowar et al., “the state of oa,” 9–10.
21 archambault et al., “research impact.”
22 ottaviani, “the post-embargo open access citation advantage,” 2.
23 piwowar et al., “the state of oa,” 9.
24 hersh and plume, “citation metrics and open access.”
25 hersh and plume, “citation metrics and open access.”
26 tang et al., “open access increases citations of papers in ecology,” ecosphere 8, no. 7 (july 2017): 8, https://doi.org/10.1002/ecs2.1887.
27 tang et al., “open access increases citations,” 7. tang et al. list the following as examples of the “numerous studies” as quoted above, which i did not include in the quote for the purpose of brevity: (antelman 2004, hajjem et al.
2005, eysenbach 2006, evans and reimer 2009, calver and bradley 2010, riera and aibar 2013, clements 2017).
28 yuri niyazov et al., “open access meets discoverability: citations to articles posted to academia.edu,” plos one 11, no. 2 (february 2016): e0148257, https://doi.org/10.1371/journal.pone.0148257.
29 gargouri et al., “self-selected or mandated,” 18.
30 antelman, “leveraging the growth,” 414.
31 “libraries user guide,” unpaywall, accessed september 13, 2018, https://unpaywall.org/user-guides/libraries.
32 gargouri et al., “self-selected or mandated,” 20.

116 journal of library automation vol. 14/2 june 1981

tions only. they do not list the individual works that may be contained in publications. if an analytic catalog were to be built into a computerized system at some time in the future, the structure code would be a great help in the redesign, because it makes it easy to spot items that need analytics, namely those that contain embedded works, or codes 2, 4, 5, 6, 8, 9, 10, 11, and 13. a searcher working with such an analytic catalog could use the code to limit output to manageable stages: first all items of type c, for example; then broadening the search to include those of type d; and so forth, until enough relevant material has been found.
the structure code would also be useful in the displayed output. if codes 5 or 8 appeared together with a bibliographic description on the screen, this would tell the catalog user that the item retrieved is a set of many separately titled documents. a complete list of those titles can then be displayed to help the searcher decide which of the documents are relevant for him. in the card catalog this is done by means of contents notes. not all libraries go to the trouble of making contents notes, though, and not all contents notes are complete and reliable. the structure code would ensure consistency and completeness of contents information at all times. codes 10 and 13 in a search output, analogously, would tell the user that the item is a serial with individual issue titles. there is no mechanism in the contemporary card catalog to inform readers of those titles. codes 4 and 7 would tell that the document is part of a finite set, and so forth. it has been the general experience of database designers that a record cannot have too many searchable elements built into its format. no sooner is one approach abandoned "because nobody needs it," than someone arrives on the scene with just that requirement. it can be anticipated, then, that once the structure code is part of the standard record format, catalog users will find many other ways to work the code into search strategies. it can also be anticipated that the proposed structure code, by adding a factor of selectivity, will help catalogers because it strengthens the authority-control aspect of machine-readable catalog files. if two publications bear identical titles, for example, and one is of structure 1, the other of structure 6, then it is clear that they cannot possibly be the same items. however, if they are of structures 1 and 7, respectively, extra care must be taken in cataloging, for they could be different versions of the same work.
determination of the structure of an item is a by-product of cataloging, for no librarian can catalog a book unless he understands what the structure of that book is: one or more works, one or more documents per item, open or closed set, and so forth. it would therefore be very cheap at cataloging time to document the already-performed structure analysis and express this structure in the form of a code.

references

1. herbert h. hoffman, descriptive cataloging in a new light: polemical chapters for librarians (newport beach, calif.: headway publications, 1976), p.43.

revisions to contributed cataloging in a cooperative cataloging database

judith hudson: university libraries, state university of new york at albany.

introduction

oclc is the largest bibliographic utility in the united states. one of its greatest assets is its computerized database of standardized cataloging information. the database, which is built on the principle of shared cataloging, consists of cataloging records input from library of congress marc tapes and records contributed by member libraries.

oclc standards

in order to provide records contributed by member libraries that are as usable as those input from marc tapes, it is imperative that the records meet the standards set by oclc and that the cataloging and formatting of the records be free of errors. member libraries are requested to follow the nationally accepted cataloging code (anglo-american cataloging rules, north american text,1,2 for records input before december 12, 1980, and anglo-american cataloguing rules, second edition,3 for records input later), the library of congress' application of the cataloging code, and the various marc formats in preparing records to be input.
4 • 5 the cataloging rules dictate what kind of bibliographic information should be included in the cataloging records, a prescribed system of punctuation that identifies the various fields of the cataloging record (international standard bibliographic description, isbd), which access points should be provided, and what form the entries should take. the marc formats provide a standardized method of identifying the various fields and subfields in a cataloging record and, through the use of indicators, information necessary to make the record easily manipulated by computers. in addition, fixed fields provide coded information about the cataloging records. the form of main, added, and series entries can be verified in the national union catalog to ensure that member libraries are following the library of congress' application of the cataloging code . by the same token, subject entries can be verified in the appropriate subject heading list (e.g., library of congress subject headings, sears subject headings, etc.). a study of oclc member cataloging a major problem with the use of contributed cataloging is the amount of revision needed to bring the records up to the standards described above. in 1975, a study of the quality of a group of membercontributed catalog records was conducted by c. c . ryans. 6 the first 700 monographic records input into oclc after september 1, 1975, to which kent state university attached its holdings were examined. 7 the analysis included changes in or additions to main, added, or series communications 117 entries, changes in descriptive cataloging, and changes in or additions to subject headings . the study dealt only with the revision of cataloging; revision of the formatting of records was not noted. the kent state study found that 393 revisions were necessary to 283 records. the remaining 417 records were considered to be acceptable, i.e., they adhered to aacr and isbd rules and to the oclc standards for input cataloging. 
recent developments relating to quality control

since these records were studied, the internetwork quality control council was formed in 1977 by the oclc board of trustees.[8] its primary purpose is to identify problem areas regarding quality control and distribute information to networks concerning problems and solutions. its role is to promote quality control through education and by monitoring the implementation of standards.

in addition, oclc's documentation has steadily improved. the recent publication of the books format [9] and the recent revision of the cataloging manual [10] provide clear and specific information on oclc's formatting requirements.

with these developments in mind, it would seem likely that the quality of the contributed cataloging has improved since 1975. in order to test this assumption, a number of cataloging records were analyzed in an effort to replicate the kent state study. the analysis of these records differed from the earlier study in that differences in the treatment of series were not noted because one library's treatment of series can reasonably be expected to differ from that of another.

journal of library automation vol. 14/2 june 1981

methodology

the records included in this study consist of 1,017 monographic catalog records to which the state university of new york at albany (sunya) library added its holding symbol during an eight-month period from november 1979 to july 1980. the records included only those that were entered into the oclc database after 1976. cataloging revisions that were noted consisted of changes in main and added entries to make them consistent with library of congress form of entry, and the inclusion of other added entries that were deemed necessary to provide adequate access to the material. in addition, corrections or additions to the imprint and the collation were noted, as were typographical errors in all fields.
subject headings that were changed to make them consistent with library of congress subject headings and subject headings and/or subdivisions added to provide better subject access to the material were also noted.

analysis of cataloging

cataloging revisions were required for 43 percent of the 1,017 records examined (596 changes or additions were made to 437 records). changes or additions to subject headings were made to 22.4 percent of all the records in the sunya sample, and represented the most common revision. changes in descriptive cataloging were made to 20 percent of the records, and changes or additions to main or added entries were made to approximately 16 percent of the records.

table 1 compares the results of this analysis with the findings of the earlier study. it should be emphasized that the two studies are not exactly comparable because the kent state study included differences in the treatment of series, while this study noted only typographical errors in series statements.

the findings of this analysis do not bear out the hypothesis that the quality of member-contributed cataloging has improved since 1975. the overall percentage of records requiring cataloging revision is similar in both the kent state and the sunya samples. the percentage of changes made in the various areas of the cataloging records was similar, with the exception of added entries and subject headings. in the sunya sample, more revisions and additions were made to these two areas. this difference between the two samples may reflect variation in the cataloging policies of the two libraries rather than the presence or absence of more errors in member-contributed catalog records.

analysis of oclc reportable errors and additions

in the fall of 1979, oclc distributed its revised cataloging manual, which includes a chapter dealing with quality control.[11] the chapter delineates the errors and changes that are to be reported to oclc for correction or addition.
the cataloging records examined in this study were also analyzed with these criteria in mind. this analysis (table 2) revealed that 661 reportable errors or changes were found on 486 records (47.8 percent of all the records). reportable errors or changes included formatting errors or omissions such as incorrect assignment of tags, incorrect or missing indicators, subfield codes or fixed fields, and errors affecting retrieval or card printing. other types of errors included incorrect or omitted access points (added or subject entries, isbn, lc card numbers, etc.), errors in transcription of data, incorrect isbd, and the omission of needed bibliographic information.

table 1. comparison of two studies of cataloging revision

area needing                  kent state sample*      sunya sample
revision or addition          number  percentage      number  percentage
main entry                        44         6.2          46         4.5
title statement                   28         4.0          76         7.5
edition statement                  4         0.6           2         0.2
imprint                           29         4.4          64         6.3
collation                        111        15.9          58         5.7
series                            55         7.9           3         0.3
subject heading                   88        12.6         228        22.4
added entries                     44         6.2         119        11.7
total records in study           700       100.0        1017       100.0
records requiring revision       283        40.4         437        43.0
number of revisions made         393                     596

*source: constance c. ryans, "a study of errors found in non-marc cataloging in a machine-assisted system," journal of library automation 11:128 (june 1978).

table 2. errors and additions reportable to oclc

                                                              percentage of   percentage of total
                                                    number    total records   errors and additions
errors in transcription of data                         19             1.9              2.9
incorrect assignment of tags                             6             0.6              0.9
incorrect or missing subfield codes                     13             1.3              2.0
incorrect assignment of 1st indicator                   17             1.7              2.6
incorrect assignment of 2d indicator                    59             5.8              8.9
incorrect fixed fields                                 313            30.8             47.4
incorrect isbd                                           8             0.8              1.2
incorrect form of entry (less than lc)                  87             8.6             13.2
errors affecting retrieval or card printing              3             0.3              0.5
bibliographic information missing                        1             0.1              0.2
addition of access points                              135            13.3             20.4
total number of records containing reportable
  errors or additions                                  486            47.8
total number of reportable errors or additions         661                            100.0

approximately 40 percent (408) of the records contained formatting errors, with over 29 percent (300) of the records containing incomplete or incorrect fixed fields. the apparent unconcern with fixed fields may stem from a lack of understanding of the value of correct fixed-field information. the recent addition of date and type of material as qualifiers in a search of the database is one example of the use of fixed fields. in order to underscore their importance, it might be useful for oclc to highlight this use of fixed fields and further explain to its members how other fixed fields might be used in online search strategies in the future.

errors in or omission of access points were found in 222 records (21.8 percent). these errors were also noted in the study of cataloging revisions discussed above, as were errors in transcription of data, in isbd, and in omission of necessary bibliographic information.

summary of findings

although the quality of the sunya sample seems equivalent to that of the kent state sample, an analysis by date of input of the records examined indicates a slight decrease in the percentage of records needing correction for those records input in 1979 and 1980 (table 3). perhaps this is the beginning of a trend toward more careful cataloging and formatting of records input by members.

in summary, 589 of the 1,017 member-contributed records studied were found to require revision. of these, 486 records contained errors or omissions that may be reported to oclc, and 437 required cataloging revision. it is discouraging to realize that approximately 60 percent of the member records used required revision. such a high percentage of records needing revision necessitates the review of all member records used if a library wishes to adhere to oclc standards for cataloging.
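the figures in this summary can be rechecked with a short python sketch. the counts are those reported in the text and in table 3; the function names are illustrative, and the overlap figure rests on an assumption the article does not state outright, namely that the 589 records are exactly the union of the two groups (486 reportable, 437 needing cataloging revision).

```python
# Counts reported in the summary of findings.
TOTAL_RECORDS = 1017
NEEDING_REVISION = 589      # records requiring any revision
REPORTABLE = 486            # records with errors reportable to OCLC
CATALOGING_REVISION = 437   # records requiring cataloging revision

# Table 3: year of input -> (records input, records needing correction).
YEARLY = {
    1977: (186, 115),
    1978: (332, 202),
    1979: (339, 184),
    1980: (160, 88),
}


def percent(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


def yearly_percentages() -> dict:
    """Recompute the last column of table 3 from its two count columns."""
    return {year: percent(bad, total) for year, (total, bad) in YEARLY.items()}


def overlap() -> int:
    """Records in both groups, ASSUMING the 589 are the union of the two."""
    return REPORTABLE + CATALOGING_REVISION - NEEDING_REVISION


if __name__ == "__main__":
    print(percent(NEEDING_REVISION, TOTAL_RECORDS))  # 57.9, i.e. roughly 60 percent
    print(yearly_percentages())
    print(overlap())                                 # 334 under the stated assumption
```

the yearly counts also sum to the reported totals (1,017 records, 589 needing correction), which is a useful consistency check on the reconstructed table.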
reviewing every member record in this way leads to tremendous duplication of effort and negates, in part, the purpose of shared cataloging.

table 3. yearly breakdown of catalog records

year        total number    records needing    percentage needing
of input    of records      correction         correction
1977             186              115                61.8
1978             332              202                60.8
1979             339              184                54.3
1980             160               88                55.0

influences for change

the implementation of aacr2 in 1981 provides the impetus for greater adherence to standards. since all catalogers have had to learn the new cataloging requirements, greater care may be used in the formulation of records by member libraries.

the publication of clear and specific guidelines for reportable errors may help to alleviate the situation in two ways. first, the careful articulation of errors or desirable additions may impel member libraries to place more emphasis on the quality control of input. second, member libraries may report more errors, thus allowing oclc to correct the master records.

a change in the method of correcting errors and the rate at which they are corrected might be beneficial. presently, errors on the master records can only be corrected by oclc or by the inputting library if it is the only library that has used the record. such an arrangement is clumsy and time-consuming. if other member libraries were trained and authorized to correct errors on master records, errors might be corrected as often as they are detected.

in the long run, however, the responsibility for inputting catalog records that meet the standards for cataloging and formatting rests with the member libraries. oclc and the networks must develop methods of encouraging libraries to input records that are correctly formatted and cataloged. one way of alleviating the problem might be to develop training programs conducted by oclc or by network staff that are aimed at those libraries identified as having high error rates.
another approach might be to give public recognition to libraries that contribute cataloging of high quality to the database. one example of this approach is the pittsburgh regional library council's fred award, which annually honors the library with the lowest error rate in the prlc network.[12]

through the use of peer pressure the member libraries and networks of oclc can encourage adherence to the standards. in addition, they must continue to insist that oclc address this annoying, expensive, and seemingly perennial problem.

references

1. anglo-american cataloging rules, north american text (chicago: american library assn., 1967), 409p.
2. anglo-american cataloging rules, chapter 6 (rev. ed.; chicago: american library assn., 1974), 122p.
3. anglo-american cataloguing rules, second edition (chicago: american library assn., 1978), 620p.
4. oclc, inc., cataloging: user manual (columbus: oclc, 1979), 1v. (looseleaf).
5. oclc level i and level k input standards (columbus: ohio college library center, 1977), 1v. (looseleaf).
6. constance c. ryans, "a study of errors found in non-marc cataloging in a machine-assisted system," journal of library automation 11:125-32 (june 1978).
7. ibid., p. 127.
8. frederick g. kilgour, "establishment of inter-network quality control council" (unpublished document, ohio college library center, 1977), 2p.
9. oclc, inc., books format (columbus: oclc, 1980), 1v. (looseleaf).
10. oclc, inc., cataloging: user manual, 1v. (looseleaf).
11. ibid.
12. "prlc peer council cites pittsburgh theological seminary library for high cataloging standards," oclc newsletter 131:4 (sept. 1980).

240 journal of library automation vol. 7/3 september 1974

book reviews

case studies in library computer systems, by richard phillips palmer. new york: r. r. bowker, 1973. 214p. $10.95.
surely one of the most annoying and disappointing aspects of the literature of library automation is the complete lack of uniformity or standards for reports of individual accomplishments. thus one reads the continuing stream of reports of automated processes in individual libraries with only the remotest idea of which of the projects described are actually operating, which are in the process of being implemented, and which are merely proposals that still exist exclusively in the minds of their creators.

in this volume, richard palmer has brought together a number of descriptions of operating systems, upon which he has imposed his own standards of presentation. in all, six circulation, eight serials, and six acquisitions systems are described; in each case the description is divided into six parts. first, in a section entitled "environment," the library, its collections, and its users are briefly described. some idea is provided of the library's total budget, or at least its materials budget, and unusual features of the library are given. next, the objectives of the automated system are stated, generally with some indication of what prompted automation to be considered and what features of the previous manual system were less than satisfactory. a section entitled "the computer" describes the hardware used in some detail (and this information is summarized in a table at the end of the book), and the next section, "the system," gives a lengthy and detailed description of how the system works. the last section in each case is devoted to observations by palmer, indicating the significance to the library of the automated system, and often pointing out problems that have been noted.
the least satisfactory section of the book is the final chapter, "summary and observations," in which palmer lays out the stated costs of each system in such a way that they may be directly compared, even though he knows the figures have been derived in various manners and are therefore not directly comparable. palmer's warning to the reader that "unit costs ... should not be compared without noting that they were not computed on a standard basis" makes even more mystifying his arrangement of those costs in tabular form.

a second area that seems weak is the suggestion that the book constitutes an effective rebuttal to the criticisms of ellsworth mason. it seems unlikely that anything short of a very thorough systems analysis, showing all of the problems, alternatives, costs, and benefits of both manual and machine systems, will satisfy mason.

despite these very minor reservations, the book is well worthy of study. it presents, in nontechnical language, some of the most carefully and honestly described systems descriptions to be found in the literature, suggesting by example that many of the individual applications described in the journals, including jola, might well be better than they are.

peter simmons
school of librarianship
university of british columbia

information systems, services and centers, by herman m. weisman. new york, n.y.: wiley-becker-hayes, 1972. 265p. $10.95. isbn: 0-471-92645-0.

weisman states that his work "is not a text on automated information technology," and mechanization is pretty well dismissed in one page of critical discussion. ellsworth mason is singled out for his "amusing, facetious and bitter account of a [sic] melancholy experience at mechanization." use of automated services for information work is covered in less than a page. the work is supposed to be a university-level text and "reference source" on "the practices of information transfer and use" on the "retailer" level.
it is almost entirely limited to industrial and government scientific and technical information services. libraries are defined in passing as a "specific type of information system ... largely limited with some few exceptions to the passive repository function. ..." however, "if the organization has a library, consultation with the librarian and use of his mechanisms for acquisition and purchase are advisable." it is also suggested that acquisitions are "recommended by the systems advisory committee ... selected and purchased by the director of the information system and the documentation unit head," and the on-order file is maintained as a list (to be distributed monthly, perhaps) and in card form. the section on cataloging is equally instructive in advising that the acquisition process has provided "subject" as one of three elements needed for descriptive cataloging.

the book swings dizzily back and forth from this lilliputian (or is it laputan?) perspective to the more olympian outlook suggested by a seventeen-page appendix which is the text of a charter for the united engineering information service with an expected annual budget of $1.2 million. it also seesaws from the uselessly general to the exquisite detail of an operations manual with hardly a pause for breath. we are told at the start of the chapter on "documentation practices-information services" that it "is more efficient to provide [information dissemination services] than to have individuals scurrying about searching for information." a summary of the "procedural flow" follows immediately: "1. ... all requests and inquiries no matter how received or to whom addressed are logged and assigned a control number. 2. ... the head of inquiry services is responsible for monitoring all requests and inquiries. ... all incoming requests are entered on the inquiry form. ..." most examples appear to be drawn from the author's experience as manager of information services, national bureau of standards.
some are useful.

strauss' scientific and technical libraries: their organization and administration was another wiley-becker-hayes volume issued during the same year. it is impossible to avoid imagining the publisher's marketing division people counting the respective memberships of the special libraries association and the american society for information science as distinct markets for the two works. however, the first three-quarters of weisman's work is a duplication distinguished from strauss mainly by the shallowness of its coverage and the poverty of its prose.

weisman's only notable contribution is thirty pages about information analysis centers, which might be worth a school reading assignment. the assignment will be at some risk, depending on students' toleration for such words as "essentialness," "beneficialness," "collaborationists," and such phrases as "parameters of data points," as in "an indexed bibliography becomes a more useful document, since it can indicate to a user exactly the type of data contained as well as parameters of data points." "relevance," as weisman notes, "is not always synonymous with competence."

justine roberts
university of california
san francisco

knowing books and men; knowing computers, too, by jesse h. shera. littleton, colo.: libraries unlimited, 1973. 363p. $13.50. isbn: 0-87287-073-1.

the only clumsy thing about this book is its pretentious title, which not only gives little indication of the book's contents but is discordant with the lucid and vigorous style of the writing. kbam;kc,t is a selection of writings and speeches by dr. shera, done between 1931 and 1972, all but one previously published. but only a few are reprinted unchanged; most have undergone revision to some significant extent, and one has been almost doubled in length in revision.
even the oldest papers are not unduly "dated," and the author's reflections on the use and abuse of computers in libraries are as timely now as when first written.

the twenty-nine papers published here are presented under six headings, each representing an area of librarianship in which dr. shera has been a major influence: philosophy of librarianship, library history, reference work in the library, documentation, the academic library, and library education. most of jola's readers, it is hazarded, will find the section on documentation of most interest.

reviewing kbam;kc,t is no fit occasion for attempting to evaluate jesse shera's contributions to librarianship. he is established, and this selection from his writings contains many of his important and influential papers and others, inevitably, less weighty. throughout, however, they bear shera's characteristic combination of clarity, intelligence, vision, and a forthrightness bordering on truculence, the mix spiced judiciously with attic salt.

in a disarming preface shera suggests that the collection may be "more of an addition to library shelves than to library literature." be that as it may, many of the writings were originally published in somewhat obscure journals, and it is helpful to have them gathered in this convenient form.

george piternick
school of librarianship
the university of british columbia

a library management game: a report on a research project, by p. brophy, m. k. buckland, g. ford, a. hindle, and a. g. mackenzie; with an appendix by l. c. guy. (university of lancaster occasional papers, no. 7) university of lancaster library, 1972. 90p. £1.00. isbn: 0-901699-14-4.
in the context of the need for greater managerial expertise in libraries, the state of managerial education in library schools, and the place of games in this education, the authors describe in this document the development of a simplified probabilistic model of a loan and duplication system. while it is perhaps the novelty of concept exhibited by the game which first attracts attention, closer examination reveals that the game is but the vehicle upon which is carried a far-ranging analysis of the state of library management.

a dynamic model utilizes three input variables (loan period, titles bought, and duplicates purchased) and three output measures (satisfaction level, document exposure, and collection bias) of the effective manipulation of the former within the constraint of budget to illustrate complex interactions within a library system. sufficient flexibility (e.g., variation of loan periods according to popularity of volumes and/or status of user) enables different policies to be selected to effect the stated objectives of the player (library "manager").

comparison of selected outputs illustrates that while choosing and implementing policies may be simple (a "game editor" interprets a player's decisions to the computer), judging their merits is not. policy (l) decreases collection bias at the expense of average document exposure per issue, while policy (q) has the opposite effect, for similar costs and total issues; policy (t) increases satisfaction level and decreases collection bias in comparison with policy (q), at a cost of 8,000 units of expenditure. evaluation of the policy decision rests on a value judgment (as in the real world).

although description of the game and probabilities upon which it is based occupies a considerable portion of the volume, the authors considered not only the practicability of such a game but also its usefulness in teaching and cost of utilization.
an appendix devoted to an in-depth study of education for library management concludes that: in britain and, to a lesser extent, the united states this aspect of library education needs considerable strengthening; games such as that described are most suited to specialized courses for experienced librarians but there is a place for similar ones in first-level courses; and a larger proportion of the profession needs to comprehend the concepts put forward in this and other studies before better management techniques will be applied to libraries.

this volume is an important contribution to the literature of library management, illustrating that the effect that computers can have on the practice of librarianship goes far beyond the mere substitution of machines for clerical workers.

george j. snowball
sir george williams university library
montreal, canada
฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ 
linked data in libraries: a case study of harvesting and sharing bibliographic metadata with bibframe

karim tharani

information technology and libraries | march 2015 5

abstract

by way of a case study, this paper illustrates and evaluates the bibliographic framework (or bibframe) as a means for harvesting and sharing bibliographic metadata over the web for libraries. bibframe is an emerging framework developed by the library of congress for bibliographic description based on linked data. much like the semantic web, the goal of linked data is to make the web “data aware” and transform the existing web of documents into a web of data. linked data leverages the existing web infrastructure and allows linking and sharing of structured data for human and machine consumption.

the bibframe model attempts to contextualize the linked data technology for libraries. library applications and systems contain high-quality structured metadata, but this data is generally static in its presentation and seldom integrated with other internal metadata sources or linked to external web resources. with bibframe, existing disparate library metadata sources such as catalogs and digital collections can be harvested and integrated over the web. in addition, bibliographic data enriched with linked data could offer richer navigational control and access points for users.
with linked data principles, metadata from libraries could also become harvestable by search engines, transforming dormant catalogs and digital collections into active knowledge repositories. thus experimenting with linked data using existing bibliographic metadata holds the potential to empower libraries to harness the reach of commercial search engines to continuously discover, navigate, and obtain new domain-specific knowledge resources on the basis of their verified metadata.

the initial part of the paper introduces bibframe and discusses linked data in the context of libraries. the final part of this paper outlines and illustrates a step-by-step process for implementing bibframe with existing library metadata.

karim tharani (karim.tharani@usask.ca) is information technology librarian at the university of saskatchewan in saskatoon, canada.

introduction

library applications and systems contain high-quality structured metadata, but this data is seldom integrated or linked with other web resources. this is adequately illustrated by the nominal presence of library metadata on the web.1 libraries have much to offer to the web and its evolving future. making library metadata harvestable over the web may not only refine precision and recall but also has the potential to empower libraries to harness the reach of commercial search engines to continuously discover, navigate, and obtain new domain-specific knowledge resources on the basis of their verified metadata.
this is a novel and feasible idea, but its implementation requires libraries both to step out of their comfort zones and to step up to the challenge of finding collaborative solutions to bridge the islands of information that we have created on the web for our users and ourselves.

by way of a case study, this paper illustrates and evaluates the bibliographic framework (or bibframe) as a means for harvesting and sharing bibliographic metadata over the web for libraries. bibframe is an emerging framework developed under the auspices of the library of congress to exert bibliographic control over traditional and web resources in an increasingly digital world. while bibframe has been introduced as a potential replacement for marc (machine-readable cataloging) in libraries,2 the goal of this paper is to highlight the merits of bibframe as a mechanism for libraries to share metadata over the web.

bibframe and linked data

while the impetus behind bibframe may have been replacement of marc, “it seems likely that libraries will continue using marc for years to come because that is what works with available library systems.”3 despite its uncertain future in the cataloging world, bibframe in its current form provides a fresh and insightful mechanism for libraries to repackage and share bibliographic metadata over the web. bibframe utilizes the linked data paradigm for publishing and sharing data over the web.4 much like the semantic web, the goal of linked data is to make the web “data aware” and transform the existing web of documents into a web of data. linked data utilizes existing web infrastructure and allows linking and sharing of structured data for human and machine consumption.
in a recent study to understand and reconcile various perspectives on the effectiveness of linked data, the authors raise intriguing questions about the possibilities of leveraging linked data for sharing library metadata over the web:

    although library metadata made the transition from card catalogs to online catalogs over 40 years ago, and although a primary source of information in today’s world is the web, metadata in our opacs are no more free to interact on the web today than when they were confined on 3" × 5" catalog cards in wooden drawers. what if we could set free the bound elements? that is, what if we could let serial titles, subjects, creators, dates, places, and other elements, interact independently with data on the web to which they are related? what might be the possibilities of a statement-based, linked data environment?5

figure 1. the bibframe model.6

bibframe provides the means for libraries to experiment with linked data to find answers to these questions for themselves. this makes bibframe both daunting and delightful. it is daunting because it imposes a paradigm shift in how libraries have historically managed, exchanged, and shared metadata. but embracing linked data also leads to a promised land where metadata within and among libraries can be exchanged seamlessly and economically over the web. bibframe (http://bibframe.org) consists of a model and a vocabulary set specifically designed for bibliographic control.7 the model identifies four main classes, namely, work, instance, authority, and annotation (see figure 1).
for each of these classes, there are many hierarchical attributes that help in describing and linking instantiations of these classes. these properties are collectively called the bibframe vocabulary.

philosophically, linked data is based on the premise that more links among resources will lead to better contextualization and credibility of resources, which in turn will help in filtering irrelevant resources and discovering new and meaningful resources. at a more practical level, linked data provides a simple mechanism to make connections among pieces of information or resources over the web. more specifically, it allows not only humans to make use of these links but also machines to do so without human intervention. this may sound eerie, but one has to understand the history behind the origin of linked data not to think of this as yet another conspiracy for machines to take over the world (wide web).

in 1994 tim berners-lee, the inventor of the web, put forth his vision of the semantic web as a “web of actionable information—information derived from data through a semantic theory for interpreting the symbols.
the semantic theory provides an account of ‘meaning’ in which the logical connection of terms establishes interoperability between systems.”8 while the idea of the semantic web has not been fully realized for a variety of functional and technical reasons, the notion of linked data introduced subsequently has made the concept much more accessible and feasible for wider application.9 once again, it was tim berners-lee who put forth the ground rules for publishing data on the web that are now known as the linked data principles.10 these principles advocate using standard mechanisms for naming each resource and its relationships with unique uniform resource identifiers (uris); making use of the existing web infrastructure for connecting resources; and using the resource description framework (rdf) for documenting and sharing resources and their relationships.

a uri serves as a persistent name or handle for a resource and is ideally independent of the underlying location and technology of the resource. although the terms are often used interchangeably, a uri is different from a url (or uniform resource locator), which is a more commonly used term for web resources. a url is a special type of uri, which points to the actual location (or the web address) of a resource, including the file name and extension (such as .html or .php) of a web resource. being more generic, the use of uris (as opposed to urls) in linked data provides persistence and the flexibility of not having to change the names and references every time resources are relocated or there is a change in server technology.
for example, if an organization switches its underlying web-scripting technology from active server pages (asp) to java server pages (jsp), all the files on a web server will bear a different extension (e.g., .jsp), causing all previous urls with the old extension (e.g., .asp) to become invalid. this technology change, however, may have no impact if uris are used instead of urls because the underlying implementation and location details for a resource are masked from the public. thus the uri naming scheme within an organization must be developed independent of the underlying technology. there are diverse best practices on how to name uris to promote usability, longevity, and persistence.11 the most important factors, however, remain the purpose and the context for which the resources are being harvested and shared.

use of rdf is also a requirement of using linked data for sharing data over the web. much like how html (hypertext markup language) is used to create and publish documents over the web, rdf is used to create and publish linked data over the web. the format of rdf is very simple and makes use of three fundamental elements, namely, subject, predicate, and object. similar to the structure of a basic sentence, the three elements make up the unit of description of a resource known as a triple in the rdf terminology. unsurprisingly, rdf requires all three elements to be denoted by uris with the exception of the object, which may also be represented by constant values such as dates, strings, or numbers.12 as an example, consider the work divine comedy.
the fact that this work, also known as divina commedia, was created by dante alighieri can be represented by the following two triples (using n-triples format):

    <http://dbpedia.org/resource/divine_comedy> <http://bibframe.org/vocab/creator> <http://dbpedia.org/resource/dante_alighieri> .
    <http://dbpedia.org/resource/divine_comedy> <http://www.w3.org/2002/07/owl#sameas> "divina commedia" .

in the first triple of this example, the work divine comedy (subject) is being attributed to a person called dante alighieri (object) as the creator (predicate). in the second triple the use of the sameas predicate asserts that both divine comedy and divina commedia refer to the same resource. thus using uris makes the resources and relationships persistent, whereas use of rdf makes the format discernible by humans and machines. this seemingly simple idea allows data to be captured, formatted, shared, transmitted, received, and decoded over the web. use of the existing web protocol (http or hypertext transfer protocol) for exchanging and integrating data saves the overhead of putting additional agreements and infrastructure in place among parties willing or wishing to exchange data. this ease and freedom to define relationships among resources over the web also makes it possible for disparate data sources to interact and integrate with each other openly and free of cost.

why is this seemingly simple idea so significant for the future of the web? from a functional perspective, what this means is that linked data facilitates “using the web to create typed links between data from different sources.
these may be as diverse as databases maintained by two organisations in different geographical locations, or simply heterogeneous systems within one organisation that, historically, have not easily interoperated at the data level.”13 the notion of typed linking refers to the facility and freedom of being able to have and name multiple relationships among resources. from a technical point of view, “linked data refers to data published on the web in such a way that it is machine-readable, its meaning is explicitly defined, it is linked to other external data sets, and can in turn be linked to from external data sets.”14 in a traditional database, relationships between entities or resources are predefined by virtue of tables and column names. moreover, data in such databases become part of the deep web and are not readily accessed or indexed by search engines.15

the use of uris to name relationships allows data sources to establish, use, and reuse vocabularies to define relationships between existing resources. these names or vocabularies, much like the resources they describe, have their own dedicated uris, making it possible for resources to form long-term and reliable relationships with each other. if resources and relationships have and retain their identities by virtue of their uris, then links between resources add to the awareness of these resources both for humans and machines. this is a key concept in realizing the overall mission of linked data to imbue data awareness and transform the existing web of documents into a web of data.
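the subject-predicate-object structure and the idea of typed links described above can be sketched in a few lines of python. this is only an illustrative sketch, not tooling from the case study: the Triple type and the helper names are hypothetical, the dbpedia and bibframe uris are copied as they appear in the text, and the rdfs label predicate is an assumption chosen here to show an object given as a constant value.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple


class Triple(NamedTuple):
    """One RDF statement; the type name and fields are hypothetical."""
    subject: str           # always a URI
    predicate: str         # always a URI; it names the type of the link
    obj: str               # a URI, or a constant value when literal is True
    literal: bool = False


def to_ntriples(t: Triple) -> str:
    """Serialize a triple as one N-Triples line."""
    if t.literal:
        escaped = t.obj.replace("\\", "\\\\").replace('"', '\\"')
        obj = f'"{escaped}"'
    else:
        obj = f"<{t.obj}>"
    return f"<{t.subject}> <{t.predicate}> {obj} ."


def typed_links(triples: List[Triple], subject: str) -> Dict[str, List[str]]:
    """Group everything linked from one subject by the predicate used."""
    links: Dict[str, List[str]] = defaultdict(list)
    for t in triples:
        if t.subject == subject:
            links[t.predicate].append(t.obj)
    return dict(links)


WORK = "http://dbpedia.org/resource/divine_comedy"
TRIPLES = [
    Triple(WORK, "http://bibframe.org/vocab/creator",
           "http://dbpedia.org/resource/dante_alighieri"),
    # Assumed predicate for illustration: rdfs:label with a literal object.
    Triple(WORK, "http://www.w3.org/2000/01/rdf-schema#label",
           "divina commedia", literal=True),
]

for triple in TRIPLES:
    print(to_ntriples(triple))
```

because each predicate is itself a uri, the same subject can carry any number of differently named relationships, which is what typed_links groups together.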
consequently various institutions and industries have established standard vocabularies and made them available for others to use with their data. for example, the library of congress has published its subject headings as linked data. the impetus behind this gesture is that if data from multiple organizations is “typed linked” using lcsh (library of congress subject headings) with linked data, then libraries and others gain the ability to categorize, collocate, and integrate data from disparate systems over the web by virtue of using a common vocabulary. the more resources link to each other through established and reusable vocabularies, the more data aware the web becomes. recognizing this opportunity, the library of congress has also developed and shared its vocabulary for bibliographic control as part of the bibframe framework.16

implementing bibframe to harvest and share bibliographic metadata

nowadays, systems like catalogs and digital collection repositories are commonplace in libraries, but these source systems often operate as islands of data both within and across libraries. the goal of this case study is to explore and evaluate bibframe as a viable approach for libraries to integrate and share disparate metadata over the web. as discussed above, the bibframe model attempts to contextualize the use of linked data for libraries and provides a conceptual model and underlying vocabulary to do so. to this end, a unique collection of the ismaili muslim community was identified for the case study.
the collection is physically housed at the harvard university library (hul) and the metadata for the collection is dispersed across multiple systems within the library. an additional objective of this case study has been to define concrete and replicable steps for libraries to implement bibframe. the discussion below is therefore presented in a step-by-step format for harvesting and sharing bibliographic metadata over the web.

1. establishing a purpose for harvesting metadata

the harvard collection of ismaili literature is the first of its kind in north america. “the most important genre represented in the collection is that of the ginans, or the approximately one thousand hymn-like poems written in an assortment of indian languages and dialects.”17 the feasibility of bibframe was explored in this case study by creating a thematic research collection of ginans by harvesting existing bibliographic metadata at hul. the purpose of this thematic research collection is to make ginans accessible to researchers and scholars for textual criticism. historically libraries have played a vital role in making extant manuscripts and other primary sources accessible to scholars for textual criticism. the need for having such a collection in place for ginans was identified by dr. ali asani, professor of indo-muslim and islamic religion and cultures at harvard university:

    perhaps the greatest obstacle for further studies on the ginan literature is the almost total absence of any kind of textual criticism on the literature. thus far merely two out of the nearly one thousand compositions have been critically edited.
    naturally, the availability of reliably edited texts is fundamental to any substantial scholarship in this field. . . . for the scholar of post-classical ismaili literature, recourse to this kind of material has become especially critical with the growing awareness that there exist significant discrepancies between modern printed versions of several ginans and their original manuscript form. fortunately, the harvard collection is particularly strong in its holdings of a large number of first editions of printed ginan texts—a strength that should greatly facilitate comparisons between recensions of ginans and the preparation of critical editions.18

2. modeling the data to fulfill functional requirements

historically, the physicality of resources such as a book or compact disc has dictated what is described in library catalogs and to what extent. the issue of cataloging serials and other works embedded within larger works has always been challenging for catalogers. for this case study as well, one of the major implementation decisions revolved around the granularity of defining a work. designating each ginan as a work (rather than a manuscript or lithograph) was perhaps an unconventional decision, but one that was highly appropriate for the purpose of the collection. thus there was a conscious and genuine effort to liberate a work from the confines of its carriers. fortuitously, bibframe does not shy away from this challenge and accommodates embedded and hierarchical works in its logical model.
but bibframe, like any other conceptual model, only provides a starting point, which needs to be adapted and implemented for individual project needs.

figure 2. excerpt of project data model

the data model for this case study (see figure 2) was designed to balance the need to accommodate bibliographic metadata with the demands of the linked data paradigm. central to the project data model is the resources table, where information on all resources along with their uris and categories (work, instance, etc.) is stored. resources relate to each other through the predicates table, which captures relevant and applicable vocabularies. the namespace table keeps track of all the sets of vocabularies being used for the project. in the triples table, resources are typed linked using appropriate predicates. once the data model for the project was finalized, a database was created using mysql to house the project data.

3. planning the uri scheme

in general the uri scheme for this case study conformed to the following intuitive nomenclature:

    <http://domain.com/resource/resource_type/resource_id>

this uri naming scheme ensures that a uri assigned to a resource depends on its class and category (see table 1). while it may be customary to use textual identifiers in the uris, the project used numeric identifiers to account for the fact that most of the ginans (works) are untitled and transliterated into english from various indic languages. generally support for using uris is either already built in or added on, depending on the server technology being used.
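a minimal sketch of how the four tables and the uri scheme could fit together is given below. it uses python with an in-memory sqlite database as a stand-in for the mysql database of the project. only the table names come from the text; every column name and helper function is an assumption made for illustration, the uri pattern follows the placeholder domain used in the examples (for instance http://domain.com/ginan/1), and the instanceOf predicate is assumed from the bibframe vocabulary rather than taken from the project.

```python
import sqlite3
from typing import List

BASE = "http://domain.com"  # placeholder domain used in the text


def mint_uri(resource_type: str, resource_id: int) -> str:
    """Build a URI from a resource type and a numeric identifier (assumed pattern)."""
    return f"{BASE}/{resource_type}/{resource_id}"


def create_store() -> sqlite3.Connection:
    """Create the four tables named in the text; all column names are hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE namespace  (id INTEGER PRIMARY KEY, base_uri TEXT UNIQUE);
        CREATE TABLE predicates (id INTEGER PRIMARY KEY,
                                 namespace_id INTEGER REFERENCES namespace(id),
                                 name TEXT);
        CREATE TABLE resources  (id INTEGER PRIMARY KEY, uri TEXT UNIQUE, category TEXT);
        CREATE TABLE triples    (subject_id   INTEGER REFERENCES resources(id),
                                 predicate_id INTEGER REFERENCES predicates(id),
                                 object_id    INTEGER REFERENCES resources(id));
    """)
    return conn


def add_resource(conn: sqlite3.Connection, resource_type: str,
                 resource_id: int, category: str) -> int:
    """Store a resource with its minted URI and category; return its row id."""
    cur = conn.execute("INSERT INTO resources (uri, category) VALUES (?, ?)",
                       (mint_uri(resource_type, resource_id), category))
    return cur.lastrowid


def export_ntriples(conn: sqlite3.Connection) -> List[str]:
    """Join the tables back into N-Triples lines, one per stored triple."""
    rows = conn.execute("""
        SELECT s.uri, n.base_uri || p.name, o.uri
        FROM triples t
        JOIN resources  s ON s.id = t.subject_id
        JOIN predicates p ON p.id = t.predicate_id
        JOIN namespace  n ON n.id = p.namespace_id
        JOIN resources  o ON o.id = t.object_id
    """).fetchall()
    return [f"<{s}> <{p}> <{o}> ." for s, p, o in rows]


def demo() -> List[str]:
    """Link one item (instance) to one ginan (work) and export the result."""
    conn = create_store()
    ns = conn.execute("INSERT INTO namespace (base_uri) VALUES (?)",
                      ("http://bibframe.org/vocab/",)).lastrowid
    # Assumed predicate: instanceOf, linking an instance to its work.
    pred = conn.execute("INSERT INTO predicates (namespace_id, name) VALUES (?, ?)",
                        (ns, "instanceOf")).lastrowid
    ginan = add_resource(conn, "ginan", 1, "work")
    item = add_resource(conn, "item", 1, "instance")
    conn.execute("INSERT INTO triples VALUES (?, ?, ?)", (item, pred, ginan))
    return export_ntriples(conn)


print("\n".join(demo()))
```

keeping predicates and namespaces in their own tables means a vocabulary uri is stored once and reused by every triple that refers to it.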
this case study utilized the lamp (linux, apache, mysql, and php) technology stack, and the uri handler for the project was added on to the apache web server using the url-rewriting (mod_rewrite) facility.19

resource types   bibframe category   uri example
organizations    annotation          http://domain.com/organization/1
collections      annotation          http://domain.com/collection/1
items            instance            http://domain.com/item/1
ginan            work                http://domain.com/ginan/1
subjects         authority           http://domain.com/subject/1

table 1. uri naming scheme and examples

4. using standard vocabularies

bibframe provides the relevant vocabulary and the underlying uris to implement linked data with bibliographic data in libraries. while not all attributes may be applicable or used in a project, the ones that are identified as relevant must be referenced with their rightful uri. for example, the predicate hasauthority from bibframe has a persistent uri (http://bibframe.org/vocab/hasauthority), enabling humans as well as machines to access and decode the purpose and scope of this predicate. other vocabulary sets or namespaces commonly used with linked data include resource description framework (rdf), web ontology language (owl), friend of a friend (foaf), etc. in rare circumstances, libraries may also choose to publish their own specific vocabulary. for example, any unique predicates for this case study could be defined and published using the http://domain.com/vocab namespace.

5. identifying data sources

the bibliographic metadata used for this case study was obtained from within hul.
as  mentioned   above,  the  data  pertained  to  a  unique  collection  of  religious  literature  belonging  to  the  ismaili   muslim  community  of  the  indian  subcontinent.  this  collection  was  acquired  by  the  middle  eastern   department  of  the  harvard  college  library  in  1980.  the  collection  comprises  28  manuscripts,  81   printed  books,  and  11  lithographs.  in  1992,  a  book  on  the  contents  of  this  collection  was  published   in  1992  by  dr.  asani  and  was  titled  the  harvard  collection  of  ismaili  literature  in  indic  languages:   a  descriptive  catalog  and  finding  aid.  the  indexes  in  the  book  served  as  one  of  the  sources  of  data   for  this  case  study.     subsequent  to  the  publication  of  the  book,  the  harvard  collection  of  ismaili  literature  was  also   made  available  through  harvard’s  opac  (online  public  access  catalog)  called  hollis  (see  figure  3).   the  catalog  records  were  also  obtained  from  the  library  for  the  case  study.  some  of  the  120  items   from  the  collection  were  subsequently  digitized  and  shared  as  part  of  the  harvard’s  islamic   heritage  project.  the  digital  surrogates  of  these  items  were  shared  through  the  harvard   university  library  open  collections  program.  and  the  library  catalog  records  were  also  updated  to   provide       figure  3.  hollis:  harvard  university  library’s  opac   direct  access  to  the  digital  copies  where  available.  additional  metadata  for  the  digitized  items  was   also  developed  by  the  library  to  facilitate  open  digital  access  through  harvard  library’s  page   delivery  service  (pds)  to  provide  page-­‐turning  navigational  interface  for  scanned  page  images   over  the  web.  data  from  all  these  sources  was  leveraged  for  the  case  study.       information  technology  and  libraries  |  march  2015   14     6. 
transforming source metadata for reuse

etl (extract, transform, and load) is an acronym commonly used to refer to the steps needed to populate a target database by moving data from multiple and disparate source systems. extraction is the process of getting the data out of the identified source systems and making it available for the exclusive use of the new database being designed. in the context of the library realm, this may mean getting marc records out from a catalog or getting descriptive and administrative metadata out of a digital repository. the format in which data is extracted from a source system is also an important aspect of the data extraction process. use of the xml (extensible markup language) format is fairly common nowadays, as most library source systems have built-in functionality to export data into a recognized xml standard such as marcxml (marc data encoded in xml), mods (metadata object description schema), mets (metadata encoding and transmission standard), etc. in certain circumstances, data may be extracted using the csv (comma-separated values) format.

transformation is the step in which data from one or more source systems is massaged and prepared to be loaded to a new database. the design of the new database often enforces new ways of organizing source data. the transformation process is responsible for making sure that the data from all source systems is integrated while retaining its integrity before being loaded to the new database. a simple example of data transformation is that the new system may require authors’ first and last names to be stored in separate fields rather than in a single field.
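the name-splitting transformation mentioned above can be sketched in a few lines of python. this is a minimal illustration, not the scripts used in the case study: the function name split_author_name, the output field names author_first and author_last, and the two accepted input shapes (“last, first” and “first last”) are all assumptions made for the example.

```python
def split_author_name(full_name):
    """Split a single author string into (first_name, last_name).

    Assumes two hypothetical input shapes: "last, first" (common in
    catalog headings) or "first last". Other shapes need manual review.
    """
    name = " ".join(full_name.split())  # collapse stray whitespace
    if "," in name:
        last, _, first = name.partition(",")
        return first.strip(), last.strip()
    # No comma: treat the final word as the last name.
    first, _, last = name.rpartition(" ")
    return first, last


# One source row with a single author field (illustrative data only).
source_row = {"author": "asani, ali"}
first, last = split_author_name(source_row["author"])

# The transformed row stores the two parts in separate fields.
target_row = {"author_first": first, "author_last": last}
print(target_row)
```

real catalog headings also carry dates, titles, and multi-part surnames, so a production transformation would need more rules than this sketch shows.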
how  such   transformations  are  automated  will  depend  on  the  format  of  the  source  data  as  well  as  the   infrastructure  and  programming  skills  available  within  an  organization.  since  xml  is  becoming   the  de  facto  standard  for  most  data  exchange,  use  of  xslt  (extensible  stylesheet  language   transformations)  scripts  is  common.  with  xslt,  data  in  xml  format  can  be  manipulated  and   given  different  structure  to  aid  in  the  transformation  process.     the  loading  process  is  responsible  for  populating  the  newly  minted  database  once  all   transformations  have  been  applied.  one  of  the  major  considerations  in  this  process  is  maintaining   the  referential  integrity  of  the  data  by  observing  the  constraints  dictated  by  the  data  model.  this  is   achieved  by  making  sure  that  records  are  correctly  linked  to  each  other  and  are  loaded  in  proper   sequence.  for  instance,  to  ensure  referential  integrity  of  items  and  their  annotations,  it  may  be   necessary  to  load  the  items  first  and  then  the  annotations  with  correct  reference  to  the  associated   item  identifiers.   for  this  case  study,  records  from  source  systems  were  obtained  in  marcxml  and  mets  formats,   and  specific  scripts  were  developed  to  extract  desired  elements  and  transform  them  into  the   required  format.  a  somewhat  unconventional  mechanism  was  used  to  capture  and  reuse  the  data   from  dr.  asani’s  book,  which  was  only  available  in  print.  the  entire  book  was  scanned  and   processed  by  an  ocr  (optical  character  recognition)  tool  to  glean  various  data  elements.  
once the data was cleaned and verified, the information was transformed into a csv data file to facilitate database loading.

7. generating rdf triples

the rdf triples can be written or serialized using a variety of formats such as turtle, n-triples, json, as well as rdf/xml, among others. the traditional rdf/xml format, which was the first standard to be recommended for rdf serialization by the world wide web consortium (w3c), was used for this case study (see figure 4). the format was chosen for its modularity in preserving the context of resources and their relationships as well as its readability for humans. generating rdf may be a simple act if the data is already stored in a triplestore, which is a database specifically designed to store rdf data. but given that this project was implemented using a relational database management system (rdbms), i.e., mysql, the programming effort to generate rdf data was complex. the complications arose in identifying and tracking the hierarchical nature of the rdf data, especially in the chosen serialization format. several server-side scripts were developed to aid in discerning the relationships among resources and formatting them to generate triples. in hindsight, generating triples would have been easier using the n-triples serialization, but that would have also required more complex programming for rebuilding the context for the user interface design.

figure 4. a sample of triples serialized for the project

8.
formatting rdf triples for human and machine consumption

the raw rdf data is sufficient for machines to parse and process, but humans typically require an intuitive user interface to contextualize triples. in this case study, xsl was extensively used for formatting the triples. while xslt and xsl (extensible stylesheet language) are intricately related, they serve different purposes. xslt is a scripting language to manipulate xml data whereas xsl is a formatting specification used in presentation of xml, much like how css (cascading style sheets) are used for presenting html. a special routing script was also developed to detect whether the request for data was intended for machine or human consumption. for machine requests, the triples were served unformatted whereas for human requests, the triples were formatted to display in html.

figure 5. formatted triples for human consumption

discussion

models are tools for communicating simple and complex relations between objects and entities of interest. the effectiveness of any model is often realized during implementation, when the theoretical constructs of the model are put to the test. the challenge faced by bibframe, like any new model, is to establish its worthiness in the face of the existing legacy of marc. the existing hold of marc in libraries is so strong that it may take several years for bibframe to be in a position to challenge the status quo.
historically, bibliographic practices in libraries such as describing, classifying, and cataloging resources have primarily catered to tangible, print-based knowledge carriers such as books and journals.20 bibframe challenges libraries to revisit and refresh their traditional notion of text and textuality.

although initially introduced as a replacement for marc, bibframe is far from being an either-or proposition given the marc legacy. nevertheless, bibframe has made the linked data paradigm much more accessible and practical for libraries. rather than perceiving bibframe as a threat to existing cataloging praxis, it may be useful for libraries to allow bibframe to coexist within the current cataloging landscape as a means for sharing bibliographic data over the web. libraries maintain and provide authentic metadata about knowledge resources for their users based on internationally recognized standards. this high-quality structured metadata from library catalogs and other systems can be leveraged and repurposed to fulfill unmet and emerging needs of users. with linked data, library metadata could become readily harvestable by search engines, transforming dormant catalogs and collections into active knowledge repositories.

in this case study, seemingly disparate library systems and data were integrated to provide unified access and to create a thematic research collection. it is also possible to create such purpose-specific digital libraries and collections as part of library operations without having to acquire additional hardware and commercial software.
it was also evident from this case study that digital libraries built using bibframe offer superior navigational control and access points for users to actively interact with bibliographic data. any linked data predicate has the potential to become an access point and act as a pivot to provide an insightful view of the underlying bibliographic records (see figure 6). with advances in digital technologies “richer interaction is possible within the digital environment not only as more content is put within reach of the user, but also as more tools and services are put directly in the hands of the user.”21 developing capacity to effectively respond to the informational needs of users is part and parcel of libraries’ professional and operational responsibilities. with the ubiquity of the web and increased reliance of users on digital resources, libraries must constantly reevaluate and reimagine their services to remain responsive and relevant to their users.

figure 6. increased navigational options with linked data

conclusion

just as libraries rely on vendors to develop, store, and share metadata for commercial books and journals, similar metadata partnerships need to be put in place across libraries. the benefits and implications of establishing such a collaborative metadata supply chain are far reaching and can also accommodate cultural and indigenous resources. library digital collections typically showcase resources that are unique and rare, and the metadata to make these collections accessible must be shared over the web as part of library service.
as the amount of data on the web proliferates, users find it more and more difficult to differentiate between credible knowledge resources and other resources. bibframe has the potential to address many of the issues that plague the web from a library and information science perspective, including precise search, authority control, classification, data portability, and disambiguation. most popular search engines like google are gearing up to automatically index and collocate disparate resources using linked data.22 libraries are particularly well positioned to realize this goal with their expertise in search, metadata generation, and ontology development. this research looks forward to further initiatives by libraries to become more responsive and make library resources more relevant to the knowledge creation process.

references

1. tim f. knight, “break on through to the other side: the library and linked data,” tall quarterly 30, no. 1 (2011): 1–7, http://hdl.handle.net/10315/6760.
2. eric miller et al., “bibliographic framework as a web of data: linked data model and supporting services,” november 11, 2012, http://www.loc.gov/bibframe/pdf/marcld-report-11-21-2012.pdf.
3. angela kroeger, “the road to bibframe: the evolution of the idea of bibliographic transition into a post-marc future,” cataloging & classification quarterly 51, no. 8 (2013): 873–89, http://dx.doi.org/10.1080/01639374.2013.823584.
4. eric miller et al., “bibliographic framework as a web of data: linked data model and supporting services,” november 11, 2012, http://www.loc.gov/bibframe/pdf/marcld-report-11-21-2012.pdf.
5.
nancy fallgren et al., “the missing link: the evolving current state of linked data for serials,” serials librarian 66, no. 1–4 (2014): 123–38, http://dx.doi.org/10.1080/0361526x.2014.879690.
6. the figure has been adapted from eric miller et al., “bibliographic framework as a web of data: linked data model and supporting services,” november 11, 2012, http://www.loc.gov/bibframe/pdf/marcld-report-11-21-2012.pdf.
7. “bibliographic framework initiative project,” library of congress, accessed august 15, 2014, http://www.loc.gov/bibframe.
8. nigel shadbolt, wendy hall, and tim berners-lee, “the semantic web revisited,” intelligent systems 21, no. 3 (2006): 96–101, http://dx.doi.org/10.1109/mis.2006.62.
9. sören auer et al., “introduction to linked data and its lifecycle on the web,” in reasoning web: semantic technologies for intelligent data access, edited by sebastian rudolph et al., 1–90 (heidelberg: springer, 2011), http://dx.doi.org/10.1007/978-3-642-23032-5_1.
10. tim berners-lee, “linked data,” design issues, last modified june 18, 2009, http://www.w3.org/designissues/linkeddata.html.
11. danny ayers and max völkel, “cool uris for the semantic web,” world wide web consortium (w3c), last modified march 31, 2008, http://www.w3.org/tr/cooluris.
12. tom heath and christian bizer, linked data: evolving the web into a global data space (morgan & claypool, 2011), http://dx.doi.org/10.2200/s00334ed1v01y201102wbe001.
13. christian bizer, tom heath, and tim berners-lee, “linked data—the story so far,” international journal on semantic web and information systems 5, no.
3 (2009): 1–22, http://dx.doi.org/10.4018/jswis.2009081901.
14. ibid.
15. tony boston, “exposing the deep web to increase access to library collections” (paper presented at the ausweb05, the twelfth australasian world wide web conference, queensland, australia, 2005), http://www.nla.gov.au/openpublish/index.php/nlasp/article/view/1224/1509.
16. “bibliographic framework initiative,” bibframe.org, accessed august 15, 2014, http://bibframe.org/vocab; “bibliographic framework initiative project,” library of congress, accessed august 15, 2014, http://www.loc.gov/bibframe.
17. ali asani, the harvard collection of ismaili literature in indic languages: a descriptive catalog and finding aid (boston: g.k. hall, 1992).
18. ibid.
19. ralf s. engelschall, “url rewriting guide,” apache http server documentation, last modified december, 1997, http://httpd.apache.org/docs/2.0/misc/rewriteguide.html.
20. yann nicolas, “folklore requirements for bibliographic records: oral traditions and frbr,” cataloging & classification quarterly 39, no. 3–4 (2005): 179–95, http://dx.doi.org/10.1300/j104v39n03_11.
21. lee l. zia, “growing a national learning environments and resources network for science, mathematics, engineering, and technology education: current issues and opportunities for the nsdl program,” d-lib magazine 7, no. 3 (2001), http://www.dlib.org/dlib/march01/zia/03zia.html.
22. thomas steiner, raphael troncy, and michael hausenblas, “how google is using linked data today and vision for tomorrow” (paper presented at the linked data in the future internet at the future internet assembly (fia 2010), ghent, december 2010), http://research.google.com/pubs/pub37430.html.

128 bell laboratories' library real-time loan system (bellrel)

r. a.
kennedy: bell telephone laboratories, murray hill, new jersey

bell telephone laboratories has established an on-line circulation system linking two terminals in each of its three largest libraries to a central computer. objectives include improved service through computer pooling of collections, immediate reporting on publication availability or a borrower's record, automatic reserve follow-up; reduced labor; and increased management information. loans, returns, reserves and many queries are handled in real time. input may be keyboard only or combined with card reading, to handle all publications with borrower present or absent. bellrel is now being used for some 1500 transactions per day.

introduction

as part of a continuing program to exploit available technology to improve library service, the technical information libraries system of the bell telephone laboratories has established an on-line, real-time computer circulation network. the initial configuration links two terminals in each of the holmdel, murray hill and whippany, new jersey, libraries to a central computer at murray hill. these are the three largest libraries in bell laboratories, handling 75% of a system total of more than 300,000 loans per year. the bellrel system is designed to process loans, returns, reservations and queries with real-time speed and responsiveness; additionally, it provides a wide range of other products and information basic to the effective control and use of library resources.

the libraries of bell laboratories, like many other research libraries, have experienced unprecedented growth over the past decade in facilities, collections, services and traffic. new approaches have had to be found not only to supply information services of sufficient power and diversity to meet the needs of a communications research organization of over 15,000 people, but also to cope with the expanding volume of everyday work in its eighteen library units.
as elsewhere, a large component of that work is circulation in all of its ramifications: direct service, record-keeping, follow-up, resource identification, inter-unit coordination, feedback for purchase and purge decisions, etc. the bellrel system is addressed to these problems within the context of the bell laboratories.

the use of computers in circulation control is no longer novel. the studies done by george fry and associates for the library technology project of the american library association emphasize the expense of implementing computer-aided circulation systems (1,2). despite these studies, which tend to focus more on the gross costs of substituting data processing for manual techniques than on the immediate and long-range gains for the library as an information system, a trend to the computer is clear. southern illinois (3), lehigh (4), and oakland (5) are among the many university and research libraries which have automated circulation operations using the ibm 357 data collection system and batch processing. comparable systems are in use or planned by other libraries (6,7).

latterly there is increasing evidence of serious interest in real-time circulation control. the queen's university of belfast (8), and the state university of new york at buffalo (9) are two institutions reporting studies. redstone arsenal has been demonstrating a two-terminal, on-line system for about a year as part of a comprehensive automation program (10).

the bellrel system was put into regular service in march, 1968, after two months of dry-run testing at all six terminals. this paper describes the reasons for changing from a manual system; the objectives established for the new system; the alternatives evaluated; the principal elements, operations and services of the selected system; and problems and performance in the brief period of operations to date.
the paper is essentially a summary description; it does not report in detail on all card, disk and tape formats, maintenance procedures, products, logical operations, etc., of the system and its fifty-plus programs. a further report on bellrel will be published when significant experience has accrued.

the displaced manual system

the newark self-charge-signature system has been used by bell laboratories' libraries for some forty years. in this well-known simple system, the borrower writes his name and address on a prepared book card pulled from the book pocket. for the two out of three loans at bell labs where the borrower is not present, a circulation assistant fills out the card, which is then date stamped, tabbed for date due and filed by author. minor variations on this practice are used for unbound journals and other items lacking book cards.

130 journal of library automation vol. 1/2 june, 1968

reservations for individuals or other libraries in the network are hand posted on the charge card. files are scrutinized for overdue dates every several days (latterly, less frequently as traffic has mounted) and notices prepared by xerox copying of the charge card on an appropriate form. although standard loan periods run from one to four weeks, depending upon the item and demand, about 30% of all loans result in overdue notices. each library in the network has maintained its own circulation records, including records for the local circulation of items borrowed on inter-unit loan. inter-unit traffic is heavy, although substantial duplication of important publications exists in the various libraries.

the merits of the newark self-charge system (simplicity, fast handling of borrowers, relatively low cost) are widely known. the system is a venerable one; it works.
but all circulation systems have imperfections, and in the bell laboratories long-recognized deficiencies of the manual system became increasingly unacceptable when loan traffic began to approach, then exceed, 200,000 items per year. these deficiencies included:

1. an increasing number of hours spent on the tedious and uninspiring tasks of sorting, tagging, posting, slipping, checking and husbanding cards.
2. labor, frequent delays and poor service associated with processing over 60,000 overdue notices per year.
3. inability automatically to use the pooled resources of several libraries to meet demands.
4. inability to determine quickly not merely the holdings of other copies of a title in the library system (union catalogs serve this purpose, after some steps and card handling) but the availability of loan copies at the moment of need.
5. inefficiencies in tracking down missing publications, inventory items, etc.
6. inability to identify all publications currently on loan to a borrower or used by him sometime previously.
7. inadequate information on collection use for resource management.
8. excessive service delays due to combinations of the preceding factors.

new system objectives

the deficiencies listed above suggest some of the characteristics defined for the new system. library management concluded early in 1965 that any replacement for the existing system must:

1. meet the long-range needs of each of the major libraries in bell laboratories and be extensible to other units in the library network as traffic, experience and costs warranted.
2. provide not merely a more effective means for handling circulation operations within the walls of any one library but also, if possible, an instrument for knocking walls down, for bringing the combined resources of a number of libraries to bear on any information need.
3. handle all types of materials, bound or unbound, and all types of requests whether in person, by mail, direct telephone or recorded message (i.e., telereference) service.
4. give immediate up-to-the-minute accounting for all items on loan or otherwise off the shelves and locate copies still available for loan.
5. hold reservations against system resources (in line with objective 2) and direct the first copy returned, wherever returned, and as automatically as practical, to the first person on the reserve queue, whatever his base location.
6. identify promptly all items currently charged to a borrower and, as required, previously borrowed by him.
7. monitor circulation traffic and generate, as necessary, overdue notices, missing item lists, high-demand lists, zero-activity reports, statistics, use analyses and other feedback fundamental to effective control and management of the collection.
8. lift the circulation staff from clerical tasks to more personal service to library users, in the interest of the "human use of human beings," to use norbert wiener's phrase.
9. integrate the loan system with other computer-aided systems in use or planned in the libraries.
10. improve the total response of the library to the user.

systems evaluated

in view of these objectives it will be apparent that only a computer-aided system could be seriously considered. none of the several dozen noncomputer systems surveyed in the fry report (1) could be considered a worthwhile alternative to the libraries' manual system. the essential questions therefore became: off-line or on-line access? batch or real-time processing?

the demonstrated success of the ibm 357 batch processing circulation system compelled study and on-site investigation in several libraries.
it was concluded, however, that while the 357 system would meet a number of the established goals, and at moderate cost, the important objectives of immediate accountability, automatic follow-up on reserves, full disclosure of copies available for loan, and automatic pooling of network resources would be seriously compromised. further, the fact that two-thirds of all loans made in bell laboratories do not involve the presence of the borrower substantially detracted from one of the major virtues of the 357 system, i.e., the simplicity of input using a pre-punched man (identification) card submitted by the borrower. the various alternatives for coping with this situation in a 357 system, for 200,000 loans a year and a potential of over 15,000 people, were not attractive.

the feasibility of on-line access has been widely demonstrated in the research and business world. remote, on-line computer processing is clearly a common course of the near future. equally predictably, it will steadily give more favorable cost/value ratios as machine costs decrease and labor costs mount.

in sum, the technical information libraries concluded that an on-line system was worth the investment and that no other system was worth the price. only an on-line approach would meet the overall objectives for a new system and offer advantages sufficient to justify conversion effort at this time. as frederick ruecking has observed, "a charging system should not be selected because it is 'cheaper' than others. if the selected system does not meet the present and future needs, the choice is poor." (11)

the bellrel system

bellrel is a joint development of the technical information libraries and the comptroller's division of bell laboratories. the system was designed, programmed and implemented in a little over two years, beginning in late 1965.
during this time, preparation of the bibliographic records, system design and programming took about seven man years.

basic machine elements

the initial network is illustrated in figure 1. the two ibm 1050 terminals in each of the three libraries incorporate keyboard, printer and card reader facilities for maximum flexibility in handling all types of transactions and queries. each terminal is linked by telephone lines, using western electric 103a data-sets, to an ibm 360-40 computer in the comptroller's division at murray hill. the murray hill library is only a building away from the computer. the holmdel and whippany libraries are about thirty and twelve miles distant, respectively.

fig. 1. bellrel circulation system network. (diagram labels: holmdel library, murray hill library and whippany library, each with 1050 terminals comprising 1051, 1052 and 1056 units; data-phone; selector channel #1; console typewriter; monitor terminal.)

the computer, in heavy daily use along with other computers for regular operations of the comptroller's division, has a 262,000 byte (character) core memory. core is partitioned, permitting effective simultaneous use of the computer for routine batch operations and the bellrel system. in addition to core requirements for the 360 operating system, core partitions include (a) the teleprocessing logic of the ibm queued teleprocessing access method (qtam), (b) message editing logic and application logic packages, including library applications and (c) batch processing programs and operations for all purposes.
figure 2, a flowchart of the real-time processing logic, illustrates core partitioning for (a) and (b). in addition to the programs resident in core (portions of which can be overlaid as necessary by other real-time operations) certain programs for particular functions (e.g. loan, return, etc.) are called from disk as needed. in all, 32 real-time and 23 batch programs, together with the 360 operating system, are used by bellrel. the programs are written in cobol level f and basic assembly language.

fig. 2. general flowchart of bellrel real-time programming logic. (flowchart labels: input and output of inquiry or transaction at 1050 terminal; teleprocessing logic; inter-terminal communication, if desired by librarian; message editing logic; application logic; common logic routines; receive, queue and send message; process message, refer to disk files, update files, generate response; disk input and output logic.)

disk records

publication and man records are stored on an ibm 2314 disk pack with a capacity of some 29,000,000 characters. about two-thirds of this space is in use or dedicated. the man records, which are up-dated daily from tape used for telephone directory, payroll and other purposes, cover about 19,000 people including btl employees and resident visitors, i.e., contractual people who may also use library facilities. each man record is 161 characters in length and contains such information as payroll account number, name, department number, telephone number, location, occupational class, space for three book loans, keys referring to overflow loan trailers elsewhere on disk, etc. the man file is organized by payroll account number, a five-digit number which is keyed in or read from a prepunched card for all loans, reservations and other transactions requiring it.
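the shape of the man record described above (a five-digit key, three loan slots in the record itself, and overflow held elsewhere) can be sketched in python. this is only an illustration under stated assumptions: the class names ManRecord and ManFile, the field names, the use of a dictionary as a stand-in for keyed disk access, and the list standing in for overflow loan trailers are all hypothetical, and the sketch ignores the fixed 161-character layout, whose field widths the paper does not give.

```python
from dataclasses import dataclass, field

INLINE_LOAN_SLOTS = 3  # from the text: space for three book loans per record


@dataclass
class ManRecord:
    payroll_number: str  # five-digit key, per the text
    name: str
    loans: list = field(default_factory=list)     # held in the record itself
    overflow: list = field(default_factory=list)  # stand-in for loan trailers

    def add_loan(self, item_id):
        """Use an inline slot if one is free; otherwise spill to overflow."""
        if len(self.loans) < INLINE_LOAN_SLOTS:
            self.loans.append(item_id)
        else:
            self.overflow.append(item_id)


class ManFile:
    """Keyed collection of man records (a dict stands in for disk access)."""

    def __init__(self):
        self._records = {}

    def add(self, record):
        key = record.payroll_number
        if not (len(key) == 5 and key.isascii() and key.isdigit()):
            raise ValueError("payroll account number must be five digits")
        self._records[key] = record

    def lookup(self, payroll_number):
        return self._records[payroll_number]


man_file = ManFile()
man_file.add(ManRecord("12345", "example borrower"))  # made-up borrower
borrower = man_file.lookup("12345")
# The first two identifiers appear in the paper; the last two are invented.
for item in ["127391mh01", "127391wh05", "100001mh01", "100002mh01"]:
    borrower.add_loan(item)
print(borrower.loans, borrower.overflow)
```

the fourth loan lands in the overflow list, which mirrors the role the paper assigns to trailer records: the common case stays in one fixed-size record and only heavy borrowers cost extra storage.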
access to man records on disk is by the ibm index sequential access method (isam).

publication records vary in format, length and method of access depending upon the class of publication. five classes of publications are currently in the system: books (class 1), journals (class 2), trade catalogs (class 3), college catalogs (class 4) and dewey-classified continuations and multiple-volume titles cataloged as sets (class 5). other classes of information, e.g., documents, motion picture films, etc., will be added. each title in each class is assigned a unique six-digit identification number, the first digit of which identifies the class. a typical number for a monograph title is 127391. the punched cards and book labels for each copy of this title also indicate the holding library and its copy number, e.g., 127391mh01, 127391wh05. a sample card and label, generated by the computer, are shown in figure 3.

as noted above, books fall in two classes, 1 and 5. each class provides a maximum of 100,000 title numbers, more than adequate for the predicted growth of the technical information libraries where weeding is heavy. the book collections for the three libraries now on disk total about 33,000 titles and 66,000 volumes.

the disk record for each class 1 title is 188 characters in length and contains the book number, 43 characters of author-title, the call number, copies by location, the fields for file maintenance change information, three loans, two reserves, keys to loan trailers and reserve trailers, etc. each loan field identifies borrower, date due, copy and status of the loan (e.g., overdue number, renewed, number of reserves, returned). the identification number for each new class 1 book is assigned by the computer on update runs. numbers are sequential. disk access is direct.

fig. 3. bellrel book card and label. [sample punched card and label for item 102362mh01: holton, g./ science and the modern mind, call number 500/h75, bell telephone laboratories technical information libraries.]

class 5 books (cataloged continuations and multiple-volume titles cataloged as sets) share a different kind of disk record. they could all have been entered as class 1 items, in which case each volume of a set would have had a separate record on disk, a unique (not necessarily consecutive) identification number, and a separate listing in the author, call number and identification number printed catalogs. the class 5 approach, however, permits grouping of volumes in sets and series. ten volumes of one title are handled in one disk record, 288 characters in length, under the same identification number. additional volumes, up to a total of 100, are handled in succeeding records. all of the records of the set carry the same first five digits in their identification number. disk access is by the index sequential access method (isam). in addition to grouping sets, class 5 records effect a saving on disk space and permit use statistics to be derived for the set as a whole, as well as for each volume in the set. the principal disadvantage of the approach is that all keyed messages dealing with any volume in the set must cite both the basic access number and the specific data (e.g., volume number) pertinent to the volume in question.

the journal disk records cover all the 2700 journal titles held in the library system. unlike books, however, records of all copies and volumes of each title are not permanently stored on disk. instead, each 155-character journal title record contains the journal identification number 136 journal of library automation vol.
1/ 2 june, 1968 and 48 characters of title, plus fields for file maintenance changes, two loans, one reserve, and keys to loan and reserve trailers. specific bound volumes or unbound issues are recorded on this record only as long as they are current loan or reserve transactions. to expedite loans and returns, punch cards and computer-printed labels have been prepared for some 10,000 bound journal volumes. additional volumes are similarly processed as circulated or bound. disk records for trade catalogs and college catalogs are also 155 characters long. access to records is also by the index sequential access method. unlike journal volumes, however, each separate catalog is specifically identified and recorded on disk. when conversion is complete, more than 5000 catalogs will be accessible on disk. the loan and reserve trailers for each publication class accommodate overflow. trailer records vary in number and length depending upon function, publication type and predicted need. for example, 5000 31character trailer records, each handling three reserves, are available for book reserves. for journals, 800 59-character records, each handling three reserves, are provided. the difference reflects the heavier book traffic and the particularly sharp peaking of reserves on new book titles. apart from the normal safety back-up files (e.g., the nightly dump to tape of the current disk records), the only remaining machine record which requires mention is the history tape. this tape, up-dated daily, is a continuing record of all completed loans which provides information necessary for statistics and use analyses. on-line transactions twenty-two different transaction codes are currently available to handle loans, returns, renewals, reservations and queries in real time. in addition, any terminal can call another terminal by a single digit code and one terminal in each library can call the other two libraries simultaneously by a 'broadcast' code. 
this inter-library, typewritten message facility is a highly useful component of the total system. ten of the twenty-two transaction codes handle loans, returns, reservations and renewals. these codes, their prime functions and associated data inputs are listed in table 1. the eleven lq (library query) codes for requesting information from bellrel are listed in table 2. one additional code causes the computer to print out at the query terminal a statistical log of all classes of transactions at each terminal and their totals. it also gives the number of input errors made at each terminal. the log aids in adjusting work loads and monitoring performance.

table 1. on-line codes for loans, etc.
code | function | input data & method
1. loans
lm | loan of 1-5 items for one man at one time. | man no., item no. (including location and copy). usually card read.
ln | overnight loan. assigns overnight loan period automatically; does not pick-up reserves on return. | (as above)
2. returns
lc | cancel loan. charge out automatically to first person on reserve queue. | item no., with location and copy. usually card read.
lk | cancel loan. no automatic charge-out. | (as above)
3. reserves
la | add to reserve queue. give reserve no., copies held and available, etc. | man no., item no. (less location and copy). keyed.
lb | bypass reserve queue. put designated man first. | (as above)
ld | delete from reserve queue. | (as above)
4. renewals, etc.
lp | change loan period assigned. | new loan period, complete item no. keyed or card read.
lr | renew loan, once. | (as above)
lg | force renewal irrespective of reserves, overdue status, etc. | (as above)

let us now consider several common transactions in more detail.

loans: if the borrower is present, he gives the desired book to the circulation clerk. he shows his badge or, alternatively, writes his surname and five-

similar messages are not accepted by the system.
recovery from errors may be done by aborting input, repeating it correctly or, if all elements are legitimate to the message edit program, by using the appropriate online code to correct the record. on the first day of full operations, 10% of the input transactions were incorrect. one week's experience reduced the error rate to 3%, and further improvement is expected. the .25% error rate estimated by lazorick and herling for a system planned to function without any prepunched cards ( 9) appears unrealistic. non-personal codes some thirty special codes, which function like man numbers in the system, are available to handle real-time transactions involving branch libraries, outside organizations and such internal library functions as charges to recataloging, repair, new book shelf, etc. all are three-digit codes, essentially mnemonic, e.g., al9 allentown ( pa.) library; wi9withdrawn. most of the codes generate overdue notices; the codes for binding, missing, repair and a few oth~rs do not. several require backup manual records, e.g., ala interlibrary loan forms for charges to outside libraries. batch processes and products overdue notices and daily loan lists are produced in a nightly file maintenance run which also updates the history tape. the preprinted forms used for first and second overdue notices are address-sorted for direct mailing. the third notice, triggered three days after the second and ten days after the first, is a listing with telephone numbers and other data for telephone follow-up. the daily loan list is primarily a back-up record in the event of system down-time. current loans, the number of reserves and other information are combined in one list for all three libraries. the bellrel master book catalog is run quarterly from disk records. main entry, dewey number and access number catalogs are produced. all new copy; new title and other record changes made on disk in maintenance runs are reflected in cumulative weekly catalog supplements . 
these runs also produce all the new or changed cards and labels required. the bellrel catalog is a precursor to a system-wide printed book catalog which will replace nearly one million catalog cards held in eighteen libraries. when completely developed, input to the circulation system will be a sub-system of the master catalog maintenance procedures. maintenance of the disk journal records for bellrel follows a comparable integrated approach: journal code numbers, title abbreviations, data changes and the like are derived from the computer routines used to prepare the serials catalog since 1962. trade catalog files for the bellrel/kennedy 141 the book and the number of copies still available for loan at each library. getting one copy into the hands of the requester is then very simple. the holding library nearest to the borrower is instructed, by telephone or terminal message, to send call number such-and-such "out." the requester's name and address are not relayed. the holding library gets the book from the shelves and cancels it, using the lc command with the card reader. although this copy was not on loan, the computer ignores this fact because someone is waiting for the book, i.e., the requester whose reserve triggered this sequence. as a consequence of the cancel operation, the requester is automatically charged with the book, the holding library is told his name and address, and mailing follows. the lc command is also used in the same way to get additional copies of a book, when purchased to meet high demands, into the hands of the requesters. the la reserve transaction is put to particularly good use in handling the 600-plus requests received within a few days each month for new books announced in the library bulletin. bulletin request forms supply both item numbers ~nd man numbers. mass input follows and the computer responds with all the signposts needed to put every copy in the system to work, with a dispatch speed hitherto impossible to achieve. 
as shown in table 1, two transactions permit changes in reserve queues. ld deletes a requester. lb permits the queue to be bypassed and a new name to be inserted at the top of the list.

queries

this is a fact retrieval facility. the codes listed in table 2 are reasonably self-explanatory, and take into account the realities of on-line circulation service. lqc, for example, tells the status of a title at the moment of asking, an up-to-dateness not available from the backup daily loan list generated each night. typical responses to the lqc code are: copies available, mh02 wh01; title removed my68; or all 03 copies loaned, 14 reserves. similarly, lql provides a requester with an immediate, printed listing of all the items he has on loan. two query codes cause display of the complete disk record for a publication (lqd) or a person (lqe), including current loans, reserves and trailer records.

error detection

in any keyboard operation mistakes will be made. bellrel attempts to signal critical errors and prevent them from affecting records. as noted previously, input man numbers and item numbers are translated by the computer into alpha characters. numerous diagnostics are also returned: e.g., invalid transaction code; invalid book id #; invalid empl #; invalid transaction bad copy #; variable data required, etc. incorrect inputs generating these and

table 2. on-line library query codes. (all queries are keyed.)
code | query | input data
1. publications
lqc | what is the status of title ...? | item no. (less location and copy).
lqs | what is the status of copy ...? | item no. (with location and copy).
lqn | what overnight items are still out? | location symbol only.
lqd | display the complete disk record for title ... | item no. (less location and copy).
2. people
lqm | how many items are on loan to ...?
lql | what items are charged to ...?
lqq | who is first on the reserve queue for ...?
lqr | is man ... on reserve? where?
lqw | who are the borrowers of title ...?
lqz | who is man number ...?
lqe | display the complete disk record for man ...
input data for the people queries, in the order printed: man no.; ditto; man no., item no. (less location, copy); ditto; man no.; ditto; ditto.

digit number on a card. while he is doing this, the clerk hits the 'request' button on the keyboard-printer, and inserts the book card in the card reader along with an end-of-transmission card. with the keyboard 'proceed' light on (2 seconds after 'request'), the clerk returns the typewriter carriage and keys in lm (the loan code) and the man number obtained from the borrower: e.g., lm43486. input is completed by activating the card reader. the card reader reads only to the end-of-block punch (column 16 in book cards), ignoring the author-title data and call number in columns 17-78. as the book identification number is read, it is listed on the typewriter. the loan period is not punched in the card, but is assigned by the computer on the basis of the first digit of the publication's identification number. the assigned loan may be altered from the keyboard, if desired.

the computer responds to the loan transaction in 3 to 5 seconds, printing back in upper-case red the first three letters (trigram) of the borrower's surname and twenty characters of the book's author-title entry. these responses provide checks against errors in keying. man numbers are usually keyed (although they may also be card read) and a book number is keyed when its punch card is not available, e.g., in posting a reserve. as noted below, a wide range of other computer responses are available to flag errors and aid diagnosis. the loan transaction is completed by inserting the punch card in the book and date stamping it. total elapsed time from the borrower's presentation of the book to date stamping averages about 23 seconds for a single loan of the type described.
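the keyed loan message and the card-read item number described above follow fixed layouts, so a minimal parsing sketch is possible. it assumes only what the text states: a two-letter code followed by a five-digit man number (e.g., lm43486), and a six-digit title number whose first digit gives the publication class, followed by a two-letter library symbol and a two-digit copy number (e.g., 127391mh01). the function names, the regular expressions and the error wording are hypothetical; three-letter query codes and the three-character non-personal codes are not handled.

```python
import re
from typing import NamedTuple

# publication classes named in the text, keyed by the first digit of the title number
PUBLICATION_CLASSES = {
    1: "books",
    2: "journals",
    3: "trade catalogs",
    4: "college catalogs",
    5: "continuations and sets",
}


class ItemId(NamedTuple):
    title_no: str   # six-digit identification number
    pub_class: int  # first digit of the title number
    location: str   # holding library symbol, e.g. "mh"
    copy_no: int    # copy number at that library


ITEM_RE = re.compile(r"(\d{6})\s*([a-z]{2})\s*(\d{2})")
KEYED_RE = re.compile(r"(l[a-z])\s*(\d{5})")


def parse_item(text: str) -> ItemId:
    """Split a card-read item number such as '127391mh01' into its parts."""
    m = ITEM_RE.fullmatch(text.strip().lower())
    if m is None:
        raise ValueError(f"invalid book id #: {text!r}")
    title_no, location, copy_no = m.groups()
    return ItemId(title_no, int(title_no[0]), location, int(copy_no))


def parse_keyed(text: str) -> tuple[str, str]:
    """Split a keyed message such as 'lm43486' into (code, man number)."""
    m = KEYED_RE.fullmatch(text.strip().lower())
    if m is None:
        raise ValueError(f"invalid transaction code: {text!r}")
    return m.group(1), m.group(2)
```

for example, `parse_keyed("lm43486")` yields the loan code and man number, and `parse_item("127391mh01")` yields a class 1 (book) title held as copy 1 at library mh. the loan period that the computer assigns from the class digit is not given in the text, so it is left out of the sketch.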
the 23-second figure compares with about 20 seconds cited for one ibm 357 system (3) and 14 seconds in the bell laboratories manual system; however, in both these systems further processing is required. if a borrower wishes to charge out more than one book at a time (a common occurrence which ruled out punching the end-of-transmission code on the book card), up to five books may be handled with one keyboarding. total elapsed time for multiple loans averages about 15 seconds per book.

loans of bound journals, trade catalogs and other publications with prepunched cards follow the routine described. for unbound journals and other items lacking cards, it is necessary to obtain the title number from the printed catalog and to key this in with the relevant issue information. other transaction codes, as noted in table 1, deal with loan period changes and renewals. typical computer responses from the renew code (lr) include: renew; overdue; res waiting; no renew.

returns

the two-character return code (lc) is used with card reading or typing. five items may be discharged with one lc action. the computer responds with twenty characters of author-title, and one of the following messages, for each item:
returned: i.e., the loan is complete and no one is waiting for a copy of this book.
loan to ...: i.e., send the book to the man indicated by name and address. since he was first on the reserve queue, the book is now charged out to him for the loan period shown.
mail to ... libr: i.e., this book belongs to the library shown and should be returned there. no one is waiting for it.
not on loan: i.e., this book was previously cancelled or somebody borrowed it without charging it out.

the loan to ... response noted above is a particularly valuable service and time-saving feature.
in effect, if any reserve exists anywhere in the system for the title, then the first copy returned is automatically charged to the first person in the queue and the next person moved up. the loan period assigned by the computer depends upon whether there is a waiting list for the book. the library does not need to take any charge-out · action except to date stamp the book and address a mail envelope using the information provided . the mail to ... libr response, calling attention to the fact that the book should be returned to its 'home' library, is coupled with automatic charging by the computer to in transit to . . . questions about the copy will receive this response during the time it takes to ship it to its home base. when the book is cancelled at the home library, any reservations made during the 'in transit' phase will cause automatic loan in the manner already described . cancellation of a loan charge without automatic follow-up on the reserve queue is sometimes desirable. for example, after a copy of a book has been charged to 'missing' and search has failed to locate it, a charge to 'lost' may be desirable for record purposes. use of the lk return code, instead of the normal lc, makes this possible without automatic pick-up of the reserve queue. reservations since reserves are posted in bellrel in real time, any copy of a title returned, even seconds after the reserve is made, will be charged to the first man on reserve. reserves are input using the keyboard sequence la man number, item number. the computer response includes the standard name trigram and publication data. if all copies of the title are on loan, the computer also responds with information to the requester on where he stands; as an example, "res #03, copies held 05". if all copies are not on loan, the response includes the call number of l bellrel/kennedy 143 circulation system are similarly correlated with other existing machine processes and products. 
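the return and reserve behavior described above (lc, lk, la, lb, ld) can be sketched as a first-in, first-out queue per title. this is a simplified illustration, not the bellrel programs: the class and function names are hypothetical, a man is identified only by his number, and loan periods, trailer records and the mail to ... libr and in transit responses are left out.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Title:
    loans: dict = field(default_factory=dict)       # copy label -> man number
    reserves: deque = field(default_factory=deque)  # man numbers, first in line at the left


def add_reserve(title: Title, man_no: str) -> int:
    """la: add to the reserve queue and report the requester's position."""
    title.reserves.append(man_no)
    return len(title.reserves)


def bypass_queue(title: Title, man_no: str) -> None:
    """lb: put the designated man first in the queue."""
    title.reserves.appendleft(man_no)


def delete_reserve(title: Title, man_no: str) -> None:
    """ld: delete a requester from the queue."""
    title.reserves.remove(man_no)


def cancel_loan(title: Title, copy: str, pick_up_reserves: bool = True) -> str:
    """lc (pick_up_reserves=True) or lk (pick_up_reserves=False).

    With lc, a waiting reserve is charged the copy even if the copy was
    not on loan, as in the text's account of sending a shelf copy 'out'.
    """
    was_on_loan = title.loans.pop(copy, None) is not None
    if pick_up_reserves and title.reserves:
        next_man = title.reserves.popleft()
        title.loans[copy] = next_man
        return f"loan to {next_man}"
    return "returned" if was_on_loan else "not on loan"
```

a deque is used here because both ends of the queue are touched: la appends at the back, while lb and the automatic charge-out work at the front.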
as stated earlier, much improved feedback on collection use, demand patterns, and other matters important to library management was a major goal of bellrel. the history tapes serve this purpose both for special-purpose analyses and regular system reports. the latter include circulation statistics by subject class and library, laboratory location, user department and so on. three other reports may be mentioned: 1. high-demand list-this is a weekly list focusing attention on all titles with more than a specified number of reserves. reserves and copies are shown by location. previous loan totals are also given to aid in purchase decisions to meet demands. 2. zero-loan list-this is a semiannual listing of all titles in the collection with no recorded loan activity in one or more libraries for the period surveyed. a summary of previous loans is given, to help in decisions on weeding. 3. missing items list-this is a twice-monthly, dewey-ordered list of all titles charged to 'missing.' it is used to conduct scheduled searches in all libraries until the items are converted to 'lost' and replaced or withdrawn. operating experience . this paper is being written after only one month's . use of bellrel in regular service. the following observations are therefore limited. circulation assistants have adapted very quickly to the ·input mechanics, familiarity with typewriter keyboards and the novelty of conversing with the computer being contributing factors .. bellrel appears to be regarded as a powerful and perceptive colleague with the occasional off moments accepted in a friend. · burdensome tasks, such as preparing overdue notices and maintaining card records have ·been dropped with enthusiasm. staff members are developing new perspectives as they understand the functioning of an information network. the total system concept, embracing the resources of all participating libraries . 
and permitting one copy of a book to serve many readers without inter-library loan, is modifying many practices. greater record accuracy, completeness and utility are also being realized, along with significant time-savings throughout the system. the query facility, which shows promise of being much used, provides immediate answers to certain questions which previously could not be asked and gives a glimpse of the eventual responsiveness of a complete on-line library catalog.

customer reaction has ranged from some technical interest (technical staff members were consulted in the development of the system and information about its purposes and functions has been widely disseminated) to more common approval and enthusiasm. the increase in time to charge out a book in person in bellrel (about nine seconds more than the manual system for a single loan and two seconds more per book for multiple loans) appears to be widely accepted. whether this is due to initial tolerance of a new system, or less 'work' by the borrower in the charge operation, or an appreciation that service as a whole will be faster and more responsive, is not known. it is expected that charging time will be reduced with program modifications and experience. it should also be recalled that in two out of three loans the borrower is not present: far from experiencing additional delay, he gets what he wants faster.

the usual bugs in a complex of programs have arisen; certain trailers had to be enlarged; the 360 operating system and hardware have failed several times. down-time, under initial loads of up to 1500 on-line transactions per day, has been less than anticipated for the first month and is expected to drop sharply. about two down-times per day were experienced in the first month, about half of these being deliberate, and most recoveries have taken less than fifteen minutes.
down-time logs are used to record transactions for immediate entry into the system when it becomes alive, a similar procedure being used for after-hours loans. costs the costs of operating the bellrel system are, understandably, higher than the displaced manual system, the two systems, of course, not being comparable in services and functions. in the operations which can be fairly directly correlated, bellrel permits very significant labor savings. appreciable materials savings are also anticipated as a result of collection pooling (leading to reduced duplication of resources in the individual libraries), better inventory control, and other factors. rental costs are the major component. each of the six terminals, for example, with associated data-sets and telephone lines, costs $275 a month. costs of the portion .of the transmission control unit and disk facility used by the libraries total about $1100 a month. in addition to a small amount for materials, other costs include a share of the central processing unit and core memory charges, depending upon usage. to execute 1500 real-time transactions per day appears to require less than 12 minutes of main-frame computer time, but a share of the real-time terminal polling and batch processing time must also be included. however, experience with the automated system has been far too brief to reach any precise cost figures for the whole system. in particular, although the dollar value of the largely intangible but very real benefits to library users and library staff can only, at least at this stage, be guessed at, bellrel has been implemented on the premise that these benefits are major. it should be noted that the costs of the manual (newark) system in bell laboratories differ greatly from the costs calculated by the library technology project ( l tp) for this system in an academic library ( 12). 
ltp cost estimates for both the newark and the ibm 357 systems do not conform to our calculations for more reasons than can be discussed here. in the main, however, environmental conditions, strongly affecting labor costs, are too different. for example, in arriving at labor costs, ltp uses the figure of 44 overdues per 1000 circulations in academic libraries; in our library system where there are no fines or long loans, overdues total about eight times this figure. again, as a result of book announcement services, discipline concentration and other factors, reserves in the bell laboratories libraries are nearly twenty times the ratio used by the library technology project for academic libraries. still further, in bell laboratories some 200,000 loans per year are made without the borrower being present to fill in the loan card. these and other factors add heavily to the cost of labor. few industrial organizations can obtain labor at the cost of $2.00 per hour cited in library technology reports when personnel benefits and other overhead are included.

conclusion

paul fasana has observed: "since cost is primarily a quantitative measure of a system, it is but one of several factors (and possibly not even the most important factor) to consider in evaluating an automated system. other factors . . . qualitative factors . . . must also be considered. . . . they include such items as operating efficiency, reliability, services rendered, and growth potential." (13). a full judgment on these factors in the bellrel system must await further experience but the following observations may be made:
1) bellrel is not an experiment; it is addressed to practical problems in an industrial library network.
2) it is not a final system; software and hardware evolution will see to that.
3) it is not a model system, transportable in toto to another context; any system of comparable complexity and investment requires careful matching to local needs and objectives.
bellrel objectives, to reiterate, include improved service through computer pooling of dispersed library collections, up-to-date reporting on the status of any publication, immediate identification of all items on loan to a person and automatic follow-up on reserve queues; reduced clerical labor; better inventory control; much enriched feedback for library management; more effective realization of the information network philosophy; and experience in the new era of man-machine communication in a real-life environment. the evidence is strong that these objectives are being achieved.

acknowledgments

the technical information libraries gratefully acknowledge the unstinting and imaginative aid given by the comptroller's division of bell telephone laboratories in the design, development and operation of the bellrel system.

bibliography

1. george fry and associates, inc.: study of circulation control systems (chicago: ala, 1961).
2. american library association, library technology project: the use of data-processing equipment in circulation control (chicago: ala, july 1965), library technology reports.
3. mccoy, ralph e.: "computerized circulation work: a case study of the 357 data collection system," library resources & technical services, 9 (winter 1965), 59-65.
4. flannery, anne; mack, james d.: mechanized circulation system, lehigh university library (center for the information sciences, lehigh univ.: nov. 1966), library systems analyses report no. 4.
5. cammack, floyd; mann, donald: "institutional implications of an automated circulation study," college & research libraries, 28 (march 1967), 129-32.
6. cuadra, carlos a., ed.: american documentation institute annual review of information science and technology, vol. 1 (new york: interscience, 1966), pp. 201-4.
7. mccune, lois c.; salmon, stephen r.: "bibliography of library automation," ala bulletin, 61 (june 1967), 674-94.
8. kimber, richard t.: "studies at the queen's university of belfast on real-time computer control of book circulation," journal of documentation, 22 (june 1966), 116-22.
9. lazorick, gerald j.; herling, john p.: "a real time library circulation system without pre-punched cards," proceedings of the american documentation institute, v. 4 (washington: adi, 1967), 202-6.
10. croxton, f. e.: on-line computer applications in a technical library (redstone scientific information center, redstone arsenal, alabama: november 1967), rsic-723.
11. ruecking, frederick, jr.: "selecting a circulation-control system: a mathematical approach," college & research libraries, 25 (sept. 1964), 385-90.
12. american library association, library technology project: three systems of circulation control (chicago: ala, may 1967), library technology reports.
13. fasana, paul j.: "determining the cost of library automation," ala bulletin, 61 (june 1967), 661.

adventure code camp: library mobile design in the backcountry
david ward, james hahn, and lori mestre
information technology and libraries | september 2014 45

abstract

this article presents a case study exploring the use of a student coding camp as a bottom-up mobile design process to generate library mobile apps. a code camp sources student programmer talent and ideas for designing software services and features. this case study reviews process, outcomes, and next steps in mobile web app coding camps. it concludes by offering implications for services design beyond the local camp presented in this study. by understanding how patrons expect to integrate library services and resources into their use of mobile devices, librarians can better design the user experience for this environment.
introduction mobile applications offer an exciting opportunity for libraries to expand the reach of their services, to build new connections, and to offer unique, previously unavailable services for their users. mobile apps not only provide the ability to present library services through mobile views (e.g., the library catalog and library website), but they can tap into an ever-increasing list of mobile-specific features. by understanding how patrons expect to integrate library services and resources into their use of mobile devices, librarians can better design the user experience for this environment. by adjusting the normal app production workflow to directly involve students during the formative stages of mobile app conception and design, libraries have the potential to generate products that more accurately anticipate real-life student needs. this article details one such approach, which sources student talent to code apps in a fast-paced, collaborative setting. as part of a two-year institute of museum and library services (imls) grant, an academic library– based research team investigated three different methods for involving users in the app development process—a student competition, integrated computer science class projects, and the coding camp described in this article. the coding camp method focuses on a trend in mobile software development of having intensive two-to-three-day coding events that result in working prototypes of applications (e.g., iphonedevcamp, http://www.iphonedevcamp.org/). coders typically work in groups to simultaneously learn how new software works, and also develop a functioning app that matches an area of personal interest. camps promote collaboration, which provides additional networking and social outcomes to attendees. 
additionally, camps provide an opportunity for software makers to promote their services and products, and they can result in new code and ideas on which to base future products. for academic libraries, a camp environment provides an educational opportunity for students, particularly those in a field with a computing or engineering focus, to learn new coding languages and techniques and to gain experience with a professional software production process that runs the full timeline from conception to finished product. coding camps offer a chance for librarians to get direct student feedback on their own software development goals. the resulting applications provide potential benefits to both groups: students have a functional prototype to enhance their classroom experiences and a codebase to build on for future projects, and the librarians gain an insight into students’ desires for the content of mobile apps, code to integrate into existing apps, and direct student input into the iterative design process.

david ward (dh-ward@illinois.edu) is reference services librarian, james hahn (imhahn@illinois.edu) is orientation services & environments librarian, and lori mestre (lmestre@illinois.edu) is head, undergraduate library and professor of library administration, university of illinois at urbana-champaign.

this article presents the results of a mobile application coding camp held in fall 2013. the camp method was tested as a way to explore a less time- and staff-intensive process for involving students in the creation of library mobile apps. three specific research questions framed this investigation:
1. what library and course-related needs do students believe would benefit from the development of a mobile application?
2.
is the library providing access to data that is relevant to these needs, and is it available in a format (e.g., restful apis) that end users can easily adopt into their own application design process? 3. how viable is the coding camp method for generating usable mobile app prototypes?

literature review

in line with efforts in academic libraries to operationalize participatory design for novel service implementation,1 the library approach to code camps included sourcing student technical expertise, much as tech companies do when quickly iterating prototypes that may advance or enhance company services. some coding camps happen in corporate settings, other camps try to publicize a technology, such as a programming language, and others still are directed toward a specific cohort.2 the departure point for the library was in understanding other ways the library might consider organizing and pairing its api resources with other available campus services. a few highly visible and notable corporate “hackfests” or “hackdays” include the facebook hackdays, in which facebook timeline was developed (http://www.businessinsider.com/facebook-timeline-began-as-a-hackathon-project-2012-1). the mobile app company twitter also has monthly hackfests where employees from across the company work for a sustained period (a weekend or friday) on new ideas, putting together prototypes that may transition into new services for the company.

information technology and libraries | september 2014

an example of code camps from academia is the mhacks camps at the university of michigan (http://mhacks.challengepost.com/), among the largest code camps for university students in the midwest. these camps are notable for their funding from corporations and for their support of student travel from colleges around the country to participate at the university of michigan.
at each event, coders are encouraged to make use of the corporate apis that student programmers may make use of once they graduate or form companies after graduation. on the professional front, digital library code meet-ups (such as that of the code4lib preconference: http://code4lib.org/) are an opportunity for library technologists to share strategies and new directions in software using hands-on group coding sessions that last a half or full day. a recent digital event for the digital public library of america (dpla) hosted hackfests to demonstrate interface and functional possibilities with the source content in the dpla. similarly, the hathi trust research center organized a developer track for api coaching at their conference so that participants would have hands-on opportunities to use the hathi trust api (http://www.hathitrust.org/htrc_uncamp2013). goals of coding camps include development of new services or creation of value-added services on already existing operations. code is not required to be complete, but functional prototypes help showcase new ways of approaching problems or novel solutions. recently, mhacks issued the call to form new businesses at their winter hackathon (http://www.mhacks.org). libraries are typically less interested in new businesses, but rather seek new service ideas and new principles for organizing content via mobile and to do so in such a way that will source student preferences for location specific services, a key focus for the research team’s student/library collaborative imls grant. method while the camp itself took only two days, there was a significant amount of lead-time needed to prepare. in addition to obtaining standard campus institutional-review-board permissions for the study, it was also necessary to consult the office of technology management to devise an assignment agreement covering the software generated by the camp. 
the research team chose a model that gave participating students the option to assign co-ownership of the code they developed to the library. this meant that both students and the library could independently develop applications using the code generated during the camp. marketing for the camp specifically targeted departments and courses where students with interest and skills for mobile application development were likely to be found, particularly in computer science and engineering. individual instructors were contacted, as well as registered student organizations, to help promote the camp. prospective attendees were directed to an online application form, where they were asked to provide information on their coding skills and details on their interest in mobile application development. ten students were ultimately selected from the pool and, of those, six attended the camp. a pre-camp package was sent to these students to help them prepare for the short, intense timeframe the event entailed. this package included details on library data that were available to base applications on through web apis, as well as brief tutorials on the coding languages and data formats participants needed to be familiar with (e.g., javascript, json, xml). participants were also provided with information on parking and other logistics for the event. the research team consisted of librarians and academic professionals involved in public services and mobile design, and student coders employed by the library to serve as peer mentors. the team designed the camp as a two-day experience occurring over a weekend (friday evening to saturday late afternoon).
the first day was scheduled as an introduction to the camp, with details on library and related apis that could be used for apps and an opportunity for participants to brainstorm app ideas and form design teams. the day ended with some preliminary coding and consultation with camp organizers about planned directions and needs for the second day. the second day of the camp was reserved mostly for coding, with breaks scheduled for food, presentations of work-in-progress, and an opportunity to ask questions of the research team. the day ended with each team presenting their app, describing their motivation in designing it and the functionality they had been able to code into it. given the brief turnaround time, the research team put a heavy focus during the orientation session on clearly articulating the need to develop apps germane to student library needs. examples from the student mobile app design competition conducted in february 2013 were provided as starting points for discussion, as these reflected known student desires for mobile library applications.3 after the camp ended, students who elected to share their code with the library were given details on how and where to deposit the code. post-camp debriefing interviews (lasting 30 to 45 minutes each) were scheduled individually with all participants to get their feedback on the setup of the event as well as what they felt they learned from the experience.

discussion

researcher observations and feedback from students, both during the camp and in individual interviews afterwards, led to several insights about what sorts of outcomes libraries might anticipate from running camps, how to best structure library coding camps, what outcomes students anticipate from participating in a sponsored camp environment, and what features and preferences students have for mobile apps designed to support their academic endeavors.
a key student assumption, which emerged from comments at the event and through subsequent student interviews, was that students anticipated completing a fully functioning mobile app by the end of camp. instead, the two student teams each finished with an app that, while it included some of the features they desired, still required additional coding to be fully realized. several suggestions were made for how this need might be met at future events. the most consistent feedback from the students was that they would have liked an additional day of coding (three total camp days), so that they could have gotten further on the implementation of their app ideas. during the exit interviews, one student noted that the two-day timeframe really only allowed for sketching out an idea for an app, not coding it from scratch. a pair of related suggestions from students were, first, to have templates for mobile apps available to review in order to get up to speed on new frameworks (particularly jquery) and, second, to hold a longer meet-and-greet for teams prior to beginning work, during which they could compare available coding skills and do some extended brainstorming of app ideas. students were somewhat mixed in their desire for assistance in developing app ideas: some appreciated the open-endedness of the camp, but others wanted a more organizer-driven approach. some students suggested having time to work with library staff after the camp to finish or polish their apps. this suggestion reflects the enthusiasm students had for the camp itself, and specifically for having a social, structured, and mentored opportunity to develop their coding skills. in response, the research team created “office hours” on the fly after the camp ended. research team members and coding staff communicated times when team members could come into the library and get additional help with developing their apps.
the students had very similar themes for app features to those that the research team observed in an earlier student mobile app competition study. notable categories included the following:
• identify and build connections with peers from courses.
• discover campus resources and spaces.
• facilitate common activities such as studying and meeting for group work.
students remarked that the camp was an opportunity both to meet people with similar coding interests and to learn more about specific functional areas of app development (specific coding languages, user interface design, etc.) in which they had little experience. jquery and javascript for user-facing designs were particular areas of interest. many students had some in-depth background working on pieces of a finished software product but had not previously done start-to-finish software design; this was a big selling point for the camp. the collaborative nature of the camp also matched students’ preferences to work in teams and to learn from peers. while the research team had coders on hand to assist with both the library apis and jquery basics, most teams did the majority of their work themselves and preferred self-discovery. each team did eventually ask for coding advice, but this occurred toward the end of the camp, once their apps were largely coded and they needed assistance overcoming particular sticking points. the other piece of advice students asked organizers about concerned identifying apis for locations of campus maps, and other related resources to serve as data sources powering their apps. in the course of assisting with these requests, researchers discovered another key issue facing library mobile app development: the lack of campus standards for presenting information across different colleges and departments.
in particular, maps of rooms inside campus buildings were not provided in a consistent or comprehensive way. this was particularly frustrating to the team that was attempting to develop an app featuring turn-by-turn navigation and directions to meeting rooms and computer labs. in addition to sharing information on known apis and data sources, camp organizers also learned about previously unknown data sources from the student teams. one example was a json feed for the current availability of computers in labs provided by the college of engineering. while this feed was beneficial to starting work on an app for one team, it also led to frustration because feeds for other campus computer labs did not exist, and the team was limited to designing around the specific labs that did have this information available. observed student discussions about the randomness of data availability also highlighted one of the key themes of student-centered design—the conceptualization of a university as a single entity, the various parts of which combine and come in and out of focus depending on the current student task. related student feedback from one of the post-event interviews described a strong desire to create integrated, multifunction apps to meet student needs as opposed to a variety of apps that each did one thing. the siloed nature of campus infrastructures frustrates this desire to some extent but also creates opportunities for students to build a tool that meets a real need among their peers to comprehend and organize their academic environment. this observation also matches those found during the aforementioned student competition. conclusion and future directions student feedback on the camp, as a whole, was very positive, and in the individual interviews, students noted they would like to participate in another camp if it was offered. 
on the library side, the research team felt that the camp was useful to their ongoing mobile app development process, partially for the code generated but primarily for the direct feedback on what types of apps students wanted to see. the start-up time and costs for the project were low, as expected, and the insights into student mobile preferences seemed proportionate to this outlay. the camp method should be reproducible in a variety of library environments. the key assets other libraries will need to have in place to run a camp include staff with knowledge of client-side api use (in particular jquery, cors, or related skills), and knowledge of campus data sources that students may wish to pull from. third-party apis with bibliographic data (e.g., goodreads) could also be used as placeholders for libraries that do not have access to apis for their own catalogs or discovery systems. student suggestions for extending the camp by a day, and their ideas for how to structure it for student success, were very specific and actionable and provided excellent guidance. one of their ideas was to develop tutorials and templates that could be introduced at a pre-camp meeting. this would not add too much prep time. another idea for a future camp would be to develop a specific theme for teams, which would allow for more documentation of and practice with specific apis. the low attendance was a concern, so for the next camp twice the number of desired participants will be invited to ensure both a variety of coding skills and interests and opportunities for more teams to be formed. additionally, partnerships with student coding groups or related classes should help to drive up attendance. the biggest difficulty moving forward will be developing campus standards for data that can be made available to students about resources, spaces, and services.
as noted above, students typically do not design a “library app,” rather they look to build a “student app” that pulls in a variety of data from across campus. functions of apps are therefore more oriented toward common student activities like studying, socializing, and learning. a related challenge will be to provide adequate format and delivery mechanisms for access to supporting data feeds. cognizant of the silo issue, noted above, as libraries present their own data for student consumption, these tendencies towards a unified view need to be taken into account. completion of an assignment is more than identifying three scholarly sources; it might involve identifying a space to do the research, locating peers or mentors for either the research or writing process, locating suitable technology to complete an assignment, and a variety of other needs. the features and information presented on a library’s website should be designed as modular building blocks that can fit into other campus services in a similar way to how course reserves are sometimes presented in campus learning management services alongside syllabi and assignments. separating library content (e.g., full-text articles, room information, research services) from library process can help with freeing information about what libraries have to offer and can facilitate broader discovery of services and resources at point of need. key to this process is recognizing the student desire to shape the resources they need into a comprehensible format that matches their workflow rather than forcing students to learn a specific, isolated, and inflexible path for each part of the projects they work on. this study has shown that a collaborative process in technology design can yield insights into students’ conceptual models about how spaces, resources, and services can be implemented. 
while the traditional model of service development often leaves these considerations until the very end in a summative assessment of service, the coding camp and collaborative methods presented here provide librarians a new tool for adding depth to service design and implementation, ultimately resulting in services and platforms that are informed by a more well-rounded and deeper understanding of the student mobile-use experience. in that regard, the initial research questions that framed this study could also be used by other libraries as they explore which library and course-related needs students believe would benefit from the development of mobile applications, and as they determine whether their library provides access to data that is relevant to those needs. the results from this study have affirmed that, for at least the library from this study, the coding camp method is viable for generating usable mobile app prototypes. they also affirmed that when students are directly involved during the formative stages of mobile app conception and design, the resulting apps more accurately reflect real-life student needs.

references
1. council on library and information resources (clir), participatory design in academic libraries: methods, findings, and implementations (washington, dc: clir, 2012), http://www.clir.org/pubs/reports/pub155/pub155.pdf.
2. “hackathon,” wikipedia, 2014, http://en.wikipedia.org/wiki/hackathon.
3. david ward, james hahn, and lori mestre, “designing mobile technology to enhance library space use: findings from an undergraduate student competition,” journal of learning spaces (forthcoming).

production of library catalog cards and bulletin using an ibm 1620 computer and an ibm 870 document writing system

donald p.
murrill: philip morris, incorporated, richmond, virginia

a program is presented which runs on an ibm 1620 computer and produces punched cards that activate an ibm 870 document writing system to type catalog cards in upper- and lower-case characters. another program produces punched cards which instruct the 870 to type a library accessions bulletin. the programs are written in fortran ii and are described in detail. estimates of costs and production times are included.

production of library catalog cards/murrill

producing library catalog cards and accessions bulletins with the aid of a computer is not a new idea. since 1963 several published papers have described projects that have resulted in the production of such cards and bulletins, either as the principal end products or as two products under a total systems concept. kilgour (1,2,3), while with the yale medical library, described a project which had been developed jointly with the harvard and columbia medical libraries under a grant from the national science foundation. six programs were written to produce catalog cards by means of an ibm 1401 computer and either a 1403 line printer or an ibm 870 document writer. these programs were of interest to the philip morris, incorporated, research center library because of the upper- and lower-case printing capability. during the period 1961-1963 the technical library of the bureau of ships, department of the navy, used a 1401 computer to automate the preparation of 3x5 catalog cards and the library accessions bulletin. in reports concerning project sharp (ships analysis and retrieval project) (4,5) these functions are described as being two of eleven formal outputs of the project. production of the cards is termed a "by-product" of the accumulated bibliographic data.
in an ibm publication concerning the administrative terminal system, holzbaur and farris (6) list the production of catalog cards and of the library bulletin as two of approximately seven outputs of a total system using the 1401 computer. ibm began automatic card preparation in 1963. other publications by ibm also deal with the production of catalog cards and library bulletins with the help of a computer (7,8). in 1964 buckland (9) prepared a report for the council on library resources, inc., in which he described the preparation of catalog data on a tape-punching typewriter. the perforated tape was processed by computer for phototypesetting, tape typewriter, or line printer output. at the 1967 meeting of the american documentation institute (now american society for information science) cariou (10) discussed the preparation of catalog cards by means of a computer and an ibm document writing system. she programmed her computer to count the number of spaces between sentences and to use this count to determine the type of information it could expect next and, thus, what kind of processing it should give that information. the files set up by this program were used with another program to punch cards for the document writer. the computers used with the programs discussed in the foregoing were of the ibm 1400 series or equivalent. the philip morris research library has access to such computers but only on a limited basis. it has ready access to a 1620 computer. for this reason its programs were written for use with the 1620. the punched card output is processed by the 870 document writing system to produce printout in upper- and lower-case characters. the philip morris library contains 5000 books, subscribes to 600 journals, and serves approximately 300 people. it is growing at the rate of approximately 1000 volumes per year.
the ibm 1620 computer

the 1620 computer for which the programs described in this paper were written has 20,000 positions of core, and typewriter and punched card input. it has typewriter and punched card output and differs from the computers used with the programs mentioned earlier in not having magnetic tape or a line printer. the lack of these devices does not detract from the final output, although that output is not produced as fast with the 870 document writer as it would be with a printer. also, data punched into cards cannot be stored on tape but must be saved, if desired, in the cards. there are many 1620 computers extant, however, and many libraries which might use this machine for the automation of their library functions. programs which permit the use of the 1620 in the production of catalog cards and accessions bulletins might be of help to these libraries. the programs are written in fortran ii, specifically for compilation with a pdq fortran processor deck, as developed by maskiell (11) of mcgraw-edison company. when the program for the library catalog cards is compiled, columns 7-13 of card 25058 in the fixed format subroutine deck must be changed to 4903158. this eliminates the punching of sequence numbers in the computer output cards which are to be run through the 870 system, thus permitting use of the last eight columns for turn up control characters (&) and for the forms feed character ( ).

journal of library automation vol. 1/3 september, 1968

the ibm 870 document writing system

the 870 document writing system is a combination of a control unit, which is a card punch machine with a control panel, and a typewriter. a complete system could include an auxiliary keyboard, a paper-tape punch, an auxiliary card punch, and a second typewriter; but for an output of library catalog cards and bulletin only the control unit and one typewriter are needed.
the punched cards whose contents are to be typed are placed in the hopper of the control unit and are passed in a continuous feed under the read head. the control panel interprets the punched characters in the cards and produces the desired alphameric symbols and punctuation marks on the typewriter. for the production of library catalog cards, continuous 3x5 card forms are put into the typewriter and the punched cards which are output from the computer, as explained in the next section, are passed through the control unit. carriage turnup is controlled by a continuous chain of small beads in which four large beads are equally spaced three-and-a-half inches apart. one rotation of the chain corresponds to the turnup of four 3x5 cards. before typing begins, one of the large beads is positioned to coincide with the top of a card. a special character in the last card of a unit of punched cards obtained from the computer activates the carriage control, causing turnup to the next large bead and the top of the next card. eleven special character exits on the control panel correspond to the special characters on the 836 card-punch keyboard. any one of the exits can be wired to do a certain job when the special character is encountered in a punched card. it seems logical to have a punched period produce a typed period, a punched comma a typed comma, and a punched slash a typed slash. these three characters, along with the numbers, are lower case on the 866 typewriter. other special characters on the typewriter, such as parentheses, brackets, colon and semicolon, are obtained by punching the upper-case shift character in the card immediately before the appropriate lower-case character. the left parenthesis, for example, is produced from an upper-case 9, the right parenthesis from an upper-case 0. reference to the typewriter keyboard gives ready knowledge of how to obtain any desired character (see table 1).
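the shift convention described above can be illustrated with a short python sketch. it is not part of the original fortran ii programs. it assumes "#" is the single upper-case shift character (as listed in table 2), that unshifted letters print in lower case, and it covers only the two shifted pairings the text states (upper-case 9 and 0). the function name encode_for_870 and the lower-case spelling of unshifted letters are illustrative choices, and the multiple upper-case shift is deliberately left out.

```python
import string

UPPER_SHIFT = "#"  # single upper-case shift character (assumed from table 2)
# Only the two shifted pairings stated in the text are modelled here.
SHIFTED_DIGITS = {"(": "9", ")": "0"}
# Characters assumed to need no shift: letters (written lower case here),
# digits, space, period, comma, and slash.
PLAIN = set(string.ascii_lowercase + string.digits + " .,/")


def encode_for_870(text: str) -> str:
    """Return the character sequence a keypuncher would punch for `text`."""
    punched = []
    for ch in text:
        if ch in PLAIN:
            punched.append(ch)
        elif ch in string.ascii_uppercase:
            # A capital letter is the shift character followed by the letter.
            punched.append(UPPER_SHIFT + ch.lower())
        elif ch in SHIFTED_DIGITS:
            # "(" and ")" are upper-case 9 and upper-case 0.
            punched.append(UPPER_SHIFT + SHIFTED_DIGITS[ch])
        else:
            raise ValueError(f"no punch sequence known for {ch!r}")
    return "".join(punched)


print(encode_for_870("New York (1966)"))  # prints: #new #york #91966#0
```

each shift character occupies a card column without producing a typed character, which is why the number of usable columns per card depends on how many of them a line contains.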
other special character exits on the control panel are wired to produce lower-case shift, single and multiple upper-case shift, typewriter carriage return, tab control, underlining (obtained from an asterisk punched immediately before the character to be underlined), and forms feed (see table 2).

table 1. special character production
(@ = lower case shift, # = upper case shift)
card punch input: @. @, @/ #1 #2 #3 #4 #5 #6 #7 #8 #9 #0 #. #, #/
typewriter output: [column not recoverable from this copy; surviving symbols include ' " + [ ] ? & ( ) and underline]
(punched "&" prints as "+", punched "%" as "(", and punched ":if' as "-'' ).

table 2. control panel wiring
typewriter 1 on -> column zero
single card read on -> column zero
single carriage return -> card drop out
special character exits (common channel):
.  -> type-only entry .
,  -> type-only entry ,
/  -> type-only entry /
o  -> type-only entry o
&  -> carriage return
@  -> lower case shift
#  -> single upper case shift
$  -> multiple upper case shift
%  -> typewriter tab control
( ) -> forms feed start

to obtain panel control of the typewriter the star wheels of the control unit must be engaged and a program card must be on the drum. for the library programs a blank program card is used. the same panel that is used to control the printout of the catalog cards is used for the library bulletin. the same manually punched cards are used as input data to both programs, with modifications in the case of the bulletin, as will be noted later.

library catalog cards

the data cards containing the bibliographic information on the books being cataloged must be punched with care, but no worksheet need be filled out by the librarian. this saves him a great deal of time and trouble.
he must designate the call number, tracings, and other details for the keypunch operator, but he can do this on a plain piece of paper, which also contains a transcription of the title page or of an lc proof slip. he does not need to remember or look up any codes, nor does he need to be concerned with where each letter will go in the punched cards. the keypuncher must know certain details, of course; e.g., that the author's name always starts in column nine, and that two blank spaces are inserted after a period, one after a comma. she must know the special characters which have been wired in the control panel to produce upper- and lower-case printout on the 866 typewriter. she must remember that the character for multiple upper-case printing will produce capital letters until a different control character is encountered, and she must punch the appropriate control characters where needed. these are details which are quickly learned with use, but because of them only one keypunch operator should be selected to handle the library data and to be responsible for production of the catalog cards. the 3x5 catalog card will hold seventeen typed lines. it was arbitrarily decided that the tracings, i.e., the subject entries and added entries, would always start on line thirteen. if there are more than twelve lines of bibliographic information before the tracings, the program described here will not punch the thirteenth card, and the computer processing will stop. if there are more than seventeen lines total, the program will not provide for automatic turnup to the necessary continuation card. catalog cards which fall into these two categories must be typed manually, but only a small percentage of cards need to be prepared in this way. computerizing the production of catalog cards enables one set of keypunched cards to produce several sets of computer punched cards and, thus, several catalog cards for each book. the necessity for typing the card manually is eliminated.
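the two limits just described (seventeen lines per card, tracings fixed at line thirteen) amount to a simple check on each book's data. the python sketch below is an illustration only, not the author's fortran ii logic. the function name plan_card and its return values are hypothetical, and it reads "seventeen lines total" as the twelve lines reserved above the tracings plus the tracing lines themselves.

```python
CARD_LINES = 17        # a 3x5 catalog card holds seventeen typed lines
TRACINGS_START = 13    # tracings always begin on line thirteen


def plan_card(body_lines: int, tracing_lines: int) -> str:
    """Classify a card as 'automatic' or 'manual' under the stated limits."""
    if body_lines > TRACINGS_START - 1:
        # More than twelve body lines: the program stops at the 13th card.
        return "manual"
    if (TRACINGS_START - 1) + tracing_lines > CARD_LINES:
        # Tracings would run past line seventeen; no automatic continuation.
        return "manual"
    return "automatic"


print(plan_card(9, 3))    # fits on one card
print(plan_card(13, 2))   # too many body lines
print(plan_card(10, 6))   # tracings overflow the card
```

under this reading at most five tracing lines fit on a card, whatever the length of the body.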
in the program presented here each punched card represents one printout line on the catalog card. this makes it possible to count the number of lines before the tracings, by counting each punched card as it is read by the computer, and to skip as many lines as is necessary to always start the tracings on line thirteen. to facilitate explanation in the following discussion certain terms and definitions have been assumed: a "unit" is all of the manually punched cards required for all of the catalog cards for a given book, including main entry cards and tracing cards, the latter being those with headings taken from the subject and added entries; a "set" is all of the manually punched cards required for the main entry card and not including the headings punched separately; the "body" cards are those required for the "body" of the catalog card, down to but not including the tracings. figure 1 shows how the keypunched cards look.

[fig. 1. keypunched cards. the figure labels the class code, subject code, and cutter number on the call-number cards and marks the extent of a unit. legible card text: "$tx", "@415", "#f29"; "#fritzsche #brothers #inc."; "#guide to flavoring ingredients as classified under the #federal #food, #drug and #cosmetic #act. #new #york, 1966. 84p."; "1. #flavoring essences. i. #title."; followed by the separately punched headings "flavoring essences" and "#guide to flavoring ingredients as classified under the #federal #food, #drug and #cosmetic #act".]

certain controls must be punched into the data cards. a "1" is punched into a field called mon to indicate the last body card; a "1" is punched right-justified into a two-character field called kon to designate the last tracings card, before a repetition of the tracings to serve as headings; and "-1" is punched into the field kon to indicate the last card for a given book. one rule must be followed: if there are no non-spacing characters in a card, e.g., upper- and lower-case shift characters, punching should not extend beyond column 57.
for each non-spacing character punching can be extended one column, but must not extend beyond column 65.

in the read statement of the computer program discussed here sixty-five columns of alphameric data are read and stored in an array. the contents of mon and kon are saved until the next card is read.

    10 read 11, m1(i),m2(i),m3(i),
      1m4(i),m5(i),m6(i),m7(i),
      2m8(i),m9(i),m10(i),m11(i),
      3m12(i),m13(i),mon,kon
    11 format(13a5,11x,2i2)
    (read statement for all lines on main entry cards and all lines except headings on tracing cards.)

204 journal of library automation vol. 1/3 september, 1968

as long as mon and kon are zero, each card is punched as it is read, and the program returns to the read statement:

    15 punch 8, m1(i),m2(i),m3(i),
      1m4(i),m5(i),m6(i),m7(i),
      2m8(i),m9(i),m10(i),m11(i),
      3m12(i),m13(i)
    8 format(13a5)
    14 go to (32,33),nbc
    32 i=i+1
       go to 10
    (punch statement for all lines except last of body of main entry cards. nbc is set to "1" at beginning of program.)

when the last body card is read and mon=1, the program branches to a statement which stores n-1, n being the number of cards read to that point. the program then calculates the difference between thirteen and n and branches to the appropriate statement for punching the last body card and the special characters needed to produce the number of blank lines which will begin the tracings on line thirteen in the printout:

    23 max=13-n
       go to (15,19,20,21,22,34), max
    21 (e.g.) punch 30, m1(i),m2(i),
      1m3(i),m4(i),m5(i),m6(i),
      2m7(i),m8(i),m9(i),m10(i),
      3m11(i),m12(i),m13(i)
    30 format(13a5,11x,4h&&&&)
    (sample punch statement for last body line; includes special characters to produce skipped lines.)

the computed go to contains only six statement numbers to which the program can branch because of the limited memory of the 1620 computer. this means that there must be at least seven cards before the tracings and, if necessary, blank cards must be added to reach this count.
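the branch in statement 23 can be restated as a short python sketch. it is an illustration under stated assumptions, not the author's code: the function name is hypothetical, and the blank-line count (twelve minus n) is derived here from the rule that tracings start on line thirteen. the text does not list the skip characters punched by each branch, so the sketch reports only the statement number chosen.

```python
TRACINGS_START = 13
# Statement numbers from the computed GO TO quoted in the text.
BRANCH_LABELS = (15, 19, 20, 21, 22, 34)


def skip_plan(n: int):
    """Return (max, statement number, blank lines) for n body cards.

    n is the number of body cards read when mon=1 is found.
    """
    max_ = TRACINGS_START - n            # "23 max=13-n"
    if not 1 <= max_ <= len(BRANCH_LABELS):
        # Only six branch targets exist, so n must lie between 7 and 12.
        raise ValueError(f"n={n} is outside the range the program handles")
    label = BRANCH_LABELS[max_ - 1]      # a computed GO TO is 1-based
    blank_lines = TRACINGS_START - 1 - n  # lines n+1 .. 12 are left empty
    return max_, label, blank_lines


if __name__ == "__main__":
    for n in (7, 9, 12):
        print(n, skip_plan(n))
```

with nine body cards the sketch selects statement 21, the sample punch statement shown above, and leaves lines ten through twelve blank. fewer than seven body cards falls outside the six branch targets, which is why blank cards must be added.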
the subject entries and added entries, i.e., the tracings, are read and punched, card by card, after a branch back to the read statement (10). with the reading of the last tracings card, kon=1. the program branches to a statement which punches the last tracings data and a special character ( ) which, during printout, will cause the typewriter to turn up to a new 3x5 card. this completes the preparation of punched cards for the first main entry card. most libraries need more than one such card. additional sets of punched cards are prepared by means of a do loop and a return to an earlier part of the program:

    36 do 6 k=1,nb
    6 punch 8,m1(k),m2(k),
      1m3(k),m4(k),m5(k),m6(k),
      2m7(k),m8(k),m9(k),m10(k),
      3m11(k),m12(k),m13(k)
    (nb has been previously set to one less than the number of body cards. punch statement for all except last body card for all main entry cards except the first.)

max is again defined in statement 23 and the last body card is again punched. statement 14 sends the program to statements which punch the tracings for the second and subsequent cards:

    33 nix=nb+2
       if(not-nix)1,9,9
    (not=one less than total number of cards in set.)
    9 do 47 jo=nix,not
    47 punch 8, m1(jo),m2(jo),
      1m3(jo),m4(jo),m5(jo),
      2m6(jo),m7(jo),m8(jo),
      3m9(jo),m10(jo),m11(jo),
      4m12(jo),m13(jo)
    (punch statement for all but last line of tracings.)
    1 i=not+1
       punch 26, m1(i),m2(i),
      1m3(i),m4(i),m5(i),m6(i),
      2m7(i),m8(i),m9(i),m10(i),
      3m11(i),m12(i),m13(i)
    26 format(13a5,14x,1h-)
    (punch statement for last line of tracings; includes special character to produce typewriter turn up.)

a count is kept, in a fixed point variable, ncs, of the number of card sets which have been prepared. this variable is now used to determine the next step in the program:

       go to (36,36,53,53,53,53,53,53,etc.),ncs

the number of card sets punched for main entry cards is one more than the number of times "36" is inserted in the foregoing computed go to statement.

the headings for the tracing cards, as shown in figure 1, are placed after the card set, thus completing the card unit. a "1" in the mon field controls the processing of one heading at a time. if a heading requires more than one card, the control is punched into the last card. preparation of the remaining punched cards required for the tracing cards is provided for in the following statements:

    59 m=1
    2 read 11,n1,n2,n3,n4,n5,n6,
      1n7,n8,n9,n10,n11,n12,n13,
      2mon,kon
    (read statement for headings.)
       punch 8, n1,n2,n3,n4,n5,n6,
      1n7,n8,n9,n10,n11,n12,n13
    (punch statement for headings.)
       m=m+1
       if(mon)2,2,51
    51 if(m-4)12,52,52
    52 i=3
       punch 13, m2(i),m3(i),
      1m4(i),m5(i),m6(i),m7(i),
      2m8(i),m9(i),m10(i),m11(i),
      3m12(i),m13(i)
    (punch statement for main entry on tracing card, drawn from computer memory; class code is omitted.)

a header card is punched not only with the heading but also with the alphabetic class code of the book being cataloged. if there is a second card to the heading, it is punched with the subject code number. a third card, if there is one, contains the cutter number. in the program statements cited above, provision is made for retrieving from the computer memory any part or parts of the call number that are not read in with the heading. care is taken to assure that no part of the number that is not needed is retrieved. as headings are generally in capital letters, and the multiple upper-case shift character is used to produce them, it is necessary to precede the subject code number of a second card to a heading with the lower-case shift character. otherwise, instead of the subject code number's being typed, the upper-case characters for the digits of this number will be typed.
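the rule for completing the call number on a tracing card can be illustrated with a small python sketch. it is a restatement of the paragraph above, not a translation of the fortran: the function name and the tuple of part names are hypothetical.

```python
# The three parts of the call number, in the order the heading cards carry them.
CALL_NUMBER_PARTS = ("class code", "subject code", "cutter number")


def parts_to_retrieve(heading_cards: int) -> tuple:
    """Return the call-number parts that must come from computer memory.

    The first heading card carries the class code, a second card the
    subject code number, and a third the cutter number. Whatever the
    heading cards do not carry has to be retrieved from the stored
    main entry data.
    """
    if heading_cards < 1:
        raise ValueError("a heading needs at least one card")
    # Slicing past the end yields an empty tuple for three or more cards.
    return CALL_NUMBER_PARTS[heading_cards:]


if __name__ == "__main__":
    for cards in (1, 2, 3):
        print(cards, parts_to_retrieve(cards))
```

a one-card heading therefore needs both the subject code and the cutter number from memory, while a heading of three or more cards needs nothing further, which matches the care taken that no unneeded part of the number is retrieved.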
cards for the remainder of the body of the catalog card, except the last line, are punched by use of a do loop:

       m=4
    12 do 7 j=m,nb
    7 punch 8, m1(j),m2(j),m3(j),
      1m4(j),m5(j),m6(j),m7(j),
      2m8(j),m9(j),m10(j),m11(j),
      3m12(j),m13(j)
    (punch statement for remainder of tracing card through next-to-last body line.)

before the last body card can be punched, the value of n must be set. before this can be done, the value of i, subscript for the punch statement, must be tested, which is accomplished in a series of if statements. i is set to nb + 1; then if there are fewer than four cards in a heading:

    54 if(i-8)18,42,43
    43 if(i-10)44,45,46
    46 if(i-12)48,49,50
    18 n=7
    42 n=8
    44 n=9
    45 n=10
    48 n=11
    49 n=12

after each statement setting n to a specific value, a go to statement sends the program back to statement 23. if the number of cards in a heading is four or more, the value of n is set by the following statements, mat having been equated earlier to one more than the number of cards in the heading:

    41 if(mat-4)23,27,37
    37 if(mat-6)38,39,40
    27 n=n+1
    38 n=n+2
    39 n=n+3
    40 n=n+4
    (it is assumed that no heading will be longer than seven lines.)

again, after each statement setting n to a specific value, a go to statement sends the program back to statement 23.

the last card of the unit of manually punched cards contains -1 in the kon field. when this control number is read and tested, the program branches to a do loop which erases the bibliographic data stored in the computer's memory, then branches to the beginning of the program to statements which set n=0, i=1, nbc=1, and ncs=0. the computer is ready to read the first card of the next unit. figure 2 shows a printout of a main entry card.

exact cost figures for the production of catalog cards are not available, but a good estimate would be approximately eighteen cents per card.
this cost is high, perhaps, but when the cards issue from the 870 system, they are complete, including call numbers and tracings, items that are missing from the lc cards and that must be typed onto these cards. the saving is in time; the cards produced by the program described here can be ready for filing in the library within a week; delivery of lc cards is, on the average, six months after the order date. the overall cost is reduced somewhat by the fact that the same cards that are punched for the catalog cards are used for the accessions bulletin. they can also be used for bibliographic listings under selected headings. a listing of the complete catalog card program may be obtained from the author.

    tx
    415
    f29    fritzsche brothers inc.
             guide to flavoring ingredients as classified under the federal food, drug and cosmetic act. new york, 1966.
             84p.

             1. flavoring essences. i. title.

    fig. 2. printout of main entry card.

library bulletin

the data for the library bulletin program consists of the cards which were punched manually for the catalog card program. the information concerning each book which is to be included in the bulletin is the choice of the individual librarian. at the philip morris, incorporated, research center library none of the data after the publication date is included. care must be taken that there are at least five cards remaining for each book after unwanted cards have been discarded. the reason for this will become apparent later.

the headings, consisting of the first subject entries in the tracings of the books to be listed, need to be punched. the books are grouped in the bulletin under these headings. each of the headings is to be in upper-case letters, so the first column of each card must be punched with a dollar sign, the special character wired in the 870 control panel to produce multiple upper-case printout.
as with the catalog card program, certain controls are required for the bulletin program. a "1" must be punched in the field called mon in the last card of each book set. a "1" must be punched in a field called jon in each header card. a code punched into the cards to facilitate keeping them in the proper order is not necessary for the computer program, but it is certainly desirable. therefore, a sequence code consisting of eleven digits is in each card: the first five digits designate the subject heading, the next four the author, and the last two the card sequence for each book. judicious selection of the codes makes it possible to put the cards into proper or near proper order with a card sorter, an especially useful feature for preparing a listing of all the titles under a given heading acquired over a period of several months.

in the bulletin the titles are numbered sequentially beginning with "1", the numbering being controlled with a fixed point variable, no, which is set to one at the beginning of the computer program. the data cards are arranged in alphabetic order by author and are placed behind the appropriate header cards, which have also been arranged in alphabetic order. the first card to be read and punched is a header card:

       read 4, ka,kb,kc,kd,ke,kf,
      1kg,kh,ki,kj,kk,kl,km,jon
    4 format(13a5,11x,i1)
    (read statement for first header card.)
    15 punch 3, ka,kb,kc,kd,ke,
      1kf,kg,kh,ki,kj,kk,kl,km
    3 format(3h&&&,13a5)
    (punch statement for header cards; includes special characters to produce skipped lines.)

the first three data cards are read in a do loop. the three parts of the call number are stored in an array, then the main entry, in the third card, is punched, along with the sequence number and the first part of the call number, the alphabetic class code.
       do 6 i=1,3
    6 read 1, m(i),n(i),lb,lc,ld,
      1le,lf,lg,lh,li,lj,lk,ll,
      2lm,mon
    1 format(a5,a4,11a5,a1,12x,i1)
    (read statement for call number and main entry.)
       punch 12, no,lb,lc,ld,le,
      1lf,lg,lh,li,lj,lk,ll,lm,
      2m(1),n(1)
    12 format(1h&,1h@,i4,1h.,2x,
      112a5,1h%,a5,a4)
    (punch statement for sequence number, main entry, and class code; includes characters for skipped line, lower case, and typewriter tab control.)

the next two cards are read and new cards are punched in another do loop. the remainder of the call number, the subject code number and the cutter number, are punched into these two cards. if there are fewer than five cards to be processed in each book set, part of the call number will be lost. blank cards are added, if necessary, to bring the count to five.

    16 do 7 i=2,3
       read 17, ka,kb,kc,kd,ke,
      1kf,kg,kh,ki,kj,kk,kl,mon
    17 format(8x,11a5,a2,12x,i1)
    (read statement for fourth and fifth cards of set.)
       punch 5, ka,kb,kc,kd,ke,
      1kf,kg,kh,ki,kj,kk,kl,m(i),
      2n(i)
    5 format(4x,1h%,12a5,5x,
      11h%,a5,a4)
    7 continue
    (punch statement for second and third lines of bibliographic data and remainder of call number; includes typewriter tab controls.)

the mon field in the fifth card is tested in an if statement. if this field is zero, indicating that there are more cards in the set, the additional cards are read and new ones are punched in another pair of read and punch statements. if the mon field is "1", indicating that the processing of the card set is complete, one is added to the sequence number variable, no, and the next card is read. if this is a header card, as indicated by "1" in the field called jon, the program branches back to statement 15. if, on the other hand, it is the first card of another title under the same heading, the class code is stored in an array, and the second and third cards are read in a do loop.
the main entry, the content of the third card, is then punched, along with the class code and the sequence number.

    9 no=no+1
       read 4, ka,kb,kc,kd,ke,kf,
      1kg,kh,ki,kj,kk,kl,km,mon
    (read statement, for heading if jon=1, for main entry and class code of new card set if jon=0.)

7. ibm manual no. e20-8094: mechanized library procedures, 14.
8. ibm manual no. e20-0093: library catalog production - 1401 and 870.
9. buckland, lawrence f.: the recording of library of congress bibliographical data in machine form. a report for the council on library resources, inc. (washington: council on library resources, inc., november 1964) (rev. february 1965).
10. cariou, mavis: "a computerized method of preparing catalogue cards, using a simplified form of data input," proceedings, american documentation institute annual meeting, 4 (october 1967), 186-90.
11. maskiell, frank h.: "pdq fortran (an interpretive program for the fortran language)" (november 1963).

levels of machine readable records

recon working task force: henriette d. avram, chairman; richard de gennaro; josephine s. pulsifer; john c. rather; joseph a. rosenthal and allen b. veaner.

this study of the feasibility of determining levels or subsets of the established marc ii format concludes that only two levels are necessary and desirable for national purposes: 1) the full marc ii format for distribution purposes; and 2) a less complex subset to be used by libraries reporting holdings to the national union catalog.

introduction

in march 1969, the advisory committee to the recon working task force, after approving publication of the initial recon report (1), endorsed investigation of a number of questions raised in that report as well as consideration of certain issues not covered in the initial survey. the basic tasks to be undertaken have been described in another article in this issue (2).
with further support for recon from the council on library resources, inc., the working task force has met several times to explore some of these problems. this article reports the conclusions reached with respect to one task: the feasibility of determining a level or subset of the established marc ii format that would still allow a library using it to be part of a future national network.

levels of machine readable records / recon task force 123

definition of "level"

during the initial recon study the working task force, for discussion purposes, considered levels of encoding detail of machine readable catalog records in relation to the conditions under which conversion might occur. a level was distinguished by differences in 1) the bibliographic completeness of a record, and 2) the extent to which its contents were separately designated. with respect to the latter point, the recon report stated:

"a machine format for recording of bibliographic data and the identification of these data for machine manipulation is composed of a basic structure (physical representation), content designators (tags, delimiters, subfield codes), and contents (data elements in fixed and variable fields). although the basic structure should remain constant, the contents and their designation are subject to variation. for example, a name entry could be designated merely as a name instead of being distinguished as a personal name or corporate name. when a distinction is made, a personal name entry can be further refined as a single surname, multiple surname, or forename. likewise, if a personal name entry contains date of birth and/or death, relationship to the work (editor, compiler, etc.), or title, these data elements can be identified or can be treated as part of the name entry without any unique identification. thus individual data elements can be identified at various levels of completeness."
(3)

appendix f of the recon report tentatively defined three levels:

"level 1 involves the encoding of bibliographic items according to the practices followed at the library of congress for currently cataloged items, i.e., the marc ii format. a distinguishing feature of level 1 is the inclusion of certain content designators and data elements which, in some instances, can be specified only with the physical item in hand.

"level 2 supplies the same degree of detail as in level 1 insofar as it can be ascertained through an already supplied bibliographic record ... .

"level 3 would be distinguished by the fact that only part of the bibliographic data in the original catalog record would be transcribed. in addition, content designators might be restricted ... " (4).

at the outset of the present study, however, it was recognized that incomplete bibliographic description is not acceptable in records for national use. in addition, it seemed that the question of having a level below level 2 really arose from a desire to define a machine readable record with a lesser degree of content designation rather than one with less complete bibliographic data. it was decided, therefore, to concentrate the study effort on this task, and the original formulation of level 3 was discarded.

124 journal of library automation vol. 3/2 june, 1970

on further consideration, it was realized also that the distinguishing feature between levels 1 and 2 was not significant. omission of data elements that cannot be determined unless the book is in hand may simplify an individual record but does not simplify the content designators in the format because these elements are often present in other records. thus, as far as content designation is concerned, levels 1 and 2 (as originally defined) were in fact the same.
once this similarity became apparent, it was recognized that the specification of levels really depended on the functions of machine readable catalog records from the standpoint of national use.

functions and levels

on the basis of present knowledge, it seems that machine readable records will serve two primary functions for national use. the first involves the distribution of cataloging information in machine readable form for use by library networks, library systems, and individual libraries; the second involves the recording of bibliographic data in a national union catalog to reflect the holdings of libraries in the united states and canada. in this report, the first is called the distribution function; the second is called the national union catalog (nuc) function. each of these functions can be related to a distinct level of machine readable record.

the distribution function

the distribution function can best be satisfied by a detailed record in a communications format from which an individual library can extract the subset of data useful in its application. at the present stage of library automation, it is impossible to define rigorously all of the potential uses of machine readable catalog records. thus, there is no way to predict which data elements may not be needed or to rank them according to their value to a wide variety of users under different circumstances. to confirm the wide variation in treatment of the marc ii format, an analysis was made of the use of marc content designators by eight library systems and emerging networks.

table 1. use of marc content designators by 8 library systems or networks

                               number of items
    number of    fixed fields    tags    indicators    subfield codes
    libraries        (19)        (63)      (126)           (181)
        8             -           26         -                1
        7             -            6         -               88
        6             -            3         2               45
        5             1            5         7               15
        4             6            3         9                9
        3             7            2        92               11
        2             4            4        16                9
        1             1            7         -                3
      none            -            7         -                -

    note: only six libraries supplied information on fixed fields.
the data from this analysis were synthesized for presentation in two tables. table 1 shows the acceptance of content designators in terms of the absolute number of libraries using them. it should be read as shown by the following examples: 1) 26 of the 63 marc tags are used by all eight libraries; 2) 92 of the 126 indicators are used by only three libraries. table 2 shows the acceptance of content designators in relative terms. thus, if only three libraries were using a particular tag and all used the associated subfield codes, the acceptance of those subfield codes was calculated as 100 percent. in both tables 1 and 2, the columns on indicators and subfield codes include responses only from those libraries that were definitely using the tag with which a given indicator or subfield code was associated. the analysis excludes tags for which no immediate implementation is planned by the marc distribution service.

table 2. percentage of acceptance of marc content designators by 8 library systems or networks

                               number of items
    percent of   fixed fields    tags    indicators    subfield codes
    libraries        (19)        (63)      (126)           (181)
       100            -           26         -               10
      75-99           1            9         2              134
      50-74          13            8        16               32
      25-49           4            6       108                5
       1-24           1            7         -                -
         0            -            7         -                -

the major findings of this analysis may be summarized as follows:

1) of 19 fixed fields, 14 were used by at least half of the libraries and all were used by at least one library.
2) of 63 tags, 43 were used by at least half of the libraries and 26 were used by all of them. seven tags were not used by any of the libraries studied, but these tags cover items that will appear in machine records produced by the national library of medicine, the national agricultural library, and the british national bibliography.
3) of 126 indicators, only 18 were used by at least half of the libraries. the highest degree of acceptance was the use of the same two indicators by six libraries. on the other hand, each indicator was used by at least two libraries.
4) of 181 subfield codes, 176 were used by at least half of the libraries that were using the related tags. each subfield code was used by at least a quarter of the libraries that could express a relevant opinion.

the foregoing analysis confirmed the view that a nationally distributed record should be as rich in content designation as possible. failure to provide this detail would result in many libraries having to enrich the record to satisfy local needs, a process more costly than deleting items selectively. therefore, as of now, the present marc ii format constitutes the level required to satisfy the national distribution function.

the national union catalog function

as noted above, the nuc function relates to the use of machine readable records to build a national union catalog. at first thought, it might appear that this function overlaps the distribution function. as far as library of congress cataloging is concerned, this view is correct. it is valid also with respect to cooperative cataloging entries issued by the library as part of the card service. however, the two functions are quite distinct as far as regular reports to nuc are concerned. the essential difference between the two categories of catalog records is that those issued as lc cards have been completely checked against the library's authority files and edited for consistency, whereas only the main and added entries of nuc reports have been checked for compatibility. the impact of this difference can be judged from the fact that an attempt to distribute nuc reports as proof slips several years ago was abandoned because the response to this service did not justify its continuance.

distributing nuc reports in machine readable form would add another dimension to the problem of processing them, because, to be flexible enough for wide acceptance, nuc reports would have to be entirely compatible with those issued by the marc distribution service.
since compatibility would involve more detailed content designation than many libraries might put into their records for local use, libraries would have to be willing to provide this detail in nuc reports, or the level of nuc reports would have to be upgraded centrally. as the certification of the bibliographic data and the content designators would entail a major workload for the library of congress, it does not seem practical to pursue this goal at present. it is possible, however, to define a subset of content designators to cover the eventuality that outside libraries may be able to report their holdings to nuc in machine readable form.

a marc subset can be determined for the nuc function because this function involves processing records in a multiplicity of places to be used centrally for specifically definable purposes. the distribution function, on the other hand, involves the preparation of records at a central source to be used for a wide variety of purposes in a multiplicity of places. the difference is vital when it comes to stating the requirements for the two types of records.

the specifications of a machine readable record to fulfill the nuc function depend on the nature and functions of the national union catalog itself. the content designators for such a record will be defined in a separate investigation now being conducted by the working task force. the present study was considered to be completed once the feasibility of defining a level of machine readable record for that purpose was established.

conclusion

the findings of this study of the feasibility of defining levels of machine readable bibliographic records are as follows:

1) the level of a record must be adequate for the purposes it will serve.
2) in terms of national use, a machine readable record may function as a means of distributing cataloging information and as a means of reporting holdings to a national union catalog.
3) to satisfy the needs of diverse installations and applications, records for general distribution should be in the full marc ii format.
4) records that satisfy the nuc function are not necessarily identical with those that satisfy the distribution function.
5) it is feasible to define the characteristics of a machine readable nuc report at a lower level than the full marc ii format.

references

1. recon working task force: conversion of retrospective catalog records to machine readable form (washington, d. c.: library of congress, 1969).
2. avram, henriette d. "the recon pilot project. a progress report." journal of library automation, 3 (june 1970), 10-22.
3. recon, op. cit., p. 43.
4. ibid., p. 164.

video technologies: neologism or library trend?

converging factors are shaping a new environment for libraries, and, as a consequence, the present is full of opportunity. technical and social changes provide libraries with a host of alternatives for service, growth, and innovation. in this new environment libraries will, undoubtedly, continue to promote the availability of books and other materials, continue to increase their efforts to furnish patrons with information, and continue to broaden the range of activities offered so that patrons can receive personalized service. patron information seeking and searching methods we have known, however, will give way to new methods based on computers and telecommunications.

a host of new technologies is growing out of the evolutionary pathway marked by telegraph, telephone, radio, and television. broadband communications (that's the cable that today brings you predominantly entertainment television), satellite, videotex, teletext, videodisc, videotape, large-screen television, and computer displays (some are as large as the side of a building) are available either today or within the next year or two. each of these technologies is a new medium with its own inherent capabilities and limitations.
each has the promise of providing faster and more cost-efficient information services than some present forms of printed communication. and each requires a different approach and different knowledge for effective and efficient use, and integration into library operations.

in a growing number of locations, cable communications for delivery of library services have already been made available virtually free of charge. other technologies, such as videotex, will grow significantly. estimates suggest that in five years more than 8 million american homes will be able to obtain extensive, automated information services from commercial, private, and government sources. probably a larger number will receive limited information services over the broadcast airwaves via teletext. dramatic new services will combine television, computer, telephone, satellite, and cable into home entertainment and information centers ... potential extensions of libraries.

76 journal of library automation vol. 14/2 june 1981

some sources suggest more than 50 percent of the american gross national product results from the collection, processing, and dissemination of information, much of which involves new technologies. inevitably, this technological trend also will occur in libraries and, in this light, the relatively low level of involvement of computers in providing patron services today is notable. by their natural inertia, individuals and organizations in the library community will be opposed to the acceptance of cable services, videotex, online catalogs, information retrieval, and other video technologies simply because they represent change. but these technologies are technically feasible and are becoming an economic reality. the point of demarcation between computer and library may well become a terminal in a patron's home.
whether or not the service provided is a library's or a commercial competitor's depends to a great extent on how libraries define their role in this environment, and on the degree of library participation in the evolutionary process that's now taking place.

something besides inertia opposes the acceptance of new technologies, however. to some degree, lack of awareness of technological trends is a factor, but more significant is a lack of clear understanding (both by the proponents of the technology and by librarians) of how new technology can be integrated into the library setting. understanding the value a technology offers, for increased service or decreased cost, for example, should be paramount, but frequently the technology seems to be offered as an end in itself.

internal and external factors must be considered to guide the application of technology toward meeting library and patron needs. financial concerns, social forces, and the consumer/patron appear to be major factors leading libraries toward a future deeply involved in video technologies. whether the outcome will result from external pressure or internal plan remains to be seen. it's incumbent upon libraries to be informed and active participants in directing their own future in this kind of an environment.

what are the implications of this technical evolution and internal/external factors? one thing is sure: it's a massive industry growing at a very rapid rate, and it is going to grow even faster. libraries have the opportunity to grow with this trend through application of the technologies to existing technical services, increased availability of patron services, and development of innovative services. if there is a common thread that can identify those libraries which will grow and prosper, that thread is flexibility: the capacity of library management and staff to adapt this library to the new environment, and integrate technology into their library.
readers and contributors of jola are the people that can either have an integral part in defining the future direction of libraries, or passively watch patron needs outstrip services. library schools and people involved in library-related research must play a key role in assessing the value of video technologies and defining how to integrate them into the business and service of libraries. what is going to preserve and enhance the role of libraries in the 1980s will not only be flexibility but another very critical element: foresight, dedicated to patron needs. many libraries have met this technological revolution head-on and are intimately involved in testing, developing, and providing innovative library services. in this and forthcoming issues, we hope to bring a perspective on these changes that is valuable and cogent to the library community. readers of jola and practitioners in all areas of video technology are called upon to describe their efforts and share their results drawn from this rapidly changing field through contributions to this journal.
thomas d. harnish
editor's notes
jola will continue to be interesting and useful to its readers to the extent that its readers are willing to expend the efforts to also be its writers. the authors in this issue are all as busy as you and i. they have made time in their already full schedules to write down ideas and information they hope will be useful and provocative. they and we of the jola staff hope you are pleased with the results. so what's new by you? how have your patrons reacted to your new online catalog? what do the costs of your acquisitions system look like? how about that idea you had about a new way to do whatever? do you think the fuss over authority control is worth it? if you have ideas, perceptions, or stories to tell that you feel are of interest to your fellow readers, please write and let them know.
292 journal of library automation vol. 14/4 december 1981
we need a format which is consistent, easily maintainable without being uncontrollably disruptive, and responsive to changing needs which are likely to accelerate as we gain experience with online systems. rather than recommending or supporting the implementation of specific changes to the marc format, it is essential that the library community begin to establish the framework and benchmarks necessary to maintain the marc formats over the long term as well as to guide short-term considerations. arl and others can play an important role in undertaking and encouraging a broader approach to this pressing problem. such an approach will not only reduce the risk of decision making, but will also assist in the development of the cost/benefit data needed to enhance consideration of format changes.
references
1. d. kaye capen, simplification of the marc format: feasibility, benefits, disadvantages, consequences (washington, d.c.: association of research libraries, 1981), 22p.
2. "principles of marc format content designation," draft (washington, d.c.: library of congress, 1981), 66p.
3. ichiko t. morita and d. kaye capen, "a cost analysis of the ohio college library center on-line shared cataloging system in the ohio state university libraries," library resources & technical services 21:286-302 (summer 1977).
4. council on library resources bibliographic interchange committee, bibliographic interchange report, no.1 (washington, d.c.: the council, 1981).
comparing fiche and film: a test of speed
terence crowley: division of library science, san jose state university, san jose, california.
introduction
for more than a decade librarians have been responding to budget pressures by altering the format of their library catalogs from labor-intensive card formats to computer-produced book and microformats.
studies at bath,1 toronto,2 texas,3 eugene,4 los angeles,5 and berkeley6 have compared the forms of catalogs in a variety of ways ranging from broad-scale user surveys to circumscribed estimates of the speed of searching and the incidence of queuing. the american library association published a state-of-the-art report7 as well as a guide to commercial computer-output microfilm (com) catalogs pragmatically subtitled how to choose; when to buy.8 in general, com catalogs are shown to be more economical and faster to produce and to keep current, to require less space, and to be suitable for distribution to multiple locations. primary disadvantages cited are hardware malfunctions, increased need for patron instruction, user resistance (particularly due to eyestrain), and some machine queuing. the most common types of library com catalogs today are motorized reel microfilm and microfiche, each with advantages and disadvantages. microfilm offers file-sequence integrity and thus is less subject to user abuse, i.e., theft, misfiling, and damage; in motorized readers with "captive" reels it is said to be easier to use. disadvantages include substantially greater initial cost for motorized readers; limits on the capacity of captive reels, necessitating multiple units for large files; inexact indexing in the most widespread commercial reader; and eyestrain resulting from high-speed film movement. microfiche offers a more nearly random retrieval, much less expensive and more versatile readers, and unlimited file size. conversely, the file integrity of fiche is lower and the need for patron assistance in use of machines is said to be greater than for self-contained motorized film readers.
the problem
one of the important considerations not fully researched is that of speed of searching. the toronto study included a self-timed "look-up" test of thirty-two items "not in alphabetical order" given to thirty-six volunteers, of whom thirty finished the test.
the researchers found the results "inconclusive" but noted that seven of the ten librarians found film searching the fastest method. "average" time reported for searching in card catalogs was 37.3 minutes, in film catalogs 41.6 minutes, and for fiche catalogs 41.7 minutes. a reanalysis of the original data shows a stronger advantage of fiche over film (45.3 minutes versus 51.7 minutes) when all times except duplicates are totaled, but that difference is almost entirely due to one extreme score (203 minutes).9 the berkeley report of fiche/film comparability addressed the issue of retrieval speed directly. by constructing a series of look-up tests composed of items selected from a large public library com catalog, the researchers were able to compare microfiche and microfilm formats while holding other variables constant. in one test involving thirty-six paid users and 252 trials, microfilm was determined to be faster by 7.6 percent (±2.5 percent). in a second test, forty volunteer users were timed in 240 trials and the advantage of film over fiche dropped to 5.7 percent (±2.5 percent).10 although rigorous in design and execution, the berkeley experimenters used in their look-up tests questions that naive users might misinterpret, e.g., "you want a book about paul robeson, written by eloise greenfield. find the listing and give the call number"; and some which could be confusing, e.g., "does the library have any joke books? if so, give the call number for one."11 such questions potentially pose an element of uncertainty for subjects: should i look under robeson or greenfield? under joke books or humor? in addition, questions were selected by "browsing the file for target items," a procedure which could result in an uneven distribution of items which in turn could bias the results. since the number of observations is relatively large the reliability of the results is not questioned; the validity may be.
the study reported here was executed by a class in research methods taught by the author during the same time as the berkeley study; we used the same two formats of the same catalog, and attempted to answer the same question: using the best available equipment, which microformat is faster to search?
assumptions
we assumed (1) the two forms of the catalog were identical; (2) the quality of the image was not significantly different; (3) a search for items selected randomly from the file and arranged randomly was a fair test of retrieval speed; and (4) graduate students in library science were reasonably representative users for a test of speed.
methodology
we used a dictionary catalog from a public library system with 436,791 entries, of which 5,631 were author, 111,158 were title or added entries, and 320,002 were subject entries. using a random number table, we selected from the catalog 16 entries which were reproduced and randomly arranged to form the test. of the 16 items, 3 were author entries, 8 were title or added entries, 5 were subject entries. the sequence, which presumably would affect the speed of retrieval more in the film format because of the necessity to scroll from one letter to another, was a c w n s k c b w m h l p p a l. the test was then administered to thirty-seven volunteer graduate students randomly assigned to a micro-design 4020 fiche reader or an information design rom 3 film reader. the two readers were located in the same room. the 86 fiche were held and displayed by a ring king binder. all times were measured by a stopwatch. questionnaires administered before and after the test established that the two groups did not differ significantly in age or in self-perceived mechanical ability. of the film users, 64 percent used micro-formats "occasionally" or "frequently" compared with 35 percent of the fiche users.
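the selection and assignment steps in the methodology can be sketched in a few lines. this is only an illustration under stated assumptions: the study drew its items from a printed random number table, whereas the sketch substitutes a seeded pseudo-random generator; the function names, the volunteer labels, and the coin-flip assignment rule are hypothetical and do not come from the original study.

```python
import random

# Entry counts as reported for the dictionary catalog in the methodology.
CATALOG_COUNTS = {"author": 5_631, "title_or_added": 111_158, "subject": 320_002}
TOTAL_ENTRIES = 436_791


def draw_test_items(total_entries, n_items, rng):
    """Pick n_items distinct entry positions uniformly at random.

    The positions are then shuffled so the test order is random as well,
    mirroring "selected randomly ... and arranged randomly".
    """
    positions = rng.sample(range(1, total_entries + 1), n_items)
    rng.shuffle(positions)
    return positions


def assign_readers(volunteers, rng):
    """Assign each volunteer to a reader by an independent coin flip.

    Assumption: the paper does not say how the random assignment was done.
    """
    return {name: rng.choice(["fiche", "film"]) for name in volunteers}


rng = random.Random(1981)  # fixed seed so the sketch is repeatable
items = draw_test_items(TOTAL_ENTRIES, 16, rng)
groups = assign_readers([f"volunteer_{i:02d}" for i in range(1, 38)], rng)
```

the three reported entry counts add up to the stated catalog total, which is a quick consistency check on the figures.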
of the total group, 73 percent wore glasses and 62 percent reported prior physical problems with both film and fiche readers used before the test.
results
table 1 shows that the mean speed of the film users was 16.7 minutes, significantly faster than the 25.3 minutes recorded by the fiche users; the range of speed for the film users was less than 1/3 that of the fiche users. even the slowest film user was faster than 70 percent of the fiche users. however, the fastest fiche user was faster than 70 percent of the film users. the range of fiche scores is more than 3 times that of the film scores (figure 1). the standard statistical test shows the difference of means to be significant at the .01 level.
table 1. speed of retrieval (in minutes)
format                 low     high     mean    standard deviation
microfilm (n = 17)     12.3    19.45    16.7    2.34
microfiche (n = 20)    14.6    18.0     25.3    7.47
t = 4.8, p < .01
[fig. 1. distributions of test scores for film and fiche; time in minutes and tenths.]
discussion
searching motorized microfilm appears to be significantly faster than searching microfiche, on the average, for relatively inexperienced users. even the slowest time on the film was faster than most fiche times. the wide range of fiche scores suggests the possibility that frequent users could improve their searching times; very experienced users may be able to search fiche faster than film.* because of the relatively small numbers of subjects and observations involved, the results should be interpreted with caution. although the advantage of film over fiche in this study is greater than that shown in the berkeley report, differences in design and analysis must be taken into account.
*the author, an experienced fiche user, was timed at 11.6 minutes; this was the fastest time recorded by either fiche or film users.
acknowledgments
the author wishes to acknowledge the members of his research methods class, especially david fishbaugh and carol manoukian, for their assistance.
references
1. bath university comparative catalog study: final report. papers no. 110. (bath: the library, 1974-75).
2. valentine de bruin, "sometimes dirty things are seen on the screen," journal of academic librarianship 3:256-66 (nov. 1977).
3. carolyn m. cox and bonnie juergens, microform catalogs: a viable alternative for texas libraries (dallas: amigos bibliographical council, 1977). eric document no. ed 149 739.
4. james r. dwyer, "public response to an academic library microcatalog," journal of academic librarianship 5:132-41 (july 1979).
5. brett butler, martha w. west, and brian aveney, com catalog: use and evaluation: report of a field study of the los angeles county public library system (rev. ed.; los altos: information access corporation, 1979), 71p.
6. theodora hodges and uri bloch, "fiche or film for com catalogs: two use tests" in library effectiveness: a state of the art (chicago: american library assn., 1980), p.122-30.
7. william saffady, computer-output microfilm: its library applications (chicago: american library assn., 1978), 100p.
8. commercial com catalogs: how to choose, when to buy. catalog use committee, reference and adult services division, american library association. (chicago: american library assn., 1978), 47p.
9. debruin, "dirty things," p.266.
10. hodges, "fiche or film," p.128.
11. hodges to crowley, september 1979.
electronic order transmission
james k. long: oclc, inc., dublin, ohio.
in this era of decreasing library allocation from the public sector, libraries are realizing increased benefits from the automation of the acquisitions process. the price of hardware is decreasing and the capabilities of the available offerings increasing. we have evolved from the small local library collection of data and printing of orders, through the book vendor offerings of an online connection to a single vendor's inventory.
these systems still required local mailing for all other vendor orders. in 1981 we have seen a greater emphasis on electronic ordering. memorial university in canada has been experimenting in sending orders directly to john coutts library services ltd. in print format using the utlas catss system. wayne state university is planning to use the ringgold nonesuch acquisitions system to transmit orders electronically to book house using the bisac tape format. blackwell/north america and the academic book center have experimentally used wln to receive test orders in a print file format. these all save time in getting the orders to the respective vendor. if sufficient volume can be generated there may be a savings in transmission costs over the u.s. mail. however, in order to realize maximum economies in this electronic process, four activities need to occur.
1. acquisition orders must be collected from multiple libraries at a central site to generate volume for dispersal to multiple sites.
2. standard formats need to be accepted and enforced for order transmission.
3. the isbn must become a universally accepted part of the library acquisitions order.
4. the library must receive order status information from the vendor. once again, this should occur via a standard data format.
at oclc there were 113 libraries, as of november 1981, that could send printed orders from a central site to over 15,000 addresses of their choice. by july 1982 the projection is for over 200 libraries to be using the system. the library's order is batched by the vendor address that the library has specified. this process offers savings by sharing mail and printing costs between participants. with the proposed installation of direct transmission in 1982, this central collection will afford shared transmission costs. this is the type of centralized collection that maximizes the benefits of electronic ordering.
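the central-site batching described above, in which orders from many libraries are grouped by the vendor address each library specified, can be illustrated with a small sketch. it is not a description of the oclc system: the record fields, class name, and function name are all hypothetical, chosen only to show the grouping step.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    """A hypothetical acquisitions order record (fields are illustrative)."""
    library: str
    vendor_address: str
    isbn: str
    quantity: int = 1


def batch_by_vendor(orders):
    """Group orders from all libraries by the vendor address they name.

    Each batch can then be printed and mailed (or transmitted) once,
    so participating libraries share the per-vendor cost.
    """
    batches = defaultdict(list)
    for order in orders:
        batches[order.vendor_address].append(order)
    return dict(batches)


orders = [
    Order("library a", "vendor one, 1 main st.", "0-00-000000-0"),
    Order("library b", "vendor one, 1 main st.", "0-00-000001-9", quantity=2),
    Order("library a", "vendor two, 9 elm st.", "0-00-000002-7"),
]
batches = batch_by_vendor(orders)
```

here two libraries ordering from the same address end up in one batch, which is where the shared mailing or transmission savings would arise.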
within the book industry, standards for electronic data transmission for book ordering have been developed. in may of 1981 the book industry systems advisory committee (bisac), a subcommittee of the book industry study group (bisg), ap
articles
likes, comments, views: a content analysis of academic library instagram posts
jylisa doney, olivia wikle, and jessica martinez
information technology and libraries | september 2020 https://doi.org/10.6017/ital.v39i3.12211
jylisa doney (jylisadoney@uidaho.edu) is social sciences librarian, university of idaho. olivia wikle (omwikle@uidaho.edu) is digital initiatives librarian, university of idaho. jessica martinez (jessicamartinez@uidaho.edu) is science librarian, university of idaho. © 2020.
abstract
this article presents a content analysis of academic library instagram accounts at eleven land-grant universities. previous research has examined personal, corporate, and university use of instagram, but fewer studies have used this methodology to examine how academic libraries share content on this platform and the engagement generated by different categories of posts. findings indicate that showcasing posts (highlighting library or campus resources) accounted for more than 50 percent of posts shared, while a much smaller percentage of posts reflected humanizing content (emphasizing warmth or humor) or crowdsourcing content (encouraging user feedback). crowdsourcing posts generated the most likes on average, followed closely by orienting posts (situating the library within the campus community), while a larger proportion of crowdsourcing posts, compared to other post categories, included comments. the results of this study indicate that libraries should seek to create instagram posts that include various types of content while also ensuring that the content shared reflects their unique campus contexts.
by sharing a framework for analyzing library instagram content, this article will provide libraries with the tools they need to more effectively identify the types of content their users respond to and enjoy as well as make their social media marketing on instagram more impactful. introduction library use of social media has steadily increased over time; in 2013, 86 percent of libraries reported using social media to connect with their patron communities.1 the ways in which libraries use social media tend to vary, but common themes include marketing services, content, and spaces to patrons, as well as creating a sense of community.2 even with this wealth of research, fewer studies have examined how libraries use instagram, and those that do often utilize a formal or informal case study methodology.3 this research seeks to fill that gap by examining the types of content shared most frequently by a subset of academic library instagram accounts. although this research focused on academic libraries, its methods and findings could be leveraged by educational institutions and non-profits in their own investigations of instagram usage and impact. literature review since its inception in 2010, instagram’s number of account holders has been steadily increasing. 
by 2019, more than one billion user accounts were active each month, making it the third most popular social media network in the world, and the pew research center has reported that instagram is the second most used social media platform among people ages 18-29 in the united states, after facebook.4 instagram has estimated that 90 percent of user accounts follow at least one business account.5 previous research has also shown that individuals who use instagram to follow specific brands have the highest rates of engagement with, and commitment to, those brands when compared to users of other social media platforms.6 though businesses are fundamentally different in the products or services they are trying to market, academic libraries share a desire to provide information to, and engage with, their followers. as such, in the past decade, libraries have begun to adopt instagram as a way to market their libraries and interact with patrons.7 however, methods and parameters for libraries’ use of instagram vary across types of libraries and even within specific library types.8 research has demonstrated that academic libraries’ use of social media, including instagram, is often for the purpose of increasing the sense of community among librarians and patrons by marketing the library’s services and encouraging student feedback and interaction.9 similarly, harrison et al.
discovered that academic library social media posts reflected three main themes: “community connections, inviting environment, and provision of content.”10 chatten and roughley have also reported that libraries’ use of social media ranges from providing customer service to promoting the library and building a community of users.11 indeed, when comparing modern social networking systems, such as instagram, to older platforms, such as myspace, fernandez posited that today’s popular social media sites encourage networking and are especially suited to creating community.12 ideally, community engagement in the virtual social media environment would encourage more patrons to enter the library and thus engage in more face-to-face encounters.13 libraries’ methods for measuring the success of their social media engagement are as varied as the ways in which they use social media. assessment of libraries’ social media efficacy is tricky, and highly variable from institution to institution. hastings has cautioned that librarians should recognize that patrons both actively and passively interact with social media content.14 for this reason, while a large number of comments or likes may be identified as positive markers for active engagement, passive forms of engagement, such as the number of times a post appeared in users’ instagram feeds, may also be relevant.15 therefore, when librarians measure the success of an instagram post by examining only the number of likes and comments, they should be aware that they are measuring a very specific type of engagement: one which, on its own, may not determine a post’s full reach or effectiveness. other ways to measure engagement include monitoring how the number of people subscribed to an account changes over time, evaluating reach and impressions,16 or analyzing the content of comments (a type of qualitative measure that may indicate the type of community developing around the library’s social media). 
despite, or perhaps because of, the general excitement surrounding the possibilities that libraries’ engagement with social media can produce, very little has been written about how different types of libraries (such as academic libraries, law libraries, public libraries, etc.), or libraries in general, use these platforms.17 additionally, many librarians may lack expertise in marketing, including those who are managing social media accounts.18 as social media culture continues to evolve, librarians should move toward a more targeted and pragmatic approach to their instagram practices. this refinement in social media practices may enable libraries to develop more structure, so that they may create and share the type of content that would achieve their desired result at a given time. however, in order to develop this kind of measured approach, it is necessary for researchers to first analyze libraries’ current instagram practices to determine how posts are being used and the outcomes they generate. one effective method of analyzing instagram content centers on coding and classifying images. while many such schemas have been developed for analyzing images posted by instagram users and businesses, transferring these schemas to academic contexts has been difficult.19 to address this gap, stuart et al. adapted a schema that had been used to examine how “news media [and] non-profits,” as well as businesses, used instagram.20 this new schema allowed stuart et al.
to classify instagram posts produced by academic institutions in the uk and measure the effect of these universities’ attempts to engage with students via instagram.21 stuart et al.’s schema, which classified instagram images into six categories (orienting, humanizing, interacting, placemaking, showcasing, and crowdsourcing), was the basis for the present study.22 methods research questions the impetus for this study was to learn more about how academic libraries use instagram to connect with their campus communities and promote their services and events. the authors of the present study adapted the research questions posed by stuart et al. to reflect academic library contexts:23 • rq1: which type of post category is used most frequently by libraries on instagram? • rq2: is the number of likes or the existence of comments related to the post category? identifying a sample population this study investigated a small subset of academic institutions: the university of idaho’s sixteen peer institutions. these peers have similar “student profiles, enrollment characteristics, research expenditures, [or] academic disciplines and degrees”; each is designated as a land-grant institution; and the university of idaho considers three to be “aspirational peers.”24 after selecting this population, the authors investigated the library websites of each of the sixteen peer institutions to determine whether or not they had a library-specific instagram account. when a link was not available on the library websites, the authors conducted a search within instagram as well as a general google search in an attempt to identify these instagram accounts. of the university of idaho’s sixteen peer institutions, eleven had active, library-specific instagram accounts. data collection the authors undertook manual data collection between november and december 2018 for these eleven library instagram accounts. 
initial information about each instagram account was gathered prior to the study on october 23, 2018: the date of the first post, the total number of posts shared by the account, the total number of followers, and the total number of accounts followed. for each account, the authors identified posts shared from january 1, 2018, to june 30, 2018. the “print to pdf” function available in the chrome browser was used to preserve a record of the content, in case the accounts were later discontinued while research was underway. if a post included more than one image, only the first image was captured in the pdf and analyzed. to organize the 377 instagram posts shared within this timeframe, the authors assigned each institution a unique, five-digit identifier; file names included this identifier as well as the date of the post (e.g., 00004_igpost_20180423). this file naming convention ensured that posts were separated based on institution and that future studies could use the same file naming convention, even if the sample size increased significantly. the authors added the file names of all 377 instagram posts to a shared google sheet, and for each post they reported the kind of post (photo or video), the number of likes, and whether comments existed.
research data analysis
content analysis
this project adapted the coding schema stuart et al. employed to investigate the ways in which uk universities used instagram.25 expanding on research by mcnely, stuart et al. employed six instagram post categories: orienting, humanizing, interacting, placemaking, showcasing, and crowdsourcing.26 for the purposes of the present study, the authors used the same category names when coding library instagram posts. however, they updated and adapted the descriptions of each category over the course of two rounds of coding to better reflect academic library contexts (see table 1).
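the file naming convention described in the data collection section (a five-digit institution identifier, the literal `igpost`, and the post date, as in 00004_igpost_20180423) lends itself to a small helper. the sketch below is one possible way to build and parse such names; the function names are hypothetical, and the authors do not describe any tooling of this kind.

```python
import re
from datetime import date

# Pattern for names such as 00004_igpost_20180423.
_NAME_PATTERN = re.compile(r"^(\d{5})_igpost_(\d{8})$")


def post_file_name(institution_id: int, post_date: date) -> str:
    """Build a file name from a numeric institution id and the post date."""
    return f"{institution_id:05d}_igpost_{post_date:%Y%m%d}"


def parse_post_file_name(name: str) -> tuple[int, date]:
    """Recover the institution id and post date from a file name."""
    match = _NAME_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a recognized post file name: {name!r}")
    ident, stamp = match.groups()
    return int(ident), date(int(stamp[:4]), int(stamp[4:6]), int(stamp[6:]))


name = post_file_name(4, date(2018, 4, 23))
institution, posted = parse_post_file_name(name)
```

because the identifier is zero-padded and the date is written year first, sorting the names alphabetically groups posts by institution and then chronologically. the sketch does not address two posts shared by the same account on the same day, which the article does not discuss.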
within this coding schema, the authors elected to apply only a single category name (i.e., a code) to each library instagram post.
interrater reliability
during the first round of coding, the authors selected two or three institutions every month, independently coded the posts based on the initial adapted schema, met to discuss discrepancies, and identified the final code based on consensus.27 however, during these discussions, it became evident that there was substantial disagreement concerning how specific categories were interpreted. to examine the impact of this disagreement, the authors calculated fleiss’ kappa, which can be used to assess interrater reliability when two or more coders categorically evaluate data.28 although this project’s fleiss’ kappa (0.683554901) was relatively close to a score of 1.0, demonstrating moderate agreement between each of the three coders, the authors recognized that additional fine-tuning of the adapted coding schema would allow for a more accurate representation of the types of content shared by academic libraries. after updating the schema (table 1), a small sample of collected instagram posts (20 percent, or 76 posts) was randomly selected for independent recoding by each of the authors. again, after coding this random sample individually, the authors met to seek consensus. anecdotal feedback from the coders, as well as an increase in the project’s fleiss’ kappa (0.795494117), demonstrated that the updated coding schema was more robust and representative. based on this evidence, the authors randomly distributed the remaining 301 posts amongst themselves; each post was coded by one author.
table 1. coding schema for library instagram posts [adapted from: emma stuart, david stuart, and mike thelwall, “an investigation of the online presence of uk universities on instagram,” online information review 41, no. 5 (2017): 588, https://doi.org/10.1108/oir-02-2016-0057.]
category: description (the example column of the original table held sample images from the university of idaho library’s instagram account)
crowdsourcing: posts that were created with the intention of generating feedback within the platform. if the content of the post itself fits within a different classification category, but the image is accompanied by text that explicitly asks for viewer feedback, then the post should be classified as crowdsourcing. includes requests for followers to like, comment on, or tag others in a particular post.
humanizing: posts that aim to emphasize human character or elements of warmth, humor, or amusement. this includes historic/archival photos used to convey these sentiments. this code is only used if both the text and the photo or video can be categorized as humanizing because many library posts contain a “humanizing” element.
interacting: posts with candid photographs or videos at library and library-associated events. includes events within or outside the library.
orienting: posts that situate the library within its larger community, especially regarding locations, artifacts, or identities. text often includes geographic information.
placemaking: posts that capture the atmosphere of the library through its physical space and attributes. includes permanent murals, statues, etc.
showcasing: posts that highlight library or campus resources, services, or future events. can include current or on-going events if people are not the focus of the image (e.g., exhibit, highlight of collection, etc.). these posts can also present information about library operations, such as hours and fundraising. posts can also entice their audience to do something, outside of instagram, such as visit a specific website.
results
general data about the library instagram accounts
as of october 23, 2018 (the date this initial information was gathered), the eleven academic library instagram accounts had shared a combined 3,124 posts. most libraries created their instagram accounts and started posting between 2013 and 2016, but one library shared a post in 2012 and one created their account in april 2018. since the date of their first post, each account had shared 284 posts on average, while the actual number of posts shared across accounts ranged from 62 to 520. the number of followers and accounts followed across these eleven accounts ranged from 115 to 1,390 and 65 to 2,717, respectively. between january 1, 2018, and june 30, 2018, these eleven library instagram accounts shared a total of 377 posts. the number of posts shared by each account during this time period ranged from four to 57, with an average of 34 posts.
rq1: which type of post category is used most frequently by libraries on instagram?
of the 377 posts analyzed, 359 included photos and 18 included videos. more than 50 percent of posts shared were coded as showcasing, with humanizing (18 percent) and crowdsourcing (9.8 percent) being the next most common categories (see table 2), although data demonstrated that individual libraries differed in their use of specific post categories (see table 3). when examining frequency based on category of post, the authors identified slight differences between video and photo posts. as with photos, the majority of videos (55.6 percent) were still coded as showcasing; however, the second most common post category for videos was interacting (16.7 percent).
table 2.
number and percentage of posts by category for posts with photos or videos

| category | number of posts | percentage of posts |
|---|---|---|
| crowdsourcing | 38 | 10.1% |
| humanizing | 68 | 18.0% |
| interacting | 16 | 4.2% |
| orienting | 28 | 7.4% |
| placemaking | 33 | 8.8% |
| showcasing | 194 | 51.5% |
| total | 377 | 100% |

table 3. percentage of posts by category and library for posts with photos or videos

| library | crowdsourcing | humanizing | interacting | orienting | placemaking | showcasing |
|---|---|---|---|---|---|---|
| lib 1 | 7.7% | 15.4% | 0% | 23.1% | 30.8% | 23.1% |
| lib 2 | 4.2% | 50.0% | 0% | 4.2% | 0% | 41.7% |
| lib 3 | 56.1% | 10.5% | 1.8% | 3.5% | 7.0% | 21.1% |
| lib 4 | 0% | 4.1% | 4.1% | 4.1% | 2.0% | 85.7% |
| lib 5 | 0% | 24.4% | 2.2% | 20.0% | 26.7% | 26.7% |
| lib 6 | 7.5% | 18.9% | 3.8% | 11.3% | 11.3% | 47.2% |
| lib 7 | 0% | 20.0% | 0% | 0% | 10.0% | 70.0% |
| lib 8 | 0% | 21.6% | 9.8% | 5.9% | 0% | 62.7% |
| lib 9 | 0% | 25.0% | 25.0% | 0% | 0% | 50.0% |
| lib 10 | 0% | 16.1% | 6.5% | 0% | 9.7% | 67.7% |
| lib 11 | 0% | 15.0% | 5.0% | 5.0% | 5.0% | 70.0% |

rq2: is the number of likes or the existence of comments related to the post category?

number of likes by category

the results of the coding process also indicated that the number of likes differed based on the category of post. when examining photo posts, the authors noted that every post received at least five likes, with most posts receiving between 20 and 39 likes (see table 4). crowdsourcing photo posts generated the highest average number of likes across all categories, followed by orienting and placemaking posts (see table 5). however, it is important to recognize that crowdsourcing posts often asked visitors to participate in a post by “liking” it, often with the chance to win a library-sponsored contest, which may partially explain the higher average number of likes.

table 4.
number of posts by category and range of likes for posts with photos (does not include posts with videos)

| category \ range of likes | 5-19 | 20-39 | 40-59 | 60-79 | 80-99 | 100-119 | 120-140 |
|---|---|---|---|---|---|---|---|
| crowdsourcing | 0 | 11 | 16 | 6 | 1 | 1 | 1 |
| humanizing | 16 | 26 | 10 | 9 | 5 | 0 | 1 |
| interacting | 5 | 5 | 3 | 0 | 0 | 0 | 0 |
| orienting | 2 | 7 | 9 | 8 | 0 | 1 | 0 |
| placemaking | 3 | 10 | 12 | 3 | 2 | 1 | 1 |
| showcasing | 67 | 83 | 27 | 5 | 1 | 0 | 1 |
| total | 93 | 142 | 77 | 31 | 9 | 3 | 4 |

table 5. average number of likes by category for posts with photos (does not include posts with videos)

| category | average number of likes | number of posts |
|---|---|---|
| crowdsourcing | 53.6 | 36 |
| humanizing | 39.9 | 67 |
| interacting | 27.8 | 13 |
| orienting | 50.0 | 27 |
| placemaking | 46.9 | 32 |
| showcasing | 27.6 | 184 |

existence of comments by category

the authors also examined the existence of comments, another metric for engagement with instagram posts. data demonstrated that 78.9 percent of crowdsourcing posts included comments, while a much lower percentage of placemaking (30.3 percent), orienting (28.6 percent), and humanizing (26.5 percent) posts generated this type of engagement (see table 6). as with the data on the number of “likes,” many crowdsourcing posts encouraged visitors to comment on a particular post, at times with an incentive connected to this type of engagement.

table 6. presence of comments by category for posts with photos or videos

| category | number of posts with comments | number of posts without comments | total number of posts | percentage of posts with comments |
|---|---|---|---|---|
| crowdsourcing | 30 | 8 | 38 | 78.9% |
| humanizing | 18 | 50 | 68 | 26.5% |
| interacting | 3 | 13 | 16 | 18.8% |
| orienting | 8 | 20 | 28 | 28.6% |
| placemaking | 10 | 23 | 33 | 30.3% |
| showcasing | 40 | 154 | 194 | 20.6% |
| total | 109 | 268 | 377 | 28.9% |

discussion

as noted previously, the post category used most frequently by these eleven libraries on instagram was showcasing (51.5 percent).
the fact that libraries were more likely to share this type of content, which highlighted library resources, events, or collections, is understandable, as library promotion is one of the foundational reasons libraries spend the time and effort required to maintain social media accounts.29 this finding differs substantially from previous research with uk universities, which classified only 28.8 percent of posts as showcasing.30 when examining other post categories, it also became clear that uk universities shared humanizing posts more frequently (31 percent) than the eleven libraries (18 percent) included in this study.31

although the results of this study demonstrated that showcasing posts were shared most often, the data also indicates that showcasing posts were neither the category with the most likes on average nor the category that received comments most often. crowdsourcing posts were the category with the highest average number of likes (53.6) with orienting posts coming in at a close second (50), followed by placemaking (46.9) and humanizing (39.9) posts. showcasing posts, along with interacting posts, only generated slightly more than half the number of likes on average, when compared to the other categories (27.6 and 27.8, respectively).

the category with the largest proportion of comments was crowdsourcing posts, with 78.9 percent of posts in this category generating comments from visitors. however, this result is likely skewed, as one of the library instagram accounts had exceptionally successful crowdsourcing posts, which often included a giveaway or other incentive for participation. in fact, when this institution was removed from the data set, only six crowdsourcing posts remained, two of which generated comments.
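the size of this skew can be made concrete with a little arithmetic. the following is a minimal python sketch, not part of the authors' analysis: the counts are copied from table 6 and from the sentence above, the outlier account's own counts are obtained by subtraction, and the function and variable names are illustrative assumptions.

```python
def comment_rate(with_comments: int, total: int) -> float:
    """percentage of posts that drew at least one comment, rounded to one decimal."""
    return round(100 * with_comments / total, 1)

# crowdsourcing posts across all eleven accounts (table 6)
all_accounts = comment_rate(30, 38)

# crowdsourcing posts left after the one outlier account is removed:
# six posts remain, two of which generated comments
without_outlier = comment_rate(2, 6)

# the outlier account's own posts, derived by subtraction (an inference,
# not a figure reported in the article)
outlier_only = comment_rate(30 - 2, 38 - 6)

print(all_accounts, without_outlier, outlier_only)
```

under these counts the crowdsourcing comment rate falls from 78.9 percent to 33.3 percent once the outlier is excluded, which is close to the rates reported for placemaking (30.3 percent) and orienting (28.6 percent) posts. with only six posts remaining, that second figure is too small a sample to support firm conclusions.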
to better determine whether crowdsourcing posts are always this effective at generating engagement, it would be necessary to code a larger sample of instagram posts.

it is clear that while showcasing posts were the most common among the instagram accounts analyzed, they also received the lowest number of likes, on average, and generated comments less frequently than all but one post category. while this may seem disheartening, it is important to remember that the showcasing category includes informational posts that convey library hours, services, or closures; this information may be effectively relayed to users without necessitating an active response in the form of likes and comments. therefore, one might use different criteria to determine the success of showcasing posts, perhaps examining instagram data related to reach (the total number of unique visitors that view a post) and impressions (the total number of times a post is viewed).32 data on reach and impressions are only available to instagram account “owners.” in the current study, the authors did not quantify these types of engagement as their goal was to evaluate the content and metrics available to all instagram users, rather than the data that was only available to the “owners” of these library instagram accounts.

in addition to answering the research questions, coding these instagram posts prompted several new questions regarding the types of information libraries and other institutions share online. one such question is: with both universities and academic libraries working with students, why did academic libraries share a smaller percentage of interacting posts than uk universities?33 additional research is needed to answer this question, but anecdotally, this difference may be related to the fact that universities, as a whole, have a larger number of opportunities to promote and share instances of interaction via instagram than libraries.
for example, general university instagram accounts often include photos of students and affiliates interacting at large-scale events such as sports games, musical performances, and other student gatherings that take place across campus. library-specific accounts, on the other hand, have fewer opportunities to post photos that capture individuals “interacting” candidly. further, the fact that libraries tend to be proponents of privacy rights may inhibit library staff from taking photos of their users and sharing them online without first getting permission. therefore, differences related to the number of events and the organization type may contribute to whether or not universities and libraries share interacting posts; more research is needed to examine this hypothesis.

another issue that arose during coding was that, if not for their inclusion of a request to comment, many crowdsourcing posts could have been classified under other categories. if an account follower looked only at the photos included in many of the crowdsourcing posts without reading the captions, they might not interpret those posts as crowdsourcing. therefore, a future research project might examine whether applying secondary categories to crowdsourcing posts, as a means of further classifying images and not just their captions, could generate a more comprehensive picture of what libraries are sharing on their instagram accounts.

the authors also discovered that a majority of the library instagram posts included in this sample contained humanizing elements. almost all posts attempted to convey warmth, humor, or assistance, and therefore had the potential to be classified as humanizing.
to successfully adapt stuart et al.’s coding schema for academic library instagram accounts, the authors specified that a post had to have both a humanizing caption and a humanizing photo to be coded as such.34 as with crowdsourcing posts, adding secondary categories to humanizing posts could better reflect the dual nature of this content and help future coders more accurately interpret the types of content shared by academic libraries.

limitations and future research

the number of library instagram accounts selected as well as the use of a six-month timeframe were limitations of the current study. in the future, selecting a larger sample size and a different group of academic libraries would serve to advance the discipline’s understanding of the types of content shared by academic libraries and how users interact with these instagram posts. additionally, collecting instagram posts shared during an expanded timeframe could allow researchers to explore whether library instagram accounts consistently share the same types of content at various points throughout the year. as mentioned in the discussion section, future research could also include adding secondary categories to posts, which would allow researchers to gather more granular information about the types of content shared and the relationships between post category, comments, and likes.

lastly, to better understand the post categories that generate the greatest engagement, collaborative research between institutions could allow researchers to gather and analyze metrics that are only available to account owners, such as impressions and reach. with this type of collaboration, researchers could also investigate how social media outreach goals influence the types of content shared on library instagram accounts.
for example, researchers could conduct interviews or surveys with libraries and ask questions such as: what does your library hope to accomplish with its instagram account, who are you attempting to reach, how do you define a successful post, what metrics do you use to evaluate your instagram presence, and do your social media outreach goals influence the types of content shared on instagram? pursuing these types of questions, in addition to examining the actual content shared, would allow researchers to gain a more complete picture of what a successful social media presence looks like for an academic library.

conclusion

this research provides initial insight into the instagram presence of a subset of academic libraries at land-grant institutions in the united states. expanding on the research of stuart et al., this project used an adapted coding schema to document and analyze the content and efficacy of academic libraries’ instagram posts.35 the results of this study suggest that social media accounts, including those used by academic libraries, perform better when they reflect the community the library inhabits by highlighting content that is unique to their particular constituents, rather than simply functioning as another platform through which to share information. this study’s findings also demonstrate that academic libraries should strive to create an instagram presence that encompasses a variety of post categories to ensure that their online information sharing meets various needs.

endnotes

1 nancy dowd, “social media: libraries are posting, but is anyone listening?,” library journal 138, no. 10 (may 7, 2013), 12, https://www.libraryjournal.com/?detailstory=social-media-libraries-are-posting-but-is-anyone-listening.
2 marshall breeding, next-gen library catalogs (london: facet publishing, 2010); zelda chatten and sarah roughley, “developing social media to engage and connect at the university of liverpool library,” new review of academic librarianship 22, no. 2/3 (2016), https://doi.org/10.1080/13614533.2016.1152985; amanda harrison et al., “social media use in academic libraries: a phenomenological study,” the journal of academic librarianship 43, no. 3 (2017), https://doi.org/10.1016/j.acalib.2017.02.014; nicole tekulve and katy kelly, “worth 1,000 words: using instagram to engage library users,” brick and click libraries symposium, maryville, mo (2013), https://ecommons.udayton.edu/roesch_fac/20; evgenia vassilakaki and emmanouel garoufallou, “the impact of twitter on libraries: a critical review of the literature,” the electronic library 33, no. 4 (2015), https://doi.org/10.1108/el-03-2014-0051.
3 yeni budi rachman, hana mutiarani, and dinda ayunindia putri, “content analysis of indonesian academic libraries’ use of instagram,” webology 15, no. 2 (2018), http://www.webology.org/2018/v15n2/a170.pdf; catherine fonseca, “the insta-story: a new frontier for marking and engagement at the sonoma state university library,” reference & user services quarterly 58, no. 4 (2019), https://www.journals.ala.org/index.php/rusq/article/view/7148; kjersten l. hild, “outreach and engagement through instagram: experiences with the herman b wells library account,” indiana libraries 33, no. 2 (2014), https://journals.iupui.edu/index.php/indianalibraries/article/view/16633; julie lê, “#fashionlibrarianship: a case study on the use of instagram in a specialized museum library collection,” art documentation: bulletin of the art libraries society of north america 38, no. 2 (2019), https://doi.org/10.1086/705737; danielle salomon, “moving on from facebook: using instagram to connect with undergraduates and engage in teaching and learning,” college & research libraries news 74, no.
8 (2013), https://doi.org/10.5860/crln.74.8.8991.
4 “our story,” instagram, https://business.instagram.com/; chloe west, “17 instagram stats marketers need to know for 2019,” sprout blog, april 22, 2019, https://web.archive.org/web/20191219192653/https://sproutsocial.com/insights/instagram-stats/; pew research center, “social media fact sheet,” last modified june 12, 2019, http://www.pewinternet.org/fact-sheet/social-media/.
5 “our story,” instagram.
6 joe phua, seunga venus jin, and jihoon jay kim, “gratifications of using facebook, twitter, instagram, or snapchat to follow brands: the moderating effect of social comparison, trust, tie strength, and network homophily on brand identification, brand engagement, brand commitment, and membership intention,” telematics and informatics 34, no. 1 (2017), https://doi.org/10.1016/j.tele.2016.06.004.
7 fonseca, “the insta-story;” hild, “outreach and
engagement;” lê, “#fashionlibrarianship;” rachman, mutiarani, and putri, “content analysis;” salomon, “moving on from facebook;” tekulve and kelly, “worth 1,000 words.”
8 vassilakaki and garoufallou, “the impact of twitter.”
9 breeding, next-gen library catalogs; hild, “outreach and engagement;” rachman, mutiarani, and putri, “content analysis;” vassilakaki and garoufallou, “the impact of twitter.”
10 harrison, burress, velasquez, schreiner, “social media use,” 253.
11 chatten and roughley, “developing social media.”
12 peter fernandez, “‘through the looking glass: envisioning new library technologies’ social media trends that inform emerging technologies,” library hi tech news 33, no. 2 (2016), https://doi.org/10.1108/lhtn-01-2016-0004.
13 robin m. hastings, microblogging and lifestreaming in libraries (new york: neal-schuman publishers, 2010).
14 hastings, microblogging.
15 robert david jenkins, “how are u.s. startups using instagram? an application of taylor's six-segment message strategy wheel and analysis of image features, functions, and appeals” (ma thesis, brigham young university, 2018), https://scholarsarchive.byu.edu/etd/6721.
16 lucy hitz, “instagram impressions, reach, and other metrics you might be confused about,” sprout blog, january 22, 2020, https://sproutsocial.com/insights/instagram-impressions/.
17 vassilakaki and garoufallou, “the impact of twitter.”
18 mark aaron polger and karen okamoto, “who’s spinning the library? responsibilities of academic librarians who promote,” library management 34, no. 3 (2013), https://doi.org/10.1108/01435121311310914.
19 yuhen hu, lydia manikonda, and subbarao kambhampati, “what we instagram: a first analysis of instagram photo content and user types,” eighth international aaai conference on weblogs and social media (2014), https://www.aaai.org/ocs/index.php/icwsm/icwsm14/paper/viewpaper/8118; jenkins, “how are u.s. startups using instagram?;” brian j.
mcnely, “shaping organizational image-power through images: case histories of instagram,” proceedings of the 2012 ieee international professional communication conference, piscataway, nj (2012), https://doi.org/10.1109/ipcc.2012.6408624; emma stuart, david stuart, and mike thelwall, “an investigation of the online presence of uk universities on instagram,” online information review 41, no. 5 (2017): 584, https://doi.org/10.1108/oir-02-2016-0057.
20 stuart, stuart, and thelwall, “an investigation of the online presence;” mcnely, “shaping organizational image-power,” 3.
21 stuart, stuart, and thelwall, “an investigation of the online presence.”
22 stuart, stuart, and thelwall, “an investigation of the online presence,” 588.
23 stuart, stuart, and thelwall, “an investigation of the online presence,” 585.
24 “university of idaho’s peer institutions,” university of idaho, accessed october 8, 2019.
25 stuart, stuart, and thelwall, “an investigation of the online presence,” 588.
26 mcnely, “shaping organizational image-power,” 4; stuart, stuart, and thelwall, “an investigation of the online presence,” 588.
27 johnny saldaña, the coding manual for qualitative researchers (los angeles: sage publications, 2013), 27.
28 “fleiss’ kappa,” wikipedia, https://en.wikipedia.org/wiki/fleiss%27_kappa.
29 chatten and roughley, “developing social media.”
30 stuart, stuart, and thelwall, “an investigation of the online presence,” 590.
31 stuart, stuart, and thelwall, “an investigation of the online presence,” 590.
32 hitz, “instagram impressions, reach, and other metrics.”
33 stuart, stuart, and thelwall, “an investigation of the online presence,” 590.
34 stuart, stuart, and thelwall, “an investigation of the online presence,” 588.
35 stuart, stuart, and thelwall, “an investigation of the online presence.”

assessing the treatment of patron privacy in library 2.0 literature
michael zimmer
information technology and libraries | june 2013 29

abstract

as libraries begin to embrace web 2.0 technologies to serve patrons, ushering in the era of library 2.0, unique dilemmas arise regarding protection of patron privacy. the norms of web 2.0 promote the open sharing of information (often personal information), and the design of many library 2.0 services capitalizes on access to patron information and might require additional tracking, collection, and aggregation of patron activities. thus embracing library 2.0 potentially threatens the traditional ethics of librarianship, where protecting patron privacy and intellectual freedom has been held paramount. as a step towards informing the decisions to implement library 2.0 to adequately protect patron privacy, we must first understand how such concerns are being articulated within the professional discourse surrounding these next generation library tools and services.
the study presented in this paper aims to determine whether and how issues of patron privacy are introduced, discussed, and settled, if at all, within trade publications utilized by librarians and related information professionals.

introduction

in today’s information ecosystem, libraries are at a crossroads: several of the services traditionally provided within their walls are increasingly made available online, often by non-traditional sources, both commercial and amateur, thereby threatening the historical role of the library in collecting, filtering, and delivering information. for example, web search engines provide easy access to millions of pages of information, online databases provide convenient gateways to news, images, videos, as well as scholarship, and large-scale book digitization projects appear poised to make roaming the stacks seem an antiquated notion. further, the traditional authority and expertise enjoyed by librarians have been challenged by the emergence of automated information filtering and ranking systems, such as google’s algorithms or amazon’s recommendation system, as well as amateur, collaborative, and peer-produced knowledge projects, such as wikipedia, yahoo! answers, and delicious. meanwhile, the professional, educational, and social spheres of our lives are increasingly intermingled through online social networking spaces such as facebook, linkedin, and twitter, providing new interfaces for interacting with friends, collaborating with colleagues, and sharing information.

michael zimmer, phd, (zimmerm@uwm.edu), a lita member, is assistant professor, school of information studies, and director, center for information policy research, university of wisconsin-milwaukee.

libraries face a key question in this new information environment: what is the role of the library in providing access to knowledge in today’s digitally networked world?
one answer has been to actively incorporate features of the online world into library services, thereby creating “library 2.0.” conceptually, library 2.0 is rooted in the global web 2.0 discussion, and the professional literature often links the two concepts. according to o’reilly, web 2.0 marks the world wide web’s shift from a collection of individual websites to a computing platform that provides applications for end users and can be viewed as a tool for harnessing the collective intelligence of all web users.1 web 2.0 represents a blurring of the boundaries between web users and producers, consumption and participation, authority and amateurism, play and work, data and the network, reality and virtuality.2 its rhetoric suggests that everyone can and should use new internet technologies to organize and share information, to interact within communities, and to express oneself. in short, web 2.0 promises to empower creativity, to democratize media production, and to celebrate the individual while also relishing the power of collaboration and social networks.

library 2.0 attempts to bring the ideology of web 2.0 into the sphere of the library. the term is generally attributed to casey,3 and while over sixty-two distinct viewpoints and seven different definitions of library 2.0 have been advanced,4 there is general agreement that implementing library 2.0 technologies and services means bringing interactive, collaborative, and user-centered web-based technologies to library services and collections.5 examples include

• providing synchronous messaging (through instant message platforms, skype, etc.)
to allow patrons to chat with library staff for real-time assistance;
• using blogs, wikis, and related user-centered platforms to encourage communication and interaction between library staff and patrons;
• allowing users to create personalized subject headings for library materials through social tagging platforms like delicious or goodreads;
• providing patrons the ability to evaluate and comment on particular items in a library’s collection through rating systems, discussion forums, or comment threads;
• using social networking platforms like facebook or linkedin to create online connections to patrons, enabling communication and service delivery online; and
• creating dynamic and personalized recommendation systems (“other patrons who checked out this book also borrowed these items”), similar to amazon and related online services.

launching such library 2.0 features, however, poses a unique dilemma in the realm of information ethics, especially patron privacy. traditionally, the context of the library brings with it specific norms of information flow regarding patron activity, including a professional commitment to patron privacy (see, for example, american library association’s privacy policy,6 foerstel,7 gorman,8 and morgan9).
in the library, users’ intellectual activities are protected by decades of established norms and practices intended to preserve patron privacy and confidentiality, most stemming from the ala’s library bill of rights and related interpretations.10 as a matter of professional ethics, most libraries protect patron privacy by engaging in limited tracking of user activities, having short-term data retention policies (many libraries actually delete the record that a patron ever borrowed a book once it is returned), and generally enabling the anonymous browsing of materials (you can walk into a public library, read all day, and walk out, and there is no systematic method of tracking who you are or what you’ve read). these are the existing privacy norms within the library context.

library 2.0 threatens to disrupt these norms. in order to take full advantage of web 2.0 platforms and technologies to deliver library 2.0 services, libraries will need to capture and retain personal information from their patrons. revisiting the examples provided above, each relies on some combination of robust user accounts, personal profiles, and access to flows of patrons’ personal information:

• providing synchronous messaging might necessitate the logging of a patron's name (or chat username), date and time of the request, e-mail or other contact information, and the content of the exchange with the library staff member.
• library-hosted blogs or wikis will require patrons to create user accounts, potentially tying posts and comments to patron ip addresses, library accounts, or identities.
• implementing social tagging platforms would similarly require unique user accounts, possibly revealing the tags particular patrons use to label items in the collection and who tagged them.
• comment and rating systems potentially link patrons’ particular interests, likes, and dislikes to a username and account.
• using social networking platforms to communicate and provide services to patrons might result in the library gaining unwanted access to personal information of patrons, including political ideology, sexual orientation, or related sensitive information.
• creating dynamic and personalized recommendation systems requires the wholesale tracking, collecting, aggregating, and processing of patron borrowing histories and related activities.

across these examples, to participate in and benefit from library 2.0 services, library patrons could potentially be required to create user accounts, engage in activities that divulge personal interests and intellectual activities, be subject to tracking and logging of library activities, and risk having various activities and personal details linked to their library patron account. while such library 2.0 tools and services can greatly improve the delivery of library services and enhance patron activities, the increased need for the tracking, collecting, and retaining of data about patron activities presents a challenge to the traditional librarian ethic regarding patron privacy.11

despite these concerns, many librarians recognize the need to pursue library 2.0 initiatives as the best way to serve the changing needs of their patrons and to ensure the library’s continued role in providing professionally guided access to knowledge.
longitudinal studies of library adoption of web 2.0 technologies reveal a marked increase in the use of blogs, sharing plugins, and social media between 2008 and 2010.12 in this short amount of time, library 2.0 has taken hold in hundreds of libraries, and the question before us is not whether libraries will move towards library 2.0 services, but how they will do it, and, from an ethical perspective, whether the successful implementation of library 2.0 can take place without threatening the longstanding professional concerns for, and protections of, patron privacy.

research questions

recognizing that library 2.0 has been implemented, in varying degrees, in hundreds of libraries,13 and is almost certainly being considered at countless more, it is vital to ensure that potential impacts on patron privacy are properly understood and considered. as a step towards informing the decisions to implement library 2.0 to adequately protect patron privacy, we must first understand how such concerns are being articulated within the professional discourse surrounding these next generation library tools and services. the study presented in this paper aims to determine whether and how issues of patron privacy are introduced, discussed, and settled, if at all, within trade publications utilized by librarians and related information professionals. specifically, this study asks the following primary research questions:

rq1. are issues of patron privacy recognized and addressed in literature discussing the implementation of library 2.0 services?
rq2. when patron privacy is recognized and addressed, how is it articulated? for example, is privacy viewed as a critical concern, as something that we will need to simply “get over,” or as a non-issue?
rq3. what kind of mitigation strategies, if any, are presented to address the privacy issues related to library 2.0?
data analysis

the study combines content and textual analyses of articles published in professional publications (not peer-reviewed academic journals) between 2005 and 2011 discussing library 2.0 or related web-based services, retrieved through the library, information science, and technology abstracts (lista) and library literature & information science full text databases. the discovered texts were collected in winter 2011 and coded to reflect the source, author, publication metadata, audience, and other general descriptive data. in total, there were 677 articles identified discussing library 2.0 and related web-based library services, appearing in over 150 different publications. of the articles identified, 50 percent appeared in 18 different publications, which are listed in table 1.

table 1. top publications with library 2.0 articles (2005–2011)

publication                                           count
computers in libraries                                51
library journal                                       51
information today                                     21
library and information update                        21
incite                                                20
scandinavian public library quarterly                 18
american libraries                                    16
electronic library                                    15
online                                                14
school library journal                                14
information outlook                                   13
mississippi libraries                                 13
college & research library news                       12
library hi tech news                                  12
library media connection                              12
csla journal (california school library association)  10
knowledge quest                                       10
multimedia information and technology                 8

each of the 677 source texts was then analyzed to determine if a discussion of privacy was present. full-text searches were performed on word fragments to ensure the identification of variations in terminology. for example, each text was searched for the fragment “priv” to include hits on both the terms “privacy” and “private.” additional searches were performed for word fragments related to “intellectual freedom” and “confidentiality” in order to capture more general considerations related to patron privacy.
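the fragment search described above can be sketched in a few lines of python. this is only an illustration, not the procedure actually used in the study: the fragment “priv” comes from the text, while the other two search strings, the function names, and the sample sentence are assumptions made for the example.

```python
import re

# "priv" is the fragment named in the text; the other two strings are
# assumed stand-ins for the "intellectual freedom" and "confidentiality"
# searches, whose exact fragments are not given.
FRAGMENTS = ("priv", "intellectual freedom", "confidential")


def count_fragment_hits(text, fragments=FRAGMENTS):
    """Return {fragment: number of case-insensitive occurrences in text}."""
    lowered = text.lower()
    return {frag: len(re.findall(re.escape(frag), lowered)) for frag in fragments}


def has_any_hit(text, fragments=FRAGMENTS):
    """True if at least one fragment occurs anywhere in the text."""
    return any(count_fragment_hits(text, fragments).values())


# Hypothetical sample text, not taken from any of the 677 articles.
sample = "Our privacy policy protects private records. Confidentiality matters."
hits = count_fragment_hits(sample)
```

a fragment match is deliberately broad: a phrase such as “private university” also counts as a hit, so every match still needs to be read in context before an article is treated as relevant.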
of the 677 articles discussing library 2.0 and related web-based services, there were a total of 203 mentions of privacy or related concepts in 71 articles. these 71 articles were further refined to ensure that appearances of the word “privacy” and related terms were indeed relevant to the ethical issues at hand (eliminating false positives for mentions of “private university,” for example, or mention of a publication’s “privacy policy” that happened to be provided in the pdf searched). the final analysis yielded a total of 39 articles with relevant mention of patron privacy as it relates to library 2.0, amounting to only 5.8 percent of all articles discussing library 2.0 (see table 2). a full listing of the articles is in appendix a.

table 2. article summary

                                                        count  %
total articles discussing library 2.0                   677
articles with hit in “priv” and related text searches   71     10.5
articles with relevant discussion of privacy            39     5.8

the majority of these articles were authored by practicing librarians in both public and academic settings and present arguments for the increased use of web 2.0 by libraries or highlight successful deployment of library 2.0 services. of the 39 articles, only 4 focus primarily on challenges faced by libraries hoping to implement library 2.0 solutions.14

a textual analysis of the 39 relevant articles was performed to assess how privacy was discussed in each. two primary variables were evaluated: the length of discussion, and the level of concern. length of discussion was measured qualitatively as high (concern over privacy is explicit or implicit in over 50 percent of the article’s text), moderate (privacy is discussed in a substantive section of the article), and minimal (privacy is mentioned, but not given significant attention).
the level of concern was measured qualitatively as high (indicated privacy as a critical variable for implementing library 2.0), moderate (recognized privacy as one of a set of important concerns), and minimal (mentioned privacy largely in passing, giving it no particular importance). results of these analyses are reported in table 3.

table 3. length of discussion and level of concern

                      high  moderate  minimal
length of discussion  3     8         28
level of concern      9     13        16

of the 39 relevant articles, only three had lengthy discussions of privacy-related issues. as early as 2007, coombs recognized that the potential for personalization of library services would force libraries to confront existing policies regarding patron privacy.15 anderson and rethlefsen similarly engage in lengthy discussions of the challenges faced by libraries wishing to balance patron privacy with new web 2.0 tools and services.16 these three articles represent less than 1 percent of the 677 total articles identified that discussed library 2.0.

while only three articles dedicate lengthy discussions to issues of privacy, over half the articles that mention privacy (21 of 39) indicate a high or moderate level of concern.
for example, cvetkovic warns that while “privacy is a central, core value of libraries…the features of web 2.0 applications that make them so useful and fun all depend on users sharing private information with the site owners.”17 and casey and savastinuk’s early discussion of library 2.0 puts these concerns in context for librarians, warning that “libraries should remain as vigilant with protecting customer privacy with technology-based services as they are with traditional, physical library services.”18

while 21 articles indicated a high or moderate level of concern over patron privacy, less than half of these provided any kind of solution or strategy for mitigating the privacy concerns related to implementing library 2.0 technologies. overall, 14 of the 39 relevant articles provided privacy solutions of one kind or another. breeding, for example, argues that librarians must “absolutely respect patron privacy,”19 and suggests any library 2.0 tools that rely on user data should only be implemented if users must explicitly “opt-in” to having their information collected, a solution also offered by wisniewski in relation to protecting patron privacy with location-based tools.20 rethlefsen goes a step further, proposing libraries take steps to increase the literacy of patrons regarding their privacy and the use of library 2.0 tools, including the use of classes and tutorials to help educate patrons and staff alike.21

conversely, cvetkovic argues that “the place of privacy in our culture is changing,” and that while “in many ways our privacy is diminishing, but many people…seem not too concerned about it.”22 as a result, while she argues for only voluntary participation in library 2.0 services, cvetkovic takes a position that information sharing is becoming the new norm, weakening any absolute position regarding protecting patron privacy above all.
discussion

rq1 asks if issues of patron privacy are recognized and addressed within literature discussing library 2.0 and related web-based library services. of the 677 articles published for professional audiences that discuss library 2.0, only 39 contained a relevant discussion of the privacy issues that stem from this new family of data-intensive technologies, and only 11 of these discussed the issue beyond a passing mention.

rq2 asks how the privacy concerns, when present, are articulated. of the 39 articles with relevant discussions of privacy, only 11 make more than a minimal mention of privacy concerns. however, the discussion in 22 of the articles reveals a high or moderate level of concern. this suggests that while privacy might not be a primary focus of discussion, when it is mentioned, even minimally, its importance is recognized.

finally, rq3 seeks to understand if any solutions or mitigation strategies related to the privacy concerns are articulated. with only 14 of the 39 articles providing a means for practitioners to address privacy issues, readers of library 2.0 publications are more often than not left with no real solutions or roadmaps for dealing with these vital ethical issues.

taken together, the results of this study reveal minimal mention of privacy alongside discussions of library 2.0. less than 6 percent of all 677 articles on library 2.0 include mention of privacy; of these, only 11 make more than a passing mention of privacy, representing less than 2 percent of all articles. of the 39 relevant articles, 22 express more than a minimal concern, but of these, only 9 provide any mitigation strategy.
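the shares quoted in this summary follow directly from the reported counts. as a quick arithmetic check, and nothing more, a minimal python sketch is given below; the counts are the ones reported in the text, while the function and dictionary names are assumptions made for the example.

```python
# Total number of articles on library 2.0 identified in the study.
TOTAL_ARTICLES = 677


def share_of_total(count, total=TOTAL_ARTICLES):
    """Percentage of the total, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Counts as reported in the text; the labels are illustrative only.
shares = {
    "hit in text searches": share_of_total(71),
    "relevant discussion of privacy": share_of_total(39),
    "more than a passing mention": share_of_total(11),
}
```

the three values come out at 10.5, 5.8, and 1.6 percent, consistent with table 2 and with the “less than 6 percent” and “less than 2 percent” figures above.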
these results suggest that while popular publications targeted at information professionals are giving significant attention to the potential for library 2.0 to be a powerful new option for delivering library content and services, there is minimal discussion of how the widespread adoption and implementation of these new tools might impact patron privacy and even less discussion of how to address these concerns. consequently, as the interest in, and adoption of, library 2.0 services increase, librarians and related information practitioners seeking information regarding these new technologies in professional publications will not likely be confronted with the possible privacy concerns, nor learn of any strategies to deal with them.

this absence of clear guidance for addressing patron privacy in the library 2.0 era resembles what computer ethicist jim moor would describe as a “policy vacuum”:

a typical problem in computer ethics arises because there is a policy vacuum about how computer technology should be used. computers provide us with new capabilities and these in turn give us new choices for action. often, either no policies for conduct in these situations exist or existing policies seem inadequate. a central task of computer ethics is to determine what we should do in such cases, that is, formulate policies to guide our actions.23

given the potential for the data-intensive nature of library 2.0 technologies to threaten the longstanding commitment to patron privacy, these results show that work must be done to help fill this vacuum.
education and outreach must be increased to ensure librarians and information professionals are aware of the privacy issues that typically accompany attempts to implement library 2.0, and additional scholarship must take place to help understand the true nature of any privacy threats and to come up with real and useful solutions to help find the proper balance between enhanced delivery of library services through web 2.0-based tools and the traditional protection of patron privacy.

acknowledgements

this research was supported by a ronald e. mcnair postbaccalaureate achievement program summer student research grant, and a uw-milwaukee school of information studies internal research grant. the author thanks kenneth blacks, jeremy mauger, and adriana mccleer for their valuable research assistance.

references

1. tim o’reilly, “what is web 2.0? design patterns and business models for the next generation of software,” 2005, www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html.
2. michael zimmer, “preface: critical perspectives on web 2.0,” first monday 13, no. 3 (march 2008), http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/2137/1943.
3. michael casey, “working towards a definition of library 2.0,” librarycrunch (october 21, 2005), www.librarycrunch.com/2005/10/working_towards_a_definition_o.html.
4. walt crawford, “library 2.0 and ‘library 2.0,’” cites & insights 6, no. 2 (midwinter 2006): 1–32, http://citesandinsights.info/l2a.htm.
5. michael casey and laura savastinuk, “library 2.0: service for the next-generation library,” library journal 131, no. 14 (september 1, 2006): 40–42; michael casey and laura savastinuk, library 2.0: a guide to participatory library service (medford, nj: information today, 2007); nancy courtney, library 2.0 and beyond: innovative technologies and tomorrow’s user (westport, ct: libraries unlimited, 2007).
6.
american library association, “policy on confidentiality of library records,” www.ala.org/offices/oif/statementspols/otherpolicies/policyconfidentiality.
7. herbert n. foerstel, surveillance in the stacks: the fbi’s library awareness program (new york: greenwood, 1991).
8. michael gorman, our enduring values: librarianship in the 21st century (chicago: american library association, 2000).
9. candace d. morgan, “intellectual freedom: an enduring and all-embracing concept,” in intellectual freedom manual (chicago: american library association, 2006).
10. library bill of rights, american library association, www.ala.org/advocacy/intfreedom/librarybill; american library association, “privacy: an interpretation of the library bill of rights,” www.ala.org/template.cfm?section=interpretations&template=/contentmanagement/contentdisplay.cfm&contentid=132904.
11. rory litwin, “the central problem of library 2.0: privacy,” library juice (may 22, 2006), http://libraryjuicepress.com/blog/?p=68.
12. zeth lietzau and jamie helgren, u.s.
public libraries and the use of web technologies, 2010 (denver: library research service, 2011), www.lrs.org/documents/web20/webtech2010_closerlookreport_final.pdf.
13. ibid.
14. sue anderson, “libraries struggle to balance privacy and patron access,” alki 24, no. 2 (july 2008): 18–28; karen coombs, “privacy vs. personalization,” netconnect (april 15, 2007): 28; milica cvetkovic, “making web 2.0 work–from ‘librarian habilis’ to ‘librarian sapiens,’” computers in libraries 29, no. 9 (october 2009): 14–17, www.infotoday.com/cilmag/oct09/cvetkovic.shtml; melissa l. rethlefsen, “tools at work: facebook’s march on privacy,” library journal 135, no. 12 (june 2010): 34–35.
15. coombs, “privacy vs. personalization.”
16. anderson, “libraries struggle to balance privacy and patron access”; melissa l. rethlefsen, “facebook’s march on privacy,” library journal 135, no. 12 (2010): 34–35.
17. cvetkovic, “making web 2.0 work.”
18. casey and savastinuk, “library 2.0: service for the next-generation library.”
19. marshall breeding, “taking the social web to the next level,” computers in libraries 30, no. 7 (september 2010): 34–37, www.librarytechnology.org/ltg-displaytext.pl?rc=15053.
20. jeff wisniewski, “location, location, location,” online 33, no. 6 (2009): 54–57.
21. rethlefsen, “tools at work: facebook’s march on privacy.”
22. cvetkovic, “making web 2.0 work,” 17.
23. james moor, “what is computer ethics?” metaphilosophy 16, no. 4 (october 1985): 266–75.

appendix a: articles with relevant mention of patron privacy as it relates to library 2.0

anderson, sue. “libraries struggle to balance privacy and patron access.” alki 24, no. 2 (july 2008): 18–28.
balnaves, edmund.
“the emerging world of open source, library 2.0, and digital libraries.” incite 30, no. 8 (august 2009): 13.
baumbach, donna j. “web 2.0 and you.” knowledge quest 37, no. 4 (2009): 12–19.
breeding, marshall. “taking the social web to the next level.” computers in libraries 30, no. 7 (september 2010): 34–37.
casey, michael e. and laura savastinuk. “library 2.0: service for the next-generation library.” library journal 131, no. 14 (september 1, 2006): 40–42.
cohen, sarah f. “taking 2.0 to the faculty why, who, and how.” college & research libraries news 69, no. 8 (september 2008): 472–75.
coombs, karen. “privacy vs. personalization.” netconnect (april 15, 2007): 28.
coyne, paul. “library services for the mobile and social world.” managing information 18, no. 1 (2011): 56–58.
cromity, jamal. “web 2.0 tools for social and professional use.” online 32, no. 5 (october 2008): 30–33.
cvetkovic, milica. “making web 2.0 work—from ‘librarian habilis’ to ‘librarian sapiens.’” computers in libraries 29, no. 9 (october 2009): 14–17.
eisenberg, mike. “the parallel information universe.” library journal 133, no. 8 (may 1, 2008): 22–25.
gosling, maryanne, glenn harper, and michelle mclean. “public library 2.0: some australian experiences.” electronic library 27, no. 5 (2009): 846–55.
han, zhiping, and yan quan liu. “web 2.0 applications in top chinese university libraries.” library hi tech 28, no. 1 (2010): 41–62.
harlan, mary ann. “poetry slams go digital.” csla journal 31, no. 2 (spring 2008): 20–21.
hedreen, rebecca c., jennifer l. johnson, mack a. lundy, peg burnette, carol perryman, guus van den brekel, j. j. jacobson, matt gullett, and kelly czarnecki. “exploring virtual librarianship: second life library 2.0.” internet reference services quarterly 13, no. 2–3 (2008): 167–95.
horn, anne, and sue owen.
“leveraging leverage: how strategies can really work for you.” in proceedings of the 29th annual international association of technological university libraries (iatul) conference, auckland, nz (2008): 1–10, http://dro.deakin.edu.au/eserv/du:30016672/horn-leveragingleveragepaper-2008.pdf.
huwe, terence. “library 2.0, meet the ‘web squared’ world.” computers in libraries 31, no. 3 (april 2011): 24–26.
“idea generator.” library journal 134, no. 5 (1976): 44.
jayasuriya, h. kumar percy, and frances m. brillantine. “student services in the 21st century: evolution and innovation in discovering student needs, teaching information literacy, and designing library 2.0-based student services.” legal reference services quarterly 26, no. 1–2 (2007): 135–70.
jenda, claudine a., and martin kesselman. “innovative library 2.0 information technology applications in agriculture libraries.” agricultural information worldwide 1, no. 2 (2008): 52–60.
johnson, doug. “library media specialists 2.0.” library media connection 24, no. 7 (2006): 98.
kent, philip g. “enticing the google generation: web 2.0, social networking and university students.” in proceedings of the 29th annual international association of technological university libraries (iatul) conference, auckland, nz (2008), http://eprints.vu.edu.au/800/1/kent_p_080201_final.pdf.
krishnan, yyvonne. “libraries and the mobile revolution.” computers in libraries 31, no. 3 (april 2011): 5–9.
li, yiu-on, irene s. m. wong, and loletta p. y. chan. “mylibrary calendar: a web 2.0 communication platform.” electronic library 28, no. 3 (2010): 374–85.
liu, shu. “engaging users: the future of academic library web sites.” college & research libraries 69, no. 1 (january 2008): 6–27.
mclean, michelle. “virtual services on the edge: innovative use of web tools in public libraries.” australian library journal 57, no. 4 (november 2008): 431–51.
oxford, sarah.
“being creative with web 2.0 in academic liaison.” library & information update 5 (may 2009): 40–41.
rethlefsen, melissa. “facebook’s march on privacy.” library journal 135, no. 12 (2010): 34–35.
schachter, debbie. “adjusting to changes in user and client expectations.” information outlook 13, no. 4 (2009): 55.
shippert, linda crook. “thinking about technology and change, or, ‘what do you mean it’s already over?’” pnla quarterly 73, no. 2 (2008): 4, 26.
stephens, michael. “the ongoing web revolution.” library technology reports 43, no. 5 (2007): 10–14.
thornton, lori. “facebook for libraries.” christian librarian 52, no. 3 (2009): 112.
trott, barry and kate mediatore. “stalking the wild appeal factor.” reference & user services quarterly 48, no. 3 (2009): 243–46.
valenza, joyce kasman. “a few new things.” lmc: library media connection 26, no. 7 (2008): 10–13.
widdows, katharine. “web 2.0 moves 2.0 quickly 2.0 wait: setting up a library facebook presence at the university of warwick.” sconul focus 46 (2009): 54–59.
wisniewski, jeff. “location, location, location.” online 33, no. 6 (2009): 54–57.
woolley, rebecca. “book review: information literacy meets library 2.0: peter godwin and jo parker (eds.).” sconul focus 47 (2009): 55–56.
wyatt, neal. “2.0 for readers.” library journal 132, no. 18 (2007): 30–33.

tending to an overgrown garden: weeding and rebuilding a libguides v2 system
rebecca hyams
information technology and libraries | december 2020
https://doi.org/10.6017/ital.v39i4.12163
rebecca hyams (rhyams@bmcc.cuny.edu) is web and systems librarian, borough of manhattan community college/cuny. © 2020.
abstract

in 2019, the borough of manhattan community college’s library undertook a massive cleanup and reconfiguration of the content and guides contained in their libguides v2 system, which had been allowed to grow out of control over several years as no one was in charge of its maintenance. this article follows the process from identifying issues and getting departmental buy-in to doing all of the necessary cleanup work for links and guides. the aim of the project was to make their guides easier for students to use and understand and for librarians to maintain. at the same time, work was done to improve the look and feel of their guides and implement the built-in a-z database list, both of which are also discussed.

introduction

in early 2019, the a. philip randolph library at the borough of manhattan community college (bmcc), part of the city university of new york (cuny) system, hired a new web and systems librarian. the position itself was new to the library, though some of its functions had previously been performed by a staff member who had left more than a year prior. it quickly became apparent to the newest member of the library’s faculty that, while someone had at one point managed the website, the same could not really be said for the library’s libguides system and the mass of content contained within. the library’s libguides system was first implemented in january 2013 and over time the system came to be used primarily by instruction librarians to serve their teaching efforts.
not long after bmcc implemented libguides, springshare announced libguides version 2 (v2), a new version of the system that included several enhancements and features not present in the earlier version.1 these features included the ability to mix content types in a single box (in the earlier version, for example, boxes could have either rich text or links but not both), a centrally managed asset library, and an automatically-generated a-z database list designed to make it easy to manage a public-facing display. bmcc moved to libguides v2 around early 2015, but few of those who worked with the system took advantage of the newer features for quite some time, if at all.

at the time the web and systems librarian came aboard, the bmcc libguides system contained over 400 public guides and an unwieldy asset library filled with duplicates and broken widgets and links. many of the guides essentially duplicated others, with only the name of the classroom instructor differing. there were, for example, 69 separate guides just for english 101, some of which had not been updated in three or four years. there were no local guidelines for creating or maintaining guides, and in theory, each librarian was responsible for their own. however, it was apparent that in practice, no one was actively managing the guides or their related assets, as the lists of both were overwhelming. the creators of existing guides were primarily reference and instruction librarians whose other responsibilities meant there was little time to do guide upkeep and because there was no single person in charge of the guides, there was no one to ensure any maintenance took place.
in addition to the unwieldy guide list and asset library, the bmcc library was also effectively maintaining two separate a-z database lists, one on the library’s website that was a homegrown sql database built by a previous staff member, and another running on libguides to provide links to databases via the guides. the lists were not in sync with one another and several of the librarians were unaware that the libguides version of the list even existed, leading to links to databases appearing on both the database list and as link assets. and, while the libguides a-z list was not linked from the library’s website, it was still accessible from points within libguides, meaning that patrons could encounter an incorrect list that was not being maintained.

getting started

before any work could be done on our system, there needed to be buy-in from the rest of the library faculty. with the library director in agreement, agenda items were added to department meetings between march and may 2019 for discussion and department approval. the various aspects of the project were pitched to emphasize the following goals:

• removing outdated material, broken links, etc.
• streamlining where information could be found
• decluttering guides to make everything easier to use and understand for students
• improving the infrastructure to make maintenance and new guide creation easier and more manageable
• standardizing layouts and content

the aim of all of this would be to increase guide usability and accessibility and to make the guides overall a more consistent resource for our students.
for the sake of transparency (as well as to have a demo of some of the aesthetic changes discussed in more detail below), a project guide was created and shared with the rest of the library department to share preliminary data as well as detailed updates as tasks were completed.2

process

the database list

while the libguides a-z database list, a feature built into v2 of the platform, contained information about our databases, it was essentially only serving to provide links to databases when creating guide content. there was some indication, in the form of a dormant a-z database “guide,” that someone had tried to create a list in libguides by manually adding assets to a guide. while that was a common practice in libguides v1 sites, as the built-in list was not yet a part of the system, the built-in list itself was never properly put into use. the links on our website all pointed to a homegrown list which, while powered by an sql database, was essentially a manual list. because of its design, it had proved impossible for anyone in the library to update without extensive web programming knowledge.

it seemed a no-brainer to work on the database list first. this way we had both the infrastructure to update database-related content on the guides and a single and up-to-date list of resources with enhanced functionality that could benefit the library’s users almost immediately.3

to begin, the two lists were compared to find any discrepancies, of which there were many. as the e-resources librarian was on leave at the time, the library director was consulted to determine which of the databases missing from the libguides list were active subscriptions (and which of the ones missing from the homegrown list were previously cancelled so they could be removed).
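a comparison like this amounts to a set difference between the two lists. the python sketch below is one way to do it and is not a description of how the comparison was actually carried out at bmcc; the database names are made up, and the assumption is that each list can be exported as a plain list of names.

```python
def normalize(name):
    """Lowercase and collapse whitespace so trivial differences do not count."""
    return " ".join(name.lower().split())


def compare_lists(homegrown, libguides):
    """Report which database names appear on only one list and which on both."""
    home = {normalize(n) for n in homegrown}
    lib = {normalize(n) for n in libguides}
    return {
        "only_homegrown": sorted(home - lib),
        "only_libguides": sorted(lib - home),
        "in_both": sorted(home & lib),
    }


# Hypothetical names standing in for the two exported lists.
homegrown = ["Example Database A", "Example Database B", "Old Example Database"]
libguides = ["example database  a", "Example Database B", "New Example Database"]
report = compare_lists(homegrown, libguides)
```

names that appear on only one list are the ones that need a human decision, for instance checking whether a subscription is still active. exact matching after normalization will miss renamed products, so the output is a starting point for review rather than a final answer.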
once the database list reflected current holdings, the metadata entries for the databases on the libguides side were updated to include resource type, related subjects, and related icons. these updates would enhance the functionality of the libguides list, as it could be filtered or searched using that additional information, something that was missing from the homegrown list. in addition to updating content and adding useful metadata, some slight visual changes were made to improve the look and usability of the list using custom css. most of this was done because as the list was being worked on, several librarians (of those who were even aware of it in the first place) mentioned that one reason they disliked the libguides list was because of the font size and spacing, which they felt was too small and hard to read.

with the list updated, it was presented at the march 2019 department meeting and quickly won over all in attendance, especially when it was pointed out that the list could be very easily maintained because it required no special coding knowledge. while the homegrown list would remain live on the server for the rest of the semester (so as to not disrupt any classes that may have been using it), it was agreed that the web and systems librarian could go ahead with switching all of the links pointing to the homegrown list to point to the springshare list instead.

the asset library

because of how guides were typically created over the years since adopting libguides (many appeared to have been copied from another existing guide each time), the asset library had grown immense and unmanageable. for example, there were 149 separate links to our “databases by subject” page on the library’s website, the overwhelming majority of which were only used once. there were also 145 separate widgets for the same embedded scribd-hosted keyword worksheet, which was in fact broken and displayed no content.
this is to say nothing of the broken-link report that no one had reviewed in quite some time. tackling the cleanup of duplicates and fixing of broken links/embeds was a large piece of the invisible work taken on behind the scenes to make maintaining the guides easier in the future.

in order to analyze the data, the asset library report was exported to an excel file to make it easier to identify issues that needed correction. to start this process, we requested that springshare technical support wipe out all assets (other than documents) that were not mapped to anything and were just cluttering up the asset library (this ended up being just under 2,000 assets).4 most of those items had been removed from the guides they were originally included on but were never removed from the asset library. they served no real function other than to clutter up the backend.

the guide authors had given the web and systems librarian permission to remove anything broken that could not be easily fixed. this included the aforementioned broken worksheet (and other similar items), as well as an assortment of youtube video embeds where the video had since been taken down, resulting in a “this video is unavailable” error message. since those items were already not working and seriously hurt the reliability of our guides for our users, it was felt that no further permission was needed.

then came the much more tedious task of standardizing (where possible) which assets were in use. this involved going into guides listed as containing known-duplicate assets, replacing them with a single, designated asset, and then removing the resulting unmapped items.5 it was decided that, while many of the guides would likely be deleted after the spring semester, only assets appearing on currently-active guides would be standardized.
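duplicates of this kind can be surfaced by grouping the exported report by url. the python sketch below shows the idea under stated assumptions: the column names "id", "name", and "url" are hypothetical and do not describe the actual layout of the libguides asset export, and the sample rows are invented.

```python
import csv
import io
from collections import defaultdict


def find_duplicate_urls(rows):
    """Group asset ids by a lightly normalized url and keep groups larger than one."""
    by_url = defaultdict(list)
    for row in rows:
        # Simplification: lowercasing and dropping a trailing slash can merge
        # urls that a case-sensitive server would treat as different.
        url = row["url"].strip().lower().rstrip("/")
        by_url[url].append(row["id"])
    return {url: ids for url, ids in by_url.items() if len(ids) > 1}


# Invented sample export; the column names are assumptions.
SAMPLE = """id,name,url
1,Databases by Subject,https://library.example.edu/databases/
2,databases by subject,https://library.example.edu/databases
3,Citation Help,https://library.example.edu/cite
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
duplicates = find_duplicate_urls(rows)
```

each group returned is a candidate for consolidation into a single designated asset. deciding which copy to keep, and remapping the guides that use the others, still has to be done in the system itself.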
in hindsight, as many of the links that were fixed were on guides that were soon to be deleted, it would have been better to hold off until those guides had been deleted. however, doing at least some of this work in advance helped find other issues, including instances where our proxy prefix was included directly in the url (an issue as we were also in the process of changing our ezproxy hosting) and where custom descriptions or link names were unclear. “books from the catalog” assets had their own issues that also needed to be addressed. with a pending migration of the library’s ils, it was already apparent that the links to any books in the library’s catalog would need updating so they could have a shot at continuing to function post-migration.6 we had been told at the time that the library’s primo instance would remain through the migration (though this changed during the migration process), so we felt it important to ensure that all links were pointing to primo, as some had been pointing to the soon-to-be decommissioned opac. for consistency, the urls were structured as isbn searches instead of ones relying on internal system numbers that would soon change. however, it became obvious very early on that some of the links to books were either pointing to materials that were no longer in the library’s collection, or were pointing to a previously decommissioned opac server, both of which resulted in errors. because the domain of the previously decommissioned opac server had been whitelisted in the link checker report settings, these items had not appeared on the broken link list. using the filtered list of “books from the catalog” assets, all titles were checked, which allowed the web and systems librarian to remove items that were no longer in the collection and make other adjustments as needed. as a result of the asset cleanup process, the asset library went from an unwieldy total of more than 5,000 items to just over 2,000 items.
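the kind of checks described above (duplicate links, unmapped assets, and proxy prefixes embedded directly in a url) can be illustrated with a short python sketch. it is an illustration only, not the procedure actually used, which relied on an excel export: the field names (id, url, mappings), the sample rows, and the proxy host ezproxy.example.edu are all hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit

# Hypothetical rows standing in for an exported asset report.
# The field names and values are assumptions made for illustration.
ASSETS = [
    {"id": 1, "url": "https://library.example.edu/databases", "mappings": 1},
    {"id": 2, "url": "https://Library.Example.edu/databases/", "mappings": 1},
    {"id": 3, "url": "https://ezproxy.example.edu/login?url=https://db.example.com", "mappings": 2},
    {"id": 4, "url": "https://www.example.org/worksheet", "mappings": 0},
]
PROXY_HOST = "ezproxy.example.edu"  # assumed proxy host name


def normalize_url(url):
    """Lowercase the host and drop a trailing slash so near-identical links compare equal."""
    parts = urlsplit(url.strip())
    path = parts.path.rstrip("/")
    query = f"?{parts.query}" if parts.query else ""
    return f"{parts.scheme}://{parts.netloc.lower()}{path}{query}"


def find_duplicates(assets):
    """Return {normalized url: [asset ids]} for urls used by more than one asset."""
    groups = defaultdict(list)
    for asset in assets:
        groups[normalize_url(asset["url"])].append(asset["id"])
    return {url: ids for url, ids in groups.items() if len(ids) > 1}


def find_unmapped(assets):
    """Return ids of assets that are not mapped to any guide."""
    return [asset["id"] for asset in assets if asset["mappings"] == 0]


def find_proxied(assets, proxy_host):
    """Return ids of assets whose url points at the proxy host directly."""
    return [
        asset["id"]
        for asset in assets
        if urlsplit(asset["url"]).netloc.lower() == proxy_host.lower()
    ]


duplicates = find_duplicates(ASSETS)
unmapped = find_unmapped(ASSETS)
proxied = find_proxied(ASSETS, PROXY_HOST)
```

normalizing before grouping matters because copies of the same link often differ only in letter case or a trailing slash; a stricter or looser normalization may be appropriate for a real export.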
it also simplified the process for reusing assets in new guides, as there was now only one choice per item, and made it much easier to find and fix broken links and embeds. the guides the cleanup of the guides themselves was by far the most complex task. before starting the guide cleanup work itself, the web and systems librarian performed a content analysis to identify guides to recommend for deletion and guides that could be converted into general subject area guides. because a common practice was to create a “custom” guide for each class that came in for a library instruction session, there was an overrepresentation of guides for the classes that had regular sessions: english 101 (english composition), english 201 (introduction to literature), speech 100 (public speaking), and introduction to critical thinking. those four courses accounted for 187 guides, or over 40 percent of the total number in our system. the majority of them had not been updated directly in over three years, and in some cases, were designed for instructors who no longer taught at the college. perhaps more telling was that the content for these guides differed more across the librarians who created them than across the courses they were designed for. this meant that while there might be three or four different iterations of the english 101 guide, the guides created by the same librarian for different introductory courses were essentially the same. before the arrival of the web and systems librarian, one of the other librarians had been occasionally maintaining guide groups for “current courses” and “past courses,” but it was unclear if anyone was still actively maintaining these groupings, as guides for current instructors were sometimes under “past courses” and vice versa.
because these groups did not actually hide the guides from view on the master list of guides and appeared to be unnecessary work, it was decided to remove the groupings. instead, the web and systems librarian would plan to revisit the guides on a regular basis to unpublish/remove anything for courses that were no longer taught. however, since the philosophy behind the guides was to move from “custom” guides for each instructor’s section to a general guide for the course as a whole for the overwhelming majority of cases, the need for maintaining these groupings was essentially eliminated anyway. in may 2020, a preliminary list of guides to be deleted was presented to the librarians at the monthly department meeting. the list was broken down as: • duplicates to be deleted: this portion consisted primarily of course guides like those mentioned above where multiple guides existed for the same course, most of which used the exact same content. • guides to be “merged:” while merging guides is not actually possible in the libguides platform, there were cases where we had two or three for the same course. they could be condensed into a single guide with the rest deleted. • guides to convert to subject area guides: these were guides that were essentially already structured as a subject guide but were titled for a specific course, and in many cases, a guide for the subject area did not already exist (for example, a course-specific guide for business would become the business subject area guide). • dead guides: these were guides that had not been updated in more than two years and had not been viewed in at least one year. librarians were given an opportunity in the department meeting to comment on the list, as well as to contact the web and systems librarian with any comments.
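the “dead guides” rule in the list above (not updated in more than two years and not viewed in at least one year) is simple enough to express as a check over guide statistics. the following python sketch is a hedged illustration, assuming a hypothetical record layout (title, last_updated, last_viewed) rather than any actual libguides export format; a guide with no recorded view is treated as unviewed.

```python
from datetime import date


def years_before(day, years):
    """Return the same calendar day a number of years earlier."""
    try:
        return day.replace(year=day.year - years)
    except ValueError:
        # 29 February has no counterpart in a non-leap year; fall back to 28 February.
        return day.replace(year=day.year - years, day=28)


def is_dead(guide, today):
    """Apply the rule: no update in more than two years, no view in at least one year."""
    stale = guide["last_updated"] < years_before(today, 2)
    last_viewed = guide["last_viewed"]
    unviewed = last_viewed is None or last_viewed <= years_before(today, 1)
    return stale and unviewed


# Hypothetical sample records; titles and dates are invented for illustration.
GUIDES = [
    {"title": "course guide a", "last_updated": date(2016, 9, 1), "last_viewed": date(2017, 12, 1)},
    {"title": "course guide b", "last_updated": date(2016, 9, 1), "last_viewed": date(2019, 4, 20)},
    {"title": "course guide c", "last_updated": date(2019, 2, 1), "last_viewed": None},
]

today = date(2019, 5, 1)  # assumed review date
dead_titles = [g["title"] for g in GUIDES if is_dead(g, today)]
```

both conditions must hold, so an old guide that is still being viewed (guide b) is kept for human review rather than flagged.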
additionally, as some of the classroom faculty on campus had connections to specific guides, the library director also sent out a message to classroom faculty to let them know of our general plan to revamp the guides and that many would be removed over the summer. surprisingly, there were few objections either amongst the librarians or the classroom faculty once they understood the rationale and process. of the few classroom faculty members that did respond to the library director’s message, most of them were more concerned with content or specific links that they felt strongly about than with the guides themselves. in those cases, we noted the content requests to make sure they appeared on the new guides. most of these instructors were satisfied when we further explained our process and, if needed, assured them that the content they requested would be worked into the new guide. only one instructor who responded, whose assignment was related to a grant they had received, made a strong case for keeping a separate guide for their sections of english 101. with the project approval out of the way, it was then time to begin removing all of the to-be-deleted guides and start the process of revamping those that would be kept. the goal was that the project would be completed by the start of the fall semester so that faculty and students would come back to a new (and hopefully, much improved) set of guides. removing debris to be cautious, a few preliminary steps were taken before the guides selected for deletion were removed. for starters, the selected guides had their status changed to “unpublished,” meaning that they no longer appeared on the public-facing list of guides.
this gave everyone a chance to say something if a guide they were actively using suddenly went “missing.” these unpublished guides were then downloaded using the libguides html backup feature and saved to the department’s network share drive. while the html backup output is not a full representation of the guide (the file generated displays as a single page and is missing any formatting or images that were included in the guide), it does include all of a guide’s content, meaning that a link or block of text can be retrieved from the backup in case of moments of “i know i had this on my guide before but....” because of the somewhat haphazard nature of our guides, deleting unwanted ones turned out to result in interesting and unexpected challenges. over the years, some of the librarians had, from time to time, reused individual boxes between guides, but there was no consistency to the practice. while there was a repository guide for reusable content, not everyone used it or used it consistently. thankfully, libguides runs a pre-delete check, which proved to be invaluable in this process, as it showed if any of the boxes displayed on one guide were reused on any others. in most cases where boxes were reused, they were reused on guides that were also on the “to be deleted” list, but that was not always the case. by having that check we could find the other guides listed and make copies of the boxes that would have otherwise been deleted. if a box was reused on multiple guides that were being kept, it was copied to the reusable content guide and then remapped from there.
cosmetic improvements in conjunction with the work being done to improve content of our guides, the web and systems librarian felt it was the perfect opportunity to update the guide templates and overall aesthetics to make the guides more visually appealing, especially considering little had been done in this area system-wide apart from setting the default color scheme. using the project guide as an initial sandbox, several changes were put into motion that would eventually be worked into new templates and pushed out to all of the reworked guides. the first, and perhaps biggest, change was the move from tab navigation to side navigation (an option first made available with the release of libguides v2). while there have been several usability studies that have debated using one over the other, in this case side navigation was chosen both for the streamlined nature of the layout as a whole (by default there is only one full content column), and because enabling the box-level navigation could serve as a quick index for anyone looking to find specific content on a page.7 side navigation also avoided the issue of long lists of tabs spilling into a second row, which further complicated page navigation. several changes to the look and feel of the guides were also put into place, with many of the changes coming from suggestions given on various libguide style or best practice guides or more general recommendations from web usability guidelines.8 perhaps most importantly, all of the font sizes were increased for improved readability, especially on box titles and headers, to better facilitate visual scanning. the default fonts were also replaced with two commonly used fonts from the google fonts library, roboto (for headings and titles) and open sans (for body text). 
additionally, the navigation color scheme was changed because the orange of the college’s blue-and-orange color scheme regularly failed accessibility contrast checks and was described by some colleagues as “harsh on the eyes.” instead, two analogous lighter shades of blue (one of which was taken from the college’s branding documentation) were selected for the navigation and box titles respectively, both of which allowed for the text in those areas to be changed from white to black (again, for improved readability). figure 1 shows a typical “before” guide navigation design, and figure 2 shows a typical “after” design. figure 1. a sample of guide navigation and content frequently found on guides before start of cleanup. figure 2. navigation and content after revisions. additionally, the web and systems librarian took this opportunity to go through the remaining guides to ensure they were all consistent. most of this work fell in the area of text styling, or rather, undoing text styling. it was clear from several of the guides that over the years, librarians had not been happy with the default font sizes or styles, which led to a lot of customizing using the built-in wysiwyg text editor. not only did this create a nightmare in the code itself (as the wysiwyg editor adds a lot of extraneous <span> tags and style markup), but it also meant that the changes coming from the new stylesheet were not being applied universally, as any properties assigned on a page overrode the global css. there was also the issue of paragraph text (<p>) that was sometimes styled as fake headings (made larger or bolder to look like headings, but not using the proper <h#> tags), which needed to be corrected for consistency and accessibility purposes.
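the two markup problems just described, inline styles that override the global stylesheet and paragraphs dressed up as headings, can be located mechanically before being fixed by hand. the python sketch below is one possible approach using only the standard library's html.parser; it is not the method used in this project. the heuristic for a “fake heading” (a <p> whose entire text is bold or carries an inline font-size) and the sample markup are assumptions made for illustration.

```python
from html.parser import HTMLParser


class StyleAudit(HTMLParser):
    """Collect inline style attributes and paragraphs that look like fake headings."""

    EMPHASIS_TAGS = {"b", "strong"}
    VOID_TAGS = {"br", "hr", "img", "input"}  # elements without a closing tag

    def __init__(self):
        super().__init__()
        self.inline_styles = []  # (tag, style) pairs in document order
        self.fake_headings = []  # text of paragraphs that are emphasized throughout
        self._in_paragraph = False
        self._paragraph_emphasized = False
        self._emphasis_stack = []  # one flag per open element inside the current <p>
        self._emphasized_text = []
        self._has_plain_text = False

    def handle_starttag(self, tag, attrs):
        style = dict(attrs).get("style") or ""
        if style:
            self.inline_styles.append((tag, style))
        if tag in self.VOID_TAGS:
            return
        if tag == "p":
            self._in_paragraph = True
            self._paragraph_emphasized = "font-size" in style
            self._emphasis_stack = []
            self._emphasized_text = []
            self._has_plain_text = False
        elif self._in_paragraph:
            self._emphasis_stack.append(tag in self.EMPHASIS_TAGS or "font-size" in style)

    def handle_endtag(self, tag):
        if tag in self.VOID_TAGS:
            return
        if tag == "p" and self._in_paragraph:
            text = "".join(self._emphasized_text).strip()
            if text and not self._has_plain_text:
                self.fake_headings.append(text)
            self._in_paragraph = False
        elif self._in_paragraph and self._emphasis_stack:
            self._emphasis_stack.pop()

    def handle_data(self, data):
        if not self._in_paragraph or not data.strip():
            return
        if self._paragraph_emphasized or any(self._emphasis_stack):
            self._emphasized_text.append(data)
        else:
            self._has_plain_text = True


# Hypothetical box content resembling WYSIWYG editor output.
SAMPLE = (
    '<p><span style="font-size:18px;"><strong>finding articles</strong></span></p>'
    "<p>use the <b>databases</b> below.</p>"
    "<h3>real heading</h3>"
)

audit = StyleAudit()
audit.feed(SAMPLE)
```

after feeding the sample, audit.inline_styles lists the styled <span> and audit.fake_headings contains the first paragraph only; the second paragraph mixes plain and bold text, so it is not flagged. a heuristic like this will miss some cases and flag some legitimate emphasis, so its output is a review list rather than a set of automatic corrections.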
replanting and sprucing up with an overwhelming majority of the guides (and their associated assets) deleted, it was finally time to rework the remaining guides into clear, easy-to-use resources that would benefit our students. at this point the guides fell into three categories: • guides that just needed to be pruned and updated. • guides that should be combined into a single subject area guide. • guides that should be created to fill an unmet need. pruning and updating tasks were generally the least-arduous, as many of the guides included content that was also housed on discrete guides (citations, resource evaluation, etc.). instead of duplicating, for example, citation formats on every guide, those pages were replaced with navigation-level links out to the existing citation guide. this was also the point that we could do more extensive quality control such as switching to a single content column which further emphasized the extraneous information on many of our guides. infographics, videos, and long blocks of links or text were scrutinized to determine if they were helping to enhance students’ understanding of the core content or if they were merely providing clutter that would make it more difficult to understand the important information.9 in some cases, by going from guide to guide, it became apparent that there were guides for multiple courses in a subject area where the resources were basically identical. this was most noticeable in the criminal justice and health education subject areas. in these cases, it made little sense to keep separate course guides when the content was basically the same across them. to remedy this duplication, one of the course guides for each subject was transformed into the subject area guide, and resources were added to ensure they covered the same materials that the separate course guides may have covered.
the remaining course guides were then marked for future deletion as they were no longer needed. lastly, subject areas without guides were identified so that work could be done later to create them. as we had discussed moving towards using the “automagic” integration of guide content into our blackboard learning management system (lms), this step will be key in ensuring that all subject areas have at least some resources students can use. however, as of this time we have yet to finish creating these additional guides, and several subject areas (including computer science, nursing, and gender studies) have no guides at all. next steps now that all of the work to clean and update our libguides is done, the most important next step is coming up with a workflow to ensure that the guides stay relevant and useful. the web and systems librarian mostly left the guides alone for the fall 2019 semester to allow their colleagues time to use them and report back any issues. to the web and systems librarian’s surprise there were few issues reported, but that does not mean there is no room for future improvement. as a department, it is clear that we need a formal plan for maintaining the guides, including update frequency, content review, and guidelines for when guides should be added or deleted. additionally, immediately following the conclusion of this cleanup project the library’s website was forced into a server migration and full rebuild for reasons outside of the scope of this article. however, as a result there were changes made on the library’s site involving the look and feel of pages that will need to be carried through into our guides and associated springshare platforms. while most of this work is relatively simple, mimicking changes developed in wordpress to work properly on external services will take time and effort. 
conclusion overall, while this project was a massive undertaking (done almost entirely by a single person), the end result, at least on the surface, has made our guides much easier to use and understand. there were obviously several things that, if the project were to be done over, should have been done differently, mostly involving the cleaning of the asset library. however, it is now much easier to refer students to guides for their courses and the feelings about the guides amongst the library faculty have become much more positive. endnotes 1 “libguides: the next generation!,” springshare blog (blog), june 26, 2013, https://blog.springshare.com/2013/06/26/libguides-the-next-generation/. 2 the guide can be viewed at: https://bmcc.libguides.com/guidecleanup. 3 though the author only learned of the project undertaken at unc a few years ago, after they had already finished this project, a similar project was outlined here: sarah joy arnold, “out with the old, in with the new: migrating to libguides a-z database list,” journal of electronic resources librarianship 29, no. 2 (april 2017): 117–20, https://doi.org/10.1080/1941126x.2017.1304769. 4 because there was no way to view the documents before a bulk deletion, documents were manually reviewed and deleted as needed. 5 it was only long after this process that springshare promoted that they could do this on the backend by request. 6 however, it turned out that, due to the differences in url structure between classic primo and primo ve, this change was completely unnecessary, as the urls did actually need to be changed again post-migration. at least they were consistent, which meant a systemwide find-and-replace could take care of most of the links.
7 several studies have been done since the roll out of libguides v2 including: sarah thorngate and allison hoden, “exploratory usability testing of user interface options in libguides 2,” college and research libraries 78, no. 6 (2017): 844–61, https://doi.org/10.5860/crl.78.6.844; kate conerton and cheryl goldenstein, “making libguides work: student interviews and usability tests,” internet reference services quarterly 22, no. 1 (january 2017): 43–54, https://doi.org/10.1080/10875301.2017.1290002. 8 of the many guides the author consulted, the following were the most informative: stephanie jacobs, “best practices for libguides at usf,” https://guides.lib.usf.edu/c.php?g=388525&p=2635904; jesse martinez, “libguides standards and best practices,” https://libguides.bc.edu/guidestandards/getting-started; carrie williams, “best practices for building guides & accessibility tips,” https://training.springshare.com/libguides/best-practices-accessibility/video. 9 there is a very detailed discussion of cognitive overload in libguides in jennifer j. little, “cognitive load theory and library research guides,” internet reference services quarterly 15, no. 1 (march 1, 2010): 53–63, https://doi.org/10.1080/10875300903530199.
the impact of web search engines on subject searching in opac yu, holly; young, margo information technology and libraries; dec 2004; 23, 4; proquest pg. 168 the impact of web search engines on subject searching in opac holly yu and margo young this paper analyzes the results of transaction logs at california state university, los angeles (csula) and studies the effects of implementing a web-based opac along with interface changes. the authors find that user success in subject searching remains problematic. a major increase in the frequency of searches that would have been more successful in resources other than the library catalog is noted over the time period 2000-2002. the authors attribute this increase to the prevalence of web search engines and suggest that metasearching, relevance-ranked results, and relevance feedback ("more like this") are now expected in user searching and should be integrated into online catalogs as search options. in spite of many studies and articles on online public access catalogs (opac) over the last twenty-five years, many of the original ideas about improving user success in searching library catalogs have yet to be implemented. ironically, many of these techniques are now found in web search engines. the popularity of the web appears to have influenced users' mental models and thus their expectations and behavior when using a web-based opac interface. this study examines current search behavior using transaction-log analysis (tla) of subject searches when zero hits are retrieved. it considers some of the features of web search engines and online bookstores and suggests future enhancements for opacs. literature review many studies have been published since the 1980s centering on the opac. seymour and large and beheshti provide in-depth overviews on opac research from the mid-1980s through the mid-1990s.1
much of this research has addressed system design and user behavior including: • user demographics, • search behavior, • knowledge of system, • knowledge of subject matter, • library settings, • search strategies, and • opac systems.2 holly yu (hyu3@calstatela.edu) is library web administrator and reference librarian at the university library, california state university, los angeles. margo young (margo.e.young@jpl.nasa.gov) is manager of the library, archives and records section at the jet propulsion laboratory, california institute of technology, pasadena. opac research has employed a number of data-collection methodologies: experiment, interviews, questionnaires, observation, think aloud, and transaction logs.3 transaction logs have been used extensively to study the use of opacs, and library literature reflects this. while the exact details of tla vary greatly, peters et al. define it simply as "the study of electronically recorded interactions between online information retrieval systems and the persons who search for the information found in those systems."4 this section reviews the tla literature relevant to the study. number of hits tla cannot portray user intention or actual satisfaction since relevance, success, or failure are subjectively determined and require the user to decide. peters recommends combining tla with another technique such as observation, questionnaire or survey, interview, or focus group.5 in spite of the limitations of tla, many studies (including this one) rely on it alone. typically, these studies define failure as zero hits in response to a search. generalizing from several studies, approximately 30 percent of all searches result in zero hits.6 the failure rate is even higher for subject searches: peters reported that about 40 percent of subject searches failed by retrieving zero hits.7 some researchers also define an upper number of results for a successful search.
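the zero-hit definition of failure used in these studies reduces to a simple tally over log entries. the python sketch below is a minimal illustration, assuming a hypothetical log in which each entry is a (search type, query, hit count) tuple; actual opac transaction logs vary by system and need parsing first, and the sample entries are invented.

```python
from collections import Counter


def zero_hit_rates(log):
    """Return the share of searches with zero hits, per search type and overall."""
    if not log:
        return {}
    totals, failures = Counter(), Counter()
    for search_type, _query, hits in log:
        totals[search_type] += 1
        if hits == 0:
            failures[search_type] += 1
    rates = {kind: failures[kind] / totals[kind] for kind in totals}
    rates["all"] = sum(failures.values()) / sum(totals.values())
    return rates


# Hypothetical log entries: (search type, query, number of hits).
SAMPLE_LOG = [
    ("subject", "global warming", 12),
    ("subject", "teen pregnancy statistics", 0),
    ("subject", "psychology", 240),
    ("subject", "how to write a resume", 0),
    ("subject", "civil war", 85),
    ("keyword", "shakespeare hamlet", 31),
    ("keyword", "nursing ethics", 9),
    ("keyword", "enviromental policy", 0),
    ("keyword", "jazz history", 17),
    ("keyword", "solar energy", 44),
]

rates = zero_hit_rates(SAMPLE_LOG)
```

as the surrounding discussion notes, a count like this says nothing about user intent or satisfaction; it only measures how often a search returned nothing.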
buckland found that the average retrieval set was 98. 8 blecic reported that cochrane and markey found that opac users retrieve too much (15 percent of the time).9 wiberly, daugherty, and danowski (as reported in peters) found that the median number of postings considered to be too many was fifteen, although when fifteen to thirty postings were retrieved, more users displayed them all than abandoned the search.10 subject searching some studies have specifically looked at subject searching. hildreth differentiated among various types of searches and defined one hundred items as the upper limit for keyword searches and ninety as the upper limit for subject searches.11 larson defined reasonable subject retrieval as between one and twenty items and found that only 12 percent of subject searches retrieved the appropriate number.12 larson is not the only researcher to have reported poor results in subject searching. for more than twenty years, research has demonstrated that subject or topical searches are both popular and problematic. tolle and han found that subject searching is most frequently used and the least successful.13 moore reported that 30 percent of searches were for subject, and matthews et al. found that 59 percent of all searches were for subject information.14 hunter found that 52 percent of all searches were subject searches and that 63 percent of these had zero hits.15 van pulis and ludy referred to alzofon and van pulis's earlier work in 1984 where they reported that 42 percent of all searches were subject searches.16 hildreth found that 62.1 percent of subject searches and 35.4 percent of keyword searches failed.
17 larson categorized the major problems with online catalogs as follows: • users' lack of knowledge of library of congress subject headings (lcsh), • users' problems with mechanical and conceptual aspects of query formulation, • searches that retrieve nothing, • searches that retrieve too much, and • searches that retrieve records that do not match what the user had in mind.18 during an eleven-year longitudinal study, larson found that subject searching was being replaced by keyword searching.19 no consistent pattern in the number of search terms has emerged in the literature. van pulis and ludy reported that user searches were typically single words.20 markey contended that users' search terms frequently matched standardized vocabulary in large catalogs.21 none of markey's researchers consulted lcsh, and only 11 percent of van pulis and ludy's did so, notably in spite of their library's user-education programs. peters reported that lester found that the average search was less than two words and fewer than thirteen characters.22 hildreth found that more than two-thirds of keyword searches included two or more words and 42 percent of these multiple-word searches resulted in zero hits.23 the proportion of zero-hit keyword searches rose with the increasing number of words in the search. subject headings have been a matter of considerable study. gerhan examined catalog records and surmised their accessibility in an online catalog. he contended that when a keyword from the title only is accessed, only 50 percent of all relevant books would be found and that title keywords would lead a user to subject-relevant records in 55 percent of cases while lcsh would lead a user successfully in 85 percent of the cases.
24 in contrast, cherry found that 42 percent of zero-hit subject searches would have been more fruitful as keyword or title searches than by following cross references retrieved from the subject field.25 she recommended converting zero-hit subject queries to other types of subject searches (keyword). thorne and whitlatch recommended that subject searchers should select keyword rather than subject headings as their first access strategy.26 types of problems in subject searches numerous studies have categorized reasons for search failure (typically in zero-hit situations), but peters reports that a standard categorization has not yet been established.27 in cases where more than one error is made in a search (and hunter reported this to be frequent), there is no consistency in how that is assigned. nonetheless, some major categories of problems stand out: • misspelling and typographical errors: peters found that these errors accounted for 20.8 percent of all unsuccessful keyword searches, while henty (reported by peters) concluded that 33 percent of such searches could be attributed to this.28 hunter found that 9.3 percent of subject searches had typographical and spelling errors.29 • keyword search: hunter found 52.6 percent of zero-hit searches used uncontrolled vocabulary terms.30 • wrong source or field: hunter concluded that 4.5 percent of searches should have been done in a source other than the catalog, while 1.3 percent of searches were of the wrong type (an author search in the subject-search option).31 • items not in the database: peters found that searches for items not held in the database accounted for 39.1 percent of unsuccessful searches, while hunter found that problem in only 2.5 percent of the problem cases.32 in addition to these problems, hunter also found that index display and rules relating to the systems accounted for 27 percent of errors.
33 resulting recommendations for change while hildreth stated, "there has been little research on most components of the opac interface" in 1997, he proposed two options to improve user success: increased user training or improved design based on information-seeking behavior.34 wallace pointed out that there is a very short window of opportunity when searchers are amenable to instruction and that successful screen designs should therefore focus on presenting the quick-searching options employed by the majority of users first.35 large and beheshti observed "that too many options simply caused confusion, at least for less experienced opac users," and they summarized that opac-interface research focuses on menu sequence, browsing, and querying.36 menu sequence in terms of menu sequence, hancock-beaulieu indicated that "the menu sequence in which search options are offered will influence user selection."37 ballard found that the amount of keyword searching was affected by its position on the menu.38 scott reported that both keyword- and subject-search success improved when the keyword was placed at the top of the menus.39 thorne and whitlatch used a combination of methods in their study and concluded that several interface changes should be implemented: • strongly encourage novice users to start with keyword (list keyword above subject heading), • relabel "keyword" to "subject or title words," and • relabel "subject heading" to "library of congress subject heading."40 blecic et al. studied transaction logs over six months to track the impact of "simplifying and clarifying" opac introductory screens. after moving the keyword option to the top, keyword searching increased from 13.30 percent to 15.83 percent of all search statements. blecic et al.
found her original tally of 35.05 percent of correct searches having zero hits decreased to 31.35 percent after screen changes.41 querying opac-interface design has been based on an assumption that users come to the catalog knowing what they need to know. in either text-based opac or web-based opac, query-based searches are still mainstream. searchers are required to have knowledge of title, author, or subject. ortiz-repiso and moscoso observed that web-based catalogs, like all library catalogs, basically fulfill two functions: locating works based on known details and identifying which documents in the database cover a given subject.42 natural-language input has long been considered a desirable way to overcome this shortcoming. browsing relevance-ranked output and hypertext were considered by hildreth to be promising in 1997.43 opacs have not been conceived within a true hypertext environment, but rather they maintain the structure of their original formats, principally machine-readable cataloging (marc), and therefore impede the generation of a structure of nodes and links.44 in addition to continuing to employ marc format as its underlying structure, the concept of main entry and added entry, field label, and display logic all reflect cataloging rules. amazon.com and barnes and noble have completely moved away from this century-old structure to provide easy access to book information. in the web environment, the concept of main entry loses its meaning to multiple-access points and linking capabilities of author, subject, and call number. another prominent drawback of web-based opacs is that they have not taken advantage of thesaurus structure and utilized the thesaurus for searching feedback. the hierarchical relationship in lcsh is underutilized in terms of the relationship between terms and associations through related terms. web-based opacs have failed to make use of this important access.
the persistence of these drawbacks in opac-interface design is rooted deeply in cataloging rules that were derived from the manual environment more than a century ago. it reflects the gap between "concepts typically held by nonprofessional users and those used in library practices."45 in her article "why are online catalogs still hard to use?" borgman concludes:

despite numerous improvements to the user interface of online catalogs in recent years, searchers still find them hard to use. most of the improvements are in surface features rather than in the core functionality. we see little evidence that our research on searching behavior studies has influenced online catalog design.46

catalog content

users misunderstand the scope of the catalog. in questionnaire responses, 80 percent of van pulis and ludy's participants indicated they had considered looking elsewhere than the library catalog, as in periodical indexes.47 blazek and bilal reported a request for inclusion of journal-article titles in one response to their questionnaire.48 libraries responded to these requests by acquiring databases on cd-rom, loading them locally (sometimes using the catalog system to mount a separate database), and, most recently, providing access to databases over the internet. however, libraries have seldom responded to these requests by integrating search access through a single front end as the default search.

impact of web search engines

blecic et al. found that keyword searching increased from 13.3 percent to 28.3 percent over their four-year series of logs. at the same time, zero hits in keyword searches increased from 8.71 percent to 20.78 percent while subject zero hits dropped from 23 percent to 13.69 percent. they surmised that the influence of web interfaces might have affected the regression-fluctuation in search syntax, initial articles, and author order.
49

170 information technology and libraries | december 2004

automatically scouts the web for pages that are related to its results so it can find a large number of resources very quickly without requiring the user to select the right keywords. teoma structures the appropriate communities of interest on-the-fly and ranks the results on a range of factors including authorities and hubs (good resources pointing to related resources). google offers an option of "similar pages." while the subject-redirect function in a web-based opac emulates this, it succeeds only if the user's initial search term yielded the right result. opac users have the option of clicking on hyperlinked headings (author, title, subject headings) but cannot ask the system to perform a more sophisticated search on their behalf.

user-popularity tracking

amazon and barnes and noble web sites present enhanced information about items by user-popularity tracking. circulation statistics or user comments could serve as a form of "recommender system" to help novices narrow their selections. messages such as "other students who checked this book out also read these books" could be dynamically inserted in bibliographic records.
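the "also read these books" message suggested above could be derived from circulation records roughly as follows. this is a minimal sketch, assuming loans are available as (patron id, item id) pairs; the also_borrowed function, its parameters, and the sample identifiers are hypothetical and do not describe any system named in the text.

```python
from collections import Counter, defaultdict


def also_borrowed(loans, item_id, top_n=3):
    """Return up to top_n items most often borrowed by the patrons who
    also borrowed item_id, ranked by how many such patrons share them."""
    # Group the items each patron has borrowed.
    items_by_patron = defaultdict(set)
    for patron_id, borrowed_item in loans:
        items_by_patron[patron_id].add(borrowed_item)

    # Count co-borrowed items across every patron who borrowed item_id.
    counts = Counter()
    for items in items_by_patron.values():
        if item_id in items:
            counts.update(items - {item_id})

    # Sort by descending count, then by item id so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [item for item, _ in ranked[:top_n]]


# Made-up sample data for illustration only.
loans = [
    ("p1", "book-a"), ("p1", "book-b"),
    ("p2", "book-a"), ("p2", "book-b"), ("p2", "book-c"),
    ("p3", "book-a"), ("p3", "book-c"),
    ("p4", "book-d"),
]
print(also_borrowed(loans, "book-a"))
```

a production version would need to address patron privacy, for example by working only from anonymized or aggregated circulation counts; the sketch ignores that concern.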
users could also be allowed to provide comments on materials in the catalog, thus providing an interactive experience for opac users.

summary of web features

there are positive and negative impacts of web search engines and online bookstores on web-based opac users. users who find web pages to be comfortable, easy, and familiar may make greater use of web-based opacs. while they bring with them their knowledge of search engines, they also bring their misperceptions. the possibility of using similar tools to those found on web search engines can greatly "reinforce the usefulness of the catalog as well as the positive perception that the end user has of it."61 given the diversity of the errors that users experience, a combination of approaches is necessary to improve their search success. automatic mapping of free-text-to-thesaurus terms, translation of common spelling mistakes, and links to related pages are tools already in use in the web search engines. "see similar pages," extensive use of relevance feedback, and popularity tracking along with natural language are less common.

recommendations for web-based opacs

the authors' tla revealed a continuing problem with subject-heading searches and showed a trend toward searching topics that are not typically answered in a book catalog. the former problem has a well-documented history, while the authors believe the latter problem stems from the influence of the web and web search engines. several changes to typical opacs are recommended to address the trends observed in the course of this study.

metasearching

the recent trend of incorporating databases and opacs into a single search reflects the necessity of expanding information resources and simplifying access to resources. this study's empirical results clearly indicate a need to expand this integration into one search.
while some argue that this metasearching will further augment the syntax digression and prevent users from becoming information literate, others believe that metasearching, along with the option of searching each individual database, is an ultimate goal for online search. like it or not, the metasearch technology, also known as federated or broadcast search, "creates a portal that could allow the library to become the one-stop shop their users and potential users find so attractive."65 one-search-for-all cannot solve all problems; however, guiding users to where they are most likely to find results quickly (the quick search) should satisfy the needs of the majority of users.

menu sequence

effective screen design has a positive effect on user success. the menu sequence for search options plays a significant role in user selection. this research and others like it have demonstrated that users choose an option higher rather than lower in a list. too many options "simply cause confusion, at least for less experienced opac users."66

browsing feature

browsing is a natural and effective approach to many information-seeking problems and requires less effort and knowledge on the part of the user. the literature suggests that a great deal of the use of the web relies on known web sites, recommended sites, or return visits to sites recently visited, thus relying on browsing rather than on searching. jenkins, corritore, and wiedenbeck found that domain novices seldom clicked very deep (out and back) while web experts explored more deeply.67 holscher and strube note that hurtienne and wandtke claim that only minimal training is necessary for browsing an individual web site, while pollock and hockley claim that considerably more experience is required for querying and navigating among sites.
68 hancock-beaulieu found that between 30 percent and 45 percent of all online searches, regardless of the type of search, are concluded with browsing the library shelves.69

to implement user help through tips or tactics selected and accumulated from a collection of common user-search mistakes. in such a case, the system would play a more active role by generating relevant search tips on the fly and using zero-hits search results as a basis for generating a spell check or suggesting alternate wording. an ideal scenario is that the opac allows the user to pursue multiple avenues of an inquiry by entering fragments of the question, exploring vocabulary choices, and reformulating the search with the assistance of various specialized intelligent assistants. borgman suggests that an opac should be judged by whether the catalog answers questions rather than merely matches queries. she suggests the need to design systems that are based on behavioral models of how people ask questions, arguing that users still need to translate their question into what a system will accept.73

user instruction

onsite training and online documentation can help make it easier to use the opac. with the advent of information literacy, library instruction has shifted from procedure-based query formulation to question-being-answered. at csula, instruction for entry-level classes focuses on formulating a research statement and then identifying keywords and alternate terms. the instruction sessions that follow the initial-concept formulation are short and focus on how to enter keyword or subject, author, and title, and the use of boolean operators.
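the idea raised earlier in this section, using a zero-hits result as the trigger for a spell check or alternate wording, can be sketched with approximate string matching from python's standard difflib module. this is only an illustration under stated assumptions: the search_with_suggestions function, the dictionary index, the record identifiers, and the 0.75 similarity cutoff are all hypothetical choices, not a description of any catalog in the study.

```python
import difflib


def search_with_suggestions(query, index, max_suggestions=3, cutoff=0.75):
    """Look up a term in a term -> records index. On zero hits, return
    close spellings from the index vocabulary instead of an empty screen.

    Returns a (hits, suggestions) pair; suggestions is empty when the
    search succeeds."""
    key = query.strip().lower()
    hits = index.get(key, [])
    if hits:
        return hits, []
    # get_close_matches ranks vocabulary terms by similarity to the query
    # and keeps only those scoring at or above the cutoff.
    suggestions = difflib.get_close_matches(
        key, list(index), n=max_suggestions, cutoff=cutoff
    )
    return [], suggestions


# Made-up index for illustration only.
index = {
    "psychology": ["rec-101", "rec-102"],
    "physiology": ["rec-201"],
    "philosophy": ["rec-301"],
}

hits, suggestions = search_with_suggestions("psycology", index)
print(hits, suggestions)  # no hits; "psychology" is the top suggestion
```

the cutoff trades recall for precision: a lower value offers more candidate terms on a failed search but also more irrelevant ones.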
this instructional approach may improve success until the systems provide the tools to improve search strategies or accept an untrained user's input. as an increasing number of users access online library catalogs remotely, assistance needs to be embedded into intuitive systems. "time invested in elaborate help systems often is better spent in redesigning the user interface so that help is no longer needed."74 users are not willing to devote much of their time to learning to use these systems. they just want to get their search results quickly and expect the catalog to be easy to use with little or no time invested in learning the system.

conclusion

the empirical study reported in this paper indicates that progress has been made in terms of increasing search success by improving the opac search interface. the goal is to design web-based opac systems for today's users, who are likely to bring a mental model of web search engines to the library catalog. web-based opacs and web search engines differ in terms of their systems and interface design. however, in most cases, these differences do not result in different search characteristics by users. research findings on the impact of web search engines and user searching expectations and behavior should be adequately utilized to guide the interface design. web users typically do not know how a search engine works. therefore, fundamental features in the design of the next generation of the opac interface should include changing the search to allow natural-language searching with keyword search first, and focusing on meeting the quick-search need. such a concept-based search will allow users to enter natural language of their chosen topic in the search box while the system maps the query to the structure and content of the database.
relevance feedback to allow the system to bring back related pages, spelling correction, and relevance-ranked output remain key goals for future opacs.

references and notes

1. sharon seymour, "online public-access catalog user studies: a review of research methodologies, march 1986-november 1989," library and information science research 13 (1991): 89-102; andrew large and jamshid beheshti, "opacs: a research review," library and information science research 19 (1997): 2, 111-33.
2. ibid., 113-16.
3. ibid., 116-20.
4. thomas a. peters et al., "an introduction to the special section on transaction-log analysis," library hi tech 11 (1993): 2, 37.
5. thomas a. peters, "the history and development of transaction-log analysis," library hi tech 11 (1993): 2, 56.
6. pauline a. cochrane and karen markey, "catalog use studies since the introduction of online interactive catalogs: impact on design for subject access," in redesign of catalogs and indexes for improved subject access: selected papers of pauline a. cochrane (phoenix: oryx, 1985), 159-84; steven a. zink, "monitoring user success through transaction-log analysis: the wolfpac example," reference services review 19 (spring 1991): 44956; michael k. buckland et al., "oasis: a front end for prototyping catalog enhancements," library hi tech 10 (1992): 7-22.
7. thomas a. peters, "when smart people fail: an analysis of the transaction log of an online public-access catalog," journal of academic librarianship 15 (1989): 5, 267.
8. michael k. buckland et al., "oasis," 7-22.
9. deborah d. blecic et al., "using transaction-log analysis to improve opac retrieval results," college and research libraries (jan. 1998): 48.
10. peters, "history and development of transaction-log analysis," 2, 52.
11. charles r. hildreth, "the use and understanding of keyword searching in a university online catalog," information technology and libraries 16 (1997): 6.
12. ray r.
larson, "the decline of subject searching: long term trends and patterns of index use in an online catalog," journal of the american society for information science and technology 42 (1991): 3, 210.
13. john e. tolle and sehchang hah, "online search patterns: nlm catline database," journal of the american society for information science 36 (mar. 1985): 82-93.
14. carol weiss moore, "user reaction to online catalogs: an exploratory study," college and research libraries 42 (1981): 295-302; joseph r. matthews et al., using online catalogs: a nationwide survey, a report of a study sponsored by the council on library resources (new york: neal-schuman, 1983), 144.
15. rhonda n. hunter, "success and failures of patrons searching the online catalog at a large academic library: a transaction-log analysis," rq 30 (spring 1991): 399.
16. noelle van pulis and lorne e. ludy, "subject searching in an online catalog with authority control," college and research libraries 49 (1988): 526.
17. hildreth, "the use and understanding of keyword searching," 6.
18. larson, "the decline of subject searching," 3, 60.
19. ibid.
20. van pulis and ludy, "subject searching in an online catalog," 527.
21. karen markey, research report on the process of subject searching in the library catalog: final report of the subject access research project (report no. oclc/opr/rr-83-1) (dublin, ohio: oclc online computer library center, 1983), 529.
22. peters, "the history and development of transaction-log analysis," 2, 43.
23. hildreth, "the use and understanding of keyword searching," 8-9.
24. david r. gerhan, "lcsh in vivo: subject searching performance and strategy in the opac era," journal of academic librarianship 15 (1989): 86-87.
25. joan m.
cherry, "improving subject access in opacs: an exploratory study of conversion of users' queries," journal of academic librarianship 18 (1992): 2, 98.
26. rosemary thorne and jo bell whitlatch, "patron online catalog success," college and research libraries 55 (1994): 496.
27. peters, "the history and development of transaction-log analysis," 2, 48.
28. ibid.
29. hunter, "success and failures," 400.
30. ibid., 399.
31. ibid., 400.
32. peters, "the history and development of transaction-log analysis," 2, 56.
33. hunter, "success and failures," 400.
34. hildreth, "the use and understanding of keyword searching," 6.
35. patricia m. wallace, "how do patrons search the online catalog when no one's looking? transaction-log analysis and implications for bibliographic instruction and system design," rq 33 (winter 1993): 3, 249.
36. large and beheshti, "opacs: a research review," 125.
37. m. m. hancock-beaulieu, "online catalogue: a case for the user," in the online catalogue: developments and directions, c. hildreth, ed. (london: library association, 1989), 25-46.
38. terry ballard, "comparative searching styles of patrons and staff," library resources and technical services 38 (1994): 293-305.
39. jane scott et al., "@*&#@ this computer and the horse it rode in on: patron frustration and failure at the opac" (in "continuity and transformation: the promise of confluence": proceedings of the acrl 7th national conference, chicago: acrl 1995), 247-56.
40. thorne and whitlatch, "patron online catalog success," 496.
41. blecic et al., "using transaction-log analysis," 46.
42.
virginia ortiz-repiso and purificacion moscoso, "web-based opacs: between tradition and innovation," information technology and libraries 18, no. 2 (june 1999): 68-69.
43. hildreth, "the use and understanding of keyword searching," 6.
44. ortiz-repiso and moscoso, "web-based opacs," 71.
45. ibid., 75.
46. christine borgman, "why are online catalogs still hard to use?" journal of the american society for information science 47 (1996): 7, 501.
47. van pulis and ludy, "subject searching in an online catalog," 53.
48. blazek and bilal, "problems with opac: a case study of an academic research library," rq 28 (winter 1988): 175.
49. deborah d. blecic et al., "a longitudinal study of the effects of opac screen changes on searching behavior and user success," college and research libraries 60, no. 6 (nov. 1999): 524, 527.
50. bernard j. jansen and udo pooch, "a review of web searching studies and a framework for future research," journal of the american society for information science and technology 52 (2001): 3, 249-50.
51. ibid., 250.
52. blazek and bilal, "problems with opac: a case study," 175; moore, "user reaction to online catalogs," 295-302.
53. m. j. bates, "the design of browsing and berry-picking techniques for the online search interface," online review 13 (1989): 5, 407-24.
54. jansen and pooch, "a review of web searching studies," 238.
55. judy luther, "trumping google? metasearching's promise," library journal 128 (2003): 16, 36.
56. jack muramatsu and wanda pratt, "transparent queries: investigating users' mental models of search engines," research and development in information retrieval, sept. 2001. accessed mar. 10, 2003, http://citeseer.nj.nec.com/muramatsuoltransparent.html.
57. jansen and pooch, "a review of web searching studies," 235.
58. luther, "trumping google," 36.
59. blecic et al., "a longitudinal study of the effects of opac screen changes," 527.
60. susan m. colaric, "instruction for web searching: an empirical study," college and research libraries news 64 (2003): 2.
61. a. g. sutcliff, m. ennis, and s. j. watkinson, "empirical studies of end-user information searching," journal of the american society for information science and technology 51 (2000): 13, 1213.
62. "all about google," google. accessed dec. 10, 2003, www.google.com.
63. g. salton, introduction to modern information retrieval (new york: mcgraw-hill, 1983), 18.
64. ortiz-repiso and moscoso, "web-based opacs," 71.
65. luther, "trumping google," 37.
66. maaike d. kiestra et al., "end-users searching the online catalogue: the influence of domain and system knowledge on search patterns. experiment at tilburg university," the electronic library 12 (dec. 1994): 335-43.
67. c. jenkins et al., "patterns of information seeking on the web: a qualitative study of domain expertise and web expertise," it and society 1 (winter 2003): 3, 74, 77. accessed may 10, 2003, www.itandsociety.org/.
68. c. holscher and g. strube, "web search behavior of internet experts and newbies," 9th international world wide web conference (amsterdam, 2000). accessed mar. 28, 2003, www9.org/w9cdrom/81/81.html; a. pollock and a. hockley, "what's wrong with internet searching," d-lib magazine (mar. 1997). accessed may 10, 2003, www.dlib.org/dlib/march97/bt/03pollock.html.
69. m. m. hancock-beaulieu, "online catalogue: a case for the user," 25-46.
70. wilbert o. galitz, the essential guide to user interface design: an introduction to gui design principles and techniques (chichester, england: wiley, 1996).
71. juliana chan, "an evaluation of displays of bibliographic records in opacs in canadian academic and public libraries," mis report, univ. of toronto, 1995. [025.3132 c454e]
72.
giorgio brajnik et al., "strategic help in user interfaces for information retrieval," journal of the american society for information science and technology (jasist) 53 (2002): 5, 344.
73. borgman, "why are online catalogs still hard to use?" 500.
74. ibid.

circulation systems past and present*

maurice j. freedman: school of library service, columbia university, new york city.

a review of the development of circulation systems shows two areas of change. the librarian's perception of circulation control has shifted from a broad service orientation to a narrow record-keeping approach and recently back again. the technological development of circulation systems has evolved from manual systems to the online systems of today. the trade-offs and deficiencies of earlier systems in relation to the comprehensive services made possible by the online computer are detailed.

in her 1975 library technology reports study of automated circulation control systems, barbara markuson contrasted what she called "older" and "more recent" views of the circulation function. the "older" or traditional view was that circulation control centered on conservation of the collection and recordkeeping. the "more recent" attitude encompasses "all activities related to the use of library materials."1 it appears that this latter outlook is not as new as markuson had suggested. in 1927, jennie m. flexner's circulation work in public libraries described the work of circulation as the "activity of the library which through personal contact and a system of records supplies the reader with the [materials] wanted."2 flexner went on to characterize four major functions of circulation as follows: (1) the staff must know the books in the collection, and have a working familiarity with them. (2) the staff must know the readers; their wants, interests, etc.
(3) the circulation staff must fully understand the library mission and policies and work harmoniously with those in related departments. (4) the circulation department has its own particular duty to perform .... effective routines and techniques must be established by the library and mastered by the staff if the distribution of books is to be properly accomplished and the public is to have the fullest use of the resources of the institution. the library must be able to locate books, on the shelves or in circulation; to know who is using material and how the reader can be traced, if he is misusing or unduly withholding the books drawn.3

*this article is adapted from a speech delivered at rutgers university. manuscript received november 1980; revised may 1981; accepted july 1981.

the function of circulation has not changed since flexner's description. even within the context of online circulation systems, it is absolutely essential that the circulation system be seen in as broad a context as possible. it is not merely an electromechanical phenomenon staffed by automaton clerks. circulation services involve that function which is ultimately one of the most fundamental: the satisfactory bringing together of the library user and the materials sought by that person. it follows, then, that the mechanism and means of delivery and control of the service are only a small part, and certainly not the most important part, of the circulation function. knowing your collection, your readers, and clearly knowing your library's mission are crucial prerequisites for the effective circulation of library materials.

an examination of the history of circulation systems and their evolution to the present state reveals the change in outlook from a narrow view of the circulation function to a broader view. let us begin by establishing the basic elements of record keeping, upon which circulation control is based. there are three categories of records:
1.
for the collection of materials, books, tapes, microforms, etc., comprising the library.
2. for the readers or users of the library service.
3. for the wedding or concatenation of the first two, i.e., the library user's use or borrowing of the library's materials.

a minimal circulation model is a set of procedures or recordkeeping with respect to only the third category, i.e., records of the materials held by the library user outside of the library. a total or complete system would then be one that provides for all three categories. using these criteria to judge the level of control provided by the various circulation systems of the past, let us review.

the earliest method of circulation control was the chain method. in this case, "circulation" is not an accurate term; "use" of materials is more appropriate, as the collection did not circulate. books were chained to the wall and the user did not take the material outside of the library. the minimal circulation model is not met, and records were not required.

several hundred years later, the ledger system's first iteration involved a simple notation into a ledger. the identification of the book (call number and/or author and title) and the borrower's identification were recorded. upon the return of the book, the borrower or the receiving clerk initialed the ledger entry or otherwise indicated the return of the item. minimal circulation control is met.

280 journal of library automation vol. 14/4 december 1981

a more developed or sophisticated ledger system exceeded this minimal circulation model. the new ledger had each page headed by a different borrower or registration number. consequently, a given user had all of his or her charges recorded on the given page indicated by the user's number. the economy of not having to write the borrower's name for every transaction was made possible through the creation of a file of patron records linked to the ledger page by common registration numbers.
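the improved ledger just described, with one page per registration number backed by a separate patron file, can be sketched as a small data structure. this is an illustrative model only: the Charge and NumberedLedger names, their fields, and the sample call numbers are assumptions made for the example, not a reconstruction of any historical ledger's layout.

```python
from dataclasses import dataclass, field


@dataclass
class Charge:
    call_number: str
    date_issued: str       # ISO date string, e.g. "1981-12-01"
    returned: bool = False


@dataclass
class NumberedLedger:
    # Master file: registration number -> patron name, written once.
    patrons: dict = field(default_factory=dict)
    # Ledger pages: registration number -> that borrower's charges.
    pages: dict = field(default_factory=dict)

    def register(self, number, name):
        self.patrons[number] = name
        self.pages.setdefault(number, [])

    def charge(self, number, call_number, date_issued):
        if number not in self.patrons:
            raise KeyError(f"no patron registered under number {number}")
        self.pages[number].append(Charge(call_number, date_issued))

    def charges_for(self, number):
        """Fast path: one page lists everything a borrower has out."""
        return [c.call_number for c in self.pages.get(number, []) if not c.returned]

    def who_has(self, call_number):
        """Slow path: the ledger has no ordering by book, so every page
        may have to be scanned to find the borrower of one item."""
        for number, charges in self.pages.items():
            for c in charges:
                if c.call_number == call_number and not c.returned:
                    return number
        return None


ledger = NumberedLedger()
ledger.register(17, "a. reader")
ledger.charge(17, "Z699 .F74", "1981-12-01")
print(ledger.charges_for(17), ledger.who_has("Z699 .F74"))
```

the contrast between charges_for and who_has reflects the trade-off in the text: access by borrower is direct, while access by book requires a scan of the whole ledger.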
in effect, this patron file was our first "automation." the use of a master file in support of a numbered page provided information that had previously been handwritten every time someone wished to borrow books from the library. the new ledger system also allowed for a more orderly control of charges. only the borrower's number was needed to get at the page of transactions relating to that borrower, as opposed to the former method (a benchmark method, in a sense) in which the transactions were chronologically entered and had no other ordering whatsoever. even with the improved ledger system, though, the only ordering was by borrower number and date of issue to the borrower. there was no arrangement that provided for sequencing or finding the books borrowed.

the need to identify borrowed books led to the dummy system. every book had a concomitant dummy book (or large card) that had a ruled sheet of paper with the book identification information on it and space for the borrower's name and/or number. when a user wished to borrow a book, the dummy was pulled from a file and the borrower information was written on the sheet of paper. the dummy was then filed on the shelf occupying the space formerly occupied by the book itself. when the book was returned, it was reshelved, the dummy removed, and the circulation transaction was crossed out.

this system is interesting in that it provides for a complete inventory control. either all items are on the shelf in proper sequence or a physical surrogate or record for circulating items is substituted and placed in proper sequence. one has instant and, in effect, "online" access to the presence or absence of materials if one has the call number and can go to the shelf. unlike most systems that can only tell whether or not the book is present, the dummy system tells who has the book and when it was charged. in terms of a minimal model, this system provided less and more than the ledger system.
if a reader wanted a list of books he or she borrowed, the reader would have to view every dummy and see if the listed item was charged to him or her. in contrast, the ledger system served such a request well, though every page of the ledger might have to be examined to find out who had borrowed a book not found on the shelf.

leaping past several systems, let us now discuss the newark system, the overwhelmingly prevalent system in the united states today (if we include the mechanical or electromechanical versions of dickman, gaylord (the manual, not automated), and demco). the newark system incorporated the best features of the systems already mentioned. a separate registration file was kept which provided both alphabetic access by patron and numeric access by patron registration number. consequently, the recording of the borrower's identification during circulation transactions only involved the notation of the number. for book identification, a card and matching pocket were placed in each book with the call number and/or author-title identification information. the circulation transaction involved the removal of the card from the pocket and the entering on it, à la the dummy system, of the date of the transaction and the borrower number. the cards for all of the books borrowed on a given day were aggregated and filed in shelflist sequence in a tray headed by the date of the transactions. resorting to computer jargon, the major or primary sort of the book cards (read circulation cards) was by date, but the minor sort was by call number. consequently, if one wanted to know the status of a given book and one had the call number, it would not take too long to search, even with a file as large as the one in the main branch of newark public library, by looking for the item in all of the different days' charges.
when a book was returned, the clerk noted, from the date-of-issue card inserted in the book's pocket, the tray in which to search; the matching call number on the pocket was then used for discharging the book, i.e., removing the charge card from the tray and replacing it in the book. the combination of the books on the shelf plus the cards in the different trays in shelflist order constituted a complete inventory. additionally, the trays of cards comprised a comprehensive record of all current charges, i.e., all transactions by date, call number, and borrower, with borrower number pointing to fuller information in the registration file.

looking back at our basic model, the newark system offered not just the minimum (a record of the item and the borrower who took it) but also introduced a major step toward inventory control. there was an inventory sequence involved, or, more accurately, several inventory sequences, one for each given collection (or day) of circulation transactions. what was still missing was a record by borrower of what was charged to him or her. in the original newark system, the borrower's card had entered upon it dates of issue and return of items. this way, even if the library could not tell the user what items (s)he had, the user's card would reflect the number of items outstanding.

the handling of reserves, renewals, and overdue notices occurred as follows: a colored clip or some indicator on a circulation card would be used to indicate a reserve. a renewal would be handled the same as a return except the person would wait while the charge card was pulled from the appropriately dated tray, and assuming that no reserves had been placed on the circulation card, the book would be recharged (i.e., renewed) to the borrower. overdues automatically presented themselves by default. cards left in a tray after a predetermined number of days represented charges for which overdues were to be sent.
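the newark filing scheme described above (primary sort by date of issue, minor sort by call number, overdues surfacing as cards left in old trays) can be sketched as follows. the NewarkTrays class and its method names are hypothetical, and the 14-day loan period is an assumption standing in for the "predetermined number of days" in the text.

```python
from bisect import bisect_left, insort
from datetime import date, timedelta


class NewarkTrays:
    """One tray per date of issue; cards in each tray kept in call-number order."""

    def __init__(self, loan_days=14):  # loan period is an assumed value
        self.loan_days = loan_days
        self.trays = {}  # date of issue -> sorted list of (call_number, borrower_number)

    def charge(self, call_number, borrower_number, issued):
        # File the card into the tray for its date, keeping shelflist order.
        insort(self.trays.setdefault(issued, []), (call_number, borrower_number))

    def _position(self, cards, call_number):
        # Binary search within one tray; returns an index or None.
        i = bisect_left(cards, (call_number,))
        if i < len(cards) and cards[i][0] == call_number:
            return i
        return None

    def find(self, call_number):
        """Status lookup: search each day's tray by call number."""
        for issued, cards in self.trays.items():
            i = self._position(cards, call_number)
            if i is not None:
                return issued, cards[i][1]
        return None

    def discharge(self, call_number, issued):
        """Return: the date slip in the book names the tray to search."""
        cards = self.trays.get(issued, [])
        i = self._position(cards, call_number)
        if i is None:
            return False
        cards.pop(i)
        if not cards:
            del self.trays[issued]
        return True

    def overdue(self, today):
        """Cards still sitting in trays older than the loan period."""
        cutoff = today - timedelta(days=self.loan_days)
        return sorted(
            (issued, call, borrower)
            for issued, cards in self.trays.items()
            if issued < cutoff
            for call, borrower in cards
        )


trays = NewarkTrays()
trays.charge("QA76 .B3", 7, date(1981, 11, 1))
trays.charge("PS3545 .A1", 9, date(1981, 11, 1))
trays.charge("Z699 .F74", 7, date(1981, 11, 20))
print(trays.find("Z699 .F74"))
print(trays.overdue(date(1981, 11, 25)))
```

note what the structure cannot answer cheaply: listing everything charged to one borrower requires reading every card in every tray, which is the gap the text identifies in the newark system.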
the tray was taken to the registration file and the numerically sequenced registration cards for the delinquent borrowers removed so that notices could be prepared and sent. then the registration slips and circulation cards had to be refiled at the completion of the process.

essentially, most subsequent systems are variants on the newark system. the mcbee key-sort system involves the use of cards with prepunched holes around the edges, one of which can be notched to indicate the date an item is due. the cards are arranged by call number, creating a single sequence. the insertion of a knitting-needle-like device through a given hole will allow all of the books overdue for a given date to fall free of the deck. this system is like the newark system in that it has inventory and date access, but unlike newark it places a horrible burden on the borrower. each card has (written by the borrower) the borrower's name and address and the call number, author, and title of the book. thus, the library is saved the labor of creating circulation cards and maintaining registration records for every patron: all of the information needed is on the charge card. but here, as marvin scilken has pointed out, the burden of the library's tasks is merely passed on to the users. this point should be emphasized.

the next system to be considered is the photo-charge system. microphotos are taken of the borrower's card, which has the name and address on it, the book card (as in the newark book identification card), and a sequentially numbered date-of-issue or date-due slip. again, as with the mcbee, since the photo record includes the borrower's name and address, one can throw away registration files. also, a list or range of transaction numbers is kept by date used.
since the numbered date-of-issue slip is placed in the book at the time of charging, and one removes it when the book is returned, it is a simple step to cross off or remove the number on the slip from its corresponding duplicate on the list of numbers for that day's transactions. overdue transactions are found by searching for unchecked transaction numbers on the numerically sequenced microfilm. this system does meet the criterion of the minimal model, a record of the user's use of the item. in terms of labor intensity, one has replaced the maintenance of charge-card files and registration files with a single microfilm record. reserves, though, are terribly time-consuming with the photo-charge system: each returned book, before it can be returned to the shelf or renewed, must be searched against a call-numbered sequence of reserve cards. academic libraries would not use this kind of system because call-number access is a necessity, especially in relation to recalls of long-loaned items. the elimination of paper files is what so commended this system to public libraries over the newark-based systems. but, as was noted, one has virtually no way of determining who took a book out or when it is due back except, in principle, by searching all of the reels of microfilm.

some variants on this microfilm system were developed. bro-dart marketed a system that thermographically produced eye-readable records instead of microimages.

such was the state of circulation systems before computers began to be used. the following discussion of the involvement of computers can be separated by the type of hardware: main frames, minicomputers, and microcomputers. the main-frame computer has been used primarily in the past as a processing unit for batches of circulation transactions collected and fed to it via punched cards, terminals, or minicomputers.
call number, author and title (albeit brief), and user identification number were captured for each transaction. in the 1960s and into the early 1970s, this information would be batch-processed by the computer and a variety of reports would be produced. what the computer does, then, is keep track of numbers, their ranges, and the dates of the ranges. but the computer can do much more than this. it is capable, as none of the nonautomated systems were, of rearranging the data input and then comparing and tabulating them as desired and appropriate. consequently, the fact that the call number, author, and title are stored by the machine means that lists or files can be arranged by any of these elements. the same goes for date of transaction. as to borrower identification number, a master file much like the newark registration file is kept (only now in its machine-readable form), and the computer does the comparing at high speed instead of the clerk taking the charge record and going to the numeric file to find the name and address of the borrower. of course, the computer can then readily and quickly print out overdue notices with an obvious absence of clerical support and labor intensity. as we all know, the rate of increase of labor costs is increasing, and the rate of increase of computer costs is decreasing.

two kinds of large computer systems have been used. the first is the batch-oriented one, which kept track either of items in circulation only (the absence system: only items absent from the collection were tracked) or of the entire collection (the inventory system).4 normally, identification numbers were used for patrons in either system. the second, although relatively rare in academic and public libraries, is the mainframe-based online system. ohio state university is famous for its online system. what is meant by online is that all transactions are immediately recorded and all files are instantly updated.
printing is still necessary for overdue notices, but printed circulation lists are not necessary because of the online answers to queries regarding books or patrons now possible through terminals distributed to appropriate locations.

the minicomputers came on the scene in two stages. clsi's entrance in 1973 utilized one of the early minicomputers, quite small by today's standards. for relatively small libraries that had not begun to dream of having their own computers, it became possible to have an entire inventory (in abbreviated form) and an entire patron file online. consequently, all of the access power of the newark system, and none of its labor intensity, was available online and much more besides. few libraries could afford the main-frame system of ohio state, but many could pay for clsi's, and indeed they did.

in the last few years, minicomputers have grown several magnitudes above the capacity and speed of main-frame computers of the 1960s. consequently, such firms as dataphase, systems control, geac, gaylord, and others offer these larger minis, which can now support online the needs of large branch systems with inventories of hundreds of thousands of books. incidentally, clsi, with a new mini line, can do this now as well.

both the mini- and maxi-based systems do all of the basic work originally outlined: the whole inventory can be accessed online or with printed lists arranged by author, title, or call number (and, presently, some vendors offer online subject access and cross-references); access can also be made by patron's name. further, the basic transaction (item, borrower, and date) is recorded and checked for holds or delinquency before it is accepted.
without overly extolling the present state of the art, it should be said that all of the information identified as important in the earliest systems is now not only available in a far quicker and more usable fashion but can also be manipulated by the machine in a variety of ways to meet and serve management objectives not considered practicable in the past. peter simmons showed how collection development could be aided by automatically generating purchase orders when reserves exceeded a specified acceptable level.5 all kinds of statistical data regarding collection and patron use can be generated that could not have been possible in a manual mode. while at the university of southwestern louisiana, william mcgrath was able to adjust book budget allocations in terms of collection use and undergraduate major in a most interesting fashion.6 the net result was an empirically based expenditure of book funds.

now the microcomputer or microprocessor is the newly emerging phenomenon, and in many respects it is not unlike the minicomputer of the early 1970s. it is being used to perform single data-recording functions, and is also being seen as the link to the larger computer.

so we have moved from chained books to microcomputers the size of a desk top. originally, a great deal of information was captured at great expense and laboriously maintained. certainly the handwritten and typed records of the newark system, although relatively comprehensive, were obtained and preserved at great cost. and, despite it all, there were real limitations of access. the succeeding mcbee and photo-charging systems appreciably cut out-of-pocket costs to the library, but either passed labor directly on to the user or eliminated access altogether. book or patron access is virtually impossible with the photo-charging method. simply put, that system tells what is overdue, and that's all. the entry in the 1960s of the computer radically altered the ground rules.
now all sequences of encoded elements are possible, and management information can be derived. important statistical data pertaining to collection use and library users can be obtained by further manipulating the data accumulated in the circulation process. it is now possible for all but the smallest and the very largest libraries to have access to and control of their materials through the current range of minicomputers on the market.

jennie flexner told us that circulation had to be more than maintenance and record keeping of loan and borrower transactions. through advances in computer technology and its application to circulation control, we have finally seen what seems to be an optimization of the recordkeeping process and, by extension, an improvement in circulation service. if instantaneous access to patron files, inventory files, and outstanding transaction files through a variety of modes and computer-developed management data does not constitute that optimization, it will have to do until the real thing comes along.

acknowledgment

the author is deeply indebted to susan e. bourgault for her editorial assistance.

references

1. barbara evans markuson, "automated circulation control," library technology reports (july and sept., 1975), p.6.
2. jennie m. flexner, circulation work in public libraries (chicago: american library assn., 1927), p.1.
3. ibid., p.2.
4. robert mcgee, "two types of design for online circulation systems," journal of library automation 5:185 (sept. 1972).
5. peter simmons, collection development and the computer (vancouver, b.c.: univ. of british columbia, 1971), 60p.
6. william e. mcgrath, "a pragmatic allocation formula for academic and public libraries with a test for its effectiveness," library resources & technical services 19:356-69 (fall 1975).

maurice j. freedman is an associate professor at the school of library service, columbia university, new york city.
bailey 116 information technology and libraries | september 2006

three critical issues are examined in detail and their potential impact on libraries is assessed: a dramatic expansion of the scope, duration, and punitive nature of copyright laws; the ability of digital rights management (drm) systems to lock down digital content in an unprecedented fashion; and the erosion of net neutrality, which ensures that all internet traffic is treated equally. how legislatures, the courts, and the commercial marketplace treat these issues will strongly influence the future of digital information for good or ill.

editor's note: this article was submitted in honor of the fortieth anniversaries of lita and ital.

blogs. digital photo and video sharing. podcasts. rip/mix/burn. tagging. vlogs. wikis. these buzzwords point to a fundamental social change fueled by cheap personal computers (pcs) and servers, the internet and its local wired/wireless feeder networks, and powerful, low-cost software. citizens have morphed from passive media consumers to digital-media producers and publishers.

libraries and scholars have their own set of buzzwords: digital libraries, digital presses, e-prints, institutional repositories, and open-access (oa) journals, to name a few. they connote the same kind of change: a democratization of publishing and media production using digital technology.

it appears that we are on the brink of an exciting new era of internet innovation: a kind of digital utopia. gary flake of microsoft has provided one striking vision of what could be (with a commercial twist) in a presentation entitled “how i learned to stop worrying and love the imminent internet singularity,” and there are many other visions of possible future internet advances.1

when did this metamorphosis begin? it depends on who you ask. let’s say the late 1980s, when the internet began to get serious traction and an early flowering of noncommercial digital publishing occurred.
in the subsequent twenty-odd years, publishing and media production went from being highly centralized, capital-intensive analog activities with limited and welldefined distribution channels, to being diffuse, relatively low-cost digital activities with the global internet as their distribution medium. not to say that print and conventional media are dead, of course, but it is clear that their era of dominance is waning. the future is digital. nor is it to say that entertainment companies (e.g., film, music, radio, and television companies) and information companies (e.g., book, database, and serial publishers) have ceded the digital-content battlefield to the upstarts. quite the contrary. high-quality, thousand-page-per-volume scientific journals and hollywood blockbusters cannot be produced for pennies, even with digital wizardry. information and entertainment companies still have an important role to play, and, even if they didn’t, they hold the copyrights to a significant chunk of our cultural heritage. entertainment and information companies have understood for some time that they must adapt to the digital environment or die, but this change has not always been easy, especially when it involves concocting and embracing new business models. nonetheless, they intend to thrive and prosper—and to do whatever it takes to succeed. as they should, since they have an obligation to their shareholders to do so. the thing about the future is that it is rooted in the past. culture, even digital culture, builds on what has gone before. unconstrained access to past works helps determine the richness of future works. inversely, when past works are inaccessible except to a privileged minority, future works are impoverished. this brings us to a second trend that stands in opposition to the first. 
put simply, it is the view that intellectual works are property; that this property should be protected with the full force of civil and criminal law; that creators have perpetual, transferable property rights; and that contracts, rather than copyright law, should govern the use of intellectual works.

a third trend is also at play: the growing use of digital rights management (drm) technologies. when intellectual works were in paper (or other tangible forms), they could only be controlled at the object-ownership or object-access levels (a library controlling the circulation of a copy of a book is an example of the second case). physical possession of a work, such as a book, meant that the user had full use of it (i.e., the user could read the entire book and photocopy pages from it). when works are in digital form and are protected by some types of drm, this may no longer be true. for example, a user may only be able to view a single chapter from a drm-protected e-book and may not be able to print it.

strong copyright + drm + weak net neutrality = digital dystopia?
charles w. bailey jr.

charles w. bailey jr. (cbailey@digital-scholarship.com) is assistant dean for digital library planning and development at university of houston libraries.

the fourth and final trend deals with how the internet functions at its most fundamental level. the internet was designed to be content-, application-, and hardware-neutral. as long as certain standards were met, the network did not discriminate. one type of content was not given preferential delivery speed over another. one type of content was not charged for delivery while another was free. one type of content was not blocked (at least by the network) while another was unhindered. in recent years, network neutrality has come under attack.

the collision of these trends has begun in courts, legislatures, and the marketplace. it is far from over.
as we shall see, its outcome will determine what the future of digital culture looks like.

stronger copyright: 1790 versus 2006

copyright law is a complex topic. it is not my intention to provide a full copyright primer here. (indeed, i will assume that the reader understands some copyright basics, such as the notion that facts and ideas are not covered by copyright.) rather, my aim is to highlight some key factors about how and why united states copyright law has evolved and how it relates to the digital problem at hand.

three authors (lawrence lessig, professor of law at the stanford law school; jessica litman, professor of law at the wayne state university law school; and siva vaidhyanathan, assistant professor in the department of culture and communication at new york university) have done brilliant and extensive work in this area, and the following synopsis is primarily based on their contributions. i heartily recommend that you read the cited works in full.

the purpose of copyright

let us start with the basis of u.s. copyright law, the constitution’s “progress clause”: “congress has the power to promote the progress of science and useful arts, by securing for limited times to authors and inventors the exclusive right to their respective writings and discoveries.”2 copyright was a bargain: society would grant creators a time-limited ability to control and profit from their works before they fell into the public domain (where works are unprotected) because doing so resulted in “progress of science and useful arts” (a social good).

regarding the progress clause, lessig notes: it does not say congress has the power to grant “creative property rights.” it says that congress has the power to promote progress.
the grant of power is its purpose, and its purpose is a public one, not the purpose of enriching publishers, nor even primarily the purpose of rewarding authors.3 however, entertainment and information companies can have a far different view, as illustrated by this quote from jack valenti, former president of the motion picture association of america: “creative property owners must be accorded the same rights and protections resident in all other property owners in the nation.”4 types of works covered when the copyright act of 1790 was enacted, it protected published books, maps, and charts written by living u.s. authors as well as unpublished manuscripts by them.5 the act gave the author the exclusive right to “print, reprint, publish, or vend” these works. now, copyright protects a wide range of published and unpublished “original works of authorship” that are “fixed in a tangible medium of expression” without regard for “the nationality or domicile of the author,” including “1. literary works; 2. musical works, including any accompanying words; 3. dramatic works, including any accompanying music; 4. pantomimes and choreographic works; 5. pictorial, graphic, and sculptural works; 6. motion pictures and other audiovisual works; 7. sound recordings; 8. 
architectural works.”6

rights

in contrast to the limited print publishing rights inherent in the copyright act of 1790, current law grants copyright owners the following rights (especially notable is the addition of control over derivative works, such as a play based on a novel or a translation):

- to reproduce the work in copies or phonograph records;
- to prepare derivative works based upon the work;
- to distribute copies or phonograph records of the work to the public by sale or other transfer of ownership, or by rental, lease, or lending;
- to perform the work publicly, in the case of literary, musical, dramatic, and choreographic works, pantomimes, and motion pictures and other audiovisual works;
- to display the copyrighted work publicly, in the case of literary, musical, dramatic, and choreographic works, pantomimes, and pictorial, graphic, or sculptural works, including the individual images of a motion picture or other audiovisual work; and
- in the case of sound recordings, to perform the work publicly by means of a digital audio transmission.7

duration

the copyright act of 1790 granted authors a term of fourteen years, with one renewal if the author was still living (twenty-eight years total).8 now the situation is much more complex, and, rather than trying to review the details, i’ll provide the following example. for a personal author who produced a work on or after january 1, 1978, it is covered for the life of the author plus seventy years.9 so, assuming an author lives an average seventy-five years, the work would be covered for 144 years, which is approximately 116 years longer than in 1790.
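the comparison above can be checked with a few lines of arithmetic. the sketch below is illustrative only: the function names are hypothetical, and the figure of 74 years lived after the work's creation is an assumption chosen to reproduce the 144-year term quoted in the text, since the text gives the author's lifespan but not the author's age when the work was created.

```python
# Illustrative arithmetic only; function names are hypothetical.

def term_1790(renewed: bool = True) -> int:
    """1790 act: 14 years, plus one 14-year renewal if the author was still living."""
    return 14 + (14 if renewed else 0)

def term_life_plus_70(years_lived_after_creation: int) -> int:
    """Personal author, work produced on or after January 1, 1978: life of the author plus 70 years."""
    return years_lived_after_creation + 70

# Assumption: 74 remaining years of life, chosen to match the 144-year figure in the text.
modern = term_life_plus_70(74)
original = term_1790(renewed=True)
print(modern, original, modern - original)  # prints: 144 28 116
```

a work created later in the author's life would have a shorter term from creation, so the 116-year gap is best read as an upper-range illustration and not as a fixed difference.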
registration registration was required by the copyright act of 1790, but very few eligible works were registered from 1790 to 1800, which enriched the public domain.10 now registration is not required, and no work enriches the public domain until its term is over, even if the author (or the author’s descendants) have no interest in the work being under copyright, or it is impossible to locate the copyright holder to gain permission to use his or her works (creating so-called “orphan works”). drafting of legislation by 1901, copyright law had become fairly esoteric and complex, and drafting new copyright legislation had become increasingly difficult. consequently, congress adopted a new strategy: let those whose commercial interests were directly affected by copyright law deliberate and negotiate with each other about copyright law changes, and use the results of this process as the basis of new legislation.11 over time, this increasingly became a dialogue among representatives of entertainment, high-tech, information, and telecommunications companies; other parties, such as library associations; and rights-holder groups (e.g., ascap). since these parties often had competing interests, the negotiations were frequently contentious and lengthy. the resulting laws created a kind of crazy quilt of specific exceptions for the deals made during these sessions to the ever-expanding control over intellectual works that copyright reform generally engendered. since the public was not at the table, its highly diverse interests were not directly represented, and, since stakeholder industries lobby congress and the public does not, the public’s interests were often not well served. (there were some efforts by special interest groups to represent the public on narrowly focused issues.) 
frequency of copyright term legislation with remarkable restraint, congress, in its first hundred years, enacted one copyright bill that extended the copyright term and one in its next fifty; however, starting in 1962, it passed eleven bills in the next forty years.12 famously, jack valenti once proposed that copyright “last forever less one day.”13 by continually extending copyright terms in a serial fashion, congress may grant him his wish. licenses in 1790, copyrighted works were sold and owned. today, many digital works are licensed. licenses usually fall under state contract law rather than federal copyright law.14 licensed works are not owned, and the first-sale doctrine is not in effect.15 while copyright is the legal foundation of licenses (i.e., works can be licensed because licensors own the copyright to those works), licenses are contracts, and contract provisions trump user-favorable copyright provisions, such as fair use, if the licensor chooses to negate them in a license. criminal and civil penalties in 1790 there were civil penalties for copyright infringement (e.g., statutory fines of “50 cents per sheet found in the infringer ’s possession”).16 now there are criminal copyright penalties, including felony violations that can result in a maximum of five years of imprisonment and fines as high as $250,000 for first-time offenders; civil statutory fines that can range as high as $150,000 per infringement (if infringement is “willful”), and other penalties.17 once the copyright implications of digital media and the internet sunk in, entertainment and information companies were deeply concerned: digital technologies made creating perfect copies effortless, and the internet provided a free (or low-cost) way to distribute content globally. congress, primarily spurred on by entertainment companies, passed several laws aimed at curtailing perceived digital “theft” through criminal penalties. 
under the 1997 no electronic theft (net) act, copyright infringers face “up to 3 years in prison and/or $250,000 fines,” even for noncommercial infringement.18 under the 1998 digital millennium copyright act (dmca), those who defeat technological mechanisms that control access to copyrighted works (a process called “circumvention”) face a maximum of five years in prison and $500,000 in fines.19

effect of copyright on average citizens

in 1790, copyright law had little effect on citizens. the average person was not an author or publisher, private use of copyrighted materials was basically unregulated, the public domain was healthy, and many types of works were not covered by copyright at all. in 2006,

- virtually every type of work imaginable is under automatic copyright protection for extended periods of time;
- private use of digital works is increasingly visible and of concern to copyright holders;
- the public domain is endangered; and
- ordinary citizens are being prosecuted as “pirates” under draconian statutory and criminal penalties.

regarding this development, lessig says: for the first time in our tradition, the ordinary ways in which individuals create and share culture fall within the reach of the regulation of the law, which has expanded to draw within its control a vast amount of culture and creativity that it never reached before. the technology that preserved the balance of our history, between uses of our culture that were free and uses of our culture that were only upon permission, has been undone. the consequence is that we are less and less a free culture, more and more a permission culture.20

how has copyright changed since the days of the founding fathers?
as we have seen, there has been a shift in copyright law (and social perceptions of it):

- from promoting progress to protecting intellectual property owners’ “rights”;
- from covering limited types of works to covering virtually all types of works;
- from granting only basic reproduction and distribution rights to granting a much wider range of rights;
- from offering a relatively short duration of protection to offering a relatively long (potentially perpetual) one;
- from requiring registration to providing automatic copyright;
- from drafting laws in congress to drafting laws in work groups of interested parties dominated by commercial representatives;
- from making infrequent extensions of copyright duration to making frequent ones;
- from selling works to licensing them;
- from relatively modest civil penalties to severe civil and criminal penalties; and
- from ignoring ordinary citizens’ typical use of copyrighted works to branding them as pirates and prosecuting them with lawsuits.

(regarding lawsuits filed by the recording industry association of america against four students, lessig notes: “if you added up the claims, these four lawsuits were asking courts in the united states to award the plaintiffs close to $100 billion, six times the total profit of the film industry in 2001.”)21

complicating this situation further is intense consolidation and increased vertical integration in the entertainment, information, telecommunications, and other high-tech industries involved in the internet.22 this vertical integration has implications for what can be published and the free flow of information. for example, a company that publishes books and magazines, produces films and television programs, provides internet access and digital content, and provides cable television services (including broadband internet access) has different corporate interests than a company that performs a single function.
these interrelated interests may affect not only what information is produced and whether competing information and services are freely available through controlled digital distribution channels, but corporate perceptions of copyright issues as well. one of the ironies of the current copyright situation is this: if creative works are by nature property, and stealing property is (and has always been) wrong, then some of the very industries that are demanding that this truth be embodied in copyright law have, in the past, been pirates themselves, even though certain acts of piracy may have been legal (or appeared to be legal) under then-existing copyright laws.23 lessig states: if “piracy” means using the creative property of others without their permission—if “if value, then right” is true—then the history of the content industry is a history of piracy. every important sector of “big media” today—film, records, radio, and cable tv—was born of a kind of piracy so defined. the consistent story is how last generation’s pirates join this generation’s country club—until now.24 let’s take a simple case: cable television. early cable television companies used broadcast television programs without compensating copyright owners, who branded their actions as piracy and filed lawsuits. after two defeats in the supreme court, broadcast television companies won a victory (of sorts) in congress, which took nearly thirty years to resolve the matter: cable television companies would pay, but not what broadcast television companies wanted; rather they would pay fees determined by law.25 of course, this view of history (big media companies as pirates in their infancy) is open to dispute. for the moment, let’s assume that it is true. 
put more gently, some of the most important media companies of modern times flourished because of relatively lax copyright control, a relatively rich public domain, and, in some cases, a societal boon that allowed them to pay statutory license fees (which are compulsory for copyright owners) instead of potentially paying much higher fees set by copyright owners or being denied use at all. today, the very things that fostered media companies’ growth are under attack by them. the success of those attacks is diminishing the ability of new digital content and service companies to flourish and, in the long run, may diminish even big media’s ability to continue to thrive as a permission culture replaces a permissive culture.

several prominent copyright scholars have suggested copyright reforms to help restore balance to the copyright system. james boyle, professor of law at the duke university law school, recommends a twenty-year copyright term with “a broadly defined fair use protection for journalistic, teaching, and parodic uses—provided that those uses were not judged to be in bad faith by a jury applying the ‘beyond a reasonable doubt’ standard.”26 william w. fisher iii, hale and dorr professor of intellectual property law at harvard university law school, suggests that “we replace major portions of the copyright and encryption-reinforcement models with . . .
a governmentally administered reward system” that would put in place new taxes and compensate registered copyright owners of music or films with “a share of the tax revenues proportional to the relative popularity of his or her creation,” and would “eliminate most of the current prohibitions on unauthorized reproduction, distribution, adaptation, and performance of audio and video recordings.”27 lessig recommends that copyright law be guided by the following general principles: (1) short copyright terms, (2) a simple binary system of protected/not protected works without complex exceptions, (3) mandatory renewal, and (4) a “prospective” orientation that forbids retrospective term extensions.28 (previously, lessig had proposed a seventy-five-year term contingent on five-year renewals). he suggests reinstating the copyright registration requirement using a flexible system similar to that used for domain name registrations. he favors works having copyright marks, and, if they are not present, he would permit their free use until copyright owners voice their opposition to this use (uses of the work made prior to this point would still be permitted). litman wants a copyright law “that is short, simple, and fair,” in which we “stop defining copyright in terms of reproduction” and recast copyright as “an exclusive right of commercial exploitation.”29 litman would eliminate industry-specific copyright law exceptions, but grant the public “a right to engage in copying or other uses incidental to a licensed or legally privileged use”; the “right to cite” (even infringing works); and “an affirmative right to gain access to, extract, use, and reuse the ideas, facts, information, and other public-domain material embodied in protected works” (including a restricted circumvention right).30 things change in two hundred-plus years, and the law must change with them. since the late nineteenth century, copyright law has been especially impacted by new technologies. 
the question is this: has copyright law struck the right balance between encouraging progress through granting creators specific rights and fostering a strong public domain that also nourishes creative endeavor? if that balance has been lost, how can it be restored? or is society simply no longer striving to maintain that balance because intellectual works are indeed property, property must be protected for commerce to prosper, and the concept of balance is outmoded and no longer reflects societal values?

drm: locked-up content and fine-grained control

noted attorney michael godwin defines drm as “a collective name for technologies that prevent you from using a copyrighted digital work beyond the degree to which the copyright owner (or a publisher who may not actually hold a copyright) wishes to allow you to use it.”31 like copyright, drm systems are complex, with many variations. there are two key technologies: (1) digital marking (i.e., digital fingerprints that uniquely identify a work based on its characteristics, simple labels that attach rights information to content, and watermarks that typically hide information that can be used to identify a work), and (2) encryption (i.e., scrambled digital content that requires a digital key to decipher it).32 specialized hardware can be used to restrict access as well, often in conjunction with digital marking and encryption. the intent of this article is not to provide a technical tutorial, but to set forth an overview of the basic drm concept and discuss its implications. what is of interest here is not how system a-b-c works in contrast to system x-y-z, but what drm allows copyright owners to do and the issues related to drm. to do so, let’s use an analogy, understanding that real drm systems can work in other ways as well (e.g., digital watermarks can be used to track illegal use of images on the internet without those images being otherwise protected).
for the moment, let’s imagine that the content a user wishes to access is in an unbreakable, encrypted digital safe. the user cannot see inside the safe. by entering the correct digital combination, certain content becomes visible (or audible or both) in the safe. that content can then be utilized in specific ways (and only those ways), including, if permitted, leaving the safe. if a public domain work is put in the safe, access to it is restricted regardless of its copyright status. bill rosenblatt, bill trippe, and stephen mooney provide a very useful conceptual model of drm rights in their landmark drm book, digital rights management: business and technology, summarized here.33 there are three types of content rights: (1) render rights, (2) transport rights, and (3) derivative-works rights. render rights allow authorized users to view, play, and print protected content. transport rights allow authorized users to copy, move, and loan content (the user retains the content if it is copied and gets it back when a loan is over, but does not keep a copy if it is moved). derivative-works rights allow authorized users to extract pieces of content, edit the content in place, and embed content by extracting some of it and using it in other works. each one of these individual rights has three attributes: (1) consideration, (2) extents, and (3) types of users. in the first attribute, consideration, access to content is provided for something of value to the publisher (e.g., money or personal information). content can then be used to some extent (e.g., for a certain amount of time or a certain number of times). the rights and attributes users have are determined by their user types. for example, an academic user, in consideration of a specified license payment by his or her library, can view a drm-protected scholarly article—but not copy, move, loan, extract, edit, or embed it—for a week, after which it is inaccessible.
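the rights model summarized above can be made concrete with a small data structure. what follows is only an illustrative sketch, not code from rosenblatt, trippe, and mooney's book or from any real drm product: the names `Right`, `Grant`, `License`, and `allows`, the start date, and the "academic" user type are all hypothetical, chosen here to mirror the one-week article example in the text. extents are modeled as a time span only; a count-based extent would need an extra field.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from enum import Enum
from typing import List


class Right(Enum):
    # render rights
    VIEW = "view"
    PLAY = "play"
    PRINT = "print"
    # transport rights
    COPY = "copy"
    MOVE = "move"
    LOAN = "loan"
    # derivative-works rights
    EXTRACT = "extract"
    EDIT = "edit"
    EMBED = "embed"


@dataclass(frozen=True)
class Grant:
    """One right plus its three attributes (hypothetical field names)."""
    right: Right
    consideration: str   # what the publisher receives, e.g. a license fee
    extent: timedelta    # how long the right lasts once the license is issued
    user_type: str       # which kind of user may exercise the right


@dataclass
class License:
    issued: datetime
    grants: List[Grant] = field(default_factory=list)

    def allows(self, right: Right, user_type: str, when: datetime) -> bool:
        """True only if some grant covers this right, user type, and moment."""
        return any(
            g.right is right
            and g.user_type == user_type
            and self.issued <= when < self.issued + g.extent
            for g in self.grants
        )


# the example from the text: an academic user may view an article for a week
issued = datetime(2006, 9, 1)  # arbitrary illustrative date
article = License(
    issued,
    [Grant(Right.VIEW, "library license fee", timedelta(weeks=1), "academic")],
)
```

under these assumptions, `article.allows(Right.VIEW, "academic", issued + timedelta(days=3))` is true, while the same check for `Right.COPY`, for a different user type, or on day eight is false. anything not explicitly granted is denied, which is the fine-grained, default-closed control the safe analogy describes.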
we can extend this hypothetical example by imagining that the library could pay higher license fees to gain more rights to the journal in question, and the library (or the user) could dynamically purchase additional article-specific rights enhancements as needed through micropayments. this example is extreme; however, it illustrates the fine-grained, high level of control that publishers could potentially have over content by using drm technology. godwin suggests that drm may inhibit a variety of legitimate uses of drm-protected information, such as access to public-domain works (or other works that would allow liberal use), preservation of works by libraries, creation of new derivative works, conduct of historical research, exercise of fair-use rights, and instructional use.34 the ability of blind (or otherwise disabled) users to employ assistive technologies may also be prevented by drm technology.35 drm also raises a variety of privacy concerns.36 fair use is an especially thorny problem. rosenblatt, trippe, and mooney state: fair use is an “i’ll know it when i see it” proposition, meaning that it can’t be proscriptively defined. . . . just as there is no such thing as a “black box” that determines whether broadcast material is or isn’t indecent, there is no such thing as a “black box” that can determine whether a given use of content qualifies as fair use or not. anything that can’t be proscriptively defined can’t be represented in a computer system.37 no need to panic about scholarly journals—yet. your scholarly journal publisher or other third-party supplier is unlikely to present you with such detailed options tomorrow. but you may already be licensing other digital content that is drm-protected, such as digital music or e-books that require a hardware e-book reader. as the recent sony bmg “rootkit” episode illustrated, creating effective, secure drm systems can be challenging, even for large corporations.38 again, the reasons for this are complex. 
in very simple terms, it boils down to this: assuming that the content can be protected up to the point it is placed in a drm system, the drm system has the best chance of working if all possible devices that can process its protected content either directly support its protection technology, recognize its restrictions and enforce them through another means, or refuse access.39 anything less creates “holes” in the protective drm shell, such as the well-known “analog hole” (e.g., when drm-protected digital content is converted to analog form to be played, it can then be rerecorded using digital equipment without drm protection).40 ideally, in other words, every server, network router, pc and pc component, operating system, and relevant electronic device (e.g., cd player, dvd player, audio-recording device, and video-recording device) would work with the drm system as outlined previously or would not allow access to the content at all. clearly, this ideal end-state for drm may well never be realized, especially given the troublesome backward-compatibility equipment problem.41 however, this does not mean that the entertainment, information, and high-technology companies will not try to make whatever piecemeal progress that they can in this area.42 the trusted computing group is an important multiple-industry security organization, whose standards work could have a strong impact on the future of drm. robert a. gehring notes: but a drm system is almost useless, that is from a content owner’s perspective, until it is deployed broadly.
putting together cheap tc components with a market-dominating operating system “enriched” with drm functionality is the most economic way to provide the majority of users with “copyright boxes.”43 seth schoen argues computer owners should be empowered to override certain features of “trusted computing architecture” to address issues with “anti-competitive and anti-consumer behavior” and other problems.44 drm could potentially be legislatively mandated. there is a closely related legal precedent, the audio home recording act, which requires that digital audiotape equipment include special hardware to prevent serial copying.45 there is currently a bill before congress that would require use of a “broadcast flag” (a digital marker) for digital broadcast and satellite radio receivers.46 last year, a similar fcc regulation for broadcast digital television was struck down by a federal appeals court; consequently, the current bill explicitly empowers the fcc to “enforce ‘prohibitions against unauthorized copying and redistribution.’”47 another bill would plug the analog-to-digital video analog hole by putting “strict legal controls on any video analog to digital (a/d) convertors.”48 whether these bills become law or not, efforts to mandate drm are unlikely to end. dmca strongly supports drm by prohibiting both the circumvention of technological mechanisms that control access to copyrighted works (with some minor exceptions) and the “manufacture of any device, composition of any program, or offering of any service” to do so.49 what would the world be like if all newly published (or released) commercially created information was in digital form, protected by drm? what would it be like if all old works in print and analog formats were only reissued in digital form, protected by drm? what would it be like if all hardware that could process that digital information had to support the information’s drm scheme or block any access to it because this was mandated by law?
what would it be like if all operating systems had direct or indirect built-in support for drm? would “progress of science and useful arts” be promoted or squashed?

weaker net neutrality

lessig identifies three important characteristics of the internet that have fostered innovation: (1) edge architecture: software applications run on servers connected to the network, rather than on the network itself, ensuring that the network itself does not have to be modified for new or updated applications to run; (2) no application optimization: a relatively simple, but effective, protocol is utilized (internet protocol) that is indifferent to what software applications run on top of it, again insulating the network from application changes; and (3) neutral platform: the network does not prefer certain data packets or deny certain packets access.50 lessig’s conceptual model is very useful when thinking about net neutrality, a topic of growing concern. educause’s definition of net neutrality aptly captures these concerns: “net neutrality” is the term used to describe the concept of keeping the internet open to all lawful content, information, applications, and equipment. there is increasing concern that the owners of the local broadband connections (usually either the cable or telephone company) may block or discriminate against certain internet users or applications in order to give an advantage to their own services.
while the owners of the local network have a legitimate right to manage traffic on their network to prevent congestion, viruses, and so forth, network owners should not be able to block or degrade traffic based on the identity of the user or the type of application solely to favor their interests.51 for some time, there have been fears that net neutrality was endangered as the internet became increasingly commercialized, a greater percentage of home internet users migrated to broadband connections not regulated by common carrier laws, and telecommunications mergers (and vertical integration) accelerated. some of these fears are now appearing to be realized, albeit with resistance by the internet community. for example, aol has indicated that it will implement a two-tier e-mail system for companies, nonprofits, and others who send mass mailings: those who pay bypass spam filters, those who don’t pay don’t bypass spam filters.52 critics fear that free e-mail services will deteriorate under a two-tier system. facing fierce criticism from the dearaol.com coalition and many others, aol has relented somewhat on the nonprofit issue by offering special treatment for “qualified” nonprofits. a second example is that an analysis of verizon’s fcc filings reveals that “more than 80% of verizon’s current capacity is earmarked for carrying its service, while all other traffic jostles in the remainder.”53 content-oriented net companies are worried: leading net companies say that verizon’s actions could keep some rivals off the road. as consumers try to search google, buy books on amazon.com, or watch videos on yahoo!, they’ll all be trying to squeeze into the leftover lanes on verizon’s network. . . . “the bells have designed a broadband system that squeezes out the public internet in favor of services or content they want to provide,” says paul misener, vice-president for global policy at amazon.com.54 a third example is a comment by william l. 
smith, bellsouth’s chief technology officer, who “told reporters and analysts that an internet service provider such as his firm should be able, for example, to charge yahoo inc. for the opportunity to have its search site load faster than that of google inc.,” but qualified this assertion by indicating that “a pay-for-performance marketplace should be allowed to develop on top of a baseline service level that all content providers would enjoy.”55 about four months later, at&t announced that it would acquire bellsouth, after which it “will be the local carrier in 22 states covering more than half of the american population.”56 finally, in a white paper for public knowledge, john windhausen jr. states: this concern is not just theoretical—broadband network providers are taking advantage of their unregulated status. cable operators have barred consumers from using their cable modems for virtual private networks and home networking and blocked streaming video applications. telephone and wireless companies have blocked internet telephone (voip—voice over the internet protocol) traffic outright in order to protect their own telephone service revenues.57 these and similar examples are harbingers of troubled days ahead for net neutrality. the canary in the net neutrality mine isn’t dead yet, but it’s getting very nervous. the bottom line? noted oa advocate peter suber analyzes the situation as follows: but now cable and telecom companies want to discriminate, charge premium prices for premium service, and give second-rate service to everyone else. if we relax the principle of net neutrality, then isps could, if they wanted, limit the software and hardware you could connect to the net. they could charge you more if you send or receive more than a set number of emails. they could block emails containing certain keywords or emails from people or organizations they disliked, and block traffic to or from competitor web sites.
they could make filtered service the default and force users to pay extra for the wide open internet. if you tried to shop at a store that hasn’t paid them a kickback, they could steer you to a store that has. . . . if companies like at&t and verizon have their way, there will be two tiers of internet service: fast and expensive and slow and cheap (or cheaper). we unwealthy users—students, scholars, universities, and small publishers—wouldn't be forced offline, just forced into the slow lane. because the fast lane would reserve a chunk of bandwidth for the wealthy, the peons would crowd together in what remained, reducing service below current levels. new services starting in the slow lane wouldn't have a fighting chance against entrenched players in the fast lane. think about ebay in 1995, google in 1999, or skype in 2002 without the level playing field provided by network neutrality. or think about any oa journal or repository today.58 is net neutrality a quaint anachronism of the internet’s distant academic/research roots that we would be better off without? would new internet companies and noncommercial services prosper better if it was gone, spurring on new waves of innovation? would telecommunications companies (who may be part of larger conglomerates), free to charge for tiered-services, offer us exciting new service offerings and better, more reliable service?

defending the internet revolution

sixties icon bob dylan’s line in “the times they are a-changin’”—“then you better start swimmin’ or you’ll sink like a stone”—couldn’t be more apt for those concerned with the issues outlined in this paper. here’s a brief overview of some of the strategies being used to defend the freewheeling internet revolution. 1. darknet: j. d. lasica says: “for the most part, the darknet is simply the underground internet.
but there are many darknets: the millions of users trading files in the shady regions of usenet and internet relay chat; students who send songs and tv shows to each other using instant messaging services from aol, yahoo, and microsoft; city streets and college campuses where people copy, burn, and share physical media like cds; and the new breed of encrypted dark networks like freenet. . .”59 we may think of the darknet as simply fostering illegal file swapping by ordinary citizens, but the darknet strategy can also be used to escape government internet censorship, as is the case with freenet use in china.60 2. legislative and legal action: there have been attempts to pass laws to amend or reverse copyright and other laws resulting from the counter-internet-revolution, which have been met by swift, powerful, and generally effective opposition from entertainment companies and other parties affected by these proposed measures. the moral of this story is that these large corporations can afford to pay lobbyists, make campaign contributions, and otherwise exert significant influence over lawmakers, while, by and large, advocates for the other side do not have the same clout. the battle in the courts has been more of a mixed bag; however, there have been some notable defeats for reform advocates, especially in the copyright arena (e.g., eldred v. ashcroft), where most of the action has been. 3. market forces: when commercial choices can be made, users can vote with their pocketbooks about some internet changes. but, if monopoly forces are in play, such as having a single option for broadband access, the only other choice may be no service. however, as the oa movement (described later) has demonstrated, a concerted effort by highly motivated individuals and nonprofit organizations can establish viable new alternatives to commercial services that can change the rules of the game in some cases. 
companies can also explore radical new business models that may appear paradoxical to pre-internet-era thinking, but make perfect sense in the new digital reality. in the long run, the winners of the digital-content wars may be those who are not afraid of going down the internet rabbit hole. 4. creative commons: copyright is a two-edged sword: it can be used as the legal basis of licenses (and drm) to restrict and control digital information, or it can be used as the legal basis of licenses to permit liberal use of digital information. by using one of the six major creative commons licenses (ccls), authors can retain copyright, but significantly enrich society’s collective cultural repository with works that can be freely shared for noncommercial purposes, used, in some cases, for commercial purposes, and used to easily build new derivative creative works. for example, the creative commons attribution license requires that a work be attributed to the author; however, a work can be used for any commercial or noncommercial purpose without permission, including creating derivative works.61 there are a variety of other licenses, such as the gnu free documentation license, that can be used for similar purposes.62 5. oa: scholars create certain types of information, such as journal articles, without expecting to be paid to do so, and it is in their best interests for these works to be widely read, especially by specialists in their fields.63 by putting e-prints (electronic preprints or post-prints) of articles on personal home pages or in various types of digital archives (e.g., institutional repositories) in full compliance with copyright law and, if needed, in compliance with publisher policies, scholars can provide free global access to these works with minimal effort and at no (or little) cost to themselves.
further, a new generation of free e-journals is being published on the internet, funded by a variety of business models, such as advertising, author fees, library membership fees, and supplemental products. these oa strategies make digital scholarly information freely available to users across the globe, regardless of their personal affluence or the affluence of their affiliated institutions.

impact on libraries

this paper’s analysis of copyright, drm, and network neutrality trends holds no good news for libraries.

copyright

the reach of copyright law is constantly extended to new types of materials and for an ever-lengthening duration. as a result, copyright holders must explicitly place their works in the public domain if the public domain is to continue to grow. needless to say, the public domain is a primary source of materials that can be digitized without having to face a complex, potentially expensive, and sometimes hopeless permission clearance process. this process can be especially daunting for media works (such as films and video), even for the use of very short segments of these works. j. d. lasica recounts his effort to get permission to use short music and film segments in a personal video: five out of seven music companies declined; six out of seven movie studios declined, and the one that agreed had serious reservations.64 the replies to his inquiry, for those companies that bothered to reply at all, are well worth reading. for u.s. libraries without the resources to deal with complicated copyright-related issues, the digitization clock stops at 1922, the last year we can be sure that a work is in the public domain without checking its copyright status and getting permission if it is under copyright.65 what can we look forward to?
lessig says: “thus, in the twenty years after the sonny bono act, while one million patents will pass into the public domain, zero copyrights will pass into the public domain by virtue of the expiration of a copyright term.”66 (the sonny bono term extension act was passed in 1998.) digital preservation is another area of concern in a legal environment where most information is automatically copyrighted, copyright terms are lengthy (or endless), and information is increasingly licensed. simply put, a library cannot digitally preserve what it does not own unless the work is in the public domain, the work’s license permits it, or the work’s copyright owner grants permission to do so. or can it? after all, the internet archive does not ask permission ahead of time before preserving the entire internet, although it responds to requests to restrict information. and that is why the internet archive is currently being sued by healthcare advocates, which says that it: “is just like a big vacuum cleaner, sucking up information and making it available.”67 if it is not settled out of court, this will be an interesting case for more digitally adventurous libraries to watch. as the cost of the hardware and software needed to effectively do so continues to drop, faculty, students, and other library users will increasingly want to repurpose content, digitizing conventional print and media materials, remixing digital ones, and/or creating new digital materials from both. with the “information commons” movement, academic libraries are increasingly providing users with the hardware and software tools to repurpose content. given that the wording of the u.s. copyright act section 108 (f) (1) is vague enough that it could be interpreted to include these tools when they are used for information reproduction, is the old “copyright disclaimer on the photocopier” solution enough in the new digital environment? 
or—in light of the unprecedented transformational power of these tools to create new digital works, and their widespread use both within libraries and on campus—do academic libraries bear heavier responsibilities regarding copyright compliance, permission-seeking, and education? similar issues arise when faculty want to place self-created digital works that incorporate copyrighted materials in electronic reserves systems or institutional repositories. end-user contributions to “library 2.0” systems that incorporate copyrighted materials may also raise copyright concerns.

drm

as libraries realize that they cannot afford dual formats, their new journal and index holdings are increasingly solely digital. libraries are also licensing a growing variety of “born digital” information. the complexities of dealing with license restrictions for these commercial digital products are well understood, but imagine if drm was layered on top of license restrictions. as we have discussed, drm will allow content producers and distributors to slice, dice, and monetize access to digital information in ways that were previously impossible. what may be every publisher/vendor’s dream could be every library’s nightmare.
aside from a potential surge of publisher/vendor-specific access licensing options and fees, libraries may also have to contend with publisher/vendor-specific drm technical solutions, which may:

- depend on particular hardware/software platforms,
- be incompatible with each other,
- decrease computer reliability and security,
- eliminate fair or otherwise legal use of drm-protected information,
- raise user privacy issues,
- restrict digital preservation to bitstream preservation (if allowed by license),
- make it difficult to assess whether to license drm-protected materials,
- increase the difficulty of providing unified access to information from different publishers and vendors,
- multiply user support headaches, and
- necessitate increased staffing.

drm makes solving many of these problems both legally and technically impossible. for example, under dmca, libraries have the right to circumvent drm for a work in order to evaluate whether they want to purchase it. however, they cannot do so without the software tools to crack the work’s drm protection. but the distribution of those tools is illegal under dmca, and local development of such tools is likely to be prohibitively complex and expensive.68

fostering alternatives to restrictive copyright and drm

given the uphill battle in the courts and legislatures, ccls (or similar licenses) and oa are particularly promising strategies to deal with copyright and drm issues. copyright laws do not need to change for these strategies to be effective. it is not just a question of libraries helping to support oa by paying for institutional memberships to oa journals, building and maintaining institutional repositories, supporting oa mandates, encouraging faculty to edit and publish oa journals, educating faculty about copyright and oa issues, and encouraging them to utilize ccls (or similar licenses).
to truly create change, libraries need to “walk the talk” and either let the public-domain materials they digitize remain in the public domain, or put them under ccls (or similar licenses), and, when they create original digital content, put it under ccls (or similar licenses) as well. as the oa movement has shown, using ccls does not rule out revenue generation (if that is an appropriate goal), but it does require facilitating strategies, such as advertising and offering fee-based add-on products and services.

net neutrality

there are many unknowns surrounding the issue of net neutrality, but what is clear is that it is under assault. it is also clear that internet services are more likely to require more, not less, bandwidth in the future as digital media and other high-bandwidth applications become more commonplace, complex, and interwoven into a larger number of internet systems. one would imagine that if a corporation such as google had to pay for a high-speed digital lane, it would want it to reach as many consumers as possible. so, it may well be that libraries’ google access would be unaffected or possibly improved by a two-tier (or multi-tier) internet “speed-lane” service model. would the same be true for library-oriented publishers and vendors? that may depend on their size and relative affluence. if so, the ability of smaller publishers and vendors to offer innovative bandwidth-intensive products and services may be curtailed. unless they are affluent, libraries may also find that they are confined to slower internet speed lanes when they act as information providers. for libraries engaged in digital library, electronic publishing, and institutional repository projects, this may be problematic, especially as they increasingly add more digital media, large-data-set, or other bandwidth-intensive applications.
it’s important to keep in mind that net neutrality impacts are tied to where the chokepoints are, with the most serious potential impacts being at chokepoints that affect large numbers of users, such as local isps that are part of large corporations, national/international backbone networks, and major internet information services (e.g., yahoo!). it is also important to realize that the problem may be partitioned to particular network segments. for example, on-campus network users may not experience any speed issues associated with the delivery of bandwidth-intensive information from local library servers because that network segment is under university control. remote users, however, including affiliated home users, may experience throttled-down performance beyond what would normally be expected due to speed-lane enforcement by backbone providers or local isps controlled by large corporations. likewise, users at two universities connected by a special research network may experience no issues related to accessing the other university’s bandwidth-intensive library applications from on-campus computers because the backbone provider is under a contractual obligation to deliver specific network performance levels. although the example of speed lanes has been used in this examination of potential net neutrality impacts on libraries, the problem is more complex than this, because network services, such as peer-to-peer networking protocols, can be completely blocked, digital information can be blocked or filtered, and other types of fine-grained network control can be exerted.

conclusion

this paper has deliberately presented one side of the story.
it should not be construed as saying that copyright law should be abolished or violated, that drm can serve no useful purpose (if it is possible to fix certain critical deficiencies and if it is properly employed), or that no one has to foot the bill for content creation/marketing/distribution and ever-more-bandwidth-hungry internet applications. nor is it to say that the other side of the story, the side most likely to be told by spokespersons of the entertainment, information, and telecommunications industries, has no validity and does not deserve to be heard. however, that side of the story is having no problem being heard, especially in the halls of congress. the side of the story presented in this paper is not as widely heard, at least not yet. nor does it intend to imply that executives from the entertainment, information, telecommunications, and other corporate venues lack a social conscience, are fully unified in their views, or are unconcerned with the societal implications of their positions. however, by focusing on short-term issues, they may not fully realize the potentially negative, long-term impact that their positions may have on their own enterprises. nor has this paper presented all of the issues that threaten the internet, such as assaults on privacy, increasingly determined (and malicious) hacking, state and other censorship, and the seemingly insolvable problem of overlaying national laws on a global digital medium. what this paper has said is simply this: three issues (a dramatic expansion of the scope, duration, and punitive nature of copyright laws; the ability of drm to lock down content in an unprecedented fashion; and the erosion of net neutrality) bear careful scrutiny by those who believe that the internet has fostered (and will continue to foster) a digital revolution that has resulted in an extraordinary explosion of innovation, creativity, and information dissemination.
these issues may well determine whether the much-touted information superhighway lives up to its promise or simply becomes the “information toll road” of the future, ironically resembling the pre-internet online services of the past.

copyright © 2006 by charles w. bailey jr. this work is licensed under the creative commons attribution-noncommercial 2.5 license. to view a copy of this license, visit http://creativecommons.org/licenses/by-nc/2.5/ or send a letter to creative commons, 543 howard st., 5th floor, san francisco, ca, 94105, usa.

known-item questions
1. “your history professor has requested you to start your research project by looking up background information in a book titled civilizations of the ancient near east.”
a. “please find this title in the library catalog.”
b. “where would you go to find this book physically?”
2.
“for your literature class, you need to read the book titled gulliver’s travels written by jonathan swift. find the call number for one copy of this book.”
3. “you’ve been hearing a lot about the physicist richard feynman, and you’d like to find out whether the library has any of the books that he has written.”
a. “what is the title of one of his books?”
b. “is there a copy of this book you could check out from d. h. hill library?”
4. “you have the citation for a journal article about photosynthesis, light, and plant growth. you can read the actual citation for the journal article on this sheet of paper.”
alley, h., m. rieger, and j.m. affolter. “effects of developmental light level on photosynthesis and biomass production in echinacea laevigata, a federally listed endangered species.” natural areas journal 25.2 (2005): 117–22.
a. “using the library catalog, can you determine if the library owns this journal?”
b. “do library users have access to the volume that actually contains this article (either electronically or in print)?”

topical questions
5. “please find the titles of two books that have been written about bill gates (not books written by bill gates).”
6. “your cat is acting like he doesn’t feel well, and you are worried about him. please find two books that provide information specifically on cat health or caring for cats.”
7. “you have family who are considering a solar house. does the library have any materials about building passive solar homes?”
8. “can you show me how would you find the most recently published book about nuclear energy policy in the united states?”
9. “imagine you teach introductory spanish and you want to broaden your students’ horizons by exposing them to poetry in spanish. find at least one audio recording of a poet reading his or her work aloud in spanish.”
10. “you would like to browse the recent journal literature in the field of landscape architecture.
does the design library have any journals about landscape architecture?”

appendix a: ncsu libraries catalog usability test tasks

information security in libraries: examining the effects of knowledge transfer
tonia san nicolas-rocca and richard j. burkhard
information technology and libraries | june 2019 58

tonia san nicolas-rocca (tonia.sannicolas-rocca@sjsu.edu) is assistant professor in the school of information at san jose state university. richard j. burkhard (richard.burkhard@sjsu.edu) is professor in the school of information systems and technology in the college of business at san jose state university.

abstract

libraries in the united states handle sensitive patron information, including personally identifiable information and circulation records. with libraries providing services to millions of patrons across the u.s., it is essential that they understand the value of patron privacy and how to protect it. this study investigates how knowledge transferred within an online cybersecurity education course affects library employee information security practices. the results of this study suggest that knowledge transfer does have a positive effect on library employee information security and risk management practices.

introduction

libraries across the u.s. provide a wide range of services and resources to society. libraries of all types are viewed as important parts of their communities, offering a place to conduct research, to learn about technology, and to access accurate and unbiased information, as well as a place that inspires and sparks creativity. as a result, there were over 171 million registered public library users in the u.s. in 2016.1 a library is a collection of information resources and services made available to the community it serves.
the american library association (ala) affirms the ethical imperative to provide unrestricted access to information and to guard against impediments to open inquiry.2 further, in all areas of librarianship, best practice leaves the library user in control of as many choices as possible.3 in a library, the right to privacy is the right to open inquiry without having the subject of one’s interest examined or scrutinized by others.4 many library resources require the use of a library card. to obtain a library card in the u.s. one must provide official photo identification showing personally identifiable information (pii), such as name, address, telephone number, and email address. pii connects library users or patrons with, for example, items checked out, and websites visited. as such, pii has the potential to build up an image of a library patron that could potentially be used to assess the patron’s character. in response, the ala developed a policy concerning the confidentiality of pii about library users.5 confidentiality extends to “information sought or received and resources consulted, borrowed, acquired or transmitted,” and includes, but is not limited to, database search records, reference interviews, circulation records, interlibrary loan records, and other personally identifiable uses of library materials, facilities, or services.6 in more recent years, the ala has further specified that the right of patrons to privacy applies to any information that can link “choices of taste, interest, or research with an individual.”7 when library users recognize or fear that their privacy or confidentiality is compromised, true freedom of inquiry no longer exists. therefore, it is imperative that libraries use extra care when handling patron personally identifiable information.
while librarians and other library employees may understand the importance of data protection, they generally don’t have the resources available to assess information security risk, employ risk mitigation strategies, or offer security education, training, or awareness (seta) programs. this is of particular concern as libraries increasingly have access to databases of both proprietary and personal information.8 seta programs are risk mitigation strategies employed by organizations worldwide to increase and maintain end-user compliance with information security and privacy policies. in libraries, information systems are widely used to provide services to patrons; however, little is known about information security practices in libraries.9 given the sensitivity of the data libraries handle, and the lack of information security resources available to them, it is important for those currently working or planning to work in the library environment to develop the knowledge necessary to identify risks and to develop and employ risk mitigation strategies that protect the information and information resources they are entrusted with. therefore, the research question in this present study is: how can cybersecurity education strengthen information security practices in libraries? currently, there is a dearth of research on information security practices in libraries.10 this is an important research gap to acknowledge given that patron privacy is fundamental to the practice of librarianship in the u.s., and the advancement in technology coupled with federal regulations adds to the challenges of protecting patron privacy.11 thus, this study contributes to current literature by evaluating the effects of knowledge transfer as a means to strengthen information security within libraries. furthermore, this study will offer a preliminary investigation as to whether knowledge utilization leads to motivation and participation in information security risk management activities within libraries.
the remainder of this paper proceeds as follows: first, a review of knowledge transfer is covered. a description of the cybersecurity course, including students and course material, is provided. data collection and analysis are then presented. this is followed by a discussion of the findings, limitations, and future research.

literature review

knowledge transfer in seta

knowledge transfer through seta programs plays a key role in the development and implementation of cybersecurity practices.12 knowledge is transferred when learning takes place and when the recipient of that knowledge understands the intricacies and implications associated with that knowledge so that he or she can apply it.13 for example, in a security education program, an educator may transfer knowledge about information security risks to users who learn and apply the knowledge to increase patron privacy. the knowledge is applied when evidenced by users who are able to identify risks to patron data and implement risk mitigation strategies that serve to protect patron information and information system assets. knowledge transfer can be influenced by four factors: absorptive capacity, communication, motivation, and user participation.14 this study evaluates the extent to which knowledge transferred from a cybersecurity course strengthens information security practices within libraries. this study adapts the theoretical model proposed by spears & san nicolas-rocca (2015) (see figure 1) to examine the effects of cybersecurity education on information security practices in libraries.15

information security in libraries | san-nicolas-rocca and burkhard 60 https://doi.org/10.6017/ital.v38i2.10973

figure 1. factors of knowledge transfer leads to knowledge utilization.
absorptive capacity

absorptive capacity is the ability of a recipient to recognize the importance and value of externally sourced knowledge and to assimilate and apply it; it has been found to be positively related to knowledge transfer.16 activating a student’s prior knowledge could enhance their ability to process new information.17 that is, knowledge transfer is more likely to take place between the instructor and students enrolled in a cybersecurity course if the student has existing knowledge or has had experience in some related area. for the present study, students have stated that prior to enrolling in the cybersecurity course, they had little to no knowledge of cybersecurity. one student mentioned, “while i am the director of a small academic library, i have no understanding of cybersecurity. i am taking this course to learn about cybersecurity so that i can better secure the library i work in and to share the information with those who work in the library.” another student mentioned, “my goal is to work in a public library after graduation. i am taking this course because i keep hearing about cybersecurity breaches in the news, and i want to learn more about cybersecurity because i think it will help me in my future job.” while all of the students enrolled in the course had no cybersecurity experience, all of them had some understanding of principle 3 in the ala code of ethics, which states, “we protect each library user’s right to privacy and confidentiality with respect to information sought or received and resources consulted, borrowed, acquired or transmitted.”18 understanding of principle 3 in the code of ethics demonstrates existing, albeit limited, knowledge in an area related to cybersecurity. given this understanding, students should have the ability to process new information from the cybersecurity course.
communication

the success of any seta program depends on the ability of the instructor to effectively communicate the applicability and practical purpose of the material to be mastered, as distinguished from abstract or conceptual learning.19 according to current research, knowledge transfer can only occur if communication is effective in terms of type, amount, competence, and usefulness.20 for the present study, students were enrolled in an online graduate-level cybersecurity course at a university we call mountain view university (mvu). we changed the name to protect the privacy of the research participants. while research suggests that the best form of communication for knowledge transfer is face-to-face communication, the cybersecurity course at mvu is only offered online.21 therefore, communication relating to the course was conducted via course management software, email, video conferencing, discussion board, and prerecorded videos.

motivation

motivation can be a significant influence on knowledge transfer.22 that is, an individual’s motivation to participate in seta programs has been found to influence the extent to which knowledge is transferred.23 specifically, without motivation, a trainee may fail to use information shared with them about methods used to protect and safeguard patron privacy. in this present study, research participants voluntarily enrolled in the cybersecurity course. the cybersecurity course is not a core course or a class required for graduation. therefore, enrolling in the course implies motivation to learn about cybersecurity by participating in course activities and completing assigned work.
user participation

user participation in information security activities may influence effective knowledge transfer initiatives.24 according to previous research, when users participate in cybersecurity activities, security safeguards are more closely aligned with organizational objectives and are more effectively designed and performed within the organization.25 for the present study, given that students enrolled in the cybersecurity course, it is expected that they will participate in information security risk management (isrm) activities, such as the completion of personal and organizational risk management projects.

cybersecurity course information

this study will examine whether cybersecurity education strengthens information security practices within libraries. based on the model in figure 1, students enrolled in the cybersecurity course (motivation), and therefore, were expected to participate in all course activities and complete assigned work (user participation), such as isrm assignments. isrm assignments are described in the course material section below. as per figure 2, the cybersecurity course was offered online, and used multiple forms of communication, including email, video conferencing, discussion board, and pre-recorded videos (communication). students were able to access these resources through canvas, a learning management system. students came into the class with some understanding of principle 3 in the ala code of ethics. therefore, given that this knowledge is in a “related area,” students may be able to process new information relating to cybersecurity (absorptive capacity). as per the above information and as depicted in figure 1, motivation, user participation, communication, and absorptive capacity will lead to knowledge transfer. therefore, this study will focus on how knowledge transfer, as a means to strengthen information security, leads to knowledge utilization by cybersecurity students within information organizations.
specifically, this study will explore the possibility of knowledge utilization leading to motivation, and participation in isrm initiatives in libraries.

figure 2. knowledge transfer elements: cybersecurity knowledge transfer for information organizations.

course material

the course was offered to graduate students at mountain view university. course material was created based on the national institute of standards and technology special publications (nist sp) 800-53 and 800-60, as well as federal information processing standards (fips) publications 199 and 200. the focus of the course was information security risk management (isrm). course requirements included lab exercises, discussion posts relating to current cybersecurity findings and news reports, and isrm assignments. isrm assignments included a personal risk management assignment, which then led to the completion of an organizational risk management project (ormp). students completed the ormp for various libraries, healthcare institutions, pharmaceutical companies, government organizations, and small businesses. with instructor approval, students were allowed to select the organization they wanted to work with. the objective of the course was for students to obtain an understanding of isrm and be able to apply what they have learned to the workplace.

course communication

seta programs depend strongly on the ability of the knowledge source to effectively communicate the importance and applicability of the knowledge shared. current research suggests that the type of communication medium, relevance and usefulness of the information, and competency of the instructor can affect knowledge transfer.
given that face-to-face communication is considered the best method for successful knowledge transfer, it is important to understand if online communication methods were effective in the cybersecurity course described herein, as the main focus of this study is to determine if knowledge transfer leads to knowledge utilization. according to table 1, respondents “strongly agree” or “agree” that the materials used, relevance of communication, comprehension of instructor communication, and the amount of time communicating about cybersecurity in the course were effective (data collection is described in the section data collection and analysis).

questions | response: strongly agree | agree | neither agree nor disagree | disagree | strongly disagree
medium: the material used in the cybersecurity course i took at mvu communicated security lessons effectively. | 12 (50%) | 12 (50%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%)
relevance: communication during the cybersecurity course i took at mvu was effective in focusing on things i needed to know about cybersecurity for my job. | 10 (45.45%) | 12 (54.55%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%)
comprehension: in the cybersecurity course i took at mvu, the instructor’s oral and/or written communication with me was understandable. | 12 (54.55%) | 10 (45.45%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%)
amount: in the cybersecurity course i took at mvu, the amount of time communicating about cybersecurity was sufficient. | 12 (54.55%) | 10 (45.45%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%)
table 1. effectiveness of communication in cybersecurity course.

data collection and analysis

the purpose of this study is to determine if knowledge transfer through cybersecurity education, as a means to strengthen information security, leads to knowledge utilization within libraries. specifically, this study will examine if research participants will engage in isrm activities after completion of the cybersecurity education course.
the authors examined the model in figure 1 using a survey instrument. the survey instrument was available to former students who completed an online, semester-long cybersecurity course from fall 2013 through fall 2017. one hundred and twenty-six former students completed one of eight cybersecurity courses, and all were asked to participate in this study. thirty-nine students accessed the survey, but only thirty-eight agreed to participate. of those who agreed to participate in the survey, only twenty-two work in a library in the u.s. or a u.s. territory. of the other sixteen participants, twelve do not currently work within a library environment, and four do not have a job. therefore, responses from twenty-two research participants who work in a library in the u.s. or u.s. territory will be reported in this study. table 2 provides a list of the types of libraries the twenty-two research participants work in.

type of library environment | response (22)
academic library | 3 (13.64%)
public library | 11 (50%)
school library (k-12) | 2 (9.09%)
special library | 6 (27.27%)
table 2. types of libraries research participants work in.

having knowledge and an understanding of information security policies, work processes, and information and information system use within a library environment, a knowledge recipient may understand the value of the knowledge shared with them through effective seta programs and utilize the new knowledge to protect information and information resources. according to table 3, most survey participants stated that they have average to excellent knowledge of their library’s computing-related policies, work processes that handle sensitive patron information, how access to patron information is granted, and how internal staff tend to use computing devices to access organizational information.
a few respondents stated that their knowledge is below average.

question | excellent | above average | average | below average | poor
how would you rate your knowledge of your organization’s computing-related policies for internal staff computer usage? | 4 (18.18%) | 10 (45.45%) | 8 (36.36%) | 0 (0.00%) | 0 (0.00%)
how would you rate your knowledge of your library’s work processes that handle sensitive patron information? | 4 (18.18%) | 11 (50%) | 6 (27.27%) | 1 (4.55%) | 0 (0.00%)
within the organization you work for, how would you rate your knowledge of how access to patron information is granted? | 3 (13.64%) | 12 (54.55%) | 5 (22.73%) | 2 (9.10%) | 0 (0.00%)
how would you rate your knowledge on how internal staff tend to use computing devices to access organizational information? | 2 (9.10%) | 11 (50%) | 8 (36.36%) | 1 (4.55%) | 0 (0.00%)
table 3. knowledge of organization’s computing-related policies.

knowledge transfer

for this study, knowledge transfer is measured as the extent to which the cybersecurity student acquired knowledge or understands the key educational objective. according to table 4 below, all survey participants stated that during the cybersecurity course, they acquired knowledge on information security risks, and solutions to manage information security risks within organizations. furthermore, 91 percent of the twenty-two survey participants stated that they gained an understanding of the feasibility to implement solutions and potential impact of not implementing solutions to manage information security risk within the organizations in which they work. this is consistent with previous research that has measured knowledge transfer.26

question: during the cybersecurity course i took at mvu, i _________. | response
acquired knowledge on information security risks within the organization. | 22 (100%)
acquired knowledge on solutions to manage information security risks identified within my organization. | 22 (100%)
gained an understanding of the feasibility to implement solutions to manage information security risks identified within my organization. | 20 (90.90%)
gained an understanding of the potential impact of not implementing solutions to manage information security risks identified within my organization. | 20 (90.90%)
table 4. indicators of knowledge transfer.

knowledge utilization

the desired outcome of knowledge transfer is knowledge utilization.27 this study is interested in the extent to which cybersecurity students have been engaged in information security risk management initiatives in their workplace since the completion of the cybersecurity course. according to table 5, twelve of the twenty-two survey participants have utilized the knowledge transferred to them from the cybersecurity course within the libraries in which they work. of the twelve survey participants, ten performed security procedures within the organization on an ad hoc, informal basis. seven worked on defining new or revised security policies. four implemented new or revised security procedures for organizational staff to follow, and two evaluated at least one security safeguard to determine whether it is being followed by organizational staff.

question: since the completion of the cybersecurity course i took at mvu, i have ______ (please check all that apply). | response
performed security procedures within the organization on an ad hoc, informal basis. | 10 (83.33%)
worked on defining new or revised security policies. | 7 (58.33%)
implemented new or revised security procedures for organizational staff to follow. | 4 (33.33%)
evaluated at least one security safeguard to determine whether it is being followed by organizational staff. | 2 (16.66%)
not performed any security procedures within the organization. | 10 (45.45%)
table 5. indicators of knowledge utilization in the library.
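the percentages in table 5 appear to rest on two different bases: the four activity rows are shares of the twelve respondents who reported using the knowledge, while the final row is a share of all twenty-two respondents. the sketch below reproduces that arithmetic under this assumption; the function name `share` and the shortened row labels are illustrative and do not come from the survey instrument. rounding 2/12 gives 16.67, whereas the table prints the truncated value 16.66.

```python
def share(count: int, base: int) -> float:
    """Return count as a percentage of base, rounded to two decimals."""
    return round(100 * count / base, 2)


USERS = 12        # respondents who reported using the knowledge (table 5)
RESPONDENTS = 22  # all respondents who work in a library

# counts copied from table 5; labels are shortened for readability
activity_counts = {
    "ad hoc security procedures": 10,
    "defined new or revised policies": 7,
    "implemented new or revised procedures": 4,
    "evaluated a safeguard": 2,
}

# activity rows are shares of the twelve knowledge users
activity_shares = {label: share(n, USERS) for label, n in activity_counts.items()}

# the "not performed" row is a share of all twenty-two respondents
no_activity_share = share(10, RESPONDENTS)
```

keeping the base explicit in each call makes it easier to see why the column does not sum to 100 percent: respondents could check several activities, and the last row uses a different base.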
participation

knowledge transfer through cybersecurity education may influence a cybersecurity student to utilize the knowledge they have gained by participating in isrm activities. according to table 6, sixteen of the twenty-two survey participants have participated in isrm activities in the library in which they work since the completion of the cybersecurity course. fifteen communicated with internal peers or staff on training materials. seven performed a policy review, and seven communicated with internal senior management on training materials. five worked on a security questionnaire, one had an interview with an external collaborator, and another research participant analyzed their library’s business or it process workflow.

question: since the completion of the cybersecurity course you took at mvu, have you performed any of the following activities within the workplace: (please check all that apply) | response
security questionnaire | 5 (31.25%)
interview with external collaborator (i.e. trainers) | 1 (6.25%)
policy review | 7 (43.75%)
business or it process workflow analysis | 1 (6.25%)
communication with internal peers or staff on training materials | 15 (93.75%)
communicate with internal senior management on training materials | 7 (43.75%)
i have not performed any security activities in my workplace | 6 (14.29%)
table 6. participation in isrm activities.

participation may also include discussions on isrm activities. according to table 7, sixteen of the twenty-two survey participants have participated in discussions on isrm activities within the libraries where they currently work. fifteen survey participants participated in discussions on physical security, and ten had discussions on password policy. seven survey participants had discussions on user provisioning, and six had discussions on encryption.
four survey participants had discussions on mobile devices, and another four had discussions on vendor security.

question: since the completion of the cybersecurity course you took at mvu, have you participated in discussions on the following areas of security? (check all that apply) | response
password policy | 10 (62.5%)
user provisioning (i.e., establishing or revoking user logons and system authorization) | 7 (43.75%)
mobile device | 4 (25%)
encryption | 6 (37.5%)
vendor security | 4 (25%)
physical security | 15 (93.75%)
disaster recovery, business continuity, or security incident response | 6 (37.50%)
i have not participated in any discussions relating to security in my workplace | 6 (27.27%)
table 7. participation in discussions on isrm activities.

participation in cybersecurity education may lead to formal responsibility or accountability for isrm activities. according to table 8, nine of the twenty-two survey respondents stated that since the completion of the cybersecurity course, they are formally responsible or accountable for isrm in the libraries in which they work. three research participants are responsible for identifying organizational members to participate in cybersecurity training. five survey participants stated that they are responsible for communicating results on cybersecurity training to upper management, peers, and staff. three research participants are responsible for organizational compliance with government regulations. two are responsible for communicating organizational risk to the board of directors, and one research participant is responsible for organizational compliance with funder requirements.

question: since the completion of the cybersecurity course you took at mvu, are you formally responsible or accountable in the following ways? (check all that apply) | response
identifying organizational members to participate in cybersecurity training | 3 (33.33%)
communicating results to upper management | 5 (55.56%)
communicating results to peers or staff | 5 (55.56%)
responsible for organizational compliance of funder requirements | 1 (11.11%)
responsible for organizational compliance with government regulations | 3 (33.33%)
responsible for internal audit | 0 (0%)
responsible for communicating organizational risk to the board of directors | 2 (22.22%)
i am not formally responsible for security in my workplace | 13 (59.10%)
table 8. participation via accountability of isrm activities.

motivation

an objective of seta programs is to motivate knowledge recipients to comply with information security policies that serve to protect information and information resources. as such, cybersecurity education may motivate students to comply with organizational information security policies that serve to protect information and information resources. according to table 9, since the completion of the cybersecurity course, eighteen of the twenty-two survey participants stated that they believe it is important to protect patron sensitive data. two respondents stated that they wholeheartedly feel responsible to protect their patrons from harm, and another two stated that they would be embarrassed if their organization experienced a data breach.

since the completion of the cybersecurity course i took at mvu, _________. | response
i wholeheartedly feel responsible to protect our patrons from harm. | 2 (9.10%)
i believe it is important to protect our patrons’ sensitive data. | 18 (81.82%)
i would be embarrassed if my organization experienced a data breach. | 2 (9.10%)
my job could be in jeopardy if my organization were to experience a data breach. | 0 (0.00%)
i do not care about cybersecurity in my organization. | 0 (0.00%)
table 9. motivation to protect patron privacy.
discussion

the purpose of this study was to evaluate the effects of knowledge transfer as a means to strengthen information security within libraries. given the results from the survey instrument, the findings suggest that knowledge transfer through cybersecurity education can lead to knowledge utilization. specifically, knowledge transfer through cybersecurity education may influence a library employee to utilize the knowledge they have gained by participating in discussions about, and taking on accountability and responsibility for, isrm activities, and by participating in seta programs. seta programs are implemented within organizations as a means to increase compliance with information security policies. the findings suggest that library employees who completed a cybersecurity education course believe that it is important to, or feel that they have a responsibility to, protect patron private information. a couple of research participants stated that they would feel embarrassed if their organization experienced a data breach. a student enrolled in a cybersecurity education course may develop an understanding of, and come to value, the information that is passed on from the knowledge source about isrm activities. with ongoing development and implementation of seta programs, activating a student’s prior knowledge of isrm activities could enhance their ability to process new information and apply it to their job.

limitations and future research

this research was conducted based on an online cybersecurity course offered at a university located in the western u.s. therefore, future research is needed to study how cybersecurity courses in other parts of the u.s. and internationally affect knowledge transfer as a means to strengthen isrm initiatives in libraries and other information organizations. it would also be valuable to conduct a modified version of this research within a classroom-based, face-to-face cybersecurity course.
furthermore, studying seta programs implemented in libraries in the united states and internationally would add to this research area. there were 126 potential research participants identified, and although all were asked to participate, only thirty-eight completed the online survey. of the thirty-eight completed surveys, responses from twenty-two participants were reported in this article. participation from additional research participants may have generated different results.

while a major limitation of this study is its small pilot scale and exploratory focus, a next phase of research should further investigate what type of seta programs would be most effective in different library environments. while cybersecurity education may not be feasible for all library employees to obtain, examining and implementing the most effective seta program for each library environment could strengthen cybersecurity practices in libraries across the u.s. a future study instrument should take into account the factors that influence knowledge transfer (absorptive capacity, communication, motivation, and user participation) as a means to strengthen isrm practices. a common and important outcome of seta programs is user compliance with information security policies. as such, a future study should test library employee knowledge of, and compliance with, information security policies.

conclusion

u.s. libraries handle sensitive patron information, including personally identifiable information and circulation records. with libraries providing services to millions of patrons across the united states, it is important that they understand the importance of patron privacy and how to protect it.
this study investigated how knowledge transferred within an online cybersecurity education course as a means to strengthen information security risk management affects library employee information security practices. the results of this study suggest that knowledge transfer does have a positive effect on library employee information security and risk management practices.

references

1 “public library survey (pls) data and reports,” institute of museum and library services, retrieved on june 10, 2018 from https://www.imls.gov/research-evaluation/datacollection/public-libraries-survey/explore-pls-data/pls-data.
2 “policy concerning confidentiality of personally identifiable information about library users,” american library association, july 7, 2006, http://www.ala.org/advocacy/intfreedom/statementspols/otherpolicies/policyconcerning; “professional ethics,” american library association, may 19, 2017, http://www.ala.org/tools/ethics.
3 “privacy: an interpretation of the library bill of rights,” american library association, amended july 1, 2014, http://www.ala.org/advocacy/intfreedom/librarybill/interpretations/privacy.
4 ibid.
5 “policy concerning confidentiality of personally identifiable information about library users,” american library association; “code of ethics of the american library association,” american library association, amended jan. 22, 2008, http://www.ala.org/advocacy/proethics/codeofethics/codeethics.
6 “policy concerning confidentiality of personally identifiable information about library users,” american library association; “code of ethics of the american library association,” american library association.
7 “privacy: an interpretation of the library bill of rights,” american library association.
8 samuel t.c. thompson, “helping the hacker? library information, security, and social engineering,” information technology and libraries 25, no.
4 (2006): 222-25, https://doi.org/10.6017/ital.v25i4.3355.
9 roesnita ismail and awang ngah zainab, “assessing the status of library information systems security,” journal of librarianship and information science 45, no. 3 (2013): 232-47, https://doi.org/10.1177/0961000613477676.
10 ibid.
11 shayna pekala, “privacy and user experience in 21st century library discovery,” information technology and libraries 36, no. 2 (2017): 48–58, https://doi.org/10.6017/ital.v36i2.9817.
12 tonia san nicolas-rocca, benjamin schooley, and janine l. spears, “exploring the effect of knowledge transfer practices on user compliance to is security practices,” international journal of knowledge management 10, no. 2 (2014): 62-78, https://doi.org/10.4018/ijkm.2014040105; janine spears and tonia san nicolas-rocca, “knowledge transfer in information security capacity building for community-based organizations,” international journal of knowledge management 11, no. 4 (2015): 52-69, https://doi.org/10.4018/ijkm.2015100104.
13 dong-gil ko, laurie j. kirsch, and william r. king, “antecedents of knowledge transfer from consultants to clients in enterprise system implementations,” mis quarterly 29, no. 1 (2005): 59-85, https://doi.org/10.2307/25148668.
14 spears and san nicolas-rocca, “knowledge transfer in information security capacity building for community-based organizations,” 52-69; dana minbaeva et al., “mnc knowledge transfer, subsidiary absorptive capacity and hrm,” journal of international business studies 45, no. 1 (2014): 38-51, https://doi.org/10.1057/jibs.2013.43; geordie stewart and david lacey, “death by a thousand facts: criticising the technocratic approach to information security awareness,” information management & computer security 20, no.
1 (2012): 29-38, https://doi.org/10.1108/09685221211219182; mark wilson et al., “information technology training requirements: a role- and performance-based model” (nist special publication 800-16), national institute of standards and technology (2018), https://www.nist.gov/publications/information-technology-security-training-requirementsrole-and-performance-based-model; san nicolas-rocca, schooley, and spears, “exploring the effect of knowledge transfer practices on user compliance to is security practices,” 62-78.
15 spears and san nicolas-rocca, “knowledge transfer in information security capacity building for community-based organizations,” 52-69.
16 janine l. spears and henri barki, “user participation in information systems security risk management,” mis quarterly 34, no. 3 (2010): 503-22, https://doi.org/10.2307/25750689; piya shedden, tobias ruighaver, and atif ahmad, “risk management standards-the perception of ease of use,” journal of information systems security 6, no. 3 (2010): 23–41.
17 shedden, ruighaver, and ahmad, “risk management standards-the perception of ease of use,” 23-42; janne hagen, eirik albrechtsen, and stig ole johnsen, “the long-term effects of information security e-learning on organizational learning,” information management & computer security 19, no. 3 (2011): 140-154, https://doi.org/10.1108/09685221111153537.
18 “code of ethics of the american library association,” american library association.
19 spears and san nicolas-rocca, “knowledge transfer in information security capacity building for community-based organizations,” 52-69; wilson et al., “information technology training requirements: a role- and performance-based model” (nist special publication 800-16).
20 thompson s.h.
teo and anol bhattacherjee, “knowledge transfer and utilization in it outsourcing partnerships: a preliminary model of antecedents and outcomes,” information & management 51, no. 2 (2014): 177–86, https://doi.org/10.1016/j.im.2013.12.001; ko, kirsch, and king, “antecedents of knowledge transfer from consultants to clients in enterprise system implementations,” 59-85; minbaeva et al., “mnc knowledge transfer, subsidiary absorptive capacity and hrm,” 38-51; geordie stewart and david lacey, “death by a thousand facts: criticising the technocratic approach to information security awareness,” information management & computer security 20, no. 1 (2012): 29-38, https://doi.org/10.1108/09685221211219182.
21 martin spraggon and virginia bodolica, “a multidimensional taxonomy of intra-firm knowledge transfer processes,” journal of business research 65, no. 9 (2012): 1,273-82, https://doi.org/10.1016/j.jbusres.2011.10.043; shizhong chen et al., “toward understanding inter-organizational knowledge transfer needs in smes: insight from a uk investigation,” journal of knowledge management 10, no. 3 (2006): 6-23, https://doi.org/10.1108/13673270610670821.
22 maryam alavi and dorothy e. leidner, “review: knowledge management and knowledge management systems: conceptual foundations and research issues,” mis quarterly 25, no. 1 (2001): 107-36, https://doi.org/10.2307/3250961.
23 ko, kirsch, and king, “antecedents of knowledge transfer from consultants to clients in enterprise system implementations,” 59-85.
24 san nicolas-rocca, schooley, and spears, “exploring the effect of knowledge transfer practices on user compliance to is security practices,” 62-78; spears and san nicolas-rocca, “knowledge transfer in information security capacity building for community-based organizations,” 52-69.
25 spears and san nicolas-rocca, “knowledge transfer in information security capacity building for community-based organizations,” 52-69; spears and barki, “user participation in information systems security risk management,” 503-22.
26 san nicolas-rocca, schooley, and spears, “exploring the effect of knowledge transfer practices on user compliance to is security practices,” 62-78; janine l. spears and tonia san nicolas-rocca, “information security capacity building in community-based organizations: examining the effects of knowledge transfer,” 49th hawaii international conference on system sciences (hicss), koloa, hi, 2016, 4,011-20, https://doi.org/10.1109/hicss.2016.498; ko, kirsch, and king, “antecedents of knowledge transfer from consultants to clients in enterprise system implementations,” 59-85.
27 ko, kirsch, and king, “antecedents of knowledge transfer from consultants to clients in enterprise system implementations,” 59-85; teo and bhattacherjee, “knowledge transfer and utilization in it outsourcing partnerships: a preliminary model of antecedents and outcomes,” 177–86.

subject reference lists produced by computer

ching-chih chen: massachusetts institute of technology, boston, massachusetts (formerly university of waterloo) and e. robert kingham: university of waterloo, waterloo, ontario, canada.

a system developed to produce fourteen subject reference lists by ibm 360/75 is described in detail. the computerized system has many advantages over conventional manual procedures. the feedback from students and other users is discussed, and some analysis of cost is included.

introduction

the university of waterloo, with the third largest enrollment in the province of ontario, was the first in canada to institute a "cooperative education plan".
undergraduate students enrolled in cooperative courses (all engineering and some science, mathematics and arts students) spend eight four-month terms at the university for academic studies, alternated with six four-month terms with industry or government for practical experience related to their academic programmes.

an ibm 360/75 at the university of waterloo is the heart of the largest university computer installation in canada, and is an important tool for faculty, students and administration. under multi-processing it can serve many departments through terminals around the campus. one terminal serves the data processing department of the computer centre, where all the maintenance and printing of various reports required for the project under discussion are handled for the engineering, mathematics & science library (e.m.s. library).

the e.m.s. library contains approximately 75,000 volumes of monographs, periodicals, technical reports and government documents, and currently receives 1,650 periodical titles. it serves about 4,500 on-campus students and more than 300 faculty members in the fields of engineering, mathematics and science (in 1967/68), and provides assistance on request to business and industry in the area.

system

since e.m.s. library users have frequently requested subject reference lists to guide them in the use of library materials, and library reference statistics have proved that there is a justified need for them (1), the reference staff began, in the fall of 1966, to investigate means of compiling and producing these lists. it was planned that each subject list should first be prepared and edited by reference librarians, but at that point, conventional manual procedures should be abandoned in favor of using the computer available on campus.
in this way, operations in revising and updating the lists and in adding new lists in other subject areas would be simplified, manual clerical work would be reduced significantly (no typing would be needed) and titles related to interdisciplinary areas of study could be easily coded to appear on more than one list.

although library literature contains numerous accounts of library automation programmes (2), it is very obvious that the chief emphasis has been on technical services and circulation applications. so far as "reference services" or "information services" go, many developments have been discussed in recent years in the areas of documentation, indexing, retrieval techniques and systems, selective dissemination of information, interlibrary communication, etc. concise summaries can be found in many papers (3, 4, 5, 6, 7). however, in the initial stages of developing our system, we failed to locate any existing mechanized system of producing subject bibliographies for reference use. such subject reference lists could be easily generated if the library catalogue were in machine readable form (6, 8), but since a computerized catalogue was not foreseen at waterloo for some time to come, the library had to design and develop an independent system to fulfil reference needs.

since december 1965, the university of waterloo libraries have achieved success in producing a serials list by computer. the techniques used in the original serials project (using an ibm 1620 with card input), which started in spring 1965 and was completed in december 1965, were not new, and the fields and codes used were based on modifications of those used by the national research council library (9) and dalhousie university [the dalhousie aau list] (10). these techniques have also been used with various modifications at several libraries in the united states, such as m.i.t. libraries (11).
in 1966, the waterloo serials project was greatly modified by conversion from ibm 1620 to ibm 360, and from a card system to a tape system, by re-writing the fortran ii pro180 journal of library automation vol. 1/3 september, 1968 gramme in rpg (report programme generator) and by expanding and adding certain data fields. the reference project was initiated in november 1966. it was apparent that, after the revision of the serials project, the newly improved serials. system could be adapted to maintain the master file of the reference subject lists. the project is unique in that it uses a separate code structure that makes possible the provision of information from the master file by types of materials within each subject area. it was decided that the existing library serials maintenance form could be used with minor modifications to produce reference lists. the original form was . modified to facilitate maintenance of the master file by the lib_rary reference staff and easy transcription onto cards by keypunch operators. through the use of these forms, the master file was created and is kept up-to-date. reference master file there are four record types in the master file, each of which is 64 characters in length. they are stored on tape in a blocked length of 6,400 characters for faster processing on the computer, tape being a relatively slow input-output device. the fields in each of the record types are as follows: 1. reference 1st record 1-7 serial number 8-10 record type code 1 [blank] [blank] 11 form type 12~21 classification number 22-32 cutter number 33-34 agent number 35 country code 36 language code 37-38 department code 39 serial exclusion code (for future use) 40-42 sequence number 43 library location 44-64 f~ller (for future use) 2; reference title record 1-7 serial number 8-10 record type code (2nn) 11-63 title information 64 filler (for future use) 3. 
reference holdings 1-7 serial number 8-10 record type code (3nn) 11-63 holdings information 64 filler (for future use) ' ( subject refetence lists/ chen and kingham 181 4. reference notes record 1-7 serial number 8-10 record type code ( 4nn) 11-63 notes information 64 filler (for future use) library data processing yes additions fig. 1. flowchart of maintenance run. computer print run --, i i i i i i i i i i i i i __ j i i r ( 182 journal of library automation vol. 1/ 3 september, 1968 programmes were written in r.p.g. ( 12) to achieve operational status rapidly with a minimum of debugging. r.p.g. is a problem-oriented ian· guage designed to provide users with an efficient, easy-to-use technique for generating programmes. a set of specification sheets is required, on which the user makes entries. the forms are simple and the headings on the sheets are largely self-explanatory. library i data processing i source '1'0 mai~alice .., ___ ...._ __ -t form oh display imtbe library fig. 2. flowchart of listings run. compu'rer porlic 'l'ransaction sort public print run subject reference lists/ chen and kingham 183 there are three phases to the e.m.s. computer runs: 1) maintenance run (weekly or as required) (figure 1); 2) listings run (monthly ) (figure 2); 3) masters run (semi-annually) (figure 3). . prints!iop subjjx:t referjlfce booklets printed library available 'lo stud!m's '-----1~ at. 25¢ p!3 copy fig. 3. flowchart of masters run. computer 184 journal of library automation vol. 1/ 3 september, 1968 library maintenance form most of the fields on this form (figure 4) are self-explanatory; however, the following may need further definition. library maintenance form serial no. [ lxi i i i i i i • i >p l • lsl•l7 le l' insert: "a" f or addit ion,"(;" for c:hange, or "d" for deletio n f 0 r m <:lass if i cation <: utter. i /k i i i i i l l_l i i i i seria lswhi te reference pink agnt. <: l dept . s e f. code t code n a no. e & 8. ~ ~ ·~· seq. no. 
5 y <: r n n n i l 10 ii 11 ll 14 is 16 17 11 19 20 2 1 22 21 1 4 1 5 16 17 1 8 19 30 31 l2 ll h ls 16 37 38 19 40 41 42 4] 4<4 45 when "change' ' has been checked above an oit aff ects title holdings or notes i ndic ate the type of chang e with a-a dd ition c-change 0-0eletion here; place a li ne ty p e code t-title h-holoing nnot es he re;; place ttle sequence number witt~in t he line type he re : l 10 ll·l l 13 i s 20 25 ) 0 fig. 4. library maintenance form. columns 1-2: card code there are six possible codes: 35 •o 1. a[blank] new entry to master file. so 5 5 __l i 60 65 2. c[blank] change to record type 1 (see cols. 10 12 as described below). 3. cal 4. cc~ 5. cdj 6. d[blank] change in lines for an existing entry on the master file, which add, change or delete respectively title, holdings and/ or notes. deletion of an entry from master file. subject reference lists/ chen and kingham 185 columns 3-9: serial number (major sequence of master file) serial number is assigned to every new entry to maintain the alphabetical order of the complete listing. it consists of one alphabetic character taken from the first letter of the main entry, followed by six numerics which serve to make each entry unique within the letter. columns 10-12: (minor sequence of master file) there are four record type codes: · · 1. record type 1 one record permitted per entry ( information on call number, subject matter of the entry and other data). cols. 11 & 12 not 2. record type tl record type h ~ record type n j column 13: form code used. col. 10 contains "title", "holdings" & "notes" information respectively from cols. 13-65 inclusive. cols. 11 & 12 permit up to 99 lines per record type per entry. this alphabetic code represents form of publication, e.g., "a" stands for "abstract", "p" stands for "periodical'' etc ... column 39-40: department code this numeric code indicates the subject list or lists which reference librarians. assign to each entry, and there are two code types: 1. 
prime department numbers, of which there are 14, e.g.
   20 physics: to appear on the physics list.
2. implied department numbers: to appear on 2 or more of the prime department lists, e.g.
   41 math., phys. & chem.: to appear on the math., phys. & chemistry lists.
   60 general: to appear on all fourteen subject lists.
   etc.

column 42-44: sequence number
col. 42 is always "r", which stands for "reference list". cols. 43 & 44 are a numeric code which indicates type of reference materials, e.g.
   12 reference books - dictionaries
   14 reference books - handbooks and tables
   60 abstracts and indexes
where "reference books" & "abstracts and indexes" are section headings, and "dictionaries" & "handbooks and tables" are sub-section headings.

pre-edit report
the programme that produces the pre-edit report (figure 5) checks the maintenance transactions for the following known error conditions:
1. card code invalid.
2. serial number invalid.
3. sequence number invalid.
4. first record card: card columns 46-80 should be blank.
5. agent code invalid.
6. country code invalid.
7. language code invalid.
8. department number invalid.
9. exclude code should be "x" or blank.
10. reference code invalid.
11. library location invalid.
12. deletion card: card columns 10-80 should be blank.
13. 1st record card missing on addition.
14. title, holding or note card sequence error.
15. title, holding or note delete card: card columns 13-80 should be blank.
16. title, holding or note addition or change card: card columns 66-80 should be blank.
this step catches approximately 80% of the clerical and keypunching errors.

[fig. 5. pre-edit report. each invalid card is printed beneath a card column guide together with its error message; the sample run ends with the counts: 9 valid and 7 invalid maintenance cards read.]

maintenance report
the maintenance report (figure 6) shows:
1. additions to and deletions from the master file.
2. changes to the master file, displaying the entry as it was before the change and as it is after the change. this permits easy recovery of the previous record when errors are made.

[fig. 6. maintenance report. the sample run ends with the totals: master file records read 6,292; records added 162; records deleted 55; master file records written 6,399; invalid maintenance records not processed 8.]

3.
two types of error conditions that fail to appear in the pre-edit report due to the absence of the master file in the pre-edit programme:
   a. additions where serial numbers and/or sequence numbers (cols. 10-12) exist already.
   b. changes/deletions where serial numbers and/or sequence numbers are non-existent.
4. master file maintenance statistics on:
   a. master file records read.
   b. number of records added.
   c. number of records deleted.
   d. master file records written.
   e. number of invalid maintenance records not processed.

addition list
this list (figure 7) is an alphabetical summary (in serial number sequence) containing added entries only from the "maintenance" run. information on call number, complete bibliographical data of the entry, department or subject code (cols. 39-40) and location are printed for each entry. this augments the internal reference list between "listings" runs (see figure 2).

[fig. 7. addition list.]

internal reference list
this is a complete alphabetical list (figure 8) of all entries on the master file, similar to the addition list (figure 7) in arrangement and format. the serial number sequence facilitates the reference staff's assignment of unused serial numbers to new entries and the easy location of serial numbers of entries for updating purposes. this document is the prime source of information for maintaining the master file.
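the master-file checks and statistics listed under the maintenance report can be sketched in a few lines. the following python is only an illustration, not the r.p.g. programme used at waterloo: the function name `run_maintenance`, the tuple layout of a transaction, and the reduction of the six card codes to three record-level actions (a, c, d) are assumptions made for the example, and the serial numbers and titles are borrowed from the figures purely as sample data.

```python
def run_maintenance(master, transactions):
    """Apply maintenance transactions to a master file and tally statistics.

    master       -- dict mapping (serial_number, sequence) -> record text
    transactions -- list of (action, serial_number, sequence, data) tuples,
                    where action is "A" (add), "C" (change) or "D" (delete).
    ASSUMPTION: this record-level model is a simplification of the card
    codes described in the article, chosen only for illustration.
    """
    updated = dict(master)
    stats = {"read": len(master), "added": 0, "deleted": 0, "invalid": 0}
    for action, serial, seq, data in transactions:
        key = (serial, seq)
        exists = key in updated
        if action == "A" and not exists:
            updated[key] = data
            stats["added"] += 1
        elif action == "C" and exists:
            updated[key] = data
        elif action == "D" and exists:
            del updated[key]
            stats["deleted"] += 1
        else:
            # Addition of an existing key, change/deletion of a missing key,
            # or an unknown action: these can only be caught here, because
            # the pre-edit step does not have the master file available.
            stats["invalid"] += 1
    stats["written"] = len(updated)
    return updated, stats


# Hypothetical sample data (titles taken from the figures for flavour only).
master = {
    ("a002500", "t01"): "asea journal",
    ("a020000", "t01"): "acoustical society of america.",
    ("a028000", "t01"): "acta chemica scandinavica.",
}
transactions = [
    ("A", "d200000", "t01", "design quarterly."),  # valid addition
    ("A", "a002500", "t01", "asea journal"),       # rejected: already exists
    ("C", "z999999", "t01", "no such entry"),      # rejected: non-existent
    ("D", "a028000", "t01", None),                 # valid deletion
]
updated, stats = run_maintenance(master, transactions)
print(stats)
```

in this model the statistics always satisfy read + added - deleted = written. the same identity holds for the totals printed in figure 6: 6,292 records read + 162 added - 55 deleted = 6,399 records written.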
public list
the main list (figure 9) is first divided by subjects, of which there are fourteen: mathematics, astronomy, biology, chemistry, earth sciences, physics, design, management sciences, aero engineering, chemical engineering, civil engineering, electrical engineering, mechanical engineering and nuclear engineering. each subject list is further divided into the following sections and sub-sections:
1. reference books
   a. guides to the literature and bibliographies
   b. periodical listings
   c. dictionaries
   d. encyclopedias
   e. handbooks and tables
   f. directories - individuals
   g. directories - organizations
   h. international conferences
2. standards and patents
3. important series
4. theses
5. abstracts and indexes
6. periodicals

[fig. 8. internal reference list.]

[fig. 9. public list.]

reference booklets
it is planned that semiannually the e.m.s. library will receive from the computer centre the computer-produced masters, which are exact duplicates of the public list except that they are printed on unlined paper with a special printer ribbon. the masters are then sent to the university's printshop, and the fourteen separate reference booklets are printed from offset masters photo-reduced to 75% of the original. this results in a publication of convenient size (8½" x 5½") with clearer typographical representation than the actual computer printout. figure 10 shows a representative page from the aero engineering list of the first edition.

[fig. 10. page (actual size) in the aero engineering list.]

table 1. information on first edition copies

                          number   estimated    copies ordered        sold to
                          of       printing     first      second     students
subject                   pages    cost/copy    printing   printing   & faculty
astronomy                   9      14¢           30         40           7
biology                    16      18¢           90                     37
chemistry                  16      18¢          150                     64
earth sciences             22      21¢           50                     19
physics                    15      18¢          100                     44
design                     14      17¢           50                     12
management sciences        11      16¢           30        100          88
mathematics                15      18¢          150                     81
aero engineering           20      20¢           30         40          11
chemical engineering       28      24¢          100                     46
civil engineering          23      22¢          100                     57
electrical engineering     26      23¢          100                     65
mechanical engineering     27      24¢          100                     44
nuclear engineering        16      18¢           30         40           5

discussion

first edition
an estimated number of copies for each list, as shown in table 1, was ordered on the basis of student enrollment figures in different departments of the faculties of engineering, mathematics and science, and on the subject matter of each list in relation to the academic programmes of the university. it was hoped that those copies could adequately meet the demands of students, faculty and interested people outside the university until the completion of the second edition, tentatively set then for september 1968.

the first edition of the reference lists was available for distribution at the end of november 1967. experience having shown that free library materials were no sooner received than discarded, it was decided to give some value to these lists by a charge of 25¢ per copy.
from the start students responded so enthusiastically to the lists that one week after their availability, the library had to order a second printing of 100 additional copies of the "management science" list, and by the end of february 1968, 40 additional copies each of the "aero engineering," "nuclear engineering" and "astronomy" lists. table 1 gives information on quantities printed, costs, and sale to students and faculty of first edition lists. the estimated printing cost per copy is based on printing of 100 copies.

after the announcement of the availability of the lists in several library professional journals, the e.m.s. library received many letters of inquiry and requests for complimentary copies. complimentary distribution was made of 12 sets and some 80 lists of different subjects. purchase orders were received for 83 complete sets of lists: 21 from canada, 58 from the united states, and two each from australia and england. by the end of march 1968, stock of the first edition was exhausted, and there were still 44 purchase orders unfilled and 28 filled only partially.

questionnaire
instead of ordering more copies of the first edition from the university printshop to meet the requests received thus far, the reference staff decided to work on a second edition, and the original scheduled completion date of that edition was moved ahead to early june 1968.

although the e.m.s. library had already received many valuable suggestions and comments on the project from waterloo faculty and interested librarians in canada and the united states, including some very enthusiastic library school professors, there was little feedback at that time from the immediate users, the students, on their use of the lists.
since addresses and department affiliations of most of those who purchased lists had been recorded, it was possible to send out questionnaires (figure 11) to 210 undergraduates, 122 graduate students and 30 faculty members in the beginning of april 1968. by april 20, 65 returns (31%) were received from the undergraduates, 41 (33.6%) from the graduates and 11 (36.6%) from the faculty. a summary of those returns, shown in table 2, has been of great help in assessing the value of the first edition. from the replies it is certain that almost all who purchased lists found them useful and would be willing to buy the updated edition. most important, students used the list for research purposes (including term papers and thesis work), thus fulfilling the original purpose of the project. another fact emerging from the questionnaire was that the number of serial titles included should be greatly expanded.

second edition
reference librarians started at the end of april to update the fourteen subject reference lists by incorporating the valuable feedback and comments received and to compile the fifteenth list, "optometry" (the university of waterloo has had a new optometry school since september 1967). many changes, additions and deletions have been made, and the serial titles greatly expanded as requested by users. a new sub-section heading has been created under the section "reference books" for reference materials of a very general nature; thus materials such as encyclopaedia canadiana, canada yearbook, etc. are pulled out from sub-sections such as "reference - encyclopaedia" and "reference - handbooks & tables" to the sub-section "reference - general" at the very beginning of each subject list. it is estimated that the second edition will be available at the beginning of june. a comparison of the two editions is shown in table 3.

university of waterloo
e.m.s. library

according to our records, you have purchased one (or more) of the reference booklets. in order to plan for a second edition, and to assess the value of the first edition, we would be most grateful if you would fill out this questionnaire as completely as possible and mail it to us before april 20, 1968. it is not necessary to sign your name.

1. have you used your reference list? (yes / no)
   a. if so did you find it helpful? (yes / no)
   b. for what purpose did you use the list? (regular studies / research, including term papers, thesis work)
   c. did you use the list in place of the serials list and card catalogue? (yes / no)
2. did the list save you time in your use of the library? (yes / no)
3. should the list include more or fewer titles? (more / fewer)
   a. which areas do you feel should be expanded or deleted? (expanded / deleted, for each of the following)
      guides to the literature & bibliographies
      periodical listings
      dictionaries
      encyclopedias
      handbook and tables
      directories - individuals
      directories - organizations
      international conferences
      standards and patents
      important series
      theses
      abstracts and indexes
      periodicals
   b. which specific titles do you feel should be added?
   c. which specific titles do you feel should be deleted?
4. would you be interested in buying an updated edition of the reference list?
5. additional comments
6. undergraduate / graduate / faculty

thank you for answering this questionnaire. if you would like to discuss further anything pertaining to the reference lists, please feel free to call us.

fig. 11. questionnaire on use of reference subject lists.

table 2. summary of questionnaire returns

question                undergr.   grad.   fac.
1.   yes                   39        30      5
     no                    26        11      5
1a.  yes                   31        26      5
     no                     8         3      1
1b.  studies               16         9      1
     research              27        23      5
1c.  yes                   19        12      2
     no                    25        20      4
2.   yes                   27        21      6
     no                    11        11      1
3.   more                  45        32      5
     fewer                 2
3a.  handb.   exp.         16        18      7
              del.          2         1      2
     series   exp.         19         8      3
              del.         1
     theses   exp.         15        14      1
              del.
     abst.    exp.         15        13      3
              del.         1
     per.     exp.         28        23      6
              del.         2  1
4.   yes                   23        24      6
     no                    17  8
6.                         65        41     14
[rows with fewer than three figures are reproduced as extracted; the columns they belong to cannot be determined.]

cost
although up to this time the computer centre has made no internal charge for its services to the library, it is estimated that with the university's present computer configuration, the monthly cost of maintaining this project is approximately $40.00. this cost covers about 4 minutes/month computer time, about 2 hours/month for keypunching and verifying, and the cost of punch cards, multipart paper, etc., but does not cover the initial cost of system analysis and the charges for printing the booklets.

table 3. first and second editions compared

edition                                  i           ii
completion date                          nov./67     june/68
no. of records on master file            c.5,500     7,446
up-dating (no. of entries)
   addition                                          280
   change                                            216
   deletion                                          7
no. of subject lists                     14          15
number of pages of each subject list
   aero engineering                      20          26
   chemical engineering                  28          37
   civil engineering                     23          31
   electrical engineering                26          34
   mechanical engineering                27          35
   nuclear engineering                   16          21
   design                                14          17
   management sciences                   11          15
   mathematics                           15          21
   astronomy                              9          14
   biology                               16          23
   chemistry                             16          27
   earth sciences                        22          27
   physics                               15          22
   optometry                                         15

by comparison, it would cost approximately $95.00 per month to produce the copy by hand, and this method
would not provide the flexibility and other advantages of a computerized system.

references
1. chen, ching-chih: "computer-produced subject reference lists," iplo newsletter, 9 (feb. 1968), 38-40.
2. mccune, lois c.; salmon, stephen r.: "bibliography of library automation," ala bulletin, 61 (june 1967), 674-94.
3. black, donald v.; farley, earl a.: "library automation," in annual review of information science and technology, edited by carlos a. cuadra (new york, interscience: wiley), 1 (1966), 273-303.
4. schultz, claire k.: "automation of reference work," library trends, 12 (jan. 1964), 413-424.
5. brownson, helen l.: "new developments in scientific documentation," cla occasional paper, no. 32, 1961.
6. hammer, donald p.: "automated operations in a university library; a summary," college & research libraries, 26 (jan. 1965), 19-29, 44.
7. prodrick, r. g.: "automation can transform reference services," ontario library review, 51 (sept. 1967), 145-50.
8. cox, n. s. m.; dews, j. d.; dolby, j. l.: the computer and the library; the role of the computer in the organization and handling of information in libraries (hamden, conn.: archon books, 1967), 78-84.
9. brown, j. e.; walters, peter: "mechanized listing of serials at the national research council library," canadian library, 19 (may 1963), 420-26.
10. wilkinson, john p.: "a.a.u. mechanized union list of serials," apla bulletin, 29 (may 1965), 54-59.
11. nicholson, natalie n.; thurston, william: "serials and journals in the m.i.t. library," american documentation, 9 (1958), 304-7.
12. international business machines corporation: "ibm system 360 operating system-report programme generator specifications," ibm system reference library, file no. s360-20, form c24-3337 (ibm programming publications dept. 452, san jose, calif. 95114, 1965 + revisions).
application of the variety-generator approach to searches of personal names in bibliographic data bases - part 2. optimization of key-sets, and evaluation of their retrieval efficiency

dirk w. fokker and michael f. lynch: postgraduate school of librarianship and information science, university of sheffield, england.

journal of library automation vol. 7/3 september 1974

keys consisting of variable-length character strings from the front and rear of surnames, derived by analysis of author names in a particular data base, are used to provide approximate representations of author names. when combined in appropriate ratios, and used together with keys for each of the first two initials of personal names, they provide a high degree of discrimination in search. methods for optimization of key-sets are described, and the performance of key-sets varying in size between 150 and 300 is determined at file sizes of up to 50,000 name entries. the effects of varying the proportions of the queries present in the file are also examined. the results obtained with fixed-length keys are compared with those for variable-length keys, showing the latter to be greatly superior. implications of the work for a variety of types of information systems are discussed.

introduction
in part i of this series the development of variety generators, or sets of variable-length keys with high relative entropies of occurrence, from the initial and terminal character strings of authors' surnames was described.1 their purpose, used singly or in combination, is to provide a high and constant degree of discrimination among personal names so as to facilitate searches for them. in this paper the selection of optimal combinations of the keys and evaluation of their efficiency in search are described. the performance of combined key-sets of various compositions is determined at a range of file sizes and compared with fixed-length keys. in addition, the extent of statistical associations among keys from different positions in the names is determined.

balancing of key-sets
the relative entropies of distribution of the first and last letters of the surnames of authors in the file of 100,000 entries from the inspec data base differ significantly, the former being 0.92 and the latter 0.86. as a result, a larger key-set has to be produced from the back of the surnames to reach the same value of the relative entropy as that of a key-set of given size from the front of the surname. for instance, the value of 0.954 is reached by a key-set comprising 41 keys from the front of the name, but a set of 101 keys from the back is needed to attain this value.

it seemed reasonable to assume that keys from the front and rear should be combined in different proportions in order to maximize the relative entropy of the combined system, and that their proportions should reflect the redundancies of each distribution (redundancy = 1 - hr). in order to test this, a series of combined key-sets of different total sizes was produced, in which the proportions of keys were varied around the ratio of the redundancies of the first and last character positions, i.e., (1 - 0.92):(1 - 0.86), or 8:14. the relative entropies of the name representations provided by combining these key-sets with keys for the first and second initials were determined by applying them to the 50,000 name file, and the entropy value used to determine the optimal ratio of keys. in one case, the correlation between the value of the relative entropy and retrieval efficiency, as measured by the precision ratio, was also studied, and shown to be high.

the sizes of the combined key-sets studied were 148 and 296, with an intermediate set of 254 keys. the values of 148 and 296 were chosen in view of the projected implementation in the serial-parallel file organization.2 this relates the size of the key-set to the number of blocks on one cylinder of a disc.
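the entropy measures used in balancing the key-sets can be made concrete with a short calculation. the python below is a minimal sketch, not the authors' programme: it takes the frequency distribution reported later in table 6 (296-key set, 50,000 names), computes the entropy h of the name representations, divides by hmax = log2 50,000 to obtain the relative entropy hr, and also evaluates the redundancy ratio (1 - 0.92):(1 - 0.86). the function names are hypothetical, and one assumption is needed because the extracted table lists frequencies 1-16 but only 14 counts: the last two counts are assigned to f = 15 and f = 16, which is the assignment for which the entries total exactly 50,000.

```python
import math

# Table 6 of the source: for each frequency f (key), the number of distinct
# name representations (value) that occur f times in the 50,000-name file.
# ASSUMPTION: the final two counts belong to f = 15 and f = 16 (see lead-in).
TABLE_6 = {1: 31705, 2: 5394, 3: 1371, 4: 442, 5: 164, 6: 63, 7: 27, 8: 12,
           9: 4, 10: 3, 11: 2, 12: 2, 15: 1, 16: 1}


def total_entries(freq_counts):
    """Number of name entries covered by the distribution."""
    return sum(f * n for f, n in freq_counts.items())


def entropy_bits(freq_counts):
    """Shannon entropy (bits) of the representations, H = -sum p log2 p,
    where a representation occurring f times has p = f / total."""
    total = total_entries(freq_counts)
    return -sum(n * (f / total) * math.log2(f / total)
                for f, n in freq_counts.items())


def redundancy(relative_entropy):
    """Redundancy as defined in the text: 1 - Hr."""
    return 1.0 - relative_entropy


total_names = total_entries(TABLE_6)
distinct = sum(TABLE_6.values())
h = entropy_bits(TABLE_6)
h_max = math.log2(total_names)   # maximum entropy taken as log2 50,000
h_r = h / h_max

# Ratio of the redundancies of the first and last letters of the surname.
redundancy_ratio = redundancy(0.92) / redundancy(0.86)

print(total_names, distinct)
print(round(h, 4), round(h_max, 4), round(h_r, 4))
print(round(redundancy_ratio, 4))
```

under the stated assumption the sketch reproduces the totals of table 6 (50,000 entries, 39,191 distinct representations) and agrees with the reported hmax = 15.6096 and hr = 0.9679; the redundancy ratio evaluates to 8/14.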
the 30 mbyte disc cartridges available to us have 296 blocks per cylinder. otherwise the choice of key-set is arbitrary, and can be varied at will. the minimum key-set size is 106, consisting of 26 letters each for the first and last letter of the surname, and 27 (26 letters and the space symbol) each for the first and second initials. the numbers of n-gram keys (n ≥ 2) required for the key-sets numbering 148, 254, and 296 in size are thus 42, 148, and 190. full details are given of the composition of the first and third of these sets.

a slight refinement to key-set generation was employed to ensure as close an approximation to equifrequency as possible, especially with the smallest key-sets. precise application of a threshold frequency may occasionally result in arbitrary inclusion of either very high or very low frequency keys. thus, if almost all the occurrences of a longer key are accounted for by a shorter key (as with -mann and -ann), only the shorter n-gram is included.

optimal set of 148 keys
the number of n-gram keys (n ≥ 2) to be added to the minimum set of 106 keys is 42, the presumed optimum proportion being 8:14, which implies about 16 keys from the front of the name and 26 from the back. in order to examine the relationship between the ratio of keys from the front and rear of the surname and the relative entropy of the combined sets, the ratios were varied at intervals between 1:1 and 1:3 so that the numbers of n-grams varied from 21 and 21 to 11 and 31 respectively. for each ratio the keys were applied to the 50,000 name entries, and the distribution of the resultant descriptions determined. the ratios, the number of n-gram keys, and the relative entropies of the distributions are shown in table 1. the maximum value of the entropy is taken to be log2 50,000. in this case the balancing point, with the key-set including 16 n-gram keys from the front and 26 from the back, corresponds with the ratio of the redundancies of the first and last letters of the surnames. table 2 shows the composition of the optimal key-set of 148 keys, while table 3 gives the distribution of the name representations compiled from the combined key-set, and its corresponding relative entropy.

table 1. relation between ratio of n-grams from front and rear of surname, entropy of combined key-sets, and retrieval efficiency for a series of sets of 148 keys

ratio of       number of          number of different   relative    precision (%)
n-gram keys    n-gram keys        representations in    entropy     (file size =
               front    back      50,000 entries        of system   25,000)
1:1            21       21        33,485                0.9450      71.5
3:4            18       24        33,501                0.9450      71.3
17:25          17       25        33,434                0.9447      70.9
8:13           16       26*       33,454                0.9453      72.2
5:9            15       27        33,402                0.9450      72.0
1:2            14       28        33,378                0.9449      72.1
1:3            11       31        33,126                0.9437      71.5
total number of different name entries = 41,469.
* key-set with highest relative entropy.

optimal set of 296 keys
a similar procedure to that used for the optimal 148-key key-set was also applied in this instance. here the ratios of front and rear n-gram keys varied from 57 and 133 to 69 and 121 respectively. for each of the sets chosen, the distributions of the entries resulting from application of the combined key-sets to the file of 50,000 names were determined. these showed virtually no difference in terms of the relative entropy alone, although the total number of different entries differed slightly between key-sets, and the highest value was used to choose the optimal set, detailed in table 4. the range of combinations studied is shown in table 5, and the distribution of the entries for the optimal set is given in table 6.

table 2.
composition of balanced key-set of 148 keys

keys from front of surname (42):
key   p*       key   p*       key   p*       key   p*
a     .035     g     .055     ma    .030     sh    .016
b     .020     h     .035     n     .025     st    .016
ba    .020     ha    .021     o     .017     t     .040
be    .017     i     .013     p     .038     u     .005
bo    .014     j     .017     pa    .014     v     .025
br    .014     k     .041     q     .001     w     .040
c     .036     ka    .017     r     .032     x
ch    .016     ko    .017     ro    .017     y     .011
d     .044     l     .033     s     .049     z     .013
e     .018     le    .014     sa    .016
f     .034     m     .050     sc    .015

keys from rear of surname (52):
a     .060     ii    .015     nn    .010     is    .012
ra    .010     ki    .015     on    .018     t     .042
va    .015     j     .001     son   .027     u     .013
b     .003     k     .033     o     .028     v     .001
c     .005     l     .013     ko    .013     ev    .018
d     .030     el    .012     p     .004     ov    .026
e     .068     ll    .016     q     .001     kov   .012
f     .006     m     .013     r     .016     nov   .011
g     .012     n     .009     er    .064     w     .005
ng    .014     an    .020     ler   .013     x     .003
h     .020     man   .017     ner   .010     y     .031
ch    .017     en    .025     s     .055     ey    .012
i     .044     in    .039     es    .015     z     .013

keys from first initial: 27 characters
keys from second initial: 27 characters

table 3. frequencies of entries represented by optimal 148-key key-set in a file of 50,000 names

frequency f    number of entries with frequency f
1              24,363
2               5,622
3               1,850
4                 757
5                 372
6                 193
7                 103
8                  68
9                  32
10                 24
11-15              54
16-20              11
21-30               4
33                  1
total number of different entries = 33,454
maximum number of possible combinations = 1,592,136 (i.e., 42 x 52 x 27²)
h = 14.7553
hmax = 15.6096 (log2 50,000)
hr = 0.9453

table 4.
composition of balanced key-set of 296 keys

keys from front of surname (87):

a     bu    e     ha    ki    ma    ni    ra    si    wa
al    c     f     he    ko    mar   o     re    so    we
an    ca    fr    ho    kr    mc    p     ri    st    wi
b     ch    g     hu    ku    me    pa    ro    t     x
ba    co    ga    i     l     mi    pe    s     ta    y
bar   d     go    j     la    mo    po    sa    u     z
be    da    gr    jo    le    mu    pr    sc    v
bo    de    gu    k     ll    n     q     se    va
br    do    h     ka    m     na    r     sh    w

keys from rear of surname (155):

a     ld    ng    vskii el    lin   r     or    nt    sov
ca    nd    ang   ki    ll    tin   ar    s     rt    w
da    rd    ing   ski   all   nn    er    as    ert   x
ka    e     rg    wski  ell   on    ber   es    st    y
ma    de    h     li    m     son   der   nes   tt    ay
na    ee    ch    ni    am    lson  ger   is    ett   ey
ina   ge    ich   ri    n     nson  nger  ns    u     ley
ra    ke    vich  ti    an    rson  her   ins   v     ky
ta    le    gh    j     man   ton   ier   os    ev    ry
va    ne    sh    k     rman  o     ker   rs    ov    z
ova   re    th    ak    yan   ko    ler   ss    kov   tz
wa    se    ith   ck    en    nko   ller  ts    ikov
ya    te    i     ek    sen   no    mer   us    lov
b     f     ai    ik    in    to    ner   t     nov
c     ff    hi    l     ein   p     ser   dt    anov
d     g     ii    al    kin   q     ter   et    rov

keys from first initial: 27 characters
keys from second initial: 27 characters

table 5. relation between ratio of n-grams from front and rear of surname and entropy of combined key-sets for a series of sets of 296 keys (file size = 50,000)

ratio of n-gram keys (front:back)      3:7      61:129   13:25    69:121
number of n-gram keys, front           57       61       65       69
number of n-gram keys, back            133      129*     125      121
number of different representations    39,182   39,191   39,186   39,179
relative entropy of system             0.9679   0.9679   0.9679   0.9679

* key-set with highest number of different entries.

in this instance, the ratio of n-gram keys from the front and back of the surnames has been displaced from the ratio of the redundancies of the first and last characters of the surnames, i.e., 8:14 (1:1.7). here the ratio is roughly 1:2. this is undoubtedly due to the fact that the relative entropies of key-sets from the back of the surname increase less rapidly than those of key-sets from the front, and hence larger sets must be employed.

evaluation of retrieval effectiveness

the keys in the optimized key-sets represent name entries in an approximate manner only, so that when a search for a name is performed, additional entries represented by the same combination of keys are identified. while these may be eliminated in a subsequent character-by-character match of the candidate hits, the proportion of unwanted items should remain low if the method is to offer advantages.

206 journal of library automation vol. 7/3 september 1974

table 6. frequencies of entries represented by optimal key-set of 296 keys in a file of 50,000 names

frequency f   number of entries with frequency f
1             31,705
2             5,394
3             1,371
4             442
5             164
6             63
7             27
8             12
9             4
10            3
11            2
12            2
13
14
15            1
16            1

total number of different entries = 39,191
maximum number of possible combinations = 9,830,565 (i.e., $87 \times 155 \times 27^2$)
h = 15.108
hmax = 15.6096 ($\log_2 50{,}000$)
hr = 0.9679

in evaluating the effectiveness of the key-sets in retrieval, the names in the search file were represented by concatenating the codes for the keys from the front and back of the surnames and the initials, and the query names were subjected to the same procedure. the matching procedure produced lists of candidate entries, of which the desired entries were a subset. the final determination was carried out manually. the tests were performed first with names sampled from the search file, so that correct items were retrieved for each query. since searches for name entries may be performed with varying probabilities that the authors' names are present in the file (especially in current-awareness searches), varying proportions of names of the same provenance, but known not to be present in the search file, were also added. in these cases candidate items were selected which included none of the desired entries. recall tests were also performed and recall shown to be complete. the measure used in determining the performance of the variety-generator search method is the precision ratio, defined as the ratio of correctly identified names to all names retrieved.
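the encoding and matching steps just described can be sketched as follows. this is a minimal illustration under stated assumptions, not the authors' program: the miniature key-sets are hypothetical stand-ins for the published 148- and 296-key sets, the longest-match rule for choosing a key is assumed rather than taken from the text, and all function names are invented for the example. a blank stands for a missing initial, which is one way to arrive at 27 characters per initial.

```python
from collections import defaultdict
import string

# Hypothetical miniature key-sets (assumption): every single letter plus a
# few longer n-grams. The published key-sets are much larger.
FRONT_KEYS = set(string.ascii_lowercase) | {"sc", "sh", "st", "ma", "br"}
REAR_KEYS = set(string.ascii_lowercase) | {"er", "son", "man", "in", "ll"}


def longest_key(surname, keys, from_rear=False):
    """Return the longest key that is a prefix (or suffix) of the surname.

    Longest-match selection is an assumption of this sketch.
    """
    for n in range(len(surname), 0, -1):
        gram = surname[-n:] if from_rear else surname[:n]
        if gram in keys:
            return gram
    raise ValueError(f"no key matches {surname!r}")


def encode(surname, initials):
    """Represent a name by four keys: front, rear, first and second initial."""
    first = initials[0] if len(initials) > 0 else " "
    second = initials[1] if len(initials) > 1 else " "
    return (
        longest_key(surname, FRONT_KEYS),
        longest_key(surname, REAR_KEYS, from_rear=True),
        first,
        second,
    )


def build_index(names):
    """Group the names of the search file by their four-key representation."""
    index = defaultdict(list)
    for name in names:
        index[encode(*name)].append(name)
    return index


def search(index, query):
    """Return (candidates, hits): all names sharing the query's keys, and the
    subset that survives an exact character-by-character match."""
    candidates = index.get(encode(*query), [])
    hits = [name for name in candidates if name == query]
    return candidates, hits


# Hypothetical four-name search file.
SEARCH_FILE = [("schneider", "ab"), ("schroeder", "ab"), ("smith", "j"), ("martin", "pl")]
index = build_index(SEARCH_FILE)
candidates, hits = search(index, ("schneider", "ab"))
precision = len(hits) / len(candidates)
print(candidates, hits, precision)
```

in this toy file schneider and schroeder share the combination sc + er + a + b, so a search for either retrieves both as candidates and the precision ratio for that single search is 0.5.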
the precision ratio is presented both as the ratio of averages (i.e., totalling the items over all searches before the ratio is calculated) and as the average of ratios (i.e., averaging the figures for individual searches). the latter gives higher figures, since many of the individual searches give 100 percent precision ratios. the precision ratio was found to be dependent on file size and to fall somewhat as the size of file increases. this is due to the fact that the key-sets provided only a limited, if very high, total number of possible combinations, while the total possible variety of personal names is virtually unlimited.

the evaluation was performed with a sample of 700 names, selected by interval sampling. this number ensured a 99 percent confidence limit in the results. a comparison of the interval sampled query names with randomly sampled names showed that no bias was introduced by interval sampling.

a test to confirm that the retrieval effectiveness reached a peak at the maximum value of the relative entropy of a balanced key-set was performed first. this was carried out on a file of 25,000 names, using as queries names selected from the file and the optimal 148-key key-set. as shown in table 1, the values of the precision ratio (ratio of averages) and of the relative entropy both peak at the same ratio of n-gram keys from the front and back of the surnames.

the performance of the optimal key-sets of 148, 254, and 296 keys with files of 10,000, 25,000, and 50,000 names is shown in table 7. calculated as the ratio of averages, the smallest key-set (148 keys) shows a precision ratio of 64 percent with a file of 50,000 names, which means that of every three names identified in the variety-generator search, two are those desired. with the largest key-set (296 keys), this rises to nine correctly identified names in every ten retrieved at this stage.
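the difference between the two ways of reporting precision can be made concrete with a small sketch. the per-search counts below are hypothetical, chosen only to show why the average of ratios comes out higher when most searches are perfectly precise; the function names are likewise invented for the illustration.

```python
def precision_ratio_of_averages(results):
    """Total correct hits divided by total items retrieved, over all searches."""
    total_correct = sum(correct for correct, _ in results)
    total_retrieved = sum(retrieved for _, retrieved in results)
    return total_correct / total_retrieved


def precision_average_of_ratios(results):
    """Mean of the per-search precision ratios."""
    return sum(correct / retrieved for correct, retrieved in results) / len(results)


# Hypothetical (correct, retrieved) counts for four searches: three are
# perfectly precise, one retrieves four unwanted names alongside the hit.
results = [(1, 1), (1, 1), (1, 1), (1, 5)]

roa = precision_ratio_of_averages(results)
aor = precision_average_of_ratios(results)
print(f"ratio of averages: {roa:.2f}, average of ratios: {aor:.2f}")
```

with these counts the ratio of averages is 4/8 = 0.50, while the average of ratios is (1 + 1 + 1 + 0.2)/4 = 0.80: the single imprecise search weighs on the pooled totals far more than on the mean of the individual ratios.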
on the other hand, calculated as the average of ratios, the precision ratios rise to 81 percent and 94 percent respectively. for smaller file sizes (typical, for instance, of current-awareness searches) the figures for all of these are correspondingly higher.

table 7. precision ratios obtained in variety-generator searches of personal names; queries sampled from search file (confidence level = 99 percent)

precision as ratio of averages (%):

                file size
key-set size    50,000   25,000   10,000
148             64       71       84
254             87       90       93
296             90       91       94

precision as average of ratios (%):

                file size
key-set size    50,000   25,000   10,000
148             81       87       93
254             91       95       97
296             94       96       97

the effect of sampling from a larger file, so that increasing proportions of the names searched for are not present in the search file, is shown in table 8 for a file of 25,000 names. in this case, the proportion of correctly identified names in the total falls, so that overall performance is somewhat reduced. thus, depending both on file size and on the expected proportion of queries identifying hits, the key-set size can be adjusted to reach a desired level of performance.

table 8. effect of varying proportion of query names not present in search file of 25,000 names, using 296 keys (ratio of averages)

% of names not    precision %            number of names   number of names
in search file    (ratio of averages)    retrieved         correctly retrieved
21                90                     766               691
42                85                     595               505
61                83                     449               371
74                76                     319               242
84                68                     228               154

in addition, tests to determine the applicability of a key-set optimized for one file of 50,000 names to another file of the same provenance and size were carried out. the three key-sets derived from the first file were applied to the second, query names sampled from the latter, and the precision ratios determined.
some reduction in performance was observed; expressed as ratio of averages, the precision with the 296-key key-set fell from 90 to 83 percent, with the 254-key key-set from 87 to 82 percent, and with the 148-key key-set from 64 to 56 percent, figures which seem unlikely to prejudice the net performance in any marked way. nonetheless, monitoring of performance and of data base name characteristics over a period of operation might well be advisable.

distribution characteristics of other types of keys

it is particularly instructive to examine the distribution characteristics of other types of keys, including those of fixed length, generated from various positions in the names, and to compare them with those of the optimal key-sets employed in the variety-generator approach. to this end, the file of 50,000 names was processed to produce the following keys or key-sets:

1. initial digram of surname.
2. initial trigram of surname.
3. key-set of ninety-four n-grams from the front of the surname, with first and second initials.
4. key-set consisting of first and last character of surname, with first and second initials.

the figures (table 9) show clearly that all have distributions which leave no doubt as to their relative inadequacy in resolving power, where this is defined as the ratio of distinct name representations provided by the key-set used to the number of different name entries (41,469) in the file. at the digram level, the value of the resolving power is 0.009, i.e., each digram represents, on average, 110 different name entries, while no fewer than thirty-two specific digrams each represent between 500 and 1,000 different names. at the trigram level, the value of the resolving power rises to 0.08, a tenfold increase; however, one trigram still represents between 500 and 1,000 different names.
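the resolving-power figures quoted in this section follow directly from the definition. the sketch below recomputes them from the number of different name entries (41,469) and the totals of distinct representations reported in table 9; the dictionary labels and function name are invented for the illustration.

```python
DISTINCT_NAME_ENTRIES = 41_469  # different name entries in the 50,000-name file

# Totals of distinct representations, as reported in table 9.
DISTINCT_REPRESENTATIONS = {
    "initial digram of surname": 375,
    "initial trigram of surname": 3_301,
    "94 front n-grams plus 2 initials": 18_166,
    "first and last letter plus 2 initials": 26_002,
}


def resolving_power(n_representations, n_entries=DISTINCT_NAME_ENTRIES):
    """Ratio of distinct representations to distinct name entries."""
    return n_representations / n_entries


powers = {
    label: resolving_power(count)
    for label, count in DISTINCT_REPRESENTATIONS.items()
}

for label, power in powers.items():
    # The reciprocal is the average number of entries per representation.
    print(f"{label:40s} {power:.3f}  ({1 / power:.2f} entries per representation)")
```

rounded to three places this gives 0.009, 0.080, 0.438, and 0.627, and the reciprocals (about 110.6, 12.6, 2.28, and 1.59 entries per representation) match the averages cited in the text.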
use of the first and last letters of the surname plus the initials again increases the value of the resolving power to 0.627, or 1.6 distinct names per representation; eight of the representations now account for between thirty-one and forty distinct entries.

table 9. distributions of a variety of other representations of personal names in a file of 50,000 entries

frequency f        initial digram   initial trigram   94 n-grams from front    first and last letter
                   of surname       of surname        of surname plus          of surname plus
                                                      2 initials               2 initials
1                  40               735               8,964                    16,346
2                  22               428               3,929                    4,919
3                  16               249               1,884                    2,025
4                  11               197               1,006                    973
5                  7                170               646                      581
6                  7                110               397                      340
7                  10               112               234                      224
8                  4                98                186                      146
9                  7                81                144                      92
10                 5                66                108                      72
11                 6                61                70                       49
12                 2                56                88                       36
13                 5                51                74                       33
14                 1                48                50                       24
15                 2                35                51                       23
16                 3                37                36                       25
17                 2                35                29                       15
18                 3                33                29                       11
19                 8                35                28                       6
20                 8                40                23                       5
21-30              21               207               127                      49
31-40              23               109               47                       8
41-50              13               88                13
51-100             36               142               3
101-200            24               62
201-500            57               15
501-1000           32               1
total              375              3,301             18,166                   26,002
resolving power    .009             .080              .438                     .627

in contrast, however, the key-set of 148 keys comprising ninety-four n-gram keys from the front of the name and the first and second initials, although almost 50 percent larger than the four-character representation, has a resolving power of only 0.438 (or 2.28 entries per representation). this contrast provides particularly strong evidence for the superiority of keys from the front and rear of the surnames over those from the front alone, even when the latter are variable in length. as expected, the precision ratio of the four-character representation is low, at 37 percent (ratio of averages), compared with 64 percent for the optimal 148-key key-set.

extent of statistical association among keys

thus far, the frequency of occurrence of variable-length character strings from the front and back of the surnames is the only factor considered in their selection as keys.
it is well known in other areas that statistical associations among keys can influence the effectiveness of their combinations.3 where a strong positive association between two keys exists, their intersection results in only a small reduction of the number of items retrieved compared with that obtained by using each independently. when the association is strongly negative, the reduction resulting from intersection may be much greater than that predicted on the basis of the product of the individual probabilities of the keys.

to assess the extent of associations among keys from the front and rear of surnames and initials, sets of both fixed- and variable-length keys from each of these positions were examined. the kendall correlation coefficient v was calculated for each of the twenty most frequent combinations of these. this is related to the chi-square value by the expression $\chi^2 = m v^2$, where m is the file size, or 50,000. table 10 shows the values of the association coefficient for certain of the characters in the full name. those above .012 are significant at a 99 percent confidence level. positive associations are more frequent than negative.

table 10. association coefficients for sets of the most frequent digrams from various positions in personal names

first and last letters    first letter of surname    first and second
of surname                and first initial          initials
digram    v               digram    v                digram    v
kv        .064            kv        .054             hv        .078
wr        .050            hj        .027             mv        .069
ka        .038            br        -.024            kv        .069
hn        .028            sj        -.023            rv        -.055
sa        .024            dj        .022             dv        -.053
sn        .024            bg        .018             tv        .053
cn        .022            ka        .018             jv        -.045
kn        -.020           cj        .018             sv        .034
ma        .014            sd        .015             fv        .033
kr        -.011           sv        .013             nv        -.029
sv        .010            mm        .011             gv        .022
rn        .010            mj        .007             lv        -.022
bn        -.008           bj        .005             iv        -.019
br        .008            sg        -.004            av        -.019
mn        -.007           sr        .004             cv        -.018
sr        .007            ba        .004             pv        .017
mr        .004            ma        .004             wv        -.014
si        -.002           sm        -.003            yv        .010
gn        .001            mr        .002             bv        .005
ln        .001            sa        -.000            ev        -.002
the figures indicate that intersection of certain of these characters as keys in search would result in some slight diminution in performance against that expected. the figures for the association coefficients among the twenty most frequent combinations of keys from the front and back of surnames in the 148- and 296-key key-sets show magnitudes (mostly positive) which are substantially greater than those for single characters (see table 11). the reasons for these values are obvious; in certain instances, e.g., miller, jones, and martin, common complete names are apparent, while in one case, lee, an overlap between keys from the front and rear exists. in others, linguistic variations on common names can be discerned, as with br n (brown or braun).

table 11. association coefficients in the twenty most frequent key combinations from front and back of surnames in two key-sets

key-set size 148          key-set size 296
keys         v            keys         v
s   h        .146         s   ith      .343
j   son      .127         jo  nson     .297
sc  er       .104         jo  nes      .278
w   s        .043         an  rson     .274
t   a        .038         si  gh       .249
t   i        .038         le  ee       .221
w   er       .038         mu  ller     .214
c   e        .034         ta  or       .195
f   er       .033         gu  ta       .168
p   s        .025         br  n        .160
d   e        .023         mi  ller     .151
l   e        .022         mar tin      .145
w   e        .022         wi  s        .137
g   in       .020         f   her      .133
m   e        .009         sc  der      .121
s   a        .008         sa  to       .110
g   e        .006         t   as       .084
m   a        .005         sc  er       .069
m   er       -.004        ch  en       .055
g   er       -.000        t   son      .050

such associations are inevitable. when the selection of keys is based solely on frequency, some deviation from the ideal of independence must result, becoming larger as the size of the key-sets increases, and as the length of certain of the keys increases. however, since its effect in the most extreme cases is merely to lead to virtually exact definition of the most frequent surnames, no particular disadvantage results.
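the relation between v and chi-square used in this section can be illustrated numerically. the sketch below assumes that v is computed from a 2 x 2 contingency table in the usual way for a fourfold association coefficient (the text does not give the formula), uses the standard tabulated critical value of chi-square for one degree of freedom at the 1 percent level, and works with hypothetical co-occurrence counts; all names are invented for the example.

```python
import math

FILE_SIZE = 50_000      # m in the relation chi-square = m * v**2
CHI2_CRITICAL_99 = 6.635  # chi-square, 1 degree of freedom, 1 percent level


def association_v(both, first_only, second_only, neither):
    """Association coefficient for a 2 x 2 table (assumed formula).

    both        -- names carrying both keys
    first_only  -- names carrying only the first key
    second_only -- names carrying only the second key
    neither     -- names carrying neither key
    """
    numerator = both * neither - first_only * second_only
    denominator = math.sqrt(
        (both + first_only)
        * (second_only + neither)
        * (both + second_only)
        * (first_only + neither)
    )
    return numerator / denominator


def chi_square_from_v(v, m=FILE_SIZE):
    """Chi-square value implied by v for a file of m names."""
    return m * v * v


# Smallest |v| that is significant at the 99 percent level for this file size.
significance_threshold = math.sqrt(CHI2_CRITICAL_99 / FILE_SIZE)

# Hypothetical counts: each key occurs in 1,000 names, 100 names carry both
# (independence would predict only 20).
v = association_v(both=100, first_only=900, second_only=900, neither=48_100)
chi2 = chi_square_from_v(v)
print(f"threshold = {significance_threshold:.4f}, v = {v:.4f}, chi-square = {chi2:.1f}")
```

the threshold comes out at about 0.0115, in line with the value of .012 quoted for the 99 percent level, and the hypothetical pair gives v of about 0.082 with a chi-square of about 333, i.e., a strongly significant positive association.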
possible implementations of the variety-generator name search approach

the variety-generator approach permits a number of possible implementations of searches for personal names to be considered, if only in outline at this stage, using a variety of file organization methods. the most widely known methods (apart from purely sequential files) are direct access (utilizing hash-addressing), chained, and index sequential files.

direct application of the concatenated key-numbers as the basis for hash-address computation appears attractive in instances where the personal name is used alone or in combination (as, for instance, with a part of the document title). the almost random distribution of the bits in this code should result in a general diminution of the collision and overflow problems commonly encountered with fixed-length keys.

since only four keys are used to represent each name, and the four sets of keys from which these are selected are limited in number and of approximately equal probability, the keys can be used to construct chained indexes, to which, however, the usual constraints still apply.

index sequential storage again offers opportunities, in particular since the low variety of key types means that the sorting operations which this entails can be eliminated. in effect, each name entry would be represented by an entry in each of four lists of document numbers or addresses, and documents retrieved by intersection of the lists. while four such numbers are stored for each name, in contrast to a single entry for the more conventional name list, the removal of the name list itself would more than compensate for the additional storage required for the lists. in the index sequential mode, the lists of document addresses or numbers stored with each key are more or less equally long. they may thus be replaced by bit-vectors in which the position of a bit corresponds to a name or document number.
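the replacement of the four key lists by bit-vectors, with retrieval by intersection, can be sketched as follows. this is an illustrative reading of the outline above rather than a described implementation: python integers stand in for the bit-vectors, the four-key tuples are hypothetical encodings, and the function names are invented for the example.

```python
from collections import defaultdict


def build_bit_vectors(encoded_names):
    """Build one bit-vector per (field, key) pair.

    encoded_names is a list of four-key tuples; the position of a tuple in
    the list is the bit position that represents it.
    """
    vectors = defaultdict(int)
    for position, keys in enumerate(encoded_names):
        for field, key in enumerate(keys):
            vectors[(field, key)] |= 1 << position
    return vectors


def search(vectors, query_keys):
    """Intersect the four vectors selected by the query; return set positions."""
    result = -1  # all bits set, so the first AND copies the first vector
    for field, key in enumerate(query_keys):
        result &= vectors.get((field, key), 0)
    return [i for i in range(result.bit_length()) if (result >> i) & 1]


# Hypothetical encoded file: (front key, rear key, first initial, second initial).
ENCODED = [
    ("s", "h", "j", " "),
    ("sc", "er", "a", "b"),
    ("sc", "er", "a", "b"),
    ("s", "er", "j", " "),
]

vectors = build_bit_vectors(ENCODED)
print(search(vectors, ("sc", "er", "a", "b")))
print(search(vectors, ("s", "h", "j", " ")))
```

the first query selects positions 1 and 2 (two entries with the same four keys), the second selects position 0 only; each stored name sets exactly one bit in each of four vectors, which is why the vectors are sparse.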
if the number of keys bears a simple relation to the number of blocks on a disc cylinder, the vectors can be stored in predetermined positions within a cylinder, resulting in the serial-parallel file. the usefulness of this file organization has yet to be fully evaluated; however, it also promises substantial economies in storage. only four of the bits are set at the positions in the vectors corresponding to each name or document entry. on average, then, the density of 1-bits is very low, and long runs of zeros occur in the vectors. they can, therefore, be compressed using run-length coding, for instance as applied by bradley.3,4 preliminary work with the 296-key key-set has indicated already that a gross compression ratio of nine to one is attainable, so that the explicit storage requirements to identify the association between a name and a document number would be just over thirty bits.

conclusions

the work described here relates solely to searches for individual occurrences of personal names. clearly, in operational systems in which one or more author names are associated with a particular bibliographical item, it will be necessary to provide for description of each of these for access. if this is provided solely on the basis of a document number, some false coordination will occur, for instance when the initials of one entry are combined with the surname of another. a number of strategies can be envisaged to overcome this problem.

the performance figures show clearly that a small number of characteristics (between 100 and 300 in this study) are sufficient to characterize the entries in large files of personal names and to provide a high degree of resolution in searches for them.
while performance in much larger files, involving the extension of key-set sizes to larger numbers, has yet to be studied, the logical application of the concept of variety generation would appear to open the way to novel approaches to searches for documents associated with particular personal names, which seem likely to offer advantages in terms of the overall economic performance of search systems, not only in bibliographic but also in more general computer-based information systems.

acknowledgments

we thank m. d. martin of the institution of electrical engineers for provision of a part of the inspec data base and of file-handling software, and the potchefstroom university for c.h.e. (south africa) for awarding a national grant to d. fokker to pursue this work. we also thank dr. i. j. barton and dr. g. w. adamson for valuable discussions, and the former for n-gram generation programs.

references

1. d. w. fokker and m. f. lynch, "application of the variety-generator approach to searches of personal names in bibliographic data bases-part 1. microstructure of personal authors' names," journal of library automation 7:105-18 (june 1974).
2. i. j. barton, s. e. creasey, m. f. lynch, and m. j. snell, "an information-theoretic approach to text searching in direct-access systems," communications of the acm (in press).
3. s. d. bradley, "optimizing a scheme for run-length encoding," proceedings of the ieee 57:108-9 (1969).
4. m. f. lynch, "compression of bibliographic files using an adaptation of run-length coding," information storage and retrieval 9:207-14 (1973).

letter from the editors (march 2022)

kenneth j. varnum and marisha c. kelly

information technology and libraries | march 2022
https://doi.org/10.6017/ital.v41i1.14881

our first issue of 2022 brings the welcome appointment of marisha c. kelly as assistant editor for the journal.
marisha is reference and instruction librarian at northcentral university, where her job duties include planning, developing, integrating, implementing, and maintaining digital systems and services. she has a bachelor of science in journalism from syracuse university, a master of science in library and information science from drexel university, and is currently pursuing a master of science in information technology from northcentral university.

contribute to the journal

are you interested in furthering the scholarly record for library technology and have a background in information technology in libraries, archives, or museums? i would assume the answer is “yes” if you are reading this issue. ital needs new editorial board members to fill vacancies starting in july. joining the board is an exciting way for members of core to contribute to the profession and engage with colleagues across all types of organizations in examining the role of technology in libraries, archives, and museums. we are especially interested in applications from those in underrepresented groups and identities and encourage all interested individuals to apply. please see the full call for nominations for more information and details on how to apply.

we also encourage all library technologists to consider submitting articles for publication. our call for submissions outlines the topics and process for submitting an article for review. if you have questions or wish to bounce ideas off the editor and assistant editor, please contact either of us at the email addresses below.

in this issue

in the final thought-provoking editorial board thoughts column (“policy before technology— don’t outkick the coverage”) of his editorial board term, brady lund writes about the risks of adopting new technologies before thinking through the possible policy and practical implications of offering them. we likewise highly recommend the peer-reviewed content in this issue:
1.
using dpla and the wikimedia foundation to increase usage of digitized resources / dominic byrd-mcdevitt and john dewees
2. researchgate metrics’ behavior and its correlation with rg score and scopus indicators / saeideh valizadeh-haghi, hamed nasibi-sis, maryam shekofteh, and shahabedin rahmatizadeh
3. balancing community and local needs: releasing, maintaining, and rearchitecting the institutional repository / daniel coughlin
4. using open access institutional repositories to save the student symposium during the covid-19 pandemic / allison symulevich and mark hamilton
5. migration of ict-based services of a research library to a cloud platform / francis jayakanth, ananda t. byrappa, and filbert minj
6. local hosting of faculty-created open education resources / joseph letriz

kenneth j. varnum, editor
varnum@umich.edu

marisha c. kelly, assistant editor
mkelly@ncu.edu

https://www.ala.org/news/member-news/2022/01/marisha-c-kelly-selected-new-ital-assistant-editor
https://drive.google.com/file/d/1-foy8y5hyhr8op9wmouvfc3yz3ctykeo/view?usp=sharing
https://ejournals.bc.edu/index.php/ital/call-for-submissions
https://ejournals.bc.edu/index.php/ital/article/view/14773
https://ejournals.bc.edu/index.php/ital/article/view/13659
https://ejournals.bc.edu/index.php/ital/article/view/14033
https://ejournals.bc.edu/index.php/ital/article/view/14073
https://ejournals.bc.edu/index.php/ital/article/view/14175
https://ejournals.bc.edu/index.php/ital/article/view/13537
https://ejournals.bc.edu/index.php/ital/article/view/13803

collaboration and integration: embedding library resources in canvas

jennifer l. murray and daniel e. feinberg

information technology and libraries | june 2020
https://doi.org/10.6017/ital.v39i2.11863

jennifer l. murray (jennifer.murray@unf.edu) is associate dean, university of north florida. daniel e. feinberg (daniel.feinberg@unf.edu) is online learning librarian, university of north florida.

abstract

the university of north florida (unf) transitioned to canvas as its learning management system (lms) in summer 2017. this implementation brought on opportunities that allowed for a more user-friendly learning environment for students. working with students in courses which were in-person, hybrid, or online brought about the need for the library to have a place in the canvas lms. students needed to remember how to access and locate library resources and services outside of canvas. during this time, the thomas g. carpenter library’s online presence was enhanced, yet still not visible in canvas. it became apparent that the library needed to be integrated into canvas courses. this would enable students to easily transition between their coursework and finding resources and services to support their studies. in addition, librarians who worked with students looked for ways for students to easily find library resources and services online. after much discussion, it became clear to the online learning librarian (oll) and the director of technical services and library systems (library director) that the library needed to explore ways to integrate more with canvas.

introduction

online learning is not a new concept at unf. in fact, in-person, hybrid, and online courses used online learning in some capacity since distance learning took hold in higher education. unf transitioned to canvas as its learning management system (lms) in summer 2017, which replaced blackboard.
this change, which affected all the unf’s online instruction and student learning, brought on new benefits and challenges that allowed for a more secure system for students to take in-person, hybrid, and distance learning courses. while this change occurred, unf’s library went through many changes in its virtual presence. students, specifically those who had classes that utilized canvas, needed a user-friendly way to use the library website and its resources virtually. in response, the library’s resources transitioned into having a greater online presence. however, ultimately, many students needed to use resources that they did not actually realize were available electronically from the library. through instruction and research consultations (both in-person and virtually), students needed to be directed back to the library homepage to access resources; however, the reality was that unless there was a presence of library instruction or professors pointing out library resources, students instead turned to google or other easy to find online resources to which they were previously exposed. how the project originated by spring 2018, there was growth of unf courses that were converted to online or hybrid courses. as students used canvas more, librarians started receiving feedback from in-person and online sessions that students had difficulty accessing the library’s resources while in canvas. the lack of library visibility in canvas caused the librarians to truly acknowledge that this was a problem. students had to open a new browser window to access the library and then go back to canvas to complete their assignments, which involved multiple steps. this caused frustration among students who had to remember the library url, while also getting used to navigating their new courses in canvas.
librarians consistently spent large amounts of time instructing students how to navigate to the library website during library instruction sessions and research consultations. in effect, more time was spent with students to guide them to library resources such as programmatic or course specific springshare hosted libguides (also known as library guides), or the library homepage. rather than being focused on how to use library resources and become more information literate, students spent more time on just locating the library website to get to the unf library’s online resources. together, the oll and library director talked about possibilities in canvas that would benefit all students who attended unf both in-person and online. canvas is located in unf’s mywings, a portal where all students go for coursework, email, and resources that support their academic studies at unf. it became apparent that if it was possible, there needed to be a quicker way to access the unf library resources for students. literature review with the advent of online learning, it became obvious that students needed to have library access within their online learning management system. for campuses such as unf, this meant within canvas. for unf students that are distance or online students only, this was especially true. farkas noted that librarians had worked to determine the best ways to provide library materials, services, and embed librarians into the lms.1 over the last fifteen years, lms have become more important to support the growth of online learning. pomerantz noted that the lms has become critical to instruction and online learning. 
approximately 99 percent of institutions adopt an lms and 88 percent of faculty utilize an lms platform.2 this “puts it in the category with cars and cellphones as among the most widely adopted technologies in the united states.”3 library guides that have been integrated into an lms increased their visibility, but did not guarantee that faculty and students would utilize them. that is why it was critical to continuously collaborate and communicate with faculty, students, and librarians to bring attention to the resources that could assist students. farkas noted that librarians at the ohio state university discussed that no matter how the library was integrated into a university’s lms, the usage of the library there was decidedly dependent on if the faculty professor promoted the library to their students.4 the reality that libraries faced was that without visibility in an lms, students that were online/distance learners needed to remember or find the library’s website. while this seemed to be inconsequential, it caused students to use google or other resources instead of their university/college’s library discovery tool or library databases. farkas noted that shank and dewald’s seminal article described a university’s lms as having two levels, macro and micro. when there was one way to access the library in the lms, then it was termed macro. this single pathway allowed for less maintenance since there was a single way to access the library from the lms.5 the university of kentucky embedded the library by adding a library tab in blackboard. 
other institutions like portland state university, ohio state university, and oakland state university developed library widgets to make the library more accessible.6 the addition of library and research guides in library instruction was critical to increase visibility for students and furthermore make sure students had easier access to the library through their lms. getting librarians access to the lms at their institutions is an ongoing issue.7 unf librarians wanted to determine best practices to decide how the library could integrate into canvas. therefore, research was needed to see what other university libraries were doing. the librarians at unf discovered that there was no obvious preference based on examples found in research to accomplish how to get the library into canvas. davis observed that “claiming a small amount of real estate in the lms template . . . is an easy way to put the library in students’ periphery.” simply having a library link or a page added to each course was “the digital equivalent for students of walking past the library on the way to class.”8 however, it seemed that a lot depended on how an lms was used at an institution and the technical expertise available. thompson and davis noted that the “lms technology has added another layer of complexity to the puzzle. as technology evolves to address changes in learning design, student and faculty attitudes, expectations, perceptions will continue to be a critical piece of the course integration puzzle.”9 while looking at other institutions, there were a variety of ways in which canvas and the library were integrated.
there were numerous examples from embedded springshare product library guides, to the creation of modules of quizzes or tutorials, and even to the creation of online minicourses, and embedded librarians in lms courses.10 penn state university looked at their method of how to add library content into canvas. they already had a specific way of putting library guides in canvas, but it was not in a highly visible location for students to easily access. when faculty put guides in their courses, with the collaboration of librarians, the guides were used. however, many of the faculty did not use these librarians or resources. a student survey and user studies were used to best learn how to fix the problem of students and faculty that did not use the guides and content more. penn state worked with their comm 190 instructors to administer a survey that was extra credit, to ensure getting responses.11 “general findings included: 53 percent of students did not know what a course guide was; 41 percent of students had used a guide for a class project; 60 percent accessed a course guide via the lms; and 37 percent of students used course guides on their own.”12 many students were interested in doing their library research within canvas itself. it should be noted that the guides needed to be in a prominent place in canvas, but not overwhelm the course content. for course-specific guides an introductory statement was needed to describe what the guide was about. when the release of springshare’s lti tool occurred, it became an optimal time in which the technical solutions allowed for penn state’s library guides to be embedded smoothly into canvas.13 the learning tools interoperability (lti tool) allows for online software to be integrated into canvas. in effect, when professors want to add a tool to their course, it allows for more seamless and controlled avenue. in the case of library guides, it creates a way in which guides can be embedded into the lms with little problems. 
another example of a library integration into a campus lms was at the utah state university (usu) merrill-cazier library. they looked for a way to maximize the effectiveness of springshare’s library guides when they assessed the design and reach of library guides within their lms.14 they took a unique approach and built an lti tool that automatically pulled in the most appropriate library guide when the “research help” link in canvas was activated by a professor. they also saw this as an opportune time to redesign their subject guides and ensure there were guides for all disciplines. they provided usage data to subject librarians to help determine where there might be opportunities to interact with classes and provide more library instruction. overall, the study and the feedback they received from students helped them find ways to improve how librarians used and thought about library guides, and to expand their reach based on usage data.15

this ability to add library guides to canvas gave students a way to access library materials or the library without having to leave the online classroom. many libraries have conducted focus groups and usability studies that provided valuable feedback on the knowledge and understanding that faculty and students had of guides, and on ways to improve the information shared to assist students with their coursework and faculty with their online teaching. research indicated that exploring and implementing the integration of library guides into an lms led to a need to improve the guides and design them more consistently.16

the literature indicated the importance of a strong relationship with the department that manages the lms. these integrations were much easier when such a relationship was established, and it sometimes led to finding out about additional opportunities to integrate more with the lms.
penn state saw an increase of over 200,000 hits to its course guides, believed to be a result of the lti integration.17 this, however, did not guarantee that the students benefited from the course guides, much as page hits in library statistics do not prove that resources were actually used. in addition, faculty were able to turn off the lti course guide tool, which reduced the chances of student usage or awareness of the course guide. based on feedback from students and faculty, it did not matter where the course guides were placed since they could be ignored anyway. a penn state blog was developed by the teaching and learning with technology unit to give instructors a venue for learning about the online services that librarians provide.18 “although automatic integration allows specialized library resources to be targeted at all lms courses, that does not mean that they’ll be accessed. it is important then to build ongoing relationships with stakeholders, providing not just information that such integrations exist, but also reasons why to use them.”19

however, not all universities and colleges decided to integrate the library strictly through a library guide or a library link in their lms. karplus noted that students spent more time online rather than going to the physical academic library. karplus argued that combining the digital world with academic library resources had two benefits. the first was that online research became a more normal occurrence. the second was that students were more comfortable accessing online resources.20 while using blackboard, st. mary’s college’s goal was to incorporate library information literacy modules into existing courses. using the blackboard lms, students were able to access all components of their courses, including assigned readings. this became their academic environment.
therefore, information literacy modules, tutorials, and outside internet resources could be added to the lms.21 tutorials combined with pre- and post-testing gave faculty instant feedback. librarians were also able to participate in blackboard through discussion boards and work with students.22 there was a constant need to update the modules and the information added to blackboard. librarians’ access to the blackboard site allowed students to use the library resources more readily. “the site can be the focal point for many librarians in one location thus ensuring a consistent, collaborative instructional program.”23 overall, the aim of integrating campus librarians into an lms was to get students to use the library so they could be more successful in their academic endeavors.

developing a plan of action

initially, the oll and library director brainstormed possible integration ideas, ranging from adding a library global navigation button to the canvas ui, to adding a link to the library in the canvas help menu. at the same time, they also researched what other libraries had done. after brainstorming, it became clear that additional conversations needed to occur within the library and with unf’s online learning support team, a part of the center for instruction and research technology (cirt), the group that manages canvas. integrating the library and canvas was a complex matter. unf administrators asked for a written proposal that could be brought to the library, online learning support team, and information technology services (its) stakeholders for discussion and approval. that proposal, along with much-needed discussion, was critical in determining the possibilities and the actions that needed to be taken. that said, it was important to keep in view what would best serve the faculty and students.
when brainstorming discussions began with unf’s online learning support team, it was important for the library to determine what options were available to embed the library in canvas. the library had a strong relationship with unf’s online learning support team and its administrators, which made this an easy process to pursue. what the oll and library director initially wanted was to add a simple link to the global navigation in canvas that would take all users to the library homepage. however, it became apparent that this was not possible because this space is limited and many departments on campus would like greater visibility in canvas. the next option, which was easier to implement, was to add a link to the library homepage under the help menu in canvas. although this menu link was added, it was so hidden in canvas that the oll and library director felt it would never be found by faculty, let alone students. cirt administrators advised the oll and library director of what other possibilities were available. after researching options, the library recommended creating access to library resources and services using a springshare lti tool for library guides, which cirt agreed to. library guides, or libguides, are library webpages built with springshare software. using the lti tool seemed like a great possibility since it would allow for more of a presence in addition to the help link to the library homepage. after approval from library administration and initial discussions with it, the project moved forward.

implementation

the project took about a year to complete from the time discussions began internally in the library to the time the integration went live (see figure 1).

figure 1. project timeline

a seamless entryway to the library seemed to be a good idea based on observations and feedback from students, but the oll and library director started by completing an environmental scan of what other institutions had done to get ideas on ways the unf library could integrate into canvas. the oll and library director learned that it had been done in a variety of ways: integration of the library at the global navigation level, integration at the course level, and an added link to the library under the help menu in canvas. it became clear that an integration into canvas was an obvious progression that would not only strengthen online learning but also give students the ability to benefit from the resources that the library subscribed to and support their curricular needs.

conversations then occurred with unf’s online learning support team to discuss integration options further. after much discussion, a decision was made to pursue an added link to the library website under the canvas help menu and a new lti tool at the course level. since canvas was used in so many courses, it was determined that agreement from university-wide campus committees was needed on how to go about adding library guides to canvas courses. librarians were also approached at this time for their input and feedback. the goal seemed obvious to the librarians, and they readily supported helping students in canvas by way of the help menu link and the lti tool integration. for the librarians, the goal was to make sure that students could easily access library materials. overall, the library faculty’s preference for the implementation was to place the library website link under the canvas help menu while also having the student resources libguide inside all canvas courses using the springshare lti tool.
after all internal approvals were obtained, the link to the library was seamlessly added under the canvas help menu. the springshare lti tool required more work and discussion before it could be implemented. after approval was granted by the unf online learning support team and the campus its security team, the integration began. configuration options for the lti tool were explored, and the systems and digital technologies librarian worked closely with the unf online learning support team and springshare to set up the libapp lti tool.

the first step was to configure springshare’s automagic lti tool to automatically add libguides to courses in canvas. this included adding an lti tool name and description, which appeared in canvas during setup and in the course navigation. it was also decided, based on feedback from across campus, to set the student resources libguide as the default guide for all courses. instructors could request a different libguide for their course. to enable this libguide matching between canvas and libguides, two parameters had to be set in the automagic lti tool:

• lti parameter name: for unf, this was set to “context label,” to select the course code field in canvas.
• libguides metadata name: this was set to the appropriate value to identify the metadata field used in libguides.

if an instructor decided to change the default guide to another guide, these two parameters would need to be entered into that libguide’s custom metadata so that canvas could link to the designated guide and display it in the course. the change had to be made in the libguide itself, so it was handled by the systems and digital technologies librarian.
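the matching behavior described above can be pictured with a short sketch. the python below is only an illustration under stated assumptions, not springshare's or canvas's actual code: `context_label` is the standard lti launch parameter that carries the course code, while the metadata field name `lti_course_code`, the guide slugs, and the `Guide` and `select_guide` names are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# Hypothetical slug for the default guide shown in every course.
DEFAULT_GUIDE = "student-resources"


@dataclass
class Guide:
    """A library guide with its custom metadata (name -> value)."""
    slug: str
    metadata: dict[str, str] = field(default_factory=dict)


def select_guide(
    launch_params: dict[str, str],
    guides: list[Guide],
    lti_param: str = "context_label",        # LTI launch field holding the course code
    metadata_name: str = "lti_course_code",  # hypothetical metadata field name
    default_slug: str = DEFAULT_GUIDE,
) -> str:
    """Return the slug of the guide whose metadata matches the course code.

    Falls back to the default guide when the launch carries no course code
    or when no guide has been tagged for that course.
    """
    course_code = launch_params.get(lti_param, "").strip().lower()
    if course_code:
        for guide in guides:
            tagged = guide.metadata.get(metadata_name, "").strip().lower()
            if tagged == course_code:
                return guide.slug
    return default_slug


if __name__ == "__main__":
    # Hypothetical guides and course codes, for illustration only.
    guides = [
        Guide("student-resources"),
        Guide("nursing-research", {"lti_course_code": "NUR3065"}),
    ]
    print(select_guide({"context_label": "NUR3065"}, guides))
    print(select_guide({"context_label": "ENC1101"}, guides))
```

the sketch also makes the maintenance cost visible: because the match depends on the course code stored in the guide's metadata, that value has to be kept current for the custom guide to keep appearing.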
few instructors had requested this so far, but if the option were used, the library would also have to ensure the match carried over each semester by updating the metadata in the guide to the new course code.

after the configuration was completed on the springshare side, the next step was to set up the integration in the canvas test environment. an external application had to be installed in canvas to allow the springshare lti tool to run. after it was tested, the application was applied across all courses and set to display by default, which the majority of faculty preferred. faculty who did not want to use the integration had the ability to manually turn it off in canvas.

during the implementation setup, a few minor issues were encountered. after seeing what the student resources guide looked like in canvas, it became clear that the header and footer were not needed and just cluttered the guide. both were removed in the lti setup options to ensure a cleaner-looking guide. since the libguides were being embedded into another system (canvas), the formatting of the guides had to be adjusted. the other issue encountered was trying to add available springshare widgets, such as the library hours or room booking content, to the guide using the automagic lti tool. while this was not successful, it was determined that the additional options were not needed.

once the integration was set up in the canvas test environment, demonstrations were held and input was gathered from stakeholders through campus-wide meetings with faculty. it was critical to determine whether faculty would utilize libguides in their canvas courses. an overview of the integration and its benefits was given to the campus technology committee and distance learning committee faculty. a demonstration was also given so that these faculty committees could see what the integration would look like in their courses. overall, the feedback obtained from the faculty was very positive.
the preference was to have the configuration be opt-out, where the library guides would automatically display in canvas courses. many faculty members were excited about the integration and looked forward to having it in their courses. after demos took place and final setup was completed based on feedback, the integration was set up in the canvas production environment and was announced via newsletters, emails, and social media. as of the fall 2019 semester, the library's student resources guide was integrated into all courses in canvas (see figure 2).

figure 2. student resources library guide in canvas

benefits of the integration

students depend on their campus lms to complete their coursework, support their studies, and, in the case of unf, have easier access to the online campus. the libguide integration not only streamlined their way to library resources, but also promoted library usage among students who may not have known how to get to the resources available to them. faculty were able to replace the default student libguide with a more specific subject or course guide. either way, it brought more awareness to resources and services that supported curricular needs. the springshare chat widget in the guide also gave students the ability to communicate directly with a librarian from within canvas. this integration not only increased the library’s visibility in the online environment but also provided all students, whether in-person, hybrid, or online, with direct access to the resources they needed for their coursework.

challenges of the integration

although there were many benefits to integrating the library into canvas, there were many challenges in making the integration happen. there were many more stakeholders than expected.
input was needed from library administration, canvas administrators, library faculty, and teaching faculty committees before the project could take place. since the project grew organically, stakeholders were brought in as the project grew or unfolded. once the project received approval from the library and cirt administrators, its administrators had to give the final approval in order to proceed with the integration of library guides. the process to implement the integration took some time to figure out. in addition, getting buy-in from the teaching faculty was key, as the navigation options in their canvas courses would be affected. making sure the faculty understood how the integration would assist their students was important, as the goal was to help their students succeed with their coursework.

one concern was whether faculty would tell their students or, conversely, whether students would find the link to the libguide on their own. determining how news of the library and canvas integration would be communicated to the unf community was the final step. the library director, oll, and cirt administrators needed to determine the best communication routes for making the unf community aware of this news. the routes used were emails, unf updates/newsletters, and word of mouth from teaching faculty. it was crucial that students be aware of these tools. this meant that going forward, unf would depend on word of mouth or on students’ curiosity about the canvas navigation bars themselves.

discussion and next steps

integrating the library with unf’s learning management system, canvas, took much planning and collaboration, which was key to creating a more user-friendly learning environment for students. in reflecting on what went well and what did not, the unf librarians learned several important lessons that will help improve the implementation of future projects.
to begin, it is important to identify and involve stakeholders early on, so they can provide feedback along the way. getting buy-in from the teaching faculty is also key since the integration affects the navigation options in their canvas courses. at unf, the oll and library director did not initially realize how many groups of teaching faculty and departments would need to approve this canvas change and implementation. it was important to have them understand the importance of the integration and how it can assist their students with their coursework.

considering the content of the library guides was important because of the impact it would have on canvas courses. for example, at the unf library some students thought that the librarian’s picture on the default guide was in fact a picture of their professor. in turn, students began to contact her. this caused much confusion for our students and professors alike.

along the way, communication is critical so that everyone is kept informed as the integration progresses. communicating at the appropriate times, and gathering enough information about the configuration options before starting conversations with stakeholders, is important too.

finally, investigating the ability to retrieve usage statistics from day one would be extremely beneficial and provide data to assess how often the library guides are being used in the lms and by whom. this information would help determine next steps and explore other potential integration opportunities. at unf, the librarians were not able to implement statistics as part of our integration, which has made it more difficult to assess the usage of the library guides in canvas.
now that the integration has been completed, ensuring that it continues to meet the needs of faculty and students will be important. feedback will need to be gathered from stakeholders to find out whether they find the integration useful, whether any issues are being encountered, and whether they have any recommendations for ways to enhance the integration. usage statistics will also be gathered as soon as they are available. this will provide information on which instructors are using the library guides in their courses and which are not. for those who have used them, it will be an opportunity to target those courses for instruction. for those who have not used them, it will be an opportunity to find out why and make sure they are aware of the benefits of using them in their courses.

exploring other integration possibilities, especially as the technology continues to evolve, will be important to ensure the library continues to reach students. while the natural progression of the unf integration would be to embed librarians in the canvas platform, others have faced challenges. “according [to] the ithaka s & r library survey 2013 by long and schonfeld, 80–90 percent of academic library directors perceive their librarians’ primary role as contributing to student learning while only 45–55 percent of faculty agree librarians contribute to student learning.”24 even though this is a challenge, faculty collaboration with librarians is crucial for the embedded librarian role. without a requirement of embedded librarianship, marketing the librarians and what they can do for students will be essential for their role to be successful.25 at unf, conversations will have to be held to determine what other integrations would be of interest and possible at our university.

the unf library will also be looking to improve the design and layout of library guides.
now that their visibility has increased, it will be important to standardize them and ensure they all have a consistent look and feel, which will make it easier for students to find the information and resources they are looking for.

conclusion

in today’s rapidly changing technological world, it is critical to make resources available regardless of where students are physically located. integrating the library’s libguides into canvas not only brought more visibility to the library, its resources, and its services, but it also brought the library to where the students were engaged with the university. as noted by farkas, “positioning the library at the heart of the virtual campus seems as important as positioning the library as the heart of the physical campus.”26 providing resources to students at their point of need enabled them to easily access the information they needed to help them succeed in their courses. it also allowed faculty to integrate the library resources that were most beneficial to their courses and enhanced their teaching as well as the educational needs of their students.

the unf library will continue to look at how library resources are used, and how to best serve the online community going forward. it will be important to explore ways to enhance existing services with existing technology but also to look ahead and determine what may be possible down the road with new and upcoming technologies. in addition, assessing how the library connects to online learners and gathers feedback from students and faculty will be critical to contributing to the success of students.

endnotes

1 meredith gorran farkas, “libraries in the learning management system,” american libraries tips and trends (summer 2015), https://acrl.ala.org/is/wp-content/uploads/2014/05/summer2015.pdf.
2 jeffrey pomerantz et al., “foundations for a next generation digital learning environment: faculty, students, and the lms” (jan 12, 2018): 1–4.
3 pomerantz et al., “foundations for a next generation digital learning environment.”
4 farkas, “libraries in the learning management system.”
5 farkas, “libraries in the learning management system.”
6 farkas, “libraries in the learning management system.”
7 farkas, “libraries in the learning management system.”
8 robin camille davis, “the lms and the library,” behavioral & social sciences librarian 36, no. 1 (jan 2, 2017): 31–5, https://doi.org/10.1080/01639269.2017.1387740.
9 liz thompson and davis vess, “a bellwether for academic library services in the future: a review of user-centered library integrations with learning management systems,” virginia libraries 62, no. 1 (2017): 1–10, https://doi.org/10.21061/valib.v62i1.1472.
10 davis, “the lms and the library.”
11 amanda clossen and linda klimczyk, “chapter 2: tell us a story: canvas integration strategy,” library technology reports 54, no. 5 (2018): 7–10, https://doi.org/10.5860/ltr.54n5.
12 clossen and klimczyk, “chapter 2,” 8.
13 clossen and klimczyk, “chapter 2,” 8.
14 britt fagerheim et al., “extending our reach,” reference & user services quarterly 56, no. 3 (2017): 180–8, https://doi.org/10.5860/rusq.56n3.180.
15 fagerheim et al., “extending our reach,” 187.
16 fagerheim et al., “extending our reach,” 188.
17 amanda clossen, “chapter 7: ongoing implementation: outreach to stakeholders,” library technology reports 54, no. 5 (2018): 28.
18 clossen, “chapter 7,” 29.
19 clossen, “chapter 7,” 29.
20 susan s. karplus, “integrating academic library resources and learning management systems: the library blackboard site,” education libraries 29, no. 1 (2006): 5, https://doi.org/10.26443/el.v29i1.219.
21 karplus, “integrating academic library resources and learning management systems.”
22 karplus, “integrating academic library resources and learning management systems.”
23 karplus, “integrating academic library resources and learning management systems.”
24 beth e. tumbleson, “collaborating in research: embedded librarianship in the learning management system,” reference librarian 57, no. 3 (jul, 2016): 224–34, https://doi.org/10.1080/02763877.2015.1134376.
25 tumbleson, “collaborating in research.”
26 farkas, “libraries in the learning management system.”
฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ 
cost comparison of computer versus manual catalog maintenance

john c. kountz: county of orange public library, orange, california

is a computer assisted catalog system less expensive than its manual counterpart? a method for comparing the two was developed and applied to historical data from the orange county public library.
comparative costs obtained were $0.89 per entry for computer assisted catalog maintenance versus $1.71 for manual maintenance.

journal of library automation vol. 1/3 september, 1968

introduction

since november 1965, the county of orange public library has performed all acquisitions by means of a computer assisted system. as a byproduct of this continuing operation, records for over 30,000 titles are now available in machine readable form on magnetic tape. the next logical step to realize the library's goal of mechanizing a major portion of its many nonprofessional functions is the production of a comprehensive multi-access list of its holdings suitable for both library and patron use; in short, a book catalog.

the 30,000 captive entries, however, comprise only a quarter of the library's total holdings of 120,000 titles. before the envisioned book catalog can be produced, approximately 90,000 titles remain to be captured, and subsequent file handling and data printout operations must be developed.

fig. 1. manual card catalog system.

fig. 2. proposed computer assisted book catalog system.

an undertaking of this magnitude naturally prompted a review of the literature. initially, hayes and shoffner's work for the stanford university undergraduate library (1) would appear adequate. on closer examination, however, their approach did not optimize the cycle for supplement production or catalog reprint; nor was particular attention given this problem in the institute of library research report to the california state library (2). the cartwright and shoffner study for the california state library (3) paid close attention to cycle length, but the system therein described differed extensively from the system proposed for orange county.
further, though the costing of data capture has been well documented and continues to appear in the literature (4,5,6), there is little concerning the cost of maintaining data once on file. in brief, neither a method nor basic information was available which could be applied generally, although several specific approaches and results had been presented (1,7,8,9,10,11), and an approach to the analysis of manual operations established (12).

when it became apparent that more than article reading would be required, cost analysis of the existing manual operation and the proposed computer assisted book catalog program was performed. in addition, a method was designed to discern what cost benefit, if any, was implied in a computer maintained file before a massive keying effort and systems development should be undertaken.

it is important to note that the analysis gives no consideration to increased level of service, esthetics, practicality, or the subsidiary products of a computerized system. nor is the capital investment represented by existing card catalogs considered, as those units are assumed to have been paid for in the course of their creation.

manual card catalog system

the manual system to be replaced consists of individual card catalogs and shelf lists in each of the library's service units, comprising 25 branches and a separate bookmobile base. this system, depicted in figure 1, consists of: 1) centralized card production; and 2) branch catalog maintenance.

in the centralized operations, offset masters are created from worksheets prepared by the cataloging section and used for two-up card production. these cards are collated into sets, the sets merged with their corresponding books, and the completed packages sent to the ordering branches. when book and card packages are received by the branch, shelf list and catalog cards are sorted and merged with their respective files.
withdrawal of a book (discarded or lost) from a branch collection triggers a reversal of this process, and all cards for the withdrawn item are purged from the files.

proposed computer assisted book catalog system

the computer assisted system (figure 2) consists of three phases of computer operation and catalog printing.

in the first phase the computer receives as input magnetic tapes produced by the library's ongoing book acquisition system and/or the output of a device providing a direct keyboard to tape capability, processes the input data into updated records, and merges the updated records with the master file of library holdings. the first phase will build the initial master file through capture of the library's remaining 90,000 titles via the keyboard-to-tape device indicated above, and will also form the main avenue for communicating revision (update) data to the master file.

in the second phase the computer extracts two print tapes from the master file: the first is the biblio file, consisting of all the bibliographic data and the record number for each master file entry in alphabetical sequence (author-title mix); the second, or locate file, contains location codes and copy counts for each record number in numeric sequence.

in the third and final phase, the biblio and locate files are processed. from the biblio file are produced keylines (camera-ready copy) for the book catalog and periodic cumulative supplements of new entries. out of the locate file are generated three numeric listings: 1) a locate list containing all entries, 2) periodic cumulative locate supplements, 3) branch inventories.

in production of the book catalog, the computer produced keylines are used to create offset masters for printing. the end product of the printing process is 400 bound copies of the book catalog.
factors in cost comparison

following is an examination of the principal factors which must be equal or identical to permit comparative analysis of the two systems.

unit of comparison (entry)

to facilitate the cost analysis between manual and computer assisted file maintenance systems, a unit of comparison was established which would be compatible with both. this unit is called the entry, and in the analysis which follows is the basis for all cost comparisons. for the manual system, entry means creation, distribution, filing and, ultimately, purging of the complete set of cards (figure 3) pertaining to a specific book; while for the computer assisted counterpart, an entry is a record (figure 4) in machine readable form which has been captured, sorted, listed and updated.

frequency of transactions

either system, in addition to creating and posting new records to a file, must periodically update both entire records and the elements of those records. the number of these updates can be determined for a given period of time, and for our purposes we call this figure the frequency of transactions. with regard to the systems under analysis, the frequency of transactions is identical, and includes two elements: titles added and withdrawn; and volumes added and withdrawn (including re-assignments) as shown in table 1.

fig. 3. set of catalog cards.

fig. 4. entry record map.

table 1. frequency of transactions for manual and computer systems per year.

             titles                 volumes
added        15,000 (9,000 new)     105,000
withdrawn    4,000                  24,000

table 2. manual system direct costs

personnel:
  file and typist-clerk                  $ 2.17 hr.
  offset operator                          2.90 hr.
equipment:
  offset press depreciation (5-year)     900.00 year
materials:
  card stock                               2.69 per 1,000
  offset masters                           4.60 per 1,000
  press supplies                         147.00 mo.

table 3. computer system direct costs

personnel:
  typist-clerk                           $ 2.17 hr.
equipment:
  keyed data recorder: mohawk 1181       300.00 mo.
    (other possibilities include: ibm 50; or ibm mt/st and 2495 tape cartridge reader)
  computer: rca spectra 70/45f            82.00 hr.
    (131 k byte core, 3 selector channels, four 70/442 tape drives (8 tapes), three 70/564 disc units; software used is in rca cobol v version 15)
materials:
  fanfold paperstock, magnetic tape      included in equipment cost.

input

both manually produced and computer generated catalogs use an identical input: the source document being the sub purchase order (figure 5), which is completed and edited by a cataloger.
this document provides the text to be typed onto offset masters for card production (manual system); or the corrected data to be keyed into machine readable form (computer assisted system).

output

similarly, the output from each process must fulfill the primary objective of catalog creation: multi-access listing of the materials held by the library (figures 3 and 6). the on order list in figure 6 contains many elements which will appear later in the proposed book catalog.

computation of costs

once identity of input, output, and frequency of transactions have been established, the activities internal to each process must be isolated and documented in terms of cost. materials consumed and equipment used in each process to fulfill its objectives must also be accounted for. the direct costs for the two systems are shown in tables 2 and 3, respectively.

in addition to these direct costs, a burden rate could be applied to these figures to reflect the indirect charges involved in both systems. however, essentially the same burden rate is applicable to both, primarily because of the amortization of computer system development costs, the dollar amount of which approximates the supervisorial and administrative costs of the manual system. therefore burden rates are not included.

following is a development of the costs incurred by each system, in an attempt to answer the question: is a computer maintained book catalog less expensive than a manually maintained card catalog?

manual card catalog system

the cost figures presented in table 4 reflect the total card production load for the entire system and the average card catalog shelf list maintenance load for a single branch. they have been derived from time studies performed on both the manual card files and the library's card production operation. as indicated in table 4, the $1.71 cost per entry is composed of $0.99 for card production and $0.72 for file maintenance at a single branch.
table 4. manual catalog card production and maintenance costs

operation                                    entry cost
card production
  type offset masters                        $ 0.30
  print cards                                  0.02
  supplies: offset masters                     0.32
  supplies: card stock                         0.22
  a.b. dick 360 (5-year) amortization          0.06
  assemble sets with books                     0.07
  subtotal:                                  $ 0.99
maintenance (single branch)
  sort shelf list from catalog cards           0.11
  shelf list: sort                             0.08
  shelf list: file/purge                       0.02
  catalog: sort                                0.47
  catalog: file/purge                          0.04
  subtotal:                                    0.72
total:                                       $ 1.71

computer produced book catalog cost

a computer assisted catalog system requires the construction of two forms of the same file: a machine readable record for file maintenance and a book catalog for staff and library users. to produce an accurate total cost picture, the specific costs in both the construction and maintenance of these "parallel" files must be identified and summarized.

table 5. initial costs for machine readable (master file) entries

operation                         unit cost     total
input of 90,000 titles
  key-to-tape                     $0.37
  correct errors                   0.007
  subtotal:                       $0.377        $33,930
transfer of 30,000 entries
  correct errors                  $0.007
  merge with master file           0.0048
  subtotal:                       $0.012            360
total for 120,000 entries                       $34,290

machine readable file construction

the first step toward implementing a computer assisted system is the construction of a machine readable file on magnetic tape of some 120,000 entries (90,000 entries of approximately 309 characters each to be captured, and the 30,000 entries captured previously). costing the capture of the 90,000 entries has been closely estimated at $0.37 per entry. but, to complete the initial file cost picture requires pricing the transfer of the 30,000 acquisitions entries and the correction of errors occurring in both the capture of the 90,000 entries and/or the transferred acquisitions entries.
further, it is known that the error correction figure will not include the price of deriving the update data (cataloging), nor the price of keying the original entries (acquisitions). therefore, the error correction cost is the price of keyboarding update data, posting it to the erroneous entries and, in the case of acquisitions entries, merging those entries with the master file.

the cost of error correction is tied to the number of errors to be corrected per entry. the figure used here was developed by surveying the acquisitions records (figure 5), which have been used for pre-cataloging in the manual system, where an average of five characters per entry were found to be in error (edition, pagination, price, misspellings, etc.). for the computer assisted system an additional 9 characters will be required to contact a specific field in any record (record, card code, and action digit). thus, approximately 14 keystrokes will be required to correct an error in an entry. at approximately 8,500 net keystrokes per hour, slightly more than 650 minimum changes per hour are possible. therefore, if all 30,000 entries require correction, approximately 50 keyboard hours will be required (or a rounded $195.00 for operator and machine), plus some 20 minutes of computer time ($30.00), which can be reduced to an average cost of $0.007 per entry.

to merge the updated acquisitions entries with the master file will cost an additional $0.0048 (historical cost derived from a similar operation in the acquisitions system) for a total of approximately $0.012 per entry. this is shown in table 5 in concert with the development of the initial master file entry cost predicated on 120,000 entries, all of which have required a minimum error correction. the initial average cost per entry (master file) is $0.286.

book catalog construction

the cost for a computer produced book catalog involves the computer production of camera-ready copy (keylines) and offset catalog production.
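the file construction arithmetic described above (the error correction cost and the totals in table 5) can be retraced in a few lines. the python sketch below is only an illustrative check and not part of the original study; the variable names are hypothetical, and the dollar inputs are the rounded figures quoted in the text.

```python
# illustrative check of the master file construction costs.
# all dollar inputs are the rounded figures quoted in the text and table 5;
# the variable names are hypothetical.

TITLES_TO_KEY = 90_000        # titles still to be captured
ENTRIES_TO_TRANSFER = 30_000  # acquisitions entries already on tape

# error correction for the transferred entries, using the quoted dollar amounts
operator_and_machine = 195.00  # roughly 50 keyboard hours
computer_time = 30.00          # roughly 20 minutes of computer time
correction_per_entry = (operator_and_machine + computer_time) / ENTRIES_TO_TRANSFER

# unit costs carried into table 5
KEY_TO_TAPE = 0.37
CORRECT_ERRORS = 0.007
MERGE_WITH_MASTER = 0.0048

input_total = TITLES_TO_KEY * (KEY_TO_TAPE + CORRECT_ERRORS)
# table 5 rounds the transfer subtotal (0.0118) to 0.012 before extending it
transfer_unit = round(CORRECT_ERRORS + MERGE_WITH_MASTER, 3)
transfer_total = ENTRIES_TO_TRANSFER * transfer_unit

file_total = input_total + transfer_total
average_per_entry = file_total / (TITLES_TO_KEY + ENTRIES_TO_TRANSFER)

print(f"error correction per entry: ${correction_per_entry:.4f}")
print(f"initial master file total:  ${file_total:,.0f}")
print(f"average cost per entry:     ${average_per_entry:.3f}")
```

with these inputs the sketch arrives at about $34,290 for the file and $0.286 per entry, in line with table 5. the $225 correction outlay works out to $0.0075 per entry, which the text carries as $0.007.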
the first step in the transition from the master file to the production of keylines is the rearranging of the individual elements (author, title, collation, lc order number) of each record into the sequence they will occupy in the printed entry, and translating the coded elements in each entry to their "real world" equivalents. the $0.009 cost for "reformat..." is taken from historical costs for the operation of a report generator doing this in the acquisitions system. similarly, the reformatted entries in "printable" form must also be sequenced alphabetically (single author-title mix) before they can be printed, and again the $0.00034 cost is taken from historical data. finally, the sorted, reformatted entries are printed (upper case) at a cost of $0.027 each (2 lines). the total cost for these operations becomes $0.036 per entry, as shown in table 6.

fig. 5. sub purchase order.

fig. 6. on order listing.

table 6. keyline production cost

computer operation                  entry cost
reformat master file entries        $0.009
sort reformatted entries             0.00034
print (offline)                      0.027
total                               $0.036

for catalog printing, the computer generated keylines will be reduced photographically to 60 percent, and the reductions assembled for 16-up reproduction with approximately 100 entries per sheet (both sides). initial book catalog production will be 400 copies of approximately 1,800 sheets. there are slightly more than 61,000 author entries which will receive full bibliographic data, call and lc order numbers, while the 120,000 title entries will present only author data and call number. the estimated total cost of printing is given in table 7. the resultant cost per entry is $0.186, regardless of the number of lines required.

table 7. catalog printing cost

set-up                  $ 3,000
plates                    7,000
run time                  6,000
gather/collate            2,000
paper                     4,000
cover/perfect bind          350
total:                  $22,350

as the printed and bound book catalog will not present the locations of the materials it lists, an off-line locate list will be produced concurrent with catalog creation.
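the keyline and printing figures in tables 6 and 7 can be recomputed as a quick check. the python sketch below is illustrative only; the dictionary and variable names are hypothetical, and dividing the printing total by the 120,000 title entries is an assumption that happens to reproduce the stated $0.186.

```python
# illustrative check of tables 6 and 7; names are hypothetical.

# table 6: keyline production, cost per entry
keyline_steps = {
    "reformat master file entries": 0.009,
    "sort reformatted entries": 0.00034,
    "print (offline)": 0.027,
}
keyline_per_entry = sum(keyline_steps.values())

# table 7: catalog printing, total dollars for 400 copies
printing_items = {
    "set-up": 3_000,
    "plates": 7_000,
    "run time": 6_000,
    "gather/collate": 2_000,
    "paper": 4_000,
    "cover/perfect bind": 350,
}
printing_total = sum(printing_items.values())

# ASSUMPTION: the per-entry printing cost spreads the total over the
# 120,000 title entries; the paper does not spell out the divisor.
TITLE_ENTRIES = 120_000
printing_per_entry = printing_total / TITLE_ENTRIES

print(f"keyline cost per entry:  ${keyline_per_entry:.3f}")
print(f"printing total:          ${printing_total:,}")
print(f"printing cost per entry: ${printing_per_entry:.3f}")
```

under that assumption the items sum to $22,350, or about $0.186 per entry, and the three keyline steps sum to about $0.036, matching the rounded totals in the tables.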
this list will contain 120,000 numeric entries (lc order number, coded locations and price), and will be generated for library use only. the cost of offline printing of this list (25 copies) is based on a historical print cost of $0.014 per one-line entry extended to the number of entries, or $1,680.00.

summary of computer assisted catalog production costs

the grand total cost per entry for all operations leading to the initial book catalog (based on initial data capture through file construction above) is given in table 8. as shown in this table, the computer cost per entry for the first 'edition' of the book catalog is $0.52. this figure is comparable to the manual system figure of $0.99 per entry derived earlier. however, the cost per entry figure for computer assisted file maintenance must also be derived before comparison with the total manual figure of $1.71 per entry is possible.

table 8. composite cost per entry of a computer produced book catalog

operation                   cost
file construction           $0.286
keyline production           0.036
book production              0.186
locate list production       0.014
total                       $0.52

table 9. cost of posting and printing catalog update data

operation                            unit cost     total
locate update
  input                              $0.007
  print locate list (offline)         0.014
  subtotal                            0.021
  total for 133,000 actions                        $2,793.00
biblio update
  reformat master file entry          0.009
  sort reformatted entry              0.00034
  print biblio list (offline)         0.027
  subtotal (rounded)                  0.036
  total for 9,000 actions                             324.00
grand total (142,000 actions)                      $3,117.00

computer file maintenance

the figures developed in table 9 establish a cost per entry for file maintenance, from keyboarding corrected data to the production of an offline printout of biblio and locate supplements. to understand their derivation, let us review the frequency of transactions.
in table 1, it can be noted that the 15,000 titles added to the collection annually will necessitate master file location update for the volumes they represent. the locations for withdrawals will also require update. in combination, additions and withdrawals mean a total of 129,000 actions, plus 4,000 last copy withdrawals, or 133,000 updates yearly to keep the master locate file current. in contrast, only the 9,000 new titles will require bibliographic listing.

the locate file update input cost is identical to that used for the entry of error correction data (table 5). this is possible since approximately the same number of characters must be keyed to address an entry and enter updated location data (9 characters for record and card code, an action digit, and 2 numeric location characters for a total of 12 characters versus the 14 characters required for entry correction). similarly, the offline print cost for locate data remains the same as that indicated for the initial locate list printout: $0.014 per entry. the biblio file update costs are the computer keyline production figures presented in table 6.

to derive the cost per entry, both locate and biblio figures are extended to reflect the proportion of the final figure they represent, and reduced to a single cost per entry. in summation, $0.0242 is the cost per entry for file maintenance for one year. however, this figure is of limited value without reference to either the frequency of supplemental production or total catalog reprint period. therefore, the optimum frequency of supplement production and the period of maintenance are discussed below to bring this raw $0.0242 per entry into perspective.

optimum frequency of supplement production and catalog reprints

the optimum frequency of bibliographic supplement production is based on the most timely reporting of new title disposition at the least cost.
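the annual maintenance figure of $0.0242 per entry derived above can be retraced from the transaction counts in table 1 and the unit costs in table 9. the python sketch below is a rough check, not the author's own procedure; the names are hypothetical, and the divisor of 129,000 titles is an assumption (it is the title count the paper uses in its conclusion, and it reproduces the stated figure).

```python
# illustrative check of the yearly file maintenance cost; names are hypothetical.

# table 1: yearly transactions
VOLUMES_ADDED = 105_000
VOLUMES_WITHDRAWN = 24_000
LAST_COPY_WITHDRAWALS = 4_000  # titles withdrawn
NEW_TITLES = 9_000

locate_actions = VOLUMES_ADDED + VOLUMES_WITHDRAWN + LAST_COPY_WITHDRAWALS
biblio_actions = NEW_TITLES

# table 9: unit costs
locate_unit = 0.007 + 0.014                      # input + offline locate print
biblio_unit = round(0.009 + 0.00034 + 0.027, 3)  # rounded to 0.036 as in table 9

locate_total = locate_actions * locate_unit
biblio_total = biblio_actions * biblio_unit
annual_total = locate_total + biblio_total

# ASSUMPTION: the yearly total is spread over 129,000 titles
# (120,000 held plus 9,000 new); the paper does not state the divisor here.
TITLES_REPORTED = 129_000
maintenance_per_entry = annual_total / TITLES_REPORTED

print(f"locate updates:  {locate_actions:,} actions, ${locate_total:,.2f}")
print(f"biblio updates:  {biblio_actions:,} actions, ${biblio_total:,.2f}")
print(f"annual total:    ${annual_total:,.2f}")
print(f"per entry:       ${maintenance_per_entry:.4f}")
```

the totals agree with table 9 ($2,793 and $324, or $3,117 together), and under the stated assumption the per-entry figure rounds to $0.0242.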
that is, a determination of the number of cumulative listings of new titles, in concert with all location changes, which can be produced before their production cost equals or exceeds the cost of total catalog reprint. the most economical approach to reporting revised, new, or deleted bibliographic and location entries would be through listing only those entries which have been changed.

the summary figures presented in table 10 reflect only the cost per entry developed in table 9 for the production of cumulative exception listings, assuming an equal monthly distribution of transactions. in addition, the annual cost, excepting the twelfth month, is tabulated to reflect overall cost where total reprint would occur instead of the last cumulative supplement cycle. a quarterly supplement production cycle is selected, as it best meets the optimum defined earlier (i.e., most timely reporting for the least cost).

table 10. cumulative supplement costs for various cycles (annual cost @ $0.0242/entry)

computer runs per year    12th month included    12th month excluded
12                        $22,335.92             $18,899.63
6                          12,027.04               8,590.74
4 (selected)                8,590.74               5,154.44
3                           6,872.26               3,436.30
2                           5,154.44               1,718.15
1                           3,436.30

by extending the quarterly supplement production costs shown in table 10 to represent recurring annual expenses and cumulating these annual expenses for comparison with the total cost of complete book catalog and locate list product, the number of years between catalog reprints becomes obvious. this calculation is shown in table 11, where 3 years is the optimum reprint cycle for the quarterly supplement costs selected.

table 11. catalog reprint vs. supplement production costs

years           supplement cost,      supplement cost, 12th month    catalog reprint
                annual cumulative     excluded (year's end)
1               $ 8,591               $ 5,154                        $23,000
2                17,182                13,746                         24,250
3 (selected)     25,773                22,337                         25,500
4                34,364                30,928                         26,750

comparison and conclusion

to return to the cost per entry for catalog maintenance alone for optimum reprint cycle, there is a total outlay of $47,837 ($22,337 for 3 years of cumulative supplements plus $25,500 for a catalog reprint) to report an average of 129,000 titles. from this base can be derived a cost per entry of $0.37 for entry maintenance. this $0.37 can then be summed with the $0.52 cost per entry for the catalog "first edition", for a grand total of $0.89 as the cost per entry for a computer assisted catalog production and maintenance system. further, this cost per entry is realized in a document equal to 400 card catalogs!

in terms of the manual system, maintenance was $0.72 per entry, and some 26 files had to be maintained. thus, it is possible to extend the single file maintenance cost to a systemwide average of $18.72 (26 files at $0.72 each), plus the $0.99 required for entry preparation, or a grand total of $19.72 per entry, rather than the $1.71 indicated earlier.

the lesson implied here is simple: manual cost per entry is dependent upon the number of manual files being maintained. this is of importance since it means a significant increase in outlay for file maintenance with the addition of each new branch; whereas, costing for a computer produced and maintained catalog is relatively independent of the number of service units accommodated.

finally, a word of caution. there is a potential danger lurking in these figures for the small public library which has a limited number of branches. this is the fact that the cost per entry, even for the single shelf list/card-catalog comparison, has been calculated for an operating system serving a relatively large number of branches.
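the reprint-cycle arithmetic above can be retraced with a short script. this is only a sketch: the dollar figures are transcribed from table 11 and the surrounding text, while the function name `optimum_reprint_year` and the rule it applies (reprint in the last year whose cumulative supplement cost, 12th month excluded, is still below the reprint cost) are an assumed reading of the selection criterion, not something the paper states as code.

```python
# figures transcribed from table 11 (dollars); keys are years since the last full reprint
SUPPLEMENTS_EXCL_12TH = {1: 5_154, 2: 13_746, 3: 22_337, 4: 30_928}
CATALOG_REPRINT = {1: 23_000, 2: 24_250, 3: 25_500, 4: 26_750}
ENTRIES = 129_000          # average number of titles reported (from the text)
FIRST_EDITION_COST = 0.52  # cost per entry of the catalog "first edition" (from the text)


def optimum_reprint_year(supplements, reprint):
    """Return the last year in which cumulative supplements still cost less than a reprint.

    Hypothetical helper: assumes supplement cost overtakes reprint cost once and stays above it.
    """
    cheaper = [year for year in sorted(supplements) if supplements[year] < reprint[year]]
    return cheaper[-1]


year = optimum_reprint_year(SUPPLEMENTS_EXCL_12TH, CATALOG_REPRINT)
total_outlay = SUPPLEMENTS_EXCL_12TH[year] + CATALOG_REPRINT[year]
maintenance_per_entry = total_outlay / ENTRIES

print(f"reprint every {year} years; outlay ${total_outlay:,}")
print(f"maintenance per entry: ${maintenance_per_entry:.2f}")
print(f"production plus maintenance: ${FIRST_EDITION_COST + maintenance_per_entry:.2f}")
```

with the table 11 figures the script selects year 3, an outlay of $47,837, and per-entry costs of $0.37 and $0.89, matching the text. changing the dictionaries shows how the cycle would shift for a library with different supplement or reprint costs.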
the cost-per-entry method used in this paper does not include amortization of the capital outlay for "computerization" which, in this specific case, amounts to almost $200,000 for design of system, procedures and forms, and for design, coding and debugging of programs. although savings equal to this amount, or more, would be realized over a period of time because of reduced clerical operations and attendant burden, a large sum would still have to be earmarked for expenditure during a relatively short period with no immediate return. foreknowledge of this "one-shot" cost and its related cost-per-entry payoff should not be a deterrent. rather, it should permit the administrator of a limited operation to deal effectively with increased clerical costs and to make meaningful decisions relative to service bureau overtures, library board interrogations, or the goals of a new library system.

references

1. hayes, robert m.; shoffner, ralph m.; weber, david c.: "the economics of book catalog production," library resources and technical services, 10 (winter 1966), 65, 68-82, 87-88.
2. university of california, institute of library research: report to the california state library preliminary evaluation of the feasibility of mechanization (institute of library research, university of california, 1966), p. 3-6.
3. cartwright, kelly l.; shoffner, ralph m.: catalogs in book form: a research study of their implications for the california state library and the california union catalog, with a design for their implementation (institute of library research, university of california, 1967), p. 58-68.
4. bourne, charles: bibliographic data conversion techniques (mimeographed tables presented at oregon library mechanization workshop, june 1968), table ii.
5. chapin, richard e.; pretzer, dale h.: "comparative costs of converting shelf list records to machine readable form," journal of library automation, 1 (march 1968), 71.
6. black, donald v.: "creation of computer input in an expanded character set," journal of library automation, 1 (june 1968), 117-118.
7. fasana, paul j.: "automating cataloging functions in conventional libraries," library resources and technical services, 7 (summer 1963), 358, 361-365.
8. robinson, charles w.: "the book catalog: diving in," wilson library bulletin, 40 (november 1965), 265-268.
9. macquarrie, catherine; martin, beryl l.: "the book catalog of the los angeles county public library; how it is being made," library resources and technical services, 4 (summer 1960), 225-226.
10. heinritz, fred: "book versus card catalog costs," library resources and technical services, 7 (summer 1963), 231-236.
11. smith, f. r.; jones, s. o.: card versus book-form printout in a mechanized library system (douglas aircraft company, 1967; clearing house document #ad 653 697), p. 7-8.
12. wynar, don: "cost analysis in a technical services division," library resources and technical services, 7 (fall 1963), 320-326.

lita president’s message: moving forward with lita
bohyun kim
information technology and libraries | june 2019

bohyun kim (bohyun.kim.ois@gmail.com) is lita president 2018-19 and chief technology officer & associate professor, university of rhode island libraries, kingston, ri.

i am happy to share some updates on what i covered in my previous column. first of all, i am excited to report that the merger planning of lita, alcts, and llama is back on track. the merger planning had been temporarily put on hold due to the target date for the merger being delayed from fall 2019 to fall 2020, as announced earlier this year. after taking some time after the 2019 ala midwinter meeting, the current leadership of lita, alcts, and llama met, reviewed the work that we have accomplished so far, and decided that the remaining work will now go to the capable hands of the president-elects of lita, alcts, and llama, who were elected this april.
during their term, this new cohort of president-elects will build on the work done by the cross-divisional working groups, in order to present the three-division merger for the membership vote in spring 2020 with more details. another piece of good news is that lita, alcts, and llama will begin experimenting with joint programming in order to kickstart our collaboration while the merger planning continues. the lita board decided to hold the next lita forum in fall 2020. alcts is also planning for its second virtual alcts exchange to take place in spring 2020. lita, alcts, and llama will work together on both program committees of the lita forum and the alcts exchange to provide a wider and more interesting range of programs at both conferences. if the membership vote result is in favor of the three-division merger, then the new division will be officially formed in fall 2020, and the planned 2020 lita forum may become the first conference of the new division. shortly after the 2019 ala midwinter meeting, the lita board decided to commit funds to create and disseminate an online allyship training to address the issues aggressive behavior, racism, and harassment reported at the midwinter meeting.1 since then, the lita staff and the lita board of directors have been closely working with the ala office and several other divisions, alcts, alsc, asgcla, pla, rusa, and united, reviewing options. it is likely that this training will follow the “train-the-trainer” model, in order to generate and expand the pool of allyship trainers who will develop and run the lita’s online allyship training for lita members. our goal is to expand our collective capacity to strengthen active and effective allyship, recognize and undo oppressive behaviors and systems, and promote the practice of cultural humility, which requires ongoing efforts, not just a one-time event. we hope to be able to announce more details soon once the final plan is determined. 
i would also like to highlight the lita award winners who will be celebrated at the 2019 ala annual conference in washington d.c. and to thank the members of the award committees for their hard work.2 the 2019 lita/ex libris student writing award will go to sharon han, a master of science in library and information science candidate at the university of illinois school of information sciences, for her paper, "weathering the twitter storm: early uses of social media as a disaster response tool for public libraries during hurricane sandy," which is included in this issue. charles mcclure and john price wilkin were selected as the 2019 winners of the lita/oclc frederick g. kilgour award for research in library and information technology and the hugh c. atkinson memorial award sponsored by acrl, alcts, llama, and lita, respectively. charles mcclure is the francis eppes professor of information studies in the school of information and the director of the information use management and policy institute at florida state university. john price wilkin is the juanita j. and robert e. simpson dean of libraries at the university of illinois at urbana-champaign. the north carolina state university libraries will receive the 2019 lita/library hi tech award for outstanding communication in library and information technology, which recognizes outstanding individuals or institutions for their long-term contributions to the education of the library and information science technology field and is sponsored by lita and emerald publishing. other not-to-be-missed lita highlights at the 2019 ala annual conference in washington d.c.
include the lita top tech trends program widely known for its insightful overview of emerging technologies, the lita president’s program with meredith broussard, a data journalist and the author of artificial unintelligence: how computers misunderstand the world,3 as the speaker, and the lita happy hour, a lively social gathering of all library technologists and technology enthusiasts. the lita avram camp is also preparing for another terrific all-day discussion and activities this year for women and non-binary library technologists to examine the shared challenges, to network, and to support one another. the lita imagineering interest group has put together another fantastic program, “agency, consent, and power in science fiction and fantasy,” featuring four sci-fi authors: sarah gailey, malka older, john scalzi, and martha wells. the lita membership committee is also preparing a virtual lita kickoff orientation for those who are newly attending the ala annual conference. in this last column that i write as the lita president, i would like to express my sincere gratitude to the dedicated lita board of directors, the always fantastic lita staff, and many lita leaders and members whose creativity, passion, and energy continue to drive lita forward. serving as the chief elected officer of one of the leading membership associations in library technology has been a true honor to me, and having such a great team of people to work with has been of tremendous help to me in tackling many daunting tasks. it is often said that all lita presidents face unique challenges during their terms. i can say that this has been certainly true during my term. working together with the alcts and the llama leadership on the three-division merger was a valuable experience and a privilege. while we could not move things as quickly as we hoped, we have built a great foundation for the next phase of the planning and learned many things together along the way.
last but not least, i would like to thank everyone who stood for the election and congratulate all newly-elected lita officers: evviva weinraub for the president-elect, hong ma and galen charlton for board of directors at large, and jodie gambill for the lita councilor. i am confident that, led by the incoming lita president, emily morton-owens, the capable and dedicated lita leadership will continue to accomplish many great things with energetic and forward-thinking lita members in coming years. the future of lita is brighter with these new lita leaders. good luck and thank you for your service!

https://doi.org/10.6017/ital.v38i2.11093

endnotes

1 “lita’s statement in response to incidents at ala midwinter 2019,” lita blog, february 4, 2019, https://litablog.org/2019/02/litas-statement-in-response-to-incidents-at-ala-midwinter2019/.
2 “lita awards & scholarships,” library information technology association (lita), http://www.ala.org/lita/awards.
3 meredith broussard, artificial unintelligence: how computers misunderstand the world (cambridge, massachusetts: the mit press, 2018).

book reviews

theory and application of information research. edited by ole harbo and leif kajberg. london: mansell publishing, 1980. 235p. £16.00. isbn: 0-7201-1513-2.

this book reproduces twenty-one papers presented at the second international research forum on information science, which was held at the royal school of librarianship in copenhagen during august of 1977. the title of this work may be misleading since the majority of the papers could better be described as the foundations of information science. the papers that advanced the theory of information science were the exception, and the contributions dealing with practical applications were even rarer. the contributors included many familiar names: kathleen t. bivins, anthony debons, william goffman, manfred kochen, allan d. pratt, and hans h.
wellisch from the united states; nicholas j. belkin, j. m. brittain, b. c. brookes, robert a. fairthorne, j.-m. griffiths, m. h. heine, s. e. robertson, b. c. vickery, and t. d. wilson from the united kingdom; and many names from europe that may be less familiar on this side of the atlantic. the forum was organized into five sessions: general models of information science, information science in relation to other scientific disciplines, measurement, the information retrieval process, and the future tasks of information scientists in europe. within the book, the distinction between these sessions generally is not obvious. appendixes give the forum program, summarize the discussions of the papers, and report on group discussions. in the introduction, it was stated that it was hoped that the forum would bridge the gap between theory and research on one side and practice on the other. the book does not fulfill this hope, but it does present a good collection of papers dealing with a variety of aspects in information science. the view that the main problems of information science are cognitive rather than technical is evident in many of the papers. however, bradford's law, shannon's theory, and the epidemic model are addressed in several of the papers. with a few exceptions, the papers are quite readable and do not require a mathematical background to be understood and appreciated. the summaries and group discussions are disappointing, possibly because several of the authors were unable to attend the forum. kathleen bivins was the only american contributor present. there is no index, although one would have been helpful. the book is valuable and should be part of any library collection covering information science. anyone interested in information science should be able to find several highly relevant papers. however, only a limited number of scholars will find it necessary to read the entire work.-edward t. o'neill, matthew a.
baxter school of information and library science, case western reserve university, cleveland, ohio.

personal documentation for professionals: means and methods, by v. stibic. amsterdam: north-holland publ. co., 1980. 214p. $29.25 (dfl 60.00). isbn: 0-444-85480-0.

while there have been a number of books written on the design, development, and use of large-scale database systems, there have been few that focus on the control of one's own personal collection of reprints, memoranda, reports, drafts, slides, and related miscellanea, which accumulate so rapidly in any professional "information-handler's" office. stibic's book addresses this problem in a thoroughly professional and competent manner. his first two chapters introduce the general nature of the problem, and discuss professionals' information needs and sources. the third, "document description," covers the record structure, abstracting, subject descriptions, keywords and classification methods, and their various combinations. the fourth chapter details the various technical means for storage of original documents, microfilm, and such control mechanisms as card indexes, peek-a-boo cards, and computer-supported indexes. all of these chapters draw on the experience and practices familiar to users of large-scale systems. stibic recommends the use of iso and other standardized practices, and endeavors to emphasize the need for constructing one's own system in accord with generally accepted design principles. stibic is careful to point out, however, that if one is in fact designing a personal documentation system, then personal idiosyncrasies and preferences can be built into it. it is not necessary to use an established and standardized vocabulary or classification system without modification. one may alter it to suit one's own purposes. however, the structure of the system (whether descriptors, classification numbers, or other means) must be controlled; otherwise the system will become useless.
the next four chapters are case studies of different systems. the first is a card index technique used by an individual. the second describes a computerized index to support the documentation needs of a project team (essentially an augmented kwic index, published quarterly). the third case study is one of particular interest to many professionals at the moment: the use of a personal computer as an indexing control system. the system, though not explicitly identified, is roughly comparable to many of those available in the u.s.: a microcomputer with 64k ram, a display of 80x24 lines, two floppy disks with 512k bytes/disk, and an 80-character-per-line printer. the indexing is done via a faceted classification system of about 250 terms, which are hierarchically linked, providing automatic up-posting from specific to generic terms. a hashcoding technique is used to minimize the storage space required on the disk, and searching is performed by simple serial searching of the index records. the fourth case study is an examination of the upgrading of the manual card index described in the first study to a system supported by a large main-frame computer, using a terminal in the professional's office. a combination of automatic keyword extraction and manual classification is used for indexing. complex boolean searches are possible with this system. stibic concludes with a chapter on future prospects, touching briefly on such things as internal and public viewdata/teletext systems. he also provides a checklist of desirable features of "a multi-purpose personal work station." such a station is not merely a special-purpose device used to aid in some parts of one's work, such as retrieval, but is an integral part of all of one's work: computer, calculator, text processor, mail-dispatch system, calendar, in/out box, and so forth. the author, a scientist of long standing with philips in holland, has provided a valuable guide to this area.
there are two relatively minor points of criticism, however. whether it was the author's or the publisher's choice is not clear, but there is an excessive use of italics throughout the text. this lavish use seems more appropriate to teenagers' romantic novels than to a serious work. in this case, it is more distracting than helpful. secondly, but more understandably, the extensive references stibic gives are frequently to documents not easily available in the u.s. some are oecd papers, some refer to the german din standards, and some to internal philips technical reports. these are minor points, however, regarding an excellent book. it is recommended not only for the information professional, but for anyone who is seriously concerned with the problem of keeping track of what one needs to know.-allan d. pratt, university of arizona graduate library school, tucson.

journal of library automation vol. 14/2 june 1981

viewdata revolution, by sam fedida and rex malik. a halsted press book. new york: wiley, 1979. 186p. $34.95. lc: 7923869. isbn: 0-470-26879-4.

sam fedida is the inventor of prestel, the british post office's viewdata system. with this as his license, he and rex malik have written a 186-page volume explaining the prestel system. prestel is a series of databases, which are accessed by a keypad similar to a calculator. the common television takes on the characteristic of a crt for viewing alphabetical and numerical information. the connection to the computer is by telephone, and, in britain, the post office is in charge of the telephones. overall, in spite of several printing errors, this book does provide information about the system. the authors explain the types of information that will be available on the prestel system, such as "buying a car," "houses for sale," "entertainment," "education," "an evening out," and "news.
" they have also devoted individual chapters to electronic mail, electronic funds transfer, and education, explaining how each works in the system. the authors stress the benefits and attributes of their system almost to the point of redundancy . in each of the chapters, the manner in which the information is going to be accessed is repeated. despite the repetition, the primary focus is what prestel will do for the betterment of mankind. the uniqueness of prestel is the simplicity of its access process. according to the authors, being able to access the information in one's own home will make prestel a major tool for dissemination of information for many agencies and businesses. at times, the "hard sell" is very obvious throughout the volume. however, the diagrams are good and help to explain the authors' points. the problems fedida and malik anticipate in the electronic mail and protocols are realistic. in the chapters "future i" and "future ii," the authors go off on a tangent, using a time line, on what they see in the future. again, it is basically a repetition of what was said in the previous chapters, only from a futuristic point of view. here, the reader gets a distinct feeling of what is really bothering them now in the system; that is, government bureaucracy . they cite the different groups trying to control the information by means of legislation. they delve into the problem of uniformity of standards. television is an example . what will be standard for convertors and adapters for the computer hookup? this is a real problem that was well explored throughout the work. this volume is good for librarians who are interested in cable, telecommunications, and computers . however, be aware of its poor organization. there are numerous printing errors that affect its readability. nevertheless, if a person can wade through these errors and the repetition of ideas, he/she can obtain some useful information from this text. 
there is a distinct feeling throughout this work that it was put together hastily . nonetheless, there is a dearth of information on this subject, and this book will serve some useful purpose for libraries .-robert miller, memphis/shelby county public library and information center, memphis, tennessee . ala filing rules. filing committee, resources and technical services division, american library association . chicago: american library assn., 1980. 50p. $3.50. lc: 80-22186. isbn: 0-8389-3255-x. library of congress filing rules. prepared by john c . rather and susan c. biebel. washington, d. c.: library of congress, 1980. ll1p . $5. lc: 80-607944 isbn: 0-8444-0347-4. available from customer services section, cataloging distribution service, library of congress, washington, dc 20541. these two works represent the culmination of over a decade of effort within the library profession to overhaul the techniques by which entries are arranged to form catalogs. the impetus for this work came from recognition that computer technology would soon be enlisted to perform the arrangement of entries for the production of catalogs, and that filing rules current at the time would be impossible to implement in their entirety on the computer. although the original intention was to develop rules appropriate for the arrangement of entries by computer, those at the library of congress and the ala committee working on the problem soon realized that, from the point of view of catalog users, it would be very undesirable to have different sets of filing rules in operation depending on the physical medium of the catalog. therefore, the scope of the effort was broadened to rules that could be applied both manually and by machine using headings that were formulated according to more than one set of cataloging rules. now that we have these new rules, the question arises whether they are better than what preceded them . 
the criteria for "better" ought to be whether the rules make entries easier to find both for known-item searches and browsing within the complex device called a library catalog. or to state the same criteria negatively: it should be more difficult to lose an entry in the catalog if it has been filed according to the rules. the evaluation of these rules against other possible approaches to catalog arrangement ought to be centered on observation of the needs of a variety of both experienced and unsophisticated catalog users and on measurement of the effectiveness of the alternative approaches to meet these needs. the complex problems of filing clearly exemplify the need for research as recently expressed by herb white in his columns in american libraries. lacking any empirical data on which to base an evaluation, we must rely on our professional judgment and personal biases to argue the case for the new rules. to this reviewer, it seems that common sense supports a set of rules that are simple, consistent, and easy to explain to library users. the need for simplicity and consistency directly implies the "file-as-is" principle (i.e., file exactly as the heading is visually constructed, not by some interpretation of it), which should be applied even at the cost of having to search in more than one place in the arrangement; e.g., numeric digits and numeric words, mac and mc, muller and mueller. the file-as-is principle has been more consistently applied in the ala rules than the lc rules, the latter undoubtedly a result of the anticipated complexity and size of lc's catalogs, although there is no justification argued for these departures from the basic principle. of specific interest to readers of the journal is whether these rules can be implemented for computer sorting of catalog entries. do the rules succeed in meeting their original objective? the ala rules certainly appear to be amenable to very straightforward systems analysis and programming.
for this the committee and its chairperson, joe rosenthal, need to be commended. from some sources there are already claims of systems that fully implement the new ala rules, which certainly could be the case. however, it would be interesting to know how these systems deal with the following, which seem to be potentially troublesome:

• the lack of consistent support in the marc format for handling initial articles when the rules call for ignoring initial articles in corporate names other than personal or place names, title subheadings ($t subfield), and subject headings. the english articles obviously present no problem, but the table of articles in appendix 2 shows more than thirty words that can be both an article and the cardinal numeral 1. in addition, the footnote, "in hawaiian, the 'o emphatic' must be carefully distinguished from the preposition o, but o also serves the hawaiian language as a noun and a verb (each with several meanings), an adverb, and a conjunction," must surely give pause to the diligent systems designer. the recent library of congress practice of dropping nonfiling initial articles from heading fields still does not solve the problem of initial articles in the several million marc records that already exist in library catalogs.

• the requirement that roman numerals be filed numerically presents an opportunity to construct an interesting but not overly complex algorithm. however, although the marc format makes the identification of roman numerals in heading fields fairly straightforward (the $b subfield), the identification of roman numerals embedded in a long title is much more ambiguous. for example, does iv mean "4" or "intravenous"?

• the rules require that punctuation in an arabic numeral that is included to increase its readability is to be ignored in filing, but decimal points are significant in determining the numeric value of the number (i.e., .003 files before 1). how does one specify an algorithm to deal with the title, "5.000 kilometres dans le sud"? using european practice, this number is obviously 5,000, but why not 5 according to the computer algorithm?

• the special rule for nonroman alphabets (rule 7) is interesting: "if, in the arrangement of bibliographic records, it is necessary to distinguish access points containing characters in different nonroman alphabets, scripts and syllabaries (cf. rule 1, order of characters) the following order of precedence is used. ..." there follows a table beginning with amharic and ending in tibetan. that is the entire rule. systems designers who have implemented this rule clearly have transcendent skills! reliance on the marc language code in the 008 field has both theoretical and practical problems.

• the introductory text advises libraries to include in the file information notes and references that explain filing practices to catalog users. however, the rules do not specify where these references are to file in relation to other headings. admonishment to provide these at "appropriate points" is not much help.

• the ampersand is ignored in filing (for which we should be grateful). but, by including the optional rule 1.3, which allows filing the ampersand "as its spelled-out language equivalent," the ala committee has put systems designers in the position of having to explain why this rule cannot be implemented on the computer, at least not until the marc format includes a code for language of the field (not a likely development, and even then not all ambiguity would be eliminated). interestingly, the library of congress treats all ampersands as a character filing between blank and the letter a.

• the optional rule 9.1, which allows the inclusion of "the role of a person or a corporate body in a legal action in arranging access points," presents a problem when the rule requires suppression of all other relators. how is the computer programmed to recognize a legal action? is there a finite list of such relator words? differences between aacr2 and previous cataloging practices further complicate the use of this option.

admittedly, many of these problems are marginal in terms of the number of entries in a catalog affected, but to a systems designer, even though there is only one instance, it must be accounted for in the computer programs if the system can claim a "full" implementation of the rules. clearly, full implementation will require some changes in the marc format before all rules can be applied absolutely consistently and unambiguously.

the library of congress rules, although applying similar principles, depart significantly from the ala rules in detail and complexity. a full analysis of the implementation problems would require much more space than this review will allow. suffice it to say that although the library's libsked program has been under development for twelve years, and its strengths and limitations have undoubtedly influenced the development of these filing rules, there are elements in these rules that have not yet been implemented in libsked, and several where no one has yet figured out how to do it.

although the work on these rules is complete, there are two more projects the profession should undertake that would be most useful for those concerned with catalog development. in both sets of rules, there is mention in the introduction of the need for a brief version of the essential rules, which could be handed out to catalog users. why did the committee not develop such a brief guide and include it as an appendix to the rules? those of us who work on computers are familiar with the reference cards for programming languages put out by computer manufacturers. a similar format for the filing rules would be very useful.

another more difficult but equally useful project would be the publication of a standard design implementation of the ala filing rules expressed in terms of the marc format.
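the numeric ambiguities raised in the list of troublesome cases (roman numerals embedded in titles, and readability punctuation versus decimal points) can be made concrete with a small sketch. everything here is hypothetical illustration: the function names `roman_to_int` and `arabic_value` and the `decimal_point` parameter are assumptions, and the code is neither the ala rules' own algorithm nor any existing system's implementation.

```python
ROMAN = {"i": 1, "v": 5, "x": 10, "l": 50, "c": 100, "d": 500, "m": 1000}


def roman_to_int(token):
    """Return the value of a token read as a roman numeral, or None if it has other letters.

    Hypothetical helper: it does not check that the numeral is well formed, and it
    cannot tell whether a token such as "iv" or "mix" is meant as a numeral at all.
    """
    token = token.lower()
    if not token or any(ch not in ROMAN for ch in token):
        return None
    total = 0
    for i, ch in enumerate(token):
        value = ROMAN[ch]
        # subtractive notation: a smaller value before a larger one is subtracted
        if i + 1 < len(token) and value < ROMAN[token[i + 1]]:
            total -= value
        else:
            total += value
    return total


def arabic_value(token, decimal_point="."):
    """Numeric value of an arabic numeral under an assumed punctuation convention.

    The mark that is not the decimal point is treated as readability
    punctuation and ignored.
    """
    grouping = "," if decimal_point == "." else "."
    cleaned = token.replace(grouping, "").replace(decimal_point, ".")
    return float(cleaned)


print(roman_to_int("iv"), roman_to_int("mix"), roman_to_int("intravenous"))
print(arabic_value("5.000"), arabic_value("5.000", decimal_point=","))
print(arabic_value(".003") < arabic_value("1"))
```

the same string "5.000" yields 5.0 under one convention and 5000.0 under the other, and "mix" parses as a numeral although it is usually a word. the arithmetic is easy; what a program lacks is a reliable signal in the record for which reading applies.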
such a design would include the marc fields and subfields necessary for each possible entry from a bibliographic record and a description of any special processing required for particular data elements. the design would be expressed at a level that is independent of programming languages and computer hardware. we need a standard reference that translates the filing rules into the language of the marc format. the ala rules, in some tantalizingly brief instances, begin this process.

both sets of filing rules are significant improvements over those previously available to systems analysts. reference librarians should find these rules easy to explain to beleaguered catalog users. for their simplicity and relatively slight departure from the "file-as-is" principle, the ala rules are to be recommended. the library of congress rules, in their attempt to retain the classificatory structures that support the browsing user, further complicate the task of the user performing a known-item search. library research has indicated that the preponderance of catalog searches in research libraries are known-item searches.
john f. knapp, ringgold management systems, beaverton, oregon.

6 information technology and libraries | march 2009

paul t.
jaeger and zheng yan

one law with two outcomes: comparing the implementation of cipa in public libraries and schools

though the children’s internet protection act (cipa) established requirements for both public libraries and public schools to adopt filters on all of their computers when they receive certain federal funding, it has not attracted a great amount of research into the effects on libraries and schools and the users of these social institutions. this paper explores the implications of cipa in terms of its effects on public libraries and public schools, individually and in tandem. drawing from both library and education research, the paper examines the legal background and basis of cipa, the current state of internet access and levels of filtering in public libraries and public schools, the perceived value of cipa, the perceived consequences of cipa, the differences in levels of implementation of cipa in public libraries and public schools, and the reasons for those dramatic differences. after an analysis of these issues within the greater policy context, the paper suggests research questions to help provide more data about the challenges and questions revealed in this analysis.

the children’s internet protection act (cipa) established requirements for both public libraries and public schools to adopt filters on all of their computers, as a condition for receiving certain federal funds, to protect children from online content that was deemed potentially harmful.1 passed in 2000, cipa was initially implemented by public schools after its passage, but it was not widely implemented in public libraries until the 2003 supreme court decision (united states v. american library association) upholding the law’s constitutionality.2 now that cipa has been extensively implemented for five years in libraries and eight years in schools, it has had time to have significant effects on access to online information and services.
while the goal of filtering requirements is to protect children from potentially inappropriate content, filtering also creates major educational and social implications because filters also limit access to other kinds of information and create different perceptions about schools and libraries as social institutions. curiously, cipa and its requirements have not attracted a great amount of research into the effects on schools, libraries, and the users of these social institutions. much of the literature about cipa has focused on practical issues—either recommendations on implementing filters or stories of practical experiences with filtering. while those types of writing are valuable to practitioners who must deal with the consequences of filtering, there are major educational and societal issues raised by filtering that merit much greater exploration. while relatively small bodies of research have been generated about cipa’s effects in public libraries and public schools,3 thus far these two strands of research have remained separate. but it is the contention of this paper that these two strands of research, when viewed together, have much more value for creating a broader understanding of the educational and societal implications. it would be impossible to see the real consequences of cipa without the development of an integrative picture of its effects on both public schools and public libraries. in this paper, the implications of cipa will be explored in terms of effects on public libraries and public schools, individually and in tandem. public libraries and public schools are generally considered separate but related public sphere entities because both serve core educational and information-provision functions in society. 
furthermore, the fact that public schools also contain school library media centers highlights some very interesting points of intersection between public libraries and school libraries in terms of the consequences of cipa: while cipa requires filtering of computers throughout public libraries and public schools, the presence of school library media centers makes the connection between libraries and schools stronger, as do the teaching roles of public libraries (e.g., training classes, workshops, and evening classes).

the legal road to cipa

history

under cipa, public libraries and public schools receiving certain kinds of federal funds are required to use filtering programs to protect children under the age of seventeen from harmful visual depictions on the internet and to provide public notices and hearings to increase public awareness of internet safety. senator john mccain (r-az) sponsored cipa, and it was signed into law by president bill clinton on december 21, 2000. cipa requires that filters at public libraries and public schools block three specific types of content: (1) obscene material (that which appeals to prurient interests only and is “offensive to community standards”); (2) child pornography (depictions of sexual conduct and/or lewd exhibitionism involving minors); and (3) material that is harmful to minors (depictions of nudity and sexual activity that lack artistic, literary, or scientific value).

paul t. jaeger (pjaeger@umd.edu) is assistant professor at the college of information studies and director of the center for information policy and electronic government of the university of maryland in college park. zheng yan (zyan@uamail.albany.edu) is associate professor at the department of educational and counseling psychology in the school of education of the state university of new york at albany.
cipa focused on “the recipients of internet transmission,” rather than the senders, in an attempt to avoid the constitutional issues that undermined the previous attempts to regulate internet content.4 using congressional authority under the spending clause of article i, section 8 of the u.s. constitution, cipa ties the direct or indirect receipt of certain types of federal funds to the installation of filters on library and school computers. therefore each public library and school that receives the applicable types of federal funding must implement filters on all computers in the library and school buildings, including computers that are exclusively for staff use. libraries and schools had to address these issues very quickly because the federal communications commission (fcc) mandated certification of compliance with cipa by funding year 2004, which began in summer 2004.5 cipa requires that filters on computers block three specific types of content, and each of the three categories of materials has a specific legal meaning. 
the first type—obscene materials—is statutorily defined as depicting sexual conduct that appeals only to prurient interests, is offensive to community standards, and lacks serious literary, artistic, political, or scientific value.6 historically, obscene speech has been viewed as being bereft of any meaningful ideas or educational, social, or professional value to society.7 statutes regulating speech as obscene have to do so very carefully and specifically, and speech can only be labeled obscene if the entire work is without merit.8 if speech has any educational, social, or professional importance, even for embodying controversial or unorthodox ideas, it is supposed to receive first amendment protection.9 the second type of content—child pornography—is statutorily defined as depicting any form of sexual conduct or lewd exhibitionism involving minors.10 both of these types of speech have a long history of being regulated and being considered as having no constitutional protections in the united states. the third type of content that must be filtered— material that is harmful to minors—encompasses a range of otherwise protected forms of speech. cipa defines “harmful to minors” as including any depiction of nudity, sexual activity, or simulated sexual activity that has no serious literary, artistic, political, or scientific value to minors.11 the material that falls into this third category is constitutionally protected speech that encompasses any depiction of nudity, sexual activity, or simulated sexual activity that has serious literary, artistic, political, or scientific value to adults. along with possibly including a range of materials related to literature, art, science, and policy, this third category may involve materials on issues vital to personal well-being such as safe sexual practices, sexual identity issues, and even general health care issues such as breast cancer. 
in addition to the filtering requirements, section 1731 also prescribes an internet awareness strategy that public libraries and schools must adopt to address five major internet safety issues related to minors. it requires libraries and schools to provide reasonable public notice and to hold at least one public hearing or meeting to address these internet safety issues.

requirements for schools and libraries

cipa includes sections specifying two major strategies for protecting children online (mainly in sections 1711, 1712, 1721, and 1732) as well as sections describing various definitions and procedural issues for implementing the strategies (mainly in sections 1701, 1703, 1731, 1732, 1733, and 1741). section 1711 specifies the primary internet protection strategy, filtering, in public schools. specifically, it amends the elementary and secondary education act of 1965 by limiting funding availability for schools under section 254 of the communications act of 1934. through a compliance certification process within a school under supervision by the local educational agency, it requires schools to include the operation of a technology protection measure that protects students against access to visual depictions that are obscene, are child pornography, or are harmful to minors under the age of seventeen. likewise, section 1712 specifies the same filtering strategy in public libraries. specifically, it amends section 224 of the museum and library services act of 1996/2003 by limiting funding availability for libraries under section 254 of the communications act of 1934. through a compliance certification process within a library under supervision by the institute of museum and library services (imls), it requires libraries to include the operation of a technology protection measure that protects students against access to visual depictions that are obscene, child pornography, or harmful to minors under the age of seventeen.
section 1721 is a requirement for both libraries and schools to enforce the internet safety policy with the internet safety policy strategy and the filtering technology strategy as a condition of universal service discounts. specifically, it amends section 254 of the communications act of 1934 and requests both schools and libraries to monitor the online activities of minors, operate a technical protection measure, provide reasonable public notice, and hold at least one public hearing or meeting to address the internet safety policy. this is through the certification process regulated by the fcc. section 1732, titled the neighborhood children’s internet protection act (ncipa), amends section 254 of the communications act of 1934 and requires schools and libraries to adopt and implement an internet safety policy. it specifies five types of internet safety issues: (1) access by minors to inappropriate matter on the internet; (2) safety and security of minors when using e-mail, chat rooms, and other online communications; (3) unauthorized access; (4) unauthorized disclosure, use, and dissemination of personal information; and (5) measures to restrict access to harmful online materials.
from the above summary, it is clear that (1) the two protection strategies of cipa (the internet filtering strategy and safety policy strategy) were equally enforced in both public schools and public libraries because they are two of the most important social institutions for children’s internet safety; (2) the nature of the implementation mechanism is exactly the same, using the same federal funding mechanisms as the sole financial incentive (limiting funding availability for schools and libraries under section 254 of the communications act of 1934) through a compliance certification process to enforce the implementation of cipa; and (3) the actual implementation procedure differs in libraries and schools, with schools to be certified under the supervision of local educational agencies (such as school districts and state departments of education) and with libraries to be certified within a library under the supervision of the imls.

economics of cipa

the universal service program (commonly known as e-rate) was established by the telecommunications act of 1996 to provide discounts, ranging from 20 to 90 percent, to libraries and schools for telecommunications services, internet services, internal systems, and equipment.12 the program has been very successful, providing approximately $2.25 billion a year to public schools, public libraries, and public hospitals. the vast majority of e-rate funding, about 90 percent, goes to public schools each year, with roughly 4 percent being awarded to public libraries and the remainder going to hospitals.13 the emphasis on funding schools results from the large number of public schools and the sizeable computing needs of all of these schools.
but even 4 percent of the e-rate funding is quite substantial, with public libraries receiving more than $250 million between 2000 and 2003.14 schools received about $12 billion in the same time period.15 along with e-rate funds, the library services and technology act (lsta) program administered by the imls provides money to each state library agency to use on library programs and services in that state, though the amount of these funds is considerably lower than e-rate funds. the american library association (ala) has noted that the e-rate program has been particularly significant in its role of expanding online access to students and to library patrons in both rural and underserved communities.16 in addition to the effect on libraries, e-rate and lsta funds have significantly affected the lives of individuals and communities. these programs have contributed to the increase in the availability of free public internet access in schools and libraries. by 2001, more than 99 percent of public school libraries provided students with internet access.17 by 2007, 99.7 percent of public library branches were connected to the internet, and 99.1 percent of public library branches offered public internet access.18 however, only a small portion of libraries and schools used filters prior to cipa.19 since the advent of computers in libraries, librarians typically had used informal monitoring practices for computer users to ensure that nothing age inappropriate or morally offensive was publicly visible.20 some individual school and library systems, such as in kansas and indiana, even developed formal or informal statewide internet safety strategies and approaches.21 why were only libraries and schools chosen to protect children’s online safety? while there are many social institutions that could have been the focus of cipa, the law places the requirements specifically on public libraries and public schools. 
if congress was so interested in protecting children from access to harmful internet content, it seems that the law would be more expansive and focused on the content itself rather than filtering access to the content. however, earlier laws that attempted to regulate access to internet content failed legal challenges specifically because they tried to regulate content. prior to the enactment of cipa, there were a number of other proposed laws aimed at preventing minors from accessing inappropriate internet content. the communications decency act (cda) of 1996 prohibited the sending or posting of obscene material through the internet to individuals under the age of eighteen.22 however, the supreme court found the cda to be unconstitutional, stating that the law violated free speech under the first amendment. in 1998, congress passed the child online protection act (copa), which prohibited commercial websites from displaying material deemed harmful to minors and imposed criminal penalties on internet violators.23 a three-judge panel of the district court for the eastern district of pennsylvania ruled that copa’s focus on “contemporary community standards” violated the first amendment, and the panel subsequently imposed an injunction on copa’s enforcement. cipa’s force comes from congress’s power under the spending clause; that is, congress can legally attach requirements to funds that it gives out. since cipa is based on economic persuasion (the potential loss of funds for technology), the law can only have an effect on recipients of those funds. while regulating internet access in other venues like coffee shops, internet cafés, bookstores, and even individual homes would provide a more comprehensive shield to limit children’s access to certain online content, these institutions could not be reached under the spending clause. as a result, the burdens of cipa fall squarely on public libraries and public schools.
the current state of filtering

when did cipa actually come into effect in libraries and schools?

after overcoming a series of legal challenges that were ultimately decided by the supreme court, cipa came into effect in full force in 2003, though 96 percent of public schools were already in compliance with cipa in 2001. when the court upheld the constitutionality of cipa, the legal challenge by public libraries centered on the way the statute was written.24 the court’s decision states that the wording of the law does not place unconstitutional limitations on free speech in public libraries. to continue receiving federal dollars directly or indirectly through certain federal programs, public libraries and schools were required to install filtering technologies on all computers. while the case decided by the supreme court focused on public libraries, the decision virtually precludes public schools from making the same or related challenges.25 before that case was decided, however, most schools had already adopted filters to comply with cipa. as a result of cipa, a public library or public school must install technology protection measures, better known as filters, on all of its computers if it receives
• e-rate discounts for internet access costs,
• e-rate discounts for internal connections costs,
• lsta funding for direct internet costs,26 or
• lsta funding for purchasing technology to access the internet.
the requirements of cipa extend to public libraries, public schools, and any library institution that receives lsta and e-rate funds as part of a system, including state library agencies and library consortia. as a result of the financial incentives to comply, almost 100 percent of public schools in the united states have implemented the requirements of cipa,27 and approximately half of public libraries have done so.28

how many public schools have implemented cipa?
according to the latest report by the department of education (see table 1), by 2005, 100 percent of public schools had implemented both the internet filtering strategy and safety policy strategy. in fact, in 2001 (the first year cipa was in effect), 96 percent of schools had implemented cipa, with 99 percent filtering by 2002. comparing these figures with the percentage of all public schools with internet access from 1994 to 2005 shows that internet access became nearly universal in schools between 1999 and 2000 (95 to 98 percent) and that the internet access percentage in 2001 was almost the same as the cipa implementation percentage. according to the department of education, the above estimations are based on a survey of 1,205 elementary and secondary schools selected from 63,000 elementary schools and 21,000 secondary and combined schools.29 after reviewing the design and administration of the survey, it can be concluded that these estimations should be considered valid and reliable and that cipa was immediately and consistently implemented in the majority of the public schools since 2001.30

table 1. implementation of cipa in public schools
year            1994  1995  1996  1997  1998  1999  2000  2001  2002  2003  2005
access (%)        35    50    65    78    89    95    98    99    99   100   100
filtering (%)      -     -     -     -     -     -     -    96    99    97   100

how many public libraries have implemented cipa?

in 2002, 43.4 percent of public libraries were receiving e-rate discounts, and 18.9 percent said they would not apply for e-rate discounts if cipa was upheld.31 since the supreme court decision upholding cipa, the number of libraries complying with cipa has increased, as has the number of libraries not applying for e-rate funds to avoid complying with cipa. however, unlike schools, there is no exact count of how many libraries have filtered internet access.
in many cases, the libraries themselves do not filter, but a state library, library consortium, or local or state government system of which they are a part filters access from beyond the walls of the library. in some of these cases, the library staff may not even be aware that such filtering is occurring. a number of state and local governments have also passed their own laws to encourage or require all libraries in the state to filter internet access regardless of e-rate or lsta funds.32 in 2008, 38.2 percent of public libraries were filtering access within the library as a result of directly receiving e-rate funding.33 furthermore, 13.1 percent of libraries were receiving e-rate funding as a part of another organization, meaning that these libraries also would need to comply with cipa’s requirements.34 as such, the number of public libraries filtering access is now at least 51.3 percent, but the number will likely be higher as a result of state and local laws requiring libraries to filter as well as other reasons libraries have implemented filters. in contrast, among libraries not receiving e-rate funds, the number of libraries now not applying for e-rate intentionally to avoid the cipa requirements is 31.6 percent.35 while it is not possible to identify an exact number of public libraries that filter access, it is clear that libraries overall have far lower levels of filtering than the 100 percent of public schools that filter access. e-rate and other program issues the administration of the e-rate program has not occurred without controversy. 
throughout the course of the program, many applicants for and recipients of the funding have found the program structure to be obtuse, the application process to be complicated and time consuming, and the administration of the decision-making process to be slow.36 as a result, many schools and libraries find it difficult to plan ahead for budgeting purposes, not knowing how much funding they will receive or when they will receive it.37 there also have been larger difficulties for the program. following revelations about the uses of some e-rate awards, the fcc suspended the program from august to december 2004 to impose new accounting and spending rules for the funds, delaying the distribution of over $1 billion in funding to libraries and schools.38 news investigations had discovered that certain school systems were using e-rate funds to purchase more technology than they needed or could afford to maintain, and some school systems failed to ever use technology they had acquired.39 while the administration of the e-rate program has been comparatively smooth since, the temporary suspension of the program caused serious short-term problems for, and left a sense of distrust of, the program among many recipients.40

filtering issues

during the 1990s, many types of software filtering products became available to consumers, including server-side filtering products (using a list of server-selected blocked urls that may or may not be disclosed to the user), client-side filtering (controlling the blocking of specific content with a user password), text-based content-analysis filtering (removing illicit content of a website using real-time analysis), monitoring and time-limiting technologies (tracking a child’s online activities and limiting the amount of time he or she spends online), and age-verification systems (allowing access to webpages by passwords issued by a third party to an adult).41 but because filtering software companies make the decisions about how the products
work, content and collection decisions for electronic resources in schools and public libraries have been taken out of the hands of librarians, teachers, and local communities and placed in the trust of proprietary software products.42 some filtering programs also have specific political agendas, which many organizations that purchase them are not aware of.43 in a study of over one million pages, for every webpage blocked by a filter as advertised by the software vendor, one or more pages were blocked inappropriately, while many of the criteria used by the filtering products go beyond the criteria enumerated in cipa.44 filters have significant rates of inappropriately blocking materials, meaning that filters misidentify harmless materials as suspect and prevent access to harmless items (e.g., one filter blocked access to the declaration of independence and the constitution).45 furthermore, when libraries install filters to comply with cipa, in many instances the filters will frequently be blocking text as well as images, and (depending on the type of filtering product employed) filters may be blocking access to entire websites or even all the sites from certain internet service providers. as such, the current state of filtering technology will create the practical effect of cipa restricting access to far more than just certain types of images in many schools and libraries.46

differences in the perceived value of cipa and filtering

based on the available data, there clearly is a sizeable contrast in the levels of implementation of cipa between schools and libraries. this difference raises a number of questions: for what reasons has cipa been much more widely implemented in schools? is this issue mainly value driven, dollar driven, both, or neither in these two public institutions? why are these two institutions so different regarding cipa implementation while they share many social and educational similarities?
reasons for nationwide full implementation in schools there are various reasons—from financial, population, social, and management issues to computer and internet availability—that have driven the rapid and comprehensive implementation of filters in public schools. first, public schools have to implement cipa because of societal pressures and the lobbying of parents to ensure students’ internet safety. almost all users of computers in schools are minors, the most vulnerable groups for internet crimes and child pornography. public schools in america have been the focus of public attention and scrutiny for years, and the political and social responsibility of public schools for children’s internet safety is huge. as a result, society has decided these students should be most strongly protected, and cipa was implemented immediately and most widely at schools. second, in contrast to public libraries (which average slightly less than eleven computers per library outlet), the typical number of computers in public schools ranges from one hundred to five hundred, which are needed to meet the needs of students and teachers for daily learning and teaching. since the number of computers is quite large, the financial incentives of e-rate funding are substantial and critical to the operation of the schools. this situation provides administrators in schools and school districts with the incentive to make decisions to implement cipa as quickly and extensively as possible. furthermore, the amount of money that e-rate provides for schools in terms of technology is astounding. as was noted earlier, schools received over $12 billion from 2000 to 2003 alone. schools likely would not be able to provide the necessary computers for students and teachers without the e-rate funds. 
third, the actual implementation procedure differs in schools and libraries: schools are certified under the supervision of the local educational agencies such as school districts and state departments of education; libraries are certified within a library organization under the supervision of the imls. in other words, the certification process at schools is directly and effectively controlled by school districts and state departments of education, following the same fundamental values of protecting children. the resistance to cipa in schools has been very small in comparison to libraries. the primary concern raised has been the issue of educational equality. concerns have been raised that filters in schools may create two classes of students—ones with only filtered access at school and ones who also can get unfiltered access at home.47 reasons for more limited implementation in libraries in public libraries, the reasons for implementing cipa are similar to those of public schools in many ways. public libraries provide an average of 10.7 computers in each of the approximately seven thousand public libraries in the united states, which is a lot of technology that needs to be supported. the e-rate and lsta funds are vital to many libraries in the provision of computers and the internet. furthermore, with limited alternative sources of funding, the e-rate and lsta funds are hard to replace if they are not available. given that the public libraries have become the guarantor of public access to computing and the internet, libraries have to find ways to ensure that patrons can access the internet.48 libraries also have to be concerned about protecting and providing a safe environment for younger patrons. while libraries serve patrons of all ages, one of the key social expectations of libraries is the provision of educational materials for children and young adults. children’s sections of libraries almost always have computers in them. 
much of the content blocked by filters is of little or no educational value. as such, “defending unfiltered internet access was quite different from defending catcher in the rye.”49 nevertheless, many libraries have fought against the filtering requirements of cipa because they believe that filtering violates the principles of librarianship or for a number of other reasons. in 2008, 31.6 percent of public libraries refused to apply for e-rate or lsta funds specifically to avoid cipa requirements, a substantial increase from the 15.3 percent of libraries that did not apply for e-rate because of cipa in 2006.50 as a result of defending patrons’ rights to free access, the libraries that are not applying for e-rate funds because of the requirements of cipa are being forced to turn down the chance for funding to help pay for internet access in order to preserve community access to the internet. because many libraries feel that they cannot apply for e-rate funds, local and regional discrepancies are occurring in the levels of internet access that are available to patrons of public libraries in different parts of the country.51 for adult patrons who wish to access material on computers with filters, cipa states that the library has the option of disabling the filters for “bona fide research or other lawful purposes” when adult patrons request such disabling. the law does not require libraries to disable the filters for adult patrons, and the criteria for disabling of filters do not have a set definition in the law.
the potential problems in the process of having the filters disabled are many and significant, including librarians not allowing the filters to be turned off, librarians not knowing how to turn the filters off, the filtering software being too complicated to turn off without injuring the performance of the workstation in other applications, or the filtering software being unable to be turned off in a reasonable amount of time.52 it has been estimated that approximately 11 million low-income individuals rely on public libraries to access online information because they lack internet access at home or work.53 the e-rate and lsta programs have helped to make public libraries a trusted community source of internet access, with the public library being the only source of free public internet access available to all community residents in nearly 75 percent of communities in the united states.54 therefore usage of computers and the internet in public libraries has continued to grow at a very fast pace over the past ten years.55 thus public libraries are torn between the values of providing safe access for younger patrons and broad access for adult patrons who may have no other means of accessing the internet.
cipa, public policy, and further research
while the diverse implementations, effects, and levels of acceptance of cipa across schools and libraries demonstrate the wide range of potential ramifications of the law, surprisingly little consideration is given to major assumptions in the law, including the appropriateness of the requirements to different age groups and the nature of information on the internet. cipa treats all users as if they are the same level of maturity and need the same level of protection as a small child, as evidenced by the requirement that all computers in a library or school have filters regardless of whether children use a particular computer.
in reality, children and adults interact in different social, physical, and cognitive ways with computers because of different developmental processes.56 cipa fails to recognize that children as individual users are active processors of information and that children of different ages are going to be affected in divergent ways by filtering programs.57 younger children benefit from more restrictive filters while older children benefit from less restrictive filters. moreover, filtering can be complemented by encouragement of frequent positive internet usage and informal instruction to encourage positive use. finally, children of all ages need a better understanding of the structure of the internet to encourage appropriate caution in terms of online safety. the internet represents a new social and cultural environment in which users simultaneously are affected by the social environment and also construct that environment with other users.58 cipa also is based on fundamental misconceptions about information on the internet. the supreme court’s decision upholding cipa represents several of these misconceptions, adopting an attitude that ‘we know what is best for you’ in terms of the information that citizens should be allowed to access.59 it assumes that schools and libraries select printed materials out of a desire to protect and censor rather than recognizing the basic reality that only a small number of print materials can be afforded by any school or library. the internet frees schools and libraries from many of these costs. furthermore, the court assumes that libraries should censor the internet as well, ultimately upholding the same level of access to information for adult patrons and librarians in public libraries as students in public schools. these two major unexamined assumptions in the law certainly have played a part in the difficulty of implementing cipa and in the resistance to the law.
and this does not even address the problems of assuming that public libraries and public schools can be treated interchangeably in crafting legislation. these problematic assumptions point to a significantly larger issue: in trying to deal with the new situations created by the internet and related technology, the federal government has significantly increased the attention paid to information policy.60 over the past few years, government laws and standards related to information have begun to more clearly relate to social aspects of information technologies such as the filtering requirements of cipa.61 but the social, economic, and political ramifications for decisions about information policy are often woefully underexamined in the development of legislation.62 this paper has documented that many of the reasons for and statistics about cipa implementation are available by bringing together information from different social institutions. the biggest questions about cipa are about the societal effects of the policy decisions:
has cipa changed the education and information-provision roles of libraries and schools?
has cipa changed the social expectations for libraries and schools?
have adult patron information behaviors changed in libraries?
have minor patron information behaviors changed in libraries?
have student information behaviors changed in school?
how has cipa changed the management of libraries and schools?
will congress view cipa as successful enough to merit using libraries and schools as the means of enforcing other legislation?
but these social and administrative concerns are not the only major research questions raised by the implementation of cipa. future research about cipa not only needs to focus on the individual, institutional, and social effects of the law.
it must explore the lessons that cipa can provide to the process of creating and implementing information policies with significant societal implications. the most significant research issues related to cipa may be the ones that help illuminate how to improve the legislative process to better account for the potential consequences of regulating information while the legislation is still being developed. such cross-disciplinary analyses would be of great value as information becomes the center of an increasing amount of legislation, and the effects of this legislation have continually wider consequences for the flow of information through society. it could also be of great benefit to public schools and libraries, which, if cipa is any indication, may play a large role in future legislation about public internet access.
references
1. children’s internet protection act (cipa), public law 106-554. 2. united states v. american library association, 539 u.s. 154 (2003). 3. american library association, libraries connect communities: public library funding & technology access study 2007–2008 (chicago: ala, 2008); paul t. jaeger, john carlo bertot, and charles r. mcclure, “the effects of the children’s internet protection act (cipa) in public libraries and its implications for research: a statistical, policy, and legal analysis,” journal of the american society for information science and technology 55, no. 13 (2004): 1131–39; paul t. jaeger et al., “public libraries and internet access across the united states: a comparison by state from 2004 to 2006,” information technology and libraries 26, no. 2 (2007): 4–14; paul t. jaeger et al., “cipa: decisions, implementation, and impacts,” public libraries 44, no.
2 (2005): 105–9; zheng yan, “limited knowledge and limited resources: children’s and adolescents’ understanding of the internet,” journal of applied developmental psychology (forthcoming); zheng yan, “differences in basic knowledge and perceived education of internet safety between high school and undergraduate students: do high school students really benefit from the children’s internet protection act?” journal of applied developmental psychology (forthcoming); zheng yan, “what influences children’s and adolescents’ understanding of the complexity of the internet?,” developmental psychology 42 (2006): 418–28. 4. martha m. mccarthy, “filtering the internet: the children’s internet protection act,” educational horizons 82, no. 2 (winter 2004): 108. 5. federal communications commission, in the matter of federal–state joint board on universal service: children’s internet protection act, fcc order 03-188 (washington, d.c.: 2003). 6. cipa. 7. roth v. united states, 354 u.s. 476 (1957). 8. miller v. california, 413 u.s. 15 (1973). 9. roth v. united states. 10. cipa. 11. cipa. 12. telecommunications act of 1996, public law 104-104 (feb. 8, 1996). 13. paul t. jaeger, charles r. mcclure, and john carlo bertot, “the e-rate program and libraries and library consortia, 2000–2004: trends and issues,” information technology and libraries 24, no. 2 (2005): 57–67. 14. ibid. 15. ibid. 16. american library association, “u.s. supreme court arguments on cipa expected in late winter or early spring,” press release, nov. 13, 2002, www.ala.org/ala/aboutala/hqops/pio/pressreleasesbucket/ussupremecourt.cfm (accessed may 19, 2008). 17. kelly rodden, “the children’s internet protection act in public schools: the government stepping on parents’ toes?” fordham law review 71 (2003): 2141–75. 18. john carlo bertot, paul t. jaeger, and charles r.
mcclure, “public libraries and the internet 2007: issues, implications, and expectations,” library & information science research 30 (2008): 175–84; charles r. mcclure, paul t. jaeger, and john carlo bertot, “the looming infrastructure plateau?: space, funding, connection speed, and the ability of public libraries to meet the demand for free internet access,” first monday 12, no. 12 (2007), www.uic.edu/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/2017/1907 (accessed may 19, 2008). 19. mccarthy, “filtering the internet.” 20. leigh s. estabrook and edward lakner, “managing internet access: results of a national survey,” american libraries 31, no. 8 (2000): 60–62. 21. alberta davis comer, “studying indiana public libraries’ usage of internet filters,” computers in libraries (june 2005): 10–15; thomas m. reddick, “building and running a collaborative internet filter is akin to a kansas barn raising,” computers in libraries 20, no. 4 (2004): 10–14. 22. communications decency act of 1996, public law 104-104 (feb. 8, 1996). 23. child online protection act (copa), public law 105-277 (oct. 21, 1998). 24. united states v. american library association. 25. r. trevor hall and ed carter, “examining the constitutionality of internet filtering in public schools: a u.s. perspective,” education & the law 18, no. 4 (2006): 227–45; mccarthy, “filtering the internet.” 26. library services and technology act, public law 104-208 (sept. 30, 1996). 27. john wells and laurie lewis, internet access in u.s. public schools and classrooms: 1994–2005, special report prepared at the request of the national center for education statistics, nov. 2006. 28. american library association, libraries connect communities; john carlo bertot, charles r. mcclure, and paul t. jaeger, “the impacts of free public internet access on public library patrons and communities,” library quarterly 78, no. 3 (2008): 285–301; jaeger et al., “cipa.” 29. wells and lewis, internet access in u.s.
public schools and classrooms. 30. ibid. 31. jaeger, mcclure, and bertot, “the e-rate program and libraries and library consortia.” 32. jaeger et al., “cipa.” 33. american library association, libraries connect communities. 34. ibid. 35. ibid. 36. jaeger, mcclure, and bertot, “the e-rate program and libraries and library consortia.” 37. ibid. 38. norman oder, “$40 million in e-rate funds suspended: delays caused as fcc requires new accounting standards,” library journal 129, no. 18 (2004): 16; debra lau whelan, “e-rate funding still up in the air: schools, libraries left in the dark about discounted funds for internet services,” school library journal 50, no. 11 (2004): 16. 39. ken foskett and paul donsky, “hard eye on city schools’ hardware,” atlanta journal-constitution, may 25, 2004; ken foskett and jeff nesmith, “wired for waste: abuses tarnish e-rate program,” atlanta journal-constitution, may 24, 2004. 40. jaeger, mcclure, and bertot, “the e-rate program and libraries and library consortia.” 41. department of commerce, national telecommunication and information administration, children’s internet protection act: study of technology protection measures in section 1703, report to congress (washington, d.c.: 2003). 42. mccarthy, “filtering the internet.” 43. paul t. jaeger and charles r. mcclure, “potential legal challenges to the application of the children’s internet protection act (cipa) in public libraries: strategies and issues,” first monday 9, no. 2 (2004), www.firstmonday.org/issues/issue9_2/jaeger/index.html (accessed may 19, 2008). 44. electronic frontier foundation, internet blocking in public schools (washington, d.c.: 2004), http://w2.eff.org/censorship/censorware/net_block_report (accessed may 19, 2008). 45. adam horowitz, “the constitutionality of the children’s internet protection act,” st. thomas law review 13, no. 1 (2000): 425–44. 46.
tanessa cabe, “regulation of speech on the internet: fourth time’s the charm?” media law and policy 11 (2002): 50–61; adam goldstein, “like a sieve: the child internet protection act and ineffective filters in libraries,” fordham intellectual property, media, and entertainment law journal 12 (2002): 1187–1202; horowitz, “the constitutionality of the children’s internet protection act”; marilyn j. maloney and julia morgan, “rock and a hard place: the public library’s dilemma in providing access to legal materials on the internet while restricting access to illegal materials,” hamline law review 24, no. 2 (2001): 199–222; mary minow, “filters and the public library: a legal and policy analysis,” first monday 2, no. 12 (1997), www.firstmonday.org/issues/issue2_12/minnow (accessed may 19, 2008); richard j. peltz, “use ‘the filter you were born with’: the unconstitutionality of mandatory internet filtering for adult patrons of public libraries,” washington law review 77, no. 2 (2002): 397–479. 47. mccarthy, “filtering the internet.” 48. john carlo bertot et al., “public access computing and internet access in public libraries: the role of public libraries in e-government and emergency situations,” first monday 11, no. 9 (2006), www.firstmonday.org/issues/issue11_9/bertot (accessed may 19, 2008); john carlo bertot et al., “drafted: i want you to deliver e-government,” library journal 131, no. 13 (2006): 34–39; paul t. jaeger and kenneth r. fleischmann, “public libraries, values, trust, and e-government,” information technology and libraries 26, no. 4 (2007): 35–43. 49. doug johnson, “maintaining intellectual freedom in a filtered world,” learning & leading with technology 32, no. 8 (may 2005): 39. 50. bertot, mcclure, and jaeger, “the impacts of free public internet access on public library patrons and communities.” 51. jaeger et al., “public libraries and internet access across the united states.” 52. paul t.
jaeger et al., “the policy implications of internet connectivity in public libraries,” government information quarterly 23, no. 1 (2006): 123–41. 53. goldstein, “like a sieve.” 54. bertot, mcclure, and jaeger, “the impacts of free public internet access on public library patrons and communities”; jaeger and fleischmann, “public libraries, values, trust, and e-government.” 55. bertot, jaeger, and mcclure, “public libraries and the internet 2007”; charles r. mcclure et al., “funding and expenditures related to internet access in public libraries,” information technology and libraries (forthcoming). 56. zheng yan and kurt w. fischer, “how children and adults learn to use computers: a developmental approach,” new directions for child and adolescent development 105 (2004): 41–61. 57. zheng yan, “age differences in children’s understanding of the complexity of the internet,” journal of applied developmental psychology 26 (2005): 385–96; yan, “limited knowledge and limited resources”; yan, “differences in basic knowledge and perceived education of internet safety”; yan, “what influences children’s and adolescents’ understanding of the complexity of the internet?” 58. patricia greenfield and zheng yan, “children, adolescents, and the internet: a new field of inquiry in developmental psychology,” developmental psychology 42 (2006): 391–93. 59. john n. gathegi, “the public library as a public forum: the (de)evolution of a legal doctrine,” library quarterly 75 (2005): 12. 60. sandra braman, “where has media policy gone? defining the field in the 21st century,” communication law and policy 9, no. 2 (2004): 153–82; sandra braman, change of state: information, policy, & power (cambridge, mass.: mit pr., 2007); charles r. mcclure and paul t.
jaeger, “government information policy research: importance, approaches, and realities,” library & information science research 30 (2008): 257–64; milton mueller, christiane page, and brendan kuerbis, “civil society and the shaping of communication-information policy: four decades of advocacy,” information society 20, no. 3 (2004): 169–85. 61. paul t. jaeger, “information policy, information access, and democratic participation: the national and international implications of the bush administration’s information politics,” government information quarterly 24 (2007): 840–59. 62. mcclure and jaeger, “government information policy research.”
random sample of personal names in the lc file indicates that less than 17 percent of personal names require cross-references. thus the personal name headings that occur only once but would require authority records because of cross-references could be less than 17 percent. the frequency data combined with reference structure data could have a significant impact on design. out of a total of 695,074 personal names in the authority files associated with the marc bibliographic files examined here, 456,328, or 66 percent, occur only once. of these, fewer than 77,575 would be expected to have cross-references; thus the name-authority file for personal names could be reduced in size from 695,074 records to 316,321, a 55 percent decrease. if separate authority records are a system requirement, the occurrence figures might then be useful for defining configurations that employ machine-generated provisional records for single-occurrence headings that do not have reference structures or that simplify in other ways the treatment of these headings. these figures may also be useful in making decisions on the addition of retrospective authority records to the automated files. reference 1.
william gray potter, "when names collide: conflict in the catalog and aacr2," library resources & technical services 24:7 (winter 1980).
rlin and oclc as reference tools
douglas jones: university of arizona, tucson.
the central reference department (social science, humanities, and fine arts) and the science-engineering reference department at the university of arizona library are currently evaluating the oclc and rlin systems as reference tools, to see if their use can significantly improve the effectiveness and efficiency of providing reference service. a significant number of the questions received by our librarians, and presumably by librarians elsewhere, involve incomplete or inaccurately cited references to monographs, conference proceedings, government documents, technical reports, and monographic serials. if by using a bibliographic utility a librarian can identify or verify an item not found in printed sources, then effectiveness has been improved. once a complete and accurate description of the item is found, it is a relatively simple task to determine whether or not the library has the item, and if not, to request it through interlibrary loan. additionally, if the efficiency of the librarian can be improved by reducing the amount of time required to verify or identify a requested item, then the patron, the library, and, in our case, the taxpayer, have been better served. the promise of near-immediate response from a computer via an online interactive terminal system is clearly beguiling when compared to the relatively time-consuming searching required with printed sources, which frequently provide only a limited number of access points and often become available weeks, months, or even years after the items they list.
we realize, of course, that the promise of instantaneous electronic information retrieval is limited by a variety of factors, and presently we view access to rlin and oclc as potentially powerful adjuncts to, not replacements for, printed reference sources. given that rlin and oclc have databases and software geared to known-item searches for catalog card production, our evaluation attempts to document their usefulness in reference service. a preliminary study conducted during the spring semester of 1980-81 indicated that approximately 50 percent of the questionable citations requiring further bibliographic verification could be identified on oclc or rlin. the time required was typically five minutes or less. successful verification using printed indexes to identify the same items ranged from 20 percent in the central reference department to 50 percent in science-engineering. time required per item averaged approximately fifteen minutes. based on our findings, we plan a revised and more thorough test during the fall semester of 1981-82, which will include an assessment of the enhancements to the rlin system scheduled to be operational this summer. the proposed test will involve eight members of the reference staff, four from each department, who will be trained to search on oclc and rlin. those selected will include both librarians and library assistants who regularly provide reference assistance. the results obtained from such a representative group will better enable us to assess the impact on the whole reference staff should we later decide to fully implement the service. these eight staff members will be the only ones involved in sampling questions and conducting comparative searches. the test will have two components, the first of which will be a twenty-week period to collect at least 400 sample questions.
during their regularly scheduled reference hours, the eight specially trained librarians will collect samples of reference requests for materials that, based on the information initially given by the patron, cannot be identified in the card catalog. after checking the catalog, the librarian will then complete the top portion of a two-page self-carbon form with all of the information that is known about the requested item. then, at regular intervals during the semester, the pages of each form will be separated and distributed to other members of the test staff for batch-mode searching. the manual oclc and rlin searching for each query will be done by different staff members to eliminate crossover effects. each request will be searched on both oclc and rlin with the following information being recorded:
1. date of the material requested (if known).
2. type of material (e.g., conference proceeding).
3. amount of time required to do the search.
4. success or failure of the search.
this information will then be cumulated in a statistical table, and the results of each search will be keypunched for computerized analysis using the bmdp (biomedical computer programs) statistical package to determine whether or not effectiveness and efficiency have been improved significantly. in addition, on twenty-four randomly selected days during the semester the trained searchers will count the total number of questions received by them on that day that would have been appropriate to search on rlin or oclc. by using these data it will be possible to extrapolate the potential usefulness of the systems for the entire semester. the second component of the test will be a two-week real-life test during which all questions requiring further verification would be searched immediately on rlin, oclc, and in the appropriate printed sources to compare time required, success rate, and type of material requested.
this sort of test would permit the searcher to continue to negotiate with the patron as the search progressed, which is the usual situation. also, this would provide the only opportunity to have the patron judge the value of subject searches done on rlin. if funding is received, preliminary results should be available in early 1982. anyone conducting similar or otherwise relevant studies is asked to contact the author.
replicating the washington library network computer system software
thomas p. brown: manager of computer services, and raymond debuse: manager of development and library services, washington library network, olympia.
the washington library network (wln) computer system supports shared cataloging and catalog maintenance, retrospective conversion, reference, com catalog production, acquisitions, and accounting functions for libraries operating within a network. the system offers both full marc and brief catalog records as well as linked authority control for all traced headings. it contains more than 250,000 lines of pl/1 and ibm bal code in more than 1,100 program modules and runs on ibm or ibm-compatible hardware with ibm operating systems (mvs, os/vs1). all database management functions are provided by adabas, a product of software a.g. of north america. the online system runs un
services to mobile users: the best practice from the top-visited public libraries in the us
article
yan quan liu and sarah lewis
information technology and libraries | march 2023 https://doi.org/10.6017/ital.v42i1.15143
yan quan liu (liuy1@southernct.edu) is professor, southern connecticut state university. sarah lewis (sbojo32@gmail.com) is mlis graduate, southern connecticut state university. © 2023.
abstract
libraries are adapting to the changing times by providing mobile services.
one hundred fifty-one libraries were chosen based on circulation, with at least one library or library system from each state, to explore the diverse services provided to mobile users across the united states. according to the data, mobile apps, mobile reference services, mobile library catalogs, and mobile printing are among public libraries’ most-frequently offered services, as determined by mobile visits, content analysis, and librarian survey responses. every library examined had at least one mobile website, mobile catalog, mobile app, or webpage adapted for a mobile device. following the covid-19 outbreak, services such as mobile renewal, subscriber database access, mobile reservations, and the ability to interact with a librarian were expanded to allow better communication with customers—all from the comfort and safety of their own homes. libraries are continually looking for innovative methods to assist their mobile customers as the world changes. introduction searching can be done on a computer, but it’s more likely to be done on a mobile device.1 according to data from the pew research center, about 96 percent of american adults own a cell phone.2 americans are connecting to the internet via their mobile devices in greater numbers than ever before. the pew report also states that 81 percent of americans have a smartphone.3 while direct usage is not measured in terms of how it relates to public libraries, the reality is that users are looking to connect with businesses and services through their mobile devices. while the covid virus’ ongoing spread has had a significant effect on various public services sectors around the world, libraries, especially in suburban areas, had to evolve and adapt to the changing environment. public libraries now offer more patron services. 
while many public libraries offered curbside services during covid-19 as a way to provide continuity of service, contactless services (services without having to speak with librarians directly) also became prevalent.4 these services cross geographical boundaries and reduce the risk of disease transmission from direct contact with other people. however, are there areas where libraries are lacking? are there areas where libraries can improve, especially as more and more people are relying on mobile services rather than in-person ones? this study examines current mobile services being offered in public libraries across the united states, how these offerings changed due to the pandemic, and what services libraries are looking to offer in the future.
literature review
several studies have explored mobile devices and their usage in a broader context. currently, it is estimated that 67 percent of the world’s population has mobile devices, with most of those devices being smartphones.5 in the united states alone, nearly 80 percent of the population owns a smartphone.6 people across the world use mobile media to talk to one another, order food, and attend meetings and appointments. “mobile media, referring to devices, services, and content accessible on the go, has in a decade rapidly become a part of urban culture, and the habit of using a mobile device in public is only increasing.”7 once the covid-19 pandemic hit, use of mobile media was the easy way to connect as shelter-in-place orders were mandated across the country. even as many people are venturing out again, using mobile media for certain services is likely going to continue. the world is forever changed because of the pandemic, and organizations like libraries have sought new ways to connect with their patrons.
status of library services provided in public libraries
while a few research articles have addressed services provided to mobile users in libraries, none at the time of writing were focused explicitly on services for mobile users in public libraries in the united states. some studies focused on libraries overseas.8 others focused on what drives a library to provide services for mobile devices and what drives users to want to access their library through a mobile device.9 with time, the availability of mobile-friendly services has become critical to a library’s long-term viability. the ability to search a library’s catalog, for example, has aided a traditional library’s modern relevance. “the digital library on mobile devices has been a milestone in library industry development, leading to huge changes in knowledge carrying, spreading, acquiring, processing, and sharing of cloud computing and big data in online and offline forms.”10 users want data at their fingertips, whether it is an online order or a library catalog. “social sharing functions such as reading, borrowing, sharing, comment tracking, automatic retrieval, and new book recommendations on mobile devices such as mobile phones have become the popular trend of digital library development.”11 when evaluating mobile library websites and apps that libraries use to offer services, the literature appears to imply that a mobile version of a library website is more prevalent than a dedicated (or “native”) app. this is because of the development cost, the different mobile platforms (apple, android, etc.), and the maintenance required.12 instead of creating an app, it is often easier to optimize the library’s website for mobile devices through responsive design techniques. the challenge for accessing a library website on a mobile device is making sure the website has the same functionality as when viewed on a desktop computer.
“today’s mobile users are no longer satisfied with simple mobile websites with only a small fraction of the information and features that are available on desktop websites. the small screen size of a mobile device may make performing certain tasks more tedious or cumbersome, but mobile users do expect to perform more and more tasks on their mobile devices.”13 lemire et al. performed a study of how mobile services have improved since 2010. while academic libraries were analyzed and surveyed, this information is helpful and relevant to the study of public libraries.

mobile apps offered across various libraries

while mobile apps and mobile webpages aren’t precisely the same (as shown by previous literature), the study “public library mobile apps in scotland: views from the local authorities and the public” offers some insight into the use of mobile apps. it seems that many libraries do not have or are not interested in developing a mobile app (this coincides with the findings from lemire et al.’s study in the united states).14 however, when the public was surveyed, they did want more remote services offered to mobile devices from their libraries. to increase the use of library services, apps should be considered.15 “by using an app instead of a mobile-enabled website, all the functionalities of smart technology can be incorporated to the library’s advantage. improved communication with patrons increases exposure to communities who otherwise would not use library services.”16

a similar study was done in malawi, where mobile library services were analyzed. as we might expect, since malawi has a developing economy, it has fewer mobile services than the more mature economies of the us or scotland. the country recognizes the potential of using mobile devices to access library services.
computer shortages are often a problem within libraries in developing countries, so by creating mobile services, users could access the library through their own mobile devices.17

studies have also found that it is important to know what users are looking for and to use that information when creating an app design. “perceived situation efficiency and perceived mobile library quality positively affect intention to use mobile library, demonstrating that both quality and situation efficiency are necessary to satisfy library users’ needs in mobile era.”18 the quality of the library app or mobile website is obviously important, but knowing what kind of services will be provided is also important. users are not going to turn to an app if it is not going to provide the information they need. this study showed that mobile users want to obtain the information they need as quickly and effectively as possible.

library services for mobile users

universities are often more likely to offer mobile services than small-town libraries are. nearly all prestigious universities in the united states are already using mobile-friendly services, according to a study of one hundred university libraries.19 typically, the services for mobile users include “mobile sites, mobile apps, mobile opacs, mobile access to databases, text messaging services, qr codes, augmented reality, and e-books.”20 public libraries may not offer all these mobile services, such as augmented reality, but the majority provide access to a mobile opac or library catalog.

guo, liu, and bielefield examined how urban public libraries provide services for and offer content to mobile users.21 their analysis explored what was being done in an urban setting to potentially help public libraries plan and create mobile services. they looked into literature dating back to 1991, when mobile data was just a thought in some forward-thinkers’ minds, as was the concept of what a “mobile device” entailed.
the study used current research to group library services into two categories: “traditional library services modified to be available via mobile devices and services created for mobile devices.”22 to conduct their study, they used a list of 138 urban libraries in the united states drawn from the urban libraries council. all 138 were examined using the same criteria. a list of content categories was created (components of mobile websites, components of mobile apps, mobile reference services, social media, mobile reservation services, mobile printing, apps or databases). the findings supported the hypothesis that services for mobile users have been in place in urban libraries in the united states. according to guo et al., 95 percent offer at least one type of service for mobile users.23

pope and others discussed sms or text messaging services to a mobile device. researchers also mention the my info quest project, which was trying to get more libraries on board with using text messaging, one of the mobile services studied in this review.24 literature from the fields of information science and library science shows that in recent years, typical library services have been adapted to be accessible via a mobile device rather than a service being developed specifically for a mobile platform.

research design

a selection of two to five of the most-visited public libraries per state was chosen based on statistics from the institute of museum and library services’ database of 9,247 public libraries (this number treats library systems as one).25 the libraries were chosen based on two criteria: libraries needed to be in states having a sizable number of public libraries and to have at least 3 million yearly in-person visitors. this resulted in a list of public libraries compiled to ensure that all notable libraries in the united states were covered in this study.
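the per-state selection described above can be sketched in code. this is a minimal illustration only, not the authors' procedure: the record fields (`name`, `state`, `visits`), the `pick_most_visited` helper, and the toy rows are all hypothetical, and the cap per state simply mirrors the "two to five" range mentioned in the text.

```python
from collections import defaultdict


def pick_most_visited(libraries, per_state_max=5):
    """Return up to `per_state_max` libraries per state, ranked by annual visits."""
    by_state = defaultdict(list)
    for lib in libraries:
        by_state[lib["state"]].append(lib)

    selected = []
    for state in sorted(by_state):
        # highest visit counts first within each state
        ranked = sorted(by_state[state], key=lambda lib: lib["visits"], reverse=True)
        selected.extend(ranked[:per_state_max])
    return selected


# toy rows (hypothetical, not study data)
toy = [
    {"name": "a", "state": "ct", "visits": 900},
    {"name": "b", "state": "ct", "visits": 1500},
    {"name": "c", "state": "ct", "visits": 300},
    {"name": "d", "state": "ny", "visits": 5000},
]

sample = pick_most_visited(toy, per_state_max=2)
print([lib["name"] for lib in sample])
```

a visit-count threshold, such as the one named in the selection criteria, could be applied as a filter before the ranking step.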
for the sample of 151 libraries, the state, population served, total circulation, total number of programs, total visits, and other associated data were entered into an excel spreadsheet for analysis. with a low of 168,661 and a high of 16,686,945, the average number of visitors per year (prior to the pandemic) to these libraries is 2,663,292. these libraries serve an average of 700,924 people, with a low of 8,542 and a high of 4,294,460. the dataset includes all branch libraries, with the largest system having 92 branches.

mobile website/app visits, content analysis, and email surveys were among the study methodologies used. the mobile website examination was conducted on an android mobile phone, using a specially designed codebook/spreadsheet to store and analyze data collected and verified from the library websites through 2021. the services offered were checked on each library’s website. email surveys were developed as a supplement to ensure data accuracy and to gather additional input on the library’s mobile services from the librarians who work there. irb approval was granted (protocol #406)26 because human participants did not provide personal information and simply responded to email surveys. the first questionnaire was sent out to each library via google forms, either through direct email addresses or the library’s web form, from april 3 through april 7, 2021. the follow-up email survey was designed to act as a check for some possible limitations (such as being unable to access app features) and was sent on april 19–20, 2021. the data was verified again through the imls database in early 2022.

the overarching question of what services libraries provided to users via their mobile devices was designed to delve into the mobile websites and/or apps provided by libraries, as well as the resources provided by the mobile websites or apps, reference services, reservation services, remote printing, and other services provided to mobile users.
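the codebook approach described above can be pictured as one record per library with a yes/no flag for each service, from which counts and percentages follow directly. the sketch below is an assumption-laden illustration: the field names, the `service_share` helper, and the four toy rows are hypothetical and are not the study's actual codebook or data.

```python
def service_share(codebook, service):
    """Return (count, percent) of libraries whose record flags `service` as offered."""
    count = sum(1 for row in codebook if row.get(service, False))
    # percent is rounded to a whole number; an empty codebook yields 0
    percent = round(100 * count / len(codebook)) if codebook else 0
    return count, percent


# toy codebook rows (hypothetical, not study data)
codebook = [
    {"library": "x", "web_form": True, "chat": True, "sms": False},
    {"library": "y", "web_form": True, "chat": False, "sms": False},
    {"library": "z", "web_form": False, "chat": True, "sms": True},
    {"library": "w", "web_form": True, "chat": False, "sms": False},
]

for service in ("web_form", "chat", "sms"):
    count, percent = service_share(codebook, service)
    print(f"{service}: {count} of {len(codebook)} ({percent}%)")
```

keeping the raw flags rather than only the totals makes it possible to recompute any percentage against a different denominator, for example only the libraries that offer a dedicated app.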
results and findings

examining the use of a mobile device to deliver services by these most-visited public libraries reveals intriguing and novel findings in comparison to the use of desktop access for services available to users. the survey sought the following information: which mobile services the libraries now provide, what they intend to provide, and any feedback received on their mobile services.

all library websites are accessible via mobile devices

every one of the 151 libraries analyzed had a mobile website. in certain situations, this was the library’s website optimized and tailor-made with responsive designs for a mobile device. in other cases, all that was provided was a version of the library’s website (but not optimized, so navigation was more difficult). in other cases, the library’s website was merged with the parent organization and simply provided basic library information. examples of these three conditions of the library mobile websites are shown in figure 1.

figure 1. examples of different library websites.

library services were made available to mobile devices primarily due to covid-19, as indicated by the responses of 60 (40%) of the 151 libraries to online questionnaires using google forms. though it is difficult in practice to compare the services offered in 2019 to those offered during the covid-19 pandemic, 38 (63%) of respondents indicated that they had added mobile services in the last year (see fig. 2).

figure 2. mobile services added because of covid-19.

what services were added, particularly since covid-19? chat features were the most popular response as a reference service and a means to connect.
one example is the use of discord, a chat program that allows users to communicate by voice, video, and text, as a virtual communication tool to provide reference services. nearly every library implemented curbside pickup. during the pandemic, virtual events and online reference services also became popular additions. book delivery was still one of the mobile services that libraries provided, either using “chomp delivery” (a local delivery service company used by one library in iowa) or the united states postal service (usps). while these “ship to patron” delivery approaches may not be a frequent practice, they were an inventive way of getting library books into the hands of clients who were unable to leave their homes. see figure 3 for common services added during covid-19.

over half of libraries provide dedicated apps for mobile devices

though all libraries had a website, more than half had a specific library app dedicated for users with mobile devices. out of the 151 libraries analyzed, 52 percent (78) had at least one dedicated app built for the library or library system (see fig. 4). these apps could be downloaded from the google play store (android platform) or the apple app store (ios platform). all but one of the 78 libraries allowed patrons to log in to their accounts to look at their current checkouts, search the catalog, place holds, and request items. other library applications such as hoopla or libby/overdrive, as well as apps that were used to display upcoming events and other library data, like locations and hours, were excluded.

figure 2 values: yes 63%, no 37%.

figure 3. common features added because of covid-19.

figure 4. percent of libraries that offer a mobile app.
figure 3 values: book delivery 5%, self-checkout 8%, digital library card 11%, mobile printing 14%, virtual reference services 14%, curbside 19%, virtual events 19%, chat 24%.

figure 4 values: no 48%, yes 52%.

services delivered through library mobile websites

all 151 libraries had some form of mobile website. although each website provided a lot of the same information, there were significant variances (see fig. 5). the ability to log in to one’s account was available on the investigated websites. this allows customers to search the library catalog, place holds on books, and renew any books that are currently checked out. all the websites included basic contact information as well as information about the locations and hours of the libraries, and all libraries have an online public access catalog (opac) accessible through mobile devices.

web-mediated services have been in place particularly since covid-19. nearly every library now offers services such as curbside pickup for books and other materials. library events are now almost exclusively offered virtually. over 98 percent of libraries have a calendar or other means of informing users about forthcoming virtual activities and events. some libraries are now making available re-opening plans, covid-19 protocols, and even covid-19 vaccine information. almost every library’s main webpage mentions modifications brought on by covid-19. as a result, certain library hours have changed, and services at all libraries have been impacted by covid-19.

typically, getting a library card was a service that physically took place in the library. during covid-19, libraries had to learn to be flexible. over 94 percent of libraries offered the ability to get a library card or allowed patrons to register for an e-card so they could check out books.
some of these libraries offered the option of printing the paperwork and necessary documentation to obtain a library card, but this paperwork did have to be dropped off at the library. however, many allowed their patrons to either extend the expiration date on a current library card or apply for a new one to be able to use the library’s virtual services.

many libraries require a library card to access some of their services, such as databases. nearly all libraries offered a variety of databases or other apps (such as libby/overdrive or hoopla). a library card is needed to access all of these services. a couple of libraries even have their databases behind a login screen, so a user cannot even see the list of available databases until they have a library card and log in. all libraries in the sample had a social media presence.

discrepancies on library websites were noted concerning new arrivals and recommendations. only about 55 percent of library websites listed new arrivals and about 52 percent listed recommendations. it is possible that this data is included within the library catalog (and only accessible once a user has logged in). out of all libraries analyzed, 43 libraries (28%) offered both recommendations and listed new arrivals. a smaller number, 31 libraries (21%), did not offer either of these.

additional services seen on some of the library websites included the option to recommend a purchase to the library, how and if libraries were accepting donations, and a list of online classes being offered. some libraries offered help with finding a job or becoming a united states citizen.

overall, the library websites provided a lot of valuable information to patrons. some were easier to navigate than others. the locations, hours, and contact information should be easy to find and access; however, on quite a few websites, these points of information are not easily accessed, since some libraries’ websites required logging in.
still, with some searching and scrolling, a user could get to nearly all the basic information they might be looking for.

figure 5. percentage of libraries in study that offered services via the library’s mobile website.

services delivered through library mobile apps

while the primary function of a library’s dedicated app seemed to be accessing the catalog and requesting books, common services delivered through apps also include listings of events, such as webinars, found in 35 libraries (45%) that have an app; recent arrivals or recommendations, found in 33 libraries (42%); and ways to contact the library, in 13 libraries (17%). (see fig. 6 for the complete list of services.) all percentages are calculated based on the number of libraries that had a dedicated app (78).

gathering the information available on mobile apps was challenging as some apps require users to be affiliated with that library and log in with a library card. the information on what the app provides primarily came from the screenshots in the google play store (for the investigators’ android mobile devices) and the description on either the library’s website or within the play store. the apple app store applications available for library use are not covered in this study. as a result, the number of apps that offer additional services (such as locations/hours, events, etc.) may be higher than the percentages in figure 6. for example, the brooklyn public library app (seen in fig. 9) has many features that are not shown in the sample on the google play store. clicking on library info provides not only library locations and hours but also contact information and links to social media. this information is only displayed to affiliated library patrons who have downloaded and logged in to the app.
figure 5 values: recommendations 52%, recent arrivals 56%, get a library card/ecard 95%, virtual programs 98%, library database search 99%, social media 100%, locations and hours 100%, library contact information 100%, library catalog search 100%, ability to log in to account 100%, ability to reserve/renew materials 100%.

figure 6. percentage of libraries with a dedicated mobile app that offered services (social media 17%, contact us/ask a question 17%, recent arrivals/recommendations 42%, events 45%, locations/hours 63%, book reservations 99%, account login 99%, catalog search 99%).

apps developed and delivered for dedicated library services

while it may appear that apps are the most popular method of connecting with patrons, maintaining an app and ensuring that it works on all mobile platforms and devices can be challenging. while creating and making an app available through the google play or apple app stores may be an easy procedure for certain libraries, others are impeded by a lack of technical expertise. survey respondents reported that designing and maintaining an app takes too much time and money.

an in-depth examination of the google play store reveals that variables influencing the makers of each library’s app include familiarity with the community served, the potential for options, staff training, and phone access (see fig. 7). however, the majority of apps (67%) were commercially made, and only 33 percent were self-developed. self-development results in apps that are unique to the library and its users’ needs but that also require dedicated it staff for maintenance.

figure 7. app developers, according to the google play store.

one of the most popular software developers that provide libraries with scalable, effective mobile applications is solus uk ltd.
out of the 78 apps analyzed, 24 percent used this developer. feedback from the study survey indicated that this app has strong capacity to expand or change according to user requirements, and its interface is user friendly and simple (see fig. 8). the app was evidently updated during the past year to allow for contactless holds pickup, a service that many libraries are offering so patrons do not have to come into the library to pick up their books or other materials.

however, that interface is markedly different from others. both the chicago public library and the brooklyn public library have self-developed apps (see fig. 9). the functionality of these apps is like that of the st. paul public library’s app, but the look is very different. other apps, developed by other companies, also differ in presentation and behavior, such as response time and user interface. the main purpose is the same: allow the user to view the catalog and check their holds and current materials.

figure 7 values: boopsie 5%, other 6%, bibliocommons 6%, capira 8%, communico 17%, solus uk ltd 24%, self-developed 33%.

figure 8. st. paul public library’s app developed by solus uk ltd.

figure 9. self-developed apps: chicago public library app (left) and brooklyn public library app (right).

major forms of mobile reference services

one of the most important ways for a library to connect with patrons is through mobile reference services. even when the library is not open, many people seek help from the library reference desk. while calling the library is always an option, it is often not the most convenient one. while not all mobile reference services will work in this instance, some certainly will. an increasingly common example is the use of chatbots to offer such services.
out of the 151 libraries surveyed, 134 libraries (89%) offered mobile reference services in some format via both mobile websites and dedicated apps (see fig. 10).

figure 10. percentage of libraries offering mobile reference services (no 11%, yes 89%).

mobile reference services are described in this study as a direct way to contact the library via its mobile site. this can be done via chat, which functions similarly to instant messaging, text messaging (a patron can text a reference request to a specified text message number), or a web form (mobile friendly and reachable from remote devices) (see fig. 11).

the web form, which was found on the websites of 127 of the 134 libraries (95%), was the most often utilized channel for mobile reference transactions compared to other services. the user’s name, email address (and sometimes their library card or branch location), as well as their inquiry, were required. providing a phone number was optional. the librarian would then answer through email (or phone, if provided and applicable). the web form is available seven days a week, 24 hours a day. the user can submit at any time and receive a response when the librarian is available, which is convenient for both parties.

figure 11. major forms of mobile reference service.

chat or instant messaging is the second most common type of mobile reference service and was used by 74 (55%) of the 134 libraries that offered mobile reference services. chat rooms were set up in several libraries during specific hours of the day. for example, from 11 a.m. to 1 p.m., the boston public library (massachusetts) hosts a chat session. outside of those hours, this feature is not available. the limitation of the chat option for some libraries is that it works on a computer but not on a mobile device.
also, the chat feature was unavailable outside of libraries’ standard operating hours.

the sms text option was chosen by 36 (27%) of the libraries. the fact that many questions require a lengthy response or back-and-forth conversation can sometimes make this more challenging for librarians. if a patron has a short inquiry, the text option is convenient; nevertheless, this is most often used when the library is open. in addition to the text function, all 36 libraries offered another form of mobile reference service. thirty-five libraries offered the web form in addition to the text alternative.

while most libraries provide only one form of mobile reference service, a few provide three or more such options. out of 134 libraries that provided mobile reference services, 56 (42%) provided only one service and 49 (37%) provided two. only 30 libraries (22%) provided all three services (chat, sms, and web form) (see fig. 12). the provo city library (utah) combines all three services in one chat box in the example shown in figure 13. this allows a user to ask a question and then continue the conversation using the method of their choice. at present, public libraries more often use libraryh3lp for these tools than mosio, libanswers, or similar products.

figure 11 values: text/sms 27%, chat/im 55%, web form 95%.

figure 12. comparison of the types of mobile reference services offered by libraries.

figure 13. “ask a librarian” combines options of mobile services.

mobile reservation services are widely available

reserving a computer, museum pass, study room, meeting room, and show space are among the mobile reservation services that were examined. at least one of these was available at 106 libraries (70%) (see fig. 14). this finding indicated that mobile reservation services became widely used as a result of the covid-19 pandemic in the surveyed libraries.
figure 14. percentage of libraries offering at least one mobile reservation service (yes 70%, no 30%).

many libraries remained closed throughout the survey study period, making it difficult for patrons to book conference rooms, study rooms, or exhibit space. some libraries that were open had meeting rooms and study rooms that were not available for use due to social distancing or other municipal standards. however, the option to reserve these rooms was still available on the website at the time of the analysis. some libraries supplied meeting space information but required patrons to make a reservation by contacting them. some libraries, such as the salt lake city public library in utah, closed their physical meeting rooms to the public but offered rental of virtual meeting rooms (see fig. 15). others offer not just rooms, but outdoor and large spaces. the indianapolis public library has an auditorium, an atrium, and a garden that are available to rent for a meeting or even a wedding (the online form can be filled out requesting the date and the space for all types of events).

figure 15. the salt lake city public library’s online announcement.

while many events can take place at a library, the spaces or rooms that could be reserved, such as wedding and exhibit spaces or rooms, are not commonly offered (see fig. 16). rather, the rooms at most libraries are used for either meetings or study sessions. the most common mobile reservation is for a meeting room, as 79 libraries (75%) out of the 106 libraries offering mobile reservation services allow patrons to reserve this space online (although, again, this service may not currently be available because of covid-19).
study rooms are less commonly offered, with only 24 libraries (23%) offering mobile reservations for those online. however, some study rooms may be included with meeting rooms at some libraries.

because libraries are closed or have covid-19 restrictions, book-a-librarian and reserve computer services are also paused. some libraries have pivoted to booking a librarian for a virtual meeting (34, or 32%), but often, that is just done through reference services. many libraries are limiting computer use and, as such, booking one can only be done in person to limit the number of people on the computers at one time. only 10 (9%) of the libraries offering mobile reservations provide a service where a patron can reserve a computer.

libraries are typically a great resource for free or discounted museum passes, but with some museums closed or having limited hours, some museum passes may not be available either. only 36 (34%) of the 106 libraries offering mobile reservation services allow patrons to book a museum pass online.

figure 16. percentage of libraries offering reservation of various kinds of services via a mobile interface.

mobile printing services become an emerging phenomenon

the ability to print from a laptop computer or a mobile device is a newer service that many libraries are starting to offer to their patrons. with libraries being closed because of covid-19, the mobile printing service has become even more important. patrons can send a printing job to the library and then pick it up curbside. this service is another way libraries are adapting to the world during a pandemic. most libraries (114, or 76%) do offer mobile printing (see fig. 17). the majority of these libraries have their users download a specific app that allows them to connect to the library’s printers remotely with a variety of instructions.
some libraries offer the ability to print wirelessly, where a user can connect to the library’s wi-fi (even in the parking lot if the library is closed) and send a document to the library that way. in this analysis, wireless printing and mobile printing are included together, as it is difficult to differentiate them because of the pandemic.

as part of a remote service, some libraries are also starting to offer 3-d printing services, allowing patrons to submit 3-d print jobs from their mobile devices. this typically is also done through a specific app and sometimes comes with a fee allowing users to 3-d print and then pick the product up curbside. some libraries include this service as part of their makerspace and make it available for free with a library card. overall, 17 libraries (11%) offer 3-d printing. as more locations adopt 3-d printing technology, libraries can wisely offer such 3-d services to the general public.

figure 16 values: computer 9%, study room 23%, librarian 32%, museum pass 34%, meeting room 75%.

figure 17. percentage of libraries that offer mobile (or wireless) printing (yes 75%, no 25%).

table 1. databases & applications accessible via a mobile website

app/database         percent of libraries    app/database            percent of libraries
                     providing                                       providing
abcmouse             18%                     learning express        65%
ancestry             78%                     lynda.com (linkedin)    57%
bookflix             27%                     mango languages         57%
brainfuse            32%                     masterfile              40%
britannica           39%                     morningstar             57%
chilton              35%                     national geographic     15%
consumer reports     49%                     new york times          58%
driving-tests.org    29%                     novelist                81%
ebscohost            51%                     overdrive/libby         62%
eric                 31%                     proquest                17%
explora              46%                     rbdigital/zinio         27%
flipster             39%                     reference usa           57%
freegal              33%                     rosetta stone           15%
gale                 72%                     tumble book             47%
heritage quest       55%                     tutor.com               20%
hoopla               48%                     valuline                56%
kanopy               41%                     world book              33%
khan                 10%                     worldcat                44%

subscribed databases are available via mobile devices

patrons should be able to access databases via their mobile sites. when consumers think of public library services, databases are not always the first thing that comes to mind. however, when users were at home and libraries were closed, it was critical for patrons to have access to these databases. a library card was necessary to access the majority, if not all, of the databases. some libraries locked their databases behind a login screen so users can’t access them without a valid library card.

a wide range of databases were available, and some libraries seemed to cater to their target audiences, with some offering more databases geared at children or teens (abc mouse, tumblebooks) and others focusing on their adult patrons (consumer reports, valuline). ancestry, gale (which includes academic onefile), and learning express (also known as prepstep) were the most widely used databases and were mentioned by 108 libraries (72%) of the 150 libraries that have databases available through their library mobile websites. table 1 lists the most commonly offered databases.
many others were available at various libraries across the country, but none were offered by more than 10 percent of the libraries examined.

social media bridges mobile devices from libraries to patrons

all the libraries surveyed have social media links on their mobile websites. some libraries include apps that connect to social networking. every library had a facebook page, and almost all had a youtube channel (96%) or an instagram account (95%). the amount of access to some of these pages varies. users might expect all libraries to promote their social media on their main (or contact us) page, but some did not do so. figures were double-checked by browsing youtube, instagram, and twitter for each library. several libraries had a youtube channel with fewer than 20 subscribers and postings. ninety-three percent of libraries use their twitter accounts to promote their programming and other activities.

beyond the primary social media platforms, there is a “drop-off” in how libraries interact with their users (see fig. 18). only 48 libraries (32%) used pinterest, and fewer used flickr (27, or 18%) or linkedin (25, or 17%). a very small percentage used tumblr (10, or 7%) or goodreads (6, or 4%). it is possible that more of these libraries used these social media channels but did not include the logo on their web page with their other social media. an extensive check was done on youtube, twitter, and instagram to verify that the libraries had a social media presence on those channels. these are the primary ways libraries connect users to their services, as they are the more popular social media platforms. other social media used include blogs from each library, e-newsletters, podcasts, rss feeds, vimeo, and tiktok.

actions and barriers to advance mobile services

although facilities are open and more individuals are visiting libraries again, many users might still be hesitant to do so.
fifty-six percent of the 54 libraries responding to the survey questionnaire said they would like to provide more mobile services in the future (see fig. 19).

figure 18. percentage of public libraries using various social media services (goodreads 4%, tumblr 7%, linkedin 17%, flickr 18%, pinterest 32%, twitter 93%, instagram 95%, youtube 96%, facebook 100%).

figure 19. percentage of libraries wishing to add mobile services (yes 56%, no 44%).

when libraries were asked what services they were looking to add, the two most common responses were an app and text services. both responses were given by seven (21%) of the 33 libraries that responded. the sms texting service covered not just mobile reference but also text notifications for program updates, reminders, or account notifications. the rest of the responses were spread out: three libraries (9%) said that they were looking to add chat features, and two libraries (6%) were looking to add mobile printing. other answers, each given by a single library, included bookmobile, mobile checkout, and virtual story time. many libraries aspire to expand their offering of mobile reference services. the number of libraries that employ chat and sms text messaging appears likely to be substantial in the coming year.

libraries also offered considerations for expanding mobile services (see fig. 20). a total of 34 libraries responded, with the most common considerations being getting to know the community and promoting/marketing the services (18% of the libraries responded for each). according to the surveyed libraries, the popularity of a mobile service does not imply that it will work for a given library’s patrons. the advice from survey respondents was to really attempt to figure out what people want and then give it to them.

figure 20. percentage of libraries giving consideration to expanding mobile services to various areas (mobile phone access 9%, train your staff 12%, provide options/try as many as possible 15%, know your community 18%, promotion/marketing 18%).

the research showed that libraries frequently test services, both to see what works and to make sure they are operating properly. one library urged other libraries to do the best they could with what they had. they recognized that their customers were frustrated by their inability to provide mobile services and urged that they push out whatever mobile offerings they could. they agreed that it would not be flawless, but users would prefer being able to communicate with library staff over waiting for a bug-free app/website/service.

staffing, technology, and money were by far the three most significant difficulties that libraries faced, based on the responses from 36 libraries (see fig. 21). while expanding mobile services is something that these libraries want to accomplish, finding employees to manage them and understand how to utilize them is difficult. furthermore, many upgrades necessitate funding, whether for staff training or the development of an app.

figure 21. percentage of libraries facing resource challenges in offering mobile services (money 17%, technology 17%, staff 23%).

one of the top technical issues is ensuring that mobile services are interoperable across several platforms. six libraries (17%) indicated that anything they add must operate on both android and apple platforms on mobile devices. another technological challenge is that any service the library introduces must be compatible with what it already has. when new apps or platforms are required, integrating them with existing technology can be tricky.

conclusion and suggestions for further study

after examining 151 public libraries across the united states, it is clear that libraries are attempting to engage with their consumers via mobile services. all 151 libraries have either a mobile website or one that is mobile friendly. in all cases, the website contained contact information, library locations and hours, access to the library catalog, and the ability to log in and renew or reserve a book. during a period when many libraries were closed, these primary services were critical in connecting with the bulk of users. libraries adapted to the times by introducing curbside service, which allowed users to place holds on books or materials remotely using their mobile devices and then return to the library to pick up their holds without having to go inside.

a bit more than half of the libraries polled had a specialized mobile app that performs some of the same functions as the mobile website. some libraries are considering developing an app to help their patrons in the future. many libraries have incorporated remote reference services, such as booking a librarian with an online reservation, sms notifications, and chat, in response to the pandemic. these services enable the library to communicate with its patrons when face-to-face interaction is not possible.

libraries are continually assessing their mobile services to determine what will work best for their users. mobile printing, chat, and an app are among the new features they’re introducing. according to the survey, 56 percent of responding libraries would like to provide more mobile services in the future. many businesses, including libraries, have had to reconsider and adapt their business models.
libraries are relying on mobile services to maintain their relationships with their clients as the world continues to change because of covid-19 and the rise of mobile devices.
journal of library automation vol. 14/3 september 1981

today's large academic libraries struggle, there is, nonetheless, room for criticism of library priorities. this study must be viewed as only a first step (largely tentative and exploratory) in relating automation with service attitudes. it suggests that online systems may be associated with managers more positive in their view of the management role and more positive in their attitudes toward users than batch- and manual-system managers. further research would be useful at this point to compare levels of automation (manual, batch, and online) with circulation-staff service attitudes or those of patrons using the systems.

statistics on headings in the marc file

sally h. mccallum and james l. godwin: network development office, library of congress, washington, d.c.

in designing an automated system, it is important to understand the characteristics of the data that will reside in the system. work is under way in the network development office of the library of congress (lc) that focuses on the design requirements of a nationwide authority file. in support of this work, statistics relating to headings that appear on the bibliographic records in the lc marc ii files were gathered. these statistics provide information on characteristics of headings and on the expected sizes and growth rates of various subsets of authority files. this information will assist in making decisions concerning the contents of authority files for different types of headings and the frequency of update required for the various file subsets. the national commission on libraries and information science supported this work. use of these statistics to assist in system design is largely system-dependent; however, some general implications are given in the last section of this paper.

in general, counts were made of the number of bibliographic records, headings that appear in those records, and distinct headings that appear on the records. the statistics were broken down by year, by type of heading, and by file.
in this paper, distinct headings are those left in a file after removal of duplicates. distinctness will not be used to imply that a heading appears only once in a source bibliographic file, although distinct headings may in fact have only a single occurrence. thus, a file of records containing the distinct headings from a set of bibliographic records is equivalent in size to a marc authority file of the headings in those bibliographic records.

methodology

these statistics were derived from four marc ii bibliographic record files maintained internally at lc: books, serials, maps, and films. the files contain updated versions of all marc records that have been distributed by lc on the books, serials, maps, and films tapes from 1969 through october 1979, and a few records that were then in the process of distribution. the files do not contain cip records. a total of 1,336,182 bibliographic records were processed, including 1,134,069 from the books file, 90,174 from the serials file, 60,758 from the maps file, and 51,176 from the films file.

a file of special records, called access point (ap) records, was created that contains one record for the contents of each occurrence of the following fields in the bibliographic records:

type of heading          fields
personal name            100, 700, 400, 800, 600
corporate name           110, 710, 410, 810, 610
conference name          111, 711, 411, 811, 611
topical subject          650
geographic subject       651
uniform title heading    130, 730, 830, 630

only the 6xx subject fields that contained lc subject headings (i.e., second indicator = 0) were selected as ap records. the main entry data string was substituted for the pronoun in the series (4xx) fields that contained pronouns. the ap records also contained information from the bibliographic records that assisted in making the counts, such as the date of entry of the record on the file, the identity of the type of bibliographic file, and the language of the bibliographic record.
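the construction of ap records described above can be sketched in python. this is a minimal illustration, not the program used at lc: the dictionary layout of a bibliographic record, the key names (`fields`, `tag`, `ind2`, `heading`, `year_entered`, `file`, `language`), and the `AccessPoint` class are all assumptions made for the example. the tag lists follow the table of heading fields, and the substitution of the main entry for pronouns in series fields is left out.

```python
from dataclasses import dataclass

# Tag lists taken from the table of heading types and fields.
HEADING_FIELDS = {
    "personal name": ("100", "700", "400", "800", "600"),
    "corporate name": ("110", "710", "410", "810", "610"),
    "conference name": ("111", "711", "411", "811", "611"),
    "topical subject": ("650",),
    "geographic subject": ("651",),
    "uniform title heading": ("130", "730", "830", "630"),
}
TAG_TO_TYPE = {tag: kind for kind, tags in HEADING_FIELDS.items() for tag in tags}


@dataclass
class AccessPoint:
    """One AP record: a heading occurrence plus the data needed for counting."""
    heading_type: str
    tag: str
    heading: str
    year_entered: int
    source_file: str
    language: str


def access_points(record):
    """Yield one AccessPoint per qualifying heading field in a record.

    `record` is a hypothetical dict representation of a bibliographic record.
    """
    for field in record["fields"]:
        tag = field["tag"]
        kind = TAG_TO_TYPE.get(tag)
        if kind is None:
            continue  # not a heading field of interest
        # 6xx fields count only when the second indicator marks an LC subject heading.
        if tag.startswith("6") and field.get("ind2") != "0":
            continue
        yield AccessPoint(kind, tag, field["heading"],
                          record["year_entered"], record["file"], record["language"])
```

carrying the year, file, and language on each ap record is what later allows the counts to be broken down by year, by type of heading, and by file without returning to the bibliographic records.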
a third file was derived from the ap file that contained a normalized character string for each ap record heading. these normalized ap records were used to produce the counts of distinct headings by clustering like data strings. normalization included conversion of all characters to uppercase, and masking of diacritics, marks of punctuation, and other characters that do not determine the distinctness of a heading, but would interfere with machine determination of uniqueness. the subfields included in the normalized string, hence used for all heading comparisons, are given below. only use-dependent subfields, such as the relator subfield, and those that belonged to title clusters in author/title headings were excluded. examples of the ap file field contents and the normalized forms are:

ap field contents                                       normalized form
chuang-tzu                                              chuang tzu
chuang-tzu
[blaeu,joan] 1596-1673                                  blaeu joan 1596 1673
blaeu, joan. 1596-1673
blaeu,joan, 1596-1673
byron, george gordon noel byron, baron, 1788-1824       byron george gordon noel byron baron 1788 1824
byron, george gordon noel byron, baron, 1788-1824
byron, george gordon noel byron, baron, 1788-1824
byron, george gordon noel byron, baron, 1788.1824

distinct headings for this study were determined by comparing on the following subfields:

type of heading       subfields
personal name         a, b, c, d
corporate name        a, b, k, f, p, s, g
conference name       a, q, e
topical subject       a, b, x, y, z
geographic subject    a, b, x, y, z

all occurrences of repeating subfields were included. the relator subfields were dropped from personal and corporate name headings, as were the title subfields in author/title headings. a separate study will examine the occurrence of author/title headings. approximately 8 percent of the name headings in the files carry title subfields: 6 percent are series and 2 percent are author/title subjects or added entries.
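the normalization and clustering steps described in this section can be sketched as follows. this is an illustration under stated assumptions rather than the routine actually used: it treats every character other than an unaccented roman letter or a digit as maskable, and the function names `normalize_heading` and `cluster_headings` are hypothetical.

```python
import re
import unicodedata
from collections import defaultdict


def normalize_heading(heading: str) -> str:
    """Return an uppercase comparison string with diacritics and punctuation masked."""
    # Decompose accented letters so each diacritic becomes a separate combining mark.
    decomposed = unicodedata.normalize("NFKD", heading)
    # Mask diacritics by dropping the combining marks.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Uppercase, then replace every run of other characters with a single space.
    masked = re.sub(r"[^0-9A-Z]+", " ", stripped.upper())
    return masked.strip()


def cluster_headings(headings):
    """Group heading strings by normalized form; each key is one distinct heading."""
    clusters = defaultdict(list)
    for heading in headings:
        clusters[normalize_heading(heading)].append(heading)
    return dict(clusters)
```

with this sketch the three blaeu variants in the examples fall into a single cluster, so the number of keys returned by `cluster_headings` plays the role of the distinct-heading count, and the length of each list gives that heading's number of occurrences.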
two types of distinct heading counts were generated for topical and geographic subject headings. one takes account only of main terms, the a and b subfields, excluding all subject subdivisions. the other compared the complete heading strings, including subject subdivisions.

characteristics of the files

the four bibliographic files from which the statistics were derived were begun in different years and are of unequal size. table 1 presents the number of bibliographic records added to each of the marc files by the year that the record was first entered into the file. the records added in the first months of 1979 have been eliminated from tables 1-3, thus the total number of records under consideration is 1,210,809. in the combined file, the records for books dominate the contributions from other forms of materials, representing 85 percent of the combined file records. after the addition of the films and serials records in 1972 and 1973 the total number of records added each year leveled off to around 115,000 but jumped to an average of slightly more than 150,000 records per year following the addition of major non-english roman alphabet language records in 1976. the increase is noticeable primarily in the books and serials files since the maps file had been adding those languages since 1969 and only a limited number of non-english-language audiovisual materials are cataloged. the unusually large number of records added to the books file in 1971 resulted from a special project to add retrospective titles to the file.

table 1. number of records added to each file by year

year entered    book         serial    map       film      total
1968            11,812       0         0         0         11,812
1969            43,874       0         1,104     0         44,978
1970            86,004       0         3,467     0         89,978
1971            105,390      0         8,857     6,280     114,247
1972            73,437       0         4,665     6,280     84,382
1973            92,512       3,720     5,566     8,929     110,727
1974            99,004       10,682    6,246     8,457     124,389
1975            86,527       15,866    6,721     8,604     117,718
1976            120,106      19,098    6,876     5,432     151,512
1977            140,011      17,999    7,011     4,797     169,818
1978            169,044      12,643    5,584     4,464     191,735
total           1,027,721    80,008    56,117    46,963    1,210,809

table 2. numbers of headings and distinct name headings added to all files by year

                number of headings                     number of distinct headings
year entered    personal     corporate    conference   personal    corporate    conference
                names        names        names        names       names        names
1968            14,526       3,138        155          12,620      2,139        143
1969            53,134       21,206       1,027        39,184      9,364        909
1970            104,365      42,798       2,175        63,037      14,286       1,769
1971            129,617      57,496       2,742        64,029      15,216       2,158
1972            91,040       45,768       1,942        41,246      9,891        1,402
1973            118,188      57,847       2,625        48,703      12,653       1,862
1974            127,588      73,303       2,972        51,623      17,129       1,983
1975            113,622      76,417       2,519        50,291      18,135       1,742
1976            154,718      88,207       3,454        73,182      23,120       2,306
1977            182,860      87,985       3,487        89,353      23,906       2,333
1978            218,535      97,042       4,192        99,780      24,280       2,831
total           1,308,193    651,207      27,290       633,048     170,119      19,438

table 3. numbers of subject headings and distinct subject headings added to all files by year

                number of headings         number of distinct headings
                                           first terms only          full headings
year entered    topical      geographic    topical     geographic    topical     geographic
                subjects     subjects      subjects    subjects      subjects    subjects
1968            10,615       1,857         4,390       489           7,775       1,512
1969            45,161       9,047         8,104       1,980         23,617      5,426
1970            89,304       21,054        8,170       4,263         34,526      10,179
1971            115,220      31,278        6,853       5,417         36,689      12,862
1972            92,247       20,760        4,236       2,597         26,201      7,074
1973            121,161      27,890        4,460       3,105         33,061      9,819
1974            137,843      31,814        4,524       3,553         39,262      11,413
1975            130,980      30,650        4,203       3,417         40,129      11,818
1976            168,840      39,886        5,125       4,142         55,468      15,472
1977            185,331      44,973        5,718       4,194         59,529      16,676
1978            222,565      49,923        7,151       4,034         69,856      17,855
total           1,319,267    309,132       62,934      37,191        426,113     120,106
the large increase in books records in 1978 was due to the comarc project in which retrospective lc records that had been converted to machine-readable form by other libraries were contributed to the lc marc file. approximately 12,000 comarc records were added in 1977 and 28,000 in 1978. the fall in numbers of film records produced in 1976-1978 reflects a general fall in production of instructional films in the united states.

counts of items cataloged that are compiled by lc processing services from catalogers' statistics sheets show that lc cataloged approximately 225,000 titles in 1978; thus, approximately 73 percent of lc cataloging is currently going into machine-readable form. the principal exclusions are records for most nonroman material (only nonroman records for maps have been transliterated and added since 1969) and a few records for music, sound recordings, incunabula, and microforms. the portion being put into machine-readable form should rise significantly as the romanized records for items in several nonroman alphabets are added in the next year.

name headings

table 2 presents the number of occurrences of name headings in the marc bibliographic files and the number of distinct name headings, both by type of heading and by year. the number of distinct headings that were new to the file in a year was determined by comparing the headings added in a given year against those added in all previous years.

it is not surprising to find that 66 percent of name-heading occurrences are personal names, 33 percent are corporate, and only 1.4 percent are conference. the figures shift when considering the distinct names, where 77 percent are personal and only 21 percent are corporate. looking at the total figures in table 2, while 1,308,193 of the headings that appeared on the records were personal names, only 633,048 or 48 percent of these were distinct. of the rest, 52 percent were duplicates of the distinct headings.
similarly, 26 percent of corporate names were distinct, with 74 percent being duplicates; and 71 percent of conference names were distinct, with only 29 percent being duplicates. in 1968, 87 percent and 68 percent of personal and corporate names, respectively, were distinct, i.e., 13 percent and 32 percent "had been used previously" when they appeared on a bibliographic record during the year. as the base file of names grows, the percentage of names appearing on new records but which "had been used previously" rises, to 60 percent and 77 percent in 1974. while the figures reported in table 2 indicate that the percentage of headings used that were repeats fell slightly again in 1977 (51 percent and 73 percent), this is probably due to the influx of new names with the addition of new languages in 1976-77. additional statistics gathered on english-language items show the percentage of repeating headings becoming steady after 1974.

subject headings

statistics concerning distinct topical and geographical subject headings were collected for main terms, excluding subdivisions, and for full subject heading strings. table 3 gives the numbers of headings and the numbers of distinct headings of each type found in the marc file. looking at the total figures, only 4.8 percent of topical first terms are distinct; the rest are duplicates. this indicates an average occurrence of 20.8 times for each first term. slightly more, 12 percent, of the geographic first terms are distinct. when the full headings with topical, period, form, and geographic subdivisions are considered, the percentage of headings that are distinct rises to 32.3 percent for topical subjects and 38.8 percent for geographic subjects. thus, 67.7 percent of topical and 61.2 percent of geographic are duplicates of existing headings. in the yearly figures, subject headings show the same tendency as name headings in that the percentages of headings that appear on new records but which "had been previously used" rise as the stock of headings increases and then level off. subjects were also affected by the addition of other roman alphabet languages in 1976-77 but not to a very large degree.

for all access points, name headings and full string subject headings, name headings account for 55 percent of the headings that occur in the bibliographic records, with only 45 percent attributable to topical and geographical headings. it should be noted that 12 percent of the name headings that appear on the bibliographic records are names used as subjects.

frequencies of occurrence

counts were also made of the frequency with which name headings occurred in the bibliographic files. table 4 summarizes the frequency data: 66 percent of distinct personal names, 62 percent of distinct corporate names, and 84 percent of distinct conference names occur only once in the files. the percent of corporate names with single occurrences is surprisingly close to that for personal; however, the percent of names having multiple occurrences falls more slowly for corporate than for personal names. while 5.47 percent of corporate names occur ten or more times, only 1.92 percent of personal names occur ten or more times. the figures for personal names roughly correspond to those obtained by william potter from a sample taken from the main catalog at the university of illinois at urbana-champaign. that study showed 63.5 percent of personal names occurred only once (reference 1).

the number of occurrences of different types of headings are compared in figure 1. the bars show the numbers of personal, corporate, conference, topical, and geographic headings that appear in the bibliographic files.
the shaded areas represent the number of headings that are distinct, thus the upper part of each bar represents additional occurrences of the headings from the shaded area. for personal, corporate, and conference headings a further distinction is made between distinct headings that occur only once, the crosshatched area, and those that have multiple occurrences. thus the multiple occurrences of corporate names may be seen to come from a small number of distinct corporate headings, as was indicated by the slow decrease of the multiple-heading occurrence rate (i.e., a small group of corporate names have a very large number of occurrences).

table 4. frequency of occurrence of name headings in all files

number of      distinct personal names    distinct corporate names    distinct conference names
occurrences    number      percent        number      percent         number      percent
1              456,328     65.65          116,250     62.02           18,021      83.90
2              119,681     17.22          30,185      16.10           2,049       9.54
3              46,247      6.65           11,563      6.17            587         2.73
4              23,951      3.45           6,814       3.64            289         1.35
5              13,820      1.99           4,109       2.19            163         .76
6              8,790       1.26           2,958       1.58            98          .46
7              5,827       .84            2,175       1.16            56          .26
8              4,056       .58            1,673       .89             48          .22
9              2,998       .43            1,395       .74             36          .17
10             2,153       .31            1,037       .55             18          .08
11-13          4,116       .59            2,180       1.16            44          .20
14-20          3,748       .54            2,632       1.40            41          .19
21-50          2,678       .39            2,901       1.55            23          .11
51-100         448         .06            936         .50             4           .02
101-200        149         .02            374         .20             2           .01
201-300        47          .01            109         .06             1           .00
301-400        19          .00            46          .02             0           .00
401-500        11          .00            21          .01             0           .00
501-1000       5           .00            53          .03             0           .00
1001+          2           .00            18          .01             0           .00
total          695,074     99.99          187,429     99.98           21,480      100.00

fig. 1. number of headings by type (bars, in thousands, for personal names, corporate names, conference names, topical subjects, and geographic subjects; shading marks distinct headings and crosshatching marks distinct headings that occur only once; bar labels legible in the source: 1,444,726 personal names, 30,417 conference names, 1,468,804 topical subjects).

file growth

as a bibliographic file grows and the stock of names and subjects that are contained in the associated authority file increases, the number of new-to-the-file headings that are required for the new bibliographic records would be expected to fall. figure 2 illustrates that tendency and shows that there is a leveling off of the number of new-to-the-file headings per new bibliographic record after the bibliographic file reaches a certain size. for example, after approximately 700,000 bibliographic records are in the file, for every additional 100 bibliographic records approximately 298 name and subject headings will be assigned, and, of these, approximately 53 will be new personal names, 14 new corporate names, 2 new conference names, 35 new topical subjects, and 10 new geographic subjects; the remaining 184 headings used will already be established in the authority file. thus after a certain bibliographic file size is reached, the growth of the authority file is approximately a linear function of the growth of the bibliographic file.

implications

the reoccurrence frequency of headings in a bibliographic file is often cited as a factor in designing bibliographic and authority-file configurations. discussion centers on the necessity of carrying authority records for headings that occur only once in a bibliographic file. with reference to the name-heading data in table 4 and figure 1, carrying authority records only for headings that occur more than once could potentially reduce the size of the authority file from that indicated by the whole shaded area (including shaded and crosshatched) to the plain shaded area, i.e., from 903,983 records to 310,123, a 66 percent decrease. controlling multiple occurrences of a heading is, however, only one role of the authority record. more important perhaps is the control of cross-references connected with the heading.
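the arithmetic behind the file-growth and authority-file figures quoted above can be checked with a short python sketch. the rates and record counts are the ones given in the text; the function names, and the use of 150,000 records as an example year (the approximate yearly intake mentioned earlier), are assumptions made for illustration, and the linear projection applies only once the file is past the size at which the rates level off.

```python
# Approximate new-to-the-file headings per 100 additional bibliographic records,
# as quoted for a file of roughly 700,000 records or more.
NEW_PER_100_RECORDS = {
    "personal names": 53,
    "corporate names": 14,
    "conference names": 2,
    "topical subjects": 35,
    "geographic subjects": 10,
}
HEADINGS_ASSIGNED_PER_100 = 298


def projected_new_headings(additional_records: int) -> dict:
    """Linear projection of new authority headings for a batch of new records."""
    return {kind: rate * additional_records // 100
            for kind, rate in NEW_PER_100_RECORDS.items()}


def percent_decrease(before: int, after: int) -> float:
    """Percentage by which a file shrinks when it goes from `before` to `after` records."""
    return 100 * (before - after) / before


new_per_100 = sum(NEW_PER_100_RECORDS.values())                 # new headings per 100 records
already_established = HEADINGS_ASSIGNED_PER_100 - new_per_100   # headings already on file

# Example: a year in which about 150,000 bibliographic records are added.
example_year = projected_new_headings(150_000)

# Name-authority file restricted to headings that occur more than once.
name_file_decrease = percent_decrease(903_983, 310_123)
```

under these rates the new headings sum to 114 per 100 records, which leaves the 184 already-established headings stated in the text, and the drop from 903,983 to 310,123 records rounds to the 66 percent quoted.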
preliminary work with a random sample of personal names in the lc file indicates that less than 17 percent of personal names require cross-references. thus the personal name headings that occur only once but would require authority records because of cross-references could be less than 17 percent. the frequency data combined with reference structure data could have a significant impact on design. out of a total of 695,074 personal names in the authority files associated with the marc bibliographic files examined here, 456,328, or 66 percent, occur only once. of these, fewer than 77,575 would be expected to have cross-references, thus the name-authority file for personal names could be reduced in size from 695,074 records to 316,321, a 55 percent decrease. if separate authority records are a system requirement, the occurrence figures might then be useful for defining configurations that employ machine-generated provisional records for single-occurrence headings that do not have reference structures or that simplify in other ways the treatment of these headings. these figures may also be useful in making decisions on the addition of retrospective authority records to the automated files.

[fig. 2. number of new headings per record for all files. horizontal axis: number of bibliographic records (thousands), 100 to 1300. series: personal names, topical subjects, corporate names, geographic subjects, conference names.]

reference
1. william gray potter, "when names collide: conflict in the catalog and aacr2," library resources & technical services 24:7 (winter 1980).

rlin and oclc as reference tools
douglas jones: university of arizona, tucson.
the central reference department (social science, humanities, and fine arts) and the science-engineering reference department at the university of arizona library are currently evaluating the oclc and rlin systems as reference tools, to see if their use can significantly improve the effectiveness and efficiency of providing reference service. a significant number of the questions received by our librarians, and presumably by librarians elsewhere, involve incomplete or inaccurately cited references to monographs, conference proceedings, government documents, technical reports, and monographic serials. if by using a bibliographic utility a librarian can identify or verify an item not found in printed sources, then effectiveness has been improved. once a complete and accurate description of the item is found, it is a relatively simple task to determine whether or not the library has the item, and if not, to request it through interlibrary loan. additionally, if the efficiency of the librarian can be improved by reducing the amount of time required to verify or identify a requested item, then the patron, the library, and, in our case, the taxpayer, have been better served. the promise of near-immediate response from a computer via an online interactive terminal system is clearly beguiling when compared to the relatively time-consuming searching required with printed sources, which frequently provide only a limited number of access points and often become available weeks, months, or even years after the items they list. we realize, of course, that the promise of instantaneous electronic information retrieval is limited by a variety of factors, and presently we view access to rlin and oclc as potentially powerful adjuncts to, not replacements for, printed reference sources.
given that rlin and oclc have databases and software geared to known-item searches for catalog card production, our evaluation attempts to document their usefulness in reference service. a preliminary study conducted during the spring semester of 1980-81 indicated that approximately 50 percent of the questionable citations requiring further bibliographic verification could be identified on oclc or rlin. the time required was typically five minutes or less. successful verification using printed indexes to identify the same items ranged from 20 percent in the central reference department to 50 percent in science-engineering. time required per item averaged approximately fifteen minutes. based on our findings, we plan a revised and more thorough test during the fall semester of 1981-82, which will include an assessment of the enhancements to the

letter from the editor
kenneth j. varnum
information technology and libraries | march 2019 1
https://doi.org/10.6017/ital.v38i1.10992

the current (march 2019) issue of information technology and libraries sees the first of what i know will be many exciting contributions to our new “public libraries leading the way” column. this feature (announced in december 2018) shines a spotlight on technology-based innovations from the public library perspective. the first column, “the democratization of artificial intelligence: one library’s approach,” by thomas finley of the frisco (texas) public library, discusses how his library has developed a teaching and technology lending program around artificial intelligence, creating kits that community members can take home and use to explore artificial intelligence through a practical, hands-on approach. if you have a public library perspective on technology that you would like to share in a conversational, 1000-1500-word column, submit a proposal. full details and a link to the proposal submission form can be found on the lita blog. i look forward to hearing your ideas.
in addition to the premiere column in this series, the current issue includes the lita president’s column from bohyun kim to update us on the 2019 ala midwinter meeting, particularly on the status of the proposed alcts/llama/lita merger, and our regular editorial board thoughts column, contributed this quarter by kevin ford, on the importance of user stories in successful technology projects. articles in this issue cover the topics: improving sitewide navigation; improving the display of hathitrust records in primo; using linked data to create a geographic discovery system; measuring information system project success; a systematic approach towards web preservation; and determining textbook cost, formats and licensing. i hope you enjoy reading the issue, whether you explore just one article, or read it “cover to cover.” as always, if you want to share the research or practical experience you have gained as an article in ital, get in touch with me at varnum@umich.edu.

sincerely,
kenneth j. varnum, editor
varnum@umich.edu
march 2019

on the clouds: a new way of computing | han 87

shape cloud computing. for example, sun’s well-known slogan “the network is the computer” was established in the late 1980s. salesforce.com has been providing on-demand software as a service (saas) for customers since 1999. ibm and microsoft started to deliver web services in the early 2000s. microsoft’s azure service provides an operating system and a set of developer tools and services. google’s popular google docs software provides web-based word-processing, spreadsheet, and presentation applications. google app engine allows system developers to run their python/java applications on google’s infrastructure. sun provides $1 per cpu hour. amazon is well-known for providing web services such as ec2 and s3. yahoo!
announced that it would use the apache hadoop framework to allow users to work with thousands of nodes and petabytes (1 million gigabytes) of data. these examples demonstrate that cloud computing providers are offering services on every level, from hardware (e.g., amazon and sun), to operating systems (e.g., google and microsoft), to software and service (e.g., google, microsoft, and yahoo!). cloud-computing providers target a variety of end users, from software developers to the general public. for additional information regarding cloud computing models, the university of california (uc) berkeley’s report provides a good comparison of these models by amazon, microsoft, and google.4 as cloud computing providers lower prices and it advancements remove technology barriers—such as virtualization and network bandwidth—cloud computing has moved into the mainstream.5 gartner stated, “organizations are switching from factors related to cloud computing: infinite computing resources available on demand, removing the need to plan ahead; the removal of an up-front costly investment, allowing companies to start small and increase resources when needed; and a system that is pay-for-use on a short-term basis and releases customers when needed (e.g., cpu by hour, storage by day).2 national institute of standards and technology (nist) currently defines cloud computing as “a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g. network, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.”3 as there are several definitions for “utility computing” and “cloud computing,” the author does not intend to suggest a better definition, but rather to list the characteristics of cloud computing. 
the term “cloud computing” means that
■■ customers do not own network resources, such as hardware, software, systems, or services;
■■ network resources are provided through remote data centers on a subscription basis; and
■■ network resources are delivered as services over the web.
this article discusses using cloud computing on an it-infrastructure level, including building virtual server nodes and running a library’s essential computer systems in remote data centers by paying a fee instead of running them on-site. the article reviews current cloud computing services, presents the author’s experience, and discusses advantages and disadvantages of using the new approach.

all kinds of clouds
major it companies have spent billions of dollars since the 1990s to

on the clouds: a new way of computing
this article introduces cloud computing and discusses the author’s experience “on the clouds.” the author reviews cloud computing services and providers, then presents his experience of running multiple systems (e.g., integrated library systems, content management systems, and repository software). he evaluates costs, discusses advantages, and addresses some issues about cloud computing. cloud computing fundamentally changes the ways institutions and companies manage their computing needs. libraries can take advantage of cloud computing to start an it project with low cost, to manage computing resources cost-effectively, and to explore new computing possibilities.

scholarly communication and new ways of teaching provide an opportunity for academic institutions to collaborate on providing access to scholarly materials and research data. there is a growing need to handle large amounts of data using computer algorithms that presents challenges to libraries with limited experience in handling nontextual materials. because of the current economic crisis, academic institutions need to find ways to acquire and manage computing resources in a cost-effective manner.
one of the hottest topics in it is cloud computing. cloud computing is not new to many of us because we have been using some of its services, such as google docs, for years. in his latest book, the big switch: rewiring the world, from edison to google, carr argues that computing will go the way of electricity: purchase when needed, which he calls “utility computing.” his examples include amazon’s ec2 (elastic compute cloud) and s3 (simple storage service) services.1 amazon’s chief technology officer proposed the following

tutorial
yan han (hany@u.library.arizona.edu) is associate librarian, university of arizona libraries, tucson.

88 information technology and libraries | june 2010

company-owner hardware and software to per-use service-based models.”6 for example, the u.s. government website (http://www.usa.gov/) will soon begin using cloud computing.7 the new york times used amazon’s ec2 and s3 services as well as a hadoop application to provide open access to public domain articles from 1851 to 1922. the times loaded 4 tb of raw tiff images and their derivative 11 million pdfs into amazon’s s3 in twenty-four hours at very reasonable cost.8 this project is very similar to digital library projects run by academic libraries. oclc announced its movement of library management services to the web.9 it is clear that oclc is going to deliver a web-based integrated library system (ils) to provide a new way of running an ils. duraspace, a joint organization by fedora commons and dspace foundation, announced that they would be taking advantage of cloud storage and cloud computing.10

on the clouds
computing needs in academic libraries can be placed into two categories: user computing needs and library goals.

user computing needs
academic libraries usually run hundreds of pcs for students and staff to fulfill their individual needs (e.g., microsoft office, browsers, and image-, audio-, and video-processing applications).
library goals
a variety of library systems are used to achieve libraries’ goals to support research, learning, and teaching. these systems include the following:
■■ library website: the website may be built on simple html webpages or a content management system such as drupal, joomla, or any home-grown php, perl, asp, or jsp system.
■■ ils: this system provides traditional core library work such as cataloging, acquisition, reporting, accounting, and user management. typical systems include innovative interfaces, sirsidynix, voyager, and open-source software such as koha.
■■ repository system: this system provides submission and access to the institution’s digital collections and scholarship. typical systems include dspace, fedora, eprints, contentdm, and greenstone.
■■ other systems: for example, federated search systems, learning object management systems, interlibrary loan (ill) systems, and reference tracking systems.
■■ public and private storage: staff file-sharing, digitization, and backup.
due to differences in end users and functionality, most systems do not use computing resources equally. for example, the ils is input and output intensive and database query intensive, while repository systems require storage ranging from a few gigabytes to dozens of terabytes and substantial network bandwidth. cloud computing brings a fundamental shift in computing. it changes the way organizations acquire, configure, manage, and maintain computing resources to achieve their business goals. the availability of cloud computing providers allows organizations to focus on their business and leave general computing maintenance to the major it companies. in the fall of 2008, the author started to research cloud computing providers and how he could implement cloud computing for some library systems to save staff and equipment costs.
in january 2009, the author started his plan to build library systems “on the clouds.” the university of arizona libraries (ual) has been a key player in the process of rebuilding higher education in afghanistan since 2001. ual librarian atifa rawan and the author have received multiple grant contracts to build technical infrastructures for afghanistan’s academic libraries. the technical infrastructure includes the following:
■■ afghanistan ils: a bilingual ils based on the open-source system koha.11
■■ afghanistan digital libraries website (http://www.afghandigitallibraries.org/): originally built on simple html pages, later rebuilt in 2008 using the content management system joomla.
■■ a digitization management system.
the author has also developed a japanese ill system (http://gifproject.libraryfinder.org) for the north american coordinating council on japanese library resources. these systems had been running on ual’s internal technical infrastructure. these systems run in a complex computing environment, require different modules, and do not use computing resources equally. for example, the afghan ils runs on linux, apache, mysql, and perl. its opac and staff interface run on two different ports. the afghanistan digital libraries website requires linux, apache, mysql, and php. the japanese ill system was written in java and runs on tomcat. there are several reasons why the author moved these systems to the new cloud computing infrastructure:
■■ these systems need to be accessed in a system mode by people who are not ual employees.
■■ system rebooting time can be substantial in this infrastructure because of server setup and it policy.
■■ the current on-site server has
by analyzing the complex needs of different systems and considering how to use resources more effectively, the author decided to run all the systems through one cloud computing provider. by comparing the features and the costs, linode (http://www.linode.com/) was chosen because it provides full ssh and root access using virtualization, four data centers in geographically diverse areas, high availability and clustering support, and an option for month-to-month contracts. in addition, other customers have provided positive reviews. in january 2009, the author purchased one node located in fremont, california, for $19.95 per month. an implementation plan (see appendix) was drafted to complete the project in phases. the author owns a virtual server and has access to everything that a physical server provides. in addition, the provider and the user community provided timely help and technical support. the migration of systems was straightforward: a linux kernel (debian 4.0) was installed within an hour, domain registration was complete and the domains went active in twenty-four hours, the afghanistan digital libraries’ website (based on joomla) migration was complete within a week, and all supporting tools and libraries (e.g., mysql, tomcat, and java sdk) were installed and configured within a few days. a month later, the afghanistan ils (based on koha) migration was completed. the ill system was also migrated without problem. tests have been performed in all these systems to verify their usability. in summary, the migration of systems was very successful and did not encounter any barriers. it addresses the issues facing us: after the migration, ssh log-ins for users who are not university employees were set up quickly; systems maintenance is managed by the author’s team, and rebooting now only takes about one minute; and there is no need to buy a new server and put it in a temperature and security controlled environment. the hardware is maintained by the provider. 
the administrative gui for the linux nodes is shown in figure 1. since migration, no downtime because of hardware or other failures caused by the provider has been observed. after migrating all the systems successfully and running them in a reliable mode for a few months, the second phase was implemented (see appendix). another linux node (located in atlanta, georgia) was purchased for backup and monitoring (see figure 2). nagios, an open-source monitoring system, was tested and configured to identify and report problems for the above library systems. nagios provides the following functions: (1) monitoring critical computing components, such as the network, systems, services, and servers; (2) timely alerts delivered via e-mail or cell phone; and (3) report and record logs of outages, events, and alerts. a backup script is also run as a prescheduled job to back up the systems on a regular basis.

[figure 1. linux node administration web interface]
[figure 2. two linux nodes located in two remote data centers. node 1: 64.62.xxx.xxx (fremont, ca); node 2: 74.207.xxx.xxx (atlanta, ga). labels: nagios, backup, afghan digital libraries website, afghan ils, interlibrary loan system, dspace]

findings and discussions
since january 2009, all the systems have been migrated and have been running without any issues caused by the provider. the author is very satisfied with the outcomes and cost. the annual cost of running two nodes is $480 per year, compared to at least $4,000 if the hardware had been run in the library.12 from the author’s experience, cloud computing provides the following advantages over the traditional way of computing in academic institutions:
■■ cost-effectiveness: from the above example and literature review, it is obvious that using cloud computing to run applications, systems, and it infrastructure saves staff and financial resources.
uc berkeley’s report and zawodny’s blog provide a detailed analysis of costs for cpu hours and disk storage.13 ■■ flexibility: cloud computing allows organizations to start a project quickly without worrying about up-front costs. computing resources such as disk storage, cpu, and ram can be added when needed. in this case, the author started on a small scale by purchasing one node and added additional resources later. ■■ data safety: organizations are able to purchase storage in data centers located thousands of miles away, increasing data safety in case of natural disasters or other factors. this strategy is very difficult to achieve in a traditional off-site backup. ■■ high availability: cloud computing providers such as microsoft, google, and amazon have better resources to provide more up-time than almost any other organizations and companies do. ■■ the ability to handle large amounts of data: cloud computing has a pay-for-use business model that allows academic institutions to analyze terabytes of data using distributed computing over hundreds of computers for a short-time cost. on-demand data storage, high availability and data safety are critical features for academic libraries.14 however, readers should be aware of some technical and business issues: ■■ availability of a service: in several widely reported cases, amazon’s s3 and google gmail were inaccessible for a duration of several hours in 2008. the author believes that the commercial providers have better technical and financial resources to keep more up-time than most academic institutions. for those wanting no single point of failure (e.g., a provider goes out of business), the author suggests storing duplicate data with a different provider or locally. ■■ data confidentiality: most academic libraries have open-access data. this issue can be solved by encrypting data before moving to the clouds. in addition, licensing terms can be negotiated with providers regarding data safety and confidentiality. 
■■ data transfer bottlenecks: accessing the digital collections requires considerable network bandwidth, and digital collections are usually optimized for customer access. moving huge amounts of data (e.g., preservation digital images, audio, video, and data sets) to data centers can be scheduled during off hours (e.g., 1–5 a.m.), or data can be shipped on hard disks to the data centers.
■■ legal jurisdiction: legal jurisdiction creates complex issues for both providers and end users. for example, canadian privacy laws regulate data privacy in public and private sectors. in 2008, the office of the privacy commissioner of canada released a finding that “outsourcing of canada.com email services to u.s.-based firm raises questions for subscribers,” and expressed concerns about public sector privacy protection.15 this brings concerns to both providers and end users, and it was suggested that privacy issues will be very challenging.16

summary
the author introduces cloud computing services and providers and presents his experience of running multiple systems, such as an ils, content management systems, repository software, and other systems, “on the clouds” since january 2009. using cloud computing brings significant cost savings and flexibility. however, readers should be aware of technical and business issues. the author is very satisfied with his experience of moving library systems to cloud computing. his experience demonstrates a new way of managing critical computing resources in an academic library setting. the next steps include using cloud computing to meet digital collections’ storage needs. cloud computing brings fundamental changes to the way organizations manage their computing needs. as major organizations in the library field, such as oclc, have started to take advantage of cloud computing, the author believes that cloud computing will play an important role in library it.
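the cost comparison reported in the findings ($480 per year for two leased nodes versus at least $4,000 per year on-site) follows from the figures listed in the note to the appendix. the short python sketch below reproduces that arithmetic; the constant names are illustrative, and, as in the note, telecommunication, utility, space, and software developer costs are left out.

```python
# Annual cost comparison using the figures from the appendix note.
# Constant names are illustrative; excluded costs (telecommunication,
# utilities, space, developer time) follow the note's own assumptions.
NODE_MONTHLY_FEE = 20        # dollars per leased node per month, as in the note
NODES = 2
SERVER_PRICE = 5_000         # medium-priced server with backup, in dollars
SERVER_LIFETIME_YEARS = 5
ADMIN_SALARY = 60_000        # annual system administrator salary, in dollars
ADMIN_TIME_PERCENT = 5       # share of administrator time spent on the server

# Cloud option: two leased nodes for twelve months.
cloud_per_year = NODE_MONTHLY_FEE * NODES * 12

# On-site option: server cost spread over its lifetime plus staff time.
server_per_year = SERVER_PRICE // SERVER_LIFETIME_YEARS
admin_per_year = ADMIN_SALARY * ADMIN_TIME_PERCENT // 100
onsite_per_year = server_per_year + admin_per_year

savings_per_year = onsite_per_year - cloud_per_year

print(cloud_per_year, onsite_per_year, savings_per_year)
```

most of the on-site figure comes from staff time ($3,000 of the $4,000), so the result is sensitive to the assumed 5 percent share of administrator time.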
acknowledgments
the author thanks usaid and washington state university for providing financial support. the author thanks matthew cleveland for his excellent work “on the clouds.”

references
1. nicholas carr, the big switch: rewiring the world, from edison to google (london: norton, 2008).
2. werner vogels, “a head in the clouds: the power of infrastructure as a service” (paper presented at the cloud computing and its applications conference (cca ’08), chicago, oct. 22–23, 2008).
3. peter mell and tim grance, “draft nist working definition of cloud computing,” national institute of standards and technology (may 11, 2009), http://csrc.nist.gov/groups/sns/cloud-computing/index.html (accessed july 22, 2009).
4. michael armbrust et al., “above the clouds: a berkeley view of cloud computing,” technical report, university of california, berkeley, eecs department, feb. 10, 2009, http://www.eecs.berkeley.edu/pubs/techrpts/2009/eecs-2009-28.html (accessed july 1, 2009).
5. eric hand, “head in the clouds: ‘cloud computing’ is being pitched as a new nirvana for scientists drowning in data. but can it deliver?” nature 449, no. 7165 (2007): 963; geoffrey fowler and ben worthen, “the internet industry is on a cloud—whatever that may mean,” wall street journal, mar. 26, 2009, http://online.wsj.com/article/sb123802623665542725.html (accessed july 14, 2009); stephen baker, “google and the wisdom of the clouds,” business week (dec. 14, 2007), http://www.msnbc.msn.com/id/22261846/ (accessed july 8, 2009).
6. gartner, “gartner says worldwide it spending on pace to surpass $3.4 trillion in 2008,” press release, aug. 18, 2008, http://www.gartner.com/it/page.jsp?id=742913 (accessed july 7, 2009).
7. wyatt kash, “usa.gov, gobierno usa.gov move into the internet cloud,” government computer news, feb.
23, 2009, http://gcn.com/articles/2009/02/23/gsa-sites-to-move-to-the-cloud.aspx?s=gcndaily_240209 (accessed july 14, 2009).
8. derek gottfrid, “self-service, prorated super computing fun!” online posting, new york times open, nov. 1, 2007, http://open.blogs.nytimes.com/2007/11/01/self-service-prorated-super-computing-fun/?scp=1&sq=self%20service%20prorated&st=cse (accessed july 8, 2009).
9. oclc online computing library center, “oclc announces strategy to move library management services to web scale,” press release, apr. 23, 2009, http://www.oclc.org/us/en/news/releases/200927.htm (accessed july 5, 2009).
10. duraspace, “fedora commons and dspace foundation join together to create duraspace organization,” press release, may 12, 2009, http://duraspace.org/documents/pressrelease.pdf (accessed july 8, 2009).
11. yan han and atifa rawan, “afghanistan digital library initiative: revitalizing an integrated library system,” information technology & libraries 26, no. 4 (2007): 44–46.
12. fowler and worthen, “the internet industry is on a cloud.”
13. jeremy zawodny, “replacing my home backup server with amazon’s s3,” online posting, jeremy zawodny’s blog, oct. 3, 2006, http://jeremy.zawodny.com/blog/archives/007624.html (accessed june 19, 2009).
14. yan han, “an integrated high availability computing platform,” the electronic library 23, no. 6 (2005): 632–40.
15. office of the privacy commissioner of canada, “tabling of privacy commissioner of canada’s 2005–06 annual report on the privacy act: commissioner expresses concerns about public sector privacy protection,” press release, june 20, 2006, http://www.priv.gc.ca/media/nr-c/2006/nr-c_060620_e.cfm (accessed july 14, 2009); office of the privacy commissioner of canada, “findings under the personal information protection and electronic documents act (pipeda),” (sept. 19, 2008), http://www.priv.gc.ca/cf-dc/2008/394_20080807_e.cfm (accessed july 14, 2009).
16.
stephen baker, “google and the wisdom of the clouds,” business week (dec. 14, 2007), http://www.msnbc.msn.com/id/22261846/ (accessed july 8, 2009).

appendix. project plan: building ha linux platform using cloud computing
project manager:
project members:
object statement: to build a high availability (ha) linux platform to support multiple systems using cloud computing in six months.
scope: the project members should identify cloud computing providers, evaluate the costs, and build a linux platform for computer systems, including afghan ils, afghanistan digital libraries website, repository system, japanese interlibrary loan website, and digitization management system.
resources:
project deliverable: january 1, 2009 to july 1, 2009

phase i
■■ to build a stable and reliable linux platform to support multiple web applications. the platform needs to consider reliability and high availability in a cost-effective manner
■■ to install needed libraries for the environment
■■ to migrate ils (koha) to this linux platform
■■ to migrate afghan digital libraries’ website (joomla) to this platform
■■ to migrate japanese interlibrary loan website
■■ to migrate digitization management system

phase ii
■■ to research and implement a monitoring tool to monitor all web applications as well as os level tools (e.g., tomcat, mysql)
■■ to configure a cron job to run routine things (e.g., backup)
■■ to research and implement storage (tb) for digitization and access

phase iii
■■ to research and build linux clustering

steps:
1. os installation: debian 4
2. platform environment: register dns
3. install java 6, tomcat 6, mysql 5, etc.
4. install source control env git
5. install statistics analysis tool (google analytics)
6. install monitoring tool: ganglia or nagios
7. web applications
8. joomla
9. koha
10. monitoring tool
11. digitization management system
12.
repository system: dspace, fedora, etc.
13. ha tools/applications

note
calculation based on the following:
■■ leasing two nodes at $20/month: $20 x 2 nodes x 12 months = $480/year
■■ a medium-priced server with backup with a life expectancy of 5 years ($5,000): $1,000/year
■■ 5 percent of system administrator time for managing the server ($60,000 annual salary): $3,000/year
■■ ignore telecommunication cost, utility cost, and space cost.
■■ ignore software developer’s time because it is equal for both options.

editorial board thoughts
what more can we do to address broadband inequity and digital poverty?
lori ayre
information technology and libraries | september 2020
https://doi.org/10.6017/ital.v39i3.12619
lori ayre (lori.ayre@galecia.com) is principal, the galecia group, and a member of the ital editorial board. © 2020.

we are now almost seven months into our new lives with the novel coronavirus and over 190,000 americans have died of covid-19. library administrators have been struggling with their commitment to provide services to their communities while keeping their staff safe. initially, libraries relied on their online offerings, so more e-books and other online resources were acquired. staff learned that they could do quite a bit of their work from home. they could still respond to email and phone messages. they could evaluate and order new material. they could deliver online programs like summer reading and story time. they could interact with people on social media. they could put together key resources for patrons and post them on the website.1 a lot of what the library was doing while the buildings were closed was not obvious. most people associate the library with the building and since the building was closed… it seemed like nothing was happening at the library. yet, library workers were busy.
once it became possible for library staff to enter the building (per local health ordinances), the first thing that libraries started to do was accept returns. that was a little fraught considering how little we knew about the virus and how long contaminants might live on returned library material. eventually, with the long-awaited testing results from the realm project and battelle labs (https://www.webjunction.org/explore-topics/covid-19-research-project.html), people started standardizing on a three-day quarantine of returns. then more testing of stacked material was done, resulting in some people choosing to quarantine returns for four days. as of early september, we have learned that even five days isn’t enough to quarantine delivery totes and some other plastic material.

curbside pick-up was born in these early days of being allowed back in the buildings. if someone had mapped who was offering curbside pick-up, it would look like popcorn popping across the country. the number of libraries offering the service slowly increased and pretty soon nearly everyone was doing it.2 many library directors will say that curbside pick-up is here to stay. people love the convenience too much to take the service away.

rolling out curbside pick-up has had some challenges: how to safely make the handoff between library staff and library patrons; whether to accept returns; whether to charge fines; modifying circulation policies to fit the current needs; and selecting books for people that want them but who don’t have the skills needed to negotiate the library catalog’s requesting system. some libraries started putting together grab bags of materials selected by staff for specific patrons, kind of like homebound services on the fly.
curbside helped get material in circulation again. importantly, also during this period, libraries started finding creative ways to get wi-fi hotspots out into communities. they began lending them if they weren’t already. those libraries already circulating hotspots increased their inventory. they took their bookmobiles into neighborhoods and created temporary hotspot connections around town. many libraries made sure wi-fi from their building was available in their own parking lots.3 but one thing everyone has learned during this pandemic is that libraries alone cannot be the solution to the digital divide. this isn’t news to librarians who have been arguing that internet access should be as readily available as electricity and water. librarians understand that information cannot be free and accessible unless everyone has internet access and knows how to use it. public access computers, wi-fi hotspots, and media literacy are all staple services in our libraries today.4 however, these services are not enough to bridge a digital divide that only seems to be getting worse. the coronavirus that closed libraries and schools has made it painfully clear that something much bigger has to happen to address the problem. as gina millsap stated in a recent facebook post: i think it’s become obvious that the covid-19 crisis is shining a spotlight on the flaws we have in our broadband infrastructure and on our failure to make the investments that should have been made for equitable access to what should be a basic utility, like water or electricity.5 according to broadbandnow, the number of people who lack broadband internet access could be as high as 42 million.6 the fcc reports that at least “18 million people lacked access to broadband internet at the end of 2018.”7 even if all the libraries were open and circulating hundreds of wi-fi hotspots, we’d still have a very serious access problem. 
thinking differently about addressing the digital divide

in a paper published march 28, 2019, by the urban libraries council (ulc), the author suggested three specific actions that libraries can take to address race and social equity and the digital divide. they are:
1. assess and respond to the needs of your community through meaningful conversation (including considering different partners for your work)
2. optimize funding opportunities to support your efforts (e.g., e-rate), and
3. think outside the box to create effective solutions that are informed by those in need (e.g., lending wi-fi hot spots).8

while we know libraries have been heeding this advice when it comes to wi-fi hotspots, let’s look into what can be done when we consider ulc’s suggestion to consider different partners for your work.

community partners

an excellent example of what can be done with a coalition of community partners comes from detroit, where a mesh wireless network was put in place to provide permanent broadband in a low-income neighborhood.9 the project is called the detroit community technology project. with the community-based mesh network, only one internet account is needed to provide access for multiple homes. the networks also enable members to share resources (calendar, files, bulletin board), and that data lives on their network, not in the cloud.

one of the sponsors of the detroit community technology project is the allied media project (https://www.alliedmedia.org/), which also sponsors the casscowifi and the equitable internet initiative to get broadband and digital literacy training into several underserved areas.

community networks (https://muninetworks.org/), a project of the institute for local self-reliance (https://ilsr.org/), describes several innovative projects in which communities partner with electric utilities.
surry county, virginia, expects to extend broadband access to 6,700 households through a first-ever partnership between a utility (dominion energy virginia) and an electric cooperative (dominion energy). a similar project is underway with the northern neck cooperative and dominion energy.10 these initiatives are made possible by some regulatory changes made in virginia (sb 966).

according to community networks, there are 900 communities providing broadband connectivity locally (https://muninetworks.org/communitymap). but nineteen states still have barriers in place that discourage, if not outright prevent, local communities from investing in broadband.

libraries in states where community networks are a viable option should be at the table, or perhaps setting the table, for discussions about how to bring broadband to the entire community, not just into the library or dispatched one at a time via wi-fi hotspots.

this is an opportunity to convene community conversations focusing on the issue of broadband. library staff have been doing more and more of this type of outreach into the community and acting as facilitator. the ala has even produced a community conversation workbook (http://www.ala.org/tools/sites/ala.org.tools/files/content/ltc_convoguide_final_062414.pdf) to support libraries just getting started.

state partners

in california, the governor recently issued executive order n-73-20 (https://www.gov.ca.gov/wp-content/uploads/2020/08/8.14.20-eo-n-73-20.pdf) directing state agencies to pursue a goal of 100 mbps download speed and outlining actions across state agencies and departments to accelerate mapping and data collection, funding, deployment, and adoption of high-speed internet.11 this will undoubtedly create fertile ground for libraries to partner with other agencies and community organizations to advance this initiative. libraries are specifically called out to raise awareness of low-cost broadband options to their local community.

every state has some kind of broadband task force or commission or advisory council (https://www.ncsl.org/research/telecommunications-and-information-technology/state-broadband-task-forces-commissions.aspx). this is another instance where libraries should be at the table. in my state, our state librarian is on the california broadband council. but many of these commissions do not have a representative from the library world, which means they probably are not hearing from us. whether it is through your local library, your state library, or your state library association, it is important for librarians to build relationships with people on these commissions, if not get a seat on the commission themselves.

national partners

unless your community is blanketed with affordable broadband connectivity, it will be important that we continue to advocate nationally for the needs we see. in addition to helping the patron standing right in front of us checking out their hotspot, we also need to address the needs of the people who aren’t able to get to the library but are equally in need of access.

our job is to make sure that any new initiatives undertaken by a new administration provide for free and equitable access to the internet for every household. extending e-rate (the federal communication commission’s program for making internet access more affordable for schools and libraries) isn’t enough. free (or at least affordable) broadband needs to be brought to every home.
the electronic frontier foundation (eff) argues that fiber-to-the-home is the best option for consumers today because it will be easily upgradeable without touching the underlying cables and will support the next generation of applications (see https://www.eff.org/wp/case-fiber-home-today-why-fiber-superior-medium-21st-century-broadband). libraries have worked with the eff on issues related to privacy and government transparency. maybe it’s time to team up with them on broadband.

global partners

low earth orbit (leo) satellites could potentially bring broadband to everyone on earth.12 starlink (https://www.starlink.com/) is elon musk’s initiative and project kuiper (https://blog.aboutamazon.com/company-news/amazon-receives-fcc-approval-for-project-kuiper-satellite-constellation) is amazon’s jeff bezos’ project. a private beta starlink service is due (or perhaps it is already happening). if it works as musk has envisioned, it could be a game changer. or it might just make the digital divide worse if it isn’t affordable to everyone who needs it. how might we lobby musk to roll out this service in a way that is equitable and fair?

speak up, speak out, and get in the way

these are just a few avenues that we, as professionals committed to free access to information, might pursue. i worry that we have not made enough noise about the problems we see in our communities that are a result of broadband inequity and digital poverty. and although virtually every library is doing something to address the problem, our efforts are no match for the magnitude of the problem.

in a blog post on the brookings institution’s website, authors lara fishbane and adie tomer argue for a new agenda focused on comprehensive digital equity that includes (among other things) “building networks of local champions, ensuring community advocates, government officials, and private network providers share intelligence, debate priorities, and deploy new programming.”13 there are no better local champions and advocates for communities than the city or county librarians and their staffs.

let’s treat this problem with the seriousness it deserves and at a scale that will be meaningful. to quote john lewis (as so many of us have since his death on july 17, 2020), it’s time for us to “speak up, speak out, and get in the way.”14 we have to make it painfully clear to policymakers that libraries cannot bridge the digital divide with public access computers and hotspots. we need to tell our communities’ stories, convene conversations, and agitate for equitable broadband that is as readily available as water and electricity.

endnotes

1 “libraries respond: covid-19 survey,” american library association, accessed august 25, 2020, http://www.ilovelibraries.org/sites/default/files/may-2020-covid-survey-pdf-summary-of-results-web-2.pdf.

2 erica freudenberger, “reopening libraries: public libraries keep their options open,” library journal, june 25, 2020, https://www.libraryjournal.com/?detailstory=reopening-libraries-public-libraries-keep-their-options-open.

3 lauren kirchner, “millions of americans depend on libraries for internet. now they’re closed,” the markup, june 25, 2020, https://themarkup.org/coronavirus/2020/06/25/millions-of-americans-depend-on-libraries-for-internet-now-theyre-closed.

4 jim lynch, “the gates library foundation remembered: how digital inclusion came to libraries,” techsoup, accessed august 24, 2020, https://blog.techsoup.org/posts/gates-library-foundation-remembered-how-digital-inclusion-came-to-libraries.

5 gina millsap, “this was in april. q. we’re starting a new school year and what has changed? a. not much. it’s past time to get serious about universal broadband in the u.s.,” facebook, august 16, 2020, 5:37 a.m., https://www.facebook.com/gina.millsap.7/posts/10218986781485855, accessed september 14, 2020.

6 “libraries are filling the homework gap as students head back to school,” broadband usa, last modified september 4, 2018, https://broadbandusa.ntia.doc.gov/ntia-blog/libraries-are-filling-homework-gap-students-head-back-school.

7 james k. willcox, “libraries and schools are bridging the digital divide during the coronavirus pandemic,” consumer reports, last modified april 29, 2020, https://www.consumerreports.org/technology-telecommunications/libraries-and-schools-bridging-the-digital-divide-during-the-coronavirus-pandemic/.

8 sarah chase webber, “the library’s role in bridging the digital divide,” urban libraries council, last modified march 28, 2019, https://www.urbanlibraries.org/blog/the-librarys-role-in-bridging-the-digital-divide.

9 cecilia kang, “parking lots have become a digital lifeline,” the new york times, may 20, 2020, https://www.nytimes.com/2020/05/05/technology/parking-lots-wifi-coronavirus.html.

10 ry marcattilio-mccracken, “electric cooperatives partner with dominion energy to bring broadband to rural virginia,” last modified august 6, 2020, https://muninetworks.org/content/electric-cooperatives-partner-dominion-energy-bring-broadband-rural-virginia.

11 “newsom issues executive order on digital divide,” cheac (improving the health of all californians), last modified august 14, 2020, https://cheac.org/2020/08/14/newsom-issues-executive-order-on-digital-divide/.

12 tyler cooper, “bezos and musk’s satellite internet could save americans $30b a year,” podium: opinion, advice, and analysis by the tnw community, last modified august 24, 2019, https://thenextweb.com/podium/2019/08/24/bezos-and-musks-satellite-internet-could-save-americans-30b-a-year/.
13 lara fishbane and adie tomer, “neighborhood broadband data makes it clear: we need an agenda to fight digital poverty,” brookings institution, last modified february 6, 2020, https://www.brookings.edu/blog/the-avenue/2020/02/05/neighborhood-broadband-data-makes-it-clear-we-need-an-agenda-to-fight-digital-poverty/.

14 rashawn ray, “five things john lewis taught us about getting in ‘good trouble,’” brookings institution, last modified july 23, 2020, https://www.brookings.edu/blog/how-we-rise/2020/07/23/five-things-john-lewis-taught-us-about-getting-in-good-trouble/.

usability as a method for assessing discovery
tom ipri, michael yunkin, and jeanne m. brown

the university of nevada las vegas libraries engaged in three projects that helped identify areas of its website that had inhibited discovery of services and resources. these projects also helped generate staff interest in the usability working group, which led these endeavors. the first project studied student responses to the site. the second focused on a usability test with the libraries’ peer research coaches and resulted in a presentation of those findings to the libraries staff. the final project involved a specialized test, the results of which also were presented to staff. all three of these projects led to improvements to the website and will inform a larger redesign.

usability testing has been a component of the university of nevada las vegas (unlv) libraries web management since our first usability studies in 2000.1 usability studies are a widely used and relatively standard set of tools for gaining insight into web functionality. these tests can explore issues such as the effectiveness of interactive forms or the complexity of accessing full-text articles from third-party databases.
they can explore aesthetic and other emotional responses to a site. in addition, they can provide an opportunity to collect input concerning satisfaction with the layout and logic of the site. they can reveal mistakes on the site, such as coding errors, incorrect or broken links, and problematic wording. they also allow us to engage in testing issues of discovery to isolate site elements that facilitate or hamper discovery of the libraries’ resources and services. the libraries’ usability working group seized upon two library-wide opportunities to highlight findings of the past year’s studies. the first was the discovery summit, in which the staff viewed videos of staff attempting finding exercises on the homepage and discussed the finding process. the second was the discovery mini-conference, an outgrowth of a new evaluation framework and the libraries’ strategic plan. through a poster display, the working group highlighted areas dealing with discovery of library resources. the mini-conference allowed us to leverage library-wide interest in the topic of effective information-finding on the web to draw wider attention to usability’s importance in identifying the likelihood of our users discovering library resources independently. the usability working group engaged in three projects to help identify areas of the website that inhibited discovery and to generate staff interest in the process of usability. all three of these projects led to improvements to the website and will inform a larger redesign. the first project is an ongoing effort to study student responses to the site. the second was to administer a usability test with the libraries’ peer research coaches and present those findings to the libraries’ staff. the final project was requested by the dean of libraries and involved a specialized test, the results of which also were presented to staff. 
tom ipri (tom.ipri@unlv.edu) is head, media and computer services; michael yunkin (michael.yunkin@unlv.edu) is web content manager/usability specialist; and jeanne m. brown (jeanne.brown@unlv.edu) is head, architecture studies library and assessment librarian, university of nevada las vegas libraries.

information technology and libraries | december 2009

student studies

the usability working group began its ongoing evaluation of unlv libraries’ website by conducting two series of tests: one with five undergraduate students and one with five graduate students. not surprisingly, most students self-reported that the main reason they come to the libraries’ site is to find books and journal articles for assignments. the group created a set of fourteen tasks that were based on common needs for completing assignments:
1. find a journal article on the death penalty. (note: if students go somewhere other than the library, guide them back.)
2. find what floor the book the catcher in the rye is on.
3. find the most current issue of the journal popular mechanics.
4. identify a way to ask a question from home.
5. find a video on global warming.
6. you need to write a bibliography for a paper. find something on the website that would help you.
7. find out what lied library’s hours were for july 4.
8. find the libraries’ tutorial on finding books in the library.
9. the library offers workshops on how to use the library. find one you can take.
10. find a library-recommended website in business.
11. find out what books are checked out on this card.
12. find instructions for printing from your personal laptop.
13. your sociology professor, dr. lampert, has placed something on reserve for your class. please find the material.
14. your professor wants you to read the book efficiency and complexity in grammars by john a. hawkins. find a copy of the book for your assignment. (the moderator will prompt if the person stops at the catalog.)
the results of these tests revealed that the site was not as conducive to discovery as was hoped. the libraries are planning on a complete redesign of the site in the near future; however, the results of these first two series of usability tests were compelling enough to prompt an intermediary redesign to improve some of the areas that were troublesome to students.

that said, the tests also found certain parts of the old site (figure 1) to be very effective:
1. all participants used the tabbed box in the center of the page, which gives them access to the catalog, serials lists, databases, and reserves.
2. all students quickly found the “ask a librarian” link when prompted to find a way to ask a question from home.
3. most students found the libraries’ hours, partly because of the “hours” tab at the top of the page and partly because of multiple access points.
4. many participants used the “site search” tab to navigate to the search page, but few actually used it to conduct searches. they effectively used the site map information also included on the search page.

the usability tests also revealed some variables that undermined the goal of discoverability:
1. due to the various sources of library-related information (website, catalog, vendor databases), navigation posed problems for students. although not a specific question in the usability tests, the results show students often struggled to get back to the libraries’ home page to start a new question.
2. students often expected to find different content under “help and instruction” than what was there.
3. students used the drop down boxes as a last resort. often, they would expand a drop down box and quickly navigate away without selecting anything from the list.
4. with some exceptions, students mainly ignored the tabs across the top of the home page.
5. although students made good use of the tabbed box in the center of the page, many could not distinguish between “journals” and “articles & databases.”
6. similarly, students easily found the “reserves” tab but could not make sense of the difference between “electronic reserves (e-reserves)” and “other reserves.”
7. no student found business resources via the “subject guides” drop down menu at the bottom of the home page.

peer-coach test and staff presentation

unlv libraries employs peer research coaches, undergraduate students who serve as frontline research mentors to their peers. the usability working group administered the same test they used with the first group of undergraduate and graduate students to the peer research coaches. although these students are trained in library research, they still struggled with some of the usability tasks.

the usability working group presented the findings of the peer research coach tests to staff. the peer research coaches are highly regarded in the libraries, so staff were surprised that they had so much difficulty navigating the site; this presentation was the first time many of the staff had seen the results of usability studies of the site. the shocking nature of these results generated a great deal of interest among the staff regarding the work of the usability working group.

figure 1. unlv libraries’ original website design

the dean’s project

in january 2009, the dean of libraries asked the usability working group for assistance in planning for the discovery summit. initially, she requested to view the video from some of the usability tests with the goal of identifying discovery-oriented problems on the libraries’ website. soon after, the dean tasked the group with performing a new set of usability tests using three subjects: a librarian, a library employee with little research or web expertise, and a faculty researcher. each participant was asked to complete three tasks, first using the libraries’ website, then using google.
the tasks were based on items found in the libraries’ special collections:
1. find a photograph available in unlv libraries of the basic magnesium mine in henderson, nevada.
2. find some information about the baneberry nuclear test. are there any documents in unlv libraries about the lawsuit associated with the test?
3. find some information about the local greenpeace chapter. are there any documents in unlv libraries about the las vegas chapter?

the dean viewed those videos and chose the most interesting clips for a presentation at the discovery summit. prior to this meeting, the libraries’ staff were instructed to try completing the tasks on their own so that they might see the potential difficulties users must overcome and to compare the user experience provided by our website with that provided by google.

at the discovery summit, the dean presented the staff a number of clips from these special usability tests, giving the staff an opportunity to see where users familiar with the libraries’ collections stumble. the staff also were shown several clips of undergraduates using the website to perform basic tasks, such as finding journal articles or videos in the libraries, with varying degrees of success. these clips helped illustrate the various difficulties users encounter when attempting to discover library holdings, including unfamiliar search interfaces, library jargon, and a lack of clear relationships between the catalog and other databases. this discussion helped set the stage for the discovery mini-conference.

initial changes to the site

unlv libraries’ website is in the process of being redesigned, and the results of the usability studies are being used to inform that process. however, because of the seriousness of some of the issues, some changes are being implemented into an intermediary design (figure 2).
the new homepage
- combines article and journal searching into one tab and removes the word “databases” from the page entirely;
- adds a website search to the tabbed box;
- adds a “music & video” search option;
- makes better use of the picture on the page by incorporating rotating advertisements in that area;
- widens the page, allowing more space on the rest of the site’s templates;
- breaks the confusing “help & instruction” page into two more specific pages: “help” and “using the libraries”; and
- adds the main library and the branch library hours to the homepage.

this new homepage is just the beginning of our efforts to improve discovery through the libraries’ website. the usability working group already has plans to do a card sort for the “using the library” category to further refine the content and language of that section. the group plans to test the initial changes to the site to ensure that they are improving discovery.

reference

1. jennifer church, jeanne brown, and diane vanderpol, “walking the web: usability testing of navigational pathways at the university of nevada las vegas libraries,” in usability assessment of library-related web sites: methods and case studies, ed. nicole campbell (chicago: ala, 2001).

figure 2. unlv libraries’ new website design

information technology and libraries | december 2006

social engineering is the use of nontechnical means to gain unauthorized access to information or computer systems. while this method is recognized as a major security threat in the computer industry, little has been done to address it in the library field. this is of particular concern because libraries increasingly have access to databases of both proprietary and personal information. this tutorial is designed to increase the awareness of library staff in regard to the issue of social engineering.

one morning the phone rings at the circulation desk; the assistant, joyce, answers.
“seashore branch public library, how may we help you?” she asks, smiling.
“my wife and i recently moved and i wanted to confirm that you had our current address,” a pleasant male voice responds.
“could you give me your name please?”
“the card is in my wife’s name, jennifer greene. we’ve been so busy with the move that she hasn’t had a chance to catch up with everything.”
“okay, i have her information here. 123 main street, apartment 2b. is that correct?”
“thank you so much, that’s it. do you have our new number or is it still 555-555-1234 in your records?”
“let me see . . . no, i think we have your new number.”
“could you read it back to me?”
“sure . . . 555-555-6789, is that right?”
“555-555-6789 . . . that’s right. thank you very much, you’ve been very helpful.”
“no problem, that’s what we’re here for.”
<click>

what just happened?

what happened to joyce may have been exactly what it appeared to be: a conscientious spouse trying to make sure information was updated after a move. but what else could it have been? research for an identity theft, or a stalker trying to get personal information? we have no way of knowing. every reason except the first, innocent one is covered by the term social engineering. in the language of computer hackers, social engineering is a nontechnical hack. it is the use of trickery, persuasion, impersonation, emotional manipulation, and abuse of trust to gain information or computer-system access through the human interface. regardless of an institution’s commitment to computer security through technology, it is vulnerable to social engineering. recently, the institute of management and administration (ioma) reported social engineering as the number-one security threat for 2005. according to ioma, this method of security violation is on the rise due to continued improvements in technical protections against hackers.1

why and how does social engineering work?
the first thing to keep in mind about social engineering is that it does work. kevin mitnick, possibly the best known hacker of recent decades, carried out most of his questionable activities through the medium of social engineering.2 he did not need to use his technical expertise because it was easier to just ask for the information he wanted. he discovered that people, when questioned appropriately, would give him the information he wanted. social engineering succeeds because most people work under the assumption that others are essentially honest. as a pure matter of probability, this is true; the vast majority of communications that we receive during the day are completely innocent in character. this fact allows the social engineer to be effective. by making seemingly innocuous requests for information, or making requests in a way that seems reasonable at the time, the social engineer can gather the information that he or she is looking for.

methods of social engineering

the arsenal of the social engineer is large and very well established. this is mainly because social engineering amounts to a variation on confidence trickery, an art that goes back as far as human history can recall. one might argue that homer’s iliad contains the first record of a social engineering attack in the form of the trojan horse.

direct requests

many social-engineering methods are complex and require significant planning. however, there is a simple method that is often just as effective: the social engineer contacts his or her target and simply asks for the information.

preying on trust and emotion

social engineering is a method of gaining information through the persuasion of human sources, based on the abuse of trust and the manipulation of emotion. in his book, the art of deception, mitnick makes the argument that once a social engineer has established the trust of a contact, then all security is effectively voided and the social engineer can gather whatever information is required.

helping the hacker? library information, security, and social engineering
samuel t. c. thompson
samuel t. c. thompson (sthompson@collier-lib.org) is a public service librarian at the collier county public library, naples, florida.
helping the hacker? | thompson 223

the most common method of targeting computer end-users is through the manipulation of gratitude. in these cases, a social engineer, usually impersonating a technician, contacts a user and states that there is something wrong on the victim’s end, and that the social engineer needs a few pieces of information to “help” the user. appreciative of the assistance, the victim provides the necessary information to the helpful caller or carries out the requested actions. predictably, no problem ever existed, and the victim has now given the social engineer either access to a computer system or the information needed to gain that access. a counterpoint to the manipulation of gratitude is the manipulation of sympathy. this method is most often used on information providers such as help-desk personnel, technicians, and library staff members. in this scenario, a social engineer contacts a victim and claims to have lost information, to be out of contact with a normal source, or simply to be ignorant of something that he or she should know. as anyone can empathize with this plea, the victim is often all too willing to provide the information sought by the social engineer. using these methods, which take advantage of the gratitude, sympathy, and empathy of their victims, social engineers are able to achieve their aims.

impersonation

because forming trust relationships with their victims is critical to a social-engineering attack, it is not surprising that social engineers often pretend to be someone or something that they are not. two of the major tools of impersonation are (1) speaking the language of the victim institution and (2) knowledge of personnel and policy.
to allay suspicion, a social engineer needs to know and be able to use an institution’s terminology. being unable to do so would cause the victim to suspect, rather than trust, the social engineer. with a working knowledge of an organization’s particular vocabulary, a social engineer can phrase his or her request in terms that will not rouse alarm with the intended victim. the other major goal of a social engineer in preparing a successful impersonation is to develop a familiarity with the “lay of the land,” i.e., the specifics of and personnel within an organization. for instance, a social engineer needs to discover who has what authority within an organization so as to understand for whom he or she needs to claim to speak. research to establish trust in their victims, social engineers use research as a tool. this comes in two forms, background research and cumulative research. background research is the process by which a social engineer uses publicly available resources to learn what to ask for, how to ask for it, and whom to ask it of. while the intent and goal of this research differs from the techniques used by students, librarians, and other members of the population, the actual process is the same. cumulative research is the process by which a social engineer gathers the information that he or she needs to make more critical requests of their victims. the facts that a social engineer seeks through cumulative research may seem without value to the casual observer, but put together properly, they are anything but that. questions can include names of staff, internal phone numbers, procedures, or seemingly minor technical details about the library’s network (e.g., what operating system are you running?). late in the afternoon the phone at the reference desk rings. marcy, the librarian on duty answers, “reference desk.” “hi there, this is dave simpson calling from information services at the main branch. 
sorry about the echo, i’m working in the cabling closet at the moment, so i’m calling you on my cell phone.”
“no problem, i can hear you fine. what can i do for you?”
“thanks. a lot of the branches have been having network problems over the last few days. has everything been okay at the seashore branch reference desk?”
“i think so.”
“okay, that’s good. i’m running a test right now on the network and needed to find a terminal that was behaving itself. could you log off and let me know if any messages come up?”
“no problem.” marcy logs off of the reference computer; nothing strange happens. “just the usual messages.”
“good. now start logging back on. what user are you going in as? i mean which login name are you using?”
“searef. okay, i’m logged on now.”
“no strange messages?”
“nothing.”
“that’s great. look, our problem might be kids hacking into the system so i need you to change the password. do you know how to do that?”
“i think so.”
“well, let me walk you through it.”
dave spends a couple of minutes walking marcy through changing the system password. the password is now changed to 5ear3f, a moderately secure password.
“thanks, marcy. you’ve been a great help. we have your new password logged into the system. could you pass on the new password to the other reference personnel?”
“sure.”
“wonderful. just remember not to give the password out to anyone who doesn’t need it, and don’t write it down where anyone who shouldn’t have it can get at it. have a great day.”
“you too.”
<click>

why are libraries vulnerable?

libraries are vulnerable to social-engineering attacks for two major reasons: (1) ignorance and (2) institutional psychology. the first of these difficulties is the easiest to address. the ignorance of library professionals in this matter is easily explained: there is very little literature to date about the issue of social engineering directed at library personnel.
what exists is usually folded into larger articles on general security issues and receives little focus. this lack of concern about social engineering can also be seen in the computer-professional literature, where it is dwarfed by the volume of articles concerning technical security issues. this is a curious gap, considering the high rate of occurrence of this kind of attack. is it because many technical professionals are less comfortable with a social issue that can only be solved through people than with a technical security issue that can be solved through the development or implementation of proper software?3 unfortunately, not knowing about a method of security violation leaves one vulnerable to that method. it is incumbent on librarians, computer administrators, and security professionals to be aware of these issues. the second factor is harder to address but equally important. unlike almost any other profession, librarians are expected to fulfill their patrons’ informational needs without question or bias. this laudable goal makes librarians vulnerable to social-engineering attacks because the inquiries made by a social engineer about the information resources available at a library may be used for nefarious purposes. a reference interview over these issues may be very successful from the point of view of both parties involved, as the librarian fills the open-ended inquiries of the social engineer, and the social engineer receives much, if not all, of the information that he or she needs to violate the library’s internal information systems.

why libraries can be targets

at this point, it is relevant to ask why security violators would even bother with library computer networks. what do libraries have that is worth possibly committing a crime to get? personal information is probably the most tempting target in a library computer system. libraries possess databases of names, addresses, and other personal data about library cardholders.
this information is valuable, and not all of it is easily available from public sources. as may be seen in the section of this article on techniques, such information could be used as an end unto itself or as a stepping stone to security violations in other systems. subscriptions to proprietary databases are quite expensive, as any acquisitions librarian will explain. given the high prices and limited licensing, a hacker may want to gain access to these information resources. this could be a casual hacker who wants to have access to a library-only resource from his or her home computer, or this may be a criminal who wishes to steal intellectual property from a database provider. libraries often have broadband access designed for a large network (e.g., t1). as these lines are very expensive, few individuals can afford them. at the same time, these broadband lines have immense capacity for downloading information from other networks. there are many reasons why a hacker would seek to illicitly use such a resource. for instance, a casual hacker may want to download a large number of bootlegged movie files, or a criminal may wish to download a corporate database. with access to a library’s high-bandwidth internet line, these actions can be carried out quickly and with a minimized chance of detection. libraries possess large numbers of computers due to their increasing automation. these computer resources can, if compromised, be used as anonymous remote computers by hackers. called “zombies,” compromised computers could be used to send illegal spam, to launch distributed denial of service (ddos) attacks, or as servers to distribute illegal materials. if library computers are used in this way, there is a potential for a library to face legal responsibility for the actions of its computers or for the questionable materials found on them.

prevention

the tools needed to prevent social engineering from succeeding are awareness, policy, and training.
these tools feed into one another: we become aware of the possibility of social-engineering attacks, develop policy to communicate these concerns to others, and then train others in these policies to protect them and their libraries from social engineering. libraries should have a simple set of policies to help prevent social engineering from affecting them. this set need not be long; ideally, it should be a short page of bullet points that are easy to remember or to post near telephones. what is important is that it is easy to remember and implement when a call or e-mail comes in.4

basic guidelines for protection against social engineering

■ be suspicious of unsolicited communications asking about employees, technical information, or other internal details.
■ do not provide passwords or login names over the phone or via e-mail no matter who claims to be asking.
■ do not provide patron information to anyone but the patron in person and only upon presentation of the patron’s library card or other proper identification.
■ if you are not sure if a request is legitimate, contact the appropriate authorities.
■ trust your instincts. if you feel suspicious about a question or communication, there is probably a good reason.
■ document and report suspicious communications.

in closing

social engineering is an immensely effective method of breaching computer and network security. it is, however, entirely dependent on the ability of the social engineer to persuade staff members into providing information or access that they should not provide. with care and good information policies, we can prevent social engineering from working. after all, do we really want to be helping the hacker?

the circulation desk phone rings. joyce answers, “seashore branch public library, how may we help you?”
“hi there, i’m worried that i haven’t turned in all the books i have out, and i really don’t want to get stuck with a fine.
could you tell me what i have out?”
“no problem. what is your name?”
“sean grey.”
joyce brings up sean grey’s circulation records, and then remembers the library’s information policy and decides to ask another question, “could you give me your library card number?”
“i don’t have that with me. i really don’t want to get stuck with those fines.”
“i’m sorry, mr. grey, to preserve patron privacy we can only give out circulation information if you give us your card number or if you are here in person with your card or id.”
“but i just want to avoid a fine. can’t you help?”
“don’t worry; if you are late by accident on occasion, we are willing to forgive a fine.”
“so you can’t give me my records?”
“i’m sorry but we have to protect patron privacy. i’m sure you understand.”
“i guess so. goodbye.”
“have a good day.”
<click>

references

1. institute of management & administration, “six security threats that will make headlines in ’05,” ioma’s security director’s report 5, no. 1 (2004): 1–14.
2. k. manske, “an introduction to social engineering,” security management practices (nov./dec. 2000): 53–59.
3. m. mcdowell, “cyber-security tip st04-014,” (2005), http://www.us.cert.gov/cas/tips/st04-014.html (accessed june 5, 2005).
4. k. mitnick and w. simon, the art of deception (indianapolis: wiley, 2002).

building an open source institutional repository at a small law school library | wang 81
fang wang
communications

v700 flatbed scanner, which was recommended by many digitization best practices in texas. for software, we had all the important basics such as ocr and image editing software for the project to start. for the following several months, i did extensive research on what digital asset management platform would be the best solution for the law library.
we had options to continue displaying the digital collections through webpages or use a digital asset management platform that would provide long-term preservation as well as retrieval functions. we made the decision to go with the latter. generally speaking, there are two types of digital asset management platforms: proprietary and open source. in some rare occasions, a library chooses to develop its own system and not to use either type of the platforms if the library has designated programmers. there are pros and cons to both proprietary and open source platforms. although setting up the repository is fairly quick and easy on a proprietary platform, it can be very expensive to pay annual fees for hosting and using the service. for the open source software, it may appear to be “free” up front; however, installing and customizing the repository can be very time consuming and these solutions often lack technical and development support. there is no uniform rule for choosing a platform. it depends on what the organization wants to achieve and its own unique circumstances. i explored several popular proprietary platforms such as contentdm and digital commons. contentdm is an oclc product, which has a lot of capability and is especially good for displaying image collections. digital commons is owned of the repository is ongoing; it is valuable to share the experience with other institutions who wish to set up an institutional repository of their own and also add to the knowledgebase of ir development. institutional repository from the ground up unlike most large university libraries, law school libraries are usually behind on digital initiative activities because of smaller budgets, lack of staff, and fewer resources. although institutional repositories have already become a trend for large university libraries, it still appears to be a new concept for many law school libraries. 
at the beginning of 2009, i was hired as the digital information management librarian to develop a digital repository for the law school library. when i arrived at texas tech university law library, there was no institutional repository implemented. there were very few digital projects done at the law library. one digital collection was of faculty scholarship. this collection was displayed on a webpage with links to pdf files. another digital project, to digitize and provide access to the texas governor executive orders found in the texas register, was planned then disbanded because of the previous employee leaving the position. i started by looking at the digitization equipment in the library. the equipment was very limited: a very old and rarely used book scanner and a sheet-fed scanner. the good thing was that the library did have extra pcs to serve as workstations. i did research on the book scanner we had and also consulted colleagues i met at various digital library conferences about it. because the model is very outdated and has been discontinued by the vendor and thus had little value to our digitization project, i decided to get rid of the scanner. i then proposed to purchase an epson perfection building an open source institutional repository at a small law school library: is it realistic or unattainable? digital preservation activities among law libraries have largely been limited by a lack of funding, staffing and expertise. most law school libraries that have already implemented an institutional repository (ir) chose proprietary platforms because they are easy to set up, customize, and maintain with the technical and development support they provide. the texas tech university school of law digital repository is one of the few law school repositories in the nation that is built on the dspace open source platform.1 the repository is the law school’s first institutional repository in history. 
it was designed to collect, preserve, share and promote the law school’s digital materials, including research and scholarship of the law faculty and students, institutional history, and law-related resources. in addition, the repository also serves as a dark archive to house internal records. in this article, the author describes the process of building the digital repository from scratch including hardware and software, customization, collection development, marketing and outreach, and future projects. although the development

fang wang (fang.wang@ttu.edu) is digital information management librarian, texas tech university school of law library, lubbock, texas.

82 information technology and libraries | june 2011

two months later, we discovered that a preconfigured application called jumpbox for dspace had been released and proved to be a much easier solution for the installation. the price was reasonable too, $149 a year (the price has jumped quite a bit since then). however, using jumpbox would have left our newly purchased red hat linux server of no use because jumpbox runs on ubuntu, so after some discussion we decided not to pursue it. we were a little stuck in the installation process. outsourcing the installation seemed to be a feasible solution for us at this point. we identified a reputable dspace service provider after doing extensive research including comparing vendors, obtaining references, and pursuing other avenues. after obtaining a quote, we were quite satisfied with the price and decided to contract with the vendor. while waiting for the contract to be approved by the university contracting office, i began designing a look and feel unique to the ttu school of law with some help from another library staff member. the installation finally took place at the beginning of january 2010. i worked very closely with the service provider during the installation to ensure the desired configuration for our dspace instance.
our repository site with the ttu law branding became accessible to the public three days later. and with several weeks of warranty, we were able to adjust several configurations including display thumbnails for images. overall, we are very pleased with the results. after the installation, our it department maintains the dspace site and we host all the content on our own server. collection development of the ir content is the most critical element to an institutional repository. while we were waiting for our it department 66, the majority of the repositories worldwide were created using the dspace platform.2 for the installation, we looked at the opportunity to use services provided by the state digital library consortium texas digital library (tdl) and tried to pursue a partnership with the main university library, which had already implemented a digital repository. however, because of financial reasons and separate budgets, those approaches did not work out. so we decided to have our own it department install dspace. installation and customization of our dspace unlike large university libraries, smaller special libraries face many challenges while trying to establish an open source repository. after making the decision to use dspace, the first challenge we faced was the installation. dspace runs on postgresql or oracle and requires a server installation. customizing the web interface requires either the jspui (javaserver pages user interface) or xmlui (extensible markup language user interface). the staff in our it department knew little about dspace. however, another special library on campus offered their installation notes to our system administrator because they just installed dspace. although dspace runs on a variety of operating systems, we purchased red hat enterprise linux after some testing because it is the recommended os for dspace. 
then our system administrator spent several months trying to figure out how to install the software in addition to his existing projects. because we did not have dedicated it personnel working on the installation, the work was often interrupted and very difficult to complete. our it staff also found it very difficult to continue with the installation because the software requires a lot of expertise.

by berkeley electronic press and is often used in the law library community. as a smaller law library, our budget did not allow us to purchase those platforms, which require annual fees of more than $10,000. so we had to look at the open source options. for the open source platforms, i investigated dspace, fedora, eprints and greenstone. dspace is a java-based system developed by mit and hp labs. it offers a communities-collections model and has built-in submission workflows and a long-term preservation function. it can be installed “out of the box” and is easy to use. it has been widely adopted as institutional repository software in the united states and worldwide. fedora was also developed in the united states. it is more of a back-end system with no web-based administration tools and requires a lot of programming effort. like dspace, eprints is ir software that is easy to set up and use; it was developed in the u.k., is written in perl, and is more widespread in europe. greenstone is a tool developed in new zealand for building and distributing digital library collections. it provides interfaces in 35 languages so it has many international users. when choosing an ir platform, it is not a question of which software is superior to others but rather which is more appropriate for the purpose and the content of the repository. our goal was to find a platform that had low costs and did not involve much programming.
we also wanted a system that was capable of archiving digital items in various formats for the long term, was flexible for data migration, had a widely accepted metadata scheme and decent search capability, and was easy to use. another factor we had to consider was the user base. because open source software relies largely on users themselves for technical support, we wanted software that had an active user community in the united states. dspace seemed to satisfy all of our needs. also, according to repository

hosted by the lubbock county bar association at the ttu law school. we made the initial announcement to the law faculty and staff and later to the lubbock county bar about the new digital initiative service we have established. we received very positive feedback from the law community. professor edgar’s family was delighted to see his collection made available to the public. following the success of the initial launch, i developed an outreach plan to promote the digital repository. to make the repository site more visible, several efforts were made: the repository site url was submitted to the dspace user registry, the directory of open access repositories (opendoar), and the registry of open access repositories (roar); the site was registered with google webmaster tools for better indexing; and the repository was linked to several websites of the law school and library. the “faculty scholarship” collection and the “texas governor executive orders” collection became available shortly after. i then developed a poster of the newly established digital repository and presented it at the texas conference on digital libraries held at university of texas austin in may 2010. as of august 2010, our digital repository has more than eight hundred digital items.
with more and more content becoming available in the repository, we plan on making an official announcement to the law community. we will also make entering first-year law students aware of the ir by including an article about the new repository in the library newsletter that is distributed to them during their orientation. our future marketing plan includes sending out announcements of new collections to the law school using our online announcement system techlawannounce and promoting the digital repository through the law library social networking pages on facebook and twitter. we also plan reviewed each year. based on the collection development policy, we made a decision to migrate the content of the old “faculty scholarship” collection from webpages into the digital repository. it was intended to include all publications of the texas tech law school faculty in the collection. we then hired a second-year law student as the digital project assistant and trained him on scanning, editing, and ocr-ing pdf files; uploading files to dspace; and creating basic metadata. we also brought another two student assistants on board to help with the migration of the faculty scholarship collection. the faculty services librarian checked the copyright with faculty members and publishers while i (the digital information management librarian) served as the repository manager handling more complicated metadata creation, performing quality control over student submissions, and overseeing the whole project. later development and promoting the ir during the faculty scholarship migration process, we discovered a need to customize dspace to allow active urls for publications. we wanted all the articles linked to three widely used legal databases: westlaw, lexisnexis, and hein online. because the default dspace system does not support active urls, it requires some programming effort to make the system detect a particular metadata field then render it as a clickable link. 
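the idea of detecting a particular metadata field and rendering it as a clickable link can be sketched in a few lines. what follows is only an illustration in python, not the vendor’s customization: dspace itself is a java application, and the field names in LINK_FIELDS, the function name render_field, and the example url are all hypothetical.

```python
from html import escape
from urllib.parse import urlparse

# Hypothetical metadata field names mapped to display labels.
# A real repository would use whatever fields its own schema defines.
LINK_FIELDS = {
    "dc.relation.westlaw": "Westlaw",
    "dc.relation.lexisnexis": "LexisNexis",
    "dc.relation.heinonline": "HeinOnline",
}


def render_field(field, value):
    """Return HTML for one metadata value.

    The value becomes a clickable link only when the field is one of the
    designated link fields AND the value is a well-formed http(s) URL.
    Everything else is returned as escaped plain text.
    """
    label = LINK_FIELDS.get(field)
    parsed = urlparse(value)
    if label and parsed.scheme in ("http", "https") and parsed.netloc:
        return '<a href="{}">{}</a>'.format(escape(value, quote=True), escape(label))
    return escape(value)


if __name__ == "__main__":
    print(render_field("dc.relation.westlaw", "https://example.org/doc?id=1&v=2"))
    print(render_field("dc.title", "a plain title"))
```

restricting links to a short list of fields and to http(s) urls keeps ordinary text fields unchanged and avoids turning arbitrary submitted strings into links.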
we outsourced the development to the same service provider who installed dspace for us. the results were very satisfying. the vendor customized the system to allow active urls and displayed the links as clickable icons for each legal database. in april 2010, “professor j. hadley edgar ’s personal papers” collection was made available in conjunction with his memorial service, to install dspace, we prepared and scanned two collections: the “texas governor executive orders” collection and the “professor j. hadley edgar’s personal papers” collection. the latter was a collection donated by professor edgar’s wife after he passed away in 2009. professor edgar taught at the law school from 1971 to 1991. he was named the robert h. bean professor of law and was twice voted by the student body as the outstanding law professor. the collection contains personal correspondence, photos, newspaper clippings, certificates, and other materials. many of the items have a high historic value to the law school. for the scanning standards, we used 200 dpi for text-based materials and 400 dpi for pictures. we chose pdf as our production file format as it is a common document format and smaller in size to download. after the installation was completed at the beginning of january, i drafted and implemented a digital repository collection development policy shortly after to ensure proper procedures and guidance of the repository development. the policy includes elements such as the purpose of the repository, scope of the collections, selection criteria and responsibilities, editorial rights, and how to handle challenges and withdrawals. i also developed a repository release form to obtain permissions from donors and authors to ensure open access for the materials in the repository. 
twelve collections were initially planned for the repository: “faculty scholarship,” “personal manuscripts,” “texas governor executive orders,” “law school history,” “law library history,” “regional legal history,” “law student works,” “audio/video collection,” “dark archive,” “electronic journals,” “conference, colloquium and symposium,” and “lectures and presentations.” there will be changes to the collections in the future as the digital repository collection development policy will be 84 information technology and libraries | june 2011 all roads lead to rome. no matter what platform you choose, whether open source or not, the goal is to pick a system that best suits your organization’s needs. to build a successful institutional repository is not simply “scanning” and “putting stuff online.” various factors need to be considered, such as digitization, ir platform, collection development, metadata, copyright issues, and marketing and outreach. our experience has proven that it is possible for a smaller special library with limited resources and funding to establish an open source ir such as dspace and continue to maintain the site and build the collections with success. open source software is certainly not “free” because it requires a lot of effort. however, in the end it still costs a lot less than what we would pay to the proprietary software vendors. references 1. “the texas tech university school of law digital repository,” http://repository.law.ttu.edu/ (accessed apr. 5, 2011). 2. “repository maps,” http://maps.repository66.org/ (accessed aug. 16, 2010). (ssrn) links to individual articles in the faculty scholarship collection. after that, the next collections we will work on are the law school and law library history materials. we also plan to do some development on the dspace authentication to integrate with the ttu “eraider” system to enable single log-in.
in the future, we want to explore the possibilities of setting up a collection for the works of our law students and engage in electronic journal publishing using our digital repository. conclusion it is not an easy task to develop an institutional repository from scratch, especially for a smaller organization. installation and development are certainly a big challenge for a smaller library with limited number of it staff. outsourcing these needs to a service provider seems to be a feasible solution. another challenge is training. we overcame this challenge by taking advantage of the state consortium’s dspace training sessions. subscribing to the dspace mailing list is necessary as it is a communication channel for dspace users to ask questions, seek help, and keep up to date about the software. on hosting information sessions for our law faculty and students to learn more about the digital repository. future projects there is no doubt that our digital repository will grow significantly because we have exciting collections planned for future projects. one of our law faculty, professor daniel benson, donated some of his personal files from an eight-year litigation representing the minority plaintiffs in the civil rights case of jones v. city of lubbock, 727 f. 2d 364 (5th cir. 1984) in which the minority plaintiffs won the case. the lawsuit changed the city of lubbock’s election system for city council members from the “at large” method to the “single member district system,” which allowed the minority candidates consistently being elected. this collection contains materials, notes, memoranda, letters, and other documents prepared and utilized by the plaintiffs’ attorneys. it has significant historical value because a texas tech law professor and five texas tech law graduates participated in that case successfully as pro bono attorneys for the minority plaintiffs. 
in addition, we plan on adding social science research network
efficiently processing and storing library linked data using apache spark and parquet
kumar sharma, ujjal marjit, and utpal biswas
information technology and libraries | september 2018 29
kumar sharma (kumar.asom@gmail.com) is research scholar, department of computer science and engineering; ujjal marjit (marjitujjal@gmail.com) is system-in-charge, center for information resource management (cirm); and utpal biswas (utpal01in@yahoo.com) is professor, department of computer science and engineering, the university of kalyani, india.
abstract
resource description framework (rdf) is a commonly used data model in the semantic web environment. libraries and various other communities have been using the rdf data model to store valuable data after it is extracted from traditional storage systems. however, because of the large volume of the data, processing and storing it is becoming a nightmare for traditional data-management tools. this challenge demands a scalable and distributed system that can manage data in parallel. in this article, a distributed solution is proposed for efficiently processing and storing the large volume of library linked data stored in traditional storage systems. apache spark is used for parallel processing of large data sets and a column-oriented schema is proposed for storing rdf data. the storage system is built on top of the hadoop distributed file system (hdfs) and uses the apache parquet format to store data in a compressed form. the experimental evaluation showed that storage requirements were reduced significantly as compared to jena tdb, sesame, rdf/xml, and n-triples file formats. sparql queries are processed using spark sql to query the compressed data. the experimental evaluation showed a good query response time, which decreases significantly as the number of worker nodes increases.
https://doi.org/10.6017/ital.v37i3.10177
introduction
more and more organizations, communities, and research-development centers are using semantic web technologies to represent data using rdf. libraries have been trying to replace the cataloging system using a linked-data technique such as bibframe.1 the transition of marc cataloging data into rdf format has received much attention in libraries.2 data stored in various other formats such as relational databases, csv, and html have already begun their journey toward the open-data movement.3 libraries have participated in the evolution of linked open data (lod) to make data an essential part of the web.4 various researchers have explored areas related to library data and linked data. in particular, transitioning legacy library data into linked data has dominated most of the research. other areas include researching the impact of linked library data, investigating how privacy and security can be maintained, and exploring the potential effects of having open linked library data. a linked-data approach for publishing data on the web brings many benefits to libraries. first, once isolated library data currently stored using traditional cataloging systems (marc) becomes a part of the web, it can be shared, reused, and consumed by web users.5 this promotes the cross-domain sharing of knowledge hidden in the library data, opening the library as a rich source of information. online library users can share more information using linked library resources since every library resource is crawlable on the web via uniform resource identifiers (uri).
most importantly, library data benefits from linked-data technology’s real advantages, such as interoperability, integration with other systems, data crosswalks, and smart federated search.6 numerous approaches have evolved for making the vision of the semantic web a success. no doubt, they have succeeded in making the library a part of the web, but there remain issues related to library big data. the term big data refers to data or information that cannot be processed using traditional software systems.7 the volume of such data is so large that it requires advanced technologies for processing and storing the information. libraries also have real concerns with large volumes of data during and after the transition to linked data. the main challenges are in processing and storage. during conversion from library data to rdf, the process can become stalled because of the large volumes of data. once the data is successfully converted into rdf formats, there are storage issues. finally, even if the data is somehow stored using common rdf triple stores, it is difficult to retrieve and filter. this is a challenging problem that every librarian must give attention to. librarians should know the real nature of library big data, which causes problems in analyzing data and decision making. librarians must also know the technologies that can resolve these issues. the rate of data generation and the complexity of the data itself are constantly increasing. traditional data-management tools are becoming incapable of managing the data. that is why the definition of big data has been characterized by five vs—volume, velocity, variety, value, and veracity.8 • volume is the amount of the data. • velocity is the data-generation rate (which is high in this case). • variety refers to the heterogeneous nature of the data. • value refers to the actual use of the data after the extraction. • veracity is the quality or trustworthiness of the data. 
to handle the five vs of big data, distributed technologies such as commodity hardware, parallel processing frameworks, and optimized storage systems are needed. commodity hardware reduces the cost of setting up a distributed environment and can be managed with very limited configurations. a parallel processing system can process distributed data in parallel to reduce processing time. an optimized storage system is required to store the large volume of data, supporting scalability to accommodate more data on demand. to meet these requirements and tackle the challenges posed by library big data, a distributed solution is proposed. this approach is based on apache hadoop, apache spark, and a column-oriented storage system to process large-size data and to store the processed data in a compressed form. bibliographic rdf data from british national library and the national library of portugal have been used for this experiment. these bibliographic data are processed using apache spark and stored using apache parquet format. the stored data can be queried with sparql; spark sql is used to execute the queries. given an existing rdf dataset, we designed a schema for storing rdf data using a column-oriented database. using column-oriented design with apache parquet and spark sql as the query processor, a distributed rdf storage system was implemented that can store any amount of rdf data by increasing the number of distributed nodes as needed.
literature review
while big data continues to rise, library data are still in traditional storage systems isolated from the web. to continue working with the web, libraries must redesign the way they format data and contribute toward the web of data. to serve library data to other communities, libraries must integrate their data with the web. attempts to do this have been made by several researchers.
the task of integration cannot be achieved by only librarians; rather, it requires a team of experts in the field of library and information technology. the advanced way for integrating resources is with linked-data technology by assigning uris to every piece of library data. with this goal, there exist various projects related to the convergence of library data and linked data. one of these, bibframe, is an initiative to transition bibliographic resources into linked-data representation. bibframe aims to replace traditional cataloging standards such as marc and unimarc using the concept of publishing structured data on the web. marc formats cannot be exchanged easily with nonlibrary systems. the marc standard also suffers from inconsistencies, errors, and inability to express relationships between records and fields within the record. that is why mostly bibliographic resources stored in marc standards are targeted for conversion.9 other works include the open-data initiative from the british national library, library catalog to linked opendata conversion, exposing library data as linked data, and building a knowledge graph to reshape the library staff directory.10 linked data is fully dependent on rdf. rdf reveals graph-like structures where resources are linked with one another. thus, rdf can improve on marc standards because of its strong ability to link related resources. this system of revealing everything as a graph helps in building a network of library resources and other data on the web. this also makes for fast search functionality. 
in addition, searching a topic or book could bring similar graphs from other library resources, leading to the creation of linked-data service.11 such a service has been implemented by the german national library to provide bibliographic and authority data in rdf format, by europeana linked open data, with access to open metadata on millions of books and multimedia data, and by the library of congress linked data service.12 there is less discussion of library big data. though big data in general is in active research, the library domain has received much less attention than the broader concept of big data and its challenges. this could be because most librarians working with linked data are from nontechnical backgrounds. now is the right time for libraries to give priority to adopting big data technologies to overcome challenges posed by big data. wang et al. have discussed library big data issues and challenges.13 they made some statements about whether library data belongs to the big data category. library data belongs to big data since it fulfills some of the characteristics of big data, such as volume, variety, and velocity. wang et al. also raise some of libraries’ challenges related to library big data, such as lacking teams of experts, inability to adopt big data due to budgetary issues, and technical challenges. finally, they point out that to take advantage of the web’s full potential, library data must be transformed into a format that can be accessible beyond the library using technologies like semantic web and linked data. the web has already started its work related to big-data challenges. libraries need to transition their data into an advanced format with the ability to handle big-data issues. the main problems related to library big data arise during data transformation and storage.
to store and retrieve large amounts of data, we need commodity hardware that can handle trillions of rdf triples, requiring terabytes or petabytes of disk space. as of now, there are semantic web frameworks such as jena and sesame to handle rdf data, but these frameworks are not scalable for large rdf graphs.14 jena is a java-based framework for building semantic web and linked-data applications. it is basically a semantic web programming framework that provides java libraries for dealing with rdf data. jena tdb is the component of jena for storing and querying rdf data. 15 it is designed to work in a single-node environment. sesame is also a semantic web framework for processing, storing, and querying rdf data. basically, sesame is a web-based architecture for storing and querying rdf data as well as schema information. 16 background this section briefly describes the structure of rdf triples, apache spark along with its features and column-oriented database system, and apache parquet. structure of rdf triples rdf is a schema-less data model. it implies that the data is not fixed to a specific schema, so it does not need to conform to any predefined schema. unlike in relational tables, where we define columns during schema definition and those columns must contain the required type of data, in rdf we can have any number of properties and data using any kind of vocabulary. we only need vocabulary terms to embed properties. the vocabulary is created using domain ontology, which represents the schemas. to describe library resources we need a library-domain ontology. for example, to define a book and its properties one can use the bookont ontology.17 bookont is a book-structure ontology designed for an optimized book search and retrieval process. however, it is not mandatory to use existing ontology and all the properties defined under it. we can use terms from a newly created ontology or mixed ontologies with required properties. 
rdf represents resources in the form of subject, predicate, and object. the subject is the resource being described, identified by a uri. this subject can have any number of property-value pairs. this way of representing a resource is called knowledge representation, where everything is defined as knowledge in the form of entity attribute value (eav). in rdf, the basic unit of information is a triple t, such that t = (subject, predicate, object). such information when stored on disk is called a triplestore. the collection of rdf triples is called an rdf database. an rdf database is specially designed to store linked data to make the web more useful by interlinking data from different sources in a meaningful way. the real advantage of rdf is its support of the common data model. rdf is the standard way for publishing meaningful data on the web, and this is backed by linked data. linked data provides some rules about how data can be published on the web by following the rdf data model.18 with such a common data model, one can integrate data from any sources by inserting new property-value pairs without altering database schema. another important purpose of rdf is to make resources processable by software agents on the web. rdf triples are of two types: literal triples and linked triples. literal triples consist of a uri-referenced subject and a literal object (scalar value) joined by a predicate. in linked triples, both the subject and the object are uris linked by the predicate. this type of linking is called an rdf link, which is the basis for interlinking the resources.19 rdf data are queried using the sparql query language.20 sparql is a graph-matching query language and is used to retrieve triples from the triplestore. sparql queries are also called semantic queries. like sql queries, sparql queries find and retrieve the information stored in the triplestore.
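The triple model and the idea of graph matching described above can be made concrete with a small sketch. The Java below is only an illustration under stated assumptions: the class names `Triple` and `TinyTripleStore`, the `example.org` URIs, and the sample data are hypothetical, and the `match` method imitates a single SPARQL triple pattern (with `null` standing in for a variable) rather than implementing SPARQL.

```java
import java.util.ArrayList;
import java.util.List;

/** One RDF statement. Subject and predicate are URIs; the object is a URI or a literal. */
final class Triple {
    final String subject;
    final String predicate;
    final String object;
    final boolean objectIsUri; // true: linked triple, false: literal triple

    Triple(String subject, String predicate, String object, boolean objectIsUri) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
        this.objectIsUri = objectIsUri;
    }
}

/** A toy in-memory triplestore (hypothetical); real stores add indexing and persistence. */
final class TinyTripleStore {
    private final List<Triple> triples = new ArrayList<>();

    void add(Triple t) {
        triples.add(t);
    }

    /** Returns the triples matching a pattern; a null argument acts as a variable. */
    List<Triple> match(String s, String p, String o) {
        List<Triple> out = new ArrayList<>();
        for (Triple t : triples) {
            if ((s == null || s.equals(t.subject))
                    && (p == null || p.equals(t.predicate))
                    && (o == null || o.equals(t.object))) {
                out.add(t);
            }
        }
        return out;
    }

    /** Builds a store with two literal triples and one linked triple (made-up data). */
    static TinyTripleStore sample() {
        TinyTripleStore store = new TinyTripleStore();
        String book = "http://example.org/book/1";
        String person = "http://example.org/person/7";
        store.add(new Triple(book, "http://purl.org/dc/terms/title", "linked data basics", false));
        store.add(new Triple(book, "http://purl.org/dc/terms/creator", person, true));
        store.add(new Triple(person, "http://xmlns.com/foaf/0.1/name", "a. author", false));
        return store;
    }
}
```

Calling `TinyTripleStore.sample().match("http://example.org/book/1", null, null)` returns every property-value pair recorded for that book, which is the simplest form of the pattern matching a SPARQL engine performs.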
a sparql query is composed of five main components:21
• the prefix declaration part is used to abbreviate the uris;
• the dataset definition is used to specify the rdf dataset from which the data is to be fetched;
• the result clause is used to specify what information is needed to be fetched, which can be select, construct, describe, and ask;
• the query pattern is used to specify the search conditions; and
• the query modifiers are used to rearrange query results using order by, limit, etc.
hadoop and mapreduce
hadoop is open-source software that supports distributed processing of large datasets on machine clusters.22 two core components, the hadoop distributed file system (hdfs) and mapreduce, make distributed storage and computation of processing jobs possible.23 hdfs is the storage component, whereas mapreduce is a distributed data-processing framework, the computational model of hadoop based on java. the mapreduce algorithm consists of two main tasks: map and reduce. the map task takes a set of data as input and produces another set of data with individual components in the form of key/value pairs or tuples. the output of the map task goes to the reduce task, which combines common key/value pairs into a smaller set of tuples. hdfs and mapreduce are based on driver/worker architecture consisting of driver and worker nodes having different roles. an hdfs driver node is called the name-node while the worker node is called the data-node. the name-node is responsible for managing names and data blocks. data blocks are present in the data-nodes. data-nodes are distributed across each machine, responsible for actual data storage. similarly, the mapreduce driver node is called the job-tracker and the worker node is called the task-tracker. the job-tracker is responsible for scheduling jobs on task-trackers. task-trackers are distributed across each machine along with the data-nodes and are responsible for processing map and reduce tasks as instructed by the job-tracker.
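The map and reduce tasks described above can be sketched in a few lines of plain Java. This is a single-process illustration, not Hadoop code: the class name `MiniMapReduce` and the word-count task are assumptions chosen for brevity, and a real job would distribute the map and reduce calls across task-trackers.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Single-process imitation of the two MapReduce phases (hypothetical class). */
final class MiniMapReduce {

    /** Map task: turn one input line into (word, 1) key/value pairs. */
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\s+")) {
            if (!word.isEmpty()) {
                pairs.add(new SimpleEntry<>(word, 1));
            }
        }
        return pairs;
    }

    /** Reduce task: combine pairs that share a key into one (word, total) pair. */
    static Map<String, Integer> reduce(List<Map.Entry<String, Integer>> pairs) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            totals.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        return totals;
    }

    /** Runs map over every line, then a single reduce over all emitted pairs. */
    static Map<String, Integer> run(List<String> lines) {
        List<Map.Entry<String, Integer>> all = new ArrayList<>();
        for (String line : lines) {
            all.addAll(map(line));
        }
        return reduce(all);
    }
}
```

Because each call to `map` depends only on its own line, the calls could run on different machines; only `reduce` needs to see all pairs for a given key, which is what makes the model easy to scale out.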
the concept of hadoop implies that the set of data to be processed is broken into smaller forms that can be processed individually and independently. this way, tasks can be assigned to multiple processors to process the data, and eventually it becomes easy to scale data processing over multiple computing nodes. once a mapreduce program is written, the program can be scaled to run over thousands of machines in a cluster.
spark and resilient distributed datasets (rdd)
apache spark is an in-memory cluster computing platform, which is a faster batch-processing framework than mapreduce. more importantly, it supports in-memory processing of tasks along with data, so querying data is much faster than disk-based engines. the core of spark is the resilient distributed dataset (rdd). rdd is a fundamental data structure of spark that holds a distributed collection of data where data cannot be modified. rather, data modification yields another immutable collection of data (or rdd). this process is called rdd transformation. figure 1 depicts an example of rdd transformation. the distributed processing and transformation of data is managed by rdd. rdds are fault-tolerant, meaning that the lost data is recoverable using the lineage graph of rdds.24 spark constructs a directed acyclic graph (dag) of the sequence of computations that need to be performed on data. spark’s computing engine performs most computations in memory across multiple stages. because of this multistage in-memory computation engine, it provides better performance at reading and writing data than the mapreduce paradigm.25 it aims at speed, ease of use, extensibility, and interactive analytics. spark relies on concepts such as rdd, dag, spark context, transformations, and actions.
spark context is an execution environment in which rdds and broadcasting variables can be created. spark context is also called the master of a spark application and allows accessing the cluster through a resource manager. data transformation happens in the spark application when the data is loaded from a data-store into rdds and some filter or map functions are performed to produce a new set of rdds. when the set of computations is created, forming a dag, it does not perform any execution; rather, it prepares for execution in the end, like a lazy loading process. some examples of actions are data extraction or collection and getting the count of words. transformations are the sequence of events, and action is the final execution of the underlying logic. figure 1. rdd transformations. the execution model of spark is shown in figure 2. the execution model is based on the driver/worker architecture consisting of the driver and the worker processes. the driver process creates the spark context and schedules tasks based on the available worker nodes. initially, the master process must be started, then creating worker nodes follows. the driver takes the responsibility of converting a user’s application into several tasks. these tasks are distributed among the workers. the executors are the main components of every spark application. executors actually perform data processing, reading and writing data to the external sources and the storage system. the spark manager is responsible for resource allocation and deallocation to the spark job. basically, spark is only a computation model. it is not related to storage of data, which is a different concept. it only helps in computations and data analytics in a distributed manner. for distributed execution, the task is distributed among the connected nodes so that every node can perform tasks at the same time; it performs the desired operation and notifies the master upon completion of the task. 
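The difference between transformations, which only describe work, and an action, which triggers it, can be illustrated without a cluster. The sketch below uses plain Java streams as an analogy only; it is not Spark code, and the class and method names (`LazyPipelineDemo`, `bookLines`, `collectBookLines`) are hypothetical. The counter shows that nothing is examined until the final collecting step runs.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Analogy for lazy transformations and eager actions, using java.util.stream. */
final class LazyPipelineDemo {

    /** Counts how many input lines the filter has actually examined. */
    static final AtomicInteger examined = new AtomicInteger();

    /** "Transformations": filter and map are recorded but not executed here. */
    static Stream<String> bookLines(List<String> lines) {
        return lines.stream()
                .filter(line -> {
                    examined.incrementAndGet();
                    return line.contains("book");
                })
                .map(String::trim);
    }

    /** "Action": collecting forces the recorded pipeline to run. */
    static List<String> collectBookLines(List<String> lines) {
        return bookLines(lines).collect(Collectors.toList());
    }
}
```

In Spark the same split applies to calls such as `filter` and `map` (transformations) versus `count` or `collect` (actions), with the added difference that the work is scheduled across worker nodes.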
figure 2. execution model of spark.
in mapreduce, read/write operations happen between disk and memory, making job computation slower than spark. rdds resolve this by allowing fault-tolerant, distributed, in-memory computations. in rdd, the first load of data is read from disk and then a write-to-disk operation may take place depending upon the program. the operations between first read and last write happen in memory. data on rdds are lazily evaluated, i.e., during rdd transformations, data will not take part until any action is called on the final rdd, which triggers the job execution. the chain of rdd transformations creates dependencies between rdds. each dependency has a function for calculating its data and a pointer to its parent rdd. spark divides rdd dependencies into stages and tasks, then it sends them to workers for execution. hence, an rdd does not actually hold the data; rather, it either loads data from disk or from another rdd and performs some actions on the data for producing results. one of the important features of rdd is its fault tolerance, because of which it can recompute any partitions that are lost due to node failures. rdds have built-in methods for saving data into files. for example, when saveAsTextFile() is called on an rdd, its data are written to the specified text file line by line. there are numerous options for storing data in different formats, such as json, csv, sequence files, and object files. all these file formats can be saved directly into hdfs or normal file systems.
spark sql and dataframe
spark sql is a query interface for processing structured data using sql style on the distributed collection of data. that means it is used for querying structured data stored in hdfs (like hive) and parquet. spark sql runs on top of spark as a library and provides higher optimization.
the spark dataframe is an api (application programming interface) that can perform relational operations on rdds and external data sources such as hive and parquet. like rdds, a spark dataframe is also a collection of structured records that can be manipulated by spark sql. it evaluates operations lazily to perform relational optimizations.26 a dataframe is created using rdds along with the schema information. for example, the java code snippet below creates a dataframe using rdd and a schema called rdftriple (rdf-triple schema will be discussed in the proposed approach).
// convert the input records into lines of n-triples text
JavaRDD<String> n_triples = marc_records.map(new TextToString());
// parse each line into an rdf triple object (subject, predicate, object)
JavaRDD<RDFTriple> rdf_triples = n_triples.map(new LinesToRDFFunction());
// build a dataframe whose columns follow the rdf triple schema
Dataset<Row> dataframe = sparkSession.createDataFrame(rdf_triples, RDFTriple.class);
// write the dataframe to disk in the parquet format
dataframe.write().parquet("/full-path/rdfdata.parquet");
the spark dataframe uses memory management wisely by saving data in off-heap memory and provides an optimized execution plan. conceptually, a dataframe is equivalent to the relational tables with richer optimization and supports sql queries over its data. so, a dataframe is used for storing data into tables. structured data from spark dataframe can be saved into the parquet file format as shown in the above code snippet.
column-oriented database
a database is a persistent collection of records. these records are accessed via queries. the system that stores data and processes queries to retrieve data is called a database system. such systems use indexes or iteration over the records to find the required information stored in the database. indexes are an auxiliary, dictionary-like data structure that keeps indexes of individual records. indexing is efficient in some cases; however, it requires two lookup operations, which slows down the access time.
data scanning or iteration over each record resolves the query by finding the exact location of the records. it is inefficient when the size of the data is too large. as data-generation rate is increasing constantly, more and more data is going to be stored on the disk. for a fast-growing rate of data, we need a system that can adjust to more data than traditional storage systems and, at the same time, query-processing tasks should take less time. when the data gets too large, indexing and record scanning will be costly during querying. hence, a satisfying solution is the columnar-storage system, which stores data by columns rather than by rows.27 a column-oriented database system stores data in corresponding columns, and each column is stored in a separate file on the disk. this makes data access time much quicker. since each column is stored separately, any required data can directly be accessed instead of reading all the data. that means any column can be used as an index, making it auto-indexing. that is why the column-oriented representation is much faster than the row-oriented representation. apart from this, data is stored in the compressed form. each column is compressed using a different scheme. in the column-oriented database, the compression is always efficient as all the values belong to the same data type. hence, column-oriented databases require less disk space, as they do not need additional storage for indexes since the data is stored within the indexes themselves. consider an example where a database table named “book” consists of the columns “bookid,” “title,” and “price.” following a column-oriented approach, all the values for bookid are stored together under the “bookid” column, all the values for title are stored together under the “title” column, and so on, as shown in figure 3.
figure 3. an example of an entity and its row and column representation.
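The "book" example above can be sketched as a tiny in-memory columnar table. This is an illustration only, under the assumption that each column is simply a list held in memory; the class name `ColumnarBookTable` and its methods are hypothetical, and a real column store would keep each column in its own compressed file.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy column-oriented layout for the "book" table (bookid, title, price). */
final class ColumnarBookTable {

    /** Column name mapped to all values of that column, stored together. */
    private final Map<String, List<Object>> columns = new LinkedHashMap<>();

    ColumnarBookTable() {
        columns.put("bookid", new ArrayList<>());
        columns.put("title", new ArrayList<>());
        columns.put("price", new ArrayList<>());
    }

    /** Appends one logical row by appending one value to each column. */
    void addRow(int bookId, String title, double price) {
        columns.get("bookid").add(bookId);
        columns.get("title").add(title);
        columns.get("price").add(price);
    }

    /** Reads a single column without touching the others. */
    List<Object> column(String name) {
        return columns.get(name);
    }

    /** Reassembles the row at the given index from the separate columns. */
    List<Object> row(int index) {
        List<Object> row = new ArrayList<>();
        for (List<Object> values : columns.values()) {
            row.add(values.get(index));
        }
        return row;
    }

    /** An aggregate that needs only the "price" column. */
    double totalPrice() {
        double total = 0.0;
        for (Object value : columns.get("price")) {
            total += (Double) value;
        }
        return total;
    }
}
```

The `totalPrice` method reads one list and ignores titles and identifiers entirely, which is the access pattern that makes columnar storage attractive for queries touching few columns.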
apache parquet
parquet is a top-level apache project that stores data in column-oriented fashion, highly compressed and densely packed on the disk.28 it is a self-describing data format that embeds the schema within the data itself. it supports efficient compression and encoding schemes that lower data-storage costs and maximize the effectiveness of querying data. parquet has added advantages, such as limiting i/o operations and storing data in compressed form using the snappy method developed by google and used in its production environment. hence it is designed especially for space and query efficiency. snappy aims at compressing petabytes of data in minimal amounts of time, and especially aims for resolving big data issues.29 the data compression rate is more than 250 mb/sec, and the decompression rate is more than 500 mb/sec. these compression and decompression rates are for a single core of a system having a core i7 processor in 64-bit mode. it is even faster than the fastest mode of the zlib compression algorithm.30 parquet is implemented using column-striping and record-assembly algorithms that are optimized for storing large data-blocks.31 it supports nested data structures in which each value of the same column is stored in contiguous memory locations.32 apache parquet is flexible and can work with many programming languages because it is implemented using apache thrift (https://thrift.apache.org/). a parquet file is divided into row groups and metadata at the end of the file. each row group is divided into column values (or column chunks), such as column 1, column 2, and so on as shown in figure 4. each column value is divided into pages, and each page consists of the page header, repetition levels, definition levels, and values. the footer of the file contains various metadata, such as file metadata, column metadata, and page-header metadata. the metadata information is required to locate and find the values, just like indexing.
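To see why a column of similar values compresses well, consider dictionary encoding, one of the encodings Parquet supports. The sketch below is a simplified illustration and not Parquet's implementation: the class name `DictionaryEncodedColumn` is hypothetical, and real Parquet pages combine dictionary codes with further bit-packing and a compression codec such as Snappy.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Simplified dictionary encoding of one string column (illustration only). */
final class DictionaryEncodedColumn {

    /** Distinct values, in the order they were first seen. */
    final List<String> dictionary = new ArrayList<>();

    /** One small integer per row, pointing into the dictionary. */
    final List<Integer> codes = new ArrayList<>();

    /** Replaces each value with the index of its dictionary entry. */
    static DictionaryEncodedColumn encode(List<String> values) {
        DictionaryEncodedColumn column = new DictionaryEncodedColumn();
        Map<String, Integer> index = new LinkedHashMap<>();
        for (String value : values) {
            Integer code = index.get(value);
            if (code == null) {
                code = column.dictionary.size();
                index.put(value, code);
                column.dictionary.add(value);
            }
            column.codes.add(code);
        }
        return column;
    }

    /** Restores the original column from the codes and the dictionary. */
    List<String> decode() {
        List<String> values = new ArrayList<>();
        for (int code : codes) {
            values.add(dictionary.get(code));
        }
        return values;
    }
}
```

A predicate column in an RDF dataset is a natural fit for this scheme, because a handful of long predicate URIs tends to repeat across millions of rows and each repetition shrinks to a small integer.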
figure 4. parquet file structure.

the proposed approach

the proposed approach relies on spark’s core apis (rdd, spark sql, and dataframe), which can operate on large datasets. rdd is used to load the initial data from the input file, process the data, and transform it into the triple structure. spark dataframe is used to load the data from the rdd into the triple structure and write the transformed rdf data into a parquet file. spark sql is used to fetch the data stored in the parquet file.

processing rdf data

processing rdf data from large rdf/xml files requires breaking the file into smaller file components. general data-processing systems cannot handle large files because they face memory issues. at this stage, the proposed approach can process the data using an n-triples file, hence individual rdf/xml files need to be converted into the n-triples file format. the process of breaking an rdf/xml file into smaller file components and then converting them into n-triples format depends upon the size of the input file. if it is not more than 500 mb, it is directly converted into the n-triples file format. multiple rdf/xml files are converted into individual n-triples files, which are then combined into one n-triples file, as the proposed spark application reads input from a single file.

schema to store rdf data

a simple rdf schema with three triple entities has been designed. this schema is an rdf triple view, which is the building block of the rdf storage schema proposed in this work. the rdf triple view is a simple java class consisting of three attributes: subject, predicate, and object.
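a rough python analogue of such a triple view, together with a helper that fills it from one n-triples statement, might look as follows. the article's class is written in java and its code is not shown, so the names `TripleView` and `parse_line` and the example uri are assumptions made for this sketch; the parser is deliberately naive and does not handle comments, blank lines, or every escape allowed by the n-triples grammar.

```python
from typing import NamedTuple


class TripleView(NamedTuple):
    """Three attributes, mirroring the subject/predicate/object view."""
    subject: str
    predicate: str
    object: str


def parse_line(line: str) -> TripleView:
    """Split one N-Triples-style statement into its three components."""
    text = line.strip()
    # Drop the statement terminator (" .") if present.
    if text.endswith("."):
        text = text[:-1].rstrip()
    # Split on the first two runs of whitespace only, so a literal object
    # that itself contains spaces stays in one piece.
    subject, predicate, obj = text.split(None, 2)
    return TripleView(subject, predicate, obj)


# Hypothetical statement; the subject URI is invented for illustration.
example = ('<http://example.org/book/1> '
           '<http://purl.org/dc/terms/title> "time out lisboa" .')
triple = parse_line(example)
```

splitting with a maximum of two splits is the one design choice worth noting: subjects and predicates never contain spaces in n-triples, but literal objects often do.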
given an rdf dataset d, consisting of a set of rdf triples t, in either rdf/xml or n-triples format, the dataset is transformed into a format that can be processed by a spark application. further, the dataset is transformed into a line-based format in which each triple statement is placed on its own line, terminated by a new-line (\n) character. a line contains three components (subject, predicate, and object) separated by spaces. each line is unique, using the combined information of subject, predicate, and object. given an rdf triple structure $t_i$, where $t_i = (s_i, p_i, o_i)$ and $t_i \in t$, for each $t_i$ an instance of rdf triple view is created to hold the triple information. the columnar schema organizes triple information into three components, storing each component separately as subject, predicate, and object columns (figure 5).

figure 5. rdf triple view.

rdf storage

we store the rdf data based on rdf triple view, which is the main schema for storing data in the triple representation. we do not need any indexing or additional information related to subject, predicate, or object to be stored on disk. since we can have any number of temporary dataframe tables in memory, join operations can be performed using these tables to filter the data. in the absence of expensive indexing and additional triple information, the storage area can be reduced significantly. apart from this, the compression technique used in apache parquet saves much more space than storing the data in other triple stores. figure 6 illustrates the data-storing process.

figure 6. data-storing process in hdfs.

the collection of triple instances is loaded into an rdd. at the end, the collection of triple instances is loaded into a spark dataframe. spark dataframes are equivalent to rdbms tables and support both structured and unstructured data formats.
using a single schema, multiple dataframes can be created and registered as temporary tables in memory, and high-level sql queries can be executed on top of them. the concept of using multiple dataframes with a single schema is motivated by the aim of avoiding joins and indexing. in the final step, the spark dataframe is saved into hdfs files in the parquet format. from the parquet file, the data can be loaded back into dataframes in memory and queried using spark sql.

fetching data from storage

given an rdf dataset d, a sparql query q, and a columnar schema s, we use s to translate q into q' so that queries can be performed on top of s. here, the answer of query q' on top of s is equal to the answer of q on top of d. query mappings m are used to transform sparql queries into spark sql queries. for querying, the data is first loaded into a spark dataframe from parquet files. to query data using sparql, queries must follow basic graph patterns (bgp). a bgp is a set of triple patterns, each similar to an rdf triple (s, p, o), where any of s, p, and o can be query variables or literals. a bgp is used for matching triple patterns to an rdf graph. this process is called binding between query variables and rdf terms. the statements listed under the where clause are known as a bgp consisting of query patterns. for example, the query “select ?name ?mbox where {?x foaf:name ?name . ?x foaf:mbox ?mbox .}” has two query patterns. to evaluate a query containing two query patterns, one join is required. in general, the number of joins is one less than the number of query patterns; that is, for n query patterns we need n-1 joins to resolve the values. figure 7 illustrates the process of query execution.

figure 7. process of query execution.

evaluation

to evaluate the proposed approach, we compare the storage size with file-based storage systems such as n-triples files and rdf/xml files.
we also compare with standard triple stores such as jena tdb and sesame. the data-storing time is compared among jena tdb, sesame, and parquet, the last with one, two, and three worker nodes. finally, for the purposes of the experiment, some sparql queries are selected and tested over rdf data stored in parquet format in hdfs. the query performance is tested on the distributed system having one, two, and three worker nodes respectively. in the following subsections, we show the results for each of the above comparisons.

datasets

for evaluation, we use two datasets. dataset 1 contains bibliographic data from the national library of portugal (nlp) (http://opendata.bnportugal.pt/eng_linked_data.htm). from nlp, we choose the nlp catalogue datasets in rdf/xml format. the datasets are freely available for reuse and contain metadata from the nlp catalogue, the national bibliographic database, the portuguese national bibliography, and the national digital library. the datasets are available as linked data, which were produced in the context of the european library. the size of the rdf/xml file is 6.46 gb with more than 45 billion rdf triples.

dataset 2 contains bibliographic data from the british national library (https://www.bl.uk/bibliographic/download.html). from the british national bibliography collection we choose the bnb lod books dataset. the datasets are publicly available and contain bibliographic records of different categories, such as books, locations, bibliographic resources, persons, organizations, and agents. the datasets are divided into sixty-seven files in rdf format. however, we combine them into one file in n-triples format to fit the requirement of the large size of the input data.
the combined file is 22.52 gb and contains more than 16 billion rdf resources in n-triples format, making it suitable for the proposed approach. from this conversion, we get more than 150 billion rdf triples.

figure 8. data storage time for different file formats.

figure 9. disk size for different file formats.

disk storage

figure 8 shows the data-storing time using sesame, jena tdb, and parquet for the above two datasets. data from raw rdf files are stored in jena tdb and sesame. individual files are processed for storing into jena tdb and sesame to avoid memory overflow, as jena or sesame models cannot load data at once from the large files. to store data in parquet format, we run the program separately on different worker nodes. figure 9 presents the total disk size required for each of these file formats and triple stores for the two datasets.

query performance

for testing, the sparql queries are converted manually at this stage. we run some selected queries over bibliographic rdf data stored in parquet file format in hdfs. we run the following queries with one, two, and three worker nodes respectively. the queries are listed below:

q1) the first query fetches the count of rdf triples present in the storage.
query: select (count(*) as ?count) where { ?s ?p ?o . }

q2) the second query fetches the entire dataset in spo format. it fetches data in the n-triples format.
query: select * { ?s ?p ?o }

q3) the third query fetches resources that belong to books with the subject “english language composition and exercises.”
query: select ?x where { ?x rdf:type bibo:book . ?x dc:subject <http://bnb.data.bl.uk/id/concept/lcsh/english_language_composition_and_exercises> . }
q4) the fourth query fetches resources that belong to books with the subject “english language composition and exercises” and creator “palmer frederick.”
query: select ?x where { ?x rdf:type bibo:book . ?x dc:subject <http://bnb.data.bl.uk/id/concept/lcsh/english_language_composition_and_exercises> . ?x dc:creator <http://bnb.data.bl.uk/id/person/palmer_frederick> . }

q5) the fifth query fetches objects having the predicate dcterms:ispartof.
query: select ?name where { ?s dcterms:ispartof ?name . }

figure 10 shows the query response time for the above queries with different numbers of worker nodes for the two datasets. the queries are executed in the distributed environment. it shows that increasing the number of worker nodes decreases the query response time.

figure 10. query response time with different numbers of worker nodes.

query comparison

for comparing query response time, the proposed approach is tested with the first dataset mentioned above. at this stage, the proposed approach requires further research before it can be compared with other distributed triple storage systems. such a comparison also requires more worker nodes and larger datasets suitable for parallel processing in a distributed environment. with a smaller setup, it would be hard to analyze the performance of the individual approaches, as they may produce similar results. we therefore compare the proposed approach with the standard jena tdb solution in a single-node environment. the following sparql queries are tested against dataset 1.

prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
prefix dc: <http://purl.org/dc/terms/>
prefix rdau: <http://rdaregistry.info/elements/u/>
prefix foaf: <http://xmlns.com/foaf/0.1/>

q1. select (count(*) as ?count) { ?s ?p ?o }
q2. select * { ?s ?p ?o }
q3. select ?x where { ?x rdf:type dc:bibliographicresource. }
q4. select ?x where { ?x rdf:type <http://schema.org/book>. ?x rdau:p60339 'time out lisboa'. }
q5. select ?s where { ?s dc:ispartof <http://data.theeuropeanlibrary.org/collection/a0511>. ?s foaf:page 'http://www.theeuropeanlibrary.org/tel4/record/3000115318515'. }

figure 11. query comparison.

we are interested in measuring the query response time with the above queries. first, we test with jena tdb. we then test the proposed approach in a single-node environment. we execute the above set of queries multiple times to record the average performance. as mentioned above, no indexing is used in the storage. rdf triples are stored as they appear in the n-triples file. queries are executed without indexing and still perform better than jena tdb, as shown in figure 11.

discussion

in this article, we claim that apache spark and column-oriented databases can resolve library big data issues. especially when dealing with rdf data, spark can perform far better than other approaches because of its in-memory processing ability. concerning rdf data storage, the column-oriented database is suitable for storing large volumes of data because of its scalability, fast data loading, and highly efficient data compression and partitioning. a column-oriented database system requires less disk space, reducing the storage area. as proof, we have shown the data storage comparison and the performance of columnar storage for rdf data using the parquet format in hdfs. as shown in the results, apache parquet takes much less disk space than other storage systems. also, the data-storing time is small compared to the others. we observed that the result of query 2 is the entire dataset stored in parquet format. the size of this resultant dataset is 22.52 gb, which is the same as the original size. the same dataset, when stored in parquet format, is reduced to 2.89 gb.
this shows that parquet is a highly optimized storage system that can reduce the storage cost. we have shown the query response time for five different sparql queries on distributed nodes for two different datasets. we believe that with a better schema for storing rdf triples the proposed approach can be improved, and that with the technologies used a fast and reliable triple store can be designed.

conclusion and future work

librarians all over the globe should give priority to integrating library data with the web to enable cross-domain sharing of library data. to do this, they must pay attention to current trends in big data technologies. because the data-generation rate is increasing in every domain, traditional data processing and storage systems are becoming ineffective because of the scale and complexity of the data. in this article, we present a distributed solution for processing and storing a large volume of library linked data. from the experiment, we observe that processing a large volume of data takes significantly less time using the proposed approach. also, the storage area is reduced significantly compared to other storage systems. in the future we plan to optimize the current approach using advanced technologies such as graphx, machine learning tools, and other big-data technologies for even faster data processing, searching, and analyzing.

references

1 eric miller et al., “bibliographic framework as a web of data: linked data model and supporting services,” library of congress, november 11, 2012, https://www.loc.gov/bibframe/pdf/marcld-report-11-21-2012.pdf.

2 brighid m. gonzales, “linking libraries to the web: linked data and the future of the bibliographic record,” information technology and libraries 33, no. 4 (2014): 10, https://doi.org/10.6017/ital.v33i4.5631; myung-ja k. han et al., “exposing library holdings metadata in rdf using schema.org semantics,” in international conference on dublin core and metadata applications dc-2015, são paulo, brazil, september 1–4, 2015, pp. 41–49, http://dcevents.dublincore.org/intconf/dc-2015/paper/view/328/363.

3 franck michel et al., “translation of relational and non-relational databases into rdf with xr2rml,” in proceedings of the 11th international conference on web information systems and technologies, lisbon, portugal, 2015, pp. 443–54, https://doi.org/10.5220/0005448304430454; varish mulwad, tim finin, and anupam joshi, “automatically generating government linked data from tables,” working notes of aaai fall symposium on open government knowledge: ai opportunities and challenges 4, no. 3 (2011), https://ebiquity.umbc.edu/_file_directory_/papers/582.pdf; matthew rowe, “data.dcs: converting legacy data into linked data,” ldow 628 (2010), http://ceur-ws.org/vol-628/ldow2010_paper01.pdf.

4 virginia schilling, “transforming library metadata into linked library data,” association for library collections and technical services, september 25, 2012, http://www.ala.org/alcts/resources/org/cat/research/linked-data.

5 getaneh alemu et al., “linked data for libraries: benefits of a conceptual shift from library-specific record structures to rdf-based data models,” new library world 113, no. 11/12 (2012): 549–70, https://doi.org/10.1108/03074801211282920.
6 lisa goddard and gillian byrne, “the strongest link: libraries and linked data,” d-lib magazine 16, no. 11/12 (2010), https://doi.org/10.1045/november2010-byrne.

7 t. nasser and r. s. tariq, “big data challenges,” journal of computer engineering & information technology 4, no. 3 (2015), https://doi.org/10.4172/2324-9307.1000133.

8 alexandru adrian tole, “big data challenges,” database systems journal 4, no. 3 (2013): 31–40, http://dbjournal.ro/archive/13/13_4.pdf.

9 carol jean godby and karen smith-yoshimura, “from records to things: managing the transition from legacy library metadata to linked data,” bulletin of the association for information science and technology 43, no. 2 (2017): 18–23, https://doi.org/10.1002/bul2.2017.1720430209.

10 corine deliot, “publishing the british national bibliography as linked open data,” catalogue & index, issue 174 (2014): 13–18, http://www.bl.uk/bibliographic/pdfs/publishing_bnb_as_lod.pdf; gustavo candela et al., “migration of a library catalogue into rda linked open data,” semantic web 9, no. 4 (2017): 481–91, https://doi.org/10.3233/sw-170274; martin malmsten, “exposing library data as linked data,” ifla satellite preconference sponsored by the information technology section: emerging trends in 2009, http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.181.860&rep=rep1&type=pdf; keri thompson and joel richard, “moving our data to the semantic web: leveraging a content management system to create the linked open library,” journal of library metadata 13, no. 2–3 (2013): 290–309, https://doi.org/10.1080/19386389.2013.828551; jason a. clark and scott w. h. young, “linked data is people: building a knowledge graph to reshape the library staff directory,” code4lib journal 36 (2017), http://journal.code4lib.org/articles/12320; martin malmsten, “making a library catalogue part of the semantic web,” humboldt university of berlin, 2008, https://doi.org/10.18452/1260.

11 r. hastings, “linked data in libraries: status and future direction,” computers in libraries 35, no. 9 (2015): 12–28, http://www.infotoday.com/cilmag/nov15/hastings--linked-data-in-libraries.shtml.

12 mirjam keßler, “linked open data of the german national library,” in eco4r workshop lod of dnb, 2010; antoine isaac, robina clayphan, and bernhard haslhofer, “europeana: moving to linked open data,” information standards quarterly 24, no. 2/3 (2012); carol jean godby and ray denenberg, “common ground: exploring compatibilities between the linked data models of the library of congress and oclc,” oclc online computer library center, 2015, https://files.eric.ed.gov/fulltext/ed564824.pdf.

13 chunning wang et al., “exposing library data with big data technology: a review,” 2016 ieee/acis 15th international conference on computer and information science (icis), pp. 1–6, https://doi.org/10.1109/icis.2016.7550937.

14 b. mcbride, “jena: a semantic web toolkit,” ieee internet computing 6, no. 6 (2002): 55–59, https://doi.org/10.1109/mic.2002.1067737; jeen broekstra, arjohn kampman, and frank van harmelen, “sesame: a generic architecture for storing and querying rdf and rdf schema,” international semantic web conference, ed. j. davies, d. fensel, and f. van harmelen (berlin and heidelberg: springer, 2002), https://doi.org/10.1002/0470858060.ch5.

15 “apache jena - tdb,” apache jena, accessed august 22, 2018, https://jena.apache.org/documentation/tdb/.

16 “sesame (framework),” everipedia, july 15, 2016, https://everipedia.org/wiki/sesame_(framework)/.

17 asim ullah et al., “bookont: a comprehensive book structural ontology for book search and retrieval,” 2016 international conference on frontiers of information technology (fit), 211–16, https://doi.org/10.1109/fit.2016.046.

18 tom heath and christian bizer, “linked data: evolving the web into a global data space,” synthesis lectures on the semantic web: theory and technology 1, no. 1 (2011): 1–136, https://doi.org/10.2200/s00334ed1v01y201102wbe001.
19 christian bizer et al., “linked data on the web (ldow2008),” proceeding of the 17th international conference on world wide web - www 08, 2008, pp. 1265–66, https://doi.org/10.1145/1367497.1367760.

20 eric prud’hommeaux and andy seaborne, “sparql query language for rdf,” w3c recommendation, january 15, 2008, https://www.w3.org/tr/rdf-sparql-query/.

21 devin gaffney, “how to use sparql,” datagov wiki rss, last modified april 7, 2010, https://data-gov.tw.rpi.edu/wiki/how_to_use_sparql.

22 tom white, hadoop: the definitive guide (sebastopol, ca: o’reilly media, 2012), https://www.isical.ac.in/~acmsc/wbda2015/slides/hg/oreilly.hadoop.the.definitive.guide.3rd.edition.jan.2012.pdf.

23 dhruba borthakur, “the hadoop distributed file system: architecture and design,” hadoop project website, 2007, http://svn.apache.org/repos/asf/hadoop/common/tags/release-0.16.3/docs/hdfs_design.pdf; seema maitrey and c. k. jha, “mapreduce: simplified data analysis of big data,” procedia computer science 57 (2015): 563–71, https://doi.org/10.1016/j.procs.2015.07.392.

24 michael armbrust et al., “spark sql: relational data processing in spark,” in proceedings of the 2015 acm sigmod international conference on management of data (new york: acm, 2015), 1383–94, https://doi.org/10.1145/2723372.2742797.

25 abdul ghaffar shoro and tariq rahim soomro, “big data analysis: apache spark perspective,” global journal of computer science and technology 15, no. 1 (2015), https://globaljournals.org/gjcst_volume15/2-big-data-analysis.pdf.

26 salman salloum et al., “big data analytics on apache spark,” international journal of data science and analytics 1, no. 3–4 (2016): 145–64, https://doi.org/10.1007/s41060-016-0027-9.
27 daniel j. abadi, samuel r. madden, and nabil hachem, “column-stores vs. row-stores: how different are they really?,” in proceedings of the 2008 acm sigmod international conference on management of data (new york: acm, 2008), 967–80, https://doi.org/10.1145/1376616.1376712.

28 deepak vohra, “apache parquet,” in practical hadoop ecosystem (berkeley, ca: apress, 2016), 325–35, https://doi.org/10.1007/978-1-4842-2199-0_8.

29 “google/snappy,” github, january 04, 2018, https://github.com/google/snappy.

30 jean-loup gailly and mark adler, “zlib compression library,” 2004, https://www.repository.cam.ac.uk/bitstream/handle/1810/3486/rfc1951.txt?sequence=4.

31 sergey melnik et al., “dremel: interactive analysis of web-scale datasets,” proceedings of the vldb endowment 3, no. 1–2 (2010): 330–39, https://doi.org/10.14778/1920841.1920886.
32 marcel kornacker et al., “impala: a modern, open-source sql engine for hadoop,” in proceedings of the 7th biennial conference on innovative data systems research, asilomar, california, january 4–7, 2015, http://www.inf.ufpr.br/eduardo/ensino/ci763/papers/cidr15_paper28.pdf.

computer assisted circulation control at health sciences library sunyab

jean k. miller: associate health sciences librarian for circulation and dissemination

a description of the circulation system which the health sciences library at the state university of new york at buffalo has been using since october 1970. features of the system include automatic production of overdue, fine, and billing notices; notices for call-in of requested books; and book availability notices. remote operation and processing on the ibm 360/40 and cdc 6400 computer are accomplished via the administrative terminal system (ats) and terminal job entry (tje). the system provides information for management of the collection and improved service to the user.
journal of library automation vol. 5/2 june, 1972

introduction

the health sciences library of the state university of new york at buffalo (sunyab) serves the teaching, research, and clinical programs of the five schools of health sciences at the university (medicine, dentistry, pharmacy, nursing, and health related professions) as well as the department of biology. it is the biomedical resource library for the five teaching hospitals affiliated with sunyab and for the health professionals within the nine counties of the lakes area regional medical program.

service demands had increased steadily since 1961 with the incorporation of the university within the state of new york. this was apparent in the circulation department, where statistics indicated a 21 percent increase in the circulation of materials between fy 1967/68 and 1968/69. the circulation system in use was inefficient and time-consuming for both the user and the clerical staff. the user was required to fill out a charge card for each book, giving his name, address, and status; and the title, author, year of publication, volume, copy number, and call number of the book. the card was stamped with the due date and filed alphabetically by main entry. problems resulted from illegible handwriting, selection of incorrect main entry, and incorrect filing. control of library materials was inadequate.

the system to be described was adopted following consideration of the requirements of an effective system of circulation control and of the resources available to the library. planning for the development and implementation of the automated circulation system began in the fall of 1969. funding was provided by a medical library resource grant and the office of the provost of health sciences of sunyab. system design began in february 1970; programming was accomplished during june and july; implementation started in august; and the system was operational in october 1970.
costs of operation have been provided by the university libraries of sunyab since april 1971.

computer facilities

the health sciences library shares the facilities provided by the department of computer services on campus. the current installation is an ibm 360/40 h with an eight-disk drive 2319 unit, six magnetic tape devices, card read and punch unit, and a 1100 line-per-minute printer. it includes a 2703 telecommunications unit supporting forty 2741 terminals and a 2701 unit with parallel data adapter unit interfacing a channel-to-channel adapter to a cdc 6400 computer. the ibm operating system, scope 3.2.0 version 1.1, is used.

processing

the library's circulation system was designed to use the administrative terminal system (ats) and terminal job entry (tje) for remote operation and processing on the ibm 360/40 and cdc 6400 computers. programs are written in fortran for cdc 6400-6500-6600 (version 2.3). the program modules comprising the circulation system require from 1k to 60k and from 0 to 2 tape units for processing. ats documents are used rather than punched card decks as program and data input media.

the system incorporates several large data bases which are updated at regular intervals. a file of current circulation transactions (80 characters per record) is maintained on both magnetic tape and in ats storage. this file is merged daily with new transactions. names and addresses of university personnel and students are maintained on magnetic tape. a file of inactive circulation records (50 characters per record) is also maintained on tape. other smaller files are stored in ats and are updated daily and/or weekly. no permanent disk storage is used.

input of data and programs is made from the ibm 2741 terminal located in the circulation office of the library. data are entered daily by the clerical staff via the ats terminal. storage, retrieval, and text editing are performed as required. processing of data is initiated by the library staff. a properly sequenced assemblage of ats documents consisting of data and programming instructions (tje input file) is input from the ibm 2741 terminal. this input file is submitted through tje for execution on the cdc 6400 computer. the data are processed in accordance with the specific job command entered at the terminal. after processing, the output is stored as a single ats document. in some instances, the clerical staff divides the ats stored output into discrete output files for storage and subsequent use. selected segments of the output (notices, save lists, etc.) are produced in hard copy format and delivered to the library by the computer center (figure 1).

hsccirc system

the health sciences library circulation system (hsccirc) provides:

1. a file (query file) of all monographs off the shelves which includes a record of:
   a. books charged out.
   b. books on interlibrary loan.
   c. books on reserve.
   d. books at the bindery.
   e. books on the "hold shelf" which have been returned upon the request of another patron.
   f. books on the "new book shelf."
   g. books which have been declared lost and are in the process of being replaced.
2. overdue notices to all borrowers.
3. billing notices to students for those books not returned after a second overdue has been sent.
4. a file (fine file) indicating the amount owed by individual students for overdues.
5. fine notices to students if an overdue book is returned but the fine is not paid.
6. notices to users having books requested by other patrons.
7. hold shelf notices alerting library personnel to those books which have been reserved for library patrons.
8. book availability notices to users who have made "save" requests.
9. a file (history file) containing records of inactive transactions.
10. daily and cumulative (fiscal year-to-date) statistics of the transactions.

the foregoing lists the information which the system provides on a routine basis.
other modules of the system permit access to additional information as required. for example, lists may be prepared of books currently in circulation to interlibrary loan, on reserve, or at the bindery. these lists are used by the staff involved in processing these materials and may be updated at their request.

[fig. 1. system overview: flowchart of query file processing, fine file processing, history file analysis, address file update, faculty/staff letters, and special run sequences (ill, reserve, bindery lists).]

the history file is analyzed quarterly. the analysis provides a statistical breakdown, by user categories, of the transactions which occurred since the last analysis. the total number of charges, renewals, and save requests for each of the five user categories are tallied. the call numbers of the books borrowed by members of each user category are listed. multiple charges of the same book are incremented and recorded. this information on book usage and borrowing patterns assists in library management decisions. it is possible to identify high usage of specific volumes or subject areas and to determine whether the demand is from the faculty, staff, or graduate or undergraduate student body. records of heavy demand and multiple save requests aid in decisions to purchase additional copies of a monograph. at the end of each semester, faculty/staff letters are prepared and mailed. each notice lists the call number and due date for overdue books currently charged to the faculty or staff member.
the notice requests return of the book(s) before the beginning of the next semester. statistics generated by the system (figure 2) are used in the preparation of monthly, quarterly, and annual reports. they have been used as a basis for decisions on policy such as that resulting in a change in the length of the circulation period in april 1971. subsequent statistics have been used to evaluate such changes.

                          42572    year to date
new chrgs                   112        2225
holds                         5         185
spcl chrgs (ill)              5          85
spcl chrgs (bnd)              0          48
spcl chrgs (res)              1         116
renewals                     12         308
save requests                 3          74
recall letters                2          63
hold letters                  3          99
books overdue                48        1230
1st overdue                   0         773
2nd overdue                   0         364
bills                         0         122
lost books                    2          61
discharges                  173        2837
discharges (hld shlf)         5         154

fig. 2. circulation statistics.

in addition, the system permits rapid, easy consultation of the query file to detect the location of any book off the shelves. this is accomplished through use of the printed query file (figure 3) which is arranged in call number sequence. it contains one line of information for each transaction changing the status of a book. for example, when a book is charged, the query file contains one line of information relating to the charge. if the book becomes overdue, a second line of information is automatically generated indicating the overdue status of the book. a two-digit transaction code defines the status. the transaction code is entered as part of the input (as code 10 when charging a book); or it is generated by the system, as occurs when a book becomes overdue and initial and subsequent overdue, billing, and/or fine notices are produced (codes 51, 52, 53, 54).

*ql696/p2127/1 *c10 *t 60772 *d 70772 *p 202319 *u3
*ql696/p2127/1 *c51 *t 71172 *d 71872 *p 202319 *u3
*ql697/g4/1966/1 *c10 *t 62672 *d 62672 *p 71165132 *u1
*ql697/g4/1966/1 *c52 *t 70772 *d 71472 *p 71165132 *u1
*ql698/h25/1 *c10 *t 61972 *d 71972 *p 138551 *u3
*ql698/j3/1953/1 *c10 *t 61972 *d 71772 *p 138551 *u3
*ql698/s78/1965/1 *c20 *t 62372 *d 72372 *p 244856 *u2
*ql698/s78/1 *c10 *t 61972 *d 71972 *p 138551 *u3
*ql698.3/a7/1965/1 *c10 *t 61972 *d 71972 *p 138551 *u3
*ql703/w22/1968/v1/1 *c10 *t 51172 *d 61172 *p 102714 *u3
*ql706/p25/1957/1 *c10 *t 71172 *d 81172 *p 440503366 *u4
*ql715/c132m/1966/1 *c10 *t 70272 *d 80272 *p 180439 *u3
*ql731/d67f/1969/1 *c10 *t 51872 *d 61872 *p 102714 *u3
*ql731/d67f/1969/1 *c53 *t 70772 *d 71472 *p 102714 *u3
*ql731/e55/1953/1 *c10 *t 51872 *d 61872 *p 102714 *u3
*ql731/e55/1953/1 *c53 *t 70772 *d 71472 *p 102714 *u3
*ql737/c2h24/1948/1 *c10 *t 50772 *d 60772 *p 147339 *u2
*ql737/c2h24/1952/1 *c60 *t 42472 *d -0 *p -0 *u0
*ql737/c2h24/1966/1 *c60 *t 42472 *d -0 *p -0 *u0
*ql737/c23c7/1969/1 *c10 *t 62172 *d 72172 *p 175470 *u2
*ql737/c415/1961/1 *c60 *t 30172 *d -0 *p -0 *u0
*ql737/c7/1964/1 *c10 *t 62272 *d 72272 *p 220053 *u3
*ql737/m3f919k/1969/1 *c10 *t 60972 *d 70972 *p 165872 *u3
*ql737/m3f919k/1961/1 *c51 *t 71172 *d 71872 *p 165872 *u3
*ql737/p9b9/1963/1 *c60 *t 12272 *d -0 *p -0 *u0
*ql737/p9h4/v1/1953/1 *c10 *t 51172 *d 61172 *p 102714 *u3
*ql737/p9h4/v1/1953/1 *c54 *t 71172 *d 61172 *p 102714 *u3
*ql737/p9m64/1967/3 *c10 *t 61572 *d 71572 *p 360307268 *u6
*ql737/p9s3/1965/v2/2 *c18 *t 100471 *d -0 *p 777777777 *u0

legend:
*   call number
*c  transaction code
*t  date of transaction
*d  due date or date next notice will be generated
*p  patron identification number
*u  user category

fig. 3. query file.

this same information may be obtained through on-line query of the circulation file from the ibm 2741 terminal during the hours of operation of ats. access to the query file is either by call number of the book or identification number of the user. the latter is used when producing lists of items out on loan to a borrower and in detecting delinquent borrowers.
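the record layout given in the legend of fig. 3 can be illustrated with a short modern sketch. this is not the original fortran; it is a minimal python illustration under stated assumptions: dates are read as month-day-year with a dropped leading zero (so 60772 is taken as june 7, 1972), a value of -0 is treated as "no value," and the sample record is one line of fig. 3 with evident print-recognition errors (l for 1, o for 0) corrected. the names RECORD, Transaction, parse_record, and by_patron are hypothetical.

```python
import re
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from typing import Dict, Iterable, List, Optional

# One query-file line, following the legend of fig. 3:
# *call number, *c code, *t transaction date, *d due date, *p patron, *u category.
RECORD = re.compile(
    r"^\*(?P<call>\S+)\s+\*c(?P<code>\d{2})\s+\*t\s*(?P<t>-?\d+)"
    r"\s+\*d\s*(?P<d>-?\d+)\s+\*p\s*(?P<p>-?\d+)\s+\*u(?P<u>\d)$",
    re.IGNORECASE,
)

# Codes named in the text: 10 is keyed in when a book is charged; 51-54 are
# generated by the system when overdue, billing, or fine notices are produced.
CHARGE = 10
SYSTEM_NOTICE_CODES = frozenset({51, 52, 53, 54})


@dataclass
class Transaction:
    call_number: str
    code: int
    transacted: Optional[date]
    due: Optional[date]
    patron: Optional[str]
    user_category: int


def parse_date(field: str) -> Optional[date]:
    """Assumed format: MMDDYY with the leading zero dropped; '-0' means none."""
    if field.startswith("-"):
        return None
    digits = field.zfill(6)
    month, day, year = int(digits[:2]), int(digits[2:4]), int(digits[4:])
    return date(1900 + year, month, day)


def parse_record(line: str) -> Transaction:
    """Split one query-file line into its labeled fields."""
    m = RECORD.match(line.strip())
    if m is None:
        raise ValueError(f"not a query-file record: {line!r}")
    patron = None if m["p"].startswith("-") else m["p"]
    return Transaction(
        call_number=m["call"],
        code=int(m["code"]),
        transacted=parse_date(m["t"]),
        due=parse_date(m["d"]),
        patron=patron,
        user_category=int(m["u"]),
    )


def by_patron(records: Iterable[Transaction]) -> Dict[str, List[Transaction]]:
    """Group records by patron number, the second access path the text describes."""
    index: Dict[str, List[Transaction]] = defaultdict(list)
    for record in records:
        if record.patron is not None:
            index[record.patron].append(record)
    return dict(index)


sample = [
    "*ql696/p2127/1 *c10 *t 60772 *d 70772 *p 202319 *u3",
    "*ql696/p2127/1 *c51 *t 71172 *d 71872 *p 202319 *u3",
    "*ql737/c2h24/1952/1 *c60 *t 42472 *d -0 *p -0 *u0",
]
records = [parse_record(line) for line in sample]
loans = by_patron(records)
```

in this sketch the first two sample lines describe the same volume: the charge (code 10) and the later system-generated notice line (code 51), which is how a book's status history accumulates in the file. the meaning of codes other than 10 and 51-54 is not given in the text, so the sketch does not label them.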
overview

comparison of statistics between fy 1969/70 and fy 1971/72 showed a 12 percent increase in circulation. during the same period there was a 61 percent increase in the number of people using the library. the circulation department has been able to handle the increased workload more efficiently because of the automated system. a decrease in clerical time required for carrying out the tasks of the department has been realized. the circulation records are now updated five times per week and notices are issued promptly. previously, updating was possible only once in every seven to ten days. service to the user is much faster and more accurate in charging books and in providing information on their status. control of items loaned to users is more effective. information for management of the collection and provision of improved service is available. system disadvantages are related to the mode of data input and lack of author and title information on records. transcription errors occur during manual capture of data at the time the transaction occurs and when the data are entered by the clerical staff from the terminal. correction of errors requires rekeying and reentry of the corrected data for reprocessing. this increases cost in terms of personnel time and equipment use. author and title information is not provided in the query file or on notices sent to users. this is an inconvenience to the user and requires checking of the shelf list by library personnel to provide the information when required. these potential disadvantages were recognized at the time the system was planned. however, they were not considered serious drawbacks. the decision was made to adopt the system and, when additional funds were forthcoming, to provide machine readable input and add author and title information to the records.

costs

the cost of the system during its first year of operation was $10,590.65.
this included monthly charges for rental of equipment, use of ats, storage of records, computer time, and print costs.

ibm 2741 terminal (including phone line)    $1082.86
ats sign on time                             1042.08
ats storage                                  3187.24
computer time and print costs                5278.47
total                                     $10,590.65

unit cost figures are imperfect, but over 69,000 transactions were processed and over 20,000 notices generated at an average unit cost of 11.6 cents. clerical time is not included in this figure. the number of clerical assistants remained constant although, as noted, all phases of the work of the circulation department increased.

future development

in the future, the library hopes to be able to take greater advantage of the on-line query capability of the present system. additional ibm 2741 terminals at selected locations in the library could provide instantaneous file query. while non-routine queries are made on-line, the library now uses printed listings for most routine queries. the installation of automatic data input devices, such as ibm 1030 equipment, would permit reading of coded book cards and patron identification cards with direct transmission of data to ats storage. the hardware and software modification required to implement this additional capability is technically feasible and not financially prohibitive. the present system is to be installed soon in another library on the sunyab campus. implementation should require only minimal software modifications to identify and keep separate the records of the other library. adoption is simplified because of the fact that book cards are not required and that the circulation file consists only of charged materials and not a record of complete library holdings.

acknowledgments

the following individuals contributed their varied talents and support to the development and implementation of the system: mr.
erich meyerhoff, former librarian of the health sciences library; gerald lazorick, systems design programmer, former director, technical information dissemination bureau, sunyab; mrs. jean risley, programmer/analyst; mr. mark fennessy, former library intern at the health sciences library; and the clerical staff of the circulation department, especially barbara helminiak and evelyn hufford.

president’s message: reflections on lita’s past and future
aimee fifarek
information technologies and libraries | september 2016 3

when i reached out to ital editor bob gerrity about my first president’s column, he graciously provided copies of past lita presidents’ columns to get me started. it reminded me once again of the illustrious company i am in, starting with stephen r. salmon, the first president of the information services and automation division, as we were known until 1977. i am proud to be at the head of lita as it begins to celebrate its 50th anniversary year. a half century ago when lita was founded the world was experiencing an era of profound technological change. the us and soviet union were battling to be first in the space race, and an increasing number of world powers were engaging in nuclear testing. while civil rights demonstrations and the fighting in vietnam dominated the news, we were imagining peace via the technologically-driven future depicted in a new tv series called star trek. with tv focused on the stars, we were able to go to the movies and explore the strange new world of inner space in fantastic voyage. technology was poised to enter our daily lives as well, with diebold demonstrating the first atm1 and ralph h. baer writing the 4-page paper that would lay the foundation for the video game industry.2 heady times for technology indeed, and the fact that libraries were sufficiently advanced to require an association dedicated to supporting technologists is hardly surprising.
by the time of lita’s founding at the 1966 midwinter meeting in chicago, library automation had been in development for over a decade.3 marc was just being invented, with the first tapes from the library of congress scheduled to go to the sixteen pilot libraries later that year. membership in the only organization that existed, the committee on library automation (cola), was restricted to the handful of professionals who either developed or managed existing library systems. but technology was beginning to impact many more librarians than just those rarified few. according to president salmon, “it was clear that large numbers of librarians who didn't meet cola's standards for membership were in need of information on library automation and wanted leadership.”4 the first meeting of our division on july 14, 1966 at the ala annual conference in new york was attended by several hundred librarians interested in information sharing, technology standards, and technology training for library staff. this group created the first mission, vision, and bylaws that set us on a 50-year path of success.

aimee fifarek (aimee.fifarek@phoenix.gov) is lita president 2016-17 and deputy director for customer support, it and digital initiatives at phoenix public library, phoenix, az.
president’s message | fifarek doi: 10.6017/ital.v35i3.9526

lita is well positioned to take the first steps into our next 50 years. thanks to the efforts of last year’s lita board, we are on the verge of adopting a new two-year strategic plan that is designed to guide us through the current transitional period. it will be accompanied by a tactical plan that will allow us to document our accomplishments and set the stage for an ongoing culture of continuous planning. also, jenny levine has proven to be extremely capable as she completes her first year as lita executive director.
she has just the right combination of ala experience, technology know-how, and calm competence to guide us through the retooling and reimagining that is required to take a middle-aged association into the next phase of its life. the four areas of focus in the new strategic plan will help us to balance our efforts between preserving the strengths of our past and adapting our organization for a successful future. the first area of focus, member engagement, shows that our primary commitment needs to be to lita members. without you, lita would not exist. one of the key efforts is to increase the value of lita for members who are unable to travel to conferences. with travel budgets down and staying low, online member engagement is an area all of ala needs to improve, and who better to lead in this area than lita. the next area, organizational sustainability, is all about keeping the infrastructure of the organization strong, much of which happens in the domain of lita staff. budgeting, quality communication, and strategic planning all live here. the section on education and professional development recognizes the important role that webinars, online courses, online journal, and print publications play in allowing lita members to share their knowledge on both cutting edge and practical topics with the rest of the association and ala in general. we are already doing great work here and we need to better support and expand these efforts. the last focus area, advocacy and information policy, represents a future growth area for lita. now that everyone in the library world "does" technology to a certain extent, lita needs to think about how we will differentiate ourselves as outside competencies increase. our advantage is that we have been doing and thinking about technology for much longer than anyone else. with our vast wealth of experience, it's appropriate that we work to become thought leaders and implementers in the information policy realm. 
in this, as always, we return to where we started: our members. lita has thrived over the last 50 years because of this, our most important resource. lita was founded on the concept of sharing information about technology through conversation, publications, and knowledge creation. we endure because you, the committed, passionate information professionals are willing to share what you know with those who come after. and like our founders, there are always individuals who are willing to take on the mantle of leadership, whether through getting elected to lita board, becoming a committee or interest group chair, serving in key editorial roles for our monographs, journal, and blog, or joining the all-important lita staff. thanks to all of you who make lita’s future happen every day. i am proud to be in your company.

references

1. alan taylor, “50 years ago: a look back at 1966,” the atlantic photo, march 23, 2016, http://www.theatlantic.com/photo/2016/03/50-years-ago-a-look-back-at-1966/475074/, photo 46.
2. “take me back to august 30, 1966,” http://takemeback.to/30-august-1966#.v8szitlrtaq.
3. “library technology timeline,” http://web.york.cuny.edu/~valero/timeline_reference_citations.htm.
4. stephen r. salmon, “lita’s first 25 years, a brief history,” http://www.ala.org/lita/about/history/1st25years.

president’s message
andromeda yelton
information technology and libraries | december 2017 2

andromeda yelton (andromeda.yelton@gmail.com) is lita president 2017-18 and senior software engineer, mit libraries, cambridge, united states.

before i dive into my column, i’d like to recognize and thank bob gerrity for his six years of service as ital’s editor in chief.
he oversaw our shift from a traditional print journal to a fully online one, recognized by micah vandegrift and chealsye bowley as having the strongest open-access policies of all lis journals (http://www.inthelibrarywiththeleadpipe.org/2014/healthyself/). i’d like to further extend a welcome to ken varnum as our new editor in chief. ken’s distinguished record of lita service includes stints on the ital editorial board and the lita board of directors, so he knows the journal very well and i am enthusiastic about its future under his lead. i’m particularly curious to see what will be discussed in ital under ken’s leadership because i’ve just come back from two outstanding conferences which drove home the significance of the issues we wrestle with in library technology, and i’m looking forward to a third. in early november, i attended lita forum in scenic denver. the schedule was packed with sessions on intriguing topics – too many, of course, for me to attend them all – but two in particular stand out to me. in one, sam kome detailed how he’s going about a privacy audit at the claremont colleges library. he walked us through an extensive – and sometimes surprising – list of places personally identifiable information can lurk on library and campus systems, and talked through what his library absolutely needs (which is less than he’d thought, and far less than the library has been logging without thinking about it). in the other, mary catherine lockmiller took a design thinking approach to serving transgender populations. she shared a fantastic, practical libguide (http://libguides.southmountaincc.edu/transgenderresources), but the part that stuck with me most is her statement that many trans people may never physically enter a library because public spaces are not safe spaces; for this population, our electronic services are our public services. as technologists, we create the point of first, and maybe only, contact. 
a week later, i attended the inaugural data for black lives conference (http://d4bl.org/) at the mit media lab, steps from my office. this was – and i think everyone in the room felt it – something genuinely new. from the galvanizing topic, to the sophisticated visual and auditory design, to the frisson of genius and creativity buzzing all around a room of artists, activists, professors, poets, data scientists and software engineers, it was a remarkable experience for us all. those of you who heard dr. safiya noble speak at thomas dowling’s lita president’s program in 2016 are familiar with algorithmic bias. numerous speakers discussed this at d4bl: the ways that racial disparities in underlying data sets can be replicated, magnified, and given a veneer of objective power when run through the black boxes that power predictive policing or risk assessment for bail hearings. absent and messy data was a theme as well: in a moment that would make many librarians chuckle (and then wince) knowingly, a panel of music industry executives estimated that 40% of their metadata is wrong, thus making it impossible to credit and compensate artists appropriately.

president’s message | yelton https://doi.org/10.6017/ital.v36i4.10238

and yet – in a memorable keynote – dr. ruha benjamin called on us not only to collect data about black death, as she showed us an image of the ambulance bill sent to tamir rice’s family, but to listen to our artists and poets as we use our data to imagine black life – this in front of an image of wakanda. with our data and our creativity, what new worlds can we map?
several of my mit colleagues also attended d4bl, and as we discussed it afterward we started thinking about how these ideas can drive our own work. how does the imaginary world of wakanda connect to the archival imaginary, and what worlds can we empower our own creators to imagine with what we collect and preserve? how can we use our data literacy and access to sometimes un-googleable resources to help community groups collate data on important issues that are not tracked by our public institutions, such as police violence (https://mappingpoliceviolence.org/) or racial disparities in setting bail? with these ideas swirling in my mind, i am looking forward with tremendous excitement to lita forum 2018. building on the work of our forum assessment task force, we’ll be doing a lot of things differently; in particular, aiming for lots of hands-on, interactive sessions. this will be a conference where, whether you’re a presenter or an attendee, you’ll be able to do things. and these last two conferences have driven home for me how very much there is to do in library technology. our work to select, collect, preserve, clean, and provide access to data can indeed have enormous impact. technology services are front-line services.

self-archiving with ease in an institutional repository: microinteractions and the user experience
sonya betz and robyn hall
information technology and libraries | september 2015 43

abstract

details matter, especially when they can influence whether users engage with a new digital initiative that relies heavily on their support.
during the recent development of macewan university’s institutional repository, the librarians leading the project wanted to ensure the site would offer users an easy and effective way to deposit their works, in turn helping to ensure the repository’s long-term viability. the following paper discusses their approach to user-testing, applying dan saffer’s framework of microinteractions to how faculty members experienced the repository’s self-archiving functionality. it outlines the steps taken to test and refine the self-archiving process, shedding light on how others may apply the concept of microinteractions to better understand a website’s utility and the overall user experience that it delivers.

introduction

one of the greatest challenges in implementing an institutional repository (ir) at a university is acquiring faculty buy-in. support from faculty members is essential to ensuring that repositories can make online sharing of scholarly materials possible, along with the long-term digital preservation of these works. many open access mandates have begun to emerge around the world, developed by universities, governments, and research funding organizations, which serve to increase participation through requiring that faculty contribute their works to a repository.1 however, for many staff managing irs at academic libraries there are no enforceable mandates in place, and only a fraction of faculty works can be contributed without copyright implications when author agreements transfer copyrights to publishers. persuading faculty members to take the time to sort through their works and self-archive those that are not bound by rights restrictions is a challenge.
standard installations of popular ir software, including dspace, digital commons, and eprints, do little to help facilitate easy and efficient ir deposits by faculty. as dorothea salo writes in a widely cited critique of irs managed by academic libraries, the “‘build it and they will come’ proposition has been decisively wrong.”2 a major issue she points out is that repositories were predicated on the “assumption that faculty would deposit, describe, and manage their own material.”3 seven years after the publication of her article, a vast majority of the more than 2,600 repositories currently operating around the world still function in this way and struggle to attract widespread faculty support.4 to deposit works into these systems, faculty are often required to fill out an online form to describe and upload each work individually. this can be a laborious process that includes deciphering lengthy copyright agreements, filling out an array of metadata fields, and ensuring file formats or file sizes that are compatible with the constraints of the software.

sonya betz (sonya.betz@ualberta.ca) is digital initiatives project librarian, university of alberta libraries, university of alberta, edmonton, alberta. robyn hall (hallr27@macewan.ca) is scholarly communications librarian, macewan university library, macewan university, edmonton, alberta.
self-archiving with ease in an institutional repository | betz and hall doi: 10.6017/ital.v34i3.5900

in august of 2014, macewan university library in edmonton, alberta, launched an ir, research online at macewan (ro@m; http://roam.macewan.ca).
our hope was that ro@m’s simple user interface and straightforward submission process would help to bolster faculty contributions. the site was built using islandora, an open-source software framework that offered the project developers substantial flexibility in appearance and functionality. in an effort to balance their desire for independence over their work with ease of use, faculty and staff have the option of submitting to ro@m in one of two ways: they can choose to complete a brief process to create basic metadata and upload their work, or they can simply upload their work and have ro@m staff create metadata and complete the deposit. thoroughly testing both of these processes was critical to the success of the ir. we wanted to ensure that there were no obstacles in the design that would dissuade faculty members from contributing their works once they had made the decision to start the contribution process. as the primary means of adding content to the ir, and as a process that other institutions have found problematic, carefully designing each step of how a faculty contributor submits material was our highest priority. to help us focus our testing on some of these important details, and to provide a framework of understanding for refining our design, we turned to dan saffer’s 2013 book microinteractions: designing with details. the following case study describes our use of microinteractions as a user-testing approach for libraries and discusses what we learned as a result.
we seek to shed light on how other repository managers might envision and structure their own self-archiving processes to ensure buy-in while still relying on faculty members to do some of the necessary legwork. additionally, we lay out how other digital initiatives may embrace the concept of microinteractions as a means of better understanding the relationship between the utility of a website and the true value of positive user experience.

literature review

user experience and self-archiving in institutional repositories

user experience (ux) in libraries has gained significant traction in recent years and provides a useful framework for exploring how our users are interacting with, and finding meaning in, the library technologies we create and support. although there is still some disagreement around the definition and scope of what exactly we mean when we talk about ux, there seems to be general consensus that paying attention to ux shifts focus from the usability of a product to more nonutilitarian qualities, such as meaning, affect, and value.5 hassenzahl simply defines ux as a “momentary, primarily evaluative feeling (good-bad) while interacting with a product or service.”6 hassenzahl, diefenbach, and goritz argue that positive emotional experiences with technology occur when the interaction fulfills certain psychological needs, such as competence or popularity.7 the 2010 iso standard for human-centered design for interactive systems defines ux even more broadly, suggesting that it “includes all the users’ emotions, beliefs, preferences, perceptions, physical and psychological responses, behaviors and accomplishments that
occur before, during and after use.”8 however, when creating tools for library environments, it can be difficult for practitioners to translate ambiguous emotional requirements, such as satisfying emotional and psychological needs or increasing motivation, into pragmatic outcomes, such as developing a piece of functionality or designing a user interface. it has been well documented that repository managers struggle to motivate academics to self-archive their works.9 however, the literature focusing on how ir websites’ self-archiving functionality helps or hinders faculty support and engagement is sparse. one study of note was conducted by kim and kim in 2006, who led usability testing and focus groups on an ir in south korea.10 they provide a number of ways to improve usability on the basis of their findings, which include avoiding jargon terms and providing comprehensive instructions at points of need rather than burying them in submenus. similarly, veiga e silva, goncalves, and laender reported results of usability testing conducted on the brazilian digital library of computing, which confirmed their initial goals of building a self-archiving service that was easily learned, comfortable, and efficient.11 the authors of both of these studies suggest that user-friendly design could help to ensure the active support and sustainability of their services, but long-term use remained to be seen at the time of publication.
meanwhile, bell and sarr recommend integrating value-added features into ir websites as a way to attract faculty.12 their successful strategy for reengineering a struggling ir at the university of rochester included adding tools to allow users to edit metadata and add and remove files, and providing portfolio pages where faculty could list their works in the ir, link to works available elsewhere, detail their research interests, and upload a copy of their cv. although the question remains as to whether a positive user experience in an ir can be a significant motivating factor for increasing faculty participation, there seems to be enough evidence to support its viability as an approach.

applying microinteractions to user testing

dan saffer’s 2013 book, microinteractions: designing with details, follows logically from the ux movement. although he uses the phrase “user experience” sparingly, saffer consistently connects interactive technologies with the emotional and psychological mindset of the user. saffer focuses on “microinteractions,” which he defines as “a contained product moment that revolves around a single use case.”13 saffer argues that well-designed microinteractions are “the difference between a product you love and a product you tolerate.”14 saffer’s framework is an effective application of ux theory to a pragmatic task. not only does he privilege the emotional state of the user as a priority for design, he also provides concrete recommendations for designing technology that provokes positive psychological states such as pleasure, engagement, and fun.

self-archiving with ease in an institutional repository | betz and hall doi: 10.6017/ital.v34i3.5900 46
defining what we mean by a “microinteraction” is important when translating saffer’s theory to a library environment. he describes a microinteraction as “a tiny piece of functionality that only does one thing . . . every time you change a setting, sync your data or devices, set an alarm, pick a password, turn on an appliance, log in, set a status message, or favorite or like something, you are engaging with a microinteraction.”15 in libraries, many microinteractions are built around common user tasks such as booking a group-use room, placing a hold on an item, registering for an event, rating a book, or conducting a search in a discovery tool. a single piece of interactive library technology may have any number of discrete microinteractions, which are often part of a larger ecosystem of connected processes. for example, an integrated library system is composed of hundreds of microinteractions designed both for end users and library staff, while a self-checkout machine is primarily designed to facilitate a single microinteraction.

saffer’s framework provided a valuable new lens on how we could interpret users’ interactions with our ir. while we generally conceptualize an ir as a searchable collection of institutional content, we can also understand it as a collection of microinteractions. for example, ro@m’s core consists of microinteractions that enable tasks such as searching content, browsing content, viewing and downloading content, logging in, submitting content, and contacting staff. ro@m also includes microinteractions for staff to upload, review, and edit content.
as discussed above, one of the primary goals when developing our ir was to allow faculty to deposit scholarly content, such as articles and conference papers, directly to the repository. we wanted this process to be simple and intuitive, and for faculty to have some control over the assignation of keywords and other metadata, but also to have the option to simply submit content with minimal effort. we decided to employ user testing to carefully examine the deposit process as a discrete microinteraction and to apply saffer’s framework as a means of assessing both functionality and ux. we hoped that focusing on the details of that particular microinteraction would allow us to make careful and thoughtful design choices that would lead to a more consistent and pleasurable ux.

method and case study

we conducted two rounds of user testing for the self-archiving process. our initial user testing was conducted in january 2014. we asked seven faculty to review and comment on a mockup of the deposit form to test the workflow. this simple exercise allowed us to confirm the steps in the upload process, and identified a few critical issues that we could resolve before building out the ir in islandora. after completing the development of the ir, and with a working copy of the site installed on our user acceptance testing (uat) server, we conducted a second round of in-depth usability testing within our new microinteraction framework.

in april 2014 we recruited six faculty members through word of mouth and through a call for participants in the university’s weekly electronic staff newsletter.
the volunteers represented major disciplines at macewan university, including health sciences, social sciences, humanities, and natural sciences. saffer describes a process for testing microinteractions and suggests that the most relevant way to test them is to include “hundreds (if not thousands) of participants.”16 however, he goes on to describe the most effective methods of testing as qualitative, including conversation, interviews, and observation. testing thousands of participants with one-on-one interviews and observation sessions is well beyond the means of most academic libraries, and runs counter to standard usability testing methodology. while six participants may seem like a small number, and one that is apt to render inconclusive results and sparse feedback, a sample of this size is strongly supported by usability experts such as jakob nielsen. during the course of our testing, we quickly reached what nielsen refers to in his piece “how many test users in a usability study?” as “the point of diminishing returns.”17 he suggests that for most qualitative studies aimed at gathering insights to inform site design and overall ux, five users is in fact a suitable number of participants. we support his recommendation on the basis of our own experiences; by the fourth participant, we were receiving very repetitive feedback on what worked well and what needed to be changed.

testing took place in faculty members’ offices on their own personal computers so that they would have the opportunity to engage with the site as they would under normal workday circumstances.
each user testing session lasted 45 to 60 minutes, and was facilitated by three members of the ro@m team: the web and ux librarian guided each faculty member through the testing process, the scholarly communications librarian observed the interaction, and a library technician took detailed notes recording participant comments and actions. each faculty member was given an article and asked to contribute that article to ro@m using the uat site. the ro@m team observed the entire process carefully, especially noting any problematic interactions, while encouraging the faculty member to think aloud. once testing was complete, the scholarly communications librarian analyzed the notes and identified areas of common concern and confusion among participants, as well as several suggestions that the participants made to improve the site’s functionality as they worked through the process. she then made changes to the site based on this feedback. as we discuss in the next section, each task that faculty members performed, from easy to frustrating, represented an interaction with the user interface that affected participants’ experiences of engaging with the contribution process, and informed changes we were able to make before launching the ir service three months later.

basic elements of microinteractions

saffer’s theory describes four primary components of a microinteraction: the trigger, rules, feedback, and loops and modes.
viewing the ir upload tool as a microinteraction intended to be efficient and user-friendly required us to first identify each of these different components as they applied to the contribution process (see figure 1), and then evaluate the tool as a whole through our user testing.

figure 1. ir self-archiving process with microinteraction components.

trigger

the first component to examine in a microinteraction is the trigger, which is, quite simply, “whatever initiates the microinteraction.”18 on an iphone, the trigger might be the icon that launches an app; on a dishwasher, the trigger would be the button pressed to start the machine; on a website, a trigger could be a login button or a menu item. well-designed triggers follow good usability principles: they appear when and where the user needs them, they initiate the same action every time, and they act predictably (for example, buttons are pushable, toggles slide).

examining our trigger was a first step in assessing how well our upload microinteraction was designed. uploading and adding content is a primary function of the ir, and the trigger needed to be highly noticeable. we could assume that users would be goal-based in their approach to the ir; faculty would be visiting the site with the specific purpose of uploading content and would be actively looking for a trigger to begin an interaction that would allow them to do so.

the initial design of ro@m included a top-level menu item as the only trigger for contributing works.
in the persistent navigation at the top of the site, users could click on the menu item labeled “contribute,” where they would then be presented with a login screen to begin the contribution process. this was immediately obvious to half of the participants during user testing. however, the other half immediately clicked on the word “share,” which appeared beside a small icon on the lower half of the page. along with the words “discover” and “preserve,” it had been included simply to add some aesthetic appeal to the homepage. not surprisingly, the users were interpreting the word and icon as a trigger. because of the user behavior that we observed, we decided to add hyperlinks to all three of these words, with “share” linking to the contribution login screen (see figure 2), “discover” leading to a browse page, and “preserve” linking to an faq for authors page that included information on digital preservation. this significantly increased the visibility of the trigger for the microinteraction.

figure 2. “share” as additional trigger for contributing works.

rules

the second component of microinteractions described by saffer is the rules. rules are the parameters that govern a microinteraction; they provide a framework of understanding to help users succeed at completing the goal of a microinteraction by defining “what can and cannot be done, and in what order.”19 while users don’t need to understand the engineering behind a library self-checkout machine, for example, they do need to understand what they can and cannot do when they’re using the machine.
the hardware and software of a self-checkout machine are designed to support the rules by encouraging users to scan their cards to start the machine, to align their books or videos so that they can be scanned and desensitized, and to indicate when they have completed the interaction.

the goal when designing a self-archiving process in ro@m was to ensure that the rules were easy for users to understand, followed a logical structure, and were not overly complex. to this end, we drew on saffer’s approach to designing rules for microinteractions, along with the philosophy espoused by steve krug in his influential web design book, don’t make me think: a common sense approach to web usability.20 both krug and saffer argue for reducing complexity and removing decision-making from the user whenever possible to reduce the potential for user error.

the rules in ro@m follow a familiar form-based approach: users log in to the system, they have to agree to a licensing agreement, they create some metadata for their item, and they upload a file (see figure 1). however, determining the order for each of these elements, and ensuring that users could understand how to fill out the form successfully, required careful thinking that was greatly informed by the user testing we conducted.

for example, we designed ro@m to connect to the same authentication system used for other university applications, ensuring that faculty could log in with the credentials they use daily for institutional email and network access.
forcing faculty to create, and remember, a unique username and password to submit content would have increased the possibility of login errors and resulted in confusion and frustration. we also used drop-down options where possible throughout the microinteraction instead of requiring faculty to input data such as file types, faculty or department names, or content types into free-text boxes.

during our user testing we found that those fields where we had free-text input for metadata entry most often led to confusion and errors. for instance, it quickly became apparent that name authority would be an issue. when filling out the “author” field, some people used initials, some used middle names, and some added “dr” before their name, which could negatively affect the ir’s search results and the ability to track where and when these works may be cited by others. when asked to include a citation for published works, most of our participants noted frustration with this requirement because they could not do so quickly, and they had concerns about creating correct citations. finally, many participants also became confused at the last, optional field in the form that allowed them to assign a creative commons license to their works.

our user testing indicated that we would need to be mindful of how information like author names and citations was entered by users before making an item available on the site.
under ideal circumstances, we would have modified the form to ensure that any information that the system knew about the user was brought forward: what saffer calls “don’t start from zero.”21 this could include automatically filling in details like a user’s name. however, like many libraries, we chose to adapt existing software rather than develop our microinteraction from the ground up; implementing such changes would have been too time-consuming or expensive. instead, we added workflows to allow administrators to edit the metadata before a contribution was published to the web so that we could correct any errors. we also changed the “citation” field to “publication information” to imply that users did not need to include a complete citation. lastly, we made sure that “all rights reserved” was the default selection for the optional “add a creative commons license?” field in the form because this was language with which our users were familiar and with which they were comfortable proceeding.

policy constraints are another aspect of the rules that provide structure around microinteractions, and can also limit the design choices available. having faculty complete a nonexclusive licensing agreement that acknowledged they had the appropriate copyright permissions to allow them to contribute the work was a required component in our rules. without the agreement, we would risk liability for copyright infringement and could not accept the content into the ir. however, our early designs for the repository included this step at the end of the submission process, after faculty had created metadata about the item.
our initial round of testing revealed that several of our participants were unsure of whether they had the appropriate copyright permissions to add content and didn’t want to complete the submission, a frustrating experience for them after spending time filling out author information, keywords, abstract, and the like. we attempted to resolve this issue by moving the agreement much earlier in the process, forcing users to acknowledge the agreement before creating any metadata. we also used simple, straightforward language for the agreement and added information about how to determine copyrights or contact ro@m staff for assistance. integrating an api that could automatically search a journal’s archiving policies in sherpa romeo at this stage in the contribution process is something we plan to investigate to help reduce complexity further for users.

feedback

understanding the concept of feedback is critical to the design of microinteractions. while most libraries are familiar with collecting feedback from users, the feedback saffer describes flows in the opposite direction: it is the feedback the application or interface provides back to users. this feedback gives users information when and where they need it to help them navigate the microinteraction. as saffer comments, “the true purpose of feedback is to help users understand how the rules of the microinteraction work.”22

feedback can be provided in a variety of ways.
an action as simple as a color change when a user hovers over a link is a form of feedback, providing visual information that indicates that a segment of text can be clicked on. confirmation messages are an obvious form of feedback, while a folder with numbers indicating how many items have been added to it is more subtle. while visual feedback is most commonly used, saffer also describes cases where auditory and haptic (touch) feedback may be useful. designing feedback, much like designing rules, should aim to reduce complexity and confusion for a user, and should be explicitly connected both functionally and visually to what the user needs to know.

in an online web environment, much of the feedback we provide the user should be based on good usability principles. for example, formatting web links consistently and providing predictable navigation elements are some ways that feedback can be built into a design. providing feedback at the users’ point of need is also critical, especially error messages or instructional content. this proved to be especially important to our ro@m test subjects. while the ir featured an “about” section accessible in the persistent navigation at the top of the website that contained detailed instructions and information about how to submit works, and terms of use governing these submissions, this content was virtually invisible to the users we observed. instead, they relied heavily on the contextual feedback that was included throughout the contribution process when it was visible to them.

these observations led us to rethink our approach to providing feedback in several cases.
for example, an unfortunate constraint of our software required users to select a faculty or school and a department and then click an “add” button before they could save and continue. we included some instructions above the drop-down menus, stating “select and click add” in an effort to prevent any errors. however, our participants failed to notice the instructions and inevitably triggered a brief error message (see figure 3). we later changed the word “add” in the instructions from black to bright red, hoping to increase its visibility, and we ensured that the error message that displayed when users failed to select “add” clearly explained how to correct the problem and move on. we also observed that the plus signs for adding further authors and keywords were not visible to users. we added feedback that included both text and icons with more detail (see figure 4). however, this remains a problem for users that we will need to explore further. on completing a contribution, users receive a confirmation page that thanks them for the contribution, provides a timeline for when the item will appear on the site, and notes that they will receive an email when it appears. response to this page was positive, as it succinctly covered all of the information the users felt they needed to know after completing the process.

figure 3. feedback for the “add” button.

figure 4. feedback for adding multiple authors and keywords.
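the “add” button constraint and its corrective error message can be sketched in code. the following is a minimal illustration only, assuming a simplified form model; the class, function, and message wording are hypothetical and are not drawn from islandora or from the ro@m codebase. it shows the principle described above: when a user tries to continue without clicking “add,” the feedback names the missed step rather than reporting a generic failure.

```python
from dataclasses import dataclass, field


@dataclass
class AffiliationStep:
    """Hypothetical model of the faculty/department step in a deposit form."""
    selected_faculty: str = ""
    selected_department: str = ""
    # Affiliations the user has confirmed by clicking "Add".
    added: list = field(default_factory=list)

    def add(self):
        """Mimic clicking "Add": move the current selection into the confirmed list."""
        if self.selected_faculty and self.selected_department:
            self.added.append((self.selected_faculty, self.selected_department))
            self.selected_faculty = ""
            self.selected_department = ""


def feedback_on_continue(step):
    """Return an error message for "Save and continue", or None if the step is valid."""
    if step.added:
        return None
    if step.selected_faculty and step.selected_department:
        # The user chose values but skipped "Add": name the exact missing action.
        return ('Your selection has not been added yet. Click "Add" next to the '
                'department menu, then choose "Save and continue".')
    # Nothing chosen yet: describe the whole sequence.
    return ('Select a faculty or school and a department, click "Add", '
            'then choose "Save and continue".')
```

in this sketch the message differs depending on how far the user got, which is one way to deliver feedback at the point of need instead of relying on instructions that users may overlook.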
modes and loops

the final two components of microinteractions defined by saffer are modes and loops. saffer describes a mode as a “fork in the rules,” or a point in a microinteraction where the user is exposed to a new process, interface, or state.23 for example, google scholar provides users with a setting to show “library access links” for participating institutions with openurl-compatible link resolvers.24 users who have set this option are presented with a search results page that is different from the default mode and includes additional links to their chosen institution’s link resolver. our microinteraction includes two distinct modes. once logged in, users can choose to contribute works through the “do it yourself” submission that we’ve described here in some detail, or they can choose “let us do it” and complete a simplified version that requires them to acknowledge the licensing agreement, upload their files, and provide any additional data they choose in a free-text box (see figure 5). the majority of our testers specified that they would opt for the “do it yourself” option because they wanted to have control over the metadata describing their work, including the abstract and keywords. however, since launching the repository, several submissions have arrived via the “let us do it” form, which suggests a reasonable amount of interest in this mode.

figure 5. the “let us do it” form.

loops, on the other hand, are simply a repeating cycle in the microinteraction.
a loop could be a process that runs in the background, checking for network connections, or it could be a more visible process that adapts itself on the basis of the user’s behavior. for example, in the ro@m submission process users can move backward and forward in the contribution forms; both have “previous” and “save and continue” buttons on each page to allow users to navigate easily. the final step on the “do it yourself” form allows users to review their metadata and the file that they have uploaded. they can then use the “previous” button to make changes to what they have entered before completing the submission. ideally, users would be able to edit this content directly from this review page, but software constraints prevented us from including this feature, and the “previous” button did not pose any major challenges for our testing participants. another example of a loop in ro@m is a “contribute more works” button embedded in the confirmation screen that takes users back to the beginning of the microinteraction. this feature was suggested by one of our participants, and it extends the life of the microinteraction, potentially leading to additional contributions.

discussion and conclusions

focusing on the details of the self-archiving process in our ir provided extremely rich qualitative data for improving the user interface, while analyzing the structure of the microinteraction, following saffer’s model, was also a valuable exercise in thinking about user needs and software design from a perspective different from that of standard usability studies.
the improvements we made, based on both saffer’s theory and the results we observed through testing, added significant functionality and ease of use to the self-archiving process for faculty. thinking carefully about elements like placement of buttons, small changes in wording or flow, and timing of instructional or error feedback highlighted the big effect small elements can have on usability.

however, there are some limitations to both the theory and our approach to testing and improving the ir that affect how well we can understand and utilize the results. of particular concern is how well this kind of testing can capture the ux of a faculty member beyond the utility or ease of use of the interaction. in an observational study we can rely on comments from participants and key statements that may indicate a participant’s emotional or affective state, but we didn’t include targeted questions to gather this data and focused instead on the details of the microinteraction. we didn’t ask how they felt while using the ir, or if successfully uploading an item to the ir gave them a sense of autonomy or competence, or if this experience would encourage them to submit content in the future. nevertheless, improving usability is a solid foundation for providing a positive ux.
hassenzahl describes the difference between “do-goals” (completing a task) and “be-goals” (human psychological needs like being competent, or developing relationships).25 while he argues that “be-goals” are the ultimate drivers of ux, he also suggests that creating tools that make the completion of do-goals easy can facilitate the potential fulfillment of be-goals by removing barriers and making their fulfillment more likely. ultimately, however, a range of user testing strategies can lead to improvements in a user interface, whether that testing relies on carefully detailed examination of a microinteraction, analysis of large data sets from google analytics, or interviews with key user groups. microinteraction theory is a useful approach, and valuable in its conceptualization, but it should be one of many tools libraries adopt to improve their online ux.

similarly, focusing on the ux of irs must be only one of many strategies institutions employ to improve rates of faculty self-archiving. recent studies argue that regardless of platform or process, faculty-initiated submissions have proven to be uncommon.26 instead, they suggest that sustainability relies on marketing, direct outreach with individual faculty members, and significant staff involvement in identifying content for inclusion, investigating rights, and depositing on authors’ behalf. it would be shortsighted to suggest that relying solely on designing a user-friendly website, or only developing savvy promotional and outreach efforts, can determine the ongoing success of an ir initiative.
gaining and maintaining support is an ongoing, multifaceted process, and largely depends on the academic culture of an institution as well as available financial and staffing resources. as such, user testing offers qualitative insights into ways that processes and functions might be improved to enhance the viability of ir initiatives in tandem with a variety of marketing and outreach

references

1. “welcome to roarmap,” university of southampton, 2014, http://roarmap.eprints.org.
2. dorothea salo, “innkeeper at the roach motel,” library trends 57, no. 2 (2008): 98, http://muse.jhu.edu/journals/library_trends.
3. ibid., 100.
4. “the directory of open access repositories—opendoar,” university of nottingham, uk, 2014, http://www.opendoar.org.
5. effie l-c law et al., “understanding scoping and defining user experience: a survey approach,” computer-human interaction 2009: user experience (new york: acm press, 2009), 719.
6. marc hassenzahl, “user experience (ux): towards an experiential perspective on product quality,” proceedings of the 20th international conference of the association francophone d’interaction homme-machine (new york: acm press, 2008), 11, http://dx.doi.org/10.1145/1512714.1512717.
7. marc hassenzahl, sarah diefenbach, and anja göritz, “needs, affect, and interactive products: facets of user experience,” interacting with computers 22, no. 5 (2010): 353–62, http://dx.doi.org/10.1016/j.intcom.2010.04.002.
8. international standards organization, human-centred design for interactive systems, iso 9241-210 (geneva: iso, 2010), section 2.15.
9. see philip m. davis and matthew j.l.
connolly, “institutional repositories: evaluating the reasons for non-use of cornell university’s installation of dspace,” d-lib magazine 13, no. 3/4 (2007), http://www.dlib.org; ellen dubinsky, “a current snapshot of institutional repositories: growth rate, disciplinary content and faculty contributions,” journal of librarianship & scholarly communication 2, no. 3 (2014): 1–22, http://dx.doi.org/10.7710/2162-3309.1167; anthony w. ferguson, “back talk—institutional repositories: wars and dream fields to which too few are coming,” against the grain 18, no. 2 (2006): 86–85, http://docs.lib.purdue.edu/atg/vol18/iss2/14; salo, “innkeeper at the roach motel”; feria wirba singeh, a. abrizah, and noor harun abdul karim, “what inhibits authors to self-archive in open access repositories? a malaysian case,” information development 29, no. 1 (2013): 24–35, http://dx.doi.org/10.1177/0266666912450450.
10. hyun hee kim and yong ho kim, “usability study of digital institutional repositories,” electronic library 26, no. 6 (2008): 863–81, http://dx.doi.org/10.1108/02640470810921637.
11. lena veiga e silva, marcos andré gonçalves, and alberto h. f. laender, “evaluating a digital library self-archiving service: the bdbcomp user case study,” information processing & management 43, no. 4 (2007): 1103–20, http://dx.doi.org/10.1016/j.ipm.2006.07.023.
12. suzanne bell and nathan sarr, “case study: re-engineering an institutional repository to engage users,” new review of academic librarianship 16, no. s1 (2010): 77–89, http://dx.doi.org/10.1080/13614533.2010.5095170.
13.
dan saffer, microinteractions: designing with details (cambridge, ma: o’reilly, 2013), 2.
14. ibid., 3.
15. ibid., 2.
16. ibid., 142.
17. jakob nielsen, “how many test users in a usability study?” nielsen norman group, 2012, http://www.nngroup.com/articles/how-many-test-users.
18. saffer, microinteractions, 48.
19. ibid., 82.
20. steve krug, don’t make me think: a common sense approach to web usability (berkeley, ca: new riders, 2000).
21. saffer, microinteractions, 64.
22. ibid., 86.
23. ibid., 111.
24. “library support,” google scholar, http://scholar.google.com/intl/en-us/scholar/libraries.html.
25. hassenzahl, “user experience,” 10–15.
26. see dubinsky, “a current snapshot of institutional repositories,” 1–22; shannon kipphut-smith, “good enough: developing a simple workflow for open access policy implementation,” college & undergraduate libraries 21, no. 3/4 (2014): 279–94, http://dx.doi.org/10.1080/10691316.2014.932263.

article

navigating uncharted waters: utilizing innovative approaches in legacy theses and dissertations digitization at the university of houston libraries

annie wu, taylor davis-van atta, bethany scott, santi thompson, anne washington, jerrell jones, andrew weidner, a. laura ramirez, and marian smith

information technology and libraries | september 2022
https://doi.org/10.6017/ital.v41i3.14719

annie wu (awu@uh.edu) is head of metadata and digitization services and the ambassador kenneth franzheim ii and mrs. jorgina franzheim endowed professor, university of houston libraries.
taylor davis-van atta (tgdavis-vanatta@uh.edu) is director of the digital research commons, university of houston libraries. bethany scott (bscott3@uh.edu) is head of preservation and reformatting, university of houston libraries. santi thompson (sathompson3@uh.edu) is associate dean for research and student engagement and the eva digital research endowed library professor, university of houston libraries. anne washington (washinga@oclc.org) is semantic applications product analyst, oclc. jerrell jones (jjones46@uh.edu) is digitization lab manager, university of houston libraries. andrew weidner (andrew.weidner@bc.edu) is head of digital production services, boston college libraries. a. laura ramirez (alramirez@uh.edu) is senior library specialist, university of houston libraries. marian smith (mrsmith8@uh.edu) is digital photo tech, university of houston libraries. © 2022. abstract in 2019, the university of houston libraries formed a theses and dissertations digitization task force charged with digitizing and making more widely accessible the university’s collection of over 19,800 legacy theses and dissertations. supported by funding from the john p. mcgovern foundation, this initiative has proven complex and multifaceted, and one that has engaged the task force in a broad range of activities, from purchasing digitization equipment and software to designing a phased, multiyear plan to execute its charge. this plan is structured around digitization preparation (phase one), development of procedures and workflows (phase two), and promotion and communication to the project’s targeted audiences (phase three). 
the plan contains step-by-step actions to conduct an environmental scan, inventory the theses and dissertations collections, purchase equipment, craft policies, establish procedures and workflows, and develop digital preservation and communication strategies, allowing the task force to achieve effective planning, workflow automation, progress tracking, and procedures documentation. the innovative and creative approaches undertaken by the theses and dissertations digitization task force demonstrated collective intelligence resulting in scaled access and dissemination of the university’s research and scholarship that helps to enhance the university’s impact and reputation.

introduction

to implement the university of houston (uh) libraries strategic plan, which calls for positioning the libraries as a campus leader in research and transforming library space to reflect evolving modes of learning and scholarship, the uh libraries launched a cross-departmental task force in 2019 charged with digitizing the university’s extensive print theses and dissertations collection. providing online access to newly digitized theses and dissertations boosts the reach and impact of our institution’s research and scholarship while expanding available space for computing, technology, and faculty and student learning and research activities.
a study by bennett and flanagan revealed the positive impact and benefits of online dissemination of theses and dissertations, including enhanced discoverability by google’s strong indexing capabilities, significant increase in the usage of the works, and an overall enhancement of the reputation of an institution.1 encouraged by the positive outcomes and supported by funding from the john p. mcgovern foundation to initiate this project, the theses and dissertations digitization (tdd) task force developed a phased project plan and utilized creative, automated processes and methods to execute it. this article articulates the tdd project planning and the innovative work undertaken by the task force to achieve efficiency in making our print theses and dissertations readily available to new readerships around the world.

literature review

over the past several decades, research libraries have been building programs around digitization and open access repository infrastructures, largely aimed at expanding their digital collections and engaging communities with newly available research materials. for some, part of their programming has included projects that digitize their institution’s legacy print collections of theses and dissertations. the review below explores literature on the mass digitization process, including institutional case studies, guidance documents, legal and policy papers, and local documentation developed as libraries have planned and implemented these projects.

any library tackling a retrospective thesis and dissertation project needs a framework for determining the copyright status of these works en masse. perhaps it is no coincidence, then, that copyright concerns are the most heavily documented aspect of the process. clement and levine provide the definitive work to date on copyright and the publication status of theses and dissertations written in the united states before 1978.
their study asserts that “pre-1978 american dissertations were considered published for copyright purposes by virtue of their deposit in a university library or their dissemination by a microfilm distributor.”2 they go on to write that, “for copyright purposes, these were acts of publication with the same legal effect as dissemination through presses, publishers, and societies.”3 they suggest that libraries should check theses and dissertations authored between 1909 and 1978 for a copyright notice (typically found on the title page and verso); if there is no copyright notice, then the thesis or dissertation is likely in the public domain and eligible for digitization and public release without permission. moreover, even those works that have a printed copyright notice might have fallen out of copyright if their copyright was not renewed after 28 years for a second term of the same length.4

broad guidance and best practice for copyright status and other matters of process around theses and dissertations is provided in guidance documents for lifecycle management of etds, which acknowledges that legal services may be required for some retrospective thesis and dissertation digitization projects, especially “before scanning without the permission of former students.”5 the authors assert that information professionals should investigate any “appropriate access options” with institutional legal expertise before engaging in a retrospective digitization project and articulate the two most commonly encountered copyright scenarios: “[either] former student authors may not allow the reproduction and open dissemination of their work, or unauthorized copyrighted material was used in the original theses and dissertations.”6 strategies that might be employed to determine copyright status include “consulting with legal counsel at one’s institution to see where it stands on this issue; negotiating with commercial entities that make such content available at a price so that institutions can have some control over it for the purpose of broader access; and working with groups such as alumni associations, colleges, departments, and graduate schools to establish contact with thesis and dissertation authors for securing their permission to digitize, and render available online, their past scholarship.”7

on the question of public access to newly digitized works, the guidance documents detail the implications of the “transition from print to electronic,” which “has led to increased scrutiny over who will be allowed to access the electronic versions and how widely they will be disseminated.”8 when there is any legal doubt, there are many reasons for libraries to exercise caution and restrict access to electronic theses and dissertations; that said, “research available on the web immediately upon submission of the final, approved thesis can prove advantageous to the newly degreed student, the institution, and other researchers.”9 again, consulting legal officers and the original authors, if possible, remains the consensus approach to establishing a strategy for access to digitized theses and dissertations.

the guidance documents also touch on the thorny issue of digitizing theses and dissertations that contain third-party content. they summarize the history and routine application of the fair use doctrine in both the creation and dissemination of scholarly works but provide little firm guidance on the matter.10 indeed, our review of the entire body of literature on retrospective thesis and dissertation projects suggests that this remains a practical challenge that any library undertaking a mass digitization project must consider, accounting for the associated risks.
in recent years, several case studies have documented institutions’ efforts to digitize and make more widely available legacy theses and dissertations. none of the case studies from the institutions that the tdd task force reviewed for the environmental scan attempts an exhaustive documentation of end-to-end workflows and processes developed to execute the task; most focus on particularly difficult questions inherent to the process. martyniak provides a rationale for the university of florida’s (uf) retrospective scanning project and details their process for contacting authors before works were scanned.11 the workflow outlines several points of contact with authors to obtain signed distribution agreements, as well as uf’s approach to automate this process as much as possible. notably, the distribution agreement form and correspondence templates are provided as appendices to the article.12 as part of this retrospective digitization project, uf also released a scanning policy that articulates their approach to determining the copyright status of works and their resultant practice.13 this policy document is an excellent example of an institution’s implementation of clement and levine’s research described above. likewise, mundle describes the methods used by simon fraser university (sfu) to establish its approach to the issues of copyright status and access, ultimately resulting in a public thesis access policy and procedures for contacting authors whenever possible to offer them the ability to opt their work out of the project.14 unlike uf, sfu began scanning before any explicit permission had been obtained from authors.
sfu also shares their use of scripts to automate the ingest of metadata from original marc records into their dspace repository.15 piorun and palmer, meanwhile, focus on an analysis of the time and cost associated with digitizing 300 doctoral dissertations for a newly implemented institutional repository at the university of massachusetts chan medical school.16 piorun and palmer detail the library’s process for obtaining cost comparisons from external vendors as well as estimated costs, including labor, associated with undertaking the task in house.17 issues of workflow, policy development, and permissions are also addressed with an emphasis on developing accurate and streamlined methods of processing works; however, piorun and palmer conclude, “regardless of the amount of planning and thought that goes into a project, there is always the possibility that each record or file will need to be reworked.”18 shreeves and teper discuss theses and dissertations’ complicated status as grey literature and the university of illinois urbana-champaign (uiuc) library’s digitization project, which they describe as “less of a collection management or preservation issue and more as an effort to tackle broader scholarly communication and outreach issues.”19 after consulting with university legal counsel, digitized works were ingested to the uiuc institutional repository as a restricted (campus-only access) collection. as authors provide consent, access to their work is broadened to the public.
worley demonstrates that, according to an analysis of circulation numbers, works that are accessible electronically are used dramatically more than print copies, serving as rationale for undertaking digitization of student works.20 they provide significant detail around virginia polytechnic institute and state university’s process to establish file specifications for its digitization process, and image quality/resolution and file format selection are discussed in some detail, with helpful visual examples.21

these case studies are particularly valuable in that they provide evidence and cautionary tales around how local contexts have made a difference in copyright and workflow issues. this case study contributes to the existing body of literature by attempting to provide an exhaustive, end-to-end description of the retrospective digitization process, from copyright evaluation to physical handling to digitizing with an eye to access controls and digital preservation concerns. furthermore, our approach to digitization at scale incorporates automation at several points throughout the workflow, representing a production improvement to the decade-old case studies we reviewed.

project planning and execution

digitizing a large corpus of print theses and dissertations is a complex process touching areas of equipment, copyright policy, workflows for different sections of the process, progress tracking, preservation, and communication. to handle such a multifaceted project, the tdd task force designed a plan that divided the project activities into three phases (see table 1). phase one is dedicated to tasks of preparation such as the environmental scan, copyright permission investigation, digitization equipment purchasing, and print theses and dissertation inventory. phase two includes activities such as digitization and metadata workflow development, documentation, project tracking, ingestion, and preservation of digitized files.
phase three is mainly for promotion and communication to our researchers on the availability of our digitized theses and dissertations collection. task force members volunteered to serve in subteams for identified specific tasks in each project phase.

table 1. phased planning for the tdd project

theses and dissertation digitization task force
planning phases | task force activities | subteams (*subteam lead)
phase one: preparation | environmental scan | jerrell, anne, crystal, santi*, bethany
phase one: preparation | physical theses/dissertations inventory/retention | bethany
phase one: preparation | copyright permissions and policies | bethany, annie, taylor*
phase one: preparation | purchase equipment | jerrell*, crystal
phase two: workflow development | td digitization workflow | jerrell, crystal*
phase two: workflow development | td metadata workflow | anne*, taylor, annie
phase two: workflow development | td ingest and publishing workflow | andrew, taylor
phase two: workflow development | td progress tracking | annie, andrew
phase two: workflow development | preservation/storage strategy | bethany*, santi
phase three: communication/promotion | promote dtd to colleagues and researchers | taylor*, santi
phase three: communication/promotion | communicate progress to staff and users | annie*, santi*
phase three: communication/promotion | develop training materials for stakeholders | anne*, crystal*

phase one: preparation

a subgroup of the tdd task force conducted an environmental scan of similar theses and dissertations digitization approaches previously used by other institutions. the lead for the subgroup created a google sheet that all group members used to document information found in published literature, public documents, and institutional websites.
the lead assigned group members to review information from institutions with publicly available data, including: the university of florida, the university of north texas, the university of illinois urbana champaign, brigham young university, william and mary university, texas a&m university, the university of arizona, the california institute of technology, the massachusetts institute of technology, iowa state university, xavier university, texas tech university, and the university of british columbia. group members noted relevant information pertaining to a variety of topics focused on theses and dissertations digitization. one of the most prominent was the institution’s response to copyright permissions. the group tried to determine if the institution required author permission before releasing a digitized thesis or dissertation (the “opt in” option), or incorporated policies and procedures that prioritized taking down digitized theses and dissertations once requested by the author (the “opt out” option). they observed software and hardware specifications used by other institutions—critical data that would inform the technology needed to complete a project of this scale. the group documented the key components of the digitization and metadata workflows, including roles and responsibilities, sequencing of actions, and the implications that policies and procedures had on the process. this data helped the group understand what gaps, common problems, and emerging best practices existed. finally, the group reviewed physical retention and preservation strategies articulated by institutions to ensure it understood the long-term stewardship hurdles and requirements for analog and digital material. 
based on the assessment of the 19,800 uh theses and dissertations identified for inclusion in the project, the digitization subgroup members determined that several scanners would be required for agility in digitization production. the tdd digitization workflow was designed so that this project could run effectively, in parallel, with existing digitization projects regardless of the need for some theses to be scanned on existing equipment. automatic document feed (adf) scanners were a strategic choice for the rapid scanning of disbound items. two canon dr-g2110 adf scanners were purchased for the project. these scanners were chosen for their scanning speed, scanning quality, ease of use, onboard image preprocessing, and reasonable price point. the canon dr-g2110 can handle a large page stack, approximately 500 pages. theses and dissertations can be scanned on the longer or shorter dimension, which allows faster scanning times. among many innovative features, this duplex scanner simultaneously digitizes both sides of a document, rotates the pages based on text orientation, and auto-crops through preprocessing during the scanning process. this canon adf solution makes more image postprocessing automation possible since the resulting scans match our output expectations with minimal user input. other scanning options were needed for a smaller subset of theses and dissertations that could not be disbound. the digitization team leveraged an existing zeutschel os 12002 planetary scanner for items that could not be disbound. an existing plustek opticbook (po) a300 plus was used for items with foldouts containing graphs, maps, and illustrations that measure beyond 11 inches on the longest dimension.
additionally, a plustek opticbook 3800l was purchased to accommodate fragile us letter–sized pages that are not suitable for adf scanning. thin or heavily waxed papers typically do not stand up well to the fast-moving rubber rollers and other internal scanning mechanisms. while the po 3800l has a much longer scanning time than the po a300, both scanners can scan into the page gutter of bound materials, a useful feature for items with insufficient margins.

figure 1. image processing workflow testing on a thesis in limb processing. the green check marks on the left indicate that a page has been processed correctly.

the canon adf scanner operates through two pieces of software working concurrently, canon captureontouch v4 pro and kofax vrs (an additional product supplied by the task force’s scanner vendor). some image processing settings are applied in kofax vrs, which communicates with captureontouch v4 pro. both pieces of software were bundled with purchases of the two canon adf scanners. limb processing by i2s was also purchased for the project. limb processing is a powerful mass image processing product that operates through user-built processing workflows that can be applied to multiple folders, creating standardized output suitable for automation. the limb software can transform an imperfect scan into a fully processed, clean derivative with minimal user input, which is especially useful for transforming legacy image data. abbyy finereader server 14 is used to provide quality optical character recognition (ocr) data and features efficient tools for automation, allowing for large ocr processing jobs to be queued and run recursively with minimal user intervention (see fig. 1).
with these powerful tools, uh libraries has been able to leverage our existing scanners, new scanners, and advanced software to plan for the timely capture of nearly three million pages of content.

the number of theses and dissertations required the implementation of a semiautomatic disbinding system. the spartan 185a paper cutter from the challenge machinery company was purchased to ensure the replication of many clean binding removals. options from several manufacturers were considered for these needs but the spartan 185a offered the cutting power needed to cut millions of pages over the life of the project (see fig. 2). the cutter features several safeguards that protect the operator, such as the lowering of a protective acrylic guard and the requirement of two hands, away from the blade, to lower the blade automatically. uh libraries chose a local cutter blade replacement company that services the equipment quarterly. in addition to the cutter, supplies for binding removal and physical volume management were needed, such as:

• x-acto knife and/or utility blades
• recycling bins
• table brooms and dustpans
• disbinding tables
• cutting mats
• standing mats
• letter and legal-size folders
• folder holders
• surface cleaning materials
• carts/book trucks

figure 2. (l) thesis scanning test on the zeutschel os 12002; (r) challenge spartan 185a cutter with red numbers indicating blade lowering and cutting safety button order.

the physical retention of volumes was considered in the context of the overall preservation of the theses and dissertations collection, including the digital preservation approach to the tdd project. the uh libraries holds two copies of a student’s thesis or dissertation.
after consulting with stakeholders throughout the library (such as the university archivist, the dean of libraries and associate deans, and access services’ shelving team for shelf space/storage in different areas of the library building), the task force decided to retain one bound copy of each thesis or dissertation. additional copies will be weeded from the general collection, and the best copy for digitization will be disbound for feeder scanning using the equipment described above. when only one copy of a thesis or dissertation exists in the collection, it will be scanned using a scanner that will not destroy or damage the binding. the retained theses and dissertations collection will be housed in uh libraries special collections in the secure and climate-controlled closed stacks.

once the tdd task force settled on this retention strategy, the digital projects coordinator, a member of the task force who represents special collections, conducted a full shelf-read of the theses and dissertations already housed in special collections. using a master tracking spreadsheet that was generated from catalog reports for project tracking and pulling, a small team of student workers reviewed over 20,000 volumes to identify missing titles, titles with multiple copies that can be weeded from special collections, and copies with label and/or cataloging errors. missing titles were transferred from the general stacks to special collections, and the items were reshelved in chronological order. a more extensive shelving shift still needs to be completed to move volumes to accommodate additions and finalize the shelf location for all items in this collection, which will no longer be growing because all theses and dissertations at uh are submitted electronically as of 2014. as part of the shifting project, the items also need to be checked in and/or have their location codes changed in the catalog to reflect their new permanent home in special collections.
phase two: workflow development

the theses and dissertations digitization workflow starts with pulling physical volumes from shelves. the task force generated a report of all uh theses and dissertations and sorted them by call number so that student workers can pull these volumes from the general stacks in call number order. after the pulled volumes’ records are withdrawn from the catalog system, they are shelved by call number order in the “ready for digitization” section of the tdd shelf in the library basement, close to the digitization lab.

volumes are pulled from a section of the library stacks dedicated to the tdd project and loaded onto book carts for transfer to the physical volume processing room. using a custom-built processing table, covers are removed with utility knives and discarded. the text is placed in a folder with a pre-printed label indicating the oclc number and call number of the volume. the spine of each volume is removed with a spartan guillotine. the completely disbound volumes, housed in labeled folders, are then moved on book carts to the scanning room.

prior to scanning, physical volumes are grouped in batches of approximately 50 and a text file is created that lists each oclc number in a batch, one per line. a simple executable file reads the text file and creates a batch directory. the batch directory is labeled with the current date in yyyymmdd format and contains a folder for each scanned volume. the scanned volume folders are labeled by oclc number and contain a metadata.txt file that records the volume’s descriptive metadata from the uh catalog system in yaml format: a data carrier that is easily readable for both humans and machines.
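the batch setup described above can be sketched in a few lines. the source says only that a simple executable reads the text file and builds the directory; the python below is an illustrative stand-in, not the task force's tool, and the function name `create_batch`, the `catalog` lookup, and the metadata keys are hypothetical. metadata is written as `key: "value"` lines, which are valid yaml as long as the keys are plain words.

```python
import json
from datetime import date
from pathlib import Path
from typing import Dict, Optional


def create_batch(oclc_list: Path, root: Path,
                 catalog: Dict[str, Dict[str, str]],
                 today: Optional[date] = None) -> Path:
    """Create a yyyymmdd batch directory with one folder per OCLC number.

    `catalog` is a hypothetical in-memory stand-in for the catalog system:
    it maps an OCLC number to that volume's descriptive metadata.
    """
    today = today or date.today()
    batch_dir = root / today.strftime("%Y%m%d")

    # One OCLC number per line; blank lines are ignored.
    numbers = [line.strip() for line in oclc_list.read_text().splitlines()
               if line.strip()]

    for oclc in numbers:
        volume_dir = batch_dir / oclc
        volume_dir.mkdir(parents=True, exist_ok=True)
        record = catalog.get(oclc, {})
        # JSON-style double-quoted strings are also valid YAML scalars.
        text = "".join(f"{key}: {json.dumps(str(value))}\n"
                       for key, value in record.items())
        (volume_dir / "metadata.txt").write_text(text)

    return batch_dir
```

calling `create_batch(Path("batch.txt"), Path("scans"), catalog)` would produce a folder such as `scans/20220901/` holding one subfolder per listed oclc number, each with its own `metadata.txt`.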
scanning is performed with one of the two canon dr-g2110 high-speed feed scanners controlled by kofax vrs and captureontouch v4 pro software. before a volume is placed in the scanner, it is checked to ensure that the binding has been completely removed, that there are no pages that have been glued in after binding, and that there are no onion skin pages, irregular page sizes, inserts, or foldouts. if necessary, additional scans for delicate onion skin pages, inserts, or foldouts are performed on a flatbed scanner. page images are output as 300 dpi grayscale or color tiffs, and first-pass quality control for completeness, page legibility, rotation, and cropping is performed in captureontouch.

after page images have been captured, a batch is loaded into limb for final processing. scanned volumes are again checked for completeness, legibility, and orientation. text pages are processed as 300 dpi bitonal tiffs. pages with grayscale or color images are processed as such. when batch processing is complete, the documents’ processed signature pages, which include names and signatures of the author, advisor, and committee members, are separated out so they are not included in the final version published online. this step protects the privacy of individuals by not sharing their signature openly over the internet.

the tdd project uses abbyy finereader server 14 to generate full text pdfs and a plain text file for each scanned volume. the data in each scanned volume directory undergoes transformations both before and after the ocr processing. the transformations are accomplished with the tdd workflow utility, a ruby command line application. before running a batch through ocr, the archive digitized batch function moves the high-resolution master tiffs to an archive directory and formats the batch directory for the input that abbyy expects.
after ocr processing, the archive ocr batch function moves the derivative tiffs used as ocr input to the archive directory as well. in a final process before sending the batch to the metadata unit, the process ocr batch function adds descriptive metadata to the embedded exif metadata of each pdf document with exiftool for improved accessibility. the tdd task force sought to align print materials’ metadata standards with the existing metadata standards applied to electronic theses and dissertations in the university’s institutional repository, largely based on the dictionary of texas digital library descriptive metadata guidelines for electronic theses and dissertations, v.2.22 early on in the project, the metadata subteam reviewed thesis and dissertation records in the institutional repository (ir) as well as marc catalog records in uh libraries’ library services platform, with special emphasis on the metadata elements used, to identify alignments and gaps. after analysis, the team established the crosswalk from marc to the qualified dublin core profile in the ir (see table 2). in july 2019, uh libraries migrated to the alma library services platform. prior to this migration, the task force exported tdd marc records from uh libraries’ former library services platform, sierra, and crosswalked into dublin core metadata fields using the freely available software marcedit. data was further normalized using openrefine. at this early stage, openrefine proved to be a valuable tool for batch editing and formatting metadata and identifying legacy terms or missing data. once the crosswalked data was cleaned up and put into place, standard values for all records were added (see table 3). table 2.
metadata crosswalk from marc to qualified dublin core

| metadata field | marc field | qualified dublin core element |
| --- | --- | --- |
| oclc number | 001, 035 $a | dc.identifier.other |
| call number | 099 | [n/a, admin use only] |
| author name | 100 $a | dc.creator |
| title | 245 $a $b | dc.title |
| thesis year | 264 $c | dc.date.issued |
| degree information | 500, 502 $a | thesis.degree.name |
| subject | 6xx fields | dc.subject |
| department | 710 $b | thesis.degree.department |

during the ongoing processing of digitized materials and as part of the quality control, each volume’s metadata is evaluated against its corresponding metadata record and edited when necessary. in an effort to enrich the metadata available to users and increase visibility of the volumes, information not typically provided in the marc records, such as thesis committee chairs, other committee members, and abstracts, is added to the records using dublin core contributor (dc.contributor.committeemember) and abstract (dc.description.abstract).

table 3. standard values added to all records

| qualified dublin core element | value |
| --- | --- |
| dc.format.mimetype | application/pdf |
| dc.type.genre | “thesis” or “dissertation,” as applicable |
| thesis.degree.grantor | university of houston |
| dc.type.dcmi | text |
| dc.format.digitalorigin | reformatted digital |

in the interest of closely observing copyright best practices, members of the tdd task force, including the digital projects coordinator and the director of the digital research commons, created copyright review guides and applicable rights statements. under these guidelines, a thesis or dissertation is considered under copyright in any of three cases: it was created in 1977 or earlier and a copyright notice appears on the volume; it was created between 1978 and february 28, 1989, and was registered with the us copyright office within five years of its creation; or it was created on march 1, 1989 or later.
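the three date-based conditions in the guidelines can be expressed as a small decision function. the sketch below is an illustration only, written in python; the function and parameter names are hypothetical, it encodes the rules exactly as summarized above rather than us copyright law in general, and it is not the task force's review guide.

```python
from datetime import date

# Boundary dates taken from the guidelines summarized above.
NOTICE_ERA_END = date(1977, 12, 31)
REGISTRATION_ERA_END = date(1989, 2, 28)


def in_copyright(created: date, has_notice: bool,
                 registered_within_five_years: bool) -> bool:
    """Return True if a volume is treated as in copyright under the guidelines."""
    if created <= NOTICE_ERA_END:
        # 1977 and earlier: depends on a copyright notice on the volume.
        return has_notice
    if created <= REGISTRATION_ERA_END:
        # 1978 through february 28, 1989: depends on timely registration.
        return registered_within_five_years
    # march 1, 1989 or later: always treated as in copyright.
    return True
```

a reviewer would still need to inspect each volume for a notice and check registration records; the function only records the outcome of that review.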
inserts and other research material provided in the volumes are similarly considered for copyright evaluation during the copyright review process. once a volume has been evaluated for copyright status, an out-of-copyright or in-copyright statement is assigned. in alignment with the uh libraries’ mission to provide valuable research and educational materials, digitized volumes and metadata records are then ingested into the institutional repository.23 in this stage of the process, out-of-copyright volumes are made available as open access materials. due to inherent limitations, in-copyright volumes are access restricted and available solely to the university community. when content is ready for ingest, volumes are moved to the ingest folder and placed in staging directories based on rights status: open access or in copyright. the ingest process is the same for both types of content, but in-copyright content requires additional post-ingest processing, so ingest batch folders are labeled according to rights status for clarity. the tdd workflow utility’s prepare ingest package function is used to create ingest packages in an input format expected by the saf creator, a utility for preparing dspace batch imports in the simple archive format .24 pdf files are copied and renamed in the format lastname_year_oclcnumber.pdf, a csv file is created with descriptive metadata for the batch, and the original files and metadata are moved to an archive directory. the saf creator is then used to create an saf ingest package that is imported into dspace. limiting access to copyrighted content was a necessary component of the project that took some time to solve. the team investigated creating a separate collection for the in-copyright content with access limited to users logged in with uh credentials. 
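the file renaming and batch csv steps of the ingest preparation described above could be sketched as below. this python example is illustrative only: the helper names, the lowercasing and stripping of non-alphanumeric characters, and the csv column names are assumptions, and the column layout is not claimed to match what the saf creator actually expects.

```python
import csv
import re
from pathlib import Path
from typing import Dict, List


def ingest_filename(last_name: str, year: int, oclc_number: str) -> str:
    """Build a name in the lastname_year_oclcnumber.pdf pattern.

    Assumption: the surname is lowercased and reduced to a-z and 0-9.
    """
    safe_name = re.sub(r"[^a-z0-9]", "", last_name.lower())
    return f"{safe_name}_{year}_{oclc_number}.pdf"


def write_batch_csv(records: List[Dict[str, str]], csv_path: Path) -> None:
    """Write one row of descriptive metadata per volume (hypothetical columns)."""
    fieldnames = ["filename", "dc.creator", "dc.title", "dc.date.issued"]
    with csv_path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(records)
```

surnames with accented characters would lose those letters under this rule, so a production version would need a transliteration step.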
the downside to this approach was that the content within the restricted collection was not discoverable to users who were not logged into the ir. in the end, the tdd task force worked with the texas digital library, a consortial nonprofit organization that hosts uh libraries’ dspace repository, to enable restricted access using bitstream authentication with shibboleth. this allows the task force to ingest all tdd project content into a single collection and apply uh authentication to copyrighted pdf documents. in this manner, descriptive metadata for all documents is discoverable, but access to the document content is only available to members of the uh community. applying authentication to bitstreams in the dspace administrative interface is a tedious process involving numerous clicks and dropdown menu selections. selenium ide, a browser plug-in designed for automated web development testing, is used to automate that process in the firefox web browser. after an in-copyright batch has been ingested, the tdd workflow utility’s prepare selenium script function is used to create an automation script for selenium. when loaded in the firefox selenium add-on, the script automatically applies the bitstream authentication steps in the browser for each volume in the batch. the tdd workflow comprises detailed tasks carried out at different units in the library in a sequential routine as an assembly line. tdd activities flow from pulling volumes from shelves to disbinding, scanning, image quality control and ocr, metadata creation and copyright evaluation, and digitized files ingestion into the dspace system.
as the tdd task force worked collaboratively to develop and confirm workflows for this complicated process, they documented each section of the workflow in the one-stop tdd workflow google document for easy access and transparency of the overall process.25 the tdd working group members notify each other at completion of tasks at each section. to better track each thesis and dissertation as it moved through the digitization, metadata, and copyright verification workflows, the task force developed an excel spreadsheet tracking system.26 this tracking system lists uh libraries’ theses and dissertation titles, their oclc numbers, dates, and call numbers. it records the tdd volumes pulled from shelves, digitization completed, digitization batch, borrower notes, metadata completed, and other notes. this tracking system provides a channel for the team members to inform each other of completed tasks at each unit and to communicate issues in the working process (see fig. 3).
figure 3. a screenshot of a portion of the tdd tracking system.
phase three: promotion, communication, and next steps
it is important to have strategies for tdd promotion and communication to raise awareness of the online availability of the university’s legacy theses and dissertations. the tdd task force brainstormed elements such as audience, channels, and timeline for tdd communication. theses and dissertation authors and campus users are the two main groups the task force plans to target in its promotion and communication plan. to attract audience attention, the tdd task force will design an online flyer/postcard for dissemination. they are currently collaborating with the uh libraries director of communication, the uh alumni office, the uh graduate office, and the uh division of research to distribute messages to targeted audiences.
the task force will communicate tdd digitization progress as they reach important milestones, including the completion of pre-1978 volumes, then at the increments of 10,000 and 15,000 volumes, and once all volumes have been digitized and deposited to the repository. with the disbinding, digitization, and metadata workflows firmly in place, the tdd task force commenced the process of generating digitized versions of uh’s theses and dissertations in 2020. while this process will continue over the next several years, the task force will also focus on refining policies and workflows around its copyright and digital preservation activities. the tdd task force has developed a draft copyright policy development document, which outlines copyright determination decisions and access controls for content deemed in copyright. the task force is currently consulting with uh general counsel to ensure its recommended copyright approaches are in concert with university best practices. at the same time, the task force is developing digital preservation procedures to ensure long-term access to, and storage and preservation of, digitized theses and dissertations. the group has made some foundational decisions to date. since one physical copy of each title will be retained, allowing for future higher-resolution rescanning if needed, the task force determined that the preservation master file for each digitized thesis or dissertation will be one pdf. this will allow the uh libraries to greatly reduce the ongoing storage costs associated with digitally preserving the tdd collection.
throughout 2023, the task force will be exploring ways to sync tdd content to its current digital preservation workflow process, including submitting content to uh libraries’ archivematica instance for preservation curation services such as file fixity checks and normalization, and transferring preserved tdd content to cloud storage for distributed digital preservation. prior to ingesting any content into the institutional repository, the team reached out to the uh’s electronic and information resources accessibility (eira) coordinator for feedback on the accessibility of the pdf documents produced by abbyy. the eira coordinator recommended encoding our pdfs as pdf/a-1a, a standard designed for preservation and accessibility, and introduced the team to the accessibility tools available in adobe acrobat. the adobe acrobat accessibility checker has been useful for identifying and addressing accessibility issues with the pdfs that we are producing. uh libraries web accessibility standards strive to comply with the world wide web consortium’s (w3c) web content accessibility guidelines (wcag). combined with the feedback from the uh’s eira coordinator, the current output was reviewed against these accessibility checklists, and areas needing improvement were identified. after several adjustments, the newest output for the project passes a majority of the adobe acrobat accessibility checker’s parameters, with further investigation planned to address weak points moving forward. conclusion the tdd project at uh libraries provides an in-depth view of the planning and workflow processes needed to launch a retrospective thesis and dissertations digitization effort in an academic library setting.
collaborating across uh libraries departments, the tdd task force designed a phased approach to identify technology and resources needed to undertake the project, to develop policies, procedures, and workflows to guide the work to its completion, and to communicate about the scope, purpose, and progress of the project to internal and external stakeholders. throughout the planning and development phases, the task force leveraged automation, bibliographic data reuse, and project management tracking to achieve workflow objectives efficiently and responsibly. with the project well underway, the task force will continue refining its processes and working across uh libraries and campus units to ensure it complies fully with copyright and digital preservation best practices. through these ongoing efforts, the tdd task force is ensuring that the original research and scholarship contained in thousands of theses and dissertations are more accessible than ever before, broadening the reach and impact of uh graduates well into the future. funding this project was funded by the john p. mcgovern foundation. acknowledgments the authors dedicate this work to the memory of their colleague and tdd task force member crystal cooper. endnotes 1 linda bennett and dimity flanagan, “measuring the impact of digitized theses: a case study from the london school of economics,” insights: the uksg journal 29, no. 2 (2016): 111–19, https://doi.org/10.1629/uksg.300. 2 gail clement and melissa levine, “copyright and publication status of pre-1978 dissertations: a content analysis approach,” portal: libraries and the academy 11, no. 3 (2011): 825, https://doi.org/10.1353/pla.2011.0031. 3 clement and levine, “copyright and publication status,” 825. 4 clement and levine, “copyright and publication status,” 826.
5 xiaocan (lucy) wang, “guidelines for implementing etd programs—roles and responsibilities,” in guidance documents for lifecycle management of etds, eds. matt schultz, nick krabbenhoeft, and katherine skinner (2014): sect.1, p. 14, https://educopia.org/guidance-documents-forlifecycle-management-of-etds. 6 wang, “guidelines,” 1-17. 7 patricia hswe, “briefing on copyright and fair use issues in etds,” in guidance documents for lifecycle management of etds, eds. matt schultz, nick krabbenhoeft, and katherine skinner, (2014): sect. 3, p. 12, https://educopia.org/guidance-documents-for-lifecycle-management-ofetds. 8 geneva henry, “guide to access levels and embargoes of etds,” in guidance documents for lifecycle management of etds, eds. matt schultz, nick krabbenhoeft, and katherine skinner, (2014): sect. 2, p. 1, https://educopia.org/guidance-documents-for-lifecycle-management-ofetds. 9 henry, “guide to access levels,” 2-1. 10 hswe, “briefing on copyright,” 3-9–3-13. 11 cathleen l. martyniak, “scanning our scholarship: the university of florida retrospective dissertation scanning project,” microform and imaging review 37, no. 3 (2008): 122–24, https://doi.org/10.1515/mfir.2008.013. 12 martyniak, “scanning our scholarship,” 127–29. 13 “retrospective dissertation scanning policy,” (2011), university of florida, accessed january 1, 2022, https://ufdc.ufl.edu/aa00007596/00001. 
14 todd mundle, “digital retrospective conversion of theses and dissertations: an in house project,” in proceedings of eighth symposium on electronic theses and dissertation (sydney, australia, 2005): 3–4. 15 mundle, “digital retrospective conversion,” 3. 16 mary piorun and lisa a. palmer, “digitizing dissertations for an institutional repository: a process and cost analysis,” journal of the medical library association: jmla 96, no. 3 (2008): 223–29, https://doi.org/10.3163/1536-5050.96.3.008. 17 piorun and palmer, “digitizing dissertations,” 224. 18 piorun and palmer, “digitizing dissertations,” 227. 19 sarah l. shreeves and thomas h. teper, “looking backwards: asserting control over historic dissertations,” college and research library news 73, no. 9 (2012): 532–33, https://doi.org/10.5860/crln.73.9.8830. 20 gary m. worley, “dissertations unbound: a case study for revitalizing access,” in proceedings of the 10th international symposium on electronic theses and dissertations (uppsala, sweden, 2007). 21 worley, “dissertations unbound,” 3–6. 22 dictionary of texas digital library descriptive metadata for electronic theses and dissertations, version 2.0, (2015), http://hdl.handle.net/2249.1/68437.
23 to access cougar roar, see https://guides.lib.uh.edu/roar. 24 saf creator is a tool developed by james creel at texas a&m university. for more, see https://github.com/jcreel/safcreator. 25 see the tdd google document: https://docs.google.com/document/d/18gyjq6isn7qsuelo1z3b7btmxlxnchmvqp8rhquzy8g/edit?usp=sharing. 26 see the complete tdd tracking system: https://docs.google.com/spreadsheets/d/1tehagvcqw6wb3n5cdaulbtlwzqzstwdbltiapd1oan0/edit?usp=sharing.
academic libraries on social media: finding the students and the information they want heather howard, sarah huber, lisa carter, and elizabeth moore information technology and libraries | march 2018 8 heather howard (howar198@purdue.edu) is assistant professor of library science; sarah huber (huber47@purdue.edu) is assistant professor of library science; lisa carter (carte241@purdue.edu) is library assistant; and elizabeth moore (moore658@purdue.edu) is library assistant and student supervisor at purdue university.
librarians from purdue university wanted to determine which social media platforms students use, which platforms they would like the library to use, and what content they would like to see from the library on each of these platforms. we conducted a survey at four of the nine campus libraries to determine student social media habits and preferences. results show that students currently use facebook, youtube, and snapchat more than other social media types; however, students responded that they would like to see the library on facebook, instagram, and twitter. students wanted nearly all types of content from the libraries on facebook, twitter, and instagram, but they did not want to receive business news or content related to library resources on snapchat. youtube was seen as a resource for library service information. we intend to use this information to develop improved communication channels, a clear social media presence, and a cohesive message from all campus libraries. introduction in his book tell everyone: why we share and why it matters, alfred hermida states, “people are not hooked on youtube, twitter or facebook but on each other. tools and services come and go; what is constant is our human urge to share.”1 libraries are places of connection, where people connect with information, technologies, ideas, and each other. as such, libraries look for ways to increase this connection through communication. social media is a key component of how students communicate with classmates, families, friends, and other external entities. it is essential for libraries to communicate with students regarding services, collections, events, library logistics, and more. purdue university is a large, land-grant university located in west lafayette, indiana, with an enrollment of more than forty thousand. the purdue libraries consist of nine libraries, presented collectively on the social media platforms facebook and twitter since 2009 and youtube since 2012. 
going forward, the purdue libraries want to establish a cohesive message and brand that is communicated to students on the platforms they use and on which they will engage with the libraries. the purpose of this study was to determine which social media platforms the students are currently using, which platforms they would like the library to use, and what content they would like to see from the libraries on each of these platforms. academic libraries on social media | howard, huber, carter, and moore 9 https://doi.org/10.6017/ital.v37i1.10160 literature review academic libraries and social media academic libraries have been slow to accept social media as a venue for either promoting their services or academic purposes. a 2007 study of 126 academic librarians found that only 12 percent of those surveyed “identified academic potential or possible benefits” of facebook while 54 percent saw absolutely no value in social media.2 however, the mission of academic libraries has shifted in the last decade from being a repository of knowledge to being a conduit for information literacy; new roles include being a catalyst for on-campus collaboration and a facilitator for scholarly publication within contemporary academic librarianship.3 academic librarians have responded to this change, with many now believing that “social media, which empowers libraries to connect with and engage its diverse stakeholder groups, has a vital role to play in moving academic libraries beyond their traditional borders and helping them engage new stakeholder groups.”4 student perceptions about academic libraries on social media as the use of social media has grown with college-aged students, so has an increasing acceptance of academic libraries using social media to communicate. a pew research center report from 2005 showed just 7 percent of eighteen to twenty-nine year olds using social media.
by 2016, 86 percent were using social media.5 in 2007 the oclc asked 511 college students from six different countries to share their thoughts on libraries using social networking sites. this survey revealed that “most college students would be unlikely to participate in social networking services offered by a library,” with just 13 percent of students believing libraries have a place on social media.6 however, just two years later (in 2009), a shift was seen: students were open to connecting with academic libraries, as observed in a survey of 366 freshmen at valparaiso university. when asked their thoughts on the library sending announcements and communications to them via facebook or myspace (a social media powerhouse at the time), 42.6 percent answered they would be “more receptive to information received in this way than any other response.” a smaller group, 12.3 percent, responded more negatively to this approach. students showed concern for their privacy and the level of professionalism, as a quote from a student illustrates: “facebook is to stay in touch with friends or teachers from the past. email is for announcements. stick with that!!!” 7 as students report becoming more open to academic libraries on social media, the question of whether they will engage through social media emerges. a recent study from western oregon university’s hammersley library asked this question with promising results. forty percent of students said they were either “very likely “or “somewhat likely” to follow the library on instagram and twitter, as opposed to wanting communications being sent to them directly through social media (for example, a facebook message). pinterest followed, with 33 percent of students saying they were either “very likely” or “somewhat likely” to follow the library using this platform.8 throughout the literature, students have shown an interest in information about the libraries that is useful to them. 
in another survey given to undergraduate students from three information technology classes at florida state university, one question examined the perceived importance of different library social media postings to students. the report showed students considered postings related to operations updates, study support, and events as the most important.9 in the hammersly study noted above, 78 percent and 87 percent of respondents said they were either “very interested” or “somewhat interested,” respectively, in every category relating to library resources presented in the survey, but “interesting/fun websites and memes” received the least interest from participants.10 the literature shows an increase in students being receptive to academic libraries on social media. results vary campus to campus and students are leery of libraries reaching out to them via social media, but they have an increasingly positive view about content posted that will help them with the library. research questions the aim of this project was to investigate the social media behaviors of purdue university students as they relate to the libraries, and to develop evidence-based practices for managing the library’s social media accounts. the project focused on three research questions: 1. what social media platforms are students using? 2. what social media platforms do students want the library to use? 3. what kind of content do students want from the library on each of these platforms? methods we created the survey using the web-based qualtrics survey software. it was distributed in electronic form only, and it was promoted to potential respondents via table tents in the libraries, bookmarks at the library desk, facebook posts, and in-classroom promotion. potential respondents were advised that the survey was anonymous and voluntary.
the survey consisted of closed questions, though many questions contained an open-ended field for answers that did not fall into the provided choices. inspiration for some of the options in our survey questions came from the hammersly library study, as we felt they did a good job capturing information about the social media usage of their patrons.11 our survey asked what social media platforms students use, what they use them for, how often they visit the library, how likely they are to follow the library on social media, which platforms they want the library to have, and what content they would like from the library on each of those platforms. the social media platforms included were facebook, flickr, g+, instagram, linkedin, pinterest, qzone, renren, snapchat, tumblr, twitter, youtube, and yik yak.12 there were also open-ended spaces where participants could write in additional platforms. the survey originally ran for three weeks in only the business library early in the spring 2017 semester, as its intended purpose was to inform how the business library would manage social media. after that survey was completed, we decided to replicate the survey in three additional libraries (humanities, social science, and education; engineering; and the main undergraduate libraries). this was done to expand the dataset and reach additional students in a variety of disciplines. these libraries were chosen because they were the libraries in which the authors work, with the hope to expand to additional libraries in the future. the second survey also lasted for three weeks starting in mid-april of the spring 2017 semester. as a participation incentive, students who completed the initial survey and the second survey had an opportunity to enter a drawing for a $25 visa gift card. 
the survey was advertised across four different campus libraries and promoted in several ways to reach different populations. though the results are not from a random sample of the student population, the results are broad enough that we intend to apply them to our entire student population. results survey the survey was completed by 128 students. an additional 13 students began the survey but did not complete it; we removed their results from the analysis. the breakdown of respondents was 10 percent freshmen (n = 13), 22 percent sophomore (n = 28), 27 percent junior (n = 35), 20 percent senior (n = 25), and 21 percent graduate or professional (n = 27). library usage the students were asked how frequently they visit the library to determine if the survey was reaching a population of regular or infrequent library visitors. the results showed that the students who completed the survey were primarily frequent library users, with 93 percent (n = 119) visiting once a week or more. social media platforms the students were asked to identify which social media platforms they used and how frequently they used them. the most popular social media platforms were determined by combining the number of students who said they used them daily or weekly. the top five were facebook (n = 114, 88 percent), youtube (n = 102, 79 percent), snapchat (n = 90, 70 percent), instagram (n = 85, 66 percent), and twitter (n = 41, 32 percent). full results are in table 1. table 1.
usage frequency by platform

| social media platform | daily | weekly | monthly | < once per month | never |
| --- | --- | --- | --- | --- | --- |
| facebook | 94 (72.87%) | 20 (15.50%) | 5 (3.88%) | 5 (3.88%) | 4 (3.10%) |
| flickr | 0 (0.00%) | 1 (0.78%) | 2 (1.55%) | 8 (6.20%) | 117 (90.70%) |
| g+ | 3 (2.33%) | 6 (4.65%) | 4 (3.10%) | 16 (12.40%) | 99 (76.74%) |
| instagram | 68 (52.71%) | 17 (13.18%) | 5 (3.88%) | 11 (8.53%) | 27 (20.93%) |
| linkedin | 9 (6.98%) | 29 (22.48%) | 22 (17.05%) | 22 (17.05%) | 46 (35.66%) |
| pinterest | 12 (9.30%) | 12 (9.30%) | 16 (12.40%) | 19 (14.73%) | 69 (53.49%) |
| qzone | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 4 (3.10%) | 124 (96.12%) |
| renren | 0 (0.00%) | 0 (0.00%) | 1 (0.78%) | 3 (2.33%) | 124 (96.12%) |
| snapchat | 84 (65.12%) | 6 (4.65%) | 6 (4.65%) | 7 (5.43%) | 25 (19.38%) |
| tumblr | 7 (5.43%) | 2 (1.55%) | 7 (5.43%) | 11 (8.53%) | 101 (78.29%) |
| twitter | 28 (21.71%) | 13 (10.08%) | 12 (9.30%) | 9 (6.98%) | 66 (51.16%) |
| youtube | 58 (44.96%) | 44 (34.11%) | 15 (11.63%) | 4 (3.10%) | 7 (5.43%) |
| yik yak | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 11 (8.53%) | 117 (90.70%) |
| other: email | 1 (0.78%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) |
| other: groupme | 3 (2.33%) | 1 (0.78%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) |
| other: reddit | 2 (1.55%) | 2 (1.55%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) |
| other: skype | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 1 (0.78%) | 0 (0.00%) |
| other: vine | 0 (0.00%) | 1 (0.78%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) |
| other: wechat | 3 (2.33%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) |
| other: weibo | 1 (0.78%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) |
| other: whatsapp | 1 (0.78%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) | 0 (0.00%) |

social media activity next, students were asked how much time they spend on social media doing the following activities: watching videos, keeping in touch with friends/family, sharing photos, keeping in touch with classmates/professors, learning about campus events, doing research, getting news, or following public figures.
table 2 shows that students overwhelmingly use social media daily or weekly to watch videos (94 percent, n = 120), keep in touch with family/friends (93 percent, n = 119), and get news (81 percent, n = 104). the least popular activities, those that students do less than once per month or never, were doing research (47 percent, n = 60) and following public figures (34 percent, n = 45).

social media and the library
the students were asked how likely they are to follow the libraries on social media. the response to this was primarily positive, with 57 percent of respondents saying they are either extremely likely or somewhat likely to follow the library. one response for this question was inexplicably null, so for this question n = 127. figure 1 contains the full results.

table 2. social media activity
social media activity | daily | weekly | monthly | < once per month | never
watch videos | 85 (66.41%) | 35 (27.34%) | 1 (0.78%) | 4 (3.13%) | 3 (2.34%)
keep in touch with friends/family | 89 (69.53%) | 30 (23.44%) | 6 (4.69%) | 2 (1.56%) | 1 (0.78%)
share photos | 32 (25%) | 33 (25.78%) | 38 (29.69%) | 20 (15.63%) | 5 (3.91%)
keep in touch with classmates/professors | 34 (26.56%) | 47 (36.72%) | 21 (16.41%) | 19 (14.84%) | 7 (5.47%)
learn about campus events | 24 (18.75%) | 53 (41.41%) | 29 (22.66%) | 18 (14.06%) | 4 (3.13%)
do research | 24 (18.75%) | 26 (20.31%) | 18 (14.06%) | 23 (17.97%) | 37 (28.91%)
get news | 66 (51.56%) | 38 (29.69%) | 7 (5.47%) | 9 (7.03%) | 8 (6.25%)
follow public figures | 34 (26.56%) | 30 (23.44%) | 20 (15.63%) | 19 (14.84%) | 24 (18.75%)
other | 2 (1.56%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%)

figure 1. library social media follows. [bar chart: "how likely are you to follow the library on social media?" bar values, in category order: extremely likely 12, somewhat likely 66, neither likely nor unlikely 23, somewhat unlikely 16, extremely unlikely 10.]
the students were asked which social media platforms they thought the library should be on. five rose to the top of the results: facebook (82 percent, n = 105), instagram (55 percent, n = 70), twitter (40 percent, n = 51), snapchat (34 percent, n = 44), and youtube (29 percent, n = 37). full results can be seen in figure 2.

after a student selected a platform they wanted the library to be on, logic built into the survey then directed them to an additional question that asked what content they would like to see from the library on that platform. content included library logistics (hours, events, etc.), research techniques and tips, how to use library resources and services, library resource info (database instruction/tips, journal availability, etc.), business news, library news (e.g., if the library wins an award), campus-wide info/events, and interesting/fun websites and memes.

for facebook, students widely selected all types of content, with the most selections made for library logistics (n = 73) and the fewest made for business news (n = 33). for instagram, students wanted all content except business news (n = 18). snapchat was similar, except along with business news (n = 8), students also were not interested in receiving content related to library resource information (n = 9). twitter was similar to facebook in that all content was widely selected. youtube had a focus on library services, with the three most-selected content options being research techniques and tips (n = 20), how to use library resources and services (n = 19), and library resource info (n = 16). table 3 contains the full results.

figure 2. library social media presence. [bar chart: "what social media platform should the library be on?" bar values: facebook 105, g+ 7, instagram 70, linkedin 23, pinterest 10, qzone 1, renren 1, snapchat 44, tumblr 5, twitter 51, youtube 37.]
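the ranking and percentages reported alongside figure 2 can be reproduced from the bar values. the python sketch below is illustrative only: it assumes the 128 completed surveys as the denominator (the text does not state the denominator for this question), and the function and variable names are hypothetical.

```python
# counts read from figure 2
N_RESPONDENTS = 128  # assumption: number of completed surveys

platform_votes = {
    "facebook": 105, "g+": 7, "instagram": 70, "linkedin": 23,
    "pinterest": 10, "qzone": 1, "renren": 1, "snapchat": 44,
    "tumblr": 5, "twitter": 51, "youtube": 37,
}


def top_platforms(votes, n_respondents, k=5):
    """return the k most-selected platforms as (name, count, rounded percent)."""
    ranked = sorted(votes.items(), key=lambda item: item[1], reverse=True)
    return [
        (name, count, round(100 * count / n_respondents))
        for name, count in ranked[:k]
    ]


top_five = top_platforms(platform_votes, N_RESPONDENTS)
for name, count, percent in top_five:
    print(f"{name}: {percent} percent (n = {count})")
```

under that assumption the sketch returns the same five platforms, in the same order and with the same rounded percentages, as the text reports.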
table 3. library social media content by platform
what type of content would you like to see from the library?
content type | facebook (n = 105) | g+ (n = 7) | instagram (n = 70) | linkedin (n = 23) | pinterest (n = 10) | snapchat (n = 44) | tumblr (n = 5) | twitter (n = 51) | youtube (n = 37)
library logistics (hours, events, etc.) | 73 (69.52%) | 2 (28.57%) | 34 (48.57%) | 7 (30.43%) | 4 (40%) | 23 (52.27%) | 2 (40%) | 32 (62.75%) | 8 (21.62%)
research techniques & tips | 52 (49.52%) | 3 (42.85%) | 28 (40%) | 13 (56.53%) | 7 (70%) | 19 (43.18%) | 3 (60%) | 27 (52.94%) | 20 (54.05%)
how to use library resources & services | 53 (50.48%) | 3 (42.85%) | 26 (37.14%) | 8 (34.78%) | 7 (70%) | 16 (36.36%) | 3 (60%) | 25 (49.02%) | 19 (51.35%)
library resource info (database instruction/tips, journal availability, etc.) | 53 (50.48%) | 3 (42.85%) | 22 (31.42%) | 8 (34.78%) | 6 (60%) | 9 (20.45%) | 2 (40%) | 23 (45.10%) | 16 (43.24%)
business news | 33 (31.43%) | 2 (28.57%) | 18 (25.71%) | 13 (56.52%) | 3 (30%) | 8 (18.18%) | 2 (40%) | 17 (33.33%) | 7 (18.92%)
library news (e.g., if the library wins an award) | 49 (46.67%) | 3 (42.85%) | 37 (52.86%) | 12 (52.17%) | 5 (50%) | 19 (43.18%) | 3 (60%) | 24 (47.06%) | 7 (18.92%)
campus-wide info/events | 73 (69.52%) | 3 (42.85%) | 42 (60%) | 5 (21.74%) | 5 (50%) | 26 (59.09%) | 2 (40%) | 35 (68.63%) | 13 (35.14%)
interesting/fun websites & memes | 48 (45.71%) | 0 | 41 (58.57%) | 2 (8.70%) | 10 (100%) | 30 (68.18%) | 3 (60%) | 26 (50.98%) | 12 (32.43%)
other | 1 (0.95%) | 0 | 2 (2.86%) | 0 | 1 (10%) | 2 (4.55%) | 0 | 2 (3.92%) | 1 (2.70%)

discussion
historically, libraries have used social media as a marketing tool.13 with social media’s ever-increasing popularity with young adults, academic libraries have actively established a presence on several platforms.14 our survey shows that our students follow this trend, using social media regularly and for a variety of activities.
we were surprised that facebook turned out to be the most widely used by our students, as much has been written in the last few years about teens and young adults leaving the platform.15 a november 2016 survey, however, found that 65 percent of teens said they used facebook daily, a large increase from 59 percent in november 2014. though snapchat and instagram are preferred, teens continue to use facebook for its utility in scheduling events or keeping in touch regarding homework.16 students do seem receptive to following the library on different platforms and report wanting primarily library-related content from us, including more in-depth content such as research techniques and database instruction.

limitations and future work
findings from this study give insight into opportunities for libraries to reach university students through social media. we acknowledge that only limited generalizations can be made because of the way the survey was conducted. our internal recruitment methods led to a selection bias in our surveyed population, as advertisement of the survey took place either in the chosen libraries or on the purdue libraries’ existing facebook page. because of this, our sample consists primarily of students who visit the library or already follow the library on facebook. we hope to alter this in future surveys by expanding our recruitment to other physical spaces across campus. in addition, we plan to add questions that first establish a better understanding of students’ opinions of libraries being on social media before asking what social media they would like to see libraries use. this would potentially avoid leading students to an answer. further, we are concerned we took for granted students’ understanding of library resources; that is, we may have made distinctions librarians understand, but students may not.
in future studies, we plan to rephrase, and possibly combine, questions in a way that will be clear to people less familiar with library resources and services. we believe confusion with these questions created contradictory responses. for example, “research help through social media” received a low response rate, but “information on research techniques and tips” received a much higher response rate. additionally, a limitation of using a survey to collect behavior information is that respondents do not always report how they actually behave. using methods such as focus groups, interviews, text mining, or usability studies could provide a more holistic view of student behavior. duplication of this study on a yearly or semi-yearly basis across all libraries could help us see how social media preferences change over time and across a larger sample of our population. this study aimed to provide a broad view of a large university’s student body by surveying across different subject libraries. with the changes discussed, we think a revised survey could give us the detailed information we need to build a more effective social media strategy that reaches both library users and non-users. conclusion this study improved our understanding of the social media usage and preferences of purdue students. from these results, we intend to develop better communication channels, a clear social media presence, and a more cohesive message across the purdue libraries. under the direction of our new director of strategic communication, a social media committee was formed with representatives from each of the libraries to contribute content for social media. the committee will consider expanding the purdue libraries’ social media presence to communication channels where students have said they are and would like us to be. 
as social media usage is ever-changing, we recommend repeated surveys such as this to better understand where on social media students want to see their libraries and what information they want to receive from them.

references
1 alfred hermida, tell everyone: why we share and why it matters (toronto: doubleday canada, 2014), 1.
2 laurie charnigo and paula barnett-ellis, “checking out facebook.com: the impact of a digital trend on academic libraries,” information technology and libraries 26, no. 1 (march 2007): 23–34, https://doi.org/10.6017/ital.v26i1.3286.
3 stephen bell, lorcan dempsey, and barbara fister, new roles for the road ahead: essays commissioned for the acrl’s 75th anniversary (chicago: association of college and research libraries, 2015).
4 amanda harrison et al., “social media use in academic libraries: a phenomenological study,” journal of academic librarianship 43, no. 3 (may 1, 2017): 248–56, https://doi.org/10.1016/j.acalib.2017.02.014.
5 “social media fact sheet,” pew research center, january 12, 2017, http://www.pewinternet.org/fact-sheet/social-media/.
6 online computer library center, sharing, privacy and trust in our networked world: a report to the oclc membership (dublin, ohio: oclc, 2007), https://eric.ed.gov/?id=ed532599.
7 ruth sara connell, “academic libraries, facebook and myspace, and student outreach: a survey of student opinion,” portal: libraries and the academy 9, no. 1 (january 8, 2009): 25–36, https://doi.org/10.1353/pla.0.0036.
8 elizabeth brookbank, “so much social media, so little time: using student feedback to guide academic library social media strategy,” journal of electronic resources librarianship 27, no. 4 (2015): 232–47, https://doi.org/10.1080/1941126x.2015.1092344.
9 besiki stvilia and leila gibradze, “examining undergraduate students’ priorities for academic library services and social media communication,” journal of academic librarianship 43, no. 3 (may 1, 2017): 257–62, https://doi.org/10.1016/j.acalib.2017.02.013.
10 brookbank, “so much social media, so little time.”
11 stvilia and gibradze, “examining undergraduate students’ priorities.”
12 qzone and renren are chinese social media platforms.
13 curtis r. rogers, “social media, libraries, and web 2.0: how american libraries are using new tools for public relations and to attract new users,” south carolina state library, may 22, 2009, http://dc.statelibrary.sc.gov/bitstream/handle/10827/6738/scsl_social_media_libraries_2009-5.pdf?sequence=1; jakob harnesk and marie-madeleine salmon, “social media usage in libraries in europe—survey findings,” linkedin slideshare slideshow presentation, august 10, 2010, https://www.slideshare.net/jhoussiere/social-media-usage-in-libraries-in-europe-survey-teaser.
14 “social media fact sheet.”
15 daniel miller, “facebook’s so uncool, but it’s morphing into a different beast,” the conversation, 2013, http://theconversation.com/facebooks-so-uncool-but-its-morphing-into-a-different-beast-21548; ryan bradley, “understanding facebook’s lost generation of teens,” fast company, june 16, 2014, https://www.fastcompany.com/3031259/these-kids-today; nico lang, “why teens are leaving facebook: it’s ‘meaningless,’” washington post, february 21, 2015, https://www.washingtonpost.com/news/the-intersect/wp/2015/02/21/why-teens-are-leaving-facebook-its-meaningless/?utm_term=.1f9dd4903662.
16 alison mccarthy, “survey finds us teens upped daily facebook usage in 2016,” emarketer, january 28, 2017, https://www.emarketer.com/article/survey-finds-us-teens-upped-daily-facebook-usage-2016/1015053.

standardized costs for automated library systems
mary ellen l. jacob: systems officer, fisher library, university of sydney
journal of library automation vol. 3/3 september, 1970

costs of automated library systems as currently given in published reports tend to be misleading and confusing. it is necessary to have a clear understanding of how they were derived before any comparisons can be made. clearly defined costs in terms of time units are more meaningful than straight dollar costs and can be used as one means of comparison among various system designs and as guidelines for the design of new systems.

there is a great lack of consistency in reporting the costs of automated library systems. cost figures given in published reports tend to be misleading and confusing; rather than indicate the true cost of a system, they tend to obscure the entire issue. without a clear understanding of how such figures were obtained, one cannot use them for comparison against any other system (1). while it is true that no two systems are identical, use of standardized methods can make cross comparisons meaningful and give a basis for estimating the costs of new systems (2). when all the variables affecting automated library systems are considered, it is very tempting to say that no realistic comparisons can be made. what is needed is some definite statement of just what criteria can be used to determine costs and how they are derived (3). while there have been numerous studies dealing with the cost aspects of specific functions, no real attempt has been made to define standardized cost criteria for automated library systems. the following discussion is an attempt to identify and define some of the more common cost aspects of such systems.

what is cost?
"cost," as defined by accountants, is not the subject of this article.
primary interest here is in cost as a yardstick for measuring the efficiency and effectiveness of a system and for its comparison with other systems. it is important to note that cost is only one criterion and not necessarily the most important one. as costs are herein described several factors need to be determined. does cost include:
1) fixed overhead such as lighting, office space, administrative functions, etc.?
2) actual salaries or assessed salaries (some installations have a fixed figure for certain types of jobs regardless of the actual cost)?
3) equipment cost (each installation has its own methods for prorating equipment costs)?
4) material costs, paper supplies, etc.?
cost figures in terms of dollars have little or no meaning unless their derivation is understood. more meaningful are costs in terms of time, man units for human work, and actual running times for equipment. even for use of these units it is necessary to know something of the relative skill of the personnel involved and the equipment configuration used.

personnel costs
before examination of the possible breakdown of personnel costs several pertinent points should be considered. it is necessary to state the backgrounds, skills, and levels of experience of the personnel involved. these should include extent of familiarity and experience with the system environment. this environment consists of the equipment, computer or otherwise, and the particular library application involved. it would be advantageous if there were some objective ways of measuring system analyst and programmer performance, rather than reliance on background and experience as measures. unfortunately there are none. this is a problem that has bothered service bureaus, software houses, and any data processing manager worth his salt. at present there is no clear-cut answer. some try to measure efficiency by the length of time taken to code a number of program steps.
this gives no measure of the efficiency of the program generated, only a guide to the translating ability of the individual involved. it is certainly no measure of the actual program performance on the computer system. in addition it is extremely difficult to estimate accurately the actual running time that a given program should take, especially for a time-sharing system. how then can one measure the effectiveness of a given programmer in achieving such a goal? another problem to be considered is that of the best program versus the most efficient. it is important that a program be maintainable and capable of being changed easily by another programmer. too, if equipment changes are contemplated, it is highly desirable that the program be written in a higher-level language, such as cobol or fortran, which can be used on another machine. higher-level languages are not as efficient as assembler languages, but generally take less time to write and debug and are usually transferable from one machine to another with only minor modifications. while it is not possible to measure analyst or programmer efficiency accurately, it does help to know the level of experience and the general background of personnel. while an inexperienced analyst or programmer may occasionally be more efficient than an experienced one, this is not generally true. normally the more experienced man will know a variety of standardized methods or shortcuts that can be used effectively to shorten running time, coding time, or both. more important, he knows where to start. a sample personnel description might read:
systems and programming staff:
1 systems analyst, b.a. in business administration, five years' experience with various makes of computers, two years of which were spent as a programmer working with cobol and fortran, no library background, worked on a part-time basis.
2 programmers, both with high school diplomas, one with one year's experience with cobol, one trainee with no experience, but high aptitude, both with no library background.
1 library data processing coordinator, masters in library science, manufacturer's course in systems analysis and programming, knowledge of cobol.
data preparation staff:
2 professional librarians with masters in library science.
5 clerk-typists, high school diplomas, two with keypunch ability, three typists with 60 wpm.
a breakdown of personnel costs should include:
1) planning;
2) actual design (both systems and individual programs);
3) coding (writing of actual programs);
4) testing/debugging;
5) file conversion;
6) actual data preparation and correction (includes new file preparation and maintenance of existing files);
7) program maintenance.
in the planning, design, and coding phases both total time actually spent and the elapsed project time should be given. testing/debugging and conversion costs are normally one-time costs, but both can amount to a sizable portion of the system cost (4). ideally, if conversion costs can also include file cleanup, it helps make that portion more valuable and easier to accept (5). once the system is in operation, data preparation and correction times become major cost factors. these are usually highest during the initial installation when personnel are learning the new system. care should be taken not to let the size of initial costs bias the entire cost figure. once initial training is over, these will reach a more realistic level and will be more indicative of system requirements and actual costs.

equipment costs
just as there are difficulties in determining analyst and programmer efficiency there are similar, though less severe, problems in comparing the efficiency of various machine configurations.
even in comparison of two identical machine configurations, different run times are possible for the same job. the operating system or monitor must also be considered, as must the experience and efficiency of the computer operator. systems are improving to the point where operator performance is less critical than it once was, but it can still be a significant factor. equipment costs are largely determined by the machine configuration used. a tape system may have a totally different running time from that of a disc or drum system. the configuration also affects what types of systems may be implemented. for comparison purposes it is necessary to state the make, model, and memory size of the computer used. memory size should be given in either words or bytes. a byte is the amount of storage required for one alphabetic character, one special character, or one or two numeric characters. if the memory size is specified in words, the word size should also be given. details of the computer peripherals, such as the general type (e.g., tape drives, disc, printer, card reader, punch), make, model, and number should be given. for printers, card readers, punches, paper tape units, etc. the speed should also be given (e.g., lines/minute, cards/minute, characters/second). for storage media, such as disc and drum, the storage capacity should be given. for tape units the tape density should be included. a sample description might read:
1-ibm 360/20 submodel 5, 24k bytes
4-ibm 2415 tape drives, model 3, 800 bpi
1-ibm 2560 card reader/punch, reader-500 cpm, punch-160 cpm
1-ibm 2203 printer, 450 lpm
2-ibm 2311 disc, model 12, 2.7 million bytes
equipment costs can be subdivided in many ways.
a possible method includes:
1) computer costs
a) compile times (highly language and computer dependent)
b) test/debug time (should include the entire system as well as individual components)
c) actual run times
d) maintenance or debug after installation
2) additional equipment costs
a) keypunch, paper tape, optical character, other input devices
b) interpreting punched card output
c) sorting/collating
d) listings
e) bursting, binding, etc.
3) special forms or material costs
a) input or work forms
b) punched cards (pre-printed or blank)
c) pre-printed forms
d) carbon sets or ncr forms
e) pre-punched badge or id cards
f) masters for reproduction
g) special computer printer ribbons
while all of the above items may not be applicable to a particular system, all those that are should be included. large or real-time systems might need additional categories. compile and test/debug times are of interest to the systems designer, but are less important than actual run times. compile times are a function of the computer, the language used, and the complexity of the program. they are more indicative of the compiler efficiency than the system performance. test/debug times must be allowed for in any system, but data on them will be useful to those having had little experience with automated systems. experienced designers will be aware of the problem and make adequate allowances for it. actual run times are a primary cost factor in any system and representative samples should be given. details concerning the type and volume of input, type of processing, and the type and volume of output should also be stated. program maintenance costs usually do not appear until after the system has been up and running for some time, and are usually not included in reports of system costs, since most reports are written before, or soon after, system installation. they are important, however, because they represent a part of the continuing cost of the system.
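the personnel and equipment breakdowns above lend themselves to a simple record kept in time units rather than dollars. the python sketch below is one hypothetical way to hold such figures and total them by role and by machine; the class names, field names, and sample hours are illustrative assumptions and are not taken from any reported system.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class PersonnelEntry:
    function: str  # cost category, e.g. "planning" or "coding"
    role: str      # e.g. "analyst", "librarian", "clerk"
    hours: float   # total time actually spent, in man-hours


@dataclass
class EquipmentEntry:
    function: str  # cost category, e.g. "compile" or "actual run"
    machine: str   # make and model of the equipment used
    hours: float   # running time, in hours


@dataclass
class CostReport:
    personnel: list = field(default_factory=list)
    equipment: list = field(default_factory=list)

    def hours_by_role(self):
        """total man-hours for each personnel role."""
        totals = defaultdict(float)
        for entry in self.personnel:
            totals[entry.role] += entry.hours
        return dict(totals)

    def hours_by_machine(self):
        """total running time for each piece of equipment."""
        totals = defaultdict(float)
        for entry in self.equipment:
            totals[entry.machine] += entry.hours
        return dict(totals)


# hypothetical sample figures, for illustration only
report = CostReport(
    personnel=[
        PersonnelEntry("planning", "analyst", 40.0),
        PersonnelEntry("planning", "librarian", 10.0),
        PersonnelEntry("coding", "analyst", 60.0),
    ],
    equipment=[
        EquipmentEntry("compile", "computer", 1.5),
        EquipmentEntry("actual run", "computer", 0.5),
        EquipmentEntry("data preparation", "keypunch", 6.0),
    ],
)

print(report.hours_by_role())
print(report.hours_by_machine())
```

keeping the function name on every entry means the same records can also be grouped by cost category (planning, coding, run times) when two system designs are compared.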
conversion costs can represent a sizable portion of the installation cost of the system. this is especially true if the data must be converted to machine readable form. these costs are of great interest to others engaged in similar conversions and care should be taken to ensure these are accurate. the most obvious type of file conversion is from a record such as a typed list or a catalog card into a machine readable record through keypunching or keytape conversion. file conversion may still exist for a file already in machine readable form if there are differences between it and the files used by the system. normally such costs are considerably less than conversion from a non-machine-readable form. exceptions may occur if the file lacks much of the necessary information or if extensive character manipulation is required before the information can be used. an example of a file having insufficient data to warrant conversion might be a card file with very abbreviated authors and titles used for quick listing purposes, when what is wanted is a full shelf list containing all added entries and subjects, full titles, and imprint information. existing information may require too many corrections to expand the authors and titles to provide any really usable information. in other words it might be more economical to repunch the file from scratch than to try to edit and punch corrections.

non-computer equipment costs should not be neglected. while capital investments in such equipment may be small, the time spent in using the equipment can often be lengthy. this is particularly true of input devices. non-computer equipment includes such items as keypunches; keytape units; and any unit record equipment, such as collators, sorters, interpreters, xerox machines, typewriters, guillotines, etc. just as for computer equipment the type, make, model, quantity used, and special features should be given.
a sample description of such equipment might be: 1 ibm selectric typewriter, ascii ocr type element 2 ibm 029 keypunches (no special features) 1 ibm 82 sorter 1 ibm 85 collator in a system using a large number of punched cards or large volumes of paper for printing, these too can be significant cost factors. again the volume of usage may be more helpful than actual dollar cost. special or pre-printed forms are usually more expensive than plain forms, so it is important to state types as well as quantities. the actual dollar cost should be stated as well. an example of materials used to produce a small printed catalog with shortened entries on a six month cycle is: 600 dikote masters (for multilith reproduction) at $56.00/1000 masters 1000 pre-printed punched cards at $1.50/1000 cards 1 ibm 1403 computer printer ribbon, no. 413197 1200 pages, standard, lined, 14% x 11-inch computer printer paper presentation of costs the format for presenting cost data could be divided as follows: personnel: a brief paragraph describing the number, types, backgrounds, and skills of all personnel involved with the system. equipment: computer equipment: a brief statement of the computer make, model, and memory size; type, model and number of peripherals. additional equipment: a brief statement of the types, makes, models, and numbers of any other equipment necessary for the successful operation of the system. materials : a brief statement of the types and quantities of forms used, and for special forms an indication of the actual dollar cost as well. table 1. 
cost control form for sdi system
[table 1 is a form with the columns: functions; elapsed; personnel (total hours, type); equipment and material (number, time in hours, type). row labels: planning; actual design; coding; compile; testing/debugging; file conversion; data preparation/correction per run; individual job run (citations/external, citations/internal, profile update); decollating; bursting; printing; computer paper (citations/external, citations/internal, profile update). the cell values cannot be aligned to rows in this copy. elapsed values, in order: 1 month, 3 months, 4 months, 3 months, 2 months, .5 months. personnel values, in order: 80 analyst, 20 librarian, 98 analyst, 20 librarian, 90 analyst, 2 analyst, 5 analyst, 15 clerk, 1 clerk, 3 librarian, .1 clerk, .1 clerk, .1 clerk. equipment entries name the cdc 3600 and the ibm 029 keypunch. computer paper: 80 pages, 50 pages, 50 pages.]

fig. 1. profile update. [flowchart]

fig. 2. citation run. [flowchart]

table 1 shows a simple presentation of system cost. the table is a suggested form only and is not exhaustive; it can be expanded as needed for more complex systems. the information and figures given in the table illustrate the system discussed in the following section. the purpose is to provide a sample, not to describe the system in detail, and consequently the system description is very brief.

system description
a selective dissemination of information system was developed to serve a small group of engineers in a scientific laboratory.
one source of input consisted of current accessions obtained as a by-product of regular weekly runs to create a master shelf-list file in machine readable form. another input file was obtained by subscription to a commercial tape service supplying journal, book and report citations. while most of the programs developed for the system were new, it was possible to modify some existing programs for use in the new system. file conversion was required from an existing profile tape used in the previous sdi (ibm package 1401-cx-01) to the format used by the new system. the greater capabilities of the new system also resulted in numerous modifications and expansions of the profiles. the profile master containing a description of user interests had just under 100 profiles representing 40 separate groups. most profiles were for groups rather than individuals; these were updated only as needed. the citation tape contained slightly over 8,000 journal and book citations per week. the internal citation tape contained 180 report citations per week. an average of 400 notices per weekly run for the external citations and 200 notices per weekly run for the internal citations were generated. system flows for the profile update and the citation runs are shown in figures 1 and 2. the language used for the system was cobol.

development personnel
1 analyst/programmer with three years cobol, two years autocoder, professional librarian with four years library experience, worked with ibm 1401, 1410 and cdc 3600
1 professional librarian with 15 years' experience in all phases of library work, knowledge of computers, but no programming or analysis experience
1 clerk-typist, ba in english, 60 wpm typist, self-taught keypunch operator, worked in library four years

equipment configuration
computer equipment
1 cdc 3600, 65 k (words), 8 bytes/word
8 cdc 604 tape drives, 200/500/800 bpi, 7 track, 37.5 inches per sec.
2 cdc 861 magnetic drums at 4.2 million characters, 17 ms access time, 2 million cps transfer rate
1 cdc 405 card reader, photoelectric, 1200 cpm
1 cdc 415 card punch, 250 cpm
2 cdc 501 printers, 1000 lpm, 64 char. print set, 136 char. line
1 cdc 3601 console
non-computer equipment
1 ibm 026 keypunch (no special features)
1 decollator
1 burster
1 hand perforator
materials
standard (14½ x 11), lined, computer printer paper
blank punched cards
magnetic tape subscription @ $5000./year
general considerations
how well the system attains its intended goals within the desired limits of design, development, and operating costs is the most important consideration. design and development costs are usually initial costs only, but operating costs continue as long as the system functions. operating costs must include the cost of data preparation, computer run times, cost of program maintenance, additional equipment costs, and cost of special forms or materials needed. careful consideration should be given to allowing sufficient money to be spent in design and development so that overall operating costs, especially those of data preparation and computer run times, can be reduced.
references
1. griffin, hillis l.: "estimating data processing costs in libraries," college and research libraries, 25 (sept. 1964), 400-03, 431.
2. fasana, paul j.: "determining the cost of library automation," a.l.a. bulletin, 61 (june 1967), 656-61.
3. landau, herbert b.: "the cost analysis of document surrogation: a literature review," american documentation, 20 (oct. 1969), 320-310.
4. gregory, robert h.; van horn, richard l.: automatic data-processing systems: principles and procedures (belmont, ca: wadsworth, 1963).
5. hammer, donald p.: "problems in the conversion of bibliographic data: a keypunching experiment," american documentation, 19 (jan. 1968), 12-17.
article
bridging the gap: using linked data to improve discoverability and diversity in digital collections
jason boczar, bonita pollock, xiying mi, and amanda yeslibas
information technology and libraries | december 2021 https://doi.org/10.6017/ital.v40i4.13063
jason boczar (jboczar@usf.edu) is digital scholarship and publishing librarian, university of south florida. bonita pollock (pollockb1@usf.edu) is associate director of collections and discovery, university of south florida. xiying mi (xmi@usf.edu) is digital initiative metadata librarian, university of south florida. amanda yeslibas (ayesilbas@usf.edu) is e-resource librarian, university of south florida. © 2021.
abstract
the year of covid-19, 2020, brought unique experiences to everyone in their daily as well as their professional life. facing many challenges of division in all aspects (social distancing, political and social divisions, remote work environments), university of south florida libraries took the lead in exploring how to overcome these various separations by providing access to its high-quality information sources to its local community and beyond. this paper shares the insights of using linked data technology to provide easy access to digital cultural heritage collections not only for the scholarly communities but also for underrepresented user groups. the authors present the challenges of this special time in history, discuss possible solutions, and propose future work to further the effort.
introduction
we are living in a time of division. many of us are adjusting to a new reality of working separated from our colleagues and the institutions that formerly brought us together physically and socially due to covid-19. even if we can work in the same physical locale, we are careful and distant with each other. our expressions are covered by masks, and we take pains with hygiene that might formerly have felt offensive.
but the largest divisions and challenges being faced in the united states go beyond our physical separation. the nation has been rocked and confronted by racial inequality in the form of black lives matter, a divisive presidential campaign, income inequality exacerbated by covid-19, the continued reckoning with the #metoo movement, and the wildfires burning the west coast. it feels like we are burning both literally and metaphorically as a country. adding fuel to this fire is the consumption of unreliable information. ironically, even as our divisions become more extreme, we are increasingly connected and tuned into news via the internet. sadly, fact checking and sources are few and far between on social media platforms, where many are getting their information. the pew foundation report the future of truth and misinformation online warns that we are on the verge of a very serious threat to the democratic process due to the prevalence of false information. lee rainie, director of the pew research center’s internet and technology project, warns, “a key tactic of the new anti-truthers is not so much to get people to believe in false information. it’s to create enough doubt that people will give up trying to find the truth, and distrust the institutions trying to give them the truth.”1 libraries and other cultural institutions have moved very quickly to address and educate their populations and the community at large, trying to give a voice to the oppressed and provide reliable sources of information. the university of south florida (usf) libraries reacted by expanding antiracism holdings.
usf’s purchases were informed by work at other institutions, such as the university of minnesota’s antiracism reading lists, which has in turn grown into a rich resource that includes other valuable resources like the mapping prejudice project and a link to the umbra search.2 the triad black lives matter protest collection at the university of north carolina greensboro is another example of a cultural institution reacting swiftly to document, preserve, and educate.3 these new pages and lists being generated by libraries and cultural institutions seem to be curated by hand using tools that require human intervention to make them and keep them up to date. this is also a challenge the usf libraries faced when constructing its new african american experience in florida portal, a resource that leverages already existing digital collections at usf to promote social justice. another key challenge is linking new digital collections and tools to already established collections and holdings. beyond the new content being created in reaction to current movements, there is already a wealth of information established in rich archives of material, especially regarding african american history. digital collections need to be discoverable by a wide audience to achieve resource sharing and educational purposes. this is a challenge many digital collections struggle with, because they are often being siloed from library and archival holdings even within their own institutions. all the good information in the world is not useful if it is not findable. an example of a powerful discovery tool that is difficult to find and use is the umbra search (https://www.umbrasearch.org/) linked to the university of minnesota’s anti-racism reading list. 
umbra search is a tool that aggregates content from more than 1,000 libraries, archives, and museums.4 it is also supported by high-profile grants from the institute of museum and library services, the doris duke charitable foundation, and the council on library and information resources. however, the website is difficult to find in a web search. umbra search was named after society of umbra, a collective of black poets from the 1960s. the terms umbra and society of umbra do not return useful results for finding the portal, nor do broader searches on african american history; the portal is difficult to find through basic web searches. one of the few chances for a user to find the site is to come upon the human-made link in the university of minnesota anti-racism reading list. despite enthusiasm from libraries and other cultural institutions, new purchases and curated content are not going to reach the world as fully as hoped. until libraries adopt open data formats instead of locking away content in closed records like marc, library and digital content will remain siloed from the internet. the library catalog and digital platforms are even siloed from each other. we make records and enter metadata that is fit for library use but not shareable to the web. as karen coyle asked in her lita keynote address a decade ago, the question is how can libraries move from being “on the web” to being “of the web”?5 the suggested answer, and the answer the usf libraries are researching, is linked data.
literature review
the literature on linked data for libraries and cultural heritage resources reflects an implementation that is “gradual and uneven.” as national libraries across the world and the library of congress develop standards and practices, academic libraries are still trying to understand their role in implementation and identify their expertise.6 in 2006 tim berners-lee, the creator of the semantic web concept, outlined four rules of linked data:
1. use uris as names for things.
2. use http uris so that people can look up those names.
3. when someone looks up a uri, provide useful information, using the standards (rdf, sparql).
4. include links to other uris so that they can discover more things.7
it was not too long after this that large national libraries began exploring linked data and experimenting with uses. in 2010 the british library presented its prototype of linked data. this move was made in accordance with the uk government’s commitment to transparency and accountability along with the user’s expectation that the library would keep up with cutting edge trends.8 today the british library has released the british national bibliography as linked data instead of the entire catalog because it is authoritative and better maintained than the catalog.9 the national libraries of europe, spurred on by government edicts and europeana (https://www.europeana.eu/en), are leading the progress in implementation of linked data.
national libraries are uniquely suited to the development and promotion of new technologies because of their place under the government and proximity to policy making, their role in bridging communication between interested parties, and their ability to make projects into sustainable services.10 a 2018 survey of all european national libraries found that 15 had implemented linked data, two had taken steps for implementation and three intended to implement it. even national libraries that were unable to implement linked data were contributing to the linked open data cloud by providing their data in datasets to the world.11 part of the difficulty with earlier implementation of linked data by libraries and cultural heritage institutions was the lack of a “killer example” that libraries could emulate.12 the relatively recent success of european national libraries might provide those examples. many other factors have slowed the implementation of linked data. a survey of norwegian libraries in 2009 found a considerable gap in the semantic web literature between the research undertaken in the technological field and the research in the socio-technical field. implementing linked data requires reorganization of the staff, commitment of resources, education throughout the library and buy-in from the leadership to make it strategically important.13 the survey of european national libraries cited the exact same factors as limitations in 2018.14 outside of european national libraries the implementation of linked data has been much slower.
many academic institutions have taken on projects that tend to languish in a prototype or proof of concept phase.15 the library-centric talis group of the united kingdom “embraced a vision of developing an infrastructure based on semantic web technologies” in 2006, but abandoned semantic web-related business activities in 2012.16 it has been suggested that it is premature to wholly commit to linked data, but it should be used for spin-off projects in an organization for experimentation and skill development.17 linked data is also still proving to be technologically challenging for implementation of cultural heritage aggregators. if many human resources are needed to facilitate linked data, it will remain an obstacle for cultural heritage aggregators. a study has shown automated interpretation of ontologies is hampered by a lack of inter-ontology relations. cross-domain applications will not be able to use these ontologies without human intervention.18 aiding in the development and awareness of linked data practices for libraries is the creation and implementation of bibframe by the library of congress. the library of congress’s announcement in july 2018 that bibframe would be the replacement of marc definitively shows that the future of library records is focused on linking out and integrating into the web.19 the new rda (resource description and access) cataloging standards made it clear that marc is no longer the best encoding language for making library resources available on the web.20 while rda has adapted the cataloging rules to meet a variety of new library environments, the marc encoding language makes it difficult for computers to interpret and apply logic algorithms to the marc format.
in response, the library of congress commissioned the consulting agency zepheira to create a framework that would integrate with the web and be flexible enough to work with various open formats and technologies, as well as be able to adapt to change. using the principles and technologies of the open web, the bibframe vocabulary is made of “resource description framework (rdf) properties, classes, and relationships between and among them.”21 eric miller, the ceo of zepheira, says bibframe “works as a bridge between the description component and open web discovery. it is agnostic with regards to which web discovery tool is employed” and though we cannot predict every technology and application bibframe can “rely on the ubiquity and understanding of uris and the simple descriptive power of rdf.”22 the implementation of linked data in the cultural heritage sphere has been erratic but seems to be moving forward. it is important to pursue, though, because “bringing local data out of the ‘deep web’ and making them open and universally accessible, means offering minority cultures a democratic opportunity for visibility.”23
linked data
linked data is one way to increase the access and discoverability of critical digital cultural heritage collections. also referred to as semantic web technologies, linked data follows the w3c resource description framework (rdf) standards.24 according to tim berners-lee, the semantic web will bring structure and well-defined meaning to web content allowing computers to perform more automated processes.25 by providing structure and meaning to digital content, information can be more readily and easily shared between institutions. this provides an opportunity for digital cultural heritage collections of underrepresented populations to get more exposure on the web. following is a brief overview of linked data to illustrate how semantic web technologies function. linked data is created by forming semantic triples.
each rdf triple contains uniform resource identifiers or uris. these identifiers allow computers (machines) to “understand” and interpret the metadata. each rdf triple consists of three parts: a subject, a predicate, and an object. the subject defines what the metadata rdf triple is about, while the object contains information about the subject which is further defined by the relationship link in the predicate.
figure 1. example of a linked data rdf triple describing william shakespeare’s authorship of hamlet.
for example, in figure 1, “william shakespeare wrote hamlet” is a triple. the subject and predicate of the triple are written as uris containing the identifier information and the object of the triple is a literal piece of information. the subject of the triple, william shakespeare, has an identifier which in this example links to the library of congress name authority file for william shakespeare. the predicate of the rdf triple describes the relationship between the subject and object. the predicate also typically defines the metadata schema being used. in this example, dublin core is the metadata schema being used, so “wrote” would be identified by the dublin core creator field. the object of this semantic triple, hamlet, is a literal. literals are text that is not linked because it does not have a uri. subjects and predicates always have uris to allow the computer to make links. the object may have a uri or be a literal. together these uris, along with the literal, tell the computer everything it needs to know about this piece of metadata, making it self-contained. rdf triples with their uris are stored in a triple-store graph style database which functions differently from a typical relational database. relational databases rely on table headers to define the metadata stored inside.
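to make the structure of the figure 1 triple concrete, the following is a minimal python sketch, not the authors' implementation. it models a triple as three fields plus a flag marking a literal object, and prints it in the w3c n-triples line format. the `Triple` class and `to_ntriples` helper are hypothetical names introduced here, and the two uris are illustrative assumptions (a library of congress name authority identifier and the dublin core creator property) rather than values taken from the figure.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str                  # always a URI
    predicate: str                # always a URI; also signals the schema in use
    obj: str                      # either a URI or a literal string
    obj_is_literal: bool = False  # True when obj is plain text with no URI


def to_ntriples(t: Triple) -> str:
    """Serialize one triple as a single N-Triples line."""
    if t.obj_is_literal:
        # Escape backslashes and double quotes inside the literal.
        escaped = t.obj.replace("\\", "\\\\").replace('"', '\\"')
        obj = f'"{escaped}"'
    else:
        obj = f"<{t.obj}>"
    return f"<{t.subject}> <{t.predicate}> {obj} ."


# Illustrative URIs (assumptions for this sketch, not copied from figure 1).
SHAKESPEARE = "http://id.loc.gov/authorities/names/n78095332"
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"

hamlet = Triple(SHAKESPEARE, DC_CREATOR, "Hamlet", obj_is_literal=True)
print(to_ntriples(hamlet))
```

because the subject and predicate travel with the value, the single printed line carries everything a machine needs to interpret it, which is the sense in which a triple is self-contained.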
moving data between relational databases can be complex because tables must be redefined every time data is moved. graph databases don’t need tables since all the defining information is already stored in each triple. this allows for bidirectional flow of information between pieces of metadata and makes transferring data simpler and more efficient.26 information in a triple-store database is then retrieved using sparql, a query language developed for linked data. because linked data is stored as self-contained triples, machines have all the information needed to process the data and perform advanced reasoning and logic programming. this leads to better search functionality and lends itself well to artificial intelligence (ai) technologies. many of today’s modern websites make use of these technologies to enhance their displays and provide greater functionality for their users. the internet is an excellent avenue for libraries to un-silo their collections and make them globally accessible. once library collections are on the web, advanced keyword search functionalities and artificial intelligence machine learning algorithms can be developed to automate metadata creation workflows and enhance search and retrieval of library resources. the use of linked data metadata in these machine-learning functions will add a layer of semantic understanding to the data being processed and analyzed for patron discovery. ai technology can also be used to create advanced graphical displays making connections for patrons between various resources on a research topic. sharing digital cultural heritage data with other institutions often involves transferring data and is considered one of the greatest difficulties in sharing digital collections.
for example, if one institutional repository uses dublin core to store its metadata for a certain cultural heritage collection and another repository uses mods/mets to store digital collections, there must first be a data conversion before the two repositories could share information. dublin core and mods/mets are two completely different schemas with different fields and metadata standards. these two schemas are incompatible with each other and must be crosswalked into a common schema. this typically results in some data loss during the transformation process. this makes combining two collections from different institutions into one shared web portal difficult. linked data allows institutions to share collections more easily. because linked data triples are self-contained, there is no need to crosswalk metadata stored in triples from one schema into another when transferring data. the uris contained in the rdf triples allow the computer to identify the metadata schema and process the metadata. rdf triples can be harvested from one linked data system and easily placed into another repository or web portal. a variety of schemas can all be stored together in one graph database. storing metadata in this way increases the interoperability of digital cultural heritage collections. collections stored in triple-store databases have sparql endpoints that make harvesting the metadata in a collection more efficient. libraries can easily share metadata on important collections increasing the exposure and providing greater access for a wider audience. philip schreur, author of “bridging the worlds of marc and linked data,” sums this concept up nicely: “the shift to the web has become an inextricable part of our day-to-day lives. 
by moving our carefully curated metadata to the web, libraries can offer a much-needed source of stable and reliable data to the rapidly growing world of web discovery.”27 linked data also makes it easier to harvest metadata and import collections into larger cultural heritage repositories like digital public library of america (dpla) which uses linked data to “empower people to learn, grow, and contribute to a diverse and better-functioning society by maximizing access to our shared history, culture, and knowledge.”28 europeana, the european cultural heritage database, uses semantic web technologies to support its mission which is to “empower the cultural heritage sector in its digital transformation.”29 using linked data to transfer data into these national repositories is more efficient and there is less loss of data because the triples do not have to be transformed into another schema. this increases the access of many cultural heritage collections that might not otherwise be seen. one of the big advantages to linked data is the ability to create connections between other cultural heritage collections worldwide via the web. incorporating triples harvested from other collections into the local datasets enables libraries to display a vast amount of information about cultural heritage collections in their web portals. libraries thus can provide a much richer display and allow users access to a greater variety of resources. linked data also allows web developers to use uris to implement advanced search technologies creating a multifaceted search environment for patrons. current research points to the fact that using semantic web technologies makes the creation of advanced logic and reasoning functionalities possible. according to liyang yu in the book introduction to the semantic web and semantic web services, “the semantic web is an
it is constructed by linking current web pages to a structured data set that indicates the semantics of this linked page. a smart agent, which is able to understand this structure data set, will then be able to conduct intelligent actions and make educated decisions on a global scale.”30 many digital cultural heritage collections in libraries live in siloed resources and are therefore only accessible to a small population of users. linked data helps to break down traditional library silos in these collections. by using linked data, an institution can expand the interoperability of the collection and make it more easily accessible. many institutions are starting to incorporate linked data technologies into digital collections, thereby increasing the ability for institutions to share collections. this allows for a greater audience to have access to critical cultural heritage collections for underrepresented populations. in the article “bridging the worlds of marc and linked data,” the author states, “the shift to linked data within this closed world of library resources will bring tremendous advantages to discovery both within a single resource … as well as across all the resources in your collections, and even across all of our collective collections. but there are other advantages to moving to linked data. through the use of linked data, we can connect to other trusted sources on the web.… we can also take advantage of a truly international web environment and reuse metadata created by other national libraries.”31 university of south florida libraries practice university of south florida libraries digital collections house a rich collection varying from cultural heritage objects to natural science and environment history materials to collections related to underrepresented populations. most of the collections are unique to usf and have significant research and educational value. 
the library is eager to share the collections as widely as possible and hopes the collections can be used at both document and data level. linked data creates a “web of data” instead of a “web of documents,” which is the key to bringing structure and meaning to web content, allowing computers to better understand the data. however, collections are mostly born at the document level. therefore, the first problem librarians need to solve is how to transform the documents to data. for example, there is a beautiful natural history collection called audubon florida tavernier research papers in usf libraries digital collections. the audubon florida tavernier research papers is an image collection which includes rookeries, birds, people, bodies of water, and man-made structures. the varied images come from decades of research and are a testament to the interconnectedness of bird population health and human interaction with the environment. the images reflect the focus of audubon’s work in research and conservation efforts both to wildlife and to the habitat that supports the wildlife.32 this was selected to be the first collection the authors experimented with to implement linked data at usf libraries. the lessons learned from working with this collection are applied to later work. when the collection was received to be housed in the digital platform, it was carefully analyzed to determine how to pull the data out of all the documents as much as possible. the authors designed a metadata schema of the combination of mods and darwin core (darwin core, abbreviated to dwc, is an extension of dublin core for biodiversity informatics) to pull out and properly store the data.
figure 2. american kestrel.
figure 3. american kestrel metadata.
figure 2 is one of the documents in the collection, which is a photo of an american kestrel. figure 3 shows the data collected from the document and the placement of the data in the metadata schema. the authors put the description of the image in free text in the abstract field. this field is indexed and searchable through the digital collections platform. location information is put in the hierarchical spatial field. the subject heading fields describe the “aboutness” of the image, that is, what is in the image. all the detailed information about the bird is placed in darwin core fields. thus, the document is disassembled into a few pieces of data which are properly placed into metadata fields where they can be indexed and searched. having data alone is not sufficient to meet linked data requirements. the first of the four rules of linked data is to name things using uris.33 to add uris to the data, the authors needed to standardize the data and reconcile it against widely-used authorities such as library of congress subject headings, wikidata, and the getty thesaurus of geographic names. standardized data tremendously increases the share of terms that reconcile successfully, which will lead to more links with related data once published.
figure 4. amenaprkitch khachkar.
figure 4 shows an example from the armenia heritage and social memory program. this is a visual materials collection with photos and 3d digital models. it was created by the digital heritage and humanities collection team at the library. the collection brings together comprehensive information and interactive 3d visualization of the fundamentals of armenian identity, such as their architectures, languages, arts, etc.34 when preparing the metadata for the items in this collection, the authors devoted extra effort to adding geographic location metadata.
this effort serves two purposes: one is to respectfully and honestly include the information in the collection; and the second is to provide future reference to the location of each item as the physical items are in danger and could disappear or be ruined. the authors employed the getty thesaurus of geographic names because it supports a hierarchical location structure. the location names at each level can be reconciled and have their own uris. the authors also paid extra attention to the subject headings. figure 5 shows how the authors used library of congress subject headings, local subject headings assigned by the researchers, and the getty art and architecture thesaurus for this collection. in the data reconciliation stage, the metadata can be compared against both library of congress subject headings authority files and the getty aat vocabularies so that as many uris as possible can be fetched and added to the metadata. the focus on geographic names and subject headings is to standardize the data and use controlled vocabularies as much as possible. when the collection moves to linked data, the standardized values will be ready to have uris attached, so the data can be linked easily and widely.
figure 5. amenaprkitch khachkar metadata.
one of the goals of building linked data is to make sense out of data and to generate new knowledge.
as the librarians explored how to bring together multiple usf digital collections to highlight african american history and culture, three collections seemed particularly appropriate:
• an african american sheet music collection from the early 20th century (https://digital.lib.usf.edu/sheet-music-aa)
• the “narratives of formerly enslaved floridians” collection from the 1930s (https://digital.lib.usf.edu/fl-slavenarratives)
• the “otis r. anthony african american oral history collection” from 1978-1979 (https://digital.lib.usf.edu/ohp-otisanthony)
these collections are all oral expressions of african american life in the us. they span the first three-quarters of the 20th century around the time of the civil rights movement. creating linked data out of these collections will help shed light on the life of african americans through the 20th century and how it related to the civil rights movement. with semantic web technology support, these collections can be turned into machine actionable datasets that assist research and education activities on racism and anti-racism and contribute to a holistic knowledge base. usf libraries started to partner with dpla in 2018. dpla leverages linked data technology to increase discoverability of the collections contributed to it. dpla employs javascript object notation for linked data (json-ld) as the serialization for its data, which is in rdf/xml format. json-ld has a method of identifying data with iris. the use of this method can effectively avoid data ambiguity considering dpla is holding a fairly large amount of data. json-ld also provides computational analysis in support of semantic services, which enriches the metadata and, as a result, makes search more effective.35 in the 18 months since usf began contributing selected digital collections to dpla, usf materials have received more than 12,000 views.
it is exciting to see the increase in the usage of the collections and it is the hope that they will be accessed by more diverse user groups. usf libraries are exploring ways to scale up the project and eventually transition all the existing digital collections metadata to linked data. one possible way of achieving this goal would be through metadata standardization. a pilot project at usf libraries is processing one medium-size image collection of 998 items. the original metadata is in mods/mets xml files. we first decided to use the dpla metadata application profile as the data model. if the pilot project is successful, we will apply this model to all of our linked data transformation processes. in our pilot, we are examining the fields in our mods/mets metadata and identifying those that will be meaningful in the new metadata schema. then we transfer the metadata in those fields to excel files. the next step is to use openrefine to reconcile the data in these excel files and fetch uris for exact-match terms. during this step, we are employing reconciliation services from the library of congress, getty tgn, and wikidata. after all the metadata is reconciled, we transform the excel file into triples. the column headers of the excel file become the predicates, and the metadata values and their uris become the objects of the triples. next, these triples will be stored in an apache jena triple-store database so that we can start designing sparql queries to facilitate search. the final step will be designing a user-friendly interface to further optimize the user experience.
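the spreadsheet-to-triples step described above can be sketched roughly as follows. this is a minimal illustration, not the usf workflow itself: the column names, the mapping to dublin core terms, the example.org namespace, the identifier column used as the subject, and the authority uris in the sample rows are all placeholders assumed for the example. it reads a csv export with the python standard library and writes n-triples lines, treating any cell that already holds a uri as a reconciled term.

```python
import csv
import io

# hypothetical mapping from spreadsheet column headers to predicate uris
PREDICATES = {
    "title": "http://purl.org/dc/terms/title",
    "subject": "http://purl.org/dc/terms/subject",
    "spatial": "http://purl.org/dc/terms/spatial",
}
BASE = "https://example.org/item/"  # placeholder namespace, not a usf uri


def escape_literal(text):
    """Escape backslashes, quotes, and line breaks for an N-Triples literal."""
    return (text.replace("\\", "\\\\").replace('"', '\\"')
                .replace("\n", "\\n").replace("\r", "\\r"))


def rows_to_ntriples(csv_text):
    """Turn each row into triples: row id -> subject, header -> predicate."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = f"<{BASE}{row['id']}>"
        for column, predicate in PREDICATES.items():
            value = (row.get(column) or "").strip()
            if not value:
                continue  # nothing recorded in this cell
            # a reconciled cell holds a uri; anything else stays a literal
            if value.startswith(("http://", "https://")):
                obj = f"<{value}>"
            else:
                obj = f'"{escape_literal(value)}"'
            triples.append(f"{subject} <{predicate}> {obj} .")
    return triples


# made-up rows; the authority uris below are placeholders, not real records
SAMPLE = '''id,title,subject,spatial
1,sample photograph,http://id.loc.gov/authorities/subjects/sh00000000,
2,"second ""quoted"" item",,http://vocab.getty.edu/tgn/0000000
'''

for line in rows_to_ntriples(SAMPLE):
    print(line)
```

the resulting lines could then be loaded into a triple store; terms that did not reconcile to a uri remain plain literals, which mirrors the exact-match-only approach taken in the pilot.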
in this process, to make the workflow as scalable as possible, we are focusing on testing two processes: first, creating a universal metadata application profile to apply to most, if not all, of the collections; and second, only fetching uris for exactly matching terms during the reconciliation process. both of these processes aim to reduce human interactions with the metadata so that the process is more affordable to the library.

conclusion and future work

linked data can help collection discoverability. in the past six months, usf has seen an increase in materials going online. the usf special collections department rapidly created digital exhibits to showcase its materials. if the trend in remote work continues, there is reason to believe that digital materials may be increasingly present and, given enough time and expertise, libraries can leverage linked data to better support current and new collections. the societal impact of covid-19 worldwide sheds light on the importance of technologies such as linked data that can help increase discoverability. when items are being created and shared online, either directly related to covid-19 or a result of its impact, linked data can help connect those resources. for instance, new covid-19 research is being developed and published daily. the publications office of the european union datathon entry “covid-19 data as linked data” states that “[t]he benefit of having covid-19 data as linked data comes from the ability to link and explore independent sources. for example, covid-19 sources often do not include other regional or mobility data.
then, even the simplest thing, having the countries not as a label but as their uri of wikidata and dbpedia, brings rich possibilities for analysis by exploring and correlating geographic, demographic, relief, and mobility data.”36 the more institutions that contribute to this, the greater the discoverability and impact of the data.

in 2020 there has been an increase in black lives matter awareness across the country. this affects higher education. usf libraries are not the only ones engaged in addressing racial disparities. many institutions have been doing this for years. others are beginning to focus on this area. no matter whether it’s a new digital collection or one that’s been around for decades, the question remains: how do people find these resources? perhaps linked data technologies can help solve that problem. linked data is a technology that can help accentuate the human effort put forth to create those collections. linked data is a way to assist humans and computers in finding interconnected materials around the internet.

usf libraries faced many obstacles implementing linked data. there is a technological barrier that takes well-trained staff to surmount, i.e., creating a linked data triple store database and having linked data interact correctly on webpages. there is a time commitment necessary to create the triples and sparql queries. sparql queries themselves vary from relatively simple to incredibly complicated. the authors also faced the stumbling block of understanding how linked data works on a theoretical level. taking all of these considerations into account, we can say that creating linked data for a digital collection is not for the faint of heart. a cost/benefit analysis must be undertaken and assessed. the authors of this paper must continue to determine the need for linked data.

at usf, the authors have taken the first steps in converting digital collections into linked data.
we’ve moved from understanding the theoretical basis of linked data into the practical side where the elements that make up linked data start coming together. the work to create triples, sparql queries, and uris has begun, and full implementation has started. our linked data group has learned the fundamentals of linked data. the next, and current, step is to develop workflows for converting existing metadata into appropriate linked data. the group meets regularly and has created a triple store database and converted data into linked data. while the process is slow moving due to group members’ other commitments, progress is being made by looking at the most relevant collections we would like to transform and moving forward from there. we’ve located the collections we want to work on, taking an iterative approach to creating linked data as we go.

with linked data, there is a lot to consider. how do you start up a linked data program at your institution? how will you get the required expertise to create appropriate and high-quality linked data? how will your institution crosswalk existing data into triples format? is it worth the investment? it may be difficult to answer these questions, but they must be addressed. the usf libraries will continue pursuing linked data in meaningful ways and showcasing linked data’s importance. linked data can help highlight all collections but more importantly those of marginalized groups, which is a priority of the linked data group.

endnotes

1 peter perl, “what is the future of truth?” pew trust magazine, february 4, 2019, https://www.pewtrusts.org/en/trust/archive/winter-2019/what-is-the-future-of-truth.
2 “anti-racism reading lists,” university of minnesota library, accessed september 24, 2020, https://libguides.umn.edu/antiracismreadinglists.
3 “triad black lives matter protest collection,” unc greensboro digital collections, accessed december 9, 2020, http://libcdm1.uncg.edu/cdm/blm.
4 “umbra search african american history,” umbra search, accessed december 10, 2020, https://www.umbrasearch.org/.
5 karen coyle, “on the web, of the web” (keynote at lita, october 1, 2011), https://kcoyle.net/presentations/lita2011.html.
6 donna ellen frederick, “disruption or revolution? the reinvention of cataloguing (data deluge column),” library hi tech news 34, no. 7 (2017): 6–11, https://doi.org/10.1108/lhtn-072017-0051.
7 tim berners-lee, “linked data,” w3, last updated june 18, 2009, https://www.w3.org/designissues/linkeddata.html.
8 neil wilson, “linked data prototyping at the british library” (paper presentation, talis linked data and libraries event, 2010).
9 diane rasmussen pennington and laura cagnazzo, “connecting the silos: implementations and perceptions of linked data across european libraries,” journal of documentation 75, no. 3 (2019): 643–66, https://doi.org/10.1108/jd-07-2018-0117.
10 jane hagerlid, “the role of the national library as a catalyst for an open access agenda: the experience in sweden,” interlending and document supply 39, no. 2 (2011): 115–18, https://doi.org/10.1108/02641611111138923.
11 pennington and cagnazzo, “connecting the silos,” 643–66.
12 gillian byrne and lisa goddard, “the strongest link: libraries and linked data,” d-lib magazine 16, no. 11/12 (2010): 2, https://doi.org/10.1045/november2010-byrne.
13 bendik bygstad, gheorghita ghinea, and geir-tore klæboe, “organisational challenges of the semantic web in digital libraries: a norwegian case study,” online information review 33, no. 5 (2009): 973–85, https://doi.org/10.1108/14684520911001945.
14 pennington and cagnazzo, “connecting the silos,” 643–66.
15 heather lea moulaison and anthony j. million, “the disruptive qualities of linked data in the library environment: analysis and recommendations,” cataloging & classification quarterly 52, no. 4 (2014): 367–87, https://doi.org/10.1080/01639374.2014.880981.
16 marshall breeding, “linked data: the next big wave or another tech fad?” computers in libraries 33, no. 3 (2013): 20–22.
17 moulaison and million, “the disruptive qualities of linked data,” 369.
18 nuno freire and sjors de valk, “automated interpretability of linked data ontologies: an evaluation within the cultural heritage domain” (workshop, ieee conference on big data, 2019).
19 “bibframe update forum at the ala annual conference 2018” (washington, dc: library of congress, july 2018), https://www.loc.gov/bibframe/news/bibframe-update-an2018.html.
20 jacquie samples and ian bigelow, “marc to bibframe: converting the pcc to linked data,” cataloging & classification quarterly 58, no. 3–4 (2020): 404.
21 oliver pesch, “using bibframe and library linked data to solve real problems: an interview with eric miller of zepheira,” the serials librarian 71, no. 1 (2016): 2.
22 pesch, 2.
23 gianfranco crupi, “beyond the pillars of hercules: linked data and cultural heritage,” italian journal of library, archives & information science 4, no. 1 (2013): 25–49, http://dx.doi.org/10.4403/jlis.it-8587.
24 “resource description framework (rdf),” w3c, february 25, 2014, https://www.w3.org/rdf/.
25 tim berners-lee, james hendler, and ora lassila, “the semantic web,” scientific american 284, no. 5 (2001): 34–43, https://www.jstor.org/stable/26059207.
26 dean allemang and james hendler, “semantic web application architecture,” in semantic web for the working ontologist: effective modeling in rdfs and owl (saint louis: elsevier science, 2011): 54–55.
27 philip e. schreur and amy j. carlson, “bridging the worlds of marc and linked data: transition, transformation, accountability,” serials librarian 78, no. 1–4 (2020), https://doi.org/10.1080/0361526x.2020.1716584.
28 “about us,” dpla: digital public library of america, accessed december 11, 2020, https://dp.la/about.
29 “about us,” europeana, accessed december 11, 2020, https://www.europeana.eu/en/about-us.
30 liyang yu, “search engines in both traditional and semantic web environments,” in introduction to semantic web and semantic web services (boca raton: chapman & hall/crc, 2007): 36.
31 schreur and carlson, “bridging the worlds of marc and linked data.”
32 “audubon florida tavernier research papers,” university of south florida libraries digital collections, accessed november 30, 2020, https://lib.usf.edu/?a64/.
33 berners-lee, “linked data,” https://www.w3.org/designissues/linkeddata.html.
34 “the armenian heritage and social memory program,” university of south florida libraries digital collections, accessed november 30, 2020, https://digital.lib.usf.edu/armenianheritage/.
35 erik t. mitchell, “three case studies in linked open data,” library technology reports 49, no. 5 (2013): 26–43.
36 “covid-19 data as linked data,” publications office of the european union, accessed december 11, 2020, https://op.europa.eu/en/web/eudatathon/covid-19-linked-data.

article

the impact of covid-19 on the use of academic library resources

ruth sara connell, lisa c. wallis, and david comeaux

information technology and libraries | june 2021
https://doi.org/10.6017/ital.v40i2.12629

abstract

the covid-19 pandemic has greatly impacted higher education, including academic libraries. this paper compares the use of library resources (including interlibrary loan, website and discovery tool pageviews, database use, patron interactions, etc.) at three university libraries before and after the pandemic.
the latter parts of the 2019 and 2020 spring semesters are the time frames of focus, although two control time frames from earlier in those semesters are used to determine how the semesters differed when the coronavirus was not a factor. the institutions experienced similar patterns of use across many metrics.

introduction

the year 2020 will be remembered as the year of the novel coronavirus (covid-19). around the world, hundreds of thousands of people died from the disease, schools and businesses shut their doors, wearing masks in public places became commonplace, and unemployment soared.1 everyone and everything changed in ways large and small, and libraries were no exception. this study measures changes in use of library resources during time frames of covid-19 closures at three different institutions: louisiana state university, northeastern illinois university, and valparaiso university. these three universities vary in size (large to medium), control (two public, one private), basic carnegie classification (doctoral-very high research, master’s, and doctoral/professional), and setting (two primarily non-residential and one highly residential).2 despite their differences, these institutions experienced similar patterns of use across many categories.

key findings:
• the pandemic affected the three institutions studied on a continuum, with the least impact at the largest school, and the biggest drop in use seen at the smallest school.
• use of all three libraries’ websites as well as the discovery tools/catalogs and major databases decreased during the covid time frame.
• all three libraries experienced an increase in virtual communication.

background

louisiana state university

louisiana state university (lsu) is the flagship institution of louisiana and is one of only 22 prestigious universities nationwide holding land-grant, sea-grant, and space-grant status.
the main campus is in baton rouge, has the carnegie basic classification “doctoral universities: very high research activity,” is primarily non-residential, and has a student full-time equivalent (fte) of about 30,000.3 lsu library is the university’s main library and a center of campus life for both students and faculty, and it houses approximately three million physical items (print books, media, and serials).4 before the onset of covid-19, lsu library was open 24/5 (24 hours sunday-thursday) plus weekend hours, with a 24/7 schedule during finals. as of july 30, 2020, it had 100 employees, of which 39 were librarians.

ruth sara connell (ruth.connell@valpo.edu) is professor of library science & director of systems, valparaiso university. lisa c. wallis (l-wallis@neiu.edu) is associate dean of libraries and eresources & systems librarian, northeastern illinois university. david comeaux (davidcomeaux@lsu.edu) is systems and discovery librarian, louisiana state university. © 2021.

lsu library staff, like all nonessential lsu staff, last reported to work in person on monday, march 16. the previous day, the state’s governor issued a statewide stay-at-home order, restricting events and closing venues. that monday, library it employees hurriedly assisted other staff in preparing to work remotely. near the end of the day, the university’s president sent an email asking all employees to work from home until further notice. classes were canceled for the week to allow instructors to prepare for remote teaching. the following week, march 23, was the beginning of spring break. classes resumed online-only on march 30. the libraries remained closed through the duration of the spring semester. despite the closure, the libraries continued to serve patrons.
librarians continued to assist students through email and zoom-based sessions. the catalog and discovery systems remained available, and staff continued to fill article and book chapter requests through interlibrary loan and document delivery.

northeastern illinois university

northeastern illinois university (neiu) is a public, comprehensive, multicultural university located on the north side of chicago. it has an enrollment of approximately 7,000 students in undergraduate and graduate programs among three colleges: arts and sciences, education, and business. while currently classified in the “master’s colleges and universities: larger programs” category, neiu is undergoing a major enrollment shift due to the state of illinois’ failure to provide a budget during the period 2015–2017 and, now, the covid crisis.5 the spring 2019 fte was 4,644 (83% undergraduate), while in spring 2020, fte was 4,404 (80% undergraduate). the campus is primarily a commuter campus.

the neiu libraries offer library services at three locations in chicago. altogether, the three libraries house approximately 500,000 physical books, media, and serials.6 in spring 2020, the neiu libraries employed 12 people in positions requiring an mls—including the dean and associate dean of libraries—and 18 staff. the main library is typically open 92 hours per week.

neiu’s spring break typically falls at the same point in the semester every year, with the spring 2020 break scheduled to begin on saturday, march 14. the week prior to spring break, neiu’s president announced that the break would be extended by an extra week to allow instructors to move instruction from face-to-face teaching to alternatives. the neiu libraries closed their doors on wednesday, march 18 at 6 p.m., and library faculty and staff began working from home. the libraries were able to offer continued reference and instruction by chat, email, phone, and google meet and to fill article and book chapter requests electronically.
no physical materials were available, as the statewide delivery system supporting borrowing among academic libraries stopped its services on march 16. alternative instruction resumed on monday, march 30 for all students.

valparaiso university

valparaiso university (valpo) is a private, not-for-profit, highly residential, four-year university in northwest indiana. its carnegie basic classification is “doctoral/professional universities,” although it serves a largely undergraduate student population (90% of fte in spring 2020). the graduate programs serve fewer than 500 students. during the time frame of this study, valpo’s fte ranged from 3,449 (spring 2019) to 3,147 (spring 2020).7

there is one library on campus. the christopher center for library and information resources has approximately 450,000 items in its print collection.8 during the fall and spring semesters (excluding breaks), the library is open 113 hours per week. as of fall 2020, the library employed 19 people: 10 librarians (including the dean) and nine staff members. this is four fewer positions than before the pandemic; due to covid-19 funding cuts, three staff members were laid off and one open librarian position was eliminated.

valpo is unusual in that it has a two-week spring break, which always falls on the first two full weeks in march. in 2020, spring break coincided with the burgeoning covid-19 crisis. during the second week of spring break, campus administration announced that campus would remain open, but classes would go online immediately following the break, starting monday, march 16. on march 16, the christopher center library moved to reduced hours (open 67.5 hours per week). as the fallout from the pandemic rapidly unfolded, hours were further reduced four days later.
at the close of business on tuesday, march 24, in accordance with the state of indiana’s stay-at-home order, the physical building closed, although library faculty and staff continued to work from home.

literature review

preparing and responding to pandemics and other disasters

libraries have long understood the need to prepare for disasters and have chronicled their struggles with disasters both natural and man-made. library collections have been lost due to fires and floods, and libraries have been forced to drastically alter their service models in response. the literature on this topic includes surveys of libraries regarding their emergency preparedness, advice on preparing disaster recovery plans, and recovering or replacing lost collections.9

one particularly prescient article describes the work done at the university of minnesota’s school of public health.10 two librarians joined a task force to prepare the school to function through an influenza pandemic. they focused on two scenarios. one was the onset of a pandemic mid-semester, forcing schools to close for weeks or even through the duration of the semester. the other was an even longer (9- to 18-month) school closure. both scenarios included implementing social distancing practices now familiar to us all. the task force provided many recommendations to enable continuity of online teaching in the event of a pandemic, but none dealt specifically with libraries.

resource use during building disruptions

previous studies have examined the impact of building disruptions, generally due to renovations, on the use of library resources. in the studies reviewed, these libraries moved to temporary locations and still had some physical space available to students. this differs from the complete closure experienced by most libraries due to covid-19. typically, library services and resources are used less when normal building operations are disrupted, but there are exceptions.
in 1999, the library at eastern illinois university closed for 31 months. library services were relocated elsewhere. overall, the library experienced a “sharp drop in the use of library resources and services,” although one bright spot of growth was in interlibrary loan use, which went up by 16%.11 however, the authors note that this increase may be due to patrons placing requests for items owned by eiu that were difficult to access.

mcneese state university in louisiana started a multiyear library renovation in 2012, during which the library’s personnel, services, and a small subset of the collection were relocated to a ballroom in the student union. a 2016 article discusses effects on library services halfway through the disruption. in reviewing the literature, the author found, “the longer students and faculty are not allowed access to the library building, the more usage statistics such as circulation, interlibrary loan (ill), and instruction decrease.”12 mcneese experienced a 22% decrease in interlibrary loan requests (borrowing and lending), a 51% decrease in the use of e-books, and a 62% decline in reference transactions. the author noted that “nearly every library service experienced a precipitous decline.”13

pepperdine university bucked the trend of sharp decreases during their 2016–2017 renovation. they experienced a dramatic increase in the number of interlibrary loan requests, both for borrowing (33%) and lending (375%), likely related to their decision to join rapid-ill at the beginning of the renovation. conversely, there was a slight decrease (10%) in borrowing from the california statewide consortium. pepperdine’s e-book usage remained fairly steady and actually increased 3% during the disruption.
as expected, their in-person questions decreased while in a temporary facility, although chat and email reference questions increased by 30%.14

covid-19 impact

though the coronavirus was first discovered to have reached the united states in january 2020, it was not until late february that american colleges and universities began to implement travel restrictions for students and staff and to develop plans for potential closure.15 by early march 2020 the us department of education had developed guidelines for the coronavirus’ impact on distance learning and financial aid.16 the topic was also the subject of numerous articles on higher education websites such as the chronicle of higher education and inside higher ed.

in march 2020, when the impact of covid-19 began to reverberate around the united states, virtually all libraries closed.17 the american library association (ala) conducted a survey of libraries of all types in may and reported that the majority of academic libraries had already lost funding, or anticipated losing funding within the next year, for staff, new hires, professional development, print collections, programs, and services.18 the press release for the ala survey stated that “survey respondents shared leaps in the use of digital content, online learning, and virtual programs.”19 however, a review of the survey instrument reveals that no questions were asked about use of these resources, so it is unclear where this assertion comes from.20 the survey included opportunities to leave comments, so it may be that the leaps in use mentioned in the ala press release are anecdotal. although libraries have never faced a global pandemic in modern times, previous research indicates that library building disruptions generally reduce use rather than increase it as described in the press release.
academic libraries were among the last facilities to close on many campuses, as they were viewed as essential for students.21 however, as early as the week of march 9, 7% of academic libraries reported that they had stopped circulating physical materials. in addition, building and face-to-face reference desk staffing began to be scaled back, with 28% of reporting libraries doing no face-to-face reference by the end of that week.22 complete closure of academic library physical facilities became the norm by mid-march, with vocal advocacy for that measure expressed by both the american library association and the association of college and research libraries.23

apart from library instruction, the coronavirus did not so much force academic libraries to move online as to temporarily suspend physical and in-person services. by 2020, provision of online services—including book, article, and media circulation and email and chat reference—was nothing new for libraries. recent years have seen increased migration to cloud-based systems, eliminating the need for library work to be done using specific client software on library staff computers. academic libraries and vendors alike promoted the ability of their institutions and computer systems to handle the covid-19 crisis with minimal disruption.24 the international coalition of library consortia (icolc) issued a statement on march 13, 2020, asking vendors to lift many of the usual licensing restrictions and opening access to the 391 million students affected by school and library shutdowns.25 publishers and vendors quickly began to remove paywalls between users and their online collections, either for free without library mediation or upon library request.
icolc followed up by starting a comprehensive list of these materials on march 16, 2020.26

despite increased access to online collections, library resource use was disrupted. no matter how many services can be offered online, plenty of students and faculty still use traditional items such as print books. given the circumstances of covid-19, with sudden and lasting limits on access to physical materials and space, academic libraries began to promote online equivalents of these, such as the hathitrust emergency temporary access service (etas), which penn state reported offered “reading access to more than 48% of the libraries’ print collections.”27 upon seeing the collected mass of print books students returned prior to leaving campus due to covid-19, librarian nora dimmock of brown university’s john jay library identified the need to move “more intentionally” to e-books over print books in future purchasing decisions.28

methodology

in order to determine whether the covid-19 pandemic affected use of library systems and resources, the researchers compared usage statistics from a covid-affected time frame in 2020 to the same time frame in 2019. these will be called the covid 2019 and covid 2020 time frames (or covid time frames collectively). because there could be other factors influencing use, such as differences in student fte between the two years, the researchers also pulled comparison data from control time frames earlier in the spring semesters. these control time frames were unaffected by covid in both 2019 and 2020. these will be called the control 2019 and control 2020 time frames (or control time frames collectively). by including data from the control time frames, we were able to determine whether there were trends affecting usage differences before the pandemic hit. as an example, lsu’s catalog pageviews were down 5% from the previous year during the first part of the semester unaffected by covid-19.
for the latter part of the semester affected by covid-19, catalog use fell 25% as compared to the previous year. the control time frame comparison shows that catalog use was already in decline before the pandemic; therefore, some of the 25% decline is likely due to factors other than covid-19. the baseline-factored difference is −20% (−5% + x = −25%; x = −20%). figures 2, 4, and 6 illustrate the percentage differences in covid time frames from the determined baselines.

each of the three researchers determined the control and covid time frames for their own institutions. because the academic calendars vary widely between the three institutions, the time frames of study also differ. the absolutes for each institution were:
• the 2019 and 2020 control time frames were the same number of days
• the 2019 and 2020 covid time frames were the same number of days (but the control time frames were not the same length as the covid time frames)
• if a time frame from one year contained a special calendar event (e.g., spring break, mardi gras holiday, etc.), the researchers made sure the corresponding time frame from the other year also included that event.

table 1. dates of institutions’ control & covid comparison time frames (excluding ebsco and proquest)

institution | control time frame 2019 | control time frame 2020 | covid time frame 2019 | covid time frame 2020
lsu | monday, january 14 through friday, march 8 | monday, january 13 through friday, march 6 | monday, march 18 through saturday, april 27 | monday, march 23 through saturday, may 2
neiu | monday, january 7 through friday, march 15 | monday, january 6 through friday, march 13 | saturday, march 16 through sunday, may 5 | saturday, march 14 through sunday, may 3
valpo | wednesday, january 9 through friday, march 1 | wednesday, january 8 through friday, february 28 | monday, march 18 through tuesday, may 14 | monday, march 16 through tuesday, may 12

the primary concern was to ensure that each institution was comparing like time frames to like time frames within its own academic calendar. the variability of institutional calendars means that the data cannot be compared between institutions.

for the major database platforms, the control and covid time frames differ from the other areas measured due to limitations of the statistical platforms. ebsco and proquest platforms allow reports to be run on full months only; partial month statistics cannot be pulled. for that reason, for the major database platforms alone, the control time frame is january through february, and the covid time frame is april through may. because 2020 was a leap year, february 2020 had one more day than february 2019. this means that the 2019 control time frame had 59 days while the 2020 control time frame had 60 days. the extra day could account for an increase in use of approximately 2%.

when evaluating use of database platforms, there are several metrics from which to choose. the authors chose full text downloads (the counter metric known as total item requests) because it is the same across reports.
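the baseline-factored difference and the leap-year effect described in the methodology can be sketched as follows. this is an illustrative calculation only, not the authors' own tooling; the function names are hypothetical, and the only inputs are figures quoted in the text (the −5% and −25% lsu catalog changes and the january through february control time frame used for the database platforms).

```python
from datetime import date


def percent_change(before, after):
    """Year-over-year change as a percentage of the earlier value."""
    return (after - before) / before * 100


def baseline_factored(control_change, covid_change):
    """Solve control_change + x = covid_change for x (percentage points)."""
    return covid_change - control_change


def days_inclusive(start, end):
    """Number of calendar days in a time frame, counting both end dates."""
    return (end - start).days + 1


# lsu catalog pageviews, as quoted in the text: -5% control, -25% covid
print(baseline_factored(-5, -25))

# january through february, the database-platform control time frame
days_2019 = days_inclusive(date(2019, 1, 1), date(2019, 2, 28))
days_2020 = days_inclusive(date(2020, 1, 1), date(2020, 2, 29))
print(days_2019, days_2020, round(percent_change(days_2019, days_2020), 1))
```

the first line printed is -20, matching the worked example. the second shows 59 and 60 days and a difference of about 1.7%, which is consistent with the "approximately 2%" allowance for the extra leap-year day. the same `days_inclusive` helper can be used to confirm that paired time frames in table 1 have equal lengths.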
there are different search metrics in platform and database reports, so the authors chose to focus on the consistency of the full text download metric.

when considering e-resource use during covid-19, another factor to consider is that many publishers removed some or all of their paywalls as the pandemic began. according to counter, the non-profit organization that regulates standards for usage data:

“as a result of this open content, usage may appear to go down during this period. this is because many of your users will be working from home, outside of your institution’s ip range and not authenticated. this means that the publishers are unable to attribute this usage to your institution.”29

the journal platforms (wiley, sage, etc.) were more likely to be affected by this consideration than ebsco and proquest, which did not remove paywalls for databases of interest to academic audiences.

the researchers determined which metrics to harvest. the following were selected:
• main website usage
• catalog usage
• discovery tool usage
• libguides usage (database a–z lists and guide views)
• interlibrary loan article requests received (for both lending and borrowing)
• sfx document delivery clickthroughs
• ebsco and proquest total item requests (full text downloads)
• patron interactions (chat questions, research consultations, ask-a-librarian)

the three institutions use different products for discovery tools, chat reference, and interlibrary loan, so reporting mechanisms also vary. in general, however, a combination of google analytics pageviews and vendors’ own reporting systems was used to pull the data. neiu libraries migrated from illiad to tipasa in june 2019, so it was not possible to gather comparable daily statistics for interlibrary loan request comparisons.
since neiu did not have interlibrary loan data available that compared 2019 to 2020, the sfx report on document delivery clickthroughs was used as a proxy measure. when full text of an item is not available in a database, the docdel clickthroughs indicate that the user went to the sfx menu and then the tipasa interlibrary loan request page.

because of various factors, not all data points are available for the three institutions. for example, neiu is part of the carli consortium. a carli committee decided a few years ago that google analytics would not be used in the shared catalog, in order to protect users’ privacy, so the catalog usage data point is missing for neiu. neiu and valpo have chat service data; lsu now has chat but did not during the time frames under investigation. when data are not available, they are missing from the tables and results.

results

louisiana state university

lsu’s data present a mixed picture. while the use of libguides rose, use of the libraries’ website, discovery system, and catalog all declined during the covid-enforced closure. while downloads through ebsco increased during the covid time frame by 30%, that is less than the increase during the control time frame (37%), resulting in a baseline-factored decrease. interlibrary loan requests and requests for help from patrons rose considerably.

table 2.
lsu: use of library resources during control and covid time frames, 2019 and 2020

measure of use | 2019 control | 2020 control | control period change | 2019 covid | 2020 covid | covid period change
ill borrowing & docdel | 3,872 | 3,774 | -3% ▼ | 2,619 | 3,525 | 35% ▲
ill lending | 2,692 | 2,419 | -10% ▼ | 1,845 | 2,884 | 56% ▲
catalog pageviews | 83,760 | 79,838 | -5% ▼ | 54,662 | 41,000 | -25% ▼
discovery tool pageviews | 432,070 | 407,832 | -6% ▼ | 349,558 | 291,093 | -17% ▼
ebsco total item requests | 59,892 | 81,804 | 37% ▲ | 49,819 | 64,817 | 30% ▲
proquest total item requests | 2,783 | 5,575 | 100% ▲ | 6,002 | 2,859 | -52% ▼
main website pageviews | 164,329 | 151,090 | -8% ▼ | 103,340 | 81,404 | -21% ▼
databases a–z views | 27,977 | 26,037 | -7% ▼ | 20,893 | 19,624 | -6% ▼
libguides views | 127,191 | 123,783 | -3% ▼ | 12,163 | 18,925 | 56% ▲
ask-a-librarian tickets | 35 | 88 | 151% ▲ | 17 | 108 | 535% ▲

figure 1. lsu: change from 2019 to 2020 in both control and covid time frames (in %).

figure 2. lsu: percentage of differences in covid time frames from determined baselines.

northeastern illinois university

use of some of neiu’s resources dropped dramatically during the covid 2020 time frame. however, use of some resources was already lower overall in 2020 than in 2019, even in the control time frame. illinois has experienced a tumultuous few years, with state universities starting new fiscal years without budgets in 2015 and 2016. this has led to a steady decrease in enrollment at public regional universities, as students sought to attend more stable institutions out of state.30 neiu was hit particularly hard, experiencing a 25% drop in enrollment between fall 2015 and fall 2019.31 so, it is likely some of the decrease in 2020 was due to lower enrollment.
areas that saw growth in covid time frame 2020 were chat and reference consultations and interlibrary loan clickthroughs from sfx (see table 3).

table 3. neiu: use of library resources during control and covid time frames, 2019 and 2020

measure of use | 2019 control | 2020 control | control time frame change | 2019 covid | 2020 covid | covid time frame change
sfx docdel clickthroughs | 1,864 | 2,728 | 46% ▲ | 924 | 1,249 | 35% ▲
discovery tool pageviews | 52,435 | 50,310 | -4% ▼ | 30,362 | 23,729 | -22% ▼
ebsco total item requests | 14,525 | 19,704 | 36% ▲ | 15,785 | 15,452 | -2% ▼
proquest total item requests | 7,003 | 5,842 | -17% ▼ | 6,895 | 4,898 | -29% ▼
main website pageviews | 47,085 | 43,215 | -8% ▼ | 28,938 | 15,624 | -46% ▼
databases a–z views | 11,475 | 11,980 | 4% ▲ | 8,687 | 7,743 | -11% ▼
libguides views | 7,473 | 7,081 | -5% ▼ | 3,317 | 3,282 | -1% ▼
chats | 363 | 306 | -16% ▼ | 182 | 213 | 17% ▲
research consultations | 162 | 149 | -8% ▼ | 59 | 77 | 31% ▲

figure 3. neiu: change from 2019 to 2020 in both control and covid time frames (in %).

figure 4. neiu: percentage of differences in covid time frames from determined baselines.

valparaiso university

overall, usage of valparaiso university’s library resources dropped dramatically during the part of the 2020 spring semester affected by covid-19 (see table 4). reductions occurred in all areas except chat assistance. given that usage was up in most categories during the pre-covid-19 part of the semester compared to the previous year, the decline during the latter part of the semester is even more significant. the two exceptions to this pattern of covid-related net reduction were chat questions and interlibrary loan lending requests received (see figure 6).

table 4.
valpo: use of library resources during control and covid time frames, 2019 and 2020

measure of use | 2019 control | 2020 control | control time frame change | 2019 covid | 2020 covid | covid time frame change
ill borrowing & docdel | 564 | 668 | 18% ▲ | 721 | 622 | -14% ▼
ill lending | 341 | 260 | -24% ▼ | 373 | 303 | -19% ▼
catalog pageviews | 18,951 | 16,231 | -14% ▼ | 26,865 | 6,234 | -77% ▼
discovery tool pageviews | 20,020 | 21,369 | 7% ▲ | 32,818 | 28,534 | -13% ▼
ebsco total item requests | 7,355 | 7,324 | -0% ▼ | 10,871 | 10,159 | -7% ▼
proquest total item requests | 8,770 | 10,844 | 24% ▲ | 12,325 | 10,915 | -11% ▼
main website pageviews | 27,334 | 31,919 | 17% ▲ | 29,771 | 21,336 | -28% ▼
databases a–z views | 3,959 | 4,242 | 7% ▲ | 5,044 | 4,526 | -10% ▼
libguides views | 15,395 | 18,029 | 17% ▲ | 13,374 | 12,980 | -3% ▼
chats | 26 | 32 | 23% ▲ | 20 | 37 | 85% ▲

figure 5. valpo: change from 2019 to 2020 in both control and covid time frames (in %).

figure 6. valpo: percentage of differences in covid time frames from determined baselines.

table 5. three institutions: net pandemic change in resource use (in %)

measure of use | lsu | neiu | valpo
ill borrowing & docdel | 37% ▲ | | -32% ▼
ill lending | 66% ▲ | | 5% ▲
sfx docdel clickthroughs | | -11% ▼ |
catalog pageviews | -20% ▼ | | -62% ▼
discovery tool pageviews | -11% ▼ | -18% ▼ | -20% ▼
ebsco total item requests | -6% ▼ | -38% ▼ | -6% ▼
proquest total item requests | -153% ▼ | -12% ▼ | -35% ▼
main website pageviews | -13% ▼ | -38% ▼ | -45% ▼
databases a–z views | 1% ▲ | -15% ▼ | -17% ▼
libguides views | 58% ▲ | 4% ▲ | -20% ▼
chats | | 33% ▲ | 62% ▲
research consultations | | 39% ▲ |
ask-a-librarian tickets | 384% ▲ | |

discussion

louisiana state university

usage of the libraries’ website, discovery system, and catalog all declined during the covid-enforced closure.
use of the main library site has been steadily declining since 2012. lsu library’s main website saw reduced traffic in both 2020 time frames, but this reduction was particularly dramatic during the covid-19 pandemic. catalog use has also been declining but experienced a sharp decline of 25% during the covid closure. catalog use regularly drops when the library is closed, at least partially owing to the fact that library staff are heavy catalog users. also, the catalog is largely used by patrons seeking print items and, with the closure of the library building, patrons were less likely to search for print items. therefore, a drop could be expected, but such an extreme reduction was a surprise. usage of the main discovery system (eds) followed a similar trajectory. as most discovery searches begin on the library website’s home page, a decline in discovery use would logically follow a decline in website usage. eds pageviews declined between 2019 and 2020 during the control time frame by 6% but declined much more sharply during the covid time frame. interlibrary loan requests (borrowing/document delivery and lending) rose considerably during the closure. one factor in the borrowing/document delivery increase was that document delivery was opened up to undergraduates during the covid 2020 time frame; this service was not available to that patron group during the control time frames and the covid 2019 time frame. downloads through ebsco, typically lsu’s busiest platform, increased in both the control time frame (37%) and covid time frame (30%), resulting in a baseline-factored decrease of 6% (rounded to the nearest percent). proquest use was less straightforward to interpret. while there was a dramatic decrease in downloads between 2019 and 2020, that seems to owe more to an unusually high usage in the 2019 covid time frame rather than a precipitous drop in the 2020 covid time frame. 
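the 6% figure for ebsco differs from what subtracting the two rounded percentages would give (30% − 37% = −7%). a minimal python sketch, assuming the net change is computed from unrounded percentages and rounded only at the end, reproduces the reported value from the table 2 counts. the helper name pct_change is hypothetical, and the order of rounding is an inference from the published figures rather than something the authors state.

```python
def pct_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100


# lsu ebsco total item requests, taken from table 2
control = pct_change(59_892, 81_804)   # 2019 control -> 2020 control
covid = pct_change(49_819, 64_817)     # 2019 covid -> 2020 covid

# assumption: subtract first, round last
net_unrounded = covid - control

# alternative: round each percentage first, then subtract
net_from_rounded = round(covid) - round(control)

print(round(control), round(covid))  # 37 30
print(round(net_unrounded))          # -6
print(net_from_rounded)              # -7
```

the one-point gap is purely a rounding effect; both calculations describe the same underlying decrease.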
lsu uses springshare’s popular libguides product, both for its list of databases (databases a–z) and as a platform for research guides. usage of the databases a–z page was minimally impacted in the covid time frame. however, overall guide usage rose significantly, by 56%. at least some of this is due to the creation of a covid-specific research guide. this guide included descriptions and links to various electronic resources which vendors had made freely available during the crisis. that page alone had nearly 1,700 views in the covid time frame. but other pages with no clear connection to covid also had significant increases. this may reflect students feeling the need to consult electronic guides to compensate for lack of in-person access to librarians.

the covid closure did have an enormous impact on usage of lsu’s patron support system. this system, labeled “ask us!”, handled 108 tickets in the 2020 covid time frame, compared to 17 in 2019. while this system’s use was trending upward in the control time frame as well, the jump in use during the covid-19 closure was notable. while many of the tickets were similar to questions asked when the libraries were open, many others were clearly related to the library’s closure. for example, several inquiries were in reference to alternate access to print items (such as scanning and delivering electronic versions of book chapters). there were also inquiries about when the building might reopen, and when patrons might be able to access print items that had been placed on hold.

northeastern illinois university

the library’s website is part of the university’s drupal content management system, which restricts the library’s ability to design and structure content.
for that reason, much of the high-impact library content is stored on third-party sites, like libguides or worldcat discovery. while the library directs users to start their research on the main website, users typically follow a link immediately to one of those third-party resources. usage of the library website and resources linked from the website both dropped.

sfx document delivery clickthroughs, which pass users through to neiu’s interlibrary loan request pages, saw increases overall in 2020 compared to 2019. this is not surprising, given that neiu was forced to cancel two “big deal” packages at the end of 2019.

the neiu libraries’ chat service was one of the resources that experienced increased usage during the covid 2020 time frame. an informal review of questions coming in during the early covid closure showed that most patrons were asking about building hours or returning or checking out materials. while the library website was immediately updated with covid-related closure information, the chat button is easily spotted and readily available, while the hours and materials information required clicking additional links.

research consultations also increased. as the physical reference desk was no longer available, students and faculty were directed to use email or set up google meet appointments with subject librarians for questions they would usually have brought to the desk. an increase of 31% in these interactions demonstrates that users still needed librarian assistance with research and course-related questions in the time of covid.

valparaiso university

when evaluating valpo’s interlibrary loan demand, only article requests (for both borrowing/document delivery and lending) were considered; loans were excluded. demand from valpo patrons fell 32% during the time frame affected by covid-19. despite lack of access to the print collection once the pandemic hit, net lending demand from electronic resources rose slightly.
catalog use was in decline before covid hit. this is likely due to the increased reliance on the library’s discovery tool, which includes records for all materials held in the catalog. however, the difference in use seen during the pandemic is striking; net use fell 62%. because patrons use the catalog to access information about the library’s physical collection and the library was closed, this precipitous decline is not surprising. overall discovery tool (summon) use also fell during the 2020 covid time frame (20%).

valparaiso university subscribes to approximately 60 ebsco databases and 30 proquest databases, making them valpo’s most popular database vendors. both saw declines, proquest more than ebsco.

the library’s main website has slightly more than 100 pages, and although it serves as a starting point to reach many of the library’s other resources on different platforms (databases, libguides, etc.), it still receives considerable use: 147,125 pageviews in the 2019 calendar year. high-use pages include hours, the library directory, departments such as interlibrary loan and archives & special collections, and the listing of liaison librarians. the main website experienced a net 45% drop during the time frame affected by covid-19.

valpo uses libguides for two primary purposes: to organize and share approximately 200 databases using the product’s a–z database list, and to deliver subject-specific and instruction content. the a–z lists are heavily used by students and faculty across campus. however, during the pandemic, net usage of the a–z database list fell 17%. the net decrease in libguides views was approximately the same at −20%.

seemingly, patrons had increased need for chat reference during the pandemic, although valpo receives a relatively small number of chat questions.
during the first part of the spring 2020 semester unaffected by covid, valpo received six more questions than over the same time frame in 2019. during the covid-affected part of the 2020 semester, usage jumped from 20 chats to 37 chats. again, these numbers are small, so caution should be used in interpretation, but considering the baseline change from the control time frame, there was a net 62% jump in usage for chat reference during the covid time frame.

commonalities and trends

while the study did not set out to compare usage among the three institutions, some clear patterns did emerge. use of all three libraries’ websites as well as the discovery tools/catalogs and major databases decreased during the covid time frame. a number of explanations could apply. for instance, regarding library website usage, libraries’ public computers are often set to open to the libraries’ home pages whenever a new browser is launched. when those computers are not in use, such as when library buildings are closed, students do not automatically start their research on those pages.

even though students did continue to interact with librarians and library staff through virtual methods, in-person reference encounters were not possible during covid. many patron interactions begin with librarians demonstrating how to start research at the library home page, moving on to find databases or conduct searches in the discovery tool or catalog. without direct librarian guidance, it is not surprising that students do not start their research at library tools. whether it is because they do not realize library resources are available off campus or they have become fond of free web search tools such as google scholar, most public services librarians would expect a decrease in library resource usage when students are not on campus.
with the exception of lsu’s libguides usage, the results of this study do not support the ala assertion that covid led to “leaps in the use of digital content” by library patrons.

it is not surprising to see that all three libraries experienced an increase in virtual communication methods, as chat, email, and online meetings were the only means of student-librarian communication once campuses closed. no longer could students catch library staff in the stacks or at a service desk to ask quick (and often uncounted) questions. instead, interactions were more easily measured through the virtual trails they left behind. it would be difficult to determine whether overall student-librarian interaction increased or decreased during the covid time frame as compared to “normal” times. data only show that students did continue to seek help from libraries even when the buildings were closed.

individual institutions noted some within–time frame usage patterns. for instance, valpo found that the drop in usage was more dramatic during the first part of the covid 2020 time frame and improved during the latter part of the covid time frame. initially, students were probably in sink-or-swim mode, with school assignments having a lower priority in their lives. after the first month or so, students and faculty may have been able to start thinking about research needs more.

valparaiso university fared the worst of the three institutions, with net decreases in eight out of ten of the areas studied (80%) (see table 5). neiu had decreases in six out of nine areas (67%) and lsu, the largest institution, broke even with five out of ten (50%) areas showing decreases. valpo has a carnegie classification of highly residential, while lsu and neiu are primarily nonresidential.
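the counts of net decreases can be checked directly against table 5. the python sketch below is illustrative: the dictionary layout and the function name are assumptions introduced here, while the values are the net pandemic changes (in %) from table 5, with measures that an institution did not report simply omitted.

```python
# net pandemic change in resource use (in %), transcribed from table 5
NET_CHANGE = {
    "lsu": {
        "ill borrowing & docdel": 37, "ill lending": 66, "catalog": -20,
        "discovery tool": -11, "ebsco": -6, "proquest": -153,
        "main website": -13, "databases a-z": 1, "libguides": 58,
        "ask-a-librarian": 384,
    },
    "neiu": {
        "sfx docdel clickthroughs": -11, "discovery tool": -18, "ebsco": -38,
        "proquest": -12, "main website": -38, "databases a-z": -15,
        "libguides": 4, "chats": 33, "research consultations": 39,
    },
    "valpo": {
        "ill borrowing & docdel": -32, "ill lending": 5, "catalog": -62,
        "discovery tool": -20, "ebsco": -6, "proquest": -35,
        "main website": -45, "databases a-z": -17, "libguides": -20,
        "chats": 62,
    },
}


def tally_decreases(changes):
    """Return (number of measures with a net decrease, number of measures reported)."""
    decreases = sum(1 for value in changes.values() if value < 0)
    return decreases, len(changes)


for institution, changes in NET_CHANGE.items():
    down, total = tally_decreases(changes)
    print(f"{institution}: {down} of {total} ({round(down / total * 100)}%)")
# lsu: 5 of 10 (50%)
# neiu: 6 of 9 (67%)
# valpo: 8 of 10 (80%)
```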
it could be that the residential nature of valpo’s campus more negatively impacted use when students were away from campus. lsu, a state flagship, has a higher percentage of graduate students than valpo or neiu. it seems likely that graduate students would be more determined to continue research activities during the closure than undergraduates, which may explain the smaller decrease. this could be an area of further research.

conclusion

as the pandemic continues and universities plan for altered learning environments into the fall, will there be a rebound in library system and resource usage, or will the dramatic dip seen during the immediate covid time frame continue? nearly every day there are multiple webinars for academic libraries, their administrators, and their staff members to share their stories, compare their experiences, and help guide each other in operating in the new normal. librarians have had time to adjust typical pedagogical practices, learn new virtual technologies, and develop outreach plans to ensure continued library instruction in remote and online environments. other changes to library practice may include an even greater shift in acquisitions from print to electronic resources. services for distance students might become a greater point of emphasis. faculty, too, have had time to reevaluate their syllabi and identify support needs for themselves and their students as courses go online. as with so many areas of life these days, the outcomes of this work remain uncertain.

endnotes

1 “covid-19 map,” johns hopkins coronavirus resource center, accessed july 31, 2020, https://coronavirus.jhu.edu/map.html.
2 the carnegie classification of institutions of higher education, 2018 ed. (bloomington: indiana university center for postsecondary research, 2018), https://carnegieclassifications.iu.edu/.
3 the carnegie classification.
4 “library collection by material type: fiscal year 2018,” national center for education statistics, accessed august 21, 2020, https://nces.ed.gov/ipeds/datacenter/institutionbyname.aspx?gotoreportid=1.
5 the carnegie classification.
6 “library collection by material type.”
7 “enrollment data by semester,” valparaiso university office of institutional effectiveness, accessed august 11, 2020, https://www.valpo.edu/institutional-effectiveness/institutional-research/enrollment-data/.
8 “library collection by material type.”
9 s. d. allen iske and linda g. lengfellner, “fire, water & books: disaster preparedness for academic libraries,” professional safety 60, no. 10 (october 2015), https://search.proquest.com/docview/1735009821?pq-origsite=summon; daryl l. superio, stephen b. alayon, and mary grace h. oliveros, “disaster management practices of academic libraries in panay island, philippines: lessons from typhoon haiyan,” information development 35, no. 1 (january 2019): 51–66, https://doi.org/10.1177/0266666917725905; andy corrigan, “disaster: response and recovery at a major research library in new orleans,” library management 29, no. 4/5 (may 2008): 293–306, https://doi.org/10.1108/01435120810869084.
10 lisa mcguire, “planning for a pandemic influenza outbreak: roles for librarian liaisons in emergency delivery of educational programs,” medical reference services quarterly 26, no. 4 (december 2007): 1–13, https://doi.org/10.1300/j115v26n04_01.
11 bradley p. tolppanen and marlene slough, “providing circulation services in a temporary location,” journal of access services 1, no. 4 (may 24, 2004): 125, https://doi.org/10.1300/j204v01n04_10.
12 walter m. fontane, “assessing library services during a renovation,” journal of access services 13, no. 4 (october 1, 2016): 223, https://doi.org/10.1080/15367967.2016.1250643.
13 fontane, 223.
14 marc vinyard et al., “a pop-up service point and repurposed study spaces: maintaining market share during a renovation,” journal of library administration 58, no. 5 (july 4, 2018): 449–67, https://doi.org/10.1080/01930826.2018.1468193.
15 amy harmon, “inside the race to contain america’s first coronavirus case,” new york times, february 5, 2020, sec. us, https://www.nytimes.com/2020/02/05/us/corona-virus-washington-state.html; karin fischer, “colleges brace for more-widespread outbreak of coronavirus,” chronicle of higher education, february 26, 2020, http://www.chronicle.com/article/colleges-brace-for/248123.
16 “guidance for interruptions of study related to coronavirus (covid-19),” us department of education office of postsecondary education, updated june 16, 2020, https://ifap.ed.gov/electronic-announcements/030520guidance4interruptionsrelated2coronaviruscovid19.
17 “libraries respond: covid-19 survey results (may 2020),” american library association, http://www.ala.org/tools/covid/libraries-respond-covid-19-survey.
18 “libraries respond: covid-19 survey,” appendix 1: academic library financial aggregate tables.
19 “survey: libraries examine phased building re-opening, prepare summer programs,” american library association news and press center, june 3, 2020, http://www.ala.org/news/press-releases/2020/06/survey-libraries-examine-phased-building-re-opening-prepare-summer-programs.
20 “libraries respond: covid-19 survey” (instrument), american library association, may 2020, http://www.ala.org/pla/sites/ala.org.pla/files/content/advocacy/covid-19/libraries-respond-covid-19-survey-may-2020_5-12-20.pdf.
21 karin fischer, “as coronavirus spreads, colleges make limited allowances for support staff,” chronicle of higher education, march 23, 2020, http://www.chronicle.com/article/as-coronavirus-spreads/248304.
22 christine wolff-eisenberg and lisa janicke hinchliffe, “academic library strategies shift to closure and restriction,” ithaka s+r (blog), march 15, 2020, https://sr.ithaka.org/blog/academic-library-strategies-shift-to-closure-and-restriction/.
23 fischer, “as coronavirus spreads”; colleen flaherty, “librarians advocate closing campus libraries during coronavirus pandemic,” inside higher ed, march 19, 2020, https://www.insidehighered.com/news/2020/03/19/librarians-advocate-closing-campus-libraries-during-coronavirus-pandemic.
24 “cloud-based library platform keeps california community college libraries operational during covid-19 crisis - system is now live at 110 california community colleges,” library technology guides, current news service and archive, 2020, https://librarytechnology.org/pr/25004; “springshare responds to remarkable shift to online library services,” library technology guides, current news service and archive, april 29, 2020, https://librarytechnology.org/pr/25116.
25 “statement on the global covid-19 pandemic and its impact on library services and resources | icolc website,” icolc coordinating committee, march 13, 2020,
https://icolc.net/statement/statement-global-covid-19-pandemic-and-its-impact-library-services-and-resources.
26 “icolc covid19 complimentary expanded access specifics” (google doc), accessed may 8, 2020, https://docs.google.com/spreadsheets/d/1xiinlf9p00to-5lgki3v4s413iujycm5qjokug19a_y/edit?usp=sharing.
27 “libraries without walls: even wider access to digital resources during pandemic | penn state university,” penn state news, april 8, 2020, https://news.psu.edu/story/614577/2020/04/08/research/libraries-without-walls-even-wider-access-digital-resources-during.
28 emilija sagaityte and cate ryan, “historic pandemic poses lasting impact on libraries, scholarship,” brown daily herald, march 31, 2020, https://www.browndailyherald.com/2020/03/31/historic-pandemic-poses-lasting-impact-on-libraries-scholarship/.
29 “message to libraries about counter usage during the covid-19 pandemic,” project counter, june 12, 2020, https://www.projectcounter.org/message-to-libraries-about-counter-usage-during-the-covid-19-pandemic/.
30 “outmigration numbers increase, with evidence of tie to budget impasse,” illinois board of higher education, march 12, 2019, http://www.ibhe.org/pressreleases/2019.03.12-ibhe-outmigration-numbers-for-web.htm.
31 “fall enrollments: five-year trend data,” northeastern illinois university, accessed june 3, 2020, https://www.neiu.edu/sites/neiu.edu/files/documents/ysun2/fall%202019%20data%20digest%205-year%20enrollment.pdf.
detection of information requirements
of researchers using bibliometric analyses to identify target journals vadim nikolaevich gureyev, nikolai alekseevich mazov information technology and libraries | december 2013 66 abstract bibliometric analyses were used to identify journals that are representative of the authors’ research institutes. methods to semiautomatically collect data for an institute’s publications and which journals they cite are described. citation analyses of lists of articles and their citations can help librarians to quickly identify the preferred journals in terms of the number of publications, and the most frequently cited journals. librarians can use these data to generate a list of journals that an institute should subscribe to. background recent developments in information technology have had a significant impact on the research activities of scientific libraries. such tools have provided new insights into the workload and duties of librarians in research libraries. in the present study, we performed bibliometric analyses to determine the information needs of researchers, and to determine whether they are satisfied with the journal subscriptions available at their institutes. such analyses are important because of limited funding for subscriptions, increases in the cost of electronic resources, and the publication of new journals, especially open-access journals. bibliometric analyses are more accessible and less labor-intensive when using specially designed web services and software. several databases of citation data are accessible online. the leading publishers of these databases, including thomson reuters and elsevier, promote their products such as the web of science (wos) and scopus with travelling and online seminars to increase the number of skilled users. of note, the number of articles devoted to bibliometric analysis has increased about 4-fold since 2000 (see figure 1). 
vadim nikolaevich gureyev (gureyev@vector.nsc.ru) is leading bibliographer, information analysis department, state research center of virology and biotechnology vector, novosibirsk, russia. nikolai alekseevich mazov (mazovna@ipgg.sbras.ru) is head of information and library center, trofimuk institute of petroleum geology and geophysics sb ras, novosibirsk, russia.

figure 1. growth of publications devoted to informetric analysis. data were generated from the wos using the following request: «topic=((bibliometric* or informetric* or webometic* or scientometric*) and (stud* or analys*))».

bibliometric analysis appears to be the most objective method available to librarians; it shows high objectivity even when compared with peer review.1 citation analysis can be used to select target journals because it accurately reflects the needs of researchers and can reveal current scientific trends. it also allows librarians to evaluate the effectiveness of each journal, the significance of each journal to the institute, and the minimum archival depth.2 citation analysis is particularly useful when generating a list of journals for subscription and when deciding whether to subscribe to specific journals.3

in the present study, we performed citation analyses of scientific papers that were published by researchers at src vb “vector” (biology and medicine) and ipgg sb ras (geosciences). we analyzed groups of journals that published articles from these two institutes and compared the characteristics of the cited and citing journals. many journals publish articles covering the fields associated with the two institutes (biology and medicine, and geosciences), and journals in these fields tend to have the highest impact factors of all fields.
therefore, the methods applied in this study and the results may be generalized to other research libraries.

study design

sources. we analyzed articles published in journals or books by researchers at src vb “vector” and ipgg, together with the references cited in these articles. we limited the articles to those published in 2006–2010 (ipgg) or 2007–2011 (src vb “vector”). we did not analyze monographs, theses, or conference proceedings (including those that were published in journals), because our aim was to optimize the list of subscribed journals. to collect comprehensive data regarding these publications, we used four overlapping sources. (1) the russian science citation index (sci) was used to retrieve articles based on the profile of each researcher. the “bound and unbound publications in one list” option was switched off. (2) thomson reuters sci expanded was used to examine the profile of each researcher. the “conference proceedings citation index” option was switched off. (3) scopus was used to retrieve the publications for each researcher. (4) each head of department provided us with the articles published by each member of the research group within the last 5 years.

along with publications in which the affiliation was clearly stated, we also analyzed articles in which the authors’ affiliation was not stated, the authors reported a superior organization such as a governmental ministry, or authors from either institute attributed the work to another affiliation (if they worked at two or more organizations). the translated and original versions of the same article were treated as a single article, and the english version was used in our analyses. for journals that published the original russian article and an english translation, we analyzed the latter.

citations.
citations from the published articles were analyzed to identify the most frequently cited journals. we ignored references that lacked sufficient information and references included in footnotes. cited monographs, theses, and conference proceedings (including those that were published in journals) were also ignored. for citations published in russian with an english translation, we analyzed the translated version, even if the authors originally cited the russian version. we preferred the translated versions because they are included in the wos database and can be processed automatically. for example, the wos indexes articles from russian geology and geophysics (print issn 1068-7971) but not the russian-language version geologiya i geofizika (print issn 0016-7886). journals that had been renamed were treated as one journal, and the current/most recent name was used in the analysis. however, journals that had split into multiple titles were analyzed separately, and the journal’s name at the time the cited article was published was used in the analysis.

for this study, we first retrieved the journal name and the year the cited article was published. we then expanded on this information by recording the journal publisher, journal accessibility (i.e., subscription method, paper or electronic), open/pay-per-view access, embargo status, and journal length. we ignored the accessibility of individual articles that had been deposited on the author’s personal website or in an institutional repository.

results and discussion

table 1 summarizes the publication activities for both institutes.

a.
year | number of articles | included in russian sci* (%) | included in wos (%) | included in scopus* (%) | nowhere indexed (%) | number of journals**
2007 | 118 | 94.9 | 28.8 | 54.2 | 5 | 66
2008 | 84 | 96.4 | 41.6 | 51.1 | 3.5 | 57
2009 | 82 | 97.5 | 39 | 52.4 | 2.4 | 58
2010 | 100 | 94.0 | 41 | 61 | 6 | 60
2011 | 105 | 91.4 | 25.7 | 55.2 | 8.5 | 50

b.
year | number of articles | included in russian sci* (%) | included in wos (%) | included in scopus* (%) | nowhere indexed (%) | number of journals**
2007 | 188 | 79.8 | 43.1 | 43.1 | 21 | 82
2008 | 218 | 96.8 | 39.4 | 41.7 | 3 | 88
2009 | 259 | 93.0 | 39.0 | 37.8 | 7 | 87
2010 | 250 | 84.4 | 31.2 | 29.6 | 5 | 102
2011 | 267 | 70.4 | 30.4 | 30.0 | 29 | 97

*the russian sci and scopus indexed some articles twice, particularly those published in russian with an english translation. therefore, some articles have different timelines and citations. these duplications were analyzed as one article.
**number of journals in this field, excluding translated journals.

table 1. publication activity and articles presence in the main bibliographic databases in the fields of biomedicine (a; 2007–2011) and geoscience (b; 2007–2011).

table 1 shows that the two institutes have a stable publication history relative to other russian scientific institutes, publishing approximately 150 articles per year. therefore, our results can be generalized to other institutes in these fields.

collecting this information may seem to be a daunting task, especially for librarians who have not conducted such analyses before. we used three databases and contacted the heads of department directly. however, our data indicate that it is sufficient to use the free-of-charge russian sci, an extensive index of russian scientific articles that includes almost all of the articles published by russian researchers in russian and international journals. nevertheless, it is essential to review the profile of each author. when searching for articles by affiliation, the proportion of articles retrieved ranged from 28% to 51%, and the number of publications retrieved tended to decrease over time.
this phenomenon may be caused by a deficient system for identifying affiliations: the affiliation name is spelled in different ways (in our case, more than 70 variants have been used), research is attributed to a superior organization, and two or more organizations may have the same name.4 furthermore, recent studies1,5 confirmed that information about authors should be collated by their affiliations, rather than by performing searches in bibliographic databases.

it seems paradoxical that the wos and scopus databases index russian articles more quickly than the russian sci. by subscribing to the same print and electronic journals, we noted that print editions are published before electronic ones. nevertheless, this seems reasonable based on this 2-year retrospective analysis. therefore, routine analysis of russian articles can be partly automated by efficient searches of the russian sci.

table 2 presents the citation details.

citing year | number of references | number of cited journals | average number of references in article
2007 | 1830 | 492 | 15.5
2008 | 1354 | 472 | 16.1
2009 | 1536 | 558 | 18.7
2010 | 1591 | 471 | 15.9
2011 | 1613 | 484 | 15.4

table 2. number of cited articles, cited journals, and mean number of references in article in the biomedical field.

references from articles not indexed in wos were manually extracted, which takes time and effort. references from articles indexed in the wos, including russian articles translated into english, were analyzed semiautomatically. for this purpose, we used endnote software developed by thomson reuters. endnote web is a free alternative that could also be used for this purpose. the references cited in each article were exported into endnote. next, the references were arranged according to the chosen parameters to simplify our analyses. of note, about 35% of the russian articles indexed in wos accounted for 80% of all the references cited in the articles.
two possible reasons for this are (1) the greater number of articles published in translated and international journals, and (2) the adoption of the western citing culture by russian researchers.6 this finding suggests that it is possible to avoid labor-intensive routine work and to use automated services developed by thomson reuters to collate and analyze up to 80% of all references.

the authors of articles in the geosciences field cited 1000 journals, including 750 western journals and 250 russian journals. in the biomedical field, the authors cited 1339 journals, of which 1168 were western journals and 171 were russian journals. we analyzed about 8000 references cited by authors from each institute over 5 years. the references were divided into three equal groups. the most frequently cited russian journals and book series are listed in table 3.

biological and medical sciences
journals/book series | percent of references | total (%)
problems of virology | 16.94 | 16.94
molecular biology | 6.44 | 23.38
biotechnology in russia | 6.07 | 29.45
doklady biological sciences | 5.09 | 34.54
atmospheric and oceanic optics | 4.42 | 38.96
annals of the russian academy of medical sciences | 4.04 | 43
journal of microbiology, epidemiology and immunobiology | 3.82 | 46.82
molecular genetics microbiology and virology | 2.92 | 49.74
bulletin of experimental biology and medicine | 2.77 | 52.51
russian journal of bioorganic chemistry | 2.69 | 55.2
problems of tuberculosis | 2.62 | 57.82
biochemistry (moscow) | 1.79 | 59.61
pharmaceutical chemistry journal | 1.72 | 61.33
infectious diseases | 1.57 | 62.9
bulletin siberian branch of russian academy of medical sciences | 1.2 | 64.1
russian journal of genetics | 1.12 | 65.22

geosciences
journals/book series | percent of references | total (%)
russian geology and geophysics | 34.99 | 34.99
doklady earth sciences | 18.18 | 53.17
geochemistry international | 6.49 | 59.66
petrology | 2.72 | 62.38
geotectonics | 2.55 | 64.93
geology of ore deposits | 2.23 | 67.16
national geology | 1.98 | 69.14
stratigraphy and geological correlation | 1.96 | 71.1
izvestiya. physics of the solid earth | 1.55 | 72.65
proceedings of all-union mineralogic society | 1.45 | 74.1
bulletin of the russian academy of sciences: geology | 1.42 | 75.52
lithology and mineral resources | 1.42 | 76.94
oil and gas geology | 1.26 | 78.2
russian journal of pacific geology | 0.82 | 79.02
chemistry for sustainable development | 0.69 | 79.71
physics of the solid state | 0.68 | 80.39

table 3. characteristics of the 16 most frequently cited russian journals and journals in the second group listed in order of number of citations. the journals in the colored region include one-third of all citations. the translated titles of each journal and the official translated titles of journals without translated variants are given.

table 3 shows that two-thirds of all cited articles were published in only 9% (16/171) of the cited russian biomedical journals. this concentration is even more pronounced in the field of geosciences, as 6% (16/250) of russian journals published 80% of the cited articles. comparing the data between the two institutes, it is notable that the results are consistent. the only difference evident to us is that the geoscience researchers tended to cite more russian journals, whereas biomedical researchers preferred to cite international literature. the greater concentration of citations to select journals in the geosciences field can be explained by the smaller number of citations.
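the "percent of references" and "total (%)" columns of table 3 can be reproduced from any list of cited journal names. the following is a minimal python sketch, not the authors' own procedure (they worked in endnote): the function names rank_cited_journals and core_journals are hypothetical, and the input is assumed to be one journal name per cited reference, already normalized to a single spelling.

```python
from collections import Counter


def rank_cited_journals(cited_journals):
    """Return (journal, share %, cumulative %) rows, most cited first.

    `cited_journals` holds one journal name per cited reference
    (an assumed input format, e.g. a column exported from a
    reference manager).
    """
    counts = Counter(cited_journals)
    total = sum(counts.values())
    rows = []
    cumulative = 0.0
    # Sort by descending count; break ties by name for a stable order.
    for journal, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        share = 100.0 * n / total
        cumulative += share
        rows.append((journal, round(share, 2), round(cumulative, 2)))
    return rows


def core_journals(rows, threshold=100 / 3):
    """Return the top-ranked journals whose cumulative share first
    reaches `threshold` percent (one-third of references by default)."""
    core = []
    for journal, _share, cumulative in rows:
        core.append(journal)
        if cumulative >= threshold:
            break
    return core
```

calling core_journals(rows, threshold=200 / 3) would return the journals covering the first two-thirds of references, which corresponds to the first two of the three equal groups described above.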
in the biomedical field, we observed a strong tendency towards abundant citations, resulting in a wider distribution of citations in each article; the journals with the highest impact factors in biology and medicine confirmed our observation. figures 2 and 3 show the correlations between citations and publication activity in russian journals.

figure 2. correlation between publication activity (red) and citations (blue) in the biomedical field (in %) for the data shown in table 3. timescale: 2007–2011.

figure 3. correlation between publication activity (red) and citations (blue) in the geosciences field (in %) for the data shown in table 3. timescale: 2006–2010.

the citing and cited journals are often the same journals, and publication activity is highly correlated with citation activity. this is more apparent in the geosciences field, where russian geology and geophysics is the most frequently cited journal, as it published about two-thirds of all cited articles. this is unsurprising because it is published by our institute and is the main multidisciplinary russian journal in the field of geosciences. the most frequently cited international journals are listed in table 4.
biological and medical sciences
journals/book series | percent of references | total (%)
journal of virology | 6.03 | 6.03
proceedings of the national academy of sciences of the united states of america | 3.36 | 9.39
virology | 3.15 | 12.54
vaccine | 2.77 | 15.31
journal of biological chemistry | 2.4 | 17.71
journal of general virology | 2.4 | 20.11
nature | 2.04 | 22.15
science | 1.94 | 24.09
journal of clinical microbiology | 1.94 | 26.03
emerging infectious diseases | 1.89 | 27.92
nucleic acids research | 1.59 | 29.51
journal of infectious diseases | 1.38 | 30.89
journal of molecular biology | 1.35 | 32.24
journal of immunology | 1.24 | 33.48
journal of medical virology | 1.19 | 34.67
virus research | 0.86 | 35.53
new england journal of medicine | 0.86 | 36.39
archives of virology | 0.83 | 37.22
antiviral research | 0.75 | 37.97
lancet | 0.73 | 38.7
cell | 0.65 | 39.35
applied and environmental microbiology | 0.6 | 39.95
biochemistry | 0.59 | 40.54
journal of experimental medicine | 0.59 | 41.13
febs letters | 0.56 | 41.69

geosciences
journals/book series | percent of references | total (%)
earth planetary science letters | 6.46 | 6.46
geochimica et cosmochimica acta | 6.28 | 12.74
contributions to mineralogy and petrology | 5.67 | 18.41
journal of geophysical research | 4.9 | 23.31
nature | 3.67 | 26.98
american mineralogist | 3.53 | 30.51
journal of petrology | 3.22 | 33.73
lithos | 2.58 | 36.31
chemical geology | 2.29 | 38.6
geology | 2.01 | 40.61
tectonophysics | 1.94 | 42.55
economic geology | 1.93 | 44.48
science | 1.87 | 46.35
journal of crystal growth | 1.56 | 47.91
canadian mineralogist | 1.48 | 49.39
russian geology and geophysics | 1.35 | 50.74
european journal of mineralogy | 1.32 | 52.06
geophysics | 1.02 | 53.08
geophysical research letters | 1.02 | 54.1
journal of metamorphic geology | 0.98 | 55.08
journal of geology | 0.93 | 56.01
international geology review | 0.91 | 56.92
physical review. ser. b | 0.9 | 57.82
precambrian research | 0.9 | 58.72
mineralogical magazine | 0.88 | 59.6

table 4. characteristics of the 25 most frequently cited international journals and journals within the second group listed in terms of number of citations. the colored area includes one-third of all citations.

the distribution of citations to international journals was similar to that observed for russian journals, with a greater citation density in journals in the geosciences field. notably, in the geosciences field, two-thirds of all citations were to articles published in just 25 journals. in the biomedical field, two-thirds of all citations were to articles published in 100 journals. only 1.3% (15/1168) of the cited journals contained one-third of the cited articles in the biomedical field. the corresponding value for journals in the geosciences field was 0.9% (7/750). the correlations between citation activity and publication activity are shown in figures 4 and 5.

fig. 4. correlation between publication activity (red) and citations (blue) for biomedical journals (in %) for the data shown in table 4. timescale: 2007–2011.

fig. 5. correlation between publication activity (red) and citations (blue) for journals in the geosciences field (in %) for the data shown in table 4. timescale: 2006–2010.

as illustrated in figures 4 and 5, the distribution of citations to international journals was broader than for russian journals, where there are only 1–4 frequently cited journals. this is probably because there are fewer russian journals than international journals. figures 4 and 5 also revealed a difference between the two disciplines, as geoscience researchers published their articles in top cited international journals, whereas biomedical researchers rarely published their research in highly cited journals.
this may be due to the greater number of biomedical journals or the lower rate of publication, because relatively few articles were published in the major multidisciplinary journals, such as nature or science, or in specialized journals, such as the journal of virology.

conclusion

citation analysis enabled rapid identification of the most frequently cited journals that are essential to academic researchers. in the biomedical field, we found that 16 russian and 100 international journals published two-thirds of all cited articles in the last 5 years. in the field of geosciences, we identified 4 russian and 25 international journals that were essential to researchers in this field. interestingly, there were four times as many russian and international journals in the biomedical field as in the geosciences field. the journals that published the researchers’ articles were partially correlated with the cited journals in the geosciences field, but this correlation was less obvious for biomedical journals.

it is important to note that all aspects of this study were performed by librarians who used tools that were available in both institutes. we did not require any additional facilities or the assistance of any researchers. we believe our method is one of the most objective and accessible approaches for scientific libraries to select target journals. we used our results to optimize subscribed periodical items.

in addition to journal acquisition, our methods and results may be applied to other tasks that may be performed by research libraries. for example, it is possible to study the citing and cited half-lives of journals, and compare the results with those reported in the journal citation reports. this allows researchers in specific institutes to determine whether they are citing cutting-edge or obsolete literature in their studies. the results can also be used to determine whether the subjects of the cited articles are relevant to the institute’s field of research.
finally, the results can be used to compare the list of the most frequently cited international journals within a particular field with the list of journals that are most frequently cited by a research institute.

perspectives

in this study, we revealed some differences in the correlation between citing and cited journals in two distinct fields, namely geosciences and biomedical science. notably, this correlation was greater for journals in the geosciences field. to determine the factors underlying this phenomenon, it will be interesting to extend our study to a greater number of disciplines. it will also be interesting to compare data for cited journals with their usage statistics.

references

1. a.f.j. van raan. “the use of bibliometric analysis in research performance assessment and monitoring of interdisciplinary scientific developments.” technikfolgenabschätzung – theorie und praxis, vol. 1 no.12 (2003): 20-29.
2. n.a. slashcheva, yu.v. mokhnacheva and t.n. kharybina. (2008). “study of information requirement of scientists from pushchino scientific center ras in central center library.” http://dspace.nbuv.gov.ua:8080/dspace/bitstream/handle/123456789/31392/20-slascheva.pdf?sequence=1 (accessed january 21, 2013).
3. nikolay a. mazov. “estimation of a flow of scientific publications in research institute on the basis bibliometric citation analysis.” information technologies in social researches, no.16 (2011): 25-30.
4. leo egghe and ronald rousseau. “citation analysis,” in introduction to informetrics: quantitative methods in library, documentation and information science. amsterdam: elsevier science publishers. (1990): 217-218.
5. “bibliometrics publication analysis as a tool for science mapping and research assessment.” (2008), http://ki.se/content/1/c6/01/79/31/introduction_to_bibliometrics_v1.3.pdf (accessed january 21, 2013).
6. a.e. warshawsky and v.a. markusova. (2009). “estimation of efficiency of russian scientists should be corrected.” http://strf.ru/organization.aspx?catalogid=221&d_no=17296 (accessed january 21, 2013).

110 information technology and libraries | september 2009

employing virtualization in library computing: use cases and lessons learned

arwen hutt, michael stuart, daniel suchy, and bradley d. westbrook

this paper provides a broad overview of virtualization technology and describes several examples of its use at the university of california, san diego libraries. libraries can leverage virtualization to address many long-standing library computing challenges, but careful planning is needed to determine if this technology is the right solution for a specific need. this paper outlines both technical and usability considerations, and concludes with a discussion of potential enterprise impacts on the library infrastructure.

operating system virtualization, herein referred to simply as “virtualization,” is a powerful and highly adaptable solution to several library technology challenges, such as managing computer labs, automating cataloging and other procedures, and demonstrating new library services. virtualization has been used in one manner or another for decades,1 but it is only within the last few years that this technology has made significant inroads into library environments. virtualization technology is not without its drawbacks, however. libraries need to assess their needs, as well as the resources required for virtualization, before embarking on large-scale implementations.
this paper provides a broad overview of virtualization technology and explains its benefits and drawbacks by describing some of the ways virtualization has been used at the university of california, san diego (ucsd) libraries.2

virtualization overview

virtualization is used to partition the physical resources (processor, hard drive, network card, etc.) of one computer to run one or more instances of concurrent, but not necessarily identical, operating systems (oss). traditionally only one instance of an operating system, such as microsoft windows, can be used at any one time. when an operating system is virtualized, creating a virtual machine (vm), the vm communicates through virtualization middleware to the hardware or host operating system. this middleware also provides a consistent set of virtual hardware drivers that are transparent to the end user and to the physical hardware. this allows the virtual machine to be used in a variety of heterogeneous environments without the need to reconfigure or install new drivers. with the majority of hardware and compatibility requirements resolved, the computer becomes simply a physical presentation medium for a vm.

two approaches to virtualization: host-based vs. hypervisor

virtualization can be implemented using type 1 or type 2 hypervisor architectures. a type 1 hypervisor (figure 1), commonly referred to as “host-based virtualization,” requires an os such as microsoft windows xp to host a “guest” operating system like linux or even another version of windows. in this configuration, the host os treats the vm like any other application. host-based virtualization products are often intended to be used by a single user on workstation-class hardware. in the type 2 hypervisor architecture (figure 2), commonly referred to as “hypervisor-based virtualization,” the virtualization middleware interacts with the computer’s physical resources without the need for a host operating system.
such systems are usually intended for use by multiple users with the vms accessed over the network. realizing the full benefits of this approach requires a considerable resource commitment for both enterprise-class server hardware and information technology (it) staff.

use cases

archivists’ toolkit

the archivists’ toolkit (at) project is a collaboration of the ucsd libraries, the new york university libraries, and the five colleges libraries (amherst college, hampshire college, mt. holyoke college, smith college, and university of massachusetts, amherst) and is funded by the andrew w. mellon foundation. the at is an open-source archival data management system that provides broad, integrated support for the management of archives. it consists of a java client that connects to a relational database back-end (mysql, mssql, or oracle). the database can be implemented on a networked server or a single workstation.

since its initial release in december 2006, the at has sparked a great deal of interest and rapid uptake of the application within the archival community. this growing interest has, in turn, created an increased demand for demonstrations of the product, workshops and training, and simpler methods for distributing the application. (of the use cases described here, the two for the at distribution and laptop classroom are exploratory, whereas the rest are in production.)

arwen hutt (ahutt@ucsd.edu) is metadata specialist, michael stuart (mstuart@ucsd.edu) is information technology analyst, daniel suchy (dsuchy@ucsd.edu) is public services technology analyst, and bradley d. westbrook (bradw@library.ucsd.edu) is metadata librarian and digital archivist, university of california, san diego libraries.

at workshops

the society of american archivists sponsors a two-day at workshop occurring on multiple dates at several locations.
in addition, the at team provides one- and two-day workshops to different institutional audiences. at workshops are designed to give participants a hands-on experience using the at application. accomplishing this effectively requires, at a minimum, supplying all participants with identical but separate databases so that participants can complete the same learning exercises simultaneously and independently without concern for working in each other’s space. in addition, an ideal configuration would reduce the workload of the instructors, freeing them from having to set up the at instructional database onsite for each workshop. for these workshops we needed to do the following:

- provide identical but separate databases and database content for all workshop attendees
- create an easily reproducible installation and setup for workshops by preparing and populating the at instructional database in advance

virtualization allows the at workshop instructors to predefine the workstation configuration, including the installation and population of the at databases, prior to arriving at the workshop site. to accomplish this we developed a workshop vm configuration with mysql and the at client installed within a linux ubuntu os. the workshop instructors then built the at vm with the data they require for the workshop. the at client and database are loaded on a dvd or flash drive and shipped to the classroom managers at the workshop sites, who then need only to install a copy of the vm and the freely available vmplayer software (necessary to launch the at vm) onto each workstation in the classroom. the at vm, once built, can be used many times both for multiple workstations in a classroom as well as for multiple workshops at different times and locations.

this implementation has worked very well, saving both time and effort for the instructors and classroom support staff by reducing the time and communication necessary for deploying and reconfiguring the vm. it also reduces the chances that there will be an unexpected conflict between the application and the host workstation’s configuration.

figure 1. a type 1 hypervisor (host-based) implementation

figure 2. a type 2 hypervisor-based implementation

but the method is not perfect. more than anything else, licensing costs motivated us to choose linux as the operating system instead of a proprietary os such as windows. this reduces the cost of using the vm, but it also requires workshop participants to use an os with which they are often unfamiliar. for some participants, unfamiliarity with linux can make the workshop more difficult than it would be if a more ubiquitous os were used.

at demonstrations

in a similar vein, members of the at team are often called upon to demonstrate the application at various professional conferences and other venues. these demonstrations require the setup and population of a demonstration database with content for illustrating all of the application’s functions. one of the constraints posed by the demonstration scenario is the importance of using a local database instance rather than a networked instance, since network connections can be unreliable or outright unavailable (network connectivity being an issue we’ve all faced at conferences). another constraint is that portions of the demonstrations need some level of preparation (for example, knowing what search terms will return a nonempty result set), which must be customized for the unique content of a database. a final constraint is that, because portions of the demonstration (import and data merging) alter the state of the database, changes to the database must be easily reversible, or else new examples must be created before the database can be reused.
building on our experience of using virtualization to implement multiple copies of an at installation, we evaluated the possibility of using the same technology for simplifying the setup necessary for demonstrating the at. as with the workshops, the use of a vm for at demonstrations allows for easy distribution of a prepopulated database, which can be used by multiple team members at disparate geographic locations and on different host oss. this significantly reduces the cost of creating (and recreating) demonstration databases. in addition, demonstration scripts can be shared between team members, creating additional time savings as well as facilitating team participation in the development and refinement of the demonstration. perhaps most important is the ability to roll back the vm to a specific state or snapshot of the database. this means the database can be quickly returned to its original state after being altered during a demonstration. overall, despite our initial anxiety about depending on the vm for presentations to large audiences, this solution has proven very useful, reliable, and cost-effective. at distribution implementing the at requires installing both the toolkit client and a database application such as mysql, instantiating an at database, and establishing the connection between database and client. for many potential customers of the at, the requirements for database creation and management can be a significant barrier due to inexperience with how such processes work and a lack of readily available it resources. many of these customers simply desire a plug-and-play version of the application that they can install and use without requiring technical assistance. it is possible to satisfy this need for a plug-and-play at by constructing a vm containing a fully installed and ready-to-use at application and database instance. this significantly reduces the number and difficulty of steps involved in setting up a functional at instance. 
the customer would only need to transfer the vm from a dvd or other source to their computer, download and install the vm reader, and then launch the at vm. they would then be able to begin using the at immediately. this removes the need for the user to perform database creation and management, arguably the most technically challenging portion of the setup process. users would still have the option of configuring the application (default values, lookup lists, etc.) in accord with the practices of their repository.

batch processing catalog records

the rapid growth of electronic resources is significantly changing the nature of library cataloging. not only are the types of library materials changing and multiplying, but the number of e-resources being acquired also increases each year. electronic book and music packages often contain tens of thousands of items, each requiring some level of cataloging. because of these challenges, staff are increasingly cataloging resources with specialized programs, scripts, and macros that allow for semiautomated record creation and editing. such tools make it possible to work on large sets of resources, work that would not be financially possible to perform manually item by item. however, the specialized configuration of the workstation required for using these automated procedures makes it very difficult to use the workstation for other purposes at the same time. in fact, user interaction with the workstation while the process is running can cause a job to terminate prior to completion. in either scenario, productivity is compromised.

virtualization offers an excellent remedy to this problem. a virtual machine configured for semiautomated batch processing allows for unused resources on the workstation to process the batch requests in an isolated environment while, at the same time and on the same machine, the user is able to work on other tasks. in cases where the user’s machine is not an ideal candidate for virtualization, the vm can be hosted via a hypervisor-based solution, and the user can access the vm with familiar remote access tools such as remote desktop in windows xp.

employing virtualization in library computing | hutt et al. 113

secure sandbox

in addition to challenges posed by increasingly large quantities of acquisitions, the ucsd libraries are also encountering an increasing variety of library material types. most notable is the variety and uniqueness of digital media acquired by the library, such as specialized programs to process and view research data sets, new media formats and viewers, and application installers. cataloging some of these materials requires that media be loaded and that applications be installed and run to inspect and validate content. but running or opening these materials, which are sometimes from unknown sources, poses a security risk to both the user’s workstation and to the larger pool of library resources accessible via the network. many installers require a user to have administrative privileges, which can pose a threat to network security. the virtual machine allows for a user to have administrative privileges within the vm, but not outside of the vm. the user can be provided with the privileges needed for installing and validating content without modifying their privileges on the host machine. in addition, the vm can be isolated by configuring its network connection so that any potential security risks are limited to the vm instance and do not extend to either the host machine or the network.

laptop classroom

instructors at the ucsd libraries need a laptop classroom that meets the usual requirements for this type of service (mobility, dependability, etc.) but also allows for the variety of computing environments and applications in use throughout our several library locations.
in a least-common-denominator scenario, computers are configured to meet a general standard (usually microsoft windows with a standard browser and office suite) and allow minimal customization. while this solution has its advantages and is easy to configure and maintain from the it perspective, it leaves much to be desired for an instructor who needs to use a variety of tools in the classroom, often on demand. the goal in this case is not to settle for a single generic build but instead to look for a solution that accommodates three needs:

- the ability to switch quickly between different customized os configurations
- the ability to add and remove applications on demand in a classroom setting
- the ability to restore a computer modified during class to its original state

of course, regardless of the approach taken, the laptops still needed to retain a high level of system security, application stability, and regular hardware maintenance. after a thorough review of the different technologies and tools already in use in the libraries, we determined that virtualization might also serve to meet the requirements of our laptop classroom. the need to support multiple users and multiple vms makes this scenario an ideal candidate for hypervisor-based virtualization. we decided to use vdi (virtual desktop infrastructure), a commercially available hypervisor product from vmware. vmware is one of the largest providers of virtualization software, and we were already familiar with several iterations of its host-based vm services. the core of our project plan consists of a base vm to be created and managed by our it department. to support a wide variety of applications and instruction styles, instructors could create a customized vm specific to their library’s instruction needs with only nominal assistance from it staff. the custom vm would then be made available on demand to the laptops from a central server (as depicted in figure 2 above).
in this manner, instructors could “own” and maintain a personal instructional computing environment, while the classroom manager could still ensure the laptop classroom as a whole maintained the necessary secure software environment required by it. as an added benefit, once these vms are established, they could be accessed and used in a variety of diverse locations.

considerations for implementation

before implementing any virtualization solution, in-depth analysis and testing are needed to determine which type of solution, if any, is appropriate for a specific use case in a specific environment. this analysis should include three major areas of focus: user experience, application performance in the virtualized environment, and effect on the enterprise infrastructure. in this section of this paper, we review considerations that, in hindsight, we would have found to be extremely valuable in the ucsd libraries’ various implementations of virtualization.

user experience

traditionally, system engineers have developed systems and tuned performance according to engineering metrics (e.g., megabytes per second and network latency). while such metrics remain valuable to most assessments of a computer application, performance assessments are being increasingly defined by usability and user experience factors. in an academic computing environment, especially in areas such as library computer labs, these newer kinds of performance measures are important indicators of how effectively an application performs and, indirectly, of how well resources are being used. virtualization can be implemented in a way that allows library users to have access to both the virtualized and host oss or to multiple virtualized oss.
since virtualization essentially creates layers within the workstation, multiple os layers (either host or virtualized) can cause the users to become confused as to which os they are interacting with at a given moment. in that kind of implementation, the user can lose his or her way among the host and guest oss as well as become disoriented by differing features of the virtualized oss. for example, the user may choose to save a file to the desktop, but may not be aware that the file will be saved to the desktop of the virtualized os and not the host os. external device support can also be problematic for the end user, particularly with regard to common devices such as flash drives. the user needs to be aware of which operating system is in use, since it is usually the only one with which an external device is configured to work. authentication to a system is another example of how the relationship between the host and guest os can cause confusion. the introduction of a second os implicitly creates a second level of authentication and authorization that must be configured separately from that of the host os. user privileges may differ between the host and guest os for a particular vm configuration. for instance, a user might need to remember two logins or at least enter the same login credentials twice. these unexpected differences between the host and guest os produce negative effects on a user’s experience. this can be a critical factor in a time-sensitive environment such as a computer lab, where the instructor needs to devote class time to teaching and not to preparing the computers for use and navigating students through applications. interface latency and responsiveness latency (meaning here the responsiveness or “sluggishness” of the software application or the os) in any interface can be a problem for usability. developers devote a significant amount of time to improving operating systems and application interfaces to specifically address this issue. 
however, users will often be unable to recognize when an application is running a virtualized os and will thus expect virtualized applications to perform with the same responsiveness as applications that are not-virtualized. in our experience, some vm implementations exhibit noticeable interface latency because of inherent limitations of the virtualization software. perhaps the most notable and restrictive limitation is the lack of advanced 3d video rendering capability. this is due to the lack of support for hardware-accelerated graphics, thus adding an extra layer of communication between the application and the video card and slowing down performance. in most hardware-accelerated 3d applications (e.g., google earth pro or second life), this latency is such a problem that the application becomes unusable in a virtualized environment. recent developments have begun to address and, in some cases, overcome these limitations.3 in every virtualization solution there is overhead for the virtualization software to do its job and delegate resources. in our experience, this has been found to cause an approximately 10–20 percent performance penalty. most applications will run well with little or moderate changes to configuration when virtualized, but the overhead should not be overlooked or assumed to be inconsequential. it is also valuable to point out that the combination of applications in a vm, as well as vms running together on the same host, can create further performance issues. traditional bottlenecks the bottlenecks faced in traditional library computing systems also remain in almost every virtualization implementation. general application performance is usually limited by the specifications of one or more of the following components: processor, memory, storage, and network hardware. in most cases, assuming adequate hardware resources are available, performance issues can be easily addressed by reconfiguring the resources for the vm. 
for example, when a vm’s application is memory-bound (i.e., performance is limited by the memory available to the vm), the problem can be resolved by adjusting the amount of memory allocated to the vm. a critical component of planning a successful virtualization deployment is a thorough analysis of user workflow and the ways in which the vm will be utilized. although the types of user workflows may vary widely, analysis and testing serve to predict and possibly avoid potential bottlenecks in system performance.

enterprise impact

when assessing the effect virtualization will have on your library infrastructure, it is important to have an accurate understanding of the resources and capabilities that will form the foundation for the virtualized infrastructure. it is a misconception that it is necessary to purchase state-of-the-art hardware to implement virtualization. not only are organizations realizing how to utilize existing hardware better with virtualization for specific projects, they are discovering that the technology can be extended to the rest of the organization and be successfully integrated into their it management practices. virtualization does, however, impose certain performance requirements for large-scale deployments that will be used in a 24/7 production environment. in such scenarios, organizations should first compare the level of performance offered by their current hardware resources with the performance of new hardware. the most compelling reasons to buy new servers include the economies of scale that can be obtained by running more vms on fewer, more robust servers, as well as the enhanced performance supplied by newer, more virtualization-aware hardware. in addition, virtualization allows for resources to be used more efficiently, resulting in lower power consumption and cooling costs. also, the network is often one of the most overlooked factors when planning a virtualization project.
while a local virtualized environment (i.e., a single computer) may not necessarily require a high-performance network environment, any solution that calls for a hypervisor-based infrastructure requires considerable planning and scaling for bandwidth requirements. the current network hardware available in your infrastructure may not perform or scale adequately to meet the needs of this vm use. again, this highlights the importance of thorough user workflow analyses and testing prior to implementation. depending on the scope of your virtualization project, deployment in your library can potentially be expensive and can have many indirect costs. while the initial investment in hardware is relatively easy to calculate, other factors, such as ongoing staff training and system administration overhead, are much more difficult to determine. in addition, virtualization adds another layer to oftentimes already complex software licensing terms. to deal with the increased use of virtualization, software vendors are devoting increasing attention to the intricacies of licensing their products for use in such environments. while virtualization can ameliorate some licensing constraints (as noted in the at workshop use case), it can also conceal and promote licensing violations, such as multiple uses of a single-license application or access to license-restricted materials. license review is a prudent and highly recommended component of implementing a virtualization solution. finally, concerning virtualization software itself, it also should be noted that while commercial vm companies usually provide plentiful resources for aiding implementation, several worthy open-source options also exist. as with any open-source software, the total cost of operation (e.g., the costs of development, maintenance, and support) needs to be considered.
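as a rough worked example of the overhead figure reported earlier in this section (an approximately 10–20 percent performance penalty), the sketch below estimates how long a job might take once virtualized. two things here are assumptions rather than findings: the penalty is read as a proportional loss of throughput, and the 60-minute native job is a hypothetical figure chosen only for the arithmetic.

```python
def virtualized_time(native_minutes: float, penalty: float) -> float:
    """Estimate run time under virtualization.

    Assumes (for illustration only) that a penalty of 0.10 means the VM
    delivers 90% of native throughput, so the same work takes
    native_minutes / (1 - penalty).
    """
    if not 0 <= penalty < 1:
        raise ValueError("penalty must be in the range [0, 1)")
    return native_minutes / (1.0 - penalty)


if __name__ == "__main__":
    native = 60.0  # hypothetical batch job length in minutes
    for penalty in (0.10, 0.20):
        estimate = virtualized_time(native, penalty)
        print(f"{penalty:.0%} penalty: about {estimate:.1f} minutes")
```

under these assumptions the hypothetical hour-long job would take roughly 67 to 75 minutes, which is the kind of difference that workflow analysis and testing should confirm before deployment.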
conclusion

as our use cases illustrate, there are numerous potential applications and benefits of virtualization technology in the library environment. while we have illustrated a number of these, many more possibilities exist, and further opportunities for its application will be discovered as virtualization technology matures and is adopted by a growing number of libraries. as with any technology, there are many factors that must be taken into account to evaluate if and when virtualization is the right tool for the job. in short, successful implementation of virtualization requires thoughtful planning. when so implemented, virtualization can provide libraries with cost-effective solutions to long-standing problems.

references and notes

1. alessio gaspar et al., “the role of virtualization in computing education,” in proceedings of the 39th sigcse technical symposium on computer science education (new york: acm, 2008): 131–32; paul ghostine, “desktop virtualization: streamlining the future of university it,” information today 25, no. 2 (2008): 16; robert p. goldberg, “formal requirements for virtualizable third generation architectures,” in communications of the acm 17, no. 7 (new york: acm, 1974): 412–21; and karissa miller and mahmoud pegah, “virtualization: virtually at the desktop,” in proceedings of the 35th annual acm siguccs conference on user services (new york: acm, 2007): 255–60.

2. for other, non–ucsd use cases of virtualization, see joel c. adams and w. d. laverell, “configuring a multi-course lab for system-level projects,” sigcse bulletin 37, no. 1 (2005): 525–29; david collins, “using vmware and live cd’s to configure a secure, flexible, easy to manage computer lab environment,” journal of computing for small colleges 21, no. 4 (2006): 273–77; rance d. necaise, “using vmware for dual operating systems,” journal of computing in small colleges 17, no.
2 (2001): 294–300; and jason nieh and chris vaill, “experiences teaching operating systems using virtual platforms and linux,” sigcse bulletin 37, no. 1 (2005): 520–24.

3. h. andrés lagar-cavilla, “vmgl (formerly xen-gl): opengl hardware 3d acceleration for virtual machines,” www.cs.toronto.edu/~andreslc/xen-gl/ (accessed oct. 21, 2008).

teaching with marc tapes

pauline atherton: associate professor, and judith tessier: research associate, school of library science, syracuse university, syracuse, n.y.

a computer based laboratory for library science students to use in class assignments and for independent projects has been developed and used for one year at syracuse university. marc pilot project tapes formed the data base. different computer programs and various samples of the marc file (48,000 records, approx.) were used for search and retrieval operations. data bases, programs, and seven different class assignments are described and evaluated for their impact on library education in general and individual students and faculty in particular.

a computer based laboratory for use in library science instruction, with the marc pilot project tapes as the file of catalog records, has been the focus of leep (library education experimental project) at syracuse university, school of library science, since august 1968. work has been twofold: 1) development of the laboratory as an instructional tool and 2) exploration of applications of such a facility in library education. the instructional aspect of the project is really "learning with marc tapes". the development of the laboratory has been reported elsewhere (1), and will not be emphasized in this report.

many of today's students in library schools will be tomorrow's workers in libraries that will be parts of library networks and cooperative technical processing centers. they will be involved in library automation projects and related developments.
in anticipation of personnel needs for new modes of library service, leep designed activities in the laboratory to satisfy minimum requirements for tomorrow's professional and to encourage maximum use for students with serious interests.

the aim during the past year has been to develop a laboratory where computer programs and the marc tapes could be used by library school faculty in class assignments and by students for independent research. the objective was to achieve a program of activities integrated throughout the library school curriculum, one in which computer applications would be seen as one more source of support for the functioning librarian. students were to be provided with a myriad of experiences that would help them to probe the potential usefulness of machine readable catalog data and to develop certain minimal skills needed for using computer based retrieval systems. figure 1 shows the resulting place of leep in the library school.

the approach has two stresses: 1) demonstrations of library automation and 2) activities where computers are used in librarianship for research and experimental applications. this orientation is in contradistinction to hillis griffin's use of the term, automation in technical processes (as he defines it, it includes only acquisition, cataloging, and circulation processes) (2).

after a short description of the facilities at syracuse this paper will deal with the various class assignments and student projects developed in the first academic year of use, the feedback from students and faculty concerning the usefulness of marc records in instruction, and the authors' conclusions.

fig. 1. leep's role in the library school. (diagram labels: l.s. classes in reference, cataloging, bibliography, technical services, information systems, etc.; faculty research; curriculum development; class assignments; independent student projects.)

26 journal of library automation vol. 3/1 march, 1970

table 1: data bases and computer programs available through leep.

i. data bases:
   marc i: 48,000 records (the entire pilot project file)
   programs:
   biblolst (3), language: assembler. function: access by lc card number, prints each bibliographic record in lc diagnostic format.
   fdr (4), language: assembler. function: a frequency distribution program for file study.
   marc i double column lister, language: assembler. function: prints the entire content of a file of marc i records in a two-column page format.
   marc i record sort, language: assembler. function: sorts a file of marc i records on the content of any variable (tagged) field.

ii. data bases:
   marc/dps: 1000 (the first 1000 marc i records)
   marcs/dps: 5200 records (a stratified sample of social science records, selected by lc class number, and the lc a's and z's)
   marcs/dps(ii): 5200 records above, plus 3800 (a stratified sample in humanities, selected by lc class number)
   programs:
   marc reformat, language: pl/i. function: reformats marc i records to meet dps requirements and performs certain counts of characters per field, etc.
   dps (ibm document processing system) (1,5,6), language: assembler. function: processes entire text of marc record to produce dictionary and search file of keywords. retrieves records by any keyword or keyword combinations specified by searcher. allows for root searches, weighting, phrase and field placement, etc.

iii. data bases:
   marcs/molds: 5200 (see marcs/dps above)
   march/molds: 3800 (see marcs/dps (ii) above)
   programs:
   dbg (7), language: pl/i. function: data base generator; selects and formats records for molds.
   molds (management online data system) (8), language: fortran. function: retrieves fixed-field records by matching elements; includes sort capabilities and arithmetic operations.

iv. data bases:
   licosh (lc subject headings, 7th ed.)
   programs:
   shop (subject heading output ...), language: pl/i. function: formats and prints subject heading records.
   stat, language: pl/i. function: a frequency distribution program for file study.

v. data bases:
   test lc/z class (lc schedule z)
   index to z class
   programs:
   z text processor, language: assembler, fortran. function: selects certain lines of text from sample of lc z-class schedule and transforms lines into kwic indexable data.

leep facilities

the facilities at leep include marc pilot project data bases, computer programs, and personnel. students and faculty were fully informed of the accessibility of the staff of faculty, programmers, and graduate student liaison personnel for consultation and guidance. further, the facility includes computer time; either the leep budget or the library school's university-supported computer budget covers the time charges for class assignments and student projects. table 1 lists and describes the leep programs (with explanation of acronyms) and data bases that are available at the present time at the university computing center for library instruction purposes.

the leep facility uses the university's ibm 360/50 computer with the following characteristics:

   main storage: 512 k bytes
   disc storage: 240 m bytes (2314 disc unit)
   3 tape units: 9 channel (800 bpi max)
   1 tape unit: 7 channel (800 bpi max)
   printer: 1000 lpm (two print trains, std and tn)
   card read/punch: 1000 cpm in, 300 cpm out

work on the implementation of the computer programs available from the library of congress and ibm was carried on through most of the first academic year. biblolst was in use during the fall term, but the first efficient retrieval system, dps, became available only during the spring 1969 term. for this reason, instructional development has been limited to one semester and one summer session. the experiences reported here are based upon those two terms. marc ii records became available only in late spring 1969. no effort to utilize these records has been made to date, but future plans do include using such a data base.
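the keyword dictionary and search file that table 1 attributes to dps can be pictured as an inverted index. the sketch below is a modern illustration only, not the dps implementation: the four toy records, the function names, and the trailing "*" used to mark a root search are all hypothetical, and weighting, phrase, and field placement are left out.

```python
from collections import defaultdict

# Hypothetical toy records standing in for the text of MARC records.
RECORDS = {
    1: "library automation in research libraries",
    2: "cataloging and classification of serials",
    3: "automated cataloging systems",
    4: "history of the public library",
}


def build_dictionary(records):
    """Map every keyword to the set of record numbers that contain it."""
    index = defaultdict(set)
    for rec_id, text in records.items():
        for word in text.lower().split():
            index[word].add(rec_id)
    return index


def lookup(index, term):
    """Return records for one term; a trailing '*' (hypothetical syntax) means root search."""
    if term.endswith("*"):
        root = term[:-1]
        hits = set()
        for word, ids in index.items():
            if word.startswith(root):
                hits |= ids
        return hits
    return set(index.get(term, set()))


def search_and(index, *terms):
    """Records containing every term."""
    sets = [lookup(index, t) for t in terms]
    return set.intersection(*sets) if sets else set()


def search_or(index, *terms):
    """Records containing at least one term."""
    hits = set()
    for t in terms:
        hits |= lookup(index, t)
    return hits


if __name__ == "__main__":
    idx = build_dictionary(RECORDS)
    print(sorted(search_and(idx, "cataloging", "automat*")))
    print(sorted(search_or(idx, "serials", "public")))
    print(sorted(lookup(idx, "librar*")))
```

an and search narrows the result to records sharing all terms, an or search widens it, and a root search gathers every keyword that begins with the given stem, which is the behavior the class assignments ask students to compare.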
class assignments and student projects using leep

because most assignments use the dps retrieval system, learning that system once helps the student in consecutive assignments. leep staff arranged tours of the computing center for individual students, classes and faculty. keypunching instruction and dps explanations were distributed with first assignments, or as needed on an individual basis. during the summer an all-school leep orientation seminar and a leep clinic were instituted; at the clinic a staff member was available for consultation one hour each day in the corridor outside the library school classrooms. materials related to marc/dps are always available near the students' study carrels and their reserved reading shelf.

seven different assignments were developed for classroom use by the library school faculty working with the leep staff during the first year. each assignment reflects the interests of the teacher and the purposes of the unit in which it was introduced. during the spring semester over 100 students had computer based assignments (over 200 searches total); during the summer session, seventy-six students had assignments (over 200 searches). following are abstracts of the seven assignments:

l.s. 407 reference service: bibliographic linking
purpose: a) obtain a listing of titles containing bibliographies from marc records; b) prepare for extension and interconnection of some of these bibliographic entries and the original titles within the marc data base; c) practice bibliographic evaluation.
procedure: a) area of interest was selected by dewey or l.c. class number (root search, and, or options) from marc file of 1000 records. records with class number and bibliographic note were retrieved using dps/marc system. b) bibliographic entries in these titles were examined and marc i worksheets were made for three english monographs, with added data fields for source of reference. c) evaluate the bibliographies in the books examined as reference tools for a scholar.

l.s. 427 cat. & class. (richmond): title searches
purpose: contrast searching for titles to be ordered in bpr and in marc file, in order to obtain l.c. card number, established entry, and full cataloging record.
procedure: a) search for 12 titles in bpr (1966 and 1967). b) search in marc file (1000 records) for 10 (and searches of title words), and prepare unit cards for any 5.

l.s. 427 cat. & class. (moore): searching a shelf-list
purpose: a) verify an assigned class number with library holdings in that number. b) compare the subject headings for one class number.
procedure: assign dewey class numbers to three titles; search the marcs/dps file for the assigned number; compare titles cataloged with titles retrieved by search for consistency and compare subject headings on worksheet provided.

l.s. 621 technical services (gration and webster): searching for acquisitions
purpose: extract cataloging records from marc files (48,000 records) for titles selected from choice or library journal (1967 issues).
procedure: cite l.c. card number for selected titles (at least 10); keypunch numbers; submit with job control deck to dispatcher in computing center and obtain printout of full l.c. cataloging via biblolst program.

l.s. 621 technical services (gration): evaluation of series
purpose: a) for a given subject, examine catalog records for titles in a series. b) determine quantity of material on a subject published in series. c) evaluate series notes and series tracing with a view to setting policy for series control.
procedure: a) search for subject via dps/marc system (5000 social science monographs). (and, or, root searches of any descriptors are possible.) b) examine printout of 50 titles (or less) for series notes, publishers series, etc. c) write procedural statement for handling series.

l.s. 628 information systems (bottle): preparation of bibliographic information for machine input
purpose: a) exercise keypunching. b) simulate preparation of bibliographic information for machine input.
procedure: for one marc i input worksheet (done in ls407) keypunch six data elements following a fixed format.

l.s. 628 information systems (bottle): use of boolean logic for searching marc file
purpose: a) practice in use of boolean operators. b) practice in use of a reference retrieval system, e.g. dps/marcs.
procedure: construct 3 searches: 1) do or search for references found earlier in s.u. library with both l.c. card number and in bnb; 2) do and search for two descriptors possibly in same document, e.g., d.c. class number and l.c. class number, or two english language words that describe a subject; 3) do or search looking for same subject as in 2). compare results and comment on use of modifiers (root search, specification of field, sentence, or paragraph to be searched, etc.).

aside from the structured approach developed for classroom assignments, students were encouraged to develop independent projects. one student group developed an index to abstracts of recent articles in the area of technical services. this involved analysis of three aspects of information in journal articles (type of library, function, and technique) as described in abstracts prepared by the class. the project group did the coding, keypunching, sorting, and listing needed to produce the index. several decisions about abbreviations, format, content, and index order had to be made. leep provided keypunching and instructions for implementing the project. this student abstracting service, begun by one group in the fall of 1968, was updated by another group in the spring. the index was ready in a second edition for use by summer school students.
it may become an ongoing service if there is enough student interest. another group of students produced a computer printed bibliography on negro history for an inner-city school library in syracuse. pl/1, or programming language one, was offered twice as a noncredit, eight-session seminar for librarians and library school students. the teacher, a leep consultant, stressed the pl/1 vocabulary subset for character manipulation. the students, on completion, could do simple programs to access, count and print marc i records. one student chose to continue his pl/1 experience, and under an independent project, programmed an ordering procedure and reporting forms. through term projects or independent research, the student can get academic credit, free computer time, and consultation from faculty and leep. during the spring and summer, a general invitation to the students to make dps searches was offered and a form for search evaluation was developed. during the summer session, such independent searches became more popular as instructors of subject reference, bibliography of the social sciences, and bibliography of the humanities allowed dps searching as one technique in term project development. these independent search results became a part of the students' subject bibliographies. searches run by leep were used in two classes as instructional aids: in advanced cataloging and classification as examples of precedents in cataloging practices; and in bibliography of the social sciences as an example of bibliography building for area studies work where information on the catalog card can be retrieved by searching any bibliographic field therein. during the fall semester, 1969, students had the option of taking a three-unit research course on search strategy and retrieval evaluation. 
the basic tool of this course was marc on dps, and the objective was analysis and evaluation of reference retrieval via a computer based system, as well as evaluation of traditional cataloging in a new retrieval form. work continues to prepare other computer based assignments in courses in the bibliography of social sciences and humanities, advanced cataloging and subject reference. feedback and findings this first year has not yet produced conclusive results about the best direction in which to continue, but the faculty has been encouraged to think that the above-described integrated activities for the student are promising. different students used the computer based laboratory in varying degrees, the depth of their investigations being their choice. for some students the only experience with leep was a class assignment or an orientation lecture on computers and cataloging or computers and reference service. others met leep in class but also made efforts to explore the pl/1 seminars, and at the informal coffee hours and orientations showed considerable interest in this field. the first year has been a blending of library automation and other concepts in librarianship which the student could explore through practical experience. evaluation forms were distributed through student mailboxes at the close of each semester to get individual feedback. of ninety responding (about 35% of the library school population) sixty-four students had used leep in classes and seventeen had used it independently. sixty students "picked up new ideas as a result of leep." twenty-two reported no new ideas. fifty-eight students would "take a job involving library automation" and fifty-five reported this was not the same response they would have given a year earlier. some comments included: "the field has an exciting future," "automation is of value to libraries," "
we need librarians in the field," and "now i can talk to experts and communicate our needs to them." the first assignment that the student encounters is the most important one. all the students in the school, not just those who have expressed interest in automation, are required to take three of the four courses with leep assignments. students with a broad range of personal interests must be exposed to the potential of computers for libraries. the first assignment is designed to overcome students' fears of computers and related equipment. many are reluctant to keypunch and hesitant about approaching any problem involving equipment. the method is a simple one: start the student with a simple assignment that uses little computerese and has a high chance of success. every instruction is made clear and the steps to follow are stated. the reason for the assignment is spelled out exactly and its relation to everyday librarianship explained. even though students tend to resist what looks foreign and complicated, they usually respond, upon completion of the first assignment: "that was easy". the assignments described above in cataloging and in technical services best illustrate the simple, locked-in nature of a beginning assignment. in the classroom, marc/dps has been presented in terms of the assignment. in order to make the system understandable by the uninitiated, students have at times been given only a portion of its capabilities. this approach works well for locked-in assignments, as in cataloging. however, it begins to seem that an in-depth introduction to dps, with all its capabilities and flexibilities, is a better start for some students. an evaluation of retrieval systems and machine readable cataloging data is one instructional aim which may be best achieved in special seminars.
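for readers unfamiliar with the kind of searching these assignments exercised, the and, or, and root searches over descriptors can be sketched in a few lines of modern code. this is only an illustration of the boolean logic, not a reconstruction of the ibm document processing system: the `Record` class, the helper functions, the card numbers, and the descriptors are all invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """A toy stand-in for a catalog record (all fields are hypothetical)."""
    lc_card_number: str
    title: str
    descriptors: set = field(default_factory=set)


def matches_root(descriptor, root):
    """Root search: a descriptor matches if it begins with the given root."""
    return descriptor.startswith(root)


def has_root(record, root):
    """True if any descriptor of the record matches the root."""
    return any(matches_root(d, root) for d in record.descriptors)


def search_and(records, roots):
    """AND search: every root must match some descriptor of the record."""
    return [r for r in records if all(has_root(r, root) for root in roots)]


def search_or(records, roots):
    """OR search: at least one root must match some descriptor."""
    return [r for r in records if any(has_root(r, root) for root in roots)]


# Invented sample data, for illustration only.
records = [
    Record("68-0001", "library automation handbook", {"libraries", "automation"}),
    Record("68-0002", "cataloging rules", {"cataloging", "libraries"}),
    Record("68-0003", "computers in schools", {"computers", "education"}),
]

and_hits = search_and(records, ["librar", "catalog"])
or_hits = search_or(records, ["automat", "comput"])

print([r.lc_card_number for r in and_hits])  # records matching both roots
print([r.lc_card_number for r in or_hits])   # records matching either root
```

the and search narrows the result to records carrying both subjects, while the or search widens it, which is the contrast students were asked to compare and comment on.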
experience has also shown that at times the integration of leep assignments into the instructional objectives of a course vitiates both elements. the reference assignment is an example. the final objective of the assignment was evaluation of bibliographies in books (indicated by a bibliography note or marc i indicator). the first section of the assignment included structuring a subject search in marc/dps that illustrated access to bibliography indicators not accessible in the card catalog. the second and third parts dealt with retrieval of the books and evaluation of the bibliographies. the students expressed satisfaction with the citation retrieval, but they experienced frustrations in finding the books cited on the marc tapes. finally, the evaluation of the bibliographies suffered from the student frustration, and students came to regard a "leep sponsored" assignment as "too complicated." the indications from this one assignment were that instruction in retrieval techniques and potential would be sufficient, and need not be tied directly to a larger problem which may rely on external resources. in other classes an integrated approach was used successfully, whenever the techniques and parts of the assignment were kept simple. the greatest impact of the pl/i non-credit seminar was not learning how to program, but understanding what is involved in using such precise techniques, and how to specify steps with logic. this helped make the programmer's role more understandable to the librarians and students in the class. summary the stress during the first year of operation has been on implementing the leep facility for class use. now that development of data bases and retrieval programs is somewhat stabilized, it is hoped to move on to more use of this tool for analysis and evaluation. with faculty support, it is planned to continue designing class assignments, increasing the "catalog of tested assignments." 
the intent is to encourage a serious study of the marc record, and hence traditional cataloging practices. it should also be possible to do some useful research into the nature of bibliographic description as a tool for reference retrieval. leep will continue through 1969-70, using the marc tapes (marc i and, soon, marc ii). emphasis will not be to teach "how to automate a library better" but to learn "what difference does a machine readable catalog make" and "of what use and value is such a record to librarians and library users?" the marc records will be used to ask questions about how to improve or change acquisitions, cataloging, reference, and other library functions. this is a departure from the use of computer based facilities to teach library data processing. the leep approach seems to have had an impact upon library students who are "straight librarians," and not very interested in library automation. it may also foster a greater interest in analysis and research in the library school. for example, with machine readable catalog records it is possible to monitor what has been done in practice before and after the anglo-american rules, or the various additions in l.c. subject cataloging and classification. it is possible to check cataloging consistency more easily. because the marc tapes include both library of congress class numbers and dewey class numbers, they can be compared as to their usefulness for subject searches, subject spread on a library's shelves, etc. with marc ii tapes, it should be possible to simulate a data base more like a national bibliography, and thus open new fields for efficient survey. as noted earlier, all the research, whether on retrieval evaluation or on the nature of cataloging, is student based. the library school's objective is to provide the facility, the impetus, and the guidance which make up the intellectual environment where such investigation can be done.
leep is a new part of the library school environment. it can serve to encourage librarians to consider, understand, and even use computers where applicable, in library schools today and in the library field tomorrow. the future use of computers in libraries will be decided by librarians and not by system programmers or automation technologists. to prepare such librarians there must be a time in their lives for experimentation, research and development. there must be a time when they can objectively question what of the old can blend with the new and what will have to be revised. we hope that leep has provided that opportunity to some, if not all, of the students and faculty at syracuse university school of library science. acknowledgments work reported here was partially supported by a grant from usoe, bureau of library research, oeg-08-0664. this paper is based on a presentation before the library education division at the american library association annual meeting, atlantic city, new jersey, june 23, 1969. programs and descriptions microfiches and photocopies of the following leep program descriptions and related materials may be obtained from national auxiliary publications service of asis as follows: 1) "leep program description for marc i file: distribution of records" (naps 00878); 2) "leep report 69-11: leep program description: marc i double column lister" (naps 00879); 3) "leep report 69-12: leep program description: leep-biblolst" (naps 00880); 4) "leep report 69-13: leep program description: marc i record sort" (naps 00881); 5) "leep report 69-14: leep program description: listing machine-readable library of congress subject heading file" (naps 00882); 6) "leep report 69-15: the conversion of the lc classification schedules to machine-readable form" (naps 00883); 7) "rome project program description: molds support package" (naps 00884).
copies in mimeographed form may also be had by writing to library education experimental project, school of library science, syracuse university, syracuse, new york 13210. references 1. atherton, pauline; wyman, john: "searching marc tapes with ibm/document processing system." in proceedings of the american society for information science, 6 (westport, connecticut, greenwood publishing corporation, 1969), 83-88. 2. griffin, hillis: "automation of technical processes in libraries," in annual review of information science and technology, volume 3, cuadra, carlos a., ed., (chicago: encyclopedia britannica, 1968), pp. 241-262. 3. library of congress, information systems office: marc pilot project report, appendix a (washington, d.c., 31 january 1967), p. ill, 3, 21. 4. martel, frank; stillwell, john: marc pilot project file analysis of distribution of records (syracuse: leep report 69-1). 5. tessier, judith: index and manual for ibm system/360 document processing system (syracuse: leep report 69-5). 6. tessier, judith: searching marc/dps; a users manual (syracuse, n.y.: leep report 69-3). 7. leep report to be published december 1969. 8. peterson, p. l.; carnes, r.; reid, i.; et al.: large scale information processing system, vol. i. compiler, natural language and information processing, report radc-tr-68-401, volume i (april 1969).
article developing a minimalist multilingual full-text digital library solution for disconnected remote library partners todd digby information technology and libraries | december 2021 https://doi.org/10.6017/ital.v40i4.13319 todd digby (digby@ufl.edu) is chair, library technology services department, and associate university librarian, george a. smathers libraries, university of florida. © 2021. abstract the university of florida (uf) george a.
smathers libraries have been involved in a wide range of partnered digital collection projects throughout the years with a focus on collaborating with institutions across the caribbean region. one of the countries in which we have a number of digitization projects is cuba. one of these partnerships is with the library of the temple beth shalom (gran sinagoga bet shalom) in havana, cuba. as part of this partnership, we have sent personnel to cuba to do onsite scanning and digitization of selected materials found within the institution. the digitized content from this project was brought back to uf and loaded into our university of florida digital collections (ufdc) system. because internet availability is limited and bandwidth is low in cuba, the synagogue could not reliably access the full-text digitized content residing on ufdc. the synagogue also did not have a local digital library system into which to load the newly digitized content. to respond to this need we focused on providing a minimalist technology solution that was highly portable to meet their desire to conduct full-text searches within their library on their digitized content. this article will explore the solution that was developed using a usb flash drive loaded with a portableapps version of zotero loaded with multilingual ocr’d documents. about the partnership the university of florida (uf), george a. smathers libraries have been involved in a wide range of partnered digital collection projects throughout the years with a focus on collaborating with institutions across the caribbean region. uf has been involved with the digital library of the caribbean (dloc), which began in 2006, and the university serves as its technical home.
the dloc brings together collections from countries around the caribbean in order to provide researchers with greater online access to these physically dispersed collections.1 this partnership reflects common interests of preservation, access, accessibility, discovery, and content management.2 one of the countries with which we have a number of digitization projects is cuba.3 the cuban judaica collection comprises materials held in the library of the temple beth shalom (gran sinagoga bet shalom) in havana, cuba. the synagogue library collection contains over 10,000 books. this collection originated with abraham marcus matterin, the founder of the cultural group la agrupacion cultural hebreo cubana, who first gathered and arranged the materials. in addition to matterin’s own works, the materials in the library include many rare yiddish publications from the early 20th century, as well as little-known works produced in cuba beginning in the 1930s. the temple beth shalom library as a whole provides a complete snapshot of cuban jewish intellectual, cultural, religious, and political life as it evolved and progressed during the 20th century.4 information technology and libraries december 2021 developing a minimalist multilingual full-text digital library solution | digby 2 among the rare publications included in the smathers libraries collections are habanera lebn, the main cuban jewish newspaper published between 1932 and 1960; israelia, a spanish-language newspaper which circulated as a monthly during the 1950s; as well as many other cuban jewish publications. this collection will also provide access to the synagogue library’s wealth of jewish publications from other parts of the caribbean and latin america. the synagogue library digitization project is a partnership between the george a.
smathers libraries, the isser and price library of judaica under the auspices of its neh challenge grant, la comunidad hebrea de cuba, and la biblioteca nacional de cuba josé martí. the digitization process as part of this partnership, graduate interns who were fluent in spanish travelled to cuba to do onsite scanning and digitization of selected materials found within the institution. the digitized content from this project was brought back to uf and loaded into our university of florida digital collections (ufdc) system. this digitization process involved taking images in a high-resolution tiff format and then creating the appropriate metadata to accompany these records for use once they were loaded into the digital library system. an additional step of this process was developed by fletcher durant, uf’s preservation librarian, who cut strips of colorful acid-free paper, which were placed in physical items to indicate that they were digitized. these paper flags were used to tell local synagogue users which items were digitized and available locally in the synagogue and more broadly on the internet. the digital files were then transported back to the digital support services group at the university of florida with the returning personnel on usb hard drives, which were scanned for viruses before the digitized files were extracted. as part of the ingest process to ufdc we created derivative access files, including jpg, jpeg 2000, and thumbnail images, from the high-resolution tiff files. additionally, these files were put through an optical character recognition (ocr) process to generate full-text searchable files. this ocr process used either adobe acrobat or abbyy finereader. the unique aspect of the ocr process was the need to ensure multilingual character recognition that could recognize and generate full-text files that may include spanish, hebrew, and english.
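one simple way to check that multilingual recognition actually produced text in more than one script is to count the letters in the ocr output by script. the sketch below is an illustration only and was not part of the project workflow: the `script_counts` helper is hypothetical, the sample string is invented, and the check relies solely on unicode character names from the python standard library.

```python
import unicodedata
from collections import Counter


def script_counts(text):
    """Count alphabetic characters by script, using Unicode character names."""
    counts = Counter()
    for ch in text:
        if not ch.isalpha():
            continue  # skip spaces, digits, punctuation
        name = unicodedata.name(ch, "")
        if name.startswith("HEBREW"):
            counts["hebrew"] += 1
        elif name.startswith("LATIN"):
            counts["latin"] += 1
        else:
            counts["other"] += 1
    return counts


# Invented sample: a Latin-script title followed by four Hebrew letters
# (written as escapes to avoid right-to-left rendering issues in editors).
sample = "habanera lebn \u05e9\u05dc\u05d5\u05dd"
counts = script_counts(sample)
print(counts)
```

a page that is known to contain hebrew or yiddish text but yields no hebrew-script letters would be a candidate for reprocessing with different ocr language settings. note that this distinguishes scripts, not languages: spanish and english both count as latin, and yiddish counts as hebrew.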
these scanned files, along with the derivatives and ocr files, were then loaded into the ufdc system, where they are publicly searchable and accessible to the wider internet audience around the world. working with a partner in cuba, however, presented additional challenges. partnership challenges providing the synagogue with access to their own content presented challenges that were not necessarily anticipated when the scanning project started. the synagogue did not have a local digital library system that they could use to load and access the newly digitized content. we did provide the synagogue with copies of all the digital files, but since this digitization effort focused on text-based printed materials, the end goal was access to the ocr files in a searchable format so that full use could be made of this content. this in itself is not normally an issue since the digitized content would be loaded on ufdc systems and could be accessed online. unfortunately, one challenge that presented itself when working with cultural heritage partners within cuba is that limited technology infrastructure and internet connectivity can create issues in supporting the physical and digital needs of project activities. with internet availability being intermittent and bandwidth speeds limited in cuba, creative ways to address the digitization work and subsequent file sharing needed to be developed.5 aside from the broader infrastructural challenges related to technology that arise when working with cuban partners, there are additional bureaucratic challenges in partnering on projects in cuba, which can shift with changes in policy by either the us or cuban governments.
these technological and political hurdles made our ability to offer ongoing remote support highly challenging. additionally, there were barriers to how we could offer remote support because our respective it technicians spoke either english or spanish, but not both. the need for translation between languages is one that can be overcome, but does slow down the responsiveness. also, translators may not be accustomed to translating technical jargon, which can further complicate providing support. given the challenges presented above we endeavored to provide the synagogue with a solution that would provide a multilingual full-text search of the materials that had been scanned and put through the ocr process. additional factors that influenced our planning, as discussed above, were the recognition that our cuban partner did not have reliable internet and access to the content hosted in the us and that this solution would be run locally at the synagogue so we would be able to provide minimal, if any, support in installing or supporting the system once it was deployed. minimalist computing solutions in the search for a solution, we were influenced by the concept of minimalist computing.6 borrowed from digital humanities, minimal computing “refer[s] to computing done under some set of significant constraints of hardware, software, education, network capacity, power, or other factors. minimal computing includes both the maintenance, refurbishing, and use of machines to do dh work out of necessity.”7 an important focus of minimal computing in the digital humanities, as noted by risam and edwards (2017), is how these practices can be used by those that find themselves with technological needs when they work outside larger macro structures of financial and technical support.8 so basically, minimal computing is a solution for those individuals or institutions that are not positioned at a larger scale to support projects financially and technologically. 
appropriate technology in addition to acknowledging the use of minimalist computing, additional related frameworks exist that we drew upon for our project. the most prominent of these is appropriate technology (at), a concept that comes from the field of economics.9 this concept was further adapted and used in the field of economic development and was seen through implementation pertaining to … small production units, appropriate technologies are small-scale but efficient, replicable in numerous units, readily operated, maintained and repaired, low-cost and accessible to low-income persons. in terms of the people who use or benefit from them, appropriate technologies seek to be compatible with local cultural and social environments.10 one of the main reasons for the use of appropriate technology is that advanced technologies were often inappropriate for the needs of the populations in countries that did not have the same level of technological infrastructure, support, and knowledge. this idea has multiple facets: in some cases it describes using the simplest level of technology needed to meet the intended purpose of the user. it can also refer to developing a system in a way that takes into account the social and environmental factors of a given use. open-source appropriate technology a further influence on the design of this project can be found in a more granular approach to appropriate technology that has become known as open source appropriate technology (osat). as defined by pearce (2012), osat refers to technologies that can be sustainably developed, while at the same time being developed using the concepts and principles of free and open source software.
additionally, pearce (2012) further states that, osat is made up of technologies that are easily and economically utilized from readily available resources by local communities to meet their needs and must meet the boundary conditions set by environmental, cultural, economic, and educational resource constraints of the local community.11 developing a solution as mentioned previously, the digitized synagogue content from this project was brought back to uf from cuba and loaded into our local digital collections system. this made the content accessible to anyone with internet access around the world, yet due to the fact that internet availability and low bandwidth are issues in cuba, the synagogue’s ability to access the full-text digitized content residing on ufdc could not be assured. additionally, the synagogue also did not have a local digital library system to load the newly digitized content into. to respond to their desire to conduct full-text searches on their digitized content within their library, we focused on providing a minimalist technology solution that was highly portable, user friendly, open source, and sustainable without any or minimal technical support. in scanning the library technological landscape, our first thought was to find a small digital library system that could be used to meet these needs. although there are a number of open source digital library systems and some of these can be configured to work in a non-internet–connected environment, the level of customization and ongoing technical support posed a problem, especially when we may not be able to provide support due to both issues in the telecommunications systems and possible language barriers. although we knew that they had a windows laptop, there were still technical uncertainties about the local computing environment within the synagogue. 
once we determined that a full digital library system was not going to be sustainable and deployable, we decided to look for alternative approaches. a solution that just involved providing the scanned materials on dvds was considered, but this also presented a problem, because the ocr’d pdfs would need to be opened individually to search the text within, and there was no logical way to provide citation information or organize files in a meaningful way. information technology and libraries december 2021 developing a minimalist multilingual full-text digital library solution | digby 5 zotero portable eventually, we looked at the various citation management systems, since many of these allow pdf files to be imported and allow for searching the full text of pdf files using the ocr’d text. we then focused on a solution that is open source; this is especially important given that we were providing this to an entity in a foreign country and we did not want to experience any licensing or update issues to the chosen software. the platform that was chosen was zotero, primarily because of the open source license of the software, but also because of existing technical knowledge and experience using it. with zotero chosen as our platform, we then investigated how this could be made portable in a way that was already installed, configured, and populated for the end user. i had existing knowledge of the portableapps platform (https://portableapps.com/), which is a fully open source and free platform that enables a broad range of windows applications to be installed on a portable device, in our case a large flash drive. once installed using the portableapps platform these applications can be used without any additional installation, just by plugging in the flash drive to another computer. in the case of the zotero application, there was already a deployment that was created to be used with portableapps, which made the installation and configuration less of a hurdle. see figure 1. 
figure 1. once plugged into the pc, the portableapps.com drive would present itself; double-clicking on the icon would load the menu for the user with only the zotero application available to choose.

once zotero was installed, we started to load the digitized files into it. this was a two-step process. first, because the content was already added to our digital library system, we imported citations for each item or volume into zotero, as one normally would when adding citations. then we went into each of these citations and added the corresponding digitized file(s) to the entry. some of these items consisted of multiple volumes that would be placed under an existing citation. in addition, some of the materials contained both spanish and hebrew text, so during the ocr process a separate file was created for each of these languages. at this point we were able to test the full-text search capabilities of zotero against our multilanguage ocr’d pdf files.

figure 2. this image shows a set of citations of the scanned materials. to perform a full-text search within these documents a user uses the everything search box in the zotero toolbar.

at this point in our testing, we had determined that our method could successfully perform a full-text search across all the loaded pdf documents (see figure 2). however, a limitation of zotero at this point was that the search would only identify which files contained the search terms and not the exact location of the terms within each file (see figure 3). although this was a limitation of our solution, we were able to provide a second step for searching within the pdf files for the exact location of the search terms.
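the two-step search described above can be illustrated with a small python sketch. it does not call zotero or acrobat reader; it only models the idea that a first pass reports which files contain a term and a second pass finds where in one file the term occurs. the file names and page texts are hypothetical stand-ins for the ocr'd pdf text.

```python
from typing import Dict, List, Tuple

# hypothetical stand-in for ocr'd text: {file name: [text of page 1, page 2, ...]}
Library = Dict[str, List[str]]

SAMPLE: Library = {
    "actas_vol1_es.pdf": ["acta de la sinagoga", "reunión de la comunidad"],
    "actas_vol1_he.pdf": ["בית הכנסת"],
    "boletin_1955.pdf": ["boletín mensual", "la sinagoga y la comunidad"],
}


def files_containing(library: Library, term: str) -> List[str]:
    """step 1: report only which files contain the term (a file-level hit list)."""
    needle = term.casefold()
    return sorted(
        name
        for name, pages in library.items()
        if any(needle in page.casefold() for page in pages)
    )


def locate_in_file(pages: List[str], term: str) -> List[Tuple[int, int]]:
    """step 2: return (page number, character offset) for every occurrence."""
    needle = term.casefold()
    hits: List[Tuple[int, int]] = []
    for page_no, page in enumerate(pages, start=1):
        text = page.casefold()
        start = text.find(needle)
        while start != -1:
            hits.append((page_no, start))
            start = text.find(needle, start + 1)
    return hits
```

for example, `files_containing(SAMPLE, "sinagoga")` narrows the collection to two files, and `locate_in_file(SAMPLE["boletin_1955.pdf"], "sinagoga")` then points to the second page of one of them.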
figure 3. the zotero search results, which highlighted the files in which the search text could be found.

figure 4. acrobat search conducted on a file to locate the exact location of the text in the digitized materials.

since we were aware that a full-text pdf file can be searched using adobe acrobat reader, we decided that for a more granular search we would instruct the users to click on the file that included the identified search term, which would load acrobat reader and open the file. the user could then search for the exact term using adobe acrobat reader’s search capabilities to find where in the document the term was situated (see figure 4 for an example). although this two-step process is not ideal, as a minimal technological solution that addresses all the concerns it would meet the overall goals of the project and provide a workable searching solution for our partner.

with the workflows, installation, and configuration of the flash drive complete, we next created documentation in both english and spanish to guide the user through the search process. we then provided the flash drive with accompanying documentation to the next staff member who was travelling to cuba. because of federal rules implemented shortly after we transported the flash drive to cuba, our ability to travel to cuba to work with our partners was limited. this has substantially reduced the information flow between our institution and our cuban partners and has limited how much we can know about how actively this resource is being used.
we hope that in the near future we will once again be able to travel to cuba and re-engage with our partners to determine the success of this and other projects we have been working on with them.

conclusion

in the realm of library technology, we often implement and support complex and highly costly systems as part of our regular oversight. by working on this project, we have been given a chance to take a step back and design a solution that uses open source and free tools that are readily available and require little support. looking broadly across the technology platforms, systems, and software in use, we tend to find that they include a plethora of features and functions that are rarely used but add complexity. focusing on solutions that reduce this complexity and still meet user needs has been a rewarding experience.

endnotes

1 brooke wooldridge, laurie taylor, and mark sullivan, “managing an open access, multi-institutional, international digital library: the digital library of the caribbean,” resource sharing & information networks 20, no. 1–2 (2009): 35–44, https://doi.org/10.1080/07377790903014534.
2 miguel asencio, “collaborating for success: the digital library of the caribbean,” journal of library administration 57, no. 7 (2017): 818–25, https://doi.org/10.1080/01930826.2017.1362902.
3 “celebrating cuba! collaborative digital collections of cuban patrimony,” university of florida digital collections, accessed february 15, 2021, https://ufdc.ufl.edu/cuba.
4 “cuban judaica,” university of florida digital collections, accessed february 15, 2021, https://ufdc.ufl.edu/cuban_judaica.
5 xuefei deng, nancy armando camacho, and larry press, “how do cubans use internet? the effects of capital,” in proceedings of the 52nd hawaii international conference on system sciences (2019), https://doi.org/10.24251/hicss.2019.617.
6 jentery sayers, “minimal definitions,” minimal computing: a working group of go::dh, october 2, 2016, https://go-dh.github.io/mincomp/thoughts/2016/10/02/minimal-definitions.
7 “about: what is minimal computing?” minimal computing: a working group of go::dh, https://go-dh.github.io/mincomp/about/.
8 roopika risam and susan edwards, “micro dh: digital humanities at the small scale,” digital humanities 2017, http://works.bepress.com/roopika-risam/27/.
9 ernest f. schumacher, small is beautiful: economics as if people mattered (london: blond and briggs, 1973).
10 peter thormann, “proposal for a program in appropriate technology,” in appropriate technologies for third world development (new york: st. martin's press, 1979): 280–99.
11 j. m. pearce, “the case for open source appropriate technology,” environment, development and sustainability 14 (2012): 425–31, https://doi.org/10.1007/s10668-012-9337-9.

aacr2: oclc's implementation and database conversion

georgia l. brown: oclc, inc., dublin, ohio.

oclc's online union catalog (oluc) contains bibliographic records created under various cataloging guidelines. until december 1980, no system-wide attempt had been made to resolve record conflicts caused by use of the different guidelines. the introduction of the new guidelines, the anglo-american cataloguing rules, second edition (aacr2), exacerbated these record conflicts. to reduce library costs, which might increase dramatically as users attempted to resolve those conflicts, oclc converted name headings and uniform titles in its database to aacr2 form. the purpose of the conversion was to resolve record conflicts that resulted from rule changes and to conform to lc preferred forms of heading if possible.
background

in may 1978, upon receiving an advance copy of the anglo-american cataloguing rules, second edition (aacr2), oclc formed an internal task force of librarians who were professional catalogers to study the new rules. the aacr2 task force was charged with identifying differences between aacr2 and aacr1 as applied by the library of congress. the task force compared the two sets of rules on a rule-by-rule basis to determine: (1) effects of rule changes on the marc record formats, (2) who benefited from the changes, and (3) relative costs of the changes on both a one-time and a continuing basis. each change was assigned a number from 0 to 5 to represent the cost to libraries (0 being no cost and 5 being maximum cost). the task force identified a total of 454 significant rule changes or new rules. the task force categorized each rule's effect, and in its judgment, 56 percent of the changes would benefit neither the librarian nor the patron, 23 percent would benefit librarians, and 21 percent would benefit patrons. the estimates of the percentage of changes along the cost spectrum are illustrated in table 1.

manuscript received april 1981; accepted april 1981.

162 journal of library automation vol. 14/3 september 1981

table 1. estimates of aacr2 changes in terms of costs

cost range                           0    1    2    3    4    5
percentage of changes, one-time     18   54   13    9    4    2
percentage of changes, continuing   20   56   20    0    2    2

identification of conversion requirements

originally, the findings of the task force were to be used to adjust the oclc online system and card production programs to accommodate aacr2 changes. however, in light of estimated costs to individual libraries to convert existing headings and uniform titles to aacr2 form, the task force studied the requirements for an oclc machine conversion. the machine conversion required that information within the record be consistently identifiable. the task force used work sheets to record and keep track of its findings.
the first column of each row on the work sheet represented one rule. the row was completed with the rule number, the aacr2 form with tagging, the pre-aacr2 form with tagging, instructions, and comments. figure 1 illustrates a work sheet. an analysis of the work sheets indicated that one method to convert to aacr2 form was to develop an oclc authority control system based on the lc name authority file. due to time constraints and the complexity of developing such a system, however, oclc decided on a second method: to convert online union catalog headings and uniform titles using the lc name authority file and some additional data manipulation techniques that would detect changes not done by the authority processing.

fig. 1. task force conversion worksheet

rule 22.5d1
  aacr2 form with tagging:      100 10 zerotina, karel z
  pre-aacr2 form with tagging:  100 10 z zerotina, karel
  instructions:                 within ‡a, z could be searched, deleted, and added at end of field
  comments:                     for czech and slovak names only

rule 25.4b
  aacr2 form with tagging:      1xx ‡a ...
                                240 ‡a theaetetus
                                240 ‡a theaetetus
  pre-aacr2 form with tagging:  1xx ‡a ...
                                240 ‡a theaitetos
  instructions:                 set up table of uniform titles where greek forms change to latin forms. change 240 ‡a greek form to latin form
  comments:                     this would require reading of ‡a ... or checking against table

rules 21.26 and 22.14
  aacr2 form with tagging:      100 10 parker, theodore ‡c (spirit)
                                700 10 ramsdell, sarah a.
  pre-aacr2 form with tagging:  100 10 ramsdell, sarah a.
                                700 10 parker, theodore
  instructions/comments:        no way to automatically recognize those records requiring change

rule 25.9
  aacr2 form with tagging:      240 ‡a selections
  pre-aacr2 form with tagging:  240 ‡a selected works
  instructions:                 if text of 240 ‡a is "selected works" change to "selections"
  comments:                     this will require reading text of ‡a

preconversion testing

using the work sheets, the task force assigned the rule changes to pattern sets. pattern sets were defined as combinations of character strings, punctuation, subfield coding, and other characteristics that indicate that the heading could be algorithmically changed to conform to the new rules.
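as an illustration of what a pattern set might look like in code, the following python sketch encodes two changes mentioned in this paper: the rule 25.9 change of "selected works" to "selections" in the 240 field, and the expansion of the "u.s." abbreviation. the tuple-based field representation, the list of tags for the abbreviation pattern, and all function names are assumptions made for the example; they are not oclc's actual program design.

```python
import re
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class PatternSet:
    label: str              # rule number or a descriptive label
    tags: Tuple[str, ...]   # marc tags the pattern applies to (assumed here)
    regex: re.Pattern       # character string to look for
    replacement: str        # text that replaces a match


PATTERN_SETS: List[PatternSet] = [
    PatternSet("25.9", ("240",), re.compile(r"^selected works$", re.IGNORECASE), "selections"),
    PatternSet("abbreviation", ("110", "710"), re.compile(r"\bu\.s\.", re.IGNORECASE), "united states"),
]


def convert_heading(tag: str, text: str) -> Tuple[str, Optional[str]]:
    """return (possibly converted text, label of the pattern applied or None)."""
    for ps in PATTERN_SETS:
        if tag in ps.tags and ps.regex.search(text):
            return ps.regex.sub(ps.replacement, text), ps.label
    return text, None


def count_matches(sample: Iterable[Tuple[str, str]]) -> Counter:
    """count how often each pattern occurs in a sample of (tag, text) fields."""
    counts: Counter = Counter()
    for tag, text in sample:
        for ps in PATTERN_SETS:
            if tag in ps.tags and ps.regex.search(text):
                counts[ps.label] += 1
    return counts
```

`count_matches` mirrors the idea of sampling before committing to a conversion: a pattern that never occurs, or that matches text it should not change, can be dropped from the list before the full run.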
these changes were further divided into those that could be converted by machine and those that could not be converted by machine. approximately 100 pattern sets were initially identified. before making a commitment to convert all 100 of these pattern sets, tests were run to determine the approximate number of bibliographic records that would be changed. a test file obtained by selecting records at random from the online union catalog as of september 2, 1978, already existed at oclc. the test file represented a 1 percent sample of the database on that date, or 41,212 records. programs run on the test file identified the patterns within the bibliographic records and counted the number of times each pattern occurred in the test file. table 2 illustrates selected results of pattern set sampling.

table 2. selected results of pattern sampling

rule number   number matched   comments
21.39a              32         ‡a ... ‡k liturgy and ritual
21.39c               7         ‡a jews ‡k liturgy and ritual
24.1b               71         state university
21.33               28         constitution
                     3         charter
                     1         covenant
21.35               27         treaties
25.15              206         laws, etc.
25.6b1               0         books, parts, numbers
25.9                19         selected works
24.27b2              0         pope

patterns not found in the test file were later eliminated from those to be applied against the entire online union catalog. "u.s." was found in qualifying fields 754 times, and "covenant" was found only once. "university of" was found 486 times on the test sample; however, it could be incorrectly converted frequently enough to eliminate it from the list of pattern matching to be done. tests also indicated that some changes that appeared straightforward, when applied, introduced further errors that would have to be resolved after the conversion. of the 41,212 records, 100 records were manually checked for system changes that would need to be made for the existing bibliographic records to conform to aacr2.
general findings included:

change                                     number of records
none                                              33
more than one                                     21
minor personal name change                        19
personal name modification                        13
single change other than personal name            14

specific changes that would be needed are shown in table 3. as noted in the table, personal name changes account for more than two-thirds of all required conversion changes.

table 3. modifications needed for aacr2 conversion (based on a sample of 100 records)

modification                                    total occurrences     percent of
                                                per 100 records       modification
personal name                                         57                  69
parenthesize geographic location                       8                  10
u.s.-united states, gt. brit.-great britain            3                   4
uniform title modification                             3                   4
drop geographic location from corporate name           5                   6
‡k dropped                                             2                   2
university heading                                     2                   2
conference date and place inverted                     2                   2
u.s. congress                                          1                   1
total                                                 83                 100

as a final note, name headings to be converted by authority processing could not be estimated by sampling, since the lc name authority file was not available online when the tests were run. early estimates, based on the tests and anticipated name authority matches, called for conversion of 8 percent of the online union catalog, or 560,000 records, to aacr2. however, samplings done by the library of congress indicated that 17 percent of all marc records contained one or more headings that needed to be converted. oclc assumed that this statistic would also apply to its database. the task force's study, in general, showed that oclc could convert by machine a large portion of its bibliographic records to conform to aacr2.

design methodology

oclc formally initiated the aacr2 project to: (1) accommodate the use of aacr2 format in the online system, and (2) convert existing bibliographic records to aacr2. accommodating aacr2 formats required validating additional content designators, modifying card printing to allow for the new content designators, and training users. also, the seven bibliographic format documents (books, serials, audiovisual media, scores, sound recordings, maps, and manuscripts) were rewritten to include the new content designators and aacr2 input conventions and examples. the remainder of this paper will deal with the conversion of existing bibliographic records in the online union catalog, oclc's bibliographic database. the purpose of the conversion was to resolve record conflicts that resulted from rule changes affecting name headings and uniform titles.

functional specifications

two sets of functional specifications were written based on the preproject studies by the aacr2 task force. set 1 functional specifications addressed the conversion of bibliographic records to aacr2 by matching the records in the lc name authority file and then incorporating data into the bibliographic records. set 2 functional specifications addressed the machine manipulation of character strings that formed a given pattern.

set 1 functional specifications

three constraints were placed on the conversion described in set 1 functional specifications. first, the pre-aacr2 form of a converted field must be retained. second, the bibliographic record must be retrievable by both pre-aacr2 and aacr2 forms. third, the field that was changed must be identified to users, and the record must indicate that it had been modified by machine conversion. set 1 functional specifications listed the fields in the bibliographic and authority records that should be considered in the conversion, grouping bibliographic fields that should be matched with given authority fields. for each field, characters were eliminated that might inadvertently cause a no-match result. subfield codes and delimiters, multiple blanks, and diacritics were eliminated from the character string used for matching. all alphabetic characters were converted to uppercase letters and certain subfields were eliminated from the matching strings. this process was applied to both bibliographic and authority records.
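the preparation of matching strings described above can be sketched in python as follows. the representation of a field as a list of (subfield code, value) pairs is an assumption for the example, and so is the choice of which subfields to drop, since the paper does not list them.

```python
import unicodedata
from typing import FrozenSet, Iterable, Tuple

# assumption: illustrative only; the paper does not say which subfields were dropped
DROPPED_SUBFIELDS: FrozenSet[str] = frozenset({"w"})


def matching_string(
    subfields: Iterable[Tuple[str, str]],
    dropped: FrozenSet[str] = DROPPED_SUBFIELDS,
) -> str:
    """build the normalized string used to compare two headings."""
    # keep only the values, which discards subfield codes and delimiters
    text = " ".join(value for code, value in subfields if code not in dropped)
    # remove diacritics: decompose characters, then drop the combining marks
    decomposed = unicodedata.normalize("NFD", text)
    text = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # convert to uppercase and collapse runs of blanks
    return " ".join(text.upper().split())


def headings_match(bib_field, auth_field) -> bool:
    """two headings match only if their normalized strings are identical."""
    return matching_string(bib_field) == matching_string(auth_field)
```

because both the bibliographic and the authority heading pass through the same function, differences in spacing, case, or diacritics alone do not cause a no-match result.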
the resultant matching strings, for a bibliographic and an authority field, were compared on a character-by-character basis. if any character was different, there was no match. matches were treated differently depending on the contents of the name authority field. four cases for matching were defined:

case 1. bibliographic field matches aacr2 authority field. in case 1, the only change needed was to indicate in subfield w of the bibliographic field that it conformed with aacr2.

case 2. bibliographic field matches non-aacr2 authority field; aacr2 form present in authority record. case 2 called for the following changes: (1) replacing the bibliographic field with the aacr2 form from the authority record; (2) moving the replaced bibliographic data to another field (an 87x field); and (3) indicating in the converted bibliographic field that conversion had been done.

case 3. bibliographic field matches non-aacr2 authority field; aacr2 form not present in authority record. in case 3, the authority record contained the form preferred by lc, but not the aacr2 form. if the bibliographic field matched a "see from" reference (4xx authority field), case 3 called for the following changes: (1) replacing the bibliographic field with the authoritative field (1xx authority field); and (2) moving the replaced bibliographic data to another field (an 87x field). no indication was added that the field was machine-converted, since the form supplied was not aacr2.

case 4. bibliographic field tagged as personal name matches authority field tagged as corporate name. in case 4, the bibliographic tag was corrected to a corporate-name tag. case 4 was used to clean up the database and to allow more fields to be converted.

set 2 functional specifications

for set 2 functional specifications, the pre-aacr2 form of the entry also must be retained and the record retrievable by both pre-aacr2 and aacr2 forms.
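the four set 1 matching cases can be summarized as a small decision function. the python sketch below is one reading of those cases, not the original program: the data classes, the field names, and the order in which the cases are tested (case 4 first, and treated as exclusive of the others) are assumptions, as is the mapping of a personal-name tag x00 to the corporate-name tag x10.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Match:
    bib_tag: str                # tag of the bibliographic field, e.g. "100"
    bib_text: str               # heading as it stands in the bibliographic record
    auth_tag: str               # tag of the authority field that matched, e.g. "100", "110", "400"
    matched_is_aacr2: bool      # the matched authority field is already in aacr2 form
    aacr2_form: Optional[str]   # aacr2 form carried in the authority record, if any
    established_form: str       # the 1xx heading of the authority record


@dataclass
class Outcome:
    case: Optional[int]                  # 1-4, or None when no case applies
    tag: str
    text: str
    old_form_for_87x: Optional[str] = None
    marked_aacr2: bool = False           # case 1: subfield w notes aacr2 conformity
    marked_machine_converted: bool = False


def apply_case(m: Match) -> Outcome:
    # case 4 (tested first here, an assumption): personal tag matched a corporate heading
    if m.bib_tag.endswith("00") and m.auth_tag.endswith("10"):
        return Outcome(4, m.bib_tag[0] + "10", m.bib_text)
    # case 1: the heading already agrees with an aacr2 authority field
    if m.matched_is_aacr2:
        return Outcome(1, m.bib_tag, m.bib_text, marked_aacr2=True)
    # case 2: replace with the aacr2 form, keep the old form, flag the conversion
    if m.aacr2_form is not None:
        return Outcome(2, m.bib_tag, m.aacr2_form,
                       old_form_for_87x=m.bib_text, marked_machine_converted=True)
    # case 3: matched a "see from" reference; use the 1xx form, no conversion flag
    if m.auth_tag.startswith("4"):
        return Outcome(3, m.bib_tag, m.established_form, old_form_for_87x=m.bib_text)
    return Outcome(None, m.bib_tag, m.bib_text)
```

in cases 2 and 3 the replaced text is carried in `old_form_for_87x`, which reflects the constraint that the pre-aacr2 form of a converted field must be retained.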
these functional specifications called for conversion of six pattern sets. each pattern set might apply to multiple fields and, within the fields, to multiple character strings. some of the pattern sets were further subdivided into various conditions. for example, pattern set 2 specified the conversion of form subheadings. this pattern set looked only at one field, the 110 field, but held two conditions. in the first condition, any one of ten character strings might be matched. in the second condition, either of two character strings qualified for matching. pattern set 2 was actually one of the easier sets to work with since it involved minimum data manipulation and testing. the most complicated pattern set concerned music uniform titles where only two fields were involved but six possible conditions had to be considered. one of these conditions required conversion of forty-two character strings, provided other information was present.

development plan

after reviewing the two sets of functional specifications, a development plan was established. this plan outlined the steps involved in software development for the project, named an individual responsible for each step, estimated the duration of each step, identified the objectives of software development, and identified potential time conflicts for staff and machine resources. the time estimates were constantly monitored and revised during the project cycle to ensure that the work would be completed on time.

development method

based on a thorough analysis of the functional specifications, the following basic design was chosen:

1. read a bibliographic record.
2. identify a field in the bibliographic record for potential conversion.
3. derive a key from that field. the key derivation used would be the same as that used for the online system, except that it would be extended to include fields not normally indexed but that needed to be converted to aacr2. derived search keys are formulated by extracting a certain number of characters from the words in a name. for personal names, a 4,3,1 key is used; i.e., the first four characters from the surname, the first three characters from the forename, and the middle initial.
4. perform a keyed search of the lc name authority index files.
5. for each hit on an index record, read the corresponding name authority record and check for a match of the authority and bibliographic fields. when a match is found, merge the bibliographic and authority data.
6. repeat steps 2 through 5 for every field in the bibliographic record that qualifies for conversion.
7. scan the bibliographic record for fields that might be converted using the machine-manipulation pattern matching and compare these fields with the various patterns. should a match occur, manipulate the string accordingly.
8. if a record has been converted, add the 040 field if it is not already present in the record; or, edit the 040 field to include a subfield d indicating that oclc has modified the record.
9. repeat the entire process for every record in the online union catalog.

design method for conversion

the method presented a complex design. because it required indexing fields not normally indexed by the oclc system, the search keys would have to be specified. also, the 130, 430, 530 uniform and variant title fields in the name authority file would have to be indexed and the keys defined. this could be done by adding the search keys to the existing name index file, which contains indexes to the lc name authority file, or by creating a separate file. adding to the existing name index file would result in inconsistent data within the file, mixed names and titles, and, more important, interference with the online system. using a separate file would mean more maintenance, necessitate slightly more machine space, and require two searches to cover all search possibilities for derived name authority search keys.
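the 4,3,1 derived key for personal names, and the keyed lookup that follows it in the basic design, can be sketched in python. the comma separator between key segments, the lowercasing, and the dictionary used as an index are assumptions for illustration; the sample headings are hypothetical apart from the ramsdell name, which appears in the task force worksheet.

```python
from typing import Dict, List, Tuple

Name = Tuple[str, str, str]  # (surname, forename, middle name or initial)


def derive_personal_name_key(surname: str, forename: str = "", middle: str = "") -> str:
    """build a 4,3,1 key: 4 characters of surname, 3 of forename, 1 middle initial."""

    def head(word: str, length: int) -> str:
        letters = [ch for ch in word.lower() if ch.isalnum()]
        return "".join(letters[:length])

    return ",".join([head(surname, 4), head(forename, 3), head(middle, 1)])


def build_index(headings: List[Name]) -> Dict[str, List[Name]]:
    """group authority headings under their derived key (an in-memory stand-in)."""
    index: Dict[str, List[Name]] = {}
    for name in headings:
        index.setdefault(derive_personal_name_key(*name), []).append(name)
    return index


AUTHORITY_SAMPLE: List[Name] = [
    ("Ramsdell", "Sarah", "A."),
    ("Ramsey", "Sara", "A."),      # hypothetical heading that shares the same key
    ("Parker", "Theodore", ""),
]
INDEX = build_index(AUTHORITY_SAMPLE)
```

a derived key is deliberately lossy, so one key can retrieve several authority records. that is why the design reads each hit and checks for a full match of the authority and bibliographic fields before merging any data.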
(it should be noted that currently online system users cannot search the name authority file using a derived title search key.)

software development

project software design defined activity along the two lines of conversion, corresponding to the functional specifications: conversion of name headings by matching bibliographic headings with headings in the lc name authority file, and conversion of name and uniform title headings through machine manipulation of existing bibliographic data. conversion by matching name authority headings was broken down into subactivities as specified by cases 1 through 4 in the functional specifications. conversion by machine manipulation was subdivided into:

1. conversion of conference name headings.
2. conversion of uniform titles-music.
3. conversion of uniform titles-general.
4. conversion of form subheadings.
5. conversion of general material designators.
6. conversion of "united states" and "great britain" abbreviations.

the entire conversion was designed to be directed by a series of run-time parameters that specified which subactivities were to be performed, whether the conversion was to be run concurrently with the online system, the names of files to be used (including audit and checkpoint files), and the range of oclc numbers to be processed. the run-time parameters allowed multiple processes (programs) to be run simultaneously, with each process running against a different part of the online union catalog. the design also included use of an audit trail, where a record is written to a file every time a change is made to a bibliographic field. the trail consisted of the oclc number and the type of subactivity applied to the field. conversion restarts were specified to be automatically controlled through a checkpoint file.
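a minimal python sketch of checkpoint-controlled restarts with an audit trail is shown below. the json checkpoint format, the tab-separated audit file, and every function and parameter name are assumptions made for the example; the original software and its file layouts are not described at this level of detail.

```python
import json
import os
import time
from typing import Callable, Iterable, Optional, Tuple


def load_checkpoint(path: str) -> dict:
    """return the saved state, or a fresh state when no checkpoint exists."""
    if os.path.exists(path):
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)
    return {"last_oclc_number": 0, "records_processed": 0, "started_at": time.time()}


def save_checkpoint(path: str, state: dict) -> None:
    """write the state to a temporary file, then swap it into place."""
    state["updated_at"] = time.time()
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as handle:
        json.dump(state, handle)
    os.replace(tmp_path, path)


def run_conversion(
    records: Iterable[Tuple[int, str]],        # (oclc number, record), in ascending order
    checkpoint_path: str,
    audit_path: str,
    convert: Callable[[str], Optional[str]],   # returns the subactivity applied, or None
    stop_after: Optional[int] = None,          # simulate an interruption for testing
) -> dict:
    state = load_checkpoint(checkpoint_path)
    with open(audit_path, "a", encoding="utf-8") as audit:
        for oclc_number, record in records:
            if oclc_number <= state["last_oclc_number"]:
                continue  # already handled before the restart
            subactivity = convert(record)
            if subactivity is not None:
                audit.write(f"{oclc_number}\t{subactivity}\n")
            state["last_oclc_number"] = oclc_number
            state["records_processed"] += 1
            save_checkpoint(checkpoint_path, state)
            if stop_after is not None and state["records_processed"] >= stop_after:
                break
    return state
```

rerunning `run_conversion` with the same checkpoint file resumes after the last oclc number recorded, so no record is processed twice and the audit file gains no duplicate entries.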
checkpoint records in this file contained the latest oclc number processed, total number of records processed, total number of records, and time stamps to calculate elapsed time. to effect a restart, the conversion was simply rerun and the checkpoint file handled file positioning to ensure against duplicate reprocessing of records. an overriding development priority was to design the software to be flexible enough to handle both the conversion of the online union catalog and the conversion of incoming marc tapes. in this way, pre-aacr2 headings would be converted (if they met the specifications) before being loaded into the database.

growth requirements

at the same time that the coding began, the project staff studied the design to determine its effects on the existing system. additional disk space was projected based on the estimate of bibliographic records to be converted. based on past research of field lengths, project staff estimated that 66.42 bytes (characters) would be added to each converted record. based on earlier samplings by the library of congress, it was assumed that 17 percent of the database would be converted (a figure that turned out low). therefore, 79.04 additional megabytes would be used. because an additional 13 percent of this would be needed for file management, the total requirement for the expansion of the bibliographic file was projected as 89.3 megabytes. the bibliographic index files would also expand with the conversion. not only would the old index keys be retained but new keys would be added. it was estimated that 4 percent of the bibliographic records would generate new keys (duplicate keys are not added to the files), for an additional requirement of ten megabytes. it was also calculated that six megabytes would be required for the new name authority index file.
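the disk-space projection can be checked with a few lines of python arithmetic. the database size of 7,000,000 records is not stated in this passage; it is an assumption inferred from the earlier statement that 8 percent of the online union catalog was 560,000 records. megabytes are taken as millions of bytes.

```python
DATABASE_RECORDS = 7_000_000          # assumption: 560,000 records / 0.08
CONVERTED_FRACTION = 0.17             # library of congress sampling
BYTES_PER_CONVERTED_RECORD = 66.42    # estimated growth per converted record
FILE_MANAGEMENT_OVERHEAD = 0.13       # extra space for file management
INDEX_GROWTH_MB = 10                  # new bibliographic index keys
AUTHORITY_INDEX_MB = 6                # new name authority index file

converted_records = DATABASE_RECORDS * CONVERTED_FRACTION
added_data_mb = converted_records * BYTES_PER_CONVERTED_RECORD / 1_000_000
bibliographic_file_mb = added_data_mb * (1 + FILE_MANAGEMENT_OVERHEAD)
total_mb = bibliographic_file_mb + INDEX_GROWTH_MB + AUTHORITY_INDEX_MB

print(f"added bibliographic data: {added_data_mb:.2f} mb")
print(f"with file management:     {bibliographic_file_mb:.1f} mb")
print(f"total additional space:   {total_mb:.1f} mb")
```

under that assumption the figures reproduce the 79.04 and 89.3 megabytes given in the text, which supports the inferred database size of about seven million records.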
the total additional space required for the expansion of the bibliographic file, the expansion of the bibliographic index file, and the addition of the name authority index file was thus 105.3 megabytes. this space would have to be available before the conversion could be run. testing as coding progressed into the testing phase, it became obvious to the project staff that existing testing methods were not well suited to testing the conversion software. therefore, a utility program was developed to enter bibliographic records in a test file using techniques similar to those used by the online system. these test bibliographic records included both good and bad data, and records that should and should not be converted. an attempt was made to cover as many situations as practicable. for example, a given record might have multiple fields that would convert and, within a given field, multiple conversions might apply. images of the converted test records were manually compared with the original entry. at the same time, the accuracy of the audit trail was verified. the conversion process was tested using a utility debugger to simulate error conditions that did not occur as a result of other tests. changes to the online system code were tested using a simulator. all testing and development work was done on a development machine. calibration tests were made on the data base processor (dbp), the database management portion of the online system. the calibrations were taken in a stand-alone environment to calculate the length of time needed to run the conversion and to test the conversion software on a larger database than the test files. at the time of the calibration tests, the lc name authority file held about 250,000 records; it was not distributed across the various disk packs but rather restricted to a fairly small number of packs. 
between the time of the calibration and the conversion run, the lc name authority file grew to 450,000 records and was distributed evenly across the disk packs on the dbp. according to the calibration tests, the conversion to aacr2 was expected to take ninety-two hours, with sixteen copies of the software processing different ranges of the bibliographic file. the calibration tests also estimated that 28 percent of the bibliographic records would be converted, much higher than originally estimated.

after the calibration tests, the software underwent quality assurance tests. the conversion software was run against test files on the dbp to verify the conversion process and to provide the data for the next phase of quality assurance, the regression test. during regression testing, each subsystem in the online system, with new software changes included, was tested by oclc staff. additional tests were made of normal work flows to ensure that all functions that previously worked still functioned correctly and all functions that should not work still did not work. no problems were uncovered during these tests and no software changes were made.

conversion of the oclc online union catalog

the conversion was designed to run either in a stand-alone mode or concurrently with the online system. the major drawback to running in a stand-alone mode was that the online system would be unavailable to users for some period of time. however, this was not deemed as great a problem as running the conversion while the online system was operational. with the online system operational, the conversion would have to "lock" the bibliographic record as it is processed, thus potentially affecting system performance. for example, if a user wanted to retrieve a record that was locked, he or she would have to wait until the record was unlocked.
since the aacr2 conversion process locks the bibliographic record when it reads it and keeps it locked until the conversion for that record is complete, the record could stay locked for several seconds.

before the conversion could be run, various files had to be created on the dbp. the bibliographic file on the dbp is partitioned across twenty-nine disk packs, each pack holding 250,000 records within a range of oclc control numbers. the start-up commands and parameters were put into one file for execution. one audit file was created for each process to be run. the conversion began running with sixteen processes. ten of the processes were run against two disk packs, with four processes running against a single disk pack. at the time of conversion, the dbp contained fourteen cpus; twelve of the processes ran alone in a cpu, and two processes ran in each of two cpus.

as soon as the conversion began, on december 13, 1980, at 4:00 a.m., another calibration test was done to estimate completion time. the results showed that the file redistribution that was expected to lower the time estimates significantly had not produced the expected result. attempts were made to explain the discrepancies, but it was concluded that the processes simply were slow. the i/o rate and cpu utilization rate were high. based on these calibration test findings, it was decided to start up additional processes so that one process would be run on a single disk pack, with two processes per cpu. the original sixteen processes had to be stopped, the range of oclc numbers processed redistributed, and additional audit files created. twenty-eight processes were then started up. all records in the twenty-ninth disk pack, records with control numbers greater than seven million, were to be handled by the twenty-eighth process.

the conversion ran smoothly until some of the processes encountered a problem they could not handle. the conversion was then stopped.
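The redistribution of control-number ranges across twenty-eight processes can be sketched as follows. This is a hypothetical reconstruction, not the project's software: it assumes each pack holds a contiguous block of 250,000 control numbers starting at 1, which agrees with the statement that the twenty-ninth pack holds numbers above seven million. The function names are invented for illustration.

```python
# Hypothetical sketch of the one-process-per-pack assignment described above.
# ASSUMPTION: pack k holds control numbers (k-1)*250,000 + 1 through k*250,000.
PACK_SIZE = 250_000
NUM_PACKS = 29


def pack_range(pack):
    """Return the (first, last) control number held on a 1-based pack number."""
    first = (pack - 1) * PACK_SIZE + 1
    return first, pack * PACK_SIZE


def assign_ranges(num_processes=28, num_packs=NUM_PACKS):
    """Give each process one pack; the last process also takes any leftover packs."""
    if not 1 <= num_processes <= num_packs:
        raise ValueError("need between 1 and num_packs processes")
    ranges = []
    for process in range(1, num_processes + 1):
        first, last = pack_range(process)
        if process == num_processes:
            # The final process absorbs the remaining packs (here, pack 29).
            last = pack_range(num_packs)[1]
        ranges.append((first, last))
    return ranges


ranges = assign_ranges()
print(len(ranges), ranges[0], ranges[-1])
```

Under these assumptions the twenty-eighth process starts at control number 6,750,001 and covers twice the range of the others, so it would be a natural candidate for the later load balancing the article describes.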
because the problem was not immediately obvious, the records being processed at the time of the error were skipped and the conversion restarted using the checkpoint file. the problem was later identified (if the converted field held more than 255 characters, the length of the field was incorrectly calculated) and the software was corrected. the audit files were saved up to the point of the correction to identify the problem records. using these audit files to find records that had been converted, a preconversion copy of the bibliographic file was scanned for records that would need correction. fifty-six records were identified and sent to the bibliographic maintenance section, user services division, of oclc for manual correction.

from this point on, the conversion ran smoothly but slowly, processing an average of 28,500 records per hour. the checkpoint files were read every two hours to monitor the speed of the conversion. because this monitoring in itself proved to be quite cumbersome, a program was written to format the checkpoint data for easier readability. the resultant reports showed a breakdown by process of how much of the conversion had been done, the rate at which it had been done, and how much remained. by using these reports, as a process would finish, another slower process could be divided and started up to balance the load and finish faster. periodically, converted records were written on hard-copy printers for oclc staff to use to check the accuracy of the conversion.

the checkpoint reports showed that 39 percent of the records in the online union catalog were being converted to aacr2. this percentage was much higher than anticipated by the calibration tests, and consequently the disk space needed for expansion was more than anticipated. files not used by the conversion were deleted and index files were moved to other disk packs to allow the bibliographic files to expand. the last record was converted and all processes stopped by 10:45 a.m.
on december 21, after 246 hours of work. the bibliographic file and its indexes were reorganized, slack space squeezed out, and all files that had been deleted were put back. the online system was made available to users at 7:00 a.m., december 23, 1980. a total of 3,704,440 changes had been made on more than 2,767,000 records. table 4 lists the number of fields converted for each activity.

table 4. fields converted for each activity

| activity | number of fields converted |
|---|---|
| mistagged corporate name fields | 1,268 |
| direct aacr2 match | 2,685,211 |
| match where aacr2 form is elsewhere in the authority record | 614,333 |
| match on lc preferred form | 23,611 |
| conversion of conference name headings | 96,382 |
| conversion of uniform titles-music | 68,905 |
| conversion of uniform titles-general | 2,263 |
| conversion of form headings | 31,278 |
| conversion of general material designators | 49,978 |
| conversion of "united states" and "great britain" abbreviations | 131,211 |

summary

some records could not be converted because: (1) the data within the field were incorrect or inadequate, or (2) the record would have exceeded field number and record length limits. oclc has made a continuing effort since the conversion to correct problems. the most difficult and numerous problems involved the lc name authority file. in some cases the data within the authority records are incorrect, while in other instances multiple authority records exist. the conversion used the first matching authority record it encountered. the most desirable record, as it turned out, was sometimes not the first encountered.

a series of eight fixes was programmatically applied to the oluc to correct problems, using either the audit file or database scans to select the record to be corrected. fixes 1 and 2 were similar in that each was the result of a bad authority record and the original form was restored.
headings converted to "voice of america (radio program)" were changed back to "united states. dept. of state" by fix 1. "united states bureau of the census. census of construction industries (1972)" was changed back to "united states. bureau of the census" by fix 2.

fixes 3 through 7 were needed to correct programming problems, omissions in the functional specifications, and changes in lc procedures. subfields x, y, and z were deleted from 600 fields by the conversion. fix 3 moved the subfields back into the 600 fields. fix 4 reordered the e and q subfields in personal name headings that had been moved into the field in the wrong order by the conversion. the conversion had supplied a subfield g between the word "manuscript" and the following text in 110 fields. fix 5 changed subfield coding g to n when lc began using the n. fixes 6 and 7 restored some fields to the original form, which had been unintentionally converted. fix 6 corrected form subheadings, and fix 7 corrected music uniform titles. "constitutional" had been treated as "constitution," i.e., it was deleted from the field. some terms within music uniform titles were to have been pluralized. however, the conversion did not differentiate between terms needing pluralization and those that were already plural. "masses" ended up as "masseses." fix 7 corrected this problem.

forty-six headings, including chopin, shakespeare, and beethoven, were identified as unconverted headings, resulting from the multiple authority record problem. fix 8 adjusted the name authority file so the desired record would be the first encountered, scanned the oluc to select records containing the forty-six headings, and ran those selected records through the conversion process. approximately 80,000 records were converted by fix 8.

other problems were expected to filter in, although the stream has slowed to a trickle. these problems continue to be dealt with by oclc librarians.
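The pluralization problem behind fix 7 is easy to reproduce. The sketch below is illustrative only and does not reflect the conversion software or the aacr2 rules themselves: `naive_pluralize` applies a suffix rule blindly, while `guarded_pluralize` first consults a list of forms already known to be plural. The word list and both function names are hypothetical.

```python
# Illustrative reproduction of the "masses" -> "masseses" problem.
# ASSUMPTION: the term list and suffix rule are invented for this example;
# the article does not describe how the conversion actually pluralized terms.
KNOWN_PLURALS = {"masses", "sonatas", "symphonies"}


def naive_pluralize(term):
    """Append a plural suffix without checking whether the term is already plural."""
    if term.endswith(("s", "x", "ch", "sh")):
        return term + "es"
    return term + "s"


def guarded_pluralize(term, known_plurals=KNOWN_PLURALS):
    """Leave terms that are already plural untouched; pluralize the rest."""
    if term.lower() in known_plurals:
        return term
    return naive_pluralize(term)


print(naive_pluralize("masses"))    # the defect described in the text
print(guarded_pluralize("masses"))  # unchanged, because it is already plural
print(guarded_pluralize("mass"))
```

A lookup list is only one way to make the distinction; whatever the mechanism, the point is that the rule needs some test for "already plural" before it changes the heading.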
on the whole, problems were expected, planned for, and handled in a timely fashion. oclc originally envisioned the conversion of its large database to encompass 8 percent of the total records online; 39 percent of the records were converted, and they were available to oclc users before the january 1, 1981, deadline set by the library community.

georgia l. brown is manager, cataloging section, in the development division of oclc.

an evaluation of finding aid accessibility for screen readers
kristina l. southwell and jacquelyn slater
information technology and libraries | september 2013 34

abstract

since the passage of the americans with disabilities act in 1990 and the coincident growth of the internet, academic libraries have worked to provide electronic resources and services that are accessible to all patrons. special collections are increasingly being added to these web-based library resources, and they must meet the same accessibility standards. the recent popularity surge of web 2.0 technology, social media sites, and mobile devices has brought greater awareness about the challenges faced by those who use assistive technology for visual disabilities. this study examines the screen-reader accessibility of online special collections finding aids at 68 public us colleges and universities in the association of research libraries.

introduction

university students and faculty today expect some degree of online access to most library resources. special collections libraries are no exception, and researchers now have access to troves of digitized finding aids and original materials at university library websites nationwide. as part of the websites of higher education institutions, these resources must be accessible to patrons with disabilities.
section 504 of the rehabilitation act of 1973 first prohibited exclusion of the disabled from programs and activities of public entities, and the 1990 americans with disabilities act (ada) mandated accessibility of public services and facilities. section 508 of the rehabilitation act, as amended by the workforce investment act of 1998, also requires accessibility of federally funded services. since the passage of these laws, libraries at us colleges and universities have made progress in physical and electronic accessibility for the disabled.

according to the employment and disability institute at cornell university, 2.1 percent of noninstitutionalized persons in the united states in 2010 had a visual disability.1 the us census bureau counted nearly 8.1 million people (3.3 percent) who reported difficulty seeing and 2 million who are blind or unable to see.2 these numbers indicate that there are students, faculty, and patrons outside the academic community with visual impairments who are potential consumers of online special collections materials. as ada improvements increasingly pave the way for greater enrollment numbers of students with visual impairments, libraries must anticipate these students’ need for fully accessible information resources.

kristina southwell (klsouthwell@ou.edu) is associate professor of bibliography and assistant curator at the western history collections, university of oklahoma, norman, ok. jacquelyn slater (jdslater@ou.edu) is assistant professor of bibliography and librarian at the western history collections, university of oklahoma, norman, ok.

library websites and the constantly changing resources they offer must be regularly evaluated for compatibility with screen readers and other accessibility technologies to ensure access.
perhaps because special collections materials are relatively late arrivals to the internet, their accessibility has not received as much attention as more traditional library offerings like published books and periodicals. the goal of this study is to determine whether a sampling of special collections finding aids available on public us college and university library websites are accessible to patrons using screen readers. internet access and screen readers blind and low-vision internet users have various types of assistive technology available to them. these include screen readers with text-to-speech synthesizers, refreshable braille displays, text enlargement, screen magnification software, and nongraphical browsers. guidelines for making websites accessible via assistive technology are available from the w3c’s web content accessibility guidelines (wcag 2.0).3 these rules provide success criteria for levels a, aa, and aaa for web developers to meet. many websites today still do not conform to these guidelines, and barriers to access persist. screen-reader users access information on the internet differently than sighted persons. the keyboard usually replaces the monitor and mouse as the primary computer interface. webpage content is spoken aloud in a strictly linear order, which may differ from the visual order on screen. instead of visually scanning the page to look for the desired content, screen-reader users can use the “find” or “search” function to look for something specific or use one of several options for skimming the page via keyboard shortcuts. these shortcuts, which vary by screen reader, allow navigation to the available links, headings, aria landmarks, frames, paragraphs, and other elements of the page. a recent webaim survey of screen-reader users indicated 60.8 percent navigated lengthy webpages first by their headings. 
using the “find” feature was the second most common method (16.6 percent), followed by navigation with links (13.2 percent) and aria landmarks (2.3 percent). only 7.0 percent reported reading through a long website without using navigational shortcuts.4 some websites also offer a “skip navigation link” at the beginning of a page, which allows the user to skip over the repetitive navigational information in the banner to hear the “main content” as soon as possible. these fundamental differences in the way screen-reader users access internet content are the key to making websites that work well with assistive technology.

literature review

accessibility studies of library web sites in higher education have primarily focused on the library’s homepage and its resources and services. more than a decade ago, lilly and van fleet and spindler determined only 40 percent and 42 percent, respectively, of academic library homepages were rated accessible using bobby accessibility testing software.5 a series of similar studies followed by schmetzke and comeaux and schmetzke, which found accessibility rates of library homepages fluctuating over time, decreasing from a 2001 rate of 59 percent to 51 percent in 2002, and rising back to 60 percent in 2006.6 providenti and zai iii examined academic library homepages in kentucky, comparing data from 2003 and 2007. they also found low accessibility rates with minimal improvement in four years.7

many accessibility studies have focused on one of the mainstays of academic library sites: databases of e-journals. early studies by coonin, riley, horwath, and others found significant accessibility barriers in most electronic content providers’ databases.8 problems ranged from missing and unclear alternative text to inaccessible journal pdfs saved as images instead of text.
as awareness of web accessibility in library resources spread, research studies began to find that most databases were section 508 compliant but still lacked user-friendliness for users of assistive technology.9 more recent studies examined the actual usability of journal databases and the challenges they pose for the disabled. power and labeau still found vendor databases that were not section 508 compliant and others that were minimally compliant but lacked functionality.10 dermody and majekodunmi found that students were hindered by advanced database features intended to improve general users’ experiences.11 disabled students were also confronted with excessive links, unreadable links, and inaccessible pdfs. related studies have focused on providing guidelines for accessible library web design and services. brophy and craven highlighted the importance of universal design in library sites because of the ever-increasing complexity of web-based media.12 vandenbark provided a clear explanation of us regulations and standards for accessible design and outlined basic principles of good design and how to achieve it.13 recent works by samson and willis addressed best practices for reference and general library services to disabled patrons. samson found no consistent set of best practices between eight academic libraries studied, noting that five of the eight based their services on reactions to individual complaints instead of using a broader, proactive approach.14 willis followed up on a 1996 study by surveying technology and physical-access issues for the disabled in academic health sciences libraries. she found improvements in physical access, but technological access proved to be a mixed bag. while library catalogs were more accessible because they were available online, library webpages continued to pose problems for the disabled. 
significant deficiencies in the provision of alternative text and accessible media formats were observed.15

finding no comparable evaluations of special collections resources, in 2011 we examined the screen-reader accessibility of digitized textual materials from the special collections departments of us academic library websites.16 our study found that 42 percent of the digitized items were accessible by screen reader, while 58 percent were not. published at the same time, lora j. davis’ 2012 study evaluated accessibility of philadelphia area consortium of special collections libraries (pacscl) member libraries’ special collections websites and compared their performance to popular sites such as facebook, wikipedia, and youtube. davis found that the special collections sites had error rates comparable to the popular sites, but demonstrated that a low number of error codes in automatic checkers does not necessarily mean the page is usable for nonsighted people.17 davis concluded that it is difficult to “meaningfully assess site accessibility” using only automatic accessibility checkers.18 our current research study addresses this issue by incorporating manual tests of the special collections finding aids we examined. the results provide some insight into the screen-reader user’s experience with these materials.

method

the researchers evaluated a single online finding aid from the websites of each of the 68 us public university and college libraries in the association of research libraries. they were analyzed with automated and manual tests during the 2012 fall academic semester. the evaluated finding aids were randomly selected from each library’s manuscripts and archives collections. selection was limited to only collections that have a container list describing manuscript or archives contents at least at box level.
evaluations were performed on the default display mode of the selected finding aids. if the library’s website required a format choice instead of a default display (such as html or pdf) the html version was selected.

the automated web-accessibility checker wave 5.0 by webaim was used to perform an initial assessment of each finding aid’s conformance to section 508 and wcag 2.0 guidelines. the wave-generated report for each finding aid was used to compile a list of codes for the standard wave categories: errors, alerts, features, structural elements, and wai-aria landmarks. we recorded how many libraries earned each type of code, as well as how many times each code was generated during the entire evaluation process.

manual testing of each finding aid was performed with the webbie 3 web browser, a text-only browser for the blind and visually impaired. webbie’s ctrl-h and ctrl-l commands were used to examine the headings and links available on each finding aid to determine whether patrons who use screen readers could navigate the finding aid by using its headings or internal links. the study concluded with a manual test by screen reader directed by keyboard navigation. system access to go (satogo) and nvda were used for this test.

results

overview

basic descriptive data recorded during the selection process shows that 65 of the 68 finding aids tested were displayed as webpages using html, xhtml, or xml coding. the remaining three finding aids were displayed only in pdf, with no other viewing option available. only 25 of 68 finding aids were offered in multiple viewing formats, while 43 were only available in a single format. twenty of the finding aids were displayed in a state or regional database, while four used archon, one used contentdm, and four used dlxs.

wave 5.0 web accessibility evaluation tool

the three finding aids available only in pdf cannot be checked in wave, which is limited to webpages. therefore 65 finding aids were evaluated with this tool.
the results show that the majority of tested finding aids (58 of 65, or 89.23 percent) had at least one accessibility error (see table 1). the most common errors were missing document language, missing alternative text, missing form labels, and linked images missing alternative text. only seven of the finding aids had zero accessibility error codes.

missing document language was noted for 63 percent of finding aids. language identification is important for screen readers or any text-to-speech applications, and it is a basic level a conformance requirement to meet wcag 2.0 criteria. the finding aids tested for this study contain primarily english materials, but they also describe materials in other languages, particularly spanish and french manuscript and book titles. without language identification, these words are spoken incorrectly with english pronunciation. furthermore, increasing popularity of mobile devices with voicing capabilities will likely make language identification helpful for many users, whether or not they use a screen reader for a disability accommodation.

| error | number of libraries | total number of occurrences |
|---|---|---|
| broken skip link | 4 | 6 |
| document language missing | 41 | 41 |
| empty button | 1 | 1 |
| empty form label | 4 | 7 |
| empty heading | 15 | 16 |
| image map area missing alternative text | 2 | 2 |
| linked image missing alternative text | 12 | 28 |
| missing alternative text | 15 | 36 |
| missing form label | 23 | 29 |
| missing or uninformative page title | 7 | 7 |

table 1. wave 5.0 errors (n = 65).

the number of errors found for missing alternative text (36 instances at 15 libraries), linked images missing alternative text (28 instances at 12 libraries), and image map areas missing alternative text (two instances at two libraries) is surprising. alternative text for graphic items is one of the most basic and well-known accessibility features that can be implemented.
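Two of the errors discussed here, a missing document language and images without alternative text, are simple enough to check for directly in a page's markup. The sketch below uses only Python's standard `html.parser` module; it is a minimal illustration of what such a check looks for, not a reimplementation of wave, and the class and function names are hypothetical.

```python
from html.parser import HTMLParser


class BasicAccessibilityCheck(HTMLParser):
    """Hypothetical helper: flags a missing document language and images lacking alt."""

    def __init__(self):
        super().__init__()
        self.has_document_language = False
        self.images_missing_alt = 0

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        # A non-empty lang attribute on <html> identifies the document language.
        if tag == "html" and (attr_map.get("lang") or "").strip():
            self.has_document_language = True
        # An <img> with no alt attribute at all is counted as missing alternative
        # text; alt="" is accepted here as an intentionally empty description.
        if tag == "img" and "alt" not in attr_map:
            self.images_missing_alt += 1


def check_page(html):
    """Return the two findings for one page of markup."""
    checker = BasicAccessibilityCheck()
    checker.feed(html)
    return {
        "has_document_language": checker.has_document_language,
        "images_missing_alt": checker.images_missing_alt,
    }


sample = '<html><body><img src="seal.png"><img src="rule.png" alt=""></body></html>'
print(check_page(sample))
```

A check like this only establishes that the attributes exist. Whether the alternative text is meaningful still has to be judged by a person, which is consistent with the study's reliance on manual tests.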
the fact that it has not been provided when needed in more than a dozen finding aids suggests that these libraries have not performed the most rudimentary accessibility checks.

missing or empty form labels and empty buttons, found at 24 libraries, can cause significant problems for screen-reader users. form labels and buttons allow listeners to identify and interact with forms such as search boxes. lack of accessible descriptive information makes them challenging to use, if not impossible. because headings are used with screen readers to facilitate quick keyboard navigation of a page, the presence of empty headings deprives screen-reader users of the information they need to scan the page the way a sighted patron does. similarly, skip links are used to jump to the main content of a page, bypassing the repetitive information in headers and sidebars. broken skip links were
alert number of libraries total number of occurrences accesskey 3 15 broken same-page link 9 18 link to pdf document 3 5 missing fieldset 1 1 missing first level heading 28 28 missing focus indicator 13 13 nearby image has same alternative text 9 1,071 no heading structure 11 16 noscript element 8 9 orphaned form label 2 2 plugin 1 1 possible table caption 3 4 redundant alternative text 4 9 redundant link 26 264 redundant title text 18 1,093 skipped heading level 20 22 suspicious alternative text 6 6 suspicious link text 1 5 tabindex 8 74 underlined text 8 142 unlabeled form element with title 1 2 very small text 11 20 table 2. wave 5.0 alerts (n = 65). at first glance, the most frequently encountered alerts appear to be for nearby images with the same alternative text (1,071 instances at nine libraries), and redundant title text (1,093 instances information technology and libraries | september 2013 40 at 18 libraries). on closer inspection, it is clear the vast majority of these alerts came from just three libraries using archon and are due to the inclusion of an “add to your cart” linked arrow image at the end of each described item. this repetitive statement is read aloud by the screen reader, making for a tedious listening experience. redundant links accounted for the next largest group of errors (264 instances at 26 libraries). most of these came from a single library using contentdm. its finding aid included a large number of subject headings linked to a “refine your search” option. excessive links clutter the navigational structure used by screen readers. broken same page links, present on nine finding aids, also hamper quick navigation within a page. other alerts reported at several libraries indicated failure to provide descriptive information or adequate alternative text for form labels, table captions, fieldsets, and links. 
the presence of these problems underscores the fact that descriptive information needed by screen reader users is not reliably available in finding aids. the remaining alerts for accesskey, tabindex, plugin, noscript element, and link to pdf document simply highlight areas that should be checked for correct implementation and do not confirm the presence of an access barrier. the features, structural elements, and wai-aria landmarks codes in wave identify the coding elements that make online content more accessible. features help users with disabilities interact with the page and read all of the available information on it, such as alternative text for images and form labels (see table 3). fully 83 percent (54 of 65) of library finding aids evaluated included at least one accessibility feature. the most commonly used features are alternative text and linked images with alternative text. a total of 53 libraries used some form of alternative text. wave reported that skip navigation links were available on only four finding aids, accounting for just 6 percent of libraries. a manual check of the source code, however, located a total of six finding aids feature number of libraries total number of occurrences alternative text 45 142 element language 2 2 form label 5 16 image button with alternative text 4 4 image map area with alternative text 2 5 image map with alt attribute 3 3 linked image with alternative text 19 31 long description 1 6 null or empty alternative text 10 21 null or empty alternative text on spacer 9 30 skip link 4 4 skip link target 5 5 table 3. features (n = 65). an evaluation of finding aid accessibility for screen readers | southwell and slater 41 with functioning skip links, all correctly located at or near the beginning of the page. this discrepancy indicates that accessibility checkers are not fail proof and must be followed by manual tests. the two added libraries raise the total to just 9 percent of libraries with skip links. 
considering the value of skip links to users of assistive technology, it is unfortunate they are not present on more pages.

the structural elements noted by wave are the elements that help with keyboard navigation of the page and contextualizing layout-based information, such as tables or lists (see table 4). most libraries (64 of 65, or 98 percent) used at least one structural feature on their finding aids. lists and heading levels 2 and 3 are the most frequently used, followed by heading levels 1 and 4. although heading levels should be ordered sequentially to provide logical structure to the document, heading level 1 was skipped at 28 libraries (see table 2). table header cells, included at the 9 libraries using data tables to display container lists, are key to making tables screen-reader accessible. inline frames were used at seven libraries, as opposed to six libraries that used traditional frames. while inline frames are considered more accessible than traditional frames, using css is preferable to using either type.

| structural element | number of libraries | total number of occurrences |
|---|---|---|
| definition/description list | 11 | 86 |
| heading level 1 | 33 | 54 |
| heading level 2 | 43 | 150 |
| heading level 3 | 42 | 295 |
| heading level 4 | 25 | 108 |
| heading level 5 | 1 | 2 |
| heading level 6 | 0 | 0 |
| inline frame | 7 | 7 |
| ordered list | 6 | 16 |
| table header cell | 9 | 38 |
| unordered list | 41 | 715 |
| wai-aria landmarks | 1 | 3 |

table 4. structural elements (n = 65).

wai-aria landmarks are element attributes that identify areas of the page such as “banner” or “search.” they serve as navigational aids for assistive technology users in a manner similar to headings. only one of the finding aids included three wai-aria roles. while aria landmarks are becoming more common on the internet in general, the data collected for this study indicates they have not yet been incorporated into library finding aids.
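The heading problems reported in tables 2 and 4 (no heading structure, a missing first-level heading, and skipped heading levels) can be illustrated with a short outline check. This is a sketch under stated assumptions, not wave's own logic: "missing first level heading" is taken to mean that headings exist but none is an h1, and a "skipped level" is any step downward of more than one level. The names are hypothetical.

```python
import re
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Hypothetical helper: records heading levels in document order."""

    def __init__(self):
        super().__init__()
        self.levels = []

    def handle_starttag(self, tag, attrs):
        if re.fullmatch(r"h[1-6]", tag):
            self.levels.append(int(tag[1]))


def heading_report(html):
    """Summarize heading structure using the simplified definitions above."""
    collector = HeadingCollector()
    collector.feed(html)
    levels = collector.levels
    # A skip is a jump of more than one level, e.g. h2 followed directly by h4.
    skipped = [(a, b) for a, b in zip(levels, levels[1:]) if b > a + 1]
    return {
        "no_heading_structure": not levels,
        "missing_first_level_heading": bool(levels) and 1 not in levels,
        "skipped_levels": skipped,
    }


sample = "<h2>collection overview</h2><h4>container list</h4>"
print(heading_report(sample))
```

Collecting the levels in document order mirrors how a screen reader presents them: the listener hears the sequence, so gaps in it are what cause the confusion the article describes.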
information technology and libraries | september 2013 42

webbie 3 text browser analysis

because screen reader users often use a webpage’s headings and links for navigating by keyboard commands, the importance of headings and links to accessibility cannot be overstated. a quick check of any page in a nongraphical browser will reveal the page’s linear structure and reading order as handled by a screen reader. a text-only view of a website shows the order of headings and links within the document. webbie 3’s ctrl-h and ctrl-l commands were used to evaluate the 65 finding aids for the presence of headings and links for internal navigation. finding aids were rated on a pass/fail basis in three categories:

• presence of any headings
• presence of headings for navigating to another key part of the finding aid (e.g., container list)
• presence of internal links for navigating to another key part of the finding aid

headings/links                                                  yes        no
finding aid has at least one heading                            59 (91%)   6 (9%)
headings are used for navigation within finding aid             44 (68%)   21 (32%)
links are used for navigation within finding aid                37 (57%)   28 (43%)
headings and/or links used for navigation within finding aid    49 (75%)   16 (25%)

table 5. use of headings and links for navigation (n = 65).

while 91 percent had at least one heading, just 68 percent actually had headings that enabled navigation to another important section of the document, such as the container list. that means nearly one-third of all finding aids encountered during this study could not be navigated by headings. even those that did have enough headings with which to navigate did not always have the headings in proper sequential order, or were missing first-level headings. this lack of adequate structure, given the length of some manuscript-collection finding aids, can make reading them with a screen reader tedious.
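two of the three pass/fail categories above (any headings, and internal links) lend themselves to a rough automated pre-check. the sketch below is an assumption-laden illustration, not a reproduction of webbie 3: the names `NavigationAudit` and `audit_navigation` are hypothetical, and an internal link is counted as working only when its `#fragment` matches an `id` or an `<a name="...">` somewhere in the page. whether a heading actually leads to a key section such as the container list still requires human judgment.

```python
from html.parser import HTMLParser


class NavigationAudit(HTMLParser):
    """Collects headings, internal link fragments, and possible link targets."""

    HEADINGS = ("h1", "h2", "h3", "h4", "h5", "h6")

    def __init__(self):
        super().__init__()
        self.heading_count = 0
        self.fragments = []   # "contents" for href="#contents"
        self.anchors = set()  # id values and <a name="..."> values

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in self.HEADINGS:
            self.heading_count += 1
        if attrs.get("id"):
            self.anchors.add(attrs["id"])
        if tag == "a":
            if attrs.get("name"):
                self.anchors.add(attrs["name"])
            href = attrs.get("href") or ""
            if href.startswith("#") and len(href) > 1:
                self.fragments.append(href[1:])


def audit_navigation(html):
    """Hypothetical helper: pass/fail style summary for one finding aid."""
    parser = NavigationAudit()
    parser.feed(html)
    working = [f for f in parser.fragments if f in parser.anchors]
    broken = [f for f in parser.fragments if f not in parser.anchors]
    return {
        "has_heading": parser.heading_count > 0,
        "has_internal_links": bool(working),
        "broken_links": broken,
    }
```

the same target check is one way to tell a functioning skip link from one that merely appears in the markup, which is the kind of discrepancy the manual source-code review uncovered.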
finding aids with few or no headings prevent users of assistive technology from conveniently moving between sections, as a sighted reader can by visually scanning the page and selecting a relevant portion to read. even fewer finding aids offered links for navigating between sections of the finding aid. while 57 percent included such links, 43 percent did not. a total of 25 percent of pages tested lacked both headings and links of any kind for navigation within the finding aid. inclusion of headings or links to the standard sections of the finding aid facilitates keyboard navigation. additional headings or links to individual series or boxes provide even more options for screen reader users. this is particularly helpful for patrons whose queries aren’t easily answered using a search function, for example, when a patron does not know the specific terms to use for searching. only the most patient visitor will listen to an entire finding aid being read.

screen reader test

a manual screen reader test of each finding aid was completed by the researchers with satogo and nvda. both screen readers were used to ensure that success or failure to read the content was not attributable to any particular screen reader software. despite the 89 percent error rate noted by the automatic accessibility checker, the screen readers were able to read the main content of all 65 finding aids. the three pdf-only finding aids in the original group of 68 were also tested by opening them with the screen reader and adobe reader together. adobe reader indicated all three lacked tagging for structure and attempted to prepare them for reading. this resulted in all three being read aloud by the screen reader, but only one of the three was navigable by linked sections of the finding aid. the remaining two finding aids had no headings or links.
while it is encouraging that the main content of all 68 finding aids could be read, some functioned poorly because of how the information is organized and displayed. finding aids serve as reference works for researchers and as internal record-keeping documents for the history of the collection. as such, they typically have a substantial amount of administrative information positioned at the beginning. biographical, acquisition, and scope and content notes are common, as are processing details and subject headings. sighted users can glance at the administrative information and skip to the collection summary or container list as needed. screen reader users can bypass this administrative information by using headings or links when they are supplied. users of the one-third of finding aids in this study that lacked these shortcuts must skim, search, or read the entire finding aid. inclusion of extensive administrative information without providing the means to skip past it creates a significant usability barrier.

the descriptive style and display format of the container list also posed problems during this test. lengthy container lists displayed in tables are difficult to understand when spoken because tables are read row-by-row. this separates the descriptive table header cells, such as “box” and “folder,” from the related information in the rows and columns below. as a result, the screen reader says “one, fifteen” before the description of the item in box 1, folder 15. it is hard to follow a long table, and the listener must remember or revisit the column and row headers to make sense of the descriptions. most screen readers have a table-reading mode for data tables that will read the header cell with the associated content, but only if the table has been marked up with sufficient tags. container-list-item descriptions that begin with an identification number or numeric date (e.g., 2012/01/13) are particularly unclear for listeners.
these long sequences of numbers seem out of context when spoken by the screen reader, and it can be difficult to infer the relationship between the number and the item. item descriptions that are phrased as brief sentences in plain language result in finding aids that are more easily understood.

application of findings

most special collections personnel in academic libraries are not responsible for the design of their websites, which are part of a larger organization that serves other needs. it is important that special collections librarians communicate to administrative and systems personnel that finding aids must be accessible to the visually disabled. libraries cannot rely on a content management system’s claims of being section 508-compliant to ensure accessibility, because that does not automatically guarantee the information displayed in the system is accessible. proper implementation of any content management system’s accessibility features is a key factor in achieving accessibility.

librarians can take the first step toward improving accessibility of their special collections’ online finding aids by experiencing firsthand what screen reader users encounter when they use them. this can be done by conducting the same automated and manual tests described in this study. the following key checkpoints should be considered:

accessible finding aids should
• be keyboard navigation-friendly;
• include alternative text for all graphics;
• have descriptive labels and titles for all interactive elements like forms;
• offer at least one type of navigational structure:
  o skip links and internal navigation links,
  o sufficient and properly ordered headings, or
  o wai-aria landmarks; and
• have a correct linear reading order that simulates visual reading order, particularly for the container list.
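two of the checkpoints above, alternative text for graphics and header cells in container-list tables, can be spot-checked before any manual screen reader test. the sketch below is one possible approach under stated assumptions: `CheckpointAudit` and `audit_checkpoints` are hypothetical names, an image passes if it carries any `alt` attribute (including an empty one, which wave counts as a feature for spacers), and a table passes if it contains at least one `th` cell.

```python
from html.parser import HTMLParser


class CheckpointAudit(HTMLParser):
    """Counts images without alt attributes and tables without header cells."""

    def __init__(self):
        super().__init__()
        self.images_missing_alt = 0
        self.tables = 0
        self.tables_with_headers = 0
        self._open_tables = []  # one flag per open table: was a <th> seen?

    def handle_starttag(self, tag, attrs):
        if tag == "img" and "alt" not in dict(attrs):
            self.images_missing_alt += 1
        elif tag == "table":
            self._open_tables.append(False)
        elif tag == "th" and self._open_tables:
            self._open_tables[-1] = True

    def handle_endtag(self, tag):
        if tag == "table" and self._open_tables:
            self.tables += 1
            if self._open_tables.pop():
                self.tables_with_headers += 1


def audit_checkpoints(html):
    """Hypothetical helper: summarize two of the checkpoints for one page."""
    parser = CheckpointAudit()
    parser.feed(html)
    return {
        "images_missing_alt": parser.images_missing_alt,
        "tables_without_headers": parser.tables - parser.tables_with_headers,
    }
```

a table flagged here may simply be a layout table, so flagged pages still need to be read with a screen reader to confirm whether “box” and “folder” are announced with each row.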
conclusion

this study indicates that special collections finding aids at us public colleges and universities can be accessed by screen-reader users, but they do not always perform well because of faulty coding and inadequate use of headings or links for keyboard navigation. it is clear that many finding aids available online today have not been evaluated for optimal performance with assistive technology. this results in usability barriers for visually impaired patrons. special collections librarians can help ensure their electronic finding aids are accessible to screen-reader users by conducting automatic and manual tests that focus on usability. the test results can be used to initiate changes that will result in finding aids that are accessible to all users.
30 information technology and libraries | march 2010

the path toward global interoperability in cataloging

ilana tolkoff

libraries began in complete isolation with no uniformity of standards and have grown over time to be ever more interoperable. this paper examines the current steps toward the goal of universal interoperability. these projects aim to reconcile linguistic and organizational obstacles, with a particular focus on subject headings, name authorities, and titles.

in classical and medieval times, library catalogs were completely isolated from each other and idiosyncratic. since then, there has been a trend to move toward greater interoperability. we have not yet attained this international standardization in cataloging, and there are currently many challenges that stand in the way of this goal. this paper will examine the teleological evolution of cataloging and analyze the obstacles that stand in the way of complete interoperability, how they may be overcome, and which may remain. this paper will not provide a comprehensive list of all issues pertaining to interoperability; rather, it will attempt to shed light on those issues most salient to the discussion.

unlike the libraries we are familiar with today, medieval libraries worked in near total isolation. most were maintained by monks in monasteries, and any regulations in cataloging practice were established by each religious order. one reason for their lack of regulations was that their collections were small by our standards; a monastic library had at most a few hundred volumes (a couple thousand in some very rare cases). the “armarius,” or librarian, kept more of an inventory than an actual catalog, along with the inventories of all other valuable possessions of the monastery. there were no standard rules for this inventory-keeping, although the armarius usually wrote down the author and title, or incipit if there was no author or title.
some of these inventories also contained bibliographic descriptions, which most often described the physical book rather than its contents. the inventories were usually taken according to the shelf organization, which was occasionally based on subject, like most libraries are today. these trends in medieval cataloging varied widely from library to library, and their inventories were entirely different from our modern opacs. the inventory did not provide users access to the materials. instead, the user consulted the armarius, who usually knew the collection by heart. this was a reasonable request given the small size of the collections.1 this type of nonstandardized cataloging remained relatively unchanged until the nineteenth century, when charles c. jewett introduced the idea of a union catalog. jewett also proposed having stereotype plates for each bibliographic record, rather than a book catalog, because this could reduce costs, create uniformity, and organize records alphabetically. this was the precursor to the twentieth-century card catalog. while many of jewett’s ideas were not actually practiced during his lifetime, they laid the foundation for later cataloging practices.2 the twentieth century brought a great revolution in cataloging standards, particularly in the united states. in 1914, the library of congress subject headings (lcsh) were first published and introduced a controlled vocabulary to american cataloging. the 1960s saw a wide array of advancements in standardization. the library of congress (lc) developed marc, which became a national standard in 1973. it also was the time of the creation of anglo-american cataloguing rules (aacr), the paris principles, and international standard bibliographic description (isbd). 
while many of these standardization projects were uniquely american or british phenomena, they quickly spread to other parts of the world, often in translated versions.3 while the technology did not yet exist in the 1970s to provide widespread local online catalogs, technology did allow for union catalogs containing the records of many libraries in a single database. these union catalogs included the research libraries information network (rlin), the oclc online computer library center (oclc), and the western library network (wln). in the 1980s the local online public access catalog (opac) emerged, and in the 1990s opacs migrated to the web (webpacs).4 currently, most libraries have opacs and are members of oclc, the largest union catalog, used by more than 71,000 libraries in 112 countries and territories.5 now that most of the world’s libraries are on oclc, librarians face the challenge and inconvenience of discrepancies in cataloging practice due to the differing standards of diverse countries, languages, and alphabets. the fields of language engineering and linguistics are working on various language translation and analysis tools. some of these include machine translation; ontology, or the hierarchical organization of concepts; information extraction, which deciphers conceptual information from unorganized information, such as that on the web; text summarization, in which computers create a short summary from a long piece of text; and speech processing, which is the computer analysis of human speech.6 while these are all exciting advances in information technology, as of yet they are not intelligent enough to help us establish cataloging interoperability. 
it will be interesting to see whether language engineering tools will be capable of helping catalogers in the future, but for now they are best at making sense of unstructured information, such as the web. the interoperability of library catalogs, which consist of highly structured information, must be tackled through software that innovative librarians of the future will produce.

ilana tolkoff (ilana.tolkoff@gmail.com) holds a ba in music and italian from vassar college, an ma in musicology from brandeis university, and an mls from the university at buffalo. she is currently seeking employment as a music librarian.

in an ideal world, oclc would be smoothly interoperable at a global level. a single thesaurus of subject headings would have translations in every language. there would be just one set of authority files. all manifestations of a single work would be grouped under the same title, translatable to all languages. there would be a single bibliographic record for a single work, rather than multiple bibliographic records in different languages for the same work. this single bibliographic record could be translatable into any language, so that when searching in worldcat, one could change the settings to any language to retrieve records that would display in that chosen language. when catalogers contribute to oclc, they would create the records in their respective languages, and once in the database the records would be translatable to any other language. because records would be so fluidly translatable, an opac could be searched in any language. for example, the default settings for the university at buffalo’s opac could be english, but patrons could change those settings to accommodate the great variety of international students doing research. this vision is utopian to say the least, and it is doubtful that we will ever reach this point.
but it is valuable to establish an ideal scenario to aim our innovation in the right direction. one major obstacle in the way of global interoperability is the existence of different alphabets and the inherently imperfect nature of transliteration. there are essentially two types of transliteration schemes: those based on phonetic structure and those based on morphemic structure. the danger of phonetic transliteration, which mimics pronunciation, is that semantics often get lost. it fails to differentiate between homographs (words that are spelled and pronounced the same way but have different meanings). complications also arise when there are differences between careful and casual styles of speech. park asserts, “when catalogers transcribe words according to pronunciation, they can create inconsistent and arbitrary records.”7 morphemic transliteration, on the other hand, is based on the meanings of morphemes, and sometimes ends up being very different from the pronunciation in the source language. one advantage to this, however, is that it requires fewer diacritics than phonetic transliteration. park, whose primary focus is on korean–roman transliteration, argues that the mccune reischauer phonetic transliteration that libraries use loses too much of the original meaning. in other alphabets, however, phonetic transliteration may be more beneficial, as in the lc’s recent switch to pinyin transliteration in chinese. the lc found pinyin to be more easily searchable than wade-giles or monosyllabic pinyin, which are both morphemic. however, another problem with transliteration that neither phonetic nor morphemic schemes can solve is word segmentation—how a transliterated word is divided. this becomes problematic when there are no contextual clues, such as in a bibliographic record.8 other obstacles that stand in the way of interoperability are the diverse systems of subject headings, authority headings, and titles found internationally. 
resource description and access (rda) will not deal with subject headings because it is such a hefty task, so it is unlikely that subject headings will become globally interoperable in the near future.9 fortunately, twenty-four national libraries of english speaking countries use lcsh, and twelve non-english-speaking countries use a translated or modified version of lcsh. this still leaves many more countries that use their own systems of subject headings, which ultimately need to be made interoperable. even within a single language, subject headings can be complicated and inconsistent because they can be expressed as a single noun, compound noun, noun phrase, or inverted phrase; the problem becomes even greater when trying to translate these to other languages. bennett, lavoie, and o’neill note that catalogers often assign different subject headings (and classifications) to different manifestations of the same work.10 that is, the record for the novel gone with the wind might have different subject headings than the record for the movie. this problem could potentially be resolved by the functional requirements for bibliographic records (frbr), which will be discussed below. translation is a difficult task, particularly in the context of strict cataloging rules. it is especially complicated to translate among unrelated languages, where one might be syntactic and the other inflectional. this means that there are discrepancies in the use of prepositions, conjunctions, articles, and inflections. the ability to add or remove terms in translation creates endless variations. a single concept can be expressed in a morpheme, a word, a phrase, or a clause, depending on the language. there also are cultural differences that are reflected in different languages. park gives the example of how angloamerican culture often names buildings and brand names after people, reflecting our culture’s values of individualism, while in korea this phenomenon does not exist at all. 
on the other hand, korean’s use of formal and informal inflections reflects korea’s collectivist hierarchical culture. another concept that does not cross cultural lines is the korean pumasi system in which family and friends help someone in a time of need with the understanding that the favor will be returned when they need it. this cannot be translated into a single english word, phrase, or subject heading. one way of resolving ambiguity in translations is through modifiers or scope notes, but this is only a partial solution.11

because translation and transliteration are so difficult, as well as labor-intensive, the current trend is to link already existing systems. multilingual access to subjects (macs) is one such linking project that aims to link subject headings in english, french, and german. it is a joint project under the conference of european national librarians among the swiss national library, the bibliothèque nationale de france (bnf), the british library (bl), and die deutsche bibliothek (ddb). it aims to link the english lcsh, the french répertoire d’autorité matière encyclopédique et alphabétique unifié (rameau), and the german schlagwortnormdatei/regeln für den schlagwortkatalog (swd/rswk). this requires manually analyzing and matching the concepts in each heading. if there is no conceptual equivalent, then the heading simply stands alone. macs can link between headings and strings or even create new headings for linking purposes. this is not as fruitful as it sounds, however, as there are fewer correspondences than one might expect. the macs team experimented with finding correspondences by choosing two topics: sports, which was expected to have a particularly high number of correspondences, and theater, which was expected to have a particularly low number of correspondences. of the 278 sports headings, 86 percent matched in all three languages, 8 percent matched in two, and 6 percent was unmatched.
of the 261 theater headings, 60 percent matched in three languages, 18 percent matched in two, and 22 percent was unmatched.12 even in the most cross-cultural subject of sports, 14 percent of terms did not correspond fully, making one wonder whether linking will work well enough to prevail. a similar project—the virtual international authority file (viaf)—is being undertaken for authority headings, a joint project of the lc, the bnf, and ddb, and now including several other national libraries. viaf aims to link (not consolidate) existing authority files, and its beta version (available at http://viaf.org) allows one to search by name, preferred name, or title. oclc’s software mines these authority files and the titles associated with them for language, lc control number, lc classification, usage, title, publisher, place of publication, date of publication, material type, and authors. it then derives a new enhanced authority record, which facilitates mapping among authority records in all of viaf’s languages. these derived authority records are stored on oai servers, where they are maintained and can be accessed by users. users can search viaf by a single national library or broaden their possibilities by searching all participating national libraries. as of 2006, between the lc’s and ddb’s authority files, there were 558,618 matches, including 70,797 complex matches (one-to-many), and 487,821 unique matches (one-to-one) out of 4,187,973 lc names and 2,659,276 ddb names. ultimately, viaf could be used for still more languages, including non-roman alphabets.13 recently the national library of israel has joined, and viaf can link to the hebrew alphabet. a similar project to viaf that also aimed to link authority files was linking and exploring authority files (leaf), which was under the auspices of the information society technologies programme of the fifth framework of the european commission. 
the three-year project began in 2001 with dozens of libraries and organizations (many of which are national libraries), representing eight languages. its website describes the project as follows:

information which is retrieved as a result of a query will be stored in a pan-european “central name authority file.” this file will grow with each query and at the same time will reflect what data records are relevant to the leaf users. libraries and archives wanting to improve authority information will thus be able to prioritise their editing work. registered users will be able to post annotations to particular data records in the leaf system, to search for annotations, and to download records in various formats.14

park identifies two main problems with linking authority files. one is that name authorities still contain some language-specific features. the other is that disambiguation can vary among name authority systems (e.g., birth/death dates, corporate qualifiers, and profession/activity). these are the challenges that projects like leaf and viaf must overcome.

while the linking of subject headings and name authorities is still experimental and imperfect, the frbr model for linking titles is much more promising and will be incorporated in the soon-to-be-released rda. according to bennett, lavoie, and o’neill, there are three important benefits to frbr: (1) it allows for different views of a bibliographic database, (2) it creates a hierarchy of bibliographic entities in the catalog such that all versions of the same work fall into a single collapsible entry point, and (3) the confluence of the first two benefits makes the catalog more efficient. in the frbr model, the bibliographic record consists of four entities: (1) the work, (2) the expression, (3) the manifestation, and (4) the item.
all manifestations of a single work are grouped together, allowing for a more economical use of information because the title needs to be entered only once.15 that is, a “title authority file” will exist much like a name authority file. this means that all editions in all languages and in all formats would be grouped under the same title. for example, the lord of the rings title would include all novels, films, translations, and editions in one grouping. this would reduce the number of bibliographic records, and as danskin notes, “the idea of creating more records at a time when publishing output threatens to outstrip the cataloguing capacity of national bibliographic agencies is alarming.”16 the frbr model is particularly beneficial for complex canonical works like the bible. there are a small number of complex canonical works, but they take up a disproportionate number of holdings in oclc.17 because this only applies to a small number of works, it would not be difficult to implement, and there would be a disproportionate benefit in the long run. there is some uncertainty, however, in what constitutes a complex work and whether certain items should be grouped under the same title.18 for instance, should prokofiev’s romeo and juliet be grouped with shakespeare’s?

the advantage of the frbr model for titles over subject headings or name authorities is that no such thing as a title authority file exists (as conceptualized by frbr). we would be able to start from scratch, creating such title authority files at the international level. subject headings and name authorities, on the other hand, already exist in many different forms and languages so that cross-linking projects like viaf might be our only option.
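the four-entity hierarchy described above can be pictured as nested records in which the title is stored once, at the work level. the following python sketch is only an illustration of that grouping idea; the class and field names (`Work`, `Expression`, `Manifestation`, `Item`, `carrier`, `holding_library`) and the sample data are assumptions made for this example and are not drawn from the frbr report, rda, or any oclc system.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    holding_library: str  # a single physical or digital copy


@dataclass
class Manifestation:
    carrier: str  # e.g. "print", "ebook"
    items: list = field(default_factory=list)


@dataclass
class Expression:
    language: str
    manifestations: list = field(default_factory=list)


@dataclass
class Work:
    title: str  # entered once for every expression below it
    expressions: list = field(default_factory=list)

    def all_manifestations(self):
        """Collapse every expression into one list, as a single entry point."""
        return [m for e in self.expressions for m in e.manifestations]


# Hypothetical sample data for illustration only.
work = Work(
    title="the lord of the rings",
    expressions=[
        Expression("english", [
            Manifestation("print", [Item("library a"), Item("library b")]),
            Manifestation("ebook", [Item("library a")]),
        ]),
        Expression("german", [
            Manifestation("print", [Item("library c")]),
        ]),
    ],
)

for m in work.all_manifestations():
    print(work.title, "|", m.carrier, "|", len(m.items), "item(s)")
```

whether an adaptation such as a film or a ballet belongs under the same work is exactly the open question raised above, so the sample deliberately stays with an original text and one translation.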
it is encouraging to see the strides being made to make subject headings, name authority headings, and titles globally interoperable, but what about other access points within a record’s bibliographic description? these are usually in only one language, or two if cataloged in a bilingual country. should these elements (format, contents, and so on) be cross-linked as well, and is this even possible? what should reasonably be considered an access point? most people search by subject, author, or title, so perhaps it is not worth making other types of access points interoperable for the few occasions when they are useful. yet if 100 percent universal interoperability is our ultimate utopian goal, perhaps we should not settle for anything less than true international access to all fields in a record. because translation and transliteration are such complex undertakings, linking of extant files is the future of the field. there are advantages and disadvantages to this. on the one hand, linking these files is certainly better than having them exist only for their own countries. they are easily executed projects that would not require a total overhaul of the way things currently stand. the disadvantages are not to be ignored, however. the fact that files do not correspond perfectly from language to language means that many files will remain in isolation in the national library that created them. another problem is that cross-linking is potentially more confusing to the user; the search results on http://www.viaf.org are not always simple and straightforward. if cross-linking is where we are headed, then we need to focus on a more user-friendly interface. if the ultimate goal of interoperability is simplification, then we need to actually simplify the way query results are organized rather than make them more confusing. very soon rda will be released and will bring us to a new level of interoperability. 
aacr2 arrived in 1978, and though it has been revised several times, it is in many ways outdated and mainly applies to books. rda will bring something completely new to the table. it will be flexible enough to be used in other metadata schemes besides marc, and it can even be used by different industries such as publishers, museums, and archives.19 its incorporation of the frbr model is exciting as well. still, there are some practical problems in implementing rda and frbr, one of which is that reeducating librarians about the new rules will be costly and take time. also, frbr in its ideal form would require a major overhaul of the way oclc and integrated library systems currently operate, so it will be interesting to see to what extent rda will actually incorporate frbr and how it will be practically implemented. danskin asks, “will the benefits of international co-operation outweigh the costs of effecting changes? is the usa prepared to change its own practices, if necessary, to conform to european or wider ifla standards?”20 it seems that the united states is in fact ready and willing to adopt frbr, but to what extent is yet to be determined. what i have discussed in this paper are some of the more prominent international standardization projects, although there are countless others, such as eurowordnet, the open language archives community (olac), and international cataloguing code (icc), to name but a few.21 in general, the current major projects consist of linking subject headings, name authority files, and titles in multiple languages. linking may not have the best correspondence rates, we have still not begun to tackle the cross-linking of other bibliographic elements, and at this point search results may be more confusing than helpful. but the existence of these linking projects means we are at least headed in the right direction. 
the emergent universality of oclc was our most recent step toward interoperability, and it looks as if cross-linking is our next step. only time will tell what steps will follow.

references
1. lawrence s. guthrie ii, “an overview of medieval library cataloging,” cataloging & classification quarterly 15, no. 3 (1992): 93–100.
2. lois mai chan and theodora hodges, cataloging and classification: an introduction, 3rd ed. (lanham, md.: scarecrow, 2007): 48.
3. ibid., 6–8.
4. ibid., 7–9.
5. oclc, “about oclc,” http://www.oclc.org/us/en/about/default.htm (accessed dec. 9, 2009).
6. jung-ran park, “cross-lingual name and subject access: mechanisms and challenges,” library resources & technical services 51, no. 3 (2007): 181.
7. ibid., 185.
8. ibid.
continued on page 39

tagging: an organization scheme for the internet | visser 39

international and o’reilly media, web 2.0 refers to the web as being a platform for harnessing the collective power of internet users interested in creating and sharing ideas and information without mediation from corporate, government, or other hierarchical policy influencers or regulators. web 3.0 is a much more fluid concept as of this writing. there are individuals who use it to refer to a semantic web where information is analyzed or processed by software designed specifically for computers to carry out the currently human-mediated activity of assigning meaning to information on a webpage. there are librarians involved with exploring virtual-world librarianship who refer to the 3d environment as web 3.0. the important point here is that what internet users now know as web 2.0 is in the process of being altered by individuals continually experimenting with and improving upon existing web applications. web 3.0 is the undefined future of the participatory internet.
3.
clay shirky, “here comes everybody: the power of organizing without organizations” (presentation videocast, berkman center for internet & society, harvard university, cambridge, mass., 2008), http://cyber.law.harvard.edu/interactive/events/2008/02/shirky (accessed oct. 1, 2008).
4. ibid.
5. lawrence lessig, “early creative commons history, my version,” videocast, aug. 11, 2008, lessig 2.0, http://lessig.org/blog/2008/08/early_creative_commons_history.html (accessed aug. 13, 2008).
6. elaine peterson, “beneath the metadata: some philosophical problems with folksonomy,” d-lib magazine 12, no. 11 (2006), http://www.dlib.org/dlib/november06/peterson/11peterson.html (accessed sept. 8, 2008).
7. clay shirky, “ontology is overrated: categories, links, and tags,” online posting, spring 2005, clay shirky’s writings about the internet, http://www.shirky.com/writings/ontology_overrated.html#mind_reading (accessed sept. 8, 2008).
8. gene smith, tagging: people-powered metadata for the social web (berkeley, calif.: new riders, 2008): 68.
9. ibid., 76.
10. thomas vander wal, “folksonomy,” online posting, feb. 7, 2007, vanderwal.net, http://www.vanderwal.net/folksonomy.html (accessed aug. 26, 2008).
11. thomas vander wal, “explaining and showing broad and narrow folksonomies,” online posting, feb. 21, 2005, personal infocloud, http://www.personalinfocloud.com/2005/02/explaining_and_.html (accessed aug. 29, 2008).
12. shirky, “ontology is overrated.”
13. ibid.
14. michael arrington, “exclusive: screen shots and feature overview of delicious 2.0 preview,” online posting, june 16, 2005, techcrunch, http://www.techcrunch.com/2007/09/06/exclusive-screen-shots-and-feature-overview-of-delicious-20-preview/ (accessed jan. 6, 2010).
15. smith, tagging, 67–93.
16. vander wal, “explaining and showing broad and narrow folksonomies.”
17.
adam mathes, “folksonomies—cooperative classification and communication through shared metadata” (graduate paper, university of illinois urbana–champaign, dec. 2004); peterson, “beneath the metadata”; shirky, “ontology is overrated”; thomas and griffin, “who will create the metadata for the internet?”
18. shirky, “ontology is overrated.”
19. peterson, “beneath the metadata.”
20. cory doctorow, “metacrap: putting the torch to seven straw-men of the meta-utopia,” online posting, aug. 26, 2001, the well, http://www.well.com/~doctorow/metacrap.htm (accessed sept. 15, 2008).
21. marieke guy and emma tonkin, “folksonomies: tidying up tags?” d-lib magazine 12, no. 1 (2006), http://www.dlib.org/dlib/january06/guy/01guy.html (accessed sept. 8, 2008).
22. shirky, “ontology is overrated.”

global interoperability continued from page 33
9. julie renee moore, “rda: new cataloging rules, coming soon to a library near you!” library hi tech news 23, no. 9 (2006): 12.
10. rick bennett, brian f. lavoie, and edward t. o’neill, “the concept of a work in worldcat: an application of frbr,” library collections, acquisitions, & technical services 27, no. 1 (2003): 56.
11. park, “cross-lingual name and subject access.”
12. ibid.
13. thomas b. hickey, “virtual international authority file” (microsoft powerpoint presentation, ala annual conference, new orleans, june 2006), http://www.oclc.org/research/projects/viaf/ala2006c.ppt (accessed dec. 9, 2009).
14. leaf, “leaf project consortium,” http://www.crxnet.com/leaf/index.html (accessed dec. 9, 2009).
15. bennett, lavoie, and o’neill, “the concept of a work in worldcat.”
16. alan danskin, “mature consideration: developing bibliographic standards and maintaining values,” new library world 105, no. 3/4 (2004): 114.
17. ibid.
18. bennett, lavoie, and o’neill, “the concept of a work in worldcat.”
19. moore, “rda.”
20. danskin, “mature consideration,” 116.
21.
ibid.; park, “cross-lingual name and subject access.”

communications

meeting users where they are: delivering dynamic content and services through a campus portal

graham sherriff, dan desanto, daisy benson, and gary s. atwood

information technology and libraries | march 2020 https://doi.org/10.6017/ital.v39i1.11519

graham sherriff (graham.sherriff@uvm.edu) is instructional design librarian, university of vermont. dan desanto (ddesanto@uvm.edu) is instruction librarian, university of vermont. daisy benson (daisy.benson@uvm.edu) is library instruction coordinator, university of vermont. gary s. atwood (gatwood@uvm.edu) is education librarian, university of vermont.

abstract
campus portals are one of the most visible and frequently used online spaces for students, offering one-stop access to key services for learning and academic self-management. this case study reports how instruction librarians at the university of vermont collaborated with portal developers in the registrar’s office to develop high-impact, point-of-need content for a dedicated “library” page. this content was then created in libguides and published using the application programming interfaces (apis) for libguides boxes. initial usage data and analytics show that traffic to the libraries’ portal page has been substantially and consistently higher than expected. the next phase for the project will be the creation of customized library content that is responsive to the student’s user profile.

introduction
for many academic institutions, campus portals (also referred to as enterprise portals) are one of students’ most frequently used means of interacting with their institutions. campus portals are websites that provide students and other campus constituents with a “one-stop shop” experience, with easy access to a selection of key services for learning and academic self-management.
typically, portals provide features that make it possible for students to obtain course information, manage course enrollment, view grades, manage financial accounts, and access information about campus activities. for faculty and staff, campus portals provide access to administrative resources related to teaching, human relations, and more. these campus portals are different from library portals, which some libraries implemented in the 2000s as a way to centralize access to key library services.1 currently, the public-facing websites of many colleges and universities serve a crucial role in marketing the institution to prospective students. this creates an incentive to be as comprehensive as possible and to showcase the full breadth of programs, services, offices, and facilities. a common disadvantage to this approach to institutional web design is information overload: an overwhelming array of labels and links that diminish the ability of current affiliates to find and access the services they need. these sites are designed for external users for whom the research and educational functions of the library are a low priority. campus portals, however, are designed for internal users and can take a more selective approach. they give student and faculty users a view of campus services that aligns with their priorities and places them in a convenient interface. in this sense, they are tools for information management. campus portals play a critical role in students’ daily lives because they do much more than simply present information. 
carden observes that campus portals have these key characteristics:
• allow a single user authentication and authorization step at the initial point of contact to be applied to all (or most) other entities within the portal;
• allow multiple types and sources of information to be displayed on a single composite screen (multiple “channels”);
• provide automated personalization of the selection of channels offered, based on each user’s characteristics, on the groups to which each user belongs, and possibly on the way in which the system has historically been used;
• allow user personalization of the selection of channels displayed and the look-and-feel of the interface, based on personal preferences;
• provide a consistent style of access to diverse information sources, including “revealing” legacy applications through a new consistent interface; and
• facilitate transaction processing as well as simple data access.2

in sum, enterprise portals use a combination of advanced technologies that have the ability to present both static and user-responsive information in a space reserved for affiliates of the university. these abilities present an attractive venue for libraries to leverage the capabilities of a campus portal to present users with dynamic, personalized instructional experiences—in a space where users are. this aligns with the principles of user-centered design, which emphasizes the need to empathize with users’ needs and perspectives. simplicity, efficiency, convenience, and responsiveness to each user’s individual circumstances are critical.3 the idea of presenting libraries’ content through a campus portal is not a new one.
stoffel and cunningham surveyed libraries in 2004 and, while finding that “library participation in campus portals is . . . relatively rare,” of the sixteen self-selected responding campuses, ten had a library tab or a dedicated library channel within their campus portal, while two more had a channel or tab under development.4 the types of library integration described in most examples consisted of using the portal’s campus authentication to link to a user’s library account and view borrowed books, fines, holds, and announcements. while resources like federated searches, research guides, and lists of journals and databases appeared in some respondents’ portals, they largely appeared as static content rather than responding to the user’s profile. since 2004, portals have remained a core part of the university of vermont’s information delivery system, but portal integration remains relatively rare among libraries and most have done little to integrate new tools such as research guides or develop instructional content that leverages a portal’s user-responsive design. as a result, there is little in the literature on libraries’ integration of content into campus portals, but a small number of case studies provide proof of concept, such as lehigh university, california state university-sacramento, and arizona state university.5 these case studies also illustrate the importance of cross-campus collaboration. our project required some critical elements, specifically access to the campus portal and a method for publishing content. the projects described in the case studies were successful partly because they were able to apply advanced programming expertise that was not available to our group, such as api coding. instead, our group was able to obtain these critical inputs through a partnership with the university of vermont registrar’s office. 
at the university of vermont, the campus portal uses the banner product licensed from ellucian and is branded as “myuvm.” it is administered by the registrar’s office. librarians have observed that it is central to students’ academic lives. students go to myuvm as their pathway to many of the online services and tools that they use. they go there to check email, log in to the learning management system (lms), check grades, add, drop, or withdraw from courses, check their schedule, and more. they go there to carry out tasks.

figure 1. screenshot of myuvm (https://myuvm.uvm.edu as it was on march 1, 2019).

the importance of myuvm is communicated to university of vermont students at orientation. in this way, first-year students learn at the earliest point, even before their academic programs begin, that the portal is their primary gateway for access to campus academic services. this shapes their view of the services available to them and how those services are organized. it also shapes how they reach those services and how they interact with them. at the same time, the selective principle underlying the campus portal means that if something is not present, it is less visible and less accessible, and there is a risk of signaling to students that it is not important to their daily lives or their academic performance.

methods
the characteristics of campus portals and their contents motivated instruction librarians to explore the possibility of integrating library services into myuvm. in 2014, the university of vermont libraries’ educational services working group—a small cross-libraries group of librarians who work on a variety of projects supporting classroom instruction and research assistance—began by defining the desirable scope of possible portal content.
the educational services working group quickly determined that library content included in the portal should be designed to conform with the principle of priority-based selectivity employed across the portal as a whole. this content should not attempt to represent the full suite of library information and services available. this would replicate the websites of the three libraries on campus and would risk creating overload and disorientation, in a similar way to institutional websites. it is common for actionable and instructional material to become buried beneath links on a library homepage, and the homepages of our three libraries’ websites are no different. our hope was to reposition selected instructional content such as research guides, databases-by-subject, chat reference, and liaison librarian contacts in a venue with which students are used to interacting. the goal of the project was the strategic positioning of dynamic, responsive information about research services in a venue with which students frequently interact. research librarians would select and organize the most important and pertinent instructional content. such selectivity fit well within the portal’s principle for curating content: high-use tools and services that directly support students’ priorities. thus the objective for this project would not be the re-creation of the library websites within myuvm. it was also determined that the scope would exclude content that might be considered marketing or engagement for its own sake, for the same purpose of minimizing users’ cognitive load and helping them to quickly find the features they need.
the myuvm developers in the registrar’s office were enthusiastic about working with us on this project, which partly reflects an increased attention across campus to equitable access to student services for all users—something that is important for its own sake, but also for the purposes of accreditation. following preliminary discussions in early 2018, myuvm developers created a test “libraries” page, equivalent to a full screen of content, and assigned to our group the privileges necessary to view it in the myuvm test environment. each page in myuvm is composed of a series of content boxes or channels. in developing our new page, our task was to develop content for the desired channels. we began our process for composing the page with a card-sorting exercise that identified priorities for the content that should be highlighted. the participants were the group’s members, in order to expedite initial decisions about content that could be tested with users at a later point in the project. items that figured prominently in this process were the libraries’ “ask a librarian” service, research guides, and search tools (discovery layer, databases, and journal directory). this confirmed that our group’s priorities centered on users’ transactional interactions with library services and not merely the one-way promotion of library information. the results of the card sorting were then translated into a wireframe (see figure 2). 
each square in the wireframe represented a channel for which we would need to create the appropriate content:
• ask a librarian (contact details for the libraries’ research assistance services)
• research guides (subject and class guides)
• search our collections (search tools for the discovery layer, databases, and journal directory)
• research roadmap (the libraries’ suite of tutorials on foundational research skills)
• featured content (a channel for rotating or temporary time-specific content)
• libraries (a box with a link to each of the three libraries on campus; we later added a channel for each library)

the wireframe also envisaged the inclusion of a pop-out chat widget.

figure 2. wireframe for library content.

as noted, the project needed a process that would enable our group to create and publish this content autonomously, but without requiring advanced programming skills on our part. we learned that myuvm is capable of publishing content pushed from a webpage by using its url. this meant that we could create content in libguides, a platform with which our group was very familiar, and then push the content of an individual libguide box to a myuvm channel simply by providing the libguide box urls to the portal developers. this method offers several advantages. importantly, it meant that our group had direct control of the box content and was able to publish it without needing the myuvm developers to review and authorize every edit.

those involved in this project faced important decisions early in the process regarding which resources we deemed essential for inclusion and best suited to this new online context.
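the url-based publishing approach described above can be sketched in python. this is a minimal illustration under stated assumptions, not the myuvm or libguides implementation: the box url, the `render_channel` helper, and the wrapper markup are all hypothetical, and the sketch simply fetches whatever html a url returns and wraps it in a titled channel.

```python
import html
from urllib.request import urlopen


def fetch_html(url, timeout=10):
    """Fetch a page and decode it using the charset the server declares."""
    with urlopen(url, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset)


def render_channel(title, box_url, fetch=fetch_html):
    """Wrap content fetched from box_url in a titled channel container.

    The channel markup is a hypothetical stand-in for a portal template.
    """
    try:
        body = fetch(box_url)
    except OSError:
        # Network failures (urllib's URLError is an OSError) degrade gracefully.
        body = "<p>content temporarily unavailable</p>"
    heading = html.escape(title)
    return f'<section class="channel"><h2>{heading}</h2>{body}</section>'


# Usage with a stand-in fetcher, so the sketch runs without network access.
def fake_fetch(url):
    return "<p>chat with a librarian</p>"


print(render_channel("ask a librarian", "https://example.edu/box/1", fake_fetch))
```

because the channel only stores a url, whoever edits the source box controls what appears in the portal, which mirrors the autonomy the group describes. the fallback branch is one possible design choice for keeping a page usable when a source is unreachable.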
once items were selected, it was important to keep user behaviors in mind as we prioritized “above the fold” content. students are used to quickly popping into the portal, finding what they need, and popping out. we tried to place interactive content that fit this use pattern in high-visibility places and moved content that required more sustained reading and attention further down the page. a challenge faced during the design process was our campus’s lack of a unified, cross-libraries web presence. the three libraries on our campus have separate websites, but the university of vermont portal required that we present a unified “libraries” presence. in some cases, such as links back to library webpages, we were easily able to treat the three libraries separately. in other cases, such as our research guides, we were able to merge resources from multiple libraries. in still other cases, such as our chat widgets, we had to make decisions about which library’s resource would be featured and which other versions would be secondary. the prototyping and testing phases revealed that some content needed to be adjusted in order to display in myuvm as desired. libguides’ tabbed boxes and gallery boxes did not display correctly. also, some style coding inherited from the libguides boxes needed to be adjusted in order to display cleanly. one item, “course reserves,” was present in the wireframe but not the page at the time of implementation. we continue to work on the development of a widget for searching “course reserves” holdings. the version of the “library” page at the time of going live is shown in figure 3.

figure 3. screenshot of the “library” page in myuvm.

the “research guides” channel has a dropdown menu for subject guides and another for class guides. these menus were created using libguides widgets, meaning that they update automatically as guides are published and unpublished, and do not require any manual maintenance.
the “search our collections” channel includes three access points to the libraries’ collections. this contrasts with the libraries’ websites, which display only the discovery layer search box. the latter approach has the advantage of promoting one-stop searching, but also the disadvantage of overwhelming users with non-relevant results. channels on the left side of the page are less dynamic and interactive. at the top, links to the three libraries on campus provide highly visible quick access for students looking for the libraries’ websites. similarly, the “ask a librarian” channel quickly gets students to reference and consultation services at their home library. the “you should know” channel provides a space for rotating content to be changed based on time-of-year, events on campus, or other perceived student needs.

results
the “library” page in myuvm went live in january 2019, at the same time that spring semester classes began. our preliminary review of results from the semester, based on data collected from myuvm, libguides statistics, and google analytics, has identified several positive outcomes. myuvm data showed that there were 18,891 visits to the “library” page during the period from mid-january to the end of march, a period of eleven weeks when classes were in session. this volume of traffic substantially exceeded our group’s expectations for the first months following implementation, during a period when we were only beginning to promote awareness of the page. data also showed that usage during this period was generally consistent. the most significant variation in traffic was a small peak in late february that corresponded with a high point in the level of library instruction.
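to put the reported traffic in scale, the figures above can be converted to rough averages. the short python calculation below uses only the two numbers given in the text (18,891 visits over eleven weeks); treating the period as exactly 77 days is an assumption, and the averages say nothing about how visits were actually distributed.

```python
visits = 18_891      # visits to the "library" page, as reported in the text
weeks = 11           # length of the reporting period, as reported in the text
days = weeks * 7     # assumption: the period is exactly eleven full weeks

per_week = visits / weeks
per_day = visits / days

print(f"about {per_week:.0f} visits per week, about {per_day:.0f} visits per day")
```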
libguides statistics showed an overall increase in usage of subject guides, though it is not possible to attribute this to the myuvm project with complete certainty. in addition, we observed that for many of our guides during this period, myuvm was among the top referring sites. libguides statistics also recorded unexpectedly large increases in usage for the “research roadmap” that we attribute primarily to the myuvm project. four sections of the “research roadmap” experienced increases of more than 100 percent during the january-march period. the research roadmap’s “more help” page showed a 65 percent drop in visits, but a possible explanation for this is that the highlighting of sections in myuvm is providing more-immediate help to our users in finding what they need and promoting independent use of instructional materials by students. libchat statistics indicated a significant increase in chat reference transactions at howe library, the university of vermont’s central library: a 23 percent increase over the count for the fall 2018 semester, with the implementation of the myuvm project being the only reasonable explanation. all initial data appear to show that users are finding and continuing to use the “library” tab in the portal. they are discovering guides and using the embedded chat widget. we plan to gather more usage data for other channels on the page to better inform our picture of what users are doing once they find and view the “library” tab. as campus portals have become a ubiquitous part of university life, revisiting the library’s role in these portals seems worthwhile, especially given that commonplace design tools like libguides dramatically lower the technological acumen needed for creating content.
future directions
the next step for this project is to leverage the ability of a campus portal to create a myuvm homepage library channel that customizes the display of content, based on unique user characteristics. when the user logs in, they are routed to the portal’s landing page, which is dynamically created based upon their student or faculty status, enrollment in a college or school, level of study (graduate/undergraduate), or number of years attending the university of vermont. this page has the ability to conform to the user in even more granular ways and dynamically display content based upon their major or other demographic categories such as study abroad status, veteran status, or first-year students. by leveraging the portal’s ability to display user-specific content, the university of vermont libraries have the ability to customize instructional content tailored to a user’s information needs and place that content in a channel that will display alongside other channels on the myuvm homepage. a first-year history major’s library channel could contain tutorials on working with primary sources, a link to their liaison librarian, links to digitized newspaper collections, and help guides for chicago citation style. a graduate student in nursing might see information about evidence-based practices for developing a clinical question, help guides for using pubmed and cinahl, and resources for point-of-care. a faculty member in psychology might find tutorials for creating alerts in their favorite journals, information about copyright and reserves material, or information about citation-management software. in each case, the portal pushes to each user the resources and assistance that best fit their specific need, as informed by the librarians best equipped to address that need.
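the profile-driven selection described above can be sketched as a small rule table in python. this is an illustrative sketch only: the `UserProfile` fields, the rule format, and the default channel are assumptions made for the example, and the resource lists simply restate the three scenarios in the text rather than any actual myuvm configuration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class UserProfile:
    status: str                    # "student" or "faculty"
    level: Optional[str] = None    # "undergraduate" or "graduate"
    year: Optional[int] = None     # years attending, for students
    subject: Optional[str] = None  # major or department


# Each rule pairs required profile values with the content to display.
RULES = [
    ({"status": "student", "level": "undergraduate", "year": 1,
      "subject": "history"},
     ["primary sources tutorials", "liaison librarian",
      "digitized newspaper collections", "chicago citation style guides"]),
    ({"status": "student", "level": "graduate", "subject": "nursing"},
     ["evidence-based practice: developing a clinical question",
      "pubmed and cinahl guides", "point-of-care resources"]),
    ({"status": "faculty", "subject": "psychology"},
     ["journal alert tutorials", "copyright and reserves",
      "citation-management software"]),
]

# Hypothetical fallback shown when no rule matches.
DEFAULT_CONTENT = ["ask a librarian", "research guides"]


def select_content(profile):
    """Return the content list of the first rule the profile satisfies."""
    for conditions, resources in RULES:
        if all(getattr(profile, field) == value
               for field, value in conditions.items()):
            return resources
    return DEFAULT_CONTENT


nursing_grad = UserProfile("student", level="graduate", subject="nursing")
print(select_content(nursing_grad))
```

keeping the rules as data rather than code is one way liaison librarians could review and update the mappings without programming, though how the portal itself would store such rules is not specified in the source.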
this last step of placing dynamic content on the myuvm homepage will require a great deal of coordination with liaison librarians both to identify the most pertinent disciplinary information to place in the portal and to identify the times of year when certain information is most relevant. to keep portal content dynamic and pertinent to users, a system will need to be created for releasing and removing content on a regular basis and this scheduling of content will require the input of liaison librarians. the educational services working group will need to manage this scheduling, as well as the enforcement of portal design conventions in coordination with the myuvm developers. although this management may end up being complex, it is not insurmountable, and our next steps will be both to create a system for content creation and management, and to begin to create test content for a sample of user groups. we also plan to gather more data and expand our analytics capabilities to assess how users are using content on the myuvm “library” page and examine which features are most popular, how much traffic is being driven back to our websites, and how users are interacting with the features on the page.

conclusion
our project has confirmed our initial inclination that students go to myuvm as a tool for finding inter-campus resources. also, faculty have reported accessing library resources through the portal and directing their students to that pathway as well. the immediate high use and consistency of use indicate that we have placed our selected library resources in a high-traffic venue. instead of attempting to coax students to our web outpost in the wilds of the internet, we have placed an exit ramp from a highway they already travel.
this has proven overwhelmingly effective and confirms, on our campus at least, the literature from the mid-2000s pointing out the opportunity created for libraries by campuses’ institutional adoption of portal systems. in all, the project has been a worthwhile venture for the university of vermont libraries. we have observed immediate use and better-than-expected levels of traffic, as well as continued use throughout the semester. it appears that once students wear a path to resources in myuvm, they are continuing to use that path as a way to access library content. we look forward to further customizing that content in the near future.

acknowledgements
we gratefully acknowledge david alles, portal developer, and naima dennis, senior assistant registrar for technology, in the university of vermont office of the registrar, for their contributions to the design and development of this project.

endnotes
1 scott garrison, anne prestamo, and juan carlos rodriguez, “putting library discovery where users are,” in planning and implementing resource discovery tools in academic libraries, ed. mary pagliero popp and diane dallis (hershey, pa: information science reference, 2012), 391, https://doi.org/10.4018/978-1-4666-1821-3.ch022; bruce stoffel and jim cunningham, “library participation in campus web portals: an initial survey,” reference services review 33, no. 2 (june 1, 2005): 145–46, https://doi.org/10.1108/00907320510597354.
2 mark carden, “library portals and enterprise portals: why libraries need to be at the centre of enterprise portal projects,” information services & use 24, no. 4 (2004): 172–73, https://doi.org/10.3233/isu-2004-24402.
3 ilka datig, “walking in your users’ shoes: an introduction to user experience research as a tool for developing user-centered libraries,” college & undergraduate libraries 22, nos. 3–4 (2015): 235–37, https://doi.org/10.1080/10691316.2015.1060143; steven j.
bell, “staying true to the core: designing the future academic library experience,” portal: libraries and the academy 14, no. 3 (2014): 369–82, https://doi.org/10.1353/pla.2014.0021.
4 stoffel and cunningham, “library participation in campus web portals,” 145-46.
5 tim mcgeary, “mylibrary: the library’s response to the campus portal,” online information review 29, no. 4 (2005): 365–73, https://doi.org/10.1108/14684520510617811; garrison, prestamo, and rodriguez, “putting library discovery where users are,” 393-94.
communications
automation and the service attitudes of arl circulation managers
james r. martin: university of rochester library, rochester, new york.
the circulation function in our large academic libraries has undergone two important transformations since the turn of the century. the first of these is departmentalization; the second, automation. the departmentalization of the circulation function has tended to separate the circulation department from the library's educational and information functions, the more "professional" aspects of librarianship. laurence miller makes this point in his dissertation, "changing patterns of circulation services in university libraries," which focuses on the rise of circulation departmentalization.1 miller surveyed large academic libraries to determine if certain services (reference, interlibrary loan, orientation, catalog assistance) were being withdrawn from the circulation function.
after verifying a withdrawal of these services and identifying them as the "professional" ones, miller drew the conclusion that circulation is therefore suspect as a professional activity.2 his conclusions are generally held, as robert oram suggests:
until recently, librarians have been reluctant to deal with circulation problems on an organized basis. the belief that circulation was, in part at least, custodial and clerical rather than managerial and professional underlies much of the reluctance to solve mutual circulation problems through a professional group.3
paralleling this change in the circulation function's organizational setting, the mechanization of the circulation process has continued to move from the laborious and slow use of manual procedures and book cards toward the immediate updating and record keeping of the online system. circulation automation has passed from the early days of simply mechanizing files (represented by the batch system) to the present, where libraries have the potential capacity to perform the complete circulation control process with real-time systems.4 sophisticated online systems have begun to truly control the complete circulation function.
the metamorphosis of circulation automation, from simple mechanization to full computerization, has had a tremendous impact on the technical side, the processes, of the circulation department. likewise it may well have had an impact on the service attitudes, priorities, and leadership of the department. the level of automation may relate to the circulation manager's attitudes and priorities, and in the words of an american library association committee, "the impact of automation might change the image of the circulation librarian."5 as it automates, gaining control over its own processes, the circulation department and its manager may actually become more responsive to its users: more service oriented, more "professional."
in february 1980, a questionnaire was sent to circulation managers of all the ninety-eight academic libraries that hold membership in the association of research libraries.6 it sought (1) to identify the degree and state of automation of the circulation function, classified by the three system categories of manual, batch, and online systems, and (2) to capture opinions on the circulation manager's view of his management role and his attitudes on service issues and user demands. these attitudes were related to the three types of systems. seventy-six questionnaires were returned, for a 78 percent response rate.
circulation department characteristics
circulation departments ranged in size from 4 to 78 fte employees. the average department size was 18, the median 14.25. the number of students employed ranged from 0 to 175. twenty-nine percent of managers said staffing was not adequate and 45 percent said they had to depend too heavily upon students. fifty-seven percent of managers of manual systems responded that they had to depend too heavily upon students, versus 27 percent of batch and 50 percent of online managers. (because of variations in what is counted, transaction volume figures are not particularly informative.)
circulation system characteristics
the seventy-six responding libraries reported approximately thirty-two different system configurations. thirty-nine percent of these systems were manual, 34 percent were batch, and 26 percent were online. nineteen percent of the total were manual mcbee systems and 15 percent were libs100 online systems. manual systems had been in use an average of twenty-six years, batch systems an average of eight years (range: ten months to eighteen years), and online systems an average of three years (range: three months to eight years).
circulation manager characteristics
typically, the circulation manager in an arl library is the head of a department.
arl circulation managers had held their positions from six months to twenty years. five years was the average, but 68 percent listed five years or less. gender was evenly distributed: thirty-eight males and thirty-eight females. the managers of manual systems were 43 percent male/57 percent female, those of batch systems were 54 percent male/46 percent female, and those of online systems 55 percent male/45 percent female. seventy percent of all managers had an mls, and 30 percent did not; 40 percent of managers of online systems did not have an mls. a majority of circulation managers (57 percent) reported spending over 25 percent of their time on matters outside of strictly circulation concerns. in fact a substantial minority, 23 percent of all managers, spent over 50 percent on extracirculation matters.
satisfaction with circulation system
as a group, arl circulation managers are not satisfied with their systems, as table 1 shows. online-system managers consistently rate their systems most highly. asked if their systems were "close to ideal," only 17 percent of all respondents were affirmative. only 3 percent of manual-system managers agreed that their system was "close to ideal" as compared to 12 percent of batch managers and 45 percent of online managers. hidden in these averages is the fact that three managers gave their systems perfect scores on all four questions and those systems were all online: geac, libs100, and an ibm-based online system. (table 2 summarizes responses on the four system-performance statements.)
hardware, software, and downtime
circulation managers with automated systems also reported on their experience with equipment, software, and downtime. batch-system managers were more satisfied with hardware and software (74 percent for both) than were online managers (60 percent satisfied with hardware and 65 percent with software).
however, open-ended questions revealed that dissatisfaction with online-system hardware and software centered around limitations of the libs100 system (used by 55 percent of online-system managers). the libs100 system was panned for "inflexible software," "poor fines system," and "lack of reserve book features." (these are all long-recognized limitations that were partially addressed in the relatively recent release 24.) the downtime situation was more satisfactory, however, for online managers than for batch managers. seventy-five percent of online managers reported that downtime was not a problem, as against more than 63 percent of batch-system managers.
journal of library automation vol. 14/3 september 1981
table 1. responses by type of system (n = 30 manual, 26 batch, 20 online)
              strongly                 no                       strongly
              agree       agree        opinion     disagree     disagree
"our circulation system is completely adequate"
  manual      1 (3%)       4 (13%)     1 (3%)      12 (40%)     12 (40%)
  batch       1 (4%)       5 (19%)     1 (4%)      13 (50%)      6 (23%)
  online      3 (15%)      7 (35%)     1 (5%)       6 (30%)      3 (15%)
"our circulation system is reliable"
  manual      1 (3%)      15 (50%)     1 (3%)      10 (33%)      3 (10%)
  batch       3 (12%)      9 (35%)     -           11 (42%)      3 (12%)
  online      5 (25%)     11 (55%)     -            3 (15%)      1 (5%)
"our circulation system's records are very accurate"
  manual      2 (7%)       7 (23%)     2 (7%)      16 (53%)      3 (10%)
  batch       2 (8%)      12 (46%)     -            9 (35%)      3 (12%)
  online      4 (20%)     10 (50%)     -            6 (30%)      -
"our circulation system is close to ideal"
  manual      -            1 (3%)      -            7 (23%)     22 (73%)
  batch       -            3 (12%)     2 (8%)       8 (31%)     13 (50%)
  online      3 (15%)      6 (30%)     3 (15%)      4 (20%)      4 (20%)
table 2. summary of responses on four system questions (detail given in table 1)
                                standard    minimum   maximum
              mean*    median   deviation   value     value     variance
  manual       9        9       3.27        4         16        11
  batch       10.08     8.5     3.81        4         18        15
  online      13.45    14       4.57        5         20        21
*20 = strongly agree, 16 = agree, 12 = no opinion, 8 = disagree, 4 = strongly disagree.
service attitudes
respondents were asked to mark attitude statements on a five-point scale: "strongly agree," "agree," "no opinion," "disagree," and "strongly disagree."
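the composite scores in table 2 can be checked against the counts in table 1. the sketch below assumes (the article does not say so explicitly) that each respondent's score is the sum of four item scores, with 5 for "strongly agree" down to 1 for "strongly disagree," which gives the 4-to-20 range in the table 2 footnote. the counts are one reading of table 1, chosen so that every row sums to the group size (30, 26, or 20); the variable and function names are illustrative only.

```python
# Item weights on the five-point scale.
# Assumption: 5 = strongly agree ... 1 = strongly disagree.
WEIGHTS = (5, 4, 3, 2, 1)

# Counts per statement, in column order:
# strongly agree, agree, no opinion, disagree, strongly disagree.
# Statement order: adequate, reliable, accurate, close to ideal.
TABLE_1 = {
    "manual": [(1, 4, 1, 12, 12), (1, 15, 1, 10, 3), (2, 7, 2, 16, 3), (0, 1, 0, 7, 22)],
    "batch": [(1, 5, 1, 13, 6), (3, 9, 0, 11, 3), (2, 12, 0, 9, 3), (0, 3, 2, 8, 13)],
    "online": [(3, 7, 1, 6, 3), (5, 11, 0, 3, 1), (4, 10, 0, 6, 0), (3, 6, 3, 4, 4)],
}


def item_mean(counts):
    """Mean item score for one statement, weighted by response counts."""
    respondents = sum(counts)
    return sum(w * c for w, c in zip(WEIGHTS, counts)) / respondents


def composite_mean(system):
    """Mean composite score: the sum of the four item means for one system type."""
    return sum(item_mean(counts) for counts in TABLE_1[system])


for system in TABLE_1:
    print(system, round(composite_mean(system), 2))
```

under these assumptions the recomputed means are 9.0 for manual, 10.08 for batch, and 13.45 for online systems, which agree with the means reported in table 2.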
attitude statements fell into four categories: (1) specific service concerns, (2) the importance of the managerial role, (3) user problems, contacts, and complaints, and (4) user demands and expectations. the averages of the last three groups were used to explore the question of association between level of automation and manager service attitudes (see table 3).
specific service concerns
ninety percent of circulation managers agreed that "speed of service is very important to users," and no online-system manager disagreed. forty-three percent of manual-system managers agreed that "control of circulating books tends to be inadequate." this compares to 16 percent of batch managers and 15 percent of online-system managers. asked whether "users tend to expect more service than the department can give," 56 percent of manual managers agreed, as did 46 percent of batch managers and 40 percent of online-system managers.
attitudes toward management role
the study found that circulation managers are uniformly strong in their affirmation of the importance of their role, with a slight tendency for online managers to be more affirmative. in fact, 100 percent of respondents agreed with the statement that the "management of the circulation function is important." ninety-three percent agreed that "circulation management should rank high among the library's priorities." ninety-five percent disagreed with the negative statement that "circulation management offers little opportunity for the exercise of initiative." ninety-four percent of all managers disagreed that "circulation management lacks complexity."
attitudes toward user problems, contacts, and complaints
the study found that circulation managers are uniformly strong in their desire to respond to user complaints and problems, but with a slight tendency for online managers to be more favorable to the user. one hundred percent of online managers regarded user contacts as pleasant, as did 93 percent of manual and batch managers. ninety-five percent of online managers, 92 percent of batch managers, and 87 percent of manual managers affirmed that patron contact provides the challenge in circulation work. eighty percent of online managers and 73 percent of manual and batch managers rejected the statement that "complaints tend to be unfounded." online managers were also more likely to favor the user by thinking that "complaints are most often substantive": 65 percent agreed, as compared to 50 percent of manual managers and 48 percent of batch managers. ninety percent of online managers disagreed that users "complain far too much," compared with 84 percent of batch managers and 79 percent of manual managers.
attitudes toward user demands and expectations
circulation managers are generally favorable in their attitudes toward user demands and expectations. several statements in this area, however, ran contrary to the tendency of online managers to agree slightly more with attitudes favorable to the user than managers of batch and manual systems. for example, while 93 percent of manual-system managers and 85 percent of batch managers agreed that "the circulation department should be oriented towards users' expectations," only 70 percent of online managers did. on the statement, "users should be more tolerant of limitations in circulation services," 34 percent of manual managers disagreed, compared with 40 percent of batch managers and 20 percent of online managers.
table 3. attitude responses, averages
                                            manual   batch   online
  management role (9 questions)             4.38     4.34    4.48
  demands and expectations (6 questions)    3.48     3.52    3.46
  contacts and complaints (6 questions)     3.88     3.9     4.03
  totals                                    3.913    3.92    3.99
5 = most positive response. 1 = least positive response.
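the totals row of table 3 appears to be the simple (unweighted) mean of the three group averages for each system type. the article does not state how the totals were formed, so the following is only a check of that assumption; the names are illustrative.

```python
# Group averages from table 3, in the order:
# management role, demands and expectations, contacts and complaints.
GROUP_AVERAGES = {
    "manual": (4.38, 3.48, 3.88),
    "batch": (4.34, 3.52, 3.90),
    "online": (4.48, 3.46, 4.03),
}


def total(system):
    """Unweighted mean of the three group averages (assumed definition of 'totals')."""
    values = GROUP_AVERAGES[system]
    return sum(values) / len(values)


for system in GROUP_AVERAGES:
    print(system, round(total(system), 3))
```

rounded to three places, this gives 3.913, 3.92, and 3.99, the same figures as the totals row. note that the groups contain different numbers of questions (9, 6, and 6), so a mean weighted by question count would give slightly different values.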
these responses, which run against the trend of the online manager as more user oriented, may be due to the fact that the study was not completely successful in differentiating between responses based on general attitudes and those based directly on the specific system in use. in other words, the relative quality of each circulation system or even the "bugs" peculiar to a specific system may affect one's attitude toward the user's need to tolerate the limitations of that system. manual-system managers know the limitations on their service are keyed to inefficient systems, whereas online-system managers know their systems and services are already at a high level. this knowledge of the system in use colors service attitudes.
conclusion
the study found a depressed state of circulation-system development and support in arl libraries. seventy-four percent of circulation managers, on average, rated their systems negatively on basic system integrity, as shown in table 2. on the statements that their systems were ideal, adequate, reliable, and accurate, the thirty manual-system managers gave their systems an average score of 9, the twenty-six batch managers an average score of 10.08, and the twenty online managers an average of 13.45. recognizing the considerable constraints under which today's large academic libraries struggle, there is, nonetheless, room for criticism of library priorities.
this study must be viewed as only a first step (largely tentative and exploratory) in relating automation with service attitudes. it suggests that online systems may be associated with managers more positive in their view of the management role and more positive in their attitudes toward users than batch- and manual-system managers. further research would be useful at this point to compare levels of automation (manual, batch, and online) with circulation-staff service attitudes or those of patrons using the systems.
references
1.
laurence miller, "changing patterns of circulation services in university libraries" (ph.d. dissertation, florida state university, 1971), p.iii.
2. ibid., p.149.
3. robert oram, "circulation," in allen kent and harold lancour, eds., encyclopedia of library and information science, v.5 (new york: marcel dekker, 1971), p.1.
4. william h. scholz, "computer-based circulation systems: a current review and evaluation," library technology reports 13:237 (may 1977).
5. robert oram, "circulation," p.2.
6. james robert martin, "automation and the service environment of the circulation manager" (ph.d. dissertation, florida state university, 1980), p.22.
statistics on headings in the marc file
sally h. mccallum and james l. godwin: network development office, library of congress, washington, d.c.
in designing an automated system, it is important to understand the characteristics of the data that will reside in the system. work is under way in the network development office of the library of congress (lc) that focuses on the design requirements of a nationwide authority file. in support of this work, statistics relating to headings that appear on the bibliographic records in the lc marc ii files were gathered. these statistics provide information on characteristics of headings and on the expected sizes and growth rates of various subsets of authority files. this information will assist in making decisions concerning the contents of authority files for different types of headings and the frequency of update required for the various file subsets. the national commission on libraries and information science supported this work.
use of these statistics to assist in system design is largely system-dependent; however, some general implications are given in the last section of this paper. in general, counts were made of the number of bibliographic records, headings that appear in those records, and distinct headings that appear on the records.
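the three counts named above (records, heading occurrences, and distinct headings) can be illustrated with a toy example. the records and headings below are invented for illustration and are not drawn from the lc files; a heading is treated as distinct if it survives the removal of exact duplicates, and the function name is hypothetical.

```python
from collections import Counter

# Hypothetical bibliographic records, each reduced to the list of headings on it.
records = [
    ["twain, mark, 1835-1910", "mississippi river"],
    ["twain, mark, 1835-1910", "humorous stories"],
    ["library of congress", "mississippi river"],
]


def heading_counts(records):
    """Count records, heading occurrences, distinct headings, and single-use headings."""
    occurrences = Counter(heading for record in records for heading in record)
    return {
        "records": len(records),
        "headings": sum(occurrences.values()),
        "distinct": len(occurrences),
        "single_occurrence": sum(1 for n in occurrences.values() if n == 1),
    }


print(heading_counts(records))
```

in this sketch the "distinct" count corresponds to the size of an authority file built from the records, while "single_occurrence" shows that a distinct heading need not be one that appears only once.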
the statistics were broken down by year, by type of heading, and by file.
in this paper, distinct headings are those left in a file after removal of duplicates. distinctness will not be used to imply that a heading appears only once in a source bibliographic file, although distinct headings may in fact have only a single occurrence. thus, a file of records containing the distinct headings from a set of bibliographic records is equivalent in size to a marc authority file of the headings in those bibliographic records.
methodology
these statistics were derived from four marc ii bibliographic record files maintained internally at lc: books, serials, maps, and films. the files contain updated versions of all marc records that have been distributed by lc on the books, serials, maps, and films tapes from 1969 through october 1979, and a few records that were then in the process of distribution. the files do not contain cip records. a total of 1,336,182 bibliographic records were processed, including 1,134,069 from the books file, 90,174 from the serials file, 60,758 from the maps file, and 51,176 from the films file.
a file of special records, called access point (ap) records, was created that contains one record for the contents of each occurrence of the following fields in the bibliographic records:
journal of library automation vol. 14/2 june 1981
has shown that interactive television programs:
1. serve as an initial introduction to naive audiences of what a truly interactive system is all about;
2. are difficult to implement;
3. really aren't democratic;
4. are basically polling devices.
it has been said that railroads went out of business because they insisted that they were in the railroad business and wouldn't admit that they were in the transportation business.
if cable operators insist that they are in the television business, they may well miss the opportunities that are possible in the communications business or, in fact, in the information business. by the same token, if libraries miss the significance of what cable television is bringing to their business, their role in the community will be diminished and libraries may go the way of railroads.
modern communications and computers offer an opportunity for libraries to become the information choice in their community. in the near future, applications such as the home book club may well be a way to provide increased accessibility of library services to library patrons, and to "condition" those patrons to the coming electronic nature of libraries. over the long term, libraries, if they have the courage and the foresight, can be the focus of the coming information and telecommunications revolution. the message is quite clear: opportunities abound.
references
1. john wicklein, "wired city, u.s.a.: the charms and dangers of two-way tv," atlantic monthly 243:35-42 (feb. 1979).
2. warner amex represents a newly formed corporation resulting from the merger of warner communications and american express.
3. jonathan black, "brave new world of television," new times 11:41 (24 july 1978).
4. ibid., p.49.
5. "warner cable's qube: exploring the outer reaches of two-way tv," broadcasting 95:28 (31 july 1978).
6. "two-way converters hot ticket at ncta exhibits," broadcasting 97:72 (26 may 1980).
an informal survey of the cti computer backup system
joseph covino and sheila intner: great neck library, great neck, new york.
in order to help decide whether or not to purchase computer backup systems from computer translation, inc. (cti),* for use when the clsi libs 100 automated circulation system is not operating, great neck library conducted an informal survey of libraries using both systems.
eleven institutions, including both public and academic libraries, responded to a brief questionnaire. they were asked what size cti system they had purchased and why, how easily it was installed, how well it performed, how it was maintained, and if clsi acknowledged that the addition of the backup did not affect their libs 100 maintenance agreements. before summarizing the responses, the structure of the two systems and how they interact should be outlined.
*cti is a profit-making company wholly owned by brigham young university. the cti backup system was originally developed to support the clsi installation at byu.
clsi libs 100
the clsi automated circulation system consists of a stand-alone minicomputer console with local and/or remote terminals connected to it through individual ports by means of electrical and/or dedicated telephone line hookups. when it operates, the terminals are online and interactive with the database, which is stored on one or more multiplatter disc packs.
cti backup
the cti backup system is based on an apple ii microcomputer with two minidisc drives, which take 5 1/4-inch floppy discs, a tv monitor, and a switching system that can be connected to the libs 100 console or its terminals. the cti system can also be used alone. when the libs 100 is down (inoperative), the cti system is connected to a terminal, and data is recorded on its discs for later dumping (data entry) into the database via a port connection. it appears to the public and the library's staff member operating the backup-terminal combination that the terminal is working. there is, however, no connection between the backup unit and the database in this mode. when the libs 100 is up (operating) once again, the backup is connected and data is automatically dumped. naturally the port cannot be used by both the clsi terminal and the backup unit at the same time without the addition of other hardware.
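the record-then-dump behavior described above is a store-and-forward pattern: transactions are held locally while the main system is down and replayed in order once it is back up. a minimal sketch follows. the class, its methods, and the transaction tuples are hypothetical and do not correspond to actual cti or clsi software; a plain list stands in for the circulation database.

```python
from collections import deque


class BackupUnit:
    """Hypothetical model of a backup unit that records transactions while offline."""

    def __init__(self):
        # Transactions recorded during downtime, oldest first.
        self._pending = deque()

    def record(self, transaction):
        """Store a transaction locally; nothing reaches the database in this mode."""
        self._pending.append(transaction)

    def pending_count(self):
        return len(self._pending)

    def dump(self, database):
        """Replay stored transactions into the database in their original order."""
        sent = 0
        while self._pending:
            database.append(self._pending.popleft())
            sent += 1
        return sent


# Usage: two checkouts recorded during downtime, then dumped when the system is up.
database = []
unit = BackupUnit()
unit.record(("checkout", "patron-1", "item-42"))
unit.record(("checkout", "patron-2", "item-7"))
sent = unit.dump(database)
print(sent, unit.pending_count(), database)
```

a first-in, first-out queue is used so that the database receives transactions in the order they occurred, which matters when the same item is checked out and returned during a single outage.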
the terminals attached to other ports may operate normally while dumping is completed. the clsi and cti software, which operate compatibly, are owned by the respective companies, not the library.
the responses
1. size of system: cti systems are available in two sizes, 32k and 48k. two libraries purchased the smaller system, nine purchased the larger system, and one purchased both. greater programming capabilities of the larger system were considered its greatest asset.
2. reason for purchase: five libraries indicated they use the backup for other purposes in addition to substituting for the libs 100 when it is down. among these other purposes were development of a community information database, personnel and financial reports and files, use as an rlin terminal, as a bookmobile terminal, and as an aid in converting short-title bibliographic records to expanded format.
3. installation: respondents were unanimous in having no problems with installation. seven did their own installation, while cti gave instructions over the phone. three were installed by cti, who also trained the library staff in its operation. one library indicated the accompanying documentation was enough to install the system without assistance.
4. performance: all eleven respondents were enthusiastic about system performance. some comments were, "it's the best thing since buttered popcorn," and "we love it dearly . . . it saves hours . . . works just fine." many commented on the slow dumping time as the biggest drawback, but noted that increased accuracy over manual entry and decreased pressure on their circulation staff during downtime were assets.
5. maintenance: backup system maintenance is not uniform. six respondents said that software was maintained by cti, but hardware was maintained by an apple dealer; or they were undecided about who would be responsible for hardware repairs.
a seventh library contracted with an apple dealer for hardware repairs, but was contending over software maintenance with cti. three libraries answered that cti was maintaining the system, but did not specify both hardware and software. the last respondent expected to take hardware repairs to an apple dealer and did not mention software.
6. clsi maintenance agreements: one library stated that they had written assurance from clsi that the installation of the backup system would not affect their libs 100 maintenance contract. three more said they had verbal assurances. five respondents indicated no assurances from clsi that the libs 100 contract was not affected. one library sent a copy of a clsi letter defining company policy in this area. it said, in part: "clsi does not prohibit the attachment of foreign devices to the systems. . . ." qualifications to this statement involved an institution's attempt to repair the libs 100 itself, to hold clsi responsible for damage resulting from the attachment of the device, or to have clsi maintain the device.
the great neck library decided to purchase two cti backup systems for use when the libs 100 is down. experience bears out the findings of the survey; i.e., it is easy to install the system with only telephone assistance; it works well, and, though data transmission to the main unit is slow, it is accurate and removes some of the desperation from a downtime situation. great neck library is also planning to use the apples for other functions, which, it is hoped, will be implemented soon.
multimedia catalog: com and online
kenneth j. bierman: tucson public library, tucson, arizona.
like many public libraries, the tucson public library (tpl) is closing its card catalog and implementing a vendor-supplied microform catalog. unlike most of these other libraries, however, the tpl microform catalog will not include location or holding information.
the indication of where copies of a particular title are actually available (i.e., which of the fifteen possible branch locations) will be available only by accessing a video display terminal connected to the online circulation and inventory control system.
conceptually, the tpl catalog will be in two parts with each part intended to serve different functions.1 the microform catalog (copies available in both film and fiche format) will fulfill the bibliographic function of the catalog. this catalog will contain bibliographic description and provide the traditional access points of author, title, and subject. the online catalog (online terminals are in place at all reference desks and a few public access terminals will also be available) will fulfill the finding or locating function of the catalog. this catalog will contain very brief bibliographic description, will be searchable only by author, title, author/title, and call number, and will contain the current status of every copy of every title in the library system (i.e., on shelf, checked out, at bindery, reported missing, etc.).
why did the tucson public library make this decision? there are two major reasons:
1. accuracy. the location information, if provided in the microform catalog, would always be inaccurate and out of date. assuming that the locations listed in the latest edition of the microform catalog were completely accurate when the catalog was first issued (an unrealistic assumption to begin with, as anyone who has ever worked with location information at a public library with many branches well knows!), the location information would become increasingly less accurate with each day because of the large number of withdrawals, transfers, and added copy transactions that occur (more than 100,000 a year). in addition, at any given time, one-quarter to one-third of the materials in busy branches are not on the shelf because they are either checked out or waiting to be reshelved.
thus, the microform catalog would indicate that these materials were available at specific branches when a significant percentage would in fact not be available at any given time. in short, even in the best of circumstances, easily half of the location information would be incorrect in telling a user where a copy of a title was actually available at that moment.
2. cost. a study done at the tucson public library indicated that close to half of the staff time of the cataloging department was spent dealing with location and holding information. this time includes handling transfers, withdrawals, and added copies. all of this record keeping is already being done as a part of the online circulation and inventory control system (the tucson public library has no card shelflist containing copy and location information but rather relies completely on the online file for this type of information). to "duplicate" the information in the microform catalog would cost an estimated $40,000 to $60,000 a year, and the information in the microform catalog would never be accurate or up to date for the reasons outlined above.
figure 1 is a brief summary of how the bibliographic system will work. would the system in figure 1 be improved if holdings were included in the microform catalog? on the surface, the obvious answer is yes: more information is
brown university library fund accounting system
robert wedgeworth: brown university library, providence, r. i.
the computer-based acquisitions procedures which have been developed at the library provide more efficient and more effective control over fund accounting and the maintenance of an outstanding order file. the system illustrates an economical, yet highly flexible, approach to automated acquisitions procedures in a university library.
the fund accounting system of the brown university library was initiated on the basis of a program developed in april, 1966.
subsequently, it was decided to implement the program in the fall of that year. the necessary in-house equipment, namely, an ibm 826 typewriter card punch and an ibm 026 keypunch, was placed on order along with new six-part order forms. about the same time an agreement was reached with the administrative data processing office of the university (tabulating) which would provide for rental time on their ibm 1401, 12k system with three magnetic disks and four magnetic tape-storage units. the services of a part-time programmer were also secured through this office. the system became fully operational on december 1, 1966.
the primary objective of the project was to establish more efficient and more effective control over the approximately 150 fund accounts administered by the order department of the university library. in addition, it seemed that a number of by-products were possible. among these were statistical information for management and a file of bibliographical records from which a new accessions list could be drawn on a regular basis.
journal of library automation vol. 1/1 march, 1968
the system was to accommodate the payment of all invoices to be posted against the aforementioned accounts. these include monographic and serial publications as well as supplies and equipment. however, records of outstanding orders were to be maintained for monographic publications only. although the basic routines were to remain much the same, some minor adjustments were necessary to accommodate the new machine system. also, several files of dubious value to the new system were to be maintained in order to gain empirical evidence as to their worth.
this report is presented as a record of an attempt to develop an economical, yet highly flexible approach to the automating of acquisitions procedures of a university library.
perhaps the scope of the computer-based acquisitions procedures at brown may be determined more easily relative to three recently reported systems of varying complexity. one of the best surveys of automated university library acquisitions systems appears in the project report of the university of illinois, chicago campus (1). however, two of the systems summarized here are more recent. the university of michigan was included in the illinois literature survey, but the first full description to be published appeared just recently.

automated acquisitions procedures have been in operation at the university of michigan library since june, 1965 (2). the system features a list of items produced by computer from punch cards in which order information has been recorded. this list is produced on a monthly basis with semi-weekly cumulative supplements. the computer also produces status report cards. these are punch cards, containing summarized order information, which travel with the book and at appropriate processing stages are coded and returned to the computer in order to up-date the status code in the processing list. thus by checking the status code one can determine that a book has been received, received and paid, or cataloged. claim notices are automatically produced for items which remain on order for longer than the predetermined period. in addition to creating and maintaining full financial records and compiling selected statistics, the system will produce specialized acquisitions lists on demand.

yale university library creates a machine readable record of a request before it is searched or ordered (3). as a result, the status-monitoring system is almost immediately effective. an ibm 826 typewriter card punch is used to type purchase orders, and the ibm 357 data collection system is used to monitor the progress of an item through the system. the process information list is produced weekly with daily supplements.
automatic claiming and financial record maintenance are also products of the system. moreover, numerous statistics are planned for management purposes.

the fund control system reported by the university of hawaii features financial accounting for book purchases based on pre-punched cards corresponding to purchase orders typed (4). the list price is keypunched into the appropriate card in a separate operation and used to encumber funds. upon receipt of the book the invoice is matched with the appropriate punch card, and after actual cost is keypunched the card is used to up-date the account.

the michigan and yale systems incorporate all of the major features of operational university library automated acquisitions systems. foremost among them are the list of items being processed and its coordinate monitoring system. the cost of creating and maintaining such a file was prohibitive for brown. brown, michigan and hawaii generate a machine record after searching. unlike michigan and yale, brown and hawaii do not have "total" acquisitions systems plans. at brown serials control is not included. at hawaii fund accounting is the only task of the system. also, brown differs from michigan and yale in that the claiming procedure merely notifies the department that certain items are overdue. the brown system is certainly not as economical as that of hawaii, but the use of the typewriter card punch creates a highly flexible and easily expanded system for the difference in cost.

manual files and procedures

the manual routines of the order department are based upon the maintenance of four basic files. the file documents are all parts of the six-part purchase order form. the outstanding order search file is an alphabetical card file representing unfilled orders, requests to search for items, and inquiries for bibliographical information.
this file is virtually independent of other routines, thus making it feasible for it to be merged with the file of items waiting to be cataloged. the processing file consists of outstanding orders filed first by book dealer, and second by order number. this file is used to check in shipments of books, to record reports on orders and to record claims. the numerical control file is an order number sequence file containing one copy of every order typed regardless of its ultimate disposition. it provides rapid access to information regarding retrospective orders. the fund file is a file of completed or cancelled transactions filed first by fund name and second by order number. the latter two files were thought to be of dubious value to the new system. however, it was agreed to maintain both for the time being.

in order to accommodate the fund accounting system, the procedures developed feature two basic routines based on the presence or absence of a unique order number.

unique order (figure 1)

[fig. 1. unique order procedure. abbreviations: kp = key punch; kv = key verify.]

items acquired in this fashion include purchases and solicited gifts. continuations, but not serials, are included. when a request is received in the order department, it is searched in the main catalog, the waiting catalog and the outstanding order file. if it is found to be neither in the library nor on order, it is then given to an order assistant who completes the bibliographical work, if necessary, and assigns a fund and dealer. if the price is listed in a foreign currency, the assistant converts it to u.s. dollars.
the request then proceeds to the typist. all unique orders are typed on an 826 typewriter card punch. as the typist fills in the six-part order form, pre-selected pieces of information are keypunched automatically. these fields are as follows:

order number
order date
source type (d for domestic, etc.)
fund number
list price
author
title
imprint
series

orders are proofread on the day after they are typed. the forms are separated and the outstanding order cards are filed immediately in order to detect duplicate orders. at this point the dealer slips are mailed and the numerical control slips filed. the processing file documents, each containing a fund slip, an l.c. order slip, and a cataloger's work slip on a separate perforation, are then filed pending the arrival of the books. also, the deck of ibm cards which has been weeded of voided orders goes to tabulating.

although books may be processed without invoices, the normal practice is to process after the arrival of the invoice. the processing file document is obtained and the cost, invoice date and the number of volumes are noted on the fund slip. if the item is a continuation, a supplementary fund slip is made and the original returned to the processing file with the receipt noted. the invoices are cleared and sent to the controller. the fund slips representing books received are sent to the keypuncher in order to up-date the accounts. in the meantime the books, along with the work slips and the l.c. order slips, are sent to the catalog department.

as the books are cataloged, the work slips noting any major bibliographical changes and the call number are returned to the order department. from these slips are punched bibliographical adjustment cards and an up-date record card containing the call number and coded for subject and location. the resulting bibliographical record forms the data base for the new accessions listing.
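the fields captured as a by-product of typing a unique order can be pictured as one record. the sketch below is a modern illustration only, not the card layout used at brown: the class name, field types and sample values are assumptions, and the article gives no column positions or field widths. the accounting subset follows the order card fields the article lists for the outstanding order file (order number, order date, source type, fund number, list price).

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class OrderCard:
    """Fields keypunched automatically while the order form is typed.

    Field names follow the article's list; types are assumptions.
    """
    order_number: str
    order_date: str
    source_type: str      # "d" for domestic; other codes are not all listed
    fund_number: str
    list_price: Decimal   # already converted to U.S. dollars by the assistant
    author: str
    title: str
    imprint: str
    series: str = ""


def accounting_fields(card: OrderCard) -> dict:
    """Return only the fields the outstanding order file needs."""
    return {
        "order_number": card.order_number,
        "order_date": card.order_date,
        "source_type": card.source_type,
        "fund_number": card.fund_number,
        "list_price": card.list_price,
    }


# Hypothetical sample values, for illustration only.
sample = OrderCard(
    order_number="000123",
    order_date="1966-12-01",
    source_type="d",
    fund_number="042",
    list_price=Decimal("7.50"),
    author="example author",
    title="example title",
    imprint="example imprint",
)
subset = accounting_fields(sample)
```

keeping the bibliographical fields (author, title, imprint, series) in the same record mirrors the article's plan to draw an accessions listing from the same keypunched data.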
no unique order (figure 2)

[fig. 2. no unique order procedure. n.b.: no record card will be made for gifts or exchanges.]

items acquired in this fashion include unsolicited gifts, exchanges, standing orders, etc. some continuations and all serials invoices are included. upon arrival, invoiced items without unique order numbers are searched. if they are duplicates they are returned for credit. if they are not duplicates, they are sent to the typist. catalog file slips are typed and by-product bibliographical and accounting records are punched. on the record card for accounting, the order number field is filled with nines. this signals the program that this entry is a receipt for which there was no unique order number.

the series of order numbers beginning with 900000 was originally reserved for assignment to our standing order agreements with presses, societies, etc. eventually, each will have its own order number. however, the last number of the series, 999999, will continue to be used for miscellaneous receipts. presently no accessions listing records are being generated for items without unique order numbers. however, all purchases without unique order numbers are processed with a series 9 order number.

serials

all serial invoices are handled as series 9 transactions with no attempt to record bibliographical information or volume counts. expenditures for serials are accumulated and entered as one transaction each time the accounts are up-dated. this decision was made in anticipation of the development of a separate serials control program.
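the series 9 convention described above can be sketched as a small classifier. this is an illustration under stated assumptions, not the autocoder logic: the six-digit width is inferred from the numbers 900000 and 999999 quoted in the text, and the function names and category labels are hypothetical.

```python
MISCELLANEOUS_RECEIPT = "999999"  # last number of the series, per the article


def is_series_9(order_number: str) -> bool:
    """True for six-digit order numbers in the range 900000-999999."""
    return (
        len(order_number) == 6
        and order_number.isdigit()
        and order_number[0] == "9"
    )


def classify(order_number: str) -> str:
    """Label an order number by the routine that would handle it."""
    if order_number == MISCELLANEOUS_RECEIPT:
        return "miscellaneous receipt"
    if is_series_9(order_number):
        # Numbers from 900000 up were reserved for standing order agreements.
        return "standing order"
    return "unique order"
```

a record card classified as anything other than "unique order" would skip the match against the outstanding order file, which is the behavior the article attributes to an order number field filled with nines.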
ibm 1401 files and procedures

the basic function of the computer program for the fund accounting system is to maintain current balances on the various library fund accounts and to maintain a file of outstanding orders exclusive of standing orders. although several correlative functions are distinct possibilities, the only additional function planned is a file of bibliographic records for the production of an accessions listing. figures 3, 4, 5 and 6 illustrate the major tasks to be performed by the system. the programming language used is autocoder.

fund balance forward file

a card file created at the beginning of each fiscal year having two card types.

1. fund group header card
a. group code
b. group name

this card assigns a unique code and name to categories of funds such as endowed, special, etc.

2. fund balance forward and appropriation card
a. fund group code
b. fund code
c. fund name
d. previous year balance forward
e. current income or appropriation
f. balance forward code
g. remaining previous year encumbrances

this card contains information used to establish the individual funds at the beginning of each year. the balance forward code directs the program to carry over excess funds to the next year, not to carry over excess funds to the next year, or to carry over a negative balance to the next year, thereby reducing cash balance resulting from the new income or appropriation. encumbrances are carried over to the next year in order to maintain an accurate net available at all times.

[fig. 3. fund file creation.]
[fig. 4. file maintenance.]
[fig. 5. fund accounts updating.]

library fund file

a magnetic tape file created from the fund balance forward file and containing three record types.

1. fund group header

2. fund record
a. fund group code
b. fund code
c. fund name
d. previous year balance forward
e. current income or appropriation
f. current expenditures
g. cash balance
h. amount encumbered
i. net available
j. volumes purchased
k. balance forward code

fund record fields a, b, c, d, e, h and k initially are taken from the corresponding fields in the fund balance forward card. current year expenditures and volumes purchased are preset to zero each year. cash balance is determined by the sum of the previous year balance forward and the current income or appropriation. amount encumbered will be preset to zero or taken from the fund card. net available is determined by the difference between cash balance and amount encumbered.

3. fund group trailer

this record is the last within each fund group and contains a summation of the quantitative fields in that fund group. it is used primarily for control purposes.

figure 4 illustrates the file maintenance program for the library fund files. this program permits the addition or deletion of a fund group code, changes to a fund group header, addition or deletion of a specific fund or changes to a specific fund. however, changes to quantitative fields are limited to those fields which are contained in the fund balance forward card. thus, net available may not be changed directly by file maintenance but may be changed by manipulating current income or appropriation.

the library fund file is a serial file maintained in ascending algebraic sequence on fund group code, fund code and fund record from major to minor respectively.

outstanding order file

a magnetic disk file created and up-dated by three card types.

1. order card
a. order number
b. order date
c. source type (d is domestic, f is foreign)
d. fund number
e.
list price

figure 5 illustrates the program which processes new orders. this program validates fund code, rejects duplicate order numbers and encumbers list price, thereby reducing net available.

2. record card
a. order number
b. invoice date
c. fund code
d. cost
e. continuation order code, if applicable
f. number of volumes

standing orders, blanket orders, serials, etc. are purchased without placing an order. consequently, a series 9 order number is assigned to these record cards. such cards will not match the outstanding order file by definition but will increase amount expended, decrease cash balance and net available and increase volumes purchased. all other record cards must match an existing order number on file. on continuations the record card for each part received produces a transaction as described above, except that the encumbrance remains unchanged until the final record card appears without the continuation order code.

3. adjustment card

this card may be submitted for either an order card or a record card. it is differentiated by a special code. its primary purpose is to correct a previous error or to effect a cancellation.

the outstanding order file is in ascending algebraic sequence by fund group, fund code and order number. all cards used in this program must be pre-sorted into this sequence.

printout products

the accumulated punch cards are processed on a bi-weekly schedule by the tabulating office. a file maintenance report (figure 4) is the first product of each run. it lists in detail any adjustments, additions, or deletions to the fund listing plus the results of such operations. at the end of the detailed report is a summary of the status of each active fund. copies of this latter report are distributed for desk use to all order assistants, the chief order librarian, and the librarian. the transaction register of fund activity (figure 5) lists each transaction posted to each fund for the inclusive period.
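the fund arithmetic and the order and record card rules described above can be sketched in a few lines. this is a minimal illustration, not a transcription of the autocoder program: the class and function names are hypothetical, the outstanding order file is modeled as a plain dictionary, and the release of the encumbrance when a non-continuation record card matches an order is an assumption implied by, but not spelled out in, the text.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Fund:
    previous_balance_forward: Decimal
    current_income: Decimal
    encumbered: Decimal = Decimal("0")
    expenditures: Decimal = Decimal("0")  # preset to zero each year
    volumes: int = 0                      # preset to zero each year

    @property
    def cash_balance(self) -> Decimal:
        # Balance forward plus income, reduced by each amount expended.
        return self.previous_balance_forward + self.current_income - self.expenditures

    @property
    def net_available(self) -> Decimal:
        # Difference between cash balance and amount encumbered.
        return self.cash_balance - self.encumbered


def place_order(fund, outstanding, order_number, list_price):
    """Order card: reject duplicates, then encumber the list price."""
    if order_number in outstanding:
        raise ValueError("duplicate order number")
    outstanding[order_number] = list_price
    fund.encumbered += list_price


def post_record(fund, outstanding, order_number, cost, volumes, continuation=False):
    """Record card: post the cost; series 9 numbers need no matching order."""
    series_9 = order_number.startswith("9")
    if not series_9 and order_number not in outstanding:
        raise KeyError("record card does not match an outstanding order")
    fund.expenditures += cost
    fund.volumes += volumes
    # Assumption: the encumbrance is released only on the final record card,
    # i.e. one that carries no continuation order code.
    if not series_9 and not continuation:
        fund.encumbered -= outstanding.pop(order_number)


fund = Fund(Decimal("100"), Decimal("900"))
outstanding = {}
place_order(fund, outstanding, "000123", Decimal("20"))
after_order = fund.net_available
post_record(fund, outstanding, "000123", Decimal("18"), 1)
post_record(fund, outstanding, "999999", Decimal("50"), 1)
```

computing cash balance and net available as derived properties, rather than storing them, keeps them consistent with the rule that file maintenance may not change net available directly.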
the assistant in charge of bookkeeping is the primary user of the transaction register and the detailed file maintenance report.

the delinquent orders report (figure 6) lists all past due outstanding orders according to two cycles. domestic orders are listed bi-monthly and foreign orders are listed quarterly. the listing is of the "tickler" variety, as it may not be necessary to ask reports on all of the items. an order will remain on the delinquent orders report until it is filled or cancelled.

[fig. 6. delinquent order listing.]

conclusion

as of october, 1967, the fund accounting system has been in operation for ten months. assessment of its effectiveness in terms of meeting the primary objective shows the system to be an immediate success. at this point costs are about the same for the manual system as for the present one. however, accounts which used to require from 25 to 30 man-hours per month are maintained with about 5 man-hours per month. our current equipment and processing costs run about $325 per month.

on the other hand, we have become aware of some shortcomings of the system. the addition of a currency conversion sub-routine would greatly expedite the many requests for foreign publications received daily. secondly, the addition of a dealer code would make the delinquent orders list much more useful. at present a user must search the numerical file for the order to ascertain the dealer. the processing file copies are then pulled to go to the typist who asks reports on delinquent orders. a revised program incorporating both of these features is being planned and will be operational early in 1968.

the proposed accessions listing has been rejected as a by-product of this system primarily because of the limited character set available on our ibm 1403 print chain and the excessive length of the average listing.
the time and expense of storing and up-dating the bibliographical record for each new acquisition should, in our estimation, result in a more palatable end-product. we have, therefore, temporarily discontinued producing punch cards for the bibliographical records. as a corollary, it should be added that we have turned to a consideration of the paper tape typewriters as input/output devices, focusing on their expanded character set and operating speed. the speed of the 826 leaves much to be desired.

the numerical control file has proven its usefulness as a rapid index to our files spanning several years. it is extremely helpful in identifying quotes on old order numbers which have long since been cancelled. the fund file, however, has proven to be a duplicate of our machine file. it is thought that replacement of the slip in the numerical control file with the fund slips would at the same time reduce our files by one and up-date the information in the numerical file.

finally, this modest beginning, occasioned by limited financial resources as well as the lack of personnel with experience in data processing, seems to have been justified. moreover, although the increasing complexity of our involvement in library automation poses some serious planning and supervisory problems, we are encouraged by our initial success.

acknowledgments

the staff of the order department have all contributed to the production of this report. however, a special note of gratitude is acknowledged for the assistance of dorothy woods and gloria hagberg and for the technical advice and assistance of ai hansen, library programmer, and david a. jonah, librarian.

references

1. kozlow, robert d.: report on a library project conducted on the chicago campus of the university of illinois (washington: nsf, 1966), p. 50.
2. dunlap, connie: "automated acquisitions procedures at the university of michigan library," library resources & technical services, 11 (spring 1967), 192.
3. alanen, sally; sparks, david e.; kilgour, frederick g.: "a computer-monitored library technical processing system," american documentation institute. proceedings, 3 (1966), 419.
4. shaw, ralph r.: "control of book funds at the university of hawaii library," library resources & technical services, 11 (summer 1967), 380.

editorial board thoughts: the importance of staff change management in the face of the growing “cloud”

mark dehmlow

information technology and libraries | march 2016

the library vendor market likes to throw around the word “cloud” to make their offerings seem innovative and significant. in many ways, much of what the library it market refers to as “cloud,” especially saas (software as a service) offerings, is really just a fancier term for hosted services. the real gravitas behind the label cloud really emanated from grid-computing or large, interconnected, and quickly deployable infrastructure like amazon’s aws or microsoft’s azure platforms. infrastructure at that scale and that level of geographic distribution was revolutionary when it emerged. still these offerings at their core are basically iaas (infrastructure as a service) bundled as a menu of services. so i think the most broadly applicable synonym for the “cloud” could be “it as a service” in various forms.

outsourcing in this way isn’t entirely new to libraries. the function and structure of oclc has arguably been one of the earlier instantiations of “it as a service” for libraries vis-à-vis their marc record aggregation and distribution which oclc has been doing for decades. the more recent trend toward hosted it services has been relatively easy for non-it related units in our library. to most library staff, a service is no different based on where it is hosted.
and with many services implementing apis for libraries, that distinction is becoming less significant for our application developers too. for many of our technology staff, who have built careers around systems administration, application development, systems integration, and application management, hosted services represent a threat to not only their livelihoods but in some ways also their philosophical perspectives that are grounded in open source and do-it-yourself oriented beliefs. in many ways the “cloud” for the it segment of our profession is perhaps more synonymous with change, and change requires effective management, especially for the human element of our organizations.

recently, our office of information technologies started an initiative to move 80% of their technology infrastructure into the cloud. they have proposed an inverted pyramid structure for determining where it solutions should reside: focusing first on hosted software as a service solutions for the largest segment of applications, followed by hosting those applications we would have typically installed locally onto a platform or infrastructure as a service provider, and then limiting only those applications that have specialized technical or legal needs to reside on premise. this is a big shift for our it staff, especially, but not limited to, our systems administrators. the iaas platform our university is migrating to is amazon web services and their infrastructure is largely accessible via a web dashboard, so that the myriad of tasks our systems administrators took days and weeks to do can now, in some adjusted way, be accomplished with a few clicks. this example is on one extreme end of the spectrum as far as it change goes, but simultaneously, we have looked at the vendor market to lease pre-packaged tools that support standard functions in academic libraries and can be locally branded and configured with our data, things like course guides, a-z journal lists, scheduling events, etc. the overarching goals of these efforts are cost savings and increased velocity and resiliency of infrastructure, but also, and perhaps more important, flexibility in how we invest our staff time. if we are able to move high level tasks from staff to a platform, then we will be able to reallocate our staff’s time and considerable talent to take on the constant stream of new, high level technology needs.

mark dehmlow (mdehmlow@nd.edu), a member of lita and the ital editorial board, is the director, information technology program, hesburgh libraries, university of notre dame, south bend, indiana. doi: 10.6017/ital.v35i1.8965

partnering with the university, we are aiming towards their defined goal of moving 80% of our technical infrastructure into the “cloud.” we have adopted their overall strategy of approach to systems infrastructure, at least in principle, and are integrating into our own strategy significant consideration for the impact of these changes on our staff. our organization has recognized that people form not only habits around process, but also personal and emotional attachments to why we do things the way we do them, both from a philosophical as well as a pragmatic perspective. our approach to staff change is layered as well as long term. we know that getting from shock to acceptance is not an overnight process and that staff who adopt our overarching goals and strategy as their own will be more successful in the long term. to make this transition, we have developed several strategic approaches:

1.
explaining the case: my experience is that staff can live through most changes as long as they understand why. helping them gain that understanding can take some time, but ultimately having that comprehension will help them fully understand our strategic goals as well as help them make decisions that are in alignment with the overall approach. i often find it is important to remember that, as managers, we have been a part of all of the change conversations and we have had time to assimilate ideas, discuss points of view, and process the implications of change. each of our staff needs to go through the same process and it is up to leadership to guide them through that process and ensure they get to participate in similar conversations. it is tempting to want to hit an initiative running, but there is significant value in seeding those discussions over a gradual time period to more holistically integrate staff into the broader vision. it is important to explain the case for change multiple times and actively listen to staff thoughts and concerns and to remember to lay out the context for change, why it is important, and how we intend to accomplish things. then reassure, reassure, and reassure. the threats to staff may seem innocuous or unfounded to managers, but staff need to feel secure during a process to ultimately buy in.

2. consistency and persistence: staff acceptance doesn’t always come easy, nor should it necessarily. listening and integrating their perspectives into the planning and implementation process can help demonstrate that they matter, but equally important is that they feel our approach is built on something solid. stability is reinforced through consistency in messaging: not only individual consistency, but also team consistency and upper management consistency. everyone should be able to support and explain messaging around a particular change.
any time staff approach me and say, “it was much easier to do it this other way,” i talk about the efficiency we will garner through this change and how we will be able to train and repurpose staff in the future. the more they hear the message, the more ingrained it becomes, and the more normative it begins to feel. 3. training and investment: it futures require investment, not just in infrastructure, but also in skill development. we continue to invest significantly in providing some level of training on new technologies that we implement. that training will not only prove to staff that you are invested in their development as well as their job security, but it will also give them the tools they need to be successful in implementing new technologies. change is anxiety inducing because it exposes so many unknowns. providing training helps build confidence and competence for staff, reducing anxieties and providing some added engagement in the process. it also gives them exposure to the real world implementation of technologies where they can begin to see the benefits that you have been communicating for themselves. 4. envisioning the future: improvements and roles — one of the initial benefits we will be getting from recouping staff time is around shoring up our processes. we have generally had a more ad hoc approach to managing the day to day. it has been difficult to institute a strong technical change management process, in part, because of time. we will be able to remove that consideration from our excuses as we take advantage of the “cloud.” the net effect will be that we will do our work more thoughtfully and less ad hoc and use better defined processes that will meet group-developed expectations. in addition to doing things better, we do expect to do things differently. with fewer tasks at the operational level, we believe we will be able to transition staff into newly defined roles. 
some of these roles include devops engineers, a hybrid of application engineering (the dev) and systems administration (the ops), who will help design automation and continuous integration processes that allow developers to focus on their programming and less on the environment they are deploying their applications in; financial engineers who will take system requirements and calculate costs in somewhat complex technical cloud environments; systems architects who will be focused on understanding the smorgasbord of options that can be tied together to provide a service to meet expected response performance, disaster recovery, uptime, and other requirements; and business analysts who will focus on taking technical requirements and looking at all of the potential approaches to solve that need whether it be a hosted service, a locally developed solution, an implementation of an open source system, or some integration of all or some of the above. this list is by no means exhaustive, but i think it forms a good foundation on which to help staff develop their skill set along with our changing environment.

i believe it is important to remind those of us who are managing it departments in libraries that in many ways the easiest parts of change are the logistics. the technology we work with is bounded by sets of guidelines that define how it is used and ensure that if it is implemented properly, it will work effectively. people on the other hand are not bounded as neatly by stringent rules. they are guided by diverse backgrounds, personalities, experiences, and feelings. they can be unpredictable, difficult to fully figure out, and behaviorally inconsistent. and yet, they are the great constant in our organizations and therefore require significant attention.
our field needs “servant leaders” dedicated to supporting and developing staff, and not just being competent at implementing technologies. those managers who invest in staff, their well-being, development, and sense of engagement in their jobs, will find their organizations are able to tackle most anything. but those who ignore their staffs’ needs in favor of pragmatic goals will likely find their organizations struggling to move quickly and spending too much energy overcoming resistance instead of energizing change.

wikiwikiwebs: new ways to communicate in a web environment

chawner, brenda; lewis, paul h. information technology and libraries; mar 2006; 25, 1; proquest education journals pg. 33

reproduced with permission of the copyright owner. further reproduction prohibited without permission.

(bath: the library, 1974-75). 2.
valentine de bruin, "sometimes dirty things are seen on the screen," journal of academic librarianship 3:256-66 (nov. 1977). 3. carolyn m. cox and bonnie juergens, microform catalogs: a viable alternative for texas libraries (dallas: amigos bibliographical council, 1977). eric document no. ed 149 739. 4. james r. dwyer, "public response to an academic library microcatalog," journal of academic librarianship 5:132-41 (july 1979). 5. brett butler, martha w. west, and brian aveney, com catalog: use and evaluation: report of a field study of the los angeles county public library system (rev. ed.; los altos: information access corporation, 1979), 71p. 6. theodora hodges and uri bloch, "fiche or film for com catalogs-two use tests" in library effectiveness: a state of the art (chicago: american library assn., 1980), p.122-30. 7. william saffady, computer-output microfilm: its library applications (chicago: american library assn., 1978), 190p. 8. commercial com catalogs: how to choose, when to buy. catalog use committee, reference and adult services division, american library association. (chicago: american library assn., 1978), 47p. 9. debruin, "dirty things," p.266. 10. hodges, "fiche or film," p.128. 11. hodges to crowley, september 1979.
electronic order transmission
james k. long: oclc, inc., dublin, ohio.
in this era of decreasing library allocation from the public sector, libraries are realizing increased benefits from the automation of the acquisitions process. the price of hardware is decreasing and the capabilities of the available offerings increasing. we have evolved from the small local library collection of data and printing of orders, through the book vendor offerings of an online connection to a single vendor's inventory. these systems still required local mailing for all other vendor orders. in 1981 we have seen a greater emphasis on electronic ordering.
memorial university in canada has been experimenting in sending orders directly to john coutts library services ltd. in print format using the utlas catss system. wayne state university is planning to use the ringgold nonesuch acquisitions system to transmit orders electronically to book house using the bisac tape format. blackwell/north america and the academic book center have experimentally used wln to receive test orders in a print file format. these all save time in getting the orders to the respective vendor. if sufficient volume can be generated there may be a savings in transmission costs over the u.s. mail. however, in order to realize maximum economies in this electronic process, four activities need to occur.
1. acquisition orders must be collected from multiple libraries at a central site to generate volume for dispersal to multiple sites.
2. standard formats need to be accepted and enforced for order transmission.
3. the isbn must become a universally accepted part of the library acquisitions order.
4. the library must receive order status information from the vendor. once again, this should occur via a standard data format.
at oclc there were 113 libraries, as of november 1981, that could send printed orders from a central site to over 15,000 addresses of their choice. by july 1982 the projection is for over 200 libraries to be using the system. the library's order is batched by the vendor address that the library has specified. this process offers savings by sharing mail and printing costs between participants. with the proposed installation of direct transmission in 1982, this central collection will afford shared transmission costs. this is the type of centralized collection that maximizes the benefits of electronic ordering. within the book industry, standards for electronic data transmission for book ordering have been developed.
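as a rough illustration of the central-site batching described above, the following python sketch groups orders from several libraries by vendor address so that each vendor receives one combined batch. the record layout, the library and vendor identifiers, the sample isbns, and the function name batch_by_vendor are all assumptions made for this example; none of them is taken from the oclc system.

```python
from collections import defaultdict

# Hypothetical order records: (library, vendor address id, isbn).
# All identifiers are invented sample values for illustration only.
orders = [
    ("library-a", "vendor-1", "0-306-40615-2"),
    ("library-b", "vendor-2", "0-19-852663-6"),
    ("library-c", "vendor-1", "0-201-53082-1"),
]


def batch_by_vendor(orders):
    """Group orders from many libraries into one batch per vendor address."""
    batches = defaultdict(list)
    for library, vendor, isbn in orders:
        batches[vendor].append((library, isbn))
    return dict(batches)


batches = batch_by_vendor(orders)
for vendor, items in sorted(batches.items()):
    print(vendor, len(items), "order(s)")
```

the point of the sketch is only that mailing or transmission happens once per vendor batch rather than once per library order, which is where the shared cost savings would come from.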
journal of library automation vol. 14/4 december 1981
in may of 1981 the book industry systems advisory committee (bisac), a subcommittee of the book industry study group (bisg), approved the third version of their purchase order format. this is a simple format with fixed length fields and fixed length records. it was developed for tape transmission of book orders and relies heavily on the use of the international standard book number (isbn) for accuracy. ansi z39 subcommittee u is working on an ansi purchase order data transmission format for libraries. this effort is in cooperation with bisac. in 1981 there were nine book vendors using the bisac purchase order format, including the large retail chains walden and dalton. there were also twelve vendors using the bisac invoice format, five vendors using the title update format, and one vendor using the approved data transmission protocol (ibm 3780). this book ordering activity and standards use is fine for the book vendors and retailers. but where are the libraries? oclc plans to use bisac data transmission protocol and fixed data format in their initial direct transmission effort. however, there are some real problems with these formats relative to library needs. first, the formats do not provide for serials ordering or renewal. second, data fields in the format are fixed length. this is a real problem when ordering esoteric publications, especially since the title and descriptor entries are a single field. obviously there are many items that a library needs to order that cannot be supported by this current standard. oclc and dataphase have representatives on the bisac purchase order subcommittee. this subcommittee is developing a variable length p.o. format. however, if there is to be real cooperation, and the accompanying economies, we must have more active participation from the library community. the cataloger has the library of congress catalog number (lccn).
however, this is inadequate for library acquisitions. the isbn was developed for acquisitions. the isbn identifies the publisher or current distributor, the binding, etc., so necessary for accurate acquisitions. you can order music, maps, recordings, or film by using the isbn, provided you order from a publisher that assigns isbns to those materials. it also assumes that you include the isbn on the order. baker and taylor, brodart, random house, and mcgraw-hill estimate less than 25 percent of their orders contain an isbn. yankee book, blackwell/north america, and the book house report approximately 10 percent use on orders received. a significant number of these isbns are incorrect, obsolete, or otherwise erroneous. if we are to realize the tremendous economies possible with electronic transmission, we must have greater and more accurate use of the isbn. it is simply uneconomical to transmit all of the data necessary to accurately identify a piece via the cataloging fields and subfields for every order, even if this information were available for ordering. another standard developed by bisac is the standard address number (san). all library vendors, public, academic, and school libraries have been assigned a san. do you know yours? do you know the san of your vendor? these sans are available in your libraries' reference sections, as well as the online name-address directories that accompany the network acquisition systems. if electronic ordering is going to be used most effectively and economically, the san plays an important part. it is not economically efficient to transmit hundreds of characters of address information. the last item that becomes feasible with electronic transmission is order status information. the day is gone when we can afford to keep thousands of dollars encumbered with acquisition pieces that are unavailable. the normal practice of automatic cancel after sixty or ninety days keeps those monies committed.
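one inexpensive guard against the incorrect isbns mentioned above is the check digit built into the ten-character isbn itself. the sketch below, in python, is offered only as an illustration: it assumes the ten-character form of the isbn, the function name is_valid_isbn10 is invented for this example, and the sample numbers are arbitrary test values. a number is accepted when the weighted sum of its characters (weights 10 down to 1, with a final "x" counting as 10) is divisible by 11.

```python
def is_valid_isbn10(isbn: str) -> bool:
    """Return True if the string passes the ISBN-10 modulus-11 check."""
    # Ignore the hyphens and spaces commonly used to group the digits.
    chars = [c for c in isbn.upper() if c not in "- "]
    if len(chars) != 10:
        return False
    total = 0
    for position, char in enumerate(chars):
        if char == "X" and position == 9:
            value = 10  # "X" is allowed only as the final check character
        elif char in "0123456789":
            value = int(char)
        else:
            return False
        total += (10 - position) * value  # weights run 10, 9, ..., 1
    return total % 11 == 0


print(is_valid_isbn10("0-306-40615-2"))  # well-formed sample number
print(is_valid_isbn10("0-306-40651-2"))  # same number with two digits transposed
```

a check of this kind catches mis-keyed or transposed digits before an order is transmitted. it cannot detect an isbn that is well formed but obsolete or assigned to a different edition, so it addresses only part of the error rate the vendors report.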
how much better would it be to know within twenty-four to forty-eight hours of an order that the material was unavailable. those funds, that become more dear each year, could be recommitted to more available items. this would be advantageous to the vendor, the library, and ultimately the library patron. both the bisac invoice and title update formats have potential for use in reporting. it would be better, however, if we could derive a format specific for title status. in closing, i urge you to use the isbn and san; pursue avenues of collective ordering; and lastly, become active in the standards effort. it is the library that ultimately has the most to gain from a cooperative, coordinated, volume-oriented, resource-sharing electronic ordering process. for information relative to bisac transmission formats or bisac membership, write to: book industry systems advisory committee, 160 fifth ave., suite 604, new york, ny 10010. for input to bisac purchase order formats, write to: j. k. long, chairman, bisac p.o. subcommittee, c/o oclc, inc., 6565 frantz rd., dublin, oh 43017. (mr. long is also the library or network representative on the isbn advisory council.) for input to the ansi z39 p.o. transmission formats, write to: mr. e. muro, chairman, subcommittee u, c/o baker & taylor co., 6 kirby ave., somerville, nj 08876. for problems with the isbn and san, write to: mr. emory i koltay, international standard book numbering agency, 1180 avenue of the americas, new york, ny 20036.
microcomputer backup to online circulation
sheila intner: emory university, atlanta, georgia.
our primary objective in purchasing microcomputer systems for the great neck library was to provide a better alternative to paper and pencil checkouts when our minicomputer-based clsi libs 100 automated circulation system was down.
two difficult and lengthy downtime periods occurring shortly after going online convinced the administration that public service should not be jeopardized because of system failure. after investigation of the backup systems vended by computer translation, inc.,1 two of them were purchased in november 1980. computer translation, inc. (cti) sells a turnkey backup system based on an apple ii plus microcomputer, with two mini-disk drives using 5¼" floppy diskettes, a tv monitor, and a switching system connecting the apple to the libs 100 console and terminals. software designed to interface with the clsi system is part of the package. the backup collects and stores data for check-ins and checkouts and then dumps them into the database by simulating a terminal when the mini-main-frame is operational again. this requires dedicating a terminal to this process until complete. it can also be used alone as a portable unit for circulation purposes, or with any of the many applesoft packages available, or with an applesoft program of the user's own design. our initial experience in great neck was with a borrowed demonstration system, set up by a sympathetic cti representative on the spur of the moment in tandem with and connected to the main library checkout station's crt laser terminal after several days of downtime. the circulation staff cheered as the familiar prompts appeared on both screens. they used the clsi equipment which they were accustomed to operating and the computer room staff learned to operate the cti system. the ease with which the apple could be transported to different locations in the building and the immediate relief it gave wherever it was connected, sometimes one checkout station, sometimes another, led us to put off deciding on a permanent installation at first. we thought it might be more advantageous to keep it on a rolling cart and use it wherever a terminal was down, or wherever the traffic appeared to be heaviest.
we continued in this manner for a while even after both of our own apple systems were delivered. it soon became apparent that the apple and its accompaniments, especially the switching system with its dangling cables, was a nuisance at the checkout counter. people with piles of books or records tended to nudge it dangerously close to the edge or jiggle its connections loose. the circulation staff didn't like waiting until someone from the computer room could be spared to bring up the system, secure the connections, and turn on the apple. also, although the apple is a very reliable instrument which has given us negligible downtime, bumpy rides over various floors, carpets, lintels, and textured tiles occasionally loosened its chips and rendered it, too, inoperative. cti representatives were called in to make a more permanent installation for the apple in our computer room, a simple operation requiring some additional cable. se
information technology and libraries | june 2010
know its power, and facets can showcase metadata in new interfaces. according to mcguinness, facets perform several functions in an interface:
■■ vocabulary control
■■ site navigation and support
■■ overview provision and expectation setting
■■ browsing support
■■ searching support
■■ disambiguation support5
these functions offer several potential advantages to the user: the functions use category systems that are coherent and complete, they are predictable, they show previews of where to go next, they show how to return to previous states, they suggest logical alternatives, and they help the user avoid empty result sets as searches are narrowed.6 disadvantages include the fact that categories of interest must be known in advance, important trends may not be shown, category structures may need to be built by hand, and automated assignment is only partly successful.7 library catalog records, of course, already supply “categories of interest” and a category structure.
information science research has shown benefits to users from faceted search interfaces. but do these benefits hold true for systems as complex as library catalogs? this paper presents an extensive review of both information science and library literature related to faceted browsing.
■■ method
to find articles in the library and information science literature related to faceted browsing, the author searched the association for computing machinery (acm) digital library, scopus, and library and information science and technology abstracts (lista) databases. in scopus and the acm digital library, the most successful searches included the following:
■■ (facet* or cluster*) and (usability or user stud*)
■■ facet* and usability
in lista, the most successful searches included combining product names such as “aquabrowser” with “usability.” the search “catalog and usability” was also used. the author also searched google and the next generation catalogs for libraries (ngc4lib) electronic discussion list in an attempt to find unpublished studies. search terms initially included the concept of “clustering”; however, this was quickly shown to be a clearly defined, separate topic. according to hearst, “clustering refers to the grouping of items according to some measure
faceted browsing is a common feature of new library catalog interfaces. but to what extent does it improve user performance in searching within today’s library catalog systems? this article reviews the literature for user studies involving faceted browsing and user studies of “next-generation” library catalogs that incorporate faceted browsing. both the results and the methods of these studies are analyzed by asking, what do we currently know about faceted browsing? how can we design better studies of faceted browsing in library catalogs?
the article proposes methodological considerations for practicing librarians and provides examples of goals, tasks, and measurements for user studies of faceted browsing in library catalogs.
many libraries are now investigating possible new interfaces to their library catalogs. sometimes called “next-generation library catalogs” or “discovery tools,” these new interfaces are often separate from existing integrated library systems. they seek to provide an improved experience for library patrons by offering a more modern look and feel, new features, and the potential to retrieve results from other major library systems such as article databases. one interesting feature these interfaces offer is called “faceted browsing.” hearst defines facets as “a set of meaningful labels organized in such a way as to reflect the concepts relevant to a domain.”1 labarre defines facets as representing “the categories, properties, attributes, characteristics, relations, functions or concepts that are central to the set of documents or entities being organized and which are of particular interest to the user group.”2 faceted browsing offers the user relevant subcategories by which they can see an overview of results, then narrow their list. in library catalog interfaces, facets usually include authors, subjects, and formats, but may include any field that can be logically created from the marc record (see figure 1 for an example). using facets to structure information is not new to librarians and information scientists. as early as 1955, the classification research group stated a desire to see faceted classification as the basis for all information retrieval.3 in 1960, ranganathan introduced facet analysis to our profession.4 librarians like metadata because they
jody condit fagan (faganjc@jmu.edu) is content interfaces coordinator, james madison university library, harrisonburg, virginia.
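the "overview, then narrow" behavior attributed to faceted browsing above can be sketched in a few lines of python. this is a minimal illustration under stated assumptions, not a description of how any catalog product works: the records are invented sample data standing in for fields derived from marc records, and the names FACET_FIELDS, facet_counts, and narrow are hypothetical.

```python
from collections import Counter

# Invented sample records; a real catalog would derive these fields
# from its bibliographic records.
records = [
    {"title": "the tempest", "author": "shakespeare", "format": "book", "subject": "drama"},
    {"title": "the tempest", "author": "shakespeare", "format": "video", "subject": "drama"},
    {"title": "hamlet", "author": "shakespeare", "format": "book", "subject": "drama"},
    {"title": "sample classification text", "author": "sample author", "format": "book", "subject": "classification"},
]

FACET_FIELDS = ("author", "format", "subject")


def facet_counts(results):
    """Overview step: count how many results fall under each facet value."""
    return {field: Counter(r[field] for r in results) for field in FACET_FIELDS}


def narrow(results, field, value):
    """Narrowing step: keep only the results matching one facet value."""
    return [r for r in results if r[field] == value]


overview = facet_counts(records)
books = narrow(records, "format", "book")
drama_books = narrow(books, "subject", "drama")
print(overview["format"])
print([r["title"] for r in drama_books])
```

because the counts are computed from the current result set, every value offered to the user matches at least one record, which is one way to read the claim elsewhere in this review that facets help users avoid empty result sets.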
jody condit fagan
usability studies of faceted browsing: a literature review
doing so and performed a user study to inform their decision.
results: empirical studies of faceted browsing
the following summaries present selected empirical research studies that had significant findings related to faceted browsing or interesting methods for such studies. it is not an exhaustive list. pratt, hearst, and fagan questioned whether faceted results were better than clustering or relevancy-ranked results.11 they studied fifteen breast-cancer patients and families. every subject used three tools: a faceted interface, a tool that clustered the search results, and a tool that ranked the search results according to relevance criteria. the subjects were given three simple queries related to breast cancer (e.g., “what are the ways to prevent breast cancer?”), asked to list answers to these before beginning, and to answer the same queries after using all the tools. in this study, subjects completed two timed tasks. first, subjects found as many answers as possible to the question in four minutes. second, the researchers measured the time subjects took to find answers to two specific questions (e.g., “can diet be used in the prevention of breast cancer?”) that related to the original, general query. for the first task, when the subjects used the faceted interface, they found more answers than they did with the other two tools. the mean number of answers found using the faceted interface was 7.80, for the cluster tool it was 4.53, and for the ranking tool it was 5.60. this difference was significant (p<0.05).12 for the second task, the researchers found no significant difference between the tools when comparing time on task. the researchers gave the subjects a user-satisfaction questionnaire at the end of the study.
on thirteen of the fourteen quantitative questions, satisfaction scores for the faceted interface were much higher than they were for either the ranking tool or the cluster tool. this difference was statistically significant (p < 0.05). all fifteen users also affirmed that the faceted interface made sense, was helpful, was useful, and had clear labels, and said they would use the faceted interface again for another search. yee et al. studied the use of faceted metadata for image searching and browsing using an interface they developed called flamenco.13 they collected data from thirty-two participants who were regular users of the internet, searching for information either every day or a few times a week. their subjects performed four tasks (two structured and two unstructured) on each of two interfaces. an example of an unstructured task from their study was “search for images of interest.” an example of a structured task was to gather materials for an art history
of similarity . . . typically computed using associations and commonalities among features where features are typically words and phrases.”8 using library catalog keywords to generate word clouds would be an example of clustering, as opposed to using subject headings to group items. clustering has some advantages according to hearst. it is fully automated, it is easily applied to any text collection, it can reveal unexpected or new trends, and it can clarify or sharpen vague queries.
disadvantages to clustering include possible imperfections in the clustering algorithm, similar items not always being grouped into one cluster, a lack of predictability, conflating many dimensions, difficulty labeling groups, and counterintuitive subhierarchies.9 in user studies comparing clustering with facets, pratt, hearst, and fagan showed that users find clustering difficult to interpret and prefer a predictable organization of category hierarchies.10
■■ results
the author grouped the literature into two categories: user studies of faceted browsing and user studies of library catalog interfaces that include faceted browsing as a feature. generally speaking, the information science literature consisted of empirical studies of interfaces created by the researchers. in some cases, the researchers’ intent was to create and refine an interface intended for actual use; in others, the researchers created the interface only for the purposes of studying a specific aspect of user behavior. in the library literature, the studies found were generally qualitative usability studies of specific library catalog interface products. libraries had either implemented a new product, or they were thinking about
figure 1. faceted results from jmu’s vufind implementation
uddin and janacek asked nineteen users (staff and students at the asian institute of technology) to use a website search engine with both a traditional results list and a faceted results list.22 tasks were as follows: (1) look for scholarship information for a masters program, (2) look for staff recruitment information, and (3) look for research and associated faculty member information within your interested area.23 they found that users were faster when using the faceted system, significantly so for two of the three tasks. success in finding relevant results was higher with the faceted system.
in the post–study questionnaire, participants rated the faceted system more highly, including significantly higher ratings for flexibility, interest, understanding of information content, and more search results relevancy. participants rated the most useful features to be the capability to switch from one facet to another, preview the result set, combine facets, and navigate via breadcrumbs. capra et al. compared three interfaces in use by the bureau of labor statistics website, using a between-subjects study with twenty-eight people and a within-subjects study with twelve people.24 each set of participants performed three kinds of searches: simple lookup, complex lookup, and exploratory. the researchers used an interesting strategy to help control the variables in their study: because the bls website is a highly specialized corpus devoted to economic data in the united states organized across very specific time periods (e.g., monthly releases of price or employment data), we decided to include the us as a geographic facet and a month or year as a temporal facet to provide context for all search tasks in our study. thus, the simple lookup tasks were constructed around a single economic facet but also included the spatial and temporal facets to provide context for the searchers. the complex lookup tasks involve additional facets including genre (e.g. press release) and/or region.25 capra et al. found that users preferred the familiarity afforded by the traditional website interface (hyperlinks + keyword search) but listed the facets on the two experimental interfaces as their best features. 
the researchers concluded, “if there is a predominant model of the information space, a well designed hierarchical organization might be preferred.”26 zhang and marchionini analyzed results from fifteen undergraduate and graduate students in a usability study of an interface that used facets to categorize results (relation browser ++).27 there were three types of tasks: ■■ type 1: simple look-up task (three tasks such as “check if the movie titled the matrix is in the library movie collection”). ■■ type 2: data exploration and analysis tasks (six tasks essay on a topic given by the researchers and to complete four related subtasks. the researchers designed the structured task so they knew exactly how many relevant results were in the system. they also gave a satisfaction survey. more participants were able to retrieve all relevant results with the faceted interface than with the baseline interface. during the structured tasks, participants received empty results with the baseline interface more than three times as often as with the faceted interface.14 the researchers found that participants constructed queries from multiple facets in the unstructured tasks 19 percent of the time and in the structured tasks 45 percent of the time.15 when given a post–test survey, participants identified the faceted interface as easier to use, more flexible, interesting, enjoyable, simple, and easy to browse. they also rated it as slightly more “overwhelming.” when asked to choose between the two, twenty-nine participants chose the faceted interface, compared with two who chose the baseline (n = 31). 
thirty-one of the thirty-two participants said the faceted interface helped them learn more, and twenty-eight of them said it would be more useful for their usual tasks.16 the researchers concluded that even though their faceted interface was much slower than the other, it was strongly preferred by most study participants: “these results indicate that a category-based approach is a successful way to provide access to image collections.”17 in a related usability study on the flamenco interface, english et al. compared two image browsing interfaces in a nineteen-participant study.18 after an initial search, the “matrix view” interface showed a left column with facets, with the images in the result set placed in the main area of the screen. from this intermediary screen, the user could select multiple terms from facets in any order and have the items grouped under any facet. the “singletree” interface listed subcategories of the currently selected term at the top, with query previews underneath. the user could then only drill down to subcategories of the current category, and could not select terms from more than one facet. the researchers found that a majority of participants preferred the “power” and “flexibility” of matrix to the simplicity of singletree. they found it easier to refine and expand searches, shift between searches, and troubleshoot research problems. they did prefer singletree for locating a specific image, but matrix was preferred for browsing and exploring. participants started over only 0.2 percent of the time for the matrix compared to 4.5 percent for singletree.19 yet the faceted interface, matrix, was not “better” at everything. for specific image searching, participants found the correct image only 22.0 percent of the time in matrix compared to 66.0 percent in singletree.20 also, in matrix, some participants drilled down in the wrong hierarchy with wrong assumptions.
one interesting finding was that in both interfaces, more participants chose to begin by browsing (12.7 percent) than by searching (5.0 percent).21
of the first two studies: the first study comprised one faculty member, five graduate students, and two undergraduate students; the second comprised two faculty members, four graduate students, and two undergraduate students. the third study did not report results related to faceted browsing and is not discussed here. the first study had seven scenarios; the second study had nine. the scenarios were complex: for example, one scenario began, “you want to borrow shakespeare’s play, the tempest, from the library,” but contained the following subtasks as well:
1. find the tempest.
2. find multiple editions of this item.
3. find a recent version.
4. see if at least one of the editions is available in the library.
5. what is the call number of the book?
6. you’d like to print the details of this edition of the book so you can refer to it later.
participants found the interface friendly, easy to use, and easy to learn. all the participants reported that faceted browsing was useful as a means of narrowing down the result lists, and they considered this tool one of the differentiating features between primo and their library opac or other interfaces. facets were clear, intuitive, and useful to all participants, including opening the “more” section.31 one specific result from the tests was that “online resources” and “available” limiters were moved from a separate location to the right with all other facets.32 in a study of aquabrowser by olson, twelve subjects, all graduate students in the humanities, participated in a comparative test in which they looked for additional sources for their dissertation.33 aquabrowser was created by medialab but is distributed by serials solutions in north america. this study also had three pilot subjects.
no relevance judgments were made by the researchers. nine of the twelve subjects found relevant materials by using aquabrowser that they had not found before.34 olson’s subjects understood facets as a refinement tool (narrowing) and had a clear idea of which facets were useful and not useful for them. they gave overwhelmingly positive comments. only two felt the faceted interface was not an improvement. some participants wanted to limit to multiple languages or dates, and a few were confused about the location of facets in multiple places, for example, “music” under both format and topic. a team at yale university, led by bauer, recently conducted two tests on pilot vufind installations: a subject-based presentation of e-books for the cushing/ whitney medical library and a pilot test of vufind using undergraduate students with a sample of 400,000 records from the library system.35 vufind is open-source software developed at villanova university (http://vufind.org). that require users to understand and make sense of the information collection: “in which decade did steven spielberg direct the most movies?”). ■■ type 3: (one free exploration task: “find five favorite videos without any time constraints”). the tasks assigned for the two interfaces were different but comparable. for type 2 tasks, zhang and marchionini found that performance differences between the two interfaces were all statistically significant at the .05 level.28 no participants got wrong answers for any but one of the tasks using the faceted interface. with regard to satisfaction, on the exploratory tasks the researchers found statistically significant differences favoring the faceted interface on all three of the satisfaction questions. participants found the faceted interface not as aesthetically appealing nor as intuitive to use as the basic interface. two participants were confused by the constant changing and updating of the faceted interface. 
the above studies are examples of empirical investigations of experimental interfaces. hearst recently concluded that facets are a “proven technique for supporting exploration and discovery” and summarized areas for further research in this area, such as applying facets to large “subject-oriented category systems,” facets on mobile interfaces, adding smart features like “autocomplete” to facets, allowing keyword search terms to affect order of facets, and visualizations of facets.29 in the following section, user studies of next-generation library catalog interfaces will be presented.
results: library literature
understandably, most studies by practicing librarians focus on products their libraries are considering for eventual use. these studies all use real library catalog records, usually the entire catalog’s database. in most cases, these studies were not focused on investigating faceted browsing per se, but on the usability of the overall interface. in general, these studies used fewer participants than the information science studies above, followed less rigorous methods, and were not subjected to statistical tests. nevertheless, they provide many insights into the user experience with the extremely complex datasets underneath next-generation library catalog interfaces that feature faceted browsing. in this review article, only results specifically relating to faceted browsing will be presented. sadeh described a series of usability studies performed at the university of minnesota (um), a primo development partner.30 primo is the next-generation library catalog product sold by ex libris. the author also received additional information from the usability services lab at um via e-mail. three studies were conducted in august 2006, january 2007, and october 2007. eight users from various disciplines participated in each
participants.
the researchers measured task success, duration, and difficulty, but did not measure user satisfaction. their study consisted of four known-item tasks and six topic-searching tasks. the topic-searching tasks were geared toward the use of facets, for example, “can you show me how would you find the most recently published book about nuclear energy policy in the united states?”45 all five participants using endeca understood the idea of facets, and three used them. students tried to limit their searches at the outset rather than search and then refine results. an interesting finding was that use of the facets did not directly follow the order in which facets were listed. the most heavily used facet was library of congress classification (lcc), followed closely by topic, and then library, format, author, and genre.46 results showed a significantly shorter average task duration for endeca catalog users for most tasks.47 the researchers noted that none of the students understood that the lcc facet represented call-number ranges, but all of the students understood that these facets “could be used to learn about a topic from different aspects—science, medicine, education.”48 the authors could find no published studies relating to the use of facets in some next-generation library catalogs, including encore and worldcat local. 
although the university of washington did publish results of a worldcat local usability study in a recent issue of library technology reports, results from the second round of testing, which included an investigation of facets, were not yet ready.49 ■■ discussion summary of empirical evidence related to faceted browsing empirical studies in the information science literature support many positive findings related to faceted browsing and build a solid case for including facets in search interfaces: ■■ facets are useful for creating navigation structures.50 ■■ faceted categorization greatly facilitates efficient retrieval in database searching.51 ■■ facets help avoid dead ends.52 ■■ users are faster when using a faceted system.53 ■■ success in finding relevant results is higher with a faceted system.54 ■■ users find more results with a faceted system.55 ■■ users also seem to like facets, although they do not always immediately have a positive reaction. ■■ users prefer search results organized into predictable, multidimensional hierarchies.56 ■■ participants’ satisfaction is higher with a faceted system.57 the team drew test questions from user search logs in their current library system. some questions targeted specific problems, such as incomplete spellings and incomplete title information. bauer notes that some problems uncovered in the study may relate to the peculiarities of the yale implementation. the medical library study contained eight participants—a mix of medical and nursing students. facets, reported bauer, “worked well in several instances, although some participants did not think they were noticeable on the right side of the page.”36 the prompt for the faceted task in this study came after the user had done a search: “what if you wanted to look at a particular subset, say ‘xxx’ (determine by looking at the facets).”37 half of the participants used facets, half used “search within” to narrow the topic by adding keywords. 
sixty-two percent of the participants were successful at this task. the undergraduate study asked five participants faced with a results list, “what would you do now if you only wanted to see material written by john adams?”38 on this task, only one of the five was successful, even though the author’s name was on the screen. bauer noted that in general, “the use of the topic facet to narrow the search was not understood by most participants. . . . even when participants tried to use topic facets the length of the list and extraneous topics rendered them less than useful.”39 the five undergraduates were also asked, “could you find books in this set of results that are about health and illness in the united states population, or control of communicable diseases during the era of the depression?”40 again, only one of the five was successful. bauer notes that “the overly broad search results made this difficult for participants. again, topic facets were difficult to navigate and not particularly useful to this search.”41 bauer’s team noted that when the search was configured to return more hits, “topic facets become a confusingly large set of unrelated items. these imprecise search results, combined with poor topic facet sets, seemed to result in confusion for test participants.”42 participants were not aware that topics represented subsets, although learning occurred because the “narrow” header was helpful to some participants.43 other results found by bauer’s team were that participants were intrigued by facets, navigation tools are needed so that patrons may reorder large sets of topic facets, format and era facets were useful to participants, and call-number facets were not used by anyone. 
antelman, pace, and lynema studied north carolina state university’s (ncsu) next-generation library catalog, which is driven by software from endeca.44 their study used ten undergraduate students in a between-subjects design where five used the endeca catalog and five used the library’s traditional catalog. the researchers noted that their participants may have been experienced with the library’s old catalog, as log data shows most ncsu users enter one or two terms, which was not true of study usability studies of faceted browsing: a literature review | fagan 63 one product’s faceted system for a library catalog does not substitute for another, the size and scope of local collections may greatly affect results, and cataloging practices and metadata will affect results. still, it is important for practicing librarians to determine if new features such as facets truly improve the user’s experience. methodological best practices after reading numerous empirical research studies (some of which critique their own methods) and library case studies, some suggestions for designing better studies of facets in library catalogs emerged. designing the study ■■ consider reusing protocols from previous studies. this provides not only a tested method but also a possible point of comparison. ■■ define clear goals for each study and focus on specific research questions. it’s tempting to just throw the user into the interface and see what happens, but this makes it difficult, if not impossible, to analyze the results in a useful way. for example, one of zhang and marchionini’s hypotheses specifically describes what rich interaction would look like: “typing in keywords and clicking visual bars to filter results would be used frequently and interchangeably by the users to finish complex search tasks, especially when large numbers of results are returned.”64 ■■ develop the study for one type of user. 
olson’s focus on graduate students in the dissertation process allowed the researchers to control for variables such as interest of and knowledge about the subject. ■■ pilot test the study with a student worker or colleague to iron out potential wrinkles. ■■ let users explore the system for a short time and possibly complete one highly structured task to help the user become used to the test environment, interface, and facilitator.65 unless you are truly interested in the very first experience users have with a system, the first use of a system is an artificial case. designing tasks ■■ make sure user performance on each task is measurable. will you measure the time spent on a task? if “success” is important, define what that would look like. for example, english et al. defined success for one of their tasks as when “the participant indicated (within the allotted time) that he/she had reached an appropriate set of images/specific image in the collection.”66 ■■ establish benchmarks for comparison. one can test for significant differences between interfaces, one can test for differences between research subjects and an expert user, and one can simply measure against ■■ users are more confident with a faceted system.58 ■■ users may prefer the familiarity afforded by traditional website interface (hyperlinks + keyword search).59 ■■ initial reactions to the faceted interface may be cautious, seeing it as different or unfamiliar.60 users interact with specific characteristics of faceted interfaces, and they go beyond just one click with facets when it is permitted. english et al. found that 7 percent of their participants expanded facets by removing a term, and that facets were used more than “keyword search within”: 27.6 percent versus 9 percent.61 yee et al. 
found that participants construct queries from multiple facets 19 percent of the time in unstructured tasks; in structured tasks they do so 45 percent of the time.62 the above studies did not use library catalogs; in most cases they used an experimental interface with record sets that were much smaller and less complicated than in a complete library collection. domains included websites, information from one website, image collections, video collections, and a journal article collection. summary of practical user studies related to faceted browsing this review also included studies from practicing librarians at live library implementations. these studies generally had smaller numbers of users, were more likely to focus on the entire interface rather than a few features, and chose more widely divergent methods. studies were usually linked to a specific product, and results varied widely between systems and studies. for this reason it is difficult to assemble a bulleted summary as with the previous section. the variety of results from these studies indicates that when faceted browsing is applied to a real-life situation, implementation details can greatly affect user performance and user preference. some, like la barre, are skeptical about whether facets are appropriate for library information. descriptions of library materials, says la barre, include analyses of intellectual content that go beyond the descriptive terms assigned to commercial items such as a laptop: now is the time to question the assumptions that are embedded in these commercial systems that were primarily designed to provide access to concrete items through descriptions in order to enhance profit.63 it is clear that an evaluation of commercial interfaces or experimental interfaces does not substitute for an opac evaluation. yet it is a challenge for libraries to find expertise and resources to conduct user studies. the systems they want to test are large and complex.
collaborating with other libraries has its own challenges: an evaluation of 64 information technology and libraries | june 2010 groups of participants, each of which tests a different system. ■❏ a within-subjects design has one group of participants test both systems. it is hoped that if libraries use the suggestions above when designing future experiments, results across studies will be more comparable and useful. designing user studies of faceted browsing after examining both empirical research studies and case studies by practicing librarians, a key difference seems to be the specificity of research questions and designing tasks and measurements to test specific hypotheses. while describing a full user-study protocol for investigating faceted browsing in a library catalog is beyond the scope of this article, reviewing the literature and the study methods it describes provided insights into how hypotheses, tasks, and measurements could be written to provide more reliable and comparable evidence related to faceted browsing in library catalog systems. for example, one research question could surround the format facet: “compared with our current interface, does our new faceted interface improve the user’s ability to find different formats of materials?” hypotheses could include the following: 1. users will be more accurate when identifying the formats of items from their result set when using the faceted interface than when using the traditional interface. 2. users will be able to identify formats of items more quickly with the faceted interface than with the traditional interface. looking at these hypotheses, here is a prompt and some example tasks the participants would be asked to perform: “we will be asking you to find a variety of formats of materials. when we say formats of materials, we mean books, journal articles, videos, etc.” ■■ task 1: please use interface a to search on “interpersonal communication.” look at your results set. 
please list as many different formats of material as you can. ■■ task 2: how many items of each format are there? ■■ task 3: please use interface b to search on “family communication.” what formats of materials do you see in your results set? ■■ task 4: how many items of each format are there?” we would choose the topics “interpersonal communication” and “family communication” because our local catalog has many material types for these topics and because these topics would be understood by most of our students. we would choose different topics to expectations or against previous iterations of the same study. for example, “75 percent of users completed the task within five minutes.” zhang and marchionini measured error rates, another possible benchmark.67 ■■ consider looking at your existing opac logs for zeroresults searches or other issues that might inspire interesting questions. ■■ target tasks to avoid distracters. for example, if your catalog has a glut of government documents, consider running the test with a limit set to exclude them unless you are specifically interested in their impact. for example, capra et al. decided to include the united states as a geographic facet and a month or year as a temporal facet to provide context for all search tasks in their study.68 ■■ for some tasks, give the subjects simple queries (e.g., “what are the ways to prevent breast cancer?”) as opposed to asking the subjects to come up with their own topic. this can help control for the potential challenges of formulating one’s own research question on the spot. as librarians know, formulating a good research question is its own challenge. ■■ if you are using any timed tasks, consider how the nature of your tasks could affect the result. for example, pratt, hearst, and fagan noted that the time that it took subjects to read and understand abstracts most heavily influenced the time for them to find an answer.69 english et al. 
found that the system’s processing time influenced their results.70 ■■ consider the implications of your local implementation carefully when designing your study. at yale, the team chose to point their vufind instance at just 400,000 of their records, drew questions from problems users were having (as shown in log files), and targeted questions to these problems.71 who to study? ■■ try to study a larger set of users. it is better to create a short test with many users than a long test with a few users. nielsen suggests that twenty users is sufficient.72 consider collaborating with another library if necessary. ■■ if you test a small number, such as the typical four to eight users for a usability test, be sure you emphasize that your results are not generalizable. ■■ use subjects who are already interested in the subject domain: for example, pratt, hearst, and fagan used breast cancer patients,73 and olson used graduate students currently writing their dissertations.74 ■■ consider focusing on advanced or scholarly users. la barre suggests that undergraduates may be overstudied.75 ■■ for comparative studies, consider having both between-subjects and within-subjects designs.76 ■❏ a between-subjects design involves creating two usability studies of faceted browsing: a literature review | fagan 65 these experimental studies. previous case-study investigations of library catalog interfaces with facets have proven inconclusive. by choosing more specific research questions, tasks, and measurements for user studies, libraries may be able to design more objective studies and compare results more effectively. references 1. marti a. hearst, “clustering versus faceted categories for information exploration,” communications of the acm 49, no. 4 (2006): 60. 2. kathryn la barre, “faceted navigation and browsing features in new opacs: robust support for scholarly information seeking?” knowledge organization 34, no. 2 (2007): 82. 3. 
vanda broughton, “the need for faceted classification as the basis of all methods of information retrieval,” aslib proceedings 58, no. 1/2 (2006): 49–71. 4. s. r. ranganathan, colon classification basic classification, 6th ed. (new york: asia, 1960). 5. deborah l. mcguinness, “ontologies come of age,” in spinning the semantic web: bringing the world wide web to its full potential, ed. dieter fensel et al. (cambridge, mass.: mit pr., 2003): 179–84. 6. hearst, “clustering versus faceted categories,” 60. 7. ibid., 61. 8. ibid., 59. 9. ibid., 60. 10. wanda pratt, marti a. hearst, and lawrence m. fagan, “a knowledge-based approach to organizing retrieved documents,” proceedings of the sixteenth national conference on artificial intelligence, july 18–22, 1999, orlando, florida (menlo park, calif.: aaai pr., 1999): 80–85. 11. ibid. 12. ibid., 5. 13. ka-ping yee et al., “faceted metadata for image search and browsing,” 2003, http://flamenco.berkeley.edu/papers/flamenco-chi03.pdf (accessed oct. 6, 2008). 14. ibid., 6. 15. ibid., 7. 16. ibid. 17. ibid., 8. 18. jennifer english et al., “flexible search and navigation,” 2002, http://flamenco.berkeley.edu/papers/flamenco02.pdf (accessed apr. 22, 2010). 19. ibid., 7. 20. ibid., 6. 21. ibid., 7. 22. mohammed nasir uddin and paul janecek, “performance and usability testing of multidimensional taxonomy in web site search and navigation,” performance measurement and metrics 8, no. 1 (2007): 18–33. 23. ibid., 25. 24. robert capra et al., “effects of structure and interaction style on distinct search tasks,” proceedings of the 7th acm-ieee-cs joint conference on digital libraries (new york: acm, 2007): 442–51. 25. ibid., 446. 26. ibid., 450. help minimize learning effects. to further address this, we would plan to have half our users start first with the traditional interface and half to start first with the faceted interface. this way we can test for differences resulting from learning.
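the counterbalancing plan described here (half of the users start with the traditional interface, half with the faceted one) can be sketched in a few lines of python. this is only an illustration under stated assumptions: the function name `assign_interface_order`, the participant ids, and the fixed random seed are hypothetical and do not come from any of the studies reviewed.

```python
import random


def assign_interface_order(participant_ids, seed=0):
    """Randomly split participants into two order groups of (nearly) equal size.

    Hypothetical helper: the first half of a shuffled list starts with the
    traditional interface, the rest start with the faceted interface.
    """
    rng = random.Random(seed)  # fixed seed so the assignment can be reproduced
    shuffled = list(participant_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2

    orders = {}
    for position, pid in enumerate(shuffled):
        if position < half:
            orders[pid] = ("traditional", "faceted")
        else:
            orders[pid] = ("faceted", "traditional")
    return orders


# Example with twenty made-up participant ids.
orders = assign_interface_order(range(1, 21))
traditional_first = [p for p, order in orders.items() if order[0] == "traditional"]
```

recording the assigned order with each participant's results makes it possible to check later whether the interface seen first affected performance.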
the above tasks would allow us to measure several pieces of evidence to support or reject our hypotheses. for tasks 1 and 3, we would measure the number of formats correctly identified by users compared with the number found by an expert searcher. for tasks 2 and 4, we would compare the number of items correctly identified with the total items found in each category by an expert searcher. we could also time the user to determine which interface helped them work more quickly. in addition to measuring the number of formats identified and the number of items identified in each format, we would be able to measure the time it takes users to identify the number of formats and the number of items in each format. to measure user satisfaction, we would ask participants to complete the system usability scale (sus) after each interface and, at the very end of the study, complete a questionnaire comparing the two interfaces. even just selecting the format facet, we would have plenty to investigate. other hypotheses and tasks could be developed for other facet types, such as time period or publication date, or facets related to the responsible parties, such as author or director: hypothesis: users can find more materials written in a certain time period using the faceted interface. task: find ten items of any type (books, journals, movies) written in the 1950s that you think would have information about television advertising. hypothesis: users can find movies directed by a specific person more quickly using the faceted interface. task: in the next two minutes, find as many movies as you can that were directed by orson welles. for the first task above, an expert searcher could complete the same task, and their time could be used as a point of comparison. for the second, the total number of movies in the library catalog that were directed by welles is an objective quantity. in both cases, one could compare the user’s performance on the two interfaces. 
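the measurements proposed above can be sketched in python. this is a minimal sketch, not a prescribed analysis: the function names and the sample data are hypothetical. the accuracy measure is one possible reading of "formats correctly identified by users compared with the number found by an expert searcher," and the sus function follows the standard scoring rule for the ten-item scale (odd-numbered items contribute the response minus 1, even-numbered items contribute 5 minus the response, and the sum is multiplied by 2.5).

```python
def format_accuracy(user_formats, expert_formats):
    """Share of the expert's formats that the participant also identified."""
    expert = set(expert_formats)
    if not expert:
        raise ValueError("expert list must contain at least one format")
    return len(set(user_formats) & expert) / len(expert)


def sus_score(responses):
    """Score one System Usability Scale questionnaire (ten items, 1-5 each)."""
    if len(responses) != 10 or any(r < 1 or r > 5 for r in responses):
        raise ValueError("sus needs ten responses, each between 1 and 5")
    total = 0
    for item_number, response in enumerate(responses, start=1):
        if item_number % 2 == 1:
            total += response - 1   # odd items are positively worded
        else:
            total += 5 - response   # even items are negatively worded
    return total * 2.5


# Made-up example: the expert found four formats, the participant found two.
expert = ["book", "video", "journal article", "map"]
participant = ["book", "video"]
accuracy = format_accuracy(participant, expert)

# Made-up questionnaire where every item was answered with the midpoint.
neutral_sus = sus_score([3] * 10)
```

computing the same two numbers for each participant on each interface gives paired values that can then be compared across the traditional and faceted conditions.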
■■ conclusion reviewing user studies about faceted browsing revealed empirical evidence that faceted browsing improves user performance. yet this evidence does not necessarily point directly to user success in faceted library catalogs, which have much more complex databases than those used in 66 information technology and libraries | june 2010 53. uddin and janecek, “performance and usability testing”; zhang and marchionini, evaluation and evolution; hao chen and susan dumais, bringing order to the web: automatically categorizing search results (new york: acm, 2000): 145–52. 54. uddin and janecek, “performance and usability testing.” 55. ibid.; pratt, hearst, and fagan, “a knowledge-based approach”; hsinchun chen et al., “internet browsing and searching: user evaluations of category map and concept space techniques,” journal of the american society for information science 49, no. 7 (1998): 582–603. 56. vanda broughton, “the need for faceted classification as the basis of all methods of information retrieval,” aslib proceedings 58, no. 1/2 (2006): 49–71; pratt, hearst, and fagan, “a knowledge-based approach,” 80–85.; chen et al., “internet browsing and searching,” 582–603; yee et al., “faceted metadata for image search and browsing”; english et al., “flexible search and navigation using faceted metadata.” 57. uddin and janecek, “performance and usability testing”; zhang and marchionini, evaluation and evolution; hideo joho and joemon m. jose, slicing and dicing the information space using local contexts (new york: acm, 2006): 66–74.; yee et al., “faceted metadata for image search and browsing.” 58. yee et al., “faceted metadata for image search and browsing”; chen and dumais, bringing order to the web. 59. capra et al., “effects of structure and interaction style.” 60. yee et al., “faceted metadata for image search and browsing”; capra et al., “effects of structure and interaction style”; zhang and marchionini, evaluation and evolution. 61. 
english et al., “flexible search and navigation,” 7. 62. yee et al., “faceted metadata for image search and browsing,” 7. 63. la barre, “faceted navigation and browsing,” 85. 64. zhang and marchionini, evaluation and evolution, 183. 65. english et al., “flexible search and navigation.” 66. ibid., 6. 67. zhang and marchionini, evaluation and evolution. 68. capra et al., “effects of structure and interaction style.” 69. pratt, hearst, and fagan, “a knowledge-based approach.” 70. english et al., “flexible search and navigation.” 71. bauer, “yale university library vufind test—undergraduates.” 72. jakob nielsen, “quantitative studies: how many users to test?” online posting, alertbox, june 26, 2006 http://www.useit.com/alertbox/quantitative_testing.html (accessed apr. 7, 2010). 73. pratt, hearst, and fagan, “a knowledge-based approach.” 74. tod a. olson, “utility of a faceted catalog for scholarly research,” library hi tech 25, no. 4 (2007): 550–61. 75. la barre, “faceted navigation and browsing.” 76. capra et al., “effects of structure and interaction style.” 27. junliang zhang and gary marchionini, evaluation and evolution of a browse and search interface: relation browser++ (atlanta, ga.: digital government society of north america, 2005): 179–88. 28. ibid., 183. 29. marti a. hearst, “uis for faceted navigation: recent advances and remaining open problems,” 2008, http://people.ischool.berkeley.edu/~hearst/papers/hcir08.pdf (accessed apr. 27, 2010). 30. tamar sadeh, “user experience in the library: a case study,” new library world 109, no. 1/2 (jan. 2008): 7–24. 31. ibid., 22. 32. jerilyn veldof, e-mail from university of minnesota usability services lab, 2008. 33. tod a. olson, “utility of a faceted catalog for scholarly research,” library hi tech 25, no. 4 (2007): 550–61. 34. ibid., 555. 35.
kathleen bauer, “yale university library vufind test—undergraduates,” may 20, 2008, http://www.library.yale.edu/usability/studies/summary_undergraduate.doc (accessed apr. 27, 2010); kathleen bauer and alice peterson-hart, “usability test of vufind as a subject-based display of ebooks,” aug. 21, 2008, http://www.library.yale.edu/usability/studies/summary_medical.doc (accessed apr. 27, 2010). 36. bauer and peterson-hart, “usability test of vufind as a subject-based display of ebooks,” 1. 37. ibid., 2. 38. ibid., 3. 39. ibid. 40. ibid., 4. 41. ibid. 42. ibid., 5. 43. ibid., 8. 44. kristin antelman, andrew k. pace, and emily lynema, “toward a twenty-first century library catalog,” information technology & libraries 25, no. 3 (2006): 128–39. 45. ibid., 139. 46. ibid., 133. 47. ibid., 135. 48. ibid., 136. 49. jennifer l. ward, steve shadle, and pam mofield, “user experience, feedback, and testing,” library technology reports 44, no. 6 (aug. 2008): 22. 50. english et al., “flexible search and navigation.” 51. peter ingwersen and irene wormell, “ranganathan in the perspective of advanced information retrieval,” libri 42 (1992): 184–201; winfried godert, “facet classification in online retrieval,” international classification 18, no. 2 (1991): 98–109; w. godert, “klassificationssysteme und online-katalog [classification systems and the online catalogue],” zeitschrift für bibliothekswesen und bibliographie 34, no. 3 (1987): 185–95. 52. yee et al., “faceted metadata for image search and browsing”; english et al., “flexible search and navigation.” academic web site design and academic templates | peterson 217 academic web site design continues to evolve as colleges and universities are under increasing pressure to create a web site that is both hip and professional looking. many colleges and universities are using templates to unify the look and feel of their web sites. where does the library web site fit into a comprehensive campus design scheme?
the library web site is unique due to the wide range of services and content available. based on a poster session presented at the twelfth annual association of college and research libraries conference in minneapolis, minnesota, april 2005, this paper explores the prevalence of university-wide academic templates on library web sites and discusses factors libraries should consider in the future. colleges and universities have a long history with the web. in the early 1990s, university web sites began as piecemeal projects with varying degrees of complexity—many started as informational sites for various technologically advanced departments on campus. over the last decade, these web sites have become a vital part of postsecondary institutions and one of their most visible faces. academic web sites communicate the brand and mission of an institution. they are used by prospective students to learn about an institution and then used later to apply. current students use them to pay tuition bills, register for classes, access course materials, participate in class discussions, take tests, get grades, and more. online learning and course-management software programs, such as blackboard, continue to increase the use of web sites. they are now an important learning tool for the entire campus community and the primary communication tool for current students, parents, alumni, the community, donors, and funding organizations. web site standards have developed since the 1990s. usability and accessibility are now important tenets for web site designers, especially for educational institutions. as a result, campus web designers or outside consultants are often responsible for designing large parts of the academic web site. as web sites have grown, ongoing maintenance has become an important workload issue. databases and other technologies are used to simplify daily updates and changes to web sites. this is where the academic template fits in.
an academic template can be defined as a common or shared template used to control the formatting of web pages in different departments on a campus. generally, administrators will mandate the use of a specific template or group of templates. this mandate includes guidelines for such things as layout, design, color, font, graphics, and navigation links to be used on all web pages. often, the templates are administered using content management systems (cmss) or web development software such as macromedia’s contribute. these programs give different levels of editing rights to individuals, thus keeping tight control over particular web pages or even parts of web pages. academic templates give the web site administrator the ability to change the template and update all pages with a single keystroke. for example, the web site administrator may give editing rights to content editors, such as librarians, to edit only the center section of the web page. the remaining parts of the page such as the top, sides, and bottom are locked and cannot be edited. the result of using templates is that the university web site is very unified and consistent. this is particularly important in creating a brand for the university. well-branded institutions have the opportunity to increase revenue, improve administration and faculty staffing, improve retention, and increase alumni relationships.1 but what about the library? libraries are one of the most visited web pages on a university’s web site.2 thus, the design of the library page can be crucial to a well-designed academic web site. the library web site can set a tone for an institution and help prospective students get a feel for the campus. belanger, mount, and wilson contend it is important for the image of an institution to match the reality.3 if there is discord between the two, students may choose an inappropriate college and quickly drop out, lowering a campus’s retention data. 
the library web site can also be important in the recruitment of new faculty members. in addition, libraries use their web sites for marketing, public relations, and fund-raising for the library.4 library web sites are crucial to delivering data, research tools, and instruction to students, faculty, staff, and community patrons. more than 90 percent of students access the library from their home computers, and 78 percent prefer this form of access.5 today, the web site connects users with article citations and databases, library catalogs, full-text journals, magazines, newspapers, books, videos, dvds, e-books, encyclopedias, streaming music and video, and more. users access subject-specific research guides, library tutorials, information-literacy instruction, and critical evaluation tools. services such as interlibrary loan (ill), reference management programs such as endnote or refworks, and print and electronic reserves are also used via the web. users get help with doing research by e-mail and virtual chat. in addition, libraries are digital repositories for a growing number of digital historic documents and archives. academic web site design and academic templates: where does the library fit in? kate peterson kate peterson (katepeterson@gmail.com) is an information literacy librarian at capella university, minneapolis, minnesota. 218 information technology and libraries | december 2006 how common are academic templates in library web sites? what effect do they have on the content and services provided by libraries? ■ methods for the purposes of this study, a list of doctoral, master’s, and bachelor of arts (ba) institutions (private and public) based on the carnegie classification of institutions of higher education was created and a random number table was used to select a sample of web pages (n=216).6 home pages, admissions pages, departmental pages, and library web pages were analyzed. 
a similarly sized sample of each type was selected to give a broad overview of trends: 18 percent of doctoral institutions (n=47), 19 percent of master’s institutions (n=115), and 23 percent of ba institutions (n=54). the following questions were asked:
■ does the college or university web site use an academic template?
■ if yes, is the library using the template, and for how much of the library web site?
■ to what extent is the template being used?
primarily, a web site was determined to be using an academic template based on the look of the site. for example, if the majority of the web elements (top banner, navigation) all matched, then the web site was counted as using some sort of template. use and nonuse of content management system (cms) software behind the web site was not considered in this study; only the look of the web site was.
■ results
a majority of college and university web sites (94 percent) use an academic template. fifty percent of the libraries surveyed use the academic template for at least the library’s home page. about 34 percent of all libraries surveyed use the template on a majority of the library pages. roughly 44 percent of the total libraries surveyed did not use the academic template, and roughly 6 percent of academic web sites (12 of the 216 in table 2) do not use any sort of unified academic template. smaller ba institutions are more likely to use the academic template on multiple library pages than doctoral institutions, which tend to have their own library design or template (see table 1). for those libraries that did not use the academic template on every library page, the most commonly used template elements were the top header (which often has the university seal or an image of the university), the top navigation bar (with university-wide links), and the bottom footer, which often contains the university address, privacy statement, or legal disclaimers.
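the percentages quoted in the results can be rechecked from the raw counts in table 2. the following is a minimal python sketch of that arithmetic, not part of the original study; the names `COUNTS`, `row_percentages`, and `overall_percentages` are illustrative assumptions.

```python
# Counts per institution type, copied from table 2.
# Column order: no academic template, library not using template,
# library using template on transition/top page only,
# library using template on a majority of pages.
COUNTS = {
    "bachelor of arts": (2, 20, 7, 25),
    "master's": (7, 55, 14, 39),
    "doctoral": (3, 21, 13, 10),
}


def row_percentages(counts):
    """Share of each category within one institution type, in whole percent."""
    total = sum(counts)
    return tuple(round(100 * c / total) for c in counts)


def overall_percentages(table):
    """Share of each category across all institution types, to one decimal."""
    column_totals = [sum(column) for column in zip(*table.values())]
    grand_total = sum(column_totals)
    return tuple(round(100 * t / grand_total, 1) for t in column_totals)


if __name__ == "__main__":
    for name, counts in COUNTS.items():
        print(name, row_percentages(counts))
    print("all institutions", overall_percentages(COUNTS))
```

run against the table 2 counts, the per-type rows reproduce table 1, and the overall figures give the shares of sites with no template, libraries not using it, and libraries using it on the top page only or on most pages.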
less frequently used elements were the bottom navigation bar and the left or right navigation bar with university-wide links (see tables 2–3).
■ discussion
while many colleges and universities use academic templates, only about half of their libraries follow suit. libraries using the template often use selected parts of the template, or only use the template on their home page. though not considered in this study, there may be a correlation between institution size and template use, as larger institutions are more likely to have library web designers and thus use the academic template only on the library’s home page. while academic templates can cause libraries many problems, there are also many benefits to be considered.
■ problems with academic templates on library web sites
the primary concern with any template is how much space is available for content. for example, there may be a very small box for the page content while images, banner bars, and large navigation links may take up most of the real estate on the page. this problem can be exacerbated for libraries because there are so many different types of content such as the library catalog, databases, tutorials, forms, ill, and other library services delivered via the web. libraries can be caught between the design imposed by the academic template and the rigid size requirements from outside vendors such as database companies, ill or reserve modules, federated search products, or others. academic templates are usually mandated by administrators without a full understanding of the specific content and uses of the library web site. many problems can occur when trying to fit an existing library web site into a poorly designed academic template. it can be very difficult to modify the template effectively for the library’s purposes.
one specific problem is misleading links in the template: a link on every page to the “university catalog” leads to the course catalog and not the library catalog, which confuses users. another example is a search box as part of the academic template. what are users searching? the university web site? the library web site? the library catalog? the world wide web? another drawback to using academic templates for library web sites can be the time involved in training librarians, staff, and library web site administrators. the existing content must be fit into the new template, a huge project given that many library web sites contain one thousand pages or more. generally, a decision to use a template is accompanied by a decision to use a cms or new web-page editor. training individuals on the new software, in addition to the new template, takes yet more time.
■ benefits of using academic templates
one of the benefits for libraries using an academic template is the ability to exploit the expertise of the web site designers who created the template. the academic template often incorporates images, logos, and branding that the library may not be able to design otherwise. many libraries do not have professional web designers on staff; even if they do, there often is no one person who designs and maintains the entire library web site. instead, different parts of a library web site are designed and maintained by different individuals with varying degrees of web site ability. as a result, many library web sites are a mix of styles, which can be disorienting for students who are familiar with the university’s “look.” web site uniformity has a positive effect on usability since familiarity with one part of the web site helps students, faculty, and staff navigate other parts of the web site. even web site basics such as knowing the color and style of the links and how to navigate to different pages can be helpful.8 another benefit is that academic templates are generally ada compliant, as required under section 508 of the rehabilitation act of 1973.9 as usability and usability testing become more prevalent, academic template designers may also test the template and navigation for usability. such testing will improve the template and thus the library web site as well.

table 1. percentages of occurrences of academic templates

| institution type | no academic template (%) | library not using template (%) | library using template, transition or top page (%) | library using template, majority of pages (%) |
|---|---|---|---|---|
| bachelor of arts | 4 | 37 | 13 | 46 |
| master’s | 6 | 48 | 12 | 34 |
| doctoral | 6 | 45 | 28 | 21 |

table 2. occurrence of templates in academic and library web sites

| institution type | no academic template | library not using template | library using template, transition or top page | library using template, majority of pages | total sites analyzed |
|---|---|---|---|---|---|
| bachelor of arts | 2 | 20 | 7 | 25 | 54 |
| master’s | 7 | 55 | 14 | 39 | 115 |
| doctoral | 3 | 21 | 13 | 10 | 47 |
| total | 12 | 96 | 34 | 74 | 216 |

table 3. percentages of occurrence for institutions using the academic-wide template for first page of library web site or libraries using modified academic template

| template element | ba (%) | master’s (%) | doctoral (%) | all colleges and universities (%) |
|---|---|---|---|---|
| top header (no navigation) | 100 | 94 | 94 | 91 |
| top navigation | 75 | 82 | 82 | 76 |
| bottom header (no navigation) | 83 | 65 | 76 | 72 |
| bottom navigation | 25 | 18 | 18 | 20 |
| left navigation | 42 | 18 | 18 | 24 |
| right navigation | 8 | 0 | 0 | 2 |

■ trends in academic and library web sites
colleges and universities are responding to a new generation of students, the majority of whom have grown up with computers. in trying to meet their needs and desires, many academic web sites have high-quality photographs, quotes, and testimonials from the universities’ students on their home pages.
more and more materials are being placed online to allow both prospective and current students to do what they need to do twenty-four hours a day, from registering for classes to handing in research papers. many web sites have interactive elements such as instant polls or quizzlets or use instant messaging to connect with tech-savvy students. for example, prospective students can chat with admissions staff members or current students about what it is like to attend a particular university. a large number of sites also highlight weblogs written by current students or those studying abroad. these features allow students to use the technology they are comfortable with to maximize their academic experience. numerous library web sites are changing as well, featuring a library catalog, article database, or federated search box on the home page to allow users to search instantly. additionally, library sites are beginning to include images of students using the library, external or internal shots of the building, flash graphics, icons, and sound. many incorporate screen captures to help users navigate specific databases or forms. in addition, an increasing number of libraries use weblogs to give more of a dynamic quality with daily library news and announcements. ■ strategies for using academic templates based on comments received in april 2005 during the poster session, and in recent electronic discussion list postings, many academic libraries are dealing with these issues. libraries should work on creating a mission statement and objectives for their web sites that expand upon the library’s mission, the institutional web site’s mission, and the institution’s overall mission and brand. librarians must be knowledgeable about web site usability and trends in web site design in order to communicate effectively to designers and administrators. librarians should also become members of campus web committees and be a voice for library users during the design process. 
teaching administrators and campus web designers about the library and the prominence of the library web site is an important tool for dealing successfully with any proposed university-wide academic template. for example, a librarian could mock up a few pages, conduct informal usability testing, and invite administrators to learn firsthand about potential problems library users could experience with a template. librarians could also propose a modified template that uses a few key elements from the academic template. this would maintain the brand but retain enough space for important library content. connecting with other librarians and learning from each other’s successes and failures will also help bring insight into this academic template issue.
■ conclusion
the use of academic templates is only going to increase as institutional web sites grow in complexity and importance. libraries are an important part of institutions both physically, on campus, and virtually, as part of the campus web site. academic templates are part of a unified design scheme for colleges and universities. librarians must work with both library and university administrators to create a well-designed but usable library web site. they must advocate for library users and continue to help students and faculty access the rich resources and services available from the library. library administrators need to allocate resources and staff time to improve their web sites and to work in concert with academic web site designers to merge the best of the academic template with the best of the library site while not sacrificing users’ needs. the result will be highly used, highly usable library web sites that attract students and keep them coming back to access the fantastic world of information available in today’s academic libraries.
■ references
1. robert sevier, “university branding: 4 keys to success,” university business 5, no. 1 (2002): 27–28.
2. mignon adams and richard m.
dougherty, “how useful is your homepage? a quick and practical approach to evaluating a library’s web site,” college & research libraries news 63, no. 8 (2002): 590–92.
3. charles belanger, joan mount, and mathew wilson, “institutional image and retention,” tertiary education and management 8, no. 3 (2002): 217.
4. jeanie m. welch, “the electronic welcome mat: the academic library web site as a marketing and public-relations tool,” the journal of academic librarianship 31, no. 3 (2005): 225–28.
5. oclc, “how academic librarians influence students’ web-based information choices,” in oclc online computer library center database online, (2002), 5, http://www5.oclc.org/downloads/community/informationhabits.pdf (accessed march 10, 2005).
6. carnegie foundation, carnegie classification of institutions of higher education, 2000 edition, http://www.carnegiefoundation.org/classification/ (accessed jan. 8, 2005).
7. beth evans, “the authors of academic library home pages: their identity, training, and dissemination of web construction skills,” internet research 9, no. 4 (1999): 309–19.
8. oclc, 6.
9. u.s. department of justice, section 508 home page, in united states department of justice database online, (2004), 1, http://www.usdoj.gov/crt/508/ (accessed july 3, 2005).

statement of ownership, management, and circulation
information technology and libraries, publication no. 280-800, is published quarterly in march, june, september, and december by the library and information technology association, american library association, 50 e. huron st., chicago, illinois 60611-2795. editor: john webb, librarian emeritus, washington state university libraries, pullman, wa 99164-5610. annual subscription price, $55. printed in u.s.a. with periodical-class postage paid at chicago, illinois, and other locations.
as a nonprofit organization authorized to mail at special rates (dmm section 424.12 only), the purpose, function, and nonprofit status for federal income tax purposes have not changed during the preceding twelve months. extent and nature of circulation (average figures denote the average number of copies printed each issue during the preceding twelve months; actual figures denote actual number of copies of single issue published nearest to filing date: june 2006 issue). total number of copies printed: average, 5,256; actual, 5,300. sales through dealers and carriers, street vendors, and counter sales: none. paid or requested mail subscriptions: average, 4,262; actual, 4,280. free distribution (total): average, 59; actual, 67. total distribution: average, 4,758; actual, 4,769. office use, leftover, unaccounted, spoiled after printing: average, 498; actual, 531. total: average, 5,256; actual, 5,300. percentage paid: average, 98.76; actual, 98.60. statement of ownership, management, and circulation (ps form 3526, october 1999) filed with the united states post office postmaster in chicago, september 30, 2006.

146 journal of library automation vol. 5/2 june, 1972
book reviews

book catalogs. by maurice f. tauber and hilda feinberg. metuchen, n.j.: scarecrow press, 1971. 572 p. $15.00
in 1963 kingery & tauber published a collection entitled book catalogs. this is a much larger follow-up, containing twenty papers published between 1964 and 1970 and eight previously unpublished pieces. not surprisingly, nearly all of them are concerned with computer-produced book catalogs, in academic, special, county, public, and school libraries.
although nearly all of the previously published papers appeared in well-known journals, it is useful to have them collected together; the older ones are now of mainly historical interest, but, taken as a whole, they form a valuable record of trial and error, and also of progress. it would be unfair to single out any of the published articles for special praise or blame. in a rapidly changing field, even the good is soon improved upon. it is the examples, the costings, and above all the mistakes that are so helpful. there is no excuse now for running into problems that have in the past led to the total scrapping of some computer systems: unforeseen filing difficulties, insufficient computer storage, bad economic estimating, and inability to produce an acceptable product. one major problem is still unsolved and indeed has not really been tackled systematically: the pattern of output (main sequence and supplements) that provides maximum usability at minimal cost, a problem surely amenable to operations research techniques. as a reviewer from the united kingdom, i would like to have seen a little more on relevant events there than is provided by frederick g. kilgour's general review: the smaller budgets of british libraries have generally enforced much more careful planning and, although there may be fewer successes, there are also very few failures. the introduction and the three final pieces, all specially written, are of great value, particularly hilda feinberg's "sample book catalogs and their characteristics" (some samples are unbelievably horrible). for good measure there is a bibliography, a (computer-produced) index, and the listing of "book form catalogs" reprinted from lrts. i would hazard a guess that it is with com (computer output microfilm) that the future lies for many libraries. the next collection of papers, for which i hope we shall not have to wait eight years, must surely be entitled "book and microform catalogs." maurice b.
line

an introduction to pl/1 programming for library and information science. library and information science series. by thomas h. mott, jr., susan artandi, and leny struminger. new york: academic press, 1972. 231 p.
the importance of this text rests in the authors' assumption that the acquisition of programming skills by the library student is an essential component of his education in the fields of library automation and information retrieval. such skills should enable the student to examine critically the relevance of automated information handling for the library, to experiment with some basic methods of manipulating machine readable textual material, and, "to acquire an understanding of the role of the programmer in the development of ... information handling techniques." the selection of a programming language for this text deserves some comment. pl/1 has been recognized as a particularly suitable language for the processing of textual material and data base management applications. its extensive and powerful repertoire of bit, character, string, array, record, and file manipulation capabilities argues strongly in favor of its adoption for library and other information handling applications. students should be encouraged by the selection of pl/1 for this text, for it offers the novice great flexibility and ease in constructing and manipulating even the most complex types of information structures. this title constitutes the first published attempt to tailor an introductory programming text to the needs of the library student. as such, it possesses several characteristics which distinguish it from other basic programming books, including other pl/1 texts. the language features receiving the greatest share of attention in the present title are the set of built-in functions in pl/1 designed to facilitate the manipulation of strings of both binary and character data.
discussion of four of these functions (bool, unspec, verify, and translate) is usually omitted from general introductory pl/1 textbooks. although the discussions of the bool and unspec functions are reasonably complete, the explanations of verify and translate fail to indicate the scope of their applications. for example, the utility of the verify function as an index function for ranges is completely ignored. a more illuminating example of the power of the translate function could have explored its usefulness in converting ascii characters to the corresponding characters of the ebcdic set. this might have clarified the section entitled "internal representation of pl/1 characters," which contains an equivalence table for the pl/1 character set in ascii and ebcdic without indicating its purpose. additional use of this example could have been made in the presentation of the marc material, where the practical value of such a function could be stressed. another desirable feature of this text for the instructor and the library student is the inclusion of sample problems and exercises which, since they refer exclusively to text processing, library automation, and information retrieval, should be readily understandable. unfortunately, the present volume omits any mention of the picture attribute and its uses. as a powerful device facilitating the interchange of data between numeric and character variables and the uncomplicated editing of numeric fields prior to output time, its inclusion would have proved valuable to the text handling programmer. however, it should be emphasized that this appears to be the single instance in the text in which a generally acknowledged basic language feature has been entirely excluded.
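to make the reviewer's point about verify and translate concrete, here is a rough python analogue of the two functions; it is not pl/1 code and does not come from the book under review. the helper names and argument order are assumptions modelled on the commonly documented pl/1 behavior (verify returns the 1-based position of the first character outside a reference set, or 0; translate substitutes characters position by position), and python's built-in "cp037" codec stands in for one ebcdic code page.

```python
def verify(string, reference):
    """Return the 1-based position of the first character of `string`
    that does not occur in `reference`, or 0 if every character does."""
    for position, char in enumerate(string, start=1):
        if char not in reference:
            return position
    return 0


def translate(source, replacements, originals):
    """Replace each character of `source` found in `originals` with the
    character at the same position in `replacements`.
    Assumption: both lists have equal length (no blank padding)."""
    table = str.maketrans(originals, replacements)
    return source.translate(table)


def ascii_to_ebcdic(text):
    """Encode text as EBCDIC bytes using Python's cp037 code page."""
    return text.encode("cp037")


if __name__ == "__main__":
    digits = "0123456789"
    print(verify("1972", digits))        # 0: every character is a digit
    print(verify("19a2", digits))        # 3: 'a' is the first non-digit
    print(translate("marc ii", "MARC", "marc"))
    print(ascii_to_ebcdic("A1").hex())   # c1f1
```

used this way, verify acts as a range check on a field (for example, confirming that a date or tag is all numeric), which is the kind of application the reviewer finds missing from the text.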
it seems to me that too much of the text (15-25 percent) is devoted to developing some of the elementary concepts of boolean algebra and constructing a theoretical model of document retrieval based on these concepts. one possible explanation for this emphasis is the fact that the material for the book was drawn from a graduate seminar in programming theory for information handling. although these chapters are informative and the exposition of ideas is straightforward, they should have been omitted. the space which they occupy could have been used more successfully to explore those pl/1 features essential for information handling but excluded or treated too briefly in the present volume. a list of such topics would include: an expanded discussion of program interrupts and the on condition, a description of pl/1 record formats emphasizing the variable length record, and a guide to the use of the varying structure method of writing variable length records. the deficiencies of this text are its overemphasis of information retrieval theory and applications, and its failure to stress those features of pl/1 which would enable the student to appreciate the file-handling capabilities of the language. however, for many instructors the availability of programming examples which should be easily grasped by the library student may strongly outweigh these disadvantages. howard s. harris

guidelines for library automation; a handbook for federal and other libraries. by barbara evans markuson, judith wagner, sharon schatz, and donald black. santa monica, calif.: system development corporation, 1972. 401 p. $12.50
this handbook is the result of a 1970 study on the status of federal library automation projects which was conducted under the auspices of the federal library committee's task force on automation. the survey was carried out by the system development corporation and funded by the u.s. office of education.
it is one of two reports generated from the study data, the other report being automation and the federal library community. the study consisted of a questionnaire survey of 2,104 federal libraries of which 964 responded. of that number, 57 libraries had one or more functions automated and ten had one or more functions in various stages of development or planning. the survey revealed that, among other activities, 27 cataloging systems (presumably "cataloging" means catalog card production), 25 serials systems, and 13 circulation systems were operational. the handbook purports to help the federal librarian answer the question: "is it feasible to use automation for my library?" it attempts to do this by presenting step-by-step guidelines "from the initial feasibility survey through systems analysis and design to fully operational status." that material more or less follows a pattern of discussion on automation procedure followed by a checklist of the procedures in chart form. the areas covered include "feasibility guidelines" concerning such points as equipment, personnel, budget, and existing files; and "systems development guidelines" which include planning, analysis, design, implementation, and operation. the discussions include brief reviews of the various aspects of automation development, and statements describing the experiences of federal librarians as reported in the study. in this fashion, the reader is informed of the steps that should be considered with each aspect of automation development and, additionally, he is informed of what his colleagues have previously done about each phase and/or problem. much of this material is too general and too brief to do more than call the reader's attention to the fact that certain requirements must be met in the successful development of an automation project. a large portion of the book is taken up with descriptions of automation projects in 59 federal libraries.
this overview of the federal sector provides limited descriptive information about each library and reviews the various applications in terms of system descriptions, equipment, programs, future plans, documentation, etc. the reviews are not consistent in that not all of the above points are included in every review. this, however, is the result of the data submitted to the survey by the respondents. approaches have been provided to this survey material by automated application, form of publication, type of equipment used, and by the special features of each system. surprisingly, there is no approach by name of library. at least one very important library is not represented, i.e., livermore, but for some reason, a similar library, los alamos, is included. the final section of the book is a potpourri of information about nonfederal automation activities and is the weakest section of the volume. it includes a list of "automated libraries" that was published before and is very incomplete and poorly defined. additionally, it briefly discusses data bases, commercial ventures, and for no apparent reason suddenly includes 22 pages of information on microforms in libraries. it just as suddenly reverts back to automation and proceeds to provide 23 pages of data on input/output hardware in libraries. the final section is a selected bibliography that seems almost as aimless as the section before it. the items included "have been selected on the basis of their particular interest and applicability to federal libraries," it is stated. they range over the whole spectrum of library automation, and some items have nothing to do with automation at all. there is no index to the book as a whole and a fair number of errors are present. in summary, the book includes a limited amount of rather old information most of which is available in other places in far greater detail.
it appears that sdc had some rather weak survey data that seemed like it should be used! as a book of "guidelines" it does succeed in providing information in uncluttered and simplified form, but it is a very disappointing publication that leaves much to be desired both in substance and in organization. donald p. hammer

canadian marc; a report of the activities of the marc task group resulting in a recommended canadian marc format for monographs and a canadian marc format for serials. recommended to the national librarian. by dr. guy sylvestre. ottawa: the national library of canada, 1972.
canada's approach to the realization of a proposed format for machine-readable cataloging data was influenced by several factors. first and foremost was the fact that canada is bi-lingual, dictating the requirement for the possible representation of data in both french and english. in addition, the national library of canada wanted to continue its interaction with the library of congress and also to coordinate the development of a canadian marc with international developments. the formats recommended are for the communication of machine-readable cataloging data. the processing of the data by local libraries was not ignored. it was recognized that this could involve (1) expansion of the format to accommodate processing data (e.g., for acquisitions, serial control); and (2) the development of data format independent software for effective data storage and retrieval (e.g., a data management system with logical and physical characteristics of data described independently of specific applications software). the marc task group was established as a result of the recommendations of the conference on cataloguing standards held at the national library of canada in may 1970. the mission of the task group was to study the requirements for a format for machine-readable bibliographic records to be used in canada.
the group was not to concern itself with cataloging standards as such, since these were to be considered by the task group on cataloguing standards. the marc task group limited its attention to monographs and serials because this was the greatest need at the time. it was felt that after development of these two basic formats, i.e., monographs and serials, other formats for films, manuscripts, maps, etc., could be more logically developed. recognizing that canada has two official languages and that this creates specific bibliographic needs, the task group's first recommendation was that the national library of canada assume the responsibility for developing a distinctive canadian marc format. variations from the library of congress format are to be kept to a minimum, due to:
• economic considerations.
• dedication of the canadian library community (in common with the library of congress) to the full application of the aacr, american edition and the "version française."
• willingness of canada for continued heavy reliance upon the library of congress for answering its bibliographical needs in both the traditional way as well as in machine-readable form.
• readiness of canada to accept future bibliographic developments and amendments proposed by the library of congress, e.g., new filing rules.
it is further recommended that:
• the development of a separate canadian marc be coordinated with international developments such as isbd (international standard bibliographic description) and isds (international serials data system).
• the national library of canada adopt the precis (preserved context index system) developed for bnb for the purpose of adding subject data to marc records for canadian publications in the form of descriptors.
• any new data elements and varying levels of completeness of data introduced into the format in the future (for other media, specialized collections, or retrospective conversions) do not conflict with the basic specifications recommended for canadian marc.
several studies were made by the task group. one addressed the need for marc formats and the user requirements for such formats, keeping in mind the need for bi-lingual content in the perspective of an international marc as to data for author, title, collation and notes, geographic names, and subject. format requirements were based on a comparison of the united states and united kingdom formats and the examination of italian and other national marc formats. an intensive study was made of the proposed library of congress format for serials. the implications and requirements for a marc format to be used in conjunction with information retrieval and indexing systems were also examined. the best formats were then defined and recommended to the national librarian. the format recommended for monographs may be summarized as follows:
1. the tags are mainly from the library of congress marc-ii, with adoptions from bnb and monocle. particular attention was paid to avoiding conflict with any of the national formats. the library of congress 900 tags were expanded to provide canadian libraries the option of selecting data in bi-lingual content, i.e., the data for the secondary entry fields could be represented in either the french or english equivalent.
2. the indicators specified in the library of congress format have been retained. some additional ones from bnb and monocle have been added.
3. the subfield codes of the library of congress format have been used most often with additional ones from bnb. there is no basic conflict with the library of congress marc.
canadian marc is more specific and the more precise specifications are hospitable to the library of congress format. it was felt that the subfielding for filing values or relationships found in monocle could be met by software.
4. descriptive and bibliographic content are not altered in any way since they are dealt with by cataloging codes. however, for codified content (e.g., codes for language, geographic area, bibliographic area, intellectual level), use of standard international codes is recommended. meanwhile, library of congress marc-ii codes will be used for some fields, e.g., languages, geographic area.
for serials, it was the intention of the task group to maintain compatibility with the canadian marc format for monographs. however, it was necessary to study the proposed formats for serials issued by the library of congress, mass (a marc-based automated serials system proposed in the united kingdom by the birmingham libraries co-operative mechanisation project), and the french monocle. the proposed canadian marc format for serials has been based on the recommendation for the processing of serials issued by the task group on cataloguing standards. data elements were isolated to meet special applications such as:
1. the preparation of union lists for serial holdings with minimal bibliographic data (e.g., by broad subject groupings, by form division).
2. the bibliographic description of canadian serials for a national bibliography.
3. the development of local library in-house systems for acquisition, processing, and control of serials.
4. the preparation of a canadian serials directory incorporating a minimum of data and with a constant update facility.
this diversity of requirements led the task group to state several beliefs.
first, the isolation of data elements for local library in-house systems and the compatibility of these data elements to allow for the exchange of computer programs can best be done by allocating a tag structure in a format separate to the main serials communication format. second, there is a requirement for the relating of entries in the serial and monograph format (e.g., monographs in series which may appear in either format). if an exchange of data between the two formats is necessary, there may be a need to have an additional tag or a more extensive tagging structure for titles and series title entries. the specific recommendations for serials were that the national library should: 1. participate in the unesco proposals for an international serials data system in which the isolation of data elements for international exchange will have a direct bearing on the elements in a canadian marc serials format. 2. immediately initiate any action deemed advisable within the international proposals to provide standard serial numbers for canadian serial publications. 3. consider the preparation of a canadian serials directory as a separate project. 4. initiate a pilot project with other libraries to test the proposed canadian serials format prior to full implementation. 5. on the basis of the above recommendations, explicitly state which data elements are necessary. (the proposed format for serials recommended has those elements asterisked that the task group believed were not necessary. these are all processing control-oriented, e.g., frequency control, publication patterns, indexing, and abstracting coverage.) the report includes three comparative tables to be used in evaluating the proposed canadian marc formats. table 1 compares, for monographs, the library of congress, united kingdom, french (monocle), and italian formats against the format proposed for canada. 
table 2 compares the library of congress proposed format and the mass format for serials against the format proposed for canada. table 3 compares the canadian format for monographs against the canadian format for serials. copies of table 1 were submitted to the united states, the united kingdom, france, and italy for review and comments. the resulting revisions were not incorporated in the report since this would have delayed publication. the tagging structure, therefore, may be slightly revised when the canadian marc user's manual is finalized. however, those interested in the compatibility of the canadian formats with the library of congress formats and the implications of the canadian formats for an international marc format will find the tables sufficient.
lillian h. washington
journal of library automation vol. 5/2 june, 1972
monocle: projet de mise en ordinateur d'une notice catalographique de livre. publications de la bibliotheque universitaire de grenoble, 4. [par] marc chauveinc. 2eme ed. grenoble: bibliotheque interuniversitaire, 1972. 197 p. plus 25 annexes and errata
a review of the 1st edition of monocle appeared in jola in march 1971 (v. 4, no. 1, pp. 57-58). readers are referred to that review and to the article by m. chauveinc in the september 1971 issue of jola (v. 4, no. 3) for a description of the structure of monocle. the format has undergone little change in essentials, but many changes in detail have been made. new fields have been added (249: abridged title of periodical; 270: printer's imprint; 545: note showing title of periodical analyzed), subfield codes have been changed or added, new indicators have been created (see below), and the names (and therefore the contents) of some fields have been changed (cf. 241 and 242).
the leader has been enlarged from 19 to 24 bytes to show more exactly the address of the index related to a particular bibliographic record ( 4 new bytes) and to show the current number of fields in the record ( 2 new bytes) and the current length of the record ( 2 new bytes ) as well as the initial number of fields and the initial length. the length of the index is no longer given. thus the leader makes use of 8 new bytes and has discontinued 2 (only 18 of the original 19 were utilized). what has remained unchanged is the emphasis on coding for filing arrangement and on the use of tags to identify not only the nature of a field but its different functions and its relationship with other data. there is increased emphasis, however, on the importance of the integration and collaboration of several libraries in automation activities and, therefore, on the need for monocle to be generalized so that it is usable by institutions with other goals, hardware, and processing languages than the university of grenoble. mention is made throughout the volume of the variant approach of the bibliotheque nationale which uses monocle to prepare the bibliographie de la france. one change in the second edition is the increased awareness of the complexities involved in dealing with subrecords. the use of the subrecord technique has therefore been limited to works meeting certain requirements. the requirements are so strict that, for all practical purposes, grenoble does not use subrecords. instead, it uses secondary entries, or series headings, or contents notes. an important change has been made in the first indicator position of personal name fields ( 100, 400, 600, 700, 800, 900) which, in the 1st edition, was similar to marc. a new indicator structure has been created to facilitate construction of sort keys. a first indicator of '0' is used for forenames of saints, popes, and emperors. 
a '1' indicates a name that is to be filed exactly as given, whether it is a forename, simple surname, or multiple surname. a '2' is used for multiple surnames containing a hyphen that is to be replaced by a blank, e.g., saint-exupery. a '3' is used when a name contains a blank, apostrophe, or hyphen that is to be deleted, e.g., la fontaine. a '4' is used for complex names, whether simple or multiple, in which it is necessary to keep some blanks and/or letters and to delete others. for this purpose, monocle makes use of three vertical bars to distinguish text to be printed and used for sorting from text to be printed only from text (supplied) to be used only for sorting. since the three bars are used only in fields with 1st indicator of '0' or '4', the use of these indicators enables the program to test for them only when these indicators are present instead of in every field. the 1st indicator of '4' is used for complex arrangements utilizing the three bars in other fields as well: 110, 111, 241, 243, 245, 410, 411, 441, 443, 445 and the equivalent 6xx, 7xx, 8xx, and 9xx fields. the errors in this volume are minor. monocle still lists field 653 (proper names incapable of authorship) as an lc subject field, although this field was discontinued almost as soon as it was created so that it doesn't even appear in the 1st edition (1969) of the marc manuals. in a discussion of the use of terminals to catalog books, it footnotes 'the library' of 'ohio college' rather than 'the libraries' affiliated with the ohio college library center. the review of the 1st edition pointed out that one of the values of monocle for american librarians was the light it threw on marc. that statement still holds true. to facilitate that use, an english-language translation might be of value.
judith hopkins
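the first-indicator rules described in the review for values '1', '2', and '3' can be sketched as a small sort-key function. this is only an illustration under stated assumptions: the function name `sort_key` is hypothetical, the rules are applied to the entry element of the name alone, and indicators '0' and '4' are deliberately left unimplemented because the review does not give the exact syntax of the three vertical bars.

```python
def sort_key(name: str, first_indicator: str) -> str:
    """Build a filing key from a personal name and its first indicator.

    Hypothetical sketch of the rules summarized in the review; it is not
    taken from the monocle manual itself.
    """
    if first_indicator == "1":
        # filed exactly as given
        return name
    if first_indicator == "2":
        # multiple surname: each hyphen is replaced by a blank
        return name.replace("-", " ")
    if first_indicator == "3":
        # blanks, apostrophes, and hyphens are deleted
        return name.translate(str.maketrans("", "", " '-"))
    if first_indicator in ("0", "4"):
        # these values rely on the vertical-bar markup, whose exact
        # syntax is not described in the review
        raise NotImplementedError("vertical-bar markup is not specified here")
    raise ValueError(f"unknown first indicator: {first_indicator!r}")


if __name__ == "__main__":
    for name, indicator in [("dupont", "1"), ("saint-exupery", "2"), ("la fontaine", "3")]:
        print(indicator, repr(sort_key(name, indicator)))
```

testing the indicator before touching the text mirrors the point made in the review: the program only has to look for special markup in fields whose indicator announces it.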
information retrieval and library automation
this monthly review is unique in its extensive u.s. and international coverage of the many specialized fields which contribute to improved information systems and library services for science, social science, technology, law and medicine; these fields include: computer technology and systems, library science and technology, library administration, photographic technology and microforms, facsimile and communications, library and information networks, reprographic and printing technologies, copyright issues, indexing systems, machine-aided indexing and abstracting, documentation and data standards, databanks and analysis centers. subscription is $24.00 per year (overseas subscribers add $6.00). orders and inquiries should be directed to: lomond systems, inc., mt. airy, maryland 21771
48,222 strong ... and still growing! f.w. faxon company, the only fully automated library subscription agency in the world, has an ibm 370/145 computer currently listing 48,222 periodicals for your library. our 'til forbidden service, the automatic annual renewal of your subscriptions, provides fast, accurate, and efficient processing of your orders and invoices. send for free descriptive brochure and annual librarians' guide. library business is our only business since 1886. f. w. faxon co., inc., 15 southwest park, westwood, massachusetts 02090 tel: (800) 225-7894 (toll free)
the american library association announces the exclusive distribution here in the united states of non-book materials: the organization of integrated collections jean riddle, shirley lewis, janet macdonald, in consultation with the technical services committee of the canadian library association. non-book materials published by the canadian library association (1970) is now being exclusively distributed in the united states by the american library association.
these officially approved rules for the cataloging of audiovisual materials were designed to be compatible with parts i and ii of anglo-american cataloging rules (ala 1970). though written with the school library in mind, the principles can be applied to any library system which houses books and other media together and has a single, unified list of holdings. color coding, organization of media, rules for descriptive cataloging, use of illes, storage and media destination are covered in addition to discussing 20 different media. the glossary of media designations in the book is an attempt to standardize terminology within the media industry. isbn 0-8389-3129-4 (1971) $3.50 american library association, 50 e. huron st., chicago 60611
information technology and libraries at 50: the 1980s in review
mark dehmlow
information technology and libraries | september 2018
mark dehmlow (mdehmlow@nd.edu) is director, library information technology at the hesburgh libraries, university of notre dame.
my view of library technology in the 1980s through the lens of journal of library automation (jola) and its successor information technology and libraries (ital) is a bit skewed by my age. i am a gen-xer and much of my professional perspective has been shaped by the last two decades in libraries. while i am cognizant of our technical past, my perspective is very much grounded in the technical present. in a way, i think that context made my experience reviewing the 1980s in jola and ital all the more fun. the most pronounced event for the journal during the 1980s was the transition from the journal of library automation to information technology and libraries between 1981 and 1982.
the rationale for this change is perhaps best captured through the context set in the guest editorial “old wine in new bottles?” by kenney in the first issue of ital: “proliferating technologies, the trend toward integration of some of these technologies into new systems, and rapidly increasing adoption of technology-based systems of all types in libraries .…”1 the article grounds us in the anxieties and challenges of the decade surrounding an accelerating change in technology. libraries were evolving from implementing systems of “automation,” a term that focuses more on processes, to broadening their view to “information technology,” which is more of a discipline — an ecosystem made up of technology, process, systems, standards, policies, etc. in a way, the article acknowledges the departure of libraries from their adolescent technological pasts to their young adult present for which the 80s would be the background. perhaps no other event is more technologically significant during the decade than the standardization of the internet. while the concept of networks and a network of networks, e.g. the internet, was conceptualized in the 1960s, it was the development of the tcp/ip network protocol that is the most consequential event because it made it possible to interconnect computer systems using a common means of communication. while the internet wouldn’t become ubiquitously popularized until the early 1990s with the emergence of the world wide web, the internet was active and alive well before that and, in its early state, was critical to the emergence and evolution of library technologies. from the first issue through the last of the 1980s, ital references the term “online” frequently. the “online” of the 80s however was largely text based, where systems were interconnected using lightweight terminals to navigate browse and search systems. it was not unlike a massive “choose your own adventure book,” skipping from menu to menu to find what you were looking for. 
throughout my review, i was happy to see a small, but significant, percentage of international articles that focused on character sets, automation, and collection comparisons in countries like kuwait, australia, china, and israel. diversity is a cornerstone for lita and ala and the journal has continued this trend to encourage the submission of articles from outside of the u.s. the 1980s volumes of ital traversed a plethora of topics ranging from measuring system performance (efficiency was important during a time when computing was relatively slow and expensive) to how to use library systems to provide data that can be used to make business decisions. over the decade, there was a significant focus on library organizations coming to terms with new technology, e.g. the automation of circulation, acquisitions, and the marc bibliographic record. there were several articles that discussed the complications, costs, and best practices for converting card-catalog metadata to electronic records and several other articles that detailed large barcoding projects. the largest number of articles on a single topic focused on the automation and management of authority control in automated library systems. there were articles on the emergence of research databases often delivered as applications on cd-roms which would then be installed on microcomputers. the term “microcomputer” was frequently used because the 80s saw the emergence of the personal computer in the work environment, a transformative step in enabling staff and patrons alike to access online library services and applications to support their research and work. electronic mail was in its infancy and became a novel way to share information with end users across a campus. several articles focused on the physical design of search terminals and optimizing the ergonomics of computers.
there were also many articles about designing the best opac interface for users, ranging from how to present bibliographic records to users, to what information should be sent to printers, to early efforts to extend local catalogs with article-based metadata. many of these topics have parallels today. instead of only analyzing statistical usage data we can pull from our systems, libraries are striving to develop predictive analytics, leveraging big-data from across an assortment of institutions. i found the 1988 article “investigating computer anxiety in an academic library,” which examines staff resistance to technology and change to be as apropos today as it was then.2 cd-roms have gone the way of the feathered and overly hairsprayed coifs of the 80s and have largely been superseded by hard drives and solid state flash media that can hold significantly more data and can transfer data more rapidly. the current decade of the 2010s has been dedicated to providing the optimal search experience for our end users as we have broadened our efforts to the discovery of all scholarly information, not just what is held in our collections. and of course, instead of adding a few article abstracting resources to our catalogs in an innovative, but difficult to sustain manner, the commercial sector has created web-scale mega-indexes that are integrated with our catalogs and offer the promise of searching a predominant amount of the scholarly record. there was a really interesting thread of articles over the decade that traced the evolution of the ils in libraries. there were articles about how to develop automation systems for libraries, the various functions that could be automated — cataloging, circulation, acquisitions, etc. — and evaluation projects for commercial systems. if the 2000s was the era of consolidation, the early 1980s could easily represent the era of proliferation. 
the decade nicely traces the first two generations of library systems, starting with university-developed automation and database backed systems and the migration of many of those systems to vendors. the northwestern university-based notis system was referenced a lot and there were some mentions of oclc’s acquisition and distribution of the ls/2000 system. this part of our automation history is a palpable reminder that libraries have been innovative leaders in technology for decades, often developing systems ahead of the commercial industry in an effort to meet our evolving service portfolios. this early strategy for libraries mirrors recent developments of institutional repositories, current research information systems (criss), and faculty profiling systems like vivo that were developed before the commercial sector saw the feasibility of commercialization. the cycle of selecting and implementing a new integrated library system is something that many organizations are faced with again. the only difference is that the commercial sector has entered into the development of the 4th or 5th generation of integrated library systems, many of which are coming with data services integrated and most of them are implemented in the cloud. in addition to seeing our technically rudimentary past, there were several articles over the decade that discussed especially innovative ideas or that anticipated future technologies. a 1983 article by tamas doszkocs which was written long before the emergence of google is an early revelation that regular patrons struggle to use expert systems that require normalized and boolean searching strategies. not surprising is the conclusion that users lean organically toward natural language searching, but even then we were having the expert experience vs.
intuitive experience debate in the profession: “the development of alternative interfaces, specifically designed to facilitate direct end user interaction in information retrieval systems, is a relatively new phenomenon.”3 the 1984 article, “packet radio for library automation,” is about eliminating the challenges of retrofitting buildings with cabling to connect lan networks by using radio based interfaces.4 could this be an early precursor to wifi? there is the 1985 article titled “microcomputer based faculty-profile” about using a local database management application on a pc to create an index of faculty publications and university publishing trends.5 this is nearly three decades before the popularization of the cris and faculty profile system. in 1986, there is an article “integrating subject pathfinders into a geac ils: a marc-formatted record approach,” an article that made me think about how library websites are structured, and the current trend of developing online research guides and making them discoverable in our websites as a research support tool.6 and finally, i was struck by the innovative approach in 1987’s “remote interactive online support,” wherein the authors wrote about using hardware to make simultaneous shell connections to a search interface so they could give live search guidance to researchers remotely. 7 we take remote technical support for granted now, but in the late 80s, this required several complicated steps to achieve. the 80s were an exciting time for technology development and a decade that is rife with technical evolution. 
i think this quote from the article “1981 and beyond: visions and decisions” by fasana in the journal of library automation best elucidates the deep connection between the past and the future: “library managers are currently confronted with a dynamic environment in which they are attempting simultaneously to plan library services and systems for the future, and to control the rate and direction of change.”8 this still holds true. library managers are still planning services in a rapidly changing environment, except, i like to think, we have learned to live with change whose rate and direction we cannot control.
https://doi.org/10.6017/ital.v37i3.10749
1 b. kenney, “guest editorial: old wine in new bottles?,” information technology and libraries 1 no. 1 (march 1982), p. 3.
2 maryellen sievert, rosie l. albritton, paula roper, and nina clayton, “investigating computer anxiety in an academic library,” information technology and libraries 7 no. 3 (september 1988), pp. 243-252.
3 tamas e. doszkocs, “cite nlm: natural-language searching in an online catalog,” information technology and libraries 2 no. 4 (december 1983), p. 364.
4 edwin b. brownrigg, clifford a. lynch, and rebecca pepper, “packet radio for library automation,” information technology and libraries 3 no. 3 (september 1984), pp. 229-244.
5 vladimir t. borovansky and george s. machovec, “microcomputer based faculty-profile,” information technology and libraries 4 no. 4 (december 1985), pp. 300-305.
6 william e. jarvis and victoria e. dow, “integrating subject pathfinders into a geac ils: a marc-formatted record approach,” information technology and libraries 5 no. 3 (september 1986), pp. 213-227.
7 s. f. rossouw and c. van rooyen, “remote interactive online support,” information technology and libraries 6 no. 4 (december 1987), pp. 311-313.
8 paul j. fasana, “1981 and beyond: visions and decisions,” journal of library automation 13 no. 2 (june 1980), p. 96.
high school library data processing
betty flora: librarian, leavenworth high school, leavenworth, kansas and john willhardt: data processing instructor, central missouri state college, warrensburg, missouri.
planning and operation of an automated high school library system is described which utilizes an ibm 1401 data processing system installed for teaching purposes. book ordering, shelf listing and circulation have been computerized.
this paper presents an example of a small automated high-school library system which works efficiently. a great deal of emphasis to date in library automation has been on large university and college libraries, but the relatively few schools that have pioneered in the field of school library automation have demonstrated its feasibility and its potential. data processing is economically within the realm of large and medium-sized school districts. the port huron district, port huron, michigan, has an accounting machine, keypunch and verifier; among the operations performed are printing purchase orders and book cards. the port huron staff consists of one professional librarian, two clerks and two part-time working students. evanston township high school, evanston, illinois, has an automated library system processed with an ibm 1401 computer. other high schools using library data processing are the oak park-river forest high school in illinois; beverly hills, california; west hartford, connecticut; weston, massachusetts; and the burnt hills-ballston lake and bedford-mt. kisco school districts in new york state (1). there are a small number of high schools and vocational schools in kansas and missouri that have data processing equipment which is used for teaching purposes. names and addresses of these schools may be obtained from the missouri director of vocational education at jefferson city, missouri, and from the kansas state supervisor of technical training at topeka, kansas.
introduction
leavenworth senior high school, leavenworth, kansas, a campus-style school comprising six buildings, has approximately 1350 students. the library, located in the main academic building, is presently being remodeled and enlarged. it contains approximately eighteen thousand volumes, including the professional collection; and fifteen hundred to two thousand new volumes are added each year. the library staff consists of one qualified librarian, two full-time clerical assistants, and twenty student assistants, each of the latter working one class period a day. the library is, in the true sense of the term, a media center. a mobile listening center is available, and there are large collections of recordings, cartridge and reel tapes, film strips, films, microfilms, reproductions of paintings, educational games, magazines and vertical file material. fortunately, there is a consistently substantial budget of more than eight dollars per student, including some federal funds, which makes additions to the collection possible in stable development. data processing at leavenworth high school was made possible by the vocational education act of 1963, which provided for the secretary of health, education, and welfare to enter into agreements with the several state vocational education agencies to provide such occupational training as found to be necessary by the secretary of labor (2). under the provisions of the act, federal money is allotted to the states, which in turn allot a portion of this money to various school districts; a school system receiving such money must lease or purchase data processing equipment and use it mainly for teaching purposes. a data processing curriculum was initiated in the school year 1964-65 at leavenworth high school, under conditions and regulations set up by the state supervisor of technical training which gave first priority in the use of the data processing equipment to teaching.
this has been adhered to strictly at leavenworth high school; the equipment is used over half of the school day for teaching purposes and adult education courses in data processing are offered at night. class time consists of lecture and application, with students having opportunity to operate, wire, program and test problems. data processing classes are scheduled first in the computer room; administrative and library operations are scheduled to be processed in the remaining hours during the school day and after school, each operation being assigned a specific time. although unit record equipment was initially leased, plans for a small computer were included in the original decision to offer data processing courses. equipment, plus salaries to those conducting the program, constitute a major investment for a medium-sized public high school. consequently, although the classes are a valuable addition to the vocational training area of the curriculum, as many applications as possible are made of school operations, such as enrollment, record keeping, grade reports and payroll, in order to further justify the cost. for this reason, the superintendent of the leavenworth school system suggested that the library might, by using data processing in many of its procedures, both support the data processing instructional program and increase its own effectiveness.
methods and materials
to develop a system requires systems analysis, which necessitates a clear formulation of purposes and requirements independent of any particular design for implementation (3); and the development of procedural applications to be processed on a computer system should be a joint responsibility of both the systems staff and line management (4). furthermore, any conversion of library procedures to automation should be carefully planned in advance.
proceeding in the fullest cooperation with a view to mutual benefits, the librarian at leavenworth and the head of data processing spent many hours working out the details of their joint effort. the librarian explained her needs and suggested methods of achieving the desired objectives. for his part, the head of data processing evaluated the possibilities from a technical point of view and suggested methods of achieving the desired objectives. together they worked out an initial plan, and the various phases were then programmed. the leavenworth data processing library system was set up to 1) order all new library books; 2) complete shelf cards and book checkout cards; 3) run shelf card listings; 4) correct and file shelf cards; 5) reproduce book checkout cards for books checked out; 6) run first and second overdue notices; and 7) provide library inventory, book count lists and book catalogs. all the lists, notices, and reproduced cards are done on the 1401 computer; computer programs for these operations are written in autocoder. the amount of computer time required for the processing of library data and reports is comparatively small in relation to other operations of the data processing department and was set up to run partly in the daily schedule and partly after school. time required for preparation of information for the computer is significant and must be scheduled more carefully. again, part of this time is fitted into the daily schedule and part of it is accomplished after classes. the high school leases the following ibm data processing equipment: two 024 card punches, one 026 printing card punch, one 082 sorter, one 548 interpreter, one 085 collator, and one 1401 computer with 4k and one disk storage drive. the 1401 computer consists of the 1401 central processing unit, a 1402 card reader punch, a 1403 printer and a 1311 disk storage drive. 
the following cards were developed for the procedure: shelf card a is punched from lists of books to be ordered and only the following information and columns are punched: author name (columns 14-35), title (columns 36-71), copyright date (columns 72-73), and purchase date (columns 79-80). when the book is received, this card is completed with the following information: shelf letter (column 1), dewey decimal number (columns 2-7), author number (columns 8-13) and accession number (columns 74-78). shelf card b is punched and filed behind shelf card a. only the following information and columns are punched: price (columns 8-13), publisher (columns 36-65), and an x-punch in column 80. the book checkout card (figure 1) is first reproduced from the completed shelf card a, and after that from book checkout cards when books have been checked out of the library. this card contains the shelf letter (column 1), dewey decimal number (columns 2-7), author number (columns 8-13), author name (columns 14-30), title (columns 31-66), student number (columns 68-73), accession number (columns 74-78), and an x-punch in column 80.
[fig. 1. book checkout card. the card has spaces for book title, author, and student number, and carries the printed statement: "i accept responsibility for this book and should this book be lost, destroyed or stolen while checked out to me, i will pay the replacement cost of the book. i agree to pay the fine for overdue books as follows: 1 to 5 days overdue 2¢ per day; 6 to 10 days overdue 5¢ per day; over 10 days overdue 10¢ per day."]
a student finder card locates the student's name and parent's name and address on the computer disk pack. the biggest initial task was keypunching an ibm card for each book in the library, which at that time comprised 13,000 books. it was done by data processing students in the high school, working occasionally during class, but mostly after class and on saturdays on a voluntary basis.
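the column assignments given above for shelf card a amount to a fixed-width, 80-column record. a minimal python sketch of that layout follows; the names `SHELF_CARD_A`, `build_card`, and `parse_card` and the sample values are hypothetical, chosen only for illustration, and the original system was written in autocoder for the 1401, not in python.

```python
# Column ranges (1-based, inclusive) for shelf card A as listed in the text.
SHELF_CARD_A = {
    "shelf_letter": (1, 1),
    "dewey_number": (2, 7),
    "author_number": (8, 13),
    "author_name": (14, 35),
    "title": (36, 71),
    "copyright_date": (72, 73),
    "accession_number": (74, 78),
    "purchase_date": (79, 80),
}

CARD_WIDTH = 80


def build_card(fields: dict, layout: dict) -> str:
    """Lay field values into an 80-column card image, padded with blanks."""
    columns = [" "] * CARD_WIDTH
    for name, (start, end) in layout.items():
        width = end - start + 1
        # truncate or pad so every field fills exactly its own columns
        value = fields.get(name, "")[:width].ljust(width)
        columns[start - 1:end] = value
    return "".join(columns)


def parse_card(card: str, layout: dict) -> dict:
    """Split an 80-column card image back into named fields."""
    card = card.ljust(CARD_WIDTH)[:CARD_WIDTH]
    return {
        name: card[start - 1:end].strip()
        for name, (start, end) in layout.items()
    }


if __name__ == "__main__":
    # Sample values are invented for the illustration.
    card = build_card(
        {
            "shelf_letter": "r",
            "dewey_number": "025.3",
            "author_number": "f632",
            "author_name": "flora, betty",
            "title": "high school library data processing",
            "copyright_date": "69",
            "accession_number": "13001",
            "purchase_date": "69",
        },
        SHELF_CARD_A,
    )
    print(repr(card))
    print(parse_card(card, SHELF_CARD_A))
```

keeping the layout in one table means the book checkout card, whose author and title fields occupy different columns, could be described by a second table of the same shape rather than by new code.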
toward the end of the second semester, many of the procedures had been reviewed and discussed with students in the data processing classes as part of the vocational program.

journal of library automation vol. 2/1 march, 1969

[fig. 2. book order procedure. flowchart steps: a and b card (first box partly illegible); interpret cards; run listing on 1401; re-run list.]

[fig. 3. new book processing. flowchart steps: interpret shelf card; reproduce shelf card into book checkout card; interpret book checkout cards.]

[fig. 4. book checkout procedure. flowchart steps: receive book checkout cards from library; reproduce new book checkout cards; interpret new book checkout cards; return old and new book checkout cards to library.]

[fig. 5. overdue notice procedure. flowchart steps: cards for overdue books; send finder cards to data processing; run address labels; return labels and finder cards to library.]

book order (figure 2)

the library furnishes the data processing department with request cards or lists of books to be ordered, giving author name, title, copyright date, price, publisher, and purchase date (year). data processing punches two cards for each book according to shelf cards a and b. these cards and batches must be kept in the order received from the library. the cards are interpreted, checked for correct punching and listed by batch. the library must check the number of copies ordered and the total amount of each group or batch. after verification and corrections, the cards are returned to data processing for rerunning of the number of copies necessary to send with the purchase order.
new book processing (figure 3)

when new books are received, the library staff discards shelf card b and writes the following information on shelf card a for punching in the columns indicated: shelf letter in column 1 (b for biography, k for kansas, p for professional, r for reference, s for story collection, or a blank which indicates fiction); dewey decimal number in columns 2-7; author number in columns 8-13; and accession number in columns 74-78. these columns are interpreted on the 548 interpreter. shelf card a is used to reproduce the book checkout card. shelf cards are block sorted on column 1; each group is then sorted by author number and dewey decimal number. individual cards must be hand filed into the shelf list. the shelf list can be used to provide classification listings, inventory listings, library book counts and book catalogs. book checkout cards are interpreted, sorted by author name (columns 14-23 alpha), returned to the library and filed in the respective books.

book checkout card reproduction (figure 4)

as books are checked out of the library, the book checkout card (figure 1) is signed by the student and his number is written on it. once a week accumulated book checkout cards are sent to data processing to be reproduced into new book checkout cards which are interpreted and merged behind the old book checkout cards. each week's cards are kept separately. the old cards are for books due in the library in two weeks. the new cards are inserted in these books as they are returned and the old ones placed in a separate file for library circulation statistics.

overdue notices (figure 5)

the library is provided with a deck of student finder cards (one for each student), with student name, number and finder number on the card and in the address file on the disk. when books are overdue, finder cards are pulled by the library staff and sent to data processing, where they are sorted by a disk accession number.
address labels are run on the 1401 computer for those students with overdue books. these labels are presently attached to pre-printed envelope overdue notices (figure 6), but it is planned to replace the envelope with a continuous-form post card. the first notice is addressed to the student at his home and the second to his parents.

[fig. 6. overdue notice. printed text: "leavenworth senior high school library. if you have returned your overdue library materials disregard this notice. if not, please come to the library at your earliest convenience. 1 to 5 days overdue, 2¢ per day; 6 to 10 days overdue, 5¢ per day; over 10 days overdue, 10¢ per day."]

discussion

the book checkout and overdue notices procedures were the first concrete ones developed. these were initiated during the 1965-66 school year, and have proved to be quite successful in saving time and effort. one of the most useful purposes of the leavenworth system is that any portion of the shelf list can be easily provided for an instructor who wishes to assign special readings. also the system has simplified and accelerated preparation of lists for inventory purposes. the ordering process gives the librarian the opportunity to check the order lists before forwarding them to the business manager; this improves the accuracy of the order.

standardization of procedure and operation is essential for efficiency (5,6). basically the leavenworth procedure utilized two types of cards, shelf card a and the book checkout card, which are very similar in format. shelf card a is initiated when ordering books and is used to reproduce book checkout cards and to make shelf listings, inventory and book count listings. moreover the system was designed on the basis of having a minimum of skilled clerical workers. student help is used for correcting and filing shelf cards.
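the block sort described under new book processing can be sketched in python. the text says shelf cards are block sorted on column 1 and each group is then sorted by author number and dewey decimal number; it does not say which of the two is the major key. the sketch assumes the usual shelf order (dewey number first, author number within it) and imitates successive sorter passes, minor key first. the `ShelfCard` type and its field names are invented for illustration, and comparing dewey numbers as strings assumes they are punched in a fixed, left-aligned format.

```python
from typing import NamedTuple


class ShelfCard(NamedTuple):
    shelf_letter: str   # column 1; a blank means fiction
    dewey_number: str   # columns 2-7
    author_number: str  # columns 8-13


def sort_shelf_list(cards):
    """Order cards by shelf letter, then Dewey number, then author number.

    Mimics successive sorter passes: minor key first, major key last.
    Python's sort is stable, so each later pass preserves the order
    established by the earlier passes among cards with equal keys.
    """
    ordered = sorted(cards, key=lambda c: c.author_number)
    ordered = sorted(ordered, key=lambda c: c.dewey_number)
    ordered = sorted(ordered, key=lambda c: c.shelf_letter)
    return ordered
```

because a blank sorts ahead of letters, fiction cards come out as the first block, followed by the lettered collections.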
the ability to provide a book catalog in the future is an advantage. a book catalog need not be confined to one area and may be done in multiple copies. different editions of a work may be more readily seen and compared on a printed page than in a card catalog, where only one entry can be examined at a time. also a book catalog may concentrate in a single easily handled volume entries which would occupy several heavy drawers in a card catalog (7). one of the problems associated with developing a system like the one here described is that of communication. as in all technical and professional areas, a specialized terminology develops, a kind of esoteric jargon which confuses meanings and impedes understanding. this difficulty naturally diminishes as each party to the cooperative effort becomes more familiar with the terminology of the other, and a little plain talking and clear thinking will soon eliminate it. the effectiveness of an automated library program depends, of course, upon the unqualified cooperation between the library and the data processing department. the librarian must establish a reasonable and acceptable schedule of work upon which the data processing department can depend, and she must assure that library material essential to that work is delivered according to schedule. conversely, the data processing department must undertake to complete the work promptly and accurately. evaluation certainly one of the most significant benefits of automation is the great saving of time. tedious and detailed tasks essential to the efficient operation of any library, tasks which formerly required many hours to complete and which had by their natures to be repeated periodically, are accomplished in a fraction of the time. consequently, the librarian is freed for more professional work; most importantly, she has more time to give to the students and their problems, which should be, above all, her first concern. 
the value of the leavenworth high school library system lies not only in greater accuracy and saving of time for the librarian and her staff, but also in the opportunity it provides for student help to learn and operate a system. it is apparent, finally, that automation, properly applied, can be an invaluable asset to the school library. like all systems it depends, in the final analysis, upon the human factors involved. so long as interests are mutual, and so long as efforts are equal, the library and data processing departments can work effectively together for the benefit of both.

acknowledgments

mr. jack spear, ksu, manhattan, kansas, advised on the initial planning of the system. the authors received cooperation and encouragement from mr. gordon yeargan, superintendent of schools in leavenworth, and mr. dino spigarelli, principal of leavenworth high school. mr. fred buis, data processing instructor at the high school, helped with the preparation of this paper and is continuing to develop the potential of the system.

references

1. mccusker, sister mary lauretta: "implications of automation for school libraries part 2," school libraries, (fall, 1968), 15-22.
2. united states department of health, education and welfare: vocational and technical education (washington: government printing office, 1964).
3. markuson, barbara evans, ed.: libraries and automation (washington: library of congress, 1964).
4. elliott, orville c.; wesley, robert s.: business information processing systems (homewood, illinois: richard d. irwin, inc., 1968).
5. laden, h. n.; gildersleeve, t. r.: system design for computer applications (new york: john wiley & sons, inc., 1963).
6. dougherty, richard m.: "manpower utilization in technical services," library resources and technical services, 12 (winter, 1968), 79-80.
7. kingery, robert e.; tauber, maurice f., eds.: book catalogs (new york: the scarecrow press, inc., 1963).
multipurpose cataloging and indexing system (cain) at the national agricultural library

journal of library automation vol. 5/1 march, 1972

vern j. van dyke: chief, computer applications, national agricultural library, and nancy l. ayer: computer systems analyst, national agricultural library, beltsville, maryland.

a description of the cataloging and indexing system (cain) which the national agricultural library has been using since january 1970 to build a broad data base of agricultural and associated sciences information. with a single keyboarding, bibliographic data is input, edited, manipulated, and merged into a permanent base which is used to produce many types of printed or print-ready end-products. presently consisting of five subsystems, cain utilizes the concept of controlled authority files to facilitate both information input and its retrieval. the system was designed to provide maximum computer services with the minimum of effort by users.

introduction

this article describes an interactive system in operation at the national agricultural library which with a single keyboarding of data provides all necessary catalog cards, book catalogs, bibliographies, and related internal reports, as well as a computer data base for information retrieval. primarily in batch mode, the system can operate on an ibm 360 with 256k memory using os, six magnetic tape drives, a card reader, and a line printer.

background

the national agricultural library (nal) as one of the three national libraries is responsible for the collection and dissemination of agricultural information on a national and worldwide basis. in this pursuit publications are obtained through gifts, exchange agreements, and by purchase of items in many languages. titles of those items in non-roman alphabets are transliterated and all non-english titles are translated. the volume of publications handled by nal in 1969 was in the neighborhood of 600,000, of which approximately 275,000 were added to the collection. this volume was sufficiently large to provide a serious problem to nal's staff and thus computer assistance was clearly a logical and necessary arrangement. in 1964 a computer group was formed in nal; it became active in developing systems to prepare voluminous indexes for the bibliography of agriculture, the complete pesticides documentation bulletin, and the categorical and alphabetical issues of the agricultural/biological vocabulary. during 1969 these systems were consolidated and expanded so as to process all input data within one coordinated set of parameters. in january 1970 the new cataloging and indexing (cain) system was implemented.

system design

cain is a complex and comprehensive computer system which has been engineered to handle up to five (5) simultaneous but separate users who share the same controlled authority files. the basic precept in development of computer applications at nal is to make input and output simple and convenient for the users, with the computer assuming as much detail and data manipulation as is technically feasible. at nal the current users providing input data are the new book section, cataloging, indexing, and agricultural economics. operating in parallel, cain also services the herbicides data base of the agricultural research service; the international tree disease data base of the forest service; and in 1971 will be installed in the library of the technion-israel institute of technology in haifa, israel.

the master data record is variable in length with a fixed portion of 173 characters and up to fifty-seven additional segments of 65 characters each. the fixed portion includes basic data plus a directory of data contained in the variable portion. data elements in cain are:

a. file code: delineates the various files.
b. identification number: on cataloged items this embodies the accession number.
all identification numbers include the year of accession, a parallel run code plus a unique control number.
c. source code.
d. user codes: specific identification of up to five users.
e. english indicator: language of text.
f. translation code: availability of an english translation.
g. language, if other than english.
h. proprietary restrictor: identifies classified records.
i. title tracing indicator: for catalog cards.
j. main entry: designates main entry if not normal sequence.
k. document type: whether journal article, monograph, serial, etc.
l. filing location: if other than in the library stacks.
m. categories: two. general area of coverage of subject matter.
n. new book description: if the title is not sufficiently explanatory.
o. titles: three types: (1) vernacular or short, (2) alternate or holdings, and (3) translated title (english).
p. personal authors: up to 10. names plus identifying data.
q. corporate authors: maximum of two.
r. major personal author affiliation.
s. abbreviated journal title if item is a journal article; imprint if monographs and serials.
t. collation/pagination.
u. date: two: search date, and date on publication if different.
v. call number.
w. subject terms: may be nested. up to 45.
x. general notes.
y. special purpose numbers: patent, grant, analysis, contract, technical, or report.
z. series statement.
aa. abstract/extract.
bb. tracings not otherwise normally generated by the system.
cc. nonvocabulary cross-references.

the total number of individual elements is limited only by the maximum record size. the nal-produced software is written in cobol. the data base is maintained on tape which is nine-track, 800 bpi, blocked 2, in ebcdic, with standard ibm 360 header and trailer labels. the total system presently consists of forty programs, some of which are multipass. in addition, throughput is sorted twenty-five times during the full computer run.
these, of course, include the search and retrieval programs and sorts which are run only on request. the ultimate system which nal is working toward and for which the basic design is already substantially complete is an on-line full library document locator and control system which may be linked via dial-up service to an international and national science and technology information network. each portion of cain is developed with the broader picture in mind. it was this factor which weighed heavily in selecting cathode ray tube (crt) terminals for the proposed data gathering subsystem inasmuch as crt's will be the predominant type of terminal in the future network. for convenience in discussion, the system will be described by its subsystems: data gathering, edit and update, publication, search and controlled authorities.

data gathering subsystem

from its inception the input to cain was in the form of punched cards, a method which has proved to be slow and error prone. in order to eliminate double keyboarding and excessive time lag, as well as to reduce the error rates, it was decided to perform this input function in the library with trained library personnel. to accomplish this, nal proposes to implement an "on-line" type of input subsystem using crt's. although this form of entry is not yet in use, the subsystem should operate substantially as follows. the documents are to be marked by catalogers and indexers and passed to library technicians who will enter the data through crt's into an on-line storage file. to do this, the technician will call from the hardware prestored formats as desired and fill in the data elements required. these formats use english terms and for the most part call for data rather than codes. in addition, data are to be entered in normal upper- and lowercase without diacritics, thus improving visual scanning for errors. an average of four formats will be needed to enter one item.
by use of an algorithm, the system would store formatted records for each id in such a manner as to permit recall singly or collectively. the physical documents are then to be passed on to an editor who can recall any or all formatted records for review. with the document in hand, stored records will be reviewed and corrected if necessary. when acceptable, the records will then be transmitted to magnetic tape. variations on this procedure could include input direct to tape, storage to tape without recall to a crt by an editor, cancellation of actions, and a direct purge of the entire storage file without loss of the controlling matrix. the expertise of the library technicians inputting the data should insure far more accuracy than could be expected from multihandling and multikeyboarding. in addition the system has been designed to accomplish basic pre-cain editing of such factors as numeric or alphabetic characters in certain fields and overall lengths of the fields. errors in these categories will be promptly identified by the computer by a blinking feature on the crt screen. another major benefit of this direct approach is that documents can be processed through the system so as to reach the stacks twenty-four days faster than under the current keypunch method. magnetic tapes created by the data gathering system will be periodically converted from ascii to ebcdic and processed into the edit and update subsystem of cain. the present nal time schedule for updating master cain files is weekly. this is not a requirement of the system but an administrative decision based on other deadlines. the data gathering system as prescribed by nal will be composed of sixteen crt's, a large on-line storage file , and one nine-track 800 bpi magnetic tape drive. this configuration will be either a hard-wired "black-box" approach, or controlled by a dedicated mini-computer. 
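the ascii-to-ebcdic conversion step mentioned above can be illustrated with python's standard codecs. this is a minimal sketch and not the conversion program nal used; the article does not say which ebcdic code page applies, so the choice of `cp037` here is an assumption, and the function names are invented.

```python
# Assumption: cp037 is one common EBCDIC code page shipped with Python.
# The article does not name the code page actually used on the tapes.
EBCDIC_CODEC = "cp037"


def ascii_to_ebcdic(ascii_bytes: bytes) -> bytes:
    """Convert an ASCII-coded record to EBCDIC bytes."""
    text = ascii_bytes.decode("ascii")  # raises if the input is not ASCII
    return text.encode(EBCDIC_CODEC)


def ebcdic_to_ascii(ebcdic_bytes: bytes) -> bytes:
    """Convert an EBCDIC record back to ASCII bytes."""
    return ebcdic_bytes.decode(EBCDIC_CODEC).encode("ascii")
```

decoding strictly as ascii first means a stray non-ascii byte on the input tape is reported as an error instead of being silently mapped to some other character.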
the hardware prescribed for this subsystem is not included as a requirement of cain inasmuch as transactions can be entered on 80-column cards if desired. an additional feature of this subsystem will be the generation of management information feedback. this will encourage elimination of manual counts and provide accurate throughput volume statistics on a timely basis. through this means the supervisor will be in a better position to evaluate workload, individual performance, and hardware utilization.

edit and update subsystem

the first step in the acceptance of transactions is a thorough validation of each data element. the computer is used to relieve librarians of the voluminous and time-consuming edit of many individual elements having predetermined limits. thus, only a cursory review of the proof-listed records is necessary by a librarian before acceptance. the system cannot detect, of course, logical or typographical errors, but it can determine the absence of necessary information, codes in invalid ranges, and the incorrect placement of data. elements for which the system supplies authority files are not only verified against the file but also additional transactions are generated from the authority file to assure uniformity in output. this also eliminates the necessity for librarians having to enter those elements which have a direct predictable relationship to another element. further validations are performed at the point of building new records or updating records already in the master file. the two "master" files are (1) the temporary set of unselected records and (2) the permanent set of those records which have been approved and selected for publication in some form. data elements specified as required within each record are reviewed. if one or more is missing, the system refuses to approve this record, and a notice is produced concerning this reversal of human input.
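the required-element review described above can be sketched as follows. this is an illustration only: the article does not list which elements cain treats as required, so the `REQUIRED_ELEMENTS` tuple, the field names, and the wording of the notice are all hypothetical.

```python
# Hypothetical list: the article does not say which elements are required.
REQUIRED_ELEMENTS = ("identification_number", "document_type", "title")


def review_record(record: dict):
    """Return (approved, notice) for a record submitted for approval.

    A record with any required element missing or blank is not approved,
    and a notice reports that the submitted approval was reversed.
    """
    missing = [
        name for name in REQUIRED_ELEMENTS
        if not str(record.get(name) or "").strip()
    ]
    if missing:
        record_id = record.get("identification_number") or "(no id)"
        notice = "record {}: approval reversed, missing {}".format(
            record_id, ", ".join(missing)
        )
        return False, notice
    return True, None
```

a complete record passes with no notice, while a record lacking a title is refused and the notice names the missing element, which corresponds to the error listing produced by the subsystem.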
fields can be deleted, in whole or in part, replaced or added. three types of output from this subsystem are:

• new updated master files. those which have been added or altered during this update run are proof-listed for cursory review by a team of professional librarians. corrections and/or approvals are submitted in a subsequent update run.
• activity notices. every action whether submitted by the user or system-generated which has been accepted for processing is reported.
• error notices. all error and warning messages from this subsystem are compiled into one listing. this includes errors on individual elements, system-discovered errors of omission, and warnings of computer overriding of submitted actions.

through the use of control cards various handling options are possible. one of these is proof-listing of a specific range or ranges of masters by identification numbers or dates. subject headings are assigned by professional librarians for monographs and new serial titles. for journal articles, however, the system analyzes the title of the article and creates subject index terms, using single words, combinations of two words not separated by stop words, and singular and plural variations. the generated terms are then processed against the controlled authority file. those accepted as valid are inserted in the record for searching purposes.

publication and distribution subsystem

each data element of a bibliographic item is captured only once and at the earliest possible time in the receipt process. master records which have successfully passed the edit and update phase become candidates for various types of publications and other user services. six major modes of publication products are produced by cain, at various times and in a variety of both formats and media. preliminary to the production of formal output there is a screening for records designated as fully acceptable by the edit and update subsystem.
as mentioned above, any record may be identified as being applicable to any combination of from one to five users. by a method of control cards the system is informed as to which users are scheduled for publication/distribution, and the maximum quantity to be selected in each case. this subsystem reviews each record to ascertain its appropriateness for selection. records meeting the criteria are siphoned off for individual handling. no record is dropped from the temporary file until it has been selected by all applicable users. a new book shelf listing may be printed on photocopy paper on request. on preparation, it is ready to be matted, photographed, printed, and distributed throughout the department of agriculture. only enough new book entries are selected by the computer at one time as will fit on three sheets of a four-page publication. approved cataloged records are selected weekly. each record is analyzed for applicability to any or all of the eight major files for which catalog cards are prepared. each card file has its own criteria both in content and in the number and types of cards produced for it. the system produces a separate record for each card required, sorts together the records for each file, and alphabetizes within that file. leading articles (regardless of language) are printed but are excluded in the sorting procedure. cards are printed two-up in upper- and lowercase in the format prescribed by anglo-american cataloging rules. after printing, the cards are distributed to the appropriate organizations and sections where they may be filed with a minimum of additional effort. monthly, a book catalog is compiled. this contains not only a listing by main entry but also indexes of personal authors, corporate authors, subjects, and titles. a biographic index (major personal author affiliation) capability is available although not presently used by nal in the book catalog.
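the filing rule for catalog cards stated above, where leading articles are printed but ignored in sorting, can be sketched in python. the article list below is a small hypothetical sample, not the table cain used, and a plain word list of this kind will misfile titles whose first word only looks like an article (for example an english title beginning with "die").

```python
# Hypothetical sample of leading articles in a few languages.
LEADING_ARTICLES = {
    "a", "an", "the",          # english
    "le", "la", "les",         # french
    "der", "die", "das",       # german
    "el", "los", "las",        # spanish
}


def filing_key(title: str) -> str:
    """Build the sort key: the title without its leading article."""
    words = title.lower().split()
    # Keep a one-word title intact even if it matches an article.
    if len(words) > 1 and words[0] in LEADING_ARTICLES:
        words = words[1:]
    return " ".join(words)


def file_titles(titles):
    """Alphabetize titles by filing key; the printed form is unchanged."""
    return sorted(titles, key=filing_key)
```

the key is used only for ordering, so each title is still returned exactly as entered, article included.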
this catalog is printed in varying numbers of columns changeable by control card option for each index. again photocopy paper is used with a standard upper- and lowercase (tn) print train. an alternate option is magnetic tape output formatted for direct input to a computer-driven linotron. see bibliographic description for more detail. semiannually the index portions of the book catalog are cumulative. main entry listings are not repeated. multiyear accumulations may also be produced. the book catalogs are presently being published from photocopy printout by rowman and littlefield, inc., new york. bibliographies, either scheduled or special, can be produced with the same indexes as those in the book catalog. these are normally prepared for printing via the linotron. this magnetic tape record contains all formatting requirements with the exception of word divisions. document title, page, and columnar (subject category) headers are provided by nal. running headers are inserted by the linotron. through predetermined codes, the cain tape specifies the print style, print size, and print format. bibliographies may also be computer printed on photocopy paper similar to the book catalog. once a month, each record selected for publication is processed through a merge and adjustment program. at this point published records not previously on the permanent master file are added to it. those which are already on it are compared and the resident record is adjusted to include the new user for whom the record has just been published. the term field is also verified and updated if necessary. each term is also used to generate posting records for the subject authority file. the permanent (published) cain data base is available on magnetic tape in either the master format or a print format of the linear proof (listing of each data element). only records not previously published are added to the monthly sale tapes.
these tapes may be ordered individually (new monthly selections) or collectively (whole file) at the cost of reproduction only. the tape is nine-track, 800 bpi, ebcdic with standard ibm 360 header and trailer labels. one of the purchasers of cain tape is the ccm information corporation of new york which publishes bibliography of agriculture from it starting in 1970. current purchasers include private corporations and universities, both in the united states and abroad. the last type of output is normal computer printout of numerous internal reports in a variety of customized formats.

search subsystem

the search capability of the cain system is not being used by nal on its own data base at the present time. it is utilized, however, by other organizations who run the cain system on a parallel basis, maintaining their own data bases. the following description, therefore, pertains to the programmed system rather than to its use on the nal data base. this subsystem permits identification and retrieval of records in cain format based on search statements as applied to almost every data element or combinations thereof. such searches may use simple statements or a complex series of nested boolean parameters. questions may also be absolute or weighted to give more precise results. the weight factors if used are normally assigned to each statement within a search question, with a threshold weight assigned to the overall question. the total weight of all true statements must be equal to or greater than the threshold weight for the full query in order to be considered as meeting the search criteria. if such is not the case, the record will not be selected. since cain uses a controlled vocabulary, query statements on subject terms are first matched against that authority file. at this point each invalid (use) term is replaced by a corresponding valid (uf) term if appropriate.
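the weighted form of a search question described above amounts to summing the weights of the statements that hold for a record and comparing the total with the threshold. the sketch below shows that rule in python; representing a statement as a (predicate, weight) pair and a record as a dictionary is an assumption made for illustration, not the cain query format.

```python
def record_matches(record, statements, threshold):
    """Apply a weighted search question to one record.

    statements: list of (predicate, weight) pairs, where predicate is a
    function taking the record and returning True or False.
    The record is selected when the total weight of all true statements
    is equal to or greater than the threshold.
    """
    total = sum(weight for predicate, weight in statements if predicate(record))
    return total >= threshold


def search(records, statements, threshold):
    """Return the records that meet the weighted search question."""
    return [r for r in records if record_matches(r, statements, threshold)]
```

for example, with weights of 3 for a subject term, 1 for language and 2 for document type, a threshold of 4 selects a record that satisfies the first two statements but not one that satisfies only the last two.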
in addition, if the query statement so specifies, the requested terms may be expanded one level in the hierarchy. in other words, it could generate additional statements requesting all broader, narrower, or related terms as specified if such structure were present for the subject within the vocabulary. because subject terms comprise the largest percentage of all search elements, an algorithm was developed whereby queries on this type of element are first processed against an inverted file. identification numbers are extracted for all terms matching the query and only those candidate records are searched using the full query. on a serial file such as cain, this concept provides a substantial savings in computer run time. the print options of retrieval output allow either for normal sequence by identification number or for a specific sequence as requested by the originator. the printout may contain all data elements or only those selected, all others being suppressed. at the present time this subsystem is used infrequently by nal and only for internal high priority searches due to the extremely limited subject indexing terms present. it is used more extensively on the parallel operation established for the international tree disease register maintained for the u. s. forest service. authority files subsystem this subsystem updates, generates, expands, and maintains three types of authority files. these include subject terms with associated hierarchy, call numbers of indexed journals with abbreviated titles, and a subject term inverted file carrying the identification number of each record using that term. each transaction to add, change, or delete any data is both edited and reversed before entering the updating sequence. thus an addition of a narrower term (for example, horse) to a base term (for example, animal) will automatically generate another transaction to add the broader term of animal to a base term (new or existing ) of horse. 
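the reversal of authority-file transactions can be sketched in python. the relation codes used here (NT, BT, RT, USE, UF) follow common thesaurus practice and are an assumption, since the article does not give cain's internal codes; the tuple layout of a transaction and the function names are likewise invented for illustration.

```python
# Assumed relation codes and their reciprocals (common thesaurus usage).
RECIPROCAL = {"NT": "BT", "BT": "NT", "RT": "RT", "USE": "UF", "UF": "USE"}


def with_reverse(transaction):
    """Return the submitted transaction plus its generated reverse.

    A transaction is (action, base_term, relation, target_term).
    """
    action, base, relation, target = transaction
    return [transaction, (action, target, RECIPROCAL[relation], base)]


def apply_transactions(authority, transactions):
    """Apply add/delete transactions to a dict-based authority file."""
    for action, base, relation, target in transactions:
        related = authority.setdefault(base, {}).setdefault(relation, set())
        if action == "add":
            related.add(target)
        elif action == "delete":
            related.discard(target)
    return authority
```

adding horse as a narrower term of animal therefore yields two transactions, and after both are applied the entry for horse carries animal as its broader term without a second manual entry.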
this precludes having to manually enter both sides of an action as well as assuring reciprocity of entries. due to the flexibility of the search subsystem of cain, this hierarchical continuity is of great importance. if an item is changed the same procedure is followed. in the instance of deletion, a broader precept is involved. in this case, the term is deleted from all entries in other hierarchies but is itself left on the authority file and marked as being no longer valid. it is thus available for search purposes but is not allowed to be used on subsequent cain data records. during a normal cain data run, each call number or subject term in a record is verified against the appropriate file. each element on these files is carried in two forms: one in stripped uppercase, and the other in preferred print form. when an incoming term is found on the authority file, the system substitutes the proper form. this includes substituting a valid term for an invalid term as in the "use-use for" relationship, as well as generation of the appropriate abbreviated journal title for a given call number. in order to keep the authority file up to date, the transactions generated by the publication subsystem are now used to insert the record identification number into the inverted file as well as increase the number of postings per term. this assists search specialists in formulating queries in the manner which will reduce computer processing time to the greatest degree. when published, the authority files themselves can be printed in a special format which displays the entire hierarchy of each term. in addition, up to ten levels of increasingly narrower terms can be listed for each term.

summary

cain is a broad-based comprehensive batch mode system which meets many library requirements.
its flexibility is apparent from the fact that it has already been expanded to select each newly cataloged serial record for transmission in marc ii communication format to the national serials data bank being created by the three national libraries. still more capabilities will undoubtedly be built into it before the nal ultimate on-line system is implemented. the major thrust of the systems design has been to concentrate on simplifying user interface while imposing stringent and extensive service requirements on the computer system itself. due to its inherent fluidity, cain is being retained as an in-house system. it is so complex that a single change in one subsystem may have radial effects in any or all of the other portions. continuing efforts are underway to simplify input, accelerate throughput, and expand its already generous services both to the staff of the national agricultural library and to those organizations utilizing output from the cain system.

editorial board thoughts: digital faculty development
cinthya ippoliti
information technology and libraries | june 2019

cinthya ippoliti (cinthya.ippoliti@ucdenver.edu) is director, auraria library, colorado.

the role of libraries within faculty development is not a new concept. librarians have offered workshops and consultations for faculty for everything from designing effective research assignments, to scholarly impact, and open educational resources. in recent months, however, both acrl and educause have highlighted new expectations for faculty to develop skills in supporting students within a digital environment. as part of acrl’s “keeping up with…” series, katelyn handler and lauren hays1 discuss the rise of faculty learning communities that cover topics such as universal design, instructional design, and assessment.
effective teaching has also recently become the focus of many institutions’ efforts in increasing student success and retention, and faculty play a central role in students’ academic experience. in addition, the educause horizon report echoes these sentiments, positing that “the role of full-time faculty and adjuncts alike includes being key stakeholders in the adoption and scaling of digital solutions; as such, faculty need to be included in the evaluation, planning, and implementation of any teaching and learning initiative.”2 finally, maha bali and autumm caines mention that “when offering workshops and evidence-based approaches, educational development centers make decisions on behalf of educators based on what has worked in the past for the majority.”3 they call for a new model that blends digital pedagogy, identity, networks, and scholarship where the experience is focused on “participants negotiating multiple online contexts through various online tools that span open and more private spaces to create a networked learning experience and an ongoing institutionally based online community.”4

so how does the library fit into this context? what we are talking about here goes far beyond merely providing access to tools and materials for faculty. it requires a deep tripartite partnership with educators and the centers for faculty development, as each partner brings something unique to the table that cannot be covered by one area alone. the interesting element here is a dichotomy where this type of engagement can span both in-person and virtual environments as faculty utilize both to teach and connect with colleagues as part of their own development. the lines between these two worlds suddenly blur and it is experience and connectivity that are at the center of the interactions rather than the tools themselves.
while librarians may not be able to provide direct support in terms of instructional technologies, they can certainly inform efforts to integrate open and critical pedagogy and scholarship into faculty development programming and into the curriculum. libraries can take the lead on providing the theoretical foundation and application for these efforts while the specifics of tools and approaches can be covered by other entities. bali and caines also observe that bringing together disparate teaching philosophies and skill sets under this broader umbrella of digital support and pedagogy can help provide professional development opportunities for faculty, especially adjuncts, who may not have the ability to participate otherwise. this opportunity can act as a powerful catalyst to influence their teaching by implementing, and therefore modeling, a best-practices approach so that they are thinking about bringing students together in a similar fashion even if they are not teaching exclusively online, but especially if they are.5

digital faculty development | ippoliti https://doi.org/10.6017/ital.v38i2.11091

open pedagogy can accomplish this in a variety of ways. bronwyn hegarty defines eight areas that constitute open pedagogy: (1) participatory technologies; (2) people, openness, and trust; (3) innovation and creativity; (4) sharing ideas and resources; (5) connected community; (6) learner generated; (7) reflective practice; and (8) peer review.6 these elements are applicable to both faculty development practices and pedagogical ones. just as faculty might interact with one another in this manner, so can they collaborate with their students utilizing these methods. by being able to change the course materials and think about the ways in which those activities shape their learning, students can view the act of repurposing information as a way to help them define and achieve their learning goals.
this highlights the fact that an environment where this is possible must exist as a starting point and it also underlines the importance of the instructor’s role in fostering this environment. having a cohort of colleagues, for both instructors and students, can “facilitate student access to existing knowledge, and empower them to critique it, dismantle it, and create new knowledge.”7 this interaction emphasizes a two-way experience where both students and instructors can learn from one another. this is very much in keeping with the theme of digital content, as by the very nature of these types of activities, the tools and methods must lend themselves to being manipulated and repurposed, and this can only occur in a digital environment.

finally, in a recent posting on the open oregon blog, silvia lin hanick and amy hofer discuss how open pedagogy can also influence how librarians interact with faculty and students. specifically, they state that “open education is simultaneously content and practice”8 and that by integrating these practices into the classroom, students are learning about issues such as intellectual property and the value of information, by acting “like practitioners”9 where they take on “a disciplinary perspective and engage with a community of practice.”10 this is a potentially pivotal element to take into consideration when analyzing the landscape of library-related instruction, because it frees the librarian from feeling as if everything rests on that one-time instructional opportunity. the development of a community of practitioners which includes the students, faculty, and the librarian has the potential to provide learning opportunities along the way.
including the librarian as part of this model makes sense not only as a way to signal the critical role the librarian plays in the classroom, but also as a way to stress that thinking about, and practicing, library-related activities is (or should be) as much part of the course as any other exercise.

references

1 katelyn handler and lauren hays, “keeping up with…faculty development,” association of college and research libraries, last modified 2019, http://www.ala.org/acrl/publications/keeping_up_with/faculty_development.
2 “horizon report,” educause, last modified 2019, https://library.educause.edu//media/files/library/2019/2/2019horizonreportpreview.pdf.
3 maha bali and autumm caines, “a call for promoting ownership, equity, and agency in faculty development via connected learning,” international journal of educational technology in higher education 15, no. 1 (2018): 3.
4 bali, “a call for promoting ownership, equity, and agency in faculty development,” 9.
5 ibid, 3.
6 bronwyn hegarty, “attributes of open pedagogy: a model for using open educational resources,” last modified 2015, https://upload.wikimedia.org/wikipedia/commons/c/ca/ed_tech_hegarty_2015_article_attributes_of_open_pedagogy.pdf.
7 kris shaffer, “the critical textbook,” last modified 2014, http://hybridpedagogy.org/criticaltextbook/.
8 silvia lin hanick and amy hofer, “opening the framework: connecting open education practices and information literacy,” open oregon, last modified 2017, http://openoregon.org/openingthe-framework/.
9 “opening the framework.”
10 “opening the framework.”
costs of library catalog cards produced by computer
frederick g. kilgour: ohio college library center, columbus, ohio
journal of library automation vol. 1/2 june, 1968

production costs of 79,831 cards are analyzed. cards were produced by four variants of the columbia-harvard-yale procedure employing an ibm 870 document writer and an ibm 1401 computer. costs per card ranged from 8.8 to 9.8 cents for completed cards.

early in september, 1964, the yale medical library put into routine operation the columbia-harvard-yale computerized technique for catalog card manufacture (1), and during the following three years yale produced over 87,000 cards. the principal objective of the chy project was an on-line, computerized, bibliographic information retrieval system. however, the route selected for attaining the objective included manufacture of cards from machine readable data to keep up the manual catalog while machine readable records were being inexpensively accumulated for computerized subject retrieval. catalog cards were only one product of the system, but their production was designed to be as efficient as possible within constraints of the system. nevertheless, this paper will examine chy card production costs as though this segment of the system were an isolated procedure, yielding but one product, as is the case in classical library procedures. costing will disregard other benefits, such as accession lists and machine readable data produced for little, or no, additional expense. the columbia medical library and harvard medical library also installed ibm 870 document writers and tested the programs for card production, but neither library routinely produced cards. however, columbia produced its acquisitions lists until october, 1966, using chy techniques. harvard issued a similar list, but for a shorter period of time, and it was harvard's withdrawal early in 1966 that brought about the collapse of the project. nevertheless, other institutions adopted the chy procedure for catalog card production, among them the medical library at the university of rochester, which used the programs for two years following february, 1966. e. r. squibb & sons at east brunswick, new jersey, also uses the programs.
at the university of kentucky an 870 document writer types catalog cards, but new programs were written to run on an ibm 7040 computer; these have recently been recoded in cobol for an ibm 360/50. similarly, the library at philip morris, inc., richmond, virginia, rewrote the programs to run on an ibm 1620 computer which punches cards that drive an 870. the korean social science bibliography project of the human relations area files has elaborated the chy technique into its automated bibliographic system (2), which in turn is the base for another bibliographic system for african studies. the machine readable cataloging record of the chy mechanized system eventually became the great-grandfather of the marc ii format and contributed about as much to marc ii as would have been the case had their relationship been truly biological. although the columbia-harvard-yale project never did develop and activate its proposed bibliographic information retrieval system, r. k. summit working entirely independently has brought into successful operation his excellent dialog system (3) which is essentially the system that chy had in design stage. moreover, summit's system is definitely superior because it has several useful functions not contemplated in chy.

nearly all reports on catalog card production limit study of costs to reproduction of cards and neglect other costs involved in preparing cards for the catalog. an exception is p. j. fasana's 1963 investigation wherein he found that library of congress cards, in seven copies and ready to be filed into a catalog, cost 16.6 cents per card; cards produced by a machine method consisting of a tape typewriter and a very small special purpose computer cost 9.9 cents (4). fasana used an hourly salary rate of $2.00.
a study of early experience with chy production yielded 12.5 cents per card (1), whereas the present study shows that costs range between 8.8 and 9.8 cents per card, cards being in completed form, arranged in packs for individual catalogs, and ready for bursting before alphabetizing for filing.

methods

during the course of the three years in which the chy programs were in operation, four variant techniques were used for card production. the first three with their limitations have been described elsewhere (5). briefly, the initial system consisted of keypunching from worksheets, listing the punch cards on an ibm 870 document writer, proofreading and correcting, processing the proofread and corrected punch cards on an ibm 1401 computer which produced punch card output that, in turn, was used to drive the 870 document writer for production of catalog cards on one-up forms. in the next arrangement, printing of cards on one-up forms was accomplished on an ibm 1401 computer driving an upper- and lower-case print chain. in the third procedure, a two-up card form replaced the one-up form. finally, the medical library returned the 870 document writer to the manufacturer, and the 1401 was programmed to do the prooflisting in upper and lower case. the yale bibliographic system (6) replaced the chy routines on 25 july 1967.

the keypuncher kept time records for the various activities listed in table 1 throughout the period of this study. during the first two months of operation, design for recording data was inadequate. subsequently an individual would, albeit infrequently, fail to record time elapsed, so that production of 7,630 cards was omitted from the study, leaving a total of 79,831 to be included. on several occasions during the fourth part of the study, the second proofreading was suspended, and only correction carried out. hence, time expended in this category is less than in the previous three periods.
at first an ibm 1401 computer in the yale computer center was used, the center being located about a mile from the medical library. subsequently, another 1401 modified to drive an upper- and lower-case print chain and located in the medical school was employed. later this machine was transferred to the administrative data systems computer center, which moved to a new location not long after it assumed operation of the 1401. still later, the 1401 was again transferred, this time to the yale computer center. as can be seen from the computer charges in table 1, these wanderings about new haven appear to have had no effect on operating efficiency. time recorded for each computer run was actual time clocked by the operator. other times were recorded by the individual performing the operation.

salaries used in the cost calculation were salaries being paid in june, 1967, which were, of course, appreciably higher than those in the autumn of 1964; hourly rate for the first proofreader in table 1 was $2.62 and for the second $2.21. hourly rental for the 870 document writer was $.78. rate of computer charges employed in the calculation was $20 per hour, a rate that had existed during the last year or so during which data was collected. initially, computer charges had been $75 an hour, but they dropped precipitously during the first two years. costs for catalog card stock were the lowest cost charged for the two types of forms. since these forms were not standard items during the years of the study, their prices varied considerably depending upon the amount ordered.

results

table 1 contains cost figures for catalog card production by the four variant techniques. since salaries and computer charges can vary widely, particularly among countries, time per card produced is also included in the table to facilitate comparison with other systems. of course, amounts of time calculated by dividing elapsed time by amount of product are not directly comparable with results of time and motion studies such as henry voos' helpful study (7). however, two different methods of comparing the input costs in table 1 with those johnson (8) published for the stanford book catalog gave divergences of only 2 and 6 per cent.

table 1. per-card costs of computer-produced catalog cards.

columns: (1) one-up form on 870; (2) one-up form on 1401, proof on 870; (3) two-up form on 1401, proof on 870; (4) two-up form on 1401, proof on 1401.

                                     (1)               (2)               (3)               (4)
                               dollars  hours    dollars  hours    dollars  hours    dollars  hours
keypunching                     .0219   .0099     .0218   .0099     .0222   .0101     .0235   .0106
keypunch                        .0029   .0099     .0030   .0099     .0030   .0101     .0032   .0106
ibm 870 proof                   .0033   .0043     .0036   .0046     .0039   .0051
ibm 1401 proof                                                                        .0091   .0046
proofreaders (2)
  proofreading                  .0115   .0044     .0113   .0043     .0118   .0045     .0116   .0044
  proofreading and correcting   .0120   .0055     .0122   .0055     .0119   .0054     .0091   .0041
ibm 1401                        .0149   .0085     .0313   .0156     .0231   .0116     .0245   .0112
ibm 870 card typing             .0104
card stock                      .0149             .0149             .0125             .0125
total                           .0918             .0981             .0884             .0935

number of cards                15,149             9,343            27,210            28,129
number of titles                1,655               990             2,920             3,130
cards per title                   9.2               9.4               9.3               9.0

source of the increase in costs of six-tenths of a cent from the first procedure to the second is entirely the increase in computer charges when the 1401 replaced the 870 to print cards. when the two-up form was employed on the computer in variant three, charges then dropped to less than the combined 1401 and 870 costs in the first procedure. costs rose again in procedure four.
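the totals in table 1 can be checked by adding the per-card dollar components of each procedure. the python below is a sketch for that check only; the figures are the dollar entries as read from table 1, held as integers in units of $.0001 to avoid rounding, and the dictionary layout and names are assumptions made for illustration.

```python
# Dollar components per card, in units of $0.0001, as read from Table 1.
# Order: keypunching, keypunch, proof listing, proofreading,
# proofreading and correcting, IBM 1401, then IBM 870 card typing
# (first procedure only), and card stock last.
# The dictionary layout and key names are illustrative assumptions.
COMPONENTS = {
    "one-up on 870": [219, 29, 33, 115, 120, 149, 104, 149],
    "one-up on 1401, proof on 870": [218, 30, 36, 113, 122, 313, 149],
    "two-up on 1401, proof on 870": [222, 30, 39, 118, 119, 231, 125],
    "two-up on 1401, proof on 1401": [235, 32, 91, 116, 91, 245, 125],
}

# Cards produced under each procedure, in the same order as above.
CARDS = [15149, 9343, 27210, 28129]


def total_units(parts):
    """Sum one procedure's components, still in units of $0.0001."""
    return sum(parts)


totals = {name: total_units(parts) for name, parts in COMPONENTS.items()}

for name, units in totals.items():
    # Dividing by 100 converts units of $0.0001 into cents.
    print(f"{name}: {units / 100:.2f} cents per card")

print("cards in study:", sum(CARDS))
```

under these figures the four procedures come to 9.18, 9.81, 8.84, and 9.35 cents per card, the card counts add to the 79,831 cards analyzed, and the rise from 8.84 to 9.35 cents is the increase in procedure four.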
here the principal cause of the increase was the substitution of computer-produced proof listings after the 870 document writer had been returned to the manufacturer.

although there is no reason to think that preparation of cataloging copy on a worksheet is either more or less expensive than older techniques, coding a worksheet constitutes additional work for which there is no equivalent in classical procedures. coding costs were examined between 9 march and 11 may 1965, when six individuals, ranging from professional catalogers to a student assistant, recorded time required to code 725 worksheets. time per final catalog card produced was three seconds; in other words, $.003 for a cataloger receiving $7500 a year, or $.001 for a student assistant earning $1.50 an hour. if total coding cost, rather than a portion of it, were to be charged to card production, costs reported in table 1 could rise one- to three-tenths of a cent.

discussion

the accurate comparison of costs would be with those of systems similar to the chy system that produce more than one product. for instance, the chy system also produced monthly accession lists from the same punch-card decklets that produced catalog cards. the accession list was produced mechanically at a cost far less than that for the previous manual preparation. the decklets also constituted machine readable information available for other purposes, most of which have not yet been realized. system costing would assign only a portion of keypunching and proofreading costs to card production. another saving was the appreciable shortening of time required for catalog cards to appear in the catalog. in procedures one through three, usually three or four days elapsed from the day on which the cataloger completed cataloging to the day on which cards were filed into the catalog.
however, in procedure four, the computer, which was then a mile distant from the medical library, was used on two separate occasions for each batch of decklets, so that elapsed time rose to at least a week.

even though other benefits are not reflected in comparative costs, it is clear from fasana's findings that the chy computer-produced cards cost far less than do lc cards, and have a similar cost to those produced mechanically on which fasana reported. although there appears to be no published evidence that photocopying techniques can produce finished catalog cards at less expense than 9 cents, it is possible that some photo-reproduced cards may be less expensive than those described in this article. however, it must be pointed out that photo-reproduced cards are products of single-product procedures, whereas the chy cards are one of several system products.

increase in cost between procedure three and procedure four was due to increase in cost of prooflisting in upper and lower case on the 1401 computer as compared to prooflisting on the 870 document writer. this cost increase was not detected until calculations were done for this investigation, and therein lies a moral. it was the policy at the yale library for all programming to be done by library programmers, since various inefficiencies, and indeed catastrophes, had occasionally been observed when non-library personnel had prepared programs for library operations. the single exception to this policy was the proof program, which this investigation reveals used an exorbitant amount of time, one-third of that required for subsequent card production. since it had been felt that writing and coding a prooflisting program was perfectly straightforward, an outside programmer of recognized ability was employed to write and code the program.
because the program was simple, and because the programmer had high competence, efficiency of the program was never checked as it should have been. this episode raises a question: if even the wary can be trapped, how can the unwary avoid pitfalls? there is no satisfactory answer, but it would appear that some difficulties could be avoided by review of new programs by experienced library programmers, of which there are unfortunately far too few. comparison with data such as that in table 1 will also be helpful, but not definitive, in evaluating new programs. of course, when widely used library computer programs of recognized efficiency are generally available, magnitude of the pitfalls will have been greatly reduced.

conclusion

computer-produced catalog cards, even when they are but one of several system products, can be prepared in finished form for a local catalog less expensively and with less delay than can library of congress printed cards. computer card production at 8.8 to 9.8 cents per completed card appears to be competitive with other procedures for preparing catalog cards. however, undetected inefficiency in a minor program increased costs, thereby emphasizing need to ensure efficiency in programs used routinely.

acknowledgements

the author is most grateful to mrs. sarah boyd, keypuncher extraordinary, who maintained the record of the data used in this study. national science foundation grant no. 179 supported the chy project in part.

references

1. kilgour, frederick g.: "mechanization of cataloging procedures," bulletin of the medical library association, 53 (april 1965), 152-162.
2. koh, hesung c.: "a social science bibliographic system; computer adaptations," the american behavioral scientist, 10 (jan. 1967), 2-5.
3. summit, roger k.: "dialog; an operational on-line reference retrieval system," association for computing machinery, proceedings of 22nd national conference, (1967), 51-56.
4. fasana, p.j.: "automating cataloging functions in conventional libraries," library resources & technical services, 7 (fall 1963), 350-365.
5. kilgour, frederick g.: "library catalogue production on small computers," american documentation, 17 (july 1966), 124-131.
6. weisbrod, david l.: "an integrated, computerized, bibliographic system for libraries," (in press).
7. voos, henry: standard times for certain clerical activities in technical processing (ann arbor, university microfilms, 1965).
8. johnson, richard d.: "a book catalog at stanford," journal of library automation, 1 (march 1968), 13-50.

----------------------

batch ingesting into eprints digital repository software

tomasz neugebauer and bin han

information technology and libraries | march 2012 113

abstract

this paper describes the batch importing strategy and workflow used for the import of theses metadata and pdf documents into the eprints digital repository software. a two-step strategy of importing metadata in marc format followed by attachment of pdf documents is described in detail, including perl source code for scripts used. the processes described were used in the ingestion of 6,000 theses metadata and pdfs into an eprints institutional repository.

introduction

tutorials have been published about batch ingestion of proquest metadata and electronic theses and dissertations (etds),1 as well as endnote library,2 into the digital commons platform.
the procedures for bulk importing of etds using dspace have also been reported.3 however, bulk importing into the eprints digital repository software has not been exhaustively addressed in the literature.4 a recent article by walsh provides a literature review of batch importing into institutional repositories.5 the only published report on batch importing into the eprints platform describes perl scripts for metadata-only records import from thomson reuters reference manager.6 bulk importing is often one of the first tasks after launching a repository, so it is unsurprising that requests for reports and documentation on eprints-specific workflow have been a recurring question on the eprints tech list.7 a recently published review of eprints identifies “the absence of a bulk uploading feature” as its most significant weakness.8 although eprints’ graphical user interface for bulk importing is limited to the use of the installed import plugins, the software does have a versatile infrastructure for this purpose. leveraging eprints’ import functionality requires some perl scripting, structuring the data for import, and using the command line interface. in 2009, when concordia university launched spectrum,9 its research repository, the first task was a batch ingest of approximately 6,000 theses dated from 1967 to 2003. the source of the metadata for this import consisted in marc records from an integrated library system powered by innovative interfaces and proquest pdf documents. this paper is a report on the strategy and workflow adopted for batch ingestion of this content into the eprints digital repository software. import strategy eprints has a documented import command line utility located in the /bin folder.10 documents can also be imported through eprints’ graphical interface. 
using the command line utility for importing is recommended because it is easier to monitor the operation in real time by adding progress information output to the import plugin code.

tomasz neugebauer (tomasz.neugebauer@concordia.ca) is digital projects and systems development librarian and bin han (bin.han@concordia.ca) is digital repository developer, concordia university libraries, montreal, quebec, canada.

the task of batch importing can be split into the following subtasks:

1. import of metadata of each item
2. import of associated documents, such as full-text pdf files

the strategy adopted was to first import the metadata for all of the new items into the inbox of an editor’s account. after this first step was completed, a script was used to loop through the newly imported eprints and attach the corresponding full-text documents. although documents can be imported from the local file system or via http, import of the files from the local file system was used.

the batch import procedure varies depending on the format of the metadata and documents to be imported. metadata import requires a mapping of the source schema fields to the default or custom fields in eprints. the source metadata must also be converted into one of the formats supported by eprints’ import plugins, or a custom plugin must be created. import plugins are available for many popular formats, including bibtex, doi, endnote, and pubmedxml. in addition, community-contributed import plugins such as marc and arxiv are available at eprints files.11 since most repositories use custom metadata fields, some customization of the import plugins is usually necessary.

marc plugin for eprints

in eprints, the import and export plugins ensure interoperability of the repository with other systems.
import plugins read metadata from one schema and load it into the eprints system through a mapping of the fields into the eprints schema. loading marc-encoded files into eprints requires the installation of the import/export plugin developed by romero and miguel.12 the installation of this plugin requires the following two cpan modules: marc::record and marc::file::usmarc. the marc plugin was then subclassed to create an import plugin named “concordia theses,” which is customized for thesis marc records.

concordia theses marc plugin

the marc plugin features a central configuration file (see appendix a) in which each marc field is paired with a corresponding mapping to an eprints field. most of the fields were configured through this configuration file (see table 1). the source marc records from the innovative interfaces integrated library system (ils) encode the physical description of each item using the anglo american cataloguing rules, as in the following example: “ix, 133 leaves : ill. ; 29 cm.” since the default eprints field for number of pages is of the type integer and does not allow multipart physical descriptions from the marc 300 field, a custom text field for these physical descriptions (pages_aacr) had to be added. the marc.pl configuration file cannot be used to map compound fields, such as author names; these fields need a custom mapping implementation in perl. for instance, the marc 100 and 700 fields are transferred into the eprints author compound field (in marc.pm). similarly, marc 599 is mapped into a custom thesis advisor compound field.

marc field    eprints field
020a          isbn
020z          isbn
022a          issn
245a          title
250a          edition
260a          place_of_pub
260b          publisher
260c          date
300a          pages_aacr
362a          volume
440a          series
440c          volume
440x          issn
520a          abstract
730a          publication

table 1.
mapping table from marc to eprints helge knüttel’s refinements to the marc plugin shared on the eprints tech list were employed in the implementation of a new subclass of marc import for the concordia theses marc records. in the implementation of the concordia theses plugin, concordiatheses.pm inherits from marc.pm. (see figure 1.)13 knüttel added two methods that make it easier to subclass the general marc plugin and add unique mappings: handle_marc_specialities and post_process_eprint. the post_process_eprint function was not used to attach the full-text documents to each eprint. instead, the strategy to import the full-text documents using a separate attach_documents script was used (see “theses document file attachment” below). import of all of the specialized fields, such as thesis type (mapped from marc 710t), program, department, and proquest id, was implemented in the function handle_marc_specialities of concordiatheses.pm. for instance, 502a in the marc record contains the department information, whereas an eprints system like spectrum stores department hierarchy as subject objects in a tree. therefore importing the department information based on the value of 502a required regular expression searches of this marc field to find the mapping into a corresponding subject id. this was implemented in the handle_marc_specialities function. batch ingesting into eprints digital repository software| neugebauer and han 116 figure 1. concordia theses class diagram, created with the perl module uml::class::simple execution of the theses metadata import the depositing user’s name is displayed along with the metadata for each eprint. a batchimporter user with the corporate name “concordia university libraries” was created to carry out the import. 
as a result, the public display of the imported items shows the following as a part of the metadata: “deposited by: concordia university libraries.” the marc plugin requires the encoding of the source marc files to be utf-8, whereas the records are exported from the ils with marc-8 encoding. therefore marcedit software developed by reese was used to convert the marc file to utf-8.14 to activate the import, the main marc import plugin and its subclass, concordiatheses.pm, have to be placed in the plugin folder /perl_lib/eprints/plugin/import/marc/. the configuration file information technology and libraries | march 2012 117 (see appendix a) must also be placed with the rest of the configurable files in /archives/repositoryid/cfg/cfg.d. the plugin can then be activated from the command line using the import script in the /bin folder. a detailed description of this script and its usage is documented on the eprints wiki. the following eprints command from the /bin folder was used to launch the import: import repositoryid --verbose --user batchimporter eprint marc::concordiatheses theses-utf8.mrc following the aforementioned steps, all the theses metadata was imported into the eprints software. the new items were imported with their statuses set to inbox. a status set to inbox means that the imported items are in the work area of batchimporter user and will need to be moved to live public access by switching their status to archive. theses document file attachment after the process of importing the metadata of each thesis is complete, the corresponding document files need to be attached. the proquest id was used to link the full-text pdf documents to the metadata records. all of the marc records contained the proquest id, while the pdf files, received from proquest, were delivered with the corresponding proquest id as the filename. the pdfs were uploaded to a folder on the repository web server using ftp. 
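the linking step just described can be illustrated with a short python sketch. this is not the authors' script (their perl source is in appendix b); it only mirrors the stated idea that each pdf is named after its proquest id. the function names build_filemap and match_records, the sample ids, and the directory layout are hypothetical, and the sketch simply reports which records have a matching file rather than attaching anything.

```python
import os
import re
import tempfile

# File names of the form <proquest id>.pdf, matched case-insensitively.
PDF_NAME = re.compile(r"^(.*)\.pdf$", re.IGNORECASE)


def build_filemap(root_dir):
    """Walk root_dir recursively and map proquest id -> pdf path."""
    filemap = {}
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in filenames:
            match = PDF_NAME.match(name)
            if match:
                filemap[match.group(1)] = os.path.join(dirpath, name)
    return filemap


def match_records(records, filemap):
    """Split records into (with_pdf, without_pdf) by proquest id.

    records maps a hypothetical eprint id to its proquest id (or None).
    Whitespace is removed from the id before the lookup.
    """
    with_pdf, without_pdf = {}, []
    for eprint_id, pq_id in records.items():
        key = re.sub(r"\s", "", pq_id) if pq_id else None
        if key and key in filemap:
            with_pdf[eprint_id] = filemap[key]
        else:
            without_pdf.append(eprint_id)
    return with_pdf, without_pdf


def demo():
    """Build a throwaway folder of empty pdfs and match four sample records."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "batch1"))
        for rel in ("MQ12345.pdf", os.path.join("batch1", "NR67890.PDF")):
            open(os.path.join(root, rel), "w").close()
        filemap = build_filemap(root)
        records = {101: "MQ12345", 102: "NR 67890", 103: "MQ99999", 104: None}
        with_pdf, without_pdf = match_records(records, filemap)
        return sorted(with_pdf), sorted(without_pdf)


if __name__ == "__main__":
    print(demo())
```

keeping the lookup table separate from the matching step makes it easy to list, before any record is changed, which theses would go live without a document.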
the attach_documents script (see appendix b for source code) was then used to attach the documents to each of the imported eprints in the batchimporter’s inbox and to move the imported eprints to the live archive. several variables need to be set at the beginning of the attach_documents operation (see table 2).

variable                                   comment
$root_dir = 'bin/importdata/proquest'      this is the root folder where all the associated documents are uploaded by ftp.
$depositor = 'batchimporter'               only the items deposited by a defined depositor, in this case batchimporter, will be moved from inbox to live archive.
$dataset_id = 'inbox'                      limit the dataset to those eprints with status set to inbox.
$repositoryid = 'library'                  the internal eprints identifier of the repository.

table 2. variables to be set in the attach_documents script

the following command is used to proceed with file attachment, while the output log is redirected and saved in the file attachment:

/bin/attach_documents.pl > ./attachment 2>&1

the thesis metadata record was made live even if it did not contain a corresponding document file. the eprint ids of theses that did not contain a corresponding full-text pdf document are listed at the end of the log file, along with the count of the number of theses that were made live. after the import operation is complete, all the abstract pages need to be regenerated with the following command:

/bin/generate_abstracts repositoryid

conclusions

this paper is a detailed report on batch importing into the eprints system. the authors believe that this paper and its accompanying source code are a useful contribution to the literature on batch importing into digital repository systems. in particular, it should be useful to institutions that are adopting the eprints digital repository software.
batch importing of content is a basic and fundamental function of a repository system, which is why the topic has come up repeatedly on the eprints tech list and in a repository software review. the methods that we describe for carrying out batch importing in eprints make use of the command line and require perl scripting. more robust administrative graphical user interface support for batch import functions would be a useful feature to develop in the platform. acknowledgements the authors would like thank mia massicotte for exporting the metadata records from the integrated library system. we would also like to thank alexandros nitsiou, raquel horlick, adam field, and the reviewers at information technology and libraries for their useful comments and suggestions. references 1. shawn averkamp and joanna lee, “repurposing proquest metadata for batch ingesting etds into an institutional repository,” code{4}lib journal 7 (2009), http://journal.code4lib.org/articles/1647 (accessed june 27, 2011). 2. michael witt and mark p. newton, “preparing batch deposits for digital commons repositories,” 2008, http://docs.lib.purdue.edu/lib_research/96/ (accessed june 20, 2011). 3. randall floyd, “automated electronic thesis and dissertations ingest,” 2009, https://wiki.dlib.indiana.edu/display/iusw/automated+electronic+thesis+and+dissertations+i ngest (accessed may 26, 2011). 4. eprints digital repository software, university of southampton, uk, http://www.eprints.org/ (accessed june 27, 2011). 5. maureen p. walsh, “batch loading collections into dspace: using perl scripts for automation and quality control,” information technology & libraries 29, no. 
3 (2010): 117–27, http://search.ebscohost.com/login.aspx?direct=true&db=a9h&an=52871761&site=ehost-live (accessed june 26, 2011).
6. lesley drysdale, “importing records from reference manager into gnu eprints,” 2004, http://hdl.handle.net/1905/175 (accessed june 27, 2011).
7. eprints tech list, university of southampton, uk, http://www.eprints.org/tech.php/ (accessed june 27, 2011).
8. mike beazly, “eprints institutional repository software: a review,” partnership: the canadian journal of library & information practice & research 5, no. 2 (2010), http://journal.lib.uoguelph.ca/index.php/perj/article/viewarticle/1234 (accessed june 27, 2011).
9. concordia university libraries, “spectrum: concordia university research repository,” http://spectrum.library.concordia.ca (accessed june 27, 2011).
10. eprints wiki, “api:bin/import,” university of southampton, uk, http://wiki.eprints.org/w/api:bin/import (accessed june 23, 2011).
11. eprints files, university of southampton, uk, http://files.eprints.org/ (accessed june 24, 2011).
12. parella romero and jose miguel, “marc import/export plugins for gnu eprints3,” eprints files, 2008, http://files.eprints.org/323/ (accessed may 31, 2011).
13. agent zhang and maxim zenin, “uml::class::simple,” cpan, http://search.cpan.org/~agent/uml-class-simple-0.18/lib/uml/class/simple.pm (accessed september 20, 2011).
14. terry reese, “marcedit: downloads,” oregon state university, http://people.oregonstate.edu/~reeset/marcedit/html/downloads.html (accessed june 27, 2011).
appendix a. marc.pl configuration file

#
# plugin EPrints::Plugin::Import::MARC
#
# marc tofro eprints mappings
# do _not_ add compound mappings here.
$c->{marc}->{marc2ep} = {
    # marc to eprints
    '020a' => 'isbn',
    '020z' => 'isbn',
    '022a' => 'issn',
    '245a' => 'title',
    '245b' => 'subtitle',
    '250a' => 'edition',
    '260a' => 'place_of_pub',
    '260b' => 'publisher',
    '260c' => 'date',
    '362a' => 'volume',
    '440a' => 'series',
    '440c' => 'volume',
    '440x' => 'issn',
    '520a' => 'abstract',
    '730a' => 'publication',
};

$c->{marc}->{marc2ep}->{constants} = {
};

######################################################################
#
# plugin-specific settings.
#
# any non empty hash set for a specific plugin will override the
# general one above!
#
######################################################################
#
# plugin EPrints::Plugin::Import::MARC::ConcordiaTheses
#
$c->{marc}->{'EPrints::Plugin::Import::MARC::ConcordiaTheses'}->{marc2ep} = {
    '020a' => 'isbn',
    '020z' => 'isbn',
    '022a' => 'issn',
    '250a' => 'edition',
    '260a' => 'place_of_pub',
    '260b' => 'publisher',
    '260c' => 'date',
    '300a' => 'pages_aacr',
    '362a' => 'volume',
    '440a' => 'series',
    '440c' => 'volume',
    '440x' => 'issn',
    '520a' => 'abstract',
    '730a' => 'publication',
};

$c->{marc}->{'EPrints::Plugin::Import::MARC::ConcordiaTheses'}->{constants} = {
    # marc to eprints constants
    'type' => 'thesis',
    'institution' => 'concordia university',
    'date_type' => 'submitted',
};

appendix b. attach_documents.pl

#!/usr/bin/perl -I/opt/eprints3/perl_lib

=head1 description

this script allows you to attach a file to an eprint object by proquest id.

=head1 copyright and license

2009 adam field, tomasz neugebauer <tomasz.neugebauer@concordia.ca>
2011 bin han <bin.han@concordia.ca>

this module is free software under the same terms of perl.
compatible with eprints 3.2.4 (victoria sponge).

=cut

use strict;
use warnings;
use EPrints;

my $repositoryid = 'library';
my $root_dir = '/opt/eprints3/bin/import-data/proquest'; #location of pdf files
my $dataset_id = 'inbox'; #change to 'eprint' if you want to run it over everything.
my $depositor = 'batchimporter'; #limit import to $depositor’s inbox

#global variables for log purposes
my $int_live = 0;       #count of eprints moved to live archive with a document
my $int_doc = 0;        #count of eprints that already have document attached
my @array_doc;          #ids of eprints that already have documents
my $int_no_doc = 0;     #count of eprints moved to live with no document attached
my @array_no_doc;       #ids of eprints that have no documents
my $int_no_proid = 0;   #count of eprints with no proquest id
my @array_no_proid;     #ids of eprints with no proquest id

my $session = EPrints::Session->new(1, $repositoryid);
die "couldn't create session for $repositoryid\n" unless defined $session;

#the hash contains all the files that need to be uploaded
#the hash contains key-value pairs: (pq_id => filename)
my $filemap = {};
load_filemap($root_dir);

#get all eprints in inbox dataset
my $dataset = $session->get_repository->get_dataset($dataset_id);

#run attach_file on each eprint object
$dataset->map($session, \&attach_file);

#output log for attachment
print "#### $int_doc eprints already have document attached, skip ####\n @array_doc\n";
print "#### $int_no_proid eprints doesn't have proquest id, skip ####\n @array_no_proid\n";
print "#### $int_no_doc eprints doesn't have associated document, moved to live ####\n @array_no_doc\n";

#total number of eprints that were made live: those with and without documents.
my $int_total_live = $int_live + $int_no_doc;
print "#### intotal: $int_total_live eprints moved to live ####\n";

#attach file to corresponding eprint object
sub attach_file
{
    my ($session, $ds, $eprint) = @_;

    #skip if eprint already has a document attached
    my $full_text_status = $eprint->get_value( "full_text_status" );
    if ($full_text_status ne "none")
    {
        print "eprint ".$eprint->get_id." already has a document, skipping\n";
        $int_doc ++;
        push ( @array_doc, $eprint->get_id );
        return;
    }

    #retrieve username/userid associated with current eprint
    my $user = new EPrints::DataObj::User( $eprint->{ session }, $eprint->get_value( "userid" ) );
    my $username;
    # exit in case of failure to retrieve associated user, just in case.
    return unless defined $user;
    $username = $user->get_value( "username" );

    # $dataset includes all eprints in inbox, so we limit to $depositor's items only
    return if( $username ne $depositor );

    #skip if no proquest id is associated with the current eprint
    my $pq_id = $eprint->get_value('pq_id');
    if (not defined $pq_id)
    {
        print "eprint ".$eprint->get_id." doesn't have a proquest id, skipping\n";
        $int_no_proid ++;
        push ( @array_no_proid, $eprint->get_id );
        return;
    }

    #remove space from proquest id
    $pq_id =~ s/\s//g;

    #attach the pdf to eprint objects and move to live archive
    if ($filemap->{$pq_id} and -e $filemap->{$pq_id} ) #if the file exists
    {
        #create document object, add pdf files to document, attach to eprint object, and move to live archive
        my $doc = EPrints::DataObj::Document::create( $session, $eprint );
        $doc->add_file( $filemap->{$pq_id}, $pq_id . '.pdf' );
        $doc->set_value( "format", "application/pdf" );
        $doc->commit();
        print "adding document to eprint ", $eprint->get_id, "\n";
        $eprint->move_to_archive;
        print "eprint ".$eprint->get_id." moved to archive.\n";
        $int_live ++;
    }
    else
    {
        #move the metadata-only eprints to live as well
        print "proquest id \\$pq_id\\ (eprint ", $eprint->get_id, ") does not have a file associated with it\n";
        $eprint->move_to_archive;
        print "eprint ".$eprint->get_id." moved to archive without document attached.\n";
        $int_no_doc ++;
        push ( @array_no_doc, $eprint->get_id );
    }
}

#recursively traverse the directory, find all pdf files.
sub load_filemap
{
    my ($directory) = @_;
    foreach my $filename (<$directory/*>)
    {
        if (-d $filename)
        {
            load_filemap($filename);
        }
        #catch the file name ending in .pdf
        elsif ($filename =~ m/([^\/]*)\.pdf$/i)
        {
            my $pq_id = $1;
            #add pq_id => filename pair to filemap hash table
            $filemap->{$pq_id} = $filename;
        }
    }
}

reports and working papers

inclusion of nonroman character sets

the following document was prepared by staff of the library of congress as a working paper for discussions on incorporating the techniques described into the marc communications format. the document defines the principles for inclusion of nonroman alphabet character sets in the marc communications format and the procedural changes needed to allow implementation of the principles. this technique was agreed upon at the marbi committee meeting on february 2, 1981. any questions on the description of the inclusion of nonroman character sets in the marc communications format should be addressed to: library of congress, processing services, attention: mrs. margaret patterson, washington, dc 20540.

1. introduction

the cataloging rules followed by american libraries favor recording the title page data in the original script when possible. this helps those who consult catalogs to read the most essential information about the book. (reading his or her name in romanized form is just as difficult for someone who knows arabic as reading your name when it's written in arabic.) the new cataloging rules also specify that names and titles in notes be given in their original script, aacr2 1.7a3. technological advances have made it possible to provide many, if not all, nonroman alphabets in machine-readable cataloging records. oclc and rlin are in the process of enhancing their systems so they can handle some nonroman writing systems.
the library of congress has entered into a cooperative agreement with rlin for the development and use of an augmented rlin system for east asian (i.e., chinese, japanese, and korean) bibliographic data. although the library itself will not be creating and distributing marc records with nonroman characters in the near term , the goal of this proposal is to define how these data can be included now so others can do so soon. the technique known as an escape sequence announces that the codes which follow will represent letters in a specific different alphabet instead of the roman letters the codes would otherwise stand for. 2. principles the following principles will govern inclusion of other alphabets in marc records. note that these deal only with the marc communications format record, not the details of its processing-keying, sorting, display, etc.-by any bibliographic agency or utility. these principles are a slightly revised version of ones reviewed and approved in principle by the marbi character set committee in 1976. the earlier version was also distributed that year as working paper n77 of iso tc46/sc4/ wgi. (1) standard character sets should be used when available. (2) standard escape sequences should be used when available. (3) escape sequences should be used only when needed. (4) escape sequences are locking within a subfield but revert at any delimiter or field or record terminator code. example: (for demonstration purposes only, ec represents escape to cyrillic and ea escape to ascii) 245 10$aecrussian title proper :$becrussian subtitle. f not 245 10$aecrussian title proper :ea$becrussian subtitle. eaf and not 245 10$aecrussian title proper :$brussian subtitle. f (5) records which contain an escape sequence will also contain a special field which specifies what unusual character sets are present. 3. implementation the following will be done to realize these principles. • the ala character set will be redefined-see table 1. 
• a new character sets present field will be defined.
• details of application such as distribution, filing indicator values, etc., will be defined.

3.1 discussion-ala character set

a character set is a list of characters with the code used to represent each one. using this definition, the ala character set as given in appendixes iii.b and iii.c of marc formats for bibliographic data actually consists of eight character sets.

(1) ascii and ala diacritics and special characters with their eight-bit code.
(2) superscript zero to nine, plus, minus, open and close parentheses with their eight-bit code.
(3) subscript zero to nine, plus, minus, open and close parentheses with their eight-bit code.
(4) greek lowercase alpha, beta, and gamma with their eight-bit code.
(5-8) the same characters with their six-bit codes.

table 1. proposed revised ala character set [code chart not recoverable from the extracted text]

the six-bit character sets are used to distribute marc records on seven-track tapes. there are very few subscribers. it is unlikely that a method can be devised for distribution of nonroman character sets records on such tapes. the present seven-track subscribers should be asked if they know of any way to do so. if they do not, the alternatives are to cease distribution of seven-track tapes entirely or limit them to those records containing only roman alphabet characters-those without a character sets present field.
in the latter case, they should pay proportionately less for their subscription.

the present four eight-bit character sets and their escape sequences do not conform to present standards. the present standards did not exist when the character sets were being defined. to avoid creating and distributing records containing both standard and nonstandard character sets and standard and nonstandard escape sequences, the ala character set should be redefined. this change will be much less traumatic than it sounds. no new characters will be added; only the codes used to represent subscript, superscript, and greek characters will be changed. these characters were found in the title field of 8.59 out of 1.1 million records. if, as seems plausible, most or all marc subscribers translate tapes into their own character set codes as a first step and for communication translate from their own codes into the ala character set as the last step before distribution, only these two programs would need to be changed.

the proposed redefined ala character set is shown in table 1. on it, columns two through seven are the american standard code for information interchange (ascii) which is a recognized standard with a registered escape sequence. columns ten through fifteen are the ala extension of ascii with special characters and the three greek letters in columns ten and eleven, superscripts in column twelve, subscripts in thirteen, and diacritics in columns fourteen and fifteen.
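the column layout described for table 1 can be made concrete with a small python sketch. it assumes the usual code-chart convention that a position written column/row has the value 16 × column + row; the function names code_value and chart_region are illustrative only and are not part of any marc specification.

```python
def code_value(column, row):
    """Return the byte value for a code chart position written column/row."""
    if not (0 <= column <= 15 and 0 <= row <= 15):
        raise ValueError("column and row must each be in 0..15")
    return column * 16 + row


def chart_region(column):
    """Name the region of the proposed chart that a column belongs to."""
    if 2 <= column <= 7:
        return "ascii"
    if 10 <= column <= 15:
        return "ala extension of ascii"
    return "control or unassigned"


if __name__ == "__main__":
    # A few sample positions: two ascii codes and one extension column.
    for column, row in [(2, 8), (5, 14), (7, 14), (12, 0)]:
        value = code_value(column, row)
        print(f"{column}/{row} -> 0x{value:02X} ({chart_region(column)})")
```

under that convention, position 2/8 is the byte 0x28 (the open parenthesis in ascii), and any position in columns ten through fifteen has its high bit set, which is why those columns can be reassigned independently of columns two through seven.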
(it should be noted that six ascii codes will not occur in marc records: codes 5/14 circumflex, 5/15 underline, 6/0 grave, and 7/14 tilde are redundant with the codes for these diacritics in columns fourteen and fifteen; codes 7/11 left brace and 7/13 right brace never occur because these characters do not occur in bibliographic data. no change in this practice is proposed. it is the fact that these last two codes are used in some nonroman alphabet standard character sets that makes nonroman six-bit codes impossible.)

the ala extension of ascii is not an official standard now; it does not have an escape sequence yet. in addition to the ala extension of ascii, there is a draft international standard extended latin alphabet character set for bibliographic use, iso dis 5426 (table 2). while both sets are identical in purpose, they differ in the characters they contain and the codes used to represent them. the abacus group has agreed that iso 5426 be used for international distribution of marc records among the bibliographic agencies they represent once it is an approved international standard (cf. lc information bulletin, november 16, 1979, p. 475). the library will, however, continue to use the ala extension for u.s. distribution. some of the characters found only in the iso set could be added to the ala extension without affecting existing records. an ansi z39 subcommittee has been established to consider this possibility. while some changes to the ala character repertoire may be desirable, it is important that this issue not delay the separate matter of providing for inclusion of nonroman alphabets in marc.

3.2 discussion-escape sequence

for purposes of this discussion, escape sequences are defined as a combination of three characters. (see table 3.) the first is an escape character, hex 1/11. the second character specifies which codes are having different characters assigned to them, those in columns 2-7 or those in columns 10-15.
the third character defines what characters are being assigned to these codes, e.g., cyrillic, greek, etc. this is a greatly simplified explanation of the escape sequence standards, iso 2022 and ansi x3.41. (both are in the process of revision.) these standards provide for two types of escape sequences: public ones, which reference registered character sets, and private ones for unregistered character sets. while the meaning of the latter is governed by an agreement between the sender and the receiver, they are in conformity with the standard. until the ala extension of ascii has a registered escape sequence, such a "private" escape sequence could be defined for it in the character set appendix and used.

table 2. extended latin alphabet character set (iso dis 5426; code chart)

the second character of an escape sequence which changes the meaning of the codes in columns 2-7 contains either an open parenthesis, hex 2/8, or a comma, hex 2/12. the second character of an escape sequence which changes the meaning of the codes in columns 10-15 contains either a close parenthesis, hex 2/9, or a hyphen, hex 2/13. the third character of escape sequences for certain registered character sets has been defined as follows:

set                                        code
ascii
russian (1967 gost standard) (table 3)     registration applied for, code pending
iso greek                                  5/8, uppercase x
iso extended cyrillic (table 3)            5/7, uppercase w

the sixteen codes in column three can be used to designate sixteen different "private" character sets. in marc records, ascii and russian would be assigned to columns 2-7, while greek and the extended cyrillic (and the ala extension of ascii) would be assigned to columns 10-15.
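the three-character structure described above can be sketched in python as follows. this is a simplified illustration, not an implementation of iso 2022: it handles only three-character sequences, it recognizes only the second- and third-character positions named in the text, and the names `ESC`, `TARGETS`, `SETS`, and `find_escape_sequences` are hypothetical.

```python
ESC = 0x1B  # escape character, position 1/11

# second character: which columns are having characters reassigned
TARGETS = {
    0x28: "columns 2-7",    # position 2/8
    0x2C: "columns 2-7",    # position 2/12
    0x29: "columns 10-15",  # position 2/9
    0x2D: "columns 10-15",  # position 2/13
}

# third character: which character set is assigned
# (only the two codes stated in the text are listed here)
SETS = {
    0x58: "iso greek",              # position 5/8, uppercase X
    0x57: "iso extended cyrillic",  # position 5/7, uppercase W
}


def find_escape_sequences(data: bytes) -> list[tuple[int, str, str]]:
    """Return (offset, target columns, character set) for each escape sequence."""
    found = []
    index = 0
    while index < len(data):
        # a complete sequence needs two more bytes after the escape character
        if data[index] == ESC and index + 2 < len(data):
            target = TARGETS.get(data[index + 1], "unknown target")
            name = SETS.get(data[index + 2], "unregistered or private set")
            found.append((index, target, name))
            index += 3  # skip past the whole sequence
        else:
            index += 1
    return found
```

for example, a field whose data begins with the bytes escape, 2/9, 5/7 would be reported as assigning the iso extended cyrillic set to columns 10-15 from offset 0 onward.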
escape sequences would be given where needed in data fields. if necessary, it is permissible to embed escape sequences within a word. for example, a latin diacritic might be needed with an extended cyrillic letter to represent a letter in one of the nonslavic languages of central asia which uses the cyrillic alphabet.

table 3. escape sequence character set (code charts: gost 13052-67 russian; iso dis 5427 extended cyrillic)

in addition to escape sequences for nonroman alphabets described above in which one code stands for one letter, the escape standards also define escape sequence procedures for changing to multiple byte character sets. because the ideographic writing systems of east asia use thousands of different characters, it will be necessary to use two or three bytes/codes to identify a single specific character uniquely. the japanese industrial standard character set, jis 6226, uses two bytes per character, and it has been submitted to iso to obtain a registered escape sequence.
the first volume of the chinese character code for information interchange, cccii, has been issued; the second is expected in december. it uses three bytes per character. in all probability the lc/rlin east asian cooperative project will adopt either these character sets and their escape sequences or machine-reversible adaptations of them. the need to expand east asian character sets constantly to provide for infrequently used characters poses problems whose solutions cannot be predicted at this time.

3.3 discussion-character sets present field

as specified in the sixth principle, there is need for a special field which specifies what character sets are present whenever a set other than ascii and the ala extension of ascii is present in a record. the proposed field will use tag 066 and be defined as follows:

066 character sets present
this field specifies what character sets are present in the record other than ascii and the ala extension of ascii. the field is not repeatable. both indicators are unused and will contain blanks.
$a this subfield will contain all but the first character of the escape sequence to the default character set in columns 2-7 whenever the default character set is not ascii. this is not likely to occur in records created in the united states. since there can be only one default character set, the subfield is not repeatable.
$b this subfield will contain all but the first character of the escape sequence to the default character set in columns 10-15 whenever the default character set is not the ala extension of ascii. this is not likely to occur in records created in the united states. since there can be only one default extension character set, this subfield is not repeatable.
$c this subfield will contain all but the first character (or all but the first if a longer escape sequence is used) of every escape sequence found in the record. if the same escape sequence occurs more than once, it will be given only once in this subfield.
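as an illustration of the $c rule, the python sketch below collects each distinct escape sequence in a record, minus its leading escape character, in order of first appearance. it assumes three-character escape sequences only, represents the two blank indicators as spaces, and uses hypothetical function names; it is not a marc implementation.

```python
from typing import Optional

ESC = b"\x1b"  # escape character, position 1/11


def subfield_c_values(fields: list[bytes]) -> list[str]:
    """Collect each escape sequence once, without its escape character."""
    values: list[str] = []
    for data in fields:
        position = data.find(ESC)
        # stop at a truncated sequence that lacks its two following bytes
        while position != -1 and position + 3 <= len(data):
            tail = data[position + 1:position + 3].decode("ascii")
            if tail not in values:  # a repeated sequence is given only once
                values.append(tail)
            position = data.find(ESC, position + 3)
    return values


def field_066(fields: list[bytes]) -> Optional[str]:
    """Build the text of an 066 field, or None when no escape sequence occurs."""
    values = subfield_c_values(fields)
    if not values:
        return None
    # two blank indicators, then one $c per distinct escape sequence
    return "  " + "".join(f"$c{value}" for value in values)
```

a record whose fields switch to iso extended cyrillic twice and to iso greek once would yield two $c values here, one for each distinct sequence.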
the subfield is repeatable. this subfield does not identify the default character sets.
example:
ƀƀ$c)w a record containing the iso extended cyrillic character set.
ƀƀ$c)w$c)x a record containing both the iso greek and extended cyrillic character sets.

3.4 discussion-other details

when a field has an indicator to specify the number of leading characters to be ignored in filing and the text of the field begins with an escape sequence, the length of the escape sequence will not be included in the character count. when fields contain escape sequences to languages written from right to left, the field will still be given in its logical order. for example, the first letter of a hebrew title would be the eighth character in a field (following the indicators, a delimiter, a subfield code, and a three-character escape sequence). the first letter would not appear just before the end of field character and proceed backwards to the beginning of the field. a convention exists in descriptive cataloging fields that subfield content designation generally serves as a substitute for a space. an escape sequence can occur within a word, after a subfield code, or between two words not at a subfield boundary. for simplicity, the convention that an escape sequence does not replace a space should be adopted. one other convention is also advocated: when a space, subfield code, or punctuation mark (except open quote, parenthesis or bracket) is adjacent to an escape sequence, the escape sequence will come last.

wayne davison of rlin raised the following issue. after the library of congress has prepared and distributed an entirely romanized cataloging record for a russian book, a library with access to automated cyrillic input and display capability will create a record for the same book with the title in the vernacular.
(since aacr2 says to give the title in the original script "wherever practicable," the library could be said to be obligated to do so.) in such an event the local record could have all the authoritative library of congress access points. to keep this record current when the library of congress record is revised and redistributed, it would be necessary to carry the lc control number in the local record. most automated systems are hypersensitive to the presence of two records with the same control number. the two records can be easily distinguished: in the library of congress record, the modified record byte in field 008 will be set to "o" and it will not have any 066, character sets present field.

a comparison of oclc, rlg/rlin, and wln
university of oregon library

the following comparison of three major bibliographic utilities was prepared by the university of oregon library's cataloging objectives committee, subcommittee on bibliographic utilities. members of the subcommittee were elaine kemp, acting assistant university librarian for technical services; rod slade, coordinator of the library's computer search service; and thomas stave, head documents librarian. the subcommittee attempted to produce a comparison that was concise and jargon-free for use with the university community in evaluating the bibliographic utilities under consideration. the university faculty library committee was enlisted to review this document in draft form and held three meetings with the subcommittee for that purpose. the document was also shared with library faculty and staff in order to elicit suggestions for revision.

journal of library automation vol. 2/3 september, 1969

book reviews

information retrieval systems; characteristics, testing, and evaluation, by f. wilfred lancaster. new york, john wiley & sons, 1968. 222 pp. $9.00.
despite the fact that users retrieve the majority of the information that they obtain from collections such as libraries by employing author/title listings in catalogs, information scientists consider only subject listings in discussions of information retrieval. this book is no exception. lancaster defines an information retrieval system as informing a user "on the existence (or non-existence) and whereabouts of documents relating to his request." half of his book treats the characteristics and operation of information retrieval systems and half the testing and evaluation of such systems. it is the latter half of the book that distinguishes it from other general introductions to the subject.

for the testing and evaluation sections of his book, the author draws heavily on experience gained while working on the cranfield project as well as at the national library of medicine. at the latter he examined a segment of the real world in a major investigation of the medlars system. an interesting finding of the medlars study that he reports in the book, but on which he does not elaborate, is that there was no relationship between recall ratio percentage and precision ratio percentage for 299 searches examined.

in his preface the author expresses the hope that his book will be helpful to students and useful to practitioners. however, a principal function of such an introduction is to guide the reader in further pursuit, or retrieval, of information. in this function the book does not succeed, for seven chapters are barren of references, another eight average somewhat more than three, and the remaining chapter boasts fifty-three. this book will not supplant other general introductions to information retrieval systems, but its discussion of testing and evaluation is a useful introduction.

frederick g. kilgour

how to manage your information, by bart e. holm. new york, reinhold book corporation, 1968. 292 pp.
$10.00 essential information exceeds the grasp of the keenest minds in all professions. a method of readily obtaining needed resource material can be a particularly knotty problem for those who have no background in appropriate methods of data storage and retrieval. successful operation for many professionals depends directly upon their ability to work out a practical personal system which does not require complex apparatus, excessive cost or time. the purpose of this volume is to help such individuals evaluate their particular needs and design a method of managing information which will be workable and practical. i found the book enjoyable and informative. it immediately recommends itself with its own efficient organization, attractive format, readable style, clever illustrations, and complete indexing. it not only deals with the broad principles necessary for development of a personal information system but also includes specific information of a practical nature on the approach to this problem for professionals in several different fields. the first chapter, which is titled "man the collector", is fascinating to an unsophisticated non-librarian. it outlines the enormous problem of the growth of world-wide information that appears to be proliferating in an almost malignant manner. this served to emphasize a repeatedly stressed cardinal principle: the need to be selective, so that only items of probable real value will be retained. a most valuable chapter for those not experienced in library work relates to the basic principles for retrieval on a single or multiple entry basis. this logically leads into a discussion of how to evaluate the individual's personal need. the operations of specific simple systems, such as optical coincidence, termatrex, keysort and term cards were adequately discussed. 
individual chapters are devoted to the unique problems that might be encountered by the engineer, the chemist, the physicist, the architect, the doctor, and the archivist, with emphasis on the specific vocabulary needed for proper organization and a brief review of information sources of the various disciplines. the remaining seven chapters deal with proper use of available sources of information, such as keeping current with the literature, use of the modern library, records management, microfilming, and data systems of the present and the future.

this volume should be of real value to many who have limited background and are struggling in vain to keep up with the information they need. it can provide practical pointers for those who want to make a serious effort toward establishing and maintaining a system of storage and retrieval of information that does not rely on an all too often faulty memory.

ellis a. fuller, m.d.

the institutes of education union list of periodicals processing system, by j. d. dews and j. m. smethurst. (symplegades, no. 1). newcastle-upon-tyne, oriel press ltd., 1969. 39 pp. sbn (69uk) 85362 060 1. 15s.

the first half of this small manual is devoted to describing the file maintenance and text editing system developed by the university of newcastle-upon-tyne. the second half of the text is devoted to the technical specifications of the newcastle file handling system and refers specifically to the english electric-leo marconi kdf 9 computer. the system described is the application of a series of general purpose programs, which provide the capability of storing, adding, deleting, or changing variable length records, to a union list project for a group of libraries. unfortunately this otherwise well designed system has not been able to do away with the manual "typed slips" back-up file which plagues so many other computerized union list projects.
also of interest in this processing system is the use of the work developed at the newcastle computer typesetting research project for computer controlled composition of the final output. section two of seminar on the organization and handling of bibliographic records by computer, newcastle-upon-tyne, 1967, edited by nigel s. m. cox and michael w. grose (archon books, hamden, connecticut, 1967), is the preferred description of all aspects of the system except for those who need the program specifications.

alan d. hogan

computer based information retrieval systems, edited by bernard houghton. camden, conn., archon books, 1969. 136 pp. $5.00.

this book contains six papers that their authors presented at a special course in april 1968 at the liverpool school of librarianship. the objective of the course was to survey the major computer based information retrieval systems operating in the united kingdom for an audience of prospective users and planners. the book is a successful elementary introduction to large information retrieval systems.

in the 1940's and early 1950's, such pioneers as w. e. baten, g. cordonnier, calvin mooers and mortimer taube developed new techniques for information retrieval, a phrase which mooers coined. the major innovation in the new development was "coordinate indexing," or the coordination of index terms at the time of searching. coordination employed simple boolean logic: "and," "or," and "but not." coordinate indexing increased flexibility of searching and the number of accesses to documents, in contrast to the inflexible, pre-coordinated traditional subject catalogs. it was also characteristic of the early systems that they dealt with relatively small files of documents not under classical bibliographical control: patents, internal reports, and segments of external report literature.
with the advent of the computer, it became feasible to apply the new information retrieval techniques to large files of traditional materials, but to date the major effort has been directed toward huge files of journal articles. it is, therefore, no surprise to find that the five chapters in computer based information retrieval systems that describe systems all depict retrieval from files of journal articles. these five systems are medlars, the science citation index (sci) and its peripherals, chemical titles (ct) and chemical biological activities (cbac), a burgeoning institution of electrical engineers (iee) sponsored project in selective dissemination of electronics information, and a minor computer application to production of the british technology index; the three major, operational projects are of united states origin. selective dissemination is a gratifying feature of sci, ct, cbac, and the iee project, for sdi applications take advantage of the computer's potential for personalization by servicing individual users on the basis of their individual needs.

the book is a successful primer that provides a useful introduction to computer based systems for retrieval of journal citations from large files. g. a. somerfield's last chapter, "state of the art of computer based information retrieval systems," is more than its title implies, for the last half of the chapter analyzes desirable improvements yet to be achieved. the first half could well serve as an introduction to the book.

recently, several worthwhile primers on information retrieval and retrieval systems have appeared. computer based information retrieval systems is still another to provide the brief, clear, elementary introduction that new students, new users, and new planners find most effective in providing an understanding of an unfamiliar field.

frederick g. kilgour

modern data processing, by robert r. arnold, harold c. hill and aylmer v. nichols. new york, john wiley and sons, inc., 1969.
370 pp. $8.95

this book is an updated version of the authors' previous book, introduction to data processing, john wiley and sons, 1966. the present volume is designed to be used as an introductory text to the concepts of all facets of data processing. it will not teach people to be programmers or systems analysts, but it can be very useful to anyone who would like to learn about data processing without having to become a programmer or systems analyst. the book is well organized and explains, in non-technical terms, highly technical facets of data processing. this book can be used not only at the high school level but also at the beginning college level. in it the authors strove, successfully, to make available all the latest advancements in the computer science field. in my opinion the authors have achieved their goal of developing a very good elementary text in data processing. i highly recommend this book to librarians and all others as a basic primer in automation. it will be particularly useful to administrators, as it has an excellent glossary that will assist them in their understanding of the data processing vocabulary and jargon.

thomas k. burgess

hidden online surveillance: what librarians should know to protect their own privacy and that of their patrons

alexandre fortier and jacquelyn burkell

information technology and libraries | september 2015

abstract

librarians have a professional responsibility to protect the right to access information free from surveillance. this right is at risk from a new and increasing threat: the collection and use of non-personally identifying information such as ip addresses through online behavioral tracking.
this paper provides an overview of behavioral tracking, identifying the risks and benefits, describes the mechanisms used to track this information, and offers strategies that can be used to identify and limit behavioral tracking. we argue that this knowledge is critical for librarians in two interconnected ways. first, librarians should be evaluating recommended websites with respect to behavioral tracking practices to help protect patron privacy; second, they should be providing digital literacy education about behavioral tracking to empower patrons to protect their own privacy online.

introduction

privacy is important to librarians. the american library association code of ethics (2008) states that “we protect each library user’s right to privacy and confidentiality with respect to information sought or received and resources consulted, borrowed, acquired or transmitted,” while the canadian library association code of ethics (1976) states that members have a responsibility to “protect the privacy and dignity of library users and staff.” this translates to a core professional commitment: according to the american library association (2014, under “why libraries?”), “librarians feel a professional responsibility to protect the right to search for information free from surveillance.”

increasingly, information searches are conducted online, and as a result librarians should be paying specific attention to online surveillance in their efforts to satisfy their privacy-related professional responsibility. this is particularly important given the current environment of significant and increasing threat to privacy in the online context.
although many concerns about online privacy relate to the collection, use, and sharing of personally identifiable information, there is increasing awareness of the risks associated with the collection and use of what has been termed ‘non-personally identifiable information’ (e.g., internet protocol addresses, pages visited, geographic location information, search strings, etc.; office of the privacy commissioner of canada 2011, 12). this practice has been termed ‘behavioral tracking’, and recent revelations of government security agency collection of user metadata (ball 2013; weston, greenwald and gallager 2014) have heightened awareness of this issue. the problem, however, is not new, nor is the practice restricted to the actions of governmental agencies. indeed, as early as 1996 commercial and non-commercial entities were practicing online behavioral tracking for purposes of website and interaction personalization and to present targeted advertising (“affinicast unveils personalization tool” 1996; “adone classified network and clickover announce strategic alliance” 1997).

alexandre fortier (afortie@uwo.ca) is a phd candidate and lecturer, faculty of information and media studies, the university of western ontario, london, ontario. jacquelyn burkell (jburkell@uwo.ca) is associate professor, faculty of information and media studies, the university of western ontario, london, ontario. doi: 10.6017/ital.v34i3.5495
since these initial forays into behavioral tracking and personalization of online content the practice has proliferated, and many sites now use a variety of behavioral tracking tools to enhance user experience and deliver targeted advertisements (see, e.g., the “what they know” series from the wall street journal 2010; gomez, pinnick and soltani 2009; soltani et al. 2009).

there can be no question that behavioral tracking is a form of surveillance (castelluccia and narayanan 2012), and the ubiquity of this practice means that users are regularly subject to this type of surveillance when they access online resources. in order to satisfy a professional commitment to support information access free from surveillance, librarians must therefore address two related issues: first, they must ensure that the resources they recommend are privacy-respecting in that those resources engage in little if any online surveillance; second, they must raise the digital literacy of their patrons with respect to online privacy, increasing understanding of online tracking mechanisms and the strategies that patrons can use to protect their privacy in light of these activities.

addressing the first issue requires that librarians attend to surveillance practices when recommending online information resources. privacy and surveillance issues, however, are notably absent from common guidelines for evaluating web resources (see, e.g., kapoun 1998; university of california, berkeley 2012; johns hopkins university 2013), and thus librarians do not have the guidance they need to ensure that the resources they recommend are privacy-respecting.
it is critical that librarians and other information professionals address this gap by developing an understanding of the surveillance mechanisms used by websites and the strategies that can be deployed to identify and even nullify these mechanisms. this same understanding is necessary to address the second goal of raising the privacy-related digital literacy of patrons. librarians must understand tracking mechanisms and potential responses in order to integrate privacy literacy into library digital literacy initiatives that are central to the mission of libraries (american library association 2013).

this paper provides an introduction to behavioral tracking mechanisms and responses. the goals of this paper are to provide an overview of the risks and benefits associated with online behavioral tracking, to discuss the various surveillance mechanisms that are used to track user behavior, and to provide strategies for identifying and limiting online behavioral tracking. we have elsewhere published analyses of behavioral tracking practices on websites recommended by information professionals (burkell and fortier 2015), and on practices with respect to the disclosure of tracking mechanisms (burkell and fortier 2015). this paper serves as an adjunct to those empirical results, providing information professionals with background that will assist them in negotiating, on behalf of themselves and their patrons, the complex territory of online privacy.

consumer attitudes toward behavioral tracking

survey data suggest that consumers are, in general, aware of behavioral tracking practices.
the 2013 us consumer data privacy study (truste 2013), for example, reveals that 80 percent of users are aware of online behavioral tracking on their desktop devices, while slightly under 70 percent are aware of tracking on mobile devices (see also office of the privacy commissioner of canada 2013). awareness, however, does not directly translate to understanding, and recent data indicate that even relatively sophisticated internet users are not fully informed about behavioral tracking practices (mcdonald and cranor 2010; smit et al. 2014). moreover, attitudes about tracking are at best ambivalent (ur et al. 2012), and many studies indicate a predominantly negative reaction to these practices (turow et al. 2009; mcdonald and cranor 2010; truste 2013). although it is not universally required by regulatory frameworks, many users feel that companies should request permission before collecting behavioral tracking data (office of the privacy commissioner of canada 2013). finally, although some users take steps to limit or even eliminate behavioral tracking, many do not. for example, while one-third to three-quarters of survey respondents indicate that they manage or refuse browser cookies (truste 2013; comscore 2007; 2011; rainie et al. 2013), at least one quarter reported no attempts to limit behavioral tracking. this may be attributed to the difficulty of using such mechanisms (leon et al. 2011).

behavioral tracking and its mechanisms

tracking mechanisms transmit non-personally identifiable information to websites for different purposes. originally, the information collected by these mechanisms was used to enhance user experience and to make website interactions more efficient.
tracking mechanisms can record user actions on a web page and their interaction preferences. using these data, websites can, for example, direct returning visitors to a specific location in the site, allowing those visitors to resume interaction with a website at the point where they left off on the previous visit. using the internet protocol (ip) address of a user, websites can display information relevant to the geographic area where the user is located. tracking mechanisms also allow a website to remember registration details and the items users have put in their shopping basket (harding, reed and gray 2001).

tracking mechanisms are also of great use to webmasters, supporting the optimization of website design. for example, these mechanisms can inform webmasters of users’ movements on their websites: what pages are visited, how often they are visited, and in what order. they can also indicate the common entry and exit points for a specific website. this information can be leveraged in site redesign to increase user satisfaction and traffic.

hidden online surveillance: what librarians should know to protect their own privacy and that of their patrons | fortier and burkell | doi: 10.6017/ital.v34i3.5495 62

website optimization and interaction personalization have potential benefit to users. at the same time, however, the detailed profile of user activities, potentially aggregated across multiple visits to different websites, presents potential privacy risks.
the information gathered through tracking mechanisms can allow a website to identify browsing and information access habits, to infer user characteristics including location and some demographics, and to know what topics or products are of particular interest to a user.

tracking mechanisms can be first-party or third-party, and the difference has implications for the detail that can be assembled in the user profile. first-party mechanisms are set directly by the website a user is visiting, while third-party mechanisms are set by outside companies providing services, such as advertising, analysis of user patterns and social media integration, on the primary site. first-party tracking mechanisms collect information about a site visit and visitor and deliver that information to the site itself. using first-party tracking, websites can provide personalized interaction, integrating visit and visitor information both within a single visit and across multiple visits (randall 1997). this information is available only to the website itself, and thus neither includes information about visits to other sites nor is accessible by other websites, unless the information is sold or leaked by the first-party site (see narayanan 2011).

third-party tracking mechanisms, by contrast, deliver information about a site visit and visitor to a third party. this transaction is often invisible to the user, and the information is typically transmitted without explicit user consent.
third-party tracking represents a greater threat to privacy, since third parties have a presence on multiple sites, and are able to collect information about users and their activities on all those sites and integrate that information across sites and across visits into a single detailed user profile (see mayer and mitchell 2012 for a discussion of privacy problems associated with third-party tracking). research demonstrates that third-party tracking is a common and perhaps even ubiquitous practice (gomez, pinnick and soltani 2009; burkell and fortier 2013). it is not uncommon for websites to have trackers from more than one third party, and some websites, especially popular ones, have trackers from dozens of different organizations: gomez, pinnick and soltani (2009), for example, found 100 unique web beacons on a single website. furthermore, the same tracking companies are present on many different websites, allowing them to integrate into a single profile information about visits to each of these many sites. privacychoice¹, which maintains a comprehensive database of tracking companies, estimates that google display network (doubleclick), for instance, has a presence on 57 percent of websites. thus, a user traveling the web is likely to be tracked by doubleclick on more than half of the sites they visit, and doubleclick has access to information about all visits to each of these many sites.

worries about the potential privacy breaches that mechanisms for tracking a user’s activities online can allow are not new.
even at their inception in the mid-1990s, http cookies (also known as browser cookies) were generating controversy about the potential invasion of privacy (e.g. randall 1997). users, however, quickly realized that they could manage http cookies using accessible browser settings that limit or even entirely disallow the practice of setting cookies. as a result, websites, advertisers and others who benefit from web audience segmentation and behavior analytics developed newer and more obscure tracking technologies including ‘supercookies’ and web beacons, and these technologies are now deployed along with http cookies (sipior, ward and mendoza 2011). tracking technologies are constantly evolving in response to user behavior and advertiser demand, so keeping up to date is an ongoing challenge (see, e.g., goodwin 2011).

¹ http://www.privacychoice.org/.

http cookies

http cookies were originally meant to help web developers “invisibly” gather information about users in order to personalize and optimize user experience (randall 1997). these cookies are simply a few lines of text shared in an http transaction, and a typical cookie might include a user id, the time of a visit, and the ip address of the computer. cookies are associated with a specific browser, and the information is not shared between different browsers on the same machine: thus, the cookies stored by firefox are not accessible to internet explorer, and vice versa.
cookies do not usually include identifying information such as name or address, and they can do so only if the user has explicitly provided this information to the website. when users want to access a web page, their browser checks the hard drive for a cookie file from this site and sends a request, including any such cookie, to the server for the specific website. if there is no cookie, the server assigns a unique identifier code to the browser and a cookie file is saved on the hard drive. if there is a cookie, it is retrieved and the information is used to personalize and structure the website interaction (for a detailed description of the mechanics of cookies, see kristol 2001, 152–155).

some http cookies, called session or transient cookies, automatically expire when the browser is closed (barth 2011). they are mainly used to keep track of what a consumer has added to a shopping cart or to allow users to navigate on a website without having to log in repeatedly. other http cookies, called permanent, persistent or stored cookies, are configured to keep track of users until the cookie reaches its expiration date, which can be set many years after creation (barth 2011). permanent http cookies can be easily deleted using browser management tools (sipior, ward and mendoza 2011). studies have shown that approximately a third of users delete cookies once a month (e.g. comscore 2007; 2011). such behavior, however, displeases advertisers, as it leads to an overestimation of the number of true unique visitors on a website and impedes user tracking (marshall 2005; see also comscore 2007; 2011).
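
the distinction between session and persistent cookies can be made concrete with a short sketch. the python example below is a minimal illustration, not a description of any particular browser or website: it uses the standard library's `http.cookies.SimpleCookie` to parse a `Set-Cookie` header value, and it labels a cookie persistent when the header carries an `Expires` or `Max-Age` attribute (the attributes defined in rfc 6265) and a session cookie otherwise. the function name `classify_cookie` and the header values are hypothetical, chosen only for this example.

```python
from http.cookies import SimpleCookie


def classify_cookie(set_cookie_header: str) -> dict:
    """Map each cookie name in the header to 'session' or 'persistent'.

    A cookie is treated as persistent when it carries an Expires or
    Max-Age attribute, and as a session cookie otherwise.
    """
    jar = SimpleCookie()
    jar.load(set_cookie_header)
    result = {}
    for name, morsel in jar.items():
        # Unset attributes are stored as empty strings on the morsel.
        has_lifetime = bool(morsel["expires"]) or bool(morsel["max-age"])
        result[name] = "persistent" if has_lifetime else "session"
    return result


# Hypothetical header values, invented for illustration.
session_header = "cart=abc123; Path=/"
persistent_header = "uid=42; Expires=Wed, 09 Jun 2032 10:18:14 GMT; Path=/"

print(classify_cookie(session_header))
print(classify_cookie(persistent_header))
```

the same check could be applied to the `Set-Cookie` headers displayed in a browser's developer tools to see which cookies are meant to outlive the browsing session.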
flash cookies and other ‘supercookies’

to counter this ‘attack’ on http cookies, an online advertising company, united virtualities, developed a backup system for cookies using the local shared object feature of adobe’s flash player plug-in: the persistent identification element (sipior, ward and mendoza 2011). this type of storage, called flash player local shared objects or, more commonly, flash cookies, shares many similarities with http cookies with regard to tracking capabilities, storing similar non-personally identifying information. unlike http cookies, however, flash cookies do not have an expiration date, a characteristic that makes them permanent until they are manually deleted. they are also not handled by a browser, but are stored in a location accessible to different browsers and flash widgets, which are thus all able to access the same cookie. they can hold much more data (up to 100 kb by default compared to 4 kb for http cookies), and support more complex data types than http cookies (see mcdonald and cranor 2012 for a technical comparison of http and flash cookies). moreover, it is estimated that adobe’s flash player is installed on over 99 percent of personal computers (adobe 2011), making flash cookies usable on virtually all computers.

flash cookies represent a more resilient technology for tracking than http cookies. erasing traditional cookies within a browser does not affect flash cookies, which need to be erased in a separate panel (sipior, ward and mendoza 2011).
flash cookies also have the ability to ‘respawn’ (or recreate) deleted http cookies. a website using flash cookies can therefore track users across sessions even if the user has taken reasonable steps to avoid this type of online profiling (soltani et al. 2009), and although it is declining in incidence, this practice is still occurring, sometimes on very popular websites (ayenson et al. 2011; mcdonald and cranor 2012).

it should also be noted that other internet technologies (e.g. silverlight, javascript, and html5), which have so far attracted less attention from researchers, use local storage for similar purposes. one developer even created the ‘evercookie’, a very persistent cookie incorporating twelve types of storage mechanisms available in a browser that makes data persist and allows for respawning (kamkar 2010). the national security agency has investigated this method as a way to de-anonymize users of the tor network, a network which aims at concealing the location and usage of its users (‘tor stinks’ presentation 2013).

web beacons

users’ online behavior can also be monitored by web beacons (also called web bugs, clear gifs or pixel tags), which are tiny image tags embedded within a document, appearing on a webpage or attached to an email, that are intended to go unnoticed (martin, wu and alsaid 2003). the image tag creates a holding space for a referenced image residing on the web, and beacons transmit information to a remote computer when the document (web page or email) is viewed.
web beacons can gather information on their own, and they can also retrieve information from a previously set cookie (angwin 2010; see martin, wu and alsaid 2003 for a description of the different technological abilities of web beacons). such capacity means, according to the privacy foundation (smith 2000; quoted in martin, wu and alsaid 2003), that beacons could potentially transfer to a third party demographic data and personally identifiable information (name, address, phone number, email address, etc.) that a user has typed on a page. unlike cookies, beacons are not tied to a specific server and can track users over multiple websites (schoen 2009). beacons, moreover, cannot be managed through browser settings. while blocking third-party cookies limits their range of action, it does not preclude beacons from gathering information on their own, and users have to install browser extensions to effectively limit the effects of web beacons.

strategies for identifying behavioral tracking

in order to identify privacy-respecting online resources, librarians must learn to assess the behavioral tracking activities occurring on websites. the first step is to identify and review website privacy policies.
privacy guidelines regulating the collection, retention and use of personal information in the online environment usually require that users be given notice of website practices (e.g., the fair information practice principles² proposed in 1973 by the us secretary’s advisory committee on automated personal data systems, the convention for the protection of individuals with regard to automatic processing of personal data developed by the council of europe (1981), and the organisation for economic co-operation and development guidelines on the protection of privacy and transborder flows of personal data³). this notice is typically provided in privacy policies that identify what information is collected, how it is used, and with whom it is shared. regulatory frameworks, however, did not originally contemplate the collection of non-personally identifiable information. while such disclosure would seem to be consistent with the fair information practice principles, the current mode of control is in many cases self-regulatory⁴ ⁵, and full compliance with notice requirements is far from universal (komanduri et al. 2011-2012). thus, while disclosure of behavioral tracking practices in websites should be seen as diagnostic of the presence of these mechanisms, lack of disclosure cannot be interpreted to mean that the site does not engage in behavioral tracking (komanduri et al. 2011-2012; burkell and fortier 2013b).

furthermore, privacy policy disclosures, where they do exist, may be difficult to understand (burkell and fortier 2013b). website privacy policies are often complex (micheti, burkell and steeves 2010).
they tend to be written with the goal of protecting a website owner against lawsuits rather than informing users (earp et al. 2005; pollach 2005). pollach (2005), for example, details a variety of linguistic strategies that serve to undermine user understanding of website practices, including mitigation and enhancement, obfuscation of reality, relationship building, and persuasive appeals. therefore, even if many websites acknowledge the collection of non-personally identifiable information by both first and third parties, the effectiveness of this disclosure is limited, making privacy policies a relatively ineffective tool to identify behavioral tracking practices.

² the privacy act of 1974, 5 u.s.c. § 552a.
³ c(80)58/final, as amended on 11 july 2013 by c(2013)79.
⁴ for instance, the new self-regulatory guidelines for online behavioral advertising identify the need to provide notice to users when behavioral data is collected that allows the tracking of users across websites and over time (united states federal trade commission, 2009).
⁵ exceptions to this self-regulatory principle are increasing, including but not limited to the california online privacy protection act of 2003 (oppa), and the eu cookie directive (2009/136/ec) of the european parliament and of the council.

as a result, librarians need to develop strategies and tools that allow them to assess directly the behavioral tracking practices of websites, so that these practices can be considered in making website recommendations.
different protocols can be followed in making this assessment, but they should be built around the following guiding principles (see burkell and fortier 2013a for a full discussion). the first important principle is that each website should be visited in an independent session to eliminate contamination, beginning with the browser at an about:blank page, with clean data directories (no http or flash cookies, and an empty cache). the evaluator should ensure that browser settings are configured to allow cookies, that tools to track web beacons (e.g., the ghostery⁶ browser extension) are installed in the browser, and that adobe flash is configured, via the website storage settings panel, to accept data. the website should then be accessed directly by entering the domain name into the browser’s navigation bar. evaluators should mimic a typical user interaction with the website on many pages without clicking on advertisements or following links to outside sites. as they browse through the site, evaluators should record the web beacons and trackers identified by the browser extension (e.g., ghostery). at the end of the session, they should immediately review the contents of the browser cookie file and the adobe flash panel via website storage settings, recording any cookies that are present. privacychoice, as well as ghostery, maintains a database of trackers that evaluators can use to identify associated privacy risk. while all third-party trackers raise some privacy issues, some of them put users at a greater risk than others, either because of their practices or their presence on a large number of websites.
evaluators should take that into account when making a decision.

strategies for limiting behavioral tracking

users may also take these steps to identify the presence of behavioral tracking, and digital literacy initiatives should provide this information along with tools and strategies that users can employ to limit tracking. it should be noted that elimination of all behavioral tracking may not be a desirable outcome from the perspective of users who benefit from the website personalization and optimization supported by these mechanisms. targeted advertising can also be positive for many people, since it eliminates unwanted or ‘useless’ advertisements. ultimately, a user must decide whether he or she wants to be tracked. digital literacy initiatives should raise awareness of behavioral tracking and provide users with the tools they need to identify and control tracking should they choose to do so.

the easiest step is for users to learn how to manage http cookies in every web browser that they use. using browser settings, users can decide to refuse third-party cookies or even all cookies. the latter, however, will make the browsing experience much less efficient and may impede users from accessing some websites. users should also learn how to delete cookies and they should be encouraged to think about periodically emptying the cookie file of each of their browsers. controlling flash cookies is more complex, yet crucial considering the capabilities of flash cookies. this is achieved through settings on the adobe website storage settings panel.

⁶ https://www.ghostery.com/.

browser extensions, such as ghostery and adblock plus⁷, can be added to most browsers. ghostery allows users to block trackers, either on a tracker-by-tracker basis, a site-by-site basis or a mixture of the two. also customizable, adblock plus allows users to block either all advertisements or only the ones they do not want to see. these extensions, however, may slow down internet browsing.

users can also change their internet use habits. it is possible for users to use search engines that do not store any non-personally identifiable information, such as ixquick⁸ and duckduckgo⁹. ixquick returns the top ten results from multiple search engines. it only sets one cookie that remembers a user’s search preferences and that is deleted after a user does not visit ixquick for 90 days. duckduckgo, which returns the same search results for a given search term to all users, aims at getting information from the best sources rather than the most sources. while these search engines do not have all the functionality of the major search engines, both of them have received praise (e.g. mccracken 2011). the ultimate solution, one that allows a user to navigate online in total anonymity, is to use the tor¹⁰ web browser, which impedes network surveillance or traffic analysis and which the u.s. national security agency has characterized as “the king of high secure, low latency internet anonymity” (schneier 2013). the anonymity afforded by tor, however, comes at the price of reduced speed and limitations to available content.
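
the blocking extensions described above rely, broadly speaking, on recognizing requests that a page sends to hosts other than the site being visited and comparing those hosts with a list of known trackers. the python sketch below illustrates that idea under simplifying assumptions; it is not the actual implementation of ghostery or adblock plus. the helper names (`site_of`, `third_party_report`), the urls, and the blocklist are hypothetical, and a url's site is approximated by the last two labels of its hostname, which misgroups hosts under suffixes such as .co.uk.

```python
from collections import Counter
from urllib.parse import urlsplit


def site_of(url: str) -> str:
    """Approximate a URL's site as the last two labels of its hostname.

    Simplification: grouping hosts correctly in general requires a
    public-suffix list, which this sketch does not use.
    """
    host = (urlsplit(url).hostname or "").lower()
    return ".".join(host.split(".")[-2:])


def third_party_report(page_url, request_urls, blocklist):
    """Count third-party requests per site and flag sites on the blocklist."""
    first_party = site_of(page_url)
    counts = Counter(
        site_of(u) for u in request_urls if site_of(u) != first_party
    )
    return {
        site: {"requests": n, "blocked": site in blocklist}
        for site, n in counts.items()
    }


# Hypothetical page, requests, and blocklist, invented for illustration.
page = "https://www.library.example/health"
requests_seen = [
    "https://static.library.example/site.css",     # first party
    "https://ads.tracker.example/pixel.gif?id=1",  # third party
    "https://ads.tracker.example/pixel.gif?id=2",  # third party
    "https://cdn.widgets.example/share.js",        # third party
]
known_trackers = {"tracker.example"}

report = third_party_report(page, requests_seen, known_trackers)
for site, info in sorted(report.items()):
    print(site, info)
```

a tally of this kind mirrors what an evaluator records by hand during a browsing session: which outside organizations a page contacts, how often, and whether each one appears in a tracker database.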
conclusion

it is widely understood that online privacy is at risk, threatened by the actions of governmental agencies and commercial entities. there is widespread awareness of and attention to the risks associated with the collection and use of personally identifiable information, but less attention is paid to an equally significant issue: the collection and use of information that is highly personal but nonetheless ‘non-identifying’. this practice, termed ‘behavioral tracking’, is the focus of this paper. other research demonstrates that behavioral tracking is widespread (gomez, pinnick and soltani 2009; burkell and fortier 2013a), but users demonstrate only a limited knowledge of the practice and they do little to control tracking (comscore 2007; 2011; rainie et al. 2013; truste 2013). we argue that librarians have a dual professional responsibility with respect to this issue: first, librarians should be aware of the surveillance practices of the websites they recommend to patrons and take these practices into account in making website recommendations; second, digital literacy initiatives spearheaded by librarians should include a focus on online privacy, and provide patrons with the information they need to manage their own online privacy.

this paper presents an overview of online behavioral tracking mechanisms, and provides strategies for identifying and limiting online behavioral tracking. the information presented provides a basic understanding of tracking mechanisms along with practical strategies that librarians can use to evaluate websites with respect to these practices and strategies that can be used to limit online tracking. we recommend that website evaluation standards be extended to include assessment of online privacy and especially behavioral tracking. we also recommend that librarians actively promote digital literacy by engaging in public education programs that take privacy and other digital literacy issues into account (american library association 2013). finally, we note that protecting online privacy is an ongoing challenge, and librarians must ensure that they continually update their understanding of online surveillance mechanisms and the approaches that can be used to monitor and limit these activities.

⁷ https://adblockplus.org/.
⁸ https://www.ixquick.com/.
⁹ https://duckduckgo.com/.
¹⁰ www.torproject.org/torbrowser/.

acknowledgement

support for this project was provided by the office of the privacy commissioner of canada through its contributions program. the views expressed in this document are those of the researchers and do not necessarily reflect the views of the office of the privacy commissioner of canada.

references

adobe. 2011. “adobe flash platform runtimes: pc penetration”. http://www.adobe.com/mena_en/products/flashplatformruntimes/statistics.html.
“adone classified network and clickover announce strategic alliance”. 1997. business wire, march 24.
“affinicast unveils personalization tool”. 1996. adage, december 4. http://adage.com/article/news/affinicast-unveils-personalization-tool/2714/.
american library association. 2008. code of ethics. http://www.ala.org/advocacy/proethics/codeofethics/codeethics.
———. 2013. digital literacy, libraries, and public policies: report of the office for information technology policy’s digital literacy task force. http://www.districtdispatch.org/wp-content/uploads/2013/01/2012_oitp_digilitreport_1_22_13.pdf.
———. 2014. choose privacy week. accessed april 8. http://chooseprivacyweek.org.
angwin, julia. 2010. “the web’s new gold mine: your secrets”. the wall street journal, july 31. http://online.wsj.com/news/articles/sb10001424052748703940904575395073512989404.
ayenson, mika, dietrich james wambach, ashkan soltani, nathan good and chris jay hoofnagle. 2011. “flash cookies and privacy ii: now with html5 and etag respawning”. social science research network. http://ssrn.com/abstract=1898390.
ball, james. 2013. “nsa stores metadata of millions of web users for up to a year, secret files show”. the guardian, september 30. http://www.theguardian.com/world/2013/sep/30/nsa-americans-metadata-year-documents.
barth, adam. 2011. “http state management mechanism”. internet engineering task force, rfc 6265. http://tools.ietf.org/html/rfc6265.
burkell, jacquelyn and alexandre fortier. 2013. privacy policy disclosures of behavioural tracking on consumer health websites. proceedings of the 76th annual meeting of the association for information science and technology, edited by andrew grove. doi: 10.1002/meet.14505001087.
burkell, jacquelyn and alexandre fortier. 2015. could we do better? behavioural tracking on recommended consumer health websites. health information and libraries journal 32 (3): 182–194.
canadian library association. 1976. code of ethics. http://www.cla.ca/content/navigationmenu/resources/positionstatements/code_of_ethics.htm.
castelluccia, claude and arvind narayanan. 2012. privacy considerations of online behavioural tracking. heraklion, greece: european union agency for network and information security. http://www.enisa.europa.eu/activities/identity-and-trust/library/deliverables/privacy-considerations-of-online-behavioural-tracking.
comscore. 2007. the impact of cookie deletion on the accuracy of site-server and ad-server metrics: an empirical comscore study. https://www.comscore.com/fre/insights/presentations_and_whitepapers/2007/cookie_deletion_whitepaper.
———. 2011. the impact of cookie deletion on site-server and ad-server metrics in latin america: an empirical comscore study. http://www.comscore.com/insights/presentations_and_whitepapers/2011/impact_of_cookie_deletion_on_site-server_and_ad-server_metrics_in_latin_america.
council of europe. 1981. convention for the protection of individuals with regard to automatic processing of personal data. http://conventions.coe.int/treaty/en/treaties/html/108.htm.
earp, julia b., annie i. antón, lynda aiman-smith and william h. stufflebeam. 2005. “examining internet privacy policies within the context of user values”. ieee transactions on engineering and management 52 (2): 227–237.
gomez, joshua, travis pinnick and ashkan soltani. 2009. knowprivacy. http://ashkansoltani.files.wordpress.com/2013/01/knowprivacy_final_report.pdf.
goodwin, josh. 2011. super cookies, ever cookies, zombie cookies, oh my. ensighten, blog entry. http://www.ensighten.com/blog/super-cookies-ever-cookies-zombie-cookies-oh-my.
harding, william t., anita j. reed and robert l. gray. 2001. cookies and web bugs: what they are and how they work together. information systems management 18 (3): 17–24.
johns hopkins university sheridan libraries. 2013. evaluating information found on the internet. http://guides.library.jhu.edu/evaluatinginformation.
kamkar, samy. 2010. “evercookie”. http://samy.pl/evercookie/.
kapoun, jim. 1998. “teaching undergrads web evaluation: a guide for library instruction”. college & research libraries news, july/august: 522–523.
komanduri, saranga, richard shay, greg norcie, blase ur and lorrie faith cranor. 2011-2012. “adchoices? compliance with online behavioral advertising notice and choice requirements”. i/s: a journal of law and policy for the information society 7: 603–638.
kristol, david m. 2001. http cookies: standards, privacy, and politics. acm transactions on internet technology 1 (2): 151–198.
leon, pedro giovanni, blase ur, rebecca balebako, lorrie faith cranor, richard shay, and yang wang. 2012. “why johnny can’t opt out: a usability evaluation of tools to limit online behavioral advertising”. proceedings of the sigchi conference on human factors in computing systems. http://dl.acm.org/citation.cfm?id=2207759.
marshall, matt. 2005. “new cookies much harder to crumble”. the standard-times, may 15. http://www.southcoasttoday.com/apps/pbcs.dll/article?aid=/20050515/news/305159957.
martin, david, hailin wu and adil alsaid. 2003. hidden surveillance by web sites: web bugs in contemporary use. communications of the acm 46 (1): 258–264.
mayer, jonathan r. and john c. mitchell. 2012. third-party web tracking: policy and technology. proceedings of the 2012 ieee symposium on security and privacy.
https://cyberlaw.stanford.edu/files/publication/files/trackingsurvey12.pdf.   mccracken,  harry.  2011.  “50  websites  that  make  the  web  great.  time,  august  16.   http://content.time.com/time/specials/packages/0,28757,2087815,00.html.   mcdonald,  aleecia  m.  and  lorrie  faith  cranor.  2010.  “beliefs  and  behaviors:  internet  users’   understanding  of  behavioral  advertising”.  social  science  research  network.   http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1989092.   ———.  2012.  “a  survey  of  the  use  of  adobe  flash  local  shared  objects  to  respawn  http  cookies”.   i/s:  a  journal  of  law  and  policy  for  the  information  society  7  (3):  639–687.   micheti,  anca,  jacquelyn  burkell  and  valerie  steeves.  2010.  “fixing  broken  doors:  strategies  for   drafting  privacy  policies  young  people  can  understand”.  bulletin  of  science,  technology,  and   society.  30  (2):  130–143.   narayanan,  arvind.  2011.  “there  is  no  such  thing  as  anonymous  online”.  blog  entry,  july  28.   https://cyberlaw.stanford.edu/blog/2011/07/there-­‐no-­‐such-­‐thing-­‐anonymous-­‐online-­‐tracking.     information  technologies  and  libraries  |  september  2015   71   office  of  the  privacy  commissioner  of  canada.  2011.  report  on  the  2010  office  of  the  privacy   commissioner  of  canada's  consultations  on  online  tracking,  profiling  and  targeting,  and  cloud   computing.  https://www.priv.gc.ca/resource/consultations/report_201105_e.pdf.   ———.  2013.  survey  of  canadians  on  privacy-­‐related  issues.   http://www.priv.gc.ca/information/por-­‐rop/2013/por_2013_01_e.pdf.   pollach,  irene.  2005.  “a  typology  of  communicative  strategies  in  online  privacy  policies:  ethics,   power,  and  informed  consent”.  journal  of  business  ethics  62  (3):  221–235.   rainie,  lee,  sara  kiesler,  ruogu  kang  and  mary  madden.  anonymity,  privacy,  and  security  online.   
pew  research  internet  project.  http://www.pewinternet.org/2013/09/05/anonymity-­‐privacy-­‐ and-­‐security-­‐online/.   randall,  neil.  1997.  “the  new  cookie  monster”.  pc  magazine  16  (8):  211–214.   schneier,  bruce.  2013.  “attacking  tor:  how  the  nsa  targets  users'  online  anonymity”.  the   guardian,  4  october.  http://www.theguardian.com/world/2013/oct/04/tor-­‐attacks-­‐nsa-­‐users-­‐ online-­‐anonymity.   schoen,  seth.  2009.  “new  cookie  technologies:  harder  to  see  and  remove,  widely  used  to  track   you”.  blog  entry,  september  14.  https://www.eff.org/deeplinks/2009/09/new-­‐cookie-­‐ technologies-­‐harder-­‐see-­‐and-­‐remove-­‐wide.   sipior  ,  janice  c.,  burke  t.  ward  and  ruben  a.  mendoza.  2011.  online  privacy  concerns  associated   with  cookies,  flash  cookies,  and  web  beacons.  journal  of  internet  commerce  10  (1):  1–16.   smit,  edith  g.,  guda  van  noort  hilde  a.  m.  voorveld.  2014.  understanding  online  behavioural   advertising:  user  knowledge,  privacy  concerns,  and  online  coping  behaviour  in  europe.  computers   in  human  behavior  32  (1):  15–22.   smith,  r.  m.  2000.  “why  are  they  bugging  you?”  privacy  foundation.   http://www.privacyfoundation.org/resources/whyusewb.asp.     soltani,  ashkan,  shannon  canty,  quentin  mayo,  lauren  thomas,  chris  jay  hoofnagle.  2009.  “flash   cookies  and  privacy”.  social  science  research  network.   http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1446862.   “‘tor  stinks’  presentation”.  2013.  the  guardian  online,  october  4.   http://www.theguardian.com/world/interactive/2013/oct/04/tor-­‐stinks-­‐nsa-­‐presentation-­‐ document.   truste.  2013.  us  2013  consumer  data  privacy  study  –  advertising  edition.   http://www.truste.com/us-­‐advertising-­‐privacy-­‐index-­‐2013/.     
alexa, are you listening? an exploration of smart voice assistant use and privacy in libraries
miriam e. sweeney and emma davis
information technology and libraries | december 2020
https://doi.org/10.6017/ital.v39i4.12363

miriam e.
sweeney (mesweeney1@ua.edu) is associate professor, university of alabama. emma davis (edavispatsfan@gmail.com) is library specialist, hoover public library.

abstract

smart voice assistants have expanded from personal use in the home to applications in public services and educational spaces. the library and information science (lis) trade literature suggests that libraries are part of this trend; however, there are few empirical studies that explore how libraries are implementing smart voice assistants in their services, and how these libraries are mitigating the potential patron data privacy issues posed by these technologies. this study fills this gap by reporting on the results of a national survey that documents how libraries are integrating voice assistant technologies (e.g., amazon echo, google home) into their services, programming, and checkout programs. the survey also surfaces some of the key privacy concerns of library workers in regard to implementing voice assistants in library services. we find that although voice assistant use might not be mainstreamed in library services in high numbers (yet), libraries are clearly experimenting with (and having internal conversations with their staff about) using these technologies. the responses to our survey indicate that library workers have many savvy privacy concerns about the use of voice assistants in library services that are critical to address in advance of library institutions riding the wave of emerging technology adoption. this research has important implications for developing library practices, policies, and education opportunities that place patron privacy as a central part of digital literacy in an information landscape characterized by ubiquitous smart surveillant technologies.
introduction

smart voice assistant use has expanded from personal uses in the home to new applications in customer services, healthcare, e-government, and educational spaces, raising questions from groups like the american civil liberties union (aclu), among others, about the data privacy implications of these technologies in public and shared spaces.1 libraries are part of the voice assistant adoption trend, as documented in the american libraries magazine article “your library needs to speak to you” by carrie smith.2 smith gives examples of school, public, and academic libraries adopting smart voice assistants like amazon’s alexa and echo devices for a range of services and programming including “event calendars, catalog searches, holds, and advocacy.” nicole hennig points out that there are tremendous opportunities for voice assistants to assist “people with disabilities, the elderly, and people who can’t easily type.”3 in these ways, voice assistants are often presented in the trade literature as part of an exciting new wave of emerging smart technology services that libraries can “get ahead of” and potentially harness for public service and community engagement. at the same time, the key privacy issues inherent in voice assistants are often downplayed as secondary concerns while librarians are encouraged to press forward and experiment with smart technology adoption. we argue that the privacy concerns surrounding voice assistant use in libraries should be treated as fundamental questions for library workers to consider as a part of upholding the core values of patron privacy and confidentiality in library services.

voice assistant use in libraries is still nascent, reflecting the emerging nature of these technologies.
given this, it is not surprising that very few empirical studies have explored voice assistant use and potential data privacy implications for libraries. our research is intended as an exploratory study that contributes to advancing knowledge in this area. the goals of this study are to begin mapping smart voice assistant use in libraries, to assess how aware library workers are of privacy concerns involving these technologies, and to document how library workers are educating patrons about privacy and voice assistant use. these are necessary first steps for developing library practices, policies, and education opportunities for voice assistant use that prioritize privacy as a central part of digital literacy in an information landscape characterized by ubiquitous smart surveillant technologies and diminishing data privacy protections.

review of literature

what is a voice assistant?

voice assistants are a type of digital assistant technology, also known as virtual assistants, and can be broadly defined as computer programs designed with human characteristics that act on behalf of users in digital environments using voice interfaces.4 apple’s siri, microsoft’s cortana, and amazon’s alexa are prevalent examples of smart digital assistants that use voice recognition and natural language user interfacing to help learn users’ preferences, answer questions, and manage a variety of applications and personal information. voice assistants can run on multiple devices and be seamlessly integrated across platforms including networked internet of things (iot) gadgets like smart speakers (e.g., amazon echo and google home) and other smart-home technologies (e.g., nest or ring), along with mobile devices, smart watches, personal computers, and numerous third-party applications. ubiquitous “always on” features are offered as a convenience to users who can use “wake words” (e.g., “hey, siri”; “alexa”; “ok google”) to initiate queries and commands.
amazon’s smart speakers and intelligent digital assistants are rapidly becoming pervasive home and personal technologies, with the amazon echo leading the market in 2019 with 61 percent market share, followed distantly by the google home device with 24 percent market share.5 a recent united states survey by clutch reported that nearly half of people surveyed owned a voice assistant, with one-third planning to purchase one in the next three years.6 additionally, the clutch survey found that 69 percent of voice assistant owners used their devices every day.7 the popularity of voice assistants for personal use has driven the expansion of these technologies for customer service applications outside of the home in shared and public spaces, including in educational settings and health care. in this landscape it is perhaps not surprising that librarians are following suit and exploring the service potentials of voice assistants for libraries.

libraries and voice assistant use

the american library association’s (ala) center for the future of libraries initiative identified “voice control” as a trend in their 2017 report, anticipating the relevance of voice assistant technologies for libraries.8 the capability of voice assistants to integrate across platforms through customized applications (which amazon calls “skills” and google refers to as “actions”) allows libraries to create specialized uses for these technologies as a part of their regular information services. additionally, existing third-party vendors like overdrive (for e-book lending) and hoopla (multimedia lending) that most public libraries use are preconfigured to connect to voice assistants like amazon’s alexa.
there are many creative and potentially helpful ways that voice assistants could be integrated into the library setting, including enhancing read-along with music and effects, providing accessible services for elderly patrons or individuals with disabilities, and providing an alternative access point for common library queries and institutional information (e.g., searching titles, placing holds, requesting library event information).9

some libraries have started experimenting with voice assistant services in the library. for example, iowa state university staff developed alexa skills for their library so that users could find out information about library history and library collections.10 other libraries are using voice assistants to strategically engage their communities, as when the spokane public library placed amazon echo dots in the library so patrons could ask questions about upcoming bond elections, an issue that directly impacts library funding.11 the worthington (oh) libraries are integrating voice assistant technologies into technology training and “petting zoo kits” which allow their patrons to try out emerging technologies.12 the king county (wa) library system is taking a novel approach and experimenting with developing their own voice assistant, libro.13 these examples point to the many applications and creative approaches libraries are experimenting with to bring voice assistant technology to their services.

data privacy issues

as convenient as voice assistants may be for library services, the underlying data infrastructures of these technologies are tightly controlled by the technology companies that design and sell them. the lack of library control (and transparency) over these infrastructures raises questions about how the core values of privacy and confidentiality can be guaranteed in the library setting.
14 voice assistant technologies capture a wide range of intimate user information in the form of biometric data (e.g., voice recognition), consumer habits, internet-based transactions, personally identifiable information (pii), and geographical information.15 the ubiquitous “always on” feature that makes these technologies so convenient also raises important privacy questions about the extent of user interactions that are recorded; how these files are processed, transcribed, and stored; and how local, state or other law enforcement agencies might compel or otherwise use these records.16 recently amazon has confirmed that they have employees dedicated to listening to recordings from echo devices in order to help “eliminate the gaps in alexa’s understanding of human speech and help it better respond to commands,” which is concerning for patron privacy in the library context.17 researchers at northeastern university and imperial college london recently studied how often smart speakers record “accidentally” and whether they are constantly recording. the study found no evidence to support the theory that these devices are constantly recording; however, the researchers did report that smart speakers are accidentally activated around 19 times a day, on average.18 these reports aside, much remains unknown about what these companies, and the companies they contract out work to, do with the personal data collected from voice assistants.
lastly, amazon is a known collaborator with us government agencies like homeland security and immigration and customs enforcement (ice), hosting their biometric data on amazon web services (aws).19 amazon has a reputation for being one of the least transparent technology companies in terms of data sharing practices, and has routinely evaded questions about if/how much of customers’ echo data has been turned over to federal authorities.20

given this data environment, the fact that libraries are beginning to experiment with voice assistant integration in their services poses important questions for patron data privacy and confidentiality. ala provides library privacy guidelines for third-party vendors that clearly detail expectations for use, aggregation, retention, and disclosure of user data.21 while this document has been helpful for guiding license agreements with digital content providers, program facilitators, and other libraries, it does not quite capture the range of complexities that emerging smart technologies pose in the app-driven iot landscape. this area is ripe for study and having more information about how libraries of different types are approaching using voice assistants is necessary for developing responsive professional practices that center issues of privacy and critical digital literacy. our survey explores some of these issues with the purpose of beginning to document voice assistant use, and associated privacy concerns, in library services.

research methods

four main research questions guide this study: (1) how are libraries using smart voice assistant technologies as a part of their library services? (2) how aware are library workers of how voice assistants integrate with third-party digital content platforms? (3) are libraries educating library patrons about the privacy implications of smart voice assistant technologies?
and (4) what kinds of privacy concerns do library workers have about the use of smart voice assistant technologies in their library services and programming? to address these questions, we developed an online survey using qualtrics web software, and distributed it in fall 2019 to 1,929 public and academic libraries across the us via email solicitation.22 the survey consisted of a mix of 31 multiple choice and open-ended questions designed to address different aspects of the stated research questions (see appendix a). since most of the examples of library voice assistant use detailed in the lis trade literature came from public and academic libraries, these were the library types we identified as most likely to already be experimenting with voice assistants in services and programming. using purposive sampling techniques, we selected 30 public libraries for each state that represented a range of rural and metropolitan service areas. we selected approximately 10-20 academic libraries per state, the actual numbers ranging based on the total number of universities and colleges in a given state. we identified a cross-section of large state schools, private colleges, and community colleges in each state to account for the variety of higher education institutional settings for academic libraries. we sent email solicitations to each public library, targeting email addresses for library directors where possible. for libraries that had centralized email services, we solicited participation using the contact forms available on the libraries’ websites. email solicitations to the academic libraries targeted library employees with job titles that included: emerging technology, user services, user experiences, head of public services, and head of technology. our survey analysis documents the numbers of reported uses, and kinds of integration, of voice assistant technologies across library applications and services. 
we conducted a qualitative content analysis of the short answer responses, with both researchers independently coding participant comments for emergent themes and categories. as a part of this process both researchers compared and negotiated categories in two iterations of coding to arrive at a common codebook which was then applied in the final pass of the responses. these categories have some distinct features, but also have many overlapping components. comments that embodied multiple themes were included in all categories that were relevant for describing them, meaning a particular comment might be included in multiple categories. the following sections report on the key findings of this study, organizing the discussion around our original research questions.

findings

participant demographics

we received 86 total responses for the survey, with the majority of respondents (61 percent) reporting affiliation with public libraries, followed by respondents from academic libraries (38 percent), with one respondent from a school library (1 percent).23 the participants represented libraries from 42 states across the us.24 the majority of public library respondents (65 percent) reported serving populations of 25,000 or more, though a sizable share also came from libraries serving smaller populations of 2,500-9,999. the majority of academic library respondents work for small and medium sized institutions serving populations between 2,500-9,999 (table 1), with nearly a third of respondents representing medium to large institutions. admittedly, these are rough demographic sketches to help quickly identify which types of libraries might be using voice assistants.
more granular demographic detail would be useful in future studies to further understand how factors like institution type, geographical region, access to resources, and service community demographics shape decisions about emerging technology adoption in libraries.

table 1. size of service population by library type

                  total     public    academic   school
total count       84        51        32         1
2,500 or less     11.9%     2.0%      28.1%      0.0%
2,500-9,999       25.0%     19.6%     34.4%      0.0%
10,000-25,000     16.7%     11.8%     25.0%      0.0%
25,000+           44.0%     64.7%     12.5%      0.0%
i’m not sure.     2.4%      2.0%      0.0%       100.0%

how are libraries using smart voice assistant technologies as a part of their library services?

only five respondents (6 percent) in our study reported that their library is currently using amazon echo, google home, or apple siri devices for patron services and programming. of the voice assistant adopters, three were public libraries using amazon echo and google home devices, and two were academic libraries using amazon echo and apple siri (table 2).

table 2. voice assistant device by library type

                  total     public    academic   school
amazon echo       3         1         2          0
google home       2         2         0          0
apple siri        1         0         1          0

librarians described using voice assistants to “provide basic info about the library and resources,” and on an “ad hoc basis” to promote the library-specific alexa skills and google home actions. other reported uses included “translation services” and as a part of “technology petting zoos.”25 we asked librarians to describe where voice assistants were located in the library to get a better idea of the spatial arrangements of these technologies, which could be important for considering potential surveillant concerns. several libraries reported that they had voice assistants sitting at front service desks or reference desks for patrons to use in both adult and children’s service areas, as well as at circulation desks.
as one librarian described, “we are mounting it [the voice assistant] so students/users can ask questions when necessary.” when it comes to using these devices in library programming, the most common response was for use in technology petting zoos and in technology classes where patrons can see technology demonstrations and ask library staff questions, or get one-on-one tutoring sessions:

“our technology department holds regular ‘tech drop-in's’ and carries out one on one assistance by appointment. in the context of these patrons will sometimes bring in their own devices or ask questions about the use of digital assistants.”

other programming applications that librarians mentioned for voice assistants included trivia, 3-d printing, and makerspaces. two libraries (one public and one academic) reported that they were circulating apple siri devices (e.g., ipads) and amazon alexa products (e.g., echo) for checkout.

how aware are library workers of how voice assistants integrate with third-party digital content platforms?

the majority of library workers surveyed (70 percent) reported that their libraries use third-party digital media platforms like overdrive and hoopla to provide multimedia content like e-books and streaming video to patrons. both of these platforms support integration with voice assistants like amazon alexa through “skills” (the alexa equivalent of an application). patrons are able to download a skill for their alexa-enabled device to access digital content through these platforms, which are often linked to their library accounts (e.g., “alexa, ask hoopla how many borrows i have remaining.”).26 around 14 percent of the respondents reported that they were aware that overdrive and hoopla integrated with voice assistants, and 3 percent of all respondents reported that their libraries actively inform patrons about amazon alexa skills for these services.
when patrons begin connecting their personal voice assistant devices with third-party digital content providers that are also linked to their library accounts, different terms of service agreements and privacy policies overlap, creating a complex data rights landscape. almost a third of our respondents (29 percent) replied that they were aware that amazon has different privacy policies from overdrive and hoopla, with 22 percent responding that they were unaware of these differences (the rest were unsure or did not respond). only 15 percent of respondents reported that their libraries provided patrons with information about overdrive and hoopla’s privacy policies. one library worker offered that, “when helping a patron or informing them that we use overdrive they are encouraged to read all the privacy info.” however, no libraries in this study reported sharing information about amazon’s privacy policies with patrons, which might also apply to linked accounts.

lastly, 34 percent of the library workers indicated that they were familiar with the ala guidelines on privacy that pertain to third-party vendors, and 16 percent reported that their library actively refers to these guidelines in information materials for patrons. for instance, “we have a privacy policy on our website, which was based on the ala library privacy checklists. it states that our vendors have different privacy policies than we do.” these responses indicate that while some library workers are aware of the privacy implications of the integration of voice assistants into third-party digital content platforms, there are opportunities to increase staff and patron awareness about the intersecting privacy policies and terms of service in this landscape.

are libraries educating library patrons about voice assistant technologies as a part of services and programming?
we were curious if the libraries who used voice assistants in their services were taking any particular measures to inform patrons about the privacy implications of these technologies, or offering any other kinds of specific privacy “best practices” guides for use (e.g., how to erase your data records, adjust settings, etc.). the two libraries who reported circulating voice assistants indicated that they did not include any privacy information with voice assistant devices at checkout. similarly, we asked library workers about the kinds of technology classes or programming that their libraries were offering, since these might be sites where there is potential to educate or provide information about privacy issues raised by smart technologies like voice assistants. we found that 49 (56 percent) of the libraries represented in the survey (37 public, 12 academic) offer technology courses for the public. of these, 39 libraries (24 public, 15 academic) responded “yes” to our question asking if aspects of “data privacy or data literacy” are included as part of these classes or other related programming.27 only 3 libraries (2 public, 1 academic) were able to report that their library offers data literacy education that specifically addresses voice assistant technologies. library workers provided many examples of the kinds of data literacy information that their libraries typically provided in technology classes and programming. twelve respondents said that their libraries offered some sort of broad data literacy class and several cited classes specifically targeted at personal data practices and security. 
topics taught in these classes included: understanding your personal risk profile; password managers and security; how to understand and protect your digital footprint; and sessions on facebook and google where staff “walk users through how to find their information and make decisions about it.” several respondents identified information literacy topics in conjunction with data literacy, noting that their library teaches classes about identifying “fake news,” phishing scams, and evaluating the authority of websites and website content. none of the responses specifically named issues around privacy or data capture by voice assistants or other smart technologies as topics covered in library technology classes. several library workers noted that technology classes were offered at their libraries through one-on-one sessions, geared to individually address what patrons had questions about. based on these responses it is unclear how in-depth, or if at all, these one-on-one sessions might go into informing patrons about privacy best practices and risks when using smart technologies like voice assistants.

what kinds of privacy concerns do library workers have about the use of smart voice assistant technologies in their library services and programming?

just over half of the library workers surveyed (52 percent) answered “yes” to the question: “do you have any privacy concerns about the use of amazon echo, google home, or apple siri devices in the library?” of the other responses, 16 percent reported “no” concerns and 15 percent answered “i’m not sure.” those who answered yes were asked to further describe their privacy concerns, resulting in robust descriptions that demonstrated a savvy understanding of the voice assistant data landscape.
we characterized library workers’ concerns about voice assistants in the library into five major categories: data access and use; surveillance and “always on” features; procedure and operations; legal issues; and professional responsibility.

data access and use

by far the most prevalent privacy concerns focused on questions about who has access to data collected by smart voice assistants and how this data might be used (or misused) by different parties. library workers were the most concerned about the reach of access that the three major voice assistant parent companies (amazon, google, apple) have to patron data, closely accompanied by concerns with the selling of this data to third-party vendors:

“there are known risks in the logging practices of the assistant vendor (amazon, google, apple). there are potentially greater, and unknown, risks of privacy and data security problems with third-party integrators that libraries are working with to create the alexa skills, google home actions, etc.”

“these devices are tied to user accounts for vendors that sell goods and services. there are opportunities to make purchases that we do not want to present to our patrons.”

“as currently constituted, most of these devices' privacy policies require owners to allow voice recordings to be sent to cloud services for transcription and, in some instances, for storage and for re-listening by staff or 3rd-party contractors.”

another library worker added that they were concerned about the willingness of these parent companies to “share personal, private data with law enforcement agencies.” this observation underscores what is potentially at stake in terms of patron vulnerability in this data environment.
several concerns focused on patrons “unwittingly leaving their sensitive information on devices that we might use.”

“being that anything we use in the library, or check out to our patrons is shared, i have privacy concerns for what data and recordings will be collected by the services while they are either in use in the library or while they are in the patron’s possession.”

while some of these concerns were tied back to how parent companies might use this data, others were equally wary of the potentials for “storing information that can be accessed between patron uses” or by library staff:

“as with computers in the info commons, i would be concerned whether user information is scrubbed after each user. or would one user's information persist and become available to a subsequent user.”

“i would not want to be able to identify the patron who used the device. in this case, we cannot. we circulate ipads as assistive devices. as soon as the item is returned, the checkout record is purged.”

lastly, library workers expressed cybersecurity concerns about voice assistants, wondering about how voice assistants might be hacked or otherwise manipulated by malicious actors:

“the library is public space, these devices are not known for being secure. a device would have to be registered to some university account, but would be prone to algorithmic manipulation from public voice inputs if that makes sense?”

“just the idea that they (everything!) is [sic] hackable, and hostage-able, and so on, creeps me out personally, but also in terms of privacy and confidentiality of users of that technology.”

“alexa and google home can be hacked to phish passwords and other sensitive information.”

taken together, these concerns gesture to the opacity of the data environment in terms of who might have access to data (companies, law enforcement, patrons, library staff, hackers) and how this data might be used (advertising and marketing, exposure of personal patron information, state surveillance, and exploitation).

surveillance and “always on” features

the second major area of concern that library workers expressed was about the surveillance potentials of voice assistants via their passive listening features. in order for voice assistants to respond to their various wake words, they need to be “always on” and listening. while there is a difference between always listening and always recording (which recent studies suggest is not happening), library workers remained wary about devices “constantly monitoring staff or patrons.”28 these concerns have some obvious overlap with the data access and use theme, but differ in that they are specifically concerned with the act of surveilling (monitoring) patron activities, use patterns, and personal information. three respondents in this category couched their data privacy concerns in terms of ability to exert some control over their data (e.g., deleting data), or the ability to grant permission/consent to be recorded:

“these devices are intended for use in the home. they offer some protections for users with management access. for example, the google assistant allows review and deletion of recording history. for users without such access there are no such protections.”

“...they [voice assistants] are intended to for use inside a single household, learning the voices, habits and preferences of those household members. i feel that this kind of personal information should be the individual's choice to make and not the library's [sic].”

“my concern is that my personal data is being collected without my permission.
the same concern applies to patrons of the library. having them present and turned on captures people's conversations and they may not be aware that is happening.”

as these comments suggest, passive listening in public spaces opens up the potential for surveilling patrons and library staff who are not intending to interact with the devices, or who have no knowledge that the device is present. in other words, while some patrons might opt to use a voice assistant to ask a question or look a book up in a library catalog, patrons (and library staff) who are merely talking in the vicinity of these devices may still be listened to and recorded by these devices without their knowledge or consent. this group of privacy concerns conveys a lack of transparency around data collection and surveillance in voice assistants, pointing to larger power differentials between parent companies and users in terms of control over data collection and management.

procedure and operations

library workers discussed the operational challenges that voice assistants present to staff in terms of establishing routine procedures that ensure patron privacy and confidentiality in between patron use:

“how do we make sure no residual information remains in the device before someone else uses it or that if used during a program 'private' information isn't being broadcast to other devices in the area?”

another library worker alluded to some of the operational considerations that already accompany library use and lending of personal computing devices, “clearing data, purchasing, maintaining, we already have ipads and other devices and their management with our staff has been a challenge.” this comment points to the extra staff labor that underpins technology services, which is often not considered as a part of infrastructure for offering these services.
similarly, there is a sense from these comments that establishing procedures to maintain privacy and confidentiality is critical for voice assistants. failure to erase or secure patron data could lead to inadvertently exposing sensitive or personally identifiable information (pii).

“patron's [sic] may inadvertently be saving their information or staff may forget to delete information causing the previous patrons sensitive information to remain for the next patron to discover.”

while google home and amazon alexa devices do provide the ability for individual recordings to be deleted by the account holder, in the case of shared library use of voice assistants, it would likely be incumbent on a library staff member to access and delete recordings. this raises ethical, legal, and operational questions for library staff required to manage any patron data collected by voice assistants. in any case, procedural concerns are a reminder that library staff have an active role to take in ensuring patron privacy.

legal issues

library workers in this study identified three legal issues posed by voice assistants in the library. the first legal issue raised was the potential for violation of the family educational rights and privacy act (ferpa), the federal law that protects the privacy of student education records, due to the collection of pii by voice assistants. library workers in many academic settings are required to maintain compliance with ferpa. one of the respondents was concerned that by using voice assistants in their services, libraries would be putting themselves in a position to potentially violate this law.
a second set of concerns focused on questions about the liability of the library (or individual library workers) if a patron’s pii is misused by technology companies or the third-party vendors who have access to user data:

“i have great concerns regarding the use of this technology in a library setting since it might expose the library to potential liability if, more likely when patron data is misused by the technology providers.”

related to this concern, another library worker asked, “who owns the info?” questions about rights and ownership of personal data by technology companies, itself a fraught and opaque legal area, require more ethical and legal probing as libraries become intermediaries to patron use of voice assistants. lastly, one library worker cited concerns about librarians’ ability to uphold first amendment rights with voice assistants.

“we take our mandated role to uphold first amendment rights and patron privacy very seriously. there are too many issues with the way these for-profit companies collect, store and potentially use information. we see no benefits of service gained that offset these concerns. we are also concerned about the way owners of these products use their wealth to leverage political influence.”

this comment identifies privacy as a necessary condition for facilitating free speech, contrasting this with a sketch of the political and economic motives underlying voice assistant development. the concerns raised by these library workers point to the complexity of managing patron data in the context of a variety of existing legal frameworks.

professional responsibility

three respondents explicitly placed privacy concerns in the context of their professional responsibility as library workers to “protect” patrons and patron privacy.
a fourth respondent voiced a twin concern about “the library's inability to protect privacy and patron information” (emphasis added). beyond descriptions of protecting patrons, these library workers framed privacy as a professional value. comments such as, “we take our mandated role to uphold first amendment rights and patron privacy very seriously,” emphasize privacy as a professional charge. these kinds of comments tacitly draw on lis professional core values and ethics statements to position responsible professional practice as the action of upholding privacy. as a result, professional identity is discursively constructed by these library workers as a function of valuing privacy. the following comment, particularly, draws an identity-based line between “us” (library professionals) and “them” (technology companies) that is based on divergent values surrounding privacy:

“since one of the main concerns we (should) have as library professionals is patron privacy; ‘teaming up’ with technology providers who do not have that level of concern is problematic at best.”

the assertion that library core values may be in conflict with the technology providers that are designing voice assistants is very astute, and important for libraries to consider when weighing the decision to experiment with these (and other) emerging smart technologies.

discussion: key considerations for library professionals

our research suggests that library use of voice assistants poses many as-of-yet unresolved privacy issues for library staff and patrons alike. though voice assistant use is still fairly nascent across public and academic libraries, our study confirms that these tools are already being adopted by some libraries. the adoption of these, and other, smart technologies, is likely to keep trending in library services across institution types, paralleling market trends for personal adoption of voice assistants.
many library workers in our study expressed astute concerns about voice assistants, raising important questions about how patron data was collected, managed, and used across the data lifecycle of these technologies. this is a critical moment, then, for the library profession to take stock of questions of privacy surrounding voice assistants, and an opportunity to set a broader professional agenda for data-privacy that encompasses the complexities of smart technology use in library services. in this spirit, we have identified several main areas of concern that emerged from our study, posited as key considerations about voice assistants for library professionals to grapple with.

circulation procedures

for libraries that are lending, or considering lending, voice assistant-enabled technologies, clear lending rules are needed for patrons that set guidelines for disconnecting their personal amazon, apple, or google accounts before returning the device. likewise, it is important to develop procedures for library staff to anticipate instances when patrons forget to disconnect their personal accounts. library workers cannot, and should not, be responsible for disconnecting personal accounts as a protective measure for both staff and patrons, since doing so asks library workers to access and take responsibility for personal patron data, including pii. one suggestion might be to require devices to be restored to factory settings, which could be verified by a library staff member at time of device return. libraries might also consider including privacy best practices with these devices that outline known privacy risks and provide information about how to adjust settings to limit data sharing or delete records in personal accounts where applicable (e.g., amazon).
third-party digital content platforms

the integration of voice assistants in third-party digital content platforms licensed by libraries is becoming more common, pointing to the complexity of upholding patron data privacy throughout these layered and linked services. this issue speaks to the difficulties navigating overlapping privacy statements and terms of service agreements, which is not unique to voice assistants but does indicate the need for more data protections and consumer-oriented information policies. ala already does advocacy work on these issues and provides many helpful guidelines, such as the library privacy guidelines for vendors (http://www.ala.org/advocacy/privacy/guidelines/vendors). still, the data environment is very much characterized by the unequal power differential between technology companies and users. we are in dire need of more robust information policy frameworks that are predicated on transparency, strict parameters for data collection and use, corporate accountability, and user control and agency. a promising example of this is the general data protection regulation (gdpr) implemented in the european union in 2018. something similar is needed in the us to regulate corporate data-sharing practices and give users more control over their data. this would be beneficial across the board for the public, as well as to library patrons using their personal voice assistant devices to access library resources.

education opportunities for expanding digital literacy

library workers in our study reported a range of technology education and digital literacy programming initiatives in their libraries, though none that specifically addressed voice assistants. this suggests that library technology programming might not be targeting the kinds of specific privacy concerns posed by smart technologies like voice assistants.
as smart technologies like voice assistants become more common for household/personal use, it would make sense to expand library programming initiatives to include informational sessions that incorporate data privacy considerations for smart technologies in addition to skills-driven sessions. additionally, some survey responses indicated that library workers may have some knowledge gaps or a lack of concern about voice assistant use. this might point to a need for expanded education, training, and professional development around data privacy issues and emerging technologies for library workers. there has already been a large push in the field to expand digital literacy, defined by ala as “the ability to use information and communication technologies to find, evaluate, create, and communicate information, requiring both cognitive and technical skills.”29 however, this definition of digital literacy falls short of considering the role of assessing data collection, storage, and use as a core part of digital knowledge. expanding digital literacy training, for both staff and patrons, to include awareness of the data ecosystems and privacy concerns that undergird smart technologies is a must for responsive library services.

surveilling patrons and staff

voice assistants placed in public service areas, in the library stacks, and in public gathering areas within the library raise the ethical issue of recording patrons (and staff) who either do not wish to be recorded, or do not even know they may be recorded. in the case of library staff, this poses a labor issue where staff may be asked to work in areas where devices may be listening to their interactions for the duration of their shift.
for patrons, this could compromise privacy in reference transactions and in other information seeking activities, as well as capturing other personal interactions that take place in the library setting. it is critical that libraries are transparent about using voice assistant technologies, upfront about the potential privacy harms of these technologies, and abide by “opt-in” rather than “opt-out” frameworks. library workers should consider treating voice assistant records in the same way they have historically treated circulation records, opting to either delete these records or not collect them (meaning, not use voice assistants) at all. unlike circulation records, however, library workers have far less control over the data captured by voice assistants. this data is stored in the cloud on privately owned servers that remain outside of library control and oversight. given the incredibly low bar for federal access to information under the usa patriot act, actively facilitating the collection of patron and staff interactions, particularly without informed consent, should give librarians pause.

opt to not adopt

in light of the issues raised in this study, library workers need to seriously weigh whether the benefits of using voice assistants in libraries at this point in time outweigh the vast privacy concerns that we have outlined here. as it stands, these technologies are not currently filling a gap in library services that cannot be otherwise met by more traditional service models that carry fewer potential harms for our patron communities. importantly, not all patrons are equally vulnerable to harm or exploitation in these data environments.
for instance, there is a wealth of research that demonstrates the multitude of ways that black, indigenous, people of color, lgbtq+, women, and low-income communities are subjected to higher levels of surveillance and data profiling that results in harassment, discrimination, economic penalties, and legal persecution.30 as the current national political landscape is aflame in protests against police violence and anti-black racism, it is important to identify surveillance technologies as policing technologies. libraries need to consider that these tools, as extensions of policing data networks, may directly endanger, particularly, black, latinx, and indigenous people who are already subjected to over-policing. in this sense, concerns about patron data privacy are high-stakes and are deeply linked to the professional core value of social responsibility.31 libraries should consider not using voice assistants until key data privacy concerns are addressed, more robust data protections are in place at a federal level, and the blanket authority for federal agencies and law enforcement to compel user data is revoked. this is not a technophobic stance. on the contrary, we are suggesting that library workers could serve an important role as privacy advocates, which includes critically evaluating the role of emerging technologies in their communities on behalf of public interest. a key part of this must include the library profession taking responsibility for the use of surveillance technologies in their institutions since these technologies are deeply implicated in the policing of disenfranchised communities by state and federal authorities.

conclusion

we view this study as a modest starting point for mapping some of the many privacy issues associated with voice assistant use in library services and programming and hope it points a way forward for future research.
future research might address specific case studies of voice assistant use in libraries, data mapping of patron data through third-party library services, use and privacy issues across different institution types, patron digital literacies with voice assistants, and library policies for smart technologies more generally. plural and diverse vantage points are needed to understand the potential impacts of these technologies across different community types. such research is critical for developing best practices, guidelines, policies, and education opportunities for voice assistant use (and other smart technologies) that prioritize patron privacy and confidentiality. the use of voice assistants in libraries raises questions about the responsibility of libraries and librarians to actively engage patron data privacy concerns when considering integrating these technologies into services and programming. indeed, we encourage library workers to consider informed non-adoption of these technologies as a socially responsible professional stance until the key issues we have outlined are addressed. while it is, of course, important for library workers to remain current and innovative in their services, it is also paramount that patron privacy (as a function of safety) stays at the forefront of library services. in other words, it is the responsibility of library workers to anticipate potential privacy issues associated with emerging technologies, rather than treating privacy as a secondary concern to technology adoption. there are tremendous opportunities for library workers to lead the data privacy charge, in collaboration with community stakeholders, in pursuit of privacy-centered library services that are accountable to community members, particularly those who are most likely to be harmed by these technologies.

appendix a: survey instrument

1. by selecting the “i agree” button below, i hereby certify: that i am 19 years old or older; that i have read and understand the above consent form; and that this action indicates my willingness to voluntarily take part in the study.
   a. i agree to participate in the research study described above.
   b. i do not agree to participate in the research study described above.
2. do you work in a library setting?
   a. yes
   b. no
3. what kind of library do you work at?
   a. public
   b. academic
   c. school
   d. other, please specify [fill in the blank]
4. what is the size of your library’s service population?
   a. 2,500 or less
   b. 2,500-9,999
   c. 10,000-25,000
   d. 25,000+
   e. i’m not sure.
5. what state is your library located in? [fill in the blank]
6. does your library have amazon echo devices, google home devices, or apple siri devices available for use by patrons?
   a. yes
   b. no
   c. i’m not sure.
7. which of the following digital assistant devices does your library have available for patrons to use?
   a. amazon echo devices
   b. apple siri devices
   c. google home devices
   d. other products, please specify: [fill in the blank]
8. please provide some examples of how your library patrons use the library's digital assistant technologies. [short answer]
9. could you describe where these digital assistant technologies are located in the library? [short answer]
10. does your library use amazon echo devices, google home devices or apple’s siri devices in any of the following kinds of programming? (select all that apply)
   a. tech “petting zoos”
   b. trivia
   c. homework help
   d. technology classes
   e. makerspaces
   f. not listed, please specify: [fill in description]
   g. none of the above
11. for the programs you selected, briefly explain how the devices are integrated into programming. [short answer]
12. does your library circulate amazon echo, google home, and apple siri devices to the public for checkout?
   a. yes
   b. no
   c. i’m not sure.
13. which devices do you circulate?
   a. amazon echo devices
   b. apple siri devices
   c. google home devices
   d. other products, please specify [fill in the blank]
14. do you provide any privacy information and/or best practice information with the device at checkout?
   a. yes
   b. no
   c. i’m not sure
15. if so, briefly explain what kind of privacy or best practices information you include. examples of content covered in this information would be helpful. [short answer]
16. do you have any privacy concerns about the use of amazon echo, google home, or apple siri devices in the library?
   a. yes
   b. no
   c. i’m not sure
17. could you describe these privacy concerns? [short answer]
18. does your library offer any sort of technology courses to the public?
   a. yes
   b. no
   c. i’m not sure
19. does your library teach data privacy or data literacy as part of the library's programming?
   a. yes
   b. no
   c. i’m not sure
20. does your library offer any data literacy education in programming that specifically addresses digital assistants?
   a. yes
   b. no
   c. i’m not sure
21. what kinds of data literacy information is provided in these courses taught at your library? please provide some examples: [short answer]
22. does your library use any of the following services? select all that apply:
   a. overdrive/libby
   b. hoopla
   c. none of the above
23. are you aware that both overdrive and hoopla have amazon echo application integration (called "skills")?
   a. yes
   b. no
   c. i’m not sure
24. does your library inform patrons about amazon echo skills on overdrive and/or hoopla?
   a. yes
   b. no
   c. i’m not sure
25. are you aware that amazon's privacy policies differ from those of overdrive and hoopla?
   a. yes
   b. no
   c. i’m not sure
26. does your library provide any information to patrons about overdrive and hoopla's privacy policies?
   a. yes
   b. no
   c. i’m not sure
27. does your library provide any information to patrons about amazon's privacy policies?
   a. yes
   b. no
   c. i’m not sure
28. please provide a brief description of the information that you are providing to patrons on this subject, including where this information is located for patron access. [short answer]
29. are you aware of the guidelines that the american library association (ala) provides on privacy as it pertains to third party electronic vendors?
   a. yes
   b. no
30. does your library use or refer to these privacy guidelines in any informational materials for patrons?
   a. yes
   b. no
   c. i’m not sure
31. please describe these informational materials, including how and where they are distributed to patrons: [short answer]

endnotes

1 benjamin herald, “teacher’s aide or surveillance nightmare? alexa hits the classroom,” digital education, education week, june 26, 2018, http://blogs.edweek.org/edweek/digitaleducation/2018/06/alexa_in_the_classroom_teachers_surveillance.html?cmp=soc-shr-fb.
2 carrie smith, “your library needs to speak to you,” american libraries (june 3, 2019), https://americanlibrariesmagazine.org/2019/06/03/voice-assistants-your-library-needs-to-speak-to-you/.
3 nicole hennig, siri, alexa, and other digital assistants: the librarian’s quick guide (santa barbara, ca: libraries unlimited, 2018) 33–8.
4 adapted from: brenda laurel, “interface agents: metaphors with character,” human values and the design of computer technology (1997): 207–19, cambridge university press.
5 emily clark, “alexa, are you listening? how people use voice assistants,” https://clutch.co/app-developers/resources/alexa-listening-how-people-use-voice-assistants.
6 clark, “alexa, are you listening? how people use voice assistants.”
7 clark, “alexa, are you listening?
how people use voice assistants.”
8 “voice control,” american library association, http://www.ala.org/tools/future/trends/voicecontrol.
9 shannon liao, “google home will play music and sound effects when you read disney storybooks,” https://www.theverge.com/2018/10/29/18037466/google-home-disney-music-moana-incredibles-coco-storytime; hennig, siri, alexa, and other digital assistants, 35; susan allen and avneet sarang, “serving patrons using voice assistants at worthington,” online searcher 42, no. 6 (november-december 2018): 49–52.
10 smith, “your library needs to speak to you.”
11 smith, “your library needs to speak to you.”
12 allen and sarang, “serving patrons using voice assistants at worthington.”
13 king county library system, “voice assistants, connecting you to your library,” https://kcls.org/voice/.
14 “core values of librarianship,” american library association, http://www.ala.org/advocacy/intfreedom/corevalues.
15 miriam e. sweeney, “digital assistants,” in uncertain archives: critical keywords for big data, ed. nanna bonde thylstrup, daniela agostinho, annie ring, catherine d’ignazio, and kristin veel (baltimore, md: mit press, 2021), 151–60.
16 anthony cuthbertson, “amazon admits employees listen to audio from echo devices,” https://www.independent.co.uk/life-style/gadgets-and-tech/news/amazon-alexa-echo-listening-spy-security-a8865056.html.
17 matt day et al., “amazon workers are listening to what you tell alexa,” https://www.bloomberg.com/news/articles/2019-04-10/is-anyone-listening-to-you-on-alexa-a-global-team-reviews-audio.
18 daniel j. dubois et al., “when speakers are all ears: understanding when smart speakers mistakenly record conversations,” mon(iot)r, february 14, 2020, https://moniotrlab.ccis.neu.edu/smart-speakers-study/.
19 karen hao, “amazon is the invisible backbone of ice’s immigration crackdown,” mit technology review, october 16, 2019, https://www.technologyreview.com/s/612335/amazon-is-the-invisible-backbone-behind-ices-immigration-crackdown/.
20 zack whittaker, “echo is listening, but amazon’s not talking,” zdnet, january 16, 2018, https://www.zdnet.com/article/amazon-the-least-transparent-tech-company/.
21 american library association, “library privacy guidelines for vendors,” http://www.ala.org/advocacy/privacy/guidelines/vendors.
22 this research protocol (19-08-2671) was approved in october 2019 by the university of alabama’s institutional review board (irb).
23 note, participants were not required to answer every question, so some questions have fewer than 86 total responses due to participants electing to not respond. also, even though we were targeting public and academic libraries, we did receive a response from someone identifying their institution as a school library and decided to include it in the results.
24 we did not receive responses from libraries in arizona, arkansas, connecticut, delaware, pennsylvania, vermont, virginia, or wyoming.
25 “technology petting zoos” are areas where patrons can experiment or try out technologies and gadgets.
26 hoopla, “alexa, meet hoopla,” july 16, 2018, http://hub.hoopladigital.com/whats-new/2018/7/alexa-meet-hoopla.
27 we purposely couched questions about “data literacy” and “data privacy” broadly in the survey to allow for a range of interpretations by respondents in an attempt to capture the range of information that might be taught under this umbrella.
28 daniel j. dubois et al., “when speakers are all ears: understanding when smart speakers mistakenly record conversations.”
29 american library association, “digital literacy,” https://literacy.ala.org/digital-literacy/.
30 examples of critical research in this area include: toby beauchamp, going stealth: transgender politics and u.s.
surveillance practices (durham, london: duke university press, 2019); virginia eubanks, automating inequality: how high-tech tools profile, police, and punish the poor (st. martin’s press, 2018); safiya u. noble, algorithms of oppression: how search engines reinforce racism (new york: nyu press, 2018).
31 “core values of librarianship,” american library association.

technical communications

announcements

panel discussion on "government publications in machine-readable form"

this meeting will be held on july 10 from
8:30 to 10:30 p.m. as a part of the american library association's 1974 new york conference. the meeting is cosponsored by the government documents round table's (godort) machine-readable data file committee, the federal librarians round table (flirt), the rasd information retrieval committee, and the rasd/rtsd/asla public documents committee.

the moderator is gretchen dewitt of columbus public library and the panelists are peter watson of ucla, mary pensyl of mit, judith rowe of princeton, and billie salter of yale. mr. watson will discuss the general issues concerning the acquisition and use of bibliographic data files and provide a brief description of some of the files now publicly available; miss pensyl will describe the workings of the project now underway to make these files available to mit users. mrs. rowe will discuss the ways in which government-produced statistical files supplement the related printed reports and will indicate some of the types and sources of files now being released; miss salter will discuss a program for integrating these and other research files into yale's social science reference service. representatives of several federal agencies will display materials describing and documenting both bibliographic and statistical data files.

the purpose of the program is to acquaint reference librarians, particularly those now handling printed documents, with the uses of both types of files, the advantages and disadvantages of these reference tools, and the techniques and policy changes necessary for their use in a library environment. the recent release of the draft proposal produced by the national commission on libraries and information science makes more timely than ever an open discussion of the place of bibliographic and numeric data files in a reference collection. all librarians must be acquainted with these growing resources in order to continue to provide full service to their patrons.
for further information, contact judith rowe, computer center, princeton university, 87 prospect ave., princeton, nj 08540.

journal of library automation vol. 7/2 june 1974

ninth annual educational media and technology conference to be hosted by university of wisconsin-stout, july 22-24, 1974

aect past president dr. jerry kemp, coordinator of instructional development services for san jose state university (california), and film consultant ralph j. amelio, media coordinator and english instructor at willowbrook high school, villa park, illinois, will headline the university of wisconsin-stout's 9th annual educational media and technology conference to be held in menomonie, wisconsin, on july 22-24, 1974. "educational technology: can we realize its potential?" will be the subject of kemp's presentation on monday evening, while amelio, speaking on tuesday, july 23, will challenge participants with the subject "visual literacy: what can you do?"

seven concurrent workshops will be held on monday afternoon: library automation; sound for visuals; making the timesharing computer work for you; new developments in photography; what's new in graphics; selecting and evaluating educational media; and instructional development: how to make it work! individuals leading the three-hour workshops will include: alfred baker, vice-president of science press; john lord, technical service manager for the dukane corporation; william daehling, weber state college, ogden, utah; and several media specialists from learning resources, university of wisconsin-stout.

about fifty exhibitors will show and demonstrate both hardware and software during the conference. six case studies will be given of exemplary media programs at the public school, vocational-technical, and college level. further information may be obtained by contacting dr. david p. bernard, dean of learning resources, university of wisconsin-stout, menomonie, wi 54751.

report of recon project published

the library of congress has published in recon pilot project (vii, 49p.) the final report of a project sponsored by lc, the council on library resources, inc., and the u.s. office of education to determine the problems associated with centralized conversion of retrospective catalog records and distribution of these records from a central source.

in the marc pilot project, begun in november 1966, the library of congress distributed machine-readable catalog records for english-language monographs, and the success of that project led to the implementation in march 1969 of the marc distribution service, in which over fifty subscribers have by now received more than 300,000 marc records representing the current english-language monograph cataloging at the library of congress. as coverage is extended to catalog records for foreign-language monographs and for other forms of material, libraries will be able to obtain machine records for a large number of their current titles.

more research was needed, however, on the problems of obtaining machine-readable data for retrospective cataloging, and the council on library resources made it possible for lc to engage in november 1968 a task force to study the feasibility of converting retrospective catalog records. the final report of the recon (for retrospective conversion) working task force was published in june 1969. one of the report's recommendations was that a pilot project test various conversion techniques, ideally covering the highest priority materials, english-language monograph records from 1960-68; and with funds from the sponsoring agencies lc initiated a two-year project in august 1969. the present report covers five major areas examined in that period:

1. testing of techniques postulated in the recon report in an operational environment by converting english-language monographs cataloged in 1968 and 1969 but not included in the marc distribution service.
2. development of format recognition, a computer program which can process unedited catalog records and supply all the necessary content designators required for the full marc record.
3. analysis of techniques for the conversion of older english-language materials and titles in foreign languages using the roman alphabet.
4. monitoring the state-of-the-art of input devices that would facilitate conversion of a large data base.
5. a study of microfilming techniques and their associated costs.

recon pilot project is available for $1.50 from the superintendent of documents, u.s. government printing office, washington, dc 20402. stock no. 300000061.

library of congress issues recon working task force report

national aspects of creating and using marc/recon records (v, 48p.) reports on studies conducted at the library of congress by the recon working task force under the chairmanship of henriette d. avram. they were made concurrently with a pilot project by the library to test the feasibility of the plan outlined in the task force's first report entitled conversion of retrospective records to machine-readable form (library of congress, 1969) and in recon pilot project (library of congress, 1972). both the pilot project and the new studies received financial support from the council on library resources, inc., and the u.s. office of education.

the present volume describes four investigations: (1) the feasibility of determining a level or subset of the established marc content designators (tags, indicators, and subfield codes) that would still allow a library using it to be part of a future national network; (2) the practicality of the library of congress using other machine-readable data bases to build a national bibliographic store; (3) implications of a national union catalog in machine-readable form; and (4) alternative strategies for undertaking a large-scale conversion project.
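
the first investigation turns on what a "level or subset" of content designators could look like in practice. as a rough illustration only, the python sketch below models a record as fields that each carry a tag, indicators, and coded subfields, then filters the record down to a reduced level. the `Field` class, the `REDUCED_LEVEL` table, the tags and subfield codes chosen, and the sample values are all assumptions made for this example; the levels actually studied in the report are not reproduced here.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass
class Field:
    tag: str                          # three-character field tag, e.g. "245"
    indicators: str                   # two indicator characters
    subfields: List[Tuple[str, str]]  # (subfield code, value) pairs


# hypothetical reduced level: tag -> subfield codes that are retained
REDUCED_LEVEL: Dict[str, Set[str]] = {
    "100": {"a"},  # main entry, personal name
    "245": {"a"},  # title
    "260": {"c"},  # date of publication
}


def reduce_record(fields: List[Field], level: Dict[str, Set[str]]) -> List[Field]:
    """Keep only the tags and subfield codes named in `level`."""
    reduced = []
    for f in fields:
        allowed = level.get(f.tag)
        if allowed is None:
            continue  # the whole field lies outside the reduced level
        kept = [(code, value) for code, value in f.subfields if code in allowed]
        if kept:
            reduced.append(Field(f.tag, f.indicators, kept))
    return reduced


# made-up record used only to exercise the filter
full_record = [
    Field("100", "1 ", [("a", "doe, jane")]),
    Field("245", "10", [("a", "an example title"), ("c", "jane doe")]),
    Field("260", "  ", [("a", "new york"), ("c", "1970")]),
    Field("650", " 0", [("a", "library automation")]),
]

reduced = reduce_record(full_record, REDUCED_LEVEL)
for f in reduced:
    print(f.tag, f.indicators, f.subfields)
```

a table-driven filter of this kind keeps the question the task force posed visible: the record structure stays the same, and only the set of designators a library is expected to supply changes.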
the appendices include an explanation of the problems of achieving a cooperatively produced bibliographic data base, a description of the characteristics of the present national union catalog, and an analysis of library of congress card orders for one year. although the findings and recommendations of this report are less optimistic than those of the original recon study, they reaffirm the need for coordinated activity in the conversion of retrospective catalog records and suggest ways in which a large-scale project might be undertaken. the report provides a basis for realistic planning in a critical area of library automation.

national aspects of creating and using marc/recon records is available for $2.75 from the superintendent of documents, u.s. government printing office, washington, dc 20402. stock no. 300000062.

isad official activities

tesla information

editor's note: use of the following guidelines and forms is described in the article by john kountz in this issue of jola. the tesla reactor ballot will also appear in subsequent issues of technical communications for reader use, and the tesla standards scoreboard will be presented as cumulated results warrant its publication. to use, photocopy or otherwise duplicate the forms presented in jola-tc, fill out these copies, and mail them to the tesla chairman, mr. john c. kountz, associate for library automation, office of the chancellor, the california state university and colleges, 5670 wilshire blvd., suite 900, los angeles, ca 90036.

initiative standard proposal outline

the following outline and forms are designed to facilitate review by both the isad committee on technical standards for library automation (tesla) and the membership of initiative standards requirements and to expedite the handling of the initiative standard proposal through the procedure. since the outline will be used for the review process, it is to be followed explicitly.
where an initiative standard requirement does not require the use of a specific outline entry, the entry heading is to be used followed by the words "not applicable" (e.g., where no standards exist which relate to the proposal, this is indicated by: vi. existing standards. not applicable). note that the parenthetical statements following most of the outline entry descriptions relate to the ansi standards proposal section headings to facilitate the translation from this outline to the ansi format.

all initiative standards proposals are to be typed, double spaced on 8½" x 11" white paper (typing on one side only). each page is to be numbered consecutively in the upper right-hand corner. the initiator's last name followed by the key word from the title is to appear one line below each page number.

tesla reactor ballot
reactor information
name
title
organization
address
city, state, zip
telephone
identification number for standard requirement
for / against
reason for position: (use additional pages if required)

tesla standards scoreboard
column headings: title/i.d. number; receipt date; screen date; division date; rej/acpt date; publish date; tally date; representative date; target

i. title of initiative standard proposal (title).
ii. initiator information (foreword).
a. name
b. title
c. organization
d. address
e. city, state, zip
f. telephone: area code, number, extension
iii. technical area. describe the area of library technology as understood by initiator. be as precise as possible since in large measure the information given here will help determine which ala official representative might best handle this proposal once it has been reviewed and which ala organizational component might best be engaged in the review process.
iv. purpose. state the purpose of standard proposal (scope and qualifications).
v. description. briefly describe the standard proposal (specification of the standard).
vi. relationship of other standards.
if existing standards have been identified which relate to, or are felt to influence, this standard proposal, cite them here (expository remarks).
vii. background. describe the research or historical review performed relating to this standard proposal (if applicable, provide a bibliography) and your findings (justification).
viii. specifications. specify the standard proposal using record layouts, mechanical drawings, and such related documentation aids as required in addition to text exposition where applicable (specification of the standard).

research and development

system development corporation awarded national science foundation grant to study interactive searching of large literature data bases

santa monica, california - the national science foundation has awarded system development corporation $98,500 for a study of man-machine system communication in on-line retrieval systems. the study will focus on interactive searching of very large literature data bases, which has become a major area of interest and activity in the field of information science. at least seven major systems of national or international scope are in operation within the federal government and private industry, and more systems are on the drawing boards or in experimental operation.

the principal investigator for the project will be dr. carlos cuadra, manager of sdc's education and library systems department. the project manager, who will be responsible for the day-to-day operation of the fifteen-month effort, is judy wanger, an information systems analyst and project leader with extensive experience in the establishment and use of interactive bibliographic retrieval services. ms. wanger is currently responsible for user training and customer support on sdc's on-line information service.

the study will use questionnaire and interview techniques to collect data related to: (1) the impact of on-line retrieval usage on the terminal user; (2) the impact of on-line service on the sponsoring institution; and (3) the impact of on-line service on the information-utilization habits of the information consumer. attention will also be given to reliability problems in the transmission chain from the user to the computer and back. the major elements in this chain include: the user; the terminal; the telephone instrument; local telephone lines and switchboards; long-haul communications; the communications-computer interface hardware; the computer itself; and various programs in the computer, including the retrieval program.

reports on regional projects and activities

california state university and colleges system union list system

the library systems project of the california state university and colleges has recently completed a production union list system. this system, comprised of eight processing programs to be run in a very modest environment (currently a cdc 3300), is written in ansi cobol and is fully documented. included in the documentation package are user worksheets for bibliographic and holding data, copies of all reports, file layouts, program descriptions, etc. output from this system consists of files designed to drive graphic quality photocomposition or com devices.

the system is available for the price of duplicating the documentation package. for those so desiring, the master file containing some 25,000 titles and titles with references is also available for the cost of duplication. interested parties (bona fides only, please) should contact john c. kountz, associate for library automation, california state university and colleges, 5670 wilshire blvd., suite 900, los angeles, ca 90036, for further details.

solinet membership meeting

the annual membership meeting of the southeastern library network (solinet) was held at the georgia institute of technology in atlanta, march 14.
it was announced that charles h. stevens, executive director of the national commission on libraries and information science, has been named director of solinet effective july 1. john h. gribbin, chairman of the board, will serve as interim director.

it was also announced that solinet will be affiliated with the southern regional education board. sreb will provide office space, act as financial disbursing agent, and will be available at all times in an advisory capacity.

negotiations are underway for a tie-in to the ohio college library center (oclc) and a proposed contract is in the hands of the oclc legal counsel. it is anticipated that a contract soon will be signed. in addition to the tie-in, solinet will proceed with the development of its own permanent computer center in atlanta. this center will eventually provide a variety of services and will be coordinated carefully with other developing networks, looking toward a national library network system.

elected to fill three vacancies on the board of directors were james f. govan (university of north carolina), gustave a. harrar (university of florida), and robert h. simmons (west georgia college). they will assume office on july 1. anyone desiring information about solinet should write to 130 sixth st., nw, atlanta, ga 30313.

reports: library projects and activities

new book catalog for junior college district of st. louis

the three community college libraries of the junior college district of st. louis have been using computerized union book catalogs since 1964. formerly maintained and produced by an outside contractor, the catalogs are now one product of a new catalog system recently designed and implemented by instructional resources and data processing staff of the district. known as "ir catalog," the system presently has a data base of approximately 65,000 records describing the print and nonprint collections of the district's three college instructional resource centers.
in addition to photocomposed author, subject, and title indexes, the system also produces weekly cumulative printouts which supplement the phototypeset "base" catalog. other output includes three-by-five-inch shelflist cards (which include union holdings information), a motion picture film catalog, subject and cross reference authority lists, and various statistical reports.

hawaii state library system to automate processing

the state board of education in hawaii has approved a proposal for a computerized data processing system for the hawaii state library. the decision allows for the purchase of computer equipment for automating library operations. the state library centrally processes library materials for all public and school libraries in the state. teichiro hirata, acting state education superintendent, told board members a computerized system will speed book selection, ordering, and processing, and will improve interlibrary loan and reference services. he also pointed out it would facilitate a general streamlining of all technical administrative operations. the system's total cost will be $187,000, of which $58,000 will be spent for computer software. the "biblios" system, designed and developed at orange county public library in california and marketed by information design, inc., was selected as the software package.

the caltech science library catalog supplement

the use of catalog supplements during the necessary maturation period required to take full advantage of the national program for acquisitions and cataloging is obviously an idea whose time has come. the program developed at the california institute of technology, however, differs in several important respects from that previously described by nixon and bell at u.c.l.a.1 for reasons based primarily on faculty pressure, holding books in anticipation of the cataloging copy has never been a practice at the institute.
the solution, while hardly unique, is to assign the classification number (dewey) and depend on a temporary main entry card to suffice until the lc copy is available. while this procedure has the distinct advantage of not requiring the presence of the book to complete the cataloging process, it does, however, prevent the user from finding the newest books through a search of the subject added entry cards. the use of computer-based systems is an obvious solution to this aspect of the program but raises several additional problems which formerly seemed to defy solution. as has been pointed out by mason, library-based computer systems can rarely be justified in terms of cost effectiveness, and computer-based library catalogs are no exception.2 part of this problem arises from the natural inclination to repeat in machine language what has been standard practice in the library catalog. this reaction overlooks the very different nature of catalogs and catalog supplements. as catalogs serve as the basis for the permanent record and their cost can be prorated over several decades, the need for a careful description of the many facets of a book is quite properly justified. in the case of catalog supplements, however, where the record will serve quite likely for only a few months, any attempt at detailed description of the book cannot be justified.

one solution to this dilemma that has been developed here at caltech is a brief listing supplement which allows searching for a given book by either the first author or editor's last name, a key word from the title, or the first word of a series entry. these elements form the basis of a simple kwoc index (see figure 1) which supplements the bibliographic listing (shown in figure 2). all books received in the chemistry, physics, and biology libraries are represented in the catalog supplement. weekly lists of newly added books (shown in figure 3) are annotated to show the index terms prior to keypunching. the unit record consists of a "title" card or cards (which contain the full title, author/editor, call number, library designation, and series information) and an "author" card (which contains the index terms). edited material is added accessionally to the card file data base and batch processed on the campus ibm 370/155 computer.

    chemisorption   chemisorption and catalysis                    hepple    541.395 he 1970   ch    19
    chester         techniques in partial differential equations   chester   517.6 ch 1971     ch   199
    ciba            protein turnover                                         612.39 pr 1972    bi   108
                    (ciba foundation symposium, 9)

fig. 1. sample entries from the kwoc index

     19   t   chemisorption & catalysis   hepple   541.395 he 1970   ch
          a   hepple chemisorption catalysis
    108   t   protein turnover   612.39 pr 1972   bi   (ciba foundation symposium, 9)
          a   protein ciba
    199   t   techniques in partial differential equations   chester   517.6 ch 1971   ch
          a   differential chester

fig. 2. sample entries from the bibliographic listing

    new books   chemistry/biology   august 6, 1973

    catalysis, chemisorption and . . .                    hepple    541.395 he 1970   ch
    differential equations, techniques in partial . . .   chester   517.6 ch 1971     ch
    protein turnover   ciba foundation symposium, 9                 612.39 pr 1972    bi

fig. 3. sample entries from the weekly list of newly added books

the catalog supplement is currently published on 8½-by-11-inch sheets as a result of reducing the computer printout on a xerox 7000 copier. lists are given a vello-bind and delivered to the respective libraries. weeding the catalog supplement is still unresolved. at the present time additions are less than 1,000 per year, so that it may be possible after five years to replace the subject sections of the respective divisional catalogs with the catalog supplement. the "library" at caltech consists of several divisional libraries, each with their own card catalog.
these divisional card catalogs are supplemented by a union catalog, which serves all libraries on campus and, because of the strong interdisciplinary nature of the divisional libraries, is much the better source for subject searches. the project is so facile and the costs so minimal that this approach might be of value to many small libraries. it is particularly applicable to the problems recently discussed by patterson.3 books in series, even if they are distinct monographs, are often lost to the user from a subject approach. with this system each physical volume added to the library can be analyzed for possible inclusion in the catalog supplement.

1. roberta nixon and ray bell, "the u.c.l.a. library catalog supplement," library resources & technical services 17:59 (winter 1973).
2. ellsworth mason, "along the academic way," library journal 96:1671 (1971).
3. kelly patterson, "library think vs library user," rq 12:364 (summer 1973).

dana l. roth
millikan library
california institute of technology

commercial activities

richard abel & company to sponsor workshops in library automation and management

one of the most effective forms of continuing education is state-of-the-art reporting. recognizing the need for more such communication, the international library service firm of richard abel & company plans to sponsor two workshops for the library and information science community. the first workshop will deal with the latest techniques in library automation. it will precede the 1974 american library association conference in new york city, july 7-13. the second will present advances in library management, and will be scheduled to precede the 1975 ala midwinter meeting, january 19-25. the workshops will include forums, lectures, and open discussions. they will be presented by recognized leaders in the fields of library automation, management, and consulting. each workshop will probably be one or two days long.
there will be no charge to attend either of the workshops, but attendance will be limited, to provide a good discussion atmosphere. for the management workshop, attendance will be limited to librarians active in library management. similarly, the automation workshop is intended for librarians working in library automation. maintaining the theme of state-of-the-art reporting, the basic content of the workshops will consist of what is happening in library management and automation today. looking to the future, there will also be discussions and forecasts of what is to come. persons interested in further information or in participating in either workshop should contact abel workshop director, richard abel & company, inc., p.o. box 4245, portland, or 97208.

idc introduces bibnet on-line services

the introduction of bibnet on-line systems, a centralized computer-based bibliographic data service for libraries, has been announced by information dynamics corporation. demonstrations are planned for the ala annual conference in new york, july 7-13. according to david p. waite, idc president, "during 1973, bibnet service modules were interconnected over thousands of miles and tested for on-line use with idc's centralized computer-based cataloging data files. this is the culmination of a program that began two years ago. it is patterned after advanced technological developments similar to those recently applied to airline reservation systems and other large scale nationwide computing networks used in industry." idc, a new england-based library systems supplier, will provide a computer-stored cataloging data base of more than 1.2 million library of congress and contributed entries. initially it will consist of all library of congress marc records (now numbering over 430,000 titles), plus another 800,000 partial lc catalog records containing full titles, main entries, lc card numbers, and other selected data elements.
as a result, bibnet will provide on-line bibliographic searching for all 1,250,000 catalog records produced by the library of congress since 1969. to enable users to produce library cards from those non-marc records for which only partial entries are kept in the computer, idc will mail card sets from its headquarters and add the full records to the data base for future reference. subscribing libraries will have access to the data base using a minicomputer cathode ray tube (crt) terminal. using this technique of dispersed computing, each bibnet terminal has programmable computer power built in. this in-house processing power, independent of the central computer, allows computer processes like library card production to be performed in the library. this also eliminates waiting for catalog cards to arrive in the mail. bibnet terminals communicate with the central computer over regular telephone lines, eliminating the high costs of dedicated communication lines. therefore, thousands of libraries throughout the united states and canada can avail themselves of on-line services at low cost.

bibnet users will have several methods of extracting information from the idc data base. the computer can search for individual records by title, main entry, isbn, or keyword. here's how it works: the operator types in any one of the search items or, if a complete title is not known, a keyword from the title may be used. the cataloging information is then displayed on the crt where the operator may verify the record. at the push of a button, the data is stored on a magnetic cassette tape which is later used for editing and production of catalog cards by the user library. the bibnet demonstration in new york will highlight one of many bibliographic service modules available from idc and stress the fact that these services can be utilized by individual libraries and organized groups of libraries.
license for new information retrieval concept awarded to boeing by xynetics

an exclusive license for manufacture and marketing to the government sector of systems incorporating a completely new concept in information storage and retrieval has been awarded to the boeing company, seattle, washington, by xynetics, inc., canoga park, california, it was announced jointly by dr. r. v. hanks, boeing program manager, and burton cohn, xynetics board chairman. the system is said to be the first image storage and retrieval system which offers response times and costs comparable to those of digital systems. the heart of the system is a device of proprietary design, the flat plane memory, which provides rapid access to massive amounts of data stored in high resolution photographic media. the photographic medium enables low cost storage of virtually any type of source material (documents, correspondence, drawings, multitone images, computer output, etc.) while eliminating the need for time-consuming, costly conversion of pre-existing information into a specialized (e.g., digital) format. by virtue of its extremely rapid random access capability, the data needs of as many as several thousand users can be served at remote video terminals from a single memory with near real time response (1-3 seconds, typically). the high speed, high accuracy, and high reliability of the flat plane memory are accomplished primarily through the use of the patented xynetics positioner, which generates direct linear motion at high speeds and with great precision and reliability instead of converting rotary motion. as a result, the positioners eliminate the gears, lead screws, and other mechanical devices previously utilized, and thus achieve the requisite speed, accuracy, and reliability. the xynetics positioners are already being used in automated drafting systems produced by the firm, and in a wide variety of other applications, including the apparel industry and integrated circuit test systems.
the new approach could eliminate many of the problems associated with multiple reproductions and distribution of large data files. in addition to many government applications, the system is expected to have major applications in the commercial marketplace.

appointments

charles h. stevens appointed solinet director

charles h. stevens, executive director, national commission on libraries and information science, has been appointed director of the southeastern library network (solinet), effective july 1. the announcement was made at a meeting of solinet in atlanta, march 14, by john h. gribbin, board chairman. composed of ninety-nine institutional members, solinet is headquartered in atlanta. a librarian of acknowledged national stature and an expert on the technical aspects of information retrieval systems, mr. stevens brings to solinet a valuable combination of experience and abilities. concerned with national problems of libraries and information services, he will develop a regional network and move toward a cohesive national program to meet the evolving needs of u.s. libraries. a forerunner in library automation, mr. stevens served for six years as associate director for library development, project intrex, at massachusetts institute of technology. from 1959-1965 he was director of library and publications at mit's lincoln laboratory, lexington, massachusetts. at purdue university, he was aeronautical engineering librarian and later director of documentation of the thermophysical properties research center. mr. stevens is a member of the council of the american library association, the american society for information science, the special libraries association, and other professional organizations. he is the author of approximately forty papers in the field, lectures widely, and consults on library activities for a number of universities. mr. stevens holds a b.a.
in english from principia college, elsah, illinois, and master's degrees in english and in library science from the university of north carolina. mr. stevens has done further study in engineering at brooklyn polytechnic institute. mr. stevens is married and has three sons.

input

to the editor:

international scuttlebutt informs us that those in the bibliothecal stratosphere are attempting to formulate a communications format for bibliographical records acceptable on a worldwide basis. we on the local scene unite in wishing them "huzzah!" and "godspeed!" nomenclature must be provided, of course, to designate particular applications; and the following suggestions are offered as possible subspecies of the genus supermarc:

deutschmarc-for records distributed from bonn and/or wiesbaden
rheemarc-for south korean records, named in honor of the late president of that country
bismarc-for records of stage productions which have been produced by popular demand from the top balcony; especially pertinent for wagnerian operas
benchmarc-for records of generally unsuccessful football plays
minskmarc-for byelorussian records
sachermarc-for austrian records, usually representing extremely tasteful concoctions
trademarc-for records pertaining to manufactured products, especially patent medicines
goldmarc-for records representing hungarian musical compositions (v. karl goldmark, 1830-1915)
ectomarc, endomarc, mesomarc (from the italian, mezzomarc)-for skinny, fat, and medium-sized records, respectively
landmarc-for records of historic edifices; sometimes (erroneously) applied to records for local geographical regions
feuermarc-for records representing charred or burned documents
montmarc-1. for records representing works by or about parisian artists; 2. for records representing publications of the french academy
watermarc-for records representing documents contained in bottles washed up on the beach.

joseph a.
rosenthal
university of california, berkeley

information technology and libraries | june 2009

tutorial

andrew darby and ron gilmour

adding delicious data to your library website

social bookmarking services such as delicious offer a simple way of developing lists of library resources. this paper outlines various methods of incorporating data from a delicious account into a webpage. we begin with a description of delicious linkrolls and tagrolls, the simplest but least flexible method of displaying delicious results. we then describe three more advanced methods of manipulating delicious data using rss, json, and xml. code samples using php and javascript are provided.

one of the primary components of web 2.0 is social bookmarking. social bookmarking services allow users to store bookmarks on the web where they are available from any computer and to share these bookmarks with other users. even better, these bookmarks can be annotated and tagged to provide multiple points of subject access. social bookmarking services have become popular with librarians as a means of quickly assembling lists of resources. since anything with a url can become a bookmark, such lists can combine diverse resource types such as webpages, scholarly articles, and library catalog records. it is often desirable for the data stored in a social bookmarking account to be displayed in the context of a library webpage. this creates consistent branding and a more professional appearance.

delicious (http://delicious.com/), one of the most popular social bookmarking tools, allows users to extract data from their accounts and to display this data on their own websites. delicious offers multiple ways of doing this, from simply embedding html in the target webpage to interacting with the api.1 in this paper we will begin by looking at the simplest methods for users uncomfortable with programming, and then move on to three more advanced methods using rss, json, and xml.
our examples use php, a cross-platform scripting language that may be run on either linux/unix or windows servers. while it is not possible for us to address the many environments (such as cmses) in which websites are constructed, our code should be adaptable to most contexts. this will be especially simple in the many popular php-based cmses such as drupal, joomla, and wordpress. it should be noted that the process of tagging resources in delicious requires little technical expertise, so the task of assembling lists of resources can be accomplished by any librarian. the construction of a website infrastructure (presumably by the library’s webmaster) is a more complex task that may require some programming expertise.

andrew darby (adarby@ithaca.edu) is web services librarian, and ron gilmour (rgilmour@ithaca.edu) is science librarian at ithaca college library, ithaca, new york.

linkrolls and tagrolls

the simplest way of sharing links is to point users directly to the desired delicious page. to share all the items labeled “biology” for the user account “iclibref,” one could disseminate the url http://delicious.com/iclibref/biology. the obvious downside is that the user is no longer on your website, and they may be confused by their new location and what they are supposed to do there. linkrolls, a utility available from the delicious site, provides a number of options for generating code to display a set of bookmarked links, including what tags to display, the number of links, the type of bullet, and the sorting criterion (see figure 1).2 this utility creates simple html code that can be added to a website. a related tool, tagrolls, creates the ubiquitous delicious tag cloud.3

figure 1. delicious linkroll page

for many librarians, this will be enough. with the embedded linkroll code, and perhaps a bit of css styling, they will be satisfied with the results.
however, delicious also offers more advanced methods of interacting with data. for more control over how delicious data appears on a website, the user must interact with delicious through rss, json, or xml.

rss

like most web 2.0 applications, delicious makes its content available as rss feeds. feeds are available at a variety of levels, from the delicious system as a whole down to a particular tag in a particular account. within a library context, the most useful types of feeds will be those that point to lists of resources with a given tag. for example, the request http://feeds.delicious.com/rss/iclibref/biology returns the rss feed for the “biology” tag of the “iclibref” account, with items listed as follows:

    <item rdf:about="http://icarus.ithaca.edu/cgi-bin/pwebrecon.cgi?bbid=237870">
      <title>darwin’s dangerous idea (evolution 1)</title>
      <dc:date>2008-04-09t18:40:00z</dc:date>
      <link>http://icarus.ithaca.edu/cgi-bin/pwebrecon.cgi?bbid=237870</link>
      <dc:creator>iclibref</dc:creator>
      <description>this episode interweaves the drama in key moments of darwin&#039;s life with documentary sequences of current research, linking past to present and introducing major concepts of evolutionary theory. 2001</description>
      <dc:subject>biology</dc:subject>
    </item>

to display delicious rss results on a website, the webmaster must use some rss parsing tool in combination with a script to display the results. the xml_rss package provides an easy way to read rss using php.4 the code for such an operation might look like this:

    <?php
    require_once "XML/RSS.php";
    // fetch and parse the feed for the "biology" tag
    $rss = new XML_RSS("http://feeds.delicious.com/rss/iclibref/biology");
    $rss->parse();
    // print each item as a linked list entry
    foreach ($rss->getItems() as $item) {
        echo "<li><a href=\"" . $item['link'] . "\">" . $item['title'] . "</a></li>";
    }
    ?>

this code uses xml_rss to parse the rss feed and then prints out a list of linked results. rss is designed primarily as a current awareness tool. consequently, a delicious rss feed only returns the most recent thirty-one items. this makes sense from an rss perspective, but it will not often meet the needs of librarians who are using delicious as a repository of resources. despite this limitation, the delicious rss feed may be useful in cases where currency is relevant, such as lists of recently acquired materials.
json

a second method to retrieve results from delicious is using javascript object notation or json.5 as with the rss feed method, a request with credentials goes out to the delicious server. the response returns in json format, which can then be processed using javascript. an example request might be http://feeds.delicious.com/v2/json/iclibref/biology. by navigating to this url, the json response can be observed directly. a json response for a single record (formatted for readability) looks like this:

    delicious.posts = [
      {"u":"http:\/\/icarus.ithaca.edu\/cgi-bin\/pwebrecon.cgi?bbid=237870",
       "d":"darwin’s dangerous idea (evolution 1)",
       "t":["biology"],
       "dt":"2008-04-09t06:40:00z",
       "n":"this episode interweaves the drama in key moments of darwin’s life with documentary sequences of current research, linking past to present and introducing major concepts of evolutionary theory. 2001"}
    ];

it is instructive to look at the json feed because it displays the information elements that can be extracted: “u” for the url of the resource, “d” for the title, “t” for a comma-separated list of related tags, “n” for the note field, and “dt” for the timestamp. to display results in a webpage, the feed is requested using javascript:

    <script type="text/javascript" src="http://feeds.delicious.com/v2/json/iclibref/biology"></script>

then the json objects must be looped through and displayed as desired. alternately, as in the script below, the json objects may be placed into an array for sorting. the following is a simple example of a script that displays all of the available data with each item in its own paragraph. this script also sorts the links alphabetically. while rss returns a maximum of thirty-one entries, json allows a maximum of one hundred. the exact number of items returned may be modified through the count parameter at the end of the url.
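the sorting and display step described above can be sketched in javascript as follows. this is a minimal sketch, not the authors' script: the function names (escapeHtml, sortByTitle, renderPosts) and the sample records are assumptions made for illustration, and only the field names u, d, t, n, and dt come from the feed shown earlier. it builds an html string rather than writing to the page, so it can be tried outside a browser.

```javascript
// Hypothetical sample records in the shape of the feed shown earlier.
const samplePosts = [
  { u: "http://example.org/zoology", d: "Zoology basics", t: ["biology"], n: "", dt: "2008-04-09T06:40:00Z" },
  { u: "http://example.org/algae", d: "algae and lichens", t: ["biology", "botany"], n: "An overview.", dt: "2008-04-10T06:40:00Z" }
];

// Escape characters that would otherwise break the generated markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Return a copy of the posts sorted alphabetically by title ("d"), ignoring case.
function sortByTitle(posts) {
  return posts.slice().sort((a, b) =>
    a.d.toLowerCase().localeCompare(b.d.toLowerCase()));
}

// Render each post as its own paragraph: linked title, note (if any), tags, timestamp.
function renderPosts(posts) {
  return sortByTitle(posts).map((post) => {
    const parts = ['<a href="' + escapeHtml(post.u) + '">' + escapeHtml(post.d) + "</a>"];
    if (post.n) {
      parts.push(escapeHtml(post.n));
    }
    parts.push("tags: " + escapeHtml(post.t.join(", ")));
    parts.push("saved: " + escapeHtml(post.dt));
    return "<p>" + parts.join("<br>") + "</p>";
  }).join("\n");
}

console.log(renderPosts(samplePosts));
```

in a page, the array passed to renderPosts would be the posts array supplied by the feed, and the returned string could be assigned to an element's innerHTML.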
at the ithaca college library, we chose to use json because at the time, delicious did not offer the convenient tagrolls, and the results returned by rss were displayed in reverse chronological order and truncated at thirty-one items. currently, we have a single php page that can display any delicious result set within our library website template. librarians generate links with parameters that designate a page title, a comma-delimited list of desired tags, and whether or not item descriptions should be displayed. for example, www.ithacalibrary.com/research/delish_feed.php?label=biology%20films&tag=biology,biologyi&notes=yes will return a page that looks like figure 2. the advantage of this approach is that librarians can easily generate webpages on the fly and send the url to their faculty members or add it to a subject guide or other webpage. the php script only has to read the “$_GET” variables from the url and then query delicious for this content.

xml

delicious offers an application programming interface (api) that returns xml results from queries passed to delicious through https. for instance, the request https://api.del.icio.us/v1/posts/recent?&tag=biology returns an xml document listing the fifteen most recent posts tagged as “biology” for a given account. unlike either the rss or the json methods, the xml api offers a means of retrieving all of the posts for a given tag by allowing requests such as https://api.del.icio.us/v1/posts/all?&tag=biology. this type of request is labor intensive for the delicious server, so it is best to cache the results of such a query for future use. this involves the user writing the results of a request to a file on the server and then checking to see if such an archived file exists before issuing another request. a php utility called deliciousposts, which provides caching functionality, is available for free.6 note that the username is not part of the request and must be supplied separately.
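the caching step described above can be sketched as follows. this is a minimal sketch for node.js, not the deliciousposts utility: readThroughCache, its parameters, and the maximum-age check are assumptions made for illustration, and fetchFn stands in for whatever code actually issues the api request.

```javascript
const fs = require("fs");

// Return the cached text if a sufficiently fresh cache file exists; otherwise
// call fetchFn (a hypothetical stand-in for the real API request), write its
// result to the cache file, and return it.
function readThroughCache(cachePath, maxAgeMs, fetchFn) {
  if (fs.existsSync(cachePath)) {
    const ageMs = Date.now() - fs.statSync(cachePath).mtimeMs;
    if (ageMs <= maxAgeMs) {
      return fs.readFileSync(cachePath, "utf8");
    }
  }
  const body = fetchFn();
  fs.writeFileSync(cachePath, body, "utf8");
  return body;
}
```

the maximum age is a design choice rather than something the text prescribes: it keeps an archived file from being served indefinitely after the bookmarks change, and a very large value reduces it to the simple "does the file exist" check.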
unlike the public rss or json feeds, using the xml api requires users to log in to their own account. from a script, this can be accomplished using the php curl function:

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $queryurl);
    curl_setopt($ch, CURLOPT_USERPWD, $username . ":" . $password);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $posts = curl_exec($ch);
    curl_close($ch);

this code logs into a delicious account, passes it a query url, and makes the results of the query available as a string in the variable $posts. the content of $posts can then be processed as desired to create web content. one way of doing this is to use an xslt stylesheet to transform the results into html, which can then be printed to the browser:

    /* create a new dom document from your stylesheet */
    $xsl = new DOMDocument;
    $xsl->load("mystylesheet.xsl");
    /* set up the xslt processor */
    $xp = new XSLTProcessor;
    $xp->importStylesheet($xsl);
    /* create another dom document from the contents of the $posts variable */
    $doc = new DOMDocument;
    $doc->loadXML($posts);
    /* perform the xslt transformation and output the resulting html */
    $html = $xp->transformToXML($doc);
    echo $html;

conclusion

delicious is a great tool for quickly and easily saving bookmarks. it also offers some very simple tools such as linkrolls and tagrolls to add delicious content to a website. but to exert more control over this data, the user must interact with the delicious api or feeds. we have outlined three different ways to accomplish this: rss is a familiar option and a good choice if the data is to be used in a feed reader, or if only the most recent items need be shown. json is perhaps the fastest method, but requires some basic scripting knowledge and can only display one hundred results. the xml option involves more programming but allows an unlimited number of results to be returned. all of these methods facilitate the use of delicious data within an existing website.

references

1.
delicious, tools, http://delicious.com/help/tools (accessed nov. 7, 2008).
2. linkrolls may be found from your delicious account by clicking settings > linkrolls, or directly by going to http://delicious.com/help/linkrolls (accessed nov. 7, 2008).
3. tagrolls may be found from your delicious account by clicking settings > tagrolls, or directly by going to http://delicious.com/help/tagrolls (accessed nov. 7, 2008).
4. martin jansen and clay loveless, “pear::package::xml_rss,” http://pear.php.net/package/xml_rss (accessed nov. 7, 2008).
5. introducing json, http://json.org (accessed nov. 7, 2008).
6. ron gilmour, “deliciousposts,” http://rongilmour.info/software/deliciousposts (accessed nov. 7, 2008).

analyzing digital collections entrances: what gets used and why it matters

paromita biswas and joel marchesoni

information technology and libraries | december 2016

abstract

this paper analyzes usage data from hunter library’s digital collections using google analytics for a period of twenty-seven months from october 2013 through december 2015. the authors consider this data analysis to be important for identifying collections that receive the largest number of visits. we argue this data evaluation is important in terms of better informing decisions for building digital collections that will serve user needs. the authors also study the benefits of harvesting to sites such as the digital public library of america, and they believe this paper will contribute to the literature on google analytics and its use by libraries.

introduction

hunter library at western carolina university (wcu) has fourteen digital collections hosted in contentdm, a digital collection management system from oclc.
users can enter the collections in various ways—through the library’s contentdm landing pages,1 search engines, or sites such as the digital public library of america (dpla) where all the collections are harvested.2 since october 2013, the library has collected usage data from its collections’ websites and from dpla referrals via google analytics. this paper analyzes this usage data covering a period of approximately twenty-seven months from october 2013 through december 2015. the authors consider this data analysis important for identifying collections receiving the largest number of visits, including visits through harvesting sites such as the dpla. the authors argue that such data evaluation is important because it can better inform decisions taken to build collections that will attract users and serve their needs. additionally, this analysis of usage data generated from harvesting sites such as the dpla demonstrates the usefulness of harvesting in increasing digital collections’ usage. lastly, this paper contributes to the broader literature on google analytics and its use by libraries in data analysis.

paromita biswas (pbiswas@email.wcu.edu) is metadata librarian and joel marchesoni (jmarch@email.wcu.edu) is technology support analyst, hunter library, western carolina university, cullowhee, north carolina.

analyzing digital collections entrances: what gets used and why it matters | biswas and marchesoni | https://doi.org/10.6017/ital.v35i4.9446

literature review

using google analytics to study usage of electronic resources is common; a considerable amount of material exists describing the use of google analytics in marketing and business fields.3 however, the published literature offers little about the use of this software for studying usage of collections consisting of unique materials digitized and placed online by libraries and cultural heritage organizations.
for example, betty has written about using google analytics to track statistics for user interaction with librarian-created digital media such as quizzes and video tutorials.4 fang discusses using google analytics to track the behavior of users who visited the rutgers-newark law library website.5 fang looked at the number of visitors, what and how many pages they visited, how long they stayed on each page, where they were coming from, and which search engine or website had referred them to the library’s website. findings were evaluated and used to make improvements to the library’s website. for example, fang mentions using google analytics data for tracking the percentage of new and returning visitors before and after the website redesign. among articles that discuss using web analytics to learn how users access digital collections, most have focused on a comparison between third-party platforms, online search engines, and the traditional library catalog to find preferred modes of access and whether results call for a shift in how libraries share their digital collections. for example, in their article on the impact of social media platforms such as historypin and pinterest on the discovery and access of digital collections, baggett and gibbs use google analytics for tracking usage of digital objects on the library’s website as well as statistics collected from historypin’s and pinterest’s first-party analytics tools.6 the authors conclude that while neither historypin nor pinterest drives users back to the library’s website, they help in the discovery of digital collections and can enhance user access to library collections. schlosser and stamper compare the effects on usage of a collection housed in an institutional repository and reposted on flickr.7 whether housing a collection on a third-party site had an adverse effect on attracting traffic to the library’s website was not as important as ensuring users accessed the collection somewhere.
likewise, o’english demonstrates how data from web analytics were used to compare access to archival materials via online search engines as opposed to library catalogs using marc records for descriptions.8 o’english argues library practices should change accordingly to promote patron access and use. ladd’s article on the access and use of a digital postcard collection from miami university uses statistics from google analytics, contentdm, and flickr over a period of one year.9 ladd’s findings reveal that few users came to the main digital collections website to search and browse; instead, most arrived via external sources such as search engines and social media sites. the resulting increase in views, ladd asserts, makes regular updates in both contentdm and flickr imperative for promoting access and use of the postcards. articles on using google analytics for tracking digital collection usage have explored tracking the geographic base of users. for example, herold uses google analytics to demonstrate usage of a digital archival collection by users at institutional, national, and international levels.10 herold looks at server transaction logs maintained in google analytics, on- and off-campus searching counts, user locations, and repeat visitors to the archival images representing cultural heritage materials related to orang asli peoples and cultures of malaysia. she uses these data to ascertain the number of users by geographic region and determine that, while most visitors came from the united states, malaysia ranked second. the data supported, according to herold, that this particular digital collection was able to reach another target audience: users from malaysia. herold’s findings indicate that digitization of unique materials makes them available to a worldwide audience.
whether harvesting has increased usage of digital collections available via dpla or its hubs has received limited exploration in the literature. most writings on harvesting digital collections have focused more on the technical aspects of the process, like the dpla’s ingestion method, the quality and scalability of metadata remediation and enhancement,11 and large metadata encoding.12 for example, gregory and williams write about the north carolina digital heritage center as one of the service hubs of the dpla. the service hubs are centers that aggregate digital collection metadata provided by institutions for harvesting by the dpla. the authors discuss metadata requirements, software review, and establishment of workflow for sending large metadata feeds to the dpla.13 boyd, gilbert, and vinson, in their article on the south carolina digital library (scdl), another service hub for dpla, describe the planning behind setting up the scdl, its management, and the technology involved in metadata harvesting.14 freeland and moulaison discuss the missouri hub as a model for “institutions with similar collective goals for exposing and enriching their data through the dpla.”15 according to them, by harvesting their metadata to the dpla, institutions are able to share their digital collections with the broader public. additionally, institutions that harvest metadata to the dpla get value-added services like geocoding of location-based metadata and expression of contributed metadata as linked data.

data collection parameters
hunter library digital collections usage data included information on item views16 and referrals17 for each of the collections including dpla referrals. the authors also considered keyword search terms18 across all referrals, and within contentdm specifically, that brought users to the library’s collections. the authors considered the most frequently occurring keywords to be representing the subjects of collections that were most used.
repeat visitors to the library’s digital collections’ website were also tracked. finally, sessions19 were traced by the geographic area20 of the users. hunter library’s collections vary in size. the library’s largest and one of the oldest collections, craft revival, showcases documents, photographs, and craft objects housed in hunter library and smaller regional institutions. the collection’s items represent the late nineteenth and early twentieth century (1890s–1940s) craft revival movement in western north carolina, which was characterized by a renewed interest in handmade objects, including cherokee arts and crafts. the craft revival collection began in 2005 and includes 1,982 items. the second largest collection, great smoky mountains, which highlights efforts that went into the establishment of the park and includes photographs on the landscape and flora and fauna in the park, began in 2012 and consists of 1,829 items. not all digital collections were harvested to the dpla at the same time. while some older collections were harvested to the dpla in 2013, smaller, institution-specific collections started later were also harvested later. for example, wcu—oral histories, a collection of interviews collected by students of one of wcu’s history classes documenting the history and culture of western north carolina and the lives of wcu athletes or artists like josephina niggli who taught drama at wcu; highlights from wcu, a collection of unique items from wcu’s mountain heritage center and other departments on campus, including letters from the library’s special collections transcribed by wcu’s english department students; and wcu—fine art museum, showcasing art work from the university’s fine art museum, were harvested to the dpla in 2015.
as these smaller collections were started later, their total item views and referral counts would likely be less than some of the library’s older collections; however, these newer collections were included as they might provide valuable data regarding harvesting referrals and returning visitors. table 1 shows the years the collections were started, the number of items included in each collection, and the year they were harvested to the dpla.

collection name | start year | collection size (number of items) | harvested since
cherokee traditions | 2011 | 332 | 2013
civil war | 2011 | 68 | 2013
craft revival | 2005 | 1,982 | 2013
great smoky mountains | 2013 | 1,829 | 2013
highlights from wcu | 2015 | 39 | 2015
horace kephart | 2005 | 552 | 2013
picturing appalachia | 2012 | 972 | 2013
stories of mountain folk | 2012 | 374 | 2013
travel western north carolina | 2011 | 160 | 2013
wcu—fine art museum | 2015 | 87 | 2015
wcu—herbarium | 2013 | 91 | 2013
wcu—making memories | 2012 | 408 | 2013
wcu—oral histories | 2015 | 67 | 2015
western north carolina regional maps | 2015 | 37 | 2015
table 1. collections by year

collecting data using google analytics
the library has had google analytics set up on online exhibits—websites outside of contentdm that provide additional insight into the collection—since 2008 and began using google analytics to track its contentdm materials with the 6.1.2 release in october 2013. contentdm version 6.4 introduced a configuration field that allowed the authors to enter a google analytics id and automatically generate the tracking code in pages to simplify the setup. following that software update, oclc made google analytics the default data logging mechanism. the library set up google analytics such that online exhibits are tracked together with their contentdm collections. this is accomplished by using custom tracking on all webpages and a custom script in contentdm.
this allows the library to link its contentdm and wcu.edu domains within google analytics so that sessions can be viewed across all online digital collections. data were collected from google analytics using several tools. google provides an online tool called query explorer (https://ga-dev-tools.appspot.com/query-explorer/) that can create and execute custom searches against google analytics. this application was used to craft the queries. microsoft excel was primarily used to download data, using the custom plugin rest to excel library (http://ramblings.mcpher.com/home/excelquirks/json/rest) to parse information from google analytics into worksheets. the excel add-on works well, but requires knowledge of microsoft visual basic for applications (vba) programming to use effectively. this limitation prompted the authors to look for a simpler way of retrieving data. the authors then used openrefine (https://github.com/openrefine/openrefine) to collect, sort, and filter data, with excel used for results analysis. once in excel, formulas were used to mine data for specific targets.

results analysis
the data collected using google analytics spanned a period of approximately twenty-seven months, from october 2013 through december 2015. table 2 and graph 1 show each collection’s item views, item referrals, and size (number of items in the collection). these numbers were calculated for each collection as a percentage of total item views, total item referrals, and total number of items for all collections together. in table 2, the top five collections in terms of item views and referrals are highlighted. graph 1, a graphical representation of table 2, displays more starkly the differences between collections in terms of views and referrals.
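as an illustration of the kind of query that query explorer helps to craft, the sketch below assembles a request url for sessions by city over the study period. it is not the authors’ query: it assumes the parameter names of the google analytics core reporting api (v3) that query explorer exposed at the time, the view id is a hypothetical placeholder, and no request is actually sent.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumed endpoint of the (now legacy) Core Reporting API v3.
BASE_URL = "https://www.googleapis.com/analytics/v3/data/ga"


def build_query_url(view_id, start_date, end_date, metrics, dimensions,
                    max_results=1000):
    """Build a query URL; nothing is sent over the network."""
    params = {
        "ids": f"ga:{view_id}",            # hypothetical view (profile) id
        "start-date": start_date,
        "end-date": end_date,
        "metrics": ",".join(metrics),
        "dimensions": ",".join(dimensions),
        "sort": "-" + metrics[0],          # highest values first
        "max-results": max_results,
    }
    return BASE_URL + "?" + urlencode(params)


# Sessions by city for the twenty-seven-month study period.
url = build_query_url(
    view_id="12345678",                    # placeholder, not a real id
    start_date="2013-10-01",
    end_date="2015-12-31",
    metrics=["ga:sessions"],
    dimensions=["ga:city"],
)
print(url)
print(parse_qs(urlsplit(url).query)["dimensions"])
```

the returned rows could then be loaded into excel or openrefine for the sorting and filtering described above.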
collection name | item views as percentage of total views | item referrals as percentage of total referrals | number of items as percentage of total items for all collections
cherokee traditions | 6.38 | 6.12 | 4.74
civil war | 1.89 | 0.88 | 0.97
craft revival | 41.35 | 52.39 | 28.32
great smoky mountains | 7.50 | 6.34 | 26.14
highlights from wcu | 0.23 | 0.08 | 0.56
horace kephart | 11.67 | 7.62 | 7.89
picturing appalachia | 10.03 | 9.99 | 13.89
stories of mountain folk | 3.51 | 2.45 | 5.34
travel western north carolina | 7.87 | 9.57 | 2.29
wcu—fine art museum | 0.19 | 0.08 | 1.24
wcu—herbarium | 0.71 | 0.45 | 1.30
wcu—making memories | 7.13 | 2.64 | 5.83
wcu—oral histories | 0.80 | 1.08 | 0.96
western north carolina regional maps | 0.26 | 0.11 | 0.53
total | 100.00 | 100.00 | 100.00
table 2. collections by percentage

graph 1. collections by percentage

as demonstrated in the preceding table and graph, craft revival, one of the library’s oldest and largest collections, contributes more than 28 percent of all digital collections’ items and garners close to 42 percent of all item views and 53 percent of all item referrals. great smoky mountains, the second largest collection, contributes a little more than 26 percent of items but receives only about 8 percent of all item views and 7 percent of all referrals. the horace kephart collection, focusing on the life and works of horace kephart—author, librarian, and outdoorsman who made the mountains of western north carolina his home later in life—is the library’s fourth largest collection. it receives almost 12 percent of all item views and about 8 percent of all item referrals.
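the percentage columns in table 2 follow from simple share-of-total arithmetic. as a minimal sketch, the item-count column can be reproduced from the collection sizes listed in table 1; the function name is illustrative only, and the same calculation would apply to item views and referrals, whose raw counts are not given in the paper.

```python
# Number of items per collection, copied from table 1.
ITEM_COUNTS = {
    "cherokee traditions": 332,
    "civil war": 68,
    "craft revival": 1982,
    "great smoky mountains": 1829,
    "highlights from wcu": 39,
    "horace kephart": 552,
    "picturing appalachia": 972,
    "stories of mountain folk": 374,
    "travel western north carolina": 160,
    "wcu-fine art museum": 87,
    "wcu-herbarium": 91,
    "wcu-making memories": 408,
    "wcu-oral histories": 67,
    "western north carolina regional maps": 37,
}


def percentage_shares(counts):
    """Return each value as a percentage of the total, rounded to 2 places."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 2) for name, n in counts.items()}


shares = percentage_shares(ITEM_COUNTS)
for name, share in shares.items():
    print(f"{name}: {share:.2f}")
```

with these inputs the shares match the last column of table 2 (for example, craft revival at 28.32 and great smoky mountains at 26.14).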
picturing appalachia, the third largest collection—consisting of photographs showcasing the history, culture, and natural landscape of southern appalachia in the western north carolina region—makes up 14 percent of items and receives approximately 10 percent of all referrals and views. travel western north carolina—visual journeys of western north carolina communities through three generations—contributes fewer than 3 percent of items but scores high on both item views and referrals. wcu—making memories, which highlights the people, buildings, and events from wcu’s history, and stories of mountain folk (somf), which is a collection of radio programs from western north carolina non-profit catch the spirit of appalachia and archived at hunter library, are collections that are similar in size—receiving fewer than 3 percent of all item referrals. however, wcu—making memories receives more than 7 percent of all item views compared to somf’s almost 4 percent. these findings are not surprising as the making memories collection documents western carolina university’s history and may receive many views from within the institution. overall, however, the craft revival collection can be considered the library’s most popular collection. the horace kephart collection appears to be the second most popular collection. and, not surprisingly, cherokee traditions, a collection of art objects, photographs, and recordings similar in content to the craft revival in terms of its focus on cherokee culture and history, is quite popular and receives more item referrals than both wcu—making memories and somf and more item views than somf (table 2). an analysis of keyword searches within contentdm and keyword searches across all referral sources reiterates these findings.
as part of the analysis, data collected for this twenty-seven-month period for the top keyword searches within contentdm and the top keyword searches counting all referrals were recorded in an excel spreadsheet and then uploaded to openrefine. openrefine allows text and numeric data to be sorted by name (alphabetical) and count (highest to lowest occurring). once the excel spreadsheet was uploaded to openrefine, keywords were sorted numerically and clustered. openrefine has a “cluster” function to bring together text that has the same meaning but differs by spelling or capitalization (for example, “cherokee,” “cherokee,” “cheroke”) or by order (for example, “jane smith,” “smith, jane”). the clustering function provides a count of the number of times a keyword was used regardless of exact spelling. after identifying keywords belonging to a cluster (for example, a cluster of the word “cherokee” spelled differently), the differently spelled or organized keywords in each cluster were merged in openrefine with their most accurate counterparts. finally, it should be noted that keywords including “!” and “+” symbols were most likely generated from either using multiple search terms within contentdm’s advanced search or from curated search links maintained on some of our online exhibit websites. these links take users to commonly used result sets within the collection. tables 3 and 4 provide a listing of the ten most frequently searched keywords within contentdm and across all referrals, along with the names of collections that are most relevant to these searches.
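the clustering step described above can be approximated outside openrefine. the sketch below is a simplified version of the “fingerprint” key-collision idea (lowercase, strip punctuation, sort and de-duplicate tokens), which groups variants that differ by capitalization, punctuation, or word order. it is an assumption that this resembles the method the authors used; the function names and sample counts are hypothetical, and true misspellings such as “cheroke” would need a different method (for example, nearest-neighbor or phonetic matching).

```python
import string
import unicodedata
from collections import defaultdict


def fingerprint(text):
    """Normalize a keyword so that trivial variants share one key."""
    text = unicodedata.normalize("NFKD", text.strip().lower())
    text = text.encode("ascii", "ignore").decode("ascii")   # drop accents
    text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = sorted(set(text.split()))                       # order-insensitive
    return " ".join(tokens)


def cluster_counts(keyword_counts):
    """Sum occurrence counts of all keywords that share a fingerprint."""
    clusters = defaultdict(int)
    for keyword, count in keyword_counts.items():
        clusters[fingerprint(keyword)] += count
    return dict(clusters)


# Hypothetical raw counts, for illustration only.
raw = {"Cherokee": 3, "cherokee ": 2, "Jane Smith": 1, "Smith, Jane": 4}
print(cluster_counts(raw))
```

merging on a normalized key keeps the counting step deterministic; deciding which spelling to display for each cluster remains a manual choice, as in the workflow described above.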
keywords | occurrence count | relevant collection(s)
cherokee | 187 | craft revival; cherokee traditions
cherokee language | 107 | craft revival; cherokee traditions
southern highland craft guild | 98 | craft revival
basket!object | 96 | craft revival; cherokee traditions
indian masks—appalachian region, southern | 83 | craft revival; cherokee traditions
basket!photograph postcard | 82 | craft revival; cherokee traditions
w.m. cline company | 78 | picturing appalachia; craft revival
cherokee +indian! photograph | 72 | craft revival; cherokee traditions
wood-carving—appalachian region, southern | 70 | craft revival
indian wood-carving—appalachian region, southern | 69 | craft revival
table 3. top keyword searches within contentdm

keywords | number of sessions | relevant collection(s)
cherokee traditions | 442 | craft revival; cherokee traditions
horace kephart | 185 | horace kephart; great smoky mountains; picturing appalachia
cherokee pottery | 55 | craft revival; cherokee traditions
kephart knife | 50 | horace kephart
amanda swimmer | 37 | craft revival; cherokee traditions
appalachian people | 36 | craft revival; cherokee traditions; great smoky mountains; wcu—oral histories
cherokee indian pottery | 36 | craft revival; cherokee traditions
cherokee baskets | 34 | craft revival; cherokee traditions
weaving patterns | 33 | craft revival; cherokee traditions
basket weaving | 26 | craft revival; cherokee traditions
table 4. top keyword searches across all referrals

tables 3 and 4 show that top searches relate to arts and crafts from the western north carolina region (“baskets,” “indian masks,” “indian wood carving,” “cherokee pottery”), artists (“amanda swimmer”), or topics relating to cherokee culture (“cherokee,” “cherokee language”).
searches relating to the horace kephart collection (“horace kephart,” “kephart knife”) are also popular, explaining the fact that the kephart collection, which accounts for fewer than 8 percent of the library’s digital collections’ items, scores highly in terms of item views (second) and referrals (fourth). the popularity of topics related to western north carolina is reiterated in the geographic base of the users. graph 2 shows north carolina accounts for most of the searches, with cities in western north carolina (asheville, franklin, cherokee, waynesville) accounting for more than 40 percent of sessions.

graph 2. cities by session count

the majority of item referrals come from search engines such as google, bing, and yahoo!; graph 3 shows the percentage of item referrals from these external searches.21 however, the dpla also generates a fair amount of incoming traffic to the collections. for example, while all collections get referrals from the dpla, harvesting to the dpla is particularly useful for smaller collections such as highlights from wcu, wcu—fine art museum, and civil war collection. each of these collections gets 17 percent of referrals from the dpla, making dpla the largest referral source following the search engines for the highlights and fine art museum collections. graph 4 shows referrals each collection receives via the dpla as a percentage of total referrals. this indicates the usefulness of harvesting to the dpla. total monthly referrals from the dpla also appear to increase the longer items are in the dpla (graph 5).

graph 3. percentage of search engine item referrals (google, bing, and yahoo!)

graph 4. percentage of dpla item referrals

graph 5. increase in dpla referrals over time

lastly, new and returning visitors to the collections were tracked as a marker of user interest in particular collections. graph 6 shows data collected for new and returning visitors calculated as a proportion of the total number of visits for each collection. some smaller collections like highlights from wcu, wnc regional maps, wcu—fine art museum, and wcu—oral histories score highly in terms of attracting return visitors (graph 6).

graph 6. new and returning visitors

discussion
the aim behind gathering data was to study usage of hunter library’s digital collections and examine the usefulness of harvesting in promoting use. although usage data logs were unable to shed much light on the actual usefulness of the collections to users, the logs provided information on volume of use, what materials were accessed, and where users were located. analysis of the transaction logs indicates that while all collections likely benefitted from harvesting, craft revival, cherokee traditions, and horace kephart (collections focusing on the culture and history of western north carolina) were the most heavily used and most visitors came from the state of north carolina and from the region in particular. search terms in the transaction logs also indicated a strong interest in items related to cherokee culture and horace kephart.
as herold, who traced the second largest group of users of the orang asli digital image archive to malaysia, notes, the geographic base of a collection’s users can be indicative of the popularity of a subject area.22 likewise, matusiak asserts that users’ comments can be indicative of the relevance of collections to users’ needs and provide direction for the future development of digital collections.23 as neither the craft revival, cherokee traditions, nor horace kephart collection includes items that relate specifically to the university’s history—unlike other institution-specific collections mentioned earlier—it is possible collection users may be more representative of the larger public than the university. these findings point to the need for questioning identification of an academic library’s user base as mainly students and faculty of the institution and whether librarians should give greater consideration to the needs of a wider audience.24 data supporting the existence of this user base, whose true import or preferences might not be captured in surveys and questionnaires, can serve as a valuable source of information for individuals responsible for building digital collections. in an informal survey of hunter library faculty carried out by hunter library’s digital initiatives unit in september of 2014, respondents considered collections such as craft revival to be more useful to users external to the university. while the survey could allude to the nature of the user base of a collection like craft revival, it understandably could not capture the scale of the item views and referrals garnered by this collection as well as a usage data analysis could. on the other hand, analysis of usage data, as demonstrated in this paper, indicated that certain collections—highlights from wcu, wcu—fine art museum, and wcu—oral histories—possibly served a niche audience.
these smaller and more recently established collections consisting of university-created materials attracted more returning visitors (see graph 6). these returning visitors were likely internal users whose visits indicated, as fang points out, a loyalty to these collections.25 in the paper “a framework of guidance for building good digital collections,” authored by the national information standards organization framework advisory group, the authors point out that while there are no absolute rules for creating quality digital collections, a good collection should include data pertaining to usage.26 the authors point to multiple assessment matrixes including using a combination of observations, surveys, experiments, and transaction log analyses. as the wcu digital collections findings demonstrate, a careful analysis of the popularity of collections can indicate the need for balancing quantitative data with more qualitative survey and interview data. these findings also indicate that usage data analysis can be very valuable in identifying the extent of collection usage by visitors who may not have significant survey representation. results from the small (fewer than ten respondents) wcu survey indicate that some respondents question the institutional usefulness of collections such as craft revival. these results show the importance of taking multiple factors into account when assessing user needs and interests in digital collections.

conclusion
the authors feel future projects might stem from this data analysis. for example, local subject fields based on the highest recurring keywords that were mined from the transaction logs can be added for all of hunter library’s digital collections. usage statistics at a later period could be evaluated to study if addition of user-generated keywords increased use of any collection.
as matusiak points out in her article on the usefulness of user-centered indexing in digital image collections, social tagging—despite its lack of synonym control or misuse of the singular and plural—is a powerful form of indexing because of “close connection with users and their language,” as opposed to traditional indexing.27 the terms users assign to describe images are also the ones they are most likely to type while searching for digital images. likewise, according to walsh, a study conducted by the university of alberta found more than forty percent of collections reviewed used a locally developed classification for indexing and searching their collections, and many of these schemes could work well for searches within the collection by users who are familiar with the culture of the collection.28 usage-data analysis can constitute useful information that guides decisions for building digital collections that better serve user needs. it can identify a library’s digital collections’ users and what they want. these are important considerations to keep in mind if library services are to be all about engaging and building relationships with the users.29 harvesting to a national portal such as the dpla is beneficial for hunter library’s collections. at the same time, the library’s institution-specific collections receive more return visits, likely because of sustained interest from the large user base of the university’s students and employees, an assessment supported by survey findings. conversely, collections not so directly tied to the institution receive the most one-time item views and referrals. items that get used are a good indication of what users want and, as this paper demonstrates, the focus of academic digital library collections should consider the needs of both the university audience and the general public.
references
1. a landing page refers to the homepage of a collection.
2. the dpla provides a single portal for accessing digital collections held by cultural heritage institutions across the united states. “history,” digital public library of america, accessed may 19, 2016, http://dp.la/info/about/history/.
3. paul betty, “assessing homegrown library collections: using google analytics to track use of screencasts and flash-based learning objects,” journal of electronic resources librarianship 21, no. 1 (2009): 75–92, https://doi.org/10.1080/19411260902858631.
4. ibid.
5. wei fang, “using google analytics for improving library website content and design: a case study,” library philosophy and practice (e-journal), june 2007, 1–17, http://digitalcommons.unl.edu/libphilprac/121.
6. mark baggett and rabia gibbs, “historypin and pinterest for digital collections: measuring the impact of image-based social tools on discovery and access,” journal of library administration 54, no. 1 (2014): 11–22, https://doi.org/10.1080/01930826.2014.893111.
7. melanie schlosser and brian stamper, “learning to share: measuring use of a digitized collection on flickr and in the ir,” information technology and libraries 31, no. 3 (september 2012): 85–93, https://doi.org/10.6017/ital.v31i3.1926.
8. mark r. o’english, “applying web analytics to online finding aids: page views, pathways, and learning about users,” journal of western archives 2, no. 1 (2011): 1–12, http://digitalcommons.usu.edu/westernarchives/vol2/iss1/1.
9. marcus ladd, “access and use in the digital age: a case study of a digital postcard collection,” new review of academic librarianship 21, no. 2 (2015): 225–31, https://doi.org/10.1080/13614533.2015.1031258.
10. irene m. h. herold, “digital archival image collections: who are the users?” behavioral & social sciences librarian 29, no. 4 (2010): 267–82, https://doi.org/10.1080/01639269.2010.521024.
11. mark a. matienzo and amy rudersdorf, “the digital public library of america ingestion ecosystem: lessons learned after one year of large-scale collaborative metadata aggregation,” in 2014 proceedings of the international conference on dublin core and metadata applications (dcmi, 2014), 1–11, http://arxiv.org/abs/1408.1713.
12. oksana l. zavalina et al., “extended date/time format (edtf) in the digital public library of america’s metadata: exploratory analysis,” proceedings of the association for information science and technology 52, no. 1 (2015): 1–5, http://onlinelibrary.wiley.com/doi/10.1002/pra2.2015.145052010066/abstract.
13. lisa gregory and stephanie williams, “on being a hub: some details behind providing metadata for the digital public library of america,” d-lib magazine 20, no. 7/8 (july/august 2014): 1–10, https://doi.org/10.1045/july2014-gregory.
14. kate boyd, heather gilbert, and chris vinson, “the south carolina digital library (scdl): what is it and where is it going?” south carolina libraries 2, no. 1 (2016), http://scholarcommons.sc.edu/scl_journal/vol2/iss1/3.
15. chris freeland and heather moulaison, “development of the missouri hub: preparing for linked open data by contributing to the digital public library of america,” proceedings of the association for information science and technology 52, no. 1 (2015): 1–4, http://onlinelibrary.wiley.com/doi/10.1002/pra2.2015.1450520100105/abstract.
16. a single view of an item in a digital collection.
17. visits to the site that began from another site with an item page being the first page viewed.
18. keywords are words visitors used to find the library’s website when using a search engine. google analytics provides a list of these keywords.
19. a session is defined as a “group of interactions that take place on a website within a given time frame” and can include multiple kinds of interactions like page views, social interactions, and economic transactions. in google analytics, a session by default lasts thirty minutes, though one can adjust this length to last a few seconds or several hours. “how a session is defined in analytics,” google, analytics help, accessed may 20, 2016, https://support.google.com/analytics/answer/2731565?hl=en.
20. locations were studied in terms of mostly cities and states.
21. the percentage is based on the total referral count a collection gets—for example, a 44 percent referral count for cherokee traditions would mean that the search engines account for 44 percent of the total referrals this collection gets.
22. herold, “digital archival image collections,” 278.
23. krystyna k. matusiak, “towards user-centered indexing in digital image collections,” oclc systems & services: international digital library perspectives 22, no. 4 (2006): 283–98, https://doi.org/10.1108/10650750610706998.
24. ladd, “access and use in the digital age,” 230.
25. fang points out that the improvements made to the rutgers-newark law library website could attract more return visitors and thus achieve loyalty. fang, “using google analytics for improving library website,” 11.
26. niso framework advisory group, a framework of guidance for building good digital collections, 2nd ed. (bethesda, md: national information standards organization, 2004), https://chnm.gmu.edu/digitalhistory/links/cached/chapter3/link3.2a.niso.html.
27. matusiak, “towards user-centered indexing,” 289.
28. john walsh, “the use of library of congress subject headings in digital collections,” library review 60, no. 4 (2011), https://doi.org/10.1108/00242531111127875.
29. lynn silipigni connaway, the library in the life of the user: engaging with people where they live and learn (dublin: oclc research, 2015), http://www.oclc.org/research/publications/2015/oclcresearch-library-in-life-of-user.html.
automated book order and circulation control procedures at the oakland university library

lawrence auld: oakland university, rochester, michigan

journal of library automation vol. 1/2 june, 1968

automated systems of book order and circulation control using an ibm 1620 computer are described as developed at oakland university. relative degrees of success and failure are discussed briefly.

introduction

oakland university, affiliated with michigan state university and founded in 1957, offers degree programs at the bachelor's and master's levels. by september, 1967, 3,896 students were enrolled and continuing growth is anticipated in coming years. the library had holdings of 86,755 volumes and 17,908 units of microform materials on july 1, 1967. although young, oakland's library has already encountered a host of problems common to most academic libraries. in recognizing a need to automate or otherwise improve basic routines of handling book ordering and circulation control, oakland is simply another member of a growing club.

the book order system developed at oakland is noteworthy because of certain features which may be unique: a title index to the on-order file, a computer prepared invoice-voucher form, and a computer prepared voucher card which serves as input to the computer for writing payment checks. in logic the system is related, through parallel invention, to the machine aided technical processing system developed at yale university (1). the system developed with unit record equipment at the university of maryland is perhaps more directly related, particularly in the use of the purchase order as a vendor's report form (2,3). the pennsylvania state university library design for automated acquisitions, which uses a similar purchase order, includes the capacity for an elaborate and variable method for reporting the progress of each item from initial order to completion of cataloging (4,5).
the ibm 357 circulation control system developed at southern illinois university, carbondale, set the pattern followed by most subsequent systems (6,7). oakland's circulation control system, a variation of the ibm 357 system, is more flexible than some because it uses trigger cards to control machine operations.

this paper, originally distributed to a relatively small group of persons and redrafted for a more general reading, presents a case study of how one institution in modest circumstances set about solving certain problems. it describes not systems to be copied but rather a learning process which will continue for many years to come.

background

during the winter of 1964/65, oakland university library laid out the plans and began work on a program of automation of the university library. an initial four-phase plan was conceived: 1) book order, 2) circulation control, 3) serials acquisitions, and 4) a printed book catalog. these housekeeping routines were felt to be the foundation for developing further automation in the library. their automation would liberate the staff, clerical and professional, from such nonproductive and repetitive tasks as alphabetizing and re-copying of bibliographic information.

an early decision to learn by doing rather than attempting to design the ultimate system in advance was supported by the university administration. consensus being that a larger computer to replace the ibm 1620 would be delivered within two years, computer programs were planned to be useful for twenty-four to thirty-six months. work on developing the book order system was begun in march, 1965; perhaps an all-time speed record was achieved when the system was put into use on july 1 of the same year. work on a circulation control system was begun in august and on february 21, 1966, it too was ready.
phases three and four, serials acquisitions and the printed book catalog, were by then being held in abeyance until larger computer equipment should become available to the library.

at oakland university all computer and related services are provided by the computing and data processing center. the computer system includes the following pieces of equipment:

ibm 1620 computer, 40k with monitor 1 and additional instructions feature (mf, tnf, tns)
ibm 1622 card reader/punch (240 cpm/125 cpm)
two ibm 1311 disk drives with changeable disk packs
ibm 1443 line printer (240 lpm)

only one of the two disk drives is available for production use because the other is committed to monitor, supervisor, and stored programs. a disk pack on the ibm 1620 can accommodate two million numeric or one million alphabetic characters. the computer language used for most of the library programs is 1620 sps (symbolic programming system); fortran is used for some computational work.

equipment within the library consists of an ibm 026 printing keypunch which is used for the order system and an ibm 357 data collection device, including a time clock, with output via a second ibm 026 printing keypunch for the circulation system.

book order procedure

as may be inferred from a birdseye view of the order system (figure 1), the initial input to the computer is decklets of punched cards. output from the computer is a series of printouts: purchase orders, library of congress card orders, oakland university invoice-vouchers, a complete on-order listing with title and purchase order number indices, departmental listings, and budget summaries.

fig. 1. flow chart of book order system.

faculty and library staff submit requests for book purchases to the acquisitions department on a specially designed library book request form (figure 2).
the 5x8-inch size provides adequate room for notes, checking marks, etc., and makes for improved legibility, which in turn makes for easier, faster, and more accurate keypunching.

fig. 2. book order request form.

the request form calls for the bibliographic data customarily required for book purchasing, plus date of ordering, code number for the department originating the order, and vendor number. oakland university utilizes a campus-wide five-digit vendor code system; since the library's vendor numbers are a part of the university's vendor code, this interface is one of several points where the book order system ties in with other university records and procedures.

a tag number is assigned to each library book request form upon its arrival in the acquisitions department. after routine bibliographic identification is completed, decklet cards (figure 3) are keypunched. the individual cards in each decklet are kept together by the tag number, punched into columns one through five. to keep the cards in order within decklets, column six is punched to identify the type of card as 1) author, 2) title, 3) place and publisher, or 4) miscellaneous information. column seven indicates the card number within type of card. for example, code 11 in columns six and seven would be the first author card and code 12 the second.

[fig. 3. decklet cards.]

each book has a machine readable book card (figure 7).
the period for which the book normally circulates is indicated with a letter code punched into column one; column two identifies the collection within the library from which the material came; column three identifies the type of material. the call number and/or other identifying information is punched into columns four through forty-one. column forty-two is punched with an end-of-transmission code.

fig. 7. book card.

the ibm 357 data collection device will perform only one operation without special instructions. if it is to perform more than one operation, it must receive instructions for each variant operation and it must receive them each time the variant operation is performed. this limitation can be met in one of three ways: by not admitting variant operations, by using a cartridge as a carrier for some information, or by providing special instructions as they are needed via a "trigger" card.

denying the existence of a variant operation was not practical, because at oakland the identification of a borrower constitutes a set of variant operations. the library's clientele includes not only oakland university students, faculty, and staff, but also residents from the surrounding communities, area high school students, and neighboring college students. the heaviest users are oakland's own students and faculty, who have machine readable plastic identification cards issued by the registrar or the personnel office. it has been impractical for the library to attempt to issue similar cards to guest borrowers.
thus, the identification of a borrower is a set of variant operations. use of a cartridge to gain the borrower identification number would be possible but would leave the borrower identification badge unused. this badge card constitutes an official identification card and as such should be utilized throughout the university whenever practical.

trigger cards to instruct the 357 in the performance of variant operations were developed to control the recording of borrower identification and to identify discharging and certain charging functions. the use of trigger cards provides flexibility, in that machine instructions are carried in trigger cards and are not an integral part of the book cards. a change in machine configuration would probably not require repunching book cards for the book collection. at the same time a wide range of 357 machine functions are made possible through the use of different trigger cards. in short, the adoption of trigger cards provides the greatest degree of flexibility in operating the 357.

in the customary borrowing procedure the student brings a book to the circulation desk and presents it, along with his machine readable student id card, to the desk attendant. the attendant first inserts the book card into the ibm 357 data collection device, then retrieves the book card and inserts a "student badge trigger card", which activates the badge reader on the 357. then the badge is inserted into the badge reader, completing the transaction. by remote control this has created on an ibm 026 printing keypunch a card with the following information: typical loan period, collection from which the item came, type of material, call number, borrower type, borrower's identification number, the day of the year, and the time of day secured from an on-line clock.
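as a rough illustration of the book card layout and the charge card described above, the following python sketch decodes a 42-column book card and assembles the fields the text lists for a charge. it is a sketch under stated assumptions, not the 1620 sps or ibm 357 logic: the names `BookCard`, `ChargeRecord`, `parse_book_card`, and `charge` are hypothetical, as are the sample code values, the end-of-transmission character, and the hhmm time format, none of which the paper specifies.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class BookCard:
    loan_period: str    # column 1: letter code for the normal loan period
    collection: str     # column 2: collection the material came from
    material_type: str  # column 3: type of material
    call_number: str    # columns 4-41: call number or other identification


@dataclass
class ChargeRecord:
    book: BookCard
    borrower_type: str
    borrower_id: str
    day_of_year: int    # the paper's reports list due dates as day of the year
    time_of_day: str    # assumption: HHMM; the actual clock format is not given


def parse_book_card(card: str) -> BookCard:
    """Split a punched-card image into the fields named in the text."""
    if len(card) < 42:
        raise ValueError("a book card image must have at least 42 columns")
    # Column 42 (index 41) holds the end-of-transmission code; it is not
    # interpreted here because the paper does not say which punch is used.
    return BookCard(card[0], card[1], card[2], card[3:41].strip())


def charge(card: str, borrower_type: str, borrower_id: str,
           when: datetime) -> ChargeRecord:
    """Combine book card, badge data, and clock reading into one charge."""
    return ChargeRecord(
        book=parse_book_card(card),
        borrower_type=borrower_type,
        borrower_id=borrower_id,
        day_of_year=when.timetuple().tm_yday,
        time_of_day=when.strftime("%H%M"),
    )


# Hypothetical sample values: codes "A", "1", "1" and "*" are invented.
sample_card = "A" + "1" + "1" + "JK0421.P4".ljust(38) + "*"
record = charge(sample_card, "01", "000006266", datetime(1966, 7, 13, 14, 30))
print(record)
```

keeping the machine instructions out of `BookCard` mirrors the point made in the text: the trigger card, not the book card, decides how the borrower is identified, so the book card layout can stay fixed when the procedure changes.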
if the borrower does not have a machine readable badge card, an alternate method of charging a book is to use a "manual entry trigger card" which activates the manual entry unit, with which can be recorded numeric information identifying the borrower. with special trigger cards books can also be charged to reserve, bindery, or "missing". books are discharged by passing the book card through the 357 and following it by a "discharge trigger card".

monday through friday at closing the charge and discharge cards for the day are delivered to the computing and data processing center, where they are processed by the ibm 1620 computer system. the circulation file is maintained on a disk pack similar to that for the order system.

three reports are received from the computing and data processing center: a daily cumulative listing of all books and materials in circulation (figure 8); a cumulative weekly list of all books on long-term loan; and a weekly fines-due report. in addition, overdue notices, computer printed on mailable postcard stock, are sent weekly to the library where they are audited before being mailed. the fines-due report is arranged by borrower, bringing together in one place all of the borrower's delinquencies; the books which he has neglected to return are listed here, as are the overdue books which he returned through the outdoor book return chute. for the latter the number of days overdue at the time of return is listed.

subsequent refinements introduced into this system include two additional reports: a pre-notice report in call number sequence produced two days in advance of the fines-due report and a listing of books discharged each day. the pre-notice report makes it possible to search the shelves for books which have been returned but, because of time lag, may still have overdue notices generated. normal turn-around time for the system is 24 hours, but on weekends it goes to 63 hours and at certain holiday periods even higher.
the daily list of discharges documents the return and discharge of each book and is used to answer the student who says, "but i returned the book."

fig. 8. example of short term circulation report. [printout headed "short term books in circulation, weds-jul. 13, 1966", with columns for call number, borrower, day of yr due, and odue; an asterisk in the odue column flags charges that are overdue.]

maximum file capacity will permit up to about 9,000 charges at one time. assuming an average life of four weeks for each charge, the maximum number of transactions which can be accommodated in one year is about 115,000.

the circulation control system utilizes eight programs. all are written in 1620 sps and utilize 40k storage. (an additional computational program not included in the production package is written in fortran.) with only minor modification the programs could be made to work with 20k storage.
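the annual capacity figure quoted above can be checked with simple turnover arithmetic. the python sketch below assumes a 52-week year and that every slot in the file is refilled as soon as a charge expires; both are assumptions made here for illustration, and the function name is hypothetical. the result, 117,000, is in line with the paper's rounded "about 115,000".

```python
MAX_CONCURRENT_CHARGES = 9_000  # from the text: charges the file can hold at once
AVERAGE_CHARGE_LIFE_WEEKS = 4   # from the text: assumed average life of a charge
WEEKS_PER_YEAR = 52             # assumption made for this sketch


def annual_transaction_capacity(max_charges: int, life_weeks: float,
                                weeks_per_year: float = WEEKS_PER_YEAR) -> int:
    """Upper bound on charges per year if every file slot is always in use."""
    turnovers_per_year = weeks_per_year / life_weeks
    return int(max_charges * turnovers_per_year)


estimate = annual_transaction_capacity(MAX_CONCURRENT_CHARGES,
                                       AVERAGE_CHARGE_LIFE_WEEKS)
print(estimate)
```

shortening the average loan period raises the ceiling proportionally, which is one reason a file that holds only 9,000 charges can still serve a much larger yearly circulation.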
the individual programs are described in table 2.

table 2. circulation control system programs

lib 201   to update file and to print short- and long-term reports.
lib 202   to print overdue notices and fines-due report.
lib 204   phase 1 routine for lib 202.
lib 205   cold start program to "seed" circulation file.
lib 207   to restart files from one term to the next.
lib 209   to print pre-notice report.
lib 212   to print daily discharges.
lib 213   to print circulation file or part thereof.

appraisal

the book order system has been described as it was originally designed, and the circulation control system as designed and modified. a partial update together with a critical appraisal follows.

implicit in the planning of both systems was the assumption that the ibm 1620 would eventually be replaced by a larger and faster machine and that both systems would be redesigned and augmented. however, the ibm 1620 is continuing in use for a maximum rather than minimum projected time.

in july, 1965, oakland initiated an accelerated library development program. overnight the book budget projection for several years was available and in less than three months the book order system was consequently overloaded. with the disk file filled and many orders waiting, drastic action was required. the most obvious solution seemed to be use of an additional changeable disk pack to expand the purchase order file, but this procedure would have been hopelessly unwieldy. to use a second pack would require either that all transactions be run against both disk packs, roughly doubling computer time and costs, or that each transaction be addressed to a particular disk pack which would necessitate extensive systems redesign. another proposed solution was to revert to a completely manual system, but the order section preferred, if at all possible, to retain the automated fiscal control and invoice-voucher preparation features of the order system.
the alternative finally adopted required a basic philosophical change in the system. as originally designed, the system accounted for a book from the time it was placed on order to the time it was cataloged and placed on the shelf. the disk file was one-half occupied with items received and paid for but not yet cataloged. by purging the file of such items, an on-order file in the narrowest sense was created and a doubling of file capacity gained.

now a new problem was created. how was a book to be accounted for that had been received, paid, and purged from the on-order file, but not yet cataloged? the solution was to print a second (carbon) copy of the lc card order slip which would be hand-filed into the card catalog; there it would serve as an on-order/in-process slip until replaced by a catalog card. hand-filed slips replacing a machine-filed list further altered the philosophical basis of the system. discrepancies in entry do occur, but not so often that the expedient does not work.

four months later the system was again overloaded and a routine had to be devised whereby purchase orders could be issued either manually or through the computer. however, all items were still paid via the computer and all invoice-vouchers computer prepared. fiscal control was retained even though the rationale of the system was violated.

during the summer of 1967 a change of a different nature was implemented. as originally designed the system provided constant communication between the library and each faculty department through the departmental report. but, after the changes described above, the departmental report now included less than one-half of the items being purchased with the department's book fund allocation. it had ceased to serve any purpose and was omitted after july, 1967, with a consequent reduction of nearly two-fifths of line-printer time required for the book order system.
to the question, "would it be better to return to a completely manual system for ordering books?" the answer by the order section has always been "no, retention of the automated system for fiscal control and voucher preparation is preferable, even with the patched system at hand." nor should it be forgotten that the book order system as originally designed worked well until the demand on it exceeded its production capacity.

also to be recognized is the gain in experience and insight by the library staff during these three years. reading about or visiting someone else's work is enlightening but day-to-day work brings an understanding for which it is difficult to obtain a substitute.

acknowledgments

four persons deserve special recognition for the roles they played in the foregoing: dr. floyd cammack, former university librarian, without whose imagination and courage library automation at oakland would not have been attempted; mr. donald mann, assistant director, computing and data processing center, an outstanding systems analyst and programmer; mrs. edith pollock, head of the order section, who likes computers; mrs. nancy covert, head of circulation department, who likes students.

references

1. alanen, sally; sparks, david e.; kilgour, frederick g.: "a computer-monitored library technical processing system," in american documentation institute: proceedings of the annual meeting, v. 3, 1966 (woodland hills, calif.: adrianne press, 1966) p. 419-26.
2. cox, carl r.: "the mechanization of acquisitions and circulation procedures at the university of maryland library," in international business machines corporation: ibm library mechanization symposium (endicott, n. y.: 1964) p. 205-35.
3. cox, carl r.: "mechanized acquisitions procedures at the university of maryland," college & research libraries, 24 (may 1965) 232-36.
4.
minder, thomas l.: "automation-the acquisitions program at the pennsylvania state university library," in international business machines corporation: ibm library mechanization symposium (endicott, n. y.: 1964) p. 145-56.
5. minder, thomas l.; lazorick, gerald: "automation of the penn state university acquisitions department" in international business machines corporation: ibm library mechanization symposium (endicott, n. y.: 1964) p. 157-63. (reprinted from american documentation institute: automation and scientific communication; short papers contributed to the theme sessions of the 26th annual meeting ... (washington: 1963) p. 455-59.)
6. dejarnett, l. r.: "library circulation control using ibm 357's at southern illinois university," in international business machines corporation: ibm library mechanization symposium (endicott, n. y.: 1964) p. 77-94.
7. mccoy, ralph e.: "computerized circulation work: a case study of the 357 data collection system," library resources & technical services, 9 (winter 1965), 59-65.

reproduced with permission of the copyright owner. further reproduction prohibited without permission.

policies governing use of computing technology in academic libraries

vaughan, jason. information technology and libraries; dec 2004; 23, 4; proquest pg. 153

the networked computing environment is a vital resource for academic libraries. ever-increasing use dictates the prudence of having a comprehensive computer-use policy in force. universities often have an overarching policy or policies governing the general use of computing technology that helps to safeguard the university equipment, software, and network against inappropriate use.
libraries often benefit from having an adjunct policy that works to emphasize the existence and important points of higher-level policies, while also providing a local context for systems and policies pertinent to the library in particular. having computer-use policies at the university and library level helps provide a comprehensive, encompassing guide for the effective and appropriate use of this vital resource.

for clients of academic libraries, the computing environment and access to online information is an essential part of everyday service, every bit as vital as having a printed collection on the shelf. the computing environment has grown in positive ways: higher-caliber hardware and software, evolving methods of communication, and large quantities of accurate online information content. it has also grown in many negative ways: the propagation of worms and viruses, other methods of hacking and disruption, and inaccurate informational content. as the computing environment has grown, it has become essential to have adequate and regularly reviewed policies governing its use.

often, if not always, overarching policies exist at a broad institutional or even larger systemwide level. such policies can govern the use of all university equipment, software, and network access within the library and elsewhere on campus, such as campus computer labs. a single policy may encompass every easily conceivable computing-related topic, or there may be several individual policies. apart from any document drafted and enforced at the university level, various public laws exist that also govern appropriate computer-use behavior, whether in academia or on the beach. many institutions have separate policies governing employee use of computer resources; this paper focuses on student use of computing technologies.
in some cases, the library and the additional campus student-computer infrastructure (for example, campus labs and dormitory computer access) are governed by the same organizational entity, so the higher-level policy and the library policy are de facto the same. in many instances, libraries have enacted additional computer-use policies. such policies may emphasize or augment certain points found in the institution-level policy(s), address concerns specific to the library environment, or both.

this paper surveys the scope of what are most commonly referred to as "computer-use policies," specifically, those geared toward the student-client population. common elements found in university-level policies (and often later emphasized in the library policy) are identified. a discussion on additional topics generally more specific to the library environment, and often found in library computer-use policies, follows. the final section takes a look at the computer-use environment at the university of nevada, las vegas (unlv), the various policies in force, and identifies where certain elements are spelled out: at the university level, the library level, or both.

policy basics

purpose and scope

policies can serve several purposes. a policy is defined as: a plan or course of action ... intended to influence and determine decisions, actions, and other matters. a course of action, guiding principle, or procedure considered expedient, prudent, or advantageous.'

any sound university has a comprehensive computer-use policy readily available and visible to all members of the university community: faculty, staff, students, and visitors. some institutions have drafted a universal policy that seeks to cover all the pertinent bases pertaining to the use of computing technology. in some cases, these broad overarching policies have descriptive content as well as references to other related or subsidiary policies.
in this way, they provide content and serve as an index to other policies. in other cases, no illusions are made about having a single, general, overarching policy; the university has multiple policies instead.

policies can define what is permitted (use of computers for academic research) or not permitted (use of computers for nonacademic purposes, such as commercial or political interests). a policy is meant to guide behavior and the use of resources as they are meant to be used. in addition, policies can delve into procedure. for example, most policies contain a section on how to report suspected abuse and how suspected abuse is investigated, and outline potential penalties. policies buried in legalese may serve some purpose, but they may not do a good job of educating users on what is acceptable and not acceptable. perhaps the best approach is an appropriate balance between legalese and language most users will understand.

in addition, policies can also serve to help educate individuals on important topics, rather than merely stating what is allowed and what will get one in trouble. for example, a general policy statement might read, "you must keep your password confidential." taken a step further, the policy could include recommendations pertaining to passwords, such as the minimum password length, inclusion of nonalphabetic characters, the recommendation to change the password regularly, and the mandate to never write down the password.

jason vaughan (jvaughan@ccmail.nevada.edu) is head of the library systems department at the university of nevada, las vegas.

characteristics of a policy: visibility, prominence, easily identifiable

a policy is most useful when it is highly visible and clearly identified as a policy that has been approved by some authoritative individual or body.
students often sign a form or agree online to terms and conditions when their university accounts are established. web pages may have a disclaimer stating something to the effect of "use of (institution's) resources is governed by ...." and provide a hyperlink to the various policies in place. or, a simple policies link may appear in the footer of every web page at the institutional site. some universities have gone a bit further. at the university of virginia, for example, students must complete an online quiz after reviewing the computer-use guidelines.' in addition, they can choose to view the optional video. such components serve to enhance awareness of the various policies in place.

a review of the library literature failed to uncover any articles focusing on computer-use policies in academic libraries. the author then selected several institutions similar in size (but not necessarily peer) to unlv, that is, doctoral-granting universities with a student population between twenty thousand and thirty thousand, and thoroughly examined their library web sites to see what, if any, policy components were explicitly highlighted. it quickly became evident that many libraries do not have a centrally visible, specifically titled, inclusive computer-use policy document. most, but not all, of the library web sites provided a link to the institutional-level computer-use policy. in some cases, library policies were not consolidated under a central page titled "policies and procedures," or "guidelines," and, where they did appear, the context did not imply or state authoritatively that this was an official policy. there was no statement of who drafted the policy (which can lend some level of authority or credence), as well as no indicated creation or revision date. granted, many libraries have paper forms one must sign to obtain a library card, or they may state the rules in hardcopy posted within prominent computer-dense locations.
still, with so much emphasis given to licensed database and internet resources, and with such heavy use of the computing environment, such policies should appear online in a prominent location. where better to provide a computer-use policy than online? perhaps all the libraries reviewed did have policies posted somewhere online. if the author could not easily find them, chances are a student would have difficulties as well. in sum, the location of the policy information and how it is labeled can make a tremendous difference.

revisions

policies should be reviewed on a regular basis. often, the initial policy likely goes through university counsel, the president's administrative circles, and, perhaps, a board of regents or the equivalent. revisions may go through such avenues, or may be more streamlined. a frequent review of policies is mandated by evolving information technology. for example, cell phones with built-in cameras or internet-browsing capabilities, nonexistent a few years ago, are now becoming mainstream. with such an inconspicuous device, activities such as taking pictures of an exam or finding simple answers online are now possible. similarly, regularly installed critical updates are a central concept within windows' latest version of operating-system software. such functionality failed to attract much attention until the increase in security exploits and associated media coverage. some policies, recently updated, now make mention of the need to keep operating systems patched.

why have a library policy?

while some libraries link to higher-level institutional policies and perhaps have a few rules stated on various scattered library web pages, other libraries have quite comprehensive policies that serve as an adjunct to (and certainly comply with) higher-institutional policies. there are several reasons to have a library policy. first, it adds visibility to whatever higher-level policy may be in place.
a central feature of a library policy is that it often provides links (and thus, additional visibility) to other higher-level policies. a computer-use policy can never appear in too many places. (some libraries have the link in the footer of every web page.) a computer-use policy can be thought of as a speed limit sign. presumably, everyone knows that unless otherwise posted, the speed limit inside the city is thirty-five miles per hour, and outside it is fifty-five miles per hour. nevertheless, numerous speed-limit signs are in place to remind drivers of this.

higher-level institutional policies often take a broad stroke, in that they pertain to and address computing technology in general, without addressing specific systems in detail. a second reason to have a local-library policy is to reflect rules governing local-library resources that are housed and managed by the library. such systems often include virtual reference, electronic reserves, laptop-checkout privileges, and the mass of electronic databases and full-text resources purchased and managed by libraries. such library-based systems do not necessarily appear on the radar of higher-level policies, yet have important considerations, such as copyright issues in the electronic age or privacy as it relates to e-mail and chat reference. in addition, libraries often have two large user groups that other campus entities do not have: university affiliates (faculty, staff, students) and nonuniversity affiliates (community users). while broader university policies generally apply to all users of computing technology, local-library policies can work to address all users of the library pcs, and make distinctions as to when, where, and what each group can use.
common computer-use policy elements

the following section outlines broad topics that are usually addressed within high-level, institutional policies. often, some or many of these same elements are later reemphasized or adapted by libraries, focusing on the library environment. in many cases, the policy is presented in a manner somewhat like breaking the seal on a new piece of software packaging. essentially, if someone is using the university equipment or network, that person agrees to abide by all policies governing such use. an overarching policy frequently may end with a bulleted summary of the important points in the document. an important first part of the policy is a clear indication of who the policy applies to. this may be as broad as "anyone who sits down in front of university equipment or connects to the network," or as specific as spelling out individual user groups (undergraduates, graduates, alumni, k-12 students). appendix a summarizes elements found in the various end-user computer policies in force at unlv and the unlv university libraries.

network and workstation security

network security is a universal topic addressed in computer-use policies. under this general aegis one often finds prohibitions against various forms of hacking, as well as recommendations for steps individual users should take to help better secure the overall network. there are also such policies as the prohibition of food and drink near computer workstations or on the furniture housing computer workstations. typical components related to network and workstation security include:

1. disruption of other computer systems or networks; deliberately altering or reconfiguring system files; use of ftp servers, peer-to-peer file sharing, or operation of other bandwidth-intensive services
2. creation of a virus; propagation of a virus
3. attempts at unauthorized access; theft of account ids or passwords
4. password information: individual users need to maintain a strong, confidential password
5. intentionally viewing, copying, modifying, or deleting other users' files
6. a requirement to secure restrictions to files stored on university servers
7. recommendation or requirement to back up files
8. statement of ownership regarding equipment and software: the university, not the student, owns the equipment, network, and software
9. intentional physical damage: tampering, marking, or reconfiguring equipment or infrastructure, such as unplugging network cables
10. food and drink policies

personal hardware and software

many universities allow students to attach their own laptops to the campus wired or wireless network(s). in addition to network connections, a growing number of consumer devices such as floppy disks, zip disks, and rewritable cd/dvd media have the potential to connect to university computers for the purpose of data transfer. today, the list has grown to include portable flash drives, digital cameras and camcorders, and mp3 players, among others. the attaching of personal equipment to university hardware may or may not be allowed. similarly, users may often try to install software on university-owned equipment. typical examples may include a game brought from home or any of the myriad pieces of software easily downloaded from the internet. some of the policy elements dealing with the use of personal hardware and software include:

1. connecting personal laptops to the university wired or wireless network(s)
2. use of current and up-to-date patched operating systems and antivirus programs running on personal equipment attached to the network
3. connecting, inserting, or interfacing such personal hardware as floppy disks, cds, flash drives, and digital cameras with university-owned hardware; liability regarding physical damage or data loss
4. limiting access to personal equipment and mandating immediate reporting if it is stolen (to deactivate registered mac addresses, for example)
5. downloading or installing personal or otherwise additional software onto university equipment
6. use of personal technology (cell phones, pdas) in classroom or test-taking environments

e-mail

e-mail privileges figure prominently in computer-use policies. some topics deal with security and network performance (sending a virus), while many deal with inappropriate use (making threats or sending obscene e-mails). other topics deal with both (such as sending spam, which is unsolicited, annoying, and consumes a lot of bandwidth). among the activities covered are prohibitions or statements regarding:

1. hiding identity, forging an e-mail address
2. initiating spam
3. subscribing others to mailing lists
4. disseminating obscene material or web links to such material
5. general guidelines on e-mail privileges, such as the size of an e-mail account, how long an account can be used after graduation, and e-mail retention
6. basic education regarding e-mail etiquette

printing

with the explosion of full-text resources, libraries and other student-computing facilities have experienced a tremendous growth in the volume of pages printed on library printers. at unlv libraries, for example, the printing volume for july 2002 to june 2003 was just shy of two million pages; the following year that had jumped to almost 2.4 million pages. various policies helping to govern printing may exist, such as honor-system guidelines ("don't print more than ten pages per day"). some institutions or libraries have implemented cost-recovery systems, where students pay fixed amounts per black-and-white and color pages printed through networked printers.
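as a rough illustration of the cost-recovery approach just described, the following python sketch totals a charge from fixed per-page rates. it is only a sketch under stated assumptions: the rates, the page-type labels, and the function name are hypothetical and are not drawn from unlv or any other policy cited here.

```python
# Hypothetical per-page rates in cents; real rates are set by each institution.
RATES_CENTS = {"bw": 5, "color": 25}


def print_charge_cents(pages_by_type, rates=RATES_CENTS):
    """Return the total charge in cents for one print job.

    pages_by_type maps a page type ("bw" or "color") to a page count.
    """
    total = 0
    for page_type, count in pages_by_type.items():
        if page_type not in rates:
            raise ValueError(f"no rate defined for page type: {page_type}")
        if count < 0:
            raise ValueError("page counts cannot be negative")
        total += rates[page_type] * count
    return total


if __name__ == "__main__":
    # Ten black-and-white pages and two color pages at the assumed rates.
    print(print_charge_cents({"bw": 10, "color": 2}))
```

working in whole cents avoids rounding surprises; a real system would also need the refund and paper-size rules that individual policies define.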
standard policies regarding printer use cover:

1. mass printing of flyers or newsletters
2. tampering with or trying to load anything into paper trays (such as trying to load transparencies in a laser printer)
3. per-sheet print costs (color and black-and-white; by paper size)
4. refund policies
5. additional commonsense guidelines, such as "use print preview in browser"

personal web sites

many universities allow students to create personal web sites, hosted and served from university-owned equipment. customary policy items focusing on this privilege include:

1. general account guidelines: space limitations, backups, secure ftp requirements
2. use of school logo on personal web pages
3. statement of content responsibility or institutional disclaimer information
4. requirement to provide personal contact information
5. posting or hosting of obscene, questionable, or inappropriate content

intellectual property, copyright, or trademark

abuse of copyright, clearly a violation of federal law, is something that libraries and universities were concerned about long before computers hit the mainstream. widespread computing has introduced new avenues to potentially break copyright laws, such as peer-to-peer file sharing and dvd-movie duplication, to mention only two. a computer-use policy covering copyright will generally include:

1. general discussion of copyright and trademark law; links to comprehensive information on these topics
2. concept of educational "fair use"
3. copying or modification of licensed software, use of software as intended, use of unlicensed software
4. specific rules pertaining to electronic theses and dissertations
5. specific mention of the illegality of downloading copyrighted music and video files

appropriate- and priority-use guidelines

appropriate use is often covered in association with topics such as network security or intellectual property. however, appropriate- and priority-use rules can be an entire policy and would include:

1. mention of federal, state, and local laws
2. use of resources for theft or plagiarism
3. abuse, harassment, or making threats to others (via e-mail, instant messaging, or web page)
4. viewing material that may offend or harass others
5. legitimate versus prohibited use; use for nonacademic purposes such as commercial, advertising, political purposes, or games
6. academic freedom, internet filtering

privacy, data security, and monitoring

privacy and data security are tremendous issues within the computing environment. networking protocols and components of many software programs and operating systems by default keep track of many activities (browser history files and cache, dynamic host configuration protocol logs, and network account login logs, to mention a few). additional specialized tools can track specific sessions and provide additional information. just as credit-card companies, banks, and hospitals provide a privacy policy to their clients, so do many academic computer-use policies. such statements often address what logs are kept, how they are maintained, how they may be used, and who has access. in addition to the legitimate use of maintaining information, there is the general concept of questionable or outright malicious collection of information, through cookies, spybots, or browser hijacks. the following are concepts often addressed under the general heading of privacy:

1. cookies, spybots, other malicious software
2. what information is collected for evaluative system management and/or statistical purposes; use of cookies for this; how such information is used and reported
3. statement on routine monitoring or inspection of accounts or use; reasons information may be accessed (routine system maintenance, official university business, investigation of abuse, irregular usage patterns)
4. security of information stored on or transmitted by various campus resources
5. statement on general lack of security of public, multiuser workstations (browser cache, search history, recent documents)
6. disposition of information under certain circumstances (for example, if a student dies while enrolled, any personal university e-mail and stored files can be turned over to executor of will or parents)

abuse violations, investigations, and penalties

as policies generally are a statement of what is or is not permitted, or what is considered abuse, a clearly defined mechanism for reporting suspected abuse and policy violations can often be found. obviously, some abuse issues violate not only university policy, but also local, state, or federal law. investigations of suspected abuse are by their nature tied into the privacy and monitoring category. policy items detailing suspected abuse usually include:

1. how one can report suspected abuse
2. how requests for content, logging, or other account information are handled; how and by what entities abuse investigations are handled
3. potential penalties
4. how to appeal potential penalties; rights and responsibilities one may have in such a situation

other computer or network-based services affecting the broad student population

universities operate any number of other computer or network-based services for the broad academic community. such services may include provisioning of isp accounts, courseware, online registration, and digital institutional repositories. depending on the broad nature of these services, policy information particular to such systems can be specified at the broad policy level, especially if they have unique avenues of potential exploitation or abuse not covered in the general topics included elsewhere in the policy.
additional library-specific computer-use policy elements

many libraries elect to have their own, additional computer-use policies that serve as an adjunct to the larger university-level policy that generally governs the use of all computing resources on campus. libraries that have a formalized library computer-use policy often start with a statement of other policies governing the use of the library equipment and network, that is, references to the university policies in place. the library policy may choose to include or paraphrase parts of the university policy deemed especially important or otherwise applicable to the specific library environment. important concepts governing university policies apply equally to library policies: purpose and comprehensiveness, visibility, and frequent review. libraries that have formalized computer-use policies often link them under common library web-site sections such as "information about the libraries," or "about the libraries." library policies can help address items unique, special, or otherwise worthy of elaboration, such as specific systems in place or situations that may arise. they can also help provide guidelines and strategies to aid staff in policy enforcement. as an example of a library computer-use policy, appendix b provides the main unlv libraries computer-use policy.

public versus student use: allowances and priority use

many of the other entities on a university campus do not deal daily with the community at large (the non-university affiliates) as do academic libraries. this applies to most if not all public institutions, as well as many private institutions. the degree to which academic libraries embrace community users varies widely; a policy often states which user groups are the primary clients. such policy statements may discuss who may use what computers, what software components they have access to, and when access is allowed.
in some cases, levels of access for students and the community are basically the same. community users may be allowed to use all software installed on the pc. more often, separate pcs with smaller software sets have been configured for community users or for specific access to government documents. in some cases, libraries allow some or all pcs to be used by anyone, student or nonstudent, but have technically configured the pc or network to prevent the community at large from using the full software set (such as common productivity suites). however, community users may be limited from using the productivity software (such as microsoft word) found on these pcs. they may be restricted from using pcs on upper floors, or those reserved for special purposes, such as high-end graphics-development workstations. in addition, during crunch time (midterms and final exams), community users are often restricted to the few pcs set up and configured to allow access only to the library web page (not the web at large) and the online catalog. in addition, only students and staff can plug in their personal laptops to the library and campus network. regardless of whether it is crunch time, nonstudent users can be asked to leave if all pcs are in use and students are waiting. an in-house-authored program identifies accounts and whether particular users are students or nonstudents. more and more government information is available online. for libraries serving as government document repositories, all users have the right to freely access information distributed by the government.
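the access distinctions described above can be sketched as a small lookup. this is a minimal python illustration, not the in-house program mentioned in the text: the resource labels, the `Session` fields, and the exact allowances for each group are assumptions chosen to mirror the examples given (community users limited to the library web page and catalog during crunch time, and excluded from productivity software and laptop network ports).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    """Hypothetical description of who is sitting at a public pc, and when."""
    is_affiliate: bool  # student, faculty, or staff
    crunch_time: bool   # midterms or final exams


def allowed_resources(session):
    """Return the set of (hypothetical) resource labels the user may reach."""
    if session.is_affiliate:
        return {
            "catalog",
            "library_web",
            "open_web",
            "productivity_software",
            "laptop_network_port",
        }
    if session.crunch_time:
        # Community users: library web page and online catalog only.
        return {"catalog", "library_web"}
    # Community users outside crunch time: no productivity software,
    # no personal laptops on the network.
    return {"catalog", "library_web", "open_web"}


if __name__ == "__main__":
    print(sorted(allowed_resources(Session(is_affiliate=False, crunch_time=True))))
```

keeping the rules in one function makes it easier to revise them when the written policy changes, which the article argues should happen regularly.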
in 2005, the unlv libraries will begin limiting full web access to community users; they will only be permitted access to a limited set of web-based resources, such as government document web sites and library licensed databases. on another note, many libraries have special adaptive workstations with additional software and hardware to facilitate access to library resources by disabled citizens. disabled individuals, enrolled at the university or not, are allowed to use these adaptive workstations.

laptop checkout privileges

many libraries today check out laptops for student use. at unlv libraries, faculty, staff, and students may check out lcd projectors and library-owned laptops and plug them into the network at any of the hundreds of available locations within the main library. more details on these privileges can be found in the article "bringing them in and checking them out: laptop use in the modern academic library."3 as the university does not otherwise check out laptops to users or allow students to plug in their own laptops to the wired university network, the libraries had to come up with these additional specific policies.

licensed electronic resources: terms and conditions

academic libraries are generally the gatekeepers to many citation and full-text databases and electronic journals. each of the myriad subscription vendors has terms of use, violations of which can carry harsh penalties. for example, the unlv libraries had an incident where a vendor temporarily cut off access to its resource due to potential abuse detected from a single student. in this case, the user was downloading multiple pdf full-text files in an automated manner. this illustrates the need to have some statement in a library policy outlining the existence of such additional terms of use. vendors generally place a link at the top page of each of their resources related to this.
libraries should at least point out the existence of such terms of use, for greater visibility and better compliance. in addition, some electronic resources have licensing agreements that simply do not permit community-user access. in these cases, library policy can simply state that some licensed resources may be accessed only by university affiliates.

electronic reserves

many libraries have set up electronic reserves systems to help distribute electronic full-text documents and streaming media content, among other things. additional policies may govern the use of such systems, such as making the system available only to currently enrolled students, and providing some boundaries in terms of what is acceptable for mounting on such a system. in addition, there is the whole area of copyright. e-reserve systems often have built-in methods to help better enforce copyright compliance in the electronic arena. additional policy statements can help educate faculty members on particulars related to copyright and e-reserves.

offsite access to licensed electronic resources

many libraries provide offsite access to their licensed resources to legitimate users via proxy servers or other methods. the policy regarding such access may address things such as who is permitted to access resources from offsite (such as students, staff, and faculty), and the requirement that the user be in good standing (such as no outstanding library-book fines). in some instances, universities have implemented broad authentication systems that, once logged on from an offsite location, allow the user into a range of university resources, including, potentially, library-licensed electronic resources. if such is the case, information pertaining to offsite access may be found in a higher-level policy.
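the two offsite-access conditions mentioned above (permitted user group and good standing) can be expressed as a short check. the python sketch below is illustrative only: the affiliation labels, the function name, and the use of a zero fine balance as the test for good standing are assumptions, and it does not represent how any particular proxy server or authentication system works.

```python
# Hypothetical affiliation labels for users permitted offsite access.
ELIGIBLE_AFFILIATIONS = {"student", "staff", "faculty"}


def may_access_offsite(affiliation, outstanding_fines_cents):
    """Return True if the user is in a permitted group and in good standing.

    Good standing is modeled here, as an assumption, as owing no library fines.
    """
    if outstanding_fines_cents < 0:
        raise ValueError("fine balance cannot be negative")
    in_permitted_group = affiliation in ELIGIBLE_AFFILIATIONS
    in_good_standing = outstanding_fines_cents == 0
    return in_permitted_group and in_good_standing


if __name__ == "__main__":
    print(may_access_offsite("student", 0))
    print(may_access_offsite("community", 0))
```

a library whose policy tolerates small balances would replace the zero test with its own threshold.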
electronic reference transactions

many libraries have installed (or plan to install) virtual-reference systems, or, at a minimum, have a simple e-mail reference service ("ask a librarian"). in addition, many collect library feedback or survey information through simple forms. in all cases, a record exists of the transaction. with virtual-reference systems, the record can include chat logs, e-mail reference inquiries, and urls of web pages accessed during the transaction. a policy governing the use of electronic-reference systems may address such things as which clientele may use the system; a statement on the confidentiality of the transaction; or a statement on whether the library maintains the electronic-transaction details. items such as hours of operation and response time to an e-mail question could be considered more procedural or informational than a policy issue.

statements on information literacy

while perhaps not a policy per se, many libraries have a computer-use policy statement to the effect that while the library may provide links to certain information, this does not serve as an endorsement or guarantee that the information is accurate, up-to-date, or has been verified. (such a statement posted on the library web site may provide additional exposure to the maxim that all that glitters is not gold.) statements that libraries do not regulate, organize, or otherwise verify the general mass of information on the internet may be included. obviously, many libraries have separate instruction sessions, awareness programs, and overall mission goals geared toward information literacy.

principles on intellectual freedom and internet filtering

statements by the american library association (ala) on intellectual freedom and internet filtering may well appear in an institutional policy and often are included in library policies. filtering is something more likely to affect public and school libraries as opposed to academic libraries.
still, underage children can and do use academic libraries. in such an environment, they may be intentionally or unintentionally exposed to questionable or obscene material. thus, a library computer-use policy can express the general concept behind the following:

1. intellectual freedom (freedom of speech; free, equal, unrestricted access);
2. the fact that academic libraries provide a variety of information expressing a variety of viewpoints;
3. the fact that this information is not filtered; and
4. the responsibility of parents to be aware of what their children may be viewing on library pcs.

some libraries have provided policy links to various sets of information from the office of intellectual freedom at ala's web site, such as:

1. ala code of ethics
2. ala bill of rights
3. intellectual freedom principles for academic libraries: an interpretation of the library bill of rights
4. access to electronic information, services, and networks: an interpretation of the library bill of rights

some libraries also provide references to ala information pertaining to the usa patriot act and how law-enforcement inquiries are handled.

summary

computing is a vitally important tool in the academic environment. university and library computing resources receive constant and growing use for research, communication, and synthesizing information. just as computer use has grown, so have the dangers in the networked computing environment. universities often have an overarching policy or policies governing the general use of computing technology that help to safeguard the university equipment, software, and network against inappropriate use. libraries often benefit from having an adjunct policy that works to emphasize the existence and important points of higher-level policies, while also providing a local context for systems and policies pertinent to the library in particular.
having computer-use policies at the university and library levels helps provide a comprehensive, encompassing guide for the effective and appropriate use of this vital resource.

references

1. the american heritage college dictionary, 3rd edition. (boston: houghton, 1997), 1058.
2. board of visitors of the university of virginia, "responsible computing at u.va.: a handbook for students." accessed june 2, 2004, www.itc.virginia.edu/pubs/docs/respcomp/rchandbook03.html.
3. jason vaughan and brett burnes, "bringing them in and checking them out: laptop use in the modern academic library," information technology and libraries 21 (2002): 52-62.

appendix a. systemwide, institutional, and library computing policies at unlv

columns, in order: scs nevadanet policy*; uccsn computing resources policy**; unlv student computer-use policy***; unlv policy for posting information on the web†; unlv libraries guidelines for library computer use††; unlv libraries additional policies†††

general
direct evident link or references to higher-level institutional/system computer-use policy | x x x
author/authority information included | x x x
approved/revised date included | x x x x

network and workstation security
disruption of other computer systems/networks; deliberately altering or reconfiguring system files; ftp servers/peer-to-peer file sharing/operation of other bandwidth-intensive services | x x x
creation of a virus; propagation of a virus | x x x x
attempts at unauthorized access/theft of account ids or passwords | x x x x x
password information: individual user's need to maintain a strong, confidential password
intentionally view, copy, modify, or delete other users' files | x x x x
requirement to secure restrictions on stored files
recommendation/requirement to back up files
statement of ownership regarding equipment/software | x
intentional physical damage: tampering with or marking, reconfiguring equipment or infrastructure | x x x
food and drink policies | x

personal hardware and software
connecting personal laptops, etc. to university wired or wireless network(s)
use of current and up-to-date patched operating systems and antivirus programs running on personal equipment attached to network
connecting/inserting/interfacing personal hardware with existing university equipment; liability regarding physical damage or data loss
limiting access to personal equipment/report immediately if stolen
downloading or installation of personal or otherwise additional software onto university equipment
use of personal technology in classroom/test-taking environments

printing
mass printing of flyers or newsletters
tampering with or trying to load anything into paper trays
per-sheet print costs
refund policies
additional commonsense guidelines

e-mail
hiding identity/forging an e-mail address
initiating spam

marks for the personal hardware and software, printing, and e-mail rows above, grouped under column headings as printed:
scs nevadanet policy* | x
uccsn computing resources policy** | x x x x
unlv student computer-use policy***; unlv policy for posting information on the web† | x x
unlv libraries guidelines for library computer use†† | x x x x x x x
unlv libraries additional policies††† | x x
e-mail (cont.)
subscribing others to mailing lists
dissemination of obscene material or web links to such material | x x
general guidelines on e-mail privileges, such as the size of an e-mail account, how long an account can be used after graduation, etc.

personal web site
specific general account guidelines | x
use of school name and logo
statement of content responsibility/institutional disclaimer information | x
requirement to provide personal contact information | x
posting and hosting of obscene, questionable, or inappropriate material | x

intellectual property, copyright, and trademark
general discussion of copyrights and trademark law; links to comprehensive information on these topics | x x x
the concept of educational fair use | x
copying or modifying licensed software/use of software as intended/use of unlicensed software | x x x
specific rules pertaining to electronic theses and dissertations

appropriate- and priority-use guidelines
mention of federal, state, and local laws | x x x
use of resources for theft/plagiarism | x
abuse, harassment, or making threats to others (via e-mail, instant messaging, web page, etc.) | x x x x
viewing material which may offend others | x
legitimate versus prohibited use; use for nonacademic purposes (commercial; advertising; political purposes; games; etc.) | x x x x x
academic freedom; internet filtering | x x x x

privacy
cookies, spybots, other malicious software
what information is collected for evaluative/system management/statistical purposes; use of cookies for this
statement on routine monitoring or inspection of accounts or use; reasons information may be accessed | x x
security of information stored on or transmitted by various campus resources | x
statement on general lack of security of public, multi-user workstations
disposition of information under certain circumstances

abuse violations, investigations, and penalties
how one can report suspected abuse
how requests for content, logging, or other account information are handled; how and by what entities abuse investigations are handled
potential penalties
how to appeal potential penalties; rights/responsibilities you may have in such a situation

other computer/network-based services affecting the broad student population

library-specific
public versus student use: allowances and priority use
right to access government information
assistance for persons with disabilities
laptop, lcd projector, etc. checkout privileges
licensed electronic resources: terms and conditions
offsite access to licensed electronic resources: who can access from offsite
electronic reference transactions
statements on information literacy

marks for the abuse, other services, and library-specific rows above, grouped under column headings as printed:
scs nevadanet policy* | x x x
uccsn computing resources policy**; unlv student computer-use policy*** | x x x
unlv policy for posting information on the web† | x x x
unlv libraries guidelines for library computer use††; unlv libraries additional policies††† | x x x x x x

ala principles on academic freedom/internet filtering
electronic reserves; copyright as it pertains to electronic reserves

marks for the two rows above, grouped under column headings as printed:
unlv libraries guidelines for library computer use††; unlv libraries additional policies††† | x x

notes

* the systems computing services nevadanet policy. among other responsibilities, scs provides and maintains the general internet connectivity for nevada's higher education institutions, including unlv. the complete document can be accessed at www.scs.nevada.edu/nevadanet/nvpolicies.html.
** the university and community college system of nevada computing resources policy. uccsn is the system of higher education institutions in the state of nevada, governed by an elected board of regents. the complete document can be accessed at www.scs.nevada.edu/about/policy061899.html.
*** the complete document can be accessed at www.unlv.edu/infotech/itcc/scup.html.
† the complete document can be accessed at www.unlv.edu/infotech/itcc/www_policy.html.
†† the primary unlv libraries policy governing student computer use. provided in appendix b, the complete document can also be accessed at www.library.unlv.edu/services/policies/computeruse.html.
††† various other policies are in effect at the unlv libraries. some of these can be accessed at www.library.unlv.edu/services/policies/computeruse.html.

appendix b. unlv university libraries guidelines for library computer use

in pursuit of its goal to provide effective access to information resources in support of the university's programs of teaching, research, and scholarly and creative production, the university libraries have adopted guidelines governing electronic access and use of licensed software. all those who use the libraries' public computers must do so in a legal and ethical manner that demonstrates respect for the rights of other users and recognizes the importance of civility and responsibility when using resources in a shared academic environment.

authorized users

to gain authenticated access to the libraries' computer network, all users of the university libraries public computers must be officially registered as a library borrower, a library computer user, or a guest user. a photo id is required. (exceptions may be made as needed when access to federal depository electronic resources is required.) priority use is granted to unlv students, faculty, and staff. as need arises, access restrictions may be imposed on nonuniversity users. in accordance with licensing and legal restrictions, nonuniversity users are restricted from using word-processing, spreadsheet, and other productivity and high-end multi-media software. during high-demand times, all users may have time restrictions placed on their computer use. if requested by library staff, all users must be prepared to show photo id to confirm their user status.

authorized and unauthorized use

public computers are to be used for academic research purposes only.
electronic information, services, software, and networks provided directly or indirectly by the university libraries shall be accessible, in accordance with licensing or contractual obligations and in accordance with existing unlv and university and community college system of nevada (uccsn) computing services policies (uccsn computing resources policy www.scs.nevada.edu/about/policy061899.html; unlv faculty computer use policy www.unlv.edu/infotech/itcc/fcup.html; student computer use policy http://ccs.unlv.edu/scr/computeruse.asp). users are not permitted to:
1. copy any copyrighted software provided by unlv. it is a criminal offense to copy any software that is protected by copyright, and unlv will treat it as such
2. use licensed software in a manner inconsistent with the licensing arrangement. information on licenses is available through your instructor
3. copy, rename, alter, examine, or delete the files or programs of another person or unlv without permission
4. use a computer with the intent to intimidate, harass, or display hostility toward others (sending offensive messages or prominently displaying material that others might find offensive such as vulgar language, explicit sexual material, or material from hate groups)
5. create, disseminate, or run a self-replicating program ("virus"), whether destructive in nature or not
6. use a computer for business purposes
7. tamper with switch settings, move, reconfigure, or do anything that could damage terminals, computers, printers, or other equipment
8. collect, read, or destroy output other than your own work without the permission of the owner
9. use the computer account of another person with or without their permission unless it is designated for group work
10. use software not provided by unlv
11. access or attempt to access a host computer, either at unlv or through a network, without the owner's permission, or through use of log-in information belonging to another person
internet and web use
the university libraries cannot control the information available over the internet and are not responsible for its content. the internet contains a wide variety of material, expressing many points of view. not all sources provide information that is accurate, complete, or current, and some may be offensive or disturbing to some viewers. users should properly evaluate internet resources according to their academic and research needs. links to other internet sites should not be construed as an endorsement by the libraries of the content or views contained therein. the university libraries respect the first amendment and support the concept of intellectual freedom. the libraries also endorse ala's library bill of rights, which supports access to information and opposes censorship, labeling, and restricting access to information. in accordance with this policy, the university libraries do not use filters to restrict access to information on the internet or web. as with other library resources, restriction of a minor's access to the internet or web is the responsibility of the parent or legal guardian.
printing
users are charged for printing no matter who supplies the paper. mass production of club flyers, newsletters, and posters is strictly prohibited. if multiple copies are desired, users need to go to an appropriate copying facility such as campus reprographics. contact a staff member when using the color laser printer to avoid costly mistakes. the university libraries reserve the right to restrict user printing based on quantity and content (such as materials related to running an outside business).
copyright alert
many of the resources found on the internet or web are copyright protected.
although the internet is a different medium from printed text, ownership and intellectual property rights still exist. check the documents for appropriate statements indicating ownership. most of the electronic software and journal articles available on library servers and computers are also copyrighted. users shall not violate the legal protection provided by copyrights and licenses held by the university libraries or others. users shall not make copies of any licensed or copyrighted computer program found on a library computer.
use of personal laptops and other equipment
students, faculty, and staff of the university are welcome to bring laptops with network cards and use them with our data drops to gain access to our network. the laptop must be registered in our laptop authentication system, and a valid library barcode is also required. users are responsible for notifying the library promptly if their registered laptop is lost or stolen, since they may be held responsible if their laptop is used to access and damage the network. users taking advantage of this service are required to abide by all uccsn and unlv computer policies. the libraries allow the use of the universal serial bus (usb) connections located in the front of the workstations. this includes use with portable usb-based devices such as flash-based memory readers (memory sticks, secure digital) and digital camera connections. the patron assumes all responsibility in attaching personal hardware to library workstations. the libraries are not responsible for any damage done to patron-owned items (hardware, software, or personal data) as a result of connecting such devices to library workstations. as with any use of library workstations, patrons must adhere to all uccsn, unlv, and university libraries' computing and network-use policies.
patrons are responsible for the security of their personal hardware, software, and data.
inappropriate behavior
behavior that adversely affects the work of others and interferes with the ability of library staff to provide good service is considered inappropriate. it is expected that users of the libraries' public computers will be sensitive to the perspective of others and responsive to library staff's reasonable requests for changes in behavior and compliance with library and university policies. the university libraries and their staff reserve the right to remove any user(s) from a computer if they are in violation of any part of this policy and may deny further access to library computers and other library resources for repeat offenders. the libraries will pursue infractions or misconduct through the campus disciplinary channels and law enforcement as appropriate.
revised: march 3, 2004 updated: thursday, may 13, 2004 content provider: wendy starkweather, director of public services
hathitrust as a data source for researching early nineteenth-century library collections: identification, coverage, and methods
julia bauder
information technology and libraries | december 2019 14
julia bauder (bauderj@grinnell.edu) is associate professor and social studies and data services librarian, grinnell college.
abstract
an intriguing new opportunity for research into the nineteenth-century history of print culture, libraries, and local communities is performing full-text analyses on the corpus of books held by a specific library or group of libraries.
creating corpora using books that are known to have been owned by a given library at a given point in time is potentially feasible because digitized records of the books in several hundred nineteenth-century library collections are available in the form of scanned book catalogs: a book or pamphlet listing all of the books available in a particular library. however, there are two potential problems with using those book catalogs to create corpora. first, it is not clear whether most or all of the books that were in these collections have been digitized. second, the prospect of identifying the digital representations of the books listed in the catalogs is daunting, given the diversity of cataloging practices at the time. this article will report on progress towards developing an automated method to match entries in early nineteenth-century book catalogs with digitized versions of those books, and will also provide estimates of the fractions of the library holdings that have been digitized and made available in the google books/hathitrust corpus. introduction digital libraries such as google books and hathitrust have created tantalizing opportunities for research into the history of american culture: automated analyses of the entire corpus of books published at a given point in time. the attraction of this prospect is most clearly demonstrated by the avalanche of papers written using the google books ngram data, which provides counts over time of the words and phrases used in the works that make up the google books corpus. 
as soon as this data became available in 2009, it was used to make arguments about social, linguistic, and other changes over time as reflected in changes in the words used in print.1 however, for nearly as long, other researchers have been cautioning that the google books corpus is not a representative sample of publishing output, let alone of what the public at large was actually reading in a given year, and that its unrepresentativeness makes it dangerous to draw sweeping conclusions from this data.2
hathitrust as a data source | bauder 15 https://doi.org/10.6017/ital.v38i4.11251
one potentially feasible solution to the problem of unrepresentativeness in the google books corpus would be to use corpora based on the holdings of a specific library or a group of libraries. using library holdings to form corpora helps to remedy some known issues with using the google books corpus as an indicator of social change, such as the fact that many books did not become popular and/or widely available until well after their official publication date, and that some prolific authors who contributed hundreds of thousands of words to the google books corpus were never as widely purchased and read as authors who wrote a single, short, best-selling work.3 although using books held by a set of libraries at a given time as the corpus has its own problems of unrepresentativeness—particularly, for long-established libraries, the fact that the books on the shelf at a given time represent not only works of interest to current users but also those of interest to users from decades past—triangulating this data with that provided by the google books ngram data would at least give some sense of whether and where these different corpora disagree.4 creating corpora using books that are known to have been owned by a given library at a given point in time is potentially feasible because digitized records of the books in several hundred nineteenth-century library
collections are available in the form of scanned book catalogs: a book or pamphlet listing all of the books available in a particular library. however, there are two potential problems with using those book catalogs to create corpora. first, it is not clear whether most or all of the books that were in these collections have been digitized, incorporated into google books and hathitrust, and hence made available for ngram analyses. second, the prospect of identifying the digital representations of the books listed in the catalogs is daunting, as both widely agreed-upon cataloging standards and universal identifiers were not adopted until late in the nineteenth century. this article will report on progress towards developing a fully-automated method to match entries in early nineteenth-century book catalogs with digitized versions of those books, and will also provide estimates of the fractions of the library holdings that have been digitized and made available in the google books/hathitrust corpus. methods practical considerations dictated using data from hathitrust rather than from google books for this research. the hathitrust corpus, although not perfectly coextensive with the google books corpus, has very substantial overlap with it. the hathitrust digital archive was founded in 2008, when a group of large academic libraries formed a collaboration to archive and disseminate their digitized books. the vast majority of those digitized books—around 95 percent, as of mid-2017— had originally been scanned as part of the google books project; the agreements that google books entered into with the libraries typically stipulated that google had to provide the library with a digital copy of each book scanned from that library. 5 it was necessary to use hathitrust rather than google books as the comparison corpus because the metadata for the titles in hathitrust is readily available in ways that the google books metadata is not, including as bulk marc-data downloads. 
the libraries included in this analysis are social libraries, which were a type of quasi-public library that predated the now-standard, tax-supported public library in the united states. these libraries were privately owned and operated, but were open to some large portion of the population of a particular area who were willing and able to pay a fee or buy a share to belong to the library. although the presence or absence of a book in social library collections is not a perfect indicator of the book’s popularity—most social libraries pointedly refused to purchase the “trashy” but widely read sensational fiction of the day—it is a defensible proxy (although with some caveats, as noted above) for the popularity of the “serious” literature and nonfiction works that made up the bulk of these libraries’ collections. roughly one hundred social library book catalogs published between 1800 and 1860 can be found in hathitrust.6 for the purposes of the present study, attention was focused on the thirteen library catalogs from ten different american libraries that were published between 1776 and 1825. (a list of these catalogs can be found in appendix a.) these catalogs were chosen because they are likely to present the worst-case scenario in terms of both of the challenges mentioned above: the highest percentage of rare and extremely old books, which google’s partner libraries would have been least likely to permit to be scanned by google, and, presumably, the most primitive and eclectic cataloging practices. to the extent that it was possible to do so, this analysis focused on book-length monographs. when serials or pamphlets were listed in a separate section of the catalog, those catalog pages were excluded from the process by which entries were extracted from the catalogs and parsed into csv files.
serials present particularly intractable matching problems: not only are the original catalogs often unclear about which specific volumes were held, but also hathitrust’s marc data does not always clearly indicate which volumes are available in hathitrust either. pamphlets have limited coverage in hathitrust. the selected catalogs were downloaded from hathitrust as pdfs, and the pdftotext software was used to extract the ocr data from the relevant pages of the scans as hocr (a file format for ocr that includes information about where each word is located on the page in addition to the words themselves).7 then cleaning scripts were created that parsed the hocr data into csv files for analysis, with one catalog entry per line of the csv file.8 given the widely varied cataloging practices of the early nineteenth century, several different cleaning scripts were written, each tailored to a particular catalog format. for example, many of the catalogs had entries that spanned multiple lines (see figures 1 and 2), so the scripts for those catalogs had to be able to identify when each new entry started. many catalogs had extraneous information, such as the name of the donor of the book or the size of the book, that had to be filtered out (see figure 1; f, q, o, and d refer to the size of the book: folio, quarto, octavo, or duodecimo). in addition, various forms of dittoes were frequently used in these catalogs (see figures 1, 2, and 3), so one of the tasks for the cleaning scripts was to identify the dittoes and replace them with the correct words from the previous entry.
figure 1. library company of philadelphia, a catalogue of the books belonging to the library company of philadelphia: to which is prefixed, a short account of the institution, with the charter, laws, and regulations (philadelphia, pa: printed by bartram & reynolds, 1807), 5.
figure 2.
library company of baltimore, a catalogue of the books, &c. belonging to the library company of baltimore: to which are prefixed the act for the incorporation of the company, their constitution, their by-laws, and an alphabetical list of the members (baltimore, md: printed by eades and leakin, 1809), 46.
figure 3. washington library company, catalogue of books in the washington library (washington, dc: printed by anderson and meehan, 1822), 17.
unfortunately, the horizontal-line dittoes seen in figures 1 and 2—a type of ditto which is used in seven of the thirteen catalogs—are represented inconsistently or not at all in the hocr, so they cannot reliably be used to identify places where words need to be carried down from the previous entry. for the catalog of the library company of philadelphia, from which figure 1 was taken, the numbers after the horizontal-line dittoes (which identify the books’ locations on the shelves) can be used to distinguish between a line that is indented because it is a continuation of the entry above and a line that is indented but is the start of a new entry. in theory, a cleaning script for the catalog of the library company of baltimore (figure 2) could use a similar process to identify the last line of an entry by watching for the right-justified count of volumes at the end of each entry. when a right-justified digit was encountered, the script could then carry down the first word from that entry if the first word in the next entry was indented. however, these isolated digits were also not handled well by the ocring process: many do not appear in the hocr file at all, and those that do are as likely to be ocred as a colon, an exclamation point, a capital i, etc., as they are to be a digit. hence, the three catalogs of the library company of baltimore, which use this format and have this ocr issue, were not analyzed for this project.
table 1.
results of verification
library | date founded if known, or inc. if not known9 | date catalog printed | number of spreadsheet entries | number of entries hand-verified | hand-verified entries that cannot be positively identified | hand-verified, positively identifiable entries that are not in hathitrust | positively identifiable entries successfully matched when work was in hathitrust
library company of philadelphia | 1731 | 1807 | 7619 | 128 | 0% | 16.9% | 79.8%
horsham library company | 1808 | 1810 | 143 | 143 | 28.4% | 5.1% | 79.8%
salem (ma) athenaeum | inc. 1810 | 1811 | 1585 | 130 | 0.8% | 11.3% | 72.3%
new york society library | 1754 | 1813 | 4522 | 135 | 5.7% | 17.9% | 76.1%
providence library company | 1753 | 1818 | 688 | 688 | 17.1% | 9.4% | 87.2%
apprentices’ library (new york, ny)10 | 1820 | 1820 | 1811 | 124 | 34.4% | 15.0% | 69.7%
washington (dc) library company | inc. 1814 | 1822 | 900 | 124 | 12.9% | 3.2% | 83.7%
boston library | inc. 1794 | 1824 | 2273 | 138 | 4.1% | 11.1% | 82.5%
mercantile library (new york, ny) | 1820 | 1825 | 1386 | 138 | 0% | 11.3% | 86.0%
the catalogs of the other nine libraries could all be parsed with an acceptable success rate and, with one exception, were included. the exception was the salem athenaeum’s 1818 catalog, which was identical in format and nearly identical in content to the athenaeum’s 1811 catalog. given the overwhelming similarity it was decided to include only one of the catalogs; given that the goal of this analysis was to try to use the worst-case-scenario catalogs, the older of the two catalogs was chosen for inclusion. once the catalogs were parsed into csv files, they were run through another script that attempted to match each entry in the catalog against metadata from hathitrust.
in february 2019, marc records containing metadata for 2,824,875 public-domain titles in hathitrust were downloaded from hathitrust via their oai feed and ingested into a local apache solr index for searching and matching, using code from the solrmarc and vufind projects.11 because of ocr errors in the catalog files and mistakes in the original catalogs, many of the words in the entries have one or more character-level errors. therefore, solr’s fuzzy searching option was used, which allows words to match as long as the levenshtein distance between them is two or less. (the levenshtein distance is the number of edits, such as changing one letter to another or adding or deleting a letter, it would take to turn one word into the other.) no attempt was made to match specific editions; as can be seen from the excerpts in figures 2 and 3, many of the catalogs do not contain sufficient detail to do so, even if it were desirable. the goal was merely to determine whether the text of that work, from any edition, was contained in the hathitrust corpus. once the catalogs had been checked against hathitrust, a sample of the entries was hand-verified. for the two smallest catalogs, the horsham library company and the library company of providence, all entries were hand-verified. for the other catalogs, a random sample of approximately 130 items (+/- 10) was selected. microsoft excel’s random-number generator was used to assign each line in the csv file a number between 0 and 1, and then the lowest 1.5 percent to 12.5 percent (depending on the number of items in the catalog) were examined.
results
percentage of works included in hathitrust
a minimum of four of the books in every catalog examined was missing from hathitrust. as can be seen in table 1, the fraction of books from the hand-verified sample that was missing from hathitrust ranged from 3.2 percent for the washington library company to just shy of 18 percent for the new york society library.
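the fuzzy-matching rule described in the methods above (two words match when their levenshtein distance is two or less) can be illustrated with a short python sketch. this is not the code used in the study, which relied on solr's built-in fuzzy search; the function names `levenshtein` and `fuzzy_match` and the sample words are hypothetical, chosen only to show how a word with one or two ocr errors can still match, and solr's own distance calculation may differ in detail.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character edits (substitute, insert, delete)
    needed to turn string a into string b."""
    # previous[j] is the distance between the part of a seen so far
    # (before the current character) and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # turning i characters of a into "" takes i deletions
        for j, char_b in enumerate(b, start=1):
            substitution = previous[j - 1] + (char_a != char_b)
            insertion = current[j - 1] + 1
            deletion = previous[j] + 1
            current.append(min(substitution, insertion, deletion))
        previous = current
    return previous[-1]


def fuzzy_match(word: str, candidate: str, max_distance: int = 2) -> bool:
    """Hypothetical helper: treat two words as a match when their
    edit distance is at most max_distance (two, as in the article)."""
    return levenshtein(word, candidate) <= max_distance


if __name__ == "__main__":
    # "geograpby" stands in for an ocr misreading of "geography".
    print(fuzzy_match("geography", "geograpby"))
    print(fuzzy_match("rome", "paris"))
```

a threshold of two tolerates the isolated character errors typical of ocr while still rejecting unrelated words, although for very short words it also admits many false matches.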
the library company of philadelphia, at 16.9 percent missing, had the second-highest percentage missing. it is not surprising that these two libraries, as two of the oldest and most venerable libraries in the united states at the time, owned the most books that are not represented in hathitrust, as both have a high percentage of very old and rare works. however, not all of the books from these collections that are not represented in hathitrust fall into that category. only six of the twenty missing works from the library company of philadelphia sample, and no more than eight of twenty-two from the new york society library, were published before 1700, for example.12
percentage of works that cannot be positively identified
as can be seen in figures 1 through 3, some catalogs provided relatively full titles (figures 1 and 2), while others described the works in only two or three words each (figure 3). as might be expected, it is much easier to positively identify the works when fuller titles are provided, although two or three words proved to be enough to identify the work unambiguously the majority of the time. (all of the titles shown in figure 3 can be positively identified, for example.) in the samples taken from the nine catalogs, the percentage of titles that were unidentifiably ambiguous ranged from 0 percent (library company of philadelphia, mercantile library of new york) to more than one in four (apprentices’ library of new york, 34.4 percent; horsham library company, 27.9 percent). the apprentices’ library of new york and the horsham library company were particularly problematic because they frequently omitted the name of the author, in addition to greatly compressing the title; without an author name, titles such as modern geography (apprentices’ library) and history of rome (horsham library company) present far too many potential matches.
however, even including the author’s name does not make all greatly compressed entries identifiable. one particularly egregious example comes from the library company of providence’s 1818 catalog, which contains an entry reading “bell’s inquiry.” the list of candidates for this work includes a practical inquiry into the authority, nature, and design of the lord’s supper, by william bell; an inquiry into the causes which produce, and the means of preventing diseases among british officers, soldiers, and others in the west indies, by john bell; and inquiry into the policy and justice of the prohibition of the use of grain in distilleries, by archibald bell.
figure 4. new york society library, a catalogue of the books belonging to the new-york society library (new york: printed by c. s. van winkle, 1813), 139.
success rates for the parsing and matching scripts
when there was a single, identifiable work that matched the catalog entry, and that work was in hathitrust, the matching scripts identified it at least 70 percent of the time for every individual catalog. unsurprisingly, catalogs such as those of the horsham library company and the apprentices’ library of new york that had entries that were difficult to positively identify were also more difficult for the script to properly match, although the matching script still succeeded between roughly 70 and 80 percent of the time. for two other libraries with below-average matching results (the library company of philadelphia and the new york society library), many of the matching problems were caused by issues with the scanned catalogs that the data-cleaning scripts did not handle well.
the new york society library catalog listed out the contents of multivolume sets in a way that was difficult for the cleaning script to identify and remove (see figure 4); instead, it was common for each volume of the set to end up with its own entry in the dataset. since the hathitrust records generally do not list out the contents of each volume, it was very rare for the matching script to correctly match a set based on an entry for one volume in the set. twenty-seven percent (six out of 22) of the missed matches from that sample failed because of this table-of-contents issue. for the library company of philadelphia, the problem lies with a quirk in the hocr where the character heights for many of the horizontal-line dittoes are extremely high—around twenty pixels, when the text around those dittoes is typically around ten pixels high. it appears as if the ocr program may have treated each horizontal-line ditto as an em dash and assigned it a height that would be proportional for an em dash of that length. these extra-tall line heights for the first “word” on the line cause issues with the algorithm that processes the text line-by-line, causing some entries to be inappropriately divided across two entries in the data spreadsheets. unsurprisingly, the matching script had difficulty identifying the correct work in hathitrust when it was trying to match based on only half of the book’s title.
conclusions
although not a complete success, the results of this study provide hope that it might be possible to create full-text corpora based on the works in individual libraries with minimal manual labor, with a few caveats. the first caveat is that the digitized catalogs of those libraries must meet certain specifications:
1) the catalog is formatted, and has been ocred, in such a way that it is consistently possible to parse the catalog line-by-line and to identify algorithmically where each entry starts and ends.
2) the catalog provides at least the authors’ last names, if not their full names, plus a more-or-less complete and accurate transcription of the title proper.
3) either the catalog contains minimal extraneous information (such as tables of contents or donors’ names), or the extraneous information is consistently formatted in a way that it can be algorithmically identified and removed.

the second caveat is that even if all of these conditions are met, the full-text corpora that can be created will probably still be missing some small percentage of the books available in that library. one potential direction for future research could be more closely examining the books that are absent from hathitrust to see if there are any commonalities among them that might bias research done using these corpora, or if the missing works can safely be treated as random omissions. on the other hand, as was noted above, the catalogs used in this study represent a likely worst-case scenario for being able to positively identify the works listed in the catalogs and for those works being present in hathitrust. another promising avenue for future research would be to repeat this analysis on catalogs from the mid-to-late nineteenth century to see if the works in those catalogs are in fact more likely to exist in the hathitrust corpus.

information technology and libraries | december 2019 22

appendix a: american library catalogs from 1776 to 1825 included in hathitrust

boston library, catalogue of books in the boston library, june, 1824, boston: munroe and francis, 1824, http://hdl.handle.net/2027/hvd.32044080249337.

general society of mechanics and tradesmen of the city of new york, catalogue of the apprentices’ library, instituted by the society of mechanics and tradesmen of the city of new-york, on the 25th november, 1820: with the names of the donors: to which is added, an address delivered on the opening of the institution by thomas r. mercein, a member of the society.
new york: printed by william a. mercein, no. 93 gold-street, 1820, http://hdl.handle.net/2027/nnc2.ark:/13960/t8md1cv2t.

horsham library company, the constitution, bye-laws, and catalogue of books, of the horsham library company. philadelphia, pa: j. rakestraw, 1810, http://hdl.handle.net/2027/nnc1.cu55910696.

library company of baltimore, a catalogue of the books, &c. belonging to the library company of baltimore: to which are prefixed the act for the incorporation of the company, their constitution, their by-laws, and an alphabetical list of the members. baltimore, md: printed by eades and leakin, 1809, http://hdl.handle.net/2027/nyp.33433069263907.

library company of baltimore, a supplement to the catalogue of books, &c. belonging to the library company of baltimore. baltimore, md: printed by j. robinson, 1816, http://hdl.handle.net/2027/nyp.33433069263899.

library company of baltimore, a supplement to the catalogue of books, &c. belonging to the library company of baltimore. baltimore, md: printed by j. robinson, 1823, http://hdl.handle.net/2027/nyp.33433069263899.

library company of philadelphia, a catalogue of the books belonging to the library company of philadelphia: to which is prefixed, a short account of the institution, with the charter, laws, and regulations. philadelphia, pa: printed by bartram & reynolds, 1807, http://hdl.handle.net/2027/nyp.33433075914816.

mercantile library association of the city of new york, catalogue of the books belonging to the mercantile library association of the city of new-york: to which are prefixed, the constitution and the rules and regulations of the same. new york: printed by hopkins & morris, 1825, http://hdl.handle.net/2027/nyp.33433057517090.

new york society library, a catalogue of the books belonging to the new-york society library. new york: printed by c. s. van winkle, 1813, http://hdl.handle.net/2027/mdp.39015023478822.
providence library company, charter and by laws of the providence library company, and a catalogue of the books of the library. providence, ri: printed by miller and hutchens, 1818, http://hdl.handle.net/2027/nyp.33433059555346.

salem athenaeum, catalogue of the books belonging to the salem athenæum, with the by-laws and regulations. salem, ma: printed by thomas c. cushing, 1811, http://hdl.handle.net/2027/hvd.32044080252174.

salem athenaeum, catalogue of the books belonging to the salem athenæum, with the by-laws and regulations. salem, ma: printed by w. palfray, 1818, http://hdl.handle.net/2027/hvd.32044080252174.

washington library company, catalogue of books in the washington library, july 20, 1822. washington, dc: printed by anderson and meehan, 1822, http://hdl.handle.net/2027/chi.098498263.

references

1 see, e.g., jean-baptiste michel et al., “quantitative analysis of culture using millions of digitized books,” science 331, no. 6014 (january 11, 2011): 176-82, https://doi.org/10.1126/science.1199644; jean m. twenge, w. keith campbell, and brittany gentile, “male and female pronoun use in u.s. books reflects women’s status, 1900-2008,” sex roles 67, nos. 9-10 (november 2012): 488-93, https://doi.org/10.1007/bf00287963; patricia m. greenfield, “the changing psychology of culture from 1800 through 2000,” psychological science 24, no.
9, 1722-31, https://doi.org/10.1177/0956797613479387.

2 eitan adam pechenick, christopher m. danforth, and peter sheridan dodds, “characterizing the google books corpus: strong limits to inferences of socio-cultural and linguistic evolution,” plos one 10, no. 10 (october 7, 2015): e0137041, https://doi.org/10.1371/journal.pone.0137041; alexander koplenig, “the impact of lacking metadata for the measurement of cultural and linguistic change using the google ngram data sets: reconstructing the composition of the german corpus in times of wwii,” digital scholarship in the humanities 32, no. 1 (april 2017): 169-88, https://doi.org/10.1093/llc/fqv037.

3 pechenick et al., 2015; lindsay dicuirci, colonial revivals: the nineteenth-century lives of early american books (philadelphia: university of pennsylvania press, 2019).

4 robert a. gross, “reconstructing early american libraries: concord, massachusetts, 1795-1850,” proceedings of the american antiquarian society 97, no. 1 (january 1, 1987): 331-451.

5 jennifer howard, “what ever happened to google’s effort to scan millions of university library books?,” edsurge, august 20, 2017, https://www.edsurge.com/news/2017-08-10-what-happened-to-google-s-effort-to-scan-millions-of-university-library-books.

6 book catalogs fell out of favor in the latter half of the nineteenth century as library collections became larger and more dynamic, making book catalogs much more difficult and expensive to compile and to keep up to date. by the end of the nineteenth century, book catalogs had largely been replaced by the card catalog system that remained in use through most of the twentieth century. although card catalogs were far superior for their primary purposes (maintaining an inventory of books presently owned by the library and allowing library users to locate the books that they wanted), they leave no permanent record of the books listed in the catalog at any particular point in time.
7 information about pdftotext can be found at https://manpages.debian.org/testing/poppler-utils/pdftotext.1.en.html.

8 the cleaning scripts, as well as data and other code used in this project, are available in https://github.com/julia-bauder/library-catalog-analysis-public.

9 the founding and incorporation dates were taken from the prefatory texts in the book catalogs used in this analysis, as listed in appendix a.

10 the scan of this catalog that is available from hathitrust is missing pages 3-6.

11 apache solr is a widely used, open-source search platform. solrmarc is a utility that can be used to index marc records into solr. vufind is an open-source library discovery layer built in part on solr and solrmarc. for more information, see http://lucene.apache.org/solr/, https://github.com/solrmarc/solrmarc, and https://vufind.org/vufind/, respectively. the hathitrust oai feed is available at https://www.hathitrust.org/oai.

12 five of the missing works from the new york society library sample were undated in the catalog.
negotiating a text mining license for faculty researchers
leslie a. williams, lynne m. fox, christophe roeder, and lawrence hunter
information technology and libraries | september 2014 5

abstract

this case study examines strategies used to leverage the library’s existing journal licenses to obtain a large collection of full-text journal articles in xml format, the right to text mine the collection, and the right to use the collection and the data mined from it for grant-funded research to develop biomedical natural language processing (bnlp) tools. researchers attempted to obtain content directly from pubmed central (pmc). this attempt failed because of limits on use of content in pmc. next, researchers and their library liaison attempted to obtain content from contacts in the technical divisions of the publishing industry. this resulted in an incomplete research data set. researchers, the library liaison, and the acquisitions librarian then collaborated with the sales and technical staff of a major science, technology, engineering, and medical (stem) publisher to successfully create a method for obtaining xml content as an extension of the library’s typical acquisition process for electronic resources.
our experience led us to realize that text-mining rights for full-text articles in xml format should routinely be included in the negotiation of the library’s licenses.

leslie a. williams (leslie.williams@ucdenver.edu) is head of acquisitions, auraria library, university of colorado, denver. lynne m. fox (lynne.fox@ucdenver.edu) is education librarian, health sciences library, university of colorado anschutz medical campus, aurora. christophe roeder is a researcher at the school of medicine, university of colorado, aurora. lawrence hunter (larry.hunter@ucdenver.edu) is professor, school of medicine, university of colorado, aurora.

negotiating a text mining license for faculty researchers | williams et al 6

introduction

the university of colorado anschutz medical campus (cu anschutz) is the only academic health sciences center in colorado and the largest in the region. annually, cu anschutz educates 3,480 full-time students, provides care during 1.5 million patient visits, and receives more than $400 million in research awards.1 cu anschutz is home to a major research group in biomedical natural language processing (bnlp), directed by professor lawrence hunter. natural language processing (also known as nlp or, more colloquially, “text mining”) is the development and application of computer programs that accept human language, usually in the form of documents, as input. bnlp takes as input scientific documents, such as journal articles or abstracts, and provides useful functionality, such as information retrieval or information extraction.
cu anschutz’s health sciences library (hsl) supports hunter’s research group by providing a reference and instruction librarian, lynne fox, to participate on the research team. hunter’s group is working on computational methods for knowledge-based analysis of genome-scale data.2 as part of that work, his group is devising and implementing text-mining methods that extract relevant information from biomedical journal articles, which is then integrated with information from gene-centric databases and used to produce a visual representation of all of the published knowledge relevant to a particular data set, with the goal of identifying new explanatory hypotheses.

hunter’s research group demonstrated the potential of integrating data and research information in a visualization to further new discoveries with the “hanalyzer” (http://hanalyzer.sourceforge.net). their test case used expression data from mice related to craniofacial development and connected that data to pubmed abstracts using gene or protein names. “copying of content that is subject to copyright requires the clearing of rights and permissions to do this. for these reasons the body of text that is most often used by researchers for text mining is pubmed.”3 the resulting visualization allowed researchers to identify four genes involved in mouse craniofacial development that had not previously been connected to tongue development, with the resulting hypotheses validated by subsequent laboratory experiment.4 the knowledge-based analysis tool is open access.
to continue the development of the bnlp tools for the knowledge-based analysis system, three things were required: a large collection of full-text journal articles in xml format, the right to text mine the collection, and the right to store and use the collection and the data mined from it for grant-funded research. the larger the dataset, the more robust the visual representations of the knowledge-based analysis system, so hunter’s research group sought to compile a large corpus of relevant literature, beginning with journal articles. the text that is mined can start in many formats; however, xml provides a computer-ready format for text mining because it is structured to indicate parts of the document. xml is “called a ‘markup language’ because it uses tags to mark and delineate pieces of data. the ‘extensible’ part means that the tags are not pre-defined; users can define them based on the type of content they are working with.”5,6

xml has been adopted as a standard for content creation by journal publishers because it provides a flexible format for electronic media.7 xml allows the parts of a journal article to be encoded with tags that identify the title, author, abstract, and other sections, allowing the article to be transmitted electronically between editor and publisher and to be easily formatted and reproduced into different versions (e.g., print, online). xml can also indicate significant content in the text, such as biological terms or concepts. xml allowed hunter’s research group to write computer programs that can make sense of each article by using the xml tags as indicators of content and placement within the article.
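as an illustration of how xml tags can serve as indicators of content and placement, the following is a minimal python sketch; it is not the research group’s actual code. the sample document, its tag names (article-title, surname, abstract, sec), and the function extract_sections are all assumptions made for illustration, and a publisher’s real schema would differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified article markup; real publisher XML is far richer.
SAMPLE_ARTICLE = """\
<article>
  <front>
    <article-title>a hypothetical study of craniofacial development</article-title>
    <contrib><surname>doe</surname></contrib>
    <abstract><p>expression data from mice were analyzed.</p></abstract>
  </front>
  <body>
    <sec><title>methods</title><p>tissue samples were collected.</p></sec>
    <sec><title>results</title><p>four genes were identified.</p></sec>
  </body>
</article>
"""


def paragraph_text(element):
    """Join the text of the <p> children of an element into one string."""
    return " ".join("".join(p.itertext()).strip() for p in element.findall("p"))


def extract_sections(xml_text):
    """Use the tags to pull out the title, authors, abstract, and body sections."""
    root = ET.fromstring(xml_text)
    abstract = root.find(".//abstract")
    record = {
        "title": root.findtext(".//article-title", default=""),
        "authors": [s.text for s in root.iter("surname")],
        "abstract": paragraph_text(abstract) if abstract is not None else "",
        "sections": {},
    }
    # Each <sec> carries its own heading, so text keeps its place in the article.
    for sec in root.iter("sec"):
        heading = sec.findtext("title", default="untitled")
        record["sections"][heading] = paragraph_text(sec)
    return record


if __name__ == "__main__":
    for key, value in extract_sections(SAMPLE_ARTICLE).items():
        print(key, "->", value)
```

because each piece of text arrives labeled with the section it came from, a downstream text-mining step can, for example, treat a gene name mentioned in a results section differently from one mentioned in a methods section.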
products have been developed, such as la-pdftext, to extract text from pdf documents.8 however, direct access to xml provides more useful corpora because the document markup saves time and improves the accuracy of results extracted from xml.

once the sections and content of an article are identified, text-mining techniques are applied to the article. “text mining extracts meaning from text in the form of concepts, the relationships between the concepts or the actions performed on them and presents them as facts or assertions.”9 text-mining techniques can be applied to any type of information available in machine-readable format (e.g., journal articles, e-books). a dataset is created when the text-mined data is aggregated. using bnlp tools, hunter’s research group’s knowledge-based analysis system analyzed the dataset and produced visual representations of the knowledge that have the potential to lead to new hypotheses. text mining and bnlp techniques have the potential to build relationships between the knowledge contained in the scholarly literature that lead to new hypotheses, resulting in more rapid advances in science.

literature review

hunter and cohen explored “literature overload” and its profoundly negative impact on discovery and innovation.10 with an estimated growth rate of 3.1 percent annually for pubmed central, the us national library of medicine’s repository, researchers struggle to master the new literature of their field using traditional methods.
yet much of the advancement of biological knowledge relies on the interplay of data created by protein, sequence, and expression studies and the communication of information and discoveries through nontextual and textual databases and published reports.11 how do biomedical researchers capitalize on and integrate the wealth of information available in the scholarly literature? “the common ground in the area of content mining is in the shared conviction that the ever increasing overload of information poses an absolute need for better and faster analysis of large volumes of content corpora, preferably by machines.”12

bnlp “encompasses the many computational tools and methods that take human-generated texts as input, generally applied to tasks such as information retrieval, document classification, information extraction, plagiarism detection, or literature-based discovery.”13 bnlp techniques accomplish many tasks usually performed manually by researchers, including enhancing access through expanded indexing of content or linkage to additional information, automating reviews of the literature, discovering new insights, and extracting meaning from text.14 text mining is just one tool in a larger bnlp toolbox of resources used to read, reason, and report findings in a way that connects data to information sources to speed discovery of new knowledge.15 according to pioneering text-mining researcher marti hearst, “text mining is the discovery by computer of new, previously unknown information, by automatically extracting information from different written resources.
a key element is the linking together of the extracted information together to form new facts or new hypotheses to be explored further by more conventional means of experimentation.”16 biomedical text mining uses “automated methods for exploiting the enormous amount of knowledge available in the biomedical literature.”17

recent reports, commissioned by private and governmental interest groups, discuss the economic and societal value of text mining.18,19 the mckinsey global institute estimates the worth of harnessing big data insights in us health care at $300 billion. the report concludes that greater sharing of data for text mining enables “experimentation to discover needs, expose variability, and improve performance” and enhances “replacing/supporting human decision making with automated algorithms,” among other benefits.
furthermore, the mckinsey report points out that north america and europe have the greatest potential to take advantage of innovation because of a well-developed infrastructure and large stores of text and data to be mined.20 however, these new and evolving technologies are challenging the current intellectual-property framework, as noted in an independent report by ian hargreaves, “digital opportunity: a review of intellectual property and growth,” resulting in lost opportunity for innovation and economic growth.21 in “the value and benefits of text mining,” jisc finds that copyright restrictions limit access to content for text mining in the biomedical sciences and chemistry and that costs for access and infrastructure prevent entry into text-mining research for many noncommercial organizations.22 despite copyright barriers, organizations surveyed pointed out that the risks associated with failing to use text-mining techniques to further research include financial loss, loss of prestige, opportunity lost, and the brain drain of having talented staff seek more fulfilling work. jisc explores a research project’s workflow and finds a lack of access to text mining delayed the publication of an important medical research study by many months, or the time the research team spent analyzing and summarizing relevant research.23 both reports advocate an exception to intellectual property rights for noncommercial text-mining research to balance the protection of intellectual property with the access needs of researchers. a centrally maintained repository for text mining has been proposed, although its creation would face significant challenges.24

scholarly journal content is the raw “ore” for text mining and bnlp.
the lack of access to this ore creates a bottleneck for researchers. “new business models for supporting text mining within the scholarly publishing community are being explored; however, evidence suggests that in some cases lack of understanding of the potential is hampering innovation.”25 bnlp and machine-learning research products are more accurate and complete when more content is available for text mining. “knowledge discovery is the search for hidden information. . . . hence the need is to start looking as widely as possible in the largest set of content sources possible.”26 however, as noted in a nature article, “the question is how to make progress today when much research lies behind subscription firewalls and even ‘open’ content does not always come with a text-mining license.”27 large scientific publishers are facing economic challenges, and potentially diminished economic returns, as the tension over the right to use licensed content heats up. nature, the flagship of a major scientific publisher, predicted “trouble at the text mine” if researchers lack access to the contents of research publications.28 and a 2012 investment report predicted slower earnings growth for elsevier, the largest stem publisher, if it blocked access to licensed content by text-mining researchers.
the review predicted, “if the academic community were to conclude that the commercial terms imposed by elsevier are also hindering the progress of science or their ability to efficiently perform research, the risk of a further escalation of the acrimony [between elsevier and the academic community] rises substantially.”29 with open access alternatives proliferating, including making federally funded research freely accessible, stem publishers are under increased pressure to respond to market forces. “the greatest challenge for publishers is to create an infrastructure that makes their content more machine-accessible and that also supports all that text-miners or computational linguists might want to do with the content.”30 on the other end of the spectrum, researchers are struggling to gain legal access to as much content as possible.

academic libraries have long excelled at serving as the bridge between researchers and publishers and can expand their roles to include navigating the uncharted territory of obtaining text-mining rights for content. increasing the library’s role in text mining and other associated bnlp and machine-learning methods offers tremendous potential for greater institutional relevance and service to researchers.31 at cu anschutz’s hsl, fox and williams, an acquisitions librarian, found natural opportunities for collaboration, including negotiating rights to content more efficiently through expanded licensing arrangements and facilitating the secure transfer and storage of data to protect researchers and publishers.
method

hunter and fox began working in 2011 to obtain a large corpus of biomedical journal articles in xml format to create a body of text as comprehensive as possible for bnlp experimentation that would further advance hunter’s research group’s knowledge-based analysis system. the desired result was an aggregated collection obtained from multiple publishers, stored locally, and available on demand for the knowledge-based analysis system to process. hunter and fox soon realized that “the process of obtaining or granting permissions for text mining is daunting for researchers and publishers alike. researchers must identify the publishers and discover the method of obtaining permission for each publisher. most publishers currently consider mining requests on a case by case basis.”32 they pursued a multifaceted strategy to build a robust collection and to determine which strategy proved most fruitful because, during a grant review, national library of medicine staff wanted evidence of access to an xml collection before awarding a grant.

fox first approached two open-access publishers, biomed central (bmc) and public library of science (plos), to request access to xml text from journals in the subjects of life and biomedical science. fox had existing contacts within both organizations and an agreement was reached to obtain xml journal articles. letters of understanding were quickly obtained as both publishers were excited about exploring new ways for their research publications to be accessed and the potential to increase the use of their journals.
possible journal titles were identified and arrangements were made to transfer files from bmc and plos to hunter’s research group and store them locally.

hunter approached staff at pubmed central (pmc) to request access to articles and discovered they could only be made available with permission from publishers. a wiley research and product development executive granted hunter permission to access wiley articles in pmc. the wiley executive was interested in learning what impact text mining might have on wiley products. hunter’s research group planned to transfer document type definition (dtd) format files from pmc. unfortunately, when hunter’s research group staff requested file-transfer assistance from pmc, no pmc staff were available to provide the technical help needed because of budget reductions. pmc staff could accurately evaluate their time commitment because they had a clear understanding of the xml access and transfer process, and knew they could not allocate resources to the effort.

hunter then began to leverage his professional network connections to obtain content from a major stem vendor. research and development division directors within the company were familiar with the work of hunter’s research group and were willing to provide assistance in acquiring content. however, when the research group began to perform research using this data, further investigation determined that the contents were not adequate for the research.
follow-up between fox, the research group, and the vendor revealed that the group’s needs were not communicated in the vendor’s vernacular, resulting in the group not clearly understanding what content the vendor was providing. this disconnect occurred in the communication flow from the research group to the vendor’s research and development staff to the vendor’s sales staff (who identified the content to be shared). it was like a game of telephone.

after the initial strategies produced mixed results, hunter’s research group hypothesized that they could harvest materials through hsl’s journal subscriptions. hunter’s research group attempted to crawl and download journal content being provided by hsl’s subscription to a major chemistry publisher. since publishers monitor for web crawling of their content, the chemistry publisher became aware of the unusual download activity, turned off campus access, and notified the library that there may have been an unauthorized attempt to access the publisher’s content. researchers are often unaware of complex copyright and license compliance requirements. in fact, librarians sometimes become aware of text-mining projects only after automated downloads of licensed content prompt vendors to shut off campus access.33 libraries can prevent interruption of campus-wide access to important resources by suggesting more effective content-access methods.

williams, an hsl acquisitions librarian, investigated the interruption in access and discovered hunter’s research group’s efforts to obtain journal articles to text mine for their research.
she offered to use her expertise in acquiring content to help hunter’s research group obtain the dataset needed for their research. initially, hunter and fox had not included an acquisitions librarian because that position was vacant. after williams became involved, the effort focused on obtaining content through licenses negotiated with individual publishers.

information technology and libraries | september 2014

results

“there are a large number of resources to help the researcher who is interested in doing text mining” but “no similar guide to obtaining the necessary rights and permissions for the content that is needed.”34 at cu anschutz, this vacuum was filled by williams, who is knowledgeable about the acquisition of content, and fox, who is knowledgeable about hunter’s research, serving as the bridge between the research group and the stem publisher. by working together and capitalizing on each other’s expertise, williams and fox were able to facilitate the collaboration that developed a framework for purchasing a large collection of full-text journal articles in xml format. as the collaboration progressed, three major elements of the framework surfaced: a pricing model, a license agreement, and the dataset and delivery mechanism.

researchers interested in legally text mining journal content often find themselves having to execute a license agreement and pay a fee.35 what should the fee be based on to create a fair and equitable pricing model?
publishers establish pricing for library clients on the basis of not only the content, but many value-added services such as the breadth of titles aggregated and made available for purchase in a single product, the creation of a platform to access the journal titles, the indexing and searching functionality within the platform, and the production of easily readable pdf versions of articles. these value-added services are not required for text-mining endeavors. rather, the product is the raw journal content that has been peer-reviewed, edited, and formatted in xml that precedes the addition of value-added services. therefore the pricing should not be equivalent to the cost of a library’s subscription to a journal or package of journals. in the end, after lengthy negotiations, the pricing model for hunter’s research group’s collection of full-text journal articles in xml format consisted of

• a cost per article;
• a minimum purchase of 400,000 articles for one sum on the basis of the cost per article;
• an annual subscription for the minimum purchase of 400,000;
• the ability to subscribe to additional articles in excess of 400,000 in quantities determined by hunter’s research group;
• a volume discount off the per-article price for every article purchased in excess of 400,000;
• inclusion of the core journal titles purchased via the library’s subscription at no charge;
• inclusion of the core journal titles purchased by the university of colorado boulder at no charge because of hunter’s joint appointment at both the cu boulder and cu anschutz campuses; and
• a requirement for hsl to maintain its subscription to the vendor’s product at its current level.
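the cost elements of the pricing model listed above can be illustrated with a short calculation. the following is only a sketch: the article does not disclose the per-article price or the size of the volume discount, so the values passed in below are hypothetical, as are the names PricingModel and annual_cost. core titles already licensed by the two libraries were included at no charge, so they are assumed to be excluded from the billable count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PricingModel:
    """Per-article pricing with a minimum purchase and a volume discount.

    All monetary values are hypothetical; the article does not disclose them.
    """

    price_per_article: float          # hypothetical base price per article
    minimum_articles: int = 400_000   # minimum purchase named in the article
    volume_discount: float = 0.0      # hypothetical fraction off articles above the minimum

    def annual_cost(self, billable_articles: int) -> float:
        """Return the cost of one subscription year.

        `billable_articles` is assumed to exclude core titles that the
        libraries already license, since those were included at no charge.
        """
        if billable_articles < 0:
            raise ValueError("billable_articles must be non-negative")
        # The minimum purchase is charged as one sum even if fewer articles are taken.
        base = self.minimum_articles * self.price_per_article
        # Only articles beyond the minimum receive the discounted price.
        extra = max(0, billable_articles - self.minimum_articles)
        discounted_price = self.price_per_article * (1.0 - self.volume_discount)
        return base + extra * discounted_price


if __name__ == "__main__":
    # Hypothetical figures chosen only to exercise the calculation.
    model = PricingModel(price_per_article=0.10, volume_discount=0.25)
    for count in (100_000, 400_000, 500_000):
        print(count, round(model.annual_cost(count), 2))
```

separating the minimum purchase from the discounted excess keeps the two negotiated terms visible as distinct parameters, which makes it easy to compare alternative offers during negotiation.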
“where institutions already have existing contracts to access particular academic publications, it is often unclear whether text mining is a permissible use.”36 from the beginning, common ground was easily found on the subject of core titles purchased by the two campuses’ libraries. core titles are typically those journals for which libraries pay a premium to obtain perpetual rights to the content. most of the negotiation focused on access titles, which are journals that libraries pay a nominal fee to access, without any perpetual rights included.

the final challenge related to cost was determining how to process and pay for the product. hunter’s research group operates on major grant funding from federal government agencies. the university of colorado requires additional levels of internal controls and approvals to expend grant funds as well as to track expenditures to meet reporting requirements of the funding agencies. also, grant funding of this type often spans multiple fiscal years whereas the library’s budget operates on a single fiscal year at a time. therefore it was decided that hunter would handle payment directly rather than transferring funds to hsl to make payment on their behalf.

“libraries as the licensee of publishers’ content are from that perspective interested in the legal framework around content mining.”37 during price negotiations, williams recommended negotiating a license agreement similar to those libraries and publishers execute for the purchases of journal packages.
a  license  agreement  would  offer  a  level  of  protection  for  all  parties  involved   while  clearly  outlining  the  parameters  of  the  transaction.  hunter  and  the  stem  publisher  readily   agreed.     the  final  license  agreement  contained  ten  sections  including  definitions;  subscription;  obligations;   use  of  names;  financial  arrangement;  term;  proprietary  rights;  warranty,  indemnity,  disclaimer,   and  limitation  of  liability;  and  miscellaneous.  while  the  license  agreement  was  similar  to   traditional  license  agreements  between  libraries  and  publishers  for  journal  subscriptions,  there   were  some  notable  differences.  first,  in  the  definitions  section,  users  were  defined  and  limited  to   hunter  and  his  research  team.  this  limited  the  users  to  a  specific  group  of  individuals  unlike   typical  library–publisher  license  agreements  that  license  content  for  the  entire  campus.     second,  the  subscription  section  covered  how  the  data  can  be  used  in  detail  and  allowed  the   dataset  to  be  installed  locally.  this  was  important  to  make  the  dataset  available  on  demand  to   researchers;  to  allow  researchers  to  manipulate,  segment,  and  store  the  data  in  multiple  ways   instead  of  as  one  large  dataset;  and  to  allow  the  researchers  the  ability  to  access  and  use  the  large   dataset  efficiently  and  quickly.  because  the  dataset  would  be  manipulated  so  extensively,  the   license  gave  permission  to  create  a  backup  copy  and  store  it  separately.  the  subscription  section   also  required  the  dissemination  of  the  research  results  to  occur  in  such  a  way  that  the  dataset   could  not  be  extracted  and  used  by  others.  this  was  significant  because  prof.  
hunter releases the bnlp software applications his group develops as open source software so that the applications can be open to peer review and attempts at reproduction. ideally, someone could download the open source software, obtain the same corpus as input, and see the same output mentioned in the paper.

third, the obligations section was radically different from traditional library–publisher license agreements because even though “publishers are still working out how to take advantage of text mining . . . none wants to miss out on the potential commercial value.”38 this interest prompted the crafting of an atypical obligations section in the license agreement that included an option for hunter to collaborate with the stem publisher to develop and showcase an application on the vendor’s website and included a commitment for hunter to meet quarterly with the vendor’s representatives to discuss advances in research. furthermore, the obligations section specified a request for hunter and the university of colorado to recognize the vendor where appropriate and a right for the stem publisher to use any research software application released as open source.

up to this point, williams had been collaborating with the university of colorado in-house counsel to review and revise the license agreement. when the stem publisher requested the right to use the software application, williams was required to submit the license agreement to the university of colorado’s technology transfer office for review and approval. approval was prompt in coming, primarily because prof. hunter releases his software applications as open source.
fourth, the license agreement included a “use of names” section, which is not found in typical library–publisher agreements. this section authorized the vendor to use factual information drawn from a case study in market-facing materials and required the vendor to request written consent, as required by the university of colorado system, before information in the case study could be released in market-facing materials. the vendor also agreed not to use the university of colorado’s trademark, service mark, trade name, copyright, or symbol without prior written consent and to use these items in accordance with the university of colorado system’s usage guidelines.

fifth, the vendor agreed not to represent in any way that the university of colorado or its employees endorse the vendor’s products or services. this is extremely important because the university of colorado’s controller does not allow product endorsements because of the federal unrelated business income tax. exempt organizations are required to pay this tax if they engage in regularly occurring business activities that do not further the purpose of the exempt organization.39

finally, the license agreement stated all items would be provided in xml format with a unique digital object identifier (doi) number, essential for linking xml content to real-world documents that researchers using hunter’s research group’s knowledge-based analysis system would want to access.

after a pricing model and license agreement were finalized, the focus turned to the last major element of the framework: the dataset and delivery mechanism.
elements such as quality of the corpora contents, file transfer time, and storage capacity are all important. in other words, “the need is to start looking as widely as possible in the largest set of content sources possible. this need is balanced by the practicalities of dealing with large amounts of information, so a choice needs to be made of which body of content will most likely prove fruitful for discovery. text mines are dug where there is the best chance of finding something valuable.”40

when building an xml corpus for research, hunter’s research group wanted to maximize their return on investment, so a pilot download was conducted to ensure that the most beneficial content could be transferred smoothly to a local server. “permissions and licensing is only a part of what is needed to support text mining. the content that is to be mined must be made available in a way that is convenient for the researcher and the publisher alike.”41 this pilot phase allowed hunter’s researchers and the vendor’s technical personnel to clarify the requirements of the dataset and to efficiently deliver and accurately invoice for content. one of the initial obstacles was that a filter for the delivery mechanism didn’t exist. letters to the editor, errata, and more were all counted as articles. hunter’s researchers quickly determined that research articles were most important at this point in the development of the knowledge-based analysis system. how should a useful or minable article be defined: by its length, by xml tags indicating content type, or by some other criteria?
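a tag-based answer to this question can be expressed as a small filter. the sketch below applies the four criteria the group settled on (an abstract, a body, at least 40 lines of text, and no excluded content type). it is not the group’s actual code, and the xml layout is an assumption: the article does not describe the vendor’s schema, so the element names abstract and body, the article-type attribute on the root element, and the choice to count non-blank lines inside the body are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Content types excluded by the definition described in the article.
EXCLUDED_TYPES = {
    "corrigendum", "erratum", "book review", "editorial",
    "introduction", "preface", "correspondence", "letter to the editor",
}
MIN_TEXT_LINES = 40  # minimum length named in the article


def is_minable_article(xml_text: str) -> bool:
    """Return True if the document meets all four criteria.

    Assumed (hypothetical) layout: the content type is stored in an
    `article-type` attribute on the root element, and the document
    contains `abstract` and `body` elements somewhere below the root.
    """
    root = ET.fromstring(xml_text)

    # Criterion 4: reject excluded content types.
    article_type = (root.get("article-type") or "").strip().lower()
    if article_type in EXCLUDED_TYPES:
        return False

    # Criteria 1 and 2: both an abstract and a body must be present.
    abstract = root.find(".//abstract")
    body = root.find(".//body")
    if abstract is None or body is None:
        return False

    # Criterion 3: count non-blank lines of text inside the body.
    body_text = "".join(body.itertext())
    text_lines = [line for line in body_text.splitlines() if line.strip()]
    return len(text_lines) >= MIN_TEXT_LINES


def count_minable(documents: list[str]) -> int:
    """Count how many delivered documents qualify as billable "articles"."""
    return sum(1 for doc in documents if is_minable_article(doc))
```

a count like the one returned by count_minable is the kind of figure the research group would report back to the vendor after evaluating a delivery.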
roeder, a software engineer, used article attributes and characteristics embedded in xml tags to define an article as including all of the following:

• an abstract
• a body
• at least 40 lines of text
• none of the following tags: corrigendum, erratum, book review, editorial, introduction, preface, correspondence, or letter to the editor

in the end, hunter’s research group and the vendor agreed that the vendor would transmit everything and allow the group fifteen business days to evaluate the content. the research group would then notify the vendor of how many “articles” were received. this process would continue until 400,000 “articles” were received.

after more than a year spent developing a structure to purchase a large corpus of journal articles to text mine, and just as hunter’s research group was ready to execute the license, remit payment, and receive the articles, their federal grant expired, stalling the purchase. in retrospect, this unfortunate development was the catalyst for a shift in philosophy and strategy for the researchers and librarians at cu anschutz.

discussion

xml text-mining efforts will continue to expand, leading to increased demand on libraries and librarians to play a role in securing content. publishers, researchers, and libraries see the potential commercial and research value of text mining journal content and are driving the rapid evolution of this arena, in part, because “there is increasing demand from public and charitable funders that maximum value is leveraged from their substantial investment and this includes making outputs accessible and usable. . . .
text  mining  offers  the  potential  for  fuller  use  of  the  existing  publicly-­‐ funded  research  base.”42     however,  publishers  identified  two  main  barriers  to  text  mining  from  their  perspective—lack  of   standardization  in  content  formats  and  in  access  terms—and  concede  that  “publishers  should   develop  shared  access  terms  for  research-­‐driven  mining  requests.”43  from  the  researcher  and   librarian  perspective,  there  are  many  barriers  and  costs  involved  including  “access  rights  to  text-­‐ minable  materials,  transaction  costs  (participation  in  text  mining),  entry  (setting  up  text  mining),   staff  and  underlying  infrastructure.  currently,  the  most  significant  costs  are  transaction  costs  and   entry  costs.”44  the  significant  transaction  costs  stem  from  the  time  it  takes  to  navigate  the   complexity  of  negotiating  and  complying  with  license  agreements  for  journal  content.  the  various   types  of  “costs  are  currently  borne  by  researchers  and  institutions,  and  are  a  strong  hindrance  to   text  mining  uptake.  these  could  be  reduced  if  uncertainty  is  reduced,  more  common  and   straightforward  procedures  are  adopted  across  the  board  by  license  holders,  and  appropriate   solutions  for  orphaned  works  are  adopted.  however,  the  transaction  costs  will  still  be  significant  if   individual  rights  holders  each  adopt  different  licensing  solutions  and  barriers  inhibiting  uptake   will  remain.”45   in  a  survey  of  libraries,  findings  indicated  that  librarians  anticipate  a  new  role  as  facilitators   between  researchers  and  publishers  to  enable  text  mining.46  librarians  are  a  natural  fit  for  this   role  because  they  already  have  expertise  in  navigating  copyright,  requesting  copyright   permissions,  and  negotiating  license  agreements  for  journal  content.  
“advice and guidance should be developed to help researchers get started with text mining. this should include: when permission is needed; what to request; how best to explain intended work and how to describe the benefits of research and copyright owners.”47

after their experience with developing a framework to license and purchase a large corpus of journal articles in xml format to be text mined, fox and williams came to believe that, in addition to providing copyright expertise, librarians should assist in reducing transaction costs by developing model license clauses for text mining and routinely negotiating for these rights when the library purchases journals and other types of content. adopting this philosophy and strategy led williams and fox to successfully advocate for the inclusion of a text-mining clause in the license agreement for the stem publisher in this case study at the time of the library’s subscription renewal. this occurred at a regional academic consortium level, making text mining easier at fourteen academic institutions. furthermore, the university of colorado libraries, which includes five libraries on four campuses, is now working on drafting a model clause to use when purchasing journal content as the university of colorado system and to put forth for consideration by the consortia that facilitate the purchase of our major journal packages. given that incorporating text-mining clauses into library–publisher license agreements for scholarly journals is in its infancy, there are few resources available to assist librarians adopting this new role.
model clauses include the following:

• british columbia electronic library network’s model license agreement48
  o clause 3.11. “data and text mining. members and authorized users may conduct research employing data or text mining of the licensed materials and disseminate results publicly for non-commercial purposes.”
• california digital library’s standard license agreement49
  o section iv. authorized use of licensed materials. “text mining. authorized users may use the licensed material to perform and engage in text mining/data mining activities for legitimate academic research and other educational purposes.”
• jisc’s model license for journals50
  o clause 3.1.6.8. “use the licensed material to perform and engage in text mining/data mining activities for academic research and other educational purposes and allow authorised users to mount, load and integrate the results on a secure network and use the results in accordance with this license.”
  o clause 9.3.
“for  the  avoidance  of  doubt,  the  publisher  hereby  acknowledges  that  any   database  rights  created  by  authorised  users  as  a  result  of  textmining/datamining  of   the  licensed  material  as  referred  to  in  clause  3.1.6.8  shall  be  the  property  of  the   institution.”   publishers  are  also  beginning  to  break  down  barriers  perhaps,  in  part,  because  of  the  sentiment   that  “privately  erected  barriers  by  copyright  holders  that  restrict  text  mining  of  the  research  base   could  be  increasingly  regarded  as  inequitable  or  unreasonable  since  the  copyright  holders  have   borne  only  a  small  proportion  of  the  costs  involved  in  the  overall  process;  furthermore,  they  do   not  have  rights  or  ownership  of  the  inherent  facts  or  ideas  within  the  research  base.”51  biomed   central  and  plos  both  offer  services  that  allow  researchers  to  access  xml  text  collections.  biomed   central  makes  content  readily  accessible  by  providing  a  website  for  bulk  download  of  xml  text.52   plos  requires  contact  with  a  staff  member  for  download  of  xml  text.53  in  december  2013,   elsevier  also  announced  that  it  would  create  a  “big  data”  center  at  the  university  college  london   to  allow  researchers  to  work  in  partnership  with  mendeley,  a  knowledge  management  and   citation  application  now  owned  by  elsevier.  while  this  is  a  positive  step,  the  partnership  does  not   appear  to  make  the  data  available  to  research  groups  beyond  the  university  college  london.54     however,  there  is  still  a  long  way  to  go  before  publishers  and  librarians  are  routinely   collaborating  on  opening  up  the  scholarly  literature  to  be  mined.  
for example, a 2012 nature editorial states “nature publishing group, which also includes this journal, says that it does not charge subscribers to mine content, subject to contract.”55 repeated attempts by williams to obtain more information from nature publishing group and a copy of the contract have proved fruitless.

in january 2014, elsevier announced that “researchers at academic institutions can use elsevier’s online interface (api) to batch-download documents in computer-readable xml format” after signing a legal agreement. elsevier will limit researchers to accessing 10,000 articles per week.56,57 for small-scale projects with a narrow scope, this limit will suffice. for example, mining the literature for a specific gene that plays a known role in a disease could require a text set under 30,000 articles. at elsevier’s current rate of article transfer, a 30,000-article text set could be created in roughly three weeks. however, for large-scale projects such as hunter’s research group’s knowledge-based analysis system that require a text set of 400,000 articles (or much more, if not limited by budget constraints), nearly a year of time would be required to build the corpus. time is one of the most valuable commodities in computational biology. the elapsed time required to transfer articles at the rate of 10,000 articles per week represents a bottleneck that most grant-funded research cannot afford. speed of transfer will also be a factor.
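the effect of the weekly cap can be checked with simple arithmetic. the sketch below assumes the cap is a hard limit of 10,000 articles per week and that downloads run every week without interruption; the function name weeks_to_build is hypothetical.

```python
WEEKLY_LIMIT = 10_000  # articles per week, the cap reported in the article


def weeks_to_build(corpus_size: int, weekly_limit: int = WEEKLY_LIMIT) -> int:
    """Return the whole number of weeks needed to download a corpus.

    Assumes a hard weekly cap and uninterrupted downloading; a partly
    used final week still counts as a full week.
    """
    if corpus_size < 0:
        raise ValueError("corpus_size must be non-negative")
    if weekly_limit <= 0:
        raise ValueError("weekly_limit must be positive")
    # Integer ceiling division avoids floating-point rounding.
    return -(-corpus_size // weekly_limit)


if __name__ == "__main__":
    for size in (30_000, 400_000):
        print(size, "articles:", weeks_to_build(size), "weeks")
```

under these assumptions a 30,000-article set takes 3 weeks and a 400,000-article corpus takes 40 weeks, which is most of a year before any analysis can begin.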
researchers require flexibility to maximize available central processing unit (cpu) hours because documents can take from a few seconds to a full minute each to transfer to the storage destination. monopolizing peak hours in high performance computing (hpc) settings may mean that computing power is not available for other tasks, although many hpc centers have learned to allocate cpu use more efficiently to high volumes. furthermore, elsevier’s terms and conditions limit excerpts from the original text in any output to 200 characters.58 this is roughly equivalent to two lines of text or approximately forty words. this may be insufficient to capture important biological relationships necessary to evaluate the relevance of the article to the research being represented by the hanalyzer knowledge-based analysis system.

conclusion

forging a partnership between a library, a research lab, and a major stem vendor requires flexibility, patience, and persistence. our experience strengthened the existing relationship between the library and the research lab and demonstrated the library’s willingness and ability to support faculty research in a nontraditional method. librarians are encouraged to advocate for the inclusion of text-mining rights in their library’s license agreements for electronic resources. what the future holds for publishers, researchers, and libraries involved in text mining remains to be seen. however, what is certain is that without cooperation between publishers, researchers, and libraries, breaking down the existing barriers and achieving standards for content formats and access terms will remain elusive.

references

1. university of colorado anschutz medical campus, university of colorado anschutz medical campus quick facts, 2013, http://www.ucdenver.edu/about/whoweare/documents/cuanschutz_facts_041613.pdf.
2. sonia m. leach et al., “biomedical discovery acceleration, with applications to craniofacial development,” plos computational biology 5, no. 3 (2009): 1–19, http://dx.doi.org/10.1371/journal.pcbi.1000215.
3. jonathan clark, text mining and scholarly publishing (publishing research consortium, 2013).
4. corie lok, “literature mining: speed reading,” nature 463 (2010): 416–18, http://dx.doi.org/10.1038/463416a.
5. hong-jie dai, yen-ching chang, richard tzong-han tsai, and wen-lian hsu, “new challenges for biological text-mining in the next decade,” journal of computer science and technology 25, no. 1 (2010): 169–79, http://dx.doi.org/10.1007/s11390-010-9313-5.
6. anne hoekman, “journal publishing technologies: xml,” http://www.msu.edu/~hoekmana/wra%20420/ismte%20article.pdf.
7. alex brown, “xml in serial publishing: past, present and future,” oclc systems & services 19, no. 4 (2003): 149–54, http://dx.doi.org/10.1108/10650750310698775.
8. cartic ramakrishnan et al., “layout-aware text extraction from full-text pdf of scientific articles,” source code for biology and medicine 7, no. 7 (2012), http://dx.doi.org/10.1186/1751-0473-7-7.
9. ibid.
10. lawrence hunter and k. bretonnel cohen, “biomedical language processing: perspective what’s beyond pubmed?” molecular cell 21, no. 5 (2006): 589–94.
11. martin krallinger, alfonso valencia, and lynette hirschman, “linking genes to literature: text mining, information extraction, and retrieval applications for biology,” genome biology 9, supplement 2 (2008): s8.1–s8.14, http://dx.doi.org/10.1186/gb-2008-9-s2-s8.
12. eefke smit and maurits van der graaf, “journal article mining: the scholarly publishers’ perspective,” learned publishing 25, no. 1 (2012): 35–46, http://dx.doi.org/10.1087/20120106.
13. hunter and cohen, “biomedical language processing,” 589.
14. clark, text mining and scholarly publishing.
15. leach et al., “biomedical discovery acceleration.”
16. marti hearst, “what is text mining?” october 17, 2003, http://people.ischool.berkeley.edu/~hearst/text-mining.html.
17. k. bretonnel cohen and lawrence hunter, “getting started in text mining,” plos computational biology 4, no. 1 (2008): 1–3, http://dx.doi.org/10.1371/journal.pcbi.0040020.
18. jisc, “the model nesli2 licence for journals,” 2013, http://www.jisc-collections.ac.uk/help-and-information/how-model-licences-work/nesli2-model-licence-/.
19. ian hargreaves, “digital opportunity: a review of intellectual property and growth,” may 2011, http://www.ipo.gov.uk/ipreview-finalreport.pdf.
20. james manyika et al., “big data: the next frontier for innovation, competition, and productivity,” mckinsey & company, may 2011, http://www.mckinsey.com/insights/business_technology/big_data_the_next_frontier_for_innovation.
21. hargreaves, “digital opportunity.”
22. diane mcdonald and ursula kelly, “the value and benefits of text mining to uk further and higher education,” jisc, 2012, http://www.jisc.ac.uk/reports/value-and-benefits-of-text-mining.
23. jisc, “the model nesli2 licence for journals.”
24. smit and van der graaf, “journal article mining.”
25. mcdonald and kelly, “the value and benefits of text mining.”
26. clark, text mining and scholarly publishing.
27. “gold in the text?” nature 483 (march 8, 2012): 124, http://dx.doi.org/10.1038/483124a.
28. richard van noorden, “trouble at the text mine,” nature 483 (march 8, 2012): 134–35.
29. claudio aspesi, a. rosso, and r. wielechowski, reed elsevier: is elsevier heading for a political train-wreck? 2012.
30. clark, text mining and scholarly publishing.
31. jill emery, “working in a text mine: is access about to go down?” journal of electronic resources librarianship 20, no. 3 (2009): 135–38, http://dx.doi.org/10.1080/19411260802412745.
32. clark, text mining and scholarly publishing: 14.
33. van noorden, “trouble at the text mine.”
34. ibid.
35. ibid.
36. jisc, “the model nesli2 licence for journals.”
37. smit and van der graaf, “journal article mining.”
38. van noorden, “trouble at the text mine.”
39. internal revenue service, “unrelated business income defined,” http://www.irs.gov/charities-&-non-profits/unrelated-business-income-defined.
40. clark, text mining and scholarly publishing: 10.
41. ibid: 14.
42. mcdonald and kelly, “the value and benefits of text mining.”
43. smit and van der graaf, “journal article mining.”
44. mcdonald and kelly, “the value and benefits of text mining.”
45. ibid.
46. smit and van der graaf, “journal article mining.”
47. mcdonald and kelly, “the value and benefits of text mining.”
48. british columbia electronic library network, bc eln database licensing framework, http://www.cdlib.org/services/collections/toolkit/.
49. “licensing toolkit,” california digital library, http://www.cdlib.org/services/collections/toolkit/.
50. jisc, “the model nesli2 licence for journals.”
51. mcdonald and kelly, “the value and benefits of text mining.”
52. “using biomed central’s open access full-text corpus for text mining research,” http://www.biomedcentral.com/about/datamining.
53. “help using this site,” plos, http://www.plosone.org/static/help.
54. iris kisjes, “university college london and elsevier launch ucl big data institute,” elsevier connect, press release, december 18, 2013, http://www.elsevier.com/connect/university-college-london-and-elsevier-launch-ucl-big-data-institute.
55. “gold in the text?”
56. richard van noorden, “elsevier opens its papers to text-mining,” nature 506 (february 2, 2014): 17.
57. sciverse, content apis, http://www.developers.elsevier.com/cms/content-apis.
58. “text and data mining,” elsevier, http://www.elsevier.com/about/universal-access/content-mining-policies.

animated subject maps for book collections

tim donahue

information technology and libraries | june 2013

abstract

of our two primary textual formats, articles by far have received the most fiscal and technological support in recent decades. meanwhile, our more traditional format, the book, seems in some ways to already be treated as a languishing symbol of the past.
the development of opacs and the abandonment of card catalogs in the 1980s and 1990s is the seminal evolution in print monograph access, but little else has changed. to help users locate books by call number and browse the collection by subject, animated subject maps were created. while the initial aim is a practical one, helping users to locate books and subjects, the subject maps also reveal the knowledge organization of the physical library, which they display in a way that can be meaningful to faculty, students, and other community members. we can do more with current technologies to assist and enrich the experience of users searching and browsing for books. the subject map is presented as an example of how we can do more in this regard.

lc classification, books, and library stacks

during the last few decades of technological evolution in libraries, we have helped facilitate a seismic shift from print-based to digital research. our library websites are jammed with electronic resources, digital collection components, database links, virtual reference assistance, online tutorials, and mobile apps. collection budgets too have shifted from a print to an electronic focus. many libraries are now spending less than 20 percent of their material budgets on print monographs. and yet, our stacks are still filled with books that often take up more than fifty percent of our library spaces.

knowledge organization schemas have also evolved in libraries. we have subject lists, reflecting current disciplines and majors in higher education, that help users decide which databases to select. internal database navigation continues to evolve in terms of limits, fields, and subject searching. web searching is based on the contemporary keyword approach where “everything is miscellaneous” and need not be organized, but nationwide, billions of books still sit on shelves according to dewey or library of congress classification systems that were initially developed over a century ago.
some say these organizing systems are woefully antiquated and do not reflect our contemporary post-modern realities, though they still amply serve their purpose of assigning call number locations for our books. we hear scant little of plans to update these classification schemes. why invest more time, energy, and resources on revamped organization schemes for libraries? the hathitrust now contains the scanned text of more than ten million books. google claims there are almost 130 million published titles in the world and intends to digitize all of them.1 what will happen to our physical book collections? how long will they reside on our library shelves? how long will they be located using the dewey and lc systems? is the library a shrinking organism?

tim donahue (tdonahue@montana.edu) is assistant professor/instruction librarian, montana state university, bozeman, mt.

profession-wide, there seems to be no concrete vision with regard to the future of our book collections. there is, of course, general acknowledgement that acquisition of e-books will increase as print acquisitions decrease and that, overall, print collections will accordingly shrink to reflect the growing digital nature of knowledge consumption. but for now and into the foreseeable future these billions of monographs remain on our shelves in the same locations our call number systems assigned to them decades ago. and while online library users are now able to utilize an array of electronic access delivery systems and web technologies for their article research and consumption, book seekers still need a call number.

books and articles have been our two primary textual formats for centuries. articles have moved into the digital realm more fleetly than their lengthier counterparts.
their briefer length, the cyclical serial publication process, and the evolution of database containment and access have enabled, in a relatively short time, a migration from print to primarily digital access. books, however, are accessed in much the same way they were a hundred years ago. the development of opacs in the 1980s and 1990s and abandonment of card catalogs is the seminal evolution in print monograph access, but little else has changed.2 once a call number is attained, the rest of the process remains physical, usually requiring pencil, paper, feet, sometimes a librarian, and a trip through the library until the object itself is found and pulled from the shelf. so while the process of article acquisition may employ a plethora of finding aids, keyword searching, database features, full text availability, and various delivery methods through our richly developed websites, beyond the opac and possibly a static online map, book seekers are on their own or need a librarian in what may seem a meaningless labyrinth of stacks and shelves. while the primary and most practical purpose of our classification schemes is to provide an assigned call number for book finding, these organizational outlines create an order to the layout of our stacks that maps a universe of knowledge within our library walls. this structure of knowledge reveals a meaning to our collections that includes the colocation of books by topic and proximity of related subjects. these features enhance the browsing process and often lead to the act of serendipitous discovery. to locate a book by call number, a user may consult library floor plans, which are typically limited to broad ranges or lc main classes, then rely on stack-end cards to home in on the exact stack location. to browse books by subject without using the catalog, a user typically must rely on a combination of floor plans and lc outline posters if they exist at all. 
often, informed browsing by subject cannot take place without a visit to the reference desk for mediation by a librarian. even then, many librarians are barely familiar with their book collection’s organizational structure and are reluctant to recommend broad subject browsing.

purpose and description of the subject map

to help users locate books by call number and browse the collection by subject, animated subject maps were created at skidmore college and montana state university. the maps display overhead views of library floors; users mouse over stacks to reveal the lc sub-classes located within. alternatively, they may browse and select lc subject headings to see which stacks contain them. the lc outline contains 21 main subject classes and 224 sub-classes, corresponding to the first two elements of a book call number. on stack mouse-over, three items are displayed: the call number by range, the main subject heading, and all sub-classes contained within the stack. when using the browse by subject option, users select and click an lc main class and the stacks where this class is located are highlighted.

while the initial aim is a practical one, helping users to locate books and subjects, the subject map also reveals the knowledge organization of the physical library, which it displays in a way that can be meaningful to faculty, students, and other community members. the map also provides local electronic access to the lc classification outline. at both institutions the maps are linked from prominent web locations and electronic points of need that are relevant and proximate to other book searching functions and tools.

figure 1. skidmore college subject map showing stack mouse-over display.

figure 2. montana state university subject map showing stack mouse-over display.
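the stack mouse-over behavior described above can be pictured as a small lookup table. the following is a minimal python sketch, not the flash actionscript used in the actual maps: the stack ids, floor numbers, and call number ranges are hypothetical, and the parsing handles only the class letters and class number at the start of an lc call number. the sub-class captions follow the published lc outline.

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class Stack:
    """One physical stack on the map (ids, floors, ranges are hypothetical)."""
    stack_id: str
    floor: int
    start: tuple       # (class letters, class number) of the first book shelved
    end: tuple         # (class letters, class number) of the last book shelved
    main_class: str    # LC main class caption
    subclasses: tuple  # (letters, caption) pairs shelved in this stack

# Illustrative data only; a real map would hold one entry per stack.
STACKS = [
    Stack("2-14", 2, ("BF", 1.0), ("BJ", 9999.0),
          "B: philosophy. psychology. religion",
          (("BF", "psychology"), ("BH", "aesthetics"), ("BJ", "ethics"))),
    Stack("2-15", 2, ("BL", 1.0), ("BQ", 9999.0),
          "B: philosophy. psychology. religion",
          (("BL", "religions. mythology. rationalism"), ("BM", "judaism"),
           ("BP", "islam. bahaism. theosophy, etc."), ("BQ", "buddhism"))),
]

# Leading class letters followed by the class number, e.g. "BQ4022".
CALL_NUMBER = re.compile(r"^\s*([A-Za-z]{1,3})\s*(\d+(?:\.\d+)?)")

def parse_call_number(call_number):
    """Reduce a call number to a sortable (letters, number) key."""
    match = CALL_NUMBER.match(call_number)
    if match is None:
        raise ValueError(f"not an LC call number: {call_number!r}")
    return (match.group(1).upper(), float(match.group(2)))

def find_stack(call_number, stacks=STACKS):
    """Return the stack whose range contains the call number, or None."""
    key = parse_call_number(call_number)
    for stack in stacks:
        if stack.start <= key <= stack.end:
            return stack
    return None

def rollover_text(stack):
    """The three items shown on mouse-over: range, main class, sub-classes."""
    lines = [f"{stack.start[0]}-{stack.end[0]}", stack.main_class]
    lines += [f"{letters}: {caption}" for letters, caption in stack.subclasses]
    return "\n".join(lines)
```

in this sketch, `find_stack("BQ4022 .S6 2001")` returns the hypothetical stack "2-15", and `rollover_text` of that stack lists the range, the main class, and the four sub-classes shelved there. cutter numbers and dates are ignored, which is enough to pick a stack but not a shelf.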
design rationale and methodology

the inspiration for the subject map started with a question: what if users could see on a map where individual subjects were located within the library? most library maps examined were limited to lc main classes or broad ranges denoting wide swaths of call numbers. including hundreds of lc subclasses would clutter a floor map beyond usability. but what if an online map contained each individual stack and only upon user-activation was the information revealed, saving space and avoiding clutter? such a map should be as devoid of congestion as possible and focus the user’s attention on library stack locations and lc classification.

working from existing maps and architectural blueprints of the library building, a basic perimeter of each floor was rendered using adobe illustrator and indesign software. these perimeters were then imported into adobe flash and a new .fla file created. library stacks were then measured, counted, and added as a separate layer within each floor perimeter. basic location elements such as stairways, elevators, and doors were added as reference points. each stack was then programmed as a button with basic rollover functionality. flash actionscript was coded so that the correct call number, main class, and sub-class information appear within the interface upon rollover activation. this functionality accounts for the stack searching ability of the subject map.

additionally, the lc outline was made searchable within the map so that users can mouse over subjects and, upon clicking, see what stacks contain those main classes. this functionality accounts for the subject searching ability of the map. left-hand navigation was built in so users can toggle between these two main search functions.
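the subject searching function amounts to inverting the stack data: instead of asking what a stack holds, the map asks which stacks hold a given main class. the python sketch below illustrates that inversion under stated assumptions. it is not the actionscript from the maps themselves, the stack ids and shelving assignments are hypothetical, and it relies only on the fact that the first letter of an lc sub-class names its main class.

```python
from collections import defaultdict

# Hypothetical shelving data: stack id -> LC sub-classes shelved there.
STACK_SUBCLASSES = {
    "2-14": ["BF", "BH", "BJ"],
    "2-15": ["BL", "BM", "BP", "BQ"],
    "3-01": ["BX", "C", "CB"],
    "3-02": ["CC", "CD", "CE"],
}

def build_subject_index(stack_subclasses):
    """Map each LC main class letter to the stacks that hold any of it."""
    index = defaultdict(list)
    for stack_id, subclasses in stack_subclasses.items():
        for subclass in subclasses:
            main_class = subclass[0]  # first letter identifies the main class
            if stack_id not in index[main_class]:
                index[main_class].append(stack_id)
    return dict(index)

def stacks_for_main_class(index, main_class):
    """Stacks to highlight when a user clicks a main class; empty if none."""
    return index.get(main_class.upper(), [])

if __name__ == "__main__":
    index = build_subject_index(STACK_SUBCLASSES)
    print(stacks_for_main_class(index, "c"))
```

building the index once from the same per-stack data used for rollovers would keep the two search functions consistent after a collection shift, since only the per-stack entries need editing.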
maintaining visual minimalism and simplicity was a priority and inclinations to render the map more comprehensively were resisted in order to maximize attention to subject and stack information. black, white, and gray colors were chosen to enhance the contrast of the map and aid the user’s eye for quick and clear use. other relevant links and instructional context were added to the left-hand navigation including links to the catalog, official lc outline, and library homepage. finally, after uploading to the local server and creating a simple url, links to the subject map were established in prominent and meaningful points of need within the library website.

user acceptance

once the subject map was completed and links to it were made public, a brief demonstration was provided for reference team members who began showing it to users at the reference desk. initial reaction was enthusiastic. students thought it was “cool” and enjoyed “playing with it.” one reported, “i didn’t know the library actually made sense like that. it’s neat to see the logic about where things are.” another student said, “now i can see where all the books on buddhism are!” faculty, too, were pleased. though faculty members typically know a little about lc classification, they are not accustomed to seeing it visualized and grafted onto their institutional library’s stacks. making transparent the intellectual organization of the library for other faculty can bolster their confidence in our order and structure. professors are often pleased to see their discipline’s place within our stacks and where related subjects are located.

the strongest praise for the subject map, however, comes from the sense of convenience it lends. many comments express appreciation for the ability to directly locate an individual book stack. because primary directional and finding elements like stairs and elevators are included in the maps, users are able to see the exact path that leads to the book they are seeking.
for those not interested in browsing, in a hurry, or challenged in terms of mobility, the subject map is a time and energy saver. some users, however, have reported frustration with the sensitivity required for the mouse-over functions. others desire a more detailed level of searching beyond the sub-class level. one user pointed out that the subject map was of no help to the blind.

multiple uses and internal applications

the primary use and most obvious application of the subject map is as a reference tool. as a front line finding aid, librarians and other public service staff at reference, circulation, or other help desks can easily and conveniently deploy the map to point users in the right direction and orient them to the book collection. in library instruction sessions, the subject map is not only a practical local resource worth pointing out, but also serves as an example of applied knowledge organization. when accompanying a demonstration of the library catalog, the map is not only a valuable finding aid, but adds a layer of meaning as well. students who understand the map are not only more able to browse and locate books, but also learn that a call number represents a detailed subject meaning as well as a locational device. used in conjunction with a tour, the map reinforces the layout of library shelves and helps to bridge the divide between electronic resources and physical retrieval. the subject map facilitates a concrete and visual introduction to the lc classification outline, a knowledge of which can be applied to most college and research libraries in the united states.

the subject map can also be of assistance with collection development. perusal of the map can reveal relative strengths and weaknesses within the collection. subject liaisons and bibliographers may use the map to home in on and visualize their assigned areas.
circulation staff and stacks maintenance workers find the map useful for book retrieval, shifting projects, and in the training and acclimation of new workers to the library. the subject map has proven to be a useful reference for library redesign and space planning considerations. at information fairs and promotional events where devices or projection screens are available, the map has served as a talking point and promotional piece of digital outreach. the map has been demonstrated by information science professors to lis graduate students as an example of applied knowledge organization in libraries. recently, a newly hired library dean commented that the map helped him “get to know the book collection” and familiarized him with the library.

figure 3. skidmore college subject map showing subject search display.

issues and challenges

in some libraries, books don’t move for decades. the same subjects may reside on the same shelves during an entire library’s lifetime. in this case, a subject map can be designed once and never edited. but, of course, most library buildings go through changes and evolutions. in many libraries, collection shifting seems to be ongoing. book collections wax and wane. certain subjects expand with their times, while others shrink into irrelevancy. weeding does not affect all subjects and stacks equally and adjustments to shelves and end cards are necessary. in addition to the transitions of weeding and shifting, sometimes whole floors are reconfigured. in the library commons era of the last few decades, substantial redesigns have been commonplace as book collections make way for computer stations and study spaces. in all these cases, adjustments and updates will be necessary to keep a subject map accurate. this is easily done by going back into the master .fla file and editing as needed.
in many cases only a stack or two need be adjusted, but in instances of major collection shifting some planning ahead may be necessary and more time allotted for redesign. shifting can be a complex spatial exercise and it is difficult to predict where subjects will realign exactly. subject map editing may have to wait until physical shifting is completed. it should be noted that each stack must be hand-coded separately. in libraries with hundreds of stacks this can seem a tedious and time-consuming design method.

both subject maps rely on adobe flash animation technology. flash is proprietary software, so the benefits of open source software cannot be utilized with subject maps at this time. further, adobe flash player software must be installed on a computer for the subject map to render. this has almost never been a problem, however, as flash player is ubiquitous and automatically installed on most public and private machines upon initial boot up. another concern relating to flash technology, however, is staff expertise. not every library has a flash designer or even someone who can implement the most fundamental flash capabilities. flash is not hard to learn and the subject maps utilize only its most basic functionalities, but still, for some it remains a niche software and many libraries will not have the resources to invest.

reaction to the live subject maps and the rollover interactivity they provide, though, has been so positive that more fully integrated flash maps have been proposed. why not have all physical elements of the library incorporated into one flash-enabled map? this is possible but may come at some expense to the functionality of the subject-rendering aspect of the maps. by limiting the application to stacks and lc classes, a user may remain more focused.
avoiding clutter, overcrowding, and a preponderance of choice is a design strategy that has gained much credibility in recent years.3 the subject map enjoys the usability success of clean design, limited purpose, and simple rendering. while demonstrating the potential of user-activated animation for other proposed library applications, the subject map might be best maintained as a limited specialty map.

a final concern regarding the long-term success of subject maps should be mentioned. how long will books remain in libraries? how long will they be organized by subject? when the physical arrangement and organization of information objects no longer exist in libraries, maps of any kind will seemingly lose all efficacy. but will libraries themselves exist in this future? whither books? whither libraries?

future developments

the most prominent and practical attribute of the subject map is its ability to show a user the exact stack where the book they are seeking is located. but in its current state as a stand-alone application, a user must obtain a call number from a catalog search, then open the subject map by going to its independent url. investigation is underway to determine what is necessary in order to integrate the subject map with the online catalog. in this scenario, a catalog item record might also display an embedded subject map that automatically highlights the floor and stack where the call number is located. this seemingly requires .swf files and flash actionscript to be embedded in catalog coding. one potential solution is to attribute an individual url to each stack rendering so that a get url function can be applied and embedded in each catalog item record. this synthesis of subject map and catalog poses a complex challenge but promises meaningful and time-saving results for the item retrieval process.

qr code technology in conjunction with subject map use is also being deployed.
by fixing qr codes on stack end cards that link to relevant sections of the lc outline, a researcher may use a mobile device to browse digitally and physically within the stacks at the same time. in this way a user may conduct digital subject browsing and physical item browsing simultaneously. the urls linked to by qr coding contain detailed lc sub-levels not contained within the subject map, which is limited to the level of sub-class. the active discovery of new knowledge facilitated by exploiting preexisting lc organization inside library stacks in real time can be quite impressive when experienced firsthand.

another development exploiting lc knowledge organization is in beta mode at this time. an lc search database has been created allowing users to enter words and find matching lc subject terminology. potentially, this database could be merged with the subject map, allowing users to correlate subject word search with physical locations independent of call numbers.

despite its intent as a limited specialty map, possibilities are also being explored to incorporate the subject map into a more fully integrated library map. one way forward in this regard is to create map layers that could be toggled on and off by users. in this way, the subject map could exist as its own layer, maintaining its clarity and integrity when isolated but integrated when viewed with other layers. flash technology excels at allowing such layer creation.

other stack maps and related technologies

searching the web for “subject map” and related terminology such as stack, shelf, book, and lc maps does turn up various efforts and approaches to organizing and exploiting classification scheme data, but no animated, user-activated maps are found.
similar searches across library and information science literature turn up some explorative research on the possibilities of mapping lc data, but again no animated stack maps are found.4 there is a product licensed by bowker inc. called stackmap that can be linked to catalog search results. when a user clicks on the map link next to a call number result, a map is displayed with the destination stack highlighted, but the information provided is locational only. stackmap is not animated or user-activated. no subject information is given and the map offers no browsing features.

since the release of html5, we are beginning to see more animation on the web that is not flash-driven. steve jobs and apple’s determined refusal to run flash on their mobile devices has motivated many to seek other animation options. new html5 animation tools such as adobe edge, hippo animator, and hype offer promising starts at dislodging the flash grip on web animation, but they have far to go and do not yet offer either the ease of design or the range of creative possibilities of flash. building an animated subject map with html5 alone does not seem possible at this time.

universal applicability of the subject map

so far, subject maps have been created for two very different libraries. the commonality shared between the montana state university and skidmore college libraries is their possession of hundreds of thousands of books in stacks shelved by the lc classification system. this is a trait shared by nearly all college and research libraries. subject maps can be easily structured on the dewey decimal system as well so that public libraries could benefit from their functionality, making the subject map appropriate and creatable for more than 12,000 libraries.5

of our two primary textual formats, articles by far have received the most fiscal and technological support in recent decades.
article searching and retrieval continues to evolve through the rich implementation of assets such as locally constructed resource management tools, independent journal title searches, complexly designed database search interfaces, and dedicated electronic resource librarians. meanwhile, our more traditional format, the book, seems in some ways to already be treated as a languishing symbol of the past. because its future is uncertain, does that justify our neglect in the present? as a profession we seem a bit complacent about the state of our book collections. why dedicate our technical resources to a format that is on the way out? but has the book disappeared yet? as we make room for more student lounges, coffee bars, computer stations, writing labs, and information commons, we should carefully ask what makes a library special. good books and the focused, sustained treatment of knowledge they contain are part of the correct answer, symbolically and as yet, practically speaking.

while our books still occupy our library shelves, shouldn’t they also fully benefit from the ongoing technological explosion through which we continue to evolve? opacs haven’t evolved much in recent years. in fact they seem quite stymied to many librarians and users. we can do more with current technologies to assist and enrich the experience of users searching and browsing for books. we hope the subject map is an example of how we can do more in this regard.

while we have grown accustomed to increasingly look forward in order to position our libraries for the future, we should also remember to sometimes look back. our classification systems and book collections are assets built from the past that represent many decades of great labor, investment, and achievement. more than 12,000 public and academic libraries together make up one of our greatest national treasures and bulwarks of living democracy.
libraries are among the dearest valued assets in any of our states. many of the most beautiful buildings in our nation are libraries. based on library insurance values and estimated replacement costs, library buildings and the collections they hold amount cumulatively to hundreds of billions of dollars of worth.6 this astounding worth is figured mainly from the buildings themselves and the books they contain.

a few have commented that there is some aesthetic quality to the subject maps. if this is true, the appeal comes from the synthesis of architectural form and the universe of knowledge revealed within, from the beauty of libraries both real and ideal, from physical and mental constructions unified. animated subject maps can help bring the physical and intellectual beauty of libraries into the digital realm, but the main appeal is a practical one: to point the user directly to the book or subject they are seeking.

so in conclusion, perhaps we should measure the subject map’s potential in the light of ranganathan’s five laws of library science:7

1. books are for use.
2. every reader his [or her] book.
3. every book its reader.
4. save the time of the reader.
5. the library is a growing organism.

the subject maps can be found at the following urls:

skidmore college subject map: http://lib.skidmore.edu/includes/files/subjectmaps/subjectmap.swf
montana state university subject map: www.lib.montana.edu/subjectmap

references

1. google, “google books library project—an enhanced card catalog of the world’s books,” http://books.google.com/googlebooks/library.html, accessed november 8, 2012.
2. antonella iacono, “opac, users, web. future developments for online library catalogues,” bollettino aib 50, no. 1–2 (2010): 69–88, http://bollettino.aib.it/article/view/5296.
3. geoffrey little, “where are you going, where have you been? the evolution of the academic library web site,” the journal of academic librarianship 38, no. 2 (2012): 123–25, doi:10.1016/j.acalib.2012.02.005.
4. kwan yi and lois mai chan, “linking folksonomy to library of congress subject headings: an exploratory study,” journal of documentation 65, no. 6 (2009): 872–900, doi:10.1108/00220410910998906.
5. american library association, “number of libraries in the united states, ala library fact sheet 1,” www.ala.org/tools/libfactsheets/alalibraryfactsheet01.
6. edward marman, “a method for establishing a depreciated monetary value for print collections,” library administration and management 9, no. 2 (1995): 94–98.
7. s. r. ranganathan, the five laws of library science (new delhi: ess ess, 2006), http://hdl.handle.net/2027/mdp.39015073883822.

president’s message
thomas dowling

information technology and libraries | december 2015
doi: 10.6017/ital.v34i4.9150

the lita governing board has had a productive autumn, and i wanted to share a few highlights. keeping an eye on how better to understand and improve the member experience, we have a couple of new groups getting down to work.

lita local task force

i'm writing this shortly after returning from lita forum 2015, which was a fantastic meeting. i'm glad that so many people were able to attend, and i hope even more will come to forum 2016. but we know many members cannot regularly travel to national meetings, and even the best online experience can lack the serendipitous benefits that so often come from face-to-face meetings.
the new lita local task force will be responsible for creating a toolkit to facilitate local groups’ ability to host events, including information on event planning, accessibility, and ensuring an inclusive culture at meetings. so you’ll be able to host a lita event in your own backyard! (if your backyard has a couple of meeting rooms and good wireless.)

forum assessment and alternatives task force

as we begin work on lita local events, we are also turning our eyes to our national meeting. planning the next lita forum is essentially a year-round process. we assess the work we’ve done on previous forums, of course, but the annual schedule often doesn’t afford an opportunity to strategically rethink what forum is and how it can best serve the members. to address that issue, we’re convening another new task force, on forum assessment and alternatives. this group will look critically at how forum advances our strategic priorities, and will also look at other library technology conferences to help identify how forum can continue to distinguish itself in a rapidly changing environment.

lita personas task force

finally, as i write this, the board is in the final stages of creating a personas task force as a tool for better understanding our current and potential new members. a well-constructed set of personas, representing both people who are lita members and people who aren’t, but who could be or should be, will become a valuable tool for membership development, programming, communications, assessment, and other purposes.

each of these task forces will work throughout 2016 and deliver their results by midwinter 2017. it is worth noting that we could only convene these groups because we have a strong list of volunteers on tap. if you haven’t filled out a lita volunteer form recently, please consider doing so at http://www.ala.org/lita/about/committees.

thomas dowling (dowlintp@wfu.edu) is lita president 2015-16 and director of technologies, z.
smith reynolds library, wake forest university, winston-salem, north carolina. http://www.ala.org/lita/about/committees mailto:dowlintp@wfu.edu lita local task force forum assessment and alternatives task force lita personas task force 2 information technology and libraries | march 2011 program that would provide for educating mentees about lita, sharing of areas of expertise and awareness, and develop a network of professionals. dialogue on the lita electronic discussion list and conversations with committee and interest group chairs suggests a desire and need for leadership training. the membership development committee is addressing the need for mentors in lita 101 and lita 201 held at ala annual conferences and midwinter meetings. lita leadership, including the membership development committee, committee and interest group chairs, the education committee, lita emerging leaders, and others, will be included in an ongoing dialogue to see how and what can be implemented from the lita leadership institute and the lita mentorship program recommendations as submitted by the 2009 emerging leaders team t. follow-up by lita to implement the recommendations of emerging leader projects is important to the vitality and longevity of the association. since 2007, a number of projects have been developed by emerging leaders. information about the projects is available at the following locations online: ■■ the ala website: http://www.ala.org/ala/educationcareerleader ship/emergingleaders/index.cfm ■■ ala connect: http://connect.ala.org/emergingleaders ■■ facebook: http://www.facebook.com/pages/ala-emerging -leaders/156736295251?ref=ts/ ■■ the emerging leaders blog: http://connect.ala.org/2011emergingleaders ■■ the emerging leaders wiki: http://emergingleaders.ala.org/wiki/index.php ?title =main_page i n 2006, ala president leslie burger implemented six initiatives, including an emerging leaders program that is now in its fifth year. 
the initiative was designed to prepare librarians who are new to the profession in leadership skills that are applicable on the job and as active leaders within the association. lita is sponsoring 2011 emerging leaders bohyun kim and andreas orphanides. bohyun is currently digital access librarian at the florida international university medical library. andreas is currently librarian for digital technologies and learning at the north carolina state university libraries. as of the writing of this column, the projects for 2011 have not been assigned. additional lita members accepted into the 2011 ala emerging leaders program include tabatha farney, deana greenfield, amanda harlan, colleen harris, megan hodge, matthew jabaily, catherine kosturski, nicole pagowsky, casey schacher, sibyl schaefer, jessica sender, and andromeda yelton. lita provides an ideal environment for its members to enhance their skills. in 2009, emerging leaders team t developed a project “making it personal: leadership development programs for lita,” working in consultation with the lita membership development committee. team members included amanda hornby (university of washington), angelica guerrero fortin (san diego county library), dan overfield (cuyahoga community college), and lisa carlucci thomas (yale university). the team t members recommended the creation of “an online continuing education program to develop the leadership and project management skills necessary to maintain and promote the value and ability of lita’s professional membership to the greater librarian population.” outcomes for the training would include project-management and team-building skills within a context that focuses on the development and application of technology in libraries. the team members also recommended the establishing of a lita mentorship karen j. 
starr (kstarr@nevadaculture.org) is lita president 2010–11 and assistant administrator for library and development services, nevada state library and archives, carson city.

karen j. starr
president’s message: membership, leadership, emerging leaders, and lita

8 information technology and libraries | march 2010

t. michael silver
monitoring network and service availability with open-source software

silver describes the implementation of a monitoring system using an open-source software package to improve the availability of services and reduce the response time when troubles occur. he provides a brief overview of the literature available on monitoring library systems, and then describes the implementation of nagios, an open-source network monitoring system, to monitor a regional library system’s servers and wide area network. particular attention is paid to using the plug-in architecture to monitor library services effectively. the author includes example displays and configuration files.

editor’s note: this article is the winner of the lita/ex libris writing award, 2009.

library it departments have an obligation to provide reliable services both during and after normal business hours. the it industry has developed guidelines for the management of it services, but the library community has been slow to adopt these practices. the delay may be attributed to a number of factors, including a dependence on vendors and consultants for technical expertise, a reliance on librarians who have little formal training in it best practices, and a focus on automation systems instead of infrastructure. larger systems that employ dedicated it professionals to manage the organization’s technology resources likely implement best practices as a matter of course and see no need to discuss them within the library community.

in the practice of system and network administration, thomas a. limoncelli, christine j. hogan, and strata r.
chalup present a comprehensive look at best practices in managing systems and networks. early in the book they provide a short list of first steps toward improving it services, one of which is the implementation of some form of monitoring. they point out that without monitoring, systems can be down for extended periods before administrators notice or users report the problem.1 they dedicate an entire chapter to monitoring services. in it, they discuss the two primary types of monitoring: real-time monitoring, which provides information on the current state of services, and historical monitoring, which provides long-term data on uptime, use, and performance.2 while the software discussed in this article provides both types of monitoring, i focus on real-time monitoring and the value of problem identification and notification.

service monitoring does not appear frequently in library literature, and what is written often relates to single-purpose custom monitoring. an article in the september 2008 issue of ital describes the development and deployment of a wireless network, including a perl script written to monitor the wireless network and associated services.3 the script updates a webpage to display the results and sends an e-mail notifying staff of problems. an enterprise monitoring system could perform these tasks and present the results within the context of the complete infrastructure. doing so would require advanced features because of the segregation of networks discussed in that article, but it would take little or no more effort than writing the single-purpose script. dave pattern at the university of huddersfield shared another perl script that monitors opac functionality.4 again, the script provided a single-purpose monitoring solution that could be integrated within a larger model.
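a check of this kind does more than confirm that a page loads: it submits a search with a known answer and inspects what comes back. the python sketch below is not pattern's perl script and not a shipped nagios plug-in. it only illustrates the evaluation step, assuming a hypothetical results page that contains text such as "42 results"; the function name, the pattern, and the expected minimum are all illustrative. the numeric states follow the usual nagios plug-in convention (0 ok, 1 warning, 2 critical, 3 unknown).

```python
import re

# exit codes used by nagios plug-ins
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# hypothetical pattern: assumes the results page contains text like "42 results"
RESULT_COUNT = re.compile(r"(\d+)\s+results", re.IGNORECASE)


def evaluate_search(status_code, body, expected_min):
    """classify the response to one known search; returns (state, message)."""
    if status_code != 200:
        # the web server itself reported a problem
        return CRITICAL, f"CRITICAL - http status {status_code}"
    match = RESULT_COUNT.search(body)
    if match is None:
        # the page loaded, but nothing on it looks like a result count
        return UNKNOWN, "UNKNOWN - no result count found in page"
    count = int(match.group(1))
    if count < expected_min:
        # the page loaded, but the search behind it is not working
        return CRITICAL, f"CRITICAL - {count} results, expected at least {expected_min}"
    return OK, f"OK - {count} results"
```

a real plug-in would fetch the page, print the message, and exit with the state; keeping the evaluation in a pure function makes it easy to test against saved responses.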
below, i discuss how i modified his script to provide more meaningful monitoring of our opac than the stock webpage monitoring plug-in included with our open-source network monitoring system, nagios.

service monitoring can consist of a variety of tests. in its simplest form, a ping test will verify that a host (server or device) is powered on and successfully connected to the network. feher and sondag used ping tests to monitor the availability of the routers and access points on their network, as do i for monitoring connectivity to remote locations.5 a slightly more meaningful check would test for the establishment of a connection on a port. feher and sondag used this method to check the daemons in their network.6 a step further would be to evaluate a service response, for example checking the status code returned by a web server. evaluating content forms the next level of meaning. limoncelli, hogan, and chalup discuss end-to-end monitoring, where the monitoring system actually performs meaningful transactions and evaluates the results.7 pattern’s script, mentioned above, tests opac functionality by submitting a known keyword search and evaluating the response.8 i implemented this after an incident where nagios failed to alert me to a problem with the opac. the web server returned a status code of 200 to the request for the search page. users, however, want more from an opac, and attempts to search were unsuccessful because of problems with the index server. modifying pattern’s original script, i was able to put together a custom check command that verifies a greater level of functionality by evaluating the number of results for the known search.

t. michael silver (michael.silver@ualberta.ca) is an mlis student, school of library and information studies, university of alberta, edmonton, alberta, canada.

■ software selection

limoncelli, hogan, and chalup do not address specific
how-to issues and rarely mention specific products. their book provides the foundational knowledge necessary to identify what must be done. in terms of monitoring, they leave the selection of an appropriate tool to the reader.9 myriad monitoring tools exist, both commercial and open-source. some focus on network analysis, and some even target specific brands or model lines. the selection of a specific software package should depend on the services being monitored and the goals for the monitoring. wikipedia lists thirty-five different products, of which eighteen are commercial (some with free versions with reduced functionality or features); fourteen are open-source projects under a general public license or similar license (some with commercial support available but without different feature sets or licenses); and three offer different versions under different licenses.10 von hagen and jones suggest two of them: nagios and zabbix.11

i selected the nagios open-source product (http://www.nagios.org). the software has an established history of active development, a large and active user community, a significant number of included and user-contributed extensions, and multiple books published on its use. commercial support is available from a company founded by the creator and lead developer as well as other authorized solution providers. monitoring appliances based on nagios are available, as are sensors designed to interoperate with nagios. because of the flexibility of a software design that uses a plug-in architecture, service checks for library-specific applications can be implemented. if a check or action can be scripted using practically any protocol or programming language, nagios can monitor it. nagios also provides a variety of information displays, as shown in appendixes a–e.

■ installation

the nagios system provides an extremely flexible solution to monitor hosts and services.
the object-orientation and use of plug-ins allow administrators to monitor any aspect of their infrastructure or services using standard plug-ins, user-contributed plug-ins, or custom scripts. additionally, the open-source nature of the package allows independent development of extensions to add features or integrate the software with other tools. community sites such as monitoringexchange (formerly nagios exchange), nagios community, and nagios wiki provide repositories of documentation, plug-ins, extensions, and other tools designed to work with nagios.12 but that flexibility comes at a cost: nagios has a steep learning curve, and user-contributed plug-ins often require the installation of other software, most notably perl modules.

nagios runs on a variety of linux, unix, and berkeley software distribution (bsd) operating systems. for testing, i used a standard linux server distribution installed on a virtual machine. virtualization provides an easy way to test software, especially if an alternate operating system is needed. if given sufficient resources, a virtual machine is capable of running the production instance of nagios. after installing and updating the operating system, i installed the following packages:
■ apache web server
■ perl
■ gd development library, needed to produce graphs and status maps
■ libpng-devel and libjpeg-devel, both needed by the gd library
■ gcc and gnu make, which are needed to compile some plug-ins and perl modules

most major linux and bsd distributions include nagios in their software repositories for easy installation using the native package management system. although the software in the repositories is often not the most recent version, using these repositories simplifies the installation process. if a reasonably recent version of the software is available from a repository, i will install from there. some software packages are either outdated or not available, and i manually install these.
detailed installation instructions are available on the nagios website, in several books, and on the previously mentioned websites.13 the documentation for version 3 includes a number of quick-start guides.14 most package managers will take care of some of the setup, including modifying the apache configuration file to create an alias available at http://server.name/nagios. i prepared the remainder of this article using the latest stable versions of nagios (3.0.6) and the plug-ins (1.4.13) at the time of writing.

■ configuration

nagios configuration relies on an object model, which allows a great deal of flexibility but can be complex. planning your configuration beforehand is highly recommended. nagios has two main configuration files, cgi.cfg and nagios.cfg. the former is primarily used by the web interface to authenticate users and control access, and it defines whether authentication is used and which users can access what functions. the latter is the main configuration file and controls all other program operations. the cfg_file and cfg_dir directives allow the configuration to be split into manageable groups using additional resource files and the object definition files (see figure 1). the flexibility offered allows a variety of different structures. i group network devices into groups but create individual files for each server.

nagios uses an object-oriented design. the objects in nagios are displayed in table 1. a complete review of nagios configuration is beyond the scope of this article. the documentation installed with nagios covers it in great detail. special attention should be paid to the concepts of templates and object inheritance as they are vital to creating a manageable configuration. the discussion below provides a brief introduction, while appendixes f–j provide concrete examples of working configuration files.
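because templates and inheritance carry so much of the configuration, a small illustration may help. the python below is not nagios code and ignores nagios's special cases. it only imitates the basic rule that a template's directives are read first and an object's own directives override them. the generic-host and linux-server names and their notification_period values come from appendix f; the opac entry and its check_interval value are hypothetical.

```python
def resolve(name, objects):
    """return the effective directives for one object definition."""
    own = objects[name]
    merged = {}
    parent = own.get("use")
    if parent is not None:
        # directives inherited through the 'use' chain are read first
        merged.update(resolve(parent, objects))
    # the object's own directives override anything inherited
    merged.update({key: value for key, value in own.items() if key != "use"})
    return merged


objects = {
    "generic-host": {"notification_period": "24x7", "notifications_enabled": "1"},
    "linux-server": {"use": "generic-host", "notification_period": "workhours",
                     "check_interval": "5"},
    # hypothetical host definition, for illustration only
    "opac": {"use": "linux-server", "check_interval": "10"},
}

effective = resolve("opac", objects)
print(effective)
```

in this sketch the opac host ends up with notification_period workhours (linux-server overrides generic-host), notifications_enabled 1 (inherited unchanged), and its own check_interval of 10.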
■ cgi.cfg

the cgi.cfg file controls the web interface and its associated cgi (common gateway interface) programs. during testing, i often turn off authentication by setting use_authentication to 0 if the web interface is not accessible from the internet. there also are various configuration directives that provide greater control over which users can access which features. the users are defined in the /etc/nagios/htpasswd.users file. a summary of commands to control entries is presented in table 2.

the web interface includes other features, such as sounds, status map displays, and integration with other products. discussion of these directives is beyond the scope of this article. the cgi.cfg file provided with the software is well commented, and the nagios documentation provides additional information. a number of screenshots from the web interface are provided in the appendixes, including status displays and reporting.

■ nagios.cfg

the nagios.cfg file controls the operation of everything except the web interface. although it is possible to have a single monolithic configuration file, organizing the configuration into manageable files works better. the two main directives of note are cfg_file, which defines a single file that should be included, and cfg_dir, which includes all files in the specified directory with a .cfg extension. a third type of file that gets included is resource.cfg, which defines various macros for use in commands.

organizing the object files takes some thought. i monitor more than one hundred services on roughly seventy hosts, so the method of organizing the files was of more than academic interest.
i use the following configuration files:
■ commands.cfg, containing command definitions
■ contacts.cfg, containing the list of contacts and associated information, such as e-mail address (see appendix h)
■ groups.cfg, containing all groups: hostgroups, servicegroups, and contactgroups (see appendix g)
■ templates.cfg, containing all object templates (see appendix f)
■ timeperiods.cfg, containing the time ranges for checks and notifications

all devices and servers that i monitor are placed in directories using the cfg_dir directive:
servers: contains server configurations. each file includes the host and service configurations for a physical or virtual server.
devices: contains device information. i create individual files for devices with service monitoring that goes beyond simple ping tests for connectivity. devices monitored solely for connectivity are grouped logically into a single file. for example, we monitor connectivity with fifty remote locations, and all fifty of them are placed in a single file.

table 1. nagios objects

object           used for
hosts            servers or devices being monitored
hostgroups       groups of hosts
services         services being monitored
servicegroups    groups of services
timeperiods      scheduling of checks and notifications
commands         checking hosts and services; notifying contacts; processing performance data; event handling
contacts         individuals to alert
contactgroups    groups of contacts

figure 1. nagios configuration relationships. copyright © 2009 ethan galstad, nagios enterprises. used with permission.

the resource.cfg file uses two macros to define the path to plug-ins and event handlers. thirty other macros are available. because the cgi programs do not read the resource file, restrictive permissions can be applied to it, enabling some of the macros to be used for usernames and passwords needed in check commands.
placing sensitive information in service configurations exposes them to the web server, creating a security issue.

■ configuration

the appendixes include the object configuration files for a simple monitoring situation. a switch is monitored using a simple ping test (see appendix j), while an opac server on the other side of the switch is monitored for both web and z39.50 operations (see appendix i). note that the opac configuration includes a parents directive that tells nagios that a problem with the gateway-switch will affect connectivity with the opac server. i monitor fifty remote sites. if my router is down, a single notification regarding my router provides more information if it is not buried in a storm of notifications about the remote sites.

the web port, web service, and opac search services demonstrate different levels of monitoring. the web port simply attempts to establish a connection to port 80 without evaluating anything beyond a successful connection. the web service check requests a specific page from the web server and evaluates only the status code returned by the server. it displays a warning because i configured the check to download a file that does not exist. the web server is running because it returns an error code, hence the warning status. the opac search uses a known search to evaluate the result content, specifically whether the correct number of results is returned for a known search.

i used a number of templates in the creation of this configuration. templates reduce the amount of repetitive typing by allowing the reuse of directives. templates can be chained, as seen in the host templates. the opac definition uses the linux-server template, which in turn uses the generic-host template. the host definition inherits the directives of the template it uses, overriding any elements in both and adding new elements. in practical terms, generic-host directives are read first. linux-server directives are applied next.
if there is a conflict, the linux-server directive takes precedence. finally, opac is read. again, any conflicts are resolved in favor of the last configuration read, in this case opac.

■ plug-ins and service checks

the nagios plugins package provides numerous plug-ins, including the check-host-alive, check_ping, check_tcp, and check_http commands. using the plug-ins is straightforward, as demonstrated in the appendixes. most plug-ins will provide some information on use if executed with --help supplied as an argument to the command. by default, the plug-ins are installed in /usr/lib/nagios/plugins. some distributions may install them in a different directory.

the plugins folder contains a subfolder with user-contributed scripts that have proven useful. most of these plug-ins are perl scripts, many of which require additional perl modules available from the comprehensive perl archive network (cpan). the check_hip_search plug-in (appendix k) used in the examples requires additional modules. installing perl modules is best accomplished using the cpan perl module. detailed instructions on module installation are available online.15 some general tips:
■ gcc and make should be installed before trying to install perl modules, regardless of whether you are installing manually or using cpan. most modules are provided as source code, which may require compiling before use. cpan automates this process but requires the presence of these packages.
■ alternately, many linux distributions provide perl module packages. using repositories to install usually works well assuming the repository has all the needed modules. in my experience, that is rarely the case.

table 2.
sample commands for managing the htpasswd.users file

create or modify an entry, with password entered at a prompt:
    htpasswd /etc/nagios/htpasswd.users <username>
create or modify an entry using password from the command line:
    htpasswd -b /etc/nagios/htpasswd.users <username> <password>
delete an entry from the file:
    htpasswd -D /etc/nagios/htpasswd.users <username>

■ many modules depend on other modules, sometimes requiring multiple install steps. both cpan and distribution package managers usually satisfy dependencies automatically. manual installation requires the installer to satisfy the dependencies one by one.
■ most plug-ins provide information on required software, including modules, in a readme file or in the source code for the script. in the absence of such documentation, running the script on the command line usually produces an error containing the name of the missing module.
■ testing should be done using the nagios user. using another user account, especially the root user, to create directories, copy files, and run programs creates folders and files that are not accessible to the nagios user. the best practice is to use the nagios user for as much of the configuration and testing as possible. the lists and forums frequently include questions from new users who have successfully installed, configured, and tested nagios as the root user and are confused when nagios fails to start or function properly.

■ advanced topics

once the system is running, more advanced features can be explored. the documentation describes many such enhancements, but the following may be particularly useful depending on the situation.
■ nagios provides access control through the combination of settings in the cgi.cfg and htpasswd.users files. library administration and staff, as well as patrons, may appreciate the ability to see the status of the various systems.
however, care should be taken to avoid disclosing sensitive information regarding the network or passwords, or allowing access to cgi programs that perform actions.
■ nagios permits the establishment of dependency relationships. host dependencies may be useful in some rare circumstances not covered by the parent–child relationships mentioned above, but service dependencies provide a method of connecting services in a meaningful manner. for example, certain opac functions are dependent on ils services. defining these relationships takes both time and thought, which may be worthwhile depending on any given situation.
■ event handlers allow nagios to initiate certain actions after a state change. if nagios notices that a particular service is down, it can run a script or program to attempt to correct the problem. care should be taken when creating these scripts as service restarts may delete or overwrite information critical to solving a problem, or worsen the actual situation if an attempt to restart a service or reboot a server fails.
■ nagios provides notification escalations, permitting the automatic notification of problems that last longer than a certain time. for example, a service escalation could send the first three alerts to the admin group. if properly configured, the fourth alert would be sent to the managers group as well as the admin group. in addition to escalating issues to management, this feature can be used to establish a series of responders for multiple on-call personnel.
■ nagios can work in tandem with remote machines. in addition to custom scripts using secure shell (ssh), the nagios remote plug-in executor (nrpe) add-on allows the execution of plug-ins on remote machines, while the nagios service check acceptor (nsca) add-on allows a remote host to submit check results to the nagios server for processing.
implementing nagios on the feher and sondag wireless network mentioned earlier would require one of these options because the wireless network is not accessible from the external network. these add-ons also allow for distributed monitoring, sharing the load among a number of servers while still providing the administrators with a single interface to the entire monitored network. the nagios exchange (http://exchange.nagios.org/) contains similar user-contributed programs for windows.
■ nagios can be configured to provide redundant or failover monitoring. limoncelli, hogan, and chalup call this metamonitoring and describe when it is needed and how it can be implemented, suggesting self-monitoring by the host or having a second monitoring system that only monitors the main system.16 nagios permits more complex configurations, allowing for either two servers operating in parallel, only one of which sends notifications unless the main server fails, or two servers communicating to share the monitoring load.
■ alternative means of notification increase access to information on the status of the network. i implemented another open-source software package, quickpage, which allows nagios text messages to be sent from a computer to a pager or cell phone.17 appendix l shows a screenshot of a firefox extension that displays host and service problems in the status bar of my browser and provides optional audio alerts.18 the nagios community has developed a number of alternatives, including specialized web interfaces and rss feed generators.19

■ appropriate use

monitoring uses bandwidth and adds to the load of machines being monitored. accordingly, an it department should only monitor its own servers and devices, or those for which it has permission to do so. imagine what would happen if all the users of a service such as worldcat started monitoring it!
the additional load would be noticeable and could conceivably disrupt service. aside from reasons connected with being a good “netizen,” monitoring appears similar to port-scanning, a technique used to discover network vulnerabilities. an organization that blithely monitors devices without the owner’s permission may find its traffic throttled back or blocked entirely. if a library has a definite need to monitor another service, obtaining permission to do so is a vital first step. if permission is withheld, the service level agreement between the library and its service provider or vendor should be reevaluated to ensure that the provider has an appropriate system in place to respond to problems.

■ benefits

the system-administration books provide an accurate overview of the benefits of monitoring, but personally reaping those benefits provides a qualitative background to the experience. i was able to justify the time spent on setting up monitoring the first day of production. one of the available plug-ins monitors sybase database servers. it was one of the first contributed plug-ins i implemented because of past experiences with our production database running out of free space, causing the system to become nonfunctional. this happened twice, approximately a year apart. each time, the integrated library system was down while the vendor addressed the issue. when i enabled the sybase service checks, nagios immediately returned a warning for the free space. the advance warning allowed me to work with the vendor to extend the database volume with no downtime for our users. that single event convinced the library director of the value of the system. since that time, nagios has proven its worth in alerting it staff to problem situations, providing information on outage patterns both for in-house troubleshooting and discussions with service providers.
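the free-space warning described above follows a pattern common to nagios plug-ins: measure a value, compare it with warning and critical thresholds, and report the result as one of the standard states (0 ok, 1 warning, 2 critical, 3 unknown). the python sketch below shows only that comparison. it is not the sybase plug-in mentioned in the text, and the function name, units, and threshold values are assumptions made for illustration.

```python
# exit codes used by nagios plug-ins
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def check_free_space(free_mb, warn_mb, crit_mb):
    """compare free space with two thresholds; returns (state, message)."""
    if free_mb is None:
        # the measurement itself failed, so no judgment is possible
        return UNKNOWN, "UNKNOWN - free space could not be read"
    if free_mb <= crit_mb:
        state = CRITICAL
    elif free_mb <= warn_mb:
        state = WARNING
    else:
        state = OK
    return state, f"{LABELS[state]} - {free_mb} MB free"


# hypothetical thresholds: warn at 200 MB free, critical at 50 MB free
state, message = check_free_space(150, warn_mb=200, crit_mb=50)
print(message)
```

a real plug-in would obtain the measurement from the database server, print the message, and exit with the state so that nagios can act on it. setting the warning threshold well above the critical one is what buys the advance notice described in the text.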
■ conclusion

monitoring systems and services provides it staff with a vital tool in providing quality customer service and managing systems. installing and configuring such a system involves a learning curve and takes both time and computing resources. my experiences with nagios have convinced me that the return on investment more than justifies the costs.

references

1. thomas a. limoncelli, christina j. hogan, and strata r. chalup, the practice of system and network administration, 2nd ed. (upper saddle river, n.j.: addison-wesley, 2007): 36.
2. ibid., 523–42.
3. james feher and tyler sondag, “administering an open-source wireless network,” information technology & libraries 27, no. 3 (sept. 2008): 44–54.
4. dave pattern, “keeping an eye on your hip,” online posting, jan. 23, 2007, self-plagiarism is style, http://www.daveyp.com/blog/archives/164 (accessed nov. 20, 2008).
5. feher and sondag, “administering an open-source wireless network,” 45–54.
6. ibid., 48, 53–54.
7. limoncelli, hogan, and chalup, the practice of system and network administration, 539–40.
8. pattern, “keeping an eye on your hip.”
9. limoncelli, hogan, and chalup, the practice of system and network administration, xxv.
10. “comparison of network monitoring systems,” wikipedia, the free encyclopedia, dec. 9, 2008, http://en.wikipedia.org/wiki/comparison_of_network_monitoring_systems (accessed dec. 10, 2008).
11. william von hagen and brian k. jones, linux server hacks, vol. 2 (sebastopol, calif.: o’reilly, 2005): 371–74 (zabbix), 382–87 (nagios).
12. monitoringexchange, http://www.monitoringexchange.org/ (accessed dec. 23, 2009); nagios community, http://community.nagios.org (accessed dec. 23, 2009); nagios wiki, http://www.nagioswiki.org/ (accessed dec. 23, 2009).
13. “nagios documentation,” nagios, mar. 4, 2008, http://www.nagios.org/docs/ (accessed dec.
8, 2008); david josephsen, building a monitoring infrastructure with nagios (upper saddle river, n.j.: prentice hall, 2007); wolfgang barth, nagios: system and network monitoring, u.s. ed. (san francisco: open source press; no starch press, 2006).
14. ethan galstad, “nagios quickstart installation guides,” nagios 3.x documentation, nov. 30, 2008, http://nagios.sourceforge.net/docs/3_0/quickstart.html (accessed dec. 3, 2008).
15. the perl directory (http://www.perl.org/) contains complete information on perl. specific information on using cpan is available in “how do i install a module from cpan?” perlfaq8, nov. 7, 2007, http://perldoc.perl.org/perlfaq8.html (accessed dec. 4, 2008).
16. limoncelli, hogan, and chalup, the practice of system and network administration, 539–40.
17. thomas dwyer iii, qpage solutions, http://www.qpage.org/ (accessed dec. 9, 2008).
18. petr šimek, “nagioschecker,” google code, aug. 12, 2008, http://code.google.com/p/nagioschecker/ (accessed dec. 8, 2008).
19. “notifications,” monitoringexchange, http://www.monitoringexchange.org/inventory/utilities/addon-projects/notifications (accessed dec. 23, 2009).

appendix a. service detail display from test system
appendix b. service details for opac (hip) and ils (horizon) servers from production system
appendix c. sybase freespace trends for a specified period
appendix d. connectivity history for a specified period
appendix e. availability report for host shown in appendix d

appendix f.
templates.cfg file

############################################################################
# templates.cfg - sample object templates
############################################################################

############################################################################
# contact templates
############################################################################

# generic contact definition template - this is not a real contact, just
# a template!
define contact{
        name                            generic-contact
        service_notification_period     24x7
        host_notification_period        24x7
        service_notification_options    w,u,c,r,f,s
        host_notification_options       d,u,r,f,s
        service_notification_commands   notify-service-by-email
        host_notification_commands      notify-host-by-email
        register                        0
        }

############################################################################
# host templates
############################################################################

# generic host definition template - this is not a real host, just
# a template!
define host{
        name                            generic-host
        notifications_enabled           1
        event_handler_enabled           1
        flap_detection_enabled          1
        failure_prediction_enabled      1
        process_perf_data               1
        retain_status_information       1
        retain_nonstatus_information    1
        notification_period             24x7
        register                        0
        }

# linux host definition template - this is not a real host, just a template!
define host{
        name                            linux-server
        use                             generic-host
        check_period                    24x7
        check_interval                  5
        retry_interval                  1
        max_check_attempts              10
        check_command                   check-host-alive
        notification_period             workhours
        notification_interval           120
        notification_options            d,u,r
        contact_groups                  admins
        register                        0
        }
# define a template for switches that we can reuse
define host{
    name                            generic-switch
    use                             generic-host
    check_period                    24x7
    check_interval                  5
    retry_interval                  1
    max_check_attempts              10
    check_command                   check-host-alive
    notification_period             24x7
    notification_interval           30
    notification_options            d,r
    contact_groups                  admins
    register                        0
    }

############################################################################
# service templates
############################################################################

# generic service definition template - this is not a real service,
# just a template!
define service{
    name                            generic-service
    active_checks_enabled           1
    passive_checks_enabled          1
    parallelize_check               1
    obsess_over_service             1
    check_freshness                 0
    notifications_enabled           1
    event_handler_enabled           1
    flap_detection_enabled          1
    failure_prediction_enabled      1
    process_perf_data               1
    retain_status_information       1
    retain_nonstatus_information    1
    is_volatile                     0
    check_period                    24x7
    max_check_attempts              3
    normal_check_interval           10
    retry_check_interval            2
    contact_groups                  admins
    notification_options            w,u,c,r
    notification_interval           60
    notification_period             24x7
    register                        0
    }

# define a ping service. this is not a real service, just a template!
define service{
    use                             generic-service
    name                            ping-service
    notification_options            n
    check_command                   check_ping!1000.0,20%!2000.0,60%
    register                        0
    }

appendix g. groups.cfg file

############################################################################
# contact group definitions
############################################################################

# we only have one contact in this simple configuration file, so there is
# no need to create more than one contact group.
define contactgroup{
    contactgroup_name   admins
    alias               nagios administrators
    members             nagiosadmin
    }

############################################################################
# host group definitions
############################################################################

# define an optional hostgroup for linux machines
define hostgroup{
    hostgroup_name  linux-servers   ; the name of the hostgroup
    alias           linux servers   ; long name of the group
    }

# create a new hostgroup for ils servers
define hostgroup{
    hostgroup_name  ils-servers     ; the name of the hostgroup
    alias           ils servers     ; long name of the group
    }

# create a new hostgroup for switches
define hostgroup{
    hostgroup_name  switches            ; the name of the hostgroup
    alias           network switches    ; long name of the group
    }

############################################################################
# service group definitions
############################################################################

# define a service group for network connectivity
define servicegroup{
    servicegroup_name   network
    alias               network infrastructure services
    }

# define a servicegroup for ils
define servicegroup{
    servicegroup_name   ils-services
    alias               ils related services
    }

appendix h. contacts.cfg

############################################################################
# contacts.cfg - sample contact/contactgroup definitions
############################################################################

# just one contact defined by default - the nagios admin (that's you)
# this contact definition inherits a lot of default values from the
# 'generic-contact' template which is defined elsewhere.
define contact{
    contact_name    nagiosadmin
    use             generic-contact
    alias           nagios admin
    email           nagios@localhost
    }

appendix i.
opac.cfg

############################################################################
# opac server
############################################################################

############################################################################
# host definition
############################################################################

# define a host for the server we'll be monitoring
# change the host_name, alias, and address to fit your situation
define host{
    use         linux-server
    host_name   opac
    parents     gateway-switch
    alias       opac server
    address     192.168.1.123
    }

############################################################################
# service definitions
############################################################################

# create a service for monitoring the http port
define service{
    use                 generic-service
    host_name           opac
    service_description web port
    check_command       check_tcp!80
    }

# create a service for monitoring the web service
define service{
    use                 generic-service
    host_name           opac
    service_description web service
    check_command       check_http!-u/bogusfilethatdoesnotexist.html
    }

# create a service for monitoring the opac search
define service{
    use                 generic-service
    host_name           opac
    service_description opac search
    check_command       check_hip_search
    }

# create a service for monitoring the z39.50 port
define service{
    use                 generic-service
    host_name           opac
    service_description z3950 port
    check_command       check_tcp!210
    }

appendix j.
switches.cfg

############################################################################
# switch.cfg - sample config file for monitoring switches
############################################################################

############################################################################
# host definitions
############################################################################

# define the switch that we'll be monitoring
define host{
    use         generic-switch
    host_name   gateway-switch
    alias       gateway switch
    address     192.168.0.1
    hostgroups  switches
    }

############################################################################
# service definitions
############################################################################

# create a service to ping to switches
# note: this entry will ping every host in the switches hostgroup
define service{
    use                     ping-service
    hostgroups              switches
    service_description     ping
    normal_check_interval   5
    retry_check_interval    1
    }

appendix k. check_hip_search script

#!/usr/bin/perl -w
#########################
# check horizon information portal (hip) status.
# hip is the web-based interface for dynix and horizon
# ils systems by sirsidynix corporation.
#
# this plugin is based on a standalone perl script written
# by dave pattern. please see
# http://www.daveyp.com/blog/index.php/archives/164/
# for the original script.
#
# the original script and this derived work are covered by
# http://creativecommons.org/licenses/by-nc-sa/2.5/
#########################
use strict;
use LWP::UserAgent; # note the requirement for perl module LWP::UserAgent!
use lib "/usr/lib/nagios/plugins";
use utils qw($TIMEOUT %ERRORS);

### some configuration options
my $hipserverhome = "http://ipac.prl.ab.ca/ipac20/ipac.jsp?profile=alap";
my $hipserversearch = "http://ipac.prl.ab.ca/ipac20/ipac.jsp?menu=search&aspect=subtab132&npp=10&ipp=20&spp=20&profile=alap&ri=&index=.gw&term=linux&x=18&y=13&aspect=subtab132&getxml=true";
my $hipsearchtype = "xml";
my $httpproxy = '';

### check home page is available...
{
    my $ua = LWP::UserAgent->new;
    $ua->timeout( 10 );
    if( $httpproxy ) { $ua->proxy( 'http', $httpproxy ) }
    my $response = $ua->get( $hipserverhome );
    my $status = $response->status_line;
    if( $response->is_success ) {
    }
    else {
        print "hip_search critical: $status\n";
        exit $ERRORS{'CRITICAL'};
    }
}

### check search page is returning results...
{
    my $ua = LWP::UserAgent->new;
    $ua->timeout( 10 );
    if( $httpproxy ) { $ua->proxy( 'http', $httpproxy ) }
    my $response = $ua->get( $hipserversearch );
    my $status = $response->status_line;
    if( $response->is_success ) {
        my $results = 0;
        my $content = $response->content;
        if( lc( $hipsearchtype ) eq 'html' ) {
            if( $content =~ /\<b\>(\d+?)\<\/b\>\&nbsp\;titles matched/ ) {
                $results = $1;
            }
        }
        if( lc( $hipsearchtype ) eq 'xml' ) {
            if( $content =~ /\<hits\>(\d+?)\<\/hits\>/ ) {
                $results = $1;
            }
        }
        ### modified section - original script triggered another function to
        ### save results to a temp file and email an administrator.
        unless( $results ) {
            print "hip_search critical: no results returned|results=0\n";
            exit $ERRORS{'CRITICAL'};
        }
        if( $results ) {
            print "hip_search ok: $results results returned|results=$results\n";
            exit $ERRORS{'OK'};
        }
    }
}

appendix l. nagios checker display

202 journal of library automation vol. 14/3 september 1981

rlin system scheduled to be operational this summer. the proposed test will involve eight members of the reference staff, four from each department, who will be trained to search on oclc and rlin.
those selected will include both librarians and library assistants who regularly provide reference assistance. the results obtained from such a representative group will better enable us to assess the impact on the whole reference staff should we later decide to fully implement the service. they will be the only ones involved in sampling questions and conducting comparative searches. the test will have two components, the first of which will be a twenty-week period to collect at least 400 sample questions. during their regularly scheduled reference hours, the eight specially trained librarians will collect samples of reference requests for materials that, based on the information initially given by the patron, cannot be identified in the card catalog. after checking the catalog, the librarian will then complete the top portion of a two-page self-carbon form with all of the information that is known about the requested item. then, at regular intervals during the semester, the pages of each form will be separated and distributed to other members of the test staff for batch-mode searching. the manual oclc and rlin searching for each query will be done by different staff members to eliminate crossover effects. each request will be searched on both oclc and rlin with the following information being recorded:
1. date of the material requested (if known).
2. type of material (e.g., conference proceeding).
3. amount of time required to do the search.
4. success or failure of the search.
this information will then be cumulated in a statistical table, and the results of each search will be keypunched for computerized analysis using the bmdp (biomedical computer programs) statistical package to determine whether or not effectiveness and efficiency have been improved significantly.
in addition, on twenty-four randomly selected days during the semester the trained searchers will count the total number of questions received by them on that day that would have been appropriate to search on rlin or oclc. by using these data it will be possible to extrapolate the potential usefulness of the systems for the entire semester. the second component of the test will be a two-week real-life test during which all questions requiring further verification would be searched immediately on rlin, oclc, and in the appropriate printed sources to compare time required, success rate, and type of material requested. this sort of test would permit the searcher to continue to negotiate with the patron as the search progressed, which is the usual situation. also, this would provide the only opportunity to have the patron judge the value of subject searches done on rlin. if funding is received, preliminary results should be available in early 1982. anyone conducting similar or otherwise relevant studies is asked to contact the author.

replicating the washington library network computer system software
thomas p. brown: manager of computer services, and raymond debuse: manager of development and library services, washington library network, olympia.

the washington library network (wln) computer system supports shared cataloging and catalog maintenance, retrospective conversion, reference, com catalog production, acquisitions, and accounting functions for libraries operating within a network. the system offers both full marc and brief catalog records as well as linked authority control for all traced headings. it contains more than 250,000 lines of pl/1 and ibm bal code in more than 1,100 program modules and runs on ibm or ibm-compatible hardware with ibm operating systems (mvs, os/vs1). all database management functions are provided by adabas, a product of software a.g. of north america. the online system runs under cics/vs 1.5.
a set of assembler codes called the tp monitor interface defines a standard service interface between the applications programs and the tp monitor. this allows easy upgrade to different tp monitors and convenient points for collecting performance statistics. the majority of the bibliographic subsystem updating is done in batch mode to conserve online resources. a new version of the system with interactive updating is currently being planned, for use in special environments. the applications software was designed and implemented with a number of important conventions:
1. top-down design.
2. standard use of ibm environments.
3. structured coding techniques.
4. interfaces to a database management system (adabas) and teleprocessing monitor (currently cics).
5. standard naming and formatting.
6. use of a standard set of data structures and assembler subroutines to manipulate data.
7. identification of maintenance changes in source programs.
in addition, programming for the online functions meets other conditions:
1. load modules less than 20k bytes.
2. no pl/1 run-time subroutines.
3. reentrant coding.
4. standard services for the tp monitor interface.
5. applications are kept as terminal independent as possible, with the tp monitor interface performing input and output translations.

replication
a system with these characteristics, even though large, can easily be transported to a different site. while wln was not designed with multiple replications in mind, a policy decision made by the network a few years ago made replication an attractive possibility. recognizing that it had a capability that would be highly competitive with other online shared bibliographic services, wln expanded its service area beyond the state of washington. it set limits to its expansion, however, having determined that it would remain a small, responsive organization providing what it hoped would be superior service to its participating libraries.
having set such limits, however, created two impediments to its achieving superior service: wln would have a smaller base of libraries from which its participants could obtain the benefits of shared cataloging, and there would not be the fiscal resources necessary to support a large continuing development effort. both would penalize libraries for joining wln, the first with a lower hit rate against the database and the second with fewer added capabilities. replication provided a possible answer to both of these problems, as well as a potential source of income. in its software license agreements, wln asks the licensee to agree to bibliographic data sharing. all cataloging done by a licensee or its participants would thus be available for loading on wln's own database; likewise, all wln participant cataloging will be made available to the licensee. wln, at least, would accept catalog records only from libraries that follow its bibliographic standards; that is, the standards of the library of congress. currently this sharing is accomplished via magnetic tape, but in the future, online record interchange may be possible, given wln's current work in this area. wln also explicitly asks in its software license agreements that the replicating institution carry out an organized program of development to meet the latter's particular needs. such development is monitored by wln in order that redundant work is not undertaken and to ensure that the various efforts relate coherently. there is a built-in constraint upon major modification and redesign: wln is packaging enhancements and changes into periodic releases of the source code and requiring that the replicants install these releases within twelve months of the date issued. because of the interest in shared development and because wln itself is not in a position to provide first-level program maintenance, the system is distributed in source-code form.
the initial installation, however, is of load modules (programs in a form efficiently read and executed by machine), allowing immediate testing of the system's capabilities in its new environment. wln is currently negotiating a contract with a new firm, biblio-techniques, that will offer a more nearly turnkey version of wln, packaged with adabas and software a.g.'s tp monitor, complete, and, if necessary, with the required hardware.

national library of australia
the first replication of the system was made at the national library of australia (nla) in canberra in early 1979. nla had its own ibm 370/148 and an established data processing staff. adabas had been installed prior to the arrival of wln's installation consultant. minor changes were necessary in cics to support dedicated wln terminals, and these were quickly made and the system was up within days. further work allowed searching on the system from the library's 3270 terminals. after a couple of weeks of shakedown, a wln staff member spent about two weeks training nla staff in the use of the system. it has been in full production for in-house production cataloging for more than a year now, and this spring is being extended to other libraries around the country on a pilot basis, testing the concept of the newly defined australian bibliographic network (abn). nla has replaced the 370/148 with a larger machine.

university of illinois
the second installation occurred earlier this year at the university of illinois, where the system was obtained to carry out a pilot project in which the urbana campus and the river bend library system will use it as an online public-access catalog in conjunction with the lcs circulation system. again, load modules were installed and the system was up within a few days, running on the university's administrative computer at the chicago circle campus.
illinois staff have had some difficulties in recompiling all of the source code, but these problems are being worked out. wln will warrant that the source code supplied corresponds to the load modules it installs. the system as presently distributed by wln can in no way be considered turnkey. local computer operations or jcl requirements as well as differing levels of staff expertise can create problems. furthermore, wln handles source management through wylbur, a text-editing system, and this is not included with the wln software. the module descriptions, programming language, mode, link-edit information, etc., maintained through this facility are supplied, however. either a test or, if contracted for, a full database is also supplied. wln has had some difficulties in creating a valid test database for illinois, but has now defined procedures to better control the process. wln has distributed its second release to australia as a full source update identical to what was installed at illinois. in the future only the source changes in standard ibm iebupdte form will be supplied to replication sites. this will better enable these sites to integrate the new version into theirs.

other sites
the university of missouri is likely to be the third replicant of the system, since it has just selected wln as the basis for its online catalog system. installation is planned before the end of 1981. the national library of new zealand has also indicated that it intends to purchase the system. the southeastern library network (solinet) has obtained the system in order to convert it to a burroughs facility. while this is a software license, it is not a replication. the resulting system, however, would be available from wln for installation on burroughs equipment. wln has not implemented data sharing with australia, but is testing the loading of illinois data into its bibliographic file. wln libraries should see illinois records on a regular basis by late summer of 1981.
similar arrangements will be made with missouri and solinet. shared development has gotten off to a start with the national library of australia having done the work necessary to add the ibm 3270 type of terminal to those that can support cataloging input and edit on wln. illinois will be undertaking the development of enhancements to make the system easier to use as a public online catalog, in addition to other possible areas of concern. wln, of course, continues its in-house development, which has recently seen the implementation of a new batch retrospective-conversion subsystem, and added com catalog options and online authority verification during input/edit. while not the only bibliographic system to be successfully replicated, the wln computer system is becoming the most systematically replicated main-frame facility, with a broad range of future possibilities, including that of a truly turnkey system. wln's experience indicates that, if a system is designed for ease of maintenance at perhaps some sacrifice of efficiency, it will be readily transportable and allow others to obtain the benefits of a highly sophisticated bibliographic capability without the ever-increasing cost of original development and, more importantly, without having to support the ongoing maintenance of a unique system.

a general planning methodology for automation
richard w. meyer, beth ann reuland, francisco m. diaz, and frances colburn: clemson university, clemson, south carolina.

introduction
a workable planning methodology is the logical starting place for the successful implementation of automation in libraries. an automation plan may develop on the basis of an informal arrangement or from the efforts of one individual, but just as often, automation plans are developed by committees.
an automation planning committee must determine and execute some kind of planning methodology and is more likely to be successful if it starts with clear guidelines, good leadership, and a thoroughly proven approach. as a summary review of the literature will bear out, many libraries have developed their own planning techniques in-house. some of these, which are addressed to the issues of cataloging rule changes and public-access catalogs, have been very well thought out. 1 however, these techniques are generally not directed to planning for library-wide automation, and are usually designed to meet the specific needs of an individual library. although the pattern for these studies is often similar, they do not seem to be based upon any general automation design methodology. neither, in addition, does there seem to be a general methodology available through any external library agency. the office of library management studies of the association of research libraries has developed a number of programs designed to assist libraries with their planning efforts, some of which appear to be useful in automation development. 2 but for many libraries, these programs may be too broad, too time-consuming or too expensive. as an alternative, some libraries will need to look elsewhere for a general automation planning methodology. this problem was addressed by the administration of the clemson library, and was resolved in a unique way.

background
the robert muldrow cooper library of clemson university has the responsibility of acquiring, preserving, and making available for use the many materials needed by faculty and students in their research and instructional efforts. at a typical land-grant institution like clemson, the amount of scholarly publishing and the pressure to develop research proposals has risen sharply in recent years.
the increased needs of users working with an expanding and diversified collection have resulted in a doubling of circulation activity, and have required the growth of library staff by 70 percent over the last decade. furthermore, acquisition, processing, and access problems are compounded by the high inflation rate of materials, particularly serial publications, and manpower costs. even though user demands heavily burdened the traditional manual systems, the extent of library automation at clemson had been limited to a batch circulation system, a simple serials-listing capability, and the use of bibliographic utilities. although it had been generally accepted for some time that the acquisitions and fund-control functions at clemson were in need of automation, no concrete approach to develop

letter from the editor
kenneth j. varnum
information technology and libraries | june 2019 1
https://doi.org/10.6017/ital.v38i2.11241

welcome to the june 2019 issue of ital. you’ll likely notice a new look to the journal when you read this issue’s content. our helpful and supportive partners at boston college, where information technology and libraries is archived, have updated the journal’s content management system to the current version of open journal systems. i am grateful to john o’connor at boston college for his patience with and quick, helpful responses to my numerous questions as we adapted to the new user interface and editorial workflows. columns in this issue include bohyun kim’s final “president’s message” as her term concludes, summarizing the work that has gone into the planned division merger that would combine lita, alcts, and llama. editorial board member cinthya ippoliti discusses the role of libraries in fostering digital pedagogy in her “editorial board thoughts” column. and, in the second of our new “public libraries leading the way” columns, jeffrey davis discusses the technologies and advantages of digital pass systems.
peer-reviewed articles in this issue include:
• “no need to ask: creating permissionless blockchains of metadata records,” by dejah rubel, laying a path for using blockchain for managing metadata.
• “50 years of ital/jola: a bibliometric study of its major influences, themes, and interdisciplinarity,” by brady lund, a thorough study of how our journal has influenced, and been influenced by, other leading information technology journals.
• “weathering the twitter storm: early uses of social media as a disaster response tool for public libraries during hurricane sandy,” by sharon han. this article is the 2019 lita/ex libris student writing award-winning paper.
• “‘good night, good day, good luck’: applying topic modeling to chat reference transcripts,” by megan ozeran and piper martin, describing a process to categorize chat reference themes using topic mapping software.
• “information security in libraries: examining the effects of knowledge transfer,” by tonia san nicolas-rocca and richard j burkhard, investigating the importance of knowledge transfer across an organization to enhance information security behaviors.
• “wikidata: from ‘an’ identifier to ‘the’ identifier,” by theo van veen, describing how libraries could use wikidata as a source of linked open data.
thank you to this issue’s authors, and all of information technology and libraries’ readers for supporting peer-reviewed, open-access, scholarly publishing. in closing, i would like to thank the members of the editorial board whose terms are ending june 30: patrick “tod” colegrove, joseph deodato, richard guajardo, and frank cervone. i’m grateful to these four individuals, upon whom i’ve relied for their excellent advice and guidance in steering ital’s course. we are in the process of appointing new editorial board members with two-year terms starting on july 1, and i’ll introduce them in the next issue.
kenneth j. varnum, editor
varnum@umich.edu
june 2019

a scatter storage scheme for dictionary lookups d.
m. murray: department of computer science, cornell university, ithaca, new york

scatter storage schemes are examined with respect to their applicability to dictionary lookup procedures. of particular interest are virtual scatter methods which combine the advantages of rapid search speed and reasonable storage requirements. the theoretical aspects of computing hash addresses are developed, and several algorithms are evaluated. finally, experiments with an actual text lookup process are described, and a possible library application is discussed.

a document retrieval system must have some means of recording the subject matter of each document in its data base. some systems store the actual text words, while others store keywords or similar content indicators. the smart system (1) uses concept numbers for this purpose, each number indicating that a certain word appears in the document. two advantages are apparent. first, a concept number can be held in a fixed-sized storage element. this produces faster processing than if variable-sized keywords were used. second, the amount of storage required to hold a concept number is less than that needed for most text words. hence, storage space is used more efficiently. smart must be able to find the concept numbers for the words in any document or query. this is done by a dictionary lookup. there are two reasons why the lookup must be rapid. for text lookups, a slow scheme is costly because of the large number of words to be processed. for handling user queries in an on-line system, a slow lookup adds to the user response time.

174 journal of library automation vol. 3/3 september, 1970

storage space is also an important consideration. even for moderate sized subject areas the dictionary can become quite large, too large for computer main memory, or so large that the operation of the rest of the retrieval system is penalized.
in most cases a certain amount of core storage is allotted to the dictionary, and the lookup scheme must do the best possible job within this allotment. this usually means keeping the overhead for the scheme as low as possible, so that a large portion of the allotted core is available to hold dictionary words. the rest of the dictionary is placed in auxiliary storage and parts of it are brought in as needed. obviously the number of accesses to auxiliary storage must be minimized. this paper presents a study of scatter storage schemes for application to dictionary lookup, methods which appear to be fast and yet conservative with storage. the next two sections describe scatter storage schemes in general. they are followed by a section presenting the results of various experiments with hash coding algorithms and a section discussing the design and use of a practical lookup scheme. the final sections deal with extensions and conclusions.

basic scatter storage method
a basic scatter storage scheme consists of a transformation algorithm and a table. the table serves as the dictionary and is constructed as follows: given a natural language word, the algorithm operates on its bit pattern to produce an address, and the concept number for the word is placed in the table slot indicated by this address. this process is repeated for every word to be placed in the dictionary. the generated addresses are called hash addresses; and the table, a hash table. there are many possible algorithms for producing hash addresses (2,3,4). some of the most common are: 1) choosing bits from the square of the integer represented by the input word; 2) cutting the bit pattern into pieces and adding these pieces; 3) dividing the integer represented by the input word by the length of the hash table and using the remainder.

collisions
in an ideal situation every word placed in the dictionary would have a unique hash address.
however, as soon as a few slots in the hash table have been filled, the possibility of a collision arises: two or more words producing the same hash address. to differentiate among collided entries, the characters of the dictionary words must be stored along with their concept numbers. during lookup, the input word can then be compared with the character string to verify that the correct table entry has been located.

the problem of where to store the collided items has several methods of solution (3,5). the linear scan method places a collided item in the first free table slot after the slot indicated by the hash address. the scan is circular over the end of the table. the random probe method uses a crude algorithm to generate random offsets $r(i)$ in the interval $[1, h]$, where $h$ is the length of the hash table. if the colliding address is $a$, slot $(a + r(1)) \bmod h$ is examined. the process is repeated until an empty slot is found. both of these methods work best when the hash table is lightly loaded; that is, when the ratio between the number of words entered and the number of table slots is small. in such cases the expected length of scan or average number of random probes is small.

chaining methods provide a satisfactory method of resolving collisions regardless of the load on the hash table. however, they require a second storage table, a bump table, for holding the collided items. when a collision occurs, both entries are linked together by a pointer and placed in the bump table. a pointer to this collision chain is placed in the hash table along with an identifying flag. further colliding items are simply added to the end of the collision chain.

table layout and search procedure

in the virtual scatter storage system described later, the hash table has a high load factor. hence the chained method (or rather a variation of it) is used to resolve collisions.
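the chaining method described above can be illustrated with a minimal sketch. it is not the author's implementation: the class name `ScatterStore`, the tuple layout of the slots, and the remainder-based transformation in `_address` are all assumptions chosen for brevity, and ascii text stands in for the s/360 character code. each word is assumed to be entered only once.

```python
# minimal sketch of chained scatter storage (hash table + bump table).
# all names and the slot layout are illustrative assumptions.

EMPTY = None


class ScatterStore:
    def __init__(self, size):
        self.size = size
        # a hash table slot is EMPTY, ("entry", word, concept) or ("chain", bump_index)
        self.hash_table = [EMPTY] * size
        # a bump table item is [word, concept, index of next item or None]
        self.bump_table = []

    def _address(self, word):
        # stand-in transformation: the integer represented by the word,
        # divided by the table length, remainder kept
        return int.from_bytes(word.encode("ascii"), "big") % self.size

    def insert(self, word, concept):
        a = self._address(word)
        slot = self.hash_table[a]
        if slot is EMPTY:
            self.hash_table[a] = ("entry", word, concept)
        elif slot[0] == "entry":
            # first collision: link both entries together in the bump table
            _, old_word, old_concept = slot
            first = len(self.bump_table)
            self.bump_table.append([old_word, old_concept, first + 1])
            self.bump_table.append([word, concept, None])
            self.hash_table[a] = ("chain", first)
        else:
            # further collisions: add to the end of the collision chain
            i = slot[1]
            while self.bump_table[i][2] is not None:
                i = self.bump_table[i][2]
            self.bump_table[i][2] = len(self.bump_table)
            self.bump_table.append([word, concept, None])

    def lookup(self, word):
        """return (concept, probes); concept is None if the word was never entered."""
        slot = self.hash_table[self._address(word)]
        probes = 1  # one probe for the hash table itself
        if slot is EMPTY:
            return None, probes
        if slot[0] == "entry":
            return (slot[2] if slot[1] == word else None), probes
        i = slot[1]
        while i is not None:
            probes += 1  # one probe per chain element examined
            w, concept, nxt = self.bump_table[i]
            if w == word:
                return concept, probes
            i = nxt
        return None, probes
```

construction and lookup follow the same path through the tables, which is why the two procedures can share one search strategy; the probe count returned by `lookup` is the quantity analyzed in the timing estimates.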
further discussion involves only scatter storage systems using collision chains. with this restriction, then, a scatter storage system consists of a hash table, a bump table, and the associated algorithm for producing hash addresses.

a dictionary entry consists of a concept number and the character string for the word it represents. these entries are placed in the hash-bump table as described above. consequently there are three types of slots in the hash table: slots that are empty, slots holding a single dictionary entry, and slots containing a pointer to a collision chain held in the bump table. figure 1 is a typical table layout.

[figure 1: a hash table containing an empty slot, a single dictionary entry (concept number and characters), and a pointer to a collision chain (entry 1, entry 2) held in the bump table.]
fig. 1. typical table layout.

one of the advantages of scatter storage systems is that the search strategy is the same as the strategy for constructing the hash-bump tables. a word being given, its hash address is computed and the tables searched to find the proper slot. during construction, dictionary information is placed in the slot; during lookup, information is extracted from the slot. the basic search procedure is illustrated by the flow diagram in figure 2. the construction procedure is similar.

[figure 2: flow diagram. input the text word; compute the hash address; where the slot holds a pointer, get the next bump table entry; return the concept number, or report that the word was never entered in the dictionary.]
fig. 2. flow diagram for the lookup procedure in basic scatter storage systems.

theoretical expectations

an ideal transformation algorithm produces a unique hash address for each dictionary word and thereby eliminates collisions. from a practical point of view, the best algorithms are those which spread their addresses uniformly over the table space. producing a hash address is simply the process of generating a uniform random number from a given character string.
if the addresses are truly random, a probability model may be used to predict various facts about the storage system. suppose a hash table has $h$ slots and that $n$ words are to be entered in the hash-bump tables. let $h_i$ be the expected number of hash table slots with $i$ entries for $i = 0, 1, \ldots, n$. in other words, $h_0$ is the expected number of empty slots, $h_1$ is the expected number of single entries, and $h_2, h_3, \ldots, h_n$ are the expected number of slots with various numbers of colliding items. even though the items are physically located in the bump table, they may be considered to "belong" to the same slot in the hash table. it is expected that:

1) $h = \sum_{i=0}^{n} h_i$

2) $n = \sum_{i=0}^{n} i\,h_i$

now let $x_{ij} = 1$ if exactly $i$ items occur in the $j$th slot and $x_{ij} = 0$ otherwise, for $j = 1, 2, \ldots, h$. then

$h_i = \mathrm{E}[x_{i1} + x_{i2} + \cdots + x_{ih}] = \sum_{j=1}^{h} \mathrm{E}[x_{ij}]$

assume that any chosen table slot is independent of the others so that the probability of getting any single item in the slot is $1/h$. then the probability of getting exactly $i$ items in that slot is

3) $p_i = \binom{n}{i} \left(\frac{1}{h}\right)^{i} \left(1 - \frac{1}{h}\right)^{n-i}$

then $\mathrm{E}[x_{ij}] = 1 \cdot p_i + 0 \cdot (1 - p_i) = p_i$. substituting into the above,

4) $h_i = h\,p_i = h \binom{n}{i} \left(\frac{1}{h}\right)^{i} \left(1 - \frac{1}{h}\right)^{n-i}$ for $i = 0, 1, \ldots, n$

for the cases of interest $h$ and $n$ are large, and the poisson approximation can be used in equation 3:

$p_i \approx e^{-n/h}\,\frac{(n/h)^{i}}{i!}$

the ratio $n/h$ is the load factor mentioned previously. it is usually designated by $a$ so that

5) $h_i = h\,e^{-a}\,\frac{a^{i}}{i!}$ for $i = 0, 1, \ldots, n$

equation 5 is sufficient to describe the state of the scatter storage system after the entry of $n$ items. most of the statistics of interest can be predicted using this expression; a few of them are listed in table 1.

the time required for a single lookup using a hash scheme depends on the number of probes into the table space, that is, how many slots must be examined. suppose the word is actually found; if it is a single entry, only one probe is required.
if the word is located in a collision chain, the number of probes is one (for the hash table) plus one additional probe for each element of the collision chain that must be examined. suppose that the word is not in the dictionary; if its hash address corresponds to an empty table slot, again only one probe is needed. however, if the address points to a collision chain, the number is one plus the length of the chain. for words found in the dictionary the average number of probes per lookup is:

6) $p = 1 + \frac{1}{n}\left[(0)h_1 + (1+2)h_2 + (1+2+3)h_3 + \cdots + (1+2+\cdots+n)h_n\right]$

$\;\; = 1 + \frac{1}{n}\sum_{i=2}^{n} h_i \sum_{j=1}^{i} j$

$\;\; = 1 + \frac{1}{2}\sum_{i=2}^{n} (i+1)\,f_{i-1}$

$\;\; = 1 + \frac{1}{2}\sum_{i=2}^{n} (i-1)\,f_{i-1} + \sum_{i=2}^{n} f_{i-1}$

$\;\; \approx 1 + \frac{1}{2}\sum_{i=2}^{n+1} (i-1)\,f_{i-1} + \sum_{i=2}^{n+1} f_{i-1}$

$\;\; = 2 + \frac{a}{2} - e^{-a}$ (probes)

here $f_i = h_i/h$, the second step uses $i\,f_i = a\,f_{i-1}$ from equation 5, and the last step uses the expected sums $\sum i\,f_i = a$ and $\sum f_i = 1$ of table 1.

table 1. expected storage and search properties for basic scatter storage schemes

h = number of hash table slots
n = number of words to be entered

measure: formula
load factor: $a = n/h$
number of empty table slots: $h_0 = h\,e^{-a}$
number of single entries: $h_1 = n\,e^{-a}$
number of collision chains of length i: $h_i = h\,e^{-a}\,a^{i}/i!$, $i = 2, 3, \ldots, n$
expected sums: $h = \sum_{i=0}^{n} h_i$, $n = \sum_{i=0}^{n} i\,h_i$
fraction of hash table empty: $f_0 = h_0/h = e^{-a}$
fraction of table filled with single entries: $f_1 = h_1/h = a\,e^{-a}$
fraction of hash table slots with i entries: $f_i = h_i/h = e^{-a}\,a^{i}/i!$, $i = 2, 3, \ldots, n$
expected sums: $1 = \sum_{i=0}^{n} f_i$, $a = \sum_{i=0}^{n} i\,f_i$
number of collisions: $n_c = h_2 + h_3 + \cdots + h_n = h - h_0 - h_1$
number of entries in the bump table: $b = n - h_1$
total table slots required: $s = h + b$
average lookup time (probes): $p = 2 + a/2 - e^{-a}$

virtual scatter storage method

from table 1, the expected number of collisions is

$n_c = h - h_0 - h_1 = h\left(1 - e^{-n/h} - \frac{n}{h}\,e^{-n/h}\right)$

for a fixed $n$, this number decreases as $h$ increases. at the same time the number of empty hash table slots $h_0 = h\,e^{-n/h}$ increases as $h$ increases.
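a quick numerical check of these two trends can be made directly from the table 1 formulas. the sketch below is illustrative only: the function name `table1_stats` and the dictionary keys are assumptions, and the closed form used for the average number of probes, $2 + a/2 - e^{-a}$, is what equation 6 reduces to under the poisson approximation rather than a formula quoted from the text. the table sizes are chosen so that the load factor stays at or below 1.

```python
import math


def table1_stats(n, h):
    """expected quantities for n words entered into h hash table slots."""
    a = n / h                      # load factor
    h0 = h * math.exp(-a)          # expected empty slots
    h1 = n * math.exp(-a)          # expected single entries
    return {
        "load_factor": a,
        "empty_slots": h0,
        "single_entries": h1,
        "collisions": h - h0 - h1,          # slots that point to a collision chain
        "bump_entries": n - h1,             # items held in the bump table
        "avg_probes": 2 + a / 2 - math.exp(-a),  # assumed closed form of equation 6
    }


if __name__ == "__main__":
    n = 2 ** 13
    for k in range(13, 18):
        s = table1_stats(n, 2 ** k)
        print(f"h = 2^{k}: collisions {s['collisions']:8.1f}  "
              f"empty slots {s['empty_slots']:9.1f}")
```

with $n$ held at $2^{13}$, the printed collision counts fall and the empty-slot counts rise as the table grows, which is the tension the virtual scheme is designed to resolve.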
both of these results are expected; as the hash addresses are spread over a larger and larger table space ($h$ slots), the number of collisions should decrease and the number of empties increase for a fixed number of entries ($n$).

a virtual scatter storage scheme tries to balance these opposing strains by combining hash coding with a sparse storage technique. large or virtual hash addresses are used to obtain the collision properties associated with a very large hash table, and the storage technique is used to achieve the storage and search properties of a reasonably sized hash table.

if the virtual hash address is taken large enough, the expected number of collisions can be reduced to essentially zero. with no expected collisions, it is possible to dispense with verifying that a query word and the dictionary word are the same. it is enough to check that they produce the same virtual address. hence, the character strings need not be stored in the hash-bump tables at all.

to implement the virtual scheme a large hash address is computed, say in the range $(0, v)$, and the address is split into a major and minor part. the major portion is used just as before, as an index on a hash table of size $h$. the minor portion is stored in the hash or bump table, in place of the character string. with this difference, the virtual scheme works just as the basic scheme does. the lookup procedure is identical, but the minor portions are used for comparison rather than character strings. all the results of the previous section apply as storage and timing estimates.

the advantage of virtual scatter storage systems is economy of storage space. the minor portion is much smaller in size than the character string it replaces.

it is true that the virtual scheme assigns the same concept number to two different words if they have the same virtual address. this need not be disastrous for document retrieval applications.
presumably $v$ is chosen large enough to keep the number of collisions small. on the one hand, errors could be neglected because of their low probability of occurrence and their small effect on the total performance of the retrieval system. on the other hand, it is always possible to resolve detected collisions even in a virtual scheme. collisions may be detected during dictionary construction or updating, and the characters for the colliding words appended to the bump table. the hash or bump table entry must contain a pointer to these characters along with an identifying flag. collisions occurring during actual lookups cannot be detected.

collision problem

in order to use a virtual hash scheme, the virtual table must be large enough to reduce the expected number of collisions to an acceptable level. from a practical point of view, a collision may be considered to involve only two words, rather than three, four, or more. it is assumed that the probability of these other types of collisions is negligible.

let $v$ be the size of the virtual hash table. then the expected number of collisions is simply

$n_c = h_2 = v\,\frac{a^{2}}{2}\,e^{-a}$, where $a = \frac{n}{v}$.

in this case $v \gg n$ so that $a$ is small and $e^{-a}$ is approximately 1:

7) $n_c = v\,\frac{a^{2}}{2} = \frac{n^{2}}{2v}$

suppose, for example, the dictionary has $n = 2^{13}$ words. if the size of the virtual hash table is chosen to be $v = 2^{26}$, then the expected number of collisions is

$n_c = \frac{(2^{13})^{2}}{2\,(2^{26})} = \frac{1}{2}$

suppose further that this table size is adopted for the dictionary, and that the hash code algorithm produces three collisions. the question arises whether the algorithm is a good one, that is, whether it produces uniform random addresses. the answer is found by extending the previous probability model.

consider a virtual scatter storage scheme in which the virtual table size is $v$, and $n$ items are to be entered into the hash-bump tables. again assume that collisions involve only two items.
let

$p(i) = \mathrm{prob}[i \text{ collisions}] = \mathrm{prob}[i \text{ table slots have 2 items and } n-2i \text{ slots have 1 item}]$

the number of ways of choosing the $i$ pairs of colliding words (in an ordered way) is:

$\binom{n}{2}\binom{n-2}{2}\cdots\binom{n-2i+2}{2} = \frac{n!}{2^{i}\,(n-2i)!}$

there are $i!$ ways of ordering these pairs and $(v)_{n-i} = \frac{v!}{(v-n+i)!}$ ways of placing the pairs and single items in the hash table, so that

8) $p(i) = \frac{n!}{2^{i}\,i!\,(n-2i)!}\cdot\frac{(v)_{n-i}}{v^{n}}$ for $i = 0, 1, \ldots, \lfloor n/2 \rfloor$

in a form for hand computation,

9) $p(0) = \left(1 - \frac{1}{v}\right)\left(1 - \frac{2}{v}\right)\cdots\left(1 - \frac{n-1}{v}\right)$

$\;\;\; p(i) = p(i-1)\,\frac{(n-2i+2)(n-2i+1)}{2i\,(v-n+i)}$ for $i = 1, 2, \ldots, \lfloor n/2 \rfloor$

these results are exact, but the following approximations can be used with accuracy:

$\log p(0) = \sum_{k=1}^{n-1} \log\left(1 - \frac{k}{v}\right) \approx -\sum_{k=1}^{n-1} \frac{k}{v} \approx -\frac{n^{2}}{2v}$

let $\beta = \frac{n^{2}}{2v}$. terms linear in $n$ may be neglected in equation 9, giving

$p(0) = \exp(-\beta)$

$p(i) = \frac{\beta}{i}\,p(i-1)$

this is also a poisson distribution:

10) $p(i) = \exp(-\beta)\,\frac{\beta^{i}}{i!}$ for $i = 0, 1, 2, \ldots, \lfloor n/2 \rfloor$

this equation gives the approximate probability of $i$ collisions for a virtual scatter storage scheme. it may be used to form a confidence interval around the expected number of collisions $n_c = \beta$. for the previous example in which $v = 2^{26}$, $n = 2^{13}$, $n_c = \frac{1}{2}$, the following table of values can be made:

i    p(i)    sum of p(i)
0    .607    .607
1    .303    .910
2    .076    .986
3    .012    .998

the probability is .986 that the number of collisions is less than or equal to 2. since the algorithm gave 3 collisions, it appears to be a poor one.

the results for the collision properties are summarized in table 2.

table 2. expected collision properties for virtual scatter storage systems

v = virtual hash table size
n = number of words to be entered

measure: formula
collision factor: $\beta = \frac{n^{2}}{2v}$
expected number of collisions: $n_c = \beta$
probability of i collisions: $p(i) = \exp(-\beta)\,\frac{\beta^{i}}{i!}$, $i = 0, 1, \ldots, \lfloor n/2 \rfloor$
probability that the number of collisions c lies in [a,b]: $\mathrm{prob} = \sum_{i=a}^{b} p(i)$

experiments with algorithms for generating hash addresses

any scatter storage scheme depends on a good algorithm for producing hash addresses. this is especially true for virtual schemes in which collisions are to be eliminated. in these experiments three basic algorithms are evaluated for use in virtual schemes. the words in two dictionaries, the adi wordform and the cran 1400 wordform, are used. the hash-bump tables are filled using these words and the resulting collision and storage statistics compared with the expected values.

dictionaries

the adi wordform contains 7822 words pertaining to the field of documentation. it contains 206 common words (previously judged) averaging 3.93 characters. the remaining 7616 noncommon words average 8.00 characters. in all there are 61,712 characters.

the cran 1400 wordform contains 8926 words dealing with aeronautics. the common word list consists of that of the adi, plus four additional entries. the 8716 noncommon words average 8.40 characters. there is a total of 74,074 characters.

figures 3 and 4 show the distribution of the length of the words versus percentage of collection. the abrupt end to the curves in figure 3 is due to truncation of words to 18 characters.

both dictionaries have approximately the same size and proportions of words of various length. however, their vocabularies are considerably different. a good hash scheme should work equally well on both dictionaries.

[figure 3: percentage of collection versus word length for the common words, the adi dictionary, and the cran 1400 dictionary.]
fig. 3. distribution of dictionary words according to their lengths.

[figure 4: cumulative percentage of collection versus word length for the common words, the adi dictionary, and the cran 1400 dictionary.]
fig. 4. cumulative distribution of dictionary words according to their lengths.
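for dictionaries of the sizes just described, the collision model of table 2 can be tabulated before any hash algorithm is run. the sketch below is an illustration under stated assumptions: the function names are hypothetical, the poisson approximation of equation 10 is used in place of the exact equation 8, and `upper_bound` is simply the smallest count whose cumulative probability reaches a chosen level.

```python
import math


def collision_factor(n, v):
    """beta = n^2 / (2v), the expected number of collisions (equation 7)."""
    return n * n / (2 * v)


def prob_collisions(i, n, v):
    """approximate probability of exactly i collisions (equation 10)."""
    beta = collision_factor(n, v)
    return math.exp(-beta) * beta ** i / math.factorial(i)


def upper_bound(n, v, level=0.95):
    """smallest c such that prob[collisions <= c] >= level."""
    total = 0.0
    i = 0
    while True:
        total += prob_collisions(i, n, v)
        if total >= level:
            return i
        i += 1


if __name__ == "__main__":
    # dictionary sizes quoted in the text: adi 7822 words, cran 1400 8926 words
    for name, n in (("adi", 7822), ("cran", 8926)):
        for k in range(20, 30, 2):
            beta = collision_factor(n, 2 ** k)
            print(f"{name}  v = 2^{k}  expected collisions {beta:8.3f}  "
                  f"95% upper bound {upper_bound(n, 2 ** k)}")
```

an algorithm whose observed collision count sits well above the bound for a given virtual table size is unlikely to be producing uniformly random addresses, which is the test applied in the worked example with $n = 2^{13}$ and $v = 2^{26}$.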
hash coding algorithms

by their nature, hash coding algorithms are machine dependent. the computer representation of the alphabetic characters, the way in which arithmetic operations are done, and other factors all affect the randomness of the generated address.

the algorithms described below are intended for use on the ibm s/360. words are padded with some character to fill an integral number of s/360 fullwords. then the fullwords are combined in some manner to form a single fullword key, and the final hash address is computed from this key.

in the experiments which follow, the blank is used as a fill character. this is an unfortunate choice because of the binary representation of the blank, 01000000. in some algorithms the zeroes may propagate or otherwise affect the randomness. a good fill character is one that 1) is not available on a keypunch or teletype, 2) will not propagate zeroes, 3) will generate a few carries during key formation, and 4) has the majority of its bits equal to 0, so their positions may be filled. a likely candidate for the s/360 is 01000101.

three basic methods of generating virtual hash addresses (addition, multiplication, and division) are studied. the first and second provide contrasting ways of forming the single fullword keys. the second and third differ in the way the hash address is computed from the key. variations of each basic method are also tested to try to improve speed, programming ease, or collision-storage properties.

1. addition methods

ac (addition and center): the fullwords of characters are logically added to form the key. the key is squared and the centermost bits are selected as the major. the minor is obtained from bits on both sides of the major.

as (addition with shifting): same as ac, except the second, third, etc. fullwords are shifted two positions to the left before their addition in forming the key.
(an attempt to improve collision-storage properties)

am (addition with masking): same as ac, except the second, third, etc. fullwords have certain nonsignificant bits altered by masks before their addition in forming the key. (an attempt to improve collision-storage properties)

2. multiplication methods

mc (multiply and center): the fullwords of characters are multiplied together to form the key. the center bits of the previous product are saved as the multiplier for the next product. the key is squared and the centermost bits selected as the major. the minor is obtained from the bits on both sides of the major.

msl (multiply and save left): same as mc, but during formation of the key, the high order bits of the products, rather than the center, are used as successive multipliers. (an attempt to improve speed)

mlm (multiply with left major): same as mc, but taking the major from the left half of the square of the key and the minor from the right half. (an attempt to improve speed)

3. division methods

dp (divide by prime): the fullwords of characters are multiplied together to form the key. the center bits of the previous product are saved as the multiplier for the next product. the key is divided by the length of the virtual hash table, a prime number in this case, and the remainder used as the virtual hash address. the major is drawn from the left end of the virtual address and the minor from the right.

do (divide by odd number): same as dp, except using a hash table whose length is odd. (an attempt to provide more flexibility of hash table sizes)

dt (divide twice): same as dp, except two divisions are made. the major is produced by dividing the key by the actual hash table size. the minor results from a second division. primes are used throughout as divisors.
(an attempt to improve storage-collision properties)

evaluation

in the experiments to evaluate each variation of the above hash schemes, the size of the virtual hash table varies from $2^{20}$ to $2^{28}$ slots. the actual hash table varies in size from $2^{12}$ to $2^{14}$ slots. bump table space is used as needed. the tables are filled by the words from either the adi or cran dictionaries and the collision and storage statistics taken. because good collision properties are most important, they are examined first. the storage properties are dealt with later.

the number of collisions obtained from each scheme versus the virtual table length is plotted in figures 5 to 8. the adi dictionary is shown in figures 5 and 7, and the cran in figures 6 and 8. the circled lines correspond to curves generated from equations 7 and 10. the horizontal one shows the expected number of collisions and the lines above and below it enclose a 95% confidence interval about the expected curve. in other words, if an algorithm is generating random addresses, the probability is 95% that the curve for that scheme lies between the heavy lines.

[figure 5: number of collisions versus virtual hash table size (power of two), showing theoretical curves (equations 7 and 10), experimental curves, and interpolated curves.]
fig. 5. collisions in the adi dictionary for addition and multiplication hash schemes.

consider figures 5 and 6 showing the results for all the addition methods and the mc variation of the multiplication method. the ac and mc algorithms differ only in that addition is used in forming the key in the first one and multiplication in the second one. yet the curves are spectacularly different. the result seems to have the following explanation. the purpose of a hash address computation is to generate a random number from a string of characters. if the bits in the characters are as varied as possible, then the algorithm has a headstart in the right direction.
however, the s/360 bit patterns for the alphabet and numbers are:

a to i    1100 xxxx
j to r    1101 xxxx
s to z    1110 xxxx
0 to 9    1111 xxxx

in each case the two initial bits of a character are 1's, so that in any given word one-fourth of the bits are the same. in forming a key, the successive additions in the ac algorithm may obscure these nonrandom bits if a sufficient number of carries are generated. however, the number of additions performed is usually small (2 or 3), and it appears that the patterns are not broken sufficiently. the mc algorithm uses multiplication to form its keys, which involves many additions, certainly enough to make the resulting key random.

[figure 6: number of collisions versus virtual hash table size (power of two), showing theoretical curves (equations 7 and 10), experimental curves, and interpolated curves.]
fig. 6. collisions in the cran dictionary for addition and multiplication hash schemes.

the multiplications in the mc algorithm are costly in terms of computation time. therefore the as and am algorithms are tried. these addition variants try to hasten the breakup of the nonrandom bits by shifting and masking respectively. although these variants reduce the number of collisions somewhat, none of the addition schemes could be called random. typically a few words are singled out at some point and continue to collide regardless of the length of the virtual address. several collision pairs are listed below. note the similarities between the words.

count - sound
worth - forty
tolerated - telemeter
wheel - sheet

[figure 7: number of collisions versus virtual hash table size (power of two), showing theoretical curves, experimental curves, and interpolated curves.]
fig. 7. collisions in the adi dictionary for division and multiplication hash schemes.
[figure 8: number of collisions versus virtual hash table size (power of two), showing theoretical curves (equations 7 and 10), experimental curves, and interpolated curves.]
fig. 8. collisions in the cran dictionary for division and multiplication hash schemes.

consider the multiplication algorithms. during key formation, the process of saving the center of successive products adds to the computation time. the msl variation attempts to remedy this by saving only the high order bits between multiplications (on the s/360 this means saving the upper 32 bits of the 64-bit product). this method is so inferior that its collision graph could not be included with the others. the poor results stem from the fact that characters at the end of fullwords have little effect on the key and that the later multiplications swamped the effects of the earlier ones. examples of collision pairs are given below. for convenience the fullwords are separated by blanks.

cert aint y - cert ainl y
prev ente d - pres ente d
heav ing - heat ing
expe nse - expa nse
char ter - chap ter

the mc and mlm variants are identical with respect to collision properties. in general these algorithms produce good results, reducing the number of collisions to zero in both dictionaries. the collision curve is always beneath the expected one.

consider figures 7 and 8 showing the results for all division methods and the mc method. all of the division algorithms display a distinct rise in the number of collisions when the virtual table size is near $2^{24}$, regardless of the dictionary. the majority of the colliding word pairs are 4-character words having the same two middle letters. this brings to light a curious fact about division algorithms. for virtual tables, the divisor of the key is large and the initial few bits determine the quotient, leaving the rest for the remainder.
for words of 4 characters or less (which require no multiplications during key formation), dividing by $2^{24}$ is equivalent to selecting the last 3 characters of the word as the hash address. because the divisors are not exactly equal to $2^{24}$, only the two middle characters tend to be the same. examples are:

deal - bear
took - soon
held - cell
verb - term

this phenomenon apparently continues for table sizes around $2^{26}$ and $2^{28}$, but there are few or no words of 4 characters or less which agree in 26 or 28 bits. for divisors smaller than $2^{24}$, a larger part of the key determines the quotient and apparently breaks up the pattern. because the above effect occurs only for $v = 2^{24}$, these points are passed over on the graphs.

in general, the dt algorithm is superior to the rest of the division methods, mostly because each of its two divisors is smaller than those used in other methods. prime numbers seem to produce better results than other divisors.

on the basis of collision properties, the mc, mlm, dt, and possibly as algorithms are the best. storage-search evaluations are included for these methods only.

the experiments with each hash coding method also include counting the frequency of various lengths of collision chains. here a collision chain refers to chains of words producing the same major. the frequency counts are compared with the expected counts given by equation 5. the comparison is in terms of a chi-square goodness-of-fit test with a 10% level of significance. figures 9 and 10 show the results of this test for each dictionary.

[figure 9: chi-square value versus virtual hash table size (power of two) for the dt, as, mlm, and mc schemes; the curve marked x is the 10% level of significance.]
fig. 9. deviations of storage-search properties from expected values for selected hash schemes using the adi dictionary.
included in the graphs is the line corresponding to the 10% level of significance. if the major portions of the hash addresses are really random, there is a probability of 0.90 that the 10% line will lie above the curve for the algorithm tested.

[figure 10: chi-square value versus virtual hash table size (power of two) for the selected schemes, including mlm and dt; the curve marked x is the 10% level of significance.]
fig. 10. deviations of storage-search properties from expected values for selected hash schemes using the cran dictionary.

consider the mc and mlm algorithms, which differ only in that the major is selected from the center and the left of the virtual address respectively. from the graphs, it is clear that the multiplication methods produce their most random bits in the center of their product. this is somewhat as expected, because the center bits are involved in more additions than other bits.

the division algorithm, which had fairly good collision properties, seems to have rather mediocre storage properties. this is probably due to the same causes as the collision problems, but working at a lower level, and not affecting the results as much.

the as curve is included simply for completeness. the scheme displays a well behaved storage curve, but it has poor collision properties.

in summary, the mc scheme seems to be the best for both dictionaries in terms of collision and search properties. in terms of computing time, the method is more time consuming than the addition methods, but less expensive than the division methods. the difference in computation times is not an extremely big factor. all methods required from 35 to 55 microseconds for an 8-character word on the s/360/65. the routines are coded in assembly language and called from a fortran executive. the times above include the necessary bookkeeping for linkage between the routines.
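the mc scheme singled out in this evaluation can be sketched in a few lines. this is a rough illustration, not the assembly routine used in the experiments, and several details are assumptions: ascii replaces the s/360 character code, fullwords are 4 bytes taken big-endian, the "center" of a 64-bit product is taken to be its middle 32 bits, and the 15-bit major with a 14-bit minor is one possible split. the names `fullwords`, `center32`, `mc_key`, and `mc_address` are hypothetical, and words are assumed to be non-empty.

```python
MASK32 = 0xFFFFFFFF


def fullwords(word, fill=" "):
    """pad the word with the fill character and cut it into 32-bit fullwords."""
    data = word.encode("ascii")
    data += fill.encode("ascii") * ((-len(data)) % 4)
    return [int.from_bytes(data[i:i + 4], "big") for i in range(0, len(data), 4)]


def center32(product):
    """middle 32 bits of a 64-bit product (assumed meaning of 'center')."""
    return (product >> 16) & MASK32


def mc_key(word):
    """multiply the fullwords together, saving the center of each product."""
    words = fullwords(word)
    key = words[0]
    for w in words[1:]:
        key = center32(key * w)
    return key


def mc_address(word, major_bits=15, minor_bits=14):
    """square the key; the centermost bits give the major, the bits on
    both sides of the major give the minor."""
    square = mc_key(word) ** 2                 # at most 64 bits
    total = major_bits + minor_bits
    low = (64 - total) // 2                    # position of the central window
    window = (square >> low) & ((1 << total) - 1)
    side = minor_bits // 2                     # minor bits taken to the right of the major
    major = (window >> side) & ((1 << major_bits) - 1)
    left = window >> (side + major_bits)
    right = window & ((1 << side) - 1)
    minor = (left << side) | right
    return major, minor
```

because multiplication is commutative, this plain form gives the same address to any two words whose fullwords are permutations of each other; `mc_address("wingtail")` equals `mc_address("tailwing")`, for instance.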
a practical lookup scheme

general description

the lookup scheme described below is designed for use with dictionaries of about $2^{15}$ words. the virtual table size selected is $2^{29}$ and the actual table size is $2^{15}$. on the basis of the results presented in previous sections, when the dictionary is full, it is expected that
1) 36.8% of the hash table will be empty,
2) 36.8% of the hash table will be single entries,
3) the bump table will require $(0.632)\,2^{15}$ entries,
4) 1 collision is expected,
5) the probability of 5 or fewer collisions is 0.999, and
6) the average lookup will require 2.13 probes.

table layout

in all previous discussions a dictionary entry has included a minor and a concept number. a concept number is simply a unique number assigned to each word. the hash address of a word is also unique, and hence can be used. there is no need to store and use a previously assigned concept number. a dictionary entry contains a 14-bit minor and a single bit indicating whether the word is common or noncommon:

bit 1: c
bits 2-15: minor

c = 0 implies the word is common; c = 1 implies the word is noncommon.

a hash table entry contains 16 bits arranged as:

bit 0: flag
bits 1-15: information

flag = 0 implies that the information is a dictionary entry; flag = 1 implies that the information is a pointer to the bump table.

words that have the same major are stored in a block of consecutive locations in the bump table. this eliminates the need for pointers in the collision "chains". a bump table entry also has 16 bits structured as:

bit 0: end
bit 1: c
bits 2-15: minor

end = 0 implies that the entry is not the last in the collision block; end = 1 implies that the entry is the last in the block.

some convention must be adopted to signify an empty hash table slot. a zero is most convenient in the above scheme. unfortunately a zero is also a legitimate minor.
however, to cause trouble the word generating the zero minor would have to be a common word and a single table entry (zero minors in the bump table are no problem). hopefully this occurs rarely because of the size of the minor (14 bits) and the small number of common words. however, even if this combination of circumstances occurs, the common word could be placed in the bump table anyway.

in designing the tables, it is important to make the hash table entries large enough to accommodate the largest pointer anticipated for the bump table. for the above scheme, the expected bump table size is less than $2^{15}$ so that the 15 bits allocated for pointers is sufficient.

search considerations

the number of probes needed to locate any given word depends on the place that the word occupies in a collision block. the average search time is improved if the most common words occupy the initial slots in each block. a study of adi text yields the statistics given in tables 3 and 4.

table 3. division of words by category.

category            number of words    percent of total
total words               17270              100.0
common words               8716               50.5
noncommon words            8554               49.5

table 4. distribution of lengths.

number of     all               common            noncommon
characters    words   percent   words   percent   words   percent
1-4           10145    58.8      8057    92.5      2097    24.5
5-8            4630    26.8       627     7.2      4003    46.8
9-12           2249    13.0        32     0.3      2217    25.9
13-16           221     1.3         0     0.0       221     2.6
17-20            11     0.1         0     0.0        11     0.1
21-24             5     0.0         0     0.0         5     0.1
totals        17270   100.0      8716   100.0      8554   100.0
av. length      6.3               4.3               8.3

using the categorical information, it appears that in filling the hash-bump tables, the common words should be entered first. within each category, all words should be entered in frequency order if such information is known. if frequency information is not available, the distribution by lengths can be used as an approximation to it. for common words, this means entering the shorter words first.
for noncommon words, the words of 5 to 8 characters should be entered first.

the greater the number of single entries, the greater the average search speed. figure 11 shows the fraction of single entries ($f_1$) and the fraction of empty slots ($f_0$) for various load factors.

[fig. 11. theoretical hash table usage: fraction of empty slots and fraction of single entries plotted against load factor.]

the fraction of single entries $f_1 = a e^{-a}$ reaches a maximum for $a = 1$, but since the slope of the curve is small around this point, the fraction is practically the same for any load factor in the interval (0.8, 1.2). table usage is better, however, for the larger values of $a$. these facts imply that scatter storage schemes make most efficient use of space and time for $a = 1$.

most text words can be assumed to be in the dictionary. thus the order of comparisons during lookup should be:

hash table scan
    1) check minor assuming the text word is a common word
    2) check minor assuming the word is noncommon
    3) check if the entry is a pointer to the bump table
    4) check if the entry is empty

first bump table entry (must be at least two)
    5) check minor assuming the word is a common word
    6) check minor assuming the word is noncommon

other bump table entries
    7) check minor assuming the word is noncommon
    8) check minor assuming the word is common
    9) check if at end of collision block.

the search pattern can be varied to take advantage of the storage conditions. for example, if all common words are either single entries or the first element of a collision block, then step 8 may be eliminated.

performance

the lookup system described above has been implemented and tested on the ibm s/360/65. a modified form of the mc algorithm is used to compute a 29-bit virtual address and divide it into a 15-bit major and a 14-bit minor. the modification is the inclusion of a single left shift of the fullwords of characters during key formation.
this breaks up certain types of symmetries between words such as wingtail and tailwing. without this, such words will always collide.

the hash-bump tables were filled with entries from the adi dictionary: common words first, followed by noncommon words. the shortest words were entered first. table 5 gives a comparison of the expected and actual results.

table 5. lookup system results (a = .239).

                                            expected    actual
    number of empty table slots              25810      25762
    number of single entries                  6161       6250
    number of collision blocks                 797        756
    longest collision block                      4          4
    average length of collision blocks         2.1        2.1
    size of bump table                        1663       1572
    number of collisions                       .06          0
    average probes per lookup                 1.33       1.33

to obtain the actual lookup times 627 words were processed. the words were read from cards and all punctuation removed. each word was passed to the lookup program as a continuous string of characters with the proper number of fill characters added. the resulting times are given in table 6 (in microseconds); a larger sample of the category of "not-found" words processed with less accurate timings indicates that the average time for words in this category is about 62 microseconds (standard deviation 26).

table 6. lookup times.

    category     number     percent     average    standard     average
    of words     of words   of total    time       deviation    probes
    all            627       100.0       57.9        11.7        1.18
    common         288        45.9       49.9         6.7        1.12
    noncommon      338        53.9       64.7        10.7        1.24
    not found        1         0.2       53.1         0.0        1.00

the time to compute a hash address depends on the length of the word. let $n$ be the number of s/360 fullwords needed to hold these characters. the time to form the initial address is $i(n) = 34.5 + 10.2(n-1)$ microseconds. the average total lookup time, then, is $t = i(n) + cp$, where $c$ is the average time per probe into the table space and $p$ is the average number of probes. for the words in the experiment $n = 2.32$ (average), $i(n) = 40.3$, and $t = 57.9$, so that each probe required about 15 microseconds.
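the expected column of table 5 and the per-probe time can be checked with a few lines of arithmetic. the sketch below is illustrative only and rests on stated assumptions: empty and single-entry fractions follow the poisson forms $e^{-a}$ and $a e^{-a}$, the hash table has $2^{15}$ slots (one per value of the 15-bit major), the dictionary holds the 7822 entries quoted for the adi dictionary, and the expected number of full virtual-address collisions is approximated by $n^2 / 2^{30}$ for a 29-bit address. the function names are hypothetical.

```python
import math


def expected_usage(words: int, slots: int, virtual_bits: int) -> dict:
    """poisson estimates of how `words` keys spread over `slots` hash table slots."""
    a = words / slots                                   # load factor
    empty = slots * math.exp(-a)                        # slots holding no word
    single = slots * a * math.exp(-a)                   # slots holding exactly one word
    blocks = slots - empty - single                     # slots that point into the bump table
    collisions = words ** 2 / 2 ** (virtual_bits + 1)   # words sharing a full virtual address
    return {"load": a, "empty": empty, "single": single,
            "blocks": blocks, "collisions": collisions}


def time_per_probe(total_us: float, address_us: float, probes: float) -> float:
    """solve t = i(n) + c*p for c, the average time per table probe."""
    return (total_us - address_us) / probes


# assumed inputs: 7822 dictionary words, 2**15 slots, 29-bit virtual address
usage = expected_usage(words=7822, slots=2 ** 15, virtual_bits=29)
# reported values from the timing experiment: t = 57.9, i(n) = 40.3, p = 1.18
c = time_per_probe(total_us=57.9, address_us=40.3, probes=1.18)

for name, value in usage.items():
    print(f"{name:>10}: {value:.3f}")
print(f"time per probe: {c:.1f} microseconds")
```

the value $i(n) = 40.3$ is taken as reported rather than recomputed from the address-time formula.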
comparisons

timing information for other lookup schemes is difficult to obtain. a tree-structured dictionary is used for a similar purpose at harvard. published information indicates 6pq microseconds are needed to process p words in a dictionary of q entries. this time is for the ibm 7094. translating this time to the s/360/65, which is roughly four times faster, and using the adi dictionary (q = 7822), it appears that each lookup averages 11,000 microseconds. exactly how much computation and input-output this includes is unknown.

extensions

larger dictionaries

as more words are added to the dictionary, the size of the virtual address must increase in order to prevent collisions. as a result, the number of bits per table slot must also increase in order to accommodate the larger minors and pointers that are used. for a fixed-sized hash table, the number of entries in the bump table grows as new words are added. at some point the space required for tables will exceed the amount of core allotted for dictionary use. to salvage the scheme, it may be possible to split the bump table into parts: one part for more frequently used words and one for words in rather rare usage. during dictionary construction common words are entered first, then noncommon, then rare. when a rare word must be placed in a collision block, a marker is stored instead, and the item is placed in the secondary bump table. presumably the nature of the words in the second bump table will make its usage rather infrequent, thus saving access to auxiliary storage to fetch it.

suffix removal

many dictionary schemes store only word stems; the lookup attempts to match only the stem, disregarding suffixes in the process. this is not easily done with scatter storage schemes. one solution is to try to remove the suffix after an initial search has failed. each of the various possible stems must be looked up independently until a match is found.
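a minimal sketch of this retry strategy follows. it is not the author's implementation: the suffix list, the minimum stem length, and the function names are hypothetical, and an ordinary python dict stands in for the hash-bump tables so that the retry logic can be read on its own.

```python
from typing import Callable, Iterator, Optional, Tuple

# hypothetical suffix list, tried in this order
SUFFIXES = ("ing", "ed", "es", "s")


def candidate_stems(word: str) -> Iterator[str]:
    """yield each stem obtained by stripping one known suffix from the word."""
    for suffix in SUFFIXES:
        # keep at least three characters of stem (an arbitrary assumption)
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            yield word[: -len(suffix)]


def lookup_with_suffix_removal(
    word: str, lookup: Callable[[str], Optional[int]]
) -> Optional[Tuple[str, int]]:
    """try the full word first; only after a miss, look up each stem independently."""
    entry = lookup(word)
    if entry is not None:
        return word, entry
    for stem in candidate_stems(word):
        entry = lookup(stem)
        if entry is not None:
            return stem, entry
    return None


# stand-in dictionary mapping stems to concept numbers
dictionary = {"index": 1, "catalog": 2}
print(lookup_with_suffix_removal("cataloging", dictionary.get))
print(lookup_with_suffix_removal("library", dictionary.get))
```

each failed stem costs a full hash computation and table probe, which is why the text treats this as a fallback after the initial search rather than as the normal path.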
another solution is to use a table of correspondences between the various forms of a word and its stem. the concept number could be used as an index on this table containing pointers to information about the actual stem. a thesaurus lookup can be handled the same way.

application to library files

library files (characterized by a large number of entries, personal and corporate names, foreign language excerpts, etc.) present special problems to lookups. with regard to size, there is no particular reason that scatter storage cannot be extended to such files. the only genuine requirement is the ability to compute a virtual address long enough to insure a reasonably low number of collisions. as mentioned previously, table space can become a problem. for really large files, a two-stage process looks most promising. a small hash table is used to address high frequency items and a larger hash table is used for addressing all other data. lookup starts with the small tables and continues to the larger ones if the initial search fails. the same virtual address can be used in both lookups by shifting a few bits from the high-frequency minor to the low-frequency major. this two-stage technique should keep the amount of table shuffling to a minimum and provide rapid lookup for all textual data in titles, abstracts, etc.

with respect to bibliographic information, personal and corporate names are bothersome because they can occur in several forms. unfortunately, scatter storage schemes do not guarantee that dictionary entries for r. a. jones and robert a. jones are near each other, so that if an initial lookup fails, the rest of the search can be confined to a local area of the file. there are two approaches to the problem: (1) standardization of names before input or (2) repeated lookups using variants of a name as it occurs in text. standardization, along with delimiting and formatting bibliographic data, is probably the most effective and least expensive approach.
in addition, it reduces the amount of redundant data in the file.

phrases in foreign languages present a difficulty, since the character sets on most computing equipment are limited to english letters and symbols. however, if an encoding for such symbols is used, lookup can proceed normally. the problem of obtaining the dictionary entry for an english equivalent of a foreign word is a completely different matter and will not be dealt with here.

conclusions

virtual scatter storage schemes are well suited for dictionaries, having both rapid lookup and economy of storage. the rapid lookup is due to the fact that the initial table probe limits the search to only a few items. the space savings come from the fact that the actual character strings for words are not part of the dictionary. the schemes depend heavily on a good algorithm for producing random hash addresses. the theory developed in the first two sections of this paper gives a basis for judging the worth of proposed algorithms. for any particular application, the table organization may vary to suit different needs and to store different information. however, the advantages of scatter storage schemes are still present.

references

1. salton, g.: "a document retrieval system for man-machine interaction." in association for computing machinery. proceedings of the 19th national conference, philadelphia, pennsylvania, august 25-27, 1964, pp. l2.3-1 to l2.3-20.
2. mcilroy, m. d.: dynamic storage allocation (bell telephone laboratories, inc., 1965).
3. morris, r.: "scatter storage techniques," communications of the acm (january, 1968).
4. maurer, w. d.: "an improved hash code for scatter storage," communications of the acm (january, 1968).
5. johnson, l. r.: "indirect chaining method for addressing on secondary keys," communications of the acm (may, 1961).
antelman 128 information technology and libraries | september 2006

library catalogs have represented stagnant technology for close to twenty years. moving toward a next-generation catalog, north carolina state university (ncsu) libraries purchased endeca’s information access platform to give its users relevance-ranked keyword search results and to leverage the rich metadata trapped in the marc record to enhance collection browsing. this paper discusses the new functionality that has been enabled, the implementation process and system architecture, assessment of the new catalog’s performance, and future directions.

editor’s note: this article was submitted in honor of the fortieth anniversaries of lita and ital.

the promise of online catalogs has never been realized. for more than a decade, the profession either turned a blind eye to problems with the catalog or accepted that it is powerless to fix them. online catalogs were, once upon a time, “the most widely-available retrieval system and the first that many people encounter.”1 needless to say, that is no longer the case. libraries cannot force users into those “closed,” “rigid,” and “intricate” online catalogs.2 as a result, the catalog has become for many students a call-number lookup system, with resource discovery happening elsewhere. yet, while the catalog is only one of many discovery tools, covering a proportionately narrower spectrum of information resources than a decade ago, it is still a core library service and the only tool for accessing and using library book collections.
in recognition of the severity of the catalog problem, particularly in the area of keyword searching, and seeing that integrated library system (ils) vendors were not addressing it, the north carolina state university (ncsu) libraries elected to replace its keyword search engine with software developed for major commercial web sites. the software, endeca’s information access platform (iap), offers state-of-the-art retrieval technologies.

early online catalogs

larson and large and beheshti summarize an extensive body of literature on online public access catalogs (opacs) and related information-retrieval topics through 1997.3 the literature has tapered off since then; however, as promising innovations failed to be realized in commercial systems, mainstream opac technology stabilized, and the library community’s collective attention was turned to the web.

first-generation online catalogs (1960s and 1970s) provided the same access points as the card catalog, dropping the user into a pre-coordinate index.4 the first online catalogs, byproducts of automating circulation functions, were “intended to bring a generation of library users familiar with card catalogs into the online world.”5 the expectation was that most users were interested in known-item searching.6 with the second generation of online catalogs came keyword or post-coordinate (boolean) searching. while systems based on boolean algebra represented an advance over those that preceded them, boolean is still a retrieval technique designed for trained and experienced searchers.
(twenty years ago, salton wrote, “[t]he conventional boolean retrieval methodology is not well adapted to the information retrieval task.”7) boolean systems were, however, simple to implement and economical in their storage and processing requirements, important at that time.8 soon after the euphoria of combining free-text terms across records wore off, the library community recognized that the major problem with first- and second-generation catalogs was the difficulty of searching by subject.9

the “next-generation” catalog

by the early 1980s, thinking turned to next-generation catalog features.10 out of this surge of interest in improving online catalogs emerged a number of experimental catalogs that incorporated advanced search and matching techniques developed by researchers in information retrieval. they typically did not rely on exact match (boolean) but used partial-match techniques (probabilistic and vector-based). since probabilistic and vector-based models were first worked out on document collections, not collections of marc records, adaptations were made to the models.11 these prototype systems included okapi, which implemented search trees, and cheshire ii, which refined probabilistic retrieval algorithms for online catalogs.12

it is particularly sobering to revisit one system that was developed between 1979 and 1983. the cite catalog, developed at the national library of medicine, incorporated many of the features of the endeca-powered catalog, including suggesting (mesh) subject headings, correcting spelling errors, stemming, as well as even more advanced features, such as term weighting, keyword suggestion, and “find similar.”13

toward a twenty-first century library catalog
kristin antelman, emily lynema, and andrew k. pace

kristin antelman (kristen_antelman@ncsu.edu), emily lynema (emily_lynema@ncsu.edu), and andrew k. pace (andrew_pace@ncsu.edu) are respectively associate director for the digital library, systems librarian for digital projects, and head, information technology, at the north carolina state university libraries, raleigh.

where are we now?

as belkin and croft noted in 1987, “there is a disquieting disparity between the results of research on ir techniques . . . and the status of operational ir systems.”14 two decades later, libraries are no better off: all major ils vendors are still marketing catalogs that represent second-generation functionality. despite between-record linking made possible by migrating catalogs to web interfaces, the underlying indexes and exact-match boolean search remain unchanged. it can no longer be said that more sophisticated approaches to searching are too expensive computationally; they may, however, be too expensive to introduce into legacy systems from a business perspective.

the endeca-powered catalog

coupled with the relative paucity of current literature on next-generation online catalogs is a scarcity of library industry interfaces from which to draw inspiration, rlg’s red light green and oclc’s fictionfinder being notable exceptions. in june 2004, library automation vendor tlc announced a partnership with endeca technologies for joint sales, marketing, technology, and product development of the company’s iap software. this search software underlies the web sites of companies such as wal-mart, barnes and noble, ibm, and home depot. ncsu libraries acquired endeca’s iap software in may 2005, started implementation in august, and deployed the new catalog in january 2006.

several organizational and cultural factors contributed to making this project possible. of significance was an ongoing administrative commitment to fund digital-library innovation, including projects that involve some risk.
library staff share this feeling that calculated risks are opportunities to improve the library as well as to open up new challenges in their own jobs. critically, they also believe that not all issues, particularly “edge cases” (i.e., rarely occurring scenarios), must be resolved before releasing a new service. finally, it was important that the managers who controlled access to programming and other resources were also the project leaders and drivers of the collective urgency to solve the underlying problem. all these factors also contributed to making possible a five-month implementation timeline.

functionality

the principal functionality gained by implementing an advanced search-and-navigation technology such as the endeca iap falls in three main areas: relevance-ranked results, new browse capabilities, and improved subject access. most ilss, including ncsu’s former catalog, presented keyword results to users in one order: last-in, first-out (i.e., system sort), while browsing within keyword result sets was limited to the links within individual records.

searching and relevance ranking of results

inhabiting the catalog search landscape now, somewhere between a second- and third-generation catalog, is endeca’s mdex engine, which is capable of both boolean and limited partial-match retrieval. queries submitted to endeca can use one of several matching techniques (e.g., matchall, matchany, matchboolean, matchallpartial). the current ncsu implementation primarily uses the “matchall” technique for keyword searching, an implied “and” technique that requires that all search terms (or their spell-corrected, truncated form) entered by the user occur in the result. the user is not required to enter boolean operators for this type of search; in fact, these terms are discarded as stopwords. the “matchboolean” technique continues to support true boolean queries with standard operators; access to this functionality is provided through advanced search options.
although classic information retrieval research tends to associate relevance ranking with probabilistic or vector-based retrieval techniques, endeca includes a suite of relevance ranking options that can be applied to boolean-type searches (i.e., implied and/or). these individual modules are combined and prioritized according to customer specifications to form an overall relevance ranking strategy, or algorithm. each search index created in the endeca software can be assigned a different relevance ranking strategy. this capability becomes significant when considering the differences in the data being indexed for isbn/issn as compared to a general keyword search.

since the keyword anywhere index contains the majority of the fields in a marc record and is the default search operator, its relevance ranking strategy received the most attention. this strategy currently consists of seven modules. the first five modules rank results in a dynamic fashion, while the final two modules provide static ordering based on publication date and total circulation. the ncsu libraries’ algorithm prioritizes results with the query terms exactly as entered (no spell-correction, truncation, or thesaurus matching) as most relevant. for multiterm searches, results containing the exact phrase are considered more relevant than those that do not. in addition, ncsu has created a field priority ranking, which provides the capability to define matches that occur in the title as more relevant than matches that occur in the notes fields. the relevance algorithm also considers factors such as the number of times the query appears in each result and the term frequency/inverse document frequency (tf/idf) of query terms.

the unprecedented nature of using this particular set of tools to define relevance algorithms in library catalogs meant that the initial configuration required a best guess approach.
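one way to picture a strategy built from prioritized modules is as a lexicographic sort key, where each module contributes one component and earlier components dominate later ones. the sketch below is an illustration under that assumption, not endeca’s api or ncsu’s actual configuration: it uses only four simplified modules (exact phrase, field priority, publication date, circulation), and every class and function name is hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Hit:
    title: str
    notes: str
    year: int          # publication date (static module)
    circulation: int   # total circulation (static module)


def exact_phrase(hit: Hit, query: str) -> int:
    """dynamic module: 1 if the query occurs as a phrase anywhere in the record."""
    return 1 if query.lower() in f"{hit.title} {hit.notes}".lower() else 0


def field_priority(hit: Hit, query: str) -> int:
    """dynamic module: title matches outrank matches in the notes fields."""
    terms = query.lower().split()
    if all(term in hit.title.lower() for term in terms):
        return 2
    if all(term in hit.notes.lower() for term in terms):
        return 1
    return 0


def rank(hits: List[Hit], query: str) -> List[Hit]:
    """order hits by modules in priority order; later modules only break ties."""
    def key(hit: Hit) -> Tuple[int, int, int, int]:
        return (exact_phrase(hit, query), field_priority(hit, query),
                hit.year, hit.circulation)
    return sorted(hits, key=key, reverse=True)


hits = [
    Hit("open access publishing", "", 2001, 3),
    Hit("access to open shelves", "", 2005, 40),
    Hit("scholarly journals", "covers open access models", 2006, 9),
    Hit("open access and libraries", "", 2004, 1),
]
for hit in rank(hits, "open access"):
    print(hit.title)
```

under this reading, reordering the tuple components is all it takes to try a different strategy, which is the kind of quick reconfiguration a best-guess starting point depends on.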
the ability to quickly change the settings and re-index provided the opportunity both to learn by doing and test assumptions. much work remains, however, including systematic testing of the “matchallpartial” retrieval technique. while not a true probabilistic or vector-based matching approach, the “matchallpartial” retrieval technique will broaden a search by dropping individual query terms if no results are returned. however, this type of retrieval technique creates the challenge of developing an intuitive interface that helps users understand partial matching (although many users must be aware that this is how google works).

spell correction, “did you mean . . . ,” and sort

several other features are included in the basic endeca iap application. these include auto-correction of misspelled words, which uses an index-based approach based on frequency of terms in the local database rather than a dictionary. due to the presence of unique terminology in the database (particularly author names), the relevance ranking has been configured to display any matches on the user’s original term before spell-corrected matches. a “did you mean . . .” feature also checks queries against terms indexed within the local database to determine if another possible term has more hits than the original term in order to provide the user the option to resubmit the search with a different spelling. various sort options are supported, including date, title, author, and “most popular.”

browse

whatever the shortcomings of the card catalog, a library user could approach it with no query in mind; any drawer could be browsed. with the advent of online catalogs, this is no longer possible: an initial search is required to enter the system. marchionini characterizes “browsing strategies” as “informal and opportunistic.”15 a good catalog browse should simulate the experience of browsing the stacks, even potentially improving upon it since the virtual browser can jump around.
many patrons cite the serendipity of browsing the stacks and “recognizing” relevant resources as a key part of their discovery process. with more books moving to online formats and off-site storage (and therefore, unable to be browsed), enhancing virtual browsing in the catalog becomes increasingly important. as borgman points out, “few systems allow searchers . . . to pursue non-linear links in the database.”16

key browsing features provided by the endeca software are faceted navigation and the ability to browse the entire collection without entering a search term. although most modern search engines support both fast response times and relevance ranking, the opportunity to apply endeca’s guided navigation feature to the highly structured marc record data was particularly intriguing. guided, or faceted, navigation exposes the relationships between records in the result set. for example, a broad topical search might return thousands of results. classification codes, subject headings, and item-level details can be used to define logical clusters for browsing (post-coordinate refinement) within the result set. since these refinements are based on the actual metadata of the records in the result set, users can never refine to less than one record (i.e., there are no “dead ends”). these clusters, or facets, are known as dimensions. users are able to select and remove values from all available dimensions in any order to assist them as they browse through the result set.

endeca’s dimensions, while able to be browsed, are not available only as post-coordinate search refinements, however. using the endeca application, library catalogs can once again give users the ability to browse the entire set of records without first entering a search term.
any of the dimensions can be used to browse the collection in this fashion, and the ability to assign item-level information (e.g., format, availability, new book), as well as bibliographic-record elements, to the dimensions further enhances the browsing functionality.

improving subject access

given the unsuitability of library of congress subject headings (lcsh) as an entry vocabulary, improving topical (subject) access in catalogs centers around keyword searching. while keyword searches query the subject headings as they do the rest of the record, most systems do not take advantage of the fact that subject headings are controlled and structured access points or use the subject information embedded in the classification number. the endeca-powered catalog, in addition to addressing classic keyword-search problems by introducing relevance ranking, implied phrase, spell correction, and stemming, also leverages the “ignored” controlled vocabulary present in the bibliographic records (subject headings and classification numbers) to aid in improving topical searching. this is a system design concept that has been discussed in the literature on improving subject access but has not until now been manifest in a major catalog implementation. as chan noted, “subject headings and classification systems have more or less operated in isolation from each other.”17 the endeca-powered catalog interface is an experiment in presenting users with these two different, but complementary, approaches to categorizing library materials by subject.

classification

several catalog experiments created retrieval clusters based on dewey- and ddc-classification schemes and captions in order to improve subject access by expanding the entry vocabulary and as a way to improve precision and recall.18 using the lc classification is more challenging, however, as it is not hierarchical.
still, the potential of its use has been noted by bates and coyle; and larson experimented with creating clusters (“classification clusters”) based on subject headings associated with a given lc class.19 in larson’s system, the interface suggested possible subject headings of interest, an approach similar to that of displaying the subject facets alongside the result set in the endeca catalog.

there is some evidence from early usability studies that exposing the classification, much as it was physically exposed in the card catalog, is useful and desired by catalog users. markey summarizes findings of a 1981 council on library resources study in which many institutions conducted usability testing. positive aspects of card-catalog use that people wanted to see in the opac included a “visual overview of what is available in the library” and “serendipity.”20 but there is a difference between using the classification scheme to identify subject headings and displaying the classification itself in the user interface. the latter can be problematic from a usability perspective, as larson pointed out, because the classification scheme and terminology are not transparent.21 imagine the would-be browser of a library’s computer-science collection having to know to select first q science, then qa1–qa939 mathematics, and then qa71–qa90 instruments and machines before possibly recognizing that qa75–qa76.95 calculating machines included computer science. despite these potential problems, because the endeca software supported display of the lc classification as a dimension, ncsu decided to experiment with its utility by making it available on the results screen.

entry vocabularies

entry vocabularies or mappings apply to all types of retrieval models.
they address the general problem of reconciling a user’s query vocabulary with the index vocabulary represented in the catalog or documents.22 studies show that users’ query vocabulary is large (people rarely pick the same term to describe the same concept) and inflexible (people are unable to repair searches with synonyms).23 because of this, bates refers to the objective of the entry vocabulary as the “side-of-a-barn principle.”24 several approaches have been taken to develop this functionality. building on larson’s “classification clustering” methodology, buckland created an entry vocabulary module by associating dictionaries created by analyzing database records.25 the result was natural language indexes to existing thesauri and classification systems.

while the endeca-powered catalog does not yet incorporate an entry vocabulary, its exposure of the index vocabulary to the user in subject dimensions could be said to be a limited side-of-a-barn approach. the limitation is that only controlled vocabulary from the retrieved records is exposed as dimensions on the results screen; relevant records not retrieved because of a lack of match between query vocabulary and terms in the record will not have their facets displayed. were an entry vocabulary for lcsh available, endeca’s synonym-table feature could be used to map between query terms and lcsh.

implementation

the library’s information technology advisory committee appointed a seven-member representative team to oversee the implementation. preparatory steps included sending key development staff to training and a two-day meeting with endeca project managers to establish functional and technical requirements.

architecture

knowing that the endeca application would not completely replace ncsu’s integrated library system, determining how best to integrate the two products was part of the implementation process.
the endeca iap coexists with the sirsidynix unicorn ils and the sirsidynix (web2) online catalog, indexing marc records that are exported from unicorn’s underlying oracle database. figure 1 depicts the integration of the endeca software with existing systems.

although the endeca software is capable of communicating directly with the database that supports the unicorn ils, ncsu chose the easier path of exporting marc records into text files for ingest by endeca. the marc4j api is used to reformat the exported marc records (which include item-level information in 999 fields) into flat text files with utf-8 encoding that are parsed by endeca’s data foundry process. nightly shell scripts export updated and new records from the ils, merge those with the base endeca files, and start the re-indexing process.

the indexing of seventy-three marc record fields and ten dimensions results in an index size of approximately 2.5 gb. the entire index resides in system memory. the endeca data foundry can easily parse and reindex the approximately 1.7 million titles in ncsu’s holdings nightly (in stark contrast to the more than 3 days of downtime required to re-index keywords in unicorn). the relative speed of this process and the fact that it does not interfere with the front-end application prompted the decision not to implement “partial indexing” at the outset.

though there was little doubt among staff as to the increased capabilities of keyword searching through endeca, the implementation team decided that authority searching (author, title, subject, call number) would be preserved in the new catalog interface. this allowed ncsu to retain the value of authority headings, in addition to providing a familiar interface and approach to known-item searching.
since the detailed record in web2 included the capability to save records, place requests, and send system-suggested searches (“more like this”), the implementation team also decided to link from titles in the endeca-powered results page to the web2 detailed record. only slight modifications were required to stylize this display in a manner consistent with the new interface. the front-end interface for keyword searching in endeca is a java-based web application built in-house. this application is responsible for sending queries to the endeca mdex engine (the back-end http service that processes user queries) and displaying the results that are returned.

user-interface design

because the front-end application is created by the customer, ncsu libraries has complete control over the look, feel, and layout of the endeca search-results page.

indexes, properties, and dimensions

the implementation team began the process of making indexing decisions by looking at the fields indexed in the unicorn keyword-index file. this list included 161 marc fields and subfields, including more than thirty fields that are never displayed to the public. this kitchen-sink approach was replaced with a more carefully selected list less than half that size. the implementation team defined eleven dimensions for use with endeca’s faceted navigation feature. once users enter a search query, they can explore the result set by selecting values from these dimensions: availability; lc classification; subject: topic; subject: genre; format; library; subject: region; subject: era; language; and author (see figure 2). the eleventh dimension is not displayed on the results page, but is used to enable patrons to browse new titles. each dimension value also lists the number of results associated with it; most dimensions are listed in frequency order.
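the per-value result counts and frequency ordering described above can be illustrated with a short sketch. it is not endeca’s implementation; the record shape (a dict mapping a dimension name to a list of values) and the function name are assumptions made for the example.

```python
from collections import Counter


def dimension_counts(records, dimension):
    """Count how many records in a result set carry each value of a dimension.

    `records` is a list of dicts such as {"format": ["book"], "language": ["english"]}
    (an assumed shape). Returns (value, count) pairs in frequency order,
    with ties broken alphabetically so the output is stable.
    """
    counts = Counter()
    for record in records:
        # a value repeated within one record still counts that record once
        for value in set(record.get(dimension, [])):
            counts[value] += 1
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


results = [
    {"format": ["book"], "language": ["english"]},
    {"format": ["book"], "language": ["spanish"]},
    {"format": ["dvd"], "language": ["english"]},
]
for value, count in dimension_counts(results, "format"):
    print(f"{value} ({count})")
```

selecting a value would then narrow the result set to the records carrying it, and the counts would be recomputed for the smaller set.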
search interface

once the implementation team made some preliminary decisions regarding dimensions and search indexes, wireframes were created to assist in the iterative design process for the front-end application. while the positioning of the dimensions on the results page and the display of holdings information were well debated, the design of the catalog search page was an even hotter topic. integration of both endeca keyword searching and web2 authority searching required an interface that could help users differentiate between the two tools. a survey of the keyword-versus-authority searching distinction in a variety of library catalogs led to the development of four mock-ups. the implementation team chose a search tab that includes separate search boxes for keyword and authority searching, as well as search examples dynamically displayed based on the index selected. authority searching was relabeled “begins with” searching to let users know that this particular search box featured known-item searching (although it is also where lcsh searching is found) (see figure 3). an advanced search tab re-creates the pre-coordinated search options from the web2 search interface using endeca search functionality. one new feature allows users to include or exclude reference materials and government documents from their results. a true boolean search box is made available here, primarily for staff.

figure 1. ncsu endeca architecture
figure 2. dimensions
toward a twenty-first-century library catalog | antelman, lynema, and pace 133

browse

while users can submit a blank search and browse the entire collection by any of the dimensions, the browse tab specifically supports browsing by lc classification scheme (see figure 4). this tab also includes a “new titles” browse that can easily be refined with faceted navigation. at the time of this writing, there are plans to pull out other dimensions, such as format, language, or library, for browsing.
this will be a great stride forward since there has traditionally been no way to perform a marc codes-only search (in order to browse all chinese fiction in the main library, for example). assessment the endeca-powered catalog seems self-evidently a better tool to help users find relevant resources quickly and intuitively. but since so much of the implementation involved uncharted territory, plans for assessment began before the launch of the interface, and the actual assessment activities began shortly thereafter. the library identified five assessment measures prior to implementation. one of these, however, requires longer time-series data (changes in circulation patterns), and another, the application of new and potentially complex log-analysis techniques (path analysis). other measures relate to use of the refinements, “sideways searching,” and objective and subjective measurements of quality search results, some of which can be preliminarily reported on here. log analysis to learn more about how patrons are using the catalog, data from two months of search logs were analyzed. while authority searching using the library’s old web2 catalog is still available in the new interface, search logs show that authority searching has decreased 45 percent and keyword searches have increased 230 percent. it is noted, however, that a significant—and indefinable—component of this increase in keyword searching is due to the fact that the default catalog search was changed from title to keyword. users are taking advantage of the new navigational features. fifty-five percent of the endeca-based search requests are simple keyword searches, 30 percent represent searches where users are selecting post-search refinements from the dimensions on the results page, and the remaining 15 percent are true browses with no search term entered (this figure includes use of browse new titles). 
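the three-way breakdown of search requests reported above can be reproduced from log data with a simple classifier. the sketch below assumes each log entry has already been reduced to a query string and a list of selected refinements; that shape, the category labels, and the rule that an empty query always counts as a browse are assumptions for the example, not a description of ncsu’s log format.

```python
from collections import Counter


def classify_request(query, refinements):
    """Label one search request as 'browse', 'refined', or 'keyword'."""
    if not query.strip():
        return "browse"    # no search term entered
    if refinements:
        return "refined"   # keyword search narrowed with dimension values
    return "keyword"       # simple keyword search


def request_shares(requests):
    """Return the percentage of requests in each category.

    `requests` is a list of (query, refinements) pairs. An empty list
    yields an empty dict rather than a division error.
    """
    counts = Counter(classify_request(q, r) for q, r in requests)
    total = sum(counts.values())
    return {kind: 100.0 * n / total for kind, n in counts.items()}
```

run over two months of entries, a tally like this would yield shares comparable to the 55, 30, and 15 percent figures given in the text, provided the log records which refinements were active on each request.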
dimensions

the horizontal space just above the results is used to display the full range of results within the lc classification scheme (see figure 2). the first dimensions in the left column focus on the subject dimensions (topic and genre) that should be pertinent to the broadest range of searches. the following format and library dimensions recognize that patrons are often limited by time and space. when designing the user interface, it was not known which dimensions would be most valuable. as it turned out, dimension use does not exactly parallel dimension placement. lc classification is the most heavily used, followed closely by subject: topic, and then library, format, author, and subject: genre. since no basis for the placement of dimensions existed at the time of implementation, the endeca product team plans to use these data, after some time, to determine if changes in dimension order are warranted.

figure 3. new catalog search interface
figure 4. browse by lc classification and new titles

spell correction and “did you mean . . .”

approximately 6 percent of endeca keyword searches responded to the user’s query with some type of spelling correction or suggestion: 3.6 percent performed an automatic spell correction, and 2.8 percent offered a “did you mean…” suggestion. while ncsu has not analyzed how many of the spell corrections are accurate or how many of the “did you mean…” suggestions are being selected by users, future work in this area is planned.

recommender features

two features in endeca that have seen a surprising amount of use are the “most popular” sort option and the “more titles like this” feature available on the detailed-record page for a specific title. both relate broadly to the area of recommending related materials to patrons. the “most popular” sort option is currently powered by aggregated circulation data for all items associated with a title.
while this technique is ineffective for serials, reference materials, and other noncirculating items, it provides users a previously unavailable opportunity to define relevance. to date, the “most popular” sort is the second most frequently selected sort option (after publication date, at 41 percent), garnering 19 percent of all sorting activity. most-popular sorting was trailed by title, author, and call-number sorting. when viewing a detailed record, users are given the option to find “more titles like this” or “more by these authors.” the first option initiates a new subject keyword search combining the phrases from the $a subfield of all the subject (6xx) fields assigned to the record. the latter option initiates an author keyword search for any of the authors assigned to the current record. while there are not good statistics on use of this feature, these subject strings appear regularly in the list of most popular queries in search logs.

assessing top results

if relevance ranking were effective, one would expect to see good results on the first page. but what are “good” or “relevant” results? greisdorf finds that topicality is the first condition of relevance, and xu and chen’s more recent study finds topicality and novelty to be equally important components of relevance.26 while someone other than the searcher might be able to assess topical relevance, it is impossible to assess novelty, since it cannot be known what the searcher already knows. although researchers agree that relevance is subjective (that is, only a searcher can determine whether results are relevant), janes showed that trained external searchers do a reasonably good job of approximating the topical relevancy judgments of users.27 the analysis reported here focuses on topicality (using a liberal interpretation of what might be topically relevant). ncsu libraries sought to measure how many of the top search results are likely to be relevant to the user’s query in the old and new catalogs.
methodology one of the authors searched 100 topical queries (taken from 2005 search logs) in both web2 and endeca catalogs using “keyword anywhere.” topical queries whose meaning was unclear (e.g., “hand wrought”) were excluded. the topical relevance of the top hits (up to five) was coded for each target. because not all search-result sets contained five records, success for each was measured as a ratio (e.g., 2/5 = .4). those searches that resulted in 0 records in both targets were discarded, while those that resulted in 0 records in target a but “found relevant results” in target b were counted as 0 in target a. the ratios were then averaged for each target and compared to determine the difference in relevance-ranking performance. finally, a random subset of forty-four of the queries was selected, and the placement in the web2 results of the first result in endeca was noted. results on average, 40 percent of the top results in web2 were judged to be relevant, while 68 percent of the top results in endeca were judged to be relevant. that represents a 70 percent better performance for the endeca catalog. if one makes the assumption that the first endeca record is relevant (admittedly an assumption), based on these data, then one can look at the average position of that record in the old catalog. it was found that the first hit in endeca fell between #1 and #4126 in web2, with more than a third falling after the second screen of results, the maximum number of screens users are typically willing to examine.28 while this level of increased performance is impressive, it masks some dramatic differences in the respective result sets. looking at a broad search, “marsupial,” all of the top five hits in endeca have “marsupial” in the title and “marsupials” or “marsupialia” as a subject heading. the result set includes seventy-eight records, thanks to this intelligent stemming. 
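the scoring described in the methodology, and the 70 percent figure derived from it, can be checked with a small sketch. it assumes that each query’s judgments are stored as a list of booleans for the top hits and that the ratio’s denominator is the number of hits examined (up to five); the text’s example of 2/5 is consistent with that reading but does not settle it. the function names are illustrative.

```python
def relevance_ratio(judgments):
    """Share of the top hits (at most five) judged topically relevant.

    An empty result set scores 0.0, matching the rule that a target
    returning no records is counted as 0 when the other target succeeds.
    """
    top = judgments[:5]
    if not top:
        return 0.0
    return sum(top) / len(top)


def mean_ratio(per_query_judgments):
    """Average the per-query ratios for one catalog."""
    if not per_query_judgments:
        raise ValueError("at least one query is required")
    ratios = [relevance_ratio(j) for j in per_query_judgments]
    return sum(ratios) / len(ratios)


def relative_improvement(old, new):
    """Improvement of `new` over `old`, as a fraction of `old`."""
    return (new - old) / old


# averages reported in the results: 40 percent (web2) and 68 percent (endeca)
print(round(relative_improvement(0.40, 0.68), 2))
```

the reported averages give (0.68 - 0.40) / 0.40, or about 0.70, which is where the “70 percent better performance” comes from; the absolute gain is 28 percentage points.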
in the web2 result set of just twenty-nine records, not a single one of the top five has “marsupial” in the title or subject headings (and the top two results, tributes to malcolm c mckenna and poisonous plants and related toxins, are highly unlikely to be relevant). it is not until record #10 that you see the first item that contains “marsupial” in the title or subject. this single example demonstrates the benefit of both relevance ranking and stemming.

usability testing

as a result of a long history of catalog-usability studies, there are things that are known about library catalog users. one is that people both expect systems to be easy to use and find that they are not.29 usability testing was conducted to compare student success in using the new catalog interface with that of students using the old catalog interface when completing the same set of ten tasks. ten undergraduate students were recruited for the test. five were randomly selected to use the old web2 catalog, while the other five used the new catalog interface, which allows users to choose between a keyword search box powered by endeca and an authority search box (begins with . . . ) that is still powered by web2. the test contained four known-item tasks and six topical-searching tasks (appendix a). task success, duration, and difficulty were recorded. user satisfaction was not measured since catalog usability studies have found that satisfaction does not correlate with success.30

task duration

figure 5 shows the average task duration for the topical tasks (5–10) for web2 and endeca. except for task 9*, there is clearly a trend of significantly decreased average task duration for endeca catalog users. the endeca catalog shows a 48 percent improvement in the average time required to complete a task (01:34 in web2 compared to 00:49 in endeca).
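the 48 percent figure follows directly from the two average durations. a minimal check, with helper names chosen for the example:

```python
def to_seconds(mmss):
    """Convert a 'mm:ss' duration string to whole seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def percent_reduction(before, after):
    """Reduction from `before` to `after`, as a percentage of `before`."""
    return 100.0 * (before - after) / before


web2 = to_seconds("01:34")     # 94 seconds
endeca = to_seconds("00:49")   # 49 seconds
print(round(percent_reduction(web2, endeca)))
```

the averages are 94 and 49 seconds, so the reduction is 45/94, or roughly 48 percent of the web2 time. with only five participants per catalog, the figure is best read as indicative rather than precise.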
it is also noted that, although results from known-item searching tasks (1–4) are not reported in detail here, test subjects were just as successful in completing them using keyword searching in the endeca catalog as they were using authority searching in web2. task success and difficulty in addition to task duration, the test moderator assigned a difficulty rating to each task attempted by the participants: easy, medium, hard, or failed. figure 6 illustrates the overall task-attempt difficulty for topical tasks (5–10) in the web2 and endeca catalogs. the largest improvement is in the increased percentage of tasks that are completed easily in endeca and the nearly equivalent decrease in the percentage of tasks that were rated as hard to complete. while a significant number of tasks were still failed using the endeca catalog, many of these failures can be attributed to participants’ propensity to select keyword in subject rather than keyword anywhere searches. in fact, the only instances where keyword anywhere search in the new catalog failed to lead to successful task completion were for a single participant who was unwilling to examine retrieved results closely enough to determine if they were actually relevant to the task question, assuming too quickly that the task had been completed successfully. terminology participants using both the web2 and endeca catalog interfaces expressed confusion over some of the terminology employed. one of the most problematic terms was “subject.” a number of participants selected keyword in subject for topical searches because of the attraction of the word “subject.” none of the participants recognized that this term referred to controlled vocabulary assigned to records. 
coupled with a slight unfamiliarity with the term “keyword,” not typically used in web searching, this misunderstanding led participants to misuse (or overuse) keyword in subject searches when they could have found results more effectively using general keyword searching. this terminology problem appears to be an artifact of the usability testing, however. looking at the search logs, more than 50 percent of the keyword searches were keyword anywhere searches, while only 4 percent represented keyword in subject searches.

relevance

relevance ranking of search results is clearly the most important improvement in the new catalog. students in this usability test all looked immediately at the first few results on the first page to determine if their search had produced good results. if they didn’t like what they saw, they were likely to retry the search with fewer or more keywords in order to improve their first few results. one participant using the web2 catalog expressed the need for relevance ranking: “once i scroll through a page, i get pretty discouraged about the results.” the number of paging requests recorded in system logs confirms that users are focusing on the first result screen (with ten results per page); only 13 percent of searchers go to the second page.

figure 5. average task duration: web2 versus endeca
* while task 9 may appear to be an aberration, it actually reveals effective use of new functionality. this task required users to locate an audio recording of poetry in spanish. in web2, three of five participants completed the task successfully, all using the material type and language limits available in the advanced search tab. the two participants who didn’t locate this tool failed to complete the task. in endeca, two participants used the same advanced search limits to complete the task successfully and two additional participants were able to locate and use endeca dimensions to complete the task successfully. this suggests that the new interface is providing users with more options to help them arrive at the results they seek.

use of dimensions

when questioned after the test, all five participants who used the endeca catalog intuitively understood that dimensions could be used to narrow results. however, only three used the dimensions during the test. throughout the tests, the student participants frequently attempted to limit their search at the outset, rather than beginning with a broad search and then refining. it is unclear whether this behavior is a function of the very specific nature of the test questions or experience with the old catalog. log data show that users are indeed entering broad keyword searches with only one or two terms, which implies that dimensions may be more useful than this usability test indicates. it is also interesting to note that while none of the students understood that the lc classification dimension represented call-number ranges, they did understand that the values could be used to learn about a topic from different aspects (science, medicine, education).

future directions

weeks before the initial application went live in january 2006, the list of desired features had grown long. some of these were small “to do” items that the team did not have time to implement. others required deeper investigation, discussion, and testing before the feature could be put into production. still others may or may not be possible. a few of ncsu’s significant planned development directions are summarized below.
functional requirements for bibliographic records

there is much interest in the utility of applying the functional requirements for bibliographic records model to online catalogs.31 endeca includes a feature called “record rollup” that allows retailers to group items together (for example, different sizes and colors of a shirt). all that is required for this feature is a rollup key. ncsu, working with oclc, has elected to try the oclc work identifier to take advantage of this functionality and create work-level record displays in the endeca catalog hit list.

subject access

the collective investment libraries have made in subject and name authorities is leveraged with the faceted navigation features of endeca. but only authorized headings in records are seen by endeca; cross-references in the subject-authority record are not used. during implementation, the team looked at ways to improve the entry vocabulary to authorized-subject terms by loading the 1xx and 4xx fields from the subject-authority file into endeca synonym tables so that users could be guided to proper subject terms. the team still views this as a promising direction, but simply did not have time to fully explore it prior to implementation. additional discussions with oclc centered on their faceted application of subject terminology (fast) project. fast terms are more amenable than lcsh headings to being broken up into topical, geographic, and time-period facets without losing context and meaning. the normalization of geographic and time-period subdivisions promises to be particularly useful. fast has, to date, lacked a ready interface for the application of its data. while the fast structure is more conducive to non-cataloger metadata creation and post-coordinate refinement, it still does not meet the need for a user-entry vocabulary.32 were such a vocabulary for lcsh to become available, it could be mapped to synonym tables to lead users to authorized headings.

figure 6. topical task success and difficulty: web2 versus endeca

abandon authority searching?

the future of authority searching, however, is less clear. although the usability testing described in this paper showed that the endeca keyword search tools performed on a par with the old catalog for known-item searching, it is recognized that authority searching serves more functions. clearly, collocation of all books on a topic is absent when a user does a topical search using keyword rather than a controlled subject heading. but there are more subtle losses as well. as chan points out, one purpose of subject access is to help users focus searches, develop alternative strategies, and enable recall and precision.33 this is not possible with a simple keyword search, unless the searcher discovers that he can search on a subject heading from a record of interest. the display of subject facets in the endeca-powered catalog works to counter this weakness of simple keyword searching. another navigation aid in the traditional authority display that is lost in a simple keyword-search result is visible “seams.” as mann points out, “seams serve as perceptible boundaries that provide points of reference; without such boundaries, readers get ‘lost at sea’ and don’t know where they are in relation to anything else: they can’t perceive either the extent of what they have, or what they don’t have.”34 until users have confidence that a known item will appear at the top of a results list if the library holds that item, with a large keyword result set, one cannot confirm a “negative result” without browsing through the entire set.
the endeca-powered catalog interface does not help to address either the “seams” or the negative-result problem, which are two reasons why ncsu maintained authority searching.

an integration platform

despite the vast improvements found in the endeca catalog, the fact remains that it is still mainly books: as calhoun says, “only a small portion of the expanding universe of scholarly information.”35 there are two approaches to take with the endeca platform: one is to take advantage of having control over the data and the interface to facilitate incorporation of outside data sources to enhance bibliographic records. the second is to put other, non-catalog data sources under the endeca search-and-navigation umbrella. the middleware nature of the endeca platform makes either approach more promising than the “square peg and round hole” problem of trying to work with library management systems ill-equipped to handle a diversity of digital assets. whether as a feed of catalog data to a metasearch application or web-site search tool, or as a platform for faceted access to electronic theses, institutional repositories, or electronic books, endeca has clear potential as a future platform for library resource discovery.

conclusion

while it cannot be claimed that this endeca-powered catalog is a third-generation online catalog, it does implement a majority of the third-generation catalog features identified by hildreth. most notably, through navigation of subject and item-level facets, the endeca catalog supports two of his objectives, “related record search and browse” and “integration of keyword, controlled vocabulary, and classification-based approaches.” spell correction, intelligent stemming, and synonym tables support “automatic term conversion/matching aids.” the flexible relevance-ranking tools support “closest, best-match retrieval” as well as “ranked output.” much work remains, however.
three important features identified by hildreth cannot be said to be implemented in this catalog at this time: “natural language query expression” (that is, an entry vocabulary), “expanded coverage and scope,” and “relevance feedback methods.”36 requirements for these features are either being reviewed or are already under development by both endeca and ncsu libraries. ncsu views the endeca catalog implementation in the context of a broader, critical evaluation and overhaul of library discovery tools. like the library web site, the catalog still requires users to come to it. when they do, it still sets a high threshold for patience and the ability to interpret clues. still, at the end of the day it rewards the ncsu student searching “declaration of independence” with the book american scripture: making the declaration of independence instead of the recent congressional resolution recognizing the mexican holiday of cinco de mayo.

references

1. christine l. borgman, “why are online catalogs still hard to use?” journal of the american society for information science 47, no. 7 (1996). 2. karl v. fast and d. grant campbell, “i still like google: university student perceptions of searching opacs and the web,” in proceedings of the 67th asis&t annual meeting (providence, r.i.: american society for information science and technology, 2004). 3. ray r. larson, “between scylla and charybdis: subject searching in the online catalog,” advances in librarianship 15 (1991); andrew large and jamshid beheshti, “opacs: a research review,” library & information science research 19, no. 2 (1997). 4. nathalie nadia mitev, gillian m. venner, and stephen walker, designing an online public access catalogue: okapi, a catalogue on a local area network (london: british library, 1985). 5. borgman, “why are online catalogs still hard to use?” 495. 6. r.
hafter, “the performance of card catalogs: a review of research,” library research 1, no. 3 (1979). 7. gerard salton, “the use of extended boolean logic in information retrieval,” in proceedings of the 1984 acm sigmod international conference on management of data (new york: acm pr., 1984), 277. 8. ray r. larson, “classification clustering, probabalistic information retrieval, and the online catalog,” library quarterly 61, no. 2 (1991). 9. ibid. 10. charles r. hildreth, online public access catalogs: the user interface (dublin, ohio: oclc, 1982). 11. larson, “classification clustering.” 12. mitev, venner, and walker, designing an online public access catalogue; ray r. larson et al., “cheshire ii: designing a next-generation online catalog,” journal of the american society for information science 47, no. 7 (1996). 13. tamas e. doszkocs, “cite nlm: natural-language searching in an online catalog,” information technology and libraries 2, no. 4 (1983). 14. nicholos j. belkin and w. bruce croft, “retrieval techniques,” in annual review of information science and technology, ed. martha e. williams (new york: elsevier, 1987), 129. 15. gary marchionini, information seeking in electronic environments (new york: cambridge univ. pr., 1995), 100–18. 16. borgman, “why are online catalogs still hard to use?” 494. 17. lois mai chan, exploiting lcsh, lcc, and ddc to retrieve networked resources: issues and challenges (washington, d.c.: library of congress, 2001), www.loc.gov/catdir/bibcontrol/ chan_paper.html (accessed july 10, 2006). 18. lois mai chan, “library of congress classification as an online retrieval tool: potentials and limitations,” information technology and libraries 5, no. 3 (1986); mary micco and rich popp, “improving library subject access (ilsa): a theory of clustering based in classification,” library hi tech 12, no. 1 (1994). 19. marcia j. 
bates, “subject access in online catalogs: a design model,” journal of the american society for information science 37, no. 6 (1986); karen coyle, “catalogs, card—and other anachronisms,” the journal of academic librarianship 31, no. 1 (2005); larson, “classification clustering.” 20. karen markey, “thus spake the opac user,” information technology and libraries 2, no. 4 (1983): 383. 21. larson, “classification clustering.” 22. marcia j. bates, library of congress bicentennial conference on bibliographic control for the new millennium, task force recommendation 2.3 research and design review: improving user access to library catalog and portal information, final report, 2003; charles r. hildreth, intelligent interfaces and retrieval methods for subject searching in bibliographic retrieval systems (washington, d.c.: library of congress, 1989); bates, “subject access in online catalogs”; belkin and croft, “retrieval techniques.” 23. bates, “subject access in online catalogs”; bates, library of congress bicentennial conference on bibliographic control for the new millennium; eric novotny, “i don’t think, i click: a protocol analysis study of use of a library online catalog in the internet age,” college & research libraries 65, no. 6 (2004). 24. bates, “subject access in online catalogs,” 367. 25. larson, “classification clustering”; buckland et al., “mapping entry vocabulary to unfamiliar metadata vocabularies,” d-lib magazine 5, no. 1 (1999). 26. h. greisdorf, “relevance thresholds: a multi-stage predictive model of how users evaluate information,” information processing & management 39, no. 3 (2003): 403–23; yunjie (calvin) xu and zhiwei chen, “relevance judgment: what do information users consider beyond topicality?” journal of the american society for information science and technology 57, no. 7 (2006). 27. joseph w. 
janes, “other people’s judgments: a comparison of users’ and others’ judgments of document relevance, topicality, and utility,” journal of the american society for information science 45, no. 3 (1994). 28. bernard j. jansen and udo pooch, “a review of web searching studies and a framework for future research,” journal of the american society for information science and technology 52, no. 3 (2001); novotny, “i don’t think, i click.” 29. borgman, “why are online catalogs still hard to use?” 30. brian nielsen and betsy baker, “educating the online catalog user: a model evaluation study,” library trends 35, no. 4 (1987). 31. ifla cataloging section, “frbr bibliography,” www.ifla .org/vii/s13/wgfrbr/bibliography.htm (accessed may 1, 2006). 32. lois mai chan et al., “a faceted approach to subject data in the dublin core metadata record,” journal of internet cataloging 4, no. 1/2 (2001). 33. chan, exploiting lcsh, lcc, and ddc. 34. thomas mann, “is precoordination unnecessary in lcsh? are web sites more important to catalog than books?” a reference librarian’s thoughts on the future of bibliographic control (washington, d.c.: library of congress, 2001), www.loc.gov/ catdir/bibcontrol/mann_paper.pdf (accessed july 10, 2006). 35. karen calhoun, “the changing nature of the catalog and its integration with other discovery tools,” prepared for the library of congress, 2006, 24. unpublished, www.loc.gov/ catdir/calhoun-report-final.pdf (accessed july 7, 2006). 36. charles r. hildreth, online catalog design models: are we moving in the right direction? (washington, d.c.: council on library resources, 1995). toward a twenty-first-century library catalog | antelman, lynema, and pace 139 copyright © 2006 by charles w. bailey jr. this work is licensed under the creative commons attributionnoncommercial 2.5 license. 
to view a copy of this license, visit http://creativecommons.org/licenses/by-nc/2.5/ or send a letter to creative commons, 543 howard st., 5th floor, san francisco, ca, 94105, usa. bailey continued from 127 ฀ known-item questions 1. “your history professor has requested you to start your research project by looking up background information in a book titled civilizations of the ancient near east.” a. “please find this title in the library catalog.” b. “where would you go to find this book physically?” 2. “for your literature class, you need to read the book titled gulliver’s travels written by jonathan swift. find the call number for one copy of this book.” 3. “you’ve been hearing a lot about the physicist richard feynman, and you’d like to find out whether the library has any of the books that he has written.” a. “what is the title of one of his books?” b. “is there a copy of this book you could check out from d. h. hill library?” 4. “you have the citation for a journal article about photosynthesis, light, and plant growth. you can read the actual citation for the journal article on this sheet of paper.” alley, h., m. rieger, and j.m. affolter. “effects of developmental light level on photosynthesis and biomass production in echinacea laevigata, a federally listed endangered species.” natural areas journal 25.2 (2005): 117–22. a. “using the library catalog, can you determine if the library owns this journal?” b. “do library users have access to the volume that actually contains this article (either electronically or in print)?” ฀ topical questions 5. “please find the titles of two books that have been written about bill gates (not books written by bill gates).” 6. “your cat is acting like he doesn’t feel well, and you are worried about him. please find two books that provide information specifically on cat health or caring for cats.” 7. “you have family who are considering a solar house. does the library have any materials about building passive solar homes?” 8. 
“can you show me how would you find the most recently published book about nuclear energy policy in the united states?”
9. “imagine you teach introductory spanish and you want to broaden your students’ horizons by exposing them to poetry in spanish. find at least one audio recording of a poet reading his or her work aloud in spanish.”
10. “you would like to browse the recent journal literature in the field of landscape architecture. does the design library have any journals about landscape architecture?”

appendix a: ncsu libraries catalog usability test tasks

virtual reality: a survey of use at an academic library

megan frost, michael goates, sarah cheng, and jed johnston

information technology and libraries | march 2020
https://doi.org/10.6017/ital.v39i1.11369

megan frost (megan@byu.edu) is physiological sciences librarian, brigham young university. michael goates (michael_goates@byu.edu) is life sciences librarian, brigham young university. sarah cheng is an undergraduate student, brigham young university. jed johnston (jed_johnston@byu.edu) is innovation lab manager, brigham young university.

abstract

we conducted a survey to inform the expansion of a virtual reality (vr) service in our library. the survey assessed user experience, demographics, academic interests in vr, and methods of discovery. currently our institution offers one htc vive vr system that can be reserved and used by patrons within the library, but we would like to expand the service to meet the interests and needs of our patrons. we found use among all measured demographics and sufficient patron interest for us to justify expansion of our current services. the data resulting from this survey and the subsequent focus groups can be used to inform other academic libraries exploring or developing similar vr services.
introduction

virtual reality (vr) is commonly defined as an experience in which a user remains physically within their real world while entering a virtual world (comprising three-dimensional objects) using a headset with a computer or a mobile device.1 vr is part of a spectrum of related technologies ranging from mostly real experiences to completely virtual experiences, such as augmented reality, augmented virtuality, and mixed reality.2 extended reality (xr) is a term often used when describing these technologies as a whole.

many different xr devices and services are available in academic libraries. the most popular xr devices used in libraries are the htc vive, the oculus rift by facebook, and google cardboard.3 other common xr devices include gearvr by samsung and playstation virtual reality by sony.4 the htc vive and oculus rift are technologies that provide an immersive virtual-reality experience. google cardboard provides both non-immersive virtual reality and augmented reality experiences, while mixed reality is provided through various technologies such as microsoft’s hololens and mixed-reality headsets from hp, acer, and magic leap. in addition, many academic libraries are using augmented reality apps that can be downloaded on patrons’ personal mobile devices.5

academic libraries are starting to offer various xr services to increase engagement with patrons and teach information literacy.6 despite the increase in xr service offerings, there is little consistency in the devices used or in how these services are developed at academic libraries, and there is substantial variation in the types of services offered.
for example, some libraries make vr headsets available for in-house activities, such as storytelling, virtual travel, virtual gaming, and the development of new skills.7 other libraries, notably ryerson university library and archives in toronto, let students and faculty borrow their oculus rift headsets for two or three days at a time.8 some university libraries lend out headsets or 360-degree cameras or provide a virtual reality space for students to develop content.9 the university of utah library offers an open-door, drop-in vr workshop once a week.10 claude moore health sciences library at the university of virginia implemented a project that educated its students and staff on the uses of vr in the health field through a combination of large-group demonstrations, one-on-one consultations, and workshops.11

the xr field is developing quickly, and xr services have the potential to benefit students academically. some universities are already offering classes on vr platforms.12 this is particularly true in fields that are high risk or potentially discomforting. for example, students in medical fields benefit by practicing virtually before attempting surgery on a human body.13 in addition to potential surgical benefits, the university of new england has been utilizing xr technology to teach empathy to its medical and other health profession students by putting the learner in the place of their patients.14 other examples of xr usage in the health fields include a recent attempt to introduce vr in anatomic pathology education and the use of virtual operating rooms to train nurses and educate the public.
15 one recent study measured the effectiveness of using vr platforms in engineering education and found a drastic improvement in student performance.16

many educational institutions outside of the university setting have also started exploring how xr could be used to enhance students’ educational experience. this technology has already progressed from being considered a novelty to being an established tool to engage learners.17 one of the perceived benefits of xr use in public libraries by both library patrons and staff is the ability of xr technology to inspire curiosity and a desire to learn.18 in some school programs, students are able to advance their learning through xr apps that allow them not only to absorb information but also to experience what they are learning through hands-on activities and full immersion without danger (e.g., hazardous science experiments) or high cost (e.g., traveling to another country).19

xr has the potential to increase the overall engagement of students, which, according to carini, kuh, and klein’s 2006 study, is correlated to how well students learn.20 xr has the ability to capture the attention of students and eliminate distractions. this is particularly true for students with attention deficit disorder, anxiety disorders, or impulse-control disorder.21 the application of xr goes beyond traditional classroom settings. a case study assessing the benefits of vr in american football training found that players showed an overall improvement of 30 percent after experiencing game plays created by their coaches in a virtual environment.22 although these studies were not conducted in an academic library or university setting, their results are transferable. it is beneficial to academic libraries to provide technologies to their patrons that enhance and advance their learning. currently, xr apps available for purchase on the google app store are still limited.
most app development comes from private companies; however, some universities are giving their students the opportunity to develop xr content.23

objectives

at brigham young university, we want our vr services to foster the link between academic achievement and virtual reality. in order to do this effectively, our first objective is to determine which vr services will be of most benefit to our patrons. to inform the expansion of future vr services, we conducted a survey of patrons using current vr services in the library. this survey is also intended to help other libraries that are developing vr services and potentially developers interested in creating academic content for students. we were primarily interested in user experience, demographics, academic interests in vr, and methods of discovery.

methods

during one semester, january through april 2018, we asked individuals to complete a questionnaire following their use of the library’s htc vive system. this questionnaire was administered through an online qualtrics survey that was distributed via email to patrons after using the library’s vr equipment. it consisted of thirteen questions that gathered basic demographic information as well as information on patron interests and experiences with the library’s vr services. the complete survey used in this study can be found in appendix a.

currently the harold b. lee library at brigham young university offers one htc vive vr system that can be used on site in the science and engineering area of the library. it is primarily operated by student employees who work at the science and engineering reference desk. time slots are reserved through an online registration system on the library’s website.

in order to gather more in-depth, qualitative data on patron experience with the library’s vr services, we also conducted a focus group with vr users.
we recruited participants by adding a question at the end of the qualtrics survey asking whether the responder would be interested in participating in a focus group. all focus group participants received a complimentary lunch. during the focus group, we asked a series of five questions to gain a deeper understanding of users’ vr experience at the library. in particular, we asked participants to explain what went well during their vr experience in the library, what difficulties they experienced, how they envisioned using vr for both academic and extracurricular purposes, and what type of vr content (e.g., software or equipment) they would like the library to acquire. the focus group facilitator asked follow-up questions for clarification as needed. the session was audio recorded, and participant responses were transcribed and coded for themes.

results and discussion

demographics

the most frequent users of the vr equipment in the library were male students in the science, technology, engineering, or mathematics (stem) disciplines. the percentage of male students at brigham young university is roughly 50 percent, but over 70 percent of our survey respondents were male. that said, there was considerable use among all measured demographics, as shown in figure 1. over one third of responders were not students. university faculty made up 11 percent of responders during the survey period. the proportion of faculty who responded was higher than the university’s faculty-to-student ratio and likely the result of directly advertising the service to non-student university employees. because some users informed librarians that they had brought spouses and children to use the equipment, we estimate that the 7 percent of responders who were neither students nor university employees mostly consisted of family or friends accompanying students or employees. over one third of student responders were majoring in disciplines outside of science, technology, engineering, and mathematics.
this number is small when compared to the number of students in these majors across campus (approximately 63 percent of students on campus are not majoring in stem disciplines); however, it demonstrates that there is an interest in vr technology throughout the university. as the vr services are located in the science and engineering area of the library, it is not surprising that more students majoring in these disciplines used these services when compared to students majoring in other disciplines. in fact, 15 percent of responders learned about the services at the reference desk, where they could see other patrons using the vr equipment. the most common discovery method, however, was the various forms of advertisements targeted to both students and employees of brigham young university, as shown in figure 2.

figure 1. demographics.

figure 2. most effective discovery methods: advertisement and word-of-mouth.

only 7 percent of responders identified research or class assignments as their primary reason for using the services. the large majority of use, as shown in figure 3, was simply for entertainment or fun. this was not unexpected, especially as most of the users were trying the technology for the first time (see figure 4). however, because we purchased the equipment with the intent to support academic pursuits on campus, we hoped to see a higher percentage of academic use.

figure 3. most responders came because it sounded fun.

figure 4. most responders were first-time users.

faculty use was higher than expected (see figure 5). eleven percent of users during our survey period were faculty.
the majority of these responders indicated an interest in potentially using vr technology with their students (see figure 6). while this interest was positive, faculty member suggestions for classroom use remained hypothetical, without any concrete intentions for implementation. this suggests that although faculty interest exists, faculty may need to be informed of specific application ideas in order to be more likely to incorporate this technology into their courses.

figure 5. faculty were interested in trying the vr equipment.

figure 6. faculty were interested in using vr academically.

a clear majority (72 percent) indicated an intention of returning to the library to use the service again (see figure 7).

figure 7. most responders intend to return.

because our vr services were a small pilot program at the time of the survey, we did not offer a large number of paid apps to users. table 1 displays the most common apps used by survey responders. most users tried google earth during their session, and employees at the reference desk often recommended this app to new users. another common app for new users was the lab, which includes a few small games showcasing the current capabilities of vr. google tiltbrush is an app for creating 3d art. virtual jerusalem is an app that was created by faculty at brigham young university and allows users to walk around and explore the jerusalem temple mount during the time of christ. the fifth-most-used app we offered was 3d organon vr anatomy, which teaches human anatomy.

table 1. top five apps used.
1. google earth
2. the lab
3. tiltbrush by google
4. virtual jerusalem
5. 3d organon vr anatomy
focus group data

we conducted a total of three focus group sessions. each session included between five and eight participants, for a total of twenty-one focus group participants. because we were primarily interested in student responses, we limited focus group participants to students enrolled at brigham young university.

the participants were asked to describe what did or did not go well during their vr session. when describing what went well during their vr session, many participants responded with positive comments about the quality of service the library employees provided during their session. most participants expressed satisfaction with the number and quality of the apps provided by the library. during all three focus groups, participants mentioned that they liked how easy it was to sign up for the vr services. the most common problems reported by participants related to health or safety concerns, such as feeling dizzy, bumping into objects in the room because of the lack of space, and tripping over the headset wire. others reported problems related to the level of personal or social comfort with the vr services, such as feeling self-conscious using vr in a semi-open space not exclusively devoted to vr services or being told to be quieter.

when asked about ways the library could improve its vr services, the students suggested solutions to many of these problems. a frequent recommendation was that the library dedicate a space to vr. the reasons for this suggestion included minimizing the risk of accidentally bumping into objects, reducing the embarrassment of using the vr equipment in front of spectators, and allowing participants to become more fully immersed in the vr experience without worrying about being too loud.
other common suggestions included providing more than one headset for multiple patrons to use for gaming purposes or team projects, acquiring wireless headsets to eliminate wire tripping hazards, and providing more online training videos to reduce reliance on library workers for common troubleshooting problems. participants did not provide actionable suggestions on ways to decrease dizziness while operating vr equipment. when asked about how the students could see themselves using vr academically, many responded with some of the more well-known uses of vr technology, such as potential uses in science, medicine, engineering, and the military. however, some students had a very hard time determining how vr could be applied to humanities fields such as english. after some discussion, most students were able to see the relevance of vr in their field, but some said that they most likely would not pursue those functions of vr, using vr exclusively for extracurricular activities. in contrast to the lack of academic uses envisioned by focus group participants, participants had substantially more ideas about how they would use vr for extracurricular purposes, including playing games for stress relief, exercising, exploring the world, and watching movies. many expressed interest in using vr for extracurricular learning outside their majors, such as virtually being part of significant historic events, exploring ecosystems, and visiting museums or other significant landmarks. students expressed interest in exploring the many possibilities provided by vr technology but were not especially aware of or interested in how vr might apply to their specific field of study unless they were in an engineering, medical, or other science-related discipline. conclusions vr is a rapidly growing field, and academic libraries are already providing students access to this technology. 
in our study, we found considerable interest across campus in using vr in the library; however, the academic interest and use were not as high as we hoped. future marketing to faculty might benefit from specifically suggesting ideas for academic uses or collaboration. even though our current vr services are located at the science and engineering help desk, nearly 40 percent of users were not in stem disciplines. this is encouraging and suggests value in marketing future vr services to all library patrons. we also found sufficient patron interest to justify exploring related vr services, such as offering classes on creating content and acquiring less expensive headsets that can be borrowed outside of the library. although this survey was limited to one university, we believe the results can be used to inform other academic libraries as they develop similar vr services.

endnotes

1 susan lessick and michelle kraft, “facing reality: the growth of virtual reality and health sciences libraries,” journal of the medical library association: jmla 105, no. 4 (2017): 407.
2 paul milgram et al., “augmented reality: a class of displays on the reality-virtuality continuum,” in telemanipulator and telepresence technologies 2351 (international society for optics and photonics, 1995), 282-92.
3 hannah pope, “incorporating virtual and augmented reality in libraries,” library technology reports 54, no. 6 (2018): 8.
4 sarah howard, kevin serpanchy, and kim lewin, “virtual reality content for higher education curriculum,” proceedings of vala (melbourne, australia: libraries, technology and the future inc., 2018), 2.
5 zois koukopoulos and dimitrios koukopoulos, “usage scenarios and evaluation of augmented reality and social services for libraries,” in digital heritage.
progress in cultural heritage: documentation, preservation, and protection (springer international, 2018), 134-41; leanna fry balci, “using augmented reality to engage students in the library,” information today europe/ili365 (november 17, 2017), https://www.infotoday.eu/articles/editorial/featured-articles/using-augmented-reality-to-engage-students-in-the-library-121763.aspx.
6 bruce massis, “using virtual and augmented reality in the library,” new library world 116, nos. 11-12 (2015): 789, https://doi.org/10.1108/nlw-08-2015-0054.
7 adetoun a oyelude, “virtual and augmented reality in libraries and the education sector,” library hi tech news 34, no. 4 (2017): 3, https://doi.org/10.1108/lhtn-04-2017-0019.
8 weina wang, kelly kimberley, and fangmin wang, “meeting the needs of post-millennial: lending hot devices enables innovative library services,” computers in libraries (april 2017): 7.
9 “oxford libguides: virtual reality: borrowing vr equipment,” bodleian libraries, https://ox.libguides.com/vr/borrowing; “virtual reality services,” penn state university libraries, https://libraries.psu.edu/services/virtual-reality-services; “vr studio,” north carolina state, https://www.lib.ncsu.edu/spaces/vr-studio.
10 oyelude, “virtual and augmented reality,” 3.
11 lessick and kraft, “facing reality: the growth of virtual reality,” 409.
12 oyelude, “virtual and augmented reality,” 3.
13 medhat alaker, greg r. wynn, and tan arulampalam, “virtual reality training in laparoscopic surgery: a systematic review & meta-analysis,” international journal of surgery 29 (2016): 86, https://doi.org/10.1016/j.ijsu.2016.03.034.
14 elizabeth dyer, barbara j. swartzlander, and marilyn r. gugliucci, “using virtual reality in medical education to teach empathy,” journal of the medical library association: jmla 106, no. 4 (2018): 498, https://doi.org/10.5195/jmla.2018.518.
15 emilio madrigal, shyam prajapati, and juan hernandez-prera, “introducing a virtual reality experience in anatomic pathology education,” american journal of clinical pathology 146, no. 4 (2016): 462, https://doi.org/10.1093/ajcp/aqw133; nils fredrik kleven et al., “training nurses and educating the public using a virtual operating room with oculus rift,” ieee (2014): 1, https://doi.org/10.1109/vsmm.2014.7136687.
16 wadee alhalabi, “virtual reality systems enhance students’ achievements in engineering education,” behaviour & information technology 35, no. 11 (2016): 925, https://doi.org/10.1080/0144929x.2016.1212931.
17 patricia brown, “how to transform your classroom with augmented reality—edsurge news,” edsurge, november 2, 2015, https://www.edsurge.com/news/2015-11-02-how-to-transform-your-classroom-with-augmented-reality.
18 negin dahya et al., “virtual reality in public libraries,” university of washington information school, https://ischool.uw.edu/vrinlibraries.
19 del siegle, “seeing is believing: using virtual and augmented reality to enhance student learning,” gifted child today 42, no. 1 (2019): 46, https://doi.org/10.1177/1076217518804854.
20 guillaume loup et al., “immersion and persistence: improving learners’ engagement in authentic learning situations,” 11th european conference on technical enhanced learning (2016): 414, https://doi.org/10.1007/978-3-319-45153-4_35; robert carini, george kuh, and stephen klein, “student engagement and student learning: testing the linkages,” research in higher education 47, no. 1 (2006): 23-4, https://doi.org/10.1007/s11162-005-8150-9.
21 mariano alcaniz, elena olmos-raya, and luis abad, “use of virtual reality for neurodevelopmental disorders: a review of the state of the art and future agenda,” medicinabuenos aires 79, nos. 77–81 (2019): 419-20, https://doi.org/10.21565/ozelegitimdergisi.448322.
22 yazhou huang, lloyd churches, and brendan reilly, “a case study on virtual reality american football training,” proceedings of the 2015 virtual reality international conference 6 (2015): 3, https://doi.org/10.1145/2806173.2806178.
23 “media lab,” massachusetts institute of technology, https://libraries.psu.edu/services/virtual-reality-services; “the ischool technology resources at fsu: virtual reality,” florida state university libguides, https://guides.lib.fsu.edu/ischooltech/vr.
letter from the editor

kenneth j. varnum

information technology and libraries | september 2019
https://doi.org/10.6017/ital.v38i3.11631

editorial board changes

thanks to the dozens of lita members who applied to join the board this spring. the large number of interested volunteers made the selection process challenging. i’m pleased to welcome six new members to the ital editorial board for two-year terms (2019-2021):
• lori ayre (independent technology consultant)
• jon goddard (north shore public library)
• soo-yeon hwang (sam houston state university)
• holli kubly (syracuse university)
• brady lund (emporia state university)
• paul swanson (minitex)

in this issue

welcome to lita’s new president, emily morton-owens. in her inaugural president’s message, “sustaining lita,” morton-owens discusses the many ways lita strives to provide a sustainable organization for its members. we also have the next edition of our “public libraries leading the way” column.
this quarter’s essay is by thomas lamanna, “on educating patrons on privacy and maximizing library resources.” joining those essays are six excellent peer-reviewed articles:
• “library-authored web content and the need for content strategy,” by courtney mcdonald and heidi burkhardt
• “use of language-learning apps as a tool for foreign language acquisition by academic libraries employees,” by kathia ibacache
• “is creative commons a panacea for managing digital humanities intellectual property rights?,” by yi ding
• “am i on the library website?,” by suzanna conrad and christy stevens
• “assessing the effectiveness of open access finding tools,” by teresa auch schultz, elena azadbakht, jonathan bull, rosalind bucy, and jeremy floyd
• “creating and deploying usb port covers at hudson county community college,” by lotta sanchez and john delooper

call for pllw contributions

if you work at a public library, you’re invited to submit a proposal for a column in our “public libraries leading the way” series for 2020. our series has gotten off to a strong start with essays by thomas finley, jeffrey davis, and thomas lamanna. if you would like to add your voice, please submit a proposal through this google form.

kenneth j.
varnum, editor
varnum@umich.edu
september 2019

library mechanization at auburn community college

eloise f. hilbert: head librarian, auburn community college, auburn, new york

use of an ibm 1401 computer and a single keypunch operation for changing a college book collection from dewey decimal to library of congress classification; for acquisitions, accounting and circulation procedures; and for production of a list of periodical holdings. a mark-sense reproducer is used for the circulation system.

introduction

auburn community college, a two-year college and one of the fifty-two units of the state university of new york, was founded in 1953 as one of the first community colleges in the state. like other institutions of higher learning, it has experienced a rapid growth. there are about 1500 students enrolled in the day division and about 1500 in the evening division. the college offers courses in data processing in addition to the usual curriculum offerings. the library possesses approximately 40,000 volumes and adds about 3,000 volumes a year. the librarian attended a one-week ibm customer executive conference for librarians on the subject of automation in endicott, n.y.
in april 1967, and returned eager to consider ways of applying computer technology to some of the library's procedures. data processing equipment, located directly beneath the library, was available for library use. the limited library staff, both professional and clerical, would benefit by automation of technical processes, since automation would eliminate much typing and reduce tedious tasks. clerical staff would have time to take over clerical operations that were being performed by professional staff, freeing the latter for more professional work.

upon approval by the college administration of plans for automating, discussions took place between the librarians involved and the chairman of the data processing department. there was considerable interest and cooperation among the library staff and data center personnel. library literature describing computerization in academic libraries was reviewed, but there was a decided lack of information available concerning automation of libraries approximately the same size as auburn's. a proposed use of mark-sense cards in the circulation system also appeared to be unique. ibm publications on library applications served as guides in developing the systems (1,2,3).

decisions were made to automate a projected reclassification, the acquisitions and circulation systems and accounting procedures, and to produce lists of the serial holdings (4). the ibm processing installation used for the library comprised the following: 1401 computer, 12k storage; 1402 card reader punch; 1403 printer; 1311 disk storage drive; 514 mark-sense reproducer; 083 sorter; 548 interpreter; 026 keypunch.

reclassification

a decision had previously been made to change the classification of the library's collection of 30,000 books from the dewey decimal to the library of congress system.
since the tedious task of erasing or painting over and retyping catalog card numbers and entries, book pockets and book cards would be greatly reduced by automation, it was decided that reclassification of the book collection should be the first automated procedure. the aim was to complete conversion as quickly as possible in order to avoid confusion in the library. the data center made its staff available for the summer months so that much of the conversion was completed at that time.

conversion of the shelf list presently in dewey decimal classification was to be the key step. the shelf list was sent to the data center, where one ibm shelf list card was punched for each volume. accession number, call number, author's name and title, copyright date, and publisher were carefully abbreviated to fit into six fixed fields of the eighty columns on the card. because these fields would recur in later procedures, this bibliographic record needed to be keypunched only once. keypunch operators were instructed to use the library of congress number on the catalog card instead of punching the dewey decimal number. all cards that did not have complete library of congress cataloging were returned from the data center to the library for necessary additions and corrections. the decision was made to accept the library of congress classification call number exactly as it appeared on the card, in order to keep original cataloging to a minimum. this abbreviated shelf list punch card was sufficient for the purpose presently planned for its use.

14 journal of library automation vol. 3/1 march, 1970

it was considered unnecessary to provide complete bibliographical detail, which would be available from the main card catalog. punched shelf list cards would produce accession lists, "new book" lists, and bibliographies of books in abbreviated form.
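the fixed-field shelf list card described above can be illustrated with a short python sketch. it is only an illustration of the idea of six fixed fields sharing eighty columns: the article does not give the column widths, so the widths in `LAYOUT`, the function names `punch` and `read_card`, and the sample record are all assumptions, and the library's actual programs were written in autocoder for the 1401.

```python
# hypothetical layout: (field name, width in columns). the article says only
# that six fixed fields shared the eighty columns; these widths are assumed.
LAYOUT = [
    ("accession", 6),
    ("call_number", 16),
    ("author", 16),
    ("title", 28),
    ("date", 4),
    ("publisher", 10),
]
CARD_WIDTH = 80
assert sum(width for _, width in LAYOUT) == CARD_WIDTH


def punch(record):
    """Abbreviate (truncate) and pad each field to its fixed width."""
    return "".join(
        str(record.get(name, ""))[:width].ljust(width) for name, width in LAYOUT
    )


def read_card(card):
    """Split an 80-column card image back into its named fields."""
    if len(card) != CARD_WIDTH:
        raise ValueError("a card image must be exactly 80 columns")
    fields = {}
    column = 0
    for name, width in LAYOUT:
        fields[name] = card[column:column + width].rstrip()
        column += width
    return fields
```

because every later procedure reads the same columns, a record punched once with `punch` can be reused for labels, circulation cards, and listings simply by calling `read_card` on it again.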
there was no interest in producing catalog cards, which would have required much more work and more time than was available.

once the shelf list cards had been punched, reclassification proceeded as follows: the punched shelf list card was used as the source card to create the circulation book card, duplicating accession number, call number, author and title. the punched shelf list card also produced the gummed labels: a label for the book pocket, the spine label, and labels for the catalog cards. the punched cards had been produced in dewey decimal order from the catalog trays. labels, shelf list and circulation cards were placed with the dewey decimal shelf list card to make a set. it was important to keep them in dewey decimal order, since the books would come off the shelves in that order. each set of labels and cards was placed in its respective book, labels were attached to the book pockets already in the books, and call number labels were applied to the spines of the books. the remaining labels were filed with the dewey shelf list card and later attached to the other cards in the card catalog. circulation cards were inserted in the book pockets and the books were returned to the stacks, which had been relabeled with library of congress numbers. student assistants were used to perform this work under the supervision of the librarians.

originally, it was estimated that the job would require three years, but by automated procedures reclassification took only about six months. attaching the labels to the cards in the card catalog did take another year. labels were placed on the cards at the card catalog without removing cards from the trays. instruction was given and examples were displayed to show how to locate the new library of congress number on the catalog card; thus the use of the card catalog was not hindered by the lack of labels on the cards.
this method seemed to be the most efficient, since new cards would have been expensive, headings would have had to be retyped and all cards refiled again.

circulation (figure 1)

a machine readable shelf list made possible an automated circulation system, since producing a machine readable circulation card (figure 2) would be a simple computer operation. the old circulation system presented the usual problems, but manual preparation of the overdues, being costly and time consuming, would benefit most from automation. total circulation for the library for 1968/69 was 18,000 books. the maximum number of books charged out per day was one hundred and fifty and the minimum, twenty-five. maximum size of the loan record file was about two thousand charges.

fig. 1. circulation flow chart.

fig. 2. circulation card.

a mark-sense reproducer prepares the cards for the computer. this reproducer had been acquired for other college computer functions and the library was able to make use of it (2). under this plan the books are charged out by having the borrower write in his identification number, which serves as the borrower's number, and his name in the appropriate box on the ibm circulation card (3). the student assistant at the desk mark-senses the book card with the identification number; this is the one manual operation, but it has presented no problem. the marked circulation cards are sent three times a week to the data center, where the mark-senses are read and punched and the due date is gang-punched in.
the 1401 computer generates a second circulation card, duplicating the accession number, call number, author and title. old and new circulation cards are machine filed together by accession number and returned to the circulation file, which is arranged by date and accession number. it was found that the accession number is easier to read than the library of congress number and is the truly unique number. a printed circulation listing, arranged by call number to facilitate use, is kept at the charge desk; it shows accession number, author and title, borrower's name, identification number and due date. it is also possible to prepare a daily circulation report by student identification number and name if required. the entire circulation is sent to the data center weekly to produce a cumulative print-out of all books in circulation. these print-outs provide daily and weekly totals of all outstanding circulated books.

no data processing equipment is required for reserve circulation. charging out of books on reserve continues to be done by having the borrower write his identification and name on a blue reserve card to be kept at the desk.

when a book is returned, the pair of circulation cards is selected from the circulation file. the used charge card, which contains the borrower's identification number and due date, is marked "cancelled" with a rubber stamp. the new circulation card is inserted in the pocket of the book and the book is reshelved. cancelled circulation cards are kept and sorted later to provide statistical analyses by date and class number for each semester. this system was developed because it was felt a small library could not justify expensive charging machinery.

acquisitions and shelf list

once the reclassification operation was organized it was possible to set up automation procedures for processing current acquisitions.
an ibm card was designed as a book request card (figure 3) to be filled in by staff or faculty member. information on it includes author's name, title, publisher, price, purchase order number, academic department, and requestor's name. at the data center the foregoing information is keypunched into the card, which then becomes a purchase order card. the purchase order number identifies the vendor and is gang-punched into the cards. a computer print-out produced from the purchase order cards is mailed directly to the dealer as a book order. order cards are kept in an "on order" file by dealer or purchase order number and then by author until the books are received.

fig. 3. book request card.

when the book and its library of congress cards have been received, the corresponding order cards are pulled and the following additional information is added to the purchase order card: actual cost (taken from the invoice), accession number (stamped on), and the library of congress call number (taken from the library of congress printed card). figure 4 is a flowchart of the acquisitions procedure. the books are then processed in the same manner as was used for reclassifying (1).

fig. 4. acquisitions flow chart.
order cards are sent to the data center to reproduce the shelf list cards, automatically transferring the pertinent information already punched in the order card and keypunching the additional information into the shelf list cards (figure 5). currently, provision is being made for inclusion of the library of congress card order number in the shelf list card to enable easy subsequent selection of the corresponding marc records.

fig. 5. new books listing procedure.

the shelf list cards are used to produce the new books list (figure 5). the shelf list is kept in the ibm card form, and a book catalog could easily be made if so desired. to compile a bibliography it is only necessary to take the punched cards from the shelf list in the wanted classification. the library's subject catalog and the library of congress subject headings are checked to determine the class numbers to be used. as depicted in the flowchart in figure 5, these cards are put through the computer to produce the print-out, and then returned to the punched shelf list file.

this system was designed to produce a bibliographical record of the books in the library and to automate the technical processing of the books in as simple a method as possible so as not to defeat the purpose of automating.

accounting (figure 6)

the accounting system was designed to use the book request card after it has had department and cost punched into it. after the books are processed, accumulated request cards are sent periodically to the data center, where a computer print-out is produced by department, listing the books purchased and the cost of each, with a summary showing all expenditures. copies are sent to department chairmen to keep them informed of their expenditures. these order cards are kept for a semester, then returned to the individual requesting faculty members after a cumulative accounting record has been made.
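the departmental print-out described above amounts to grouping the request cards by department, listing each book with its cost, and summing. a minimal python sketch of that idea follows; the card fields (`department`, `title`, `cost`), the function name `department_report`, and the report layout are assumptions made for illustration and are not taken from the article.

```python
from collections import defaultdict
from decimal import Decimal


def department_report(cards):
    """Group request cards by department and total the costs.

    Returns the print-out lines and the grand total of all expenditures.
    """
    by_department = defaultdict(list)
    for card in cards:
        by_department[card["department"]].append(card)

    lines = []
    grand_total = Decimal("0.00")
    for department in sorted(by_department):
        lines.append(department)
        subtotal = Decimal("0.00")
        for card in by_department[department]:
            cost = Decimal(card["cost"])  # costs are kept as exact decimals
            lines.append(f"  {card['title']:<30}{cost:>8.2f}")
            subtotal += cost
        lines.append(f"  {'department total':<30}{subtotal:>8.2f}")
        grand_total += subtotal
    lines.append(f"{'all expenditures':<32}{grand_total:>8.2f}")
    return lines, grand_total
```

costs are held as `Decimal` values rather than floats so that cents add up exactly, which matters for a budget summary.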
by this means it is possible to keep track of each department's book budget and the library's total book budget, with the computer doing all the work.

fig. 6. accounting procedure.

overdues (figure 7)

overdue notices are machine prepared from overdue circulation cards which are selected periodically from the charge-out file. the cards are passed through the computer, which generates second and third overdue cards to be used for discharging purposes. gummed address labels that include the student identification number are produced using the college log of names and addresses. the appropriate label is applied to the reverse side of the circulation card using the i.d. as a guide. each notice card is stamped "overdue book, please return as soon as possible," then sent through the postage meter and placed directly in the mail. if several overdues are sent to the same person, the cards are mailed in an envelope, using the gummed label. the second and third notice cards are filed at the circulation desk until needed or until the book is returned.

fig. 7. overdue procedure.

there is another file for borrowers who are seriously delinquent in returning their books. cards that have accumulated in this overdue file are processed as follows to generate further overdue notices: an overdue notice is sent to the borrower, the dean's letter to the borrower or to his parents, and the list of names to the dean and the student personnel office. at the end of each semester a list is prepared indicating all books held by individual faculty members for more than three months and the latter are notified. the time-consuming operation of preparing overdues has been considerably reduced (4).

serials

serial holdings have been converted to machine readable punched cards.
the state university of new york, under the direction of dr. irwin pizer of the upstate medical center at syracuse, has recently published a union list containing the titles of all periodicals received in all units of the state university (5). it includes the serial holdings of auburn community college (approximately 400 titles), and punched cards for these holdings have been adapted by the library for its own use. information on the card comprises title, inclusive dates, years on microfilm, department for which the periodical was ordered and the indexes in which the periodical is listed. each new serial title added to the holdings is keypunched with this information. the punched cards are used to print out an alphabetically arranged title listing and a departmental listing. adding or withdrawing titles is a simple matter, and up-to-date lists of periodical holdings are easily produced by the computer. copies of the lists are sent to each faculty member and several copies are available at the desk and in the periodical room.

costs

since library use of the data center was considered to be similar to other college uses (e.g., that of the business office), the cost of library automation was absorbed by the data center and not charged to the library. an estimate of the cost, including rental time on the computer (about three hours per week), supplies, and data center staff time, is about $1500.00 a year for ongoing programs.

conclusion

the automated systems herein described have now been completely operational for over a year. converting data for a computer operation spotlighted inaccurate recording of information and afforded a good opportunity for correcting previous errors. periodically, progress and results have been reviewed and changes made, as will continue to be the case. the automated circulation system is providing the library with rapid, accurate, and efficient circulation control not possible for a manual system.
ease and speed of performing routine library operations by the use of automation more than compensate for the cost of data processing. automated technical procedures provide for faster and more efficient processing of books, production of the library's monthly new books list (which previously took hours to type) and subject bibliographies. other important results of the mechanization project are the serial listings and departmental accounts, all of which make possible better library service.

acknowledgments

the programming was done in autocoder by, or under the supervision of, mr. richard klinger, chairman of the data processing department at auburn community college; to him is due most of the credit for the mechanization of the library. the library is grateful to mr. klinger for his encouragement and enthusiastic support and his willingness to assume the technical responsibilities of programming and systems design.

references

1. international business machines: mechanized library procedures (white plains, n.y.: ibm, n.d.).
2. international business machines: library processing for the albuquerque public school system (white plains, n.y.: ibm, n.d.).
3. dejarnett, l. r.: "library circulation." in international business machines corporation: ibm library mechanization symposium (endicott, n.y.: 1964), pp. 78-93.
4. eyman, eleanor g.: "periodicals automation at miami-dade junior college," library resources and technical services, 10 (summer, 1966), 341-61.
5. the union list of serials in the libraries of the state university of new york (syracuse, n.y.: state university of new york upstate medical center, 1966).

154 information technology and libraries | september 2009

tutorial

kathleen carlson

delivering information to students 24/7 with camtasia

this article examines the selection process for and use of camtasia studio software, a screen video capture program created by techsmith.
the camtasia studio software allows the author to create streaming videos that give students 24-hour access to instruction on any topic, including how to order books through interlibrary loan.

how does one engage students in the library research process? in my brief time at the downtown phoenix campus library of arizona state university (asu) i have found a software program that allows librarians to bring the classroom to the student. screen capture programs allow you to create presentations and publish them for students to view on their own time. instead of telling students how to do something, we need to show them.1 recent studies show there are numerous benefits to using streaming video in higher education. students who receive streaming video instruction as well as traditional instruction show dramatic improvement in class.2 this article takes a look at how i selected one software program and created a streaming video using the application.

i examined three software applications that help create video tutorials and presentations: cam studio, macromedia’s captivate, and techsmith’s camtasia studio. i first experimented with cam studio, which is open-source software. there are limitations to what you can do with software that is free. the screen size is too small and the file size it can create is limited. macromedia’s captivate is good if you want to create a series of screenshots with accompanying audio. i did not choose this streaming video program because i was unsure of the software’s capability, and i had no one to provide technical support. the third choice, techsmith’s camtasia studio, was the software i selected. there were several reasons why i preferred this software. i had more familiarity with it, and the software is very easy to load and is user friendly.
it also has the ability to record a video of everything that is happening on your computer screen.3 another reason i selected camtasia studio was because of the availability of an asu software technician who had experience editing the streaming video. most users view camtasia’s video through adobe flash, but the program also can produce windows media, quicktime, dvd-ready avi, ipod, iphone, realmedia, mp3, web, cd, blog, and animated gif formats.4 camtasia performs screen captures in real time. you are able to simultaneously use slideshow software, navigate to a website, and narrate step-by-step instructions.

the full version of camtasia studio runs around $300. in addition to the software program, you also must have a combination headset and microphone. a stick microphone will work, but the combination headset will help eliminate any noise that can be picked up by a stick microphone. i purchased a logitech extreme pc gaming headset for about $20. when you purchase the camtasia license online at http://www.techsmith.com/, the customer service department will e-mail you the access code along with a link from which you can download the software. the cd-rom loaded with the camtasia software arrives about ten days later.

my first camtasia studio project was a tutorial on how to use the university’s interlibrary loan system. here are the basic steps i took to create a streaming video:

1. preproduction. this involves the creation of a script.
2. production. the actual capturing of the video and audio content. have all websites and programs open and minimized at the bottom of the screen in order to easily select them during the video capturing.
3. postproduction. this is the most time-consuming step and involves editing the video and compressing the file for delivery to users.
4. publishing. posting the video to a web server and assessing the material’s success.
to see the full 3 minute 53 second streaming video “how to order an article that asu does not own” go to http://www.asu.edu/lib/tutorials/illiad/index.html.

kathleen carlson (kathleen.carlson@asu.edu) is health sciences librarian, information commons library, arizona state university, downtown phoenix campus.

implementing camtasia studio

once camtasia studio is installed on your computer, double click on the camtasia studio icon. it will bring up a welcome window where you can select from the following (see figure 1):

- start a new project by recording the screen
- start recording a powerpoint presentation
- start a new project by importing media files
- open an existing project

i have selected “start a new project by recording the screen.” on the left hand menu there is a task list, and you can select one of the following (see figure 2):

- record the screen
- record the powerpoint

i have selected “start a new project by recording the screen.” this will bring up a window, “new recording wizard screen recording setup.” it asks you what you would like to record (see figure 3).

- region of the screen
- specific window
- entire screen

i have selected “entire screen.” when you click on the “next” button, it brings up a recording options window (see figure 4). select from the following:

- record audio
- record camera

i have selected “record audio while recording the screen.” next you see a window that lets you choose audio settings from the following (see figure 5):

- microphone
- speaker audio
- microphone and speaker audio
- manual input selection

i have selected “microphone” (see figure 6). the next window is titled “tune volume input levels.” use the input level lever to set the audio input level (see figure 7).

figure 1. welcome screen and what do you want to do?
figure 2. record the screen
figure 3. screen recording setup
figure 4. recording options
figure 5. choose audio settings

the “begin recording” window appears, which includes instructions on how to start and stop recording. you have the choice of clicking the “record” button on camtasia recorder or pressing the f9 key to start recording. to stop, click the “stop” button on camtasia recorder or press the f10 key (see figure 8). finally click on either “record the screen” or “record powerpoint.” to view your streaming video, click on the saved icon where it says clip bin or go to camtasia toolbar and click on view. then click on clip bin, then click on thumbnails. that’s all there is to it.

summary

i found camtasia studio to be very user friendly, although i cannot emphasize enough how important it is for librarians to collaborate with their it staff. this software enables you to bring the classroom to the student when they need it. you may have instructed a class on library research, but many of these students may have already forgotten where to begin. streaming video allows students to access presentations 24/7.

here is a checklist of things to think about when selecting software:

- what do you want to accomplish with the software?
- what kind of access are you trying to give?
- do you want audio, video, or both?
- is it easy for the student to access and understand?
- have you researched the software to make sure it meets your needs?
- how much money do you want to spend?
- what additional equipment is necessary?

finally, and most importantly, work with your it staff on all phases of your project. by developing a collaborative relationship with them you will have fewer bumps in the road. use your imagination: the sky is the limit.

references

1. diane murley, “tools for creating video tutorials,” law library journal 99, no. 4 (2007).
2.
ron reed, “streaming technology improves achievement: study shows the use of standards-based video content, powered by new internet technology application, increases student achievement,” t.h.e. journal 30, no. 7 (2003).
3. christopher cox, “from cameras to camtasia: streaming media without the stress,” internet reference services quarterly 9, no. 3/4 (2004).
4. john d. clark and qinghua kou, “captivate/camtasia,” journal of the medical library association 96, no. 1 (2008), http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2212324 (accessed june 24, 2009).

figure 6. audio volume levels
figure 7. begin recording
figure 8. camtasia recorder

automatic retrieval of biographical reference books

cherie b. weil: institute for computer research, committee on information science, university of chicago, chicago, illinois

a description of one of the first projected attempts to automate a reference service, that of advising which biographical reference book to use. two hundred and thirty-four biographical books were categorized as to type of subjects included and contents of the uniform entries they contain. a computer program which selects up to five books most likely to contain answers to biographical questions is described and its test results presented. an evaluation of the system and a discussion of ways to extend the scheme to other forms of reference work are given.

ideally the reference librarian is the "middleman between the reader and the right book" (1), and this is what the program here described is intended to be. in the past there has been very little interest shown in automating this service, probably because it is neither urgent nor practical in current reference departments. many developments in automating other areas of libraries have indirectly benefitted reference librarians, and the literature primarily emphasizes this aspect.
for instance, where circulation systems have been automated, the location of a particular volume can be quickly ascertained and librarians need not waste time searching. automation of the ordering phase provides them with information on the processing stage of a new volume. if the contents of the catalog have been put in machine readable form, special bibliographies can be rapidly produced in response to a particular request or as a regular service of selective dissemination. the development of kwic (key word in context) indexes, which are compiled and printed by computer, has enabled publishers to provide indexes to their books much faster. computers have also been programmed to make concordances and citation indexes (2). the combination of paper-tape typewriters, computer and a photocomposer has introduced automation into compiling index medicus (3).

240 journal of library automation vol. 1/4 december, 1968

changes in reference services themselves, however, may make automation of question-answering practical. one trend is toward larger reference collections to be shared by several libraries; some areas have already set up regional reference services. there are also cooperative reference plans whereby several strong libraries agree to specialize in certain fields and cooperate in answering questions referred by the others (4). these trends will mean two things to reference librarians: greater concentration of resources, allowing more specialized books and mechanization; and screening of questions at the local level, letting reference centers concentrate on more complex questions that utilize their specialized books. thus it seems likely that special reference centers may look increasingly toward mechanizing their services, and retrieval schemes of the type presented here will be important to consider.
basic assumptions

the categorizing system was based on two nearly universal generalizations about biographical reference books:

1) they are consistently confined to biographies of persons who have something in common: for example, being alive or dead; or having the same nationality, sex, occupation, religion, race, memberships; or possessing some combination of those attributes. these common characteristics in the people covered by a given book are herein called "exclusive categories."
2) the books generally maintain uniform entries for each subject; that is, they give the same data for each biography. these facts are referred to herein as "specifics" or "specific categories."

certain assumptions were made about reference work:

1) all biographical reference books fit into the scheme and can be categorized.
2) the more limited a book's scope, the more likely it is to contain the person a user wants to find. in other words, if a user is interested in a dutch economist, he is more likely to find information in a book limited to dutch economists than in a general biographical dictionary. the user, however, does not want to miss any source that might be useful. therefore a general biographical dictionary should be given to him as a last resort, after books on dutch economists, dutchmen of all occupations, and economists of all nationalities.
3) certain requirements, the specifics, have no substitutes. for example, a book lists addresses or it does not, and if a user wants an address, books without them are useless.

there is merit in suggesting to a user which book to use as opposed to giving him the direct answer to his question. probably the best argument for this assumption is that the volume of names that would have to be compiled and stored for a direct inquiry system is staggering, only a small number would ever be looked up, and it is impossible to predict which ones would be searched for.
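assumptions 2 and 3 above can be read as a filter followed by a ranking. the python sketch below is one possible reading and is not the comit program described in this article: the dictionary layout, the names `eligible`, `specificity` and `suggest`, and the rule that a book restricted on a category the question does not match is skipped are all assumptions made for illustration. the limit of five books comes from the abstract.

```python
# the five exclusive categories; a missing entry means "no restriction".
EXCLUSIVE = ("sex", "living", "nat", "occup", "min")


def eligible(book, question):
    """A book qualifies only if every restriction it carries matches the
    question and it supplies every specific the user asked for."""
    for category in EXCLUSIVE:
        limit = book["exclusive"].get(category)
        if limit is not None and question["exclusive"].get(category) != limit:
            return False
    # specifics have no substitutes, so all of them must be present
    return question["specifics"] <= book["specifics"]


def specificity(book):
    """Count the restrictions a book carries; more means narrower scope."""
    return sum(1 for c in EXCLUSIVE if book["exclusive"].get(c) is not None)


def suggest(books, question, limit=5):
    """Return up to `limit` book names, narrowest scope first."""
    candidates = [book for book in books if eligible(book, question)]
    # the sort is stable, so equally narrow books keep their file order
    candidates.sort(key=specificity, reverse=True)
    return [book["name"] for book in candidates[:limit]]
```

for a question about a dutch economist's address, this sketch would offer a book on dutch economists first and a general dictionary last, skipping any book that lacks addresses. it does not decide between equally narrow books, such as one on dutchmen and one on economists, beyond keeping their order in the file.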
there are advantages to mechanizing this particular task of a reference librarian: good reference librarians should be freed to perform work less easily mechanized; there are not enough reference librarians who have perfect recall of their collections even to knowing which exclusive categories all the books fit into; and no librarian could have complete recall as to the specifics contained in each biographical reference book in the collection.

the computer program

the program was written in the comit language, a non-numerical programming language developed for research in mechanical translation, information retrieval and artificial intelligence. it is a high-level problem-oriented language for symbol manipulation, especially designed for dealing with strings of characters. the program could probably be converted to other list-processing languages (6) for operation at other installations. the program was run at the university of chicago computation center on an ibm 7094 having the comit system on a disk. questions were submitted and run in large batches.

the data

all biographical reference books in english, with alphabetical ordering of subjects, which are in the reference room of the university of chicago's harper library were included in the data and no other books were included. since one assumption was that all biographical reference books could be categorized by the scheme, it seemed more useful to prove the system could handle any biographical reference tool than to compile a balanced list of biographical books. there was no difficulty in categorizing the books.

all books are categorized in the following way. first an arbitrary abbreviation for the book is chosen to be its entry in the file; it is referred to as a "constituent." each book is then described by determining the values of nine subscripts each constituent carries, the subscripts being sex, living, nat (nationality), occup (occupation), min (minorities), date, index, spec1 and spec2 (specifics).
values of the first five subscripts, the exclusive categories, are first determined. that is, is the book limited to one sex? are all the subjects living or dead? do they all have a certain occupation? does the book include only certain nationalities? or is there another restriction; e.g., to alumni of a college, members of the nobility or a religious group? the exclusive categories for a book are determined and coded from a table of abbreviations. sex, for example, allows three values: restricted to males (m), restricted to females (f), or no restriction (z). also a value x must occur with m or f, indicating there is a restriction. therefore sex can have the following combinations: sex z, sex f x, or sex m x; the values m x and f x are both the opposite of z.

next the book's date is determined by asking "at what date did the values on living (yes or no) apply?" or, if the subjects are not restricted to living or dead (living z), "when was the book up to date?"

next any indexes to the biographies are noted. all the biographical books list subjects in alphabetical order by surname. lists of subjects in any other order are considered indexes, even if the subjects are actually listed in some other order in the main body and the list that is alphabetic by surname is an index.

finally, specific categories (spec1 and spec2) are coded for such facts as birthdate, birthplace, college attended, degrees held, hobbies, illustrations, social clubs, and marital status.

when all categorizing is finished, a data item is punched in this form:

dictphilbio / index field x, living n x, occup z, sex z, nat philip asian x, spec1 dc ds fl bp l cl cm dg e i z, date 50s x, spec2 p pl r ms pd z, min z +.
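the punched format above is regular enough to be read mechanically. what follows is only a minimal sketch in python, not the author's comit program: the function name parse_item is hypothetical, the subscripts are spelled spec1 and spec2 with digits, and the sex subscript is read as z (no restriction).

```python
def parse_item(card: str):
    """Split 'name / sub v1 v2, sub v1, ... +' into (name, {subscript: [values]}).

    Hypothetical helper for illustration; the original program was written in COMIT.
    """
    # Drop the trailing period and the '+' that terminates a data item.
    body = card.strip().rstrip(".").rstrip("+").strip()
    name, _, rest = body.partition("/")
    subscripts = {}
    for field in rest.split(","):
        parts = field.split()
        if parts:
            # First word is the subscript name, the rest are its values.
            subscripts[parts[0].lower()] = parts[1:]
    return name.strip(), subscripts


card = ("dictphilbio / index field x, living n x, occup z, sex z, "
        "nat philip asian x, spec1 dc ds fl bp l cl cm dg e i z, "
        "date 50s x, spec2 p pl r ms pd z, min z +")
name, subs = parse_item(card)
print(name, subs["nat"], subs["living"])
```

each of the nine subscripts becomes a key with a list of values, so a restricted category carries x alongside its specific value and an unrestricted one carries only z.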
this data item represents the dictionary of philippine biography, a book limited to dead filipinos and giving for each entry: dates, career, descendants, field, birthplace, long articles, class in college, degrees, education, picture, parents, publications, references, marital status and physical description. the book has a special index to find subjects by their field of work.

one specific value, that for a long article, requires special mention. though most biographical reference books provide the same facts about all the subjects in list form, a few provide different facts about different subjects in a narrative form. such books carry the value spec1 l, and the other specifics these books are listed as providing are not always given for every subject. for example, a book with a list format may provide the birthplace for every subject when it can be ascertained, but in a book using the narrative form, where often different authors write the articles, birthplace is not necessarily given. books in narrative form are used less for quick reference; therefore the program provides a note, when a long article is requested, that the card catalog may provide more long articles on the subject.

ease of file maintenance is one advantage of this system. as data is analyzed in the first place, if a new value for a category is required, such as an occupation which is not in the list, the new value is simply added under occup for that particular book and in the list of abbreviations for future use. it is a little more complicated to make an existing value more specific. for example, to differentiate botanist, chem, physics and astron and still maintain scientist as a general category embracing them all, another short program is required to retrieve the data to be reclassified.

coding the question

a biographical question can be quickly coded.
the nine required subscripts are the same as those for the data books, but only one value for each subscript is necessary. for example, "what are the publications of a living dutch economist? a current book is desired." is coded as

q / sex z (or m), living y, nat dutch, occup econ, min z, index z, date 60s, spec1 z, spec2 pl +.

operation of the program

briefly, the program reads in data and then the first question. it weeds out data items that can never be suitable, discarding all but those items that have the same values as the question has on the subscripts index, spec1 and spec2. it then weeds out data items that do not have either the same values as the question, or the value z, on the subscripts occup, nat, min, sex and living. after each weeding the program checks to determine that there are data items left; if all the books have been weeded out, there are no answers. there is also a provision to allow the user to designate certain titles to be ignored on a particular question, for example in case he has already checked them.

all data items left after weeding are potential answers and could simply be printed out. however, subsequent searches over the remaining items serve the purpose of rearranging them into an order in which they are more likely to produce answers. it was decided that five answers are enough to judge the types of titles chosen yet few enough to avoid very long searches. a shorter list of answers would obviously be cheaper and a longer list more likely to produce a book containing the desired subject.

ordering proceeds as follows: first, values of subscripts sex, living, min, occup, nat and date on the question as originally stated are matched to those of books in the data. the computer is at this stage searching for books that are limited in just those categories in which the question is limited. for example, the question

q / sex z, living y, min z, nat dutch, occup econ, index z, date 60s, spec1 z, spec2 pl +

will match only those books published in the 1960's and restricted to living dutch economists which give publications for all the subjects (or the majority), and the books cannot be restricted to a sex or to any "minority" group. the books found may or may not have additional values on the subscripts; that is, a book may also contain french economists. such books found on the first search are most likely to contain the subject the questioner is looking for.

if there are fewer than five books found which are a perfect match with the question, the program begins to alter the question. to make the least significant possible change in the question, the program changes the value of the subscript judged to be the limiting factor on the fewest books in the data, namely sex. if sex has a z as its value (because the questioner did not know the sex or did not prefer a book limited to one sex) it is changed to x so that a book limited to one sex will not be overlooked. if sex does not have a z value (which means it has either m x or f x), it is changed to z. this means the questioner preferred books limited to one sex but presumably his second choice is books not limited to any sex. clearly if the question has sex f x it can never be changed to sex m x or sex x, since sex x will find books in the data classified sex m x. anything other than z changes to z, and z only changes to x.

after this change is made, another search is conducted and the answers counted. until there are five books or the data is exhausted, the original question is altered and the cycle continued. alterations proceed by changing the values of one subscript at a time in the following order: sex, living, min, nat and occup. then they are changed two at a time, three at a time, four at a time, and finally all five are changed, so there are thirty-one possible changes.
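the cycle of alterations described above can be sketched as follows. this is a hedged illustration in python rather than the author's comit code: the names flip, alterations, matches and search are hypothetical, the four sample books are invented, the order of changes within the "two at a time" and later stages is an assumption (only the single-subscript order is stated), and weeding on index, spec1, spec2 and the date check are assumed to have been handled separately.

```python
from itertools import combinations

# Order in which single subscripts are altered, as stated in the text.
ORDER = ["sex", "living", "min", "nat", "occup"]


def flip(value):
    # z (no restriction) changes to x (any restriction); anything else changes to z.
    return "x" if value == "z" else "z"


def alterations(question):
    """Yield the original question, then the 31 altered questions."""
    yield dict(question)
    for k in range(1, len(ORDER) + 1):
        # Assumption: within each stage, subsets are taken in lexicographic order.
        for subset in combinations(ORDER, k):
            altered = dict(question)
            for s in subset:
                altered[s] = flip(question[s])
            yield altered


def matches(book, question):
    # A book matches when it carries the question's value on every subscript;
    # it may carry additional values as well.
    return all(question[s] in book[s] for s in ORDER)


def search(books, question, wanted=5):
    found = []
    for q in alterations(question):
        for name, book in books.items():
            if name not in found and matches(book, q):
                found.append(name)
                if len(found) == wanted:
                    return found
    return found


# Invented sample data: restricted categories carry x, unrestricted ones carry z.
books = {
    "dutchecon": {"sex": {"z"}, "living": {"y", "x"}, "min": {"z"},
                  "nat": {"dutch", "x"}, "occup": {"econ", "x"}},
    "dutchall": {"sex": {"z"}, "living": {"y", "x"}, "min": {"z"},
                 "nat": {"dutch", "x"}, "occup": {"z"}},
    "econall": {"sex": {"z"}, "living": {"y", "x"}, "min": {"z"},
                "nat": {"z"}, "occup": {"econ", "x"}},
    "general": {"sex": {"z"}, "living": {"z"}, "min": {"z"},
                "nat": {"z"}, "occup": {"z"}},
}
question = {"sex": "z", "living": "y", "min": "z", "nat": "dutch", "occup": "econ"}
result = search(books, question)
print(result)
```

under these assumptions the generator produces thirty-two questions in all (the original plus thirty-one alterations), and the most narrowly matching book is listed first while the general dictionary comes last.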
if at the end of the thirty-second search there are still not five answers and there are more data items, the date restriction on the question is checked. if date has a value other than z, it is changed to z, which matches all the data items, and the computer prints a note if this is done; the program will then select any book regardless of date. control returns to search and begins the cycle again, continuing until five answers are found or the data is exhausted.

after searching is finished, the writing routine commences. one at a time the computer takes each answer, writes out its code for possible further reference, and then writes out the complete author, title, copyright date and library of congress call number, all of which the computer finds in a list within the program.

results

to obtain some measure of the program's accuracy, fourteen textbook questions, probably more challenging than the average patron would ask, were submitted to the computer and to a professional librarian who was especially familiar with biographical reference books. (see figure 1 for sample questions and results.) the librarian spent a total of an hour and a half, and found answers to eleven out of fourteen questions. on the three she could not answer she felt she had exhausted the resources. in one of the eleven she answered ("how many americans won the nobel prize in medicine between 1930 and 1950?") she found the answer in a source not specifically biographical (world almanac) and therefore not in the computer's data.

no problems occurred in forming the questions for submission to the computer. the program found some reasonable sources in all cases. it found books containing the answer in ten out of fourteen cases, the four answers not found being those three the librarian missed and the one requiring an almanac. in all but one case there were more possibilities than the five books given per answer.
some questions were rerun ignoring the first five answers, and five more titles were retrieved; even then there were more possibilities.

in some cases the program did better than the librarian because she wasted time looking in sources that did not give the specifics sought. for instance, when the question asked for the pronunciation of the surname of paul and lucian hillemaker, french composers, she looked in dictionaries that do not give pronunciation. the computer found the only four possible sources immediately.

in other cases the program came up with rather far-fetched answers a human would have skipped. a question asking for biographies of franz rakoczy, an hungarian hero, retrieved in its second five sources three jewish encyclopedias and a book on composers! these were not wrong and, in cases where occupation or minority group affiliations were unknown, these might be good sources.

as an answer to the nobel-prize-winner question the computer retrieved sources on american doctors, nobel winners and scientists, which are the best choices from the data and would have the answers buried in them. however, what is really required is an index to award winners, and there were none in the data.

question: in one source find a list of at least twenty references to biographical information about dmitri mendeleef (or mendelev), russian chemist (1834-1907).
as submitted to computer:
  q / sex m, living n, occup chem, nat russian, min z, spec1 z, spec2 r, index z, date z +
librarian's results:
  b  phillips, dictionary of biographical reference: 0 references
  a  encyclopedia britannica: 6 references
  a  encyclopedia americana: 1 reference
  a  biography index (1949-64 volumes): 14 references
  time: 10 minutes
computer's results:
  a  index to scientists: 27 references
  a  biography index
  c  drake, dictionary of american biography (sounds wrong but it is international.)
  b  phillips, dictionary of biographical reference
  a  encyclopedia britannica

question: what academic degrees have been earned by professor reuben l. hill, director of family study at the university of minnesota?
as submitted to computer:
  (1) q / sex m, living y, occup educ, nat amer, min z, spec1 dg, spec2 z, index z, date z +
  (2) ignore + ameconassn + ignore + amerscience + ignore + ampolisci + ignore + damerschol + ignore + leadeduc + q / sex m, living y, occup educ, nat amer, min z, spec1 dg, spec2 z, index z, date z
librarian's results:
  b  leaders in education
  a  who's who in america
  answer: bs, phm, phd
  time: 3 minutes
computer's results:
  (1) d  handbook of the american economic association
      d  american men of science
      d  biographical directory of the american political science association
      d  directory of american scholars
      b  leaders in education
  (2) b  who's who in american education
      c  outstanding young men of america
      a  who's who in america
      b  who's who in various areas
      b  national cyclopedia of american biography

question: where might i find information about a new england ancestor named jacob billings who was born around 1753?
as submitted to the computer:
  q / sex m, living n, occup z, nat amer, min ff, index z, date z, spec1 z, spec2 z +
librarian's results:
  d  handbook of genealogy (about genealogists, not families)
  a  compendium of american genealogy
  time: 8 minutes
computer's results:
  a  compendium of american genealogy
  c  dictionary of american biography
  c  who was who in america
  c  lamb's biographical dictionary of the u. s.
  c  concise dictionary of american biography

a = it has the answer or at least part of it
b = good choice but it does not have answer
c = reasonable choice but there are better ones
d = poor choice

fig. 1. sample reference questions.
the test revealed the necessity for allowing questions to have dummy values; that is, ones not used in the data. for instance there are no books limited to botanists, so occup botanist is not allowed in a question, though occup scientist is, and chem and physics are included as more specific values under scientist. asking for occup scientist when searching for a botanist avoids getting books devoted to nonscientific occupations but also gets books devoted to chemists and physicists. since one would want these books if he did not know the scientist was a botanist, that should not be changed. if he asks for occup botanist he wants books devoted to botanists first, then scientists in general. a short-term solution is to have dummy values to stand for all these other values. for example occup other-scientist could include all scientific occupations except those specifically listed, and it would retrieve books limited to all scientists but not to specific scientific occupations mentioned in the data. a long-term solution is to use a computer language allowing tree-structured data. presently this problem does no more than cause extraneous retrievals which the person using the list can easily skip.

discussion

some advantages of the scheme can be suggested. from the library's point of view its virtues are that it is simple and inexpensive. original implementation would not require a major block of time to be spent in human indexing or abstracting. operating costs would be low because it does not require such a large store of information in memory that several tapes must be searched, and because updating the file is simple. when a new book is added, an experienced person could categorize it in five minutes, punch a new data card and, if required, add to the list of values in the table of abbreviations. the system could provide useful information to other departments.
it could keep tallies for the acquisitions department of how often a book is given as an answer, indicating whether new editions of it or similar books would be good buys.

from the user's point of view the system avoids a major pitfall of some retrieval schemes which retrieve on the basis of ambiguous terms or association chains; that is, missing relevant items. if the user resubmits the same question ignoring already retrieved books each time, he will eventually have a comprehensive list of possible sources in the data that have the index and specifics he requires. a user also wants his information as brief as possible, listed in order of importance and with no extraneous answers (7); this requirement could be met as the program stands by having a human simply cross out any unnecessary titles. users like to know the reliability of the information (7); this detail could be provided along with the titles. users also want speed and convenience. as it stands, this system could be made available to users of the university of chicago library tomorrow with no more equipment than is presently in the computation center. time delay in the present implementation could be remedied by using an on-line system. users often prefer to be given facts themselves and not just citations (7). a program that gives biographical facts directly has no connection with this scheme or classification system, but the output of this program could be used as a tool by a librarian to find the answer for a patron.

bibliographies

the most obvious area to which the retrieval scheme could be extended is that of bibliographies. like biographies, they are limited in their scopes to certain exclusive categories, and they contain the same specific facts for each entry. logical exclusive categories could be: nationality, form (with such values as drama, poetry, fiction, maps, etc.), subject (probably the most frequently used criterion on which to select books for a bibliography), and date.
since there is no living with which to connect date, date here should probably have not just the most recent relevant date but as many values as necessary. for instance date 40s 50s 60s would apply to an index that began publication in the 1940's and is current. then a request for any of those dates would find it. possible specifics include number of pages, the cost, or a facsimile of the title page. a subscript arrangement would be needed, differing from index in that bibliographies, unlike biographies, cannot be assumed to have the same order (alphabetic by subject's name) plus indexes in other orders. arrangement would list as values all the ways the contents of the bibliography could be approached: by subject, author, title, chronology or a combination of these.

dictionaries

dictionaries also lend themselves well to this type of scheme; one exclusive category, subject, might even be adequate for dictionaries. dictionaries' special subjects could be broken down into field (such as chemistry or business) and type (such as slang or geography), if necessary. language would be a specific category, since there are no substitutes for the language required. other possible specifics are pronunciation, definition, etymology and illustration.

atlases

atlases are also suited to the scheme. exclusive categories that seem appropriate are area covered, special subject atlases, and the size of the scale. scale should probably act as date does in the biographical program; that is, if a particular scale is requested, that would be searched for first and, if no answer is found, a note would be given and another search made for any scale. specifics for atlases could include items like topography, rainfall, winds, cities, highways and major products.

factual books (those that give the highest mountain, the first four-minute mile, the january 10th price of u.s. steel, etc.) do not lend themselves to the scheme.
because these books are not uniform as to entries and subject coverage, the list of possible specifics and exclusive categories would be extremely long and the number of searches consequently prohibitive. also, since such books are far fewer in number than biographical or bibliographical works, the proper one is easier to find by browsing.

conclusion

a scheme for categorizing biographical reference books by their exclusive and specific categories makes it possible to automatically retrieve titles of those which would best answer reference questions. when tested it was found acceptable, with minor refinements, and it is easily adaptable to other reference book forms. such a system seems a logical direction in which to go when automation of actual reference functions is undertaken.

acknowledgment

the project under discussion was undertaken in partial fulfillment of requirements for the m. a. degree at the university of chicago's graduate library school. the computer program employed is detailed in the author's thesis (8). the work was partially completed under the auspices of aec contract no. at(11-1)614.

references

1. university of illinois library school: the library as a community information center. papers presented at an institute conducted by the university of illinois library school september 29-october 2, 1957 (champaign, illinois: university of illinois library school, 1959), p. 2.
2. shera, jesse: "automation and the reference librarian," rq, iii, 6 (july 1964), 3-4.
3. austin, charles j.: medlars 1963-1967 (bethesda: national institutes of health, 1968).
4. haas, warren j.: "statewide and regional reference service," library trends, xii, 3 (january 1964), 407-10.
5. yngve, victor: comit programmers' reference manual (cambridge, mass.: m. i. t. press, 1962).
6. hsu, r. w.: characteristics of four list-processing languages (u. s. department of commerce, national bureau of standards, sept. 1963).
7. goodwin, harry b.: "some thoughts on improved technical information service," readings in information retrieval (new york: scarecrow press, 1964), p. 43.
8. weil, cherie b.: classification and automatic retrieval of biographical reference books (chicago: university of chicago graduate library school, 1967).

52 journal of library automation vol. 4/1 march, 1971

book reviews

computerized library catalogs: their growth, cost, and utility, by j. l. dolby, v. j. forsyth, [and] h. l. resnikoff. cambridge, mass.: the m. i. t. press, 1969. 164 pp. $10.00.

on the verso of the title page of this book extolling the benefits of computer stored information we read: "no part of this book may be reproduced in any form or by any means, . . . , or by any information storage and retrieval system, without permission in writing from the publisher." this is ironic evidence of the inner contradiction between the pioneering aspirations of new technological development and the interests of an existing industry which feels threatened by unfounded fear of obsolescence and pursues claims to undesirable universal control of information. it is a vivid indication of an urgent need for statutory regulation of the right to information as opposed to the right of profit-motivated control of information.

behind this disturbing title page are found seven chapters, three of which constitute a specially important contribution to the very scarce literature on quantitative aspects of bibliographic administration.
the other four chapters deal at some length with various cost-related aspects of bibliographic record conversion in machine readable form and with computer use for the production of library catalogs from such records: a chapter analyzing the user costs, costs of programming, hardware costs, and record conversion costs; another chapter on the effect of type face design and page format characteristics on the cost of printed catalogs; a chapter on automated error detection in bibliographic record processing; and a chapter on the use of machine readable catalog data in the production of bibliographies.

the three chapters on statistical analysis of machine readable bibliographic data are chef-d'oeuvres of library literature. they demonstrate the wealth of quantitative information inherent in bibliographic record files, which with application of appropriate statistical methodology can yield most important information for library management. the introductory chapter illustrates a methodology of analysis of book publication trends and, comparing these with the gross national product and other economic indicators, points out forcibly the extent of vital quantitative information available to the administrator if he cares to analyze. this is a brilliant essay on the topic of the growth rate of library collections! the case study of the fondren library shelf list is a further elaboration of this theme, especially in terms of title vs. volume ratios and class distribution, leading to the third analytical essay on the similarities between the economic growth of nations and archival acquisition rates.

this essay trio should not be missed by anyone concerned with the objectives and rational management of libraries. it should be obligatory reading for administrators who want food for their creative vision.
these essays not only are informative, illuminating and stimulating, they also attest the virtuosity of their authors in the area of imaginative statistical analysis. we put this book on our ready reference shelf with the conviction that its authors should be persuaded to give us a most needed textbook on the methodology of library statistical analysis.

ritvars bregzis

library use of computers, an introduction, edited by gloria l. smith and robert s. meyer. new york: special libraries association, 1969. 114 pp. $5.00. (sla monograph no. 3).

this little paperback book is the result of a lecture series held during the spring of 1965 and sponsored by sla's san francisco bay region chapter and the university extension, university of california, berkeley. according to the introduction, it is intended to be "a librarian's primer for computers: briefly, how they work, and basically what kinds of output can be got from them for use in libraries."

the book includes most aspects of library automation in a very generalized manner. the chapters include programming, systems analysis, hardware, applications, reference services, conversion, and current trends and future applications. the most informative chapters are the ones on application and current trends. the other chapters matter-of-factly present their information, but because they fail to raise questions do not challenge the beginner to seek additional answers. in addition, most of the papers do not include a bibliography or reading list and therefore do not give the reader guidance to further his interests.

the more substantive papers in the book confront the reader with provocative statements, such as, "it should be the cost to the patron which the library should worry about. what is the cost to the patron of not knowing something, or the cost of having to find out?" or ". . . library science . . . today is really only a technology, much like medicine was before biology was developed, namely a handbook-cookbook world.
we must continue to search for underlying principles so that librarianship may become a science and grow out of its technological phase."

the papers without doubt met the needs of the lecture series for which they were originally intended, and the authors are experts in the field. most of them are well known for their work in library automation and mechanized information retrieval. the question nevertheless arises: must every spoken word subsequently appear in print? very little contribution is made to the literature by this book as most of it has appeared elsewhere, many times, before, even down to many of the illustrations. as a group of elementary instructional lectures designed to stimulate interest, most of these papers make the grade, but as a contribution to the literature, a few of the papers may have better served as journal articles.

donald p. hammer

the career of the academic librarian: a study of the social origins, educational attainments, vocational experiences, and personality characteristics of a group of american academic librarians, by perry d. morrison. chicago: american library association, 1969. viii, 165 pp. $4.50.

this acrl monograph no. 29 is a revised and condensed version of a dissertation for the degree of doctor of library science at the university of california at berkeley, published ten years after the data were collected. in 1958 the author sent questionnaires to two groups of academic librarians: head librarians of american college and university libraries earning $6,000 or more, which he calls the "primary group," and a "control group" from the same institutions selected from the 1955 edition of who's who in library service.

the findings are quite interesting, often expected, but sometimes surprising.
it would be expected that the study would support the theory that the true leader has interests broader than those whom he leads, that participation in professional and scholarly organizations is directly correlated with position and salary, and that willingness to move around to different positions is an important ingredient in the formula of "success". but this reviewer, and perhaps others, are surprised to learn that librarians, psychologists and business leaders tend to come from families of men who are better educated than the average, and that it is more advantageous for a woman than a man to hold a master's degree in a subject field and/or to possess the old-type master's in library science. the chapter on "implications of findings" has some interesting comments from respondents, among which is the notion that rewarding specialized competence, as opposed to general administrative ability, is essential if a maximum contribution is to be secured from both men and women. not many academic library administrators would quarrel with this point of view, but might quarrel with the heavy burden which the respondents lay on the library schools to solve the problems of academic librarianship, rather than sharing them with the libraries. criticism of the study is directed toward the rather pessimistic, probably unrealistic, attitude taken by the author on the question of faculty status, and the omission of any substantial inclusion about information science and information specialists, although one needs to recall that these were not prominent issues in 1958, though they were in 1969. the wealth of information contained in this little volume can be useful in recruiting to the profession and to individual libraries, helpful to the young librarian just starting out in the practice of his profession, and of value to administrators of academic libraries. 
the appendices include a copy of the questionnaire, a statement of the statistical treatment, suggestions for further research and an extensive bibliography.

lewis c. branscomb

the computer in the public service: an annotated bibliography 1966-1969. compiled by public administration service. chicago: public information service, 1970. 74 pp. $8.00.

it seemed useful to approach this review from the point of view of a designer of automated library systems, to see how well this bibliography covered the use of computers in that sector of the public service. thus approached, the computer in the public service is disappointing in the extreme.

nearly a quarter of the volume is used to explain the classification system, wherein one finds libraries lumped together with museums and the promotion of science and research. this over-generalization is reflected throughout the work and results in very thin coverage, at least as concerns the field of library automation. for example, in the four-year period covered, one finds a single article by henriette avram. one finds neither marc nor recon in the subject index. the indexes, by the way, refer one to the numbers of the sections of the classification code, rather than to page numbers. this takes some getting used to.

both journal articles and monographic works are included, but one is hard put to find reference to most of the major systems problems now being tackled by library systems designers, and is struck by the fact that most imprints included seem to be 1966-67.

but to return to the real problem with this work: covering four years of the use of computers in the public service in forty-seven pages of citations naturally leaves wide gaps in the literature. the person wishing to learn all about the use of computers in the public service, especially in library systems, will do well to look beyond this meagre list of citations. it can perhaps be best characterized as too little and too late.

lawrence g.
livingston

computer programming for business and social science, by paul l. emerick and joseph wilkinson. homewood, ill.: richard d. irwin, 1970. xviii, 429 pp. $13.25.

this book is devoted principally to business programming, with a few of the examples being taken from educational administration or social science. the computer language used throughout most of the chapters is fortran: not the current fortran iv, which is covered briefly in an appendix, but the older fortran ii. one chapter is devoted to cobol and another to an "overview of other languages" including basic, algol, and pl/i. it is a satisfactory introduction to programming fundamentals, generally clear and well presented. furthermore, the numerous exercises in business programming would seem to be a useful introduction to that field. however, neither the language nor the type of example used gives the book any special relevance for persons approaching library automation. foster m. palmer

56 journal of library automation vol. 4/1 march, 1971

consortium of universities of the washington metropolitan area: union list of serials, edited by bruce dack. 2d edition. washington, d.c.: distributed by the catholic university of america press, 1970. $20.00.

this edition represents 23,787 different serial titles located in american, catholic, george washington, georgetown and howard universities. the scope has been enlarged to include monographic series. among the subjects emphasized are africana, astronomy, canon law, chemistry, classics, latin americana, law, linguistics, medicine, physics, semitics, theology and the american negro. although the citations lack the complete bibliographic data that was included in the first edition, the cross references are more than adequate to get from variant forms of the entry to the latest form. in most instances only the title is noted along with the holdings of the various libraries. the print is not as good as that of the first edition.
even though the quality does not quite come up to that of the first edition, the list is still worthwhile for people doing research in the washington metropolitan area. sue brown

the case for faculty status for academic librarians, edited by lewis c. branscomb. chicago: american library association, 1970. 122p. $5.00.

this volume is available at a most opportune time because of the current interest in faculty status for librarians in academic institutions. fourteen articles and statements examine the various aspects of the subject and generally support faculty status for librarians. although only two of the articles have not appeared in college and research libraries during the past decade, this compilation effectively brings together the relevant statements on the subject. faculty status for librarians is a burning issue on many campuses where recognition is being sought, threatened, or questioned. unfortunately, many librarians take faculty status for granted or perhaps do not completely understand or appreciate it. this book will be useful for many groups and situations because of its comprehensiveness. some of the subjects discussed are privileges and obligations of faculty status, definition of professional duties, criteria for appointment and promotion, opportunities for study and research, and the granting of tenure. on individual campuses faculty status for librarians must be secured and retained in terms locally acceptable and in a frequently unfriendly, if not hostile, atmosphere. the article, "institutional dynamics of faculty status for librarians," by robert h. muller very effectively describes the advantages of faculty status for the institution and explains the forces that tend to work against faculty status for librarians. james t. dodson

monocle; projet de mise en ordinateur d'une notice catalographique de livre, [by] marc chauveinc. grenoble: bibliotheque universitaire, 1970.
(publications de la bibliotheque universitaire de grenoble, iii) 156p., [32 leaves]

this attractively printed volume is an adaptation, not a translation, of lc marc ii and bnb marc by the librarian of the university of grenoble. as m. chauveinc points out, the format established in a country is based on the cataloging rules used there, since the latter determine the cataloging elements, their functions and relationships. element by element, code by code, everything in marc had to be redefined for monocle in the context of the french rules. because of a commitment to international standardization, care was taken not to alter the basic structure of the content designators used in marc. when a marc variable field is not needed in monocle, e.g., 650 (l.c. topical subject headings), it is left unused and a new tag is created for the equivalent french field. french topical subject headings are thus 681. marc is a communications format; monocle is a processing format for an individual library which wishes to place its bibliographic records in memory, to produce printed catalogs, and to keep statistics that will be useful in administering the library. the structure of monocle divides each record into two parts: an index file and a library or principal file. this latter is a continuous string of variable length fields which contain the data of a catalog record. the fields and subfields are differentiated in this file only by delimiters and subfield codes. the tags and indicators appear in the directory. the index file in its turn is divided into three parts: a leader, information codes, and a directory which is similar to, though not identical with, the marc directory. the nineteen-byte leader and the sixty-nine-byte information codes area together form the legend.
the leader is based on the marc leader and 001 (control number) field while the information codes are an expansion and replacement of marc's 008 field, with provision for serials as well as monographs in this one format. some of the information indicated in the legend can be found in full in the variable fields. however, in line with monocle's avowed objective of automating to improve the running of a library, this information is repeated in the legend, since information placed there, being coded and in fixed positions, can be accessed and sorted more easily, rapidly, and economically. one of monocle's most striking characteristics is the great concern exhibited throughout for sorting and filing arrangement; one example is the effort made to give sorting values to subfield codes. in this respect monocle follows the example of bnb marc rather than lc marc, which uses subfield codes only as a means of identifying distinct elements within a field. everyone interested in the problems of bibliographic formatting, or sorting for filing, should give monocle close attention, both for its specific provisions (such as its tagging conventions, its search code, its treatment of titles, subrecords, and references, etc.) and for the light it throws on marc. while lc marc is very succinct in its commentaries and rarely justifies its codes, monocle has much fuller explanations of its provisions and its reasons for agreeing with or differing from marc. judith hopkins

system scope for library automation and generalized information storage and retrieval at stanford university. stanford, calif.: stanford university, 1970. 157 pp. (available from eric document reproduction service. ed 038153 mf $0.75; hc $7.70)

"the purpose of this document is to define the scope of a manual-automated system to serve the libraries and the teaching and research community of stanford university."
the automated system considered is not one, but the joint development of two major bibliographic projects: ballots (bibliographic automation of large library operations on a timesharing system) and spires (stanford physics information retrieval system). the development activity falls into three areas: applications unique to ballots, applications unique to spires, and common facilities that are used by both applications, such as executive and communications software and a text editor. the document is roughly divided in two, with half being devoted to the scope statement and the other half a myriad collection of appendices. the scope portion of the document defines a second phase of development for the system, as prototype applications have been in operation. the objectives of the applications are redefined in system level detail in view of experience learned from phase one. hardware is evaluated and there are indications that it is inadequate to effectively handle even the prototype system. the appendices include a glossary for the uninitiated, sample documentation of the present library operations, a comment on how the law library could use the system, a review by louise addis of the stanford linear accelerator center's experience with spires, and a tutorial on information retrieval. because of the audience this publication is intended for (librarian users, system developers, and administrators), library automation specialists and information scientists will not find much to put their teeth into. the document seems to be intended mainly for internal use rather than external distribution. alan d. hogan

advances in librarianship, edited by melvin j. voigt. volume 1. new york: academic press, 1970. 294 pp.

this volume is a most welcome addition to the literature of librarianship, and the prospect of its annual reappearance is indeed cheering.
for decades, there were few major innovations in librarianship, but about 1960 there occurred a series of events in libraries, including user-operated photocopying machines, radical improvements and extensions of microphotography, and computerization, that are leading to formulation of new objectives, new systems, new techniques for printing, new media of communication, and new knowledge. up until a dozen years ago, a librarian could keep abreast of new knowledge in his field by skimming a few journals and reading precious few articles. today to keep up he should read a couple of abstract journals and request offprints, read most of the articles in at least four or five journals, and advances in librarianship. this first volume contains eleven chapters by different authors that discuss topics in a broad span of librarianship: cataloging; acquisitions; costs; academic, school and public libraries; bibliotherapy; and developing countries. although the standard observation by reviewers of such volumes is that "as is to be expected, the quality of the papers is uneven, with some falling below others," it would be accurate to state of this volume that the quality of some papers is higher than others. the editor is to be commended for having produced an excellent publication that should be on the personal shelves of every librarian who wishes to keep abreast of advances throughout his profession. frederick g. kilgour

folkbiblioteken och adb. en introduktion i automatisk databehandling. av sten henriksson. datorn som hjalpmedel vid utlan och katalogisering. av claes axelson, lund: bibliotekstjanst, 1969. 86 p., ills., reg.

the general service bureau for swedish public libraries, bibliotekstjanst, has edited this introduction to computers and library automation (circulation and cataloguing). the book is written for persons who know the functions of a library and have perhaps more interest in than knowledge of computers and computer technique.
the introduction about computers by sten henriksson from the university of lund is not loaded with numeric statements but carefully explains the computer technique. for non-mathematicians this is an extremely good introduction. claes axelson from bibliotekstjanst writes about the computer's use in circulation and cataloguing. the described circulation system is based on some experiments carried out at a branch library in malmo (stock: 15,000 vols., circulation: 50,000 vols per year). borrower's card and book-card are matched, and batch processed once a week. in case any cards are lost, new ones are punched at the desk. reservations were troublesome to handle (a more serious objection if the systems had been intended for use in research and special libraries). cataloguing is described without referring to any experiments. all possibilities are mentioned: book catalogues, card catalogues, databanks, microfiche. specific problems about book cataloguing are treated too. both parts of the book are well written and illustrated with taste and economy. there is an index and a bibliography, but one will find jola and program missing. for librarians in scandinavian countries this book is a very useful one, though some problems are related only to the swedish public library world. mogens weitemeyer

latin american literature. harvard university library (widener library shelflist, 21). cambridge, massachusetts: harvard university library, 1969. 498 pp. $20.00.

the harvard library initiated its monumental project for publishing the widener shelf list in 1965 with the appearance of crusades. making available a classed listing of the holdings of one of the world's largest scholarly libraries is a major contribution to scholarship, as reviewers of the early volumes gratefully acknowledged. however, this review is not concerned with content of this remarkable publication, but rather with the typography of the 21st volume.
the early volumes (1-8) were reproduced by photo-offset from computer produced copy that was all in upper-case characters. to be sure, one does not read a shelf list, one consults it. nevertheless, literature is in lower case, and using large compilations in upper case is tiresome because of the low legibility of upper-case characters. the plates of these first volumes contained an average of 55 entries. the harvard library improved its procedures about a year after the volumes began to appear, so that the computer printout was in an expanded character set that included lower-case characters and diacritics. the newer volumes were far more legible and therefore far more comfortable to use. each page carried a single column of entries; pages averaged 85 entries. beginning with volume 21, computerized phototypesetting techniques are being employed, and legibility is greatly improved. economy has also improved, for each page now averages over 130 entries, nearly two-and-one-half times the content of pages in the early volumes. volume 21 has two columns of entries per page, a format that enhances the number of entries as well as legibility. harvard is to be congratulated on taking advantage of computer developments during the first five years of publication of the widener shelflist. thereby, the aesthetics and economy of a major bibliographic publication have been gratifyingly enhanced. frederick g. kilgour

154 information technology and libraries | september 2006

the present study investigated whether there is a correlation between user performance and compliance with screen-design guidelines found in the literature. rather than test individual guidelines and their interactions, the authors took a more holistic approach and tested a compilation of guidelines. nine bibliographic display formats were scored using a checklist of eighty-six guidelines.
twenty-seven participants completed ninety search tasks using the displays in a simulated web environment. none of the correlations indicated that user performance was statistically significantly faster with greater conformity to guidelines. in some cases, user performance was actually significantly slower with greater conformity to guidelines. in a supplementary study, a different set of forty-three guidelines and the user performance data from the main study were used. again, none of the correlations indicated that user performance was statistically significantly faster with greater conformity to guidelines.

attempts to establish generalizations are ubiquitous in science and in many areas of human endeavor. it is well known that this enterprise can be extremely problematic in both applied and pure science.1 in the area of human-computer interaction, establishing and evaluating generalizations in the form of interface-design guidelines are pervasive and difficult challenges, particularly because of the intractably large number of potential interactions among guidelines. using bibliographic display formats from web catalogs, the present study utilizes global evaluation by correlating user performance in a search task with conformity to a compilation of eighty-six guidelines (divided into four subsets). the literature offers many design guidelines for the user interface, some of which cover all aspects of the user interface, some of which focus on one aspect of the user interface, e.g., screen design. tullis, in chapters in two editions of the handbook of human-computer interaction, reviews the work in this area.2 the earlier chapter provides a table describing the screen-design guidelines available at that time.
he includes, for example, galitz, who he notes has several hundred guidelines addressing general screen design, and smith and mosier, who he notes have about three hundred guidelines addressing the display of data.3 earlier guidelines tended to be generic. more recently, guidelines have been developed for specific applications, e.g., web sites for airline travel agencies, multimedia applications, e-commerce, children, bibliographic displays, and public-information kiosks.4 although some of the guidelines in the literature are based on empirical evidence, many are based on expert opinion and have not been tested. some of the research-based guidelines have been tested in isolation or in combination with only a few other guidelines. the national cancer institute (nci) web site, research-based web design and usability guidelines, rates sixty guidelines on a scale of 0 to 5 based on the strength of the evidence.5 the more valid the studies that directly support the guideline, the higher the rating. in interpreting the scores, the site advises that scores of 1, 2, or 3 suggest that “more evidence is needed to strengthen the designer’s overall confidence in the validity of a guideline.” of the sixty guidelines on the site, forty-six (76.7 percent) fall into this group. in 2003, the united states department of health and human services web site, research-based web design and usability guidelines, rated 187 guidelines on a different five-point scale.6 eighty-two guidelines (43.9 percent) meet the criteria of having strong or medium research support. another forty-eight guidelines (25.7 percent) are rated as having weak research support. thus, there is some research support for 69.6 percent of the guidelines. in addition to the issue of the validity of individual guidelines, there may be interactions among guidelines.
an interaction occurs if the effect of a variable depends on the level of another variable, e.g., an interaction occurs if the usefulness of a guideline depends on whether some other guideline is being followed. a more severe problem is the potential for high-order interactions: the nature of a two-way interaction may depend on the level of a third variable, the nature of a three-way interaction may depend on the level of a fourth variable, and so on. because of the combinatorial explosion, if there are more than a few variables the number of possible interactions becomes huge. as cronbach stated: “once we attend to interactions, we enter a hall of mirrors that extends to infinity.”7 with a large set of guidelines, it is impractical to test all of the guidelines and all of the interactions, including high-order interactions. muter suggested several approaches for handling the problem of intractable high-order interactions, including adapting optimizing algorithms such as simplex, seeking “robustness in variation,” re-construing the problem, and pruning the alternative space.8 the present study utilizes another approach: global evaluation by correlating user performance with conformity to a set of guidelines.

bibliographic displays in web catalogs: does conformity to design guidelines correlate with user performance? joan m. cherry, paul muter, and steve j. szigeti

joan m. cherry (joan.cherry@utoronto.ca) is a professor in the faculty of information studies; paul muter (muter@psych.utoronto.ca) is an assistant professor in the department of psychology; and steve j. szigeti (szigeti@fis.utoronto.ca) is a doctoral student in the faculty of information studies and the knowledge media design institute, all at the university of toronto, canada.

bibliographic displays in web catalogs | cherry, muter, and szigeti 155
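the scale of the combinatorial explosion described above can be illustrated with a short calculation. this is only a sketch under a simplifying assumption that is not stated in the article: each guideline is treated as a single two-level variable (followed or not followed), so that a k-way interaction corresponds to a choice of k guidelines. the function names are hypothetical.

```python
from math import comb


def interaction_counts(n_guidelines: int, max_order: int) -> dict[int, int]:
    """Number of distinct k-way interactions among n guidelines, for k = 2..max_order."""
    return {k: comb(n_guidelines, k) for k in range(2, max_order + 1)}


def total_interactions(n_guidelines: int) -> int:
    """All interactions of every order: every subset of two or more guidelines."""
    return 2 ** n_guidelines - n_guidelines - 1


# Eighty-six guidelines, the size of the checklist used in the main study.
counts = interaction_counts(86, 4)
for order, count in counts.items():
    print(f"{order}-way interactions: {count:,}")
print(f"interactions of any order: {total_interactions(86):,}")
```

even the two-way interactions alone number in the thousands for a checklist of this size, which is consistent with the article's point that testing every interaction individually is impractical.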
using this method, particular guidelines and interactions are not tested, but the set and subsets are tested globally, and some of the interactions, including high-order interactions, are captured. bibliographic displays were scored using a compilation of guidelines, divided into four subsets, and the performance of users doing a set of search tasks using the displays was measured. an attempt was made to determine whether users find information more quickly on displays that receive high scores on checklists of screen-design guidelines. the authors are aware of only two studies that have investigated conformity with a set of guidelines and user performance, and they both included only ten guidelines. d’angelo and twining measured the correlation between compliance with a set of ten standards (d’angelo standards) and user comprehension.9 the d’angelo standards are in the form of principles for web-page design, based on a review of the literature.10 d’angelo and twining found a small correlation (.266) between number of standards met and user comprehension.11 they do not report on statistical significance, but from the data provided in the paper it appears that the correlation is not significant. gerhardt-powals compared an interface designed according to ten cognitive engineering principles to two control interfaces and found that the cognitively engineered interface resulted in statistically significantly superior user performance.12 the guidelines used in the present study were based on a list compiled by chan to evaluate displays of bibliographic records in online library catalogs.13 the set of guidelines was broken down into four subsets. participants in this study were given search tasks and clicked on the requested item on a bibliographic display. the main dependent variable of interest was response time.

method

participants

twenty-seven participants were recruited through the university of toronto psychology 100 subject pool.
seventeen were female; ten were male. most (twenty) were in the age group 17 to 24; three were in the age group 25 to 34 years, and four were in the age group 35 to 44. one had never used the web; all others reported using the web one or more hours per week. participants received course credit.

design

to control for the effects of fatigue, practice runs, and the like, the order of trials was determined by two orthogonal 9 x 9 latin squares, one to select a display and one to select a book record. each participant completed five consecutive search tasks (author, title, call number, publisher, and date) in a random order, with each display-book combination. (the order of the five search tasks was randomized each time.) this procedure was repeated, so that in total each participant did ninety tasks (9 displays x 5 tasks x 2 repetitions).

materials and apparatus

the study used nine displays from library catalogs available on the web. they were selected to represent a variety of systems and to illustrate the remarkable diversity in bibliographic displays in web catalogs. the displays differed in the amount of information included, the structure of the display, employment of highlighting techniques, and use of graphical elements. four examples of the nine displays are presented in figures 1a, 1b, 1c, and 1d. the displays were captured and presented in an interactive environment using active server page (asp) software. the look of the displays was retained, but hypertext links were deactivated. nine different book records were used to provide the content for the displays. items selected were those that would be readily understood by most users, e.g., books by saul bellow, norman mailer, and john updike. the guidelines were based on a list compiled by chan from a review of the literature in human-computer interaction and library science.14 the list does not include guidelines about the process of design.
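the counterbalancing described under design can be sketched in code. the article does not say how its two latin squares were constructed or how participants were assigned to rows, so everything below is an assumption made for illustration: the squares use a standard modular-arithmetic construction (multipliers 1 and 2, which yield orthogonal squares of order 9), participants are assigned to rows by participant number modulo 9, and all names are hypothetical.

```python
import random

N = 9  # nine displays and nine book records
TASKS = ["author", "title", "call number", "publisher", "date"]


def linear_square(multiplier: int, n: int = N) -> list[list[int]]:
    """Latin square whose (row, col) entry is (multiplier * row + col) mod n."""
    return [[(multiplier * r + c) % n for c in range(n)] for r in range(n)]


def is_latin(square: list[list[int]]) -> bool:
    """Every symbol appears exactly once in each row and each column."""
    n = len(square)
    symbols = set(range(n))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == symbols for c in range(n))
    return rows_ok and cols_ok


def are_orthogonal(a: list[list[int]], b: list[list[int]]) -> bool:
    """Superimposing the squares yields every ordered pair exactly once."""
    n = len(a)
    pairs = {(a[r][c], b[r][c]) for r in range(n) for c in range(n)}
    return len(pairs) == n * n


display_square = linear_square(1)  # selects the display for each block
record_square = linear_square(2)   # selects the book record for each block


def schedule(participant: int, rng: random.Random, repetitions: int = 2):
    """Trial list for one participant as (display, record, task) tuples."""
    row = participant % N  # assumed assignment of participants to rows
    trials = []
    for _ in range(repetitions):
        for col in range(N):
            display = display_square[row][col]
            record = record_square[row][col]
            # The five tasks are reshuffled for every display-record block.
            for task in rng.sample(TASKS, k=len(TASKS)):
                trials.append((display, record, task))
    return trials


trials = schedule(participant=0, rng=random.Random(0))
print(len(trials), is_latin(display_square), are_orthogonal(display_square, record_square))
```

orthogonality means that, across the whole pair of squares, each display is paired with each book record exactly once, so no display is systematically tied to an easier or harder record.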
chan formatted the guidelines as a checklist for bibliographic displays in online catalogs. in work reported in 1996, cherry and cox modified the checklist for use with bibliographic displays in web catalogs.15 in a 1998 paper, cherry reported on evaluations of bibliographic displays in catalogs of academic libraries, based on chan’s data for twelve opacs and data for ten web catalogs evaluated by cherry and cox using a modification of the 1996 checklist for web catalogs.16 the findings showed that, on average, displays in opacs scored 58 percent and displays in web catalogs scored 60 percent. the 1996 checklist of guidelines was modified by herrero-solana and de moya-anegón, who used it to explore the use of multivariate analysis in evaluating twenty-five latin american catalogs.17 for the present study, four questions that were considered less useful were removed from the checklist used in cherry’s 1998 analysis. the checklist consisted of four sections or subsets: labels (these identify parts of the bibliographic description); text (the display of the bibliographic, holdings/location, and circulation status information); instructions (includes instructions to users, informational messages, and options available); and layout (includes identification of the screen, the organization for the bibliographic information, spacing, and consistency of information presentation). items on the checklist were phrased as questions requiring yes/no responses.
examples of the items are: labels: “are all fields/variables labeled?”; text: “is the text in mixed case (upper and lowercase)?”; instructions: “are instructional sentences or phrases simple, concise, clear, and free of typographical errors?”; and layout: “is the width of the display no more than forty to sixty characters?” the set used in the present study contained eighty-six guidelines in total, of which forty-eight were generic and could be applied to any application. the remaining thirty-eight were specific and applied to bibliographic displays in web catalogs. the experiment was run on a pentium computer with a seventeen-inch sony color monitor with a standard keyboard and mouse.

figures 1a, 1b, 1c, and 1d. examples of displays

procedure

participants were tested individually. five practice trials with a display and book record not used in the experiment familiarized the participant with the tasks and software. at the beginning of a trial, the message “when ready, click” appeared on the screen. when the participant clicked on the mouse, a bibliographic display appeared along with a message at the top of the screen indicating whether the participant should click on the author, title, call number, publisher, or date of publication, e.g., “current task: author.” participants clicked on what they thought was the correct answer. if they clicked on any other area, the display was shown again. an incorrect click was not defined as an error (in effect, percent correct was always 100), but an incorrect click would of course add to the response time. the software recorded the time to successfully complete each search, the identification for the display and the book record, and the search-task type. when a participant completed the five search tasks for a display, a message was shown indicating the average response time on that set of tasks.
when participants completed the ninety search tasks, they were asked to rank the nine displays according to their preference. for this task, a set of laminated color printouts of the displays was provided. participants ranked the displays, assigning a rank of 1 to the display that they preferred most, and 9 to the one they preferred least. they were also asked to complete a short background questionnaire. the entire session took less than forty-five minutes.

scoring the displays on screen design guidelines

the authors’ experience has indicated that judging whether a guideline is met can be problematic: evaluators sometimes differ in their judgments. in this study, three evaluators assessed each of the nine displays independently. if there was any disagreement amongst the evaluators’ responses for a given question for a given display, that question was not used in the computation of the percentage score for that display. (a guideline regarding screen density was evaluated by only one evaluator because it was very time-consuming.) the total number of questions used to assess each display was eighty-six. the number of questions on which the evaluators disagreed ranged from twelve to thirty across the nine displays. all questions on which the three evaluators agreed for a given display were used in the calculation of the percentage score for that display. hence the percentage scores for the displays are based on a variable set and number of questions, from fifty-six to seventy-four. the subset of questions on which the three evaluators agreed for all nine displays was small: twenty-two questions.

results

with regard to conformity to the guidelines, in addition to the overall scores for each display, which ranged from 42 percent to 65 percent, the percentage score was calculated for each subset of the checklist (labels, text, instructions, and layout). the time to successfully complete each search task was recorded to the nearest millisecond.
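the scoring rule described above (drop any question on which the evaluators disagree, then take the percentage of the remaining questions answered yes) can be sketched as follows. this is an illustrative reading of the rule, not the authors' own procedure or software; the function name is hypothetical and the ratings are made-up data chosen only so that twelve of eighty-six questions are in disagreement, the smallest number of disagreements the article reports.

```python
def percentage_score(ratings: list[tuple[bool, ...]]) -> tuple[float, int]:
    """Score one display from per-question yes/no judgments.

    ratings holds one tuple per checklist question, with one boolean per
    evaluator (True = guideline met). A question counts only when all of its
    evaluators gave the same answer; a question judged by a single evaluator
    (as with the screen-density guideline) therefore always counts.
    Returns (percent of counted questions answered yes, number counted).
    """
    agreed = [answers[0] for answers in ratings if len(set(answers)) == 1]
    if not agreed:
        raise ValueError("the evaluators agreed on no questions")
    return 100.0 * sum(agreed) / len(agreed), len(agreed)


# Hypothetical display: 48 unanimous "yes", 26 unanimous "no", 12 disagreements.
ratings = (
    [(True, True, True)] * 48
    + [(False, False, False)] * 26
    + [(True, False, True)] * 12
)
score, used = percentage_score(ratings)
print(f"{score:.1f} percent, based on {used} of {len(ratings)} questions")
```

because the denominator changes from display to display, two displays with the same percentage score may have been judged on different questions, which is the caveat the article raises about the variable set of questions.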
(for some unknown reason, six of the 2,430 response times recorded [27 x 90] were 0 milliseconds. the program was written in such a way that the response-time buffer was cleared at the time of stimulus presentation, in case the participant clicked just before this time. these trials were treated as missing values in the calculation of the means.) six mean response times were calculated: author, title, call number, publisher, date, and the sum of the five response times, called all tasks. the mean of all tasks response times ranged from 13,671 milliseconds to 21,599 milliseconds for the nine formats. the nine display formats differed significantly on this variable according to an analysis of variance, f(8, 477) = 17.1, p < .001. the correlations between response times and guidelines-conformance scores are presented in table 1. it is important to note that a high correlation between response time and conformity to guidelines indicates a low correlation between user performance (speed) and conformity to guidelines. row 1 of table 1 contains correlations between the total guidelines score and response times; column 1 contains correlations between all tasks (the sum of the five response times) and guidelines scores. of course, the correlations in table 1 are not all independent of each other. only five of the thirty correlations in table 1 are significant at the .05 level, and they all indicate slower response times with higher conformity to guidelines. of the six correlations in table 1 indicating faster response times with higher conformity to guidelines, none approaches statistical significance. the upper left-hand cell of table 1 indicates that the overall correlation between total scores on the guidelines and the mean response time across all search tasks (all tasks) was 0.469 (df = 7, p = 0.203)—i.e., conformity to the overall checklist was correlated with slower overall response times, though this correlation did not approach statistical significance. 
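the reported significance level for the overall correlation can be checked from the correlation coefficient and its degrees of freedom alone. the sketch below assumes a two-tailed t test on a pearson correlation computed across the nine displays (so df = 9 - 2 = 7); the article does not name the test, and the function names are hypothetical. to stay self-contained it integrates the t density numerically with simpson's rule rather than calling a statistics library.

```python
import math


def t_from_r(r: float, df: int) -> float:
    """t statistic for a correlation r with df = n - 2 degrees of freedom."""
    return r * math.sqrt(df / (1.0 - r * r))


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1.0 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t: float, df: int, steps: int = 10_000) -> float:
    """Two-tailed p value: 1 - 2 * (area under the density from 0 to |t|).

    The area is approximated with Simpson's rule; steps must be even.
    """
    upper = abs(t)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1.0 - 2.0 * (total * h / 3.0)


t_stat = t_from_r(0.469, 7)
p_value = two_tailed_p(t_stat, 7)
print(f"t = {t_stat:.3f}, two-tailed p = {p_value:.3f}")
```

under these assumptions the result agrees with the reported p = 0.203 to within rounding. with only nine displays, a correlation has to be fairly large (roughly 0.67 in absolute value) before it reaches the .05 level, which helps explain why so few of the thirty correlations in table 1 are significant.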
figure 2 shows a scatter plot of the main independent variable, overall score on the checklist of guidelines, and the main dependent variable, the sum of the response times for the five tasks (all tasks). figure 3 shows a scatter plot for the highest obtained correlation: between score on the overall checklist of guidelines and the time to complete the title search task. visual inspection suggests patterns consistent with table 1: no correlation in figure 2, and slower search times with higher guidelines scores in figure 3.

finally, correlations were computed between preference and response times (all tasks response times and five specific-task response times) and between preference and conformity to guidelines (overall guidelines and four subsets of guidelines). none of the eleven correlations approached statistical significance.

supplementary study

to further validate the results of the main study, it was decided to score the interfaces against a different set of guidelines based on the 2003 u.s. department of health and human services research-based web design and usability guidelines. this set consists of 187 guidelines and includes a rating for each guideline based on strength of research evidence for that guideline. the present study started with eighty-two guidelines that were rated as having either moderate or strong research support, as the definitions of both of these include “cumulative research-based evidence.”18 compliance with guidelines that address the process of design can only be judged during the design process, or via access to the interface designers. since this review process did not allow for that, a total of nine process-focused guidelines were discarded.
this set of seventy-three guidelines was then compared with the sixty-guideline 2001 nci set, research-based web design and usability guidelines, intending to add any outstanding nci guidelines supported by strong research evidence to the existing list of seventy-three. however, all of the strongly supported nci guidelines were already represented in the original seventy-three. finally, the guidelines in the iso 9241, ergonomic requirements for office work with visual display terminals (vdts), part 11 (guidance on usability), part 12 (presentation of information ), and part 14 (menu dialogues ) were compared to the existing set of seventy-three, with the intention that any prescriptive guideline in the iso set that was not already included in the original seventy-three would be added.19 again, there were none. the seventy-three guidelines were organized into three thematic groups: (1) layout (the organization of textual and graphic material on the screen), (2) interaction (which included navigation or any element with which the user would interact), and (3) text and readability. all of the guidelines used were written in a manner allowing readers room for interpretation. the authors explicitly stated that they were not writing rules, but rather, guidelines, and recognized that their application must allow for a level of flexibility.20 this ambiguity creates problems in terms of assessing displays. in this study, two evaluators independently assessed the nine displays. the first evaluator applied all seventy-three guidelines and found thirty to be nonapplicable to the specific types of interfaces considered. the second evaluator applied the shortened list of forty-three guidelines. following the independent evaluations, the two evaluators compared assessments. the initial rate of agreement between the two assessments ranged from 49 percent to 70 percent across the nine displays. 
in cases where there was disagreement, the evaluators discussed their rationale for the assessment in order to achieve consensus.

table 1. correlations between scores on the checklist of screen design guidelines and time to complete search tasks: pearson correlation (sig. 2-tailed); n = 9 for all cells

score         all tasks     author        title         call #        publisher     year
total score   .469 (.203)   .401 (.285)   .870 (.002)   .547 (.127)   .035 (.930)   .247 (.522)
labels        .722 (.028)   .757 (.018)   .312 (.413)   .601 (.087)   .400 (.286)   .669 (.049)
text          -.260 (.500)  -.002 (.997)  .595 (.091)   -.191 (.623)  -.412 (.271)  -.288 (.452)
instructions  .422 (.258)   .442 (.234)   .712 (.032)   .566 (.112)   .026 (.947)   .126 (.748)
layout        .602 (.086)   -.102 (.794)  .383 (.308)   .624 (.073)   .492 (.179)   .367 (.332)

results of supplementary study

as with the initial study, in addition to the overall scores for each display, the percentage score was calculated for each subset of the checklist (layout, interaction, and text and readability). it is worth noting that the displays showed higher compliance with this second set of guidelines, with overall scores ranging from 68 percent to 89 percent. the correlations between response times and guidelines-conformance scores are presented in table 2. again, it is important to note that a high correlation between response time and conformity to guidelines indicates a low correlation between user performance (speed) and conformity to guidelines. row 1 of table 2 contains correlations between the total guidelines score and response times; column 1 contains correlations between all tasks (the sum of the five response times) and guidelines scores. of course, the correlations in table 2 are not all independent of each other. only one of the twenty-four correlations in table 2 is significant at the .05 level, and it indicates a slower response time with higher conformity to guidelines.
of the ten correlations in table 2 indicating faster response times with higher conformity to guidelines, none approaches statistical significance. the upper left-hand cell of table 2 indicates that the overall correlation between total scores on the guidelines and the mean response time across all search tasks (all tasks) was 0.292 (p = 0.445)—i.e., conformity to the overall checklist was correlated with slower overall response times, though this correlation did not approach statistical significance. figure 4 shows a scatter plot of the main independent variable, overall score on the checklist of guidelines, and the main dependent variable, the sum of the response times for the five tasks (all tasks). figure 5 shows a scatter plot for the highest-obtained correlation: between score on the text and readability category of guidelines and the time to complete the title search task. visual inspection suggests patterns consistent with table 2: no correlation in figure 4, and slower search times with higher guidelines scores in figure 5.

discussion

in the present experiment and the supplementary study, none of the correlations indicating faster user performance with greater conformity to guidelines approached statistical significance. in some cases, user performance was actually significantly slower with greater conformity to guidelines—i.e., in some cases, there was a negative correlation between user performance and conformity to guidelines. the authors are aware of no other study indicating a negative correlation between user performance and conformity to interface design guidelines. some researchers would not be surprised at a finding of zero correlation between user performance and conformity to guidelines, but a negative correlation is somewhat puzzling. a negative correlation implies that there is something wrong somewhere—perhaps incorrect underlying theories or an incorrect body of assumptions.
such a negative correlation is not without precedent in applied science. in the field of medicine, before the turn of the twentieth century, seeing a doctor actually decreased the chances of improving health.21 presumably, medical guidelines of the time were negatively correlated with successful practice, and the negative correlation implies not just worthlessness, but medical theories or beliefs that were actually incorrect and harmful.

figure 2. scatter plot for overall score on checklist of screen design guidelines and time to complete set of five search tasks

figure 3. scatter plot for overall score on checklist of screen design guidelines and time to complete “title” search tasks

the boundary conditions of the present findings are unknown. the present findings may be specific to the tasks employed—fairly simple search tasks. the findings may apply only to situations in which the user is switching formats frequently, as opposed to situations in which each user is using only one format. (a between-subjects design would test this possibility.) the findings may be specific to the two sets of guidelines used. with sets of ten guidelines, d’angelo and twining and gerhardt-powals found positive correlations between user performance and conformity to guidelines (though apparently not statistically significantly in the former study).22 the guidelines used in the authors’ main study and supplementary study tended to be more detailed than in the other two studies. detailed guidelines are sometimes seen as advantageous, since developers who use guidelines need to be able to interpret the guidelines in order to implement them. however, perhaps following a large number of detailed guidelines reduces the amount of personal judgment used and results in less effective designs.
(designers of the nine displays used in the present study would not have been using either of the sets of guidelines used in our studies but may have been using some of the sources from which our guidelines were extracted.) as noted by cheepen in discussing guidelines for voice dialogues, sometimes a designer’s experience may be more valuable than a particular guideline.23

the lack of agreement in interpreting the guidelines was an unexpected but interesting factor revealed during the collection of data in both the main study and the supplementary study. while a higher rate of agreement had been expected, the differences raised an important point in the use of guidelines. if guidelines intentionally leave room for interpretation, what role do expert opinion and experience play in design? in the main study, the proportion of guidelines on which the evaluators disagreed ranged from 14 percent to 35 percent across the nine displays. in the supplementary study, both evaluators had experience in interface design through a number of different roles in the design process (both academic and professional). this meant the evaluators’ interpretations of the guidelines were informed by previous experience. the initial level of disagreement ranged from 30 percent to 51 percent across the nine displays. while it was possible to quickly reach consensus on a number of assessments (because both evaluators recognized the high degree of subjectivity that is involved in design), the disagreements also led to longer discussions regarding the intentions of the guideline authors. a majority of the differences involved lack of guideline clarity (where one evaluator had indicated a meet-or-fail score, while another felt the guideline was either unclear or not applicable). does this imply that guidelines can best be applied by committees or groups of designers? the dynamic of such groups would add another complex variable to understanding the relationship between guideline conformity and user performance.

table 2. correlations between scores on subset of the u.s. dept. of health and human services (2003) research-based web design and usability guidelines and time to complete search tasks: pearson correlation (sig. 2-tailed); n = 9 for all cells

score         all tasks     author        title         call #        publisher     year
total score   .292 (.445)   .201 (.604)   .080 (.839)   -.004 (.992)  .345 (.363)   .499 (.172)
layout        -.308 (.420)  -.264 (.492)  -.512 (.159)  -.332 (.383)  .046 (.906)   -.294 (.442)
text          .087 (.824)   -.051 (.895)  .712 (.032)   -.059 (.879)  -.095 (.808)  -.259 (.500)
interaction   .638 (.065)   .603 (.085)   .055 (.887)   .439 (.238)   .547 (.128)   .625 (.072)

figure 4. scatter plot for subset of u.s. department of health and human services (2003) research-based web design and usability guidelines conformance score and total time to complete five search tasks

figure 5. scatter plot for text and readability category of u.s. department of health and human services (2003) research-based web design and usability guidelines and time to complete “title” search tasks

future research should test other tasks and other sets of guidelines to confirm or refute the findings of the present study. there should also be investigation of other potential predictors of display effectiveness. for example, would the ratings of usability experts or graphic designers for a set of bibliographic displays be positively correlated with user performance? crawford, in response to a paper presenting findings from an evaluation of bibliographic displays using a previous version of the checklist of guidelines used in the main study, commented that the design of bibliographic displays still reflects art, not science.24 several researchers have discussed aesthetics and user interface design. reed et al.
noted the need to extend our understanding of the role of aesthetic elements in the context of user-interface guidelines and standards.25 ngo, teo, and byrne discussed fourteen aesthetic measures for graphic displays.26 norman discussed these ideas in “emotion and design: attractive things work better.”27 tractinsky, katz, and ikar found strong correlations between perceived aesthetic appeal and perceived usability.28

most empirical studies of guidelines have looked at one variable only or, at the most, a small number of variables. the opposite extreme would be to do a study that examines a large number of variables factorially. for example, assuming eighty-six yes/no guidelines for bibliographic displays, it would be theoretically possible to do a factorial experiment testing all possible combinations of yes/no: 2 to the 86th power, or roughly 7.7 x 10^25 displays. in such an experiment, all two-way interactions and higher interactions could be assessed, but such an experiment is not feasible. what the authors have done is somewhere between these two extremes. this study has the disadvantage that we cannot say anything about any individual guideline, but it has the advantage that it captures some of the interactions, including high-order interactions.

despite the present results, the authors are not recommending abandoning the search for guidelines in interface design. at a minimum, the use of guidelines may increase consistency across interfaces, which may be helpful. however, in some research domains, particularly when huge numbers of potential interactions result in extreme complexity, it may be advisable to allocate resources to means other than attempting to establish guidelines, such as expert review, relying on tradition, letting natural selection take its course, utilizing the intuitions of designers, and observing user interaction. indeed, in pure and applied research in general, perhaps more resources should be allocated to means other than searching for explicit generalizations.
future research may better indicate when to attempt to establish generalizations and when to use other methods.

acknowledgements

this work was supported by a social sciences and humanities research council general research grant awarded by the faculty of information studies, university of toronto, and by the natural sciences and engineering research council of canada. the authors wish to thank mark dykeman and gerry oxford, who developed the software for the experiment; donna chan, joan bartlett, and margaret english, who scored the displays with the first set of guidelines; everton lewis, who conducted the experimental sessions; m. max evans, who helped score the displays with the supplementary set of guidelines; and robert l. duchnicky, jonathan l. freedman, bruce oddson, tarjin rahman, and paul w. smith for helpful comments.

references and notes

1. see, for example, a. chapanis, “some generalizations about generalization,” human factors 30, no. 3 (1988): 253–67.
2. t. s. tullis, “screen design,” in handbook of human-computer interaction, ed. m. helander (amsterdam: elsevier, 1988), 377–411; t. s. tullis, “screen design,” in handbook of human-computer interaction, 2d ed., eds. m. helander, t. k. landauer, and p. prabhu (amsterdam: elsevier, 1997), 503–31.
3. w. o. galitz, handbook of screen format design, 2d ed. (wellesley hills, mass.: qed information sciences, 1985); s. l. smith and j. n. mosier, guidelines for designing user interface software, technical report esd-tr-86-278 (hanscom air force base, mass.: usaf electronic systems division, 1986).
4. c. chariton and m. choi, “user interface guidelines for enhancing the usability of airline travel agency e-commerce web sites,” chi ’02 extended abstracts on human factors in computing systems, apr. 20–25, 2002 (minneapolis, minn.: acm press), 676–77, http://portal.acm.org/citation.cfm?doid=506443.506541 (accessed dec. 28, 2005); m. g.
wadlow, “the andrew system; the role of human interface guidelines in the design of multimedia applications,” current psychology: research and reviews 9 (summer 1990): 181–91; j. kim and j. lee, “critical design factors for successful e-commerce systems,” behaviour and information technology 21, no. 3 (2002): 185–99; s. gilutz and j. nielsen, usability of web sites for children: 70 design guidelines (fremont, calif.: nielsen norman group, 2002); juliana chan, “evaluation of formats used to display bibliographic records in opacs in canadian academic and public libraries,” master of information science research project report (university of toronto: faculty of information studies, 1995); m. c. maguire, “a review of user-interface design guidelines for public information kiosk systems,” international journal of human-computer studies 50, no. 3 (1999): 263–86.
5. national cancer institute, research-based web design and usability guidelines (2001), www.usability.gov/guidelines/index.html (accessed dec. 28, 2005).
6. u.s. department of health and human services, research-based web design and usability guidelines (2003), http://usability.gov/pdfs/guidelines.html (accessed dec. 28, 2005).
7. l. j. cronbach, “beyond the two disciplines of scientific psychology,” american psychologist 30, no. 2 (1975): 116–27.
8. p. muter, “interface design and optimization of reading of continuous text,” in cognitive aspects of electronic text processing, eds. h. van oostendorp and s. de mul (norwood, n.j.: ablex, 1996), 161–80; j. a. nelder and r. mead, “a simplex method for function minimization,” computer journal 7, no. 4 (1965): 308–13; t. k. landauer, “research methods in human-computer interaction,” in handbook of human-computer interaction, ed. m. helander (amsterdam: elsevier, 1988), 905–28; r. n. shepard, “toward a universal law of generalization for psychological science,” science 237 (sept. 11, 1987): 1317–323.
9. j. d.
d’angelo and j. twining, “comprehension by clicks: d’angelo standards for web page design, and time, comprehension, and preference,” information technology and libraries 19, no. 3 (2000): 125–35.
10. j. d. d’angelo and s. k. little, “successful web pages: what are they and do they exist?” information technology and libraries 17, no. 2 (1998): 71–81.
11. d’angelo and twining, “comprehension by clicks.”
12. j. gerhardt-powals, “cognitive engineering principles for enhancing human-computer performance,” international journal of human-computer interaction 8, no. 2 (1996): 189–211.
13. chan, “evaluation of formats.”
14. ibid.
15. joan m. cherry and joseph p. cox, “world wide web displays of bibliographic records: an evaluation,” proceedings of the 24th annual conference of the canadian association for information science (toronto, ontario: canadian association for information science, 1996), 101–14.
16. joan m. cherry, “bibliographic displays in opacs and web catalogs: how well do they comply with display guidelines?” information technology and libraries 17, no. 3 (1998): 124–37; cherry and cox, “world wide web displays of bibliographic records.”
17. v. herrero-solana and f. de moya-anegón, “bibliographic displays of web-based opacs: multivariate analysis applied to latin-american catalogs,” libri 51 (june 2001): 75–85.
18. u.s. department of health and human services, research-based web design and usability guidelines, xxi.
19.
international organization for standardization, iso 9241-11: ergonomic requirements for office work with visual display terminals (vdts)—part 11: guidance on usability (geneva, switzerland: international organization for standardization, 1998); international organization for standardization, iso 9241-12: ergonomic requirements for office work with visual display terminals (vdts)—part 12: presentation of information (geneva, switzerland: international organization for standardization, 1997); international organization for standardization, iso 9241-14: ergonomic requirements for office work with visual display terminals (vdts)—part 14: menu dialogues (geneva, switzerland: international organization for standardization, 1997).
20. u.s. department of health and human services, research-based web design and usability guidelines.
21. ivan illich, limits to medicine: medical nemesis: the expropriation of health (harmondsworth, n.y.: penguin, 1976).
22. d’angelo and twining, “comprehension by clicks”; gerhardt-powals, “cognitive engineering principles.”
23. c. cheepen, “guidelines for dialogue design—what is our approach? working design guidelines for advanced voice dialogues project. paper 3,” (1996), www.soc.surrey.ac.uk/research/reports/voice-dialogues/wp3.html (accessed dec. 29, 2005).
24. w. crawford, “webcats and checklists: some cautionary notes,” information technology and libraries 18, no. 2 (1999): 100–03; cherry, “bibliographic displays in opacs and web catalogs.”
25. p. reed et al., “user interface guidelines and standards: progress, issues, and prospects,” interacting with computers 12, no. 1 (1999): 119–42.
26. d. c. l. ngo, l. s. teo, and j. g. byrne, “formalizing guidelines for the design of screen layouts,” displays 21, no. 1 (2000): 3–15.
27. d. a. norman, “emotion and design: attractive things work better,” interactions 9, no. 4 (2002): 36–42.
28. n. tractinsky, a. s. katz, and d. ikar, “what is beautiful is usable,” interacting with computers 13, no.
2 (2000): 127–45.

opensearch and sru | levan

not all library content can be exposed as html pages for harvesting by search engines such as google and yahoo!. if a library instead exposes its content through a local search interface, that content can then be found by users of metasearch engines such as a9 and vivísimo. the functionality provided by the local search engine will affect the functionality of the metasearch engine and the findability of the library’s content. this paper describes that situation and some emerging standards in the metasearch arena that choose different balance points between functionality and ease of implementation.

editor's note: this article was submitted in honor of the fortieth anniversaries of lita and ital.

the content provider’s dilemma

consider the increasingly common situation in which a library wants to expose its digital content to its users. suppose it knows that its users prefer search engines that search the contents of many sites simultaneously, rather than site-specific engines such as the one on the library’s web site. in order to support the preferences of its users, this library must make its contents accessible to search engines of the first type. the easiest way to do this is for the library to convert its contents to html pages and let the harvesting search engines such as google and yahoo! collect those pages and provide searching on them. however, a serious problem with harvesting search engines is that they place limits on how much data they will collect from any one site. google and yahoo! will not harvest a 3-million-record book catalog, even if the library can figure out how to turn the catalog entries into individual web pages.

an alternative to exposing library content to harvesting search engines as html pages is to provide a local search interface and let a metasearch engine combine the results of searching the library’s site with the results from searching many other sites simultaneously.
users of metasearch engines get the same advantage that users of harvesting search engines get (i.e., the ability to search the contents of many sites simultaneously) plus those users get access to data that the harvesting search engines do not have. the issue for the library is determining how much functionality it must provide in its local search engine so that the metasearch engine can, in turn, provide acceptable functionality to its users. the amount of functionality that the library provides will determine which metasearch engines will be able to access the library’s content. metasearch engines, such as a9 and vivísimo, are search engines that take a user’s query, send it to other search engines, and integrate the responses.1 the level of integration usually depends on the metasearch engine’s ability to understand the responses it receives from the various search engines it has queried. if the response is html intended for display on a browser, then the metasearch engine developers have to write code to parse through the html looking for the content. in such a case, the perceived value of the content determines the level of effort that the metasearch engine developers put into the parsing task; low-value content will have a low priority for developer time and will either suffer from poor integration or be excluded. for metasearch engines to work, they need to know how to send a search to the local search engine and how to interpret the results. metasearch engines such as vivísimo and a9 have staffs of programmers who write code to translate the queries they get from users into queries that the local search engines can accept. metasearch engines also have to develop code to convert all the responses returned by the local search engines into some common format so that those results can be combined and displayed to the user. this is tedious work that is prone to breaking when a local search engine changes how it searches or how it returns its response. 
the job of the metasearch engine is made much simpler if the local search engine supports a standard search interface such as sru (search and retrieve url) or opensearch.

what does a metasearch engine need in order to use a local search engine?

the search process consists of two basic steps. first, the search is performed. second, records are retrieved. to do a search, the metasearch engine needs to know:

1. the location of the local search engine
2. the form of the queries that the local search engine expects
3. how to send the query to the local search engine

to retrieve records, the metasearch engine needs to know:

4. how to find the records in the response
5. how to parse the records

opensearch and sru: a continuum of searching
ralph levan
ralph levan (levan@oclc.org) is a research scientist at oclc online computer library center in dublin, ohio.

four protocols

this paper will discuss four search protocols: opensearch, opensearch 1.1, sru, and the metasearch xml gateway (mxg).2 opensearch was initially developed for the a9 metasearch engine. it provides a mechanism for content providers to notify a9 of their content. it also allows rss (really simple syndication) browsers to display the results of a search.3 opensearch 1.1 has just been released. it extends the original specification based on input from a number of organizations, microsoft being prominent among them. sru was developed by the z39.50 community.4 recognizing that their standard (now eighteen years old) needed updating, they simplified it and created a new web service based on an xml encoding carried over http. the mxg protocol is the product of the niso metasearch initiative, a committee of metasearch engine developers, content providers, and users.5 mxg uses sru as a starting place, but eases the requirement for support of a standard query grammar.

functionality versus ease of implementation

a library rarely has software developers.
the library’s area of expertise is, first of all, the management of content and, secondarily, content creation. librarians use tools developed by other organizations to provide access to their content. these tools include the library’s opac, the software provided to search any licensed content, and the software necessary to build, maintain, and access local digital repositories. for a library, ease of adoption of a new search protocol is essential. if support for the search protocol is built into the library’s tools, then the library will use it. if a small piece of code can be written to convert the library’s existing tools to support the new protocol, the library may do that.

similarly, the developers of the library’s tools will want to expend the minimum effort to support a new search protocol. the tool developer’s choice of search protocol to support will depend on the tension between the functionality needed and the level of effort that must be expended to provide and maintain it. if low functionality is acceptable, then a small development effort may be acceptable. high functionality will require a greater level of effort. the developers of the search protocols examined here recognize this tension and are modifying their protocols to make them easier to implement. the new opensearch 1.1 will make it easier for some local search-engine providers to implement by easing some of the functionality requirements of version 1.0. similarly, the niso metasearch committee has defined mxg, a variant of sru that eases some of the requirements of sru.6

search protocol basics

once again, the five basic pieces of information that a metasearch engine needs in order to communicate effectively with a local search engine are: (1) local search engine location, (2) the query grammar expected, (3) the request encoding, (4) the response encoding, and (5) the record encoding. the four protocols provide these pieces of information to one degree or another (see table 1).
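the five pieces of information can be pictured as one small record per local search engine. the sketch below is illustrative only: the class name `LocalEngineProfile`, its field names, and the string values are assumptions made for this example, filled in from the paper's own description of what opensearch 1.0 and sru leave defined or undefined; the urls are hypothetical.

```python
from dataclasses import dataclass

UNKNOWN = "unknown"  # marks a piece of information the protocol does not supply


@dataclass(frozen=True)
class LocalEngineProfile:
    """The five things a metasearch engine needs to know about a local engine."""
    location: str           # (1) where the local search engine lives
    query_grammar: str      # (2) the form of query the engine expects
    request_encoding: str   # (3) how the query is sent
    response_encoding: str  # (4) how to find the records in the response
    record_encoding: str    # (5) how to parse each record

    def missing(self):
        """Names of the pieces that must be learned by means outside the protocol."""
        return [name for name, value in vars(self).items() if value == UNKNOWN]


# hypothetical profiles, following the paper's description of each protocol
OPENSEARCH_1_0 = LocalEngineProfile(
    location="http://library.example.org/search",  # hypothetical url
    query_grammar=UNKNOWN,
    request_encoding="url template from the description record",
    response_encoding="rss",
    record_encoding=UNKNOWN,
)
SRU = LocalEngineProfile(
    location="http://library.example.org/sru",  # hypothetical url
    query_grammar="cql",
    request_encoding="url defined by the protocol",
    response_encoding="xml schema defined by the protocol",
    record_encoding="xml, schema named in the response",
)

print(OPENSEARCH_1_0.missing())
print(SRU.missing())
```

every entry returned by `missing()` stands for a priori knowledge, that is, code or configuration that the metasearch engine's staff has to supply for that particular content provider.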
the four protocols expose a site’s searching functionality and return responses in a standard format. all of these protocols have some common properties. they expect that the content provider will have a description record that describes the search service. all of these services send searches via http as simple urls, and the responses are sent back as structured xml. to ease implementation, opensearch 1.1 allows the content provider to return html instead of xml.

all four protocols use a description record to describe the local search engine. the opensearch protocols define what a description record looks like, but not how it is retrieved. the location of the description record is discovered by some means outside the protocol (a priori knowledge). the description record specifies the location of the local search engine. the sru protocols define what a description record looks like and specify that it can be obtained from the local search engine. the location of the local search engine is provided by a means outside the protocol (a priori knowledge again).

each protocol defines how to formulate the search url. opensearch does this by having the local search-engine provider supply a template of the url in the description record. sru does this by defining the url.

opensearch and mxg do not define how to formulate the query. the metasearch engine can either pass the user’s query along to the local search engine unchanged or reformulate the query based on information about the local search engine’s query language that it has gotten by outside means (more a priori knowledge). in the first case, the metasearch engine has to hope that some magic will happen and the local search engine will do something useful with the query. in the latter case, the metasearch engine’s staff has to develop a query translator.
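the two ways of formulating the search url can be sketched as follows. this is a minimal illustration under stated assumptions, not code from either specification: the host names are hypothetical, the `{searchTerms}` placeholder is the one opensearch templates are understood to use, and the sru parameter names (`operation`, `version`, `query`, `maximumRecords`) are given as recalled from the sru searchRetrieve operation and should be checked against the specification before use.

```python
from urllib.parse import quote, urlencode


def opensearch_url(template, terms):
    """Fill a provider-supplied url template with the user's query, unchanged."""
    return template.replace("{searchTerms}", quote(terms))


def sru_url(base_url, cql_query, maximum_records=10):
    """Build a searchRetrieve url; the parameter names are fixed by the protocol."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "maximumRecords": maximum_records,
    }
    return base_url + "?" + urlencode(params)


# hypothetical template, as it might appear in a description record
print(opensearch_url("http://example.org/search?q={searchTerms}", "library automation"))
# hypothetical sru base url and a simple cql query
print(sru_url("http://example.org/sru", "dc.title = automation"))
```

in the opensearch case the provider decides what the url looks like and the metasearch engine only substitutes text; in the sru case the url shape is the same for every provider, so a single function serves all of them.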
sru specifies a standard query grammar: cql (common query language).7 this means that the metasearch engine only has to write one translator for all the sru local search engines in the world. but it also means that all the sru local search engines have to support the cql query grammar. since there are no local search engines that support cql as their native query grammar, the content provider is left with the task of translating cql queries into their native query grammar. the query translation task has moved from the metasearch engine to the content provider.

opensearch 1.0, mxg, and sru define the structure of the query response. in the case of opensearch, the response is returned as an rss message, with a couple of extra elements added. mxg and sru define an xml schema for their responses. opensearch 1.1 allows the local search engine to return the response as unstructured html. this removes the requirement of creating a standard response from the content provider and leaves the metasearch engine with the much tougher task of finding the content embedded in html. if the metasearch engine doesn’t write code to parse the response, then all it can do is display the response. it will not be able to combine the response from the local search engine with the responses from other engines.

sru and mxg require that records be returned in xml and that the local search engine specify the schema for those records in the response. this leaves the content provider with the task of formatting the records according to the schema of their choice, a task that the content provider is probably best able to do. in turn, the metasearch engine can convert the returned records into some common format so that the records from multiple local search engines can be combined into a single response. because the records are encoded in xml, it is assumed that standard xml formatting tools can be used for the conversion.
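the conversion step described above, in which records that arrive in different xml schemas are mapped into one common format and then combined, might look like the following python sketch. the two sample responses, the wrapper elements (`records`, `record`, `results`, `item`), and the local `name`/`url` schema are all hypothetical; only the dublin core namespace uri is a real identifier. real sru and mxg responses carry more structure than is shown here.

```python
import xml.etree.ElementTree as ET

# Dublin Core element namespace in ElementTree's {uri}tag notation.
DC = "{http://purl.org/dc/elements/1.1/}"

# Hypothetical response from a local search engine that returns Dublin Core records.
ENGINE_A_XML = """
<records xmlns:dc="http://purl.org/dc/elements/1.1/">
  <record><dc:title>book catalogs</dc:title><dc:identifier>http://a.example/1</dc:identifier></record>
</records>
"""

# Hypothetical response from a second engine that uses its own local schema.
ENGINE_B_XML = """
<results>
  <item><name>card production costs</name><url>http://b.example/9</url></item>
</results>
"""


def records_from_dublin_core(xml_text):
    """Map Dublin Core records to the common {'title', 'link'} format."""
    root = ET.fromstring(xml_text)
    return [
        {"title": rec.findtext(DC + "title", default=""),
         "link": rec.findtext(DC + "identifier", default="")}
        for rec in root.iter("record")
    ]


def records_from_local_schema(xml_text):
    """Map the hypothetical local schema to the same common format."""
    root = ET.fromstring(xml_text)
    return [
        {"title": item.findtext("name", default=""),
         "link": item.findtext("url", default="")}
        for item in root.iter("item")
    ]


def merge_responses(*record_lists):
    """Combine already-converted record lists into a single response."""
    merged = []
    for records in record_lists:
        merged.extend(records)
    return merged


merged = merge_responses(
    records_from_dublin_core(ENGINE_A_XML),
    records_from_local_schema(ENGINE_B_XML),
)
```

one converter per declared schema is enough here because each response states which schema its records use; with unstructured html there is nothing comparable to dispatch on.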
opensearch does not define how records should be structured. the opensearch response has a place for the title of the record and a url that points to the record. the structure of the record is undefined. this leaves the metasearch engine with the task of parsing the record that is returned. again, the effort moves from the content provider to the metasearch engine. if the metasearch engine does not or cannot parse the records, then it can at least display the records in some context, but it cannot combine them with the records from another local search engine.

conclusion

these protocols sit on a spectrum of complexity, trading the content provider’s complexity for that of the search engine. however, with lessened complexity for the metasearch engine comes increased functionality for the user. metasearch engines have to choose what content providers they will search. those that provide a high level of functionality can be easily combined with their existing local search engines. content providers with a lower level of functionality will either need additional development by the metasearch engine or will not be searched. not all metasearch engines require the same level of functionality, nor will they be prepared to accept content with a low level of functionality. content providers, such as digital libraries and institutional repositories, will have to choose the functionality they need to support to reach the metasearch engines they desire.

references and notes

1. joe barker, “meta-search engines,” in finding information on the internet: a tutorial (u.c. berkeley: teaching library internet workshops, aug. 23, 2005 [last update]), www.lib.berkeley.edu/teachinglib/guides/internet/metasearch.html (accessed may 8, 2006).
2. a9.com, “opensearch specification,” http://opensearch.a9.com/spec/ (accessed may 8, 2006); a9.com, “opensearch 1.1,” http://opensearch.a9.com/spec/1.1/ (accessed may 8, 2006).
3. mark pilgrim, “what is rss?” o’reilly xml.com, dec. 18, 2002, www.xml.com/pub/a/2002/12/18/dive-into-xml.html (accessed may 8, 2006).
4. the library of congress network development and marc standards office, “z39.50 maintenance agency page,” www.loc.gov/z3950/agency/ (accessed may 8, 2006).
5. national information standards organization, “niso metasearch initiative,” www.niso.org/committees/ms_initiative.html (accessed may 8, 2006).
6. niso metasearch initiative task group 3, “niso metasearch xml gateway implementors guide, version 0.2,” may 16, 2005, [microsoft word document] www.lib.ncsu.edu/nisomi/images/0/06/niso_metasearch_initiative_xml_gateway_implementors_guide.doc (accessed may 8, 2006); the library of congress, “sru: search and retrieve via url; sru version 1.1 13 february 2004,” www.loc.gov/standards/sru/index.html (accessed may 8, 2006).
7. the library of congress, “common query language; cql version 1.1 13th february 2004,” [web page] www.loc.gov/standards/sru/cql/index.html (accessed may 8, 2006).

table 1. comparison of requirements of four metasearch protocols for effective communication with local search engines

protocol feature               opensearch 1.1   opensearch 1.0   mxg        sru
local search engine location   a priori         a priori         a priori   a priori
request encoding               defined          defined          defined    defined
response encoding              none             rss              xml        xml
record encoding                none             none             xml        xml
query grammar                  none             none             none       cql

article

peer reading promotion in university libraries based on a simulation study about readers’ opinion seeking in social networks

yiping jiang, xiaobo chi, yan lou, lihua zuo, yeqi chu, and qingyi zhuge

information technology and libraries | march 2021
https://doi.org/10.6017/ital.v40i1.12175

yiping jiang (jyp@zjut.edu.cn) is associate professor in information and library science, zhejiang university of technology, china.
xiaobo chi (chixiaobo@zjut.edu.cn) is associate professor in information and library science, zhejiang university of technology, china. yan lou (jljly@zju.edu.cn) is associate professor in administrative department of continuing education, zhejiang university, china. lihua zuo (jljly@zju.edu.cn) is librarian at zhejiang university of technology, china. yeqi chu (cyq77@zjut.edu.cn) is librarian at zhejiang university of technology, china. qingyi zhuge (beckygoodly@163.com) is librarian at zhejiang university of technology, china. © 2021. abstract university libraries use social networks to promote reading; however, there are challenges to increasing the use of these library platforms, such as poor promotion and low reader participation. therefore, these libraries need to find ways of dealing with the behavior characteristics of social network readers. in this study, a simulation experiment was developed to explore the behaviors of readers seeking book reviews and opinions on social networks. the study draws on social network theory to find the causes of students’ behavior and how these affect their selection of information. finally, it presents strategies for peer reading promotion in university libraries. introduction over the last decade, social media has made an impact on almost every aspect of daily life. university libraries have gradually accepted social media as a way of promoting their services, and almost every university library in the people’s republic of china has its own social media accounts. 
however, there are challenges to increasing libraries’ use of social media, such as poor promotion and low reader participation.1 university libraries cannot depend only on promoting reading through readers’ unenthusiastic use of social media tools, as constructive engagement with social networks requires users’ participation, dominance, and construction.2 therefore, as a baseline, university libraries must take into consideration their readers’ social attributes and then make full use of the mutual cooperation and sharing mechanisms between peers so that readers can become more involved in the use of these platforms that promote reading. in the current study, a free simulation was conducted wherein participants were required to complete a preferential choice task while browsing a book review survey that was integrated with social media platforms.3 finally, we provide some suggestions to promote reading. literature review university reading promotion our review of literature on the promotion of university reading reveals three main research perspectives. the first perspective focuses on libraries. there is some evidence to suggest increasing enthusiasm for reading programs within universities.4 rodney detailed the experiences of a library at a small liberal arts university that launched a one book, one community program.5 hou emphasized the dominant position of university reading in reading promotion and put forward specific promotion strategies.6 li et al. 
established a subscription digital service system for reading promotion in universities that provided personalized services for users.7 the national resource center for the first-year experience and students in transition hosts a discussion site and has compiled a list of institutions reporting first-year summer reading programs along with a list of book titles used in the programs.8 appalachian state university also has an active university reading program discussion list.9 college reading experience programs have the potential to bring disparate disciplines and college departments together in ways that extend student learning and engagement beyond the classroom. it could be argued that librarians are one group of natural “boundary spanners.”10 gustavus adolphus college has compiled a lengthy list of links to universities that participate in the first-year experience. querying this list suggests that there is a growing number of reading programs on college campuses and librarians are increasingly finding a role in their development and delivery.11 the second perspective focuses on readers. for instance, zhou et al. analyzed users’ reading needs and proposed ideas of how universities could promote reading through the use of questionnaires.12 based on self-determination theory, wei et al. constructed an index of students’ and teachers’ motivation and participation in reading promotions in libraries by using questionnaires. their factor analysis of readers’ reading psychology from the aspects of information value, social sharing, interest, cognition, and emotional entertainment concludes that the theme, intelligence, and interactivity of college reading promotions are significant.
13 dali made specific recommendations on how to give reading practices in academic libraries a boost and a new direction through the lens of the differentiated nature of readerships on campuses.14 the third perspective focuses on cultural constructions in colleges. boff et al. studied the practical activities they termed as library participation in “campus reading experience” (cre) at two american community colleges and two four-year institutions in the united states.15 their research pointed out the importance of reading promotion activities in the cultural constructions of colleges and universities. moreover, it presented efficient suggestions of how librarians can hold reading promotion activities on campus and how librarians can play a more positive role in presenting reading promotion plans to their administrations. marcoux et al. emphasized the vital status of canadian college libraries in various subject areas and cultural dominance in colleges and insisted that reading promotion should be enforced by bringing together colleges, teachers, and students.16 peer education in university libraries since the 1970s, university libraries have experimented with making students an extension of reference services and part of established peer instruction services. for example, at california state university, fresno, student assistants were recruited to work on the reference desk and answer directional and simple reference questions.17 the university of michigan in ann arbor developed its peer information counseling program in 1985 to focus on the retention of minority students.18 the university of wisconsin-parkside and binghamton university in new york employ student peers to provide instructional support.19 all these programs incorporate peer tutoring models developed specifically for their settings. 
there are many practical projects in which libraries provide instruction to peer readers, such as the peer practical project at wabash college, the student assistant project at valparaiso university, the student consultant project at the university of new mexico, the curriculum consultant project at the university of new hampshire, and the student assistants project at utah state university.20 surveys of targeted students in these programs revealed that students were more likely to ask questions of student assistants than librarians. descriptions of the training in these projects emphasized knowledge of the library’s resources, with little or no explanation of incorporating peer-learning principles.21 by 2010, libraries acknowledged that peer learning provides an opportunity to further solidify the information literacy skills of all students. programs such as the library instruction tutor project at the university of new mexico and research mentors at the university of new hampshire were designed with a focus on taking advantage of the uniqueness of the peer-to-peer relationship, rather than replacing reference services.22 the librat program at california polytechnic state university trained students to provide single-shot information literacy instruction.
student endorsement of peer-led sessions provides supportive evidence that participating attendees perceive this type of session as useful and valuable.23 to summarize, university libraries have identified that harnessing the uniqueness of peer relationships is an effective way to engage students in learning.24 social reading promotion the first published papers regarding the use of facebook by libraries and librarians appeared in 2006.25 up to that time, most scholars defined social reading as sharing theoretical research but had not yet explored its relationship to social media. for instance, zhang et al. suggested that classic books and other resources should be aimed more accurately at potential users, and the convenience of the connection between a library and its users in the social media environment should be fully utilized in order to understand the personal needs of users.26 yang et al. explored the relationship between users who engage in social reading and reading resources by analyzing it from the perspective of users.27 white et al. explored how social networks promote reading and studying and found that social media can promote users’ selection, critiques, and discussions in the reading context, which are part of the process of constructive studying.28 similarly, asteman et al. investigated the impact of users’ discussions on reading participation and reading promotion by using facebook as a research target, and they proved that users’ discussions on social reading were beneficial to reading, studying, and understanding complex scientific topics.29 some researchers have explored the resources recommending methods and algorithms for social reading. take kochitchi et al.’s research as an example: they built a visual analysis system to construct the relationship between user characteristics and resources by extracting social reading tags and user interaction behavior characteristics. 30 huang et al. 
proposed an efficient method for recommending information among social network groups.31 some scholars put particular emphasis on researching user services. for instance, liu et al. created reading review flows to help improve users’ reading ability and optimize the reading experience by analyzing users’ needs on social platforms.32 fox subdivided users on social media platforms into three categories (passive, active, and interactive) and summarized that users’ standardized behaviors can effectively enhance user interaction.33 yao et al. studied the data gathered from practical activities such as information posts and book retrieval through social reading platforms at the tsinghua university library.34

methods

research questions

existing studies on social reading promotion in libraries have mainly explored how to use social media platforms to promote and develop reading promotion services. however, few scholars have explored the reader patterns of those seeking opinions within social networks. this study explored the effects that peers have on each other as opposed to services provided by university libraries and addressed the following three questions:

1. do readers value the opinions of peers on social networks?
2. what tendencies do readers exhibit when seeking opinions on different types of literature?
3. how does social capital influence readers’ tendencies when those readers are seeking opinions?
social capital refers to the potential value of social relations and includes two key dimensions: structural and relational.35 social structure can be characterized by quantity and configuration.36 with respect to quantity, the more social ties one has the potential to activate, the more information resources can be transferred.37 the configuration of social capital means that it is higher when a network’s structure is more sparse.38 relational capital refers to the potential value associated with the quality of social relationships which are created and embedded by network peers and can be utilized by network friends.39 previous studies have used different social attributes to describe relational capital, including homogeneity, trust, expertise, power, and closeness.40 study design we used a questionnaire applet and designed a survey of book reviews to explore patterns in the way readers seek information on books through social networks and the factors that influence readers when they adopt others’ opinions. as an initial step, we recruited 300 college student participants from 15 colleges and universities. these students, from the wechat group of the eighth national mechanical design competition for college students, expressed interest in the study. the students were offered three lists of books, categorized as leisure literature, mechanical literature, and information resources utilization literature. there were 10 books in every list, and the books were selected from a 2019 lending list compiled by five colleges and universities. participants were asked to log in to our survey using their wechat credentials. they were required to write reviews for the books that they had read, and they were encouraged to recommend similar literature and write reviews for those books as well. meanwhile, 30 librarians were also invited to write reviews for the books on the lists (see fig. 1). 
figure 1. the book review steps.

to keep a representative sample, we adopted stratification and divided participants into different non-overlapping homogenous groups. we set up six groups with similar scales according to the number of wechat friends that the readers had within the 300-student sample. readers in the first group did not have any friends. readers in the second group had one or two friends. readers in the third group had three or four friends. readers in the fourth group had five or six friends. readers in the fifth group had seven to eleven friends. readers in the sixth group had twelve or more friends. subsequently, we randomly invited 15 readers from every group to complete the following steps: (1) complete an online questionnaire that measured their relational capital (see table 1 and fig. 2) and (2) make references to others’ reviews and select books that they intended to read (see fig. 2).

table 1. three measurements of relational capital

measurement: professional skills
1. reading is a part of my life.
2. reading is in my daily to-do list.
3. i go to the library to study.
4. my reading ability is good.
5. reading helps me a lot.
6. i read a lot (at least 10 books a year).

measurement: similarity
1. my friends and i read similar books.
2. my friends and i have similar feelings about reading.
3. recommendations are useful to me.

measurement: intimacy
1. my survey group is trustworthy.
2. others’ comments are beneficial to me.
3. i am willing to share my feelings about reading with my friends.

note: a five-point likert scale (strongly disagree, disagree, neutral, agree, strongly agree) was used.
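the grouping and scoring procedure described above can be sketched in python. this is a minimal illustration, not the authors' applet: the function names are hypothetical, the group boundaries are the six friend-count bands given in the text, and coding the likert answers as 1 to 5 and averaging them per measurement is an assumption, since the article does not state how the answers were scored.

```python
import random
from statistics import mean

# Upper friend-count bound of groups 1-5; group 6 is "twelve or more".
GROUP_UPPER_BOUNDS = [0, 2, 4, 6, 11]

# Assumed numeric coding of the five-point Likert scale.
LIKERT = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}


def friend_group(n_friends):
    """Return the stratum (1-6) for a reader with n_friends WeChat friends in the sample."""
    if n_friends < 0:
        raise ValueError("friend count cannot be negative")
    for group, upper in enumerate(GROUP_UPPER_BOUNDS, start=1):
        if n_friends <= upper:
            return group
    return 6


def sample_per_group(friend_counts, k=15, seed=0):
    """Randomly pick up to k reader ids from every stratum.

    friend_counts maps a reader id to that reader's number of friends.
    """
    rng = random.Random(seed)
    strata = {group: [] for group in range(1, 7)}
    for reader, n_friends in friend_counts.items():
        strata[friend_group(n_friends)].append(reader)
    return {g: rng.sample(ids, min(k, len(ids))) for g, ids in strata.items()}


def measurement_score(answers):
    """Average the coded answers for one measurement (e.g. the three similarity items)."""
    return mean(LIKERT[answer] for answer in answers)
```

for example, `friend_group(9)` places a reader with nine friends in the fifth group, and `measurement_score(["agree", "neutral", "strongly agree"])` averages the codes 4, 3, and 5.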
figure 2. selecting the books.

data collection

measurement of readers’ behavior when seeking opinions. a book review applet (with wechat’s questionnaire function) was incorporated to record the number of times a reader looked at reviews from peers and librarians (see fig. 3).

figure 3. measurement of readers’ behavior when seeking opinions. note: when readers wanted to refer to other people’s comments, the applet allowed them to choose either classmates’ or librarians’ reviews. readers could browse the comments through a drop-down menu, and the number of reviews read by the respondents was recorded.

measurement of structural capital. the scale of a reader’s structural capital was related to the number of wechat friends they had.41 drawing on the extant literature, we computed network sparseness by dividing the network’s effective size by the overall size. network effective size is the average number of wechat friends within the sample set of 300 people.

measurement of relational capital. we used the three variables of professional skills, similarity, and intimacy to measure social relationships.42 data were gathered from the online questionnaire (see table 1).

experiment validity

reliability analysis. we performed measurement validity checks on the three variables applied in the measurement of relational capital. table 2 shows the evidence of satisfactory convergence and discriminant validity.

table 2. reliability analysis results

cronbach’s alpha   std. cronbach’s alpha   projects
.811               .811                    12

factor analysis.
we tested whether it was scientifically meaningful to consider network scale, network sparseness, and relational capital as independent variables in respectively analyzing two dependent variables, that is, times of seeking peers’ opinions and times of referring to librarians’ opinions. tables 3 and 4 show the results of the kaiser-meyer-olkin (kmo) spherical test and bartlett’s test, which show that factor analysis is suitable, and further analysis can be performed.

table 3. kmo spherical test

kmo                            .654
bartlett’s test   chi-square   314.342
                  df           10
                  significance .00

table 4. bartlett’s test

                     initial   extract
network scale        1.000     .724
network sparseness   1.000     .676
relational capital   1.000     .720

results

reviews written and browsed

we recorded the number of reviews read by the respondents and then compared the number of times they consulted peers with the number of times they consulted librarians. the results are shown in table 5. in total, readers sought opinions from peers 1,374 times (70.9%), and from librarians 563 times (29.1%). for leisure literature, mechanical literature, and information resources utilization literature they sought opinions from peers 422 times (85.3%), 519 times (88.3%), and 433 times (50.7%), respectively. from these results, it can be surmised that readers tend to seek opinions from peers.

table 5. comparison of sources consulted by readers seeking opinions (reviews browsed)

             leisure       mechanical    information resources utilization   total
readers      422 (85.3%)   519 (88.3%)   433 (50.7%)                         1374 (70.9%)
librarians   73 (14.7%)    69 (11.7%)    421 (49.3%)                         563 (29.1%)

we used regression analysis to analyze the relationship between readers’ behavior of seeking opinions and social capital. results are shown in table 6.
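the percentages reported for table 5 follow directly from the browsing counts and can be rechecked with a short python calculation. the counts are the ones given in the text; the variable and function names are only illustrative.

```python
# Reviews browsed, by literature type and by source, as reported in table 5.
counts = {
    "leisure": {"peers": 422, "librarians": 73},
    "mechanical": {"peers": 519, "librarians": 69},
    "information resources utilization": {"peers": 433, "librarians": 421},
}


def peer_share(peers, librarians):
    """Percentage of all consulted reviews that came from peers, to one decimal place."""
    return round(100 * peers / (peers + librarians), 1)


# Peer share within each literature type.
shares = {kind: peer_share(**row) for kind, row in counts.items()}

# Overall peer share across the three types.
total_peers = sum(row["peers"] for row in counts.values())
total_librarians = sum(row["librarians"] for row in counts.values())
overall_share = peer_share(total_peers, total_librarians)
```

the totals come to 1,374 peer reviews and 563 librarian reviews, and the shares reproduce the 85.3%, 88.3%, 50.7%, and 70.9% figures. the near-even split for information resources utilization literature is the one case in which librarians were consulted almost as often as peers.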
according to the t-test, given the significance level of 0.10, the significance probability of the three variables was less than 0.10 for both the times that readers sought peer opinions and the times that they consulted librarians. table 7 shows the introduction and elimination process of variables in the process of stepwise regression. the analysis of seeking peer opinions eliminates relational capital, and the analysis of reference to librarians’ opinions has eliminated network sparseness.

table 6. regression analysis results

columns: b = non-standardized partial regression coefficient; std. error = standard error of b; beta = standardized partial regression coefficient; t = t observed value; sig. = significance probability.

model                 b        std. error   beta    t        sig.

dependent variable: seeking peers’ opinions
(constant)            4.864    1.334                3.645    .000
network scale         1.341    .143         .715    9.393    .000
network sparseness    3.450    1.012        -.245   -3.408   .001
relational capital    .094     .318         .017    .295     .769

dependent variable: seeking librarian opinions
(constant)            .292     1.943                .150     .881
network scale         -1.691   .208         .909    8.133    .000
network sparseness    -1.312   1.474        .094    .890     .376
relational capital    -1.385   .463         -.247   -2.990   .004

table 7. quantity introduction and elimination process

model                                              variable entered                      variable removed
dependent variable: seeking peers’ opinions        network scale, network sparseness     relational capital
dependent variable: seeking librarians’ opinions   network scale, relational capital     network sparseness

according to the analysis results, we can draw some conclusions. first, the number of times readers sought opinions from peers was in proportion to network sparseness. in other words, the higher the number of readers’ online peers, the lower the network sparseness and the more they sought advice from their peers.
second, the number of times readers sought librarians’ opinions was inversely proportional to their network scale and relational capital. this means that readers tended not to seek opinions from librarians if they had more network peers and more relational capital.

according to further analysis of tables 6 and 7, there are two equations that can be derived:

1. regression equation of readers seeking opinions from peers: where y1 represents seeking peer opinions and x1 and x2 represent network scale and network sparseness, respectively.
2. regression equation of readers seeking opinions from librarians: where y2 represents seeking librarian opinions and x1 and x3 represent network scale and relational capital, respectively.

based on the two regression equations, it can be observed that the number of peers’ reviews consulted increases by 13.9% for each point increase in network scale and by 35.7% for each additional unit of network sparseness. the number of librarians’ reviews consulted decreases by 60.7% for each point increase in network scale and by 70.7% for each additional unit of relational capital.

discussion

the value of peers

according to the results in table 5, readers tend to seek opinions from peers. this is mainly because familiar information sources can provide more diagnostic help.43 meanwhile, the cognitive effort required to process such information is lower and information is easier to understand. research on peer education also shows similar findings. jean piaget and lev vygotsky found that it is easier to build partnerships among children than among children and adults. moreover, children are more willing to negotiate and theorize with partners who are not authoritative. in this social media age, the value of peers is even more pronounced.
therefore, libraries should recognize this, recruit influential readers for reading promotion, and utilize the influence of peer social networks to spread information related to the promotion of reading.

opinion seeking tendency of different types of literature information

the participants in this study were university students majoring in mechanical engineering disciplines across several different universities. this means that they had similar backgrounds, experiences, and feelings as they participated in the mechanical design competition. under these circumstances, it would be expected that they had close peer relationships. behavioral science research proves that if the communicator and the receiver have similar experiences, are concerned about similar things, and face similar problems, the receiver is more likely to accept information from the communicator. this viewpoint is consistent with the standpoint proposed by psychological models in which relevant sources of information are more frequently activated.44 therefore, the result that readers are more willing to seek opinions from peers is supported. the result regarding information resources utilization literature shows similar numbers in terms of seeking opinions from peers and from librarians, and the number seeking from librarians is slightly higher. this may be because libraries are the literature and information resource centers of universities, and readers trust the professional abilities of librarians and are willing to seek their help with regard to resource utilization. the results indicate that librarians should make efforts to promote the use of information resources.
seeking tendency of readers with different social capital

in the process of decision-making, readers will look for homogeneous and credible people to assist them in their search for and evaluation of information.45 identity, experience, reading level, and taste in reading are contributing factors to credibility. the more partners, the more the relational capital; if readers have more trustworthy sources of information, they will not turn to librarians for their opinions. information searching is a dynamic and adaptive process. when readers find information that is novel, it has value in decision-making. conversely, when the information is redundant (not novel), it may cause the seeker to stop searching. it is widely acknowledged that sparse social networks reduce the possibility of information redundancy.46 therefore, the sparser the social network, the more likely it is that readers will need peer help.

conclusions

in this age of social media, university students are accustomed to using social networks. seeking information that complies with their psychological needs is more significant and valuable to them than the value of the information itself. therefore, libraries should make full use of peer influences when employing social media in reading promotion activities. first, university libraries ought to realize their great potential for involving students within the social flow to participate in reading promotion activities. in the digital age, readers’ consciousness is repeatedly awakened. individuality, advocating for information freedom, and improving the flow of information mean that readers are no longer satisfied with passively receiving information and are more willing to actively search out and read information. meanwhile, sharing and interaction can fully meet the desires of individuals to share and communicate as well as meet their psychological need of realizing their self-worth.
libraries must understand the characteristics of contemporary university students’ information needs and create a space for readers to take the initiative. only in this way can readers be more than passive recipients of information; they can also be pushers and disseminators of information. bolder attempts at innovation should be applied to reading promotions. in this research, an analysis and exploration were conducted based on the literature, indicating that readers prefer to seek opinions from their own social networks. therefore, the library can make full use of readers’ social network groups when promoting the library’s literature resources. likewise, other services and activities provided by the libraries can be promoted through readers’ social networks. for instance, libraries can invite student volunteers to take part in a new service before launching and then invite them to share their feelings and evaluations through social networks. these types of methods are more efficient than the traditional flyer notification. last but not least, when organizing reading promotion activities, libraries should stay behind the scenes. university libraries can establish a set of systematic peer reading promotion rules, including recruitment, training, and management systems, to build a wide and influential reading promotion student volunteer team on social networks. libraries should, however, strengthen the process of monitoring peer reading promotions to prevent negative influences caused by harmful information on social media.47 they should pay special attention to the control of social opinion by using the reader volunteer team. in addition, through the monitoring and analysis of data, the strategy and direction of reading promotions can be adjusted over time to improve their pertinence and effectiveness.
moreover, libraries should strengthen the effective evaluation of peer reading promotion projects. readers should be involved in the systematic readjustment of traditional reading promotion methods. innovative methods need to be tested in practice so that libraries can strengthen the effective evaluation of peer projects.
limitations and future research
there are some limitations in the methodology design, theoretical scope, empirical context, and research perspective in the current study. getting past these limitations can also provide direction for further research. in the methodology, although the sectional sampling method is often adopted, the conclusions of this research make it difficult to disentangle the roles of readers’ social capital from those of opinion seeking. this study explored the correlations between several variables which need to be further investigated by introducing control variables to fully examine the interactions between the variables. in theory, this research focused on analysis of the rule that readers will seek opinions through social networks, which reflects the behavior observed in information searching and browsing. readers then need to decide whether to use the information they find. these two behaviors are related and, based on this research, we need to further explore readers’ adoption behavior to better guide reader service work in libraries. in the empirical analysis, it is worth further considering how the various efforts undertaken by university libraries have promoted information channels. the participants in this study were university students majoring in mechanical engineering disciplines. however, students in different majors may exhibit different information selecting behavior, and this deserves further analysis and exploration. this study compared the influence of peers’ and librarians’ opinions on readers. however, how do readers feel about opinions from peers as compared to those from librarians?
what is the impact of students preferring to use their social networks instead of librarians for information retrieval? is there a difference in the adoption of peer opinions by readers in different social media contexts? these questions deserve further study to fully understand the impact of social networks on reader opinion-seeking behavior.
acknowledgements
this work was supported by the humanities and social sciences research fund of the chinese ministry of education [grant number 17yja870003] and the philosophy and social science fund of zhejiang province [grant number 21ndjc039yb].
endnotes
1 shi-man tang, “research on the reading promotion model and implementation path of university library based on social media platform” (master’s thesis, university of jilin, 2013): 90–111. 2 yao qi, hua-wei ma, huan yan, and qi chen, “analysis of social network users’ online behavior from the perspective of psychology,” advances in psychological science 22, no. 10 (2014): 1647–59, https://doi.org/10.3724/sp.j.1042.2014.01647. 3 david gefen, elena karahanna, and detmar w. straub, “trust and tam in online shopping: an integrated model,” mis quarterly 27, no. 1 (march 2003): 51–90, https://doi.org/10.2307/30036519. 4 colleen boff, robert schroeder, carol letson, and joy gambill, “building uncommon community with a common book: the role of librarians as collaborators and contributors to campus reading programs,” research strategies 20 (2007): 272–83, https://doi.org/10.1016/j.resstr.2006.12.004. 5 mae l. rodney, “building community partnerships: the ‘one book one community’ experience,” c&rl news 65, no. 3 (march 2004), 130–32, https://doi.org/10.5860/crln.65.3.130. 6 ai-hua hou, “analysis and research on the reading promotion strategy of university library,” lifelong education 9, no. 5 (2020), https://doi.org/10.18282/le.v9i5.1251.
7 mei-ning li, tian-zi zhao, xu guan, and xin-hua chen, “study on building digital service system of ‘subscription’ reading promotion for university library,” library and information service 62, no. 18 (2018): 77–82, https://doi.org/10.13266/j.issn.0252-3116.2018.18.008. 8 boff, “building uncommon community with a common book,” 271–83. 9 kim becnel et al., “‘somebody signed me up’: north carolina fourth-graders’ perceptions of summer reading programs,” children & libraries: the journal of the association for library service to children 15, no. 3 (2017): 3–8, https://doi.org/10.5860/cal.15.3.3. 10 rodney, “building community partnerships,” 130–32, 155. 11 boff, “building uncommon community with a common book,” 271–83. 12 mu-chen wan and liang ou, “the empirical research of university libraries reading promotion effect based on the wechat public platform,” library and information service 60, no. 22 (2015): 72–78, https://doi.org/10.13266/j.issn.0252-3116.2015.22.011. 13 xiao-li wei, yi-ming mi, and fang sheng, “motivation measurement of university library’s participation in the reading promotion based on self-determination theory,” journal of library and information science 10, (2018): 1–8. 14 dali keren and lindsay mcniff, “reading work as a diversity practice: a differentiated approach to reading promotion in academic libraries in north america,” journal of librarianship and information science 52, no. 4 (february 2020): 1050–62, https://doi.org/10.1177/0961000620902247.
15 boff, “building uncommon community with a common book,” 271–83. 16 elizabeth betty marcoux and d. v. loertscher, “the role of a school library in a school’s reading program,” teacher librarian 37, no. 1 (2009): 8, 10–14, 84. 17 brett b. bodemer, “they can and they should: undergraduates providing peer reference and instruction,” college & research libraries 75, no.2 (2014): 162–78, https://doi.org/10.5860/crl12-411. 18 barbara macadam and darlene p. nichols, “peer information counseling: an academic library program for minority students,” journal of academic librarianship 15, no. 4 (1989): 204–9, https://doi.org/10.1016/0268-4012(89)90012-1. 19 turkey alzahrani and melinda leko, “the effects of peer tutoring on the reading comprehension performance of secondary students with disabilities: a systematic review,” reading & writing quarterly (april 2017): 1–17, https://doi.org/10.1080/10573569.2017.1302372. 20 bodemer, “they can and they should,” 162–78; ruth sara connell and patricia j. mileham, “student assistant training in a small academic library,” public services quarterly 2, no. 2–3 (2006): 69–84, https://doi.org/10.1300/j295v02n02_06; michael m. smith and leslie j. reynolds, “the street team: an unconventional peer program for undergraduates,” library management 29.3 (2008): 145–58, https://doi.org/10.1108/01435120810855287; gail fensom et al., “navigating research waters: the research mentor program at the university of new hampshire at manchester,” college & undergraduate libraries 13, no. 2 (2006): 49–74, https://doi.org/10.1300/j106v13n02_05; wendy holliday and c. nordgren, “extending the reach of librarians: library peer mentor program at utah state university,” college & research libraries news, 66, no. 4 (2005), https://doi.org/10.5860/crln.66.4.7422. 21 mary o’kelly, julie garrison, brian merry, and jennifer torreano, “building a peer-learning service for students in an academic library,” libraries and the academy 15, no. 
1 (2015): 163–82, https://doi.org/10.1353/pla.2015.0000. 22 fensom et al., “navigating research waters,” 49–74. 23 bodemer, “they can and they should,” 162–78. 24 ling-jie yao, “peer education: a new mode of university library services,” library development 12, (2012): 57–59. 25 jamie m. graham, allison faix, and lisa hartman, “crashing the facebook party: one library’s experiences in the students’ domain,” library review 58, no. 3 (2009): 228–36, https://doi.org/10.1108/00242530910942072. 26 yue-qun zhang and chun-ning li, “change of library role in knowledge transfer in social network environment and countermeasures,” library and information 166, no. 6 (2015): 107–12. 27 yi yang and ji-qing sun, “professional reading habits correlation research based on the social network theory,” new century library 70, no. 10 (2012): 81, 91–92, https://doi.org/10.16810/j.cnki.1672-514x.2012.10.024. 28 john wesley white and holly hungerford-kresser, “character journaling through social networks,” journal of adolescent & adult literacy 57, no. 8 (2014): 642–54, https://doi.org/10.1002/jaal.306. 29 christa s. c. asterhan and rakheli hever, “learning from reading argumentative group discussions in facebook,” computers in human behavior, no. 53 (2015): 570–76, https://doi.org/10.1016/j.chb.2015.05.020. 30 a. kochtchi, t. v. landesberger, and c.
biemann, “networks of names: visual exploration and semi-automatic tagging of social networks from newspaper articles,” computer graphics forum 33, no. 3 (2014): 211–20, https://doi.org/10.1111/cgf.12377. 31 zhen-hua huang, bo zhang, qiang fang, and yang xiang, “an efficient algorithm of information recommendation between groups in social networks,” acta electronica sinica 43, no. 6 (2015): 1090–93. 32 cheng-ying liu, ming-syan chen, and chi-yao, “incrests: towards real-time incremental short text summarization on comment streams from social network services,” ieee transactions on knowledge and data engineering 27, no. 11 (2015): 2986–3000, https://doi.org/10.1109/tkde.2015.2405553. 33 jesse fox and courtney anderegg, “romantic relationship stages and social networking sites: uncertainty reduction strategies and perceived relational norms on facebook,” cyberpsychology behavior & social networking 17, no. 11 (2014): 685–91, https://doi.org/10.1089/cyber.2014.0232. 34 fei yao, cheng-yu zhang, wu chen, and tian-fang dou, “study on integrating library services into social network sites: taking the book club of tsinghua library university as a practice example,” library journal 30, no. 6 (2011): 24–28, https://doi.org/10.13663/j.cnki.lj.2011.06.014. 35 lin nan, “social capital: a theory of social structure and action,” (cambridge: cambridge university press, 2001); paul s. adler and seok-woo kwon, “social capital: prospects for a new concept,” academy of management review 27, no. 1 (2002): 17–40, https://doi.org/10.5465/amr.2002.5922314; peter moran, “structural vs. relational embeddedness: social capital and managerial performance,” strategic management journal 26, no. 12 (2005): 1129–51, https://doi.org/10.1002/smj.486. 36 peter h. gray, s. parise, and b. iyer, “innovation impacts of using social bookmarking systems,” mis quarterly 35, no. 3 (2011): 629–43, https://doi.org/10.1002/asi.21581. 37 linton c. 
freeman, “centrality in social networks: conceptual clarification,” social networks (1978), https://doi.org/10.1016/0378-8733(78)90021-7; stephen p. borgatti, “centrality and network flow,” social networks 27, no. 1 (2005): 55–71, https://doi.org/10.1016/j.socnet.2004.11.008. 38 ronald s. burt, “structural holes: the social structure of competition” (cambridge: harvard university press, 1992). 39 adler and kwon, “social capital,” 17–40. 40 peter v. marsden and k. e. campbell, “reflections on conceptualizing and measuring tie strength,” social forces 91, no. 1 (2012): 17–23, https://doi.org/10.1093/sf/sos112tti; stephen p. borgatti and r. cross, “a relational view of information seeking and learning in social networks,” management science 49, no. 4 (2003): 432–45, https://doi.org/10.1287/mnsc.49.4.432.14428; peter moran, “structural vs. relational embeddedness,” 1129–51; mesch gustavo and i. talmud, “the quality of online and offline relationships: the role of multiplexity and duration of social relationships,” the information society 22, no. 3 (2006): 137–48, https://doi.org/10.1080/01972240600677805. 41 camille grange and i. benbasat, “opinion seeking in a social network-enabled product review website: a study of word-of-mouth in the era of digital social networks,” social science electronic publishing 27, no.
6 (2018): 629–53, https://doi.org/10.2139/ssrn.2993427. 42 borgatti and cross, “a relational view of information seeking and learning in social networks,” 432–45; gustavo and talmud, “the quality of online and offline relationships,” 137–48; fox and anderegg, “romantic relationship stages and social networking sites,” 685–91; gray, parise, and iyer, “innovation impacts of using social bookmarking systems,” 629–43. 43 david gefen, “e-commerce: the role of familiarity and trust,” omega 28, no. 6 (2000): 725–37, https://doi.org/10.1016/s0305-0483(00)00021-9. 44 tam kar yan and s. y. ho, “understanding the impact of web personalization on user information processing and decision outcomes,” mis quarterly 30, no. 4 (2006): 865–90, https://doi.org/10.2307/25148757. 45 jacqueline johnson brown and peter h. reingen, “social ties and word-of-mouth referral behavior,” journal of consumer research 14, no. 3 (december 1987): 350–62, https://doi.org/10.1086/209118. 46 glenn j. browne, mitzi g. pitts, and james c. wetherbe, “cognitive stopping rules for terminating information search in online tasks,” mis quarterly 31 (march 2007): 89–104, https://doi.org/10.2307/25148782. 47 yue long and yi-yang liu, “propagation characteristics and paths of negative network public opinions in colleges under the new media environment,” information science 37, no. 12 (2019): 134–39, https://doi.org/10.13833/j.issn.1007-7634.2019.12.022. 
116 information technology and libraries | september 2009
success factors and strategic planning: rebuilding an academic library digitization program
cory lampert and jason vaughan
this paper discusses a dual approach of case study and research survey to investigate the complex factors in sustaining academic library digitization programs. the case study involves the background of the university of nevada, las vegas (unlv) libraries’ digitization program and elaborates on the authors’ efforts to gain staff support for this program. a related survey was administered to all association of research libraries (arl) members, seeking to collect baseline data on their digital collections, understand their respective administrative frameworks, and gather feedback on both negative obstacles and positive inputs affecting their success. results from the survey, combined with the authors’ local experience, point to several potential success factors including staff skill sets, funding, and strategic planning.
establishing a successful digitization program is a dialog and process already undertaken or currently underway at many academic libraries.
in 2002, according to an institute of museum and library services report, “thirty-four percent of academic libraries reported digitization activities within the past 12 months.” nineteen percent expect to be involved in digitization work in the next twelve months, and forty-four percent beyond twelve months.1 more current statistics from a subsequent study in 2004 reflected that digitization work has both continued and expanded, with half of all academic libraries performing digitization activities.2 fifty-five percent of arl libraries responded to a survey informing part of the 2006 association of research libraries (arl) study managing digitization activities; of these, 97 percent of the respondents indicated engagement in digitization.3 the 2008 ithaka study key stakeholders in the digital transformation in higher education found that nearly 80 percent of large academic libraries either already have or plan to have digital repositories.4 with digitization becoming the norm in many institutions, the time is right to consider what factors contribute to the success and rapid growth of some library digitization programs while other institutions find digitization challenging to sustain. the evolution of digitization at the unlv libraries is doubtless a journey many institutions have undertaken. over the past couple of years, those responsible for such a program at the unlv libraries have had the opportunity to revitalize the program and help collaboratively address some key philosophical questions that had not been systematically asked before, let alone answered. associated with this was a concerted focus to engage other less involved staff. one goal was to help educate them on academic digitization programs. another goal was to provide an opportunity for input on key questions related to the programs’ strategic direction. 
as a subsequent action, the authors conducted a survey of other academic libraries to better understand what factors have contributed to their programs’ own success as well as challenges that have proven problematic. many questions asked of our library staff in the planning and reorganization process were asked in the survey of other academic libraries. while the unlv libraries have undertaken what are felt to be the proper structural steps and have begun to author policies and procedures geared toward an efficient operation, the authors wanted to better understand the experiences, key players, and underlying philosophies of other institutional libraries as these pertain to their own digitization programs. the following article provides a brief context relating the background of the unlv libraries’ digitization program and elaborates on the authors’ efforts toward educating library colleagues and gaining staff buy-in for unlv’s digitization program, a process that countless other institutions have no doubt experienced, led, or suffered. the administered survey to arl members dealt with many topics similar to those that arose during the authors’ initial planning and later conversations with library staff, and as such, survey questions and responses are integrated in the following discussion. the authors administered a 26-question survey to the 123 members of the arl. the focus of this survey was different from the previously mentioned arl study managing digitization activities, though several of the questions overlapped to some degree. in addition to demographic or concrete factual types of questions, the unlv libraries digitization survey had several questions focused on perceptions, that is, staff support, administrative support, challenges, and benefits. areas of overlap with the earlier arl survey are mentioned in the appropriate context.
though unlv isn’t a member of the arl, we consider ourselves a research library, and, regardless, the arl membership was a convenient way to provide some structure to the survey. survey responses were collected for a forty-five-day period from mid-june to late july, 2008. through visiting each and every arl library’s website, the authors identified the individuals that appeared to be the “leaders” of the arl digitization programs; the survey message instructed them to forward it to a colleague if they themselves had been incorrectly identified. this identification process was very tricky, and revealed numerous program structures in place, differences between institutions in promoting their collections, and so on. the authors didn’t necessarily start with the presumption that all arl libraries even have a digitization program, but most, though not all, either seemed to have a formal organized digitization program with staffing, or at least had digitized and made available something, even if only a single collection. we e-mailed a survey announcement and a link to the survey to the targeted individuals, with a follow-up reminder a month later. responses were anonymous, and respondents were allowed to skip questions; thus the number of responses for the twenty-six questions making up the survey ranged from a low of thirty (24.4 percent) to a high of forty-four responses (35.8 percent). the average number of responses for each of the questions was 39.8, yielding an overall response rate of 32.4 percent. questions were of three types: multiple choice (select one answer), multiple choice (mark all that apply), and open text. in addition, some of the multiple choice questions allowed additional open text comments. survey responses appear in appendix a.
cory lampert (cory.lampert@unlv.edu) is digitization projects librarian and jason vaughan (jason.vaughan@unlv.edu) is director, library technologies, university of nevada las vegas.
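the response rates reported above follow from dividing the number of responses by the 123 arl members surveyed. as a minimal sketch of that arithmetic (the function name and constant below are hypothetical, introduced only for illustration, and are not part of the authors' survey tooling):

```python
# Hypothetical helper for illustration only; the names here are assumptions,
# not anything used by the survey's authors.
SURVEYED = 123  # number of arl members who received the survey (from the text)


def response_rate(responses: float, surveyed: int = SURVEYED) -> float:
    """Return the response rate as a percentage, rounded to one decimal place."""
    return round(100 * responses / surveyed, 1)


low = response_rate(30)     # fewest responses to any single question
high = response_rate(44)    # most responses to any single question
mean = response_rate(39.8)  # average number of responses per question

print(low, high, mean)
```

under these assumptions the three values agree with the 24.4, 35.8, and 32.4 percent figures given in the text.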
context of the unlv libraries’ digitization program
“digital collection,” for the purpose of the unlv library digitization survey, was defined as a collection of library or archival materials converted to machine-readable format to provide electronic access or for preservation purposes; typically, digital collections are library-created digital copies of original materials presented online and organized to be easily searched. they may offer features such as: full text search, browsing, zooming and panning, side by side comparison of objects, and export for presentation and reuse. one question the survey asked was “what year do you feel your library published its first ‘major’ digital collection?” responses ranged from 1990 to 2007; the general average of all responses was 2001. the earlier arl study found 2000 as the year most respondents began digitization activities.5 mirroring this chronology, the unlv libraries has been active in designing digital projects and digitizing materials from library collections since the late 1990s. technical web design expertise was developed in the cataloging unit (later renamed bibliographic and metadata services), and some of the initial efforts were to create online galleries and exhibits of visual materials from special collections, such as the jeanne russell janish (1998) exhibit.6 subsequently, the unlv libraries purchased the contentdm digital collection management software, providing both back-end infrastructure and front-end presentation for digital collections. later, the first digitization project with search functionality was created in partnership with special collections and was funded by a unlv planning initiative award received in 1999.
the early las vegas (2003) project focused on las vegas historical material and was designed to guide users to search, retrieve, and manipulate results using contentdm software to query a database.7 unlv’s development corresponds with regional developments in utah in 2001, when “the largest academic institutions in utah were just beginning to develop digital imaging projects.”8 data from the 2004 imls study showed that, in the twelve months prior to the study release in 2004, the majority of larger academic libraries had digitized between one and five hundred images for online presentation.9 in terms of staffing, digitization efforts occur in a wide variety of configurations, from large departments to solo librarians managing volunteers. for institutions with recognized digitization staff, great variations exist between institutions in terms of where in the organizational chart digitization staff are placed. boock and vondracek’s research revealed that, of departments involved in digitization, special collections, archives, technical services, and newly created digital library units are where digitization activities most commonly take place.10 a majority of respondents to the arl study indicated that some or all activities associated with digitization are distributed across various units in the library.11 in 2003, the unlv libraries created a formal department within the knowledge access management division—web and digitization services (wds)—initially comprising five staff focused on the development of the unlv libraries’ public website, the development of web-based applications and databases to manage and efficiently present information resources, and the digitization and online presentation of library materials unique to the unlv libraries’ collections and of potential interest to a wider audience. augmenting their efforts were individuals in other departments helping with metadata standards, content selection, and associated systems technical support.
the unlv library digitization survey showed that the majority (78 percent) of libraries that responded have at least one full-time staff member whose central job responsibility is to support digitization activities. this should not imply the existence of a fully staffed digitization program; the 2006 imls study found that 74.1 percent of larger academic libraries described themselves as lacking in sufficiently skilled technology staff to accomplish technology-related activities.12 central to any digitization program should be some structure in terms of how projects are proposed and subsequently prioritized. to help guide the priorities of unlv’s infant wds department, a digital projects advisory committee was formed to help solicit and prioritize project ideas, and subsequently track the development of approved projects. this committee’s work could be judged as having mixed success partly because it met too infrequently, struggled with conflicting philosophical thoughts on digitization, and was confronted with the reality that staff that were needed to help bring approved ideas to fruition simply weren’t in place because of too many other library priorities drawing attention away from digitization. an evaluation of the lessons learned from these early years can be found in brad eden’s article.13 the unlv library digitization survey had several questions related to management and prioritization for digital projects and shows that despite the challenges of a committee-based decision-making structure, when a formal process is in place at all, 42.1 percent of survey respondents used a committee versus a single decision maker (23.7 percent) for determining to whom projects are proposed for production.
a follow-up question asked “how are approved projects ultimately prioritized?” the most popular response (54.1 percent) indicated “by a committee for review by multiple people,” followed by “no formal process” (27 percent). “by a single decision maker” was selected by 18.9 percent of the respondents. the earlier arl study asked a somewhat related question: “who makes decisions about the allocation of staff support for digitization efforts? check all that apply.” out of seven possible responses, the three most popular were “head of centralized unit,” “digitization team/committee/working group,” and “other person”; the other person was most often in an administrative capacity, such as a dean, director, or department head.14 administrative support for a program was another variable the unlv library digitization survey investigated. the survey asked respondents to rate, on a scale of one to five, “how would you characterize current support for digitization by your library’s administration?” more than 40 percent of responses indicated “consistent support,” followed by 31 percent of respondents indicating “very strong support, top priority,” 14.3 percent ranking support as neutral, and 14.2 percent claiming “minimal support” or “very little support, or some resistance.” it was also clear from some of the other questions’ responses that the dean or director’s support (or lack thereof) can have dramatic effects on the digitization program. 2005 brought change to the unlv libraries in the form of a new dean. well-suited for the digitization program, she came from california, a state very heavily engaged and at the forefront of digitization within the library and larger academic environment. one of her initiatives was a retooling of the digitization program at the unlv libraries, and her enthusiasm reflects a growing awareness of administrators regarding the benefits of digitization. 
■ reorganization, library staff engagement, and decision making
in 2006, two new individuals joined unlv libraries’ web and digitization services department, the digitization projects librarian (filling a vacancy), and the web technical support manager (a new position). a bit later, the systems department (providing technical support for the web and digitization servers, among other things), and the wds department were combined into a single unit and renamed library technologies. collectively, these changes brought new and engaged staff into the digitization program and combined under one division many of the individuals responsible for digital collection creation and support. perhaps more subtly, this arrangement also provided formal acknowledgement of the importance of, and desire for, publishing digital collections. with the addition of new staff and a reorganization, a piece still missing was a resuscitation of library stakeholders to help solicit, prioritize, and manage the creation of digital collections and an overall vision guiding the program. while the technical expertise, knowledge of metadata and imaging standards, and deep-rooted knowledge of digitization programs and concepts existed within the library technologies staff, other knowledge didn’t, primarily in-depth knowledge of the unlv libraries’ special collections and a track record of deep engagement with college faculty and the educational curriculum. similar to other organizations, the unlv libraries had not only created a new unit, but was also poised to introduce cross-departmental project groups that would collaborate on digitization activities. in their study of arl and greater western library association (gwla) libraries, boock and vondracek found that this was the most commonly used organizational structure.15 knowledge of the concepts of a digitization program and what is involved in digitizing and sustaining a collection was not widespread among other library colleagues.
success factors and strategic planning | lampert and vaughan 119
acknowledged, but not guaranteed up front for the unlv libraries, was the likely eventual reformation of a group of interested and engaged library stakeholders charged to solicit, prioritize, and provide oversight of the unlv libraries’ digitization program. for various reasons, the authors wanted to garner staff buy-in to the highest degree possible. apart from wanting less informed colleagues to understand the benefits of a digitization program, it was also likely that such colleagues would help solicit projects through their liaison work with programs of study across campus. one unlv library digitization survey question asked, “how would you characterize support for digitization in your library by the majority of those providing content for digitization projects?” “consistent support” was indicated by 65.9 percent of respondents; 15.9 percent indicated “very strong support, top priority,” 13.6 percent indicated neutrality, and 4.6 percent indicated either minimal support or even some resistance. to help garner staff buy-in and set the stage for revitalizing the unlv libraries’ digitization efforts, we began laying the groundwork to educate and engage library staff in the benefits of a digitization program. this work included language successfully woven into the unlv libraries’ strategic plan and an authored white paper posing engaging questions to the larger library audience related to the strategic direction of the program. finally, we planned and executed two digitization workshops for library staff.
■ the strategic plan
one unlv library digitization survey question asked, “is the digitization program or digitization activities referenced in your library’s strategic plan?” a total of 63.4 percent indicated yes, with an additional 22 percent indicating no specific references, but rather implied references.
only 7.3 percent indicated that the digitization program was not referenced in any manner in the strategic plan, while, surprisingly, 3 responses (7.3 percent) indicated that their library doesn’t have a strategic plan. the unlv libraries’ strategic plan is an important document authored with wide feedback from library staff, and it exemplifies the participatory decision-making process in place in the library. the current iteration of the strategic plan covers 2007–9 and includes various goals with supporting strategies and action items.16 in addition, all action items have associated assessment metrics and library staff responsible for championing the action items. departmental annual reports explicitly reference progress toward strategic plan goals. as such, if goals related to the digitization program appear in the strategic plan, that’s a clear indication, to some degree, of staff buy-in in acknowledging the significance of the digitization program. fortunately, digitization efforts figure prominently in several goals, strategies, and action items, including the following:
■ increasingly provide access to digital collections and services to support instruction, research, and outreach while improving access to the unlv libraries’ print and media collections.
■ provide greater access to digital collections while continuing to build and improve access to collections in all formats to meet the research and teaching needs of the university. identify collections to digitize that are unique to unlv and that have a regional, national, and international research interest. create digital projects utilizing and linking collections. develop and adapt metadata and scanning standards that conform to national standards for all formats. provide content and metadata for regional and national digital projects. continue to develop expertise in the creation and management of digital collections and information. collaborate with faculty, students, and others outside the library in developing and presenting digital collections.
■ be a comprehensive resource for the documentation, investigation, and interpretation of the complex realities of the las vegas metropolitan area and provide an international focal point for the study of las vegas as a unique urban and cultural phenomenon. facilitate real and digital access to materials and information that document the historical, cultural, social, and environmental setting of las vegas and its region by identifying, collecting, preserving, and managing information and materials in all formats. identify unique collections that strengthen current collections of national and international significance in urban development and design, gaming, entertainment, and architecture. develop new access tools and enhance the use of current bibliographic and metadata utilities to provide access to physical and digital collections. develop web-based digital projects and exhibits based upon the collections.
a capital campaign case statement associated with the strategic plan lists several gift opportunities that would benefit various aspects of the unlv libraries; several of these include gift ideas related to the digitization of materials.
■ the white paper
another important step in laying the groundwork for the digitization program was a comprehensive white paper authored by the recently hired digitization projects librarian. the finished paper was originally given to the dean of libraries and thereafter to the administrative cabinet, and eventually distributed to all library staff. the outline of this white paper is provided as appendix b. the purpose of the white paper was multifaceted.
after a brief historical context, the white paper addressed perhaps the single most important aspect of a digitization program, program planning: developing the strategic goals of the program, selecting and prioritizing projects through a formal decision-making process, and managing initiatives from idea to reality through efficient project teams. this first topic addressing the core values of the program had a strong educational purpose for the entire library staff, the ultimate audience of the paper. as part of its educational goal, the white paper enumerated the various strengths of digitization and why an institution would want to sustain a digitization program (providing greater worldwide access to unique materials, promoting and supporting education and learning when integrated with the curriculum, etc.). it defined distinctions between an ephemeral digital exhibit and a long-term published and maintained collection. it discussed the various components of a digital collection: images, multimedia, metadata, indexing, thematic presentation (and the preference to be unbiased), integration with other digital collections and the library website, etc. it posited important questions on sustenance and assessment, and defined concepts such as refreshing of data and migration of data to help set the stage for future philosophical discussions.
given the myriad reasons one might want to publish a digital collection, checked by the reality that all the reasons and advantages may not be realized or given equal importance, the white paper listed several scenarios and asked if each scenario was a strong underlying goal for our program; in short, true or false:
■ “the libraries are interested in digitizing select unique items held in our collection and providing access to these items in new formats.”
■ “the libraries are interested in digitizing whole runs of an information resource for access in new formats.”
■ “the libraries should actively pursue funding to support major digitization initiatives.”
■ “the libraries should take advantage of the unique publicity, promotion, and marketing opportunities afforded by a digital project/program.”
continuing with a purpose of defining boundaries of the new program, the paper asked questions related to audience, required skill sets, and resources. the second primary topic introduced the selection and prioritization of the items and ideas suggested for digitization. it posed questions related to content criteria (why does this idea warrant consideration? would complex or unique metadata be required from a subject specialist?) and listed various potential evaluative measures of project ideas (should we do this if another library is already doing a very similar project?). technical criteria considerations were enumerated, touching on interoperability of collections in different formats, technical infrastructure considerations, and so on. multiple simultaneous ideas beg for prioritization, and the white paper proposed a formal review process and the library staff and skill sets that would help make such a process successful. the third primary topic focused on the details of carrying an approved idea to reality, and strengthened the educational purpose of the white paper.
it described the general planning steps for an approved project and included a list of typical steps involved with most digital projects—scanning; creating metadata, indexes, and controlled vocabulary; coding and designing the web interface; loading records into unlv libraries’ contentdm system; publicizing the launch of the project; and assessing the project after completion. one unlv library digitization survey question was related to thirteen such skills the unlv libraries identified as critical for a successful digitization program. the question asked respondents to rate skill levels possessed by personnel at their library, based on a five-point scale (from one to five: “no expertise,” “very limited expertise,” “working knowledge/enough to get by,” “advanced knowledge,” and “tremendous expertise”). neither “no expertise” nor “very limited expertise” garnered the highest number of responses for any of the skills. the overall rating average of all thirteen skills was 3.79 out of 5. the skills with the highest rating averages were “metadata creation/cataloging” 4.4 and “digital imaging/document scanning/post image processing/photography” with 4.27. the skills with the lowest rating averages were “marketing and promotion” with 2.95 followed by “multimedia formats” with 3.33. the unlv libraries’ white paper contained several appendixes that likely provided some of the richest content of the white paper. with the educational thrust completed, the appendixes drew a roadmap of “where do we want to go from here?” this roadmap suggested the revitalization of an overarching digital projects advisory committee, potential members of the committee, and functions of the committee. the committee would be responsible for soliciting and prioritizing ideas and tracking the progress of approved ideas to publication. 
the appendixes also proposed project teams (which would exist for each project), likely members of the project teams, and the functions of the project team to complete day-to-day digitization activities. the liaison between the digital projects advisory committee and the project team would be the digitization projects librarian, who would always serve on both. the last page of the white paper provided an illustration highlighting the various steps proposed in the lifecycle of a digital project, from concept to reality.
■ digitization workshops
several months after the white paper had been shared, the next step in restructuring the program and building momentum was sponsoring two forums on digitization. the first one occurred in november 2006 and included two speakers brought in for the event, roy tennant (formerly user services architect with the california digital library and now with oclc) and ann lally (head of the digital initiatives program at the university of washington libraries). this session consisted of a two-hour presentation and q&a to which all library staff were invited, followed by two breakout sessions. all three sessions were moderated by the digitization projects librarian. questions from these sessions are provided in appendix c. the breakout sessions were each targeted to specific departments in the unlv libraries. the first focused on providing access to digital collections (definitions of digital libraries, standards, designing useful metadata, accessibility and interoperability, etc.). the second focused on components of a well-built digital library (goals of a digitization program, content selection criteria, collaboration, evaluation and assessment, etc.). colleagues from other libraries in nevada were invited, and the forum was well attended and highly praised. the sessions were recorded and later made available on dvd for library staff unable to attend.
this initial forum accomplished two important goals. first, it was an all-staff meeting offering a chance to meet, explore ideas, and learn from two well-known experts in the field. second, it offered a more intimate chance to talk about the technical and philosophical aspects of a digitization program for those individuals in the unlv libraries associated with such tasks. as a momentum-building opportunity for the digitization program, the forum was successful. the second workshop occurred in april 2007. to gain initial feedback on several digitization questions and to help focus this second workshop, we sent out a survey to several dozen library staff, those who would likely play some role at some point in the digitization program. the survey contained questions focused on several thematic areas: defining digital libraries, boundaries to the digitization program, users and audience, digital project design, and potential projects and ideas. it contained thirteen questions consisting of open-ended response questions, questions where the respondent ranked items on a five-point scale, and “select all that apply”–type questions. we distributed the survey to invitees to the second workshop, approximately three dozen individuals; of those, eighteen (about 50 percent) responded to most of the questions. the survey was closely tied to the white paper and meant to gauge early opinions on some of the questions posed by that paper. whereas the first workshop included some open q&a, the second session was structured as a hands-on workshop to answer some of the digitization questions and to illustrate the complexity of prioritizing projects. the second workshop began with a status update on the retooling of the unlv libraries’ digitization program.
this was followed by an educational component that focused on a diagram that detailed the workflow of a typical digitization project and who was involved and that emphasized the fact that there is a lot of planning and effort needed to bring an idea to reality. in addition, we discussed project types and how digital projects can vary widely in scope, content, and purpose. finally, we shared general results from the aforementioned survey to help set the stage for the structured hands-on exercises. the outline for this second workshop is provided in appendix d. one question of the unlv library digitization survey asked, “on a scale of 1 to 5, how important are each of the factors in weighing whether to proceed with a proposal for a new digital collection project, or enhancement of an existing project?” eight factors were listed, and the five-point scale was used (from one to five: “not important,” “less important,” “neutral,” “important,” and “vitally important”). the average rating for all eight factors was 3.66. the two most important factors were “collection includes unique items” (4.49 average rating) and “collection includes items for which there is a preservation concern or to make fragile items more accessible to the public” (3.95 average rating). the factors with the lowest average ratings were “collection includes integration of various media into a themed presentation” (2.54 average rating) followed by “collection involves a whole run of an information resource (i.e., such as an entire manuscript, newspaper run, etc.)” (3.39 average rating). the earlier arl survey asked a somewhat related question, “what is/has been the purpose of these digitization efforts?
check all that apply.” of the six possible responses (which differed somewhat from those in the unlv library digitization survey), the most frequent responses were “improved access to library collections,” “support for research,” and “preservation.”17 the earlier survey also asked the question, “what are the criteria for selecting material to be digitized? check all that apply.” the most frequent responses were “subject matter,” “material is part of a collection being digitized,” and “rarity or uniqueness of the item(s).”18 the first exercise of the second digitization workshop focused on digital collection brainstorming. the authors provided a list of ten project examples and asked each of the six tables (with four colleagues each) to prioritize the ideas. afterward, a speaker from each table presented the prioritizations and defended their rankings. this exercise successfully illustrated to peers in attendance that different groups of people have different ideas about what’s important and what constitutes prime materials for digitization. the rankings from the varying tables were quite divergent. a related question asked of the arl libraries in the unlv library digitization survey was “from where have ideas originated for existing, published digital collection at your library?” and offered six choices. respondents could mark multiple items. the most chosen answer (92.7 percent) was “special collections, archives, or library with a specialized collection or focus.” the least chosen answer (51.2 percent) was “an external donor, friend of the library, community user, etc.” for the second part of the workshop exercise, each table came up with their own digital collection ideas, defined the audience and content of the proposal, and defended and explained why they thought these were good proposals.
fourteen unique and varied ideas were proposed, most of which were tightly focused on las vegas and nevada, such as “history of las vegas,” “unlv yearbooks,” “las vegas gambling and gamblers,” and “african american entertainers in las vegas.” other proposals were less tied to the area, such as a “botany collection,” “movie posters,” “children’s literature,” “architecture,” and “federal land management.” this exercise successfully showed that ideas for digital collections stretch across a broad spectrum, as broad as the individual brainchildren themselves. finally, in the last digitization workshop exercise, each table came up with specialties, roles, and skills of candidates who could potentially serve on the proposed committee, and defended their rationale; in other words, committee success factors. this exercise generated nineteen skills seen as beneficial by one or more of the group tables. at the end of the workshop, we asked if others had alternate ideas to the proposed committee. none surfaced, and the audience thought such a committee should be reestablished. this second workshop concluded with a brief discussion on next steps: drafting a charge for the committee, choosing members, and a plug for the expectation of subject liaisons working with their respective areas to help better identify opportunities for collaboration on digital projects across campus.
■ toward the future
digital projects currently maintained by the unlv libraries include both static web exhibits in the tradition of unlv’s first digitization efforts, as well as several searchable contentdm–powered collections. the unlv libraries have also sought to continue collaborative efforts, participating as project partners for the western waters digital library (phase 1) and continuing in a regional collaboration as a hosting partner in the mountain west digital library.
partnerships were shown in the unlv library digitization survey to garner increased buy-in for projects, with one respondent commenting that faculty partnerships had been “the biggest factor for success of a digital library project.” institutional priorities at unlv libraries reflect another respondent’s comment regarding “interesting archival collections” as a success factor. one recently launched unlv collection is the showgirls collection (2006), focused on a themed collection of historical material about las vegas entertainment history.19 another recently launched collection, the nevada test site oral history project (2008), recounts the memories of those affiliated with and affected by the nevada test site during the era of cold war nuclear testing and includes searchable transcripts, selected audio and video clips, and scanned photographs and images.20 with general library approval, the restructured digitization projects advisory committee was established in july 2007 with six members drawn from library technologies, special collections, the subject specialists, and at large. the advisory committee has drafted and gained approval for several key documents to help govern the committee’s future work. this includes a collection development policy for digitization projects and a project proposal form to be completed by the individual or group proposing an idea for a digital collection. at the time of writing, the committee is just now at the point of advertising the project proposal form and process, and time will tell how successful these documents prove. in the unlv library digitization survey, 65.4 percent responded that a digitization mission statement or collection development policy was in place at their institution. one goal at unlv is to “ramp up” the number of simultaneous digitization projects underway at any one time at unlv. many items in the special collections are ripe for digitization. 
many of these are uncataloged, and digitizing such collections would help promote these hidden treasures. related to ramping up production, one unlv library digitization survey question asked, “on average over the past three years, approximately how many new digital collections are published each year?” responses ranged from zero new collections to sixty. the average number of new collections added each year was 6.4 for the 32 respondents who gave exact numerical answers. while this is perhaps double the unlv libraries’ current rate of production, it illustrates that increasing production is an achievable goal. staffing and funding for the unlv libraries’ digitization program have both seen increases over the past several years. a new application developer was hired, and a new graphics/multimedia specialist filled an existing vacancy. together, these staff have helped with projects such as modifying contentdm templates, graphic design, and multimedia creation related to digital projects, in addition to working on other web-based projects not necessarily related to the digitization program. another position has had its job focus shifted toward usability for all things web-based, including digitization projects. in terms of funding, the two most recent projects at the unlv libraries are both the result of successful grants. the recently launched nevada test site oral history project was the result of two grants from the u.s. departments of education and energy. subsequently, a $95,000 lsta grant proposal seeking to digitize key items related to the history of southern nevada from 1900 to 1925 was funded for 2008–9, with the resulting digital collection publicly launched in may 2009.
this collection, southern nevada: the boomtown years, contains more than 1,500 items from several institutions, focused on the heyday of mining town life in southern nevada during the early twentieth century.21 this grant funded four temporary positions: a metadata specialist, an archivist, a digital projects intern, and an education consultant to help tie the digitized collection into the k–12 curriculum. grants will likely play a large role in the unlv libraries’ future digitization activities. the unlv library digitization survey asked, “has your institution been the recipient of a grant or gift whose primary focus was to help efforts geared toward digitization of a particular collection or to support the overall efforts of the digitization program?” the question sought to determine if grants had played a role, and if so, whether it was primarily large grants (defined as > $100,000), small grants (< $100,000), or both. the largest share of responses (46.2 percent) indicated a combination of both small and large grants had been received in support of a project or the program. an additional 25.6 percent indicated that large grants had played a role, and 23.1 percent indicated that one or more small grants had played a role. two respondents (5.1 percent) indicated that no grants had been received or that they had not applied for any grants. the earlier arl survey asked the question, “what was/is the source of the funds for digitization activities? check all that apply.” of seven possible responses, “grant” was the second most frequent response, trailing only “library.”22 with an eye toward the future, the survey administered to arl libraries asked two blunt questions summarizing the overall thrust of the survey.
one of the final open-ended survey questions asked, “what are some of the factors that you feel have contributed to the success of your institution’s digitization program?” forty respondents offered answers that ranged from listing one item to multiple items. several responses along the same general theme seemed to surface, which could be organized into rough clusters. in general, support from library administration was mentioned by a dozen respondents, with such statements as “consistent interest on the part of higher level administration,” “having support for the digitization program at an administrative level from the very beginning,” “good support from the library administration,” “support of the dean,” and, mentioned multiple times in the same precise language, “support from library administration.” faculty collaboration and interest across campus was mentioned by ten respondents, evidenced by statements such as “strong collaboration with faculty partners,” “support of faculty and other partners,” “interest from faculty,” “heavily involving faculty in particular . . . 
ensures that we can have continued funding since the faculty can lobby the provost’s office,” and “grant writing partnerships with faculty.” passionate individuals involved with the program and/or support from other staff in the libraries were mentioned by ten respondents, with comments such as “program management is motivated to achieve success,” “a strong department head,” “individual staff member’s dedication to a project,” “commitment of the people involved,” “team work, different departments and staff willing to work together,” and “supportive individuals within the library.” having “good” content to digitize was mentioned by seven respondents, with statements such as “good content,” “collection strength,” “good collections,” and “availability of unique source materials.” strategic plan or goals integration was mentioned in several responses, such as “strong financial commitment from the strategic plan” and “mainstreaming the work of digital collection building into the strategic goals of many library departments.” successful grants and donor cultivation were mentioned by four respondents. other responses were more unique, such as one respondent’s one-word response, “luck,” and other responses such as “nimbleness, willingness, and creativity,” and “a vision for large-scale production, and an ability to achieve it.” the final unlv library digitization survey question asked, “what are the biggest challenges for your institution’s digitization program?” thirty-nine respondents provided feedback, and again, several variations on a theme emerged.
the most common response, unsurprisingly, “not enough staffing,” was mentioned by eighteen respondents, with responses such as “lack of support for staffing at all necessary levels,” “the real problem is people, we don’t have enough staff,” “limited by staff,” and “we need more full-time people.” following this was (a likely related response) “funding,” mentioned by another nine respondents, with statements such as “funding for external digitization,” “identifying enough funding to support conversion,” “we could always use more money,” and, succinctly, “money.” related to staffing, specifically, six responses focused on technical staff or support from technical staff, such as “need more it (information technology) staff,” “need support from existing it staff,” “not enough application development staff,” and “limited technical expertise.” prioritization and demand issues surfaced in six responses, with responses such as “prioritizing efforts now that many more requests for digital projects have been submitted,” “prioritization,” “can’t keep up with demand,” and “everyone wants to digitize everything.” workflow was mentioned in four responses, such as “workflow bottlenecks,” “we need to simplify the process of getting materials into the repository,” and “it takes far longer to describe an object than to digitize it, thus creating bottlenecks.” “not enough space” was mentioned by three respondents, and “maintaining general librarywide staff support for the program” was mentioned by two respondents. the unlv libraries will keep in mind the experiences of our colleagues, as few, if any, libraries are likely immune to similar issues.
■ conclusions
the unlv library digitization survey revealed, not surprisingly, that not all libraries, even those of high stature, are created equal. many have struggled to some extent in growing and sustaining their digitization programs.
many have numerous published projects, others have few or perhaps even none. administrative and fellow colleague support varies, as does funding. additional questions remain to be tackled at the unlv libraries. how precisely will we define success for the digitization program? by the number of published collections? by the number of successful grants executed? by the number of image views or metadata record accesses? by the frequency of press in publications and word-of-mouth praise from fellow colleagues? ideas abound, but no definitive answers exist as of yet. at the larger level, other questions are looming. as libraries continue to promote themselves as relevant in the digital age, and promote themselves as a (or the) central partner in student learning, to what degree will libraries’ digital collections be tied into the educational curriculum, whether at their own affiliated institutions or with k–12 in their own states as well as beyond? clearly the profession is changing, with library schools creating courses and certificate programs in digitization. discussions about the integration of various information silos, metadata crosswalking, and item exposure in other online systems used by students will continue. library digitized collections are primary resources involved in such discussions. while these questions persist, it’s hoped that at a minimum, the unlv libraries have established the foundational structure to foster what we hope will be a successful digitization program. references 1. institute for museum and library services, “status of technology and digitization in the nation’s museums and libraries 2002 report,” may 23, 2002, www.imls.gov/publications/ techdig02/2002report.pdf (accessed mar. 1, 2009). 2. institute for museum and library services, “status of technology and digitization in the nation’s museums and libraries 2006 report,” jan. 2006, www.imls.gov/resources/ techdig05/technology%2bdigitization.pdf (accessed mar. 1, 2009). 3. 
rebecca mugridge, managing digitization activities, spec kit 294 (washington, d.c.: association of research libraries, 2006): 11. 4. ross housewright and roger schonfeld, “ithaka’s 2006 studies of key stakeholders in the digital transformation in higher education,” aug. 18, 2008, www.ithaka.org/research/ithakas%202006%20studies%20of%20key%20stakeholders%20in%20the%20digital%20transformation%20in%20higher%20education.pdf (accessed mar. 1, 2009). 5. ibid. 6. university of nevada, las vegas university libraries, “jeanne russell janish, botanical illustrator: landscapes of china and the southwest,” oct. 17, 2006, http://library.unlv.edu/speccol/janish/index.html (accessed mar. 1, 2009). 7. university of nevada, las vegas university libraries, “early las vegas,” http://digital.library.unlv.edu/early_las_vegas/earlylasvegas/earlylasvegas.html (accessed mar. 1, 2009). 8. kenning arlitsch and jeff jonsson, “aggregating distributed digital collections in the mountain west digital library with the contentdm multi-site server,” library hi tech 23, no. 2 (2005): 221. 9. institute for museum and library services, “status of technology and digitization in the nation’s museums and libraries 2006 report.” 10. michael boock and ruth vondracek, “organizing for digitization: a survey,” portal: libraries and the academy 6, no. 2 (2006), http://muse.jhu.edu/journals/portal_libraries_and_the_academy/v006/6.2boock.pdf (accessed mar. 1, 2009). 11. mugridge, managing digitization activities, 12. 12. institute for museum and library services, “status of technology and digitization in the nation’s museums and libraries 2006 report.” 13. brad eden, “managing and directing a digital project,” online information review 25, no. 6 (2001), www.emeraldinsight.com/insight/viewpdf.jsp?contenttype=article&filename=html/output/published/emeraldfulltextarticle/pdf/2640250607.pdf (accessed mar. 1, 2009). 14. mugridge, managing digitization activities, 32–33. 15.
boock and vondracek, “organizing for digitization: a survey.” 16. university of nevada, las vegas university libraries, “university libraries strategic goals and objectives,” june 1, 2005, www.library.unlv.edu/about/strategic_goals.pdf (accessed mar. 1, 2009). 17. mugridge, managing digitization activities, 20. 18. ibid., 48. 19. university of nevada, las vegas university libraries, “showgirls,” http://digital.library.unlv.edu/showgirls/ (accessed mar. 1, 2009). 20. university of nevada, las vegas university libraries, “nevada test site oral history project,” http://digital.library.unlv.edu/ntsohp/ (accessed mar. 1, 2009). 21. university of nevada, las vegas university libraries, “southern nevada: the boomtown years,” http://digital.library.unlv.edu/boomtown/ (accessed may 15, 2009). 22. mugridge, managing digitization activities, 40.

success factors and strategic planning | lampert and vaughan 125

appendix a. unlv library digitization survey responses

1. is the digitization program or digitization activities referenced in your library’s strategic plan? answer options (41 responses total) response percent response count yes 63.4 26 no 7.3 3 not specifically, but implied 22.0 9 our library doesn’t have a strategic plan 7.3 3 2. how would you characterize current support for digitization by your library’s administration? answer options (42 responses total) response percent response count very strong support, top priority 31.0 13 consistently supportive 40.5 17 neutral 14.3 6 minimal support 7.1 3 very little support, or some resistance 7.1 3 3. how would you characterize support for digitization in your library by the majority of those providing content for digitization projects (i.e., regardless of whether those providing content have as a primary or a minor responsibility provisioning content for digitization projects)?
answer options (44 responses total) response percent response count very strong support, top priority 15.9 7 consistently supportive 65.9 29 neutral 13.6 6 minimal support 2.3 1 very little support, or some resistance 2.3 1 126 information technology and libraries | september 2009 4. what year do you feel your library published its first “major” digital collection? major is defined as this was the first project deemed as having permanence and which would be sustained; it has associated metadata, etc. if you do not know, you may estimate or type “unknown.” responses ranged from 1990 to 2007. 5. to date, approximately how many digital collections has your library published? (please do not include ephemeral exhibits that may have existed in the past but no longer are present or sustained.) responses ranged from 1 to 1,000s. the great majority of responses were under 100; four responses were between 100 and 200, and one response was “1,000s.” success factors and strategic planning | lampert and vaughan 127 6. on average over the past 3 years, approximately how many new digital collections are published each year? all but two responses ranged from 0 to 10. one response was 13, one was 60. 7. what hosting platform(s) do you use for your digital collections (e.g., contentdm, etc.)? 8. does your institution have an institutional repository (e.g., dspace)? answer options (41 responses total) response percent response count yes 73.2 30 no 26.8 11 9. if the answer was “yes” in question 5, is your institutional repository using the same software as your digital collections? answer options (30 responses total) response percent response count yes 26.7 8 no 73.3 22 128 information technology and libraries | september 2009 10. is there an individual at your library whose central job responsibility is the development, oversight, and management of the library’s digitization program? 
(for purposes of this survey, central job responsibility means that 50 percent or more of the employee’s time is dedicated to digitization activities.) answer options (38 responses total) response percent response count yes 78.9 30 no 21.1 8 11. are there regular, full-time staff at your library who have as their primary or one of their primary job responsibilities support of the digitization program? for this question, a primary job responsibility means that at least 20 percent of their normal time is spent on activities directly related to supporting the digitization program or development of a digital collection. (mark all that apply) answer options (39 responses total) response percent response count digital imaging/document scanning, post-image processing, photography 82.1 32 metadata creation/cataloging 79.5 31 archival research of documents included in a collection(s) 28.2 11 administration of the hosting server 53.8 21 grant writing/donor cultivation/program or collection marketing 23.1 9 project management 61.5 24 multimedia formats 25.6 10 database design and data manipulation 53.8 21 maintenance, customization, and/or configuration of digital asset management software or features within that software (e.g., contentdm) 64.1 25 programming languages 30.8 12 web design and development 71.8 28 usability 25.6 10 marketing and promotion 28.2 11 none of the above 2.6 1 12. approximately how many individuals not on the full-time library staff payroll (i.e., student workers, interns, fieldworkers, volunteers) are currently working on digitization projects? answers ranged from 0 to “approximately 46.” the majority of responses (24) fell between 0 and 10 workers; twelve responses indicated more than 10; several responses indicated “unknown.” success factors and strategic planning | lampert and vaughan 129 13. 
has your library funded staff development, training, or conference opportunities that directly relate to your digitization program and activities for one or more library staff members? answer options (41 responses total) response percent response count yes, frequently, one or more staff have been funded by library administration for such activities 48.8 20 yes, occasionally, one or more staff have been funded by library administration for such activities 51.2 21 no, to the best of my knowledge, no library staff member has been funded for such activities 0.0 0 14. where does the majority of digitization work take place? answer options (41 responses total) response percent response count centralized in the library (majority of content digitized using library staff and equipment in one department) 48.8 20 decentralized (majority of content digitized in multiple library departments or outside the library by other university entities) 12.2 5 through vendors or outsourcing 7.3 3 hybrid of approaches depending on project 31.7 13 15. on a scale of 1 to 5 (1 being least important and 5 being vitally important), how important are each of the factors in weighing whether to proceed with a proposal for a new digital collection project or enhancement of an existing project? answer options (41 responses total) not important less important neutral important vitally important rating average response count collection includes item(s) for which there is a preservation concern or to make fragile item(s) more accessible to the public 0 1 9 22 9 3.95 41 collection includes unique items 0 0 1 19 21 4.49 41 collection involves a whole run of an information resource (e.g., an entire manuscript, newspaper run, etc.) 
2 5 11 21 2 3.39 41 130 information technology and libraries | september 2009 answer options (41 responses total) not important less important neutral important vitally important rating average response count collection includes the integration of various media (i.e., images, documents, audio) into a themed presentation 7 11 17 6 0 2.54 41 collection has a direct tie to educational programs and initiatives (e.g., university courses, statewide education programs, or k–12 education) 3 3 6 17 12 3.78 41 collection supports scholarly communication and/or management of institutional content 1 4 7 21 8 3.76 41 collection involves a collaboration with university colleagues 1 3 9 18 10 3.83 41 collection involves a collaboration with entities external to the university (e.g., public libraries, historical societies, museums) 2 4 11 19 5 3.51 41 16. from where have ideas originated for existing, published digital collections at your library? in other words, have one or more digital collections been the brainchild of one of the following? (mark all that apply) answer options (41 responses total) response percent response count library subject liaison or staff working with teaching faculty on a regular basis 75.6 31 library administration 65.9 27 special collections, archives, or library with a specialized collection or focus 92.7 38 digitization program manager 63.4 26 university staff or faculty member outside the library 68.3 28 an external donor, friend of the library, community user, etc. 51.2 21 (continued from previous page) success factors and strategic planning | lampert and vaughan 131 17. to whom are new projects first proposed to be evaluated for digitization consideration? answer options (38 responses total) response percent response count to an individual decision-maker 23.7 9 to a committee for review by multiple people 42.1 16 no formal process 34.2 13 18. how are approved projects ultimately prioritized? 
answer options (37 responses total) response percent response count by a single decision-maker 18.9 7 by a committee for review by multiple people 54.1 20 by departments or groups outside of the library 0.0 0 no formal process 27.0 10 19. are digitization program mission statements, selection criteria, or specific prioritization procedures in use? answer options (40 responses total) response percent response count yes, one or more of these forms of documentation exist detailing process 67.5 27 yes, some criteria are used but no formal documentation exists 25.0 10 no documented process in use 7.5 3 20. what general evaluation criteria do you employ to measure how successful a typical digital project is? (mark all that apply) answer options (39 responses total) response percent response count log analysis showing utilization/record views of digital collection items 69.2 27 analysis of feedback or survey responses associated with the digital collection 38.5 15 publicity generated by, or citations referencing, digital collection 46.2 18 e-commerce sales or reproduction requests for digital images 12.8 5 we have no specific evaluation measures in use 33.3 13 132 information technology and libraries | september 2009 21. has your institution been the recipient of a grant or gift whose primary focus was to help efforts geared toward digitization of a particular collection or to support the overall efforts of the digitization program? 
answer options (39 responses total) response percent response count we have received one or more smaller grants or donations (each of which was $100,000 or less) to support a digital collection/program 23.1 9 we have received one or more larger grants or donations (each of which was greater than $100,000) to support a digital collection/program 25.6 10 we have received a mix of small and large grants or donations to support a digital collection/program 46.2 18 we have been unsuccessful in receiving grants or have not applied for any grants—grants and/or donations have not played any role whatsoever in supporting a digital collection or our digitization program 5.1 2 22. how would you rate the overall level of buy-in for collaborative digitization projects between the library and external partners (an external partner is someone not on the full-time library staff payroll, such as other university colleagues, colleagues from other universities, etc.)? answer options (41 responses total) response percent response count excellent 41.5 17 good 39.0 16 neutral 4.9 2 minimal 7.3 3 low or none 0.0 0 not applicable—our library has not yet published or attempted to publish a collaborative digital project involving individuals outside the library 7.3 3 23. when considering the content available for digitization, which of the following statements apply? 
(mark all that apply) answer options (40 responses total) response percent response count at my institution, there is a lack of suitable library collections for digitization 0.0 0 content providers regularly contact the digitization program with project ideas 52.5 21 the main source of content for new digitization projects comes from special collections, archives, other libraries with specialized collections (maps, music, etc.), or local cultural organizations (historical societies, museums) 87.5 35 success factors and strategic planning | lampert and vaughan 133 answer options (40 responses total) response percent response count the main source of content for new digitization projects comes from born digital materials (such as dissertations, learning objects, or faculty research materials) 32.5 13 content digitization is mainly limited by available resources (lack of staffing, space, equipment, expertise) 47.5 19 obtaining good content for digitization can be challenging 7.5 3 24. various types of expertise are important in collaborative digitization projects. please rate the level of your local library staff’s expertise in the following areas (1–5 scale, with 1 having no expertise and 5 having tremendous expertise). 
answer options (41 responses total) no expertise very limited expertise working knowledge/ enough to “get by” advanced knowledge tremendous expertise n/a rating average response count digital imaging/ document scanning, post image processing, photography 0 1 3 21 16 0 4.27 41 metadata creation/ cataloging 0 0 2 20 18 0 4.40 40 archival research of documents included in a collection 0 2 6 15 16 2 4.15 41 administration of the hosting server 1 2 7 16 15 0 4.02 41 grant writing/ donor cultivation 1 4 13 13 8 2 3.59 41 project management 0 1 9 23 8 0 3.93 41 multimedia formats 0 5 21 10 4 1 3.33 41 database design and data manipulation 0 4 9 14 13 1 3.90 41 (continued from previous page) 134 information technology and libraries | september 2009 answer options (41 responses total) no expertise very limited expertise working knowledge/ enough to “get by” advanced knowledge tremendous expertise n/a rating average response count digital asset management software (e.g., contentdm) 3 0 5 21 11 0 3.93 40 programming languages 4 3 14 9 11 0 3.49 41 web design and development 2 1 13 10 15 0 3.85 41 usability 1 7 12 13 8 0 3.49 41 marketing and promotion 2 11 17 7 3 1 2.95 41 25. what are some of the factors that you feel have contributed to the success of your institution’s digitization program? survey responses were quite diverse because respondents were speaking to their own perceptions and institutional experience. the general trend of responses are discussed in the body of the paper. 26. what are the biggest challenges for your institution’s digitization program? survey responses were quite diverse because respondents were speaking to their own perceptions and institutional experience. the general trend of responses are discussed in the body of the paper. appendix b. white paper organization i. introduction ii. current status of digitization projects at the unlv libraries iii. topic 1: program planning a. are there boundaries to the libraries digitization program? 
what should the program support? b. what resources are needed to realize program goals? c. who is the user or audience? d. when selecting and designing future projects, how can high-quality information be presented in online formats incorporating new features while remaining un-biased and accurate in service provision? e. to what degree do digitization initiatives need their own identity versus heavily integrating with the libraries’ other online components, such as the general website? f. how do the libraries plan on sustaining and evaluating digital collections over time? g. what type of authority will review projects at completion? how will the project be evaluated and promoted? iv. topic 2: initiative selection and prioritization a. project selection: what content criteria should projects fall within in order to be considered for digitization and what is the justification for conversion of the proposed materials? (continued from previous page) success factors and strategic planning | lampert and vaughan 135 b. project selection: what technical criteria should projects fall within in order to be considered for digitization? c. project selection: how does the project relate to, interact with, or complement other published projects and collections available globally, nationally, and locally? d. project selection and prioritization: after a project meets all selection criteria, resources may need to be evaluated before the proposal reaches final approval. what information needs to be discussed in order to finalize the selection process, select between qualified project candidates, and begin the prioritization process for approved proposals? e. project prioritization: should we develop a formal review process? v. topic 3: project planning a. what are the planning steps that each project requires? b. who will be responsible for the different steps in the project plan and department workload? c. how can the libraries provide rich metadata and useful access points? d. 
what type of web design will each project require? e. what type of communication needs to exist between groups during the project? vi. concluding remarks vii. related links and resources cited viii. white paper appendixes a. working list of advisory committee functions and project workgroup functions b. contentdm software: roles and expertise c. project team workflow d. contentdm elements appendix c. first workshop questions general questions 1. how do you define a digital library? do the terms “repository,” “digital project,” “exhibit,” or “online collection” connote different things? if so, what are the differences, similarities, and boundaries for each? 2. what factors have contributed to a successful digitization program at your institution? did anything go drastically wrong? were there any surprises? what should new digitization programs be cautious and aware of? 3. what is the role, specifically, of the academic library in creating digital collections? how is digitization tied to the mission of your institution? 4. why digitize and for whom? do digital libraries need their own mission statement or philosophy because they differ from physical collections? should there be boundaries to what is digitized? 5. what standards are most widely in use at this time? what does the future hold? are there new standards you are interested in? technical questions, metadata questions 1. what are some of the recommended components of digital library infrastructure that should be in place to support a digitization program (equipment, staff, planning, technical expertise, content expertise, etc?) 2. what are the relationships between library digitization initiatives, the library website, the campus website or portal, and the web? in what ways do these information sources overlap, interoperate, or require boundaries? 3. how do you decide on what technology to use? what is the decision-making process when implementing a new technology? 4. 
standards are used in various ways during digitization. what is the importance of using standards, and are there areas where standards should be relaxed, or not used at all? how do digitization programs deal with evolving standards? 5. preservation isn’t talked about as much as it used to be. what’s your solution or strategy to the problem of preserving digital materials? 6. will embedded metadata ever be the norm for digital objects, or will we continue to rely on collection management like contentdm to link digital objects to their associated metadata? 136 information technology and libraries | september 2009 appendix d. second workshop outline 1. introduction—purpose/focus of the meeting a. to talk about next steps in the digitization program b. quick review of the current status and where the program has been c. serve to further educate participants on the steps involved in taking a project idea to reality d. goals for participants: understand types of projects and project prioritization; engage in activities on ideas and prioritization; talk about process and discuss committee; open forum 2. staff digitization survey discussion a. “defining digital libraries” b. “boundaries to the digitization program” c. “users and audience” d. “digital project design” e. “potential projects and ideas” 3. first group exercise: digital project idea ranking and defense of ranking 4. second group exercise: digital project idea brainstorming and defense of ideas brainstormed 5. concept/proposal for a digitization advisory committee 6. conclusion and next steps collections and design questions 1. how do you decide what should be included in a digital library? does the digital library need a collection development policy and if so, what type? how are projects prioritized at your institution? 2. how do you decide who your user is? are digital libraries targeting mobile users or other users with unique needs? 
what value-added material complements and enhances digital collections (i.e., item-level metadata records, guided searches, narrative or scholarly content, teaching material, etc.)? 3. how should digital libraries be assessed and evaluated? how do you gauge the success of a digital collection, exhibit, or library? what has been proven and disproved in the short time that libraries have been doing digital projects? 4. what role do digital libraries play in marketing the library? how do you market your digital collections? are there any design criteria that should be considered for the web presence of digital libraries (should the digital library look like the library website, the campus website, or have a unique look and feel)? 5. do you have any experience partnering with teaching faculty to create digital collections? how are collaborations initiated? are such collaborations a priority? what other types of collaborations are you involved in now? how do you achieve consensus with a diverse group of collaborators? to what degree is centralization important or unnecessary?

file organization of library records

i. a. warheit: international business machines corporation, san jose, california

library records and their utilization are described and the various types of file organization available are examined. the serial file with a series of inverted indexes is preferred to the simple serial file or a threaded list file. it is shown how various records should be stored, according to their utilization, in the available storage devices in order to achieve optimum cost-performance.

one of the problems data processing people are beginning to face is the organization of library files. these are some of the largest and most voluminous files that will have to be organized, maintained and searched.
they range in size from the national union catalog of the library of congress, which has over sixteen million records with an average of three hundred characters each, down to the hundreds of small college catalogs of 100,000 records. there are more than fifty universities whose holdings range from one million to over eight million volumes. the average holdings of library systems serving cities of 500,000 or more exceed two million volumes, although the actual number of titles is less. since the turn of the century the university libraries have been growing exponentially and at present are doubling, on the average, every fifteen years.

also, the abstracting-indexing services, whose records are very similar to library catalog records and are used in much the same way, have grown very large. chemical abstracts, which has been operating since 1907, now has over three and a half million citations. it provides data on some three million compounds and is today adding over a quarter of a million citations each year. if the present rate of growth continues, it will be adding 400,000 citations a year by 1971. index medicus and biological abstracts are very similar and there are a number of other somewhat smaller bibliographic services in the field of metals, engineering, physics, petroleum, urban renewal, atomic energy, meteorology, geology, aerospace, and so on. in addition, library-type file maintenance, organization and search are being applied to medical records, adverse drug reaction reports, intelligence files, engineering drawings, museum catalogs and the like, and these, too, represent very large information retrieval files. in other words, library files are very widespread and are beginning to become a problem for data processing.

characteristics of files

the aforementioned library files have certain common characteristics. first, as already noted, they are large.
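to put the sizes quoted above in concrete terms, the following is a minimal sketch of the arithmetic, assuming a constant growth rate and counting raw characters only (no indexes or overhead). the function names are invented for illustration and do not come from the paper.

```python
import math


def total_characters(records: int, avg_chars_per_record: int) -> int:
    """Raw character count of a file, ignoring indexes and overhead."""
    return records * avg_chars_per_record


def annual_growth_rate(doubling_years: float) -> float:
    """Constant yearly growth rate implied by a given doubling time."""
    return 2 ** (1 / doubling_years) - 1


def years_to_grow(current: float, target: float, doubling_years: float) -> float:
    """Years needed to grow from current to target at that doubling time."""
    return doubling_years * math.log2(target / current)


# Figures quoted in the text: sixteen million records of three hundred
# characters each, and holdings that double every fifteen years.
nuc_characters = total_characters(16_000_000, 300)
growth = annual_growth_rate(15)
one_to_eight_million = years_to_grow(1_000_000, 8_000_000, 15)

print(f"national union catalog: {nuc_characters:,} characters")
print(f"implied annual growth: {growth:.1%}")
print(f"one million to eight million volumes: {one_to_eight_million:.0f} years")
```

under these assumptions the national union catalog alone comes to 4.8 billion characters, a fifteen-year doubling time corresponds to growth of roughly 4.7 percent a year, and a collection of one million volumes would take 45 years to reach eight million.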
in the next ten or fifteen years there will probably be several hundred libraries with holdings exceeding one million volumes each.

second, the records themselves are alphabetic and tend to be voluminous. they range from two hundred characters in an index journal, to three hundred characters for the standard catalog card up to two thousand characters for the abstract journals. in 1962 the library of congress, for example, estimated that it would need a file exceeding 9 × 10^8 bits to do its normal library processing and to store the serial records; it would need a file of 1.3 × 10^9 bits to store the circulation records and location directory and monitor the use of the collection, and would need a file of 10^12 bits for the central catalog and the catalog authority files (1). on the basis of library experience since 1962, these figures are generally considered too low.

third, file records are variable in length. the librarian cannot control his inputs. the world's publications appear in every shape, form and identity and they must be recorded the way they have appeared so that they can be properly identified. artificial identification such as book numbers, call numbers, coden numbers for journals and the like are simply parochial conveniences and do not replace the actual bibliographic record.

records in a large catalog file are generally stable and not dynamic. if there is a new edition of a document, a new bibliographic record is made. if the old document is retained along with the new edition, the old catalog record is also retained. the record is discarded only if the document is discarded and, in the large research library, this occurs very infrequently. new indexing or cataloging is seldom applied to old records.
in contrast, the smaller item record file used for acquisition and processing, the circulation file, and the serials records file, all ranging from 10,000 to 100,000 records, are dynamic records requiring many and frequent changes, additions and deletions.

22 journal of library automation vol. 2/1 march, 1969

each record item must have a number of different access points, since a single class or access point which everyone will accept is an impossibility. at present, with conventional library cataloging, card catalogs and printed indexes provide about five or six access points or records per title. however, computer systems, with their greater opportunity to do deeper indexing, are providing from ten to twenty keys or access points per title.

distribution of index terms is very uneven and not predictable. a few terms have a great many postings or addresses, while many terms, notably author entries, have only one or two postings.

file segmentation by subject class has been proposed by some data processing personnel, but inter-disciplinary needs are such that subject segmentation is not considered very seriously. file segmentation by date, especially for the abstract services, is increasing in popularity. it is generally thought that major activity, in the technologies especially, is concentrated in current records; this is less true, however, in the sciences and even less in the humanities. public library and undergraduate library personnel may not object to segmenting their files, but those librarians responsible for major research collections that cover all disciplines do not look with favor on segmented files.

although circulation records do provide some clues as to the activity of the various parts of a library's collection, no one really knows what the search activity in the catalog is, or how it is distributed across the various records used.
therefore, since every record is considered permanent in libraries, major effort has been expended on input processing which has included the recording of much material whose utility is questionable.

a user wants to access files in open language, and wants to receive response in open language; he will not use codes and so-called machine language and will tolerate only a minimum of training on methods to interrogate the file. he prefers to engage in an actual dialogue with the file and if he cannot do this will ask a reference librarian or reader's advisor to find the references for him. he also wants real-time response. if he doesn't get fairly prompt answers, he will go elsewhere to satisfy his informational needs.

types of files

the librarian must work with a number of files:
1) the item record file is the record of an item, book, journal, report, etc., that is being ordered, is on order, is being received, or is being processed by the cataloger.
2) the catalog file is the permanent bibliographic and subject record of the item that has been processed by the cataloger.
3) the serials record file, which is in two parts, is the record of holdings of completed volumes both bound and unbound, and the check-in record of currently received periodical issues.
4) the circulation control file keeps the record of all items loaned or otherwise charged out.
5) the catalog authority file is the thesaurus-like vocabulary control which indexers and catalogers use as their authority list and guide in assigning index terms. it is also used to "normalize" the inquiries of a searcher and convert them to legitimate index terms.

the librarian is also concerned with a number of indexed abstracts produced by various discipline oriented institutions which are used in libraries. he also uses a number of special files: borrower or patron file, special collection files, location files, vendor files, and the like.
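the role of the catalog authority file in "normalizing" a searcher's inquiry can be sketched as a lookup from variant terms to preferred index terms. this is only an illustration of the idea, not the paper's procedure: the sample vocabulary and the function names are invented, and a real authority file would carry far richer cross-references.

```python
def _key(term: str) -> str:
    """Fold case and collapse whitespace so variants compare equal."""
    return " ".join(term.lower().split())


def build_authority(entries: dict[str, list[str]]) -> dict[str, str]:
    """Map every preferred term and each of its variants to the preferred term."""
    lookup: dict[str, str] = {}
    for preferred, variants in entries.items():
        lookup[_key(preferred)] = preferred
        for variant in variants:
            lookup[_key(variant)] = preferred
    return lookup


def normalize_query(terms: list[str], lookup: dict[str, str]) -> tuple[list[str], list[str]]:
    """Convert a searcher's terms to legitimate index terms.

    Returns (accepted index terms, terms not found in the authority file).
    """
    accepted: list[str] = []
    rejected: list[str] = []
    for term in terms:
        preferred = lookup.get(_key(term))
        if preferred is None:
            rejected.append(term)
        elif preferred not in accepted:
            accepted.append(preferred)
    return accepted, rejected


# Hypothetical vocabulary, invented purely for this example.
SAMPLE_AUTHORITY = {
    "aeronautics": ["aviation", "flight"],
    "petroleum": ["oil", "crude oil"],
}

lookup = build_authority(SAMPLE_AUTHORITY)
accepted, rejected = normalize_query(["Oil", "aviation", "crude  oil", "zeppelins"], lookup)
print(accepted)
print(rejected)
```

terms that fail the lookup are returned separately rather than dropped, since in practice they are the ones a searcher or reference librarian would need to rephrase.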
except for a few comments about the item record, this discussion is confined primarily to the catalog file, which is by far the largest file and, for the librarian and the general user, the most important. as already noted, in most respects it is very similar to the indexed abstract file and, in fact, in certain special libraries, these two files are combined.

in process file

the in process, or item record, file consists of records of all items which the library is acquiring and processing. it is not a very large file, or, at least if properly policed, should not be. unfortunately, because in manual systems it is difficult continuously to follow up outstanding orders, a lot of deadwood accumulates and files become unnaturally large and difficult to handle. in a well controlled file, however, the number of records does not grow appreciably, for, although new items are added, processed titles are removed when they are added to the catalog file. in addition to providing such normal bibliographic access points as personal author, corporate author, title, report number and the like, the item record may also be searched by a number of specialized keys: order number, vendor, publisher, journal code, contract number, fund, requester. the item record is very dynamic. information available to the librarian when the order for an item is placed may be faulty. new information will be coming in about the item, such as price, shipping costs, invoice number, change in vendor, and change in title. various funds have to be charged and obligations changed, payments authorized, funds decremented, receipt notices prepared and sent to requesters, flags in various files changed to prevent duplicate orders and the bibliographic record transmitted to the cataloging staff.
however, once an item has been received and cataloged, only the bibliographic information (author, title, place, publisher, date, pagination) is retained and the rest of the information is retired to an historical file. (2). because it would provide greater flexibility as new and unexpected demands are generated, the best way to handle this dynamic file would be with a generalized data management system rather than with a tailor-made acquisitions and processing program. although present data management systems are really not suitable, because of variable length records in item record files and because terminals will be used, it appears that some could be adapted.

catalog file

the tendency today, however, is to build a single master file with various functional fields where bibliographic information, ordering and purchasing data, loan records, location information and other item control data are stored. how should this very large master catalog file be organized so that it will be easy and economical to maintain and provide all the desired search capabilities? there are three basic file organization schemes in use today for information retrieval: the serial file, the inverted file and the list process file (3,4,5). actually, from a technical point of view, both the inverted file and the list process file represent two different classes of list structures and are, therefore, sometimes referred to as the inverted list system and the threaded list system.

serial file organization

although the serial file is the easiest and cheapest to maintain, the librarian obviously cannot accept purely serial searching of his catalog. the file is much too big and the real time requirements are such as to rule out any but the shortest, simplest serial or sequential search. as will be pointed out later, the librarian does need some serial searching capability, and of course he does need it if he wants to do any browsing.
however, if he is to provide any kind of useful service, he must use direct-access storage devices and access his records individually.

threaded list file organization

for a while there was some interest in using a threaded list file organization for the catalog file. here, the searcher is first directed through a dictionary or directory to the latest record associated with a term. this record also contains the chain address of the previous record having the same descriptor, so that a user can run through a "chain" or "list" until he reaches the oldest or last record, or comes back full circle to the starting record. each record belongs to a number of lists, one for each descriptor used to describe it, and there are as many lists as there are descriptors. such a system seems economical of storage space in that a secondary or separate index does not have to be stored, but, since storage space for the chain or link address has to be provided, the actual savings are very small. there are several possible refinements of this list file organization which reduce storage costs. some involve elimination of redundant information; a term, or any other searchable piece of information, is stored just once, sometimes in the form of a table. each record that contains searchable information has a pointer to the term itself. there have to be, of course, pointers from every term back to the records as well. insofar as the pointers may require fewer bits than the terms or addresses themselves, there is a saving in storage space. it does cost some additional processing time and file maintenance is somewhat complicated. (6).

another economy measure is provided by what is generally called a multilist system which groups several, usually three, descriptors into one super key with one chain address. a multilist not only saves space but also speeds both file posting and searching by processing multiple descriptors simultaneously.
(7,8,9,10,11). such a system, to be workable, must permit grouping of various descriptors into mutually exclusive groups, and within each group there must be some equitable distribution of descriptors posted to records. in normal library information retrieval applications, a very large percentage of the descriptors are used just for one or two documents and only a few descriptors are used to identify a large number of document records. in other words, most of the so-called super keys end up having just a single real descriptor, which is equivalent to establishing a separate list for each descriptor. in a test made with the defense document center collection it turned out that about ninety percent of the super keys had only single descriptors. (12,13). there are, in addition, special modifications of multilist files which essentially involve segmenting the multilist to fit the hardware, for example, the track length or cylinder size. (14). a fragmented sub-list, sometimes referred to as a cellular multilist, may even contain all the link addresses in the directory, thus becoming indistinguishable from an inverted file. any list process file organization, however, does pose serious file maintenance problems, especially where individual records must be changed or deleted. also special precautions must be taken to avoid broken chains and provision made to repair breaks, although some advocates of list process files claim it is easier to maintain threaded lists than inverted lists. of course, if multilists are used, a special effort must be made to build the super keys. it must not be forgotten that a threaded list directory can only provide the search statistics for a single term and, unlike the inverted list, can only provide intersection statistics upon completion of a total search. the few librarians who have been exposed to threaded list file organization have not reacted favorably.
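the threaded list organization described above can be sketched in a few lines of python. this is only an illustration under stated assumptions: the class and field names (`ThreadedListFile`, `Record`, `links`, `directory`) are hypothetical and do not come from any system discussed in the article. the directory holds, for each descriptor, the address of the latest record, and every record carries one chain address per descriptor pointing back to the previous record on that list.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Record:
    """One main-file record; `links` holds one chain address per descriptor."""
    number: int
    title: str
    links: dict[str, Optional[int]] = field(default_factory=dict)


class ThreadedListFile:
    """Hypothetical threaded list file: no separate index, only a directory."""

    def __init__(self):
        self.records: dict[int, Record] = {}  # record number -> record
        self.directory: dict[str, int] = {}   # descriptor -> latest record number

    def add(self, number, title, descriptors):
        rec = Record(number, title)
        for d in descriptors:
            # the new record points back at the previous head of this chain
            rec.links[d] = self.directory.get(d)
            self.directory[d] = number
        self.records[number] = rec

    def search(self, descriptor):
        """Walk the chain from the latest record back to the oldest one."""
        hits = []
        address = self.directory.get(descriptor)
        while address is not None:
            hits.append(address)
            address = self.records[address].links[descriptor]
        return hits
```

in this sketch, the only way to learn how many records two descriptors share is to walk both chains completely, which mirrors the point made above about intersection statistics; deleting a record would also require repairing every chain that passes through it.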
a few have been interested in applying this technique to do hierarchical searches and other relationship connections in their authority lists or thesauri, but have not seriously considered using it for their catalog files.

inverted file organization

the traditional library file organization as exemplified by the standard card catalog has been based on a serial main file plus an inverted file. here a normal serial file is "inverted" and the file sequenced by index entry or key. the record itself is duplicated under each of its keys, which librarians call tracings. by strictly limiting the number of tracings or keys applied to each record, the librarian can keep the card catalog down to a reasonable size. however, as deeper indexing is applied to the documents, more keys or tracings are used and the file becomes very large. furthermore, storage costs in the mechanized file are appreciably higher than in an ordinary manual card file. the full record, therefore, in a mechanized system cannot be economically stored behind each term. only the document or record number or file address of the master record is recorded after each term; in other words, the inverted file is just an index to the record file. the main record file itself is a simple serial file where each record is complete in itself, the tracings or keys in the record and the address of the record being duplicated on the inverted file. the catalog file, therefore, is made up of two parts: a serially organized main or master record file, and an inverted index to the main file. (15). maintenance of an inverted index is expensive. tracings and the addresses to which they refer have to be duplicated, requiring costly additional storage space. new terms and new addresses cannot simply be added to the end of a file but must be distributed and interfiled throughout the index, causing a number of file maintenance problems.
the inverted index and main serial file must be kept in phase, with changes in one being reflected in the other. to maintain these files, separate inputs should not be prepared; instead the inverted index should be generated from the main record file update by program control. (16,17,18,19). although the combined file organization of a serial record file and an inverted index does cost more to maintain than serial or list file organization, it provides such superior search capabilities that it has become the favored library catalog file organization. since the inverted file is organized by subject headings or descriptors and since a search request is specified by listing the desired descriptors and their logical relationships, the search programs need only examine the items filed behind each selected descriptor or subject heading. it is unnecessary to look at all the records, as it is with the serial file. the inverted file search, in its basic form, takes the request descriptors, obtains the list of record addresses or items under each relevant descriptor, makes the specified logical connections, and produces all items satisfying the request. the search procedure examines only potentially pertinent records, ignoring the rest of the file. in other words, the file is organized every time a search is made to suit the requirements of the search. thus, the file and the request are compatible and utilization of the file is essentially independent of its size. an inverted index provides a very special capability to a searcher who is using a terminal, on-line system. he can test both individually and collectively the effectiveness of the terms of his search statement without having to make a complete search of the master record, simply by examining the inverted index. the system will tell him, for example, the number of entries under a term.
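the basic inverted file search described above can be sketched as follows. this is a minimal illustration, not the combined file search system itself: the sample records, the `MAIN_FILE` layout and the function names are all hypothetical. the index is generated from the main file by program, so the two stay in phase, and the logical connections are made on the address lists alone without touching the master records.

```python
from collections import defaultdict

# hypothetical main (serial) file: record address -> record with its tracings
MAIN_FILE = {
    101: {"title": "file organization", "tracings": {"files", "libraries"}},
    102: {"title": "catalog automation", "tracings": {"catalogs", "libraries"}},
    103: {"title": "disk storage", "tracings": {"files", "storage"}},
}


def build_inverted_index(main_file):
    """Derive the index from the main file so the two cannot drift apart."""
    index = defaultdict(set)
    for address, record in main_file.items():
        for term in record["tracings"]:
            index[term].add(address)
    return index


def postings_count(index, term):
    """Number of entries filed under a single term."""
    return len(index.get(term, ()))


def search_and(index, terms):
    """Addresses posted under every requested term (logical AND)."""
    lists = [index.get(t, set()) for t in terms]
    return sorted(set.intersection(*lists)) if lists else []


def search_or(index, terms):
    """Addresses posted under at least one requested term (logical OR)."""
    return sorted(set().union(*(index.get(t, set()) for t in terms)))
```

a searcher at a terminal could call `postings_count` on each term, then try `search_and` and `search_or` on different combinations, and only fetch the full records from the main file once the size of the result looks reasonable.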
it will tell him how many entries several terms share in common so that he can test the intersections, that is, the conjunction and disjunction of the terms. the count of addresses that results from the list intersection can be returned immediately to the terminal as an upper limit of the number of hits. in effect decoding of the boolean expression takes place in the inverted index, which is a very compact list, and hence the response time is fast. it is true, some additional calculations and comparisons in the record itself may reduce the number of hits, but will never increase them. sitting at a terminal, a searcher can ask the system what will be the maximum number of hits he will get in response to a search statement. he can change the parameters of his search statement and see immediately what effect that will have on the response of the system. it is primarily because of this capability of the user to have a dialog with the machine that every terminal-oriented library information retrieval system, at least of which the author is aware, is adopting an inverted file organization. in order to reduce storage costs, not every search term need be carried on an inverted index. those search terms or index entries that are practically never searched alone, but used rather in conjunction with another term or tracing, are carried only in the main file and not on the inverted index. in a library catalog these terms are usually the place and date of publication, publisher, language of the book, level of the publication (i.e. adult, children, youth), number and type of illustrations, and so on. these terms appear on almost every record and some of them are high density terms; that is, they are heavily posted. for example, in a typical u.s. library, some eighty per cent of the books are identified as being in english. form headings (bibliography, essay, poem, biography, map, etc.), geographic headings, and numerics that are used in conjunction with what are called main headings, also do not appear on the inverted index, but can be searched in the main file. in the very unlikely event that a search is required to be made only for a term not on the inverted file, then, of course, a serial search can be made of the master file. in some systems, a very compact serial file of data may simplify serial searching of the master file.

physical organization

a basic understanding of how a library's records are used is necessary to a proper plan for their physical organization. in a manual system, logical organization and physical organization of a library's records are identical. furthermore, all files are physically the same, usually on 3x5 catalog cards or, in a few cases, in printed book or sheaf catalogs. in a computer system, however, because of varying capacities, speeds and storage costs of different direct access devices, it is extremely important that the various records and segments of records be stored in those devices which will give the best cost-performance for the application. this means that the rate of utilization of the various records and parts of records, as well as the size of the records, will determine what types of devices will be used as physical files. in a library operation there is very heavy use of index terms, or subject headings and author entries, to search the files; records for these entries can be very short. borrower records and charge-out records in circulation control systems are also very actively used. there is less use made of the bibliographic record or journal citation. these records are somewhat longer than the subject and author tracings, and hence require more storage, but do not need such rapid access. notes, abstracts and other explanatory material can require an enormous amount of storage space but, as a rule, are used only infrequently.
patron registration, as contrasted with borrower records, is used much less frequently, unless, of course, the two types of records are combined. since serials holdings records do not change very frequently, printouts are quite satisfactory as finding tools and the records are usually kept off-line. journal check-in, however, requires a great number of accesses every day. in view of the requirements generated by the above uses, the present thinking for on-line library systems, in terms of current hardware, runs something like this: in a combined file system described above, with the bibliographic record on the serially organized main file and the index in an inverted file arrangement, the inverted file, which must be accessed many more times than the main file, would best be carried on disk files. the bibliographic record itself, being much more voluminous and accessed less frequently, is stored in a larger, slower, more economical file like the ibm 2321, the data cell. abstracts, and other seldom used bulk records might well be on tape, off line. actually, though, as libraries build up their record files to control their total collections, they will, of course, exceed the capacity of the present data cells and will have to go to future mass memory devices similar to the ibm photo digital storage system. then it may be economical to put even abstracts and notes of the bibliographic record on line. if there is a separate item record file of in process or acquisitions data, it can be handled in the same way as the catalog file, that is, all access points as an inverted file on disk with the record itself on the data cell strips. if, however, the total item record file is not too big, it might well be stored on disk. circulation control records are carried on disk, but patron registration, if it is to be kept on line, would be more economically stored in the data cell. the authority list or thesaurus really has two functions.
it is heavily used to validate and convert all inputs and all search requests. it is also used to store all cataloging and indexing decisions and to provide guides to users as to the formulation of search queries. the necessary data makes for long records that are either infrequently used or available as printouts. therefore, a condensed form of the authority list or thesaurus, a form which carries only the terms and their equivalents, is best stored on disk, whereas the full-blown authority list which is used primarily for printing the thesaurus and its supplements can be carried off-line on tape, or in the cheapest, biggest and slowest direct access device which is available. in order to achieve economical, compact storage, the subject headings, descriptors or index terms would not be stored in open language but in numeric codes. by using, for example, the decimal code as used in a dewey decimal system, numeric codes would also make it possible economically to build hierarchies or class tables with the descriptors. it would be necessary, therefore, in every transaction, to translate from open language to code when interrogating the system and to translate from code to open language when outputting from the system. translations would have to be very fast to accommodate the traffic of a large number of terminals. the translation job, using a stored table, might have to be done in an auxiliary, large core storage, which is very fast but more expensive than disk files. as a general rule, what is being proposed is that for very large files the index and the bibliographic record are not to be stored in the same device. one might start this way until the file and the traffic into it are built up and the system becomes fully operational. however, the system should be so structured that indexes could be stored in files that are faster than the bulk storage devices used for the records.
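the translation between open language and stored numeric codes can be illustrated with a small table lookup. this is a sketch under assumptions: the four terms and their dewey-style codes are sample data chosen for illustration, and the function names (`encode`, `decode`, `narrower_codes`) are hypothetical. it also shows how a decimal code lets a class table be built cheaply, since the narrower classes of a code share its leading digits.

```python
# hypothetical authority table: open-language term -> dewey-style numeric code
TERM_TO_CODE = {
    "natural sciences": "500",
    "mathematics": "510",
    "geometry": "516",
    "physics": "530",
}
# reverse table used when translating stored codes back for output
CODE_TO_TERM = {code: term for term, code in TERM_TO_CODE.items()}


def encode(term):
    """Translate a searcher's open-language term to its stored code."""
    return TERM_TO_CODE[term.strip().lower()]


def decode(code):
    """Translate a stored code back to open language for output."""
    return CODE_TO_TERM[code]


def narrower_codes(code):
    """Codes in the table that fall under the given class.

    Trailing zeros mark a broader class, so "510" covers every code
    that starts with "51" and "500" covers every code starting with "5".
    """
    prefix = code.rstrip("0")
    return sorted(c for c in CODE_TO_TERM if c.startswith(prefix))
```

both tables here are ordinary in-memory dictionaries, which corresponds loosely to keeping the translation tables in the fastest available storage; a term missing from the table raises `KeyError`, which is where validation against the authority list would reject an input.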
the translation files, that is, the tables that convert from open language to stored codes on input and the reverse on output, can be stored in the fastest available external storage. (20). it is extremely doubtful that hardware development in the immediate future will change these principles of library file organization very much. as storage costs drop, total capacities increase, and access times become shorter, more and more libraries will find it practical and economical to put their files on line in order to provide the improved services that users demand.

references

1. u. s. library of congress: automation and the library of congress (washington, government printing office, 1963), p. 74.
2. batts, n. c.: "data analysis of science monograph order/cataloging forms," special libraries, 57 (october, 1966), 583-586.
3. "corporate data file design," edp analyzer, 4 (december, 1966).
4. climenson, w. d.: "file organization and search techniques," annual review of information science and technology, 1 (new york: interscience, 1966), p. 50.
5. borko, h.: "design of information systems and services," annual review of information science and technology, 2 (new york: interscience, 1967), p. 50.
6. castner, w. g., et al.: "the mecca system - a modified list processing application for library collections," proceedings - a. c. m. national meeting (1966), pp. 489-498.
7. prywes, n. s., et al.: the multi-list system (philadelphia, moore school of electrical engineering, university of pennsylvania technical status report no. 1 under contract nonr 551(40), november, 1961).
8. prywes, n. s.; gray, h. j.: "the multi-list system for real time storage and retrieval," ifip congress proceedings, 1962, pp. 112-116.
9. university of pennsylvania, moore school of electrical engineering: the tree as a stratagem for automatic information handling (report of work under ... contract nonr 551(40) and ... af 30(602)2832, moore school report no. 63-15, 15 december 1962).
10. lefkovitz, d.: automatic stratification of descriptors (philadelphia, moore school of electrical engineering, university of pennsylvania, technical report under contract nonr 551(40), moore school report no. 64-03, 15 september 1963).
11. landauer, i.: "the balanced tree and its utilization in information retrieval," ieee transactions on electronic computers (december, 1963), pp. 863-871.
12. univac division of sperry rand corporation: multi-list systems: preliminary report of a study into automatic attribute group assignment; technical status report no. 1-2, 3#ad 609 709, 4#ad 609 710 (1963-1964).
13. univac division of sperry rand corporation: optimization and standardization of information retrieval language and systems; final report (ad 630-797, 1966).
14. lefkovitz, d.: file structures for on-line systems (class syllabus).
15. curtice, r. m.: magnetic tape and disc file organizations for retrieval (lehigh university, center for information sciences, july, 1966).
16. warheit, i. a.: "the direct access search system," afips conference proceedings, 24 (1963), pp. 167-172.
17. warheit, i. a.: the combined file search system. a case study of system design for information retrieval (paper presented at the f. i. d. meeting in washington, d. c., october 15, 1965; abstract, 1965 congress, international federation for documentation (fid), washington, d. c., u. s. a. 10-15, october 1965), p. 92.
18. prentice, d. d.: the combined file search system (san jose, california: ibm, june 15, 1964).
19. 1401 information storage and retrieval system - version ii; the combined file search system, no. 10.3.047 (hawthorne, new york: ibm, may 1, 1966).
20. warheit, i. a.: file organization for libraries; report to project intrex, mit, cambridge, massachusetts, march 14, 1968.
migration of a research library's ict-based services to a cloud platform

communication

francis jayakanth, ananda t. byrappa, and filbert minj

information technology and libraries | march 2022
https://doi.org/10.6017/ital.v41i1.13537

francis jayakanth (francis@iisc.ac.in) is scientific officer, jrd tata memorial library, indian institute of science. ananda t. byrappa (anandtb@iisc.ac.in) is librarian, jrd tata memorial library, indian institute of science. filbert minj (filbert@iisc.ac.in) is principal research scientist, supercomputer education and research centre, indian institute of science. © 2022.

abstract

libraries have been at the forefront in adopting emerging technologies to manage the library’s operations and provide information services to the user community they serve. with the emergence of cloud computing (cc) technology, libraries are exploring and adopting cc service models to make their own services more efficient, reliable, secure, scalable, and cost-effective. in this article, the authors share their experience migrating some of the library’s locally hosted ict-based services onto the microsoft azure cloud platform. the migration of services to a cloud platform has helped the library significantly reduce the downtime of its services due to power or network or system outages.

introduction

established in 1909, the indian institute of science is a leading advanced education and research institution in the sciences and engineering. since its inception, the institute has balanced an emphasis on pursuing basic knowledge with applying its research findings for industrial and societal benefit.
the institute, which started with just two departments—general and applied chemistry and electrical technology—now has over 40 departments spread across six divisions: biological sciences, chemical sciences, electrical sciences, interdisciplinary research, mechanical sciences, and physical and mathematical sciences. the institute’s jrd tata memorial library (https://library.iisc.ac.in) celebrated its centenary in 2011. established in 1911, the library was one of the earliest central facilities created by the institute to support teaching and research. the library offers both conventional and contemporary services to its users. the library’s traditional services include reference, referral, cataloguing and classification, circulation, inter library loan, document delivery, weekly display of recent periodicals and books, and photocopying. some of the library’s current information and communications technology (ict)-based services include digital repository services for the institution’s research publications and theses and dissertations, a faculty profiling system, a web-based online public access catalogue (web opac), and shibboleth-based federated access to the library’s subscribed online resources. the library also facilitates information literacy services such as library orientations, workshops, seminars, demonstrations, invited talks, training sessions on subscribed resources, trial access to new products and services, and author workshops on the research publishing process.

until 2018, the library used its on-premises it infrastructure to provide these ict-based services.
the library had dedicated computer servers for its email, institutional repository, library website, integrated library management system (lms), and online journal publishing system. the institution’s faculty profiles system is part of the indian research information system (https://irins.org/irins/), a web-based research information management service provided by the information and library network (inflibnet) centre. the library’s in-house servers were ageing, and they were even beginning to fail. also, managing the in-house servers with limited human resources was increasingly challenging. as a result, the library contemplated moving some of its services to a cloud platform. even smaller libraries had begun migrating their services to the cloud platform almost a decade ago.1 around 2016, the institution established the digits (digital campus and information technology services) office to conceive, plan, and create a best-in-class information technology and networking system and support operational excellence through agile it and networking services. to date, the digits office has, among other projects, successfully migrated more than 70 departmental email servers to a centrally managed cloud-based microsoft office 365 suite and developed and migrated the institute’s main website (https://www.iisc.ac.in) and more than 150 websites and 10 web portals of institution departments, centres, and other facilities to the microsoft azure platform. the digits office also creates and maintains virtual machines (vms) on the microsoft azure cloud platform for the institution’s departments and offices. migration of locally hosted it infrastructure to a cloud platform offers several benefits to the organization. 
these benefits include setting up virtual offices accessible from anywhere and at any time, avoiding capital investment in computing infrastructure, taking advantage of the cloud platform’s elastic computing resources, avoiding the necessity of having a dedicated it team, and, most importantly, minimizing downtime and loss of productivity and data. moreover, a cloud platform offers easy scalability, redundancy, and security. achieving these features in the traditional in-house hosting of computing infrastructure would be cost prohibitive.2 the library has configured three vms on the azure platform and has moved some of its ict-based services to the cloud platform. migrating ict-based services to the cloud platform has helped the library significantly reduce the downtime of computer servers.

cloud computing and its service models

cloud computing (cc) refers to computer hardware and software provided as a service by another company. the only requirement to access the cloud computing service is a device with access to the internet. some leading cc service providers include amazon web services (https://aws.amazon.com/what-is-aws/), microsoft azure (https://azure.microsoft.com/en-in/), and google cloud (https://cloud.google.com). there are three service models in cloud computing: software as a service (saas), platform as a service (paas), and infrastructure as a service (iaas). service providers host software applications on their cloud platforms in the saas model. examples of the saas model include google apps (https://workspace.google.com/) and microsoft office 365 (https://www.microsoft.com/en-in/microsoft-365). clients opting for the saas model need not worry about installation, setup, running, and maintaining the applications. service providers will do that for the clients.
paas provides a computing platform comprising an operating system, database, programming environment, and application programming interface. examples of the paas model include amazon elastic beanstalk (https://aws.amazon.com/elasticbeanstalk/), windows azure (https://azure.microsoft.com/en-in/), and google compute engine (https://cloud.google.com/compute). in the iaas service model, clients can obtain computing infrastructure, virtual machines, networking, and storage components on demand, delivered over the internet. examples of the iaas model include google compute engine, amazon ec2, and microsoft azure. in coordination with the digits office, the library initially provisioned three vms on the microsoft azure cloud platform to migrate some of its it-based services. table 1 shows the initial hardware configurations of each of the three vms.

table 1. vm types and their system configurations along with the cost

virtual machine (vm) | vm type (1) | vcpus & ram (gb) | cost / month (usd) (2) | os disk (gb) & type | secondary storage disk (gb) & type | storage cost / month (usd) (3) | os
vm1 (ir services) | standard f4s_v2 | 4, 8 | 140 | 400 (ssd) | 600 (ssd) | 114 | cent 7.x
vm2 (ilms) | standard d4s_v3 | 4, 16 | 148 | 300 (ssd) | 200 (ssd) | 57 | cent 7.x
vm3 (website) | standard f4s | 4, 8 | 140 | 300 (ssd) | 200 (ssd) | 57 | ubuntu 18.x

(1) as of 2018 and subject to change with time.
(2) cost as prevalent in 2018.
(3) cost as prevalent in 2018.

a virtual machine (vm) is an on-demand and scalable computing resource available on cc platforms.
vms give better control over the computing environment without buying any underlying physical hardware. the microsoft azure platform offers various vm options, each optimized for different workloads. for example, the d-series azure vms provide a combination of vcpus (virtual cpus), memory, and temporary storage to meet the requirements associated with most production workloads. categories in the d-series of vms include ds-series, dds-series, and das-series. the f-series vms feature a higher cpu-to-memory ratio, are equipped with 2 gb ram and 16 gb of local solid-state drives (ssds) per cpu core, and are optimized for compute-intensive workloads. f-series vms are costlier than the corresponding d-series vms (https://docs.microsoft.com/en-us/azure/virtual-machines/sizes).

for the secondary storage, apart from the standard hard disk drives (hdds), vms support azure premium ssds and ultra-disk storage, depending on regional availability. the premium ssds are designed to support intensive input/output workloads. they are priced almost three times higher than the standard hdds. the standard disk capacity of an azure vm’s os disk is 30 gb, and it can be increased to the desired capacity. apart from the os disk, one can also have a required amount of secondary disk storage. the cost for the additional disk storage (both os and data) is independent of vm pricing, and it depends on storage type and capacity. a vcpu refers to a virtual central processing unit. a vm treats each vcpu as a single physical core.
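as a rough illustration of how the sizing and pricing figures above fit together, the following python sketch computes the ram available per vcpu and the combined monthly cost for each vm. the numbers are copied from table 1 (2018 prices); the class and function names (`VmPlan`, `ram_per_vcpu`, `monthly_total`) are hypothetical and introduced only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VmPlan:
    """One row of table 1 (hypothetical structure, for illustration only)."""
    name: str
    size: str
    vcpus: int
    ram_gb: int
    vm_cost_usd: int       # monthly vm cost, table 1
    storage_cost_usd: int  # monthly storage cost, billed separately from the vm


TABLE_1 = [
    VmPlan("vm1 (ir services)", "standard f4s_v2", 4, 8, 140, 114),
    VmPlan("vm2 (ilms)", "standard d4s_v3", 4, 16, 148, 57),
    VmPlan("vm3 (website)", "standard f4s", 4, 8, 140, 57),
]


def ram_per_vcpu(plan: VmPlan) -> float:
    """RAM (GB) available per virtual CPU."""
    return plan.ram_gb / plan.vcpus


def monthly_total(plan: VmPlan) -> int:
    """VM cost plus storage cost for one month (USD)."""
    return plan.vm_cost_usd + plan.storage_cost_usd


if __name__ == "__main__":
    for plan in TABLE_1:
        print(f"{plan.name}: {ram_per_vcpu(plan):.0f} gb ram per vcpu, "
              f"usd {monthly_total(plan)} per month")
    print("all vms: usd", sum(monthly_total(p) for p in TABLE_1), "per month")
```

because storage is billed independently of the vm, the two cost columns are kept as separate fields, so either can be changed without touching the other.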
migration of the library’s ict-based services to microsoft azure cloud platform

libraries have always been at the forefront of adopting emerging technologies, and this is true of cc technology as well. in an interview at the american library association annual meeting in anaheim, california, in june 2012, clifford lynch traced 30 years of interactions between libraries and new technologies.3 with the evolution of cc technologies, libraries have been using cc, mainly its saas and iaas service models, since 2009 to host their websites, library management systems (lms), and digital repositories.4

as a first step, during mid-2017, the digits office began migrating all the 70+ individual departmental email servers, including the library’s, to a centrally managed, cloud-based mailing solution using office 365 (now microsoft 365) exchange online. after the successful migration of all the email servers, the library shut down its email server. next, the library decided to migrate some of its locally hosted ict-based services to the cloud platform in a phased manner.

planning the migration process: considering a single vm or independent vms for each application

before undertaking the migration process, libraries need to consider what types of projects are good candidates for the cloud and what types are not.5 in the first phase of the cloud migration, the library decided to migrate the following services: (1) institutional repository services, (2) the library management system, and (3) the library website.

before the cloud migration, the library used three independent on-premises servers to host the above services. a sun fire computer server with intel xeon processor, 4 gb of ram, and 2 tb of secondary storage hosted the institutional repository service for research publications using eprints software and the electronic theses and dissertations service using dspace software.
the libsys library management system was hosted on an ibm server with intel xeon cpu e5-2620 v2 @ 2.10ghz processor, 16 gb ram, and 1 tb of secondary storage. the library website was hosted on an ibm thinkserver ts150 server with intel xeon cpu e3-1225 v5 @ 3.30ghz processor, 8 gb ram, and 1 tb of secondary storage. all three computer servers had been in use for nearly 10 years and were long overdue for replacement. as the library contemplated upgrading its ict infrastructure, the option of provisioning vms on the azure cloud platform through the digits office was a stimulus.

next, the library had to decide whether to go with a single vm with robust hardware configuration to host all three applications or to provide independent vms for each service. based on the experience gained from hosting two ir services on a single server, the library decided to go again with a single vm with robust hardware configuration to host the ir services. the lms is a critical application for managing all library functions; therefore, the library decided to host it on a separate vm. a third vm is used to host the library website. as the library eventually plans to move its other ict-based applications to the azure platform, it could distribute those applications across the existing three vms based on the utilization and load on each of them.

initially, the library opted for two vms in the f-series and one in the d-series with premium ssds for all three vms. after observing performance and price for about three months, the library had one of the two f-series vms (vm3) moved to the d-series and downgraded the os and data disk types of all three vms to standard disk drives. the data disk on vm3, hosting the library website, was dropped as the os disk capacity was more than adequate to run the service.
table 2 shows the revised vm types and their configurations. in april 2020, most of our students and faculty members started working off-campus because of the onset of the covid-19 pandemic. to facilitate seamless access to licensed online resources from off-campus, the library set up federated access through shibboleth sso.6 the library provisioned a new virtual machine (vm4) with a system configuration as indicated in table 2.

table 2. revised vm types and their system configurations along with the cost

virtual machine (vm) | vm type [1] | vcpus & ram (gb) | cost / month (usd) [2] | os disk (gb) & type | secondary storage (gb) & type | storage cost / month (usd) [3] | os
vm1 (ir services) | standard f4s_v2 | 4, 8 | 147 | 400 (ssd) | 600 (hdd) | 39 | centos 7.x
vm2 (ilms) | standard d4s_v3 | 4, 16 | 148 | 300 (ssd) | 200 (hdd) | 21 | centos 7.x
vm3 (website) | standard d2s_v3 | 4, 8 | 81 | 300 (hdd) | none | nil | ubuntu 18.x
vm4 (idp server) | standard f2s_v2 | 2, 4 | 71 | 300 (hdd) | none | nil | ubuntu 18.x

[1] as of 2019 and subject to change with time.
[2] cost as prevalent in 2019.
[3] cost as prevalent in 2019.

the cloud migration process

cloud migration is the process of moving applications and data from an organization’s on-premises computers to a cloud platform. before undertaking the migration process, the requisite software applications must be installed and configured on the vms. then, the data corresponding to each application must be backed up on the on-premises system and moved to the corresponding vms. coordination with the campus network support team is essential to ensure the vms are accessible on the internet with all the security measures in place. every application migrating to the cloud platform therefore has to go through this process. in the following sections, the authors briefly describe how the library migrated three of its ict-based services to the azure cloud. the library completed the entire migration process in about three months.
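the back-up-and-move step described above can be sketched in python as follows. this is a minimal sketch under stated assumptions, not the procedure the library actually followed: the article does not say how backups were packaged or verified, so the gzip-compressed tar archive, the sha-256 checksum, the directory names, and the function names are all assumptions made for illustration.

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path


def make_backup(source_dir: Path, archive_path: Path) -> Path:
    """Pack source_dir into a gzip-compressed tar archive (assumed format)."""
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname keeps only the directory name, not the full on-premises path.
        tar.add(str(source_dir), arcname=source_dir.name)
    return archive_path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_backup(archive_path: Path, target_dir: Path, expected_sha256: str) -> None:
    """Check the archive against the checksum taken before transfer, then unpack it."""
    if sha256_of(archive_path) != expected_sha256:
        raise ValueError("checksum mismatch: archive differs from the original backup")
    target_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(str(target_dir))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        source = root / "repository_data"  # hypothetical directory name
        source.mkdir()
        (source / "record.txt").write_text("sample full-text file\n")

        archive = make_backup(source, root / "backup.tar.gz")
        checksum = sha256_of(archive)  # recorded on the on-premises server
        restore_backup(archive, root / "vm", checksum)  # run on the vm after transfer
        print((root / "vm" / "repository_data" / "record.txt").read_text(), end="")
```

recording the checksum before the transfer and comparing it afterwards gives a simple way to confirm that a large backup file arrived intact before it is restored on the vm.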
migration process for the research publications repository

eprints@iisc, the institute’s institutional repository (ir), was established in 2002 and holds nearly 55,000 publications. it is one of the earliest repositories in this part of the world.7 the ir runs on eprints (https://www.eprints.org/uk/), the world-leading open-source digital repository platform. developed at the university of southampton for over 20 years, eprints has provided stable, innovative repository services across academia and beyond. eprints is stable, flexible, and reliable software, well suited to maintaining institutional repositories. before the migration, the library hosted the publications repository on an on-premises server for almost 17 years.

eprints software depends on other software, including apache web server with mod_perl (https://httpd.apache.org/), mysql/mariadb (https://mariadb.org/) relational database management system, perl programming language, and several perl modules. eprints software bundles many required perl modules, but some must be installed separately, depending on the underlying operating system (os). for example, for the eprints@iisc repository installed on a vm running centos, the library installed apache web server with mod_perl, mariadb relational database management system, and a few missing perl modules.

after installing all the dependent software, the library followed the steps listed below to migrate the publications repository to the vm on the azure cloud platform:

1. installed the latest version of the eprints (3.4.1) software on the vm and incorporated all the local customizations done at the configuration and code levels.
2. created a new repository and retained the existing repository name.
3. created a new mariadb database and assigned appropriate grant permissions to the database.
4. as the database structure had changed in the latest version of eprints, executed the necessary scripts built into the eprints software to update the eprints database structure.
5. imported customized institute-specific subject headings to override the default ones.
6. moved the database and full-text backup files to the vm using winscp, a free, open-source ftp client (https://winscp.net/eng/docs/introduction).
7. restored the backups, comprising the eprints mysql database and full-text files, to the corresponding locations on the vm’s file system and uncompressed them.
8. imported the mysql database into the new mariadb database on the vm.
9. regenerated all the static pages of the repository, abstracts of all the records, and browse views for the year, author, document type, and subject categories on the vm.
10. enabled hypertext transfer protocol secure (https) for log in and account creation links.
11. configured postfix (http://www.postfix.org/). postfix is a free and open-source mail transfer agent that routes and delivers electronic mail.
12. coordinated with the institute’s network support team to make necessary changes in the dns entries to reflect the vm’s new public ips and enable the vm to send and receive emails.
13. created crontab entries on the vm to run the cron jobs. cron is a time-based job scheduler in unix-like operating systems, and a cron job is a task that it runs on a schedule. some of the cron jobs include updating the latest records added to the repository, displaying the latest count of records in the repository, and updating the browse views of the repository.

migration process for the electronic theses and dissertations repository

established in 2005, etd@iisc, the institution’s electronic theses and dissertations repository, is one of the earliest etd repositories in this part of the world.8 the library uses dspace software (https://duraspace.org/dspace/) to maintain the etd repository. to date, the repository holds nearly 6,000 of the institute’s etds. before the migration, the library hosted the etd repository on an on-premises server for almost 13 years.

dspace software is dependent on several third-party software applications and tools, including java jdk, apache maven (https://maven.apache.org/), apache ant (https://ant.apache.org/), postgresql (https://www.postgresql.org/) or oracle (https://www.oracle.com/in/database) relational database management system, and apache tomcat servlet engine (http://tomcat.apache.org/). the library followed the steps listed below to migrate the etd repository to the vm on the azure cloud platform:

1. installed all the dspace-dependent software packages and the latest version of dspace software (version 6.3).
2. configured the dspace software to incorporate the native customizations.
3. created communities and collections to reflect the divisions and the corresponding departments and centres of the institute using java script.
4. set the access permission for each collection of the repository based on the users and groups belonging to the collection.
5. modified the metadata of the etds from the on-premises version using a script. the modified metadata was imported into the latest version of dspace.
6. copied and moved etd items comprising pdf files from the on-premises server to the vm.
7. enabled hypertext transfer protocol secure (https) for the etd site.
8. customized the default etd site for a better look and feel.
9. created crontab entries on the vm to run the cron jobs to take incremental backup and display the etd count on the landing page.

the new version of the dspace user registration system was modified to enable only people with an institute email id to register with etd@iisc. in addition, the registration process captures the registrant’s department and the division, which helps automate the process of assigning the registrant to a specific collection of the repository. therefore, a user will submit an etd only to the designated collection.

migration process for the libsys library management system

the library has been using libsys (https://www.libsys.co.in/), a commercial lms, for over 25 years. libsys is dependent on several other software applications, including wildfly application server (https://www.wildfly.org/), java jdk, and mysql (mariadb) relational database management system. the steps involved in migrating libsys to the cloud (vm2) are listed below:

1. installed all the libsys-dependent software components and the latest version of the libsys software on the vm.
2. restored the mariadb database backup.
3. made required changes in the libsys configuration files.
4. installed and configured postfix email transfer agent for email communication.
5. as the libsys service and the opac run on nonstandard ports, the library coordinated with the network support team to open the required communication ports on the vm.

migration process for the library website

the library uses drupal (https://www.drupal.org/), a content management system, to host its website. the steps involved in the migration process are listed below:

1. installed all the drupal-dependent software, including apache web server, mariadb, and php (https://www.php.net/), on the vm.
2. installed one of the latest versions of drupal using its web-based installer.
3. installed the required drupal plugins and the drupal theme.
4. restored drupal database backup on the vm.
5. installed and configured postfix email transfer agent for email communication.

after completing the migration processes, the library coordinated with the network support team to make changes in the domain name system (dns) to enable access to all three vms on the internet.

monitoring the azure virtual machines

azure monitor (am) for vms includes a set of performance charts that target several key performance indicators to determine how well a virtual machine performs. the charts show resource utilization over a period of time so that bottlenecks and anomalies can be identified, and the view can be switched to a list of vms showing resource utilization for a selected metric. while there are numerous elements to consider when dealing with performance, azure monitor for vms monitors critical operating system performance indicators related to the processor, memory, network adapter, and disk utilization. performance complements the health monitoring feature and helps expose issues that indicate a possible system component failure, support tuning and optimization to achieve efficiency, or support capacity planning (https://docs.microsoft.com/en-us/azure/azure-monitor/insights/vminsights-performance). am is accessible only to the cloud administrator.

based on the am charts, the library’s inference has been that the ir server (vm1) hosting publications and etd repositories needs capacity planning. cpu utilization is reaching maximum capacity quite frequently. therefore, the library plans to move the etd repository to an independent vm. the utilization of the ilms server (vm2) is less than optimal.
therefore, the library decided to migrate publication of the institution’s journal of the indian institute of science (jiisc) from on-premises hosting to the ilms server (vm2) on the azure cloud. for hosting jiisc on the azure cloud platform, the library uses the open journal systems (ojs) platform (https://pkp.sfu.ca/ojs/). ojs is open-source software for the management of peer-reviewed academic journals. ojs is dependent on other software and tools, including apache web server, mysql or postgresql, and php. the library used the virtual hosting concept to host multiple sites on a single vm (vm2). virtual hosting is a method of hosting multiple domain names on a single server.

benefits observed of migrating to the cloud

moving some of the ict-based services of the library to the azure cloud platform has resulted in the following benefits:

1. service reliability has improved significantly because the library no longer relies on ageing in-house servers.
2. cloud migration has made the library’s computing infrastructure more flexible. it can be scaled up or down as per the library’s requirement.
3. operational logs and usage metrics are easy to obtain.
4. alert rules can be set based on vm metrics.
5. the cloud hosting company’s managed services include periodic backups, ensuring that data is secure.
6. users can now quickly move between the library and home (or any other location) and access all their research.
7. another significant benefit of cloud computing for library users is sharing information easily.
libraries provide collaborative spaces within the building, but patrons also can use collaborative online spaces. onedrive provides access to online storage and allows sharing of files and folders among approved users.9

lessons learned during and after the migration process

initially, the library opted for two vms on azure’s f-series and one in the d-series with the ssd storage devices for all three vms. as stated above, the pricing of the azure vms depends on the hardware configuration of a specific type of vm series. for example, the f-series with a particular hardware configuration and ssd costs more than its counterparts on the d- or b-series with the same hardware configuration with a standard hdd. libraries should, therefore, have a clear understanding of the vm types and the corresponding costing aspect. after observing the vms’ performance and cost for a few months, the library moved one of the two vms from the f-series to the d-series and switched to standard hard disk drives for all three vms. the changeover did not result in any performance degradation, but the cost of the secondary storage came down to about one-third of its earlier level (compare tables 1 and 2).

the library maintains two institutional repositories, one for research publications using the eprints application and the other for theses and dissertations using the dspace application. the library decided to migrate both the repositories to a single vm. however, it turns out that this decision was not a prudent one, because the tomcat server running dspace crashes often, resulting in downtime for the etd service. the vm usage metrics reveal that the dspace application often utilizes nearly 100% of cpu capacity, which leads to the freezing of the tomcat server, resulting in the etd service becoming unresponsive. the library contemplates upgrading the hardware configuration or setting up the two repository applications on two vms. the library is also investigating whether the issue lies in the tomcat server configuration.
the graph shown in figure 1 is the screenshot of azure’s metric monitoring of the vm1 running eprints and dspace applications. the peaks in the graph represent the cpu usage by the dspace application. it is evident from the graph that quite frequently, the cpu usage of the dspace application is reaching 100%, which eventually leads to the freezing of the tomcat web server.

figure 1. screenshot representing the cpu usage of eprints and dspace applications.

the professional library staff administered and managed the on-premises ict-based services of the library. therefore, the library did not encounter any technical challenges during and after the cloud migration process. other libraries that intend to migrate their services to a cloud platform should ensure that the library staff entrusted with the migration process are comfortable working with the command prompt, especially in the linux operating system environment.

conclusions

it has been more than two years since some of the library’s ict-based services migrated to the microsoft azure platform. to date, the library has not experienced a single instance of its servers being down because of a power outage, network issue, or crashing of the vms. however, there have been issues with specific services like the apache tomcat servlet engine or the apache web server crashing, resulting in the corresponding application being unresponsive. such behaviour can occur when system resources, especially cpu and ram, are used to capacity. restarting the specific service will ensure that the corresponding application comes up. therefore, it is essential to keep track of the ram and cpu usage of the vms and upgrade them if the situation warrants such an action.
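one simple way to act on this advice is to flag a vm only when usage stays high for several consecutive readings, so that brief spikes do not by themselves prompt an upgrade. the python sketch below is an illustration only: the 90% threshold, the run length of three readings, and the function name are assumptions, and the readings are assumed to be cpu (or ram) usage percentages already collected at regular intervals.

```python
from typing import Sequence


def sustained_high_usage(samples: Sequence[float],
                         threshold: float = 90.0,
                         min_run: int = 3) -> bool:
    """Return True if at least `min_run` consecutive samples are >= `threshold`."""
    run = 0
    for value in samples:
        # Extend the current run of high readings, or reset it on a low reading.
        run = run + 1 if value >= threshold else 0
        if run >= min_run:
            return True
    return False


if __name__ == "__main__":
    # Hypothetical readings, one per monitoring interval (percent cpu usage).
    brief_spikes = [40.0, 95.0, 35.0, 98.0, 42.0]
    long_plateau = [40.0, 93.0, 97.0, 100.0, 99.0]
    print("brief spikes flagged:", sustained_high_usage(brief_spikes))
    print("long plateau flagged:", sustained_high_usage(long_plateau))
```

the threshold and run length would need tuning against the vm’s own usage history; the values here are placeholders.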
the rapid elasticity characteristic of cloud computing lets organizations configure optimal computing resources based on actual requirements. based on the library’s initial experience running the ict-based services on a cloud platform, the authors suggest that deploying two different institutional repository software platforms like eprints and dspace on a single vm may not be a good idea. the tomcat instance powering the dspace site runs with higher cpu usage, swinging up to 100% and at times going beyond 100% cpu usage. the high cpu usage by the tomcat instance can eventually lead to its freezing, resulting in the corresponding service being inaccessible.

working with the vms demands some degree of familiarity in working in the command prompt. library staff who are not comfortable working in the command prompt will require additional training in getting used to the vm environment. in our case, the training was not necessary as the authors had adequate experience working in the linux operating system. also, library staff managing the cloud infrastructure need to coordinate with the organization’s networking and email support staff to make the vms accessible on the internet, set up email communications, and enforce security measures.

acknowledgements

the authors would like to thank the editor and the referees for their insightful comments and suggestions.

endnotes

1 robin r. hartman, “life in the cloud: a worldshare management services case study,” journal of web librarianship 6, no. 3 (2012): 176–85, https://doi.org/10.1080/19322909.2012.702612.
2 erik t. mitchell, “cloud computing and your library,” journal of web librarianship 4, no. 1 (2010): 83–86, https://doi.org/10.1080/19322900903565259.
3 clifford lynch, elke greifeneder, and michael seadle, “interactions between libraries and technology over the past 30 years,” library hi tech 30, no. 4 (2012): 565–78, https://doi.org/10.1108/07378831211285059.
4 yan han, “iaas cloud computing services for libraries: cloud storage and virtual machines,” oclc systems and services 29, no. 2 (2013): 87–100, https://doi.org/10.1108/10650751311319296.
5 denis galvin and mang sun, “avoiding the death zone: choosing and running a library project in the cloud,” library hi tech 30, no. 3 (2012): 418–27, https://doi.org/10.1108/07378831211266564.
6 francis jayakanth, ananda t. byrappa, and raja vishvanathan, “off-campus access to licensed online resources through shibboleth,” information technology and libraries 40, no. 2 (2021), https://doi.org/10.6017/ital.v40i2.12589.
7 francis jayakanth et al., “eprints@iisc: india’s first and fastest-growing institutional repository,” oclc systems and services: international digital library perspectives 24, no. 1 (2008): 59–70, https://doi.org/10.1108/10650750810847260.
8 jobish pitchet, filbert minj, and tarikere basappa rajashekar, “etd@iisc.: a dspace-based etdms and oai compliant theses repository service of indian institute of science,” in etd 2005: evolution through discovery, 8th international symposium on electronic theses and dissertations, 28–30 september 2005 (sydney, australia: the university of new south wales).
9 tom ipri, “where the cloud meets the commons,” journal of web librarianship 5, no. 2 (2011): 132–41, https://doi.org/10.1080/19322909.2011.573295.
the recon pilot project: a progress report november 1969-april 1970

henriette d. avram, kay d. guiles, lenore s. maruyama: marc development office, library of congress, washington, d. c.

a synthesis of the second progress report submitted by the library of congress to the council on library resources under a grant for the recon pilot project. an overview of the progress made from november 1969 to april 1970 in the following areas: production, official catalog comparison, format recognition, research titles, microfilming, investigation of input devices. in addition, the status of the tasks assigned to the recon working task force is briefly described.

introduction

an article was published in the june 1970 issue of the journal of library automation (1) describing the scope of the recon pilot project (hereafter referred to as recon) and summarizing the first progress report submitted by the library of congress (lc) to the council on library resources (clr).
recon is supported by the council, the u.s. office of education, and the library of congress. in order that all aspects of the project might be brought together as a meaningful whole, the various segments, regardless of the source of support, were covered in the second progress report and have been included in this article. in some instances, it has been necessary to introduce a section by repeating some aspects already reported in the june 1970 article in order to add clarity to the content of that section.

recon pilot project/ avram 231

progress-november 1969 to april 1970

recon production

the production operations of the recon pilot project are being handled by the recon production unit in the marc editorial office of the lc processing department. printed cards with 1968, 1969, and 7-series card numbers have been provided from the card division stock for recon input, and approximately 99,550 cards in the 1969 and 7-series have been received. using prescribed selection criteria the recon editors have sorted these cards and obtained approximately 27,150 eligible for recon input. approximately 150,000 cards in the 1968 series have also been received. the recon editors have sorted 60,000 of these cards and obtained approximately 24,000 records eligible for recon input. a large number of cards in these three series is already out of print, and replacement cards are being sent by the card division as soon as reprints are made.

each card eligible for recon input from the above-mentioned selection process is also checked against a computer produced index of card numbers for records in machine readable form. each number in the print index has a corresponding code to show on which machine readable data base the record resides.
the source codes are as follows:

m1-marc i data base
m2-marc ii, 1st practice tape
m3-marc ii, 2nd practice tape
m4-marc ii data base
m5-marc ii residual data base

(the two practice tapes contain records converted before the implementation of the marc distribution service to test the programs and input techniques.) the print index used for the final selection of the 1969 and 7-series card numbers contained only the records from m2-m5 (the marc i data base consists of the records converted during the marc pilot project which ended in june 1968). for the selection of the 1968 records, another print index had been produced which contains numbers for records on all five data bases. if the recon editors find a match on the print index, the appropriate source code is added to the printed card; these printed cards are then maintained in a separate file. (later in the project, the records in the data bases identified as m1 to m3 will be updated to conform with the current marc ii format and added to the recon data base.) the remaining cards for recon are reproduced on input worksheets and edited. to date, approximately 9,750 records in the 1969 and 7-series have been edited for recon.

232 journal of library automation vol. 3/3 september, 1970

recon records in the 1969 and 7-series are being input by a service bureau. the contractor uses ibm selectric typewriters equipped with an ocr typing mechanism, and the hard-copy sheets are run through an optical scanner. the output from the scanner is a magnetic tape which is processed by the contractor's programs to produce a tape in the marc pre-edit format. this tape is then sent to lc and processed by the marc system programs to produce a full marc record.

since the input for the retrospective conversion effort will be printed cards (or copies of printed cards from the card division record set), it will be necessary to compare these with their counterparts in the lc official catalog.
the printed card for each main entry in the official catalog will show if any changes have been made which did not warrant reprinting these cards to incorporate these changes. items on a printed card that could be noted in this fashion include changed subject headings, added entries, and call numbers. since these will be important access points in a machine readable catalog record, it was felt that such revisions should be reflected in the recon records. the recon report (2) contains a lengthy discussion of the various factors involved in the catalog comparison process, such as the percentage of change in relation to the age of the record, the difficulty in ascertaining any changes because of language, interpretation of cataloging rules, etc. to determine the most efficient and least costly method of catalog comparison, two recon editors were assigned to conduct an experiment to test eight different methods as follows:
1) print-out checked in alphabetic order-single group of 200 records.
2) proofsheets (already proofed) checked in worksheet (card number) order-group of 200 records in batches of 20.
3) proofsheets (not proofed) checked in worksheet (card number) order-group of 200 records in batches of 20.
4) proofsheets (already proofed) checked by mental alphabetization-group of 200 records in batches of 20.
5) proofsheets (not proofed) checked by mental alphabetization-group of 200 records in batches of 20.
6) worksheets before editing (not input) checked by mental alphabetization-group of 200 records in batches of 20.
7) worksheets before editing (not input) checked in alphabetical order-group of 200 records in batches of 20.
8) worksheets before editing (not input) checked in worksheet (card number) order-group of 200 records in batches of 20.
mental alphabetization means the searching of all the entries in a batch beginning with "a," then all the entries beginning with "b," etc., even though the batch is not in alphabetical order.
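the definition of mental alphabetization above can be illustrated with a minimal python sketch. it is only an illustration of the search order, not a procedure taken from the recon report; the function name and the sample entries are hypothetical, and it assumes every entry begins with a letter.

```python
from collections import defaultdict
import string


def mental_alphabetization_order(batch):
    """Return the entries of an unsorted batch in the order an editor
    would search them: all "a" entries, then all "b" entries, and so on.
    The batch itself is never fully sorted; within one letter the
    entries keep their original batch order.
    Assumption: every entry starts with a letter."""
    by_letter = defaultdict(list)
    for entry in batch:
        by_letter[entry[0].lower()].append(entry)

    search_order = []
    for letter in string.ascii_lowercase:
        search_order.extend(by_letter[letter])
    return search_order


# Hypothetical batch in worksheet (card number) order.
batch = ["brown, john", "adams, mary", "baker, ann", "avram, henriette"]
print(mental_alphabetization_order(batch))
```

the point of the sketch is that the editor makes one pass per initial letter rather than re-filing the batch, which is why the method could be compared against checking in strict alphabetical order.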
each editor used 200 records for each method, made the necessary corrections, and recorded the time required as well as the number of corrections made. figure 1 shows the average number of records checked in an hour using the eight different methods of catalog comparison. tables 1 and 2 give the estimated cost per record for each of the methods. in determining ...
[figure 1. average number of records checked in an hour for each of the eight methods of catalog comparison; bar values not recoverable]
table 4. input devices
columns: manufacturer | machine | model | keyboard configuration | display | record length (characters) | purchase price | monthly rent | remarks
cybercom | k/c | mark i | kp | none | 80 | $7970 | $145 | converter-$180/month
data action | k/c | 150 | kp | projection | 720 | $5900 | $155 | converter-$575/month
ibm | k/c | 50 | kp | backlight | [illegible] | $9605 | $175 | converter-$340/month
ibm | k/c | mtst v | t | printed | infinite | [illegible] | $100 | converter-$340/month
ibm | [illegible] | mtst iv | t | printed | infinite | [illegible] | $277 |
sycor | k/c | 301 | t | crt | 216 | $7000 | $150 | converter-$130/month
tycore | k/c | 8500 | kp | light-emitting diodes | 240 | $6000 | $120 | converter-$220/month
viatron | k/c | 21 | t/kp | crt | infinite | $1920 | $39 | many options affecting price
burroughs | k/m | n-7000 | kp | projection | 160 | $8400 to $12,200 | $165 to $277 |
honeywell | k/m | keytape | t/kp | backlight | 80-400 | $7500 to $33,000 | $148 to $735 | pooler for 2 stations-$200/month extra
keymatic | k/m | 1091 | t | backlight | infinite | $8750 | $166 | price is for basic 88 keys. 256 unique keys available as well as optional printer.
mai | k/m | 100-92 | kp | projection | 100 or 200 | $6400 | $160 | pooler for up to 8 stations-$40/month extra
mohawk | k/m | 6400 | kp | backlight | 80 | $8000 | $145 | pooler for 3 stations-$175/month extra
motorola | k/m | kb800 | kp | none | [illegible] | $8500 | none | pooler for 7 stations-purchase price $9700
potter | k/m | kdr | kp | bcd (bit) | 160 | $8100 | $165 | pooler for 3 stations-$45/month
sangamo | k/m | ds9100 | kp | backlight | 120 | $8200 | $177 | pooler for 10 stations-$247/month extra
vanguard | k/m | datascribe | kp | none | 200 | $8500 | $175 |
[one row illegible; legible fragments: crt, consoles, $78,000, $200]
computer entry system | k/t | 6000 | kp | none | 496 | $16,200 to $42,000 | $360 to $925 | two to 6 stations
mohawk | k/t | 9000 | kp | backlight | 80 | $53,000 to $145,000 | $1040 to $2840 | four to 16 stations
computer machinery | k/t | key processing | kp | backlight | 250 | $92,500 to $168,100 | $2055 to $4095 | eight to 32 stations
general computer systems | k/t | 2100 | t | printed | 200 | $81,240 to $273,120 | $2350 to $7885 | seven to 39 stations
inforex | k/t | key entry | kp | crt | 128 | $30,300 to $35,100 | $760 to $960 | four to 8 stations
penta associates | k/t | key logic | kp | backlight | 200 | $110,000 to $345,200 | $3000 to $8600 | eight to 64 stations
systems eng. | k/t | keytran | kp | none | 300 | $100,000 to $220,000 | $2875 to $6350 | nine to 48 stations
logic corp. | k/d | lc-720 | kp | crt | 350 | $148,000 to $300,000 | $2450 to $5800 | four to 16 stations
legend:
k/t = key to magnetic tape system
k/d = key to disk system
k/c = key to cassette
k/m = key to computer compatible magnetic tape
kp = key punch
t/kp = typewriter or key punch
t = typewriter
backlight = a matrix consisting of all individual characters that can be keyed. each character, as keyed, is displayed one at a time in its particular position in the matrix.
projection and light-emitting diodes = a one-character position dot matrix. each character, as keyed, is displayed one at a time in the same position.
bcd (bit) = lights displaying the bit position (on, off) of individual characters.
each character, as keyed, is displayed one at a time. (the prices quoted and the characteristics given of each device reflect the best information that could be obtained by the recon staff.)
... could be assigned to single keys and translated to their proper value by software, thus reducing the amount of keystroking required. the keymatic appears worth further investigation; therefore, the library may rent a device for several months for testing and evaluation. a typist will be trained in current marc/recon procedures and assigned to the keymatic as soon as her training period has been completed. the first month will be spent training on the keymatic prior to the actual input of recon records to obtain production and error rates and cost evaluation for comparison purposes. serious consideration was also given in the recon report to direct-read ocr equipment; however, at that time no equipment existed that offered the technical capability to perform the conversion of the lc record set. since then, preliminary investigation of the model 370 compuscan universal optical character reader proved interesting enough to continue further exploration of the device. the model 370 compuscan is a computer directed flying-spot scanner which matches the scanned portion of a character with a character described in the core memory of the computer. the manufacturer has examined a sample of lc printed cards selected at random over a period of twenty years and has concluded that although the hardware is sufficient to read the record set optically, significant software effort would be required. the results of the sampling indicated that the record set is not constituted entirely of "mint" cards, i.e., cards printed from the metal of the original linotype composition, but is composed of originals and reprints of the original.
when the stock of the original printing is close to depletion, the card is reprinted by photographing the card, and duplicates are made by a photo-offset process. as this cycle is repeated, the card for any one title could be several generations removed from the original. in some instances, a microscopic examination of the cards seems to indicate that the matrices used in the linotype composition were worn. because of these factors, what might appear as the same character to the naked eye would represent different pattern configurations to the scanner's core memory. the coarseness of the card surface may also cause variations in the same characters. lc cards have a high rag content in order to meet the archival standards required by libraries. the roughness of the surface does not affect the readability for the human but may cause variations in a given character when read by an optical scanner. another significant problem with lc cards concerns characters which touch, i.e., connections between what are intended to be distinct characters but are read by the scanner as one. for example, if a lower case "n" were next to a lower case "t" and the cross bar on the "t" touched the "n," the scanner would consider the combination of the "n" and the "t" as one character. software must be written to handle the variant character and the touching character problems. in the case of the touching characters, the machine must recognize some allowable limit of reading a single character, and when this limit is exceeded, the pattern read must be divided and matched against single-character patterns held in core. programs can be written so that if either of the above conditions occurs, the output on magnetic tape will be flagged for later spot checking, permitting the scanner to continue to operate at throughput speeds without human intervention.
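the handling of touching characters described above can be sketched in python as follows. this is a toy illustration under stated assumptions, not the compuscan software: the pattern table, the width limit, and the function name are all hypothetical, patterns are stood in for by short strings, and a variant character is simplified to "no exact match".

```python
# Hypothetical stand-ins for the single-character patterns "held in core".
KNOWN_PATTERNS = {"|-|": "n", "-|-": "t", "(_)": "o"}

# Hypothetical allowable limit (in pattern units) for one character.
MAX_SINGLE_WIDTH = 3


def read_blob(blob):
    """Return (text, flagged) for one scanned pattern.

    Within the limit: match against the known patterns; an unmatched
    (variant) pattern is emitted as "?" and flagged.
    Over the limit: divide the pattern and match each part against the
    single-character patterns; the result is always flagged so it can
    be spot-checked later without stopping the scanner."""
    if len(blob) <= MAX_SINGLE_WIDTH:
        char = KNOWN_PATTERNS.get(blob)
        if char is None:
            return "?", True
        return char, False

    for cut in range(1, len(blob)):
        left, right = blob[:cut], blob[cut:]
        if left in KNOWN_PATTERNS and right in KNOWN_PATTERNS:
            return KNOWN_PATTERNS[left] + KNOWN_PATTERNS[right], True
    return "?", True


print(read_blob("|-|"))      # a clean single character
print(read_blob("|-|-|-"))   # an "n" touching a "t"
```

the flag, rather than a halt, is the design choice the text emphasizes: doubtful reads are marked on the output tape and reviewed afterwards.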
the resultant magnetic tape would serve as input to the library's format recognition programs to reformat the scanner's output into the marc ii format. it has been estimated that the throughput speed of compuscan would be in the vicinity of 1800 cards per hour. the lc record set will be microfilmed according to the specifications required by the scanner. since the scanner operates with negative film, a very dark background with a very clear, white image is necessary. a tentative cost estimate of the microfilming and reading has been computed at approximately fifty cents per 1000 characters output on magnetic tape (approximately three lc cards). this price does not include the cost of the software. original printed "mint" cards will be used to test the device without implementing the required software, and depending on the results, investigation may be continued. the keying of the 1969 recon records has been performed by a contractor using an ibm selectric typewriter with the resulting hard copy fed through a farrington optical character reader. as part of the contractor's services to the library, production rates were monitored and reported. this gave lc the basis to compare two devices, the key-to-cassette used at the library of congress for the marc distribution service and the equipment used by the contractor for recon records. to make the comparison in table 5, it was necessary to determine the costs for each method using the techniques developed in the recon report (9). some modifications of cost were made to the original recon estimates because actual figures are now available. marc costs were obtained by dividing the costs of the manhours for typing and proofing in a given period by the number of records added to the marc master file in the same period. the equipment cost per record was also based on the number of records added to the master file. production rates associated with particular tasks were not used.
the manpower figures supplied by the contractor were limited to hourly production rates; therefore, to obtain the cost per record for ocr typing it was necessary to project the hourly rate to cover a manyear. the estimated annual production of a typist was then divided into the annual salary of a gs-4 (step 1) typist incremented by 8.5% for fringe benefits. the ocr equipment costs were computed on the basis of figures supplied by the contractor, assuming ownership of the ocr-font typewriter and service bureau rental of the scanner.
table 5. input costs per record
1. manpower
key to cassette method
typing: $ .45
proofing: .70
total: $1.15
ocr method
typing rate of contractor: 1,000 records in 104 hours, or 9.6 records per hour
typing cost at lc: ($5,522 + 8.5% ($5,522)) / (9.6 x 1,338) = $ .466
proofing rate of recon editors at lc: 1,534 records proofed in 173 hours, or 8.9 records per hour; 8.9 records per hour - 20% = 7.1 records per hour
proofing cost at lc: ($6,882 + 8.5% ($6,882)) / (7.1 x 1,338) = $ .786
typing: $ .466
proofing: .786
total: $1.25
2. equipment (costs do not include maintenance where applicable)
key to cassette
key to cassette monthly rental: $100.00
converter-monthly rental prorated over 10 key to cassettes: 26.00
total: $126.00
hourly cost (assumes 132 hours a month): $ .955
effective production rate of key to cassette: average weekly marc output / 4 key to cassette units = 1,005 / 120 = 8.4 records/hour
record cost of key to cassette and converter: $.955 / 8.4 = $ .114
ocr method
ocr-font typewriter
purchase price: $500.00
40-month amortization: 12.50/month
hourly cost (assumes 132 hours use): .095
effective production rate of ocr typewriter: (9.6 records/hour x 1,338 hours) / (132 hours x 12 months) = 8.1 records/hour
record cost of ocr typewriter: $.095 / 8.1 = $ .012
ocr scanner-service bureau
hourly rental: $50.00
10,000 lines/hour; each record is [number illegible] lines: 555 records/hour
record cost of ocr scanner: $ .09
total record cost for equipment: $.012 + $ .09 = $ .102
the cost of proofing in the ocr method was based on the recon experience at lc modified by contractor experience. in actual practice, ocr records are proofed and corrected by the contractor before they are proofed by recon editors. it was assumed that double proofing is unnecessary but that allowance should be made for the added difficulty of reading copy with a higher proportion of errors. (a preliminary study of errors on recon proofsheets has shown that there are fewer typographical errors on recon proofsheets than on current marc proofsheets.) for this reason, the number of recon records proofed in an hour has been decreased by 20% in the calculations. on the basis of the calculations in table 5, the comparative input costs are summarized as follows:
table 6. estimated input cost per record
columns: item | key-to-cassette | ocr
manpower: typing | $.45 | $.47
manpower: proofing | .70 | .78
equipment | .11 | .10
totals | $1.26 | $1.35
the final figures indicate that the two methods are very close in cost.
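the arithmetic behind the ocr column of table 5 can be rechecked with a short python sketch. the figures are those printed in the table; the variable and function names are hypothetical, and reading 1,338 as the productive hours in a typist's manyear is an interpretation of the table rather than something it states outright.

```python
HOURS_PER_MANYEAR = 1338   # assumption: productive hours per typist manyear
FRINGE = 0.085             # 8.5% fringe benefits
HOURS_PER_MONTH = 132      # equipment use assumed in table 5


def manpower_cost_per_record(annual_salary, records_per_hour):
    """Annual salary plus fringe, divided by estimated annual production."""
    annual_production = records_per_hour * HOURS_PER_MANYEAR
    return annual_salary * (1 + FRINGE) / annual_production


# Manpower, ocr method.
ocr_typing = manpower_cost_per_record(5522, 9.6)
reduced_proof_rate = round(8.9 * 0.8, 1)           # 8.9 less 20%, rounded as in the table
ocr_proofing = manpower_cost_per_record(6882, reduced_proof_rate)
ocr_proofing_no_reduction = manpower_cost_per_record(6882, 8.9)

# Equipment, key to cassette (unit rental plus prorated converter).
ktc_hourly = (100.00 + 26.00) / HOURS_PER_MONTH
ktc_equipment = ktc_hourly / 8.4

# Equipment, ocr method (owned typewriter plus service bureau scanner).
typewriter_hourly = (500.00 / 40) / HOURS_PER_MONTH
typewriter_rate = 9.6 * HOURS_PER_MANYEAR / (HOURS_PER_MONTH * 12)
typewriter_cost = typewriter_hourly / typewriter_rate
scanner_cost = 50.00 / 555
ocr_equipment = typewriter_cost + scanner_cost

for label, value in [
    ("ocr typing", ocr_typing),
    ("ocr proofing", ocr_proofing),
    ("ocr proofing at 8.9/hour", ocr_proofing_no_reduction),
    ("key to cassette equipment", ktc_equipment),
    ("ocr equipment", ocr_equipment),
]:
    print(f"{label}: ${value:.3f}")
```

because each manpower cost is inversely proportional to the hourly rate, a change in the assumed proofing rate moves the ocr total directly, which is why the comparison between the two methods is sensitive to that one assumption.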
as presently calculated, the key-to-cassette method is less expensive than the ocr method. it is easy to see that a slight change in any cost or production rate could make the ocr method less expensive. if the proofing rate of 8.9 records per hour were maintained instead of decreasing to 7.1 per hour, the ocr proofing cost would drop to $.63, and the total price for this proposed method would be $1.20. one way to test the assumption of the added difficulty of a single proofing would be to obtain uncorrected records from the contractor as a means of determining the actual proofing rate under that condition.
recon tasks
the four tasks that have been identified for study by the working task force are: 1) levels of completeness of marc records; 2) implications of a national union catalog in machine readable form; 3) conversion of existing data bases in machine readable form for use in a national bibliographic service; and 4) study of problems involved in any future distribution of name and subject cross reference control files. progress to date on the first three tasks is described in the following paragraphs.
task 1 has been completed, and an article summarizing the results of a report submitted to clr has been published in the journal of library automation, june 1970 (10). the following conclusions reached by this study are quoted from the article:
1) the level of a record must be adequate for the purposes it will serve.
2) in terms of national use, a machine readable record may function as a means of distributing cataloging information and as a means of reporting holdings to a national union catalog.
3) to satisfy the needs of diverse installations and applications, records for general distribution should be in the full marc ii format.
4) records that satisfy the nuc function are not necessarily identical with those that satisfy the distribution function.
5) it is feasible to define the characteristics of a machine readable nuc report at a lower level than the full marc ii format.
task 2 consists of an investigation of the implications of a national union catalog in machine readable form. a design of such a system is needed, and although the implementation of such a project is beyond the purview of the working task force, some of the technical and cost factors should be examined and defined for possible future research. as a framework for discussion purposes, a future reporting system for the national union catalog was postulated based on the present reporting system as follows:
contributors | present report form | future report form
lc | printed cards | lc marc data (for all records)
outside libraries | locally produced cards and lc cards | marc data (for all records) or records submitted to nuc to be keyed as machine readable records
the problems of the control number and library location symbols were considered, but a tentative decision was made that recommendations should be forthcoming when the american national standards institute sectional committee z39 has completed its work on library identification codes. the indicators and subfield codes to be included in the machine readable nuc records would depend on the optimum file arrangement of the suggested bibliographic listings. the library of congress is presently engaged in a filing rules study which should influence the inclusion or exclusion of particular content designators. task 2 is still in progress.
task 3 is the investigation of the possible utilization of other machine readable data bases for use in a national bibliographic store.
the task was divided into several subtasks as follows:
1) identification of useful data bases for the purposes described (content and bibliographic completeness);
2) cost of the conversion from a local format to a marc ii record;
3) cost of updating records not already in the lc data base for consistency and missing data by comparing the records with the library of congress official catalog;
4) cost of comparing the record for the existing lc machine readable records to eliminate duplicate records.
to satisfy the first subtask, a questionnaire was sent to 42 organizations. the information requested included:
1) availability of data bases-maintained by library or service bureau, and permission to copy data base.
2) use of the data base-for acquisitions, production of book catalog, circulation system, etc.
3) composition of data base-monographs, serials, technical reports, etc.
4) composition of data base-number of titles, imprint dates (primarily current, retrospective, etc.), language of records.
5) source of catalog data-marc distribution service, lc catalog card, local cataloging.
6) data elements for monographs.
7) format used in identifying data elements-marc i format, marc ii format, etc.
8) character set used.
the results from this survey were analyzed, and a follow-up letter was sent to 22 of the organizations, requesting further information as follows:
1) an estimate of the number of monographs added to the data base each year.
2) representative group of twenty-five entries for monographs including both fiction and non-fiction.
3) details on the character set used in the machine readable data base.
4) detailed specifications of monographic record format.
responses from this last letter have been received and analyzed. this analysis should identify a limited number of machine readable data bases that will be subjected to further content and cost analysis.
outlook
the recon project continues to be on schedule. the working task force has met several times for deliberations on the assigned tasks; in addition, members have been briefed on the progress of the pilot project and their advice has been sought. thus, individuals interested in the problems of bibliographic conversion guide the project throughout its development. the library of congress recon staff continues to maintain liaison with individuals and organizations working in any facet of the project's scope, hoping to bring all expertise possible to bear on the problems involved. it is significant, although not fully recognized at the onset of the recon project, that the solution to many of the problems under exploration will have impact on current conversion as well as retrospective conversion. this is evident at the library of congress where marc and recon, although staffed separately in the production area, share staff in the information systems office, and the project is known as marc/recon. coordination continues between the recon project and the card division mechanization project. the recon project director is the technical adviser for the card division project, and under her general direction, a computer analyst in the information systems office has been assigned full time to the project. the analyst has been given a detailed orientation to the procedures and computer programs for marc/recon and the specifications for the card division project. this exposure is necessary to guarantee that there is no duplication of effort between the two projects and that the design work for the card division project includes the possibility of a future national service for machine readable cataloging, both current and retrospective. (the marc distribution service is such a national service for english language monograph cataloging data, but what is assumed here is a service of a much broader scope.)
although progress has been made in many of the tasks included in recon, several methods of input described in the recon report can only be fully evaluated when the format recognition programs are implemented. according to present estimates, this should take place toward the end of 1970. much remains to be accomplished. the library of congress will continue to make its progress known as rapidly as possible, because the results of the pilot project will have great ramifications for the entire library community.
acknowledgments
the authors wish to thank the staff members associated with the recon pilot project in the technical processes research office and the marc editorial office in the library of congress processing department, and those in the information systems office, for their respective reports, which were incorporated into the progress report submitted to the council on library resources and which provided significant contributions to this paper.
references
1. avram, henriette d.: "the recon pilot project: a progress report," journal of library automation, 3 (june 1970).
2. recon working task force: conversion of retrospective records to machine-readable form (washington, d. c.: library of congress, 1969), pp. 32-33.
3. avram, henriette d., et al.: "marc program research and development: a progress report," journal of library automation, 2 (december 1969), 250-253.
4. recon working task force: op. cit., p. 31.
5. national microfilm association: glossary of terms for microphotography and reproductions made from micro-images. 4th rev. ed. (annapolis, md.: national microfilm association, 1966), p. 8.
6. ibid.
7. ibid., p. 52.
8. hawken, william r.: copying methods manual (chicago: library technology program, american library association, 1966), p. 243.
9. recon working task force: op. cit., pp. 58-59, 86, 93.
10. recon working task force: "levels of machine-readable records," journal of library automation, 3 (june 1970).
editorial
the authors of “the state of rfid applications in libraries,” that appeared in the march 2006 issue, inadvertently included two sentences that are near quotations from a commentary by peter warfield and lee tien in the april 8, 2005 issue of the berkeley daily planet. on page 30 immediately following footnote 24, the authors wrote: “the eugene public library reported ‘collision’ problems on very thin materials and on videos as well as false readings from the rfid security gates. collision problems mean that two or more tags are close enough to cancel the signals, making them undetectable by the rfid checkout and security systems.” warfield and tien wrote: “the eugene (ore.) public library reported ‘collision’ problems on very thin materials and on videos as well as ‘false readings’ from the rfid security gates. (collision problems mean that two or more tags are close enough to ‘cancel the signals,’ according to an american library association publication, making them undetectable by the rfid checkout and security systems.)” (accessed may 16, 2006, www.berkeleydailyplanet.com/article.cfm?archivedate=04-08-05&storyid=21128). the authors’ research notes indicated that it was a near quotation, but this fact was lost in the writing of the article. the article referee, the copy editors, and i did not question the authors because earlier in the same paragraph they wrote about the eugene public library experience and referred (footnote 23) to an earlier article in the berkeley daily planet. the authors and i apologize for this unfortunate error.
****
july 1, 2006 marked the merger of rlg and oclc. by the time this editorial appears, many words will already have been spoken and written about this monumental, twenty-first-century library event. i know what i think the three very important immediate effects of the merger will be. first, it is a giant step toward the realization of a global library bibliographic database.
second, taking advantage of rlg’s unique and successful programs and integrating them and their development philosophy as “rlgprograms,” while working alongside oclc research, seems a step so important for the future development of library technology that it cannot be overemphasized. third, and very practically, incorporating redlightgreen into open worldcat will give the library world a product that users might prefer over a search of google books or amazon. i requested and received quotes about the merger from the principals that i might put into this editorial that won’t appear until four months after the may 3 announcement. jay jordan, president and ceo, oclc, remarked: “we have worked cooperatively with rlg on a variety of projects over the years. since we announced our plans to combine, staff from both organizations have been working together to develop plans and strategies to integrate systems, products, and services. over the past several months, staff members have demonstrated great mutual respect, energy, and enthusiasm for the potential of our new relationship and what it means for the organizations we serve. there is much work to be done as we complete this transition. clearly, we are off to a good start.” betsy wilson, chair, oclc board of trustees, and dean of libraries, university of washington, wrote: “the response from our constituencies has been overwhelmingly supportive. over the past several months, we have finalized appointments for the twelve-person program council, which reports to . . . oclc through a standing committee called the rlg board committee. we are starting to build agendas for our new alliance. the members of this group from the rlg board are: james neal, vice president for information services and university librarian, columbia university; nancy eaton, dean of university libraries and scholarly communication, penn state university (and former chair of the oclc board); and carol mandel, dean of libraries, new york university. 
from oclc the members are elisabeth niggeman, director, deutschesbibliothek; jane ryland, senior scientist, internet 2; and betsy wilson, dean of university libraries, university of washington.” and from james michalko, currently president and ceo of rlg, and by the time you read this, vice president, rlg-programs development, oclc: “we are combining the practices of rlg and oclc in a very powerful way: by putting together the traditions of rlg and oclc we are creating a robust new venue for research institutions and new capacity that will provide unique and beneficial outcomes to the whole community.”
by now, all lita members and ital readers know that in 1967, fred kilgour founded oclc and was the founding editor of the journal of library automation (jola; vol. 1, no. 1 was published in march, 1968), which, with but a mild outcry from serials librarians, changed its title to information technology and libraries in 1982. this afternoon (6/15/06), i called fred. he and his wife eleanor reminisced about the earliest days, and then i asked him for his comments on the oclc-rlg merger. because he had had the first words about both oclc and jola, as it were, i told him that i would like for him to have the last. and this is what he said, “at long last!”
fred kilgour died on july 31, 2006, aged 92. a tribute posted by alane wilson of oclc may be read at http://scanblog.blogspot.com/2006/07/frederick-g-kilgour-1914-2006.html
editorial: a confession, a speculation, and a farewell
john webb
john webb (jwebb@wsu.edu) is a librarian emeritus, washington state university and editor of information technology and libraries.
editorial | webb 115
resource discovery: comparative survey results on two catalog interfaces
heather hessel and janet fransen
resource discovery: comparative survey results | hessel and fransen 21
abstract
like many libraries, the university of minnesota libraries-twin cities now offers a next-generation catalog alongside a traditional online public access catalog (opac). one year after the launch of its new platform as the default catalog, usage data for the opac remained relatively high, and anecdotal comments raised questions. in response, the libraries conducted surveys that covered topics such as perceptions of success, known-item searching, preferred search environments, and desirable resource types. results show distinct differences in the behavior of faculty, graduate student, and undergraduate survey respondents, and between library staff and non-library staff respondents. both quantitative and qualitative data inform the analysis and conclusions.
introduction
the growing level of searching expertise at large research institutions and the increasingly complex array of available discovery tools present unique challenges to librarians as they try to provide authoritative and clear searching options to their communities. many libraries have introduced next-generation catalogs to satisfy the needs and expectations of a new generation of library searchers. these catalogs incorporate some of the features that make the current web environment appealing: relevancy ranking, recommendations, tagging, and intuitive user interfaces. traditional opacs are generally viewed as more complex systems, catering to advanced users and requiring explicit training in order to extract useful data. some librarians and users also see them as more effective tools for conducting research than next-generation catalogs. academic libraries are frequently caught in the middle of conflicting requirements and expectations for discovery from diverse sets of searchers.
heather hessel (heatherhessel@yahoo.com) was interim director of enterprise technology and systems, janet fransen (fransen@umn.edu) is the librarian for aerospace engineering, electrical engineering, computer science, and history of science & technology, university of minnesota, minneapolis, mn.
information technology and libraries | june 2012 22
in 2002, the university of minnesota-twin cities libraries migrated from the notis library system to the aleph500™ system and launched a new web interface based on the aleph online catalog, originally branded as mncat. in 2006, the libraries contracted with the ex libris group as one of three development partners in the creation of a new next-generation search environment called primo. during the development process, the libraries conducted multiple usability studies that provided data to inform the direction of the product. participants in the usability studies generally characterized the primo interface as “clear” and “efficient.”1 a year later the university libraries branded primo as mncat plus, rebranded the aleph opac as mncat classic, and introduced mncat plus to the twin cities user community as a beta service. in august 2008, mncat plus was configured as the default search for the twin cities catalog on the libraries’ main website, with the libraries continuing to keep a separate link active to the aleph opac. a new organizational body called the primo management group was created in december 2008 to coordinate support, feedback, and enhancements of the local primo installation. this committee’s charge includes evaluating user input and satisfaction, coordinating communication to users and staff, and prioritizing enhancements to the software and the normalization process. when the primo management group began planning its first user satisfaction survey, the group noted that a significant number of library users seemed to prefer mncat classic.
therefore, two surveys were developed in response to the group’s charge. these two surveys were identical in scope and questions, except that one survey referenced mncat classic and was targeted to mncat classic searchers (appendix a), while the other survey referenced mncat plus and was targeted to mncat plus searchers (appendix b). these surveys were designed to produce statistics that could be used as internal benchmarks to gauge library progress in areas of user experience, as well as to assist with ongoing and future planning with regard to discovery tools and features.

research questions

in addition to evaluating user satisfaction and requesting user input, the primo management group also chose to question users about searching behaviors in order to set the direction of future interface work. questions directed toward searching behaviors were informed by the findings from a 2009 university of minnesota libraries report on making resources discoverable.2 the group surveyed respondents about types of items they expect to find in their searches, their interest in online resources, and the entry point for their discovery experience. the primo management group crafted the surveys to get answers to the following research questions:

• how often do users view their searching activity as successful?
• how often do users know the title of the item that they are looking for, as opposed to finding any resource relevant to their topic?
• what search environments do users choose when looking for a book? a journal? anything relevant to a topic?
• how interested are users in finding items that are not physically located at the university of minnesota?
• are there other types of resources that users would find helpful to discover in a catalog search?
although it can be tempting to think of the people using the catalog interfaces as a homogeneous group of “users,” large academic libraries serve many types of users. as wakimoto states in “scope of the library catalog in times of transition,”

on the one hand, we have ‘net-generation users who are accustomed to the simplicity of the google interface, are content to enter a string of keywords, and want only the results that are available online. on the other hand, we have sophisticated, experienced catalog users who understand the purpose of uniform titles and library of congress classifications and take full advantage of advanced search functions. we need to accommodate both of these user groups effectively.3

the primo management group planned to use the demographic information to look for differences among user communities; therefore the surveys requested demographic information such as role (e.g., student) and college of affiliation (e.g., school of dentistry). in designing the surveys, the group took into account the limitations of this type of survey as well as the availability of other sources of information. for example, the primo management group chose not to include questions about specific interface features because such questions could be answered by analyzing data from system logs. the group was also interested in finding out about users’ strategies for discovering information, but members felt that this information was better obtained through focus groups or usability studies rather than through a survey instrument.

research method

the primo management group positioned links to the user surveys in several online locations, with the libraries’ home page providing one primary entry point. clicking on the link from the home page presented users with an intermediate page, where they were given a choice of which survey to complete: one based on mncat plus, and the other on mncat classic.
if desired, users could choose to complete a separate survey for each of the two systems. links were also provided from within the mncat plus and mncat classic environments, and these links directed users to the relevant version of the survey without the intermediary page. in addition to the survey links in the online environment, announcements were made to staff about the surveys, and librarians were encouraged to publicize the surveys to their constituents around campus. the survey period lasted from october 1 through november 25, 2009. at the time of the surveys, the university of minnesota libraries was running primo version 2 and aleph version 19.

because participants were self-selected, the survey results represent a biased sample, are more extreme than the norm, and are not generalizable to the whole university population. participants were not likely to click the survey link or respond to e-mailed requests unless they had sufficient incentive, such as strong feelings about one interface or the other. thirty percent of respondents provided an e-mail address to indicate that they would be willing to be contacted for focus groups or further surveys, indicating a high level of interest in the public-facing interfaces the libraries employ. in considering a process for repeating this project, more attention would be paid to methodology to address validity concerns.

findings and analysis

findings relevant to each research question are discussed here. six hundred twenty-nine surveys contained at least one response: 476 for mncat plus and 153 for mncat classic.

responses by demographics

as shown in table 1, graduate students were the primary respondents for both mncat plus and mncat classic, followed by undergraduates and faculty members.
library staff made up 13 percent of mncat classic respondents and 4 percent of mncat plus respondents, although the actual number of library staff responding was nearly identical (twenty-one for mncat plus, twenty for mncat classic). library staff members were disproportionately represented in these survey responses and the group analyzed the results to identify categories in which library staff members differed from overall trends in the responses. questions about affiliation appeared at the end of the surveys, which may account for the high number of respondents in the “unspecified” category.

mncat classic respondents   frequency      mncat plus respondents   frequency
graduate student            50    33%      graduate student         176   37%
undergraduate student       31    20%      undergraduate student    110   23%
library staff               20    13%      faculty                   40    8%
faculty                     21    14%      staff (non-library)       28    6%
staff (non-library)         10     7%      library staff             21    4%
community member             2     1%      community member          11    2%
(unspecified)               19    12%      (unspecified)             90   19%
total                      153   100%      total                    476  100%

table 1. respondents by user population

a comparison of the student survey responses shows that graduate students were overrepresented, while undergraduates were underrepresented, at close to a reverse ratio. of the total number of graduate and undergraduate students, 62 percent of the respondents were graduate students, even though they accounted for only 32 percent in the larger population. conversely, undergraduates represented only 38 percent of the student respondents, even though they accounted for 68 percent of the graduate and undergraduate total. regrettably, the surveys did not include options for identifying oneself as a non-degree-seeking or professional student, so the analysis of students compared with overall population in this section includes only graduate students and undergraduates.

differences were also apparent in the representation of all four categories of students within a particular college unit.
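the 62 and 38 percent figures can be re-derived from the counts in table 1. the short python sketch below is illustrative only: the variable and function names are ours, and the population shares (32 and 68 percent) are taken from the text rather than recomputed from enrollment data.

```python
# respondent counts from table 1 (mncat classic + mncat plus)
grad = 50 + 176
undergrad = 31 + 110
students = grad + undergrad

grad_share = grad / students
undergrad_share = undergrad / students

# population shares as quoted in the text (not recomputed here)
population_share = {"graduate": 0.32, "undergraduate": 0.68}


def representation_ratio(sample_share, pop_share):
    """ratio above 1 means the group is overrepresented among respondents."""
    return sample_share / pop_share


grad_ratio = representation_ratio(grad_share, population_share["graduate"])
undergrad_ratio = representation_ratio(undergrad_share, population_share["undergraduate"])

print(f"graduate students: {grad_share:.0%} of student respondents, ratio {grad_ratio:.2f}")
print(f"undergraduates: {undergrad_share:.0%} of student respondents, ratio {undergrad_ratio:.2f}")
```

a ratio near 2 for graduate students and near 0.5 for undergraduates is what the text describes as "close to a reverse ratio."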
at least two college units were underrepresented in the survey responses: carlson school of management and the college of continuing education. one college unit was overrepresented in the survey results; 59 percent of the overall student respondents to the mncat classic survey, and 47 percent of the mncat plus students indicated that they were housed in the college of liberal arts (cla), and yet cla students only represent 32 percent of the total number of students on campus. table 2 shows the breakdown of percentages by college or unit and the corresponding breakdown by survey respondent, highlighting where significant discrepancies are evident.

columns: overall = twin cities overall percentage of students; classic = mncat classic student survey respondents; plus = mncat plus student survey respondents; +/- = difference from overall.

college or unit                                                 overall  classic  +/-    plus   +/-
carlson school of management                                    9%       0%       -9%    2%     -7%
center for allied health                                        0%       2%       +1%    1%     0%
col of educ/human development                                   10%      9%       -1%    14%    +3%
col of food, agr & nat res sci                                  5%       4%       0%     7%     +2%
coll of continuing education                                    8%       1%       -7%    1%     -7%
college of biological sciences                                  4%       6%       +2%    5%     0%
college of design                                               3%       3%       0%     3%     0%
college of liberal arts                                         32%      59%      +27%   47%    +15%
college of pharmacy                                             1%       1%       0%     0%     -1%
college of veterinary medicine                                  1%       1%       0%     1%     0%
graduate school                                                 0%       0%       0%     0%     0%
humphrey inst of publ affairs                                   1%       1%       0%     1%     0%
institute of technology (now college of science & engineering)  14%      9%       -5%    10%    -4%
law school                                                      2%       1%       -1%    1%     0%
medical school                                                  4%       2%       -3%    5%     0%
school of dentistry                                             1%       1%       0%     0%     -1%
school of nursing                                               1%       0%       -1%    0%     -1%
school of public health                                         2%       1%       -1%    3%     +1%

table 2. student responses by affiliation

faculty and staff together totaled only eighty-nine respondents on the mncat plus survey and fifty-one respondents on the mncat classic survey.
in keeping with graduate and undergraduate student trends, the college of liberal arts (cla) was clearly over-represented in terms of faculty responses. the cla faculty group represents about 17 percent of the faculty at the university of minnesota. yet over half the faculty respondents on the mncat plus survey were from cla; over 80 percent of the mncat classic faculty respondents identified themselves as affiliated with cla. faculty groups that were underrepresented include the medical school and the institute of technology.

perceptions of success

a critical area of inquiry for the surveys was user satisfaction and perceptions of success: “do users perceive their searching activity as successful?” asked in both surveys, the question’s responses allowed the primo management group to compare respondents’ perceived success between the two interfaces. results show a marked difference: while 86 percent of the mncat classic respondents reported that they are “usually” or “very often” successful at finding what they are looking for, only 62 percent of the mncat plus respondents reported the same perception of success. respondents reported very similar rates of success regardless of school, type of affiliation, or student status.

[figure 1: bar chart. mncat plus: rarely 14%, sometimes 24%, usually 44%, very often 18%. mncat classic: rarely 4%, sometimes 11%, usually 32%, very often 54%.]

figure 1. perceptions of success: mncat plus and mncat classic

these results should be interpreted cautiously. because mncat plus is the libraries’ default catalog interface, mncat classic users are a self-selecting group whose members make a conscious decision to bookmark or click the extra link to use the mncat classic interface. one cannot assume that mncat users in general also would have an 86 percent perception of success were they to use mncat classic; familiarity with the tool could play a part in mncat classic users’ success.

another possible factor in the reported difference in user success is the higher proportion of known-item searching (finding a book by title) occurring in mncat classic. a user’s criteria for success differ when searching for a known item versus conducting a general topical search. it is easier for a searcher to determine that they have been successful in a situation where they are looking for a specific item. some features of mncat classic, such as the start-of-title and other browse indexes, are well suited to known-item searching and had no direct equivalent in mncat plus, which defaults to relevance-ranked results. (primo version 3 has implemented new features to enhance known-item searching.)

comments received from users suggest that several factors played a role. one mncat classic respondent praised the “precision of the search...not just lots of random hits” and noted that mncat classic supports a “[m]ore focused search since i usually already know the title or author.” in contrast, a mncat plus respondent commented that the next-generation interface was “great for browsing topics when you do not have a specific title in mind.” this comment is consonant with the results from other usability testing done on next-generation catalogs. in “next generation catalogs: what do they do and why should we care?”, emanuel describes observed differences between topical and known-item searching: “during the testing, users were generally happy with the results when they searched for a broad term, but they were not happy with results for more specific searches because often they had to further limit to find what they wanted in the first screen of results.”4 a common characteristic of next-generation catalogs is that they return a large result set that can then be limited using facets.
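to make the “search first, limit later” pattern concrete, the following python sketch counts facet values over a result set and then narrows the set by the facets a searcher selects. it is a generic illustration, not primo’s implementation: the records, the field names (format, online), and the function names are all hypothetical.

```python
from collections import Counter

# hypothetical result set for a broad keyword search; field names are assumptions
results = [
    {"title": "introduction to aerodynamics", "format": "book", "online": False},
    {"title": "aerodynamics for engineers", "format": "book", "online": True},
    {"title": "journal of aircraft", "format": "journal", "online": True},
    {"title": "wind tunnel testing", "format": "video", "online": True},
]


def facet_counts(records, field):
    """count how many records fall under each value of one facet."""
    return Counter(record[field] for record in records)


def limit(records, **selected):
    """keep only the records that match every selected facet value."""
    return [r for r in records if all(r[k] == v for k, v in selected.items())]


# step 1: show the searcher the available facet values and their counts
print(facet_counts(results, "format"))

# step 2: the searcher narrows the large result set after the fact
narrowed = limit(results, format="book", online=True)
print([r["title"] for r in narrowed])
```

a known-item searcher has to take the second step before the wanted title surfaces, which is the extra effort the quoted usability study reports.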
training and experience may also explain some of the differences in success. mncat plus also enables functionality associated with the functional requirements for bibliographic records (frbr), which is intended to group items with the same core intellectual content in a way that is more intuitive to searchers. however, this feature is unfamiliar to traditional catalog searchers and requires an extra step to discover very specific known-items in primo. one mncat plus user expressed dissatisfaction and added, “i'm not sure if it's my lack of training/practice or that the system is not user-friendly.” in focus group analyses conducted in 2008, oclc found that “when participants conducted general searches on a topic (i.e., searches for unknown items) that they expressed dissatisfaction when items unrelated to what they were looking for were returned in the results list. end users may not understand how to best craft an appropriate search strategy for topic searches.”5

how often do users know the title of the item that they are looking for?

users come to the library with different goals in mind. in “chang's browsing,” available in theories of information behavior, chang identified five general browsing themes,6 adapted to discovery by carter.7 for the purposes of the survey, the primo management group grouped those themes into two goals: finding an item when the title is known, and finding anything on a given topic. the primo management group had heard concerns from faculty and staff that they have more difficulty finding an item when they know the title when using mncat plus than they did with mncat classic. the group was interested in knowing how often users search for known items. to explore this topic and its impact on perceptions of success, the surveys included two questions on known-item and topical searching.
the survey results shown in table 3 indicate that a significantly higher proportion of mncat classic respondents (30 percent plus 43 percent = 73 percent) than mncat plus respondents (24 percent plus 29 percent = 53 percent) were “very often” or “usually” searching for known items. it may be that users in search of known items have learned to go to mncat classic rather than mncat plus.

                                    rarely     sometimes   usually     very often   total
i already know the title of the item i am looking for
  mncat classic                     7% (11)    19% (29)    30% (46)    43% (66)     152
  mncat plus                        15% (69)   33% (151)   24% (111)   29% (132)    463
i am looking for any resource relevant to my topic
  mncat classic                     14% (21)   32% (47)    20% (29)    34% (51)     148
  mncat plus                        14% (62)   29% (133)   29% (133)   28% (127)    455

table 3. responses to “i already know the title of the item i am looking for”

when the primo management group considered how often researchers in different user roles searched for known items versus anything on a topic, clear patterns emerged as shown in figure 2. in the mncat plus survey, only 34 percent of undergraduate mncat plus searchers “usually” or “very often” search for a particular item, versus 74 percent of faculty. conversely, 75 percent of undergraduate respondents “usually” or “very often” search for any resource relevant to a topic, versus 37 percent of faculty. graduate student respondents showed interest in both kinds of use. if successful browsing by topic is best achieved using post-search filtering, it may help to explain differences between undergraduate students and faculty.
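the known-item comparison in table 3 can be checked from the raw counts. the article does not say which statistical test, if any, was applied, so the python sketch below uses a pooled two-proportion z-test purely as an assumption on our part; because respondents were self-selected, the resulting p-value is descriptive rather than inferential.

```python
import math


def two_proportion_z(success_a, n_a, success_b, n_b):
    """pooled two-proportion z-test; returns both shares, z, and a two-sided p-value."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # two-sided tail probability of the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_a, p_b, z, p_value


# "usually" + "very often" counts for the known-item question in table 3
classic_known, classic_total = 46 + 66, 152
plus_known, plus_total = 111 + 132, 463

p_classic, p_plus, z, p_value = two_proportion_z(
    classic_known, classic_total, plus_known, plus_total
)
print(f"mncat classic: {p_classic:.1%}, mncat plus: {p_plus:.1%}")
print(f"z = {z:.2f}, two-sided p = {p_value:.2g}")
```

computed directly from the counts, the mncat plus share is about 52.5 percent; the 53 percent quoted in the text comes from adding the already rounded 24 and 29 percent figures.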
the analysis of usability testing done on other next generation catalogs described in “next generation catalogs: what do they do and why should we care?” states that “users that did not have extensive searching skills were more likely to appreciate the search first, limit later approach, while faculty members were faster to get frustrated with this technique.”8 results for all mncat classic respondents showed a preference for known item searching, but undergraduate students still indicated that they search more for anything on the topic and less for known items than faculty respondents. no significant differences were identified by discipline.

figure 2. searching for a known item vs. any relevant resource

some qualitative comments from survey takers suggest that respondents view the library interface as a place to go to find something already known to exist, e.g., “i never want to search by topic. library catalogs are for looking up specific items.” however, with respect to discovering resources for a subject in general, both mncat classic and mncat plus respondents showed that they would also like to find items relevant to their topic (figure 2). there was no significant difference between mncat classic and mncat plus respondents on this question; in both environments, only 14 percent of the users said that they would “rarely” be interested in general results relevant to their topic.

perceptions of success by specific characteristics

for mncat plus, the majority of respondents “somewhat agree” or “strongly agree” that items available online or in a particular collection are easy to find. one-third of the mncat plus respondents had never tried to find an item in a particular format. over 40 percent had never tried to find an item with a particular isbn/issn.
interface features may be a factor here: isbn/issn searching is not a choice in the mncat plus drop down menu, so users may not know that they can do such a search. a higher percentage of mncat classic respondents “strongly agree” that it is easy to find items by collection, available online, or in a particular format, than mncat plus respondents. figure 3 shows results based on particular characteristics.

figure 3. perception of success by characteristic

although the surveys were primarily intended to gather reactions from end users, some interesting data emerged about usage by library staff. as demonstrated in figure 4, library staff respondents were much more likely to have performed the specific types of searches listed in this section than users generally, and reported a much higher rate of perceived success with mncat classic.

figure 4. perception of success by characteristic: library staff

searching by location: local collections and other resources

in a large research institution with several physical library locations and many distinct collections, users need the ability to quickly narrow a search to a particular collection. but even the largest institution cannot collect everything a researcher might need. the primo management group wondered not only whether users felt successful when they looked for an item in a particular collection but also wanted to explore whether users want to see items not owned by the institution as part of their search results. finding items among the many library locations was not a problem for either mncat plus or mncat classic respondents: 72 percent either somewhat or strongly agreed that it is easy to find items in a particular collection using mncat.
furthermore, survey respondents of both interfaces agreed that they are interested in items no matter where the items are, which underlines the value of a service such as worldcat; 73 percent of mncat plus respondents and 78 percent of mncat classic respondents expressed a preference for seeing items held by other libraries, knowing they could request items using an interlibrary loan service if necessary.

preferred search environments

three of the survey questions asked users about their preferred search environments for different searching needs:

• when looking for a particular book
• when looking for a particular journal article
• when searching without a particular title in mind

each survey presented respondents with a list of choices and space to specify other sources not listed. respondents were encouraged to mark as many sources as they regularly use. when searching for a specific book, users of the two catalog environments identified a number of other sources. the top five sources in each survey are listed in table 4.

when i am looking for a specific book, i usually search (check all that apply):

mncat classic respondents (frequency)    mncat plus respondents (frequency)
1. mncat classic (116)                   1. mncat plus (217)
2. worldcat (50)                         2. google (165)
3. amazon (50)                           3. mncat classic (163)
4. google (49)                           4. amazon (160)
5. google books (31)                     5. google books (108)

table 4. search environment for books

qualitative comments indicated that users like being able to connect to amazon and google books in order to look at tables of contents and reviews. they also specifically mentioned barnes and noble, as well as other local libraries. these results show that mncat plus respondents were more likely to also use mncat classic than vice-versa. the data do not suggest why this would be the case, but familiarity with the older interface may play a role.
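rankings such as those in table 4 come from tallying a “check all that apply” question, where each respondent can contribute to several rows. the python sketch below shows one way to produce such a ranking; the sample responses and function names are hypothetical and are not drawn from the survey data.

```python
from collections import Counter

# hypothetical raw responses: each respondent may tick several sources
responses = [
    {"mncat plus", "google", "amazon"},
    {"mncat plus", "mncat classic"},
    {"google", "google books"},
    {"mncat plus", "amazon", "worldcat"},
]


def tally(all_responses):
    """count how many respondents ticked each source."""
    counts = Counter()
    for ticked in all_responses:
        counts.update(ticked)
    return counts


def top_sources(all_responses, n=5):
    """rank sources by count, breaking ties alphabetically for a stable order."""
    ranked = sorted(tally(all_responses).items(), key=lambda item: (-item[1], item[0]))
    return ranked[:n]


for rank, (source, count) in enumerate(top_sources(responses), start=1):
    print(f"{rank}. {source} ({count})")
```

because one respondent can tick several boxes, the counts sum to more than the number of respondents, so any percentage should be taken over respondents rather than over total ticks.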
mncat classic respondents were more likely than mncat plus users to return to their search environment when searching for a particular book (82 percent versus 53 percent). one mncat plus respondent commented “i didn't know i could still get to mncat classic.”

when searching for a specific journal article, users of both systems chose “other databases (jstor, pubmed, etc.)” above all the other choices. even more respondents would likely have marked this choice if not for confusion over the term “other databases.” most of the comments mentioned specific databases, even when the respondent had not selected the “other databases” choice. one user commented, “most of these choices would be illogical. you don't list article indexes, that's where i go first.” table 5 lists the five responses marked most often for each survey.

when i am looking for a specific journal article, i usually search (check all that apply):

mncat classic respondents (frequency)           mncat plus respondents (frequency)
1. other databases (jstor, pubmed, etc.) (92)   1. other databases (jstor, pubmed, etc.) (232)
2. mncat classic (53)                           2. google scholar (131)
3. google scholar (40)                          3. e-journals list (130)
4. e-journals list (34)                         4. mncat plus (110)
5. google (29)                                  5. mncat plus article search (101)

table 5. search environment for articles

qualitative comments from respondents indicated that interfaces would be more useful if they helped users find online journal articles. this raised some questions with regard to mncat plus, which includes a tab labeled “articles” for conducting federated article searches. however, mncat plus respondents noted that they used the plus “articles” search almost as much as they did mncat plus. other plus comments included:

i tried to use this for journal articles but it only has some in the database i guess and when i did my search it only found books and no articles. i don't understand it. i tried this new one and it came up with wierd [sic] stuff in terms of articles. my professor said to give up and use the regular indexes because i wasn't getting what i needed to do the paper. it wasted my time.

this desire for federated search coupled with the expressions of dissatisfaction with the existing federated search platform is consistent with the mixed opinions expressed in other studies, such as sam houston state university’s assessment of use of and satisfaction with the webfeat federated search tool. that study found “[f]ederated search use was highest among lower-level undergraduates, and both use and satisfaction declined as student classification rose.”9 the new search tools that contain preindexed articles, such as primo central, summon, worldcat local, and ebsco discovery service, may address the frustrations that more experienced searchers express regarding federated search technology.

when researching a topic without a specific title in mind, “google” and “other databases” were nearly equal and ranked first for mncat plus respondents, while “other databases” ranked first for mncat classic respondents. table 6 lists the five responses marked most often for each survey.

when i am researching a topic without a specific title in mind, i usually search (check all that apply):

mncat classic respondents (frequency)           mncat plus respondents (frequency)
1. other databases (jstor, pubmed, etc.) (84)   1. google (197)
2. mncat classic (76)                           2. other databases (jstor, pubmed, etc.) (192)
3. google (63)                                  3. google scholar (155)
4. google scholar (47)                          4. mncat plus (145)
5. worldcat (32)                                5. mncat classic (101)

table 6. search environment for topics

significant differences based on school affiliation were evident in the area of preferred search environments for topical research. for example, institute of technology respondents reported using google much more often when researching without a specific title in mind than respondents in other areas.
evidence from the health sciences is limited in that only seven percent of respondents in total identified themselves as being from this area. however, these limited results show that health sciences respondents relied more on library databases than on google. respondents in the liberal arts relied more on mncat, in either version, than did respondents in the other fields.

desired resource types

one feature of the primo discovery interface is its ability to aggregate records from more than one source. university libraries maintains several internal data sources that are not included in the catalog, and the possibility of including some of these in the mncat plus catalog has been considered many times since primo’s release. the primo management group was interested to hear from users whether they would find three types of internal sources useful: research reports and preprints, online media, and archival finding aids. the group also asked users to mark “online journal articles” if they would find article results helpful. the question did not specify whether journal articles would appear integrated with other search results in a mncat “books” search or in a separate search such as that already provided through a metasearch on the mncat plus articles tab.

the surveys asked users what kinds of resources would make mncat more useful. the results for both mncat plus and mncat classic were similar and response counts for both surveys were ordered as shown in table 7. respondents could mark more than one of the choices.

i would find mncat more useful if it helped me find:              mncat classic frequency   mncat plus frequency
online journal articles                                            65                        255
u of m research materials (e.g., research reports, preprints)      34                        149
online media (e.g., digital images, streaming audio/visual)        27                        134
archival finding aids                                              27                        90

table 7. desired resource types

the primo management group noted that mncat plus respondents chose “online journal articles” more frequently than the other categories even though the mncat plus interface includes an “articles” tab for federated searching. it is unclear whether the respondents were not seeing the “articles” tab in mncat plus because they would like to see search results integrated, or if they were using the “articles” tab and were not satisfied with the results.

comments from respondents generally supported the inclusion of a wider range of resources in mncat. however, several respondents also expressed concerns about the trade-offs that might be involved in providing wider coverage. one user liked the idea of having the databases “all … in one place,” but added that “it would have to just give you the stuff that you need.” several users cited the varying quality of the material discovered through library sources. one user supported the inclusion of articles “if it included good articles and not the ones i got.” a mncat classic respondent gave the variable quality of the material he or she had found through a database search as a reason for leaving the coverage of mncat as it is: “i use the best sources depending on my needs.” another mncat classic user expressed doubt that coverage of all disciplines was feasible.

in commenting on the content of mncat, respondents also mentioned specific types of material that they wanted to see (e.g., archives of various countries), as well as difficulties with particular classes of material (“the confusing world of government documents”). one mncat plus user related his or her interest in public domain items to a specific item of functionality that would enhance their discovery, namely a date sort. in general, the interest in university of minnesota research material was fairly high.
however, faculty members ranked university of minnesota research materials last in terms of preference: only twelve faculty respondents chose the option, out of sixty-one total faculty respondents.

conclusions

the data from two surveys, conducted concurrently in 2009 on a traditional opac (mncat classic) and next-generation catalog (mncat plus), point to differences in the use and perceptions of both systems. there appeared to be fairly strong “brand loyalty” with mncat classic, given that this interface is no longer the default search for the libraries. surveys for both systems suggest a perception of success that is lower than desirable and that there is room to improve the quality of the discovery experience. it is unclear from the data if the reported perceptions of success were the result of the systems not finding what the user wants, or if the systems did not contain what the user wanted to find.

mncat classic respondents were more likely to use worldcat to find a specific book than mncat plus respondents. mncat plus respondents indicated a use of mncat classic, but not vice versa. both sets of surveys described use of amazon and google for discovery. mncat plus respondents reported lower rates of success at finding known items than mncat classic respondents. mncat classic respondents were far more likely to have a specific title in mind that they wanted to obtain; half of the mncat plus respondents reported having a specific title in mind.

the team that examined the survey responses found that the data suggested several key attributes that should be present in the libraries’ discovery environment. further discussion of the results and suggested attributes was conducted with library staff members in open sessions. results also informed local work on improving discovery interfaces.
the results suggested:

• the environment should support multiple discovery tasks, including known-item searching and topical research.
• support for discovery activity should be provided to all primary constituent groups, noting the significant survey response by graduate student searchers.
• users want to discover materials that are not owned by the libraries, in addition to local holdings.
• a discovery environment should make it easy for users to find and access resources in vendor-provided resources, such as jstor and pubmed.

while the results of the 2009 surveys provided a valuable description of usage, the survey team recognized that methodological choices limit the usefulness in applying results to a larger population. the team also recognized that there were a number of questions yet unanswered. some of these outstanding questions present opportunities for future research and suggest that a variety of formats might be useful, including surveys, focus groups, and targeted interviews.

• to what extent do users expect to find integrated search results among different kinds of content, such as articles, databases, indexes, and even large scale data sets?
• what general search strategies do users use to navigate the complex discovery environment that is available to them, and where are the failure points?
• how much of the current environment requires training and how much is truly intuitive to users?
• how can the university libraries identify and serve users who did not complete the surveys?
• how useful would users find targeted results based on a particular characteristic such as role, student status, or discipline?

since the surveys were conducted, the university libraries upgraded to primo version 3, which included features to address some of the concerns respondents identified in the surveys, such as known-item searching.
primo version 3 allows users to conduct a left-justified title search (“title begins with…”), as well as sort by fields such as title and author. once the new version has been in place long enough for users to develop some comfort with the interface, the primo management group intends to resolve methodological issues and repeat its surveys, measuring users’ reactions against the baseline data set in the 2009 surveys.

acknowledgements

we would like to thank the other members of the primo management group, who helped to design and implement the surveys, as well as analyze and communicate the results: chew chiat naun (chair), susan gangl, connie hendrick, lois hendrickson, kristen mastel, r. arvid nelsen, and jeff peterson. we also want to acknowledge the helpful feedback and guidance of the group’s sponsor, john butler.

references

1 tamar sadeh, “user experience in the library: a case study,” new library world 109, no. 1/2 (2008): 7–24.
2 cody hanson et al., discoverability phase 1 final report (minneapolis: university of minnesota, 2009), http://purl.umn.edu/48258/ (accessed dec. 20, 2010).
3 jina choi wakimoto, “scope of the library catalog in times of transition,” cataloging & classification quarterly 47, no. 5 (2009): 409–26.
4 jenny emanuel, “next generation catalogs: what do they do and why should we care?” reference & user services quarterly 49, no. 2 (winter, 2009): 117–20.
5 karen calhoun, diane cellentani, and oclc, online catalogs: what users and librarians want: an oclc report (dublin, ohio: oclc, 2009).
6 shan-ju chang, “chang's browsing,” in theories of information behavior, ed. karen e. fisher, sandra erdelez and lynne mckechnie, 69–74 (medford, n.j.: information today, 2005).
7 judith carter, “discovery: what do you mean by that?” information technology & libraries 28, no. 4 (december 2009): 161–63.
8 jenny emanuel, “next generation catalogs: what do they do and why should we care?” reference & user services quarterly 49, no.
2 (winter, 2009): 117–20.
9 abe korah and erin dorris cassidy, “students and federated searching: a survey of use and satisfaction,” reference & user services quarterly 49, no. 4 (summer 2010): 325–32.

appendix a. mncat classic survey

the library catalog is intended to help you find an item when you know its title, as well as suggest items that are relevant to a given topic. we’d like to know how often you use mncat classic for these different purposes.

1. when i visit mncat classic… (scale: very often / usually / sometimes / rarely)
- i already know the title of the item i am looking for
- i am looking for any resource relevant to my topic

many people use tools other than the library catalog to find books, articles, and other resources. for the different situations below, please tell us what other tools you find helpful.

2. when i am looking for a specific book, i usually search (check all that apply):
[ ] amazon
[ ] mncat classic
[ ] other databases (jstor, pubmed, etc.)
[ ] google
[ ] mncat plus
[ ] worldcat
[ ] google books
[ ] mncat plus article search
[ ] google scholar
[ ] libraries onesearch
other (please specify) ________

3. when i am looking for a specific journal article, i usually search (check all that apply):
[ ] amazon
[ ] google books
[ ] mncat plus article search
[ ] citation linker
[ ] google scholar
[ ] libraries onesearch
[ ] e-journals list
[ ] mncat classic
[ ] other databases (jstor, pubmed, etc.)
[ ] google
[ ] mncat plus
[ ] worldcat
other (please specify) ________

4. when i am researching a topic without a specific title in mind, i usually search (check all that apply):
[ ] amazon
[ ] google scholar
[ ] libraries onesearch
[ ] e-journals list
[ ] mncat classic
[ ] other databases (jstor, pubmed, etc.)
[ ] google
[ ] mncat plus
[ ] worldcat
[ ] google books
[ ] mncat plus article search
other (please specify) ________

now we’d like to know what you think of mncat classic and what new features (if any) you’d like to see.

5. when i use mncat classic (scale: very often / usually / sometimes / rarely)
- i succeed in finding what i’m looking for

6. it is easy to find the following kinds of items in mncat classic (scale: strongly agree / somewhat agree / somewhat disagree / strongly disagree / i haven’t looked for this with mncat classic)
- an item that is available online
- an item within a particular collection (e.g., wilson library, university archives, etc.)
- an item in a particular physical format (e.g., dvd, map, etc.)
- an item with a specific isbn or issn

7. i would find mncat classic more useful if it helped me find (check all that apply):
[ ] online journal articles
[ ] online media (e.g., digital images, streaming audio/visual)
[ ] archival finding aids
[ ] u of m research material (e.g., research reports, preprints)
other (please specify) ________

8. the worldcat catalog allows you to search the contents of many library collections in addition to the university of minnesota. which of the following best describes your level of interest in this type of catalog?
[ ] yes, i am interested in what other libraries have regardless of where they are, knowing i could request it through interlibrary loan if i want it
[ ] yes, i am interested, but only if i can get the items from a nearby library
[ ] no, i am interested only in what is available at the university of minnesota libraries

please share anything you particularly like or dislike about mncat classic.

9.
what i like most about mncat classic is: ________

10. what i like least about mncat classic is: ________

we want to understand how different groups of people use mncat classic, as well as other tools, for finding information. please answer the following questions to give us an idea of who you are.

11. how are you affiliated with the university of minnesota?
[ ] faculty
[ ] graduate student
[ ] undergraduate student
[ ] staff (non-library)
[ ] library staff
[ ] community member

12. with which university of minnesota college or school are you most closely affiliated?
[ ] allied health programs
[ ] food, agricultural and natural resource sciences
[ ] pharmacy
[ ] biological sciences
[ ] law school
[ ] public affairs
[ ] continuing education
[ ] liberal arts
[ ] public health
[ ] dentistry
[ ] libraries
[ ] technology (engineering, physical sciences & mathematics)
[ ] design
[ ] management
[ ] veterinary medicine
[ ] education & human development
[ ] medical school
[ ] none of these
[ ] extension
[ ] nursing

13. we are interested in learning more about how you find the materials you need. if you would be willing to be contacted for further surveys or focus groups, please provide your e-mail address: ________

appendix b. mncat plus survey

the library catalog is intended to help you find an item when you know its title, as well as suggest items that are relevant to a given topic. we’d like to know how often you use mncat plus for these different purposes.

1.
when i visit mncat plus… (scale: very often / usually / sometimes / rarely)
- i already know the title of the item i am looking for
- i am looking for any resource relevant to my topic

many people use tools other than the library catalog to find books, articles, and other resources. for the different situations below, please tell us what other tools you find helpful.

2. when i am looking for a specific book, i usually search (check all that apply):
[ ] amazon
[ ] mncat classic
[ ] other databases (jstor, pubmed, etc.)
[ ] google
[ ] mncat plus
[ ] worldcat
[ ] google books
[ ] mncat plus article search
[ ] google scholar
[ ] libraries onesearch
other (please specify) ________

3. when i am looking for a specific journal article, i usually search (check all that apply):
[ ] amazon
[ ] google books
[ ] mncat plus article search
[ ] citation linker
[ ] google scholar
[ ] libraries onesearch
[ ] e-journals list
[ ] mncat classic
[ ] other databases (jstor, pubmed, etc.)
[ ] google
[ ] mncat plus
[ ] worldcat
other (please specify) ________

4. when i am researching a topic without a specific title in mind, i usually search (check all that apply):
[ ] amazon
[ ] google scholar
[ ] libraries onesearch
[ ] e-journals list
[ ] mncat classic
[ ] other databases (jstor, pubmed, etc.)
[ ] google
[ ] mncat plus
[ ] worldcat
[ ] google books
[ ] mncat plus article search
other (please specify) ________

now we’d like to know what you think of mncat plus and what new features (if any) you’d like to see.

5. when i use mncat plus (scale: very often / usually / sometimes / rarely)
- i succeed in finding what i’m looking for

6.
it is easy to find the following kinds of items in mncat plus (scale: strongly agree / somewhat agree / somewhat disagree / strongly disagree / i haven’t looked for this with mncat plus)
- an item that is available online
- an item within a particular collection (e.g., wilson library, university archives, etc.)
- an item in a particular physical format (e.g., dvd, map, etc.)
- an item with a specific isbn or issn

7. i would find mncat plus more useful if it helped me find (check all that apply):
[ ] online journal articles
[ ] online media (e.g., digital images, streaming audio/visual)
[ ] archival finding aids
[ ] u of m research material (e.g., research reports, preprints)
other (please specify) ________

8. the worldcat catalog allows you to search the contents of many library collections in addition to the university of minnesota. which of the following best describes your level of interest in this type of catalog?
[ ] yes, i am interested in what other libraries have regardless of where they are, knowing i could request it through interlibrary loan if i want it
[ ] yes, i am interested, but only if i can get the items from a nearby library
[ ] no, i am interested only in what is available at the university of minnesota libraries

please share anything you particularly like or dislike about mncat plus.

9. what i like most about mncat plus is: ________

10.
what i like least about mncat plus is: ________

we want to understand how different groups of people use mncat plus, as well as other tools, for finding information. please answer the following questions to give us an idea of who you are.

11. how are you affiliated with the university of minnesota?
[ ] faculty
[ ] graduate student
[ ] undergraduate student
[ ] staff (non-library)
[ ] library staff
[ ] community member

12. with which university of minnesota college or school are you most closely affiliated?
[ ] allied health programs
[ ] food, agricultural and natural resource sciences
[ ] pharmacy
[ ] biological sciences
[ ] law school
[ ] public affairs
[ ] continuing education
[ ] liberal arts
[ ] public health
[ ] dentistry
[ ] libraries
[ ] technology (engineering, physical sciences & mathematics)
[ ] design
[ ] management
[ ] veterinary medicine
[ ] education & human development
[ ] medical school
[ ] none of these
[ ] extension
[ ] nursing

13. we are interested in learning more about how you find the materials you need. if you would be willing to be contacted for further surveys or focus groups, please provide your e-mail address: ________

editorial board thoughts: reinvesting in our traditional personnel through knowledge sharing and training
mark dehmlow
information technology and libraries | december 2017 4

mark dehmlow (mdehmlow@nd.edu) is a member of the ital editorial board and director of library information technology, hesburgh library, university of notre dame, south bend, indiana.

lately i have been giving a lot of thought to how those of us in technology positions can extend our impact throughout our organizations.
with finite budgets and time and relatively low personnel turnover, i have realized that the solution goes beyond merely finding ways that technology can optimize workflows through automation. i have been working in academic library technology for nearly 20 years and when i began my career, virtually all areas of technology required specialized staff – from supporting general computer applications to managing the technical infrastructure that underlay our core systems. these days, technology is still a specialty, but the function of technicians has become more focused on providing infrastructure and much of the general application support we used to provide has become ubiquitous and has become an expectation of almost all library positions. managing email, creating specialized formulas for data analysis, navigating operating systems, even developing basic databases, are now regular parts of library work. the trend of technological infusion will continue but instead of general technical tasks, almost all new library positions will require deeper technical skills. this is due, in part, to the function of knowledge work becoming more specialized as libraries focus on the areas where they can create the most value and those new domains require more technical expertise to be effective. perhaps the most striking example of this evolution is in the transition of catalogers to metadata specialists. the days of working with a single metadata format (marc) in a single, tabular interface (catalog) are quickly slipping away and being replaced by metadata structured in multiple complex schemes, expressed in formats like xml and json. instead of acquiring data from oclc, libraries need to work with web-based apis to harvest metadata. and the tools for manipulation require basic programming skills in languages like python or working with open source applications that look little like the integrated library systems we are used to. 
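as a rough illustration of the basic scripting described above, the following sketch parses a small xml metadata record and re-expresses it as json using only python's standard library. the sample record, its dublin core-style element names, and the record_to_dict helper are assumptions made for illustration; they do not come from any particular library system or api.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample record using Dublin Core-style elements (illustrative only).
SAMPLE_XML = """<record xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>An Example Title</dc:title>
  <dc:creator>Example, Author</dc:creator>
  <dc:subject>Library automation</dc:subject>
  <dc:subject>Information retrieval</dc:subject>
  <dc:date>1970</dc:date>
</record>"""


def record_to_dict(xml_text):
    """Collect each child element's text under its local (namespace-free) name."""
    root = ET.fromstring(xml_text)
    fields = {}
    for child in root:
        # ElementTree spells namespaced tags as "{uri}local"; keep only the local part.
        local_name = child.tag.split("}")[-1]
        fields.setdefault(local_name, []).append((child.text or "").strip())
    return fields


record = record_to_dict(SAMPLE_XML)
print(json.dumps(record, indent=2))
```

repeatable fields such as subject are kept as lists so that no value is lost in the conversion; a real workflow would likely add validation and a mapping to whatever target schema is in use.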
working with these tools can enable metadata experts to customize metadata at scale, but it requires new knowledge and even new ways of thinking about metadata and metadata manipulation.

cataloging isn’t the only position undergoing change in academic libraries, either. acquisitions is pushing toward greater automation and patron driven selection. the catalog is becoming more like a bookstore and the discovery landscape includes a panoply of resources that are purchased only at the point a user clicks on a link to a resource. acquisitions is also occurring at larger scale, and requires the ability to work with thousands of items in a batch, to select based on the qualities of what libraries want to make available, to analyze usage trends, and to load, update, and remove metadata from our discovery environment as quickly as possible. the tools to accomplish this are similar to those for metadata.

beyond technical services, we’re beginning to see the role of the subject selector transition from building broad disciplinary collections toward a focus on curation of specialized collections requiring digitization and digital curation. the tools to accomplish this are digital asset management systems and web-based digital exhibition tools which are specialized content management systems. subject selectors are transforming into digital content creators and managers.

https://doi.org/10.6017/ital.v36i4.10239

technologically-driven change regularly outpaces generational personnel turnover in libraries, and given that technological change continues to grow exponentially, it is clear we need a flexible workforce and an organizational commitment to training and professional growth. while organizations are rewriting positions to include technical skills, we will always have a preponderance of staff that started their careers in libraries with depreciating skillsets.
merely directing staff to webinars, conferences and self-driven development isn’t enough. multi-day workshops are great as long as there are opportunities to apply learning upon returning to work. to guarantee skill retention, sustained training needs to be directed towards the specific skills needed now and based in actual work, not just theoretical exercises. the challenge, then, becomes how to implement such a program and identifying who can provide the necessary training. how can specialization be disseminated to non-specialists? many libraries have some of the needed resources close at hand, even if staffing is thin and technical resources scarce. it requires thinking a bit pragmatically to reuse the resources libraries do have, and for technologists to evolve with demands as well, transitioning our roles from technology experts alone to a hybrid of practitioners, teachers, and enablers. teaching is, itself, a specialty and many it professionals are unlikely to have developed that skillset. most libraries, though, have staff who do have experience and expertise in training and pedagogy. evolving towards in-sourced technology development will undoubtedly require it staff to first learn effective teaching methods and basic curricula development. they will need a framework to take a set of specific skills and build ad-hoc courses with medium range learning objectives. teaching can occur in the context of actual work scenarios so that learning is put to practical use as part of that training, and skills retention improved. libraries can become labs for cross-training and knowledge sharing through leveraging our teachers and technologists in interdisciplinary partnerships and collaboration with a focus on internal growth so that library organizations can meet continuously changing demands. 
once staff have been trained in new technical areas, there is another opportunity for it professionals to extend their impact, by dividing technology-driven projects into the parts that require deep technical work and the parts that require transferable technical skills. if technologists start looking at ways to implement technical solutions in componentized ways instead of as end-to-end solutions, they have the opportunity to empower newly trained staff to contribute in practical ways through building solution foundations and then delegating configurable application inputs. as an example – developing a full application stack requires considerable programming skill, but learning to create and update extensible stylesheets to transform xml-based metadata is a teachable skill. it professionals could develop applications that take a configuration file and an xsl file as inputs while staff with xslt training can modify the configuration to include parameters for connecting to apis or loading xml. trained staff could then modify the xsl to transform data to their specifications without having to pass the task back to the it professional.

moving toward more holistic technology capability in libraries will require all personnel to be committed to evolving to meet the emerging needs of our organizations – it professionals included. for decades, technologists have been in the privileged position of having the necessary skills to advance the profession’s digital future, but it will be important for technologists in libraries to integrate the many valuable skills other personnel can offer so that they also can evolve in ways that best support our organizations – leveraging foundational library skills to enhance overall organizational capacity to accomplish tasks that are increasingly requiring technical expertise. i won’t pretend it will be easy.
it will require libraries to prioritize organizationally-led training, even amidst the flurry of demands around us, but i think it is also critical to the future of the profession, and the old adage that winter pays for summer feels apropos here. technologists will need to be open to incorporating foundational library skills, to collaborating and learning from other library specialists, to thinking of their positions more broadly, and, for those who live in ivory towers (you know who you are), to eliminating the silos they’ve built and to collaborating, cooperating, and engaging. technologists are an important part of library ecosystems with what we contribute operationally, but i think we can have a greater impact if we propagate our knowledge in an effort to increase the profession’s overall technology capacity and become agents to support knowledge workers’ future skill development.

history of library computerization
frederick g. kilgour: director, ohio college library center, columbus, ohio

the history of library computerization from its initiation in 1954 to 1970 is described. approximately the first half of the period was devoted to computerization of user-oriented subject information retrieval and the second half to library-oriented procedures. at the end of the period on-line systems were being designed and activated.

this historical scrutiny seeks the origins of library computerization and traces its development through innovative applications. the principal evolutionary steps following upon a major application are also depicted. the investigation is not confined to library-oriented computerization, for it examines mechanization of the use of library tools as well; indeed, the first half-dozen years of library computerization were devoted only to user applications.

the study reveals two major trends in library computerization. first, there are those applications designed primarily to benefit the user, although few, if any, applications have but one goal.
the earliest such applications were machine searches of subject indexes employing post-coordination of uniterms. nearly a decade later, the first of the bookform catalogs appeared that made catalog information far more widely available to users than do card catalogs. finally, networks are under development that have as their objective availability of regional resources to individual users.

the second trend is employment of computers to perform repetitive, routine library tasks, such as catalog production, order and accounting procedures, serials control, and circulation control. this type of mechanization is extremely important as a first step toward an increasingly productive library technology, which must be an ultimate goal if libraries are to be economically viable in the future (1,2).

historical studies of library computerization have not yet appeared, although some reports beginning with that of l. r. bunnow (3) in 1960 contain valuable literature reviews. both editions of literature on information retrieval and machine translation by c. f. balz and r. h. stanwood (4,5) are extremely useful. in addition, j. a. speer's libraries and automation (6) is a valuable, retrospective bibliography of over three thousand entries.

origins

the origins of library computerization were in engineering libraries newly established in the 1950's and employing the uniterm coordinate indexing techniques of mortimer taube on collections of report literature. the technique of post-coordination of simple index terms proved most suitable for computerization, particularly when the size of a file caused manual manipulation to become cumbersome. harley e. tillitt presented the first report, albeit unpublished at the time, on library computerization at the u.s. naval ordnance test station (nots), now the naval weapons center at china lake, california.
the report, entitled "an experiment in information searching with the 701 calculator" (7), was given at an ibm computation seminar at endicott, new york, in may 1954. the system was extended and improved in 1956, and a published report appeared in 1957 (8). tillitt subsequently published an evaluation (9).

the nots system mimicked manual use of a uniterm card file. this noteworthy system could add new information, delete information related to discarded documents, match search requests against the master file, and produce a printout of document numbers selected. search requests were run in batches, thereby producing inevitable delays that caused user dissatisfaction. when the user did receive results of his search, he had a host of document numbers that he had to take to a shelf list file to obtain titles. subsequent system designers also found that a computerized system could cause user dissatisfaction if it did not speed up and make more thorough practically all tasks. because use of the system dwindled, it was not reprogrammed for an ibm 704 that replaced the 701 in 1957. however, a couple of years later, when an ibm 709 became available, the system was reprogrammed and improved so that the user received a list of document titles (10).

tillitt, bracken, and their colleagues deserve much credit for their pioneer computerization of a subject information retrieval system. the application required considerable ingenuity, for the ibm 701 did not have built-in character representation. therefore it was necessary to develop subroutines that simulated character representation (11). moreover, the 701 had an unreliable electrostatic core memory. on some machines the mean time between failures was less than twenty minutes (12).

in september 1958, general electric's aircraft gas turbine division at evendale, ohio, initiated a system on an ibm 704 computer (13) that was similar to the nots application.
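the operations attributed to the nots system above (adding new information, deleting discarded documents, matching batched search requests against a master file, and listing the selected document numbers) can be sketched in modern terms as an inverted file of uniterms searched by set intersection. this is only an illustrative sketch: the class name, method names, sample uniterms, and document numbers are invented here, and nothing in it reflects how the ibm 701 program was actually written.

```python
from collections import defaultdict


class UnitermFile:
    """Toy master file: each uniterm maps to the document numbers posted under it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_number, uniterms):
        """Post a document number under each of its uniterms."""
        for term in uniterms:
            self.postings[term].add(doc_number)

    def delete(self, doc_number):
        """Remove a discarded document from every posting list."""
        for docs in self.postings.values():
            docs.discard(doc_number)

    def search(self, uniterms):
        """Boolean AND only: keep documents posted under every requested uniterm."""
        sets = [self.postings.get(term, set()) for term in uniterms]
        if not sets:
            return []
        return sorted(set.intersection(*sets))

    def run_batch(self, requests):
        """Answer a batch of search requests in one pass, as a batch system would."""
        return [self.search(request) for request in requests]


master = UnitermFile()
master.add(101, ["rocket", "fuel", "combustion"])
master.add(102, ["rocket", "guidance"])
master.add(103, ["fuel", "combustion"])

batch_results = master.run_batch([["rocket", "fuel"], ["combustion"], ["guidance", "fuel"]])
print(batch_results)

master.delete(103)
after_delete = master.search(["combustion"])
print(after_delete)
```

the output is a list of document numbers only, which mirrors the limitation described above: the searcher still has to look up titles elsewhere.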
mortimer taube and c. d. gull had installed a uniterm index system at evendale in 1953 (14,15). the ge system was an improvement over the then-existing nots system because it printed out author and title information for a report selected, as well as an abstract of the report. like the nots system, however, the ge application provided only for boolean "and" search logic.

the celebrated medlars system (16) encompassed the first major departure in machine citation searching. the original medlars had two principal products: 1) composition of index medicus; and 2) machine searching of a huge file of journal article citations for production of recurrent or on-demand bibliographies. the system became operational in 1964. the nots and ge systems coordinated document numbers as listed under descriptors. medlars departed from this technique by searching a compressed citation file in which each citation had its descriptors or subject headings associated with it. the medlars system also provides for boolean "and," "or," and "not" search logic.

the next major development was dialog (17), an on-line system for machine subject searching of the nasa report file. queries were entered from remote terminals.

the suny biomedical communication network constitutes an important development in operation of machine subject searching and production of subject bibliographies of traditional library materials. the suny network went into operation in the autumn of 1968 with nine participating libraries (18). its principal innovation is on-line searches from remote terminals of the medlars journal article file to which book references have been added. the suny network eliminates the two major dissatisfactions with the nots system and all subsequent batch systems, in that it provides the user with an immediate reply to his search query.

catalog production

in 1960, l. r.
bunnow prepared a report for the douglas aircraft company (3) in which he recommended a computerized retrieval system like the nots and ge systems that would also include catalog card production. bunnow's proposal was perhaps the first to contain the concept of production of a single machine readable record from which multiple products could be obtained, such as printed catalog cards and subject bibliographies produced by machine searching. catalog card production began in may 1961 (19), the cards having a somewhat unconventional format and being printed all in upper-case characters as shown in figure 1. cards were mechanically arranged in packs for individual catalogs, and alphabetized within packs, an early sophistication. accompanying the production of catalog cards was production of accession lists from the same machine readable data.

[fig. 1. sample catalog card: an all upper-case card numbered ml 13,750 for douglas aircraft co., inc., "mechanized information retrieval system for douglas aircraft company, inc., status report," g. w. koriagin, l. r. bunnow, january 1962, copy 1; index terms: information retrieval, libraries, computer searching, ibm 7090, ibm 1401.]

the next development in catalog card production occurred at the air force cambridge research laboratory library, which began to produce cards mechanically in upper- and lower-case in 1963 (20). a special computer-like device called a crossfiler manipulated a single machine readable cataloging record on paper tape to produce a complete set of card images punched on paper tape. this paper-tape product drove a friden flexowriter that mechanically typed the cards in upper- and lower-case. two years later, yale began to produce catalog cards in upper- and lower-case directly on a high-speed computer printer (21). the yale cards were also arranged in packs, as had been those at douglas, but were not alphabetized within packs.
the new england library information network, nelinet, demonstrated in a pilot operation in 1968 a batch processing technique servicing requests from new england state university libraries, via teletype terminals, for production of catalog card sets, book labels, and book pockets from a marc i catalog data file (22). the nelinet system became operational in the spring of 1970 employing the marc ii data base. also in 1968 the university of chicago library brought into operation catalog card production with data being input remotely on terminals in the library, and cards being printed in batches on a high-speed computer printer centrally (23).

bookform catalogs began to appear in the early 1960's, and it appears that the information center of the monsanto company in st. louis, missouri, published the earliest report on a bookform catalog that it had produced by computer in 1962 (24,25). the center discontinued its card catalog in the same year. book catalogs can increase availability of cataloging information to users while reducing library work, and the monsanto book catalog is an example of such an achievement, for it provides a union catalog of the holdings of seven monsanto libraries, and is produced in over one hundred copies. as would be expected, the catalog appeared all in upper-case. however, in september 1964 the library at florida atlantic university produced a bookform catalog in upper- and lower-case (26) and the university of toronto library put out the first edition of its upper- and lower-case onulp catalog on 15 february 1965 (27,28). the monsanto catalog format called for author and call number on one line, with title and imprint on a second, or second and third, line. both florida atlantic and toronto catalogs were essentially catalogs of catalog cards. under the leadership of mortimer taube, documentation, inc.
was first to produce a bookform catalog in upper- and lower-case, with a format like that of bookform catalogs in the nineteenth century (29); documentation, inc., prepared the catalog for the baltimore county public library. entries were made once, with titles listed under an entry if there were more than one. the stanford bookform catalog appeared late in 1966, introducing a new type of unit record, whose first element is the title paragraph. h. p. luhn proposed selective dissemination of information (sdi) in 1958 (30), and perhaps the first library application of sdi was in the spring of 1962 at the ibm library at owego (31), where special processing was given to new acquisitions for input into the sdi system. at about the same time, the library of the douglas missile & space systems division instituted an sdi system that employed as input a single machine readable record from which catalog cards and accessions lists were also produced (32). the introduction of sdi into library operation is a major, historic innovation, for sdi is a routine but personalized service in contradistinction to the depersonalized library service characteristic of all but the smallest libraries. selective dissemination of information is one of the few examples of library computerization that takes full advantage of the computer's ability to treat an individual as a person and not as one of a horde of users.

circulation

the picatinny arsenal reported the first computerized circulation system (33). the picatinny application produced a computer printed loan record, lists of reserves, overdues, lists of books on loan to borrowers, and statistical analysis, in a system that began operation in april 1962. the charge card at picatinny was an ibm punch card into which was punched the bibliographic data and data concerning the borrower each time the book was charged. in the fall of 1962, the thomas j.
watson research center (34) activated a circulation system much like the picatinny system, except that bibliographic data was punched into a book card by machine, but information about the borrower was manually punched. the next step forward occurred at southern illinois university (35), where a circulation system like the two just described began limited operation in the spring of 1964 employing an ibm 357 data collection system. by using the 357, it was possible to have a machine punched book card and a machine readable borrower's identification card that could be read by the 357, thereby eliminating manual punching. the southern illinois system became fully operational at the beginning of the fall term of 1964, as did a similar 357 system at florida atlantic university (26). batch processed circulation systems periodically producing a listing of books on loan have a built-in source of dissatisfaction, particularly in academic libraries, for current records are unavailable on the average for half the interval between printouts. such delay can be eliminated in an on-line system, wherein information about the loan is available immediately after recording the loan. however, not all circulation systems with remote terminals operate interactively. in an on-line system introduced at the illinois state library in december 1966 (36) the transactions were recorded on an ibm 1031 terminal located at the circulation desk, data transmitted from the terminal being accumulated daily and processed into the file nightly. as first activated, the system did not permit querying the file to determine books charged out, but this capability was added in 1969. also in december 1966, the redstone scientific information center brought into operation a pilot on-line book circulation system based on a converted machine readable catalog consisting of brief catalog entries.
this pilot system remained in operation until october 1967, and was capable of recording loans, discharging loans, putting out overdues, maintaining reserves, and locating the record in the file (37). the bellrel real time loan system went into operation at bell laboratories library in march 1968 (38). bellrel has a data base consisting of converted catalog records, so that in effect it also is a remote catalog access system. bellrel serves three libraries remotely from two ibm 1050 terminals in each library. bellrel is a sophisticated on-line, real time circulation system that not only records and discharges books, but also replies to inquiries as to the status of a title, and the status of a copy, and will display the full record for a title, as would be required for remote catalog access.

serials

the library of the university of california, san diego, activated the first computerized serials control system (39). this system has as its objective production of a complete holdings list, lists of current receipts, binding lists, claims, nonreceipt lists, and expiration of subscription lists. checking in was accomplished by manual removal from a file of a prepunched card for a specific title and issue. the check-in clerk sent this card to the computer center for processing and the journal issue to the shelves. this technique of prepunching receipt cards has generated new problems in some libraries, for professional advice is often needed as to action to be taken when the issue received does not match the prepunched card. nevertheless, the san diego system still operates, albeit with modifications. the washington university school of medicine library activated a serials control system in 1963 (40) that was essentially like that at san diego. a series of symposia held at washington university, with the first in the autumn of 1963, widely publicized the system and led to its adoption elsewhere.
the university of minnesota biomedical library introduced a technique of writing in receipts of individual journal issues on preprinted check-in lists (41). check-in data was then keypunched from the lists. this system obviated the problem generated by prepunched cards that did not match received issues, but, of course, reintroduced manual procedures. difficulties with check-in procedures, and delays in receipt of printed lists of holdings, made it clear that an on-line, real time serials control system would be superior to the batch systems described in the previous paragraph. laval university in quebec introduced the first on-line, real time system in 1969 (42). in september 1969 the laval on-line file held 16,335 titles. access to the file from cathode ray tube terminals is by accession number, and the file, or sections thereof, can be listed. the system also produces operating statistics and contains the potential for automatic claiming. the kansas union list of serials (43), which appeared in 1965, was the first computerized union list to contain holdings of several institutions. the kansas union list recorded holdings for nearly 22,000 titles in eight colleges and universities. reproduced photographically from computer printout and printed three columns on a page, this legible and easy-to-use list set the style for many subsequent union lists.

acquisitions

the national reactor testing station library was first to use a computer in ordering processes (44). a multiple-part form was produced for library records and for dealers. the library of the thomas j. watson research center activated a more sophisticated system in 1964 that produced a processing information list containing titles of all items in process, a shelf list card, a book card, and a book pocket label (45). the pennsylvania state university library put a computerized acquisition system into operation in 1964 (46).
this system produced a compact, line-a-title listing of each item in process, together with an indication of the status of the item in processing. a small decklet of punch cards was produced for each item on a keypunch, and one of these cards was sent to the computer center for processing each time its associated item changed status. the pennsylvania system also produced purchase orders. in june 1964, the university of michigan library (47) introduced a computerized acquisitions procedure more sophisticated than its predecessors. the michigan system produced a ten-part purchase order fanfold, an in-process listing, and computer produced transaction cards to update status of items in process; and carried out accounting for encumbrance and expenditure of book funds. in addition, the system produced periodic listings of "do-not-claim" orders, listings of requests for quotation, and of "third claims" for decision as to future action on such orders. in 1966, the yale machine aided technical processing system began operation (48). it produced daily and weekly in-process lists arranged by author, a weekly order number listing, weekly fund commitment registers, and notices to requesters of status of request. subsequently, claims to dealers were added, as well as management information reports on activities within the system. like the pennsylvania and michigan systems, its in-process list recorded the status of the item in processing. the washington state university library brought the first on-line acquisition system into operation in april 1968 (49). access to the system was by purchase order number, with records arranged in a random access file under addresses computed by a random number generator (50). the stanford university libraries on-line acquisition system began operation in 1969 (51), and employed a sequential file of entries having an index of words in author and title elements of the entry.
the stanford system calculated addresses of index words by employing a division hashing technique on the first three letters of the word.

standardization

by 1965, a dozen or more libraries had a dozen or more formats for machine readable bibliographic records, and an impenetrable thicket of such records was evolving. fortunately, the library of congress, with the help of the council on library resources, took the initiative in standardization of format of bibliographic records and produced the now familiar marc format (52). just as standardization of catalog card sizes enabled interchange of catalog records, so has marc made possible interchange of machine readable catalog records. this standardization has encouraged developments of networks, such as the suny biomedical network, nelinet, the washington state libraries network, and that of the ohio college library center. with each of these regional networks employing the marc bibliographic record, it will be possible to integrate these regional nodes into a future national network.

substance and sum

the first half of the first decade and a half of library computerization was confined almost entirely to two major mechanizations of mortimer taube's uniterm coordinate indexing. the computerization of single descriptors with attendant document numbers was a relatively easy task. the first breakaway from computerized subject searching came at the douglas aircraft corporation, where the technique of producing one machine readable record from which multiple products could be obtained was introduced in 1961. the last half of library automation's decade and a half has been largely consumed with efforts to automate existing library procedures.
although notable departures have occurred that take advantage of the computer's powerful qualities, on-line, real time techniques introduced at the very end of the historical period under review began again to use individual words as words, not unlike the logic in which the first applications employed uniterms; and it seems likely that the immediate future will witness increasing degrees of computerization based on individual words in bibliographic descriptions rather than on the record as a whole.

acknowledgments

the author is grateful to sheila bertram for identifying, searching out, and gathering most of the references used in this paper. cloyd dake gull furnished in correspondence invaluable information about events of the fifties and early sixties, and various librarians supplied photocopies of early documents.

references

1. kilgour, frederick g.: "the economic goal of library automation," college & research libraries, 30 (july 1969), 307-311.
2. baumol, william j.: "the costs of library and informational services." in libraries at large (new york: r. r. bowker co., 1969), pp. 168-227.
3. bunnow, l. r.: study of and proposal for a mechanized information retrieval system for the missiles and space systems engineering library (santa monica, california: douglas aircraft co., 1960).
4. balz, charles f.; stanwood, richard h.: literature on information retrieval and machine translation (international business machines corp., november 1962).
5. balz, charles f.; stanwood, richard h.: literature on information retrieval and machine translation 2d. ed. (international business machines corp., january 1966).
6. speer, jack a.: libraries and automation; a bibliography with index (emporia, kansas: teachers college press, 1967).
7. tillitt, harley e.: "an experiment in information searching with the 701 calculator," journal of library automation, 3 (sept. 1970), 202-206.
8. bracken, r. h.; tillitt, h.
e.: "information searching with the 701 calculator," journal of the association for computing machinery, 4 (april 1957), 131-136.
9. tillitt, harley e.: "an application of an electronic computer to information retrieval." in boaz, martha: modern trends in documentation (new york: pergamon press, 1959), pp. 67-69.
10. zaharias, jerome l.: lizards; library information search and retrieval data system (china lake, california: u. s. naval ordnance test station, 1963).
11. bracken, robert h.; oldfield, bruce g.: "a general system for handling alphameric information on the ibm 701 computer," journal of the association for computing machinery, 3 (july 1956), 175-180.
12. rosen, saul: "electronic computers: a historical survey," computing surveys, 1 (march 1969), 7-36.
13. barton, a. r.; schatz, v. l.; caplan, l. n.: information retrieval on a high speed computer (evendale, ohio: general electric co., 1959), p. 8.
14. gull, c. d.: personal communication (22 august 1969).
15. dennis, b. k.; brady, j. j.; dovel, j. a., jr.: "five operational years of inverted index manipulation and abstract retrieval by an electronic computer," journal of chemical documentation, 2 (october 1962), 234-242.
16. austin, charles j.: medlars; 1963-1967 (bethesda, maryland: national library of medicine, 1968).
17. summit, roger k.: "dialog: an operational on-line reference retrieval system." in association for computing machinery: proceedings of 22nd national conference (washington, d. c.: thomson, 1967), pp. 51-56.
18. pizer, irwin: "regional medical library network," bulletin of the medical library association, 51 (april 1969), 101-115.
19. koriagin, gretchen w.: "library information retrieval program," journal of chemical documentation, 2 (october 1962), 242-248.
20. fasana, paul j.: "automating cataloging functions in conventional libraries," 7 (fall 1963), 350-365.
21.
kilgour, frederick g.: "library catalogue production on small computers," american documentation, 17 (july 1966), 124-131.
22. nugent, william r.: "nelinet-the new england information network." in congress of the international federation for information processing, 4th, edinburgh, 5-10 august, 1968: proceedings (amsterdam: north-holland publishing co., 1968), pp. g 28-g 32.
23. payne, charles t.: "the university of chicago's book processing system." in proceedings of a conference held at stanford university libraries, october 4-5, 1968 (stanford, california: stanford university libraries, 1969).
24. wilkinson, w. a.: personal communication (november 1969).
25. wilkinson, w. a.: "the computer-produced book catalog: an application of data processing at monsanto's information center." in university of illinois graduate school of library science: proceedings of the 1965 clinic on library applications of data processing (champaign, illinois: illini union bookstore, 1966), pp. 92-111.
26. heiliger, edward: "florida atlantic university library." in university of illinois graduate school of library science: proceedings of the 1965 clinic on library applications of data processing (champaign, illinois: illini union bookstore, 1966), pp. 92-111.
27. bregzis, ritvars: personal communication (november 1969).
28. bregzis, ritvars: "the ontario universities library project-an automated bibliographic data control system," college & research libraries, 26 (november 1965), 495-508.
29. robinson, charles w.: "the book catalog: diving in," wilson library bulletin, 40 (november 1965), 262-268.
30. luhn, h. p.: "a business intelligence system," ibm journal of research and development, 2 (october 1958), 315-319.
31. stanwood, richard h.: "the merge system of information dissemination, retrieval and indexing using the ibm 7090 dps." in association for computing machinery: digest of technical papers (1962), pp. 38-39.
32.
young, e. j.; williams, a. s.: historical development and present status-douglas aircraft company computerized library program (santa monica, california: douglas aircraft co., 1965).
33. haznedari, i.; voos, h.: "automated circulation at a government r & d installation," special libraries, 55 (february 1964), 77-81.
34. gibson, r. w., jr.; randall, g. e.: "circulation control by computer," special libraries, 54 (july-august 1963), 333-338.
35. mccoy, ralph e.: "computerized circulation work: a case study of the 357 data collection system," library resources & technical services, 9 (winter 1965), 59-65.
36. hamilton, robert e.: "the illinois state library 'on-line' circulation control system." in university of illinois graduate school of library science: proceedings of the 1968 clinic on library applications of data processing (urbana, illinois: graduate school of library science, 1969), pp. 11-28.
37. "redstone center shows on-line library subsystems," datamation, 14 (february 1968), 79, 81.
38. kennedy, r. a.: "bell laboratories' library real-time loan system (bellrel)," journal of library automation, 1 (june 1968), 128-146.
39. university of california, san diego, university library: report on serials computer project; university library and ucsd computer center (la jolla, california: university library, july 1962).
40. pizer, irwin h.; franz, donald r.; brodman, estelle: "mechanization of library procedures in the medium-sized medical library: i. the serial record," bulletin of the medical library association, 51 (july 1963), 313-338.
41. strom, karen c.: "software design for bio-medical library serials control system." in american society for information science, annual meeting, columbus, o., 20-24 oct. 1968: proceedings, 5 (1968), 267-275.
42. varennes, rosario de: "on-line serials system at laval university library," journal of library automation, 3 (june 1970).
43.
kansas union list of serials (lawrence, kansas: university of kansas libraries, 1965), 357 pp.
44. griffin, hillis l.: "electronic data processing applications to technical processing and circulation activities in a technical library." in university of illinois graduate school of library science: proceedings of the 1963 clinic on library applications of data processing (champaign, illinois: illini union bookstore, 1964), pp. 96-108.
45. randall, g. e.; bristol, roger p.: "pil (processing information list) or a computer-controlled processing record," special libraries, 55 (feb. 1964), 82-86.
46. minder, thomas l.: "automation-the acquisitions program at the pennsylvania state university library." in international business machines corporation: ibm library mechanization symposium, endicott, new york, may 25, 1964, pp. 145-156.
47. dunlap, connie: "automated acquisitions procedures at the university of michigan library," library resources & technical services, 11 (spring 1967), 192-206.
48. alanen, sally; sparks, david e.; kilgour, frederick g.: "a computer-monitored library technical processing system." in american documentation institute, 1966 annual meeting, october 3-7, 1966, santa monica, california: proceedings, pp. 419-426.
49. burgess, t.; ames, l.: lola; library on-line acquisitions subsystem (pullman, wash.: washington state university library, july 1968).
50. mitchell, patrick c.; burgess, thomas k.: "methods of randomization of large files with high volatility," journal of library automation, 3 (march 1970).
51. parker, edwin b.: "developing a campus information retrieval system." in proceedings of a conference held at stanford university libraries, october 4-5, 1968 (stanford, california: stanford university libraries, 1969), pp. 213-230.
52. "preliminary guidelines for the library of congress, national library of medicine, and national agricultural library implementation of the proposed american standard for a format for bibliographic information interchange on magnetic tape as applied to records representing monographic materials in textual printed form (books)," journal of library automation, 2 (june 1969), 68-83.

editorial board thoughts: rise of the innovation commons
tod colegrove
information technology and libraries | september 2015

that the practice of libraries and librarianship is changing is an understatement. throughout their history, libraries have adapted and evolved to better meet the needs of the communities served. from content collected and/or archived, to facilities and services provided, a constant throughout has been the adoption, incorporation, and eventual transition away from technologies along the way: clay tablets and papyrus scrolls giving way to the codex; the printing press and eventual mass production and collection of books yielding to information communication technology such as computer workstations and the internet. indeed, the rapid and widespread adoption of the internet has enabled entire topologies of information to change – morphing from ponderous print tomes into digital databases, effectively escaping the walls of libraries and archives altogether.1

in reflection of end-users’ growing preference for easily accessible digital materials, libraries have responded with the creation of new spaces and services.
repositioning physical, digital, human, and social resources to better meet the needs of the communities supported, the information commons2 that is the library begins to acquire a more technological edge. the concept of a library service or area referred to specifically as an information commons can be traced to as early as 1992 with the opening of the information arcade at the university of iowa – specifically designed to provide end-users technology tools, with a stated mission “to facilitate the integration of new technology into teaching, learning, and research, by promoting the discovery of new ways to access, gather, organize, analyze, manage, create, record, and transmit information.”3

first mentioned in the literature in 1994, discussion of the idea itself waited another five years, with donald beagle writing about the theoretical underpinnings of “the new service delivery model” in 1999. beagle defined it as “a cluster of network access points and associated it tools situated in the context of physical, digital, human, and social resources organized in support of learning.” a flurry of articles followed, with the idea seeming to have caught the collective imagination of libraries generally by 2004.
information commons as named spaces within libraries made “… sudden, dramatic, and widespread appearance in academic and research libraries across the country and around the world.”4 scott bennett went further, in 2008 asking flatly: “who would today build or renovate an academic library without including an information commons?”5

patrick “tod” colegrove (pcolegrove@unr.edu), a member of the ital editorial board, is head of the delamare science & engineering library at the university of nevada, reno, nv. doi: 10.6017/ital.v34i3.8919

this proliferation and transition has not been limited to academic libraries; for decades, libraries of all types, shapes, and sizes have been similarly provisioning resources and technology in the context of end-user access and learning. by 2006, a new variation of the information commons had entered the vernacular: the learning commons. beagle defined the learning commons as the result of information commons resources “organized in collaboration with learning initiatives sponsored by other academic units, or aligned with learning outcomes defined through a cooperative process.”6 it is a subset of the broader concept: when the library collaborates with stakeholders external to the library to achieve academic learning outcomes, it becomes operationally a learning commons.
one can easily conceive of the learning commons more broadly by considering learning outcomes desirable within the context of particular library types: school libraries with offerings and programs in alignment with broader k-12 curricula; public libraries in support of lifelong learning and participatory citizenship; special libraries in alignment with other niche-specific learning outcomes.

note that not all information commons are learning commons. as defined, the learning commons depends on the actions and involvement of other units that establish the mission, and associated learning goals, of the institution. others must join with the library’s effort in order to create and nourish such spaces in a way that is deeply responsive to the aspirations of the institution: “the fundamental difference between the information and the learning commons is that the former supports the institutional mission while the latter enacts it.” (bennett 2008, emphasis added) at a time when libraries are undergoing such rapid and significant transformation, it’s hard to dismiss such collaborative effort as merely trendy – such spaces, and the library by extension, become of even more fundamental relevance to the broader organization.

in short, resources are provisioned in the information commons so that learning can happen; collaborative effort with stakeholders beyond the library, but within the organization, ensures that learning does happen.
drawing a parallel, what if the library were to go beyond simply repositioning resources in support of learning – indeed, beyond working with other units of the organization to collaboratively align and provision resources in support of achieving organizational learning outcomes? to go beyond strategic alignment with the aspirations of the institution, involving stakeholders from beyond the immediate organization in the creation and support of such spaces? provisioning library spaces and services that are deeply responsive to the aspirations of the greater community? arguably this is where the relatively recent introduction of makerspaces into the library fits in. the annual environmental scan performed by the new media consortium (nmc) has for a number of years identified makerspaces to be on its short-term adoption horizon – the 2015 library edition goes further, identifying a core value:

the introduction of makerspaces in the academic library is inspiring a mode of learning that has immediate applications in the real world. aspiring inventors and entrepreneurs are taking advantage of these spaces to access tools that help them make their dreams into concrete products that have marketable value.7

aspects of the information commons are present in the library makerspace – not only in the access to traditional library resources, but also in the shift toward providing support of 21st-century literacies in the creation, design, and engineering of output.
with the acquisition and use of these literacies in collaboration with and in support of the goals of the greater institution, it is also a learning commons; for example, in the case of a school or public library where makerspace activities and engagement collaboratively meet and support learning outcomes including increased engagement with science, technology, engineering, the arts, and math (steam) disciplines. consider the further example of university students leveraging makerspace technology as part of ste(a)m outreach efforts to local middle schools in the hope of kindling interest, or partnering with the local discovery museum in the production of a mini maker-faire to carry that interest forward. alternatively, consider a team of students conceiving, then prototyping and patenting a new technology with the active and direct support of the library commons, going on to eventually launch as a business. to the extent the library can springboard off the combination of makerspace with information or learning commons to engage stakeholders from beyond the institution, it can go beyond – becoming something broader, and potentially transformative; even as it enables progress toward collaboratively achieving community goals, outcomes, and aspirations.

the hallmark of community engagement with such library facilities is a spontaneous innovation that seems to flow naturally. library? information or learning commons? arguably such spaces are more accurately named innovation commons.
beyond solidifying the library’s place as a hub of access, creation, and engagement across disciplinary and organizational boundaries, the direct support of innovation – the process of going from idea to an actual good or service with a real perceived value – is in potential alignment with the aspirations of the broader community. in collaboration with stakeholders from across the community, from economic development and government representatives to businesses and private individuals, broader outcomes and aspirations of the greater community can be identified and supported. nevertheless, simply adding makerspace technology to an information or learning commons does not automatically create an innovation commons. it is in the broader conversation, along with the catalyzation, identification of and support for the greater aspirations of the community, that the commons begins to assume its proper role in the greater ecosystem. leveraging the deliberate application of information, with imagination, and initiative, enabling end-users to go from idea all the way to useful product or service is something that community stakeholders see as a tangible value.

the library as innovation commons becomes a natural partner in the local innovation ecosystem, working collaboratively to achieve community aspirations and economic impact.
traditional business and industry reference support ramps up to another level, providing active and participatory support of coworking, startup companies, and etsypreneur8 alike – patent searches taking on an entirely new light in support of innovators using makerspace resources to rapidly prototype inventions. actualized, the library joins forces in a deeper way with the community in the creation of new technologies, jobs, and services, taking an ever more active role in building the futures of the community and its members.
references
1. morgan currie, “what we call the information commons,” institute of network cultures blog, july 8, 2010, http://networkcultures.org/blog/2010/07/08/what-we-call-the-information-commons/
2. the word commons reflects the shared nature of a resource held in common, such as grazing lands.
3. robert a. seal, “issue overview,” journal of library administration, 50 (2010), 1-6. http://www.tandfonline.com/doi/pdf/10.1080/01930820903422248
4. charles forrest & martin halbert, a field guide to the information commons. lanham, md: scarecrow, 2009.
5. scott bennett, “the information or the learning commons: which will we have?,” the journal of academic librarianship, 34, no. 3 (2008), 183-185.
6. donald robert, donald russel bailey, & barbara tierney, the information commons handbook, xviii. new york: neal schuman, 2006.
7. larry johnson, samantha adams becker, victoria estrada, and alex freeman, nmc horizon report: 2015 library edition, 36. austin, tx: the new media consortium, 2015.
8.
the combination of etsy, a peer-to-peer e-commerce website that focuses on selling handmade, vintage, or unique items, and entrepreneurship. the word “etsypreneur” refers to someone who is in the “etsy business” – namely, selling such items via the website. http://etsypreneur.com/the-hidden-danger-of-the-internet-opportunity/
editor’s comments bob gerrity information technology and libraries | september 2012 1
g’day, mates, and welcome to our third open-access issue. ital takes on an additional international dimension with this issue, as your faithful editor has taken up residence down under, in sunny queensland, australia. the recent ala annual meeting in anaheim marked some changes to the ital editorial board that i’d like to highlight. cynthia porter and judith carter are ending their tenure with ital after many years of service. cynthia is featured in this month’s editorial board thoughts column, offering her perspective on library technology past and present. judith carter ends a long run with ital as managing editor, and i thank her for her years of dedicated service. ed tallent, director of levin library at curry college, is the incoming managing editor. we also welcome two new members of the editorial board: brad eden, the dean of library services and professor of library science at valparaiso university, and jerome yavarkovsky, former university librarian at boston college, and the 2004 recipient of ala’s hugh c. atkinson award. jerome currently co-chairs the library technology working group at the mediagrid immersive education initiative. we cover a broad range of topics in this issue. ian chan, pearl ly, and yvonne meulemans describe the implementation of the open-source instant messaging (im) network openfire at california state university san marcos, in support of the integration of chat reference and internal library communications.
richard gartner explores the use of the metadata encoding and transmission standard (mets) as an alternative to the fedora content model (fcm) for an “intermediary” digital-library schema. emily morton and karen hanson present an innovative approach to creating a management dashboard of key library statistics. kate pittsley and sara memmott describe navigational improvements made to libguides at eastern michigan university. bojana surla reports on the development of a platform-independent, java-based marc editor. yongming wang and trevor dawes delve into the need for next-generation integrated library systems and early initiatives in that space. melanie schlosser and brian stamper begin to explore the effects of reposting library digital collections on flickr. in addition to the compelling new content in this issue of ital, we have compelling old content from the print archive of ital and its predecessor, journal of library automation (jola), that will soon be available online, thanks in large part to the work of andy boze and colleagues at the university of notre dame. scans of all of the back issues have now been deposited onto the server that currently hosts ital, and will be processed and published online over the coming months.
bob gerrity (r.gerrity@uq.edu.au) is university librarian, university of queensland, st. lucia, queensland, australia.
development, which has recently seen the implementation of a new batch retrospective-conversion subsystem, and added com catalog options and online authority verification during input/edit. while not the only bibliographic system to be successfully replicated, the wln computer system is becoming the most systematically replicated main-frame facility, with a broad range of future possibilities, including that of a truly turnkey system.
wln's experience indicates that, if a system is designed for ease of maintenance at perhaps some sacrifice of efficiency, it will be readily transportable and allow others to obtain the benefits of a highly sophisticated bibliographic capability without the ever-increasing cost of original development and, more importantly, without having to support the ongoing maintenance of a unique system.
communications
a general planning methodology for automation
richard w. meyer, beth ann reuland, francisco m. diaz, and frances colburn: clemson university, clemson, south carolina.
introduction
a workable planning methodology is the logical starting place for the successful implementation of automation in libraries. an automation plan may develop on the basis of an informal arrangement or from the efforts of one individual, but just as often, automation plans are developed by committees. an automation planning committee must determine and execute some kind of planning methodology and is more likely to be successful if it starts with clear guidelines, good leadership, and a thoroughly proven approach. as a summary review of the literature will bear out, many libraries have developed their own planning techniques in-house. some of these, which are addressed to the issues of cataloging rule changes and public-access catalogs, have been very well thought out.1 however, these techniques are generally not directed to planning for library-wide automation, and are usually designed to meet the specific needs of an individual library. although the pattern for these studies is often similar, they do not seem to be based upon any general automation design methodology. neither, in addition, does there seem to be a general methodology available through any external library agency.
the office of library management studies of the association of research libraries has developed a number of programs designed to assist libraries with their planning efforts, some of which appear to be useful in automation development.2 but for many libraries, these programs may be too broad, too time-consuming or too expensive. as an alternative, some libraries will need to look elsewhere for a general automation planning methodology. this problem was addressed by the administration of the clemson library, and was resolved in a unique way.
background
the robert muldrow cooper library of clemson university has the responsibility of acquiring, preserving, and making available for use the many materials needed by faculty and students in their research and instructional efforts. at a typical land-grant institution like clemson, the amount of scholarly publishing and the pressure to develop research proposals has risen sharply in recent years. the increased needs of users working with an expanding and diversified collection have resulted in a doubling of circulation activity, and have required the growth of library staff by 70 percent over the last decade. furthermore, acquisition, processing, and access problems are compounded by the high inflation rate of materials, particularly serial publications, and manpower costs. even though user demands heavily burdened the traditional manual systems, the extent of library automation at clemson had been limited to a batch circulation system, a simple serials-listing capability, and the use of bibliographic utilities. although it had been generally accepted for some time that the acquisitions and fund-control functions at clemson were in need of automation, no concrete approach to developing a system had been established.
journal of library automation vol. 14/3 september 1981
in addition, there was some concern that the development of an automated acquisitions system shouldn't be initiated without a clear understanding of how such an effort would affect the rest of the functions in the library. with this in mind, and as an initial part of planning, the library administration decided to implement a programmed study to determine specific needs and problems of the whole library at clemson and to determine the attendant costs and benefits of their resolution. since developing the methodology for this kind of study effort in-house has been shown by experience elsewhere to be both expensive and time-consuming, a planning methodology was sought which could be brought in from outside the library and applied in a timely fashion. the international business machines corporation (ibm), through their local marketing representative, volunteered to supply that methodology by means of an education industry application transfer team (att) study. in order to implement the study, a team was organized consisting of representatives from the library, from the university's division of administrative programming services (daps), and from the ibm corporation. the purpose, approach, and results of that study constitute the rest of this paper.
purpose
the application transfer team methodology was implemented to fulfill a fourfold purpose.
• first, it was necessary to act on the recognized need for a library-wide automation plan with something tangible that library and university administrators could use in the decision-making process.
• second, basic objectives and implementation estimates were required to provide groundwork for the development of systems specifications and evaluation.
• third, the planning process needed to provide a forum for meaningful participation by a number of library staff and users.
• fourth, the planning needed to be accomplished rather quickly.
the att met all these requirements.
although the att study technique is generalized for work on any problem in the education arena, it seems particularly well suited to the library environment because it is oriented toward developing applications that solve production problems. the application transfer team methodology was developed by the ibm corporation for customer use. the att methodology evolved from ibm's business system planning function, which has been operational since the early 1970s. although the methodology has been used several times in the academic environment, this is the first time, to our knowledge, that it has been used in a library operation. the strength of the att is that it helps members of a team with diverse backgrounds to understand the environment under study. its final goal was "to improve operational productivity, provide better service to students, and provide information which can enhance management planning and decision making."3 put to work, the methodology is straightforward and effective. from beginning to end, the att process took clemson slightly more than three months elapsed time. total work time (including all report writing) for library staff was approximately one thousand man hours. as the initial step with the att methodology, it was necessary to engage a sponsor and to select a team. for this study, the sponsor chosen was the dean of graduate studies, who reported directly to the vice-president for academic affairs. in turn, the director of computing and the director of the division of administrative programming services (daps) reported to the dean of graduate studies. although it was not critical that the sponsor be intimately involved in the project, his level of authority within the university administration would help to secure acceptance of the study's recommendation.
the sponsor also provided cogent advice along the way, based upon his understanding of institutional resources, and he served as a communication link with other university administrative offices. the study team was chosen by the library administration with the intention of getting diverse involvement and expertise. library staff included the associate director, the head of circulation, the serials cataloger, and a reference librarian. although only the associate director brought significant experience in library automation development, the head of circulation contributed substantial practical experience with automation systems. the cataloger offered specifics of bibliographic problems, cataloging rule changes, and serials control issues, and the reference librarian contributed a comprehensive knowledge of information-retrieval concerns. outside staff included the director of daps, who furnished details on the clemson computing environment, and an ibm marketing representative, who provided appropriate help with hardware capabilities, the att methodology, and legwork. in addition, clemson was also able to engage the help of a representative of ibm's education industry division to guide the att efforts on the basis of his experience in the use of the methodology. from time to time, other ibm and daps staff were involved in assisting with interviews and report writing. the associate director served as team chair in order to act as spokesperson, to coordinate team effort, and to edit the final report.
methodology
the application transfer team methodology is applied in six phases. ibm recommends that these phases be conducted sequentially, and that they last from five to sixteen weeks, depending on the size of the problem. throughout the process, verbal reviews were conducted by the team with the sponsor and with the library staff. the first phase involved an organizational session.
following the introduction of team members, the ibm education industry division representative presented an overview of the methodology and explained the mechanics of the att study process. the team then established the scope of the study by choosing an application area on which to focus and by determining the general objectives of the final system to be implemented. since part of the purpose of the project was to develop a plan for library-wide automation, it was quickly recognized by the team that the application area should be an integrated library information system. however, the ibm representative suggested that this scope was too broad for the study and that one functional area such as acquisitions be chosen, with other functions reserved for subsequent att studies. given time constraints, a compromise arrangement was made in which serials control was determined as the scope. since serials control is a single functional area, but encompasses nearly all bibliographic issues, it served as a microcosm of overall library operations. therefore, it was generally accepted that a plan that effectively accommodated serials would constitute an integrated system plan. the organizational phase continued by determining who to interview during the data-collection phase and by setting up an interview schedule. this phase was concluded by developing an outline of the final report and by assigning writing responsibilities to individual team members. the data-gathering effort constituted phase two. this involved structured interviews of representative staff of each unit of the library who were involved in routine interactions with any phase of serials control at clemson. interviews were conducted with staff from acquisitions, cataloging, circulation, reference units, and branch libraries as well as the university business office, students, and faculty.
following an outline in the att, each person interviewed was asked for specific details of his work with serial publications regarding (1) interfaces (or points of interaction), (2) concerns or needs, (3) suggested improvements, (4) expected values or benefits of improvements, (5) work volume, and (6) cycles. data gathered in each of these interview sessions were immediately documented in a letter to the interviewees. these letters were reviewed by those interviewed for corrections and added detail. data from completed and documented interviews were consolidated during the third phase of the study into a matrix of each of the six questions plotted against operational areas of the library, graphically designating areas of the greatest concern to the largest part of the library. this composite was analyzed to separate problems that could be reasonably handled by an integrated automation system from those that needed the attention of administrative policy and direction. functions for automation consideration were then examined in a "blue sky" session of the committee to envision what system would accommodate the specifications for serials control and access that each library unit and serials user required. from this session a synthesis emerged of the architecture for an integrated system.4 this architecture included a description of the basic relationships of functional modules of the system, a list of the various files needed to contain system information, and a list of data elements required for bibliographic holdings, acquisition, and patron records in the system database. phase four called for the translation of the architecture and general system requirements into modules on basic access, acquisition or processing functions, and into the individual programs needed to execute each module. the team divided into two parts.
the ibm and daps personnel, with the associate director, listed the modules and programs and formulated descriptions of each. part of the description effort involved drafting approximate flowcharts of each program. using algorithms developed by ibm, these descriptions were used to assign estimates of person hours required to create the necessary modules. in order to determine the overall cost of system development, the person-hour figures were converted to dollars using an average hourly cost for clemson daps personnel. committee members not involved in program/module design formed a group to evaluate anticipated benefits defined in the interviews, to collect data from library staff to support these expectations, and to assign a value to them. benefits from reduced file maintenance, processing, and tracking time were valued as person hours saved by the new system. additional improvements were projected for the system's capability for better fund control, more complete and immediate on-order, claiming, and in-process information, and statistical collection development/use data. these benefits were assigned the value of estimated duplicate and inappropriate material acquired under the present system. a value was not assigned to user benefits. faculty and student satisfaction is intangible, and variable from case to case. enhanced user service was recognized as a substantial benefit of the proposed system, but was not quantified. the cost factors determined in phase four were consolidated with derived benefit values to form a cost/benefit analysis, which constituted phase five. in the sixth and final phase an implementation plan was formulated. this plan, along with recommended target dates, was presented orally to library staff and university administration. in addition, the entire process, recommendations, and plan of action were documented in a written report.
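the arithmetic behind phases four and five can be illustrated with a minimal sketch. it assumes that development cost is the sum of per-module person-hour estimates multiplied by an average hourly rate, and that yearly benefit is staff hours saved (valued at a staff hourly rate) plus the value of duplicate or inappropriate acquisitions avoided. the module names, hour estimates, rates, and the payback-period summary are all hypothetical and are not taken from the clemson study, which does not report its figures or ibm's estimating algorithms here.

```python
from dataclasses import dataclass


@dataclass
class Module:
    """One functional module with its estimated development effort."""
    name: str
    person_hours: float


def development_cost(modules, dev_hourly_rate):
    """Convert summed person-hour estimates to dollars at an average rate."""
    return sum(m.person_hours for m in modules) * dev_hourly_rate


def annual_benefit(hours_saved, staff_hourly_rate, avoided_acquisitions):
    """Value staff time saved per year and add avoided purchase costs."""
    return hours_saved * staff_hourly_rate + avoided_acquisitions


def payback_years(cost, benefit_per_year):
    """Years until cumulative yearly benefits equal the one-time cost.

    Payback period is an assumed summary measure, not one named in the study.
    """
    if benefit_per_year <= 0:
        raise ValueError("benefit_per_year must be positive")
    return cost / benefit_per_year


# Hypothetical placeholder figures, not data from the ATT report.
modules = [
    Module("basic access", 400),
    Module("acquisition", 600),
    Module("processing", 500),
]
cost = development_cost(modules, dev_hourly_rate=20.0)
benefit = annual_benefit(
    hours_saved=1000, staff_hourly_rate=20.0, avoided_acquisitions=5000.0
)
print(f"development cost: ${cost:,.2f}")
print(f"annual benefit:   ${benefit:,.2f}")
print(f"payback period:   {payback_years(cost, benefit):.1f} years")
```

as in the study, user satisfaction is left out of the benefit figure because it was not quantified; any real analysis would substitute the library's own estimates for the placeholder values.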
5
results
within the att report were a description of the current library environment, objectives and description of the proposed system, implementation considerations, a cost/benefits analysis, and recommendations for a plan of action. although care was taken to "walk through" the function of each module of the described system, the report was not intended to provide detailed computer program specifications ready to be coded by a programmer. it described a useful and powerful integrated serials system in sufficient detail to be a working tool in the hands of a knowledgeable systems analyst to match (or revise) already available systems and programs to the library's specifications. the report itself also served as an effective communication link with the university administration, setting out library concerns and giving rational solutions to the pervasive problem of serials control and, in the long term, to an integrated library information system. the timing of the att study was fortunate for the clemson library. the university was on the eve of an accreditation self-study. as often happens with the examination of any organization, a host of related, but unacknowledged, problems surfaced in the course of the att study. during the interviews, staff members felt free to bring up matters of unclear policies, misunderstood hierarchical arrangements, and staffing inadequacies throughout the library. the number and importance of nonautomation concerns were significant enough that an administrative report was written to articulate these problems to the university administration.6 it is interesting to note also that, while in every instance the team received enthusiastic cooperation from all those interviewed, there was fear among some staff members that any automation project would necessarily cut staff positions. once this worry was identified, the study team was able to allay those fears by explaining the study's purpose.
one of the greatest contributions of the att study has been the direction it has given the library for future goals and priorities. by focusing on the problems of serials control, the team evaluated a microcosm of library problems. investigating these problems in the environment of more limited budgets, possible future closing or freezing of the card catalog, and increased user demands for services has helped the library develop a course of action, a resolve of mission, and a direction for future growth. the staff of daps and the library are conducting a review of existing software and systems potentially appropriate for a comprehensive serials control system. the att study was the tool successfully used to elicit university support for library automation. the university has given its approval, and supplied funding, to proceed with the determination of available systems and with the development of a request for quotation.
references
1. for example: university of rochester, river campus libraries, task force on access systems, report (rochester, n.y.: univ. of rochester, 1980); university of california, berkeley, general library, committee on bibliographic control, future of the general library catalogs of the university of california at berkeley (berkeley: univ. of california, 1977); pennsylvania state university libraries, systems development department, remote catalog access system: general system specifications (university park: pennsylvania state univ., 1977).
2. association of research libraries, office of management studies, annual report, 1979 (washington, d.c.: the association, 1979).
3. international business machines corporation, application transfer teams: application description (white plains, n.y.: the corporation, 1977), p.1; international business machines corporation, application transfer teams: realizing your computing systems' potential (white plains, n.y.: the corporation, 1977).
4.
international business machines corporation, business systems planning: information systems planning guide (white plains, n.y.: the corporation, 1975), p.49.
5. richard w. meyer and others, total integrated library information system: a report on the general design phase (syracuse, n.y.: eric clearinghouse on information resources, 1980), ed 191446.
6. richard w. meyer, cooper library: status and agenda. a report on fy 1979-80 (clemson, s.c.: clemson univ., 1980).
reducing psychological resistance to digital repositories | quinn 67
and mit mandates, and other mandates such as the one instituted at stanford’s school of education, have come to pass, and the registry of open access repository material archiving policies (roarmap) lists more than 120 mandates around the world that now exist.3 while it is too early to tell whether these developments will be successful in getting faculty to deposit their work in digital repositories, they at least establish a precedent that other institutions may follow. how many institutions follow and how effective the mandates will be once enacted remains to be seen. will all colleges and universities, or even a majority, adopt mandates that require faculty to deposit their work in repositories? what of those that do not? even if most institutions are successful in instituting mandates, will they be sufficient to obtain faculty cooperation? for those institutions that do not adopt mandates, how are they going to persuade faculty to participate in self-archiving, or even in some variation, such as having surrogates (librarians, staff, or graduate assistants) archive the work of faculty? are mandates the only way to ensure faculty cooperation and compliance, or are mandates even necessarily the best way? to begin to adequately address the problem of user resistance to digital repositories, it might help to first gain some insight into the psychology of resistance.
the existing literature on user behavior with regard to digital repositories devotes scant attention to the psychology of resistance. in an article entitled “institutional repositories: partnering with faculty to enhance scholarly communication,” johnson discusses the inertia of the traditional publishing paradigm. he notes that this inertia is most evident in academic faculty. this would suggest that the problem of eliciting user cooperation is primarily motivational and that the problem is more one of indifference than active resistance.4 heterick, in his article “faculty attitudes toward electronic resources,” suggests that one reason faculty may be resistant to digital repositories is because they do not fully trust them. in response to a survey he conducted, 48 percent of faculty felt that libraries should maintain paper archives.5 the implication is that digital repositories and archives may never completely replace hard copies in the minds of scholars. in “understanding faculty to improve content recruitment for institutional repositories,” foster and gibbons point out that faculty complain of having too much work already. they resent any additional work that contributing to a digital repository might entail. thus the authors echo johnson in suggesting that faculty resistance
the potential value of digital repositories is dependent on the cooperation of scholars to deposit their work. although many researchers have been resistant to submitting their work, the literature on digital repositories contains very little research on the psychology of resistance. this article looks at the psychological literature on resistance and explores what its implications might be for reducing the resistance of scholars to submitting their work to digital repositories. psychologists have devised many potentially useful strategies for reducing resistance that might be used to address the problem; this article examines these strategies and how they might be applied.
observing the development and growth of digital repositories in recent years has been a bit like riding an emotional roller coaster. even the definition of what constitutes a repository may not be the subject of complete agreement, but for the purposes of this study, a repository is defined as an online database of digital or digitized scholarly works constructed for the purpose of preserving and disseminating scholarly research. the initial enthusiasm expressed by librarians and advocates of open access toward the potential of repositories to make significant amounts of scholarly research available to anyone with internet access gradually gave way to a more somber appraisal of the prospects of getting faculty and researchers to deposit their work. in august 2007, bailey posted an entry to his digital koans blog titled “institutional repositories: doa?” in which he noted that building digital repository collections would be a long, arduous, and costly process.1 the success of repositories, in his view, will be a function not so much of technical considerations as of attitudinal ones. faculty remain unconvinced that repositories are important, and there is a critical need for outreach programs that point to repositories as an important step in solving the crisis in scholarly communication. salo elaborated on bailey’s post with “yes, irs are broken. let’s talk about it,” on her own blog, caveat lector. salo points out that institutional repositories have not fulfilled their early promise of attracting a large number of faculty who are willing to submit their work. she criticizes repositories for monopolizing the time of library faculty and staff, and she states her belief that repositories will not work without deposit mandates, but that mandates are impractical.2 subsequent events in the world of scholarly communication might suggest that mandates may be less impractical than salo originally thought.
since her post, the national institutes of health mandate, the harvard
brian quinn (brian.quinn@ttu.edu) is social sciences librarian, texas tech university libraries, lubbock.
brian quinn
reducing psychological resistance to digital repositories
68 information technology and libraries | june 2010
whether or not this was actually the case.11 this study also suggests that a combination of both cognitive and affective processes feed faculty resistance to digital repositories. it can be seen from the preceding review of the literature that several factors have been identified as being possible sources of user resistance to digital repositories. yet the authors offer little in the way of strategies for addressing this resistance other than to suggest workaround solutions such as having nonscholars (e.g., librarians, graduate students, or clerical staff) serve as proxy for faculty and deposit their work for them, or to suggest that institutions mandate that faculty deposit their work. similarly, although numerous arguments have been made in favor of digital repositories and open access, they do not directly address the resistance issue.12 in contrast, psychologists have studied user resistance extensively and accumulated a body of research that may suggest ways to reduce resistance rather than try to circumvent it. it may be helpful to examine some of these studies to see what insights they might offer to help address the problem of user resistance. it should be pointed out that resistance as a topic has been addressed in the business and organizational literature, but has generally been approached from the standpoint of management and organizational change.13 this study has chosen to focus primarily on the psychology of resistance because many repositories are situated in a university setting.
unlike employees of a corporation, faculty members typically have a greater degree of autonomy and latitude in deciding whether to accommodate new work processes and procedures into their existing routines, and the locus of change will therefore be more at an individual level.
■■ the psychology of user resistance
psychologists define resistance as a preexisting state or attitude in which the user is motivated to counter any attempts at persuasion. this motivation may occur on a cognitive, affective, or behavioral level. psychologists thus distinguish between a state of not being persuaded and one in which there is actual motivation to not comply. the source of the motivation is usually an affective state, such as anxiety or ambivalence, which itself may result from cognitive problems, such as misunderstanding, ignorance, or confusion.14 it is interesting to note that psychologists have long viewed inertia as one form of resistance, suggesting paradoxically that a person can be motivated to inaction.15 resistance may also manifest itself in more subtle forms that shade into indifference, suspicion of new work processes or technologies, and contentment with the status quo.
may be attributed at least in part to motivation.6 in another article published a few months later, foster and gibbons suggest that the main reason faculty have been slow to deposit their work in digital repositories is a cognitive one: faculty have not understood how they would benefit by doing so. the authors also mention that users may feel anxiety when executing the sequence of technical steps needed to deposit their work, and that they may also worry about possible copyright infringement.7 the psychology of resistance may thus manifest itself in both cognitive and affective ways. 
harley and her colleagues talk about faculty not perceiving any reward for depositing their work in their article “the influence of academic values on scholarly publication and communication practices.” this perception results in reduced drive to participate. anxiety is another factor contributing to resistance: faculty fear that their work may be vulnerable to plagiarism in an open-access environment.8 in “towards user responsive institutional repositories: a case study,” devakos suggests that one source of user resistance is cognitive in origin. scholars do not submit their work frequently enough to be able to navigate the interface from memory, so they must reinitiate the learning process each time they submit their work. the same is true for entering metadata for their work.9 their sense of control may also be threatened by any limitations that may be imposed on substituting later iterations of their work for earlier versions. davis and connolly point to several sources of confusion, uncertainty, and anxiety among faculty in their article “institutional repositories: evaluating the reasons for non-use of cornell university’s installation of dspace.” cognitive problems arise from having to learn new technology to deposit work and not knowing copyright details well enough to know whether publishers would permit the deposit of research prior to publication. faculty wonder whether this might jeopardize their chances of acceptance by important journals whose editors might view deposit as a form of prior publication that would disqualify them from consideration. 
there is also fear that the complex structure of a large repository may actually make a scholar’s work more difficult to find; faculty may not understand that repositories are not isolated institutional entities but are usually searchable by major search engines like google.10 kim also identifies anxiety about plagiarism and confusion about copyright as being sources of faculty resistance in the article “motivating and impeding factors affecting faculty contribution to institutional repositories.” kim found that plagiarism anxiety made some faculty only willing to deposit already-published work and that prepublication material was considered too risky. faculty with no self-archiving experience also felt that many publishers do not allow self-archiving,
more open to information that challenges their beliefs and attitudes and are more open to suggestion.18 thus before beginning a discussion of why users should deposit their research in repositories, it might help to first affirm the users’ self-concept. this could be done, for example, by reminding them of how unbiased they are in their work or how important it is in their work to be open to new ideas and new approaches, or how successful they have been in their work as scholars. the affirmation should be subtle and not directly related to the repository situation, but it should remind them that they are open-minded individuals who are not bound by tradition and that part of their success is attributable to their flexibility and adaptability. once the users have been affirmed, librarians can then lead into a discussion of the importance of submitting scholarly research to repositories. self-generated affirmations may be even more effective. for example, another way to affirm the self would be to ask users to recall instances in which they successfully took a new approach or otherwise broke new ground or were innovative in some way. 
this could serve as a segue into a discussion of the repository as one more opportunity to be innovative. once the self-concept has been boosted, the threatening quality of the message will be perceived as less disturbing and will be more likely to receive consideration. a related strategy that psychologists employ to reduce resistance involves casting the user in the role of “expert.” this is especially easy to do with scholars because they are experts in their fields. casting the user in the role of expert can deactivate resistance by putting that person in the persuasive role, which creates a form of role reversal.19 rather than the librarian being seen as the persuader, the scholar is placed in that role. by saying to the scholar, “you are the expert in the area of communicating your research to an audience, so you would know better why the digital repository is an alternative that deserves consideration once you understand how it works and how it may benefit you,” you are empowering the user. casting the user as an expert imparts a sense of control to the user. it helps to disable resistance by placing the user in a position of being predisposed to agree to the role he or she is being cast in, which also makes the user more prone to agree with the idea of using a digital repository.
priming and imaging
one important discovery that psychologists have made that has some bearing on user resistance is that even subtle manipulations can have a significant effect on one’s judgments and actions. in an interesting experiment, psychologists told a group of students that they were to read an online newspaper, ostensibly to evaluate its design and assess how easy it was to read. half of them read an editorial discussing a public opinion survey of youth
■■ negative and positive strategies for reducing resistance
just as the definition of resistance can be paradoxical, so too may be some of the strategies that psychologists use to address it. 
perhaps the most basic example is to counter resistance by acknowledging it. when scholars are presented with a message that overtly states that digital repositories are beneficial and desirable, it may simultaneously generate a covert reaction in the form of resistance. rather than simply anticipating this and attempting to ignore it, digital repository advocates might be more persuasive if they acknowledge to scholars that there will likely be resistance, mention some possible reasons (e.g., plagiarism or copyright concerns), and immediately introduce some counterrationales to address those reasons.16 psychologists have found that being up front and forthcoming can reduce resistance, particularly with regard to the downside of digital repositories. they have learned that it can be advantageous to preemptively reveal negative information about something so that it can be downplayed or discounted. thus talking about the weaknesses or shortcomings of digital repositories as early as possible in an interaction may have the effect of making these problems seem less important and weakening user resistance. not only does revealing negative information impart a sense of honesty and credibility to the user, but psychologists have found that people feel closer to people who reveal personal information.17 a librarian could thus describe some of his or her own frustrations in using repositories as an effective way of establishing rapport with resistant users. the unexpected approach of bringing up the less desirable aspects of repositories—whether this refers to the technological steps that must be learned to submit one’s work or the fact that depositing one’s work in a repository is not a guarantee that it will be highly cited—can be disarming to the resistant user. this is particularly true of more resistant users who may have been expecting a strong hard-sell approach on the part of librarians. 
when suddenly faced with a more candid appeal the user may be thrown off balance psychologically, leaving him or her more vulnerable to information that is the opposite of what was anticipated and to possibly viewing that information in a more positive light. if one way to disarm a user is to begin by discussing the negatives, a seemingly opposite approach that psychologists take is to reinforce the user’s sense of self. psychologists believe that one source of resistance stems from when a user’s self-concept—which the user tries to protect from any source of undesired change—has been threatened in one way or another. a stable self-concept is necessary for the user to maintain a sense of order and predictability. reinforcing the self-concept of the user should therefore make the user less likely to resist depositing work in a digital repository. self-affirmed users are
or even possibly collaborating on research. their imaginations could be further stimulated by asking them to think of what it would be like to have their work still actively preserved and available to their successors a century from now. using the imagining strategy could potentially be significantly more effective in attenuating resistance than presenting arguments based on dry facts.
identification and liking
conscious processes like imagining are not the only psychological means of reducing the resistance of users to digital repositories. unconscious processes can also be helpful. one example of such a process is what psychologists refer to as the “liking heuristic.” this refers to the tendency of users to employ a rule-of-thumb method to decide whether to comply with requests from persons. this tendency results from users constantly being inundated with requests. consequently, they need to simplify and streamline the decision-making process that they use to decide whether to cooperate with a request. 
the liking heuristic holds that users are more likely to help someone they might otherwise not help if they unconsciously identify with the person. at an unconscious level, the user may think that a person acts like them and dresses like them, and therefore the user identifies with that person and likes them enough to comply with their request. in one experiment that psychologists conducted to see if people are more likely to comply with requests from people that they identify with, female undergraduates were informed that they would be participating in a study of first impressions. the subjects were instructed that they and a person in another room would each learn a little about one another without meeting each other. each subject was then given a list of fifty adjectives and was asked to select the twenty that were most characteristic of themselves. the experimenter then told the participants that they would get to see each other’s lists. the experimenter took the subject’s list and then returned a short time later with what supposedly was the other participant’s list, but was actually a list that the experimenter had filled out to indicate that either the subject had much in common with the other participant’s personality (seventeen of twenty matches), some shared attributes (ten of twenty matches), or relatively few characteristics in common (three of twenty matches). the subject was then asked to examine the list and fill out a survey that probed their initial impressions of the other participant, including how much they liked them. at the end of the experiment, the two subjects were brought together and given credit for participating. the experimenter soon left the room and the confederate participant asked the other participant if she would read and critically evaluate an eight-page paper for an english class. 
the results of the experiment indicated that the more the participant thought she shared in consumer patterns that highlighted functional needs, and the other half read a similar editorial focusing on hedonistic needs. the students next viewed an ad for a new brand of shampoo that featured either a strong or a weak argument for the product. the results of the experiment indicated that students who read the functional editorial and were then subsequently exposed to the strong argument for the shampoo (a functional product) had a much more favorable impression of the brand than students who had received the mismatched prime.20 while it may seem that the editorial and the shampoo were unrelated, psychologists found that the subjects engaged in a process of elaborating the editorial, which then predisposed them to favor the shampoo. the presence of elaboration, which is a precursor to the development of attitudes, suggests that librarians could reduce users’ resistance to digital repositories by first involving them in some form of priming activity immediately prior to any attempt to persuade them. for example, asking faculty to read a brief case study of a scholar who has benefited from involvement in open-access activity might serve as an effective prime. another example might be to listen briefly to a speaker summarizing the individual, disciplinary, and societal benefits of sharing one’s research with colleagues. interventions like these should help mitigate any predisposition toward resistance on the part of users. imagining is a strategy related to priming that psychologists have found to be effective in reducing resistance. taking their cue from insurance salesmen—who are trained to get clients to actively imagine what it would be like to lose their home or be in an accident—a group of psychologists conducted an experiment in which they divided a sample of homeowners who were considering the purchase of cable tv into two groups. 
one group was presented with the benefits of cable in a straightforward, informative way that described various features. the other group was asked to imagine themselves enjoying the benefits and all the possible channels and shows that they might experience and how entertaining it might be. the psychologists then administered a questionnaire. the results indicated that those participants who were asked to imagine the benefits of cable were much more likely to want cable tv and to subscribe to it than were those who were only given information about cable tv.21 in other words, imagining resulted in more positive attitudes and beliefs. this study suggests that librarians attempting to reduce resistance among users of digital repositories may need to do more than merely inform or describe to them the advantages of depositing their work. they may need to ask users to imagine in vivid detail what it would be like to receive periodic reports indicating that their work had been downloaded dozens or even hundreds of times. librarians could ask them to imagine receiving e-mail or calls from colleagues indicating that they had accessed their work in the repository and were interested in learning more about it,
students typically overestimate the amount of drinking that their peers engage in at parties. these inaccurate normative beliefs act as a negative influence, causing them to imbibe more because they believe that is what their peers are doing. by informing students that almost three-quarters of their peers have less than three drinks at social gatherings, psychologists have had some success in reducing excessive drinking behavior by students.23 the power of normative messages is illustrated by a recent experiment conducted by a group of psychologists who created a series of five cards to encourage hotel guests to reuse their towels during their stay. 
the psychologists hypothesized that by appealing to social norms, they could increase compliance rates. to test their hypothesis, the researchers used a different conceptual appeal for each of the five cards. one card appealed to environmental concerns (“help save the environment”), another to environmental cooperation (“partner with us to save the environment”), a third card appealed to the advantage to the hotel (“help the hotel save energy”), a fourth card targeted future generations (“help save resources for future generations”), and a final card appealed to guests by making reference to a descriptive norm of the situation (“join your fellow citizens in helping to save the environment”). the results of the study indicated that the card that mentioned the benefit to the hotel was least effective in getting guests to reuse their towels, and the card that was most effective was the one that mentioned that descriptive norm.24 this research suggests that if users who are resistant to submitting their work to digital repositories were informed that a larger percentage of their peers were depositing work than they realized, resistance may be reduced. this might prove to be particularly true if they learned that prominent or influential scholars were engaged in populating repositories with their work. this would create a social-norms effect that would help legitimize repositories to other faculty and help them to perceive the submission process as normal and desirable. the idea that accomplished researchers are submitting materials and reaping the benefits might prove very attractive to less experienced and less well-regarded faculty. psychologists have a considerable body of evidence in the area of social modeling that suggests that people will imitate the behavior of others in social situations because that behavior provides an implicit guideline of what to do in a similar situation. 
a related finding is that the more influential people are, the more likely it is for others to emulate their actions. this is even more probable for high-status individuals who are skilled and attractive and who are capable of communicating what needs to be done to potential followers.25 social modeling addresses both the cognitive dimension of how resistant users should behave and also the affective dimension by offering models that serve as a source of motivation to resistant users to change
common with the confederate, the more she liked her. the more she liked the confederate and experienced a perception of consensus, the more likely she was to comply with her request to critique the paper.22 thus, when trying to overcome the resistance of users to depositing their work in a digital repository, it might make sense to consider who it is that is making the request. universities sometimes host scholarly communication symposia that are aimed not only at getting faculty interested in open-access issues but also at urging them to submit their work to the institution’s repositories. frequently, speakers at these symposia consist of academic administrators, members of scholarly communication or open-access advocacy organizations, or individuals in the library field. the research conducted by psychologists, however, suggests that appeals to scholars and researchers would be more effective if they were made by other scholars and those who are actively engaged in research. faculty are much more likely to identify with and cooperate with requests from their own tribe, as it were, and efforts need to be concentrated on getting faculty who are involved in and understand the value of repositories to articulate this to their colleagues. researchers who can personally testify to the benefits of depositing their work are most likely to be effective at convincing other researchers of the value of doing likewise and will be more effective at reducing resistance. 
librarians need to recognize who their potentially most effective spokespersons and advocates are, which the psychological research seems to suggest is faculty talking to other faculty.
perceived consensus and social modeling
the processes of faculty identification with peers and perceived consensus mentioned above can be further enhanced by informing researchers that other scholars are submitting their work, rather than merely telling researchers why they should submit their work. information about the practices of others may help change beliefs because of the need to identify with other in-group members. this is particularly true of faculty, who are prone to making continuous comparisons with their peers at other institutions and who are highly competitive by nature. once they are informed of the career advantages of depositing their work (in terms of professional visibility, collaboration opportunities, etc.), and they are informed that other researchers have these advantages, this then becomes an impetus for them to submit their work to keep up with their peers and stay competitive. a perception of consensus is thus fostered—a feeling that if one’s peers are already depositing their work, this is a practice that one can more easily agree to. psychologists have leveraged the power of identification by using social-norms research to inform people about the reality of what constitutes normative behavior as opposed to people’s perceptions of it. for example, college
highly resistant users that may be unwilling to submit their work to a repository. rather than trying to prepare a strong argument based on reason and logic, psychologists believe that using a narrative approach may be more effective. this means conveying the facts about open access and digital repositories in the form of a story. stories are less rhetorical and tend not to be viewed by listeners as attempts at persuasion. 
the intent of the communicator and the counterresistant message are not as overt, and the intent of the message might not be obvious until it has already had a chance to influence the listener. a well-crafted narrative may be able to get under the radar of the listener before the listener has a chance to react defensively and revert to a mode of resistance. in a narrative, beliefs are rarely stated overtly but are implied, and implied beliefs are more difficult to refute than overtly stated beliefs. listening to a story and wondering how it will turn out tends to use up much of the cognitive attentional capacity that might otherwise be devoted to counterarguing, which is another reason why using a narrative approach may be particularly effective with users who are strongly resistant. the longer and more subtle nature of narratives may also make them less a target of resistance than more direct arguments.28 using a narrative approach, the case for submitting work to a repository might be presented not as a collection of dry facts or statistics, but rather as a story. the protagonists are the researchers, and their struggle is to obtain recognition for their work and to advance scholarship by providing maximum access to the greatest audience of scholars and to obtain as much access as possible to the work of their peers so that they can build on it. the protagonists are thwarted in their attempts to achieve their ends by avaricious publishers who obtain the work of researchers for free and then sell it back to them in the form of journal and database subscriptions and books for exorbitant prices. these prices far exceed the rate of inflation or the budgets of universities to pay for them. 
the publishers engage in a series of mergers and acquisitions that swallow up small publishing firms and result in the scholarly publishing enterprise being controlled by a few giant firms that offer unreasonable terms to users and make unreasonable demands when negotiating with them. presented in this dramatic way, the significance of scholar participation in digital repositories becomes magnified to an extent that it becomes more difficult to resist what may almost seem like an epic struggle between good and evil. and while this may be a greatly oversimplified example, it nonetheless provides a sense of the potential power of using a narrative approach as a technique to reduce resistance. introducing a time element into the attempt to persuade users to deposit their work in digital repositories can play an important role in reducing resistance. given that faculty are highly competitive, introducing the idea not only that other faculty are submitting their work but that they are already benefiting as a result makes the
their behavior in the desired direction.
redefinition, consistency, and depersonalization
another strategy that psychologists use to reduce resistance among users is to change the definition of the situation. resistant users see the process of submitting their research to the repository as an imposition at best. in their view, the last thing that they need is another obligation or responsibility to burden their already busy lives. psychologists have learned that reframing a situation can reduce resistance by encouraging the user to look at the same phenomenon in a different way. 
in the current situation, resistant users should be informed that depositing their work in a digital repository is not a burden but a way to raise their professional profile as researchers, to expose their work to a wider audience, and to heighten their visibility among not only their peers but a much larger potential audience that would be able to encounter their work on the web. seen in this way, the additional work of submission is less of a distraction and more of a career investment. moreover, this approach leverages a related psychological concept that can be useful in helping to dissolve resistance. psychologists understand that inconsistency has a negative effect on self-esteem, so persuading users to believe that submitting their work to a digital repository is consistent with their past behavior can be motivating.26 the point needs to be emphasized with researchers that the act of submitting their work to a digital repository is not something strange and radical, but is consistent with prior actions intended to publicize and promote their work. a digital repository can be seen as analogous to a preprint, book, journal, or other tangible and familiar vehicles that faculty have used countless times to send their work out into the world. while the medium might have changed, the intention and the goal are the same. reframing the act of depositing as “old wine in new bottles” may help to undermine resistance. in approaching highly resistant individuals, psychologists have discovered that it is essential to depersonalize any appeal to change their behavior. instead of saying, “you should reduce your caloric intake,” it is better to say, “it is important for people to reduce their caloric intake.” this helps to deflect and reduce the directive, judgmental, and prescriptive quality of the request, thus making it less likely to provoke resistance.27 suggestion can be much less threatening than prescription among users who may be suspicious and mistrusting. 
reverting to a third-person level of appeal may allow the message to get through without it being immediately rejected by the user.
narrative, timing, and anticipation
psychologists recommend another strategy to help defuse
technological platforms, and so on. this could be followed by a reminder to users that it is their choice—it is entirely up to them. this reminder that users have the freedom of choice may help to further counter any resistance generated as a result of instructions or inducements to anticipate regret. indeed, psychologists have found that reinstating a choice that was previously threatened can result in greater compliance than if the threat had never been introduced.32 offering users the freedom to choose between alternatives tends to make them more likely to comply. this is because having a choice enables users to both accept and resist the request rather than simply focus all their resistance on a single alternative. when presented with options, the user is able to satisfy the urge to resist by rejecting one option but is simultaneously motivated to accept another option; the user is aware that there are benefits to complying and wants to take advantage of them but also wants to save face and not give in. by being offered several alternatives that nonetheless all commit to a similar outcome, the user is able to resist and accept at the same time.33 for example, one alternative option to self-archiving might be to present the faculty member with the option of an author-pays publishing model. the choice of alternatives allows the faculty member to be selective and discerning so that a sense of satisfaction is derived from the ability to resist by rejecting one alternative. at the same time, the librarian is able to gain compliance because one of the other alternatives that commits the faculty member to depositing research is accepted. 
options, comparisons, increments, and guarantees
in addition to offering options, another way to erode user resistance to digital repositories is to use a comparative strategy. one technique is to first make a large request, such as “we would like you to submit all the articles that you have published in the last decade to the repository,” and then follow this with a more modest request, such as “we would appreciate it if you would please deposit all the articles you have published in the last year.” the original request becomes an “anchor” or point of reference in the mind of the user against which the subsequent request is then evaluated. setting a high anchor lessens user resistance by changing the user’s point of comparison of the second request from nothing (not depositing any work in the repository) to a higher value (submitting a decade of work). in this way, a high reference anchor is established for the second request, which makes it seem more reasonable in the newly created context of the higher value.34 the user is thus more likely to comply with the second request when it is framed in this way. using this comparative approach may also work because it creates a feeling of reciprocity in the user. when
proposition much more salient. it not only suggests that submitting work is a process that results in a desirable outcome, but that the earlier one’s work is submitted, the more recognition will accrue and the more rapidly one’s career will advance.29 faculty may feel compelled to submit their work in an effort to remain competitive with their colleagues. 
one resource that may be particularly helpful for working with skeptical faculty who want substantiation about the effect of self-archiving on scholarly impact is a bibliography created by the open citation project titled, “the effect of open access and downloads (hits) on citation impact: a bibliography of studies.”30 it provides substantial documentation of the effect that open access has on scholarly visibility. an additional stimulus might be introduced in conjunction with the time element in the form of a download report. showing faculty how downloads accumulate over time is analogous to arguments that investment counselors use showing how interest on investments accrues and compounds over time. this investment analogy creates a condition in which hesitating to submit their work results in faculty potentially losing recognition and compromising their career advancement. an interesting related finding by psychologists suggests that an effective way to reduce user resistance is to have users think about the future consequences of complying or not complying. in particular, if users are asked to anticipate the amount of future regret they might experience for making a poor choice, this can significantly reduce the amount of resistance to complying with a request. normally, users tend not to ruminate about the possibility of future disappointment in making a decision. if users are made to anticipate future regret, however, they will act in the present to try to minimize it. studies conducted by psychologists show that when users are asked to anticipate the amount of future regret that they might experience for choosing to comply with a request and having it turn out adversely versus choosing to not comply and having it turn out adversely, they consistently indicate that they would feel more regret if they did not comply and experienced negative consequences as a result.31 in an effort to minimize this anticipated regret, they will then be more prone to comply. 
based on this research, one strategy to reduce user resistance to digital repositories would be to get users to think about the future, specifically about future regret resulting from not cooperating with the request to submit their work. if they feel that they might experience more regret in not cooperating than in cooperating, they might then be more inclined to cooperate. getting users to think about the future could be done by asking users to imagine various scenarios involving the negative outcomes of not complying, such as lost opportunities for recognition, a lack of citation by peers, lost invitations to collaborate, an inability to migrate one’s work to future 74 information technology and libraries | june 2010 submit their work. mandates rely on authority rather than persuasion to accomplish this and, as such, may represent a less-than-optimal solution to reducing user resistance. mandates represent a failure to arrive at a meeting of the minds of advocates of open access, such as librarians, and the rest of the intellectual community. understanding the psychology of resistance is an important prerequisite to any effort to reduce it. psychologists have assembled a significant body of research on resistance and how to address it. some of the strategies that the research suggests may be effective, such as discussing resistance itself with users and talking about the negative effects of repositories, may seem counterintuitive and have probably not been widely used by librarians. yet when other more conventional techniques have been tried with little or no success, it may make sense to experiment with some of these approaches. particularly in the academy, where reason is supposed to prevail over authority, incorporating resistance psychology into a program aimed at soliciting faculty research seems an appropriate step before resorting to mandates. 
most strategies that librarians have used in trying to persuade faculty to submit their work have been conventional. they are primarily of a cognitive nature and are variations on informing and educating faculty about how repositories work and why they are important. researchers have an important affective dimension that needs to be addressed by these appeals, and the psychological research on resistance suggests that a strictly rational approach may not be sufficient. by incorporating some of the seemingly paradoxical and counterintuitive techniques discussed earlier, librarians may be able to penetrate the resistance of researchers and reach them at a deeper, less rational level. ideally, a mixture of rational and less-conventional approaches might be combined to maximize effectiveness. such a program may not eliminate resistance but could go a long way toward reducing it. future studies that test the effectiveness of such programs will hopefully be conducted to provide us with a better sense of how they work in real-world settings.
references
1. charles w. bailey jr., “institutional repositories: doa?,” online posting, digital koans, aug. 22, 2007, http://digital-scholarship.org/digitalkoans/2007/08/21/institutional-repositories-doa/ (accessed apr. 21, 2010).
2. dorothea salo, “yes, irs are broken. let’s talk about it,” online posting, caveat lector, sept. 5, 2007, http://cavlec.yarinareth.net/2007/09/05/yes-irs-are-broken-lets-talk-about-it/ (accessed apr. 21, 2010).
3. eprints services, roarmap (registry of open access repository material archiving policies) http://www.eprints.org/openaccess/policysignup/ (accessed july 28, 2009).
4. richard k. johnson, “institutional repositories: partnering
the requester scales down the request from the large one to a smaller one, it creates a sense of obligation on the part of the user to also make a concession by agreeing to the more modest request.
the cultural expectation of reciprocity places the user in a situation in which they will comply with the lesser request to avoid feelings of guilt.35 for the most resistant users, breaking the request down into the smallest possible increment may prove helpful. by making the request seem more manageable, the user is encouraged to comply. psychologists conducted an experiment to test whether minimizing a request would result in greater cooperation. they went door-to-door, soliciting contributions to the american cancer society, and received donations from 29 percent of households. they then made additional solicitations, this time asking, “would you contribute? even a penny will help!” using this approach, donations increased to 50 percent. even though the solicitors only asked for a penny, the amounts of the donations were equal to that of the original request. by asking for “even a penny,” the solicitors made the request appear to be more modest and less of a target of resistance.36 librarians might approach faculty by saying “if you could even submit one paper we would be grateful,” with the idea that once faculty make an initial submission they will be more inclined to submit more papers in the future. one final strategy that psychological research suggests may be effective in reducing resistance to digital repositories is to make sure that users understand that the decision to deposit their work is not irrevocable. with any new product, users have fears about what might happen if they try it and they are not satisfied with it. not knowing the consequences of making a decision that they may later regret fuels reluctance to become involved with it. faculty need to be reassured that they can opt out of participating at any time and that the repository sponsors will guarantee this. 
this guarantee needs to be repeated and emphasized as much as possible in the solicitation process so that faculty are frequently reminded that they are entering into a decision that they can reverse if they so decide. having this reassurance should make researchers much less resistant to submitting their work, and the few faculty who may decide that they want to opt out are worth the reduction in resistance.37 the digital repository is a new phenomenon that faculty are unfamiliar with, and it is therefore important to create an atmosphere of trust. the guarantee will help win that trust. ■■ conclusion the scholarly literature on digital repositories has given little attention to the psychology of resistance. yet the ultimate success of digital repositories depends on overcoming the resistance of scholars and researchers to reducing psychological resistance to digital repositories | quinn 75 20. curtis p. haugtvedt et al., “consumer psychology and attitude change,” in knowles and linn, resistance and persuasion, 283–96. 21. larry w. gregory, robert b. cialdini, and kathleen m. carpenter, “self-relevant scenarios as mediators of likelihood estimates and compliance: does imagining make it so?” journal of personality & social psychology 43, no. 1 (1982): 89–99. 22. jerry m. burger, “fleeting attraction and compliance with requests,” in the science of social influence: advances and future progress, ed. anthony r. pratkanis (new york: psychology pr., 2007): 155–66. 23. john d. clapp and anita lyn mcdonald, “the relationship of perceptions of alcohol promotion and peer drinking norms to alcohol problems reported by college students,” journal of college student development 41, no. 1 (2000): 19–26. 24. noah j. goldstein and robert b. cialdini, “using social norms as a lever of social influence,” in the science of social influence: advances and future progress, ed. anthony r. pratkanis (new york: psychology pr., 2007): 167–90. 25. dale h. 
schunk, “social-self interaction and achievement behavior,” educational psychologist 34, no. 4 (1999): 219–27. 26. rosanna e. guadagno et al., “when saying yes leads to saying no: preference for consistency and the reverse foot-in-the-door effect,” personality & social psychology bulletin 27, no. 7 (2001): 859–67. 27. mary jiang bresnahan et al., “personal and cultural differences in responding to criticism in three countries,” asian journal of social psychology 5, no. 2 (2002): 93–105. 28. melanie c. green and timothy c. brock, “in the mind’s eye: transportation-imagery model of narrative persuasion,” in narrative impact: social and cultural foundations, ed. melanie c. green, jeffrey j. strange, and timothy c. brock (mahwah, n.j.: lawrence erlbaum, 2004): 315–41. 29. oswald huber, “time pressure in risky decision making: effect on risk defusing,” psychology science 49, no. 4 (2007): 415–26. 30. the open citation project, “the effect of open access and downloads (‘hits’) on citation impact: a bibliography of studies,” july 17, 2009, http://opcit.eprints.org/oacitation-biblio.html (accessed july 29, 2009). 31. matthew t. crawford et al., “reactance, compliance, and anticipated regret,” journal of experimental social psychology 38, no. 1 (2002): 56–63. 32. nicolas gueguen and alexandre pascual, “evocation of freedom and compliance: the ‘but you are free of . . .’ technique,” current research in social psychology 5, no. 18 (2000): 264–70. 33. james p. dillard, “the current status of research on sequential request compliance techniques,” personality & social psychology bulletin 17, no. 3 (1991): 283–88. 34. thomas mussweiler, “the malleability of anchoring effects,” experimental psychology 49, no. 1 (2002): 67–72. 35. robert b. cialdini and noah j. goldstein, “social influence: compliance and conformity,” annual review of psychology 55 (2004): 591–621. 36. james m. wyant and stephen l.
smith, “getting more by asking for less: the effects of request size on donations of charity,” journal of applied social psychology 17, no. 4 (1987): 392–400.
37. lydia j. price, “the joint effects of brands and warranties in signaling new product quality,” journal of economic psychology 23, no. 2 (2002): 165–90.
with faculty to enhance scholarly communication,” d-lib magazine 8, no. 11 (2002), http://www.dlib.org/dlib/november02/johnson/11johnson.html (accessed apr. 2, 2008).
5. bruce heterick, “faculty attitudes toward electronic resources,” educause review 37, no. 4 (2002): 10–11.
6. nancy fried foster and susan gibbons, “understanding faculty to improve content recruitment for institutional repositories,” d-lib magazine 11, no. 1 (2005), http://www.dlib.org/dlib/january05/foster/01foster.html (accessed july 29, 2009).
7. suzanne bell, nancy fried foster, and susan gibbons, “reference librarians and the success of institutional repositories,” reference services review 33, no. 3 (2005): 283–90.
8. diane harley et al., “the influence of academic values on scholarly publication and communication practices,” center for studies in higher education, research & occasional paper series: cshe.13.06, sept. 1, 2006, http://repositories.cdlib.org/cshe/cshe-13-06/ (accessed apr. 17, 2008).
9. rea devakos, “towards user responsive institutional repositories: a case study,” library high tech 24, no. 2 (2006): 173–82.
10. philip m. davis and matthew j. l. connolly, “institutional repositories: evaluating the reasons for non-use of cornell university’s installation of dspace,” d-lib magazine 13, no. 3/4 (2007), http://www.dlib.org/dlib/march07/davis/03davis.html (accessed july 29, 2009).
11. jihyun kim, “motivating and impeding factors affecting faculty contribution to institutional repositories,” journal of digital information 8, no. 2 (2007), http://journals.tdl.org/jodi/article/view/193/177 (accessed july 29, 2009).
12.
peter suber, “open access overview” online posting, open access news: news from the open access environment, june 21, 2004, http://www.earlham.edu/~peters/fos/overview.htm (accessed july 29, 2009).
13. see, for example, jeffrey d. ford and laurie w. ford, “decoding resistance to change,” harvard business review 87, no. 4 (2009): 99–103; john p. kotter and leonard a. schlesinger, “choosing strategies for change,” harvard business review 86, no. 7/8 (2008): 130–39; and paul r. lawrence, “how to deal with resistance to change,” harvard business review 47, no. 1 (1969): 4–176.
14. julia zuwerink jacks and maureen e. o’brien, “decreasing resistance by affirming the self,” in resistance and persuasion, ed. eric s. knowles and jay a. linn (mahwah, n.j.: lawrence erlbaum, 2004): 235–57.
15. benjamin margolis, “notes on narcissistic resistance,” modern psychoanalysis 9, no. 2 (1984): 149–56.
16. ralph grabhorn et al., “the therapeutic relationship as reflected in linguistic interaction: work on resistance,” psychotherapy research 15, no. 4 (2005): 470–82.
17. arthur aron et al., “the experimental generation of interpersonal closeness: a procedure and some preliminary findings,” personality & social psychology bulletin 23, no. 4 (1997): 363–77.
18. geoffrey l. cohen, joshua aronson, and claude m. steele, “when beliefs yield to evidence: reducing biased evaluation by affirming the self,” personality & social psychology bulletin 26, no. 9 (2000): 1151–64.
19. anthony r. pratkanis, “altercasting as an influence tactic,” in attitudes, behavior and social context: the role of norms and group membership, ed. deborah j. terry and michael a. hogg (mahwah, n.j.: lawrence erlbaum, 2000): 201–26.
we do not have an information-prone society. when faced with a problem or interest, i suggest, we are more prone to ask, "what do i have to do?" rather than, "what do i have to know?" part of this reaction is probably due to the fact that when we ask "what do i have to know?"
we are faced with another problem in addition to the initial one; i.e., where to get the information. this added effort simply confirms in us our indifference to information, and we take our best shot at solving the problem through decision and action. i sometimes think we have made a virtue of the information incapacity by the way we laud decision making as an indicator of ability. if the foregoing examples are reasonably accurate, we are then faced with a situation in which information is fundamentally important to societal and individual wellbeing, but is not perceived to be so by people in the conduct of their daily affairs. computer-supported telecommunications systems can be the instrument for accelerating information control by a few (this has been much of the trend, so far, as indicated by corporate, research, and technical use of these systems), or they can be used to build information confidence, use, and desire throughout society. this option, i suggest, is central to the significance of telecommunications systems for a democratic society. if the latter option is to be obtained, i suggest that information will have to be packaged and targeted so well on people's everyday problems and interests that it will be easier and more productive to say "what do i have to know?" before saying "what do i have to do?" a basic approach to articulating an information service of this kind consists of the following steps:
1. determine and prioritize the individual and societal problems and interests of a given community.
2. ascertain the information parameters of those problems and interests.
3. locate and obtain the information necessary to address those problems and interests.
4. organize this information so as to optimally target the specified problem or interest to be as easily retrievable as possible.
this requires an understanding of the context in which the information is used so that it is optimally relevant, and an understanding of the language and problem articulation common to the individuals in the community in order to ensure rapid retrieval.
a lesson in interactive television programming: the home book club on qube
w. theodore bolton: oclc, inc., columbus, ohio.
journal of library automation vol. 14/2 june 1981
on december 1, 1977, warner communications christened what has become the most publicized and talked about technological development in the field of cable television: qube, its two-way interactive cable system. publicity posters claimed that this would be "a day you'll tell your grandchildren about," and broadcasters added the word "interactive" to their cocktail-party vocabulary. academicians who ten years ago forecast a technological revolution initiated by the marriage of computer to cable television smugly grinned and saw their dreams turn into reality. response to qube, however, has been mixed. participatory television brings, to some, futuristic images of instant democracy; others warn of its potential demagogic power.1 regardless of your critical persuasion, there now exists what former cbs executive turned warner amex2 consultant mike dann calls "a whole new utility."3 this whole new utility, whether in the form of qube cable television, or some other combination of computer, cable television, telephone, and standard over-the-air broadcasting, will change the way we conduct our lives and interact with other people.
the history of the home book club
early in 1979, the oclc, inc., research staff appraised the nature and context of the qube facilities (located in columbus, ohio, only five miles away). discussions, which at times centered around far-fetched and lofty ideas, eventually led to realistic and inventive concepts that made use of qube's interactive technology.
the most promising of these concepts was a book discussion program where the audience determined the content and direction of the discussion itself. hoping to take advantage of this new technology, and at the same time expand library services available to the general public, oclc proposed a book discussion program to qube. in a previously released statement, qube vice-president harlan kleiman had stated that the polling capabilities of the qube system should be treated like a "time bomb."4 yet oclc's proposal indicated an interest in exploring these very same devices. this factor, coupled with qube's "closed door" policy toward outside researchers and scholars, seemed to indicate that the home book club research proposal would be rejected. but qube executives did the unexpected: they agreed to air six home book club programs, one each month. and so, on july 18, 1979, at 7 p.m., the home book club premiered.
an interactive book discussion
what makes qube unique is its two-way, or upstream, capability. the qube technology is made up of three complementary computers that are used for monitoring, tabulation, and billing purposes. each qube console in a viewer's home has thirty channels to choose from and five response buttons to press when answering questions posed to home viewers on qube programs. by monitoring and tabulating data that show which tv sets are on, which programs viewers are watching, and which response buttons they last touched, qube has a virtually error-free system of audience research. this allows for a staggering amount of audience data to be compiled, theoretically every six seconds. apart from the thirty-channel capability of standard television, community programs, and pay-per-viewing feature films, the most intriguing aspect of qube is its five response buttons. oclc felt that the use of these buttons should be emphasized and the concept of interaction should be fully incorporated into the home book club.
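the tabulation described above (one last-touched button per console, five buttons to choose from) can be illustrated with a short modern sketch. this is not a description of qube's actual software, which the article does not document; the function names, the console identifiers, and the lowest-button tie-break are assumptions made purely for illustration.

```python
from collections import Counter

# Five response buttons per console, as described in the article.
BUTTONS = (1, 2, 3, 4, 5)


def tally_poll(last_touched):
    """Tally one poll.

    last_touched maps a console id (hypothetical) to the button that
    console last pressed, or None if the viewer did not respond.
    Returns (votes per button, response rate among tuned-in consoles).
    """
    votes = Counter(b for b in last_touched.values() if b in BUTTONS)
    responded = sum(votes.values())
    response_rate = responded / len(last_touched) if last_touched else 0.0
    return votes, response_rate


def leading_choice(votes):
    """Return the button with the most votes (majority rule).

    Ties go to the lowest-numbered button; that rule is an assumption,
    since the article does not say how ties were handled.
    """
    if not votes:
        return None
    top = max(votes.values())
    return min(button for button, count in votes.items() if count == top)


if __name__ == "__main__":
    consoles = {"console-a": 1, "console-b": 3, "console-c": 3, "console-d": None}
    votes, rate = tally_poll(consoles)
    print(dict(votes), rate, leading_choice(votes))
```

counting only the last button touched per console mirrors the article's remark that a single response is recorded per household, which is also one of the limitations of interactive polling noted later.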
at the beginning of each home book club program, home viewers were asked to select, from three alternatives, the opening topic of conversation about the book. after the home viewers had "touched in" their preference on one of the prespecified buttons, the qube polling computer tallied and displayed the results. once the book discussion was under way, the home viewers were given additional opportunities to "democratically" determine whether the panelists should continue in a particular topic area, or move on to new topic areas. if a controversial issue emerged within the course of a discussion, the home book club panelists were encouraged to spontaneously pose interactive questions to home viewers. this form of instantaneous polling was extended to telephone participants who were also periodically incorporated into the book discussion. a sampling of these opinion-type questions included: from the wifey program, "should sandy have left norman?"; from the metropolitan life program, "is this book too subjective for non-new yorkers?"; from the eye of the needle program, "was the violence portrayed a necessary part of this book?"; from the world according to garp program, "was this a feminist novel?"
toward the end of each one-hour home book club program the qube system broke new ground in interactive television history: home viewers selected, from five alternatives, the book to be discussed on next month's program. in addition, home viewers were able to request a copy of the book to be sent to their home at no charge from the public library of columbus and franklin county (plcfc). these two transactions took place with a mere touch of the prespecified button on the qube console. plcfc provided a major contribution to the home book club.
once the qube computers had compiled the names and addresses of those viewers who requested next month's book (earlier, all home viewers had been told that their names would be entered in the qube computer if they responded to a book request), the qube computer printed the names on mailing labels. these labels were forwarded to the plcfc books-by-mail office, which then filled each request. the total time from "touch-in request" to "in-home mail delivery" was usually two to three days. indeed, a form of electronic catalog ordering actually took place each time the home book club program was cablecast in columbus. it should be noted that home book club viewers were also given the opportunity to order the alternative book choices.
who watched the home book club?
an additional use of qube's two-way capability was also incorporated into the first six home book club programs. prior to selecting and ordering the next month's book, home viewers were asked to respond to a series of demographic-type questions. from these questions, a profile of the typical home book club viewer was compiled for plcfc and qube management. this portion of the program also provided the oclc research department with data with which to explore the market-research potential of an interactive television system. from the beginning of the home book club research project, a few obvious limitations of interactive polling became apparent. first, not all home viewers made use of, or were willing to participate in, qube's interactive technology. response rates ranged from 20 to 85 percent, with an approximate mean rate of 55 percent. second, only one viewer in a multiple-person household could respond. third, it can be logically assumed that certain kinds of people will and did interact more often than others. taking these limitations into consideration, a few generalizations could still be made regarding the home book club audience.
the demographic data traced over the first six programs showed the audience to be primarily composed of younger (below thirty-nine years of age), college-educated (65 percent had college or postgraduate degrees), middle to upper income (60 percent earning $25,000 or more per year), females (approximately 70 percent of the interacting audience). these figures should not surprise anyone who is either familiar with previous profiles of the general library users or who may in passing conjure a guess as to what kind of person might be interested in viewing a televised interactive book discussion. a closer inspection of the instantaneous audience demographics, however, led to some disappointing implications.
can a democratic television program survive?
as was pointed out earlier, home viewers were permitted to select the next month's book at the conclusion of a program. this was strictly a democratic process where the majority ruled. the world according to garp, the premier home book club book, was followed by eye of the needle and wifey for programs two and three respectively. the qube computer indicated that each of these programs was viewed by approximately 175 households, or almost 420 individuals. in a competitive structure where there are twenty-nine television program alternatives from which a viewer can choose, qube, oclc, and the plcfc felt that a successful programming concept had been born. qube management enthusiastically reported that the home book club had achieved audience levels that at times rivaled their more extravagant and broad-based entertainment/interview program, "columbus alive." this enthusiasm was short-lived as audience-level figures from program four came in. at the end of program three (wifey), the audience selected james michener's weighty novel chesapeake for the next month's program. the respectable figure of approximately 375 viewers for wifey dwindled to slightly less than 210 viewers for chesapeake.
and to make matters worse, the audience-level figures did not improve for programs five and six. there are several alternative and sometimes complementary explanations for this substantial loss in audience. first, many viewers may not have been able to get through the some one thousand pages of "maryland's eastern shore" history in chesapeake, and thus chose not to participate in the home book club. second, the new fall syndicated programs offered at that time by local network affiliates may have led many viewers to choose alternative programming. additional hypotheses can also be gleaned from the interactive demographic data: whereas in programs one through three approximately 40 percent of the audience indicated their educational level to be either some college or below, only 20 percent of the chesapeake audience (program four) fell into this category. this statistic remained constant for programs five and six of the home book club. in the democratic television environment that the home book club provides, what happens to the minority interest group? could this democratic television system be systematically eliminating specific viewer types? it might be that the outvoted minority group book reader can withstand being overruled just so many times before ceasing to participate. what recourse does this minority interest group have other than to be dominated by higher-educated viewers who heavily stuff the electronic ballot box in favor of their own book preferences? quite clearly the recourse for the minority interest group was to select a competing television program, as evidenced by the declining viewing audience-level figures. the loss of these viewers becomes especially disheartening because this particular audience segment may represent a group of individuals who never before participated in a book discussion.
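the size of the shift discussed above can be made concrete with a rough calculation. the sketch below takes the article's rounded figures (about 375 viewers for wifey, just under 210 for chesapeake, and 40 versus 20 percent reporting "some college or below") and, as a simplifying assumption, applies the percentages reported by interacting viewers to the whole audience. the function names are illustrative only, and the results are estimates rather than measured counts.

```python
def split_audience(total_viewers, lower_education_percent):
    """Split an audience estimate into (lower-education, higher-education).

    Assumes the share reported by interacting viewers holds for the
    whole audience, which the article does not claim.
    """
    lower = total_viewers * lower_education_percent / 100
    return lower, total_viewers - lower


def percent_change(before, after):
    """Percentage change from before to after (negative means a decline)."""
    return (after - before) / before * 100


# Article figures: ~375 viewers at 40 percent, then ~210 viewers at 20 percent.
before_lower, before_higher = split_audience(375, 40)
after_lower, after_higher = split_audience(210, 20)

print("overall change:", round(percent_change(375, 210)), "percent")
print("some college or below:", round(percent_change(before_lower, after_lower)), "percent")
print("college degree or above:", round(percent_change(before_higher, after_higher)), "percent")
```

under these assumptions the overall audience fell by roughly 44 percent, but the "some college or below" segment fell from about 150 viewers to about 42 (roughly 72 percent), while the more highly educated segment fell from about 225 to about 168 (roughly 25 percent). the decline was therefore concentrated in the minority interest group, which is the pattern the questions above point to.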
the future of the home book club
given the somewhat disappointing results of the home book club reported thus far, one would expect the program to be recorded in history as a noble, but unsuccessful, attempt at interactive television programming. the books-by-mail program did send out some 760 paperback books as a result of the home book club (a 79 percent overall increase), and twenty-six new library cards (not a prerequisite) were issued to home book club viewers. but the fact remains that a for-profit company such as warner amex most definitely cannot substantiate the continuation of a program that has audience ratings as low as the home book club. ... or can it? not only has the home book club been continued (it's now in its twentieth month), but a morning edition of the home book club premiered in june 1980. what explanations can account for this somewhat bewildering corporate behavior? on a very idealistic level, warner amex could be fulfilling its obligation to serve all facets of the columbus community. the home book club certainly offers a viewing alternative to an often neglected segment of the viewing population. oclc, inc., and public libraries throughout the united states applaud this kind of responsible programming. on a more practical level, there may be other strategies behind the renewal of the home book club contract. a 1978 study completed by the argus research corporation concluded that "no profits are expected from qube until the system is successfully replicated in cities other than columbus, and at considerably lower costs."5 to replicate the qube system, warner amex must expand its cable territory into new communities throughout the united states. this can at times be a very difficult task. the right for a company such as warner amex to wire a local municipality to its qube system is determined by local government.
normally, a city council reviews and contrasts alternative cable systems in terms of the services each system proposes in return for franchising rights. the final decision usually is based on costs, the programming made available, and, most importantly, the kind of community service the cable system proposes to extend to its viewers. one definition of extended community service might be a televised book discussion program that involves the local public libraries. the alluring notion of an interactive book discussion may even be more appealing to community-minded city council members. in fact, qube is currently using an edited composite tape of home book club highlights in their franchising efforts. the success of such efforts remains to be seen. whether warner amex's motives are community- or commercial-minded, the fact remains that other communities may have the opportunity to develop a program of this kind. since local governments can legally specify what services the cable company must provide, the inclusion of a televised book discussion program could become part of a contract fulfillment. advice for those interested in developing alternative television programs for special-interest groups: don't be caught napping when your national cable representatives come knocking on your city council door. as for the home book club, qube and the public library of columbus and franklin county are working at reestablishing a solid baseline audience. as is the case for any television program, promotion is a key ingredient for success. when viewers were asked where they first found out about the home book club, more than half indicated they obtained program information through the free qube program guide. approximately 15 percent heard from a friend and 12 percent found information at the public library. a coordinated promotional effort is highly recommended for a public-service program of this nature.
the future of interactive television
qube must be thought of as more than just a two-way television system. in fact, it is more than interactive television. qube is actually a computer hooked to a cable communication system. that cable communication system is a network providing a pathway for a wide variety of services from central facility to home subscribers. in the future, not only will systems such as qube provide "local loop" communications for these services, but undoubtedly will be interconnected by a satellite with other similar systems throughout the country and indeed the world. the five buttons on the existing qube consoles are just the first evidence of the future possibilities of interactive broadband communications systems currently delivering television. because the early applications of cable were to provide entertainment television, and more often than not were provided by people in the television business, cable television is naturally oriented toward the entertainment business. but the future of these broadband communications systems is in interactive retrieval of information as much as it is in entertainment. this goes far beyond the simple polled system so frequently used in a two-way mechanism: the talk show host asks how many people have read a particular book, the audience responds, and the net result has no effect on the program itself. it is also a lot more than interactive television: the host asks what you want to discuss, the audience says the plot of the book, and the answer has an effect on the outcome of the show. in fact, these broadband communications systems have the potential for placing at the fingertips of americans a vast storehouse of information services about, for example, the best auto routes to your favorite spots, baby care, banking, buying a house, dressmaking, good buys, hobbies, jobs, legal facts, properties for sale or rent, sports scores, technology, and wine.
as qube expands into its qube iii system with more than a hundred channels of services, it will be technically positioned to support all aspects of this burgeoning information age.6 besides simple information retrieval, a qube subscriber will be able to conduct banking and shopping transactions, to provide information such as who is on what side of community issues, and also (incidentally) to watch television. if all of that does not seem like enough, remember that cable is really a very large pipe through which any variety of electronic information can be pushed. passive home security, fire alarm, and energy management are also services either in existence or contemplated by a number of cable operators. for that matter there is no reason to believe the computer processing services can't be made available to individual subscribers. a subscriber could call up the program to balance his checkbook, to perform his small-business payroll calculations, or to complete a statistical analysis of data for a school project.

most people thought (as we initially did) that interactive cable (qube) means interactive television. but oclc's research has shown that interactive television programs:

1. serve as an initial introduction to naive audiences of what a truly interactive system is all about;
2. are difficult to implement;
3. really aren't democratic;
4. are basically polling devices.

it has been said that the reason that railroads went out of business was because they insisted that they were in the railroad business and wouldn't admit that they were in the transportation business. if cable operators insist that they are in the television business, they may well miss the opportunities that are possible in the communications business or, in fact, in the information business.
by the same token, if libraries miss the significance of what cable television is bringing to their business, their role in the community will be diminished and libraries may go the way of railroads. modern communications and computers offer an opportunity for libraries to become the information choice in their community. in the near future, applications such as the home book club may well be a way to provide increased accessibility of library services to library patrons, and to "condition" those patrons to the coming electronic nature of libraries. over the long term, libraries, if they have the courage and the foresight, can be the focus of the coming information and telecommunications revolution. the message is quite clear: opportunities abound.

references

1. john wicklein, "wired city, u.s.a.: the charms and dangers of two-way tv," atlantic monthly 243:35-42 (feb. 1979).
2. warner amex represents a newly formed corporation resulting from the merger of warner communications and american express.
3. jonathan black, "brave new world of television," new times 11:41 (24 july 1978).
4. ibid., p.49.
5. "warner cable's qube: exploring the outer reaches of two-way tv," broadcasting 95:28 (31 july 1978).
6. "two-way converters hot ticket at ncta exhibits," broadcasting 97:72 (26 may 1980).

an informal survey of the cti computer backup system

joseph covino and sheila intner: great neck library, great neck, new york.

in order to help decide whether or not to purchase computer backup systems from computer translation, inc. (cti),* for use when the clsi libs 100 automated circulation system is not operating, great neck library conducted an informal survey of libraries using both systems. eleven institutions, including both public and academic libraries, responded to a brief questionnaire.
they were asked what size cti system they had purchased and why, how easily it was installed, how well it performed, how it was maintained, and if clsi acknowledged that the addition of the backup did not affect their libs 100 maintenance agreements. before summarizing the responses, the structure of the two systems and how they interact should be outlined.

clsi libs 100

the clsi automated circulation system consists of a stand-alone minicomputer console with local and/or remote terminals connected to it through individual ports by means of electrical and/or dedicated telephone line hookups. when it operates, the terminals are online and interactive with the database, which is stored on one or more multiplatter disc packs.

cti backup

the cti backup system is based on an apple ii microcomputer with two minidisc drives, which take 5 1/4-inch floppy discs, a tv monitor, and a switching system that can be connected to the libs 100 console or its terminals. the cti system can also be used alone. when the libs 100 is down (inoperative), the cti system is connected to a terminal, and data is recorded on its discs for later dumping (data entry) into the database via a port connection. it

*cti is a profit-making company wholly owned by brigham young university. the cti backup system was originally developed to support the clsi installation at byu.

public libraries leading the way

virtual production at cloud901 in the memphis central library

david mason and alan ji

information technology and libraries | march 2023
https://doi.org/10.6017/ital.v42i1.16315

alan ji (geewhyalan@outlook.com) is youth contributor. david mason (davidpattenmason@gmail.com) is program specialist, memphis public library. © 2023.

background

cloud901

cloud901 is a teen learning center in the memphis public library (https://www.memphislibrary.org/cloud901/).
in our space we give youth between the ages of 13–18 exclusive access to all the resources needed to produce anything from short films to visual art to 100+ pound robots. this includes specialty areas designated for subjects such as art, video production, and engineering, each staffed with expert facilitators. as an organization we aim to provide youth with the opportunity to discover many areas of study at only the cost of their time and attention. in addition to providing individual support, cloud901 (dubbed “the cloud”) runs its own instructional programs, acts as a venue for outside organizations to host their programs, and generally serves as a place where high schoolers can hang out after school. the nature of what we do places us at the intersection of many different fields and brings a wide variety of youth together. despite this, the cloud faces a challenge i see in many other institutions: computer programming is taught in isolation from other fields such as film, so it is seen as an uncreative pursuit. filmmaking and programming could each benefit from one another: film, by incorporating code to streamline production; and programming, by gaining a way to creatively collaborate with those seen traditionally as “artists.” in order to explore this connection at cloud901, i am using my experience in film and programming to develop a “virtual production” initiative in our space. this project serves the purpose of teaching youth how to write programs within unreal engine while creating a platform where those interested in the film, programming, music, and visual art aspects of our space can collaborate. virtual production virtual production is a new method of producing special effects for movies and tv that turns the traditional visual effects production pipeline on its head. until now, the process of modern film making has required all digital compositing and special effects to take place after filming. 
this system relies on actors, directors, and the pre-production team to imagine what environments might look like and to make decisions about lighting and composition with little input from the visual effects team. if they make misguided decisions, the post-production process becomes significantly more complicated and the result may not measure up to expectations. to minimize this problem, industrial light & magic’s production team for the mandalorian used a circular room lined with light panels controlled by a real-time 3d rendering engine to light the physical set. not only does this process allow visual effects artists to make immediate adjustments to the virtual surroundings in response to the director’s feedback during the shoot, but it also produces footage that requires a comparatively small amount of post-production work.

how can we do this?

the cloud faces many more constraints than industrial light & magic (ilm). nevertheless, we found a way to achieve a form of virtual production in our space now, as the tools have become quite accessible compared to even a few years ago.

base station.
nikon d5600, camera that captures our actor.
vive tracker, used in conjunction with the base station to sync the location and rotation of our physical camera with our virtual camera.

increasingly accessible tools

on the software side we rely on blender for creating virtual environments and we use unreal engine 5 for the virtual production logic and rendering. both software applications are used in professional film production and both are completely free. in fact, the virtual production stage used in the mandalorian also relies on unreal engine. thanks to the gaming and live streaming industries, the hardware required has become much more affordable.
in our production we acquired the latest graphics and vr tracking equipment for a few thousand dollars. such equipment includes our workstation laptop for running unreal engine and the htc vive 3.0 tracking system for synchronizing the position and orientation of our physical camera with our virtual camera. for a full breakdown, refer to the documentation on our project’s github page (https://github.com/dp-mason/budget-virtual-production).

equipment links:
base station: https://www.vive.com/us/accessory/base-station/
nikon d5600: https://www.nikonusa.com/en/nikon-products/product/dslr-cameras/d5600.html
vive tracker: https://www.vive.com/us/accessory/tracker3/

our approach

the process used by ilm begins with the render of the environment in unreal engine, which is then sent to the led panels surrounding the stage, which light the stage and act as the backdrop that the camera captures. the actors on the stage are working with the same visual backgrounds and lighting that will ultimately be seen by the viewer at home. (other effects are added in post-production.) this means that the real actors are illuminated by the virtual background via the led displays, making the addition of purely virtual effects easier and more seamless.

example of the cylindrical computer-generated background with actors in the center, lit with the same illumination as is generated by the virtual set. image licensed under the creative commons attribution-share alike 4.0 international license. source: https://commons.wikimedia.org/wiki/file:stagecraft_the_mandalorian.jpg.

in contrast, our effects process begins at our physical camera by sending the live video of our actor against a green screen to unreal engine, where the video is processed to remove the background. then, with the help of a virtual object matching the dimensions of our physical green screen, we apply the actor’s image on top of the virtual environment.
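the keying and compositing steps described above can be illustrated with a short python sketch. this is only an assumption-laden illustration of the general green-screen technique, not the unreal engine setup used at cloud901: the names `is_key_color` and `composite`, the `dominance` threshold, and the representation of a frame as nested lists of rgb tuples are all hypothetical choices made for the example.

```python
# hypothetical sketch of green-screen keying and compositing.
# this is NOT the project's unreal engine implementation; every name and
# threshold here is an assumption made for illustration.
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (red, green, blue), each 0-255
Image = List[List[Pixel]]      # rows of pixels


def is_key_color(pixel: Pixel, dominance: int = 60) -> bool:
    """treat a pixel as green screen when green clearly exceeds red and blue."""
    r, g, b = pixel
    return g - max(r, b) >= dominance


def composite(actor_frame: Image, environment: Image) -> Image:
    """keep the actor's pixels; show the environment where the screen was."""
    same_shape = len(actor_frame) == len(environment) and all(
        len(a) == len(e) for a, e in zip(actor_frame, environment)
    )
    if not same_shape:
        raise ValueError("frames must have the same dimensions")
    return [
        [env_px if is_key_color(act_px) else act_px
         for act_px, env_px in zip(act_row, env_row)]
        for act_row, env_row in zip(actor_frame, environment)
    ]


if __name__ == "__main__":
    green = (20, 230, 30)     # green-screen pixel
    skin = (200, 160, 140)    # pixel belonging to the actor
    sky = (90, 140, 255)      # pixel from the rendered environment
    frame = [[green, skin, green]]
    backdrop = [[sky, sky, sky]]
    print(composite(frame, backdrop))
```

a real-time engine does this work on the gpu for every frame and typically softens the edge between actor and background; the sketch only shows the core decision made for each pixel.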
instead of projecting our virtual environment to real space through led panels, we do the inverse: we project our actor into the virtual environment, achieving a similar effect at a fraction of the cost.

render of our virtual production set. this is what our scene looks like in unreal engine from a view outside the camera. the image of our actor appears distorted from this angle, because it is projected from the camera to a mesh matching our green screen’s dimensions.

this is the view of our virtual scene from the camera. source: https://www.blendswap.com/blend/30009

why does this project matter?

the philosophy

i had been pursuing photography, music, and 3d graphics since long before i joined cloud901 professionally. this interest actually began in my own teenage years while i was a volunteer at cloud901. at the time i was producing music, learning more about cameras through event photography, and just beginning to create 3d animations in blender. when i left for college to study computer science, i still had these interests, but they sometimes took a back seat to my studies. as time went on, though, i began to see how these interests and art in general are not incompatible with computer science. there are many ways to exemplify this compatibility, but there is something special about virtual production in particular. many students new to computer science get the impression that the field is all about memorizing rigid rules so that you can arrive at a “correct” solution. it is natural to feel that way even after months of classes.
while it is true that creating in this medium can feel restricting, it offers many opportunities to make creative contributions. computer programming, especially in the context of virtual production, is like lute-making. a luthier’s first lutes are likely very ordinary, much of their effort spent attaining the “correct” sound in their instruments. yet, if they continue to develop their technical ability, to learn the mechanics of producing sound, and to become more comfortable with experimentation, they can influence the future of music in an aesthetic way without necessarily conforming to rigid notions of “correctness.” such an opportunity presented itself during the rise of opera. luthiers of the late sixteenth century used their technical knowledge to create the theorbo and other adaptations of the lute that would not be drowned out by an opera singer. there is nothing “correct” or “incorrect” about the theorbo or about virtual production methods, but rather they are subjective visions of how film production and music can adapt to create new experiences.

virtual production is a fascinating use of current technology; it leverages current computing power to innovate in a field not typically associated with computer science. it makes us wonder what other industries could benefit from the application of programming. using lessons i taught students about unreal engine, i encouraged them to embrace topics such as programming, linear algebra, and photography as knowledge essential to crafting their own “instruments.” this began with smaller projects such as creating a cube that moves up and down according to a sine wave and progressed to building extensions for the system we use to calibrate our camera tracking.

co-author alan ji first got into 3d graphics through a programming camp at the memphis public library over the summer. it was there that he met david, and there that he was introduced to unreal engine.
it’s crazy to think that i went from not knowing anything about unreal engine to helping create a virtual production set in a few months.

connecting to the outside

the novelty of this project also forged connections with people outside of the library. spencer burnham, a project manager critical to some of the biggest xr experiences to date, such as the bafta award-winning app “wonderscope,” visited cloud901 to discuss his career, and the potential he sees in learning virtual production, with the cohort of youth interested in contributing to our project. while demoing the project at the “cs for all” conference last fall, i spoke to a computer science teacher at a local high school and later spoke to the students in his after-school program. the debut of cloud901’s virtual production studio features a performance from alfred banks, a regularly touring vocal performer from new orleans.

next steps

in the short term, we would like to make our system easier to use and share it via an open source license. the current version of our system is hosted and documented on github. in the long term, we plan to take a more formal approach to onboarding youth contributors during the summer and record regular performances. so far we have onboarded people to the project on an ad hoc basis, but we plan to organize regular summer programming teaching students interested in film and programming how to use and extend this program. we plan to use this set to host a series of performances featuring established musicians from the area. this will provide our youth contributors with practical experience working on an innovative form of production and opportunities to network with established creatives.
external links

• github repository for our project: https://github.com/dp-mason/budget-virtual-production
• our proof of concept video: https://youtu.be/evbbt_uzhde
• cloud901 home page: https://www.memphislibrary.org/cloud901/

lita president’s message

joining together

emily morton-owens

information technology and libraries | december 2019

emily morton-owens (egmowens.lita@gmail.com) is lita president 2019-20 and the assistant university librarian for digital library development & systems at the university of pennsylvania libraries.

in writing this column i am looking ahead, as i have been throughout my term as vice-president and president of lita, to the possibility of our merger with alcts and llama. recently our discussions have included an exploration on all sides of how a division can support members through their career. this has inspired me to reflect on how lita has always taken a broad and inclusive view of what library technology work is and can be in the future. i believe the proposed core division can support and extend that tradition.

one question that i’ve heard posed from time to time is “am i technical enough for lita?” longtime lita members like to answer that with a full-throated “yes!” if you’re interested enough to ask the question, we want you to join us in using technology as a part of your work. we want you to be supported in doing so at your current skill level, whether or not you want to make technology more a part of your work than it is today. if you want to go deeper into technology, we’ll be there with you. while the culture of the for-profit technology industry can promote imposter syndrome, we want lita to be a haven.
in lita’s events and meetings, we consistently see different facets of library technology work reflected. some of us are training users in new technologies or creating programs that get young people excited about coding. others are working to make online resources accessible and easy for our users to benefit from. we have members who are manipulating metadata, creating services to help researchers comply with data management requirements, creating websites that guide users to the information they need, and preserving cultural heritage in digital forms. some of us manage tech projects or workers. some of our members work on large tech teams with generous resources and others are spinning magic just from their own skills. when i started working in libraries, my bosses and mentors were often librarians who had started in technical services or other roles, before “automation.” eager to improve their own workflows, and getting pulled into ils migrations and catalog development, they had become the technology experts. these accidental systems librarians have always been some of my favorite colleagues because of their sure-footed approach to our data. recently i’ve come to work with colleagues who are accidental systems librarians in the opposite sense; tech workers who took jobs in libraries and embraced what we do. one developer on my team, who had no previous library experience, took to our projects and ethical stance like a duck to water. he told me that he now goes to parties and tells people about how librarians are defenders of privacy and protectors of information. lita embraces growth in any direction because we want to support learning and problem-solving with a foundation of shared principles and resources. i don’t see these developments as time-based or inevitable in any given person’s career. 
there are plenty of library tech workers who prefer being an individual contributor and think they have their biggest impact doing direct work on applications. and many of my technical services colleagues prefer to define their work goals in those terms, no matter how adept they become with tech tools. whether or not they seek out a management position, our members will probably find themselves exhibiting leadership in some context, like developing standards or advocating for standards.

https://doi.org/10.6017/ital.v38i4.11905

instead of a rigid path of career development, many librarians today have fluid and multi-faceted careers. for myself, i have held similar positions at quite different types of libraries: public, medical, academic. lita has always been a part of my experience, though, providing a sort of collegial bedrock through a lot of change. the people are what make lita, lita: friendly, principled, and quirky. lita members are the kind of people who will learn all they can about a technology like the amazon alexa, and then unplug the one on the exhibit floor at annual.

both as i was thinking about all this, and in this resulting column, leadership, collections, and technical services kept coming up. there is such strong and fruitful cross-pollination among these specialties, and i see that as something that would enhance the member experience, both for current lita members who want more contact with expert colleagues and for current llama and alcts members who want learning opportunities and support for their work with technology. lita members love to share their knowledge and hash through challenges together. sometimes i wish more ala members would feel comfortable giving us a try, and perhaps core will be a new, friendly face for that ongoing outreach.
if, in the future, someone asks the new question “am i technical enough for core?” i’m sure the answer will be the same: “yes, please join us!”

leadership and infrastructure and futures…oh my!

letter from the core president

leadership, infrastructure, futures

christopher cronin

information technology and libraries | december 2020
https://doi.org/10.6017/ital.v39i4.13027

christopher cronin (cjc2260@columbia.edu) is core president and associate university librarian for collections, columbia university. © 2020.

i am so pleased to be able to welcome all ital subscribers to core: leadership, infrastructure, futures! this issue marks the first of ital since the election of core’s inaugural leadership. a merger of what was formerly three separate ala divisions, namely the association for library collections & technical services (alcts), the library & information technology association (lita), and the library leadership & management association (llama), core is an experiment of sorts. it is, in fact, multiple experiments in unification, in collaboration, in compromise, in survival. while initially born out of a sheer fight or flight response to financial imperatives and the need for organizational effectiveness, developing core as a concept and as a model for an enduring professional association very quickly became the real motivation for those of us deeply embedded in its planning.

core is very deliberately not an all-caps acronym representing a single subset of practitioners within the library profession. it is instead an assertion of our collective position at the center of our profession. it is a place where all those working in libraries, archives, museums, historical societies (information and cultural heritage broadly) will find reward and value in membership and a professional home. all organizations need effective leaders, strong infrastructure, and a vision for the future. and that is what core strives to build with and for its members.
while i welcome ital’s readers into core, i also welcome core’s membership into ital. no longer publications of their former divisions, all three journals within core have an opportunity to reconsider their mandates. as with all things, audience matters. ital’s readership has now expanded dramatically, and those new readers must be invited into ital’s world just as much as ital has been invited into theirs. as we embark on this first year of the new division, we do so with a sense of altogether newness more than of a mere refresh, and a sense of still becoming more than a sense of having always been. and who doesn’t want to reinvent themselves every once in a while? start over. move away from the bits that aren’t working so well, prop up those other bits that we know deserve more, and venture into some previously uncharted territory. how will being part of this effort, and of an expanded division, reframe ital’s mandate? the importance of information technology has never been more apparent. it is not lost on me that we do this work in core during a year of unprecedented tumult. in 2020, a murderous global pandemic was met with unrelenting political strife, pervasive distribution of misinformation and untruths, devastating weather disasters, record-setting unemployment, heightened attention on an array of omnipresent social justice issues, and a racial reckoning that demands we look both inward and outward for real change. individually and collectively, we grieve so many losses —loss of life, loss of income, loss of savings, loss of homes, loss of dignity, loss of certainty, loss of control, loss of physical contact. and throughout all of these challenges, what have we relied on more this year than technology? technology kept us productive and engaged. it provided a focal point for communication and connection. 
it provided venues for advocacy, expression, inspiration, and, as a counterpoint to that pervasive distribution of misinformation, it provided mechanisms to amplify the voices of the oppressed and marginalized. for some, but unfortunately not all, technology also kept us employed. and as the physical doors of our organizations closed, technology provided us with new ways to invite our users in, to continue to meet their information needs, and to exceed all of our expectations for what was possible even with closed physical doors. and yet our reliance on and celebration of technology in this moment has also placed another critical spotlight on the devastating impact of digital poverty on those who continue to lack access, and by extension also a spotlight on our privilege. in her parting words to you in the final issue of ital as a lita journal, evviva weinraub lajoie, the last president of lita, wrote:

we may have always known that inequities existed, that the system was structured to make sure that some folks were never able to get access to the better goods and services, but for many, this pandemic is the first time we have had those systemic inequities held up to our noses and been asked, “what are you going to do to change this?” balancing those priorities will require us to lean on our professional networks and organizations to be more and to do more. i believe that together, we can make core stand up to that challenge.

i believe we will do this, too, and with a spirit of reinvention that is guided by principles and values that don’t just inspire membership but also improve our professional lives and experience in tangible ways. it was a privilege to have served as the final president of alcts and such a humbling and daunting responsibility to now transition into serving as core’s first.
it is a responsibility i do not take lightly, particularly in this moment when so much is demanded of us. as we strive for equity and inclusion, we do so knowing that we are only as strong as every member’s ability to bring their whole selves to this work. we must work together to make our professional home everything we need it to be and to help those who need us. it is yours, it is theirs, it is ours.

https://doi.org/10.6017/ital.v39i3.12687

editorial board thoughts

who will use this and why? user stories and use cases

kevin m. ford

information technology and libraries | march 2019

kevin m. ford (kefo@loc.gov) is librarian, linked data specialist, library of congress.

perhaps i’m that guy. the one always asking for either a “user story” or a “use case,” and sometimes both. they are tools employed in software or system engineering to capture how, and importantly why, actors (often human users, but not necessarily) interact with a system. both have protagonists, but one is more a creative narrative, the other like a strict, unvarnished retelling. user stories relate what an actor wants to do and why. use cases detail to varying degrees how that actor might go about realizing that desire. the concepts, though distinct, are often confused and conflated. and, because they classify as jargon, the concepts have sometimes been employed outside of technology to capture what an actor needs and the path the actor takes to his or her objective, including any decisions that might be made along the way. all of this effort is undertaken in order to identify the best solution.

by giving the actors a starring role, user stories and use cases ensure focus is on the actors, their inputs, and the expected outcome. they protect against incorporating unnecessary elements, which could clutter and, even worse, weaken the end product, and they create a baseline understanding by which the result can be measured.
and so i find myself frequently asking in meetings, and mumbling in hallways: “what’s the use case for that?” or “is there a user story? if not, then why are we doing it?” you get the idea.

it’s a little ironic that i would become this person. not because i didn’t believe in user stories and use cases (quite the contrary, i’ve always believed in the importance and utility of them) but because of a book i was assigned during graduate coursework for my lis degree and my initial reaction. it’s not just an unassuming book, it has a downright boring appearance, as one might expect of a book entitled “use case modeling.”1 it’s a shocking 347 pages. it was a joint endeavor by two authors: kurt bittner and ian spence. i think i read it, but i can’t honestly recall. i assume i did because i was that type of student and i had a long chicago el commute at the time. in any case, i know beyond doubt that i was assigned this book, dutifully obtained it, and then picked it up, thumbed through it, rolled my eyes, and probably said, “ugh, really?”

and that’s just it. the joke’s on me. the concepts, and as such the book, which i’ve moved across the country a couple of times, remain near-daily constants in my life. as a developer, i basically don’t do anything without a user story and a use case, especially one whose steps (including preconditions, alternatives, variables, triggers, and final outcome) haven’t been reasonably sketched out.

“sketched out” is an interesting phrase because one would think that if entire books were being authored on the topic of use cases, for example, then use cases would be complicated and involved affairs. they can be, but they need not be. the same holds for user stories. imagine you were designing a cataloging system. here’s an example of the latter:

as a librarian i want my student catalogers to be guided through selection of vocabulary terms to improve both their accuracy and speed.2
| ford 6 https://doi.org/10.6017/ital.v38i1.10979 that single-sentence user story identifies the actors (student catalogers), what they need (a “guided … selection of vocabulary terms”), and why (“to improve their accuracy and speed”). the use case would explore how the student catalogers (the actors) would interact with the system to realize that user story. the use case might be narrowly defined (“adding controlled terms to records”) or might be part of a broader use case (“cataloging records”), but in either instance the use case might go to some length to describe the interaction between the student catalogers and the system in order to generate a clear understanding of the various interactions. by doing this, the use case helps to identify functional requirements and it clearly articulates user/system expectations, which can be reviewed by stakeholders before work begins and used to verify delivery of the final product. as i have presented this, using these tools might strike you as overly formal and time-consuming. in many circumstances they might be, if the developer has sufficient user and domain knowledge (rare, very, very rare) and especially if the “solution” is not an entirely new system but just an enhancement or augmentation to an existing system. yet, whether it is a completely new system being developed by someone who has long and profound experience with the domain or a simple enhancement, it may be worth entertaining the questions/process if even informally. i find it is often sufficient to ask “who will use this and why?” essentially i’m asking for the “user story” but dispensing with the jargon. doing so may lead to additional questions, the answers to which would likely check the boxes of a “use case” even if the effort is not identified as such, and it certainly ensures the user-driven nature and need of the request. this might all sound obvious, but i like to think of it as defensive programming, which is like defensive driving. 
yes, the driver coming up to the stop sign on my right is going to stop, but i take my foot off the gas and position it over the brake just in case. likewise, i’m confident the functional requirements i’m being handed have been fully considered and address a user need, but i’m going to ask for the user story anyway. i’m also leery of scope creep which, if i were to continue the driving analogy, would be equivalent to driving to one store because you need to, but then also driving to two additional stores for items you think might be good to have but for which you have no present need. it’s time-consuming, you’ve complicated your project, you’ve added expense to your budget, and the extra items might be of little or no use in the end. the number of times i’ve been in meetings in which new, additional features are discussed because the designers think it is a good idea (that is, there has been no actual user request or input sought) is alarmingly high. that’s when i pipe up, “is there a user story? if not, then why are we doing it?” user stories and use cases help focus any development project on those who stand to benefit, i.e. the project’s stakeholders, and can guard simultaneously against insufficient planning and software bloat. and the concepts, though most often thought of with respect to large-scale projects, apply in all circumstances, from the smallest feature request to an existing system to the redesign of a complex system. if you are not in the habit of asking, try it next time: who will use this and why? endnotes 1 kurt bittner and ian spence, use case modeling (boston: addison-wesley, 2003). also useful: alistair cockburn, writing effective use cases (boston: addison-wesley, 2001). information technology and libraries | march 2019 7 2 “use case 3.4: authority tool for more accurate data entry,” linked data for libraries (ld4l), accessed march 1, 2019, https://wiki.duraspace.org/display/ld4l/use+case+3.4%3a+authority+tool+for+more+accur ate+data+entry. 
editorial board thoughts: library analytics and patron privacy ken varnum information technology and libraries | december 2015 2 two significant trends are sweeping across the library landscape: assessment (and the corresponding collection and analysis of data) and privacy (of records of user interactions with our services). libraries, perhaps more than any other public service organization, are strongly motivated to assess their offerings with dual aims. the first might be thought of as an altruistic goal: understanding the needs of their particular clientele and improving library services to meet those needs. the second is perhaps more existential: helping justify the value libraries create to whatever sources of funding are necessary to impress. both are valid and important. it is hard to argue that improving services, focusing on actual needs, and maintaining funding are in any way improper goals. however, this desire is often seen as being in conflict with exploring too deeply the actions or needs of individual constituents, despite librarians’ historical and deeply-held belief that each constituent’s precise information needs should be explored and provided for through personalized, tailored services. solid assessment cannot happen without solid data. libraries have historically relied on qualitative surveys of their users, asking users to evaluate the quality of the services they receive. being able to know more details and ask directed questions of individuals who used services is possible in the traditional library setting through invitations to complete surveys after individual interactions such as a reference or circulation desk interaction, library program, visit to a physical location, or even a community-wide survey invitation. focus groups can be assembled as well, of course, once a library has identified a real-world group to study. 
however, those samples are more often convenience samples or, unless a library is able to successfully contact and receive responses from across the entire community, somewhat self-selected. assessment that leads to new or improved services relies much more heavily on broad-based understanding of the users of a system. libraries have been able to do limited quantitative studies of library usage: at its simplest, counting how many of this were checked out, how many of that were accessed, and how many users were involved. these metrics are useful, but also limited, particularly at the scale of a single library. knowing that a pool of resources is heavily used is helpful; even knowing that a suite of resources is frequently used collectively is beneficial. however, tying use of resources to specific information needs or information seekers, whether these are defined as individuals or as ad hoc collections of users based on situational factors such as academic level or course enrollments, requires more specific groupings. these groupings rely on granular data that for many libraries, especially academic ones, are increasingly electronic. we are at a point in time when we have the potential to leverage wide swathes of user data.
ken varnum (varnum@umich.edu), a member of the ital editorial board, is senior program manager for discovery, delivery, and learning analytics at the university of michigan library, ann arbor, michigan. doi: 10.6017/ital.v34i4.9151
and this is where the second trend, privacy, comes to bear. protecting user privacy has been a guiding principle of librarianship in the united states (in particular) since the 1970s, as a strong reaction to u.s. government (through the fbi) requests to provide access to circulation logs for individuals under suspicion of espionage.
this was in the early days of library automation, when large libraries with automated ils systems could prevent future disclosure through the straightforward strategy of purging transaction records as soon as the item was returned. this practice became standard operating procedure in libraries, and expanded into new information service domains as they evolved over the following forty years. with good intentions, libraries have ensured that they maintain no long-term history for most online services. as a profession, we have begun to realize that the straightforward (and arguably simplistic) approaches we have relied on for so long may no longer be appropriate or helpful. over the past year, these conversations found focus through a project coordinated by the national information standards organization thanks to a grant from the andrew w. mellon foundation.1 the range of issues discussed here was far-reaching and touched on virtually every aspect of privacy and assessment imaginable. the resulting draft document, consensus framework to support patron privacy in digital library and information systems,2 outlines 12 principles that libraries (and the information service vendors they partner with) should follow as they establish “practices and procedures to protect the digital privacy of the library user.” this new consensus framework sets a series of guidelines for us to consider as we begin to move into this uncharted (for libraries) territory. if we are to record and make use of our users’ online (and offline, for that matter) footprints to improve services, improve the user experience, and justify our value, this document gives us an outline of the issues to consider. it is time (and probably long past time) that we make conscious decisions about how we assess our online resources, in particular, and do so with a deeper knowledge of both the resources used and the people using them. 
at the exact moment in our technological history when we find ourselves able to provide automated services at scale to our users through the internet and simultaneously record and analyze the intricate details of those transactions, we need to think clearly about what questions we have and what data we need to answer them, and to be explicit about how those data points are treated. it is important that we start this process now and change our blunt practices into more strategic data collection and analysis. where 40 years ago we opted to bluntly enforce user privacy by deleting the data, we should now take a more nuanced approach and store and analyze data in the service of improved services and tools for our user communities. we have the opportunity, through technology and a more nuanced understanding of privacy, to conduct a protracted reference interview with our virtual users over multiple interactions… and thereby improve our services.
references
1. http://www.niso.org/topics/tl/patron_privacy/
2. http://www.niso.org/apps/group_public/download.php/15863/niso%20consensus%20principles%20users%20digital%20privacy.pdf
the shared cataloging system of the ohio college library center
frederick g. kilgour, philip l. long, alan l. landgraf, and john a. wyckoff: ohio college library center, columbus, ohio
development and implementation of an off-line catalog card production system and an on-line shared cataloging system are described. in off-line production, average cost per card for 529,893 catalog cards in finished form and alphabetized for filing was 6.57 cents.
an account is given of system design and equipment selection for the on-line system. file organization and programs are described, and the on-line cataloging system is discussed. the system is easy to use, efficient, reliable, and cost beneficial.
journal of library automation vol. 5/3 september, 1972
the ohio college library center (oclc) is a not-for-profit corporation chartered by the state of ohio on 6 july 1967. ohio colleges and universities may become members of the center; forty-nine institutions are participating in 1971/72. the center may also work with other regional centers that may "become a part of any national electronic network for bibliographic communication." the objectives of oclc are to increase the availability to individual students and faculty of resources in ohio's academic libraries, and at the same time to decrease the rate of rise of library costs per student. the oclc system complies with national and international standards and has been designed to operate as a node in a future national network as well as to attain the more immediate target of providing computer support to ohio academic libraries. the system is based on a central computer with a large, random access, secondary memory, and cathode ray tube terminals which are connected to the central computer by a network of telephone circuits. the large secondary memory contains a file of bibliographic records and indexes to the bibliographic record file. access to this central file from the remote terminals located in member libraries requires fewer than five seconds. oclc will eventually have five on-line subsystems: 1) shared cataloging; 2) serials control; 3) technical processing; 4) remote catalog access and circulation control; and 5) access by subject and title. this paper concentrates on cataloging; the other subsystems are not operational at the present time. figure 1 presents the general file design of the system.
the shared cataloging system has been the first on-line subsystem to be activated, and the files and indexes it employs are depicted in figure 1 by the heavy black lines and arrows. as can be seen in the figure, much of the system required for shared cataloging is common with the other four subsystems. the three main goals of shared cataloging are: 1) catalog cards printed to meet varying requirements of members; 2) an on-line union catalog; and 3) a communications system for requesting interlibrary loans. in addition, the bibliographic and location information in the system can be used for other purposes such as book selection and purchasing. the only description of an on-line cataloging system that had appeared in the literature during the development of the oclc system is that of the shawnee mission (kansas) public schools (1). the shawnee mission cataloging system produces uniform cards from a fixed-length, non-marc record. the oclc system uses a variable-length marc record and has great flexibility for production of cards in various formats. there are a number of reports describing off-line catalog card production systems, including systems at the georgia institute of technology (2), the new england library information network (nelinet) (3), and the university of chicago (4). the flexibility of the oclc system distinguishes it from these three systems as well.
[fig. 1. general file design; shared cataloging subsystem in heavy lines.]
catalog card production-off-line an off-line catalog card production system based on a file of marc ii records was activated a year before the on-line system ( 5) . oclc supplied member libraries with request cards (punch cards prepunched with symbols for each holding library within an institution). for each title for which catalog cards were needed, members transcribed library of congress ( lc) card numbers onto a request card. members sent batches of cards to oclc at least once a week. at oclc, the lc card numbers were keypunched into the cards and new requests were combined with unfilled requests to be searched against the marc ii file. by the spring of 1971, over 70 percent of titles requested were found the first time they were searched. the selected marc ii records were then submitted to a formatting program that produced print images on magnetic tape for all cards required by a member library. the number of cards to be printed was determined by the number of tracings on the catalog record and the number of catalogs into which cards were to go including a regional union catalog (the cleveland regional union catalog) and the national union catalog. individual cards were formatted according to options originally selected by the member library. these options included: 1) presence or absence of tracings and holdings information on each of nine different types of cards; 2) three different indentions for added entries and subject headings; 3) a choice of upper-case or upperand lower-case characters for each type of added entry and subject heading; and 4) many formats for call numbers. oclc returned cards to members in finished form, alphabetized within packs for filing in specific local catalogs. the primary objective of off-line operation was the production of catalog cards at a lower cost than manual methods in oclc member libraries. 
early activation of off-line catalog card production did reduce costs and gave some members an opportunity to take advantage of normal staff turnover by not filling vacated positions in anticipation of further savings after activation of the on-line system. other objectives of off-line operation were the automated simulation of on-line activity in member libraries and development and implementation of catalog card production in preparation for card production in an on-line operation. the number of catalog card variations required by members, even after members had reviewed and accepted detailed designs of card products, proved to be higher than anticipated. more than one man-year was expended after activation of the off-line system in further development and implementation to take care of the formats and card dissemination variations requested by specific libraries. the one-year advance start on catalog production made possible by using marc ii records in the off-line mode proved to be a far greater blessing than anticipated, for it would have been literally impossible to have activated on-line operation and catalog card production simultaneously. a major goal of oclc card production is elimination of uniformity required by standardized procedures. the oclc goal is to facilitate cooperative cataloging without imposing on the cooperators. the cost to attain this goal is slight, for although there is a single expense to establish a decision point in a computer program, the cost of selection among three or thirty alternatives during program execution is infinitesimal. design of catalog cards and format options began four months before off-line activities. two general meetings of the oclc membership were held at which card formats were reviewed and agreed upon in a general sense. next, the oclc staff published a description of catalog card production and procedures for participation (6).
this publication was reviewed by the membership and format variations were reported for inclusions in the procedure. members reported few variations at this time, but when implementation for individual members was undertaken, it was necessary to build many additional options into the computer programs. to assist the oclc staff in defining options for off-line catalog products and on-line procedures, an advisory committee on cataloging was established. this committee met several times and provided much needed guidance and counsel. the catalog card format options that members could select were extensive. for example, although the position of the call number was fixed in the upper left-hand corner of the card, there were 24 basic formats for lc call numbers, and libraries using the dewey decimal classification could format their call numbers as they wished. in general, the greatest number of format options are associated with call numbers, probably because there has never been a standard procedure for call number construction. programs because designing, writing, coding, and debugging of catalog card production programs can cost tens of thousands of dollars, oclc sought existing card production programs that could run on computers at ohio state university, which is the generous host of the ohio college library center. only two programs were located that could both produce cards in the manner required by oclc and run on osu computers. card production costs were not available for one of the programs, but because analysis suggested that the design of the program would create very high card costs, this program was not selected. the other program had been written and used at the yale university library, and although the card production costs were high, it was known that changes could be made to increase efficiency. thus, arrangements were made to obtain and run the yale programs at osu. 
members were free to choose a variety of format options and submitted on a catalog profile questionnaire (figure 2) their specifications for each catalog. holdings information and tracings could be printed on any or all of nine types of cards: 1) shelf list; 2) main entry; 3) topical subject; 4) name as subject; 5) geographic subject; 6) personal and corporate added entries; 7) title added entry; 8) author-type series added entry; and 9) title-type series added entry. subject headings and added entries could have top-of-card or bottom-of-card placement and could be printed in all upper-case or in upper- and lower-case characters. any type of subject heading and added entry could begin at the left edge of the card or at the first, second, or third indention. other options are described in the manual for oclc catalog card production (5). the data received on catalog profile questionnaires were transferred to punch cards and a computer program written in snobol iv embedded the information in the form of a pack definition table (pdt) in one of the principal catalog production programs named convert (cnvt). each pdt defined the cards to go into the catalogs of one holding library, a holding library being a collection with its own catalog. the first major program in the processing sequence was prepros, which was written in ibm 360 basic assembler language (bal) and run on an ibm 360/75. prepros converted records from the weekly marc ii tapes to an oclc internal processing format, including conversion of marc ii characters from ascii to ebcdic code. this program also parsed lc call numbers and partially formatted them. it also checked for end-of-field and end-of-record characters and verified the length of record. finally, it wrote the output records in lc card number sequence into huge variable format blocks of 20,644 characters.
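as a rough illustration of the blocking step just described, the following python sketch packs variable-length records, already in lc card number order, into blocks of at most 20,644 characters, so that many records share one physical write. the function name, the 700-character sample records, and the omission of any block or record headers are assumptions made for the example; the source does not describe prepros at this level of detail.

```python
from typing import Iterable, List

BLOCK_CAPACITY = 20_644  # characters per block, the figure given in the text


def pack_into_blocks(records: Iterable[str],
                     capacity: int = BLOCK_CAPACITY) -> List[List[str]]:
    """Greedily group records, in input order, into blocks of <= capacity chars.

    Hypothetical helper: real tape blocks would also carry length headers,
    which this sketch ignores.
    """
    blocks: List[List[str]] = []
    current: List[str] = []
    used = 0
    for record in records:
        if len(record) > capacity:
            raise ValueError("record is longer than one block")
        if used + len(record) > capacity:
            # Current block is full: close it and start a new one.
            blocks.append(current)
            current, used = [], 0
        current.append(record)
        used += len(record)
    if current:
        blocks.append(current)
    return blocks


# Assumed sample data: 100 records of 700 characters each.
records = ["x" * 700 for _ in range(100)]
blocks = pack_into_blocks(records)
print(len(records), "records written in", len(blocks), "blocks")
```

with these assumed record sizes, one block holds 29 records, so each physical write carries 29 records instead of one.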
the large blocks reduced computer costs since the pricing algorithm employed on the ibm 360/75 imposed a charge for each physical read and write operation. the magnetic tape output weekly by prepros was then submitted to cnvt together with the old master file of bibliographic records in lc card number order and a file of request cards that had been sorted in lc card number order. cnvt merged the records on the weekly tape with the master file and then matched the requests by lc card number. when a match was obtained, cnvt deleted some fields from the bibliographic record and formatted the call number according to the specifications of the library that had originated the request. it then wrote the modified record and associated pdt's onto an output tape in external ibm 7094 binary-coded-decimal (bcd) character code with the record format converted to that of the yale bibliographic system. the second principal product of cnvt was the new master tape of bibliographic records that would become the old master for the next week's run. cnvt also punched out a card bearing the lc card number for each request card for which there was a match. these punch cards were used to withdraw cards from the request card file so that they would not be submitted again. cnvt was first run on an ibm 360/50.
[fig. 2. catalog profile questionnaire. the form asks the member (i) to complete a table defining the pack of a receiving catalog, leaving blank the rows for types of entry not to be included in the pack, and (ii) to name the holding library or collection for which the pack contains cards, to name the receiving catalog into which the pack will go, and, if the receiving catalog is not in the holding library or collection, to supply the stamp to appear above the call number.]
the tape file of modified records and pdt's was then submitted to expand, a modified yale program written in mad and run on an ibm 7094. by combining the number of tracings and pdt requirements, expand developed a card image for each catalog card required by the requesting library. it also prepared a sort tag for each image so that the image could be subsequently sorted by library into packs and alphabetized within each pack. expand essentially did the formatting of catalog cards except for the complex lc call number formatting carried out by cnvt. the file of card images was passed to a program named build print tape (bldpt) written in bal and run on the ibm 360/75. bldpt first converted the external ibm 7094 bcd characters to ebcdic. next bldpt sorted the images, and finally, it arranged the images on a single tape to allow printing on continuous, two-up catalog card forms: the first half of the sorted file was printed on the left-hand cards and the second half on the right. the print program was also written in bal but run on an ibm 360/50. it was designed so that either the entire file or a segment as small as four cards could be printed; the latter feature was of greatest use in reprinting cards that for one of several reasons were not satisfactorily printed during the first run. cards were printed six lines to an inch and the print train used was a modified version of the train designed by the university of chicago which in turn was a modified version of the ibm tn train. the printer attached to the ibm 360/50 was an ibm 1403 n1 printer.
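the two-up arrangement performed by bldpt can be pictured with a short python sketch. it is only an illustration under assumptions: the function name and the sample card labels are invented here, and padding an odd-sized file with a blank right-hand card is a guess at how an uneven split might be handled. the apparent point of the layout is that stacking the left-hand column followed by the right-hand column restores the sorted order.

```python
from typing import List, Optional, Sequence, Tuple


def arrange_two_up(images: Sequence[str]) -> List[Tuple[str, Optional[str]]]:
    """Pair sorted card images into printed rows of (left card, right card).

    The first half of the sorted file goes down the left-hand column and
    the second half down the right-hand column.
    """
    half = (len(images) + 1) // 2            # left column takes the extra card if odd
    left = list(images[:half])
    right: List[Optional[str]] = list(images[half:])
    right += [None] * (len(left) - len(right))  # assumed: blank card pads the right
    return list(zip(left, right))


# Hypothetical sorted card images (sort tags reduced to a single word).
sorted_images = ["adams", "baker", "clark", "davis", "evans"]
rows = arrange_two_up(sorted_images)
for left_card, right_card in rows:
    print(left_card, "|", right_card or "(blank)")
```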
this printer appears to be superior to any other high-speed printer currently available, but to obtain a product of high quality, it was necessary to fine-tune the printer, to use a mylar ribbon from which the ink does not flake off, and to experiment with various mechanical settings to determine the best setting for tension on the card forms and for forms thickness. above all, patience in large amounts was required during initial weeks when it seemed as though a messy appearance would never be eliminated. oclc off-line catalog card production programs were written in assembler language and higher level languages. use of higher level languages for character manipulation incurs unnecessarily high costs. therefore, for a large production sys tem like oclc, it is absolutely required that processing programs and subroutines that manipulate all characters, character by character, be written in an assembler language to obtain efficient programs that run at low cost. programs that do not manipulate characters, such as the oclc program for embedding pdt's in cnvt, may well be written in a higher level language. materials and equipment-a summary off-line catalog production was based on availability of marc ii records on magnetic tapes disseminated weekly by the library of congress. without the marc ii tapes, the off-line procedure could not have operated. each week, the new marc ii records were added to the previous cumulated master file also on magnetic tape, and previously unfilled and new requests were run against the updated file. osu computers employed were an ibm 360/75, an ibm 360/50, an ibm 7094, and an ibm 1620. the run procedure was complex and therefore somewhat inefficient, but this inefficiency was traded off against a predictably high expense to write a new card formatting program. members submitted a request for card production on a punch card on which the member had written an lc card number. 
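the weekly run summarized above (add the new marc ii records to the cumulated master file, then run outstanding requests against the updated file) can be sketched in python as follows. this is a minimal sketch, not the cnvt program: the function name, the tuple layouts, and the sample card numbers are hypothetical, and an in-memory dictionary lookup stands in for the sequential tape matching used at the time.

```python
import heapq
from typing import Dict, List, Tuple

Record = Tuple[str, str]    # (lc card number, bibliographic data)
Request = Tuple[str, str]   # (lc card number, requesting library)


def weekly_run(old_master: List[Record],
               weekly: List[Record],
               requests: List[Request]):
    """Merge the weekly records into the master file and match requests.

    Both record files are assumed to be sorted by lc card number already.
    """
    # Sequential merge of two sorted files into the new master file.
    new_master = list(heapq.merge(old_master, weekly, key=lambda rec: rec[0]))
    by_number: Dict[str, str] = dict(new_master)
    # Requests with a matching record are filled; the rest wait for a later run.
    filled = [(num, lib, by_number[num]) for num, lib in requests if num in by_number]
    unfilled = [(num, lib) for num, lib in requests if num not in by_number]
    return new_master, filled, unfilled


# Hypothetical sample data.
old_master = [("70-100001", "record a"), ("70-100003", "record c")]
weekly = [("70-100002", "record b")]
requests = [("70-100002", "library 1"), ("70-100009", "library 2")]

new_master, filled, unfilled = weekly_run(old_master, weekly, requests)
print("filled:", filled)
print("carried forward:", unfilled)
```

the new master returned by one run becomes the old master for the next, and the unfilled list corresponds to the requests that are resubmitted in later weeks.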
members could specify a recycling period of from one to thirty-six weeks for running their request cards against the marc ii file before unfulfilled requests would be returned. in general, request cards bore lc card numbers for that section of the marc ii file that was complete; at first, the file was inclusive for only "7" series numbers, but in early 1971 the recon file for "69" numbers was added. request cards often numbered several thousand a week. catalog card forms are the now-familiar two-up, continuous forms with tractor holes along each side for mechanical driving. the card stock is permalife, one of the longest-lived paper stocks available. a thin slit of about one thirty-second of an inch in height converts each three-inch vertical section of card stock to 75 mm. the lowest price paid in a lot of a half million cards has been $8.065 per thousand. after having been printed, the card forms are trimmed on a modified uarco forms trimmer, model number 1721-1. this trimmer makes four continuous cuts in the forms and produces cards with horizontal dimensions of 125 mm. cards are stacked in their original order as printed and are therefore in filing order. the trimmer operates at quoted speeds of 115 and 170 feet per minute or 920 and 1,360 cards per minute. measurements of speeds of operations confirmed these ratings.
results
the off-line catalog production system produced 529,893 catalog cards from july 1970 through august 1971 at an average cost of 6.57 cents per card. this cost includes over twenty separate cost elements plus a three-quarter cent charge for overhead. the firm of haskins & sells, certified public accountants, reviewed the costing procedures that oclc employs, found that all direct costs were being included, and recommended the three-quarter cent overhead charge.
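the quoted trimmer ratings follow directly from the form dimensions given above (two cards across, each occupying three inches of form length), and the total outlay follows from the card count and average cost. the python lines below simply redo that arithmetic; the function name is invented for the example, and the total cost is a derived figure that the source itself does not state.

```python
def cards_per_minute(feet_per_minute: float,
                     card_height_in: float = 3.0,
                     cards_across: int = 2) -> float:
    """Convert form speed to card throughput for two-up, three-inch cards."""
    rows_per_minute = feet_per_minute * 12 / card_height_in  # 12 inches per foot
    return rows_per_minute * cards_across


slow = cards_per_minute(115)   # 920 cards per minute
fast = cards_per_minute(170)   # 1,360 cards per minute

# Derived, not stated in the source: total cost of the 14-month run in dollars.
total_cost_dollars = 529_893 * 6.57 / 100

print(slow, fast, round(total_cost_dollars, 2))
```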
the number of extension cards varies from library to library depending almost entirely on the types of cards on which libraries have elected to print tracings. however, one university library with a half-dozen department libraries and requiring tracings on only shelf list and main entry cards averages approximately six cards per title. cataloging using the oclc off-line system results in a decrease of staff requirements, and some libraries that used the system during most of the year found that they needed less staff in cataloging. reduction of staff by taking advantage of normal staff turnover facilitated financial preparation for the oclc on-line system in these libraries.
evaluation
despite the obvious inefficiencies generated by running production computer programs on four different computers in two different locations and despite inefficiencies in the programs themselves, computer costs to process marc ii tapes and to format catalog cards, but not to print them, were 2.27 cents per card. as will be shown later, newer and more efficient programs have halved this cost, but even at 2.27 cents per card for formatting and .33 cents per card for printing, the cost of oclc off-line card production is less than half the cost of more traditional card production methods (7). two features originally designed into the system were never implemented, somewhat diminishing the usefulness of the system for some libraries. one of the unimplemented features was a technique for deleting, changing, or adding a field to a marc record (this capability exists in the on-line system). absence of this procedure meant that libraries had to accept lc cataloging without modification except to call numbers. the second missing feature was the ability to print multiple holding locations on cards (this capability also exists in the on-line system) although it was possible to print multiple holdings in one location.
this deficiency limited the usefulness of the system for large libraries processing duplicates into two or more collections. both of these features could have been activated, but shortage of available time prior to activation of the on-line system prevented their implementation. figure 3 shows the high quality of the catalog cards produced. subsequent to attainment of this level of quality, there have been no complaints from members except in cases where a piece of chaff from the card forms went through the printer and caused omission of characters. oclc continues to vary the design of its continuous forms to achieve completely chaff-free stock. the shortest possible time in which cards could be received by the member library after submitting a request card was ten days, but it is doubtful that this response time was often achieved. the minimum average response time for the three-quarters of requests for which a marc record was located on the first run was two weeks. delays at a computer center or incorrect submission of a run could extend this delay to three and four weeks, and unfortunately such delays were cumulative for subsequent requests until the "weekly" runs were made sufficiently more often than weekly to catch up. if another delay occurred during a catch-up period, the response time degraded further. during the fourteen months of operation, there were two serious delays. the amount of normal turnover that occurred in oclc libraries during the fourteen months and that was taken advantage of by not filling positions was too small to reduce the financial burden incurred in starting up the on-line system. a few libraries demonstrated that it was possible to take advantage of such attrition.
however, 20 percent of the libraries did not participate in the on-line system and perhaps half of those who did participate were uncertain as to whether the on-line cataloging system would operate or would operate at a saving. when feasibility of on-line shared cataloging has been substantiated and other centers begin to implement similar systems, it should be possible to activate off-line catalog production sufficiently in advance of on-line implementation to enable participants to take adequate advantage of normal attrition to minimize, or nearly eliminate, additional expenditures. experience such as that of oclc will enable new centers to calculate the number of months of off-line production required to reduce salary expenditures by the amount needed to finance the on-line system.

shared cataloging-on-line

the cataloging objectives of the on-line shared cataloging system are to supply a cataloger with cataloging information when and where the cataloger needs the information and to reduce the per-unit cost of cataloging. catalog products of the system are the same as those of the off-line system: catalog cards in final form alphabetized for filing in specific catalogs; the on-line system is not limited to marc ii records but also allows cataloging input by member libraries. the shared cataloging system, which accommodates all cataloging done in modern european alphabets, builds a union catalog of holdings in oclc member libraries as cataloging is done. one library, wright state university, is converting its entire catalog to machine-readable form in the oclc on-line catalog. the third major goal is a communications system for transacting interlibrary loans.

[fig. 3. computer-produced catalog cards (reduced 25%). the sample cards are not legible in this copy.]

system design and equipment selection

figure 4 depicts the basic design of computer and communication components for the comprehensive system comprising the five subsystems described in the introduction. the machine system for shared cataloging was designed to be a subsystem of the total system so that subsequent modules could be added with minimal disruption. similarly, the logical design of the shared cataloging subsystem was constructed so that the modules of shared cataloging would be common to the remaining file requirements as shown in figure 1. design of the on-line shared cataloging system began with a redefinition of the catalog products of off-line catalog production (5). in this exercise, the advisory committee on cataloging, comprising members from seven libraries, contributed valuable assistance.
the committee was also most helpful in designing the formats of displays to appear on terminal screens. important decisions in the design of the computer, communications, and terminal systems were those involving mass storage devices and terminals. random access storage was the only type feasible for achieving the objective of supplying a user with bibliographic information when and where he needed it. hence, random access memory devices were selected for the comprehensive system and ipso facto for shared cataloging.

[fig. 4. computer and communication system. recoverable labels: data channels, system file, catalog file, memory, drive control; alternate connections are made if cpu #1 or cpu #2 malfunctions.]

the cathode ray tube (crt) type of terminal was selected primarily because of its speed and ease of use by a cataloger. crt terminals are far more flexible in operation than are typewriter terminals from the viewpoint of both the user and machine system designer. for these reasons, crt terminals can enhance the amount of work done by the system as a whole. it was originally planned to select a computer without the assistance of computerized simulation, but in the course of time, it became clear that it was impossible to cope with the interaction among the large number of variable computer characteristics without computerized simulation. therefore, a contract was let to comress, a firm well known for its work in computer simulation. ten computer manufacturers made proposals to oclc for equipment to operate the five subsystems at peak loading (an average of five requests per second over the period of an hour). all ten proposed computer systems failed because simulation revealed inefficiencies in their operating systems for oclc requirements. oclc and comress staff then proposed a modification in operating systems, which the manufacturers accepted.
the next series of trials revealed that more than half of the computers or secondary memory files would have to be utilized over 100 percent of the time to process the projected traffic. as a result of these findings , one computer manufacturer withdrew its proposal, and five others changed proposals by upgrading their systems. on the final simulation runs, the percent of simulated computer utilization ranged from 19.70 percent to 114.31 percent. a subsequent investigation of predictable delays due to queuing in such a system showed that unacceptable delays could arise if computer utilization rose above 30 percent at peak traffic. three manufacturers proposed computer systems that were under 30 percent utilization and, for these, a trade-off study was made that included such characteristics as cost, reliability, time to install the applications system, and simplicity of program design. the findings of the simulation and trade-off studies provided the basis of the decision to select a xerox data systems sigma 5 computer. major components of the oclc sigma 5 are the central processing unit (cpu), three banks of core memory with a total capacity of 48 thousand 32-bit words or 192 thousand 8-bit bytes, a high speed disk secondary memory, 10 disk-pack spindles with total capacity of 250,000,000 bytes plus two spare spindles, two magnetic tape drives, two multiplexor channels, five communications controllers, a card reader, card punch, and printer. the character code is ebcdic. figure 5 illustrates the sigma 5 configuration at oclc. in this configuration, the burden of operating communications does not fall on the cpu so that there is no requirement for "cycle stealing" that slows processing by a cpu. the lease cost to oclc of the equipment represented in figure 5 is $16,317 monthly. the listed monthly lease of the equipment is $21,421 from which an educational discount of 10 percent is deducted. 
(the remaining difference is due to a rebate because the original order included secondary memory units that xds was to obtain from another manufacturer who proved incapable of supplying units that fulfilled specifications. hence, xds was forced to supply other memory units having a higher list price but has done so at a cost per bit of the units originally ordered.) the printer furnished with the sigma 5 does not provide the high-quality printing required for library use. at the present time, oclc prints catalog cards on an osu ibm 1403-n1 printer that without doubt provides the highest quality printing currently available from a line printer. however, oclc is designing an interface between a sigma 5 and an ibm 1403 printer; xds is also developing a new type of printer that will provide high quality output. when the sigma 5 can produce quality printing, it will be fully qualified to be used for nodes in national networks.

[fig. 5. xds sigma 5 configuration. recoverable labels: memory banks no. 1 to 3, data and control paths, sigma 5 cpu, multiplexor, operator's console, card punch, magnetic tape units, card reader, data base disk banks no. 1 and 2, bus-sharing multiplexor.]

as has already been stated, the crt-type terminal was selected because of its ease of use. moreover, the simulation study confirmed that crt terminals would place far less burden on the central computer and therefore, for the oclc system, would make possible selection of a less expensive computer than would be required to drive typewriter terminals. although typewriter terminals cost less, the total cost could be higher for a system employing typewriter terminals than for one using crt's because of greater central computer expense.
library requirements for a crt terminal are: 1) that the terminals have the capability of displaying upper- and lower-case characters and diacritical marks; 2) that the image on the screen be highly legible and visible; 3) that the terminal possess a large repertoire of editing capabilities; and 4) that interaction with the central computer and files be simple and swift. system requirements were: 1) that the terminal accept and generate ascii code; 2) that it make minimal demands for message transmissions from and to the central site; 3) that it have the capability of operating with at least a score of other terminals on the same dedicated line; and 4) that its cost, including service at remote sites, be about $150 per month. data were collected on crt's produced by fifteen manufacturers, and three machines were identified as being prime candidates for selection. oclc carried out a trade-off study in which thirty-three characteristics were assessed for these three machines. one of the thirty-three (reliability) could not be judged for any of the three because none had yet reached the market. for the remaining characteristics, the irascope lte excelled or equaled the other two terminals for twenty-eight characteristics including all nineteen characteristics of importance to the oclc user. moreover, the irascope was outstandingly superior in its ability to perform continuous insertion of characters, line wrap-around during insertion of characters, repositioning of characters so that each line ends in a complete word, and full use of its memory. however, the irascope was the most expensive: $175 a month as compared with $153 and $166. nevertheless, the irascope was selected because of its obvious superiority.
pilot operation by library staffs has not produced complaints concerning visibility or operability; complaints during pilot operation have sprung from failures caused by a variety of bugs in telephone systems and a couple of bugs in the terminals that were subsequently exterminated. the number of terminals needed by a member library for shared cataloging was calculated on the assumption that six titles could be processed per terminal-hour. it was also assumed that a library might have only one staff member to use the terminal throughout the year. it was further assumed that as much as three months of the terminal operator's time would be lost to vacations, sick leave, and breaks. at the rate of six titles per terminal-hour and with 2,000 working hours in a year, 12,000 titles would be processed annually assuming full-time use. since only nine months was assumed to be available, it was estimated that 9,000 titles would be processed at each terminal. in large libraries where there would be more than one staff member to operate a terminal, there would be three months of time available to do input cataloging, and since only a few libraries will be obtaining less than 75 percent of cataloging from the central system, it appears that a formula of one terminal for every 9,000 titles or fraction thereof cataloged annually would give each library sufficient terminal-hours. in actual operation, operators have been able to work at twice the assumed rate of six titles per terminal-hour so that there is reason to believe that these guidelines will provide adequate terminal capability.

file organization

the primary data that will enter the total system are bibliographic records, and since the system is being designed to conform to standards, the national standard for bibliographic interchange on magnetic tape (8) has been complied with in file design.
in other words, the system can produce marc records from records in the oclc file format; more specifically, the system can regenerate marc ii records from oclc records derived originally from marc ii records, although an oclc record contains only 78 percent of the number of characters in the original marc ii record. similarly, the system can generate marc ii records from original cataloging input by member libraries. the simulation study clearly showed that bibliographic data would have to be accessed in the shortest possible time if the system were to avoid generating frustrating delays at the terminal. imitation of library manual files or of standard computer techniques for file searching would not provide sufficient efficiency. oclc, therefore, set about developing a file organization and an access method that would take advantage of the computation speeds of computers. oclc research on access methods has produced several reports (9,10,11) and has developed a technique for deriving truncated search keys that is efficient for retrieval of single entries from large files. these findings have been employed in the present system that contained over 600,000 catalog records in april 1973, arranged in a sequential file on disks, and indexed by a library of congress card-number index, author-title index, and a title index. the research program on access methods did not, however, investigate methods for storing and retrieving records. research on file organization included experiments directed toward development of a file organization that would minimize processing time for retrieval of entries or for the discovery that an entry is not in the file. since the oclc system is designed for on-line entry of data into the data base, it was not possible to consider a physically sequential file for the index files. the indexed sequential method of file organization obviates the data-entry obstacle posed by physical sequential organization, but is inefficient in operation. consequently, scatter storage was determined to be the best method for meeting the efficient file organization requirements of the system. the findings of the investigation have shown that very large files of bibliographic index entries organized by a scatter-store technique in which search keys are derived from the main entry can be made to operate very efficiently for on-line retrieval and at the same time be sparing of machine time even in those cases where requests are for entries not in the file (12). this research also produced two powerful mathematical tools for predicting retrieval behavior of such files, and a design technique for optimizing record blocking in such files so that, on the average, only one to two physical accesses to the file storage device are needed to retrieve the desired information. the files displayed in figure 1 are constructed by a single file-building program designed so that additional modules can be embedded in the program. the program accepts a bibliographic record, assigns an address for it in the main sequential file, and places the record at that address. having determined the bibliographic record address, the program next derives the author-title search key and constructs an author-title index file entry which contains the pointer to the bibliographic record. then the program produces an lc card number index entry and a title index entry, each of which contains the same pointer to the bibliographic record. when a bibliographic record is used for catalog card production, an entry is made in the holdings file. when the first holdings entry is made for a bibliographic record, a pointer to the holdings entry is placed in that record; the pointer to each subsequent holdings entry is placed in the previous holdings entry.
an entry is made at the same time in the call number index containing a pointer to the holdings entry. this file organization operates with efficiency and economy. the files containing the large bibliographic records and their associated holdings information are sequential, and hence, are highly economical in disk space. the technique used ensures that only a low percentage of available disk area need be reserved for growth of these large sequential files. disk units can be added as needed. each fixed-length record in the scatter-store files is less than 3 percent of the size of an average bibliographic record, and since 25 percent to 50 percent of these files are unoccupied, the empty disk area is small because of the small record lengths.

sequential files

the bibliographic record file and holdings file are sequential files, the holdings file being a logical extension of the bibliographic record file. a record is loaded into a free position made available by deletion of a record or into the position following the last record. whenever a new version of a record updates the version already in the file, the new record is placed in the same location as the old if it will fit; otherwise, it is placed at the end of the file and pointers in the indexes are changed. there is a third, small sequential file containing unique notes for specific copies, dash entries, and extra added entries. each bibliographic record contains the information in a marc ii record. each record also contains a 128-bit subrecord capable of listing up to 128 institutions that could hold the item described by the record. at the present time, only 49 of the 128 bits are used since there are 49 institutions participating in oclc. the record also includes pointers to entries in index files, so that the data base may be readily updated, and a pointer to the beginning of the list of holdings for the record.
in addition, each record has a small directory for the construction of truncated author-title-date entries, which are displayed to allow a user to make a choice whenever a search key indexes two or more records. although each bibliographic record includes all information in a standard marc ii record, records in the bibliographic record file have been reduced to 78 percent of the size of the communication record largely by reducing redundancy in structural information. oclc intends to compress bibliographic records further by reducing redundancy in text by employing compression techniques similar to those described in the literature (13,14). the holdings file contains a string of holdings records for each bibliographic record; individual records are chained with pointers. information in each record includes identity of the holding institution and the holding library within the institution, a list of each physical item of multiple or partial holdings, the call number and pointers to the next record in the chain, and to the call number index. the last record in the chain also has a back-pointer to the associated bibliographic record. whenever there is a unique note, dash entry, or extra added entry coupled to a holding, that holding has a pointer to a location in the third sequential file in which the note or entry resides.

index files

indexes include an author-title index, a title index, and an lc card number index. research and development are under way leading to implementation of an author and added author index and a call number index. a class number index will be developed and implemented in the future. with the exception of the class number index, which by its nature is required to be a sequentially accessible file, the oclc indexes are scatter storage files.
the construction of and access to a scatter storage file involves the calculation of a home address for the record and the resolution of the collisions that occur when two or more records have the same home address. the calculation of a home address comprises derivation of a search key from the record to be stored or retrieved and the hashing or randomizing of the key to obtain an integer, relative record address that is converted to a storage home address. the findings of oclc research on search keys have been reported (9,10,11). the hashing procedure employs a pseudo-random number generator of the multiplicative type:

$$\text{home address} = \operatorname{rem}\left(\frac{k\,x}{m}\right)$$

where k is the multiplier 65539, x is the binary numerical value of the search key, and m is the modulus, which is set equal to the size of the index file; 'rem' denotes that only the remainder of the division on the right-hand side is used. philip l. long and his associates have shown that efficiency of a scatter storage file is rapidly degraded when the loading of the file exceeds 75 percent (12); therefore, oclc initially loads files at 50 percent of physical capacity. hence, the modulus is chosen to be twice the size of the initial number of records to be loaded. when 75 percent occupancy is reached, a new modulus is chosen and the file is regenerated. collisions are resolved using the quadratic residue search method proposed by a. c. day (15) and shown to be efficient (12). in this method, a new location is calculated when the home address is full; the first new location is displaced from the home address by 2, the second by 6, the third by 12, and so on until an empty location is found if a record is being placed in the file, or the end of the entry chain is found if records are being retrieved. when the file size m is a prime having the form 4n + 3, where n is an integer, the entire file may be examined by m searches.
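the home-address calculation and collision search described above can be illustrated with a minimal sketch. several details are assumptions made only for illustration and are not taken from the oclc programs: the search key is turned into an integer by reading its ascii bytes as one big-endian number, displaced locations are found by subtracting 2, 6, 12, ... (that is, i(i + 1)) from the home address and wrapping modulo m, retrieval stops at the first empty location rather than testing an entry chain indicator bit, and the names `home_address`, `probe_sequence`, and `ScatterIndex` are hypothetical.

```python
MULTIPLIER = 65539  # multiplier k quoted in the text


def home_address(key: str, m: int) -> int:
    """Return rem(k * x / m), where x is the key read as a binary number.

    Assumption: x is the big-endian integer value of the key's ASCII bytes.
    """
    x = int.from_bytes(key.encode("ascii"), "big")
    return (MULTIPLIER * x) % m


def probe_sequence(home: int, m: int):
    """Yield the home address, then locations displaced by 2, 6, 12, ...

    Assumption: displacements are subtracted and wrapped modulo m. For a
    prime m this one-directional sequence visits (m + 1) // 2 distinct
    locations; the full-table property cited for Day's method is not
    reproduced here.
    """
    for i in range((m + 1) // 2):
        yield (home - i * (i + 1)) % m


class ScatterIndex:
    """Toy scatter-storage index holding (search key, record pointer) entries."""

    def __init__(self, m: int):
        self.m = m
        self.slots = [None] * m

    def insert(self, key: str, pointer: int) -> int:
        """Place an entry in the first empty location on the probe path."""
        for loc in probe_sequence(home_address(key, self.m), self.m):
            if self.slots[loc] is None:
                self.slots[loc] = (key, pointer)
                return loc
        raise RuntimeError("no empty location found; regenerate with a larger modulus")

    def lookup(self, key: str) -> list:
        """Collect pointers for all entries with this key along the probe path."""
        hits = []
        for loc in probe_sequence(home_address(key, self.m), self.m):
            entry = self.slots[loc]
            if entry is None:  # assumed end of the entry chain
                break
            if entry[0] == key:
                hits.append(entry[1])
        return hits


index = ScatterIndex(11)  # 11 is a prime of the form 4n + 3
index.insert("str,cha", 100)
index.insert("str,cha", 200)  # same key: collides and moves along the path
print(index.lookup("str,cha"))
```

in keeping with the loading figures quoted above, a file built this way would be sized at about twice the initial number of entries and regenerated with a new modulus once occupancy reached 75 percent.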
retrieval techniques

the retrieval of a record or records from the oclc data base is achieved in fractions of a second when a single request is put to the file, and rarely exceeds a second when queuing delays are introduced by simultaneous operation of upwards of 50 terminals. response time at the terminal is greater than these figures because of the low communication line data rate, but terminal response time rarely exceeds five seconds. figure 6 shows the map of a record in the author-title index file and the title file. in the author-title file, the search key is a 3,3 key with the first trigram being the first three characters of the author entry and the second being the first three characters of the first word of the title that is not an english article (9). for example, "str,cha" is the search key for b. h. streeter's the chained library. however, any or all of the characters in the trigrams may be in lower case. the author-title index also indexes title-only entries, but the title index provides a more efficient access to this type of entry. the pointer in the record map in figure 6 is the address of the bibliographic record from which the search key was derived. the entry chain indicator bit is set to 0 (zero) if there is another record in the entry chain and to 1 if the record is last in the chain. when this bit is 0, the search skips to the next record as calculated by day's skip algorithm. the bibliographic record presence indicator bit is set to 0 (zero) to indicate that the bibliographic record associated with this index entry has been deleted; it is set to 1 to indicate that the bibliographic record is present. an author-title search of the data base is initiated by transmission of a 3,3 key from a terminal. a message parser analyzes the message and identifies it as a 3,3 author-title search key by the presence of the comma and by there not being more than three characters on either side of that comma.
next, the hashing algorithm calculates the home address and the location is checked for the presence of a record. if no record is present, a message is sent to the terminal stating that there is no entry for the key submitted and suggesting other action to be taken. if a record is present and its key matches the key submitted and if the entry-chain indicator bit signifies that the record at the home address is the only record in the chain, the bibliographic record which matches the key submitted is displayed on the terminal screen. if the entry-chain bit signifies that there are additional records in the chain, those records are located by use of the skip algorithm. if more than one record possesses the same key as that submitted, truncated author-title-date entries derived from the matching bibliographic records are displayed with consecutive numbering on the terminal screen. the user then indicates by number the entry containing information pertaining to the desired work, and the program displays the full bibliographic record. the title-index record has the same map as the author-title record and is depicted in figure 6. the title index is also constructed and searched in the same manner as the author-title index.

[fig. 6. author-title and title index records. each record is 8 bytes: an entry chain indicator bit, a bibliographic record presence indicator bit, a bibliographic record pointer, and a 4-byte name-title or title search key.]

the title search key (3,1,1,1) consists of the first three characters of the first word of the title that is not an english article plus the initial character of each of the next three words. commas separate the characters derived from each word. the title search key is "cha,l,," for b. h. streeter's the chained library, the three commas signifying that the message is a title search key.
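the two search keys and the parser's comma test can be sketched as follows. this is an illustration only: the article list (a, an, the), the treatment of words as whitespace-separated tokens, the folding of keys to lower case, and the function names are assumptions, and the sketch ignores punctuation and diacritics that a production key-derivation routine would have to handle.

```python
ARTICLES = {"a", "an", "the"}  # assumed list of english articles


def significant_words(title: str) -> list:
    """Split a title into lower-case words, dropping leading articles."""
    words = title.lower().split()
    while words and words[0] in ARTICLES:
        words.pop(0)
    return words


def author_title_key(author: str, title: str) -> str:
    """Build the 3,3 key: three characters of author, three of title."""
    words = significant_words(title)
    first = words[0] if words else ""
    return f"{author.lower()[:3]},{first[:3]}"


def title_key(title: str) -> str:
    """Build the 3,1,1,1 key; missing words leave empty fields."""
    words = significant_words(title) + ["", "", "", ""]  # pad short titles
    parts = [words[0][:3]] + [w[:1] for w in words[1:4]]
    return ",".join(parts)


def classify(message: str) -> str:
    """Guess the search type from the commas, as the parser is described."""
    parts = message.split(",")
    if len(parts) == 2 and all(len(p) <= 3 for p in parts):
        return "author-title"
    if len(parts) == 4:  # three commas
        return "title"
    return "unknown"


print(author_title_key("Streeter, B. H.", "The Chained Library"))
print(title_key("The Chained Library"))
print(classify("str,cha"), classify("cha,l,,"))
```

for the streeter example the two functions reproduce the keys quoted in the text, "str,cha" and "cha,l,,".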
the bibliographic record pointer and the two indicator bits have the same function as in the author-title record. figure 7 exhibits the map for a record in the lc card number index. the three left-most bytes in the lc card number section contain an alphabetic prefix to a number where this is present, or, more usually, three blanks when there is no alphabetic prefix. similarly the right-most byte contains a supplement number or is blank. the middle four bytes contain eight digits packed two digits to a byte after the digits to the right of the dash have been, when necessary, left-filled with zeroes to a total of six digits. the dash is then discarded. for example, lc card number 68-54216 would be 68054216 before being packed. the pointer and the two indicator bits have the same function as in the author-title index record. an lc card number search is started with the transmission of an lc card number as the request. the parser identifies the message as an lc card number search by determining that there is a dash in the string of characters and that there are numeric characters in the two positions immediately to the left of the dash. the remainder of the search procedure duplicates that for the author-title index.

on-line programs

as is the case with all routinely used oclc programs, the on-line programs are written in assembly language to achieve the utmost efficiency in processing. in addition, every effort has been made to design programs to run in the fastest possible time. in other words, one of the main goals of the oclc on-line operation is lowest possible cost. the simulation study had shown that it was necessary to modify the operating system of the xds sigma 5 so that the work area of the operating system would be identical with that of the applications programs.
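returning briefly to the lc card number index record described earlier, the zero-filling and two-digits-per-byte packing can be sketched as below. the sketch is hedged in several ways: blanks are represented as ascii spaces (the character code actually stored is not stated for this field), the prefix and supplement are passed as separate arguments rather than parsed from the request, and the function names are hypothetical.

```python
def is_lc_card_number(message: str) -> bool:
    """Apply the parser test: a dash with two numeric characters to its left."""
    dash = message.find("-")
    return dash >= 2 and message[dash - 2:dash].isdigit()


def pack_lc_card_number(number: str, prefix: str = "", supplement: str = "") -> bytes:
    """Build the 8-byte lc card number section of an index record.

    Layout from the text: 3 bytes of alphabetic prefix (or blanks),
    4 bytes holding eight digits packed two to a byte, and 1 byte of
    supplement number (or blank).
    """
    left, right = number.split("-")
    digits = left + right.zfill(6)  # left-fill the serial to six digits
    if len(digits) != 8 or not digits.isdigit():
        raise ValueError(f"unexpected lc card number: {number!r}")
    packed = bytes.fromhex(digits)  # two decimal digits per byte
    return (
        prefix.ljust(3).encode("ascii")
        + packed
        + supplement.ljust(1).encode("ascii")
    )


field = pack_lc_card_number("68-54216")
print(field.hex())  # the middle four bytes read 68054216
```

packing the digits this way keeps the numeric part of the key at four bytes, which matches the fixed eight-byte width given for the lc card number section.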
the xds real-time batch monitor, which is one of the operating systems furnished by xds for the sigma 5, has been extensively altered, and one of the alterations is the change to a single work area. another major change to the operating system was building into it the capability for multiprogramming. at the present time, the on-line foreground of the system operates two tasks in that two polling sequences are running simultaneously, and the background runs batch jobs at the same time. this new monitor is called the on-line bibliographic monitor (obm). an extension of obm is named motherhood (mh); mh supervises the operation of the on-line programs. mh also keeps track of the activities of these programs and compiles statistics of these activities. in addition, mh contains some utility programs such as the disk and terminal i/o routines. the principal on-line application program is catalog (cat); its functions are described in detail in the subsequent sections entitled cataloging with existing bibliographic information and input cataloging. in general, cat accepts requests from terminals, parses them to identify the type of request, and then takes appropriate action. if a request is for a bibliographic record, cat identifies it as such, and if there is only one bibliographic record in the reply, cat formats the record in one of its work area buffers and sends the formatted record to the terminal for display. if more than one record is in the reply, cat formats truncated records and puts them out for display. after a single bibliographic record has been displayed, cat modifies the computer memory image of the record in accordance with update requests from the terminal. for example, fields such as edition statement or subject headings may be deleted or altered, and new fields may be added.
when the request is received from the terminal to produce catalog cards from the record as revised or unrevised, cat writes the current computer memory image of the record onto a tape to be used as input to the catalog card production programs.

fig. 7. library of congress card number index record. (labels in the figure: bibliographic record pointer, 4 bytes; lc card number, 8 bytes; bibliographic record presence indicator bit; entry chain indicator bit.)

journal of library automation vol. 5/3 september, 1972

the catalog card production programs operate off-line, and the first processing program is convert (cnvt), which formats some of the fields and call numbers. the major activity of cnvt is the latter, for libraries require a vast number of options to set up their call numbers for printing. cnvt also automatically places symbols used to indicate oversized books above, below, or within call numbers as required. format is the second program; it receives partially formatted records from cnvt. format expands each record into the total number of card images corresponding to the total cards required by the requesting library for each particular title. format determines this total from the number of tracings and pack definition tables previously submitted by the library that define the printing of formats of cards to go into each catalog. format, which is an extensive revision of expand, contains many options not found in the old off-line catalog card production system. format can set up a contents note on any particular card, and puts tracings at the bottom of a card when tracings are requested. the author entry normally occurs on the third line, but if a subject heading or added entry is two or more lines long, format moves the author entry down on the card so that a blank line separates the added entry from the author entry. in other words, each card is formatted individually.
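the rule for placing the author entry can be sketched in a few lines of python. this is only an illustration of the rule as stated above, not the format program itself; the function name is hypothetical, and the sketch assumes that a subject heading or added entry begins on the first line of the card.

```python
def layout_card(heading_lines, author_entry, body_lines):
    """Return the printed lines of one card, top to bottom.

    Assumption: the heading (subject heading or added entry), if any,
    starts on line 1. The author entry sits on line 3 unless a longer
    heading pushes it down far enough to keep one blank line between them.
    """
    lines = list(heading_lines)
    author_line = max(3, len(lines) + 2)  # 1-based line number of the author entry
    lines.extend([""] * (author_line - 1 - len(lines)))  # blank filler lines
    lines.append(author_entry)
    lines.extend(body_lines)
    return lines
```

a one-line heading leaves the author entry on the third line; a two-line heading moves it to the fourth line with a single blank line in between, which is the floating behavior described above.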
the major benefit of this feature, which allows the body of the catalog data to float up and down the card, is that the text on most cards can start high up on the card, thereby reducing the number of extension cards. the omission of tracings from added entry cards has a similar effect. table 1 presents the percentage of extension cards in a sample of 126,738 oclc cards for 18,182 titles produced for twenty-five or more libraries during a seventeen-day period, compared with extension cards in library of congress printed cards and in a sample of nelinet cards "for over 1,300 titles" (16). the table shows that the oclc mixture of cards with and without tracings and with the floating body of text yields about 10.8 percent more extension cards than library of congress printed cards. were libraries to restore the original meaning to the phrase "main entry" by printing tracings only on main entry cards, the percentage of extension cards in computer produced catalog cards printed six lines to the inch would probably be less than for lc cards.

format also sets up a sort key for each record and a sort program sorts the card images by institution, library, catalog, and by entry or call number within each catalog pack. another program, build-print-tape (bpt), arranges the sorted images on tape so that cards are printed in consecutive order in two columns on two-up card stock. finally, a print program prints the cards on an ibm 1403 n1 printer attached to an ibm 360/50 computer.

cataloging with existing bibliographic information

this section describes cataloging using a bibliographic record already in the central file; the next section, entitled input cataloging, describes cataloging when there is no record in the system for the item being cataloged. the cataloger at the terminal first searches for an existing record, using the lc card number found on the verso of the title page or elsewhere.
if the response is negative or if there is no card number available, the cataloger searches by title or by author and title using the 3,1,1,1 or 3,3 search keys respectively. if these searches are unproductive, the cataloger does input cataloging. when a search does produce a record, the cataloger reviews the record to see if it correctly describes the book at hand. if it is the correct record and if the library uses library of congress call numbers, the cataloger transmits a request for card production by depressing two keys on the keyboard. cataloging is then complete. if the lc call number is not used, the cataloger constructs and keys in a new number and then transmits the produce-cards request.

table 1. extension catalog card percentages

number      oclc marc ii   library of congress   nelinet marc ii
of cards    cards          printed cards         cards
1           77.2           87.8                  79.9
2           18.9           10.0                  16.7
3            2.5            1.6                   2.5
4            1.1             .3                    .6
5             .2             .2                    .1
6             .1                                   .2

if the record does not describe the book as the cataloger wishes, the record may be edited. the cataloger may remove a field or element, such as a subject heading. information within a field may be changed by replacing existing characters, such as changing an imprint date by overtyping, by inserting characters, or by deleting characters. finally, a new field such as an additional subject heading may be added. when the editing process is complete, the cataloger can request that the record on the screen be reformatted according to the alterations. having reviewed the reformatted version, the cataloger may proceed to card production.

when a cataloger has edited a record for card production, the alterations in the record are not made in the record in the bibliographic record file. rather, the changes are made only in the version of the record that is to be used for card production.
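the separation between the master record and the version edited for card production can be pictured as editing a copy. the python sketch below is a hypothetical illustration only; the dictionary layout, field names, and function name are assumptions and do not describe the actual oclc file structures.

```python
import copy


def edit_for_card_production(master_file, record_id, edits):
    """Return an edited working copy; the master record is left untouched.

    `edits` maps a field name to its new value, or to None to delete the
    field. Adding a field is the same as setting one that is not yet present.
    """
    working = copy.deepcopy(master_file[record_id])
    for field, value in edits.items():
        if value is None:
            working.pop(field, None)  # remove a field, e.g. a subject heading
        else:
            working[field] = value    # alter an existing field or add a new one
    return working
```

because the function works on a deep copy, a library's local changes never reach the shared record that other members retrieve.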
the edited version of the record is retained in an archive file after catalog card production so that cards may be produced again from the same record for the same library, should the need arise in the future.

the author index currently under development will enable a cataloger to determine the titles of works in the file by a given author. the call number index, also currently being developed, will make it possible for a cataloger to determine whether or not a call number has been used before in his library. the class number index that will be developed in the future will provide the capability of determining titles that have recently been placed under a given class number or, if none is under the number, the class number and titles on each side of the given number.

input cataloging

input cataloging is undertaken when there is no bibliographic record in the file for the book at hand. to do input cataloging, the cataloger requests that a work form be displayed on the screen (figure 8). the cataloger then proceeds to fill in the work form by keyboarding the catalog data, and transmitting the data to the computer field by field as each is completed. as shown in figure 8, a paragraph mark terminates each field; each dash is to be filled in by the cataloger for each field used. input cataloging may be original cataloging or may use cataloging data obtained from some source other than the oclc system.

fig. 8. workform for a dewey library. (the form shows prompts for type, form, intel lvl, bib lvl, lang, isbn, and card no, followed by thirteen numbered field lines, among them tags 250, 260, 300, 092, 049, and 590.)

when the catalog data has been input, revised, and correctly displayed on the terminal screen, the cataloger requests catalog card production.
in the case of new cataloging, not only are cards produced, but also the new record is added to the file and indexed so that it is available within seconds to other users. if a marc ii record for the same book is subsequently added to the file, it replaces the input-cataloging record but does not disturb the holdings information.

union catalog

each display of a bibliographic record contains a list of symbols for those member institutions that possess the title. in other words, the central file is also a union catalog of the holdings of oclc member libraries, although in the early months of operation these holdings data are very incomplete. nevertheless, they will approach completeness with the passage of time and with retrospective conversion of catalog data. titles cataloged during the operation of the off-line system have been included in the union catalog. the union catalog function is an important function of the shared cataloging system, for it makes available to students and faculties, through the increased information available to staff members, the resources of academic institutions throughout ohio. libraries also use the union catalog as a selection tool since they can dispense with expensive purchases of little-used materials residing in a neighboring library. members also use the file to obtain bibliographic data to be used in ordering.

assessment

with over nine hundred thousand holdings recorded in the union catalog as of april 1973, it is clear that having this type of information immediately at hand will greatly improve services to students and faculties. enlargement of holdings recorded will enhance the union-catalog value of the system. wright state university is in process of converting its holdings using the oclc system, and the ohio state university libraries, the largest collection in the state, has already converted its shelf list in truncated form. the osu holdings information will soon be available to oclc members.
members using the oclc system report a large reduction in cataloging effort. two libraries using lc classification report that they are cataloging at a rate in excess of ten titles per terminal hour when cataloging already exists in the system. libraries using dewey classification are experiencing a somewhat lower rate. the original cost benefit studies were done on the basis of a calculated rate of six titles per hour for those books for which there were already cataloging data in the system. the net savings will be realized when the file has reached sufficient size to enable the largest libraries to locate records for 65 percent of their cataloging and for the smallest to find 95 percent. to reach this level, members collectively would have to use existing bibliographic information to catalog 350,000 titles in the course of a year, or an average of approximately 1,460 titles for the total system per working day. it was thought that this rate would be attained by the end of the second year of operation. however, at the end of the first month of on-line operation, over a thousand titles per day were being cataloged.

the new catalog card production programs operating on the sigma 5 are much more efficient than the programs used in the older off-line system. earlier in this paper it was reported that the cost of the older programs to format catalog cards, but not to print them, was 2.27 cents per card. if costs of the sigma 5 are calculated at commercial rates, the new programs format cards at 2.21 cents per card. however, if actual costs to oclc are used and with the total cost being assigned to one shift, the cost of formatting each card becomes 0.86 cents. the total cost of producing catalog cards is, of course, much more than the cost to format them on a computer.
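the figures quoted above can be checked with a little arithmetic. the python sketch below uses only the numbers given in the text; the variable names are illustrative, and the working-day count is simply what the two quoted figures imply rather than a number stated by the authors.

```python
# Figures taken from the text.
titles_per_year = 350_000
titles_per_working_day = 1_460

# Working days per year implied by the two figures above.
implied_working_days = titles_per_year / titles_per_working_day

# Formatting cost per card, in cents.
old_offline_cost = 2.27
sigma5_commercial_cost = 2.21
sigma5_actual_cost = 0.86


def percent_reduction(old: float, new: float) -> float:
    """Reduction from old to new, as a percentage of old."""
    return (old - new) / old * 100


commercial_saving = percent_reduction(old_offline_cost, sigma5_commercial_cost)
actual_saving = percent_reduction(old_offline_cost, sigma5_actual_cost)
```

the two throughput figures imply roughly 240 working days a year. at commercial rates the new programs cut the formatting cost by under 3 percent, while at actual cost to oclc the reduction is a little over 60 percent.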
nevertheless, either the 2.21 cents or 0.86 cents rate might serve as a criterion for measuring the efficiency of computerized catalog card production.

the low terminal response-time delay for the operation of seventy terminals is a good gauge of the efficiency of the on-line system. in particular, the file organization is efficient, for it enables retrieval of a single entry swiftly from a file of over 600,000 records. moreover, no serious degradation in retrieval efficiency is expected to arise as the result of the growth of file size.

the system operates from 7:00 a.m. to 7:00 p.m. on mondays through fridays, and at times the interval between system downtimes has exceeded a week. it is rare that the system will be down on successive days, and when a problem does occur, the system can be restored within a minute or two. moreover, when the system goes down, only two terminals will occasionally lose data, and most of the time, there is no loss of data. hence, it can be concluded that the hardware and software are highly reliable.

in summary, it can be said that the oclc on-line shared cataloging system is easy to use, efficient, reliable, and cost beneficial.

acknowledgments

the research and development reported in this paper were partially supported by office of education contract no. oec-0-70-2209 (506), council on library resources grant no. clr-489, national agricultural library contract no. 12-03-01-5-70, and an l.s.c.a. title iii grant from the ohio state library board.

references

1. ellen wasby miller and b. j. hodges, "shawnee mission's on-line cataloging system," jola 4:13-26 (march 1971).
2. john p. kennedy, "a local marc project: the georgia tech library," in proceedings of the 1968 clinic on library applications of data processing. (urbana, ill.: university of illinois graduate school of library science, 1969) p. 199-215.
3. new england board of higher education, new england library information network; final report on council on library resources grant #443. (feb. 1970).
4. charles t. payne and robert s. mcgee, the university of chicago bibliographic data processing system: documentation and report supplement, (chicago, ill.: university of chicago library, april 1971).
5. judith hopkins, manual for oclc catalog card production (feb. 1971).
6. ohio college library center, preliminary description of catalog cards produced from marc ii data (sept. 1969).
7. f. g. kilgour, "libraries-evolving, computerizing, personalizing," american libraries 3:141-47 (feb. 1972).
8. american national standards institute, american national standard for bibliographic information interchange on magnetic tape (new york: american national standards institute, 1971).
9. f. g. kilgour, p. l. long, and e. b. leiderman, "retrieval of bibliographic entries from a name-title catalog by use of truncated search keys," proceedings of the american society for information science 7:79-82 (1970).
10. f. g. kilgour, p. l. long, e. b. leiderman, and a. l. landgraf, "title-only entries retrieved by use of truncated search keys," jola 4:207-210 (dec. 1971).
11. philip l. long and f. g. kilgour, "a truncated search key title index," jola 5:17-20 (mar. 1972).
12. p. l. long, k. b. l. rastogi, j. e. rush, and j. a. wyckoff, "large on-line files of bibliographic data: an efficient design and a mathematical predictor of retrieval behavior." ifip congress '71: ljubljana, august 1971. (amsterdam: north holland publishing co., 1971). booklet ta-3, 145-149.
13. martin snyderman and bernard hunt, "the myriad virtues of text compaction," datamation 16:36-40 (dec. 1970).
14. w. d. schieber and g. w. thomas, "an algorithm for compaction of alphanumeric data," jola 4:198-206 (dec. 1970).
15. a. c. day, "full table quadratic searching for scatter storage," communications of the acm 13:481 (aug. 1970).
16. new england board of higher education, new england library information ..., p. 100-101.
book reviews

die elektronische datenverarbeitung im bibliothekswesen. by paul niewalda. muenchen-pullach, berlin, verlag dokumentation, 1971. (bibliothekspraxis, 1)

as the first volume in a new series called bibliothekspraxis (library practice), verlag dokumentation has published a short monograph on library automation by paul niewalda, of the university library of regensburg. niewalda has written an introductory text, in german, condensing the standard, largely american, literature on the subject. his treatment is concise, well-written, and well-organized. computer capabilities, and existing library applications in the united states and elsewhere, are carefully delineated. the text is thoroughly documented, with a large number of notes and a useful bibliography included.

the book addresses itself to the german reader and, in fact, much is already familiar to american librarians. yet niewalda's frequent references to the european, particularly the german, library automation scene enhance the book's value. the author is clearly well informed both about library automation in general, and about local practice and problems. he brings to his task common sense and sound judgment. the work is recommended to those readers having a general interest in foreign developments in the field of library automation.

s. micha namenwirth
university of california, berkeley

dictionary of library science, information and documentation in six languages. compiled and arranged by w. e. clason. amsterdam: elsevier scientific publ. co., 1973.

the basic table, a numbered list of entries for 5,439 english language words and phrases, alphabetically arranged, forms the body of the dictionary of library science, information and documentation. each entry consists of a serial number, the english term (american and/or british), equivalents in french, spanish, italian, dutch, and german, and a code identifying the vocabulary with which the term is associated. hence, there are separate entries for volume as a book trade or library term and as an information processing term. many entries are augmented by brief definitions. english synonyms are also frequently given; in general these are terms from which references have been made. in such cases entry is under the synonym which files first. this practice produces some apparently eccentric choices; e.g., pseudonym, see allonym; udc, see brussels system.

following the basic table are indexes for the five non-english languages. numerical references are given to basic table entries in which the index term is cited. german band is found not only in the first volume entry mentioned above but also in the bookbinding and information processing entries for tape.

criteria employed for the selection of entries are unexplained. ibm's data processing glossary and the american national standard vocabulary for information processing appear to have been important sources of information processing terms. the glossary in anglo-american cataloging rules was evidently not used. it is clear that some of the source lists used were in other languages. the juxtaposition of related vocabularies which often put the same words to different uses presents difficulties which the approach taken here seems capable of handling.
nevertheless the work as executed has flaws which reduce its effectiveness. the notions of synonymy and nonsynonymy among the english terms are puzzling. definitions are frequently unclear and occasionally wrong. there are cases in which the non-english equivalents for a single term are certainly not synonymous with each other. the utility of the indexes would be enhanced if the number of non-english synonyms given were greater. however, if approached with care, the volume can provide much useful information. in works of this type it is probably unfair to expect perfection. besides, a dictionary which manages to encompass both negative entropy (information theory) and scrivener's palsy (authors and authorship) has to be interesting, at least.

charles w. husbands
harvard university library

letter to the editor
ann kucera
information technology and libraries | june 2018
https://doi.org/10.6017/ital.v37i2.10407

dear editorial board,

regarding "halfway home: user centered design and library websites" in the march 2018 issue of information technology and libraries (ital), i thought there were some interesting points. i think that your assertion, however, that user centered design automatically eliminates anything from a website that your main user group did not expressly ask for is faulty. when someone brings up the fact that user centered design is not statistically significant, i interpret that as a misunderstanding of what user centered design is. our academic library websites are not research projects, so why would we gather statistically significant information about them? our academic library websites are (or should be) helpful to students and faculty and constantly changing to meet their needs. if librarians perpetuate a misunderstanding of user centered design, my fear is that misunderstanding could perpetuate stagnation and a refusal to change our technology/user interfaces in a rapidly changing environment and do our patrons and ourselves a disservice.
user centered design is a set of tools to help us gather information about users and their needs. the information gathered informs the design but does not dictate the design and needs to be part of an iterative process. the web design team at your institution demonstrated user centered design when they added floor maps back into the website when a group of users pointed out that it was causing problems for the main users at your institution. while valuable experience from librarians and other staff is critical to take into account, it is sometimes difficult to determine which pieces of the puzzle provide comfort to those who work at the library vs. which pieces assist students in their studies. i applaud your willingness to “clear the slate” and reduce the amount of information you were maintaining on your website. i’m guessing you may have removed dozens of links from your website. you only mentioned adding one category of information back into the design. i would say your user centered design process is working quite well.

ann kucera
systems librarian
central michigan university

https://doi.org/10.6017/ital.v37i1.10338

information technology and libraries | september 2008
communications
james feher and tyler sondag
administering an open-source wireless network

this tutorial presents enhancements to an open-source wireless network discussed in the june 2007 issue of ital that should reduce its administrative burden. in addition, it will demonstrate an open-source monitoring script written for the wireless network.

as it has become increasingly important to provide wireless internet access for their patrons, libraries and colleges are almost expected to offer this service.
inexpensive methods of providing wireless access—such as adding a commodity wireless access point to an existing network—can suffer from security issues, access by external entities, and bandwidth abuses. designs that address these issues often involve more costly proprietary hardware as well as expertise and effort that are often not readily available. a wireless network built with open-source software and commodity hardware that addressed the cost, security, and equal access issues mentioned above was presented in the june 2007 issue of ital.1 this tutorial highlights enhancements to the previous design that help to explain the technical hurdles in implementation, and includes a program that monitors the status of the various software and hardware components, helping to reduce the time required to administer the network. the wireless network presented requires several different pieces of software that must work together. because each of the required software programs are frequently updated, slight changes to the implementation may also be needed. a few issues that have arisen since the previous paper was written are addressed. a note is provided explaining the significance of setting the correct media access control (mac) address for the radius server and for wireless distribution system (wds) when configuring the system. in addition, in order to provide secure exchange of authentication credentials (username and password), the secure socket layer was used. a brief explanation of how to install a registered certificate on the gateway server is provided. lastly, a program that monitors the status of the network, provides a web page displaying the status of the various hardware and software components, and e-mails administrators with any changes to the network status—along with information on how this program is to be deployed within the network—is presented. 
configuration changes for previous design

as new exploits are discovered and patched on a continual basis, any system should be regularly updated to ensure that the most recent software is being used. the network design provided in the previous article used many different software components including, but not limited to:

component               software
access point software   openwrt whiterussian rc3
dns cache               dnsmasq v2.32
gateway                 chillispot v1.0
operating system        fedora core 4
radius server           freeradius v1.0.4
web caching server      squid v2.5
web server              apache 2.2.3

many of these components can be kept up-to-date by using the yellow dog updater, modified (yum).2 for example, to update a given package, with root access, at the command line enter:

    yum update packagename

the yum command may also be used to update each package that has an available update by simply removing the package name from the yum update command and entering the following:

    yum update

yum may also be used to upgrade the entire operating system.3 keep in mind that with any change in software, the configuration of any particular package may change as well. for example, the newest version of squid is currently 2.6. appendix d in the previous paper explained how to allow transparent relay of web requests so that client browsers did not have to be reconfigured. while version 2.5 required four changes to allow the transparent relay, the current version requires only one (see appendix a). in addition to changes in software, occasionally even entire websites move, as happened with chillispot.4

another change involved the configuration of the linksys wrt54gs access points. the newer versions of this access point/router sold by linksys have half the flash memory and half of the ram of the older versions.5 while the newer versions of the linksys wrt54gs can be flashed with custom firmware,6 the firmware that will fit on the newer unit does not have the full capability of the standard firmware.
given this, those wishing to implement such a wireless network should investigate the capability of models to be deployed, as well as the version numbers for the access points chosen. the current versions of the linksys wrt54gl and wrtsl54gs units retain enough flash memory and ram to be updated with the standard firmware mentioned in the previous article.7

james feher (jdfeher@mckendree.edu) is associate professor of computer science and computer information systems at mckendree university, lebanon, illinois. tyler sondag (sondag@cs.iastate.edu) is a phd candidate in computer science at iowa state university, ames.

in addition, the procedure for upgrading the firmware for the wrtsl54gs is simpler than the procedure outlined in appendix i of the previous paper. the factory-installed firmware on version 1.1 can be flashed directly using the web interface provided by linksys.

so, while this tutorial and the previous paper outline the design of a network, the administrator will need to be vigilant in updating the packages used and keep in mind that the configuration specifications may also change with those updates. the administrator for the network must also investigate the capability of the standard hardware used to ensure that it retains the functionality required for the system.

choosing the correct mac address for the access point

the access points used will have more than one interface and as such more than one mac address. when entering the mac address of a given access point into either the users file for the radius server or the access points that use the wds, use the mac address associated with the wireless interface.8 using the incorrect mac address will result in problems when communicating with the various access points.
for the radius server, the access point will not get the correct ip address, which will prohibit the possibility of remotely administering the unit. incorrect mac addresses that are used for the wds settings will cause even worse problems, as the unit will not be able to relay data from users who connect to this access point. installation of a registered ssl certificate as users are required to enter their authentication credentials to gain access to the internet, the exchange of this data is encrypted using the secure socket layer.9 while administrators can self-sign the certificates used for their web servers, it is recommended that a registered certificate be obtained and installed for the system. this can help prevent common attacks and has the added benefit of eliminating warnings for the client browsers when they detect unregistered certificates being used by the ssl. a search of “ssl certificate” will yield any number of commercial vendors from which a certificate can be obtained. generally the installation of a certificate is fairly straightforward. the openssl command line utility can be used to generate a ssl key and certificate signing request (csr).10 once the csr is generated, pick a vendor/certificate authority who can sign your key. it should be noted that the design presented required the authentication gateway to be behind the main router. this required a certificate to be signed for a server within an intranet that does not have a fully qualified domain name. so, when generating the ssl key and csr, make sure to use gatewayhostname.localnet as the common name of your server. of course, gatewayhostname is whatever you choose as the name of your gateway host. the term localnet is used to refer to the server existing within an intranet. then make sure to place an entry for gatewayhostname.localnet into the hosts file of the server that is providing domain name service for your network. 
an example entry for the hosts file which is in the /etc directory of a standard fedora core installation is found in appendix b. monitoring script for wireless network as the wireless network has many separate hardware and software components, many possible points of failure exist for the system. the script from appendix c, which was written in perl,11 uses ping to test if each access point is still connected to the network and nmap to test whether the port associated with a given network service is still available.12 this program can be run manually or, even better, run automatically through the unix cron utility to update a webpage that displays the current state of all the network components. the webpage generated by this script for the mckendree college wireless network may be found at http://lance.mckendree.edu/cgi-bin/wireless/status.cgi. (additionally, a sample of this page is available as a figure in appendix d.) this script actually contains a script within a script. the main script must be run on the gateway machine, chilli on the diagram in appendix e, as only this machine has access to ping the access points. when the script determines that an access point or daemon is down, it will e-mail the system administrator. when an access point is down, in addition to sending the system administrator an e-mail, it can also send notification to an e-mail address associated with that device. this allows for someone other than the system administrator—who may have closer physical access to the unit—to check the access point on behalf of the administrators for simple issues, such as an access point losing power. this script then generates another cgi script that can be transmitted to an external server that can be reached from anywhere on the internet. in this case, this generated script can be run as a web-based application or by the system itself using the cron utility. 
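the port test and the change-only notification described above can be sketched in a few lines. this is a minimal illustration, not the authors' script: it is written in python rather than perl, it uses a plain tcp connection attempt in place of nmap and net::ping, and the function names tcp_port_open and status_changes are hypothetical. udp services such as radius and the ping test for access points are not covered.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat the service as down.
        return False


def status_changes(previous: dict, current: dict) -> list:
    """Compare two {service_name: is_up} snapshots and describe what changed.

    Only services whose state differs from the previous snapshot are
    reported, so a notification is produced once per change rather than
    on every run.
    """
    changes = []
    for name, up in sorted(current.items()):
        if previous.get(name, up) != up:
            changes.append(f"{name} {'back up' if up else 'down'}")
    return changes
```

run from cron, a wrapper would load the previous snapshot, call tcp_port_open for each daemon, and send mail only when status_changes returns a non-empty list.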
if run by the cron daemon, it will also e-mail the administrators if the script has not been updated recently. the script requires the use of several perl modules that will need to be installed:

- Expect
- Mail::Mailer
- Net::Ping

the script has been released using the gnu general public license, version 2 (gpl).13 the first portion of the script contains a reference to the gpl, followed by a brief explanation of the script as well as a set of parameters that should be changed to fit the specifications of the network designed.

conclusion

administrators should be vigilant in updating the entire system to assure security, keeping in mind that new versions of software or hardware may necessitate changes in the overall configuration of the system. in addition, while the monitoring script provides a useful aid in monitoring the network, it could be further expanded to include a more comprehensive review of level of use for various access points by the different users. it is felt that this would be best done through a database, which would require a higher level of administrative effort. a brief frequently asked questions list along with the script and link to the code for the script can be found at http://lance.mckendree.edu/csi/wirelessfaq.html.

references

1. sondag, tyler and james feher, “open source wifi hotspot implementation,” information technology and libraries 26, no. 2: 35–43, http://ala.org/ala/lita/litapublications/ital/262007/2602jun/toc.cfm (accessed july 24, 2008).
2. linux@duke, “yum: yellow dog updater, modified,” http://linux.duke.edu/projects/yum (accessed july 24, 2008).
3. upgrading fedora using yum frequently asked questions, http://fedoraproject.org/wiki/yumupgradefaq (accessed mar. 16, 2007).
4.
chillispot: open source wireless lan access point controller, “spice up your hotspot with chilli,” www.chillispot.info/ (accessed may 22, 2008).
5. openwrtdocs/hardware/linksys/wrt54gs, openwrt, http://wiki.openwrt.org/openwrtdocs/hardware/linksys/wrt54gs (accessed july 24, 2008).
6. bitsum technologies wiki, wrt54g5 cfe, http://bitsum.com/openwiking/owbase/ow.asp?wrt54g5_cfe (accessed july 24, 2008).
7. openwrtdocs/hardware/linksys/wrtsl54gs, openwrt, http://wiki.openwrt.org/openwrtdocs/hardware/linksys/wrtsl54gs (accessed july 24, 2008).
8. openwrtdocs/whiterussian/configuration, wireless distribution system (wds)/repeater/bridge, http://wiki.openwrt.org/openwrtdocs/whiterussian/configuration (accessed july 24, 2008).
9. viega, john, matt messier, and pravir chandra, network security with openssl: cryptography for secure communications (sebastopol, calif.: o’reilly and associates, 2002).
10. generating a key pair and csr for an apache server with modssl, www.verisign.com/support/tlc/csr/modssl/v00.html (accessed feb. 20, 2007).
11. wall, larry, tom christiansen, and randal schwartz, programming perl, third edition (sebastopol, calif.: o’reilly and associates).
12. nmap: free security scanner for network exploration and security audits, http://insecure.org/nmap/ (accessed feb. 20, 2007).
13. gnu general public license version 2, june 2007, www.gnu.org/licenses/gpl.txt.

appendix a. squid configuration changes

# changes made to squid.conf
# lines needed for squid 2.5
#httpd_accel_port 80
#httpd_accel_host virtual
#httpd_accel_with_proxy on
#httpd_accel_uses_host_header on
#
# one line needed in version 2.6
http_port 3128 transparent

appendix b.
/etc/hosts entry on marla for localnet entry

127.0.0.1 marla localhost.localdomain localhost
66.128.109.60 bob
66.99.172.252 lance.mckendree.edu lance
# next line is for the ssl certificate to work properly
192.168.176.1 chilli.localnet chilli

appendix c. monitoring script

#!/usr/bin/perl
#########################################################
# code released 03/22/07 under:
# the gnu general public license, version 2
# http://www.gnu.org/licenses/gpl.txt
#
# it is recommended that this script is run as a cron
# job frequently to find changes in the network. this
# script will check the status of the wireless access
# points/routers as well as the daemons necessary to
# run the network. it will then output the results to
# another perl file that is copied to a remote
# webserver. when the script observes a change in the
# availability of any access point or daemon, email
# will be sent to the specified administrator
# address(es). the option exists to send an email to
# an additional person for each access point.
#
# additionally, the output file on the remote webserver
# will check when it was last updated, if that script
# is run from the command line or via cron. if it has
# not been updated for a specified number of minutes,
# it will send an email to the administrator. it is
# also recommended that this output script be run as a
# cron job. this output script can also be executed
# as a cgi program to generate a display of network
# status.
#########################################################
use strict;
use Expect ();      # needed to scp to webserver
use Mail::Mailer;   # needed to send emails if outages
use Net::Ping;      # needed to check the status of aps

#variables for webserver to host status pages
my $webservuname = "username";
my $webservpass = "password";
my $webservurl = "lance.mckendree.edu";
my $webservtarg = "/var/www/cgi-bin/wireless/";
my $weboutputurl = "http://lance.mckendree.edu/cgi-bin/wireless/status.cgi";
my $instname = "mckendree college";

#default background color of the status page
my $defbgcolor = "#660066";

# if the page on the webserver has not been updated
# in $updatemin minutes send an email that the service
# is down (set to =~ 3*crontime)
my $updatemin = 10;

#email address errors will be sent to
my $fromemail = 'admin1@email.com';
my $toemail = 'admin1@email.com, admin2@email.com';

#file where errors will be stored on remote host
my $logfilename = "/tmp/wireleslog.txt";

#hash for routers/ap's
#location is displayed on the webpage and in status emails
#owner changes in status regarding this ap are sent to
# this address as well (optional)
my %iptoloc = (
    "192.168.182.10" => { "location" => "clark 205", "owner" => ''},
    "192.168.182.11" => { "location" => "clark 202a", "owner" => 'apuser1@email.com'},
    "192.168.182.12" => { "location" => "pac lounge", "owner" => 'apuser2@email.com'},
    "192.168.182.20" => { "location" => "library main", "owner" => 'apuser3@email.com'},
    "192.168.182.21" => { "location" => "library upper", "owner" => ''},
    "192.168.182.22" => { "location" => "library lower", "owner" => ''},
    "192.168.182.30" => { "location" => "carnegie", "owner" => 'apuser4@email.com'});

#hash for daemons
my %daemons = (
    "dnsmasq dns server" => {
        "ip_addr" =>"10.4.1.90",
        "port" =>"53",
        "proto" =>"tcp"},
    "radius authenticate" => {
        "ip_addr" =>"10.4.1.90",
        "port" =>"1812",
        "proto" =>"udp"},
    "chilli capt. portal" => {
        "ip_addr" =>"10.5.3.30",
        "port" =>"0",
        "proto" =>"local"},
    "squid web cache" => {
        "ip_addr" =>"10.4.1.90",
        "port" =>"3128",
        "proto" =>"tcp"},
    "apache web server" => {
        "ip_addr" =>"10.5.3.30",
        "port" =>"80",
        "proto" =>"tcp"});

########################################################
# no changes need to be made to the following code     #
########################################################

# get the current time
my $currenttime = scalar localtime();
my $starttime = time();

# open old output status script to get previous statuses
open(old, "status.cgi");
my @tmpoldstatfile = <old>;
my $oldstatfile = join("", @tmpoldstatfile);

# check routers/ap's using ping
my $diff = '';
my $allrouterstat;
foreach my $host (sort keys %iptoloc){
    my $p = Net::Ping->new();
    my $pingresult = $p->ping($host);
    if(!$pingresult){
        sleep 10;
        $pingresult = $p->ping($host);
    }
    my $thislaststat = ( $oldstatfile =~ m/$iptoloc{$host}{location}<\/td>close(); }

#check the status of each daemon
my $alldaemonstat ='';
foreach my $i (sort keys %daemons){
    my $thislaststat = ( $oldstatfile =~ m/$i<\/td> (\$lasttime + (60 * $updatemin))){
    \$systemstatus = "#ff0000";
    \$message = "

status update failed

";
}

# if this is cron running the script
if (\$currentuser =~ "$webservuname"){
    # send email if status is down & logfile doesn't exist
    &sendemail() if ( (\$systemstatus =~ "#ff0000") && !(-e "$logfilename") );
    # delete log file if everything is up
    unlink("$logfilename") if ( (!(\$systemstatus =~ "#ff0000")) && (-e "$logfilename") );
}
#else apache is accessing the page (it's a web request)
else{
    #print the page
    print header();
    ############################
    # start of html output     #
    ############################
    print <<web_output;
$instname wireless status

$instname wireless status

\$message $allrouterstat
access point status

$alldaemonstat
daemon status


last updated $currenttime
web_output
    ##########################
    # end of html output     #
    ##########################
}#end else

sub sendemail {
    my \$mailer = Mail::Mailer->new("sendmail");
    \$mailer->open({from => '$fromemail',
                    to => [\$toemail],
                    subject => "wireless problem"});
    my \$message = "the wireless system has failed to update "
        ."its status.\n\n$weboutputurl\n";
    print \$mailer \$message;
    \$mailer->close();
    open(file, ">>$logfilename");
    print file "failed to update system.";
    close(file);
}
output_file_for_remote_host
########################################################
# end of script output block                           #
########################################################

#write output code to the file
my $perloutputfile = "status.cgi";
open (out, ">$perloutputfile");
print out $perloutput;
close (out);
chmod 0755, $perloutputfile;

#send email if necessary
&sendemail($diff, $weboutputurl, $fromemail, $toemail) if ($diff);

#send perl file to webserver
&scpfile($perloutputfile, $webservuname, $webservpass, $webservurl, $webservtarg);

################################################
# end main code block, start functions         #
################################################

# given the name and status of something (ap or
# daemon), this returns a string for the table
# row for displaying the status of the ap/daemon
sub printstatus {
    my ($service, $status, $oldstatus, $owner, $toemail, $oldstatusfile, $currenttime ) = @_;
    my $msg = "";
    my $statusline = "\n $serviceup";
    # if last two statuses were down
    if ($oldstatusfile =~ m/\($service\)-0--->/){
        $msg = "$service back up at $currenttime\n";
        # if service has owner & not already in mail list,
        # add owner to mail list
        $toemail .= ", \'$owner\'" if ($owner && (!($toemail =~ $owner)));
    }
}
#else current status is down
else{
    $statusline .= "down\">down";
    # if last status was down & before that status was up
    if ($oldstatusfile =~ m/\($service\)-0-1-->/){
        $msg = "$service down at $currenttime\n";
        # if service has owner & not already in mail list,
        # add owner to mail list
        $toemail .= ", \'$owner\'" if ($owner && (!($toemail =~ $owner)));
    }
}
$statusline .= "";
return ($statusline, $toemail, $msg);
}#end printstatus function

# checks the status for the given daemon
# takes in ip, port to check, daemon name, and protocol
# (tcp/udp). if given port=0 it checks for local daemon
sub checkdaemon {
    my ($ip, $port, $daemon, $proto) = @_;
    my $dstat = 0;
    if ($proto !~ /local/){
        # -sU checks for udp ports
        my $com = ($proto =~ "tcp") ?
            ("nmap -p $port $ip | grep $port") :
            ("nmap -sU -p $port $ip | grep $port");
        open(tmp, "$com|");
        my $comout = <tmp>;
        close(tmp);
        if ($comout =~ /open/){
            $dstat = 1; #if port is open, status is up
        }
    }
    else{
        $daemon =~ s/ +.*//g;
        #\l lowercases the first letter of $daemon
        my $com = "which \l$daemon";
        open(tmp, "$com|");
        my $comout = <tmp>;
        close(tmp);
        $com = "ps aux | awk '{print \$11}' | grep $comout";
        open(tmp, "$com|");
        $comout = <tmp>;
        close(tmp);
        $dstat = 1 if ($comout);
    }
    return $dstat;
} # end checkdaemon function

# send the output perl status file to the webserver
sub scpfile {
    my ($filepath, $webservuname, $webservpass, $webservurl, $webservtarg ) = @_;
    my $command = "scp $filepath $webservuname"
        ."\@$webservurl:$webservtarg";
    my $exp1 = Expect->spawn ($command);
    # the first argument "30" may need to be adjusted
    # if your system has very high latency
    my $ret = $exp1->expect(30, "word:");
    print $exp1 "$webservpass\r";
    $ret = $exp1->expect(undef);
    $exp1->close();
} # end scpfile function

# send an email to the admin & append error to log file
sub sendemail {
    my ($errorlist, $weboutputurl, $fromemail, $toaddresses ) = @_;
    my $mailer = Mail::Mailer->new("sendmail");
    $mailer->open({from => "$fromemail",
                   to => [$toaddresses],
                   subject => "wireless problem"});
    $errorlist .= "\n\n$weboutputurl";
    print $mailer $errorlist;
    $mailer->close();
} # end sendemail function

appendix d. script output page

appendix e. diagram of network

a truncated search key title index

philip l. long: head, automated systems research and development and frederick g. kilgour: director, ohio college library center, columbus, ohio.

an experiment showing that 3,1,1,1 search keys derived from titles are sufficiently specific to be an efficient computerized, interactive index to a file of 135,938 marc ii records.

this paper reports the findings of an experiment undertaken to design a title index to entries in the ohio college library center's on-line shared cataloging system. several large libraries participating in the center requested a title index because experience in those libraries had shown that the staff could locate entries in files more readily by title than by author and title. users of large author-title catalogs have long been aware of great difficulties in finding entries in such catalogs. since the center's computer program for producing an author-title index could be readily adapted to produce a title index, it was decided to add title access to the system.

a previous paper has shown that truncated three-letter search keys derived from the first two words of a title are less specific than author-title keys (1). earlier work had revealed that addition of only the first letter of another word in a title improved specificity (2). therefore, the experiment was designed to test the specificity of keys consisting of the first three characters of the first non-english-article word of the title plus the first letter of a variable number of consecutive words.
the experiment was also designed to produce an index that catalogers could use efficiently and that would operate efficiently in the computer system. it was assumed that the terminal user would have in hand the volume for which an entry was to be sought in the on-line catalog. the index was not to be designed for use by library users; subsequent experiments will be done to design an index for nonlibrarian users.

other investigations into computerized, derived-key title indexes include the previous paper in this series to which reference has already been made (1) and development of a title index in stanford's ballots system (3). although stanford has not published results observed from experiment or experience that describe the retrieval specificity of its technique, it is clear that the stanford procedure is not only more powerful than the one described in this paper but also more adaptable for user employment. the stanford index is probably less efficient.

journal of library automation vol. 5/1 march, 1972

materials and methods

a file of 135,938 marc ii records was used in this experiment. this file contains title-only and name-title entries, and keys were derived from titles in both types of entries. a key was extracted consisting of the first three characters of the first non-english-article word of each title plus the first character of each following word up to four. if there were fewer than four additional words, the key was left-justified, with trailing blank fill. only alphabetic and numeric characters were used in key derivation; alphabetic characters were forced to uppercase. all other characters were eliminated and the space occupied by an eliminated character was closed up before the key was derived.

a total of 115,623 distinct keys was derived from the 135,938 entries. these 115,623 keys were then sorted. each key in the file was compared with the subsequent key or keys and equal comparisons were counted.
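the derivation rules above can be made concrete with a short sketch. this is an illustration in python, not the center's program; the function name derive_key, the extra_words parameter, and the list of articles treated as english articles ("a", "an", "the") are assumptions, since the paper does not list them.

```python
import re

# Assumed list of English articles skipped at the start of a title.
ENGLISH_ARTICLES = {"A", "AN", "THE"}


def derive_key(title: str, extra_words: int = 3) -> str:
    """Derive a 3,1,...,1 search key from a title.

    extra_words is the number of single-character positions after the
    three-character head: 2 gives a 3,1,1 key, 3 gives 3,1,1,1, and
    4 gives 3,1,1,1,1.
    """
    # Keep only letters, digits, and whitespace; removed characters are
    # closed up, so "O'Neill" becomes "ONeill" before uppercasing.
    cleaned = re.sub(r"[^A-Za-z0-9\s]", "", title).upper()
    words = cleaned.split()
    # Skip leading articles to reach the first non-article word.
    while words and words[0] in ENGLISH_ARTICLES:
        words = words[1:]
    if not words:
        return " " * (3 + extra_words)
    head = words[0][:3].ljust(3)
    # First character of each following word, blank-filled on the right.
    tail = "".join(w[0] for w in words[1:1 + extra_words]).ljust(extra_words)
    return head + tail
```

with extra_words set to 4, the title "the heritage of the english library" yields HEROTEL, and a one-word title such as "proceedings" yields PRO followed by four blanks.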
a frequency distribution by identical keys was thus prepared, and a table constructed of percentages of numbers of equal comparisons based on the total number of distinct keys. this table contains the percentage of time for expected numbers of replies based on the assumption that each key had a probable use equal to all other keys.

next, by eliminating the fourth single character and then the fourth and third, files of 3,1,1,1 and 3,1,1 keys were prepared from the 3,1,1,1,1 file. for example, the 3,1,1,1,1 key for raymond irwin's the heritage of the english library is her, o, t, e, l; the 3,1,1,1 key for this title is her, o, t, e; and the 3,1,1 key, her, o, t. the same processing given to the 3,1,1,1,1 file was employed on these two files.

results

table 1 contains the maximum number of entries in 99 percent of replies. inspection of the table reveals that there is a large increase in specificity when the key is enlarged from 3,1,1 to 3,1,1,1; the maximum number of entries (99+ percent of the time) drops from twelve to five. however, when the key goes to 3,1,1,1,1, the number of entries per reply goes down only to four from five. the percentage of replies that contained a single entry was 67.8 for the 3,1,1 key, 84.0 for the 3,1,1,1 key, and 90.0 for the 3,1,1,1,1 key.

table 1. maximum number of entries in 99 percent of replies (title index entries)

search key    maximum entries per reply    percentage of time
3,1,1         12                           99.0
3,1,1,1       5                            99.1
3,1,1,1,1     4                            99.2

the irascope cathode ray tube terminals used in the oclc system can display nine truncated entries on the one screen, and it is felt that catalogers can use with ease up to two screensful of entries. therefore, the keys producing more than eighteen titles were listed. for 3,1,1,1,1 there were only 33; for 3,1,1,1 there were 67; and for 3,1,1 there were 357.
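the tabulation behind table 1 can be sketched as follows. this is one reading of the method, assuming that each distinct key is equally likely to be used, as the paper states; the function names reply_size_distribution and max_entries_at are hypothetical, and the sketch is in python rather than the language of the original programs.

```python
from collections import Counter


def reply_size_distribution(keys):
    """Map reply size -> percentage of distinct keys returning that many entries.

    keys holds one derived key per catalog entry, so a key that occurs
    n times would return n entries in a reply.
    """
    entries_per_key = Counter(keys)
    size_counts = Counter(entries_per_key.values())
    distinct = len(entries_per_key)
    return {size: 100.0 * n / distinct
            for size, n in sorted(size_counts.items())}


def max_entries_at(keys, coverage=99.0):
    """Smallest reply size that covers at least `coverage` percent of distinct keys."""
    cumulative = 0.0
    for size, pct in reply_size_distribution(keys).items():
        cumulative += pct
        if cumulative >= coverage:
            return size
    return None  # no keys were supplied
```

applied to the 3,1,1,1 keys of the full file, max_entries_at would be expected to give the kind of figure reported in the middle row of table 1; the sketch has not been run against that file.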
the maximum number of identical keys was 321 for 3,1,1,1,1 and 3,1,1,1; the key was pro, b, b, b, b, most of which was derived from "proceedings." for 3,1,1 the maximum was 417, for his, o, t: "history of the."

discussion

it is clear from the findings that a 3,1,1 search key is not sufficiently specific to operate efficiently as a title index in a large file. however, the 3,1,1,1 key appears to be sufficiently specific for efficient operation, while the 3,1,1,1,1 key does not appear to possess sufficient increased specificity to justify its additional complexity.

the observation that there is a large increase in specificity between keys employing three- and four-title words that constitute markov strings suggests that the second and third words may be highly correlated. indeed this suggestion is substantiated by the maximum case for 3,1,1: his, o, t. in the more-than-eighteen group for 3,1,1,1, these characters occurred in seven keys for a total of 206 entries, and for 3,1,1,1,1 they did not occur at all in the more-than-eighteen group.

conclusion

this experiment has shown that a 3,1,1,1 or 3,1,1,1,1 derived search key is sufficiently specific to operate efficiently as a title index to a file of 135,938 marc ii records. since a previous paper observed that as a file of entries increases the number of entries per reply does not increase in a one-to-one ratio (1), it is likely that these keys will operate efficiently for files of considerably greater size.

references

1. frederick g. kilgour, philip l. long, eugene b. leiderman, and alan l. landgraff, "title-only entries retrieved by use of truncated search keys," journal of library automation 4:207-10 (dec. 1971).
2. frederick g. kilgour, "retrieval of single entries from a computerized library catalog file," proceedings of the american society for information science 5:133-36 (1968).
3. edwin b.
parker, spires (stanford physics information retrieval system) 1969-70 annual report (palo alto: stanford university, june 1970), p. 7778.

from the editor

at the january 1973 midwinter meeting of the american library association, the board of directors of the information science and automation division appointed me to the position of editor of the journal of library automation. i wish to express gratitude to don s. culbertson, at that time executive secretary of both the information science and automation division and the american library trustees association, for adding yet another hat while he prepared a substantial portion of this june 1972 issue of jola. as incoming editor, i also wish to describe briefly to the subscribers and regular readers of jola the situation of the journal and my plans for its immediate future. you are aware that there has been a hiatus in the publication of jola. at this writing, the journal is approximately ten months behind schedule. by taxing the capacity of the ala staff, jola should return to its normal schedule within a year. during the intervening period, i will appreciate greatly the support of isad members, jola readers, authors, and the ala staff. with this support the task will be made lighter and perhaps will be expedited. no substantial changes in editorial policy will be made in the near future, as all efforts will be turned toward bringing the journal up to schedule.

susan k. martin, editor
17 march 1973

article
persistent urls and citations offered for digital objects by digital libraries
nicholas homenda
information technology and libraries | june 2021
https://doi.org/10.6017/ital.v40i2.12987

abstract

as libraries, archives, and museums make unique digital collections openly available via digital library platforms, they expose these resources to users who may wish to cite them.
often several urls are available for a single digital object, depending on which route a user took to find it, but the chosen citation url should be the one most likely to persist over time. catalyzed by recent digital collections migration initiatives at indiana university libraries, this study investigates the prevalence of persistent urls for digital objects at peer institutions and examines the ways their platforms instruct users to cite them. this study reviewed institutional websites from the digital library federation’s (dlf) published list of 195 members and identified representative digital objects from unique digital collections navigable from each institution’s main web page in order to determine persistent url formats and citation options. findings indicate an equal split between offering and not offering discernible persistent urls with four major methods used: handle, doi, ark, and purl. significant variation in labeling persistent urls and inclusion in item-specific citations uncovered areas where the user experience could be improved for more reliable citation of these unique resources. introduction libraries, archives, and museums often make their unique digital collections openly available in digital library services and in different contexts, such as digital library aggregators like the digital public library of america (dpla, https://dp.la/) and hathitrust digital library (https://www.hathitrust.org/). as a result, there can be many urls available that point to digital objects within these collections. take, for example, image collections online (http://dlib.indiana.edu/collections/images) at indiana university (iu), a service launched in 2007 featuring open access iu image collections. users discover images on the site through searching and browsing and its collections are also shared with dpla. 
the following urls exist for the digital object shown in figure 1, an image from the building a nation: indiana limestone photograph collection:
• the url as it appears in the browser in image collections online: https://webapp1.dlib.indiana.edu/images/item.htm?id=http://purl.dlib.indiana.edu/iudl/images/vac5094/vac5094-01446
• the persistent url on that page (“bookmark this page at”): http://purl.dlib.indiana.edu/iudl/images/vac5094/vac5094-01446
• the url pasted from the browser for the image in dpla: https://dp.la/item/eb83ff0a6ae507e2ba441634f7eb0f18?q=indiana%20limestone

nicholas homenda (nhomenda@indiana.edu) is digital initiatives librarian, indiana university bloomington. © 2021.

as a digital library or collection manager, which url would you prefer to see cited for this object?

figure 1. an example of a digital object with multiple urls. mcmillan mill, ilco id in2288_1. courtesy, indiana geological and water survey, indiana university, bloomington, indiana. retrieved from image collections online at http://purl.dlib.indiana.edu/iudl/images/vac5094/vac5094-01446.
citation instructions given to authors in major style guides explicitly mention using the best possible form of a resource’s url: “[i]t is important to choose the version of the url that is most likely to continue to point to the source cited.”1 of the three urls above, the second is a purl, or persistent url (https://archive.org/services/purl/), which is why both image collections online and dpla instruct users to bookmark or cite it. other common methods for issuing and maintaining persistent urls include digital object identifiers (doi, https://www.doi.org/), handles (http://handle.net/), and archival resource keys (ark, https://n2t.net/e/ark_ids.html). all of these have been in use since the late 1990s or early 2000s. at indiana university libraries, recent efforts have focused on migrating digital collections to new digital library platforms, mainly based on the open source samvera repository software (https://samvera.org/). as part of these efforts, we wanted to survey how peer institutions were employing persistent, citable urls for digital objects to determine if a prevailing approach had emerged since indiana university libraries’ previous generation of digital library services was developed in the early to mid-2000s.
besides having the capability of creating and reliably serving these urls, our digital library platforms need to make these urls easily accessible to users, preferably along with some assertion that the urls should be used when citing digital objects and collections instead of the many non-persistent urls also directing to those same digital objects and collections. although libraries, archives, and museums have digitized and made digital objects in digital collections openly accessible for decades using several methods for providing persistent, citable urls, how do institutions now present digital object urls to people who encounter, use, and cite them? by examining digital collections within a large population of digital library institutions’ websites, this study aims to discover 1. what methods of url persistence are being employed for digital objects by digital library institutions? 2. how do these institutions’ websites instruct users to cite these digital objects? literature review the study of digital objects in the literature often takes a philosophical perspective in attempting to define them. moreover, practical accounts of digital object use and reuse note the challenges associated with infrastructure, retrieval, and provenance. much of the literature about common methods of persistent url resolution comes from individuals and entities who developed and maintain these standards, as well as overviews of the persistent url resolution methods available. finally, several studies have investigated the problem of “link rot” by tracking the availability of web-hosted resources over time. 
allison notes the generations of philosophical thought that it took to recognize common characteristics of physical objects and the difficulty in understanding an authentic version of a digital object, especially with different computer hardware and software changing the way digital objects appear.2 hui also investigates the philosophical history of physical objects to begin to define digital objects through his methods of datafication of objects and objectification of data, noting that digital objects can be approached in three phases: objects, data, and networks, in order to define them.3 lynch is also concerned with determining the authenticity of digital objects and challenges inherent in the digital realm. in describing digital objects, he creates a hierarchy with raw data at the bottom, elevated to interactive experiential works at the top which elicit the fullest emotional connection contributing to the authentic experience of the work.4 the literature often examines digital objects from the practitioner’s perspective, such as the publishing industry’s difficulty in repurposing digital objects for new publishing products. publishers in benoit and hussey’s 2011 case study note the tension between managers and technical staff concerning assumptions about what their computer system could automatically do with their digital objects; their digital objects always require some human labor and intervention to be accurately described and retrievable later. 5 dappert et al. note the need to describe a digital object’s environment in order to be able to reproduce it in their work with the premis data dictionary for preservation metadata (https://www.loc.gov/standards/premis/).6 strubulis et al. 
provide a model for digital object provenance using inference and resource description framework (rdf) triples (https://w3.org/rdf/) since storing full provenance information for complex digital objects, such as the large amount of mars rover data they offer as an example, would be cost prohibitive.7 in 2001, arms described the landscape of persistent uniform resource names (urn) of handles, purls, and dois near the latter’s inception.8 recent work by koster explains the persistent identifier methods most in use today and examines current infrastructure practices for maintaining them.9 the persistent link resolution method most prominently featured in the literature is the digital object identifier (doi). beginning in 1999, those behind developing and implementing doi have explained its inception, development, and trajectory, continuing with paskin’s deep explanation in 2002 of the reasons why dois exist and the technology behind the service.10 discipline-specific research notes the utility of doi. sidman and davidson, and later weissberg, studied doi for the purposes of automating the supply chain in the publishing industry.11 derisi, kennison, and twyman, on behalf of the public library of science (plos), announced their 2003 decision to broadly implement doi, followed by additional discipline-specific encouragement of the practice by skiba in nursing education and neumann and brase in molecular design.12 the archival resource key (ark) is an alternative permanent link resolution scheme.
since 2001, the open-source ark identifier offers a self-hosted solution for providing persistent access to digital objects, their metadata, and a maintenance commitment.13 recently, duraspace working groups have planned for further development and expansion of ark with the arks in the open project (https://wiki.lyrasis.org/display/arks/arks+in+the+open+project). persistent urls (purls) have been used to provide persistent access to digital objects for nearly 20 years, and their use in the library community is well documented. shafer, weibel, and jul anticipate uniform resource names becoming a web standard and offer purls as an intermediate step to aid in urn development.14 shafer also explained how oclc uses purls and alternate routing methods (arms) to properly direct global users to oclc resources.15 purls are also used to provide persistent access to government information and were seen by the cendi persistent identification task group as essential to their early efforts to implement the federal enterprise architecture (fea) and a theoretical federal persistent identification resolver.16 digital objects and collections should ideally be accessible via urls that work beyond the life of any one platform, lest the materials be subjected to “link rot,” or the process of decay when previously working links no longer correctly resolve. ducut et al. investigated 1994–2006 medline abstracts for the presence of persistent link resolution services such as handle, purl, doi, and webcite and found 20% of the links were inaccessible in 2008.17 mcmurry et al. investigated link rot in life sciences data and suggested practices for formatting links for increased persistence and approaches for versioning.18 the topic of link rot has been examined as early as 2003, in markwell and brooke’s “broken links: just how rapidly do science education hyperlinks go extinct,” cited by multiple link rot studies. 
ironically, this article is no longer accessible at the cited url.19

methodology

this study sought a set of digital objects within library institutions’ digital collections websites. to locate examples of publicly accessible digital objects in digital collections, this study collected institutional websites from the digital library federation’s (dlf) published list of 195 members as of august 2019.20 subsequent investigation aimed to find one representative digital object from unique digital collections navigable from each institution’s main web page. this study aimed to locate digital collections that met the following criteria:
1. collections are openly available.
2. collections are in a repository service, as opposed to highlighted content visible on an informational web page or blog.
3. collections are gathered within a site or service that contains multiple collections, as opposed to individual digital project websites, when possible.
4. collections are unique to an institution, as opposed to duplicated or licensed content.
these criteria were developed in an effort to find unique, publicly accessible digital objects within each institution’s digital collections. to be sure, users search for and discover materials in a variety of ways and in numerous services, but studying the information-seeking behavior of users looking for digital objects or digital collections is outside the scope of this study. ultimately, digital collections indexed by search engines or available in aggregator services like dpla often contain links to collections and objects in their institutionally hosted platforms. users who discover these materials are likely to be directed to the sites this study investigated.
for the purposes of this study, at least one digital collection was investigated from each dlf institution. multiple sites for an institution were investigated when more than one publicly accessible site or service met the above criteria. when digital collections at an institution were delivered only through the library catalog discovery service, reasonable attempts were made to delimit discoverable digital collections content. in total, 183 digital collections were identified for this study. once digital collections were located, subsequent investigation aimed to locate individual digital objects within them. while digital objects represent diverse materials available in a variety of formats, for ease of comparing approaches between institutions, a mixture of individual digital images, multipage digital items, and audiovisual materials was examined. objects for this study were primarily available in websites containing a variety of collections and format types with common display characteristics despite format differences, and no additional efforts were made to locate equal or proportional digital object formats at each institution. one representative digital object was identified per digital collection, totaling 183 digital objects. once a digital object was located at an institution, the object’s unique identifier, format, persistent url, persistent url label, method of link resolution (if identifiable), and citation were collected with particular focus on the object’s persistent url, if available. commonly used persistent url types and their url components can be identified, as seen in table 1; however, any means of persistence was collected if clearly identified. after examining initial results, the object’s provided citation, if available, was added to the list of data collected since many digital collection platforms provide recommended citations for individual objects.
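identifying a link resolution method from the url components listed in table 1 can be sketched as simple substring matching. this is a minimal illustration under stated assumptions, not the procedure used in the study (which was a manual review): the function name, the order in which components are checked, and the "unknown" fallback are all hypothetical.

```python
# URL components taken from Table 1. The matching order is an assumption of
# this sketch; the source does not specify one.
PERSISTENT_URL_COMPONENTS = [
    ("ark", ("ark:/",)),
    ("doi", ("doi.org/", "doi:")),
    ("handle", ("hdl.handle.net",)),
    ("purl", ("purl.",)),
]


def classify_persistent_url(url):
    """Return the first persistent URL method whose component appears in url."""
    lowered = url.lower()
    for method, components in PERSISTENT_URL_COMPONENTS:
        if any(component in lowered for component in components):
            return method
    return "unknown"


if __name__ == "__main__":
    examples = [
        "http://purl.dlib.indiana.edu/iudl/images/vac5094/vac5094-01446",
        "https://doi.org/10.6017/ital.v40i2.12987",
        "https://dp.la/item/eb83ff0a6ae507e2ba441634f7eb0f18?q=indiana%20limestone",
    ]
    for example in examples:
        print(classify_persistent_url(example), example)
```

substring matching is deliberately naive: a browser url that embeds a purl as a query parameter would also be labeled "purl", so a match indicates a likely method rather than proof that the url itself persists.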
table 1. commonly used persistent url methods and corresponding url components

persistent url type              url component
archival resource key (ark)      ark:/
digital object identifier (doi)  doi.org/ (or doi:)
handle                           hdl.handle.net
persistent url (purl)            purl.

results

most institutions have a single digital collection site or service that met the selection criteria for this study. some appear to have multiple digital collection repositories, often separated by digital object format or library department, and many institutions have collections that are only publicly accessible through discrete project web sites, such as digital exhibits or focused digital humanities research projects. out of 195 dlf member institutions, 171 had publicly accessible digital collections. of these 171 institutions, 153 had digital collections services/sites that adhered to the criteria of this study, while 21 had only project-focused digital collections sites. since several institutions had more than one digital collection platform accessible via their main institutional website, a population of 183 digital collections was investigated. one representative digital object from each collection was gathered, consisting of 107 digital images, 73 multipage items, and 3 audiovisual items (totaling 183).

table 2.
number of instances of digital collection platforms identified

platform                  number  percentage of total (183)
custom or unidentifiable  53      29%
contentdm                 46      25%
islandora                 19      10%
dspace                    11      6%
samvera                   11      6%
omeka                     10      5%
internet archive          7       4%
digital commons           6       3%
fedora custom             4       2%
luna                      3       2%
xtf                       3       2%
artstor                   2       1%
iiif server               2       1%
primo                     2       1%
aspace                    1       1%
elevator                  1       1%
knowvation                1       1%
veridian                  1       1%

as seen in table 2, almost a third of digital collection platforms encountered appear to be custom-developed or customized to not reveal the software platform upon which they were based. of the platform-based services encountered where software was identifiable, 17 different platforms were used and the top five were contentdm, islandora, dspace, samvera (hyrax, avalon, curation concerns, etc.), and omeka.

table 3. occurrence of persistent links in surveyed digital collections, method of link persistence, and persistent link labels

persistent links?          number  percentage of total (183)
no/unknown                 93      51%
yes/persistence claimed    90      49%

persistent link method     number  percentage of total (90)
unknown                    33      37%
handle                     27      30%
ark                        19      21%
doi                        6       7%
purl                       5       6%

persistent link label      number  percentage of total (90)
other (a)                  24      26.7%
permalink                  22      24.4%
identifier                 13      14.4%
[no label given]           10      11.1%
permanent link             7       7.8%
uri                        5       5.6%
persistent link            3       3.3%
handle                     2       2.2%
link to the book           2       2.2%
persistent url             2       2.2%

(a) twenty-four other persistent link labels were reported,21 each occurring only once.

as seen in table 3, the numbers of digital objects with and without publicly accessible persistent (or seemingly persistent) links were nearly equal. among the digital objects with persistent links, the largest group (37%) claimed persistence without a discernible resolution method, with the rest divided between handle, ark, doi, and purl.
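the percentages reported for link resolution methods in table 3 can be rechecked from the counts. the following is a small arithmetic sketch; the counts come from table 3, while the function and variable names are hypothetical.

```python
# Counts of persistent link methods among the 90 objects with persistent
# (or seemingly persistent) links, as reported in Table 3.
LINK_METHOD_COUNTS = {"unknown": 33, "handle": 27, "ark": 19, "doi": 6, "purl": 5}


def shares(counts):
    """Return each category's share of the total as a whole-number percentage."""
    total = sum(counts.values())
    return {label: round(100 * n / total) for label, n in counts.items()}


if __name__ == "__main__":
    print(sum(LINK_METHOD_COUNTS.values()))
    print(shares(LINK_METHOD_COUNTS))
```

run against the table 3 counts, the shares are 37, 30, 21, 7, and 6 percent of the 90 objects.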
these objects also had 33 different labels for these links in the public-facing interface. the top five labels were: permalink (22), identifier (13), permanent link (7), uri (5), and persistent link (3). as seen in table 4, the majority of digital objects surveyed had a unique item identifier in their publicly viewable item record. the majority did not offer a citation in the item’s publicly viewable record. among items that offered citations, the majority contained a link to the item, and three offered downloadable citation formats only, such as endnote, zotero, and mendeley.

table 4. various digital object characteristics surveyed

unique item identifier in item record  number  percentage of total (183)
yes                                    132     72%
no                                     51      28%

citation in item record                number  percentage of total (183)
yes                                    65      36%
no                                     118     64%

citations containing links to item     number  percentage of total (65)
yes                                    39      60%
downloadable citation format only      3       5%
no                                     23      35%

discussion

since proper citation practice dictates choosing the url most likely to provide continuing access to a resource, it follows that providing persistent urls to resources such as digital objects or digital collections is also a good practice. it is encouraging to see a large number of institutions surveyed providing urls that persist (or claim to persist).
providing persistent access to a unique digital resource implies a level of commitment to maintaining its url into the future, requiring policies, technology, and labor resources, further augmented by costs associated with registering certain types of identifiers like doi.22 it is likely that institutions not providing persistent (or not obviously persistent) urls are either internally committing to preserving their objects, collections, and services through means not known to end users; are constrained by technological limitations of their digital collection platforms; hope to develop or adopt new digital library services that offer these capabilities; or lack the resources to offer persistent urls. the four commonly used methods of persistent link resolution—doi, handle, ark, and purl— have been used for nearly 20 years, and it is not surprising that alternative observable methods were seldom encountered in this study. handles were the most common persistent url method, which seems related to the digital library platform used by an institution. dspace distributions are pre-bundled with handle server software, for example, and 12 out of 27 platforms serving digital objects with handles were based on dspace (https://duraspace.org/dspace/). when choosing to implement or upgrade a digital library platform, institutions often consider several available options. choosing a platform that offers the ability to easily create and maintain persistent urls might be less burdensome than making urls persist via independent or alternative means. thirty-three digital objects offered links that had labels implying some sort of persistence but lacked information describing the methods used or url components consistent with commonly used methods, as seen in table 1. to achieve persistence, there might be a combination of url rewriting, locally implemented solutions, or nonpublic persistent urls existing. 
it would benefit users, increasingly aware of the need to cite digital objects using persistent links, for digital object platforms that offer persistent linking to explicitly state that fact and ideally offer some evidence of the resolution method used. researchers will be looking for citable persistent links that offer some cues signifying their persistence, whether it is clearly indicated language on the website or a url pattern consistent with the four major methods commonly used. the amount of variation in labeling persistent links was surprising. commonly used digital library software platforms have default ways of labeling these fields. nearly all of the “reference url” labels encountered are in contentdm sites, for example. since the concept of offering a persistent link to a digital object is not uncommon, perhaps there can be a more consistent approach to choosing the label for this content. when a researcher finds a digital object in an institutional digital library service, they might want to cite that object. accurately citing resources in all formats is an essential research skill, and digital library platforms often try to aid users by providing dynamically generated or pre-populated citations based on unique metadata associated with that object. it was somewhat surprising to encounter these types of citation helpers that did not include persistent links. since a digital object’s preferred persistent link is often different from the url visible in the browser, efforts should be made to make citations available containing persistent links. there are institutions with digital collections that were not examined in this study due to a number of factors.
first, this study examined the 195 institutions who were members of the digital library federation, and there are 2,828 four-year postsecondary institutions in the united states as of 2018.23 additional study could expand perceptions about persistent links for digital objects when looking beyond the dlf member institutions, which are predominantly four-year postsecondary institutions but also contain museums, public libraries, and other cultural heritage organizations. an alternative approach to collecting this data would be to conduct user testing focused on finding and citing digital objects from a number of institutions. this approach was not used, however, since the initial goal of this study was to see how peer digital library institutions have employed persistent links and citations across a broad yet contained spectrum. as one librarian with extensive digital library experience, my approach to locating these platforms and resources is subject to subconscious bias i may have accumulated over my professional career, but i would hope that my experience makes me more able to locate these platforms and materials than the average user. digital library platforms are numerous, and often institutions have several of them with varying degrees of public visibility or connectivity to their institution’s main library website. this study’s findings for any particular institution are not as authoritative as self-reported information from the institution itself. while a survey aimed at collecting direct responses from institutions might have yielded more accuracy, a potentially low response rate would also make it difficult to truly know what methods of persistent linking peer institutions are employing, especially with the majority of these resources being openly findable and accessible. 
still, further study with self-reported information could shed more light on the decisions to provide certain methods of persistent links to objects within their chosen digital collection platforms. moreover, it is possible that some digital object formats are more likely to have persistent urls than others. newer formats such as three-dimensional digital objects, commonly cited resources like data sets, and scholarship held in institutional repositories could be available in digital library services similar to those surveyed in this study with different persistent url characteristics. additional study could aim to survey populations of digital objects by format across multiple institutions to investigate any correlation between persistent urls and object format.

conclusion

unique digital collections at digital library institutions are made openly accessible to the public in a variety of ways, including digital library software platforms and digital library aggregator services. regardless of how users find these materials, best practices require users to cite urls for these materials that are most likely to continue to provide access to them. persistent urls are a common way to ensure cited urls to digital objects remain accessible. commonly used methods of issuing and maintaining persistent urls can be identified in digital object records within digital collection platforms available at these institutions. this study identified characteristics about these digital objects, their platforms, prevalence of persistent urls in their records, and the way these urls are presented to users. findings indicate that dlf member institutions are split evenly between providing and not providing publicly discernible persistent urls with wide variation on how these urls are presented and explained to users.
decisions made in developing and maintaining digital collection platforms and the types of urls made available to users impact which urls users cite and the possibility of others encountering these resources through these citations. embarking on this study also was prompted by digital collection migrations at indiana university, and these findings provide us interesting examples of persistent url usage at other institutions and ways to improve the user experience in digital collection platforms. endnotes 1 the chicago manual of style online (chicago: university of chicago press, 2017), ch. 14, sec. 7. 2 arthur allison et al., “digital identity matters,” journal of the american society for information science & technology 56, no. 4 (2005): 364–72, https://doi.org/10.1002/asi.20112. 3 yuk hui, “what is a digital object?” metaphilosophy 43, no. 4 (2012): 380–95, https://doi.org/10.1111/j.1467-9973.2012.01761.x. 4 clifford lynch, “authenticity and integrity in the digital environment: an exploratory analysis of the central role of trust” council on library and information resources (clir), 2000, https://www.clir.org/pubs/reports/pub92/lynch/. 5 g. benoit and lisa hussey, “repurposing digital objects: case studies across the publishing industry,” journal of the american society for information science & technology 62, no. 2 (2011): 363–74, https://doi.org/10.1002/asi.21465. 6 angela dappert et al., “describing and preserving digital object environments,” new review of information networking 18, no. 2 (2013): 106–73, https://doi.org/10.1080/13614576.2013.842494. 7 christos strubulis et al., “a case study on propagating and updating provenance information using the cidoc crm,” international journal on digital libraries 15, no. 1 (2014): 27–51, https://doi.org/10.1007/s00799-014-0125-z. 8 william y. arms, “uniform resource names: handles, purls, and digital object identifiers,” communications of the acm 44, no. 5 (2001): 68, https://doi.org/10.1145/374308.375358. 
9 lukas koster, “persistent identifiers for heritage objects,” code4lib journal 47 (2020), https://journal.code4lib.org/articles/14978.
10 albert w. simmonds, “the digital object identifier (doi),” publishing research quarterly 15, no. 2 (1999): 10, https://doi.org/10.1007/s12109-999-0022-2; norman paskin, “digital object identifiers,” information services & use 22, no. 2/3 (2002): 97, https://doi.org/10.3233/isu-2002-222-309.
11 david sidman and tom davidson, “a practical guide to automating the digital supply chain with the digital object identifier (doi),” publishing research quarterly 17, no. 2 (2001): 9, https://doi.org/10.1007/s12109-001-0019-y; andy weissberg, “the identification of digital book content,” publishing research quarterly 24, no. 4 (2008): 255–60, https://doi.org/10.1007/s12109-008-9093-8.
12 susanne derisi, rebecca kennison, and nick twyman, “the what and whys of dois,” plos biology 1, no. 2 (2003): 133–34, https://doi.org/10.1371/journal.pbio.0000057; diane j. skiba, “digital object identifiers: are they important to me?,” nursing education perspectives 30, no. 6 (2009): 394–95, https://doi.org/10.1016/j.lookout.2008.06.012; janna neumann and jan brase, “datacite and doi names for research data,” journal of computer-aided molecular design 28, no. 10 (2014): 1035–41, https://doi.org/10.1007/s10822-014-9776-5.
13 john kunze, “towards electronic persistence using ark identifiers,” california digital library, 2003, https://escholarship.org/uc/item/3bg2w3vs.
14 keith e. shafer, stuart l. weibel, and erik jul, “the purl project,” journal of library administration 34, no. 1–2 (2001): 123, https://doi.org/10.1300/j111v34n01_19.
15 keith e. shafer, “arms, oclc internet services, and purls,” journal of library administration 34, no. 3–4 (2001): 385, https://doi.org/10.1300/j111v34n03_19.
16 cendi persistent identification task group, “persistent identification: a key component of an e-government infrastructure,” new review of information networking 10, no. 1 (2004): 97–106, https://doi.org/10.1080/13614570412331312021.
17 erick ducut, fang liu, and paul fontelo, “an update on uniform resource locator (url) decay in medline abstracts and measures for its mitigation,” bmc medical informatics & decision making 8, no. 1 (2008): 1–8, https://doi.org/10.1186/1472-6947-8-23.
18 julie a. mcmurry et al., “identifiers for the 21st century: how to design, provision, and reuse persistent identifiers to maximize utility and impact of life science data,” plos biology 15, no. 6 (2017): 1–18, https://doi.org/10.1371/journal.pbio.2001414.
19 john markwell and david brooks, “broken links: just how rapidly do science education hyperlinks go extinct?” (2003), cited by many and previously available from: http://www-class.unl.edu/biochem/url/broken_links.html [currently non-functional].
20 “our member institutions,” digital library federation (2020), https://www.diglib.org/about/members/.
21 twenty-four labels used only once: archival resource key; ark; bookmark this page at; citable link; citable link to this page; citable uri; copy; copy and paste this url; digital object url; doi; identifier (hdl); item; link; local identifier; permanent url; permanently link to this resource; persistent link to this item; persistent link to this record; please use this identifier to cite or link to this item; related resources; resource identifier; share; share link/location; to cite or link to this item, use this identifier.
22 one of the frequently asked questions (https://www.doi.org/faq.html) states that doi registration fees vary.
23 national center for education statistics, “table 317.10. degree-granting postsecondary institutions, by control and level of institution: selected years, 1949–50 through 2017–18,” in digest of education statistics, 2018, https://nces.ed.gov/programs/digest/d18/tables/dt18_317.10.asp.
information technology and libraries | september 2008
from our readers: virtues and values in digital library architecture
mark cyzyk
editor’s note: “from our readers” will be an occasional feature, highlighting ital readers’ letters and commentaries on timely issues.
at the fall 2007 coalition for networked information (cni) conference in washington, d.c., i presented “a survey and evaluation of open-source electronic publishing systems.” toward the end of my presentation was a slide enumerating some of the things i had personally learned as a web application architect during my review of the systems under consideration:
- platform independence should not be neglected.
- one inherits the flaws of external libraries and frameworks. choose with care.
- installation procedures must be simple and flawless.
- don’t wake the sysadmin with “slap a gui on that xml!”, and push application administration out, as much as possible, to select users.
- documentation must be concise, complete, and comprehensive. “i can’t guess what you’re thinking.”
initially, these were just notes i thought might be useful to others, figuring it’s typically helpful to share experiences, especially at international conferences. but as i now look at those maxims, it occurs to me that when abstracted further they point in the direction of more general concepts and traits: concepts and traits that accurately describe us and the products of our labor if we are successful, and prescribe to us the concepts and traits we need to understand and adopt if we are not. in short, peering into each maxim, i can begin to make out some of the virtues and values that underlie, or should underlie, the design and architecture of our digital library systems.
freedom and equality
platform independence should not be neglected.
“even though this application is written in platform-independent php, the documentation says it must be run on either red hat or suse, or maybe it will run on solaris too, but we don’t have any of these here.”
while i no doubt will be heartily flamed for suggesting that microsoft has done more to democratize computing than any other single company, i nevertheless feel the need to point out that, for many of us, windows server operating systems and our responsibility for administering them way back when provided the impetus for adding our swipe-card barcodes to the acl of the data center, surely a badge of membership in the club of enterprise it if ever there was one. you may not like the way windows does things. you may not like the way microsoft plays with the other boys. but to act like they don’t exist is nothing more than foolishly burying one’s head in the *nix sand. windows servers have proven themselves time and again as being affordable, easily managed, dependable, and, yes, secure workhorses. windows is the ford pickup truck of the server world, and while that pickup will some day inevitably suffer a blowout of its twenty-year-old head gasket (and will therefore be respectfully relegated to that place where all dearly departed trucks go), it’s been a long and good run. we should recognize and appreciate this. windows clearly has a place in the data center, sitting quietly humming alongside its unix and linux brothers. i imagine that it actually takes some effort to produce platform-dependent applications using platform-independent languages and frameworks. such effort should be put toward other things. keep it pure. and by that i mean, keep it platform independent. freedom to choose and presumed equality among the server-side oses should reign.
responsibility and good sense
one inherits the flaws of external libraries and frameworks. choose with care.
so you’ve installed the os, you’ve installed and configured the specified web server, you’ve installed and configured the application platform, you’ve downloaded and compiled the source, yet there remains a long list of external libraries to install and configure. one by one you install them. suddenly, when you get to library number 16 you hit a snag. it won’t install. it requires a previous version of library number 7, and multiple versions of library number 7 can’t be installed at the same time on the same box. worse yet, as you take a break to read some more of the documentation, it sure looks like required library number 19 is dependent on the current version of library number 7 and won’t work with any previous version. and could it be that library number 21 is dependent on library number 20, which is dependent on library number 23, which is dependent on (yikes) library number 21? all things come full circle.
mark cyzyk (mcyzyk@jhu.edu) is the scholarly communication architect, library digital programs group, sheridan libraries, johns hopkins university in baltimore.
but let’s suppose you’ve worked out all of these dependencies, you’ve figured out the single, secret order in which they must install, you’ve done it, and it looks like it’s working! yet, when you go to boot up the web service, suddenly there are errors all over the place, a fearsome crashing and burning that makes you want to go home and take a nap. something in your configuration is wrong? something in the way your configuration is interacting with an external library is wrong? you search the logs. you gather the relevant messages. they don’t make a lot of sense. now what to do? you search the lists, you search the wikis to no avail, and finally, in desperation, you e-mail the developers. “but that’s a problem with library x, not with our application.” au contraire.
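the circular chain described above (library 21 needs 20, which needs 23, which needs 21 again) is the kind of thing a few lines of code can detect before anyone starts installing. what follows is only a minimal sketch in python, not anything taken from the systems reviewed here: the library names and the `EXAMPLE_DEPS` mapping are hypothetical, and `find_cycle` is an illustrative helper that walks the dependencies depth-first and reports the first loop it meets.

```python
from typing import Dict, List, Optional

# Hypothetical dependency map: each library lists the libraries it requires.
EXAMPLE_DEPS: Dict[str, List[str]] = {
    "lib16": ["lib7"],
    "lib19": ["lib7"],
    "lib7": [],
    "lib21": ["lib20"],
    "lib20": ["lib23"],
    "lib23": ["lib21"],  # closes the loop: 21 -> 20 -> 23 -> 21
}


def find_cycle(deps: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of names, or None if there is none."""
    UNSEEN, ON_PATH, DONE = 0, 1, 2
    state = {name: UNSEEN for name in deps}
    path: List[str] = []  # libraries on the current depth-first path

    def visit(name: str) -> Optional[List[str]]:
        state[name] = ON_PATH
        path.append(name)
        for dep in deps.get(name, []):
            if state.get(dep, UNSEEN) == ON_PATH:
                # dep is already on the current path: we have come full circle.
                return path[path.index(dep):] + [dep]
            if state.get(dep, UNSEEN) == UNSEEN:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        state[name] = DONE
        return None

    for name in list(deps):
        if state[name] == UNSEEN:
            cycle = visit(name)
            if cycle:
                return cycle
    return None


if __name__ == "__main__":
    cycle = find_cycle(EXAMPLE_DEPS)
    print(" -> ".join(cycle) if cycle else "no circular dependencies")
```

a check like this only finds loops; it does not resolve the version conflict around library number 7, which needs a different kind of bookkeeping.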
i would like to strongly suggest a copernican revolution in how we think about such situations. while it’s obvious that the developers of the libraries themselves are responsible for developing and maintaining them, i’d like to suggest that this does not relieve you, the developer of a system that relies on their software, from responsibility for its bugs and peculiar configuration problems. i’d like to suggest that, far from pushing responsibility in the case mentioned above out to the developers of the malfunctioning external library, you, in choosing that library in the first place, have now inherited responsibility for it. even if you don’t believe in this notion of inheritance, if you would please at least act as if it were true, we’d all be in a better place. part of accepting this kind of responsibility is you then acting as a conduit through which we poor implementers learn the true nature of the problem and any solutions or temporary workarounds we may apply so that we can get your system up and running pronto. in the end, it’s all about your system. your system as a whole is only as strong as the weakest link in its chain of dependencies.
simplicity and perfection
installation procedures must be simple and flawless.
it goes without saying that if we can’t install your system we a fortiori can’t adopt it for use in our organization. i remember once having such a difficult time trying to get a system up and running that i almost gave up. i tried first to get it running against apache 1.4, then against apache 2.0. i had multiple interactions with the developers. i banged my head against the wall of that system for days in frustration. the documentation was of little help. it seemed to be more part of an internal documentation project, a way for the developers to communicate among themselves, than a way to inform outsiders like me about their system.
and related to this, i remember driving to work during this time listening to a report on npr about the famous hopkins pediatric neurosurgeon, dr. ben carson. apparently, earlier in the week he had separated the brains of siamese twins and the twins were now doing fine, recuperating. the npr commentator marveled at the intricacy of the operation and at the fact that the whole thing took, i believe, five hours. “five hours? five hours?!” i exclaimed while barreling down the highway in my vintage 1988 ford ranger pickup (head gasket mostly sealed tight, no compression leakage). “i can’t get this system at work installed in five days!” our goal as system architects needs to be that we provide to our users simple and flawless installation procedures so that our systems can, on average, be installed and configured in no more time than it takes to perform major brain surgery.1 “all in an afternoon” should become our motto. i am happy to find that there are useful and easy-to-use package managers, e.g., yum and synaptic, for doing such things on various linux distributions. windows has long had solid and sophisticated installation utilities. tomcat supports drop-in-place war files. when possible and appropriate, we need to use them.
justice and e-z livin
don’t wake the sysadmin with “slap a gui on that xml!”, and push application administration out, as much as possible, to select users.
i remember reading plato’s republic as an undergraduate and the feeling of being let down when the climax of the whole thing was a definition in which “justice” simply is each man serving his proper place in society and not transgressing the boundaries of his role. “that’s it?” i thought. “so you have this rigidly hierarchical society and each person in it knows his role and knows in which slot his role fits, and keeping to this is ‘justice’?” this may not be such a great way to structure a society, but now that i think about it, it’s a great way to structure a computer application.
sit down and carefully look at the functions your program will provide. then create a small set of user roles to which these functions will be carefully mapped. in the end you will have a hierarchical structure of roles and functions that should look perfectly simple and rational when drawn on a piece of paper. and while the superuser role should have power over all and access to all functions in the application, the list of functions that he alone has access to should be small, i.e., the actual work of the superuser should be minimized as much as possible by making sure that most functions are delegated to the members of other, appropriate, proper user roles. doing this happily results in what i call the state of e-z livin: the last thing you want is for users to constantly be calling you with data issues to fix. you therefore will model management of the data (all of it) and the configuration of the application itself (most of it) directly into the architecture of the application, provide users the guis they need to configure and manage things themselves, and push as much functionality as you can out to them where it belongs. let them click their respective ways to happiness and computing goodness. you build the tool, they use it, and you retire back to the land of e-z livin. users are assigned to their roles, and all roles are in their proper places. application architecture justice is achieved.
clarity and wholeness
documentation must be concise, complete, and comprehensive. “i can’t guess what you’re thinking.”
as system developers we’ve probably all had the magical experience of a mind meld with a fellow developer when working intensively on a project. i have had this experience with two other developers, separately, at different stages of my career.
(one of them, in fact, used to point out to everyone that, “between the two of us, we make one good developer!”) this is a wonderful and magical and productive working relationship in which to be, and it needs to be recognized, supported, and exploited whenever it happens. you are lucky if you find yourself designing and developing a system and your counterpart is reading your mind and finishing your sentences. however, just as it’s best to leave that nice young couple cuddling in the corner booth alone, so too it really doesn’t make a lot of sense to expect the mind-melded developers to turn out anything that remotely resembles coherent and understandable documentation. those undergoing a mind meld by definition know perfectly well what they mean. to the rest of us it just feels like we missed a memo. if you have the luxury, make sure that the one writing the documentation is not currently undergoing a mind meld with anyone else on the development team. scotty typically stayed behind while he beamed the others down. beam them down. be that scotty. you do the world a great service by staying behind on the ship and dutifully reporting, clearly and comprehensively, what’s happening down on the red planet.
to these five maxims, and their corresponding virtues, i would add one more set, one upon which the others rely:
empathy and graciousness
you are not your audience.
at least in applied computing fields like ours, we need to break with the long-held “guru in the basement” mentality. the actions of various managerial strata have now ostensibly acknowledged for us that technical expertise, especially in applied fields, is a commodity, i.e., it can be bought. a dearth of such expertise is remedied by simply applying money to the situation (admittedly difficult to do at the majority of institutions of higher education, but a common occurrence at the wealthiest).
nevertheless, the dogmatic hold of the guru has been broken and the magical aura that once draped her is not now so resplendent: her relative rarity, and the clubby superiority that depended upon it, has been diluted significantly by the sheer number of counterparts who can and will gleefully fill her function. we respect, value, and admire her; it’s just that her stranglehold on things has (rightfully) been broken. and while nobody is truly indispensable, what is more difficult and rare to find is someone who has the guru’s same level of technical chops coupled with a genuine empathic ability to relate to those who are the intended users of her systems and services. unless your systems and services are geared primarily toward other developers, programmers, and architects (and presumably they are not, nor, in the library world, should they be), your users will typically be significantly unlike you. let me repeat that: your users are not like you. rephrased: you are not your audience. when looking back over the other maxims, values, and virtues mentioned in this essay then, the moral-psychological glue that binds them all is composed of empathy for our users (faculty, students, librarians, non-technical staff) and the graciousness to design and carry out a project plan in a spirit of openness, caring, flexibility, humility, respect, and collaboration. when empathy for the users of our systems is absent (and there are cases where you can actually see this in the design and documentation of the system itself), our systems will ultimately not be used. when the spirit of graciousness is broken, men become robots, mere rule followers, and users will boycott using their systems and will look elsewhere, naturally preferring to avoid playing the simon-says games so often demanded by tech folk in their workaday worlds; there is a reason the comic strip dilbert is so funny and rings so true.
when confronted with a lack of empathy and graciousness on our part, the users who can boycott using our systems and services will boycott using our systems and services. and we’ll be left out in the rain, feeling like, as bonnie raitt once sadly sang, “i can’t make you love me if you don’t / i can’t make your heart feel something it won’t.” empathy and graciousness, while not guaranteeing enthusiastic adoption of our systems and services, are a necessary precondition for users even countenancing participation. there are undoubtedly other virtues and values that can usefully be expounded in the context of digital library architecture (consistency, coherence, and elegance immediately come to mind), and i could go on and on analyzing the various maxims surrounding these that bubble up through the stack of consciousness during the course of the day. yet doing so would conflict with another virtue i think is key to the success and enjoyment of opinion-piece essays like this and maybe even of other sorts of publications and presentations: brevity.
note
1. a colleague of mine has since informed me that carson’s operation took twenty-five hours, not five. nevertheless, my admonition here still holds. when installation and configuration of our systems are taking longer, significantly longer, than it takes to perform major brain surgery, surely there is something amiss?
information ecologies: using technology with heart/the media ... zillner, tom information technology and libraries; mar 2000; 19, 1; proquest pg. 54
book reviews
information ecologies: using technology with heart by bonnie a. nardi and vicki l. o'day. cambridge: mit pr., 1999. 232p. $27.50 (isbn 0-262-14066-7).
the media equation: how people treat computers, television, and new media like real people and places by byron reeves and clifford nass. cambridge: cambridge univ. pr., 1996 and 1999. 305p.
$28.95 (isbn 1-575-86052-x); paper, $15.95 (isbn 1-575-86053-8).
the books i am reviewing this month are interrelated because they both focus on information technology and our changing world, with the two volumes looking at different levels of the picture. the broader, and to me more intriguing, view is presented by nardi and o'day in their wonderful book information ecologies. although it is not clear from the capsule biographies on the dust jacket, nardi and o'day are anthropologists who study the world of technology in a number of locales, and they here report the findings from their field work. among the case studies they discuss are an examination of the activities of reference librarians at two corporations and a look at a virtual world created for and by elementary school students. but they do much more than simply present case studies, although these alone make the book a worthwhile read. in addition, they argue that the most useful way to look at information technology is through the metaphor of "information ecologies," "system[s] of people, practices, values, and technologies in ... particular local environment[s]." they adopt this biological metaphor after carefully considering the most commonly employed information technology metaphors: technology as tool, text, or system. in turn, they find each of these metaphors wanting. it is particularly important to choose carefully the metaphorical lenses through which technological developments are viewed. each particular metaphor has consequences for how sanguinely we view a technology, and it is often worthwhile to use multiple metaphors to enhance our world view. the information ecology metaphor is particularly appropriate for an anthropological view of local "habitats" and their inhabitants and artifacts. in turn, an anthropological view is particularly apt for capturing the human side of technology (thus the subtitle: using technology with heart).
this is a side of things that can be overlooked in other metaphorical views, particularly since it requires that the sticky issue of values be considered. unfortunately for all of us, there is a reluctance to talk of human values when considering technology. as nardi and o'day note, there is a tendency to either enthusiastically applaud new technology without regard to its effects, or to condemn all new technology as inherently debasing to humanity, or to simply resign oneself pessimistically to the inevitable development of technology and our lack of control over it. nardi and o'day tend to be cautious optimists, claiming that we can control technology, and the way to exercise that control is through our own local encounters with information ecologies. thus, rather than bemoaning the dehumanizing effects of the internet, information ecologies explores the successful use of internet technologies to set up a virtual world for students and the elderly in phoenix, arizona. instead of thinking or acting globally, exploit the technology locally, but do so in a way that makes sense in terms of human values.
information technology and libraries | march 2000. tom zillner, editor
on the taxonomic scale of technology views, ranging from gloom and doom (e.g., the views of clifford stoll) to perpetual optimism (e.g., nicholas negroponte), i place nardi and o'day somewhere in the middle, but as i suggested, leaning toward cautious optimism. in fact, they spend several chapters discussing the views of others and offering prescient criticism of the deficiencies of those views. of particular interest to me was their analysis of the french sociologist jacques ellul, who apparently sounded the alarm concerning the stress to mind and soul of constant technological change in 1954, well before the current crop of doomsayers. nardi and o'day find ellul's views, as articulated in the technological society, to be compelling.
yet, they claim, the rise of the internet can counteract the trend that ellul saw toward monotonous sameness and lack of diversity in the face of technological efficiency. perhaps so. one thing that i was looking for in information ecologies was a set of practical tools for engaging in the kind of exploration of information habitats that nardi, o'day, and other anthropologists engage in. there is a spate of interest lately in the role of anthropologists in the design and deployment of new technologies, and i would like to determine its applicability to my modest software development projects. unfortunately, i was mainly disappointed on this score. in fairness to the authors, they did not set out to spell out the anthropological methodology of exploring information ecologies in any detail. the purpose of the book is rather to argue that viewing the world of technology as a set of interconnected information ecologies is useful and accurate, and in many cases superior to other metaphorical views. they succeed in this goal. now i want them to go on to write a book on using anthropological methods in these ecologies without necessarily becoming a professional anthropologist. nardi and o'day do touch extremely briefly on a few conventions of interviewing subjects, with their most important technical discussion centering on what they call "strategic questioning," which they present in the context of evolving information ecologies. they provide useful categories of questions to be asked, and specific examples. although it may seem obvious to ask penetrating questions of members of an information habitat, this is one area in which software developers in particular fail miserably. another seemingly obvious pointer is to pay attention.
again, its obviousness is deceptive, since most of us are poor observers who make many assumptions about the characteristics of a work activity without observational evidence. for evidence that people introducing new technologies to an ecology do not follow even these simplest pieces of advice, you can turn to the chapter "a dysfunctional ecology" to see how badly technology can fail for nontechnological reasons. this case study deals with a major teaching hospital that introduced a monitoring system into its neurosurgical operating suites that captured instrument readings as well as complete audio and video. the system was installed to aid neurophysiologists, experts who are called in to advise neurosurgeons at key points during complex surgeries to ensure that patient neurological function is not compromised. the neurosurgeons and neurophysiologists at this hospital decided that it would be more efficient for the neurophysiologists to be able to remotely monitor multiple surgeries simultaneously. both groups failed to consult with the other constituencies among the operating team, the nurses and anesthesiology staff. these groups believed that their privacy was being compromised, particularly since it was possible to tape any procedures at multiple workstations throughout the hospital. i can easily envision similar sorts of problems due to lack of communication in introducing new or modified technology into other milieus, e.g., libraries. although the consequences might not lead to the potentially life-threatening situations that could arise in an operating suite, there are certainly possible outcomes where service to users could be undermined. despite not being exactly what i (rather selfishly) want, information ecologies is a first-rate read and an important starting point for those concerned with better controlling technological change in the world of information.
turning from an anthropological point of view to a psychological one, the media equation offers another important basis for technological design and implementation, particularly of computer software and multimedia. the release last year of a paperback edition of this volume, first published in 1996, provides a convenient pretext for reviewing this work. reeves and nass have supervised years of study and experimentation that have consistently demonstrated the truth of what they call the "media equation": that our relations with media, including computers and multimedia, are identical in key ways to our relationships with other human beings. this is true of all of us, even those of us sophisticated enough to understand that we are dealing with devices and human artifacts rather than people. reeves and nass quite entertainingly present the technique they've used over the years to perform their research, on a step-by-step basis:
1. pick a research finding on how people respond to each other or their environment.
2. find the summary of the social or natural rule that the study has yielded.
3. replace the words "person" or "environment" in the summary with media of some sort (television, movies, computers, etc.).
4. find the research procedure.
5. substitute media for one of the people or the environment in the procedure.
6. run the experiment.
7. draw conclusions.
although this may sound facetious, it is in fact the recipe that produced the startling conclusions that we all tend to behave toward media much as we do toward other people. what's perhaps more important is that reeves and nass point toward techniques that practitioners can use to produce more effective media, including computer software. as a simple example, consider politeness. reeves and nass discovered that people treated computers with the same sort of politeness that they would other human beings, and in turn reeves and nass suggest that people respond better to "polite media."
they then provide some fairly straightforward advice on producing polite computer programs, starting with grice's maxims, a set of politeness rules assembled by h. paul grice, a philosopher and psychologist. these center around truth telling, appropriate quantity of information (neither too much nor too little), relevance, and clarity. all of this is fairly unsurprising, but the authors spell out just how the maxims can be applied to the construction of computer programs. further, they go on to suggest some rules of thumb of their own. for example, some computer programs produce verbal output but expect the user to key in his or her responses. this may be perceived by the user, possibly subconsciously, as forcing an impolite response, since mixing communications modalities is a faux pas. thus, they suggest that if text input is required, perhaps only text output should be supplied. this should provide you with some of the flavor of the media equation, and in turn you may be able to see a set of potential ethical dilemmas that can arise from utilizing techniques that result from the research of reeves and nass. this set of problems can be seen most clearly in the chapter "subliminal images," where they discuss how subliminal messages could be inserted into new media to advertise products or to attempt to bolster employee morale. in fact, they say, "... it might be easier to accomplish subliminal intrusions with a computer than with a television, because software can respond to the particular input of individual users and timing is more precise." they immediately temper this insight with the caution that "... ethical and legal issues abound." indeed. although some of the techniques that can be applied to new media do lead to ethical problems, i think that most of what reeves and nass talk about are just elements of good design.
subliminal suggestion seems to most of us to be out of bounds because it unfairly manipulates user response in a powerful way. the unfairness is that someone can be manipulated without his or her knowledge to do something outside of the person's normal behavior. although the other techniques tend to subtly alter behavior, they don't generally result in an anomalous action by the user. if you think this is a kind of philosophical hairsplitting, you're right. the onus is upon the programmer or multimedia designer to use these techniques with great care. in a past professional life i wrote computerized patient interviews for the psychiatry department of the university of wisconsin. researchers there and elsewhere found that people were generally more candid with the computer than they were with human clinicians. so the findings of reeves and nass were not quite as surprising to me as they might be to others. what did surprise me, however, is that the media equation is not a phenomenon solely of the naive or inexperienced media and computer users. on the contrary, all of us, no matter how conversant we are with underlying technology, are susceptible to the effects described in the media equation. this vastly increases the power of computer programs and other media for both good and ill. i want to emphasize that not all of the possible effects of human-media interaction are pernicious. most are simply innocuous, and if techniques that benefit users can result from these effects there should be no harm in applying them in software or multimedia. in general, it's desirable to make user experiences of software and media pleasanter and more productive, and reeves and nass do an excellent job of providing pointers throughout the book. there are suggestions with regard to personality, emotion (including arousal), social roles, and form (e.g., image size, fidelity of sound, and video).
none of them comes close to being as controversial as subliminal suggestion, although it continues to make me uncomfortable that people react to media as if they were dealing directly with other human beings. this is a disquieting finding, but it should not dissuade us from our jobs of designing good systems for users. all in all, information ecologies and the media equation are both first-rate books that belong in our libraries and on our professional bookshelves. both provide methodologies and techniques for making user interactions with automated systems a better experience, both in terms of accomplishing tasks efficiently and in terms of user satisfaction.

-tom zillner

information technology and libraries | march 2000

accessibility of vendor-created database tutorials for people with disabilities
joanne oud
information technology and libraries | december 2016

abstract

many video, screencast, webinar, or interactive tutorials are created and provided by vendors for use by libraries to instruct users in database searching. this study investigates whether these vendor-created database tutorials are accessible for people with disabilities to see whether librarians can use these tutorials instead of creating them in-house. findings on accessibility were mixed. positive accessibility features and common accessibility problems are described, with recommendations on how to maximize accessibility.

introduction

online videos, screencasts, and other multimedia tutorials are commonly used for instruction in academic libraries. these online learning objects are time consuming to create in-house and require a commitment to maintain and revise when database interfaces change. many database vendors provide screencasts or online videos on how to use their databases.
should libraries use these vendor-provided instructional tools rather than spend the time and effort to create their own? many already do: a study shows that 17.7 percent of academic libraries link to tutorials created by third parties, mainly by vendors or other libraries.1 when deciding whether to use vendor-created tutorials, one consideration is whether the tutorials meet accessibility requirements for people with disabilities. the importance of accessibility for online tutorials has been increasingly recognized and outlined in recent library literature.2 people with disabilities make up one of the largest minority groups in the united states and canada, and studies show that about 9 percent of university or college students have a disability.3 problems with web accessibility have been well documented. people with disabilities are often unable to access the same online sites and resources as others, creating a digital divide.4 even if people with disabilities can access a site, it is more difficult for many to use it.5 assistive technologies, like screen-reading software, enable access but add an extra layer of complexity in interacting with the site, and blind or low-vision users can’t always rely on visual cues to navigate and interpret sites. a recent study of library website accessibility concluded that typical library websites are not designed with people with disabilities in mind.6 joanne oud (joud@wlu.ca) is instructional technology librarian and instruction coordinator, wilfrid laurier university, ontario, canada. accessibility of vendor-created database tutorials for people with disabilities | oud https://doi.org/10.6017/ital.v35i4.9469 8 libraries, which are founded on a philosophy of equal access to information, should be concerned about online accessibility. legal requirements for providing accessible online web content vary, but exist in every jurisdiction in the united states and canada. 
apart from the legal requirements, recent literature points out that equitable access to information for people with disabilities is a matter of human rights and an issue of diversity and social justice, and calls on libraries and librarians to improve their commitment to online accessibility.7 it is important for libraries to participate in creating a level playing field and to avoid creating conditions that make people feel unequal or prevent them from equitable access. it is unclear whether librarians can assume vendor-created instructional tutorials are accessible. studies on vendor database accessibility have been mixed, showing some commitment to and improvements in accessibility on one hand, but sometimes substantial gaps in accessibility on the other.8 the focus until now has been exclusively on the accessibility of database interfaces. this study investigates the accessibility of online tutorials, including videos, screencasts, interactive multimedia, and archived webinars created by database and journal vendors and offered as instructional materials to librarians and patrons, to determine whether they are a viable alternative to making in-house training materials.

literature review

although a few articles exist on how to make video tutorials accessible,9 no studies have evaluated the accessibility of already-created video or screencast tutorials. there are, however, some studies evaluating the accessibility of vendor databases. byerley, chambers, and thohira surveyed vendors in 2007 and found that most felt they had integrated accessibility standards into their search interfaces, and nearly all tested for accessibility to some degree, though not always with actual users.10 these findings conflict somewhat with the results of other studies.
tatomir and durrance evaluated the accessibility of thirty-two databases with a checklist and found that although many did contain accessibility features, 72 percent were marginally accessible or inaccessible.11 similarly, dermody and majekodunmi found that students with print-related disabilities who use screen-reading software could only complete 55 percent of tasks successfully because of accessibility barriers and usability challenges.12 delancey surveyed vendors and examined vpats, or product accessibility claims, and found that vendors felt they were compliant with 64 percent of us section 508 items.13 especially relevant to this study, only 23 percent of vendors said that the multimedia content within their products was compliant, and 46 percent admitted multimedia content was not compliant at all. since vendor vpat forms are completed for databases and other products only, and not the instructional tutorials created by vendors on how to use those products, vendor accessibility claims for instructional tutorials are unknown. although no studies have been done on the accessibility of video or screencast tutorials, some have been done on the accessibility of multimedia or other related kinds of online learning. roberts, crittenden, and crittenden surveyed 2,366 students taking online courses at several us universities. a total of 9.3 percent of those students reported that they had a disability, and of those, 46 percent said their disability affected their ability to succeed in their online course, although most reasons cited were not related to technical accessibility barriers.14 kumar and owston studied students with disabilities using online learning units that contained videos. all students in the study reported at least one barrier to completing the learning units.15 although this study involves student use of video tutorials, it doesn’t report on accessibility issues specific to those tutorials.
previous studies of vendor products focus exclusively on database interfaces, and previous studies of online learning have not focused on screencast accessibility. therefore this study’s goal is to investigate how accessible vendor-created video tutorials are. accessibility is defined as both technical accessibility (can people with disabilities locate, access, and use them) and usability (how easy it is for people with disabilities to use them). this study will look at which major accessibility issues there are (if any) and make recommendations on whether librarians can direct students to them rather than making in-house instructional videos. method an evaluation checklist (see appendix 2) was developed for this study using criteria drawn from the web content accessibility guidelines (wcag) 2.0. wcag 2.0 is the most widely recognized web-accessibility standard internationally. much recent accessibility legislation adopts it, including the in-process revisions to section 508 guidelines in the united states.16 wcag 2.0 is also consistent with tutorial accessibility best-practice advice found in recent articles, which emphasize the need for accurate captions, keyboard accessibility, descriptive narration, and alternate versions for embedded objects, among other criteria.17 the checklist has twenty items and is split into two sections, “functionality” and “usability.” functionality items test whether the tutorial can be used by people using screen-reading software or a keyboard only, and include whether the tutorial is findable on the page and playable, whether player controls and interactive content can be operated by keyboard, whether captions are available, and whether audio narration is descriptive enough so someone who can’t see the video can understand what is happening. usability items test how easy the tutorial is to use. examples include clear visuals and audio, use of visual cues to focus the viewer’s attention, and short and logically focused content. 
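the two-part checklist described above can be pictured as a small data structure. the following is a minimal python sketch, not the instrument used in the study: the item keys are hypothetical short labels paraphrased from appendix 2, and the helper `failed_items` is an illustrative name.

```python
# Checklist items paraphrased from appendix 2 of the article.
# The short keys are hypothetical labels invented for this sketch.
CHECKLIST = {
    "functionality": [
        "alternate_format_provided",
        "alternate_format_accessible",
        "alternate_format_findable",
        "video_findable_by_screen_reader",
        "video_playable_by_screen_reader",
        "player_controls_keyboard_operable",
        "interactive_content_keyboard_operable",
        "user_controls_timing",
        "alternate_modes_of_presentation",
        "synchronized_closed_captions",
        "descriptive_narration",
    ],
    "usability": [
        "no_auto_play",
        "easy_to_use_with_screen_reader",
        "clear_high_contrast_visuals",
        "clear_audio_no_background_noise",
        "visual_cues_focus_attention",
        "short_and_concise",
        "logically_organized",
        "consistent_navigation",
        "simple_language",
    ],
}


def failed_items(results):
    """Return, per section, the checklist items a tutorial did not pass.

    `results` maps an item key to True (passed) or False (failed).
    Items missing from `results` are counted as failures.
    """
    return {
        section: [item for item in items if not results.get(item, False)]
        for section, items in CHECKLIST.items()
    }
```

a tutorial that passes everything except captions would report a single failed item under "functionality" and none under "usability".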
to help prioritize the importance of checklist items, the local accessible learning centre (alc), which supports students on campus who use assistive technologies, was consulted about the difficulties most encountered by students. the alc’s highest priority was the provision of an alternate accessible version of a tutorial, since it is difficult to make complex embedded web content accessible for everyone under every circumstance and an alternate version allows people to work with content in a way that suits their needs.

for the evaluation, major database vendors were chosen through a scan of common vendors and platforms at universities, with input from collections colleagues. some vendors were eliminated because they don’t provide instructional tutorials on their websites. twenty-five vendors were included in the study (see appendix 1). a large majority of the tutorials found were screencast or video tutorials; a few vendors provided recorded webinars, and a few provided interactive multimedia tutorials, mainly text captions or visuals with clickable areas or quizzes. in total, 460 tutorials were evaluated for accessibility: 417 video, screencast, or interactive tutorials from twenty-four vendors, and 41 recorded webinars from four vendors. if tutorials were available in more than one place, most commonly on both the vendor’s website and youtube, both locations were tested. if more than thirty tutorials were provided by a vendor, every other one was tested. if multiple formats of tutorial were available, such as screencasts and recorded webinars, each format was tested. testing from the perspective of people with visual impairments was a key focus.
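the sampling rule for large tutorial collections can be sketched in a few lines of python. this is an illustration only: the function name `select_for_testing` is hypothetical, and starting the every-other-one selection with the first tutorial is an assumption, since the article does not say which tutorial the alternation began with.

```python
def select_for_testing(tutorials, threshold=30):
    """Pick the tutorials to evaluate for one vendor and one format.

    If the vendor offers more than `threshold` tutorials, keep every
    other one (assumption: starting with the first); otherwise keep all.
    """
    tutorials = list(tutorials)
    if len(tutorials) > threshold:
        return tutorials[::2]
    return tutorials
```

under this reading, a vendor with exactly thirty tutorials has all thirty tested, while a vendor with thirty-one has sixteen tested.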
other assistive technologies such as kurzweil (for people who can see but have print-related disabilities) and zoomtext (for enlargement) are widely used, but if webpages work well using screen-reading software intended for people with visual impairments, they also generally work using other kinds of assistive software. tutorials were tested with two screen-reading programs used by people with visual impairments: nvda (with firefox), a free open source program, and jaws (with internet explorer), a widely used commercial product. both were used to determine whether any difficulties were due to the quirks of a particular software product or a result of inherent accessibility problems. in addition, captions were evaluated to determine accessibility for people who are deaf or have hearing difficulties. people with visual or some physical impairments use the keyboard only, so all tutorials were tested without a mouse using solely the keyboard. during testing, each task was tried three different ways within nvda or jaws before deciding that it couldn’t be completed. if one of the three methods worked the task was marked as successfully completed. if a task could be completed successfully in one screen-reading program but not the other, it was marked as unsuccessful. screen-reader support needs to be consistent across platforms, since people may be using a variety of types of assistive software.

findings and discussion

tutorials created by the same vendor nearly all used the same approach and had the same checklist results. this is positive, since consistency is important for accessibility and helps in navigation and ease of use. none of the forty-one recorded webinars tested in this study were accessible. webinars did not have player controls that were findable on the page by screen-reading software or usable by keyboard. none had captions, transcripts, or alternate accessible versions.
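the pass/fail rule used during screen-reader testing (three attempts per program, with success required in both programs) can be written as a short python sketch. the function name `task_completed` and the shape of its input are assumptions made for illustration, not part of the study's materials.

```python
def task_completed(attempts_by_reader):
    """Decide whether a task counts as successfully completed.

    `attempts_by_reader` maps a screen reader name (e.g. "nvda", "jaws")
    to a list of booleans, one per method tried (up to three).
    Within one reader, any successful method is enough; across readers,
    the task must succeed in every reader to be marked successful.
    """
    if not attempts_by_reader:
        return False  # nothing was tested, so nothing can be marked complete
    return all(any(attempts) for attempts in attempts_by_reader.values())


# Works on the third try in nvda, on the first try in jaws: successful.
print(task_completed({"nvda": [False, False, True], "jaws": [True]}))
# Works in nvda but never in jaws: marked unsuccessful.
print(task_completed({"nvda": [True], "jaws": [False, False, False]}))
```

requiring success in both programs reflects the article's point that screen-reader support needs to be consistent across platforms.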
often webinars were quite long, with no clear structure and no cues to focus attention on the screen. recorded webinars had almost no accessibility features and can’t be recommended for use as accessible instructional materials in their current form. none of the screencast or video tutorials tested were completely accessible, and all failed in at least one checklist item. tutorials from some vendors, however, came close to meeting all checklist requirements. overall, there were many positive accessibility features in the video and screencast tutorials. most of these tutorials were findable and playable by screen reading software in some way, had video player controls usable by keyboard, had descriptive narration so people who can’t see the screen can tell what is happening, had clear visuals and audio narration, used simple language, and were relatively short and focused in content. the most accessible screencast or video tutorials were produced by the american psychological association (apa), american theological library association (atla), modern language association (mla), and ebsco. their tutorials had many accessibility features and rated highly on the checklist. they included much less commonly found accessibility features, especially the use of visual and/or audio cues to focus the viewer’s attention and the inclusion of accurate and properly synchronized closed captions. visual cues are important for people with learning or attention-related disabilities, and help all viewers interpret and follow the video more easily. people who are deaf can’t access the content without captions, and captions also help people who have english as a second language or are at public computers without headphones. tutorials from these vendors also had an alternate version or transcript available.
as mentioned earlier, the highest-priority checklist item is the presence of an alternate accessible version, since it is difficult to design multimedia that works for people with all disabilities in all circumstances. people with disabilities may also have previous negative experiences with online multimedia and prefer to use an alternate format that they have had more success with. in the case of these above-average vendors, the alternate accessible version was a transcript consisting of the video’s closed captions, auto-generated by youtube. since the tutorials’ narration was descriptive and the captions were accurate, the auto-generated transcripts are useful. however, the youtube transcript is hard to find on the youtube page. also, most of these vendors had tutorials available both from their own websites and from youtube, and none had alternate versions available on their own websites. viewers requiring an alternate format would need to know to go to the youtube site instead of the vendor site to find it. two other vendors also had quite accessible tutorials. ieee’s tutorials had the same positive accessibility features already mentioned. tutorials were done in-house and presented through the vendor’s site. while most tutorials presented on vendor sites were lacking in accessibility, ieee’s were well thought out from an accessibility perspective and usable by screen-reading software. these were the only tutorials tested where all interactivity, including pop-up screens, was easily usable and navigable by keyboard. the one accessibility issue was the lack of an alternate accessible version. elsevier’s sciencedirect tutorials took a different approach to accessibility than other vendors, or even than elsevier’s tutorials for other elsevier products.
the sciencedirect tutorials were not accessible, but an alternate text version was available, and people using screen-reader software were informed of this when they got to the tutorial page and were redirected to the text version. the ideal is to have one version that is accessible to everyone, but this approach is a good way to implement an alternate version if one accessible version isn’t possible. screencasts or video tutorials from other vendors also have some good accessibility features, but these were balanced with serious accessibility problems. the main accessibility issues discovered include the following:

alternate accessible versions: vendors who had captions and hosted their videos on youtube did have auto-generated youtube transcripts, but these were hard to find and were only useful if the captions were descriptive and accurate, which many were not. apart from elsevier’s sciencedirect tutorials, no vendors provided another format deliberately as an accessible alternative.

captions: captions were missing or problematic in the tutorials of fourteen vendors, or 59 percent of the total. five (21 percent) of vendors provided no captions at all for their tutorials. nine (38 percent) had unedited, auto-generated youtube captions, which are highly inaccurate and therefore don’t provide usable access to the content for people who are deaf.

tutorial not findable or playable on page: twelve vendors (50 percent) had tutorials that were not findable on the webpage or playable for people using a keyboard or screen-reading software. most of these issues are with tutorials on vendor sites, which were often flash-based or offered through non-youtube third party sites like vimeo. four vendors (17 percent) offered access to their tutorials both through their own (inaccessible) website and youtube, which is findable and playable by screen reading software.
eight (33 percent), however, only provided access through their (inaccessible) webpages, which means that people using a keyboard or screen reading software would not be able to use their tutorials.

no visual cues to focus attention: eight vendors (33 percent) had no visual cues to focus attention in the video. visual cues help people with certain disabilities focus on the essential part of the screen that is being discussed, help everyone more easily interpret and follow what is happening, and are known to help facilitate successful multimedia learning.18

nondescriptive narration: six vendors (25 percent) had tutorials with audio narration that didn’t sufficiently describe what was happening on the screen. narration needs to describe what is happening in enough detail so people who can’t see the screen are not missing information available for sighted viewers.

fuzzy visuals: five vendors (21 percent) had tutorials with visuals that were fuzzy and hard to see. this makes viewing difficult for people with low vision, and challenging even for people with normal vision.

fuzzy audio or background music: three vendors (13 percent) had poor-quality audio narration or background music playing during narration. background music is distracting for those with hearing difficulties and makes it more difficult to focus on what is being said. eliminating extraneous sound also makes it easier for people to learn from multimedia.19

tutorials consisting only of text captions: three vendors (13 percent) had tutorials consisting of text captions with no narration. the text captions were not readable by screen-reading software, and no alternate accessible versions were provided.
providing narration in tutorials is recommended for accessibility, since it allows people who can’t see the screen to access the content more easily, and has been shown to improve learning and recall over on-screen text and graphics alone.20

recommendations and conclusions

this study attempted to determine how accessible vendor-created database tutorials are, and whether academic librarians can use them instead of re-creating them locally. for recorded webinars, the answer is a clear no, since none were technically accessible for people using screen-reading software. for video or screencast tutorials, however, the answer is less clear. results showed that many vendors created tutorials with positive features like clear visuals and audio, being short and focused on one main point, and using descriptive narration. however, technical accessibility was much less successful, with 59 percent of vendors omitting usable captions and 50 percent presenting tutorials that couldn’t be found on the page or played by people using screen-reading software. these technical accessibility issues prevent people with hearing, vision, or some mobility impairments from using the tutorials at all. although none of the tutorials studied met all the checklist criteria, some came close and could be used by librarians depending on local requirements, policies, and priorities for accessibility. in part, this study found that the accessibility of many tutorials depends on how they are presented. disappointingly, 50 percent of vendors had tutorials on their websites that were not findable or playable by people with disabilities. many vendors, however, hosted tutorials on youtube as well as their own site. in these cases, youtube was always a more accessible option
youtube itself is relatively accessible, with both pages and players that are navigable by keyboard and by screen-reading software. there are options for accessibility settings in youtube, such as having captions display automatically, and more accessible third-party overlays are available for the youtube player. on vendor sites, there were more likely to be issues with flash and an inability for people using screen-reading software or keyboards to find and play videos. some vendors embed youtube videos on their site. even if the embedded videos are findable and playable, this method omits important accessibility features found on the youtube page, such as the text transcript. the results of this study show that using youtube where available is recommended. further, linking to youtube rather than embedding the video is preferred, unless a separate link to the transcript is made to provide an alternate accessible version. captions are another key accessibility problem identified in this study: nearly two-thirds had unusable captions. often, auto-generated youtube captions were present but were not usable. the presence of captions is not enough for accessibility; those captions need to be accurate and present the same content as the narration. youtube auto-captioning does not generate captions that are accurate enough to be useful without manual editing. youtube auto-generates transcripts from the captions, so if the captions are inaccurate the transcript will not be useful either. editing youtube auto-generated captions is necessary to ensure accessibility. a few accessibility issues found in this study would be easy to improve with some thought during tutorial creation. adding visual cues like arrows or highlighting to the screen to help people focus attention, or remembering that not everyone can see the screen while recording narration, can be easily achieved and would improve accessibility significantly. 
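the vendor percentages quoted in the findings and conclusions can be recomputed from the reported counts. the python sketch below assumes the denominator is the twenty-four vendors that offered video, screencast, or interactive tutorials; the dictionary keys are hypothetical labels, and rounding halves up is an assumption about how the figures were rounded.

```python
# Assumption: percentages are taken over the twenty-four vendors that
# offered video, screencast, or interactive tutorials.
VENDORS_WITH_TUTORIALS = 24

# Vendor counts as reported in the findings; keys are hypothetical labels.
ISSUE_COUNTS = {
    "captions_missing_or_problematic": 14,
    "no_captions": 5,
    "unedited_auto_captions": 9,
    "not_findable_or_playable": 12,
    "vendor_site_and_youtube": 4,
    "vendor_site_only": 8,
    "no_visual_cues": 8,
    "nondescriptive_narration": 6,
    "fuzzy_visuals": 5,
    "poor_audio_or_music": 3,
    "text_captions_only": 3,
}


def percent(count, total=VENDORS_WITH_TUTORIALS):
    """Whole-number percentage, rounding halves up, in integer arithmetic."""
    return (200 * count + total) // (2 * total)


for label, count in ISSUE_COUNTS.items():
    print(f"{label}: {count}/{VENDORS_WITH_TUTORIALS} = {percent(count)} percent")
```

under these assumptions every reported figure is reproduced except the combined captions figure: fourteen of twenty-four rounds to 58 percent, whereas the text reports 59 percent, which equals the sum of the two rounded components (21 + 38). that may be how the published figure was obtained, though the article does not say.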
other issues would require more planning and effort to improve. given the widespread technical accessibility problems identified in this study, it is particularly important for people creating tutorials to provide alternate formats that are accessible if tutorials themselves are not accessible. almost no vendors do this currently, but it would have the most significant impact on accessibility for the broadest range of people. adding usable captions is the second most important area for improvement. to provide access for people who are deaf, captions need to be added or auto-generated youtube captions need to be edited for accuracy. both alternate formats and captions require some thought and effort to implement but ensure that tutorials will meet accessibility requirements and be usable by everyone.

notes and bibliography

1. eamon tewell, “video tutorials in academic art libraries: a content analysis and review,” art documentation 29, no. 2 (2010): 53–61.
2. amanda s. clossen, “beyond the letter of the law: accessibility, universal design, and human-centered design in video tutorials,” pennsylvania libraries: research & practice 2, no. 1 (2014): 27–37, https://doi.org/10.5195/palrap.2014.43; joanne oud, “improving screencast accessibility for people with disabilities: guidelines and techniques,” internet reference services quarterly 16, no. 3 (2011): 129–44, https://doi.org/10.1080/10875301.2011.602304; kathleen pickens and jessica long, “click here! (and other ways to sabotage accessibility),” imagine, innovate, inspire: the proceedings of the acrl 2013 conference (chicago: acrl, 2013), 107–12.
3. deann barnard-brak, lucy lechtenberger, and william y. lan, “accommodation strategies of college students with disabilities,” qualitative report 15, no. 2 (2010): 411–29.
4. cyndi rowland et al., “universal design for the digital environment: transforming the institution,” educause review 45, no. 6 (2010): 14–28.
5.
peter brophy and jenny craven, “web accessibility,” library trends 55, no. 4 (2008): 950–72.
6. kyunghye yoon, laura hulscher, and rachel dols, “accessibility and diversity in library and information science: inclusive information architecture for library websites,” library quarterly 86, no. 2 (2016): 213–29.
7. ruth v. small, william n. myhill, and lydia herring-harrington, “developing accessible libraries and inclusive librarians in the 21st century: examples from practice,” advances in librarianship 40 (2015): 73–88, https://doi.org/10.1108/s0065-2830201540; john carlo jaeger, paul t. wentz, and brian bertot, “libraries and the future of equal access for people with disabilities: legal frameworks, human rights, and social justice,” advances in librarianship 40 (2015): 237–53; yoon, hulscher, and dols, “accessibility and diversity in library and information science: inclusive information architecture for library websites.”
8. suzanne l. byerley, mary beth chambers, and mariyam thohira, “accessibility of web-based library databases: the vendors’ perspectives in 2007,” library hi tech 25, no. 4 (2007): 509–27, https://doi.org/10.1108/07378830710840473; kelly dermody and norda majekodunmi, “online databases and the research experience for university students with print disabilities,” library hi tech 29, no. 1 (2011): 149–60, https://doi.org/10.1108/07378831111116976; jennifer tatomir and joan c. durrance, “overcoming the information gap: measuring the accessibility of library databases to adaptive technology users,” library hi tech 28, no. 4 (2010): 577–94, https://doi.org/10.1108/07378831011096240.
9. pickens and long, “click here!”; clossen, “beyond the letter of the law”; oud, “improving screencast accessibility for people with disabilities”; nichole a. martin and ross martin, “would you watch it?
creating effective and engaging video tutorials,” journal of library & accessibility of vendor-created database tutorials for people with disabilities | oud https://doi.org/10.6017/ital.v35i4.9469 16 information services in distance learning 9, no. 1–2 (2015): 40–56, https://doi.org/10.1080/1533290x.2014.946345. 10 . byerley, chambers, and thohira, “accessibility of web-based library databases.” 11. tatomir and durrance, “overcoming the information gap.” 12. dermody and majekodunmi, “online databases and the research experience for university students with print disabilities.” 13. laura delancey, “assessing the accuracy of vendor-supplied accessibility documentation,” library hi tech 33, no. 1 (2015): 103–13, https://doi.org/10.1108/lht-08-2014-0077. 14. jodi b. roberts, laura a. crittenden, and jason c. crittenden, “students with disabilities and online learning: a cross-institutional study of perceived satisfaction with accessibility compliance and services,” internet and higher education 14, no. 4 (2011): 242–50, https://doi.org/10.1016/j.iheduc.2011.05.004. 15. kari l. kumar and ron owston, “evaluating e-learning accessibility by automated and student-centered methods,” educational technology research and development 64, no. 2 (2015): 263–83, https://doi.org/10.1007/s11423-015-9413-6. 16. us access board, “draft information and communication technology ( ict ) standards and guidelines,” 36 cfr parts 1193 and 1194, rin 3014-aa37 (2015), https://www.accessboard.gov/attachments/article/1702/ict-proposed-rule.pdf. 17. pickens and long, “click here!”; clossen, “beyond the letter of the law”; martin and martin, “would you watch it?”; oud, “improving screencast accessibility for people with disabilities.” 18. see the signaling principle in richard e. mayer, multimedia learning, 2nd ed. (cambridge: cambridge university press, 2009): 108–17. 19. see the coherence principle, ibid., 89–107. 20. see the modality principle, ibid., 200–220. 
appendix 1. list of vendors

1. acm
2. adam matthew
3. alexander st press
4. apa
5. atla
6. chemspider
7. cochrane library (webinars only)
8. ebsco
9. elsevier
10. factiva
11. gale
12. ieee
13. lexis nexis academic (tutorials and webinars)
14. marketline
15. mathscinet
16. ovid/wolters kluwer (tutorials and webinars)
17. oxford
18. proquest (tutorials and webinars)
19. pubmed
20. sage
21. scifinder
22. standard & poor/netadvantage
23. taylor and francis
24. web of knowledge/thomson reuters
25. zotero

appendix 2. tutorial accessibility evaluation checklist

functionality
☐ equivalent alternate format(s) are provided
    ☐ transcript/text version
    ☐ audio
    ☐ other ___________________________
☐ alternate formats provided are accessible
☐ alternate formats provided are findable on the page by screen reader
☐ screen-reading software can find the video on the webpage
☐ screen-reading software can access and play the video
☐ video-player functions can be operated by keyboard/screen-reading software
☐ interactive content can be accessed and used by keyboard/screen-reading software
☐ user has some control over timing (pause/rewind capability)
☐ alternate modes of presentation are available for all, meaning presented through text, visuals, narration, color, or shape
☐ synchronized closed captions are available for all audio
☐ audio/narration is descriptive

usability
☐ user controls if/when the video starts (no auto play)
☐ video is easy to use by screen-reading software
☐ clear, high-contrast visuals and text
☐ clear, high-contrast audio (no background noise/music)
☐ uses visual cues to focus attention (e.g., highlighting, arrows)
☐ is short and concise
☐ is clearly and logically organized
☐ has consistent navigation, look, and feel
☐ uses simple language, avoids jargon, and defines unfamiliar terms
☐ explicit structure with sections, headings to give viewers context
☐ learning outcome/goal clearly outlined and content focused on outcome

96 journal of library automation vol. 5/2 june, 1972

analysis of search key retrieval on a large bibliographic file

gerry d. guthrie, steven d. slifko: research & development division, the ohio state university libraries, columbus, ohio

two search keys (4,5 and 3,3) are analyzed using a probability formula on a bibliographic file of 857,725 records. assuming random requests by record permits the creation of a predictive model which more closely approximates the actual behavior of a search and retrieval system as determined by a usage survey.

introduction

systems planners are hard pressed to accurately predict the access characteristics of search keys on large on-line bibliographic files when so little is known about user requests. this paper presents a realistic model for analyzing different search keys and compares its results with actual request data gathered from a usage survey of the ohio state university libraries circulation system. a number of papers are available in the literature concerning search key effectiveness; however, all of these were done on relatively small data bases (1-5). of particular importance to this paper is kilgour's article on truncated search keys (6).

purpose

the purposes of this study are (1) to determine the comparative effectiveness of the 4,5 and 3,3 search keys, (2) to compare two predictive models, and (3) to test the results with an actual usage survey.

method

the ohio state university libraries circulation system contained at the time of this study 857,725 titles representing over 2.6 million volumes in the osu collection. the data base used for this study was the search key index file which contained one search key for each title in the master file.
the search key is composed of the first four letters of the author's last name and the first five letters of the first word of the title, excluding nonsignificant words (4,5 key). title words are passed against a stop-list to determine significance. the stop-list contains the words: a, an, and, annual, bulletin, conference, in, international, introduction, journal, of, on, proceedings, report, reports, the, to, yearbook. the search key file is in sequence by search key.

for comparative purposes, a second search key file was created and sorted which contained a 3,3 key (the first three characters of the author's last name and the first three characters of the first significant word of the title). the two files of sorted search keys were then processed by a statistical analysis computer program. this program created a frequency distribution table of identical keys, i.e., how many keys were unique, duplicated once, duplicated twice, etc. from this table two models were compared.

model 1: file entry was viewed as a random process with choice of any unique search key equiprobable. this model has been suggested in the literature mentioned earlier. it states that if $x_i$ keys each return $i$ matches, then the probability of a file search returning $i$ matches may be written

$$p(i) = \frac{x_i}{k_u}$$

where $k_u$ is the total number of unique file keys. likewise, the cumulative probability for $i$ or fewer matches is

$$P(i) = \sum_{j=1}^{i} p(j) = \frac{1}{k_u} \sum_{j=1}^{i} x_j$$

model 2: file entry is viewed as a random process with the choice of any record equiprobable. thus,

$$p(i) = \frac{i \, x_i}{r_t}$$

where $r_t$ is the total number of file records. correspondingly,

$$P(i) = \sum_{j=1}^{i} p(j) = \frac{1}{r_t} \sum_{j=1}^{i} j \, x_j$$

survey: the ohio state university libraries automated circulation system includes a telephone center to which patrons may telephone requests for library holdings information and for checking out and renewing books.
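the key construction and the two models described above can be sketched in a few lines of python. this is only an illustration under stated assumptions, not the program used in the study: the function names, the comma separator inside the key, and the whitespace tokenizing of titles are all hypothetical, and only the stop-list and the two probability formulas come from the text.

```python
from collections import Counter

# Stop-list quoted in the method section.
STOP_WORDS = {
    "a", "an", "and", "annual", "bulletin", "conference", "in",
    "international", "introduction", "journal", "of", "on",
    "proceedings", "report", "reports", "the", "to", "yearbook",
}


def search_key(author_last_name, title, n_author=4, n_title=5):
    """Build an (n_author, n_title) key such as the 4,5 or the 3,3 key.

    Assumption: titles are split on whitespace and the two parts are
    joined with a comma purely for readability.
    """
    words = [w for w in title.lower().split() if w not in STOP_WORDS]
    first_significant = words[0] if words else ""
    return author_last_name.lower()[:n_author] + "," + first_significant[:n_title]


def match_distribution(keys):
    """Return {i: x_i}: the number of distinct keys carried by exactly i records."""
    records_per_key = Counter(keys)
    return Counter(records_per_key.values())


def cumulative_model_1(x, i_max):
    """Random key: P(i) = (sum of x_j for j <= i) / k_u."""
    k_u = sum(x.values())  # total number of unique keys
    return [sum(x[j] for j in range(1, i + 1)) / k_u for i in range(1, i_max + 1)]


def cumulative_model_2(x, i_max):
    """Random record: P(i) = (sum of j * x_j for j <= i) / r_t."""
    r_t = sum(j * x_j for j, x_j in x.items())  # total number of records
    return [sum(j * x[j] for j in range(1, i + 1)) / r_t for i in range(1, i_max + 1)]


# Tiny made-up file: two records share a key, two keys are unique.
SAMPLE = [
    ("guthrie", "analysis of search key retrieval"),
    ("guthrie", "the analysis of files"),
    ("kilgour", "a truncated search key title index"),
    ("long", "journal of truncated keys"),
]
SAMPLE_KEYS = [search_key(author, title) for author, title in SAMPLE]
SAMPLE_X = match_distribution(SAMPLE_KEYS)
```

on the made-up sample, three unique keys cover four records, so model 1 credits single-match searches with two thirds of the probability while model 2 credits them with only one half. the same effect, at scale, is why the random key model reads so much higher than the random record model.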
telephone operators, sitting at cathode ray tube (crt) terminals, translate the patron's author-title request into a 4,5 search key and proceed with a file search. by having the telephone operators treat telephone calls as random input to the system and recording the number of matches returned for each search used, results can be generated in the same form that both of the models take, i.e., $i$ or fewer matches have been returned $P(i) \times 100$ percent of the time. this is a relatively easy survey to conduct since the output list of matching records for any particular key entry is headed with the exact number of matches which follow. the sample size was 1000 information requests recorded over two one-week periods separated by one month. before these two subsamples were merged, statistical analysis on their individual means (for percent of 10 or fewer matches) indicated no significant difference between them at the 99 percent confidence level.

results

the results predicted by the two models for both a 4,5 and 3,3 search key for 1-10 matches appear in tables 1 and 2. the figures pertaining to the 4,5 key can be compared directly to the data received from the survey conducted through the osu library's telephone center. this comparison is shown in table 1 for 1-10 matches.

table 1. file access comparisons (4,5 search key). (percent of time i or fewer matches returned)

 i    actual survey    model 1 (random key)    model 2 (random record)
 1        35.9               81.3                     55.7
 2        53.8               92.9                     71.6
 3        66.0               96.3                     78.5
 4        73.1               97.7                     82.4
 5        78.5               98.4                     84.9
 6        81.3               98.8                     86.6
 7        83.8               99.1                     87.8
 8        85.6               99.3                     88.8
 9        86.6               99.4                     89.6
10        87.8               99.5                     90.2

to acquire a 99 percent upper confidence limit on the percent of requests returning 10 or fewer matches, the normal distribution was used as an approximation to the binomial distribution (n = 1000, p = .878), producing an upper limit of 90.2 percent.

table 2. file access comparisons (3,3 search key). (percent of time i or fewer matches were returned)

 i    model 1 (random key)    model 2 (random record)
 1          64.3                     28.0
 2          81.0                     42.5
 3          87.9                     51.7
 4          91.6                     58.0
 5          93.7                     62.7
 6          95.1                     66.3
 7          96.1                     69.3
 8          96.8                     71.8
 9          97.3                     73.9
10          97.7                     75.7

discussion

in table 1 the results of the survey show that 87.8 percent of all searches recorded returned 10 or fewer titles. in model 1, assuming that requests of the file are random with respect to search key, it is predicted that 99.5 percent of all searches will return 10 or fewer titles. all predicted percentages for model 1 are consistently higher than observed results. the predicted response in model 2 more closely approximates the observed behavior of the system as the number of responses increases. however, model 2 is also consistently higher than the actual survey. comparing model 1 and model 2 only, it is apparent that assuming a random record request more accurately reflects the true usage of a library collection. the lower percentages recorded in the actual survey may be attributable to a number of variables not taken into consideration in this study. clustering due to common english word titles and common names may account for the greater part of this difference.

table 2 shows the results of predicted response for a 3,3 search key. in this table, model 2 predicts that only 75.7 percent of requests will return 10 or fewer titles. equally important, only 28.0 percent of the requests will return a single record.

conclusion

in predicting the expected behavior of an information retrieval system, it is more accurate to assume random requests by record than to assume random requests by search key. probability predictions are deceptively high for assumed random key requests and do not reflect actual usage of the file. even assuming random requests by record will produce higher-than-observed results. data calculated using model 2 should be considered as an upper limit or "ideal" performance indicator.
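the 99 percent upper confidence limit quoted with table 1 can be checked with a short calculation. the sketch below assumes a one-sided limit, which the text does not state explicitly; that reading reproduces the reported 90.2 percent, whereas a two-sided 99 percent interval would give an upper bound of about 90.5 percent. the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def upper_confidence_limit(p_hat, n, confidence=0.99):
    """One-sided upper limit for a proportion, normal approximation to the binomial."""
    z = NormalDist().inv_cdf(confidence)        # about 2.326 for 99 percent, one-sided
    standard_error = sqrt(p_hat * (1 - p_hat) / n)
    return p_hat + z * standard_error


# Survey figures from the text: n = 1000 requests, 87.8 percent with 10 or fewer matches.
limit = upper_confidence_limit(0.878, 1000)
print(f"{100 * limit:.1f} percent")  # prints "90.2 percent"
```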
regarding the results of the random record model as the upper limit on effectiveness of the search key, the data gathered indicate that, as the search key is shortened from 4,5 to 3,3, the deviation between the random key and random record models is considerably heightened. the 4,5 search key is more efficient for retrieval of 10 or fewer records from a large file than the 3,3 key (90.2 versus 75.7 percent). based on these data, the osu libraries decided to retain the 4,5 search key and not reduce it to 3,3.

additional studies should be undertaken to determine the effects of common word usage, common names, and their relation to book usage. secondly, the data presented here could be systematically and randomly reduced in size to predict the behavior of various search key combinations on varying file sizes.

references

1. philip l. long and frederick g. kilgour, "a truncated search key title index," journal of library automation 5:17-20 (mar. 1972).
2. frederick g. kilgour, philip l. long, eugene b. leiderman, and alan l. landgraf, "title-only entries retrieved by use of truncated search keys," journal of library automation 4:207-10 (dec. 1971).
3. frederick g. kilgour, "retrieval of single entries from a computerized library catalog file," proceedings of the american society for information science 5:133-36 (1968).
4. frederick h. ruecking, jr., "bibliographic retrieval from bibliographic input; the hypothesis and construction of a test," journal of library automation 1:227-38 (dec. 1968).
5. william l. newman and edwin j. buchinski, "entry/title compression code access to machine readable bibliographic files," journal of library automation 4:72-85 (june 1971).
6. frederick g. kilgour, philip l. long, and eugene b. leiderman, "retrieval of bibliographic entries from a name-title catalog by use of truncated search keys," proceedings of the american society for information science 7:79-81 (1970).
190 information technology and libraries | december 2006

machine assistance in collection building: new tools, research, issues, and reflections

steve mitchell

steve mitchell (smitch@ucr.edu) is the science librarian for ivia/nsdl data fountains/data fountains projects, science library, university of california, riverside.

digital tool making offers many challenges, involving much trial and error. developing machine learning and assistance in automated and semi-automated internet resource discovery, metadata generation, and rich-text identification provides opportunities for great discovery, innovation, and the potential for transformation of the library community. the areas of computer science involved, as applied to the library applications addressed, are among that discipline’s leading edges. making applied research practical and applicable, through placement within library/collection-management systems and services, involves equal parts computer scientist, research librarian, and legacy-systems archaeologist. still, the early harvest is there for us now, with a large harvest pending. data fountains and ivia, the projects discussed, demonstrate this. clearly, then, the present would be a good time for the library community to more proactively and significantly engage with this technology and research, to better plan for its impacts, to more proactively take up the challenges involved in its exploration, and to better and more comprehensively guide effort in this new territory. the alternative to doing this is that others will develop this territory for us, do it not as well, and sell it back to us at a premium. awareness of this technology and its current capabilities, promises, limitations, and probable major impacts needs to be generalized throughout the library management, metadata, and systems communities. this article charts recent work, promising avenues for new research and development, and issues the library community needs to understand.

this article is intended to discuss data fountains (http://datafountains.ucr.edu) project work and thinking (and its foundation in the ivia system, http://ivia.ucr.edu) regarding tools and services for use in collection creation and augmentation. both systems emphasize automated and semi-automated internet resource discovery, metadata generation, and rich-text harvest. these areas of work and research occur within the larger realms of machine assistance and machine learning. they are of critical value to libraries as they currently or potentially concern: significant resource savings; amplification and re-tasking of expert effort to better match librarian expertise with tasks that truly require it (through the automation of routine tasks); and better scaling of collections by providing them the technological wherewithal to grow, as appropriate, and better match the explosion of significant available knowledge and information that the internet has accelerated.

this article is organized into three major sections:

■ part i details machine assistance work to date in the data fountains and ivia systems project.
■ part ii describes current and upcoming promising research directions in machine assistance.
■ part iii delves into planning and organizational issues that may arise for the library community as a result of these technologies.

■ part i: recent work in data fountains and ivia

part i covers work to date on data fountains and ivia. section 1, “a new service and open source software,” describes concrete project work with data fountains, a new open service and suite of open-source software tools for the educational and library communities, in developing practical machine learning to provide machine assistance in collection building. data fountains is an expansion of work based upon the ivia systems foundation.1 it is an effort that has been ongoing and evolving since 1994.2

section 2, “role and niche definition for machine assistance in collection building,” covers recent developments in our ongoing effort to better research and define roles and niches for machine assistance of the types offered by data fountains. the spectrum is examined, ranging from collection building with an emphasis on expertise that receives small assists from machine tools to an emphasis on machine tools that are configured and thereafter assisted through small refinements by expertise. results from an initial exploratory survey in these areas are summarized.

■ a new service and open-source software: data fountains

description

data fountains is an internet resource discovery, metadata-generation, and selected, full-text harvesting service as well as the open-source (lesser general public license (lgpl) and general public license (gpl) licensed) software that makes the services possible. it is a set of tools for use by organizations and institutions serving the greater learning community that create and maintain internet portals, subject directories, digital libraries, virtual libraries, or library catalogs with portal-like capabilities (ipdvlcs) containing significant collections of internet resources. it is an evolved variant of the ivia system, with which it shares many components. the data fountains/ivia code base represents more than 250,000 lines of primarily c++ code.

on the systems level, data fountains operates as an array of independent systems containing crawler, text classifier, text extraction, portal, and database software components customized to the needs of participating projects. each cooperator and subject community works with, fine tunes, and benefits from its own set of crawler(s), classifier(s), and database manager(s), i.e., its own specific data fountain.
note that in this article, data fountains’ portal/metadata repository/database management, content management, import-export, or content search/browse capabilities, which are substantial, will not be discussed.3 instead, the article will focus on its machine assistance and machine-learning components. the data fountains system and service has been developed through a research partnership among computer scientists and academic librarians that is beginning to provide technological solutions to some of the major overall problems associated with the scalability and efficient running of ipdvlcs. much project effort is based on applying machine-learning techniques to partially automate and provide help in a number of laborious and costly ipdvlc activities. included here, more specifically, are the following needs/scaling challenges: reducing to some degree the high costs of manually created metadata; better coverage of the ever-increasing number of important internet resources (relatedly, the relatively small size of most library internet collections, where searches yielding very few or no results are common); reducing or making more efficient expert-involved tasks requiring little expertise; and reducing redundant efforts among ipdvlcs (both in content and systems building). by providing inexpensive, universally needed raw materials (i.e., metadata and rich full text representing important resources), the data fountains service is intended to offer major support and resource savings to cooperating ipdvlc participants that otherwise have strong ongoing commitments to their established institutional identity or “brand,” interface or look, system, and, more generally, “established way of doing things.” data fountains viability and sustainability is keyed to providing universally needed service and very generic information products that do not require ipdvlcs to change—this often being seen as prohibitively expensive in time and resources. 
data fountains is intended to lower barriers for substantive cooperation in collection building and resource savings on the part of large numbers of ipdvlcs by developing, sharing, and distributing the benefits of machine learning in its areas of application. the data fountains service will be useful to a large spectrum of academic and library-based finding tools including metadata repositories and catalogs with internet portal-like capabilities.4 increasingly, library-catalog software is developing more flexibility, including, hopefully, the means by which full marc (machine-readable cataloging) records coexist with more streamlined (and less expensive) records, e.g., dublin core (dc) and other types, and, moreover, metadata records that include or can be closely associated with selected rich full-text, among many other catalog need areas.5 data fountains offers multiple levels of products and services geared to fit the needs of ipdvlcs of differing sizes, subject needs, and desired data “completeness” or depth (this being the amount and type of metadata and full-text needed to properly represent each resource). uses, products, and services overall, data fountains automatically or semi-automatically supplies varying levels of what represents the basic “ore” required by ipdvlcs for internet resource and article collection building: access to significant, previously undiscovered resources as well as the metadata and selected full-text that describe or represent them. this ore is available in both raw (relatively unprocessed) and more refined products depending on the needs of the participating ipdvlc including, perhaps most importantly, the degree to which expertise is available to provide further refinement and how and for whom the material is intended to be used. data fountains multiple product and usage models supports the building of a wide array of ipdvlc collections. 
a number of usage or service models are supported by data fountains, including:

collection development support for single hybrid record type collections

the first usage model, based on full automation, involves the utilization of data fountains metadata and rich full-text “as is,” without review, to populate a collection. these records can be used by themselves or mixed with other types of records. they can also be used as part of a hybrid collection to undergird another, more primary, or fully expert-created, collection.6 while more accurate, expert-created collections are not only comparatively more labor intensive and expensive to create and maintain but also often smaller, with narrower and more limited coverage. this has been the infomine (http://infomine.ucr.edu) model that features two distinct collections, with the automatically generated collection supporting, as a second tier of data, the expert-built content in the primary collection. users can search one or both.

internet resource discovery service

a second model uses data fountains primarily as an internet resource discovery service where links and titles and other minimal metadata are supplied but where the user’s intent is to identify new resources and build metadata records emphasizing a considerable amount of metadata not generated by data fountains (e.g., different subject schema). this is done by utilizing the targeted link crawler, expert guided crawler, or focused crawler. because little to no metadata/rich-text generation/extraction occurs, this is the least complex of the usage models.

crème de la data fountains

a third approach, a variation of the second, utilizes only those data fountains records that have been automatically determined, through a user-set threshold, to represent the most highly significant resources (e.g., the top 20 percent). these can be flagged for expert review or automatically harvested without review.
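the user-set threshold in the third model can be pictured as a simple ranking cut. the sketch below is an assumption-laden illustration rather than data fountains code: the source does not say how significance is scored, so the `score` field, the `url` field, and the `top_fraction` function are all hypothetical.

```python
def top_fraction(records, fraction=0.20):
    """Return the highest-scoring share of records, best first.

    Assumption: each record is a dict with a numeric "score" field standing
    in for whatever significance measure the service actually computes.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be greater than 0 and at most 1")
    ranked = sorted(records, key=lambda record: record["score"], reverse=True)
    if not ranked:
        return []
    keep = max(1, round(len(ranked) * fraction))  # always keep at least one record
    return ranked[:keep]


# Ten made-up records with scores 0.1 through 1.0.
CANDIDATES = [{"url": f"http://example.org/r{n}", "score": n / 10} for n in range(1, 11)]
SELECTED = top_fraction(CANDIDATES, 0.20)
```

the selected records could then be routed either to an expert review queue or straight to harvest, matching the two options the text describes.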
the data fountains metadata retained for expert review, post-processing, and improvement can be minimal or full. metadata records intended for expert refinement a fourth approach, which is semi-automated, involves using data fountains as both a discovery service and as a metadata record-building service where employment of records from the data fountains data stream is selective but the machine-created record is routinely retained as a foundation record to be refined or augmented by the expert. metadata records plus full-text a fifth approach is to use the rich full-text selectively identified and harvested from the internet resource, either in addition to the metadata generated or by itself, to populate a collection and greatly boost retrieval. that is, some collections may want to utilize metadata differing from that produced by data fountains but have data fountains perform the service of augmenting their metadata with rich full-text. all or parts of the object and full-text can be harvested. ■ standards, metadata, and full-text data fountains’ record format is dublin core (dc) and features standard research library subject schemas including slightly modified library of congress subject headings (lcsh) and library of congress classification (lcc). as part of upcoming work, development of additional classifiers to apply other subject/classification schemas/ vocabularies will occur, notably ddc and those that can be automatically invoked from the terminology found in the collection objects. cooperators may choose to help develop new formats, subject schemas, and metadata to meet custom needs in collecting and classification. other important metadata generated include: title, creators, description (an annotation-like construct), keyphrases, capitalized terms, and resource language, among a total of thirty-plus fields. in addition to fielded metadata, data fountains delivers selected rich text harvested from the resource. 
this is important for enhancing ipdvlc retrieval capabilities and user-searching success. the rich text can be harvested verbatim and offered as-is for search or, if this is problematical, further processed into keyphrases. data post-processing, transfer, and product relevance assurance participants determine and download resources of relevance automatically in batch mode via subject-profiled, custom internet crawls and editable results sets created by and for each ipdvlc to reflect its particular interests. these profiled crawls and metadata generation routines are stored and can be re-executed at selected intervals. results are transferred using the open archives initiative protocol for metadata harvesting (oai-pmh) or sdf (standard delimited format) in dc, marc, and extensible hypertext markup language (xhtml) formats. in addition to batch transfers, participants can manually and interactively identify individual records or groupings of records that suit their needs for harvest. selective, interactive, sorting/browsing of results, followed often by evaluation and editing of metadata and full-text fields (as individual records or globally in patterns), is enabled prior to export. these capabilities allow precisely targeted, custom record identification, modification, and downloading. this in turn enables the most general, as well as the most subject-specialized, ipdvlcs certainty in identifying and receiving only records that meet their need criteria. open-source software the software making the above possible is available to all for free through the lgpl/gpl open-source licenses and model. the open-source model should work well for tool development as fundamental as that described. open source of this type generally means that users freely use and perhaps participate in further development of the functionality of the software and, at intervals, contribute their innovations back to the code base for all to use. 
lgpl/gpl supports a wide diversity of forms of commercial service development. open source has worked well for large applications such as many forms of the linux operating system (a number of variants of this are supported), apache server software, and mysql database management software (all of which are used by the data fountains system). using this model has the intent of cooperatively benefiting the community as a whole. it is the author’s belief that tools of the data fountains type will be used widely enough within, and are crucial enough to, the library community to support the development of an open-source community around them. data fountains software is of use to thousands of institutions that build ipdvlc collections.

open source also means that the development and evolution of a core tool or system for a community can potentially occur faster and more flexibly, with the proper community support, than many types of proprietary effort. this is needed given the continuing and increasingly greater revolutions in computing power and software potential. the community needs to be able to evolve faster in response to changing conditions, and free, community-based, open-source software development is one strategy for achieving this.

■ current systems design, development, and features

to date, most of the work has emphasized research and development leading to innovations in preferential focused crawling, subject classification using logistic regression, k-nearest neighbor (knn) and other classifiers, and rich full-text identification and extraction. a major emphasis in systems development has been identifying points of intervention in crawling, classification, and extraction, whereby initial, periodic, or ongoing interactive expert input can be employed to improve machine processes and results.
that is, the work has emphasized usage not only of fully automated machine processes but semi-automated machine processes intended to interactively augment, amplify, and improve the efforts of experts. experts assist machine processes, and machine processes assist expert judgment/labor. the programming has also been done with an eye toward modularity among different systems components. ■ internet resource discovery/ identification—expert guided and focused crawling a number of crawling systems have been used; currently, for data fountains, three are used that represent two approaches to crawling: expert guided and focused. expert-guided crawling is accomplished by a targeted link crawler (tlc) and an expert guided crawler (egc). tlc is concerned with crawling a user-specified link or list of links. egc differs from tlc in that the single “start url” link given is only the beginning point from which the crawler will either drill down (find onsite links at multiple depths in a site) or drill out (find external links not on the start url site). the result is that, compared with tlc, many more links than just those given the egc crawler initially are crawled. with all crawlers, a metadata record with accompanying rich full-text is generated for each resource crawled. a preferential focused crawler, called the nalanda ivia focused crawler (nifc) after the name of the ancient seat of learning in india, continues to be developed. focused crawling makes possible focused identification of significant internet resources by identifying specific, interlinked, and semantically similar communities of sites of shared subject interest. generally, nifc traverses subject experttargeted regions of the internet to find resources that are strongly interlinked and thereby represent coherent subject-interest communities and sites of shared interest and mutual use (i.e., are often concerned with and contain content similar to one another). 
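the difference between drilling down and drilling out in the expert guided crawler comes down to whether a discovered link stays on the start url's site. the following sketch shows one way to make that split; it is not the egc implementation, the function names are hypothetical, and comparing host names is an assumed definition of "onsite."

```python
from urllib.parse import urljoin, urlparse


def split_links(start_url, hrefs):
    """Partition links found on a page into onsite and external absolute URLs."""
    start_host = urlparse(start_url).netloc.lower()
    onsite, external = [], []
    for href in hrefs:
        absolute = urljoin(start_url, href)      # resolve relative links
        parts = urlparse(absolute)
        if parts.scheme not in ("http", "https"):
            continue                             # skip mailto:, javascript:, and the like
        if parts.netloc.lower() == start_host:
            onsite.append(absolute)
        else:
            external.append(absolute)
    return onsite, external


def links_to_follow(start_url, hrefs, mode):
    """Pick the links a crawl would follow: "drill_down" keeps onsite links, "drill_out" external ones."""
    onsite, external = split_links(start_url, hrefs)
    if mode == "drill_down":
        return onsite
    if mode == "drill_out":
        return external
    raise ValueError("mode must be 'drill_down' or 'drill_out'")
```

a targeted link crawler, by contrast, would skip this step entirely and visit only the links it was handed.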
communities sharing interests often identify and cite one another through linkages on their internet resources. through this mechanism, these communities and their sites/resources can be identified, mapped, and harvested. preferential focused crawling makes focused crawling more efficient by employing algorithms that can respond to clues in web resource page layout and structure (e.g., using document object models, visual cues, and text windows adjacent to anchor text, among others) that indicate the more “promising” links to crawl. the result is more efficient focused crawling (figure 1).7 the focused crawling process starts with exemplary sites/pages/urls being supplied by participating ipdvlc experts. these highly on-topic exemplars are used to form a seed set of model pages used for training/guiding the crawler. as the crawling progresses, an interlinkage graph is developed of which resources link to one another (i.e., cite and co-cite). highly interlinked resources are evaluated, differentiated, and rated as to the degree to which they are linked to/from as well as for their capacities as authoritative resources (e.g., a primary resource such as an important technical report that receives many in-links to it from other resources) or hubs (e.g., secondary sources such as expert virtual library collections that provide out-links to other, authoritative resources). as hubs, expert-created, high-quality ipdvlc collections of links (e.g., infomine) play an important role as milestones and navigation aids in the guidance of many types of crawling. another automated process works to rate resources, as a second indirect measure of resource quality, by comparing for similarity of content (e.g., similarities among keyword vocabularies) between the potential new resources and model resources.
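one standard way to rate resources as hubs and authorities from an interlinkage graph is the hits (hyperlink-induced topic search) iteration. the sketch below is offered under that assumption; the source does not state that nifc uses this exact formulation, and the page names in the example are invented.

```python
def hits(links, iterations=50):
    """Iteratively score hubs and authorities on a small link graph.

    links: dict mapping each page to the list of pages it links to.
    Returns (hub_scores, authority_scores), each normalized to unit length.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # A page's authority is the sum of the hub scores of pages linking to it.
        auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for target in targets:
                auth[target] += hub[src]
        # A page's hub score is the sum of the authority scores it links to.
        hub = {p: sum(auth[t] for t in links.get(p, [])) for p in pages}
        # Normalize so the scores do not grow without bound.
        a_norm = sum(v * v for v in auth.values()) ** 0.5 or 1.0
        h_norm = sum(v * v for v in hub.values()) ** 0.5 or 1.0
        auth = {p: v / a_norm for p, v in auth.items()}
        hub = {p: v / h_norm for p, v in hub.items()}
    return hub, auth


# Invented example: two link collections pointing at two documents.
graph = {"collection_a": ["report", "paper"], "collection_b": ["report"]}
hub_scores, auth_scores = hits(graph)
```

in this toy graph the document cited by both collections receives the highest authority score, and the collection that links to both documents receives the highest hub score, which mirrors the primary-resource/virtual-library distinction drawn above.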
the most linked to/from authorities and hubs, with terminology most similar to the exemplars, are thus identified and become prime candidates for adding to the collection and for indicating other resources to add. the overall architecture of data fountains involves multiple concurrent crawls and an array of multiple crawlers and associated classifiers on multiple machines (i.e., there are one or more data fountains for each major subject area or major cooperator). areas of expert interaction in focused crawling expert interactive and semi-automated approaches to improve crawling are employed in and constitute special design areas of data fountains since many participating projects and communities have access to considerable subject expertise. there is much promise in amplifying the role of this expertise in the crawling process. experts can create and refine crawls by: ■ determining the most appropriate seeds (exemplary resources) to use (whether found in their own collections or generated from other sources); ■ choosing degree of “on-topic-ness” desired (a precision versus recall setting); ■ determining the total number of resources to be crawled; ■ editing initial crawl results (e.g., de-selecting or blacklisting resources found) with an eye toward generally refining and developing a super seed set of very large numbers of increasingly on-target seeds that are then crawled anew. (this process of refinement and enlargement can be reiterated as desired in achieving increasing accuracy in and numbers of exemplars and therefore accuracy in the final crawl.) ■ in addition, expert truing of crawler web graph weightings (i.e., manually “lifting” the values of selected hubs and authorities) either during or after a crawling run is being explored to improve crawling accuracy. 
this lifting process can be aided through tools to visualize the crawl so that the expert can quickly identify, among the masses of results, the most promising areas of a web graph for the crawler to emphasize. ■ expert-created blacklists of urls for types of sites or pages that are not valuable can be stored to save future crawling and expert time. there is such a blacklist for each participating data fountains community group and individual. ■ metadata generation— automated and semi-automated subject classification data fountains and ivia embody innovations in automated metadata generation, including identifying and applying controlled subject terms (using academic library-standard subject schema), keyphrases, and annotation-like constructs (figure 2). automated classifier programs apply these and other metadata and are part of a suite of programs known as the record builder. controlled subject terminology applied currently includes lcsh, lcc, ddc, and medical subject headings (mesh). in assigning these, the system generally first looks for html and dc metatags and then extracts these data. with some fields, when these data are not present (which is common), original metadata are then generated automatically. in the case of lcsh, lcc, and ddc, if not present in metatags, or if users choose to override metatag extraction (in cases where metatags are not accurate, such as when they are spammy or when top-page boilerplate metadata is carried onto all pages regardless of subject relevance), then classification processes are invoked. these derive a set of keywords and key phrases from the resource that serve as a surrogate in representing and summarizing its content. then, using a model that encapsulates the relationships between these natural-language terms and the set of controlled-subject terms, the closest corresponding set of controlled terms is assigned. 
the model is learned from training data sets that consist of large sets of records (more than thirty million in corpora loaned for research purposes by the cornell university library, library of congress, california digital library [cdl], and oclc) from library catalogs and virtual libraries. with lcc, the aim has been to assign one or more lccs to a resource based on the set of lcshs associated with that resource. svm, knn, and logistic regression classifiers have been used. generally, performance has been acceptable in cases where there were two hundred examples of the usage of a particular lcsh (in a record with a url). unfortunately, as large as the training data sets have been, there simply haven’t been enough records for classification purposes with urls and associated text. this problem will more than likely be resolved shortly as catalogs increasingly incorporate web resources.
figure 1. focused and preferential crawling (courtesy of s. chakrabarti)
metadata generation: automated extraction of known, named entities
extraction of named entities (i.e., data elements that can be expected to be in a resource and that are placed by authors/publishers within a known textual/markup pattern) is primarily practiced through the simple means of identifying and extracting data elements indicated by html/dc metatags, when present on a page. data for more than thirty dublin core common (and not so common) fields are extracted. with some fields, extraction can be guided, as needed, in the interests of original metadata creation through pattern recognition and profiling, or through classification (e.g., title, subjects, description).
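the metatag-based extraction described above can be sketched with python’s standard `html.parser` module. this is an illustrative sketch, not the ivia record builder: the class name `MetaTagExtractor`, the helper `extract_metatags`, and the choice of which non-dc tag names to accept are assumptions.

```python
from html.parser import HTMLParser


class MetaTagExtractor(HTMLParser):
    """Collect Dublin Core and a few common HTML <meta> fields from a page."""

    # Assumed set of plain HTML metatag names worth keeping.
    PLAIN_FIELDS = ("description", "keywords", "author")

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        content = attr.get("content")
        if content and (name.startswith("dc.") or name in self.PLAIN_FIELDS):
            # A field such as dc.subject may legitimately repeat.
            self.fields.setdefault(name, []).append(content.strip())


def extract_metatags(html):
    """Return a dict mapping metatag name to the list of values found."""
    parser = MetaTagExtractor()
    parser.feed(html)
    return parser.fields


page = (
    '<html><head><meta name="DC.title" content="Soil Survey">'
    '<meta name="DC.subject" content="Soils">'
    '<meta name="DC.subject" content="Agriculture"/>'
    '<meta name="viewport" content="width=device-width"></head></html>'
)
record = extract_metatags(page)
```

when `record` comes back empty, or when the metatags are judged unreliable, a system of the kind described would fall back to generating original metadata through classification or pattern recognition.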
■ rich-text identification and harvest refinement of our “aboutness” measure for identifying the most relevant pages or sections in a resource or document (i.e., those intended by the author to be rich in descriptive information about the topics within and the type of resource) from which to extract text is a continuing pursuit. involved in this quest has been better determination of author-created structures and conventions in document or resource layout (e.g., locating introductions, summaries, etc., and determining/proportioning the amount of text to be extracted from each). more accurate rich-text identification in turn yields more accurate identification, extraction, and application of key phrases and, from these, more accurate controlled subject term and other metadata application. this is at the foundation of many metadata generation processes. crucially, rich full-text is also important from an end-user information-retrieval perspective because the natural-language terminology contained partially corrects for the limitations inherent in many controlled metadata and subject vocabulary/schema approaches (e.g., new or specialized subject terminology is often slow to appear or weakly represented in the often generalist library-standard subject schemas). refinement of the “aboutness” measure in identifying terms indicating that rich text follows is an important and ongoing task that involves formulating fairly intricate text-extraction rules in reflecting conventions in rich-text placement in resources and documents of differing types (e.g., web sites, articles, database interfaces), formats (e.g., html, pdf, postscript), and languages. ■ a modular architecture that supports a federated array of subject-specific focused crawlers and classifiers the architecture that data fountains is based upon is shown in figures 3 and 4. 
data fountains operates on the systems level as an array of separate sets of bundled crawlers (both guided and focused), classifiers, and extractors; this bundled-array approach provides greater flexibility and efficiency than a more monolithic, single-crawler, multiple-subject approach. a bundle can occupy a whole machine or several can exist independently, as virtual data fountains, on a single machine. instead of one broad, multiple-subject, multiple-audience data fountain that follows a broad shotgun approach to internet resource discovery and classification, there are several vertical, subject- and audience-focused data fountains. a data fountain is intended to exist for each distinct, major subject area and the subject-specific ipdvlc collections (e.g., visual arts, business, horticulture) associated with it.
figure 2. metadata choices in data fountains
data fountains systems architecture emphasizes modularity. the design assumes, and makes it possible, that separate components of the system (e.g., the crawlers, classifiers, database management systems) can be developed further for other uses independent of the data fountains system. in addition, as technologies that the system is dependent upon advance, users will be able to more easily swap out and replace older modules. these capabilities contribute to system sustainability.
■ service design and sustainability
data fountains was conceived to be a cooperative, nonprofit, low-overhead, cost-recovery-based service intended to sustain itself after start-up. access will be provided to ipdvlc cooperators who demonstrate interest and support for the work and service. by so doing, cooperators share in supporting the continuing evolution and improvement of data fountains.
as an additional sustainability consideration, the software has been released as open source so that it can develop and evolve in many directions (to directly fit unique needs) as well as benefit through distributed effort.
■ “small is beautiful”: roles for and advantages of appropriate small- to medium-scaled tools
approaches like those data fountains has taken may be among the few ways that internet finding tools can continue to be relevant to the learning/library community and offer the accuracy and significant content needed by that community. the technical challenges faced by the large engines in their quest to cover an infinitude of audiences and internet resources do not need to be grappled with by the community of research libraries and are not faced by focused crawlers and classifiers of the type data fountains relies upon. the latter are better able to develop targeted, more accurate approaches to their subjects because they enable machine assistance for, as well as amplification of, authoritative subject expertise (e.g., librarians) as a core interactive component in the process of finding and describing new resources. the processes involved target more narrowly defined, distinct, and finite subject universes and intellectual communities. this, in turn, allows them to scale appropriately for their tasks and to apply more complex and varied types of metadata for faculty, researchers, graduate students, and librarians, who generally require more precision (and authority) in their finding tools but still need to move beyond collections (even allied) that are essentially catalogs moved forward a notch. the smaller scale of this work also potentially enables innovations in effective linkage and similarity (i.e., semantic) analysis. some experts note that the future of internet searching as a whole may lie in searching federated finding tools based on these techniques.8 such a federation could be an academic’s or librarian’s web of high-quality finding tools.
data fountains may offer part of the foundation needed to support such a web. from a related perspective, these tools represent an appropriate approach for library and library-community-scaled resource identification and description tasks, one that emphasizes perhaps the great advantage the library community can bring to bear in creating useful finding and metadata generation tools, which no others have. that is, the community’s unparalleled subject and description expertise in finding and ordering significant resources into coherent rich collections might be amplifiable shortly, through machine assistance. if such an effort were sensibly coordinated and focused, and minor modifications in approach and established standards made to enable best use of these new tools, then the best internet finding tools/collections could be made possible, yielding high-quality and significant coverage. these collections would benefit by having the capability to catalyze, out of the mass of the web, the resources that constitute much of its intelligent fraction and make this coherently visible and available to learners and researchers. moreover, this could be done in such a way that digital and print record and object collections could seamlessly interact as one, rendering what would be the best information-finding tools/collections without regard to type of resource. this effort in fact has been unfurling for a long time, though, to date, in small and somewhat sporadic, uncoordinated ways. for example, infomine and similar collections have provided credible links to and for the academic community for well over a decade.
figure 3. interaction of fully and semi-automated and manual collection building processes in data fountains
figure 4. overall data fountains architecture
■ role and niche definition for machine assistance in collection building
exploratory survey
an exploratory survey conducted in fall 2005 illuminated new perspectives, desired products and services, and research opportunities as perceived by a sampling of digital library and library leaders in regard to a number of areas involving machine assistance in collection building. generally, areas explored concerned, among others: new roles projected for machine learning/machine assistance in libraries for metadata generation, resource discovery, and rich full-text identification and extraction; new finding-tool niches and opportunities existing in the service spectrum between google and opac; acceptance of streamlined, more minimal, and cost-saving approaches to metadata creation or augmentation; and the role of cost-recovery-based service and cooperative, participatory business models in digital libraries. more specifically, the purposes of the survey were to:
1. elicit leading library attitudes in relation to the types of services, software development, and research that generally will constitute data fountains;
2. test the waters in regard to attitudes toward implementing machine-learning/machine-assistance-based services for semi-automated collection building within the general context of libraries;
3. probe for new avenues or niches for these services and tools in distinction to both traditional library services/tools and large web search engines;
4. concretely define data fountains’ initial set of automatically and semi-automatically generated metadata/resource-discovery products, formats, and services;
5. examine attitudes toward the value and roles of rich, full-text in library-related finding tools;
6. examine attitudes toward hybrid databases containing heterogeneous records (e.g., multiple formats, types, and amounts of metadata);
7. gather ideas on cooperatively organizing such services; and
8.
generally define new ideas in all interest areas for development of products and services.
the survey, comprising fifty-nine questions, was sent to thirty-five managers of leading digital libraries/libraries/information projects.9 there was a 40 percent return from those targeted (fourteen out of thirty-five). responding institutions and individuals were guaranteed anonymity of response.
■ survey result summary
there was considerable agreement on most answers. as such, this initial definitional survey has proven helpful in design and product definition. though the survey sample set/number of respondents was limited and while results need to be seen as tentative, the views expressed are from well-regarded experts in the fields of digital library and library technology, development, and services. in addition to helping define current data fountains services, the survey results also indicated the need for further exploration in the areas of services, tools, overall niche definition, and publicity. while conclusions remain tentative, pending future, larger surveys, some of the more relevant results are as follows:
■ there appear to be significant niches for an automated/semi-automated collection-building/augmentation service given inadequacies in serving research-library users found in google (and presumably other large commercial search engines) and commercial-library opac/catalog systems. survey results indicate a need for services of the types characterized by data fountains.
■ generally, academic libraries get a slightly above middle-value (neutral) grade in terms of meeting shifting researcher and student information needs over the last decade. this indicates that, above and beyond specific library and commercial-finding tools, there are information needs not being met by libraries in regard to information discovery and retrieval that new services may be able to help meet.
■ there is support, above and beyond creating machine-assistance-based collection-building services, for developing and distributing the free, open-source software tools supporting these services. tools that make possible machine assistance in resource description and collection development are seen as potentially providing very useful services.
■ automated metadata creation and automated resource discovery/identification, specifically, are perceived as potentially important services of significant value to libraries/digital libraries.
■ there is support for the notion of automated identification and extraction of rich, full-text data (e.g., abstracts, introductions) as an important service and augmentation to metadata in improving user retrieval.
■ the notion of hybrid databases/collections (such as infomine) containing heterogeneous metadata records (referring to differing amounts, types, and origins of metadata) representing heterogeneous information objects/resources, of different types and levels of core importance, was supported in most regards.
■ many notions that were foreign to library and even leading-edge digital library managers/leaders (the respondents) two to three years ago appear to be acknowledged research and service issues now. included among these are: machine assistance in collection building; crawling, extraction, and classification tools; more streamlined types of metadata; open-source software for libraries; limitations of google for academic or research uses; limitations of commercial-library opac/catalog systems; and the value of rich full-text as a complement to metadata for improved retrieval.
■ there is strong support, given the resource savings and collection growth made possible, for the notion of machine-created metadata: both that which is created fully automatically and, with even more support, that which is automatically created and then expert reviewed and refined.
■ amounts, types, and formats of desired metadata (very streamlined dc metadata was supported for most uses and contexts) and means of data transfer (oai-pmh was preferred) were specified by respondents. ■ summary of part i data fountains is a unique service and system for inexpensively supporting aspects of collection building among ipdvlcs. developing and utilizing advances in focused crawling and classification, this service automatically and semi-automatically identifies useful internet resources (both open-web as well as closed-collection resources including articles and reports, etc.) and generates metadata (and selected rich text) to accompany them. data fountains is a cooperative service, a free open-source software system, and a research-and-development project exploring machine assistance as well as machine-expert interfaces and synergies in collection building. several useful service niches and roles for the work have been identified and have been or are being developed. ■ part ii: new directions in research this section discusses important new directions in research for machine assistance in collection building as they relate to upcoming and expanding research, development, and prototyping within data fountains and ivia. among focus areas are promising means of: automated classification for applying library standard controlled subject vocabularies/schema, including hybrid and ensemble classification; smarter and more accurate named-entity extraction (e.g., capturing object/article metadata “facts” such as publisher and publishing date); improvements in rich-text identification and harvesting; article/report collection level cocitation and subject gisting functionality; and generally improved expert-guided and focused web crawling. 
■ new research in machine assistance for collection building
the ivia and data fountains projects have recently received a fourth national leadership grant from the united states institute of museum and library services that supports three years of research and development in machine assistance in collection building. in addition, the national science digital library is continuing funding. the areas that will be worked in are discussed below. these have been determined through experience gained over the last eight years of work in machine-assistance-oriented systems development and dialogue with computer scientists and collection coordinators. these areas of technology work and application, though complex and challenging, are very important if the learning/library community is not to be disintermediated by such technologies but is instead to become more fully empowered by them. this can only occur through developing a much larger role in actively defining, guiding, and putting the technologies to best possible use. looking into the future, it is clear that libraries cannot simply continue to wait for or rely on good companies like google, oclc, or opac creators to deliver them, much like a cargo cult, as they have in the past. to the degree that this is done, there is the risk of becoming vendor vectors blinded by the limitations of these companies and their product lines. these products are often incorrectly assumed to be the known technical and organizational universe of what is possible or doable. the revolutions coming in computing power, together with the low cost of this power (which will be almost ubiquitously distributed among users of library collections and services), promise much more change than libraries have seen in the last decade. among the changes underway are those in machine learning and machine assistance in libraries.
as the changes take place, organization size may not guarantee much: over the last decade, librarians and researchers have witnessed large academic and other research libraries, with some exceptions, demonstrate a profound organizational entropy in almost direct proportion to the magnitude of what are essentially paradigm shifts in scholarly communications, information provision, and research. to some degree, these simply reflect larger blockages within the universities and institutions in which libraries are embedded. as these changes play out, it should be noted that history in information or library-related public or scholarly information provision/access probably will not end with google or oclc (wonderful and fairly open companies), just as history in automobile manufacture has not ended with gm, computer manufacture with ibm, or web finding with alta vista. with this as background and in the vein of open planning (as well as open services and open software) and given the size of the work areas addressed and their challenges, much of the projects’ technical planning and direction is being presented in this paper. these areas of computer and information-science research and development, which will affect libraries in many ways, are evolving rapidly into practical application. the current major research areas are:
■ named-entity identification and extraction, and unified models of information extraction and data mining
named-entity identification and extraction is concerned with finding and harvesting generally concise factual data (often common bibliographic metadata) present in the targeted resource, such as publisher, title, and publishing date.
this type of metadata usually is associated with particular collections containing information objects that are often homogeneous (e.g., scientific article collections) and in which author-intended placement of metadata (or data) elements follows an established pattern and location in the object (e.g., an abstract is typically present and indicated in a pattern following presentation of title/author). while making extraction easy is one of the functions of metatagged metadata in internet resources, generally few authors or collection coordinators in academia, or elsewhere, use metatags or applicable naming schema in any significant or uniform way (often, in fact, they are used very sparingly or not at all). extractors therefore must be able not only to identify and harvest metatag metadata but also to discern and then extract specific metadata elements interspersed in bodies of text, as made identifiable by detecting the patterns of occurrence unique to the type of element as it occurs in the object or collection. among the many advances planned for data fountains is the usage of conditional random fields.10 important as well are user interfaces or dashboards that allow configuration of extractors whereby, as patterns of placement for desired data for extraction change in differing collections and types of objects, the tool can be configured appropriately to match the context and task.
also under development consideration are more hybrid, unified approaches to and models for data extraction and mining (as applies to text classification), using each to inform and improve the other.11 that is, a family of models is being developed for improving data mining of information in largely unstructured text by using methods that “have such tight integration that the boundaries between them disappear, and they can be accurately described as a unified framework for extraction and mining.”12 much of this work is concerned with generating metadata for article/report-level collections.
■ document-scale learning and classification
a strong emphasis in the new work will be on document-scale machine learning, classification, and named-entity extraction in regard to collections of research papers, reports, theses, and monographs. internet-object boundary detection is another important concern. detecting and properly defining compound documents (e.g., web hyper-books on multiple pages or sites) is a goal, as is identifying compound-document points of author-intended entry and intended-user paths (i.e., author-intended main connective threads in distributed or compound documents).13 relatedly, improved internal-document structure identification for better document-level classification and extraction is critical. involved are standard-document internal-structure identification (e.g., abstract, introduction, summary text, captions for tables/figures) including units of rich text and microinformation units of text organized via subtopic.14 methods of document-level word-and-phrase graphing as per textrank and other means of identifying small-world and micro-information units are currently being pursued.15 a strong emphasis as well will be on examining and implementing new means of co-referencing among documents in collections and new means of identifying latent topics in a well-defined collection.
by way of explanation, another term for co-referencing is co-citation. an example of such co-referencing is referencing work, described in papers, that has been funded through the same agency and program or that shares principal investigators in addition to standard bibliographic citation. this will improve on work done in citeseer.ist (researchindex) and similar projects through integrating and advancing the promising approaches of rexa open-source collection-management software.16 the focus of this effort will be on integrating article-level named-entity extraction as well as co-citation and bibliometric-refined subject identification within collections of papers/reports. ■ individual text-classification algorithm and training method improvement new research on individual text-classification algorithms will be examined and applied. the emphasis here will be on prototyping and measuring how applicable recent promising scholarly work might be to library-related metadata-generation challenges. the major focus continues to be in the area of applying controlled, library standard subject vocabularies (e.g., lcsh, lcc, and ddc). many of the improvements relate to advances in individual text-classification algorithms, classifier training and fine-tuning, training-corpora cleanup and normalization techniques, and creating the ability for the individual classifiers to hybridize with other classifiers. of special interest are classifiers that perform well with very large numbers of classes, both small and large amounts of text, and that yield probabilistic estimates in class assignment (e.g., of a particular lcsh). the latter allows both provision of multiple class assignments for resources that have multiple subjects as well as greater accuracy and knowledge of the confidence level of the assignments (thresholds of confidence level in accuracy can be set in applying, or not, a particular classification). 
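the use of probabilistic estimates and confidence thresholds described above can be sketched as follows. the function name `triage_assignments`, the two threshold values, and the idea of routing a middle band of scores to expert review are illustrative assumptions; the probabilities would come from whichever classifier is in use.

```python
def triage_assignments(probabilities, accept=0.8, review=0.5):
    """Split candidate subject terms by classifier confidence.

    probabilities: dict mapping a candidate term (e.g., an LCSH) to the
    classifier's estimated probability that it applies to the resource.
    Terms at or above `accept` are applied automatically; terms between
    `review` and `accept` are queued for an expert; the rest are dropped.
    Both thresholds are assumed values chosen for illustration.
    """
    applied, needs_review = [], []
    ranked = sorted(probabilities.items(), key=lambda kv: kv[1], reverse=True)
    for term, p in ranked:
        if p >= accept:
            applied.append(term)
        elif p >= review:
            needs_review.append(term)
    return applied, needs_review


scores = {
    "soils": 0.93,
    "agriculture": 0.85,
    "soil conservation": 0.61,
    "geology": 0.12,
}
applied, needs_review = triage_assignments(scores)
```

because more than one term can clear the acceptance threshold, a resource with several subjects receives several assignments, and raising or lowering the thresholds trades coverage against accuracy.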
more specifically, this work will examine, test, and— depending on test results—refine recently improved variants of the most promising of several classification algorithms.17 among those are: ■ support vector machines (svms)18 ■ logistic regression (lr)19 ■ naïve bayes (nb)20 ■ hidden markov models (hmms)21 ■ knn/knn model22 a number of metrics to measure performance of these and other text classifiers in regard to controlled subject assignment, in both fully-automated and user-interactive (semi-automated) modes, will also be employed.23 ■ hybrid classifiers an important effort will be to test and develop new hybrid classifiers that incorporate the best capabilities of two or more in one classifier. much of the current research has involved developing and improving new hybrids that combine the best of discriminative (e.g., lr, svms, decision trees) and generative (e.g., nb and expectation maximization) techniques in classification. for example, nb is fast but lacking in accuracy, while svms are accurate but can be slow to train. hybrid models can produce better accuracy/coverage than either their purely generative or purely discriminative counterparts.24 various combinations, among others, of lr, hmm, and svm are among the most promising.25 ■ ensemble classification or classifier fusion this constitutes one of the main current directions in classification research and has been applied to a wide range of real-world challenges. 
classification ensembles are reputed to be more accurate than any individual classifier making them up.26 an important focus is on experimenting with new approaches to automated and semi-automated ensemble classification that involve creating frameworks that support metaclassifiers or classifier-recommender systems to apply multiple classifiers, as appropriate, to the classification task.27 developing classifier ensembles, including the metaclassifiers to guide them, is a major element in making possible the self-service aspect of an open, automated metadata-generation service, given that the metaclassifier is intended to determine the nature of the collection and classification task and assign the appropriate classifier(s) to the job.28 it is probable that expert interaction at suitable points in this process will improve performance.
■ distributed classification
classifier ensembles are often used for distributed data mining in order to discover knowledge from inherently distributed and heterogeneous information sources and to scale up learning to very large databases (often the context for library-related tasks). however, standard methods of combining multiple classifiers, such as stacking, can have high performance costs. new classifier combination strategies and methods of distributed interaction will be examined to better handle very large classification needs.29 distributed classification, by nature, would be focused on improving large-scale self-service classification.
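as a rough illustration of combining classifiers, the sketch below uses weighted voting, one of the simplest combination strategies. it is not the stacking or metaclassifier approach the project describes; the toy `keyword_rule` members, their weights, and the labels are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Classifier = Callable[[str], str]


def keyword_rule(keyword: str, label: str, fallback: str = "general") -> Classifier:
    """Build a toy classifier: a stand-in for a trained svm, lr, or nb model."""
    return lambda text: label if keyword in text else fallback


def weighted_vote(text: str, members: List[Tuple[Classifier, float]]) -> str:
    """Return the label with the largest total weight across ensemble members."""
    totals: Dict[str, float] = defaultdict(float)
    for classify, weight in members:
        totals[classify(text)] += weight
    # Sorting the labels first makes ties resolve alphabetically.
    return max(sorted(totals), key=lambda label: totals[label])


# Hypothetical ensemble: weights would normally come from held-out accuracy.
members = [
    (keyword_rule("soil", "agriculture"), 1.0),
    (keyword_rule("crop", "agriculture"), 1.0),
    (keyword_rule("law", "law"), 1.5),
]
print(weighted_vote("soil and crop rotation", members))
print(weighted_vote("water law", members))
```

in the second call the two weaker members together outvote the single stronger one, which shows both the appeal and the risk of simple voting and suggests why a metaclassifier that chooses members per task is of interest.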
■ semi-automated, expert-interactive classification
means of enabling semi-automated, expert-interactive classification will be presented.30 there is much scope for building interactive classifiers that engage the tool user or collection coordinator in an active dialogue (e.g., multiple iterations of machine/expert actions and feedback loops) that leads to incorporation of expert knowledge about specific classification tasks, metadata, and collections into the classifier, thus improving performance. that is, an active learning model can be extended significantly for these processes to include both feature-selection and document-labeling conversations, exploiting rapidly increasing computing power to give the user immediate feedback on choices to improve the classification process.31 several different models featuring domain expert-interactive classification and extraction will be evaluated. these vary from being extremely interactive, emphasizing frequent machine assists, to less interactive, where experts profile, launch, and only occasionally refine a primarily machine process. the initial focus will be on the latter models. note that ivia and data fountains have included a metadata generator with semi-automated record builder for years. ollie and hiclass are examples of systems that are more intensively expert-interactive.32 classification tasks and collection types will be characterized as to which lend themselves to frequent expert interactions, occasional interactions, or more fully-automated modes (i.e., little interaction or initial profiling/definition only).
■ classifier training and evaluation techniques
as important as direct work on the classifiers is work emphasizing assessment, cleaning, and testing of classifier-training data and classifier-evaluation techniques. involved are training data/corpora-normalization techniques, document-clustering techniques, and classifier bias/variance-reduction techniques.
also involved on the classifier side are tuning issues in regard to the data at hand, including improved feature-selection techniques and determining and using confidence estimates in applying/not applying classifications. different approaches to these will be examined, tested, and refined with a range of training corpora. diverse training and test data from assorted collection “types” will include standardized corpora as well as data from participating library or educational community projects. that is, the techniques will be assessed with regard to how they perform with: (1) open web resources, (2) collections of research papers, reports, theses, or monographs (working with rexa), (3) typical campus web-site pages, and (4) mixes of the above.33 each collection-type focus will require differing approaches, algorithms, training, and fine-tuning techniques and will be evaluated through a number of measures.34 ■ improved rich-text identification and extraction for improved classification and user search/ browse rich text is text that has the role of conveying through traditional or new document structures or conventions (e.g., introductions, tables of contents, faqs, and captions for figures) the author-intended subject(s) and intent of the information object. being able to accurately identify and extract this material greatly aids in classifier performance by improving significant keyphrase identification as well as in user retrieval by enabling full-text retrieval. the availability of natural-language text for searching is one means of helping to resolve problems encountered in searching controlled, library standard subject vocabularies (which in turn counteract problems searchers have when only natural-language retrieval is available). both approaches are inherently complementary. 
improvements in rich-text identification and harvest through improved means of document-structure learning (e.g., identifying text windows around links or captions for figures and tables) will be sought. the lightweight semantic (e.g., use of terms that indicate “aboutness” such as “about”, faqs, introduction, and abstract; rating the frequency and uniformity of application of these terms in a given collection; and proportioning source of harvest) and markup clues will be refined as well. identifying aboutness text, which can be seen as micro-information units of text organized via topic and subtopic, is being pursued through work with rexa and others.35
■ improved focused crawling
focused crawling is an appropriately scaled method of crawling for many library collections (see part i). it is used to discover new internet resources by defined topic terminology and topic web-link neighborhood. topic similarity and semantic analysis are key measures of significance that are combined with linkage co-citation measures to indicate significance or relevance of a new resource. topic similarity among resources will be increasingly modeled through a topic-linkage matrix (i.e., semantic similarity map).36 new means of evaluating, fine-tuning, and improving basic crawling will be examined.37 rules reflecting the specific semantics of each major subject area are to be developed by participants for crawls/classification.
■ combined mining and extraction that support improved focused crawling in regard to best link pursuit and expert interaction
the development of hybrid, unified approaches to extraction and mining can be applied to focused crawling. the processes of data mining, rich-text identification and extraction, and the newest forms of focused crawling are starting to overlap and depend upon one another in important ways (as discussed in the section on preferential focused crawling).
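the way a focused crawler might combine topic similarity with a linkage co-citation measure to decide which link to pursue next can be sketched as follows. this is an illustrative sketch only: the jaccard similarity, the mixing weight `alpha`, the `Frontier` class, and the sample urls are assumptions, not the nifc implementation.

```python
import heapq
from typing import List, Set, Tuple


def topic_similarity(page_terms: Set[str], topic_terms: Set[str]) -> float:
    """Jaccard overlap between a page's terms and the crawl's topic terms."""
    if not page_terms or not topic_terms:
        return 0.0
    return len(page_terms & topic_terms) / len(page_terms | topic_terms)


def relevance(page_terms: Set[str], topic_terms: Set[str],
              cocitation: float, alpha: float = 0.7) -> float:
    """Blend content similarity with a link-based co-citation score in [0, 1]."""
    return alpha * topic_similarity(page_terms, topic_terms) + (1 - alpha) * cocitation


class Frontier:
    """Priority queue of candidate urls; the most relevant is crawled first."""

    def __init__(self) -> None:
        self._heap: List[Tuple[float, str]] = []

    def push(self, url: str, score: float) -> None:
        heapq.heappush(self._heap, (-score, url))  # negate: heapq is a min-heap

    def pop(self) -> Tuple[str, float]:
        neg_score, url = heapq.heappop(self._heap)
        return url, -neg_score


topic = {"soil", "irrigation", "crops"}
frontier = Frontier()
# Hypothetical candidates: (url, extracted terms, co-citation score).
frontier.push("http://a.example/", relevance({"soil", "crops", "yield"}, topic, 0.5))
frontier.push("http://b.example/", relevance({"football", "scores"}, topic, 0.9))
best_url, best_score = frontier.pop()
print(best_url, round(best_score, 2))
```

the second candidate is well linked but off topic, so the on-topic page is pursued first; an expert-interactive crawler could let the user adjust `alpha` or raise the score of specific sites.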
another focus for development efforts will therefore be work to more systematically refine best-link pursuit with an eye toward combining advances in mining, extraction, and rich-text identification in focused crawling. this work will be undertaken to improve the work on nifc. focused crawling will improve in many situations, as well, through use of user-interactive components and data-visualization interfaces (e.g., control boards that visualize an interactive graph to aid in expert “lifting” of the values of specific sites/subtopic neighborhoods to better reflect their significance to the expert). this in turn will help users guide and tune the crawling, in semi-automated fashion, to better fit the goals and context of a particular crawl.
■ modeling different approaches for a self-service, openly accessible metadata-generation service(s)
the data fountains and ivia efforts have some experience with modeling metadata-collection related services, having provided collaborative, scholarly virtual-library service successfully for more than a decade. the data fountains project has improved upon earlier work and represents an automated and semi-automated resource discovery, metadata generation, and rich-text identification and harvest service for cooperating collections. the intent is that data fountains be a self-service operation. in a related effort, with co-operators at the national science digital library (nsdl) and library of congress (lc), the data fountains project has been striving to develop self-service dashboards that collection managers can use to configure, profile, and satisfy their needs. by complementing initial profiling with ongoing interactive dialogue, guidance, and refinement, more precise task definition and tool utilization can be achieved.
the goal is to have a service that can, through advanced interfaces, engage users in dialogue to help them better determine their options, the tasks involved in achieving them, the capabilities and limitations of the tools available, and therefore, the best choice of tools and practices given their specific service needs and the nature of their collections.
■ summary of part ii
there are many fronts of research in machine learning as applied to text processing and new-resource discovery in regard to collection building of various types, relevant to libraries, which have opened over the last few years. the data fountains/ivia research described is looking into just a few of these. for libraries, the borders between computer science, information science, and library science are dissolving rapidly. it would be hard to devise or project forward a five-year plan for a large working library without some understanding of current and oncoming machine-learning and machine-assistance work in each of these disciplines, the many inter-connected organizational/community/technical issues, and without an understanding that goes beyond the domain of current or developing products and services from existing vendors.
■ part iii: issues and reflections
part iii is intended to define and address some of the many challenges and issues that are arising or may arise as a result of work on machine-assistance tools in the areas of automated and semi-automated resource discovery, metadata generation, and rich, full-text identification and harvest. included here are reflections on and questions about some of the probable implications and impacts of, as well as roadblocks to, machine-learning technologies applied to collection building. addressed are probable impacts leading to changing roles for libraries, librarian expertise, library standard vocabularies/schema, and the organizations that are the stewards of library standards.
these include:
■ what might be the effect of these technologies on library operations, including changes in the areas and nature of expenditure of expertise required, shifts in amount of expertise required, and changes in divisions of labor (both human/human and human/machine)?
■ what are the effects on libraries and end users when the coverage of finding-tool content can be greatly and inexpensively broadened and deepened?
■ how do current or traditional approaches to library-based practices and standards help foster or hinder these technologies?
■ how will best practices develop in regard to machine-assisted activities?
■ how do these technologies amplify and enable or simply prematurely dislodge librarian expertise?
■ who will own these technologies and tools?
■ how open to evolution are library metadata standards and the organizations entrusted with their stewardship?
■ how will these technologies impact these standards?
unfortunately, most of these questions will remain unanswered. the few answers offered here must remain as tentative, contradictory, and flawed as those of most who dabble in the cottage industry of imagining library futures. still, in the effort to help map some of the new information landscape that is becoming apparent, these reflections, developed over the course of the last few years, may be small contributions toward defining and understanding what is coming.
■ licensing for automatic agents of libraries
it will become increasingly important for libraries to develop licenses with commercial-resource vendors/publishers that allow crawlers/classifiers and other automated programs to be seen as agents of and for these libraries.
it is important that automated agents be allowed to work with (e.g., create or enrich metadata and therefore increase end-user success in finding) both free and fee-based materials in much the same way that an expert bibliographer, cataloger, or public-services librarian would when selecting, creating original metadata for, and providing access to a new commercially vended book intended to become part of a library or other well-defined collection. automated agents accessing and processing fee-based, internet-delivered information objects do so with the goal of improving the finding tools of the institution paying the fee to provide access for users to these objects (i.e., “library users”). thus, they are engaged in a bona fide, fair use of the material by and for the purchasing/subscribing institution. the metadata and descriptive information these tools develop help make the materials they process more visible in collection/finding-tool contexts, a goal which should be desirable by all parties (i.e., end user, subscribing library, and owning author/publisher). ■ new medium, new organization, and an over-proliferation of electronic toll booths and borders another challenge is that internet access to library-collection contents and library catalog-described data, both free and fee-based, is becoming increasingly restricted as libraries, library service organizations, and publishers grope to create special aggregations, with exclusive access for their clienteles. countering this in their adherence to open access, have been, among others, services developed by, for example, arxiv, the institute of museum and library services open archive, cdl escholarship, oaister, citeseer, and nsdl.38 differences in the two approaches may increasingly become an issue. 
on the one hand there is the broad, long-term community ethic favoring open access to an internet with few walls or borders, and authors enabled to publish directly via the internet through open eprint collections or dual commercial/personal-site publishing/copyrighting of their work. on the other hand there is the fairly narrow definition of an internet information niche in which electronic/virtual services and collection access remain mapped restrictively to the sponsoring physical libraries/collections/institutions/publishers. libraries face a contradiction or tension between these two approaches. the latter mode is a natural effort to retain a tightly held clientele and access model that has characterized physical libraries, reflecting narrowly conceived and decades-old organizational/budget/certification/user models of physical-library services and publisher controls. much of this practice is necessitated by commercial publishers (for whom libraries often have no alternative but to act as vectors), together with the lack of vision for and outdated stereotypes held of libraries by the larger organizations in which they find themselves. at the same time, much of the problem is also due to the inability of libraries to develop new cooperative organizational modes, models, and services that map better to the new medium, map better to new author and user benefits enabled by this medium, and that are better able to exploit fully and fluidly the new medium’s capabilities. the types of compartmentalization of collections, access, and services needed for physical libraries and print, or necessitated by publisher restrictions, are increasingly an obstacle when projected onto internet access and service capabilities.
thorough rethinking is needed, just as the educational and scholarly missions of the university as a whole must be thoroughly rethought in the light of internet-associated technologies and capabilities.39 while the information highway must be paid for, over-compartmentalization based on dated organizational and service models is yielding an over-multiplication of toll booths and border crossings among aggregations and collections. an example has been the emphasis at many university of california campus libraries on the single campus opac rather than the pooling of resources across uc libraries for the strengthening and refinement of cdl’s melvyl union catalog. it is likely that with systemwide, multicampus shared resources, melvyl could improve in all respects vastly beyond the single campus opac. this is noted in the final report of the bibliographic services task force of the university of california libraries.40 overall, institutional parochialism can lessen, and has greatly lessened, the value and fluidity of the internet as a medium for information provision. the booths and borders of tightly held collections make material harder to find, less visible, and less useful than would be true of more open, expansive collections and archives. as dempsey stated, libraries need to find “better ways to match supply and demand in the open network. . . . we need new services that operate at the network level, above the level of individual libraries.”41 for crawlers and classifiers, the booths and borders that are proliferating in libraries can act disjunctively as barriers, reducing their performance. there are few answers to the challenges that over-proliferation of booths and borders represent. they are often practical solutions to immediate needs.
still, projects that are exploring new avenues in organization and open, sharable collections (and the standards they are based upon) should be further encouraged and supported communitywide. these include the open archives already mentioned and systems such as those that ivia/data fountains work upon to provide services for such collections in an open, inclusive, cooperative, participatory manner. while the answer will probably remain a mix of open (reflecting capabilities of media) and closed (reflecting organizational and vendor restraints) collections, it would be progress to move the balance point more toward the middle and away from so many booths and borders.
■ note on the related issue of meta-search
libraries often respond to some of these open/closed/multiple-collection aggregator and “brand” challenges and issues with meta-search services. meta-search can serve to mask the fundamental, growing problem of increasing booths and borders. meta-search, unlike the internet-borne conceptions of open service, collections, access, systems, software, and standards, does not really ask us to change our fundamental assumptions, organizations, or data architectures to match the capabilities of the new information medium. it does not ask us to cooperate more fully and share at the level of collection and data; it also doesn’t encourage uniform-standards adoption and development. while meta-search is a fine answer to certain needs, sometimes it is used as a technical means to attempt to avoid these more fundamental issues. in addition, meta-search can be constraining for user search/access—i.e., it frequently disallows use of significant or unique search and metadata capabilities of each individual database to which it is applied. meta-search in libraries is becoming increasingly central, though it has many current operational flaws.
among these flaws are:
■ simplification or dumbing-down of search in order to access lowest-common-denominator fields;
■ clumsy cross-walking among fields, or metadata terminologies that really are not equivalents;
■ difficulty in collating results/eliminating duplicates; and
■ difficulty of matching differing results ranking weightings/systems held by different databases.
libraries emphasizing this approach may themselves be increasingly perceived as dumbed down by academics, grad students, or serious researchers, who must reach beyond google, the opac, and meta-search search and display. instead of, or in addition to, meta-search, it might be wise to pursue more fully the hybrid database approach of combining heterogeneous records for multiple collections (and multiple retrieval languages as needed) in one database.42 as computing power increases geometrically and price decreases drastically every couple of years, the challenge that the hybrid-database approach poses in regard to searching and maintenance of very large hybrid databases may soon become less of a problem. this power also implies that meta-search will become more useful.
■ library standard controlled subject schema/vocabularies
as the promise of automated and semi-automated metadata generation and related tools becomes better known, it may be important for the community as a whole to urge our major subject vocabulary standards organizations, i.e., lc and oclc, to open more fully their standards and input in standard making for wider participation on the part of new communities of researchers, developers, and end users. both organizations maintain important library standard subject vocabularies/schema, lcsh/lcc, and ddc, and related large bibliographic databases and classifier-training data embodying these standards. in this work, both organizations need to more actively seek out and encourage a wider variety of open innovation and development, both within and outside of the library community.
this means involving more researchers, end users, and other perspectives in the effort of contributing to the more rapid evolution of these standards in an attempt both to better meet end-user finding needs and to facilitate application of the standards through machine assistance. while oclc and lc have been generous in providing their data and standards for ivia research (others that have been generous with training data have been the cornell university library and cdl), most known work on these standards is funneled through their organizations, allies, and organizational filters. this is, of course, critical to a point for coordination; however, if overdone it may unnecessarily inhibit wider pollinations, new perspectives (e.g., a wider variety of linguists, computer scientists, and subject vocabulary/schema experts from other disciplines such as medicine and the sciences), decision making, and faster movement forward. informing the perspective here is that, while there are major costs involved in maintaining and coordinating these vocabularies/schemas, such costs are being borne directly or indirectly by the community in fees paid, monies applied (often public monies through the large participating public university/land-grant libraries, among others), or labor volunteered/provided. lc is a public agency and oclc a corporate cooperative. in many ways then, libraries, through their metadata expert/cataloger community, should be seen as “owning,” as both co-author and funding agent, more of a share in these vocabularies (and other standards in library metadata) than their stewarding organizations. a significant portion of the success of thousands of individual libraries is dependent on the successful evolution (replacement?) of these standards through the facilitation efforts and new roles adopted by these two organizations.
ultimately, it must be recognized that in many ways, oclc and lc metadata schema and vocabularies (as well as conventions, styles, and customs in practical application) represent the codified wisdom, in the form of very large knowledge bases, of decades of resource description practice on the part of information professionals in thousands of institutions. the library community is the co-author of these, and oclc and lc are their stewards. when viewing the community as owner, and when taking into account that the community needs to evolve more rapidly with its users to survive, then periodic clarification and renewal of the origin, intent, and understanding of the stewarding organizations and the standards they coordinate might help encourage more rapid, far-sighted change. libraries may or may not sink to the degree that this is realized. in this light, it should be noted that some communities, including path-breaking projects within nsdl, have made well-reasoned decisions not to use these library subject vocabulary standards (carl lagoze, pers. comm.). these are just recent examples, given that abstracting and indexing services/databases, for the journal literature, have in most cases long ago chosen to use their own specialist vocabularies, often supplementing these by enabling key-word or natural-language searching of abstracts or complete full-text. among other core practical concerns here are that the library community’s standards may not be seen as useful and as widely applicable as other information communities may desire. that is, if an important goal is to evolve and expand standards long associated with and emanating from the library community into becoming the standards of new, larger communities outside of libraries, then a more-guarded-than-not approach, which is slow to respond to early adopters or innovators and slows sensible change, may not be the best path.
here it should be said that there are significant ongoing efforts to overcome some of the challenges and better evolve lcsh/lcc. oclc’s faceted application of subject terminology (fast) may represent a step in the right direction.43 having an entry-level vocabulary to translate end-user terminology to appropriate library subject standard vocabulary terms would be of great importance to most types of end user.44 oclc has also been working with the resource description network (rdn) to streamline ddc application.45 there just need to be more of these efforts moving at a more rapid clip. as macewan concluded in 1998, “if lcsh does not change it will sooner or later be abandoned. . . .”46 the same might be said of library subject vocabulary/classification standards. however, in the worst-case scenario, assuming the existing subject standards cannot evolve more rapidly to meet new user needs in information access, collection building, and metadata creation, now may even be an appropriate juncture for a large-scale rethinking and rebuilding, from the ground up.47 the architecture, intent, end-user audience, form, and substance of these standards would need to be rebuilt and expanded. a capability for organizationally responding more quickly to what has amounted over the last few years to far-reaching paradigm shifts would be enabled. now may be the time because, in addition to the questions of the openness/innovation/evolutionary adaptability of these standards, they exhibit significant, long-noted, functional flaws in terms of non-librarian end-user finding success.
among others often noted are:
■ misuse/lack of understanding on the part of end users (and, rarely, poor learning materials and guidance supplied by librarians) due to real or perceived complexity, often associated with the use of subheadings and arcane terms that are far from intuitive for users.48
■ typically sparse application that doesn’t fully represent the number or depth of topics addressed by a work. despite the time needed to create the marc record manually, very few lcshs are applied (often three or fewer in the university of california’s melvyl union catalog).
■ the arcane and overly general nature of many terms that sometimes do not accord with terminology used by practitioners in the field.49
■ the lack of currency of terms describing new or recent phenomena (see discussion of entry vocabulary).50
■ the lack of uniformity of subject granularity in their application across multiple cataloging institutions for the same/similar works.
■ the significant amounts of expensive expert labor involved in their application.
■ their complexity often at least partially assumes some expert mediation (that may not be available, given that access is increasingly from outside the library) or long-term experience with the vocabulary.
■ overdone detail/complexity, some of it either not extremely useful to researchers and nonlibrarian end users or already instantly verifiable by users.
■ their arcane-ness and complexity, which limits capabilities for machine assistance in application and thus thwarts a major, inexpensive means for future collection growth, increased coverage, and more useful collections.
fortunately, and this is crucial, it turns out that much of the tonic needed for improvement may reside in the areas of inexpensively augmenting, as opposed to changing, the lcsh/lcc/ddc schema/vocabularies.
for example, it is probable that most significant objects, when not digitized themselves, will be accompanied increasingly by digitized, representative cores of searchable natural-language rich text, as lc is doing with its table of contents digitization.51 automated and semi-automated tools for rich-text identification, extraction, and end-user searching are showing applicability now (see part i). similarly, keyphrase identification and application can be accomplished automatically with a good degree of reliability; these processes play a role similar to rich text in providing useful retrieval terms and in augmenting subject searching with/without these controlled vocabularies. finally, reasonably good overall subject gisting is occurring in the creation of annotation-like constructs. all of these—rich text, keyphrases, and annotation-like constructs alike—are of great potential value in addressing controlled subject vocabulary/schema inadequacies and in complementing lcsh/lcc/ddc in end-user finding. it is also probable that use of machine means to augment overarching standard subject vocabularies with complementary and much more granular/detailed specialist vocabularies (both expert created and controlled as well as those that are automatically invoked) will shortly be practical and prove very useful. streamlined lcsh/lcc/ddc could be made perhaps to function as linguistic “switching yards” with specialist vocabularies oriented to them and acting as extensions via the spine provided by the generalist vocabularies (similar to work being explored by vizine-goetz).
all of this could be hinged on the synonymy and other term/concept relationships supplied by wordnet or other whole natural-language corpora.52 in such a manner, reconceived lcsh/lcc/ddc can basically work as multi-vocabulary integration and translation tools in cases where the granularity of the subject becomes very fine-grained or specialized.53 such synonymy, linguistic linkages, and switching capabilities would make possible more meaningful and accurate interrelations and more fluid user movement among the vocabularies and concepts of multiple disciplines and multiple controlled vocabularies/schema. this would also better enable the end user when employing terms actually used by practitioners/researchers/students in their disciplines.54 these and other efforts are crucial because, despite their problems, lcsh/lcc/ddc are comprehensive, overarching vocabularies and schema that, though complex (as are the subject vocabularies of biosis and pubmed/medline, which successfully represent very large subject universes of their own), have done a generally useful job of representing and coherently organizing finding terminology for most known worldly (and unworldly) phenomena. this, on any basis, is no easy task. these library standard vocabularies might best be seen as both essential connective tissue and as spines that could coherently thread many disciplines and interests, and many of the more specific vocabularies, together. without such a spine, interdisciplinarians, researchers/students new to an area, and generalists—whose focus requires wide knowledge often across many disciplines (and therefore subject vocabularies)—may find themselves handicapped. each sub- and then sub-sub-specialization might develop its own mutually exclusive and contradictory terminology in a manner that natural-language substitutions such as keyphrase and rich-text availability can only partially fix.
many end users and librarians noted the downsides of natural-language-text-only searching two decades ago while using newspaper and other full-text databases offered by dialog or brs. finally, one cannot ignore that lcsh/lcc/ddc have huge established bases of practitioners and metadata records employing them. therefore, their value is large.

to summarize, the solutions to the problems inherent in using library standard subject vocabulary/schema and other controlled metadata will involve the following:

■ openness to extensive hybridization of approaches to rethinking subject vocabularies/schema and other metadata;
■ awareness of, design for, guidance of, and incorporation of new machine-assisted technologies to boost collection coverage and reduce costs of application;
■ embracing machine assistance, as appropriate, as a means of amplifying and extending expertise and application;
■ applying existent technologies for generation of keyphrases, description-like constructs, and rich text in order to augment controlled subject vocabularies;
■ developing a better conception of end-user metadata expectations and needs against the backdrop and expectations generated by the web, such as instant end-user access/verification; and
■ making use of specialist vocabularies that might be dovetailed well with and coordinated through standard vocabularies.

■ invoked subject vocabularies—hierarchical and otherwise

it is important to track recent research into automated and semi-automated means for creating (often referred to in the computer-science literature as “inducing” or extracting) hierarchical and other subject vocabularies/ontologies from natural-language corpora (see part ii). the intent of this work is to have the natural-language terms used by practitioners directly populate and structure the subject-finding approach.
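as a rough illustration of what inducing a hierarchy from natural-language text can mean in practice, the sketch below applies a simple document co-occurrence (subsumption) heuristic: a term is treated as broader than another if it appears in nearly every document that contains the narrower term, but not the reverse. this is one well-known heuristic from the information-retrieval literature, not the method used by the projects discussed here; the function name induce_hierarchy, the 0.8 threshold, the min_df cutoff, and the sample documents are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import permutations


def induce_hierarchy(docs, threshold=0.8, min_df=2):
    """Suggest (broader, narrower) term pairs from document co-occurrence.

    docs is an iterable of term collections, one per document.
    threshold and min_df are illustrative defaults, not tuned values.
    """
    df = defaultdict(int)   # number of documents containing each term
    co = defaultdict(int)   # number of documents containing both terms of a pair
    for terms in docs:
        unique = set(terms)
        for term in unique:
            df[term] += 1
        for a, b in permutations(unique, 2):
            co[(a, b)] += 1

    pairs = []
    for (broad, narrow), both in co.items():
        # Ignore terms seen too rarely to support any judgment.
        if df[broad] < min_df or df[narrow] < min_df:
            continue
        p_broad_given_narrow = both / df[narrow]
        p_narrow_given_broad = both / df[broad]
        # broad "subsumes" narrow: it almost always accompanies narrow,
        # while narrow does not always accompany broad.
        if p_broad_given_narrow >= threshold and p_narrow_given_broad < 1.0:
            pairs.append((broad, narrow))
    return sorted(pairs)


# Hypothetical keyphrase sets, one per document.
sample_docs = [
    {"soil", "erosion"},
    {"soil", "irrigation"},
    {"soil", "erosion", "tillage"},
    {"soil"},
]

print(induce_hierarchy(sample_docs))  # [('soil', 'erosion')]
```

on the sample data, the only pair that survives the default cutoff is soil as a broader term for erosion. a working tool would draw its terms from extracted keyphrases over a much larger corpus and would pass its suggestions to a subject expert for review rather than applying them directly.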
automated induction of subject vocabularies will be useful to augment and increase the capabilities, flexibility, and interactivity of standard subject vocabularies/schema.55 at the very least, and this is important, these approaches could function to automatically suggest synonyms or new terminology for ongoing vocabularies/schema. they could also be put to use in building entry-level vocabularies that front the vocabularies of the standards.56 they could also be used to aid in the semi-automated or automated repopulation/reworking of the standards, if large-scale, from-the-ground-up reworking is deemed necessary at some point. this would be done on a discipline-by-discipline, subject-by-subject basis.

■ resource discovery, search engines, and your library’s subject portal

library collections, virtual libraries, portals, and internet-enabled catalogs of openly accessible, significant internet resources all function as “hubs” (see part i). along with other types of expert-created hubs, they have played a role in providing most large, sophisticated, commercial search engines with a significant means for modeling and determining high-quality resources and, when those engines are accurate, a considerable portion of their accuracy. though google and others do not detail how their search algorithms work, most advanced crawlers highly weight (give authority to) sites that contain large numbers of links to research and other significant resources, especially when expert created. similarly, resources from specific domains such as .edu, .org, and .gov, and institutions such as libraries, universities, and scholarly societies can be identified and more highly weighted. this is another case of the community’s expertise/authority functioning as a knowledge base that, when offered as a public good (as library-created hubs often are), helps better enable directional tools for these commercial and noncommercial crawlers.
there is nothing wrong with this as long as the community is aware of its contribution and as long as its efforts are recognized by these businesses. expert library-based subject portals often reciprocate usage by using commercial engines for resource discovery, though this usually represents a minor way of collecting because other expert sources are preferred.

■ enumeration of catalysts for, impacts of, and issues in machine assistance in the library community

related to these research and technical developments, the library community needs to think through a great many interrelated and diverse issues and questions regarding (1) impacts of the machine assistance we have been discussing; (2) the possible massive automation of metadata generation and resource discovery in libraries; (3) who will “own” these technologies and ideas; and (4) changes in expectations/roles of metadata practitioners and standards and their stewards, in the following areas:

■ when will machine learning/machine assistance yield reliable, inexpensive, and therefore massive application of metadata on an internet scale that meets librarian and, more importantly, end-user expectations in terms of usefulness? machine assistance should begin to be factored into long-term planning.
■ what will be the effects of this machine amplification in changing the importance/roles/content of subject standards? that is, how and to what degree will a new means and scale of application change these standards generally, how they are perceived and used by end users and librarians, and, therefore, how they are applied by the library community? how might these standards themselves change, in terms of both changes in and approaches to vocabulary and schema? that is to say, how would massive, machine-assisted application in and of itself change the makeup of the vocabulary, schema, and the styles/conventions with which they are applied?
■ how might the roles of the stewards of these standards change, given massive application as well as possible interest on the part of other communities? can library standards penetrate and be effectively used by other information communities? what changes in the standards would be required to achieve this?
■ what are the trade-offs between highly manual or craftsman/guild approaches and highly automated or more industrial approaches to applying metadata? within which contexts, collections, resources, and budgets are these approaches to be best used, either singly or combined in various proportions, in building/expanding a collection? how does each approach best complement the other in library collections?
■ to what degree will changing end-user information usage and access patterns change approaches in regard to collection design and access assumptions, the metadata standards the collections are based upon, and the stewarding organizations of the standards?
■ to what degree may labor and resource savings, as well as the ability to provide for more comprehensive collections, as offered by this technology, dictate changes within the library community in regard to expectations for metadata quality and specificity? in which information-seeking contexts and collections and to what degree will the google-type record or minimal, streamlined dc become, if not a necessity themselves, then a pole toward which library bibliographic metadata evolves?
■ a question self-evident to most but not to all is: to what degree will the nature of the internet itself continue to change our approach to supplying metadata? again, researchers in academic departments no longer need walk across campus to the library by virtue of having many bibliographic details of an object present in a metadata record. increasingly, they can go to the object on the internet and instantly verify the detail for themselves.
should libraries deemphasize data elements/fields that are dependably and quickly end-user verifiable in favor of expending more expertise, time, and resources in gisting/describing the subject, intent, and perhaps even estimated quality or significance of the work?
■ in which specific ways will labor be saved and machines be capable of assisting in resource discovery and metadata generation? that is, what level of automation/semi-automation is acceptable to the community and reliably deployable in production over horizons of one to five years? what level of quality/depth will users accept in metadata designed to occupy the continuum existing between the marc record and the google “record” (this being a large and significant service area; see part i)? how will this technology change old and enable new roles, tasks, and production routines for library subject experts and other staff? how will libraries ramp up and transition into this?
■ will the substantial potential economic advantages of automated or semi-automated generation of library standard metadata such as lcsh/lcc/ddc vocabularies/schema drive a rethinking toward greater uniformity/simplicity/streamlining of these standards and conventions in their application, explicitly with machine application in mind? for example, perhaps only a subset of a whole vocabulary will be used and those that are used will become less detailed and less rich for experts but also—for most end users—less complex and arcane, and more intuitive.57
■ in some ways, the existence of dc is a recognition that this kind of rethinking and streamlining of library description standards, in the interest of representing and providing access to a much larger scale of communities and resources, is already well under way. what are the obstacles to greater usage of dc?
■ what should the balance be in streamlining metadata for automated application, in relation to its current complexity/depth while augmenting with rich text?
from another perspective, what is the balance when considering the oversimplification and loss of descriptive power when using machine methods as compared with that otherwise achievable through use of subject expertise? how will libraries determine best balances of expert and machine in regard to different tasks? how will this be quantified and determined through examination of user retrieval success/satisfaction—with this, in turn, factored against the backdrop of metadata creation costs, full-text data harvesting and retrieval, and the need for collections with much greater reach?
■ as accurate means of metadata and rich-text generation for/from text objects improve, machine assistance will allow a shifting of expertise to provide better collection coverage and expression of subject-domain expertise (e.g., in abstracts). how will this new capability for breadth and depth be defined and used in library collections? for example, will new visual, multimedia, and data objects—which the web has made possible on a mass basis and which libraries generally do not cover well—become a major goal in repurposing expertise since these do not easily lend themselves to machine processing (karen calhoun, pers. comm.)?
■ might streamlining and the usage of multiple depths/types of metadata application first require the acceptance within the community of the concept of the multitiered collection/database that supports multiple levels and types of heterogeneous resources representing differing levels of importance to users?58 or, can this need be met through more fully evolved meta-search approaches?
■ helping to structure this metadata heterogeneity might be the sliding-scale application of varying levels of metadata-generation labor expenditures and amounts/type of metadata, with the lower- and middle-value resources receiving application of streamlined standard vocabularies/schema and rich text, automatically or semi-automatically, at low cost. high-value resources would continue to receive expert-applied, expensively created, complex, and high-quality metadata as well as rich text. libraries already make such distinctions in quality/significance to some degree through purchasing (e.g., departmental collecting profiles/weightings by subject and object type and cost) and order-of-cataloging priority decisions, as well as by student/faculty input on specific items. more specifically, we would need to discuss and develop criteria in determining the core or peripheral value of a resource for its subjects and user communities and then, based on the judgments derived, appropriately apportion amount and type of metadata and expert labor or machine assistance, on a sliding scale. again, while it should be noted that the library community has generally avoided rendering judgments on the possible use/relevance of a resource to a subject community, libraries nevertheless do routinely make general calls that effectively function this way to some degree. in making this judgment, it would be critical to involve resource users. reviewer-researcher, library user, and librarian evaluations for purchases as well as finding tool/collection-usage statistics for the specific subject or author and item all could be woven into the means by which the core weighting of a resource could be assigned and be refined over time via usage. developing this value is important from a library standpoint.
it is a key that may help unlock solutions for some of the community’s bigger challenges, including those revolving around the best marriage of machine assistance with librarian expertise. how do libraries go about making these sliding-scale evaluations with some uniformity, among different collection types and interests, with an eye toward tasking expert and machine?
■ can some of the general end-user search deficiencies commonly acknowledged for lcsh/lcc/ddc be rectified to some extent by automatically/semi-automatically providing rich full-text accompaniment for each record/resource, either in the form of “selected” excerpts verbatim or as processed into significant keyphrases representing this text? how could the presence of this rich text not so much change as augment these standards? for example, rich full-text might be relied upon to contain detail that obviated the need to use certain lcsh subdivisions or other types of marc metadata. could inadequacies/inaccuracies in expert-applied and machine-applied metadata be partially countered, for end-user retrieval purposes, through the presence of rich full-text? rich text, as well as keyphrases/terms and descriptions that serve the same purpose in this context, can now be reliably generated in many cases automatically. what would be the right mix of subject-vocabulary standard metadata and accompanying, selected natural-language text for best end-user success? how might rich-text extraction and searching improve upon searching of whole-object full-text? how much rich text is needed and how distilled should it be? large, whole-object full-text searching can often be a searcher’s quagmire, clouding results rankings and weightings.
■ could a new scale of application and interest on the part of new communities be better catalyzed through the incentive offered by opening up the lcsh/lcc/ddc subject vocabularies/schema on an open-standards/open-source, free-software model?
■ if development of these technologies is constrained with regard to action/inaction on the part of the community and its stewards, will the standards be replaced—or become obsolete—for major existing or prospective sectors of users? if so, what does this mean for the library community?
■ by and for whom is such standard subject vocabulary/schema application technology developed within the community? classifiers are actually trained through great amounts of what, in many cases, is really community-created knowledge in order to apply community-developed schema/vocabularies. smart crawlers and extractors similarly use (have “learned”) collectively created information patterns, derived from open-knowledge bases of various sorts. who should own these tools/models and how open/closed should the programming code/ideas be, considering they could not be built without using the collective wisdom embodied in these knowledge bases? these tools exploit decades of labor by thousands of institutions, whose assumption has generally been that the knowledge base and, by extension, the tools that are built on and benefit from it, are and should remain directly or indirectly, public goods.
■ for whom is machine learning/assistance in collection building patented? the ideas, training corpora, algorithms, and data models discussed need to be observed and protected for the public domain to encourage their widespread and inexpensive availability, as well as their evolution. the u.s. patent and trademark office is now more commonly supporting the patenting of whole, generic processes that have heretofore had one or both feet in the commons, as compared with solely granting patent rights in more discrete areas of original invention. it would be unfortunate to find one day that machine assistance in collection building had been patented.
this is especially an issue, given that there is little machine learning of interest to libraries that does not mine, apply, and extend the stored wisdom and knowledge that the community has built for decades.

■ summary of part iii

it is important to think through and anticipate a great number of issues and concerns—including those of open models and open development—regarding machine-assistance tools (e.g., classifiers, extractors, and related algorithms/models) that generate library standard metadata, and identify and extract useful natural-language data. it is important because these tools could become central to library activities over the next one to five years. reflection here is especially appropriate, given the degree to which these tools are trained on exemplars from library collections and come to distill and embody models of library metadata, standards, and expertise that represent the knowledge created over decades through the effort of a whole community.

it is important to think through what machine-assistance technologies in collection building imply for the future role of the librarian’s expertise. specifically, libraries need to reconceptualize machine-assistance software not as fully automated “ai” but rather as enabling expert-driven, strongly interactive “servo-mechanisms” that semi-automate some work to increase the reach, quality, and user-finding success within library collections. while it will probably start out with ten or fifteen minutes of expert time saved per record by such tools, this is a lot of time saved when aggregated across the entire community and will only increase. and the community needs to think through what this implies for the evolution of library-standard metadata, given that machine assistance will increasingly allow for massive and economic application, if a convergence of machine capabilities and machine-friendly metadata standards is architected.
this large-scale amplification of usage will quite likely involve changing the value/roles of these standards for the community, as well as for the larger communities that may come to use them at the cost of simplification, streamlining, and a greater reliance on end users to verify some of their own metadata details (often interacting directly with the digital resource). the tools also imply a restructuring of expertise and its application in metadata creation in libraries to reflect a division of labor, with semi-automated machine description processes spent on the mass of useful but mid- to lower-value materials; with expert time being spent on high-value resources; and with both types of records residing in the same multitiered, heterogeneous collection.58 finally, needing examination will be the roles of the stewardship organizations in:

■ shepherding the community’s metadata standards during a period of great change;
■ openly evolving the application of metadata standards within the context of machine assignment for the greatest possible good;
■ rapidly evolving the application of metadata standards to retain guidance of and to keep pace with open and proprietary developments in these areas;
■ distilling the metadata knowledge base and wisdom created by the community as this is transformed into the programmatic knowledge (rule bases and models) used by new tools. this knowledge base is a priceless asset for the library community in sustaining service roles in an age of the large-scale advent of commercial-information access, delivery, and ownership.

■ conclusion

this article discusses work over the last several years in machine-learning software and services relevant to collection building in libraries. a number of promising avenues for exploration and research are detailed.
deeper understanding of and more direct involvement in areas of machine learning are urged for libraries in order to reflect advances in the computer sciences and other disciplines as well as to meet changing end-user needs among information seekers.

■ acknowledgements

the author would like to thank the u.s. institute of museum and library services; the library of the university of california at riverside; the national science foundation’s national science digital library; the fund for the improvement of post-secondary education of the u.s. department of education; the librarians association of the university of california; and the computing and communications group of the university of california at riverside for current or past funding support. the author would also like to thank the library of congress; cornell university library; oclc; and the california digital library for providing training data and other assistance for the research. thanks to karen calhoun (cornell university library) and two anonymous readers for some excellent comments and suggestions. finally, the author would like to commend ivia lead programmer johannes ruscheinski, primary author of the data fountains and ivia code bases, for his excellent work over the years, as well as gordon paynter, walt howard, jason scheirer, keith humphries, anthony moralez, paul vander griend, artur kedzierski, margaret mooney, john saylor, laura bartolo, carlos rodriguez, jan herd, carolyn larson, diane hillmann, and ruth jackson for their invaluable contributions to the projects. the views expressed here are solely those of the author and not intended to represent those of the library of the university of california, riverside, our funding agencies, or cooperators.

■ references and notes

1. s. mitchell et al., “ivia: open source virtual library software,” d-lib magazine (january 2003),
http://www.dlib.org/dlib/january03/mitchell/01mitchell.html (accessed oct. 20, 2006); g. paynter, “developing practical automatic metadata assignment and evaluation tools for internet resources,” in proceedings of the 5th acm/ieee joint conference on digital libraries (denver: acm pr., 2005), 291–300 (winner of the jcdl 2005 vannevar bush best paper award), http://ivia.ucr.edu/projects/publications/paynter-2005-jcdl-metadata-assignment.pdf (accessed oct. 20, 2006); s. mitchell, “collaboration enabling internet resource collection-building software and technologies,” library trends 53, no. 4 (may 2005): 604–19; j. mason et al., “infomine: promising directions in virtual library development,” first monday (2000), http://www.firstmonday.dk/issues/issue5_6/mason/ (accessed oct. 20, 2006).
2. s. mitchell, “infomine: the first three years of a virtual library for the biological, agricultural, and medical sciences,” in proceedings of the contributed papers session, biological sciences division, special libraries association annual conference (seattle: special libraries association, 1997).
3. mitchell, “collaboration enabling internet resource collection-building software and technologies.”
4. j. phipps et al., “orchestrating metadata enhancement services: introducing lenny,” in proceedings of dc-2005: international conference on dublin core and metadata applications (madrid, spain: universidad carlos iii de madrid, 2005), http://arxiv.org/pdf/cs.dl/0501083 (accessed oct. 20, 2006).
5. mason et al., “infomine: promising directions in virtual library development.”
6. ibid.
7. s. chakrabarti, mining the web: discovering knowledge from hypertext (san francisco: morgan kaufmann, 2003); s. chakrabarti et al., accelerated focused crawling through online relevance feedback, http://www2002.org/cdrom/refereed/336/ (accessed oct. 20, 2006); s. chakrabarti, the structure of broad topics on the web, http://www2002.org/cdrom/refereed/338/index.html (accessed oct. 20, 2006); s.
chakrabarti, integrating the document object model with hyperlinks for enhanced topic distillation and information extraction, http://www10.org/cdrom/papers/489 (accessed oct. 20, 2006).
8. chakrabarti et al., accelerated focused crawling; f. menczer, “mapping the semantics of web text and links,” ieee internet computing 9, no. 3 (may/june 2005): 27–36; f. menczer, g. pant, and p. srinivasan, “topical web crawlers: evaluating adaptive algorithms,” transactions on internet technology 4, no. 4 (2004): 378–; f. menczer, “correlated topologies in citation networks and the web,” european physical journal b 38, no. 2 (march 2004): 211–21.
9. s. mitchell, “data fountains survey,” 2005, http://datafountains.ucr.edu/datafountainssurvey.doc (accessed oct. 20, 2006).
10. a. culotta and a. mccallum, “confidence estimation for information extraction,” in proceedings of human language technology conference and north american chapter of the association for computational linguistics (boston: association for computational linguistics, 2004), http://www.cs.umass.edu/~mccallum/papers/crfcp-hlt04.pdf (accessed oct. 20, 2006); f. peng and a. mccallum, “accurate information extraction from research papers using conditional random fields,” in proceedings of the human language technology conference and north american chapter of the association for computational linguistics (2004), http://ciir.cs.umass.edu/pubfiles/ir-329.pdf (accessed oct. 20, 2006); c. sutton and a. mccallum, “an introduction to conditional random fields for relational learning,” in introduction to statistical relational learning, lise getoor and ben taskar, eds. (cambridge, mass.: mit pr., 2006), http://www.cs.umass.edu/~mccallum/papers/crf-tutorial.pdf (accessed oct. 20, 2006).
11. a. mccallum and d.
jensen, “a note on the unification of information extraction and data mining using conditional-probability, relational models,” in proceedings of the ijcai 2003 workshop on learning statistical models from relational data, acapulco, mexico: ijcai, http://www.cs.umass.edu/~mccallum/papers/iedatamining-ijcaiws03.pdf (accessed oct. 20, 2006); u. nahm and r. mooney, “a mutually beneficial integration of data mining and information extraction,” in proceedings of the american association for artificial intelligence/innovative applications of artificial intelligence (austin, texas: american association for artificial intelligence, 2000), http://www.cs.utexas.edu/users/ml/papers/discotex-aaai-00.pdf (accessed oct. 20, 2006); r. raina et al., “classification with hybrid generative/discriminative models,” in proceedings of neural information processing systems (2003), http://www.cs.umass.edu/~mccallum/papers/hybrid-nips03.pdf (accessed oct. 20, 2006); g. bouchard and b. triggs, “the trade-off between generative and discriminative classifiers,” compstat 2004 (prague: springer, 2004), http://lear.inrialpes.fr/pubs/2004/bt04/bouchard-compstat04.pdf (accessed oct. 20, 2006).
12. mccallum and jensen, “a note on the unification of information extraction.”
13. n. eiron and k. mccurley, “untangling compound documents on the web,” in conference on hypertext (nottingham, uk: acm conference on hypertext and hypermedia, 2003), http://citeseer.ist.psu.edu/eiron03untangling.html (accessed oct. 20, 2006), http://www.almaden.ibm.com/cs/people/mccurley/pdfs/pdf.pdf (accessed oct. 20, 2006); p. dimitriev et al., “as we may perceive: inferring logical documents from hypertext,” presented at ht 2005, 16th acm conference on hypertext and hypermedia (salzburg: acm, 2005); k.
tajima, “finding context paths for web pages,” in proceedings of acm hypertext (darmstadt, germany: acm, 1999), http://www.jaist.ac.jp/~tajima/papers/ht99www.pdf (accessed oct. 20, 2006); k. tajima et al., “discovery and retrieval of logical information units in web,” in proceedings of the workshop of organizing web space (in conjunction with acm conference on digital libraries) (berkeley, calif.: acm, 1999), 13–23, http://www.jaist.ac.jp/~tajima/papers/wows99www.pdf (accessed oct. 20, 2006); e. de lara et al., “a characterization of compound documents on the web,” tr99-351, university of toronto (1999), http://www.cs.toronto.edu/~delara/papers/compdoc.pdf (accessed oct. 20, 2006), http://www.cs.toronto.edu/~delara/papers/compdoc_html/ (accessed oct. 20, 2006); l. xiaoli et al., “web search based on micro information units,” (honolulu, hawaii: eleventh international world wide web conference, 2002), http://www2002.org/cdrom/poster/78.pdf (accessed oct. 20, 2006); w. lee et al., retrieval and organizing web pages by information unit, http://www10.org/cdrom/papers/466/ (accessed oct. 20, 2006).
14. tajima et al., “discovery and retrieval of logical information units in web”; xiaoli et al., “web search based on micro information units”; lee et al., retrieval and organizing web pages.
15. r. mihalcea, “graph-based ranking algorithms for sentence extraction, applied to text summarization,” in proceedings of the 42nd annual meeting of the association for computational linguistics, companion volume (barcelona, spain: association for computational linguistics, 2004), http://www.cs.unt.edu/~rada/papers/mihalcea.acl2004.pdf (accessed oct. 20, 2006); r. mihalcea and p.
tarau, “textrank: bringing order into texts,” in proceedings of the conference on empirical methods in natural language processing (barcelona, spain: empirical methods in natural language processing, 2004), http://www.cs.unt.edu/~rada/papers/mihalcea.emnlp04.pdf (accessed oct. 20, 2006); r. mihalcea, p. tarau, and e. figa, “pagerank on semantic networks, with application to word sense disambiguation,” in proceedings of the 20th international conference on computational linguistics (geneva, switzerland: coling 2004), http://www.cs.unt.edu/~rada/papers/mihalcea.coling04.pdf (accessed oct. 20, 2006); y. matsuo et al., “keyworld: extracting keywords in a document as a small world,” in proceedings of discovery science (berlin, new york: springer, 2001), 271–81 (lecture notes in computer science, v. 2226), http://www.miv.t.u-tokyo.ac.jp/papers/matsuods01.pdf (accessed oct. 20, 2006); y. matsuo and m. ishizuka, “keyword extraction from a single document using word co-occurrence statistical information,” international journal on artificial intelligence tools 13, no. 1 (2004): 157–69, http://www.miv.t.u-tokyo.ac.jp/papers/matsuoijait04.pdf (accessed oct. 20, 2006); xiaoli et al., “web search based on micro information units”; lee et al., retrieval and organizing web pages; g. forman and ira cohen, “learning from little: comparison of classifiers given little training,” tech report: hpl2004-19r1 20040719 (palo alto, calif.: hewlett-packard research labs., 2004), http://www.hpl.hp.com/techreports/2004/hpl-2004-19r1.pdf (accessed oct. 20, 2006).
16. g. mann et al., “bibliometric impact measures leveraging topic analysis,” (in press), in proceedings of the joint conference on digital libraries (2006), http://www.cs.umass.edu/~mccallum/papers/impact-jcdl06s.pdf (accessed oct. 20, 2006).
17. r. bouckaert and e.
frank, “evaluating the replicability of significance tests for comparing learning algorithms,” in proceedings of the pacific-asia conference on knowledge discovery and data mining (berlin, new york: springer-verlag, 2004), 3–12 (lecture notes in computer science, v. 3056), http://www.cs.waikato.ac.nz/~ml/publications/2004/bouckaert-frank.pdf (accessed oct. 20, 2006); r. bouckaert, “estimating replicability of classifier learning experiments,” in proceedings of the international conference on machine learning (2004), http://www.cs.waikato.ac.nz/~ml/publications/2004/bouckaert-estimating.pdf (accessed oct. 20, 2006); r. caruana and a. niculescu-mizil, “data mining in metric space: an empirical analysis of supervised learning performance criteria,” in kdd-2004: proceedings of the tenth acm sigkdd international conference on knowledge discovery and data mining (new york: acm press, 2004), http://perfs.rocai04.revised.rev1.ps (accessed oct. 20, 2006).
18. j. zhang et al., “modified logistic regression: an approximation to svm and its application in large-scale text categorization,” in proceedings: twentieth international conference on machine learning (menlo park, calif.: aaai press, 2003), 888–97, http://www.informedia.cs.cmu.edu/documents/icml03zhang.pdf (accessed oct. 20, 2006); y-c. chang, “boosting svm classifiers with logistic regression,” technical report (taipei: institute of statistical science, academia sinica, 2003), http://www.stat.sinica.edu.tw/library/c_tec_rep/2003-03.pdf (accessed oct. 20, 2006); t. zhang and f. oles, “text categorization based on regularized linear classification methods,” information retrieval 4, no. 1 (2001): 5–31, http://www.research.ibm.com/people/t/tzhang/pubs.html (accessed oct. 20, 2006); t. joachims, “svmlight,” (including svmmulticlass, svmstruct, svmhmm) (software, 2005), http://svmlight.joachims.org/ (accessed oct. 20, 2006); c. chang and c-j.
lin, “libsvm,” (software, 2005), http://www.csie.ntu.edu.tw/~cjlin/libsvm/, (accessed oct. 20, 2006); c-w hsu and c-j lin, “bsvm,” (software, 2003), http://www.csie.ntu.edu.tw/~cjlin/bsvm/index .html, (accessed oct. 20, 2006); t. finley and t. joachims, “supervised clustering with support vector machines,” in proceedings of the international conference on machine learning (new york: acm press, 2005), http://www.cs.cornell.edu/ people/tj/publications/finley_joachims_05a.pdf, (accessed oct. 20, 2006); i. tsochantaridis et al., “support vector machine learning for interdependent and structured output spaces,” in proceedings of the international conference on machine learning (new york: acm press, 2004), http://www.cs.cornell.edu/ people/tj/publications/tsochantaridis_etal_04a.pdf, (accessed oct. 20, 2006) ; s. godbole and s. sarawagi, “discriminative methods for multi-labeled classification,” in proceedings of the pacific-asia conferences on knowledge discovery and data mining (2004), http://www.it.iitb.ac.in/~shantanu/work/pakdd04 .pdf, (accessed oct. 20, 2006); l. cai and t. hofmann, “hierarchical document categorization with support vector machines,” in proceedings of the acm 13th conference on information and knowlmachine assistance in collection building | mitchell 213 edge management (2004), http://www.cs.brown.edu/people/ th/publications.html, (accessed oct. 20, 2006); t. hofmann et al., “learning with taxonomies: classifying documents and words,” in proceedings of the workshop on syntax, semantics, and statistics, neural information processing (2003), http:// www.cs.brown.edu/people/ th/publications.html, (accessed oct. 20, 2006); a. tveit, “empirical comparison of accuracy and performance for the mipsvm classifier with existing classifiers,” technical report, division of intelligent systems, department of computer and information science, norwegian university of science and technology. 
(trondheim, norway, 2003), http://www.idi.ntnu.no/~amundt/publications/2003/ mipsvmclassificationcomparison.pdf, (accessed oct. 20, 2006); c-w hsu and c-j lin, “a comparison of methods for multiclass support vector machines,” ieee transactions on neural networks 13, no. 2 (2002): 415–25, http://www.csie.ntu .edu.tw/~cjlin/papers/multisvm.pdf, (accessed oct. 20, 2006). 19. p. komarek, “logistic regression for data mining and high-dimensional classification” (ph.d. thesis, carnegie mellon university, 2004), 138; p. komarek and a. moore, “fast robust logistic regression for large sparse data sets with binary outputs,” proceedings of the ninth international workshop on artificial intelligence and statistics. january 3–6, 2003, hyatt hotel, key west, florida, ed. by christopher m. bishop and brendan j. frey. http://research.microsoft.com/ conferences/aistats2003/proceedings/174.pdf (accessed nov. 23, 2006); a. popescul et al., “towards structural logistic regression: combining relational and statistical learning,” in mrdm 2002: workshop on multi-relational data mining, http://www-ai.ijs.si/sasodzeroski/mrdm2002/proceed ings/popesul.pdf (accessed nov. 23, 2006); j. zhang and y. yang, “probabilistic score estimation with piecewise logistic regression,” in proceedings: twenty-first international conference on machine learning (menlo park, calif.: aaai press, 2004), http://www-2.cs.cmu.edu/~jianzhan/papers/icml04zhang .pdf, (accessed oct. 20, 2006); zhang et al., “modified logistic regression”; zhang and oles, “text categorization”; multi-class lr is discussed in zhang et al., 2003, and chang, 2003 (reference 18). 20. some recent work on nb can be seen in j. rennie, “tackling the poor assumptions of naive bayes text classifiers,” in t. fawcett and n. mishra, eds., proceedings of the 20th international conference on machine learning (washington, d.c.: aaai pr., 2003), 616–23, http://haystack.lcs.mit.edu/papers/rennie .icml03.pdf, (accessed oct. 20, 2006); k. 
schneider, “techniques for improving the performance of naive bayes for text classification,” in computational linguistics and intelligent text processing: sixth international conference, cicling2005, mexico city, mexico, february 13–19, 2005: proceedings (new york: springer, 2005). (lecture notes in computer science, 3406). 682–93, http://www.phil.uni-passau.de/linguistik/ schneider/pub/cicling2005.html, (accessed oct. 20, 2006); e. frank et al., “locally weighted naive bayes,” in proceedings of the 19th conference in uncertainty in artificial intelligence (acapulco: morgan kaufmann, 2003), 249–56, http://www .cs.waikato.ac.nz/~eibe/pubs/uai_200.ps.gz, (accessed oct. 20, 2006); g. webb et al., “not so naive bayes: aggregating one-dependence estimators,” machine learning 58, no. 1 (jan. 2005): 5–24, http://www.csse.monash.edu.au/~webb/files/ webbboughtonwang05.pdf, (accessed oct. 20, 2006); e. keogh and m. pazzani, “learning the structure of augmented bayesian classifiers,” international journal on artificial intelligence tools 11, no. 4 (2002): 587–601, http://www.ics.uci.edu/~pazzani/ publications/tools.pdf (accessed oct. 20, 2006). 21. mccallum and jensen, “a note on the unification of information extraction and data mining”; joachims, “svmlight”; y. altun et al., “hidden markov support vector machines,” in proceedings of the 20th international conference on machine learning (menlo park, calif.: aaai press, 2003), http://www.cs.brown.edu/people/th/publications.html (accessed oct. 20, 2006); a. ganapathiraju et al., “hybrid svm/hmm architectures for speech recognition,” in advances in neural information processing systems 13: proceedings of the 2000 conference (cambridge, mass.: mit press, 2001), http://www.nist.gov/speech/publications/tw00/pdf/cp210 .pdf (accessed oct. 20, 2006); d. freitag and a. 
mccallum, “information extraction with hmm structures learned by stochastic optimization,” in proceedings of the 18th conference on artificial intelligence (austin, tx.: aaai press, 2000) http:// www.cs.umass.edu/~mccallum/papers/iehill-aaai2000s .ps (accessed oct. 20, 2006); s. basu et al., “a probabilistic framework for semi-supervised clustering,” in proceedings of the 10th acm sigkdd international conference on knowledge discovery and data mining (seattle, wash.: 2004), 59– 68, http://www.cs.utexas.edu/users/ml/papers/semi-kdd -04.pdf, (accessed oct. 20, 2006). 22. t. liu et al., “efficient exact knn and nonparametric classification in high dimensions,” in advances in neural information processing systems 15: proceedings of the 2002 conference (cambridge, mass.: mit press, 2001). http://www .autonlab.org/autonweb/showpaper.jsp?id=liu-knn, (accessed oct. 20, 2006); g. guo et al., “knn model-based approach in classification,” in lecture notes in computer science, vol. 2888 (heidelberg: springer berlin, 2003), 986–96, http://www .icons.rodan.pl/publications/%5bguo2003%5d.pdf (accessed oct. 20, 2006) 23. bouckaert and frank, “evaluating the replicability of significance tests”; bouckaert, “estimating replicability of classifier learning experiments”; caruana and niculescu-mizil, “data mining in metric space”; r. caruana and t. joachims, “perf (data mining evaluation software),” in proceedings of the conference on knowledge discovery and data mining (2004). http://kodiak.cs.cornell.edu/kddcup/software.html (accessed oct. 20, 2006); paynter, “developing practical automatic metadata.” 24. raina et al., “classification with hybrid generative/ discriminative models”; bouchard and triggs, “the trade-off between generative and discriminative classifiers.” 25. ibid; zhang et al., “modified logistic regression”; chang, “boosting svm classifiers with logistic regression”; joachims, 214 information technology and libraries | december 2006 “svmlight”; l. 
shih et al., “not too hot, not too cold: the bundled svm is just right!” in proceedings of the icml-2002 workshop on text learning (2002). http://people.csail.mit.edu/ u/j/jrennie/public_html/papers/icml02-bundled.pdf (accessed oct. 20, 2006); f. fukumoto and y. suzuki, “manipulating large corpora for text classification,” in proceedings of the conference on empirical methods in natural-language processing (philadelphia: association for computational linguistics, 2002), 196–203, http://acl.ldc.upenn.edu/w/w02/w02-1026.pdf (accessed oct. 20, 2006); altun et al., “hidden markov support vector machines; ganapathiraju et al., “hybrid svm/hmm architectures”; liu et al., “efficient exact k-nn”; a. ng and m. jordan, “on discriminative versus generative classifiers: a comparison of logistic regression and naive bayes,” in advances in neural information processing systems 14: proceedings of the 2001 conference (cambridge, mass.: mit press, 2002), http:// www.robotics.stanford.edu/~ang/ papers/nips01-discriminativegenerative.ps (accessed oct. 20, 2006); k. nigam et al., “text classification from labeled and unlabeled documents using em,” machine learning 39, nos. 2/3 (2000): 103–34, http://www .kamalnigam.com/papers/emcat-mlj99.pdf (accessed oct. 20, 2006). 26. g. valentini and f. masulli, “ensembles of learning machines,” in neural nets wirn vietri-02, series lecture notes in computer sciences, m. marinaro and r. tagliaferri, eds. (heidelberg: springer-verlag, 2002), http://www.disi.unige.it/ person/masullif/papers/masulli-wirn02.pdf (accessed oct. 20, 2006). 27. ibid.; r. caruana et al., “ensemble selection from libraries of models” in proceedings: twenty-first international conference on machine learning (menlo park, calif.: aaai press, 2004). http://www.cs.cornell.edu/~alexn/shotgun.icml04.revised. rev2.pdf (accessed oct. 20, 2006); g. 
tsoumakas, “effective voting of heterogeneous classifiers,” in machine learning ecml 2004: 15th european conference on machine learning, pisa, italy, september 20–24, 2004: proceedings. (berlin, new york: springer, 2004), http://users.auth.gr/~greg/publications/tsoumakas -ecml2004.pdf (accessed oct. 20, 2006); j. fürnkranz, “on the use of fast sub-sampling estimates for algorithm recommendation,” technical report tr-2002-36 (wien: österreichisches forschungsinstitut für artificial intelligence, 2002), http://www .ofai.at/cgi-bin/get-tr?paper=oefai-tr-2002-36.pdf (accessed oct. 20, 2006); a. seewald, 2002. “meta-learning for stacked classification,” (extended version) in proceedings of the 2nd international workshop on integration and collaboration aspects of data mining, decision support, and meta-learning (university of helsinki, department of computer science, report b-2002-3, 2002), http:// www.ofai.at/cgi-bin/get-tr?download=1&paper=oefai-tr-2002 -05.pdf (accessed oct. 20, 2006); a. seewald and j. fürnkranz, “an evaluation of grading classifiers,” in advances in intelligent data analysis: proceedings of the 4th international symposium (lisbon, portugal: springer-verlag, 2001), http://www.ofai.at/ cgi-bin/get-tr?paper=oefai-tr-2001-01.pdf (accessed oct. 20, 2006); p. bennett et al., “the combination of text classifiers using reliability indicators,” technical report. microsoft and information retrieval 8, no. 1 (2005): 67–100, http://research .microsoft.com/~horvitz/tclass_combine.pdf (accessed oct. 20, 2006); y. kim et al., “optimal ensemble construction via metaevolutionary ensembles,” expert systems with applications 30, no. 4 (in press 2006), http://www.informatics.indiana.edu/fil/ papers/mee-eswa.pdf (accessed oct. 20, 2006). 28. s. godbole, “document classification as an internet service: choosing the best classifier” (masters thesis, iit bombay, 2001). http://www.it.iitb.ac.in/~shantanu/work/mtpsg.pdf (accessed oct. 20, 2006). 29. k. liu and h. 
kargupta, “distributed data mining bibliography: release 1.7,” (baltimore: university of maryland, computer science department, 2006), http://www.csee.umbc.edu/ ~hillol/ddmbib/ (accessed oct. 20, 2006); a. prodromidis and p. chan, “meta-learning in distributed data mining systems: issues and approaches,” in advances of distributed data mining, hillol kargupta and philip chan, eds. (menlo park, calif. : aaai/ mit press, 2000). http://www1.cs.columbia.edu/~andreas/ publications/ddmbook.ps.gz (accessed oct. 20, 2006); g. tsoumakas and i. vlahavas, “distributed data mining of large classifier ensembles,” in methods and applications of artificial intelligence: second hellenic conference on ai, setn 2002, thessaloniki, greece, april 11–12, 2002: proceedings, (berlin, new york: springer, 2002), 249–56, http://users.auth.gr/~greg/publications/ddmlce.pdf (accessed oct. 20, 2006); r. khoussainov et al., “grid-enabled weka: a toolkit for machine learning on the grid,” ercim news no. 59, (oct. 2004), http:// www.ercim.org/publication/ercim_news/enw59/khussainov .html (accessed oct. 20, 2006). 30. s. godbole et al., “document classification through interactive supervision of document and term labels,” in knowledge discovery in databases: pkdd 2004: 8th european conference on principles and practice of knowledge discovery in databases, pisa, italy, september 20–24, 2004: proceedings (berlin; new york: springer, 2004), http://www.it.iitb.ac.in/~shantanu/work/ pkdd04.pdf (accessed oct. 20, 2006).; h. yu et al., “pebl: positive example based learning for web page classification using svm,” in kdd-2002: proceedings of the eighth acm sigkdd international conference on knowledge discovery in data mining (new york: acm pr., 2002), 239–48, http://eagle.cs.uiuc.edu/ pubs/2002/pebl-kdd02.pdf (accessed oct. 20, 2006); t. 
kristjannson et al., “interactive information extraction with constrained conditional random fields,” in proceedings: nineteenth national conference on artificial intelligence (aai-04) (menlo park, calif.: aaai press; cambridge, mass.: mit press, 2004), http://www.cs.umass.edu/~mccallum/papers/addrie-aaai04. pdf (accessed oct. 20, 2006); v. tablan et al., “ollie: on-line learning for information extraction,” in proceedings of the hlt-naacl workshop on software engineering and architecture of language technology systems: edmonton, canada: 2003. (new york: acm, 2003), http://gate.ac.uk/sale/hlt03/ollie-sealts.pdf (accessed oct. 20, 2006). 31. godbole et al., “document classification.” 32. ibid.; tablan et al., “ollie: on-line learning for information extraction.” machine assistance in collection building | mitchell 215 33. g. mann et al., “bibliometric impact measures,” (in press). 34. bouckaert and frank, “evaluating the replicability of significance tests”; caruana and niculescu-mizil, “data mining in metric space”; caruana and joachims, “perf (data mining evaluation software).” 35. mann et al., “bibliometric impact measures”; matsuo et al., “keyworld”; matsuo and ishizuka, “keyword extraction from a single document”; lee et al., retrieval and organizing web pages; tajima et al., “discovery and retrieval of logical information.” (see also the sections on hybrid, unified models, and document scale learning and classification, above.) 36. menczer, “mapping the semantics of web text and links.” 37. p. srinivasan et al., “a general evaluation framework for topical crawlers,” information retrieval 8, no. 3 (2005): 417–47, http://www.informatics.indiana.edu/fil/papers/ crawl_framework.pdf (accessed oct. 20, 2006); a. maguitman et al., “algorithmic computation and approximation of semantic similarity,” (in press, 2006). to appear in world wide web journal. http://www.informatics.indiana.edu/fil/papers/ semsim_extended.pdf (accessed oct. 20, 2006). 38. arxiv. 
cornell university library, http://arxiv.org/ (accessed oct. 20, 2006); citeseer.ist (formerly researchindex), http://citeseer.ist.psu.edu/ (accessed oct. 20, 2006); escholarship repository, california digital library, http://repositories.cdlib .org/escholarship/, (accessed oct. 20, 2006); national science foundation, national science digital library, http://nsdl.org/ (accessed oct. 20, 2006); oaister. digital library production service (university of michigan), http://oaister.umdl.umich.edu/ o/oaister/ (accessed oct. 20, 2006); u.s. institute of museum and library services. digital collections and content, http://imlsdcc .grainger.uiuc.edu/ (accessed oct. 20, 2006). 39. k. calhoun, “the changing nature of the catalog and its integration into other discovery tools,” (report to the library of congress, mar. 17, 2006), http://www.loc.gov/catdir/calhoun -report-final.pdf (accessed oct. 20, 2006); mitchell, “collaboration enabling internet resource collection-building software and technologies”; w. wulf, “higher education alert: the railroad is coming,” in educause, publications from the forum for the future of higher education (2002), http://www.educause. edu/ir/library/pdf/ffpiu022.pdf (accessed oct. 20, 2006). 40. university of california libraries, “rethinking how we provide bibliographic services at the university of california,” final report of the bibliographic services task force of the university of california libraries, 2005, http://libraries .universityofcalifor nia.edu/sopag/bstf/final.pdf (accessed oct. 20, 2006). 41. l. dempsey, “libraries and the long tail: some thoughts about libraries in a network age,” d-lib magazine 12, no. 4 (2006), http://www.dlib.org/dlib/april06/dempsey/ 04dempsey.html (accessed oct. 20, 2006). 42. mason, j. et al., “infomine: promising directions in virtual library development,” first monday 5, no. 6 (june 5, 2000), http://www.firstmonday.dk/issues/issue5_6/mason/ (accessed oct. 20, 2006). 43. e. o’neill and l. m. 
chan, “fast: faceted application of subject terminology,” in proceedings of the world information congress, ifla general conference and council (berlin: ifla, 2003). http://www.ifla.org/iv/ifla69/papers/010e-oneill_maichan.pdf (accessed oct. 20, 2006); see also: oclc 2003–2006, “fast: faceted application of subject terminology,” http:// www.oclc.org/research/projects/fast/default.htm) (accessed oct. 20, 2006). 44. m. bates, 2003, “improving user access to library catalog and portal information,” task force recommendation 2.3, final report (washington, d.c.:library of congress, 2003), 30, http:// www.loc.gov/catdir/bibcontrol/2.3batesreport6-03.doc.pdf (accessed oct. 20, 2006). 45. rdn (resource description network), http://www.rdn .ac.uk/projects/eprints-uk/, (accessed oct. 20, 2006); oclc “eprints-uk” (2005), http://www.oclc.org/research/projects/ mswitch/epuk.htm, (accessed oct. 20, 2006). 46. a. macewan, “working with lcsh: the cost of cooperation and the achievement of access: a perspective from the british library,” presented at the ifla general conference, 1998, http://www.ifla.org/iv/ifla64/033-99e.htm (accessed oct. 20, 2006). 47. ibid.; r. larson, “the decline of subject searching: longterm trends and patterns of index use in an online catalog,” journal of the american society for information science 42, no. 3 (1991): 197–215. 48. k. drabenstott et al., “end-user understanding of subject headings in library catalogs,” library resources & technical services 43, no. 3 (jul. 1999): 140–60; bates, “improving user access.” 49. bates, “improving user access,” (see discussion of entry vocabulary). 50. ibid. 51. beat (bibliographic enrichment advisory team, library of congress), “digital tables of contents,” (2005), http://www .loc.gov/catdir/beat/digitoc.html (accessed oct. 20, 2006). 52. d. vizine-goetz, “terminology services, oclc,” (2004), http://www.oclc.org/research/projects/termservices/default .htm (accessed oct. 20, 2006). 53. c. 
fellbaum, wordnet: an electronic lexical database (cambridge, mass.: mit pr., 1998), http://wordnet.princeton.edu/ (accessed oct. 20, 2006); a. csomai, “wordnet bibliography,” (2006). http://lit.csci.unt.edu/~wordnet/ (accessed oct. 20, 2006). 54. bates, “improving user access.” 55. a. maedche and r. volz, “the ontology extraction and maintenance framework: text-to-onto,” in proceedings of the icdm 2001 workshop (san jose, calif.: ieee computer society (2001), http://cui.unige.ch/~hilario/icdm-01/dm-km-final/volz .pdf (accessed oct. 20, 2006); v. parekh, j. gwo, and t. finin, “mining domain specific texts and glossaries to evaluate and enrich domain ontologies,” in proceedings of the 2004 international conference on information and knowledge engineering: ike ‘04 (las vegas: csrea press, 2004), http://ebiquity. umbc.edu/v2.1/paper/html/id/171/ (accessed oct. 20, 2006); 216 information technology and libraries | december 2006 d. sleeman et al., “enabling services for distributed environments: ontology extraction and knowledge base characterization,” in proceedings of workshop on knowledge transformation for the semantic web/fifteenth european conference on artificial intelligence (lyon, france: ecai, 2002), http://www.csd.abdn .ac.uk/~sleeman/published-papers/p129-final-ontomine.pdf (accessed oct. 20, 2006). ; b. omelayenko, “learning of ontologies for the web: the analysis of existent approaches,” in proceedings of the international workshop on web dynamics (london: webdyn, 2001), http://dcs.bbk.ac.uk/webdyn/ webdynpapers/omelayenko.pdf (accessed oct. 20, 2006); r. dhamankar et al., “imap: discovering complex semantic matches between database schemas,” in sigmod 2004: proceedings of the acm sigmod international conference on management of data, june 13–18, 2004, paris, france (new york: association for computing machinery, 2004), http:// www.cs.washington.edu/homes/pedrod/papers/sigmod04 .pdf (accessed oct. 20, 2006); p. 
cassin et al., “ontology extraction for educational knowledge bases,” lecture notes in computer science, vol. 2926 (heidelberg: springer-verlag, 2004), 297–309; revised and invited papers from agent-mediated knowledge management: international symposium (stanford, calif., mar. 24–26, 2003), ftp://mas.cs.umass.edu/pub/cassin _ontology-amkm03.pdf (accessed oct. 20, 2006); t. wang et al., “extracting a domain ontology from linguistic resource based on relatedness measurements,” in the 2005 ieee/wic/ acm international conference on web intelligence: proceedings: september 19–22, compiègne university of technology, france (los alamitos, calif.: ieee computer society, 2005), 345–51, http:// csdl2.computer.org/persagen/dlabstoc.jsp?resourcepath=/ dl/proceedings/&toc=comp/proceedings/wi/2005/2415/00/ 2415toc.xml&doi=10.1109/wi.2005.63 (accessed oct. 20, 2006). 56. bates, “improving user access to library catalog and portal information.” 57. o’neill and chan, “fast: faceted application of subject terminology.” 58. mason, et al., “infomine: promising directions in virtual library development.” microsoft word 9733-16966-4-ce.docx editorial board thoughts: arts into science, technology, engineering, and mathematics – steam, creative abrasion, and the opportunity in libraries today tod colegrove information technologies and libraries | march 2017 4 over the millennia, man’s attempt to understand the universe has been an evolution from the broad to the sharply focused. a wide range of distinctly separate disciplines evolved from the overarching natural philosophy, the study of nature, of greco-roman antiquity: anatomy and astronomy through botany, mathematics, and zoology among many others. similarly, the arts, humanities, and engineering developed from broad over-arching interest into tightly focused disciplines that today are distinctly separate. 
as these legitimate divisions formed, grew, and developed into ever-deepening specialty, they enabled correspondingly deeper study and discovery1; in response, the supporting collections of the library divided and grew to reflect that increasing complexity.

libraries have long been about the organization of, and access to, information resources. subject classification systems in use today, such as the dewey decimal system, are designed to group like items with like, albeit under broad overarching topics. a perhaps inevitable result for print collections housed under such a classification system is the physical isolation of items and, by extension, of the individuals researching those topics from one another. under the library of congress system, for example, items categorized as “geography” are physically removed from those in “science,” and further still from “technology.” end-users benefit from the possibility of serendipitous discovery while browsing shelves nearby, even as they are effectively shielded from exposure to distracting topics outside of their immediate focus.

recent years have witnessed a rediscovery of, and renewed interest in, the fundamental role the library can have in the creation of knowledge, learning, and innovation among its members. as collections shift from print to electronic, libraries are increasingly less bound to the physical constraints imposed by their print collections. rather than a continued focus on hyper-specialization and separation, we have the opportunity to rethink the library: exploring novel configurations and services that might better support its community, and embracing emerging roles of trans-disciplinary collaboration and innovation.

tod colegrove (pcolegrove@unr.edu), a member of the ital editorial board, is head of delamare science & engineering library, university of nevada, reno. https://doi.org/10.6017/ital.v36i1.9733

the library as intersection

libraries reflect the institutional and organizational structures of their communities, even as the physical organization of the structures built to house print collections mirrors the classification system in use. academic libraries are perhaps most entrenched in the structural division: rather than intrinsically promoting collaboration and discovery across disciplines, the organization of print collections, and typically the spaces around them, is designed to foster increased focus and specialization. in the branch libraries of a college or university, which are specialized almost to the exclusion of other areas of study altogether, this division can reach a pinnacle: libraries and collections devoted exclusively to engineering, science, music, and other topics exist on campuses across the country. with that division amplified by the separation and clustering of faculty and researchers, typically by department and discipline, it becomes entirely possible for individuals to “spend a lifetime working in a particular narrow field and never come into contact with the wider context of his or her study.”2

the library is also one of the few places in any community where individuals from a variety of backgrounds and specialties can naturally cross paths with one another. at a college or university, students and faculty from one discipline might otherwise rarely encounter those from other disciplines. whether the library is public, school, or academic, outside of it individuals and groups are typically isolated from one another physically, with little opportunity to interact organically. without active intervention and deliberate effort on the part of the library, opportunities for creative abrasion3 and trans-disciplinary collaboration become virtually nonexistent, and its potential to “unleash the creative potential that is latent in a collection of unlike-minded individuals”4 goes untapped.
leveraged properly, however, the intersection of interests and expertise that occurs naturally within the neutral spaces of the library can become a powerful tool that supports not only research but also creativity and innovation, as a place where ideas and viewpoints can collide, building on one another:

“for most of us, the best chance to innovate lies at the intersection. not only do we have a greater chance of finding remarkable idea combinations there, we will also find many more of them.... the explosion of remarkable ideas is what happened in florence during the renaissance, and it suggests something very important. if we can just reach an intersection of disciplines or cultures, we will have a greater chance of innovating, simply because there are so many unusual ideas to go around.”5

difficult and scary

the problem? “stimulating creative abrasion is difficult and scary because we are far more comfortable being with folks like us.”6 and yet a quick review of the literature reveals that knowledge creation, innovation, and success are inextricably linked,7 with the fundamental understanding of their connection having undergone a dramatic shift: “knowledge is in fact essential to innovate, and while this might sound obvious today, putting knowledge and innovation and not physical assets at the centre of competitive advantage was a tremendous change.”8

as our libraries move toward embracing an even more active role within our communities, our organizational priorities are undergoing similarly dramatic shifts: support for knowledge creation and innovation becomes more central, even as physical assets shift toward a supporting, even peripheral, role. libraries, as fundamentally neutral hubs of diverse communities, are uniquely positioned to be able to cultivate creative abrasion within and among their communities, fostering not only knowledge creation, but innovation and success.
indeed, the combination of physical, electronic, and staff assets can be the raw stuff by which trans-disciplinary engagement is encouraged. the active cultivation and support of creative abrasion, with direct linkage to desired outcomes, becomes arguably one of the most vital services the library can provide its community.

rather than deepening the cycle of hyper-specialization, the emergence of makerspaces in our libraries is one example of a trend toward enabling libraries to broaden and embrace that support. building on the intellectual diversity within the spaces of the library, staff members, volunteers, and fellow community members can serve as catalysts, triggering groups to “do something with that variety”9 by engaging across traditional boundaries. indeed, “by deliberately creating diverse organizations and explicitly helping team members appreciate thinking-styles different than their own, creative abrasion can result in successful innovation.”10 strategic placement and staff support of makerspace activity can dramatically increase the opportunity for creative abrasion and, by extension, the resulting knowledge creation, creativity, and innovation.

arts bring a fundamental literacy and resource to stem

in recent years, greater emphasis on students acquiring stem (science, technology, engineering, and math) skills has raised the topic to be one of the most central issues in education. considered a key solution to improving the competitiveness of american students on the global stage, the approach of stem education shares the common goal of breaking down the artificial barriers that exist even within the separate disciplines of sciences, technology, engineering, and math; in short, increasing the diversity of the learning environment.
proponents of steam go further by suggesting that adding art into the mix can bring new energy and language to the table, “sparking curiosity, experimentation, and the desire to discover the unknown in students.”11 federal agencies such as the u.s. department of education and the national science foundation have funded and underwritten a number of grants, conferences, and workshops in the field, including the seminal forum hosted by the rhode island school of design (risd), “bridging stem to steam: developing new frameworks for art-science-design pedagogy.”12 john maeda, the president of risd, identifies a direct connection between the approach and the creativity and success of the late apple co-founder steve jobs, with steam support “a pathway to enhance u.s. economic competitiveness.”13

proponents go further, arguing the arts bring both a fundamental literacy and resource to the stem disciplines, providing “innovations through analogies, models, skills, structures, techniques, methods, and knowledge.”14 consider the findings of a study of nobel prize winners in the sciences, members of the royal society, and the u.s. national academy of sciences. nobel laureates were:

twenty-five times as likely as an average scientist to sing, dance, or act;
seventeen times as likely to be an artist;
twelve times more likely to write poetry and literature;
eight times more likely to do woodworking or some other craft;
four times as likely to be a musician; and
twice as likely to be a photographer.15

from the standpoint of creative abrasion, welcoming the “a” of art into the library support of stem disciplines increases the diversity of the library, and by extension the opportunity for creative abrasion.
from aristotle and pythagoras through galileo galilei and leonardo da vinci to benjamin franklin, richard feynman, and noam chomsky, a long list of individuals of wide-ranging genius hints at a potential left largely untapped by our traditional approach. connections between stem disciplines, art, and the innovation arising directly out of their creative abrasion surround us: the electronic screens used on a wide range of technology, including computers, televisions, and cell phones, are the result of a collaboration between a series of painter-scientists and post-impressionist artists such as seurat: a combination of red, green, and blue dots generates full-spectrum images in a way not unlike that of the artistic technique of pointillism. the electricity to drive that technology is understood, in part, due to early work by franklin, even as he laid the foundations of the free public library with the opening of america’s first lending library and pursued a broad range of parallel interests. the stitches used in medical surgery are the result of nobel laureate alexis carrel taking his knowledge of lace making from a traditional arena into the operating room. american inventors “samuel morse (telegraph) and robert fulton (steam ship) were among the most prominent american artists before they turned to inventing.”16 in short, “increasing success in science is accompanied by developed ability in other fields such as the fine arts.”17 rather than isolated in monastic study, “almost all nobel laureates in the sciences are actively engaged in arts as adults.”18 perhaps surprisingly, rather than being rewarded by an ever-increasing focus and hyper-specialization, genius in the sciences seems tied to individuals’ activity in the arts and crafts. the study’s authors cite three different nobel prize winners, including j. h.
van’t hoff’s 1878 speculation that scientific imagination is correlated with creative activities outside of science,19 going on to detail similar findings from general studies dating back over a century. of even more seminal interest, the authors point to a similar connection for adolescents and young adults, where milgram and colleagues20 found “having at least one persistent and intellectually stimulating hobby is a better predictor of career success in any discipline than iq, standardized test scores, or grades.”21
discussion
the connection between individuals holding a multiplicity of interests, trans-disciplinary activity, and success is clear; what is less clear is to what extent we are fostering that connection in our libraries today. the potential is nevertheless tantalizing: a random group of people, thrown together, is not likely to be very creative.
information technologies and libraries | march 2017 8
by going beyond specialization and wading into the deeper waters of supporting and cultivating creative abrasion and avocation among the membership of our libraries, we are fostering success and innovation beyond what might otherwise occur. the decision to catalyze and foster the cross-curricular collaboration that is steam22 is squarely in the hands of the library: in the design of its spaces, and in the interactions of the staff of the library with the communities served. we can choose to actively connect and catalyze across traditional boundaries. as the head of a science and engineering library that was one of the early adopters of makerspace and has been actively exploring the possibilities of steam engagement for several years, i have time and again witnessed the leaps of insight and creativity brought about by creative abrasion. members from across disciplines are engaging with the resources of the library and, with our encouragement, one another in an ever-increasing cycle of knowledge creation, innovation, and success.
the impact is particularly dramatic among individuals from strongly differing backgrounds and disciplines: for example, when an engineering student, who considers themselves to be expert with a particular technology, witnesses and interacts with an art student using that same technology and accomplishing something truly unexpected, even seemingly magical. or when a science student approaching a problem from one perspective realizes a practitioner from a different discipline sees the problem from an entirely different, and yet equally valid, point of view. in each case, it’s as if the worldview of each suddenly melts: shifting and expanding, never to return to its original shape. transformative experiences become the order of the day, even as the informal environment offers a wealth of opportunity to engage with and connect end-users to the more traditional resources of the library. by actively seeking out opportunities to bring art into traditionally stem-focused activity, and vice versa, we are deliberately increasing the diversity of the environment. makerspace services and activities, to the extent they are open and visibly accessible to all, are a natural fit for the spontaneous development of trans-disciplinary collaboration. within the spaces of the library, opportunities to connect individuals around shared avocational interest might range from music and spontaneous performance areas to spaces salted with lego bricks and jigsaw puzzles; the potential connections between our resources and the members of our communities are as diverse as their interests. indeed, when a practitioner from one discipline can interact and engage with others from across the steam spectrum, the world becomes a richer place, and maybe, just maybe, we can fan the flames of curiosity along the way.
references
1. bohm, d., and f. d. peat. 1987. science, order, and creativity: a dramatic new look at the creative roots of science and life. london: bantam.
2. ibid., 18-19.
3. hirshberg, jerry.
1998. the creative priority: driving innovative business in the real world. london: penguin.
4. leonard-barton, dorothy, and walter c. swap. 1999. when sparks fly: harnessing the power of group creativity. boston, massachusetts: harvard business school press books.
5. johansson, frans. 2004. the medici effect: breakthrough insights at the intersection of ideas, concepts, and cultures. boston, massachusetts: harvard business school press, 20.
6. leonard-barton, dorothy, and walter c. swap. 1999. when sparks fly: harnessing the power of group creativity. boston, massachusetts: harvard business school press books, 25.
7. nonaka, ikujiro. 1994. “a dynamic theory of organizational knowledge creation.” organization science 5 (1): 14–37.
8. correia de sousa, milton. 2006. “the sustainable innovation engine.” vine 36 (4): 398–405, accessed february 14, 2017. https://doi.org/10.1108/03055720610716656.
9. leonard-barton, dorothy, and walter c. swap. 1999. when sparks fly: harnessing the power of group creativity. boston, massachusetts: harvard business school press books, 20.
10. adams, karlyn. 2005. the sources of innovation and creativity. education, september, 2005, 33. https://doi.org/10.1007/978-3-8349-9320-5.
11. jolly, anne. 2014. “stem vs. steam: do the arts belong?” education week teacher. http://www.edweek.org/tm/articles/2014/11/18/ctq-jolly-stem-vssteam.html?qs=stem+vs.+steam.
12. rose, christopher, and brian k. smith. 2011. “bridging stem to steam: developing new frameworks for art-science-design pedagogy.” rhode island school district press release.
13. robelen, erik w. 2011. “steam: experts make case for adding arts to stem.” education week. http://www.bmfenterprises.com/aep-arts/wp-content/uploads/2012/02/ed-week-stemto-steam.pdf.
14. root-bernstein, robert. 2011.
“the art of scientific and technological innovations – art of science learning.” http://scienceblogs.com/art_of_science_learning/2011/04/11/the-art-ofscientific-and-tech-1/.
15. ibid.
16. ibid.
17. root-bernstein, robert, lindsay allen, leighanna beach, ragini bhadula, justin fast, chelsea hosey, benjamin kremkow, et al. 2008. “arts foster scientific success: avocations of nobel, national academy, royal society, and sigma xi members.” journal of psychology of science and technology. https://doi.org/10.1891/1939-7054.1.2.51.
18. ibid.
19. van’t hoff, jacobus henricus. 1967. “imagination in science,” molecular biology, biochemistry and biophysics, translated by g. f. springer, 1, springer-verlag, pp. 1-18.
20. milgram, roberta m., and eunsook hong. 1997. “out-of-school activities in gifted adolescents as a predictor of vocational choice and work.” journal of secondary gifted education 8, no. 3: 111. education research complete, ebscohost (accessed february 26, 2017).
21. root-bernstein, robert, lindsay allen, leighanna beach, ragini bhadula, justin fast, chelsea hosey, benjamin kremkow, et al. 2008. “arts foster scientific success: avocations of nobel, national academy, royal society, and sigma xi members.” journal of psychology of science and technology. https://doi.org/10.1891/1939-7054.1.2.51.
22. land, michelle h. 2013. “full steam ahead: the benefits of integrating the arts into stem.” procedia computer science 20. elsevier masson sas: 547–52. https://doi.org/10.1016/j.procs.2013.09.317.

lita president’s message
updates from the 2019 ala midwinter meeting
bohyun kim
information technology and libraries | march 2019 2
bohyun kim (bohyun.kim.ois@gmail.com) is lita president 2018-19 and chief technology officer & associate professor, university of rhode island libraries, kingston, rhode island.
in this president’s message, i would like to provide some updates from the 2019 ala midwinter meeting held in seattle, washington. first, as many of you know, the potential merger of lita with alcts and llama has been temporarily put on hold, because the initial timeline was rather ambitious and there was not enough time to deliberate on and resolve some issues in the transition plan to meet that timeline.1 these updates were also shared at the lita town hall during the midwinter meeting, where many lita members spent time discussing topics such as the draft mission and vision statements for the new division, what makes people feel at home in a division, in which areas lita should redouble its focus, and which activities lita may be able to set aside without losing its identity. town hall participants provided valuable feedback and thoughts. in that feedback, many emphasized the importance of building and retaining a community of library technologists around lita values, programming, resources, advocacy, service activities, and networking opportunities. the merger-related discussion is to resume this spring, and the leadership of lita, alcts, and llama will make every effort to ensure the best future for the three divisions at this time of great flux and change. second, lita is looking into introducing some changes to the lita forum. in the feedback and thoughts gathered at the lita town hall, the lita forum was also mentioned as one of the valuable lita offerings to its members. the origin of the lita forum goes back to lita’s first national conference held in baltimore in 1983.2 since then, the lita forum has become a cherished venue for many library technologists, a place where they meet other like-minded people in the field, learn from one another, share ideas and experience, and look for more ways in which technology can be utilized to better serve libraries and library patrons.
initially, the steering committee hoped that all three divisions would jointly put together a lita forum with a wider range of content, encompassing the interests not only of lita members but also of those in alcts and llama. the forum was to be held in a virtual format, in order to engage more members who cannot easily travel, some time in spring 2020. at the time this idea was conceived more than a year ago, it was assumed that all preparations for the member vote regarding the merger would have been nearly completed by the time of the midwinter meeting. however, the steering committee unfortunately ran out of time for that preparation. merger planning also took up almost the entirety of the time that the leadership and the staff of the three divisions had available. this resulted in an unfortunate delay in proper forum planning. with the merger conversation on hold at this point and the new timeline for the merger likely to be set back by at least a year, the changed circumstances for the forum planning had to be reviewed. after a lively and thoughtful discussion at the midwinter meeting, the lita board decided that, considering how much work remains to be done regarding merger planning, it may not be practical or feasible to have the next lita forum be the first virtual and joint one. however, there was a lot of interest in and excitement about trying a virtual format, since it would allow lita to reach and serve the needs of more lita members than the traditional in-person meeting. it was also pointed out that the virtual format may provide an opportunity for lita to experiment with different and more unconventional conference program formats, which could be a welcome change to lita members. the lita board, however, also acknowledged the value of a physical conference where people get to meet one another in person, which cannot be easily transferred to a virtual conference.
if the virtual conference experiment takes place and is successful, lita may hold its forum alternating every year between two different formats, virtual and physical. planning for and running a fully virtual conference at the scale of a multi-day national forum will require additional time and careful consideration, since it will be the first time the lita forum planning committee and the lita office attempt this. logistics management is likely to be quite different in a virtual conference. attendee expectations and the user experience in a virtual conference will also differ significantly from those in a physical conference. as the first step of this investigation, the lita forum planning committee will explore what the ideal lita virtual forum may look like in terms of programming formats and participant experience. the lita office and the finance advisory committee will also look into the financial side of running the lita forum in a virtual format. at this time, it is not yet determined when the next lita forum will be held and whether it will be a virtual or a physical one. once these investigations are completed, however, the lita board should be able to decide on the most appropriate path towards the next lita forum. stay tuned for what exciting changes may be coming to the lita forum. third, i would like to mention that lita issued a statement regarding the incidents of aggressive behavior, racism, and harassment reported at the 2019 ala midwinter meeting.3 along with the statement, the lita board has decided to commit funds to provide an online bystander / allyship training, which we hope will equip lita members with tools that empower active and effective allyship, recognize and undo oppressive behaviors and systems, and promote the practice of cultural humility, thereby collectively increasing our collaborative capacity. the lita statement and the board decision were received positively by many lita members.
other ala divisions such as alcts, alsc, asgcla, llama, united, and yalsa have already expressed interest in working together with lita on this, and the lita board is looking into a few options to choose from. more information about the training will be provided soon. lastly, i am thrilled to announce that the lita president’s program at the upcoming ala annual conference in washington, d.c., in june will feature meredith broussard, a data journalist and the author of artificial unintelligence: how computers misunderstand the world, as the speaker. in her book, broussard delves into many problems surrounding techno-chauvinism, which displays blind optimism about technology and an abundant lack of caution about how new technologies will be used. she further details how this simplistic worldview, which prioritizes building new things and efficient code above social conventions and human interactions, often misinterprets a complex social issue as a technical problem and results in a reckless disregard for public safety and the public good.
lita president’s message: updates from the 2019 ala midwinter meeting | kim 4 https://doi.org/10.6017/ital.v38i1.10980
reviewing the early history of computing and digital technology, broussard observes: “we have a small, elite group of men who tend to overestimate their mathematical abilities, who have systematically excluded women and people of color in favor of machines for centuries, who tend to want to make science fiction real, who have little regard for social convention, who don’t believe that social norms or rules apply to them, who have unused piles of government money sitting around, and who have adopted the ideological rhetoric of far-right libertarian anarcho-capitalists.
what could possibly go wrong?”4 i invite all of you to come to this program for more insight and a deeper understanding about what the recent technology innovation involving artificial intelligence (ai) and big data means to our everyday life and where it may be headed. the program information is available in the ala 2019 annual conference scheduler at https://www.eventscribe.com/2019/alaannual/fspopup.asp?mode=presinfo&presentationid=519109.
endnotes
1 the official announcement can be found at the lita blog. see bohyun kim, “update on new division discussions,” lita blog, january 26, 2019, https://litablog.org/2019/01/update-onnew-division-discussions/.
2 stephen r. salmon, “lita’s first twenty-five years: a brief history,” library information technology association (lita), september 28, 2006, http://www.ala.org/lita/about/history/1st25years.
3 “lita’s statement in response to incidents at ala midwinter 2019,” lita blog, february 4, 2019, https://litablog.org/2019/02/litas-statement-in-response-to-incidents-at-alamidwinter-2019/.
4 meredith broussard, artificial unintelligence: how computers misunderstand the world (cambridge, massachusetts: the mit press, 2018), p. 85.

institutional political and fiscal factors in the development of library automation, 1967-71
allen b. veaner: stanford university, stanford, california.
this paper (1) summarizes an investigation into the political and financial factors which inhibited the ready application of computers to individual academic libraries during the period 1967-71, and (2) presents the author's speculations on the future of libraries in a computer dominant society.* technical aspects of system design were specifically excluded from the investigation. twenty-four institutions were visited and approximately 100 persons interviewed.
substantial future change is envisaged in both the structure and function of the library, if the emerging trend of coalescing libraries and computerized "information processing centers" continues.
summary of major factors which inhibited the application of computers to library problems, 1967-71
major factors which inhibited the application of the computer to the library during the period 1967-71 can be categorized under three broad headings: (a) governance, organization, and management of the computer facility; (b) personnel in the computer facility; and (c) deficiencies in the library environment.
a. governance, organization, and management of the computer facility
1. uncertainty over who was in charge of the computer facility.-this problem was partly attributable to the fact that the goals and objectives of the facility were imprecisely stated or not stated at all. often there was no charter, no systematic procedures for establishing priorities, and excessive autonomy by the computer facility. these factors often permitted the facility to operate as a self-directing, self-sustaining entity, responsible to no informed, upper level manager.
* the paper is based on a clr fellowship report to the council on library resources, inc., for the period january-june 1972.
6 journal of library automation vol. 7/1 march 1974
2. effect of high level administrative changes.-in a few instances, the library automation effort was instigated by the president of the institution. he could, in effect, personally direct the allocation of resources. however, whenever a high administrative official leaves, the resulting vacuum is quickly filled by other interests, the atmosphere changes, and his personal program goals dissolve.
3. management inadequacies.-the effects of domination by a technician or special interest group are described below in more detail.
although more and more organizations are putting together influential user groups to point the way toward better management, decision-making responsibility and authority continued to be misplaced in a few institutions which vested authority for technical decisions in a committee of deans who were somewhat remote from current trends in computing because of their administrative responsibilities. (in one institution, it was half jokingly stated that a dean in any hard science could be characterized as suffering from a minimum technological time lag of two years.)
4. lack of long-range planning inclusive of attention to community priorities.-few facilities visited had any written long-range plans, either for the acquisition of hardware, the conversion of older programs, or the involvement of users in systems design. ad hoc arrangements were prevalent.
5. system instability.-this was more the rule than the exception, especially in software, operating systems, hardware configuration, and pricing. wherever an academic computing facility was used for library development, the same broken record always seemed to be playing: the facility was always being taken apart and put together again. of course library development was not the only user affected; complaints arose from all users.
6. biased pricing algorithms.-in the academic facility, student and research use were competitive. hence systems were typically geared to distribute computing resources around the clock in some equitable and rational way. for instance, short student jobs were sometimes given a high priority for rapid turnaround, while long, grinding calculation work was pushed off to the evening or night shift by means of variable pricing schedules or algorithms. a pricing algorithm is basically a load leveling device to smooth out the peaks of over-demand and the valleys of under-utilization which would have occurred in the absence of such controls.
devising pricing algorithms is by no means a simple task, since many factors must be taken into account: the kinds of machine resources available, their respective costs, the data rates at which they can function, market demand, hardware and software available, and system overhead, to name but a few. library jobs tended to suffer in both batch and on-line processing. in the former case, because batch jobs on large data bases took so much time, library work generally could not be done during the prime shift; in the latter case, an on-line library system made substantial demands upon a facility's storage equipment and telecommunications support, and competed with all other on-line users.
7. sense of competition with the library for hard dollars.-this problem, which is related to pricing bias, is detailed further on page 21.
8. scheduling problems.-many of the institutions visited had systems or charts for scheduling production, development, and maintenance. but conversations with system users often verified that schedules were either not met or had been unrealistically established. this was especially the case with development work.
b. personnel in the computer facility
1. selection and evaluation.-inasmuch as the library often did not have the competence to judge personnel nor the ability to generate meaningful specifications, there was generally very little protection from incompetence in this area.
2. elitism: the notion that the masters of the computer are inherently superior to and have better judgment than computer customers.-elitism is a paradox: it can be positive or negative. it is positive when the best brains produce software designs of true genius with respect to function, performance, economy, and reliability; but in its negative manifestation, it is reminiscent of the girl with the curl in the middle of her forehead: "when she was good, she was very, very good; when she was bad, she was horrid."
during the boom years when computer facilities were expanding faster than the supply of competent staff, elitism seemed fairly common in the computer center. the excitement of rapid development, the seemingly unlimited intellectual challenge presented by the powerful apparatus, and high strung dispositions sometimes caused tempers to flare or immaturity to sustain itself beyond a reasonable time. strange hours, strange habits, bizarre behavior, all seemed to conspire against ordered and rational development. fortunately, as the field matures, the negative aspects of elitism are dying; managers now can concentrate on staff development work to turn top intellectual talents toward productive achievement.
3. disinterest.-this factor may be allied to elitism. in some instances, the computer center's staff gave considerable attention to the library during the period immediately following machine installation, when utilization was low. later, the staff's keen interest became "dulled" at the thought of operating a production system. "more interesting jobs" were challenging the programmers and beginning to fill up the machine.
4. fear of the unknown big user.-it was recognized early that the library could be among the computer facility's largest potential customers, perhaps the largest. in some facilities, this recognition may have induced a fear of being taken over or overwhelmed by the user, who would then be in a position to dominate and dictate the direction of further development and operations.
5. fears of an unknown production environment.-simply expressed, a production environment removes much of the stimulus for creative approaches to problem solving unless continuous development is maintained for new systems and new applications.
many of the best programmers did not wish to lose their freedom to innovate and actively resisted participation in establishment of a production environment, with its concomitant requirement of "dull" maintenance support work.
c. deficiencies within the library environment
1. failure to understand in full detail the current manual system.-even where the manual system was understood, there was often an inability to describe it in the clear, unambiguous style essential to system design work. these deficiencies were further compounded by the unwillingness of some librarians to learn how to communicate adequately with computer personnel.
2. inability to communicate design specification.-many did not understand how to put together a specification document; particularly they did not know how to account exhaustively for all possible cases or alternatives. librarians were unaccustomed to defining their data processing requirements quantitatively or with precision, both absolutely indispensable to the computer environment. also, just as the computer facility changed its software environment, many library development efforts were constantly changing their system requirements, a condition which made it all but impossible to program efficiently.
3. failure to understand the development process.-development is a new phenomenon in libraries. most librarians were not educated to comprehend development as an iterative process, characterized by experimentation, error, feedback, and corrective measures. accustomed to the relative stability of long-established procedures (some of which had stood for generations, even centuries), some librarians were baffled by the rapidly changing new technology; others showed impatience and a low tolerance for frustration. many expected development projects to resemble turnkey operations, and the failure of the process to accommodate these expectations produced disappointment and an inability to cope with the computer environment.
4.
failure to recognize the computer as a finite resource.-both librarians and early facility managers seemed to look upon the computer as an inexhaustible resource, the former through lack of sophistication and the latter apparently through myopia or possibly ambition. some managers must have told their users that there was "no way" their equipment could be saturated in the foreseeable future. apparently some library users were naive enough to believe.
5. excessive or unrealistic performance expectations.-few library users understood the relationship between the system specifications and functional results, and fewer still understood the significance of performance specifications. the situation was not assisted by notions of "instantaneous" retrieval pushed by salesmen or the popular press. (the writer recalls vividly how one salesman told him the library could have a crt device for $1 a day! and indeed, the device itself was $1 per day if one cared to do without the keyboard, without cables, installation, control units, teleprocessing overhead, a computer, software, etc.)
6. lack of an established tradition of research and development (r & d) and the lack of venture capital in the library community.-the challenge of the computer may have been largely responsible for activating research and development as a serious and continuous effort in librarianship. inexperience in raising and managing funds for r & d, as well as a general lack of knowledge of computer cost factors, inhibited progress or tended to make the development effort inefficient and full of surprises.
7. human problems.-some libraries having prior experience with small batch systems underestimated the scale of effort for contributing to the design of the large system, selling it to the users, installing it, and training the users.
8.
insufficient support from top management.-in some instances, library management did not accord the automation effort the kind and degree of support essential to success. in particular, some librarians seemed to feel that automation was a temporary affair, definitely of less importance and significance than current manual operations. some did not recognize the sacrifices in regular production that would be necessary and some did not appreciate the continuing nature of development work.
background
two important prerequisites to progress in library automation were money and technical readiness. the government supplied the first, industry the second. the announcement by ibm in 1964 of its system 360 occurred at a fortunate time for the american library community. president johnson's administration had launched enormous programs in support of education. the library services and construction act was soon to channel millions of dollars into library plant expansion and, perhaps more significantly, the higher education act of 1965 was to sponsor research, which until then had only the support of limited funds from the council on library resources, inc., and the national science foundation. (support from the national science foundation was largely, although not exclusively, directed toward discipline-oriented information services; one of the largest nsf grants went to the university of chicago library.) it was the right time to invest in library automation. important milestones were already behind the library community: the national library of medicine's medlars program was well underway, the airlie conference on library automation had been held and its report published ("the white book"), and the library of congress automation feasibility study ("the red book") had appeared.1,2 the first marc format was being tested in the field.
in computer technology, third generation equipment represented major increases in computing power, processing speed, reliability, and capacity to store data in machine-readable form. ibm's sales force was successful beyond imagination in getting system 360s installed in large universities, as well as in business and government. ibm promised a new kind of software - time-sharing - which would virtually eliminate the tremendous mismatch of data processing speed between the human being and the machine. the new methods of spreading computer power through teleprocessing and time-sharing promised to make the computer at least competitive with and possibly an improvement over "antiquated" manual systems of providing rapid access to large and complex data files. within this relatively unknown environment, universities and libraries entered the software development process, which, if successful, could enable them to catch up where they had been hopelessly falling behind. circulation, book purchasing, and technical processing loads in many libraries seemed to double and triple overnight as the country's schools and their programs grew to accommodate expanding enrollments. manual systems that had been reasonably workable and responsive in environments characterized by slow growth demonstrated significant and disturbing defects - the inability to deal with peak loads, or rapidly changing loads. the same effects were felt in administrative and academic computing: a bigger and more complex payroll, more students to register, construction contracts to monitor, more research grants which demanded bigger computers, and so on. these were truly boom years. but in the academic community there was still another force developing which was ultimately to be of even greater significance for libraries than the inconveniences of being unable to handle the housekeeping load: a dramatic rise in the expectations of patrons, especially in the academic community, where computers already abounded.
libraries had come to be seen by some as strongholds of conservatism and expensive luxuries; librarians were faulted for not "putting the card catalog onto magnetic tape," for not implementing automated circulation systems, or otherwise failing to take advantage of new and powerful data processing techniques. the libraries were caught amidst a variety of sometimes conflicting, sometimes complementary factors: the visionary ignorance of the computer salesman, the senior academic officer possessed by the computer dybbuk, a lack of sympathy or understanding among some computer center managers, a lack of appreciation by students and faculty of the complexity of identifying, procuring, and cataloging unique copies of what must be the least standardized product known to man, and their own lukewarm commitment to undertake the hard work required to learn how to use the computer resource. anxieties about job displacement caused some library staff to look upon computers with trepidation, thus further placing the librarian in a defensive position. while these forces were taking shape, the library's bibliographic activities continued to be seriously hampered by inadequate international bibliographic control.** some essential computer hardware, especially the programmable crt terminal with an adequate character set, was either nonexistent or totally unsuitable to library applications. in this institutional context librarians entered the world of computers and data processing.†

purpose

it is the purpose of this report to examine in some detail how internal institutional factors affected the development of computerized bibliographic systems, and especially to consider nontechnical, negative factors: what slowed down or inhibited the applications of computers in librarianship?
this report is not concerned with the merits or demerits of specific systems or their features; indeed, the investigator did not inquire about system specifications. major questions centered about the factors which fostered or hindered the development process, regardless of the merit of a project or system.

scope

investigation was limited almost solely to those institutions considered likely to have large scale, in-house development projects using third generation computer equipment. the majority of places visited were large academic libraries. the time span included in the survey begins approximately in 1967 and ends in 1971. a total of twenty-four institutions was visited and some 100 persons interviewed; a list of the institutions visited is in appendix 1.

methodology

site visits and interviews

arrangements were made to visit four types of individuals: the director of libraries, the head of the library's system development department, the director of the computation center, and whatever principal institutional officer was managerially and/or financially responsible for campus computing. considerable variation was found in the type of person assigned this last responsibility - it could be the provost, the vice-president for academic affairs, or the vice-president for business/financial affairs. choice of the major institutional official to be interviewed was often determined by the pattern of computing in a particular institution, or the facility which supported the development effort.

**implementation of the library of congress' shared cataloging program under title ii of the higher education act of 1965 was soon to alter this situation dramatically.
†the painful trauma libraries and librarians experienced in getting into computers is too well documented to summarize here. perhaps the best summary has been done by stuart-stubbs. a
at first the investigator attempted to utilize a structured questionnaire for interviewing. this very quickly broke down, as the interviewees were generally voluble and ranged widely over many related topics or items which they would have been asked about later. accordingly, after the first few interviews, the formal questionnaire approach was dropped and a simple checklist of major questions kept on a few cards to make sure that each major issue had been addressed. every interviewee received the investigator graciously and none was unwilling to talk; indeed, if anything the opposite was the case - most persons seemed to be eagerly waiting for an opportunity to air their views. visits and interviews occurred during the period january-april 1972.

literature searches

searching the literature on this topic has been extremely frustrating. in the literature of computer science and management, there are many articles on pricing algorithms, machine resource allocation schemes, and issues of managing the computer facility, but none specific to the topic of this report. besides scanning professional literature, the author has for the past year regularly conducted monthly computer searches via the ucla center for information service's sdi service. abstracts and citations were searched in research in education (rie) and current index to journals in education (cije). with respect to problems faced by the library in acquiring computer services, the results have been nil in both cases. the author reluctantly concludes that no major recent studies have yet been published in this sensitive area, although two papers by canadian librarians are very helpful. 3, 4 the national academy of sciences/computer science and engineering board's information systems panel appears to have come closest to identifying the issues in its report, library and information technology: a national systems challenge. still, the comments in that report are highly generalized and do not grapple with specifics.
5

structure of educational computing

most of the visited institutions maintained separate facilities for administrative and academic computing, while a few ran combined facilities or were in the throes of consolidating their facilities. the differences between administrative and academic computing have historical roots deeply embedded in institutional soil. administrative computing is usually an outgrowth of punched card installations first set up for payroll and financial reporting. academic computing, on the other hand, has its origins within the institution's instructional and research programs. typically it has been supported by external grants and contracts and has been oriented toward the "hard" sciences. until the recent dropoff in federal support of higher education, academic computing was a money maker (through the overhead on grants and contracts) while administrative computing was a money spender.

administrative computing

typically very little computational work is done in administrative applications; most of the computer work is associated with input, update, reading records, writing records, and printing reports. except for the payroll application, the consumer group has tended to be somewhat smaller and less transient than the academic group. but to university administrators the computer could do much more than write checks and pay bills. many significant administrative applications had already been installed on second generation equipment: faculty-staff directories, inventories of space, supplies, and equipment, records of grades, course consumption reports, etc. all these tended to expand the user group, increasing competition for the resource. the advent of third generation equipment made it attractive for administrators to think about applications centered around the so-called "integrated data base."
this led to a demand for further new services for the registrar, fund raising and gift solicitation, student services, purchasing, etc. conventional administrative computing - particularly that part of it which generated regular reports - lent itself naturally to batch processing, and indeed many of the early computer installations actually continued established punched card operations, merely using the computer as a faster calculator and printer. the administrative computing shop is typically characterized by (or hopes to be characterized by) great systems stability and dependability, a cautious and measured rate of innovation, and in the opinion of some academic computing types, not much imagination. file integrity, backup and recovery, and timely delivery of its products are prime goals in an administrative computing system. the administrative computing facility very much resembles the library in two important aspects: (1) it is a production system; and (2) it is almost entirely an overhead function, i.e., there is little or no attempt at cost recovery from system users for its services.

academic computing

academic computing is a much different world. it serves a large, vociferous, influential, and mostly technological user community, many of whom are not only competent in programming, but more importantly, possess ready cash. but this is changing: as academic computing expands to service users in the humanities and social sciences rather than mainly those in the "hard" sciences, the user group is growing and it will probably not be long before it embraces the total academic community. in hard science applications, the academic facility typically performs an enormous amount of computing ("number crunching") with a relatively small amount of output.
system backup and recovery is important to the academic computing facility, but file integrity responsibility may often be assigned to the user since such a center sometimes does not maintain the data base but merely provides a service for manipulating it. the main components of academic use are department- or discipline-oriented research and student instruction, the latter being particularly strong if there is a well-established computer science department. software development has customarily played a major role in academic computing and the usual practice was to actively seek out imaginative systems programmers for whom change and system improvement are food and drink. consequently, instability, both in hardware and software, has been more the rule than the exception in the recent past, although as the management of computer facilities matures, this too is changing.

current trends and status

it is obvious from the above that administrative and academic computing have been characterized by diametrically opposed machine and managerial requirements. where they have been combined in the same facility, tensions have prevailed and neither user was happy. in a few instances known to the writer, such combinations have been abortive and a reversion made to divided facilities. but as computing matures it is becoming evident that operational stability is needed for all types of computing, not just administrative computing. additionally, the financial crises now prevalent in institutions of higher education have brought more realistic attitudes to the fore in understanding just what kinds of facilities can be afforded, and how they should be managed. moreover, the economies of scale, the increasing flexibility of hardware, and the growing sophistication of software are now combining to form an environment which can better satisfy all potential users of computers.
there are clear indications that a unified, well-managed shop with competent staff might now economically and efficiently serve a variety of applications, including administrative and academic, on the same facility. however, this is a developing trend and does not correspond with what the writer actually observed during his visits. in situ he saw much evidence that anthony oettinger's observations of some years ago were still valid:

... routine scheduled administrative work and unpredictable experimental work coexist only very uneasily at best, and quite often to the serious detriment of both. where the demands of administrative data processing and education require the same facilities at precisely the same time, the argument is invariably won by whoever pays the bills. finances permitting, the loser sets up an independent installation. 6

indeed, it would not be unreasonable to conclude from the interviews that in most places visited, computing during the period 1967-71 was in a state of disarray. there is abundant and disagreeable evidence of technical incompetence, lack of management ability, ill-spent money, communication failures, and naive and disillusioned users. but it would be a mistake to conclude that the failures in library automation are attributable primarily to computer-oriented personnel or hardware problems - librarians in their own way displayed many of these same failures. it would be another mistake to dwell excessively on the high failure rates observed. in any complex technological endeavor, the rate of failure is dramatically high at the beginning; there is ample evidence here from the aircraft and space industries. indeed, the likelihood of a first success in anything complex - library automation is complex, as we have learned the hard way - is practically nil.
organization and management problems: the academic computing environment

early academic computing facilities were typically run by faculty members in engineering, applied mathematics, computer science, or related fields. this arrangement was satisfactory when computers were small, relatively primitive, and the user community was confined to those few people who could program in machine language or assembly language. as equipment became bigger and more powerful, and as higher level programming languages developed, more and more people learned programming. correspondingly, the task of managing the computer facility grew rapidly in size and scope. the budget of a large computer center in a modern university can easily run to several millions of dollars annually. the manager must balance seemingly innumerable, complex forces: personnel, management, government and vendor relationships, demands from vocal users, establishing priorities, the challenge of hardware advances, marketing, pricing services, balancing the budget, etc. it soon became clear that few faculty members possessed either the multifaceted talents or the experience required for effective management. as the center's budget grew, and particularly as the shift was made from second to third generation equipment, the faculty member tended to be replaced by the technician as manager. unfortunately for many of the facility users, the technician tended to promote his own technical interests in software development or hardware utilization. in some instances, the user community felt that the facility was being run more for the benefit of the staff than for the users. the technician-manager often looked at the computer as his personal machine, much as some faculty members had earlier felt the computer to be their own private preserve.
the vice-president of one university expressed the view that the technician-manager doesn't really have an institutional loyalty tied to the goals and objectives of the academic programs; he is more loyal to the machine or the software. in a school with a long history of computer utilization, there had been no technician in charge of the computer facility for a decade. yet in a school not too far away, an officer indicated that his institution had "made the same mistake twice in a row" by hiring a technician to manage the computer facility. the technician-manager represents a highly personalized management style, one in which goodwill, friendship, or personal interest is the key to effective service. it can hardly represent an arrangement for the successful development and implementation of computerized bibliographic systems. in the third and current organization and management phase of academic computer facilities, the professional manager is in charge. schools are now beginning to see the need to develop formal charters for their computing centers, quasi-legal instruments which will lay out their specific responsibilities as service agencies. a professionally managed service agency eliminates one of the most irritating elements in the allocation of computer resources: personal judgment by the faculty or technician-manager as to the worth of a project, which was so prevalent during earlier management stages. at the time of the interviews, very few institutions actually had such charters, but their need was being recognized. it is now universally accepted that the computer center can no longer be the plaything of the faculty nor the expensive toy of the technician.

organization and management: the administrative environment

because of its historical development the administrative computing facility was usually first run by someone with an accounting or financial background.
(academic computing persons occasionally put disparaging labels on such people as "edp-types" or characterized them as having a "punched card mentality.") the nature of the workload virtually meant that the administrative shop would be set up mainly for batch processing and any data base services provided for other users would involve printed lists. such facilities were found satisfactory by a number of libraries even for applications such as circulation, which produced gigantic lists - probably because they represented a vast improvement over an antiquated, poorly designed, or overloaded manual system. however, there was at least one major technical consideration which had direct political and financial implications for the library which turned to the administrative computing facility for its computer support. this was the library's need to support and manipulate a data base with nearly every data element of variable length - a requirement that was practically nonexistent in administrative computing. some facilities were unable or unwilling to meet this requirement. the move from tape-oriented systems to mixed disc and tape systems on third generation equipment necessitated an upgrading of programming staff, and brought into the administrative shop the same clear-cut distinction between system programmers and application programmers which had emerged earlier in the academic shop. this change in turn demanded appointment of more knowledgeable facility managers, many of whom were drawn from business and industry rather than the ranks of in-house accounting staff. this transitional period was characterized by two enormously challenging parallel efforts: the conversion of existing programs to run on third generation equipment and the development of new applications.
to an extent these responsibilities were competitive, and from this viewpoint it was certainly not a propitious time to embark upon anything as complex as bibliographic data processing. yet numerous workable systems emerged for circulation, book catalogs, ordering and accounting systems, and serials lists. these were not accomplished without anguish as the library did not control the machine resources and often did not control the human resources - the facility manager tended to make his priority decisions to please his boss, who was certainly not the librarian. besides, no application could really take precedence over payroll or accounting in the administrative shop. to the librarian it was more like borrowing another person's car than renting or owning a car: when the resource was urgently needed someone else had first call.

organization and management: the library automation endeavor

a detailed study of this subject is not within the scope of this investigation. however, it will be useful to note that the organization and management of library automation activities demonstrate development phases which closely parallel those in the computing environment:
1. a stage in which the user himself (cf. accountant or faculty member) undertakes to perform the activity. in this stage individual librarians learned programming, did their own design work, wrote, debugged, and ran programs themselves. (this was possible in the "open shop" environment prevalent in many early computer facilities.)
2. a stage in which the technician - in this case a librarian with appropriate public service expertise (for circulation applications) or technical processing knowledge (for acquisitions, cataloging, or serials) - took charge of an organized development effort, hired his own programmers and systems analysts, and negotiated directly with the computer facility.*
3. a stage in which the professional system development manager is hired to oversee the total effort.
such a person is sometimes drawn from business or industry, is a seasoned project manager, and has broad knowledge of computers, especially in the area of costs. such an appointment is more common in the large library, the consortium, or network.

*the technical person need not be a librarian. northwestern university represents a significant instance where a faculty member in computer sciences and electrical engineering undertook the development effort.

human problems associated with rapid change in institutions

some institutions, particularly in their administrative functions, became embroiled in a seemingly endless round of internal psycho-social problems which did not make the environment conducive to problem solving. the move to computerizing manually oriented functions, whether in the library or other parts of an institution, was found to be extremely threatening to established departmental structures. it was consistently reported that the political and emotional aspects of system conversion, both in the library and elsewhere, were much more aggravating than the technical aspects. the problem simply showed up first outside the library because applications of computers occurred there earlier. departments were sometimes unwilling to give up data for computer manipulation for fear that computerization would take jobs away. this phenomenon is not unknown in librarianship where some professionals take an extremely proprietary attitude toward bibliographic data. now pressures from governments, legislatures, and the academic community at large are gradually establishing the concept that some categories of data are corporate, and do not belong to a specific individual or department, or even to an institution, but should be shared through networking or other mechanisms. but the rapidity of microsocial change and its upsetting emotional consequences caught some library leaders unawares.
a considerable reeducational process for both management and labor is required to smooth the transition to the new view.

motivation problems

it is difficult to elicit sound comment concerning motivation (or lack thereof) as a deterrent to progress in library automation. it is an emotional subject and neither the librarians nor the programmers come out "clean." the prima donna computer programmer, much in evidence in the early days of computer center development, is very much on the wane these days. like the spoiled child, the prima donna programmer could only exist where personal interests were permitted to take precedence over social goals - or perhaps where institutional goals for the computer facility had not been clearly articulated or had not yet come into focus. some prima donnas, partly out of ignorance, partly through a stereotyped image of library activities, were inclined to disdainfully dismiss library applications as "trivial," and demand "really challenging" assignments. but the librarians had their prima donnas, too. some had learned enough programming to be a little dangerous and they then felt like peers who could tell the computer center not only what to do but how to do it. at first, few members of the library staff were willing to learn how to articulate their specifications and requirements to the management of a computer facility. most librarians expected some kind of miraculous magic, akin to a wave of the hand, to bring a computer system to reality. very few understood the heuristic nature of development. so there were barriers of status, depth of knowledge, and language - any one of which would have sufficed to kill the development of the good motivation essential to breaking new ground. in the wrong combination they could present an overwhelming conspiracy, for their mutual interaction could only produce polarization and intransigence.
the library and the computer facility the role of similarities and differences for a long time the library has been the "heart of the university." until the advent of the computer, little could challenge the supremacy of the library as the principal resource of an educational institution. even the faculty could be put into second place, since it was difficult to attract high quality faculty without good library resources, and the faculty were to a greater degree transient, for the library was considered "permanent," an investment for all time. the computer represents a new and challenging force in the arena where shrinking resources are allocated among competing academic users. both the library and the computer facility have experienced exceedingly rapid growth in the recent past, concurrent with an expanded demand for services which can easily outstrip available resources. among some of the larger academic libraries, the staff of the computer center may be half or greater than half that of the library. important differences between the two services have recently come into focus. first, most of the services and benefits of the library are intangible. because of this it has always been difficult to measure the cost benefit of the library as an institution, and it is well known that counts of the number of people entering the door or the number of circulations are far from true measures of the library's functional success. the computer, on the other hand, is a relentless accounting engine; computer facilities can produce endless statistics on the number of jobs run, lines printed, terminal hours provided to users, turnaround time, cards punched, etc. the computer's output is extremely tangible and can be more directly and easily related to academic achievement than can library use. a second major difference lies in apparently different financial roles within the institution. 
in most organizations, the library is run as an overhead expense, without any attempt to charge back to users or departments proportional costs of utilization. like air, the library resource is there for anyone to use as much or as little as he pleases; the library gets a "free ride," but the computer center is expected to pay its own way. this dichotomy is often explicitly designated as the "library-bookstore" duo model. furthermore, since the library does not generate much in the way of research grants and contracts, it is looked upon as a consumer rather than a producer of financial resources. in fact, those who support computing in preference to books point to the fact that overhead income generated by computer-related research grants and contracts is shared with the library which may have done little to contribute toward the acquisition of such income! in some institutions the situation has become critical indeed because of the recent substantial reductions in federal support. much political in-fighting has been necessary to maintain current levels of computer activity, and not all such efforts have been successful. some institutions have been forced to cut back on computing power, merge facilities, or combine resources with other institutions. several years ago when the national science foundation imposed an expenditure ceiling on grants, associated overhead income was correspondingly reduced. one computer center director was reported to have suggested that the effect of this overhead cut could be nullified by a simple, internal reallocation of funds, say by taking the needed amount from the budget of another agency on campus of less significance to researchers and scientists, such as the library. this attitude is clear evidence that the library has lost its sacred cow status as a "good thing" on the campus. it too must justify itself.
close examination of the library and the computer facility gives clear evidence that both deal with the same commodity: information. within the recent past several computer facilities have changed their designations to "information processing" facilities or centers. several institutions, notably the university of pittsburgh and columbia university, have coalesced the library and the computer center organizationally or have both units reporting to a vice-president for information services. the recognition and furtherance of this natural linkage may do much to reduce the potentially destructive competition which can characterize the relationship between the two units. there are remarkable growth parallels between the two facilities - the library acquiring and processing more and more books in response to expanded publication patterns, more users, and the growth of new disciplines and interdisciplinary research, while the computation facility moves rapidly from one generation of software and hardware to the next. the expansion of both organizations produces seemingly equal capital-intensive and labor-intensive pressures: library processing staff doubles and triples, while the newly acquired books demand more in the way of housing, whether of the traditional library type or warehouse space; the computer center moves toward more sophisticated hardware, especially terminals and communications, which need to be supported by greater numbers of still more highly qualified systems programmers, communication experts, and user services staff. both services have a marketing problem; but the computation facility, being relatively more dynamic and more interactive (because of terminal services), can be more sensitive and responsive, financially and technically, to its clientele than can the library.
institutional political and fiscal factors/veaner 21
only now, with the emphasis upon computerized bibliographic networking, has the library as an institution begun to approach the marketing strategies and the effective user feedback already well developed in computation facilities.
service capacity, resource utilization and sharing
differences both in service capacity and resource utilization represent a key political issue affecting the future of both libraries and computer facilities. in major universities, the budget for the computer facility is now not far from the library budget in size, and in a few institutions it exceeds the library budget. with the diminution of external grants and contracts, the two organizations compete for the same hard dollars. this economic competition can either drive the two facilities apart, dividing the campus, or cause them to coalesce, as has been the case at columbia and pittsburgh. despite its high operating costs, from the viewpoint of resource utilization, the well-managed computer facility can almost always point to an excellent record.§ no matter how well managed, the research library can never make this claim in the context of its current materials and processing expenditures, much of which by definition is aimed at filling future needs. the library and its patrons cannot "use" all the resources at their command; the library could not even service all the patrons should they demand the use of "all" the resources. in contrast, the computer facility (particularly large on-line systems with interactive capabilities) can be very efficiently utilized even when demand is heavy. thus, to the "objective" eye, it would appear that in the computer facility both the institution and the individual patron get more value for their dollar than they do in the library, which in comparison resembles a bottomless financial pit.
one may counter that apples and oranges are being compared, but the institution which pays their bills nevertheless makes the comparison.
§ in fact, if a computer resource is not much used and isn't "carrying its weight," it can be disposed of, by sale if purchased, or by cancellation if leased.
flexibility, inflexibility, and the future
besides better resource utilization, the computer facility offers the patron far greater flexibility of resource use than can the library. there is no way a large collection of books on the celtic language or the military history of the austro-hungarian empire can help a professor of structural engineering, a student of marine biology, or a researcher in modern urban problems. even the books these people actually need and use cannot easily assist others, as relevant data in them is not indexed or readily available for computer manipulation. the point is that, unlike the library, the computer is a highly elastic universal tool, one that each user can temporarily shape to his own need, replicate the shape later, or if he wishes change the shape at will. the traditional library has no such flexibility; its main bibliographic retrieval device, the card catalog, is especially noted for its high maintenance cost, its limited ability to respond to complex queries, and a general fixity of organization and structure that is ever at variance with changing patron expectations and interests. (if computers can be flexible, why can't the library?) there is much in the library that is not used because it is inaccessible, locked up in an inflexible retrieval tool or unavailable because the state of the art (both in bibliography and computer science) or staffing does not yet permit far deeper access via "librarian-negotiators" and patrons at terminals interacting with large and deeply indexed data bases.
as long as major portions of the library budget and staff are devoted to housekeeping and internal technical processing, the library will look less good, less "cost-beneficial" to the academic community than does the computer facility. but there is growing recognition that both institutions deal with information processing which covers a wide spectrum of time. true, the storage formats differ, but this may be a temporary phenomenon. as progress is made on improved, less expensive conversion of data from analog to digital form and vice-versa, the day may arrive when the library and the computer facility are indistinguishable.
will the library become an information utility?
computer utilities are an important developing trend and it is sometimes suggested that library services could be delivered within the utility model. utilities and libraries as they exist today have very different characteristics. a utility can be defined as a system providing a relatively undifferentiated but tangible service to a mass consumer group and with use charges in accordance with a pricing structure designed for load leveling (i.e., optimization of resource utilization). typically, a utility both wholesales and retails its services. within this definition, a conventional library cannot be construed as a utility; its services are generally intangible and very highly differentiated, indeed chiefly unique, for rarely is one book "just as good as another"; its clientele is not the general public but a highly select group which itself contains highly unequal concentrations of users; and almost no libraries impose user charges in the interest of cost recovery; practically speaking, there is only one united states wholesaler (of bibliographic data): the library of congress. this situation is changing in several respects.
first, the establishment of practical, computerized bibliographic networks has introduced among participating institutions cost sharing schemes closely resembling the load leveling or rate averaging algorithms prevalent among utilities.‖ these new ideas have been readily accepted by libraries and could even become the basis for balancing more equitably the costs of interlibrary loan traffic.
‖ an example of rate averaging is the practice of the ohio college library center to lump total telecommunication cost and prorate it into the membership fee, in effect creating a distance-independent tariff. (this arrangement does not hold outside of ohio.)
second, specialized "information centers" have evolved in certain fields, partially as a consequence of lack of responsiveness (or slow turnaround) by conventional library services, and "for profit" commercial services have been set up. examples of the latter include the european s'il vous plait and its american counterpart, f.i.n.d. (often such commercial services do not hire librarians as they are considered too tradition bound.) a third force which is rather inchoate at the moment may soon take on a recognizable shape: facilities management. under such a scheme, the complete management responsibility for all or part of a function is contracted to an outside vendor. for instance, it is conceivable that some libraries in the near future may have no in-house staff for technical processing. services would be purchased totally from a vendor or obtained from his resident staff, much as computer centers buy specialized expertise through the "resident s.e." (systems engineer). the gradual buildup of computerized bibliographic services offers an excellent opportunity for commercial ventures into turnkey bibliographic operations for libraries.
this would bring the libraries one step closer to the utility concept, as they buy a complete package from a wholesaler who probably services many customers. the traditional library service concepts we know today may undergo drastic changes in financing and in methods of delivery. beyond the commercialized or contractual arrangement for technical processing, which is only one component of the total information flow, lie unknown territory and little explored concepts: use charges for library services (the bookstore model), the "for profit" library, the complete information delivery system integrated with computers, communication satellites, and cable tv. if the computer-based library is to become an information utility, a major accommodation will be needed in the financing arrangements, perhaps in the form of user charges, for no utility can survive without regulated demand. an unlimited, uncontrolled demand for any product or service is untenable, for without regulation (i.e., pricing) demand rapidly outruns supply. in the traditional library, where theoretically every user has the "right" to unlimited demand, this never happens for several reasons: (1) not all potential patrons elect to use the resource; (2) the users must usually go to the library to access the bibliographic apparatus and obtain the materials held by the library; (3) every item in a library collection does not have an equal probability of use; and (4) there is a finite rate at which human beings can "use the resource," i.e., people can read just so fast. none of these self-limiting factors applies to, say, electric power, radio and tv broadcasting, telecommunication services, or similar utilities. the library picture could become quite different if these limitations were removed or mitigated. suppose the patron could access the bibliographic apparatus through his home computer terminal attached to his tv in the "wired city."
further suppose that he could receive selected, short items (where time of delivery is important to him) directly at his tv set, or longer items having less time value as microforms or hard copy delivered by mail or private delivery systems. given such possibilities, the collecting policies of individual "libraries" (if they continue to be called by that name) might well change drastically so that nationally, collections might become much more standardized or "homogenized," increasing the likelihood that individual holdings will have more nearly equal use probabilities. this would imply the need for one or more national and/or regional centers for servicing the less used materials, along with appropriate delivery systems and pricing schedules.
conclusion
work on library automation has proceeded during a highly developmental period in the history of computing. in this sense, librarianship has suffered no worse than any other computer application, nearly all of which have gone through traumas of design, installation, redesign, reprogramming, etc. the main distinction is that in many of these other applications (government, military, industrial, or commercial) there have been far greater resources available to the task and vastly greater experience with the development process. despite the obstacles, progress in computerized bibliographic work has been far more significant and has achieved far more than many librarians, especially those unaccustomed to the development cycle, can appreciate. the snowballing growth of practical consortia and networks along with the successful installation and operation of several on-line bibliographic systems has already changed the face of librarianship in a very short time. like the breaking of the sonic barrier, once the initial difficulty is overcome, further progress is easier.
the computer has successfully achieved what librarians have until recently only paid lip service to: cooperation and wide sharing of an expensive and large resource. though the linear growth model in libraries has been dead for some time, the recognition of this fact has not yet penetrated the entire profession. if libraries are to survive as viable institutions throughout this century and into the next, their leaders must solve the financial, space, and human communication problems inherent in growth. local autonomy, local self-sufficiency, and the "freedom" to avoid, evade, and even undermine national standards now show up as expensive and dangerous luxuries, potentially self-destructive. only through the computer will true library cooperation be possible; only the development of regional and national bibliographic networks, with the assistance of substantial federal funding, can really "save" the library. the computer is actually the library's life insurance and blood plasma. a failure to respond to the challenge of the computer could be fatal, for it is increasingly apparent that patrons growing up in the computer era will not patiently interact with library systems geared to nineteenth-century methods. nothing in the educational system exists to force people to use a given resource; people use the resources which are effective, responsive, and economical. if the computer is a better performer than the library, patrons will go to the computer. this will be particularly the case as computer services become broader in coverage, simpler to use, and unit prices continue to decline. despite the serious and irritating problems associated with learning to use the computer, librarians must continue aggressively to support computer applications; indeed, library leaders can impart no more important message than this to their community leaders.
acknowledgments
i wish to thank the following persons for their support: dr. e. howard brooks, who was vice-provost for academic affairs in 1971, and david c. weber, director of libraries, respectively, stanford university, for granting the leave of absence which enabled me to undertake this project. i acknowledge with thanks the contributions of the following persons who reviewed early drafts of the paper, in many cases making valuable suggestions and in other instances helping me ward off errors: mrs. henriette d. avram, head, marc development office, library of congress; hank epstein, director of project ballots and associate director for library and administrative computing, stanford center for information processing; frederick g. kilgour, executive director, ohio college library center; peter simmons, professor of library science, university of british columbia; carl m. spaulding, program officer, council on library resources, inc.; david c. weber, director of libraries, stanford university.
references
1. barbara evans markuson, ed., libraries and automation; conference on libraries and automation, warrenton, va., 1963. (washington, d.c.: library of congress, 1964).
2. u.s. library of congress, automation and the library of congress; a survey sponsored by the council on library resources, inc. (washington, d.c.: library of congress, 1963).
3. basil stuart-stubbs, "trial by computer: a punched card parable for library administrators," library journal 92:4471-4 (15 dec. 1967).
4. dan mather, "data processing in an academic library: some conclusions and observations," pnla quarterly 32:4-21 (july 1968).
5. libraries and information technology: a national systems challenge; a report to the council on library resources, inc., by the information systems panel, computer science and engineering board. (washington: national academy of sciences, 1972).
6. anthony oettinger, run, computer, run (cambridge, mass.: harvard university press, 1969), p.196.
(these same comments were cited in allen b. veaner's earlier article, "major decision points in library automation," college & research libraries :299-312.
26 journal of library automation vol. 7/1 march 1974
appendix 1
list of institutions visited
university of alberta
university of british columbia
university of chicago
cleveland public library
the college bibliocentre, ontario
university of colorado
columbia university
cornell university
harvard university
university of illinois
indiana university
massachusetts institute of technology
university of michigan
new york public library
northwestern university
ohio college library center
university of pennsylvania
pennsylvania state university
university of pittsburgh
purdue university
simon fraser university
syracuse university
university of toronto
yale university
a computer output microfilm serials list for patron use
william saffady: wayne state university, detroit, michigan.
library literature generally assumes that com is better suited to staff rather than patron use applications. this paper describes a com serials holdings list intended for patron use. the application and conversion from paper to com are described. emphasis is placed on the selection of an appropriate microformat and easily operable viewing equipment as conditions of success for patron use.
as a marriage of dynamic information-handling technologies, computer output microfilm (com) is a systems tool of potentially great significance to librarians. several libraries have reported successful com applications initiated within the last few years.
the two most recent, fischer's description of four com-generated reports used by the los angeles public libraries and bolef's account of a com book catalog at the washington university school of medicine library, stress the time, space, and cost savings so frequently reported in analyses of the advantages of com.1, 2 this article describes the substitution of microfilm for paper as the computer output medium in one of the most common library automation applications, a serials holdings list intended for use by library patrons. it is interesting that, at a time when librarians are insisting on the importance of patron acceptance of technological innovation, the recent literature reports com applications intended solely for staff use. bolef, in fact, lists staff rather than patron use among the characteristics of potentially successful library com applications. the report that follows suggests, however, that careful attention to the selection of an appropriate microformat and viewing equipment can successfully extend the effectiveness of com to include patron-use library automation applications.
the application
the union list of serials in the wayne state university libraries is a computer-generated alphabetical listing, by title, of serials held by the wayne state university library system and some biomedical libraries in the detroit metropolitan area. sullivan describes it as "informative in purpose and conventional in method."3 as with many similar applications, serials holdings were automated in order to unify and disseminate hitherto separate, local records. the list is primarily a location device, giving for each title the location within the library system and information on the holdings at each location. it is updated monthly, the july 1974 issue totalling 1,431 pages.
in paper form, twenty copies produced on an ibm 1403 line printer using four-ply carbon-interleaved forms were distributed for use throughout the library system. the list shares some of the characteristics that have marked other successful com applications.4 it consists of many pages and has a sizeable distribution. quick retrieval of information is essential. use is for reference rather than reading. there is no need to annotate the list and no need for paper copies, although the latter requirement would not rule out the use of com for this particular application. patrons simply consult the list to determine whether the library's holdings include a particular serial and then proceed to the indicated location. it is interesting that serials holdings lists, long recognized as an excellent introductory library automation application, should also prove an excellent first application for com. complexities of format and viewing equipment selection aside, the conversion of output from paper to microfilm presented no problems. since the wayne state university computing and data processing center does not have com capability, the university libraries, after careful consideration of several vendors, contracted with the mark larwood company, a microfilm service bureau equipped with a gould beta com 700l recorder. the beta com is a crt-type com recorder with an uppercase and lowercase character set, forms-overlay capability, proportional spacing, underlining, superscripts, subscripts, italics, and a universal camera capable of producing 16, 35, 70, and 105mm microformats at several reduction ratios. a decisive factor in the selection of this particular vendor was the beta com's dedicated pdp-8/l minicomputer that enables the com recorder to accept an ibm 1403 print tape, thereby greatly simplifying conversion and eliminating the expense of reprogramming.
microformat selection
as ballou notes, discussions of com have tended to concentrate more on the computer than on micrographics, but for a patron-use com application the selection of an appropriate microformat is of the greatest importance.5 however, there has been an unfortunate emphasis placed, both in the literature of micrographics and by vendors, on microfiche, the format now dominating the industry, especially in com applications. such emphasis ignores the fundamental rule of systems design, that form follows function. each of the microformats has strengths and weaknesses that must be analyzed with reference to the application at hand. for a patron-use, com-generated serials holdings list, ease of use with a minimum of patron film handling is a paramount consideration. microfiche is clearly unsuitable for a list of over 1,400 pages. even at 42x reduction, the patron would be forced to choose from among seven fiches, each containing 208 pages. the difficulties of handling and loading, combined with library staff involvement in a program of user instruction, make fiche an unattractive choice. instead, the relatively large size of the holdings list suggests that one of the 16mm roll formats offers the best prospects of containing present size and future growth within a single microform. the disadvantages of the conventional 16mm open spool, chiefly the necessity of threading film onto a take-up reel before viewing, can be minimized by using a magazine-type film housing. the popular cartridge format eliminates much film handling, but cartridge readers are very expensive, necessitating a considerable investment where many readers are required. even with the cartridge, it is still possible for a patron to unwind the film from the take-up reel, necessitating rethreading before viewing. fortunately, microfilm cassettes overcome this difficulty. unlike the cartridge format, 16mm cassettes feature self-contained supply and take-up reels.
the film cannot be completely unwound from the take-up reel and the cassette can be removed from the viewer at any time without rewinding. patron film handling is virtually eliminated. the cassette format has proven very popular with british libraries, where it has been used with satisfactory results in com applications.6
viewing equipment
success in format choice is contingent on the selection of appropriate viewing equipment. as larkworthy and brown point out, the best viewer for patron-use com applications is one that can easily be operated by the least mechanically inclined person.7 fortunately, cassette viewers, while limited in number, tend to be very easy to operate. the viewer chosen for use with the union list of serials, the memorex 1644 autoviewer, features a simple control panel, fixed 24x reduction, easily operated focus and scan knobs, motorized film drive for high-speed searching, and a manual hand control for more precise image positioning. the screen measures eleven by fourteen inches in size, with sufficient brightness for comfortable ambient light viewing. other cassette viewers examined, however satisfactory they might be in other respects, failed to meet the peculiar requirements of this particular application.
discussion
since its introduction in april 1974, the com-generated union list of serials in the wayne state university libraries has enjoyed a satisfactory reception. patrons have learned to consult the com list with little difficulty. the selection of an appropriate microformat and easily operated viewing equipment have kept staff involvement in patron instruction to a minimum. there appears to be no reason for limiting potential library com applications to those used primarily or solely by staff members. given the severity of the current paper shortage, the consequent rise in paper prices, and serious questions about the availability of paper at any price, com merits serious consideration as an alternative output medium for the widest range of library automation applications.
266 journal of library automation vol. 7/4 december 1974
references
1. mary l. fischer, "the use of com at the los angeles public library," the journal of micrographics 6:205-10 (may 1973).
2. doris bolef, "computer-output microfilm," special libraries 65:169-75 (april 1974).
3. howard a. sullivan, "metropolitan detroit's network: wayne state university library's serials automation project," medical library association bulletin 56:269-71 (july 1968).
4. see, for example, auerbach on computer output microfilm (princeton: auerbach publishers, 1972), p.1-10.
5. hubbard w. ballou, "microform technology," in carlos cuadra, ed., annual review of information science and technology, v.8 (washington, d.c.: american society for information science, 1973), p.139.
6. d. r. g. buckle and thomas french, "the application of microform to manual and machine readable catalogues," program 6:187-203 (july 1972).
7. graham larkworthy and cyril brown, "library catalogs on microfilm," library association record 73:231-32 (dec. 1971).
critical success factors for integrated library system implementation in academic libraries: a qualitative study shea-tinn yeh and zhiping walter information technology and libraries | september 2016 27 abstract integrated library systems (ilss) support the entire business operations of an academic library from acquiring and processing library resources to making them available to user communities and preserving them for future use. as libraries’ needs evolve, there is a pressing demand for libraries to migrate from one generation of ils to the next. this complex migration process often requires significant financial and personnel investment, but its success is by no means guaranteed.
we draw on enterprise resource planning and critical success factors (csfs) literature to identify the most salient csfs for ils migration success through a qualitative study with four cases. we found that careful selection process, top management involvement, vendor support, project team competence, staff user involvement, interdepartmental communication, data analysis and conversion, project management and project tracking, staff user education and training, and managing staff user emotions are the most salient csfs that determine the success of a migration project. introduction the first generation of integrated library systems (ilss) were developed specifically for library operations focused on the selection, acquisition, cataloging, and circulation of print collections. as libraries’ nonprint materials steadily grow, the print-centric ilss became less and less efficient in supporting libraries’ daily operations. recent years have seen an emergence of a new generation of ilss, commonly called library services platforms (lsps), that takes into account the management of both print and electronic collections. lsps take advantage of cloud computing and network advancements to provide economies of scale and to allow a library to better share data with other libraries. furthermore, lsps unify the entire suite of library operations to provide efficient workflow at the back end and advanced online discovery tools at the front end for the library.1 given the claimed benefits of the emerging lsp and the fact that vendors are phasing out support for their legacy ilss, we project that more libraries will be migrating to lsps as the systems mature and libraries’ needs evolve. shea-tinn yeh (sheila.yeh@du.edu) is assistant professor and library digital infrastructure and technology coordinator, university of denver libraries. zhiping walter (zhiping.walter@ucdenver.edu) is associate professor, business school, university of colorado denver. 
critical success factors for integrated library system implementation in academic libraries: a qualitative study | yeh and walter | doi:10.6017/ital.v35i2.9255 28 migrating from one generation of ils to another is a significant initiative that affects the entire library operation.2 because of its scale and complexity, the migration project is not always smooth and often fraught with problems, with some projects falling behind migration completion schedule.3, 4, 5 in addition, committing to a new system often results in significant financial and personnel costs for an academic library.6 understandably, there is considerable trepidation before, during, and after the migration process.6, 7 what contributes to a smooth migration process and a successful migration project? this is an urgent question at present and an enduring question for the future. this is because, as libraries continue to evolve, their operations and management needs are destined to outgrow functionalities of the current generation of ils. therefore migration to a new generation of ils is destined to occur periodically for a library. in this research, we study critical success factors (csfs) that contribute to a smooth migration process and a successful migration project defined as on-time and on-budget project completion and a smooth implementation process. to achieve our research goal, we anchor our theoretical foundation in the enterprise resource planning (erp) system-implementation literature.
erp is “business process management software that allows an organization to use a system of integrated applications to manage the business and automate many back office functions related to technology, services and human resources.”9 since a complete ils is formed from a suite of integrated functions to manage a broad range of library processes, it is in fact an erp for libraries.10 a literature review of csfs for erp system implementation success revealed more than ninety csf factors.11, 12 the contribution of our research is in identifying, through qualitative research method, the most salient csfs that contribute to the success of a library system migration project from one generation of ils to another. results of this study can help library administrators to improve the chance of success and decrease the level of anxiety during a migration project now and in the future. the remainder of the article is organized as follows: section 2 reviews erp, ils, lsp, csfs, and information system success measurement described in the literature. section 3 describes the guided interviews that have been conducted to identify the csfs, the results, and the analysis of the results. finally, we offer conclusions and limitations as well as recommend future work. literature review erp is business-management software comprising a suite of integrated applications that an organization can use to collect, store, manage, and interpret data from many business activities, including product planning, manufacturing, service delivery, marketing and sales, and human resources. the core idea of an erp system is to integrate both the data and the process dimensions in a business so that transactions can be monitored and analyzed for planning and strategic purposes.13 modules of the system cover different functions within a company and are linked so users can see what is happening in all areas of the company. 
an erp system can improve a business’s back offices as well as its front-end functions, with both operational and strategic benefits.14 some of the benefits include reliability in information access, data and operations redundancy, data retrieval and reporting efficiency, easy module extension, and internet commerce capability. just like an erp system for a business, a complete library management solution comprises a suite of integrated applications that manage a broad range of library processes including circulation, acquisition, cataloging, electronic resources management, and system administration. lsps, the current generation of library management systems, are designed to manage both physical and digital collections. lsps follow the service-oriented architecture (soa) and can be deployed through multitenant software as a service (saas) distribution model.15 in addition to supporting all library functions, lsps integrate with other university systems, such as student registry and finance, and provide front-end for library patrons in a cloud environment that leverages a global network of systems for discovery of a wide array of resources.16 since an lsp is essentially an enterprise system for library functions, csfs of erp implementation success could guide lsp implementation. csfs are conditions that must be met for an implementation to be successful.17 more than ninety csfs have been identified for erp implementation success.18, 19 those csfs have been classified according to various schemes, but we found the strategic versus tactical classification most relevant to the library context.20 strategic factors address the big picture involving the breakdown of goals into do-able items.
tactical factors, on the other hand, are the methods to accomplish the doable items that lead to achieving the goals.21 by examining the entire list of csfs from both the strategic and the tactical perspectives, we identify top csfs for library-management-solution implementation and migration success, defined as on-time and on-budget delivery as well as smooth implementation process,22, 23 through a qualitative study. method we conducted semi-structured interviews with open-ended questions to identify the most salient csfs for implementation success. since we needed to reduce more than ninety csfs in the literature to a list of most salient csfs in the library context and to potentially identify new csfs, a qualitative-interview approach was more suitable than a quantitative-survey approach. a twostep process was used to arrive at the final list. first, we evaluated all csfs in the literature and identified a subset of csfs that might be most relevant for library-systems implementation.24 second, this csfs subset was used to develop an interview guide for semistructured interviews conducted later to further reduce this subset. open-ended questions were also used during the interviews to elicit additional csfs. an institutional review board (irb) application was submitted and approved. the result of this two-step process is a list of ten csfs discussed in the results section, with nine csfs coming from our initial list and one csf emerging from the interviews. the criterion for recruiting study libraries is that the library has implemented a new lsp within the last three years. this is because the lsp is the current generation of ils, and it is only within the last few years that various lsp vendors began to promote and implement the lsps. 
a recruitment email was sent to libraries listed as adopters on various vendors’ press release sites. participating recipients referred the interview request to appropriate migration team members, whom we later contacted to schedule interviews. this resulted in up to five people from each participating library being interviewed in person or via skype. their positions are listed in table 1. interviews were recorded, transcribed, and cleaned. emails to the same interviewees were used for follow-up questions as needed. after the interviews with each library, qualitative data analysis was performed to identify csfs that emerged from the interviews. interviews continued until no new csfs emerged in the last interview.

in total, staff from four libraries were interviewed between october 2014 and march 2015 about their implementation process and experience from the staff user perspective. the design and implementation of the discovery public interface experience was not part of this inquiry. table 1 summarizes characteristics of the four libraries. case numbers instead of university names are used to protect the identities of participating libraries and interviewees.
| | case 1 | case 2 | case 3 | case 4 |
|---|---|---|---|---|
| type of university | private | public | public | private |
| student population | 11,000+ | 32,000+ | 2,400+ | 2,700+ |
| operating budget | 11 million | 13 million | 1.5 million | 1.3 million |
| library employees | 150 | 400 | 17 | 13.5 |
| project length | 6 months | 9 months | 6 months | 9 months |
| ils used before | millennium | aleph | evergreen | voyager |
| lsp implemented | sierra | alma | sierra | sierra |
| reasons for migration | discontinued vendor system support; servers out of warranty; vendor gave incentives | outdated servers; servers out of warranty | in need of a robust system and provides discovery layer | in need of a modern system demonstrating the library’s moving with the times |
| positions of interviewees | head of systems; module experts | heads of systems | director of library; head of systems | director of library |

table 1. summary of case study site characteristics.

results

the following csfs emerged from interviews: careful selection process, top management involvement, vendor support, project team competence, staff user involvement, interdepartmental communication, data analysis and conversion, project management and project tracking, staff user education and training, managing staff user emotions. we discuss each csf next.

careful selection process

most ilss are commercial, off-the-shelf software systems that can vary dramatically in functionality from system to system.25 for example, some packages are more suitable for large institutions while others are more suitable for smaller ones. to mitigate risks in productivity or transaction loss and to minimize system and implementation costs, a library needs to determine the best “fitness-of-use” system. such a determination is the outcome of a careful selection process.
although there is no commonly accepted technique, method, or tool for this process, all selection processes share common key steps suggested in the literature.26 they are the following as applied to library-systems selection: define stakeholder requirements, search for products, create a short list of most promising candidates based on a set of “must-have” requirements, evaluate the candidates on the short list, and analyze the evaluation data to make a selection. in addition, if the server option was chosen instead of the cloud option, selected hardware needs to satisfy system requirements for the final configuration. careful selection process emerged as a csf that affected implementation outcome for all four libraries. all cases were migrating to an lsp system. some systems can be offered as locally installed systems, which require appropriate in-house and hardware capabilities. case 1 did not consider its it capability when deciding on a turnkey system. as a result, the library experienced difficulties in setting up the infrastructure in-house during the implementation. each of the other three cases considered the candidate system’s compatibility with the legacy system, the match between library needs and system functionalities, system maturity, migration costs, data storage needs, and vendor support before and during the implementation as well as continued vendor support throughout the life of the new system. even though each of the three libraries arrived at its system choice differently, on reflection, interviewees expressed relief and satisfaction in their decisions to choose their respective systems. “we were in the position where our servers were out of date and warranty, needed to be replaced. the servers were too small. we had sizing issues and we couldn’t update to the most recent version of aleph . . . alma being a cloud based solution will eliminate our need to be ‘in the server business.’” (case 2). 
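the key selection steps described above (define requirements, short-list candidates on “must-have” requirements, evaluate the short list, and analyze the evaluation data) can be sketched in code. this is a minimal illustration only, not a method used by any of the case libraries: the system names, feature sets, criteria, ratings, and weights are all hypothetical, and a weighted sum is just one of many possible ways to analyze evaluation data.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """A candidate system; every value used below is hypothetical."""
    name: str
    features: set   # library functions the system supports
    ratings: dict   # criterion -> evaluator rating (1 = poor, 5 = excellent)


def short_list(candidates, must_have):
    """Keep only candidates that support every must-have requirement."""
    return [c for c in candidates if must_have <= c.features]


def weighted_score(candidate, weights):
    """Sum of weight x rating over the evaluation criteria."""
    return sum(weights[k] * candidate.ratings.get(k, 0) for k in weights)


def rank(candidates, weights):
    """Order candidates from highest to lowest weighted score."""
    return sorted(candidates, key=lambda c: weighted_score(c, weights), reverse=True)


# Hypothetical requirements and criteria, loosely echoing those named by interviewees.
# A higher rating is always better (e.g. a high migration_cost rating means low cost).
MUST_HAVE = {"circulation", "cataloging", "acquisition"}
WEIGHTS = {"legacy_compatibility": 2, "vendor_support": 3, "migration_cost": 1}

CANDIDATES = [
    Candidate("system a", {"circulation", "cataloging", "acquisition"},
              {"legacy_compatibility": 4, "vendor_support": 3, "migration_cost": 2}),
    Candidate("system b", {"circulation", "cataloging"},
              {"legacy_compatibility": 5, "vendor_support": 5, "migration_cost": 5}),
    Candidate("system c", {"circulation", "cataloging", "acquisition", "erm"},
              {"legacy_compatibility": 3, "vendor_support": 5, "migration_cost": 4}),
]

if __name__ == "__main__":
    for c in rank(short_list(CANDIDATES, MUST_HAVE), WEIGHTS):
        print(c.name, weighted_score(c, WEIGHTS))
```

in this made-up example, system b is dropped at the short-list stage despite its high ratings because it lacks a must-have function, which is why the must-have filter is applied before any scoring.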
“we went through a very extensive formal process to select this system.” (case 3) top management involvement successful implementation requires strong leadership by executives who understand, support, and champion the project.27 when this involvement is trickled down through organizational hierarchy, it leads to an organizational commitment, which is required for implementation success for complex projects.28, 29 since library-system implementation is a complex project that (if done correctly) will transform the entire library and reposition it for better efficiency, strong leadership is critical as well. critical success factors for integrated library system implementation in academic libraries: a qualitative study | yeh and walter |doi:10.6017/ital.v35i2.9255 32 in all four cases, top management were involved in the final decisions of their respective system choices. in cases 1 and 2, top management also took charge in securing funding for the migration projects. interviewees stressed that top management support was very important in their respective project implementations. “the top level management took the recommendations from the systems librarians at the time, with the blessing of the council determined whether they want to proceed with the product alma, and had funding conversations with the financial people.” (case 2) “we have faculty library committee, faculty governance oversight. we showed them webinars of the products we considered before we signed them, so we have faculty representation on board. 
we held open forum and were inclusive in our invitations.” (case 4) vendor support with a new technology, it is critical to acquire external technical expertise, often from the vendor, to facilitate successful implementation.30 effective vendor support includes adequate and highquality technical support during and after implementation, sufficient training provided for both the project team and staff users, and positive relationships between all parties in the project.31 additionally, there should be adequate knowledge transfer between the vendor consultants and the clients, which can be achieved by defining roles, achieving shared understanding, and enhancing relationships through competent communication.32, 33 in the case of library-system implementations, vendor support is particularly important because of the complexity of each new generation of the system and the library personnel’s knowledge gap in understanding the nuts and bolts of the new system. effective vendor support was identified in each case as a critical success factor determining the implementation outcome even though the form of vendor support varied from case to case. in case 1, the vendor sent different consultants with various expertise as project managers on the basis of the project phase. in case 2, the vendor sent one consultant who served as the main project manager. in case 3, the vendor provided a project manager and a team of technicians. in case 4, consultants were shared across multiple consortium libraries that were implementing the system at the same time. no matter how vendor support was provided, it was essential for implementation success as indicated by interviewees. 
“the vendor has been very supportive and provides a group of experts throughout the process, some are knowledgeable in server business while others are skilled project managers.” (case 1)

project team competence

since library-system migration affects all functional areas of a library, members of the implementation team need to be cross-functional. furthermore, members with both business knowledge and technology know-how are especially crucial for implementation success.34 competence of vendor consultants assigned to the project also influences implementation success, as discussed earlier. additionally, it is important to have an in-house project leader who champions the project and who has the essential skills and authority to set goals that legitimize change.35

having a competent project team was essential for implementation success for each of our cases. in each case, the vendor provided the project manager and the library provided a co-manager who was a champion figure. other team members came from various functional areas such as acquisition, circulation, cataloging, electronic resources management, and system administration. for example, in case 1, the technology librarian participated as a co-project manager. the project-management team comprised module experts within the library and from functional areas. in addition, the university’s technology services department lent technical support during early stages of implementation when servers needed to be set up. the interviewees all stressed the importance of project-team competence.
“without the infrastructure knowledge from the university’s technology team and their time and full support to negotiate with the vendor, the migration project would not have been possible.” (case 1) “the university’s it made sure that we are in compliance with campus policies and expectations for securities.” (case 2) staff user involvement it is important that the project team involve staff users early on, otherwise the implementation process may be bumpy. when end users are involved in decisions relating to system selection and implementation, they are more invested in and concerned with the success of the system, which in turn leads to greater system use and user satisfaction.36, 37 as such, it is one of the most cited critical success factors in erp implementation.38 because personal relevance to the system is just as important for library-system implementation, effective staff user involvement with implementation is positively related to implementation success. staff user involvement has emerged as a main success factor in all our cases and contributed to the implementation project outcome. in case 1, staff users were not consulted as to whether an lsp was necessary for the library, although they were informed of the reasons for implementation. additionally, staff users were not involved when the project timetable was negotiated. this lack of early staff user involvement led to considerable stress down the road, which made the implementation process bumpy. the other three cases involved staff users early on; as a result, staff users experienced much less stress and frustration down the road. specifically, in case 2, the staff users were educated about the need for migration through staff meetings, town hall meetings, supervisory meetings, council meetings, and forums. 
many product-demo sessions were conducted for the staff so they would have the knowledge to participate before the final decision critical success factors for integrated library system implementation in academic libraries: a qualitative study | yeh and walter |doi:10.6017/ital.v35i2.9255 34 was made. there were daily internal newsletters conveying implementation news throughout implementation months. in case 3, the entire library was involved with the selection of a new system. while the key staff (such as circulation manager, acquisition manager, and reference manager) had more input than others, everyone offered input about the project. as such, the buyin with the new system was strong from all stakeholders. in case 4, staff users were involved early on through open forums and webinars. the following quotes are examples of interviewee sentiment concerning staff user involvement: “everybody is involved in choosing the system; partially because evergreen had been so problematic. we wanted to make sure that everyone is on board.” (case 3) “migration is the most time consuming aspect of the library staff work during the time of the project, without their buy-ins, it is difficult to have a successful project.” (case 4) interdepartmental communication the importance of effective communications across functional and departmental boundaries is well known in information-systems-implementation literature.39 with consultants coming from the vendor, project team members coming from different functional areas, and staff users with different perceptions and understandings of the implementation project, the importance of effective communications between all involved cannot be overstated. 
communications should start early, be consistent and continuous throughout various stages of the implementation process, and include a system overview, rationale for implementation, briefings for process changes, and contact-points establishment.40 expectations and goals should be communicated to all stakeholders and to all levels of the organization.41

effectiveness of interdepartmental communication affected the implementation outcome in all our cases. in case 1, the library’s project manager was designated to communicate with the vendor when issues arose, such as hardware and software configurations, system backup and use, and task assignments. the formal project plan was established using the web-based basecamp so that team members in different roles with different responsibilities could communicate and work together online. regular meetings were held and emails were exchanged between project team members. however, there was a lack of effective interdepartmental communication with staff who were not on the project team. this resulted in the absence of necessary system testing that would have detected some data-integrity issues. such issues later caused the system to be offline for days, which brought much frustration and stress to everyone. in the other three cases, all actors were well informed through news releases, meetings, presentations, and webinars. concerns were communicated to the project team and addressed in a timely manner. as a result, the level of frustration was very low for those three cases.

data analysis and conversion

a fundamental requirement for the effectiveness of an erp system is the accuracy of its data,42 and the same is true for a library system. data types in a legacy ils are often of an outdated format and
conversion from one format to another can be an overwhelming process, especially when there is no existing expertise in the library. since migrating legacy data to the new system is essential, effective data analysis for conversion is a critical success factor for implementation success. the smoothness of each of the four implementation cases was related to the project team’s data analysis and conversion efforts. in case 1, the library did not spend any effort to analyze, convert, or clean the data. as a result, the system experienced data-integrity issues after it went live. the other three libraries either devoted time to clean and convert the data or had a third party do the data cleaning. as a result, no system issues arose from data-integrity problems. interviewees from case 2 told us, “we elected to freeze the data 30 days sooner in terms of bibliographic data, so that we can do an authority control project with a third party vendor.” project management and project tracking according to erp implementation literature, effective project-management practices are critical for implementation success. such practices include defining clear objectives, establishing a formal implementation plan, designing a realistic work plan, and establishing resource requirements.43 the formal implementation plan needs to identify modules to be implemented, tasks to be undertaken, and all technical and nontechnical issues to be considered.44 project progress must be carefully monitored through meetings and reports.45, 46 effective project management and tracking has affected implementation outcome in all our cases. a popular project management and tracking software is basecamp, a web-based project management and collaboration tool initially released in 2004.47 it offers discussion boards, to-do lists, file sharing, milestone management, event tracking, and messaging system that help project teams stay organized and connected despite their different locations. 
all cases used basecamp for project management and tracking, which contributed to on-time and on-budget project completion for all cases.

staff user education and training

a new system often frustrates users who do not receive adequate training in its functionalities and use.48 when feeling frustrated and stressed, users may avoid using the system. proper and adequate training will soothe users and eliminate their reluctance to use the new system, which in turn helps realize productivity gains.49, 50 training processes should consider factors such as training curriculum, user commitment, trainers’ personnel skills and competence, as well as training schedule, budget, evaluation, and methods.51

effective staff user training has emerged as a critical success factor from all our cases. in case 1, staff users had access to a vendor-supplied preview portal, which simulated system functionalities. staff users were so familiar with the new system by the time the system went live that they were eager to engage with it. in cases 2, 3, and 4, staff users were trained through demo products, online video trainings, q&a, and on-site training sessions conducted by the vendor. these training materials and sessions served to ease staff users’ feelings of uncertainty and anxiety, as the following quotes show:

“the online training videos were provided to all staff in the library and followed up with q&a sessions which members of the committee will host in their respective areas. . . . then ex libris did a week long onsite training workshop serve for the final deep configuration issues. . . .
we know that there are staff users who want to be ahead of the game, yet there are always people who don’t want to learn until the day before they go live.” (case 2) “we have a training package with several onsite visits, each one is for a few days. the trainer focused on one aspect of the system. it was more than watching the videos online. because of the small staff here, almost everyone attended at least one training.” (case 3) “the trainers varied with their expertise, we developed fondness for some more than others. the training is functional in nature. the vendor’s priority was about trainer availability and to keep the project on time. we became familiar with trainers’ expertise; we were able to request the right trainer with the job.” (case 4) managing staff user emotions although education and training eases user anxiety, it does not completely eliminate it. emotions felt by users early in the implementation of a new system have important effects on the use of the system later on.52 how to manage staff user anxiety and negative emotions when they appear has emerged as a critical success factor in all our cases, as shown in the following quotes: “there were so many things going on in the library during the migration go-live week. the unknown of the migration success made staff users uncomfortable. should the migration date be decided in consideration of other initiatives, the frustration experienced would have been a lot less and might not have been ignored during the going-live week.” (case 1) “the frustration was just change; it was the fact that we have to learn something new. . . . primarily the frustration was handled by the lead.” (case 2) “there was a challenge, especially early on, in getting people to engage with the manuals and the literature in documentation. it is as if everyone is being asked to learn a new language. . . . the key relationship between the onsite coordinator and the project manager on the vendor side is important. 
when those two exchange information and handle frustration diplomatically, this bridge between the two organizations can smooth over a lot of rough feathers on either or both sides.” (case 4)

this final csf did not come directly from the ninety-plus csfs that we started with, although it aligned closely with the “change management” category.53 this csf emerged mostly from the interview process.

summary of results

the results of the case studies for each critical factor are summarized in table 2. implementation project outcome is summarized in table 3. an implementation is considered successful if it was completed on-time and on-budget and if the implementation process was smooth as reflected in the number and degree of unexpected problems along the way.

| critical success factor | case 1 | case 2 | case 3 | case 4 |
|---|---|---|---|---|
| careful selection process | no | yes | yes | yes |
| top management involvement | yes | yes | yes | yes |
| vendor support | yes | yes | yes | yes |
| project team competence | yes | yes | yes | yes |
| staff user involvement | no | yes | yes | yes |
| interdepartmental communication | no | yes | yes | yes |
| data analysis & conversion | no | yes | yes | yes |
| project management and tracking | yes | yes | yes | yes |
| staff user education and training | yes | yes | yes | yes |
| managing staff user emotions | no | yes | yes | yes |

table 2. summary of case study critical success factors findings

| | case 1 | case 2 | case 3 | case 4 |
|---|---|---|---|---|
| on time implementation | yes | yes | yes | yes |
| on budget implementation | yes | yes | yes | yes |
| smoothness of implementation | no (staff users experienced data integrity issue, system downtime, as well as anxiety and stress with the system implementation process) | yes | yes | yes |

table 3. summary of case study implementation success measures

discussion and conclusions

the implementation of a new ils is a large-scale undertaking that affects every aspect of a library’s operations as well as every staff user’s workflow process.
as such, it is imperative for critical success factors for integrated library system implementation in academic libraries: a qualitative study | yeh and walter |doi:10.6017/ital.v35i2.9255 38 library administrators to understand what factors contribute to a successful implementation. our qualitative study shows that there are two categories of csfs: strategic and tactical. from the strategic perspective, top management involvement, vendor support, staff user involvement, interdepartmental communication, and staff user emotion management are critical. from the tactical perspective, project team competence, project management and project tracking, data analysis and conversion, and staff user education and training to break down the technical barrier greatly affect implementation outcome. in addition, selection of the final system from a variety of choices and options requires a careful consideration of both strategic and tactical issues. each factor identified is important in its own right during the implementation process. combined, they complement each other to guide an implementation to success. among the list of csfs identified, the role of staff user emotion management was not identified during the theoretical phase of the study; it only emerged as an important csf during interviews. top management involvement, vendor support, project team competence, project management and tracking, and staff user education and training are csfs that were somewhat intuitive, and they were implemented by all cases. however, a library may select an end system without careful considerations. it may also be unaware of the importance of involving users early on, the importance of opening clear lines of interdepartmental communications, or the importance of performing data analysis and conversion before the implementation. staff user emotion management, especially, is at the risk of being an afterthought of an implementation. 
by identifying the most salient csfs, this study offers practical contributions to academic library leaders and administrators in understanding how critical success factors play a role in ensuring a smooth and successful ils implementation. although csfs have been extensively studied in the discipline of information-systems management, this is, to our knowledge, the first study to apply csfs in the library context. since library management has its unique challenges compared to businesses, identifying csfs for library-system-implementation success is important not only for the current migration to lsps but also for future migrations to future generations of ilss as the needs of libraries continue to evolve.

as with any empirical research, there are limitations to this study. the number of academic libraries interviewed is small, although no new csfs emerged in the fourth and final interview. the vendors represented in this study are only two of the many in the market providing lsps to libraries. with these aforementioned limitations, the results of this study may not be generalizable to libraries implementing an lsp with vendors other than innovative interfaces and ex libris. additionally, the results may not be generalizable to nonacademic libraries.

this research can be extended to validate the proposed csfs quantitatively by conducting survey research in academic libraries. studying interactions between identified factors will offer an even greater contribution. this research can also be replicated in other types of libraries to test whether the findings generalize. in addition, case libraries 3 and 4 both noted that an lsp changes the public interface that is used by external users, and they wished to have more opportunities for outreach prior to the implementation.
although the design and implementation of the public interface was not considered within the scope of this research, this comment is insightful because it may imply that future studies should consider a project champion to be a critical success factor. the project champion must have the people-related skills and the position to introduce changes and achieve buy-in from staff users.54, 55

references

1. richard m. jost, selecting and implementing an integrated library system: the most important decision you will ever make (boston: chandos, 2015).
2. ibid., 3.
3. suzanne julich, donna hirst, and brian thompson, “a case study of ils migration: aleph500 at the university of iowa,” library hi tech 21, no. 1 (2003): 44–55, http://dx.doi.org/10.1108/07378830310467391.
4. zahiruddin khurshid, “migration from dobis libis to horizon at kfupm,” library hi tech 24, no. 3 (2006): 440–51, http://dx.doi.org/10.1108/07378830610692190.
5. vandana singh, “experiences of migrating to an open-source integrated library system,” information technology & libraries 32, no. 1 (2013): 36–53.
6. jost, “selecting and implementing an integrated library system.”
7. yongming wang and trevor a. dawes, “the next generation integrated library system: a promise fulfilled,” information technology & libraries 31, no. 3 (2012): 76–84.
8. keith kelley, carrie c. leatherman, and geraldine rinna, “is it really time to replace your ils with a next-generation option?” computers in libraries 33, no. 8 (2013): 11–15.
9. vangie beal, “erp—enterprise resource planning,” webopedia, http://www.webopedia.com/term/e/erp.html.
10. “library management system,” tangient llc, https://libtechrfp.wikispaces.com/library+management+system.
11. christopher p. holland and ben light, “a critical success factors model for erp implementation,” ieee software 16, no. 3 (1999): 30–36, http://dx.doi.org/10.1109/52.765784.
12.
levi shaul and doron tauber, “critical success factors in enterprise resource planning systems: review of the last decade,” acm computing surveys 45, no. 4 (2013): 1–39, http://dx.doi.org/10.1145/2501654.2501669.
13. yahia zare mehrjerdi, “enterprise resource planning: risk and benefit analysis,” business strategy series 11, no. 5 (2010): 308–24, http://dx.doi.org/10.1108/17515631011080722.
14. mohammad a. rashid, liaquat hossain, and jon david patrick, “the evolution of erp systems: a historical perspective,” in enterprise resource planning: global opportunities and challenges (hershey, pa: idea group, 2002).
15. marshall breeding, “library systems report 2014: competition and strategic cooperation,” american libraries 45, no. 5 (2014): 21–33.
16. sharon yang, “from integrated library systems to library management services: time for change?” library hi tech news 30, no. 2 (2013): 1–8, http://dx.doi.org/10.1108/lhtn-02-2013-0006.
17. shahin dezdar, “strategic and tactical factors for successful erp projects: insights from an asian country,” management research review 35, no. 11 (2012): 1070–87, http://dx.doi.org/10.1108/14637151111182693.
18. ibid.
19. shahin dezdar and ainin sulaiman, “successful enterprise resource planning implementation: taxonomy of critical factors,” industrial management & data systems 109, no. 8 (2009): 1037–52, http://dx.doi.org/10.1108/02635570910991283.
20.
sherry finney and martin corbett, “erp implementation: a compilation and analysis of critical success factors,” business process management journal 13, no. 3 (2007): 329–47, http://dx.doi.org/10.1108/14637150710752272. 21. f. pearce, business building and promotion: strategic and tactical planning (houston: pearman cooperation alliance, 2004). 22. jennifer bresnahan, “mixed messages,” cio (may 16, 1996), 72, http://dx.doi.org/10.1016/j.jchf.2013.07.005. 23. majed al-mashari, abdullah al-mudimigh, and mohamed zairi, “enterprise resource planning: a taxonomy of critical factors,” european journal of operational research 146, no. 2 (2003): 352–64, http://dx.doi.org/10.1016/s0377-2217(02)00554-4. 24. shaul and tauber, “critical success factors in enterprise resource planning systems.” 25. h. akkermans and k. van helden, “vicious and virtuous cycles in erp implementation: a case study of interrelations between critical success factors,” european journal of information systems 11, no. 1 (2002): 35–46, http://dx.doi.org/10.1057/palgrave.ejis.3000418. 26. abdallah mohamed, guenther ruhe, and armin eberlein, “cots selection: past, present, and future” (paper presented at the 14th annual ieee international conference and workshops on the engineering of computer-based system, 2007), http://dx.doi.org/10.1109/ecbs.2007.28. 27. m. michael umble, elisabeth j. umble, and ronald r. haft, “enterprise resource planning: implementation procedures and critical success factors,” european journal of operational research 146, no. 2 (2003): 241–57, http://dx.doi.org/10.1016/s0377-2217(02)00547-7. 28. jim johnson, “chaos: the dollar drain of it project failures,” application development trends 2, no. 1 (1995): 41–47.
29. prasad bingi, maneesh k. sharma, and jayanth k. godla, “critical issues affecting an erp implementation,” information systems management 16, no. 3 (1999): 7–14, http://dx.doi.org/10.1201/1078/43197.16.3.19990601/313. 30. mary sumner, “critical success factors in enterprise wide information management systems projects,” proceedings of the 1999 acm sigcpr conference on computer personnel research, 1999 (new york: acm, 1999), http://dx.doi.org/10.1145/299513.299722. 31. eric t. g. wang et al., “the consistency among facilitating factors and erp implementation success: a holistic view of fit,” journal of systems & software 81, no. 9 (2008): 1609–21, http://dx.doi.org/10.1016/j.jss.2007.11.722. 32. dong-gil ko, laurie j. kirsch, and william r. king, “antecedents of knowledge transfer from consultants to clients in enterprise system implementations,” mis quarterly 29, no. 1 (2005): 59–85. 33. al-mashari, “enterprise resource planning.” 34. fiona fui-hoon nah and santiago delgado, “critical success factors for enterprise resource planning implementation and upgrade,” journal of computer information systems 46, no. 5 (2006): 99–113. 35. liang zhang et al., “a framework of erp systems implementation success in china: an empirical study,” international journal of production economics 98, no. 1 (2005): 56–80, http://dx.doi.org/10.1016/j.ijpe.2004.09.004. 36. ann-marie k.
baronas and meryl reis louis, “restoring a sense of control during implementation: how user involvement leads to system acceptance,” mis quarterly 12, no. 1 (1988): 111–24. 37. joseph esteves, joan pastor, and joseph casanovas, “a goals/questions/metrics plan for monitoring user involvement and participation in erp implementation projects,” ie working paper, march 11, 2004, http://dx.doi.org/10.2139/ssrn.1019991. 38. khaled al-fawaz, zahran al-salti, and tillal eldabi, “critical success factors in erp implementation: a review” (paper presented at the european and mediterranean conference on information systems, dubai, may 25–26, 2008). 39. h. akkermans and k. van helden, “vicious and virtuous cycles in erp implementation: a case study of interrelations between critical success factors,” european journal of information systems 11, no. 1 (2002): 35–46, http://dx.doi.org/10.1057/palgrave.ejis.3000418. 40. nancy bancroft, henning seip, and andrea sprengel, implementing sap r/3: how to introduce a large system into a large organisation (greenwich, uk: manning, 1998). 41. nah, “critical success factors.” 42. toni m. somers and klara nelson, “the impact of critical success factors across the stages of enterprise resource planning implementations,” proceedings of the 34th hawaii international conference on system sciences, 2001, http://dx.doi.org/10.1109/hicss.2001.927129. 43. shi-ming huang et al., “assessing risk in erp projects: identify and prioritize the factors,” industrial management & data systems 104, no. 8 (2004): 681–88, http://dx.doi.org/10.1108/02635570410561672. 44.
nah, “erp implementation.” 45. umble, “enterprise resources planning.” 46. nah, “erp implementation.” 47. “basecamp, in a nutshell,” basecamp, https://basecamp.com/about/press. 48. nah, “erp implementation.” 49. umble, “enterprise resources planning.” 50. mo adam mahmood et al., “variables affecting information technology end-user satisfaction: a meta-analysis of the empirical literature,” international journal of human-computer studies 52, no. 4 (2000): 751–71, http://dx.doi.org/10.1006/ijhc.1999.0353. 51. iuliana dorobat and floarea nastase, “training issues in erp implementations,” accounting & management information systems 11, no. 4 (2012): 621–36. 52. anne beaudry and alain pinsonneault, “the other side of acceptance: studying the direct and indirect effects of emotions on information technology use,” mis quarterly 34, no. 4 (2010): 689–710. 53. shaul and tauber, “critical success factors in enterprise resource planning systems.” 54. andrew lawrence norton et al., “ensuring benefits realisation from erp ii: the csf phasing model,” journal of enterprise information management 26, no. 3 (2013): 218–34, http://dx.doi.org/10.1108/17410391311325207. 55. chong hwa chee, “human factor for successful erp2 implementation,” new straits times, july 28, 2003, https://www.highbeam.com/doc/1p1-76161040.html.

marc international

richard e. coward: head of research and development, the british national bibliography, london, england

journal of library automation vol. 2/4 december, 1969

the cooperative development of the library of congress marc ii project and the british national bibliography marc ii project is described and presented as the forerunner of an international marc network. emphasis is placed on the necessity for a standard marc record for international exchange and for acceptance of international standards of cataloging.

this paper is an examination of two major operational automation projects. these projects, the library of congress marc ii project and the british national bibliography (bnb) marc ii project, are the result of sustained and successful anglo-american cooperation over a period of three years during which there has been continuous evaluation and change. in 1969, for a brief period, the systems developed have been stabilised, partly to give time for library systems to examine ways and means of exploiting a new type of centralised service, and partly to give the library of congress and the british national bibliography the opportunity to look outwards at other systems being developed in other countries. there has, of course, already been extensive contact and exchange of views between the agencies involved in the planning and developing of automated bibliographic systems and the possibilities of cooperation and exchange have been informally discussed at many levels. the time has now come for the national libraries and cataloguing agencies concerned to look at what has been achieved and to lay the foundation for effective cooperation in the future. the history of the anglo-american marc project began at the library of congress with an experiment in a new way of distributing catalogue data. the traditional method of distributing library of congress bibliographic information is to provide catalogue cards or proof sheets.
these techniques will undoubtedly continue indefinitely into the future, but the rapid spread of automation in libraries has created a new demand for bibliographic information in machine readable form. the original marc project (1) was "an experiment to test the feasibility of distributing library of congress cataloguing in machine readable form". the use of the word "cataloguing" underlines the essential nature of the marc i project; its end product was a catalogue record on magnetic tape. there is a very significant difference between a catalogue record on magnetic tape and a bibliographic file in machine form. the latter does not necessarily hold anything resembling a catalogue entry, although marc ii still reflects, both in the lc implementation (2,3) and in the bnb implementation (4,5), a preoccupation with the visual organisation of a catalogue entry. fortunately retention of the cataloguing "framework" does not hinder the utilisation of lc or bnb marc data in systems designed to hold and exploit bibliographic information, as the whole project is designed as a method for communication between systems. the essence of the marc ii project is that it is a communications system, or a common exchange language between systems wishing to exchange bibliographic information. it is highly undesirable, in fact quite impossible, to plan in terms of direct compatibility between systems. machines are different, programs are different, and local objectives are different. the exchange of bibliographic information in any medium implies some level of agreement on the best way to organise and present the data being exchanged.
the need to use a fairly standard type of bibliographic structure on a catalogue card is obvious enough, and over the years a form of presentation, as best exemplified by a library of congress catalogue card, has been developed which holds all the essential data and also, by means of typographical distinctions and layout, conveys the information in a visually attractive style. when bibliographic information is transmitted in a machine readable form the question of visual layout does not arise but the question of structure is vitally important. this structure is called the machine format and the machine format holds the data. it literally does not matter in what order the various bits and pieces that make up a catalogue record appear on a magnetic tape. what does matter very much is that the machine should be able to recognise each data element: author, title, series, subject heading, etc. in practice, either each data element must be given an identifying tag that the machine can recognise, or each data element must occupy a predetermined place in the record. in view of the unpredictable nature of bibliographic information, the former method, that of tag identification, is now widely used and is the technique adopted in the marc system. the lc and bnb marc systems are two very closely related implementations of a communications format which in its generalised form has been carefully designed to hold any type of bibliographic information. the generalised format description is now being circulated by british standards institute and united states of america standards institute. it can be very briefly described as follows:

| leader | directory | control field(s) | data fields |

the leader is a fixed field of 24 characters, giving the record length, the file status and details of the particular implementation.
the directory is a series of entries each containing a tag (which identifies a data field), the length of the data field in the record, and its starting character position. this directory is a variable field depending on the number of data elements in the record. the control fields consist of a special group of fields for holding the main control number and any subsidiary control data. the data fields are designated for holding bibliographic data. each field may be of any length and may be divided into any number of subfields. a data field may begin with special characters, called indicators, which can be used to supply additional information about the field as a whole. it can be seen that the basis of marc ii is a very flexible file structure designed to hold any type of bibliographic record. once such a level of compatibility is established it is possible to prepare general file handling systems (6) which will convert any bibliographic record to a local file format. there is certainly much scope for agreement on local file formats as well, but such formats will necessarily be conditioned by the type of machine available and the use to be made of the file. the establishment of a generalised file structure is a great step forward but by itself means very little unless a wide measure of agreement can be reached on the data content of the record to be exchanged. here the responsibility for cooperation and standardisation shifts from the automation specialist to the librarian, and particularly to those national libraries and cataloguing agencies who can by their practical actions assist libraries to implement the standards prepared for the profession. in order to appreciate the real importance of standardisation, particularly in the context of the marc project, it is necessary to look a few years into the future.
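the generalised structure described above (leader, directory, control fields, data fields) can be illustrated with a minimal sketch. it is not the lc or bnb implementation: the paper states only that the leader is 24 characters long and that each directory entry carries a tag, a field length, and a starting character position. the 12-character directory entry, the leader offsets, and the three delimiter characters are assumptions borrowed from the later iso 2709 / marc 21 conventions, and the function names and sample data are hypothetical.

```python
# Sketch of a tagged, directory-based record (assumed ISO 2709-style layout).
FT = "\x1e"  # field terminator (assumed)
RT = "\x1d"  # record terminator (assumed)
SF = "\x1f"  # subfield delimiter (assumed)
LEADER_LEN = 24  # stated in the paper
ENTRY_LEN = 12   # assumed: 3-char tag + 4-digit length + 5-digit start


def build_record(fields):
    """Assemble leader + directory + fields from (tag, data) pairs."""
    directory, body = "", ""
    for tag, data in fields:
        field = data + FT
        # Each entry records the tag, the field length, and where it starts.
        directory += f"{tag:0>3}{len(field):04d}{len(body):05d}"
        body += field
    directory += FT
    base = LEADER_LEN + len(directory)   # where the fields begin
    total = base + len(body) + 1         # +1 for the record terminator
    leader = f"{total:05d}" + "nam  22" + f"{base:05d}" + "   4500"
    return leader + directory + body + RT


def parse_record(raw):
    """Use the directory to locate each field; order on tape is irrelevant."""
    leader = raw[:LEADER_LEN]
    record_length = int(leader[0:5])
    base = int(leader[12:17])
    directory = raw[LEADER_LEN:base - 1]  # drop the directory terminator
    fields = []
    for i in range(0, len(directory), ENTRY_LEN):
        entry = directory[i:i + ENTRY_LEN]
        tag, length, start = entry[0:3], int(entry[3:7]), int(entry[7:12])
        data = raw[base + start:base + start + length - 1]  # drop terminator
        fields.append((tag, data))
    return record_length, fields


def split_data_field(data):
    """Split a data field into its indicators and (code, value) subfields."""
    indicators, *subfields = data.split(SF)
    return indicators, [(s[0], s[1:]) for s in subfields]


# Hypothetical sample: one control field and two data fields.
sample = [
    ("001", "bnb-0001"),
    ("100", "10" + SF + "acoward, richard e."),
    ("245", "10" + SF + "amarc international"),
]
record = build_record(sample)
length, parsed = parse_record(record)
```

because every field is reached through its directory entry, a receiving system can pick out only the tags it needs and map them to its own local file format, which is the point the paper makes about general file handling systems. the sketch counts characters rather than bytes, so it is only safe for plain ascii data.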
it is inevitable that the rapid spread of automated systems in libraries will create a demand for machine readable bibliographic records and that in turn will lead to the setting up of bibliographic data banks in machine readable form in national and local centres. these data banks will be international in scope and will contain many millions of items. in the long run the only feasible way to maintain them is for each country or group of countries to develop automated centralised cataloguing systems for handling their own national outputs and to receive from all other countries involved in the network machine readable records of the latter's national outputs. countries cooperating on this basis must agree on standards of cataloguing (and ultimately on standards of classification and subject indexing), so that the general data bank presents a consistently compiled set of bibliographic data. there is no doubt that national data banks will be set up. libraries today are faced simultaneously with a rapid increase in book prices, a need to maintain ever-increasing book stocks to meet the basic requirements of their readers, and a persistent shortage of trained personnel to catalogue their purchases. these trends are already well established and in the united states, where they are most advanced, the result has been the massive and highly successful shared cataloguing program. historically the shared cataloguing program will probably be seen as the first and last attempt to provide a comprehensive bibliographic service by unilateral action. a large number of countries have cooperated in this attempt but the shared cataloguing program does not rest on the principle of exchange. it is doubtful if even the united states will be able to maintain and extend this program in its present form. the shared cataloguing program must ultimately be replaced with an international exchange system.
national machine readable bibliographic systems will be established, but there is a grave danger that those agencies responsible will be primarily concerned only with the immediate problem of producing records suitable for use in their own national context or for their own national bibliography, regardless of the fact that the libraries and information centres they need to serve are acquiring ever-increasing quantities of foreign material. the exchange principle will be downgraded to an afterthought, a by-product of the fact that an automated system is being used. if this outcome is to be avoided, international standards must be prepared and national agencies must accept them instead of only paying lip service to them. in the past librarians have tended to be more concerned with codification than standardisation, but in the field of cataloguing at least a great breakthrough was made sixteen years ago when seymour lubetzky produced his "cataloguing rules and principles; a critique of the a.l.a. rules for entry and a proposed design for their revision" (7). the work of lubetzky led to the "paris principles" (8) published by ifla in 1963 and in due course to the preparation of the "anglo-american cataloguing rules" 1967 (9). these rules, though unfortunately departing from lubetzky's principles in one or two areas, provide a solid basis for standardisation. we are fortunate to have them available at such a critical moment in the history of librarianship. they must form the basis of an international marc project. of all the great libraries of the world, the library of congress has done more than any other to promote international cataloguing standards. it is now in a uniquely favourable position to promote these standards through its own marc ii project. the lc marc ii project, together with the bnb marc ii project, can provide the foundation of the international marc network.
these projects alone cover the total field of english language material and yet already the basic requirement of standardisation is absent. the library of congress finds itself unable, for administrative reasons, to adopt fully the code of rules it worked so hard to produce and which british librarians virtually accepted as it stood in the interests of international standardisation. that a great library should be in this position is understandable. what is less understandable is that the library of congress should transfer the non-standard cataloguing rules established by an internal administrative decision to prescription of cataloguing data in the machine readable record that it is now issuing on an international basis. one of the great advantages of machine readable records is that they can simultaneously be both standard and non-standard. there is no reason that the library of congress, or any national agency, should not provide for international exchange a standard marc record together with any local information the library might want. if as a result other national agencies are encouraged to do the same, it will not be long before the absurdity and expense of examining each record received via the international network in order to change a standard heading to a local variant, will become apparent. the british national bibliography has already accepted the anglo-american code and by this action has now done much to promote its acceptance in great britain. incomplete acceptance of the code is really the only significant difference between the two marc projects. at a detailed level there are differences in some of the subfield codes. these are chiefly due to the fact that the british marc committee was particularly concerned with the problems of filing bibliographic entries, and as no generally accepted filing code exists it was decided to provide a complete analysis of the fields in headings. 
this analysis will enable the bnb marc data base to be arranged in different sequences to test the rules now being prepared. the other difference, or extension, in the british marc format is the provision of cross references with each entry, on the assumption that in a marc system a total pack of cataloguing data should be provided. however, these differences reflect the experimental nature of the british project, not fundamental differences of opinion. in this paper an attempt has been made to look at the british and american marc projects not as systems for distributing bibliographic information but as the forerunners of an international bibliographic network. intensive efforts have been made to lay a foundation for this international network. the anglo-american code provides a sound cataloguing base, the generalised communications format provides a machine base, and the standard book numbering system provides an international identification system. these developments are all part of a general move towards real cooperation in the provision of bibliographic services. they must now be brought together in an international marc network.

references 1. avram, henriette d.: the marc pilot project (washington, library of congress: 1968). 2. u. s. library of congress. information systems office. the marc ii format: a communications format for bibliographic data. prepared by henriette d. avram, john f. knapp and lucia j. rather. (washington, d. c.: 1968). 3. "preliminary guidelines for the library of congress, national library of medicine, and national agricultural library implementation of the proposed american standard for a format for bibliographic information interchange on magnetic tape as applied to records representing monographic materials in textual printed form (books)," journal of library automation, 2 (june 1969), 68-83. 4. bnb marc documentation service publications, nos. 1 and 2 (london, council of the british national bibliography, ltd., 1968). 5. coward, r. e.: "the united kingdom marc record service," in cox, nigel s. m.; grose, michael w.: organization and handling of bibliographic records by computer (hamden, conn., archon books, 1967). 6. cox, nigel s. m.; dews, j. d.: "the newcastle file handling system," in op. cit. (note 4). 7. lubetzky, seymour: code of cataloging rules ... prepared for the catalog code revision committee ... with an explanatory commentary by paul dunkin. (chicago: american library association, 1960). 8. international federation of library associations. international conference on cataloguing principles, paris, 9th-18th october, 1961: report; edited by a. h. chaplin. 9. anglo-american cataloging rules. british text (london: library association, 1967).

assessing the effectiveness of open access finding tools

teresa auch schultz, elena azadbakht, jonathan bull, rosalind bucy, and jeremy floyd

information technology and libraries | september 2019

teresa auch schultz (teresas@unr.edu) is social sciences librarian, university of nevada, reno. elena azadbakht (eazadbakht@unr.edu) is health sciences librarian, university of nevada, reno. jonathan bull (jon.bull@valpo.edu) is scholarly communications librarian, valparaiso university. rosalind bucy (rbucy@unr.edu) is research & instruction librarian, university of nevada, reno. jeremy floyd (jfloyd@unr.edu) is metadata librarian, university of nevada, reno.

abstract the open access (oa) movement seeks to ensure that scholarly knowledge is available to anyone with internet access, but being available for free online is of little use if people cannot find open versions. a handful of tools have become available in recent years to help address this problem by searching for an open version of a document whenever a user hits a paywall.
this project set out to study how effective four of these tools are when compared to each other and to google scholar, which has long been a source for finding oa versions. to do this, the project used open access button, unpaywall, lazy scholar, and kopernio to search for open versions of 1,000 articles. results show none of the tools found as many successful hits as google scholar, but two of the tools did register unique successful hits, indicating a benefit to incorporating them in searches for oa versions. some of the tools also include additional features that can further benefit users in their search for accessible scholarly knowledge. introduction the goal of open access (oa) is to ensure as many people as possible can read, use, and benefit from scholarly research without having to worry about paying to read and, in many cases, restrictions on reusing the works. however, oa scholarship helps few people if they cannot find it. this is especially problematic for green oa works, which are those that have been made open by being deposited in an open online repository even if they were published in a subscription-based journal. opendoar reports more than 3,800 such repositories.1 as users are unlikely to search each individual repository, an efficient search method is needed to find the oa items spread across so many locations. in recent years, several browser extensions have been released that allow a user to search for an open version of an article while on a webpage for that article. the tools include:
• lazy scholar, a browser extension that searches google scholar, pubmed, europepmc, doai.io, and dissem.in.
it has extensions for both the chrome and firefox browsers.2
• open access button, which uses both a website and a chrome extension to search for oa versions.3
• unpaywall, which also acts through a chrome extension to search for open articles via the digital object identifier.4
• kopernio, a browser extension that searches subject and institutional repositories and is owned by clarivate analytics. kopernio has extensions for chrome, firefox, and opera.5

assessing the effectiveness of open access finding tools | auch schultz, azadbakht, et al. https://doi.org/10.6017/ital.v38i3.11109

some of the tools offer other services, such as open access button’s ability to help the user email the author of an article if no open version is available, as well as integration with libraries’ interlibrary loan workflows. kopernio and lazy scholar offer to sync with a user’s institutional library to see if an article is available through the library’s collection.6 although other similar extensions might also exist, this article is focused on the four mentioned above based on the authors’ knowledge of available oa finding tools at the time of the project. literature review as noted above, scholars have indicated for several years a need for reliable and user-friendly methods, systems, or tools that can help researchers find oa materials. bosman et al. put forward the idea of a scholarly commons—a set of principles, practices, and resources to enable research openness—that depends upon clear linkages between digital research objects.7 bulock notes that oa has “complicated” retrieval in that oa versions are often housed in various locations across the web, including institutional repositories (irs), preprint servers, and personal websites.8 there is no perfect search option or tool, although some have tried creating solutions, such as the open jericho project from wayne state university, which is seeking to create an aggregator to search institutional repositories and eventually other sources as well.9 however, this lack of a central search tool can lead to confusion among researchers.10 nicholas and colleagues found that their sample of early career scholars drawn from several countries relied heavily on google and google scholar to find articles that interested them.11 many also turn to researchgate and other social media platforms and risk running afoul of copyright. the results of ithaka s+r’s 2015 survey of faculty in the united states reflect these findings to a certain extent, as variations exist between researchers in different disciplines.12 a majority of the respondents also indicated an affinity for freely accessible materials. as more researchers become aware of and gravitate toward oa options, the efficacy of various discovery tools, such as the browser extensions evaluated in this study, will become even more pertinent. previous studies on the findability of oa scholarship have focused primarily on google and google scholar.13 a few have assessed tools such as oaister, opendoar, and pubmed central.14 norris, oppenheim, and rowland sought a selection of articles using google, google scholar, oaister, and opendoar.15 while oaister and opendoar found just 14 percent of the articles’ open versions, google and google scholar combined managed to locate 86 percent. jamali and nabavi assessed google scholar’s ability to retrieve the full text of scholarly publications and documented the major sources of the full-text versions (publisher websites, institutional repositories, researchgate, etc.).16 google scholar was able to locate full-text versions of more than half (57.3 percent) of the items included in the study. most recently, martin-martin et al.
likewise used google scholar to gauge the availability of oa documents across different disciplines.17 they found that roughly 54.6 percent of the scholarly content for which they searched was freely available, although only 23.1 percent of their sample were oa by virtue of the publisher. as of yet, no known studies have systematically evaluated the growing selection of open access tools’ efficiency and effectiveness at retrieving oa versions of articles. however, several scholars and journalists have reviewed these new tools, especially the more established open access button and unpaywall.18 these reviews were mostly positive, even as some acknowledged that the tools are not a wholesale solution for locating oa publications. despite pointing out these tools’ limitations, reviewers voiced their hope that the oa finding tools could help disrupt the traditional scholarly publishing industry.19 at least one study has used the open access button to determine the green oa availability of journal articles. emery used the tool as the first step to identify oa article versions and then searched individual institutional repositories, followed by google scholar as the final steps.20 emery found that 22 percent of the study sample was available as green oa but did not say what portion of that was found by the open access button. emery did note that the open access button returned 17 false positives (six in which the tool took the user to the wrong article or other content, and 11 in which it took the user to a citation of the article with no full text available). she also found at least 38 cases of false-negative returns from the open access button, or articles that were openly available that the tool failed to find. the study did not count open versions found on researchgate or academia.edu.
methodology oa finding tools this study compared the chrome browser extensions for google scholar and four oa finding tools: lazy scholar, unpaywall, open access button, and kopernio. each extension was used while in the chrome browser to search for open versions of the selected articles and the success of each extension in finding any free, full version was recorded. the authors did not track whether an article was licensed for reuse. for the four oa finding tools, the occurrences of false positives (e.g., the retrieval of an error page, a paywalled version, or the wrong article entirely) were also tracked. false positives were not tracked for google scholar, which does not purport to find only open versions of articles. data collection occurred over a six-week period in october and november 2018. the authors used web of science to identify the test articles (n=1,000) with the aim of selecting articles that would give the tools the best chance for finding a high number of open versions. articles selected were published in 2015 and 2016. these years were selected in order to try to avoid embargoes that might have prevented articles being made open through deposit. the articles were selected from two disciplines: applied physics and oncology, both of which have a large share in web of science and come from a broader discipline with a strong oa culture.21 each comparison began with searching the google scholar extension by article doi or title if a doi was not available. all versions retrieved by google scholar were examined until an open version was located or until the retrieved versions were exhausted. the remaining oa tools were then tested from the webpage for the article record on the journal’s website (if available). if no journal page was available, the article pdf page was tested. all data were recorded in a shared google sheet according to a data dictionary. 
searches for open versions of paywalled articles were performed away from the authors’ universities to ensure the institutions’ subscriptions to various journals did not impact the results. authors were limited in the number of articles they could search each day as some tools blocked continued use, presumably over concerns of illegitimate web activity, after as few as 15 searches.

study limitations

this methodology might have missed open versions of articles, even using these five search tools. although studies have found google scholar to be one of the most effective ways of searching for open versions, way has shown that it is not perfect.22 therefore, it is possible that this study undercounted the number of oa articles. the study tested the ability of oa finding tools to locate open articles from a journal’s main article page, not other possible webpages (e.g., the google scholar results page). this design may have limited the effectiveness of some tools, such as kopernio, which appear to work well with some webpages but not others.

results

overall, the tools found open versions for just less than half of the study sample (490), whereas they found no open versions for 510 articles. although lazy scholar, unpaywall, open access button, and kopernio all found open versions, google scholar returned the most with 462 articles (94 percent of all articles with at least one open version). open access button, lazy scholar, and unpaywall all found a majority of the open articles (62 percent, 73 percent, and 67 percent, respectively); however, kopernio found open versions for just 34 percent of the articles (see figure 1).

figure 1. number of open versions found by each tool.
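the shares reported in the results above can be reworked from the stated counts. the sketch below is illustrative only: the sample size (1,000), the number of articles with at least one open version (490), the google scholar count (462), and the rounded percentages come from the text, while the function names are hypothetical and the per-tool counts backed out of the rounded percentages are approximations, not figures reported by the study.

```python
# counts stated in the results section
SAMPLE_SIZE = 1000          # articles tested
WITH_OPEN_VERSION = 490     # articles with at least one open version found
GOOGLE_SCHOLAR_HITS = 462   # open versions found via google scholar

# rounded shares of the 490 open articles, as reported in the text
REPORTED_SHARE = {
    "open access button": 62,
    "lazy scholar": 73,
    "unpaywall": 67,
    "kopernio": 34,
}


def share(part: int, whole: int) -> float:
    """return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


def approximate_count(percent: int, whole: int = WITH_OPEN_VERSION) -> int:
    """back out an approximate article count from a rounded percentage."""
    return round(percent / 100 * whole)


if __name__ == "__main__":
    print("open share of sample:", share(WITH_OPEN_VERSION, SAMPLE_SIZE))
    print("google scholar share of open articles:",
          share(GOOGLE_SCHOLAR_HITS, WITH_OPEN_VERSION))
    for tool, percent in REPORTED_SHARE.items():
        # approximate only: derived from a rounded percentage
        print(tool, "~", approximate_count(percent), "articles")
```

because the per-tool percentages are rounded, the derived counts can be off by a few articles in either direction; the shared dataset is the place to look for exact figures.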
it was most common for three or more of the tools to find an open version for an article, with just 48 found by two tools and 98 found by only one tool (see figure 2).

figure 2. number of articles where x number of oa finding tools found an open version.

when looking at articles where only one tool returned an open version, google scholar had the highest results (84). open access button (4) and lazy scholar (10) also returned unique hits, but unpaywall and kopernio did not. open access button returned the most false positives with 46, or nearly 5 percent of all 1,000 articles. lazy scholar returned 31 false positives (3 percent), unpaywall returned 14 (1 percent), and kopernio returned 13 (1 percent).

discussion

the results for the oa search tools show that while all four options met with some success, none of them performed as well as google scholar. three of the tools (lazy scholar, open access button, and unpaywall) found at least half of the open versions that google scholar did. it is important to note that open access button, which found the second fewest open versions, does not search researchgate and academia.edu because of legal concerns over article versions that are likely infringing copyright.23 this could have affected open access button’s performance. likewise, kopernio’s lower percentage of finding oa resources might relate to concerns over article versions as well. when creating an account on kopernio, the user is asked to affiliate themselves with an institution so that the tool can search existing library subscriptions at that institution. for this study, the authors did not affiliate with their home institutions when setting up kopernio to get a better idea of which content was open as opposed to content being accessible because of the tool connecting to a library’s subscription collection.
if the authors were to identify with an institution, the number of accessible articles would likely increase, but this access would not be a true representation of what open content is discoverable.

in addition, some tools might work better with certain publishers than others. for instance, kopernio did not appear to work with spandidos publications, a leading biomedical science publisher that publishes much of its content as gold oa, meaning the entire journal is published as oa. kopernio found just one open version of a spandidos article, compared to 153 by google scholar. this could be an unintentional malfunction either with spandidos or kopernio, which if fixed, could greatly increase the efficacy of this finding tool. however, open access button, lazy scholar, and unpaywall were able to find oa publications from spandidos at similar rates (135, 138, and 139, respectively) with no false positives.

while none of the tools performed as well as google scholar, some of the tools were easier to use compared to google scholar. google scholar does not automatically show an open version first; instead, users often have to first select the “all x versions” option at the bottom of each record and then open each version until they find an open version. lazy scholar and unpaywall appear (for the most part) automatically, meaning users can see right away if an open version is available and then click a button once to be taken to that version. although open access button and kopernio do not show automatically if they have found an open version, users need only click a button on their toolbar once to activate each tool and see if the tool was able to find an open version. open access button also provides the extra benefit of making it easy for users to email authors to make their works open if an open version is not already available.
relying on lazy scholar, unpaywall, or open access button first causes users no harm, and they can always rely on google scholar as a backup. whether all four tools are needed is questionable. for instance, a few of the authors found kopernio difficult to work with as it seemed to be incompatible with at least one publisher’s website and it introduced extra steps in downloading a pdf file. the fact that it also returned by far the fewest open versions (just 36 percent of the ones google scholar found, and no unique hits) does not argue well for including it in an oa finding toolbox. also, while lazy scholar, unpaywall, and open access button all performed better on their own, the authors wonder what improvements could be created by combining the resources of the individual tools.

conclusion

the growth of oa finding tools is encouraging to see as far as helping to make oa works more discoverable. although the study showed that google scholar uncovered more articles than any of the other tools, the utility of at least two of the tools (lazy scholar and open access button) can still be seen in that both found articles not discovered by the other tools, including google scholar. indeed, using the tools in conjunction with one another appears to be the best method. and although open access button found the second fewest articles, the tool’s effort to integrate with interlibrary loan and discovery workflows, as well as its concern about legal issues, are all promising for its future. likewise, kopernio might be a better tool for those interested in combining access to a library collection, which likely has a large number of final, publisher versions of scholarship, with their search for openly available scholarship.

future studies can include newer oa finding tools that have entered the market, as well as evaluate the user experience of the tools.
another study can also look at how well open access button’s author email feature works. also, as open access button and unpaywall continue to move into new areas, such as interlibrary loan support, research could explore if these are more effective ways of connecting users to oa material as well as measure users’ understanding of oa versions they find.

overall, the emergence of oa finding tools offers much potential for increasing the visibility of oa versions of scholarship, although no tool is perfect. however, if scholars wish to support oa through their research practices or find themselves unable to purchase or legally acquire the publisher's version, each of these tools can be valuable additions to their work.

data statement

the data used for this study has been shared publicly in the zenodo database under a cc-by 4.0 license at https://doi.org/10.5281/zenodo.2602200.

endnotes

1 jisc, “browse by country and region,” accessed february 15, 2019, http://v2.sherpa.ac.uk/view/repository_by_country/countries=5fby=5fregion.html.
2 colby vorland, “extension,” accessed march 14, 2019, http://www.lazyscholar.org/; colby vorland, “data sources,” lazy scholar (blog), accessed march 14, 2019, http://www.lazyscholar.org/data-sources/.
3 “avoid paywalls, request research,” open access button, accessed march 14, 2019, https://openaccessbutton.org/.
4 unpaywall, “browser extension,” accessed march 14, 2019, https://unpaywall.org/products/extension.
5 kopernio, “faqs,” accessed march 14, 2019, https://kopernio.com/faq.
6 colby vorland, “features,” lazy scholar (blog), accessed march 14, 2019, http://www.lazyscholar.org/category/features/.
7 jeroen bosman et al., “the scholarly commons—principles and practices to guide research communication,” open science framework, september 15, 2017, https://doi.org/10.17605/osf.io/6c2xt.
8 chris bulock, “delivering open,” serials review 43, no.
3–4 (october 2, 2017): 268–70, https://doi.org/10.1080/00987913.2017.1385128.
9 elliot polak, email message to author, june 4, 2019.
10 bulock, “delivering open.”
11 david nicholas et al., “where and how early career researchers find scholarly information,” learned publishing 30, no. 1 (january 1, 2017): 19–29, https://doi.org/10.1002/leap.1087.
12 christine wolff, alisa b. rod, and roger c. schonfeld, “ithaka s+r us faculty survey 2015,” 2015, 83, https://sr.ithaka.org/publications/ithaka-sr-us-faculty-survey-2015/.
13 mamiko matsubayashi et al., “status of open access in the biomedical field in 2005,” journal of the medical library association 97, no. 1 (january 2009): 4–11, https://doi.org/10.3163/1536-5050.97.1.002; michael norris, charles oppenheim, and fytton rowland, “the citation advantage of open-access articles,” journal of the american society for information science and technology 59, no. 12 (october 1, 2008): 1963–72, https://doi.org/10.1002/asi.20898; doug way, “the open access availability of library and information science literature,” college & research libraries 71, no. 4 (2010): 302–09; charles lyons and h. austin booth, “an overview of open access in the fields of business and management,” journal of business & finance librarianship 16, no. 2 (march 31, 2011): 108–24, https://doi.org/10.1080/08963568.2011.554786; hamid r.
jamali and majid nabavi, “open access and sources of full-text articles in google scholar in different subject fields,” scientometrics 105, no. 3 (december 1, 2015): 1635–51, https://doi.org/10.1007/s11192-015-1642-2; alberto martín-martín et al., “evidence of open access of scientific publications in google scholar: a large-scale analysis,” journal of informetrics 12, no. 3 (august 1, 2018): 819–41, https://doi.org/10.1016/j.joi.2018.06.012.
14 norris, oppenheim, and rowland, “the citation advantage of open-access articles”; michael norris, fytton rowland, and charles oppenheim, “finding open access articles using google, google scholar, oaister and opendoar,” online information review 32, no. 6 (november 21, 2008): 709–15, https://doi.org/10.1108/14684520810923881; maria-francisca abad‐garcía, aurora gonzález‐teruel, and javier gonzález‐llinares, “effectiveness of openaire, base, recolecta, and google scholar at finding spanish articles in repositories,” journal of the association for information science and technology 69, no. 4 (april 1, 2018): 619–22, https://doi.org/10.1002/asi.23975.
15 norris, rowland, and oppenheim, “finding open access articles using google, google scholar, oaister and opendoar.”
16 jamali and nabavi, “open access and sources of full-text articles in google scholar in different subject fields.”
17 martín-martín et al., “evidence of open access of scientific publications in google scholar.”
18 stephen curry, “push button for open access,” the guardian, november 18, 2013, sec.
science, https://www.theguardian.com/science/2013/nov/18/open-access-button-push; bonnie swoger, “the open access button: discovering when and where researchers hit paywalls,” scientific american blog network, accessed may 30, 2017, https://blogs.scientificamerican.com/information-culture/the-open-access-button-discovering-when-and-where-researchers-hit-paywalls/; lindsay mckenzie, “how a browser extension could shake up academic publishing,” chronicle of higher education 68, no. 33 (april 21, 2017): a29; joyce valenza, “unpaywall frees scholarly content,” school library journal 63, no. 5 (may 2017): 11; barbara quint, “must buy? maybe not,” information today 34, no. 5 (june 2017): 17; michaela d. willi hooper, “product review: unpaywall [chrome & firefox browser extension],” journal of librarianship & scholarly communication 5 (january 2017): 1–3, https://doi.org/10.7710/2162-3309.2190; terry ballard, “two new services aim to improve access to scholarly pdfs,” information today 34, no.
9 (november 2017): cover-29; diana kwon, “a growing open access toolbox,” the scientist, accessed december 11, 2017, https://www.the-scientist.com/?articles.view/articleno/51048/title/a-growing-open-access-toolbox/; kent anderson, “the new plugins — what goals are the access solutions pursuing?,” the scholarly kitchen, august 23, 2018, https://scholarlykitchen.sspnet.org/2018/08/23/new-plugins-kopernio-unpaywall-pursuing/.
19 curry, “push button for open access”; swoger, “the open access button”; mckenzie, “how a browser extension could shake up academic publishing”; kwon, “a growing open access toolbox.”
20 jill emery, “how green is our valley?: five-year study of selected lis journals from taylor & francis for green deposit of articles,” insights 31, no. 0 (june 20, 2018): 23, https://doi.org/10.1629/uksg.406.
21 anna severin et al., “discipline-specific open access publishing practices and barriers to change: an evidence-based review,” f1000research 7 (december 11, 2018): 1925, https://doi.org/10.12688/f1000research.17328.1.
22 way, “the open access availability of library and information science literature.”
23 open access button, “open access button library service faqs,” google docs, accessed february 19, 2019, https://docs.google.com/document/d/1_hwkryg7qj7ff05-cx8kw40ml7exwrz6ks5fb10gegg/edit?usp=embed_facebook.
the use of ajax, or asynchronous javascript + xml, can result in web applications that demonstrate the flexibility, responsiveness, and usability traditionally found only in desktop software. to illustrate this, a repository metasearch user interface, ojax, has been developed. ojax is simple, unintimidating but powerful. it attempts to minimize upfront user investment and provide immediate dynamic feedback, thus encouraging experimentation and enabling enactive learning. this article introduces the ajax approach to the development of interactive web applications and discusses its implications. it then describes the ojax user interface and illustrates how it can transform the user experience.

with the introduction of the ajax development paradigm, the dynamism and richness of desktop applications become feasible for web-based applications.
ojax, a repository metasearch user interface, has been developed to illustrate the potential impact of ajax-empowered systems on the future of library software.1 this article describes the ajax method, highlights some uses of ajax technology, and discusses the implications for web applications. it goes on to illustrate the user experience offered by the ojax interface. ■ ajax in february 2005, the term ajax acquired an additional meaning: asynchronous javascript + xml.2 the concept behind this new meaning, however, has existed in various forms for several years. ajax is not a single technology but a general approach to the development of interactive web applications. as the name implies, it describes the use of javascript and xml to enable asynchronous communication between browser clients and server-side systems. as explained by garrett, the classic web application model involves user actions triggering a hypertext transfer protocol (http) request to a web server.3 the latter processes the request and returns an entire hypertext markup language (html) page. every time the client makes a request to the server, it must wait for a response, thus potentially delaying the user. this is particularly true for large data sets. but research demonstrates that response times of less than one second are required when moving between pages if unhindered navigation is to be facilitated through an information space.4 the aim of ajax is to avoid this wait. the user loads not only a web page, but also an ajax engine written in javascript. users interact with this engine in the same way that they would with an html page, except that instead of every action resulting in an http request for an entire new page, user actions generate javascript calls to the ajax engine. if the engine needs data from the server, it requests this asynchronously in the background. 
thus, rather than requiring the whole page to be refreshed, the javascript can make rapid incremental updates to any element of the user interface via brief requests to the server. this means that the traditional page-based model used by web applications can be abandoned; hence, the pacing of user interaction with the client becomes independent of the interaction between client and server.

xmlhttprequest is a collection of application programming interfaces (apis) that use http and javascript to enable transfer of data between web servers and web applications.5 initially developed by microsoft, xmlhttprequest has become a de facto standard for javascript data retrieval and is implemented in most modern browsers. it is commonly used in the ajax paradigm. the data accessed from the http server is usually in extensible markup language (xml) but another format, such as javascript object notation, could be used.6

applications of ajax

google is the most significant user of ajax technology to date. most of its recent innovations, including gmail, google suggest, google groups, and google maps, employ the paradigm.7 the use of ajax in google suggest improves the traditional google interface by offering real-time suggestions as the user enters a term in the search field. for example, if the user enters xm, google suggest might offer refinements such as xm radio, xml, and xmods. experimental ajax-based auto-completion features are appearing in a range of software.8 shanahan has applied the same ideas to the amazon online bookshop.9 his experimental site, zuggest, extends the concept of auto-completion: as the user enters a term, the system automatically triggers a search without the need to hit a search button.

the potential of ajax to improve the responsiveness and richness of library applications has not been lost on the library community.10 several interesting experiments have been tried.
at oclc, for example, a “suggest-like service,” based on controlled headings from the worldwide union catalog, worldcat, has been implemented.11 ajax has also been used in the oclc deweybrowser.12 the main page of this browser includes four iframes, or inline frames, three for the three levels of dewey decimal classification and a fourth for record display.13 the use of ajax allows information in each iframe to be updated independently without having to reload the entire page.

using ajax to empower dynamic searching
judith wusteman and pádraig o’hiceadha
information technology and libraries | june 2006
judith wusteman (judith.wusteman@ucd.ie) is a lecturer in the ucd school of information and library studies, university college dublin, ireland.

implications of ajax

there have been many attempts to enable asynchronous background transactions with a server. among alternatives to ajax are flash, java applets, and the new breed of xml user-interface language formats such as xml user interface language (xul) and extensible application markup language (xaml).14 these all have their place, particularly languages such as xul. the latter is ideal for use in mozilla extensions, for example. combinations of the above can be, and are being, used together; xul and ajax are both used in the firefox extension version of google suggest.15 the main advantage of ajax over these alternative approaches is that it is nonproprietary and is supported by any browser that supports javascript and xmlhttprequest, and hence by any modern browser.

it could be validly argued that complex client-side javascript is not ideal. in addition to the errors to which complex scripting can be prone, there are accessibility issues.
best practice requires that javascript interaction adds to the basic functionality of web-based content that must remain accessible and usable without the javascript.16 an alternative non-javascript interface to gmail was recently implemented to deal with just this issue. a move away from scripting would, in theory, be a positive step for the web. in practice, however, procedural approaches continue to be more popular; attempts to supplant them, as epitomized by xhtml 2.0, simply alienate developers.17 it might be assumed that the use of ajax technology would result in a heavier network load due to an increase in the number of requests made to the server. this is a misconception in most cases. indeed, ajax can dramatically reduce the network load of web applications, as it enables them to separate data from the graphical user interface (gui) used to display it. for example, each results page presented by a traditional search engine delivers, not only the results data, but also the html required to render the gui for that page. an ajax application could deliver the gui just once and, after that, deliver data only. this would also be possible via the careful use of frames; the latter could be regarded as an ajax-style technology but without all of ajax’s advantages. ■ from client-server to soa the dominant model for building network applications is the client/server approach, in which client software is installed as a desktop application and data generally reside on a server, usually in a database.18 this can work well in a homogenous single-site computing environment. but institutions and consortia are likely to be heterogeneous and geographically distributed. pcs, macs, and cell phones will all need access to the applications, and linux may require support alongside windows. even if an organization standardizes solely on windows, different versions of the latter will have to be supported, as will multiple versions of those ubiquitous dynamic link libraries (dlls). 
indeed, the problems of obtaining and managing conflicting dlls have spawned the term “dll hell.”19 in web applications, a standard client, the browser, is installed on the desktop but most of the logic, as well as the data, reside on the server. of course, the browser developers still have to worry about “dll hell,” but this need not concern the rest of us. “speed must be the overriding design criterion” for web pages.20 but the interactivity and response times possible with client/server applications are still not available to traditional web applications. this is where ajax comes in: it offers, to date, the best of the web application and client/server worlds. much of the activity is moved back to the desktop via client-side code. but the advantages of web applications are not lost: the browser is still the standard client. service-oriented architecture (soa) is an increasingly popular approach to the delivery of applications to heterogeneous computing environments and geographically dispersed user populations.21 soa refers to the move away from monolithic applications toward smaller, reusable services with discrete functionality. such services can be combined and recombined to deliver different applications to users. web services is an implementation of soa principles.22 the term describes the use of technologies such as xml to enable the seamless interoperability of web-based applications. ajax enables web services and hence enables soa principles. thus, the adoption of ajax facilitates the move toward soa and all the advantages of reuse and integration that this offers. 
■ arc

arc is an experimental open-source metasearch package available for download from the sourceforge open-source foundry.23 it can be configured to harvest open archives initiative-protocol for metadata harvesting (oai-pmh)-compliant data from multiple repositories.24 the harvested results are stored in a relational database and can be searched using basic web forms. arc’s advanced search form is illustrated in figure 1.

■ applying ajax to the search gui

the use of ajax has the potential to narrow the gulf between the responsiveness of guis for web applications and those for desktop applications. the flexibility, usability, and richness of the latter are now possible for the former. the ojax gui, illustrated in figure 2, has been developed to demonstrate how ajax can improve the richness of arc-like guis. ojax, including full source code, is available under the open-source apache license and is hosted on sourceforge.25 ojax comprises a client-side gui, implemented in javascript and html, and server-side metasearch web services, implemented in java. the web services connect directly to a metasearch database created by arc from harvested repositories. the database connectivity leverages several libraries from the apache jakarta project, which provides open-source java solutions.26

■ development process

the ojax gui was developed iteratively using agile software development methods.27 features were added incrementally and feedback gained from a proxy user. in order to gain an in-depth understanding of the system and the implications for the remainder of the gui, features were initially built from scratch, using object-oriented javascript. they were then rebuilt using three open-source javascript libraries: prototype, script.aculo.us, and rico.28 prototype provides base ajax capability. it also includes advanced functionality for object-oriented javascript, such as multiple inheritance.
the other two libraries are built on top of prototype. the script.aculo.us library specializes in dynamic effects, such as those used in auto-completion. the rico library, developed by sabre, provides other key javascript effects, for example, dynamic scrollable areas and dynamic sorting.29

figure 1. arc’s advanced search form

figure 2. the ojax metasearch user interface

■ storyboard

one of the aims of the national information standards organization (niso) metasearch initiative is to enable all library users to “enjoy the same easy searching found in web-based services like google.”30 adopting this approach, ojax incorporates the increasingly common concept of the search bar, popularized by the google toolbar.31

ojax aims to be as simple, uncluttered, and unthreatening as possible. the goal is to reflect the simple-search experience while, at the same time, providing the power of an advanced search. thus, the user interface has been kept as simple as possible while maintaining equivalent functionality with the arc advanced search interface. all arc functionality, with the exception of the grouping feature, is provided. to help the intuitive flow of the operation, the fields are set out as a sentence:

find [term(s)] in [all archives] from [earliest year] until [this year] in [all subjects]

tool tips are available for text-entry fields. by default, searching is on author, title, and abstract. these fields map to the creator, title, and description dublin core metadata fields harvested from the original repositories.32 the search can be restricted by deselecting unwanted fields.

arc supports both mysql and oracle databases.33 mysql has been chosen for ojax as mysql is an open-source database. boolean search syntax has been implemented in ojax to allow for more powerful searching. the syntax is similar to that used by google in that it identifies and/or and exact phrase functionality by +/and “ ”.
hence it preserves the user’s familiarity with basic google search syntax. however, it is not as powerful as the full google search syntax; for example, it does not support query modifiers such as: intitle: 34 the focus of this research is the application of ajax to the search gui and not the optimization of the power or expressive capability of the underlying search engine. however, the implementation of an alternative back end that uses a full-text search engine, such as apache lucene, would improve the expressive power of advanced queries.35 full-text search expressiveness is likely to be key to the usability of ojax, ensuring its adequacy for the advanced user without alienating the novice. ■ unifying the user interface one of the main aims of ojax is the unification of the user interface. instead of offering distinct options for simple and advanced search and for refining a completed search, the interface is sufficiently dynamic to make this unnecessary. the user need never navigate between pages because all options, both simple and advanced, are available from the same page. and all results are made available on that same page in the form of a scrollable list. the only point at which a new page is presented is when the resource identifier of a result is clicked. at this stage, a pop-up window, external to the ojax session, displays the full metadata for that resource. this page is generated by the external repository from which the record was originally harvested. simple and advanced search options are usually kept separate because most users are unwilling or unable to use the latter.36 furthermore, the design of existing search-user interfaces is based on the assumption that the retrieval of results will be sufficiently time-consuming that users will want to have selected all options beforehand. with ojax, however, users do not have to make a complete choice of all the options they might want to try before they see any results. 
as data are entered, answers flow to accommodate them. because the interface is so dynamic and responsive and because users are given immediate feedback, they do not have to be concerned about wasting time due to the wrong choice of search options. users iterate toward the search results they require by manipulating the results in real time. the reduced level of investment that users must make before they achieve any return from the system should encourage them to experiment, hence promoting enactive learning. ■ auto-completion in order to provide instant feedback to the user, the search-terms field and the subject field use ajax to autocomplete user entries. figure 3 illustrates the result of typing smith in the search-terms field. a list is automatically dropped down that itemizes all matches and the number of their occurrences. users select the term they want, the entire field is automatically completed, and a search is triggered. the arc system denormalizes some of the harvested data before saving them in its database. for example, it merges all the author fields into one single field, each name separated by a bar character. to enable the ojax auto-completion feature, it was necessary to renormalize the names. a new table is used to store each name in a separate row; names are referenced by the resource identifier. to enable this, arc’s indexing code was updated so that it creates this table as it indexes records extracted from the oai-pmh feed. in its initial implementation, ojax uses a simple algorithm for auto-completion. future work will involve developing a more complex heuristic that will return results more closely satisfying user requirements. ■ auto-search as already mentioned, a central theme of ojax is the attempt to reduce the commitment necessary from users before they receive feedback on their actions. one way in which dynamic feedback is provided is the triggering of an immediate search whenever an entire option has been selected. 
examples of entire options include choice of an archive or year and acceptance of a suggested auto-completion. in addition, the following heuristics are used to identify when a user is likely to have finished entering a search term and, thus, when a search should be triggered:

1. entering a space character in the search-terms field or subject field
2. tabbing out of a field after having modified its contents
3. five seconds of user inactivity for a modified field

the third heuristic aims to catch some of the edge cases that the other heuristics may miss. it is assumed likely that a term has been completed if a user has made no edits in the last five seconds. as each term will be separated by a space, it is only the last term in a search phrase that is likely not to trigger an auto-search via the first heuristic.

using ajax to empower dynamic searching | wusteman 61

users can click the search button whenever they wish, but they should never have to click it. the zuggest system abandons the search button entirely; ojax retains it, mainly in order to avoid confounding user expectations.37 while a search is in progress, the search button is greyed out and acquires a red border. this is particularly useful in alerting the user that a search has been automatically triggered.

auto-search is the only feature of ojax that may have an impact on network load, in the form of slightly higher traffic. however, the increased number of requests is offset by a reduction in the size of each response because the gui is not downloaded with it. for example, initiating a search in arc results in an average response size of 57.32k. the response is in the form of a complete html page. initiating a search in ojax results in an average response size of 7.96k. the latter comprises a web service response in xml. in other words, because 57.32 / 7.96 is roughly 7.2, more than seven ojax auto-searches would have to be triggered before the size of the initial search result in arc was exceeded.
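the three auto-search heuristics listed earlier in this section (space character, tabbing out of a modified field, five seconds of inactivity) can be sketched as follows. this is a minimal illustration only and is not taken from the ojax source: the class name `AutoSearchTrigger`, its methods, the `timers` parameter, and the choice to trim the value before searching are all assumptions made for the example. the timer functions are injectable so that the logic can be exercised without waiting five real seconds.

```javascript
// Hypothetical sketch of the three auto-search heuristics.
// All names here are illustrative; none come from the OJAX code base.
const IDLE_MS = 5000; // heuristic 3: five seconds without edits

class AutoSearchTrigger {
  // onSearch: callback that receives the text to search for.
  // timers:   optional { set, clear } pair; defaults to the global timers.
  constructor(onSearch, timers) {
    this.onSearch = onSearch;
    this.timers = timers || {
      set: (fn, ms) => setTimeout(fn, ms),
      clear: (id) => clearTimeout(id),
    };
    this.modified = false; // has the field changed since the last search?
    this.idleTimer = null;
  }

  // Call on every edit, with the field's new value and the character typed.
  handleInput(value, typedChar) {
    this.modified = true;
    this.cancelIdleTimer();
    if (typedChar === " ") {
      this.fire(value); // heuristic 1: a space ends a term
      return;
    }
    // heuristic 3: search if no further edit arrives within IDLE_MS
    this.idleTimer = this.timers.set(() => this.fire(value), IDLE_MS);
  }

  // Call when focus leaves the field (for example, on tabbing out).
  handleBlur(value) {
    if (this.modified) {
      this.fire(value); // heuristic 2: only if the contents were modified
    }
  }

  fire(value) {
    this.cancelIdleTimer();
    this.modified = false;
    this.onSearch(value.trim());
  }

  cancelIdleTimer() {
    if (this.idleTimer !== null) {
      this.timers.clear(this.idleTimer);
      this.idleTimer = null;
    }
  }
}
```

in a browser, `handleInput` and `handleBlur` would be wired to the field's input and blur events, and `onSearch` would issue the ajax request. resetting the `modified` flag after each search keeps a tab-out from repeating a search that the space or idle heuristic has already triggered.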
■ dynamic archive list the use of ajax enables a static html page to contain a small component of dynamic data without the entire page having to be dynamically generated on the server. ojax illustrates this: the contents of the drop-down box listing the searchable archives are not hard-coded in the html page. rather, when the page is loaded, an ajax request for the set of available archives is generated. this is a useful technique; static html pages can be cached by browsers and proxy servers, and only the dynamic portion of the data, perhaps those used to personalize the page, need be downloaded at the start of a new session. ■ dynamic scrolling searches commonly produce thousands of results. typical systems, such as google and arc, make these results available via a succession of separate pages, thus requiring users to navigate between them. finding information by navigating multiple pages can take longer than scrolling down a single page, and users rarely look beyond the second page of search results.38 to avoid these problems and to encourage users to look at more of the available results, those results could be made available in one scrollable list. but, in a typical non-ajax application, accessing a scrollable list of, say, two thousand items would require the entire list to be downloaded via one enormous html page. this would be a huge operation; if it did not crash the browser, it would, at least, result in a substantial wait for the user. the rico library provides a feature to enable dynamic scrollable areas. it uses ajax to fetch more records from the server when the user begins to scroll off the visible area. this is used in the display of search results in ojax, as illustrated in figure 4. to the user, it appears that the scrollable list is seamless and that all 4,678 search results are already downloaded. in fact, only 386 have been downloaded. the rest are available at the server. 
as the user scrolls further down, say to item 396, an ajax request is made for the next ten items. any item downloaded is cached by the ajax engine and need not be requested again if, for example, the user scrolls back up the list.

a dynamic information panel is available to the right of the scroll bar. it shows the current scroll position in relation to the beginning and end of the results set. in figure 4, the information panel indicates that there are 4,678 results for this particular search and that the current scroll position is at result number 386. this number updates instantly during scrolling, preserving the illusion that all results have been downloaded and providing users with dynamic feedback on their progress through the results set. this means that users do not have to wait for the main results window to refresh to identify their current position.

figure 3. auto-completion in the search terms field
figure 4. display of search results and dynamic information panel

■ auto-expansion of results

ojax aims to provide a compact display of key information, enabling users to see multiple results simultaneously. it also aims to provide simple access to full result details without requiring navigation to a new web page. in the initial results display, only one line each of the title, authors, and subject fields, and two lines of the abstract, are shown for each item. as the cursor is placed on the relevant field, the display expands to show any hidden detail in that field. at the same time, the background color of the field changes to blue. when the cursor is placed on the bar containing the resource identifier, all display fields for that item are expanded, as illustrated in figure 5. this expansion is enabled via simple cascading style sheet (css) features.
for example, the following css declaration hides all but the first line of authors:

#searchresults td div { overflow:hidden; height: 1.1em }

when the cursor is placed on the author details, the overflow becomes visible and the display field changes its dimensions to fit the text inside it:

#searchresults td div:hover { overflow:visible; height:auto }

■ sorting results

another method used by ojax to minimize upfront user investment is to provide initial search results before requiring the user to decide on sort options. because results are available so quickly and because they can be re-sorted so rapidly, it is not necessary to offer pre-search selection of sort options. ajax facilitates rapid presentation of results; after a re-sort, only those on the first screen must be downloaded before they can be presented to the user.

results may be sorted by title, author, subject, abstract, and resource identifier. these options are listed on the gray bar immediately above the results list. clicking one of these options sorts the results in ascending order; an upward-pointing arrow appears to the right of the sort option chosen, as illustrated in figure 6. clicking on the option again sorts in descending order and reverses the direction of the arrow. clicking on the arrow removes the sort; the results revert to their original order.

functionality for the sort feature is provided by the rico javascript library. server-side implementation supports these features by caching search results so that it is not necessary to regenerate them via a database query each time.

figure 5. auto-expansion of all fields for item number 386
figure 6. results being sorted in ascending order by title

■ search history

several experimental systems, for example, zuggest, have employed ajax to facilitate a search-history feature. a similar feature could be provided for ojax. a button could be added to the right of the results list.
when chosen, it could expand a collapsible search-history sidebar. as the cursor was placed on one of the previous searches listed in the sidebar, a callout (that is, a speech bubble) could be displayed. this could provide further information such as the number of matches for that search and a summary of the search results clicked on by the user. clicking one of the previous searches would restore those search results to the main results window.

this feature would take advantage of the ajax persistent javascript engine to maintain the history. its use could help counter concerns about ajax technology “breaking” the back button; the feature could be implemented so that the back button returned the user to the previous entry in the search history.39 in fact, this implementation of back-button functionality could be more useful than the implementation in google, where hitting the back button is likely to take the user to an interim results page; for example, it might simply take the user from page 3 of results to page 2 of results.

■ scrapbook

users browsing through search results on ojax would require some simple method of maintaining a record of those resource details that interested them. ajax could enable the development of a useful scrapbook feature to which such resource details could be copied and stored in the persistent javascript engine. ojax could further leverage a shared bookmark web service, such as del.icio.us or furl, to save the scrapbook for use in future sessions and to share it with other members of a research or interest group.40

■ potential developments for ojax

as well as searching a database of harvested metadata, the ojax user interface could also be used to search an oai-pmh-compliant repository directly. with appropriate implementation, all of ojax’s current features could be made available, apart from auto-completion.
a recent development has enabled the direct indexing of repositories by google using oai-pmh.41 the latter provides google with additional metadata that can be searched via the google web services apis. the current ojax web services could be replaced by the google apis, thus eliminating the need for ojax to host any server-side components. hence, ojax could become an alternative gui for google searching.

■ conclusion

ojax demonstrates that the use of ajax can enable features in web applications that, until now, have been restricted to desktop applications. in ojax, it facilitates a simple, nonthreatening, but powerful search user interface. page navigation is eliminated; dynamic feedback and a low initial investment on the part of users encourage experimentation and enable enactive learning. the use of ajax could similarly transform other web applications aimed at library patrons.

however, ajax is still maturing, and the barrier to entry for developers remains high. we are a long way from an ajax button appearing in dreamweaver. reusable, well-tested components, such as rico, and software frameworks, such as ruby on rails, sun’s j2ee framework, and microsoft’s atlas, will help to make ajax technology accessible to a wider range of developers.42

as with all new technologies, there is a temptation to use ajax simply because it exists. as ajax matures, it is important that its focus does not become the enabling of “cool” features but remains the optimization of the user experience.

references and notes

1. ojax homepage, http://ojax.sourceforge.net (accessed apr. 5, 2006).
2. j. j. garrett, “ajax: a new approach to web applications,” feb. 18, 2005, www.adaptivepath.com/publications/essays/archives/000385.php (accessed nov. 11, 2005).
3. ibid.
4. j. nielsen, “the need for speed,” alertbox mar. 1, 1997, www.useit.com/alertbox/9703a.html (accessed nov. 11, 2005).
5. dynamic html and xml: the xmlhttprequest object, http://developer.apple.com/internet/webcontent/xmlhttpreq.html (accessed apr. 5, 2006).
6. javascript object notation, wikipedia definition, http://en.wikipedia.org/wiki/json (accessed apr. 5, 2006).
7. google gmail, http://mail.google.com (accessed apr. 5, 2006); google suggest, www.google.com/webhp?complete=1&hl=en (accessed apr. 5, 2006); google groups, http://groups.google.com (accessed apr. 5, 2006); google maps, http://maps.google.com (accessed apr. 5, 2006).
8. p. binkley, “ajax and auto-completion,” quædam cuiusdam blog may 18, 2005, www.wallandbinkley.com/quaedam/?p=27 (accessed nov. 11, 2005).
9. francis shanahan, zuggest, www.francisshanahan.com/zuggest.aspx (accessed apr. 5, 2006).
10. a. rhyno, “ajax and the rich web interface,” librarycog blog apr. 10, 2005, http://librarycog.uwindsor.ca:8087/artblog/librarycog/1113186562 (accessed nov. 11, 2005); r. tennant, “tennant’s top tech trend tidbit,” lita blog june 22, 2005, http://litablog.org/?p=35 (accessed nov. 11, 2005).
11. t. hickey, “ajax and web interfaces,” outgoing blog mar. 31, 2005, http://outgoing.typepad.com/outgoing/2005/03/web_application.html (accessed nov. 11, 2005).
12. oclc deweybrowser, http://ddcresearch.oclc.org/ebooks/fileserver (accessed apr. 5, 2006).
13. hickey, “ajax and web interfaces.”
14. j. wusteman, “from ghostbusters to libraries: the power of xul,” library hi tech 23, no. 1 (2005), www.ucd.ie/wusteman/ (accessed nov. 11, 2005); cover pages, microsoft extensible application markup language (xaml), http://xml.coverpages.org/ms-xaml.html (accessed apr. 5, 2006).
15. google extensions for firefox, http://toolbar.google.com/firefox/extensions/index.html (accessed apr. 5, 2006).
16. c. adams, “ajax: usable interactivity with remote scripting,” sitepoint july 13, 2005, www.sitepoint.com/article/remote-scripting-ajax (accessed nov. 11, 2005).
17. xhtml 2.0, w3c working draft, may 27, 2005, www.w3.org/tr/2005/wd-xhtml2-20050527 (accessed apr. 5, 2006).
18. client/server model, http://en.wikipedia.org/wiki/client/server (accessed apr. 5, 2006).
19. dll hell, http://en.wikipedia.org/wiki/dll_hell (accessed apr. 5, 2006).
20. j. nielsen, “the need for speed.”
21. service-oriented architecture, http://en.wikipedia.org/wiki/service-oriented_architecture (accessed apr. 5, 2006).
22. j. wusteman, “realizing the potential of web services,” oclc systems & services: international digital library perspectives 22, no. 1 (2006): 5–9.
23. arc—a cross archive search service, old dominion university digital library research group, http://arc.cs.odu.edu (accessed apr. 5, 2006); niso metasearch initiative, www.niso.org/committees/ms_initiative.html (accessed apr. 5, 2006); arc download page, sourceforge, http://oaiarc.sourceforge.net (accessed apr. 5, 2006).
24. open archives initiative protocol for metadata harvesting, www.openarchives.org/oai/openarchivesprotocol.html (accessed apr. 5, 2006).
25. ojax download page, sourceforge, http://sourceforge.net/projects/ojax (accessed apr. 5, 2006).
26. apache jakarta project, http://jakarta.apache.org (accessed apr. 5, 2006); apache jakarta commons dbcp, http://jakarta.apache.org/commons/dbcp (accessed apr. 5, 2006); apache jakarta commons dbutils, http://jakarta.apache.org/commons/dbutils (accessed apr. 5, 2006).
27. agile software development definition, wikipedia, http://en.wikipedia.org/wiki/agile_software_development (accessed apr. 5, 2006).
28. prototype javascript framework, http://prototype.conio.net (accessed apr. 5, 2006); script.aculo.us, http://script.aculo.us (accessed apr. 5, 2006); rico, http://openrico.org/rico/home.page (accessed apr. 5, 2006).
29. sabre, www.sabre.com (accessed apr. 5, 2006).
30. niso metasearch initiative, www.niso.org/committees/ms_initiative.html (accessed apr. 5, 2006).
31. google toolbar, http://toolbar.google.com (accessed apr. 5, 2006).
32. dublin core metadata initiative, http://dublincore.org (accessed apr. 5, 2006).
33. mysql, www.mysql.com (accessed apr. 5, 2006).
34. google help center, advanced operators, www.google.com/help/operators.html (accessed apr. 5, 2006).
35. apache lucene, http://lucene.apache.org (accessed apr. 5, 2006).
36. j. nielsen, “search: visible and simple,” alertbox may 13, 2001, www.useit.com/alertbox/20010513.html (accessed nov. 11, 2005).
37. francis shanahan, zuggest.
38. j. r. baker, “the impact of paging versus scrolling on reading online text passages,” usability news 5, no. 1 (2003), http://psychology.wichita.edu/surl/usabilitynews/51/paging_scrolling.htm (accessed nov. 11, 2005); j. nielsen, “search: visible and simple.”
39. j. j. garrett, “ajax: a new approach to web applications.”
40. del.icio.us, http://del.icio.us (accessed apr. 5, 2006); furl, www.furl.net (accessed apr. 5, 2006).
41. google sitemaps (beta) help, www.google.com/webmasters/sitemaps/docs/en/other.html (accessed apr. 5, 2006).
42. ruby on rails, www.rubyonrails.org (accessed apr. 5, 2006); java 2 platform, enterprise edition (j2ee), http://java.sun.com/j2ee (accessed apr. 5, 2006); m. lamonica, “microsoft gets hip to ajax,” cnet news.com, june 27, 2005, http://news.com.com/microsoft+gets+hip+to+ajax/2100-1007_3-5765197.html (accessed nov. 11, 2005).

letter from the editor: a blank page
kenneth j. varnum
information technology and libraries | june 2020
https://doi.org/10.6017/ital.v39i2.12405

nothing is as daunting as a blank page, particularly now. as i sat down to write this issue’s letter, i was struck by how much fundamental uncertainty is in our lives, so much trauma. a blank page can emphasize our concerns that the old familiar should return at all, or that a new, better, normal will emerge.
at the same time, a blank page can be liberating at a time when so much of our social, professional, and personal lives needs to be reconceptualized and reactivated in new, healthier, more respectful and inclusive ways.

we are collectively faced with two important societal ailments. the first is the literal disease of the covid-19 pandemic that has been with us for only months. the other is the centuries-long festering disease of racial injustice, discrimination, and inequality that typifies (particularly, but not uniquely) american society. while some of us may be in better positions to help heal one or the other of these two ailments, we can all do something about both, as different as they are. lend emotional support to those in need of it, take part in rallies if your personal health and circumstances allow, and advocate for change to government officials at all levels from local to national. learn about the issues and explore ways you can make a difference on either or both fronts.

i hope i am not being foolish or naive when i say i believe the blank page before us as a society will be liberating: an opportunity to shift ourselves toward a better, more equitable, more just path.

* * * * * *

to rephrase humphrey bogart’s rick blaine in casablanca, “it doesn’t take much to see that the problems of three little library association divisions don’t amount to a hill of beans in this crazy world.” but despite the small global impact of our collective decision, i am glad our alcts, llama, and lita colleagues chose a united future as core: leadership, infrastructure, futures. watch for more information about what the merged division means for our three divisions and this journal in the months to come.

sincerely, kenneth j.
varnum, editor
varnum@umich.edu
june 2020
https://core.ala.org/

lita president’s message: a framework for member success
emily morton-owens
information technology and libraries | march 2020
https://doi.org/10.6017/ital.v39i1.12105

emily morton-owens (egmowens.lita@gmail.com) is lita president 2019-20 and the assistant university librarian for digital library development & systems at the university of pennsylvania libraries.

this column represents my final venue to reflect on our potential merger with alcts and llama before the vote. after a busy midwinter meeting with lots of intense discussions about the steering committee on organizational effectiveness (scoe)’s recommendations, the divisions, the merger, ala finances, and more, my thoughts keep turning in a particularly wonkish direction: towards our organization.

so many of the challenges before us hinge on one particular dilemma. for those of us who are most involved in ala and lita, the organization (our committees, offices, processes, bylaws, etc.) may be familiar and supportive. but for new members looking for a foothold, or library workers who don’t see themselves in our association, our organization may look like a barrier. moreover, many of our financial challenges are connected to our organization. the organization must evolve, but we must achieve this without losing what makes us loyal members.

while ala and lita have specific audiences of library workers and technologists, we have a lot in common with other membership organizations. one of the responsibilities for the lita vice-president is attendance at a workshop put on by the american society of association executives, where we learn how to steward an organization.
representatives from many different groups attended this workshop, where i had a chance to discuss challenges with leaders from medical and manufacturing associations, and i learned that these challenges are often orthogonal to the subject matter at hand. everyone was dealing with the need to balance membership cost and value, how to give members a voice while allowing for agile decision-making, and how to put on events that are great for attendees without becoming the only way to get value from membership. hearkening back even further, i worked as a library-school intern at a library with a long run of german and french-language serials that i retrospectively cataloged. one batch that has always stuck in my mind is the planning materials for international congresses that were held in the early 20th century by the international societies for horticulture and botany. these events were massive undertakings held at multi-year intervals, gradually planned by international mail. interested parties would receive a lavish printed prospectus, with registration and travel arrangements starting several years in advance. the most interesting documents pertained to the events planned for the mid to late 1930s in europe. these events were cancelled or fell short of intentions because of pre-world war ii political pressures. the congress schedules did not resume until 1950 or later, with some radical changes—for example, german was no longer used as the language of science, and the geographic distribution of events increased significantly in the later 20th century. when i first encountered this material, i was intrigued by how the war affected science. looking back now, i see a dual case study in organizations weathering a crisis whose magnitude we can only imagine, and then reinventing themselves on the other side. both of these organizations still exist and continue to meet, by the way—and i can’t help but feel that reinvention is the key to survival. 
information technology and libraries march 2020 a framework for member success | morton-owens 2

our organizational framework is a key part of the challenge for both ala and lita. i have no doubt that members remain excited about our key issues for advocacy, our subjects for continuing education, and our opportunities for networking. but we have concerns about how we make those things happen. in lita, for example, continuing education requires a massive effort on the part of both member volunteers and staff to organize. we need to brainstorm relevant topics, recruit qualified instructors, schedule and promote the events, and finally run the sessions, collect feedback, and arrange payment for the instructors. this takes the time of the same people we’d like to have creating newsletters and booking conference speakers. meanwhile, right across the hall at ala headquarters, we have staff from alcts and llama doing the same things. these inefficiencies hit at the heart of our financial problems.

at the ala level, scoe has proposed ideas like a single set of dues structures for divisions, and a single set of policies and procedures for all round tables. these changes would reduce the overhead required to operate these groups as unique entities, a financial benefit, while also making it easier for members to afford, join, and move between them, a membership benefit.

that framework also offers us an opportunity to improve our associations. members have been asking how the association can act more responsively on issues of diversity, equity, and inclusion: for example, how can we have incident response that is proactive and sensitive to member needs while recognizing the complexities of navigating that space as a member-based organization? this is a chance to live up to our aspirations as a community.
the actions lita has taken to extend all forms of participation to members who can only participate remotely/online are a way to make us more accessible to library workers regardless of finances or home circumstances. bylaws and policies may not be the most glamorous part of associations but they are the levers we can employ to change the character of our community. coming back to core, we can observe elements of the plan that are responding to both threats and opportunities. members of alcts, llama, and lita know that financial pressures are a major impetus for the merger effort. but, in the hope of achieving a positive reinvention, the merger planning steering committee put most of its emphasis on the opportunity side. the diagram of intersecting interests for core’s six proposed sections (https://core.ala.org/core-overlap/) is a demonstration of the new frontiers of collaboration that core will offer members. the proposed structure of core retains committees while also offering a more nimble way to instantiate interest groups. moreover, the process of creating core reflects the kind of transparent process we want to see in the future. the steering committee and the communications sub-committee crossed not just the three divisions but also different levels of experience and types of prior participation in the divisions. the communications group answered freeform questions, held twitter ama’s, and held numerous forums to collect ideas and feelings about the project. zoom meetings and twitter are not new media, but the sustained effort that went into soliciting and responding to feedback through these channels is a new mode for our divisions. 
the lita board recently issued a statement (https://litablog.org/2020/02/news-regarding-the-future-of-lita-after-the-core-vote/) explaining that if the core vote does not succeed, we don’t see a viable financial path forward and will be spending the latter half of 2020 and the beginning of 2021 working toward an orderly dissolution of lita. it is tempting to approach this crossroads from a place of disappointment or fear. we cannot yet say precisely what it will be like to be a member of core. but when i look at the organizational structure core offers us, i feel hopeful about it being a framework in which members will find their home and flourish. the new division includes what we need for a rich member experience coupled with a streamlined structure that makes it easier to be involved in the ways and extent that make sense for you.

in fifty years, perhaps a future member of core will be writing a letter to their members: looking back at this moment of technological and organizational disruption and reflecting on how we reinvented our organization at the moment it needed it most.
฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ 
฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ 
฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ 
฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀฀฀ ฀ ฀฀฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀฀฀฀฀฀฀฀฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀฀ ฀฀฀฀฀฀฀฀฀฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀฀฀฀฀฀฀฀฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀฀฀ ฀ ฀฀฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀฀฀ ฀ ฀฀฀฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀฀฀฀ ฀ ฀฀฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀฀฀ ฀ ฀฀฀฀ ฀ ฀ ฀ ฀฀ ฀฀฀฀฀฀฀฀฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀฀฀ ฀ ฀฀฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀฀฀ ฀ ฀฀฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀฀฀ ฀ ฀฀฀฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀฀฀฀฀฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀฀฀ ฀ ฀฀฀฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀฀ ฀ ฀ ฀฀฀฀ ฀ ฀฀฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀฀ ฀฀฀฀ ฀ ฀฀฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀฀฀ ฀ ฀฀฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀฀฀ ฀ ฀฀฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀฀฀ ฀ ฀฀฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀฀฀ ฀ ฀฀฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀฀฀฀ ฀ ฀฀฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ 
฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀฀฀฀฀฀฀฀฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀฀฀฀฀฀฀฀฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀฀฀฀฀฀฀฀฀฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀฀฀฀฀฀฀฀฀฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀฀ ฀ ฀ ฀฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ 
฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ 
฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ 
฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ 
฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ 
฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀฀฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ ฀ 
president's message: the end and new beginnings
michelle frisque

54 information technology and libraries | june 2010

as i write this last column, the song "my way" by frank sinatra keeps going through my head. while this is definitely not my final curtain, it is the final curtain of my presidency. like sinatra i have a few regrets, "but then again, too few to mention." there was so much more i wanted to accomplish this year; however, as usual, my plans were more ambitious than the time i had available. being lita's president was a big part of my life, but it was not the only part. those other parts, like family, friends, work, and school, demanded my attention as well.

i have thought about what to say in this final column. do i list my accomplishments of the last year? nah, you can read all about that in the lita annual report, which i will post in june. tackle some controversial topic? while i can think of a few, i have not yet thought of any solutions, and i do not want to rant against something without proposing some type of solution or plan of attack. i thought instead i would talk about where i have devoted a large part of my lita time over the last year.

as i look back at the last year, i am also thinking ahead to the future of lita. we are currently writing lita's strategic plan. we have a lot of great ideas to work with. lita members are always willing to share their thoughts both formally and informally. i have been charged with the task of taking all of those great ideas, gathered at conferences, board meetings, hallway conversations, surveys, e-mail, etc., to create a roadmap for the future.

after reviewing all of the ideas gathered over the last three years, i was able to narrow that list down to six major goal areas. with the assistance of the lita board of directors and the lita executive committee, we whittled the list down to five major goal areas of the lita strategic plan:

1. training and continuing education: lita will be nationally recognized as the leading source for continuing education opportunities for library information technologists and all library staff who have an interest in technology.
2. innovation: to serve the library community, lita expert members will identify and demonstrate the value of new and existing technologies within ala and beyond.
3. advocacy and policy: lita will advocate for and participate in the adoption of legislation, policies, technologies, and standards that promote equitable access to information and technology.
4. the organization: lita will have a solid structure to support its members in accomplishing its mission, vision, and strategic plan.
5. collaboration and outreach: lita will reach out and collaborate with other library organizations to increase the awareness of the importance of technology in libraries, improve services to existing members, and reach out to new members.

the lita executive committee is currently finalizing the strategies lita will pursue to achieve success in each of the goal areas. it is my hope that the strategies for each goal are approved by the lita board of directors before the 2010 ala annual conference in washington, d.c. that way the finalized version of the lita strategic plan can be introduced to the committee and interest group chairs and the membership as a whole at that conference. this will allow us to start the next fiscal year with a clear road for the future.

while i am excited about what is next, i have also been dreading the end of my presidency. i have truly enjoyed my experience as lita president, and in some ways wish it was not about to end. i have learned so much and have met so many wonderful people. thank you for giving me this opportunity to serve you and for your support. i have truly appreciated it.

michelle frisque (mfrisque@northwestern.edu) is lita president 2009–10 and head, information systems, northwestern university, chicago.

predicting the need for multiple copies of books
robert s. grant, presently at hope college, holland, michigan.

an industrial inventory technique adapted to a university library's computer based circulation system as one aid in identifying heavily used books for multiple-copy purchase.

the university of windsor has approximately 5,000 students. the university library's open stacks contain more than 300,000 volumes, 100,000 of which are non-circulating (bound periodicals and reference books). there are approximately 200,000 books available for circulation, a books-to-student ratio of 40:1. nevertheless, a perennial student complaint is: "why is it that every time i need a book, someone else has already checked it out?"

to help mitigate this problem, the library decided several years ago to embark upon a programme of purchasing multiple copies of much used books. the question then became one of determining which books would need duplicating, and how many more copies of each title would need to be bought. suggestions of titles to be duplicated were at first solicited from the faculty, but ever-increasing demands on them prevented their being more than minimally cooperative.
three years ago, in an effort to increase the availability of books to undergraduates, the library changed its circulation period for undergraduates from two weeks to one week, with unlimited renewals. at the same time there was instituted a system whereby a student filled out a reserve card requesting that he be allowed to check out a book upon its return. when there were five or more such requests, then a copy of the book was to be purchased. although this system of ordering multiple copies was very cumbersome, it was better than nothing.

an article by william l. leffler (1) suggested a system of adapting industrial inventory techniques to the problem of identifying books to be duplicated that would be compatible with the library's computer based circulation system and also could be expected to be simpler and more thorough than the above method of buying multiple copies. without rehearsing leffler's arguments, the basic formula used in this project can be simply stated as:

$$N_{\text{books}} = \frac{n \times N_{95\%}}{T}$$

where

$N_{\text{books}}$ = the number of copies of a single title necessary to meet at least 95% of student demand for that title;

$T$ = number of days of observation, i.e., the number of days in the academic year in which students are permitted to check out books (a constant of 273 in this formula, being the number of days in the period from 1 september to 31 may);

$n$ = total number of times a title circulated during $T$;

$N_{95\%} = A + 2S$, where $A$ = the average length of time a title was on loan, i.e., the total number of days in which a title was in circulation divided by the number of times ($n$) the title circulated;

$S$ = standard deviation, which is computed as:

$$S = \sqrt{\frac{\sum (a_i - A)^2}{n}}$$

(the divisor under the radical is not legible in this copy; $n$ is assumed here.)

$a_i$ is the length of time, in terms of days, that a single title was off the shelves each time it circulated, and is not to be confused with $A$, which is the average length of time (over the academic year) that the same title was on loan.
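the formula above can be sketched in a few lines of python. this is only an illustration under stated assumptions, not the pl/1 programme used in the project: the function name `projected_copies` is hypothetical, and the standard deviation is taken in its population form (dividing by n), which the source does not confirm.

```python
import math


def projected_copies(loan_days, observation_days=273):
    """Return (A, S, N95, N_books) for one title.

    loan_days        -- list of a_i values: days off the shelf for each loan
    observation_days -- T, the constant of 273 days given in the text
    """
    n = len(loan_days)
    if n == 0:
        raise ValueError("the title never circulated during the period")
    average = sum(loan_days) / n                      # A
    # Assumption: population standard deviation (divisor n, not n - 1).
    deviation = math.sqrt(sum((a - average) ** 2 for a in loan_days) / n)  # S
    n95 = average + 2 * deviation                     # N95% = A + 2S
    n_books = n * n95 / observation_days              # N_books = n * N95% / T
    return average, deviation, n95, n_books
```

as a quick check with made-up numbers, a title borrowed twice, for 4 and 8 days, gives an average of 6, a deviation of 2, an n95% of 10, and a projected need of 20/273, or roughly 0.07 copies.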
the sum ($\sum$) of all the $a_i$'s was used earlier to calculate $A$:

$$A = \frac{a_1 + a_2 + a_3 + \cdots}{n}$$

for example, if a book circulated three times during the academic year (the first time for 18 days, the second time for 20 days, and the third time for 3 days) then $A$ (the average length of time the book was on loan) would be calculated as

$$\frac{18 + 20 + 3}{3}, \text{ or } 13.66.$$

at this point it should be noted that although the library continues to accept request cards for books presently on loan (and to reserve books for the requestors), these requests are not used as part of the data in determining the number of copies necessary to meet at least 95% of the demand. for one thing, there is no way of knowing how long the person making a request will want to keep a book out, and time is an important element in the formula. but more importantly, the formula, as it now stands, attempts to account for unsatisfied requests. it assumes that in at least some instances there will be more requests for a title than there are copies in the library. by providing an analysis of the present circulation profile of each book, the formula attempts to predict the number of copies of each title the library would need to have in order to more adequately accommodate unsatisfied demand.

66 journal of library automation vol. 4/2 june, 1971

[fig. 1. programme logic. flowchart; the recoverable box labels are: calculate t; n of loans; add 1 to copies circulating; calculate no. of days on loan; store information in table; total n of days on loan; re-set table; calculate average length of loan; calculate standard deviation; two further "calculate" boxes whose labels are lost; print report.]

the programme for performing the calculations is written in pl/1 and is run on an ibm 360/50 (figure 1). the execution time for 140,000 circulation records (each time a book circulates the data on its circulation is considered a single record) is 15 minutes.
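the programme logic can be approximated as a single pass that groups circulation records by title and then applies the formula to each group. the sketch below is a hedged reconstruction in python, not the original pl/1: the function name `analyse`, the tuple layout of a record, and the population form of the standard deviation are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def analyse(records, observation_days=273):
    """Build report rows from (call_number, accession_number, days_on_loan).

    Each row is (call_number, projected_need, transactions,
    average_loan_period, copies_circulated).
    """
    loans = defaultdict(list)   # call number -> list of loan lengths (a_i)
    copies = defaultdict(set)   # call number -> accession numbers seen
    for call_number, accession_number, days_on_loan in records:
        loans[call_number].append(days_on_loan)
        copies[call_number].add(accession_number)

    report = []
    # Plain string sorting stands in for true LC call number order here.
    for call_number in sorted(loans):
        days = loans[call_number]
        n = len(days)
        average = sum(days) / n
        # Assumption: population standard deviation (divisor n).
        deviation = math.sqrt(sum((a - average) ** 2 for a in days) / n)
        need = n * (average + 2 * deviation) / observation_days
        report.append((call_number, round(need, 2), n,
                       round(average, 2), len(copies[call_number])))
    return report
```

fed the three loans of the worked example (18, 20, and 3 days) spread over two copies of one title, this yields a single row with three transactions, an average loan period of 13.67 days (the text truncates the same figure to 13.66), and two copies circulated.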
the historical record file, the source of data for the programme, is incremented each time a book in circulation is returned. figure 2 shows the format of this file. the file itself is a sequential file stored on magnetic tape, updated daily to include the previous day's circulation data. entries are arranged in lc call number-accession number order.

field                               length   accumulative length
card type                              1        1
lc call number                        29       30
author                                15       45
accession number                       6       51
spare                                  1       52
card sequence number                   6       58
spare                                  2       60
borrower's id code                     1       61
borrower's id number                   6       67
spare                                  3       70
action code                            1       71
due date (mmddyy) (mo.-day-yr.)        6       77
spare                                  3       80
indicator                              1       81
date charged out (yyddd) (yr.-day)     5       86
date returned (yyddd) (yr.-day)        5       91

fig. 2. format of historical record file.

results

after the calculations described above have been performed for every title circulated during the academic year, a print-out of the results is produced (figure 3). in order to limit paperwork, only those results under "projected need" which were ≥ 1.00 appear on the print-out; any results less than 1.00 were suppressed. the column labelled "transactions" is simply the number of times the book was checked out and checked back in again. the column "average loan period" is the $A$ described in the formula above. and the column "copies circulated" is the number of books with the same classification number as listed in the left-hand column, but with different accession numbers, checked out during the year. this figure is not the number of copies of the book that the library owns, which could, in some instances, be more copies than were actually circulated. the column labelled "projected need" should, according to the calculations, indicate the number of copies of a title which could accommodate the demand for that title with 95% certainty.
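the fixed-width layout of figure 2 lends itself to a small parser. the following python sketch is illustrative only: the field names are hypothetical labels derived from the figure, and the reading of the yyddd dates as two-digit year plus day of year follows the "(yr.-day)" note in the figure rather than any documented specification.

```python
from datetime import datetime

# (hypothetical field name, length), in the order given in figure 2.
FIELDS = [
    ("card_type", 1), ("lc_call_number", 29), ("author", 15),
    ("accession_number", 6), ("spare_1", 1), ("card_sequence_number", 6),
    ("spare_2", 2), ("borrower_id_code", 1), ("borrower_id_number", 6),
    ("spare_3", 3), ("action_code", 1), ("due_date", 6), ("spare_4", 3),
    ("indicator", 1), ("date_charged_out", 5), ("date_returned", 5),
]
RECORD_LENGTH = sum(length for _, length in FIELDS)  # 91 characters


def parse_record(line):
    """Split one 91-character record into a dict of stripped field values."""
    if len(line) < RECORD_LENGTH:
        raise ValueError("record shorter than %d characters" % RECORD_LENGTH)
    record, position = {}, 0
    for name, length in FIELDS:
        record[name] = line[position:position + length].strip()
        position += length
    return record


def days_on_loan(record):
    """Days between charge-out and return, both stored as yyddd."""
    charged = datetime.strptime(record["date_charged_out"], "%y%j")
    returned = datetime.strptime(record["date_returned"], "%y%j")
    return (returned - charged).days
```

a record charged out on day 244 of 1970 and returned on day 262 would count as 18 days on loan. note that python's `%y` maps two-digit years 69 to 99 onto 1969 to 1999, which suits data from this period.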
in order to find out whether or not the library should purchase more copies of a particular title, the number listed in this column is simply checked against the number of copies listed for this classification number in the official shelf list.

classification    author             projected   trans.   avg loan   copies
                                     need                 period     circul.
am 101 .c3488     canada-national    3.61        37       10.45      17
b 56 .c6          collins-james-d    1.14        21        8.00       2
b 65 .b6          bodenheimer,e.     1.21        12       11.50       3
b 65 .r6          rommen-heinrich    1.34         5       20.60       2
b 67 .b58         blake-ralph-m      2.00         4       36.75       2
b 67 .n22         nagel-ernest       2.34        23       11.39       3
b 72 .c63         copleston f.c.     2.39        27        9.18      10
b 72 .hs          gilson-e.h.        1.64        26        9.03      14
b 72 .j6          joad-cyril-edwi    2.84         8       21.75       2
b 72 .p3          parker f.h.        2.48         4       41.00       2
b 358 .c57        plato              5.68        21       15.61       3
b 358 .j8         plato              2.00        38        8.07      10
b 358 .w7         plato              2.72         5       39.80       3
b 377 .a285       plato              3.65         8       35.37       2
b 378 .a2c6       plato              1.58         2       73.00       2
b 381 .ast35      plato              1.04         3       36.33       3
b 385 .a6         anderson-f h       2.92        16       13.43       2
b 395 .b77        brumbaugh-r-s      2.05        12       13.33       1
b 395 .c6         crombie-i          3.02        17       12.41       2
b 395 .g67        grube-george m     5.13        30       10.30       5
b 395 .g78        guardini,r.        2.04        17       12.23       4
b 395 .k6         koyre-alexandre    1.13         4       21.75       1
b 395 .l6         lodge rupert c     1.88         3       51.33       1
b 395 .s53        shorey, paul       4.69        23       11.91       4
b 398 .t25        taylor,a.e.        1.31        28        7.75       5
b 398 .e8h17      hall, robert-w.    2.99        11       16.72       2
b 407 .l8l9       lutoslawski,w.     3.10         4       59.25       1
b 505 .m2         aristoteles        2.88        17       12.00       7
b 505 .o3         oates-w.j.         3.86         9       27.33       7
b 528 .z413       zeller-eduard      1.39         6       33.00       2
b 528 .p751       pohlenz-max        1.35         5       34.60       2
b 667 .s25        sambursky sam      1.36         5       42.40       1
b 701 .d4d6       dondaine, h.f.     1.03         2       69.00       1
b 701 .a4e5       proclus-diadoch    1.11         2       72.50       1

fig. 3. circulation history analysis report.

for example, the book classified as b.72.j6 shows a "projected need" of 2.84. therefore if the library had three copies of this book, and the book's circulation pattern did not change significantly in the immediate future, then the library would be able to fill 95% of the requests. the official shelf list, however, indicates that the library only owns two copies of this title, suggesting that at least one more copy should be purchased to meet present demand. these calculations do not anticipate future demand on the book. also, doubling the number of copies can never succeed in doubling circulation, a fact demonstrated by leimkuhler (2). this print-out, therefore, can only serve as one guide to multiple-copy purchase.

precautions and pitfalls

in using the results of these computations as a guide to the purchase of multiple copies, the librarian should be aware of several factors which may have distorted the results. one is that the student who checks out the only copy of a book and keeps checking it out all year, in lieu of buying his own copy, creates a false "demand" for the book. it may be that he is the only person in the university interested in it, and when he graduates this book may sit out its life on the shelves completely unused. however, since the historical record file contains the borrower's id number, it is possible to distinguish between an original loan and a renewal. the first time the borrower's id number appears on the book's circulation record indicates the original loan. each additional and consecutive time the same borrower's id number appears on the same circulation record indicates a renewal. although the pilot project did not contain provisions for obviating this problem, it would have been simple enough to build into the programme a mechanism for suppressing the unwanted data.
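the two checks described in this section can be sketched briefly in python. the first function collapses consecutive loans by the same borrower into one original loan, following the renewal rule stated above; the second compares the projected need with the shelf-list count. both function names are hypothetical, and rounding the projected need up to a whole copy is an assumption drawn from the b.72.j6 example (a need of 2.84 treated as three copies), not a rule stated in the article.

```python
import math
from itertools import groupby


def collapse_renewals(borrower_ids):
    """Keep one entry per run of consecutive loans by the same borrower.

    The first appearance of an id in a run is the original loan; each
    consecutive repeat of that id is treated as a renewal and dropped.
    """
    return [borrower for borrower, _run in groupby(borrower_ids)]


def copies_to_buy(projected_need, copies_owned):
    """Whole copies still required to cover the projected need (never negative)."""
    return max(0, math.ceil(projected_need) - copies_owned)


# Hypothetical circulation history for one book, in chronological order.
history = ["004711", "004711", "004711", "009001", "004711"]
original_loans = collapse_renewals(history)

# The b.72.j6 example: projected need 2.84, two copies on the shelf list.
shortfall = copies_to_buy(2.84, 2)
```

a borrower who returns to the book after someone else has had it counts as a new original loan, since only consecutive repeats are treated as renewals.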
a faculty member who assigns parts of books for students to read, but does not place the books on reserve, forces competition for them on the open shelves. this too creates a demand which may not exist after the professor leaves the university or stops teaching a particular course. the librarian should be aware of such possible short-lived demands that may never recur.

the circulation analysis programme was executed at the end of one academic year in order to provide the university of windsor librarians with guidelines for purchase of multiple copies of books to be used in the next academic year. if it were known that a particular book receiving heavy use one year would not receive equally heavy use in the next (because, for example, the particular course requiring that book would no longer be taught; or the book would be placed on a "two-hour reserve" for the coming academic year; or the book circulated frequently in one year only because it was on the "best-seller list"), then it would be folly to purchase three or four additional copies of the book just because the computer print-out indicated that a number of additional copies were needed. other factors, therefore, although not included in the input data, are certainly relevant in determining the need for multiple copies.

at the university of windsor library, a book that needs to be re-bound because of heavy use or mutilation is charged out to the bindery department. it then shows up on the historical record file, just as though it had been charged out. but since the "borrower's id number" for books charged to the bindery department consists of all zeroes, it would be simple enough to identify and suppress these particular records as unwanted data.

by-products

in addition to providing a list of books to be considered for duplication, the historical record file upon analysis revealed several other interesting facts about the university library's circulation.
most noteworthy is the fact that, although there were more than 200,000 circulating books sitting on the open shelves at the time of this pilot project, only 40,205 different titles circulated for a total of 134,276 times. assuming there were only 100,000 different titles among the 200,000 books, this would mean that nearly 60% of the collection was probably not used by the students.

of the 40,205 different works which did circulate, the calculations indicated that only 3,257 titles required one or more copies in order to fill 95% of the requests. of this latter number, only 570 titles were in need of duplication. (that is to say, the number of copies listed under projected need exceeded the number of copies actually owned by the library as indicated by the shelf list.) a random sample comprising one-third of these 570 titles was checked to see whether or not the books were in print. indications were that 38% of the titles in need of duplication were no longer in print.

conclusions

a close examination of the 570 titles apparently in need of duplication reveals that, with very few exceptions, students are apparently checking out only books that are curriculum oriented in the most narrow sense, i.e., books which they need to use in writing term papers. nevertheless, one can appreciate the fact that these books are in demand by the student, and if the library is to be responsive to users' demands on its facilities, it will need to spend part of the book budget each year purchasing multiple copies of the most heavily used books. unfortunately, even with these good intentions and the sophisticated assistance of the computer, students' demands for books will still be frustrated (at least one-out-of-three times) because books which need to be duplicated are no longer in print.

programme

a print-out copy of the circulation analysis programme described above is available from mrs.
jean griffiths, computer centre, university of windsor, windsor, ontario, canada.

acknowledgments

the initial impetus and continuous guidance for this project was provided by albert v. mate, assistant librarian for public services at the university of windsor. dr. martin basic, faculty of business administration, acted as consultant. systems analyst was mrs. jean griffiths, and programmer was mrs. lillian jin, both at the university computer centre.

references

1. leffler, william l.: "a statistical method for circulation analysis," college and research libraries, 25 (1964), 488-490.
2. leimkuhler, ferdinand f.: "systems analysis in university libraries," college and research libraries, 27 (1966), 13-18.

editorial board thoughts | farnel

this past spring, my alma mater, the school of library and information studies (slis) at the university of alberta, restructured the it component of its mlis program. as a result, as of september 2010, incoming students are expected to possess certain basic it skills before beginning their program.1 these skills include the following:

■■ comprehension of the components and operations of a personal computer
■■ microsoft windows file management
■■ proficiency with microsoft office (or similar) products, including word processing and presentation software
■■ use of e-mail
■■ basic web browsing and searching

this new requirement got me thinking: is this common practice among ala-accredited library schools? if other schools are also requiring basic it skills prior to entry, how do those required by slis compare? so i thought i’d do a little investigating to see what others in “library school land” are doing. before i continue, a word of warning: this was by no means a rigorous scientific investigation, but rather an informal survey of the landscape.

i started my investigation with ala’s directory of institutions offering accredited master’s programs.2 there are fifty-seven institutions listed in the directory.
i visited each institution’s website and looked for pages describing technology requirements, computer-competency requirements, and the like. if i wasn’t able to find the desired information after fifteen or twenty minutes, i would note “nothing found” and move on to the next. in the end i found some sort of list of technology or computer-competency requirements on thirty-three (approximately 58 percent) of the websites. it may be the case that such a list exists on other sites and i didn’t find it. i should also note that five of the lists i found focus more on software and hardware than on skills in using said software and hardware. even considering these conditions, however, i was somewhat surprised at the low numbers. is it simply assumed that today’s students already have these skills? or is it expected that they will be picked up along the way? i don’t claim to know the answers, and discovering them would require a much more detailed and thorough investigation, but they are interesting questions nonetheless. once i had found the requirements, i examined them in some detail to get a sense of the kinds of skills listed. while i won’t enumerate them all, i did find the most common ones to be similar to those required by slis— basic comfort with a personal computer and proficiency with word processing and presentation software, e-mail, file management, and the internet. a few (5) schools also list comfort with local systems (e-mail accounts, online courseware, etc.). several (7) schools mention familiarity with basic database design and functionality, while a few (5) list basic web design. very few (3) mention competency with security tools (firewalls, virus checkers, etc.), and just slightly more (4) mention familiarity with web 2.0 tools like blogs, wikis, etc. while many (14) specifically mention searching under basic internet skills, few (7) mention proficiency with opacs or other common information tools such as full-text databases. 
interestingly, one school has a computer programming requirement, with mentions of specific acceptable languages, including c++, pascal, java, and perl. but this is certainly the exception rather than the rule. i was encouraged that there seems to be a certain agreement on the basics. but i was a little surprised at the relative rarity of competency with wikis and blogs and all those web 2.0 tools that are so often used and talked about in today’s libraries. is this because there is still some uncertainty as to the utility of such tools in libraries? or is it because of a belief that the members of the millennial or “digital” generation are already expert in using them? i don’t know the reasons, but it is interesting to ponder nonetheless. i was also surprised that a level of information literacy isn’t listed more often, particularly given that we’re talking about slis programs. i do know, of course, that many of these skills will be developed or enhanced as students work their way through their programs, but it also seems to me that there is so much other material to learn that the more that can be taken care of beforehand, the better. librarians work in a highly technical and technological environment, and this is only going to become even more the case for future generations of librarians. certainly, basic familiarity with a variety of applications and tools and comfort with rapidly changing technologies are major assets for librarians. in fact, ala recognizes the importance of “technological knowledge and skills” as core competencies of librarianship. specifically mentioned are the following: ■■ information, communication, assistive, and related technologies as they affect the resources, service delivery, and uses of libraries and other information agencies. ■■ the application of information, communication, assistive, and related technology and tools consistent with professional ethics and prevailing service norms and applications. 
■■ the methods of assessing and evaluating the specifications, efficacy, and cost efficiency of technology-based products and services.
■■ the principles and techniques necessary to identify and analyze emerging technologies and innovations in order to recognize and implement relevant technological improvements.3

given what we know about the importance of technology to librarians and librarianship, my investigation has left me with two questions: (1) why aren’t more library schools requiring certain it skills prior to entry into their programs? and (2) are those who do require them asking enough of their prospective students? i hope you, our readers, might ask yourselves these questions and join us on italica for what could turn out to be a lively discussion.

editorial board thoughts: system requirements
sharon farnel
information technology and libraries | december 2010

sharon farnel (sharon.farnel@ualberta.ca) is metadata & cataloguing librarian at the university of alberta in edmonton, alberta, canada.

references

1. university of alberta school of library and information studies, “degree requirements: master of library & information studies,” www.slis.ualberta.ca/mlis_degree_requirements.cfm (accessed aug. 5, 2010).
2. american library association office for accreditation, “library & information studies directory of institutions offering accredited master’s programs 2008–2009,” 2008, http://ala.org/ala/educationcareers/education/accreditedprograms/directory/pdf/lis_dir_20082009.pdf (accessed aug. 5, 2010).
3. american library association, “ala’s core competences of librarianship,” january 2009, www.ala.org/ala/educationcareers/careers/corecomp/corecompetences/finalcorecompstat09.pdf (accessed aug. 5, 2010).

hoist by their own petard

a funny thing happened at ala midwinter. what's more, it was fascinating as well, for it was one of the loveliest examples of "communications dysfunction" i've ever seen. (dysfunction: impaired or abnormal functioning.)
librarians, as information scientists, have always been concerned with the transfer of information. in recent times, this concern has been explicitly identified as constituting the major component of the profession's domain. whether one interprets information to be the book, and discusses its transfer in terms of acquisitions, circulation, and interlibrary loan, or one interprets information to be datum, and discusses transfer in terms of access, retrieval, and transfer, the fact remains that information transfer is the area of concern of the information profession.

yet, as is already evident from the paragraph above, the medium being used to relay the message, the unit which is basic to the process of information transfer, i.e., the word, is a fractious thing. one would think that informationalists would be among the most alert to this frailty of language; yet, though the problem has been addressed at great length by a great many, members of our profession have not been predominant among them. we, too, use words ever more loosely, violate structure ever more often, and transpose jargon ever more freely, unaware and, apparently, uncaring that in the process we are vitiating the very foundation of our field.

and thus, at the palmer house in chicago, during a very balmy january midwinter meeting of the american library association, a select group of professional practitioners who had gathered together to work together found themselves caught in their own trap. they were unable to communicate! information specialists: listening without hearing, reading without comprehending, talking without communicating. it was almost frightening.

"network" concerns got defined in terms of the need for reimbursement for interlibrary loan. the phrases "data base interchange," "machine-readable record exchange," and "networking" were being used interchangeably, engendering damaging misconceptions.
the distinction between "contract negotiation assistance" (which clr will provide the anable serials group) and "contracting" (which clr is not doing here) was not made. legislative "networks" described procedural, not substantive, activity. the jargon of internal revenue code section 4942(j)(3) (operating foundation) and the jargon of the technical sector (operations) were interpreted as being synonymous. and the word standard lost its identity altogether.

the irony is overwhelming. like the old adage about the shoemaker's children who don't have shoes, it would appear that it is the information specialists who cannot communicate.

ruth l. tighe, new england library information network
journal of library automation vol. 7/1 march 1974

journal of library automation vol. 14/2 june 1981

the desperation from a downtime situation. great neck library is also planning to use the apples for other functions, which, it is hoped, will be implemented soon.

multimedia catalog: com and online

kenneth j. bierman: tucson public library, tucson, arizona.

like many public libraries, the tucson public library (tpl) is closing its card catalog and implementing a vendor-supplied microform catalog. unlike most of these other libraries, however, the tpl microform catalog will not include location or holding information. the indication of where copies of a particular title are actually available (i.e., which of the fifteen possible branch locations) will be available only by accessing a video display terminal connected to the online circulation and inventory control system.

conceptually, the tpl catalog will be in two parts with each part intended to serve different functions.1 the microform catalog (copies available in both film and fiche format) will fulfill the bibliographic function of the catalog. this catalog will contain bibliographic description and provide the traditional access points of author, title, and subject.
the online catalog (online terminals are in place at all reference desks and a few public access terminals will also be available) will fulfill the finding or locating function of the catalog. this catalog will contain very brief bibliographic description and will only be searchable by author, title, author/title, and call number, and will contain the current status of every copy of every title in the library system (i.e., on shelf, checked out, at bindery, reported missing, etc.).

why did the tucson public library make this decision? there are two major reasons:

1. accuracy. the location information, if provided in the microform catalog, would always be inaccurate and out of date. assuming that the locations listed in the latest edition of the microform catalog were completely accurate when the catalog was first issued (an unrealistic assumption to begin with as anyone who has ever worked with location information at a public library with many branches well knows!), the location information would become increasingly less accurate with each day because of the large number of withdrawals, transfers, and added copy transactions that occur (more than 100,000 a year). in addition, at any given time, one-quarter to one-third of the materials in busy branches are not on the shelf because they are either checked out or waiting to be reshelved. thus, the microform catalog would indicate that these materials were available at specific branches when a significant percentage would in fact not be available at any given time. in short, even in the best of circumstances, easily half of the location information would be incorrect in telling a user where a copy of a title was actually available at that moment.

2. cost. a study done at the tucson public library indicated that close to half of the staff time of the cataloging department was spent dealing with location and holding information. this time includes handling transfers, withdrawals, and added copies.
all of this record keeping is already being done as a part of the online circulation and inventory control system (the tucson public library has no card shelflist containing copy and location information but rather relies completely on the online file for this type of information). to "duplicate" the information in the microform catalog would cost an estimated $40,000 to $60,000 a year and the information in the microform catalog would never be accurate or up to date for the reasons outlined above.

figure 1 is a brief summary of how the bibliographic system will work.

known-item search (37 percent of tpl catalog use according to catalog use survey conducted at the tpl in 1971)
user searches microform catalog by author and/or title.
if user does not find desired bibliographic entry, user either leaves unsatisfied or goes to desk (or public access terminal) for help.
if user finds the desired bibliographic entry, he/she writes down call number (or author for fiction) and proceeds to shelf.
if user finds book on shelf he/she checks it out.
if user does not find book on shelf, user either leaves unsatisfied or goes to desk (or public access terminal) to obtain holdings information or ask for help (put on reserve, borrow from another library, possible purchase of additional copies, etc.).

subject search (63 percent of tpl catalog use by public according to catalog use survey conducted at the tpl in 1971)
user searches microform catalog.
user writes down call number(s) and proceeds to shelf.
if user finds appropriate material(s), he/she checks it out.
if user does not find appropriate material he/she leaves unsatisfied or goes to desk for help (reference interview, etc.).

fig. 1. summary of how system will work.

would the system in figure 1 be improved if holdings were included in the microform catalog? on the surface, the obvious answer is yes: more information is always better. but, if we examine the situation in depth, perhaps not.
let us look at some hypothetical situations.

if the user is doing a search and does not find the desired entry/entries in the microform catalog, it makes no difference whether holdings are included in the catalog. the user will still either leave unsatisfied or go to the desk for help.

if the user is doing a known-item search, finds the desired item, and notes that the agency he/she is at is listed as a holding agency, he/she will proceed to the shelf. if the desired material is found, fine. if not (because the material is checked out, reported missing, or withdrawn), he/she will either leave unsatisfied or go to the desk (or public access terminal) for help.

if the user is doing a known-item search and finds the desired item in the microform catalog but notes that the agency is not listed as a holding agency, what are his/her choices? the user can go away unsatisfied without checking the shelves (although there may be a copy on the shelf because a copy may have been added to that agency since the microform catalog was last recumulated) or he/she can go to the desk (or public access terminal) to obtain help; here he/she will have access to the "real" holdings information, on the online system. the user could notice from the holdings in the microform catalog that another branch has the item and drive to the other branch. however, when the user gets there he/she may discover that the item is not available, information that could have been found in the online system at the original branch if he/she had gone to the desk (or public access terminal).

the purpose of the above exercise is to demonstrate that in all cases the user is still going to require access to the online catalog in order to determine holdings more accurately. with time, this access will become increasingly self-service through public access terminals.
from the user's point of view, providing inaccurate holdings in the microform catalog does very little good and can actually do harm by leaving the impression that, if a library is listed as a holding library, that library will have the item (a false conclusion because of checkouts, reported missings, and withdrawals) or leaving the impression that if a library is not listed as a holding library, that library will not have the item (a false conclusion because a copy could have been added recently but that fact is not yet reflected in the microform catalog). if the user is doing a subject search, holdings are of less value in the catalog anyway because he is primarily getting suggested classification numbers in order to browse.

the tucson public library could not have made the above decisions if it did not have a complete online file of all its holdings (including even reference materials that never circulate). but since this data did exist (after a five-year bar-coding effort) and since more than forty online terminals were already in place throughout the library system to access the online file, the decision not to include locations or holdings in the microform catalog seemed reasonable. in the longer-range future (1990?), it is very likely that the entire catalog will be available online. in the meantime, the tucson public library did not want to divide its resources maintaining two location records, but rather wanted to concentrate resources in maintaining one accurate record of locations available as widely as possible throughout the library system (by installing more online terminals for staff and public use).

was this decision a sound one? we don't know. the microform catalog has not yet been introduced for public use. by the end of this year we should have some preliminary answers to this question.

references

1. robin w. macdonald and j.
mcree elrod, "an approach to developing computer catalogs," college & research libraries 34:202-8 (may 1973).

a structure code for machine readable library catalog record formats

herbert h. hoffman: santa ana college, santa ana, california.

libraries house many types of publications in many media, mostly print on paper, but also pictures on paper, print and pictures on film, recorded sound on plastic discs, and others. these publications are of interest to people because they contain recorded information. more precisely said, because they contain units of intellectual, artistic, or scholarly creation that collectively can be called "works." one could say simply that library materials consist of documents that are stored and cataloged because they contain works.

the structure of publications into documents (or "books") and works, the clear distinction between the concept of the information container as opposed to the contents, deserves more attention than it has received so far from bibliographers and librarians. the importance of the distinction between books and works has been hinted at by several theoreticians, notably lubetzky. however, the idea was never fully developed. the cataloging implications of the structural diversity among documents were left unexplored. as a consequence, librarians have never disentangled the two terms book and work. from the paris principles and the marc formats to the new second edition of the anglo-american cataloguing rules, the terms book and work are used loosely and interchangeably, now meaning a book, now a work proper, now part of a work, now a group of books. such ambiguity can be tolerated as long as each person involved knows at each step which definition is appropriate when the term comes up.
but as libraries ease into the age of electronic utilities and computerized catalogs based on records read by machine rather than interpreted by humans, a considerably greater measure of precision will have to be introduced into library work. as one step toward that goal an examination of the structure of publications will be in order.

the items that are housed in libraries, regardless of medium, are of two types. they are either single documents, or they are groups of two or more documents. items that contain two or more documents are either finite items (all published at once, or with a first and a last volume identified) or they are infinite items (periodicals, intended to be continued indefinitely at intervals). schematically, these three types of bibliographic items in libraries can be represented as shown in figure 1. it should be noted that all publications, all documents, all bibliographic items in li

article

product ownership of a legacy institutional repository: a case study on revitalizing an aging service

mikala narlock and don brower

information technology and libraries | september 2021
https://doi.org/10.6017/ital.v40i3.13241

mikala narlock (mnarlock@nd.edu) is digital collections strategy librarian, university of notre dame. don brower (dbrower@nd.edu) is digital projects lead, university of notre dame. © 2021.

abstract

many academic libraries have developed and/or purchased digital systems over the years, including digital collection platforms, institutional repositories, and other online tools on which users depend. at hesburgh libraries, as with other institutions, some of these systems have aged without strong guidance and resulted in stale services and technology. this case study will explore the lengthy process of stewarding an aging service that satisfies critical external needs.
starting with a brief literature review and institutional context, the authors will examine how the current product owners have embraced the role of maintainers, charting a future direction by defining a clear vision for the service, articulating firm boundaries, and prioritizing small changes. the authors will conclude by reflecting on lessons learned and discussing potential future work, both at the institutional and professional level.

introduction

our home-grown institutional repository (ir) began almost a decade ago with enthusiasm and promise, driven by an eagerness to meet as many use cases as possible. over time, the code grew unwieldy, personnel transitioned into new roles, and new priorities emerged, leaving few individuals to manage the repository, allocate resources, articulate priorities, or advocate for user needs. this in turn left the system underutilized and undervalued. in mid-2019, two product owners (pos) at hesburgh libraries, university of notre dame were named to oversee the service and tasked with determining how the service should continue, if at all. the pos began by evaluating the service, current commitments, and benefits, and identifying potential on-campus adopters of the service.
after agreeing the service should continue, the pos started the lengthy process of turning the metaphorical ship, prioritizing modest adjustments that would have large payoffs.1

selected literature review

since the 2003 seminal article by clifford lynch, much has been authored on the topic of institutional repositories as academic libraries and archives have flocked to create their own.2 a complete literature review is beyond the scope of this case study: institutional repositories have contended and continue to contend with a wide variety of challenges, including legal, ethical, and socio-technical challenges.3 while the lessons presented in this case study can apply to a wide variety of legacy services, a brief overview of some of the literature surrounding irs is crucial to understanding the challenges the authors faced as product owners.

with irs broadly defined “as systems and service models designed to collect, organize, store, share, and preserve an institution’s digital information or knowledge assets worthy of such investment,” libraries and archives flocked to build the “essential infrastructure for scholarship in the digital age.”4 built on the assumption that faculty members would flock to the service to deposit their works, irs were expected to solve many problems, including supporting open access publishing and digital asset management.5 as articulated by dorothea salo, however, the field of dreams model (“build it and they will come”) was insufficient as repositories often failed to meet changing user needs and expectations while heavily employing library jargon that was foreign to faculty members.6 moreover, as identified by kim, some irs struggle to even be known to their users, while also grappling with concerns of trust.7 other problems that have plagued repositories
include limited adoption rates, restricted resources to support digitization of analog materials for faculty that operate in both analog and digital media, failing support from fellow library colleagues, and inconsistent and incomplete metadata.8 salo warned more than a decade ago that high-level library administrative support would be necessary to empower repository managers to enact lasting and substantive change, and recent studies echo these concerns.9 libraries have slowly started to serve faculty on their terms, such as by creating automated processes for populating irs, streamlining content deposits, experimenting with metadata harvesting features to provide increased access, and building more tools to integrate directly with the research lifecycle.10 however, these new technologies and services may be out of reach for many institutions. in addition to limited resources, some institutions are grappling with a legacy system that is incompatible with newer code, leaving these institutions in a feature desert, reliant on aging technology and cumbersome deposit processes.11 moreover, even in an institution where resources might be more readily available for licensing or purchasing newer technology, early forks of open-source code or otherwise deprecated components might make migration to newer platforms extremely difficult, if not impossible, without extensive infrastructure improvements. lastly, as libraries grappled with some of the issues mentioned above and options for repositories continued to proliferate, many institutions struggled to clearly articulate boundaries around their digital library holdings. 
confusion between digital collections, scholarly content, e-resources, and other digital materials resulted in some institutions having too many options to store content, leaving internal and external stakeholders confused as to where to discover and distribute materials; conversely, other institutions have few options, and a wide variety of content is pigeonholed imperfectly into a single repository.12 in both situations, developing repositories with vague content scopes can be exceedingly difficult, as a restrictive scope can stifle development, while an overly inclusive approach results in too many use cases and competing stakeholder interests to effectively prioritize feature development.

local context

our institutional repository at the university of notre dame, managed by hesburgh libraries employees, suffered from many problems that affected our locally built code: limited adoption and awareness on campus; aging technology that made adding new features a monumental, if not impossible, task; and an overly broad scope (and a simultaneous proliferation of other digital collection tools). while the detailed history of this repository is beyond the scope of this paper, a brief overview of the development provides critical context. additionally, the technical details and implementation particulars will not be discussed, as this case study transcends specific software frustrations and will resonate with many institutions regardless.

in 2012, after a failed attempt to launch a repository in the early 2000s, consortial development of our ir began in an open-source community. in 2014, an early implementation of the product was envisioned to be a unified digital library service that would provide support to many different stakeholders.
this included a plan for a single location for researchers to share their scholarly work, research outputs, and research data, as well as for the university libraries to provide access to digitized rare and archival holdings. as development continued on the homegrown service, features were implemented to serve the numerous purposes mentioned above. this included components of an institutional repository, such as a self-deposit interface, customizable access levels, and a proof-of-concept researcher profile system. over time, support for browsing digital collections was added, namely the development of the work type “collection,” which allowed curators to create a landing page for their collection and customize it with a representative image. development continued in a somewhat sporadic fashion, often aligning at the intersection of “what is easy?” and “what is needed?” as technical staff continued growing the open-source code. as content was added to the system, stemming from special collections, various campus partners, and electronic thesis and dissertation (etd) deposits, additional use cases emerged and were added to the scope of the repository. the system quickly grew cumbersome and difficult to work with. in short, the repository struggled with the challenges of many open-source technologies. the struggle was compounded by decreasing resources, an overly inclusive scope, limited adoption (both among external faculty and among library faculty and staff), and consortial development that introduced features extraneous to local campus needs. while our repository did many different things, it failed to do any one well.
after falling short of meeting the expectations for digital collections, particularly with regard to browsing and displaying objects, the library applied for, and received, a three-year mellon grant.13 this grant, a collaboration with the snite museum of art, university of notre dame, was initially sought to improve upon the existing repository and to build the infrastructure necessary to support the online display of collective cultural heritage materials and facilitate serendipitous discovery for patrons. however, soon into the grant, it became clear that creating an entirely new system for digital collections would be not only easier to build and maintain, but also better suited to meet the specific needs of digital collections as articulated by campus partners.

first things first: what is our ir?

around the same time this shift was announced, two individuals were appointed to serve as product owners (pos) of the repository. while exact duties vary between institutions, pos are responsible for liaising with users, managing the product backlog, directing development, communicating with a wide variety of stakeholders, resolving issue tickets, and guiding the overall direction of the product.14 the pos were tasked with making this amorphous, oft-critiqued service usable while dealing with uncertain resources and competing institutional priorities. with the change in grant objectives mentioned above, namely the desire to develop a new repository instead of contending with the legacy code, the option was presented to retire the repository and direct users to other systems that could sufficiently meet their needs, such as discipline-specific repositories, general purpose repositories, or even online cloud storage.
the pos recognized that continuing the system due solely to sunk costs was a fallacy: if the service was too cumbersome to maintain with even nominal use, the return on investment would be abysmal and ultimately prevent the library from investing resources more appropriately. in order to evaluate the service, the pos considered active commitments and ongoing partnerships tied to the service. in particular, several centers and departments on campus had utilized the system to capture citations and demonstrate their impact. additionally, after conversations with library liaisons, it became apparent that there was great value in providing the campus with a discipline-agnostic repository that allows deposition of, provides access to, and preserves scholarly outputs that might otherwise be lost. while the pos recognized that faculty adoption or even awareness of the service was limited, they realized there were several campus-specific features that were useful to local champions, including flexible access controls at the record and file levels, as well as a customized etd workflow that served the graduate school, internal technical services, and the students and faculty required to interact with the system. acknowledging that the system and related services were still critical, the pos prioritized making sure the system remained useful: maintaining the legacy repository would cost valuable time and resources and would need to overcome the resentment that many internal stakeholders had developed over the years.
after deciding the system was worth maintaining, it was necessary to explicitly narrow the scope of the service, which had broadened over time in an ad hoc manner: as other services were turned off, leaving various digital content to find a new location, our institutional repository was often leveraged to host the content, even when support for the needs of niche content was poor at best. when considering the future of the repository, several key use cases emerged, including the etd support provided to the graduate school as mentioned above. while the service had done many things acceptably, the strength was in the support for scholarship: the customized access levels, self-deposit interface, and robust preservation capabilities were frequently lauded as the highlights of the service to internal and external stakeholders. these considerations, combined with the eventual migration of digitized rare and unique materials to the new mellon-funded platform, resulted in rebranding and redefining the service as exclusively focused on scholarly outputs. with the goal of best supporting the teaching and research mission of the university, the directional force became how to (re)build the service as a trusted, useful, and integral repository for campus scholars to provide access to their research outputs.

mission (and vision) critical

operating under the guiding principles of usefulness, usability, and transparency, the first task after redefining and rearticulating the scope of the service was to keep the service operational. however, with the recognition that maintenance alone, while critical, would not lead to an enhanced reputation on campus, it was important to continue charting a forward direction. the product owners were given the freedom to articulate their ideal mission statement.
to complement the vision of the repository as both trusted and integral, the pos further defined the mission statement in three key areas: to increase the impact of campus researchers, scholars, and departments; to advance new research by facilitating access to scholarship in all forms; and to serve as active campus partners in the research lifecycle. while these statements are far from innovative or revolutionary, they were essential for moving the service forward. in fact, these sentences were carefully crafted over the course of a month, during which time the product owners drafted the language, compared it with peer and aspirational peer institutions, and solicited feedback from trusted internal colleagues before sharing it more broadly. this time-consuming process was critical for success, however: with the knowledge that these words would serve as the foundation for prioritizing feature requests and advocating for resources, the pos wanted to establish both the repository and themselves in their new role. this clarity in mission was also important for grappling with legacy emotional and mental frustrations that lingered towards the system, as the pos had a strong, unified foundation to advocate for resources and the service as a whole. relatedly, these mission and vision statements provided critical and consistent talking points, which were leveraged in presentations to internal stakeholders, provided to librarians as messaging for the liaison faculty, and used in short communications to teaching professors, research faculty, and department administrators.

clear and present boundaries

in rebranding the repository, it also became clear that firm boundaries would be instrumental in attaining success.
in addition to narrowly focusing feature development on supporting research and scholarly outputs, the pos also scaled back goals for adoption, intentionally excluded digital collection features, and identified features that were patently unattainable in the short term. the repository was often seen as a failure locally due to limited adoption and an incomplete record of the academic outputs of campus, reflecting concerns of irs more generally.15 combatting this narrative required a clear articulation and acceptance of the fact that the institutional repository, regardless of how seamlessly integrated or easy to use, would never be absolutely comprehensive or the authoritative record of our researchers and scholars. with limited resources and a current technical infrastructure in which it is difficult to incorporate automatic harvesting mechanisms, any effort to make the repository comprehensive would be impractical, unrealistic, and a waste of limited resources. instead, by focusing efforts on making the repository useful and refraining from being yet another requirement for an already overwhelmed faculty member or graduate student, the service can be improved to meet the unique needs of campus faculty, serving as a more viable option for those who need it.16 similarly, because there is less concern with filling the repository and increasing usage statistics and more on what the patron needs, the pos have been able to develop robust partnerships with stakeholders, leading to champions in research centers, labs, departments, and other administrative units across campus. this has helped scholars demonstrate the impact of their work, which in turn led to more partnerships with other campus centers, as champions began to advocate for the service to colleagues facing similar challenges across the university. 
in this way, decreasing the effort to fill the repository has actually increased holdings and driven more traffic to the site: by focusing on useful offerings and decreasing the burden on ourselves to create a comprehensive research repository, the pos have been able to prove the value of a discipline-agnostic approach to internal and external stakeholders. an additional, and extremely beneficial, boundary was intentionally excluding library-owned digital collections from the repository’s collecting and feature-development scope. the pos received little pushback from internal users on this change: the repository had been the de facto scholarly and research repository for nearly five years, as it was patently clear that supporting digital collections had been more of an afterthought, with limited features built to support curators and users in creating and interacting with rare and archival materials. in fact, internal colleagues supported this change wholeheartedly, as the pos volunteered to continue providing access to the extant digital content in the ir as the mellon grant-funded site was built. while this direction had already been understood by individuals across the organization, it was helpful to clearly articulate the new boundaries in open forums for internal stakeholders, communication through a library-wide listserv, and repetition in smaller meetings. by articulating this new boundary clearly and repeating it frequently in different methods of communication, the pos had the authority to reject feature requests that were explicitly in support of rare and archival materials.
with a clear focus on collecting and providing access to scholarly and research outputs, niche metadata fields, advanced browsing features, and robust collection landing pages were identified as unnecessary, as they were scoped for the mellon-funded platform, and internal colleagues quickly embraced this boundary. the final, crucial boundary, also related to feature requests, was to clearly define requests that were impossible to accommodate in the current technical infrastructure. as mentioned earlier, the pos focused first on maintenance: by updating code, critically evaluating the service and existing commitments, and charting a future direction, the pos could more effectively steward the project. this also meant revisiting previous feature requests, and even technical promises, in order to set more reasonable expectations on what the service would, and would not, be able to support in the coming years. with limited resources, advanced features such as research profiles (a frequent request from internal allies) were beyond the current capabilities of the aging technical stack. moreover, a feature-rich repository would be essentially useless if users’ basic expectations were left unmet: a cumbersome deposit interface, limited upload support, and confusing language throughout the site were more pressing issues, as they prevented users from even engaging with the site for any amount of time. by resolving these limitations and generating awareness of the repository, the pos could better serve not only current campus partners, but also future users, as an increase in adoption and use would lead to more resources to develop advanced features. instead of planning a new outfit for the proverbial patient, it was more important to stop the bleeding.
by adopting firm boundaries, the pos were able to scope developer work, prioritize maintenance and modest feature development, and even deny implementation of previously requested features that were no longer relevant to the repository or would be unattainable in the coming years. the pos could explicitly drop support for unused services, allow other services to limp along, and improve existing strengths. this has further helped to clarify messaging about the service and garner more support from our campus partners; instead of a malleable system that fits too many roles in a limited capacity, the pos could clearly state how the repository offers support and garner users from across campus.

small changes, big rewards

the last critical component of rebranding and revitalizing the institutional repository was the conscious decision to implement incremental improvements instead of large, sweeping changes. in particular, there were known frustrations with the service that were easy to start working on while the product owners expanded the user base and sought additional user feedback. small changes to the user interface, including the addition of use metrics and color-coded access tags, received immediate attention and positive feedback from key stakeholders. additionally, over the numerous years of development, many projects to improve the repository had stalled for various reasons. by either prioritizing the work necessary to complete the project or accepting the sunk costs and clearing the backlog for other projects, the technical development team could build momentum, completing projects and clearing mental space for new, exciting endeavors.

with limited resources on hand, maximizing the return on investment also included an emphasis on securing and keeping internal and external champions.
due to the limited outreach conducted early in the system’s existence as well as the mediocre service offerings, many campus users were unaware of the tool, and a few were using the repository in a somewhat limited fashion. in order to build support for the service, it was critical that key users of the repository received targeted support and outreach efforts. a primary example of this was an imaging facility on campus: this unit provided a critical service to campus, yet had difficulty showing the impact of their work as many faculty members did not cite their team in publications. the facility slowly began collecting citations manually, but still struggled to publicly advertise their capabilities and show the fruits of their labor. they solved this problem by loading citation records into the repository, which became the single location where any interested faculty, staff, and students could look to see the full output of the center. while they were using the repository in a somewhat different manner than anticipated, they found the system useful and were actively directing other campus centers and institutes to the repository for similar support. in conversations with them, it became clear that a few modest changes would streamline their workflows and alleviate some cumbersome burdens. with this concentrated outreach and a minimal amount of development, the repository secured a champion that continues to advocate for the service to colleagues across campus. lastly, prioritizing maintenance and paying down technical debt was critical for moving the repository forward. many software dependencies had fallen behind by several major version updates, making it difficult to add new features or consider potential migration paths to future technical solutions. 
while the amount of technical debt to be paid was substantial, by prioritizing a small amount of maintenance every month, the development team quickly caught up, thereby improving the overall performance of the site and providing the product owners with the flexibility to consider future technical implementations and key features to continue recruiting users.

lessons learned and future work

moving forward, the product owners are embracing the role of maintainers. in specific reference to repositories, that includes “repairing, caring for and documenting a wide variety of knowledge systems beyond irs to facilitate access and optimize user experience.”17 the work of critically evaluating commitments, establishing clear boundaries, and reaffirming the mission of the repository is useful on a recurring basis, and will need to be continued as the repository ages. maintaining the technical infrastructure as appropriate and conducting user experience testing to improve the service will be critical to ensuring the long-term success of the repository and the information contained therein. beyond the stewardship and small improvements required for maintaining the service, there is the opportunity to reconsider the role of the institutional repository, both at the local level and within the academic community. by prioritizing usefulness over comprehensiveness, the product owners made great strides in making the service accessible to patrons and actually usable. when considering the future of repositories, specifically through a lens of usefulness, it is critical to consider how future work will best serve faculty needs without overburdening librarians. adding pos who are examining how a service will be used and what will promote the mission of the library reframes a repository from being a piece of technology to being a source of interconnections.
scholarship usually requires a level of technology different from what most campus it departments can provide: research does not usually deal just in urls but requires dois and persistent identifiers; files are not just backed up, but are preserved (an active process that requires consideration for how computing will change over the coming decades). not only is a library a place to go to look for data, but it is also a place that can help publish and deposit items, providing valuable services to connect researchers to tools and platforms to facilitate research. this is an area of service that libraries and repositories can provide. in the relationship between libraries and technologies, innovation and maintenance, one clear challenge was the amount of emotional labor necessary to revitalize a service. the pos spent a large portion of time apologizing for previous failures, managing expectations by scaling back previous promises, and grappling with the current technical shortcomings of the service. while this is, at least in part, the role of the pos, the phenomenon of controlling expectations and handling the emotional debt that comes with broken promises and failed technologies is not localized to hesburgh libraries. in libraries especially, this work tends to fall to women, where they are forced to be the middle ground between technology and patron-facing librarians.18 while embracing the term “product owner” has helped to make visible and valuable the labor invested, especially that which might otherwise be overlooked, libraries writ large still need to contend with the gender divide plaguing the seeming dichotomy between innovation and maintenance.19 in fact, as libraries continue to build new technologies and support innovative research, the role of the product owners in managing legacy technologies will be crucial for success, as will embracing a culture of care and empathy. while beyond the scope of this case study, discussions of the gender roles often employed in library technology need to continue, especially as academic libraries embrace scrum methodology, project management, and product ownership.

conclusion

in this case study, the product owners of a legacy institutional repository described methods for revitalizing a service. for the institutional repository managed by hesburgh libraries, there has been a noticeable increase in usage in the past six months: more deposits, higher access counts, and more support tickets tracked. it appears the efforts of the product owners are showing results. this increased usage is one more piece of evidence that a repository is more than software and more than technology: by allowing the product owners oversight of the mission and ultimate direction of the service, not to mention the freedom to engage with users on behalf of the development team, the system is in a much better position than in previous years. despite these improvements, there is still room for growth as the pos guide the overall mission and development of the institutional repository as both a service and a system. similarly, as more institutions contend with legacy digital technology, using pos and the methods described above may prove beneficial. there is additional work to be done, such as investigating more thoroughly the role of the repository (indeed, the concept of the repository) and discussions of gender norms in technology.

endnotes

1 this article is based on a presentation by don brower and mikala narlock: “what to do when your repository enters middle age” (online presentation, samvera connect 2020, october 28, 2020), https://doi.org/10.7274/r0-e32v-2h81.
2 clifford lynch, “institutional repositories: essential infrastructure for scholarship in the digital age,” portal: libraries and the academy 3 (april 1, 2003): 327–36, https://doi.org/10.1353/pla.2003.0039.
3 soohyung joo, darra hofman, and youngseek kim, “investigation of challenges in academic institutional repositories: a survey of academic librarians,” library hi tech 37, no. 3 (january 1, 2019): 525–48, https://doi.org/10.1108/lht-12-2017-0266.
4 j. j. branin, “institutional repositories,” in encyclopedia of library and information science, ed. m. a. drake (boca raton, fl: taylor & francis group, 2005): 237–48; lynch, “institutional repositories.”
5 raym crow, “the case for institutional repositories: a sparc position paper,” arl bimonthly report 223, august 2002: 7; lynch, “institutional repositories.”
6 dorothea salo, “innkeeper at the roach motel,” december 11, 2007, https://minds.wisconsin.edu/handle/1793/22088.
7 jihyun kim, “motivations of faculty self-archiving in institutional repositories,” journal of academic librarianship 37, no. 3 (may 1, 2011): 246–54, https://doi.org/10.1016/j.acalib.2011.02.017; deborah e. keil, “research data needs from academic libraries: the perspective of a faculty researcher,” journal of library administration 54, no. 3 (april 3, 2014): 233–40, https://doi.org/10.1080/01930826.2014.915168.
8 trevor owens, “the theory and craft of digital preservation,” lis scholarship archive, july 15, 2017, https://doi.org/10.31229/osf.io/5cpjt.
9 e.g., joo, hofman, and kim, “investigation of challenges in academic institutional repositories.”
10 sarah hare and jenny hoops, “furthering open: tips for crafting an ir deposit service,” october 26, 2018, https://scholarworks.iu.edu/dspace/handle/2022/22547; james powell, martin klein, and herbert van de sompel, “autoload: a pipeline for expanding the holdings of an institutional repository enabled by resourcesync,” code4lib journal, no. 36 (april 20, 2017), https://journal.code4lib.org/articles/12427; carly dearborn, amy barton, and neal harmeyer, “the purdue university research repository: hubzero customization for dataset publication and digital preservation,” oclc systems & services, february 1, 2014, https://docs.lib.purdue.edu/lib_fsdocs/62.
11 clifford lynch, “updating the agenda for academic libraries and scholarly communications,” college & research libraries 78, no. 2 (february 2017): 126–30, https://doi.org/10.5860/crl.78.2.126.
12 lynch, “updating the agenda,” 128.
13 diane walker, “hesburgh/snite mellon grant,” october 31, 2018, https://doi.org/10.17605/osf.io/cusmx.
14 hrafnhildur sif sverrisdottir, helgi thor ingason, and haukur ingi jonasson, “the role of the product owner in scrum-comparison between theory and practices,” in “selected papers from the 27th ipma (international project management association), world congress, dubrovnik, croatia, 2013,” special issue, procedia - social and behavioral sciences, 119 (march 19, 2014): 257–67, https://doi.org/10.1016/j.sbspro.2014.03.030.
15 salo, “innkeeper.” 16 carolyn ten holter, “the repository, the researcher, and the ref: ‘it’s just compliance, compliance, compliance’,” journal of academic librarianship 46, no. 1 (january 1, 2020): 102079, https://doi.org/10.1016/j.acalib.2019.102079. 17 don brower et al., “on institutional repositories, ‘beyond the repository services,’ their content, maintainers, and stakeholders,” against the grain, 32 (1), https://against-the-grain.com/2020/04/v321-atg-special-report-on-institutional-repositories-beyond-the-repository-services-their-content-maintainers-and-stakeholders/.
18 bethany nowviskie, “on capacity and care,” october 4, 2015, http://nowviskie.org/2015/on-capacity-and-care/; ruth kitchin tillman, “who’s the one left saying sorry? gender/tech/librarianship,” april 6, 2018, https://ruthtillman.com/post/whos-the-one-left-saying-sorry-gender-tech-librarianship/. 19 dale askey and jennifer askey, “one library, two cultures” (library juice press, 2017), https://macsphere.mcmaster.ca/handle/11375/22281; rafia mirza and maura seale, “dudes code, ladies coordinate: gendered labor in digital scholarship,” october 22, 2017, https://osf.io/hj3ks/.
conversion of bibliographic information to machine readable form using on-line computer terminals

frederick m. balfour: information systems engineer, technical information dissemination bureau, state university of new york, buffalo, new york

journal of library automation vol. 1/4 december, 1968

a description of the first six months of a project to convert to machine readable form the entire shelf list of the libraries of the state university of new york at buffalo. ibm datatext, the on-line computer service which was used for the conversion, provided an upper- and lowercase typewriter which transmitted data to disk storage of a digital computer. output was a magnetic tape containing bibliographic information tagged in a modified marc i format. typists performed all tagging at the console. all information except diacriticals and non-roman alphabets was converted. direct costs for the first six months were $.55 per title.

several recent articles have reported on methods and related costs to convert library bibliographic information to machine readable form. chapin (1) compared keypunching, paper tape, and optical character recognition. keypunching was also described by hammer (2), and black (3). buckland (4) described paper tape conversion, and johns hopkins university (5) reported on optical character recognition. online computer terminals have been proposed (6), but have hitherto not been tried in a large library. without attempting to discuss the various techniques, this paper presents a detailed report of converting with on-line computer terminals. it is hoped that the experiences reported here and in the cited articles will provide suitable information to a library administration considering large-scale conversion.
background

in 1965 a systematic program of automation was begun in the libraries of the state university of new york at buffalo. the general goals of the program were to improve services to patrons and streamline internal operations. there are three general areas usually considered for automation in a library: acquisitions and accounting, the card catalog, and circulation control. an analysis of the system indicated that conversion of the card catalog to machine readable form would provide the greatest improvement in library services and operations. the reasons for the decision were as follows. first, the university libraries are growing rapidly; in one year the shelf list will increase by 60,000 to 100,000 titles, or about 15 to 25 per cent. second, suny buffalo is currently planning a new campus which will be completed in five to ten years. in the interim, the university will be spread over three major campus locations, with many smaller offices and departments located throughout the city, and the libraries must provide some form of bibliographic index for each location. the conversion of the shelf list to machine readable form will allow this distribution of the bibliographic information at a very low cost per title. finally, the project will provide experience in using magnetic tape for the handling of bibliographic information, so that when the library of congress' marc project begins to produce magnetic tapes, suny buffalo will be able to utilize them immediately.

selecting the conversion hardware

in 1966, a proposal for converting the shelf list to machine readable form (7) was presented to the library administration. it pointed out the many improvements in patron services, the advantages to the library staff, both professional and clerical, and the monetary savings to be realized by such a conversion.
it discussed the four methods of file conversion then feasible: punched cards, optical scanners, punched paper tape, and magnetic tape-keyed data converters (as exemplified by the mohawk data sciences equipment) (8). the proposal recommended using the magnetic tape-keyed data converters because of their input speed, ease of entry, and elimination of handling cards or paper tape. during the first quarter of 1967, a fifth method of conversion was considered, an ibm product called datatext (9). it required the rental of an ibm 2741 communications terminal (essentially a typewriter), a western electric 103a data-set, and a voice-grade telephone line to the nearest ibm installation, which was cleveland, ohio. a customer may buy time in six-hour blocks called datatext agreements. an agreement covered a time segment from 7:00 a.m. to 1:00 p.m., or from 1:30 p.m. to 7:30 p.m., five days a week. datatext provided everything that the magnetic tape converters did with some important additions. first, it had upper- and lower-case alphabet using a shift character (the library administration had seen only the mohawk upper-case converter). second, the typewriter gave a typed copy which was easy to proofread. third, corrections were much easier because of the text-editing capabilities of the on-line computer. text-editing can best be illustrated by describing a typical datatext job. a typist working from source material produces a typewritten page; at the same time, the ibm 2741 she is using transmits the data being typed to the computer in an area called "working storage". when typing is completed, the clerk gives the appropriate command and the information is stored in an area called "permanent storage", a computer manipulation which can be compared to taking a page from the typewriter and placing it in a folder in a file cabinet.
when the typist wishes to make changes to the information, she can give a command to recall it from permanent storage to working storage. she can then manipulate it in several ways. during original entry, the computer automatically assigned numbers to each line. using these line numbers, a typist can move information within the text, can add or delete information, and can correct errors. commands are very simple and concise; for example, it takes four keystrokes to move a new line into the text. in making a correction, the typist merely types the incorrect word and the correct word; the computer then types the complete line to show that the correction has been properly executed. (this instant replay, or on-line interaction, is a benefit unique to the on-line terminal.) after any change, the computer automatically renumbers lines and reformats the entire text. a sample of typed input is illustrated and discussed later in the article. in april 1967, it was decided to test the datatext service because of its powerful correction capability, and because it could be installed and working within three weeks. in may the console was delivered, the telephone equipment installed, and a long-distance line to cleveland rented. a one-month test of datatext proving successful, three more consoles, data sets and telephone lines were added, and the conversion project was fully underway.

training the typists

the majority of the typing and proofreading staff were drawn from existing personnel in the cataloging department. individuals chosen had a background in either catalog card typing or file maintenance, and consequently a good working knowledge of information on a catalog card. it was anticipated that with a minimum of further training, the typists could identify and tag information as they were typing it at the console. this assumption was critical to the success of the project, since the library could not afford the professional time necessary for complete pretagging of bibliographic information. typists involved in the one-month test were given several hour-long training sessions on tagging before the console arrived. when the project got underway, a list of all possible tags was posted near the console, and a librarian was nearby to answer questions. after three weeks of operation, it was obvious that the typists could tag at the console, thus making this part of the test run a success. the tagging system used was developed from the marc i pilot project (10). most of the original tags were retained and several additional ones designed to meet specific local needs. tape files created were formatted according to marc i specifications, although fixed fields were left blank. the tagging system is outlined in a reference manual prepared for typists and proofreaders (11). operation of an on-line console requires special training. ibm sent a datatext instructor to buffalo on several occasions to provide typist training. for the major training session, which occurred in june, the ibm representative came for a full week. ten typists were trained; five specialized in entering information, and five specialized in retrieving, correcting, and transmitting information. by the end of the week both groups were skilled in their respective specialities, and many typists were able to perform well in both areas. later, typists were trained in several sessions by one of the library's typing staff. during the first three months, the author was near the terminals at all times to answer questions on terminal operation, to collect data for measuring and controlling performance, and to act as supervisor. a librarian was on call for questions on complex library problems, and the programmer-analyst was available to help solve problems regarding input format and tagging.
at the end of this period, appropriate clerical staff had been trained to supervise minute-to-minute operation.

conversion procedures

the general method of conversion (figure 1) was as follows. a typist typed into "working storage" for an hour, inputting 15 to 30 shelf list cards. she instructed the computer to store this "document" in a permanent storage location on disc. she then placed the typed copy and cards in a proofreading bin, cleared working storage, and started another document. a proofreader compared typed copy with original cards and indicated any errors. the corrected document then went to a correction typist who "retrieved" the document from permanent storage to working storage, performed the corrections, and transmitted the corrected document to magnetic tape. the original uncorrected document was left in permanent storage overnight and deleted the following day. documents were transmitted to tape for about two weeks and the accumulation returned to the library via the mails. (ibm saved all permanent storage records for one week as a security measure. if a library typist inadvertently deleted a document, it could be retrieved by the computer operator.)

[fig. 1. shelf list conversion information flow. buffalo: 3 x 5 shelf list cards, hard copy, proofreading operation, file. cleveland: computer, disc storage. mail to library.]

figure 2 shows a sample of typed input and subsequent correction. line numbers, as they are stored on the disc, are included on the right margin for ease of explanation. lines typed in capitals are computer responses to commands, the first entry being the command to clear working storage. the computer responds and then indicates that the console is in one of two general input modes. all cards are typed in "automatic" mode, for which the typist gives the appropriate command. when the computer responds the typist asks for the next line number, which is 3, and begins to input the card. in line 4, the typist makes an error and realizes it before throwing the carriage. she hits the "attention" key clueing the underscore, rolls the platen down, back spaces, and retypes the correct word. the computer then corrects the error. in line 6 the typist misspells "cambridge", but does not realize it before throwing the carriage. the correction is shown at the bottom although the input typist could not have performed it herself; it would have gone through proofreading and back to the correction typist. the correction is made by typing the line number, in this case "6", the incorrect word, "dambridge", tab, and the correct word. the computer responds by typing out the complete line showing the correction.

    proc cleared
    uncontrolled mode
    a
    automatic mode
    n
    next number -3
    90t bs2575.3.a7 10t bible. n.t. matthew. english. 1963. new english.      3
    20t the gospel according to matthew. commemen
        commentary by a.w. argyle.                                            4
    30a cambridge 30b university press 30c 1963 40t 227 p. maps. 20 cm.       5
    50t the dambridge bible commentary: new english bible                     6
    70t bible. n.t. matthew -commentaries.                                    7
    71t argyle, aubrey william, 1910- 73t title.                              8
    60z 92t 226.207 94t 63-23728                                              9
    n
    next number -10
    6 dambridge cambridge
    50t the cambridge bible commentary: new english bible

fig. 2. sample input and correction of one shelf list card.

except for a brief period, the shelf list was converted in alphabetic order, and by december 1 shelf list drawers through the e's were completed. early in the project, some of the literature classification, p and pq, was converted. foreign languages in the pq's gave no particular problems, and typing rates did not drop. all cards were converted in shelf list order except for those having non-western alphabets. when possible, these were transliterated and entered. otherwise their input was delayed.
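the tagged input in figure 2 can be read mechanically. the following is a minimal sketch only, assuming (this is not stated in the article) that every tag is two digits followed by one lowercase letter and is set off by spaces; the function name `split_tagged` is hypothetical, and ordinary text that happens to look like a tag (for example "12b") would be mis-split.

```python
import re

# Assumed tag shape: two digits plus one lowercase letter (e.g. "90t", "30a"),
# preceded by start-of-line or whitespace and followed by whitespace.
TAG = re.compile(r"(?<!\S)(\d{2}[a-z])\s+")

def split_tagged(line):
    """Split one typed line into (tag, text) pairs, in order of appearance."""
    parts = TAG.split(line.strip())
    # re.split with a capturing group yields [prefix, tag1, text1, tag2, text2, ...]
    return [(parts[i], parts[i + 1].strip()) for i in range(1, len(parts) - 1, 2)]

fields = split_tagged("30a cambridge 30b university press 30c 1963")
for tag, text in fields:
    print(tag, "|", text)
```

a tag with no text of its own simply yields an empty string for that field.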
since the 2741 console has no diacritical marks, these were left out; however each card having them was entered and given a special tag to permit retrieval at a later date when diacritical marks could be added by special coding such as used by marc. conversion consoles and shelf list were in the same building. each day, several inches of cards were removed from the drawer being processed and a marker inserted indicating where the cards had gone. in general operation, cards were returned and refiled in less than a day so that inconvenience to staff was minimal. as a card was proofread, it was marked on the back with a "c" and the upper right hand corner received a very small notch with a mcbee punch. thus, newly cataloged cards filed with cards already converted are recognizable by the unnotched corner.

costs

table 1 gives a statistical summary of the conversion project from july 31 through december 1, 1967. the term "l.c. card" refers to a complete bibliographic entry for a title and may include more than one physical card, or may include writing on the back of a card. input and correction functions are reported separately and then totaled to give a realistic input rate per hour for corrected cards. supervisor cost reflects wages of clerical supervisors only. those of the programmer-analyst, the librarian and the systems analyst assigned to the project are not included. a breakdown of monthly equipment costs per console is given in table 2. installation costs were $150 for each terminal, and $50 for each leased telephone line. when the project operated four consoles, the monthly equipment cost was $4,472.

table 1. conversion project statistics (july 31-dec. 1, 1967)

    input, proofreading and correction
      total l.c. cards input                                        49,348
      typist hours input                                             3,035
      typist hours correcting                                          492
      total typist hours                                             3,527
      proofreading hours                                             1,235
      number of errors per l.c. card                                   .42
      l.c. card input rate per hour                                   16.3
      l.c. card correction rate per hour                               100
      overall conversion rate (input & correction) cards per hour       14
      proofreading rate, cards per hour                                 40

    costs
      labor cost @ $1.75 per hour                               $ 8,078.00
      equipment and supervisors                                  18,995.00
      total cost                                                $27,073.00
      cost per card converted                                        $0.55

    utilization of console time
      hours typed                                    3,381           81.4%
      hours consoles down                              245            5.9%
      hours computer down                               91            2.2%
      hours lost time                                  438           10.5%
      total                                          4,155          100.0%

table 2. monthly operational costs per terminal

      ibm 2741 communications terminal                            $  85.00
      western electric 103a data set                                 27.50
      24-hour voice-grade lease line to cleveland
        plus local telephone costs                                  385.50
      2 datatext agreements @ $310                                  620.00
      total                                                       $1118.00

"hours typed" is time that consoles were actually being used to input or correct cards. this is slightly less than "typist hours worked" because some correction has been delayed, but it was included in hours worked to give true representation of input rates. "hours consoles down" reflects time lost due to console breakdown. during the early part of the period, two consoles were failing often. however, as operating problems were solved, console down-time dropped far below the average 5.9 per cent shown. "hours computer down" was also greater during early weeks of the project. however, for each hour down, ibm credited the library with $12.00 ($3.00 per terminal for four terminals). "hours lost time" reflects periods when a working console could not be manned because of personnel breaks or operator absence. all times are given in console-hours, four consoles operating for one hour being recorded as four hours. the error rate of .42 errors per card is very low. allowing 350 characters per shelf list card, typists were making one error for every 830 keystrokes. this translates to about 3 errors per typewritten page of 50 characters per line, 50 lines per page.
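the derived figures quoted from table 1 can be rechecked from the reported totals. the sketch below is only an arithmetic check; the variable names are illustrative, and the inputs are the totals given in the article.

```python
# Totals reported in table 1 and in the text (july 31 - dec. 1, 1967).
cards = 49_348               # total l.c. cards input
input_hours = 3_035          # typist hours, input
correction_hours = 492       # typist hours, correcting
proofreading_hours = 1_235
total_cost = 27_073.00       # labor + equipment and supervisors, in dollars
errors_per_card = 0.42
chars_per_card = 350         # allowance stated in the text
chars_per_page = 50 * 50     # 50 characters per line, 50 lines per page

input_rate = cards / input_hours                          # cards per hour
correction_rate = cards / correction_hours
overall_rate = cards / (input_hours + correction_hours)
proofreading_rate = cards / proofreading_hours
cost_per_card = total_cost / cards
keystrokes_per_error = chars_per_card / errors_per_card
errors_per_page = chars_per_page / keystrokes_per_error

print(f"input rate          {input_rate:.1f} cards/hour")
print(f"correction rate     {correction_rate:.0f} cards/hour")
print(f"overall rate        {overall_rate:.0f} cards/hour")
print(f"proofreading rate   {proofreading_rate:.0f} cards/hour")
print(f"cost per card       ${cost_per_card:.2f}")
print(f"keystrokes/error    {keystrokes_per_error:.0f}")
print(f"errors per page     {errors_per_page:.1f}")
```

the keystrokes-per-error value comes out slightly above the rounded 830 given in the text.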
the office of secretarial studies of suny at buffalo indicates that this rate is well within the tolerance for "normal" typing, as in a typing pool. when it is considered that typists were tagging and inputting complicated bibliographic information, rate of accuracy was commendably high. typists used in the project included the lowest salary grade of civil-service typists, part-time hourly workers, and students. an acceptable input rate for civil service typists was 18 cards per hour, which is equivalent to 21 5-character words per minute. the faster typists, at 26 cards per hour, were typing at 30 words per minute. again, let it be mentioned that the material was complex and that typists were required to tag each piece of information.

conclusions

several points can be made about converting with datatext. it was easy to implement and received excellent support from ibm. the ibm information marketing staff in cleveland provided constant assistance during the early part of the installation and visited often once the project was successfully underway. ibm sent the datatext instructor as often as needed and provided free computer time during teaching sessions. the four long-distance telephone lines and data sets proved reliable. there was only one instance during the period when a line was inoperable and it was repaired in three hours. the liaison and support from new york bell telephone was very good. datatext costs would have been lower had the ibm installation been nearer. cleveland is 173 miles from buffalo giving a 24-hour lease-line cost of $342 per month. (datatext service will soon include a uniform long-distance-lines cost.) verification or correction on datatext does not require human retyping of each line of entry. only the word in error and its replacement need be typed; the console then types the corrected line to show that the error was deleted and the replacement inserted. consequently correction costs are low and corrections accurate.
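the word-level correction described above (line number, incorrect word, tab, correct word, after which the whole line is typed back) can be sketched as follows. this illustrates the idea only and is not datatext's actual command syntax: the function name `apply_correction`, the dictionary of numbered lines, and the whitespace between line number and incorrect word are all assumptions.

```python
def apply_correction(lines, command):
    """Apply '<line number> <wrong word>\\t<right word>' and return the corrected line.

    `lines` maps stored line numbers to text (a hypothetical stand-in for
    working storage). Only the first occurrence of the wrong word is replaced.
    """
    head, right = command.split("\t", 1)
    number, wrong = head.split(None, 1)
    n = int(number)
    if wrong not in lines[n]:
        raise ValueError(f"{wrong!r} not found in line {n}")
    lines[n] = lines[n].replace(wrong, right, 1)
    return lines[n]  # echoed back so the typist can verify the change

working_storage = {6: "the dambridge bible commentary: new english bible"}
print(apply_correction(working_storage, "6 dambridge\tcambridge"))
```

echoing the full corrected line mirrors the confirmation step the article credits for accurate corrections: the typist retypes two words rather than the whole entry.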
average rates and costs given in table 1 reflect learning during the first six months of the project. towards the end of the reported period, rates were improving and costs decreasing. since december 1967, the project has added three more consoles and uses a datatext service provided by a campus computer. costs have dropped below $.45 per card, a figure which will increase somewhat when diacriticals are added. potentially cost per title for complete conversion is under $.50.

references

1. chapin, richard e.; pretzer, dale h.: "comparative costs of converting shelf list records to machine readable form," journal of library automation, 1 (march 1968), 66-74.
2. hammer, donald p.: "problems in the conversion of bibliographic data - a keypunching experiment," american documentation, 19 (january 1968), 12-17.
3. black, donald v.: "creation of computer input in an expanded character set," journal of library automation, 1 (june 1968), 110-120.
4. buckland, l. f.: recording of library of congress bibliographical data in machine readable form (rev. ed.; washington, d.c.: council on library resources, 1965).
5. the johns hopkins university. milton s. eisenhower library: progress report on an operations research and systems engineering study of a university library (baltimore: johns hopkins, 1965).
6. international business machines corporation. federal systems division: report of a pilot project for converting the pre-1952 national union catalog to a machine readable record (rockville, maryland: ibm, 1965).
7. lazorick, gerald j.; herling, john; atkinson, hugh: conversion of shelf list bibliographic information to machine readable form and production of book indexes to shelf list (buffalo, n.y.: state university of new york at buffalo, technical information dissemination bureau, 1966).
8. mohawk data sciences corp.: datagram no.
35, 1181 twk correspondence data-recorder, (herkimer, n.y., mohawk data sciences corp., 1967).
9. international business machines corporation: datatext operators instruction guide, form # j20-0010-1 (ibm, white plains, n.y., 1967).
10. u.s. library of congress, information systems office: a preliminary report on the marc (machine readable catalog) pilot project (washington, d.c.: library of congress, 1966).
11. michael m. coffey: reference manual for typists and proofreaders. sunyab shelf list conversion project (buffalo, n.y.: suny at buffalo, technical information dissemination bureau, 1968).

searchable signatures: context and the struggle for recognition

gina schlesselman-tarango

information technology and libraries | september 2013

abstract

social networking sites made possible through web 2.0 allow for unique user-generated tags called “searchable signatures.” these tags move beyond the descriptive and act as means for users to assert online individual and group identities. this paper presents a study of searchable signatures on the instagram application, demonstrating that these types of tags are valuable not only because they allow for both individuals and groups to engage in what social theorist axel honneth calls the “struggle for recognition,” but also because they provide contextual use data and sociohistorical information so important to the understanding of digital objects. methods for the gathering and display of searchable signatures in digital library environments are also explored.
gina schlesselman-tarango (gina.schlesselman@du.edu) holds a master of social sciences from the university of colorado denver and is currently an mlis candidate at university of colorado.

introduction

a comparison of user-generated tags with metadata traditionally assigned to digital objects suggests that social network platforms provide an intersubjective space for what social theorist axel honneth has termed the “struggle for recognition.” 1 social network users, through the creation of identity-based tags—or what can be understood as “searchable signatures”—are able to assert and perform online selves and are thus able to demand, or struggle for, recognition within a larger social framework. baroncelli and freitas cogently argue that web 2.0, or the interactive online social arena, in fact functions as a “recognition market in which contemporary individuals . . . trade personal worth through displays and exchanges of . . . self-presentations.” 2 a comparison of a metadata schema used in yale university’s digital images database with user-generated tags accompanying shared photographs on the social networking platform instagram demonstrates that searchable signatures are unique to social networking sites. as phenomena that allow for public presentations of disembodied selves, searchable signatures thus provide specific information about the context of the digital images with which they are associated. capturing context remains a challenge for those working with digital collections, but searchable signatures allow viewers to derive valuable use data and sociohistorical information to better understand the world in which digital images originated and exist.

literature review

web 2.0 identities and recognition theory

while web 2.0 can be imagined as a highly collaborative space where social actors are able to
communicate to the world new identities, some warn that this communication is somehow engineered and performed. van dijck, in an analysis of social media, argues that it is indeed “publicity strategies [that] mediate the norms for sociality and connectivity,” and baroncelli and freitas note that web 2.0 allows people to make themselves visible through modes of spectacularization.3 though his focus is on the spectacle in fin de siècle france, clark provides some insight into the effects of spectacularization on the individual. 4 working within a historical materialist framework, clark points out that with the growth of capitalism, the individual has become colonized. 5 clark further describes this colonization as “massive internal extension of the capitalist market—the invasion and restructuring of whole areas of free time, private life, leisure, and personal expression . . . the making-into-commodities of whole areas of social practice which had once been referred to casually as everyday life.” 6 here, web 2.0 is not a liberatory tool but instead a space where users are colonized to the extent that they create selves exchanged through social networking sites owned by capitalist enterprises. web 2.0, then, has created a situation in which personal time and identification can be successfully commodified. baroncelli and freitas conclude, “from that formula, personal life becomes a capital to be shared with other people—preferably, with a large audience.” 7 the problem, then, is that one’s existence is defined simply “by being seen by others” and can no longer be understood as authentic.8 despite the sophistication of the argument detailed above, there are some who view the online self, created through web 2.0, as a legitimate and authentic identity.
in an account of the online self, hongladarom summarizes this position, noting that both offline and virtual identities are constructed in social environments. 9 for hongladarom, these identities are not different in essence because “what it is to be a person . . . is constituted by external factors.” 10 the online world as an external factor has the ability to affirm one’s existence, regardless of whether that existence is physical or virtual. in sum, it is the social other and not a material existence that is the authenticating factor in identity formation. there are others who validate the role that spectacle—or what also can be understood as performance—plays in identity formation. pearson calls on the work of goffman to argue, “identity-as-performance is seen as part of the flow of social interaction as individuals construct identity performances fitting their milieu.” 11 for pearson, the identity is always performed, be it through web 2.0 or otherwise. there is nothing particularly worrisome, then, about the effects of web 2.0 on the self, nor does web 2.0 threaten the authenticity of the self. identity is always performed and is in a sense a spectacle—this does not mean, however, that identity in itself is spurious. it is with this perspective of the online self as a performed albeit authentic identity that this paper further develops. before a thorough analysis of the searchable signature as an online self can be conducted, a deeper understanding of honneth’s theory of recognition is in order. 
in his 1995 work the struggle for recognition: the moral grammar of social conflicts, honneth sets out to develop a social theory based on what he calls “morally motivated struggle.” 12 based on the habermasian concept of communicative action, honneth contends that it is through mutual recognition that “one can develop a practical relation-to-self [and can] view oneself from the normative perspective of one’s partners in interaction, as their social addressee.” 13 relation-to-self is key for honneth, and he argues that a healthy relation-to-self, or what can be thought of as self-esteem, is developed when one is seen as valuable by others. beyond self-esteem, honneth points out that the success of social life itself depends on “symmetrical esteem between individualized (and autonomous) subjects.” 14 for honneth, this “symmetrical esteem” can lead to solidarity between individuals. “relationships of this sort,” he explains, “can be said to be cases of ‘solidarity’ because they inspire not just passive tolerance but felt concern for what is individual and particular about the other person.” 15 that is to say that felt concern for another allows one to see the specific traits of the other as valuable in working towards common goals, and honneth imagines that in situations of “symmetrical esteem . . . every subject is free from being collectively denigrated, so that one is given the chance to experience oneself to be recognized, in light of one’s own accomplishments and abilities, as valuable for society.” 16 until this ideal is realized, however, individuals must find sites in which to struggle to be recognized as valuable social assets. according to baroncelli and freitas, it is in fact web 2.0 that provides the arena where “the contemporary demand for the visibility of the self” is able to flourish.
17 they position this argument within honneth’s framework, asserting that the visibility of self is “directed towards a quest for recognition,” and they thus conclude that web 2.0 can be understood as a “recognition market.” 18

context and its importance

capturing and integrating markers of context into records, according to chowdhury, still present a challenge for many. 19 “there is now a general consensus that the major challenge facing a digital library as well as a digital preservation program is that it must describe its content as well as the context sufficiently well to allow its correct interpretation by the current and future generations of users,” he contends. 20 context in itself is difficult to define, let alone its myriad facets that might or might not facilitate better understanding of digital objects. dervin, in her exploration of the meaning of context, points out that it is often conceptualized as the “container in which the phenomenon resides.” 21 she points out that the list of factors that constitute the container and might be considered contextual is in fact “inexhaustible”—items on this list, for example, might include the gender, race, and ethnicity of those involved in a phenomenon. 22 in an indexing or digital collection environment, the goal is to determine which of these many factors ought to be included in a record to best allow for discovery and use.

searchable signatures: context and the struggle for recognition | schlesselman-tarango 8

others imagine context as a fluid, ever-changing process rather than as a static container of data. “in this framework,” dervin writes, “reality is in a continuous and always incomplete process of becoming.” 23 this understanding of context as changing is helpful for those working with objects that live in digital environments, especially web 2.0. 
certainly the interactive nature of the web has created room for a variety of users to create, share, appropriate, comment on, tag, reject, celebrate, and ultimately understand images in a multitude of contexts that might be different from one moment to the next. there are many reasons to include contextual information in records of digital objects. lee argues that by providing context, or what he describes as the “social and documentary” world “in which [a digital object] is embedded,” future users will be able to better understand the “details of our current lives.” 24 further, lee contends that context is helpful in that it illustrates the ways in which a digital object is related to other materials:

relationships to other digital objects can dramatically affect the ways in which digital objects have been perceived and experienced. in order for a future user to make sense of a digital object, it could be useful for that user to know precisely what set of . . . representations—e.g. titles, tags, captions, annotations, image thumbnails, video keyframes—were associated with a digital object at a given point in time. 25

the user-generated tag, then, is a valuable representation that provides contextual information surrounding the perception and experience of the image with which it is directly related.

discussion

user-generated tags and traditional metadata

user-generated tags have been hailed as an important stage in the evolution of image description and are said to have the potential to shape controlled vocabularies used in traditional metadata schemas. for example, in a comparison of flickr tags and index terms from the university of st. 
andrews library photographic archive, rorissa stresses the importance of exploring similarities and differences between indexers’ and users’ language, noting that “social tagging could serve as a platform on which to build future indexing systems.” 26 like others, rorissa hopes that continued research into user-generated social tags will be able to “bridge the semantic gap between indexer-assigned terms and users’ search language.” 27 in fact, some are currently utilizing social tags in an effort to describe and facilitate access to collections. one such organization is steve: the museum social tagging project, “a place where you can help museums describe their collections by applying keywords, or tags, to objects.” 28 the organization allows users to not only view traditional metadata associated with cultural objects, but also tags generated by others. in an effort to better understand the similarities and differences between user-generated tags and the language used in traditional metadata schemas, one must compare the two systems. yale university’s digital images database provides a glimpse at the ways in which traditional metadata schemas are typically used to describe images in digital library settings. most of the images included in the database are accompanied by descriptive, structural, and administrative metadata. for example, an item entitled “boy sitting on a stoop holding a pole” (see figure 1) from the university’s collection of 1957–90 andrew st. george papers provides a digital copy of the image, the image number, name of the creator, date of creation, type of original material, dimensions, copyright information, manuscript group name and number, box and folder numbers, and a credit line. 29 the image is further described by the following: “man in the shed is making homemade bombs. the boy and man are also in image 45350.” 30

figure 1. 
“boy sitting on a stoop holding a pole” from yale university’s digital images database collection of 1957–90 andrew st. george papers, november 2012.

certainly, such information is useful in library environments and provides users with helpful and formatted data to best guide the information discovery process. the finding aid for the andrew st. george collection is additionally helpful in that it includes information about provenance, access, processing, associated materials, and the creator; it also contains descriptive information about the collection by box and folder number. 31 however, if additional use data and sociohistorical information specific to this individual item were available, it would be most helpful in assisting users in determining the image’s greater context. a study of modes of participation on social networking sites suggests that it is now possible to supply such contextual information for digital objects that live in interactive online environments. a useful site for exploring user-generated tags associated with images is instagram, a social application designed for iphone and android. 32 instagram users are able to upload and edit photos, and other users can then view, like, and comment on the shared photos. instagram users are able to follow other users and search for photos by the creator’s username or by accompanying tags. instagram, owned by facebook, is interoperable with other social networking sites, and users have the ability to share their photos on facebook, flickr, tumblr, and twitter. 
as of july 2012, it was reported that instagram had 80 million users, and in september 2012, the new york times reported that 5 billion photos were shared through the application. 33 users are limited to 30 tags per photo, and instagram suggests that users be as specific as possible when describing an image with a tag so that communities of users with similar interests can form. 34 many tags, like the information included in traditional metadata schemas, aim to best describe an image by explaining its content; for example, one user assigned the tags #kids, #nieces, #nephews, and #family to a photograph of a group of smiling children (see figure 2). like the information accompanying the photograph in the yale university digital images database, such tags provide users and viewers with tools to better determine the “aboutness” of the image at hand.

figure 2. photo shared on instagram assigning both descriptive tags and the searchable signature #proudaunt, november 2012.

however, instagram users are repurposing the tagging function in a way that is unique to social networking sites. in addition to the descriptive tags assigned to the image of the children described above, the user also tagged the photo with the term #proudaunt (see figure 2). there is, however, no aunt (what can be assumed to be an adult female) in the photograph. this tag, then, functions to further identify the user who created or shared the photograph and does not describe the content of the image at hand. a search of the same tag, #proudaunt, demonstrates that this user is not alone in identifying as such: in november 2012, this search returned 40,202 images with the same tag and more than 58,000 images with tags derived from the same phrase (#proudaunty, #proudauntie, #proudaunties, #proudauntiemoment, and #proudaunti) (see figure 3).

figure 3. list of results from #proudaunt hashtag search on instagram, november 2012. 
this type of user-generated tag—one that identifies the creator or sharer of the photograph yet is not necessarily meant to describe the content of the image—can be understood as a searchable signature. such identity-based tags are not found within yale university’s digital images database; the closest relative of the searchable signature is the creator’s name. while searchable, this name is not alternative, or secondary, and it was not created and does not exist in a social environment. currently, born-digital objects are often created and shared in a technological milieu that allows for the assignment of user-generated tags. consequently, the integration of the searchable signature into the presentation of digital objects has become part of accepted social practice and offers unique opportunities for digital library curators and users alike. until quite recently, most materials—be they photographs, manuscripts, or government documents—were not born in digital environments. however, digitization projects have been undertaken to ensure that such historical materials are more widely and eternally available. these reborn digital objects, then, have been and can be integrated into dynamic social environments. steve: the museum social tagging project, mentioned earlier in this paper, is one example of an organization that has capitalized on the social practice of user-generated tagging and is using descriptive tags along with traditional metadata to better describe reborn digital objects. it is important, then, to explore what (if any) implications the application of the searchable signature, a unique type of user-generated tag, has for historical objects that are later integrated into digital environments. searchable signatures associated with born digital images on social networking sites contain valuable information about their creators, users, and the images’ context. 
one cannot ignore that users will, if given the chance, also likely apply signatures to reborn digital objects in similar ways that they do to objects that have always existed in social environments. since the searchable signature is used to identify not only digital image creators, but also sharers, and if these signatures do in fact provide important insight into the sharers and their motivations, then these signatures are not to be ignored. rather than focusing on the act of creating, the lens through which to understand the searchable signature for reborn digital objects can be shifted to the social act of sharing: by whom, when, in which social environments, and for what purposes. a deeper analysis of the presentation of self through the searchable signature and the role that the signature plays in providing valuable contextual information for both born- and reborn-digital objects is developed below.

searchable signatures and the struggle for recognition

if web 2.0 indeed functions as a recognition market, then social media and social networking sites might appear to be tables at such a market. placing oneself behind a table—be it facebook, twitter, or instagram—the user is able to perform his or her online identity to passersby and effectively struggle to be recognized as a unique individual or as a member of a social group. these performances, which could be deemed narcissistic in nature, can alternatively be read as healthy attempts to self-actualize and connect to larger society. 35 one such “table” in the recognition market is instagram. beyond instagram’s social nature that allows participants to interact with and follow one another, the specific role of the searchable signature is of interest to those who are concerned with struggles for recognition. rather than describing shared images, searchable signatures reflect performative yet authentic user identities. 
mccune, in a case study of consumer production on instagram, acknowledges the potential of the tag to not only facilitate image exchange but to communicate users’ positions as members of social groups. 36 through a simple search of tags, users who identify as, for example, “cat ladies,” are able to validate their identities when they see that there are many others who use the same or similar language in demonstrations of the self (see figure 4). other signatures such as #proudaunt, while not necessarily playful, still function to provide viewers with additional information about the instagram user that cannot be determined through the photo itself. the ability to find images based on these searchable signatures allows users to find others who identify in a like manner and to imagine themselves as part of a larger social group. in effect, searchable signatures allow users to be recognized as social addressees of like-minded others. positioning oneself within a group must be understood as a struggle for recognition, for to imagine oneself as part of the social fabric is also to see oneself as valuable.

figure 4. list of results from #catlady hashtag search on instagram, november 2012.

enabled by web 2.0, searchable signatures contain potential for marginalized peoples or groups to assert online selves to be seen and ultimately heard in a truly intersubjective landscape. it is not too much of a leap to imagine that searchable signatures might make possible the organization of individuals and groups for political purposes. 
in fact, in a discussion of social groups, honneth notes that “the more successful social movements are at drawing the public sphere’s attention to the neglected significance of the traits and abilities they collectively represent, the better their chances of raising the social worth, or indeed, the standing of their members.” 37 here, searchable signatures might provide such movements with a venue to capture the public’s attention and to effectively struggle for and gain recognition.

searchable signatures and context

as markers of individual and group identities, searchable signatures are unique in that they provide a snapshot of the multitude of social, historical, political, individual, and interpersonal relationships that ontologize the images with which they are paired. it is this very contextual information that is at times lacking in traditional indexing environments. by examining searchable signatures, experts and users are able to understand which individuals and groups create, use, and identify with certain images. thus, as markers of self, searchable signatures provide use data for scholars to better investigate which images are important to online individual or group identities. if the searchable signature is used in a political fashion, historians and sociologists might be able to study which types of images, for example, marginalized groups rally around, identify with, and use in their struggles for recognition. such use data also illuminates how and by whom certain digital images have been appropriated over time. for example, if a picture of a cat is first created or shared via instagram by an animal rights activist, the image might be accompanied by the searchable signature #humanforcats. this same image, shared by another user months later, might be accompanied by the #catlady signature. 
those interested will be able to examine how the same image has been historically used for different purposes and will be better able to grasp the evolving nature of its digital context. in addition to use data, the searchable signature provides insight into the sociohistorical context surrounding digital images. for those who perceive “reality . . . as accessible only (and always incompletely) in context, in specific historicized moments in time space” the searchable signature clarifies and makes more accessible that reality surrounding the digital image. 38 in a traditional library setting, a photo of a cat might be indexed with descriptive subject headings such as “cat,” “persian cat,” or “kitten—behavior.” however, the searchable signature #catladyforlife provides additional information on how the cat has become, for a certain social group in a specific moment in time, a trope of sorts for those who are proud of not only their relationships with their domestic pets, but of their shared values and lifestyles as well. if a historian were to dig deeper, he or she also might see that “cat lady” has historically been used in a derogatory manner to mark single, unattractive women thought to be crazy and unable to care for the great number of cats they own and that, by (re)claiming this title, women might be engaging in a struggle for recognition that extends beyond mere admiration for felines.39 chowdhury, in a continued discussion of challenges facing the digital world, asks whether it is “possible to capture the changing context along with the content of each information resource, because as we know the use and importance . . . 
changes significantly with time.” 40 additionally, he asks, “will it be possible to re-interpret the stored digital content in the light of the changing context and user community, and thereby re-inventing the importance and use of the stored objects?” 41 it is here that the searchable signature offers use data and sociohistorical information to illuminate the (changing) value digital images have for individuals, communities, and society.

conclusion

clark argues that representation must be understood within the confines of what he calls “social practice.” 42 social practice, among other things, can be understood as “the overlap and interference of representations; it is their rearrangement in use.” 43 representation of self also must be understood within current social practice, and an important facet of today’s practice is web 2.0. as a social space, web 2.0 allows for the creation of disembodied self-representations. one type of such representation, the searchable signature, is a phenomenon unique to social networking sites. while many acknowledge the potential of descriptive, user-generated tags to inform or even to be used in conjunction with metadata schemas or controlled vocabularies, instagram users have created an additional, alternative use for the tag. rather than simply using tags to describe shared images, they have successfully created a route to online identity formation and recognition. searchable signatures demonstrate the power of the online self, as they allow users to struggle to be recognized as unique individuals or as parts of larger social groups. these signatures, too, might act as platforms on which social groups can assert their value and thus demand recognition. additionally, searchable signatures provide contextual information that reflects the social practice in which digital images live. 
while the capture and integration of such information remains a challenge for those engaged in traditional indexing, web 2.0 allows for this unique type of user-generated tag and thus provides better understanding of the context surrounding digital images. as to the question of whether searchable signatures can be integrated into existing metadata schemas or be used to inform controlled vocabularies in library environments, it is not unreasonable to suggest that digital objects be accompanied by their supplemental yet valuable representations (e.g., searchable signatures and the like). many methods exist through which these signatures might be both gathered and displayed. certainly, a full exploration of such practices is the stuff of future research; however, some initial ideas are detailed below. one method of gathering identity-based tags would involve the active hunting down of searchable signatures. locating objects on social networking sites that are also in one’s digital collection, the indexer would identify and track associated user-generated searchable signatures. this method would require extreme diligence, high levels of comfort navigating and using web 2.0, a clear idea of which social networking sites yield the most valuable searchable signatures, and likely one or more full-time staff members devoted to such activities. even if feed systems were employed for individual digital objects, this method demands much of indexers and would likely not be sustainable over time. a more passive yet efficient way of gathering searchable signatures would simply be to build on methods that have been shown to be successful. by creating interactive digital environments that encourage users to assign not only descriptive but also identity-based tags, indexers are freed of the time-consuming task of hunting for searchable signatures on the web. 
since searchable signatures have come to be part of online social practice, the assigning of them would likely be familiar to users, although initially libraries might need to prompt users to share signatures or provide them with examples. this gathering tactic could be used to harvest signatures for items that are already part of the library’s digital collection (telling us about signatures used by potential sharers) or as a means to incorporate new digital objects into the collection (telling us about signatures used by both creators and sharers). in both gathering scenarios, indexers might choose to display only the most frequently occurring or what they deem to be the most relevant searchable signatures, or they might choose to display all such tags; decisions such as these will ultimately depend on each institution’s mission and resources. of course, if a library integrates a born-digital image into its collection and can identify the searchable signatures originally assigned to it via social networking sites or otherwise, this information should also be recorded. here, users will be able to get a glimpse of the image in its pre-library life. providing associated usernames, dates posted, and the names of the social networking sites, too, will assist in providing a more complete picture of the individuals or groups linked to the image. this information can provide valuable data about the information creators and sharers who use specific social platforms. the aim of this paper is to lay the theoretical groundwork to better understand the role of searchable signatures in today’s digital environment as well as the signature’s unique ability to provide context for digital images. surely, further research into the phenomenon of the searchable signature would demonstrate how it is currently used outside of instagram or as a political tool. 
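as a rough illustration of the gathering and display choices discussed above, the following python sketch shows one way a record for a searchable signature might carry its contextual data (username, platform, date posted, and whether the person created or shared the image) and how an indexer might select only the most frequently occurring signatures for display. the field names, the `SignatureRecord` class, and the `top_signatures` helper are assumptions made for illustration; they are not drawn from any existing metadata schema or library system.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SignatureRecord:
    """One searchable signature observed with a digital object (hypothetical schema)."""
    object_id: str   # identifier of the image in the library's collection
    tag: str         # the signature itself, e.g. "#catlady"
    username: str    # account that assigned the tag
    platform: str    # social networking site where the tag was assigned
    posted: date     # date the tag was posted
    role: str        # "creator" or "sharer"


def top_signatures(records, object_id, limit=3):
    """Return the most frequently assigned signatures for one object.

    Tags are compared case-insensitively; the result is a list of
    (tag, count) pairs ordered from most to least frequent.
    """
    counts = Counter(
        r.tag.lower() for r in records if r.object_id == object_id
    )
    return counts.most_common(limit)
```

an institution that prefers to display every signature could simply skip the `limit`, and the `posted` field would let users trace how the signatures attached to one image change over time.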
others might consider examining the username as another arena in which individuals or groups construct and perform online identities and thus engage in struggles for recognition. usernames also might provide contextual use data and sociohistorical information that inevitably support greater understanding of digital objects. finally, further research is needed to identify how libraries could utilize the searchable signature in promotional activities and to build and cater to user communities.

references

1. axel honneth, the struggle for recognition: the moral grammar of social conflicts (cambridge: mit press, 1995).
2. lauane baroncelli and andre freitas, “the visibility of the self on the web: a struggle for recognition,” in proceedings of 3rd acm international conference on web science, 2011, accessed august 12, 2013, www.websci11.org/fileadmin/websci/posters/191_paper.pdf.
3. jose van dijck, “facebook as a tool for producing sociality and connectivity,” television & new media 13, no. 2 (2012): 160–76; baroncelli and freitas, “the visibility of the self.”
4. t. j. clark, introduction to the painting of modern life: paris in the art of manet and his followers (princeton, nj: princeton university press, 1984), 1–22.
5. ibid.
6. ibid., 9.
7. baroncelli and freitas, “the visibility of the self.”
8. ibid.
9. soraj hongladarom, “personal identity and the self in the online and offline world,” minds & machines 21 (2011): 533–48.
10. ibid., 541.
11. erika pearson, “all the world wide web’s a stage: the performance of identity in online social networks,” first monday 14 (2009), accessed november 9, 2012, www.uic.edu/htbin/cgiwrap/bin/ojs/index.php/fm; erving goffman, the presentation of self in everyday life (garden city, ny: doubleday, 1959).
12. honneth, the struggle for recognition, 1.
13. jurgen habermas, the theory of communicative action (boston: beacon, 1984); honneth, the struggle for recognition, 92.
14. honneth, the struggle for recognition, 129.
15. ibid.
16. ibid., 130.
17. baroncelli and freitas, “the visibility of the self.”
18. ibid.
19. gobinda chowdhury, “from digital libraries to digital preservation research: the importance of users and context,” journal of documentation 66, no. 2 (2010): 207–23, doi: 10.1108/00220411011023625.
20. ibid., 217.
21. brenda dervin, “given a context by any other name: methodological tools for taming the unruly beast,” in information seeking in context, ed. pertti vakkari et al. (london: taylor graham, 1997), 13–38.
22. ibid., 15.
23. ibid., 18.
24. christopher a. (cal) lee, “a framework for contextual information in digital collections,” journal of documentation 67 (2011): 95–143.
25. ibid., 100.
26. abebe rorissa, “a comparative study of flickr tags and index terms in a general image collection,” journal of the american society for information science and technology 61, no. 11 (2010): 2230–42.
27. ibid., 2239.
28. “steve central: social tagging for cultural collections,” steve: the museum social tagging project, accessed december 16, 2012, http://tagger.steve.museum.
29. “yale university library manuscripts & archives department,” yale university manuscripts & archives digital images database, last modified april 19, 2012, accessed december 3, 2012, http://images.library.yale.edu/madid.
30. ibid.
31. “andrew st. george papers (ms 1912),” manuscripts and archives, yale university library, accessed april 30, 2013, http://drs.library.yale.edu:8083/fedoragsearch/rest.
32. “faq,” instagram, accessed november 10, 2012, http://instagram.com/about/faq.
33. emil protalinski, “instagram passes 80 million users,” cnet, july 6, 2012, accessed november 13, 2012, http://news.cnet.com/8301-1023_3-57480931-93/instagram-passes-80-million-users; jenna wortham, “it’s official: facebook closes its acquisition of instagram,” new york times, september 6, 2012, accessed november 13, 2012, http://bits.blogs.nytimes.com/2012/09/06/its-official-facebook-closes-its-acquisition-of-instagram.
34. “tagging your photos using #hashtags,” instagram, accessed november 10, 2012, http://help.instagram.com/customer/portal/articles/95731-tagging-your-photos-using-hashtags; “instagram tips: using hashtags,” instagram, accessed november 10, 2012, http://blog.instagram.com/post/17674993957/instagram-tips-using-hashtags.
35. andrew l. mendelson and zizi papacharissi, “look at us: collective narcissism in college student facebook photo galleries,” in a networked self: identity, community and culture on social network sites, ed. zizi papacharissi (new york: routledge, 2010), 251–73.
36. zachary mccune, “consumer production in social media networks: a case study of the ‘instagram’ iphone app” (master’s dissertation, university of cambridge, 2011), accessed december 20, 2012, http://thames2thayer.com/portfolio/a-study-of-instagram.
37. honneth, the struggle for recognition, 127.
38. dervin, “given a context by any other name,” 17.
39. kiri blakeley, “crazy cat ladies,” forbes, october 15, 2009, accessed december 4, 2012, www.forbes.com/2009/10/14/crazy-cat-lady-pets-stereotype-forbes-woman-timefelines.html; crazy cat ladies society & gentlemen's auxiliary homepage, accessed december 4, 2012, www.crazycatladies.org.
40. chowdhury, “from digital libraries to digital preservation,” 219.
41. ibid.
42. clark, introduction to the painting of modern life, 6.
43. ibid.

acknowledgments

many thanks to erin meyer and dr. krystyna matusiak at the university of denver for their feedback and guidance. 
public libraries leading the way

online ticketed-passes: a mid-tech leap in what libraries are for

jeffrey davis

information technology and libraries | june 2019 8

jeffrey davis (jtrappdavis@gmail.com) is branch manager at san diego public library, san diego, california.

last year a library program received coverage from the new york times, the wall street journal, the magazines mental floss and travel+leisure, many local newspapers and tv outlets, online and trade publications like curbed, thrillist, and artforum, and more. that program is new york’s culture pass, a joint program of the new york, brooklyn, and queens public libraries. culture pass is an online ticketed-pass program providing access to area museums, gardens, performances, and other attractions. as the new york daily news wrote in their lede: “it’s hard to believe nobody thought of it sooner: a new york city library card can now get you into 33 museums free.” libraries had thought of it sooner, of course. museum pass programs in libraries began at least as early as 1995 at boston public library and the online ticketed model in 2011 at contra costa (ca) county library. the library profession has paid this “mid-tech” program too little attention, i think, but that may be starting to change.

what are online ticketed-passes?

the original museum pass programs in libraries circulate a physical pass that provides access to an attraction or group of attractions. sometimes libraries are able to negotiate free or discounted passes but many times the passes are purchased outright. the circulating model is still the most common for library pass programs, but it suffers from many limitations. passes by necessity are checked out for longer than they’re used. they sit waiting for pick up on hold shelves and in transit to their next location. 
long queues make it hard for patrons to predict when their requests will be filled, and therefore to plan their visits. for the participating attractions, physical passes are typically good anytime and so compete with memberships and paid admission. there are few ways to shape who borrows the passes in order to meet institutional goals. and there are few ways to limit repeat use by library patrons to both increase exposure and nudge users toward membership. as a result, most circulating pass programs only connect patrons to a small number of venues. despite these limitations, circulating passes have been incredibly popular: at the time of writing there are 967 requests for san diego public library’s 73 passes to the new children’s museum. we sometimes see that sort of interest in a new bestseller, but this is a pass that sdpl has offered continuously since 2009.

in 2011, contra costa county library launched the first “ticketed-pass” program, discover & go. discover & go replaced circulating physical passes with an online system with which patrons, remotely or in the library with staff assistance, retrieve day-passes (tickets) by available date or venue. this relatively simple and common-sense change makes an enormous difference. in addition to convenience and predictability for patrons, availability is markedly increased because venues are much more comfortable providing passes when they can manage their use: patrons can be restricted to a limited number of tickets per venue per year and venues can match the number of tickets available to days that they are less busy. the latter preserves the value of their memberships while making use of their own “surplus capacity” to bring in new visitors and potential new members.
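the two controls just described (a per-patron cap on tickets per venue per year, and a per-day ticket allotment set by the venue) can be sketched in a few lines of python. this is only an illustration under stated assumptions, not how epass or any other product is actually implemented: the `Reservation` record, the `can_reserve` function, and the `daily_tickets` and `yearly_limit` parameters are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Reservation:
    """One ticket already issued (hypothetical record layout)."""
    patron_id: str
    venue: str
    visit_date: date


def can_reserve(patron_id, venue, visit_date, reservations, daily_tickets, yearly_limit):
    """Return True if the patron may take a ticket for this venue and date."""
    # Tickets the venue has released for this date; dates the venue has not
    # opened up (busy days, blackout dates) are simply absent and count as 0.
    offered = daily_tickets.get((venue, visit_date), 0)
    taken = sum(1 for r in reservations
                if r.venue == venue and r.visit_date == visit_date)
    if taken >= offered:
        return False
    # Per-patron cap on tickets to the same venue within one calendar year.
    used = sum(1 for r in reservations
               if r.patron_id == patron_id
               and r.venue == venue
               and r.visit_date.year == visit_date.year)
    return used < yearly_limit
```

a real system would also have to handle cancellations, concurrent requests, and the patron-status lookup against the ils, all of which are left out here.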
funding and internal expectations at many venues carry obligations to reach underserved communities and the programs allow partner attractions to shape public access and receive reporting by patron zip code and other factors.

the epass software behind discover & go is regional by design and supports sharing of tickets across multiple library systems in ways that are impractical to do with physical passes. as new library systems join the program, they bring new partner attractions into the shared collection with them. the oakland zoo, for example, needs only to negotiate with their contact at oakland public library to coordinate access for members of oakland, san francisco, and san jose public libraries.

because of the increased attractiveness of participation, it’s been easier for libraries to bring venues into the program. in 2011, discover & go hoped for a launch collection of five museums but ultimately opened with forty. the success of ticketed-pass programs in turn attracts more partners. today, discover & go is available through 49 library systems in california and nevada with passes to 137 participating attractions. similarly, new york’s culture pass launched with 33 participating venues and has grown in less than a year to offer a collection of 49.

while big city programs attract the most attention, pass programs are offered by county systems like alamance county (nc), consortiums like libraries in clackamas county (or), small cities like lawrence (ma), small towns like atkinson (nh), and statewide like the michigan activity pass which is available through over 600 library sites with tickets to 179 destinations plus state parks, camping, and historical sites. for each library, the participating destinations form a unique collection: a shelf of local riches, idiosyncratic and rooted in place.
through various libraries one can find tickets for the basketball hall of fame, stone barns center for food and agriculture, dinosaur ridge, eric carle museum of picture book art, bushnell park carousel, california shakespeare theater, children’s museums, zoos, aquariums, botanical gardens, tours, classes, performances, and on to the met, moma, crocker, de young, and many, many, many more. for kids, “enrichments” like these are increasingly understood as essential parts of learning and exploration. for adults, access to our cultural treasures, including partners like san francisco’s museum of the african diaspora or chicago’s national museum of puerto rican arts & culture, besides being its own reward, enhances local connection and understanding. we’re also starting to see the ticketing platform itself become an asset to smaller organizations (craft studios, school performances, farm visits, nature centers, and more) that want to increase public access without having to take on a new ability.

importantly, ticketed-pass programs are built on the core skills of librarians: information management, collection development, community outreach, user-centered design, customer service, and technological savvy.

public libraries leading the way: online ticketed passes | davis 10 https://doi.org/10.6017/ital.v38i2.11141

the technology

discover & go was initially funded by a $45,000 grant from the bay area library and information system (balis) cooperative. contra costa contracted with library software company quipu group to develop the epass software that runs the program and that is also used by ny’s culture pass, multnomah county (or) library’s my discovery pass, and a consortium of oregon libraries as cultural pass. ticketed-pass software is also offered by the libraryinsight and plymouth rocket companies and used by denver public library, seattle public library, the michigan activity pass, and others.
the software consists of a web application with a responsive patron interface and connects over sip2 or vendor api to patron status information from the library ils. administrative tools set fine-grained ticket availability, blackout dates, and policies including restrictions by patron age, library system, zip code, municipality, number of uses allowed globally and per venue, and more. recent improvements to epass include geolocation to identify nearby attractions and improved search filters. still in development are transfer of tickets between accounts, re-pooling of unclaimed tickets, and better handling of replaced library cards.

the strength that comes from multi-system ticketed-pass programs also carries with it challenges on the patron account side. ilses each implement protocols and apis for working with patron account information differently and library systems maintain divergent policies around patron status. there’s a role for lita and for library consortia and state libraries to push for more attention to and consistency on patron account policies and standards.

the emphasis in library automation is similarly shifting. our ilses originated to manage the circulation of physical items, a catalog-centric view. today, as robert anderson of quipu group suggested to me, a diverse range of online and offline services and non-catalog offerings orbit our users, calling for a new frame of reference: “it’s a patron-centric world now.”

the vision

library membership is the lynchpin of ticketed-pass and complementary programs in the technical sense, as above, and conceptually: library membership as one’s ticket to the world around. though i’m not aware of academic libraries offering ticketed-passes, they have been providing local access through membership. at many campuses, the library is the source for one’s library card which is also one’s campus id, on- and off-campus cash card, transit pass, electronic key, print management, and more.
that’s kind of remarkable and deserving of more attention.

traditionally, librarians have responded to patron needs by providing information, resources, and services ourselves. new models and technologies are making it easier to complement this with the facilitation approach, of which online ticketed-passes are the quintessential example. we further increase access by reducing barriers of complexity, language, know-how, and social capital, for example, by maintaining community calendars of local goings-on or helping communities take advantage of nearby nature.

online ticketed-pass programs will grow and take their place in the public’s expectations of libraries and librarians: that libraries are the place that helps us (better, more equitably) access the resources and riches around us. powering this are important new tools for library technologists to interrogate and advance with the same attention we give to both more established and more speculative applications.

the provision of mobile services in us urban libraries
ya jun guo, yan quan liu, and arlene bielefield
information technology and libraries | june 2018 78

ya jun guo (yadon0619@hotmail.com) is associate professor of information and library science at zhengzhou university of aeronautics, china. yan quan liu (liuy1@southernct.edu) is professor of information and library science at southern connecticut state university. arlene bielefield (bielefielda1@southernct.edu) is professor in information and library science at southern connecticut state university.

abstract

to determine the present situation regarding services provided to mobile users in us urban libraries, the authors surveyed 138 urban libraries council members utilizing a combination of mobile visits, content analysis, and librarian interviews. the results show that nearly 95% of these libraries have at least one mobile website, mobile catalog, or mobile app.
the libraries actively applied new approaches to meet each local community’s remote-access needs via new technologies, including app download links, mobile reference services, scan isbn, location navigation, and mobile printing. mobile services that libraries provide today are timely, convenient, and universally applicable.

introduction

the mobile internet has had a major impact on people’s lives and on how information is located and accessed. today, library patrons are untethered from and free of the limitations of the desktop computer.1 the popularity of mobile devices has changed the relationship between libraries and patrons. mobile technology allows libraries to have the kind of connectivity with their patrons that did not exist previously. patrons no longer think that it is necessary for them to be physically in the library building to use library services, and they are eager to obtain 24/7 access to library resources anywhere using their mobile devices. mobile patrons need mobile libraries to provide them with services. in other words, “patrons want to have a library in their pocket.”2 as a result, libraries around the world are exploring and developing mobile services.

according to the state of america’s libraries 2017 report by the american library association, the 50 us states, the district of columbia, and outlying territories have 8,895 public library administrative units (as well as 7,641 branches and bookmobiles). the vital role public libraries play in their communities has also expanded.3 as part of the main role of public libraries, us urban libraries need to embrace the developmental trend of the mobile internet to better serve their communities. the provision of mobile services in us urban libraries is worthy of study and is of great significance as a model for how other public libraries plan and implement their mobile services.
the provision of mobile services in us urban libraries | guo, liu, and bielefield 79 https://doi.org/10.6017/ital.v37i2.10170

literature review

definition and types of mobile devices and mobile services

as early as 1991, mark weiser proposed “ubiquitous computing,” pointing out how people could obtain and handle information at any time, anywhere, and in any way.4 with this expectation, the possibilities of using personal digital assistants (pdas) as mobile web browsers were researched in 1995.5 in combination with a wireless modem, library users are able to use pdas to access information services whenever they are needed.

today, mobile devices are generally defined as units small enough to carry around in a pocket, falling into the categories of pdas, mobile phones, and personal media players.6 for many researchers, laptops are not included in the definition of mobile devices. although wireless laptops purportedly offer the opportunity to go “anywhere in the home,” laptops are generally used in a small set of locations, rather than moving fluidly through the home; wireless laptops are portable, but not mobile.7 in contrast, lippincott suggested that mobile devices should include laptops, netbooks, notebook computers, cell phones, audio players such as mp3 players, cameras, and other items.8 according to the “mobile strategy report” by the california digital library, mobile phones, e-readers, mp3 players, tablets, gaming devices, and pdas are common mobile devices.9 each mobile device has its own characteristics and the potential to connect to the internet from anywhere with a wi-fi network, driving widespread use and thus the provision of library mobile services.

mobile services are services libraries offer to patrons via their mobile devices.
these services as described herein comprise two categories: traditional library services modified to be available via mobile devices and services created for mobile devices.10 pope et al. listed several mobile services, including sms or text-messaging services, the my info quest project, digital collections, audiobooks, applications, and mobile-friendly websites.11 the california digital library pointed out that a growing number of university and public libraries are offering mobile services. libraries are creating mobile versions of library websites, using text messaging to communicate with patrons, developing mobile catalog searching, providing access to resources, and creating new tools and services, particularly for mobile devices.12 the most recognized mobile services in university libraries are mobile sites, mobile apps, mobile opacs, mobile access to databases, text messaging services, qr codes, augmented reality, and e-books.13 both academic and public libraries’ use of web 2.0 applications and services includes blogs, wikis, phone apps, qr codes, mash-ups, video or audio sharing, customized webpages, social media and social networking, and types of social tagging.14

this study focuses on the two most common mobile devices, mobile phones and tablets, and on the services provided to library patrons and local communities through mobile websites, mobile apps, and mobile catalogs.

status of mobile services in us libraries

mobile devices present a new and exciting opportunity for libraries of all types to provide information to people of all ages on the go, wherever they are.15 it is generally observed that there is an increased use of mobile technology in the library environment.
librarians see their users increasingly using mobile phones instead of laptops and desktop computers to search the catalog, check the library’s opening hours, and maintain contact with library staff.16

in an earlier investigation of 766 librarians, spires found that there was very little demand for services for mobile devices as of august 2007. at that time, relatively few libraries (18%) purchased content specifically for wireless handheld device use, and very few libraries (15%) reformatted content for these devices.17 however, a survey of public libraries completed by the american library association between september and november 2011 indicated interesting changes: 15% of library websites are optimized for mobile devices, 12% of libraries use scanned codes (e.g., qr codes), and 7% of libraries have developed smartphone applications for access to library services; 36% of urban libraries have websites optimized for mobile devices, compared to 9% of rural libraries; 76% of libraries offer access to e-books; 70% of libraries use social networking tools such as facebook.18

later studies revealed more significant changes. in 2012, 99 association of research libraries member libraries were surveyed to identify how many had optimized at least some services for the mobile web. apps were not investigated. the result showed that 83 libraries (84%) had a mobile website.19 a study in 2015 by liu and briggs showed that the top 100 university libraries in the united states offered one or more mobile services, with mobile websites, mobile access to the library catalog, mobile access to the library’s databases, e-books, and text messaging services being the most common.
qr codes and augmented reality were less common.20 kim noted that “libraries are acknowledging that people expect to do just about everything on mobile devices and that more and more people are now using a mobile device as their primary access point for the web.”21 although librarians may have previously underestimated what people wanted to do using mobile devices, there is a growing understanding of the potential of these access points.

research design

survey samples

while a growing number of users tend to access information remotely, urban libraries, as the most popular public-sector institutions and community centers, are facing great challenges in addressing the growing need for mobile services. the urban libraries council (ulc) (https://www.urbanlibraries.org), founded in 1971, is the premier membership association of north america’s leading public library systems. ulc’s member libraries are in communities throughout the united states and canada, comprising a mix of institutions with varying revenue sources and governance structures, and serving communities with populations of differing sizes. ulc’s website lists 145 us and canadian urban libraries. since this study focused only on us urban libraries, 138 libraries were chosen as the study targets, and all were examined.

table 1. the survey and examples of survey results.

contents: components of mobile websites
options: 1 account login; 2 catalog search; 3 contact us; 4 downloadables; 5 events; 6 interlibrary loan; 7 kids & teens; 8 locations and hours; 9 meeting room; 10 recent arrivals; 11 recommendations; 12 social media; 13 suggest a purchase; 14 support
example no.1 (pima county public library): 1, 2, 3, 4, 5, 7, 8, 9, 10, 12, 13, 14.
example no.138 (milwaukee public library): 1, 2, 3, 4, 5, 7, 8, 9, 12, 13, 14.

contents: components of mobile apps
options: 1 account login; 2 barcode wallet; 3 bestsellers; 4 catalog search; 5 contact us; 6 downloadables; 7 events; 8 full website; 9 interlibrary loan; 10 just ordered; 11 kids & teens; 12 locations and hours; 13 meeting room; 14 my bookshelf; 15 my library; 16 pay fines; 17 popular this week; 18 recent arrivals; 19 recommendations; 20 scan isbn; 21 social media; 22 suggest a purchase; 21 support
example no.1 (pima county public library): 1, 4, 5, 6, 7, 8, 12, 15, 18, 20, 21.
example no.138 (milwaukee public library): 1, 4, 5, 6, 7, 8, 12, 17, 20, 21.

contents: mobile reference services
options: 1 chat/im; 2 social media; 3 text/sms; 4 web form
example no.1 (pima county public library): -
example no.138 (milwaukee public library): 1, 3, 4.

contents: social media
options: 1 blog; 2 facebook; 3 flickr; 4 goodreads; 5 google+; 6 instagram; 7 linkedin; 8 pinterest; 9 tumblr; 10 twitter; 11 youtube
example no.1 (pima county public library): 1, 2, 3, 6, 8, 10, 11.
example no.138 (milwaukee public library): 1, 2, 6, 8, 10.

contents: mobile reservation services
options: 1 reserve a computer; 2 reserve a librarian; 3 reserve a meeting room; 4 reserve a museum pass; 5 reserve a study room; 6 reserve exhibit space
example no.1 (pima county public library): -
example no.138 (milwaukee public library): 3.

contents: mobile printing
options: 1 mobile printing; 2 no mobile/wi-fi printing; 3 wi-fi printing
example no.1 (pima county public library): 3.
example no.138 (milwaukee public library): 2.

contents: apps or databases
options: 1 axis 360; 2 biblioboard; 3 bookflix; 4 brainfuse; 5 career transitions; 6 cloud library; 7 driving-tests.org; 8 ebscohost; 9 flipster; 10 freading; 11 freegal; 12 gale virtual; 13 hoopla; 14 instant flix; 15 learning express; 16 lynda.com; 17 mango languages; 18 master file; 19 morningstar; 20 new york times; 21 novelist; 22 one click digital; 23 overdrive; 24 reference usa; 25 safari; 26 tumble book; 27 tutor.com; 28 world book; 29 worldcat; 30 zinio.
example no.1 (pima county public library): 4, 11, 14, 22, 23, 26, 28, 30.
example no.138 (milwaukee public library): 4, 8, 11, 12, 13, 15, 17, 18, 19, 21, 23, 24, 30.

survey methods

because mobile services are delivered over wireless networks to mobile devices, a combination of research methods, including mobile website visits, content analysis, and librarian interviews, was applied for data collection.
specifically, librarian interviews were employed as a verification and supplemental process to ensure that survey data were accurate and exhaustive.

first, the authors utilized an iphone, an android mobile phone, and an ipad to access the websites of the 138 us urban libraries in the study sample to ascertain whether these libraries have mobile websites or mobile catalogs and whether the platforms operate properly. then the authors checked whether these libraries have mobile apps that can be downloaded from the apple app store or the google play store. the survey was conducted from june 18 to june 24, 2017.

next, the authors went through all the mobile websites and the mobile apps the libraries provide to check the mobile services offered. the authors used a specially designed survey to collect data about each library’s mobile website and app (see table 1). the content analysis was conducted between june 25 and july 24, 2017, with the examination of each library’s services taking approximately 30 minutes.

finally, for those libraries that had no mobile websites or mobile apps found through the website visits, the authors made interview requests to staff librarians via their online reference services such as live chat, web form, and email. an additional purpose of this step was to confirm the accuracy of the survey data collected from website visits. the interviews were conducted from july 22 to august 3, 2017.

results and analysis

results from the examination of mobile website visits, content analysis, and librarian interviews revealed what services us urban libraries provided as mobile services, how they were provided, and which were commonly provided.

how many libraries provide mobile services?

over 83% of us urban libraries have developed their own mobile websites (see figure 1) for the communities they serve. the mobile website is currently the most popular service platform for mobile users.
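the percentages reported in this section are tallies over coded survey records of the kind shown in table 1. a minimal python sketch of such a tally follows, assuming each library’s answers are stored as a list of option numbers. it is an illustration only, not the authors’ actual tooling; the function name `component_shares` is hypothetical, and the sample holds just the two example records printed in table 1, so its output says nothing about the full set of 138 libraries.

```python
from collections import Counter

# Option codes for "components of mobile websites", as listed in table 1.
OPTIONS = {
    1: "account login", 2: "catalog search", 3: "contact us",
    4: "downloadables", 5: "events", 6: "interlibrary loan",
    7: "kids & teens", 8: "locations and hours", 9: "meeting room",
    10: "recent arrivals", 11: "recommendations", 12: "social media",
    13: "suggest a purchase", 14: "support",
}

# The two example records printed in table 1 (libraries no. 1 and no. 138).
coded = {
    "pima county public library": [1, 2, 3, 4, 5, 7, 8, 9, 10, 12, 13, 14],
    "milwaukee public library": [1, 2, 3, 4, 5, 7, 8, 9, 12, 13, 14],
}


def component_shares(records):
    """Return, for each component, the fraction of libraries that offer it."""
    # set() guards against a code being entered twice for one library.
    counts = Counter(code for codes in records.values() for code in set(codes))
    total = len(records)
    return {name: counts.get(code, 0) / total for code, name in OPTIONS.items()}


shares = component_shares(coded)
for name, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{name}: {share:.0%}")
```

the same function applies unchanged to the other rows of table 1 once their option lists are swapped in.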
figure 1. types of mobile services provided by libraries.

promisingly, each test of these websites through the authors’ mobile devices, either smartphones or tablets, confirmed that all the study subjects can be accessed 100% of the time. these library websites, however, are not all built specially for mobile devices. while the majority of urban libraries have transformed their desktop websites into mobile sites with proper responsive design, about 17% are just smaller versions of their desktop websites (see figure 2).

a responsive mobile website can react or change according to the needs of the users and the mobile device they’re viewing it on to achieve a good layout and content display. here, text and images change from a three-column to a single-column layout, and unnecessary images are hidden. the web address of a responsively designed mobile website is the same as the desktop website. responsive design is described as a long-term solution for addressing both designers’ and users’ needs.22

the survey found that 59% of libraries now have apps. our analysis of the earliest version records of these apps indicates that los angeles public library was the first to use an app, in august 2010. mobile apps have advantages and disadvantages compared to mobile websites, and many libraries compared them and chose between the two. skokie (illinois) public library, as of october 2015, is no longer supporting the library’s mobile app because they claim the library’s website offers a better mobile experience. they also offer an easy access solution like that for a mobile app, with a message displayed to users: “miss having an icon on your home screen?
bookmark the site to your home screen and you’ll have an icon to take you directly to this site.”

[figure 1 chart values: mobile website 83%; mobile app 59%; mobile catalog 22%]

figure 2. the smaller versions of the desktop website and the specially designed mobile website

the proportion of libraries providing mobile catalog services is only 22%. libraries can use multiple options to create one or more mobile service platforms. nearly half (46%) of us urban libraries have both mobile websites and mobile apps. according to the survey, 95% of libraries have at least one mobile website, mobile catalog, or mobile app. a survey the authors conducted in april 2014 found that only 81% of the urban libraries had at least one mobile website, mobile catalog, or mobile app (see figure 3). clearly, libraries are paying increasing attention to mobile services, and providing mobile services has become the unavoidable choice of libraries nowadays.

figure 3. changes in the proportion of libraries that provide mobile services from 2014 to 2017.
[figure 3 chart values: 2014: no mobile services 19%, at least one mobile service 81%; 2017: no mobile services 5%, at least one mobile service 95%]

what content do the mobile websites offer?

through mobile website visits and content analysis, it was found that some types of information are available at all libraries, including “account login,” “events,” “locations and hours,” “contact us,” and “social media” (see figure 4).

figure 4. components of mobile websites

the proportion of library mobile sites that offer “support” and “downloadables” is 96% and 95%, respectively.
among them, “support” generally includes donations to the library foundation, donation of books and other materials, and providing volunteer services; “downloadables” generally include e-books, e-magazines, and music. a total of 86% of the urban libraries set up “kids” and “teens” sections, providing specialized information services, such as storytime, games, events, book lists, homework help, volunteer information, and college information.

a majority (62%) of libraries provide interlibrary loan information on mobile websites, but one library, palo alto (california) city library, has not offered the costly interlibrary loan service since july 2011. more than half (56%) of the libraries set up a “suggest a purchase” function and generally ask readers to provide title, author, publisher, year published, format, and other information in a web form.

some libraries display “recommendations” (26%) on their mobile websites. denver public library has a special column recommending books for children and teenagers and offers personalized reading suggestions: “tell us what you like to read and we’ll send you our recommendations in about a week.”

many mobile websites automatically identify the operating system of the user’s mobile phone and then display a prompt for the library’s mobile app with a link to the apple app store or the google play store. this helps promote the use of the libraries’ apps, and it is also convenient for users.

[figure 4 chart values: account login 100%; events 100%; locations and hours 100%; contact us 100%; social media 100%; catalog search 99%; support 96%; downloadables 95%; kids & teens 86%; meeting room 74%; interlibrary loan 62%; suggest a purchase 56%; recent arrivals 32%; recommendations 26%]

what content do the mobile apps offer?
the content of mobile websites in libraries is basically the same, but the content of their mobile apps varies widely. the reason is that libraries differ in their understanding of the functions an app should offer. some of these apps were designed by software vendors, such as boopsie, sirsidynix, and bibliocommons, but some were designed by the libraries themselves, leading to the absence of a uniform standard or template for the app design.

survey results show that only “account login” and “catalog search” are available in all apps (see figure 5). “locations and hours” accounts for a high proportion of apps at 96%. the “locations” feature in many libraries’ apps, with the help of gps, helps users find their nearest library location.

figure 5. components of mobile apps

about 85% of apps provide “contact us.” click “contact us” in poudre river public library district and some other libraries’ apps, and you can directly call the library or send text messages via email. “scan isbn” is a unique feature of mobile apps, and 75% of apps provide this functionality. if a library user finds a book they need in a bookstore or elsewhere, they can scan the isbn to see if that book is in the library’s collection.

apps designed by bibliocommons all have “bestsellers,” “recently reviewed,” “just ordered,” and “my library” (see figure 6). in “my library,” the “checked out” section contains red alerts for “overdue,” yellow alerts for “due soon,” and “total items.” the “holds” section contains “ready for pickup,” “active holds,” and “paused holds.”
the “my shelves” section contains “completed,” “in progress,” and “for later.” in this way, users can clearly see the details of the books they have borrowed and intend to borrow. apps designed by boopsie generally have “popular this week” to tell users which books have been borrowed more recently.

[figure 5 chart values: account login 100%; catalog search 100%; locations and hours 96%; downloadables 89%; contact us 85%; events 77%; scan isbn 75%; social media 68%; full website 46%; recent arrivals 27%; bestsellers 24%; recently reviewed 19%; popular this week 18%; just ordered 16%; my library 16%; my bookshelf 10%; pay fines 6%; barcode wallet 5%; kids & teens 3%]

figure 6. an app designed by bibliocommons.

only 3% of apps have “kids” and “teens” sections, which differs greatly from the percentage of mobile websites that offer those sections (86%).

what mobile reference services do libraries provide?

according to the survey, the most common way for us urban libraries to provide mobile reference service is a web form, which is available in 86% of surveyed libraries (see figure 7). compared with “call us,” a web form has the advantage of being independent of the library’s working hours. a web form works much like email, in that librarians generally respond to the user’s email address, but users do not need to open their own email system; they only need to fill in the content required by the form. therefore, it is more convenient to use. the authors believe that providing only an email address is not mobile reference service. the survey found that 6% of libraries do not have mobile reference services.

figure 7. mobile reference services provided by libraries.
currently, 43% of libraries offer chat and instant messaging (im) services, which allow users to communicate with librarians instantly. for example, when gwinnett county (georgia) public library’s mobile website is visited, an “ask us” dialog box appears in the upper right corner of the site, which allows visitors to chat with librarians. outside of the library’s work hours, the box displays “sorry, chat is offline but you can still get help” (see figure 8). the county of los angeles public library provides four options for im: aim, google talk, yahoo! messenger, and msn messenger.

figure 8. “ask us” on gwinnett county public library’s mobile website

[figure 7 chart values: web form 86%; chat/im 43%; text/sms 33%; social media 8%]

all the florida urban libraries surveyed offer reference services via the web form, chat, and text because an “ask a librarian” service administered by the tampa bay library consortium provides florida residents with those mobile reference services.

the survey shows that only 8% of the libraries provide social media reference service in “ask a librarian.” the social media used for reference service is either facebook or twitter. in fact, all of the libraries have facebook and twitter accounts, but most do not use them to provide reference services.

what social media do the libraries use?

survey results showed that 100% of mobile websites display links to their social media, usually in a prominent position on the front page of the websites; 68% of apps have social media links.

facebook and twitter are social media leaders, and now all libraries’ mobile websites have both (see figure 9). the survey conducted in 2014 showed that facebook and twitter had the highest adoption rates even then, but only 61% of libraries offered facebook and 53% offered twitter.
it is obvious that libraries have made great progress in the last three years in the application of social media.

figure 9. social media being used by libraries.

[figure 9 data: facebook 100%, twitter 100%, instagram 76%, youtube 67%, blog 57%, pinterest 49%, flickr 41%, tumblr 19%, linkedin 12%, google+ 12%, goodreads 9%]

instagram and pinterest are both photo-based social media, used by 76% and 49% of libraries, respectively. as the leading social media platform for video, youtube is used by 67% of libraries.

what mobile reservation services do libraries provide?

mobile reservation services were found in 78% of all libraries’ mobile services. a majority (62%) of the libraries allow online reservation of a meeting room via web form or other forms, and 14% allow reserving a study room (see figure 10). some libraries accept study or meeting room reservations only by phone.

figure 10. mobile reservation services provided by libraries.

a few libraries provide instant online access to free and low-cost tickets to museums, science centers, zoos, theatres, and other fun local cultural venues with discover & go. a total of 14% of the libraries provide “reserve a librarian” service, allowing patrons to reserve a free session with a reference librarian or subject specialist at the library. in addition, several libraries, such as pasadena public library, allow reserving of exhibit space.

how many libraries provide mobile printing?

mobile printing services allow patrons to print to a library printer from outside the library or from their mobile device. patrons’ print jobs are available for pick up at the library. already, 43% of the libraries provide mobile printing service (see figure 11). it is expected that more libraries will provide this service. to print from a mobile device, patrons need to download an app that supports mobile printing.
printeron is the most commonly used app; oakland public library and san mateo county (california) libraries, among others, use it. however, san diego public library uses the your print cloud print system, and santa clara county (california) library uses smart alec. san mateo county libraries offers wireless printing from smartphones, tablets, and laptops at all of its locations, and its wireless printing includes mobile printing, web printing, and email printing. in addition, 14% of libraries offer wireless printing services but do not provide mobile printing services. for example, live oak public libraries in savannah, georgia, states that printing from laptops (pc and mac) is available in all branches, but they don’t have apps that support printing from tablets or mobile phones.

[figure 10 data: reserve a meeting room 62%, reserve a computer 20%, reserve a museum pass 15%, reserve a study room 14%, reserve a librarian 14%, reserve exhibit space 4%]

figure 11. the proportion of libraries that offer mobile printing.

what apps or databases do libraries provide for patrons?

the four main software programs used to display the surveyed libraries’ e-books are overdrive (93%), hoopla (64%), tumblebook (61%), and cloud library (48%). for audiobooks, overdrive (93%) and hoopla (64%) are the most popular; oneclickdigital is used by 48%. most libraries (74%) use zinio for e-magazines, and 48% use the music software freegal. overdrive is the most common application in libraries (see table 2).

table 2. the proportion of apps or databases being used in libraries.
apps or databases    % of libraries providing    apps or databases    % of libraries providing
overdrive            93                          world book           46
novelist             79                          new york times       44
referenceusa         74                          masterfile           43
zinio                74                          ebscohost            43
learningexpress      69                          flipster             29
gale virtual         68                          bookflix             28
hoopla               64                          brainfuse            22
morningstar          64                          tutor.com            17
mango languages      61                          safari               17
tumblebook           61                          driving-tests.org    16
lynda.com            57                          biblioboard          12
worldcat             51                          career transitions   12
freegal              48                          axis 360             11
oneclick digital     48                          instantflix          10
cloud library        48                          freading             9

[figure 11 data: mobile printing 43%, no wireless/mobile printing 42%, wireless printing 14%]

the libraries provide users with various types of databases. survey statistics show that the widely used databases include referenceusa (business), mango languages (language learning), learningexpress and career transitions (job and career), lynda.com and tutor.com (education), morningstar (investment), world book (encyclopedias), worldcat (library resources worldwide), new york times (newspaper articles), driving-tests.org (testing preparation), and safari (technology).

conclusion

this study shows that mobile services have become popular in us urban libraries as of summer 2017, with 95% offering one or more types of mobile service. responsive mobile websites and mobile apps are the main platforms of current mobile services. us urban libraries are striving hard to meet their local communities’ remote access needs via new technologies. compared with desktop websites, mobile websites and apps for mobile devices offer services that are more accessible, smarter, and more interactive for local users.
some mobile websites automatically prompt the user to install the libraries’ apps; many libraries’ apps offer the “scan isbn” function, making it convenient for the user to scan a book title at any time to see if it is in the library’s collection; “location” provides gps positioning and navigation services for users; “contact us” links directly to telephone, text, and email. libraries are actively developing and adding more mobile services, such as mobile reservation services and mobile printing services. the development of mobile technology has provided the support for libraries to offer mobile services. a future in which users access library services at any time, anywhere, and in any way is getting closer and closer.

acknowledgements

this work was supported by grant no. 14ctq028 from the national social science foundation of china.

references

1. jason griffey, mobile technology and libraries (new york: neal-schuman, 2010).
2. meredith farkas, “a library in your pocket,” american libraries no. 41 (2010): 38.
3. american library association, “the state of america’s libraries 2017: a report from the american library association,” special report, american libraries, april 2017, http://www.ala.org/news/sites/ala.org.news/files/content/state-of-americas-libraries-report-2017.pdf.
4. mark weiser, “the computer for the 21st century,” scientific american 265, no. 3 (1991): 94–104.
5. stefan gessler and andreas kotulla, “pdas as mobile www browsers,” computer networks and isdn systems 28, no. 1–2 (1995): 53–59.
6. georgina parsons, “information provision for he distance learners using mobile devices,” electronic library 28, no. 2 (2010): 231–44, https://doi.org/10.1108/02640471011033594.
7. allison woodruff et al., “portable, but not mobile: a study of wireless laptops in the home,” international conference on pervasive computing 4480 (2007): 216–33, https://doi.org/10.1007/978-3-540-72037-9_13.
8. joan k. lippincott, “a mobile future for academic libraries,” reference services review 38, no. 2 (2010): 205–13.
9. rachel hu and alison meir, “mobile strategy report,” california digital library, august 18, 2010, https://confluence.ucop.edu/download/attachments/26476757/cdl+mobile+device+user+research_final.pdf?version=1.
10. yan quan liu and sarah briggs, “a library in the palm of your hand: mobile services in top 100 university libraries,” information technology & libraries 34, no. 2 (2015): 133–48, https://doi.org/10.6017/ital.v34i2.5650.
11. kitty pope et al., “twenty-first century library must-haves: mobile library services,” searcher 18, no. 3 (2010): 44–47.
12. hu and meir, “mobile strategy report.”
13. liu and briggs, “a library in the palm of your hand.”
14. kalah rogers, “academic and public libraries’ use of web 2.0 applications and services in mississippi,” slis connecting 4, no. 1 (2015), https://doi.org/10.18785/slis.0401.08.
15. pope et al., “twenty-first century library must-haves.”
16. lorraine paterson and low boon, “usability inspection of digital libraries: a case study,” ariadne 63, no. 1 (2010): 11, https://doi.org/10.1007/s00799-003-0074-4. [website lists h. rex hartson, priya shivakumar, and manuel a.
pérez-quiñones as the authors]
17. todd spires, “handheld librarians: a survey of librarian and library patron use of wireless handheld devices,” internet reference services quarterly 13, no. 4 (2008): 287–309, https://doi.org/10.1080/10875300802326327.
18. american library association, “libraries connect communities 2011-2012,” last modified june 2012, http://connect.ala.org/files/68293/2012.67b%20plfts%20results.pdf.
19. barry trott and rebecca jackson, “mobile academic libraries,” reference & user services quarterly 52, no. 3 (2013): 174–78.
20. liu and briggs, “a library in the palm of your hand.”
21. bohyun kim, “the present and future of the library mobile experience,” library technology reports 49, no. 6 (2013): 15–28.
22. hannah gascho rempel and laurie bridges, “that was then, this is now: replacing the mobile-optimized site with responsive design,” information technology & libraries 32, no. 4 (2013): 8–24, https://doi.org/10.6017/ital.v32i4.4636.
metaphor’s role in the information behavior of humans interacting with computers

robin sease

metaphors convey information, communicate abstractions, and help us understand new concepts. while the nascent field of information behavior (ib) has adopted common metaphors like “berry-picking” and “gap-bridging” for its models, the study of how people use metaphors is only now emerging in the subfield of human information organizing behavior (hiob). metaphors have been adopted in human–computer interaction (hci) to facilitate the dialogue between user and system. exploration of the literature on metaphors in the fields of linguistics and cognitive science, as well as an examination of the history of the use of metaphors in hci as a case study of metaphor usage, offers insight into the role of metaphor in human information behavior.

editor’s note: this article is the winner of the lita/ex libris writing award, 2008.

our world is growing increasingly digital; our entire lives (our interactions, our entertainment, even our personal memories) are mediated by technology. humans have had thousands of years to learn to communicate with each other, largely employing metaphors and analogies to negotiate meaning. our experience communicating with computers is nascent, yet it is broadening every day with increasing dependency. we must fully understand the role that metaphors play in the exchange of information to facilitate the communication between humans and computers.

metaphors: a definition

metaphors were originally regarded as rhetorical devices; plato abhorred their use, arguing that they could convince a man to do the illogical.
schön explains that at that time metaphors were considered a “kind of anomaly of language, one which must be dispelled in order to clear the path for a general theory of reference or meaning.”1 aristotle, on the other hand, saw that they provided insight into the items of comparison. “ordinary words convey only what we know already; it is from metaphor that we can best get hold of something new.”2 traditionally the objects in the equation have been called the tenor and the vehicle, but more recently they are referred to as the target and source domains. in the metaphor, “alex is a space cadet,” alex is the tenor or target domain (the abstract or undefined), and space cadet represents the vehicle or source domain (the known). if “the essence of metaphor is understanding and experiencing one thing in terms of another,” then the vehicle or the source domain is responsible for elucidating the tenor or target domain.3 one measures the relationship between these domains, the tenor and the vehicle, with “ground” and “tension.” ground concerns the similarities between the domains and tension represents the dissimilarities.4 metaphors have been studied from multiple perspectives: from the creative use of metaphors in literature to the comprehension or appreciation of metaphors.5 the research from other disciplines can offer insight into the effect of metaphors on human information behavior. i will first discuss the use of metaphors in language and then review some of the theories on how they work.

metaphorically speaking: the role of metaphors in language

the work of lakoff and johnson has been fundamental to understanding the pervasive use of metaphors in our language. they propose that metaphors are an underlying structure forming and shaping the way we discuss and even think about the world.
they argue that the “human conceptual system is metaphorically structured and defined.”6 mapping from a source domain to a target domain is central to the semantics of language and communication. “domains need structure so that one can reason about them. the major function of metaphor is thus to supply structure in terms of which reasoning can be done.”7 in metaphors we live by, lakoff and johnson catalogue examples of underlying conceptual metaphors. they identify orientation metaphors that underlie how we speak about abstract concepts such as health, happiness, and success. each of these states is associated with the direction up. one can be “up and at ’em” or in “high spirits” or of “high standing.” counter examples include “being under the weather,” “feeling down,” and “low comedy.” metaphors shape the way we think about the concepts we are describing. for instance, the metaphor “argument is war” (“defending your point of view,” “attacking your opponent’s stance,” and “he shot me down”) may define expectations for “winning” and “losing” and detrimentally shape our ability to negotiate and compromise.8

[robin sease (seaser@u.washington.edu) is an mlis candidate at the ischool, university of washington, seattle.]
[information technology and libraries | december 2008]

lakoff and johnson refer also to michael reddy’s 1979 piece, “the conduit metaphor.”9 reddy hypothesizes that linguistically and conceptually we see ideas or meanings as objects, linguistic expressions as containers, and communication as sending. the “receivers” of the communication are the information users or seekers. the designers “package their ideas,” “put them down on paper,” and “convey” them to the user who “gets” them or not.
reddy argues that this underlying metaphor influences the way we think about the communication process, making information and meaning an object rather than a process, which trivializes the function of the reader or listener.10 metaphors are undeniably central to our ability to communicate and use language, and perhaps more fundamentally, to convey meaning or to infer meaning: to illustrate and explain as well as to identify and to catalog. the role of metaphors in human cognition is still a matter of great debate.

thinking about metaphors: the cognitive role of metaphors

information science is at its heart the study of information. if metaphors exist as a necessary component of language (a tool to convey meaning and to transfer information), then metaphors are by necessity a component of information science. understanding how metaphors work provides insight into information itself. early propositions about how metaphors were understood stemmed from poetic and rhetorical research. that is, if a sentence cannot be interpreted literally, then it must be interpreted figuratively. to illustrate, the assertion “my child is a pig” is initially illogical, so the receiver would then move on to figurative interpretation. once that determination is made, the mind sets about finding meaning from the expression. this theory argues that once the statement is deemed false, the statement is treated like a simile, or a comparison statement, by identifying traits or attributes in the source domain (the pig: sloppy, slovenly, fascinated with mud) that would be applicable to the target domain.11 one group of theorists questions this premise, pointing to sentences that can be interpreted literally and figuratively. one useful example is the statement “my dog is an animal.”12 while this expression is true literally, most would reject the literal interpretation in favor of one that depicts the dog as a ferocious or uncontrollable beast.
glucksberg and keysar, among others, seek a model that focuses on the associations between the domains. they hypothesize that metaphors are not “implicit comparisons” but are class-inclusion statements or “assertions of categorization.”13 research in cognitive processing of analogies has shifted from a plain association of a to b, in which traits of a are matched to traits of b, to a hypothesis that maps from a to b and leads to insight into a super-ordinate category that includes both a and b. gentner’s work studying science metaphors in the 1980s is partially founded on this theory. she notes that through “analogical reasoning, learning can result in the generation of new categories and schemas.”14 she is particularly interested in creating ways for computers to interpret figurative expressions. she proposes a structure-mapping theory: a system of relations (not just traits) from the source domain to the target domain with a parallelism between the structures that allows for a one-to-one mapping of the domains and relationships. weiner explores a similar tactic with human–computer interaction language processing by prototyping the shared framework. the prototype theory allows for a range of possible predicates and would accommodate greater tension (the differences in a metaphor) in the same way that we can categorize penguins and chickens under the prototype of bird.15 these theories of categorization remain popular today, but still struggle to account for certain things about the way metaphors are comprehended.
specifically, take the shakespearean line, “juliet is the sun.” categorization theory does not explain why some attributes like “glowing” and “center of the solar system” are transferred from the source while others such as “nuclear” and “huge” are not.16 this theory also stumbles with novel poetic metaphors like e e cummings’ “the voice of your eyes is deeper than all roses.”17 alternative theorists argue that while the categorization-based theories accommodate the ground (commonality) in a metaphor, they fail to fully explain the effect and purpose of the tension (differences) in the equation. lakoff fervently contends that simplifying conceptual models to mere categorization ignores the unique nature of each specific mapping: each mapping defines an open-ended class of potential correspondences across inference patterns. when activated, a mapping may apply to a novel source domain knowledge structure and characterize a corresponding target domain knowledge structure.18 in other words, each pairing creates new meaning or conceptual frameworks from which other metaphors and meanings can be instantiated. a is to b creates meaning c, rather than a and b are part of c. looking at it from the perspective of lanier, a vocabulary is created upon which we can define even more vocabulary.19 lakoff maintains that the theory of conceptual domains speaks to both the uni-directional nature of metaphors as well as the “systematicity” that allows the interpreter to selectively identify the aspects that are consistent and discard the aspects that are inconsistent with the metaphor.20 more recent work approaches the question from a connectivist point of view, seeking ways to identify an overarching model consistent with and encompassing of other theories. this premise rests on the foundation of metaphor as communication and examines the use of metaphors in conversational contexts. 
working memory, and the common ground that they find are all of importance, but so are context and motivation as influencing factors. the context in which the statement is made, the place in which it is interpreted, and the motivation of the user to understand the statement combine to affect the meaning that is derived. for instance, the phrase, “i want you to sheepdog this project” could mean something different in the context of a chaotic group of workers than in the context of a core team threatened by competing entities.21 likewise, the relationship of the receiver to the sender could modify the motivation of the receiver to seek meaning beyond the first or easiest interpretation.

classifying metaphors: metaphors in information science

these notions of context and user-motivation are not new to the field of information science. at the turn of the century the subfield of information behavior had begun to direct its attention to cognitive psychology, the nature of man–machine dialogue, and to a certain extent the role of metaphor in deciphering and creating meaning. spink investigates human information behavior (hib) from an evolutionary perspective.22 after exploring a wide variety of research across fields, spink and currier performed a qualitative analysis of the information behavior of historical figures. they postulated that modular cognitive architecture makes homo sapiens rare in their ability to think of one thing in terms of another.23 the resulting mapping allows for the creation of new cognitive structures in a similar fashion to lanier’s vocabulary development conjecture. spink and currier’s work launched a new theory of information use, which has led to recent research into metaphor use.
in an attempt to model an integrative approach to human information behavior incorporating the everyday life information-seeking and sense-making approach, the information-foraging approach, and the problem-solution view of information seeking, spink and cole recognized a gap in the research covering actual information use and proffered a fourth information approach to account for it. their information-use theory “starts from an evolutionary psychology notion that humans are able to adapt to their environment and survive because of our modular cognitive architecture.”24 development of this theory has birthed a sub-area within the field of human information behavior dubbed human information organizing behavior (hiob), of which the use of metaphors or metaphor instantiation is a necessary component. cole and leide explore the notion of modular cognitive architecture in an attempt to model a cognitive framework for metaphor use in hiob. similar to the categorization theory of metaphor use, they claim that “metaphor instantiation is similar to a form of superordinate category instantiation . . . along with the metaphor comes the structure of the metaphor.”25 following in belkin’s footsteps, they address the problem of a “domain novice attempting to formulate his information need into an effective query to an information retrieval system.”26 they conducted three case studies with the purpose of developing a methodology that researchers can use to “ascertain the efficacy of metaphor instantiation as an information need structuring device.”27 they conclude that metaphor instantiation might help us create systems that more closely resemble the way that humans behave with information: interaction, organization, and retrieval.

metaphors in human–computer interaction: a case study

reality bytes

while theorists of various fields explored the nature of metaphors, the field of human–computer interaction (hci) found itself thrust into the thick of it.
rarely does one intentionally adopt new ideas so whole-heartedly without first considering the ramifications, but the history of hci shows that that is exactly what happened. it began with enthusiastic adoption to improve communication, then reeled in recognition of the drawbacks of metaphor mismatches, and finally has lurched to a standstill while new approaches to metaphor use are explored. the first instances of metaphor and analogy in the field of computer science and hci preceded images of windows, desktops, mice, scrollbars, and icons. the initial focus was on natural language processing to improve the communication between the user and the system.28 although the field of information science was on the periphery of metaphor research at the time, it certainly was interested in improving the dialogue between users and systems. belkin proposed a model of information seeking that highlighted the user’s anomalous state of knowledge. he argued for a better understanding of users’ conceptual models in order to improve system communications.29 although he did not propose metaphors specifically, the advent of the graphical user interface (gui) placed metaphors in a position to tackle belkin’s concerns.

hci gets gui

perhaps because of the difficulty of man–machine dialogue, guis emerged. by simplifying the “language” to “point and click,” even an average user could make the system do what it was supposed to do.30 with its more intuitive and memorable interface, the gui was the result of years of frustration trying to remember system functions and commands. because illustrations of the abstract are necessarily grounded in something concrete, guis and metaphors were inextricably intertwined; in a sense, metaphors were “inescapable.”31 metaphors enacted through the user interface would become the primary mechanism of communication between the user and the system. gui metaphors can be categorized several ways.
a typical breakdown separates noun and verb metaphors into “organization metaphors” and “operations metaphors.”32 alternatively, fineman further divides the nouns and classifies various metaphors into three basic types: functionality metaphors, interface metaphors, and interaction metaphors.33 fineman illustrates these types with an e-mail program. functionality metaphors outline the expectations that a user should have for an application and generally guide the overall behavior of the tool. in the e-mail program the functionality metaphor would be “e-mail is postal mail.” interface metaphors are the mechanical metaphors that allow the user to accomplish the tasks within the functionality metaphor. the interface metaphors should be guided by the functionality metaphors, but not constrained by them. examples would include the address book and printer metaphors. interaction metaphors, or the verbs, are the underlying metaphors that define the form of the action, how things are performed; these metaphors span beyond a particular tool, but greatly affect the functionality metaphor.34 the effect of the selected metaphors should not be underestimated. for instance, many feel that the direct manipulation metaphor (data is an object that can be manipulated) and gui are synonymous.35 and within the graphical user interface, the choice of desktop has affected all aspects of the interface with the user. one need only reflect upon the famous engelbart demonstration of the “mouse” most often viewed in alan kay’s video presentation.36 engelbart’s mouse preceded the notion of a desktop and more closely resembled a pilot’s controls than an office worker sitting at a typewriter keyboard. imagine how different our computers would be today had the pilot metaphor ever got off the ground.37

the ground we walk on

having adopted metaphors, the field of hci wanted a better understanding of why and how they worked.
carroll and thomas stressed the importance of psychology research and rallied for the use of metaphor for its grounding purposes, that is, bridging abstract concepts to concrete attributes. in a manner similar to belkin, they brought forth the notion that the designer of the system creates a conceptual model of how it works. the metaphors used within the user interface serve as bridges to the user’s mental model of the system. “people employ metaphors in learning about computing systems, the designers of those systems should anticipate and support likely metaphorical constructions to increase the ease of learning and using the system.”38 they encouraged designers to consider the limitations and consequences of metaphors; ideally, the metaphor should convey its limitations to the user. their eagerness to adopt metaphors, which they considered “crucial” for motivating and facilitating understanding, was countered only by their warning that “for most computer systems there will come a point at which the metaphor or metaphors that initially helped the user understand the system will begin to hinder further learning.”39 case recognized the importance of assessing users’ needs and expectations when designing metaphors for systems. his study of historians found that metaphors and analogies are commonly used in the information behavior of historians. he endorsed their use in interface development despite potential pitfalls. 
concerned mostly with transitioning historians from physical to electronic format, case argued that digital documents and files should more closely resemble physical files, not necessarily physically but in the manner of retrieval and storage.40 espousing a slightly more conservative opinion, marcus indicated that an “appropriate metaphor balances delicately expectation and surprise on part of the user/viewer.”41 marcus repeated that the objective of the designer is to design a conceptual model that clearly indicates to users what their expectations of the system should be, the goal being that the conceptual model created by the designers will map as much as possible to an existing mental model that the user can bring to reference.42 metaphors are not only useful for familiarizing users with the system, but also affect the system design as part of the design rationale. maclean, bellotti, young, and moran noted the usefulness of metaphors in the creative process, but expressed concern that designers should consider the effect of even implicit metaphors.43 some metaphors are inevitable because “new concepts and processes require new terminology. we can either coin new terms, borrow them from greek, latin, or other languages, create terms by adding prefixes or suffixes, or use metaphoric terms.”44 many metaphors used by designers in their communications are simply embedded in the language of computer science. what makes computer science unique among the sciences, especially when using metaphors, is that its practitioners not only talk about something in terms of metaphors but implement them too. “we live with our metaphors.”45 this discourse may carry loads of metaphors that are inexplicable to common users; “heaps” and “stacks” and “parents” and “children,” for instance, come readily to mind for anyone with computer science experience, but do not necessarily convey meaning to users.
we should stay aware of our metaphors so that we avoid seeing “platforms, engines and objects rather than ‘platforms’, ‘engines’ and ‘objects.’”46

the tension builds

these caveats that metaphors must be constantly monitored and selected with care, coupled with a growing collection of mismatched and ill-fitting metaphors, began the initial protestations over the use of metaphors in hci. the field of hci started experiencing the effect of the tension in the metaphorical equation: those attributes that fail to match. gentner and nielson summarize three “classic drawbacks” of metaphors:

- the target domain has features not in the source domain (magic attributes).
- the source domain has features not in the target domain (misleading attributes).
- some features exist in both domains but act differently (violation of expectations).47

even proponents of metaphors readily admitted the limits of metaphors, specifically that they never match perfectly and that they can “limit meaning.”48 halasz and moran cautioned that teaching new users through analogical models may be an easy way to introduce a user to a new system but that “analogical models can act as barriers preventing new users from developing an effective understanding of systems.”49 halasz and moran argued that computers are unique; we should abandon analogical models and rather seek to create a conceptual model of the system that would more accurately reflect the actual system. a system designer’s conceptual model would represent the system to improve the user’s ability to solve problems and apply reason within the system. they confess that moving away from analogical models leaves the user without the tool of “prior knowledge,” so for teaching purposes (though not long-term reasoning purposes) they offer the use of smaller, simpler metaphors, those that they liken to literary metaphors used to “make a point in passing.
once the point is made, the metaphor can be discarded.”50 noting that there was room for error and rejection on behalf of the user, marcus explained that some inappropriate metaphors simply become assimilated or evolve. for example, the original apple trashcan icon more closely resembled a “kitchen garbage can” for scraps and rotting things than an office wastebasket for paper, but over the years it has evolved to its current office basket icon.51 also, as technology changes, the metaphors will change. “the paradigm shift, or change in metaphors, will be constant and swift as paradigms evolve from prototypes, become typed, evolve to archetypes, and eventually become stereotyped or obsolete.”52 without stating it explicitly, he spoke of dead metaphors: metaphors that no longer bring new meaning to light, the “arm” of a chair or the “leg” of a table, for instance. these metaphors are accepted idiomatically with no need for explanation and exploration. aware of the ease with which users employ idiomatic icons in computing, cooper adduced that idioms and meaningless symbols are preferable to new metaphors, claiming “metaphors offer a tiny boost in learnability to first time users at tremendous cost. the biggest problem is that by representing old technology, metaphors firmly nail our conceptual feet to the ground, forever limiting the power of our software.”53 he proposed that we move away from a metaphoric paradigm to an idiomatic paradigm where a word or symbol simply stands for something else and does not carry with it the weight of analogy. many of the metaphors originally created in computing have become dead metaphors or idioms already. people do not think of their memory buffer where they store copied or cut items as an actual clipboard. the macintosh trashcan is ubiquitously cited as a perfect example of a mismatched metaphor and illustrates what may happen when a metaphor becomes idiomatic. 
for many years to the horror and confusion of many users, the trashcan both deleted files and was used to eject a diskette. a user would drag their diskette icon to the trashcan to eject it. although this may seem like just a poor choice of metaphor, it does have a sensible origin. historically, computers had no hard drive, but rather ran applications from diskettes. when you were entirely done with the application you would remove the application icon from the desktop by placing it in the trash. you would also need to eject the diskette. for expediency, apple engineers incorporated ejection and desktop removal into one quick task. it was user tested and readily adopted.54 the metaphor was a natural extension until the functionality changed. the user is not the only potential victim of metaphors; the blinders of an adopted metaphor can curtail a system designers’ vision.55 gentner and nielson take great offense at the direct manipulation metaphor because it reduces us to “pointing” and “grunting” as if we were children barely able to communicate or patrons at a restaurant where we don’t speak the language. when they state “computer interfaces must evolve to utilize more of the power of language,”56 they are not speaking of voice control and natural language processing, but to creating a shared language understandable by both the user and the system. only “power users” of a machine have breached the walls of the interface and have attempted to learn the language of the machine itself, but even they are inevitably dragged down by the restrictions of direct manipulation.57 14 information technology and libraries | december 2008 near the end of the millennium, user interface guidelines and handbooks backed off—afraid to support or spurn metaphor use in hci. 
blackwell’s chronicle of the history of the desktop metaphor notes that 1990 “marked the middle of a decade (1985 to 1995) in which researchers anticipated problems with metaphor at the start and had experienced failure by the end.”58 the silence is most stunning in the handbook of human computer interaction, a 1,582-page volume in which only two of the sixty-two chapters even mention metaphors.59 hollan, bederson, and helfman caution against metaphors in their chapter on information visualization,60 while neale and carroll cautiously return to carroll’s original thesis, stressing the importance of creating a conceptual model (the designer’s model of the system’s functions) that “should incorporate an accurate understanding of the user’s task, requirements, experience, capabilities, and limitations.”61

metaphor ever after

by the year 2000, system designers found themselves stuck between a rock and a hard drive. investigations into the efficacy of metaphors find that metaphors are a mixed bag: unavoidable, useful, yet problematic.62 while creating a taxonomy of hci metaphors, barr, biddle, and noble conclude that “the analysis present in the taxonomy should indicate that there are many benefits to user-interface metaphors if we choose them correctly and harness them properly.”63 yet blackwell’s dissertation research finds that metaphors afford “surprisingly little benefit for cognitive tasks” and that the benefit is “largely restricted to mnemonic assistance.”64 blackwell notes that the benefits were greatest when the user constructed his or her own metaphor rather than using the system-supplied metaphor. interestingly, while studying students’ understanding of search engines, hendry identified a conceptual metaphor (not provided by the system) common to many of the students’ visions of an information retrieval system.
although hendry does not suggest that metaphors should be used when creating systems, he does question how existing conceptual metaphors might be identified through sketching and then incorporated to create mappings “between problem domains and programming notations.”65 endeavoring to incorporate the benefits of metaphors while dodging the drawbacks, recent variations on the use of metaphor have been tendered. neale and carroll lobby for composite metaphors (metaphors made up of multiple metaphors) to alleviate the tension between source and target domains.66 powell found composite metaphors useful for facilitating computer game play without unduly upsetting users. she explains that gamers have readily adopted the tool or inventory bag from which the user may equip their character with a mannequin-style “dress-up” panel. the bag and mannequin metaphors have no real-world association but work effectively.67 hsu, investigating composite metaphors, confirmed neale and carroll’s assertions and found that the “closer the mapping between designers’ conceptual models and users’ mental models, the greater the effect of interface metaphors.”68 as an alternative to composite metaphors, khoury and simoff propose a new class of metaphors that they call “elastic.” they explain that metaphors in language are unavoidable, and we must deal with them in information technology. rather than focusing on concrete objects, however, metaphors should focus on social structures, such as relationships in game play. they conclude that “elastic metaphors can provide an optimal mapping from source to target domains.”69

conclusion

historically, in hci the designer of the system supplies metaphors to help the user understand the system better. unfortunately, this format falls prey to reddy’s conduit metaphor: the receiver of the information is left out of the communication process.
if hci is to learn from human-to-human interaction, then the user of the system should be able to communicate his or her needs to the system. if the system does not have the capacity to understand the request, then the user and the system should be empowered to select mutually agreeable simple metaphors for communicating. the user should be given the option to choose his or her own metaphors, and the metaphors, vocabulary, and “language” created should be able to evolve as the boundaries of the comparison are reached. a common complaint from users is, “the computer just isn’t listening to me.” and they are, of course, right. the field of information science, and particularly the subfield of human information behavior, is in a unique position to help resolve the long-standing debate over the use of metaphors in hci. from belkin’s early stated objectives to improve information systems to cole and leide’s pursuit of metaphor instantiation in human information organizing behavior, the study of information behavior attempts to better understand and ideally facilitate the user, assisting them in their acquisition and application of information. metaphors are clearly utilized by humans as they communicate with each other, seek and conceptualize information, and solve problems. to improve the interaction between human and computer, we must first gain better insight into the role that metaphors play in our own interaction with information.

metaphor’s role in the information behavior of humans interacting with computers | sease 15

references

1. donald a. schön, “generative metaphor,” in metaphor and thought, 2nd ed., ed. andrew ortony (new york: cambridge univ. pr., 1993): 138. 2. aristotle, rhetoric, book iii, chapter 10, ed. lee honeycutt, trans. william r. roberts, www.public.iastate.edu/~honeyl/rhetoric/rhet3-10.html (accessed june 25, 2008). 3. george lakoff and mark johnson, metaphors we live by (chicago: univ. of chicago pr., 1980): 5. 4.
andrew ortony, “metaphor, language, and thought,” in metaphor and thought, 2nd ed., ed. andrew ortony (new york: cambridge univ. pr., 1979/1993): 1–18. 5. robert sternberg, roger tourangeau, and georgia nigro, “metaphor, induction, and social policy: the convergence of macroscopic and microscopic views,” in metaphor and thought, 2nd ed., ed. andrew ortony (new york: cambridge univ. pr., 1979/1993): 277–303. 6. lakoff and johnson, metaphors we live by, 9. 7. george lakoff, “the contemporary theory of metaphor,” in metaphor and thought, 2nd ed., ed. andrew ortony (new york: cambridge univ. pr., 1979/1993): 194. 8. lakoff and johnson, metaphors we live by. 9. michael j. reddy, “the conduit metaphor,” in metaphor and thought, 2nd ed., ed. andrew ortony (new york: cambridge univ. pr., 1993): 174–201. 10. ibid. 11. alan paivio and mary walsh, “psychological processes in metaphor comprehension and memory,” in metaphor and thought, 2nd ed., ed. andrew ortony (new york: cambridge univ. pr., 1979/1993): 307–28. 12. sam glucksberg and boaz keysar, “how metaphors work,” in metaphor and thought, 2nd ed., ed. andrew ortony (new york: cambridge univ. pr., 1993): 408. 13. ibid., 401. 14. dedre gentner, “reasoning and learning by analogy,” american psychologist 52, no. 1 (1997): 33. 15. judith e. weiner, “a knowledge representation approach to understanding metaphors,” computational linguistics 10, no. 1 (1984): 1–14. 16. george lakoff, “position paper on metaphor,” proceedings of the 1987 workshop on theoretical issues in natural language processing (morristown, n.j.: association for computational linguistics, 1987): 94–197. 17. gentner, “reasoning,” 106. 18. lakoff, “contemporary theory,” 210. 19. jaron lanier, “jaron’s world: the meaning of metaphor,” discover (mind and brain) 28, no. 2 (2007), http://discover magazine.com/2007/feb/jarons-world-metaphors-vocabulary (accessed june 25, 2008). 20. lakoff and johnson, “metaphors.” 21. 
david ritchie, “metaphors in conversational context: toward a connectivity theory of metaphor interpretation,” metaphor and symbol 19, no. 4 (2004): 265–87. 22. amanda spink and charles cole, “a human information behavior approach to a philosophy of information,” library trends 52, no. 3 (2004): 617–28; amanda spink and james currier, “towards an evolutionary perspective for human information behavior: an exploratory study,” journal of documentation 62, no. 2 (2006): 171–93; amanda spink and james currier, “emerging evolutionary approach to human information behavior,” in new directions in human information behavior, ed. amanda spink and charles cole, ol. 8 of information science and knowledge management (netherlands: springer, 2006): 170–202. 23. spink and currier, “towards an evolutionary perspective.” 24. amanda spink and charles cole, “human information behavior: integrating diverse approaches and information use,” journal of the american society for science and technology 57, no. 1 (2005): 25. 25. charles cole and john e. leide, “a cognitive framework for human information behavior: the place of metaphor in human information organizing behavior” in new directions in human information behavior ed. amanda spink and charles cole, vol. 8 of information science and knowledge management (netherlands: springer, 2006): 174. 26. ibid., 173. 27. ibid., 198. 28. weiner, “a knowledge representation approach”; gentner, “reasoning and learning.” 29. nicholas j. belkin, “anomalous state of knowledge for information retrieval,” canadian journal of information science 5 (1980): 133–43. 30. donald gentner and jacob nielson, “the anti-mac interface,” communications of the acm 39, no. 8 (1996): 70–82. 31. richard m. chisholm, “new metaphors for understanding the new machines” proceedings of the 4th annual international conference on systems documentation (new york: acm, 1986): 91. 32. 
aaron marcus, “metaphor design in user interfaces: how to effectively manage expectation, surprise, comprehension, and delight” in conference companion on human factors in computing systems chi ’95, ed. irvin katz, robert mack, and linn marks (new york: acm, 1995): 373–74. 33. benjamin fineman, “computers as people: human interaction metaphors in human-computer interaction” (master’s thesis, carnegie-mellon university, 2004), www.mildabandon.com/paper/paper.pdf (accessed june 25, 2008). 34. ibid. 35. gentner and nielson, “the anti-mac interface.” 36. alan kay, doing with images makes symbols (university video communications, 1987), flash video file, http://video.google.com/videoplay?docid=-533537336174204822 (accessed june 25, 2008). 37. alan f. blackwell, “the reification of metaphor as a design tool,” acm transactions on computer-human interaction 13, no. 4 (2006): 490–530. 38. john m. carroll and john c. thomas, “metaphor and the cognitive representation of computing systems,” ieee transactions on systems, man and cybernetics 12, no. 2 (1982): 108. 39. ibid., 113. 40. donald case, “conceptual organization and retrieval of text by historians: the role of metaphor and memory,” journal of the american society for information science 42, no. 9 (1991): 657–68. 41. aaron marcus, “managing metaphors for advanced user interface,” proceedings of international workshop avi ’94 (new york: acm, 1994): 14. 42. ibid. 43. allan maclean, victoria bellotti, richard young, and thomas moran, “reaching through analogy: a design rationale perspective on roles of analogy” in proceedings of chi ’91 conference on human factors in computer systems (new york: acm press, 1991), 167–72. 44. chisholm, “new metaphors,” 90. 45. gerald j. johnson, “of metaphor and the difficulty of computer discourse,” communications of the acm 37, no. 12 (1994): 97–102. 46. ibid., 101. 47. gentner and nielson, “the anti-mac interface.” 48.
chisolm, “new metaphors,” 90. 49. frank halasz and thomas p. moran, “analogy considered harmful,” international journal of man-machine studies 14 (1981): 383. 50. ibid., 185. 51. marcus, “managing metaphors,” 14. 52. ibid., 16. 53. alan cooper, “the myth of metaphor” originally published in visual basic programmer’s journal (july 1995), www .cooper.com/articles/art_myth_of_metaphor.htm (accessed june 25, 2008). 54. tim rohrer, “metaphors we compute by: bringing magic into interface design,” (1995), http://zakros.ucsd.edu/~trohrer/ metaphor/gui4web.htm (accessed june 25, 2008). 55. gentner and nielson, “the anti-mac interface.” 56. ibid., 74 57. ibid. 58. blackwell, “the reification of metaphor,” 493. 59. handbook of human computer interaction, 2nd rev. ed., ed. martin helander, thomas landauer, and p. prabhu (amsterdam: elsevier science pub. b.v., 1998). 60. james hollan, benjamin bederson, and jonathan helfman, “information visualization” in handbook of human computer interaction, 2nd rev. ed., ed. martin helander, thomas landauer, and p. prabhu (amsterdam: elsevier science pub. b.v., 1998), 441–62. 61. dennis c. neale and john m. carroll, “the role of metaphors in user interface design” in handbook of human computer interaction, 2nd rev. ed., ed. martin helander, thomas landauer and p. prabhu (amsterdam: elsevier science pub. b.v., 1998): 447. 62. a. f. blackwell and t. r. g. green, “does metaphor increase visual language usability?” in proceedings 1999 ieee symposium on visual languages (1999): 246–53.; lee ratzan, “making sense of the web: a metaphorical approach,” information research 6, no. 1 (2000), http://informationr.net/ir/6-1/ paper85.html (accessed june 25, 2008); christopher r. wolfe, “plant a tree in cyberspace: metaphor and analogy as design elements in web-based learning environments,” cyberpsychology & behavior 4, no. 1 (2001): 67–76; muna k. 
yousef, “legal, social, theoretical and fundamental aspects: assessment of metaphor efficacy in user interfaces for the elderly: a tentative model for enhancing accessibility,” proceedings of the 2001 ec/ nsf workshop on universal accessibility of ubiquitous computing: providing for the elderly (new york: acm, 2001): 120–24. 63. pippin barr, robert biddle, and james noble, “a taxonomy of user interface metaphors” proceedings of sigchi-nz symposium on computer-human interaction (chinz 2002) (hamilton, new zealand: australian computer society, 2002): 6. 64. alan f. blackwell, “metaphor in diagrams” (phd diss., darwin college, univ. of cambridge, 1998), www.cl.cam .ac.uk/~afb21/publications/thesis/blackwell-thesis.pdf (accessed june 25, 2008): 1. 65. david g. hendry, “sketching with conceptual metaphors to explain computational processes” visutal languages and human-centric computing (vl-hcc ‘06), (piscataway, n.j.: ieee, 2006): 7. 66. neale and carroll, “the role of metaphors in user interface design.” 67. amy powell, “composite metaphor, games and interface” proceedings of the second australasian conference on interactive entertainment: vol. 123, acm international conference proceeding series (sydney, australia: creativity & cognition studios pr., 2005): 159–62. 68. yu-chen hsu, “the long-term effects of integral versus composite metaphors on experts’ and novices’ search behaviors,” interacting with computers 17 (2005): 391. 69. gerald khoury and simeon simoff, “elastic metaphors: expanding the philosophy of interface design” in selected papers from conference on computers and philosophy, ed. john weckert and yeslam al-saggaf, vol. 37 of acm international conference proceeding series volume 101 (darlinghurst, australia: australian computer society, 2003): 70. 
taking the long way around: improving the display of hathitrust records in the primo discovery system

jason alden bengtson and jason coleman

information technology and libraries | march 2019 27

jason bengtson (jbengtson@ksu.edu) is head of it services for kansas state university libraries. jason coleman (coleman@ksu.edu) is head of library user services for kansas state university libraries.

abstract

as with any shared format for serializing data, primo’s pnx records have limits on the types of data which they pass along from the source records and into the primo tool. as a result of these limitations, pnx records do not currently have a provision for harvesting and transferring rights information about hathitrust holdings that the kansas state university (ksu) library system indexes through primo. this created a problem, since primo was defaulting to indicate that all hathitrust materials were available to ksu libraries (k-state libraries) patrons, when only a limited portion of them actually were. this disconnect was infuriating some library users, and creating difficulties for the public services librarians. there was a library-wide discussion about removing hathitrust holdings from primo altogether, but it was decided that such a solution was an overreaction. as a consequence, the library it department began a crash program to attempt to find a solution to the problem. the result was an application called hathigenius.

introduction

many information professionals will be aware of primo, the web scale discovery tool provided by ex libris. web scale discovery services are designed to provide indexing and searching user experiences, not only for the library’s holdings (as with a traditional online public access catalog), but also for many of a library’s licensed and open access holdings. primo offers a variety of useful features for search and discovery, taking in data from manifold sources and serializing them into a common format for indexing within the tool.
however, such applications are still relatively young, and the technologies powering them have not fully matured. the combination of this lack of maturity and deliberately closed architecture between vendors leads to several problems for the user. one of the most frustrating is errors in identifying full-text access availability.

as with any shared format for serializing data, primo’s pnx (primo normalized xml) records have limits on the types of data they pass from the source records into the primo tool. as a result of these limitations, pnx records do not currently have a provision for harvesting and transferring rights information about hathitrust holdings that the k-state libraries system indexes through primo. this created a problem in the k-state libraries’ implementation, since primo was defaulting to indicate that all hathitrust materials were available to k-state libraries patrons, when only a limited portion of them actually were. this disconnect was infuriating some library users, and creating difficulties for the public services librarians. there was a library-wide discussion about removing hathitrust holdings from primo altogether, but it was decided that such a solution was an overreaction. as a consequence, the library it services department began a crash program to attempt to find a solution to the problem.

taking the long way around | bengtson and coleman 28 https://doi.org/10.6017/ital.v38i1.10574

hathitrust’s digital library as a collection in primo central

hathitrust was established in 2008 as a collaboration among several research libraries that were interested in preserving digital content.
as of the beginning of march 2018, the collaborative’s digital library contained more than sixteen million items, approximately 37 percent of which were in the public domain.1 ex libris’ primo central index (pci), which serves as primo’s built-in index of articles from various database providers, includes metadata for the vast majority of the items in hathitrust’s digital library, providing inline frames within the original primo user interface to directly display full-text content of those items that the library has access to. libraries subscribing to primo choose whether or not to make these records available to their users. k-state libraries, like many other primo central clients, elected to activate hathitrust in its instance of primo, which it has branded with the name search it.

the unmodified version of primo central identified all records from hathitrust’s digital library as available online, regardless of the actual level of access provided to users. users who discovered a record for an item from hathitrust’s digital library were presented with a conspicuous message indicating that full text was available and two links named view it and details. an example of the appearance of these search results is shown in figure 1. after clicking the “view it” tab, the center window would display the item’s homepage from hathitrust’s digital library inside an iframe. public domain items would display the title page of the item and present users with an interface containing numerous visual indicators that they were viewing an ebook (see figure 2 for an example). items with copyright restrictions would display a message indicating that the item is not available online (see figure 3 for an example).

figure 1. two books from hathitrust as they appeared in search it prior to implementation of hathigenius.

figure 2. hathitrust result for an item in the public domain.

figure 3.
hathitrust’s homepage for an item that is not in the public domain.

despite the intentions evident in the design of the primo interface, availability of hathitrust records was not being accurately reflected in the list of returns. the size of the indices underlying web scale discovery systems and the number of configurations and settings that must be maintained locally introduce a variety of failure points that can intercede when patrons attempt to access subscribed resources.2 one of the failure points identified by sunshine and carter is inaccurate knowledgebase information. the scope of inaccurate information about hathitrust items in primo central index constituted a particularly egregious example of this type of failure.

patron reaction to misinformation about access to hathitrust

between the time hathitrust’s digital library was activated in search it and the time the hathigenius application was installed, at least thirty patrons contacted k-state libraries to ask why they were unable to access a book in hathitrust when search it had indicated that full text was available for the book. many of these expressed frustration at frequently encountering this error (for an example, see figure 4).

1:08 26389957777759601093088133 i find it misleading that the search it function often finds a book i am interested in, but sometimes says it is available online; however, oftentimes it takes me to the hathi trust webpage for the book where i am told it is not available online. is this because our library has had to give up their subscription to this service?
1:08 me hi!
1:09 me that is definitely frustrating and we are trying to find a way to correct it.
1:10 me it does not have to do with our subscription, but rather the metadata we receive from hathitrust and its compatibility (or rather, incompatibility) with search it
1:11 26389957777759601093088133 okay, so i guess i better ask for the book i am seeking (the emperor’s mirror) through ill.
1:11 me that’d probably be your best bet, but let me take a look one moment
1:14 me yes, ill does look best. please note that the ill department will be closed after today until january
1:14 26389957777759601093088133 got it. thanks. i hope the hathi trust issue is resolved soon. (i have seen this problem all semester and finally got so frustrated to ask about it.)
1:15 26389957777759601093088133 have a happy holiday!
1:15 me you as well! and yes, i hope we can figure it out asap
1:15 me (it’s frustrating for us, too!)
1:20 26389957777759601093088133 has left the conversation

figure 4. chat transcript revealing frustration with inaccurate information about availability of items in hathitrust.

staff reaction to misinformation about access to hathitrust

reference staff at k-state libraries use a ticketing system to report electronic resource access problems to a team of librarians who troubleshoot the underlying issues. shortly after the hathitrust library was activated in search it, reference staff submitted several tickets about problems with access to items in that collection. members of the troubleshooting team responded quickly and informed the reporting librarians that the problem was one beyond their control. this message was slow to reach the entirety of the reference staff and was not always understood as being applicable to the full range of access problems our patrons were experiencing. samples and healy note that this type of decentralization and reactive orientation is common in electronic resource troubleshooting.3 like them, k-state libraries recognized a need to develop best practices to obviate confusion.
we also found ourselves pining for a tool such as that described by collins and murray that could automatically verify access for a large set of links.4 the extent of displeasure with the situation was so severe that some librarians stated they were loath to promote search it to students since several million records were so conspicuously inaccurate.

technical challenges

the k-state libraries it department wanted to fix the situation, in order to provide accurate expectations to their users, but doing so presented severe technical challenges, the most significant of which stemmed from the lack of rights information in the pnx record in primo. without more accurate information on availability, user satisfaction seemed destined to remain low. research into patron use of discovery layers predicted this unsurprising dissatisfaction. oclc’s (2009) research into what patrons want from discovery systems led the researchers to conclude that “a seamless, easy flow from discovery through delivery is critical to end users. this point may seem obvious, but it is important to remember that for many end users, without the delivery of something he or she wants or needs, discovery alone is a waste of time.”5 a later usability study reported: “some participants spent considerable time looking around for features they hoped or presumed existed that would support their path toward task completion.”6 additionally, the perceived need to customize discovery layers so that they reflect the needs of a particular research library is hardly new, or exclusive to k-state libraries.
the same issue was confronted by catalogers at east carolina university, as well as catalogers at unc chapel hill.7 nonetheless, the challenge posed by discovery layers comes with opportunity. james madison university discovered this when their ebsco discovery service widget netted almost twice the usage of their previous library catalog widget, as did the university of colorado when they observed users attempting to use the discovery layer search box in “google-like” ways that could potentially aid discovery layer creators (as well as library it departments) in both design and in setting expectations.8

as previously noted, primo’s results display is driven by pnx records (see figure 5 for an example). the single most fundamental challenge was finding a way to get to holdings rights information despite that data not being present in the pnx records or, consequently, in the search results that showed up in the presentation layer. there was no immediate option to create a solution that leveraged “server-side” resources, where the data itself resided and was transformed, since k-state libraries subscribes to primo as a hosted service, and ex libris provided no direct server-side access to k-state libraries. some alternative way had to be found to locate the rights data for individual records and populate it into the primo interface.

upon assessing the situation, the assistant director, it (ad) decided that one potential approach would be to independently query the hathitrust bibliographic application programming interface (api) for rights information. this approach solved a number of fundamental problems, but also posed its own questions and challenges:

1. some server-side component would still be needed for part of the query . . . where would that live and how could it be made to communicate with the javascript k-state libraries had injected into its primo instance?
2.
how to best isolate hathitrust object identifiers from primo and then use them to launch an api query? 3. how to keep those responses appropriately “pinned” to their corresponding entries on the primo page? 4. how would the hathitrust bibliographic api perform under load from search it queries? answering these questions would require significant research into the hathitrust bibliographic api documentation, and extensive experimentation. taking the long way around | bengston and coleman 32 https://doi.org/10.6017/ital.v38i1.10574 figure 5. a portion of the pnx record for http://hdl.handle.net/2027/uc1.32106011231518 (the second item shown in figure 1). building the application of these four questions, the first was easily the most challenging: where would the server-side component live and how would it work? the k-state libraries it services department had, in the past, made a number of significant modifications to the appearance and functionality of the primo application by adding javascript to the static html tiles used in the primo interface. however, generally speaking, javascript cannot successfully request data from outside of the domain of the web document it occupies. requesting data from an api across domains requires the mediation of a server-side appliance. the ad constructed one for this purpose, using the php programming language. this script would serve as an intermediary between the javascript in primo and the hathitrust api. the appliance accepted data from the primo javascript in the form of the contents of http variables (encoded in the url of the get request to the php appliance), then used those values to query the hathitrust api. however, since this server-side appliance did not reside in the same domain as k-state libraries’ primo instance, the problem of getting the returned api data from the php appliance to the javascript still remained. 
this problem was solved by treating the php appliance as a javascript file for purposes of the application. while javascript cannot load data from another domain, a web document may load actual javascript files from anywhere on the web. the hathigenius appliance takes advantage of this fact by calling the php appliance programmatically as a javascript file, with a javascript object notation (json) version of the identifiers of any hathitrust entries encoded as part of the url used to call the file. the php script runs the queries against the api and returns a javascript file consisting of a single variable containing the json data encoding the availability information for the hathitrust entries as supplied from the bibliographic api . . . essentially appearing to the browser as a standard javascript file.

the second and third problems were intrinsically interrelated, and essentially boiled down to finding a unique identifier to use in an api query from the hathitrust entries. the most effective way to handle these queries was to use the “htid” identifier, which was largely unique to hathitrust entries, could be easily extracted from any entries that contained it, and would form the basis of the php script’s request to the hathitrust restful api to obtain rights information. in the process of harvesting the htid, hathigenius also copies the id for the object in the webpage that serves as the entry in the list of primo returns containing that htid. as the data is moved back and forth for processing, the htids, and later the resultant json data, remain paired to the object id for the entry in the list of returns. when hathigenius receives the results of the api query, it can then easily rewrite those entries to reflect the rights data it obtained.

the fourth question has been fully answered with time.
to this point, well over a year after hathigenius was activated in production, library it has not observed any failure of the api to deliver the requested results in testing, and no issues to that effect have been reported by users. log data indicates that, even under heavy load, the api is performing to expectations. further modifications originally, the hathigenius application supplied definitive states of available or unavailable for each entry. however, some experimentation showed this approach to be less than optimal. since the bibliographic api cannot be queried by kansas state university as a specific user, but rather was being queried for general access rights, the possibility still existed for false negatives in the future, if kansas state university’s level of access to hathitrust changed. the data returned from the api queries, when drilled down, just consisted of the usrightsstring property from the api, which corresponded to open-source availability, and did not account for any additional items available to the library by license in the future. after the application had been active for a short time, to mitigate this potential issue, the “not available” state (consisting of an application of the “exlresultstatusnotavailable” class to the hathitrust entry) was “softened” into an application of the “exlresultstatusmaybeavailable” class and verbiage asking users to check the “view it” tab for availability. a few weeks after deployment, it received a ticket indicating hathigenius was failing to work properly. the source of the problem proved to be detailed bibliographic pages for items in a search results list, which were linking out from the search entries. these pages used a different class and object structure than the search results pages in primo, requiring that an additional module be built into hathigenius to account for them. once the new module was added to the application and put into place, the problem was resolved. 
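the arrangement described in the preceding sections (a server-side script that queries the api on behalf of the page and hands the result back as a javascript file, with each htid kept paired to the id of its entry on the page) can be illustrated with a minimal sketch. it is written in python rather than the php the authors used, and it is not their code. the endpoint template, the "items" / "htid" / "usRightsString" response keys, the "full view" value test, the variable name hathiGeniusData, and the class name exlresultstatusavailable are all assumptions made for illustration; only exlresultstatusmaybeavailable is named in the article.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# assumed url shape for the hathitrust bibliographic api (brief record by htid)
API_TEMPLATE = "https://catalog.hathitrust.org/api/volumes/brief/htid/{htid}.json"

AVAILABLE = "exlresultstatusavailable"             # assumed class name
MAYBE_AVAILABLE = "exlresultstatusmaybeavailable"  # class named in the article


def fetch_brief_record(htid):
    """query the api for one htid; a real network call, replaceable in tests."""
    url = API_TEMPLATE.format(htid=quote(htid, safe=""))
    with urlopen(url, timeout=10) as response:
        return json.load(response)


def status_class(record, htid):
    """map the rights string for this htid to a display class.

    anything other than a 'full view' rights string yields the softened
    'maybe available' state rather than a definitive negative.
    """
    for item in record.get("items", []):
        rights = item.get("usRightsString", "")
        if item.get("htid") == htid and rights.lower().startswith("full"):
            return AVAILABLE
    return MAYBE_AVAILABLE


def build_script(pairs, fetch=fetch_brief_record, var_name="hathiGeniusData"):
    """return javascript source: one variable holding json keyed by element id.

    pairs is a list of (element_id, htid) tuples harvested from the results page.
    """
    payload = {}
    for element_id, htid in pairs:
        try:
            record = fetch(htid)
        except Exception:
            record = {}  # a failed lookup falls back to the softened state
        payload[element_id] = {"htid": htid, "statusClass": status_class(record, htid)}
    return f"var {var_name} = {json.dumps(payload, sort_keys=True)};"
```

because the output is plain javascript, a page can load it with an ordinary script element from another domain and then use the element ids as keys to rewrite the matching entries. passing the fetch function as a parameter is a design choice of this sketch so that the mapping logic can be exercised without network access.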
a second issue presented itself some weeks later, when a few false negatives were reported. at first, the assistant director assumed that licensing had changed, creating a disparity between the access information from the usrightsstring property and the library’s actual holdings. however, upon investigation it was clear that hathigenius was dropping some of the calls to the hathitrust bibliographic api. the api itself was performing as expected under load, however, and the failure proved to be coming from an unexpected source. the php script used by hathigenius to interface with the api was employing the curl module, which, in turn, was using its own, less secure certificate to establish a secure sockets layer (ssl) connection to the hathitrust server. once the script was refactored to employ the simpler file_get_contents function, which relied upon the server’s main ssl certificate, the problem was fully resolved.

hathigenius also had a limited vulnerability to bad actors. while the internal script’s destination hardwiring prevented hathigenius from being used as a generic tool to anonymously query apis, the library did encounter a situation in which a (probably inadvertently) malicious bot repeatedly pinged the script, causing it to use up system resources until it interrupted other services on the host machine. modifications were added to the script to provide a simple check that requests originate from primo. additionally, restrictions were placed on the script so that excessive resource use would cause it to be intermittently deactivated. while not perfect solutions, these measures have prevented a repeat of the earlier incident.

k-state libraries has recently finished work on its version of the new primo user interface (primo new ui), which was moved into production this year.
the new interface has a completely different client-side structure, requiring a very different approach to integrating hathigenius.9

appearance of hathitrust results in primo after hathigenius

when the hathigenius api does not find a usrights property, we configured primo to display a yellow dot and the text “please check availability with the view it tab” (see figure 6 for an example). as noted earlier, we originally considered this preferable to displaying a red dot and the text “not available online,” because there might be instances in which the item is actually available in full view through hathitrust despite the absence of usrights in the record.

figure 6. two books for which hathigenius found no usrights in hathitrust.

when the hathigenius api finds usrights, we configured primo to display a green dot and the text “available online” (see figure 7 for an example).

figure 7. a book for which hathigenius found usrights.

patron response

since the beginning of 2017, the reference staff at k-state libraries have received no reports of patrons encountering situations in the original user interface in which primo indicates that full text is available but hathitrust is only providing a preview. however, a small number of patrons (at least four) expressed confusion at seeing a result in primo and discovering that the full text is not available. some of those patrons noted that they saw the text “please check availability with the view it tab,” and inferred that this was meant to state that the full text was available. others indicated that they never considered that we would include results for books that we do not own.
these responses add to the body of literature documenting user expectations that everything should be available in full text in an online library and that systems should be easy to use.10

internal response

in order to gauge the feelings of k-state libraries’ staff who regularly assist patrons with reference questions, the authors crafted a brief survey (included in appendix a). respondents were asked to indicate whether they had noticed a positive change following implementation of hathigenius, a negative change, or no change at all. they were also invited to share comments. the survey was distributed to thirty individuals. twelve (40 percent) of those thirty responded to the survey.

the survey response indicated a great deal of ambivalence by reference staff toward the change, with four individuals (33 percent) indicating they had not noticed a difference, and another four (33 percent) indicating that they had noticed a difference, but that it had not improved the quality of search results. only two (17 percent) of the respondents revealed that they had noticed an improvement in the quality of the search results. one (8 percent) respondent indicated that they felt that the hathitrust results had gotten noticeably worse since the introduction of hathigenius, although they did not elaborate on this in the survey question which invited further comment. the remaining respondent stated that they did not have an opinion.

four comments were left by respondents, including one which indicated displeasure with the new, softer verbiage for hathitrust “negatives,” and one which claimed that the problem of false positives persisted, despite such feedback not being seen by the authors through any of the statistical modalities currently used for recording reference transactions. one user praised hathigenius, while another related broad displeasure with the decision to include hathitrust records in search it.
that individual claimed that almost none of the results from hathitrust were available and stated that the hope engendered by the presence of the hathitrust results and the corresponding suggestion to check the view it tab was always dashed, to the detriment of patron satisfaction.

the new ui

as previously mentioned, in late 2018, k-state libraries adopted the primo new ui created by ex libris. this new user interface was built in angular, and changed many aspects of how hathigenius had to be integrated into primo. the k-state libraries’ it department completed a refactoring (reworking application code to change how an application works, but not what it does) of hathigenius to integrate it with the new ui and released it into production in september 2018.

as an interesting aside, the it department did not initially prioritize the reintegration of hathigenius, due to the ambivalence of the response to the application evidenced by the survey conducted for this paper. however, shortly after search it was switched over to the new ui, complaints about the hathitrust results again displaying inaccurate availability information began to come in to the it department via both email and tickets from reference staff. as the stridence of the response increased, the project was reprioritized, and the work completed.

future directions

as previously mentioned, hathigenius currently uses the very rough “usrightsstring” property value from the hathitrust bibliographic api. however, the api also delivers much more granular rights data for digital objects. a future version of the app may inspect these more granular rights codes and compare them to rights data from k-state libraries in order to more definitively provide access determinations for hathitrust results in primo should the licensing of hathitrust holdings be changed.
similarly, since htid technically only resolves to the volume level, a future version may additionally harvest the hathitrust record number, which appears to be extractable from the primo entries.

based on feedback from the survey, the “soft negative” verbiage used in hathigenius was replaced with a firmer negative. this decision proved especially sagacious given that, once the early issues with certificates and communication with the hathitrust bibliographic api were sorted out, the accuracy of the tool seemed to be fully satisfactory. another problem with the “soft negative” was the fact that it asked users to click on the view-it tab, when many users simply chose to ignore the tabs and links in the search results, instead clicking on the article title, as found in a usability study on primo conducted by the university of houston libraries.11

it is also worth noting the one survey respondent who is apparently not seeing an improvement in hathitrust accuracy. if the continued difficulties they have indicated can be documented and replicated, the it department can examine those complaints to investigate where the tool may be failing.

discussion

one interesting feature of this experience is the seeming disconnect between library reference support staff and users in terms of the perception of the efficacy of the tool. this disconnect is all the more curious given the negative reaction displayed by reference support staff when hathigenius became unavailable temporarily upon introduction of the primo new ui. part of this perceived disconnect may be a result of the fact that staff were given a survey instrument, while the reactions of users have been determined largely via null results (a lack of complaints to, or requests for assistance from, service point staff).
however, given the dramatic drop in user complaints compared to the ambivalent reaction to the tool by most of the survey respondents, it appears that the staff had a much less enthusiastic response to the intervention than patrons. a few possibilities occur to the authors, including a general dislike for the discovery layer by reference librarians, a general disinclination toward a technological solution by some respondents, or the initial perception by at least part of the reference staff that the problem was not significant. as noted by fagan et al., the pivot toward discovery layers has not been a comfortable one for many librarians.12 until further research can be conducted on this, and reactions to similar customization interventions, these possibilities remain speculation. one particular feature of note with hathigenius is the use of what one of the authors refers to as “sidewise development” to solve problems that seem to be intractable within a proprietary, or open source, web-based tool. while not a new methodology in and of itself, the author has mainly encountered this type of design in ad-hoc creations, rather than as a systematic approach to problem-solving. instead of relying upon the capabilities of primo, this type of customization made its own query to a relevant api and blended that external data with the data available from primo seamlessly within the application’s presentation layer in order to facilitate a solution to a known problem. the solution created in this fashion was portable, and unaffected by most updates to primo itself. even the transition to the new ui required changes to the “hooks” and timing used by the javascript, rather than any substantial rewrite of the core engines of the application. 
this methodology has been used repeatedly by k-state libraries it services to solve problems where other interventions would have necessitated the creation of specialized modules, or the rewriting of source code, both of which would be substantially affected by updates to the product itself, and which would have been difficult to improve or version without down time to the affected product. similar solutions have seen tools independently query an application’s database in order to inject the data back into the application’s presentation layer, bypassing the core functionality of the application.

conclusion

reactions at this point from users, and at least some library staff, have been positive. while not a perfect tool, hathigenius has improved the user experience, removing a point of frustration and an area of disconnect between the library and its users. the application itself is fully replicable by other institutions (as is the general model of sidewise development), allowing them to improve the utility of their primo instances. as with many possible customizations to discovery layers, hathigenius provides fertile ground for additional work, research, and refinement, as libraries struggle to find the most effective ways to implement discovery tools within their own environments.

beyond hathigenius itself, the sidewise development method provides a powerful tool for libraries to improve the tools they use by integrating additional functionality at the presentation layer level. tackling the problem of inaccurate full-text links in discovery layers is only one application of this approach, but it is an important one. as libraries continue to strive to improve the results and usability of their search offerings, the ability to add local customizations and improvements will be an essential feature for vendors to consider.

appendix a. feedback survey

q1 in january 2017, the library began applying a tool (called hathigenius) to the hathitrust results in primo in order to eliminate the problem of “false positives.” in other words, primo would report that all of the hathitrust results it returned were available online as full text, when many were not. we would like your feedback about the impact of this change from your perspective.

q2 which of the following statements best describes your opinion about the impact of hathigenius?
o i haven’t noticed a difference.
o i feel that search it’s presentation of hathitrust results in search it has become noticeably better since hathigenius was implemented.
o i feel that search it’s presentation of hathitrust results in search it has become noticeably worse since hathigenius was implemented.
o i have noticed a difference, but i feel that search it’s presentation of hathitrust results is about the same quality as it was before hathigenius was implemented.
o no opinion.

q3 please share any comments you have about hathigenius or any ideas you have for improving the display of hathitrust’s records in search it.

references

1 hathitrust digital library, “welcome to hathitrust!” accessed march 4, 2018, https://www.hathitrust.org/about.
2 sunshine carter and stacie traill, “essential skills and knowledge for troubleshooting e-resources access issues in a web-scale discovery environment,” journal of electronic resources librarianship 29, no. 1 (2017): 7, https://doi.org/10.1080/1941126x.2017.1270096.
3 jacquie samples and ciara healy, “making it look easy: maintaining the magic of access,” serials review 40, no. 2 (2014): 114, https://doi.org/10.1080/00987913.2014.929483.
4 maria collins and william t. murray, “seesau: university of georgia’s electronic journal verification system,” serials review 35, no. 2 (2009): 80, https://doi.org/10.1080/00987913.2009.10765216.
5 karen calhoun, diane cellentani, and oclc, eds., online catalogs: what users and librarians want: an oclc report (dublin, ohio: oclc, 2009): 20, https://www.oclc.org/content/dam/oclc/reports/onlinecatalogs/fullreport.pdf.
6 rice majors, “comparative user experiences of next-generation catalogue interfaces,” library trends 61, no. 1 (summer 2012): 191, https://scholarcommons.scu.edu/cgi/viewcontent.cgi?article=1132&context=library.
7 marlena barber, christopher holden, and janet l. mayo, “customizing an open source discovery layer at east carolina university libraries: the cataloger’s role in developing a replacement for a traditional online catalog,” library resources & technical services 60, no. 3 (july 2016): 184, https://journals.ala.org/index.php/lrts/article/view/6039; benjamin pennell and jill sexton, “implementing a real-time suggestion service in a library discovery layer,” code4lib journal, no. 10 (june 2010): 5, https://journal.code4lib.org/articles/3022.
8 jody condit fagan et al., “usability test results for a discovery tool in an academic library,” information technology and libraries 31, no. 1 (march 2012): 99, https://doi.org/10.6017/ital.v31i1.1855.
9 dan moore and nathan mealey, “consortial-based customizations for new primo ui,” the code4lib journal, no. 34 (october 25, 2016), http://journal.code4lib.org/articles/11948.
10 lesley m. moyo, “electronic libraries and the emergence of new service paradigms,” the electronic library 22, no. 3 (2004): 221, https://www.emeraldinsight.com/doi/full/10.1108/02640470410541615.
11 kelsey brett, ashley lierman, and cherie turner, “lessons learned: a primo usability study,” information technology and libraries 35, no. 1 (march 2016): 20, https://ejournals.bc.edu/ojs/index.php/ital/article/view/8965.
12 fagan et al., “usability test results for a discovery tool in an academic library,” 84.
shawnee mission's on-line cataloging system

ellen washy miller: library systems analyst, and b. j. hodges: senior systems analyst, shawnee mission public schools, shawnee mission, kansas

an on-line cataloging pilot project for two elementary schools is discussed. the system components are 2740 terminals, upper-lower-case input, ibm's faster generalized software package, and usual cards/labels output. reasons for choosing faster, software and hardware features, operating procedures, system performance and costs are detailed. future expansion to cataloging 100,000 annual k-12 acquisitions, on-line circulation, retrospective conversion, and union book catalogs is set forth.

introduction

the shawnee mission public schools, serving several affluent suburbs of the kansas city metropolitan area, began automated library operations in 1968. as the school district's computer center was then equipped with a 1401 computer and tape/disk store, the first library system was designed for batch ordering and cataloging. later, a batch circulation system for three of the school district's fourteen secondary libraries was started. library automation in that period was similar to that described by scott (1) and auld (2).

two years saw a profound change in the shawnee mission school district. by unification, it had added 50 elementary schools and a new high school, making a total of 65 schools, all of which had libraries. at the school district's computer center, the configuration had passed through the 360/30 stage to a 360/40; 2314 disk packs were on order; and 2741 terminals, using ibm's remote access computational system (rax), had been installed at all five high schools for computer science courses.

14 journal of library automation vol. 4/1 march, 1971

while the batch library system could handle the 28,000 items ordered and cataloged annually up to that point, it was impossible to justify using it for an estimated 100,000 annual acquisitions needed by 65 libraries. computer time to process autocoder programs on a mod/40 would be excessive; the librarians desired many improvements (upper- and lowercase i/o; longer fields; shortened time to process items; and more accurate data on cards and labels).

the need for data processing and library improvements resulted in rethinking of the approach to ordering, cataloging, and circulation. naturally, on-line processing came to mind. ibm 2741's for computer science courses had given management and data processing staff some experience with a dedicated on-line system; the 360/40 and 2314 disks would support large files, indexed sequential file organization, and multiprogramming (simultaneous use of the cpu for terminal and batch jobs). the experiences of stanford and university of chicago (3) and ibm (4) pointed out that on-line systems could be built for larger and more complex organizations than for shawnee mission, where the collections are 95% english language and the system covers only books and audio-visual items. cataloging is based on title-page information; tools used are children's catalog, sears, n.u.c., a. a. rules, and other standard works. also very important was the fact that the computer center management wanted experience in multiprogramming prior to considering it for student scheduling, student records, payroll and business functions.

a proposal was made to library and data processing management by the library systems analyst in mid-december 1969 that on-line cataloging in multiprogramming mode be begun by mid-march 1970 for two elementary schools on a test basis. an improved batch order system using cobol programs was also proposed. finally, it was suggested that a carefully designed cataloging system could include fields to be used later for circulation control.
the specific purposes of the on-line cataloging pilot project were 1) to test whether direct access to master disk files is an efficient, accurate, and economical method of creating and updating bibliographic and holdings data; and 2) to allow data processing management to ascertain if multiprogramming is feasible and practical at this time locally.

a search of library literature revealed no on-line systems for cataloging and circulation functions; rather, either circulation or order functions were real time. moreover, truly on-line systems were rare; columbia had designed a circulation system that could be used in that mode, but as of october 1968, was operating batch (5). chicago's book processing system does input data on line, although ordering and cataloging functions are performed off-line (6). bellrel is an on-line circulation system (7).

comparing the circumstances of the above institutions with that of shawnee mission school district brought out one sterling difference: the latter had no grant money nor huge parent institution upon which to rely. rather, it had a modest hardware-software configuration, a need to be operational within three months if the two test librarians were to see any output by the end of the school year, and a small team of data processors and librarians devoted to redesign and implementation.

on-line cataloging system/ miller and hodges 15

methods and materials

having earlier seen demonstrations of the kansas city, missouri, police department's faster system, with its on-line access to constantly updated alphanumeric files, the senior systems analysts turned to ibm for further information. the police department's system was based on a software package developed in alameda, california, for law enforcement. it was also available in a general form called faster (filing and source data entry techniques for easier retrieval).
the proven ability of this system to accept on-line data via 2740 terminals and to display it on 2260 crt's, its ease of adaptation to user requirements, the quickness with which analysts and programmers had learned to use it at the police department, and a local, positive experience decided the issue. in mid-january 1970, faster was chosen as the software framework for on-line cataloging.

software

faster has been programmed in modular form, with each module performing a particular task (8). modules supporting functions that vary because of hardware must be coded by the user. this coding is done in macro form (brief program statements in higher level language which generate many machine instructions) and therefore is not a tedious task. one of the hardest, most complicated portions of implementing a teleprocessing system is programming the support from the cpu to the terminal; with faster, this took about a day. the macros use basic teleprocessing access method (btam) support. with line support taking little time, the user may spend more effort on his own processing needs.

the user may have only a query or an update requirement; shawnee mission needed both. because faster is a modular system, the user is permitted to describe each of his needs as a transaction. this transaction must be programmed as a tpd (transaction processing description) using macros. coding and listing time for a tpd will vary with the processing description. those familiar with detailed programming will note that the programmer does not have to concern himself with i/o. the tpd will prepare the data for output, and the faster interface module will handle i/o.
some of the major functions supported by the macro language include: 1) retrieval of records from indexed sequential (isam) files, that is, files accessed only through hierarchic indexes; 2) modifications and additions to isam files; 3) data manipulation; 4) formatting of responses to selected terminals; 5) message switching; and 6) recording audit data on a logging file. faster under dos requires fixed-length records; this has been modified under the os version.

retrieval from the isam files required for processing a given transaction may be performed in one of three ways: 1) retrieval of a unique record, 2) sequential retrieval of a specified number of records from a logical grouping, or 3) retrieval of a specified number of records from a logical grouping, in which the retrieval records represent the best qualified from the group based upon the user's selection criteria.

hardware

hardware supported by the faster system is as follows:

machine configuration: ibm 360 mod/30, 40, or 50
storage requirements: minimum-dos 65k; minimum-os 128k
disks supported: 2311 or 2314
logging file: disk or tape
line control: btam with 2701, 2702, or 2703
terminals: 2740, 1050, 2260 crt

systems at shawnee mission computer center:

ibm 360/40 dos 256 k
eight 2314 disks
three 2401 tape drives
one 2702 line control
2740 and 2741 terminals

the system operates in three partitions. partition f1 houses apl (a programming language) for student use with 2741 terminals. partition f2 houses faster. partition bg is used for batch jobs (both cobol and autocoder under cs monitor).

file organization and access

faster supports isam files only (as data sets) with the exception of the logging file; the logging device must be sequential. faster's support of disk files is accomplished by using the same software modules that al and cobol use in maintaining isam files.
therefore, standard programming languages may be used for creating files and data retrieval. shawnee mission chose cobol as its main language and found it to complement the faster system.

files

the batch library system was based on a 400-character record, repeated in its entirety for every copy in each library. this space consumption for redundant information was undesirable in a system with 65 collections, and therefore two basic files were designed. the first is the disk title file, containing one record with bibliographic information for each unique title in the school district. its fields include author, title, dates, subject headings, annotation, etc. (table 1). each record is 562 characters long.

table 1. main fields input by operators

record  field               length  comments
title   form code                2  distinguishes physical format
title   publication date         4
title   copyright date           4
title   author                  35
title   title                   50  may be continued in annotation
title   annotation             105
title   publisher               30
title   edition                  3
title   price                    5
title   dewey number             8
title   cutter number           10
title   grade level              4
title   collation               40
title   series                  35
title   language code            3  use marc language codes
title   lc card number          14
title   subject heading 1       24  use sears headings
title   subject heading 2       46  use sears headings
title   subject heading 3       60  use sears headings
title   added entry             30  for name or title
title   title number             7
title   number of copies         4
copy    building code            5
copy    funding code             2  if other than general funds
copy    volume number            3  for volume or other sequence number
copy    print instructions      16  kept only until labels and cards printed

in distinction, the disk copy file contains a 56-character record for each copy of a title, comprised of fixed-length fields for building number, special funding, volume information, and circulation control. copy and title records are linked through the title number.

the third file is the isam title index, comprised of records with a phonetic code and key for each title record.
this file is called up by a terminal transaction containing a title; the incoming phonetic code for the title is matched with any equal ones on the index. for matches, bibliographic data is pulled from the title file and typed on the terminal. the main function of the title index is to determine duplicates. tests on 45,711 title records showed that a 16-character phonetic code resulted in a maximum of 36 different titles having the same phonetic code. the 16-character code chosen consists of the first character of the title followed by numeric values for most consonants.
the on-line cataloging system
input
recognizing that the pilot project might be expanded into a full-scale operation, librarians drew up procedures for entering data from either shelf list cards or new arrivals. conversion from shelf lists required that cards be edited to eliminate confusing information and to add implicit data. for new acquisitions, most information needed by the terminal operator is annotated on the title page and its verso. a grid sheet to be slipped into the book contains subject headings, added entry, annotation and some other fields. all of these practices were set forth in a user's manual (9), along with instructions on how to enter transactions into the disk files. limits on the input buffer permit a maximum of 120 characters to be transmitted by any transaction, which means that several transactions are required to add all cataloging and location data necessary to describe a new title. there are two sets of transactions. the lt series adds to and updates records in the title file; the lc series does the same for the copy file. for instance, entering a new title requires an lt01 transaction to start the record and assign a title number, one or more lt02's to complete cataloging data, and an lc01 for building assignment. operators find transactions easy to key and understand.
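the duplicate check through the phonetic title index described above can be sketched in a few lines of python. this is only an illustration under stated assumptions: the article gives the code length (16) and its general shape (first character of the title, then numeric values for most consonants), but not the actual consonant values, so the soundex-style table `CONSONANT_DIGITS`, the zero padding, and the function names `phonetic_code` and `build_title_index` are all hypothetical.

```python
from collections import defaultdict

CODE_LENGTH = 16  # code length reported in the article

# hypothetical consonant-to-digit table (soundex-style groups); the values
# actually used at shawnee mission are not given in the article.
CONSONANT_DIGITS = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        CONSONANT_DIGITS[ch] = digit


def phonetic_code(title):
    """first letter of the title, then digits for the coded consonants."""
    letters = [ch for ch in title.lower() if ch.isalpha()]
    if not letters:
        return ""
    digits = [CONSONANT_DIGITS[ch] for ch in letters[1:] if ch in CONSONANT_DIGITS]
    code = letters[0] + "".join(digits)
    # truncate or zero-pad to a fixed length (padding is an assumption)
    return code[:CODE_LENGTH].ljust(CODE_LENGTH, "0")


def build_title_index(titles):
    """map each phonetic code to the title numbers that share it."""
    index = defaultdict(list)
    for title_number, title in titles.items():
        index[phonetic_code(title)].append(title_number)
    return index


titles = {1: "inside africa", 2: "insyde afrika", 3: "death be not proud"}
index = build_title_index(titles)
possible_duplicates = index[phonetic_code("inside africa")]
```

titles that share a code are only candidates for duplication; as in the system described, the operator still has to compare the bibliographic data typed back on the terminal.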
by category, lc and lt transactions set up new records, add on fields, update fields, delete or activate records, and query the contents of a specified record. these transactions are a simple, understandable, and powerful method of maintaining library files. several transactions also add data to fields, thus saving the operator keying time. for instance, the cutter number is automatically derived from the first three letters of the author's last name, unless specifically superseded by the operator. also, "f" is assigned to dewey for all items unless replaced by another classification. finally, a standard set of output, consisting of 1) two author cards, overprinted cards, a copy card, and 2) one three-up label, is assumed when an lc01 transaction is input to show location. if other outputs are needed, the operator uses an lc05 transaction to specify them. there are also several instances of data being input in lower case (to save time and buffer space for a shift) and being edited on output to upper case. the result of all these program aids is that the operator knows she is keying only important data; highly invariable fields are input and edited by the faster programs, saving operator time.
output
two basic card formats were chosen. the unit card contains all cataloging information; the copy card shows a library's holdings of a given title (figures 1 and 2). a unit card and copy card (giving all cataloging and holdings information) go into the school's shelf list; the usual set of main and added entry unit cards goes into the public catalog.
[figure 1: sample card for gunther, john; remaining card text not recoverable]
[...] <0, 1>, <1, 0>, <0, 2>, <1, 1>, <2, 0>, <0, 3>, <1, 2>, etc. ordered triples are ordered pairs of ordered pairs and the integers: t(x, y, z) = i(x, i(y, z)). and so on. because we have a denumerable set of books we can accomplish a linear mapping by both subject and category.
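the enumeration of ordered pairs and the triple mapping t(x, y, z) = i(x, i(y, z)) can be made concrete with a short python sketch. the definition of i does not survive in this text, so the formula below is an assumption: a diagonal (cantor-style) enumeration chosen because it reproduces the listed order <0, 1>, <1, 0>, <0, 2>, <1, 1>, <2, 0>, <0, 3>, <1, 2>, with <0, 0> assumed to come first. the names `pair_index` and `triple_index` are illustrative.

```python
def pair_index(x, y):
    """position of the pair <x, y> in a diagonal enumeration.

    pairs are ordered by the sum x + y, and within one diagonal by
    increasing x, which matches the order listed in the text.
    """
    if x < 0 or y < 0:
        raise ValueError("coordinates must be non-negative integers")
    s = x + y
    return s * (s + 1) // 2 + x


def triple_index(x, y, z):
    """t(x, y, z) = i(x, i(y, z)): a single integer for an ordered triple."""
    return pair_index(x, pair_index(y, z))


listed = [(0, 0), (0, 1), (1, 0), (0, 2), (1, 1), (2, 0), (0, 3), (1, 2)]
positions = [pair_index(x, y) for x, y in listed]

# every triple in a small cube receives its own integer
cube = [(x, y, z) for x in range(6) for y in range(6) for z in range(6)]
distinct = len({triple_index(*t) for t in cube})
```

because each pair gets exactly one integer and each integer exactly one pair, nesting the function gives every triple its own position on a single line, which is the linear mapping the passage refers to.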
in fact, the problem is trivial because there are only a finite number of books. physically, however, neither subject nor category will remain together. to suit the library the mapping must be physically simple, but can be abstractly complex. for all his protestations, the classificationist cannot eschew the physical library. if he could, or wished to, the way is open. as i understand classification, it is vacuous without reference to its ability as a finding tool. it must concern itself with the polydimensional aspect of content but cannot disregard the codex. in answer to the question form "where is the book about ... ?" an appropriate and total response type is "at location (x, y, z)." here x, y, and z are the spatial coordinates relative to a particular library both as to origin and values. the dewey or lc numbers of the book are incomplete answers in that they presume a knowledge of the classification structure as well as knowledge of the architecture of the building. i have suggested that a classification scheme must not disregard the codex, but must insofar as possible not be subservient to physical form. the following scheme takes advantage of the codex form, is as easily automated or computerized as current one-dimensional schemes, advances beyond one dimension, and is very relevant to finding: a library is considered as a three-dimensional entity. conventions are adopted for run-on from room to room and floor to floor as for the linear scheme. each book is classified in all three dimensions, the dimensions being independent. the interpretation of each dimension is left to the discretion of the individual library. thus each book has a relative position in each dimension. (this is not an alexandrian scheme relying on absolute location.) the following example illustrates the relevant concepts: choose a subject classification (as commonly understood) for the x dimension. for example, let dewey numbers be arranged from left to right on the x axis.
choose a category scheme for the y dimension. one could assign degrees of difficulty from one to seven, for example.
cataloging geometry/hazelton 15
[figure: diagram with axes labeled x, subject, and difficulty]
choose a category scheme for the z dimension. one could assign numbers between one and seven running from most general to most specific. this has the following effect: standing in front of the near shelf (i.e., z=1) one can choose a subject by moving laterally. the general books will appear first with difficult items at the top, easy ones at the bottom. if the items are too general, merely move one stack forward and try again. this approach presents an unusually usable instructional layout for circular libraries. a reading lounge can be put dead center with the most subject-specific books ranged about the circumference. level of difficulty is easily adjusted by looking up or down. given this apparatus you may wish to change the subject classification scheme. why not put solid state physics behind general physics instead of to the right or left? the card catalog can now be used with greater meaning. there is no reason why it cannot be a map of the shelves. the axes can be translated for ease of searching (e.g., interchange x and y for the card catalog). of particular interest is the relation between this scheme and those of a. d. booth (4), where access time is minimized by arranging books in the inverse order of their frequency of use. further refinements consider nonstandard shelf layouts (radial, circular, spiral). one misgiving about shelving by inverse frequency expressed by librarians is that one no longer knows where to look for a particular book in the sense that one knows when using standard schemes. this objection is easily overcome by combining the three-dimensional and frequency schemes. one dimension can be used for frequency, leaving two dimensions in which to group books by subject, difficulty, generality, color, length, or whatever you please.
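the example layout (subject on x, difficulty on y, generality on z) can be sketched as a small python data model. this is a minimal illustration, not part of the original proposal: the `Book` class, the sample titles and numbers, and the helper names `shelf_order` and `one_stack_forward` are hypothetical, and dewey numbers are treated as plain floats for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Book:
    title: str
    subject: float    # x: dewey number, left to right
    difficulty: int   # y: 1 (easy, bottom) to 7 (difficult, top)
    generality: int   # z: 1 (most general, near shelf) to 7 (most specific)


def shelf_order(books):
    """walk the stacks front to back, left to right, top to bottom."""
    return sorted(books, key=lambda b: (b.generality, b.subject, -b.difficulty))


def one_stack_forward(books, book):
    """same subject, one step more specific (the stack behind this one)."""
    return [b for b in books
            if b.subject == book.subject and b.generality == book.generality + 1]


# made-up sample data for illustration only
books = [
    Book("general physics, advanced", 530.0, 6, 1),
    Book("general physics, primer", 530.0, 2, 1),
    Book("solid state physics", 530.0, 5, 2),
    Book("general chemistry", 540.0, 3, 1),
]
ordered_titles = [b.title for b in shelf_order(books)]
behind = [b.title for b in one_stack_forward(books, books[0])]
```

replacing one of the three fields with a frequency-of-use rank would give the combined scheme mentioned above, with the two remaining fields still available for grouping.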
access time is reduced while physical grouping is retained.
16 journal of library automation vol. 5/1 march, 1972
one difficulty that will be encountered is the classification of books that are not subject-oriented (poetry and fiction, for example). these areas are not adequately dealt with in linear schemes and they could easily be left as they are. that is, two dimensions could be constants. on the other hand, it seems plausible that, given three dimensions in which to work, someone could discover congenial physical groupings that would be reasonable yet impossible in one dimension. rather than being a problem, three-dimensional classification offers opportunities to cope with literatures that are not subject specific. each dimension of this scheme can be criticized on the same grounds as the current linear classification. but, taken as a whole, it provides a more powerful, much needed tool for the classificationist while allowing new approaches by automaters. its simplicity is assured because it is closer to our intuitive notions of information storage. three dimensions are necessary!
references
1. w. h. auden, "the cave of nakedness," about the house (new york: random house, 1965), p.32.
2. jesse h. shera, "classification-current functions and applications to the subject analysis of library material," in libraries and the organization of knowledge (connecticut: archon books, 1965), p.97-98.
3. jesse h. shera, "classification as the basis of bibliographic organization," in libraries and the organization of knowledge (connecticut: archon books, 1965), p.84, 85.
4. a. d. booth, "on the geometry of libraries," journal of documentation 25:28-42 (march 1969).
practical limits to the scope of digital preservation
mike kastellec
practical limits to the scope of digital preservation | kastellec 63
abstract
this paper examines factors that limit the ability of institutions to digitally preserve the cultural heritage of the modern era.
the author takes a wide-ranging approach to shed light on limitations to the scope of digital preservation. the author finds that technological limitations to digital preservation have been addressed but still exist, and that non-technical aspects—access, selection, law, and finances—move into the foreground as technological limitations recede. the author proposes a nested model of constraints to the scope of digital preservation and concludes that costs are digital preservation’s most pervasive limitation. introduction imagine for a moment what perfect digital preservation would entail: a perfect archive would capture all the content generated by humanity instantly and continuously. it would catalog that information and make it available to users, yet it would not stifle creativity by undermining creators’ right to control their creations. most of all, it would perfectly safeguard all the information it ingested eternally, at a cost society is willing and able to sustain. now return to reality: digital preservation is decidedly imperfect. today’s archives fall far short of the possibilities outlined above. much previous scholarship debates the quality of different digital preservation strategies; this paper looks past these arguments to shed light on limitations to the scope of digital preservation. what are the factors that limit the ability of libraries, archives, and museums (henceforth collectively referred to as archival institutions) to digitally preserve the cultural heritage of the modern era? 1 i first examine the degree to which technological limitations to digital preservation have been addressed. next, i identify the non-technical factors that limit the archival of digital objects. finally, i propose a conceptual model of limitations to digital preservation. technology any discussion of digital preservation naturally begins with consideration of the limits of digital preservation technology. 
while all aspects of digital preservation are by definition related to technology, there are two purely technical issues at the core of digital preservation: data loss and technological obsolescence. 2 many things can cause data loss. the constant risk is physical deterioration. a digital file consists at its most basic level of binary code written to some form of physical media.
mike kastellec (makastel@ncsu.edu) is libraries fellow, north carolina state university libraries, raleigh, nc.
information technology and libraries | june 2012 64
just like analog media (paper, vinyl recordings), digital media (optical discs, hard drives) are subject to degradation at a rate determined by the inherent properties of the medium and the environment in which it is stored. 3 when the physical medium of a digital file decays to the point where one or more bits lose their definition, the file becomes partially or wholly unreadable. other causes of data loss include software bugs, human action (e.g., accidental deletion or purposeful alteration), and environmental dangers (e.g., fire, flood, war). assuming a digital archive can overcome the problem of physical deterioration, it then faces the issue of technological obsolescence. binary code is simply a string of zeroes and ones (sometimes called a bitstream); like any encoded information, this code is only useful if it can be decoded into an intelligible format. this process depends on hardware, used to access a bitstream from a piece of physical media, and software, which decodes the bitstream into an intelligible object, such as a document or video displayed on a screen, a printout, or an audio output. technological obsolescence occurs when either the hardware or software needed to render a bitstream usable is no longer available. given the rapid pace of change in computer hardware and software, technological obsolescence is a constant concern.
4 most digital preservation strategies involve staying ahead of deterioration and obsolescence by copying data from older to current generations of file formats and storage media (migration) or by keeping many copies that are tested against one another to find and correct errors (data redundancy). 5 other strategies to overcome obsolescence include pre-emptively converting data to standardized formats (normalization) or avoiding conversion and instead using virtualized hardware and software to simulate the original digital environment needed to access obsolete formats (emulation). as may be expected of a young field, 6 there is a great deal of debate over the merits of each of these strategies. to date, the arguments mostly concern the quality of preservation, which is beyond the scope of this work. what should not be contentious is that each strategy also imposes limitations on the potential scale of digital preservation. migration and normalization are intensive processes, in the sense that they normally require some level of human interaction. any human-mediated process limits the scale of an archival institution’s preservation activities, as trained staffs are a limited and expensive resource. emulation postpones the processing of data until it is later accessed, potentially allowing greater ingest of information. as a strategy, however, it remains at least partly theoretical and untested, increasing the possibility that future access will be limited. data redundancy deserves closer examination, as it has emerged as the gold standard in recent years. the limitations data redundancy imposes on digital preservation are two-fold. the first is that simple maintenance of multiple copies necessarily increases expenses, therefore—given equal levels of funding—less information can be preserved redundantly than can be preserved without such measures. 
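the idea of keeping several copies that are "tested against one another to find and correct errors" can be illustrated with a short python sketch. it assumes a simple majority vote over sha-256 digests; this is one possible way to audit copies, chosen here for illustration, and not the protocol of any particular preservation system. the function names are hypothetical.

```python
import hashlib
from collections import Counter


def digest(data):
    """sha-256 fingerprint of one stored copy."""
    return hashlib.sha256(data).hexdigest()


def audit_and_repair(copies):
    """compare copies, keep the majority version, report damaged positions.

    raises ValueError when no version is held by more than half of the
    copies, because a vote then cannot tell which version is intact.
    """
    counts = Counter(digest(c) for c in copies)
    winner, votes = counts.most_common(1)[0]
    if votes <= len(copies) // 2:
        raise ValueError("no majority among copies; cannot repair")
    good = next(c for c in copies if digest(c) == winner)
    damaged = [i for i, c in enumerate(copies) if digest(c) != winner]
    return [good] * len(copies), damaged


stored = [b"annual report 2011", b"annual rep0rt 2011", b"annual report 2011"]
repaired, damaged = audit_and_repair(stored)
```

each audit has to read every copy in full, which is where the bandwidth, disk access, and processing limits discussed in this section come from.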
(cost considerations are inextricably linked to every other limitation on digital preservation and are examined in greater detail in “finances,” below.) there are practical, technical limitations on the bandwidth, disk access, and processing speeds needed to perform parity checks (tests of each bit’s validity) of large datasets to guard against data loss. pushing against these limitations incurs dramatic costs, limiting the scale of digital preservation. current technology and funding are many orders of magnitude short of what is required to archive the amount of information desired by society over the long term. 7 the second way technology limits digital preservation is more complex: it concerns error rates of archived data. non-redundant storage strategies are also subject to errors, of course. only redundant systems have been proposed as a theoretical solution to the technological problem of digital preservation, 8 though, so it is necessary to examine their error rate in particular. on a theoretical level, given sufficient copies, redundant backup is all but infallible. in practice, technological limitations emerge. 9 the number of copies required to ensure perfect bit preservation is a function of the reliability of the hardware storing each copy. multiple studies have found that hardware failure rates greatly exceed manufacturers’ claims. 10 rosenthal argues that, given the extreme time spans under consideration, storage reliability is not just unknown but untestable. 11 he therefore concludes that it cannot be known with certainty how many copies are needed to sustain acceptably low error rates. even today’s best digital preservation technologies are subject to some degree of loss and error. analog materials are also inevitably subject to deterioration, of course, but the promise of digital media leads many to unrealistic expectations of perfection.
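the dependence of the required number of copies on hardware reliability can be shown with a toy calculation in python. it rests on a strong simplifying assumption that the paper itself questions: each copy is lost independently with the same known probability over the period of interest. under that assumption the chance of losing all n copies is p to the power n. the probabilities used below are made-up inputs, not measured failure rates, and the function name is hypothetical.

```python
def copies_needed(p_loss_per_copy, target_loss):
    """smallest n such that p_loss_per_copy ** n <= target_loss.

    assumes independent, identically reliable copies, which real storage
    systems do not guarantee; treat the result as an optimistic floor.
    """
    if not 0.0 < p_loss_per_copy < 1.0:
        raise ValueError("per-copy loss probability must be between 0 and 1")
    if not 0.0 < target_loss < 1.0:
        raise ValueError("target loss probability must be between 0 and 1")
    n = 1
    p_all_lost = p_loss_per_copy
    while p_all_lost > target_loss:
        n += 1
        p_all_lost *= p_loss_per_copy
    return n


# hypothetical inputs: the same target with two different per-copy estimates
optimistic = copies_needed(0.05, 1e-6)
pessimistic = copies_needed(0.30, 1e-6)
```

the answer moves sharply with the per-copy estimate, so if that estimate cannot be tested over the time spans involved, the number of copies needed cannot be fixed with confidence either.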
nevertheless, modern digital preservation technology addresses the fundamental needs of archival institutions to a workable degree. technological limitations to digital preservation still exist, but the aspects of digital preservation beyond purely technical considerations (access, selection, law, and finances) should gain greater relative importance than they have in the past.
access
with regard to digital preservation, there are two different dimensions of access that are important. at one end of a digital preservation operation, authorized users must be able to access an archival institution’s holdings and unauthorized users must be restricted from doing so. this is largely a question of technology and rights management: users must be able to access preserved information and be permitted to do so. this dimension of access is addressed in the technology and law sections of this paper. the other dimension of access occurs at the other end of a digital preservation operation: an archival institution must be able to access a digital object to preserve it. this simple fact leads to serious restrictions on the scope of digital preservation because much of the world’s digital information is inaccessible for the purposes of archiving by libraries and archives. there are a number of reasons why a given digital object may be inaccessible. large-scale harvesting of webpages requires automated programs that “crawl” the web, discovering and capturing pages as they go. web crawlers cannot access password-protected sites (e.g., facebook) and database-backed sites (all manner of sites, including many blogs, news sites, e-commerce sites, and countless collections of data). this inaccessible portion of the web is estimated to dwarf the readily accessible portion by orders of magnitude.
there is also an enormous amount of inaccessible digital information that is not part of the web at all, such as emails, company intranets, and digital objects created and stored by individuals. 12 additionally, there is a temporal limit to access. some digital objects only are accessible (or even exist) for a short window of time, and all require some measure of active preservation to avoid permanent loss. 13 the lifespans of many webpages are vanishingly short. other pages, like some news items, are publicly accessible for a short window before they are hidden behind paywalls. even long-lasting digital objects are often dynamic: the ads accompanying a webpage may change with each visit; news articles and other documents are revised; blog posts and comments are deleted. if an archival institution cannot access a digital object quickly or frequently enough, the object cannot be archived, at least not completely. large-scale digital preservation, which in practice necessarily relies on periodic automated harvesting of content, is therefore limited to capturing snapshots of the changes digital objects undergo over their lifespans. law existing copyright law does not translate well to the digital realm. leaving aside the complexities of international copyright law, in the united states it is not clear, for example, whether an archival institution like the library of congress is bound by licensing restrictions and if it can require deposit of digital objects, nor whether content on the web or in databases should be treated as published or unpublished. 14 “many of the uncertainties come from applying laws to technologies and methods of distribution they were not designed to address.” 15 a lack of revised laws or even relevant court decisions significantly impacts the potential scale of digital preservation, as few archival institutions will venture to preserve digital objects without legal protection for doing so. 
given this unclear legal environment, efforts at large-scale digital preservation are hampered by the need to secure permission to archive from the rights holder of each piece of content. 16 this obviously has enormous impact on preserving the web, but even scholarly databases and periodical archives may not hold full rights to all of their published content. additionally, a single digital object can include content owned by any number of authors, each of whose permission is needed for legal archival. without stronger legal protection for archival institutions, the scope of digital preservation is severely limited by copyright restrictions. digital preservation is further limited by licensing agreements, which can be even more restrictive than general copyright law. frequently, purchase of a digital object does not transfer ownership to the end-user, but rather grants limited licensed access to the object. in this case, libraries do not enjoy the customary right of first sale that, among other things, allows for actions related to preservation that would otherwise breach copyright. 17 preservation of licensed works requires that libraries either cede archival responsibility to rights holders, negotiate the right to archive licensed copies, or create dark archives that preserve objects in an inaccessible state until their copyright expires.
selection
the limitation selection imposes on digital preservation hinges on the act of intellectual appraisal. the total digital content created each year already outstrips the total current storage capacity of the world by a wide margin. 18 it is clear that libraries and archives cannot preserve everything, so, more than ever, deciding what to preserve is critical. 19 models of selection for digital objects can be plotted on a scale according to the degree of human mediation they entail.
at one end, the selective model is closest to selection in the analog world, with librarians individually identifying digital objects worthy of digital preservation. at the other end of the scale, the whole domain model involves minimal human mediation, with automated harvesting of digital objects. the collaborative model, in which archival institutions negotiate agreements with publishers to deposit content, falls somewhere between these two extremes, as does the thematic model, which can apply either selective- or whole-domain-type approaches to relatively narrow sets of digital objects defined by event, topic, or community. each of these approaches results in limits to the scope of digital preservation. the human mediation of the selective model limits the scale of what can be preserved, as objects can only be acquired as quickly as staff can appraise them. the collaborative and thematic models offer the potential for thorough coverage of their target but by definition are limited in scope. the whole domain model avoids the bottleneck of human appraisal but, more than any other model, is subject to the access limitations discussed above. whole domain harvesting is also essentially wasteful, as it is an anti-selection approach: everything found is kept, irrespective of potential value. this wastefulness makes the whole domain model extremely expensive because of the technological resources required to manage information at such a scale.
finances
the ultimate limiting factor is financial reality. considerations of funding and cost have both broad and narrow effects. the narrow effects are on each of the other limitations previously identified: financial constraints are intertwined with the constraints imposed by technology, access, law, and selection. the technological model of digital preservation that offers the highest quality and lowest risk, redundant offsite copies, also carries hard-to-sustain costs.
while the cost of storage continues to drop, hardware costs actually make up only a small percentage of the total cost of digital preservation. power, cooling, and (for offsite copy strategies) bandwidth costs are significant and do not decrease as scale increases to the same degree that storage costs do. cost considerations similarly fuel non-technical limitations: increased funding can increase the rate at which digital objects are accessed for preservation and can enable development of systems to mine deep web resources. selection is limited by the number of staff who can evaluate objects or the need to develop systems to automate appraisal. negotiating perpetual access to objects or arranging to purchase archival copies creates additional costs. the broad financial effect is that any digital preservation requires dedicated funding over an indefinite timespan. lavoie outlines the problem:
much of the discussion in the digital preservation community focuses on the problem of ensuring that digital materials survive for future generations. in comparison, however, there has been relatively little discussion of how we can ensure that digital preservation activities survive beyond the current availability of soft-money funding; or the transition from a project's first-generation management to the second; or even how they might be supplied with sufficient resources to get underway at all. 20
there are many possible funding models for digital preservation, 21 each with its own limitations. creators and rights holders can preserve their own content but normally have little incentive to do so over the long term, as demand for access slackens. publicly funded agencies can preserve content, but they may lack a clear mandate for doing so, and they are chronically underfunded.
preservation may be voluntarily funded, as is the case for wikipedia, although it is not clear if there is enough potential volunteer funding for more than a few preservation efforts. fees may support preservation, either through charging users for access or by third-party organizations charging content owners for archival services; in such cases, however, fees may also discourage access or provision of content, respectively.
a nested model of limitations
these aspects can be seen as a series of nested constraints (see figure 1).
figure 1. nested model of limitations
at the highest level, there are technical limitations on how much digital information can be preserved at an acceptable quality. within that constraint, only a limited portion of what could possibly be preserved can be accessed by archival institutions for digital preservation. next, within that which is accessible, there are legal limitations on what may be archived. the subset defined by technological, access, and legal limitations still holds far more information than archival institutions are capable of archiving; therefore selection is required, entailing either the limited quality of automated gathering or the limited quantity of human-mediated appraisal. finally, each of these constraints is in turn limited by financial considerations, so finances exert pressure at each level.
conclusion
it is possible to envision alternative ways to model this series of constraints: the order could be different, or they could all be centered on a single point but not nested within each other. thus, undue attention should not be given to the specific sequence outlined above. one important conclusion that may be drawn, however, is that the identified limitations are related but distinct. the preponderance of digital preservation research to date has understandably focused on overcoming technological limitations.
with the establishment of the redundant backup model, which addresses technological limitations to a workable degree, the field would be well served by greater efforts to push back the non-technical limitations of access, law, and selection. the other conclusion is that costs are digital preservation’s most pervasive limitation. as rosenthal plainly states it, “society’s ever-increasing demands for vast amounts of data to be kept for the future are not matched by suitably lavish funds.” 22 if funding cannot be increased, expectations must be tempered. perhaps it has always been the case, but the scale of the digital landscape makes it clear that preservation is a process of triage. for the foreseeable future, the amount of digital information that could possibly be preserved far outstrips the amount that feasibly can be preserved. it is useful to put the advances in digital preservation technology in perspective and to recognize that non-technical factors also play a large role in determining how much of our cultural heritage may be preserved for the benefit of future generations.
references and notes
1. issues specific to digitized objects (i.e., digital versions of analog originals) are not specifically addressed herein. technological limitations apply equally to digitized and born-digital objects, however, and the remaining limitations overlap greatly in either case.
2. francine berman et al., sustainable economics for a digital planet: ensuring long-term access to digital information (blue ribbon task force on sustainable digital preservation and access, 2010), http://brtf.sdsc.edu/biblio/brtf_final_report.pdf (accessed apr. 23, 2011).
3. marilyn deegan and simon tanner, “some key issues in digital preservation,” in digital convergence - libraries of the future, ed.
rae earnshaw and john vince, 219–37 (london: springer london, 2007), www.springerlink.com.proxyremote.galib.uga.edu/content/h12631/#section=339742&page=1 (accessed nov. 18, 2010).
4. berman et al., sustainable economics for a digital planet; deegan and tanner, “digital convergence.”
5. data redundancy normally will also entail hardware migration; it may or may not also incorporate file format migration.
6. the library of congress, for instance, only began digital preservation in 2000 (www.digitalpreservation.gov/partners/pioneers/index.html [accessed apr. 24, 2011]).
7. david s. h. rosenthal, “bit preservation: a solved problem?” international journal of digital curation 5, no. 1 (july 21, 2010), www.ijdc.net/index.php/ijdc/article/view/151 (accessed mar. 14, 2011).
8. h. m. gladney, “durable digital objects rather than digital preservation,” january 1, 2008, http://eprints.erpanet.org/149 (accessed mar. 14, 2011).
9. rosenthal, “bit preservation.”
10. ibid. rosenthal cites studies by schroeder and gibson (2007) and pinheiro (2007).
11. ibid.
12. peter lyman, “archiving the world wide web,” in building a national strategy for digital preservation: issues in digital media archiving (washington, dc: council on library and information resources and library of congress, 2002), 38–51, www.clir.org/pubs/reports/pub106/pub106.pdf (accessed dec. 1, 2010); f. mccown, c.
c. marshall, and m. l. nelson, “why web sites are lost (and how they’re sometimes found),” communications of the acm 52, no. 11 (2009): 141–45; margaret e. phillips, “what should we preserve? the question for heritage libraries in a digital world,” library trends 54, no. 1 (summer 2005): 57–71.
13. deegan and tanner, “digital convergence”; mccown, marshall, and nelson, “why web sites are lost (and how they’re sometimes found).”
14. june besek, copyright issues relevant to the creation of a digital archive: a preliminary assessment (the council on library and information resources and the library of congress, 2003), www.clir.org/pubs/reports/pub112/contents.html (accessed mar. 15, 2011).
15. ibid., 17.
16. archival institutions that do not pay heed to this restriction, such as the internet archive (www.archive.org), claim their actions constitute fair use. the legality of this claim is as yet untested.
17. berman et al., sustainable economics for a digital planet.
18. francine berman, “got data?” communications of the acm 51, no. 12 (december 2008): 50, http://portal.acm.org/citation.cfm?id=1409360.1409376&coll=portal&dl=acm&idx=j79&part=magazine&wanttype=magazines&title=communications (accessed nov. 20, 2010).
19. phillips, “what should we preserve?”
20. brian f. lavoie, “the fifth blackbird,” d-lib magazine 14, no. 3/4 (march 2008): i, www.dlib.org/dlib/march08/lavoie/03lavoie.html (accessed mar. 14, 2011).
21. berman et al., sustainable economics for a digital planet.
22.
rosenthal, “bit preservation.”

information technology and libraries | june 2006

this paper discusses google scholar as an extension of kilgour’s goal to improve the availability of information. kilgour was instrumental in the early development of the online library catalog, and he proposed passage retrieval to aid in information seeking. google scholar is a direct descendant of these technologies foreseen by kilgour. google scholar holds promise as a means for libraries to expand their reach to new user communities, and to enable libraries to provide quality resources to users during their online search process.

editor’s note: this article was submitted in honor of the fortieth anniversaries of lita and ital.

fred kilgour would probably approve of google scholar. kilgour wrote that the paramount goal of his professional career is “improving the availability of information.”1 he wrote about his goal of achieving this increase through shared electronic cataloging, and even argued that shared electronic cataloging will move libraries toward the goal of 100 percent availability of information.2 throughout much of kilgour’s life, 100 percent availability of information meant that all of a library’s books would be on the shelves when a user needed them.
in proposing shared electronic cataloging—in other words, online union catalogs—kilgour was proposing that users could identify libraries’ holdings without having to travel to the library to use the card catalog. this would make the holdings of remote libraries as visible to users as the holdings of their local library. kilgour went further than this, however, and also proposed that the full text of books could be made available to users electronically.3 this would move libraries toward the goal of 100 percent availability of information even more than online union catalogs. an electronic resource, unlike physical items, is never checked out; it may, in theory, be simultaneously used by an unlimited number of users. where there are restrictions on the number of users of an electronic resource—as with subscription services such as netlibrary, for example—this is not a necessary limitation of the technology, but rather a limitation imposed by licensing and legal arrangements. kilgour understood that his goal of 100 percent availability of information would only be reached by leveraging increasingly powerful technologies. the existence of effective search tools and the usability of those tools would be crucial so that the user would be able to locate available information without assistance.4 to achieve this goal, therefore, kilgour proposed and was instrumental in the early development of much library automation: he was behind the first uses of punched cards for keeping circulation records, he was behind the development of the first online union catalog, and he called for passage retrieval for information seeking at a time when such systems were first being developed.5 this development and application of technology was all directed toward the goal of improving the availability of information. 
kilgour stated that the goal of these proposed information-retrieval and other systems was “to supply the user with the information he requires, and only that information.”6 shared catalogs and electronically available text have the effect of removing both spatial and temporal barriers between the user and the material being used. when the user can access materials “from a personal microcomputer that may be located in a home, dormitory, office, or school,” the user no longer has to physically go to the library.7 this is a spatial barrier when the library is located at some distance from the user, or if the user is physically constrained in some way. even if the user is perfectly able-bodied, however, and located close to a library, electronic access still eliminates a temporal barrier: accessing materials online is frequently faster and more convenient than physically going to the library. electronic access enables 100 percent availability of information in two ways: by ensuring that the material is available when the user wants it, and by lowering or removing any actual or perceived barriers to the user accessing the material. ■ library automation weise writes that “for at least the last twenty to thirty years, we [librarians] have done our best to provide them [users] with services so they won’t have to come to the library.”8 the services that weise is referring to are the ability for users to search for and gain access to the full text of materials online. 
libraries of all types have widely adopted these services: for example, at the author’s own institution, the university of north carolina at chapel hill, the libraries have subscriptions to approximately seven hundred databases and provide access to more than 32,000 unique periodical titles; many of these subscriptions provide access to the full text of materials.9 additionally, the state library of north carolina provides a set of more than one hundred database subscriptions to all academic and public libraries around the state; any north carolina resident with a library card may access these databases.10 several other states have similar programs.

google scholar and 100 percent availability of information | jeffrey pomerantz
jeffrey pomerantz (pomerantz@unc.edu) is assistant professor in the school of information and library science, university of north carolina at chapel hill.

by providing users with remote access to materials, libraries have created an environment in which it is possible for users to be remote from the library. or rather, as lipow points out, it is the library that is remote from the user, yet the user is able to seek and find information.11 this adoption of technology by libraries has had the effect of enabling and empowering users to seek information for themselves, without either physically going to a library or seeking a librarian’s assistance. the increasing sophistication of freely available tools for information seeking on the web has accelerated this trend. in many cases, users may seek information for themselves online without making any use of a library’s human-intermediated or other traditional services. (certainly, providing access to electronic collections may be considered to be a service of the library, but this is a service that may not require the user either to be physically in the library or to communicate with a librarian.)
even technically unsophisticated users may use a search engine and locate information that is “good enough” to fulfill their information needs, even if it is not the ideal or most complete information for those purposes.12 thus, for better or worse, the physical library is no longer the primary focus for many information seekers. part of this movement by users toward self-sufficiency in information seeking is due to the success of the web search engine, and to the success of google in particular. recent reports from the pew internet and american life project shed a great deal of light on users’ use of these tools. rainie and horrigan found that “on a typical day at the end of 2004, some 70 million american adults logged onto the internet.”13 fallows found that “on any given day, 56% of those online use search engines.”14 fallows, rainie, and mudd found that of their respondents, “47% say that google is their top choice of search engine.”15 from these figures, it can be roughly estimated that more than 39 million people use search engines, and more than 18 million use google on any given day—and that is only within the united states. this trend seems quite dark for libraries, but it actually has its bright side. it is important to make a distinction here between use of a search engine and use of a reference service or other library service. there is some evidence that users’ questions to library reference services are becoming more complex.16 why this is occurring is less clear, but it may be hypothesized that users are locating information that is good enough to answer their own simple questions using search engines or other internet-based tools. the definition of “good enough” may differ considerably between a user and a librarian. nevertheless, one function of the library is education, and as with all education, the ultimate goal is to make the student self-sufficient in self-teaching. 
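the rough estimate of daily search-engine and google users given above can be reproduced with a few lines of python. this is only a sketch of the arithmetic: the three input figures are the ones quoted from the pew reports, while the function name and the choice to apply the 47 percent "top choice" share directly to daily searchers are assumptions made for illustration, following the reasoning in the text.

```python
# Figures quoted in the text from the Pew Internet and American Life Project.
ONLINE_ADULTS_PER_DAY = 70_000_000   # Rainie and Horrigan: typical day, end of 2004
SHARE_USING_SEARCH = 0.56            # Fallows: share of those online who use search engines
SHARE_PREFERRING_GOOGLE = 0.47       # Fallows, Rainie, and Mudd: Google as top choice


def estimate_daily_users(online, search_share, google_share):
    """Return (search-engine users, Google users) per day.

    Assumption for illustration: everyone who names Google as a top
    choice is counted as a daily Google user, as the text's estimate does.
    """
    search_users = online * search_share
    google_users = search_users * google_share
    return search_users, google_users


search_users, google_users = estimate_daily_users(
    ONLINE_ADULTS_PER_DAY, SHARE_USING_SEARCH, SHARE_PREFERRING_GOOGLE
)
print(f"search engine users per day: {search_users / 1e6:.1f} million")
print(f"google users per day: {google_users / 1e6:.1f} million")
```

the two results, roughly 39.2 million and 18.4 million, agree with the "more than 39 million" and "more than 18 million" stated in the text.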
in the context of the library, this means that one goal is to make the user self-sufficient in finding, evaluating, and using information resources. if users are answering their own simple questions, and asking the more difficult ones, then it may be hypothesized that the widespread use of search engines has had a role in raising the level of debate, so to speak, in libraries. rather than providing instruction to users on simply using search engines, librarians may now assume that some percentage of library users possess this skill, and may focus on teaching higher-level information-literacy skills to users (www.ala.org/ala/acrl/acrlstandards/informationliteracycompetency.htm). simple questions that users may answer for themselves using a search engine and complex questions requiring a librarian’s assistance are not opposites, of course, but rather two ends of a spectrum of the complexity of questions. while the advance of online search tools may enable users to seek and find information for themselves at one end of this spectrum, it seems unlikely that such tools will enable users to do the same across the entire spectrum any time soon, if ever. the author believes that there will continue to be a role for librarians in assisting users to find, evaluate, and use information. it is also important to make another distinction here, between the discovery of resources and access to those resources. libraries have always provided mechanisms for users to both discover and access resources. neither the card catalog nor the online catalog contains the full text of the materials cataloged; rather, these tools are means to enable the user to discover the existence of resources. the user may then access these resources by visiting the library.
search engines, similar to the card and online catalogs, are tools primarily for discovery of resources: search-engine databases may contain cached copies of web pages, but the original (and most up-to-date) version of the web page resides elsewhere on the web. thus, a search engine enables the user to discover the existence of web pages, but the user must then access those web pages elsewhere. the author believes that there will continue to be a role for libraries in providing access to resources—regardless of where the user has discovered those resources. in order to ensure that libraries and librarians remain a critical part of the user’s information-seeking process, however, libraries must reappropriate technologies for online information seeking. search engines may exist separate from libraries, and users may use them without making use of any library service. however, libraries are already the venue through which users access much online content—newspapers, journals, and other periodicals; reference sources; genealogical materials—even if many users do not physically come to the library or consult a librarian when using them. it is possible for libraries to add value to search technologies by providing a layer of service available to those using them.

■ google scholar

one such technology for online information seeking to which libraries are already adding value, and that could add value to libraries in turn, is google scholar (scholar.google.com). google scholar is a specialty search tool, obviously provided by google, which enables the user to search for scholarly literature online.
this literature may be on the free web (as open-access publications become more common and as scholars increasingly post preprint or post-print copies of their work on their personal web sites), or it may be in subscription databases.17 users may access literature in subscription databases in one of two ways: (1) if the user is affiliated with an institution that subscribes to the database, the user may access it via whatever authentication method is in place at the institution (e.g., ip authentication, a proxy server), or (2) if the user is not affiliated with such an institution, the user may pay for access to individual resources on a pay-per-view basis. there is not sufficient space here to explore the details of google scholar’s operation, and anyway that is not the point of this paper; for excellent discussions of the operation of google scholar, see gardner and eng, and jacsó.18 pace draws a distinction between federated searching and metasearching: federated search tools compile and index all resources proactively, prior to any user’s actual search, in a just-in-case approach to users’ searching.19 metasearch tools, on the other hand, search all resources on the fly at the time of a user’s search, in a just-in-time approach to users’ searching. google scholar is a federated search tool—as, indeed, are all of google’s current services—in that the database that the user searches is compiled prior to the user’s actual search. in this, google scholar is a direct descendant of kilgour’s work to develop shared online library catalogs. a shared library catalog is a union catalog: it is a database of libraries’ physical holdings, compiled prior to any actual user’s search. google scholar is also a union catalog, though a catalog of publishers’ electronic offerings provided by libraries, rather than of libraries’ physical holdings.
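the distinction pace draws between federated searching and metasearching can be illustrated with a minimal python sketch. everything in it is hypothetical: the two toy "publisher" sources, the record ids, and the function names are invented for illustration and say nothing about how google scholar or any metasearch product is actually built. the federated version compiles a single inverted index before any query arrives (just-in-case); the metasearch version visits every source at query time (just-in-time).

```python
from collections import defaultdict

# Hypothetical toy sources: each maps a record id to its text.
SOURCES = {
    "publisher_a": {
        "a1": "library automation and union catalogs",
        "a2": "passage retrieval in full text",
    },
    "publisher_b": {
        "b1": "union catalogs for shared cataloging",
    },
}


def build_federated_index(sources):
    """Just-in-case: compile one inverted index before any user searches."""
    index = defaultdict(set)
    for source, records in sources.items():
        for record_id, text in records.items():
            for term in text.split():
                index[term].add((source, record_id))
    return index


def federated_search(index, term):
    """Answer a query from the precompiled index only."""
    return sorted(index.get(term, set()))


def metasearch(sources, term):
    """Just-in-time: scan every source at the moment of the query."""
    hits = []
    for source, records in sources.items():
        for record_id, text in records.items():
            if term in text.split():
                hits.append((source, record_id))
    return sorted(hits)


index = build_federated_index(SOURCES)
print(federated_search(index, "union"))
print(metasearch(SOURCES, "union"))
```

both calls return the same records here; the difference lies in when the work is done. the federated index pays the cost up front and can fall behind its sources, while the metasearch pays at query time and depends on every source responding.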
it should be noted, however, that while this difference is an important one for libraries and publishers, it might not be understood or even relevant for many users. many of the resources indexed in google scholar are also available in full text. this fact allows google scholar to also move in the direction of kilgour’s goal of making passage retrieval possible for scholarly work. by using google’s core technology—the search engine and the inverted index that is created when pages are indexed by a search engine—google scholar enables full-text searching of scholarly work. as mentioned above, when users search google scholar, they retrieve a set of links to the scholarly literature retrieved by the search. google scholar also makes use of google’s link-analysis algorithms to analyze the network of citations between publications—instead of the network of hyperlinks between web pages, as google’s search engine more typically analyzes. a cited by link is included with each retrieved link in google scholar, stating how many other publications cite the publication listed. clicking on this cited by link performs a preformulated search for those publications. this citation-analysis functionality resembles the functionality of one of the most widely used databases in the scholarly community: the isi web of science (wos) database (scientific.thomson.com/products/wos). wos enables users to track citations between publications. this functionality has wide use in scholarly research, but until google scholar, it was largely unknown outside of the scholarly community. with the advent of google scholar, however, this functionality may be employed by any user for any research.
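the cited by link described above amounts to reversing a citation list: from "which works does this paper cite" to "which works cite this paper." a minimal python sketch follows. the paper ids and the data are invented, and the code is not google's link-analysis algorithm or the wos implementation; it only shows the count displayed on such a link and the set of records a preformulated search for the citing publications would return.

```python
from collections import defaultdict

# Hypothetical citation data: each paper id maps to the ids it cites.
CITES = {
    "p1": ["p2", "p3"],
    "p2": ["p3"],
    "p3": [],
    "p4": ["p3", "p1"],
}


def build_cited_by(cites):
    """Invert the citation lists: cited paper -> set of citing papers."""
    cited_by = defaultdict(set)
    for citing, cited_list in cites.items():
        for cited in cited_list:
            cited_by[cited].add(citing)
    return cited_by


def cited_by_link(cited_by, paper_id):
    """Return (count shown on the link, ids the linked search would list)."""
    citing = sorted(cited_by.get(paper_id, set()))
    return len(citing), citing


cited_by = build_cited_by(CITES)
count, citing_papers = cited_by_link(cited_by, "p3")
print(f"cited by {count}: {citing_papers}")
```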
further, there is a plugin for the firefox browser (www.mozilla.com/firefox) that displays an icon for every record on the page of retrieved results that links to the appropriate record in the library’s opac (google scholar does not, however, currently provide this functionality natively20). this provides a link from google scholar to the materials that the library holds in its collection. when the item is a book, for example, this link to the opac enables users to find the call number of the book in their local library. when the item is a journal, it enables them to find both the call number and any database subscriptions that index that journal title. periodicals are often indexed in multiple databases, so libraries with multiple-database subscriptions often have multiple means of accessing electronic versions of journal titles. a library user may access a periodical via any or all of these individual subscriptions without using google scholar— but to do so, the user must know which database to use, which means knowing either the topical scope of a database or knowing which specific journals are indexed in a database. as a more centralized means of accessing this material, many users may prefer a link in google scholar to the library’s opac. google scholar thus fulfills, in large part, kilgour’s vision of shared electronic cataloging. in turn, shared cataloging goes a long way toward achieving kilgour’s vision of 100 percent availability of information by allowing a user to discover the existence of information resources. however, discovery of resources is only half of the equation: the other half is access to those resources. and it is here where libraries may position themselves as a critical part of the information-seeking process. 
search engines may enable users to discover information resources on their own, without making use of a library’s services, but it is the library that provides the “last mile” of service, enabling users to gain access to many of those resources.

■ conclusion

google scholar is the topic of a great deal of debate, both in the library arena and elsewhere.21 unlike union catalogs and many other online resources used in libraries, it is unknown what materials are included in google scholar, since as of this writing google has not released information about which publishers, titles, and dates are indexed.22 google is known to engage in self-censorship—or self-filtering, depending on what coverage one reads—and so potentially conflicts with the american library association’s freedom to read statement (www.ala.org/ala/oif/statementspols/ftrstatement/freedomreadstatement.htm).23 google is a commercial entity and, as such, a primary motivation of google must be profit, and only secondarily, meeting the information needs of library users. for all of these and other reasons, there is considerable debate among librarians about whether it is appropriate for libraries to provide access to google scholar. despite this debate, however, users are using google scholar. google scholar is simply the latest tool to enable users to seek information for themselves; it isn’t the first and it won’t be the last. google scholar holds a great deal of promise for libraries due to the combination of google’s popularity and ease of use, and the resources held by or subscribed to by libraries to which google scholar points.
as kesselman and watstein suggest, “libraries and librarians need to have a voice” in how tools such as google scholar are used, given that “we are the ones most passionate about meeting the information needs of our users.”24 given that library users are using google scholar, it is to libraries’ benefit to see that it is used well. google scholar is the latest tool in a long history of information-seeking technologies that increasingly realize kilgour’s goal of achieving 100 percent availability of information. google scholar does not provide access to 100 percent of information resources in existence, but rather enables discovery of information resources, and allows for the possibility that these resources will be discoverable by the user 100 percent of the time. google scholar may be in the vanguard of a new way of integrating library services into users’ everyday information-seeking habits. as taylor tells us, people have their own individual sources to which they go to find information, and libraries—for many people—are not at the top of their lists.25 google, however, is at the top of the list for a great many people.26 properly harnessed by libraries, therefore, google scholar has the potential to bring users to library resources when they are seeking information. google scholar may not bring users physically to the library. instead, what google scholar can do is bring users into contact with resources provided by the library. this is an important distinction, because it reinforces a change that libraries have been undergoing since the advent of the online database: that of providing access to materials that the library may not own. ownership of materials potentially allows for a greater measure of control over the materials and their use.
ownership in the context of libraries has traditionally meant ownership of physical materials, and physical materials by nature restrict use, since the user must be physically collocated with the materials, and use of materials by one user precludes use of those materials by other users for the duration of the use. providing access to materials, on the other hand, means that the library may have less control over materials and their use, but this potentially allows for wider use of these materials. by enabling users to come into contact with library resources in the course of their ordinary web searches, google scholar has the potential to ensure that libraries remain a critical part of the user’s information-seeking process. it benefits google when a library participates with google scholar, but it also benefits the library and the library’s users: the library is able to provide users with a familiar and easy-to-use path to materials. this is (for lack of a better term) a “spoonful of sugar” approach to seeking and finding information resources: by using an interface that is familiar to users, libraries may provide quality information sources in response to users’ information seeking. green wrote that “a librarian should be as unwilling to allow an inquirer to leave the library with his question unanswered as a shop-keeper is to have a customer go out of his store without making a purchase.”27 a modern version of this might be that a librarian should be as unwilling to allow an inquirer to abandon a search with his question unanswered. google scholar and online tools like it have the potential to draw users away from libraries; however, these tools also have the potential to usher in a new era of service for libraries: an expansion of the reach of libraries to new users and user communities; a closer integration with users’ searches for information; and the provision of quality resources to all users, in response to all information needs. 
google scholar and online tools like it have the potential to enable libraries to realize kilgour’s goals of improving the availability of information, and to provide 100 percent availability of information. these are goals on which all libraries can agree.

■ acknowledgements

many thanks to lisa norberg, instruction librarian, and timothy shearer, systems librarian, both at the university of north carolina at chapel hill, for many extensive conversations about google scholar, which approached coauthorship of this paper. this paper is dedicated to the memory of kenneth d. shearer.

references and notes
1. frederick g. kilgour, “historical note: a personalized prehistory of oclc,” journal of the american society for information science 38, no. 5 (1987): 381.
2. frederick g. kilgour, “future of library computerization,” in current trends in library automation: papers presented at a workshop sponsored by the urban libraries council in cooperation with the cleveland public library, alex ladenson, ed. (chicago: urban libraries council, 1981), 99–106; frederick g. kilgour, “toward 100 percent availability,” library journal 114, no. 19 (1989): 50–53.
3. kilgour, “toward 100 percent availability.”
4. frederick g. kilgour, “lack of indexes in works on information science,” journal of the american society for information science 44, no. 6 (1993): 364; frederick g. kilgour, “implications for the future of reference/information service,” in collected papers of frederick g. kilgour: oclc years, lois l. yoakam, ed. (dublin, ohio: oclc online computer library center, inc., 1984): 9–15.
5. frederick g. kilgour, “a new punched card for circulation records,” library journal 64, no. 4 (1939): 131–33; kilgour, “historical note”; frederick g. kilgour and nancy l. feder, “quotations referenced in scholarly monographs,” journal of the american society for information science 43, no. 3 (1992): 266–70; gerald salton, j.
allan, and chris buckley, “approaches to passage retrieval in full-text information systems,” in proceedings of the 16th annual international acm sigir conference on research and development in information retrieval (new york: acm pr., 1993), 49–58.
6. kilgour, “implications for the future of reference/information service,” 95.
7. kilgour, “toward 100 percent availability,” 50.
8. frieda weise, “being there: the library as place,” journal of the medical library association 92, no. 1 (2004): 10, www.pubmedcentral.nih.gov/articlerender.fcgi?artid=314099 (accessed apr. 9, 2006).
9. it is difficult to determine precise figures, as there is considerable overlap in coverage; several vendors provide access to some of the same periodicals.
10. north carolina’s database subscriptions are via the nc live service, www.nclive.org (accessed apr. 9, 2006).
11. anne g. lipow, “serving the remote user: reference service in the digital environment,” paper presented at the ninth australasian information online and on disc conference and exhibition, sydney, australia, 19–21 jan. 1999, www.csu.edu.au/special/online99/proceedings99/200.htm (accessed apr. 9, 2006).
12. j. janes, “academic reference: playing to our strengths,” portal: libraries and the academy 4, no. 4 (2004): 533–36, http://muse.jhu.edu/journals/portal_libraries_and_the_academy/v004/4.4janes.html (accessed apr. 9, 2006).
13. lee rainie and john horrigan, a decade of adoption: how the internet has woven itself into american life (washington, d.c.: pew internet & american life project, 2005), 58, www.pewinternet.org/ppf/r/148/report_display.asp (accessed apr. 9, 2006).
14. deborah fallows, search engine users (washington, d.c.: pew internet & american life project, 2005), i, www.pewinternet.org/pdfs/pip_searchengine_users.pdf (accessed apr. 9, 2006).
15.
deborah fallows, lee rainie, and graham mudd, data memo on search engines (washington, d.c.: pew internet & american life project, 2004), 3, www.pewinternet.org/ppf/r/132/report_display.asp (accessed apr. 9, 2006).
16. laura bushallow-wilber, gemma devinney, and fritz whitcomb, “electronic mail reference service: a study,” rq 35, no. 3 (1996): 359–69; carol tenopir and lisa a. ennis, “reference services in the new millennium,” online 25, no. 4 (2001): 40–45.
17. alma swan and sheridan brown, open access self-archiving: an author study (truro, england: key perspectives, 2005), www.jisc.ac.uk/uploaded_documents/open%20access%20self%20archiving-an%20author%20study.pdf (accessed apr. 9, 2006).
18. susan gardner and susanna eng, “gaga over google? scholar in the social sciences,” library hi tech news 8 (2005): 42–45; péter jacsó, “google scholar: the pros and the cons,” online information review 29, no. 2 (2005): 208–14.
19. andrew pace, “introduction to metasearch . . . and the niso metasearch initiative,” presentation to the openurl and metasearch workshop, sept. 19–21, 2005, www.niso.org/news/events_workshops/openurl-05-ppts/2-1-pace.ppt (accessed apr. 9, 2006).
20. this plugin was developed by peter binkley, digital initiatives technology librarian at the university of alberta. see www.ualberta.ca/~pbinkley/gso (accessed apr. 9, 2006).
21. see, for example, gardner and eng, “gaga over google?”; jacsó, “google scholar”; m. kesselman and s. b. watstein, “google scholar and libraries: point/counterpoint,” reference services review 33, no. 4 (2005): 380–87.
22. jacsó, “google scholar.”
23. anonymous, google censors itself for china, bbc news, jan. 25, 2006, http://news.bbc.co.uk/2/hi/technology/4645596.stm (accessed apr. 9, 2006); a. mclaughlin, “google in china,” google blog, jan. 27, 2006, http://googleblog.blogspot.com/2006/01/google-in-china.html (accessed apr. 9, 2006).
24. kesselman and watstein, “google scholar and libraries,” 386.
25. robert s.
taylor, “question-negotiation and information seeking in libraries,” college & research libraries 29, no. 3 (1968): 178–94.
26. fallows, rainie, and mudd, data memo on search engines.
27. samuel s. green, “personal relations between librarians and readers,” american library journal i, no. 2–3 (1876): 79.

article
a framework for measuring relevancy in discovery environments
blake l. galbreath, alex merrill, and corey m. johnson
information technology and libraries | june 2021
https://doi.org/10.6017/ital.v40i2.12835

abstract
discovery environments are ubiquitous in academic libraries, but studies of their effectiveness and use in an academic environment have mostly centered on user satisfaction, experience, and task analysis. this study aims to create a quantitative, reproducible framework to test the relevancy of results and the overall success of washington state university’s discovery environment (primo by ex libris). within this framework, the authors use bibliographic citations from student research papers submitted as part of a required university class as the proxy for relevancy. in the context of this study, the researchers created a testing model that includes: (1) a process to produce machine-generated keywords from a corpus of research papers to compare against a set of human-created keywords, (2) a machine process to query a discovery environment to produce search result lists to compare against citation lists, and (3) four metrics to measure the comparative success of different search strategies and the relevancy of the results. this framework is used to move beyond a sentiment or task-based analysis to measure whether materials cited in student papers appear in the results list of a production discovery environment.
while this initial test of the framework produced fewer matches between researcher-generated search results and student bibliography sources than expected, the authors note that faceted searches had a greater success rate than open-ended searches. future work will include comparative (a/b) testing of commonly deployed discovery layer configurations and limiters to measure the impact of local decisions on discovery layer efficacy, as well as noting where in the results list a citation match occurs.

blake l. galbreath (blake.galbreath@wsu.edu) is core services librarian, washington state university. alex merrill (merrilla@wsu.edu) is head of library systems and technical operations, washington state university. corey m. johnson (coreyj@wsu.edu) is instruction & assessment librarian, washington state university. © 2021.

introduction

discovery environments are ubiquitous in academic libraries: all but two libraries in the association of research libraries (arl) report using a discovery environment, and they continue to gain traction in other library settings.1 the one-stop shopping model of discovery environments is one of their most alluring features, as it closely resembles searching the open web. this familiarity allows users who are accustomed to searching the web to feel comfortable searching the library catalog without fear of encountering a “failed” search (zero result set). discovery environments seldom fail to return results, as even the most rudimentary or naïve search strategy will return something for a user. this idea of “returning something” has been anecdotally noted as a positive, as it ensures the user does not give up and allows novices to be successful with limited search sophistication or prior instruction from information professionals. one potential negative of this approach, however, is the sheer volume of material that is returned per search query. library discovery environments often present thousands, if not millions, of search results from an initial search query. this emulation of google is essentially making the time-honored study of relevancy (precision/recall) moot. how can one determine the number of relevant documents in a search query if the number of documents returned is becoming limitless?

this study aims to create a quantitative, reproducible framework to test the relevancy of results returned from, and the overall efficacy of, a library discovery environment, in this case, ex libris primo. within this framework, the authors compare the results returned in model primo search queries against the bibliographic citations used in students’ research papers.

background

the university common requirements (ucore) curriculum, implemented in fall 2012, was a major redesign of the washington state university (wsu) undergraduate general education program. ucore comprises required categories of classes designed to build student proficiency in the seven undergraduate learning goals.2 roots of contemporary issues (rci) is the sole mandated undergraduate course under the ucore system.3 during the 2018–2019 academic year, over 4,500 students were enrolled in rci at wsu, the vast majority being first-year students.

this paper utilizes data from the rci library research project, a term-length research experience with four central assignments designed to familiarize students with the fundamentals of quality research and a cumulative research paper where they utilize the skills learned. the research project components are spaced evenly throughout the term; students are guided along the research process from general topic formation, to research question generation, to thesis statement defense in the final paper.
students are tasked with finding sources of particular resource types (e.g., journal articles), describing the value of these sources for their research, and citing them properly in chicago style.

wsu libraries uses the discovery environment primo, an ex libris product, to provide resources to its patrons.4 specifically, wsu libraries uses the new user interface version of primo, which incorporates search results from the primo central index (pci) in its default search. primo, like all discovery environments, provides results with a wide variety of resource types, so rci students can use it at all stages of the term research project. students use it in the pursuit of contemporary newspaper articles, history monographs, history journal articles, and primary sources. in this article, the authors focus on the versatility of primo, using rci student paper bibliographies as the central data source for the project.

literature review

the need for assessment of library resources and services in higher education has been well documented. libraries are increasingly asked to provide tangible evidence that they aid student information literacy skill development and thus advance achievement of institutional learning outcomes. accrediting bodies acknowledge “the importance of information literacy skills, and most accreditation standards have strengthened their emphasis on the teaching roles of libraries.”5 oakleaf and kaske also stress the importance of librarians choosing assessments that can contribute to university-wide assessment efforts, noting they are preferable to assessments that only benefit libraries.6 the washington state university libraries is committed to assessment of its resources and services, with primo as a central target resource, and with large, lower-undergraduate courses as a primary area of focus.
there are numerous papers which document usability testing of primo. prommann and zhang (2015) analyzed the efficiency of primo through hierarchical task analysis (hta). they counted the number of physical and cognitive steps necessary to get to records or full text of known items and concluded that primo is “a flexible discovery layer as it helps achieve many goals with minimum amount [sic] of steps.”7 although many of these studies articulate avenues of success in terms of user interaction with the discovery environment, there are also reports of difficulties in a variety of categories. students have problems with source retrieval, for example, understanding availability status terminology and labels, and using link resolvers and interlibrary loan.8 dalal et al. (2015) demonstrated that retrieving the full text of an article in a discovery environment is sometimes unintuitive for students and involves navigating multiple interfaces.9 users also have issues using facets to find particular resource types or distinguishing between them.10 while the study presented in this paper does not directly address user difficulties with primo functionality, issues with source retrieval point to a plausible explanation for the few matches between the model search results and student paper bibliographies. it is possible students saw many of the same sources from the model searches in their results, but ultimately did not secure those sources because of the difficulties outlined above. in other words, some source selection choices are based mostly on availability, not as much on relevance.

source relevancy is an active area of research for web-based discovery services, in terms of comparative studies to disciplinary subject databases.
evelhoch and zebulin analyzed two years of usage data from both primo and a selection of subject databases, concluding that users have difficulty finding relevant sources in primo or they are not available.11 based on users’ judgments, lee and chung determined that ebsco discovery service was less effective than a set of education and library subject databases in terms of source relevance.12 another study illustrated that while students preferred discovery environments, the articles they selected from the subject (indexing and abstracting) databases were more authoritative.13 finally, librarians are posited to believe that subject databases are superior to discovery environments in terms of the relevancy of search results and disciplinary coverage.14 conclusions about source relevancy are complicated by the fact that students infrequently look beyond a first page of results lists.15

researchers have also explored the idea of primo user satisfaction through the presence of relevant results. in one instance, using online questionnaires and in-person focus groups, researchers found users had a high level of satisfaction with their institution’s discovery environment, largely attributed to the quality of search results over ease of use.16 hamlett and georgas (2019) conducted a mixed-methods user experience study to understand student perceptions of relevancy in primo. this study found that participants believed primo to return relevant results (with an average score of 8.3 out of 10). however, some of the qualitative responses indicated that the keywords used did not actually yield relevant results.17

many other methods and measures have been used to determine the value and usefulness of primo. huurdeman, aamodt, and heggo analyzed a dataset of 50 popular queries in primo.
they deemed a query successful if the first 10 results included the (likely) targeted resource and found that 58% of the queries from the popular searches dataset had been successful, while 20% were unsuccessful, and 22% could not be determined. their approach assumed there is one intended document per query and that the authors can surmise what it is.18

the research presented in the remainder of this article is unique in that the authors explore user judgment of source relevance (satisfaction) as a function of whether sources in the model primo searches for their topics existed in the students’ papers’ bibliographies.

methods

research questions

the impetus for this study was to understand the factors that play a role in establishing a framework to test the relevancy of results returned from primo. the authors attempted to answer the following questions:

• how effective is primo at returning relevant results?
• to what extent does faceting improve search results?
• which search strategies are the most effective within the given framework?
• how can the researchers refine the framework for future investigations into relevancy?
• what are the implications of this study for end users?

data collection

the authors began with a sample of 100 randomly selected and anonymized research papers that were submitted to the roots of contemporary issues (rci) courses in fall 2018 and spring 2019 semesters. the study used a two-pronged approach to generate keywords for model primo search queries. for one approach, keywords were machine-generated via a word-vector generation process. for the other, keywords were human-generated by a student research assistant to approximate natural language queries.
keywords and queries, machine

a rapidminer (https://rapidminer.com/) word-vector generation process with term-frequency schema converted the research papers into keywords, which the authors then used to generate search queries. within the main routine, the process documents from files operator, rapidminer transformed the texts into lower case and tokenized the final papers according to non-letters. rapidminer then filtered the data by those tokens representing nouns and adjectives, removed english stop words, and filtered tokens by length, with a minimum of one character and maximum of 50 characters. the researchers then applied a snowball stemmer for english words and generated 20 n-grams per paper, each with a maximum length of four.

table 1 illustrates the product of the word-vector generation process. throughout this example research paper, “trade” occurred 40 times, “slave” occurred 34 times, “slave” and “trade” occurred together 26 times, “africa” occurred 18 times, “impact” occurred 16 times, “african” occurred 11 times, and “peopl” occurred 10 times.

table 1. example n-grams and frequency as retrieved from rapidminer

n-gram | number of occurrences
trade | 40
slave | 34
slave_trade | 26
africa | 18
impact | 16
african | 11
peopl | 10
... | ...

number of n-grams

after compiling the data in rapidminer, the authors created a process to select those n-grams to use in the model primo search queries.
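The term-frequency n-gram step that produces counts like those in table 1 can be sketched in a few lines. This is a minimal illustration in plain Python, not the RapidMiner process the authors ran: it omits the part-of-speech filter and the snowball stemmer, and the stop-word list, function name, and sample sentence are assumptions made up for the example.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; RapidMiner ships its own).
STOP_WORDS = {"the", "of", "and", "a", "in", "to", "on", "was", "by"}


def ngram_counts(text, max_n=2):
    """Count n-grams (joined with "_") after lowercasing, splitting on
    non-letters, and dropping stop words. No stemming or POS filtering."""
    tokens = [
        t for t in re.split(r"[^a-z]+", text.lower())
        if t and t not in STOP_WORDS
    ]
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts["_".join(tokens[i:i + n])] += 1
    return counts


# Hypothetical sample text, not taken from any student paper.
sample = "The slave trade in Africa: the impact of the slave trade on African people."
counts = ngram_counts(sample)
```

Because stop words are removed before the n-grams are formed, a 2-gram such as `slave_trade` can join words that were separated by a stop word in the original text, which mirrors the order of operations described for the RapidMiner process.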
huurdeman, aamodt, and heggo (2018) found that users included an average of 2.6 terms per query in their popular searches dataset.19 in a report by ex libris, stohn indicates that most topic-search queries contain five or fewer words.20 in order to investigate both ends of this spectrum, this study constructed short-length queries, consisting of two n-grams, and full-length queries, consisting of four n-grams, using the following rubric to help systematize the construction.

rubric to select n-grams for short- and full-length queries

pick terms that satisfy the following criteria:

1. n-grams that occur more frequently in a paper are preferred to those that occur less frequently.
2. if two n-grams appear to be structural derivatives of the same word (e.g., korea and korean), select the shortest n-gram and truncate it.
3. if one or more of the top terms appear in a later 2-gram, use the 2-gram as a phrase search.
4. ignore n-grams with repeating terms (e.g., south_africa_africa).
5. truncate all terms (using asterisk or question mark), except the first term of a phrase search, unless the first term is not a complete word (e.g., “busi* meeting*”).
6. for terms or phrases that end in truncated “i”, use the truncated version of the term and its truncated “y” counterpart, and combine both with an or operator (e.g., countri* or country*).
7. ignore all 3- and 4-grams as they have a propensity to create nonsensical phrase searches (e.g., racism_polic_brutal).
8. if abbreviations are encountered, expand them for searching purposes (e.g., us is “united states”), except in cases where they are more commonly known by their abbreviation (e.g., ddt).
9. ignore results of contractions (e.g., ‘t).

in case of a tie in the selection of an n-gram, apply the following rules in sequence:

1. preference proper nouns over other nouns and adjectives. if there are multiple proper nouns, preference place-name proper nouns over other proper nouns.
2. preference the n-gram that occurs in the greatest number of two or more n-grams later in the list.
3. preference longer words over shorter words.
4. group all the tied n-grams with a series of or statements. note: this may result in the selection of more than four total n-grams.

referring to the example n-grams from table 1, an illustration of this method is shown in the following steps:

1. arrange terms from highest to lowest frequency.
2. select slave_trade as first n-gram, since “trade” and “slave” both occur in later n-gram. truncate to “slave trade*”.
3. select africa since it has the next greatest number of occurrences. combine africa with african since they are structural derivatives of one another. truncate to africa*.

at this point, the first two selected n-grams (slave_trade and africa) become the keywords of the short-length query “slave trade*” and africa*.

4. select impact since it has the next greatest number of occurrences. truncate to impact*.
5. select peopl since it has the next greatest number of occurrences. truncate to peopl*.

finally, the first four selected n-grams (slave_trade, africa, impact, and peopl) become the keywords of the full-length query “slave trade*” and africa* and impact* and peopl*.

on average, after stop words and booleans were removed, the full-length queries in this study were 5.69 keywords long, while the short-length queries were 3.11 keywords long.

keywords and queries, natural language

in addition to the machine-oriented keyword process, the authors employed a student research assistant to create human-generated phrases, consisting of 3–10 words, which served as synopses for each of the 100 papers. this study then used these phrases as proxies for creating natural language search queries.
for the same example research paper cited in table 1 above, this student created the summary phrase history and effects of the slave trade. this phrase in its entirety became the natural language query. on average, after stop words and booleans were removed, the natural language queries used in this study were 3.95 keywords long.

search results

using the three keyword-generation strategies outlined above, the authors constructed search queries and ran them against the ex libris primo search api endpoint. table 2 summarizes example result sets from the above short-length query, full-length query, and natural language query. for each of the keyword-generation strategies, the authors constructed search queries along four parameters: queries that used no faceting (open-ended), queries that faceted to articles only (articles), queries that faceted to books and ebooks only (books), and queries that faceted to newspaper articles only (newspapers). in all, there were 12 search-query constructions (three query types by four faceting modes) for fall 2018 and 12 for spring 2019.

to construct a baseline for the search comparisons, the researchers designed the initial search to be open-ended. that is, the study assumed that patrons most often use the default, basic search functionality, with no facets selected. a segment of the rci instruction specifically encourages students to incorporate materials with resource types articles, books, and newspaper articles into their research papers. the authors therefore assumed that these students would most likely utilize facets corresponding to these resource types in their more specific queries and mirrored this behavior in the comparative searches. each primo search api call returned titles for the top 50 results, moving beyond users’ usual search behavior in an effort to provide more flexibility to the initial steps of the relevancy framework.
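The way selected n-grams become query strings, and the way query types combine with faceting modes, can be sketched as follows. This is a simplified illustration, not the authors' tooling: `format_term` and `build_query` are hypothetical helpers that cover only the phrase-search and truncation rules of the rubric, the uppercase `AND` operator is an assumption about query syntax, and no Primo API call is made.

```python
from itertools import product

QUERY_TYPES = ("short-length", "full-length", "natural language")
FACET_MODES = ("open-ended", "articles", "books", "newspapers")


def format_term(ngram):
    """Format one selected n-gram as a search term (hypothetical helper).

    A 2-gram becomes a phrase search with its last word truncated;
    a single term is simply truncated with an asterisk.
    """
    words = ngram.split("_")
    if len(words) == 2:
        return f'"{words[0]} {words[1]}*"'
    return f"{words[0]}*"


def build_query(ngrams):
    """Join formatted terms with a Boolean AND (uppercase is an assumption)."""
    return " AND ".join(format_term(g) for g in ngrams)


# N-grams selected in the worked example for the slave trade paper.
selected = ["slave_trade", "africa", "impact", "peopl"]
short_query = build_query(selected[:2])
full_query = build_query(selected)

# Three query types by four faceting modes gives 12 constructions per paper.
constructions = list(product(QUERY_TYPES, FACET_MODES))

# 100 papers, 12 constructions each, top 50 titles per search.
PAPERS, TOP_N = 100, 50
total_titles = PAPERS * len(constructions) * TOP_N
```

Under these assumptions `total_titles` comes to 60,000, which is consistent with the number of titles the study reports retrieving from the Primo search API.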
table 2. first-occurring result titles for query types: short-length, full-length, and natural language queries

query type: short-length
query: “slave trade*” and africa*
first-occurring result titles:
• the atlantic slave trade
• the atlantic slave trade : a census
• the atlantic slave trade
• legacy of the trans-atlantic slave trade : hearing before the subcommittee on the constitution, civil rights, and civil liberties of the committee on the judiciary, house of representatives, one hundred tenth congress, first session, december 18, 2007.
• ...

query type: full-length
query: “slave trade*” and africa* and impact* and peopl*
first-occurring result titles:
• the atlantic slave trade
• the atlantic slave trade : effects on economies, societies, and peoples in africa, the americas, and europe
• slave trades, 1500–1800 : globalization of forced labour
• african voices of the atlantic slave trade : beyond the silence and the shame
• ...

query type: natural language
query: history and effects of the slave trade
first-occurring result titles:
• urban history, the slave trade, and the atlantic world 1500–1900
• the atlantic slave trade and british abolition, 1760–1810
• the decolonization of african education and history
• the united states and the transatlantic slave trade to the americas, 1776–1867
• ...

a student research assistant harvested all the citations used across the 100 example papers to create an inventory of 730 bibliographic citations. using the excel fuzzy lookup add-in, the authors then compared this bibliographic inventory against the 60,000 titles that were returned via the primo search api. this add-in fuzzy matches rows between two different tables and assigns a similarity score for each match.
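The title comparison can be approximated with a short sketch. The authors used the Excel Fuzzy Lookup add-in, whose scoring is not reproduced here; the example below swaps in `difflib.SequenceMatcher` from the Python standard library as a stand-in, so its scores will differ from the add-in's. The function names and sample titles are illustrative, and the 0.80 cutoff is the threshold the study applies to candidate matches.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive similarity ratio in [0, 1] (difflib, not Fuzzy Lookup)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def candidate_matches(citations, results, threshold=0.80):
    """Return (score, citation, result) tuples at or above the threshold,
    highest score first. These are candidates only, pending manual review."""
    pairs = []
    for citation in citations:
        for result in results:
            score = similarity(citation, result)
            if score >= threshold:
                pairs.append((round(score, 4), citation, result))
    return sorted(pairs, reverse=True)


citations = [
    "a short history of biological warfare",
    "industrial revolution",
]
results = [
    "a short history of biological warfare",
    "the industrial revolution",
    "the atlantic slave trade",
]
candidates = candidate_matches(citations, results)
```

A high score only flags a pair for review. A near-identical title can still belong to a different work or resource type, which is why the study confirms or rejects each candidate by hand using title and resource type.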
the study focused attention on rows with matching scores of .80 and above to further investigate potential matches. using the fuzzy matches as a starting point, the authors confirmed or denied matches by hand, using title and resource type as the main criteria.

table 3. sample comparison of citations used in research papers against results returned from primo search api

fuzzy score | citation title | citation resource type | results title | result resource type | confirmed match
1.0000 | a short history of biological warfare | article | a short history of biological warfare | article | yes
0.9933 | the female madlady women, madness, and english culture, 1830–1980 | print book | the female malady : women, madness, and english culture, 1830–1980 | print book | yes
0.9778 | industrial revolution | web resource | the industrial revolution | e-book | no
0.9037 | drug use & abuse | print book | drug use and abuse : a comprehensive introduction | print book | no

results

source citation data description

this study compared citations gathered from a random sample of 100 research papers from the two semesters of all sections of history 105/305 taught at washington state university (wsu) from fall 2018 to spring 2019. table 4 below gives a descriptive breakdown of the citations by resource type. the student research assistant identified and categorized the source citation list.

table 4. total source citations

resource type | fall 2018 (% of total) | spring 2019 (% of total)
book chapter | 7 (1.94%) | 4 (1.08%)
books (e-books/print) | 107 (29.72%) | 96 (25.95%)
newspaper article | 63 (17.50%) | 60 (16.22%)
journal article | 84 (23.33%) | 99 (26.76%)
reference entry | 6 (1.67%) | 6 (1.62%)
other/cannot determine | 10 (2.78%) | 15 (4.05%)
web document | 81 (22.50%) | 90 (24.32%)
magazine article | 1 (.28%) | n/a
newspaper/magazine article | 1 (.28%) | n/a
semester citation count | 360 (100%) | 370 (100%)
total citation count | 730

target citations list data

the citations collected from the papers were then compared against 60,000 citations retrieved from the wsu primo search api endpoint on july 24, 2020, as described previously in the methods section. to better account for the differing numbers of citations among resource types in the source data and to normalize reporting across query types and semesters, most results are presented as a percentage and referred to as the matching success rate. for example, the natural language query had six matches out of a possible 360 citations in the open-ended search for citations from the fall of 2018. the matching success rate of the open-ended search in the fall of 2018 therefore is calculated at 1.67% (see table 5). table 6 below shows the percentage results for short queries, and table 7 for full queries. for information about the raw source numbers and target data, please see the open science framework project site.21

when all query types and faceting modes are considered, the matching success rate almost uniformly increased from fall 2018 to spring 2019. the largest difference in matching success rate was observed in the full-query articles only search at 8.91% as shown in table 7. the open-ended search observed the smallest difference in positive movement and the anomaly of a diminishing success rate.
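The matching success rate is a simple proportion, and a short sketch makes the denominators explicit. The function name is illustrative. The open-ended figure (six matches out of 360) comes directly from the text; the per-facet denominators are an inference from table 4 (for instance, 4 matches out of 84 journal articles gives 4.76%, the value later reported for the fall 2018 natural language articles only search) rather than something the authors state outright.

```python
def matching_success_rate(matches, possible):
    """Matched citations as a percentage of the citations that could match."""
    return 100.0 * matches / possible


# Fall 2018 denominators. The open-ended value is the semester citation
# count; the faceted values are the table 4 counts for the corresponding
# resource type (an inferred mapping, not stated by the authors).
FALL_2018_POSSIBLE = {
    "open-ended": 360,
    "articles": 84,
    "books": 107,
    "newspapers": 63,
}

# Worked example from the text: six open-ended matches for the
# natural language query in fall 2018.
open_ended_rate = matching_success_rate(6, FALL_2018_POSSIBLE["open-ended"])

# Inferred illustration: 4 matches against 84 journal articles.
articles_rate = matching_success_rate(4, FALL_2018_POSSIBLE["articles"])
```

Rounded to two decimals, `open_ended_rate` is 1.67, the value the study reports for this search.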
across the natural language and full-query types the open-ended search exhibited the least amount of positive difference in success rate, at 1.04% and 0.26% respectively, and the short-query open-ended search had a small negative change in success rate at −0.36%.

table 5. natural language query results

success rate | fall 2018 | spring 2019 | % difference
open-ended search | 1.67% | 2.70% | 1.04%
articles only | 4.76% | 9.09% | 4.33%
books only | 3.74% | 11.46% | 7.72%
newspapers only | 0.00% | 1.67% | 1.67%

table 6. short-query results

success rate | fall 2018 | spring 2019 | % difference
open-ended search | 3.33% | 2.97% | −0.36%
articles only | 3.57% | 5.05% | 1.48%
books only | 9.35% | 10.42% | 1.07%
newspapers only | 0.00% | 3.33% | 3.33%

table 7. full-query results

success rate | fall 2018 | spring 2019 | % difference
open-ended search | 0.56% | 0.81% | 0.26%
articles only | 1.19% | 10.10% | 8.91%
books only | 0.93% | 5.21% | 4.27%
newspapers only | 0.00% | 5.00% | 5.00%

total unique matches

across all three search strategies and their four iterations, the researchers also note a raw count of matches which helps to determine how an overall search strategy is performing at finding matching citations. as the reader might expect, this metric includes a matching citation once across all four iterations of a search strategy. that is, even if a source citation appears in both the open-ended search and the books only search, that source citation is only counted once for the purpose of this metric.

for example, in the natural language query in fall 2018, six citations were matched in the open-ended search. four of the citations were articles and two were books. some of the matches in the articles and books searches were redundant to the open-ended search. considering only unique matches in the articles, books, and newspaper searches, the authors calculated the total number of unique matches.
when the target searches were compared, the researchers matched two additional citations in the books only citations list. when the authors add the two additional matches, there were a total of eight unique citation matches across all iterations of the natural language search (open-ended search, books only, articles only, newspapers only). the total unique matches number and the corresponding success rate of the total unique matches for each search strategy is shown in table 8.

table 8. total unique matches

search strategy | fall 2018 | spring 2019 | % difference
natural language query | 8 (2.22%) | 22 (5.95%) | 3.72%
short query | 14 (3.89%) | 16 (4.32%) | 0.44%
full query | 3 (0.83%) | 18 (4.86%) | 4.03%

matches added by faceting

another metric used to measure overall effectiveness of faceted searching is the percentage of matching citations that are new to the results list when limited to a certain resource type: matches added by faceting. that is, which matching citations were not present in the open-ended search results but are then matched when the results list is reduced to only a single resource type. in table 9, the percentage of matches that are new and only to be found in a targeted search result varies greatly. between both semesters and among all search iterations, the smallest percentage of matches added by faceting is 14.29% and the largest is 83.33%.

table 9. matches added by faceting

search strategy | fall 2018 | spring 2019 | % difference
natural language query | 2 (25.00%) | 12 (54.55%) | 29.55%
short query | 2 (14.29%) | 5 (31.25%) | 16.96%
full query | 1 (33.33%) | 15 (83.33%) | 50.00%

comparing search strategies

the matching success rate across search strategies (natural, short, full) and iterations is a mixed result and does not allow for very useful comparison beyond descriptions of difference which are outlined in the comparison tables (tables 5–7).
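Both metrics in tables 8 and 9 reduce to set operations, which the following sketch illustrates for the fall 2018 natural language example. The citation identifiers and the exact overlap between searches are made up for illustration; only the totals (six open-ended matches, two further matches from the books only search, eight unique matches) come from the text.

```python
def summarize_matches(open_ended, faceted):
    """Return (total unique matches, matches added by faceting,
    added matches as a percentage of total unique matches)."""
    faceted_union = set().union(*faceted.values())
    unique = set(open_ended) | faceted_union
    added = faceted_union - set(open_ended)
    share = 100.0 * len(added) / len(unique) if unique else 0.0
    return len(unique), len(added), share


# Hypothetical citation ids: four articles and two books matched open-ended.
open_ended = {"a1", "a2", "a3", "a4", "b1", "b2"}

# Hypothetical faceted results: some repeat open-ended matches,
# and the books only search contributes two new ones (b3, b4).
faceted = {
    "articles": {"a1", "a2"},
    "books": {"b1", "b3", "b4"},
    "newspapers": set(),
}

total_unique, added_by_faceting, added_share = summarize_matches(open_ended, faceted)
```

With these inputs the result is 8 unique matches, 2 added by faceting, and a share of 25.0%, matching the fall 2018 natural language row of tables 8 and 9. The same arithmetic reproduces the spring 2019 row: 12 of 22 is 54.55%.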
to better compare the search strategies as a whole, as opposed to how a particular iterative search performed relative to another open or targeted search, the researchers used a weighted success rate of the total unique matches from both semesters as the proxy for overall performance and the point of comparison among the three search strategies. the comparison of this weighted success rate shows no difference in overall success rate between the natural language query (4.11%) and the short query (4.11%). the search strategy that was demonstrably different in weighted success rate is the full query at a lagging 2.88%. see table 10 for comparison and calculation details.

table 10. weighted success rate of total unique matches

search strategy | calculation | weighted success rate
natural language query | ((2.22%*360)+(5.95%*370))/730 | 4.11% (0.04109589)
short query | ((3.89%*360)+(4.32%*370))/730 | 4.11% (0.04109589)
full query | ((0.83%*360)+(4.86%*370))/730 | 2.88% (0.02876712)

discussion

how effective is primo at returning relevant results?

according to the preliminary findings, primo is relatively ineffective at providing search results that match the citations used by the student researchers. the matching success rates of the open-ended searches range from 0.56% to 3.33%. the possible reasons for these low numbers are numerous and varied: students may have intended to use sources in the researchers’ auto-generated results lists but been unable to locate the full text; students may have found open internet sources outside the discovery layer; and open-ended searches may have been flooded with rarely cited reference materials and very contemporary newspaper articles (see more about these ideas below). future research aims to understand more clearly which potential factors are present and to what degree they impact the matching success rates.
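The weighted success rates in table 10 can be checked with a short calculation. Weighting each semester's rate by its citation count is equivalent to pooling the raw unique matches over all 730 citations, so the sketch below works from the counts in table 8 rather than the rounded percentages; the function name is illustrative.

```python
def weighted_success_rate(semesters):
    """Weight each semester's success rate by its citation count.

    `semesters` is a list of (unique_matches, citation_count) pairs.
    """
    total_citations = sum(count for _, count in semesters)
    weighted_sum = sum((matches / count) * count for matches, count in semesters)
    return weighted_sum / total_citations


# (total unique matches, semester citation count) from tables 8 and 4.
natural_language = weighted_success_rate([(8, 360), (22, 370)])
short_query = weighted_success_rate([(14, 360), (16, 370)])
full_query = weighted_success_rate([(3, 360), (18, 370)])
```

The natural language and short queries tie because both total 30 unique matches across the two semesters (8 + 22 and 14 + 16), even though their semester-by-semester patterns differ.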
to what extent does faceting improve search results?

faceting within primo leads to better results, although the matching success rates remain low. the faceted searches contain the only matching success rates above ten percent: 10.10% (full query, articles only), 10.42% (short query, books only), and 11.46% (natural language query, books only). the data shows that the majority of unique matches found by the 2019 full-length and natural language search strategies occur within the faceted searches (83.33% and 54.55%, respectively). it is interesting to note that these represent the two longer query strings, on average. future testing will reveal whether there is a relationship between query length and percentage of matches added by faceting.

which search strategies are the most effective within the given framework?

looking at the search strategies holistically, the researchers note that the total unique matches increased from fall 2018 to spring 2019 across all three query types. this increase was expected behavior, partially due to the fact that primo relevancy ranking algorithms assume that patrons prefer newer materials.22 the weighted success rate is an attempt to understand each search strategy’s performance over the 2018–2019 academic year, as opposed to comparing one semester to the other. by this metric, the consistent short-length query is as effective as the more variable natural language query. the researchers are looking forward to adding more data to this metric to understand in which direction the average might move.

how to refine the framework for future investigations into relevancy

the most popular resource types used in the source citations were books, journal articles, web documents, and newspaper articles. together, these categories comprised approximately 93% of all resource types in both fall 2018 and spring 2019.
however, not all areas were equally accessible within washington state university’s discovery layer configuration. the heavy reliance on web documents in the source citations was somewhat problematic, given the fact that web documents did not constitute a faceted resource type in wsu libraries’ primo prior to this study. therefore, the authors will need to better account for web documents in future testing.

the assessment of newspaper articles also proved to be problematic, given their proclivity to inundate primo search results with numerous and recent documents. the sheer number of newspaper articles published and indexed every year in primo for general and introductory topics can dilute the pool of possible target citations greatly. for example, a scan of the matching newspaper articles reveals that 67% (4/6) were published in 2018. in future studies, the researchers will limit publication dates for target citations to the appropriate time period (e.g., an upper limit of may 2019 would be placed on publication dates for papers written in spring 2019) or collect data closer to the submission of research papers.

in 11 out of 12 cases, matching success rates were better in spring 2019 than fall 2018, most likely due to recency. it is common for discovery environments, and true for the environment used in this study, to present content sorted by relevance and then publication date. therefore, the researchers expected to and did find an increased matching success rate closer to the date of testing, with the one exception of the short-length, open-ended search query. this anomaly led researchers to dig more deeply into the target citations to see if a cause could be determined. researchers found a larger than expected number of citations for resource types that are underrepresented in source citations.
for example, the reference entry resource type surfaced prominently in the open-ended search for several of the queries, diluting the pool of target citations with entries that had little chance of appearing in the source citation lists. in one standout example, there were four separate reference entries titled simply “taiping rebellion.” the discovery environment gave preference to these four separate reference entries over other, more substantive works that are more likely to be cited in an academic paper. the researchers surmise this is partly a function of the relevancy ranking algorithm, which gives greater weight to matches in the title, author, and subject fields.23 depending on the search and the configuration of the discovery environment, it is possible that reference entries would push other results from books, articles, and newspaper resource types farther down the results list, making them less and less visible in an open-ended search for a given topic. this dilution of the target citations with resource types that are not emphasized or widely used in source citations is another area the researchers aim to isolate and examine in further rounds of testing.

in addition to the source recency and particular source type issues explained above, the authors did not take into account source availability, nor where students found their sources; both remain confounding factors on matching success rate. subsequent studies will capture whether sources are present in the local deployment of primo during the time frame the students were conducting research. this issue will be further addressed and mitigated by analyzing urls provided within student source citations.

implications of this study for end users

the matching success rate in the open-ended search when compared to the type-limited searches leads to a discussion of how to define and present the default search of the discovery environment to best serve an academic population.
more pointedly, it opens the discussion of what resource types to include within that default search to return the most relevant and useful results and not just the most results. in this case, the argument could be made that excluding several resource types (e.g., reference entries) would surface resources that are more likely to be cited in a researcher’s scholarship. based on the number of matches that were introduced by performing a faceted search, it is evident that researchers still need a search strategy that includes search filters and limiters (prior to or following the initial search) and other search tactics in a discovery environment to return relevant results. the notion that an open-ended “one and done search,” for even the most introductory of topics, will be successful in retrieving many usable and citable resources in the first page or two of results is not supported by the results of this study.

conclusions and next steps

as the common adage goes, “it’s not what you say, it’s what you do.” in this study, the saying applies as the researchers move beyond what sources students think are relevant to the sources students ultimately use in their papers. the current slate of discovery environment research projects focuses largely on users’ affective connections to discovery environments, often compared to other kinds of academic databases, and places users in temporary, hypothetical research scenarios in order to judge source relevance.24 by contrast, the rci research project is a term-length (10–14 weeks) venture; students have a significant amount of time, and the aid of a scaffolded set of assignments, to bolster their source relevance assessment skills and authority.
methodologies which closely mirror the authentic experiences and curriculum of the students are those which arguably will provide a more accurate picture of the value of the discovery environment in an academic setting.

the authors of this study took the first steps in building a relevancy rating system for discovery environments. to standardize their preliminary results, they generated four metrics: matching success rate, total unique matches, matches added by faceted search, and weighted success rate. while the results of this study do not allow the researchers to draw statistical conclusions regarding the dominance of one search strategy over another in returning relevant results, the frequencies showed a better match (success) rate with faceted than non-faceted searching. discovery environments are commonly advertised as providing an easy to use, one-stop location for academic research needs, but the reality is more complex. students need to engage these systems with multiple search refinements to find valuable materials.

this investigation was also the initial attempt to create a machine-generated framework to test the relevancy of a web-based discovery environment’s results. as the authors look to build upon this preliminary study, there are several avenues to pursue that will enhance the methodology of the framework. one avenue is a refinement of the boundaries of the testing framework. this boundary refinement includes a re-examination of the criteria for inclusion in both the source citations and the search results list. in the current study, all student citations were deemed viable regardless of whether the source citation could be verified and accessed. this led to the inclusion of citations of lecture notes and other such materials that are not generally expected to appear in a discovery environment. the authors will also re-examine the inclusion of newspapers and reference works in open-ended searching.
these two resource types are large in number, are not indexed very well, and often do not have descriptive titles. a portion of the next round of research will be dedicated to comparative testing (a/b) of generally deployed discovery environment configurations. another avenue of exploration is determining where in the results list a citation appears, not just the binary positive or negative, and measuring any impact based on behavior of the search (i.e., search construction) or behavior and configuration of the discovery environment. refining the methodology of the current framework will result in fewer potentially confounding factors and allow librarians to regain an understanding of relevancy when it comes to teaching discovery layers to student researchers. these next steps will contribute to the overall picture concerning the value and efficacy of web-based discovery environments that is steadily taking shape.

endnotes

1 marshall breeding, “library technology guides: academic members of the association of research libraries: index-based discovery services,” library technology guides, https://librarytechnology.org/libraries/arl/discovery.pl.
2 “student learning goals,” washington state university common requirements, 2018, https://ucore.wsu.edu/about/learning-goals.
3 “welcome to the roots of contemporary issues,” washington state university department of history, 2017, https://ucore.wsu.edu/faculty/curriculum/root/.
4 “search it,” washington state university libraries, 2020, https://searchit.libraries.wsu.edu/.
5 megan oakleaf and neal kaske, “guiding questions for assessing information literacy in higher education,” portal: libraries and the academy 9, no. 2 (2009): 277, https://doi.org/10.1353/pla.0.0046.
6 oakleaf and kaske, “guiding questions.”
7 marlen prommann and tao zhang, “applying hierarchical task analysis method to discovery layer evaluation,” information technology and libraries 34, no. 1 (2015): 97, https://doi.org/10.6017/ital.v34i1.5600.
8 rice majors, “comparative user experiences of next-generation catalogue interfaces,” library trends 61, no. 1 (2012): 186–207, https://doi.org/10.1353/lib.2012.0029; david comeaux, “usability testing of a web-scale discovery system at an academic library,” college & undergraduate libraries 19, no. 2–4 (2012): 199, https://doi.org/10.1080/10691316.2012.695671; greta kliewer et al., “using primo for undergraduate research: a usability study,” library hi tech 34, no. 4 (2016): 566–84, http://doi.org/10.1108/lht-05-2016-0052; blake galbreath, corey m. johnson, and erin hvizdak, “primo new user interface,” information technology and libraries 37, no. 2 (2018): 10–33, https://doi.org/10.6017/ital.v37i2.10191.
9 heather dalal, amy kimura, and melissa hofmann, “searching in the wild: observing information-seeking behavior in a discovery tool” (association of college & research libraries 2015 conference proceedings, march 25–28, 2015): 668–75, http://www.ala.org/acrl/sites/ala.org.acrl/files/content/conferences/confsandpreconfs/2015/dalal_kimura_hofmann.pdf.
10 comeaux, “usability testing”; xi niu, tao zhang, and hsin-liang chen, “study of user search activities with two discovery tools at an academic library,” international journal of human-computer interaction 30, no. 5 (2014): 422–33, https://doi.org/10.1080/10447318.2013.873281; kevin patrick seeber, “teaching ‘format as a process’ in an era of web-scale discovery,” reference services review 43, no. 1 (2015): 19–30, https://doi.org/10.1108/rsr-07-2014-0023; kylie jarret, “findit@flinders: user experiences of the primo discovery search solution,” australian academic & research libraries 43, no.
4 (2012): 278–99, https://doi.org/10.1080/00048623.2012.10722288; aaron nichols et al., “kicking the tires: a usability study of the primo discovery tool,” journal of web librarianship 8, no. 2 (2014): 172–95, https://doi.org/10.1080/19322909.2014.903133; kelsey renee brett, ashley lierman, and cherie turner, “lessons learned: a primo usability study,” information technology and libraries 35, no. 1 (2016): 7–25, https://doi.org/10.6017/ital.v35i1.8965; galbreath, johnson, and hvizdak, “primo new user interface.”
11 zebulin evelhoch, “where users find the answer: discovery layers versus database,” journal of electronic resources librarianship 30, no. 4 (2018): 205–15, https://doi.org/10.1080/1941126x.2018.1521092.
12 boram lee and eunkyung chung, “an analysis of web-scale discovery services from the perspective of user’s relevance judgement,” journal of academic librarianship 42 (2016): 529–34, https://doi.org/10.1016/j.acalib.2016.06.016.
13 sarah p. c. dahlen and kathlene hanson, “preference vs.
authority: a comparison of student searching in a subject-specific indexing and abstracting database and a customized discovery layer,” college & research libraries 78, no. 7 (2017): 878–97, https://doi.org/10.5860/crl.78.7.878.
14 stefanie buck and christina steffy, “promising practices in instruction of discovery tools,” communications in information literacy 7, no. 1 (2013): 66–80, https://doi.org/10.15760/comminfolit.2013.7.1.135; anita k. foster, “determining librarian research preferences: a comparison survey of web-scale discovery systems and subject databases,” journal of academic librarianship 44 (2018): 330–36, https://doi.org/10.1016/j.acalib.2018.04.001.
15 diane cmor and xin li, “beyond boolean, towards thinking: discovery systems and information literacy,” 2012 iatul proceedings, paper 7, https://docs.lib.purdue.edu/iatul/2012/papers/7/; kliewer et al., “using primo”; alexandra hamlett and helen georgas, “in the wake of discovery: student perceptions, integration, and instructional design,” journal of web librarianship 13, no. 3 (2019): 230–45, https://doi.org/10.1080/19322909.2019.1598919.
16 courtney lundrigan, kevin manuel, and may yan, “‘pretty rad’: explorations in user satisfaction with a discovery layer at ryerson university,” college & research libraries 76, no. 1 (2015): 43–62, https://doi.org/10.5860/crl.76.1.43.
17 hamlett and georgas, “in the wake of discovery.”
18 hugo c. huurdeman, mikaela aamodt, and dan michael heggo, “‘more than meets the eye’—analyzing the success of user queries in oria,” nordic journal of information literacy in higher education 10, no. 1 (2018): 18–36, https://doi.org/10.15845/noril.v10i1.270.
19 huurdeman, aamodt, and heggo, “more than meets the eye.”
20 christina stohn, “how do users search and discover?: findings from ex libris user research,” ex libris, 2015, https://www.exlibrisgroup.com/blog/ex-libris-user-studies-how-do-users-search-and-discover/.
21 alex merrill and blake l. galbreath, “a framework for measuring relevancy in discovery environments,” 2020, https://osf.io/ve3kp/.
22 “primo search discovery: search, ranking, and beyond,” ex libris, 2015, https://www.exlibrisgroup.com/products/primo-discovery-service/relevance-ranking/.
23 “primo search discovery,” 3.
24 lee and chung, “an analysis of web-scale discovery services”; dahlen and hanson, “preference vs. authority”; lundrigan, manuel, and yan, “pretty rad”; hamlett and georgas, “in the wake of discovery.”
112 journal of library automation vol. 14/2 june 1981

anyway because he is primarily getting suggested classification numbers in order to browse.

the tucson public library could not have made the above decisions if it did not have a complete online file of all its holdings (including even reference materials that never circulate). but since this data did exist (after a five-year bar-coding effort) and since more than forty online terminals were already in place throughout the library system to access the online file, the decision not to include locations or holdings in the microform catalog seemed reasonable.

in the longer-range future (1990?), it is very likely that the entire catalog will be available online. in the meantime, the tucson public library did not want to divide its resources maintaining two location records, but rather wanted to concentrate resources in maintaining one accurate record of locations available as widely as possible throughout the library system (by installing more online terminals for staff and public use).

was this decision a sound one? we don't know. the microform catalog has not yet been introduced for public use. by the end of this year we should have some preliminary answers to this question.

references

1. robin w. macdonald and j. mcree elrod, "an approach to developing computer catalogs," college & research libraries 34:202–8 (may 1973).

a structure code for machine readable library catalog record formats

herbert h. hoffman: santa ana college, santa ana, california.

libraries house many types of publications in many media, mostly print on paper, but also pictures on paper, print and pictures on film, recorded sound on plastic discs, and others. these publications are of interest to people because they contain recorded information.
more precisely said, because they contain units of intellectual, artistic, or scholarly creation that collectively can be called "works." one could say simply that library materials consist of documents that are stored and cataloged because they contain works.

the structure of publications into documents (or "books") and works, the clear distinction between the concept of the information container as opposed to the contents, deserves more attention than it has received so far from bibliographers and librarians. the importance of the distinction between books and works has been hinted at by several theoreticians, notably lubetzky. however, the idea was never fully developed. the cataloging implications of the structural diversity among documents were left unexplored. as a consequence, librarians have never disentangled the two terms book and work. from the paris principles and the marc formats to the new second edition of the anglo-american cataloguing rules, the terms book and work are used loosely and interchangeably, now meaning a book, now a work proper, now part of a work, now a group of books.

such ambiguity can be tolerated as long as each person involved knows at each step which definition is appropriate when the term comes up. but as libraries ease into the age of electronic utilities and computerized catalogs based on records read by machine rather than interpreted by humans, a considerably greater measure of precision will have to be introduced into library work. as one step toward that goal an examination of the structure of publications will be in order.

the items that are housed in libraries, regardless of medium, are of two types. they are either single documents, or they are groups of two or more documents. items that contain two or more documents are either finite items (all published at once, or with a first and a last volume identified) or they are infinite items (periodicals, intended to be continued indefinitely at intervals).
schematically, these three types of bibliographic items in libraries can be represented as shown in figure 1. it should be noted that all publications, all documents, all bibliographic items in libraries, can be assigned to one of these three structures. there are no exceptions.

fig. 1. three types of bibliographic items: top, single-document item; center, finite multiple-document item; bottom, infinite multiple-document item.

all bibliographic items, furthermore, contain works. an item may contain one single work. but an item may also contain several works. schematically, the two situations can be represented as shown in figure 2.

fig. 2. top, single-work document (example: a typical novel); bottom, multiple-work document (example: a collection of plays).

an item that is composed of several documents and contains several works may have one work in each document, or several per document. schematically, the two possibilities can be represented as shown in figure 3.

fig. 3. top, one work per document; bottom, several works per document.

it is possible, of course, for an item to be composed of several documents but to contain only one work. figure 4 is a schematic representation of this case. mixed structures are also possible, as in the schematic shown in figure 5. ignoring the mixed structure that is only a combination of two "pure" structures, the foregoing information can be combined into a table that shows seven possible publication types that differ from each other in terms of structure (figure 6).

all bibliographic items, whether composed of one document or many, are known by a title. these titles can be called item titles. in the case of a single-document item (structures a and c), item title and document title are, of course, identical.
but in the case of some multiple-document items (publications of types d, e, f, and g, for example), two possibilities exist: the documents that make up the item may or may not have their own individual document titles. for purposes of the bibliographer or cataloger, items that consist of several documents bearing individual document titles can be described under one of two principles. the entire item can be treated as a unit. elsewhere i have coined a term for this treatment: the set description principle.1 but it is also possible to treat each document as a separate publication, to describe it under the book description principle. if we combine all these considerations we find that we can assign to each bibliographic item that is added to a library's collection one of the thirteen codes shown in figure 7.

fig. 4. multivolume work (example: a very long novel in two volumes).

fig. 5. finite multi-document item containing many works, mixed structure.

fig. 6. publication types a through g, arranged by documents per item (one document per item, several documents per item, infinite) and by works (one work per item; several works per item, with several works per document or one work per document).

how can these codes be useful? taking a look into the future, let us imagine an online catalog system supported by a database that contains the records of a library's holdings. the records in such a database are entered in a definite format. in this format, whatever it will be called, there will be data fields for titles, authors, physical descriptions, subject headings, document numbers, and much else. i propose that to these fields one other be added: the structure code.

the structure code would add a new dimension to the retrieval of recorded information. here are a few specific examples. consider a search for material on subject x. qualify the search argument by structure codes 1, 3, 7, and 12.
result: the search will yield only major monographic works, defined as items of types a, b, f, and g. note that subject x assigned to such items is a true subject heading. the materials retrieved in this example would all be works dealing specifically with the topic x. but the same term assigned to an item coded, say, 6, would not be a true subject heading. the term here would only give a broad general summary of what the works in the item are about. the structure code adds sophistication to the retrieval process by enabling a searcher to distinguish between specific subject designators and mere summary subject headings.

a search that excludes codes 2, 4, 5, and 6 limits output to materials that are not just collections of essays. the stratagem used in card catalogs to reach the same result is the qualification of a subject heading by terms denoting format, such as the subdivisions congresses or addresses, essays, lectures. this method of qualifying subject headings has never been done consistently, however. the proposed structure code would ensure uniform treatment of all affected publications.

qualify the search by codes 9, 10, 11, 13 and all periodicals can be excluded. in the card catalog, format qualifications such as periodicals, or societies, periodicals, etc., or yearbooks are sometimes added to subject headings to reach similar results. again, the structure code would introduce uniformity and consistency.

fig. 7. structure codes.

structure code   publication type   description principle: book (b) or set (s)
1                a                  b
2                c                  b
3                b                  s
4                d                  b
5                d                  s, with individual document title
6                d                  s, without individual document title
7                f                  b
8                f                  s
9                e                  b
10               e                  s, with individual document title
11               e                  s, without individual document title
12               g                  b
13               g                  s

present-day card catalogs list publications only. they do not list the individual works that may be contained in publications.
if an analytic catalog were to be built into a computerized system at some time in the future, the structure code would be a great help in the redesign, because it makes it easy to spot items that need analytics, namely those that contain embedded works, or codes 2, 4, 5, 6, 8, 9, 10, 11, and 13. a searcher working with such an analytic catalog could use the code to limit output to manageable stages: first all items of type c, for example; then broadening the search to include those of type d; and so forth, until enough relevant material has been found.

the structure code would also be useful in the displayed output. if codes 5 or 8 appeared together with a bibliographic description on the screen, this would tell the catalog user that the item retrieved is a set of many separately titled documents. a complete list of those titles can then be displayed to help the searcher decide which of the documents are relevant for him. in the card catalog this is done by means of contents notes. not all libraries go to the trouble of making contents notes, though, and not all contents notes are complete and reliable. the structure code would ensure consistency and completeness of contents information at all times. codes 10 and 13 in a search output, analogously, would tell the user that the item is a serial with individual issue titles. there is no mechanism in the contemporary card catalog to inform readers of those titles. codes 4 and 7 would tell that the document is part of a finite set, and so forth.

it has been the general experience of database designers that a record cannot have too many searchable elements built into its format. no sooner is one approach abandoned "because nobody needs it," than someone arrives on the scene with just that requirement. it can be anticipated, then, that once the structure code is part of the standard record format, catalog users will find many other ways to work the code into search strategies.
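
the qualified searches described above can be sketched in a few lines of python. this is only an illustration under stated assumptions: the code-to-type pairs follow fig. 7 and the analytics codes follow the text, but the `Record` class, the `search` function, and the sample holdings are hypothetical and belong to no existing catalog system or record format.

```python
from dataclasses import dataclass

# Structure code -> publication type, copied from the pairs listed in fig. 7.
CODE_TO_TYPE = {
    1: "a", 2: "c", 3: "b", 4: "d", 5: "d", 6: "d", 7: "f",
    8: "f", 9: "e", 10: "e", 11: "e", 12: "g", 13: "g",
}

# Codes the text names as containing embedded works (items needing analytics).
NEEDS_ANALYTICS = frozenset({2, 4, 5, 6, 8, 9, 10, 11, 13})


@dataclass(frozen=True)
class Record:
    """Hypothetical catalog record; a real format carries many more fields."""
    title: str
    subject: str
    structure_code: int


def search(records, subject, include=None, exclude=None):
    """Return records on `subject`, optionally qualified by structure codes."""
    hits = []
    for record in records:
        if record.subject != subject:
            continue
        if include is not None and record.structure_code not in include:
            continue
        if exclude is not None and record.structure_code in exclude:
            continue
        hits.append(record)
    return hits


# Invented sample holdings, for illustration only.
catalog = [
    Record("a single treatise", "x", 1),
    Record("collected essays", "x", 2),
    Record("a two-volume study", "x", 3),
    Record("a set without document titles", "x", 6),
    Record("a quarterly review", "x", 10),
    Record("a treatise on another topic", "y", 1),
]

# Major monographic works on subject x: codes 1, 3, 7, and 12.
major_works = search(catalog, "x", include={1, 3, 7, 12})

# Everything on subject x that is not just a collection: drop codes 2, 4, 5, 6.
not_collections = search(catalog, "x", exclude={2, 4, 5, 6})

# Items that would need analytics in an analytic catalog.
to_analyze = [r for r in catalog if r.structure_code in NEEDS_ANALYTICS]

print([r.title for r in major_works])
```

keeping the structure code as a plain integer field means any combination of codes can qualify a search without changes to the record itself, which is the flexibility the proposal anticipates.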
it can also be anticipated that the proposed structure code, by adding a factor of selectivity, will help catalogers because it strengthens the authority-control aspect of machine-readable catalog files. if two publications bear identical titles, for example, and one is of structure 1, the other of structure 6, then it is clear that they cannot possibly be the same items. however, if they are of structures 1 and 7, respectively, extra care must be taken in cataloging, for they could be different versions of the same work.

determination of the structure of an item is a by-product of cataloging, for no librarian can catalog a book unless he understands what the structure of that book is: one or more works, one or more documents per item, open or closed set, and so forth. it would therefore be very cheap at cataloging time to document the already-performed structure analysis and express this structure in the form of a code.

references

1. herbert h. hoffman, descriptive cataloging in a new light: polemical chapters for librarians (newport beach, calif.: headway publications, 1976), p.43.

revisions to contributed cataloging in a cooperative cataloging database

judith hudson: university libraries, state university of new york at albany.

introduction

oclc is the largest bibliographic utility in the united states. one of its greatest assets is its computerized database of standardized cataloging information. the database, which is built on the principle of shared cataloging, consists of cataloging records input from library of congress marc tapes and records contributed by member libraries.

oclc standards

in order to provide records contributed by member libraries that are as usable as those input from marc tapes, it is im

pearls marmion, dan information technology and libraries; mar 2000; 19, 1; proquest pg. 53

pearls

ed.
note: "pearls" is a new section that will appear in these pages from time to time. it will be ital's own version of the "top technology trends" topic begun by pat ensor. these pearls might be gleaned from a variety of places, but most often will come from discussion lists on the net. our first pearl, from thomas dowling, appeared on web4lib on august 19, 1999 under the subject "pixel sizes for web pages." he is responding to a query that asked if web site developers should assume the standard monitor resolution is 640x480 pixels, or something else.

dan marmion

from: thomas dowling
to: multiple recipients of list
sent: thu, 19 aug 1999 06:07:08 -0700 (pdt)
subject: [web4lib] pixel sizes for web pages

you may want to consult the web4lib archive for comments from the last few merry-go-rounds on this topic.

monitor size in inches is different from monitor size in pixels, which is different from window size in pixels, which is different from the rendered size of a browser's default font. not only are these four measurements different, they operate almost wholly independently of each other. so a statement like "i have trouble reading text at 600x800" puts the blame in the wrong place.

html inherently has no sense of screen or window dimensions. many web designers will argue that the only aspects to a page with fixed pixel dimensions should be inline images; such designers typically restrain their use of images so that no single image or horizontal chain of images is wider than, say, 550px (with obvious exceptions for sites like image archives where the main purpose of a page is to display a larger image). outside of images, find ways to express measurements relative to window size (percentages) or relative to text size (ems).

users detest horizontal scrolling.
in my experience, users with higher screen resolutions and/or larger monitors are less likely to run any application full screen; average window size on a 1280x1024 19" or 21" monitor is very likely to be less than 800px wide. (the browser window i currently have open is 587px wide and 737px high.)

i applaud your decision to support web access for the visually impaired. since that entails much, much more than monitor resolution, i trust the people actually writing your pages are familiar with the web content accessibility guidelines.

it is actually possible to design web sites that are equally usable, even equally beautiful, under a wide range of viewing conditions. failing to accomplish that completely is understandable; failing to identify it as a goal is not.

my recommendations to your committee would be a) find a starting point that isn't tied up in presentational nitpicking; b) find a design that looks attractive anywhere from 550 to 1550 pixels wide; c) crank up both your workstations' resolution and font size; and d) continue to run your browsers in windows that are approximately 600 to 640 pixels wide.

thomas dowling
ohiolink
ohio library and information network
tdowling@ohiolink.edu

the recon pilot project: a progress report

henriette d. avram: project director, information systems office, library of congress, washington, d. c.

a synthesis of the progress report submitted by the library of congress to the council on library resources under an officers grant to initiate the recon pilot project that gives an overview of the project and the progress made from august-november 1969 in the following areas: training, selection of material to be converted, investigation of input devices, and format recognition.

introduction

the recon pilot project is an effort to analyze the problems of large-scale conversion of retrospective catalog records through the actual conversion of approximately 85,000 non-current records.
this project has grown directly out of the implementation of the marc distribution service. libraries considering the use of machine readable records for their current materials have naturally begun to consider conversion of their older records as well. some libraries have even begun such conversion projects. since the library of congress is also interested in the feasibility of converting its own retrospective records, it seemed appropriate to explore the possibility of centralized conversion of retrospective cataloging records and their distribution to the entire library community from a central source. a proposal having been submitted by the library of congress to the council on library resources, inc. (clr), the council granted funds for a study of this problem. an advisory committee was appointed to provide guidance, and direct responsibility for the study and report (1) was assigned to a working task force. a recommendation of the working task force was the implementation of a pilot project to test the techniques suggested in the report in an operational environment. since any feasibility report, no matter how detailed, refers to a theoretical model, the recommended techniques should be tested to determine the most efficient method for a large-scale conversion activity. the advisory committee concurred with this recommendation. the library of congress submitted a proposal for a pilot project (hereinafter referred to as recon) to clr, and received an officer's grant in august 1969 to initiate recon while the council continued its evaluation of the full-scale pilot project. a progress report was submitted to clr by the library covering the period from mid-august to november 1, 1969. so that clr might have a clear understanding of the work in progress, the report addressed itself to both the areas of recon supported by the council and those activities supported by the library of congress.
in december 1969, clr awarded the library the funds requested for the entire pilot project. to make the library community cognizant of recon as quickly as possible, clr granted permission to modify the progress report for publication.
overview of the recon pilot project
the pilot project is concerned with the conversion and distribution of an estimated 85,000 english language titles: 22,000 titles cataloged in 1969 and not included in the marc distribution service, and 63,000 titles from 1968. the creation of this data base partially satisfies the conclusions and specific recommendations of the recon working task force as stated in the report (2): 1) there should be no conversion of any category (language or form of material) of retrospective records until that category is being currently converted; 2) the initial conversion effort should be limited to english language monograph records issued from 1960 to date and converted into machine readable form in reverse chronological order. (the marc distribution service covers current english language monographs cataloged by the library of congress.) in order to explore the problems encountered in encoding and converting cataloging records for older english language monographs, and monographs in other roman alphabet languages, 5,000 additional titles will be selected and converted. the library further intends to investigate, through the design and implementation of a format recognition program, the use of the computer to assist in the editing of cataloging records. this technique should significantly reduce the manpower needs of the present method of conversion and therefore have an impact on any future library of congress conversion activity, either of currently cataloged or retrospective titles. recon will include experimentation with microfilming and producing hard copy from the lc record set.
the record set in the lc card division consists of a master copy of the latest version of every lc printed card, arranged by card series and, 104 journal of library automation vol. 3/2 june, 1970 within each series, by card number. although a specific time period can be selected for conversion, the primary disadvantage of the record set for this purpose is the fact that not all changes in cataloging made to the lc official catalog are reflected in the record set. after considering all the alternatives, the recon working task force recommended (3) that the record set be used for selection of titles, but that the titles be compared with the official catalog and updated to insure bibliographic accuracy and completeness. since the record set is in constant use by card division personnel, the selected titles for conversion must be reproduced, and the original file reconstituted, as quickly as possible. the state of the art of direct-read optical character recognition devices suitable for large-scale conversion will be monitored and experimentation will be conducted with a variety of input devices. recon is closely related to the lc card division mechanization project, which is based upon the availability of records in machine readable form. recon will be closely coordinated with the card division project, both in the design of specifications for implementation and in the investigation of a common hardware/software configuration. the project was organized during august 1969. the first group of records being edited are those cataloged by the library of congress in 1969. in june 1970, the editing of the 1968 records will begin. 
since these records will have to be compared with the lc official catalog to record any changes, present thinking includes the design of a print program (referred to as a two-up print program) to cut printing time by providing a listing with records arranged in card number sequence (the order of input) and in alphabetic sequence by main entry on the same page. the records will be arranged by main entry to reduce the effort of checking them against the official catalog and the changed records will be inserted in their proper place in sequence by lc card number. the process of manual editing may be greatly reduced, or perhaps even eliminated, by october 1970, when the format recognition program is scheduled for completion. after this time, the records will be input with little or no prior tagging and further editing will be performed by the computer. the resulting records will be examined by the marc editors both for accuracy in transcription and for correctness in the assignment of marc tags, indicators, and subfield codes. the duration of the pilot project will be twenty-four calendar months, august 1969-august 1971. it is anticipated that by november 1970 enough data should be available to determine whether a full-scale conversion project should be undertaken. an early evaluation of the project is advantageous in order to explore the funding possibilities of a conversion effort if the results of the pilot are affirmative. figure 1 is a calendar indicating the major milestones of recon as postulated during august 1969.
[fig. 1. recon calendar, august 1969 to august 1971. milestones shown: project begins; production staff hired; iso staff organized; card division sends 1969 and 1968 cards; investigation of input devices and recon/card division hardware/software; training of editors; print index; study of reproduction methods for catalog records; analysis, editing, etc., of research titles; organization of cards for recon input; full editing of 1969 titles (16,000 records); analysis of system to convert 1968 titles; full editing of 1968 titles; ordering of new mtst's; hiring of new mtst typists; design and implementation of format recognition; use of format recognition on remainder of 1968 titles; conversion of marc i to marc ii and interim marc ii to marc ii; begin evaluation of pilot project; begin planning for continuation of project; begin writing final report.]
essentially the same advisory committee and working task force selected for the recon feasibility study have agreed to serve in their respective capacities for recon. the implementation of the library of congress' marc distribution service and the initiation of recon are providing the nucleus of a national bibliographic data base. creation of this data base is not in itself a panacea for libraries but, in fact, amplifies the need to explore some of the larger issues at this time to provide the direction for future cohesive library systems. certain aspects of the problems were discussed in general terms in the recon report but time did not permit full analysis. during the two-year period of recon, the working task force will consider some of those issues (defined as four tasks listed below) under the grant from clr. the ability to complete all of the tasks described will be dependent on additional funding, which, it is hoped, may be available early in 1970.
1) any national data store should have a data base in which all records are consistent.
it is possible, and highly probable, that libraries may convert bibliographic records for local use, which may not require the detail of a marc ii record. it is imperative that before levels of completeness of marc records are defined with respect to content and content designation, the implications of these definitions to future library networks be thoroughly explored.
2) any consideration of a national bibliographic data store in machine readable form should include the possibility of recording titles and holdings from other libraries. although the problems associated with a machine readable national union catalog are enormous, it is time to begin an exploration of the problems to provide guidance for future design efforts.
3) several institutions have begun the conversion of their cataloging records into machine readable form. the possibility of utilizing these records in building a national bibliographic data store should be investigated. this will involve evaluating the difficulty and cost of converting and upgrading records converted by others to a marc format as opposed to preparing original records.
4) the library of congress maintains, and is considering the conversion into machine readable form of, its name and subject authority files. many libraries have expressed interest in receiving these records in the present marc distribution service. little thought has been given to the storage and maintenance of these large files in each library subscribing to the marc distribution service. a library may not have in its collections a bibliographic record requiring either a name or subject cross reference record distributed by the library of congress. however, the library will keep the cross reference record because it cannot predict when a title will be added to the collection that does require the cross reference structure. the result will be the eventual storage and maintenance of the
entire lc name and subject reference files in each library. this problem should be explored to determine if there is a possible efficient method of libraries accessing these files from either a centralized source or several regional sources.
progress: august 1969 to november 1969
organization
the recon staff is divided into two sections: 1) the production section, responsible for the actual editing and keying of the records; and 2) the research and development section, responsible for liaison with the production section, determination of the criteria for the selection of the 1968 and 1969 titles, actual selection of the 5,000 research titles, investigation of input devices and photocopying techniques, liaison with the card division mechanization project, and the design and coding of special computer programs unique to recon. in addition, staff members of the marc project team in the information systems office (iso) are working in areas of format recognition and marc system programming that will affect recon.
training
the marc experience at the library of congress has demonstrated that staff members assigned to the editorial process of preparing catalog records for conversion to machine readable form must be exposed to cataloging fundamentals. phase i of the training program for the recon editors was a two-week cataloging class conducted by the supervisor of the production section, a professional librarian with experience in teaching cataloging principles at the library of congress. each day was formally structured into reading, discussion, and practice. the editor-trainees applied the anglo-american cataloging rules (4) to practice problems and to actual cataloging of books. experience in using the lc subject heading list, filing rules, and classification schedules was provided to a lesser extent.
in order to insure that the editor-trainees would have a wider range of experience in examining cataloging copy, the mnemonic marc tags and the simpler indicators and subfield codes were taught and used to identify explicitly cataloging elements on lc proofslips. phases ii and iii of the training, marc editing and correction procedures, were also taught by professional librarians. the editing class, which lasted two weeks, was divided into lecture sessions and laboratory sessions. each lecture period was from two to three hours; then, during the laboratory session, the instructions given in the lectures were applied to practice worksheets. the course covered input of variable and fixed fields, assignment of bibliographic codes for language and place of publication, and identification of diacritical marks included in the lc character set. phase iii of the training program, on correction procedures, was a one-week class covering the addition, deletion, and correction of entire records or data elements at the field level. the training period was followed by an intensive practice period using marc input worksheets, which were reviewed by the experienced editors.
selection of cards
the actual selection of the 1968 and 1969 titles is a joint effort by the card division staff and the recon staff. the procedures for the selection of cards from the card division for recon differ from those described in the original report. since only cards for 1968 and 1969 titles are being selected, it is more expedient to draw the cards from the card division card stock than to microfilm the record set. these cards will include all titles cataloged by the library of congress during 1968 and 1969 regardless of language or form of material, which will yield approximately 250,000 cards.
the cards are forwarded to the production section from the card division, where each record is inspected to determine whether it meets the criteria established for recon, i.e., all english language monographs with an lc catalog card number representing works cataloged by lc in 1968 and 1969 that are not already in machine readable form. the determination as to whether or not an item is in english is based upon the text, not the title page. an anthology of literature in spanish with a title page in english would not be included in recon; a book with text in english but title page in french would be included. if a book is multilingual (complete text in more than one language), the language of the first title determines inclusion or exclusion for recon. atlases are included, but not single maps or set maps. music or music scores are excluded, but books about music are included. records representing film strips, moving pictures, serials, and other kinds of materials not regarded as monographs are excluded. once the cards eligible for recon are selected and arranged in lc card number sequence, the cards are compared with the print index listing all records already in machine readable form. those records not in machine readable form are photocopied onto the input worksheet for editing and keying. to date, 60,000 cards have been selected by card division staff and forwarded to the production staff for further processing.
selection of research titles
an integral part of recon is the conversion of 5,000 titles to machine readable form for research purposes. ideally, these titles should serve not only the needs of recon but also be useful for some other purpose in the library of congress. these titles would include english language monographs cataloged before 1950, and foreign language material using the roman alphabet, and would be used to test various methods of input and certain aspects of the format recognition program.
the older material would represent records cataloged under earlier cataloging rules and would reveal problems in conversion in an area in which little information exists. two sources were initially considered for the selection of research titles: 1) titles in the main reading room collection for conversion into machine readable form for the production of book catalogs, and 2) the popular titles (cards ordered most frequently) of the card division mechanization project. a decision was made to study the titles in both sources with priority given to solution of conversion problems and to determine: 1) if overlap existed in records for both projects that would also serve the needs of recon; 2) if overlap did not exist, which titles (main reading room collection or card division popular titles) best served the needs of recon; and 3) if the titles in neither project were suitable, the method of selection to be used from the card division record set. the first task was a study of the characteristics of the main reading room collection. the collection consists of approximately 14,000 titles, and printed cards have been collected to compile a complete shelf-list catalog. these cards represent a wide range of material cataloged from 1900 to date. approximately one-fourth to one-third represent serials. the collection includes material in most of the roman alphabet languages currently processed at the library, the more common non-roman alphabet languages, such as russian, japanese, hebrew, etc., and a number of "difficult" titles, such as encyclopedias, dictionaries, etc., that would present a variety of cataloging and editing problems. the second task was a study of the popular titles from the card division. the card division provided a printout of card numbers for titles with 25 or more orders. there were 4,765 such card numbers listed with their corresponding number of orders. only 210 of these were for pre-1950 cards, and 97 of the 210 cards were for serial titles. 
only 15 out of the 210 cards were for "difficult" titles. another list was produced which contained card numbers for titles with ten or more orders. this list (with 39,148 card numbers) did produce more titles that would meet the research needs of recon. a sampling technique was designed by the technical processes research office to determine the percentage of overlap of this list with the titles in the main reading room reference collection. the estimated number of matches (15.5%) indicated that not enough overlap existed to consider a selection of titles that would serve the needs of both projects (main reading room collection and card division) and recon. therefore, the research titles are being selected from records for the reference collection. iso is working closely with staff members of the reference department on this project. the reference department is providing local information (e.g., local call number to locate the item in the reference collection as opposed to the lc call number which locates the item in the general collection) for all titles. as this process is completed, the responsible recon staff member is selecting the research titles. to date, "local" information has been added to 2,000 records, and 400 recon titles have been selected from this group of records.
computer programs
the only computer program implemented to date is the print index program. this program was required to check the records meeting the manual selection criteria for inclusion in recon against records in existing machine readable data bases to avoid duplicate input. print index lists by card number all records in machine readable form in either the marc i or marc ii data bases. at a later date, the 1968 titles found on the marc i data base will be processed by a subset of the format recognition program and converted to the marc ii processing format. the print index program is made up of two routines.
the lc catalog card number routine reads each record, extracts the lc card number and creates a magnetic tape file of numbers (called the print index tape). the tape created contains a card number right justified for machine sorting, a card number in the same form (zeros deleted) as the number on the printed card, and a data base code indicating the file in which the record originally resided (e.g., marc ii data base, marc ii practice tape, marc i data base). a parameter card is used to indicate which format and data base is to be processed. the ibm sort is used to arrange the output of the lc catalog card number routine into the following order: all 6x-series numbers, all 6x-series numbers with alphabetic prefixes (by year of cataloging, i.e., 1968 followed by 1969), all 7-series numbers (disregarding the check digit, the second digit in the number). the lc card number print routine prints the card numbers, which are in numeric sequence as described in the preceding paragraphs, from the print index tape. each page of the listing contains a heading, a running index, a date, and a page number. the program prints 200 card numbers and data base codes per page. the numbers are in ascending order, top to bottom in four columns of 50 numbers each.
format recognition
the experience of the library in the creation of machine readable cataloging records during the marc pilot project and the marc distribution service has clearly demonstrated that the highest cost factor of conversion is the human editing and proofing. the editing presently consists of assigning tags and codes to the bibliographic record to explicitly identify the content of the record for machine manipulation. the library has completed a format recognition feasibility study which concluded that the probability of success of automatically assigning tags and codes by computer is high.
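as an aside on the print index listing described above, the page layout (200 entries per page, in four columns of 50, ascending from top to bottom within each column) can be sketched in a few lines of modern code. this is only an illustration under stated assumptions, not the program run at the library: the function name `paginate`, the constants, and the sample card numbers are hypothetical, and the input is assumed to be already sorted.

```python
from typing import List

ROWS_PER_COLUMN = 50      # numbers per column, as stated in the report
COLUMNS_PER_PAGE = 4      # columns per page, as stated in the report
ENTRIES_PER_PAGE = ROWS_PER_COLUMN * COLUMNS_PER_PAGE  # 200 per page


def paginate(entries: List[str]) -> List[List[List[str]]]:
    """Split already-sorted entries into pages of printed rows.

    Each page is a list of rows; each row holds up to four entries.
    Entries run top to bottom within a column, then continue at the
    top of the next column (column-major order).
    """
    pages = []
    for start in range(0, len(entries), ENTRIES_PER_PAGE):
        chunk = entries[start:start + ENTRIES_PER_PAGE]
        rows = []
        for r in range(ROWS_PER_COLUMN):
            # Pick the r-th entry of each column that actually exists.
            row = [chunk[c * ROWS_PER_COLUMN + r]
                   for c in range(COLUMNS_PER_PAGE)
                   if c * ROWS_PER_COLUMN + r < len(chunk)]
            if row:
                rows.append(row)
        pages.append(rows)
    return pages


# Hypothetical sample data: 250 zero-padded numbers standing in for
# right-justified card numbers that sort correctly as text.
sample = [f"{n:08d}" for n in range(1, 251)]
pages = paginate(sample)
```

with 250 sample entries the sketch yields two pages: a full first page of 50 rows by four columns, and a second page holding the remaining 50 entries in a single column.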
since the format recognition feasibility study was only concerned with cataloging records for current english language monographs, the study must be extended to cover other roman alphabet languages and, as part of recon, records that were created according to different rules and conventions. although the progress report submitted to clr included the definition and status of each of the tasks that make up the format recognition program, these have been omitted to avoid duplication with an article recently published in the journal of library automation (5) describing format recognition concepts in some detail and elaborating on the tasks completed and projected at that time.
investigation of input devices
the investigation of input devices and the testing of several selected devices in an operational mode will continue throughout recon. a study of the use of a mini-computer operating in an on-line mode for input, editing, and formatting of marc records is in progress at the library and will supplement the recon effort and provide additional data. a preliminary investigation was begun of optical character readers commercially available and in the developmental phases. only those readers capable of reading numerous characters on many lines (page reader) as opposed to a limited number of characters or lines per document (document reader) were included in the study. the machines evaluated were considered as possible candidates if they were capable of processing upper- and lower-case alphabetic characters, numerals, standard punctuation and some special symbols. each manufacturer has specifications for the type of paper required and the font style which can be recognized. paper handling is a major drawback of optical character readers. excessive handling of the paper or any type of smear, crease, or crinkle could cause rejection of a character or conversion of a character to some specified symbol indicating an invalid character.
error rates for the devices considered range from one to 35 characters per 10,000 characters and 80% of the errors are caused by paper handling. typewriters used to prepare the source document must be constantly cleaned and ribbons changed to keep impact keys free of dirt. frequent jamming appears to be a characteristic of most machines; unjamming these machines can be difficult and is highly dependent upon the skill of the operator. ten companies that have various types of optical character recognition equipment commercially available were considered in the first study. five were immediately rejected because their devices did not meet the criteria as specified above. the devices remaining had the following characteristics:
control data corporation 915 page reader. accepts 2.5x4 to 12x14 inch paper; ocr-a standard type font; recognizes upper-case alphas, numerals, and standard punctuation; through programming and use of special symbols, lower-case alphas can be coded.
farrington model 3030. accepts 4.5x5.5 to 8.5x13.5 inch paper; ocr-a standard and 12l (farrington) type fonts; recognizes upper-case alphas, numerals, standard punctuation and special symbols; through programming and use of special symbols, lower-case alphas can be coded.
scan-data models 100/300. accepts 8.5x11 inch paper; multi-type fonts; recognizes upper- and lower-case alphas, numerals, standard punctuation, and special symbols; has programmable unit for formatting.
philco-ford general purpose reader. accepts 5.7x8.5x11 inch paper; multi-type fonts; recognizes upper-case alphas, numerals, standard punctuation and special symbols; through programming and use of special symbols, lower-case alphas can be coded.
recognition equipment retina. accepts 3.25x4.88 to 14.14-inch paper; multi-type fonts; recognizes upper- and lower-case alphas, numerals, standard punctuation, and special symbols; has a programmable unit for formatting.
the possibility exists of using any of these five machines for the input of english language material. the keying of an extraneous character is required with the farrington and control data corporation equipment for lower-case and some special symbols. this is not necessary with philco-ford, scan-data, and recognition equipment machines. since the number of special symbols varies by machine, each machine must be studied to determine a method of coding the entire library character set as developed by the library of congress and this method must be evaluated in terms of the burden placed on the typist. with the added feature of lower-case recognition, the price of the machine increases substantially. adequate information has not been obtained from these companies to give an accurate accounting of cost. it should be noted that the rental price for the majority of optical character readers is high, a factor which will have to be taken into consideration at the time of selection of an input device. the most economic route to conversion may be through a service bureau, depending on the volume of records to be converted.
outlook
it is too early in the life of the project to predict the outcome or to describe any factual conclusions. the library of congress is greatly encouraged by the interest expressed in the project and the assistance offered by the members of the advisory committee and the working task force. the scope of the assignments and the fact that all members of the working task force have responsible positions in their own institutions are clear evidence of the spirit of cooperation that has been exhibited by the working task force members and their parent organizations. other members of the library community have been and will continue to be contacted throughout the project for their expertise in certain facets of the many problems under exploration.
several developing regional networks were requested to describe their plans in the hope that smaller scale efforts would shed some light on the problems involved on a national level. those organizations contacted have responded, and a continuing liaison will be maintained not only to avoid duplication of effort but, more important, to attain a better understanding of how to approach the requirements of future library systems in terms of what is possible today. the report submitted to clr described progress made to november 1, 1969. since that time, the recon production staff has selected all the 1969 titles from the card stock to be included in recon, 5,200 records have been edited, and the first 250 have been forwarded to a service bureau to test its procedures for keying. the staff has begun the selection of the 1968 titles and out of approximately 26,000 records received to date from the card division 19,000 are recon candidates. the production section continues its training by the proofing of marc records until the recon records are processed through the marc system to provide the required diagnostics for the proofing process. procedures were set up for typing records without any editing and in accordance with the requirements for the format recognition program. sample records selected for testing the procedures were of above-average difficulty in order to include all types of data that might be encountered. the procedures will be continually evaluated until some optimal method is determined. the format recognition algorithms are being evaluated by having recon staff simulate a computer and follow through the logic of the algorithms on actual data. results of the simulation will provide the necessary feedback to adjust the algorithms prior to the coding of the computer programs. detailed design work has begun on the expansion of the marc system to include random access capability and on-line correction. this effort is being coordinated with the card division mechanization project and is considering the requirements of a large-scale conversion activity. although it has a long way to go, recon is on schedule and for any project concerned with automation, that is an encouraging note. for the moment the future looks bright.
acknowledgment
the author wishes to thank the recon staff members of the library of congress for their respective reports which were incorporated into the progress report submitted to the council on library resources, inc., and as such, are significant contributions to this paper. without the aid of the council on library resources the recon project would not have become a reality. through three important grants the council has made a major contribution to the project: 1) the first was a grant in support of the recon feasibility study and the working task force that resulted in the recon report; 2) an officer's grant enabling the establishment of the recon production unit to create additional machine readable records not included in the marc distribution service; and 3), most importantly, a grant providing full funding for the two-year pilot project.
references
1. library of congress; recon working task force: conversion of retrospective catalog records to machine readable form. (washington: library of congress, 1969).
2. ibid., pp. 10-11.
3. ibid., pp. 20-38.
4. anglo-american cataloging rules. (chicago: american library association, 1967).
5. avram, henriette d., et al.: "marc program research and development: a progress report," journal of library automation, 2 (december 1969), 242-265.

buyer be wary!
in the september 1974 issue of jola, the "highlights of isad board meeting" reflects the library automation community's growing concern with misrepresentation of products and misleading or fraudulent claims.
a proposal was made that isad create a mechanism to monitor relevant advertising in order to inform and protect its constituency and, indeed, the entire profession.

it is paradoxical that this concern is being voiced at a time when the relationship between the public and private sectors seems closer than at any other time in the recent past. in general, librarians and vendors are good friends. there is an atmosphere of mutual respect, and we no longer raise eyebrows upon learning that a librarian-colleague has gone "commercial." indeed, librarians and libraries are learning from the business world to create products and market them in order to support desired internal services. the growing entrepreneurial efforts of libraries are linking the two groups with a yet firmer bond.

unfortunately, but inevitably, there are a few flies in the ointment. with regularity, we pick up professional literature to find advertising which sounds too good to be true. an investigation will usually indicate that, in fact, it is not true. we are often visited by salesmen describing incredible advances in their particular areas. the pressure applied by these people can be distasteful and even intolerable. or we may receive a one-page brochure from an unknown company, touting its latest, very competitive system, and listing the familiar names of well-respected librarians as advisors. almost always, we are lucky and are able to discover for ourselves the true nature of the products being advertised. our misfortune may begin when an ambitious salesman finds his or her way into the office of an administrator or politician who does not have adequate preparation for the onslaught of facts, figures, and fallacies.

what are the best ways of misrepresenting a product?
most approaches fall into one of the following categories: (1) misleading advertising, with unclear statements and imprecise use of vocabulary; (2) claims that one, or several, or many other libraries are using the product with satisfaction (when this indeed is not the case); (3) specific statements that a large and prestigious library is about to sign a contract for services or products (although investigation will reveal no such intention); (4) lists of experts in the field who are presumed to be associated with the company in an advisory or consultant role (but who are unaware of this use of their names); and (5) approaches to federal, state, or local agencies to appeal the procedures used by libraries in requesting bids or awarding contracts.

248 journal of library automation vol. 7/4 december 1974

at this point, a note of caution must be inserted. strategies of advertising and marketing usually involve one or more of the above techniques to a certain extent. we all practice minor exaggerations and simplifications in our professional lives in order to accomplish certain goals. it would be unwise and unfair to accuse an advertiser of misleading his market on the basis of one of these "small exaggerations." in resolving this issue, our concern must be with those individuals or organizations who are constantly found with a large discrepancy between the word and the deed.

what methods can be used as protection against these tactics?
there are several reliable paths: (1) be aware of and alert to the possibilities of misleading claims and misrepresentation; (2) follow up a sales pitch with a few phone calls to those institutions that are described as using the product or about to sign the contract; (3) maintain a reasonable amount of resistance to the sales talk; (4) use the library profession's invisible college to determine the validity of the claims and the experiences that others have had with the firm; and (5) support the attempts of our professional societies, such as ala and asis, to require organizations to maintain certain advertising standards.

the library market is expanding and maturing; therefore, these growing pains associated with increased marketing efforts are not unexpected. with adequate education and awareness on the part of the buyer, with some pressures placed on advertisers by the professional community, and with a tolerance for the normal tendencies of advertising and marketing, we will be able to resolve a difficult situation with grace and without hard feelings.

susan k. martin

article

tech tools in pandemic-transformed information literacy instruction: pushing for digital accessibility

amanda rybin koob, kathia salomé ibacache oliva, michael williamson, marisha lamont-manfre, addie hugen, and amelia dickerson

information technology and libraries | december 2022 https://doi.org/10.6017/ital.v41i4.15383

amanda rybin koob (amanda.rybinkoob@colorado.edu) is assistant professor, literature and humanities librarian, university of colorado. kathia salomé ibacache oliva (kathia.ibacache@colorado.edu) is assistant professor, romance languages librarian, university of colorado. michael williamson (michael.d.williamson@colorado.edu) is assistant director, assessment and usability, digital accessibility office, university of colorado.
marisha lamont-manfre (marisha.manfre@colorado.edu) is accessibility and usability assessment coordinator, digital accessibility office, university of colorado. addie hugen (addison.hugen@colorado.edu) is senior accessibility tester, digital accessibility office, university of colorado. amelia dickerson (amelia.dickerson@colorado.edu) is accessibility professional, digital accessibility office, university of colorado. © 2022. abstract inspired by pandemic-transformed instruction, this paper examines the digital accessibility of five tech tools used in information literacy sessions, specifically for students who use assistive technologies such as screen readers. the tools are kahoot!, mentimeter, padlet, jamboard, and poll everywhere. first, we provide an overview of the americans with disabilities act (ada) and digital accessibility definitions, descriptions of screen reading assistive technology, and the current use of tech tools in information literacy instruction for student engagement. second, we examine accessibility testing assessments of the five tech tools selected for this paper. our data show that the tools had severe, significant, and minor levels of digital accessibility problems, and while there were some shared issues, most problems were unique to the individual tools. we explore the implications of tech tools’ unique environments as well as the importance of best practices and shared vocabularies. we also argue that digital accessibility benefits all users. finally, we provide recommendations for teaching librarians to collaborate with campus offices to assess and advance the use of accessible tech tools in information literacy instruction, thereby enhancing an equitable learning environment for all students. 
introduction

the last fifteen years have seen the rise of collaborative and interactive web platforms and whiteboards, game-based learning technologies, audience polls, and other tools that contribute to student engagement in higher education classrooms. these educational tech tools have supported one-time library information literacy (il) sessions by enabling student participation in real time. still, knowing that tech tools may enhance engagement is not enough; we should also be asking whether these tech tools are accessible for all students and, if not, what can be done to make them more accessible.

this paper examines the digital accessibility of five tech tools specifically for students who use assistive technologies such as screen readers. the tools are kahoot!, mentimeter, padlet, jamboard, and poll everywhere. these tech tools were identified in a 2021 paper inquiring which tech tools librarians used in emergency remote il instruction during the covid-19 pandemic along with their perceptions of the weaknesses and strengths of these tech tools.1 although there are guidelines aiding librarians in assessing ada accessibility around library spaces, there are no disability-related recommendations for specific tech tools used in il instruction or studies examining tech tools’ digital accessibility features.2 there is also a lack of documentation regarding librarians’ outreach to ada-related academic offices and tech companies regarding tech tools.
we argue that collaboration between libraries and ada-related offices at the campus level increases awareness of digital accessibility issues and requirements and could ultimately advance digital accessibility in educational tech tools used in il instruction. we place our paper within the context of other pandemic-responsive digital pedagogy research. we acknowledge that technology needs for student engagement are evolving in new face-to-face, hybrid, and remote instruction environments; thus, we hope to impact the way tech tools are assessed for digital accessibility and to promote the use of accessibility-tested tech tools in library instruction. first, we provide an overview of ada and digital accessibility definitions, descriptions of screen reading assistive technology, and the current use of tech tools in instruction for student engagement. secondly, we examine accessibility testing reports for the five tech tools selected for this paper. then, we discuss two trends found in the reports: shared issues between the tools and the implications of unique environments. we also argue that digital accessibility benefits all users. finally, we provide recommendations for teaching librarians to collaborate with campus offices to assess and advance the use of accessible tech tools in il instruction, thereby enhancing an equitable learning environment for all students. overview ada accessibility the americans with disabilities act (ada) was made law in 1990, signaling an initiative to protect people with disabilities from discrimination in employment opportunities, when purchasing goods, and when participating in state and local government services. 
3 the idea behind the ada law was to provide equal opportunity.4 however, as health sciences librarian ariel pomputius notes, ada law protects people from discrimination, but it does not guarantee a right to accessibility beyond the legal requirements granted by this act.5 as higher education advances through the covid-19 pandemic, digital accessibility has become more essential than ever in il instruction as it takes place in hybrid, remote, and in-person environments. to ensure the digital accessibility of tech tools for all students, we should first understand its meaning.

what is digital accessibility?

the covid-19 pandemic brought digital accessibility to the forefront as universities navigated complex remote and hybrid learning environments. fernando h. f. botelho, a scholar with expertise in technology and disability, explains digital accessibility as the interconnection of “hardware design, software development, content production, and standards definition.”6 for botelho, accessibility is “an ongoing and dynamic process” rather than an immobilized state, where standards work together as a part of a ubiquitous process.7 as information studies professor jonathan lazar notes, “digital accessibility means providing an equal user experience for people with disabilities, and it never happens by accident.”8 georgetown law also defines digital accessibility from a perspective that may resonate with instructors who seek technologies that are accessible to all students.
they define digital accessibility as “the inclusive practice of removing barriers that prevent interaction with, or access to websites, digital tools, and technologies.”9 however, it is lazar who moves the topic forward when referring to digital accessibility in research libraries, arguing that although accessibility laws protect people with disabilities, digital accessibility also benefits the whole population.10 lazar made this assertion after capturing the challenges and lessons learned related to digital accessibility during covid-19.11 the most salient lesson is that research libraries should create an infrastructure that supports digital accessibility, especially now that the covid-19 pandemic has driven universities to provide instruction in multiple formats.12 we argue that this infrastructure should also include digital accessibility evaluation of tech tools used in the classroom. assistive technology for blind users congress defined assistive technology in the disabilities act of 1988 as “any item, piece of equipment, or product system, whether acquired commercially off the shelf, modified, or customized, that is used to increase, maintain, or improve functional capabilities of individuals with disabilities.”13 furthermore, special education professors kathleen puckett and kim w. fisher state, “technology becomes assistive when it supports an individual … to accomplish tasks that would otherwise be difficult or impossible.”14 as scholars of occupational therapy claire kearney-volpe and amy hurst note, screen readers assist people with no or low vision by presenting web information on “a non-visual interface” via braille or speech synthesis.15 screen readers’ purpose is important because all people should have the opportunity to access the same information and services in the digital environment without facing undue barriers or burdens. 
the digital accessibility office’s (dao) assessment and usability team at university of colorado boulder (cu boulder) primarily tests tools for accessibility by utilizing screen reader assistive technology for both computers and mobile devices. assessment and usability staff rely on screen readers for testing because this assistive technology uses and responds to the underlying code of each webpage, application, and environment. this in-depth output makes screen readers good tools for overall accessibility testing, even though they are generally for people with no vision. however, we found no studies on tech tools and classroom engagement that consider assistive technology such as screen readers. classroom engagement with tech tools academic librarians emily chan and lorrie knight state in a 2010 study that library instruction risks being anachronistic if it does not include an engaging technology-based activity.16 with this in mind, there is ample literature documenting the impact and benefits of tech tools in the classroom. for example, authors highlight tech tools’ anonymous environment, categorized as free of judgment, noting that it is student-centered and enhances student participation.17 moreover, anonymity provides a means for students to answer honestly, fostering classroom discussion that includes introverted students.18 on the other hand, some authors argue that anonymous participation does not enhance critical thinking. 
ann rosnida md deni and zainor izat zainal, referring to padlet as an educational tool, argue that one challenge of using tech tools to advance student engagement is that they do not, on their own, enhance criticality or discussion because students may not want to oppose their peers’ opinions.19 as with other pedagogical techniques, intentional facilitation with tech tools is necessary to enhance criticality. many authors regard the use of tech tools in the classroom positively.20 examining kahoot! to test students’ performance, darren h. iwamoto et al. state that students valued receiving immediate feedback on their answers after taking a high-stakes examination.21 carolyn m. plump and julia larosa also appreciate the use of instructional games to provide immediate feedback to students, noting that this feedback warned faculty instructors against making assumptions about how much students understand in class.22 similarly, librarian maureen knapp, referring to online tools for active learning, notes that instant feedback drives classroom discussions forward.23 liya deng, a librarian with a focus on disability studies, notes in a 2019 study that using poll everywhere in library instruction provides an opportunity to build rapport with students and a strategy to keep students focused and away from non-instruction-related internet distractions.24 engineering classes have also used tech tools to enhance teaching and learning. a 2021 case study addressing online education due to covid-19 reports that students found kahoot! to be a useful online tool that helped them reflect, apply knowledge, and receive feedback.
25 similarly, engineering educator razzaqul ashahn advocates for incorporating tech tools like jamboard for active “think-pair-share” activities, noting that it enables instructors to connect with students as they do small group work.26 these studies suggest that tech tools continue to be relevant and beneficial during the pandemic, though again, they do not consider whether they are digitally accessible. for this reason, the continued use of tech tools in various modalities (in-person, hybrid, and remote) attests to their relevance, which may continue to grow as instructors transition to pandemic-transformed pedagogy. pandemic-transformed pedagogy in a 2020 publication exploring covid-19 impacts on teaching, learning, and technology use, scholars jillianne code, rachel ralph, and kieran forde coined the phrase “pandemic-transformed pedagogy.”27 as they state, educators find themselves “on the cusp of a rapid change that is compelling them to re-think their worldview in both how they teach and how their students learn, calling for their transformation as educators.”28 a review of the recent literature available through google scholar on “pandemic-transformed pedagogy” shows expanding adoption of this phrase, including academics publishing on a range of interdisciplinary subjects and in international contexts, with implications for both k–12 and post-secondary education.29 as we reflect on this transformation and call for responsiveness to rapid change, we emphasize the need for support, planning, and advocacy for digital accessibility and tech tools. before the covid-19 pandemic, scholars at the university of sydney found in 2018 that the most significant factor driving the choice to use technology was whether it was immediately available. 
30 these scholars emphasized the “just in time” use, noting that ready access to the technology required “actions, expenditure, support, and commitment from policymakers and administrators.”31 at the beginning of the pandemic, teachers, librarians, and students had a matter of days to pivot to remote work, and as ibacache, rybin koob, and vance found in a 2021 study, “availability” was a consideration for librarians in selecting tech tools for engagement and content delivery.32 this “just in time” consideration is even more important in the aftermath of covid-19, which prompted emergency remote learning. yet, teaching librarians also ought to go beyond what is easily available and move towards what is digitally accessible. part of the transformation we envision is to extend the concept of “pandemic-transformed pedagogy” to include digital accessibility and thus push for the tech tools we use in il instruction to be readily available and digitally accessible.

methodology

as previously mentioned, this study examines the digital accessibility of five educational tech tools used in il instruction. to initiate a formal accessibility test, we created scripts detailing how to interact with the samples we provided for the five tools.33 these scripts were then used to manually test each tech tool for its digital accessibility using a variety of screen readers on both computers and mobile devices.

about the testers

the testers are native users of screen reading assistive technology and are blind. they test each tech tool first, with additional staff in the dao reviewing and validating results.

about the test scripts process

each test script contained the following parameters:

1. basic information about the tool.
2. contact information for access issues and technical questions, such as the tools’ customer support email and librarians’ emails for follow-up questions.
3. access points to the software and websites (urls).
4. step-by-step instructions for testers to impersonate a student engaging in an il task.

as a part of these test scripts, we created short sample quizzes and activities for each of the five tools considered in this paper. in addition, the test scripts provided step-by-step descriptions to help the testers interact with the tools. the testers then tried each tool, focusing on functionality and whether they could complete the tasks in the script. the reports describe three levels of problems: severe, significant, and minor. the results section of this paper reports on these problems as found with the five tools tested. the testers also assessed general user experience (usability). the testers used a holistic approach, engaging with the entire virtual environment of the tool rather than looking only at isolated functions.

assistive technology

the testers utilized four types of screen reader software: voiceover, talkback, nvda, and jaws. voiceover, developed by apple, is a screen reader for mobile devices and computers that “comes standard on every iphone, mac, apple watch, and apple tv.
it is gesture-controlled, so by touching, dragging, or swiping[,] users can hear what’s happening on screen and navigate between elements, pages, and apps.”34 talkback is a google-based screen reader included in android mobile devices that functions similarly to voiceover.35 nvda is a free, open access, microsoft windows-only screen reader supporting people who are blind or have vision impairment.36 jaws, also compatible with microsoft windows, allows people with visual impairment to read the pc screens with a text-to-speech output or via braille display.37 we also tested for visual usability issues using a free web-based color contrast analyzer.38 the testers provided thorough reports detailing the results of their testing, including exact versions used. the tests were conducted between february 27 and may 1, 2022.

tools evaluated

the educational technology tools in this study are web-based and have free options, allowing students to engage in activities using their computers or their phones. we identified these tools based on a survey about tech tool use during the covid-19 pandemic and from our own experiences.39 the tools are jamboard, kahoot!, mentimeter, padlet, and poll everywhere. jamboard is a google-powered virtual whiteboard tool. kahoot! is a quiz/game platform offering multiple styles; we tested the standard quiz question format and utilized one of the vendor-provided sample quizzes. mentimeter is another quiz-making platform; we created the sample quiz utilizing multiple-choice and short-answer question formats. padlet is a collaborative bulletin board platform with various formats (including the three we tested: padlet maps, padlet shelf, and padlet wall).
padlet includes options for users to add text and multimedia in response to question prompts or to post their own questions and other content in a collaborative virtual space. finally, poll everywhere is a polling/survey platform. limitations although digital accessibility offices at different universities commonly rely on shared standards for technology evaluation, such as web content accessibility guidelines (wcag) 2.1, we acknowledge that the assessment approach will vary from office to office. overall, there is much debate on which practices and standards for evaluating tech tools yield the best results. not all higher education institutions have digital accessibility offices, let alone accessibility and usability labs and testers. some institutions may rely on automated checkers or a mix of automated and manual testing. approaches to testing differ, and there is disagreement among digital accessibility practitioners about whether a fully automated, fully manual, or hybrid approach is best. regardless, we expect that manual testing of these educational tech tools using similar assistive technologies would have similar results during the timeframe these tools were tested. the testing reports capture a moment in time, and it’s important to note that web-based tools are frequently updated. we only tested the free versions of these tools. there may be differences in accessibility between free and paid versions. we tested only the browser versions of these tools on computers and mobile devices and did not test mobile applications, which may or may not be more accessible. this decision was made due to the probability that most il librarians and other instructors would not regularly ask students to download applications to their personal devices for in-class engagement. kahoot!, mentimeter, padlet, and poll everywhere were tested on windows, ios, and android platforms. 
jamboard was tested only on windows, because the browser version would not open on a mobile device using assistive technology. instead, it attempted to force an app download. we also tested each tool using sample environments and functions that we hope captured some ways the tools would be used in a typical il classroom. due to the nature of the tools and the many options available for question and collaboration formats in each tool, these samples were not exhaustive of all options available. these testing results are meant to be illustrative rather than comprehensive. finally, this study evaluated tech tools only for digital accessibility using the specific assistive technology of screen readers. further research is needed regarding how students with a range of different disabilities may interact with the technology tools examined here.

results

this section reports the three levels of problems (severe, significant, and minor) that dao testers found in jamboard, kahoot!, mentimeter, padlet (shelf, map, and wall), and poll everywhere. the testers also assessed user experience (“usability”). issues may be present in multiple categories based on how they impact the user’s ability to complete actions.

severe issues

table 1 shows the severe issues found in the tools tested. severe issues create access barriers that prevent assistive technology users from completing tasks and are issues that need to be remediated. the testers consider these issues prohibitive for many individuals with disabilities and for those who use assistive technologies. the dao identified ten severe issues in padlet shelf; five severe issues each in jamboard, padlet wall, and poll everywhere; four severe issues in kahoot!, and three severe issues in padlet map.
the testers did not find severe issues in mentimeter. table 1 shows that the most common severe issue corresponds to elements that are unlabeled or inappropriately labeled. in the case of padlet map and jamboard, the testers found buttons that were unlabeled or labeled with irrelevant numbers. testers felt unclear as to what the buttons were or what their functionality was. padlet shelf contained the most unlabeled buttons, including the buttons to add posts and the three vertical dots menu to edit or delete. this issue is highly relevant since users need these buttons to navigate and contribute to the padlet. the testers observed a similar problem when using the screen reader talkback to engage with padlet shelf. talkback found unknown or unlabeled buttons, which impede users’ ability to navigate or interact with videos they submit to the padlet. figure 1 illustrates the play button located at the center of a video. in the screen reader, this button is unlabeled and appears after the video, preventing the screen reader from understanding its function and leaving users unclear whether this button is connected to the video.

figure 1. the play button at the center of the video is unlabeled. in the reading order, this unlabeled button appears after the video; therefore, it is unclear what it does or how it relates to the video.

the second most prevalent severe issue is elements that are not accessible to screen readers. this issue affected padlet shelf and padlet wall. in the case of padlet shelf, the testers utilizing the voiceover screen reader were unable to interact with or locate gifs and graphics. when the testers utilized talkback, they would hear the gif but could not find the graphics because they were marked as links.
in addition, the drawing feature was also not accessible to screen readers, including the visual elements that control colors, which appear as clickable links instead of the visual elements associated with colors. these elements were unavailable for users utilizing voiceover and jaws. the testers found a similar problem with the visual elements in padlet wall, especially when they tried to edit a post (see fig. 2).

figure 2. when users want to edit a post in padlet wall, there are visual elements that are available to change the color of the post. these elements are not available to screen readers.

figure 3. when images are not programmed to be read as graphics, screen readers are not able to gather information related to the gif. this image was read as “jaf3mi0ja5huk/giphy.”

figure 4. while using nvda, the user hears links for images that do not make sense.

the third most frequent severe issue relates to graphics and gifs that are not appropriately programmed. this issue affected padlet shelf and padlet wall. when the testers were using jaws in padlet shelf, the gifs read as links with the following text: “jaf3ml0ja5huk/giphy.” when the testers utilized nvda, the gifs read as “giphy,” conveying no information describing the gif and hindering navigation (fig. 3). similarly, graphics and gifs in padlet wall are programmed as links rather than graphics. when the testers used jaws to understand graphics and gifs, they heard long links such as: “eb351cc20e6bfda76d443f1e93ad7963/pumpkin_seedling_3.” long links like this are useless to people using screen readers and disrupt people’s ability to search for graphics. when the testers used nvda, they also heard links for the images, but without the other series of characters included in jaws (fig. 4).
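some of the labeling problems described above, such as unlabeled buttons and images with no alternative text, also leave traces in a page's markup that a short script can look for. the following is a minimal python sketch of such a check, offered only as an illustration. it is not the manual screen reader testing the dao testers performed and cannot replace it. the class name `LabelAudit`, the function `audit`, the problem codes, and the sample markup are all hypothetical, and the rule used for a button's name (an `aria-label` attribute or visible text) is a deliberate simplification of how assistive technology derives names.

```python
from html.parser import HTMLParser


class LabelAudit(HTMLParser):
    """Hypothetical checker: flags <img> tags with no alt attribute and
    <button> tags that have neither an aria-label nor visible text."""

    def __init__(self):
        super().__init__()
        self.problems = []      # list of (problem_code, detail) tuples
        self._button = None     # state for the <button> currently open, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "img" and attrs.get("alt") is None:
            # alt="" (decorative image) is allowed; a missing alt is not.
            self.problems.append(("img-missing-alt", attrs.get("src") or ""))
        elif tag == "button":
            label = (attrs.get("aria-label") or "").strip()
            self._button = {"label": label, "text": ""}

    def handle_data(self, data):
        # Collect visible text while inside a <button>.
        if self._button is not None:
            self._button["text"] += data

    def handle_endtag(self, tag):
        if tag == "button" and self._button is not None:
            if not (self._button["label"] or self._button["text"].strip()):
                self.problems.append(("button-unlabeled", ""))
            self._button = None


def audit(html):
    """Return the list of labeling problems found in an HTML string."""
    parser = LabelAudit()
    parser.feed(html)
    parser.close()
    return parser.problems


# Made-up markup, not taken from any of the tools tested.
sample = (
    '<button aria-label="add post">+</button>'
    '<button></button>'
    '<img src="giphy.gif">'
    '<img src="logo.png" alt="library logo">'
)
problems = audit(sample)
for code, detail in problems:
    print(code, detail)
```

a check like this can only flag candidates for review. reading order, focus handling, and timed pages, all of which appear in the testers' reports, still require testing with a screen reader.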
the testers also found severe issues with elements not available by keyboard or screen reader (jamboard and poll everywhere) and timer features (kahoot!). for example, the pen, eraser, laser, shapes, and text box elements in jamboard can only be utilized or placed on the screen by a mouse, making them inaccessible to blind learners. another issue is the lack of alternative text. since jamboard offers a collaborative multi-user space, some users may post images. however, there is no way to input alternative text to an image. in the case of kahoot!, when the timer is activated, the countdown plays as the screen reader tries to read the page, confusing the screen reader and the user, who will hear the timer with random numbers and not the question. the timer feature also affects the user when starting a quiz or moving between questions. it is unclear whether the screen reader is unable to read the questions due to the short timeframe or whether the questions are truly unavailable to the screen reader. the instructor may extend the timer for quizzes in kahoot!, but it is impossible to turn it off altogether when using the kahoot! quiz question format.

table 1. number of occurrences of severe issues found during screen reader testing for kahoot!, jamboard, mentimeter, padlet (three formats tested), and poll everywhere. (jamboard was tested on a windows computer only; the other tools were tested on windows, ios, and android.)

severe issue | jamboard | kahoot! | mentimeter | padlet: map | padlet: shelf | padlet: wall | poll everywhere | total occurrences
element not available by keyboard or screen reader | 2 | 0 | 0 | 0 | 0 | 0 | 1 | 3
element presents gesture/navigation traps | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1
elements are not keyboard accessible | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1
elements are unlabeled or inappropriately labeled | 2 | 0 | 0 | 1 | 4 | 2 | 0 | 9
elements not accessible to screen reader | 0 | 0 | 0 | 0 | 3 | 2 | 0 | 5
errors do not get focus | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1
graphics and gifs are not programmed appropriately | 0 | 0 | 0 | 0 | 3 | 1 | 0 | 4
graphics are unlabeled or inappropriately labeled | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 2
graphics lack alternative text | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1
lack of alert | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1
text not read by screen reader | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1
timed pages disrupt the ability to read the page | 0 | 3 | 0 | 0 | 0 | 0 | 0 | 3
tool totals | 5 | 4 | 0 | 3 | 10 | 5 | 5 | 32

the testers found other severe issues such as text not being read by screen readers, missing notifications, elements not accessible to screen readers, unlabeled graphics, and lack of focus on images. for example, in the case of kahoot!, the screen reader could not read the answer notification text. this issue meant that while the tool offered visual indicators for correct and incorrect answers, the screen reader did not read these indicators and it remained unclear to testers whether their answers were correct or not. finally, “lack of focus” or challenges with “focus handling” indicate that the assistive technology’s attention was not where it should be. this problem happens because tool developers do not set the appropriate code for screen readers.

significant issues

table 2 shows the significant issues found in the tools. significant issues represent items that create great difficulty for people who use assistive technologies, but they do not necessarily prevent the tool from being used.
significant issues are recommended for remediation. interestingly, most significant issues were not shared across the five tools; out of sixteen problems, only one was shared by four tools (“inconsistent focus handling”), and three were shared by two tools each (“graphics are inappropriately labeled,” “reading order can be confusing to users,” and “state is not indicated”). because of this lack of overlap, brief descriptions of how frequent issues affected specific tech tools are warranted, focusing on those issues that affected multiple tools, recurred most frequently, or both.

the significant issue that recurred most frequently was “reading order can be confusing to users,” affecting jamboard as well as all three padlet styles. in jamboard, when creating a sticky note, the focus of the assistive technology went into the edit field but ignored the color options. this meant that users were unable to switch between colors when making a post. reading order also caused difficulties. reading order is the way elements are tagged and read by screen readers. this may not be the same order most sighted users experience when reading elements on the page from top to bottom, though it should closely reflect the visual layout of the page. it determines what a blind learner will understand about the digital environment and in what order. in padlet map, the screen reader went through irrelevant content, including the terms and conditions, before reading the “new post” button. padlet shelf had three instances of confusing reading order; for example, the “publish” and “update” options were in the reading order above the “edit” field. the user would have to know to navigate back to finalize their post (this issue is repeated in padlet wall as well). further, if a user leaves the new post dialog box, it is difficult to return due to the reading order. the “more buttons” element was also read before the heading of a new post, and those additional buttons are unlabeled.
finally, in padlet wall, the tester utilizing voiceover could not discard a post (fig. 5). a dialog opened asking for discard confirmation, but this dialog was buried in the reading order and challenging to locate.

figure 5. a dialog box appears visually in padlet wall when a user attempts to discard a post, but it is buried in the reading order of the voiceover screen reader, making it difficult to locate and complete the task.

the next most frequent significant issue was “inconsistent focus handling,” which occurred six times. focus handling directs the attention of the user and facilitates various actions in a given environment. inconsistent focus handling emerged in four out of the five tools: jamboard, kahoot!, all three padlet styles, and poll everywhere. this issue often appeared when a new element on the screen was opened, but the “focus” (what the screen reader was paying attention to at any given time) remained on the previous panel or element, causing confusion and difficulty. for example, in jamboard, when selecting the “open a jamboard” button, the panel opened visually, but the screen reader’s focus remained on the button behind the open panel. to get to the new jamboard, the tester had to navigate the other page content first. focus handling was inconsistent across many activation buttons and interactions in all three padlet styles. in kahoot!, focus handling was inconsistent across screen readers, with the focus landing in different places, for example after a question was answered. in poll everywhere, the focus traveled to other areas of the page after answering a question, returning to previous questions, or refreshing the page. these inconsistencies varied among screen readers.
table 2. number of occurrences of significant issues found during screen reader testing for jamboard, kahoot!, mentimeter, padlet (three formats tested), and poll everywhere. (jamboard was tested on a windows computer only; the other tools were tested on windows, ios, and android.)

significant issue | jamboard | kahoot! | mentimeter | padlet: map | padlet: shelf | padlet: wall | poll everywhere | total occurrences
difficult combination/list box | 0 | 3 | 0 | 0 | 0 | 0 | 0 | 3
element difficult to access | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1
element state not indicated | 1 | 0 | 0 | 0 | 0 | 0 | 2 | 3
error does not get focus | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1
extensive load times create difficulties | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1
graphics are inappropriately labeled | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 2
graphics not programmed appropriately | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1
headings are not used | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1
inconsistent focus handling | 1 | 1 | 0 | 1 | 1 | 1 | 1 | 6
lack of alert | 0 | 0 | 0 | 1 | 0 | 2 | 0 | 3
lack of contextual text/information | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1
lack of focus indicators | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 2
lack of notification | 3 | 0 | 0 | 0 | 0 | 0 | 0 | 3
object placement | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1
reading order can be confusing to users | 1 | 0 | 0 | 1 | 3 | 2 | 0 | 7
user-created objects initially lack markup | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1
tool totals | 10 | 7 | 2 | 4 | 4 | 7 | 3 | 37

another issue that recurred across two tools (kahoot!, mentimeter) was “graphics are inappropriately labeled.” in kahoot!, a graphic showing the final scoreboard from the quiz and a podium were hidden from screen readers. in mentimeter, the logo for the tool is labeled as a “logotype” in the alt text.
while these inappropriate labels for graphics may seem minor, they leave players using assistive technology out of celebratory or fun elements and may be confusing. inappropriate labels cannot be corrected by instructors, who are unable to adjust the alt text for elements that are built into the software.

another issue that recurred was “state is not indicated” (jamboard, poll everywhere). here, “state” refers to any change or option for an element, that is, its current condition in the digital environment. in jamboard, there is no indication of what color is selected for sticky notes, for example, which can be problematic if instructors use color to convey meaning (fig. 6). in one test question on poll everywhere, the unlabeled image reads as clickable to nvda, and visually, the image becomes larger when clicked. but this change is not announced and again may be confusing.

figure 6. for screen readers, there is not an indication of what color has been selected for sticky notes, though this is available visually.

with ten issues listed, jamboard was the tool with the greatest number of significant problems. this was true even though jamboard was tested only on windows and not on mobile devices. padlet wall and kahoot! had seven significant issues each. this is a slight departure from the data in table 1, where padlet shelf had the most severe issues. in general, tools with severe issues consistently exhibited some significant issues as well.

figure 6. users can enter text on option 1 and option 2, but these options do not generate a heading.

minor issues

table 3 shows the minor issues found in the five technology tools. minor issues represent items that are inconvenient or annoying, but do not necessarily create barriers to accessibility, e.g., repetitiveness, unclear text, etc. the testers found that each tool had between one and four minor issues of their own but did not share any of the minor issues listed.

table 3. number of occurrences of minor issues found during screen reader testing for jamboard, kahoot!, mentimeter, padlet (three formats tested), and poll everywhere. (jamboard was tested on a windows computer only; the other tools were tested on windows, ios, and android.)

minor issue | jamboard | kahoot! | mentimeter | padlet: map | padlet: shelf | padlet: wall | poll everywhere | total occurrences
element is inappropriately labeled | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1
elements confusing to users | 0 | 3 | 0 | 0 | 0 | 0 | 0 | 3
elements read twice | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1
heading level not concise | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1
headings are not used to provide structure | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1
headings used too often | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 1
inconsistent focus handling | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 1
labels are inconsistent | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1
lack of a programmatic list creates confusion | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1
lack of notification | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 1
same information is presented to screen reader multiple times | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1
sound effects portray meaning | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1
submenu item count not provided | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1
unclear text is confusing to user | 0 | 0 | 0 | 0 | 1 | 1 | 0 | 2
tool totals | 2 | 4 | 2 | 1 | 2 | 3 | 1 | 17

kahoot! had three issues related to confusing elements: gibberish text heard on screen readers, blanks in the statement not read by screen readers, and an icon that shows the total number of users who finished a test, which the screen reader could not read. other minor issues include instructions, questions, and answers announced multiple times (poll everywhere), long heading text (padlet wall), lack of notification when the user adds an image (padlet wall), excessive use of headings in a page forcing users to go through
the entire page to find a heading (padlet shelf), and headings not used to provide structure and facilitate navigation. the lack of heading structure makes it harder for users to add a post with a heading and text, as seen in figure 6 (padlet map).

usability issues

the accessibility assessment reports also included usability evaluation. usability issues may impact users of any ability. the testers noticed insufficient color contrast in three tools (poll everywhere, padlet wall and map, and kahoot!). for example, figure 7 illustrates a poll everywhere sample question where the text does not have enough contrast against the background. the evaluators also found a lack of color contrast in instructions and captions. in some padlet formats, the instructor can change color contrast by choosing a different template.

figure 7. an example of poll everywhere answer options that do not have sufficient color contrast between the text and background.

conveying meaning using colors is another issue. in the case of jamboard, the sticky note (fig. 8) has a blue bar at the bottom that appears to be loading. this bar is connected to a character count that is not noted by screen readers. in addition, the testers could continue typing past the character limit when the loading bar turned red. the testers also noticed layered elements that caused usability problems. figure 9 illustrates how the preview panel in padlet map visually blocks the post and the button to close the preview panel. padlet shelf and mentimeter did not have usability issues.

figure 8. in jamboard, the blue bar (below the yellow box) is used as a visual indication of character limit that is not available to screen readers.

figure 9.
when the user has the “preview panel” in padlet map open and starts a new post, the preview panel blocks the post.

summary of results

the reports showed that mentimeter was the most digitally accessible tool of those considered in this study. it is important to note that kahoot! and poll everywhere were judged as relatively accessible with caveats. both jamboard and all three types of padlet tested were found to be inaccessible for many individuals who use assistive technologies. in any case, all tools included either severe or significant issues, creating a great deal of difficulty for users.

most issues were unique to individual tools. of twelve severe issues, only two were shared across two tools each. of sixteen significant issues, only four were shared across tools, and only one was shared across more than two tools (“inconsistent focus handling” was a problem in all tools except mentimeter). all fourteen minor issues were unique. mentimeter, with very few issues, and padlet, with the most problems in aggregate, were outliers. because padlet offers so many different format options, we tested three, which affected the findings. still, padlet shelf had the most severe issues (ten). jamboard had the most significant issues (ten) of any tool aside from padlet.

discussion

we hypothesized that the tools selected for this study shared similar digital accessibility issues. to our surprise, the reports showed that these tech tools had few shared problems. we will thus consider two trends in the reports worthy of further examination: shared issues among the tools and unique environments. we will also discuss how digital accessibility can benefit all users of tech tools, not only people with disabilities.
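as a cross-check on the severe-issue figures in the summary of results, the counts can be re-derived from table 1. the python sketch below is an illustration, not part of the study's method. it transcribes the table 1 rows and assumes that the three padlet formats count as one tool when deciding whether an issue is shared; the names `TABLE_1`, `tool_of`, `tool_totals`, and `shared_issues` are made up for the example.

```python
# Columns, in the order used by table 1.
COLUMNS = ["jamboard", "kahoot!", "mentimeter", "padlet: map",
           "padlet: shelf", "padlet: wall", "poll everywhere"]

# Severe-issue counts transcribed from table 1 (one list per row).
TABLE_1 = {
    "element not available by keyboard or screen reader": [2, 0, 0, 0, 0, 0, 1],
    "element presents gesture/navigation traps": [0, 0, 0, 0, 0, 0, 1],
    "elements are not keyboard accessible": [0, 0, 0, 1, 0, 0, 0],
    "elements are unlabeled or inappropriately labeled": [2, 0, 0, 1, 4, 2, 0],
    "elements not accessible to screen reader": [0, 0, 0, 0, 3, 2, 0],
    "errors do not get focus": [0, 0, 0, 0, 0, 0, 1],
    "graphics and gifs are not programmed appropriately": [0, 0, 0, 0, 3, 1, 0],
    "graphics are unlabeled or inappropriately labeled": [0, 0, 0, 0, 0, 0, 2],
    "graphics lack alternative text": [1, 0, 0, 0, 0, 0, 0],
    "lack of alert": [0, 0, 0, 1, 0, 0, 0],
    "text not read by screen reader": [0, 1, 0, 0, 0, 0, 0],
    "timed pages disrupt the ability to read the page": [0, 3, 0, 0, 0, 0, 0],
}


def tool_of(column):
    """Map a column to its tool, merging the three padlet formats."""
    return column.split(":")[0]


def tool_totals(table):
    """Sum each column, reproducing the 'tool totals' row."""
    return {col: sum(row[i] for row in table.values())
            for i, col in enumerate(COLUMNS)}


def shared_issues(table):
    """Return issues that occur in more than one distinct tool."""
    shared = {}
    for issue, counts in table.items():
        tools = sorted({tool_of(col)
                        for col, n in zip(COLUMNS, counts) if n > 0})
        if len(tools) > 1:
            shared[issue] = tools
    return shared


totals = tool_totals(TABLE_1)
print(totals)
print(sum(totals.values()))
for issue, tools in shared_issues(TABLE_1).items():
    print(issue, "->", ", ".join(tools))
```

under that merging assumption, the two shared severe issues are the ones involving jamboard together with poll everywhere and jamboard together with padlet, which agrees with the statement that only two of twelve severe issues were shared.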
shared issues among the tools as previously mentioned, many issues were unique to individual tech tools covered in this study, while a few problems were shared among the tools. when a particular issue was shared among different tools, the level of severity determined whether a person using assistive technology could have a successful interaction with a tool or not. tracking shared issues and their severity may help developers and digital accessibility staff create a shared vocabulary for discussing user experience. it may also help both parties recognize when issues are common and relatively easy to remediate (e.g., labeling, heading, and alt text problems). there are other shared issues, such as focus handling inconsistencies, that are more difficult to resolve even though they are at the heart of screen reading assistive technology. tracking focus handling problems may allow developers and digital accessibility advocates to share possible solutions with one another. moreover, if tech tool developers and digital accessibility staff both understand the importance of a factor like focus handling, any difficult and severe problem can be prioritized when creating and fixing tech tools. it is also important for instruction librarians to have a basic grasp of this shared vocabulary so that they can anticipate the needs and experiences of the learners in their classrooms. looking at each tech tool in isolation offers only a tiny glimpse into the possibilities of what might happen when students connect to engagement technologies. evaluating multiple tools allowed us to better understand recurring problems and the barriers they create. unique environments though tracking shared issues is important for these reasons, by the end of our testing, we found that the tools were not similar and that even when they had shared issues, these problems had unique characteristics. for this study, we selected tools that have similar functionality (for example, both kahoot! 
and mentimeter can function as quiz platforms) and others that are distinctive (such as padlet maps, which incorporates gps data to allow users to interact with maps). these tools offer students real-time engagement, which helps foster a collaborative learning environment. as mentioned above, most severe issues (ten out of twelve), most significant issues (twelve out of sixteen), and all minor issues were unique; in other words, they were not shared across tools. from the testers’ point of view, the presence of unique problems is explained by the fact that the elements of each tool combine to create unique environments. for example, some tech tools are more image based, while others are text based.40 our study shows that even tools that initially appear similar are revealed as unique through assistive technology testing.

an interesting finding concerned padlet. when tools have problems, these issues usually exist across all of the screen readers used for testing. padlet, however, caused inconsistencies across different screen readers. for example, padlet shelf had many unlabeled or inappropriately labeled elements that created different experiences between voiceover, nvda, and jaws. moreover, when irregularities appear across similar assistive technologies, this might mean that developers created unusual code in order to facilitate specific visual elements or other aspects of the technology. developers should consider testing on multiple platforms and with two or more screen readers to catch these inconsistencies and should also consider whether simple html alternatives are possible in place of more complicated code.
regarding padlet, it may be possible that software developers used accessible rich internet applications (aria) code, which is known to cause inconsistencies for assistive technology. whenever possible, user experience should be consistent across screen readers. users should never be asked to switch assistive technologies in order to adapt to a tech tool.

although we sampled only five tech tools, when considering the breadth of other tools in the market and those that may not yet be developed, we wonder whether our results could indicate an abundance of unique environments with unique digital accessibility problems. this inference suggests that software developers may not be creating tech tools with digital accessibility in mind or may be testing with only one type of screen reader. it also speaks to the lack of digital accessibility best practices in software development for educational tech tools. if anything, our results also illustrate the complexity of tech tool environments and the nuances of assistive technology.

digital accessibility benefits users with different abilities

digital accessibility is valuable for everyone, not just people with disabilities. two specific values illustrate this comprehensive benefit. first, if standards for digital accessibility are followed, digital content will be more “portable across platforms, browsers, and operating systems.”41 this interoperability could mean that learning content and properly formatted tech tools will be easy to use across assistive technologies and devices such as smartphones. secondly, accessible features benefit people who do not see themselves as having a disability.42 for example, covid-19 amplified the benefits of using captioning for all learners, even when these learners did not have a specific disability.43 a 2004 microsoft survey also inferred that accessibility features benefit a wide variety of people.44 while a person with a disability benefits from clear organization, headings, labels, and color contrast, those aspects are also helpful for all users.

recommendations and next steps

planning with intention

teaching librarians need to invest time learning about the environment of a tech tool they decide to use in il instruction. sometimes, tech tools that are digitally accessible are not easy or intuitive for instructors to use. we experienced this “easy to use” versus digital accessibility conflict when preparing the scripts for mentimeter and padlet. padlet is used extensively at our institution due perhaps to its instructor-friendly platform. however, padlet’s wall, shelf, and map assessments revealed many problems with digital accessibility. additionally, we had a difficult time creating a quiz in mentimeter, finding this platform unfriendly for the instructor; yet this tool had the fewest digital accessibility problems. this tension between ease of use and digital accessibility illustrates the importance of taking time to read and understand documentation and training materials before creating engagement activities for il sessions.

we encourage teaching librarians to work with their local digital accessibility offices to evaluate the technology used most frequently in il classrooms. if a digital accessibility office does not exist on your campus, you may wish to advocate for your administration to create one.
even if your institution does not yet have a digital accessibility office, there are ways for librarians to plan their il sessions with accessibility in mind. librarians may do basic assessments of tech tools without access to assistive technology by testing whether it is possible to access all features in a given tool using the tab keyboard key. if there is a function or action that you cannot access with tab or that you must use a mouse to navigate to, then that part of the tool will not be accessible to someone using a screen reader. you can also unplug or turn off the mouse and attempt using a tech tool. librarians can approach each tech tool by asking: is there anything in the tool that uses only images or colors? do sounds convey a meaning that is not otherwise communicated on the screen? if there is anything in the tool that relies on a single form of sensory feedback, it may be unperceivable to people using assistive technology. finally, we strongly suggest considering whether these tools add value to il instruction. if you like a certain tool but know it is inaccessible (or you are unsure), consider trying a different way of involving students in the same kind of engagement. think about simplifying the tech tools that you do use. extend or turn off timers where possible if you choose to use kahoot!, mentimeter, or other quiz-making tools. avoid using questions on any platform that require users to engage with images, even if alt text is provided, because they tend to be more difficult for screen readers. pursue documentation and take time to understand various options for each tool and each question, then weigh which option will be most accessible for most students. it takes time and energy to plan ahead with intention but increasing the ability of all students to engage in learning makes the planning process worthwhile. 
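the tab-key check described above is a manual test, and nothing replaces doing it by hand. as a rough complement, the following python sketch scans static markup for elements that respond to clicks but that a keyboard user may not be able to reach. the rule it applies (an `onclick` attribute on an element that is not natively focusable and has no non-negative `tabindex`) is a simplifying assumption, the list of natively focusable elements is partial, and handlers attached from scripts are invisible to it. the names `ClickTargetAudit`, `is_in_tab_order`, and `find_mouse_only_targets`, as well as the sample markup, are hypothetical.

```python
from html.parser import HTMLParser

# A partial list of elements that browsers place in the tab order by default.
NATIVELY_FOCUSABLE = {"button", "input", "select", "textarea"}


def is_in_tab_order(tabindex):
    """True when a tabindex value puts the element in the tab order."""
    try:
        return int(tabindex) >= 0
    except (TypeError, ValueError):
        # Missing or non-numeric tabindex: not explicitly focusable.
        return False


class ClickTargetAudit(HTMLParser):
    """Collect click targets that look unreachable with the tab key."""

    def __init__(self):
        super().__init__()
        self.mouse_only = []  # (tag, id) pairs

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if "onclick" not in attributes:
            return
        if tag in NATIVELY_FOCUSABLE:
            return
        if tag == "a" and "href" in attributes:
            return  # links with an href are focusable
        if is_in_tab_order(attributes.get("tabindex")):
            return
        self.mouse_only.append((tag, attributes.get("id", "")))


def find_mouse_only_targets(markup):
    """Return (tag, id) pairs for click targets outside the tab order."""
    parser = ClickTargetAudit()
    parser.feed(markup)
    parser.close()
    return parser.mouse_only


# Made-up toolbar markup for illustration only.
SAMPLE = """
<button id="add-post" onclick="addPost()">add post</button>
<div id="pen-tool" onclick="selectPen()">pen</div>
<span id="eraser" onclick="selectEraser()" tabindex="0" role="button">eraser</span>
<a id="help" onclick="openHelp()">help</a>
"""

for tag, element_id in find_mouse_only_targets(SAMPLE):
    print(f"<{tag} id={element_id!r}> may not be reachable with the tab key")
```

anything a scan like this reports still needs to be confirmed by pressing tab through the real page, and a clean report does not mean the tool is accessible.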
collaboration

if collaboration between librarians and digital accessibility experts is possible on your campus, take the time to talk to one another about learning outcomes and reasons for using specific tech tools. consult with experts in digital accessibility who can also help you advocate for accessibility clauses in purchase contracts before agreeing to subscribe to a given tool or service. you may also foster collaboration with an inclusive community of practice if you have one at your library. further, the teaching and learning unit on your campus may offer support for integrating technology with pedagogy to promote the engagement and learning experience of all students attending il instruction. this collaboration may be impactful for the library and the campus teaching community.

as librarians with teaching responsibilities, we usually do not work in isolation. instruction librarians can also serve as a resource for teaching faculty who may want to incorporate accessible tech tools into their instruction. in addition, librarians could investigate professional organizations that provide support and development in understanding digital accessibility.

while a framework for assessing tech tools for accessibility does not currently exist, the development of standards and best practices would be beneficial for librarians, software developers, and accessibility professionals alike. we hope to undertake future research and consultation to develop such frameworks with colleagues, possibly through ala round tables or acrl sections focused on instruction and accessibility.

next steps

our next steps include sharing these reports with the companies who created the five tools we tested.
we will ask them to prioritize both the most severe issues and those issues that are easy to fix and that impact user experience. we will also underscore those areas that surprised us, such as inconsistencies between screen readers for the same issue in a given environment. the goal of this outreach will be to build relationships with tech tool developers so that continued dialogue and testing can occur. the ultimate goal is a more accessible learning environment for everyone, with technology vendors as partners in this journey.

conclusion

advocating for digital accessibility in research libraries requires relationship and capacity building. the challenges faced during emergency remote learning illustrate the necessity of campus units working together to ensure student inclusion and success. increased collaborations between academic libraries, tech tool developers, and digital accessibility offices mean that all parties can benefit from mutual expertise. librarians may share the kinds of tech tools being used in il sessions, while accessibility offices may test those tools and provide recommendations for improvement, which may then be leveraged when working with software companies to advocate for positive change. if more people are aware of digital accessibility vocabulary, needs, and resources across campus, that can also augment the number of people available to respond to and triage needs when future emergencies arise.

acknowledgment

we would like to thank scott holman and eric klinger from the cu boulder writing center for their help revising this manuscript.

appendix a: tester instruction and script

background

poll everywhere is an online platform used in classrooms to engage with students through questions, surveys, and polls.
people can sign up for a free account or for one of four subscription based account options. the free account allows users to create unlimited questions, have access to webinar tutorials, and upload images as question choices. this tool also allows people to respond via browser, sms, or app; to export data and screenshots; and to share to social networks, though some of these features are limited with a free account. poll everywhere script 1. type in your browser or click on the link provided. a pop-up might show on your screen. agree to the cookie policy if it does. 2. you may be prompted to introduce yourself and enter the screen name you would like to appear alongside your responses. 3. click continue. the survey will let you know that there are six questions. click start survey. 4. the first question is multiple choice. select your favorite sport. 5. click next on the upper right-hand corner. 6. the second question is a short response. type your favorite ice cream flavor. click submit. you can enter as many answers as you want. when you are ready to go to the next question click next. information technology and libraries december 2022 tech tools in pandemic-transformed information literacy instruction 23 rybin koob, ibacache, williamson, lamont-manfre, hugen, and dickerson 7. the third question is also a short response. type in your favorite food. you can enter as many answers as you want. when you are ready to go to the next question click next. 8. the fourth question is also a short response. type what you are looking forward to this semester. you can enter as many answers as you want. when you are ready to go to the next question click next. 9. the fifth question is a clickable image question. click on the face that describes how you are feeling today. for this question, if you want to clear your response and enter a new one, you may do so by clicking “clear last response.” when you are ready, click next. 10. the sixth question is a ranking question. 
you need to use the arrow feature, which appears when you click next to the image. move images up and down, organizing them from favorite to least favorite. for this question, if you want to clear your response and enter a new one, you may do so by clicking “clear last response.” when you are ready, click submit.
11. click finish in the upper right-hand corner. a screen will appear that says “all done!”
the results of the survey are only available when the creator of the survey presents them in class. we were not able to figure out a way for respondents to access group responses asynchronously.
notes
• we noticed that when preparing questions 4 and 6, we were not prompted to enter alt-text by default.
• the creator of the poll must enable alt-text for clickable image questions (such as question 4) by going to the user profile and selecting “features lab.”
• alt-text did not seem to be available for question 6.
appendix b: digital accessibility assessment report for poll everywhere
information
• testing tools:
o windows 10 / jaws 2021 / nvda 2021.3.1 / google chrome (most recent version)
o pixel 3a, android 12 / talkback / last updated 9/30/21
o iphone 12 mini / voiceover / safari ios 14.3
• testing dates: february 27, 28, and march 3, 2022
summary
this document provides an overview of the issues the digital accessibility office (dao) identified on the poll everywhere platform. overall, we found the site to be relatively accessible for many individuals with disabilities or who use assistive technologies or alternative forms of access depending on the question type.
the questions with images (to rank or select) were inaccessible. that said, through our testing, we found five severe issues, two significant issues, one minor issue, and one usability issue. severe issues represent items that create access barriers and need to be remediated, significant issues represent items that create a great deal of difficulty and should be remediated, and minor issues represent items that are the lowest priority but would be good to remediate. usability issues can impact users of any ability. if there are questions, concerns, or the desire to see demos of the issues presented in this report, please reach out to the assessment & usability testing team. please also consider filling out the assessment & usability testing feedback form to help us improve our testing protocols.
issues
severe
graphics are unlabeled or inappropriately labeled
• in question 6, there are four pictures of animals. the screen readers read all four images as “unlabeled images.” there is no differentiation between the four images. appropriate image descriptions are needed.
o additionally, while reviewing the history of submissions, the answers are a list that reads “(an image), (an image), (an image), (an image).”
• several elements have dots in the label name. when using voiceover, these elements were read as “unpronounceable. [braille dots] ...” followed by numbers.
o these elements included the marker on the emoji image, the up and down arrow buttons on question 6, and the finished icon.
element presents gesture/navigation traps
• on question 6, while using voiceover and talkback, the user could not swipe between the answer options. this made the buttons, links, and text before and after the options inaccessible.
o a tester was able to leave the trap, but they had to use direct touch and focus landed outside the answers.
o additionally, while using talkback, there was not any indication that the image was being moved up or down.
element not available by keyboard or screen reader
• question 5 is an emoji question where the user would need a mouse or direct touch (while not using a screen reader) to answer successfully. the alternative text says there are emojis, but the user does not know what five emotes or different colors are presented. to activate, the user selects “enter” or double taps (mobile screen reader). this makes a random selection and places the marker in the middle of the image without a way to move the marker to the appropriate emoji.
errors do not get focus
• during one instance, a user received an error that the response was not submitted. during this one instance, the focus was not pushed to the error message. the user would have to know it was there. ideally, the focus would be pushed to the error so all users would be aware that an error had occurred.
significant
element state not indicated
• in question 6, the unlabeled image reads as “clickable” to nvda. when selecting enter, the state of the element is not announced. visually, the image gets larger.
inconsistent focus handling
• focus handling for all tools could be improved. focus goes to different areas of the page after responses, returning to previous questions, or refreshing the page.
o focus inconsistencies depended on the screen reader. while going through the questions, focus would go to the top of the page, “close app download offer” button, “submit,” or “next.” ideally, focus would be on the heading 1 for each question.
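the color contrast finding in this report's usability section (a 4.5:1 requirement against measured ratios of 2:1 and 1.8:1) can be checked with a short calculation. the following is a minimal sketch in python, assuming the wcag 2.x definitions of relative luminance and contrast ratio. the function names and the sample colors are hypothetical illustrations; they are not taken from the poll everywhere interface or from the dao's testing tools.

```python
from typing import Tuple

RGB = Tuple[int, int, int]


def _linearize(channel: int) -> float:
    """Convert one 8-bit sRGB channel (0-255) to its linear-light value."""
    c = channel / 255.0
    # Piecewise sRGB transfer function, using the threshold published in WCAG 2.0.
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(color: RGB) -> float:
    """Relative luminance of an sRGB color: 0.0 for black, 1.0 for white."""
    r, g, b = (_linearize(v) for v in color)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground: RGB, background: RGB) -> float:
    """Contrast ratio (L_lighter + 0.05) / (L_darker + 0.05), from 1.0 to 21.0."""
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def meets_minimum(foreground: RGB, background: RGB, minimum: float = 4.5) -> bool:
    """True when the pair reaches the minimum ratio (4.5:1 by default)."""
    return contrast_ratio(foreground, background) >= minimum


if __name__ == "__main__":
    white = (255, 255, 255)
    # Hypothetical sample colors, chosen only to illustrate the check.
    samples = [
        ("black", (0, 0, 0)),
        ("mid gray", (118, 118, 118)),
        ("light gray", (170, 170, 170)),
    ]
    for name, color in samples:
        ratio = contrast_ratio(color, white)
        verdict = "meets" if meets_minimum(color, white) else "falls below"
        print(f"{name} on white: {ratio:.2f}:1 ({verdict} 4.5:1)")
```

under this definition the ratio does not depend on which color is the text and which is the background, and any pair measuring 2:1 or 1.8:1, as reported for the multiple-choice question, falls well below the 4.5:1 minimum.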
minor
same information is presented to screen reader multiple times
• while using voiceover, the instructions, questions, and answers were announced multiple times. this was noted on several occasions.
usability
insufficient color contrast (4.5:1)
• in the multiple-choice question, after selecting an answer, the question’s color becomes lighter. the lighter color has insufficient color contrast for both the answer selected (2:1) and the answers not selected (1.8:1).
endnotes
1 kathia ibacache et al., “emergency remote library instruction and tech tools: a matter of equity during a pandemic,” information technology and libraries 40, no. 2 (2021): 8, https://doi.org/10.6017/ital.v40i2.12751.
2 there are many examples of guides which include ada compliance and library spaces, including but not limited to william w. sannwald, checklist of library building design considerations, 6th ed. (chicago: ala editions, an imprint of the american library association, 2016); and carrie scott banks et al., including families of children with special needs: a how-to-do-it manual for librarians (chicago: american library association, 2014).
3 “introduction to the ada,” ada.gov: united states department of justice civil rights division, accessed june 15, 2022, https://www.ada.gov/ada_intro.htm.
4 “introduction to the ada.”
5 ariel pomputius, “assistive technology and software to support accessibility,” medical reference services quarterly 39, no. 2 (2020): 203, https://doi.org/10.1080/02763869.2020.1744380.
6 fernando h. f.
botelho, “accessibility to digital technology: virtual barriers, real opportunities,” assistive technology 33, no. s1 (2021): s31, https://doi.org/10.1080/10400435.2021.1945705.
7 botelho, “accessibility to digital technology,” s27.
8 jonathan lazar, “planning for digital accessibility in research libraries,” research libraries issues, no. 302 (2021): 20, https://doi.org/10.29242/rli.302.3.
9 “digital accessibility,” georgetown law, accessed june 15, 2022, https://www.law.georgetown.edu/your-life-career/campus-services/information-systems-technology/digital-accessibility/.
10 lazar, “planning for digital accessibility,” 21.
11 lazar, “planning for digital accessibility,” 19.
12 lazar, “planning for digital accessibility,” 26–28.
13 education of individuals with disabilities, 20 u.s.c. §§ 1400–1485 (suppl. 2 1988), https://tile.loc.gov/storage-services/service/ll/uscode/uscode1988-03202/uscode1988-032020033/uscode1988-032020033.pdf; see also kathleen puckett and kim w. fisher, “assistive technology,” in the sage encyclopedia of intellectual and developmental disorders, ed. ellen b. braaten (thousand oaks, ca: sage publications, inc., 2018), 100–101.
14 puckett and fisher, “assistive technology,” 100.
15 claire kearney-volpe and amy hurst, “accessible web development: opportunities to improve the education and practice of web development with a screen reader,” acm transactions on accessible computing 14, no. 2 (july 21, 2021): 8:2.
16 emily k. chan and lorrie a. knight, “‘clicking’ with your audience: evaluating the use of personal response systems in library instruction,” communications in information literacy 4, no. 2 (march 1, 2011): 192–201, https://doi.org/10.15760/comminfolit.2011.4.2.96.
17 referring to the use of padlet to foster collaboration in a statistics course, henrik skaug saetra suggests that students feel more welcome to ask basic questions in an anonymous environment, in “using padlet to enable online collaboration mediation and scaffolding in a statistics course,” education sciences 11, no. 5 (2021), 6, https://eric.ed.gov/?id=ej1297247. christopher j. e.
anderson notes that anonymity invites classroom discussion participation for introverted students and states that answers can be reviewed without “requiring participants to reveal their choice, thus removing stigmas that keep many introverted students from orally participating,” in “repurposing digital devices: using poll everywhere as a vehicle for classroom participation,” journal of teaching and learning with technology 7 (2018): 154, https://eric.ed.gov/?id=ej1307006.
18 citing a 2010 article by b. jean mandernach and jana hackathorn, jared hoppenfeld states that anonymity provides a means for students to answer honestly, in “keeping students engaged with web-based polling in the library instruction session,” library hi tech 30, no. 2 (2012): 238, https://doi.org/10.1108/07378831211239933. see also anderson, “repurposing digital devices,” 154.
19 this paper considered pedagogical approaches when using padlet in the classroom, noting that this tool did not enhance criticality or students’ desire to counter a post by a classmate; see ann rosnida md deni and zainor izat zainal, “padlet as an educational tool: pedagogical considerations and lessons learnt,” proceedings of the 10th international conference on education technology and computers (october 2018), 157, https://doi.org/10.1145/3290511.3290512.
20 some authors surmise that an instructional game could be used to prepare students for exams, for example, patricia a. baszuk and michele l. heath, “using kahoot! to increase exam scores and engagement,” journal of education for business 95, no. 8 (2020): 550, https://doi.org/10.1080/08832323.2019.1707752. examining technology as a tool to enhance teaching and learning in engineering classes, vian ahmed and alex opoku mentioned that students found kahoot!
a useful online tool that helped them reflect, apply knowledge, and receive feedback, in “technology supported learning and pedagogy in times of crisis: the case of covid‐19 pandemic,” education and information technologies 27 (2021), https://doi.org/10.1007/s10639-021-10706-w. darren h. iwamoto et al. assert that kahoot! provided students with a fun activity that helped them memorize important concepts, in darren h. iwamoto, jace hargis, erik jon taitano, and kv vuong, “analyzing the efficacy of the testing effect using kahoot! on student performance,” turkish online journal of distance education 18, no. 2 (2017): 83, 89, https://eric.ed.gov/?id=ej1145220.
21 iwamoto et al., “analyzing the efficacy,” 83, 89.
22 carolyn m. plump and julia larosa, “using kahoot! in the classroom to create engagement and active learning: a game-based technology solution for elearning novices,” management teaching review 2, no. 2 (2017): 156, https://doi.org/10.1177/2379298116689783.
23 maureen knapp, “technology for one-shot instruction and beyond,” journal of electronic resources in medical libraries (2014): 224, https://doi.org/10.1080/15424065.2014.969969.
24 liya deng, “assess and engage: how poll everywhere can make learning meaningful again for millennial library users,” journal of electronic resources librarianship 31, no. 2 (2019): 63, https://doi.org/10.1080/1941126x.2019.1597437.
25 a survey- and interview-based study by engineering faculty members ahmed and opoku examined both instructors’ and students’ perceptions of technology-supported learning during times of crisis. with regard to technological and pedagogical best practices, student participants noted that interactive feedback tools such as kahoot! helped them synthesize and apply their knowledge. as one student said, “kahoot! was a fun and interactive application and engaging.” see ahmed and opoku, “technology supported learning and pedagogy,” 381.
26 razzaqul ahshan, “a framework of implementing strategies for active student engagement in remote/online teaching and learning during the covid-19 pandemic,” education sciences 11, no. 9 (2021): 487, https://doi.org/10.3390/educsci11090483.
27 jillianne code, rachel ralph, and kieran forde, “pandemic designs for the future: perspectives of technology education teachers during covid-19,” information and learning sciences 121, no. 5/6 (january 1, 2020): 419–31, https://doi.org/10.1108/ils-04-2020-0112.
28 code, ralph, and forde, “pandemic designs,” 426.
29 one such study examines the impact of covid-19 on higher education in ethiopia; see berhanu abera, “the effects of covid-19 on ethiopian higher education and their implication for the use of pandemic-transformed pedagogy: ‘corona batches’ of addis ababa university in focus,” journal of international cooperation in education 24, no. 2 (2021): 3–25. another study focuses on polish primary school integration of ipads; see lucyna kopciewicz and hussein bougsiaa, “understanding emergent teaching and learning practices: ipad integration in polish school,” education and information technologies 26, no. 3 (2021): 2916, https://doi.org/10.1007/s10639-020-10383-1. a third article explores pandemic-transformed pedagogy from the perspectives of early childhood instructors in the caribbean; see sabeerah abdul-majied, zoyah kinkead-clark, and sheron c.
burns, “understanding caribbean early childhood teachers’ professional experiences during the covid-19 school disruption,” early childhood education journal (2022), https://doi.org/10.1007/s10643-022-01320-7.
30 paul f. burke et al., “exploring teacher pedagogy, stages of concern and accessibility as determinants of technology adoption,” technology, pedagogy and education 27, no. 2 (2018): 149–63, https://doi.org/10.1080/1475939x.2017.1387602.
31 burke et al., “exploring teacher pedagogy,” 158–59.
32 ibacache, rybin koob, and vance, “emergency remote library instruction,” 9.
33 the authors of this study hold roles as academic subject specialist librarians and digital accessibility office staff, including accessibility and usability team testers.
34 “free accessibility tools and assistive technology you can use today,” bureau of internet accessibility (blog), october 26, 2018, https://www.boia.org/blog/free-accessibility-tools-and-assistive-technology-you-can-use-today; see also “chapter 1, introducing voiceover,” in voiceover getting started guide, apple, inc., accessed june 16, 2022, https://www.apple.com/voiceover/info/guide/_1121.html.
35 “get started on android with talkback,” android accessibility help, accessed june 16, 2022, https://support.google.com/accessibility/android/answer/6283677?hl=en.
36 “about nvda,” nv access, accessed june 17, 2022, https://www.nvaccess.org/about-nvda/.
37 jaws was developed by freedom scientific members with vision loss; see “jaws®—freedom scientific,” accessed june 16, 2022, https://www.freedomscientific.com/products/software/jaws/.
38 “colour contrast analyser (cca),” tpgi, accessed june 16, 2022, https://www.tpgi.com/color-contrast-checker/.
39 ibacache, rybin koob, and vance, “emergency remote library instruction,” 9.
40 jamboard is very visual, with multiple options such as sticky notes, drawing pens, and image searching. other tools such as kahoot! and mentimeter are not solely visual; they also include additional moving parts, such as sounds and other notifications.
41 lazar, “planning for digital accessibility,” 21.
42 lazar, “planning for digital accessibility,” 21.
43 lazar indicated that captioning benefits people who process information in different ways, who are learning the language being used, or who may otherwise struggle to understand a dialect, in “planning for digital accessibility,” 21.
44 forrester research, inc., accessible technology in computing: examining awareness, use, and future potential, redmond, wa: microsoft corporation (2004): 9, http://download.microsoft.com/download/0/1/f/01f506eb-2d1e-42a6-bc7b-1f33d25fd40f/researchreport-phase2.doc.
highlights of isad board meeting
1974 midwinter meeting
chicago, illinois
monday, january 21, 1974
the meeting was called to order at 10:15
a.m. by president frederick kilgour. those present were: board: frederick g. kilgour, lawrence w. s. auld, paul j. fasana, donald p. hammer (isad executive secretary), susan k. martin, ralph m. shoffner, and berniece coulter, secretary, isad; guest: brett butler.
midwinter 1973 minutes approved. motion. mr. shoffner moved to approve the minutes of the midwinter 1973 board meetings. seconded by mr. fasana. carried.
las vegas annual meeting minutes accepted. a correction on page one of the las vegas annual meeting minutes was noted: mr. auld's name should be added to the list of guests present. motion. mr. fasana moved that the minutes of the isad board meetings at the las vegas annual conference be accepted as corrected. seconded by mrs. martin. carried.
isad history committee. the matter of appointing members to the isad history committee, whose function is to prepare a history of isad for ala's centennial celebration in 1976, was considered. mr. shoffner said that during the time he was president, he had rendered the isad history committee inactive. it was suggested by mr. kilgour that a historian would serve the purpose better than a committee. mr. shoffner remarked that he anticipated the chairman would be a historian. mrs. martin asked whether a check could first be made on whether ala was planning to publish any document for the centennial celebration that would make any preparation by an isad committee or historian worthwhile. mr. kilgour remarked that isad definitely should be included if ala did plan to publish any document and asked the board to give an "ok" to appoint a historian. motion. mr. fasana moved that the ad hoc isad history committee be abolished and recommended that the president be given the right to appoint a historian if ala planned to publish a centennial document. seconded by mr. auld. carried.
journal of library automation vol. 7/1 march 1974
ala dues structure. mr.
hammer explained the information submitted to the board concerning the proposed ala dues structure. the basic fee for ala membership under this proposed dues structure would be $35. membership in each division would be an additional $15. in essence, each division would be on its own financially: if there are not enough memberships to support a division, as could be the case, the division would cease to exist. isad could support itself with its present membership, but there is no way of knowing how many isad members would still select isad if the choice of two divisions included in the dues was removed. the divisions that publish a journal would attract membership much more easily than those that do not provide a journal.
mr. hammer further remarked that the proposed dues schedule indicates that the divisions must prove themselves with membership dues as their only support, but this does not apply to ala committees, scmai, units such as the office for intellectual freedom, office for library service to the disadvantaged, and the administrative and support units of ala. these units may be of great value to ala, but if one unit is forced to prove its value financially, then it seems that all should have to prove themselves. the divisions would be expected to depend on their own resources, e.g., if the division runs out of postage money, there would be no further mailings. the divisions would be expected to pay for their support services. the idea is very close to the federation plan which has been circulated for some time.
in answer to the question of how a new division would get started, mr. hammer replied that he assumed there would have to be enough memberships to provide for it financially.
mr. shoffner suggested that the discussion be divided into two parts: (1) the principle involved; and (2) the financial aspect.
the following points were brought up in the ensuing discussion by the board regarding the proposed dues structure:
starting a new division could be a problem; perhaps it could be subsidized for a stated time, after which the division would be self-sufficient.
the proposed separation of dues, however, would force a clarity in expenditures of ala in respect to how the divisions would benefit.
some divisions could not be self-supporting and yet are producing important contributions for ala.
a division would be at the mercy of the ala supporting units. if a support unit was not efficient, the divisions would be handicapped in the services to their members.
would a division be able to know enough in advance how much money could be counted on for program planning? the answer was "yes" based on past membership, except in the first year. the income would be predicted on the basis of the previous year's income. an excess of income would remain in the division's funds. if the division income fell short of the anticipated amount, it would have no back-up from ala as it has presently.
a person could not join one or more of the divisions without joining ala.
some divisions could become part of a stronger division, e.g., a division could be broken up and absorbed into several other divisions with related interests. was there any plan to absorb or redirect these divisions which obviously could not be self-supporting? nothing has been announced so far.
if a division got into financial difficulties, it could not cut down on its professional staff as a professional staff is needed to maintain ala's status with the internal revenue service. it was noted that there were more important reasons than this for maintaining a professional staff.
this proposal was drafted by the then deputy director ruth warncke in 1970. the board was informed that a cost study of ala was recently discussed by staff members, but the reply has been that it would take five years to make such a study.
the isad board disagreed with the period of five years, but stated that it could take a year.
a division should be allowed to set up its own budget under this proposal as well as have a voice in ala policy.
the proposal appeared to be unfair in some points: (1) some divisions would have about twice their present income through memberships, while isad would break about even; (2) life members would be entitled to membership in all divisions; (3) apparently institutions without a group insurance plan of their own could join ala for $35 and be entitled to the group insurance for their staffs; at some point an examination of the privileges in each category of membership should be made; and (4) if the $35 ala membership fee were increased in the future, this would directly affect membership in the divisions.
the isad budget for the 1973/74 year is approximately $47,000 and the journal of library automation $23,000, or a total of approximately $70,000. if isad membership should fall back to 3,000 members and the membership fee were $25, isad could still be viable.
mr. kilgour's poll of the board revealed all were in favor of the principle of more or less independent divisions, but with reservations; the following was therefore moved:
motion. mr. shoffner moved that the isad board favors the principle of divided annual fees for ala and for its divisions subject to: (1) division determination of the fee structure for division memberships and publications; (2) division participation in the governance of ala headquarters activities. seconded by mr. fasana. motion carried.
selective dissemination of information system. mr. hammer presented a proposal for establishing on a subscription basis a selective dissemination of information system for ala members (see exhibit 1). after discussion it was decided that mr.
hammer would contact ohio state university library and obtain information on exact procedure as to how this would be run, how it would be publicized, who would develop the profiles, who would handle the subscriptions, the cost to the division, etc., and then report to the board.
co-sponsorship of basic data processing seminars. mr. hammer presented a proposal to the board regarding co-sponsorship of basic data processing seminars with organizations outside isad, such as ibm and dataflow systems, inc. in bethesda, maryland. in the past isad seminars have generally been on library applications, but what he had in mind, mr. hammer said, was primarily on the basics of data processing, systems analysis, and other basic aspects that would be of interest to administrators. the intent would be to give administrators enough knowledge so that they could evaluate the results that they should be gaining from their data processing systems. these institutes would be a package deal in that the personnel and materials would be commercially supplied. dataflow has conducted seminars for the united states civil service commission. ibm has some seminars which are free, but there is a charge if they have to develop a special program. comment was made regarding seminars conducted several years ago where problems developed as to the commercial aspects. motion. it was moved by mrs. martin that the matter of isad's cosponsoring basic data processing seminars with outside organizations be referred to the isad program planning committee for discussion and their evaluation. seconded by mr. fasana. carried.
tuesday, january 22, 1974
the meeting was called to order by the president, mr. kilgour, at 2:25 p.m. those present were: board: frederick g. kilgour, lawrence w. s. auld, paul j. fasana, donald p. hammer (isad executive secretary), susan k. martin, ralph m. shoffner, and berniece coulter, secretary, isad; guests: alex allain, brigitte kenney, ron miller, and velma veneziano.
draft on ala goals and objectives. mrs. brigitte kenney sought feedback from the board on the paper previously distributed on the ala committee on planning's draft statement on ala's goals and objectives. several changes were suggested. mrs. kenney expressed her appreciation for their input.
freedom to read foundation. mr. alex allain from the foundation presented the cause of the freedom to read foundation in regard to the current problem of censorship. he stressed the desire to keep channels open with the divisions of ala and with systems and networks across the nation.
marbi and isad standards committee (tesla). velma veneziano, chairman of the marbi interdivisional committee, appeared before the isad board requesting clarification of the functions of marbi and the isad standards committee (tesla). she said that her committee would like discrepancies cleared up and duplications eliminated. mrs. martin suggested that the charges to both marbi and tesla be reworded to clarify their functions.
isad bylaws committee. in response to discussions concerning the establishment of several committees, mr. shoffner moved to establish an organization committee. seconded by mrs. martin. mr. fasana pointed out that the mechanism for establishing a bylaws committee was already spelled out in the isad constitution. the president can appoint the committee. motion withdrawn. mr. shoffner withdrew his motion. mr. fasana suggested that the bylaws committee also be charged with the organizational and review function. the matter of the standards committee's function was also made the charge of the bylaws committee.
wednesday, january 23, 1974
president kilgour called the meeting to order at 10:15 a.m. those present were: board: frederick g. kilgour, lawrence w. s. auld, paul j. fasana, donald p. hammer (isad executive secretary), susan k. martin, ralph m. shoffner, and berniece coulter, secretary, isad.
guests-brett butler, john kountz, ann painter, charles payne, james rizzolo, richard utman, velma veneziano, and david waite.
report of the nominating committee. the chairman, charles payne, announced the nominees for the 1974/75 slate of isad candidates: vice-president/president-elect: board member-at-large: henriette avram allen veaner ruth tighe maurice freedman the board members extended a vote of thanks to the nominating committee for their work.
report of marc user's discussion group. mr. james rizzolo, chairman, said most of the discussion in the discussion group revolved around ala, clr, and the change in clr's status which was moved in august from one irs classification to another. it is now an "operating foundation," i.e., it is active in programs rather than waiting for a reaction to a request using funds they have as a "carrot." also discussed was whether clr should fund and pick the participants or clr should do the funding and ala pick the participants. also the group considered the question of standards and how one arrives at them. there are a number of groups in ala dealing with standards, but there is a need to work out a systematic method of developing standards. there needs to be a routine mechanism set up for going from an initial formulation of an idea for a standard to a standard that the profession can live with.
journal of library automation vol. 7/1 march 1974
report of program planning committee. the committee met at the asis meeting in los angeles prior to meeting at the ala midwinter meeting. mr. brett butler, chairman, announced that three european librarians had been invited to participate in the 1974 annual program at new york city. mr. kilgour was handling all arrangements. mr. kilgour informed the board that the travel expenses of all three librarians were being provided for by sources outside ala. linda crismond is the local planning person for the 1975
san francisco annual conference program which will be sponsored jointly with asis. joshua smith had suggested mark radwin of lockheed as liaison and he had agreed to serve in this capacity. the new orleans institute on "alternatives in bibliographic networking" had enough registrants by midwinter to confirm it. there had been some difficulty concerning contact with speakers but the details had been straightened out. copies of the program for the new orleans institute were distributed. mr. butler also informed the board that his committee was looking into the details of cooperating with other institutions and state schools which might be interested in working with isad in a seminar or institute. the committee was also considering what type of programs should be presented, subcontracting to outside companies, and how to control these. the members of the committee were working on a procedure manual for use in conducting institutes.
telecommunications committee report. the activities of the telecommunications committee are highly organizational at present. the committee has swung away from cable tv as its primary interest and towards telecommunications as applied to bibliographic networks. the chairman, david waite, said there was a need to set up a simple guide to carry out their charge for the educational activities and legislation advisory responsibilities to the ala committee on legislation. more people would probably be appointed to the telecommunications committee as there was a need for more expertise to assign to the areas identified by the committee. he further said that the need now is to determine what existing apparatus may be utilized to fulfill the committee's responsibility to disseminate information regarding telecommunications as applied to the library community so that the committee could put most of its effort into technical work.
one project discussed was to gather background information on bibliographic data centers and network activities and their needs for telecommunication facilities in order to draft a requirements statement. the purpose of such a statement is that the committee could communicate with new telecommunications systems. the committee was not aware of an adequate statement of library requirements that is readily available for the commercial services that are steadily increasing. assignments have been given to gordon randall, maryann duggan, and ron miller to gather this information. mr. waite remarked that the committee would be interested in any report on the proposed isad networks committee when available. brett butler, chairman of the program planning committee, suggested that a telecommunications institute should be in the future plans and said that ideas about such an institute from mr. waite or any of his committee members would be appreciated.
report of the interdivisional committee on machine-readable bibliographic information (marbi). (see exhibit 2.) mr. kilgour appointed velma veneziano to serve as liaison to the isad standards committee from marbi. her term as chairman of marbi will conclude in june 1974.
report of cola discussion group. (see exhibit 3.)
report of committee on technical standards for library automation (tesla). (see exhibit 4.) report of chairman john kountz.
technological unemployment. president kilgour felt ala should do something about the spreading of unemployment due to increased use of technological development. mr. auld suggested that someone be appointed to study the potential and existing problems in this area. this could be funded either (1) under a fellowship by clr or (2) by application for the j. morris jones goals award. mr. fasana thought an interdivisional committee might be set up between the four most directly affected divisions: isad, lad, led, and rtsd. mr.
shoffner expressed his view that as efficiency is increased productivity is increased and could possibly therefore increase employment. mr. kilgour said that history had proved to the contrary. mr. shoffner stated he felt the problem was one of education and training. a specification of what is expected of one and what training he would receive during a technical changeover was needed. mr. fasana's suggestion was that the four divisions be asked for papers of their views or a program at the san francisco annual conference be prepared on the subject of technological unemployment. mr. auld asked if it could not rather be introduced at the new york annual conference, to which ann painter volunteered the use of the isad/led education committee's two-hour time slot for the program at new york.
motion. mr. fasana moved that mr. kilgour phrase a statement of the problem on technological unemployment as he sees it and present it to the isad/led education committee for consideration as the program theme at the new york conference. seconded by mrs. martin. carried.
proposed standards in jola tc. mr. john kountz brought up the subject of using jola tc for the interactive mechanism of presenting the proposal of a standard to the isad members for comment, and of having a form included to be filled out and returned. the board agreed that this was a good idea.
isad/led education committee report. ann painter, chairman, asked for clarification of appointment of new members to the committee. roger greer is the only member whose term continues past this year. mr. hammer was asked to find out who appoints members to the above committee. the committee is working on a series of papers defining educational "modules" and has sent out a revised questionnaire to identify appropriate subject areas. it is planning to send the questionnaires to associated institutions as well as to the ala accredited schools.
the need for funding the modules rather than depending upon volunteer or "slave labor" was considered by the committee. volunteers have little preparation time and so often there is a lack of depth or consistency in developing these modules. also the committee would like to set up a file of modules available to people across the country. there could be a problem of copyright involved. mr. kilgour asked miss painter for suggestions of people who might be interested in serving on the committee.
jola manuscripts. mrs. martin, editor of jola, asked the board for its feeling on whether it would be appropriate or desirable to put the date of acceptance on published manuscripts in jola. the board decided that should be the editor's decision.
vote of thanks to mrs. martin. the board gave mrs. susan martin a unanimous vote of thanks for her work in getting the issues of jola caught up to date in time to meet the post office deadline of december 31, 1973 in order to retain the second class permit.
report of the membership survey committee. (see exhibit 5.)
board minutes in jola. the board suggested that minutes published in jola be entitled "highlights of isad board meeting" rather than minutes.
the meeting was adjourned at 12:30 p.m.
exhibit 1
proposal for establishing on a subscription basis a selective dissemination of information system for ala members
the original proposal for an sdi system was intended for isad members only, but interest has grown at ala headquarters to the extent that it is being considered as a service to be provided for all ala members. the proposal therefore does not require any action on the part of the isad board. it is presented here for information and to give the board members an opportunity to comment on the idea and make suggestions toward developing the best possible procedure.
it is hoped that a presently operating system can be found that would enable ala members to subscribe to a system using multisubject data banks that would automatically adjust profiles according to past output results and that would supply as requested copies of articles and documents whenever possible. such documents would of course be supplied at a fee additional to the basic subscription fee. it is also hoped that the operators of the system would be responsive to subscriber feedback and would improve the system as warranted.
at present the only existing data banks in the library and information science fields are eric and marc, but hopefully as time goes on others will be developed. for example, it would seem prudent for the h. w. wilson company to consider the sale of library literature in machine-readable form. in any event, there is no reason to limit subscriptions to the service to information science data banks. if interested, members of ala could subscribe to other subject fields depending upon the data banks made available by the operating service. chemistry librarians could, if useful to them, subscribe to chemical abstracts condensates, engineering librarians to engineering index, etc., etc. only time and the availability of sdi can determine the interest of librarians in such services.
at the time of writing, only one of the two agencies contacted for information has provided descriptive data on their system. a copy of one of the papers sent by the ucla center for information services is attached. ohio state university libraries had not as yet responded. enquiries will be made with other operating systems so that a basis for comparison will be available for decision at ala headquarters. comments and suggestions from isad board members would be appreciated. information regarding presently operating systems would also be of great value.
december 13, 1973
exhibit 2
reports of the meetings of the marbi committee (interdivisional committee on representation in machine readable form of bibliographic information) january 19 and 20, 1974
number one priority was the resolution of the relationship between the library of congress and marbi in its capacity as the marc advisory group. there was discussion of the position paper which was presented at the las vegas meeting (copy attached) entitled "the library of congress view on its relation to the ala marc advisory committee." lc had revised certain portions of this paper to conform with marbi's wishes. these revisions were acceptable to the committee. there was concern, however, over an addition which pertained to marbi's role with regard to formats other than books and serials (namely films, maps, music, etc.). alternate wording to lc's proposal was worked out by paul fasana and john knapp.
several documents were submitted by henriette avram: (1) a proposed document numbering scheme for communications between lc and the committee and vice versa, and (2) proposed format for presenting changes to marc formats (copies attached). these documents and proposals were acceptable to the committee. (note: incidental to this discussion, the committee officially adopted "marbi" as its official acronym.)
1. the lc liaison presented two proposed marc format changes for the committee's consideration entitled: lc/marbi 2-addition of $x subfield for 4xx fields to allow for issn. lc/marbi 3-specification of the 830 field. the committee decided that the following plan of action would be followed with regard to these two changes: they would be announced and distributed to isad marc users' discussion group at its january 21, 1974 meeting. the proposed changes would be sent to all on mudg's mailing list, asking for replies to the marbi chairman by february 16, 1974.
the chairman would summarize responses and poll marbi committee members who would respond by march 16, 1974. the marbi committee chairman would respond to lc by march 16, 1974. marbi will request publication of changes in jola technical communications.
2. henriette avram presented to the committee a clr statement which had been presented to arl entitled "a composite effort to build an on-line national serials data base." the committee took note of the presentation with interest and voted to take no action on the matter at the january 19 meeting.
3. the character set subcommittee of marbi reported that it had issued a written report which will be used in support of the united states position concerning development of standards within the international standard organization. marbi issued thanks to the subcommittee and requested that they remain convened pending review of further developments coming from activities within iso.
4. there was a report on activities of the ad hoc committee convened by clr to discuss use of the marc format in a network environment. a paper entitled "sharing machine readable bibliographic data: a progress report on a series of meetings sponsored by the council on library resources" was discussed. the committee took note of these activities with interest and will wait for formal submission of format changes from the library of congress.
5. marbi discussed the apparent overlap of the charge between marbi and the new isad committee on technical standards. marbi passed a resolution that the isad representatives should bring to the attention of the isad board its concern over the similarity of the function statements of the two committees, and asked that these apparent discrepancies be considered and any duplication be eliminated.
6. the proposed marbi serials task force was discussed.
it was felt that marbi committee members needed to keep up on developments, and that the chairman should continue to collect and distribute as much documentation as possible to the committee members. it was decided that there was no need at this time to set up a separate subcommittee to perform this function.
7. the proposed amendments to iso 2709-1973(e) were discussed. it appears that there are several proposals circulating to change this standard. marbi formed a subcommittee to study these proposals and respond, and possibly, to make counterproposals. the position of marbi will be reported to the chairman of ansi z-39, sc/2 and will be used in support of the u.s. position within iso. any committee member or interested professional may reply individually. the subcommittee appointed consists of charles payne, john knapp, mike malinconico, and charles husbands. response will be made by april 1, 1974.
at its regular scheduled meeting, on january 20, all members were present. (john byrum was unable to attend the unofficial meeting on january 19.) the distribution of the rtsd and isad manual material was discussed. the discussion of the previous day was summarized for purposes of review and for the benefit of the nonmembers attending the meeting.
1. marbi and lc. the alternative wording to the lc position paper was presented by paul fasana. it was passed. henriette avram will have it published in lcib and will submit it to jola tc. lrts will also receive a copy. the paper will be submitted to each divisional board.
2. the national on-line union file of serials was discussed. larry livingston answered questions.
3. the character set subcommittee will see that isad has a copy of its report. interested professionals should ask for a copy from them.
4. the activities of the ad hoc clr committee were again reviewed.
5. the isad standards committee was discussed.
6. the serials task force for marbi was reported on.
7.
the proposed changes to iso 2709-1973(e) were reviewed.
new business:
8. the activity of the ifla working group on content designators was discussed. it was reported that there is an attempt to standardize content designators across national boundaries, for purposes of international exchange. there are problems in the area of cataloging rules, not all libraries participating, and language. no action was needed, as this is only for informational purposes at this time.
9. location codes were discussed, but the issue was tabled pending report of ad hoc clr committee.
10. language and geographic area codes were brought up, but it was not considered necessary to become involved.
11. the z39 standard account number (san) was reported on by emery koltay.
12. progress in regard to the publication of the isbd-m and s was discussed.
exhibit 3
cola report-midwinter '74
about fifty people were in attendance at portions of the four-hour meeting. the first half was taken up by a series of informal presentations about activity
at: by: stanford allen veaner csuc john kountz berkeley & ulap sue martin ulap cis project at ucla peter watson
at: nypl-rlg & suny plans university of chicago lc by: mike malinconico charles payne rob mcgee mary kay daniels
questions were entertained at the end of each presentation. the second half was opened by a few announcements by maryann duggan about the new orleans institute and henriette avram about the serials proposals. the major portion of the second half consisted of a panel discussion by john kountz, emery koltay, tom brady, and john knapp on the communication of orders, claim reports, ill requests and responses in machine-readable form. john kountz addressed general system design aspects, emery koltay discussed the isbn, issn, and standard account numbers, tom brady discussed b&t's experiences with batab, and john knapp addressed the nature of the data elements and the record structure itself.
considerable discussion followed the presentations, centering heavily on the isbn and its good points and failings. both parts of the meeting seemed to be well received. the major value of cola seems to be as an occasion for a wide variety of automation-oriented people to discuss a similarly wide variety of topics in an informal environment. there was some feeling that the presentations in the first half could have been more tightly controlled. the presentation in the second half was quite useful, i feel. i would like to suggest cola as a good sounding-board for proposals and place for announcements, distributions of handouts or written position papers. john kountz and i have discussed setting aside a portion of it for tesla reports.
respectfully submitted, brian aveney
exhibit 4
to: board of directors, information science and automation division
from: john kountz, chairman, committee on technical standards for library automation
subject: report of committee's activities, ala midwinter meeting, 1974
the committee on technical standards for library automation (tesla) held its inaugural meetings on tuesday, january 1974 (4:30-6:00 p.m. and 8:30-11:00 p.m.). these were icebreaker meetings for a new group. in view of the interest that had been expressed in various quarters, several interested observers attended, as well as six of the seven committee members (for membership attendance see attached list). in addition, the following individuals were invited to meet with the committee and present their review of standards activities in other areas; establish a working perspective for the committee within the american library association; and delineate the constraints of the committee's charge: mr. fred kilgour, mr. don hammer, ms. velma veneziano, mr. emery koltay.
while the specific discussion that ensued covered a variety of topics, the central objectives for these two meetings (establishing/defining action areas, constraints, roles, and reviewing in some detail the committee's charge) were met. in addition, stress was placed throughout the discussion on differentiating between professional, service, bibliographic, and similar library standards, and the communications/clearinghouse function to be served by the committee in its dealings with technical standards impacting library automation. at its next meeting, the committee can be expected to complete its deliberations on the charge, complete a proposed pilot procedure for the handling of initiative/reactive requirements for standards, and recommend a shakedown of the proposed procedure.
committee on technical standards for library automation
ala midwinter meeting 1974
attendees of meetings held 21 january 1974
dr. edmund a. bowles, ibm
mr. arthur brody, bro-dart industries
mr. jay cunningham, university of california
mr. john kountz, chairman, california state university and colleges
mr. tony miele, illinois state library
mr. richard utman, princeton university
absent: ms. madeline henderson, national bureau of standards
exhibit 5
report of the membership survey committee
we mailed out 4,337 questionnaires as of november 3. as of last week, we had received 1,666 replies. they have now dwindled down to about five or six a day, so i feel we have probably received the majority of responses from our mailing. i hope for about a 40 percent response. the returns are presently being coded by my graduate assistant, and the university of south carolina computer centre will keypunch them for us. i am hopeful that we can start analyzing the results by the end of february, and have the report ready for you by april.
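the return figures above can be checked against the 40 percent hoped for. the following is a minimal python sketch, not part of the committee's report: the variable names are illustrative, and the only inputs are the 4,337 questionnaires mailed, the 1,666 replies received, and the 40 percent target stated above.

```python
# figures taken from the report above; variable names are illustrative only
mailed = 4337        # questionnaires mailed as of november 3
received = 1666      # replies received so far
target_percent = 40  # response rate the chairman hopes for

# response rate to date, as a percentage
response_rate = 100 * received / mailed

# replies needed to reach the target, using integer ceiling division
replies_at_target = -(-target_percent * mailed // 100)
replies_short = max(0, replies_at_target - received)

print(f"response rate to date: {response_rate:.1f} percent")
print(f"additional replies needed for {target_percent} percent: {replies_short}")
```

on these figures the rate to date is about 38.4 percent, and 69 more replies would bring it to 40 percent.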
the expenses to date have been:
preliminary mailing: $346.95
printing of envelopes: 164.32
return postage: 166.60
total: $677.88
the bill for printing the questionnaire hasn't been received yet but should be a very minor one. jim williams will write the program for the data, and the library school has computer time which we can use. i expect when all the expenses are in that the total will be more than the budgeted $700, but not very much more.
submitted by: elspeth pope, chairman; jim williams; bill summers; martha manheimer
letter from the editors (june 2023)
kenneth j. varnum and marisha c. kelly
information technology and libraries | june 2023 https://doi.org/10.6017/ital.v42i2.16xxx
welcome to the june 2023 issue. see below for updates on our hosting migration plans, editorial board membership, and our call for submissions. peer-reviewed articles in the current issue are listed here:
• supporting faculty’s instructional video creation needs for remote teaching: a case study on implementing eglass technology in a library multimedia studio space / hanwen dong
• technology integration in storytime programs: provider perspectives / maria cahill, erin ingram and soohyung joo
• a tale of two tools: comparing libkey discovery to quicklinks in primo ve / jill k. locascio and dejah rubel
mary carrier also shares her experience in creating a coding club for kids and teens in this month’s “public libraries leading the way” column. she highlights helpful resources and tools in her article “community-driven programming: offering coding and robotics classes in your library.”
ital will move to a new host this summer
as mentioned in the march letter from the editor, ital is moving from our longtime hosting at boston college to ala production services. a reminder of a few important details:
• our landing page url, https://italjournal.org/, will get you to the journal’s front page both now and after the migration is complete.
• urls for articles will change, but dois will continue to resolve to the new home of the journal. we will work with our current host, boston college, to set up redirects for as many informational pages as possible to the new location.
• for authors and reviewers, there should not be any significant differences because we will continue to use the open journal system platform.
• articles published in ital (and our two sibling journals, library leadership and management and library resources and technical services) will continue to be open access with no fees charged to authors or readers. authors maintain copyright in their work.
if you would like to receive an email when the september 2023 issue is published at our new location, create a user account by going to the user registration page. be sure to check the “yes, i would like to be notified of new publications and announcements” box near the bottom of the sign-up page to receive an email when future issues are published. we are very grateful to boston college for their support of information technology and libraries over the past decade and to the core board for supporting this migration project.
editorial board membership update
june marks the end of terms for six of our editorial board colleagues. marisha and i extend our gratitude to lori ayre, jon goddard, soo-yeon hwang, holi kubly, brady lund, and paul swanson for their commitment, dedication, and excellent service to ital and the library technology profession over the four years they have volunteered on the editorial board.
https://ejournals.bc.edu/index.php/ital/article/view/15201
https://ejournals.bc.edu/index.php/ital/article/view/15701
https://ejournals.bc.edu/index.php/ital/article/view/16253
https://ejournals.bc.edu/index.php/ital/article/view/16617
https://doi.org/10.6017/ital.v42i1.16319
https://italjournal.org/
https://ejournals.bc.edu/index.php/ital/user/register
in july, we will welcome new members to the editorial board. new members joining us for a two-year (renewable) term are: cindi blyberg, joanna dipasquale, john klima, ellen schmid, and le yang. we are excited to have them join us.
be a part of a future issue
as the u.s. academic year hurtles to a close this spring, it’s a great time to think about the work you’ve accomplished and what you might share with your library colleagues near and far. our call for submissions outlines the topics of interest to the journal (basically, if the submission discusses the intersection of libraries/archives/museums and technology, it’s potentially in scope) and the process for submitting an article. we’d love to consider your article for publication. or, if you have an idea you’d like to discuss with ital’s editors, contact either of us at the email addresses below.
kenneth j. varnum, editor marisha c.
kelly, assistant editor varnum@umich.edu marisha.librarian@gmail.com https://ejournals.bc.edu/index.php/ital/call-for-submissions
digital native librarians, technology skills, and their relationship with technology
jenny emanuel
information technology and libraries | september 2013 20
introduction
a new generation of academic librarians, who are a part of the millennial generation born between 1982 and 2001,1 are now of the age to either be in graduate school or embarking on their careers. often referred to as “digital natives” because their generation is believed to have always grown up online and with technology ubiquitous in their daily lives,2 many agree that this generation is poised to revolutionize library services with their technology skills.3 younger academic librarians believe that their technology knowledge makes them more flexible and assertive in libraries compared to their older colleagues, and they have different ways of completing their work. they refuse to be stereotyped into the traditional “bookish” idea of librarianship and want to transform libraries into technology-enhanced spaces that meet the needs of students in the digital age, redefining librarianship.4
this paper, as part of a larger study examining millennial academic librarians, their career selection, their attitudes, and their technology skills, looks specifically at the technology skills and attitudes toward technology among a group of young librarians and library school students. the author initially wanted to learn if the increasingly high-tech nature of academic librarianship attracted millennials to the career, but results showed that they had a much more complex relationship with technology than the author assumed.
literature review
the literature concerning the millennial generation focuses on their use of technology in their daily lives. millennials are using technology to create new ways of doing things, such as creating a digital video for a term project, playing video games instead of traditional board games, and connecting with friends and extended family worldwide through email, instant messaging, and social networking.5 they use technology to create new social and familial networks with friends that are based on the music they listen to, the books they read, the pictures they take, and the products they consume.6 they believe that their relationship with technology will change the way society views and relates to technology.7 with technology at their fingertips on a nearly constant basis, millennials have gained an expectation of instant gratification for all of their wants and needs.8
jenny emanuel (emanuelj@illinois.edu) is digital services & reference librarian, university of illinois, urbana.
millennials believe that technology is not a passive experience, as it was for previous generations.9 to them, technology is active and an experience by which they live their lives.10 they have grown up with reality television, which means anyone can have his or her fifteen minutes of fame. in turn, this means being heard, having their say, and becoming famous online are all natural experiences that can be shared by anyone.11 because they can create their own customized media and make media consumption an interactive, as opposed to a passive and hierarchical, experience, they believe that everyone’s opinion counts and deserves to be heard.12 even though they believe they are the greatest generation and expert users of technology, others have a different view.
for example, bauerlein argues that they are not intellectually curious, are anti-library, and blindly accept technology at face value while not understanding the societal implications or context of technology. they also consume technology without understanding how it works.13 within libraries, technology skills related to new librarians have been studied by del bosque and lampert, who surveyed librarians from a variety of library settings with less than nine years experience working as professional librarians. the survey found the majority (55 percent) understood that technology played a large part of their library education, but a similar percent (57 percent) did not expect to work in a technical position upon graduation. respondents also thought there was a disconnect between the technology skills taught in library school and what was needed on the job, with job responsibilities being much more technical than they expected. thus, even though more experienced librarians expected recent graduates to fill highly technical roles, library school did not prepare them for these roles and students did not opt to go to library school to gain strong technology skills. based on survey comments, the researchers noted two categories of new librarians: those who have a high level of technical experience, usually from a previous job in a technology related industry, and those who struggle with technology. for those who struggle with technology, technology was not the reason they decided to become librarians, and they wish their library school had more hands-on opportunities for technology instruction instead of teaching theoretical applications.14 method to understand, in part, the technology skills of millennial academic librarians and their attitudes toward technology, the author developed a two-part research study including an online survey and individual interviews with millennial librarians and library school students. 
first, an exhaustive three-part survey was created covering multiple aspects of millennial academic librarians, including demographics, career choice, specialization, generational attitudes, management, and technology skills. although the survey focused on many areas of data collection, this paper focuses only on technology skills. the survey was disseminated in may 2012 to 50 american library association (ala) accredited library schools in the united states as well as online outlets geared toward new librarians, including the new members round table (nmrt) electronic discussion list, nextgen-l (next generation librarians list), the ala emerging leaders program alumni electronic discussion list, and the ala think tank on facebook. the survey was open for 10 days. the survey also asked participants if they would be willing to participate in a follow-up interview. a total of 161 participants volunteered for a follow-up interview.

interviews began once the survey closed, and individuals were contacted via email to schedule an interview at their convenience. a total of 20 interviews were conducted in may and june 2012. the interviews were conducted using the audio-only function of skype and were recorded using the mp3 skype recorder software. the author then transcribed all of the interviews and coded the transcripts. the interview utilized open-ended questions to gather individual stories and offer support to the quantitative demographic and qualitative survey questions.15 the interview questions were semistructured and asked participants to explain in detail their path to becoming an academic librarian and their attitudes toward technology.

information technology and libraries | september 2013 22

results

there were 315 valid survey responses. the birth years of participants ranged from 1982 to 1990 (see figure 1).
the respondents were nearly evenly divided between library school students (45.5 percent) and individuals having already obtained a mls degree (52.1 percent). concerning the format of their library school program, 38.4 percent earned the degree at an institution entirely in person, 19.6 percent completed the degree entirely online, and 42.0 percent went to a program that was a mix of in person and online courses.

[figure 1: bar chart of participant counts by birth year; bar labels as extracted, 1982 through 1990: 41, 35, 64, 50, 45, 39, 33, 22, 2]

figure 1. birth-year distribution of survey participants.

quantitative data

millennials believe it is very important for librarians to understand technology, with 99 percent reporting that it is important or very important. data on skills related to technology were gathered through several questions, notably by using a list of technologies commonly used in academic libraries and asking respondents to rate their comfort level before starting library school, after library school, and at the present time. the results are illustrated in table 1.

| technology | before library school | after library school | at the present time |
|---|---|---|---|
| adobe dreamweaver | 1.93 | 2.5 | 2.46 |
| adobe flash | 2.28 | 2.61 | 2.66 |
| adobe photoshop | 2.66 | 3.15 | 3.22 |
| computer hardware | 3.03 | 3.27 | 3.32 |
| computer networking | 2.54 | 2.85 | 2.83 |
| computer security | 2.56 | 2.96 | 2.91 |
| content management systems (cms) | 2.34 | 3.32 | 3.29 |
| course management systems (blackboard, moodle, etc.) | 3.37 | 4.22 | 4.22 |
| file management issues | 3.00 | 3.72 | 3.67 |
| html | 2.56 | 3.56 | 3.48 |
| image editing/scanning | 3.47 | 3.87 | n/a |
| information architecture | 1.86 | 2.67 | 2.58 |
| integrated library systems (back end) | n/a | 3.05 | 2.93 |
| integrated library systems (front end) | n/a | 3.53 | 3.39 |
| linux/unix | 1.58 | 1.83 | 1.86 |
| mac os x | 2.92 | 3.31 | 3.45 |
| microsoft access | 2.55 | 3.19 | 3.26 |
| microsoft excel | 3.94 | 4.37 | 4.40 |
| microsoft windows | 4.57 | 4.67 | 4.71 |
| microsoft word | 4.66 | 4.76 | 4.79 |
| mobile devices | 4.27 | 4.51 | 4.60 |
| powerpoint | 4.43 | 4.62 | 4.65 |
| programming languages (c++, .net, etc.) | 1.53 | 1.94 | 1.84 |
| relational databases | 1.87 | 2.66 | 2.66 |
| screen capture software (camtasia, captivate, etc.) | 2.10 | 3.26 | 3.32 |
| server set up/maintenance | 1.56 | 1.85 | 1.84 |
| video conferencing | 2.61 | 3.36 | 3.54 |
| video editing | 2.28 | 2.90 | 2.94 |
| web 2.0 (rss, blogs, social networking, wikis, etc.) | 3.79 | 4.54 | 4.49 |
| web programming languages | 1.55 | 1.99 | 1.92 |
| xml | 1.60 | 2.40 | n/a |

table 1. average comfort level with technologies before and after library school and at the current time. scale: 1 = very uncomfortable to 5 = very comfortable.

this list can be split into categories based on the level of technical skill required. individuals were most comfortable with technologies that are used rather than technologies that enable people to create content, which generally require a higher level of skill. for example, people were comfortable with using content management systems (cms) and software used to create webpages, such as dreamweaver, but they were not comfortable with the information architecture skills, css, and html needed to create more complex websites. there was also a lack of understanding about relational databases, which serve as the back end of many online library resources that all librarians use to accomplish most reference work. other deficiencies include linux, an operating system commonly used to run servers, as well as server set up and administration, which run all web-based library resources and services.
there is also a strong lack of computer programming understanding and skills, including c++ and .net, as well as web programming languages such as php, asp, and perl. however, when asked which technologies they would like to learn, respondents listed computer and web programming languages the most often, along with other high-level technology skills, including xml, database software and vocabularies, geographic information systems (gis), adobe photoshop, and statistical software, such as spss. data from the technology questions also show that people are learning about technology in library school, but they are learning more about technology they already know how to use than technologies that are new to them. there are a couple of exceptions, including cms, course management systems, html, and screen casting software, with which respondents grew notably more comfortable while in library school (see table 1). more than 84 percent of respondents were required to take a technology course in library school, and they generally believed library school prepared them well to deal with the technological side of librarianship, rating 3.23 on a 1–5 scale. however, respondents did note that most of their technology skill was self-taught (81.7 percent), with only 47.5 percent stating that coursework contributed to their skills. an open-ended question asked what specific technology skills individuals wanted to learn. the results indicate that millennial librarians desire to learn more of the higher-level technology skills, especially programming, which was indicated in 28 of the 97 responses. other skills that were frequently noted include various elements of web programming, including scripting, xml, html, photoshop, microsoft access, spss, and gis. all of these skills either involve content creation (such as scripting, xml and html) or are complicated software that can require a great deal of training to master. 
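as a rough check on the exceptions noted above (cms, course management systems, html, and screen capture software), the short python sketch below recomputes the change in mean comfort, after library school minus before, for a handful of rows copied from table 1. the dictionary name, the function names, and the choice of rows are illustrative assumptions and not part of the original study; only the ratings themselves come from the published table.

```python
# Mean comfort ratings (1 = very uncomfortable, 5 = very comfortable)
# copied from Table 1: technology -> (before library school, after library school).
# The selection of rows is illustrative, not exhaustive.
TABLE_1 = {
    "content management systems (cms)": (2.34, 3.32),
    "course management systems": (3.37, 4.22),
    "html": (2.56, 3.56),
    "screen capture software": (2.10, 3.26),
    "linux/unix": (1.58, 1.83),
    "programming languages": (1.53, 1.94),
    "microsoft word": (4.66, 4.76),
    "microsoft windows": (4.57, 4.67),
}


def comfort_gains(table):
    """Return {technology: after - before}, rounded to two decimals."""
    return {
        tech: round(after - before, 2)
        for tech, (before, after) in table.items()
    }


def ranked(gains):
    """Return (technology, gain) pairs sorted from largest to smallest gain."""
    return sorted(gains.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for tech, gain in ranked(comfort_gains(TABLE_1)):
        print(f"{tech:35s} {gain:+.2f}")
```

on these rows, screen capture software, html, cms, and course management systems show gains of roughly 0.85 to 1.16 points, while already-familiar tools such as microsoft word and microsoft windows move by about a tenth of a point.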
see figure 2 for a tag cloud of technologies respondents want to learn.

figure 2. coded tag cloud indicating responses to question, “are there any other technologies you want to learn?”

clear trends emerged when millennial librarians were asked about technologies that will be most important to libraries in five years. mobile devices, including e-readers (such as the amazon kindle), apps, and tablet computers, were the most common category of responses, followed by social media and social applications aimed at libraries. cmss for managing website content were also very popular, and website design was also common. advanced knowledge of database design, including relational database design, and the storage of library data were frequently mentioned; these skills are on a higher level than simply using databases to retrieve information online. web 2.0 applications were also commonly mentioned, but it is unknown if these overlapped with social media. e-books, not unexpectedly, were very popular. the most popular technology individuals wanted to learn was programming, which came up 25 times, indicating there may be a gap between the technical skills that librarians have and the skills they need to have. see figure 3 for a visualization of coded responses.

figure 3. coded responses to question, “what three technologies will be most important to libraries in five years?”

qualitative data

interview participants were diverse and roughly matched the demographics of survey participants. demographic information for interview participants was gathered from their survey responses. ten of the interviewees were born in 1984 or 1985, with the remaining ten born during the remaining years between 1982 and 1989. three participants were male, one did not indicate sex on the survey, and the remaining 16 participants were female.
fifteen identified their race as white, two african american, one middle eastern, one hispanic, and one from multiple races. interview participants were from 14 different states. all participants were given pseudonyms for purposes of data analysis. the interview transcripts were coded into broad categories, including attitudes about being digital natives, and technical skills relating to career choice.

digital native

issues related to being digital natives came up often when millennial librarians were asked to talk about their experiences using technology before they began library school, in library school, and on the job. however, not all considered themselves a digital native, very tech savvy, or able to pinpoint exactly what their tech skills are. most, however, did believe that there were differences in technology use and attitudes between younger and older librarians.

childhood technology

most remember when they first had a computer in their home as a child, so technology was not a part of their lives from birth but rather from a young age that they can recall. betty and diana recall always having technology in their homes growing up because their parents worked in technology careers or had an interest in it as a hobby. as diana stated, “both [parents] worked in the it field, so when i was really little, they spent an astronomical amount of money on a computer back in the mid to late 1980s, so i’ve always grown up with technology.” others remember first being exposed to computers in school, with catharine saying, “i can remember being in elementary school and being on a computer and having specialized training.
not just in typing but they even pulled people out of class to learn how computers work.” heather vividly remembers her family getting their first computer: “we got one in my house when i was like in the sixth grade and that was a huge thing.” participants also remember having internet access as a child. betsy noted, “i had a prodigy (online service) account when i was seven, when most people did not even know what the internet was at that point.” gabby said, “i think they call people between 21 and 30 the ‘in between’, because they knew what it was like before technology, but they also know how to use technology . . . because i remember before computers.” kelly talked extensively about how she grew up with technology:

i think we got our first computer when i was in the fifth grade. i definitely grew up with it. i used it in school. i remember what life was like before computers, though. i have that little bit of perspective there. but it was definitely part of my daily life. and in college i joined facebook back when it was only for college students and now people cannot remember that now. but i used email, was one of the first users of gmail. i got a little more into it in college.

olivia also talked about her use of technology as a child:

we had a computer in my house. we were very fortunate because my dad was on top of that. so we had a computer since i was a little kid. so i would play around on that a lot, like aol and prodigy. i had the basic skills. and in high school we were taught basic word processing and excel. so i’ve always been in front of a computer.

concept of the digital native

most people believed they are digital natives because they have been working with technology for a long time, which sets them apart from older generations who they thought did not work extensively with technology until they were adults.
catharine stated, “i know it has been a part of my life forever so probably my age does have something to do with it.” catharine talked about the differences in technology skill between herself and her older colleagues, but added, “i don’t feel there is an unwillingness for them to learn technology. i just don’t think they had experiences at the time, where maybe we’re just afforded more opportunities.” however, when pressed, not all considered themselves digital natives. abby recalls a class discussion about the idea of digital natives and how younger people may not be as good with technology as they perceive. because of this, she was hesitant to refer to herself as a digital native, even though growing up she believed technology was a part of her life. others, such as betsy, are reluctant to call themselves digital natives because they remember when their family first got a computer and it was not always in their household. there were also a couple of outliers who were reluctant to call themselves digital natives because they did not grow up with technology in the same ways as did many of their peers. rachel grew up in a poorer home that always got technology second-hand, and she always thought they were behind others. although her family first had an apple computer in the 1980s, she did not recall using it, and just thought of it as a sort of “new appliance” in her house. her family did not emphasize technology use and saw it as something not worth investing in until they had to, which gave her a different perspective of using technology only as necessary and as “one of those things that sometimes i just don’t want to deal with.” samantha grew up in a rural area that only had dial-up internet, which embarrassed her and did not work as well as she thought it should, so she did not use it, leading to a belief that she did not grow up on the internet in the same way as her peers. 
because of this, she did not consider herself a digital native:

i’m still able to relate to those in a different generation who have no idea where to start [with technology], because i was at that state recently. . . . i’m at the in-between stage, so i can handle both ends of the [technology use] spectrum. but yeah, i’m not a digital native.

technology reaction

participants assumed that, because of their age, they were not as scared or intimidated by technology as they thought some of their colleagues were. heather talked about how learning new things would initially make her nervous, but then she would get excited about what the new program or application could do for her or her work. francis stated, “i’m not afraid of the technology.” she also talked about the differences between herself and her older colleagues:

if you ask them something different or to learn something new, they will make it more complicated. i’m so used to exploring my options, i don’t think about it. those 20–30 years older than me are comfortable knowing what they know how to do but not necessarily exploring new ways of doing something that they already know how to do. they feel pretty comfortable and confident in their skills but aren’t really looking to test the waters to see if there is a different way to do something. . . . i’m willing to try. i see a lot of people that are afraid they are going to break something and don’t want to click on it. and i have the confidence that if i click on something, then i can pretty much undo whatever that does. so not necessarily skills, but a different mindset or something.

as francis implied above, younger librarians, because they have always used technology, believe they can quickly learn new technologies. quinn, a current student, also talked extensively about this:

i definitely think my age has a lot to do with how comfortable i am with it.
because there are various ages within [my library school] and i have definitely noticed that older people fear it a bit more. i guess i can attribute my age to being embedded in technology. because i’ve always had it, well i haven’t always had it, but i had it young enough to feel like it is a part of me, as opposed to new fangled and wasn’t with it in the beginning. . . . i’m not afraid of it, i’m not afraid to mess around with it and mess things up. because you can always reboot or start over. i think that’s the biggest thing, like i will work on something and mess around with it until i figure it out as opposed to someone who is older who wants to know something exactly the right way so they don’t want to do anything bad to it. heather stated: i think i’m a bit more open to new technologies than some of my older colleagues. . . . i have the feeling i know a little bit more. . . . i’m not sure it is just because my comfort level was higher or maybe their experiences make them more cautious about new things, but i think the younger librarians are more quick to latch on to new things. other participants shared this same belief when talking about the difference in work styles and technology use among different ages in their workplace. technology skills the individual technology skills individuals described focus on the use of technology, not the creation of it. francis described this: i don’t have any programming or coding [experience] or building physical computers or anything like that, but just general using a variety of devices like the ipad, iphone, everything is all integrated. i like being able to use technology in my personal life. no one responded that they knew how to program and work with servers, though edward said he had “fiddled with linux as a server” but did not spend a lot of time with it. olivia and quinn, however, did express interest in learning how to program, understand the back end, and create emerging technologies. 
betty mentioned using sql and xml in her workplace and aired her frustrations that people just expect to be able to use technology without learning how it actually works and what went into making that device or service. several people mentioned working in web design, but only a couple of people mentioned creating webpages with html and css, though several had experience using tools such as dreamweaver or frontpage. ian mentioned that it was part of her public library job to assist patrons with using their personal devices, while others stated that when they have technology problems, they simply contact their it departments. many participants mentioned using social media and various web 2.0 applications such as facebook and twitter, both personally and professionally.

when asked to compare their tech skills from before they became librarians to after, some described minor changes in skills, such as learning html, but others mostly indicated that library school helped them learn new applications, existing technologies, or new technology resources, most without going into detail. quinn talked about her tech skills in relation to what she is learning in library school:

i think they [technology skills] are actually above average. i’ve taken a few of the courses that are offered in terms of tech, and they are totally below what i already know. but other classmates have thought it was really hard. but i’ve had prior knowledge of it.

patricia stated she started using online tools more extensively after learning about them in library school. one talked extensively about using webinar software and libguides to deliver instruction online, while another stated library school inspired her to start a blog that she did not keep up, and another became an extensive twitter user.
jan focused on digital librarianship while in library school because she saw it as the future of libraries. she thought that library school helped her do some “encoding on some projects and how to do webpages,” but it barely touched on the skills needed to actually perform a job within digital librarianship. she would like to get more into the development side of library technology, but in her current job there is not the time or support to further advance those skills. a couple of participants talked about learning about usability and the evaluation of technologies.

a few interview participants mentioned the tech skills of people even younger than they are, or current college students they work with. betty did not see younger coworkers understanding what is needed to develop or understand the back end of technology and believed younger workers do not use technology to communicate as effectively as they could. edward, who works at a for-profit career college that has many poorer and nontraditional students, stated, it is “not just the 50 year olds, but the 18 year olds who don’t know how to attach documents to an email.” when pressed as to why he thinks young students struggle with basic technology tasks, he stated, “at times i think that has a lot more to do with their k–12 experience and if they had access to computers and stuff. i don’t know. it just blows me away sometimes.” gabby, currently working in appalachia, said, “not everyone here has computer skills, not everyone has access to it at home or maybe can’t make it to the library. . . . i think it is awesome to have those things at your fingertips, but not everyone does.” on the other hand, diana believed that she does not “have the same relationship with technology like i’m seeing some of the college students now where they are hooked in all the time and they are just going for it.”
she also said that she “wouldn’t call myself a digital immigrant, but i’m very comfortable using technology but not to the extent i’m seeing many people i see now.”

tech skills related to career choice

the researcher sought to determine the role of technology in determining the career choice of librarianship. those interview participants who talked about using technology did not mention it as a reason they became a librarian. survey responses indicated that opportunities to use technology were an important reason to become a librarian, but participants did not stress technology during the interviews. participants were much more likely to specify their love of the academic atmosphere or their general interest in research first and then maybe think of technology as an afterthought. gabby mentioned, after a long list of things that influenced her career choice, “and technology and stuff.” only taylor talked extensively about how technology influenced her choice. a current library school student, she wanted to go into archives and is really excited about how much information is being digitized and put online:

you know how everything on microfiche is now digital? everything seems to be digitized as well, you know books and e-books and journals. being able to take something and scan it and put it online for users to access. it is definitely an important thing. so yeah, that definitely influenced me on becoming a librarian.

jan decided to specialize in digital librarianship while in library school because she saw it was the future of library work. rachel, who has observed similar attitudes among her classmates, shared this thinking as well. however, heather admitted she did not have a lot of technology experience before going to library school and did not believe that her master’s program prepared her to go into the technology-oriented digital librarianship.
several participants talked about how their background using search engines such as google and doing research online would make them better librarians, but none talked about these as factors related to choosing librarianship as a career. abby talked about how she always uses google to look things up, and that it is nice to have found a career that rewards such use. diana discussed how she had always been good at finding information online since she was a child, which helped her narrow her career choice to academic librarianship, as she believed it was the best match for these skills.

instead of talking about how technology influenced their career choice, participants were more likely to talk about the fact that technology did not influence them. abby stated, “i don’t think [technology influenced me] because i didn’t really know that librarians needed a lot of technology skill.” edward reiterated this, stating that he “didn’t do any technology in library school because i didn’t want to go in that direction.” rachel, who strongly did not consider herself a digital native, stated she was drawn to librarianship, specifically access services, because she liked working with print books rather than using online resources to find information. she commented,

i really liked looking for books and i used the card catalog when i was a kid, but i can use a computer to help people find things, but it was like, i really just liked finding the books rather than electronic information. i guess i feel like it feels comfortable and safe, like books. and you can hold them and you can touch them. and sometimes i feel like they should always be a part of the library. i took a digital libraries course this past semester and i felt like i was the only one being like, “no, we still need physical books,” so i was actually realizing how intimidated i am with technology.
i’m totally willing to adapt, and i’m willing to work on these issues, but i do feel like i want the library to still be a place that has the traditional feel.

samantha also did not feel like a digital native, as she grew up in a rural area that only had access to dial-up internet. she went on to describe how she did not work with online tools until college and she was relieved when she did not have to use such tools during a year off between college and graduate school. although she recognized technology use by librarians is helping libraries avoid becoming obsolete, she only learned what she needed to learn in order to complete library school, so it did not influence her career choice.

implications

millennials are very comfortable with technology, though there are limitations to their skills. for the most part, they have a lifetime attachment to technology,16 but they do remember a time without having a computer in their homes or when computers were something only used at school and for basic instruction. as interview participant francis put it, “nothing like how students get to use them now.” millennials grew up with computers, but early on, they were not advanced enough to do the multimedia creation and application building that is done now, and they mostly use resources that were developed by others. however, millennial librarians in this study do see the utility that computers have in everyday life, and by high school, many stated that computer use was required for them to go about their academic and personal lives, but they thought that technology in its current state with online research resources and social networking did not come about until they were in college. additionally, most interview respondents said that library school helped acculturate them into using technology more frequently.

however, not everyone in the study grew up with a computer or internet access in his or her home. two interview respondents refused to call themselves digital natives.
one said she grew up in an environment without much money, and the only technology her family had access to was often secondhand and several years behind. the other participant grew up in a rural area that did not have access to high-speed internet, and as a result, she was rarely online until college. both individuals believed that technology was definitely not a factor in their being drawn to librarianship, and they were more interested in circulation and print resources than in specializations that require a high level of technical knowledge. other participants were quick to acknowledge that there are many members of their generation who, for one reason or another, do not have an interest in technology and may not have had the resources growing up to have incorporated it into their daily lives. some participants noted there was some computer instruction starting in elementary school, but it was very basic computer literacy, and most of their technology learning occurred at home when there was the time to focus on tasks that were more complicated.

even though study participants remember a time without technology in their homes and they believe that technology did not mature to its current state until they were in college, they have used it for a much larger percentage of their lives than older generations. for that reason, they are quick to learn new technologies as they become available or are required based on professional needs. they also believe that because computers have matured alongside them, they are not afraid to break them. interview participant abby stated, “i have a lot of faith in technology.” millennials believe that they can experiment with technology without fear that it will become inoperable or cause additional headaches in the future.
they are also not wedded to particular technologies and do not get frustrated by current technologies and applications because they think something newer and better is always around the corner. one disconnect in the technology skills of millennials is that most of them are accustomed to using technology, not creating it or understanding the back end infrastructure. as one interview participant said, “they expect everything to be easy, but they don’t understand what went into trying to make it easy.” although many librarians indicated they use tools such as camtasia to create multimedia projects, many thought they had weak skills in this area and desired to learn more. they are also most likely to edit content on webpages using a cms system such as drupal or libguides instead of creating more elaborate websites utilizing information architecture principles or more complex web programming languages (such as php) or relational databases (such as mysql). they rely on dedicated tech people to set these up and maintain the servers that house these services, but they desire to learn more about these technologies themselves. there is also a strong desire to learn more traditional computer programming languages such as c++, c#, and perl. many participants thought library school only affected their technology skills marginally, and they desire to learn higher-order skills that can be applied to their job. millennials are comfortable learning front-end technologies on their own, but they need help understanding the technology behind the tools they use in their daily lives. conclusion this mixed-methods study examined many characteristics of millennial librarians, and this article noted their technology skills and attitudes toward technology. the findings indicate that technology did not play a major role in their decision to become academic librarians. 
the data also reveal that, although millennial librarians mostly grew up with technology and believe this sets their skills apart from older librarians, their skills are mostly in using technology tools and not in creating them. they also believe their status as digital native has allowed them to recognize that librarianship is changing as a career. however, millennial librarians still respect their older colleagues and the skills associated with traditional librarianship and are firmly rooted in traditions. millennial librarians just want to be able to shape the profession in their own way. digital native academic librarians, technology skills, and their relationships with technology | emanuel 33 references 1. william strauss and neil howe, millennial and the pop culture: strategies for a new generation of consumers in music, movies, and video games (great falls, va: life course associates, 2006). 2. haidee e. allerton, “generation why: they promise to be the biggest influence since the baby boomers,” training and development 55, no. 11 (2001): 56–60; don tapscott, growing up digital: the rise of the net generation (new york: mcgraw-hill, 2008). 3. rachel singer gordon, the nextgen librarian’s survival guide (medford, nj: information today, 2006); sophia guevara, “generation y what can we do for you?” information outlook 11, no. 6 (2007): 81–82; diane zabel, “trends in reference and public services librarianship and the role of rusa: part two,” reference & user services quarterly 45, no. 2 (2005): 104–7. 4. gordon, the nextgen librarian’s survival guide. 5. gordon, the nextgen librarian’s survival guide; lisa johnson, mind your x’s and y’s: satisfying the 10 cravings of a new generation of consumers (new york: free press, 2006); william strauss & neil howe, millennial rising: the next great generation (new york: vintage, 2000); tapscott, growing up digital; ron c. 
zemke, claire raines, and bob filipczak, generations at work: managing the clash of veterans, boomers, xers, and nexters in your workplace (new york: amacom, 2000).
6. johnson, mind your x’s and y’s; tapscott, growing up digital.
7. strauss and howe, millennial and the pop culture.
8. zemke, raines, and filipczak, generations at work.
9. tapscott, growing up digital.
10. strauss and howe, millennial and the pop culture; tapscott, growing up digital.
11. l. p. morton, “targeting generation y,” public relations quarterly 47, no. (2002): 46–48; p. paul, “getting inside gen y,” american demographics 23, no. 9 (2001): 42–49.
12. paul, “getting inside gen y”; tapscott, growing up digital.
13. mark bauerlein, the dumbest generation: how the digital age stupefies young americans and jeopardizes our future (or, don’t trust anyone under 30) (new york: penguin, 2008).
14. darcy del bosque and cory lampert, “a chance of storms: new librarians navigating technology tempests,” technical services quarterly 26, no. 4 (2009): 261–86.
15. carol h. weiss, evaluation: methods for studying programs and policies (upper saddle river, nj: prentice hall, 1998).
16. allerton, “generation why”; tapscott, growing up digital.

tech skills related to career choice

are your digital documents web friendly? | zhou 151

are your digital documents web friendly?: making scanned documents web accessible

the internet has greatly changed how library users search and use library resources. many of them prefer resources available in electronic format over traditional print materials. while many documents are now born digital, many more are only accessible in print and need to be digitized. this paper focuses on how the colorado state university libraries creates and optimizes text-based and digitized pdf documents for easy access, downloading, and printing.
to digitize print materials, we normally scan originals, save them in archival digital formats, and then make them web-accessible. there are two types of print documents, graphic-based and text-based. if we apply the same techniques to digitize these two different types of materials, the documents produced will not be web-friendly. graphic-based materials include archival resources such as historical photographs, drawings, manuscripts, maps, slides, and posters. we normally scan them in color at a very high resolution to capture and present a reproduction that is as faithful to the original as possible. then we save the scanned images in tiff (tagged image file format) for archival purposes and convert the tiffs to jpeg (joint photographic experts group) 2000 or jpeg for web access. however, the same practice is not suitable for modern text-based documents, such as reports, journal articles, meeting minutes, and theses and dissertations. many old text-based documents (e.g., aged newspapers and books) should be

yongli zhou

tutorial

files for fast web delivery as access files. for text-based files, access files normally are pdfs that are converted from scanned images. “bcr’s cdp digital imaging best practices version 2.0” says that the master image should be the highest quality you can afford, it should not be edited or processed for any specific output, and it should be uncompressed.1 this statement applies to archival images, such as photographs, manuscripts, and other image-based materials. if we adopt the same approach for modern text documents, the result may be problematic. pdfs that are created from such master files may have the following drawbacks:
■■ because of their large file size, they require a long download time or cannot be downloaded because of a timeout error.
■■ they may crash a user’s computer because they use more memory while viewing.
■■ they sometimes cannot be printed because of insufficient printer memory.
■■ poor print and on-screen viewing qualities can be caused by background noise and bleed-through of text. background noise can be caused by stains, highlighter marks made by users, and yellowed paper from aged documents.
■■ the ocr process sometimes does not work for high-resolution images.
■■ content creators need to spend more time scanning images at a high resolution and converting them to pdf documents.

web-friendly files should be small, accessible by most users, full-text searchable, and have good

treated as graphic-based material. these documents often have faded text, unusual fonts, stains, and colored backgrounds. if they are scanned using the same practice as modern text documents, the document created can be unreadable and contain incorrect information. this topic is covered in the section “full-text searchable pdfs and troubleshooting ocr errors.” currently, pdf is the file format used for most digitized text documents. while pdfs that are created from high-resolution color images may be of excellent quality, they can have many drawbacks. for example, a multipage pdf may have a large file size, which increases download time and the memory required while viewing. sometimes the download takes so long it fails because a time-out error occurs. printers may have insufficient memory to print large documents. in addition, the optical character recognition (ocr) process is not accurate for high-resolution images in either color or grayscale. as we know, users want the ability to easily download, view, print, and search online textual documents. all of the drawbacks created by high-quality scanning defeat one of the most important purposes of digitizing text-based documents: making them accessible to more users. this paper addresses how colorado state university libraries (csul) manages these problems and others as staff create web-friendly digitized textual documents.
topics include scanning, long-term archiving, full-text searchable pdfs and troubleshooting ocr problems, and optimizing pdf files for web delivery.

preservation master files and access files

for digitization projects, we normally refer to images in uncompressed tiff format as master files and compressed

yongli zhou is digital repositories librarian, colorado state university libraries, colorado state university, fort collins, colorado

152 information technology and libraries | september 2010

factors that determine pdf file size. color images typically generate the largest pdfs and black-and-white images generate the smallest pdfs. interestingly, an image of smaller file size does not necessarily generate a smaller pdf. table 1 shows how file format and color mode affect pdf file size. the source file is a page containing black-and-white text and line art drawings. its physical dimensions are 8.047" by 10.893". all images were scanned at 300 dpi. csul uses adobe acrobat professional to create pdfs from scanned images. the current version we use is adobe acrobat 9 professional, but most of the features discussed in this paper are available in other acrobat versions. when acrobat converts tiff images to a pdf, it compresses images. therefore a final pdf has a smaller file size than the total size of the original images. acrobat compresses tiff uncompressed, lzw, and zip the same amount and produces pdfs of the same file size. because our in-house scanning software does not support tiff g4, we did not include tiff g4 test data here. by comparing similar pages, we concluded that tiff g4 works the same as tiff uncompressed, lzw, and zip. for example, if we scan a text-based page as black-and-white and save it separately in tiff uncompressed, lzw, zip, or g4, then convert each page into a pdf, the final pdf will have the same file size without a noticeable quality difference.
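as a rough cross-check on the uncompressed tiff sizes reported in table 1, the raw size of a scan can be estimated from its physical dimensions, resolution, and bit depth. the sketch below is only an illustration: the function name is hypothetical, it assumes 1 kb = 1,024 bytes, and it ignores tiff headers and tags, so its estimates will differ slightly from the measured values.

```python
import math


def uncompressed_size_kb(width_in, height_in, dpi, bits_per_pixel):
    """Estimate the raw pixel data of a scan in KB (assuming 1 KB = 1,024 bytes).

    Each row is padded to a whole number of bytes, which matters for
    1-bit (black-and-white) scans. TIFF headers and tags are ignored.
    """
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    bytes_per_row = math.ceil(width_px * bits_per_pixel / 8)
    return bytes_per_row * height_px / 1024


# the test page described in the text: 8.047" x 10.893" scanned at 300 dpi
for label, bits in [("color 24 bits", 24),
                    ("grayscale 8 bits", 8),
                    ("black-and-white", 1)]:
    size = uncompressed_size_kb(8.047, 10.893, 300, bits)
    print(f"{label}: about {size:,.0f} kb before compression")
```

because the pixel count grows with the square of the resolution, doubling the dpi roughly quadruples the uncompressed size, which is one reason higher-resolution scans take longer to store and process.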
tiff jpeg generates the smallest pdf, but it is a lossy format, so it is not recommended. both jpeg and jpeg 2000 have smaller file sizes but generate larger pdfs than those converted from tiff images.

recommendations

1. use tiff uncompressed or lzw in 24 bits color for pages with color graphs or for historical documents.
2. use tiff uncompressed or lzw

compress an image up to 50 percent. some vendors hesitate to use this format because it was proprietary; however, the patent expired on june 20, 2003. this format has been widely adopted by many software programs and is safe to use. csul saves all scanned text documents in this format.
■■ tiff zip: this is a lossless compression. like lzw, zip compression is most effective for images that contain large areas of single color.2
■■ tiff jpeg: this is a jpeg file stored inside a tiff tag. it is a lossy compression, so csul does not use this file format.

other image formats:
■■ jpeg: this format is a lossy compression and can only be used for nonarchival purposes. a jpeg image can be converted to pdf or embedded in a pdf. however, a pdf created from jpeg images has a much larger file size compared to a pdf created from tiff images.
■■ jpeg 2000: this format’s file extension is .jp2. this format offers superior compression performance and other advantages. jpeg 2000 normally is used for archival photographs, not for text-based documents.

in short, scanned images should be saved as tiff files, either with compression or without. we recommend saving text-only pages and pages containing text and/or line art as tiff g4 or tiff lzw. we also recommend saving pages with photographs and illustrations as tiff uncompressed or tiff lzw.

how image format and color mode affect pdf file size

color mode and file format are two

on-screen viewing and print qualities. in the following sections, we will discuss how to make scanned documents web-friendly.
scanning

there are three main factors that affect the quality and file size of a digitized document: file format, color mode, and resolution of the source images. these factors should be kept in mind when scanning text documents.

file format and compression

most digitized documents are scanned and saved as tiff files. however, there are many different formats of tiff. which one is appropriate for your project?
■■ tiff: uncompressed format. this is a standard format for scanned images. however, an uncompressed tiff file has the largest file size and requires more space to store.
■■ tiff g3: tiff with g3 compression is the universal standard for faxes and multipage line-art documents. it is used for black-and-white documents only.
■■ tiff g4: tiff with g4 compression has been approved as a lossless archival file format for bitonal images. tiff images saved in this compression have the smallest file size. it is a standard file format used by many commercial scanning vendors. it should only be used for pages with text or line art. many scanning programs do not provide this file format by default.
■■ tiff huffman: a method for compressing bi-level data based on the ccitt group 3 1d facsimile compression schema.
■■ tiff lzw: this format uses a lossless compression that does not discard details from images. it may be used for bitonal, grayscale, and color images. it may

to be scanned at no less than 600 dpi in color. our experiments show that documents scanned at 300 or 400 dpi are sufficient for creating pdfs of good quality. resolutions lower than 300 dpi are not recommended because they can degrade image quality and produce more ocr errors. resolutions higher than 400 dpi also are not recommended because they generate large files with little improvement in on-screen viewing and print quality.
we compared pdf files that were converted from images of resolutions at 300, 400, and 600 dpi. viewed at 100 percent, the difference in image quality both on screen and in print was negligible. if a page has text with very small font, it can be scanned at a higher resolution to improve ocr accuracy and viewing and print quality. table 2 shows that high-resolution images produce large files and require more time to be converted into pdfs. the time required to combine images is not significantly different compared to scanning time and ocr time, so it was omitted. our example is a modern text document with text and a black-and-white chart. most of our digitization projects do not require scanning at 600 dpi; 300 dpi is the minimum requirement. we use 400 dpi for most documents and choose a proper color mode for each page. for example, we scan our theses and dissertations in black-and-white at 400 dpi for bitonal pages. we scan pages containing photographs or illustrations in 8-bit grayscale or 24-bit color at 400 dpi.

other factors that affect pdf file size

in addition to the three main factors we have discussed, unnecessary edges, bleed-through of text and graphs, background noise, and blank pages also increase pdf file sizes. figure 1 shows how a clean scan can largely reduce a pdf file size and

cover. the updated file has a file size of 42.8 mb. the example can be accessed at http://hdl.handle.net/10217/3667. sometimes we scan a page containing text and photographs or illustrations twice, in color or grayscale and in black-and-white. when we create a pdf, we combine two images of the same page to reproduce the original appearance and to reduce file size. how to optimize pdfs using multiple scans will be discussed in a later section.

how image resolution affects pdf file size

before we start scanning, we check with our project manager regarding project standards.
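the page-by-page choices described in this section (color mode, resolution, and tiff variant) can be summarized as a small lookup. the following is a minimal sketch, not csul's actual workflow: the page_kind labels, the ScanProfile type, and the use of 600 dpi as "a higher resolution" for very small fonts are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanProfile:
    color_mode: str      # "black-and-white", "grayscale 8 bits", or "color 24 bits"
    dpi: int
    tiff_formats: tuple  # tiff variants suggested for the saved image


def choose_scan_profile(page_kind, small_font=False):
    """Map a hypothetical page_kind label to scan settings.

    page_kind is one of:
      "text"      - text or line art only
      "grayscale" - black-and-white photographs or grayscale illustrations
      "color"     - color graphs or historical documents
    """
    # 400 dpi is the usual choice in the text; 600 dpi is an assumed
    # stand-in for "a higher resolution" when the font is very small.
    dpi = 600 if small_font else 400
    if page_kind == "text":
        return ScanProfile("black-and-white", dpi, ("uncompressed", "lzw", "g4"))
    if page_kind == "grayscale":
        return ScanProfile("grayscale 8 bits", dpi, ("uncompressed", "lzw"))
    if page_kind == "color":
        return ScanProfile("color 24 bits", dpi, ("uncompressed", "lzw"))
    raise ValueError(f"unknown page kind: {page_kind!r}")


print(choose_scan_profile("text"))
print(choose_scan_profile("color"))
```

a lookup like this is mainly useful as a checklist for operators; project standards set by a funder or project manager would override these defaults.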
for some funded projects, documents are required

grayscale 8 bits for pages with black-and-white photographs or grayscale illustrations.
3. use tiff uncompressed, lzw, or g4 in black-and-white for pages containing text or line art.

to achieve the best result, each page should be scanned accordingly. for example, we had a document with a color cover, 790 pages containing text and line art, and 7 blank pages. we scanned the original document in color at 300 dpi. the pdf created from these images was 384 mb, so large that it exceeded the maximum file size that our repository software allows for uploading. to optimize the document, we deleted all blank pages, converted the 790 pages with text and line art from color to black-and-white, and retained the color

table 1. file format and color mode versus pdf file size

file format | scan specifications | tiff size (kb) | pdf size (kb)
tiff | color 24 bits | 23,141 | 900
tiff lzw | color 24 bits | 5,773 | 900
tiff zip | color 24 bits | 4,892 | 900
tiff jpeg | color 24 bits | 4,854 | 873
jpeg 2000 | color 24 bits | 5,361 | 5,366
jpeg | color 24 bits | 4,849 | 5,066
tiff | grayscale 8 bits | 7,729 | 825
tiff lzw | grayscale 8 bits | 2,250 | 825
tiff zip | grayscale 8 bits | 1,832 | 825
tiff jpeg | grayscale 8 bits | 2,902 | 804
jpeg 2000 | grayscale 8 bits | 2,266 | 2,270
jpeg | grayscale 8 bits | 2,886 | 3,158
tiff | black-and-white | 994 | 116
tiff lzw | black-and-white | 242 | 116
tiff zip | black-and-white | 196 | 116

note: black-and-white scans cannot be saved in jpeg, jpeg 2000, or tiff jpeg formats.

many pdf files cannot be saved as pdf/a files. if an error occurs when saving a pdf to pdf/a, you may use adobe acrobat preflight (advanced > preflight) to identify problems. see figure 2. errors can be created by nonembedded fonts, embedded images with unsupported file compression, bookmarks, embedded video and audio, etc.
by default, the reduce file size procedure in acrobat professional compresses color images using jpeg 2000 compression. after running the reduce file size procedure, a pdf may not be saved as a pdf/a because of a “jpeg 2000 compression used” error. according to the pdf/a competence center, this problem will be eliminated in the second part of the pdf/a standard: pdf/a-2 is planned for 2008/2009. there are many other features in new pdfs; for example, transparency and layers will be allowed in pdf/a-2.5 however, at the time this paper was written pdf/a-2 had not been announced.6

portable, which means the file created on one computer can be viewed with an acrobat viewer on other computers, handheld devices, and on other platforms.3 a pdf/a document is basically a traditional pdf document that fulfills precisely defined specifications. the pdf/a standard aims to enable the creation of pdf documents whose visual appearance will remain the same over the course of time. these files should be software-independent and unrestricted by the systems used to create, store, and reproduce them.4 the goal of pdf/a is long-term archiving. a pdf/a document has the same file extension as a regular pdf file and must be at least compatible with acrobat reader 4. there are many ways to create a pdf/a document. you can convert existing images and pdf files to pdf/a files, export a document to pdf/a format, or scan to pdf/a, to name a few. there are many software programs you can use to create pdf/a, such as adobe acrobat professional 8 and later versions, compart ag, pdflib, and pdf tools ag.

simultaneously improve its viewing and print quality.

recommendations

1. unnecessary edges: crop out.
2. bleed-through text or graphs: place a piece of white or black card stock on the back of a page. if a page is single sided, use white card stock. if a page is double sided, use black card stock and increase contrast ratio when scanning. often color or grayscale images have bleed-through problems.
scanning a page containing text or line art as black-and-white will eliminate bleed-through text and graphs.
3. background noise: scanning a page containing text or line art as black-and-white can eliminate background noise. many aged documents have yellowed papers. if we scan them as color or grayscale, the result will be images with yellow or gray background, which may increase pdf file sizes greatly. we also recommend increasing the contrast for better ocr results when scanning documents with background colors.
4. blank pages: do not include if they are not required. blank pages scanned in grayscale or color can quickly increase file size.

pdf and long-term archiving pdf/a

pdf vs. pdf/a

pdf, short for portable document format, was developed by adobe as a unique format to be viewed through adobe acrobat viewers. as the name implies, it is

table 2. color mode and image resolution vs. pdf file size

color mode | resolution (dpi) | scanning time (sec.) | ocr time (sec.) | tiff lzw (kb) | pdf size (kb)
color | 600 | 100 | n/a* | 16,498 | 2,391
color | 400 | 25 | 35 | 7,603 | 1,491
color | 300 | 18 | 16 | 5,763 | 952
grayscale | 600 | 36 | 33 | 6,097 | 2,220
grayscale | 400 | 18 | 18 | 2,888 | 1,370
grayscale | 300 | 14 | 12 | 2,240 | 875
b/w | 600 | 12 | 18 | 559 | 325
b/w | 400 | 10 | 10 | 333 | 235
b/w | 300 | 8 | 9 | 232 | 140

*n/a due to an ocr error

able. this option keeps the original image and places an invisible text layer over it. recommended for cases requiring maximum fidelity to the original image.8 this is the only option used by csul.
2. searchable image: ensures that text is searchable and selectable. this option keeps the original image, de-skews it as needed, and places an invisible text layer over it. the selection for downsample images in this same dialog box determines whether the image is downsampled and to what extent.9 the downsampling combines several pixels in an image to make a single larger pixel; thus some information is deleted from the image.
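to make the idea of combining several pixels into one concrete, here is a minimal sketch of average downsampling on a tiny grayscale image. it is an illustration only: acrobat's own downsampling methods and settings are not described in the text, and the function and sample data are hypothetical.

```python
def downsample(pixels, factor=2):
    """Average each factor x factor block of a grayscale image.

    pixels is a list of equal-length rows of 0-255 integers. Edge rows and
    columns that do not fill a whole block are dropped to keep the sketch short.
    """
    height = len(pixels) - len(pixels) % factor
    width = len(pixels[0]) - len(pixels[0]) % factor
    result = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            block = [pixels[y][x]
                     for y in range(top, top + factor)
                     for x in range(left, left + factor)]
            row.append(sum(block) // len(block))
        result.append(row)
    return result


page = [
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [0, 255, 100, 100],
    [255, 0, 100, 100],
]
print(downsample(page))  # prints [[0, 255], [127, 100]]
```

the uniform blocks survive unchanged, while the lower-left block, a fine black-and-white pattern, collapses into a single mid-gray value. this is the sense in which some information is deleted from the image.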
however, downsampling does not affect the quality of text or line art. when a proper setting is used, the size of a pdf can be significantly reduced with little or no loss of detail and precision.
3. clearscan: synthesizes a new type 3 font that closely approximates the original, and preserves the page background using a low-resolution copy.10 the final pdf is the same as a born-digital pdf. because acrobat cannot guarantee the accuracy of

manipulate the pdf document for accessibility. once ocr is properly applied to the scanned files, however, the image becomes searchable text with selectable graphics, and one may apply other accessibility features to the document.7 acrobat professional provides three ocr options:
1. searchable image (exact): ensures that text is searchable and select

full-text searchable pdfs and troubleshooting ocr errors

a pdf created from a scanned piece of paper is inherently inaccessible because the content of the document is an image, not searchable text. assistive technology cannot read or extract the words, users cannot select or edit the text, and one cannot

figure 1. pdfs converted from different images: (a) the original pdf converted from a grayscale image and with unnecessary edges; (b) updated pdf converted from a black-and-white image and with edges cropped out; (c) screen viewed at 100 percent of the pdf in grayscale; and (d) screen viewed at 100 percent of the pdf in black-and-white.
original pdf: dimensions: 9.127” x 11.455”; color mode: grayscale; resolution: 600 dpi; tiff lzw: 12.7 mb; pdf: 1,051 kb
updated pdf: dimensions: 8” x 10.4”; color mode: black-and-white; resolution: 400 dpi; tiff lzw: 153 kb; pdf: 61 kb

figure 2. example of adobe acrobat 9 preflight

but at least users can read all text, while the black-and-white scan contains unreadable words.
troubleshoot ocr error 3: cannot ocr image-based text

the search of a digitized pdf is actually performed on its invisible text layer. the automated ocr process inevitably produces some incorrectly recognized words. for example, acrobat cannot recognize the colorado state university logo correctly (see figure 6). unfortunately, acrobat does not provide a function to edit a pdf file’s invisible text layer. to manually edit or add ocr’d text, adobe acrobat capture 3.0 (see figure 7) must be purchased. however, our tests show that capture 3.0 has many drawbacks. this software is complicated and produces its own errors. sometimes it consolidates words; other times it breaks them up. in addition, it is time-consuming to add or modify invisible text layers using acrobat capture 3.0. at csul, we manually add searchable text for title and abstract pages only if they cannot be ocr’d by acrobat correctly. the example in

troubleshoot ocr error 2: could not perform recognition (ocr)

sometimes acrobat generates an “outside of the allowed specifications” error when processing ocr. this error is normally caused by color images scanned at 600 dpi or more. in the example in figure 4, the page only contains text but was scanned in color at 600 dpi. when we scanned this page as black-and-white at 400 dpi, we did not encounter this problem. we could also use a lower-resolution color scan to avoid this error. our experiments also show that images scanned in black-and-white work best for the ocr process. in this article we mainly discuss running the ocr process on modern textual documents. black-and-white scans do not work well for historical textual documents or aged newspapers. these documents may have faded text and background noise. when they are scanned as black-and-white, broken letters may occur, and some text might become unreadable. for this reason they should be scanned in color or grayscale.
in figure 5, images scanned in color might not produce accurate ocr results,

ocred text at 100 percent, this option is not acceptable for us. for a tutorial on how to make a full-text searchable pdf, please see appendix a.

troubleshoot ocr error 1: acrobat crashes

occasionally acrobat crashes during the ocr process. the error message does not indicate what causes the crash and where the problem occurs. fortunately, the page number of the error can be found on the top shortcuts menu. in figure 3, we can see the error occurs on page 7. we discovered that errors are often caused by figures or diagrams. for a problem like this, the solution is to skip the error-causing page when running the ocr process. our initial research was performed on acrobat 8 professional. our recent study shows that this problem has been significantly improved in acrobat 9 professional.

figure 3. adobe acrobat 8 professional crash window

figure 4. “could not perform recognition (ocr)” error

figure 5. an aged newspaper scanned in color and black-and-white (panels: aged newspaper scanned in color; aged newspaper scanned in black-and-white)

a very light yellow background. the undesirable marks and background contribute to its large file size and create ink waste when printed.

method 2: running acrobat’s built-in optimization processes

acrobat provides three built-in processes to reduce file size. by default, acrobat uses jpeg compression for color and grayscale images and ccitt group 4 compression for bitonal images.

optimize scanned pdf

open a scanned pdf and select documents > optimize scanned pdf. a number of settings, such as image quality and background removal, can be specified in the optimize scanned pdf dialog box. our experiments show this process can noticeably degrade images and sometimes even increase file size. therefore we do not use this option.
reduce file size

open a scanned pdf and select documents > reduce file size. the reduce file size command resamples and recompresses images, removes embedded base-14 fonts, and subset-embeds fonts that were left embedded. it also compresses document structure and cleans up elements such as invalid bookmarks. if the file size is already as small as possible, this command has no effect.11 after this process, some files cannot be saved as pdf/a, as we discussed in a previous section. we also noticed that different versions of acrobat can create files of different file sizes even if the same settings were used.

pdf optimizer

open a scanned pdf and select advanced > pdf optimizer. many settings can be specified in the pdf optimizer dialog box. for example, we can downsample images from

sections, we can greatly reduce a pdf’s size by using an appropriate color mode and resolution. figure 9 shows two different versions of a digitized document. the source document has a color cover and 111 bitonal pages. the original pdf, shown in figure 9 on the left, was created by another university department. it was not scanned according to standards and procedures adopted by csul. it was scanned in color at 300 dpi and has a file size of 66,265 kb. we exported the original pdf as tiff images, batch-converted color tiff images to black-and-white tiff images, and then created a new pdf using black-and-white tiff images. the updated pdf has a file size of 8,842 kb. the image on the right is much cleaner and has better print quality. the file on the left has unwanted marks and

figure 8 is a book title page for which we used acrobat capture 3.0 to manually add searchable text. the entire book may be accessed at http://hdl.handle.net/10217/1553.

optimizing pdfs for web delivery

a digitized pdf file with 400 color pages may be as large as 200 to 400 mb. most of the time, optimizing processes may reduce files this large without a noticeable difference in quality.
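to put the reported savings in perspective, the before-and-after sizes given in the text can be turned into percentage reductions. this is a small worked calculation, not part of the csul workflow; the function name and the labels in the dictionary are made up for the example.

```python
def percent_reduction(before, after):
    """Return how much smaller `after` is than `before`, as a percentage."""
    if before <= 0:
        raise ValueError("original size must be positive")
    return (before - after) / before * 100


# sizes reported in the text; units only need to match within each pair
examples = {
    "color cover plus 111 bitonal pages (kb)": (66_265, 8_842),
    "color cover, 790 text pages, 7 blank pages (mb)": (384, 42.8),
}

for label, (before, after) in examples.items():
    print(f"{label}: {percent_reduction(before, after):.1f} percent smaller")
```

both examples shrink by well over four-fifths, mostly from converting text pages from color to black-and-white.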
in some cases, quality may be improved. we will discuss three optimization methods we use.

method 1: using an appropriate color mode and resolution

as we have discussed in previous

figure 6. incorrectly recognized text sample (panels: original logo; text ocred by acrobat, which reads “~do university”)

figure 7. adobe acrobat capture interface

figure 8. image-based text sample

grayscale. a pdf may contain pages that were scanned with different color modes and resolutions. a pdf may also have pages of mixed resolutions. one page may contain both bitonal images and color or grayscale images, but they must be of the same resolution. the following strategies were adopted by csul:
1. combine bitmap, grayscale, and color images. we use grayscale images for pages that contain grayscale graphs, such as black-and-white photos, color images for pages that contain color images, and bitmap images for text-only or text and line art pages.
2. if a page contains high-definition color or grayscale images, scan that page in a higher resolution and scan other pages at 400 dpi.
3. if a page contains a very small font and the ocr process does not work well, scan it at a higher resolution and the rest of document at 400 dpi.
4. if a page has both text and color or grayscale graphs, we scan it twice. then we modify images using adobe photoshop and combine two images in acrobat.

in figure 10, the grayscale image has a gray background and a true reproduction of the original photograph. the black-and-white scan has a white background and clean text, but details of the photograph are lost. the pdf converted from the grayscale image is 491 kb and has nine ocr errors. the pdf converted from the black-and-white image is 61 kb and has no ocr errors. the pdf converted from a combination of the grayscale and black-and-white images is 283 kb and has no ocr errors.
the following are the steps used to create a pdf in figure 10 using acrobat:
1. scan a page twice: grayscale

optimizer can be found at http://www.acrobatusers.com/tutorials/understanding-acrobats-optimizer.

method 3: combining different scans

many documents have color covers and color or grayscale illustrations, but the majority of pages are text-only. it is not necessary to scan all pages of such documents in color or

a higher resolution to a lower resolution and choose a different file compression. different collections have different original sources, therefore different settings should be applied. we normally do several tests for each collection and choose the one that works best for it. we also make our pdfs compatible with acrobat 6 to allow users with older versions of software to view our documents. a detailed tutorial of how to use the pdf

figure 9. reduce file size example

figure 10. reduce file size example: combine images

help.html?content=wsfd1234e1c4b69f30ea53e41001031ab64-7757.html (accessed mar. 3, 2010).
3. ted padova, adobe acrobat 7 pdf bible, 1st ed. (indianapolis: wiley, 2005).
4. olaf drümmer, alexandra oettler, and dietrich von seggern, pdf/a in a nutshell—long term archiving with pdf (berlin: association for digital document standards, 2007).
5. pdf/a competence center, “pdf/a: an iso standard—future development of pdf/a,” http://www.pdfa.org/doku.php?id=pdfa:en (accessed july 20, 2010).
6. pdf/a competence center, “pdf/a—a new standard for longterm archiving,” http://www.pdfa.org/doku.php?id=pdfa:en:pdfa_whitepaper (accessed july 20, 2010).
7. adobe, “creating accessible pdf documents with adobe acrobat 7.0: a guide for publishing pdf documents for use by people with disabilities,” 2005, http://www.adobe.com/enterprise/accessibility/pdfs/acro7_pg_ue.pdf (accessed mar. 8, 2010).
8.
adobe, “recognize text in scanned documents,” 2010, http:// help.adobe.com/en_us/acrobat/9.0/ s t a n d a rd / w s 2 a 3 d d 1 fa c fa 5 4 c f 6 -b993-159299574ab8.w.html (accessed mar. 8, 2010). 9. ibid. 10. ibid. 11. adobe, “reduce file size by saving,” 2010, http://help.adobe.com/en_us/ acrobat/9.0/standard/ws65c0a053 -bc7c-49a2-88f1-b1bcd2524b68.w.html (accessed mar. 3, 2010). the other 76 pages as grayscale and black-and-white. then we used the procedure described above to combine text pages and photographs. the final pdf has clear text and correctly reproduced photographs. the example can be found at http://hdl .handle.net/10217/1553. conclusion our case study, as reported in this article, demonstrates the importance of investing the time and effort to apply the appropriate standards and techniques for scanning and optimizing digitized documents. if proper techniques are used, the final result will be web-friendly resources that are easy to download, view, search, and print. users will be left with a positive impression of the library and feel encouraged to use its materials and services again in the future. references 1. bcr’s cdp digital imaging best practices working group, “bcr’s cdp digital imaging best practices version 2.0,” june 2008, http://www.bcr.org/ dps/cdp/best/digital-imaging-bp.pdf (accessed mar. 3, 2010). 2. adobe, “about file formats and compression,” 2010, http://livedocs .adobe.com/en_us/photoshop/10.0/ and black-and-white. 2. crop out text on the grayscale scan using photoshop. 3. delete the illustration on the black-and-white image using photoshop. 4. create a pdf using the blackand-white image. 5. run the ocr process and save the file. 6. insert the color graph. select tools > advanced editing > touchup object tool. rightclick on the page and select place image. locate the color graph in the open dialog, then click open and move the color graph to its correct location. 7. 
save the file and run the reduce file size or pdf optimizer procedure. 8. save the file again. this method produces the smallest file size with the best quality, but it is very time-consuming. at csul we used this method for some important documents, such as one of our institutional repository’s showcase items, agricultural frontier to electronic frontier. the book has 220 pages, including a color cover, 76 pages with text and photographs, and 143 text-only pages. we used a color image for the cover page and 143 black-and-white images for the 143 text-only pages. we scanned appendix a. step-by-step creating a full-text searchable pdf in this tutorial, we will show you how to create a full-text searchable pdf using adobe acrobat 9 professional. creating a pdf from a scanner adobe acrobat professional can create a pdf directly from a scanner. acrobat 9 provides five options: black and white document, grayscale document, color document, color image, and custom scan. the custom scan option allows you to scan, run the ocr procedure, add metadata, combine multiple pages into one pdf, and also make it pdf/a compliant. to create a pdf from a scanner, go to file > create pdf > from scanner > custom scan. see figure 1. at csul, we do not directly create pdfs from scanners because our tests show that it can produce fuzzy text and it is not time efficient. both scanning and running the ocr process can be very time consuming. if an error occurs during these processes, we would have to start over again. we normally scan images on scanning stations by student employees 160 information technology and libraries | september 2010160 information technology and libraries | september 2010 or outsource them to vendors. then library staff will perform quality control and create pdfs on seperate machines. in this way, we can work on multiple documents at the same time and ensure that we provide high-quality pdfs. creating a pdf from scanned images 1. 
from the task bar select combine > merge files into a single pdf > from multiple files. see figure 2. 2. in the combine files dialog, make sure the single pdf radio button is selected. from the add files dropdown menu select add files. see figure 3. 3. in the add files dialog, locate images and select multiple images by holding shift key, and then click add files button. 4. by default, acrobat sorts files by file names. use move up and move down buttons to change image orders and use the remove button to delete images. choose a target file size. the smallest icon will produce a file with a smaller file size but a lower image quality pdf, and the largest icon will produce a high image quality pdf but with a very large file size. we normally use the default file size setting, which is the middle icon. 5. save the file. at this point, the pdf is not full-text searchable. making a full-text searchable pdf a pdf document created from a scanned piece of paper is inherently inaccessible because the content of the document is an image, not searchable text. assistive technology cannot read or extract the words, users cannot select or edit the text, and one cannot manipulate the pdf document for accessibility. once optical character recognition (ocr) is properly applied to the scanned files, however, the image becomes searchable text with selectable graphics, and one may apply other accessibility features to the document. adobe acrobat professional provides three ocr options, searchable image (exact), searchable image, and clean scan. because searchable image (exact) is the only option that keeps the original look, we only use this option. to run an ocr procedure using acrobat 9 professional: 1. open a digitized pdf. 2. select document > ocr text recognition > recognize text using ocr. 3. in the recognize text dialog, specify pages to be ocred. 4. in the recognize text dialog, click the edit button in the settings section to choose ocr language and pdf output style. 
we recommend the searchable image (exact) option. click ok. the setting will be remembered by the program and will be used until a new setting is chosen. sometimes a pdf’s file size increases greatly after an ocr process. if this happens, use the pdf optimizer to reduce its file size. figure 2. merge files into a single pdf figure 3. combine files dialog figure 1. acrobat 9 professional’s create pdf from scanner dialog editorial | truitt 55 a recent library journal (lj) story referred to “the palpable hunger public librarians have for change . . . and, perhaps, a silver bullet to ensure their future” in the context of a presentation at the public library association’s 2010 annual conference by staff members of the rangeview (colo.) library district. now, lest there be any doubt on this point, allow me to state clearly from the outset that none of the following ramblings are in any way intended as a specific critique of the measures undertaken by rangeview. far be it from me to second-guess the rangeview staff’s judgment as to how best to serve the community there.1 rather, what got my attention was lj’s reference to a “palpable hunger”for magic ammunition, from whose presumed existence we in libraries seem to draw comfort. in the last quarter century, it seems as though we’ve heard about and tried enough silver bullets to keep our collective six-shooters endlessly blazing away. here are just a few examples that i can recall off the top of my head, and in no particular order: ■■ library cafes and coffee shops. ■■ libraries arranged along the lines of chain bookstores. ■■ general-use computers in libraries (including information/knowledge commons and what-have-you) ■■ computer gaming in libraries. ■■ lending laptops, digital cameras, mp3 players and ipods, e-book readers, and now ipads. ■■ mobile technology (e.g., sites and services aimed at and optimized for iphones, blackberries, etc.) ■■ e-books and e-serials. ■■ chat and instant-message reference. 
■■ libraries and social networking (e.g., facebook, twitter, second life, etc.). ■■ “breaking down silos,” and “freeing”/exposing our bibliographic data to the web, and reuse by others outside of the library milieu. ■■ ditching our old and “outmoded” systems, whether the object of our scorn is aacr2, lcsh, lcc, dewey, marc, the ils, etc. ■■ library websites generally. remember how everyone—including us—simply had to have a website in the 1990s? and ever since then, it’s been an endless treadmill race to find the perfect, user-centric library web presence? if sisyphus were to be incarnated today, i have little doubt that he would appear as a library web manager and his boulder would be a library website. ■■ oh, and as long as we’re at it, “user-centricity” generally. the implication, of course, is that before the term came into vogue, libraries and librarians were not focused on users. ■■ “next-gen” catalogs. i’m sure i’m forgetting a whole lot more. anyway, you get the picture. each of these has, at one time or another, been positioned by some advocate as the necessary change—the “silver bullet”—that would save libraries from “irrelevance” (or worse!), if we would but adopt it now, or better yet, yesterday. well, to judge from the generally dismal state of libraries as depicted by some opinionmakers in our profession—or perhaps simply from our collective lack of self-esteem—we either have been misled about the potency of our ammunition, or else we’ve been very poor markspersons. notwithstanding the fact that we seem to have been indiscriminately blasting away with shotguns rather than six-shooters, our shooting has neither reversed the trends of shrinking budgets and declining morale nor staunched the ceaseless dire warnings of some about “irrelevance” resulting from ebbing library use. 
to stretch the analogy a bit further still, one might even argue that all this shooting has done damage of its own, peppering our most valuable services with countless pellet-sized holes.
at the same time, we have in recent years shown ourselves to be remarkably susceptible to the marketing-focused hyperbole of those in and out of librarianship about technological change. each new technology is labeled a “game-changer”; change in general is either (to use the now slightly dated, oh-so-nineties term) a “paradigm shift” or, more recently, “transformational.” when did we surrender our skepticism and awareness of a longer view? what’s wrong with this picture?2
i’d like to suggest another way of viewing this. a couple of years ago, alan weisman published the world without us, a book that should be required reading for all who are interested in sustainability, our own hubris, and humankind’s place in the world. the book begins with our total, overnight disappearance, and asks (1) what would the earth be like without us? and (2) what evidence of our works would remain, and for how long? the bottom line answers for weisman are (1) in the long run, probably much better off, and (2) not much and not for very long, really.
so, applying weisman’s first question to our own, much more modest domain, what might the world be like if tomorrow librarians all disappeared or went on to work doing something else (became consultants, perhaps?) and our physical and virtual collections were padlocked? would everything be okay, because as some believe, it’s all out there on the web anyway, and google will make it findable? absent a few starry-eyed bibliophiles and newly out-of-work librarians (those who didn’t make the grade as consultants), would anyone mourn our disappearance? would anyone notice? if a tree falls in the woods . . . in short, would it matter? and if so, why and how much?
the answer to the preceding two questions, i think, can help to point the way to an approach for understanding and evaluating services and change in libraries that is both more realistic and less draining than our obsessive quest for the “silver bullet.” what exactly is our “value-add”? what do we provide that is unique and valuable? we can’t hope to compete with barnes and noble, starbucks, or the googleplex; seeking to do so simply diverts resources and energy from providing services and resources that are uniquely ours. instead, new and changed services and approaches should be evaluated in terms of our value-add: if they contribute positively and are within our abilities to do them, great. if they do not contribute positively, then trying to do them is wasteful, a distraction, and ultimately disillusioning to those who place their hopes in such panaceas.
some of the “bullets” i listed above may well qualify as contributing to our value-add, and that’s fine. my point isn’t to judge whether they are “bad” or “good.” my argument is about process and how we decide what we should do and not do. understanding what we contribute that is uniquely ours should be the reference standard by which proposed changes are evaluated, not some pie-in-the-sky expectation that pursuit of this or that vogue will magically solve our funding woes, contribute to higher (real or virtual) gate counts, make us more “relevant” to a particular user group, or even raise our flagging self-esteem. in other words, our value-add must stand on its own, regardless of whether it actually solves temporal problems. it is the “why” in “why are we here?”
if, at the end of the day, we cannot articulate that which makes us uniquely valuable (or if society as a whole finds that contribution not worth the cost), then i think we need to be prepared to turn off the lights, lock the doors, and go elsewhere, because i hope that what we’re doing is about more than just our own job security.
and if the far-fetched should actually happen, and we all disappear? i predict that at some future point, someone will reinvent libraries and librarians, just as others have reinvented cataloguing in the guise of metadata.
editorial: no more silver bullets, please. marc truitt (marc.truitt@ualberta.ca) is associate university librarian, bibliographic and information technology services, university of alberta libraries, edmonton, alberta, canada, and editor of ital.
56 information technology and libraries | june 2010
notes and references
1. norman oder, “pla 2010 conference: the anythink revolution is ripe,” library journal, mar. 26, 2010, http://www.libraryjournal.com/article/ca6724258.html (accessed mar. 30, 2010). there, i said it! a fairly innocuous disclaimer added to one of my columns last year seemed to garner more attention (http://freerangelibrarian.com/2009/06/13/marc-truitts-surprising-ital-editorial/) than did the content of the column itself. will the present disclaimer be the subject of similar speculation?
2. one of my favorite antidotes to such bloated, short-term language is embodied in michael gorman’s “human values in a technological age,” ital 20, no. 1 (mar. 2000): 4–11, http://www.ala.org/ala/mgrps/divs/lita/ital/2001gorman.cfm (accessed apr. 12, 2010); highly recommended. the following is but one of many calming and eminently sensible observations gorman makes: the key to understanding the past is the knowledge that people then did not live in the past; they lived in the present, just a different present from ours. the present we are living in will be the past sooner than we wish. what we perceive as its uniqueness will come to be seen as just a part of the past as viewed from the point of a future present that will, in turn, see itself as unique. people in history did not wear quaintly old-fashioned clothes; they wore modern clothes. they did not see themselves as comparing unfavorably with the people of the future, they compared themselves and their lives favorably with the people of their past. in the context of our area of interest, it is particularly interesting to note that people in history did not see themselves as technologically primitive. on the contrary, they saw themselves as they were: at the leading edge of technology in a time of unprecedented change.
testing for transition: evaluating the usability of research guides around a platform migration
ashley lierman, bethany scott, mea warren, and cherie turner
information technology and libraries | december 2019 76
ashley lierman (lierman@rowan.edu) is instruction librarian, rowan university. bethany scott (bscott3@uh.edu) is coordinator of digital projects, university of houston. mea warren (mewarren@uh.edu) is natural science and mathematics librarian, university of houston. cherie turner (ckturner2@uh.edu) is assessment & statistics coordinator, university of houston.
abstract this article describes multiple stages of usability testing that were conducted before and after a large research library’s transition to a new platform for its research guides. a large interdepartmental team sought user feedback on the design, content, and organization of the guide homepage, as well as on individual subject guides. this information was collected using an open-card-sort study, two face-to-face, think-aloud testing protocols, and an online survey.
significant findings include that users need clear directions and titles that incorporate familiar terminology, do not readily understand the purpose of guides, and are easily overwhelmed by excess information, and that many of librarians’ assumptions about the use of library resources may be mistaken. this study will be of value to other library workers seeking insight into user needs and behaviors around online resources. introduction like many libraries that employ springshare’s popular libguides platform for creating online research resources, the university of houston libraries (uhl) has accumulated an extensive collection of guides over the years. by 2015, our collection included well over 250 guides, with varying levels of complexity, popularity, usability, and accessibility. this presented a major challenge when we planned to migrate our libguides instance (locally branded as “research guides”) to libguides v2 in fall 2015, but also an opportunity: the transition would be an ideal time to appraise, reorganize, and streamline existing guide content. although uhl had conducted user research in the past to improve usability, in preparing for the migration it became clear that another round of tests would be beneficial in revising our guides for the new platform. our research guides would be presented much differently in libguides v2, and the design and organization of information would need to be tailored to the needs of our user community like any other service. user feedback would be vital to reorganizing our guides’ content and to making customizations to the new system. this article will describe the usability testing process that was employed before and after uhl’s migration to libguides v2. usability testing is one technique in the field of user experience (ux). the primary goal of ux is to gain a deep understanding of users’ preferences and abilities, in order to inform the design and implementation of more useful, easy-to-use products or systems. 
best practices for ux emphasize “improving the quality of the user’s interaction with and perceptions of your product and any related services.”1 usability tests conducted as part of this case study were informed by the work of jakob nielsen, who pioneered several ux ideas and techniques, and the explanations on conducting your own usability testing provided in steve krug’s seminal works on the topic, don’t make me think and rocket surgery made easy.
testing for transition | lierman, scott, warren, and turner 77 https://doi.org/10.6017/ital.v38i4.11169
uhl’s transition to libguides v2 consisted of five stages: (1) card sort testing to determine the best organization of guides in the new system; (2) the migration itself; (3) face-to-face usability testing after migration to study user expectations and behavior after the change; (4) a survey to identify any significant variations in distance users’ experiences; and (5) final analysis and implementation of the results. incorporating usability testing was a relatively easy and inexpensive process with a high yield of useful insights, which could be adapted as needed to other library settings in order to evaluate similar online resources. literature review as libraries have moved from traditional paper pathfinders to online research guides of increasing sophistication, there has been substantial study into the effectiveness of online research guides for various audiences and information needs. several studies highlight the apparent disconnect between students’ and librarians’ perceptions of research guides, especially regarding the purpose, organization, and intended use of the guides.
reeb and gibbons used an analysis of surveys and web usage statistics from several university libraries to show that students rarely or never used online guides despite the extensive time spent by librarians to curate and present information resources.2 similarly, in courtois, higgins, and kapur’s one-question survey (“was this guide useful?”) the authors were surprised to find that 40 percent of the responses received rated guides unfavorably, noting that “it was disheartening for many guide owners to receive poor ratings or negative comments on guides that require significant time and effort to produce and maintain.”3 hemmig concluded that in order to increase the value of a guide from a user perspective, librarians must adopt a user-centric approach by guiding the search process, understanding students’ mental models for research, and providing “starter references.”4 staley’s survey of student users also indicates a need to be mindful of what resources guides are actually expected to provide, as it found that pages linking to articles and databases were far more used than pages with other content.5 data has also shown that undergraduate students are unable to match their information needs with the resources provided on broad subject-area guides, leading several authors to conclude that students would be able to use course-specific guides more easily. 
for instance, strutin found that course guides are among the most frequently used guides, especially when paired with library instruction sessions.6 several other studies cite survey data, statistics, and educational concepts like cognitive load theory to conclude that ideally, guides would be customized to the specific information needs of each course and its assignments in order to better match the mental models and information-seeking behavior of undergraduate students.7 while the value of online research guides has been under study for quite some time, usability testing of guides is a relatively recent phenomenon. in 2010, librarians at concordia university conducted usability testing of two online research guides and found that undergraduate students generally found the guides difficult to use.8 librarians at metropolitan state university conducted two rounds of usability tests on their libguides with a broader range of participant types, highlighting the ability to incorporate usability testing as part of an iterative design process.9 at ithaca college, subject librarians partnered with students in a human-computer interaction course to test both course guides and subject guides through a series of usability tests, pre- and post-test questionnaires, and a group discussion in which students evaluated the findings of the usability tests and discussed their experiences.10 at the university of nevada, las vegas, librarians conducted usability testing with both undergraduate students and librarians, and surprisingly found that attitudes towards the guides were similar in both groups: interface design challenges were the greatest barrier to task completion, rather than the level of expertise of the user.11 finally, at northwestern university, librarians conducted several types of usability tests as a part of a transition from the original libguides platform to libguides v2, to determine what features worked from
the original guides and what could be improved or updated during the migration.12 throughout these and other usability studies, the authors have identified a number of desirable and undesirable elements in research guide design: • clean and simple design is highly prioritized by users. students preferred streamlined text, plentiful white space, and links to “best bets” rather than comprehensive but overwhelming lists of databases.13 these findings also align with accepted web design best practices. • guide parts and included resources should be labeled clearly and without jargon.14 sections and subpages within each guide should be named according to key terms that students recognize and understand. also, librarians should consider creating subpages using a “need-driven approach,” based on the purpose of each research task or step, rather than by the format of materials or resources.15 • the tabbed navigation of libguides v1 is both unappealing to and easily missed by users, and if it must be implemented, great care should be taken to maximize its visibility and usability.16 • consistency of guide elements, both within a guide and from one guide to the next, helps users more easily orient themselves when using guides; certain elements should always be present in the same place on the page, including navigational elements and table of contents, contact information, supplemental resources such as citation and plagiarism information, and common search boxes.17 with the findings and recommendations of these predecessors in mind, we designed a multi-stage study to expand upon their results and identify new challenges and opportunities that the libguides v2 platform might present. methodology stage 1: card sort the majority of research guides at uhl are organized by subject area, by course, or both. 
there are a number of guides, however, that are not affiliated with any particular subject area or course, containing task-oriented information that may be valuable across a wide variety of disciplines. the organizational system for these guides had developed organically over time as new guides were developed, rather than being structured intentionally, and it had become evident that these guides were not particularly discoverable or well used by students. the migration to libguides v2 presented an opportunity to reorganize these guides based on user input. a team of three librarians from the liaison services department conducted an open-card-sort study in november 2015, in order to determine how best to organize those research guides not already affiliated with a course or subject area. card sorting is a method of identifying the categories and organization of information that make the most sense to users, by asking users to sort potential tasks into named categories representing the menus or options that would be available on the site. an open-card sort allows users to create and name as many categories as they need, as opposed to a closed-card sort, which requires users to sort the available options into a predetermined set of categories. to prepare for the study, we reviewed all of our guides to develop a complete list of those not affiliated with a subject or course. for each guide, we developed a brief, clear description of the guide’s topic that would be easy for an average library user to understand, and put each description on a small laminated card. over an approximately ninety-minute period, we staffed a table in the 24-hour lounge of m.d. anderson library, where we recruited passersby to participate in the study. after answering a few demographic questions, participants were asked to place the cards into groups that seemed logical to them.
they could create as many or as few groups as necessary, but were asked to try to place every card in a group. while the participants organized the cards, they were asked to explain their thought processes and rationale, and one librarian observed the sorting process and took notes on their actions and explanations. when a participant finished grouping the cards, they were asked to write on an index card a name for each of the groups they had created. the final groupings were photographed and the labels retained for recording purposes. after the testing was complete, participants’ responses were organized into a spreadsheet and reviewed for recurring patterns and commonalities. a new set of categories was developed based on those most commonly created by students during the study, and these categories were titled using the most common terminology used by students in their group labels. stage 2: migration at the direction of the instructional design librarian (idl), research guide editors at uhl revised and prepared their own guide content throughout fall 2015, eliminating unneeded information and reorganizing what remained. the idl led multiple trainings and work sessions throughout the process to ensure compliance. during this same time, the idl completed back-end work in the libguides system to prepare for migration, and the web services department created a custom layout for the new guide site. the data migration itself took place on december 18, 2015, followed by cleanup and full implementation in january 2016. the idl provided a deadline by which all content must be ready for public consumption, prior to the start of the spring semester. after that deadline, the web services department switched the url for uhl’s research guides site to the libguides v2 instance and made the new system publicly available.
stage 3: face-to-face testing after the migration process was complete, the idl assembled a team of ten other librarian and staff stakeholders from the liaison services, special collections, and web services departments to develop a usability testing protocol. this team assisted the idl in developing two different face-to-face testing scripts and the text of a survey for distance users, as well as helping to administer face-to-face testing. the method we chose for the face-to-face testing process was think-aloud testing. in a think-aloud test, the user is given a set of tasks, identified as common potential uses, to complete using the web resource. the user is asked to attempt each task, and to narrate any thoughts or reactions to the resource, as well as the thought process and rationale behind each decision made. several members of the team were already familiar with usability practices and had participated in think-aloud user testing before. training for the others was provided in the form of short practical readings, verbal guidance from the idl in group meetings, and practice sessions before conducting the face-to-face testing. in the practice sessions, group members volunteered for their roles in the testing, discussed protocol and logistics and asked any questions, and practiced the tasks they would each need to complete: making the recruitment pitch to users, walking through the consent process, using recording software, using the note-taking sheet, and so on. as the team leader and one of the members experienced with usability, the idl conducted the actual testing interviews. each of the face-to-face tests focused on either subject guides or the guide homepage. for both tests, tables were set up in the 24-hour lounge for recruitment and testing. two team members recruited students in the library at the time of testing by offering snacks and library-branded giveaways.
two additional team members facilitated the test and took notes during testing. both tests also used the same consent forms and demographic questions, and largely the same follow-up questions. participants in both homepage and subject guide testing were guided to the appropriate starting points and interviewed about their impressions of the homepage and guides, their perceptions of the purpose of these resources, and their understanding of the research guides name. subject guide testers were allowed to select which of our two testing guides they would be more comfortable using: the general business resources guide, or the biology and biochemistry resources guide. subject guide testers were also asked how they would seek help if the guide did not meet their needs. both groups were then asked to complete one of two sets of tasks. the homepage tasks were designed to test users’ ability to find individual guides, either for a specific course or for general information on a subject; the subject guide tasks were designed to test users’ ability to find appropriate resources for research on a given topic. after completing the tasks for their appropriate resources, participants answered several general follow-up questions, with additional questions from the facilitator as necessary. stage 4: survey unlike the face-to-face testing, the survey focused only on use of subject guides, not the homepage. otherwise, however, because the purpose of the survey was to compare the behavior of distance users to the behavior of on-campus users, the survey was designed to mimic the face-to-face test as closely as possible. several team members with liaison responsibilities identified distance user groups in their subject areas who would be demographically appropriate and available at the needed time, and contacted appropriate faculty members to ask for assistance in distributing the survey via email.
ultimately, the survey was distributed to small cohorts of users in the areas of social work, education, nursing, and pharmacy, and customized for each targeted cohort. each version of the survey linked users to their appropriate subject guide and then asked the same questions regarding impressions of the guide that were asked in the face-to-face testing. users were also asked to complete tasks using the guide that were similar in purpose to those in the face-to-face testing, and they were prompted to enter the resource they found at the end of each task. demographic information was requested at the end of the survey to ensure that in the event of drop-offs, basic demographic information would be more likely to be lost than testing data. the survey was distributed to the target groups over a three-week period in june 2016. six users at least partially completed the survey, and four completed it in full.

stage 5: analyzing and implementing results
after completing the face-to-face testing, the idl reviewed and transcribed the recordings of each test session, along with additional insights from the notetakers. responses to each interview question were coded and ordered from most to least common, as were patterns of behavior and difficulties in completing each task. task results and completion times were also recorded for each user and organized into a spreadsheet with users’ demographic information. the idl then reported out to research guide editors on common responses and task behaviors observed in the testing, and interpretations of the implications of these results for guide design. after survey responses were collected, the idl compiled and analyzed the results using a similar process, although the survey received few enough responses that coding was not necessary.
users’ responses to questions were noted and grouped, and success and failure rates on tasks were tallied. a second report out to research guide editors summarized these results and described which responses closely resembled those received in the face-to-face testing and which varied. finally, when all data had been collected, the idl combined recommendations based on the testing results with other recommendations derived from past uhl studies and from reviewing the literature, and from these developed a set of research guides usability guidelines. the guidelines were organized from highest to lowest priority, based on how commonly each was indicated in testing or in the literature. research guide editors were asked to revise their guides according to these guidelines within one year of their implementation, and advised that their compliance would be evaluated in an audit of all guide content in summer 2017. in the interest of transparency, the idl also included in the guidelines document an annotated bibliography of the relevant literature review, and a formal report on the procedures and results of the usability testing process.

findings
card sort
one significant observation from the card sort was that, while librarians tended to organize guides into groups based on type of user (e.g., “undergraduates,” “student athletes,” “first-years,” etc.), none of the students who participated categorized resources in this way, and they did not seem to be particularly conscious of the categories into which they or other users might fit. instead, their groupings focused on the type of task for which each guide would be most appropriate, rather than the type of user that would be most likely to use that guide.
for example, users readily recognized guides related to citation tasks and preferred them to be grouped together, regardless of the level at which they addressed the topic, and also grouped advanced visualization techniques like gis with simpler multimedia-related tasks like finding images. similarly, category labels tended to include “how to . . . ” language in describing their contents, focusing on the task for which the guides in that category would be beneficial. this aligns with the recommendation from sinkinson et al. to name guide pages based on purpose rather than format.18 it is worth noting, however, that all of the students who participated in the card-sort study were undergraduates and may not have fully understood some of the more complex research tasks being described. it should also be noted that all users created some sort of category for “general” or “basic” research tasks, and most either explicitly created an “advanced” research category, or created several more granular categories and then verbally described these as being for “advanced” research tasks. in general, organization by task type was most preferred, followed by level of sophistication of task.

face-to-face testing: homepage
no significant correlations were found between user demographics and users’ success rates in completing each task, nor between demographics and time on task. users’ ability to navigate the system was generally consistent regardless of major, year in program, and, somewhat surprisingly, frequency of library use. this is, however, in keeping with costello et al.’s finding that technology barriers were more significant in user testing than level of experience.19 when testing the homepage, we found that all users were able to find known guides (such as a course guide for a specific course) and appropriate guides for a given task (such as a citation guide for a particular style) quickly and easily.
when seeking a guide, users generally used the by subject view of all guides to locate both subject and course guides. if this view was not helpful, as in the case of citation style guides, users’ next step was most commonly to switch to the all guides view and use the search function to look for key terms. users understood and used the by subject and all guides views intuitively, expressed more confusion and hesitation about the by owner and by type views, and disregarded the by group view entirely. we had been concerned about whether the search function would confuse users by highlighting results from guide subpages, but on the contrary, the study participants used the search function easily, and the fact that it surfaced results from within guides seemed to help them find and identify relevant terms, rather than confusing them. overall, users responded favorably to the look and feel of guides, albeit with a few specific critiques: the initially limited color palette made it difficult for some users to distinguish parts of a guide from one another, and the text size was found to be uncomfortably small in some areas.

face-to-face testing: subject guides
in subject guide testing, we found overwhelmingly that users both valued and made use of link and box descriptions within guides, using them throughout the navigation process as sources of additional information. users generally preferred briefer descriptions, rather than reading lengthy paragraphs of text, but several noted specific instances in which they would not have understood the nature or purpose of a database without the description that was provided. we also found, conversely, that librarian profile boxes were of less value to users than we had assumed.
when asked how they would find help when researching, most subject guide testers said they would turn to google, ask at the library service desk, or use the contact us link in the libguides footer; only two mentioned the profile box as a potential source of help at all. users also seemed unsure of the purpose of the profile box, and did not appear to recognize whose photo and contact information they were seeing, in spite of box labels and text. contrary to our expectations, users also readily clicked through to subpages of guides to find information, sometimes even when more useful information was actually available on the guide landing page. this was particularly evident in one of the subject guides that included internal navigation links in a box on the landing page: if users saw a term they recognized in one of these links, they would click it immediately, without exploring the rest of the page. in general, users latched on quickly to terms in subpage and box titles that seemed relevant to their tasks, and some expressed feelings of increased confidence and reassurance when seeing a familiar term featured prominently on an otherwise unfamiliar resource. scanning for keywords in this manner also sometimes led users astray, however: some navigated to inappropriate pages or links because they featured words like “research” or “library” in their titles. users also expressed confusion about page titles that did not match their expectations of tasks they could complete online, such as “biology reading room.” these findings support those of many prior authors regarding the importance of including clear descriptions with key words that users readily understand.20 many of our results from subject guide testing not only ran counter to our expectations, but challenged the assumptions on which we had based our questioning.
for example, we had been curious to learn whether links to external websites were used significantly compared to links to library databases, or if they simply cluttered guides unnecessarily. in testing, however, we found that users did not distinguish between the two types of resources at all, and used both interchangeably. a better question seemed to be not whether users found those links useful, but how to distinguish them from library content—or whether the distinction was necessary from the user’s perspective. some team members had also been concerned about the scroll depth of guide pages, but the majority of users not only said they did not mind scrolling, but seemed surprised and amused by being asked. their own assumptions about this type of resource clearly included the need to scroll down a page. a few other miscellaneous issues presented themselves in our face-to-face testing. one was that the purpose and nature of research guides was not readily evident to users. many used language that conflated guides with search tools like databases, or even with individual information resources like books or articles. for example, a user asked whether the by owner view listed the authors of articles available in this resource. the curated and directional nature of research guides was not at all clear to users. furthermore, in spite of the improvements to guide look and feel in libguides v2, several users still spoke of guides as being cluttered, lengthy, and overwhelming, leaving them intimidated and unsure of where to begin. consistently, testers tended to gravitate toward course guides even when subject guides would have been more appropriate for a given task, and some users expressed that this choice was because of the greater specificity in course guide titles. 
users demonstrated a great preference for familiarity, gravitating toward terms and resources that were known to them, and even repeating behaviors that had been unproductive earlier in the testing process. finally, one of the greatest points of confusion for users seemed to be the relationship of research guides to physical materials within the library. users readily and confidently followed links to online resources from research guides but expressed confusion and hesitancy when guides pointed to books or other resources available in the library.

survey
the survey of off-campus users had few responses, but the demographics of the respondents varied more than those of the on-campus testing participants, including graduate students and faculty. the users who did respond showed evidence of less use of guide subpages than we had observed in the face-to-face testing, indicating that the presence of a librarian during testing may have influenced users to explore guides more thoroughly than they would have when working on their own. at the same time, more experienced researchers in the survey group (in this case, a late-program graduate student and a faculty member) were apparently more likely than less experienced users to explore guides thoroughly, and to succeed at research tasks. survey respondents also were far more likely to state that they would use the profile box on guides for help, with some indicating that they recognized their liaison librarian’s picture and were familiar with the librarian as a source of assistance. liaison librarians at uhl often work more closely with higher-level students and faculty than with undergraduates, and this greater familiarity was not surprising.

discussion
implementation of findings
based on the results of the literature review and testing, a number of changes and recommendations were implemented.
a brief description of the nature and purpose of research guides was added to the guide homepage’s sidebar, more color variation was added to guides, and font sizes were increased. existing documentation was also reworked and expanded to create the research guides usability guidelines document for all guide editors, which included adding or revising the following recommendations:
• pages, boxes, and resources should all be labeled with clear, jargon-free language that includes keywords familiar to their most frequent users.
• page design should be “clean and simple,” minimizing text and incorporating ample white space.
• brief, one- to two-sentence descriptions should be provided for all links.
• each guide should have an introduction on its landing page with a brief description of its contents and purpose. it may be helpful to include links to subpages in this box as well, but this should be done judiciously, as these links may take users off the landing page prematurely.
• pages and resources aimed at undergraduates should be organized and titled according to their relevance to research tasks (e.g., “find articles”), and not by user group.
• electronic resources should be prioritized on guides over print resources.
• clear distinctions should be made between library and non-library links when the distinction is important.
• a profile box with a picture should be included, but the importance of this item is not as great as we had previously imagined.

limitations
one of the most significant challenges in our testing was actually negotiating the irb application process. delays in our application raised concerns within the team that we would not receive approval in time to test with students before the start of the summer break. although we did receive approval in time, the window for testing afterward was extremely narrow.
submitting the application also bound us to the scripts and text that we had originally drafted, which severely limited the flexibility of the testing process. this became a challenge at several points when a particular phrasing or design of a question was found to be ineffective in practice, but could not be altered from its original form. tensions between the requirements for institutional review and the unique needs of usability testing are a persistent problem for user experience development in an academic setting, and must be planned for as much as possible. in some cases, as well, we might have improved our results by better designing our questions. one example of this was the question about the name “research guides,” which anecdotal evidence has suggested might be challenging for users. simply asking whether that name made sense to the participant was clearly not effective in practice, and did not yield actionable insights. in the future, we might consider informal testing of our planned questions with users in the target demographic before proceeding with full-scale usability testing. a final challenge was in gathering data on use of guides by distance users. though we were able to get enough responses to draw some tentative conclusions, we had hoped for a larger pool of data. though it would make the results more difficult to compare to in-person testing, reducing the length of the survey might have helped to produce more responses. additionally, increased marketing and more flexible timing for survey distribution might have also helped us reach a larger audience.

conclusions
the results of our testing were very instructive, and led to the creation of valuable documentation for guide editors to use in their work.
we also learned a number of lessons relating to process that would be of value to other librarians seeking to perform similar testing at their own institutions. the first of these is that working with a large, interdepartmental team on this type of project, while occasionally unwieldy, is greatly beneficial overall. even if all the team members are not able to fully participate, involving as many colleagues as possible in the usability testing process lessens the workload for each individual, increases flexibility, and ultimately increases buy-in and compliance with the resulting changes and recommendations. for a platform used directly by a relatively large percentage of librarians, as libguides generally is, the number of stakeholders in user research is correspondingly large, and as many of these stakeholders as possible should be involved to some degree. not only will this distribute the benefits of the process more broadly, it will make it possible to staff more extensive and more frequent testing sessions. in the course of our testing process, we also came to recognize the value of testers familiar with the user group under examination. a majority of librarians involved in testing were from public-facing departments, with significant student contact in their day-to-day work. as a result, we were able to quickly attract a diverse set of participants for our testing simply through our collective knowledge of students’ likely behaviors and preferences: where students were most likely to congregate, what kinds of rewards would motivate them to participate, how to reach them at a distance, and how far their patience would be likely to extend for an in-person interview or an online survey. the incentives and location that the testing teams selected were so effective that the numbers of volunteers we received overwhelmed our capacity to accommodate within the allotted testing time, resulting in a substantial pool of responses for analysis.
therefore, we conclude that the effectiveness of user research can be increased by including (or at least consulting) those most familiar with the user group to be studied. simply assuming that participants will be available may ultimately compromise the effectiveness of testing. additionally, time management is an extremely important element of testing development. failing to fully account for the demands of the irb process, for example, led to significant limitations for our project concerning the timing of testing, the availability of participants, our capacity for marketing and distribution of the survey, and the quality of our testing instrument. while acknowledging that, as in our case, sometimes the need for usability testing arises on short notice, we recommend allocating as much time and preparation to the process as possible, to ensure that every aspect of the testing can be given adequate attention.

figure 1. average monthly guide views by transition period.

as a final note, nearly two years after the best practices were implemented, we collected and compared guide traffic statistics from three key periods:
• september 2014 through december 2015, the sixteen months preceding our transition to libguides v2;
• january 2016 through august 2017, our first twenty months on libguides v2, during which time best practices had not yet been fully developed and implemented; and
• september 2017 through april 2019, from the beginning of best practices implementation through the time of writing (best practices were implemented gradually between september 2017 and february 2018).
mindful of the fact that guide usage fluctuates with the academic year, we compared average views for each guide on a monthly basis.
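The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not say how the statistics were exported or processed, so the record layout `(guide_id, year, month, views)`, the period labels, the function names, and the sample numbers are all hypothetical. Only the three date ranges come from the text. The sketch pools every guide-month record that falls in a given period and calendar month, then takes the mean.

```python
from collections import defaultdict
from datetime import date

# Date ranges are taken from the three periods listed in the text;
# the labels themselves are illustrative, not the authors' terms.
PERIODS = [
    ("v1", date(2014, 9, 1), date(2015, 12, 31)),
    ("v2 before best practices", date(2016, 1, 1), date(2017, 8, 31)),
    ("v2 with best practices", date(2017, 9, 1), date(2019, 4, 30)),
]


def period_for(month_start):
    """Return the label of the period containing month_start, or None."""
    for label, start, end in PERIODS:
        if start <= month_start <= end:
            return label
    return None


def average_monthly_views(records):
    """Average views per guide for each (period, calendar month) pair.

    records: iterable of (guide_id, year, month, views) tuples
    (a hypothetical export format, one row per guide per month).
    """
    totals = defaultdict(lambda: [0, 0])  # key -> [sum of views, record count]
    for guide_id, year, month, views in records:
        label = period_for(date(year, month, 1))
        if label is None:
            continue  # record falls outside all three periods
        bucket = totals[(label, month)]
        bucket[0] += views
        bucket[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


# Made-up sample data: two guides, one October in each period.
sample = [
    ("guide-a", 2015, 10, 120),
    ("guide-b", 2015, 10, 80),
    ("guide-a", 2016, 10, 70),
    ("guide-b", 2016, 10, 50),
    ("guide-a", 2018, 10, 60),
    ("guide-b", 2018, 10, 40),
]
averages = average_monthly_views(sample)
for (label, month), mean in sorted(averages.items()):
    print(f"{label}, month {month}: {mean:.1f}")
```

Keying the averages by calendar month as well as by period is what allows an October in one period to be compared with an October in another, which is the point of controlling for the academic-year cycle.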
figure 1 shows the average number of times each guide was viewed in a month for each period of the transition. as the figure shows, for most of the academic year, guide views dropped sharply after our transition from libguides v1 to libguides v2, and continued to decline slightly with time through the period when our best practices were implemented. there are a number of possible causes for this phenomenon:
• guide usage may be declining over time generally for a variety of reasons, and the transition to the new look of v2 may have confused and disoriented users in the immediate aftermath, causing use of some guides to be discontinued.
• a substantial number of older guides were eliminated in the transition to v2, some of which may have been more heavily used than suspected, and new guides that have been created since may not yet have gained traction and recognition from users.
• librarians may also have reduced their efforts to incorporate guides into their teaching and outreach strategies.
• improved organization in the new system may be helping users to find the guide they need on the first try, without having to move through and examine multiple guides.
in any case, this trend is concerning and merits further investigation, but a direct correlation with the transition to libguides v2 and the implementation of best practices has not been established. a more accurate measure of the effect of the best practices would be a user satisfaction survey, although a comparison would be difficult to make due to a lack of a baseline from before the transition. we will continue to investigate trends in the use of our guides, how our best practices have affected our users, and how they can be improved upon in the future.

appendix a: homepage testing script
welcome and demographics
hello! thank you for agreeing to participate. i’ll be helping you through the process, and my colleague here will be taking notes.
before we get started, i’d like to ask you a few quick questions about yourself.
• are you a student?
  o (no:)
    ▪ what is your status at uh? (faculty, staff, fellow, etc.)
    ▪ with what college or area are you affiliated?
  o (yes:)
    ▪ are you an undergraduate or a grad student?
    ▪ what program are you in?
    ▪ what year are you in now?
• how often do you use this library?
• how often do you use the libraries’ website or online resources?
• about how many hours a week would you say you spend online?
• have you ever used the libraries’ research guides before? (if not) have you ever heard of them?
are you ready to start? do you have any questions?

homepage tour
first, i’d like to ask you a few questions about the homepage, which you can see here. don’t worry about right or wrong answers, i just want to know your reactions.
• when you look at this page, what are your first impressions of it?
• just from looking at these pages, what do you think this resource is for?
• look at the categories across the top of the screen. what do you think each of those mean? what would you use them for?
• what would you call the resources listed here?
• we call these resources “research guides.” does that name make sense to you?

tasks: odd-numbered participants
now we’re going to ask you to complete two tasks using this page and the links on it. this isn’t a test, and nothing you do will be the wrong or right answer. we just want to see how you interact with the site and what we can do to make that experience better. do you have any questions so far? let’s begin. please try to talk about what you’re doing as much as possible, and tell us what you’re thinking and why you’re taking each step.
1. you need to find sources for an assignment for your history class, and you aren’t sure where to start. you clicked a link on the help section of the library webpage that led you here. find a guide that you think can help you.
2. you are taking chemistry 1301, and your professor told you that the library has a research guide especially for this class. find the guide you think they meant.

tasks: even-numbered participants
now we’re going to ask you to complete two tasks using this page and the links on it. this isn’t a test, and nothing you do will be the wrong or right answer. we just want to see how you interact with the site and what we can do to make that experience better. do you have any questions so far? let’s begin. please try to talk about what you’re doing as much as possible, and tell us what you’re thinking and why you’re taking each step.
1. you need to format a bibliography in mla style, and your professor told you that the library has a research guide that can help. find the guide you think she meant.
2. you are taking a psychology course for the first time, and you want to find out what types of tools you should use to do research in psychology. you clicked a link on the help section of the library webpage that led you here. find a guide that you think can help you.

follow-up questions
now i’d like to ask you a few follow-up questions.
• was this easy or hard to do?
• what was the easiest part?
• what was the hardest part?
• what did you like about using this site?
• what’s one thing that would have made these tasks easier to complete?

appendix b: subject guides testing script
welcome and demographics
hello! thank you for agreeing to participate. i’ll be helping you through the process, and my colleague here will be taking notes. before we get started, i’d like to ask you a few quick questions about yourself.
• are you a student?
  o (no:)
    ▪ what is your status at uh? (faculty, staff, fellow, etc.)
    ▪ with what college or area are you affiliated?
  o (yes:)
    ▪ are you an undergraduate or a grad student?
    ▪ what program are you in?
    ▪ what year are you in now?
• how often do you use this library?
• how often do you use the libraries’ website or online resources?
• about how many hours a week would you say you spend online?
• have you ever used the libraries’ research guides before? (if not) have you ever heard of them?
are you ready to start? do you have any questions?

guide impressions
first, i’d like to ask you a few questions about this page. don’t worry about right or wrong answers, i just want to know your reactions.
• when you look at this page, what are your first impressions of it?
• just from looking at this page, what do you think this resource is for? what would you use it for?
• what would you call this type of resource?
• we call resources like this “research guides.” does that name make sense to you?
• if you couldn’t find what you were looking for on this page, what would you do to find help?
now we’re going to ask you to complete two tasks using this page and the links on it. this isn’t a test, and nothing you do will be the wrong or right answer. we just want to see how you interact with the site and what we can do to make that experience better. do you have any questions so far? let’s begin. please try to talk about what you’re doing as much as possible, and tell us what you’re thinking and why you’re taking each step.

tasks: general business resources guide
1. find a database that you could use for research in a general business class.
2. imagine you want to find information on census data. find an appropriate resource on this guide.
3. find a tool you could use to find a dissertation to use in a general business class.

tasks: biology and biochemistry resources guide
1. find a database that you could use for research in a biology class.
2. imagine you want to find information on taxonomy. find an appropriate resource on this guide.
3. find a tool you could use to find a thesis to use in a biology class.

follow-up questions
now i’d like to ask you a few follow-up questions.
• was this easy or hard to do?
• what was the easiest part?
• what was the hardest part?
• what did you like about using this site?
• what did you dislike?
• what’s one thing that would have made these tasks easier to complete?
• did it bother you to have to scroll down the page to find additional information?
• if you had been doing this on your own, do you think you would have kept scrolling, or gone to other pages on the guide?
• did you notice or read the text below the links?
• did the names of the different pages on the guide make sense to you? did you know what to expect?
• do you think you would use these resources yourself if you were a student in the appropriate class?

appendix c: example survey (social work students)
screening questions
are you a university of houston student, faculty member, or employee?
• yes
• no
are you at least 18 years of age?
• yes
• no

consent
university of houston consent to participate in research
project title: usability testing of library research guides
you are being invited to participate in a research project conducted by ashley lierman, the instructional design librarian, and a team of other librarians from the university of houston libraries.

non-participation statement
your participation is voluntary and you may refuse to participate or withdraw at any time without penalty or loss of benefits to which you are otherwise entitled. you may also refuse to answer any question. if you are a student, a decision to participate or not or to withdraw your participation will have no effect on your standing.

purpose of the study
the purpose of this study is to investigate user interactions with the research guides area of the uh libraries’ website, in order to understand user needs and expectations and improve the performance of the site.

procedures
you will be one of approximately fifty subjects to be asked to participate in this survey. you will be asked to provide your initial thoughts and reactions to the libraries’ research guides, and to complete three ordinary research tasks using the page and associated links, then answer follow-up questions about your experience. the survey includes 23 questions and should take approximately 20-30 minutes.

confidentiality
your participation in this project is anonymous. please do not enter your name or other identifying information at any point in this survey.

risks/discomforts
no foreseeable risks or discomforts should result from this research.

benefits
while you will not directly benefit from participation, your participation may help investigators better understand our users’ needs and expectations from the libraries’ website.

alternatives
participation in this project is voluntary and the only alternative to this project is non-participation.

publication statement
the results of this study may be published in professional and/or scientific journals. it may also be used for educational purposes or for professional presentations. however, no individual subject will be identified. if you have any questions, you may contact ashley lierman at 713-743-9773. any questions regarding your rights as a research subject may be addressed to the university of houston committee for the protection of human subjects (713-743-9204). by clicking the “i agree to participate” button below, you affirm your consent to participate in this survey. if you do not consent to participate, you may simply close this window.
• i agree to participate

guide impressions
click the link below (will open in a new window) and explore the page it leads to, then return to this survey and answer the questions.
http://guides.lib.uh.edu/socialwork
when you look at the page linked above, what are your first impressions of it?
just from looking at the page, what do you think this resource is for? what would you use it for?
what would you call this type of resource, if you had to give it a name?
if you couldn’t find what you were looking for on the page linked above, what would you do to find help?

on the following pages, you will be asked to complete three brief tasks. this is not a test, and nothing you do will be the wrong or right answer. the purpose of these tasks is simply to allow you to experiment with using the guide in an authentic way. when you have completed all of the tasks, you will be asked a few questions about your experiences.

first task
click the link below to open the social work resources guide (will open in a new window):
http://guides.lib.uh.edu/socialwork
on the social work resources guide, find a link to a database that you could use to investigate possible psychiatric medications.
enter the name of the database you found:

second task
click the link below to open the social work resources guide (will open in a new window):
http://guides.lib.uh.edu/socialwork
imagine you want to find a psychological assessment. find an appropriate resource on the social work resources guide. (you do not need to actually find an assessment, only the name of a resource that would help you locate one.)
enter the name of the resource you found:

third task
click the link below to open the social work resources guide (will open in a new window):
http://guides.lib.uh.edu/socialwork
on the social work resources guide, find a tool you could use to find historical census data.
enter the name of the tool you found:

follow-up questions
were the tasks on the preceding pages easy or difficult to do?
• extremely easy
• somewhat easy
• neither easy nor difficult
• somewhat difficult
• extremely difficult
what was the easiest part of completing the tasks?
what was the most difficult part of completing the tasks?
what did you like about using the guide that you were linked to?
what did you dislike about using the guide?
what is one thing that would have made the tasks easier to complete?

demographics
thank you for completing the survey! before you leave, please answer a few demographic questions about yourself.
are you a student?
• yes
• no
type of student:
• undergraduate
• graduate
• not a student
program or major:
year in program:
• 1st
• 2nd
• 3rd
• 4th
• 5th or higher
• not a student
how often do you use the university of houston libraries?
• daily
• a few times a week
• a few times a month
• a few times a year
• never
how often do you use the libraries’ website or online resources (e.g. databases, catalog, etc.)?
• daily
• a few times a week
• a few times a month
• a few times a year
• never
have you ever used the libraries’ research guides before?
• yes
• no

ending screen
we thank you for your time spent taking this survey. your response has been recorded.

references
1 “user experience basics,” usability.gov, https://www.usability.gov/what-and-why/user-experience.html.
2 brenda reeb and susan gibbons, “students, librarians, and subject guides: improving a poor rate of return,” portal: libraries and the academy 4, no. 1 (2004): 123-30, https://doi.org/10.1353/pla.2004.0020.
3 martin p. courtois, martha e. higgins, and aditya kapur, “was this guide helpful? users’ perceptions of subject guides,” reference services review 33, no. 2 (2005): 188-96, https://doi.org/10.1108/00907320510597381.
4 william hemmig, “online pathfinders: toward an experience-centered model,” reference services review 33, no. 1 (2005): 66-87, https://doi.org/10.1108/00907320510581397.
5 shannon m. staley, “academic subject guides: a case study of use at san jose state university,” college & research libraries 68, no. 2 (2007): 119-40, http://crl.acrl.org/content/68/2/119.short.
6 michal strutin, “making research guides more useful and more well used,” issues in science and technology librarianship 55 (2008), https://doi.org/10.5062/f4m61h5k.
7 kristin costello et al., “libguides best practices: how usability showed us what students really want from subject guides” (presentation, brick & click ’15: an academic library conference, maryville, mo, november 6, 2015): 52-60; alisa c. gonzalez and theresa westbrock, “reaching out with libguides: establishing a working set of best practices,” journal of library administration 50, no. 5-6 (2010): 638-56, https://doi.org/10.1080/01930826.2010.488941; jennifer j. little, “cognitive load theory and library research guides,” internet reference services quarterly 15, no. 1 (2010): 53-63, https://doi.org/10.1080/10875300903530199; dana ouellette, “subject guides in academic libraries: a user-centered study of uses and perceptions,” canadian journal of information and library science 35, no. 4 (2011): 436-51, https://doi.org/10.1353/ils.2011.0024.
8 luigina vileno, “testing the usability of two online research guides,” partnership: the canadian journal of library and information practice and research 5, no. 2 (2010): 1-21, https://doi.org/10.21083/partnership.v5i2.1235.
9 alec sonsteby and jennifer dejonghe, “usability testing, user-centered design, and libguides subject guides: a case study,” journal of web librarianship 7, no. 1 (2013): 83-94, https://doi.org/10.1080/19322909.2013.747366.
10 laura cobus-kuo, ron gilmour, and paul dickson, “bringing in the experts: library research guide usability testing in a computer science class,” evidence based library and information practice 8, no. 4 (2013): 43-59, http://ejournals.library.ualberta.ca/index.php/eblip/article/view/20170.
11 costello et al., 56.
12 john j. hernandez and lauren mckeen, “moving mountains: surviving the migration to libguides 2.0,” online searcher 39, no. 2 (2015): 16-21.
13 ouellette, 447; denise fitzgerald quintel, “libguides and usability: what our users want,” computers in libraries 36, no. 1 (2016): 8; sonsteby and dejonghe, 89.
14 costello et al., 56; hernandez and mckeen, 20; sonsteby and dejonghe, 89.
15 caroline sinkinson et al., “guiding design: exposing librarian and student mental models of research guides,” portal: libraries and the academy 12, no. 1 (2012): 74, https://doi.org/10.1353/pla.2012.0008.
16 costello et al., 56; ouellette, 444-45; quintel, 8; kate a. pittsley and sara memmot, “improving independent student navigation of complex educational web sites: an analysis of two navigation design changes in libguides,” information technology and libraries 31, no. 3 (2012): 56, https://doi.org/10.6017/ital.v31i3.1880; sonsteby and dejonghe, 87.
17 cobus-kuo, gilmour, and dickson, 50; costello et al., 56.
18 sinkinson et al., 74.
19 costello et al., 56.
20 costello et al., 56; hernandez and mckeen, 20; sonsteby and dejonghe, 89; sinkinson et al., 74.

biblios revisited

john c. kountz: library systems coordinator, california state university and colleges, los angeles. when this article was in preparation, the author was systems analyst, orange county public libraries, orange county, california.

in the following, orange county public library's earlier reports on its biblios system are updated. book catalog and circulation control modules are detailed, development and operation costs documented, and a cost comparison for acquisitions cited.

"in 1968 ala began publishing, through its information science and automation division, a journal of library automation.
it is perhaps appropriate to note that in the first three quarterly issues only one public library project was described (1), and this was a project under contemplation, not one actually in operation." (2) this statement by dan melcher to substantiate his contention that library automation is suspect is, in itself, suspect. the public library project alluded to as being contemplated in 1968 was brought to fruition by orange county (california) public library in 1969, and has functioned with startling success ever since. in addition, the finished system was reported to the library (3) and data processing (4) worlds in 1969 and 1970 respectively.

orange county public library's biblios (book inventory building library information oriented system) is a system designed to fulfill all functional requirements of a multibranch library which is growing by leaps and bounds (5). specifically these functional requirements are: acquisitions, book processing, catalog maintenance, circulation control, and book fund accounting, in addition to management reporting on a level not practical in a manual system.

64 journal of library automation vol. 5/2 june, 1972

the functional system
the interrelation of these system elements is shown diagrammatically in figure 1. briefly and from a user's point of view, the system works like this: a title is desired by someone, patron or staff member. the person refers to the book catalog, figure 2, to see if the item is in the collection. if it is and not in circulation, he gets the book directly. if the item is in circulation, he can submit a request for it, to receive the book on its return. to update the catalog, a cumulative supplement is produced, keeping current the listing of the library's holdings. if the title is not found in the catalog or supplement, the monthly cumulative on-order list, figure 3, is consulted. if the title is listed, a request is submitted and, on receipt and processing, the book is released to the requester.
if the title is cancelled, the requester is notified.

when a title wanted for the collection is not listed in either the catalog or the cumulative on-order list, a bibliographic information sheet (bis), figure 4, is completed and optically scanned into the system. this information is essentially a pre-cataloging bibliographic description of the desired material. once entered, these same data serve first to create purchase orders and related reports; then, once edited by the catalogers from the book in hand, to create book card and pocket sets (figure 5), book catalog entries, shown in figure 2, holding lists (shelf lists) for each branch, and a broad array of operational reports. it is a feature of biblios that the descriptive data (from the bis) are entered in their entirety only once. this means that a bibliographic description need not be initialized by each individual using it; rather, it need only be consulted and, if necessary, corrected or deleted. thus, an entry once in the system is immediately available for, among other purposes, ordering. this is especially significant since it means that each entry in the book catalog, the catalog supplement, the cumulative on-order list, etc., can be ordered against by simply using the key number for the desired item and the number assigned to the branch wishing to order. this poses the possibility of orders for materials which are o.p. or otherwise not readily available through the usual vendor channels. biblios addresses these potential errors by listing (pre-vend list, figure 6) all order requirements for review before they are used to create orders. by editing this list against books in print and/or publishers' catalogs and taking corrective action, orders for the unobtainable are short-stopped. on placing an order, while a unique subpurchase order number is mechanically created, the key number continues to document the title for processing purposes.
in this role the key number follows the order until it is filled or cancelled. thus, the key is used by biblios to update inventory automatically on receipt of an order and to create the card and pocket sets for those materials received. finally, the key number is used by the branches to report inventory changes and, as a subset of inventory, for circulation control. since it is through the key number (or key, for short) for a bibliographic citation that the citation is used in the various functions performed by biblios, perhaps a little detail concerning the key is in order.

[fig. 1. biblios: the functional system. modules shown: bibliographic/book catalog, acquisitions accounting, inventory locator, and circulation.]

the key number
in figure 2, the key for 73084452 has been underlined. the key number resembles the lc card order number. wherever an lc card order number is available, it is used. when no lc card order number is available, a unique orange county (oc) number is applied. the oc number consists of two alphabetic characters in the first two positions (at one time the numbers implied year) of the "traditional" number followed by a six-digit sequential number. since the library of congress has certain idiosyncrasies about its card order number, the key also specifies the type of material it represents (for example, only book keys are in the book catalog), and identifies each volume, or edition, of a title which has a blanket lc card order number.
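the two key formats described above can be told apart mechanically. the sketch below is only an illustration in python (biblios itself was written in cobol); the function name is hypothetical, and the eight-digit form for lc-derived keys is an assumption inferred from the example key 73084452, not a rule stated in the text. the material-type and volume/edition parts of the key are not modeled.

```python
import re

# oc number: two alphabetic characters followed by a six-digit sequential number.
OC_KEY = re.compile(r"[a-z]{2}[0-9]{6}")

# assumption: an lc-derived key is written as eight digits, as in 73084452.
LC_KEY = re.compile(r"[0-9]{8}")


def classify_key(key: str) -> str:
    """Return 'oc', 'lc', or 'unknown' for the base part of a key number."""
    base = key.strip().lower()
    if OC_KEY.fullmatch(base):
        return "oc"
    if LC_KEY.fullmatch(base):
        return "lc"
    return "unknown"


if __name__ == "__main__":
    for sample in ("73084452", "aa006725", "7308445"):
        print(sample, classify_key(sample))
```

a check of this kind would sit naturally at input time, before a key is used for ordering or inventory reporting, so that a mistyped key is rejected rather than matched against the wrong title.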
the selection of the lc card order number for this application was based on a suspicion that the bulk of materials in the collection were already assigned a number, a suspicion which was confirmed on completion of conversion through simple reporting of the keys on file.

[fig. 2. a book catalog page featuring four columns (adult catalog '71 cumulative supplement, author-title section).]
in short, after fifty years of operation of orange county's libraries, 92 percent of all titles in the collection had an "lc number," a factor one might weigh when trying to decide between isbn and lc card order number; nor has it been indicated that isbn's will be developed retrospectively. an update to the system in the paper presented to the american society for information science in 1969 ( 6), neither the book catalog nor the circulation control modules had been implemented. book catalog in may 1971, the first edition of biblios book catalog was released for public use. since that date, the cumulative supplement has been run six times. the module of biblios producing the book catalog and cumulative supplement is diagrammed in figure 7. input is the title-master file (the system's bibliographic data base) and a specification of the output required. the output options available to the library include the production of either a full catalog or a cumulative supplement (displaying all entries placed on file since production of the full catalog which have been edited by cataloging). in the case of full catalog production, the title-master file is updated to reflect the use of all qualifying entries for catalog production and the date of their use. this updating facilitates cumulative supplement production by precluding the use of these entries from display until the next full catalog run. in addition to the type catalog (full or supplement), the library designates the format of the output. either an off-line print-out or a print file designed to drive a mechanical photocomposition device, or both, can be requested. it is important to note that this print file is designed specifically to be hardware independent, e.g., it will run on rca, photon, alphanumeric, or comparable equipment with equal ease. hardware independence in its simplest terms means the computer program does not have to be rewritten each time a vendor goes out of business. 
and, coincidentally, this print file is in the sequence it is to be displayed in. in short, the vendor only performs that processing necessary to make his device set type to the library's specification for layout, font style, and font size, a specification, it might be added, which calls for upper- and lower-case type from a file in upper-case only. this approach differs from what has become typical of book catalog production in that sorting, file maintenance, and all related processing are sustained by the library through biblios. the vendor only sets type, prints, and binds. the results spell savings since a potentially error-laden file does not have to be committed to the most expensive of all displays, photocomposition, before corrections can be made.

[fig. 3. all outstanding titles are reported in the monthly cumulative on-order list.]

[fig. 6. before producing a sub-p.o., the pre-vend list is checked for o.p. materials, among other things.]

[fig. 9. the maintenance of manual shelflists is obviated by a biblios-produced holdings list for each branch.]

[fig. 10. biblios circulation control subsystem.]

cards, cassette, or mini-reels). ideally, the elusive transactor should be able to "read" a label on the book as well as a patron card. kimball labels, "sunburst" tags, magnetically coded swatches and the like have worked and continue to work in the retail trade; there is no reason why they shouldn't work for libraries.
the only deterrent seems to be the reticence of their manufacturers to enter an unknown market where, following the melcher axiom, they are met with a "stubborn, 'show me' attitude when automation is proposed." (8)
the products designed into the circulation control module include: weed lists, patron "black lists," circulation profiles (graphically displaying patron use of each branch's collection), and automatic duplicate ordering. reports measure circulation from a manager's viewpoint, but not to the exclusion of such bread-and-butter products as overdue notices, registration lists, and related statistical recapitulations.
a word about documentation
for each program in each subsystem of biblios, forty unique programs in all, there is a formal package consisting of:
1. a program specification detailing the inputs, processing, outputs, idiosyncrasies, and edits of that program;
2. a listing of the cobol program itself;
3. an operations binder (notebook) section for set-up and run procedures;
4. a user's guide section relating requirements and diagnostics to the librarians using the program, including typical problems; and
5. assorted total system binders (notebooks).
while some might think "overkill," in automation this is not the case. the biblios system has yet to fail a scheduled commitment. further, it is suspected that the mere discipline of documentation caused many serious reconsiderations of program and procedural logic, at the time and on the spot, with the result that biblios is a reliable system requiring no major rework and continuing to respond to the library's functional requirements for over two years at this writing.
a word about development costs
both developmental and operational costs for biblios are known and documented. specifically, the costs to procure such a system are broken out in table 1, where each subsystem is examined in terms of the dollars it represents and the assorted tasks required to bring it into being.
the totals represent all costs over approximately a three-year period beginning with rough specifications and yielding the first book catalog. it must be noted that final program specifications and coding were performed for orange county by a contractor. this approach was chosen, since a good job done on time was wanted. that the approach was valid is evidenced by the achievement of a successful system on schedule and within budget.
table 1. biblios development costs (including full conversion and publication of first book catalog).
                                                            bibliographic/  book catalog/
                                                       marc      inventory  locator guide   acquisitions    circulation          total
contractor
  program specifications & coding                   $16,686        $54,299        $25,800        $72,305        $91,000       $260,090
orange co. public library
  analyst                                             3,360          7,840          2,240         14,560          7,000         35,000
  coordination                                        1,225          7,679            818          5,310          5,670         20,702
  implementation (k.p., machine time, etc.)           4,772         12,263          4,635          7,879         10,110         39,659
  conversion/outside services                           800         53,500         41,370                                       95,670
  subtotal                                           10,157         81,282         49,063         27,749         22,780        191,031
total                                               $26,843       $135,581        $74,863       $100,054       $113,780       $451,121
this approach reflects a contention that librarians can specify their requirements if they "have a mind to," and that a contracted programming staff can satisfactorily perform to predetermined standards and timeframes if properly directed. in direct contrast to this approach are the incredible schedules developed when requirements are not specified (and frozen), and the suspected monumental costs hidden in lost staff time due to extended parallel operations or simply waiting until "they" get the "... thing" to run right.
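the row and column totals in table 1 can be cross-checked with a few lines of arithmetic. the sketch below is only an illustration: the short column labels are shorthand invented here, and the placement of the three conversion/outside services amounts in the first three columns is an inference from the printed subtotals, not something the table states outright.

```python
# development costs from table 1, in dollars, one value per subsystem column.
# the column labels are shorthand chosen for this sketch, not the table's own.
COLUMNS = ["marc", "bibliographic", "book_catalog", "acquisitions", "circulation"]

# contractor: program specifications and coding
CONTRACTOR = [16_686, 54_299, 25_800, 72_305, 91_000]

# orange county public library components
LIBRARY = {
    "analyst": [3_360, 7_840, 2_240, 14_560, 7_000],
    "coordination": [1_225, 7_679, 818, 5_310, 5_670],
    "implementation": [4_772, 12_263, 4_635, 7_879, 10_110],
    # assumption: the three printed amounts belong to the first three columns
    "conversion_outside_services": [800, 53_500, 41_370, 0, 0],
}


def column_sums(rows):
    """add up a collection of equal-length rows column by column."""
    return [sum(values) for values in zip(*rows)]


library_subtotal = column_sums(LIBRARY.values())
grand_total = [c + s for c, s in zip(CONTRACTOR, library_subtotal)]

for name, sub, total in zip(COLUMNS, library_subtotal, grand_total):
    print(f"{name:14s} library subtotal {sub:>8,}  total {total:>8,}")
print(f"all subsystems: {sum(grand_total):,}")
```

under that assumption the computed subtotals and totals agree with every figure printed in the table, including the overall $451,121.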
the remaining cost components, briefly, reflect direct library analyst time, the cost of coordination meetings, direct key punch and machine time for programs, their test, debug, string test, and systems test, and, for the bibliographic and book catalog subsystems, conversion and catalog print file generation. the conversion/outside services include a marc subscription, the creation and use of a group of nine typists to optically scan the library's files to convert them to machine readable form (including error correction), and the contracted services of a photoreproduction house to mechanically compose, print, bind, and deliver 500 sets of the book catalog and 100 sets of the locator guide. these are the costs of setting the system up, staff training, and creating a single operational display: the book catalog.
a word about operating costs
early in 1965, as a prelude to implementing a book acquisition program, a time/cost study was performed to determine how much it cost the library to order a book (one title). this study detailed and costed the typing, sorting, assignment of vendors, and the reduction of a diversity of paper requisite to creating a purchase order. excluding the cost of the purchase order form itself, the direct manual cost for this process was $1.56 per title, using a clerical rate of $2.10 per hour.
in the intervening years three things have happened: first, clerical rates have increased to $2.79 per hour, which when applied to the unit cost of the 1965 acquisitions study means a direct outlay of $2.07 per title (as against the previous $1.56). second, the number of branches has increased, which implies that, if the manual system of 1965 could cope with the increased load, it would have required more people and therefore an increase in indirect costs, not to mention the probability of less efficiency due to increased direct costs. third, orange county has automated this function (as well as others).
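the $2.07 figure follows from scaling the 1965 unit cost by the ratio of the two clerical rates. a minimal sketch of that arithmetic is given here, assuming (as the text implies) that the $1.56 is entirely clerical labor; the function and variable names are illustrative only.

```python
from decimal import Decimal, ROUND_HALF_UP


def scale_unit_cost(old_cost, old_rate, new_rate):
    """scale a labor-only unit cost by the ratio of hourly rates, to the cent."""
    scaled = old_cost * new_rate / old_rate
    return scaled.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


cost_1965 = Decimal("1.56")   # direct manual cost per title, 1965 study
rate_1965 = Decimal("2.10")   # clerical rate per hour in 1965
rate_now = Decimal("2.79")    # clerical rate per hour at the time of writing

cost_now = scale_unit_cost(cost_1965, rate_1965, rate_now)

# clerical time per title implied by the 1965 figures, in minutes
minutes_per_title = (cost_1965 / rate_1965 * 60).quantize(Decimal("0.1"))

print(f"manual cost per title at the new rate: ${cost_now}")
print(f"implied clerical time per title: {minutes_per_title} minutes")
```

the same figures imply roughly three quarters of an hour of clerical time per title ordered under the manual system.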
since orange county is wont to track costs, it so happens that the cost for creating a purchase order (subpurchase order under the new system) is available. specifically, orange county knows computer and peripheral costs and the exact time for processing from actual billings over the past two years. the reduction of these data to a per-unit-handled equivalent, while detailed, is not difficult. thus, it is possible to deduce the machine costs equivalent to those for the earlier manual effort: creating a purchase order for one title, including the purchase order form, now costs $1.89. similar economies can readily be documented, as can the increases in service to our patrons at no increase in staff. the operating costs for those biblios subsystems in regular use are given in table 2. only two entries on this table are not self-explanatory.
table 2. typical processing costs for one title in orange county public library's biblios system.
subsystem                       run cost   average items   cost/entry   supplies          services                   total
                                           per period
marc (weekly)                   $325.16    115 [4]         2.83         0.13                                         $2.96
bibliographic [1]               $300.40    1,000           0.30                           0.02 (convelope, opscan)   $0.32
inventory                       $201.21    8,100           0.025        0.028                                        $0.053
acquisitions [2]: order         $1244.94   700             $1.78        $.05 (sub p.o.)   .06 (opscan)               $1.89
acquisitions [2]: receive [3]   $238.55    700             $0.34                                                     $0.34
book catalog: b.c.              $238.00    4000            0.059                          0.041 (comp/print)         $.10
book catalog: inventory         $26.00                     0.0006                         0.0028 (comp/print)        $0.0034
example: cost of entry from initial input to display in book catalog (including convelope; excluding marc source: $2.77).
[1] 40% bibliographic.
[2] 60% bibliographic.
[3] includes invoice, vendor, and budget displays.
[4] if all new entries to system came from marc.
marc
marc, which is indicated as processed weekly, has not been run for over a year. the explanation is simple economics.
it costs $0.32 to manually place a bibliographic description on file (excluding the time spent to circle an entry in publishers weekly (pw)) vs. $2.96 to process the same entry from marc. this cost for marc includes the subscription cost prorated to selected entries, the translation and format of all marc entries, the automatic release of those entries of limited value to a public library, the cumulation of entries which may be of value, the extract and transfer of those entries selected, and the reporting via indices and full listings for the contents of the cumulated file. the unit cost is the actual processing cost for marc ii files for one year divided by the number of titles processed through the rest of biblios during the same period. this cost does not include corrections to selected marc entries (invariably in the call number and author fields for consistency with the library's existing files). the costs affiliated with processing corrective input closely resemble those for bibliographic, e.g., $0.32 each.
prorated bibliographic input
biblios works on pre-cataloged entries. the 60 percent bibliographic input shown under acquisitions relates to the full initial description for a title being entered by a book selector to effect its order and subsequent reporting; the 40 percent shown under bibliographic is for cataloger input to adjust the entry for title-page accuracy, consistency with existing files, and, for nonfiction, the assignment of call numbers and subject headings. it is important to note that for reorders against a title already in the system, no bibliographic input is required. in the case of reorders, the per title cost is $0.88 including subpurchase order forms.
references
1. john c. kountz, "cost comparison of computer versus manual catalog maintenance," journal of library automation 1:159-77 (spring 1968).
2. daniel melcher, melcher on acquisition (chicago: american library association, 1971), p. 135.
3. john c. kountz and robert norton, "biblios-a modular approach to total library adp," proceedings of asis 6:39-50 (1969).
4. john c. kountz and robert e. norton, "biblios-a modular system for library automation," datamation 16:79-83 (feb. 1970).
5. orange county public library presently has twenty-six branches, three bookmobiles, and plans for at least three more branches and an additional bookmobile in the near future.
6. kountz and norton, "biblios-a modular approach."
7. the device affiliated with the book depends on the transactor. the only requirements are that it mechanically represent the key for the book, be practically indestructible, and that it can be prepared mechanically. this last consideration is an absolute when there are 800,000 volumes to convert.
8. melcher, melcher on acquisition, p. 135.
information technology and libraries | march 2007
many things happen on the national front that affect libraries and their use of technology. legislative action, national policy, and standards development are all arenas in which ala and lita both take an active role. lita has articulated in its strategic plan the need to pursue active involvement in providing its expertise on national issues and standards development. lita achieves these important objectives in a variety of ways.
lita has several committees, interest groups, and representatives to ala standing committees that address legislation, regulation, and national policy issues that pertain to technology. the charge of the lita legislation and regulations committee reads: "the legislation and regulation committee monitors legislative and regulatory developments in the areas of information and communications technologies; identifies relevant issues affecting libraries and assists in developing appropriate strategies for responding to these issues." as its educational mission, the committee publicizes issues and strategies on the lita web site.
the chairperson of this committee serves as the lita representative to the ala legislation assembly, which advises ala on positions to take regarding legislative and regulatory action. lita also has a representative to the ala office of information technology policy advisory committee who works closely with the legislation and regulation committee on it policy issues that may cross over into the legislative realm. lita also appoints a representative to the ala intellectual freedom committee, whose purpose is "to recommend such steps as may be necessary to safeguard the rights of library users, libraries, and librarians, in accordance with the first amendment to the united states constitution and the library bill of rights."
much has happened on the national front in the past few years that provides plenty of work for these lita and ala committees. the patriot act, calea, net neutrality, dopa, ada compliance, and debates over copyright and intellectual property rights in an electronic world are all examples of issues that require technological control or affect systems and network solutions. they also touch at the heart of what librarians have always stood for: protection of intellectual property, personal privacy, and intellectual freedom. library technologists exert enormous time and effort protecting the privacy of patron records through data retention policies, system controls, and strong authentication systems, all while providing authorized access to intellectual property according to copyright or licensing restrictions. keeping lita members apprised of all of these issues and the technologies required to abide by legal requirements is an enormous task of the committees and interest groups. these groups do this through programming, publications, and postings to the lita web site.
lita has always been very active on the standards development front. from the start, lita was involved with the marc standards through the hard work of henriette avram.
the number of standards that affect libraries has mushroomed. there are standards for all aspects of technology: data formats, hardware and firmware, and networking. ala regularly calls on lita to provide expertise on developing standards that pertain to library technology. lita has a standards interest group and shares membership with alcts and rusa on the marbi committee. most lita interest groups deal with standards of some sort at least occasionally.
the lita board felt that lita's work on developing standards was so important that in 2006 a new standards coordinator position was created and diane hillman, cornell university, was appointed as the first person in this role. the standards coordinator identifies lita experts to assist in calls for review of developing standards and seeks input from the membership. the standards coordinator works closely with the standards interest group to help educate the membership. because of the nature of digital information, networks, and the standards that enable the distribution of digital information and services, it has become impossible for any one person to understand all the standards that affect the library technologist. as standards proliferate, it becomes more important for lita to provide educational opportunities alongside the involvement in the development of these standards that so impact our daily lives.
the lita web site provides a wealth of information about standards. a new means of contributing to the dialogue about developing standards is to participate in the lita wiki, where diane hillman will be leading the way in posting information about various library technology standards. also, a great place to learn about various standards is right here in ital. practically every issue has at least one article about one standard or another.
lita's participation in technological developments on the national front is critical to all libraries.
policy, regulation, and standards form the infrastructure to technological implementation and are the cornerstone to library technology. lita is the place where you can learn more about these developments and participate in the dialogue about them.
bonnie postlethwaite (postlethwaiteb@umkc.edu) is lita president 2006/2007 and associate dean of libraries, university of missouri–kansas city.
president's column bonnie postlethwaite
the ad hoc discussion group on serials data bases: its history, current position, and future
richard anable: coordinator, york university libraries, toronto, ontario, canada.
history
the ad hoc discussion group on serials data bases was formed as a result of an informal meeting held during the american library association's conference in las vegas on june 26, 1973. those in attendance were primarily interested in the generation and maintenance of machine-readable union files of serials. (this author's involvement in that meeting and the later activities of the group stems from a contract between the national library of canada and york university concerning an investigation of the problems associated with machine-readable serials files.)
it was intended to be a relatively small and informal meeting of about ten individuals. the meeting was by no means closed, but it was not widely advertised. however, twenty-five individuals representing twenty institutions on the national (both the united states and canada), regional, and local levels attended. at the meeting there was a great deal of concern expressed about:
1. the lack of communication among the generators of machine-readable serials files.
2. the incompatibility of format and/or bibliographic data among existing files.
3. the apparent confusion about the existing and proposed bibliographic description and format "standards."
journal of library automation vol. 6/4 december 1973
there was general agreement that something should and could be done about these problems, and that the formation of a group specifically concerned with the generation and maintenance of machine-readable serials data bases would at least improve the communications aspect of the overall problem. (poor communication was thought by some to be at the root of the other problems.) it was also suggested that such a group could lay the groundwork for solving some of the compatibility problems, by presenting proposals on various aspects of the overall problem. these proposals might be used as guidelines for any new projects or revisions of existing ones. it was felt very strongly that the time factor was crucial if the efforts of such a group were to be useful, particularly to several of the institutions represented at the meeting.
there was also a concern that the activities of the group should not parallel or duplicate any work already being undertaken by other groups. while various ala committees were dealing with some aspects of the overall problem, no one committee seemed to be addressing its entire scope. the association of research libraries was conducting a study of the existing serials data bases held by their member institutions, but was not currently addressing the overall problem, particularly with regard to the union list activities. it was suggested that direct communication with that committee be established.
the net result of this first meeting was that the discussion group was formed and several meetings were scheduled. cynthia pugsley from the university of waterloo libraries, jay cunningham from the university-wide library automation program, university of california, and this writer were requested to prepare a position paper outlining the need for such a group. in july, the minutes of the june 26 meeting and the position paper were distributed.
in the meantime a steering committee was arbitrarily selected. the council on library resources agreed to fund a meeting of that committee to be held september 21 at york university in toronto. the steering committee was made up of representatives from the council on library resources (clr), northwestern university, the canadian union catalogue taskgroup and its subgroup on the serials union list, the state university of new york (suny), the university of california university-wide library automation program (ulap), the association of research libraries (arl), the joint committee on the union list of serials (jculs), ohio college library center (oclc), the national serials data program (nsdp), the library of congress (lc), the national library of canada (nlc), universite laval, international serials data system (isds)/canada, and an observer from the british library.
the purpose of the meeting was:
1. to establish a mechanism for creating a set of "agreed-upon practices for converting and communicating machine-readable serials data."
2. to establish a mechanism for cooperatively converting a comprehensive retrospective bibliographic data base of serials.
to further these ends, the following subcommittees were established:
1. holding statement notation
2. working communications conventions
3. authority files
4. cooperative conversion mechanism
the steering committee recognized the need for swift action on the development of "agreed-upon practices." consequently, this job was delegated to the holding statement notation and the working communications conventions subcommittees. it deferred action on the question of a cooperative conversion effort until a report was received from that subcommittee at the next steering committee meeting, scheduled for october 22, 1973, during the american society for information science meeting in los angeles.
on october 10, three of the four subcommittees met for very brief sessions at the library of congress. the most significant results came from the cooperative conversion subcommittee, which recommended that: (1) a proposal for a cooperative project be prepared as soon as possible; and (2) the conversion vehicle for such a project be the oclc facilities. at the next steering committee meeting these recommendations were accepted and the coordinator was asked to prepare a draft of a proposal for review by the cooperative conversion subcommittee. at this time the proposal is being prepared.
the question of the need for formal affiliation with one or more of the existing professional organizations had repeatedly been raised at the various meetings. it was initially decided to inform the appropriate organizations of our existence and intentions, and to cooperate whenever and wherever our activities overlapped. when the group decided to prepare a proposal for a cooperative conversion project, the need for such affiliation increased dramatically. at the october 22 meeting, the association of research libraries indicated a positive interest in our exploring that possibility further with them. they asked for a more detailed definition of our goals and plans, which is also being prepared.
generally the reaction of the group toward some kind of organizational arrangement with arl, if assurance could be made regarding the participation of non-arl institutions, was favorable. another question that lingers is whether it would be advisable to have a formal dual affiliation with both arl and a second professional organization. at this point the question is still open.
current position
thus far the activities of the group have addressed the problems of:
1. the improvement of communications among institutions engaged in the generation or maintenance of serials data bases.
2. the establishment of a set of "agreed-upon practices."
3. the investigation of future means of cooperative or coordinated serials record conversion of retrospective titles.
the reasons for these efforts are obvious. we are currently all spending much time and money on noncooperative and uncoordinated local and regional conversion, and few of us are satisfied with the results. through improved communications among conversion efforts, we hope to establish a set of "agreed-upon practices" which should increase the interfiling compatibility. this in turn should reduce the cost to each institution. the use of a common centralized data collection vehicle will minimize redundant conversion.
the problems associated with the generation and maintenance of union files of serials have multiplied in the last decade with the introduction of the anglo-american cataloguing rules (aacr), establishment of the international serials data system (isds), the presentation of the international standard bibliographic description for serials (isbds) proposal, the distribution of the library of congress marc serials records, and the increasing role played by the indexing and abstracting services as points of access to serials lists of all types. individually our institutions cannot comprehensively attack all of these aspects. if attacked independently, there is little chance of similarity of approach; if attacked jointly, through the establishment of a set of "agreed-upon practices," similarity will be greater. if attacked jointly through a cooperative conversion effort, the resulting file will be equally usable to all the participants.
it is the primary objective of the cooperative conversion project to establish a relatively comprehensive bibliographic data base of serials titles within a time frame which would eliminate the necessity for redundant and costly conversion efforts on the local and regional levels.
the prime use of the resulting data base is intended to be the support of union list of serials activities. the secondary objectives are:
1. to assist the national libraries of both countries (canada and the united states) in the establishment of a computer-maintained (and hopefully remotely accessible) serials data system. this will be accomplished partly by the very existence of the resulting data base, and partly by the experience gained in its establishment.
2. to assist in the definition of the roles of the regional or resource centers in such enterprises.
3. to provide a source data base for use within the international serials data system, and to seek the active participation of the canadian and united states national centers.
the intention of the cooperative conversion project is to establish a comprehensive data base of serials titles in such a way as to accommodate the past, present, and future standards for format, description, and identification, when they can be identified. it is not the intention of this group to establish any new standards in any area.
the proposed record structure will be a composite record complying with the iso 2709 format standard on level one (structure), and will attempt to reconcile the minor conflicts among the international serials data system's guidelines, the national serials data program internal format, the library of congress' marc-s format, and the draft of the canadian marc serials format on levels two and three (content designators and content). the problems here appear to be technical in nature and by no means insurmountable. thus far the communication among the participants (including representatives from three of the four areas) has been most encouraging.
the record will be based on a minimum set of data elements established to provide enough data to support the union list functions. however, this is a minimum and not a maximum set.
it is basically a convention below which a record will be considered incomplete and above which it will be considered acceptable. there probably will be two additional categories of data elements besides those that are required: (1) required if readily available; and (2) not required by the system, but acceptable. "required if readily available" covers those situations where complete bibliographic data are available at the time of conversion, such as where library of congress data are present. if the data are there, it is cheaper to convert at that point than at a later date. for this category and the "not required but acceptable" category, a set of agreed-upon practices will be in effect to ensure that if a data element is converted it will be consistent in content with similarly tagged fields.
since at the time of this writing the proposed working communication conventions have not been finalized, it is not possible for the reader to judge whether the minimum set of data elements will meet his local or regional requirements. at this stage it appears that the set will probably include over thirty elements and will have as a subset the isds data element requirements.
the conversion project is intended not to compete with any existing or planned programs at either the library of congress or the national library of canada. in fact, it is intended to complement activities in which these two institutions might be engaged. the distribution of the lc marc-s records, and the similar proposed service by the national library of canada, deal primarily with new titles or title changes, and not with the conversion of retrospective titles. while it is the stated intent of the nsdp to attack this area (retrospective titles), thus far it has not been funded to do so. in fact the active involvement of both national libraries and their isds centers is anticipated since their contribution would be inappropriate to duplicate.
it is intended that the resulting data base be made available to the isds international center and thus the rest of the international library community.
while the direct participation in the conversion effort may well be limited to a manageable number of institutions, this should not deter any institution from direct involvement in the deliberations of the group. what is requested is that the prospective participants have a serious interest in the solving of problems within the short time frame allowed.
future
to repeat, the basic goals of the group are:
1. to improve the communication among the generators of serials data bases.
2. to establish a set of "agreed-upon practices."
3. to establish a mechanism for the cooperative conversion of a comprehensive serials data base.
the first goal will be an on-going effort, probably carried through by one or more existing organizations. the second goal, we hope to have partially completed by the time ala meets in chicago in january 1974, through the presentation to the steering committee of the reports of the various subcommittees. the third goal will be accomplished in stages. by ala midwinter we hope to have a concrete proposal that can be presented to all prospective participants, to funding agencies, and to the library community as a whole. since time is one of the items to be optimized, we feel that we should have the project launched no later than the end of the second quarter of 1974.
the basic approach being proposed in this document can be characterized in the following ways:
1. a limited number of large institutions (5-15), or centers representing large institutions, will use a single on-line data collection facility, such as the ohio college library center (oclc), to convert their retrospective serials files.
2. one or more large bibliographic files will be used as a base file (possibly the minnesota union list of serials file) to which new records or fields can be added.
3. the conversion requirement will be based on:
(a) the building of a composite record incorporating the aacr, isds, and proposed isbd(s) requirements.
(b) minimum set of data elements basically for union list of serials purposes.
(c) the concept of an expandable record able to incorporate: (1) variant entry approaches, and (2) available (but not required) data elements.
such an approach is a series of compromises, the first of which deals with the trade-off between time and cost. one argument which has been offered against the concept of collecting an "incomplete" serials record is that the total cost to the library community in the long run will be greater than if a complete record conversion were to be done initially. this argument is a carryover from the similar discussion concerning monographs. however, we must recognize the following:
1. serials records are of a dynamic nature; what is true for a title this year probably will not be true next year. the more comprehensive the data element set, the more true this becomes.
2. the cost curve dramatically increases as the number and type of data elements increase.
3. the increased time required to collect, edit, and control an exhaustive data element set will seriously protract the time frame of such an effort.
4. the time frame is one of the prime targets for optimization. any massive additional data collection requirement will compromise that goal.
there are no conclusive studies in the area of serials conversions which suggest that the "total" record conversion approach would be less expensive in the long run than a base record conversion approach. we do not know what the total conversion effort is now, but it is guessed to be significant.
the utilization of most of the existing files is primarily for catalogs or for location services, which the proposed minimum data set will accommodate. only a limited number of institutions have experimented with serials check-in or other functions requiring more complete records.

the building of a composite record incorporating the various bibliographic standards is easily justified. such a record must be accessible via past, current, future, and popular practices. the ability to incorporate alternative applications of the same standards is also important, particularly in those cases where the rule is open to interpretation. this is very important if there is no centralized authority to control the application of a specific standard. the ability to convert additional data elements which are readily available but not required is also an important capability, since it will produce a more complete record at a reasonable cost.

keeping the number of contributing institutions relatively small simplifies the control aspects of the project. using a central on-line (remotely accessible) system such as oclc reduces the amount of software development required and reduces the degree of redundant conversions. it also will enable us to start conversion in a time frame otherwise impossible. the use of at least one large bibliographic file such as muls decreases the amount of original conversion, thus shortening the total time frame. the use of multiple starting data bases increases the matching requirements of similar records among the data bases but further reduces the original effort. the problem of selecting data bases is being studied.

summary

i have attempted to define in this article the history, the current position, and the future plans of the ad hoc discussion group on serials data bases. we have tried to include in the deliberations of the group as many of the interested parties as possible.
omissions do exist, not by intent, but because of a lack of complete information and poor communication. i have tried to act with speed because of the need expressed by the participants. the group is not a closed shop. any institution seriously desiring to make contributions is welcome. please consider this an open invitation. any documentation desired is readily available from me, as the coordinator. the willingness of the participants to cooperate and to make compromises has thus far exceeded all expectations, particularly in those areas where problems were expected. it has truly been a group effort. i would like to especially thank the national library of canada, the library of congress, and the national serials data program for the cooperation they have given to the regional organizations which have been the backbone of this effort.
26 information technology and libraries | june 2008

preparing locally encoded electronic finding aid inventories for union environments: a publishing model for encoded archival description

plato l. smith ii

this paper will briefly discuss encoded archival description (ead) finding aids, the workflow and process involved in encoding finding aids using the ead metadata standard, our institution’s current publishing model for ead finding aids, current ead metadata enhancement, and new developments in our publishing model for ead finding aids at florida state university libraries. for brevity and within the scope of this paper, fsu libraries will be referred to as fsu, electronic ead finding aids and/or archival finding aids will be referred to as ead or eads, and locally encoded electronic ead finding aid inventories will be referred to as eads @ fsu.

■ what is an ead finding aid?

many scholars, researchers, and learning and scholarly communities are unaware of the existence of rare, historic, and scholarly primary source materials such as inventories, registers, indexes, archival documents, papers, and manuscripts located within institutions’ collections/holdings, particularly special collections and archives. a finding aid (a document providing information on the scope, contents, and locations of collections/holdings) serves as both an information provider and a guide for scholars, researchers, and learning and scholarly communities, directing them to the exact locations of rare, historic, and scholarly primary source materials within institutions’ collections/holdings, particularly noncirculating and rare materials.
the development of the finding aid led to the institution of an encoding and markup language that was software/hardware independent, flexible, and extensible, and that allowed online presentation on the world wide web. in order to provide logical structure, content presentation, and hierarchical navigation, as well as to facilitate internet access to finding aids, the university of california–berkeley library in 1993 initiated a cooperative project that would later give rise to the nonproprietary, sgml-based, xml-compliant, machine-readable markup standard for encoding finding aids, the encoded archival description (ead) document type definition (dtd) (loc, 2006a). thus, an ead finding aid is a finding aid that has been encoded using encoded archival description and which should be validated against an ead dtd. the ead xml that produces the ead finding aid via an extensible stylesheet language (xsl) style sheet should be checked for well-formedness with an xml validator (e.g., xml spy or oxygen) to ensure proper nesting of ead metadata elements. “the ead document type definition (dtd) is a standard for encoding archival finding aids using extensible markup language (xml)” (loc, 2006c). an ead finding aid includes descriptive and generic elements along with attribute tags to provide descriptive information about the finding aid itself, such as title, compiler, and compilation date, and about the archival material, such as collection, record group, series, or container list.

florida state university libraries has been creating locally encoded electronic encoded archival description (ead) finding aids using a note tab light text editor template and locally developed xsl style sheets to generate multiple ead manifestations in html, pdf, and xml formats online for over two years.
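the well-formedness check described above can be sketched with the python standard library. this is a minimal sketch, not the xml spy workflow used at fsu: the sample documents and the function name `is_well_formed` are hypothetical, and the check covers element nesting only. validation against the ead dtd is a separate step that the standard-library parser does not perform.

```python
import xml.etree.ElementTree as ET
from typing import Tuple


def is_well_formed(xml_text: str) -> Tuple[bool, str]:
    """Return (True, "") if xml_text parses, else (False, parser message).

    This checks well-formedness (proper nesting, closed tags) only.
    It does NOT validate against the EAD DTD.
    """
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return False, str(err)
    return True, ""


# Hypothetical, heavily trimmed EAD-like document used only for illustration.
GOOD = (
    "<ead><eadheader><eadid>ftasu2003004.xml</eadid></eadheader>"
    '<archdesc level="collection"><did><unittitle>sample papers</unittitle>'
    "</did></archdesc></ead>"
)

# Same document with two closing tags swapped, so elements overlap.
BAD = GOOD.replace("</unittitle></did>", "</did></unittitle>")

if __name__ == "__main__":
    print(is_well_formed(GOOD))
    print(is_well_formed(BAD))
```

a check of this kind catches nesting mistakes early, before the file is handed to a style sheet; a dtd-aware validator is still needed to confirm that the elements used are legal ead.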
the formal ead encoding descriptions and guidelines are developed with strict adherence to the best practice guidelines for the implementation of ead version 2002 in florida institutions (fcla, 2006), the manuscript processing reference manual (altman & nemmers, 2006), and ead version 2002. an ead note tab light template is used to encode finding aids down to the collection level and create ead xml files. the ead xml files are transformed through xsl style sheets to create ead finding aids for select special collections.

■ ead workflow, processes, and publishing model

the certified archivist and staff in special collections and a graduate assistant in the digital library center encode finding aids in the ead metadata standard using an ead clip and ead template library in the note tab light text editor, entering data for the various descriptive, administrative, and generic elements and attribute metadata element tags to generate ead xml files. the ead xml files are then checked for validity and well-formedness using xml spy 2006. currently, ead finding aids are encoded down to the folder level, but recent florida heritage project 2005–2006 grant funding has allowed selected special collections finding aids to be encoded down to the item level. currently, we use two xsl style sheets, ead2html.xsl and ead2pdf.xsl, to generate the html and pdf formats, and simply display the raw xml, so that ead finding aids are rendered as html, pdf, and xml and these manifestations are presented to researchers and end users. the ead2html.xsl style sheet used to generate the html versions was developed with specifications such as use of the fsu seal, color, and display, with input from the special collections department head.

plato l. smith ii (psmithii@fsu.edu) is digital initiatives librarian at florida state university libraries, tallahassee.

the ead2pdf.xsl style sheet used to generate pdf versions uses xsl-fo (formatting
object), and was also developed with specifications for layout and design input from the special collections department head. the html versions are generated using xml spy home edition with built-in xslt, and the pdf versions are generated using apache formatting object processor (fop) software from the command line. ead finding aids, eads @ fsu, are available in html, pdf, and xml formats (see figure 1). the style sheets used, the ead authoring software, and the original eads @ fsu site are available via www.lib.fsu.edu/dlmc/dlc/findingaids.

■ enriching ead metadata

as ead standards and developments in the archival community advance, we had to find a way of enriching our ead metadata to prepare our locally encoded ead finding aids for future union catalog searching and opac access. the first step toward enriching the metadata of our ead finding aids was to run the rlg ead report card (oclc, 2008) on one of our ead finding aids. the test displayed the missing required (req), mandatory (m), mandatory if applicable (ma), recommended (rec), and optional (opt) metadata elements and encoding analogs (relatedencoding and encodinganalog attributes) (see figure 2). the second test involved consulting the online archive of california best practice guidelines (oac bpg), specifically appendix b (cdl, 2005, ¶ 2), to create a formal public identifier (fpi) for our ead finding aids and make the ead fpis compliant with describing archives: a content standard (dacs). this second test resulted in the creation of our very first dacs-compliant ead formal public identifier. example: ftasu2003004.xml

the rlg ead report card and appendix b of oac bpg together helped us modify our ead finding aid encoding template and workflow to enrich the ead document identifier metadata tag element, include missing mandatory ead metadata elements, and develop fpis for all of our ead finding aids.
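the kind of check a report card performs can be illustrated with a short sketch. this is not the rlg ead report card itself: the list of required element paths below is a hypothetical subset chosen for illustration, the function name `report_card` is an assumption, and the sketch assumes the dtd flavor of ead 2002, which uses no xml namespace.

```python
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical subset of element paths to require; the real report card
# checks many more elements and distinguishes req/m/ma/rec/opt levels.
REQUIRED_PATHS = [
    "eadheader/eadid",
    "eadheader/filedesc/titlestmt/titleproper",
    "archdesc/did/unittitle",
]

# Segments on which a relatedencoding attribute is expected.
ANALOG_SEGMENTS = ["eadheader", "archdesc"]


def report_card(xml_text: str) -> List[str]:
    """List missing or empty elements and missing relatedencoding attributes."""
    root = ET.fromstring(xml_text)
    problems = []
    for path in REQUIRED_PATHS:
        node = root.find(path)
        if node is None or not (node.text or "").strip():
            problems.append(f"missing element: {path}")
    for segment in ANALOG_SEGMENTS:
        node = root.find(segment)
        if node is None or "relatedencoding" not in node.attrib:
            problems.append(f"missing relatedencoding on <{segment}>")
    return problems


# Hypothetical sample: <unittitle> is empty and <archdesc> lacks relatedencoding.
SAMPLE = """<ead>
  <eadheader relatedencoding="dc">
    <eadid>ftasu2003004.xml</eadid>
    <filedesc><titlestmt><titleproper>sample finding aid</titleproper></titlestmt></filedesc>
  </eadheader>
  <archdesc level="collection">
    <did><unittitle></unittitle></did>
  </archdesc>
</ead>"""

if __name__ == "__main__":
    for problem in report_card(SAMPLE):
        print(problem)
```

running a check like this against the encoding template, rather than against each finished finding aid, is one way to fix omissions once at the source.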
prior to recent new developments in the publishing model of ead finding aids at fsu libraries, the ead finding aids in our eads @ fsu inventories could not be easily found using traditional web search engines, were part of the so-called “deep web” (prom & habing, 2002), and were “unidimensional in that they [were] based upon the assumption that there [was] an object in a library and there [was] a descriptive surrogate for that object, the cataloging record” (hensen, 1999). ead finding aids in our eads @ fsu inventories did not have a descriptive surrogate catalog record and lacked the relevant related encoding and analog metadata elements within the ead metadata with which to facilitate “metadata crosswalks,” that is, mapping one metadata standard to another metadata standard to facilitate cross-searching. “to make the metadata in ead instance as robust as possible, and to allow for crosswalks to other encoding schemes, we mandate the inclusion of the relatedencoding and encodinganalog attributes in both the <eadheader> and <archdesc> segments” (meissner, et al., 2002). incorporating an ead quality checking tool such as rlg bpg and an ead compliance standard such as dacs when authoring eads will assist in improving ead encoding and the ead finding aids publishing model.

figure 1. ead finding aids in html, pdf, and xml format

figure 2. rlg ead report card of xml ead file

■ some key issues with creating and managing ead finding aids

one of the major issues with creating and managing ead finding aids is the set of rules used for describing papers, manuscripts, and archival documents. the former set of rules used for providing consistent descriptions and anglo-american cataloging rules (aacr) bibliographic catalog compliance for papers, manuscripts, and archival documents down to collection level was archives, personal papers, and manuscripts (appm), which was compiled by steven l. hensen and published by the library of congress in 1983.
however, the need for more description granularity down to the item level, enhanced bibliographic catalog specificity, marc and ead metadata standards implementations and metadata standards crosswalks, and inclusion of descriptors of archival material types beyond personal papers and manuscripts prompted the development of describing archives: a content standard (dacs), published in 2004 with the second edition published in 2007. “dacs [u.s. implementation of international standard for the description of archival materials and their creators] is an output-neutral set of rules for describing archives, personal papers, and manuscripts collections, and can be applied to all material types” (pearce-moses, 2005). some international standards for describing archival materials are general international standard archival description [isad(g)] and international standard archival authority record for corporate bodies, persons, and families [isaar(cpf)].

other issues with creating and managing ead finding aids include (list not exhaustive):
1. online presentation of finding aids
2. exposing finding aids electronically for searching
3. provision of a search interface to search finding aids
4. online public access catalog record (marc) and link to finding aids
5. finding aids linked to digitized content of collections

eads @ fsu exist in html for online presentation, pdf for printing, and xml for exporting, which allows researchers greater flexibility and options in the information-gathering and research processes and has improved the way archivists communicate guides to archival collections to researchers, as opposed to paper finding aids physically housed within institutions. eads @ fsu existed online in html, pdf, and xml formats for two years in a static html document and then moved to drupal (mysql database with php) for about one year, which improved online maintenance but not researcher functionality.
however, the purchase and upgrade of a digital content management system marked a huge advancement in the development of our ead finding aids implementation and thus resolved issues 1–3. researchers now have a single-point search interface to search eads @ fsu across all our digital collections/institutional repository (see figure 3); the ability to search within the finding aids via full-text indexing of pdfs; the option of brief (thumbnails with ead, html, pdf, and xml manifestation icons), table (title, creator, and identifier), and full (complete ead finding aid dc record with manifestations) views of search results, which provide different levels of exposure of ead finding aids; and the ability to save/e-mail search results.

figure 3. online search gui for ead finding aids and digital collections within ir

future initiatives are underway to enhance the eads @ fsu implementation: creating ead marc records through a dublin core to marc metadata crosswalk, deep linking to ead finding aids via the 856 field in marc records, and beginning to digitize archival content and link it to ead finding aids via the digital archival object (<dao>) ead element. <dao> is a “linking element that uses the attributes entityref or href to connect the finding aid information to electronic representations of the described materials. the <dao> and <daogrp> elements allow the content of an archival collection or record group to be incorporated in the finding aid” (loc, 2006b). we have opted to create basic dublin core records of ead finding aids based on the information in the ead finding aids’ descriptive summary (front matter) first and then crosswalk to marc, but are cognizant that this current workflow is subject to change in the pursuit of advancement.
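the dublin core to marc step described above can be sketched as a simple lookup table. this is a minimal sketch under stated assumptions, not the fsu or digitool workflow: the mapping loosely follows the library of congress crosswalk for unqualified dublin core (title to 245, creator to 720, description to 520), placing a url identifier in 856 $u is one common choice rather than the only one, and the record, the url, and the function name `crosswalk` are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumed mapping: Dublin Core element -> (MARC tag, subfield code).
# Elements not listed here are skipped rather than guessed at.
DC_TO_MARC: Dict[str, Tuple[str, str]] = {
    "title": ("245", "a"),
    "creator": ("720", "a"),
    "description": ("520", "a"),
    "identifier": ("856", "u"),  # URL of the finding aid, for deep linking
}


def crosswalk(dc_record: Dict[str, List[str]]) -> List[Tuple[str, str, str]]:
    """Map a simple Dublin Core record to (tag, subfield, value) triples."""
    fields = []
    for element, values in dc_record.items():
        if element not in DC_TO_MARC:
            continue  # unmapped element: left for a cataloger to handle
        tag, code = DC_TO_MARC[element]
        for value in values:
            fields.append((tag, code, value))
    return sorted(fields)  # order by MARC tag for readability


# Hypothetical record built from a finding aid's descriptive summary.
RECORD = {
    "title": ["sample papers, 1900-1950"],
    "creator": ["sample, a."],
    "identifier": ["http://example.org/findingaids/sample.xml"],
    "format": ["text/xml"],
}

if __name__ == "__main__":
    for tag, code, value in crosswalk(RECORD):
        print(f"{tag} ${code} {value}")
```

keeping the mapping in one table makes it easy to revise with cataloging staff, which matters because indicator values and the choice between 720 and 100/700 for creators are local decisions this sketch does not address.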
however, we are seeking ways to improve the ead workflow and ead marc record creation through more communication and future collaboration with the fsu libraries cataloging department.

■ number of finding aids and percent of eads @ fsu

as of february 16, 2006, we had 700 collections with finding aids, of which 220 finding aids were electronic and encoded in html (31 percent of total finding aids). of the 220 electronic finding aids, 60 were available as html, pdf, and xml finding aids (about 27 percent of electronic finding aids were eads @ fsu). however, we currently have 63 ead finding aids available online in html, pdf, and xml formats.

■ new developments in publishing eads @ fsu

current eads @ fsu include the recommendations from test 1 and test 2 (rlg bpg and dacs compliance), which were discussed earlier. in addition, the digital content management system (digitool) creates a descriptive digital surrogate of the ead objects in the form of brief and basic dublin core metadata records for each ead finding aid along with multiple ead manifestations (see figure 4). we have successfully built and launched our first new digital collection, fsu special collections ead inventories, in digitool 3.0 as part of the fsu libraries dlc digital repository (http://digitool3.lib.fsu.edu/r/), a relational database digital content management system (dcms). digitool has an oracle 9i relational database management system backend, a searchable web-based gui, and a default ead style sheet that allows full-text searching of eads. it supports the marc, dc, and mets metadata standards and jpeg2000 (built-in tools for images and thumbnails), as well as the z39.50 and oai protocols, which will enable resource discovery and exposing of eads @ fsu. you can visit fsu special collections ead finding aids inventories at http://digitool3.lib.fsu.edu/r/?func=collections-result&collection_id=1076.

■ national, international, and regional aggregation of finding aids initiatives

rlg’s archivegrid (http://archivegrid.org/web/index.
jsp) is an international, cross-institutional search constituting the aggregation of primary source archival materials of more than 2,500 research libraries, museums, and archives with a single-point interface to search archival collections from across research institutions. other international, cross-institutional searches of aggregated archival collections are:
■ intute: arts & humanities in the united kingdom, www.intute.ac.uk/artsandhumanities/cgi-bin/browse.pl?id=200025 (international guide to subcategories of archival materials)
■ archives made easy, www.archivesmadeeasy.org (guide to archives by country)

there are also some regional initiatives, which provide cross-institutional search of aggregations of finding aids:
■ publication of archival library and museum materials (palmm), http://palmm.fcla.edu (cross-institutional searches in florida; fsu participates)
■ virginia heritage: guides to manuscript and archival collections in virginia, http://ead.lib.virginia.edu/vivaead/ (cross-institutional searches in virginia)
■ texas archival resources online, www.lib.utexas.edu/taro/ (cross-institutional searches in texas)
■ online archive of new mexico, http://elibrary.unm.edu/oanm/ (cross-institutional searches in new mexico)

figure 4. ead finding aids in ead (default), html, pdf, and xml manifestations

awareness of regional, national, and international aggregation of finding aids initiatives and engagement in regional aggregation of finding aids will enable a consistent advancement in the development and implementation of eads @ fsu.

acknowledgments

fsu libraries digital library center and special collections department, florida heritage project funding (fcla), chuck f. thomas (fcla), and robert mcdonald (sdsc) assisted in the development, implementation, and success of eads at fsu.

references

altman, b. & nemmers, j. (2006). manuscripts processing reference manual. florida state university special collections.
california digital library (cdl). (2005). oac best practice guidelines for encoded archival description, appendix b. formal public identifiers for finding aids. retrieved october 6, 2006 from www.cdlib.org/inside/diglib/guidelines/bpgead/ bpgead_app.html#d0e2995. digital library center, florida state university libraries. (2006). fsu special collections ead finding aids inventories. retrieved january 5, 2007 from http://digitool3.lib.fsu.edu/ r/?func=collections-result&collection_id=1076. florida center of library automation (fcla). (2004). palmm: publication of archival library and museum materials, archival collections. retrieved january 7, 2007 from http://palmm.fcla .edu. florida center for library automation (fcla). (2006). best practice guidelines for the implementaton of ead version 2002 in florida institutions. (john nemmers, ed.). accessed april 21, 2008, at www.fcla.edu/dlini/openingarchives/new/ floridaeadguidelines.pdf fox, m. (2003). the ead cookbook — 2002 edition.chicago: the society of american archivists. retrieved october 6, 2006 from www.archivists.org/saagroups/ead/ead2002cookbook .html. hensen, s. l. (1999). nistf ii and ead: the evolution of archival description. encoded archival description: context, theory, and case studies (pp. 23–34). chicago: the society of american archivsits library of congress (loc). (2006a). development of the encoded archival description dtd. retrieved october 6, 2006 from www.loc.gov/ead/eaddev.html. library of congress (loc). (2006b). digital archival object— encoded archival description tag library—version 2002. retrieved january 8, 2007 from www.loc.gov/ead/tglib. library of congress (loc). (2006c). encoded archival description —version 2002 official site. etd dtd version 2002. retrieved april 19, 2008 from www.loc.gov/ead/ead2002a.html. meissner, d., kinney, g., lacy, m., nelson, n., proffitt, m., rinehart, r., ruddy, d., stockling, b., webb, m., & young, t. (2002). 
rlg best practices guidelines for encoded archival description (pp. 1-24). mountain view: rlg. retrieved january 5, 2007 from www.rlg.org/en/pdfs/bpg.pdf.
national library of australia. (1999). use of encoded archival description (ead) for manuscript collection. retrieved january 4, 2007 from www.nla.gov.au/initiatives/ead/eadintro.html.
oclc. (2007). archivegrid - open the door to history. retrieved january 4, 2007 from http://archivegrid.org/web.
oclc. (2008). ead report card. retrieved april 11, 2008 from www.oclc.org/programs/ourwork/past/ead/reportcard.htm.
pearce-moses, r. (2005). a glossary of archival and records terminology. chicago: society of american archivists. retrieved january 8, 2007 from www.archivists.org/glossary/index.asp.
prom, c. j. & habing, t. g. (2002). using the open archives initiative protocols with ead. paper presented at the international conference on digital libraries, proceedings of the 2nd acm/ieee-cs joint conference on digital libraries, portland, oregon, usa, july 14-18, 2002. retrieved october 6, 2006 from http://portal.acm.org/citation.cfm?doid=544220.544255.
reese, t. (2005). building lite-weight ead repositories. paper presented at the international conference on digital libraries, proceedings of the 5th acm/ieee-cs joint conference on digital libraries. new york: acm. retrieved january 5, 2007 from http://doi.acm.org/10.1145/1065385.1065498.
special collections department, university of virginia. (2004). virginia heritage guides to manuscripts and archival collections in virginia. retrieved january 7, 2007 from http://ead.lib.virginia.edu/vivaead/.
thomas, c., et al. (2006). best practices guidelines for the implementation of ead version 2002 in florida institutions. florida state university special collections.
university of texas libraries, university of texas at austin. (unknown). texas archival resources online (taro). retrieved january 4, 2007 from www.lib.utexas.edu/taro.
factors affecting university library website design | kim 99

yong-mi kim

factors affecting university library website design

factors include usability testing and institutional forces.5 because website design studies are sparse, this study also examines studies of successful technology utilization to identify further factors that are pertinent to website design and thereby provide a comprehensive view of web design success factors.

a review of the literature related to university library website design is offered in the next section. the research methods section, which discusses the data collection strategies and the measurements used in the current study, follows the literature review. the findings of the study are then reported and discussed. the paper concludes with an overview of the implications the findings have for academia and managers.

■■ literature review

this section offers an overview of the existing website design literature and relevant success factors. these factors include institutional forces, supervisors’ technical knowledge and support, input from secondary sources, and input from users. because existing studies identify these elements as independent variables, this study also adopts them as such. following existing studies, website success factors are identified from the utilitarian perspective.6 the dependent variables are (1) the extent to which website designers meet users’ needs, (2) the extent to which users perceive ulwr to be useful, and (3) their actual usage. in this manner, success is evaluated from different perspectives. the independent and dependent variables appear in the conceptual model in figure 1.

institutional forces

institutional forces refer to organizations following other organizations’ practices to secure efficiency and legitimacy.
existing studies have identified three institutional forces: coercive, mimetic, and normative.7 coercive force takes place when an organization pressures others to adopt a certain practice. it is higher when an organization is a subset of another organization. in this research context, the university could be an agent of coercive force. mimetic force refers to organizations following other organizations’ practices, and it is especially common for organizations within the same industry group.8 because organizations within

existing studies have extensively explored factors that affect users’ intentions to use university library website resources (ulwr); yet little attention has been given to factors affecting university library website design. this paper investigates factors that affect university library website design and assesses the success of the university library website from both designers’ and users’ perspectives. the findings show that when planning a website, university web designers consider university guidelines, review other websites, and consult with experts and other divisions within the library; however, resources and training for the design process are lacking. while website designers assess their websites as highly successful, user evaluations are somewhat lower. accordingly, use is low, and users rely heavily on commercial websites. suggestions for enhancing the usage of ulwr are provided.

from a utilitarian perspective, a website evaluation is based on users’ assessments of the website’s instrumental benefits.1 if a website helps users complete their tasks, they are likely to use the website.
following this line of reasoning, dominant research has reported that users are most likely to use university library website resources (ulwr) when those resources help with their tasks.2 although we now know that the utilitarian perspective should be applied to web design, it is not clear to what extent web designers consider users’ needs and, likewise, to what extent users consider ulwr to be successful in meeting their needs. it is also unclear what factors other than user needs influence university library website design. this is not a trivial issue because university libraries have invested massive resources into providing web services and need to justify their investments to stakeholders (such as the university) by demonstrating their ability to meet users’ needs.3 identifying these factors is also important because web design and website performance are closely correlated.4 as a consequence, investigating factors that influence successful university library website design and providing managerial guidance is a timely pursuit. thus, the objectives of this paper are twofold:

1. what factors influence university library website design?
2. to what extent do website designers and users consider the university library website to be successful?

to explore these research questions, this study identifies factors influencing university library website design that have been reported in existing literature. these

yong-mi kim (yongmi@ou.edu) is assistant professor, school of library and information studies, university of oklahoma, tulsa, oklahoma.

100 information technology and libraries | september 2011

although it is a critical factor for website success, there is little evidence that website designers receive strong support from their supervisors.
research shows that supervisors’ lack of knowledge about websites inhibits user-centered website design.17 a respondent from chen et al.’s study reports, “it’s really a pain trying to connect with our administration on the topic of web design and usability, because even definitions are completely out the window” and “the dean and the associate directors know little about the need for usability and view it as a last minute check-off, so they can say that the web site is tested and usable.”18 lack of supervisor support inhibits website usability.19 input from secondary sources website designers typically aggregate information from secondary sources rather than from users. identified secondary sources are consultations with experts, other divisions within the library, webmasters, web committees, and focus groups.20 the most widely used method is consultation with experts.21 experts uncover technical flaws and any obvious usability problems with a design,22 facilitate focus groups,23 and create new information architecture.24 because they are experts, however, their ways of thinking may not be the same as users.’25 research shows that 43 percent of the problems found by expert evaluators were actually false alarms and that 21 percent of users’ problems were missed by those evaluators. 
if this analysis is true, expert evaluators tend to miss and incorrectly identify more problems than they correctly identify;26 consequently, expert testing should not substitute for user testing.27 another problem with secondary sources is that web committees “are ignorant about integrating design with usability and focus on their own agenda.”28 nonetheless, because of the lack of available resources to conduct more rigorous usability tests and the difficulty of collecting information directly from users, secondary sources such as expert evaluations are commonly used.29 input from users user input provides a great advantage for directly finding out users’ needs and integrating a user-centered design during the development stage.30 often, information from secondary sources makes assumptions about users’ needs.31 to discover users’ genuine needs, designers can conduct a regular user survey and/or seek out users’ input.32 by surveying users’ needs, one can overcome criticism such as, “most websites are created with assumptions of more expert knowledge than the users may actually possess,” and can address users’ needs more effectively.33 discovering users’ needs goes beyond usability testing because information obtained directly the same industry face similar problems or issues, mimetic decisions can reduce uncertainty and secure legitimacy.9 in this context, website designers may analyze and emulate other universities’ websites to claim that their websites are congruent with successful websites, thereby justifying their managerial practices. normative force is associated with professionalism.10 normative force occurs when the norms (e.g., equity, democracy, etc.) of the professional community are integrated into organizational decision-making. in a library setting, website designers may follow a set of value systems or go to conferences to discover ways to better deliver services. 
there is evidence that website designers follow other organizations.11 this phenomenon is known as isomorphism. the appearance and the structure of websites show isomorphic patterns when an organization follows examples of other organizations’ websites or conforms to institutional pressures.12 another study reports coercive forces in the design of university library websites; the parent institution exercises power over library website design by providing guidelines, and thus the design is not independent.13

supervisors’ technical knowledge and support

supervisors’ knowledge of and support for technology have long been recognized in the literature as one of the most important technology success factors.14 if supervisors are knowledgeable about technology, they are likely to support and provide resources for training.15 supervisors’ technical knowledge also serves as a signal for the importance of the utilization of technology within the organization; consequently, employees actively look for ways to utilize technology and vigorously adopt technology.16

figure 1. conceptual model for website design success

march and may 2009. a total of 315 responses were collected (139 males and 176 females; 148 undergraduates, 101 master’s, and 66 doctoral/faculty; business 152, human relations 51, psychology 43, engineering 41, education 20, other 8). because detailed discussion of the user side of this sample appears elsewhere,36 it is not repeated here. because little research has been done in this area, the questionnaire and its measurements were created based on literature relating to the successful deployment of technology, but they were modified to fit into the website design context.
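the user-side sample breakdowns quoted above can be checked for internal consistency with a few lines of arithmetic. the following is a minimal sketch, not part of the study: the counts are taken from the text, while the dictionary layout, the `shares` helper, and the variable names are illustrative assumptions.

```python
# User-side sample breakdowns as reported in the text (n = 315).
# The data structure and helper names are illustrative, not from the study.
TOTAL_RESPONSES = 315

breakdowns = {
    "gender": {"male": 139, "female": 176},
    "status": {"undergraduate": 148, "master's": 101, "doctoral/faculty": 66},
    "discipline": {
        "business": 152,
        "human relations": 51,
        "psychology": 43,
        "engineering": 41,
        "education": 20,
        "other": 8,
    },
}


def shares(counts):
    """Return each category's share of its group total, in percent (one decimal)."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


for group, counts in breakdowns.items():
    # Every breakdown should account for all 315 respondents.
    assert sum(counts.values()) == TOTAL_RESPONSES, group
    print(group, shares(counts))
```

each of the three breakdowns sums to the reported total of 315, so the categories within each breakdown are exhaustive and non-overlapping as reported.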
because of this modification, the finalized instrument was pretested and pilot tested before use in this study.37 the institutional forces are measured in three categories: coercive isomorphism (i.e., following the university guidelines regarding website creation), mimetic isomorphism (i.e., investigating other university websites and investigating commercial websites), and normative isomorphism (i.e., attending conferences). following existing studies, supervisors’ knowledge and support are assessed by the web designer in two areas: the extent to which a supervisor is knowledgeable about technology and aware of the importance of technology. the supervisor’s support for the website is measured by asking web designers about the extent to which their supervisors allocated resources and offered training. input from secondary sources is measured by asking the extent to which website designers consult sources such as experts, other divisions, webmasters, and web committees. input from users is measured by the extent to which web designers collect information from website users. finally, website success is measured in two categories: assessments made by the web designers and assessments made by the website users themselves. the finalized measurements and their sources appear in table 1.

■■ report of findings

this section reports the empirical findings of each category discussed in the previous section. figure 2 shows institutional forces that influence university library website design. the first category is coercive force, the second is mimetic force, and the third is normative force.
it is clear that the majority of university library web designers (75 percent) comply with the guidelines given by the university, which is a measure of coercive force. designers also investigate other universities’ websites (75 percent) and commercial websites (59 percent), which is a measure of mimetic force. however, designers do not appear to actively attend conferences that influence website design, which is a measure of normative force.

from users will reveal what users want and what should be done to meet their needs, thereby enhancing ulwr usage. however, research shows that this aspect is not actively integrated into web design due to the lack of support from supervisors.34

website success

success can be measured according to the website’s purpose: to what extent does the website meet users’ needs? in the university library website context, following a utilitarian perspective, researchers measured success by the degree to which ulwr are integrated into users’ tasks and by the frequency of users’ visits to the website.35 these two measurements, when combined with the designers’ perceptions of success, allow one to measure both the users’ and the designers’ perspectives of website success. measuring from both sides means that any discrepancy between the two success outcomes can prompt designers to adjust their viewpoints and align their success measures with users’.

■■ research methods

this section discusses the sampling strategies and the measurements for the independent and the dependent variables. because one of the contributions of this study is to compare users’ and designers’ perceptions of website success, the samples are drawn from two groups: university library website designers and university library users. for the designer side, data are collected directly from university library website designers; thus, libraries without website designers within the library are excluded.
the designer sample is identified from the publicly available yahoo academic library list (http://dir.yahoo.com/reference/libraries). the list contains 448 academic libraries, including those outside the united states. the research assistant telephoned the libraries located in the united states and verified whether a website designer worked within the library; 86 academic libraries had one. if a library had a website designer, the research assistant contacted the person and invited him or her to participate in the study. because of difficulties contacting website designers, the research assistant was able to collect only 16 responses between may 2009 and february 2010. once the research assistant identified the unreachable designers, the researcher e-mailed those designers between january and april of 2010 and added 30 more responses to the dataset, which resulted in a total of 46 responses (a 54 percent response rate). for the user side, a survey questionnaire was sent to faculty, doctoral, master’s, and undergraduate students between

the second group of factors that affects website design is supervisors’ knowledge about technology and support for the utilization of technology (see figure 3). web designers have a somewhat mixed perception of their supervisors’ technical knowledge. more specifically, 37 percent of respondents reported that their supervisors do not have good knowledge about technology; 23 percent reported that their supervisors were somewhat knowledgeable about technology; and 40 percent reported that their supervisors have good knowledge about technology. web designers reported that their supervisors’ perceptions of the importance of technology and websites are higher than their technical knowledge.
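the designer-side response rate described in the research methods can be recomputed from the counts given there: 16 responses gathered by the research assistant, 30 more gathered by e-mail, and 86 u.s. libraries with a website designer. the sketch below is only an illustrative check; the function and variable names are assumptions, not part of the study.

```python
def response_rate(responses: int, invited: int) -> float:
    """Return the response rate in percent for a given number of invitees."""
    if invited <= 0:
        raise ValueError("invited must be a positive count")
    return 100 * responses / invited


# Counts as reported in the text; the names are hypothetical.
phone_responses = 16          # collected by the research assistant
email_responses = 30          # added after the researcher's e-mail follow-up
libraries_with_designer = 86  # U.S. libraries verified to have a website designer

total_responses = phone_responses + email_responses
rate = response_rate(total_responses, libraries_with_designer)
print(f"{total_responses} responses, {rate:.1f} percent response rate")
```

the two collection rounds add up to the 46 responses stated in the text, and 46 of 86 works out to roughly 53.5 percent, close to the 54 percent reported.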
approximately 60 percent of designers responded that their supervisors emphasize the importance of technology and websites, and the remaining respondents answered that their supervisors are somewhat aware of the importance or do not value it at all.

table 1. instrument

| construct | operationalization | source |
| --- | --- | --- |
| institutional forces | following university guidelines regarding website creations; investigating other university websites; investigating commercial websites; attending conferences | 11, 12, 15 |
| supervisor’s technical knowledge and support | supervisor’s knowledge about technology; supervisor’s evaluation of the importance of technology; supervisor’s evaluation of the importance of website utilization; availability of website tools; availability of budgeting; availability of technical training; availability of website creation training | 17, 22 |
| input from secondary sources | consulting with experts; consulting with other divisions within the library; consulting with webmasters; consulting with website committee; consulting with focus group | 10, 25–26 |
| input from users | conducting user survey; utilizing users’ inputs | 10 |
| website success measures from web designer | we meet users’ needs; we provide better services via the website; we satisfy users’ needs; we provide quality services; our library is overall successful | 1, 2 |
| website success measures from website users | it lets me finish my project more quickly; it helps improve my productivity; it helps enhance the quality of my project; the extent to which users integrate website library resources into users’ tasks*; frequency of users’ visits to university library website** | 3, 41, 43 |

all items are measured with a likert scale: 1: not really; 2: somewhat; and 3: greatly.
* measured by percentage
** measured by frequency

figure 2.
institutional forces

figure 4 shows the extent to which supervisors support web designers. fifty-five percent of respondents reported that they have good web creation tools; 44 percent responded that they have enough budget for website creation, and a nearly equal share (39 percent) reported that they do not have adequate budgets for website creation. the last two questions, which concern training, show somewhat different results from the first two. the majority of web designers do not get technology-related or website creation-related training: less than one-third of respondents reported that they receive enough technology-related and web creation-related training.

the findings on the use of secondary sources, shown in figure 5, indicate that web designers actively leverage such information sources for web design. by category, over 80 percent of respondents reported that they consult with web experts; over 70 percent responded that they integrate input from other divisions; and around 70 percent consult with webmasters. the utilization of secondary information sources for website creation is very high except for focus groups. the most widely used technique in this category is expert consultation, followed by consultation with other divisions within the library. web designers also consider input from webmasters and web committees.

figure 6 shows the extent to which website designers apply input directly derived from web users. around half of respondents reported that they obtain information from user surveys, and around 70 percent responded that they consider users’ input collected via comments, feedback, and complaints.

figure 3. supervisor’s knowledge about technology
figure 4. supervisor’s support
figure 5. input from secondary sources
figure 6.
input from users

majority of users rely on commercial web resources for their academic tasks.

■■ discussion

based on the study’s findings, this discussion covers the most influential factors first, followed by the least influential factors in designing a university library website. first, the most influential factors for website designers are expert opinions and consultations with other divisions within the library. these may be the most important factors because relying on experts allows designers to discover users’ needs while saving costs. web designers also consider input from webmasters and web committees. coercive and mimetic forces are also highly significant factors affecting web designers. the university library is a subset of the university, and thus, designers may need to align themselves with university policy. also, designers can claim legitimacy by imitating other successful university websites, thereby securing necessary resources and support for website creation; however, web designers are much less likely to imitate commercial websites. this finding is consistent with existing reports that organizations imitate the managerial practices of other successful organizations within the same industry category.38 the least influential website creation factors are supervisors’ knowledge, which in turn leads to low budget allocations, and web designers’ technical training. this finding is consistent with the literature on successful technology deployment, which shows that supervisors’ technical knowledge is highly correlated with budget allocations.39 the lack of training for web designers does not appear to have improved since the last study, which was conducted in 2001;40 library

■■ website success

website success is evaluated from two sides: designer opinion and user opinion. overall, designers evaluated their websites to be highly successful.
they believe that they meet users’ needs, provide better services via the web, satisfy users’ needs, and provide quality services. thus, their evaluation of their websites is extremely positive, as reported in figure 7.

figure 8 shows users’ perceptions of the usefulness of ulwr. users generally agree that ulwr are useful for their academic projects. more specifically, 55 percent responded that they are able to finish their tasks quickly because of the resources; 65 percent reported that they could increase their productivity; and 67 percent responded that they enhanced project quality thanks to the resources. on the other hand, a significant portion of respondents (more than 30 percent) either do not think that ulwr are useful for their academic tasks or have no opinion.

figure 9 shows how often users visit university library websites. approximately 30 percent reported that they never or rarely visited the university library website. thirty-two percent visited the website a couple of times a month, and approximately 40 percent visited the library website a couple of times a week or daily.

figure 10 compares users’ utilization of ulwr with their utilization of commercial website resources. the responses from 315 users show that they utilize commercial websites more than ulwr. specifically, 46 percent of respondents reported that they use less than 20 percent of ulwr and only 8 percent utilize ulwr more than 80 percent. in contrast, 14 percent utilize less than 20 percent of commercial website resources, and 22 percent utilize more than 80 percent of commercial website resources. the

figure 7. website success evaluated by design
figure 8. users’ perceptions of website usefulness

from a utilitarian perspective, web designers primarily need to consider the ability of the website to meet users’ needs. usefulness, in turn, needs to be evaluated by users.
according to user assessments ulwr are somewhat satisfactory but not strong enough to rely heavily on for academic projects. it is an alarming fact that users use commercial website resources at a much higher rate than ulwr. this is somewhat disturbing given that web designers strive to provide good services to users, and libraries have invested massive resources into providing online services. this study has implications for academia and practitioners. for academia, there has been sparse research on web design studies from a designer standpoint. it may be because of difficulties in collecting data directly from website designers. from this line of research, this study enhances the understanding of what factors influence university web design. although university websites may be deemed successful, information managers should discover why the majority of users turn to commercial websites for their academic projects. without addressing this problem, the existence of library websites may be compromised. although there is evidence that libraries consider user input, it may not accurately represent all user populations because only extremely satisfied or extremely dissatisfied users tend to provide feedback;43 consequently, a regular survey may facilitate the utilization of ulwr. finally, supervisors’ technical knowledge is found to be low. this problem may be alleviated as time goes on because new generations are more aware of the importance of technology. in the meantime, web designers are encouraged to actively communicate with supervisors about the value of the utilization of technology and seek more financial support. this study’s data have some limitations. although the web designers are usually self-taught rather than formally trained.41 one promising finding, though, is that despite the relatively low technical knowledge held by supervisors, the respondents tend to rank highly when it comes to their perceptions of the importance of technology. 
compared with other institutional forces, normative force is relatively low. this kind of institutional force is higher at the early stage of technology adoption. in other words, the majority of universities have already launched their websites and have established rules and policies, so libraries are already past this stage. also, input from user surveys is relatively low. this may be because it is very costly, and they have other sources to turn to such as other universities’ successful websites. website success evaluations by web designers and users show discrepancies. overall, web designers evaluate their websites to be highly successful, while user ratings offer a different picture. this incongruity is a red flag in terms of ulwr usage. the majority of users report that they turn to commercial websites more than ulwr, and one-third never or rarely visit the university website. the disparity of the success between web designers and users may be attributed to the sources of information that website designers rely on. more specifically, existing studies report that input from experts and website committees is incongruent with what users really want, while feedback from focus groups can assist in understanding users’ needs.42 ■■ conclusions this study investigates the factors that website designers consider when designing university library websites. figure 9. frequency of visits to university library websites figure 10. university library vs. commercial website 106 information technology and libraries | september 2011 seriously in information systems research,” mis quarterly 29, no. 4 (2005): 591–605. 9. scott, institutions and organizations; dimaggio and powell, “the iron cage revisited”; h. haverman, “follow the leader: mimetic isomorphism and entry into new markets,” administrative science quarterly 38, no. 4 (1993): 593–627. 10. scott, institutions and organizations. 11. k. 
author tried to increase responses using various means, the number of responses does not allow one to use a sophisticated analytical technique such as regression. this study includes academic libraries with a web designer within the library; as a consequence, libraries without a web designer are not included. it is recommended to collect data from both groups and compare those with a designer (resource rich) and without a designer (resource poor), and discover underlying patterns of the factors impacting website designs and offer implications for academia and managers.
4 information technology and libraries | march 2007
this study examines how social scientists arrive at and utilize information in the course of their research. conclusions are drawn about the use of information resources and channels to address information inquiry, the strategies for information seeking, and the difficulties encountered in information seeking for academic research in today’s information environment. these findings refine the understanding of the dynamic relationship between information systems and services and their users within social-scientific research practice and provide implications for scholarly information-system development.
the information needs and information-seeking behavior of social scientists have been the focus of inquiry within library and information science (lis) research for decades. folster reviewed the major studies that have been conducted in this area over the past three decades.1 she found that research methods had developed through several stages. research prior to the 1960s usually consisted of questionnaire-based user studies that gathered basic demographic data and quantitative data on the type of information used. following that were citation studies in the mid-1960s, and then the combination of questionnaire and interview techniques to develop profiles of users and their needs in the 1970s. the information environment of the 1980s witnessed a major transition in research design. the former practice of studying large groups via questionnaires or structured interviews gave way to the use of unstructured interviews or observation of smaller groups, resulting in a more holistic picture of social scientists’ research practices. more fully developed techniques for behavioral models emerged in the 1990s.
folster summarized these studies done over decades and concluded that (1) social scientists place a high importance on journals; (2) most of their citation identification comes from journals; (3) informal channels, such as consulting colleagues and attending conferences, are an important source of information; (4) library resources, such as catalogs, indexes, and librarians, are not very heavily utilized; and (5) computerized services are ranked very low in their importance to the research process.
there are many examples of studies about the information-seeking behavior of social scientists. for example, the infross project (investigation into information requirements of the social scientist) studied the information needs of british social scientists in the late 1960s and early 1970s and found that they preferred to use journal citations instead of traditional bibliographic tools, and that they tended to consult with colleagues and subject experts, rather than library catalogs or librarians, in order to locate information.2 other social-scientist studies reinforced the findings of the infross project.3 several studies indicated that computerized literature searching was ranked low as a source of information among social scientists and suggested the promotion of electronic information services by librarians to enhance their roles as information providers.4
in an influential study on social scientists’ information-seeking patterns, ellis developed a behavioral model with six features based on the stages they went through in gathering information:
■ starting: includes activities characteristic of the initial search for information, such as asking colleagues or consulting literature reviews, online catalogs, and indexes and abstracts;
■ chaining: following chains of citations and other forms of referential connection between materials;
■ browsing: semi-directed searching in an area of potential interest, such as scanning published journals, tables of
contents, references, and abstracts;
■ differentiating: using differences (authors or journal hierarchies) between sources as a filter on the nature and quality of the material examined;
■ monitoring: maintaining awareness of developments in an area through the monitoring of particular sources such as core journals, newspapers, conferences, magazines, books, and catalogs; and
■ extracting: systematically working through a particular source to locate material of interest, for example, sets of journals, collections of indexes, abstracts, or bibliographies.5
meho and tibbo revised ellis’s information-seeking model by studying the information-seeking behavior of social-science faculty who study stateless nations.6 they confirmed ellis’s model and derived four additional features: accessing, networking, verifying, and information managing. accessing is getting hold of the materials or sources of information once they have been identified and located. networking includes communicating and maintaining a close relationship with a broad range of people such as friends, colleagues, and intellectuals. verifying is checking the accuracy of the information found, and information managing includes filing, archiving, and organizing the collected information to facilitate research.
yi shen
yi shen (yishen@wisc.edu) is a ph.d. candidate in the school of library and information studies, university of wisconsin-madison. her article is the winner of the 2006 lita/endeavor student writing award.
information seeking in academic research: a study of the sociology faculty at the university of wisconsin-madison
with the exception of ellis’s work in 1987–1990 and the follow-up study by meho and tibbo, studies investigating academic social scientists have been in steady decline since the mid-1970s.7 according to line, in an information world radically changed by the internet, it is essential to carry out new studies of information uses and needs.8 most of the studies discussed in this paper were conducted before the development of the internet. the present study focuses on the information-seeking behavior of social scientists in a new information environment featuring the internet and other dramatic technological advances.
kling and mckim pointed out the growing importance of information technology and the resulting major shifts in scientific practice.9 costa and meadows studied the impact of computer usage on scholarly communication among social scientists and found that major changes in their communication habits were occurring.10 the most significant impacts of information technology were greater interactivity, widened community boundaries, extended access to information, and an increasing democratization of the international research community. they suggested that the developments were influenced by new pressures (social, economic, political) from the research community and the institutional environment, and by newly available resources (infrastructure, services, sources) being introduced into the academic environment by information technology. it could be expected that social scientists’ information-seeking behavior would change within a new social-technical environment.
the purpose of this study is to extend the findings of the previous studies by examining social scientists’ information needs and their activities and perceptions in relation to today’s information systems and services.
this paper provides a theoretical framework for the study, discusses the methods for data collection and data analysis, and summarizes findings. finally, it discusses results, reflects on the theoretical and practical implications that ensue, and notes the limitations imposed by the study design.
■ theoretical framework
the theoretical frame for this study is the idea of “communities of practice.” wenger, mcdermott, and snyder define a community of practice as “a group of people who share a common concern, a set of problems, or a passion about a topic, and who deepen their knowledge and expertise in this area by interacting on an ongoing basis.”11 within communities of practice, people share common values, observe and interact with each other, exchange views and ideas, and contribute to the knowledge-creation process.12
according to wenger, communities of practice are combinations of three elements: a domain of knowledge, which defines the key issues in the community; a community of people who care about the domain; and the shared practice that they create.13 communities of practice are loosely connected, informal, and self-managed. they are about knowledge sharing, and the best way to share knowledge is through social interaction and informal relationship networks. effective communication and mutual understanding are important factors in fostering communities of practice.
this form of social construction is highly situated and highly improvised.14 it essentially suggests that researching something is inseparable from its own historical and social locations of practice and should be carried out in the process of actually doing that thing.15 a process organizes knowledge in a way that is especially useful to practitioners whose shared learning brings value to a community.16 pragmatically, the examination of context-based research processes draws “attention away from abstract knowledge and cranial processes and situates it in the practice and communities in which knowledge takes on significance.”17 what is learned is highly dependent on the context in which the learning takes place, as it is central to the transfer and consumption of information. this requires “looking at the actual practice of work, which consists of a myriad of fine-grained improvisations that are unnoticed in any formal mapping of work tasks.”18 such beliefs are utilized in the present study to approach and explain information-seeking behavior among social scientists.
researchers have used communities of practice in organization and business studies to investigate knowledge sharing and knowledge-creation processes within organizational settings to cultivate the building of knowledge-management systems. researchers have also used this approach in the field of computer-supported cooperative work (cscw) to study the social interactions of groupware systems and community computing and support systems. this study selected communities of practice as the theoretical frame because it has been widely applied in the study of knowledge sharing and has been tested and verified through empirical research. this study represents an exploration of the usefulness of communities of practice for research on information-seeking behavior within a knowledge-intensive scholarly community.
the primary purpose of the present study is to provide empirical evidence on social scientists’ information seeking in scientific research. the main research questions are: (1) how do social scientists make use of different information sources and channels to satisfy their information needs? (2) what strategies do they apply when seeking information for academic research? and (3) what difficulties are encountered in searching for supporting information? information service providers should find the results of this study interesting because identifying users’ perceptions of the information environment provides guidance for information-system development that will closely reflect or accommodate the information-seeking activities of social scientists.
■ methods
the research questions described in the preceding section were investigated in the context of information use in scientific inquiry by faculty in the department of sociology at the university of wisconsin-madison during march and april 2003. the participants were selected from the faculty list on the department web page and then contacted by e-mail to arrange face-to-face interviews. four people were interviewed based on their willingness to participate. three of them are full-time professors and have teaching experience of more than twenty years (one of them has been teaching for more than thirty years). the fourth is an assistant professor with four years of teaching experience. all of the participants are female. each interview lasted from forty-five minutes to an hour.
all participants were interviewed in their campus offices to allow for easy access to supporting materials as examples of how they go about their work.
after explaining her identity and the purpose of the research, and assuring the confidentiality of the interview, the researcher asked initial questions in a relatively structured way to glean background-related information and research context. the second part of the interview dealt with information-related behavior, such as information sources and channels used to address research inquiry, and the major strategies for selecting needed information. the third part focused on problems the participants encountered in information seeking. the researcher took field notes and tape-recorded all interviews.
as a consistency check, the participants were sometimes asked to comment on disciplinary work practices gleaned from other interviews. the selection of four participants reflected the practicalities of collecting data with limited time and resources.
■ findings
based on the idea of communities of practice that what is learned is highly dependent on the context in which the learning takes place because it is central to the transfer and consumption of information, the present study provides a holistic picture of information use situated in actual research practice and academic context among these social scientists.19 these findings can be summarized into several interrelated stages as shown in figure 1. the figure shows that the social scientists’ information seeking moves from academic information needs, through choice of information sources and searching for information, to use of the information. the researchers move back and forth between stages until the information inquiry is satisfied. searching for information involves the implementation of strategies, confrontation of difficulties, and continuous decision making. choice of information channels runs through the whole information-seeking process based on researchers’ momentary or changing information activities and information needs.
this figure is intended to provide a general view of the information-seeking behavior in this specific case, but is not intended to generate a model or pattern of information seeking.
the findings are organized into the use of information resources and channels to address information inquiry, the strategies for information seeking, and the difficulties encountered in searching for information, which together constitute the major information-seeking practice of the participants.
figure 1. stages of the social scientists’ information seeking
■ use of information resources and channels to address information needs
information needs
the respondents reported their research-oriented information needs in the context of their research activities. those information needs can be grouped into seven categories. examples of responses follow.
1. general academic issues and current research discourses in the field. “i find conferences are more useful for seeing what kinds of general things are going on. i guess some of these are research, some are academic politics kinds of things, and what’s happening in the discipline as a whole.” “in conferences, you find out what other people are doing research on. the most current research is not published yet, so you know what’s happening now.”
2. feedback from colleagues on personal research. “the best thing about conferences is that when i present my own research, i get comments about it.” “you show your paper to people and ask them for comments, and they show you their papers and ask you for comments. this is kind of the normal part of academic life.” “i usually send a copy of a paper or something and get actual comments through e-mail.”
3. current research topics and activities of specific authors. “i’ll look for key people, and see what they’ve done. . . .” “knowing who is doing what where. . .
.” “you sort of inevitably talk about your research with other people doing comparable research and find out what they are doing to keep current to what the different research projects are.”
4. existing datasets (existing survey research databases) and statistics for secondary data analysis. “there are online statistical sources that i get to put in the papers.” “i use the internet to download all the . . . data that we analyze. . . .” “i do a lot of data research, so i use government sites on the internet, like the science’s bureau, or the national center for health statistics. we also have a little center for demography and ecology library. i use our in-house databases too.” “in social science, there are many existing sets of data. we have something called data and program library service here. they have all kinds of databases that will tell you where there are data sources that have certain variables in them. . . . so you can go and do your own statistical analysis on those data.”
5. information needed for management purposes, such as the cooperation and coordination of research activities. “in this department, we conduct community business by e-mail. we pass messages around. . . . a decision is usually made through this dialogue.” “i am constantly in interaction with people by e-mail to cooperate on research projects.”
6. community recognition and inspirational support from colleagues. for example, one respondent commented, “in conferences, i feel invigorated when sitting and talking to field colleagues who are interested in my research.
the whole conversation makes me feel excited and inspired.” another respondent indicated, “to see people face-to-face that you respect and they think your work is good, that’s good.” this was echoed by a third respondent: “you just talk about your work, and people act like what you are doing is very interesting, then it makes you more inspired.”
those needs for information constitute a major research practice of the participants and thus determine how they go about seeking information.
■ information resources
supporting information resources could be divided into internally built university resources and external resources. moreover, these internal and external resources could be further subdivided into human resources and nonhuman resources based on their physical forms.
internal, nonhuman resources
the participants identified two major categories of internal nonhuman information resources for academic research based on their intended use. the first of these categories is books and journals that are available in the university libraries for literature reviews and to provide awareness of current research. however, because of physical inconvenience, campus libraries are not often used. one participant indicated, “the library is down the hill, so even before there were lots of good internet resources, i wasn’t going down to the library a lot.” on the other hand, the participants reported that they frequently used the library online public access catalog (opac) to order document delivery from the libraries.
“i find madcat (the library online catalog system) very useful for a whole variety of specific searches for journals, books, and different online information.” another participant remarked, “i can request a book online through the document delivery services.”
another internal nonhuman resource consists of existing survey datasets that are collected by the center for demography and ecology library for secondary data analysis and research. it was indicated that in social science, as more and more survey research databases were available, there was an increasing amount of research conducted on secondary data. the data and program library service provides all kinds of databases informing researchers of the location of data sources and the variables contained in certain datasets.
external, nonhuman resources
the participants identified three types of external nonhuman resources based on their medium. some of these resources are purchased and managed internally by the campus libraries but developed and maintained externally by outside library and information professionals. one type is electronic resources, such as electronic newspapers, external opacs, electronic full-text databases, online statistical reports, survey databases, and government or personal web sites. some named examples include sociological abstracts, lexis-nexis, science bureau’s web site, the national center for health statistics web site, web of science, and online british newspapers. the second type is printed resources, such as books, newspapers and magazines, archives, and newspaper indexes that are available from outside of the campus. named examples include the paper indexes for the new york times and los angeles times back in the 1960s. the third type is audio-video resources, such as radio broadcasts, tapes, videotapes, and television.
one major finding was that the participants depended primarily on electronic information resources.
all looked for information on both literature and research data via the internet. each participant frequently visited her own preferred search engines or opacs for information on specific research topics and general research subjects. examples of responses include:
“i start with internet explorer and go to google.”
“i work a lot online. . . . i just do internet searches.”
“both these journal and newspaper databases, i use a lot for various purposes.”
“i want to find out if there is work on this specific topic or concept. i would almost always start with sociological abstracts.”
“the citation index is terrific for finding contemporary work building on something important.”
moreover, the respondents also conducted research on the internet to study web behavior or social networks on the internet.
“there are more and more people actually doing research on the internet, studying web sites or connections between web sites. . . . they collect data online. . . .”
“in social-movement research, more and more researchers study how people coordinate transactional movements, protest movements, various ethnic movements, and political movements through the internet.”
“online is a big way of doing cooperation as well as doing research. it is one of the reasons that we are interested in studying what kind of connections there are on the electronic network.”
“a current research project that i am doing is looking at network of . . . web sites. so we are gathering primary data from the web sites.”
thus, the electronic mechanism for information systems and services dominates the manner in which the participants carry out their research.
internal, human resources
the faculty participants were not only electronic-information consumers, but also electronic-information producers. for example, one described, “i maintain my own web page, on which i post my research and add links to outside resources that i collected for years.
i have my own gateway to organize the link pages, which can be used for my future reference and by my students. the library links to my web page as well.” moreover, this participant advocated the creation and collection of electronic materials by her colleagues as well. “it’s an evolving process. the more people put their information on the internet, the more useful it is to be on the internet. we are right in that transition.” the department can easily take a step further to build a shared pool of information and information resources in its internal system.
a second type of internal human resources comes from the technical staff who provided announcements of technical developments and product information, as well as technical assistance for social-science research. working as the social science computing cooperative (sscc), the technical staff provides the faculty with detailed instructions and useful tips for creating electronic materials as well as with directions for publishing them. librarians, as a third type of human resource, provided reference services and collected necessary information resources for their academic research.
external, human resources
the external human resources that the respondents gathered and contacted are of two types: people sharing similar research interests and concerns, and people having different fields of interest. the former type was valued for supporting suggestive and creative communication and interaction as well as potential cooperation. for example,
“when it comes to really think[ing] about things, sit down in one place and talk, and then stuff comes out. you don’t even know what you are thinking until you sit down and talk to people. it’s idea generating.”
“knowing who is doing what where in the field is important. . . . i am working on a . . .
research topic, which requires the awareness of other people with similar interest around the world. . . . i cooperate with the scholars from different countries and with different knowledge background.”
the latter type is used for current awareness of research works in other fields and general disciplinary activities and academic trends. for example,
“i need to know people who know what’s going on in other fields, and they tell me what’s going on.”
“i get a lot in terms of contemporary research at conferences, which are useful for things that haven’t been published in journals.”
“[a conference] will generate a lot of interesting interchange.”
“[at conferences], i think about how what other people are doing is related to what i would want to do, or how they can do it differently. a lot of times, i think about whether the methods they are using would be useful for my work at all.”
■ channels
the major information channels through which the participants delivered and exchanged information included e-mail, telephone, face-to-face communication, and project reports or other documents. e-mail was a dominant communication and information-acquisition tool in research. face-to-face or oral communication channels in this case were often used as a supplementary means.
“mostly, e-mail is how i communicate with people, occasionally telephone, but not very often. even with people here and we can walk right next door, mostly we just e-mail each other. it’s nice, because you have a record.”
“i get hundreds of e-mails a week. . . . i live on e-mail. my colleagues know i am easier to reach by e-mail than in person.”
“[face-to-face] it’s just the more personal and emotional mode [of communication] . . .
you can see the person’s expression, and figure out what they are really thinking.” e­mail communication helped accomplish several scientists’ tasks, including quick exchange of timely infor­ mation, teamwork coordination, non­work­oriented mes­ sage exchange, field discussion, field information seeking and finding communities of interest. for example, one participant indicated the coordination of community activities through e­mail. in this department, we conduct community business by e­mail. community members rarely meet face to face. the chairperson finds out what the research task is, and sends out messages. people exchange opinions through e­mail messages. and a decision is usually made through this dialogue, instead of talking face to face. when scholars are going to have a face­to­face meet­ ing, they deliver the data, records, and reports before­ hand, and share their initial viewpoints with supporting information through e­mail. the following factors affected a scholar’s choice of channels for information delivery and exchange: the char­ acteristics of the information receiver, the characteristics of the information, the task or purpose of delivering or sharing information, and the immediacy of response. for example, one respondent mentioned that she usu­ ally delivered data, records, and research documents via e­mail for formal announcement and record keeping by the receivers. when there was no stress of immediate response, she preferred e­mail communication for the thoughtful input and feedback allowed by the asynchro­ nous­exchange feature of e­mail. “intellectual questions are more easily handled by e­mail because i have the time to think about it and formulate my responses.” she con­ tinued, “i usually e­mail a copy of my paper to colleagues for detailed feedback.” in another case, a participant indicated, “some of us are well aware that e­mail is archived, it’s not anonymous and not private. 
if you are concerned about something and want to say something that you don’t want to have an e-mail record of, you may want to go to talk to somebody about it, instead of writing it in an e-mail.” another participant explained that because of her research topics, she usually adopted the face-to-face method of communication and attended all kinds of international academic conferences. in other circumstances, when collecting opinions for resolving certain questions, she chose to use e-mail.
■ strategies for information seeking
the participants indicated certain strategies applied to gathering information and tracking resources to address their information needs. those strategies with response examples are:
1. extracting abstracts: “i use abstracts to get the parameters of what’s happening and then know more narrowly where to focus.”
2. tracking citations: “the citation index is terrific for finding contemporary work” that builds on previous major work in a subject area.
3. restricting the search to a limited set of sources or types of sources to achieve satisfactory results within an acceptable timeline.
4. constantly filtering and interpreting the search results by referring to the summary description of web sites: “in most searches that i do, the first ten hits are book dealers. i don’t bother with them. i go to the next page and try. . . . i look at the summary of what the site is and try to figure out what the worthy things are.”
5. avoiding search terms prone to commercial information: “when searching for something without a lot of commercial stuff, you are more likely to get what you want on the top.”
6.
setting the default for the number of search results with consideration of information completeness, information usefulness, importance of research, and timeliness: for example, one participant stated, “i usually set my least default to a hundred cita­ tions. five hundred is too many, but it depends on what you’re looking for, how much you care about your findings, how much faith you have for the existence of useful information. if you think it’s not worth a minute of your time, you just forget it. but if you are sure it’s there, you just have to keep looking for and work[ing] harder at it.” as shown in the findings, the participants employed certain criteria for evaluation of the information they gathered. those judgment criteria were: importance of research, usefulness, accuracy, completeness, and timeli­ ness. the results imply that to accomplish the research tasks on hand in a fast­paced and distributed digital­ information environment, the practicalities of time and human effort have come into play in the ways in which the participants sought information. ■ difficulties in seeking supporting information the problems encountered by the participants when col­ lecting information through various resources were iden­ tified and are grouped into categories, including: ■ information is scattered in different places and with different qualities; it is difficult to have a complete and valuable picture of a research phenomenon. the participants described this difficulty as “how tricky computerized search is.” ■ there is too much information on the internet to filter, and the current search techniques and ranking tools are not intelligent enough to capture the most relevant information of interest. the participants described trying alternative search strategies as a “game­playing” and “brainstorming” process. ■ no sources of information or mechanisms assist in the identification of people with similar research interests and their activities in the broad virtual space. 
for example, one participant described: i am trying to find what’s in public debate on con­ troversial topics. and it’s very common to have trouble finding both sides of the debate. i started with diffuse searches on the internet trying to see if i can find the potential academic community and tag into their debate. i basically searched on [the search term] on the whole internet because i had no idea where it would be, who got involved, and how it was formed. when doing [the research] issue, it’s easy to find the people in favor of [a topic], but difficult to find anybody who was an opponent. eventually, i got hundreds of hits [search results], and i had to wade through a lot of proponents to find the opponents. sometimes, it’s an issue to find [an] ethnic minority perspective of a topic. ■ technology upgrades and system integration arouse another concern. as one participant expressed it, “technology is changing [so] fast that lots of com­ puter files from the 1970s are no longer readable. the danger of an information system lies in the tradeoff between the accessibility provided by digitization and the long­term survival of intellectual proper­ ties.” ■ there are no digital sources of information for some historical documents and no retrievable data­ bases for book chapters. one participant noted, “the online strategy is very good for really current stuff, but not for older stuff. the people who started the . . . research were actually writing before the online revolution, so they are not turning up so much in keyword searches online.” another participant also mentioned the inconvenience of using hard­copy indexes for newspapers from the 1960s and archival data that go back to the 1970s and 1980s. ■ discussion this study shows how the ‘communities of practice’ perspective situates the process of using information in the actual practice of scientific research. 
it provides an information context in which knowledge takes on significance. the results provide empirical evidence of the participants’ activities as well as insights into the ways they seek information.
in his discussion of user-oriented evaluation and qualitative analysis of information use, ellis emphasized a small-scale qualitative analysis of users’ perceptions of system performance to construct insights into the complex reality of the information environment.20 he argued that a detailed understanding of the complexity and interaction of information systems and services and their users can be used to explain problems and provide guidance on the development of information systems. the present study is in accord with ellis’s idea by focusing on a specific sample of academic social scientists working in a university setting. the choice of the university of wisconsin–madison was based on convenience and ease of access. the restriction to one specific sample also avoids the added complexity and compound problems of information use situated in different practices and contexts.
ellis also considered the feasibility of interviews to “provide enough information for a detailed and accurate account of the perceptions of the social scientists of their information-seeking activities to be made, and to enable an authentic picture to be constructed of those activities.”21 he thought the information-seeking activities of social scientists were too diffuse to carry out triangulation of methods. by applying the interview method, the current study complies with ellis’s suggestion.
on the other hand, ellis’s behavioral model of information seeking among social scientists presented six generic features. these conclusions are far too general for specific application.
from the perspective of communities of practice, the current study examines the way social sci­ entists use information in their research practices and specific circumstances; it also presents specific informa­ tion­related behavior, strategies, and difficulties. this study also extends the understanding of the way infor­ mation is used by social scientists in a new information environment with dramatic technical advances. the findings of this study support the conclusions of kling and mckim and costa and meadows by showing the growing importance of information technology and the resulting major shifts in information­seeking practice among social scientists.22 unlike research findings prior to the 1990s, the social scientists in this study make exten­ sive use of a variety of information sources and channels, primarily electronic­information systems and services, in seeking information. in the new information environ­ ment, these new information mechanisms also presented limitations and difficulties. moreover, many lis researchers have examined users’ relevance criteria in information seeking.23 great emphasis is given to the “situational dynamism of user­ centered relevance estimation.”24 situated in their research practices, the present study also identified the social sci­ entists’ applications of certain criteria for evaluation of information. although the small­scale study has limitations for research generalization, the rich description of social sci­ entists’ perspective on the information environment has some practical implications for information­system­and­ service design for academic social scientists. ■ plan for system-to-system integration this study identified technology upgrades and sys­ tem­integration problems existing in current academic information systems. technology was developed and applied without the capability of intergenerational com­ munications and transactions at the cost of intellectual properties. 
kling and star addressed the same issue, noting that “computerized systems appear like the layers of an archaeological dig, with newer systems built upon older systems with various workplace surveillance capabilities.”25 they stated that such “legacy systems” are fragile and inflexible for information use and knowledge management. therefore, planning for system integration should be underway.
■ enhance the web resource-retrieval system
the study identified the difficulties encountered by faculty in locating relevant, complete, and valuable information effectively and efficiently on the large and dynamic web. an advanced web resource system thus is required that allows web content to be indexed and retrieved more intelligently. moreover, the findings of information-seeking strategies in this case study suggest a one-way user-system interaction process. there is no interactive query refinement between the user and the system. thus, the users have to brainstorm and play with alternative search strategies in the hope of significant results. to enhance system effectiveness, a relevance-feedback mechanism that takes into account the users’ relevance judgment is thus needed. this mechanism should have a two-way user-system interaction component.
■ construct an internal information system
the findings of the study point to a need for a shared pool of information resources in the university of wisconsin–madison department of sociology. through the leverage and reuse of existing internal knowledge assets in the department, this system could help collectively create or gather information resources for cross-reference by colleagues.
■ construct a collaborative information mechanism for the social-scientific community
according to the findings, there are no sources of information or mechanisms that assist the identification of people with similar research interests and their activities in the broad virtual space. however, awareness of shared interests and experiences constitutes an important external human resource that is valued for suggestive and creative interaction and for potential cooperation. thus, a collaborative information mechanism for identifying people by their academic interests would be helpful.
■ limitations
certain limitations inherent in the study need to be acknowledged. due to time and resource constraints, the study sample includes only four scholars. given this small sample, results cannot be generalized. although ellis mentioned the feasibility of interviews in a user-oriented study of information use, dependence on a single method has the disadvantage of restricting the views captured. for example, interviewer characteristics, expectations, and verbal idiosyncrasies, and participants’ socially desirable responses are recognized in many studies as potential sources of method biases (podsakoff et al.).26 if time and resources permit, triangulation of methods (for example, combining interviews with observations and diaries) would increase the level of specificity and support the validity and reliability of the research results.
■ conclusion
drawing upon the communities-of-practice idea that what is learned (1) is dependent on the context in which the learning takes place and (2) is central to the transfer and consumption of information, this study examined the information-seeking behavior of four social scientists.
results were drawn about their use of information resources and information channels to meet their infor­ mation inquiries, their strategies for information seeking, and the difficulties encountered in searching for relevant information, situated in the course of their actual scien­ tific research. this work has two primary contributions. first, it provides a rich description of social scientists’ per­ spectives on their research­oriented information­seeking behavior in the context of today’s information environ­ ment. second, it situates information seeking behavior in a socially constructed practice and presents specific features of information seeking. these results will help refine the understanding of the dynamic relationship between information systems and services and their users within scientific research. several areas remain for future research. researchers could make a comparative study of academics in differ­ ent institutional settings. future research could also study the dynamic interaction of information systems and ser­ vices and their users within each stage of ellis’s model of information­seeking patterns among social scientists to get insights into the specific features of their information seeking behaviors and to enrich their general patterns of information inquiry with specific details. research on information­seeking behaviors of social scientists could also focus on specific research tasks or certain research stages to decide differences or similarities of informa­ tion­seeking behaviors across academic practice. similar research could also be done on faculty in other disci­ plines. references 1. m. b. folster, “information­seeking patterns: social scien­ tists,” the reference librarian 23, no. 49/50 (1995): 83–93. 2. m. b. line, “information requirements in the social sci­ ences: some preliminary considerations,” journal of librarianship 1, (1969): 1–19; m. b. 
line, “the information uses and needs of social scientists: an overview of infross,” aslib proceedings 23 (1971): 412–34.
3. p. stenstrom and r. b. mcbride, “serial use by social science faculty: a survey,” college and research libraries 40 (1979): 426–31; r. h. epp and j. s. segal, “the acls survey and academic library service,” college and research libraries news 48 (1987): 63–69; m. slater, “social scientists’ information needs in the 1980s,” journal of documentation 44, no. 3 (1988): 226–37; m. b. folster, “a study of the use of information sources by social science researchers,” the journal of academic librarianship 15, no. 1 (1989): 7–11; c. c. gould and m. j. handler, information needs in the social sciences: an assessment (mountain view, calif.: research libraries group, 1989).
4. folster, “a study of the use of information sources by social science researchers”; epp and segal, “the acls survey and academic library service.”
5. d. ellis, “the derivation of a behavioral model for information retrieval system design” (ph.d. diss., univ. of sheffield, 1987); d. ellis, “a behavioral approach to information retrieval system design,” journal of documentation 45, no. 3 (1989): 171–212.
6. l. i. meho and h. r. tibbo, “modeling the information-seeking behavior of social scientists: ellis’s study revisited,” journal of the american society for information science and technology 54, no. 6 (2003): 570–87.
7. h. c. hobohm, “social science information and documentation: time for a state of the art?” inspel 33, no. 3 (1999): 123–30.
8. m. b. line, “social science information: the poor relation,” ifla journal 26, no. 3 (2000): 177–79.
9. r. kling and g. mckim, “not just a matter of time: field differences and the shaping of electronic media in supporting scientific communication,” journal of the american society for information science 51, no. 14 (2000): 1306–20.
10. s. costa and j.
meadows, “the impact of computer usage on scholarly communication among social scientists,” journal of information science 26, no. 4 (2000): 255–62. 11. e. wenger, r. mcdermott, and w. m. snyder, cultivating communities of practice: a guide to managing knowledge (boston: harvard business sch. pr., 2002), 4. 12. s. al­hawamdeh, knowledge management: cultivating knowledge professionals (oxford: chandos pubs., 2003). 13. e. wenger, communities of practice: learning, meaning, and identity (cambridge: cambridge univ. pr., 1998). 14. j. s. brown and p. duguid, “organizational learning and communities of practice: toward a unified view of working, learning, and innovation,” organization science 2, no.1 (1991): 40–57. 15. j. s. brown, “internet technology in support of the con­ cept of ‘communities of practice’: the case of xerox,” accounting, management, and information technologies 8, no. 4 (1998): 227–36; brown and duguid, “organizational learning and com­ munities of practice; f. blackler, “knowledge, knowledge work, and organizations: an overview and interpretation,” organization studies 16, no. 6 (1995): 1021–46; j. lave and e. wenger, situated learning: legitimate peripheral participation (cambridge: cambridge univ. pr., 1991); n. hayes and g. walsham, “par­ ticipation in groupware­mediated communities of practice: a socio­political analysis of knowledge working,” information and organization 11, no. 4 (2001): 263–88. 16. wenger, mcdermott, and snyder, “cultivating communi­ ties of practice.” 17. brown and duguid, “organizational learning and com­ munities of practice,” 48. 18. hayes and walsham, “participation in groupware­medi­ ated communities of practice,” 264. 19. k. grosser, “human networks in organizational informa­ tion processing,” in m. e. 
williams, ed., annual review of information science and technology (medford, n.j.: learned information, 1991), 349–402; brown, “internet technology in support of the concept of ‘communities of practice’”; brown and duguid, “organizational learning and communities of practice”; black­ ler, “knowledge, knowledge work, and organizations”; lave and wenger, situated learning: legitimate peripheral participation; hayes and walsham, “participation in groupware­mediated communities of practice.” 20. d. ellis, “user­oriented evaluation and qualitative anal­ ysis of patterns of information use,” in d. bawden, user-oriented evaluation of information systems and services (brookfield, vt.: gower, 1990), 172–79. 21. ibid., 177. 22. kling and mckim, “not just a matter of time”; costa and meadows, “the impact of computer usage on scholarly com­ munication among social scientists.” 23. c. l. barry, “user­defined relevance criteria: an explor­ atory study,” journal of the american society for information science 45, no. 3 (1994): 149–59; h. w. bruce, “a cognitive view of the situational dynamism of user­centered relevance estimation,” journal of the american society for information science 45, no. 3 (1994): 142–48; s. mizzaro, “relevance: the whole story,” journal of the american society for information science 48, no. 9 (1997): 810–32; x.­j. yuan, n. j. belkin, and j.­y. kim, “the relation­ ship between ask and relevance criteria,” in proceedings of the 25th annual international acm sigir conference on research and development in information retrieval (new york: acm pr., 2002), 359–60; s. y. rieh, “judgment of information quality and cognitive authority in the web,” journal of the american society for information science and technology 53, no. 2 (2002): 145–61; c. n. wathen and j. burkell, “believe it or not: factors influencing credibility on the web,” journal of the american society for information science and technology 53, no. 2 (2002): 134–44; a. tombros, i. ruthven, and j. m. 
jose, “searchers’ criteria for assessing web pages,” in proceedings of the 26th annual international acm sigir conference on research and development in information retrieval (toronto: acm pr., 2003), 385–86.
24. bruce, “a cognitive view of the situational dynamism of user-centered relevance estimation,” 142.
25. r. kling and l. star, “human-centered systems in the perspective of organizational and social informatics,” computers and society 28, no. 1 (1998): 22–29.
26. p. m. podsakoff et al., “common method biases in behavioral research: a critical review of the literature and recommended remedies,” journal of applied psychology 88, no. 5 (2003): 879–903.

title-only entries retrieved by use of truncated search keys
frederick g. kilgour, philip l. long, eugene b. liederman, and alan l. landgraf: the ohio college library center, columbus, ohio.
an experiment testing utility of truncated search keys as inquiry terms in an on-line system was performed on a file of 16,792 title-only bibliographic entries. use of a 3,3 key yields eight or fewer entries 99.0% of the time.
a previous paper (1) established that truncated derived search keys are efficient in retrieval of entries from a name-title catalog. this paper reports a similar investigation into the retrieval efficiency of truncated keys for extracting entries from an on-line, title-only catalog; it is assumed that entries retrieved would be displayed on an interactive terminal.
earlier studies by ruecking (2), nugent (3), kilgour (4), dolby (5), coe (6), and newman and buchinski (7) investigated search keys designed to retrieve bibliographic entries from magnetic tape files. the earlier paper in this series and the present paper investigate retrieval from on-line files in an interactive environment. similarly, the work of rothrock (8) inquired into the efficacy of derived truncated search keys for retrieving telephone directory entries from an on-line file.
since the appearance of the previous paper, the ohio state university libraries have developed and activated a remote catalog access and circulation control system employing a truncated derived search key similar to those described in the earlier paper. however, osu adopted a 4,5 key consisting of the first four characters of the main entry and the first five characters of the title excluding initial articles and a few other nonsignificant words. whereas the osu system treats the name and title as a continuous string of characters, the experiments reported in this and the previous paper deal only with the first word in the name and title, articles always being excluded.
208 journal of library automation vol. 4/4 december, 1971
the bell system has also recently activated a large traffic experiment in the san francisco bay area. the master file in this system contains 1,300,000 directory entries. the system utilizes truncated derived keys like those investigated in the present experiments.
materials and methods
the file used in this experiment was described in the earlier paper (1), except that this experiment investigates the title-only entries. the same programs used in the name-title investigation were used in this experiment; the title-only entries were edited so that the first word of the title was placed in the name field and the remaining words in the title field.
as was the case formerly, it was necessary to clean up the file. single-word titles often carried in the second or title field such expressions as "one year subscription" or "vol 16 1968". in addition there were spurious character strings that were not titles, and in such cases the entire entry was removed from the file. thereby, the original 17,066 title entries were reduced to 16,792.
the truncated search keys derived from these title-only entries consist of the initial characters of the first word of the title and of the second word of the title. if there was no second word, blanks were employed.
if either the first or second word contained fewer characters than the key to be derived, the key was left-justified and padded out with blanks.
to obtain a comparison of the effectiveness of truncated search keys derived from title-only entries with that of keys derived from name-title entries, a name-title entry file of the same number of entries (16,792) was constructed. a series of random numbers larger than the number of entries in the original name-title file (132,808) was generated and one of the numbers was attached to each of the 132,808 name-title entries in sequence. next the file was sorted by number so that a randomized file was obtained. then the first 16,792 name-title entries were selected. the same program analyzed keys derived from this file.
results
table 1 presents the maximum number of entries to be expected in 99% of replies for the file of 16,792 title-only entries as well as for the name-title file containing the same total of entries. for example, when a large number of random requests are put to the title-only file using a 3,3 search key, the prediction is that 99.0% of the time, eight or fewer replies will be returned. however, in the case of the name-title file, only two replies will be returned 99.3% of the time.
the 3,3 key produced only thirteen replies (.12% of the total number of 3,3 keys) containing twenty-one or more entries. the highest number of entries for a single reply for the 3,3 key was 235 ("jou,of" derived from journal of). the next highest was 88 ("adv,in" for advances in).

table 1. maximum number of entries in 99% of replies

              title-only entries              name-title entries
search key    maximum entries   percent       maximum entries   percent
              per reply         of time       per reply         of time
2,2           [illegible]       [illegible]   7                 99.0
2,3           [illegible]       [illegible]   4                 99.6
2,4           11                99.0          3                 99.5
3,2           9                 99.1          3                 99.2
3,3           8                 99.0          2                 99.3
3,4           8                 [illegible]   2                 99.5
4,2           8                 99.1          2                 99.2
4,3           7                 99.0          2                 99.6
4,4           7                 99.1          2                 99.7

discussion
the two words from which the keys are derived in name-title entries constitute a two-symbol markov string of zero order, since the name string and title string are uncorrelated. however, the two words from which keys are derived in the title-only entry are first order markov strings, since they are consecutive words from the title string and are correlated. the consequence of these two circumstances on the effectiveness of derived keys is clearly presented in table 1. the keys from name-title entries consistently produce fewer maximum entries per reply. therefore, it is desirable to derive keys from zero order markov strings wherever possible.
the ohio state university libraries contain over two and a quarter million volumes, but on 9 february 1971 there were only 47,736 title-only main entries in the catalog. the file used in the present experiment is 35% of the size of the osu file. since 99% of the time the 3,3 key yields eight or fewer titles, it is clear that such a key will be adequate for retrieval for library on-line, title-only catalogs. the 3,3 key also possesses the attractive quality of eliminating the majority of human misspelling, as pointed out in the earlier paper (1).
there remains, however, the unsolved problem of the efficient retrieval of such titles as those beginning with "journal of" and "advances in". it appears that it will be necessary to devise a special algorithm for those relatively few titles that produce excessively high numbers of entries in replies.
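the key derivation described under materials and methods can be sketched in a few lines. this is only an illustrative sketch, not the programs used in the experiment: the function names, the comma separator, and the list of excluded articles are assumptions (the paper does not enumerate the articles it drops), and only a leading article is removed here.

```python
from collections import Counter

# Assumption: the paper does not list the excluded articles; this set is illustrative.
ARTICLES = {"a", "an", "the"}


def derive_key(title, n_first=3, n_second=3):
    """Build an n_first,n_second truncated key from the first two title words.

    Short or missing words are left-justified and padded with blanks,
    as the paper describes. The comma separator is a display choice.
    """
    words = title.lower().split()
    # Assumption: only an initial article is dropped.
    if words and words[0] in ARTICLES:
        words = words[1:]
    first = words[0] if len(words) > 0 else ""
    second = words[1] if len(words) > 1 else ""
    return first[:n_first].ljust(n_first) + "," + second[:n_second].ljust(n_second)


def entries_per_key(titles, n_first=3, n_second=3):
    """Count how many entries each derived key would return in one reply."""
    return Counter(derive_key(t, n_first, n_second) for t in titles)


if __name__ == "__main__":
    sample = ["Journal of Physics", "Journal of Chemistry", "Advances in Optics", "Science"]
    for key, count in entries_per_key(sample).items():
        print(repr(key), count)
```

grouping a file by derived key in this way gives the number of entries per reply for each key; titles such as those beginning with "journal of" all fall under one key, which is the crowding problem the discussion notes.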
in the previous investigation it was found that a 3,3 key yielded five or fewer replies 99.08% of the time from a file of 132,808 name-title entries. table 1 shows that for a file of only 16,792 entries the 3,3 key produces two or fewer replies 99.3% of the time. these two observations suggest that as a file of bibliographic entries increases, the maximum number of entries per reply does not increase in a one-to-one ratio, since the maximum number of entries rose from two to five while the total size of the file increased roughly eightfold (from 16,792 to 132,808 entries). further research must be done in this area to determine the relative behavior of derived truncated keys as their associated file sizes vary.
conclusion
this experiment has produced evidence that a series of truncated search keys derived from a first order markov word string in a bibliographic description yields a higher number of maximum entries per reply than does a series derived from a zero order markov string. however, the results indicate that the technique is nonetheless sufficiently efficient for application to large on-line library catalogs. use of a 3,3 search key yields eight or fewer entries 99.0% of the time from a file of 16,792 title-only entries.
acknowledgment
this study was supported in part by national agricultural library contract 12-03-01-5-70 and by office of education contract oec-0-72-2289 (506).
references
1. f. g. kilgour; p. l. long; e. b. leiderman: "retrieval of bibliographic entries from a name-title catalog by use of truncated search keys," proceedings of the american society for information science 7 (1970), pp. 79-82.
2. f. h. ruecking, jr.: "bibliographic retrieval from bibliographic input; the hypothesis and construction of a test," journal of library automation 1 (december 1968), 227-38.
3. w. r. nugent: "compression word coding techniques for information retrieval," journal of library automation 1 (december 1968), 250-60.
4. f. g. kilgour: "retrieval of single entries from a computerized library catalog file," proceedings of the american society for information science 5 (1968), pp. 133-36.
5. j. l. dolby: "an algorithm for variable-length proper-name compression," journal of library automation 3 (december 1970), 257-75.
6. m. j. coe: "mechanization of library procedures in the medium-sized medical library: x. uniqueness of compression codes for bibliographic retrieval," bulletin of the medical library association 58 (october 1970), 587-97.
7. w. l. newman; e. j. buchinski: "entry/title compression code access to machine readable bibliographic files," journal of library automation 4 (june 1971), 72-85.
8. h. i. rothrock, jr.: computer-assisted directory search; a dissertation in electrical engineering. (philadelphia: university of pennsylvania, 1968).

lita president’s message: sustaining lita
emily morton-owens
information technology and libraries | september 2019
emily morton-owens (egmowens.lita@gmail.com) is lita president 2019-20 and the assistant university librarian for digital library development & systems at the university of pennsylvania libraries.
recently, at the 2019 midwinter meeting in seattle, ala decided to adopt sustainability as one of the core values of librarianship. the resolution includes the idea of a triple bottom line: “to be truly sustainable, an organization or community must embody practices that are environmentally sound and economically feasible and socially equitable.” if you had thought of sustainability mainly in terms of the environment, you have plenty of company. i originally pictured it as an umbrella term for a variety of environmental efforts: clean air, waste reduction, energy efficiency. but in fact the idea encompasses human development in a broader sense.
one definition of sustainability involves making decisions in the present that take into account the needs of the future. of course our current environmental threats demand our attention, and libraries have found creative ways to promote environmental consciousness (myriad examples include books on bikes, seeking leed or passive house certification for library buildings, providing resources on xeriscaping, and many more). even if you’re not presently working in a position that allows you to engage directly on the environment, though, the concept of sustainability turns out to permeate our work and values. the ideas of solving problems in a way that doesn’t create new challenges for future people, developing society in a way that allows all people to flourish, and fostering strong institutions: these concepts all resonate with the work we do daily, not only in what we offer our users but also in how we work with each other. as a profession, we have a history of designing future-proof systems (or at least attempting to). whenever i’ve been involved in planning a digital library project, one of the first questions on the table is “how do we get our data back out of this, when the time comes?” no matter how enamored we are of the current exciting new solution, we remember that things will look different in the future. library metadata schemas are all about designing for interoperability and reusability, including in new ways that we can’t picture yet. someone who is unaccustomed to this kind of planning may see a high project overhead for these concerns, but we have consistently incorporated long-term thinking into our professional values due to the importance we place on free access, data preservation, and interoperability. the triple-bottom line approach, considering economic, social, and environmental factors, also influences the lita leadership. 
i recently announced the lita board’s decision to reduce our in-person participation at ala midwinter for 2020, which is partly in response to ala’s deliberations about reinventing the event starting in 2021. with all the useful collaboration technologies now at our fingertips, it is harder to justify requiring our members to meet in person more than once per year. it is possible for us to do great work, on a continuous and rolling basis, throughout the year. more importantly, we want to offer committee and leadership positions to members who may not be able to travel extensively, for personal or work reasons. (especially when many do not receive financial support from their employers. and, to come back around to environmental concerns for a moment, think of all the flights our in-person meetings require.) by being more flexible about what participation looks like, we sustain the effort that our members put into lita through a world of work that is changing.

http://www.ala.org/aboutala/sites/ala.org.aboutala/files/content/governance/council/council_documents/2019_ms_council_docs/ala%20cd%2037%20resolution%20for%20the%20adoption%20of%20sustainability%20as%20a%20core%20value%20of%20librarianship_final1182019.pdf
https://doi.org/10.6017/ital.v38i3.11627

financial sustainability is also a factor in our pursuit of a merger with alcts and llama. we are three smaller divisions based on professional role, not library type, who share interests and members. we also have similar needs and processes for running our respective associations. unfortunately, lita has been on an unsustainable course with our budget for some time; we spend more than we take in annually, due to overhead costs and working within ala’s processes and infrastructure.
the lita board has engaged for many years on the question of how to balance our financial future with the fact that our programs require full-time staff, instructors, technology, printing, meeting rooms, etc. core, as the new merged division will be known, will allow us to correct that balance by combining our operations, streamlining workflows, and containing our costs. the staff will also be freed up to invest more effort in member engagement. we can’t predict all the services that associations will offer in the future, but we know that, for example, online professional development is always needed, so we’re ensuring that the plan allows it to continue. it is inspiring to talk about the new collaborations and subject-matter synergies that the merger will bring with it, but core will also achieve something important for sustaining a level of service to our membership. at the ala level, the steering committee on organizational effectiveness (scoe) is also looking at ways to streamline the association’s structure and make it more approachable and welcoming to new members. i would add that a simplified structure should make ala more accountable to members as well, which is crucial for positioning it as an organization worth devoting yourself to. these shifts are essential because member volunteers are what make ala happen, and we need a structure that invites participation from future generations of library workers. taken together, these may look like a confusing flurry of changes. but librarians have evolved to be excellent at long-term thinking about our goals and values and how to pursue an exciting future vision based on what we know now and what tools (technology, people, ideas) we have at hand. we care about helping our users thrive and are able to take a broad view of what that encompasses. in particular, with the new resolution about sustainability, we’re including the health of our communities and the security of our environment as a part of that mission. 
due to their innovative spirit and principled sense of commitment, our members are well-placed to lead transformations in their home institutions and to participate in the development of lita. as we weigh all these changes, we value the achievements of our association and its past leaders and members, and seek to honor them by making sure those successes carry on for our future colleagues.

the internet, the world wide web, library web browsers, and library web servers
jian-zhong (joe) zhou
information technology and libraries; mar 2000; 19, 1; proquest pg. 50
tutorial

this article first examines the difference between two very familiar and sometimes synonymous terms, the internet and the web. the article then explains the relationship between the web's protocol http and other high-level internet protocols, such as telnet and ftp, and provides a brief history of web development. next, the article analyzes the mechanism by which a web browser (client) "talks" to a web server on the internet. finally, the article studies the market growth for web browsers and web servers between 1993 and 1999. two statistical sources were used in the web market analysis: a survey conducted by the university of delaware libraries for the 122 members of the association of research libraries, and the data for the entire web industry from different web survey agencies.

many librarians are now dealing with the internet and the web on a daily basis. while the web is sometimes synonymous with the internet in many people's minds, the two terms are quite distinct, and they refer to different but related concepts in the modern computerized telecommunication system.
the internet is nothing more than many small computer networks that have been wired together and allow electronic information to be sent from one network to the next around the world. a piece of data from beijing, china may traverse more than a dozen networks while making its way to washington, d.c. we can compare the internet to the great wall of china, which was built in the qin dynasty around the third century b.c. by connecting many existing short defense walls built by previous feudal states. the great wall not only served as a national defense system for ancient china, but also as a fast military communication system. a border alarm was raised by means of smoke signals by day, and beacon fires at night, ignited by burning a mixture of wolf dung, sulfur, and saltpeter. the alarm signal could be relayed over many beacon-fire towers from the western end of the great wall to the eastern end (4,500 miles away) within a day. this was considered light speed two thousand years ago. however, while the great wall transferred the message in a linear mode, the internet is a multidimensional network.

joe zhou (joezhou@udel.edu) is associate librarian at the university of delaware library, newark.

the web is a late-comer to the internet, one of the many types of high-level data exchange protocols on the internet. before the web, there was telnet, the traditional command-driven style of interaction. there was ftp, a file transfer protocol useful for retrieving information from large file archives. there was usenet, a communal bulletin board and news system. there was also e-mail for individual information exchange, and e-mail lists, for one-to-many broadcasts. in addition, there was gopher, a campus-wide information system shared among universities and research institutions, and wais, a powerful search and retrieval system developed by thinking machines, inc. in 1990 tim berners-lee and robert cailliau at cern (www.
cern.ch), the european laboratory for particle physics, created a new information system called "world wide web" (www). designed to help the cern scientists with the increasingly confusing task of exchanging information on the internet, the web system was to act as a unifying force, a system that would seamlessly bind all file-protocols into a single point of access. instead of having to invoke different programs to retrieve information via various protocols, users would be able to use a single program, called a "browser," and allow it to handle all the details of retrieving and displaying information. in december 1993 www received the ima award, and in 1995 berners-lee and cailliau received the association for computing machinery (acm) software system award for its development.

the web is best known for its ability to combine text with graphics and other multimedia on the internet. in addition, the web has some other key features that make it stand out from earlier internet information exchange protocols. since the web is a late-comer to the internet, it has to be backward compatible with other communications protocols in addition to its native language, hypertext transfer protocol (http). among the foreign languages spoken by web browsers are telnet, ftp, and other high-level communication protocols mentioned earlier. this support for foreign protocols lets people use a single piece of software, the web browser, to access information without worrying about shifting from protocol to protocol and software incompatibility.

despite different high-level protocols including http for the web, there is one thing in common for all parts of the internet: tcp/ip, the lower level of the internet protocol. tcp/ip is responsible for establishing the connection between two computers on the internet and guarantees that the data can be sent and received intact.
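the division of labor described here, with tcp/ip moving bytes intact and a higher-level protocol such as http giving those bytes meaning, can be illustrated with a minimal python sketch. it is not taken from the article: the `HelloHandler` class, the `fetch_raw` and `demo` functions, and the page text are hypothetical names chosen for illustration, and the server runs only on the local machine.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical server side: answers every GET with a tiny HTML page."""

    def do_GET(self):
        body = b"<html><body>hello from the server</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress request logging to keep the example quiet


def fetch_raw(host, port, path="/"):
    """Client side: open a TCP connection and speak HTTP over it by hand.

    TCP only delivers the bytes; the request text below is what makes
    this an HTTP conversation rather than, say, an FTP one.
    """
    with socket.create_connection((host, port), timeout=5) as conn:
        request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n"
        conn.sendall(request.encode("ascii"))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:  # server closed the connection: response complete
                break
            chunks.append(data)
    return b"".join(chunks)


def demo():
    """Start a local server on a free port, fetch one page, shut down."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        return fetch_raw("127.0.0.1", server.server_address[1])
    finally:
        server.shutdown()
        server.server_close()
```

calling `demo()` returns the raw response bytes: a status line and headers, followed by the html body. the short request and the longer reply mirror the asymmetric client-server conversation the tutorial describes.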
the format and content of the data are left for high-level communication protocols to manage, among which the web is the best known one. at the tcp/ip level all computers "are created equal." two computers establish a connection and start to communicate. in reality, however, most conversations are asymmetric. the end user's machine (the client) usually sends a short request for information, and the remote machine (the server) answers with a longwinded response. the medium is the internet. the common language on the internet can be the web or any other high-level protocol.

on the web, the client is the web browser; it handles the user's request for a document. the first web browser, ncsa mosaic, developed by the national center for supercomputing applications (ncsa) at the university of illinois at urbana-champaign, was released in mid-november 1993 for unix, windows, and macintosh platforms. version 3.0 of ncsa mosaic is available at www.ncsa.uiuc.edu/sdg/software/mosaic. both source code and binaries are free for academic use. mosaic lost market share to netscape after its key developer left ncsa and joined netscape. even after mosaic introduced an innovative 32-bit version in early 1997, which could perform feats that other major browsers had not even thought of back then, mosaic remained out of the major browsers' market.

the two most widely-used browsers today are microsoft's internet explorer (ie) and netscape's navigator (part of the netscape communicator suite). recent web browser surveys conducted by different internet survey companies such as www.zonaresearch.com/browserstudy, www.psrinc.com/trends.htm, and www.statmarket.com all indicate that ie is the market leader with more than 60 percent market share, leaving navigator with between 35 percent and 40 percent.
in 1995 ie had only 1 percent share versus navigator's more than 90 percent, an unimaginable rise that critics have attributed to microsoft's strategy of bundling the browser with its near-monopoly windows operating system. however, a survey conducted in december 1998 by the university of delaware library of 122 members of the association of research libraries (arl) showed that netscape still remained the market leader among big academic libraries. more than 90 percent of arl libraries supported netscape, and about 50 percent also supported ie. most arl libraries supported both browsers, and unlike the browser industry survey mentioned earlier, in which only one product can be picked as the primary browser, the sum of the percentages for the arl survey was greater than 100 percent.

the main function of the web browser is to request a document available from a specific server through the internet using the information in the document's url. the server on a remote machine returns the document usually physically stored on one of the server's disks. with the use of common gateway interface (cgi), the documents do not have to be static. rather, they can be synthesized at the point of being requested by cgi scripts running on the server's side of the connection. in some database-driven web servers that form the core of today's e-commerce, the documents provided may never exist as physical files but are generated as needed from database records.

the web server can be run on almost any computer, and server software is available for almost all operating systems, such as unix, windows 95/98/nt, macintosh, and os/2. according to the university of delaware library's 1998 survey of internet web servers among arl member libraries, more than 32 percent of arl libraries chose apache as their web server software, followed by the netscape series at 29.32 percent, ncsa httpd at 11.28 percent, and microsoft internet information server (iis) at 7.52 percent.
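the idea that a requested document need not exist as a file, but can be synthesized from database records when the url arrives, can be sketched in a few lines of python. this is an illustration under stated assumptions rather than a cgi implementation: the `RECORDS` dictionary stands in for a database, the `render_document` function and the `library.example` url are hypothetical, and a real server would add routing, headers, and error handling.

```python
from html import escape
from urllib.parse import parse_qs, urlsplit

# Hypothetical stand-in for a database table; a real site would query
# its catalog or order system here.
RECORDS = {
    "1": {"title": "example title a", "shelf": "floor 2"},
    "2": {"title": "example title b", "shelf": "floor 3"},
}


def render_document(url):
    """Build an HTML page on demand for the record named in the URL.

    Returns a (status, html) pair. No HTML file is stored anywhere;
    the page is assembled from the record at request time.
    """
    query = parse_qs(urlsplit(url).query)
    record_id = query.get("id", [""])[0]
    record = RECORDS.get(record_id)
    if record is None:
        return 404, "<html><body>no such record</body></html>"
    # escape() guards against markup sneaking in through stored values
    rows = "".join(
        f"<li>{escape(field)}: {escape(value)}</li>"
        for field, value in record.items()
    )
    return 200, f"<html><body><ul>{rows}</ul></body></html>"
```

for instance, `render_document("http://library.example/cgi-bin/record?id=1")` yields a status of 200 and a page listing that record's fields, while an unknown id yields 404. changing a record changes the next page served, with no file to edit.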
in july 1999 the author checked the netcraft survey at www.netcraft.com/survey. the top three web server software programs for more than 6.5 million web sites are apache (56.35 percent), microsoft-iis (22.33 percent), and netscape (5.65 percent). the netcraft survey also provides the historical market share information of major web servers since august 1995.

ncsa httpd was the first web server software released, about the same time as the release of mosaic in 1993. however, it slipped from the number-one position with more than 90 percent market share in 1993, and almost 60 percent in 1995, to less than 1 percent in july 1999. it is no longer supported by ncsa; however, httpd remains a popular choice for web servers due to its small size, fast performance, and solid collection of features. the "inertia effect" of the existing sites (if it runs well, why bother to change?) will likely keep ncsa on the major web server software list for some time. ncsa httpd is free, but available only for the unix platform. it is available from http://hoohoo.ncsa.uiuc.edu. however, when the author visited the site in july 1999, the following message appeared on the main page: "the ncsa httpd is no longer under development. it is an unsupported product. we recommend that you check out the apache server, instead of installing our server."

most people who use only web browsers may have heard of apache only as an indian nation or a military helicopter, not the most popular web server software with more than 50 percent market share. it was first introduced as a set of fixes or "patches" to the ncsa httpd. apache 1.0 was released in december 1995 as open-source server software by a group of webmasters who named themselves the apache group. open-source means the source code is available and freely distributed, and it is the key to apache's attractiveness and popularity. the apache group members were ncsa users who decided to coordinate development work on the server software after ncsa stopped. in july 1999 the apache group announced that it was establishing a more formal organization called the apache software foundation (asf). in the future, the asf (www.apache.org) will monitor development of the free software, but it will remain a "not-for-profit" foundation. apache is high-end, enterprise-level server software and can be run on os/2, unix (including linux), and windows platforms, but a mac version is still not available.

the netscape series includes netscape-enterprise, netscape-fasttrack, netscape-commerce, and netscape-communication. enterprise is a high-end, enterprise-level server while fasttrack serves as an entry-level server for small workgroups. netscape supports both the unix and the windows nt platforms. the other major commercial web server, microsoft internet information server (iis), as of 1999, is only available for the windows platform. however, one advantage of iis over netscape is that it can be downloaded for free as part of the windows option pack. in addition, iis can handle ms office documents very well. while both the microsoft and netscape brand names are well recognized by millions of end users, a name alone does not necessarily equate to large market share, nor does a deep pocket. apache remains the top web server despite intense competition. one of the keys to apache's success, in addition to its outstanding performance, lies in its open-source code movement and active user support on a wide basis.

the web server of choice for the macintosh platforms is webstar. however, due to the limitations of the operating system networking software, the performance of macintosh-based servers has not been great. webstar can be downloaded as a free evaluation release from www.starnine.com/webstar.
the web server market is dynamic and competition intense. there are more than sixty web server products on the top list (of web servers with more than one thousand web sites) as of july 1999, and newcomers are being added frequently.

acknowledgments

the author thanks peter liu, head of the systems department at the university of delaware library, for providing the web survey data of arl libraries. after this article was submitted, the survey data was published by arl in 1999 as spec kit 246: web page development and management. the author also wants to thank his dear wife min yang for her technical assistance. min is webmaster and system administrator for the web site at a. i. dupont nemours foundation and hospital for children, http://kidshealth.org.

public libraries respond to the covid-19 pandemic, creating a new service model
editorial board thoughts
jon goddard
information technology and libraries | december 2020
https://doi.org/10.6017/ital.v39i4.12847

jon goddard (jgoddard@northshorepubliclibrary.org) is a librarian at the north shore public library, and a member of the ital editorial board. © 2020.

during the covid-19 pandemic, public libraries have demonstrated, in many ways, their value to their communities. they have enabled their patrons not only to resume their lives but also to learn and grow. additionally, electronic resources offered to patrons through their library card have allowed people to be educated and entertained. the credit must go to the librarians, who initially fueled, and have since maintained, this level of service by rewriting the rules and creating a new service model. once libraries closed, librarians promoted ebooks and other important platforms available to patrons with their library cards. the result: the checkout of ebooks and the use of these platforms rose exponentially.
community engagement became completely virtual, with librarians and those who provide library programs to the public offering services on platforms that they may or may not have heard of before, such as zoom and discord. as libraries re-opened, many offered real-time reference services, as well as seamless and contactless curbside service, providing a sense of control and continuity amongst the chaos.

exponential increases in electronic resource usage

overdrive, which is currently used by nearly 90% of public libraries in the united states to manage both ebook and audiobook collections, saw an exponential increase in its usage. since the lockdown began in mid-march, the daily average for ebook checkouts has been consistently 52% above pre-covid periods. additionally, new users to the platform have been consistently double and triple 2019 highs.1 library staff have been helping readers during this time to ensure they obtain access with their devices. in suffolk county, new york, where new patron registration to overdrive is up 72% from last year (as of august 2020), there has been no shortage of requests for help.2 with kids being home from school and learning virtually, it is no surprise that ebook readership skyrocketed amongst ya and juvenile readers with an 87% increase from last year.3

to help them with their homework and studies, families turned to online tutoring. in suffolk county, new york, the usage of the brainfuse online live tutoring service has been consistently up by nearly 50% during the school closures.4 gale, a cengage company, which offers miss humblebee's academy, a virtual learning program for preschoolers, saw its user sessions increase by 100% from the previous year.5

adults, also eager to learn new skills, took to online courses as well.
gale courses saw a 50% increase in enrollments from march through july over the previous year. likewise, gale presents: udemy, which offers on-demand video courses, saw just over 21,000 course enrollments from march-june.6

to help those who did not have sufficient broadband wifi to use these necessary resources and platforms, many libraries left their wifi on even when the building was closed to allow access to those in the vicinity of the building. in addition, many libraries purchased wifi hotspots to lend to their patrons. according to pew research, approximately 25% of households do not have a broadband internet connection at home.7 while public libraries cannot provide the only local solution to this gap, here are other steps libraries have been taking during the shutdown:

• strengthening wireless signals so people can access wireless from outside library buildings.
• hosting drive-up wifi hotspot locations.
• partnering with schools to obtain and distribute wifi hotspots to families in need.

community engagement virtually

community engagement has been vital since the covid-19 lockdown. both librarians and those who provide library programs to the public had to quickly adjust to the virtual world in which we were suddenly living. using a mixture of social media platforms, including facebook live and stories, discord, instagram, youtube, zoom, and gotomeeting, librarians flocked to the internet, providing a wide range of programming. even those libraries that did not previously have any virtual programs managed to very quickly provide quality programs to their patrons.

virtual programming was not available at the san josé public library (sjpl) prior to the shutdown. librarians quickly started to move programs online, including story time, and created a program called spring into reading, similar to the summer reading program, to continue to encourage families to read together.
they also started a weekly recorded story time, so patrons could call the library and use their phones to hear a story. to date, sjpl has hosted over 2,000 virtual events since the lockdown began on march 17th.8

some libraries, like the oceanside library in new york, were offering virtual programs before the pandemic. when the library closed on march 13, the team started planning to move to completely virtual programming. two days later, the library was offering four programs a day, including story times, book chats, and book clubs. by the end of the week, they were offering eight programs a day.9 in april, may, and june, they found book discussions and story times were the most popular programs. they then started to open their programs to people from out of state, partnering with other libraries. the result? program attendance has increased and several zoom meeting rooms have been maxed out.10

through the lockdown, library patrons have been exercising, listening to concerts, taking virtual vacations, learning new skills, cooking, playing games, and reducing stress. this incredible adaptation was only possible due to library workers’ quick thinking and a never-ending determination to help.

delivering information and materials with a new service model

at the san josé public library (sjpl), which has over 500,000 library members, library staff had to quickly shift to a new online reality just after the shutdown. to help patrons get the most from their electronic resources, sjpl used libanswers to post faqs and email responses to their issues and questions. when a librarian was available, patrons could use libchat to ask questions in real-time. because no one was in the library buildings to answer phones, libanswers and libchat became the only way the public could communicate with staff.
chat reference conversations increased roughly fourfold, from approximately 40 chat sessions per day to 160 per day. the chat service was also made available in spanish, vietnamese, and chinese. when the library implemented its express pickup service, sjpl utilized the spaces functionality in libcal to allow patrons to create pickup appointments. when patrons arrived at the library for their appointment, the sms functionality in libanswers allowed patrons to text staff upon arrival. through the city of san josé’s sj access initiative, which aims to help bridge the digital divide in the city, sjpl worked closely with other city departments, and the santa clara county office of education, to purchase approximately 16,000 high-speed at&t hotspots for students and the public.11

working towards the new normal

the american library association (ala) is committed to advocating strongly for libraries on several different fronts. thanks to thousands of advocate communications with congress, libraries secured $50 million for the institute of museum and library services (imls) in the coronavirus aid, relief, and economic security (cares) act. this enabled libraries and museums to apply for grants during this time of need.12 in addition, the ala is currently advocating for the passage of the library stabilization fund act (s.4181 / h.r.7486) to allow libraries to retain staff, maintain services, and safely keep communities connected and informed. the legislation calls for $2 billion in emergency recovery funding for america's libraries through the institute of museum and library services (imls).13

while the ala is rightly advocating for these emergency funds, public librarians and administrators should take advantage of this time to strategically review what has been put into place to react to the covid-19 pandemic, and plan for the long term. while it is true that libraries are physical spaces, they are also technology-driven services for learning and connections for all ages.
additionally, they have shown that due to this new service model, access has exponentially expanded to new patrons, showing tremendous value when it comes to education and engagement.

this new service model should be preserved. programs that engage our communities should be both physical and virtual. physical media and books should be provided both at the circulation desk and through a contactless service. reference services should be provided both at the reference desk, and through chat reference services. this must be our new normal.

endnotes

1 david burleigh, director, brand marketing & communication at overdrive, phone conversation with author, october 9, 2020.
2 maureen mcdonald, special projects supervisor at the suffolk cooperative library system, phone conversation, september 14, 2020.
3 burleigh.
4 mcdonald.
5 kayla siefker, head of media & public relations at gale, a cengage company; brian risse, vp of sales – public libraries; and muna sharif, product manager, discovery & analytics, phone conversation with author, october 16, 2020.
6 siefker.
7 pew research center, “internet/broadband fact sheet,” june 12, 2019, accessed october 13, 2020, https://www.pewresearch.org/internet/fact-sheet/internet-broadband/.
8 laurie willis, web services at sjpl, phone conversation with author, october 14, 2020.
9 erica freudenberger, “programming through the pandemic,” library journal, may 22, 2020, https://www.libraryjournal.com/?detailstory=programming-through-the-pandemic-covid-19.
10 tony iovino, assistant director for community services at the oceanside library, phone conversation with author, october 19, 2020.
11 willis.
12 american library association, “advocacy & policy,” accessed october 15, 2020, http://www.ala.org/tools/covid/advocacy-policy.
13 ibid.
modeling a library website redesign process: developing a user-centered website through usability testing
danielle a. becker and lauren yannotta
information technology and libraries | march 2013

abstract

this article presents a model for creating a strong, user-centered web presence by pairing usability testing and the design process. four rounds of usability testing were conducted throughout the process of building a new academic library website. participants were asked to perform tasks using a talk-aloud protocol. tasks were based on guiding principles of web usability that served as a framework for the new site. results from this study show that testing throughout the design process is an effective way to build a website that not only reflects user needs and preferences, but can be easily changed as new resources and technologies emerge.

introduction

in 2008 the hunter college libraries launched a two-year website redesign process driven by iterative usability testing. the goals of the redesign were to:

• update the design to position the library as a technology leader on campus;
• streamline the architecture and navigation;
• simplify the language used to describe resources, tools, and services; and
• develop a mechanism to quickly incorporate new and emerging tools and technologies.

based on the perceived weaknesses of the old site, the libraries’ web committee developed guiding principles that provided a framework for the development of the new site.
the guiding principles endorsed solid information architecture, clear navigation systems, strong visual appeal, understandable terminology, and user-centered design.

this paper will review the literature on iterative usability testing, user-centered design, and think-aloud protocol and the implications moving forward. it will also outline the methods used for this study and discuss the results. the model used, building the design based on the guiding principles and using the testing to uphold those principles, led to the development of a strong, user-centered site that can be easily changed or adapted to accommodate new resources and technologies. we believe this model is unique and can be replicated by other academic libraries undertaking a website redesign process.

danielle a. becker (dbe0003@hunter.cuny.edu) is assistant professor/web librarian, lauren yannotta (lyannotta@hotmail.com) was assistant professor/instructional design librarian, hunter college libraries, new york, new york.

background

the goals of the research were to (1) determine the effectiveness of the hunter college libraries website, (2) discover how iterative usability testing resulting in a complete redesign impacts how the students perceive the usability of a college library website, and (3) reveal student information-seeking habits. a formal usability test was conducted both on the existing hunter college libraries website (appendix a) and on successive drafts of the redesign (appendix b) with twenty users over an eighteen-month period. the testing occurred before the website redesign began, while the website was under construction, and after the site was launched. the participants were selected through convenience sampling and informed that participation was confidential.
the intent of the usability test was to uncover the flaws in navigation and terminology of the current website and, as the redesign process progressed, to incorporate the users’ feedback into the new website’s design to closely match their wants and needs.

the redesign of the website began with a complete inventory of the existing webpages. an analysis was done of the website that identified key information, links, units within the department, and placement of information in the information architecture of the website. we identified six core goals that we felt were the most important for all users of the library’s website:

1. users should be able to locate high-level information within three clicks.
2. eliminate library jargon from the navigational system by using concise language.
3. improve readability of the site.
4. design a visually appealing site.
5. create a site that is easily changeable and expandable.
6. market the libraries’ services and resources through the site.

literature review

in 2010, oclc compiled a report, “the digital information seeker,” that found 84 percent of users begin their information searches with search engines, while only 1 percent began on a library website. search engines are preferred because of speed, ease of use, convenience, and availability.1 similar studies such as emde et al., and gross and sheridan, have shown that students are not using library websites to do their research.2 gross and sheridan assert in their article on undergraduate search behavior that “although students are provided with library skills sessions, many of them still struggle with the complex interfaces and myriad of choices the library website provides.”3 this research shows the importance of creating streamlined websites that will compete for our students’ attention. in building a new website at the hunter college libraries, we thought the best way to do this was through user-centered design.
web designers both inside and outside the library have recognized the importance of user-centered design. nielsen advises that website structure should be driven by the tasks the users came to the site to perform.4 he asserts the amount of graphics on webpages should be minimized because they often affect page download times and that gratuitous graphics (including text rendered as images) should be eliminated altogether.5 he also contends it is important to ensure that page designs are accessible to all users regardless of platform or newness of technology.6 in their article, “how do i find an article? insights from a web usability study,” cockrell and jayne cited instances when researchers concluded that library terminology contributed to patrons’ difficulties when using library websites, thus highlighting the importance of understandable terminology. hulseberg and monson found in their investigation of student-driven taxonomy for library website design that “by developing our websites based on student-driven taxonomy for library website terminology, features, and organization, we can create sites that allow students to get down to the business of conducting research.”7

performing usability testing is one way to confirm user-centered design. in his book don’t make me think!, krug insists that usability testing can provide designers with invaluable input that, taken together with experience, professional judgment, and common sense, makes design choices easier.8 ipri, yunkin, and brown, in their article “usability as a method for assessing discovery,” emphasize the important role usability testing has in capturing emotional and aesthetic responses users have to websites, along with expressions of satisfaction with the layout and logic of the site. even the discovery of basic mistakes, such as incorrect or broken links and ineffective wording, can negatively affect discovery of library resources and services.9

in battleson, booth, and weintrop’s literature review for their usability testing of an academic library website case study, they summarize dumas and redish’s discussion of the five facets of formal usability testing: (1) the goal is to improve the usability of the interface, (2) testers should represent real users, (3) testers perform real tasks, (4) user behavior and commentary are observed and recorded, and (5) data are analyzed to recognize problems and suggest solutions. they conclude that when usability testing is “applied to website interfaces, this test method not only results in a more usable site, but also allows the site design team to function more efficiently, since it replaces opinion with user-centered design.”10 this allows the designers to evaluate the results and identify problems with the design being tested.11

usability experts nielsen and tahir contend that the earlier and more frequently usability tests are conducted, the more impact the results will have on the final design of the website because the results can be incorporated throughout the design process. they conclude it is better to conduct frequent, smaller studies with a maximum of five users. they assert, “you will always have discovered so many blunders in the design that it will be better to go back to the drawing board and redesign the interface than to discover the same usability problems several more times with even more users.”12

based on the strength of the literature, we decided to use iterative testing for our usability study.
krug points out that testing is an iterative process because designers need to create, test, and fix based on test results, then test again.13 according to the united states department of health and human services report “research-based web design and usability guidelines,” conducting before and after studies when revising a website will help designers determine if changes actually made a difference in the usability of the site.14 manzari and trinidad-christensen, in their evaluation of user-centered design for a library website, describe iterative testing as testing a product several times during development, allowing users’ needs to be incorporated into the design. in their study, their aim was that the final draft of their website would closely match the users’ information needs while remaining consistent, easy to learn, and efficient.15 battleson, booth, and weintrop report that there is “a consensus in the literature that usability testing be an iterative process, preferably one built into a web site’s initial design.”16 they explain that “site developers should test for usability, redesign, and test again—these steps create a cycle for maintaining, evaluating and continually improving a site.”17 george used iterative testing in her redesign of the carnegie mellon university libraries website and concluded that it was “necessary to provide user-centered services via the web site.”18

cobus, dent, and ondrusek used six students in a usability “pilot study.” eight students then participated in the first round of testing, after which librarians modified the prototype and tested fourteen students in the second and final round.
after the second round of testing they used the results of this test to analyze the user recordings and deliver the findings and proposed “fixes” to the prototype pages to the web editor.19 mcmullen’s redesign of the roger williams university library website was able to “complete the usability-refinement cycle” twice before finalizing the website design.20 but continued refinements were needed, leading to another round of usability tests to identify and correct problem areas.21 bauer-graham, poe, and weatherford did a comparative study of a library website’s usability via a survey and then redesigned the website after evaluating the survey’s results. they waited a semester and then distributed another survey to determine the functionality of the current site. the survey had the participants view the previous design and the current design in a side-by-side comparison to determine how useful the changes made to the site were.22

when testing participants, in the article “how do i find an article? insights from a web usability study,” cockrell and jayne suggest having participants use a web interface to perform specified tasks while a tester observes, noting the choices made and where mistakes occur, and using a “think aloud” protocol. they found that modifying the website through an ongoing, iterative process of testing, refining, and retesting its component parts improves functionality.23

in conducting our usability testing we used a think-aloud protocol to capture the participants’ actions. van den haak, de jong, and schellens define think-aloud protocol as relying on a method that asks users to complete a set of tasks and to constantly verbalize their thoughts while working on the tasks. the usefulness of this method of testing lies in the fact that the data collected reflect the actual use of the thing being tested and not the participants’ judgments about its usability.
instead, the test follows the individual’s thoughts during the execution of the tasks.24 nielsen states that think-aloud protocol “may be the single most valuable usability engineering method. . . . one gets a very direct understanding of what parts of the [interface/user] dialog cause the most problems, because the thinking aloud method shows how users interpret each individual interface item.”25 turnbow’s article “usability testing for web redesign: a ucla case study” states that using the “think-aloud protocol” provides crucial real-time feedback on potential problems in the design and organization of a website.26 cobus, dent, and ondrusek used the think-aloud protocol in their usability study. they encouraged participants to talk out loud as they answered the questions, audiotaped their comments, and captured their on-screen navigation using camtasia.27 this information was used to successfully reorganize hunter college library’s website.

method

an interactive draft of hunter college libraries’ redesigned website was created before the usability study was conducted. in spring 2009, the authors created the protocol for the usability testing. a think-aloud protocol was agreed upon for testing both the old site and the drafts of the new site, including a series of post-test questions that would allow participants to share their demographic information and give subjective feedback on the drafts of the site. draft questions were written, and we conducted mock usability tests on each other. after several drafts we revised our questions and performed pilot tests on an mlis graduate student and two undergraduate student library assistants with little experience with the current website. we ascertained from these pilot tests that we needed to slightly revise the wording of several questions to make them more understandable to all users. we made the revisions and eliminated a question that was redundant.
all recruitment materials and finalized questions were submitted to the institutional review board (irb) for review and went through the certification process. after receiving approval we secured a private room to conduct the study. participants were recruited using a variety of methods. signs were posted throughout the library, an e-mail was sent out to several hunter college distribution lists, and a tent sign was erected in the lobby of the library. participants were required to be students or faculty. participants were offered a $10.00 barnes & noble gift card as incentive. applicants were accepted on a rolling basis. twenty students participated in the web usability study (appendix c). no faculty responded to our requests for participation, so a decision was made to focus this usability test on students rather than faculty because students comprise our core user base. another usability test will be conducted in the future that will focus on faculty to determine how their academic tasks differ from undergraduates when using the library website. the redesigned site is malleable, which makes revisions and future changes in the design a predicted outcome of future usability tests.

tests were scheduled for thirty-minute intervals. we conducted four rounds of testing using five participants per round. the two researchers switched questioner and observer roles after each round of testing. each participant was asked to think aloud while they completed the tasks and navigated the website. both researchers took notes during the tests to ensure detailed and accurate data was collected. each participant was asked to review the irb forms detailing their involvement in the study, and they were asked to consent at that time. their consent was implied if they participated in the study after reading the form.

the usability test consisted of fifteen task-oriented questions.
the questions were identical when testing the old and new draft site. the first round tested only the old site, while the following three rounds tested only the new draft site. we tested both sites because we believed that comparing the two sites would reveal if the new site improved performance. the questions (appendix d) were not changed after they were initially finalized and remained the same throughout the entire four rounds of the usability study. participants were reminded at the onset of the test and throughout the process that the design and usability of the site(s) were being tested, not their searching abilities. the tests were scheduled for an hour each, allowing participants to take the tests without time restrictions and without being timed. as a result, the participants were encouraged to take as much time as they needed to answer the questions, but were also allowed to skip questions if they were unable to locate answers. initially the tests were recorded using camtasia software. this allowed us to record participants’ navigation trails through their mouse movements and clicks. but, after the first round of testing, we decided that observing and taking notes was appropriate documentation, and we stopped using the software. after the participants completed the tests we asked them user preference questions to get a sense of their user habits and their candid opinions of the new draft of the website. these questions were designed to elicit ideas for useful links to include on the website and also to gauge the visual appeal of the site.

results

table 1. percent of tasks answered correctly

task                                                  old site  new site
find a book using online library catalog              80%       86%
find library hours                                    100%      100%
get help from a librarian using questionpoint         40%       93%
find a journal article                                20%       66%
find reference materials                              0%        7%
find journals by title                                40%       66%
find circulation policies                             60%       53%
find books on reserve                                 80%       73%
find magazines by title                               0%        73%
find the library staff contact information            60%       100%
find contact information for the branch libraries     40%       100%

discussion

hunter college libraries’ website was due for a redesign because the site was dated in its appearance and did not allow new content to be added quickly and easily. as a result, a decision was made to build a new site using a content management system (cms) to make the site easily expandable and simple to update. this study tested the simple tasks to determine how to structure the information architecture and to reinforce the guiding principles of the redesigned website.

task successes and failures

the high success rates for finding books through the online library catalog and for finding library hours on the redesigned website reinforced our guiding principles of understandable terminology and clear navigational systems. krug contends that navigation educates the user on the site’s contents through its visible hierarchy. the result is a site that guides the user through their options and instills confidence in the website and its designers.28 we found this to be true in the way our users easily found the hours and catalog links on the prototype of our library website. the users on the old site knew where to look for this information because they were accustomed to how to navigate the old site. given that the prototype was a complete departure from the navigation and design of the old site, it was crucial that the labels and links were clear and understandable in the prototype or our design would fail. we made “hours” the first link under the “about” heading and “cuny+/books” the first link under the “find” heading, and as a result both our terminology and our structure were a success with participants.
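the task-level figures in table 1 are easier to compare when the change between the old and new site is computed for each task. the following is a minimal python sketch and not part of the original study; the names `TABLE_1` and `point_change` are illustrative assumptions, and the percentages are copied from table 1.

```python
# Success rates (percent) copied from Table 1: task -> (old site, new site).
# TABLE_1 and point_change are illustrative names, not from the original study.
TABLE_1 = {
    "find a book using online library catalog": (80, 86),
    "find library hours": (100, 100),
    "get help from a librarian using questionpoint": (40, 93),
    "find a journal article": (20, 66),
    "find reference materials": (0, 7),
    "find journals by title": (40, 66),
    "find circulation policies": (60, 53),
    "find books on reserve": (80, 73),
    "find magazines by title": (0, 73),
    "find the library staff contact information": (60, 100),
    "find contact information for the branch libraries": (40, 100),
}


def point_change(rates):
    """Return {task: new - old}, the change in percentage points."""
    return {task: new - old for task, (old, new) in rates.items()}


changes = point_change(TABLE_1)

# Print tasks from the largest gain to the largest drop.
for task, delta in sorted(changes.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{delta:+4d}  {task}")
```

on these figures, the largest gains are for finding magazines by title (+73 points) and branch library contact information (+60 points), while circulation policies and books on reserve each show a 7-point drop.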
on the old website, users rarely used the libraries’ online chat client. despite our efforts to remind students of its usefulness, the old website didn’t place the link in a sufficiently visible location on the home page. in the old site, only 40 percent of participants located the link, as it was on the bottom left of the screen and easy to overlook. on the new site, by contrast, the “ask a librarian” link was prominently featured on the top of the screen, and 93 percent of participants located it. these results upheld the guiding principles of solid information architecture and understandable terminology. they also supported nielsen’s assertion that “site design must be aimed at simplicity above all else, with as few distractions as possible and with a very clear information architecture and matching navigation tools.”29 as a result of the launch of the redesigned site, the use of the questionpoint chat client has more than doubled.

finding a journal article on a topic was always problematic for users of the old library website. the participants we tested were familiar with the site, and 80 percent erroneously clicked on “journal title list” when the more appropriate link would have been “databases” if they didn’t have an exact journal title in mind. although we taught this in our information literacy courses, it was challenging getting the information across. in order to address this on the new site, “databases” was changed to “databases/articles” and categorized under the heading “find.” the participants using the new site had greater success with the new terminology; 66 percent correctly chose “databases/articles.” this question revealed an inconsistency with the guiding principles of understandable terminology and clear navigation systems on the old site.
these issues were addressed by adding the word “articles” after “databases” on the new site to clarify what resources could be found in a database and also by placing the link under the heading “find” to further explain the action a student would be taking by clicking on the “databases/articles” link.

finding reference materials was challenging for the users of the old site, as none of the participants clicked on the intended link “subject guides.” in an effort to increase usage of the research guides, the library not only purchased the libguides tool, but also changed the wording of the link to “topic guides.” as we neared the end of our study we observed that only one participant knew to click on the “topic guides” link for research assistance. the participants suggested calling it “research guides” instead of “topic guides” and we changed it. unfortunately, the usability study had been completed and we were unable to further test the effectiveness of the rewording of this link. anecdotally, the rewording of this link appears to be more understandable to users, as the research guides are getting more usage (based on hit counts) than the previous guides. the rewording of these guides adhered to both principles of understandable terminology and user-centered design. these results supported nielsen’s assertion that the most important material should be presented up front, using the inverted pyramid principle. “users should be able to tell in a glance what the page is about and what it can do for them.”30 our results also supported the hhs report, which states that terminology “plays a large role in the user’s ability to find and understand information. many terms are familiar to designers and content writers, but not to users.”31 we concluded that rewriting the link based on student feedback reduces the use of unfamiliar terminology.
although librarians are “subject specialists” and “subject liaisons” and are familiar with those labels and that terminology, our students were looking for the word “research” instead of “subject,” so they were not connecting with the library’s libguides.

as previously discussed, users of the old site thought the link “journal title list” would give them access to the library’s database holdings. when participants were asked to find a specific journal title, the correct answer on the old site was “journal title list,” and only 40 percent of the participants answered correctly. in another terminology change on the new site, both links were placed under the heading “find,” and, after testing of the first prototype, “journal title list” was changed to “list of journals and magazines.” in the following tests 66 percent of the participants were able to answer correctly.

the difference in success rates for finding circulation policies between the old site and the prototype site was slight, only 7 percentage points. this can be attributed to the fact that participants on the old site could click on multiple links to get to the correct page, and they were familiar enough with the site to know that. in the prototype of the site there were several paths as well, some direct, some indirect. testing the wording of this link supported the understandable terminology principle, more so than the old website’s “library policies” link, yet to be true to our user-centered design principle, we needed to reword it once more. therefore, after the test was completed and the website was launched, we reworded the link to “checkout policies,” which utilizes the same terminology that users are familiar with because they check out books at our checkout desk.
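because each percentage in table 1 comes from a small group, it can help to translate the rates back into approximate participant counts. the sketch below assumes that five participants tested the old site (round one) and fifteen tested the new drafts (rounds two through four), as described in the method section. the function name `approx_count` is hypothetical, and the counts are estimates reconstructed from the reported percentages, not figures stated by the authors.

```python
OLD_SITE_N = 5   # round one: five participants tested the old site
NEW_SITE_N = 15  # rounds two to four: three rounds of five participants


def approx_count(percent, n):
    """Estimate how many of n participants a reported percentage represents."""
    return round(percent * n / 100)


# Circulation policies: 60% on the old site, 53% on the new site (Table 1).
old_count = approx_count(60, OLD_SITE_N)
new_count = approx_count(53, NEW_SITE_N)

print(f"old site: about {old_count} of {OLD_SITE_N} participants")
print(f"new site: about {new_count} of {NEW_SITE_N} participants")
```

read this way, the 7-point gap for circulation policies corresponds to roughly 3 of 5 participants on the old site versus 8 of 15 on the new one, which is consistent with describing the difference as slight.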
the remaining tasks, which consisted of locating information such as books on reserve, magazines by title, library staff contact information, and branch information, were, with the exception of books on reserve (80 percent versus 73 percent), met with higher success rates in the prototype site because in the redesign process the links were reworded to support the understandable terminology and user-centered design principles.

participant feedback: qualitative

the usability testing process informed the redesign of our website in many specific ways. if the layout of the site didn’t test well with participants, we planned to create another prototype. in their evaluation of colorado state universities libraries’ digital collections and the western waters digital library websites, zimmerman and paschal describe the importance of first impressions of a website as the determining factor of whether users return to a website; if it is positive they will return and continue to explore.32

when given an opportunity to give feedback on what they thought of the design of the website the participants commented:

• “there were no good library links at the bottom before and there wasn’t the ask a librarian link either which i like a lot.”
• “the old site was too difficult to navigate, new site has a lot of information, i like the different color schemes for the different things.”
• “it is contemporary and has everything i need in front of me.”
• “cool.”
• “helpful.”
• “straightforward.”
• “the organization is easier for when you want to find things.”
• “interactivity and rollovers make it easy to use.”
• “intuitive, straight-forward and i like the simplicity of the colors.”
• “more professional, more aesthetically pleasing than the old site.”
• “the four menu options (about, find, services, help) break the information down easily.”

additional research conducted by nathan, yeow, and murugesan claims attractiveness (referring to aesthetic appeal of a website) is the most important factor
in influencing customer decision-making and affects the usability of the website.33 not only that, but users feel better when using a more attractive product. fortunately, the feedback from our participants revealed that the website was visually appealing, and the navigation scheme was clear and easy to understand.

other changes made to the libraries’ website because of usability testing

participants commented that they expected to find library contact information on the bottom of the homepage, so the bottom of the screen was modified to include this information as well as a “contact us” link. participants did not realize that the “about,” “find,” “services,” and “help” headings were also links, so we modified them so they were underlined when hovered over. there were also adjustments to the gray color bars on the top of the page because participants thought they were too bright, so they were darkened to make the labels easier to read. participants also commented that they wanted links to various public libraries in new york city under the “quick links” section of the homepage. we designed buttons for brooklyn public library, queens public library, and the new york public library and reordered this list to move these links closer to the top of the “quick links” section.

conclusion

conducting a usability study of hunter college libraries’ existing website and the various stages of the redesigned website prototypes was instrumental in developing a user-centered design. approaching the website redesign in stages, with guidance from iterative user testing and influenced by the participants’ comments, gave the web librarian and the web committee an opportunity to incorporate the findings of the usability study into the design of the new website.
rather than basing design decisions on assumptions of users’ needs and information seeking behaviors, we were able to incorporate what we’d learned from the library literature and the users’ behavior into our evolving designs. this strategy resulted in a redesigned website that, with continued testing, user feedback, and updating, has aligned with the guiding principles we developed at the onset of the redesign project. the one unexpected outcome from this study is that we discovered that despite how well a library website is designed, users will still need to be educated in how to use the site with an emphasis on developing strong information literacy skills.

references

1. “the digital information seeker: report of the findings from selected oclc, rin, and jisc user behaviour projects,” oclc research, ed. lynn silipigni-connaway and timothy dickey (2010): 6, www.jisc.ac.uk/publications/reports/2010/digitalinformationseekers.aspx.
2. judith emde, lea currie, frances a. devlin, and kathryn graves, “is ‘good enough’ ok? undergraduate search behavior in google and in a library database,” university of kansas scholarworks (2008), http://hdl.handle.net/1808/3869; julia gross and lutie sheridan, “web scale discovery: the user experience,” new library world 112, no. 5/6 (2011): 236, doi: 10.1108/03074801111136275.
3. ibid, 238.
4. jakob nielsen, designing web usability (indianapolis: new riders, 1999), 198.
5. ibid, 134.
6. ibid, 97.
7. barbara j. cockrell and elaine a. jayne, “how do i find an article? insights from a web usability study,” journal of academic librarianship 28, no. 3 (2002): 123, doi: 10.1016/s0099-1333(02)00279-3.
8. steve krug, don’t make me think! a common sense approach to web usability, 2nd ed. (berkeley, ca: new riders, 2006), 135.
9. tom ipri, michael yunkin, and jeanne brown, “usability as a method for assessing discovery,” information technology & libraries 28, no. 4 (2009): 181, doi: 10.6017/ital.v28i4.3229.
10. brenda battleson, austin booth, and jane weintrop, “usability testing of an academic library web site: a case study,” journal of academic librarianship 27, no. 3 (2001): 189–98, doi: 10.1016/s0099-1333(01)00180-x.
11. ibid.
12. jakob nielsen and marie tahir, “keep your users in mind,” internet world 6, no. 24 (2000): 44.
13. steve krug, don’t make me think! a common sense approach to web usability, 135.
14. research-based web design and usability guidelines, ed. ben schneiderman (washington: united states dept. of health and human services, 2006), 190.
15. laura manzari and jeremiah trinidad-christensen, “user-centered design of a web site for library and information science students: heuristic evaluation and usability testing,” information technology & libraries 25, no. 3 (2006): 163, doi: 10.6017/ital.v25i3.3348.
16. battleson, booth, and weintrop, “usability testing of an academic library web site,” 190.
17. ibid.
18. carole a. george, “usability testing and design of a library website: an iterative approach,” oclc systems & services 21, no. 3 (2005): 178, doi: 10.1108/10650750510612371.
19. laura cobus, valeda dent, and anita ondrusek, “how twenty-eight users helped redesign an academic library web site,” reference & user services quarterly 44, no. 3 (2005): 234–35.
20. susan mcmullen, “usability testing in a library web site redesign project,” reference services review 29, no. 1 (2001): 13, doi: 10.1108/00907320110366732.
21. ibid.
22. john bauer-graham, jodi poe, and kimberly weatherford, “functional by design: a comparative study to determine the usability and functionality of one library’s web site,” technical services quarterly 21, no. 2 (2003): 34, doi: 10.1300/j124v21n02_03.
23. cockrell and jayne, “how do i find an article?,” 123.
24. maaike van den haak, menno de jong, and peter jan schellens, “retrospective vs. concurrent think-aloud protocols: testing the usability of an online library catalogue,” behavior & information technology 22, no. 5 (2003): 339.
25. battleson, booth, and weintrop, “usability testing of an academic library web site,” 192.
26. dominique turnbow et al., “usability testing for web redesign: a ucla case study,” oclc systems & services 21, no. 3 (2005): 231, doi: 10.1108/10650750510612416.
27. cobus, dent, and ondrusek, “how twenty-eight users helped redesign an academic library web site,” 234.
28. krug, don’t make me think! 59.
29. nielsen, designing web usability, 164.
30. ibid., 111.
31. schneiderman, research-based web design and usability guidelines, 160.
32. don zimmerman and dawn bastian paschal, “an exploratory evaluation of colorado state universities libraries’ digital collections and the western waters digital library web sites,” journal of academic librarianship 35, no. 3 (2009): 238, doi: 10.1016/j.acalib.2009.03.011.
33. robert j. nathan, paul h. p. yeow, and sam murugesan, “key usability factors of service-oriented web sites for students: an empirical study,” online information review 32, no. 3 (2008): 308, doi: 10.1108/14684520810889646.

appendix a.
hunter college libraries’ old website

appendix b. hunter college libraries’ new website

appendix c. test participant profiles
participant | sex | academic standing | major | library instruction session? | how often in the library
1 | female | senior | history | yes | every day
2 | female | sophomore | psychology | no | every day
3 | male | junior | nursing | no | 1/week
4 | female | junior | studio art | no | 5/week
5 | female | senior | accounting | yes | 2–3/week
6 | male | freshman | undeclared | yes | 1/week
7 | female | freshman | undeclared | no | every day
8 | male | senior | music | yes | 3–4/week
9 | male | freshman | physics/english | no | every day
10 | female | senior | english lit/media studies | no | 1/week
11 | female | junior | fine arts/geography | yes | 2–3/week
12 | male | sophomore | computer science | yes | every day
13 | male | sophomore | econ/psychology | yes | 6 hours/week
14 | female | senior | math/econ | yes | 2–3/week
15 | female | senior | art | yes | every day
16 | male | n/a* | pre-nursing | no | daily
17 | female | senior** | econ | didn’t remember | 3/week
18 | male | senior | pre-med | yes | 2/week
19 | female | grad | art history | yes | 3/week
20 | male | grad | education (tesol) | no | every day
note: *this student was at hunter fulfilling prerequisites and already had a bachelor of arts degree from another college. **this student had just graduated.

appendix d. test questions/tasks
• what is the first thing you noticed (or looked at) when you launched the hunter libraries homepage?
• what’s the second?
• if your instructor assigned the book to kill a mockingbird what link would you click on to see if the library owns that book?
• when does the library close on wednesday night?
• if you have a problem researching a paper topic and are at home, where would you go to get help from a librarian?
• where would you click if you needed to find two journal articles on “homelessness in america”?
• you have to write your first sociology paper and wanted to know what databases, journals, and web sites would be good resources for you to begin your research. where would you click?
• does hunter library subscribe to the e-journal journal of communication?
• how long can you check out a book for?
• how would you find items on reserve for professor doyle’s liibr100 class?
• does hunter library have the latest issue of rolling stone magazine?
• what is the e-mail for louise sherby, dean of libraries?
• what is the phone number for the social work library?
• you are looking for a guide to grammar and writing on the web, does the library’s webpage have a link to such a guide?
• your friend is a hunter student who lives near brooklyn college. she says that she may return books she borrowed from the brooklyn college library to hunter library. is she right? where would you find out?
• this website is easy to navigate (agree, agree somewhat, disagree somewhat, disagree)?
• this website uses too much jargon (agree, agree somewhat, disagree somewhat, disagree)?
• i use the hunter library’s website (agree, agree somewhat, disagree somewhat, disagree)?

president’s message: rebuilding our identity, together
bohyun kim
information technology and libraries | september 2018

bohyun kim (bohyun.kim.ois@gmail.com) is lita president 2018-19 and chief technology officer & associate professor, university of rhode island libraries, kingston, ri.

ital is the official journal of lita (library and information technology association), and if you are a reader of the ital journal, it is highly likely that you are a member of lita and/or one who is deeply interested in library technology. it is my pleasure to write this column to update all of you about the exciting discussion that is currently underway in lita and two other ala divisions, alcts (association for library collections and technical services), and llama (library leadership and management association).
as many of you know, lita began discussing the potential merger with two other ala divisions, alcts and llama, last year.1 what initially prompted the discussion was the prospect of continuing budget deficits in all three divisions. but the resulting conversation has proved that financial viability is not the entire story of the change that we want to bring about. at the 2018 ala annual conference in new orleans, the three boards of lita, alcts, and llama held a joint meeting open to members and non-members alike to solicit and share our collective thoughts, suggestions, concerns, and hopes about the potential three-division realignment. at this meeting attended by approximately 75 people, participants expressed their support for creating a new division with the following key elements.
• retain and build upon the best elements of each division.
• embrace the breakdown of silos and positive risk-taking to better collaborate and move our profession forward.
• build a strong culture of innovation, energy, and inspiration.
• be more transparent, responsive, agile, and less bureaucratic.
• excel in diversity, equity, and inclusion.
• support members in all stages of their careers, those with the least means to travel for in-person participation, in particular.
• provide member-driven interactions and good value for the membership fee.
these ideas have made it clear that members of all three divisions see the goal of realignment as something much more fundamental than financial sustainability. they have validated the shared belief among the lita, alcts, and llama boards that the ultimate goal of realignment is to create a division that better serves and benefits members, not to simply recover the division’s financial health. while the criteria for the success of a new combined division received almost unanimous endorsement at the meeting, opinions about how to realize such success varied.
there were understandable concerns associated with combining three small-sized associations into one large one. for example, how will we reconcile three distinctly different cultures in lita, alcts, and llama? how will the new association ensure that it is more transparent, responsive, and nimble than the individual divisions prior to the merger? could the larger size of the new division make it more difficult for small groups with special interests to get needed support for their programs? many requested that the leadership of the three divisions provide more specific vision and details.

https://doi.org/10.6017/ital.v37i3.10703

as a group, the leaders of lita, alcts, and llama are committed to hashing out those details. with the aim of providing, at the 2019 midwinter conference, fuller information about what the new division would look like, we have already formed two working groups, one for finances and the other for communication, and are currently working to create two more, on operations and activities. these four teams will work closely together with the current leadership of lita, alcts, and llama to prepare the most important information about the proposed new division, so that the boards and the members of the three divisions can review it and provide feedback for needed adjustments. our goal is to present essential information that will allow the members to vote with confidence on the proposal to form one new division on the ala ballot in the spring of 2019. if the membership vote passes, then we will be taking the proposal to the ala committee on organization for finalization.

on this occasion, i would also like to bring everyone’s attention to an inherent tension between two ideas that many of us hold as association members regarding alignment. one is that more member involvement in determining alignment-related details at an early stage is essential to the success of the new division.
the other is that we can decide whether we will support the new division or not, only after the leadership first presents us with a clear, specific, and detailed picture of what the new division will look like. the problem is that we cannot have both at the same time. as members, if we want to be involved at an early stage of reorganization, we will have to accept that there will be no ready-made set of clear and specific details about the division waiting for us to simply say yes or no. we will be required to work through our collective ideas to decide on those details ourselves. it will be a messy, iterative, and somewhat confusing process for all of us. there is no doubt that this will be hard work for both the lita leadership and lita members. but it is also an amazing opportunity. imagine a new division, where (a) innovative ideas and projects are shared and tested through open conversation and collaboration among library professionals in a variety of functional areas such as systems and technology, metadata and cataloging, and management and administration, (b) frank and inspiring dialogues take place between front-line librarians and administrators about vexing issues and exciting challenges, and (c) new librarians learn the ropes, are supported throughout their careers going through changes in their responsibilities as well as areas of specialization, are mentored to be future leaders, and get to develop the next generation of leaders as they themselves achieve their goals. furthermore, i believe that the process of building this kind of new association from the ground up will be a truly rewarding experience. we had an opportunity to discuss and share our collective hope and vision for the new division at the joint meeting, and that vision is an inspiring one: a division that is member-driven, nimble and responsive, transparent and inclusive, and not afraid to take risks. 
can we create a new association that breaks down our own silos and builds bridges for better communication and collaboration to move our profession forward?

my hope is that we can model and embody the change we want to see, starting in the reorganization process itself. if we want to build a new association that is inclusive, transparent, and nimble, we should be able to build such an association in precisely that manner: inclusively, transparently, and nimbly. if we are successful, our identity as members of this new division will be rebuilt as the very spirit and energy of continuing innovation, experimentation, and collaboration across different functional silos of librarianship, rather than as what we have in our job titles.

many lita members and ital readers are leaders in their field and care deeply about the continued success and innovation of lita and ital. i would like to invite all of you to participate in this effort of three-division alignment and to inform and lead our way together. while the boards of the three divisions are working on the proposal, there will be multiple calls for member participation. keep an eye out for new updates that will be posted in the ala connect community, “alcts/llama/lita alignment discussion” at https://connect.ala.org/communities/community-home?communitykey=047c1c0e-17b9-45b6a8f6-3c18dc0023f5. all information in this group site is viewable by the public. lita, alcts, and llama members can also join the group, post suggestions and feedback, and subscribe to updates.

where would you like lita to be next year, and the year after? let us take lita there, together.

endnote
1 andromeda yelton, “president’s message,” information technology and libraries 37, no. 1 (march 19, 2018): 2–3, https://doi.org/10.6017/ital.v37i1.10386.
a comprehensive approach to algorithmic machine sorting of library of congress call numbers
scott wagner and corey wetherington
information technology and libraries | december 2019

scott wagner (smw284@psu.edu) is information resources and services support specialist, penn state university libraries. corey wetherington (cjw36@psu.edu) is open and affordable course content coordinator, penn state university libraries.

abstract
this paper details an approach for accurately machine sorting library of congress (lc) call numbers which improves considerably upon other methods reviewed. the authors have employed this sorting method in creating an open-source software tool for library stacks maintenance, possibly the first such application capable of sorting the full range of lc call numbers. the method has potential application to any software environment that stores and retrieves lc call number information.

background
the library of congress classification (lcc) system was devised around the turn of the twentieth century, well before the advent of digital computing.1 consequently, neither it nor the system of library of congress (lc) call numbers which extends it was designed with any consideration of machine readability or automated sorting.2 rather, the classification was formulated for the arrangement of a large quantity of library materials on the basis of content, gathering like items together to allow for browsing within specific topics, and in such a way that a new item may always be inserted (interfiled) between two previously catalogued items without disruption to the overall scheme.
unlike, for instance, modern telephone numbers, isbns, or upcs—identifiers which pair an item with a unique string of digits having a fixed and regular format, largely irrespective of any particular characteristics of the item itself—lc call numbers specify the locations of items relative to others and convey certain encoded information about the content of those items. the library of congress summarizes the essence of the lcc in this way:

https://doi.org/10.6017/ital.v38i4.11585

the system divides all knowledge into twenty-one basic classes, each identified by a single letter of the alphabet. most of these alphabetical classes are further divided into more specific subclasses, identified by two-letter, or occasionally three-letter, combinations. for example, class n, art, has subclasses na, architecture; nb, sculpture, nd, painting; as well as several other subclasses.

each subclass includes a loosely hierarchical arrangement of the topics pertinent to the subclass, going from the general to the more specific. individual topics are often broken down by specific places, time periods, or bibliographic forms (such as periodicals, biographies, etc.). each topic (often referred to as a caption) is assigned a single number or a span of numbers. whole numbers used in lcc may range from one to four digits in length, and may be further extended by the use of decimal numbers. some subtopics appear in alphabetical, rather than hierarchical, lists and are represented by decimal numbers that combine a letter of the alphabet with a numeral, e.g., .b72 or .k535.

relationships among topics in lcc are shown not by the numbers that are assigned to them, but by indenting subtopics under the larger topics that they are a part of, much like an outline.
in this respect, it is different from more strictly hierarchical classification systems, such as the dewey decimal classification, where hierarchical relationships among topics are shown by numbers that can be continuously subdivided.3

as this description suggests, lcc cataloging practices can be quite idiosyncratic and inconsistent across different topics and subtopics, and sorting rules for properly shelf-ordering lc call numbers can be correspondingly complex, as we will see below.4

for the purposes of discussion in what follows, we divide lc call number strings into three principal substrings: the classification, the cutter, and what we will term the specification. the classification categorizes the item on the basis of its subject matter, following detailed schedules of the lcc system published by the library of congress; the cutter situates the item alongside others within its classification (often on the basis of its title and/or author5); and the specification distinguishes a specific edition, volume, format, or other characteristic of the item from others having the same author and title:

$\overbrace{\text{hc125}}^{a}\ \overbrace{\text{.g25313}}^{b}\ \overbrace{\text{1997}}^{c}$

in the above example, the classification string (a) denotes the subject matter (in this case, general economic history and conditions of latin america), the cutter string (b) locates the book within this topic on the basis of author and/or title (following a specific encoding process), and the specification string (c) denotes the particular edition of the text (in this case, by year). each of these general substrings may contain further substrings having specific cataloging functions, and though each is constructed following certain rigid syntactical rules, a great deal of variation in format may be observed within the basic framework.
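the three-part division described above can be illustrated with a small parsing sketch. this is not the authors' implementation: the pattern name `BASIC_SHAPE` and the function `split_call_number` are hypothetical, and the regular expression assumes only the simplest shape (class letters, a caption number, at most one cutter, and a trailing specification). dates or ordinals inside the classification and multiple cutters are deliberately out of scope.

```python
import re

# hypothetical pattern for the simplest call number shape only:
# class letters + caption number, an optional single cutter, then the rest.
BASIC_SHAPE = re.compile(
    r"^(?P<letters>[a-z]{1,3})"           # class/subclass letters
    r"(?P<caption>\d{1,4}(?:\.\d+)?)?"    # caption number, optionally with a decimal
    r"\s*(?P<cutter>\.[a-z]\d+)?"         # cutter: decimal point, one letter, digits
    r"\s*(?P<specification>.*)$"          # remainder, e.g. an edition year
)


def split_call_number(call_number):
    """Split a call number into classification, cutter, and specification."""
    match = BASIC_SHAPE.match(call_number.strip().lower())
    if match is None:
        raise ValueError(f"unrecognized call number: {call_number!r}")
    return {
        "classification": match["letters"] + (match["caption"] or ""),
        "cutter": match["cutter"],                       # None when absent
        "specification": match["specification"] or None,
    }


print(split_call_number("hc125.g25313 1997"))
print(split_call_number("b1190 1951"))
```

for the example in the text, the sketch separates `hc125`, `.g25313`, and `1997`; for a call number without a cutter, the cutter field is simply empty. anything beyond this basic shape would need the contextual handling discussed in the rest of the paper.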
the following is a non-exhaustive summary of the basic syntax of each of the three call number components:
• the classification string always begins with one to three letters (the class/subclass), almost always followed by one to four digits (the caption number), possibly including an additional decimal. the classification may also contain a date or ordinal number following the caption number.
• the beginning of the cutter string is always indicated by a decimal point followed by a letter and at least one digit. while the majority of call numbers contain a cutter, it is not always present. among the sorting challenges posed by lc call numbers, we note in particular the “double cutter”—a common occurrence in certain subclasses—wherein the cutter string changes from alphabetic to numeric, then back to alphabetic, and finally again to numeric. triple cutters are also possible, as are dates intervening between cutters. certain cutter strings (e.g., in juvenile fiction) end with an alphabetic “work mark” composed of two or more letters.
• the specification string (which may be absent on older materials) is always last, and usually contains the date of the edition, but may also include volume or other numbering, ordinal numbers, format/part descriptions (e.g., “dvd,” “manual,” “notes”), or other distinguishing information.

figure 1 shows example call numbers, all found within the catalog of penn state university libraries, suggesting the wide variety of possibilities.

figure 1. example call numbers.

as one might expect given this irregularity in syntax, systematic machine-sorting of lc call numbers is by no means trivial. to begin with, sorting procedures within the lcc system are to a certain degree contextual—that is, the sorter must understand how a given component of a call number operates within the context of the entire string in order to determine how it should sort.
both integer and decimal substrings appear in lc call numbers, so that a numeral may properly precede a letter in one part of a call number (a ‘1’ sorts before an ‘a’ in the classification portion, for example: h1 precedes ha1), while the contrary may occur in another part (within the cutter, in particular, an ‘a’ may well precede a ‘1’: hb74.p65a2 precedes hb74.p6512). similarly, letters may have different sorting implications depending on where and how they appear. compare, for instance, the call numbers v23.k4 1961 and u1.p32 v.23 1993/94. the v in the former denotes the subclass of general nautical reference works and simply sorts alphabetically, whereas the v in the latter call number functions in part as an indicator that the numeral 23 refers to a specific volume number and is to be sorted as an integer rather than a decimal. such contextual cues are often tacitly understood by a human sorter, but can present considerable challenges when implementing machine sorting procedures. furthermore, the lack of uniformity or regularity in the format of call number strings poses various practical obstacles for machine sorting. taken together, these assorted complexities suggest the insufficiency of a single alphanumeric sorting procedure to adequately handle lc call numbers as unprocessed, plain text strings. literature review a thorough review of information science literature revealed little formal discussion of the algorithmic sorting of lc call numbers. if the topic has been more widely addressed in the scholarly or technical literature, we were unable to discover it. nevertheless, the general problem appears to be fairly well known. this is evident both from informal online discussions of the topic (e.g., in blog posts, message board threads, and coding forums) and from the existence of certain features of library management system (lms) and integrated library system (ils) software designed to address the issue. 
in this section we examine methods proffered by some of these sources, and detail how each fails to fully account for all aspects of lc call number sorting.

example call numbers (figure 1):
b1190 1951              no cutter string
dt423.e26 9th.ed. 2012  compound specification
e505.5 102nd.f57 1999   ordinal number in classification
hb3717 1929.e37 2015    date in classification
kbd.g189s               no caption number, no specification
n8354.b67 2000x         date with suffix
ps634.b4 1958-63        hyphenated range of dates
ps3557.a28r4 1955       “double cutter”
pz8.3.g276lo 1971       cutter with “work mark”
pz73.s758345255 2011    lengthy cutter decimal

in a brief article archived online, conley and nolan outline a method for sorting lc call numbers through the use of function programming in microsoft excel.6 given a column of plain-text lc call numbers, their approach entails successive processing of the call numbers across several spreadsheet columns with the aim of properly accounting for the sorting of integers. the fully processed strings are then ultimately ready for sorting in the rightmost column using excel’s built-in sorting functionality. we note that conley and nolan’s method (hereafter “cnm”) only attempts to sort what the authors refer to as the “base call number” (i.e., the classification and cutter portions), and does not address the sorting of “volume numbers, issue numbers, or sheet numbers” (what we refer to here as the “specification”).7

cnm stems from the tacit observation that ordinary, single-column sorting of lc call numbers is clearly inadequate in an environment like excel’s. for instance, standard character-by-character sorting fails at the third character position for the pair pz30.a1 and pz7.a1, since pz30.a1 erroneously sorts before pz7.a1 (as 3 is compared to 7 in the third character position), contrary to the correct order (7 before 30).
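the failure just described is easy to reproduce. the snippet below is only an illustration, not part of cnm: python's built-in `sorted` stands in for any character-by-character sort, and `caption_integer` is a hypothetical helper that reads the first run of digits as a whole number.

```python
import re

shelf_order = ["pz7.a1", "pz30.a1"]   # correct order: caption 7 before caption 30

# plain string comparison inspects one character at a time, so '3' < '7' decides
plain_text_order = sorted(shelf_order)


def caption_integer(call_number):
    """Hypothetical helper: the first run of digits, read as a whole number."""
    return int(re.search(r"\d+", call_number).group())


# comparing the caption as an integer restores the expected order
integer_order = sorted(shelf_order, key=caption_integer)

print(plain_text_order)
print(integer_order)
```

the first list comes out reversed relative to shelf order, while the second keeps 7 ahead of 30, which is the behavior any normalization scheme has to reproduce.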
to address this, cnm normalizes the numeric portion of the class number with leading zeros so that each numeric string is of equal length, ensuring that the proper digits are compared during sorting. this entails a transformation,

pz30.a1 → pz0030.a1
pz7.a1 → pz0007.a1

following which the strings will in fact sort correctly in an excel column. this technique appears adequate until we compare call numbers having subclasses of different length:

p180.a1 → p0180.a1
pz30.a1 → pz0030.a1

here, while standard excel sorting will in fact properly order the resulting strings, in other applications, depending on the sorting hierarchy employed, sorting may fail in the second position if letters are sorted before numbers. hierarchy aside, it is not difficult to see the potential issues that may arise from sorting unlike portions of the call number string against one another in this way, particularly when comparing characters within the cutter string or in situations involving a “double cutter.” for instance, the call numbers b945.d4b65 1998 and b945.d41 1981b are listed here in their proper sorting order, but are in fact sorted in reverse by cnm when, in the eighth character position, 1 is sorted before b in accordance with excel’s default sorting priority. this again illustrates an essential problem of character-by-character sorting: in certain substrings we require a letters-before-numbers sorting priority, while in others a numbers-before-letters order is needed. this impasse makes clear that no single-column sorting methodology can succeed for all types of lc call numbers without significant modification to the call number string.

in a blog post, dannay observed that cnm does not account for certain call number formats, particularly those of legal materials within the k classification having 3-letter class strings.8 (the same would also be true in the d classification, where 3-letter strings also appear.)
although minor modification of portions of the function code (e.g., replacing certain ‘2’s with ‘3’s) would be sufficient to alleviate this particular issue, dannay proposed instead to employ placeholder characters to normalize the classification string and avoid instances of alphabetic characters being compared against numeric ones. dannay’s method (dm) normalizes various parts of the classification string, including the subclass, caption, and decimal portions:

q171.t78 → q**0171.0.t78
qa9.r8 → qa*0009.0.r8

(here, of course, it is imperative that the chosen placeholder character sort before all letters in the sorting hierarchy.) dm thus successfully avoids the issue of comparing classification strings of unequal length or format.

nevertheless, despite the improvements of dm over cnm, both approaches are ultimately unable to properly process certain types of common lc call numbers. for example, call numbers with dates preceding the cutter (e.g., gv722 1936.h55 2006) and call numbers without cutters (e.g., b1205 1958) both result in errors, as do those containing the aforementioned “double cutters.” furthermore, as we previously noted, neither dm nor cnm was designed to handle any portion of the specification string following the cutter, where the presence of ordinal and volume-type numbering is commonplace. hence neither method is able to properly order the quite ordinary pair of call numbers ac1.g7 v.19 and ac1.g7 v.2, since the first digits of the two volume numbers are compared character-by-character (1 against 2), so that v.19 is placed before v.2, resulting in a mis-sort. though neither dm nor cnm is ultimately comprehensive (nor designed to be), both methods contain valuable insights and strategies that inform our own approach to the problem.

software review
available software solutions for sorting lc call numbers appear to be nearly as scant as literature on the subject.
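to make the limitation noted at the end of the literature review concrete, here is a minimal sketch that imitates the output format of the two dm examples quoted there (placeholder-padded class letters, zero-padded caption, explicit decimal). it is an approximation written for illustration only, not dannay's spreadsheet formulas; the names `DM_SHAPE` and `placeholder_normalize` are hypothetical, and the `*` placeholder works here only because it sorts before letters in python's default string ordering.

```python
import re

# hypothetical pattern: class letters, caption integer, optional caption decimal, rest
DM_SHAPE = re.compile(r"^([a-z]{1,3})(\d{1,4})(?:\.(\d+))?(.*)$")


def placeholder_normalize(call_number):
    """Pad the classification so that like positions line up for string sorting."""
    match = DM_SHAPE.match(call_number.strip().lower())
    if match is None:
        raise ValueError(f"unsupported call number: {call_number!r}")
    letters, caption, decimal, rest = match.groups()
    # 'q' -> 'q**', '171' -> '0171', missing decimal -> '0'
    return f"{letters.ljust(3, '*')}{caption.zfill(4)}.{decimal or '0'}{rest}"


print(placeholder_normalize("q171.t78"))
print(placeholder_normalize("qa9.r8"))

# the specification is left untouched, so volume numbers still compare
# character by character and v.19 lands ahead of v.2
volumes = sorted(placeholder_normalize(c) for c in ["ac1.g7 v.2", "ac1.g7 v.19"])
print(volumes)
```

the classification now sorts as intended (q before qa, 9 before 171 within a class), but the last line shows the volume mis-sort surviving the normalization, which is why the specification needs its own treatment.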
while github contains a handful of programs that attempt to address the problem, we found none which could be considered comprehensive. table 1 is a summary of those programs we discovered and were able to examine.

the “sqlite3-lccn-extension” program is an extension for sqlite 3 which provides a collation for normalizing lc call numbers, executing from a sqlite client shell. we discovered several limitations in its ability to sort certain call number formats similar to those discussed above in the literature review. for instance, the program cannot correctly sort specification integers (e.g., it sorts v.13 before v.3) or call numbers lacking cutter strings (e.g., it sorts b 1190.a1 1951 before b 1190 1951). we found similar issues with “js-loc-callnumbers,” a javascript program with a web interface into which a list of call numbers can be pasted. the program transforms the call numbers into normalized strings, which are then sorted and displayed to the user. however, we observed that it does not account for dates or ordinal numbers in the classification string, nor can it correctly sort call numbers lacking caption numbers.9

program and author | app-type, interface | repository url | last update
“sqlite3-lccn-extension” by brad dewar | database extension, shell | https://github.com/macdewar/sqlite3-lccn-extension | dec. 2013
“js-loc-callnumbers” by ray voelker | javascript, web | https://github.com/rayvoelker/js-loc-callnumbers | feb. 2017
“library-of-congress-system” by luis ulloa | python tutorial, command line | https://github.com/ulloaluis/library-of-congress-system | sep. 2018
“lcsortable” by mbelvadi2 | google sheets script | https://github.com/mbelvadi2/lcsortable | may 2017
“library-callnumber-lc” by library hackers | perl, python | https://github.com/libraryhackers/library-callnumber-lc/tree/master/perl/library-callnumber-lc | dec. 2014
“lc_call_number_compare” by smu libraries | javascript, command line | https://github.com/smu-libraries/lc_call_number_compare | dec. 2016
“lc_callnumber” by bill dueber | ruby | https://github.com/billdueber/lc_callnumber | feb. 2015
table 1. list of github software involving lc call number sorting.

several of the programs are rather narrow in scope. the “lcsortable” script is a google sheets scheme for normalizing lc call numbers into a separate column for sorting, very much like cnm and dm. its normalization routine appears to conflate decimals and integers, though, leading to transformations such as

hf5438.5.p475 2001 → hf5438.0005.p04752001

which would clearly result in a great deal of incorrect sorting across a wide array of lc call number formats. the command-line-based python program “library-callnumber-lc” processes a call number and returns a normalized sort key, but is not intended to store or sort groups of call numbers. it cannot adequately handle compound specifications or cutters containing consecutive letters (e.g., s100.bc123 1985), and does not appear to preserve the demarcation between a caption integer and caption decimal (i.e., the decimal point), thereby commingling integer and decimal sorting logic. lastly, “library-of-congress-system” is a tutorial/training program written in python that runs from the command line and supplies a list of call numbers for the user to sort. it does not draw call numbers from a static collection nor allow call numbers to be input by the user; rather, it randomly generates call numbers within certain parameters and following a prescribed pattern. as such, we were not able to satisfactorily test its sorting capabilities for the kind of use-case scenario under discussion.

we did not evaluate the remaining two github programs, “lc_call_number_compare” and “lc_callnumber,” as we could not get the former, a javascript es6 module, to execute, and as the latter, a ruby application which we did not attempt to install, evidently remains unfinished: its github documentation lists “normalization: create a string that can be compared with other normalized strings to correctly order the call numbers” as among the tasks yet to be completed.

in addition to these open resources, we examined lc sorting capability within the commercial lms/ils software we had at hand.
the marc (machine-readable cataloging) 21 protocol, a widely used international standard for formatting bibliographic data, provides a specific syntax for cataloging lc call numbers for the purposes of machine parsing.10 symphony workflows, the lms licensed by penn state university libraries from sirsidynix (and thus the only one available for our direct examination), contains within its search module a call number browsing feature which attempts to sort call numbers in shelving order via “shelving ids,” call number strings rendered from each item’s marc 21 “050” data field for sorting purposes. while these shelving ids are not visible within workflows (that is, they operate in the background), they can be accessed as plain text strings via bluecloud analytics, a separate, sirsidynix-branded data assessment and reporting tool peripheral to the lms. examination of these sort keys revealed integer normalization strategies similar to those of dm and cnm, with additional processing of volume-type numbering within the specification string. however, these shelving ids are similarly unable to correctly sort “double cutter” substrings and other syntactic complexities, such as ordinal numbers appearing in the classification. the following shelving id transformations of two call numbers in the penn state university libraries catalog, for instance, fail to properly account for the ordinal numbers which appear within the classification:

e507.5 36th.v47 2003 → e 000507.5 36th.v47 2003
e507.5 5th.c36 2000 → e 000507.5 5th.c36 2000

consequently, and as expected, these two call numbers sort incorrectly within workflows’ call number browsing panes.11

proposed parsing and sorting methodology
given the sorting difficulties inherent in the single-column approaches outlined above, we suggest a multi-column, tiered sorting procedure in which only like portions of the call number are compared to one another.
this requires the call number to be processed, its various components identified, and each component appropriately sorted according to its specific type. this, in turn, requires a sorting algorithm which can identify like substrings by scanning for specific patterns and cues. “shelf reading” is a term for the common practice of verifying the correct ordering of items filed on a library shelf, typically unaided by technology, and our approach is primarily informed by the kind of mental procedures one undertakes when performing such sorting “in one’s head.”12 perhaps the most significant component of this process involves recognizing and interpreting the role and logic of specific types of substrings and identifying their positions within the sorting hierarchy. the overall design of the lc classification, from class to subclass to caption, constitutes a left-to-right progression from general to specific, and the classification portion of a call number can be interpreted as a series of containers holding items of increasingly narrow scope, some of which may be empty (that is, absent). this creates a structure that has a linear, hierarchical aspect, but also contains within it subcategories that share a common position within the structure. the priority that a subcategory (or container) is afforded in the sorting process depends first upon its position in the linear hierarchy, and subsequently on the depth ascribed to it relative to other subcategories that share the same position. call numbers indicate a subcategory’s position in the linear dimension by including or expanding sections; its depth within a given position is encoded in the character or series of characters chosen to represent it.
thus, the sorting process may be regarded as a comparison of the paths that two call numbers denote through this structure, and the point at which the paths diverge is then the decisive point in determining an item’s position relative to others. this inflection point may occur at any juncture of the comparison, from the first character to the last. given these observations, a comprehensive machine-sorting strategy must observe the following provisions:

1. characters in call numbers should only be compared to characters that occupy an equivalent section of another call number. (“like compared to like.”)
2. within these designated sections, characters should only be compared to characters that occupy a corresponding position (place value) within that section.
3. if call numbers are identical up to the point that one of them lacks a section that the other call number possesses, the one with the “missing” section is ordered first. this is in keeping with the convention that items occupying a more general level in the hierarchy are ordered before those occupying a more specific one. (this principle is often summarized in shelf-reading tutorials as “nothing before something.”)
4. if call numbers are identical up to the point that one of them lacks a character in a given position within a particular section that the other call number possesses, the one missing the character is ordered first. again, this preserves the general-to-specific scheme of lcc sorting. (another instance of “nothing before something.”)
5. whole numbers (e.g., caption integers, volume numbers) must be distinguished from decimals. for character-by-character sorting to work in sections of the call number containing integers, the length of whole numbers must be normalized to ensure each digit is compared to another of equal place value.
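the five provisions above can be illustrated with a minimal python sketch. it is not the shelfreader code: it assumes each call number has already been split by hand into like sections, and the section layout, the function names pad_whole_number and compare_sections, and the six-digit padding width used here are illustrative assumptions.

```python
from itertools import zip_longest


def pad_whole_number(digits: str, width: int = 6) -> str:
    """Provision 5: left-pad a whole number so digits align by place value."""
    return digits.zfill(width)


def compare_sections(a: list[str], b: list[str]) -> int:
    """Compare two call numbers that are already split into like sections.

    Returns -1 if a files first, 1 if b files first, 0 if they are identical.
    """
    # Provision 1: only sections in the same position are compared.
    # A section that one call number lacks is filled with "" (provision 3).
    for sec_a, sec_b in zip_longest(a, b, fillvalue=""):
        if sec_a == sec_b:
            continue
        # Python compares strings position by position (provision 2), and a
        # string that runs out of characters first is the smaller one
        # (provision 4), so "" < "B" and "4" < "41".
        return -1 if sec_a < sec_b else 1
    return 0


# Hypothetical section layout: class, caption integer, caption decimal,
# cutter 1 letters, cutter 1 digits, cutter 2 letters, cutter 2 digits, rest.
double_cutter = ["B", pad_whole_number("945"), "", "D", "4", "B", "65", "1998"]
single_cutter = ["B", pad_whole_number("945"), "", "D", "41", "", "", "1981B"]

print(compare_sections(double_cutter, single_cutter))  # -1: B945.D4B65 files first
print(pad_whole_number("9") < pad_whole_number("17"))  # True; unpadded, "9" > "17"
```

the comparison stops at the first section that differs, so the second cutter of b945.d4b65 is never weighed against the trailing digit of d41.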
application of methodology

shelfreader is a software application designed by the authors to improve the speed and accuracy of the shelf-reading process in collections filed using the library of congress system; to our knowledge, it is the first application of its kind. it was coded by scott wagner in php and javascript, uses mysql for data storage and sorting, and is deployed as a web application. shelfreader allows the user to scan library items in the order they are shelved and receive feedback regarding any mis-shelved items. the program receives an item’s unique barcode identification via a barcode scanner, assembles a rest request incorporating the barcode, and sends it to an api connected to the lms. the application then processes the response, retrieving the title and call number of the item, along with information about the item’s status (for example, if it has been marked as lost or missing). the call number is passed off to the sorting algorithm, which processes it and assigns it a position among the set of call numbers recorded during that session. a user interface then presents a “virtual shelf” to the user displaying a graphical representation of the items in the order they were scanned. when items are out of place on the shelf, the program calculates the smallest number of moves needed to correct the shelf and presents the necessary corrections for the user to perform until the shelf is properly ordered. a screenshot depicting the shelfreader gui during a typical shelf-reading session is presented in figure 2.

figure 2. a screenshot of the shelfreader gui, showing an incorrectly filed item (highlighted in blue text) and its proper filing position (represented by the green band).
shelfreader’s sorting strategy consists of breaking call numbers into elemental substrings and arranging those parts in a database table so that any two call numbers may be compared exclusively on the basis of their corresponding parts. to this end, a base set of call number components was established. these are shown in table 2, along with their abbreviations (for ease in reference), maximum length, and corresponding mysql data types. the specific mysql data type determines the kind of sorting employed in each column:

• varchar accepts alphanumeric string data. sorting is character by character, numbers before letters.
• integer accepts numerical data; numbers are evaluated as whole numbers.
• decimal accepts decimal values. specifying the overall length of the column and the number of characters to the right of the decimal point has the effect of adding zeros as placeholders in any empty spaces to the right of the last digit. the values are then compared digit by digit.
• timestamp holds a date/time value that defaults to the date and time the entry is made. this orders call numbers that are identical (i.e., multiple copies of the same item) in the order they are scanned.

section, component                  abbreviation  max. length  mysql data type
classification
  class/subclass                    sbc           3            varchar
  caption number, integer part      ci            4            integer
  caption number, decimal part      cdl           16           decimal
  caption date                      cdt           4            varchar
  caption ordinal                   co            16           integer
  caption ordinal indicator         coi           2            varchar
cutter
  first cutter, alphabetical part   c1a           3            varchar
  first cutter, numerical part      c1n           16           decimal
  first cutter date                 cd            4            integer
  second cutter, alphabetical part  c2a           3            varchar
  second cutter, numerical part     c2n           16           decimal
  second cutter date                cd2           4            integer
  third cutter, alphabetical part   c3a           3            varchar
  third cutter, numerical part      c3n           16           decimal
specification
  specification                     sp            256          varchar
timestamp                           -             -            mysql timestamp

table 2. shelfreader call number components and data types.

when parsing a call number, it must be assumed that each call number may contain all of the components identified above. the following is a general outline of the parsing algorithm which processes the call number:

1. an array is created from the call number. each character, including spaces, is an element of the array.
2. a second array is then created to serve as a template for each call number, replacing the actual characters with ones indicating data type. for example, all integers are replaced with ‘i’s. this makes pattern matching and data-type testing simpler.
3. pattern matching is used to identify the presence or absence of landmarks such as cutters, spaces, volume-type numbering, etc.
4. when landmarks are identified, their beginning and ending positions in the call number string are noted.
5. component strings are created by looping through the appropriate section of the call number template, constructing a string in which the template characters are replaced by the actual characters in the call number string and continuing until a space, the end of the string, or an incompatible character is encountered.
6. where needed, whole-number strings are normalized to uniform length.
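the outline above can be sketched in python for the first three components only (sbc, ci, and cdl). this is a hedged illustration rather than the shelfreader implementation, which is written in php: the ‘a’ marker for letters, the regular expression used as the landmark pattern, and the function names make_template and leading_components are assumptions introduced here. only the ‘i’ marker for integers comes from the text.

```python
import re


def make_template(call_number: str) -> str:
    """Step 2: replace each character with a marker for its data type."""
    markers = []
    for ch in call_number:
        if ch.isdigit():
            markers.append("i")  # integer marker, as described in the outline
        elif ch.isalpha():
            markers.append("a")  # assumed marker for letters
        else:
            markers.append(ch)   # spaces and periods stay as landmarks
    return "".join(markers)


# Assumed landmark pattern: 1-3 class letters, 1-4 caption digits, and an
# optional decimal part introduced by a period.
LEADING = re.compile(r"(a{1,3})(i{1,4})(?:\.(i+))?")


def leading_components(call_number: str) -> dict[str, str]:
    """Steps 3-6 for the class/subclass and caption number only."""
    cn = call_number.upper()
    match = LEADING.match(make_template(cn))
    if match is None:
        raise ValueError(f"unrecognized call number: {call_number!r}")

    def actual(group: int) -> str:
        # Steps 4-5: map the matched template span back to the real characters.
        start, end = match.span(group)
        return "" if start == -1 else cn[start:end]

    # Step 6: the caption integer is padded to the 4 places listed in table 2.
    return {"sbc": actual(1), "ci": actual(2).zfill(4), "cdl": actual(3)}


print(make_template("E169.1.B634 2002"))
print(leading_components("e169.1.b634 2002"))
```

because the template has exactly one marker per character, a span found in the template indexes directly into the call number string. cutters, ordinals, and the specification would need further patterns of the same kind.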
dividing a call number into its component parts and placing those parts in separate columns in a database table, then, effectively creates a sort key that may be used for ordering. this key occupies a row of the table, and is an inflated representation of the call number insofar as it makes use of the maximum possible string length of each component type. it contains the characters of each component the call number possesses, and any empty columns serve as placeholders for components it does not possess. when two call numbers are compared, sorting proceeds through each successive column, each component (and each character within each component) serving as a potential break point within the sorting process. we note that every column (with the exception of the specification) contains exclusively alphabetic or numeric data, so that numbers and letters are never compared in those sections of the call number string. (the use of spaces in the specification string effectively accounts for the mixed alphanumeric data type.) some additional points of clarification regarding the algorithm’s multi-column approach to sorting are worth mentioning:

1. any lowercase alphabetic characters are converted to uppercase before processing in order to ensure that letter case does not affect sorting.
2. components are arranged in the database table from left to right in the order they occur in the call number.
3. if a call number does not contain a given component, the column is left empty (in the case of a varchar column) or is assigned a zero value (in the case of numeric columns).
4. empty columns and zero columns sort before columns containing data.
5. in columns designated as varchar columns, numbers are compared as whole numbers. this means that, in order to sort correctly, the length of any number stored must be normalized to a uniform length (6 places) by adding leading zeros. for example, 17 must be normalized to “000017.”
6. sorting proceeds column by column, provided the call numbers are identical up to that point. when the first difference is encountered, sorting is complete.

table 3 shows two randomly selected call numbers of rather common configuration, along with the corresponding sort keys created by shelfreader:

e169.1.b634 2002
e169.1.b653 1987

sbc ci cdl cdt co coi c1a c1n cd c2a c2n cd2 c3a c3n sp
e 0169 0.10000 0 b 0.6340000000 0002002
e 0169 0.10000 0 b 0.6530000000 0001987

table 3. example shelfreader sort-key processing of two similar call numbers.

in this first example, sorting is complete when 3 is compared to 5 in the first numerical cutter (c1n) column. (note that we have here truncated the length of certain strings for space and readability.) to illustrate how the application handles call numbers having heterogeneous formats, table 4 shows the sort keys created from two call numbers in an example mentioned above, one with a “double cutter” and one without:

b945.d4b65 1998
b945.d41 1981b

sbc ci cdl cdt co coi c1a c1n cd c2a c2n cd2 c3a c3n sp
b 0945 0.0 0 d 0.400000 b 0.650000 0001998
b 0945 0.0 0 d 0.410000 0.000000 0001981b

table 4. shelfreader sort-key processing of a “double cutter” call number and a nearby, single cutter call number.

by pushing the second cutter (b65) in the first call number into the c2a and c2n columns, the issue of comparing incompatible sections of the call number is avoided, as the 1 in the second call number is compared to the placeholder 0 in the first. when the sorting routine reaches this position, it terminates, and any subsequent characters are ignored. aspects of this multi-column approach may seem counterintuitive at first, but the method mimics what we do when we order call numbers mentally.
one compares two call numbers character by character within these component categories until encountering a difference, or until a character or entire category in one of the call numbers is found to be absent.

results

shelfreader’s sorting method is powerful and accurate, and has been extensively tested without issue in a number of different academic libraries within penn state’s statewide system. the application accurately sorts all valid lc call numbers (with the exception of those for certain cartographic materials in the g1000 – g9999 range, which sometimes employ a different syntax and sorting order) as well as those of the national library of medicine classification system (which augments lcc with class w and subclasses qs – qz) and the national library of canada classification (which adds to lcc the subclass fc, for canadian history). while there may conceivably be valid lc or lc-extended call numbers having exotic formats that would fail to correctly sort in shelfreader, we are not aware of any examples (outside of, once again, the g1000 – g9999 range), nor have we received reports of any from users. in addition to verifying proper shelf-ordering, shelfreader contains a number of other features useful for stacks maintenance. the program can identify shelved items that are still checked out to patrons, have been marked missing or lost, or are flagged as in transit between locations, and often reveals items which have been inadvertently “shadowed” (i.e., excluded from public-facing library catalogs) or have shelf labels which do not match their catalogued call numbers. the gui has different modes to accommodate the user’s preferred view (both single-shelf and multi-shelf stacks views), and allows for a good deal of flexibility in how and when the user wishes to make and record shelf corrections.
a reports module is also included, which tracks shelving statistics and other useful information for later reference. the shelfreader application code (including the full sorting algorithm) is freely available under an mit license at https://github.com/scodepress/shelfreader. while shelfreader was developed and tested using the collections and systems of penn state university libraries, its architecture could be adapted and configured for use with other library apis and adjusted to suit local practices within the general confines of the lc call number structure.13 we can also envision a wide array of potential applications of the sorting functionality within other software environments, and we welcome and encourage users to pursue innovative adaptations of the method.

references and notes:

1 leo e. lamontagne, american library classification: with special reference to the library of congress (hamden, ct: the shoe string press, 1961). the lengthy development of the lcc is described in detail in chapters xiii and xiv (pp. 221-51).
2 indeed, as lamontagne asserts, “the classification was constructed [ . . . ] to provide for the needs of the library of congress, with no thought to its possible adoption by other libraries. in fact, the library has never recommended that other libraries adopt its system . . . ” (ibid., p. 252). nevertheless, lcc is employed by the overwhelming majority of academic libraries in the united states (brady lund and daniel agbaji, “use of dewey decimal classification by academic libraries in the united states,” cataloging & classification quarterly 56, no. 7 (december 2018): 653-61, https://doi.org/10.1080/01639374.2018.1517851).
3 “library of congress classification,” library of congress, https://www.loc.gov/catdir/cpso/lcc.html. italics in original.
4 for a summary of lc sorting rules, see “how to arrange books in call number order using the library of congress system,” rutgers university libraries, https://www.libraries.rutgers.edu/rul/staff/access_serv/student_coord/libconsys.pdf. note that this summary is not comprehensive and does not cover all contingencies.
5 here we emphasize that our definition of the cutter string may differ from that of others, including (at times) that of the library of congress. for instance, the schedules for certain lcc subclasses regard the first portion of a cutter as part of the classification itself. since this paper concerns sorting rather than classification, we favor the simpler and more convenient definition.
6 j.f. conley and l.a. nolan, “call number sorting in excel,” https://scholarsphere.psu.edu/downloads/9cn69m421z.
7 conley and nolan, “call number sorting in excel.”
8 tim danny, “sorting lc call numbers in excel,” https://medium.com/@tdannay/sorting-lc-call-numbers-in-excel-75de044bbb04.
9 while there is in fact a “hack” or partial patch built into the program which identifies call numbers beginning with the subclass kbg and parses them separately, there is no general support for other call numbers in this category.
10 for the details of this syntax, see “050 library of congress call number (r),” library of congress, https://www.loc.gov/marc/bibliographic/bd050.html.
11 testing was conducted on sirsidynix symphony workflows staff client version 3.5.2.1079, build date june 5, 2017.
12 for an overview, see “student library assistant training guide: shelving basics,” florida state college at jacksonville, https://guides.fscj.edu/training/shelving.
13 shelfreader was written to receive real-time data directly from a sirsidynix api connected to penn state university libraries’ lms, a great improvement over drawing from a static collections database. this does, however, present a challenge for making the program easily adaptable to libraries using distinct web services. a strategy to adapt the program would need to account for potential differences in barcode structure, structure and naming conventions in the rest request, and structure and naming conventions within the server response from institution to institution. it is possible that these issues could be resolved via a configuration file made available to the user, but no attempt to address this issue has been undertaken as of yet.

article
digitization of libraries, archives, and museums in russia
heesop kim and nadezhda maltceva
information technology and libraries | december 2022
https://doi.org/10.6017/ital.v41i4.13783
heesop kim (heesop@knu.ac.kr) is professor, kyungpook national university. nadezhda maltceva (nadyamaltceva7@gmail.com) is graduate student, kyungpook national university. © 2022.

abstract

this paper discusses the digitization of cultural heritage in russian libraries, archives, and museums.
to achieve the research goals, both quantitative and qualitative research methodologies were adopted: a literature review was used to analyze the current status of legislative principles related to digitization, and a literature and website review was used to analyze the state of the latest digitization projects. the results showed that these institutions seem quite successful in providing a wide range of services for users to access the digital collections. however, the main constraints on digitization within libraries, archives, and museums in russia are connected with the scale of the work, dispersal of rare books throughout the country, and low level of document usage.

introduction

culture is one of the most important aspects of human activity. libraries, archives, and museums (lams) in the russian federation store some of the richest cultural and historical heritage collections, some of which can be classified as world cultural treasures. as is true in other countries, lams in russia are grappling with the problems of digitizing their unique cultural treasures. in this regard, these repositories are implementing digital technologies to carry out the digitization, preservation, indexing, search, and access of cultural heritage more effectively and efficiently. information technologies can be used to preserve national knowledge and experience.1 the digitization of cultural heritage is one of the changes that has occurred at the present stage of the global information society.
researchers have made many attempts to define the concept of digital culture, which is considered to be a phenomenon that manifests itself through art, creativity, and self-realization, by implementing information technologies.2 the need for digitization of unique cultural heritage has caused the rapid development of digital libraries, archives, and museums, described collectively as digital lams, the multidisciplinary institutions that change the way people retrieve and access information. researchers and specialists involved in the digitization of information resources in lams work together to preserve the cultural heritage of the russian federation using modern information technologies. as pronina noted, the digitization of cultural heritage began to develop actively in many countries, including russia, around the same time.3 many researchers have analyzed digitization issues in russia. for example, lopatina and neretin discussed the modernization of the system of cultural information resources and the history of preserving digital cultural heritage in russia.4 astakhova pointed out the problem of the digitization of cultural heritage and the transformation of art objects into 3d models.5 miroshnichenko et al. discussed the problem of organizing digital documents in the state archives and pointed out the issues of providing digitized archival documents for wide access through open electronic resources.6

despite a long history of improvements in digitization policies and programs, issues still exist in the major cultural repositories, and russia’s level and scope of digitization research are still lagging behind many european countries.7 therefore, three primary research questions guide this study:

1. what is the policy to regulate the digitization of cultural heritage in russia?
2. what is the status of the digitization of cultural heritage in russia?
3. what are the constraints related to digitization in russia?

in addition, there is not enough research that fully reflects the current activities of digitization practices in lams in russia. by analyzing this matter, the authors hope to present the state of cultural heritage digitization in russia and uncover problems and limitations in this field.

benefits of digitization in a cultural heritage repository

before answering the key research questions, it is worth exploring the ultimate benefits of digitization in cultural heritage repositories. digitization refers to converting an analogue source into a digital version.8 a large proportion of the collections related to cultural heritage repositories comprise not only the materials that are born digital, but many resources that are not originally created in digital form that have been digitized. digitization involves three major stages.9 the first stage is related to preparing objects for digitization and the actual process of digitizing them. the second stage is concerned with the processing required to make the materials easily accessible to users. this involves a number of editorial and processing activities including cataloguing, indexing, compression, and storage, as well as applying appropriate standards for text and multimedia file formats to meet the needs of online digital lams. the third stage includes the preservation and maintenance of the digitized collections and services built upon them.10 the benefits of digitization are improved access and preservation. items, once digitized, can be used by many people from different places simultaneously at any point in time. unlike printed or analogue collections, digitized collections are not damaged by heavy and frequent usage, which helps in the preservation of information. according to ifla’s guidelines, several benefits come from having digitized materials. organizations digitize

1. to increase access, where there is a high demand from users and the library or archive has the desire to improve access to a specific collection;
2. to improve services to an expanding user group by providing enhanced access to the institution’s resources with respect to education and life-long learning;
3. to reduce the handling and use of fragile or heavily used original material and create a backup copy for endangered material such as brittle books or documents;
4. to give the institution opportunities for the development of its technical infrastructure and staff skill capacity;
5. to develop collaborative resources, sharing partnerships with other institutions to create virtual collections and increase worldwide access;
6. to seek partnerships with other institutions to capitalize on the economic advantages of a shared approach; and
7. to take advantage of financial opportunities, for example the likelihood of securing funding to implement a program, or of a particular project being able to generate significant income.11

while digitization has benefits, there are also some problems. the most obvious one is related to the quality of the digitized objects. in the course of digitizing, we may lose some important aspects of the original document. another problem is related to access management. proper mechanisms need to be put in place to determine the authenticity of materials, as well as to control unauthorized access and use. the success of digitization projects depends not only on technology but also on project planning. since digitization is a relatively new process, institutions may concentrate on technology before deciding on a project’s purpose.
however, technology should never drive digitization projects; instead, user needs should be determined first, and only then should a technology appropriate to those needs be selected to meet a project’s objectives. best practices for planning a digitization project can be summarized as follows: determine the copyright status of the materials; identify the intended audience of the materials; determine whether it is technically feasible to capture the information; insist on the highest quality of technical work that the institution can afford; factor in costs and capabilities for long-term maintenance of the digitized images; cultivate a high level of staff involvement; write a project plan, budget, timeline, and other planning documents; budget time for staff training; plan a workflow based upon the results of scanning and cataloging a representative sample of material.12

policies regulating digitization of cultural heritage in russia

a digitization policy should be developed early, at the time of selection, so that it can guide both the suitability of selection and digital object management.
this policy should formulate goals of the digitization project, identify materials, set selection criteria, define the means of access to digitized collections, set standards for image and metadata capture and for preservation of the original materials and state the institutional commitment to the long-term preservation of digital content.13 as stated by russian law, the cultural heritage of the peoples of the russian federation includes material and spiritual values created in the past, as well as monuments and historical and cultural territories and objects significant for the preservation and development of identity of the russian federation and all its peoples, their contribution to world civilization.14 the decree of the president of the russian federation “on approval of the fundamentals of state cultural policy” extended the term of cultural heritage by including documents, books, photos, art objects, and other cultural treasures that represent the knowledge and ideas of people throughout the centuries. the government emphasized the role of the information environment and modern technologies by analyzing it at the legislative level. in the presidential decree “on approval of the fundamentals of state cultural policy,” the concept of the information environment is separately distinguished, defined as a set of mass media, radio, and television broadcasting, and the internet; the textual and visual information materials disseminated through them; as well as the creation of digital archives, libraries, and digitized museum collections.15 another important part of the government policy is to provide open access to cultural heritage objects. 
the problem of access was confirmed in the state program culture of russia (2012–2018), which stipulated the need to provide access to cultural heritage in digital forms as well as to create and support resources that provide access to cultural heritage objects on the internet and in the national electronic library, one of the main digital repositories in the country.16 access to digital cultural heritage was also considered in the state program information society (2011–2020). the subprogram information environment ensured equal access to the media environment, including objects of digital cultural heritage. the program aimed to reduce the gap in access to cultural heritage objects in different regions across the russian federation.17 the digitization of cultural heritage and creation of digital archives is one of the characteristics of innovative changes in the cultural sphere of the information society. the law “on archival affairs” notes that a significant part of the information resources of the archives has a historical and cultural value and should be considered as part of the digital cultural heritage collection, the digitization of which is required.18 with regard to libraries, on january 22, 2020, the state duma of the russian federation adopted the draft law “on amendments to the federal law on librarianship,” which concerns improving the procedure for state registration of rare books. rare books are defined as handwritten books or printed publications that have an outstanding spiritual or material value; have a special historical, scientific, or cultural significance; and for which a special regime for accounting, storage, and use has been established. the draft law aimed to ensure legal protection of rare books by improving the system of protection of the items of the national library.
the law reflects the criteria for classifying valuable documents as rare books and fixes the main stages of their registration. in the case of museums, a federal law from 1996 aimed to establish the national catalog of the russian federation museum collections. at first this national catalog was created for inventory purposes; it was later transformed into an online database to ensure open access to russia’s cultural heritage (http://kremlin.ru/events/administration/21027). annual reports “on the state of culture in the russian federation” reflect the overall situation and changes in libraries, archives, and museums. some researchers emphasized the need to develop a unified regulatory framework for cultural heritage preservation practices. in particular, shapovalova stressed that the leader in this discussion should be the government, which plays a crucial role in the legal regulation of cultural heritage policy and is responsible for the development of initiatives.19 however, lialkova and naumov objected that russian policy discusses the digitization of only a few cultural objects, does not define the legal status of such objects, and does not cover objects originally created in digital form.20 kozlova considered the issues of russian digital culture within the framework of the obligatory library copies system.21 since 1994, the national library of russia has accepted electronic media according to the federal law “on the obligatory copy of documents,” which established the legal deposit system; the bibliographic records of deposited electronic media are available online in the electronic catalog “russian electronic editions.” acquisitions librarians use this catalog as a national bibliographic resource for adding electronic editions to their collections.
dzhigo addressed issues of digital preservation of cultural heritage and also paid attention to the federal legal deposit law.22 yumasheva examined the russian regulatory and methodological documents that govern the digital copying of historical and cultural heritage held in russian libraries and museums.23 kruglikova considered theoretical and practical issues of legislation for the preservation and popularization of cultural heritage in the modern world.24 shapovalova suggested introducing the term “digital cultural heritage objects” at the legislative level, in order to recognize the concept of preserving cultural heritage and to provide virtual access to such objects on a larger scale.25
a review of the literature reveals various studies that discuss cultural heritage preservation using modern technologies. the majority of researchers identified issues in this field. digitization practices are carried out mainly by the state libraries, archives, and museums, which seek to preserve cultural heritage objects in a methodologically and legislatively sound way; less development is seen in smaller local lams. researchers stress the value of preserving cultural materials and the need to analyze and improve legislative procedures. the government now recognizes the importance of digital preservation; however, the term “digital cultural heritage” is not mentioned in legislation and the legal status of digitized objects is not defined. in addition, legislative documents do not cover the regulation of objects originally created in digital format. moreover, there is a large gap between the accumulation of materials and the degree of their use, even though the government seems to support open access to digital cultural heritage objects.
digitization projects of cultural heritage in russia
to analyze the circumstances of the latest projects related to digitization, we investigated the relevant websites from may 2021 to june 2022. in this study, we chose a few representative institutions, including some national projects, based on their reputation, authority, and the scope of their collections, and collected data on their digitization practices and current projects. the list of institutions is shown in table 1. the authors selected the russian national library, national electronic library, russian state library, and presidential library as the largest and most well-known libraries in russia. among the archives chosen for the analysis, the archival fonds was selected because it unites the archives in russia in one system, and the national digital archive was selected because its main goal is to preserve and archive key russian digital resources. as for the museums, the state hermitage museum, the state russian museum, and the state museum of fine arts named after a. s. pushkin were chosen for this study because they hold the richest collections of russian cultural heritage and play a vital role in the replenishment of the national catalogue of the russian federation museum collections, the main goal of which is to unite museums across the country. by analyzing the websites of these selected libraries, archives, and museums, we can gain insight into what projects have been undertaken to preserve cultural heritage and what the main drawbacks in this field are. however, some institutions do not share the latest information on digitized items. in the case of libraries and archives, the numbers are published on the websites, but it is difficult to establish exactly when the objects were digitized. not all museums share information about recently digitized objects; in such cases, counting the digitized items available online is the only way to analyze digitization practices quantitatively.
therefore, the authors used a manual method for data collection and counted the number of digitized materials available on the website. this is one of the limitations of this work: some institutions do not disclose the exact size of their digitized collections; for some institutions, digitized copies could not be counted manually because of the huge amounts of data; and some websites may not be up to date.

table 1. institutions responsible for digitization of cultural heritage in russian lams

| type | name | size | description |
|---|---|---|---|
| libraries | russian national library, http://nlr.ru/eng/ra2403/digital-library | 650,000 scanned copies | as of the beginning of 2019, the digital library included scanned copies of books, magazines, newspapers, music publications, graphic materials, maps, audio recordings, and more. the scanned materials include items from the national library of russia and from partner libraries, publishing organizations, authors, and readers. |
| libraries | national electronic library, https://rusneb.ru | 1,700,000 digitized books26 | the nel project was designed to provide internet users with access to digitized documents from russian libraries, museums, and archives. nel combines rare books and manuscripts, periodicals, and sheet music collected from all major russian libraries. |
| libraries | russian state library, https://www.rsl.ru | 1,500,000 documents | this is the largest public library in russia; the digital collection contains copies of valuable and most requested publications, as well as documents originally created in electronic form. the electronic catalog contains information on more than 21 million publications, 1.5 million of which have been digitized. |
| libraries | presidential library, https://www.prlib.ru/en | 1,000,000 units | the presidential library is a nationwide electronic repository of digital copies of the most important documents of the history of russia. the volume of the presidential library collections is more than a million storage units including digital copies of books and journals, archival documents, audio and video recordings, photographs, films, dissertation abstracts, and other materials. |
| archives | archival fonds of russia (central fonds catalog), https://cfc.rusarchives.ru/cfc-search/ | 959,576 archival fonds27 | annually, the volume of documents of the archival fonds of the russian federation increases by an average of 1.7 million units. as of december 13, 2020, the central fonds catalog included 959,576 items from 13 federal archives and 2,225 state and municipal archives of the russian federation. |
| archives | national digital archive, https://ruarxive.org | 282 websites28 | the purpose of this initiative is to find and preserve websites and other digital materials of high public value and at risk of destruction. the nda project collects official accounts on social networks, official websites of government bodies and political parties, and historical data. however, not many websites were collected in comparison with other countries’ initiatives. unlike the internet archive, the nda project makes a complete copy of everything that is on the site, including archive channels on twitter, instagram, and telegram. |
| museums | national catalogue of the russian federation museum collections, https://goskatalog.ru/portal/#/ | 23,193,078 units | the catalog is an electronic database containing basic information about each museum item and each museum collection included in the museum fonds of the russian federation. according to the latest statistics (2020), over 23 million units were recorded in the national museum catalog. however, the total amount of museum objects across russia is more than 84 million. |
| museums | state hermitage museum, https://www.hermitagemuseum.org | 400,000 units | the state hermitage museum is the second largest museum in the world. the hermitage exposition is gradually moving online. this process is slow and very laborious. the entire collection of the hermitage has not been digitized, but the website already contains 400,000 exhibits (that is, only about one tenth of the entire collection). the online collection includes paintings, sculptures, numismatics, archaeological finds, and other exhibits. |
| museums | state russian museum, https://www.rusmuseum.ru/collections/ | 3,682* | this is the world’s largest museum of russian art. the collection of the museum has about 400,000 exhibits and covers all historical periods of russian art. at the moment only a small part of the collection is available in digitized form on the museum website. however, the museum is maintaining the virtual state russian museum branch project, the main goal of which is to give free access to digital and printed materials from other institutions online. |
| museums | state museum of fine arts named after a. s. pushkin, https://pushkinmuseum.art | 334,000 | as of march 1, 2019, the museum’s database contained information on 670,000 museum items, 334,000 (49%) of which have images. in total there are about 683,000 images in the database (not counting special photography) with a volume of about 35 tb. |

* the number of digitized items was manually counted on the website.
figure 1. screenshots of the websites of some of the institutions listed in table 1 (panels: russian national library, national electronic library, russian national digital archive, state hermitage museum).
a further analysis of russian museums shows that 2,773 state and municipal museums hold more than 84 million items, but only a small share is available in digital form. biryukova et al. reviewed the interdisciplinary approach to preserving cultural heritage and creating virtual museums.29 povroznik also analyzed virtual museums that preserve the cultural heritage of the russian federation.30 the author concluded that virtual museums and their resources need further study, development, and improvement. kondratyev et al. considered the issues of digital heritage preservation from the security, integrity, and accessibility perspectives, and analyzed the concept of a smart museum.31 lapteva and pikov described the experience of students of the institute for the humanities of siberian federal university working with the state russian museum and the state hermitage museum, leading russian museums that play an important role in the country’s digitization practices.32 the authors noted that implementing modern information technologies in museums creates a comfortable infrastructure for the audience by preserving and representing cultural heritage in interactive contexts.
findings
digitization in russian libraries
creating a digital collection has become a normal library activity in russia.33 within the framework of the main activities to preserve russian library collections from 2011 to 2020, one of the main programs of the national library is the digitization of rare books. rare books, according to the federal law “on amendments to the federal law on librarianship,” include handwritten books or printed publications that have outstanding material value or special historical, scientific, and/or cultural significance.34 thus, the law elevated the book to the same level of protection as other objects of cultural heritage at the national level. the website of the register of rare books (https://knpam.rusneb.ru), hosted by the russian state library, became a part of the national library collection preservation program developed in 2001. from 2001 to 2009, the subprogram rare books of the russian federation was created to provide a regulatory framework and methodological support for all areas of library activities related to the preservation of library collections. this program includes not only libraries but also other institutions such as museums, archives, and scientific and educational institutions. however, in order to implement the state registration of rare books, it is necessary to further develop regulatory documents that govern the procedures for classifying and registering rare books. another initiative for book preservation is the federal project digital culture, designed to provide citizens with wide access to the country’s unique cultural heritage. it was expected that ten to twenty libraries from different russian regions would take part in the digitization project, each offering at least 50 documents from their collections.
however, the problems of this program are related to the scale of the work, as well as the dispersal of rare books throughout the country. as the 2011–2020 library preservation report emphasizes, many of these rare books remain unknown to the wider scholarly community. approximately half of the valuable collections available in the country’s repositories are not described as integral objects of cultural and historical heritage. the russian state library noted that the main problems associated with rare books include the need for comprehensive identification and recording of rare and valuable books; ensuring the safety and security of the books; the special equipment required to copy valuable materials; and the need for proper storage as the most important condition for preservation. another main center of digitization is the digital library of the national library of russia (nlr, https://nlr.ru). the digital library is an open and accessible information resource that includes over 650,000 digitized copies of books, magazines, newspapers, music publications, graphic materials, maps, plans, atlases, and audio recordings. the digitized materials include items from the holdings of the national library of russia, partner libraries, publishing organizations, authors, and even readers. for now, the digital collection of the library includes various collections such as landmarks of the nlr, rare books, rossica, maps, and manuscripts. the national electronic library (nel, https://rusneb.ru/), hosted by russia’s national library, was launched in 2004 as an electronic library sponsored by the russian federal ministry of culture. the nel is a service that searches the full text of scanned books and magazines that have been processed using optical character recognition and converted into text. the content is stored in a digital database available through the internet and mobile applications.
one of the main tasks of the nel is the integration of the libraries of the russian federation into a single information network. as of june 15, 2022, the nel collection had a total of 5 million artifacts including electronic copies of books, educational and periodical literature, dissertations and abstracts, monographs, patents, notes, and visual and cartographic publications. the russian state library became the main operator of the national electronic library project in 2014. since 2015, the national library of russia has expanded its digitization program and the site now publishes a list of publications that require digitization. readers vote for publications directly on the site by clicking the vote for digitalization button. for example, as of november 2020, a list of 1,998 publications on a variety of topics ranging from physics and mathematics to psychology and music was available for voting.
digitization in russian archives
archives have historical, scientific, social, and cultural significance and are an essential part of the process of preserving russian cultural heritage. digitization projects in russia began as an element of the digital cataloging of the largest archives from the 1980s to the 1990s. initially, the main purpose of digitization was to create digital copies to ensure the preservation of original archival documents and to avoid issuing rare originals or originals in poor condition in the reading room. since then, digitization has become an integral part of creating digital archives in russia.35 currently, one of the main goals of digitizing archival documents is to give legal entities and individuals open access to archival documents of the russian federation.
the main archival center is the archives fond of the russian federation (http://archives.ru/af.shtml). the archives fond has more than 609 million items dating from the early eleventh century to the present and performs important functions: preserving historical memory, replenishing information resources, and providing access to the public. the main task of digitization is to preserve russia’s cultural and historical heritage. each year, the total volume of archives across russia increases by an average of 1.7 million items. despite the relatively small amount of equipment for digitization, we can still see progress. in 2015, 8,750 documents were digitized, while in 2019, the annual total reached 27,518 documents. this increase suggests that digital copy production is directly related to equipment acquisition. however, the researchers found that the level of use of these documents was not high and did not keep pace with production: in 2015 there were 18,155 document views, while in 2019 there were only 19,417. therefore, it is necessary not only to promote the services of the archive agency but also to increase the demand for archival documents. a portal was created under the auspices of the archives fond of the russian federation (http://www.rusarchives.ru) to promote archival services to users and to bring together all archives throughout russia. the portal collects information resources of russian archives on the internet and publishes archival directories and regulations. its establishment was an important breakthrough in organizing access to the documents of the archives fond of the russian federation. since 2012, the website has operated the central catalog software complex, which provides information on the composition of federal and regional digitized fonds. as reported by the federal archival agency, 32 virtual exhibition projects are posted on the official website and portals of the federal archival agency.
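the digitization and usage figures reported above for the archives fond can be compared directly. the following is a minimal python sketch, not part of the original study, that computes the relative change in each figure between 2015 and 2019. the views-per-document ratio is an illustrative assumption (annual views divided by documents digitized in the same year), not a metric used by the authors.

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, expressed in percent."""
    return (new - old) / old * 100.0


# Figures quoted in the text for the archives fond.
digitized = {2015: 8_750, 2019: 27_518}   # documents digitized per year
views = {2015: 18_155, 2019: 19_417}      # document views per year

digitized_growth = percent_change(digitized[2015], digitized[2019])
views_growth = percent_change(views[2015], views[2019])

# Illustrative ratio (an assumption): views per document digitized that year.
views_per_doc = {year: views[year] / digitized[year] for year in digitized}

print(f"digitized documents: {digitized_growth:+.1f}%")
print(f"document views:      {views_growth:+.1f}%")
for year, ratio in sorted(views_per_doc.items()):
    print(f"{year}: {ratio:.2f} views per digitized document")
```

under these assumptions, annual digitization output roughly tripled (about +214%) while views rose by only about 7%, so the number of views per newly digitized document fell from roughly 2.1 to roughly 0.7.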
this website provides information about online archive projects, including virtual exhibitions, digital collections, and inter-archive projects. users can search for materials on the website by three publication types: virtual exhibition, document collection, and inter-archive project. the projects cover four subjects: the great patriotic war, statehood of russia, soviet era, and space exploration. the federal archival agency’s main website also provides six catalogs and databases that guide users through digitized collections: the central stock catalog (http://cfc.rusarchives.ru/cfc-search), state register of unique documents (https://unikdoc.rusarchives.ru/), guides to russian archives (https://guides.rusarchives.ru/), electronic inventories of federal archives (https://rusarchives.ru/elektronnye-opisi-federalnyh-arhivov), database of declassified cases and documents of the federal archives (http://unsecret.rusarchives.ru/), and database on the places of storage of documents on personnel (http://ls.rusarchives.ru/). as of january 1, 2022, 859 documents were included in the state register of unique documents of the archival fund of the russian federation. a total of more than 98,000 documents are stored in the database. a project to digitize documents from the soviet era is still in progress, and the new collections of digitized copies of archival documents stored in federal archives across russia will be displayed on the website in the future (http://sovdoc.rusarchives.ru/#main). one of the major drawbacks of the digitization process in russia is that archival agencies and cultural heritage objects are scattered throughout russia. to develop digital archiving initiatives in different regions of russia, the culture of russia (2012–2018) program was developed.
archives of the constituent entities of the russian federation can take part in this program and get funding from the regional budget to digitize collections as a part of the regional program for the development of archival affairs.36 despite some improvements and ongoing projects, there are still no initiatives for the long-term preservation of born-digital materials and no requirements for mandatory long-term preservation of information. however, the national digital archive (https://ruarxive.org) was created to find and preserve websites and other digital materials that have a high public value and are at risk of destruction. this initiative proposes the general idea of archiving modern digital heritage and consists of many projects. the main one is preserved government, which aims to preserve official materials in the following areas: official accounts on social networks; official sites of government bodies, officials, and political parties; historical documents; and especially databases. future plans include developing tools that will help collect digital materials faster and more efficiently and also better systematize what has already been collected.
digitization in russian museums
the active introduction of information technology into museums began at the end of the twentieth century. a new area of study, museum informatics, has emerged in russian higher-education institutions. this area of study focuses on museum work and modern information technology to develop and improve museum activities.37 museums have developed many digitization projects to preserve their collections and give free and easy access to cultural heritage items. the modern russian museum system consists of about 2,773 museums, although the exact number of museums is not known.
since the 1970s, the rationale for russian museum digitization practices has been quite similar to that of many other countries: information and collection management are needed to ensure that museum objects are listed and properly preserved. the museums plan to create electronic collections, open valuable collections to the public, create a state catalog of the museum collection of the russian federation (https://goskatalog.ru/portal/#/), and integrate all works from all museums in russia. as of 2020, more than 23 million museum items are registered in the national catalog of the museum collection. the catalog is planned to be complete by 2026, when metadata and images of the museums’ collections are to be included in the register and posted online. digitization of museum collections is an important process that has recently received stable support from the government. the national information society (2011–2020) program includes a project to create a new virtual museum based on the collections of the country’s largest national museum. the term “virtual museum” is used to characterize various projects linked to digital technology in virtual and museum space.38 it can be represented by a collection of works of art on the internet and the publication of the museum’s electronic expositions. currently, there are about 300 virtual museums across the country (https://www.culture.ru/museums).
the most-visited museums are the state hermitage museum in st. petersburg (https://www.hermitagemuseum.org/), the state tretyakov gallery (https://www.tretyakovgallery.ru), and the state russian museum (http://en.rusmuseum.ru). these museums offer users a wide range of activities, including the use of modern technology. for example, since 2003 the state russian museum (the world’s largest museum of russian art) has been implementing the russian museum: virtual branch project, opening virtual branches in museums, universities, cultural centers, and institutions of additional education around the country. thanks to computer technology and digitization, thousands of russian residents in near and far places have access to the values of russian culture, russia’s historical and artistic past, and the richest collection of russian art. international business machines (ibm) collaborated with the hermitage museum to make it one of the most technologically advanced museums in the world. ibm built the state hermitage museum website in 1997, which was later called the “world’s best online museum” by national geographic traveler.39 the hermitage has unique experience in developing digitization programs and uploading collections to websites. currently, the museum holds more than 3 million items, and the online archives presented on its website provide easy search and the possibility of creating your own collection on the website. in 2020, the hermitage released a documentary feature film in virtual reality (vr) format, “vr—hermitage: immersion in history with konstantin khabenskiy” (https://www.khabenskiy.com/filmography-vr-hermitage-immersion-in-history-with-konstantin-khabenskiy/). visitors can tour the history of the hermitage in vr format, following the most important events from the eighteenth century to the present. the pushkin museum, the largest museum of european art in moscow, offers another example of using vr technology.
the joy of museums offers virtual tours of more than 60,000 museums and historic sites around the world, including the pushkin museum (https://joyofmuseums.com/museums/russian-federation/moscow-museums/pushkin-museum/). virtual museums can display electronic versions of exhibits for longer than physical exhibitions, which are limited in place and time, and can record information about past exhibitions, including electronic collections of exhibits as well as data on opening times and concepts. for example, the website of the state tretyakov gallery contains a virtual archive of past exhibitions. therefore, the virtual museum has considerable research potential and actively contributes to the preservation of cultural heritage. digital copies of original works of culture and art form an electronic archive that is of great value from two perspectives. on the one hand, it preserves rarities for future generations, gives users broad access to the rarest and most historically significant artworks, and opens possibilities for research. on the other hand, it is an opportunity to find commercial uses of artifacts, additional sponsorship, and investment proposals for museums.
conclusions and further study
the two most obvious benefits of digitization are improved access and preservation, so that libraries, archives, and museums can represent russian culture and introduce rare and unique cultural heritage artifacts to future generations.
in this work, we have addressed some legislative https://www.culture.ru/museums https://www.hermitagemuseum.org/ https://www.tretyakovgallery.ru/ http://en.rusmuseum.ru/ https://www.khabenskiy.com/filmography-vr-hermitage-immersion-in-history-with-konstantin-khabenskiy/ https://www.khabenskiy.com/filmography-vr-hermitage-immersion-in-history-with-konstantin-khabenskiy/ https://joyofmuseums.com/museums/russian-federation/moscow-museums/pushkin-museum/ https://joyofmuseums.com/museums/russian-federation/moscow-museums/pushkin-museum/ information technology and libraries december 2022 digitization of libraries, archives, and museums in russia | kim and malceva 14 principles and outlined major digitization projects. the general problem of digitization in russia is related to the size of works, the tendency of documents to decrease without high use, and the scatter of rare books nationwide. in the case of libraries, one of the problems of digitization is also related to the uneven distribution of rare books throughout the country. the most important materials are concentrated in the largest federal library, and many rare books are housed in many central libraries in various parts of the russian federation. work using book memorials should be planned as long-term activities performed at different levels. in the case of archives and museums, one of the major drawbacks of the digitization is the dismantling of national archives and cultural heritages. based on this preliminary study, there are several further research topics that can enhance understanding of digitization of cultural heritage in russia. in particular, since digitization is a complex process that requires both management and technology, future research needs to be divided into three aspects: management, technology, and content. endnotes 1 g. a. 
kruglikova, “use of information technologies in preservation and popularization of cultural heritage,” advances in social science, education and humanities research 437 (2020): 446–50. 2 g. m. shapovalova, “digital culture and digital heritage—doctrinal definitions in the field of culture at the stage of development of modern russian legislation. the territory of new opportunities” [in russian], the herald of vladivostok state university of economics and service 10, no. 4 (2018): 81–89. 3 l. a. pronina, “information technologies preserving cultural heritage. analytics of cultural studies,” 2008, https://cyberleninka.ru/article/n/informatsionnye-tehnologii-v-sohranenii-kulturnogo-naslediya/viewer. 4 n. v. lopatina and o. p. neretin, “preservation of digital cultural heritage in a single electronic knowledge space,” bulletin mguki 5, no. 85 (2018): 74–80. 5 y. s. astakhova, “cultural heritage in the digital age. human in digital reality: technological risks,” materials of the v international scientific and practical conference (2020): 204–6. 6 m. a. miroshnichenko, y. v. shevchenko, and r. s. ohrimenko, “preservation of the historical heritage of state archives by digitalizing archive documents” [in russian], вестник академии знаний 37, no. 2 (2020): 188–94. 7 inna kizhner et al., “accessing russian culture online: the scope of digitization in museums across russia,” digital scholarship in the humanities 19 (2019): 350–67, https://doi.org/10.1093/llc/fqy035. 8 s. d. lee, digital imaging: a practical handbook (new york: neal-schuman publishers, inc., 2001). 9 s. tanner and b. robinson, “the higher education digitisation service (heds): access in the future, preserving the past,” serials 11 (1998): 127–31; g. a.
young, “technical advisory service for images (tasi),” 2003, http://www.jiscmail.ac.uk/files/newsletter/issue3_03/; “preservation services,” harvard library, https://preservation.library.harvard.edu/digitization. 10 g. g. chowdhury and s. chowdhury, introduction to digital libraries (london: facet publishing, 2003), https://doi.org/10.1016/b978-1-84334-599-2.50006-4. 11 j. mcilwaine et al., “guidelines for digitization projects for collections and holdings in the public domain, particularly those held by libraries and archives” (draft) (unesco, march 2002), 6–7, https://www.ifla.org/wp-content/uploads/2019/05/assets/preservation-and-conservation/publications/digitization-projects-guidelines.pdf. 12 m. note, managing image collections: a practical guide (oxford: chandos publishing, 2011). 13 mcilwaine et al., “guidelines,” 51–52. 14 fundamentals of the legislation of the russian federation on culture, http://www.consultant.ru/document/cons_doc_law_1870/068694c3b5a06683b5e5a2d480bb399b9a7e3dcc/. 15 decree of the president of the russian federation of december 24, 2014 no. 808, on approval of the fundamentals of state cultural policy, http://kremlin.ru/acts/bank/39208. 16 v. zvereva, “state propaganda and popular culture in the russian-speaking internet,” in freedom of expression in russia’s new mediasphere, ed. mariëlle wijermars and katja lehtisaari (abingdon, oxon: routledge, 2020), 225–47, https://doi.org/10.4324/9780429437205-12. 17 s. l. yablochnikov, m. n. mahiboroda, and o. v.
pochekaeva, “information aspects in the field of modern public administration and law,” in 2020 international conference on engineering management of communication and technology (emctech), 1–5; u. chimittsyrenova, “a research proposal information society: copyright (presumption of access to the digital cultural heritage),” colloquium journal, no. 11-3 (2017): 22–24. 18 g. m. shapovalova, “information society: from digital archives to digital cultural heritage,” international research journal 5, no. 47 (2016): 177–81. 19 g. m. shapovalova, “the global information society changing the world: the copyright or the presumption of access to digital cultural heritage,” society: politics, economics, law, 2016. 20 s. b. lialkova and v. b. naumov, “the development of regulation of the protection of cultural heritage in the digital age: the experience of the european union,” информационное общество 1 (2020): 29–41. 21 e. kozlova, “russia’s digital cultural heritage in the legal deposit system,” slavic & east european information resources 12, no. 2-3 (2011): 188–91. 22 a. a. dzhigo, “preserving russia’s digital cultural heritage: acquisition of electronic documents in russian libraries and information centers,” slavic & east european information resources 14, no. 2-3 (2013): 219–23.
23 y. y. yumasheva, “digitizing russian cultural heritage: normative and methodical regulation,” bulletin of the ural federal university humanitarian sciences 3, no. 117 (2013): 2–7. 24 g. a. kruglikova, “use of information technologies in preservation and popularization of cultural heritage,” advances in social science, education and humanities research 437 (2020): 446–50. 25 g. m. shapovalova, “the concept of digital cultural heritage and its genesis: theoretical and legal analysis, the territory of new opportunities” [in russian], the herald of vladivostok state university of economics and service 9, no. 4 (2017): 159–68. 26 a. annenkov, “national electronic library of russia: it’s not yet on fire, but the time to save it is now” [in russian], http://d-russia.ru/nacionalnaya-elektronnaya-biblioteka-rossii-eshhyo-ne-gorela-no-spasat-uzhe-pora.html. 27 saa dictionary of archives terminology. a “fonds” is the entire body of records of an organization, family, or individual that have been created and accumulated as the result of an organic process reflecting the functions of the creator.
28 airtable, https://airtable.com/shro1hise7wgurxg5/tblhdxawiv5avtn7y. 29 m. v. biryukova et al., “interdisciplinary aspects of digital preservation of cultural heritage in russia” [in russian], european journal of science and theology 13, no. 4 (2017): 149–60. 30 n. povroznik, “virtual museums and cultural heritage: challenges and solution,” https://www.researchgate.net/profile/nadezhda-povroznik/publication/329308409_virtual_museums_and_cultural_heritage_challenges_and_solutions/links/5c00e5dba6fdcc1b8d4aa3b7/virtual-museums-and-cultural-heritage-challenges-and-solutions.pdf. 31 d. v. kondratyev et al., “problems of preservation of digital cultural heritage in the context of information security,” history and archives (2013): 36–51. 32 m. a. lapteva and n. o. pikov, “visualization technology in museum: from the experience of sibfu collaboration with the museums of russia,” journal of siberian federal university humanities & social sciences 7, no. 9 (2016): 1674–81. 33 g. a. evstigneeva, the ideology of digitization of library collections on the example of the russian national public library for science and technology, library collections: problems and solutions, 2014, http://www.gpntb.ru/ntb/ntb/2014/3/ntb_3_8_2014.pdf. 34 main directions of development of activities for the preservation of library collections in the russian federation for 2011–2020, https://kp.rsl.ru/assets/files/documents/main-directions.pdf. 35 g. m. shapovalova, “the concept of digital cultural heritage,” 159–68. 36 o. a. kolchenko and e. a. bryukhanova, “the main directions of archiving informatization in the context of electronic society development,” vestnik tomskogo gosudarstvennogo universiteta—tomsk state university journal 443 (2019): 114–18.
37 g. p. nesgovorova, “modern information, communication and digital technologies in the preservation of cultural and scientific heritage and the development of museums: problems of intellectualization and quality of informatics systems” (2006): 153–61, https://www.iis.nsk.su/files/articles/sbor_kas_13_nesgovorova.pdf. 38 n. g. povroznik, “virtual museum: preservation and representation of historical and cultural heritage,” perm university bulletin 4, no.
31 (2015): 2013–21. 39 the preservation of culture through technology, https://www.ibm.com/ibm/history/ibm100/us/en/icons/preservation/.
bibliographic retrieval from bibliographic input; the hypothesis and construction of a test frederick h. ruecking, jr.: head, data processing division, the fondren library, rice university, houston, texas a study of problems associated with bibliographic retrieval using unverified input data supplied by requesters. a code derived from compression of title and author information to four, four-character abbreviations each was used for retrieval tests on an ibm 1401 computer. retrieval accuracy was 98.67%. current acquisitions systems which utilize computer processing have been oriented toward handling the order request only after it has been manually verified. systems, such as that of texas a & i university (1), have proven useful in reducing certain clerical routines and in handling fund accounting (2). lack of a larger bibliographic data base and lack of adequate computer time have prevented many libraries from studying more sophisticated acquisitions systems. at the time the marc pilot project (3) was started, the fondren library at rice university did not have operating computer applications in acquisitions, serials, or cataloging. the university administration and the research computation center provided sufficient access to the ibm 7040 to permit the study of problems associated with bibliographic retrieval using input data which has varying accuracy.
in 1966, richmond expressed the concern of many librarians about the lack of specific statements describing the techniques by which on-line retrieval could be accomplished without complicating the problems presented by the current card catalog (4). she had previously described some of the problems created by the kind and quality of data being utilized as references by library users (5). 228 journal of library automation vol. 1/4 december, 1968 an examination of the pertinent literature indicates that most of the current work in retrieval, while related to problems of bibliographic retrieval, does not offer much assistance when the input data is suspect (6, 7, 8). tainiter and toyoda, for example, have described different techniques of addressing storage using known input data (9, 10). one of the best-known retrieval systems is that of the chemical abstracts service, which provides a fairly sophisticated title-scan of journal articles with a surprising degree of flexibility in the logic and term structure used as input. comparable systems are used by the defense documentation center, medlars centers, and nasa technology centers. these systems have one specific feature in common: a high level of accuracy in the input data. user-supplied bibliographic data the reliability of bibliographic data supplied to university libraries from faculty and students has long been questioned (5). any search system which accepts such data must be designed 1) to increase the level of confidence through machine-generated search structures and variable thresholds and 2) to reduce the dependence upon spelling accuracy, punctuation, spacing and word order. the initial task of formulating an approach to this problem is to determine the type, quality, and quantity of data generally supplied by a user.
to derive a controlled set of data for this purpose, the acquisition department of the fondren library provided xerox copies of all english language requests dated 1965 or later and a random sample of 295 requests was drawn from that file of 5000 items. this random sample was compared to the manually-verified, original order-requests to determine 1) the frequency with which data was supplied by the requestor and 2) the accuracy of the provided information. results of this study are given in table 1.
table 1. level of confidence in the input data
data elements   times given   times correct   accuracy   level of confidence
edition         295           294             99.6       99.6
title           295           292             99.0       99.0
author          290           264             91.0       82.7
publish.        268           218             81.3       73.9
date            265           215             81.1       72.8
the results suggest that edition can have great significance when specified and should be used as strong supporting evidence for retrieval. it should not necessarily be a restrictive element because of the low-order magnitude of actual specification, which was five times in the sample. (unstated editions were considered as first editions, and correct.)
bibliographic retrieval/ruecking 229
title is the most significant and most reliable element. as richmond indicates, use of the entire title for searching would present distinct problems for retrieval systems (4). consequently, an abbreviated version of the title must be derived from the input data which will reduce the impact and significance of the problems described by richmond (5). the hypothesis it is hypothesized that retrieval of correct bibliographic entries can be obtained from unverified, user-supplied, input data through the use of a code derived from the compression of author and title information supplied by the user. it is assumed that a similar code is provided for all entries of the data base using the same compression rules for main and added entry, title and added title information.
it is further hypothesized that use of weighting factors for individual segments of the code will provide accurate retrieval in those cases when exact matching does not occur. before the retrieval methodology can be described, it is necessary to outline the compression technique to be used with author and title words. title compression to gain some understanding of the problems to be faced in compressing title information, a random sample of 500 titles was drawn from the first half of the initial marc i reel (about 4800 titles). each of these titles was analyzed for significant words and tabulations were made on word strings and word frequencies. the following words were considered as non-significant: a, an, and, by, if, in, of, on, the, to. the tabulated data, shown in table 2, contain some surprising attributes. approximately 90% of the titles contain less than five significant words, which suggests that four significant words will be adequate to match on title.
table 2. significant word strings in titles
length of word string    1      2      3      4      5+      total
number of titles         42     151    179    76     52      500
percentage               8.4    30.2   35.8   15.2   10.4    100.0
cumulative percentage    8.4    38.6   74.4   89.6   100.0
letting n stand for the corpus of words available for title use, the random chance of duplicating any specific word in another title can be stated as $1/n$. when a string of words is considered, the chance of randomly selecting the same word string may be considered as $1/n^{a}$, where 'a' is the number of words in the string. certain words are used more frequently than others, and the occurrence of such words in a given string reduces the uniqueness of that string. the curve displayed in figure 1 shows the frequency distribution of words in the sample. the mean frequency of words in the title-sample is 1.33.
fig. 1. frequency distribution of words in sample (horizontal axis: frequency).
therefore, the chance of selecting an identical word string can be more accurately expressed as:
$$\frac{1.33^{a}}{n^{a}}$$
an examination of word lengths, as shown in table 3, shows that 95% of the significant title words contain less than ten characters. an examination of the word list revealed that some 70% of the title words contain inflections and/or suffixes. if these suffixes and inflections are removed, approximately 43% of the remaining word stems contain less than five characters and 59% contain less than six.
table 3. distribution of character length and stem length
length in characters   total words   different words   percent   stems   percent
1                      7             5                 0.5       5       0.8
2                      25            14                1.3       14      2.3
3                      87            48                4.6       48      7.9
4                      172           117               11.1      196     32.3
5                      229           163               15.5      92      15.2
6                      198           153               14.5      94      15.5
7                      202           159               15.3      64      10.6
8                      158           122               11.6      45      7.4
9                      121           102               9.7       15      2.5
10                     84            69                6.6       8       1.3
11                     54            48                4.6       7       1.2
12                     38            28                2.7       2       0.3
13                     14            12                1.1       2       0.3
14                     6             4                 0.4       0       0.0
15                     3             3                 0.3       0       0.0
16                     2             2                 0.2       0       0.0
summary                1400          1049                        592
the reduction of word length does affect the uniqueness of the individual word, merging distinct words into common word stems at a mean rate of 2.5 to 1.0. in table 3 the difference between 1049 words and 592 stems reflects the reduction of similar words into a common stem; for example: america, american, americans, americanism, etc., into amer. thus, the uniqueness of a string of title words is reduced to the following chance of duplication:
$$\frac{(2.5 \times 1.33)^{a}}{n^{a}} \quad \text{or} \quad \frac{3.3^{a}}{n^{a}}$$
an analysis of consonant strings made by dolby and resnikoff provides frequencies of initial and terminal consonant strings occurring in 7000 common english words (with suffixes and inflections removed) (11, 12, 13).
these frequency lists clearly show that the terminal string of consonants has considerable information-carrying potential in terms of word identification. the starting string also carries information potential, but significantly less than the terminal string. by combining the initial and terminal strings, it is possible to generate an abbreviation which has adequate uniqueness and reduces the influence of spelling. the high percentage of four-character word stems and the fact that the maximum terminal string contains four consonants suggest the use of a four-character abbreviation. to compress a title word into four characters, it is necessary to specify a set of rules. the first rule will be to delete all suffixes and inflections which terminate a title word. the second rule will be to delete vowels from the stem until a consonant is located or the four-character stem is produced. the suffixes and inflections deleted in this procedure are contained in table 4. when the stem contains more than four characters, the third compression rule states that the four-character field is filled with the terminal-consonant string and remaining positions are filled from the initial-character string.
table 4. deleted suffixes and inflections
-ic      -ive      -in      -et
-ed      -ative    -ain     -est
-aged    -ize      -on      -ant
-oid     -ing      -ion     -ent
-ance    -og       -ation   -ient
-ence    -log      -ship    -ment
-ide     -olog     -er      -ist
-age     -ish      -or      -y
-able    -al       -s       -ency
-ible    -ial      -es      -ogy
-ite     -ful      -ies     -ology
-ine     -ism      -ives    -ly
-ure     -urn      -ess     -ry
-ise     -ium      -us      -ary
-ose     -an       -ous     -ory
-ate     -ian      -ious    -ity
-ite
the relative uniqueness of the generated abbreviation can be calculated using the data supplied by dolby and resnikoff. for example, carter and bonk's building library collections would be abbreviated buld, libr, coct.
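the three compression rules can be sketched in python as follows. this is a minimal illustration under stated assumptions, not the program used in the study: the suffix list is only a subset of table 4, the repeated longest-suffix stripping and the minimum stem length of four are assumptions, and the names `strip_suffixes`, `compress`, and `title_code` are hypothetical.

```python
import re

# Assumption: a small subset of the Table 4 suffixes, enough for the examples.
SUFFIXES = ("ation", "ing", "ion", "ary", "ies", "ure", "ent",
            "ed", "es", "er", "an", "ic", "al", "ry", "s", "y")
NON_SIGNIFICANT = {"a", "an", "and", "by", "if", "in", "of", "on", "the", "to"}
VOWELS = set("aeiou")
MIN_STEM = 4  # Assumption: never strip a suffix if fewer than 4 letters remain.


def strip_suffixes(word):
    """Rule 1: repeatedly delete the longest suffix that terminates the word."""
    stem = word
    stripped = True
    while stripped:
        stripped = False
        for suffix in sorted(SUFFIXES, key=len, reverse=True):
            if stem.endswith(suffix) and len(stem) - len(suffix) >= MIN_STEM:
                stem = stem[:-len(suffix)]
                stripped = True
                break
    return stem


def compress(word):
    """Compress one title word into an abbreviation of at most four characters."""
    stem = strip_suffixes(word.lower())
    # Rule 2: drop trailing vowels until a consonant or a four-character stem.
    while len(stem) > 4 and stem[-1] in VOWELS:
        stem = stem[:-1]
    if len(stem) <= 4:
        return stem
    # Rule 3: terminal consonant string (at most four), padded from the front.
    tail = 0
    while tail < 4 and stem[-1 - tail] not in VOWELS:
        tail += 1
    return stem[:4 - tail] + stem[-tail:]


def title_code(title, max_words=4):
    """Abbreviate the first four significant words of a title."""
    words = [w for w in re.findall(r"[a-z]+", title.lower())
             if w not in NON_SIGNIFICANT]
    return [compress(w) for w in words[:max_words]]


print(title_code("Building Library Collections"))  # ['buld', 'libr', 'coct']
```

with these assumptions the sketch reproduces the abbreviations buld, libr, and coct given in the text; other words may need the full suffix table or different stripping limits.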
the random chance of duplicating any abbreviation can be stated as consisting of the product of the random chance of duplicating the initial string and the random chance of duplicating the terminal string:
$$\frac{f_i}{n_i} \times \frac{f_t}{n_t} \times 3.3^{2}$$
the frequencies listed by dolby and resnikoff may be substituted in the above equation producing the following chances for duplication:
$$\frac{324}{6800} \times \frac{63}{6800} \times 10.89 = \frac{1}{208} \quad \text{for buld}$$
$$\frac{288}{6800} \times \frac{1}{6800} \times 10.89 = \frac{1}{14745} \quad \text{for libr}$$
$$\frac{277}{6800} \times \frac{16}{6800} \times 10.89 = \frac{1}{1041} \quad \text{for coct}$$
the random chance of duplicating this string of three abbreviations can be calculated by multiplying the individual calculations, which yields the random chance of 1 in $32 \times 10^{8}$. this high uniqueness declines rapidly when the title contains less than three significant words and contains high frequency words, such as the title collected works, for which the same uniqueness calculation produces the random chance of 1 in $44 \times 10^{4}$. to increase the level of uniqueness on short titles, like collected works, it becomes necessary to provide supporting data to the title information. it is clear that the supporting data must come from supplied author text. author compression the same compression algorithms can be used for both personal and corporate names with some modifications. the frequent substitution of "conference" for "congress" and "symposia" for "symposium" suggests that meeting names should be considered as a secondary sub-set of non-significant words. names of organizational divisions, such as bureau, department, ministry, and office, can be considered as part of the same sub-set. the rules which govern the deletion of inflections, suffixes and vowels can be used for corporate names, but personal author names must be carried into the compression routine without modification. only the last name of an author would be compressed into a code.
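the duplication-chance arithmetic given earlier in this section for title abbreviations can be checked with a few lines of python. this is a sketch under stated assumptions: the function name `one_in` is hypothetical, the corpus size of 6800 and the string frequencies are the values quoted in the text for buld and libr, and the factor 3.3 is the rounded product of 2.5 and 1.33.

```python
CORPUS = 6800        # denominator used in the text for both string lists
MERGE_FACTOR = 3.3   # 2.5 (stem merging) x 1.33 (mean word frequency), rounded


def one_in(initial_freq, terminal_freq):
    """Return k such that the chance of duplicating an abbreviation is 1 in k."""
    chance = (initial_freq / CORPUS) * (terminal_freq / CORPUS) * MERGE_FACTOR ** 2
    return 1 / chance


buld = one_in(324, 63)  # close to the 1 in 208 given in the text
libr = one_in(288, 1)   # close to the 1 in 14745 given in the text
print(round(buld), round(libr))
```

multiplying the per-word chances gives the chance for the whole string, which is why a short title built from frequent words, such as collected works, is far less unique than a three-word title.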
constructing the test four, four-character abbreviations are allowed for title compression and four for author. rather than use a 32-character fixed field for these codes, the lengths of the input and main-base codes are variable, with leading control digits to specify the individual code sizes for the title and author segments. provision is made for the inclusion of date, publisher and/or edition in the search-code structure although these were not implemented in the test performed. at the time the input data is read, the existence of title, author, edition, publisher and date is indicated by the setting of indicators which control the matching mask and which, in part, control the specification of the retrieve threshold. the title indicator specifies the number of compressed words in the supplied title which must be matched by the base code. a simple algorithm is used to calculate the threshold values given in columns two through four of table 5. columns five through seven are obtained by adding two to the calculated thresholds. each agreement within the mask adds to a retrieve counter the values indicated in the last five columns of table 5, the values of x and y being the number of matching code words in the title and author segments respectively. conducting the test as mentioned above, the initial tests of the retrieve were based upon title and author matching exclusively and required three runs on the fondren library's 1401 computer. the first loaded 2874 original order-requests, generated a search code utilizing the rules specified in this paper and created an input tape. the second run extracted title and author data from the marc i data base and created multiple search codes for title, main entry, added title and added entry. both tapes were sorted into ascending search-code sequence. the final run was the search program which attempted to match input codes with the marc i base codes.
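the retrieve counter described above can be sketched in python as follows. the weights 4, 2, 3, 2, and 1 are the agreement values of table 5; everything else is an assumption made for illustration: code words are compared position by position, a hit is declared when the counter reaches the threshold, the threshold is passed in rather than derived from the mask, and the function names and sample codes are hypothetical. edition, publisher, and date are included only because the code structure provides for them; they were not used in the test performed.

```python
def matching_codes(input_codes, base_codes):
    """Count code words that agree position by position (an assumption)."""
    return sum(1 for a, b in zip(input_codes, base_codes) if a == b)


def retrieve_counter(inp, base):
    """Sum the agreement values: 4 per title code, 2 per author code,
    then 3, 2 and 1 for edition, publisher and date when both sides agree."""
    x = matching_codes(inp["title"], base["title"])
    y = matching_codes(inp["author"], base["author"])
    score = 4 * x + 2 * y
    for field, weight in (("edition", 3), ("publisher", 2), ("date", 1)):
        if inp.get(field) is not None and inp.get(field) == base.get(field):
            score += weight
    return score


def is_hit(inp, base, threshold):
    # Assumption: a record is retrieved when the counter reaches the threshold.
    return retrieve_counter(inp, base) >= threshold


# Hypothetical request and base record: three title codes agree, the
# author code differs by a transposed pair of letters.
request = {"title": ["amer", "busq", "show"], "author": ["zien"]}
record = {"title": ["amer", "busq", "show"], "author": ["zein"]}
print(retrieve_counter(request, record), is_hit(request, record, threshold=12))
```

because title agreements carry twice the weight of author agreements, three matching title codes alone reach a threshold of 12 even when the author code is misspelled.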
when there was agreement based on relationship of threshold and retrieve counter, the printer displayed threshold, short author and short title on one line, and retrieve value, input author and title on the next line as illustrated in figure 2. the printed results were compared to validate the accuracy of the retrieve. this comparison was cross-checked against the results of the acquisition department's manual procedures. the search program also provided for an attempt to match titles on the basis of a rearrangement of title words. in such attempts the retrieve threshold was raised. analysis of results the raw data obtained from this experimental run are shown in table 6. of the 2874 items represented in the input file, 48.4%, or 1392, were actually found to exist in the data base. of those actually present 90.4%, or 1200, were extracted with an overall accuracy of 98.67%. an examination of the sixteen false drops revealed several omissions in the compression routines for the input data and for the data base. one of the more significant omissions was failing to compensate for multi-character abbreviations, particularly 'st.' and 'ste.' for 'saint.' a subroutine for acceptance of such abbreviations added to the search-code generating program would increase the retrieve accuracy to 99%.
table 5. values for variable threshold
data given    threshold values,            threshold values,              agreement values
              full-code test               individual code test
t a e p d     3 or 4   2       1           3 or 4   2        1            title   author   edition   publish.   date
xy111         12       8+2y    4+2y        14       10+2y    6+2y         4x      2y       3         2          1
xy110         12       8+2y    4+2y        14       10+2y    6+2y         4x      2y       3         2          1
xy101         12       8+2y    4+2y        14       10+2y    6+2y         4x      2y       3         2          1
xy100         12       8+2y    4+2y        14       10+2y    6+2y         4x      2y       3         2          1
xy011         12       8+2y    4+2y        14       10+2y    6+2y         4x      2y       3         2          1
xy010         12       8+2y    4+2y        14       10+2y    6+2y         4x      2y       3         2          1
xy001         12       8+2y    4+2y        14       10+2y    6+2y         4x      2y       3         2          1
xy000         12       8+2y    4+2y        14       18+2y    6+2y         4x      2y       3         2          1
x0111         12       11      7           13       12       7            4x      2y       3         2          1
x0110         12       11      7           13       12       7            4x      2y       3         2          1
x0101         12       11      7           13       12       7            4x      2y       3         2          1
x0100         12       11      7           13       11       7            4x      2y       3         2          1
x0011         12       10      6           13       11       7            4x      2y       3         2          1
x0010         12       10      6           13       not permitted         4x      2y       3         2          1
x0001         12       9       5           13       not permitted         4x      2y       3         2          1
x0000         12       not permitted       not permitted
fig. 2. sample of retrieved citations (printer listing).
table 6. table of results
retrieve values   total hits   correct hits   false hits   percentage correct
6                 14           14             0            100
8                 0            0              0            0
10                311          311            0            100
12                264          248            16           93.3
14                232          232            0            100
16                118          118            0            100
18                260          260            0            100
20                1            1              0            100
totals            1200         1184           16           98.7
table 7. distribution of errors
                 title errors          author errors
no. of codes     error    spelling     lacking   error   spelling   other    total
1                2        3            10        12      27         4        58
2                2        6            17        26      60         23       134
3                0        0            0         0       0          0        0
4                0        0            0         0       0          0        0
total            4        9            27        38      87         27       192
the occurrence of titles with the words "selected" or "collected," etc., produced additional false drops when the title word string exceeded two words.
a modification to the search program to raise the threshold when the input data contain codes such as 'sect' and 'coct' would increase the retrieve accuracy to 99.17%. the presence of personal names in titles, such as 'charles evans hughes' and 'franklin delano roosevelt', caused seven additional false drops. at present it seems unlikely that a simple method to prevent them can be included. conclusion the experimental results indicate that the hypothesis suggested is valid. use of multiple codes for added entry and added title, in addition to the main entry and main title data, is clearly necessary. approximately 10% of the correctly retrieved items were produced by the existence of an added entry code. the influence of spelling accuracy was lessened by use of a compression technique. an inspection of extracted titles revealed the existence of 43 spelling errors which did not affect retrieval. thus, the search code reduced the significance of spelling by some 30%. utilizing table search followed by table look-up and linking random-access addresses should enable the search code approach to bibliographic retrieval to provide rapid, direct access to the title sought. acknowledgment this study was supported in part by national science foundation grants gn-758 and gu-1153 and by the regional information and communication exchange. the assistance of the acquisitions department staff, the research computation center staff and the staff of the fondren library's data processing division is gratefully acknowledged. references 1. morris, ned c.: "computer based acquisitions system at texas a & i university," journal of library automation, 1 (march 1968), 1-12. 2. wedgeworth, robert: "brown university library fund accounting system," journal of library automation, 1 (march 1968), 51-65. 3. u. s. library of congress: project marc, an experiment in automating library of congress catalog data (washington: 1967). 4.
richmond, phyllis a.: "note on updating and searching computerized catalogs," library resources and technical services, 10 (spring 1966), 155-160. 5. richmond, phyllis a.: "source retrieval," physics today, 18 (april 1965)' 46-48. 6. atherton, p.; yorich, j. c.: three experiments with citation indexing and bibliographic coupling of physics literature (new york, american institute of physics, 1962). 7. international business machines corporation: reference manual, index organization for information retrieval (ibm, 1961). 8. international business machines corporation: a unique computable name code for alphabetic account numbering (white plains, n.y.: ibm, 1960). 9. tainiter, m.: "addressing random-access storage with multiple bucket capacities," association for computing machinery journal, 10 (july 1963 ), 307-315. 10. toyoda, junichi; tazuka, yoshikazu; kasahara, yoshiro: "analysis of the address assignment problems for clustered keys," association for computing machinery journal, 13 (october 1966), 526-532. 11. dolby, james l.; resnikoff, howard l.: "on the structure of written english words," language, 40 (apr-june 1964), 167-196. 12. resnikoff, howard l.; dolby, james l.: "the nature of affixing in written english, part i," mechanical translation, 8 (march 1965), 84-89. 13. resnikoff, howard l.; dolby, james l.: "the nature of affixing in written english, part ii," mechanical translation, 9 (june 1966), 23-33. mobile technologies & academics: do students use mobile technologies in their academic lives and are librarians ready to meet this challenge? angela dresselhaus and flora shrode mobile technologies & academics | dresselhaus and shrode 82 abstract in this paper we report on two surveys and offer an introductory plan that librarians may use to begin implementing mobile access to selected library databases and services. 
results from the first survey helped us to gain insight into where students at utah state university (usu) in logan, utah, stand regarding their use of mobile devices for academic activities in general and their desire for access to library services and resources in particular. a second survey, conducted with librarians, gave us an idea of the extent to which responding libraries offer mobile access, their future plans for mobile implementation, and their opinions about whether and how mobile technologies may be useful to library patrons. in the last segment of the paper, we outline steps librarians can take as they “go mobile.”

purpose of the study

similar to colleagues in all types of libraries around the world, librarians at utah state university (usu) want to take advantage of opportunities to provide information resources and library services via mobile devices. observing the growing popularity of mobile, internet-capable telephones and computing devices, usu librarians assume that at least some users would welcome the ability to use such devices to connect to library resources. to find out what mobile services or vendors’ applications usu students would be likely to use, we conducted a needs assessment. the lessons learned will provide important guidance to management decisions about how librarians and staff members devote time and effort toward implementing and developing mobile access. we conducted a survey of usu’s students (approximately 25,000 undergraduates and graduates) to determine the degree of handheld device usage in the student population, the purposes for which students use such devices, and students’ interests in mobile access to the library. in addition, we surveyed librarians to learn about libraries’ current and future plans to launch mobile services.
this survey was administered to an opportunistic population comprised of subscribers to seven e-mail lists whom we invited to offer feedback. our goal was to develop an action plan that would be responsive to students’ interests. at the same time, we aim to take advantage of the growing awareness of and demand for mobile access and to balance workloads among the library information technology professionals who would implement these services.

angela dresselhaus (aldresselhaus@gmail.com) was electronic resources librarian, flora shrode (flora.shrode@usu.edu) is head, reference & instruction services, utah state university, logan, utah.

information technology and libraries | june 2012 83

usu is utah’s land-grant university and the merrill-cazier library is its primary library facility on the home campus in logan, utah. while usu has had satellite branches for some time, a growing emphasis on expanding online and distance education courses and degree programs has resulted in a considerable growth of its distance education programs in the last five years. mobile access to university resources makes especially good sense for the distance education population and for students who may reside close to the main usu campus but who also enroll in online courses. the library has an information technology staff of 4.5 fte professionals who support the library catalog, maintain roughly 250 computer workstations in cooperation with the director of campus student computer labs, and oversee the computing needs of library staff and faculty members.

literature review

mobile access to library resources is not a new concept; in fact, the first project designed to deliver handheld mobile access to library patrons began eighteen years ago, in 1993, the time of mainframe computers and gopher.
the “library without a roof” project partners included the university of southern alabama, at&t, bellsouth cellular, and notable technologies, inc. 1 library patrons at participating institutions could search and read electronic texts on their personal digital assistants (pdas) and search the library catalog while browsing in physical collections. as reflected in the literature, interest in pda applications for libraries started to pick up around the turn of the twenty-first century. medical librarians were among the first to widely recognize the potential impact of mobile technologies on librarianship. a 2002 article in the journal of the medical library association and a monograph by colleen cuddy are among the first publications that focus on pdas. 2 a quick perusal of the medical category on the itunes store reveals several professional applications, ranging from new england journal of medicine tools to remote patient vital-sign monitors. as an example of the depth of mobile-device penetration in the medical field, in 2010 the food and drug administration approved the marketing of the airstrip suite of mobile-device applications. these apps work in conjunction with vital-sign monitoring equipment to allow instant remote access to a patient’s vital signs. 3 these examples illustrate the increasing pervasiveness of mobile technology in everyday life. mobile learning in academic areas outside of medicine has increased recently as more universities have adopted mobile technologies. 4 a sampling of current projects at academic institutions is provided in the 2010 horizon report. 5 according to the 2010 educause center for applied research (ecar) study, 49 percent of undergraduates consider themselves mainstream adopters of technology. 6 locally, utah state university students have adopted smartphones at the rate of 39.3 percent and other handheld internet devices at the rate of 31.5 percent.
these statistics indicate that skills are increasing and the technological landscape is changing quickly. the ecar study reports that student computing is rapidly moving to the cloud, another indication of the rapid change in the use of technology. “usb may one day go the way of the eight-track tape as laptops, netbooks, smartphones and other portable devices enable students to access their content from anywhere. they may or may not be aware of it, but many of today’s undergraduates are already cloud-savvy information consumers, and higher education is slowly but surely following their lead.” 7 similarly, usu students show interest in adopting new technology. while usu students are less likely to own mobile devices, 70.2 percent of respondents indicated that they would be likely or very likely to use library resources on smartphones if they owned capable devices and if the library provided easy access to materials. bridges, gascho rempel, and griggs published a comprehensive article, “making the case for a fully mobile library web site: from floor maps to the catalog,” detailing their efforts to implement mobile services on the oregon state university campus. 8 their paper highlights the popularity of mobile phones and smartphones/web-enabled phones. the authors discuss mobile phone use, library mobile websites, and mobile catalogs, and they describe the process they used to develop their mobile library site. they note that mobile services will certainly be expected in the coming years, and we have learned that usu students share this expectation. survey research in recent years librarians have conducted surveys on mobile technology in libraries. in a 2007 study, cummings, merrill, and borrelli surveyed library patrons to find out if they are likely to access the library catalog via small-screen devices. 9 they discovered that 45.2 percent of respondents, regardless of whether they owned a device, would access the library catalog on a small-screen device. 
mobile access to the library catalog was the most requested service in the usu student survey, although it accounted for only 16 percent of the responses. cummings, et al. also discovered that the most frequent users of the catalog were also the least willing to access the catalog via mobile devices, an interesting observation that merits further research. their survey was completed in june of 2007, just five months after the january 9th announcement of the original iphone. the release of the iphone is significant as the point where the market demographics of mobile device users began to shift to people under thirty, the primary age group of undergraduate students. 10 librarians wilson and mccarthy at ryerson university conducted two surveys to measure the usage of their catalog’s feature to send a call number via text or email (initiated in 2007) and their “fledgling mobile web site” (launched in 2008). 11 the first survey indicated that 20 percent of respondents owned internet-capable cell phones, and over half said they intended to buy this type of phone when their current contracts expired. the survey respondents indicated they wanted the following services: “booking group study rooms, checking hours and schedules, checking their borrower records and checking the catalogue.” 12 the second survey was conducted a year after the library had implemented a group study room reservation system, catalog and borrower record services, and a computer/laptop availability service. results of the follow-up survey show a drastic increase in ownership of internet-capable cell phones (from 20% to 65%). respondents desired two new services: article searches and e-book access.
wilson and mccarthy found that very few library patrons were accessing the mobile services, but “60% of the survey respondents were unaware that the library provided mobile services.” 13 the authors conclude that advertising should be a central part of mobile technology implementation. they also detail how the library contributed expertise and leadership to their campus-wide mobile initiatives. seeholzer and salem conducted a series of focus groups in the spring of 2009 to determine the extent of mobile device use among students at kent state university. 14 notable among their findings are that students are willing to conduct research with mobile devices, and they desire to have a feature-rich interactive experience via handheld devices. students expressed interest in customizing interactions with the library’s mobile site and completing common tasks such as placing holds or renewing library materials. nationwide survey of librarians we asked colleagues who subscribe to e-mail distribution lists to respond to a survey about their libraries’ implementation of mobile applications for access to library collections and services. invitations to take the survey were sent to seven lists (acrl science & technology section, eril, information literacy instruction, liblicense-l, nasig, ref-l, and serialist), and 289 librarians and library staff members responded to the survey. the population of subscribers to the e-mail lists we used to solicit survey responses is dynamic and includes librarians and staff who work in academic and other types of settings. while our findings cannot be generalized in a statistically reliable manner, we nonetheless believe that the survey responses merit thorough analysis. we chose to conduct two surveys to avoid some of the problems we noted in a 2007 study conducted by todd spires. 15 spires’ survey questions focused on librarians’ perceptions rather than on empirical data. 
we developed separate surveys for librarians and students in hopes of avoiding problems that could arise from basing assumptions on perceived behavior or from the complexity of interpreting and generalizing from perceptions. a survey of library patrons should provide more accurate insight into the ways that patrons are using the library via handheld devices. in the libraries that currently provide mobile access to resources, the library catalog is most commonly offered. article databases and assistance from a librarian tie as the second most frequently provided services. figure 1 shows a snapshot of the resources and services librarians reported that they provide. we also asked how long libraries have provided mobile access, and the time periods ranged from a few weeks to more than ten years. five librarians indicated that they have provided mobile access for six to ten years, and it is possible that these respondents may work in medical or health science libraries, as our literature review indicated that access to medical information and journal articles via pdas has been a reality for several years.

figure 1. librarians’ responses: does your library provide mobile access to the following library resources?

librarians were also asked what services and resources they believe libraries should provide via mobile devices. of 178 responses, 71 percent indicated that “everything” or a variety of library resources should be made available. a few of the more interesting suggestions include a library café webcam (similar to a popular link from north carolina state university), locker reservations, a virtual suggestion box, alerts about database trials, an app that lists new books, and using ipads or other mobile devices for roving reference. roving reference with tablet pcs was evaluated by smith and pietraszewski at the west campus branch library of texas a&m.
16 as tablet computers become increasingly popular with the release of the ipad and other tablets, 17 roving reference should be reconsidered. smith and pietraszewski note that "the tablet pc proved to be an extremely useful device as well as a novelty that drew student interest (anything to make reference librarians look cool!)" 18 using the latest technology in libraries will help raise awareness that libraries are relevant and adapting to changing user preferences. we asked librarians to indicate who had responsibility for implementing mobile access in their library. the 184 responses are summarized here:

• 63 percent answered that a library systems or computing professional does this work;
• 26.1 percent indicated that the electronic resources librarian has this role;
• 17.9 percent rely on an information professional from outside of the library;
• 22.8 percent chose “other,” and we unfortunately did not offer a space for comments where survey respondents could tell us the job title of the person in their library who implements mobile access.

the results from our sample of librarians are consistent with a larger study by library journal. 19 the lj study found that the majority of academic libraries have implemented or are planning to implement mobile technologies.

student survey

in january of 2011 we sent out a thirteen-question survey to students (questions are available in appendix a). usu’s student headcount is 25,767, and 3,074 students responded, representing 11.9 percent of the student population. we asked students to identify with colleges so that we could evaluate the survey sample against the enrollment at usu. the rate of response by college clustered between 12 and 19 percent, with the lowest response rate (8 percent) from the college of education. the highest response rate came from the college of humanities and social sciences.
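the overall response rate quoted above follows directly from the headcount and the number of respondents. as a minimal python sketch (the function name response_rate is an illustrative assumption, not something from the study), the calculation is:

```python
def response_rate(respondents: int, population: int) -> float:
    """Return respondents as a percentage of the surveyed population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return 100.0 * respondents / population


# Figures reported in the text: 3,074 respondents out of a headcount of 25,767.
usu_rate = response_rate(3074, 25767)
print(f"{usu_rate:.1f} percent")
```

the same function could be applied to each college's respondents and enrollment to reproduce the per-college rates, which are not itemized in the text.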
we examined survey response rates from usu undergraduate and graduate populations; 54 percent of undergraduates and 50 percent of graduate students use mobile technology for academic purposes. we believe that our sample is sufficiently representative of the overall population of usu.

figure 2. student response rates by college

in order to understand the context of survey questions that specifically address mobile access, we asked students how often they used library electronic resources. the majority of students used electronic books, the library catalog, and electronic journals/articles a few times each semester. only 34.4 percent of students never use electronic books, 19.6 percent never use the library catalog, and 17.6 percent never use electronic journals/articles. we made comparisons between disciplines and found no significant difference in electronic resource use between fields in the sciences and those in humanities. further data will be collected in fall 2011 about use of print and electronic materials.

figure 3. electronic resource use among students

students were asked how often they use a variety of handheld devices. we decided to emphasize access over ownership in order to allow for a variety of situations. responses show that 39.3 percent of our students use a smartphone with internet access on a daily basis. another 31.5 percent of students use other handheld devices like an ipod touch on a daily basis. very few students use ipads or e-book readers, with 3.9 percent and 5.4 percent indicating daily use, respectively. we view the "other handheld device" category as an important segment of the mobile technology market because of the lower cost barrier, since such devices do not require a subscription to a data plan. the ecar study also noted the possibility of cost factors influencing the decision of some students not to access the internet via a handheld device.
20

figure 4. mobile device usage

students were asked if they use their mobile device or phone for academic purposes (e.g., blackboard, electronic course reserves, etc.). this question was intentionally worded broadly in order to gather general information. we used skip logic to direct respondents to different paths through the survey based on their response to earlier questions. in response to a question about how students use their mobile devices, 54 percent of respondents indicated that they use their mobile devices for academic purposes. we analyzed the results by discipline and noted a few variances. among students responding from the school of business, 63 percent said that they use their mobile device for academic purposes, and 59 percent of engineering students use their devices for school work. the respondents from the other colleges reported use under 50 percent, most likely because of more limited adoption of mobile technology by usu faculty in those fields or lack of personal funds (or unwillingness to spend) to acquire devices and data plans. the 2010 ecar report also noted higher exposure to technology in these fields, indicating that the situation at usu is in line with results from a national study. 21

table 1. device use for academic purposes by college

we asked the students, “if library resources were easily accessible on your mobile devices, and if you had such a device, how likely would you be to use any of the following for assignments or research?” responses to this question allowed us to gauge interest without concerns about cost of technology or the current state of mobile readiness in our library.
among the survey respondents, 70.2 percent are likely or very likely to use resources on a smartphone; 46.9 percent are likely or very likely to use resources on an ipad; 45.9 percent are likely or very likely to use resources on an e-book reader; 63.2 percent are likely or very likely to use resources on other devices. we included an option for respondents to select “not applicable” as distinct from “not likely” to allow for those students who may welcome use of a mobile device but who may currently use a device different from the types we specified.

figure 5. likelihood of using library resources on mobile device if easily available

we are unsure how to account for the dramatic difference in interest between smartphone and ipad usage. survey responses indicated that only a small number of students have access to an ipad, and it is possible that students have had little opportunity to see their classmates or others use ipads in an academic setting. students were asked in a free-text question to list the services the library should offer. the comments were varied and often used language different from the vocabulary that librarians typically use. in order to gain an understanding of trends and to standardize the language, we coded the survey comments. after coding, trends began to emerge. access to the library catalog was mentioned by 16 percent of respondents. mobile services in general were specified by 11 percent of survey respondents, 10 percent wanted articles, and 9 percent wanted to reserve study rooms on their mobile device. the phrase “mobile services” represents a catch-all tag designated for comments that indicated that a student desired a variety of services or all services that are possible. for example, only 9 percent of respondents indicated they had used text to contact the library and 15 percent had used instant messaging.
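coding free-text comments into standard tags and then counting the tags can be sketched in a few lines of python. this is only an illustration under stated assumptions: the keyword map TAG_KEYWORDS and the functions code_comment and tag_shares are hypothetical, and the coding scheme actually used for the usu survey is not described in the text (it may well have been done by hand).

```python
from collections import Counter

# Hypothetical keyword-to-tag map; the study's real coding scheme is not given.
TAG_KEYWORDS = {
    "catalog": ("catalog", "look up books"),
    "articles": ("article", "journal"),
    "study rooms": ("study room", "reserve a room"),
}


def code_comment(comment):
    """Return the set of tags whose keywords appear in one free-text comment."""
    text = comment.lower()
    return {
        tag
        for tag, keywords in TAG_KEYWORDS.items()
        if any(keyword in text for keyword in keywords)
    }


def tag_shares(comments):
    """Return, for each tag, the percentage of comments that mention it."""
    if not comments:
        return {}
    counts = Counter()
    for comment in comments:
        counts.update(code_comment(comment))  # each tag counted once per comment
    return {tag: 100.0 * n / len(comments) for tag, n in counts.items()}


if __name__ == "__main__":
    sample = [
        "I want to search the catalog from my phone",
        "Reserve a room and find articles",
        "Nothing comes to mind",
        "Journal access please",
    ]
    for tag, share in sorted(tag_shares(sample).items()):
        print(f"{tag}: {share:.0f} percent")
```

because a single comment can carry several tags, the shares need not sum to 100 percent, which is consistent with how the requested services are reported above.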
several students indicated they might have used these services but did not know they were available, indicating a need for advertising. while we learned much about students’ desires for mobile services from this important subset of comments in response to the free-text question, they did not prove especially useful to guide librarians’ plans for the next stages of implementing mobile technology.

figure 6. services requested by students

as is common at many institutions, funding at usu is limited and any development in the area of mobile access implementation must be strategic. our survey indicated that usu students are using mobile devices for their academic work and would like to further integrate library resources into their mobile routine. the next section of this paper outlines the steps we are taking toward mobile implementation.

going mobile

the usu library joins many other academic libraries in the beginning stages of implementing mobile technologies. survey responses from students indicate that they use mobile devices for academic purposes, and until options to use the library with such devices are available and advertised, we will not have a clear understanding of students’ preferences. klatt's article, “going mobile: free and easy,” 22 outlines a way to get started with mobile services with small investments of time and money. articles by griggs, 23 back, 24 and west, 25 and books by green, et al. 26 and hanson 27 also provide guidance in this area. here we offer suggestions to establish an implementation team, conduct an environmental scan, outline steps to begin the process, and shed light on advertising, assessment, and policy issues.

implementation team

for a library seeking to provide mobile access to online resources, a diverse and talented implementation team is important.
public services personnel in an academic library staff are on the front lines and often field students’ questions. they may also have the opportunity to observe how students are using mobile devices in the library. if librarians track reference interactions, they may find evidence that students are attempting to use their mobile devices to access library services. the electronic resources/collections specialist will also play a key role in mobile development. these specialists are often in contact with vendors, and their advocacy is important in encouraging mobile web development in the vendor community. a web site coordinator interested in mobile services and knowledgeable in current web standards will bring essential talent to the team. arguably, a mobile-optimized web site should become a standard level of service. web sites that are optimized or adapted specifically for mobile access are device agnostic and do not require advanced knowledge of smart phone operating systems. therefore existing web development staff can apply their current skill set to expand into mobile web design. in order to launch advanced interactive access to library resources, a programmer who is interested in developing mobile apps on a number of platforms is needed. device-specific applications allow for the use of phone features such as gps and orientation sensing via an accelerometer and provide the basis for augmented reality technologies. environmental scan librarians can learn about mobile usage in their community by gathering information to guide future development. at usu we interpret the numbers of students who use mobile devices for academic purposes as justification for implementing mobile library access, but we have not set a benchmark for a degree of interest that would trigger more development. 
some of the mobile implementations described at the end of this paper required minimal time or were investigated because of the electronic resources librarian’s interest for their relevance to her role as music subject librarian. in the survey we administered to students, we considered it important to include a wide range of devices, including ipod touches and similar devices that have many of the same possibilities for academic use as smartphones but which do not require a monthly contract. laptops are also considered a mobile technology, and while we did not emphasize this class of devices, some student comments referred specifically to laptop computers. we will monitor use of the mobile applications that we implement and likely conduct a follow-up survey to assess students’ satisfaction and to find out if there are other services they would like for the library to provide. while librarians may gather useful information from a user study, there are other ways to determine if students are, in fact, using mobile devices in the library. one approach is to review logs of reference questions to determine if students are inquiring about access to library resources via mobile devices. recently, a few mobile-related questions have surfaced at usu in the libstats program used to track reference interactions. this is also an area where training reference staff to recognize and record questions about mobile access could be helpful to detect demand in the library’s community. if vendors provide statistics about use of their products from mobile devices, this information could also contribute to assessing need. finally, in libraries that use vpn or other off-campus authentication methods, consulting with it support staff to see if they field questions on setting up remote access on smartphones or other devices may factor into decisions regarding mobile access.
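reviewing reference logs for mobile-related questions can be partly automated. the python sketch below is a hedged illustration only: it assumes the logged questions have already been exported as a plain list of text strings (the export format of libstats is not described here), and the keyword pattern and the function name mobile_questions are hypothetical choices that a library would adjust to its own vocabulary.

```python
import re

# Hypothetical keyword list; extend it to match local devices and services.
MOBILE_PATTERN = re.compile(
    r"\b(?:iphone|ipad|ipod|android|smartphone|mobile|kindle|e-?reader)s?\b",
    re.IGNORECASE,
)


def mobile_questions(questions):
    """Return the logged questions that mention a mobile device or service."""
    return [q for q in questions if MOBILE_PATTERN.search(q)]


if __name__ == "__main__":
    log = [
        "How do I open the database on my iPhone?",
        "Where is the stapler?",
        "Can I read library e-books on a Kindle?",
        "Is there a mobile version of the catalog?",
    ]
    hits = mobile_questions(log)
    print(f"{len(hits)} of {len(log)} questions mention mobile access")
```

a keyword scan of this kind only flags candidates; staff would still need to read the matched questions to judge whether they reflect real demand for mobile access.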
the usu information technology website provides a knowledgebase that includes entries on a variety of mobile device queries. this indicates to librarians that people in the university community are using their mobile devices for academic functions. before we conducted the survey of usu students, we knew little about the exact nature of their mobile use. getting started after identifying the needs on campus, the next step is to create a plan for mobile implementation. an important aspect of anticipating the needs of a library’s user population is to understand the likely use scenarios, goals, tasks, and context as outlined in “library/mobile: tips on designing and developing mobile web sites.” 28 building on services that incorporate tasks that people already perform in non-academic contexts provides a logical bridge for those who are familiar with everyday use of a mobile device to recognize how such devices can serve academic purposes. gathering information from each vendor that supplies content to the library is an important early step in planning. this information can serve as the basis of a mobile web implementation plan and, in the case of ebsco, creating a profile is necessary in order to allow access to a mobile-formatted platform. at usu our online catalog provider has developed an application for apple's ios platform. if a library’s catalog vendor does not offer a dedicated application or mobile site, samuel liston’s comparisons of three major online catalogs on three popular mobile devices is helpful in gaining an understanding of how opacs display on smartphones. his article also outlines a procedure for testing opacs and usability. 29 at usu we can also take advantage of serials solutions’ mobile-optimized search screen and a variety of applications provided by other vendors. 
jensen noted that librarians should not rely solely on vendor-created applications due to vendors’ tendency to develop applications that are usable by only a segment of the overall mobile device user population. 30 he adds that libraries should also avoid developing applications for limited platforms. in addition, jensen provides a simple step-by-step process for converting articles retrieved from a vendor database to a format that can be downloaded from electronic course reserves and read on a variety of handheld devices. while using vendor-developed applications is an important strategy, most libraries will find that developing a mobile-compatible library website is necessary. mobile website development can be accomplished in a variety of ways. at usu we plan to offer a version of our regular website by employing cascading style sheets (css). this method is described in the paper by bridges, et al., 31 and standard guidelines can be found in the mobile web best practices 1.0. 32 this method will allow the content to be reformatted at the point of need for a variety of platforms. results from the usu student survey indicate a desire to be able to use a mobile device for access to the library catalog, to use services like reference assistance, find articles, and make study room reservations. the library plans to include hours and location information, access to existing reference chat and text features, and links to databases with mobile-friendly websites or vendor-created applications in addition to the resources requested by students. we are still unsure of the best way to provide links to applications and how to explain the various authentication methods required by each vendor.
while vpn and ezproxy are possible methods to authenticate via mobile devices, vendors are content at the moment to allow students to access their resources by setting up an account that is based on an authorized e-mail domain or through a user account created on the non-mobile version of the resource. in a few cases at usu, mobile applications from vendors allow access to categories of users such as alumni because they have a usu.edu e-mail address, although the library does not typically include these patrons in our authorized remote user group. advertising, assessment, and policy creating a mobile website and offering mobile services are only the beginning of the effort to provide access to library materials for mobile users. as wilson and mccarthy found, advertising is essential; 33 students won’t use a service they don’t know about. crafting a marketing plan with both online and print materials is essential. educating library staff members, especially those on the public services front line, is an essential part of promoting mobile services. assessment strategies must be developed in order to focus development strategically. periodic surveys and focus groups can inform future development of mobile services and gauge the impact of currently offered services. librarians should encourage vendors to provide usage data for their mobile portals or applications, and libraries can track use data from their own information technology departments. implementation of mobile web services creates the need to develop new policies and to educate staff. privacy concerns and the complexities of digital rights management have the potential to transform the role of the library and its policies. 34 patrons will need to be aware that the library has less control over maintaining privacy when materials are accessed via third-party mobile applications. 
libraries will need to consider how new developments in pricing models may affect expanding mobile access; one example is harpercollins’ announcement in early 2011 about a policy requiring libraries to repurchase individual e-book titles after a cap on check-outs is reached. 35

librarians’ desire to offer reference services or other assistance via mobile devices follows naturally from their long-standing efforts to enable patrons to ask questions via e-mail, chat, instant messaging, or sms text. instant messaging, chat, and text lend themselves to mobile access because they are designed for the relatively short exchanges that people typically have when communicating with a handheld device. offering reference services using sms text and chat in particular is relatively easy for libraries because there are many free services to support them. in some cases, a systems administrator or it expert may be helpful in navigating the set-up of chat and text services and in integrating them so that, for example, when a text message arrives during a time when no one is monitoring the service, a voicemail message automatically appears in the library’s e-mail account.

librarians can find an enormous amount of advice on the web and in the literature about how to begin offering mobile-friendly reference, how to expand the virtual reference services they currently provide, and how to choose among free and fee-based services for their library’s needs and budget. two efficient places to begin are cody hanson’s special issue of library technology reports, which provides a thorough overview of mobile devices and their capabilities and straightforward suggestions for planning and implementation, and m-libraries, a section of library success: a best practices wiki.
36

conclusion

in light of trends toward more widespread use of mobile computing devices and smartphones, it makes sense for libraries to provide access to their collections and services in ways that work well with mobile devices. this case study presents the situation at the merrill-cazier library at utah state university, where students who responded to a survey indicate they are very interested in mobile access, even if they have not yet purchased a smartphone or find data plans to be too expensive at this point. as is only reasonable for any library, at usu we have begun by implementing mobile applications that are available from vendors of our online catalog and databases because these require minimal effort and no additional cost. we present ideas for establishing an implementation team and advice for academic libraries that wish to “go mobile.” we aim to have a concrete plan for the work that will be required to optimize the library’s website for mobile access by the fall of 2011. a significant step is hiring a digital services librarian to work closely with the webmaster, electronic resources librarian, and others interested in promoting access to resources and services via mobile devices. our vision is to be on track to offer an augmented-reality experience to our patrons, as the 2010 horizon report indicates this will be an important trend in the next two to three years. we aim to create an environment in which students can use their mobile device to gain entry to a new layer of digital information, enhancing their experience in the physical library.

references

1. clifton dale foster, “pdas and the library without a roof,” journal of computing in higher education 7, no. 1 (1995): 85–93.
2. russell smith, “adapting a new technology to the academic medical library: personal digital assistants,” journal of the medical library association 90, no.
1 (2002): 93–94; colleen cuddy, using pdas in libraries: a how-to-do-it manual (new york: neal-schuman publishers, 2005).
3. andrea jackson, “wireless technology poised to transform health care,” rady business journal 3, no. 1 (2010): 24–26.
4. alan w. aldrich, “universities and libraries move to the mobile web,” educause quarterly 33, no. 2 (2010), www.educause.edu/educause+quarterly/educausequarterlymagazinevolum/universitiesandlibrariesmoveto/206531 (accessed mar. 30, 2011).
5. larry johnson, alan levine, r. smith, and s. stone, the 2010 horizon report (austin, tx: the new media consortium, 2010), www.nmc.org/pdf/2010-horizon-report.pdf (accessed mar. 31, 2011).
6. shannon d. smith and judith borreson caruso, with an introduction by joshua kim, the ecar study of undergraduate students and information technology, 2010 (research study, vol. 6) (boulder, co: educause center for applied research, 2010), www.educause.edu/ecar (accessed mar. 31, 2011).
7. smith and caruso, the ecar study of undergraduate students and information technology, 2010.
8. laurie bridges et al., “making the case for a fully mobile library web site: from floor maps to the catalog,” reference services review 38, no. 2 (2010): 309–20.
9. joel cummings, alex merrill, and steve borrelli, “the use of handheld mobile devices: their impact and implications for library services,” library hi tech 28, no. 1 (2009): 22–40.
10. rubicon consulting, the apple iphone: success and challenges for the mobile industry (los gatos, ca: rubicon consulting, 2008), http://rubiconconsulting.com/downloads/whitepapers/rubicon-iphone_user_survey.pdf (accessed mar. 31, 2011).
11. sally wilson and graham mccarthy, “the mobile university: from the library to the campus,” reference services review 38, no. 2 (2010): 215.
12. ibid., 216.
13. ibid., 223.
14. jamie seeholzer and joseph a. salem, “library on the go: a focus group study of the mobile web and the academic library,” college and research libraries 72, no. 1 (2011): 9–20.
15. todd spires, “handheld librarians: a survey of librarian and library patron use of wireless handheld devices,” internet reference services quarterly 13, no. 4 (2008): 287–309.
16. michael m. smith and barbara a. pietraszewski, “enabling the roving reference librarian: wireless access with tablet pcs,” reference services review 32, no. 3 (2004): 249–55.
17. kathryn zickuhr, generations and their gadgets (washington, d.c.: pew internet & american life project, 2011), http://pewinternet.org/reports/2011/generations-and-gadgets.aspx (accessed mar. 31, 2011).
18. smith and pietraszewski, “enabling the roving reference librarian,” 253.
19. lisa carlucci thomas, “gone mobile: mobile catalogs, sms reference, and qr codes are on the rise—how are libraries adapting to mobile culture?” library journal 135, no. 17 (2010): 30–34.
20. smith and caruso, the ecar study of undergraduate students and information technology, 2010.
21. ibid.
22.
carolyn klatt, “going mobile: free and easy,” medical reference services quarterly 30, no. 1 (2011): 56–73.
23. kim griggs, laurie m. bridges, and hannah gascho rempel, “library/mobile: tips on designing and developing mobile web sites,” code4lib 8, november 23, 2009, http://journal.code4lib.org/articles/2055 (accessed mar. 30, 2011).
24. godmar back and a. bailey, “web services and widgets for library information systems,” information technology & libraries 29, no. 2 (2010): 76–86.
25. mark andy west, arthur w. hafner, and bradley d. faust, “communications—expanding access to library collections and services using small-screen devices,” information technology & libraries 25, no. 2 (2006): 103.
26. courtney greene, missy roser, and elizabeth ruane, the anywhere library: a primer for the mobile web (chicago: association of college and research libraries, 2010).
27. cody w. hanson, “libraries and the mobile web,” library technology reports 47, no. 2 (february/march 2011).
28. griggs, bridges, and gascho rempel, “library/mobile.”
29. samuel liston, “opacs and the mobile,” computers in libraries 29, no. 5 (2009): 6–47.
30. r. bruce jensen, “optimizing library content for mobile phones,” library hi tech news 27, no. 2 (2010): 6–9.
31. griggs, bridges, and gascho rempel, “library/mobile.”
32. “mobile web best practices 1.0,” worldwide web consortium (w3c), www.w3.org/tr/mobile-bp (accessed mar. 30, 2011).
33. wilson and mccarthy, “the mobile university.”
34. timothy vollmer, there’s an app for that! libraries and mobile technology: an introduction to public policy considerations (policy brief no.
3) (washington, d.c.: ala office for information technology policy, 2010), www.ala.org/ala/aboutala/offices/oitp/publications/policybriefs/mobiledevices.pdf (accessed mar. 31, 2011).
35. josh hadro, “harpercollins puts 26 loan cap on ebook circulations,” library journal, february 25, 2011, www.libraryjournal.com/lj/home/889452264/harpercollins_puts_26_loan_cap.html.csp (accessed mar. 31, 2011).
36. “m-libraries: library success: a best practices wiki,” www.libsuccess.org/index.php?title=m-libraries (accessed mar. 31, 2011).

appendix a. student survey questions

1. type of student?
2. age?
3. gender?
4. what is your college?
5. how often do you use the following electronic resources provided by your library?
6. do you use any of the following devices?
7. do you use your mobile device or phone for academic purposes (e.g., blackboard, electronic course reserves, etc.)?
8. please list what you use your device to do?
9. have you ever used a text message to get help using the library?
10. have you ever used instant messaging to get help using the library?
11. if library resources were easily accessible on your mobile devices and if you had such a device, how likely would you be to use any of the following for assignments or research?
12.
what mobile services would you like the library to offer?
13. comments?

appendix b. librarian survey questions

1. type of library?
2. your job/role in the library?
3. years working in libraries?
4. does your library offer mobile device applications for the following electronic resources?
5. who in your library or on your campus is responsible for implementing or developing mobile device applications?
6. how long has your library provided access via mobile devices to electronic resources or services?
7. if you collect use data for library electronic resources, are patrons using the mobile device applications your library provides?
8. what mobile services do you believe libraries should offer?
9. comments?

balancing community and local needs: releasing, maintaining, and rearchitecting the institutional repository

daniel coughlin

information technology and libraries | march 2022
https://doi.org/10.6017/ital.v41i1.14073

daniel coughlin (dmc186@psu.edu) is head libraries strategic technologies, penn state university. © 2022.

abstract

this paper examines the decision points over the course of ten years of development of an institutional repository. specifically, the focus is on the impact and influence from the open-source community, the needs of the local institution, the role that team dynamics plays, and the chosen platform. frequently, the discussion revolves around the technology stack and its limitations and capabilities. inherently, any technology will have several features and limitations, and these are important in determining a solution that will work for your institution.
however, the people running the system and developing the software, and their enthusiasm to continue work within the existing software environment in order to provide features for your campus and the larger open-source community, will play a bigger role than the technical platform. these lenses are analyzed through three points in time: the initial rollout of our institutional repository, our long-term running and maintenance, and eventual new development, along with why we made the decisions we made at each of those points in time.

the institutional repository (ir)

a university institutional repository (ir) provides long-term access to the scholarship and research outputs of an institution.1 the outputs can be in the form of scholarly publications, data sets to support publications or other research, electronic theses and dissertations, and other digital assets that have value to the university to preserve and to the research community and beyond to disseminate. there is additional value in keeping these otherwise scattered resources collected in a single repository to showcase the scholarly accomplishments of an institution.2 collecting and disseminating its scholarly outputs helps the university understand its strengths, promote its research to outside audiences, attract new faculty, and provide opportunities for new faculty where fields may be emergent or void of an institutional presence. furthermore, there is value to the research community in being able to find peer research without having to pay publisher access fees.

reducing the burden on faculty to meet various policy demands from a federal, publisher, and institutional perspective provides another motivation for irs.
federal policies can require making anonymized research data and scholarship publicly available because it is publicly funded through tax dollars; publishers can make authors provide access to the data that supports the research that is being published.3 in the united states, a growing number of academic institutions, from 2005 to 2021, have adopted an open-access policy that requires researchers to provide a copy of any published scholarly article in a publicly accessible repository. the institutional repository is a way for a university to meet this increasing demand from research organizations and funding institutions for their researchers.4

as a campus grows in the number of disciplines it covers, it inherently grows in complexity and in the diversity of digital needs and use cases of its researchers. for example, high-resolution images or atmospheric data are likely to create a higher demand in storage needs than a discipline that relies largely on text. performance-based research may require multimedia resources and streaming capabilities while other large files can be shared in a more asynchronous manner. the diversity of needs contributes to the complexity of finding a solution for an institutional repository that meets all, or many, of the needs on a campus from a file storage, discovery, and access perspective.

this paper broadly addresses penn state university’s development of its ir at three distinct points in time: (1) choosing a platform for our ir and its initial release; (2) maintaining an ir; and finally, (3) our current solution nearly 10 years later. at each point in time, we analyze our decision process through four lenses.
these lenses gave us a thorough basis for deciding how to proceed: community needs, local needs (and the tension that can exist between the two), our team dynamics, and the platform we built our software on together with the infrastructure required to maintain it. we discuss why we made the decisions we made through these four lenses, the benefits and drawbacks, and what we have learned along the way.

penn state is the state of pennsylvania’s land grant university in the united states. the university has 24 campuses physically located in the commonwealth of pennsylvania, the world campus which is online, two law schools, and a medical school. in the fall of 2021, penn state had 73,476 enrolled undergraduate students and 13,873 graduate students, with research expenditures totaling over $850 million for the last four years.5 penn state is a large, public research institution with a diverse set of needs. this is significant because when the university is considering developing a large system such as an institutional repository, we need to meet the needs of a broad set of disciplines and domains. we are fortunate enough to have software developer and system administration resources that smaller institutions may not have. this provides a bit of context for our considerations for an institutional repository.

selecting a repository

in january 2012, penn state university libraries and penn state’s central information technology department collaborated on developing an institutional repository for the university’s growing data management needs. the university libraries was interested in becoming more involved in open-source software community development efforts. at that point, many universities that we had spoken with already had an ir solution in place; because we did not, we had a lot of freedom to choose a platform without the burden of data migration.
we considered investigating (1) off-the-shelf, turnkey solutions such as dspace, (2) a prototype we had just built called curation architecture prototype services (caps) using a microservices approach, or (3) building on top of an existing platform. ultimately, we decided to build on top of an existing platform, samvera (named hydra at the time).6 we did not want a turnkey solution, because we felt that we had distinct needs that would require a level of customization that these solutions would not be able to offer. based on discussions with others, we decided to develop something of our own. we wanted to leverage the experience of others in the repository development domain. the microservices approach at the time was more of a conceptual approach towards development than an existing software solution. the ability to build on an existing platform was a happy middle ground for us and we evaluated this decision through several lenses that led us to our selection at that time.

community involvement

we did not want to develop a solution in a vacuum and thought a group with a (relatively) common set of problems would be helpful to problem solve. the samvera community was a small but growing community working towards repository solutions like what we were trying to achieve. members of the community were both managerial and technical. this was valuable to us for understanding the strategic direction for the community and the ability to collaborate and problem solve on technical implementations. some of the key partners for our early work were university of hull (uk), stanford university, university of virginia, and notre dame. there was communication throughout the year over community email, chat platforms, and phone calls; however, the quarterly partner meetings were the most valuable time for collaboration.
these quarterly meetings were a couple days in length, typically at a partner institution’s campus (physically) attended by managers and software/systems developers. this provided the ability to work together on specific problems, showcase our work, and get to know each other more closely at lunch and after-hours meetups. working within the community would also get our team increased exposure and help with recruiting future colleagues. working in the open-source software community has been seen to benefit both candidates and employers in future job recruitment.7 we were excited by the promise of working with and contributing toward a larger community. our team had apprehension about building this alone, and we were happy to be working with the support of a community and within their set of processes. local needs early on in our requirements for the repository we created a moscow chart that provided our “must have,” “should have,” “could have,” and “won’t have” features.8 the platform we were choosing was going to provide us with a significant set of these features for our repository with very little work on our end. these features were built in and included search, discovery, and basic file edit functionality. essentially, we were going to quickly meet the needs of our stakeholders by using this software. this was important for a couple of reasons. first, providing features to our stakeholders quickly gave them ample time to provide feedback so that we could make necessary customizations for their specific needs. a less quantitative benefit was gaining the trust of colleagues at the start of a new project and new initiative. rather than continually suggesting “that feature will be done next week,” we were able to deliver results quickly and get feedback. for example, our repository integrated with our campus authentication system, restricting access. 
we were able to deliver these features and get feedback on both the functionality and the terminology to improve the usability. in particular, the way our developers described permissions was initially too confusing for our users and we were able to make necessary adjustments prior to a production release of the ir.

team dynamics

we believed it was a significant professional development opportunity for many people on the team to work with a larger community and learn from and with those in the open-source community. the team working on the ir consisted of three full-time, or near full-time, developers (one joining after we started the project), and a systems administrator. this was our first large project that included a project manager invested in agile project management methodologies and that had a systems administrator in place from the beginning.

platform and infrastructure stability

there was a desire to get to a common solution to easily set up other repositories for various needs within the libraries and we hoped there would be an ability to plug and play various components or features. the three common components of this system were fedora commons to store both metadata and our digital assets; solr as an index for fast search; and blacklight as a web interface that sits on top of solr. one of the primary components, active-fedora, would sync content between the fedora and solr persistence layers. our hope was that with this model, we would be able to write code that could be used in other repositories, and we could use the code that other institutions had written for our repository needs to build other applications more quickly. the samvera community was initially called hydra because of the relationship with the mythical creature that has several heads (see figure 1).
we were considering the potential of running a core storage infrastructure and discovery infrastructure, while developing several heads for our various applications. we knew this was a lofty expectation, but also thought that it was a good design principle for us to advance. additionally, the pilot that we developed on microservices (caps) seemed to have a relatively large storage service and we could not determine how to get away from that. although this was a bit of a shift in our philosophy, it was less of a shift based on our practical experience.

figure 1. aspirational intentions of running many applications on one access and discovery system.

initial release

the initial release of our ir, scholarsphere, was for research data, scholarly articles, and presentations. we considered the repository file agnostic and left the definitions of scholarly materials up to the depositor. the self-deposit process made very few assumptions to limit the barriers to deposit; there were few mandated fields for deposit in scholarsphere. the initial rollout of scholarsphere had met the “must have” and many of the “should have” needs that we had defined initially in our development requirements. the list of “must haves” included upload files via the web, create and assign metadata to the uploaded files, set three access levels to the files, search for files, display files, etc. the “should haves” were faceted browse, faceted search results, share files with a group, etc. working on a community-developed platform provided some of these features for us (search, faceted browse), and gave us the flexibility to customize where necessary. for example, we had our own data model of metadata to assign to files based on our users’ needs. we were able to update the existing metadata that was
this was a tremendous win for us to leverage community-provided solutions and local needs. additionally, the platform provided a search index with solr. this enabled our infrastructure to have a common solution with community support on configuration questions. using the blacklight ui on top of solr created another opportunity for us to customize where desired and ease of development efforts. community: following the initial release, we worked with other members of the community to pull out some of the core functionality and place it in a separate ruby library. this library (sufia) could then be leveraged as a default set of repository features for other developers. the release of a new ir, and this library, provided us with a lot of positive exposure at various community events. local needs: locally, we used this library to develop a repository for our digital archives. it previously took two to three developers nearly nine months to develop scholarsphere; however, we used the sufia module to roll out a separate repository in six months with a single developer. this was another successful production rollout and a successful use of a product created by and for the community. team dynamics: we had a successful release and were getting support for new developers to hire. we continued to move more of our projects toward an agile approach and permanently embed systems administrators into our development projects. infrastructure: we had not released a new system for archives on the exact same system that scholarsphere was developed on, but we were happy that our projects were relatively homogeneous technology stacks and provided a familiarity to run. maintaining the ir over the next several years we released three major updates to scholarsphere: 1. migrating the data object store to a major version 2. overhauling the user interface 3. 
migrating our data model to the portland common data model (pcdm) simultaneously, the sufia library that we developed had also grown in usage by other institutions and contributions from other developers. we were excited to have additional contributors, and with that came an understandable sense of competing priorities within our community’s development roadmap. we were building scholarsphere features and functionality to meet the needs of our local institution and managing the tension between community direction and local needs. again, we look at these lenses as evaluating the period during maintenance, upgrades, and feature adds. community involvement two of the major releases mentioned above were largely community driven. in one case— migrating the data object store—we were one of the initial repositories within our open-source community to migrate our data storage system. we anticipated that doing this work early would prevent us from having to rewrite any code that relied on the data storage layer. ultimately, this may have been a bit early for us, because we never were able to create the momentum for others in the community to make this same migration. this created a bit of a divergence, but at this layer in our technology it did not prevent us from continuing to work closely with the community. information technology and libraries march 2022 balancing community and local needs | coughlin 6 we were able to add locally developed features for managing files and uploads, community components that allowed for controlled vocabularies, and cloud provider uploads.9 in all, from 2012 to 2019 we were an active member of the community: we provided technical contributions, we were being asked to present at community events, and our developers were frequently asked to help at several workshops. the community provided many opportunities for professional development and code from the community provided new features to our users. we felt this work was successful. 
we had three major releases. one was something that our local users were able to experience directly. two of our upgrades were largely on the back end and, while there is no argument on their importance, it can be a challenge to illustrate the significance of largely opaque technology upgrades to users. concurrently, we were coming up against other challenges that were proving difficult to solve in a sustainable and scalable way. large file sizes (larger than 1 gb) for uploads and downloads remained an issue that researchers seemed to be encountering more frequently. our mechanisms for getting around some of these obstacles led us to consider an api that administrators and other applications could integrate with. for example, if the web browser upload was not working, perhaps we could physically get the file from a user and upload it to the system ourselves. if we could do that, maybe we could use an api to do this upload, but we did not have an api.

when developing new features, we would question whether the code should be contributed back to the community or written only for our (penn state) needs. frequently, the devil is in the details and, while several institutions were interested in a feature based on a conversation, implementation could be much more detailed and it was difficult to find common ground. this complexity could lead to longer timelines and more difficult planning for local development features.

team dynamics

over this time period we advanced our team by adding several highly skilled developers (some of whom have now moved on to other positions and remain highly respected within the community), and enriched the collective skill set of the group. the team was enriched by this experience overall. the balance between community involvement and local needs became a frequent conversation point for our team.
we spent a lot of effort on initiatives that had not solved some of the bigger problems our users were experiencing locally; our community disengagement was likely a combination of common reasons, for example, our lack of time to make meaningful contributions.10

in the spring of 2019, the development team that worked on scholarsphere shrank from three developers down to one. we had a strong number of developers within the samvera community to collaborate with; however, we had difficulty bringing on new members at the time because the complexity within the scholarsphere system created a high learning curve that was not necessarily transferable to other technology stacks.

at the end of the summer in 2019 we were given 25 gb of video files to upload in scholarsphere and make accessible. the parameters of the request were outside of what we could support from our web interface, and we had no api to allow a product owner to develop against and work with the researcher to meet this request. after approximately one month of working with the data and our system, we successfully ingested the files into scholarsphere. at the end of this month, we decided that we needed to more urgently evaluate our path forward because we could not have our lone developer spending this amount of time on single-user requests.

platform and infrastructure stability

each of the major versions released between 2012 and 2019 had several patches and feature releases to enhance the system, the interface, and/or our processes for change management within the software system. for example, we went from a typed script containing a series of commands to chef (a configuration-management tool used to automate software deployment) for deployment management; we upgraded infrastructure core components (fedora, solr, travis, redhat, etc.); and we added infrastructure to keep up with the system demands.
in terms of adding infrastructure, we both enhanced the virtual capabilities (cpu and ram) of our systems and had tasks offloaded to other systems. we did not want the systems our users interfaced with to be responsible for all the heavy lifting. these tasks included characterization, indexing metadata for search, creating thumbnails, etc. (see figure 2).

figure 2. systems and services with basic workflow process for uploading a file to scholarsphere, including the background jobs that ran on file upload.
[figure 2 labels: hosts web 1, repo, isilon, jobs, and mysql; services apache, rails, passenger, clamav, maria db, tomcat, fedora, jetty, solr, redis, resque, fits, and nfs; jobs on upload: characterization, thumbnail creation, text extraction (solr), and derivatives]

adding additional components improved the user experience but made our infrastructure difficult to manage. we were continually trying to push our systems to reflect the best practices of the twelve-factor app.11 however, over time, we had certain “infrastructure smells.” the infrastructure smells were essentially anti-patterns of these best practices or symptoms of a bigger problem.12 these anti-patterns included:
• storage coupled closely to application
• lack of flexibility to scale storage to integrate
• inability to spin up a scholarsphere instance
• taking days to set up a dev environment
• lack of flexibility to decouple small tasks that may require increased resources (create derivatives)

evaluating next steps

although we were coming up against some struggles and continued maintenance with scholarsphere, it was a successful software project that had several things we liked (and likely took for granted). it was important for us to recognize what features and characteristics of scholarsphere were a part of this list.
scholarsphere’s data model was flexible enough to support several current use cases and future needs and was developed with a significant amount of community input. there were other development teams within our organization that were also developing new applications in ruby, so the language continued to be relevant within our larger group, as were ruby on rails, blacklight, and solr. some of the libraries developed with these frameworks were giving us trouble, and we knew that tools and infrastructure could be barriers to newcomers’ onboarding and orientation.13 however, the languages themselves were still flexible enough for us to continue our work. we had three permission levels to access the full text of an uploaded file: (1) public, (2) penn state only, and (3) private, and we didn’t want to develop anything more complex than that around access permissions. fedora provided us with versioning capability of our objects and we thought that this was something not only to continue but potentially enhance. we also had strong support from the samvera community for scholarsphere. many people had worked on the code that helped provide functionality and we could collaborate within that community when problems arose. at that point we largely decided to continue to develop needed features for scholarsphere while the community pushed forward. in part we were hoping that our divergent paths would converge within a year (give or take). the month following the relatively manual process of ingesting the 25 gb of video files into scholarsphere was spent making important updates to the system and picking off any low-hanging fruit. in october 2019 we decided to start from scratch and spend about two months developing a new solution and to evaluate our path forward after that.

current solution

we turned to the same four established lenses when evaluating our needs in 2019.
however, it is worth noting that organizationally we were in a much different position than when we started in 2012. the software development and infrastructure team that managed the service was organizationally moved from central it to the libraries where the service and product owner resides. being in the same building and having the same priorities improved communications. also, people within the teams had changed, and our leadership had changed, which changed how we approached some of our decisions. we had more experience in technical skills, specifically in repository development; we were more refined in our implementation of agile methodologies; and having run a service for years, we had a better sense of our users’ needs.

community involvement

the community saw a tremendously successful period of growth during this time in adoption of software, exposure for funded grants, and number of partners. there was renewed excitement about multiple solutions including turnkey repository solutions, hosted solutions, the merging of two highly regarded software libraries for performance, and improvement in developer friendliness. the latter improvement replaced some of the design patterns that developers struggled with by something more familiar and made it easier to onboard new developers.

local needs

the pressure to meet our local needs and competing priorities for the community-based software became a sticking point for us. we needed to have a more scalable backend and we were not sure when our needs and the priorities of the community would merge. we had also been behind on several dependencies and the lift to get back up to date, before being able to add anything new, was considerable. this situation led us to create a prototype for evaluation.
our initial goal was to see how difficult it would be to build a system to meet the needs of uploading the video files that scholarsphere currently could not handle. we had confidence we could develop features, but this area was a consistent challenge and we considered it a primary hurdle for us to jump.

team dynamics

our development team consisted of a single developer. however, we had an infrastructure developer who was able to help with systems configuration, automation, and containerization. our developer thoroughly understood scholarsphere and the underlying codebase and architecture and had the resources to hire a consultant to help with our efforts. we had considerable work performed by a local software development company on other repositories (electronic theses & dissertation system, a digital cultural heritage repository, and our researcher metadata database). we valued this partnership and wanted to continue to utilize them as our staff numbers were down. we needed to be able to onboard others more quickly than we had in the past. if three relatively new members of the team could contribute to this work, that would also suggest we had chosen a technology stack comfortable enough for others outside our development team to make a more immediate impact.

platform and infrastructure stability

as with many systems that are actively developed for years, our current system had several dependencies that had organically grown over time and had become burdensome to assemble when setting up a development environment. additionally, a local development environment was not an exact replica of the production environment because networked storage was implemented on production and our development systems had local storage. we also took this opportunity to test out amazon s3 storage options as our production storage system.
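to make the large-file problem concrete, here is a minimal sketch of how an upload of tens of gigabytes could be split into parts for an s3 multipart upload. the source does not say how scholarsphere performs its uploads, so this is an illustration only: the function name `plan_parts` and the 64 mib preferred part size are assumptions, while the limits used (5 mib minimum part size, 5 gib maximum part size, 10,000 parts per upload) are the ones amazon documents for multipart uploads.

```python
import math

MIB = 1024 ** 2
GIB = 1024 ** 3

# Limits Amazon documents for S3 multipart uploads.
MIN_PART_SIZE = 5 * MIB      # every part except the last must be at least this big
MAX_PART_SIZE = 5 * GIB
MAX_PARTS = 10_000


def plan_parts(file_size, preferred_part_size=64 * MIB):
    """Return (part_size, part_count) for uploading file_size bytes.

    Hypothetical helper: the 64 MiB default is an assumption, not a
    value taken from the article.
    """
    if file_size <= 0:
        raise ValueError("file_size must be positive")
    part_size = max(preferred_part_size, MIN_PART_SIZE)
    # Smallest part size that keeps the upload within the part-count limit.
    needed = math.ceil(file_size / MAX_PARTS)
    if needed > part_size:
        # Round up to a whole MiB so part boundaries stay tidy.
        part_size = math.ceil(needed / MIB) * MIB
    if part_size > MAX_PART_SIZE:
        raise ValueError("file is too large for a single multipart upload")
    return part_size, math.ceil(file_size / part_size)


if __name__ == "__main__":
    size, count = plan_parts(25 * GIB)
    print(f"25 GiB upload: {count} parts of {size // MIB} MiB")
```

splitting the transfer this way means a failed part can be retried on its own, which is one reason object storage tends to cope with multi-gigabyte files better than a single browser upload.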
we chose this alternative to see whether it increased the reliability of our storage, to see how well we could manage data in s3, and to run a production service on it so that we would have an example of the annual operating cost of using a cloud vendor. we were able to simplify our rollout a bit, and modernize the technologies used to run our systems (i.e., docker containers, kubernetes cluster) (see figure 3).

development

we had three general goals: (1) to improve stability/scalability for local needs; (2) to improve our ability to get an environment up for developers more simply; and (3) to be able to onboard new developers more quickly. shortly after our prototype test proved we could meet local needs in scalability, we were able to test out our second goal, getting a scholarsphere environment set up easily. the process of setting up a development environment went from days to hours. we had reached two of our three goals with these tests, and we believed that having a development team of two to three new developers contribute to those first two goals was proof that we could onboard new developers quickly (our third goal). after several months of development in early 2020 we had removed several of the obstacles that had been in our way in recent years but were nowhere near feature parity with scholarsphere.

figure 3. current infrastructure for scholarsphere, released in november of 2021.

we had a rich feature set to transfer from the existing scholarsphere and did not want to simultaneously run two systems until we achieved some level of feature parity. we wanted to get to a minimal viable product (mvp) for our new prototype, migrate data, release our new version, and retire our existing system. our product owner had been working directly with scholarsphere users and was able to help us determine priorities for the features we needed in order to have an mvp.
the following were some of those features:
• an api, at the very least an internal one for
  o our migration script
  o other home-developed applications
  o internal library employees
• versioning and the ability to view versions
• updated status (pending/published)
• updated user interface
• urls that were harvestable
• maintaining our data model for continued support of concepts such as collections
• enhanced support for dois

we also identified some features that had been developed over the years to either simplify or eliminate. the profiles within scholarsphere were not heavily used and over the years the university had developed more mature systems for this type of purpose. similarly, finding a featured researcher for the home page seemed to create more work than it was worth, and our social media integrations were not going to be a priority. we also thought a user’s dashboard (the default page after logging in) could be greatly simplified based on the most prominent actions our researchers wanted to perform.

conclusion

after a little over a year of development, in november 2020, we released our new version of scholarsphere. we used our own internal api, as planned, for data migration from our existing fedora commons storage system into the new one in amazon s3. over the past seven months we have done nine feature releases, including collections, and an enhanced api to support penn state’s open access initiative. we learned some lessons along the way within all of these lenses. we have also more than doubled the physical storage size of our repository since releasing in november 2020. over the summer, we were able to meet a faculty member’s request to upload 30 to 40 videos of 300 to 400 gb, a request we never would have been able to meet in our prior solution.

community & local needs

working with the samvera community has provided countless opportunities for our entire team.
we were able to sharpen our technical skills, were given opportunities to lead workshops, organized community development sprints, and collaborated on a plan for a community roadmap (to name a few). our entire team benefitted in several ways from its involvement in the community: our software knowledge is deeper, our problem-solving skills are more creative, and our outside professional opportunities have expanded. ultimately, our paths diverged in a way that made it difficult to justify the time and resources required for merging back. there are several benefits to community-based software: more eyes looking at potential security issues in code, more voices to let you know when a dependency of your code has become vulnerable, shared software ideas for developing issues, and shared solutions for common problems. these benefits come at the cost of increased complexity in organizing a solution (you need to take multiple institutions into account), workflows for development (your local workflow may not be the same as the community-approved workflow), competing priorities within the community, and competing priorities between the community and local roadmaps. open source communities are largely online; these groups typically have a more shared, informal leadership structure, and that lack of formal leadership can make it difficult to find solutions to these complexities.14

team dynamics, and platform and infrastructure stability

rewriting a system can be a daunting task, and several prominent developers would argue against it.15 reasons we believe we were successful are that (1) we did not change our data model, (2) although we changed our architecture, we did not change our coding conventions or our agile development process, and (3) the benefits of our changes were multidimensional. we were meeting users’ needs with our development work and our infrastructure was enhancing our capabilities and making the work of our developers easier and less frustrating.
our deployment process has improved to the point that we can perform a release easily and without downtime. our technology is no longer based on samvera, and is now, largely, a more generic ruby on rails application. we migrated from using fedora as both a metadata and object store (retrieving objects on our central isilon system through fedora) to using postgres as a metadata store and amazon’s s3 storage service for our files. we migrated our background job processing from resque to sidekiq. we continue to use the blacklight discovery and search interface, with solr as our search platform. many of these technical decisions were made because of the change in dynamics of our team, and perhaps the single biggest change was around experience and the confidence that comes with that. selecting a platform and infrastructure to support that platform is daunting. it is particularly difficult when you have so many questions in front of you about how the system will be used, the demand it may be under, the need to scale, how to deploy new features and update dependencies, etc. our decisions in 2019 were made with much more experience and understanding of what was required of our system as well as what was desired by our users. this gave us the confidence to branch off slightly from the joined technical path and to recognize all the value (beyond technical solutions) of remaining members of the community, albeit in a modified capacity.

acknowledgements

many people put tremendous time, effort, skill, thought, and enthusiasm into scholarsphere over the years. we want to acknowledge all those who have contributed to the development and advancement of the system and express our appreciation for their work: carolyn cole, hector correa, michael tribone, michael j.
giarlo, adam wead, ryan schenk, jeff minnelli, dann bohn, justin patterson, joni barnoff, seth erickson, kieran etienne, calvin morooney, jim campbell, paul crum, chet swalina, matt zumwalt, justin coyne, elizabeth sadler, valerie maher, jamie little, brian maddy, kevin clair, patricia hswe, and beth hayes. endnotes 1 helen hockx‐yu, “digital preservation in the context of institutional repositories,” program 40, no. 3 (2006): 232–43, https://doi.org/10.1108/00330330610681312. 2 raymond okon, ebele leticia eleberi, and kanayo kizito uka, “a web based digital repository for scholarly publication,” journal of software engineering and applications 13, no. 4 (2020), https://doi.org/10.4236/jsea.2020.134005. 3 research data access and preservation, “browse data sharing requirements by federal agency,” sparc, september 29, 2020, http://researchsharing.sparcopen.org/compare?ids=18&compare=data; “publisher data availability policies index,” chorus, october 8, 2021, https://www.chorusaccess.org/resources/chorus-for-publishers/publisher-data-availabilitypolicies-index/. 4 “registry of open access repository mandates and policies,” roarmap, http://roarmap.eprints.org/view/country/840.html. 5 “student enrollment – fall 2021,” the pennsylvania state university data digest 2021, https://datadigest.psu.edu/student-enrollment/. 
6 stephen abrams, john kunze, and david loy, “an emergent micro-services approach to digital curation infrastructure,” the international journal of digital curation 5, no. 1 (2010): 172–86, https://doi.org/10.2218/ijdc.v5i1.151.
7 jennifer marlow and laura dabbish, “activity traces and signals in software developer recruitment and hiring,” in cscw ’13: proceedings (acm, 2013): 145–56, https://doi.org/10.1145/2441776.2441794.
8 dai clegg and richard barker, case method fast-track: a rad approach (reading: addison-wesley, 1994).
9 “questioning authority,” github, accessed september 2021, https://github.com/samvera/questioning_authority; “browse-everything,” github, accessed 09/05/2021, https://github.com/samvera/browse-everything.
10 sophie huilian qiu et al., “going farther together: the impact of social capital on sustained participation in open source,” 2019 ieee/acm 41st international conference on software engineering (icse) (2019): 688–99, https://doi.org/10.1109/icse.2019.00078.
11 adam wiggins, “the twelve-factor app,” accessed september 2021, http://12factor.net.
12 akond rahman, chris parnin, and laurie williams, “the seven sins: security smells in infrastructure as code scripts,” 2019 ieee/acm 41st international conference on software engineering (icse) (2019): 164–75, https://doi.org/10.1109/icse.2019.00033.
13 christopher mendez et al., “open source barriers to entry, revisited: a sociotechnical perspective,” in proceedings of the 40th international conference on software engineering (may 2018): 1004–15, https://doi.org/10.1145/3180155.3180241.
14 lindsay larson and leslie a. dechurch, “leading teams in the digital age: four perspectives on technology and what they mean for leading teams,” leadership quarterly 31, no. 1 (2020), https://doi.org/10.1016/j.leaqua.2019.101377.
15 frederick p. brooks jr., the mythical man-month: essays on software engineering (reading, mass.: addison-wesley pub. co., 1982), https://search.library.wisc.edu/catalog/999550146602121; joel spolsky, “things you should never do part i,” joel on software, april 6, 2000, https://www.joelonsoftware.com/2000/04/06/things-you-should-never-do-part-i/.

academic uses of google earth and google maps in a library setting
eva dodsworth and andrew nicholson
abstract

over the last several years, google earth and google maps have been adopted by many academic institutions as academic research and mapping tools. the authors were interested in discovering how popular the google mapping products are in the academic library setting. a survey was conducted to establish the mapping products’ popularity and type of use in an academic library setting. results show that over 90 percent of the respondents use google earth and google maps either to help answer research questions, to create and access finding aids, for instructional purposes or for promotion and marketing. the authors recommend expanding the mapping products’ user base to include all reference and liaison librarians.

introduction

since their launch in 2005, google maps and google earth have had an enormous impact on the way people think, learn, and work with geographic information. with easy access to spatial and cultural information, google maps/earth has provided users with the means to understand their world and their communities of interest. moreover, the customizable map features and dynamic presentation tools found in google maps and google earth make each one an attractive option for someone wanting to teach geographic information or make customized maps. for academic researchers, google mapping applications are also appealing for their powerful ability to share and host projects, create customized kml (keyhole markup language) files, and easily communicate their own research findings in a geographic context. recognizing their potential for revitalizing map collections and geographic education, the authors felt that many academic libraries were also going to be active in using google maps/earth for a variety of purposes, from promoting their services to developing their own google kml files for users.
with google earth’s ease of use and visualization capabilities, it was even thought that academic libraries would be using google earth heavily in instruction classes, bringing geographic information to subject areas traditionally outside of geography. as active users of google maps/earth in their roles as academic librarians at their universities, the authors became curious to know what other academic librarians were doing with google maps/earth, particularly those working with maps and/or geography subjects. were they using the technology as part of their librarian roles on campus? how were they using it? what impacts was it having in how they delivered library services?

eva dodsworth (edodsworth@uwaterloo.ca) is geospatial data services librarian, university of waterloo library, waterloo, and andrew nicholson (andrew.nicholson@utoronto.ca) is gis/data librarian, hazel mccallion academic learning centre, university of toronto mississauga, ontario.
information technology and libraries | june 2012

to help answer these questions, the authors set out on a three-stage process with the aim of providing a more complete picture of google maps/earth use in academic libraries. the first stage consisted of a literature search focusing on library and information science research databases, to see what (if any) scholarly research had been written that discussed the role of google maps/earth in academic libraries. the second stage of the research had the authors examining over a dozen academic library websites to assess how they were integrating google maps/earth either through an api plug-in on their website or advertising other google maps/earth related services and collections.
the third stage had the authors compile a set of twenty survey questions which were then distributed to academic librarians across canada and the united states, probing the use of google mapping products in the academic library setting. literature review despite the ubiquity of google for information searching, there was a surprising paucity of literature that documents the impact of google maps/earth in academic libraries. nevertheless, there are some articles which indicate just how much google maps can help raise the profile of library services. terry ballard, a librarian at quinnipiac university, describes in a few articles how he and colleagues were able to use google earth placemarks to promote his library’s special collections.1 the potential for “discovering the library with google earth” is also a theme in an article by brenner and klein in which the portland state university library linked their urban planning documents collection to google earth for ease of searching.2 although the focus is on public libraries, michael vandenburg documents how his library system began “using google maps as an interface for the library catalogue.” in his article, vandenburg discusses that the inspiration for such a project came about through various google maps mashups that were popular on search oriented websites such as “housing maps,” which combined realtor listings from craigslist with a google maps api. using api coding, vandenburg was able to link latitude and longitude data of countries to individual opac records enabling a visual search for items at the country level.3 while these articles focused on use of google earth as a collection discovery tool, troy swanson notes the visualization aspects of the applications and their utility for teaching information literacy. swanson has students use google earth and second life as tools to create a virtual exhibit on malcolm x. 
although swanson notes that the final output by the students did not meet the initial expectations, valuable learning opportunities for teaching in a 3d space were recognized and should be pursued.4 some of these opportunities are highlighted as case studies by lamb, noting the visualization aspects of google earth would be very useful for librarians providing instruction.5

google maps/earth & academic libraries: a scan of selected library websites

for the next stage, the authors performed an environmental scan of academic library websites to see how they are using and implementing google mapping technology into their services. many are doing creative and innovative project work which will, we hope, encourage and guide other libraries to consider doing something similar. mapping technology can be used in several different ways, and with internet users becoming more proficient using this technology, libraries have the opportunity to take advantage of this communication medium. any document or image that has a geographic component can be digitized and made easily accessible using online mapping technology. the following section will review some of the projects highlighted on websites. the projects can be grouped into the following categories: finding aids, collection distribution, and teaching and reference services.

finding aids

all collections in libraries require some sort of finding aid to locate library material, the most obvious one being the library catalog. however, there are many location-based materials that use customized finding aids such as map and air photo indexes, and geospatial data coverage maps. for several years now libraries have been trying to make access to the finding aids easier by digitizing them and offering them online.
not only are online versions easily updatable, but they are quite often created using google technology, allowing for the use of modern basemaps and zoom capabilities. traditional paper indexes can be difficult to navigate, especially the historical ones, making the search process rather difficult for users and library staff. one of the most popular types of online indexes created by libraries is air photo indexes. most map libraries collect air photos, and many use similar indexes to help locate aerial photography for an area of interest. several libraries have digitized the indexes making the same information available online. users simply zoom into a geographical area and click on a point to retrieve the photo information they need in order to locate the air photo in the library collection. some libraries will even send an electronic copy of the photo to the users. the mcgill university library, for example, has made its air photo information available from their webpage in a kml format to be viewed in google earth. users can click on a point of interest to easily obtain the air photo information. mcgill library has also digitized topographic indexes, making them also available via google earth.6 the university of western ontario’s serge a. sauer’s library also provides its air photo indexes online, incorporating google maps directly into their website. placemarks representing individual photos have been inserted on a google map, along with the photo description so that when users click on the placemark, photo information is released. using google mapping technology to offer online finding aids that are searchable by location is an innovative and cost-free step towards collection accessibility. what would make these types of library collections even more accessible, however, is offering users online access to digital versions of the collection items themselves. 
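as an illustration of the kind of index described above, the following is a minimal sketch of how a list of air photo records could be written out as kml placemarks that open in google earth. it is not taken from any of the libraries mentioned: the function name `build_index_kml`, the record fields, and the sample record are all hypothetical. only the kml 2.2 namespace and the longitude,latitude ordering of coordinates come from the kml format itself.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def build_index_kml(photos):
    """Return a KML document (as a string) with one placemark per photo.

    Each record is a dict with hypothetical keys: id, description, lon, lat.
    """
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for photo in photos:
        placemark = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = photo["id"]
        # The description is what a user sees after clicking the placemark.
        ET.SubElement(placemark, f"{{{KML_NS}}}description").text = photo["description"]
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML coordinates are written longitude first, then latitude.
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = (
            f'{photo["lon"]},{photo["lat"]}'
        )
    return ET.tostring(kml, encoding="unicode")


if __name__ == "__main__":
    # Made-up sample record, for illustration only.
    sample = [
        {
            "id": "flight-12-photo-034",
            "description": "1955 air photo, scale 1:20,000; ask at the map desk",
            "lon": -80.54,
            "lat": 43.47,
        }
    ]
    print(build_index_kml(sample))
```

saving the returned string to a file with a .kml extension is enough for google earth to display the points; a real index would usually also carry a call number or a link to the digitized image in each description.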
so to bring the indexing project one step forward, not only would the photo reference information be made available, but the actual image would be too, thereby allowing libraries to use google mapping technology as an avenue for collection distribution and delivery.

collection delivery

libraries have had digital collections for quite some time. many of course do not need to digitize resources themselves as they subscribe to products such as electronic journals and books. however, there are still some less common collections that are physically housed in libraries that would be much more accessible to users if they were exposed and made available online. an internet search has shed light on numerous digitization projects that use google mapping technology to search for and deliver location-based collections. examples of these types of collections include historical maps and air photos, archived photos and postcards, audio interviews, community information, textual documents like letters and diaries, and gis data. mcmaster university library is one example of a library that has digitized a historical map collection and made it available online. an index to its world war i military maps and aerial photography was created using google maps, and was embedded into its webpage.7 users can click on an area of interest to bring up the corresponding high resolution map image. likewise, brock university library has also offered its historical air photo collection online, allowing users to search using a google map, and then download photos of interest.8 additionally, yale university library has created kml indexes of its fire insurance plans, with direct links to the digitized images.9 the university of connecticut library has digitized its local historical maps and, using google maps, has created a map mashup which includes historic landmarks. clicking on the landmarks provides users with links to related resources.
several libraries have digitized other imagery, such as postcards and photography. this is particularly popular with archival and specialized collections. the university of vermont library has embedded a google map into its website with placemarks that when clicked lead the user to the library’s long trail collection, an assortment of over 900 images of the oldest long-distance hiking trail in the united states. the images have been digitized from hand-colored lantern slides.10 cleveland state university library has also done something similar with its cleveland memory project, in which google maps were embedded into the library webpage and placemarks of local historic landmarks added. when users click on the placemarks, they are able to access a description of the landmark along with a photograph of it. clicking on “more information” will lead the user to several related resources, including the library catalog, where original documents about the location are available (e.g., images, books).11 besides digitizing their collections, some libraries have also georeferenced them so that they could not only be accurately located using an index, but so that they could be viewed in google earth (kml format). offering collections in kml format greatly increases exposure and use of geographic resources because google earth is one of the more popular location-based applications used by library users and the public. geographic files such as georeferenced air photos and satellite images, as well as gis data used to be only viewed in specialized gis programs. but gis technology has evolved into so many online applications, offering all computer users the benefits of geographic information and a platform to distribute information. 
academic uses of google earth and google maps in a library setting | dodsworth and nicholson 106 the university of waterloo map library is one example of a library that had digitized its historical air photo collection and made the images available in kml format for google earth usage.12 users can access a map index of the available photos from the map library webpage and then click on the index to download the images. the university of north carolina library has georeferenced several historical maps and made them available for viewing as an image overlay in google maps. this particular mapping project consists of around 150 thematic maps, including historical soil surveys, road and highway maps, city/county maps, and more. users can take advantage of the georeferenced maps and accurately compare historical features to modern ones with google maps’ basemap. having a preview of the dataset before it is downloaded assists the user in downloading only what is needed.13 perhaps more popular than a library’s air photo collection are libraries’ collections of geospatial data. geospatial, or gis data, has traditionally been only used by users who have access to gis programs such as esri’s arcgis, or arcview. more recently, librarians have discovered that when spatial files are converted into easy-to use file formats, such as kml, the user group is broadened and the files are used more. so it is no surprise that several libraries have converted their gis shapefiles (a spatial data file format used specifically in gis programs) into kml files and made them available for download from their webpages. university of connecticut library offers its gis files online in various formats, including kml. it also provides a sample image of the gis layer in google maps.14 baruch college at the city university of new york has made neighborhood census data available in google maps. 
the geographic boundary files were overlaid in google maps, and clicking on the map leads users to the files available from the u.s. census bureau’s american factfinder.

clearly, many libraries have incorporated google mapping technology into their digitization projects. the technology has proven capable of attracting collections that are not strictly location-focused in the way maps and air photos are, but that have a location associated with them, such as archival photos of community landmarks or books written about a specific locale. google mapping technology makes the organization and storage of collections relatively effortless for library project managers, and it makes collection searching and distribution simple and friendly for the users.

other uses of google maps/earth in libraries

perhaps one of the simplest uses of google mapping technology can be illustrated by visiting several library websites. many libraries have embedded google maps into their website as either a webpage header15

survey: what are academic library staff doing with google maps/earth?16

following the review of the literature and academic library websites, the authors wanted to discover how academic librarians themselves were using google maps and google earth in their work, if at all. to capture this data, the authors compiled a set of survey questions targeting those in the academic library community who work with maps, gis, or geography/geology/earth science subject matter.

in preparing the survey questions, the authors were aware of “survey fatigue” among the academic library community. at the time of research, many surveys were going out to librarians requesting their time and responses, so the authors wanted to keep the survey concise in both the number and the types of questions.
in the end, the survey was created with twenty questions: six yes/no questions, seven multiple-choice questions, and seven short-answer questions. to distribute the survey, the authors wanted to reach as many librarians as possible who worked with maps, geospatial data, and government document subject matter. the survey was therefore distributed on specialized map library and government publication listservs, including maps-l, govinfo, gis4lib, and carta (canadian maps & air photo systems forum). it was also distributed on the members-only lists of the association of canadian map libraries & archives (acmla) and the western association of map libraries (waml). the survey was available on survey monkey for two months, from december 2010 to the end of january 2011.

the responses

with the survey available during a quieter period of library activities, and thanks to a couple of reminder emails sent out on the lists, our questionnaire received a total of 83 responses.

who is using google maps/earth?

the first couple of questions dealt with the department or area of the library in which the respondent worked and what their position encompassed. as expected, a large majority of respondents, 81 percent, worked in “map/gis services,” while 28.8 percent also had “general reference” responsibilities. other library service areas mentioned included “data services” and “it,” as well as some that fell outside library boundaries, where staff worked in geography and environment science departments. not surprisingly, 52 percent of the responses indicated that their position was “librarian,” with the majority being “gis librarian” or “map librarian.” others included “reference & instructional services librarian” and “science librarian.” also received were 17 responses from gis specialists, library technicians, and map assistants.
what was especially noteworthy was that 12 responses were from library administrators, directors, or department heads who were finding time to work with google earth as part of their responsibilities. this number also included gis coordinators and map curators responsible for making decisions in their departments. google mapping products : what is being used, how often and for what purpose? to gain an understanding of how library staff are using google mapping products, a series of questions was asked of the respondents to determine which products were being used, how often and for which tasks. respondents were given a list of all the google mapping products available, and were asked to indicate which ones they had worked with. not surprisingly, the top two products used by respondents were google maps, 93 percent (71) and google earth, 91 percent (69). google maps api had been used by 40 percent (30) of the respondents, followed by google earth pro at 38 percent (29). eight percent (6) had also worked with google earth api, and 7 percent (5) had used google earth plus. interestingly, one respondent indicated that they had deployed google earth enterprise in their library. academic uses of google earth and google maps in a library setting | dodsworth and nicholson 108 figure 1. respondents’ use of google mapping products since many of these users may have simply used the products occasionally, it was important to get a sense of how often the products were being used. when asked the question “how regularly do you work with google mapping products for work-related projects?” 69 percent (54) responded that they use the products at least once a month. of those responses, 45 percent (35) use them at least weekly. specifically, eighteen percent (14) use them one to two times a week, thirteen percent (10) use them three to four times a week, and fourteen percent (11) use them even more often than that. 
only six percent (5) responded that they don’t use the products at all.

figure 2. frequency of use for work-related projects

as google maps/earth can be used in many different ways and for different purposes in a library environment, the survey asked how these products were in fact being used in respondents’ libraries. the survey question listed four possible tasks that the technology could be used for, with the additional option for respondents to enter their own “other” usages. respondents could check off all that applied. the options given were:

• instruction
• promotion/marketing
• answering research questions
• creating/accessing a finding aid tool (air photo map indexes, etc.)
• other: (fill in answer)

the majority of respondents, 82 percent (58), indicated they were using the products to answer research questions; 61 percent (43) for creating or accessing a finding aid tool; 56 percent (40) for instruction purposes; 27 percent for promotion/marketing; and 20 percent (14) for “other” purposes, including georeferencing imagery, use in webpages, and creating learning objects.

figure 3. level and frequency of use in instruction

are google mapping products being used for library instruction?

for the authors, one of the best aspects of google maps/earth applications is their visualization capabilities. the ability to easily create and display geographic information to engage students makes google mapping applications an ideal instruction tool. in many ways, google maps and google earth have helped promote map and spatial literacy as concepts that are teachable.
despite the free availability and ease of use of google mapping applications, the authors were somewhat surprised from the survey to find that 72 percent of library staff surveyed noted that their institution did not have any kind of map, spatial, or geospatial literacy policy in place. when it came time to provide instruction in the classroom, the survey found that only 31 percent (26) of the respondents had even used google earth in a classroom. nevertheless, in looking at the course levels, library instruction with google earth tools is actually occurring at all levels, from first year to graduate. significantly however, the frequency of the instruction seems to peak in the fourth year, where staff are using in upwards of six to nine courses. respondents were asked to give some details of these sessions, and they included a variety of class topics from environmental awareness education for first year students, to learning digitization skills in later years. has your library taken advantage of google map/earth technology for promotion or marketing purposes? information technology and libraries | june 2012 111 from our environmental scan of library websites we saw many interesting uses of google maps and google earth that were embedded directly into websites. perhaps because of this the authors were surprised to find that 55 percent of the survey respondents did not believe their library was using these technologies for promotion or marketing purposes. for those respondents who were using google maps or google earth to boost services for users, quite a few provided interesting examples of what this technology can offer. many were using google map apis to enhance map and aerial photo indexes, creating greater awareness of these resources and enhancing access. 
one respondent noted they had created a campus tour that highlighted all of the buildings that made up the library system, while others were using google api technology to showcase particular digitization projects such as folklore collections or geologic atlases. when asked if such activities have helped to enhance services or provided benefits to users, many responded that they had for both the users and for other library staff. greater speed and an increased familiarity of the collections were cited by several respondents, who no longer need to consult the paper indexes. does the library provide support to the wider campus community using google mapping products (not including instructional collaborations)? although many libraries are now using google maps and google earth technology, the authors were surprised that many were not actively leveraging this expertise across their campuses. almost all the respondents either skipped the question or stated that they were not providing this kind of active support. several noted that their gis services were open to all and that they were responsible for the google earth pro licences on campus, but that this was the extent of their support. working with google map/earth (kml) files in the last few years, kml files have become one of the more popular ways to display and distribute geographic information online. with its ease of use, and access, kml files have considerably broadened the user base of geographic information. kml files can be easily created in google earth, and they can be easily converted from gis files in specialized programs. it is this ease of access and usability that has popularized geographic information, hence increasing exposure to library collections and services. this survey was therefore interested in determining how libraries are using and creating kml files. 
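for readers unfamiliar with the format, a kml file is simply an xml document in which each placemark carries a name, a description, and a longitude,latitude,altitude coordinate string. the following is a minimal sketch, assuming python and its standard xml.etree.ElementTree module, of how a small placemark index could be generated. the function name build_kml and the sample placemark values are hypothetical and are not taken from any of the library projects described in this paper.

```python
import xml.etree.ElementTree as ET

# Namespace of KML 2.2 documents.
KML_NS = "http://www.opengis.net/kml/2.2"


def build_kml(placemarks):
    """Return KML text for an iterable of (name, description, lon, lat) tuples.

    Hypothetical helper for illustration only; KML stores coordinates
    in longitude,latitude,altitude order.
    """
    # Serialize the KML namespace as the default (prefix-free) namespace.
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    document = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for name, description, lon, lat in placemarks:
        placemark = ET.SubElement(document, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
        ET.SubElement(placemark, f"{{{KML_NS}}}description").text = description
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat},0"
    body = ET.tostring(kml, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


if __name__ == "__main__":
    # Hypothetical sample record: one air photo with an invented location.
    sample = [("air photo A-123", "sample index entry", -79.2477, 43.1176)]
    print(build_kml(sample))
```

saving the returned text with a .kml extension should give a file that google earth can open; a real index would more likely be produced by converting existing gis files, as the respondents below describe.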
when survey respondents were asked whether they work with kml files, 64 percent (47) responded that they did, with 85 percent (40) claiming that they create their own kml files. of those who create their own, 92 percent (34) said that they created kml files by converting them from another file format using an external application, such as arcgis, earthpoint, ogr2ogr, or shp2kml software; 78 percent (29) also created them in google earth, and 32 percent (12) created kml files by writing their own xml code.

the authors were most interested to know whether kml files were actually held as part of the library holdings. thirty percent (13) of respondents noted that they provide access to their kml files as part of their collections, with 89 percent (8) claiming they could be located through the library website. other areas mentioned for access included libguides and specialized gis data catalogues available through the library’s website. in terms of quantity, one respondent claimed a collection of 500-800 kml files, while other responses mentioned amounts ranging from 5 to 100, with some respondents not sure exactly how many files made up their collection.

what other online mapping tools are used in your library apart from google maps and google earth?

although google maps and google earth are perhaps the most well-known online mapping tools available, the authors were also interested to learn whether there were other products that libraries were using as part of their service offerings. as expected, many mentioned esri’s arcgis online and esri’s arcexplorer, while other responses included bing maps, openstreetmap, and openlayers.

discussion

google mapping applications are clearly being used for academic purposes in library settings. with such diverse capabilities made available in these programs, library professionals are using them in several different ways.
google earth and google maps are popular among library staff who work with gis and/or map collections. in fact, over 90 percent of the respondents use both products, either to help answer research questions, to create and access finding aids, for instructional purposes or for promotion and marketing. google mapping products have also helped libraries revitalize their collections as well as assist in transferring spatial information literacy skills to academic students and faculty. the authors hope that readers who work in a map/gis library setting will be inspired by the many examples of online mapping projects outlined in this paper and they will too use the online tools to the benefit of their library and their library users. google mapping products offer libraries an online platform to share information, and resources in an easy, accessible and low-cost way. the survey results also indicate that map/gis professionals in academic libraries trust and rely on google maps/earth as a solution to many academic queries and needs. since google mapping products were created for the use by mainstream society, it can be suggested that all other nonmap and gis related fields may find the products to be beneficial and useful to them as well. google earth and google maps are very easy to learn and the users do not require any spatial or mapping skills. as this survey was limited to map/gis users, the authors do not know how, if at all, google mapping products are being used by other library staff. this will be a future area of study. the authors do strongly suggest however for map/gis librarians to consider offering training sessions to reference staff and liaison librarians. as a multidisciplinary tool, many subject areas can benefit from google maps/earth, as it’s certainly not a tool for use by only gis/map librarians. 
with a little bit of training, all library staff can use google mapping products to assist with research questions, spatial literacy, location-based projects and library instruction. in fact, library staff members responsible for nontraditional library material such as photographs, postcards, audio recordings, original hand-written documents, etc. may want to consider using online mapping products to organize their collection. too often such original material is lost in the library’s filing system, is irretrievable, or is unavailable during convenient hours. google maps/earth will organize all collections based on their geographic location and can offer access to the actual material. more exposure to and training on these free and easy to use products can increase collection use, promote mapping technology, and organize the library’s holdings.

references

1 terry ballard, “inheriting the earth: using kml files to add placemarks relating to the library’s original content to google earth and google maps,” new library world 110 (2009): 357-65, doi: 10.1108/0307480091097579; mikael jacobsen and terry ballard, “google maps: you are here: using google maps to bring out your library’s local collections,” library journal, october 15, 2008, http://www.libraryjournal.com/article/ca6602836.html (accessed september 11, 2011).
2 michaela brenner and peter klein, “discovering the library with google earth,” information technology and libraries 27 (2008): 32-6.
3 michael vandenburg, “using google maps as an interface for the library catalogue,” library hitech 26 (2008): 33-40.
4 troy swanson, “google maps and second life: virtual platforms meet information literacy,” college & research libraries news 69 (2008): 610-12.
5 annette lamb and larry johnson, “virtual expeditions: google earth, gis, and geovisualization technologies in teaching and learning,” teacher librarian 37 (2010): 81-5.
6 a list of mcgill library’s air photo indexes can be viewed at http://www.mcgill.ca/library/library-findinfo/maps/airphotos/ (accessed september 8, 2011).
7 the mcmaster university library map index can be found at http://library.mcmaster.ca/maps/ww1/ndx5to40.htm (accessed september 8, 2011).
8 the brock university historical air photo collection can be accessed at http://www.brocku.ca/maplibrary/airphoto/historical.php (accessed september 8, 2011).
9 the yale university sanborn indexes can be found at http://www.library.yale.edu/mapcoll/print_sanborn.html (accessed september 8, 2011).
10 the university of vermont library’s google map can be found at http://cdi.uvm.edu/collections/browsecollection.xql?pid=longtrail&title=long%20trail%20photographs (accessed september 8, 2011).
11 the cleveland memory project can be found at http://www.clevelandmemory.org/hlneo/ (accessed september 8, 2011).
12 the university of waterloo map library website can be found at http://www.lib.uwaterloo.ca/locations/umd/project/ (accessed september 8, 2011).
13 the university of north carolina library provides interactive maps at http://www.lib.unc.edu/dc/ncmaps/interactive/overlay.html (accessed september 8, 2011).
14 the university of connecticut library offers gis files online at http://magic.lib.uconn.edu/connecticut_data.html (accessed september 8, 2011).
15 campus map examples include yale university library at http://maps.commons.yale.edu/venice/. example maps for library locations on campus include brock university library, http://www.brocku.ca/maplibrary/general/where-is-the-ml.php, and university of north carolina, http://www.lib.unc.edu/libraries_collections.html (all accessed september 8, 2011).
16 the full survey instrument can be found in the appendix of this document.

appendix

google maps and google earth: influences and impacts in your library

you and your library

1. what is your work position title?
2. what department/division/area of library do you work in? (click all that apply)
o map/gis services
o government publications
o general reference
o technical services
o other (please specify):

google mapping products

3. please check all the products you have worked with?
o google maps
o google maps api
o google earth
o google earth plus
o google earth pro
o google earth api
o google earth enterprise
4. how regularly do you work with google mapping products for work-related projects?
o not at all
o a few times a year
o 1-3 times a month
o 1-2 times a week
o 3-4 times a week
o more often than that!
o not sure
5. for what work related tasks, have you used these products? (click all that apply)
o instruction
o promotion/marketing
o answering research questions
o creating/accessing a finding aid tool (air photo, map indexes, etc.)

library instruction using google mapping products

6. does your library have a map, or spatial, or geospatial literacy policy or program?
o yes
o no
7. if you are using google mapping products for instruction, what level or year of university course(s) are you using it in, and in how many courses:

course level | 1-2 | 3-5 | 6-9 | 10-14 | 15 and more
1st year (100 level) | | | | |
2nd year (200 level) | | | | |
3rd year (300 level) | | | | |
4th year (400 level) | | | | |
graduate level | | | | |

8. please describe some of these activities?
9. does your library offer geographic awareness or gis-related training to some or all the library staff?

promotion/marketing using google mapping products

10. has your library used google mapping technology to promote, offer, or deliver a service? (for example, offering kml files for download, indexes, guides, scanned documents, placemarks/urls from google maps/earth, etc.)
o yes
o no
10a. if yes, please describe with as much detail as possible how your library has used google mapping technology. if possible, please provide links to the projects.
10b. if yes, how have the google mapping related projects enhanced services or benefited the library?
11. does the library provide support to the wider campus community using google mapping products (not including instructional collaborations)?

kml/kmz collections

12. do you work with kml files?
o yes
o no
13. do you create your own kml files?
o yes
o no
14. how do you create your own kml files?
o write xml code
o save in google earth
o convert from another file format using an external application
o other (please specify)
15. does your library hold and provide access to kml or kmz files as part of its collections?
o yes
o no
16. if yes, approximately how many files do you currently hold?
17. how are these files findable by your patrons?
o opac
o library website
o both
18. do you or other library staff use other online mapping tools? please list which ones and what they are used for.

file size and the cost of processing marc records

john p.
kennedy: data processing librarian, georgia institute of technology, atlanta, georgia

many systems being developed for utilizing marc records in acquisitions and cataloging operations depend on the selection of records from a cumulative tape file. analysis of cost data accumulated during two years' experience in using marc records for the production of catalog cards at the georgia tech library indicates that the ratio of titles selected to titles read from the cumulative file is the most significant determinant of cost. this implies that the number of passes of the file must be minimized and an effective formula for limiting the growth of the file must be developed in the design of an economical system.

since 1963 several articles on computerized production of catalog cards have reported cost figures for card production. fasana reported a cost per card of 9.9 cents at the air force cambridge research laboratory (afcrl) (1). costs at the yale medical library under the columbia-harvard-yale computerized card production system varied from 8.8 cents to 9.8 cents per card (2). under the yale bibliographic system, costs for card production at the yale medical library have been 13.9 cents per card. when the marc mate program is used to introduce marc records into the yale bibliographic system, the cost of cards produced from the marc records is 24.9 cents (3). costs for computer assisted card production at the philip morris research library have been estimated at 18 cents per card (4). the cost per card for cards produced from marc records at the georgia institute of technology library has been reported as 10 cents (5).

journal of library automation vol. 4/1 march, 1971

the focus of interest in these cost reports has been on a comparison of the costs of computer produced cards and manually produced cards. there is agreement in these reports that computer production can compete favorably in terms of cost with other methods of production.
less attention has been given to variations in the costs of computer produced cards. since the systems for which costs have been reported vary in scope and objectives, equipment used, nature of input, rates for labor, and charges for computer time, it is not very useful to compare the costs from system to system. variations in cost within one system are of greater interest, since it is easier to isolate the factors that result in the altered costs. the report on the yale bibliographic system shows that the introduction of marc records into a system that was not designed for processing marc records may produce substantially higher costs. fasana reported that when a pdp-1 computer was used rather than the specially built crossfiler in the afcrl system, the cost per card was quadrupled. kilgour discusses briefly the effects of three changes in the columbia-harvard-yale system on the cost of cards produced.

the 10-cent-per-card cost reported for georgia tech was the average cost during the preceding three-month period, january through march 1968. during the three years in which catalog cards have been produced on the computer at georgia tech, costs have varied widely as procedures, personnel, file sizes and work loads have changed. the greatest variation has occurred in the cost of the manual steps in the system, mainly proofreading and making corrections. the greatly improved accuracy of the marc ii records has resulted in a reduction in the time required for proofreading and making corrections. the costs of supplies and equipment have been small and have shown little variation. the cost of computer time has varied from 18 cents per title (just over 2 cents per card) to a high of 47 cents (6 cents per card), excluding the cost of the merge runs to maintain a cumulative file of marc records. an analysis has been made to determine the factors responsible for this variation in computer costs, and techniques for reducing computer costs have been developed.
materials and methods

the price gilbert memorial library at the georgia institute of technology is a centralized scientific, technical and management collection of 612,000 volumes plus 500,000 microtext and other bibliographic units. in 1968/69 almost 20,000 titles representing about 35,000 volumes were cataloged for addition to the collection. the library makes use of the univac 1108 and the burroughs b5500 computing systems of the institute's rich electronic computing center for its data processing needs. the work described here was performed on the b5500. the georgia tech b5500 configuration includes two central processing units, 32,000 forty-eight-bit words of core storage, 29 million characters of disc storage and 10 magnetic tape drives. library programs are written in cobol and are multi-processed with other programs in the standard work stream. the library is billed $140 per hour for central processor time and $47 per hour for io channel time.

the system for production of catalog cards from marc i records, which was in operation for over two years, has been described previously (6). statistics were recorded for all computer runs in the processing of 73 batches of marc i titles. these statistics include number of records processed, file sizes, processor time, io channel time, and cost, for each run.

the time and cost remained fairly constant for some runs. the cost of runs to produce the sorted catalog cards from edited marc records ranged from 6 to 9 cents per title and averaged a little over 7 cents. the cost of runs to make changes and additions to the marc records ranged from 1 to 5 cents per title and averaged 2 cents. the cost was usually about 1 cent per title for each time the correction program was run. it often had to be rerun several times before all records in the batch were correct.
the library's improved marc ii system avoids the cost of correction reruns by permitting independent corrections to any record in a direct access file rather than requiring records to be processed as a batch.

most of the variation in the cost of computer time occurred in the run in which records were selected from the cumulative marc file and the selected records were then converted to the b5500 character codes, reformatted and prooflisted. the cost of this run varied from a low of 10 cents per title selected to a high of 36 cents per title; the variation is primarily an effect of the increasing size of the cumulative marc file and of variation in the number of titles selected in the run. as the marc file increased in size the cost of selecting a small number of titles increased dramatically.

the precise relationship of file size and batch size to cost per title is not apparent, however, because the cost of character conversion, reformatting, and printing the prooflist were combined with the cost of selection in a single run. an additional complication results from the effects of the other jobs being processed by the computer concurrently. for example, one batch which had to be rerun because the output tape was defective cost 23 cents per title the first time and 28 cents per title when rerun with a different job mix.

although the part of the run cost which can be attributed to passing the marc file and the part attributable to code conversion, formatting and printing cannot be determined for a single run, this can be calculated from a number of runs with varying file sizes and batch sizes. it is assumed that variations in the time required for processing individual records of varying lengths and variations due to the mix of jobs run concurrently will average out and may be disregarded.
statistics for the selection runs include the number of records read from the cumulative marc file, the number of records selected and processed, the processor time and io channel time required for the run, and the cost of the run. using the method of least squares, these statistics were used to calculate the average time and cost for each record read from the cumulative marc file. once these constants are calculated it is possible to predict the cost per item or the total cost of a select run with any given file size and batch size.

4 journal of library automation vol. 4/1 march, 1971

in order to determine the average cost for processing a selected record and the average cost for reading a record from the cumulative marc file, it was postulated that

\[ c_t = \left(\frac{fs}{bs}\right) c_n + c_p \]

where
\(c_t\) is the total cost per title
\(fs\) (file size) is the number of records read from the cumulative marc file
\(bs\) (batch size) is the number of records selected in the run
\(c_n\) is the cost of reading a record from the cumulative marc file
\(c_p\) is the cost for processing a selected record

the method of least squares yields the following equations:

\[ \left[\sum \left(\frac{fs}{bs}\right)^2\right] c_n + \left[\sum \left(\frac{fs}{bs}\right)\right] c_p = \sum \left(\frac{fs}{bs}\right) c_t \]

and

\[ \left[\sum \left(\frac{fs}{bs}\right)\right] c_n + n\, c_p = \sum c_t \]

solving these equations for the data from the 73-batch sample gives the following values:

\(c_p\) = $.073
\(c_n\) = $.00068

since charges for computer time are determined differently at other installations, the figures for processor time and io channel time may be more useful to others than the cost figures.
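the two normal equations above can be solved directly. the following is a minimal python sketch, not the program used in the study: the function name `fit_cost_constants` and the example runs are assumptions made for illustration, and the example data are synthetic values generated from the published constants rather than the 73-batch sample.

```python
def fit_cost_constants(runs):
    """Fit c_t = (fs / bs) * c_n + c_p by ordinary least squares.

    runs: list of (file_size, batch_size, cost_per_title) tuples, one per run.
    Returns (c_n, c_p).
    """
    xs = [fs / bs for fs, bs, _ in runs]
    ys = [c_t for _, _, c_t in runs]
    n = len(runs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))

    # Solve the two normal equations:
    #   sum_xx * c_n + sum_x * c_p = sum_xy
    #   sum_x  * c_n + n     * c_p = sum_y
    det = n * sum_xx - sum_x * sum_x
    if det == 0:
        raise ValueError("need runs with at least two distinct fs/bs ratios")
    c_n = (n * sum_xy - sum_x * sum_y) / det
    c_p = (sum_xx * sum_y - sum_x * sum_xy) / det
    return c_n, c_p


# Synthetic runs (NOT the study's data), generated from the published constants
# so that the fit should recover them.
example_runs = [
    (fs, bs, (fs / bs) * 0.00068 + 0.073)
    for fs, bs in [(10_000, 50), (40_000, 200), (90_000, 150), (120_000, 1000)]
]
c_n, c_p = fit_cost_constants(example_runs)
print(f"c_n = {c_n:.5f}, c_p = {c_p:.3f}")
```

the same function applies unchanged when processor time or io channel time per title is supplied in place of cost per title.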
using the same techniques but substituting processor time for cost gives the following values:

processor time per record read = .00646 seconds
processor time for selected records = 1.339 seconds

again, using the same technique but substituting io channel time for cost gives the following values:

io channel time per record read = .02048 seconds
io channel time for selected records = .456 seconds

these values may be substituted in the formula \( c_t = \left(\frac{fs}{bs}\right) c_n + c_p \) to find the cost or time per title for any batch and file size. for example, the per title cost for selecting and processing a batch of 200 records from a marc file of 40,000 records:

\[ c_t = \left(\frac{fs}{bs}\right) c_n + c_p \]
\[ c_t = \left(\frac{40000}{200}\right)(\$.00068) + \$.073 \]
\[ c_t = \$.21 \]

it will cost about twenty-one cents per title. the total cost of the run can be predicted as follows:

\[ c = (fs)(c_n) + (bs)(c_p) \]
\[ c = (40000)(\$.00066) + (200)(\$.073) \]
\[ c = \$41.27 \]

results

table 1 shows the predicted cost per title for various file sizes and batch sizes; it is based on the cost of the select run at georgia tech and ignores the cost of maintaining the marc file. since the library of congress cumulated marc i records until a reel of tape was filled and provided a cumulative card number listing of the records on the reel, it was not essential to update the cumulative marc file each week. the marc ii tapes issued from the marc distribution service are not cumulative. most libraries maintaining a cumulative file of marc records will find it necessary to update this file each week. weekly updating of the marc file requires that all records on the file be not only read but also written on a new tape each week. for most systems this will rapidly become the most expensive machine procedure in the entire system.
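as a reference point for the figures in table 1, the per-title formula can be evaluated for any file size and batch size. the sketch below is illustrative only: the function name `per_title` and the constant names are assumptions, while the numeric constants are the ones reported above for cost, processor time and io channel time.

```python
def per_title(file_size, batch_size, per_record_read, per_record_selected):
    """Cost (or time) per title: (fs / bs) * per-read constant + per-selected constant."""
    return (file_size / batch_size) * per_record_read + per_record_selected


# (per record read, per selected record), as reported in the text.
COST_DOLLARS = (0.00068, 0.073)
PROCESSOR_SECONDS = (0.00646, 1.339)
IO_CHANNEL_SECONDS = (0.02048, 0.456)

# The worked example from the text: 200 titles selected from a 40,000-record file.
cost = per_title(40_000, 200, *COST_DOLLARS)
cpu = per_title(40_000, 200, *PROCESSOR_SECONDS)
io = per_title(40_000, 200, *IO_CHANNEL_SECONDS)
print(f"cost per title: ${cost:.3f}")
print(f"processor seconds per title: {cpu:.3f}")
print(f"io channel seconds per title: {io:.3f}")
```

keeping the constants as parameters makes it easy for another installation to substitute its own billing rates or timings.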
combining the selection function and any index production with the file update means that no additional passes of the file will be required, but the cost of writing the file each week must be added to the figures in table 1. statistics from the merge runs at tech show that if the number of old marc file records read, the number of records read from the weekly update tape, and the number of records written on the new marc file are totaled, the average cost per io operation for the merge runs ranged between $.00062 and $.00073 and averaged $.00068 for all merge runs. since this is the same cost as that obtained for each record read from the cumulative file in the select runs, it seems reasonable to use this figure as the cost for reading or writing a marc record in calculating the cost of combined merge-select runs.

table 1. relationship of file size and batch size to cost per title

| file size | batch size 50 | 100 | 150 | 200 | 250 | 300 | 400 | 500 | 750 | 1000 |
|---|---|---|---|---|---|---|---|---|---|---|
| 10k | $ .209 | $ .141 | $ .118 | $ .107 | $ .100 | $ .095 | $ .090 | $ .087 | $ .082 | $ .080 |
| 20k | .345 | .209 | .164 | .141 | .127 | .118 | .107 | .100 | .091 | .087 |
| 30k | .481 | .277 | .209 | .175 | .155 | .141 | .124 | .114 | .100 | .093 |
| 40k | .617 | .345 | .254 | .209 | .182 | .164 | .141 | .127 | .109 | .100 |
| 50k | .753 | .413 | .300 | .243 | .209 | .186 | .158 | .141 | .118 | .107 |
| 60k | .889 | .481 | .345 | .277 | .236 | .209 | .175 | .155 | .127 | .114 |
| 70k | 1.025 | .549 | .390 | .311 | .263 | .232 | .192 | .168 | .137 | .121 |
| 80k | 1.161 | .617 | .436 | .345 | .291 | .254 | .209 | .182 | .146 | .127 |
| 90k | 1.297 | .685 | .481 | .379 | .318 | .277 | .226 | .194 | .155 | .134 |
| 100k | 1.433 | .753 | .526 | .413 | .345 | .300 | .243 | .209 | .164 | .141 |
| 110k | 1.569 | .821 | .572 | .447 | .372 | .322 | .260 | .223 | .173 | .148 |
| 120k | 1.705 | .889 | .617 | .481 | .399 | .345 | .277 | .236 | .182 | .155 |

table 2. relationship of file size and batch size to cost per title: file update and record selection functions combined in same program

| old file size | batch size 50 | 100 | 150 | 200 | 250 | 300 | 400 | 500 | 750 | 1000 |
|---|---|---|---|---|---|---|---|---|---|---|
| 10k | $ .378 | $ .225 | $ .175 | $ .149 | $ .134 | $ .124 | $ .111 | $ .104 | $ .093 | $ .088 |
| 20k | .650 | .361 | .265 | .217 | .188 | .169 | .145 | .131 | .111 | .102 |
| 30k | .922 | .497 | .356 | .285 | .243 | .214 | .179 | .158 | .130 | .115 |
| 40k | 1.194 | .633 | .447 | .353 | .297 | .260 | .213 | .185 | .148 | .129 |
| 50k | 1.466 | .769 | .537 | .421 | .352 | .305 | .247 | .212 | .166 | .143 |
| 60k | 1.738 | .905 | .628 | .489 | .406 | .350 | .281 | .240 | .184 | .156 |
| 70k | 2.010 | 1.041 | .719 | .557 | .461 | .396 | .315 | .267 | .202 | .170 |
| 80k | 2.282 | 1.177 | .809 | .625 | .515 | .441 | .349 | .294 | .220 | .183 |
| 90k | 2.554 | 1.313 | .900 | .693 | .569 | .486 | .383 | .321 | .238 | .197 |
| 100k | 2.826 | 1.449 | .991 | .761 | .624 | .532 | .417 | .348 | .257 | .211 |
| 110k | 3.098 | 1.585 | 1.081 | .829 | .678 | .577 | .451 | .376 | .275 | .224 |
| 120k | 3.370 | 1.721 | 1.172 | .897 | .732 | .622 | .485 | .403 | .293 | .238 |

table 2 shows the predicted costs per title for combined merge-select runs with varying file and batch sizes. the costs shown are based on the following equation:

\[ c_t = \left(\frac{fs_o + fs_a + fs_d + fs_n}{bs}\right) c_{io} + c_p \]

where
\(c_t\) is the cost per title
\(fs_o\) is the file size for the old marc file
\(fs_a\) is the file size for the add records (1200)
\(fs_d\) is the file size for the delete records (1200)
\(fs_n\) is the file size for the new marc file
\(bs\) (batch size) is the number of records selected in the run
\(c_{io}\) is the cost of reading or writing a record ($.00068)
\(c_p\) is the cost of processing a selected record ($.073)

calculations for this table are based on several assumptions: it is assumed that the file has reached a state of equilibrium in which the weekly additions and deletions are equal; it is also assumed that delete records have the same average length as other records and therefore take as long to read. while it is unlikely that these assumptions will hold perfectly, the variations are not great enough to destroy the usefulness of the resulting figures as a guide.
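the combined merge-select equation can be checked against table 2 with a few lines of python. this is a sketch under the equilibrium assumption stated above (weekly adds equal weekly deletes, so the new file is the same size as the old one); the function name and keyword names are assumptions made for illustration.

```python
def combined_cost_per_title(old_file_size, batch_size,
                            adds=1200, deletes=1200,
                            c_io=0.00068, c_p=0.073):
    """Cost per title when file update and selection share one pass of the file."""
    # New file = old file + adds - deletes; equal under the equilibrium assumption.
    new_file_size = old_file_size + adds - deletes
    records_moved = old_file_size + adds + deletes + new_file_size
    return (records_moved / batch_size) * c_io + c_p


# Spot-check three cells of table 2.
for old_size, batch in [(10_000, 50), (80_000, 150), (120_000, 1000)]:
    print(old_size, batch, round(combined_cost_per_title(old_size, batch), 3))
```

passing different `adds` and `deletes` values gives a rough estimate for a file that is still growing, although the published table assumes they are equal.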
discussion

the figures presented in the two tables have several implications for the design of systems based on the maintenance of a cumulative marc file and the selection of records from that file. first, they show the importance of assuring that no unnecessary passes of the cumulative marc file are made. updating of the marc file, production of indexes to it and selection of records from it should be accomplished in a single pass of the file. if it is desired to select records from the file more often than once a week, table 1 provides a means of estimating the cost of the improved response time. if, for example, the file size is 100,000 and the weekly volume is 500, twice-a-week runs would increase the cost by 14 cents per title or by $68.00 a week for the select runs.

the figures presented in the two tables also show the critical importance of controlling the growth of the cumulative marc file, especially for libraries with a relatively small volume of titles to be processed. three characteristics of the acquisitions program of the library largely determine the possibilities for controlling the growth of this file.

the number of titles acquired by the library determines the batch sizes for records to be selected from the file each week. the acquisition rate is also an important determinant of the growth rate of the cumulative file provided that records which have been selected and used are then purged from the file. if the library of congress issues an average of 1200 titles per week and a library uses an average of 1000 titles a week from the file, the net annual growth of the cumulative file will be only slightly over 10,000 records. on the other hand, a smaller library selecting an average of only 100 titles a week would have a net annual growth rate of about 57,000. if unused records were purged after one year, the file size would remain stable at these levels.
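the two pieces of arithmetic in this passage, the penalty for twice-a-week select runs and the net annual growth of the file, can be reproduced as follows. this is an illustrative sketch only: the function names are assumptions, a 52-week year is assumed, and the cost constants are the ones reported earlier for the select run.

```python
def select_cost_per_title(file_size, batch_size, c_n=0.00068, c_p=0.073):
    """Select-run cost per title, as in table 1."""
    return (file_size / batch_size) * c_n + c_p


def twice_weekly_penalty(file_size, weekly_titles):
    """Extra weekly cost of two half-size select runs instead of one full run."""
    once = select_cost_per_title(file_size, weekly_titles)
    twice = select_cost_per_title(file_size, weekly_titles / 2)
    return (twice - once) * weekly_titles


def net_annual_growth(received_per_week, used_per_week, weeks=52):
    """Records left on the file after a year if used records are purged."""
    return (received_per_week - used_per_week) * weeks


print(f"twice-a-week penalty: ${twice_weekly_penalty(100_000, 500):.2f} per week")
print("growth, 1000 titles used per week:", net_annual_growth(1200, 1000))
print("growth, 100 titles used per week:", net_annual_growth(1200, 100))
```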
table 2 indicates that the cost per title for file maintenance and selection at these two libraries would be about 9 cents and 86 cents respectively.

a second characteristic of the acquisitions program of the library that is important in controlling the growth of the cumulative marc file is the scope of the subject coverage attempted. if most of the monographs acquired fall within well defined subject classes, the probability of utilizing marc records in many other subject classes may be low enough that these records need not be added to the cumulative marc file at all. for a special library that attempts to collect everything published in a few well defined subject areas it may be economical to maintain and utilize a limited marc file even though the number of records selected is small. on the other hand, a small or medium-sized public library acquiring the same number of titles would probably find a much larger percentage of its records on the marc file but still not be able to use the marc tapes economically. since the public library is likely to collect titles in most subject fields, the probabilities of utilizing records in different classes would not vary as widely and it would not be possible to limit the file to records in a few classes having a high probability of utility. consequently, the per-item cost of marc records would likely be too high for consideration. if it is determined that the probabilities of using marc records vary widely for other characteristics, such as publisher, these characteristics may be used for restricting the records to be added to the cumulative file, thus limiting its size, but subject class seems to be the most promising characteristic for this purpose.

an analysis by subject class of all non-juvenile records in the marc i file and of those records selected from it for use by the georgia tech library has been used as the basis for restricting the growth of the cumulative file of marc ii records.
overall, 8,953 out of 46,486 records were utilized, 19.3% of the file. the percentage selected varied from more than 50% in some engineering classes to less than 1% in a few classes such as cs (genealogy) and bv (practical theology). elimination of thirty classes in which fewer than 4% of the records were eventually used would have reduced the file by 7,710 records or 16.6%. only 184 of these records (2.4%) were eventually selected for use. records for these thirty subject classes are not being added to the georgia tech file of marc ii records.

a third characteristic of the acquisitions program important in controlling the growth of the cumulative marc file is the speed with which newly published monographs are acquired. if most monographs are acquired soon after publication, the probability of using a marc record that has not been selected in the first few months after its receipt may be low. unselected records may therefore be purged after a relatively short time and the file size thereby controlled. use of the marc tapes for book selection will help to increase the probability of records being selected during the first few months on the file. a system that uses the weekly marc tapes for book selection and does not retain on the cumulative marc file those records not selected for purchase might be quite economical. the frequency with which decisions are later made to acquire titles that were initially passed over, and the added cost for manual input of those records, would have to be considered in deciding on this policy.

an analysis has been made of the interval between the date records were added to the marc file and the date on which they were selected for use by the georgia tech library. distributions by time intervals for each library of congress subject class were prepared. the distributions varied significantly for reasons that are not yet clear.
generally, it appeared that in those subject classes for which a smaller percentage of the titles available on the marc file were acquired, they were acquired more rapidly. this seems to be advantageous for keeping the marc file small. for those classes in which a large percentage of titles are selected, unselected records will be retained on the file for a long period, such as eighteen months. use of a large percentage will mean that the number of unused records remaining on the file will be relatively small and they will have a high probability of selection over the extended period. for those classes in which a smaller percentage of titles are acquired, the unselected records will be retained on the file for a shorter period, such as six months. since titles in these fields tend to be acquired more promptly, few potentially useful records will be lost by purging unselected records after a shorter interval.

over the past year major changes have been made in acquisitions procedures in the georgia tech library. a much larger proportion of monographs are now received on approval plans. the marc distribution service now provides about twice as many records each week as were provided during the pilot project phase. the effects of these changes on the proportion of titles selected and the time required for acquiring titles in the various subject classes have not yet been determined. continuous monitoring of the operation of the system for changes in these characteristics will be required for efficient operation. the improved program for maintenance of the marc ii file and selection of records from it provides for designating subject classes which are not to be added to the file and designating how long unselected records in other classes are to be retained on the file.
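a retention policy of this kind might be expressed roughly as below. this is a hypothetical sketch, not the georgia tech cobol program: the names, the table layout and the choice of `TK` as a high-use class are assumptions for illustration. only the two excluded classes and the eighteen-month and six-month periods come from the text.

```python
from datetime import date

# Two of the thirty low-use classes named in the text (CS genealogy, BV practical theology).
EXCLUDED_CLASSES = {"CS", "BV"}
# Hypothetical assignment: one high-use class kept for the longer period.
RETENTION_MONTHS = {"TK": 18}
# Shorter period for classes in which a smaller share of titles is acquired.
DEFAULT_RETENTION_MONTHS = 6


def months_between(start, end):
    """Whole calendar months from start to end, ignoring the day of month."""
    return (end.year - start.year) * 12 + (end.month - start.month)


def should_add(lc_class):
    """True if records in this subject class are added to the cumulative file."""
    return lc_class not in EXCLUDED_CLASSES


def should_purge(lc_class, added_on, today, selected):
    """True if a record should be dropped from the cumulative file."""
    if selected:
        # Records that have been selected and used are purged.
        return True
    limit = RETENTION_MONTHS.get(lc_class, DEFAULT_RETENTION_MONTHS)
    return months_between(added_on, today) >= limit


print(should_add("BV"), should_add("TK"))
print(should_purge("TK", date(1970, 1, 5), date(1970, 8, 1), selected=False))
```

keeping the class lists in data rather than in program logic matches the stated goal of adjusting them as acquisition patterns change.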
this study of variations in the computer costs of card production lends support to the decision to continue using cobol as the primary language for the marc ii system being implemented on the univac 1108 rather than using assembly language. the inefficiency of cobol for character-by-character code conversion and for manipulating variable length data had been a source of some concern. the cost of all processing of selected records, including code conversion, reformatting, prooflisting, making corrections, generating and formatting added entry records, and sorting and printing catalog cards, averaged only about 16 cents per title. a reduction of even 50% through the use of assembly language and increased effort directed to program efficiency would reduce costs by only about 8 cents per title or 1 cent per card. these savings do not seem to justify the increased original programming costs and the likelihood of eventual costly reprogramming. on the other hand, the cost of selecting records from the marc file varied from 3 cents per title to 29 cents per title. with the added cost of weekly maintenance of the marc file and with more than twice as many marc records being received, the costs of processing the cumulative marc file might easily go much higher. by careful attention to controlling the growth of this file, significant savings in the cost of the system may be achieved.

conclusion

some librarians have assumed that as the scope of the marc distribution service expands to include other languages and other types of materials their problems of inputting current records will be solved. this analysis shows that the situation is not so simple. probably only a few of the largest general research libraries will be able to maintain complete marc files for their individual use during the next few years, though reductions in computing costs may eventually change this prediction.
even medium-sized libraries such as georgia tech will not be able to use economically the foreign language materials when they are included in the marc program. some libraries which do not use a large enough proportion of the marc records to make it economically practical to maintain a complete marc file may be able to make economical use of marc records by carefully controlling the retention of records on the cumulative file. continuing analysis of the probabilities for selecting records of varying age and subject classes may be utilized in developing a formula for maintaining the file at near optimum size if the system provides for collection of the required statistics.

for libraries which cannot profitably use the marc tapes, there is another prospect. cooperative centers that do the processing for large library systems or for several systems will have the volume to justify maintenance of complete files. certainly, a processing center serving all libraries of the university system of georgia could economically maintain a more complete marc file than georgia tech alone can justify. the development of cooperative processing programs in ohio, new england, oklahoma (7, 8, 9), and elsewhere indicates that some librarians are coming to this realization.

acknowledgments

mrs. julie gwynn wrote most of the computer programs referred to in this paper. her husband, professor john gwynn, gave valuable advice on the statistical techniques employed in analyzing the data. the university of toronto library generously provided a copy of its marc file, which included the date each record was added to the file, for use in analysis of the time lag between availability of the record and selection of it.

references

1. fasana, paul j.: "automating cataloging functions in conventional libraries," library resources and technical services, 7 (fall 1963), 350-365.
2. kilgour, frederick g.: "costs of library catalog cards produced by computer," journal of library automation, 1 (june 1968), 121-127.
3. stone, sandra f.: yale bibliographic system; time and cost analysis at the yale medical library (unpublished document, new haven: yale university library, 1969).
4. murrill, donald p.: "production of library catalog cards and bulletin using an ibm 1620 computer and an ibm 870 document writing system," journal of library automation, 1 (september 1968), 198-212.
5. kennedy, john p.: "a local marc project: the georgia tech library." in university of illinois, graduate school of library science: proceedings of the 1968 clinic on library applications of data processing (urbana: university of illinois, 1969), pp. 199-215.
6. ibid.
7. kilgour, frederick g.: "a regional network - ohio college library center," datamation, 16 (february 1970), 87-89.
8. agenbroad, james e.; et al.: systems design and pilot operations of the new england state universities. nelinet, new england library information network. progress report, july 1, 1967-march 30, 1968 (cambridge, mass.: inforonics, inc., 1968). ed 026 078.
9. bierman, kenneth john; blue, betty jean: "processing of marc tapes for cooperative use," journal of library automation, 3 (march 1970), 36-64.

determining textbook cost, formats, and licensing with google books api: a case study from an open textbook project

eamon costello, richard bolger, tiziana soverino, and mark brown

information technology and libraries | march 2019 91

eamon costello (eamon.costello@dcu.ie) is assistant professor, open education at dublin city university. richard bolger (richard.bolger@dcu.ie) is lecturer at dublin city university. tiziana soverino (tiziana.soverino@dcu.edu) is researcher at dublin city university. mark brown (mark.brown@dcu.ie) is full professor of digital learning, dublin city university.
abstract the rising cost of textbooks for students has been highlighted as a major concern in higher education, particularly in the us and canada. less has been reported, however, about the costs of textbooks outside of north america, including in europe. we address this gap in the knowledge through a case study of one irish higher education institution, focusing on the cost, accessibility, and licensing of textbooks. we report here on an investigation of textbook prices drawing from an official college course catalog containing several thousand books. we detail how we sought to determine metadata of these books including: the formats they are available in, whether they are in the public domain, and the retail prices. we explain how we used methods to automatically determine textbook costs using google books api and make our code and dataset publicly available. introduction the cost of textbooks is a hot topic for higher education. it has been reported that by 2014 the average student spent $1,200 annually on textbooks.1 another study claimed that between 2006 and 2016 the costs of college textbooks increased over four times the cost of inflation.2 despite this rise in textbook costs, a survey of more than 3,000 us faculty members (“the babson survey”) found that almost every course (98 percent) mandated a textbook or related study resources.3 one response to the challenge of rising textbook costs is open textbooks. open textbooks are a type of open educational resource (oer). oers have been defined as “teaching, learning, and research resources that reside in the public domain or have been released under an intellectual property license that permits their free use and repurposing by others. 
open educational resources include full courses, course materials, modules, textbooks, streaming videos, tests, software, and any other tools, materials, or techniques used to support access to knowledge.”4 oers stem from the principle that access to education is a human right and that, as such, education should be accessible to all.5 hence an open textbook is made available under terms which grant legal rights to the public, not only to use, but also to adapt and redistribute. creative commons licensing is the most prevalent and well-developed intellectual property licensing tool for this purpose.

open textbook projects aimed at promoting publishing and redistributing open textbooks, both in digital and print formats, have been growing. for example, the bcampus project in canada began in 2012 with the aim of creating a collection of open textbooks aligned with the most popular subject areas in british columbia.6 the project has shown strong growth, with over 230 open digital textbooks now available and more than forty institutions involved. a significant recent development in open textbooks occurred in march 2018, when the us congress announced a $5 million investment in an open textbook initiative.7

determining textbook cost, formats, and licensing | costello, bolger, soverino, and brown 92 https://doi.org/10.6017/ital.v38i1.10738

in addition to helping change institutional culture, and challenge attitudes to traditional publishing models, one of the most oft-cited benefits of open textbooks is cost savings. according to the college board’s survey of colleges, the average annual cost to us undergraduate students in 2017 for textbooks and materials was estimated at $1,250.8 this figure is remarkably close to the aforementioned figure of $1,200 a year, as reported by baglione and sullivan. however, there is little known about the monetary face value of books that students are expected to buy, beyond studies based on self-reported data.
students themselves in the us have attempted to at least open the debate in this area by highlighting book price disparities.9 nonetheless, they only report on a very small number of books, and the college board representing on-campus us textbook retailers have disputed their results for this reason, claiming that they have been selective in the book prices they have chosen. hence this study seeks to address the gap that exists in knowledge about the true cost of textbooks in higher education. this is in the context of a wider research project we are conducting on open textbooks in ireland.10 determining the cost of books is not straightforward as books can be new, used, rental, or digital subscription. however, the cost of new books does set a baseline for other forms, particularly rental and used books. our aim here is hence to start with new books, by analyzing costs of all the required and recommended textbooks of one higher education institution (hei) in ireland. the overarching research question this study sought to address is: what is known about the currently assigned textbooks in an irish university? the sub-questions were: • rq1: what is the extent of textbooks that are required reading? • rq2: what are the retail costs of textbooks? • rq3: are textbooks available in digital or e-book form? • rq4: are textbooks available in the public domain? the next section outlines our methodology and how we sought to find answers to these questions. methods in this section we describe our approach, the dataset generated, and the methods we used to analyze the data. we identified a suitable data source comprising the official course catalog of a hei in ireland with more than ten thousand students. in the course catalog faculty give required and recommended textbook details for all courses. this information is freely accessible on the website of the hei; the course catalog is powered by a software system known as akari (http://www.akarisoftware.com/). 
akari is a proprietary software system used by several heis in and outside ireland to create and manage academic course catalogs. the course team gained access to a download of all books recorded in the database of the course catalog (figure 1). in this catalog, fields are provided for lecturers to input information for students about books such as title, international standard book number (isbn), author, and publisher. following manual and automated data cleansing, 3,014 unique records of books were created. due to the large number of books, at this stage we sought a programmatic solution for finding out more information about these books.

figure 1. course catalog screenshot.

we initially thought that isbns might prove the best way to accurately reconcile records of books. however, many isbns were incomplete or mistyped. moreover, many instructors simply did not enter an isbn. given the capacity for errors in the data (for instance, some lecturers simply entered “i will tell you in class” in the book title field), we required a tool that could handle fuzzy search queries, e.g. cases where a book title or author was misspelled. the tool we selected was the google books application programming interface (api).11 this api provides an interface to the google books database of circa thirty million books. the service, like the main google search engine, is forgiving of queries that are mistyped or misspelled. hence, we constructed a query based on a combination of author name, book title, and publisher. following experimentation, we determined that these three search terms together allowed us to find books with a high degree of accuracy whilst also accounting for possible spelling errors.

figure 2. system design.
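the three-term query described above could be assembled along the following lines. this is a sketch only, written in python rather than the javascript of the published middleware, and the exact query the authors sent is not given in the text, so the combination of the `intitle:`, `inauthor:` and `inpublisher:` search keywords is an assumption. the endpoint and keywords are those documented for the google books api volumes search; the function name and the publisher value in the example are hypothetical. the block only builds the url and makes no network request.

```python
from urllib.parse import urlencode

API_URL = "https://www.googleapis.com/books/v1/volumes"


def build_query_url(title, author="", publisher=""):
    """Build a volumes search URL from whichever catalog fields are non-empty."""
    parts = []
    if title.strip():
        parts.append(f"intitle:{title.strip()}")
    if author.strip():
        parts.append(f"inauthor:{author.strip()}")
    if publisher.strip():
        parts.append(f"inpublisher:{publisher.strip()}")
    if not parts:
        raise ValueError("at least one search field is required")
    # urlencode percent-encodes the colons and turns spaces into '+'.
    return f"{API_URL}?{urlencode({'q': ' '.join(parts)})}"


# Hypothetical catalog record; the publisher value is a placeholder for illustration.
url = build_query_url("psychiatric and mental health nursing", "phil barker", "example press")
print(url)
```

skipping empty fields matters here because many catalog records lacked one or more of the three values.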
we then wrote a custom javascript middleware program deployed in the google cloud platform. this program parsed the file of the book search queries, passed them to the google books api as search requests and saved the results. the api returned results in javascript object notation (json) format. json is a text format for describing data. it is derived from javascript and can be used to serialize objects in the javascript programming language into textual strings. it is often used as a replacement for xml as it is arguably more human readable and is considerably less verbose. we then imported this json into a mongodb database to filter and clean the data, before finally exporting them to excel for statistical analysis. mongodb is a document store database that natively stores json-like documents and allows for efficient querying of the data.

the google books api provides some key metadata on books beyond the usual author, publisher, isbn, edition, pages, etc.: it also gives prices for selected books. google draws this information from its own e-book store, which contains over three million books, and from a network of resellers who sell print and digital versions of the books. in addition to price, google books also contains information on accessible versions of books, digital/e-pub versions, pdf versions, and whether the book is in the public domain.

we have published a release of this dataset and all of our code to the software repository github. we then used the zenodo platform to generate a digital object identifier (doi) for the code.12 one of the functions of the zenodo platform is to allow for code to be properly cited and referenced. we published our code in this way for others interested in replicating this work in other contexts. in the next section we will provide an analysis of the results of our queries.
results

after extracting and processing the data from the course catalog and google platforms, we obtained 3,030 unique course names and in these courses we found 15,414 books listed.

required versus recommended reading

from the course catalog data, we found that 11,022 (71.5 percent) books were required readings and the remaining 4,392 (28.5 percent) were recommended. upon cleaning and removing duplicates and missing data, we identified 3,014 books that could be queried using the google books api. querying the api returned results for 2,940 books, i.e. it found 97 percent of the books and only seventy-four books could not be found. the google books api returns information in json format. figure 3 below shows an example of the json information returned for one book.

    {
      "volumeinfo": {
        "title": "psychiatric and mental health nursing",
        "authors": ["phil barker"],
        "industryidentifiers": [
          {"type": "isbn_13", "identifier": "9781498759588"},
          {"type": "isbn_10", "identifier": "1498759580"}
        ],
        "imagelinks": {
          "smallthumbnail": "http://books.google.com/books/content?id=btsocgaaqbaj&printsec=frontcover&img=1&zoom=5&edge=curl&source=gbs_api"
        }
      },
      "saleinfo": {
        "isebook": true,
        "retailprice": {"amount": 62.39, "currencycode": "usd"}
      },
      "accessinfo": {
        "publicdomain": false,
        "pdf": {"isavailable": true}
      }
    }

figure 3. sample of book information returned by google books api.

digital formats and public domain license

figure 4 shows the numbers of pdf (1,219) and e-book (1,016) versions of books reported to be available. eight hundred and fifty-four were available in both pdf and e-book format. from the total of 2,940 individual books listed their availability was as follows:

figure 4. availability of 2,940 books in digital formats and public domain license.
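the fields used in the analysis can be pulled out of a response like the one in figure 3 with a few dictionary lookups. the python sketch below is illustrative and is not the authors' published code: the function name `summarize_volume` and the output keys are assumptions, and the key names follow the camelcase spelling of the google books api volume resource (`volumeInfo`, `saleInfo`, `accessInfo`). missing fields fall back to `None` or `False`, since the api returned prices for only some of the books.

```python
import json

SAMPLE = """
{
  "volumeInfo": {"title": "Psychiatric and Mental Health Nursing",
                 "authors": ["Phil Barker"]},
  "saleInfo": {"isEbook": true,
               "retailPrice": {"amount": 62.39, "currencyCode": "USD"}},
  "accessInfo": {"publicDomain": false, "pdf": {"isAvailable": true}}
}
"""


def summarize_volume(volume):
    """Extract price, format and licensing flags from one volume record."""
    info = volume.get("volumeInfo", {})
    sale = volume.get("saleInfo", {})
    access = volume.get("accessInfo", {})
    price = sale.get("retailPrice", {})
    return {
        "title": info.get("title"),
        "price": price.get("amount"),          # None when no price is listed
        "currency": price.get("currencyCode"),
        "is_ebook": sale.get("isEbook", False),
        "pdf_available": access.get("pdf", {}).get("isAvailable", False),
        "public_domain": access.get("publicDomain", False),
    }


summary = summarize_volume(json.loads(SAMPLE))
print(summary)
```

counting records where `price` is not `None`, or where `pdf_available` is true, would give totals of the kind reported in the results.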
values shown in figure 4: pdf 1,219 (39.73%); e-book 1,016 (34.56%); public domain 6 (0.18%).

as per figure 4, only 0.18 percent (six) of the books had a version available in the public domain according to google books.

cost results

the google books api only returned prices for 596 (20 percent) of the books that we searched for. within that sample, the cost ranged from $0.99 to over $452, as illustrated in figure 5. the median price of a book was $40, and the mean price was $56.67. as there are on average 3.96 books per course, this implies an average cost to students of $224.41 per course taken. as students take an average of 8.05 courses per year, this further implies a cost per year of $1,806.50 per student if they were to buy new versions of all the books.

figure 5. summary of book prices (n = 596).

discussion and conclusion

we have demonstrated that it is possible to programmatically search and determine the prices of large numbers of books. we used this information to attempt to estimate the full economic cost of books to students on average in an irish hei. we are still actively developing this tool and encourage others to use and even contribute to the code which we have published with the dataset. this proof of concept tool may allow stakeholders with an interest in book costs for students to quickly get real data on large numbers of books. ultimately, we hope that this will help highlight the costs of many textbooks. our findings also highlight relatively low levels of digital book availability. very few books were found to be in the public domain. a limitation of this research is that there are issues around the coverage of google books and its index policies or algorithms.
in a literature review of research articles about google books in 2017, fagan pointed out that the coverage of google books is “hit and miss.”13 in 2017, google books included about thirty million books, though google did not release specific details on its database, as emphasized by fagan. it is known that content includes digitized collections from over forty libraries, and that us and english-language books are overrepresented.14 furthermore, google books only reports whether a book is in the public domain; it cannot tell us whether books are made available through open licenses such as creative commons. accepting such caveats, however, we have found the google books api to be a very useful tool for answering questions about large numbers of books in a systematic way and hope that our findings can help others.

the prices that we derived in this study were for new books only. however, the new book prices provide a baseline for all other prices, e.g., a used book or a loan book price will be relative to a new book price, and library budgets will need to take account of new book prices.15 further study is required to determine a more realistic figure for the cost of textbooks, and the next phase of our wider open textbook research project involves interviews and focus groups with students to better understand the lived reality of their relationship with textbooks.16

figure 5 axes: books (horizontal), cost in usd (vertical).

references

1 stephen l. baglione and kevin sullivan, “technology and textbooks: the future,” american journal of distance education 30, no. 3 (aug. 2016): 145-55, https://doi.org/10.1080/08923647.2016.1186466.
2 etan senack and robert donoghue, “covering the cost: why we can no longer afford to ignore high textbook prices,” report, the student pirgs (feb. 2016), www.studentpirgs.org/textbooks.
3 elaine allen and jeff seaman, “opening the textbook: educational resources in u.s. higher education, 2015-16,” report, babson survey research group (july 2016), https://www.onlinelearningsurvey.com/reports/openingthetextbook2016.pdf.
4 william and flora hewlett foundation (2019), http://www.hewlett.org/programs/education-program/open-educational-resources.
5 2012 paris oer declaration, http://www.unesco.org/new/fileadmin/multimedia/hq/ci/wpfd2009/english_declaration.htm.
6 mary burgess, “the bc open textbook project,” in open: the philosophy and practices that are revolutionizing education and science, rajiv s. jhangiani and robert biswas-diener (eds.) (london: ubiquity pr., 2017): 227–36.
7 nicole allen, “congress funds $5 million open textbook grant program in 2018 spending bill,” sparc open (mar. 20, 2018), https://sparcopen.org/news/2018/open-textbooks-fy18/.
8 jennifer ma et al., “trends in college pricing,” report, the college board (oct. 2017), https://trends.collegeboard.org/sites/default/files/2017-trends-in-college-pricing_0.pdf.
9 kaitlyn vitez, “open 101: an action plan for affordable textbooks,” report, student pirgs (jan. 2018), https://studentpirgs.org/campaigns/sp/make-textbooks-affordable.
10 mark brown, eamon costello, and mairéad nic giolla mhichíl, “from books to moocs and back again: an irish case study of open digital textbooks,” in exploring the micro, meso and macro. proceedings of the european distance and e-learning network 2018 annual conference, genova, 17-20 june, 2018 (budapest: the european distance and e-learning network): 206-14.
11 google books api (2018), https://developers.google.com/books/docs/v1/reference/volumes.
12 eamon costello and richard bolger, “textbooks authors, publishers, formats and costs in higher education,” bmc research notes 12, no. 1 (jan. 2019): 12-56, https://doi.org/10.1186/s13104-019-4099-1.
13 jody condit fagan, “an evidence-based review of academic web search engines, 2014-2016: implications for librarians’ practice and research agenda,” information technology and libraries 36, no. 2 (mar. 2017): 7-47, https://doi.org/10.6017/ital.v36i2.9718.
14 ibid.
15 anne christie, john h. pollitz, and cheryl middleton, “student strategies for coping with textbook costs and the role of library course reserves,” portal: libraries and the academy 9, no. 4 (oct. 2009): 491-510, http://digital.library.wisc.edu/1793/38662.
16 eamon costello et al., “textbook costs and accessibility: could open textbooks play a role?” proceedings of the 17th european conference on elearning (ecel), vol. 17 (athens, greece: 2018): 99-106.

a library in the palm of your hand: mobile services in top 100 university libraries

yan quan liu and sarah briggs

information technology and libraries | june 2015 133

abstract

what is the current state of mobile services among academic libraries of the country’s top 100 universities, and what are the best practices for librarians implementing mobile services at the university level? through in-depth website visits and survey questionnaires, the authors studied each of the top 100 universities’ libraries’ experiences with mobile services. results showed that all of these libraries offered at least one mobile service, and the majority offered multiple services. the most common mobile services offered were mobile sites, text messaging services, e-books, and mobile access to databases and the catalog.
in addition, chat/im services, social media accounts and apps were very popular. survey responses also indicated a trend towards responsive design for websites so that patrons can access the library’s full site on any mobile device. respondents recommend that libraries considering offering mobile services begin as soon as possible as patron demand for these services is expected to increase.

introduction

mobile devices, such as smart phones, tablets, e-book readers, handheld gaming tools and portable music players, are practically omnipresent in today’s society. according to walsh (2012), “mobile data traffic in 2011 was eight times the size of the global internet in 2000 and, according to forecasts, mobile devices will soon outnumber human beings.”1 studies have revealed that use of mobile devices is widespread and continues to increase. as of 2013, 56% of americans owned a smart phone (smith 2013). this number is even higher among people ages 18 to 29.2 however, peters (2011) points out that mobile phones at least can be found among people of all ages, nationalities and socioeconomic classes. he writes, “we truly are in the midst of a global mobile revolution.”3 in 2012, the acrl research planning and review committee found that 55% of undergraduates have smart phones, 62% have ipods, and 21% have some kind of tablet. over 67% of these students use their devices academically.4 elmore and stephens (2012) write, “academic libraries cannot afford to ignore this growing trend.
for many students a mobile phone is no longer just a telephonic device but a handheld information retrieval tool.”5

yan quan liu (liuy1@southernct.edu) is professor in information and library science at southern connecticut state university, new haven, ct, and special hired professor at tianjin university of technology, tianjin, china. sarah briggs (sjg.librarian@gmail.com) is library/media specialist at jonathan law high school, milford, ct.

a library in the palm of your hand: mobile services in the top 100 university libraries | liu and briggs | doi: 10.6017/ital.v34i2.5650 134

it is clear from these studies that academic libraries can expect their patrons to be accessing their services via mobile devices in growing numbers and need to adapt to this reality. however, the sheer number of mobile devices on the market and the myriad ways libraries could offer mobile services can be daunting. additionally, offering mobile services requires investing time, money, and personnel. in order to give libraries a starting point, this paper examines the current status of mobile services in the united states’ top 100 universities’ libraries as a model, specifically what services are being offered, what they are being used for, and what challenges libraries have encountered in offering mobile services. in doing so, this paper attempts to answer two questions: what is the state of mobile services among academic libraries of the country’s top-ranked universities, and what can the experiences of these libraries teach us about best practices for mobile services at the university level?
literature review

current status of mobile services in academic libraries

there is little data regarding the prevalence of mobile services in academic libraries. a 2010 study found that 35% of the english-speaking members of the association of research libraries had a mobile website for either the university, the library, or both (canuel and crichton 2010).6 a study of chinese academic libraries revealed that only 12.8% of those surveyed had a section of their web pages devoted to mobile library service (li 2013).7 in 2010, canuel and crichton found that 13.7% of association of universities and colleges of canada members had some mobile services, including websites and apps.8 in the united states, a 2010 survey found that 44% of academic libraries offered some type of mobile service. 39% had a mobile website, and 36% had a mobile version of the library’s catalog. half of libraries which did not offer mobile services were in the planning process for creating a mobile website, catalog, and text notifications. additionally, 40% planned on implementing sms reference services, and 54% wanted the ability to access library databases on mobile devices (thomas 2010).9 however, it is widely assumed that mobile services will expand rapidly in the future (canuel and crichton 2010).10 more recently, a 2012 survey of academic libraries in the pacific northwest found that 50% had a mobile version of the library’s website and/or catalog, 40% used qr codes, 38% had a text messaging service, and 18% replied “other,” with mobile interfaces for databases being a popular offering.
however, 31% of survey respondents still did not have any mobile services (ashford and zeigen 2012).11 osika and kaufman (2012) surveyed community and junior colleges nationwide to determine what mobile services were being offered. 73% offered mobile catalog access, 62% offered vendor database apps, two were creating a mobile app for the library, and 14.7% had a mobile library website.12

definition and types of mobile services

although there are dozens of different mobile devices on the market, la counte (2013) aptly and succinctly defines them as follows: “the reality is that mobile devices can refer to essentially any device that someone uses on the go” (vi).13 smart phones, netbooks, tablet computers, e-readers, gaming devices and ipods are examples of mobile devices that are now commonplace on college campuses.
barnhart and pierce (2012) define these devices as “…networked, portable, and handheld…”14 additionally, these devices may be used to read, listen to music, and watch videos (west, hafner and faust 2006).15 according to lippincott (2008), libraries should consider all their patron groups as potential mobile library users, including faculty, distance education students, on-campus students, students placed in internships or doing other kinds of fieldwork, and students using mobile devices to work on collaborative projects outside of school.16

the most common mobile services discussed in the literature are mobile-friendly websites or apps, mobile-friendly access to the library’s catalog and databases, text messaging services, qr codes, augmented reality, e-books, and information literacy instruction facilitated by mobile devices. these services fall into one of two categories: traditional library services amended to be available with mobile devices and services created specifically for mobile devices.

common library services that have been updated to be mobile-friendly include a mobile website (either as a mobile version of the library’s regular site, an app, or both), mobile-friendly interfaces for the library’s catalog and databases, access to books in electronic format, and information literacy instruction which makes use of mobile devices. regarding mobile websites and apps, walsh (2012) writes,

“if a well-designed app is like a top-end sports car, a mobile website is more like a family run-around.
it may not be as good looking, but it is likely to be cheaper, easier to run and accessible to more people.”17

it is not feasible to replicate the entire website in a mobile version, so libraries must know what patrons find most important and address that information through the mobile site (walsh 2012).18 according to a 2012 survey of academic libraries in the pacific northwest, the most popular types of information found on mobile websites are links to the catalog, a way to contact a librarian, links to databases, and hours of operation (ashford and zeigen 2012).19 many libraries are also providing mobile access to their catalogs and databases. this is sometimes difficult because often third-party vendors are responsible for the catalogs and/or databases, and libraries must rely on these vendors to provide mobile access (iglesias and meesangnil 2011).20 however, many vendors already offer mobile-friendly interfaces; libraries must be aware when this is the case and provide links to these interfaces. when a vendor does not provide a mobile-friendly interface, the library should encourage the vendor to do so (bishoff 2013, p. 118).21

there is a growing expectation that libraries will provide e-books to patrons as e-books become increasingly popular. walsh (2012) states that the proportion of adults in the united states who own an e-book reader doubled between november 2010 and may 2011.22 according to bischoff, ruth, and rawlins (2013), 29% of americans owned a tablet or e-reader as of january 2012.23 this has presented challenges for libraries, mainly in two areas: format and licensing.
there is risk involved in choosing a format that will only work with one product, e.g., a nook or a kindle, because not every patron will own the same device, and ultimately one device might become the most popular, rendering books purchased for other devices obsolete. on the other hand, formats that work with multiple devices tend to have only basic functionality and do not provide an ideal user experience (walsh 2012).24 walsh (2012) recommends epub, which works well with many different devices, is free, and supports the addition of a digital rights management layer.25 licensing is also an issue as libraries and publishers strive to find a method of loaning e-books amenable to both. no one model has emerged which is mutually satisfactory (walsh 2012).26

libraries are increasingly integrating mobile technologies into information literacy instruction and other forms of instruction. for example, services such as skype and facetime, which walsh (2012) describes as “a window to another world” (p. 105), can be used for distance learning, including reference and instruction.27 when interactions do not need to take place live, many mobile devices have the capability to take pictures, record video, and record audio (walsh 2012, p. 97).28 this allows class events, including lectures and discussions, to be broadcast to people and spaces beyond the physical classroom.
walsh (2012) notes that, when constructing podcasts or vodcasts, it is important to make mobile-friendly versions of these available, bearing in mind different platforms and screen sizes people might be using to access the content.29

text messaging, qr codes, and augmented reality are examples of library services that were created expressly for mobile devices. text messaging in particular has become a very popular mobile service offering; as thomas and murphy (2009) write, “interacting with patrons through text messaging now ranks among core competencies for librarians because sms increasingly comprises a central channel for communicating library information.”30 a common use of text messaging is a ‘text a librarian’ service. walsh (2012) recommends launching such a service even if the library currently offers no other mobile services, noting, “it can be quick, easy and cheap to introduce such a service and it is an ideal entry into the world of providing services via mobile devices” (p. 45).31 peters (2011) points out that the shorter the turnaround time (he recommends less than ten minutes) the better. he notes that many questions arise as the result of a situation the questioner is currently in. he writes, “if you do not respond in a matter of minutes, not hours, the context will be lost and the need will be diminished or satisfied in other ways.”32

qr codes have become popular in libraries offering mobile services. qr codes encode information in two dimensions (vertically and horizontally), and thus can provide more information than a barcode.
the applications necessary for using qr codes are usually free, and they can be read by most mobile devices with cameras (little 2011).33 the most common uses of qr codes in academic libraries, according to elmore and stephens (2012), are linking to the library’s mobile website and social media pages, searching the library catalog, viewing a video or accessing a music file, reserving a study room, and taking a virtual tour of the library facilities.34

augmented reality may not currently be used as often in libraries as other services such as mobile sites and text messaging, but many libraries are finding unique and compelling ways to use ar. ar applications link the physical with the digital, are interactive in real time, and are registered in 3-d. hahn (2012) defines ar as follows: “in order to be considered a truly augmented reality application, an app must interactively attach graphics or data to objects in real time, to achieve the real and virtual combination of graphics into the physical environment.”35 he notes that such applications are excellent additions to libraries’ mobile services because they connect physical and digital worlds, much like libraries.36 one example of augmented reality is north carolina state university’s wolfwalk, which is advertised as “…a historical walking tour of the nc state campus using the location-aware campus map” (ncsu libraries).37 to create the tour, the ncsu libraries special collections research center provided over one thousand photographs of the campus from the 19th century to the present (ncsu libraries).38

research design

to make sure the information gathered was current
and valid, this study employed two approaches, website visits and survey investigation, to determine the state of mobile services at the top 100 universities’ libraries. the website visits explored what mobile services are being offered and how they are being offered at these university libraries. the survey sent via email inquired how they are providing mobile services in their libraries and what their results have been regarding challenges, successes, and best practices. the survey data was analyzed and compared to the data obtained via website exploration to form a more comprehensive picture of mobile services at these universities.

participants

university libraries' patrons are frequent users of mobile technology. according to osika and kaufman (2012), studies have found that 45% of 18 to 29-year-olds who have internet-capable cell phones do most of their browsing on their devices.39 kostruski and skornia (2011) note that people of this age group are “…leaders in mobile communication…the traditional college-age student.”40 because the top 100 universities are the nation’s leaders in undergraduate and graduate programs and academic research, an examination of the status of their libraries' mobile services can provide useful service patterns and a benchmark for the service improvements that would benefit academic programs. based on the u.s. news & world report's national university rankings, this study selected the top 100 universities in the 2014 rankings.41

procedure

website visits as the first step were conducted from march 2, 2014 to march 16, 2014.
each library’s home page was carefully examined for the most common mobile services named in the literature, categorized as follows: 1) a mobile website or app, 2) mobile access to the library’s catalog and databases, 3) text messaging services, 4) qr codes, 5) augmented reality, and 6) e-books. to assess each site, we first visited the site via a nexus 7 to see if it had a mobile version. next, we viewed each library’s full site on a laptop computer. we browsed through each page of the site looking for mention or use of each of these categories. we also searched for these items via the library’s site map or site search functions whenever available. the results were tabulated with a codebook in the established categorization through microsoft excel.

although the website visits are valuable for gathering quantitative data about what mobile services are offered at these libraries, this method has its limitations. firstly, it locates only those mobile services that appear on a library’s website; services the library provides which are not mentioned on the website can be overlooked. also, the use of mobile devices or services in library instruction, a very commonly mentioned mobile service in the literature, cannot generally be determined via a website visit. in addition, the website visit provides only a snapshot of the current state of mobile services; university libraries may be planning to implement or even be in the process of implementing mobile services.
lastly, website visits evaluate what is publicly available, but it is not possible to access password-protected information meant only for a university’s students and faculty to assess mobile content. to address these shortcomings, we created a survey using surveymonkey to complement the data supplied from the website visits. we sent out the survey via email to each of the top 100 universities’ libraries. the survey was conducted from april 10, 2014, to april 24, 2014.

results and analysis

study results presented compelling evidence that mobile services are already ubiquitous among the country's top universities. the most recognized ones are mobile sites, mobile apps, mobile opacs, mobile access to databases, text messaging services, qr codes, augmented reality, and e-books. these service forms match those commonly named in the literature as library mobile services.

what basic types of mobile services do the libraries provide?

the results showed that all of the libraries offered one or more of the specific mobile services listed in chart 1 (multiple entries allowed), reflecting the modernized service patterns that university libraries provide to meet the needs and demands of university communities in this digital era.

chart 1. percentage of libraries offering specific mobile services (multiple entries allowed).

it is clear from both the survey results and the website visits that almost all libraries at the top 100 universities are offering multiple mobile services, with mobile websites, mobile access to the library’s catalog, mobile access to the library’s databases, e-books, and text messaging services being the most common.
qr codes and especially augmented reality are not as common.

of the eight main mobile services we looked for via the website visits and survey (mobile site, mobile app for the site, mobile opac, mobile access to databases, text messaging, qr codes, augmented reality, and e-books), all libraries surveyed offer between one and seven of these services. no universities have none of these services, and no universities have all of these services. only one university has one service, none have two, seven have three, thirteen have four, twenty-four have five, forty-six have six, and eight have seven. to make this information easy to read, we summarized it in table 1 below.

number of mobile services offered | number of libraries | percentage of libraries
no mobile services | 0 | 0%
1 mobile service | 1 | 1%
2 mobile services | 0 | 0%
3 mobile services | 7 | 7%
4 mobile services | 13 | 13%
5 mobile services | 24 | 24%
6 mobile services | 46 | 46%
7 mobile services | 8 | 8%
8 mobile services | 0 | 0%

table 1. number of mobile services offered.

chart 1 data (percentage of libraries offering specific mobile services):

mobile service | percentage of libraries
augmented reality | 5.0%
mobile app for site | 29.2%
qr codes | 58.7%
text messaging | 77.2%
mobile website | 81.6%
mobile databases | 81.7%
mobile opac | 88.0%
e-books | 92.6%

such a data pattern demonstrates not only that mobile services are very widespread at these universities’ libraries, but also that the vast majority of these libraries offer multiple mobile services.
in other words, libraries do not appear to be offering mobile services in isolation; they have taken several of their most popular services (such as websites, reference, and search functions) and mobilized all of them. in fact, the average number of mobile services offered among the eight services we examined is 5.31.

results collected from the two research methods (website visits and survey) are almost identical for mobile websites and mobile opacs and are very comparable for text messaging, qr codes, and augmented reality. there is, however, a bit of a gap between results from the website visits and the survey regarding mobile databases (92.9% vs. 70.59%); perhaps libraries that responded to the survey just happened to offer mobile access to databases less often than all the libraries in general.

it is interesting that we located e-books on 100% of the websites we visited, but only 85.29% of respondents mention offering them. perhaps this discrepancy can be explained by a clarification in terms. we looked for the presence of books in electronic format that could be accessed online. perhaps survey respondents only considered e-books specifically formatted for smart phones or tablets as a mobile service. also, later in the survey several respondents mention communication issues as an ongoing challenge in offering mobile services, specifically, not always knowing what other library departments are offering in terms of mobile services. it is possible that some survey respondents are not responsible for the e-book collection and thus did not mention it as a mobile service.
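the reported average of 5.31 services can be checked against the counts in table 1 with a weighted mean, sketched below in python. the variable and function names are illustrative assumptions. note that the library counts in table 1 sum to 99 rather than 100, so the sketch divides by the number of libraries actually tabulated; that reading of the table is also an assumption.

```python
# Number of services offered -> number of libraries, as listed in table 1.
LIBRARIES_BY_SERVICE_COUNT = {
    0: 0, 1: 1, 2: 0, 3: 7, 4: 13, 5: 24, 6: 46, 7: 8, 8: 0,
}


def mean_services(distribution):
    """Weighted mean of services per library from a count distribution."""
    total_libraries = sum(distribution.values())
    total_services = sum(
        services * libraries for services, libraries in distribution.items()
    )
    return total_services / total_libraries


total_libraries = sum(LIBRARIES_BY_SERVICE_COUNT.values())
average = mean_services(LIBRARIES_BY_SERVICE_COUNT)
print(total_libraries, round(average, 2))
```

with these counts the libraries offer 526 services in total, and 526 divided by 99 rounds to the 5.31 given in the text.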
another discrepancy exists between the results for mobile apps for the library’s site (20.2% for the website visits versus 38.24% for the survey). these results indicate that mobile apps for libraries’ sites are more common than we had previously thought. perhaps these apps are being advertised in places other than on the library’s website, and therefore a website visit is not the best way to discover them.

the website visits did not look for mobile library instruction, mobile book renewal, or mobile interlibrary loan, but through our website visits we saw these services mentioned several times and thus included them in the survey. they turned out to be somewhat common among libraries surveyed; 41.18% of respondents offer mobile book renewal, 20.59% offer mobile interlibrary loan, and 32.35% offer mobile-friendly library instruction.

table 2 below compares the data collected from both the website visits and the survey among these 100 universities, ranking from high to low percentages. in most cases, they are very similar.

mobile services | percentage of libraries offering service (website visits) | percentage of libraries offering service (survey)
e-books | 100% | 85.29%
mobile databases | 92.90% | 70.59%
mobile opac | 87.80% | 88.24%
mobile website | 80.80% | 82.35%
text messaging | 80.80% | 73.53%
qr codes | 61.60% | 55.88%
mobile app for site | 20.20% | 38.24%
augmented reality | 7.00% | 2.94%

table 2. data comparison of specific mobile services between website visits & survey.

what content do the mobile sites offer?
in addition to assessing whether libraries had a mobile site, the survey asked libraries that already have a mobile site what is included on the site. all libraries with mobile sites (100%) include library hours on their site, making this the most common feature. the next two most common features are library contact information and a search function for the catalog, each offered by 96.67%. searching within mobile-friendly databases, such as ebscohost mobile, jstor, and pubmed, is the next most popular feature, although at 70% it trails library hours, contact information, and catalog searching. book renewal received 56.67%, and access to patron accounts received 53.33%. interlibrary loan is the least common feature by far, offered by only 26.67% of respondents. this information is summarized in chart 2 below:

chart 2. components of libraries’ mobile sites.

these results are interesting as, overall, they reflect higher percentages for specific mobile services than question 1 on the survey, which asked which mobile services libraries offer. for example, in question 1, 88.24% of respondents offer mobile access to the library’s catalog, whereas for libraries with mobile sites, 96.67% offer access to the catalog on the mobile site.
the ability to search mobile-friendly versions of databases the library subscribes to was almost the same for both groups, with 70.59% of respondents to question 1 offering this and 70% of respondents having this as a component of their mobile sites. mobile book renewal is much more common among libraries with mobile sites; 56.67% of respondents with mobile sites compared to 41.18% of total respondents. a slightly higher percentage of respondents with mobile sites offer mobile interlibrary loan (26.67%) compared to all respondents (20.59%). this data suggests that, on the whole, libraries with mobile sites are more likely to offer other mobile services as well, specifically mobile access to the catalog, mobile book renewal, and mobile interlibrary loan.

what mobile reference services do libraries provide?

the survey also looked for information on virtual and/or mobile reference services. 81.25% of survey respondents offer text/sms messaging, 100% offer chat/im, and 21.88% offer reference services via a social media account. these results showing popular reference services in these top universities are summarized in chart 3 below:

chart 3. popular mobile reference services.

chat/im is obviously the most popular method of providing virtual/mobile reference services; all survey respondents offer this service. text/sms is also very popular, indicating that the majority of libraries see value in providing both despite their similar functions. the fact that social media does not compare favorably to either texting or chat/im services is curious because most social media platforms have a mobile version available that libraries can take advantage of for free.
however, this may not be the best medium for reference. one respondent commented on this question, “our ‘ask a librarian’ service is available from desktop facebook, but not on mobile facebook.”

what apps do libraries use or provide for patrons?

although the website visits and survey results indicated that apps for a library’s site are not very common, both methods revealed that use of apps for various purposes is widespread. the most commonly mentioned app is browzine, which is used for accessing e-journals. several respondents mentioned apps developed in-house for using library services, such as apps for reserving a study room, accessing university archives, and sending catalog records to a mobile device. another respondent stated that the university’s app has a library function. several respondents mentioned vendor-provided or third-party apps, such as apps for accessing pubmed, sciencedirect, naxos music library, accessmylibrary (for gale resources), a mobile medical dictionary, and the american chemical society. one respondent noted that the library loans ipads preloaded with popular apps that support student research, such as endnote, notability, goodreader, pages, numbers, and keynote, among others. finally, the following apps were each named at least once as apps that libraries either use or provide access to: iresearch (for storing articles locally), boopsie (for building a library mobile app), ebrary (for accessing e-books), and safari (for accessing books and videos online).
these results indicate that the use of apps is fairly robust and diverse among these libraries. additionally, from these results, it seems more common for libraries to use and/or provide apps created by third parties than to develop an in-house app, perhaps due to the expertise and expense involved in creating and maintaining an app.

what mobile services will be added in the future?

the final question of the survey asks libraries if there are any plans to offer a mobile service not currently provided. responses are summarized in chart 4 below.

chart 4. percentage of the libraries seeking to add specific mobile services.

the most common selection is mobile-friendly library instruction, with 61.54%. the next most common is a mobile website (46.15%). mobile interlibrary loan was chosen by 38.46% of respondents. less common services planned include adding mobile access to the library’s opac, mobile access to the library’s databases, and mobile book renewal, each of which was chosen by 15.38% of respondents. mobile apps, e-books, and augmented reality were each chosen by 7.69% of respondents. no one indicated plans to add text messaging services or qr codes.
these results indicate that libraries expect demand for traditional library services in a mobile-friendly format to continue to expand; mobile-friendly library instruction was only offered by 32.35% of respondents, yet 61.54% have plans to offer this service in the future. mobile interlibrary loan is currently offered by 20.59% of respondents, so the fact that 38.46% would like to add it represents a significant change.

not surprisingly, mobile websites are likely to remain a very popular mobile service. the fact that 82.35% of respondents already have a mobile website and 46.15% who do not have one wish to add one in the near future means that mobile-friendly sites are well on their way to becoming ubiquitous, at least among libraries at the top 100 universities, and may reasonably be expected to take their place among websites in general as a necessity to maintain institutional viability. additionally, several respondents mentioned moving towards responsive design, in which their websites are fully functional regardless of whether they are accessed on mobile devices or desktops.

what are challenges and strategies for offering mobile services?

in addition to looking for the presence or absence of mobile services being offered at top 100 university libraries, the survey also examined libraries’ experiences in implementing mobile services, including challenges, successes, and best practices. several themes emerged in response to these questions. the most common challenge among respondents was having the time, expertise, staffing and money to support mobile services, especially apps and mobile sites.
to solve this problem, respondents mention relying on vendors and third-party providers supplying apps to access their resources, but this does not give libraries the flexibility and specificity of an in-house app.

another common challenge mentioned by several respondents involved technical issues, such as difficulties with off-campus access to resources via a proxy server and compatibility issues among different browsers and especially different devices. a lack of communication and/or support is another issue for libraries. one respondent reported a lack of support from the campus computing center for mobile services. one respondent discussed the difficulty of having a coordinated mobile effort when the library has a large number of departments, and each department may or may not be aware of what the others are doing with regard to mobile services. survey results revealed that few libraries have policies in place to support mobile services.

coming up with a specific plan for implementing such services can help libraries work towards promoting effective communication and garnering support. one respondent wrote, “the biggest challenges have been: (1) developing a strategy (2) developing a service model (3) having a systematic model for managing content for both mobile- and non-mobile applications. we've had success with the first two and are making great progress on the third.” interestingly, several respondents noted that underuse is an issue for some services. one respondent mentioned that qr codes are not used often, and another mentioned that the library’s text-a-librarian service is much underutilized.
several respondents cited the need to market mobile services as an antidote to this problem. seeking regular feedback from the user community regarding its mobile service wants and needs is another recommended solution.

other issues include the fact that not all library services are mobilized. however, libraries are actively looking for solutions for this. there is a trend among respondents towards developing a site that is responsive to all devices, including desktops, laptops, tablets, and phones. this will take the place of a separate mobile site. as one respondent states, “at the moment, our library mobile website only has a fraction of the services available via our desktop website. we are in the process of moving everything to responsive design, with the expectation that all services will be equally available in mobile and desktop.” in reading through these responses, one message is clear: mobile services are a must. several respondents noted that demand for mobile services is growing, with one writing, “get started as soon as possible. our analytics show that mobile use is continuing to increase.”

conclusion

this study confirms that as of spring 2014 mobile services are already ubiquitous among the country’s top 100 universities’ libraries and are likely to continue to grow. while the most common services offered are e-books, chat/im, mobile access to databases, mobile access to the library catalog, mobile sites, and text messaging services, there is also a trend towards responsive design for websites so that patrons can access the library’s full site on any mobile device.
the experiences of these libraries demonstrate the value of creating a plan for providing mobile services, allotting the appropriate amount of staffing, time, and funding, communicating among departments and stakeholders to coordinate mobile efforts, marketing services, and regularly seeking patron feedback. however, there is no one approach to offering mobile services, and each library must do what works best for its patrons.

bridging the gap: self-directed staff technology training | quinney, smith, and galbraith 205

kayla l. quinney, sara d. smith, and quinn galbraith

bridging the gap: self-directed staff technology training

of hbll patrons. as anticipated, results indicated that students frequently use text messages, social networks, blogs, etc., while fewer staff members use these technologies. for example, 42 percent of the students reported that they write a blog, while only 26 percent of staff and faculty do so. also, 74 percent of the students and only 30 percent of staff and faculty indicated that they belonged to a social network. after concluding that staff and faculty were not as connected as their student patrons are to technology, library administration developed the technology challenge to help close this gap. the technology challenge was a self-directed training program requiring participants to explore new technology on their own by spending at least fifteen minutes each day learning new technology skills. this program was successful in promoting lifelong learning by teaching technology applicable to the work and home lives of hbll employees. we will first discuss literature that shows how technology training can help academic librarians connect with student patrons, and then we will describe the technology challenge and demonstrate how it aligns with the principles of self-directed learning. the training will be evaluated by an analysis of the results of two surveys given to participants before and after the technology challenge was implemented.
■■ library 2.0 and “librarian 2.0”

hbll wasn’t the first to notice the gap between librarians and students; mcdonald and thomas noted that “gaps have materialized,” and library technology does not always “provide certain services, resources, or possibilities expected by emerging user populations like the millennial generation.”1 college students, who grew up with technology, are “digital natives,” while librarians, many having learned technology later in life, are “digital immigrants.”2 the “digital natives” belong to the millennial generation, described by shish and allen as a generation of “learners raised on and confirmed experts in the latest, fastest, coolest, greatest, newest electronic technologies.”3 according to sweeny, when students use libraries, they expect the same “flexibility, geographic independence, speed of response, time shifting, interactivity, multitasking, and time savings” provided by the technology they use daily.4 students are

undergraduates, as members of the millennial generation, are proficient in web 2.0 technology and expect to apply these technologies to their coursework, including scholarly research. to remain relevant, academic libraries need to provide the technology that student patrons expect, and academic librarians need to learn and use these technologies themselves. because leaders at the harold b. lee library of brigham young university (hbll) perceived a gap in technology use between students and their staff and faculty, they developed and implemented the technology challenge, a self-directed technology training program that rewarded employees for exploring technology daily. the purpose of this paper is to examine the technology challenge through an analysis of results of surveys given to participants before and after the technology challenge was implemented. the program will also be evaluated in terms of the adult learning theories of andragogy and self-directed learning.
hbll found that a self-directed approach fosters technology skills that librarians need to best serve students. in addition, it promotes lifelong learning habits to keep abreast of emerging technologies. this paper offers some insights and methods that could be applied in other libraries, the most valuable of which is the use of self-directed and andragogical training methods to help academic libraries better integrate modern technologies. l eaders at the harold b. lee library of brigham young university (hbll) began to suspect a need for technology training when employees were asked during a meeting if they owned an ipod or mp3 player. out of the twenty attendees, only two raised their hands—one of whom worked for it. perceiving a technology gap between hbll employees and student patrons, library leaders began investigating how they could help faculty and staff become more proficient with the technologies that student patrons use daily. to best serve student patrons, academic librarians need to be proficient with the technologies that student patrons expect. hbll found that a self-directed learning approach to staff technology training not only fosters technology skills, but also promotes lifelong learning habits. to further examine the technology gap between librarians and students, the hbll staff, faculty, and student employees were given a survey designed to explore generational differences in media and technology use. student employees were surveyed as representatives of the larger student body, which composes the majority kayla l. quinney (quinster27@gmail.com) is research specialist, sara d. smith (saradsmith@gmail.com) is research specialist, and quinn galbraith (quinn_galbraith@byu.edu) is library human resource training and development manager, brigham young university library, provo, utah. 
206 information technology and libraries | december 2010 2.0,” a program that “focuses on self-exploration and encourages staff to learn about new technologies on their own.”24 learning 2.0 encouraged library staff to explore web 2.0 tools by completing twenty-three exercises involving new technologies. plcmc’s program has been replicated by more than 250 libraries and organizations worldwide,25 and several libraries have written about their experiences, including academic26 and public libraries.27 these programs—and the technology challenge implemented by hbll—integrate the theories of adult learning. in the 1960s and 1970s, malcolm knowles introduced the theory of andragogy to describe the way adults learn.28 knowles described adults as learners who (1) are self-directed, (2) use their experiences as a resource for learning, (3) learn more readily when they experience a need to know, (4) seek immediate application of knowledge, and (5) are best motivated by internal rather than external factors.29 the theory and practice of self-directed learning grew out of the first learning characteristic and assumes that adults prefer self-direction in determining and achieving learning goals, and therefore learners exercise independence in determining how and what they learn.30 these theories have had a considerable effect on adult education practice31 and employee development programs.32 when adults participate in trainings that align with the assumptions of andragogy, they are more likely to retain and apply what they have learned.33 ■■ the technology challenge hbll’s technology challenge is similar to learning 2.0 in that it encourages self-directed exploration of web 2.0 technologies, but it differs in that participants were even more self-directed in exploration and that they were asked to participate daily. these features encouraged more self-directed learning in areas of participant interest as well as habit formation. 
it is not our purpose to critique learning 2.0, but to provide some evidence and analysis to demonstrate the success of hands-on, self-directed training approaches and to suggest other ways for libraries to apply self-directed learning to technology training. the technology challenge was implemented from june 2007 to january 2008. hbll staff included 175 full-time employees, 96 of whom participated in the challenge. (the student employees were not involved.) participants were asked to spend fifteen minutes each day learning a new technology skill. hbll leaders used rewards to make the program enjoyable and to motivate participation: for each minute spent learning technology, participants earned one point, and when one thousand points were earned, the participant would receive a gift certificate to the campus bookstore. staff and faculty participated and tracked their progress through an online masters of “informal learning”; that is, they are accustomed to easily and quickly gathering information relevant to their lives from the internet and from friends. shish and allen claimed that millennials prefer “interactive, hyper-linked multimedia over the traditional static, textoriented printed items. 
they want a sense of control; they need experiential and collaborative approaches rather than formal, librarian-guided, library-centric services.”5 these students arrive on campus expecting “to handle the challenges of scholarly research” using similar methods and technologies.6 interactive technologies such as blogs, wikis, streaming media applications, and social networks are referred to as “web 2.0.” abram argued that web 2.0 technology “could be useful in an enterprise, institutional research, or community environment, and could be driven or introduced by the library.”7 “library 2.0” is a concept referring to a library’s integration of these technologies; it is essentially the use of “web 2.0 opportunities in a library environment.”8 maness described library 2.0 as user-centered, social, innovative, and a provider of multimedia experiences.9 it is a community that “blurs the line between librarian and patron, creator and consumer, authority and novice.”10 libraries have been using web 2.0 technology such as blogs,11 wikis,12 and social networks13 to better serve and connect with patrons. blogs allow libraries to “provide news, information and links to internet resources,”14 and wikis create online study groups15 and “build a shared knowledge repository.”16 social networks can be particularly useful in connecting with undergraduate students: millennials use technology to collaborate and make collective decisions,17 and libraries can capitalize on this tendency by using social networks, which for students would mean, as bates argues, “an informational equivalent of the reliance on one’s facebook friends.”18 students expect library 2.0, and as libraries integrate new technologies, the staff and faculty of academic libraries need to become “librarian 2.0.” according to abram, librarian 2.0 understands users and their needs “in terms of their goals and aspirations, workflows, social and content needs, and more.
librarian 2.0 is where the user is, when the user is there.”19 the modern library user “needs the experience of the web . . . to learn and succeed,”20 and the modern librarian can help patrons transfer technology skills to information seeking. librarian 2.0 is prepared to help patrons familiar with web 2.0 to “leverage these [technologies] to make a difference in reaching their goals.”21 therefore staff and faculty “must become adept at key learning technologies themselves.”22 stephen abram asked, “are the expectations of our users increasing faster than our ability to adapt?”23 and this same concern motivated hbll and other institutions to initiate staff technology training programs. the public library of charlotte and mecklenburg county of north carolina (plcmc) developed “learning

their ability to learn and use technology. to be eligible to receive the gift card, participants were required to take this exit survey. sixty-four participants, all of whom had met or exceeded the thousand-point goal, chose to complete this survey, so the results of this survey represent the experiences of 66 percent of the participants. of course, if those who had not completed the technology challenge had taken the survey, the results might have been different, but the results do show how those who chose to actively participate reacted to this training program. the survey included both quantifiable and open-ended questions (see appendix b for survey results and a list of the open-ended questions). the survey results, along with an analysis of the structure of the challenge itself, demonstrate that the program aligns with knowles’s five principles of andragogy to successfully help employees develop both technology skills and learning habits.
self-direction the technology challenge was self-directed because it gave participants the flexibility to select which tasks and challenges they would complete. garrison wrote that in a self-directed program, “learners should be provided with choices of how they wish to proactively carry out the learning process. material resources should be available, approaches suggested, flexible pacing accommodated, and questioning and feedback provided when needed.”34 hbll provided a variety of challenges and training sessions related to various technologies. technology challenge participants were given the independence to choose which learning methods to use, including which training sessions to attend and which challenges to complete. according to the exit survey, the most popular training methods were small, instructor-led groups, followed by self-learning through reading books and articles. group training sessions were organized by hbll leadership and addressed topics such as microsoft office, rss feeds, computer organization skills, and multimedia software. other learning methods included web tutorials, dvds, large group discussions, and one-on-one tutoring. the group training classes preferred by hbll employees may be considered more teacher-directed than self-directed, but the technology challenge was self-directed as a whole in that learners were given the opportunity to choose what they learned and how they learned it. the structure of the technology challenge allowed participants to set their own pace. staff and faculty were given several months to complete the challenge and were responsible to pace themselves. on the exit survey, one participant commented: “if i didn’t get anything done one week, there wasn’t any pressure.” another enjoyed flexibility in deciding when and where to complete the tasks: “i liked being able to do the challenge anywhere. 
when i had a few minutes between appointments, classes, board game called “techopoly.” participation was voluntary, and staff and faculty were free to choose which tasks and challenges they would complete. tasks fell into one of four categories: software, hardware, library technology, and the internet. participants were required to complete one hundred points in each category, but beyond that, were able to decide how to spend their time. examples of tasks included attending workshops, exploring online tutorials, and reading books or articles about a relevant topic. for each hundred points earned, participants could complete a mini-challenge, which included reading blogs or e-books, listening to podcasts, or creating a photo cd (see appendix a for a more complete list). participants who completed fifteen out of twenty possible challenges were entered into a drawing for another gift certificate. before beginning the challenge, all participants were surveyed about their current use of technology. on this survey, they indicated that they were most uncomfortable with blogs, wikis, image editors, and music players. these results provided a focus for technology challenge trainings and mini-challenges. while not all of these technologies may apply directly to their jobs, 60 percent indicated that they were interested in learning them. forty-four percent reported that time was the greatest impediment to learning new technology; therefore the daily fifteen-minute requirement was introduced with the hope that it was small enough to be a good incentive to participate but substantial enough to promote habit formation and allow employees enough time to familiarize themselves with the technology. 
although some productivity may have been lost due to the time requirement (especially in cases where participants may have spent more than the required time), library leaders felt that technology training was an investment in hbll employees and that, at least for a few months, it was worth any potential loss in productivity. because participants could choose how and when they learned technology, they could incorporate the challenge into their work schedules according to their own needs, interests, and time constraints. of ninety-six participants, sixty-six reached or exceeded the thousand-point goal, and eight participants earned more than two thousand points. ten participants earned between five hundred and one thousand points, and another six earned between one hundred and five hundred. although not all participants completed the challenge, most were involved to some extent in learning technology during this time. ■■ the technology challenge and adult learning after finishing the challenge, participants took an exit survey to evaluate the experience and report changes in
208 information technology and libraries | december 2010
were willing, even excited, to learn technology skills: 37 percent “agreed” and 60 percent “strongly agreed” that they were interested in learning new technology. their desire to learn was cultivated by the survey itself, which helped them recognize and focus on this interest, and the challenge provided a way for employees to channel their desire to learn technology. immediate application learners need to see an opportunity for immediate application of their knowledge: ota et al.
explained that “they want to learn what will help them perform tasks or deal with problems they confront in everyday situations and those presented in the context of application to real life.”39 because of the need for immediate application, the technology challenge encouraged staff and faculty to learn technology skills directly related to their jobs—as well as technology that is applicable to their personal or home lives. hbll leaders hoped that as staff became more comfortable with technology in general, they would be motivated to incorporate more complex technologies into their work. here is one example of how the technology challenge catered to adult learners’ need to apply what they learn: before designing the challenge, hbll held a training session to teach employees the basics of photoshop. even though attendees were on the clock, the turnout was discouraging. library leaders knew they needed to try something new. in the revamped photoshop workshop that was offered as part of the technology challenge, attendees brought family photos or film and learned how to edit and experiment with their photos and burn dvd copies. this time, the class was full: the same computer program that before drew only a few people was now exciting and useful. focusing on employees’ personal interests in learning new software, instead of just on teaching the software, better motivated staff and faculty to attend the training. motivation as stated by ota et al., adults are motivated by external factors but are usually more motivated by internal factors: “adults are responsive to some external motivators (e.g., better job, higher salaries), but the most potent motivators are internal (e.g., desire for increased job satisfaction, self-esteem).”40 on the entrance survey, participants were given the opportunity to comment on their reasons for participating in the challenge. the gift card, an example of an external motivation, was frequently cited as an important motivation. 
but many also commented on more internal motivations: “it’s important to my job to stay proficient in new technologies and i’d like to stay current”; “i feel that i need to be up-to-date
or meetings i could complete some of the challenges.” employees could also determine how much or how little of the challenge they wanted to complete: many reached well over the thousand-point goal, while others fell a little short. participants began at different skill levels, and thus could use the time and resources allotted to explore basic or more advanced topics according to their needs and interests. garrison had noted the importance of providing resources and feedback in self-directed learning.35 the techopoly website provided resources (such as specific blogs or websites to visit) and instructions on how to use and access technology within the library. hbll also hired a student to assist staff and faculty one-on-one by explaining answers to their questions about technology and teaching other skills he thought might be relevant to their initial problem. the entrance and exit surveys provided opportunities for self-reflection and self-evaluation by questioning the participants’ use of technology before the challenge and asking them to evaluate their proficiency in technology after the challenge. use of experience the use of experience as a source of learning is important to adult learners: “the richest resource for learning resides in adults themselves; therefore, tapping into their experiences through experiential techniques (discussions, simulations, problem-solving activities, or case methods) is beneficial.”36 the small-group discussions and one-on-one problem solving made available to hbll employees certainly fall into these categories.
small-group classes are one of the best ways to encourage adults to share and validate their experiences, and doing so increases retention and application of new information.37 the trainings and challenges encouraged participants to make use of their work and personal experiences by connecting the topic to work or home application. for example, one session discussed how blogs relate to libraries, and another helped participants learn adobe photoshop skills by editing personal photographs. need to know adult learners are more successful when they desire and recognize a need for new knowledge or skills. the role of a trainer is to help learners recognize this “need to know” by “mak[ing] a case for the value of learning.”38 hbll used the generational survey and presurvey to develop a need and desire to learn. the results of the generational survey, which demonstrated a gap in technology use between librarians and students, were presented and discussed at a meeting held before the initiation of the technology challenge to help staff and faculty understand why it was important to learn 2.0 technology. results of the presurvey showed that staff and faculty
statistical reports or working with colleagues from other libraries.” ■■ “i learned how to set up a server that i now maintain on a semi-regular basis. i learned a lot about sfx and have learned some perl programming language as well that i use in my job daily as i maintain sfx.” ■■ “the new oclc client was probably the most significant. i spent a couple of days in an online class learning to customize the client, and i use what i learned there every single day.” ■■ “i use google docs frequently for one of the projects i am now working on.” participants also indicated weaknesses in the technology challenge. almost 20 percent of those who completed the challenge reported that it was too easy.
this is a valid point: the challenge was designed to be easy so as not to intimidate staff or faculty who are less familiar with technology. it is important to note that these comments came from those who completed the challenge; other participants may have found the tasks and mini-challenges more difficult. the goal was to provide an introduction to web 2.0, not to train experts. however, a greater range of tasks and challenges could be provided in the future to allow staff and faculty more self-direction in selecting goals relevant to their experience. to encourage staff and faculty to attend sponsored training sessions as part of the challenge, hbll leaders decided to double points for time spent at these classes. this certainly encouraged participation, but it led to “point inflation,” which may be one reason why so many reported that the challenge was too easy to complete. the doubling of points may also have encouraged staff to spend more time in workshops and less time practicing or applying the skills learned. a possible solution would be offering 1.5 times the points, or offering a set number of points for attendance instead of counting per minute. it also may have been informative, for purposes of analysis, to have surveyed both those who did not complete the challenge and those who chose not to participate. because the presurvey indicated that time was the biggest deterrent to learning and incorporating new technology, we assume that many of those who did not participate or who did not complete the challenge felt that they did not have enough time to do so. there is definitely potential for further investigation into why library staff would not want to participate in a technology training program, what would motivate them to participate, and how we could redesign the technology challenge to make it more appealing to all of our staff and faculty. several library employees have requested that hbll sponsor another technology challenge program.
because of the success of the first and because of continuing interest in technology training, we plan to do so in the future. we will make changes and adjustments according to the on technology in order to effectively help patrons”; “to identify and become comfortable with new technologies that will make my work more efficient, more presentable, and more accurate.” ■■ lifelong learning staff and faculty responded favorably to the training. none of the participants who took the exit survey disliked the challenge; 34 percent even reported that they strongly liked it. ninety-five percent reported that they enjoyed the process of learning new technology, and 100 percent reported that they were willing to participate in another technology challenge—thus suggesting success in the goal of encouraging lifelong technology learning. the exit survey results indicate that after completing the challenge, staff and faculty are more motivated to continue learning—which is exactly what hbll leaders hoped to accomplish. eighty-nine percent of the participants reported that their desire to learn new technology had increased, and 69 percent reported that they are now able to learn new technology faster after completing the technology challenge. ninety-seven percent claimed that they were more likely to incorporate new technology into home or work use, and 98 percent said they recognized the importance of staying on top of emerging technologies. participants commented that the training increased their desire to learn. one observed, “i often need a challenge to get motivated to do something new,” and another participant reported feeling “a little more comfortable trying new things out.” the exit survey asked participants to indicate how they now use technology. 
one employee keeps a blog for her daughter’s dance company, and another said, “i’m on my way to a full-blown googlereader addiction.” another participant applied these new skills at home: “i’m not so afraid of exploring the computer and other software programs. i even recently bought a computer for my own personal use at home.” the technology challenge was also successful in helping employees better serve patrons: “i can now better direct patrons to services that i would otherwise not have known about, such as streaming audio and video and e-book readers.” another participant felt better connected to student patrons: “i understand the students better and the things they use on a daily basis.” staff and faculty also found their new skills applicable to work beyond patron interaction, and many listed specific examples of how they now use technology at work: ■■ “i have attended a few microsoft office classes that have helped me tremendously in doing my work more efficiently, whether it is for preparing monthly
2. richard t. sweeney, “reinventing library buildings and services for the millennial generation,” library administration & management 19, no. 4 (2005): 170. 3. win shish and martha allen, “working with generation-d: adopting and adapting to cultural learning and change,” library management 28, no. 1/2 (2006): 89. 4. sweeney, “reinventing library buildings,” 170. 5. shish and allen, “working with generation-d,” 96. 6. ibid., 98. 7. stephen abram, “social libraries: the librarian 2.0 phenomenon,” library resources & technical services 52, no. 2 (2008): 21. 8. ibid. 9. jack m. maness, “library 2.0 theory: web 2.0 and its implications for libraries,” webology 3, no. 2 (2006), http://www.webology.ir/2006/v3n2/a25.html?q=link:webology.ir/ (accessed jan. 8, 2010). 10. ibid., under “blogs and wikis,” para. 4. 11. laurel ann clyde, “library weblogs,” library management 22, no.
4/5 (2004): 183–89; maness, “library 2.0 theory.” 12. see matthew m. bejune, “wikis in libraries,” information technology and libraries 26, no. 3 (2007): 26–38; darlene fichter, “the many forms of e-collaboration: blogs, wikis, portals, groupware, discussion boards, and instant messaging,” online: exploring technology & resources for information professionals 29, no. 4 (2005): 48–50; maness, “library 2.0 theory.” 13. mary ellen bates, “can i facebook that?” online: exploring technology & resources for information professionals 31, no. 5 (2007): 64; sarah elizabeth miller and lauren a. jensen, “connecting and communicating with students on facebook,” computers in libraries 27, no. 8 (2007): 18–22. 14. clyde, “library weblogs,” 183. 15. maness, “library 2.0 theory.” 16. fichter, “many forms of e-collaboration,” 50. 17. sweeney, “reinventing library buildings”; bates, “can i facebook that?” 18. bates, “can i facebook that?” 64. 19. abram, “social libraries,” 21. 20. ibid., 20. 21. ibid., 21. 22. shish and allen, “working with generation-d,” 90. 23. abram, “social libraries,” 20. 24. helene blowers and lori reed, “the c’s of our sea change: plans for training staff, from core competencies to learning 2.0,” computers in libraries 27, no. 2 (2007): 11. 25. helene blowers, learning 2.0, 2007, http://plcmclearning.blogspot.com (accessed jan. 8, 2010). 26. for examples, see ilana kingsley and karen jensen, “learning 2.0: a tool for staff training at the university of alaska fairbanks rasmuson,” the electronic journal of academic & special librarianship 12, no. 1 (2009), http://southernlibrarianship.icaap.org/content/v10n01/kingsley_i01.html (accessed jan. 8, 2010); beverly simmons, “learning (2.0) to be a social library,” tennessee libraries 58, no. 2 (2008): 1–8. 27. for examples, see christine mackenzie, “creating our future: workforce planning for library 2.0 and beyond,” australasian public libraries & information services 20, no.
3 (2007): 118–24; liisa sjoblom, “embracing technology: the deschutes public library’s learning 2.0 program,” ola quarterly 14, no. 2 (2007): 2–6; hui-lan titango and gail l. mason, “learning library 2.0: 23 things @ scpl,” library management 30, no. 1/2 feedback we have received, and continue to evaluate it and improve it based on survey results. the purpose of a second technology challenge would be to reinforce what staff and faculty have already learned, to teach new skills, and to help participants remember the importance of lifelong learning when it comes to technology. ■■ conclusion hbll’s self-directed technology challenge was successful in teaching technology skills and in promoting lifelong learning—as well as in fostering the development of librarian 2.0. abram listed key characteristics and duties of librarian 2.0, including learning the tools of web 2.0; connecting people, technology, and information; embracing “nontextual information and the power of pictures, moving images, sight, and sound”; using the latest tools of communication; and understanding the “emerging roles and impacts of the blogosphere, web syndicasphere, and wikisphere.”41 survey results indicated that hbll employees are on their way to developing these attributes, and that they are better equipped with the skills and tools to keep learning. like plcmc’s learning 2.0, the technology challenge could be replicated in libraries of various sizes. obviously an exact replication would not be feasible or appropriate for every library—but the basic ideas, such as the principles of andragogy and self-directed learning could be incorporated, as well as the daily time requirement or the use of surveys to determine weaknesses or interests in technology skills. whatever the case, there is a great need for library staff and faculty to learn emerging technologies and to keep learning them as technology continues to change and advance. 
but the most important benefit of a self-directed training program focusing on lifelong learning is effective employee development. the goal of any training program is to increase work productivity, and as employees become more productive and efficient, they are happier and more excited about their jobs. on the exit survey, one participant expressed initial hesitance about the technology challenge and a fear that it would increase an already hefty workload. however, once the challenge began, the participant enjoyed “taking the time to learn about new things. i feel i am a better person/librarian because of it.” and that, ultimately, is the goal: not only to create better librarians, but also to create better people. notes 1. robert h. mcdonald and chuck thomas, “disconnects between library culture and millennial generation values,” educause quarterly 29, no. 4 (2006): 4.
ers,” journal of extension 33 (2005), http://www.joe.org/joe/2006december/tt5.php (accessed jan. 8, 2010); wayne g. west, “group learning in the workplace,” new directions for adult and continuing education 71 (1996): 51–60. 33. ota et al., “needs of learners.” 34. d. r. garrison, “self-directed learning: toward a comprehensive model,” adult education quarterly 48 (1997): 22. 35. ibid. 36. ota et al., “needs of learners,” under “needs of the adult learner,” para. 4. 37. ota et al., “needs of learners”; west, “group learning.” 38. ota et al., “needs of learners,” under “needs of the adult learner,” para. 2. 39. ibid., para. 6. 40. ibid., para. 7. 41. abram, “social libraries,” 21–22.
(2009): 44–56; illinois library association, “continuous improvement: the transformation of staff development,” the illinois library association reporter 26, no. 2 (2008): 4–7; and thomas simpson, “keeping up with technology: orange county library embraces 2.0,” florida libraries 20, no. 2 (2007): 8–10. 28. sharan b.
merriam, “andragogy and self-directed learning: pillars of adult learning theory,” new directions for adult & continuing education 89 (2001): 3–13. 29. malcolm shepherd knowles, the modern practice of adult education: from pedagogy to andragogy (new york: cambridge books, 1980). 30. jovita ross-gordon, “adult learners in the classroom,” new directions for student services 102 (2003): 43–52. 31. merriam, “pillars of adult learning”; ross-gordon, “adult learners.” 32. carrie ota et al., “training and the needs of learn

appendix a. technology challenge “mini challenges”

technology challenge participants had the opportunity to complete fifteen of twenty mini-challenges to become eligible to win a second gift certificate to the campus bookstore. below are some examples of technology mini-challenges:
1. read a library or a technology blog
2. listen to a library podcast
3. check out a book from circulation’s new self-checkout machine
4. complete an online copyright tutorial
5. catalog some books on librarything
6. read an e-book with sony ebook reader or amazon kindle
7. scan photos or copy them from a digital camera and then burn them onto a cd
8. back up data
9. change computer settings
10. schedule meetings with microsoft outlook
11. create a page or comment on a page on the library’s intranet wiki
12. use one of the library’s music databases to listen to music
13. use wordpress or blogger to create a blog
14. post a photo on a blog
15. use google reader or bloglines to subscribe to a blog or news page using rss
16. reserve and check out a digital camera, camcorder, dvr, or slide scanner from the multimedia lab and create something with it
17. convert media on the analog media racks
18. edit a family photograph using photo-editing software
19. attend a class in the multimedia lab
20. make a phone call using skype

appendix b. exit survey results

how did you like the technology challenge overall?
answer              response   percent
strongly disliked   0          0
disliked            0          0
liked               42         66
strongly liked      22         34

how did you like the reporting system used for the technology challenge (the techopoly game)?
answer              response   percent
strongly disliked   0          0
disliked            4          6
liked               41         64
strongly liked      19         30

would you participate in another technology challenge?
answer   response   percent
yes      64         100
no       0          0

what percentage of time did you spend using the following methods of learning? (participants were asked to allocate 100 points among the categories)
category                          average response
instructor-led large group        15.3
instructor-led small group        27
one-on-one instruction            3.5
web tutorial                      12.8
self-learning (books, articles)   27.4
dvds                              0.5
small group discussion            2.7
large group discussion            2.6
other                             6.7

i am more likely to incorporate new technology into my home or work life.
answer              response   percent
strongly disagree   0          0
disagree            2          3
agree               49         77
strongly agree      13         20

i enjoy the process of making new technology a part of my work or home life.
answer              response   percent
strongly disagree   0          0
disagree            2          3
agree               37         58
strongly agree      24         38

after completing the technology challenge, my desire to learn new technologies has increased.
answer              response   percent
strongly disagree   0          0
disagree            7          11
agree               44         69
strongly agree      13         20

i feel i now learn new technologies more quickly.
answer              response   percent
strongly disagree   0          0
disagree            20         31
agree               39         61
strongly agree      5          8

open-ended questions
■■ what would you change about the technology challenge?
■■ what did you like about the technology challenge?
■■ what technologies were you introduced to during the technology challenge that you now use on a regular basis?
■■ in what ways do you feel the technology challenge has benefited you the most?

how much more proficient do you feel in . . .
category             not any   somewhat   a lot
hardware             31%       64%        5%
software             8%        72%        20%
internet resources   17%       68%        15%
library technology   23%       64%        13%

in order for you to succeed in your job, how important is keeping abreast of new technologies to you?
answer           response   percent
not important    1          2
important        22         34
very important   41         64

evaluation of semi-automatic metadata generation tools: a survey of the current state of the art
jung-ran park and andrew brenza
information technology and libraries | september 2015 22

abstract
assessment of the current landscape of semi-automatic metadata generation tools is particularly important considering the rapid development of digital repositories and the recent explosion of big data. utilization of semi-automatic metadata generation is critical in addressing these environmental changes and may be unavoidable in the future considering the costly and complex operation of manual metadata creation. to address such needs, this study examines the range of semi-automatic metadata generation tools (n = 39) while providing an analysis of their techniques, features, and functions. the study focuses on open-source tools that can be readily utilized in libraries and other memory institutions. the challenges and current barriers to implementation of these tools were identified. the greatest area of difficulty lies in the fact that the piecemeal development of most semi-automatic generation tools only addresses part of the issue of semi-automatic metadata generation, providing solutions to one or a few metadata elements but not the full range of elements.
this indicates that significant local efforts will be required to integrate the various tools into a coherent working whole. suggestions toward such efforts are presented for future developments that may assist information professionals with incorporation of semi-automatic tools within their daily workflows.

jung-ran park (jung-ran.park@drexel.edu) is editor, journal of library metadata, and associate professor, college of computing and informatics, drexel university, philadelphia. andrew brenza (apb84@drexel.edu) is project assistant, college of computing and informatics, drexel university, philadelphia.

introduction
with the rapid increase in all types of information resources managed by libraries over the last few decades, the ability of the cataloging and metadata community to describe those resources has been severely strained. furthermore, the reality of stagnant and decreasing library budgets has prevented the library community from addressing this issue with concomitant staffing increases. nevertheless, the ability of libraries to make information resources accessible to their communities of users remains a central concern. thus there is a critical need to devise efficient and cost-effective ways of creating bibliographic records so that users are able to find, identify, and obtain the information resources they need.

one promising approach to managing the ever-increasing amount of information is with semi-automatic metadata generation tools. semi-automatic metadata generation tools
concern the use of software to create metadata records with varying degrees of supervision from a human specialist.1 in their ideal form, semi-automatic metadata generation tools are capable of extracting information from structured and unstructured information resources of all types and creating quality metadata that facilitate not only bibliographic record creation but also semantic interoperability, a critical factor for resource sharing and discovery in the networked environment. through the use of semi-automatic metadata generation tools, the library community has the potential to address many issues related to the increase of information resources, the strain on library budgets, the need to create high-quality, interoperable metadata records, and, ultimately, the effective provision of information resources to users.

evaluation of semi-automatic metadata generation tools | park and brenza doi: 10.6017/ital.v34i3.5889 23

there are many potential benefits to semi-automatic metadata generation. the first is scalability. because of the quantity of information resources and the costly and time-consuming nature of manual metadata generation,2 it is increasingly apparent that there simply are not enough information professionals available for satisfying the metadata-generation needs of the library community. semi-automatic metadata generation, on the other hand, offers the promise of using high levels of computing power to manage large amounts of information resources. in addition to scalability, semi-automatic metadata generation also offers potential cost savings through a decrease in the time required to create effective records.
furthermore, the time savings would allow information professionals to focus on tasks that are more conceptually demanding and thus not suitable for automatic generation. finally, because computers can perform repetitive tasks with relative consistency when compared to their human counterparts, automatic metadata generation promises the ability to create more consistent records. a potential increase in consistency of quality metadata records would, in turn, increase the potential for interoperability and thereby the accessibility of information resources in general. thus semi-automatic metadata generation offers the potential not only to ease resource description demands on the library community but also to improve resource discovery for its users.

goals of the study
assessment of the current landscape of semi-automatic metadata generation tools is particularly important considering the fast development of digital repositories and the recent explosion of data and information. utilization of semi-automatic metadata generation is critical to address such environmental changes and may be unavoidable in the future considering the costly and complex operation of manual metadata creation. even though there are promising experimental studies that exploit various methods and sources for semi-automatic metadata generation,3 there is a lack of studies assessing and evaluating the range of tools that have been developed, implemented, or improved. to address such needs, this study aims to examine the current landscape of semi-automatic metadata generation tools while providing an evaluative analysis of their techniques, features, and functions.
the study primarily focuses on open-source tools that can be readily utilized in libraries and other memory institutions. the study also highlights some of the challenges still facing the continued development of semi-automatic tools and the current barriers to their incorporation into the daily workflows for information organization and management. future directions for the further development of tools are also discussed.

information technology and libraries | september 2015

toward this end, a critical review of the literature in relation to semi-automatic metadata generation tools published from 2004 to 2014 was conducted. databases such as library and information science abstracts and library, information science and technology abstracts were searched and germane articles identified through review of titles and abstracts. because the problem of creating viable tools for the reliable automatic generation of metadata is not limited to the library and information science professions,4 database searches were expanded to include databases pertinent to computing science, including proquest computing, academic search premier, and applied science and technology. keywords such as “automatic metadata generation,” “metadata extraction,” “metadata tools,” and “text mining,” including their stems, were used to explore the databases. in addition to keyword searching, relevant articles were identified within the reference sections of articles already deemed pertinent to the focus of the survey, and results lists were expanded by searching on the subject terms applied to pertinent articles.
to ensure that the latest, most reliable developments in automatic metadata generation were reviewed, various filters, such as date range and peer review, were employed. once tools were identified, their capabilities were tested (when possible), their features were noted, and overarching developments were determined.

the remainder of the article provides an overview of the primary techniques developed for the semi-automatic generation of metadata and a review of the open-source metadata generation tools that employ them. the challenges and current barriers to semi-automatic metadata tool implementation are described, as are suggestions for future developments that may assist information professionals with integration of semi-automatic tools within the daily workflow of technical services departments.

current techniques for the automatic generation of metadata

as opposed to manual metadata generation, semi-automatic metadata generation relies on machine methods to assist with or to complete the metadata-creation process. greenberg distinguished between two methods of automatic metadata generation: metadata extraction and metadata harvesting.5 metadata extraction in general employs automatic indexing and information retrieval techniques to generate structured metadata using the original content of resources. metadata harvesting, on the other hand, is a technique for automatically gathering metadata from individual repositories in which metadata has been produced by semi-automatic or manual approaches. the harvested metadata can be stored in a central repository for future resource retrieval.
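
the harvesting side of greenberg's distinction can be illustrated with a minimal python sketch. it assumes a repository that exposes simple dublin core records through an open archives initiative protocol for metadata harvesting (oai-pmh) `ListRecords` response. the sample response, the identifier `oai:example.org:1`, and the function name `harvest_records` are illustrative assumptions and are not taken from any tool in this survey. a production harvester would also fetch responses over http and follow resumption tokens, both of which the sketch omits.

```python
import xml.etree.ElementTree as ET

# Namespace URIs defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical ListRecords response; a real harvester would fetch this over HTTP.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:1</identifier>
        <datestamp>2014-05-01</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample thesis</dc:title>
          <dc:creator>Doe, Jane</dc:creator>
          <dc:subject>metadata</dc:subject>
          <dc:subject>harvesting</dc:subject>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()


def harvest_records(xml_text):
    """Return one dict per harvested record: identifier plus repeatable DC fields."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        entry = {
            "identifier": record.findtext("oai:header/oai:identifier", namespaces=NS)
        }
        dc_root = record.find("oai:metadata/oai_dc:dc", NS)
        if dc_root is not None:
            for element in dc_root:
                # Strip the "{namespace}" prefix that ElementTree adds to tag names.
                field = element.tag.split("}", 1)[-1]
                entry.setdefault(field, []).append((element.text or "").strip())
        records.append(entry)
    return records


# The harvested dicts could then be stored in a central repository.
for harvested in harvest_records(SAMPLE_RESPONSE):
    print(harvested)
```

the sketch only gathers what the source repository already recorded, which is the defining trait of harvesting as opposed to extraction.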
within this dichotomy of extraction and harvesting methods, there are several other more specific techniques that researchers have developed for the semi-automatic generation of metadata. polfreman et al. identified an additional six techniques that have been developed over the years: meta-tag harvesting, content extraction, automatic indexing, text and data mining, extrinsic data auto-generation, and social tagging.6 although the last technique is not properly a semi-automatic metadata generation technique because it is used to generate metadata with a minimum of intervention required by metadata professionals, it can be viewed as a possible mode to streamline the metadata creation process.

both greenberg and polfreman et al. provide comprehensive, high-level characterizations of the techniques employed in current semi-automatic metadata generation tools. however, they do not evaluate these techniques within the context of a broad survey of the tools themselves, nor do they provide a comprehensive enumeration of currently available tools. thus, although these techniques will be examined for the remainder of this section, they serve simply as a framework through which this study provides a current and comprehensive analysis of the tools available for use today. each section provides an overview of the relevant technique, a discussion of the most current research related to it, and the tools that employ that technique.

the tables included in each section provide lists of the semi-automatic metadata generation tools (n = 39) evaluated in the course of this survey.
the information presented in the tables is designed to provide a characterization of each tool: its name, its online location, the technique(s) used to generate metadata, and a brief description of the tool’s functions and features. only those tools that are available for download or for use as web services at the time of this writing are included. furthermore, the listed tools have not been strictly limited to metadata-generation applications but also include some content management system software (cmss), as these generally provide some form of semi-automatic metadata extraction. typically, cmss are capable of extracting technical metadata as well as data that can be found in the meta-tags of information resources, such as the file name, and using that information as the title of a record.

meta-tag extraction

meta-tag extraction is a computing process whereby values for metadata fields are identified and populated through an examination of metadata tags within or attached to a document. in other words, it is a form of metadata harvesting and, possibly, conversion of that metadata into other formats. marcedit, the most widely used semi-automatic tool for the generation of metadata in us libraries,7 is an example of this technique. marcedit essentially harvests metadata from open archives initiative protocol for metadata harvesting (oai-pmh) compliant records and offers the user the opportunity to convert those records to a variety of formats, including machine-readable cataloging (marc), machine-readable cataloging in xml (marc xml), metadata object description schema (mods), and encoded archival description (ead).
it also offers the capability of converting records from any of the supported formats to any of the other supported formats.

other examples of this technique are the web service editor-converter dublin core metadata and the firefox dublin core viewer extension. both of these programs search html files on the web and convert information found in html meta-tags to dublin core elements. in the cases of marcedit and editor-converter dublin core, users are presented with the converted information in an interface that allows the user to edit or refine the data.

figure 1 provides an illustration of the extracted metadata of the new york times homepage using editor-converter dublin core, while figure 2 offers an illustration of the editor that this web service provides.

figure 1. screenshot of extracted dublin core metadata using editor-converter dublin core.

figure 2. screenshot of editor-converter dublin core editing tool (only eight of the sixteen fields are visible in this screenshot).

perhaps the biggest weakness of this type of tool is that it depends entirely on the quality of the metadata from which the programs harvest. this can be most readily seen in figure 1 in the lack of values for a number of the dublin core fields for the new york times website. programs that solely employ the technique of meta-tag harvesting are unable to infer values for metadata elements that are not already populated in the source.
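
the meta-tag harvesting described in this section, and its dependence on the quality of the source tags, can be sketched in python with the standard library's `html.parser` module. this is a minimal sketch under stated assumptions: the `META_TO_DC` mapping, the `MetaTagHarvester` class, and the sample page are hypothetical and do not reproduce the behavior of editor-converter dublin core or any other tool listed in this study.

```python
from html.parser import HTMLParser

# Illustrative mapping from common HTML meta names to Dublin Core elements.
# The choice of names is an assumption, not the mapping used by any listed tool.
META_TO_DC = {
    "author": "creator",
    "description": "description",
    "keywords": "subject",
    "dc.title": "title",
    "dc.creator": "creator",
    "dc.date": "date",
}


class MetaTagHarvester(HTMLParser):
    """Collect <title> text and <meta name=... content=...> values."""

    def __init__(self):
        super().__init__()
        self.record = {}
        self._in_title = False
        self._title_parts = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            name = (attributes.get("name") or "").lower()
            content = attributes.get("content")
            if name in META_TO_DC and content:
                self.record.setdefault(META_TO_DC[name], []).append(content.strip())

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self._title_parts.append(data)


def harvest_meta_tags(html_text):
    """Return a dict of Dublin Core element names to lists of harvested values."""
    harvester = MetaTagHarvester()
    harvester.feed(html_text)
    harvester.close()
    record = dict(harvester.record)
    title = "".join(harvester._title_parts).strip()
    if title:
        record.setdefault("title", []).insert(0, title)
    return record


# Hypothetical page: it has no keywords tag, so no subject can be harvested.
SAMPLE_PAGE = """<html><head>
<title>Example News</title>
<meta name="description" content="Daily headlines.">
<meta name="Author" content="Example Staff">
</head><body><p>Body text is ignored by a pure meta-tag harvester.</p></body></html>"""

print(harvest_meta_tags(SAMPLE_PAGE))
```

because the sample page carries no keywords tag, the resulting record has no subject element, which mirrors the empty dublin core fields noted for the new york times homepage.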
table 1 lists the tools that support meta-tag harvesting either as the sole technique or as one of a suite of techniques used to generate metadata from resources. of the thirty-nine tools evaluated for this study, nineteen support meta-tag harvesting.

| tool name | location | techniques | functions/features |
|---|---|---|---|
| anvl/erc kernel metadata conversion toolkit | http://search.cpan.org/~jak/file-anvl/anvl | meta-tag harvester | a utility that can automatically convert records in the anvl format into other formats such as xml, json (javascript object notation), turtle or plain, among others. |
| apache poi – text extractor | http://poi.apache.org/download.html | content extractor; meta-tag harvester; extrinsic auto-generator | apache poi provides basic text extraction for all project supported file formats. in addition to the (plain) text, apache poi can access the metadata associated with a given file, such as title and author. |
| apache tika | http://tika.apache.org/ | content extractor; meta-tag harvester; extrinsic auto-generator | built on apache poi, the apache tika toolkit detects and extracts metadata and text content from various documents. |
| ariadne harvester | http://sourceforge.net/projects/ariadnekps/files/?source=navbar | meta-tag harvester | a harvester of oai-pmh compliant records which can be converted to various other schema such as learning object metadata (lom). |
| bibframe tools | http://www.loc.gov/bibframe/implementation/ | meta-tag harvester | bibframe offers a number of tools for the conversion of marcxml documents to bibframe documents. web service and downloadable software are both available. |
| data fountains | http://datafountains.ucr.edu/ | content extractor; automatic indexer; meta-tag harvester; extrinsic auto-generator | scans html documents and first extracts information contained in meta-tags. if information is unavailable in meta-tags, the program will use other techniques to assign values. includes a focused web crawler that can target websites concerning a specific subject. |
| dublin core meta toolkit | http://sourceforge.net/projects/dcmetatoolkit/files/?source=navbar | meta-tag harvester | transforms data collected via different methods into dublin core (dc) compatible metadata. |
| dspace | http://www.dspace.org/ | meta-tag harvester; extrinsic auto-generator; social tagging | automatically extracts technical information regarding file format and size. can also extract some information from meta-tags. |
| editor-converter dublin core metadata | http://www.library.kr.ua/dc/dceditunie.html | meta-tag harvester; extrinsic auto-generator | scans html documents, harvesting metadata from tags and converting them to dc. |
| embedded metadata extraction tool (emet) | http://www.artstor.org/global/g-html/download-emet-public.html | content extractor; meta-tag harvester; extrinsic auto-generator | emet is a tool designed to extract metadata embedded in jpeg and tiff files. |
| firefox dublin core viewer extension | http://www.splintered.co.uk/experiments/73/ | meta-tag harvester; extrinsic auto-generator | scans html documents, harvesting metadata from tags and displaying them in dublin core. |
| marcedit | http://marcedit.reeset.net/ | meta-tag harvester | harvests oai-pmh compliant data and converts it to various formats including dc and marc. |
| metatag extractor software | http://meta-tag-extractor.software.informer.com/ | meta-tag harvester | permits customizable extraction features, harvesting meta-tags as well as contact information from websites. |
| my meta maker | http://old.isn-oldenburg.de/services/mmm/ | meta-tag harvester | can convert manually entered data into dc. |
| photo rdf-gen | http://www.webposible.com/utilidades/photo_rdf_generator_en.html | meta-tag harvester | generates dublin core and resource description framework (rdf) output from manually entered input. |
| pymarc | https://github.com/edsu/pymarc | meta-tag harvester | scripting tool in python language for the batch processing of marc records, similar to marcedit. |
| repomman | http://www.hull.ac.uk/esig/repomman/index.html | meta-tag harvester; content extractor; extrinsic auto-generator | automatically extracts various elements for documents uploaded to fedora such as author, title, description, and key words, among others. results are presented to user for review. |
| sherpa/romeo | http://www.sherpa.ac.uk/romeo/api.html | meta-tag harvester | a machine-to-machine application program interface (api) that permits the automatic look-up and importation of publishers and journals. |
| url and metatag extractor | http://www.metatagextractor.com/ | meta-tag harvester | permits the targeted searching of websites and extracts urls and meta-tags from those sites. |

table 1. semi-automatic tools that support meta-tag harvesting.
content extraction

content extraction is a form of metadata extraction whereby various computing techniques are used to extract information from the information resource itself. in other words, these techniques do not rely on the identification of relevant meta-tags for the population of metadata values. an example of this technique is the kea application, a program developed at the new zealand digital library that uses machine learning, term frequency-inverse document frequency (tf.idf), and first-occurrence techniques to identify and assign key phrases from the full text of documents.8 the major advantage of this type of technique is that the extraction of metadata can be done independently of the quality of metadata associated with any given information resource. another example of a tool utilizing this technique is the open text summarizer, an open-source program that offers the capability of reading a text and extracting important sentences to create a summary as well as to assign keywords. figure 3 provides a screenshot of what a summarized text might look like using the open text summarizer.

figure 3. open text summarizer: sample summary of text.

another form of this technique often relies on the predictable structure of certain types of documents to identify candidate values for metadata elements. for instance, scholarly research papers have a reliable format, generally including title, author, abstract, introduction, conclusion, and reference sections in predictable ways, and machines can exploit this format to extract metadata values from them.
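
as a rough illustration of how predictable document structure can be exploited, the following python sketch guesses a title and an abstract from plain text. it is a rule-based simplification under loud assumptions: section headings are assumed to sit on their own lines, the first non-empty line is assumed to be the title, and the function name `extract_by_structure` and the sample paper are hypothetical. it does not use machine learning and is not the method of any project cited in this article.

```python
import re

# Section headings assumed to appear alone on a line, as in many research papers.
HEADING = re.compile(r"^(abstract|introduction|conclusion|references)\s*$", re.IGNORECASE)


def extract_by_structure(text):
    """Guess title and abstract from the conventional layout of a research paper."""
    lines = [line.strip() for line in text.splitlines()]
    non_empty = [line for line in lines if line]
    record = {"title": non_empty[0] if non_empty else None, "abstract": None}

    current = None        # heading of the section currently being read
    abstract_lines = []
    for line in lines:
        match = HEADING.match(line)
        if match:
            current = match.group(1).lower()
        elif current == "abstract" and line:
            abstract_lines.append(line)
    if abstract_lines:
        record["abstract"] = " ".join(abstract_lines)
    return record


# Hypothetical paper text used only to exercise the sketch.
SAMPLE_PAPER = """A Study of Metadata Tools

Jane Doe

Abstract
This paper surveys tools.
It covers thirty-nine of them.

Introduction
Metadata creation is costly.
"""

print(extract_by_structure(SAMPLE_PAPER))
```

a document that departs from the assumed layout would yield wrong or empty values, which is why such rules are usually paired with trained classifiers.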
several projects have been able to exploit this technique in combination with machine learning algorithms to extract various forms of metadata. for instance, in the randkte project, optical character recognition software was used to scan a large quantity of legal documents from which, because of the regularity of the documents’ structure, structural metadata such as chapter, section, and page number could be extracted.9 in contrast, the project of kovacevic et al. used the predictable structure of scholarly articles, converting documents from pdf to html files while preserving the formatting details, and used classification algorithms to extract metadata regarding title, author, abstract, and keywords, among other elements.10

table 2 lists the tools that support content extraction either as the sole technique or as one of a suite of techniques used to generate metadata from resources. of the thirty-nine tools evaluated for this study, twenty tools support some form of content extraction.

| tool name | location | techniques | functions/features |
|---|---|---|---|
| apache poi – text extractor | http://poi.apache.org/download.html | content extractor; meta-tag harvester; extrinsic auto-generator | apache poi provides basic text extraction for all project supported file formats. in addition to the (plain) text, apache poi can access the metadata associated with a given file, such as title and author. |
| apache stanbol | https://stanbol.apache.org/ | content extractor; automatic indexer | extracts semantic metadata from pdf and text files. can apply extracted terms to ontologies. |
| apache tika | http://tika.apache.org/ | content extractor; meta-tag harvester; extrinsic auto-generator | built on apache poi, the apache tika toolkit detects and extracts metadata and text content from various documents. |
| biblio citation parser | http://search.cpan.org/~mjewell/biblio-citation-parser-1.10/ | content extractor | a set of modules for citation parsing. |
| catmdedit | http://catmdedit.sourceforge.net/ | content extractor | catmdedit allows the automatic creation of metadata for collections of related resources, in particular spatial series that arise as a result of the fragmentation of geometric resources into datasets of manageable size and similar scale. |
| crossref | http://www.crossref.org/simpletextquery/ | content extractor | this web service returns digital object identifiers for inputted references. |
| data fountains | http://datafountains.ucr.edu/ | content extractor; automatic indexer; meta-tag harvester; extrinsic auto-generator | scans html documents and first extracts information contained in meta-tags. if information is unavailable in meta-tags, the program will use other techniques to assign values. includes a focused web crawler that can target websites concerning a specific subject. |
| embedded metadata extraction tool (emet) | http://www.artstor.org/global/g-html/download-emet-public.html | content extractor; meta-tag harvester; extrinsic auto-generator | emet is a tool designed to extract metadata embedded in jpeg and tiff files. |
| freecite | http://freecite.library.brown.edu/ | content extractor | free parsing tool for the extraction of reference information. can be downloaded or used as a web service. |
| general architecture for text engineering (gate) | http://gate.ac.uk/overview.html | content extractor; automatic indexer | natural language processor and information extractor. |
| kea | http://www.nzdl.org/kea/index_old.html#download | content extractor; automatic indexer | analyzes the full texts of resources and extracts keyphrases. keyphrases can also be mapped to customized ontologies or controlled vocabularies for subject term assignment. |
| metagen | http://www.codeproject.com/articles/41910/metagen-a-project-metadata-generator-for-visual-st | content extractor; automatic indexer | used to build a metadata generator for silverlight and desktop clr projects, metagen can be used as a replacement for static reflection (expression trees), reflection (walking the stack), and various other means for deriving the name of a property, method, or field. |
| metagenerator | http://extensions.joomla.org/extensions/site-management/seo-a-metadata/meta-data/11038 | content extractor | a plugin that automatically generates description and keyword meta-tags by pulling text from joomla content. with this plugin you can also control some title options and add url meta-tags. |
| ont-o-mat | http://projects.semwebcentral.org/projects/ontomat/ | content extractor | assists user with annotation of websites that are semantic web-compliant. may now include a feature that automatically suggests portions of the website to annotate. |
| open text summarizer | http://libots.sourceforge.net/ | content extractor | extracts pertinent sentences from a resource to build a free text description. |
| parscit | http://wing.comp.nus.edu.sg/parscit/#ws | content extractor | open-source string-parsing package for the extraction of reference information from scholarly articles. |
| repomman | http://www.hull.ac.uk/esig/repomman/index.html | meta-tag harvester; content extractor; extrinsic auto-generator | automatically extracts various elements for documents uploaded to fedora such as author, title, description, and key words, among others. results are presented to user for review. |
| simple automatic metadata generation interface (samgi) | http://hmdb.cs.kuleuven.be/amg/download.php | content extractor; extrinsic auto-generator | a suite of tools that is able to automatically extract metadata elements such as key phrase and language from documents as well as from the context in which a document exists. |
| termine | http://www.nactem.ac.uk/software/termine/ | content extractor | extracts keywords from texts through c-value analysis and acromine, an acronym identifier and dictionary. available as free web service for academic use. |
| yahoo content analysis api | https://developer.yahoo.com/contentanalysis/ | content extractor; automatic indexer | the content analysis web service detects entities/concepts, categories, and relationships within unstructured content. it ranks those detected entities/concepts by their overall relevance, resolves those if possible into wikipedia pages, and annotates tags with relevant metadata. |

table 2.
semi-automatic tools that support content extraction.

automatic indexing

like content extraction, automatic indexing involves the use of machine learning and rule-based algorithms to extract metadata values from within information resources themselves, rather than relying on the content of meta-tags applied to resources. however, this technique also involves the mapping of extracted metadata terms to controlled vocabularies such as the library of congress subject headings (lcsh), the getty thesaurus of geographic names (tgn), or the library of congress name authority file (lcnaf), or to domain-specific or locally developed ontologies. thus, in this technique, researchers use classifying and clustering algorithms to extract relevant metadata from texts. term-frequency statistics, or tf.idf, which estimates the likelihood that a keyword applies to a document from the term’s relative frequency within that document as opposed to its relative infrequency in related documents, are commonly used in this technique.

projects such as johns hopkins university’s automatic name authority control (anac) tool utilize this technique to extract the names of composers within its sheet music collections and to assign the authorized form of those names based on comparisons with lcnaf.11 erbs et al.
also use this technique to extract key phrases from german educational documents, which are then used to assign index terms, thereby increasing the degree to which related documents are collocated within the repository and the consistency of subject term application.12

table 3 lists the tools that support automatic indexing either as the sole technique or as one of a suite of techniques used to generate metadata from resources. of the thirty-nine tools evaluated for this study, seven tools support some form of automatic indexing.

| tool name | location | techniques | functions/features |
|---|---|---|---|
| apache poi – text extractor | http://poi.apache.org/download.html | content extractor; meta-tag harvester; extrinsic auto-generator | apache poi provides basic text extraction for all project supported file formats. in addition to the (plain) text, apache poi can access the metadata associated with a given file, such as title and author. |
| apache tika | http://tika.apache.org/ | content extractor; meta-tag harvester; extrinsic auto-generator | built on apache poi, the apache tika toolkit detects and extracts metadata and text content from various documents. |
| data fountains | http://datafountains.ucr.edu/ | content extractor; automatic indexer; meta-tag harvester; extrinsic auto-generator | scans html documents and first extracts information contained in meta-tags. if information is unavailable in meta-tags, the program will use other techniques to assign values. includes a focused web crawler that can target websites concerning a specific subject. |
| digital record object identification (droid) | http://www.nationalarchives.gov.uk/information-management/manage-information/preserving-digital-records/droid/ | extrinsic auto-generator | droid is a software tool developed by the national archives to perform automated batch identification of file formats. |
| dspace | http://www.dspace.org/ | meta-tag harvester; extrinsic auto-generator | automatically extracts technical information regarding file format and size. can also extract some information from meta-tags. |
| editor-converter dublin core metadata | http://www.library.kr.ua/dc/dceditunie.html | meta-tag harvester; extrinsic auto-generator | scans html documents, harvesting metadata from tags and converting them to dublin core. |
| embedded metadata extraction tool (emet) | http://www.artstor.org/global/g-html/download-emet-public.html | content extractor; meta-tag harvester; extrinsic auto-generator | emet is a tool designed to extract metadata embedded in jpeg and tiff files. |
| firefox dublin core viewer extension | http://www.splintered.co.uk/experiments/73/ | meta-tag harvester; extrinsic auto-generator | scans html documents, harvesting metadata from tags and displaying them in dublin core. |
| jhove | http://jhove.sourceforge.net/#implementation | extrinsic auto-generator | extracts metadata regarding file format and size as well as validating the structure of the identified file format. |
| national library of new zealand – metadata extraction tool | http://meta-extractor.sourceforge.net/ | extrinsic auto-generator | developed by the national library of new zealand to programmatically extract preservation metadata from a range of file formats like pdf documents, image files, sound files, microsoft office documents, and others. |
| omeka | http://omeka.org/ | extrinsic auto-generator; social tagging | automatically extracts technical information regarding file format and size. |
| repomman | http://www.hull.ac.uk/esig/repomman/index.html | meta-tag harvester; content extractor; extrinsic auto-generator | automatically extracts various elements for documents uploaded to fedora such as author, title, description, and key words, among others. results are presented to user for review. |
| simple automatic metadata generation interface (samgi) | http://hmdb.cs.kuleuven.be/amg/download.php | content extractor; extrinsic auto-generator | a suite of tools that is able to automatically extract metadata elements such as keyphrase and language from documents as well as from the context in which a document exists. |

table 3. semi-automatic tools that support automatic indexing

text and data mining

the two methods discussed above, content extraction and automatic indexing, rely on text- and data-mining techniques for the automatic extraction of metadata.
in other words, the above methods utilize machine-learning algorithms, statistical analysis of term frequencies, clustering techniques (techniques that examine the frequency of term utilization between documents as opposed to the use of controlled vocabularies), and classifying techniques (techniques that exploit the conventional structure of documents) for the semi-automatic generation of metadata. because of the complexity of these techniques, few tools have been fully developed for application within real-world library settings. rather, most uses of these techniques have been developed to solve the problems of automatic metadata generation within the context of specific research projects.

there are two reasons for this. one is that, as many researchers have noted, the effectiveness of machine learning techniques depends on the quality and quantity of training data used to teach the system.13, 14, 15 because of the number and diversity of subject domains as well as the sheer variety of document formats, many applications are designed to address the metadata needs of very specific subject domains and very specific types of documents. this is a point that kovacevic et al.
make  in  stating  that  machine  learning  techniques  generally  work  best  for  documents  of  a   similar  type,  like  research  papers.16  another  issue,  especially  as  it  applies  to  automatic  indexing,  is   the  fact  that,  as  gardner  notes,  controlled  vocabularies  such  as  the  lcsh  are  too  complicated  and   diverse  in  structure  to  be  applied  through  semi-­‐automatic  means.17  although  some  open-­‐source   tools  such  as  data  fountains  have  made  efforts  to  overcome  this  complexity,  projects  like  it  are  the   exception  rather  than  the  rule.  these  issues  signify  the  difficulty  with  developing  sophisticated   semi-­‐automatic  metadata  generation  tools  that  have  general  applicability  across  a  wide  range  of   subject  domains  and  format  types.  nevertheless,  for  semi-­‐automatic  metadata  generation  tools  to   become  a  reality  for  the  library  community,  such  complexity  will  have  to  be  overcome.   there  are,  however,  some  tools  that  have  broader  applicability  or  can  be  customized  to  meet  local   needs.  for  instance,  the  kea  keyphrase  extractor  offers  the  option  of  building  local  or  applying   available  ontologies  that  can  be  used  to  refine  the  extraction  process.  perhaps  the  most  promising   of  all  is  the  above  mentioned  data  fountains  suite  of  tools  developed  by  the  university  of   california.  the  data  fountains  suite  incorporates  almost  every  one  of  the  semi-­‐automatic   metadata  techniques  described  in  this  study,  including  sophisticated  content  extraction  and   automatic  indexing  features.  it  also  provides  several  ways  to  customize  the  suite  in  order  to  meet   local  needs.     
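the statistical analysis of term frequencies mentioned in this section can be illustrated with a minimal python sketch. it is not the method used by kea, data fountains, or any other tool named here; it is a plain tf-idf ranking under stated assumptions: the stop-word list is a small hypothetical one, tokens are lowercase alphabetic strings, and the function names are invented for illustration.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately small stop-word list; real tools ship larger ones.
STOP_WORDS = {"the", "of", "and", "a", "to", "in", "is", "for", "that", "on",
              "with", "as", "are", "by", "an"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def tfidf_keywords(documents, top_n=3):
    """Return the top_n highest-scoring TF-IDF terms for each document."""
    tokenized = [tokenize(doc) for doc in documents]

    # Document frequency: the number of documents that contain each term.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    n_docs = len(documents)
    results = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty documents
        scores = {
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
        # Highest score first; ties are broken alphabetically for stable output.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        results.append(ranked[:top_n])
    return results
```

terms that occur in every document receive a score of zero, which is why such a ranking favors vocabulary that distinguishes one document from the rest of the collection. the same property helps explain why results tend to depend on how homogeneous the collection is.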
extrinsic data auto-generation

extrinsic data auto-generation is the process of extracting metadata about an information resource that is not contained within the resource itself. extrinsic data auto-generation can involve the extraction of technical metadata such as file format and size but can also include the extraction of more complicated features such as the grade level of an educational resource or the intended audience for a document. the process of extracting technical metadata is perhaps one area of semi-automatic metadata generation that is in a high state of development, included in most cmss such as dspace,18 as well as other more sophisticated tools such as harvard’s jhove, which can recognize at least twelve different kinds of textual, audio, and visual file formats.19 on the other hand, the problem of semi-automatically generating other types of extrinsic metadata, like grade level, is among the most difficult to solve.

as leibbrandt et al. note in their analysis of the use of artificial intelligence mechanisms to generate subject metadata for a repository of educational materials at education services australia, the extraction of extrinsic metadata such as grade level was much more difficult than the extraction of keywords because of the lack of information surrounding a resource’s context within the resource itself.20 this difficulty can also be seen in the absence of tools that support the extraction of extrinsic data beyond those that are harvesting metadata that has been created manually or extracting technical metadata.
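the extraction of technical metadata such as file format and size can be sketched in a few lines of python. this is an illustrative sketch only and does not reproduce how jhove, droid, or dspace identify formats; the signature table is a small assumed sample of well-known file signatures, and the function name and output keys are hypothetical.

```python
import mimetypes
import os
import tempfile

# A few well-known file signatures ("magic numbers"); illustrative, not exhaustive.
SIGNATURES = {
    b"%PDF-": "application/pdf",
    b"\x89PNG\r\n\x1a\n": "image/png",
    b"\xff\xd8\xff": "image/jpeg",
}


def technical_metadata(path):
    """Collect file name, size in bytes, and a best-guess format for one file."""
    with open(path, "rb") as handle:
        header = handle.read(16)

    # Prefer the byte signature; fall back to a guess from the file extension.
    detected = next(
        (mime for sig, mime in SIGNATURES.items() if header.startswith(sig)), None
    )
    guessed, _ = mimetypes.guess_type(path)

    return {
        "name": os.path.basename(path),
        "size_bytes": os.path.getsize(path),
        "format": detected or guessed or "application/octet-stream",
    }


if __name__ == "__main__":
    # Write a tiny file that starts with the pdf signature, then describe it.
    with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as tmp:
        tmp.write(b"%PDF-1.4\n")
    print(technical_metadata(tmp.name))
    os.remove(tmp.name)
```

checking the leading bytes before the file extension is a design choice made here because extensions can be missing or wrong. nothing comparable exists for properties such as grade level, which is consistent with the difficulty described above.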
table 4 lists the tools that support extrinsic data auto-generation either as the sole technique or as one of a suite of techniques used to generate metadata from resources. of the thirty-nine tools evaluated for this study, thirteen tools support some form of extrinsic data auto-generation.

| tool name | location | techniques | functions/features |
|---|---|---|---|
| apache poi—text extractor | http://poi.apache.org/download.html | content extractor; meta-tag harvester; extrinsic auto-generator | apache poi provides basic text extraction for all project supported file formats. in addition to the (plain) text, apache poi can access the metadata associated with a given file, such as title and author. |
| apache tika | http://tika.apache.org/ | content extractor; meta-tag harvester; extrinsic auto-generator | built on apache poi, the apache tika toolkit detects and extracts metadata and text content from various documents. |
| data fountains | http://datafountains.ucr.edu/ | content extractor; automatic indexer; meta-tag harvester; extrinsic auto-generator | scans html documents and first extracts information contained in meta-tags. if information is unavailable in meta-tags, the program will use other techniques to assign values. includes a focused web crawler that can target websites concerning a specific subject. |
| digital record object identification (droid) | http://www.nationalarchives.gov.uk/information-management/manage-information/preserving-digital-records/droid/ | extrinsic auto-generator | droid is a software tool developed by the national archives to perform automated batch identification of file formats. |
| dspace | http://www.dspace.org/ | meta-tag harvester; extrinsic auto-generator | automatically extracts technical information regarding file format and size. can also extract some information from meta-tags. |
| editor-converter dublin core metadata | http://www.library.kr.ua/dc/dceditunie.html | meta-tag harvester; extrinsic auto-generator | scans html documents, harvesting metadata from tags and converting them to dublin core. |
| embedded metadata extraction tool (emet) | http://www.artstor.org/global/g-html/download-emet-public.html | content extractor; meta-tag harvester; extrinsic auto-generator | emet is a tool designed to extract metadata embedded in jpeg and tiff files. |
| firefox dublin core viewer extension | http://www.splintered.co.uk/experiments/73/ | meta-tag harvester; extrinsic auto-generator | scans html documents, harvesting metadata from tags and displaying them as dublin core. |
| jhove | http://jhove.sourceforge.net/ | extrinsic auto-generator | extracts metadata regarding file format and size as well as validating the structure of the identified file format. |
| national library of new zealand—metadata extraction tool | http://meta-extractor.sourceforge.net/ | extrinsic auto-generator | developed by the national library of new zealand to programmatically extract preservation metadata from a range of file formats like pdf documents, image files, sound files, microsoft office documents, and others. |
| omeka | http://omeka.org/ | extrinsic auto-generator; social tagging | automatically extracts technical information regarding file format and size. |
| repomman | http://www.hull.ac.uk/esig/repomman/index.html | meta-tag harvester; content extractor; extrinsic auto-generator | automatically extracts various elements for documents uploaded to fedora such as author, title, description, and key words, among others. results are presented to user for review. |
| simple automatic metadata generation interface (samgi) | http://hmdb.cs.kuleuven.be/amg/download.php | content extractor; extrinsic auto-generator | a suite of tools that is able to automatically extract metadata elements such as keyphrase and language from documents as well as from the context in which a document exists. |

table 4. semi-automatic tools that support extrinsic data auto-generation.

social tagging

social tagging is now a familiar form of subject metadata generation although, as mentioned previously, it is not properly a form of automatic metadata generation. nevertheless, because of the relatively low cost of generating and maintaining metadata through social tagging and its current widespread popularity, a few projects have attempted to utilize such data to enhance repositories. for instance, lindstaedt et al. use sophisticated computer programs to analyze still images found within flickr and then use this analysis to process new images and to propagate relevant user tags to those images.21

in a slightly more complicated example, liu and qin employ machine-learning techniques to initially process and assign metadata, including subject terms, to a repository of documents related to the computer science profession.22 however, this proof-of-concept project also permits users to edit the fields of the metadata once established.
the user-edited tags are then reprocessed by the system with the hope of improving the machine-learning mechanisms of the database, creating a kind of feedback loop for the system. specifically, the improved tags are used by the system to suggest and assign subject terms for new documents as well as to improve subject description of existing documents within the repository. although these two examples provide instances of sophisticated reprocessing of social tag metadata, these capabilities do not seem to be present in open-source tools at this time. nevertheless, social tagging capabilities are offered by many cmss such as omeka. these social tagging capabilities may offer a means to enhance subject access to holdings.

table 5 below lists the tools that support social tagging either as the sole technique or as one of a suite of techniques used to generate metadata from resources. of the thirty-nine tools evaluated for this study, two tools support some form of social tagging.

| tool name | location | techniques | functions/features |
|---|---|---|---|
| dspace | http://www.dspace.org/ | meta-tag harvester; extrinsic auto-generator; social tagging | automatically extracts technical information regarding file format and size. can also extract some information from meta-tags. |
| omeka | http://omeka.org/ | extrinsic auto-generator; social tagging | automatically extracts technical information regarding file format and size. |

table 5. semi-automatic tools that support social tagging.
challenges to implementation

although semi-automatic metadata generation tools offer many benefits, especially with regard to streamlining the metadata-creation process, there are significant barriers to the widespread adoption and implementation of these tools. one problem with semi-automatic metadata generation tools is that many are developed locally to address the specific needs of a given project or as part of academic research. this local, highly focused milieu for development means that general applicability of the tools is potentially diminished. the local context may also hinder widespread adoption of applications that would result in strong communities of application users and provide further support for the development of applications in an open-source context. because of the highly specific nature of many current tools, their relevance to real-world processes of metadata creation within the broader context of libraries’ diverse information management needs is not accounted for.

additionally, many tools are focused on solving one or, at most, a few metadata generation problems. for instance, the kea application is designed to use machine-learning techniques for the sole purpose of extracting keywords, the open text summarizer is limited to automatic extractions of summary descriptions and keywords, and editor converter dublin core is designed to extract information in html meta-tags and map it to dublin core elements. because of the piecemeal development of semi-automatic generation tools, any comprehensive package of tools will require the significant efforts of the implementer to coordinate the selected applications and to produce results in a single output.
this is, to say the least, a daunting task.

furthermore, a high degree of technical skill is required to implement these complex tools. many of the more sophisticated tools used to semi-automatically generate metadata, such as data fountains, kea, and apache stanbol, require competence in a variety of programming languages. significant knowledge of c++, python, and java is required to implement these systems properly. the high degree of technical knowledge needed to implement these tools means that many libraries and other institutions may not have resources to begin implementing them, let alone incorporating them into the daily workflows of the metadata creation process. further, this high degree of technical expertise may require libraries to seek assistance outside of the library. in other words, librarians may need to build strong collaborative relationships with those who have the technical skills, expertise, and credentials to implement and maintain these complicated tools. as vellucci et al. note with regard to their development of the metadata education and research information commons (meric), a metadata-driven clearinghouse of education materials related to metadata, elaborate and multidisciplinary partnerships need to be firmly established for the ultimate success of such projects, including the sustained support of the highest levels of administration.23 these types of partnerships may be difficult to establish and maintain for the sustained implementation of complicated tools.
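to give a sense of how narrow a single-purpose tool can be, the meta-tag harvesting attributed earlier in this section to editor converter dublin core can be sketched with python's standard html.parser module. this is not that tool's implementation; the mapping from html meta names to dublin core elements is a hypothetical one chosen for illustration, as are the class and function names.

```python
from html.parser import HTMLParser

# Hypothetical mapping from common HTML meta names to Dublin Core elements.
META_TO_DC = {
    "author": "dc:creator",
    "description": "dc:description",
    "keywords": "dc:subject",
    "date": "dc:date",
}


class MetaTagHarvester(HTMLParser):
    """Collect <title> text and <meta name=... content=...> values."""

    def __init__(self):
        super().__init__()
        self.record = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attributes = dict(attrs)
            name = (attributes.get("name") or "").lower()
            content = attributes.get("content")
            # Keep only meta names that have a mapped Dublin Core element.
            if name in META_TO_DC and content:
                self.record.setdefault(META_TO_DC[name], []).append(content.strip())

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title and data.strip():
            self.record.setdefault("dc:title", []).append(data.strip())


def harvest(html_text):
    """Return a dict of Dublin Core element names to lists of harvested values."""
    parser = MetaTagHarvester()
    parser.feed(html_text)
    parser.close()
    return parser.record
```

a sketch like this covers only what page authors have already recorded in meta-tags. combining its output with keyword extraction or format identification would still require the coordination work described above.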
additionally, sustainable development of tools, especially with regard to the funding needed for continued development of open-source applications, appears to be a significant barrier to implementation. for instance, at the time of this writing, many of the tools that were touted in the literature as being most promising, such as dc dot, reggie, and describethis, are no longer available for implementation. beyond the fact that discontinuation hurts the potential adoption and continued development of semi-automatic tools within real-world library and other information settings, there is also the problem that those settings that have in fact adopted tools may lose the technical support of a central developer and community of users. thus discontinuation may result in higher rates of tool obsolescence and increase the potential expenses of libraries that have implemented and then must change applications.

finally, the application of semi-automatic metadata tools remains relatively untested in real-world scenarios. as polfreman et al. note, most tests of automatic metadata generation tools have several problems, including small sample sizes, narrow scope of project domains, and experiments that lack true objectivity because systems are generally tested by their creators.24 for these reasons, libraries and other institutions may be reluctant to expend the resources needed to implement and fully integrate a complicated, promising, but ultimately untested tool within the already strained workflows of their processes.
conclusion

semi-automatic metadata generation tools hold the promise of assisting information professionals with the management of ever-increasing quantities and types of information resources. using software that can create metadata records consistently and efficiently, semi-automatic metadata generation tools potentially offer significant cost and time savings. however, the full integration of these tools into the daily workflows of libraries and other information settings remains elusive. for instance, although many tools have been developed that address many of the more complicated aspects of semi-automatic metadata generation, including the extraction of information related to conceptually difficult areas of bibliographic description such as subject terms, open-ended resource descriptions, and keyword assignment, many of these tools are relevant only at the project level and are not applicable to the broader contexts needed by libraries. in other words, the current array of tools exists to solve experimental problems but has not been developed to the point that the library community can implement it in a meaningful way.

perhaps the greatest area of difficulty lies in the fact that most tools only address part of the problem of semi-automatic metadata generation, providing solutions to the semi-automatic generation of one or a few bibliographic elements but not the full range of elements. this means that for libraries to truly have a comprehensive tool set for the semi-automatic generation of metadata records, significant local efforts will be required to integrate the various tools into a working whole. couple this issue with the instability of tool development and maintenance and it appears that the library community may lack incentive to invest already strained and limited resources in the adoption of these tools.

thus it appears that a number of steps will need to be taken before the library community can seriously consider the incorporation of semi-automatic metadata generation tools within its daily workflows. first, it seems that the integration of these various tools into a coherent set of applications is likely the next step in the development of viable semi-automatic metadata generation. since most small libraries likely do not have the resources required to integrate these disparate tools, let alone incorporate them within existing library systems, a single package of tools will be needed simply from a resource perspective. secondly, considering the high level of technical expertise needed to implement the current array of tools, the integrated set of tools must be designed in such a way as to foster implementation, utilization, and maintenance with a minimum of technical expertise. for instance, if an integrated set of tools that functioned across a wide range of subject domains and format types could be developed, the suite might be akin to the cmss currently employed by many libraries. furthermore, with a suite of tools that are relatively easy to use, adoption would likely increase. this might result in a stable community of users that would foster the further development of the tools in a sustainable manner. a comprehensive, relatively easy-to-implement set of tools might foster independent testing of those tools.
the independent testing of the semi-automatic tools is needed to provide an objective basis for tool evaluation and further development.

finally, designing automated workflows tailored to the subject domain and types of resources seems to be an essential step for integrating semi-automatic metadata generation tools into metadata creation. such workflows may delineate data elements that can be generated by an automated meta-tag extractor from data elements that need to be refined and manually created by cataloging and metadata professionals. to develop, maximize, and sustain semi-automatic metadata generation workflows, administrative support for finance, human resources, and training is critical.

thus, although many of the technical aspects of semi-automatic metadata generation are well on their way to being solved, many other barriers exist that might limit adoption. further, these barriers may have a negative influence on the continued, sustainable development of semi-automatic metadata generation tools. nevertheless, the library community critically needs to find ways to manage the recent explosion of data and information in cost-effective and efficient ways. semi-automatic metadata generation holds the promise to do just that.

acknowledgement

this study was supported by the institute of museum and library services.

references

1. jane greenberg, kristina spurgin, and abe crystal, “final report for the amega (automatic
2. sue ann gardner, “cresting toward the sea change,” library resources & technical services 56, no. 2 (2012): 64–79, http://dx.doi.org/10.5860/lrts.56n2.64.
3. for details, see jung-ran park and caimei lu, “application of semi-automatic metadata generation in libraries: types, tools, and techniques,” library & information science research 31, no. 4 (2009): 225–31, http://dx.doi.org/10.1016/j.lisr.2009.05.002.
4. erik mitchell, “trending tech services: programmatic tools and the implications of automation in the next generation of metadata,” technical services quarterly 30, no. 3 (2013): 296–310, http://dx.doi.org/10.1080/07317131.2013.785802.
5. jane greenberg, “metadata extraction and harvesting: a comparison of two automatic metadata generation applications,” journal of internet cataloging 6, no. 4 (2004): 59–82, http://dx.doi.org/10.1300/j141v06n04_05.
6. malcolm polfreman, vanda broughton, and andrew wilson, “metadata generation for resource discovery,” jisc, 2008, http://www.jisc.ac.uk/whatwedo/programmes/resourcediscovery/autometgen.aspx.
7. park and lu, “application of semi-automatic metadata generation in libraries.”
8. kea automatic keyphrase extraction homepage, http://www.nzdl.org/kea/index_old.html.
9. wilhelmina randtke, “automated metadata creation: possibilities and pitfalls,” serials librarian 64, no. 1–4 (2013): 267–84, http://dx.doi.org/10.1080/0361526x.2013.760286.
10. aleksandar kovačević et al., “automatic extraction of metadata from scientific publications for cris systems,” electronic library and information systems 45, no. 4 (2011): 376–96, http://dx.doi.org/10.1108/00330331111182094.
11. mark patton et al., “toward a metadata generation framework: a case study at johns hopkins university,” d-lib magazine 10, no. 11 (2004), http://www.dlib.org/dlib/november04/choudhury/11choudhury.html.
12. nicolai erbs, iryna gurevych, and marc rittberger, “bringing order to digital libraries: from keyphrase extraction to index term assignment,” d-lib magazine 19, no. 9/10 (2013), http://www.dlib.org/dlib/september13/erbs/09erbs.html.
13. polfreman, broughton, and wilson, “metadata generation for resource discovery.”
14. randtke, “automated metadata creation.”
15. xiaozhong liu and jian qin, “an interactive metadata model for structural, descriptive, and referential representation of scholarly output,” journal of the association for information science & technology 65, no. 5 (2014): 964–83, http://dx.doi.org/10.1002/asi.23007.
16. kovačević et al., “automatic extraction of metadata from scientific publications for cris systems.”
17. gardner, “cresting toward the sea change.”
18. mary kurtz, “dublin core, dspace, and a brief analysis of three university repositories,” information technology & libraries 29, no. 1 (2010): 40–46, http://dx.doi.org/10.6017/ital.v29i1.3157.
19. “jhove - jstor/harvard object validation environment,” jstor, http://jhove.sourceforge.net.
20. richard leibbrandt et al., “smart collections: can artificial intelligence tools and techniques assist with discovering, evaluating and tagging digital learning resources?” international association of school librarianship: selected papers from the annual conference (2010).
21. stefanie lindstaedt et al., “automatic image annotation using visual content and folksonomies,” multimedia tools & applications 42, no. 1 (2009): 97–113, http://dx.doi.org/10.1007/s11042-008-0247-7.
22. liu and qin, “an interactive metadata model.”
23. sherry vellucci, ingrid hsieh-yee, and william moen, “the metadata education and research information commons (meric): a collaborative teaching and research initiative,” education for information 25, no. 3/4 (2007): 169–78.
24. polfreman, broughton, and wilson, “metadata generation for resource discovery.”

information technology and libraries | march 2007

based on data collected as part of the 2006 public libraries and the internet study, the authors assess the degree to which public libraries provide sufficient and quality bandwidth to support the library’s networked services and resources. the topic is complex due to the arbitrary assignment of a number of kilobits per second (kbps) used to define bandwidth. such arbitrary definitions to describe bandwidth sufficiency and quality are not useful. public libraries are indeed connected to the internet and do provide public-access services and resources. it is, however, time to move beyond connectivity type and speed questions and consider issues of bandwidth sufficiency, quality, and the range of networked services that should be available to the public from public libraries. a secondary but important issue is the extent to which libraries, particularly in rural areas, have access to broadband telecommunications services.

the biennial public libraries and the internet studies, conducted since 1994, describe public library involvement with and use of the internet.1 over the years, the studies showed the growth of public-access computing (pac) and internet access provided by public libraries to the communities they serve.
internet connectivity rose from 20.9 percent to essentially 100 percent in less than ten years; the average number of public-access computers per library increased from two to nearly eleven; and bandwidth rose to the point where 63 percent of public libraries have connection speeds of greater than 769kbps (kilobits per second) in 2006. this dramatic growth, replete with related information technology challenges, occurred in an environment of challenges (among them budgetary and staffing) that public libraries face in maintaining traditional services as well as networked services.

one challenge is the question of bandwidth sufficiency and quality. the question is complex because typically an arbitrary number describes the number of kbps used to define “broadband.” as will be seen in this paper, such arbitrary definitions to describe bandwidth sufficiency are generally not useful. the federal communications commission (fcc), for example, uses the term “high speed” for connections of 200kbps in at least one direction.2 there are three problematic issues with this definition:

1. it specifies unidirectional bandwidth, meaning that a 200kbps download paired with a much slower upload (e.g., 56kbps) would fit this definition;
2. regardless of direction, bandwidth of 200kbps is neither high speed nor does it allow for a range of internet-based applications and services. this inadequacy will increase significantly as internet-based applications continue to demand more bandwidth to operate properly.
3. the definition is in the context of broadband to the single user or household, and does not take into consideration the demands of a high-use multiple-workstation public-access context.
in addition to connectivity speed, there are many questions related to public library pac and internet access that can affect bandwidth sufficiency, from budget and sustainability, to staffing and support, to the services public libraries offer through their technology infrastructure, to the impacts of connectivity and pac on the communities that libraries serve. one key question, however, is what is quality pac and internet bandwidth for public libraries? and, in attempting to answer that question, what are measures and benchmarks of quality internet access? this paper provides data from the 2006 public libraries and the internet study to foster discussion and debate around determining quality pac and internet access.3 bandwidth and connectivity data at the library outlet or branch level are presented in this article. the bandwidth measures are not systemwide but rather at the point of service delivery in the branch.

john carlo bertot and charles r. mcclure
assessing sufficiency and quality of bandwidth for public libraries

john carlo bertot (jbertot@fsu.edu) is the associate director of the information use management and policy institute and professor at the college of information, florida state university; and charles r. mcclure (cmcclure@ci.fsu.edu) is the director of the information use management and policy institute (www.ii.fsu.edu) and francis eppes professor of information studies at the college of information, florida state university.

■ the bandwidth issue

there are a number of factors that affect the sufficiency and quality of bandwidth in a pac and internet service context. examples of factors that influence actual speed include:

■ number of workstations (public-access and staff) that simultaneously access the internet;
■ provision of wireless access that shares the same connection;
■ ultimate connectivity path, that is, a direct connection to the internet that is truly direct, or one that goes through regional or other local hops (that may have aggregated traffic from other libraries or organizations) out to the internet;
■ type of connection and bandwidth that the telecommunications company is able to supply the library;
■ operations (surfing, e-mail, downloading large files, streaming content) being performed by users of the internet connection;
■ switching technologies;
■ latency effects that affect packet loss, jitter, and other forms of noise throughout a network;
■ local settings and parameters, known or unknown, that impede transmission or bog down the delivery of internet-based content;
■ range of networked services (databases, videoconferencing, interactive/real-time services) to which the library is linked;
■ if networked, the speed of the network on which the public-access workstations reside; and
■ general application resource needs, protocol priority, and other general factors.

thus, it is difficult to precisely answer “how much bandwidth is enough” within an evolving and dynamic context of public access, use, and infrastructure. putting public-access internet use into a more typical application-and-use scenario, however, may provide some indication of adequate bandwidth. for example:

■ a typical three-minute digital song is 3mb;
■ a typical digital photo is about 2mb; and
■ a typical powerpoint presentation is about 10mb.

if one person in a public library were to e-mail a powerpoint presentation at the same time that another person downloaded multiple songs, and another was exchanging multiple pictures, even a library with a t1 line (1.5mbps, or megabits per second) would experience a temporary network slowdown during these operations.
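the scale of that slowdown can be estimated with simple arithmetic. the sketch below is illustrative only: the file sizes are the article's "typical" figures, while the number of songs and photos in the workload is an assumption made for the example, and the calculation ignores protocol overhead and any other traffic on the line.

```python
# Illustrative sketch: file sizes come from the article's "typical" figures;
# the workload mix (5 songs, 10 photos) is an assumption for this example.

T1_MBPS = 1.5  # nominal T1 capacity in megabits per second


def transfer_seconds(total_megabytes: float, link_mbps: float) -> float:
    """Best-case transfer time: convert megabytes to megabits, divide by link rate."""
    return total_megabytes * 8 / link_mbps


workload_mb = {
    "powerpoint e-mail (1 x 10MB)": 1 * 10,
    "song downloads (5 x 3MB)": 5 * 3,
    "photo exchange (10 x 2MB)": 10 * 2,
}

total_mb = sum(workload_mb.values())
seconds = transfer_seconds(total_mb, T1_MBPS)
print(f"{total_mb} MB over a {T1_MBPS} Mbps line: about {seconds:.0f} s")
```

under these assumptions, three patrons alone would occupy the full line for about four minutes, before counting staff traffic or any other workstation.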
this does not take into account many other new high­ bandwidth­consuming applications such as cnn stream­ ing­video channel; uploading and accessing content to a wiki, blog, or youtube.com; or streaming content such as cbs’s webcasting the 2006 ncaa basketball tournament. an increasingly used technology in various settings is two­way internet­based video conferencing. with an installed t1 line, a library could support two 512kbps or three 384kbps videoconferences, depending on the amount of simultaneous traffic on the network—which, in a public access context, would be heavy. indeed, the 2006 public libraries and the internet study indicated a near continuous use of public­access workstations by patrons (only 14.6 percent of public libraries indicated that they always had a sufficient number of workstations available for patron use). public libraries increasingly serve as access points to e­government services and resources, e.g., social services, disaster relief, health care.4 these services can require the simple completion of a web­based form (low­bandwidth consumption) to more interactive services (high­band­ width consumption). and, as access points to continuing education and online degree programs, public libraries need to offer adequate broadband to enable users to access services and resources that increasingly can depend on streaming technologies that consume greater bandwidth. ■ bandwidth and pac in public libraries today as table 1 demonstrates, public libraries continue to increase their bandwidth, with 63.3 percent of public libraries reporting connection speeds of 769kbps or greater. this compares to 47.7 percent of public libraries reporting connection speeds of greater than 769kbps in 2004. there are disparities between rural and urban pub­ lic libraries, with rural libraries reporting substantially fewer instances of connection speeds of greater than 1.5mbps in 2006. 
on the one hand, the increase in connectivity speeds between 2004 and 2006 is a positive step. on the other, 16.1 percent of public libraries report that their connection speeds are insufficient to meet patron demands all of the time, and 29.4 percent indicate that their connection speeds are insufficient to meet patron demands some of the time. thus, nearly half of public libraries indicate that their connection speeds are insufficient to meet patron demands some or all of the time.

in terms of public access computers, the average number of workstations that public libraries provide is 10.7 (table 2). urban libraries have an average of 17.1 workstations, as compared to rural libraries, which report an average of 7.1 workstations.

a closer look at bandwidth and pac

for the next sections, the data offer two key views for analysis purposes: (1) workstations, divided into libraries with ten or fewer public-access workstations and libraries with more than ten public-access workstations (given that the average number of public-access workstations in libraries is roughly ten); and (2) bandwidth, divided into libraries with 769kbps or less and libraries with greater than 769kbps (an arbitrary indicator of broadband for a public library context).

in looking across bandwidth and public-access workstations (table 3), overall 31.8 percent of public libraries have connection speeds of less than 769kbps while 63.3 percent have connection speeds of greater than 769kbps. a majority of public libraries (68.5 percent) have ten or fewer workstations, while 30.9 percent have more than ten workstations. in general, rural libraries have fewer workstations and lower bandwidth as compared to suburban and urban libraries. indeed, 75.2 percent of urban libraries with ten or fewer workstations have connection speeds of greater than 769kbps, as compared to 45.2 percent of rural libraries.

when examining pac capacity, it is clear that public libraries have capacity issues at least some of the time in a typical day (tables 4 through 6). only 14.6 percent of public libraries report that they have sufficient numbers of workstations to meet patron demands at all times (table 6), while nearly as many, 13.7 percent, report that they consistently are unable to meet patron demands for public-access workstations (table 4). a full 71.7 percent indicate that they are unable to meet patron demands during certain times in a typical day (see table 5). in other words, 85.4 percent of public libraries report that they are unable to meet patron demand for public-access workstations some or all of the time during a typical day, regardless of number of workstations available and type of library.

the disparities between rural and urban libraries are notable. in general, urban libraries report more difficulty in meeting patron demands for public-access workstations. of urban public libraries, 27.8 percent report that they consistently have difficulty in meeting patron demand for workstations, as compared to 11.0 percent of suburban and 10.6 percent of rural public libraries (table 4). by contrast, 6.6 percent of urban libraries report sufficient workstations to meet patron demand all the time as compared to 18.9 percent of rural libraries (table 6).

information technology and libraries | march 2007

table 1. public library outlet maximum speed of public-access internet services by metropolitan status and poverty
maximum speed | urban | suburban | rural | low poverty | medium poverty | high poverty | overall
less than 56kbps | 0.7% ±0.8% (n=18) | 0.4% ±0.6% (n=17) | 3.7% ±1.9% (n=275) | 2.0% ±1.4% (n=245) | 2.7% ±1.6% (n=61) | 2.6% ±1.6% (n=5) | 2.1% ±1.4% (n=311)
56kbps–128kbps | 2.5% ±1.6% (n=67) | 5.4% ±2.3% (n=264) | 15.2% ±3.6% (n=1,132) | 9.9% ±3.0% (n=1,237) | 9.5% ±2.9% (n=216) | 5.3% ±2.2% (n=10) | 9.8% ±3.0% (n=1,463)
129kbps–256kbps | 2.7% ±1.6% (n=72) | 6.8% ±2.5% (n=332) | 11.1% ±3.1% (n=829) | 8.5% ±2.8% (n=1,067) | 7.3% ±2.6% (n=166) | (none reported) | 8.2% ±2.8% (n=1,233)
257kbps–768kbps | 9.1% ±2.9% (n=241) | 10.4% ±3.1% (n=504) | 13.4% ±3.4% (n=1,002) | 12.5% ±3.3% (n=1,557) | 8.4% ±2.8% (n=190) | (none reported) | 11.7% ±3.2% (n=1,747)
769kbps–1.5mbps | 33.6% ±4.7% (n=889) | 40.0% ±4.9% (n=1,945) | 31.0% ±4.6% (n=2,310) | 34.3% ±4.8% (n=4,286) | 34.6% ±4.8% (n=788) | 38.1% ±4.9% (n=70) | 34.4% ±4.8% (n=5,144)
greater than 1.5mbps | 49.4% ±5.0% (n=1,304) | 31.6% ±4.7% (n=1,533) | 19.9% ±4.0% (n=1,488) | 27.4% ±4.5% (n=3,423) | 35.5% ±4.8% (n=808) | 50.5% ±5.0% (n=93) | 28.9% ±4.5% (n=4,324)
don’t know | 1.9% ±1.4% (n=50) | 5.4% ±2.3% (n=263) | 5.7% ±2.3% (n=427) | 5.5% ±2.3% (n=685) | 2.1% ±1.4% (n=48) | 3.5% ±1.8% (n=6) | 4.9% ±2.2% (n=739)
weighted missing values, n=1,497

table 2. average number of public library outlet graphical public-access internet terminals by metropolitan status and poverty*
metropolitan status | low poverty | medium poverty | high poverty | overall
urban | 14.7 | 20.9 | 30.7 | 17.9
suburban | 12.8 | 9.7 | 5.0 | 12.6
rural | 7.1 | 6.7 | 8.1 | 7.1
overall | 10.0 | 13.3 | 26.0 | 10.7
* note that most library branches defined as “high poverty” are in general part of library systems with multiple branches and not single building systems. by and large, library systems connect and provide pac and internet services systemwide.

when reviewing the adequacy of speed of connectivity data by the number of workstations, bandwidth, and metropolitan status, a more robust and descriptive picture emerges.
while overall, 53.5 percent of public libraries indicate that their connection speeds are adequate to meet demand, some parsing of this figure reveals more variation (tables 7 through 10):

■ libraries with connection speeds of 769kbps or less are more likely to report that their connection speeds are insufficient to meet patron demand at all times, with 24.0 percent of rural libraries, 25.8 percent of suburban libraries, and 25.4 percent of urban libraries so reporting (table 7).
■ libraries with connection speeds of 769kbps or less are more likely to report that their connection speeds are insufficient to meet patron demand at some times, with 35.0 percent of rural libraries, 38.1 percent of suburban libraries, and 53.4 percent of urban libraries so reporting (table 8).
■ libraries with connection speeds of greater than 769kbps also report bandwidth-sufficiency issues: 12.0 percent of rural libraries, 10.5 percent of suburban libraries, and 14.0 percent of urban libraries indicate that their connection speeds are insufficient all of the time (table 7), and 20.3 percent of rural libraries, 29.5 percent of suburban libraries, and 30.0 percent of urban libraries indicate that their connection speeds are insufficient some of the time (table 8).
■ libraries that have ten or fewer workstations tend to rate their bandwidth as more sufficient at either 769kbps or less or greater than 769kbps (tables 7, 8, and 10).

thus, in looking at the data, it is clear that libraries with fewer workstations indicate that their connection speeds are more sufficient to meet patron demand.

table 3. public library public-access workstations and speed of connectivity by metropolitan status
workstations | rural, under 769kbps | rural, over 769kbps | suburban, under 769kbps | suburban, over 769kbps | urban, under 769kbps | urban, over 769kbps
10 or fewer workstations | 48.4% n=2,929 | 45.2% n=2,737 | 30.1% n=891 | 63.2% n=1,872 | 21.6% n=269 | 75.2% n=937
more than 10 workstations | 22.0% n=307 | 75.5% n=1,053 | 12.0% n=225 | 85.1% n=1,595 | 9.6% n=130 | 89.8% n=1,221
total | 43.4% n=3,242 | 50.9% n=3,802 | 23.0% n=1,116 | 71.6% n=3,474 | 15.1% n=399 | 83.0% n=2,194
missing: 7.6% (n=1,239)

table 4. fewer public library public-access workstations than patrons wishing to use them by metropolitan status
workstations | rural | suburban | urban | total
10 or fewer workstations | 10.5% n=681 | 10.8% n=339 | 23.6% n=300 | 12.1% n=1,321
more than 10 workstations | 10.8% n=158 | 11.4% n=220 | 31.2% n=430 | 16.9% n=808
total | 10.6% n=845 | 11.0% n=562 | 27.8% n=748 | 13.7% n=2,157
missing: 2.9% (n=473)

table 5. fewer public library public-access workstations than patrons wishing to use them at certain times during a typical day by metropolitan status
workstations | rural | suburban | urban | total
10 or fewer workstations | 68.8% n=4,444 | 74.5% n=2,347 | 69.1% n=880 | 70.5% n=7,670
more than 10 workstations | 78.1% n=1,139 | 80.2% n=1,548 | 62.8% n=866 | 74.5% n=3,553
total | 70.5% n=5,605 | 76.7% n=3,905 | 65.6% n=1,764 | 71.7% n=11,273
missing: 2.9% (n=473)

table 6. sufficient public library public-access workstations available for patrons wishing to use them by metropolitan status
workstations | rural | suburban | urban | total
10 or fewer workstations | 20.6% n=1,331 | 14.7% n=464 | 7.4% n=94 | 17.4% n=1,889
more than 10 workstations | 11.0% n=161 | 8.4% n=163 | 6.0% n=83 | 8.5% n=406
total | 18.9% n=1,501 | 12.3% n=627 | 6.6% n=177 | 14.6% n=2,304
missing: 2.9% (n=473)

table 7. public library connection speed insufficient to meet patron needs by metropolitan status
workstations | rural, under 769kbps | rural, over 769kbps | suburban, under 769kbps | suburban, over 769kbps | urban, under 769kbps | urban, over 769kbps
10 or fewer workstations | 25.4% n=668 | 12.1% n=297 | 27.4% n=233 | 9.8% n=173 | 15.4% n=34 | 10.2% n=90
more than 10 workstations | 11.6% n=34 | 11.4% n=108 | 19.2% n=41 | 11.3% n=168 | 25.4% n=32 | 17.1% n=199
total | 24.0% n=705 | 12.0% n=408 | 25.8% n=274 | 10.5% n=341 | 18.7% n=72 | 14.0% n=293

table 8. public library connection speed insufficient to meet patron needs at some times by metropolitan status
workstations | rural, under 769kbps | rural, over 769kbps | suburban, under 769kbps | suburban, over 769kbps | urban, under 769kbps | urban, over 769kbps
10 or fewer workstations | 34.1% n=898 | 19.3% n=474 | 37.1% n=315 | 29.0% n=511 | 50.0% n=130 | 27.0% n=238
more than 10 workstations | 43.2% n=127 | 22.5% n=214 | 42.3% n=90 | 30.3% n=450 | 60.3% n=76 | 32.0% n=374
total | 35.0% n=1,025 | 20.3% n=694 | 38.1% n=405 | 29.5% n=961 | 53.4% n=206 | 30.0% n=626

table 9. public library connection speed is sufficient to meet patron needs by metropolitan status
workstations | rural, under 769kbps | rural, over 769kbps | suburban, under 769kbps | suburban, over 769kbps | urban, under 769kbps | urban, over 769kbps
10 or fewer workstations | 38.9% n=1,025 | 68.3% n=1,675 | 35.0% n=297 | 60.2% n=1,062 | 34.6% n=90 | 62.9% n=556
more than 10 workstations | 45.2% n=133 | 66.1% n=628 | 38.5% n=82 | 54.9% n=817 | 14.3% n=18 | 50.9% n=594
total | 39.5% n=1,158 | 67.5% n=2,306 | 35.7% n=379 | 57.9% n=1,886 | 28.0% n=108 | 56.0% n=1,168

table 10. public library connection speed insufficient to meet patron needs some or all of the time by metropolitan status
workstations | rural, under 769kbps | rural, over 769kbps | suburban, under 769kbps | suburban, over 769kbps | urban, under 769kbps | urban, over 769kbps
10 or fewer workstations | 59.5% n=1,566 | 31.4% n=771 | 64.6% n=549 | 38.8% n=684 | 65.4% n=170 | 37.1% n=328
more than 10 workstations | 54.8% n=161 | 33.9% n=322 | 61.5% n=131 | 41.6% n=618 | 85.7% n=108 | 49.1% n=573
total | 24.0% n=1,025 | 32.3% n=1,102 | 64.0% n=680 | 40.0% n=1,302 | 72.0% n=278 | 44.0% n=919

■ discussion and selected issues

the data presented point to a number of issues related to the current state of public library pac and internet-access adequacy in terms of available public access computers and bandwidth. the data also provide a foundation upon which to discuss the nature of quality and sufficient pac and internet access in a public library environment. while public libraries indicate increased ability to meet patron bandwidth demand when providing fewer publicly available workstations, they also indicate that they have difficulty in meeting patron demand for public access computers.

growth of wireless connections

in 2004, 17.9 percent of public library outlets offered wireless access, and a further 21.0 percent planned to make it available. outlets in urban and high-poverty areas were most likely to have wireless access. the majority of libraries (61.2 percent), however, neither had wireless access nor had plans to implement it in 2004. as table 11 demonstrates, the number of public library outlets offering wireless access has roughly doubled from 17.9 percent to 36.7 percent in two years. furthermore, 23.1 percent of outlets that do not currently have it plan to add wireless access in the next year.
thus, if libraries follow through with their plans to add wireless access, 61.0 percent of public library outlets in the united states will have it by 2007. the implications of the rapid growth of the public library’s provision of wireless connectivity (as shown in table 11) on bandwidth requirements are significant. either libraries added wireless capabilities through their current overall bandwidth, or they obtained additional bandwidth to support the increased demand created by the service. if the former, then wireless access created an even greater burden on an already problematic band­ width capacity and may have actually reduced the overall quality of connectivity in the library. if the latter, libraries then had to shoulder the burden of increased expendi­ tures for bandwidth. either scenario required additional technology infrastructure, support, and expenditures. sufficient and quality connections the notion of sufficient and quality public library con­ nection to the internet is a moving target and depends on a range of factors and local conditions. for purposes of discussion in this paper, the authors used 769kbps to differentiate “slower” from “faster” connectivity. if, how­ ever, 1.5mbps or greater had been used to define faster connectivity speeds, then only 28.9 percent of public libraries would meet the criterion of “faster” connectiv­ ity (see table 1). and in fact, simply because 28.9 percent of public libraries report connection speeds of 1.5mbps or faster does not also mean that they have sufficient or quality bandwidth to meet the computing needs of their users, their staff, their vendors, and their service provid­ ers. some public libraries may need 10mbps to meet the pac needs of their users as well as the internal staff and management computing needs. the library community needs to become more edu­ cated and knowledgeable about what constitutes sufficient and quality connectivity in their library for the communi­ ties that they serve. 
a first step is to understand clearly the nature and type of the connectivity of the library. the next step is to conduct an internal audit that minimally: ■ identifies the range of networked services the library provides both to users as well as for the operation of the library; ■ identifies the typical bandwidth consumption of these services; ■ determines the demands of users on the bandwidth in terms of services they use; ■ determines peak bandwidth­usage times; ■ identifies the impact of high­consumption networked services used at these peak­usage times; ■ anticipates bandwidth demands of newer services and resources that users will want to access through the library’s infrastructure—myspace.com, youtube. com—regardless of whether or not the library is the direct provider of such services; and ■ determines what broadband services are available to the library, the costs of these services, and the “fit” of these services to the needs of the library. based on this and related information from such an audit, library administration can better determine the degree to which the bandwidth is sufficient in speed and quality. ■ planning for sufficient and quality bandwidth knowing the current condition of existing bandwidth in the library is not the same as successful technology plan­ ning and management to ensure that the library has, in fact, bandwidth that is sufficient in speed and quality. once an audit such as has been suggested is completed, careful planning for bandwidth deployment in the library is essential. it appears, however, that currently much of the management and planning for networked services is based first on what bandwidth is available as opposed to the bandwidth that is needed to provide the necessary services and resources in a networked environment. this stance puts public libraries in a reactive condition rather than a proactive condition regarding provision of net­ worked services. 
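the audit steps above can be sketched as a small calculation that totals peak-hour demand and compares it with the capacity of the library's connection. this is a minimal sketch, not a method from the study: the `Service` structure, the service names, and every figure in the inventory are hypothetical, except the 384kbps videoconference rate, which is the figure mentioned earlier in this article.

```python
from dataclasses import dataclass


@dataclass
class Service:
    name: str
    kbps_per_session: float  # typical consumption of one session
    peak_sessions: int       # concurrent sessions during the busiest period


def peak_demand_kbps(services):
    """Sum per-session bandwidth times concurrent sessions at peak."""
    return sum(s.kbps_per_session * s.peak_sessions for s in services)


def audit(services, link_kbps):
    """Compare estimated peak demand with the nominal link capacity."""
    demand = peak_demand_kbps(services)
    return {
        "peak_demand_kbps": demand,
        "link_kbps": link_kbps,
        "utilization": demand / link_kbps,
        "sufficient": demand <= link_kbps,
    }


# Hypothetical inventory: all figures are invented for illustration,
# apart from the 384kbps videoconference rate cited in the article.
inventory = [
    Service("web browsing and e-mail", 50, 8),
    Service("streaming video", 300, 3),
    Service("videoconference", 384, 1),
    Service("staff ils access", 64, 4),
]

report = audit(inventory, link_kbps=1500)  # nominal t1 line
print(report)
```

a real audit would replace the invented figures with measured consumption and observed peak-hour session counts, and would repeat the calculation for each candidate broadband service available to the library.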
most public library planning approaches stress the importance of conducting some type of needs assessment as a precursor to any type of planning.5 further, technology plans should include such things as goals, objectives, services provision, and evaluation as they relate to bandwidth and the appropriate bandwidth needed. recent library technology planning guides, however, give little attention to the management, planning, and evaluation of bandwidth as it relates to provision of networked services.

it must be noted that some public libraries may be prevented from accessing higher bandwidth due to high cost, lack of availability of bandwidth alternatives, or other local factors that determine access to advanced telecommunications in their areas. in such circumstances, the audit may serve to inform the public service/utilities commissions, fcc, and others of the need for deployment of advanced telecommunications services in these areas.

■ bandwidth planning in a community context

the audit and planning processes that have been described are critical activities for libraries. it is essential, however, for these processes to occur in the larger community context. investments in technology infrastructure are increasingly a community-wide resource that serves multiple functions (emergency services, community access, and local government agencies, to name a few). it is in this larger context that library pac and internet access occurs.

moreover, there is a convergence of technology and service needs. for example, public libraries increasingly serve as agents of e-government and disaster-relief providers.6 first responders rely on the library’s infrastructure when theirs is destroyed, as hurricane katrina and other storms demonstrated.
local, state, and federal government agencies rely on broadband and pac and internet access (wired or wireless) to deliver e-government services.

thus, at their core, libraries, emergency services, government agencies, and others have similar needs. pooling resources, planning jointly, and looking across needs may yield economies of scale, better service, and a more robust community technology infrastructure. emergency providers need access to reliable broadband and communications technologies in general, and in emergency situations in particular. libraries need access to high-quality broadband and pac technologies. both need access to wireless technologies.

as broadcast networks relinquish ownership of the 700 mhz frequency used for analog television in february 2009, and this frequency is distributed to municipalities for emergency services, now is an excellent time for libraries to engage in community technology planning for e-government, disaster planning and relief efforts, and pac and internet services. by working with the larger community to build a technology infrastructure, the library and the entire community benefit.

■ availability of high-speed connectivity

one key consideration not known at this time is the extent to which public libraries, particularly those in rural areas, even have access to high-speed connections. many rural communities are served not by the large telecommunications carriers, but rather by small, privately owned-and-run local exchange carriers. iowa and wisconsin, for example, are each served by more than eighty exchange carriers. as such, public libraries are limited in capacity and services to what these exchange carriers offer and make available. thus, in some areas, dsl service may be the only form of high-speed connectivity available to libraries. and, as suggested earlier, dsl may or may not be considered high speed given the needs of the library and the demands of its users.

communities that lack high-quality broadband services by telecommunications carriers may want to consider building a municipal wireless network that meets the community’s broadband needs for emergency, disaster, and public-access settings. as a community engages in community-wide technology planning, it may become evident that local telecommunications carriers do not meet the broadband needs of the community. such communities may need to build their own networks, based on identified technology-plan needs.

table 11. public-access wireless internet connectivity availability in public library outlets by metropolitan status and poverty
provision of public-access wireless internet services | urban | suburban | rural | low poverty | medium poverty | high poverty | overall
currently available | 42.9% ±4.9% (n=1,211) | 42.5% ±4.9% (n=2,240) | 30.7% ±4.6% (n=2,492) | 38.0% ±4.8% (n=5,165) | 28.1% ±4.5% (n=679) | 53.8% ±5.0% (n=99) | 36.7% ±4.8% (n=5,943)
not currently available and no plans to make it available within the next year | 23.1% ±4.2% (n=651) | 29.7% ±4.6% (n=1,562) | 49.2% ±5.0% (n=3,988) | 37.4% ±4.8% (n=5,091) | 44.4% ±4.9% (n=1,072) | 21.0% ±4.1% (n=39) | 38.3% ±4.9% (n=6,201)
not currently available, but there are plans to make it available within the next year | 30.6% ±4.6% (n=864) | 26.0% ±4.4% (n=1,369) | 18.6% ±3.9% (n=1,509) | 22.5% ±4.2% (n=3,063) | 26.2% ±4.4% (n=633) | 25.3% ±4.4% (n=46) | 23.1% ±4.2% (n=3,742)
■ knowledge of networked services connectivity needs patrons may not attempt to use high­bandwidth services at the public library because they know from previous visits that the library cannot provide acceptable connec­ tivity speeds to access that service—thus, they quit trying to access that service, limiting the usefulness of the pub­ lic library. in addition, librarians may have inadequate knowledge or information to determine when bandwidth is or is not sufficient to meet the demands of their users. indeed, the survey and site visits revealed that some librarians did not know the connection speeds that linked their library to the internet. consequently, libraries are in a dilemma: increase both the number of workstations and the bandwidth to meet demand; or provide less service in order to operate within the constraints of current connectivity infrastruc­ ture. and yet, roughly 45 percent of public libraries indi­ cate that they have no plans to add workstations within the next two years; the average number of workstations has been around ten for the last three surveys (2002, 2004, and 2006); and 80 percent of public libraries indicate that space limitations affect their ability to add workstations.7 hence, for many libraries, adding workstations is not an option. ■ missing the mark? the networked environment is such that there are multi­ ple uses of bandwidth within the same library—for exam­ ple, public internet access, staff access, wireless access, integrated library system access. we are now in the web 2.0 environment, which is an interactive web that allows for content uploading by users (e.g., blogs, mytube.com, myspace.com, gaming). streaming content, not text, is increasingly the norm. there are portable devices that allow for text, video, and voice messaging. increasingly, users desire and prefer wireless services. this is a new environment in which libraries provide public access to networked services and resources. 
it is an enabling environment that puts users fully in the content seat—from creation to design to organization to access to consumption. and users have choices, of which the public library is only one, regarding the information they choose to access. it is an environment of competition, advanced applications, bandwidth intensity, and high­quality com­ puters necessary to access the graphically intense content. the impacts of this new and substantially more com­ plex environment on libraries are potentially significant. as user expectations rise, combined with the provision of high­quality services by other providers, libraries are in a competitive and service­ and resource­rich informa­ tion environment. providing “bare minimum” pac and internet access can have two detrimental effects in that they: (1) relegate libraries to places of last resort, and (2) further digitally divide those who only have public­access computers and internet access through their public librar­ ies. it is critical, therefore, for libraries to chart a high­end course regarding pac and internet access, and not access that is merely perceived to be acceptable by the librarians. ■ additional research the context in which issues regarding quality pac and sufficient connectivity speeds to internet access reside is complex and rapidly changing. research questions to explore include: ■ is it possible to define quality pac and internet access in a public library context? ■ if so, what are the attributes included in the defini­ tion? ■ can these attributes be operationalized and mea­ sured? ■ assuming measurable results, what strategies can the library, policy, research, and other interested communities employ to impact public library move­ ment toward quality pac and internet access? ■ should there be standards for sufficient connectivity and quality pac in public libraries? ■ how can public librarians be better informed regard­ ing the planning and deployment of sufficient and quality bandwidth? 
■ what is the role of federal and state governments in supporting adequate bandwidth deployment for public libraries?8
■ to what extent is broadband deployment and availability truly universal as per the universal service provisions (section 254) of the telecommunications act of 1996 (p.l. 104-104)?

these questions are a beginning point to a larger set of activities that need to occur in the research, practitioner, and policy-making communities.

■ obtaining sufficient and quality public-library bandwidth

arbitrary connectivity speed targets, e.g., 200kbps or 769kbps, do not in and of themselves ensure quality pac and sufficient connectivity speeds. public libraries are indeed connected to the internet and do provide public-access services and resources. it is time to move beyond connectivity-type and -speed questions and consider issues of bandwidth sufficiency, quality, and the range of networked services that should be available to the public from public libraries.

given the widespread connectivity now provided from most public libraries, there continue to be increased demands for more and better networked services. these demands come from governments that expect public libraries to support a range of e-government services, from residents who want to use free wireless connectivity from the public library, and from patrons who need to download music or view streaming videos (to name but a few). simply providing more or better connectivity will not, in and of itself, address all of these diverse service needs.

increasingly, pac support will require additional public librarian knowledge, resources, and services. sufficient and quality bandwidth is a key component of those services. the degree to which public libraries can provide such enhanced networked services (requiring exceptionally high bandwidth that is both sufficient and of high quality) is unclear.
mounting a significant effort now to better understand existing bandwidth use and plan for future needs and requirements in individual public libraries is essential. in today’s networked envi­ ronment, libraries must stay competitive in the provision of networked services. such will require sufficient and high­quality connectivity and bandwidth. ■ acknowledgements the authors gratefully acknowledge the support of the bill & melinda gates foundation and the american library association for support of the 2006 public libraries and the internet study. data from that study have been incorpo­ rated into this paper. references 1. information institute, public libraries and the internet (tal­ lahassee, fla.: information use management and policy insti­ tute, 2006). all studies conducted since 1994 are available at: http://www.ii.fsu.edu/plinternet (accessed march 1, 2007). 2. u.s. federal communications commission, high speed services for internet access: status as of december 31, 2005 (wash­ ington, d.c.: fcc, 2006), available at http://www.fcc.gov/ bureaus/common_carrier/reports/fcc­state_link/iad/ hspd0604.pdf (accessed mar. 1, 2007). 3. j. c. bertot et al., public libraries and the internet 2006 (tal­ lahassee, fla.: information use management and policy insti­ tute, forthcoming), available at http://www.ii.fsu.edu/plinternet (accessed mar. 1, 2007). 4. j. c. bertot et al., “drafted: i want you to deliver e­ government,” library journal 131, no. 13 (aug. 2006): 34–37. 5. c. r. mcclure et al., planning and role setting for public libraries: a manual of options and procedures (chicago: ala, 1987); e. himmel and w. j. wilson, planning for results: a public library transformation process (chicago, ala, 1997). 6. j. c. bertot et al., “drafted: i want you to deliver e­gov­ ernment.”; p. t. jaeger et al., “the policy implications of internet connectivity in public libraries,” government information quarterly 23, no. 1 (2006): 123–41. 7. j. c. 
bertot et al., public libraries and the internet 2006.
8. jaeger et al., “the policy implications of internet connectivity in public libraries.”

the marc sort program

john c. rather: specialist in technical processes research, and jerry g. pennington: information systems mathematician, library of congress, washington, d.c.

describes the characteristics, performance, and potential of sked (sort-key edit), a generalized computer program for creating sort keys for marc ii records at the user's option. sked and a modification of the ibm s/360 dos tape sort/merge program form the basis for a comprehensive program for arranging catalog entries by computer.

the role of sorting in the marc system

many present and potential uses of cataloging data in machine readable form require that the input sequence of the records be altered before output. the production of book catalogs, bibliographical lists, and similar output products benefits from an efficient means for arranging the records in a more sophisticated way than mere alphabetical order or, even worse, the collating sequence of a particular computer. internal files, such as special tape indexes, also may require sequencing by sort keys that differ from the actual character strings in the records.

the demonstration of the feasibility of filing catalog entries by computer hinges on successfully performing two tasks: 1) analyzing the requirements of particular filing arrangements; and 2) programming the computer to perform the required operations. actually, the two tasks are interdependent, because the nature of the filing analysis is strongly influenced by the ability of the computer to perform certain types of operations.

the requirements for filing arrangement were considered at the genesis of the marc project (1) and they materially affected the characteristics of the marc ii format (2,3). structuring the format of a machine record is only part of the solution to the problem, however.
126 journal of library automation vol. 2/3 september, 1969

the first requirement for a program for library sorting is a set of generalized computer techniques for creating sort keys from marc records at the user's option. these techniques will provide the foundation for further refinement of the sorting capability by developing algorithms to resolve specific problems in file arrangement. this article describes the characteristics, performance, and potential of a generalized program developed by the information systems office and the technical processes research office of the library of congress.

the present approach to the computer sorting problem was based on the following assumptions:

1) the sort key must be generated on demand. for maximum flexibility and economy of storage, it should not be a permanent part of the machine readable record.
2) data to be sorted must be processed (edited) for sorting by the machine. input to a data field should be in the form required for cataloging purposes; it should not be contrived simply to satisfy the requirements of filing.
3) all data elements contributing to a sort key must be fully represented. to determine the position of an entry in a large file, the filing elements must be considered in turn until the discrimination point is reached. no element may be truncated to make room for another.
4) at least initially, a manufacturer's program should be used for sorting and merging the records with sort keys. given the library's present machine configuration, this means using the ibm s/360 dos tape sort/merge program.

these assumptions shaped the course that was followed. the requirement that the sort key be generated on demand meant that a program had to be written to build sort keys specifically for records in the marc ii format.
to allow maximum flexibility in specifying elements to be included in the sort key, the basic program was to be highly generalized, allowing any combination of fixed and variable field data to be included in the sort key. since several data elements may have to be considered to determine the proper location of an item in a complex file, it is desirable to construct a single sort key containing as many characters of each element considered in turn as the length of the key will allow. using a single sort key is more efficient than using separate keys for each element.

the marc ii format

the marc sort program was written to handle records in the processing format used by the library of congress. the differences between this format and the marc ii communications format (2,3,4) have been described by avram and droz (5). for the purposes of the present article, it is sufficient to give a brief outline of the structure of the format as it relates to computer sorting capability and to describe the salient features of the content designators that facilitate the creation of sort keys.

marc records are undefined; that is, they vary in length, and information is not provided at the beginning of each record for use by software input/output macros. since the manufacturer's program used for sorting marc records cannot handle undefined records, preparation for sorting must include changing them from one type to the other. at the end of the sort/merge phase, they must be returned to an undefined state.

the maximum physical length of a marc record is 2048 bytes. if a logical record (that is, the bibliographical data plus machine format data) requires more than 2048 bytes, it must be divided into two (or more) physical records. at present, the marc sort program cannot handle continuation records of this type.

the basic structure of the format includes a leader, a record directory, and variable fields.
each variable field is identified by a unique tag in the directory. if necessary, the data in a field can be defined more precisely by indicators and subfield codes. they appear at the beginning of the field separated by delimiters. when no indicator is needed, the field begins with a delimiter. tags, indicators, and subfield codes are used to specify what variable field data are to be included in the sort key, how the data are to be arranged, and what special treatment may be required. although the full potential of these content designators has yet to be realized, they provide a basis for programming to achieve content-related filing arrangements; for example, placement of a single surname before other headings with the same character string.

characteristics of the marc sort program

the marc sort program has three components: 1) a sort-key edit program (sked); 2) the sorting capability of the ibm s/360 dos tape sort/merge program (tsrt); and 3) a merge routine written expressly for the marc sort program.

the marc sort program is activated by a set of control cards supplied by the user. these control cards specify the parameters to be observed in processing each record. using this information, sked reads each record, builds as many sort keys as are required to satisfy the parameters, duplicates the master record each time a different key is constructed, and records information about the sort key and the master record for possible later use. the output of sked is an intermediate marc sort file containing records with sort keys.

the second phase of the program involves tsrt, which also is controlled by parameter cards. the input is the intermediate marc sort file. the tsrt program sorts the records according to their keys, using a standard collating sequence (that is, according to the order of the bit configurations of the characters in the keys).
the output can take either or both of two forms: 1) marc format, in which the sort key is stripped from each record and the format is returned to an undefined state; or 2) intermediate marc sort format, which is identical with the input to the tsrt program (sort key remains with the record).

the merge routine written especially for the marc sort program provides the capability to merge two or more files produced previously by tsrt in the intermediate marc format and to output files in either or both of the above formats. it is necessary to provide a separate program for the merge function, since the manufacturer-supplied sort/merge package does not have the capability to merge intermediate marc records while producing marc ii output. figure 1 shows a simplified flow chart of the program.

fig. 1. marc sort program data flow. [flow chart: sked parameter cards; sort-key edit (sked) program; sort; merge]

sort-key edit program (sked)

sked builds sort keys according to the parameters specified at run time. in this process, it uses a table to translate the data to equalize upper- and lower-case letters, eliminate unwanted characters (e.g., diacritics, punctuation), and to provide a more desirable collating sequence during the sort phase. if the parameters result in more than one sort key for a record, the record is duplicated each time a new key is built.

the sort key is attached to the front of the marc record when both are written in the intermediate marc sort file. this is a variable-length, blocked file with a maximum block length of 4612 bytes (minimum blocking factor of 2). figure 2 shows diagrammatically how a record looks after it has been processed by sked.

[figure: block length | record length | sort key length | sort key | leader | communications area | control field | fixed field | directory | variable fields; 2 or more records blocked]

fig. 2.
schematic diagram of an intermediate marc sort record.

records in the master file that do not satisfy the parameters for a particular processing cycle are written in an exception file which is in the same format as the original master file (that is, undefined). a utility program can be used to list the contents of the exception file.

tsrt requires the specification of the number of control fields and certain related information about each such field. as many as twelve control fields (each with a maximum length of 256 bytes) can be accommodated by the program. the current implementation of the marc sort program uses a 256-byte key starting in position 9 of each record. (the first 8 bytes are used for variable record information.) any change in the length of the sort key must be reflected in the sked source deck and on the control cards for tsrt. the specification of control fields shown on a tsrt control card must be changed as follows: if the length of the sort key is shortened, then the control field length specification must be reduced. if the sort key is lengthened, then the control field must be split into two or more control fields, as follows:

number of control fields = key length / 256

(if the quotient is a fraction, use the next higher integer.)

parameters

the control cards for a sked processing cycle allow the user to specify the following options:

1) type of field. both fixed and variable fields may be specified as parameters for a sort key. there is no restriction on the order in which they are given.
2) specification of fields. fields may be specified in several ways:
a) exact form: a specific tag for a variable field (e.g., 650) or the address of a fixed field (the only option for this type of field);
b) generalized form: nxx, nnx, xnn, nxn, where any digit may be substituted for n (e.g., 1xx); and
c) as a range: nnn-xxx (e.g., 600-651).
3) selection of data from a field.
the amount of data from a field to be processed can be determined in any of three ways:
a) specifying the variable field tag without specifying particular subfield codes associated with it. this results in all data in the field being processed.
b) specifying the number of characters to be processed. this must be done for fixed fields even if all data are desired. with either type of field, the data will be truncated if the number specified is smaller than the number of characters in the field.
c) specifying the particular subfield codes associated with a variable field tag. this results in the sort key containing only the data from the specified subfields. for example, if the data in a 100 field were smith, john, 1910- ed., failure to include subfield "e" (the designator for a relator like "ed.") in the specification of subfields would result in its being excluded from the sort key.
4) alternate selection. two or more parameters may be specified for the same position in the sort key with the provision that only the first to be found will be used. for example, if 240 (uniform title) and 245 (bibliographical title) are specified as alternate selections in that order and both occur in a record, preference is given to 240 and only it is used in the sort key.
5) multiple parametric levels. for efficient processing, mutually exclusive parameters can be listed in the instructions for the same processing cycle. the program affords a means of distinguishing between primary parameters that must always appear in the sort key and secondary parameters that cannot be combined with one another. the user also has the option of specifying that a sort key is to be generated using only the primary parameters. for example, if a book catalog were to contain main entries, added entries, and subjects, the tags for added entries and subjects would be specified as secondary parameters and the tags for the main entry, title, and imprint date as primary parameters.
the sort key built for each added entry and each subject entry would always include the main entry (if present), title, and imprint date. this option can be by-passed if, for example, only a subject catalog is desired.
6) sequence of subfield codes. the subfield codes for a variable field may be specified not only to control the data to be included in the sort key but also to determine the order in which it appears. the following example shows how this works:

                 tag   subfield codes   data
record           100   abcd             charles ‡ ii ‡ king of great britain, ‡ 1630-1685
sked parameter   100   acbd
sort key                                charles king of great britain ii 1630 1685

7) separator. the user must specify the character that will separate each data element in the sort key, but he has a choice of the character to be used. when the required characters from a given data element have been moved to the sort key, the selected separator is inserted to mark the end of the element. the separator is one of a set of specially contrived characters called superblanks that sort lower than any other character, including a blank. use of the superblank permits the combination of different data elements in the same sort key because it prevents unlike data elements from sorting against one another, as shown below:

[figure: two sort keys, "ball john • ..." followed by "ball john arthur • chess"]

without the superblank (shown here for convenience as a bullet) the second sort key would be placed before the first. later it is expected that use of different superblanks within data elements will enable the sort/merge program to group related headings together.
8) acceptance/rejection indicator. at the beginning of the processing cycle, a decision can be made as to whether a marc record should be processed if it does not include a particular parameter.
if the rejection indicator is set, the record will be written to the exception file if that parameter does not occur. for example, if the parameters are 1xx (any main entry), 245 (bibliographical title), and imprint date (taken from the fixed field), a record without a main entry tag could be accepted but a record without a title rejected. this allows for title main entries while excluding imperfect records that may be present.
9) duplication. the parameters for variable fields to be included in the sort key may be satisfied by more than one combination of tags in the directory for a single marc record. to provide for this occurrence, a duplication indicator may be set, thereby insuring that a sort key will be generated for each combination that satisfies the parameters. in addition, the entire record will be duplicated so that it can accompany each sort key.
10) number of parameters. the number of parameters designating data elements for the sort key is determined by the user at the beginning of the processing cycle. any number up to twenty may be specified. it is unlikely that any given sort key will contain more than four or five parameters, but the ability to specify a much larger number allows for processing sort keys of several different types during the same cycle.

translation table

before data characters are moved from a designated field in a marc record to the sort key, they must be edited to insure that the key will include only characters that are relevant for sorting. this involves translation of the characters into the sked character set:

1) to equate upper and lower case versions of the same alpha characters;
2) to translate the period, comma, and hyphen to the bit configuration of an ordinary blank;
3) to reduce other punctuation, diacritics, and special characters to a single bit configuration that cannot be moved to the sort key; and
4) to insure the proper machine collating sequence (blank, 0-9, a-z).
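the four translation rules can be sketched in a few lines of python. this is only an illustration under stated assumptions, not the sked assembler routine: the function name sort_key_text is hypothetical, unicode decomposition stands in for marc diacritic handling, and python's native character order (blank, then digits, then lower-case letters) stands in for the sked collating sequence. the loop that skips a blank following a blank reflects the safeguard against consecutive blanks that sked applies when moving characters to the key.

```python
import string
import unicodedata

BLANK = " "

# Translation table in the spirit of the four rules (an assumption-laden sketch).
TABLE = {}
for upper, lower in zip(string.ascii_uppercase, string.ascii_lowercase):
    TABLE[ord(upper)] = lower            # rule 1: equate upper and lower case
for ch in ".,-":
    TABLE[ord(ch)] = BLANK               # rule 2: period, comma, hyphen -> blank
for ch in string.punctuation:
    TABLE.setdefault(ord(ch), None)      # rule 3: other punctuation is never moved


def sort_key_text(text):
    """Edit one data element for inclusion in a sort key (hypothetical helper)."""
    # Rule 3 (diacritics): decompose, then drop the combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    unmarked = "".join(c for c in decomposed if not unicodedata.combining(c))
    moved = []
    for c in unmarked.translate(TABLE):
        # Do not move a blank if the last character moved was also a blank.
        if c == BLANK and moved and moved[-1] == BLANK:
            continue
        moved.append(c)
    # Rule 4: in Python, " " < "0".."9" < "a".."z", so no remapping is needed here.
    return "".join(moved)


print(sort_key_text("Smith, John, 1901-1965."))
print(sorted(["smith john allan", "smith john 1900", "smith john"]))
```

keeping the rules in a table rather than in branching code mirrors the point made in the text that the character set can be changed without programming complications.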
the sked character set also provides bit configurations for superblanks (see above). the translation routine is written so that the character set can be changed without programming complications.

sked also includes a feature that safeguards the sort key against the possibility of two consecutive blanks, as would be the case when a period and a blank occur in sequence, or when the data erroneously include two blanks when only one should occur. before a character with a bit configuration equal to a blank is moved to the sort key, the program determines whether the last character moved has the same configuration. if it does, the second character is not moved.

other options

sked has the optional capability of adding two variable data fields and their corresponding directory entries to each record. these entries follow the format for data in other marc ii variable fields.

1) 998 entry. when the sked capability to duplicate records is used, it may be desirable to label one record of the set as the "master record". this technique or a modification of it might be used to generate a reference from a partial record to the full (master) record in a book catalog. when this option is selected there will be one, and only one, 998 field generated with each record. information about the master record will be given by listing the tags used in the sort key to achieve a unique position of that record on file. for example, if a master record is sorted by main entry, then title (if different from main entry), and finally by the date of publication, the 998 field describing this master record should list the 1xx tag, followed by the 2xx tag, and finally by the address and length of the fixed field containing the publication date. the order of the tags in the 998 field is the same sequence used in the sort key of the master record.
2) 999 entry.
when a book catalog is produced, it is desirable to show on the first line of an entry the element (e.g., title, subject) that determined the position of the entry in the arrangement. sked supplies this information by creating a variable field (tagged 999) containing the initial element of the sort tag. if this option is chosen, an indicator can be set in the 999 field to show that the data in it should be printed as the first line of the bibliographic printout.

sort/merge

the tsrt program used by the marc sort program is the standard ibm-supplied ibm system/360 basic operating system tape sort/merge program. design specifications for this program satisfy the sorting and merging requirements of tape-oriented systems with at least 16k bytes of main storage. this program enables the user to sort files of random records, or merge multiple files of sequential records, into one sequential file. if any inherent sequencing exists within the input file, the program will take advantage of it.

the intermediate marc sort file produced by sked is acceptable to tsrt. as stated earlier, tsrt can accommodate up to twelve control fields for sorting. the marc sort program requires only one control field at present. it is important to note that the tsrt comparison routines end as soon as a character in one control field is different from the corresponding character in another control field.

tsrt operates in four phases: assignment (phase 0); internal-sort (phase 1); external-sort (phase 2); merge-only (phase 3). if sorting is to be done, the assignment, internal-sort, and external-sort phases are executed. if only merging is to be done, the assignment and merge-only phases are executed.

tsrt provides various exits from the main line to enable a user to insert his own routines. exit 22 in the external sort has been provided to delete records.
in the marc sort program phase this exit is used as an option for stripping the key and returning the records to standard marc ii format (undefined, 2040 bytes maximum). the user exit intercepts each sorted record prior to output and converts it to an undefined state. the option is provided by addition of a "mods" control card. a flow chart of tsrt appears as figure 3.

fig. 3. flow chart of the sort/merge phases of the marc sort program. [boxes: assignment phase (phase 0); internal sort phase (phase 1); external sort phase (phase 2), with user exit 22; merge only (phase 3); end of program]

four work tapes are used by this application of tsrt. a fifth drive is used for input and output. the ibm sort package is not capable of writing undefined length records. since the marc record is in an undefined format, the output routines of tsrt cannot be used. therefore, the following method is used to develop the marc output file: 1) a separate file receives the sort output instead of the standard sort out-file; 2) the separate file is written by special coding in exit 22 of tsrt; and 3) each record is written in such a way as to prevent the sort from also writing the record.

the merge routine written especially for the marc sort program will output either intermediate (with key) or marc ii (without key) format tapes or both.

the program in action

processing times

sked was written in assembler language, using physical input/output control system and dynamic buffering assignments to achieve speed. the amount of time required to process a particular record is affected by the applicability of run-time parameters on the record. for example, if the user specifies twelve data elements from twelve different variable fields for inclusion in the sort key, the processing time will be greater than that required for a run with only one specification.
likewise, a run requesting duplication of each record for every added entry will require more computer time than another run that does not duplicate records at all. since the total processing time required for a file of n records will be the same as the time to process n files of one record each (disregarding i/o considerations), it is possible to project times using a run with various data conditions and numerous sked parameters as a guide.

one such run on the ibm s/360 at the library of congress processed records at the rate of 2,400 per minute. twelve parameters were specified and records were duplicated for certain subject entries. except for time spent in changing tape reels, sked can be expected to process records at the same rate regardless of the size of the file.

the processing time for tsrt is affected by the same characteristics that affect most sort programs. some of these are as follows: 1) amount of memory available to the sort; 2) number of storage units (in lc's case, tape units are used); 3) types of storage unit (for magnetic tape: interrecord gap, density, and tape length); 4) block size for data; and 5) amount of bias in the input.

the only characteristic of sked that seems to relate to the speed with which tsrt operates has to do with sked's extended use of a single control field. for example, in many sorting systems, if records are to be arranged by main entry and within main entry by title and then by date, three control fields would be specified. one would be chosen for main entry; one would be chosen for title; and one would be selected for the date. sked places all of these within the same control field, separating them by a superblank. since tsrt is required to discriminate only on the single control field, a smaller amount of processing time is needed than would be the case if several control fields were used.
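the single-control-field idea can be illustrated with a short python sketch. it is not the sked implementation: the null character is only an assumed stand-in for a superblank (any character that sorts below the blank would do), the function names are hypothetical, and the title "the king" is invented for the example. the sketch compares keys built with the separator against keys whose elements are joined by an ordinary blank.

```python
SUPERBLANK = "\x00"   # assumption: stand-in for a character that sorts below the blank
KEY_LENGTH = 256      # key length of the current implementation, per the text


def single_key(*elements):
    """Build one sort key; the separator is inserted after each element."""
    key = "".join(element + SUPERBLANK for element in elements)
    return key[:KEY_LENGTH]


def blank_joined_key(*elements):
    """Naive alternative: elements separated by an ordinary blank."""
    return " ".join(elements)[:KEY_LENGTH]


# (main entry, title) pairs, already edited for sorting; titles are illustrative.
entries = [
    ("ball john arthur", "chess"),
    ("ball john", "the king"),
]

with_superblank = sorted(entries, key=lambda e: single_key(*e))
with_blank = sorted(entries, key=lambda e: blank_joined_key(*e))

print(with_superblank)  # "ball john" files first
print(with_blank)       # "ball john arthur" wrongly files first
```

because the separator sorts below every other character, one left-to-right comparison of the whole key gives the same order as comparing main entry, then title, then date as separate control fields.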
results

although sked does not have the ability to make the refined distinctions among headings required for sophisticated filing arrangements, it performs in a workmanlike way in producing alphabetical sequences that are unaffected by the presence of diacritical marks and vagaries in punctuation and spacing. moreover, the collating sequence (blank, 0-9, a-z) insures that short names will file before longer ones beginning with the same character string. the ability to truncate headings to remove relators (e.g., ed.) also insures the creation of a single sequence for authors whose names are sometimes qualified in this way.

the following consolidated example shows some of the arrangements produced by sked. to simplify the presentation, generally only the first filing element is given. other elements have been added if they are needed to show distinctions that were made by the program.

abbott, charles
abc company.
a'beckett, gilbert
acadia university.
alexander iii, king of albania
alexander ii, king of bulgaria
alexander i, king of russia
bradley, bradley and bradley, firm.
bradley, milton, 1836-1911.
bradley (milton) company
katz, eric, ed. sound about our ears.
katz, eric. sound in space.
lincoln, abraham, pres. u. s., 1809-1865.
lincoln co., cr.-directories
lincoln county coast directory
lincoln, david.
lincoln highway
lincoln, marshall.
lincoln, mass.-history
lincoln, me.-genealogy
london, albert, joint author. [mockridge, norman.] (author not used in sort key) inside the law.
london, albert. london at night.
london at night.
london. central criminal court.
london. county council.
london, declaration of, 1909.
london-description
london (diocese) courts.
london, jack. white fang. 1930.
london, jack. white fang. 1950.
london, jack white. alaskan adventure.
london, ont. council
london, ontario; a history
london. ordinances.
london-social conditions
smith, john, 1900-
smith, john, 1901-1965.
smith, john allan, 1900-
smith, john, clockmaker.
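the subfield options behind several of these arrangements (selecting subfields, reordering them, and leaving out a relator) can be sketched as follows. this is a hedged python illustration rather than the sked logic itself: the representation of a field as a list of (code, data) pairs, the function names, and the simplified editing step are all assumptions made for the example.

```python
def edit_for_sorting(text):
    """Simplified stand-in for the SKED translation step (an assumption)."""
    blanked = text.lower().translate(str.maketrans(".,-", "   "))
    return " ".join(blanked.split())  # collapse runs of blanks


def key_from_subfields(subfields, codes):
    """Take the subfields named in `codes`, in the order the codes are given."""
    parts = []
    for code in codes:
        parts.extend(data for c, data in subfields if c == code)
    return edit_for_sorting(" ".join(parts))


# Field 100 from the subfield-sequence example in the text.
charles = [
    ("a", "Charles"),
    ("b", "II"),
    ("c", "King of Great Britain,"),
    ("d", "1630-1685"),
]
# Field 100 with a relator in subfield e.
smith = [("a", "Smith, John,"), ("d", "1910-"), ("e", "ed.")]

print(key_from_subfields(charles, "acbd"))
print(key_from_subfields(smith, "ad"))   # relator left out of the key
```

giving the codes as an ordered string keeps selection and sequencing in a single parameter, which is how the acbd example in the text reads.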
anticipated developments

at the present stage of the development of the marc sort program, sked does not have the ability to treat data in a field according to their semantic content. it cannot, for example, treat a character string in a 100 field in a special way because it is a single surname as opposed to a forename or multiple surname. nor does sked include routines for treating abbreviations and digits as if spelled out, or for suppressing data in a given field in some cases but not in others. the achievement of these capabilities will require: 1) development of a generalized technique for taking account of indicators in processing data in variable fields; 2) devising algorithms to handle particular filing situations related to the content of the field; and 3) placement of the algorithms within the framework of the sked program.

the refinement of sked is being undertaken in relation to the problem of maintaining the lc subject heading list in machine readable form. techniques developed for this purpose will be applicable also to filing for book catalogs and other listings. the result should be a firm foundation for a comprehensive program for arranging bibliographic entries by computer.

availability of the program

since the marc sort program should be useful to libraries that subscribe to the marc distribution service, the package (consisting of sked and the modified version of tsrt) has been filed with the ibm program information department. requests should be made through a local branch office of ibm and should cite the following number: 360d06.1.005.

references

1. u. s. library of congress. office of the information systems specialist: a proposed format for a standardized machine-readable record. prepared by henriette d. avram, ruth s. freitag, kay d. guiles. iss planning memorandum, no. 3. (washington, d. c.: 1965), p. 5.
2. u. s. library of congress. information systems office.
the marc ii format: a communications format for bibliographic data. prepared by henriette d. avram, john f. knapp, and lucia j. rather. (washington, d. c.: 1968), p. 33.
3. "preliminary guidelines for the library of congress, national library of medicine, and national agricultural library implementation of the proposed american standard for a format for bibliographic information interchange on magnetic tape as applied to records representing monographic materials in textual printed form (books)," journal of library automation, 2 (june 1969), 68-83.
4. u. s. library of congress, information systems office: subscriber's guide to the marc distribution service. (washington, d. c.: 1968).
5. avram, henriette d.; droz, julius r.: "marc ii and cobol," journal of library automation, 1 (december 1968), 261-272.

a semantic model of selective dissemination of information | morales-del-castillo et al. 21

a semantic model of selective dissemination of information for digital libraries

j. m. morales-del-castillo, r. pedraza-jiménez, a. a. ruíz, e. peis, and e. herrera-viedma

in this paper we present the theoretical and methodological foundations for the development of a multi-agent selective dissemination of information (sdi) service model that applies semantic web technologies for specialized digital libraries. these technologies make it possible to achieve more efficient information management, improve agent–user communication processes, and facilitate accurate access to relevant resources. other tools used are fuzzy linguistic modelling techniques (which ease the interaction between users and system) and natural language processing (nlp) techniques for semiautomatic thesaurus generation. also, rss feeds are used as “current awareness bulletins” to generate personalized bibliographic alerts.

nowadays, one of the main challenges faced by information systems at libraries or on the web is to efficiently manage the large number of documents they hold.
information systems make it easier to give users access to relevant resources that satisfy their information needs, but a problem emerges when the user has a high degree of specialization and requires very specific resources, as in the case of researchers.1 in “traditional” physical libraries, several procedures have been proposed to try to mitigate this issue, including the selective dissemination of information (sdi) service model, which makes it possible to offer users potentially interesting documents by accessing users’ personal profiles kept by the library.

nevertheless, the progressive incorporation of new information and communication technologies (icts) to information services, the widespread use of the internet, and the diversification of resources that can be accessed through the web have led libraries through a process of reinvention and transformation to become “digital” libraries.2 this reengineering process requires a deep revision of work techniques and methods so librarians can adapt to the new work environment and improve the services provided.

in this paper we present a recommendation and sdi model, implemented as a service of a specialized digital library (in this case, specialized in library and information science), that can increase the accuracy of accessing information and the satisfaction of users’ information needs on the web. this model is built on a multi-agent framework, similar to the one proposed by herrera-viedma, peis, and morales-del-castillo,3 that applies semantic web technologies within the specific domain of specialized digital libraries in order to achieve more efficient information management (by semantically enriching different elements of the system) and improved agent–agent and user–agent communication processes.

furthermore, the model uses fuzzy linguistic modelling techniques to facilitate the user–system interaction and to allow a higher grade of automation in certain procedures.
to further increase automation, some natural language processing (nlp) techniques are used to create a system thesaurus and other auxiliary tools for the definition of formal representations of information resources. in the next section, “instrumental basis,” we briefly analyze sdi services and several techniques involved in the semantic web project, and we describe the preliminary methodological and instrumental bases that we used for developing the model, such as fuzzy linguistic modelling techniques and tools for nlp. in “semantic sdi service model for digital libraries,” the bulk of this work, the application model that we propose is presented. finally, we highlight some concluding remarks.

instrumental basis

filtering techniques for sdi services

filtering and recommendation services are based on the application of different process-management techniques that are oriented toward providing the user with exactly the information that meets his or her needs or may be of interest to him or her. in textual domains, these services are usually developed using multi-agent systems, whose main aims are
- to evaluate and filter resources normally represented in xml or html format; and
- to assist people in the process of searching for and retrieving resources.4

j. m. morales-del-castillo (josemdc@ugr.es) is assistant professor of information science, library and information science department, university of granada, spain. r. pedraza-jiménez (rafael.pedraza@upf.edu) is assistant professor of information science, journalism and audiovisual communication department, pompeu fabra university, barcelona, spain. a. a. ruíz (aangel@ugr.es) is full professor of information science, library and information science department, university of granada. e. peis (epeis@ugr.es) is full professor of information science, library and information science department, university of granada. e.
herrera-viedma (viedma@decsai.ugr.es) is senior lecturer in computer science, computer science and artificial intelligence department, university of granada. 22 information technology and libraries | march 2009 traditionally, these systems are classified as either content-based recommendation systems or collaborative recommendation systems.5 content-based recommendation systems filter information and generate recommendations by comparing a set of keywords defined by the user with the terms used to represent the content of documents, ignoring any information given by other users. by contrast, collaborative filtering systems use the information provided by several users to recommend documents to a given user, ignoring the representation of a document’s content. it is common to group users into different categories or stereotypes that are characterized by a series of rules and preferences, defined by default, that represent the information needs and common behavioural habits of a group of related users. the current trend is to develop hybrids that make the most of content-based and collaborative recommendation systems. 
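the contrast between the two families described above can be illustrated with a minimal python sketch. it is not the authors' implementation: the jaccard overlap used for the content-based score, the co-liked-document count used as user similarity, and all names and sample data are assumptions chosen only for illustration.

```python
from collections import defaultdict


def content_based_scores(user_keywords, doc_terms):
    """Score documents by keyword overlap (Jaccard) with the user's keywords.

    Other users' opinions are ignored, as in a content-based system.
    """
    keywords = set(user_keywords)
    scores = {}
    for doc_id, terms in doc_terms.items():
        terms = set(terms)
        union = keywords | terms
        scores[doc_id] = len(keywords & terms) / len(union) if union else 0.0
    return scores


def collaborative_scores(target_user, liked):
    """Score documents the target user has not seen, using other users' likes.

    Document content is ignored, as in a collaborative system. Each other
    user's vote is weighted by the number of documents both users liked
    (a hypothetical, deliberately simple similarity).
    """
    target_docs = liked[target_user]
    scores = defaultdict(float)
    for other, docs in liked.items():
        if other == target_user:
            continue
        overlap = len(target_docs & docs)
        if overlap == 0:
            continue  # users with nothing in common contribute nothing
        for doc in docs - target_docs:
            scores[doc] += overlap
    return dict(scores)


# hypothetical sample data
doc_terms = {"d1": ["thesaurus", "skos"], "d2": ["rss", "feeds"]}
liked = {"ana": {"d1", "d2"}, "ben": {"d1", "d3"}, "eva": {"d4"}}

print(content_based_scores(["skos", "ontology"], doc_terms))
print(collaborative_scores("ana", liked))
```

a hybrid system, in this simplified picture, would combine both score dictionaries, for example by a weighted sum.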
in the field of libraries, these services usually take the form of sdi services that, depending on the profile of subscribed users, periodically (or when required by the user) generate a series of information alerts that describe the resources in the library that fit a user’s interests.6 sdi services have been studied in different research areas, such as the multi-agent systems development domain,7 and, of course, the digital libraries domain.8

presently, many sdi services are implemented on web platforms based on a multi-agent architecture where there is a set of intermediate agents that compare users’ profiles with the documents, and there are input-output agents that deal with subscriptions to the service and display generated alerts to users.9 usually, the information is structured according to a certain data model, and users’ profiles are defined using a series of keywords that are compared to descriptors or the full text of the documents. despite their usefulness, these services have some deficiencies:
- the communication processes between agents, and between agents and users, are hindered by the different ways in which information is represented.
- this heterogeneity in the representation of information makes it impossible to reuse such information in other processes or applications.

a possible solution to these deficiencies consists of enriching the information representation using a common vocabulary and data model that are understandable by humans as well as by software agents.
the semantic web project takes this idea and provides the means to develop a universal platform for the exchange of information.10

semantic web technologies

the semantic web project tries to extend the model of the present web by using a series of standard languages that enable enriching the description of web resources and make them semantically accessible.11 to do that, the project bases itself on two fundamental ideas: (1) resources should be tagged semantically so that information can be understood both by humans and computers, and (2) intelligent agents should be developed that are capable of operating at a semantic level with those resources and that infer new knowledge from them (shifting from the search of keywords in a text to the retrieval of concepts).12

the semantic backbone of the project is the resource description framework (rdf) vocabulary, which provides a data model to represent, exchange, link, add, and reuse structured metadata of distributed information sources, thereby making them directly understandable by software agents.13 rdf structures the information into individual assertions (triples of “resource,” “property,” and “property value”) and uniquely characterizes resources by means of uniform resource identifiers (uris), allowing agents to make inferences about them using web ontologies or other, simpler semantic structures, such as conceptual schemes or thesauri.14

even though the adoption of the semantic web and its application to systems like digital libraries is not free from trouble (because of the nature of the technologies involved in the project and because of the project’s ambitious objectives,15 among other reasons), the way these technologies represent the information significantly improves the quality of the resources retrieved by search engines, and it also preserves platform independence, thus favouring the exchange and reuse of contents.16

as we can see, the semantic web works with information
written in natural language that is structured in a way that can be interpreted by machines. for this reason, it is usually difficult to deal with problems that require operating with linguistic information that has a certain degree of uncertainty (e.g., when quantifying the user’s satisfaction in relation to a product or service). a possible solution could be the use of fuzzy linguistic modelling techniques as a tool for improving system–user communication.

fuzzy linguistic modelling

fuzzy linguistic modelling supplies a set of approximate techniques appropriate for dealing with qualitative aspects of problems.17 the ordinal linguistic approach is defined according to a finite set of tags ($S$) completely ordered and with odd cardinality (seven or nine tags):

$S = \{s_i\},\ i \in H = \{0, \ldots, T\}$

the central term has a value of approximately 0.5, and the rest of the terms are arranged symmetrically around it. the semantics of each linguistic term is given by the ordered structure of the set of terms, considering that each linguistic term of the pair ($s_i$, $s_{T-i}$) is equally informative. each label $s_i$ is assigned a fuzzy value defined in the interval [0,1] that is described by a linear trapezoidal membership function represented by the 4-tuple ($a_i$, $b_i$, $\alpha_i$, $\beta_i$). (the first two parameters show the interval where the membership value is 1.0; the third and fourth parameters show the left and right limits of the distribution.) additionally, we need to define the following properties:
1. the set is ordered: $s_i \geq s_j$ if $i \geq j$.
2. there is the negation operator: $neg(s_i) = s_j$, with $j = T - i$.
3. maximization operator: $max(s_i, s_j) = s_i$ if $s_i \geq s_j$.
4. minimization operator: $min(s_i, s_j) = s_i$ if $s_i \leq s_j$.

it also is necessary to define aggregation operators, such as linguistic weighted averaging (lwa),18 capable of operating with and combining linguistic information.
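the ordinal operators listed above can be sketched in a few lines of python. this is only an illustration under stated assumptions: the seven label names are hypothetical, labels are compared through their index in the ordered list, and the third and fourth parameters of the trapezoid are read as the absolute left and right limits of the distribution, as the text describes them. the lwa operator is not sketched because its definition is not given here.

```python
# hypothetical ordered label set with odd cardinality (T = 6)
LABELS = ["none", "very_low", "low", "medium", "high", "very_high", "total"]
T = len(LABELS) - 1


def neg(label):
    """Negation operator: neg(s_i) = s_j with j = T - i."""
    return LABELS[T - LABELS.index(label)]


def lmax(a, b):
    """Maximization operator: the label with the higher index."""
    return a if LABELS.index(a) >= LABELS.index(b) else b


def lmin(a, b):
    """Minimization operator: the label with the lower index."""
    return a if LABELS.index(a) <= LABELS.index(b) else b


def trapezoid(x, a, b, left, right):
    """Trapezoidal function for the 4-tuple (a, b, left, right).

    The value is 1.0 on [a, b], falls linearly to 0 at the left and right
    limits, and is 0 outside them (assumes left <= a <= b <= right).
    """
    if a <= x <= b:
        return 1.0
    if left < x < a:
        return (x - left) / (a - left)
    if b < x < right:
        return (right - x) / (right - b)
    return 0.0


print(neg("low"), lmax("low", "high"), lmin("low", "high"))
print(trapezoid(0.5, 0.45, 0.55, 0.3, 0.7))
```

because the set is symmetric around its central label, negating the central label returns the label itself.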
besides facilitating the interaction between users and system, the other starting objective is to develop and implement the proposed model in the most automated way possible. to do this, we use a basic auxiliary tool, a thesaurus, that, among other tasks, assists users in the creation of their profile and enables automating the generation of alerts. that is why it is critical to define the way in which we create this tool, and in this work we propose a specific method for the semiautomatic development of thesauri using nlp techniques.

nlp techniques and other automating tools

nlp consists of a series of linguistic techniques, statistical approaches, and machine learning algorithms (mainly clustering techniques) that can be used, for example, to summarize texts in an automatic way, to develop automatic translators, and to create voice recognition software. another possible application of nlp would be the semiautomatic construction of thesauri using different techniques. one of them consists of determining the lexical relations between the terms of a text (mainly synonymy, hyponymy, and hyperonymy),19 and extracting the terms that are most representative of the text’s specific domain.20 it is possible to elicit these relations by using linguistic tools, like princeton’s wordnet (http://wordnet.princeton.edu), and clustering techniques. wordnet is a powerful multilanguage lexical database where each one of its entries is defined, among other elements, by its synonyms (synsets), hyponyms, and hyperonyms.21 as a consequence, once given the most important terms of a domain, wordnet can be used to create a thesaurus from them (after leaving out all terms that have not been identified as belonging or related to the domain of interest).22 this tool can also be used with clustering techniques, for example, to group documents of a collection in a set of nodes or clusters, depending on their similarity.
each of these clusters is described by the most representative terms of their documents. these terms make up the most specific level of a thesaurus and are used to search in wordnet for their synonyms and most general terms, contributing (with the repetition of this procedure) to the bottom-up-development process of the thesaurus.23 although there are many others, these are some of the most well-known techniques of semiautomatic thesaurus generation (semiautomatic because, needless to say, the supervision of experts is necessary to determine the validity of the final result). for specialized digital libraries, we propose developing, on a multi-agent platform and using all these tools, sdi services capable of generating alerts and recommendations for users according to their personal profiles. in particular, the model presented here is the result of several previous models merging, and its service is based on the definition of “current-awareness bulletins,” where users can find a basic description of the resources recently acquired by the library or those that might be of interest to them.24 n the semantic sdi service model for digital libraries the sdi service includes two agents (an interface agent and a task agent) distributed in a four-level hierarchical architecture: user level, interface level, task level and resource level. 
its main components are a repository of full-text documents (which make up the stock of the digital library) and a series of elements described using different rdf-based vocabularies: one or several rss feeds that play a role similar to that of current-awareness bulletins in traditional libraries; a repository of recommendation log files that store the recommendations made by users about the resources; and a thesaurus that lists and hierarchically relates the most relevant terms of the specialization domain of the library.25 also, the semantics of each element (that is, its characteristics and the relations the element establishes with other elements in the system) are defined in a web ontology developed in web ontology language (owl).26 next, we describe these main elements as well as the different functional modules that the system uses to carry out its activity.

elements of the model

there are four basic elements that make up the system: the thesaurus, user profiles, rss feeds, and recommendation log files.

thesaurus

an essential element of this sdi service is the thesaurus, an extensible tool used in traditional libraries that enables organizing the most relevant concepts in a specific domain, defining the semantic relations established between them, such as equivalence, hierarchical, and associative relations. the functions defined for the thesaurus in our system include helping in the indexing of rss feed items and in the generation of information alerts and recommendations. to create the thesaurus, we followed the method suggested by pedraza-jiménez, valverde-albacete, and navia-vázquez.27 the learning technique used for the creation of a thesaurus includes four phases: preprocessing of documents, parameterizing the selected terms, conceptualizing their lexical stems, and generating a lattice or graph that shows the relation between the identified concepts.
essentially, the aim of the preprocessing phase is to prepare the documents’ parameterization by removing elements regarded as superfluous. we have developed this phase in three stages: eliminating tags (stripping), standardizing, and stemming. in the first stage, all the tags (html, xml, etc.) that can appear in the collection of documents are eliminated. the second stage is the standardization of the words in the documents in order to facilitate and improve the parameterization process. at this stage, the acronyms and n-grams (bigrams and trigrams) that appear in the documents are identified using lists that were created for that purpose. once we have detected the acronyms and n-grams, the rest of the text is standardized. dates and numerical quantities are standardized, being substituted with a script that identifies them. all the terms (except acronyms) are changed to small letters, and punctuation marks are removed. finally, a list of function words is used to eliminate from the texts articles, determiners, auxiliary verbs, conjunctions, prepositions, pronouns, interjections, contractions, and grade adverbs. all the terms are stemmed to facilitate the search of the final terms and to improve their calculation during parameterization. to carry out this task, we have used morphy, the stemming algorithm used by wordnet. this algorithm implements a group of functions that check whether a term is an exception that does not need to be stemmed and then convert words that are not exceptions to their basic lexical form. those terms that appear in the documents but are not identified by morphy are eliminated from our experiment. the parameterization phase has a minimum complexity. once identified, the final terms (roots or bases) are quantified by being assigned a weight. 
this weight is obtained by applying the term frequency-inverse document frequency (tf-idf) scheme, a statistical measure that quantifies the importance of a term or n-gram in a document depending on its frequency of appearance in the document and in the collection the document belongs to. finally, once the documents have been parameterized, the associated meanings of each term (lemma) are extracted by searching for them in wordnet (specifically, we use wordnet 2.1 for unix-like systems). thus we get the group of synsets associated with each word. the group of hyperonyms and hyponyms also are extracted from the vocabulary of the analyzed collection of documents.

the generation of our thesaurus (that is, the identification of the descriptors that best represent the content of documents, and the identification of the underlying relations between them) is achieved using formal concept analysis techniques. this categorization technique uses the theory of lattices and ordered sets to find abstraction relations from the groups it generates. furthermore, this technique enables clustering the documents depending on the terms (and synonyms) they contain. also, a lattice graph is generated according to the underlying relations between the terms of the collection, taking into account the hyperonyms and hyponyms extracted. in that graph, each node represents a descriptor (namely, a group of synonym terms) and clusters the set of documents that contain it, linking them to those with which it has any relation (of hyponymy or hyperonymy). once the thesaurus is obtained by identifying its terms and the underlying relations between them, it is automatically represented using the simple knowledge organization system (skos) vocabulary (see figure 1).28

user profiles

user profiles can be defined as structured representations that contain personal data, interests, and preferences of users with which agents can operate to customize the sdi service.
in the model proposed here, these profiles are basically defined with friend of a friend (foaf), a specific rdf/xml vocabulary for describing people (which favours profile interoperability, since this is a widespread vocabulary supported by an owl ontology), and another nonstandard vocabulary of our own to define fields not included in foaf (see figure 2).29 profiles are generated the moment the user is registered in the system, and they are structured in two parts: a public profile that includes data related to the user’s identity and affiliation, and a private profile that includes the user’s interests and preferences about the topic of the alerts he or she wishes to receive. to define their preferences, users must specify keywords and concepts that best define their information needs. later, the system compares those concepts with the terms in the thesaurus using as a similarity measure the edit tree algorithm.30 this function matches character strings, then returns the term introduced (if there’s an exact match) or the lexically most similar term (if not). consequently, if the suggested term satisfies user expectations, it will be added to the user’s profile together with its synonyms (if any). in those cases where the suggested term is not satisfactory, the system must provide some tool or application that enables users to browse the thesaurus and select terms that better describe their needs. an example of this type of application is thmanager (http://thmanager.sourceforge.net), a project of the universidad de zaragoza, spain, that enables editing, visualizing, and going through structures defined in skos.
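the term-suggestion step described above (return the term itself on an exact match, otherwise the lexically most similar thesaurus term) can be sketched as follows. note that this sketch does not implement the edit tree algorithm cited by the authors: as a stand-in it uses the similarity ratio from python's standard difflib module, and the function name and sample thesaurus terms are hypothetical.

```python
import difflib


def suggest_term(user_term, thesaurus_terms):
    """Return the thesaurus term that best matches what the user typed.

    An exact match (after trimming and lowercasing) is returned as is;
    otherwise the most similar term according to difflib is suggested.
    Returns None when the thesaurus is empty.
    """
    term = user_term.strip().lower()
    if term in thesaurus_terms:
        return term
    # cutoff=0.0 always yields the best candidate, however weak the match,
    # so the user can accept it or fall back to browsing the thesaurus
    matches = difflib.get_close_matches(term, thesaurus_terms, n=1, cutoff=0.0)
    return matches[0] if matches else None


# hypothetical thesaurus terms
terms = ["library management", "digital libraries", "information retrieval"]

print(suggest_term("Library Management", terms))
print(suggest_term("librery managment", terms))
```

in the model, the suggested term would only be added to the profile (with its synonyms) after the user confirms it.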
each of the terms selected by the user to define his or her areas of interest has an associated linguistic frequency value (tagged as ) that we call “satisfaction frequency.” it represents the regularity with which a particular preference value has been used in alerts positively evaluated by the user. this frequency measures the relative importance of the preferences stated by the user and allows the interface agent to generate a ranking list of results. the range of possible values for these frequencies is defined by a group of seven labels that we get from the fuzzy linguistic variable “frequency,” whose expression domain is defined by the linguistic term set s = {always, almost_always, often, occasionally, rarely, almost_never, never}, being the default value and “occasionally” being the central value.

rss feeds

thanks to the popularization of blogs, there has been widespread use of several vocabularies specifically designed for the syndication of contents (that is, for making accessible to other internet users the content of a website by means of hyperlink lists called “feeds”). to create our current-awareness bulletin we use rss 1.0, a vocabulary that enables managing hyperlink lists in an easy and flexible way. it utilizes the rdf/xml syntax and data model and is easily extensible because of the use of modules that enable extending the vocabulary without modifying its core each time new describing elements are added.

[figure 1. sample entry of a skos core thesaurus]

[figure 2. user profile sample]
in this model several modules are used: the dublin core (dc) module to define the basic bibliographic information of the items utilizing the elements established by the dublin core metadata initiative (http://dublincore.org); the syndication module to facilitate software agents synchronizing and updating rss feeds; and the taxonomy module to assign topics to feed items. the structure of the feeds comprises two areas: one where the channel itself is described by a series of basic metadata like a title, a brief description of the content, and the updating frequency; and another where the descriptions of the items that make up the feed (see figure 3) are defined (including elements such as title, author, summary, hyperlink to the primary resource, date of creation, and subjects).

[figure 3. rss feed item sample]

recommendation log file

each document in the repository has an associated recommendation log file in rdf that includes the listing of evaluations assigned to that resource by different users since the resource was added to the system. each of the entries of the recommendation log files consists of a recommendation value, a uri that identifies the user who has made the recommendation, and the date of the record (see figure 4). the expression domain of the recommendations is defined by the following set of five fuzzy linguistic labels that are extracted from the linguistic variable “quality of the resource”: q = {very_low, low, medium, high, very_high}.

[figure 4. recommendation log file sample]

these elements represent the raw materials for the sdi service that enable it to develop its activity through four processes or functional modules: the profiles updating process, rss feeds generation process, alert generation process, and collaborative recommendation process.

system processes

profiles updating process

since the sdi service’s functions are based on generating passive searches to rss feeds from the preferences stored in a user’s profile, updating the profiles becomes a critical task. user profiles are meant to store long-term preferences, but the system must be able to detect any subtle change in these preferences over time to offer accurate recommendations. in our model, user profiles are updated using a simple mechanism that enables finding users’ implicit preferences by applying fuzzy linguistic techniques and taking into account the feedback users provide. users are asked about their satisfaction degree ($e_j$) in relation to the information alert generated by the system (i.e., whether the items retrieved are interesting or not). this satisfaction degree is obtained from the linguistic variable “satisfaction,” whose expression domain is the set of seven linguistic labels: s’ = {total, very_high, high, medium, low, very_low, null}.

this mechanism updates the satisfaction frequency associated with each user preference according to the satisfaction degree $e_j$. it requires the use of a matching function similar to those used to model threshold weights in weighted search queries.31 the function proposed here rewards the frequencies associated with the preference values present when resources assessed are satisfactory, and it penalizes them when this assessment is negative. let $e_j \in S'$ be the degree of satisfaction, and $f_j^{i_l} \in S$ the frequency of property $i$ (in this case $i$ = “preference”) with value $l$, with $s_a, s_b \in S \mid a, b \in H = \{0, \ldots, T\}$; then we define the updating function $g$ as $S' \times S \to S$:

$g(e_j, f_j^{i_l}) = \ldots$

[…] 75% of strategic plans), second tier, and other areas of emphasis.
saunders’ analysis revealed that strategic plan diversity content was a second-tier issue related to library staff. saunders found the term diversity was used in two ways: to refer to expertise, skills, and abilities; and to delimitate demographic characteristics, including ethnicity, nationality, or language.15 like wilson et al., saunders’ findings demonstrate academic libraries’ recognition of the importance of diversity in higher education.16 methodology this study employed content analysis and examined uborrow consortium members’ library websites (see appendix a for a list) for the presence and content of dei statements. uborrow is an interlibrary loan service comprised of big ten academic alliance members, plus the university of chicago and the center for research libraries, in which “users at member institutions are granted access to the collective wealth of information of the entire consortium.”17 uborrow members leverage individual campus resources to collaboratively assist the academic pursuits of students and faculty of each institution via the expedited sharing of resources. these libraries were chosen as a representation of a model consortium and are a reasonable focus for examination. content analysis is a research technique for making replicable and valid inferences from texts or other forms of contextually based data. it allows for data analysis “in view of the meanings, symbolic qualities, and expressive contents they have and of the communicative roles they play in the lives of the data’s sources”.18 content analysis provides a foundation for understanding how messages and meanings are constructed. as such, content analysis is appropriate to analyze the content and meanings of dei statements on uborrow websites. 
additionally, this study utilized multimodal theory, particularly lemke’s hypermodality and three communicative acts: organizational, presentational, and orientational.19 examining dei statements as multimodal texts allows for the analysis of meaning making and construction across information technology and libraries december 2021 diversity, equity & inclusion statements on academic library websites | ely 4 each type of act. just as users make meanings across sentences, paragraphs, and pages, users likewise make meanings from the ways in which they interact with digital information.20 the organizational aspect provides a way to examine the spatial arrangement of library websites, for example, libraries that dedicate entire webpages to dei statements or those in which these statements share pages with other content. content analysis provides a way of examining the presentational aspect of information, the ideational content of texts, in this case how dei statements are presented on uborrow websites. content analysis also provides a way to examine the orientational aspect, which indicates the nature of the communicative relationship, via exploring how libraries establish relations with whom they are communicating, for example, how the presence of dei statements positions libraries as conscientious entities engaged in the promotion of diverse and inclusive environments. i examined each uborrow member website for an explicit dei statement. informed by previous literature, i created an excel spreadsheet and entered data from each institution including: institution name, library website url, dei statement (yes/no), homepage link (yes/no), dei statement url, and notes following a standardized process. first, i recorded the library’s homepage. next, i searched the homepage for a dei statement link. if found, i indicated the presence in the yes/no columns and recorded the url. only direct links to library dei statements were marked as yes in the homepage link column. 
if no homepage link existed, i searched the library websites using the following terms: diversity, equity, and inclusion. when it was difficult to locate dei statements, i utilized the chat feature or e-mailed library administrators to ensure i was not overlooking relevant information. i conducted an initial search in july 2020 and a subsequent search in november 2020. no changes to explicit dei statements occurred. i conducted a followup search in april 2021. in the intervening months, one major change occurred on the university of minnesota libraries website. implications of this change are discussed below. once dei statements were identified, i examined the pages on which they were located. first, i examined page organization. in this step, i noted if the dei statement was the sole content on a page. if not, i noted the accompanying content. this analysis focused on lemke’s traversals, or the varied paths available to users in their search and navigation of websites.21 second, i analyzed dei statement content and identified the ways libraries presented their statements. this step included an examination of the language used in the dei statement. third, i expanded upon the presentational analysis and considered the ways dei statement content oriented the library toward users by exploring how statement language contributed to portraying the library in a certain way. this analysis focused on two areas: library-centered language common across statements and social justice language, which a subset of libraries’ dei statements employed. limitations uborrow is a single 16-member library consortium. further research of similar consortia or library associations would contribute to the study’s limited size. this study focused on explicit dei statements, thereby excluding other forms of dei content (e.g., announcements, marketing material, events). 
further research employing a broader view of dei content on academic library sites would also contribute to this study’s findings. finally, this study represents library websites during a snapshot in time.

findings and analysis

twelve (75%) uborrow member websites had an explicitly titled dei statement in november 2020 and 13 had an explicitly titled and labeled dei statement as of april 2021 (see table 1). in november 2020, the university of minnesota had a clearly defined statement; however, this statement was untitled, and its location was unique among websites during the initial search. initially, this statement was not considered an explicit statement because it lacked a title, the implications of which are discussed in detail below. however, between november 2020 and april 2021, the university of minnesota libraries updated their homepage to include a link to a clearly defined and labeled dei statement, which university librarian and dean of libraries lisa german approved on february 1, 2021. for this reason, the university of minnesota libraries website receives unique discussion in the analysis that follows. three additional consortium members did not have an explicit dei statement.

table 1. uborrow member libraries and the presence of dei statements

institution                                   explicit dei statement (y/n)
university of chicago                         yes
center for research libraries                 no
university of illinois at urbana-champaign    yes
indiana university                            yes
university of iowa                            yes
university of maryland                        yes
university of michigan                        yes
university of minnesota                       no (fall 2020), yes (spring 2021)
michigan state university                     yes
university of nebraska                        yes
northwestern university                       no
ohio state university                         yes
penn state university                         yes
purdue university                             no
rutgers university                            yes
university of wisconsin madison               yes

while 12/13 of the 16 uborrow members contained an explicitly labeled dei statement, all member institutions addressed dei in some form, which included libguides, links to library resources, library events, and statements responding to specific societal events. however, the degree to which additional dei content was prominent varied, with some content buried deep within library sites, as mestre’s work indicated.22

dei statement analysis: organization, presentation, and orientation

the following section presents the descriptive analysis of the findings utilizing content analysis and lemke’s organizational, presentational, and orientational communicative aspects.23 analysis focused on questions about how the organization and presentation of dei statements contributed to the construction of meaning, the content of dei statements, and how content is oriented toward users in ways that position academic libraries as conscientious entities via their dei statements.

organizational aspect of dei statements

the organizational aspect of communication is instrumental and organizes and composes content in such a way that it is coherent and cohesive.24 organizational meanings have practical consequences, as the example of the university of minnesota’s library website demonstrates.
unique among uborrow sites as of november 2020, the university of minnesota libraries’ website contained a clearly focused, although untitled, dei statement. this statement is accessed via the about option on the homepage’s menu (see figure 1), which includes dropdown links to library policies, library overview, and the untitled dei statement.

figure 1. the university of minnesota libraries’ homepage (november 2020).

the statement’s placement is problematic for several reasons. first, the statement is easy to overlook. the researcher and a library staff member who responded to the researcher’s query via library chat both overlooked the statement. only when a third staff member was consulted was the statement identified. secondly, lemke discusses the affordances of hypertext and the many ways users can navigate websites, calling possible paths traversals.25 among the most basic is the visual-organizational traversal, which considers how webpage composition guides users’ eyes across the page. in this instance, the links are a call to action and signify to users that clicking on a link will transport them to a page with more information. static text on a webpage does not offer the same affordance. as a block of text located next to two panes of links, the statement is static, passive, and non-interactive, contributing to the ease with which users can overlook the statement. finally, this statement did not appear in the results of a library site search for the terms diversity, equity, or inclusion. given the various ways users can traverse a website, including actively searching for information, the lack of a title makes this statement difficult to locate via scanning and searching.
in users’ traversals of websites, two common approaches, identifying links or actively searching for desired information, are not applicable in locating the university of minnesota libraries’ dei statement. in the intervening months, between november 2020 and april 2021, the university of minnesota libraries website was updated to include an explicitly titled and labeled dei statement, available via a link from the homepage, prominently located in the upper right quadrant between the menu bar and hours and locations information (see figure 2).

figure 2. the umn libraries’ homepage (april 2021).

this statement, written by the university libraries’ diversity, equity, and inclusion leadership committee, was approved on february 1, 2021. presented on a standalone page, this statement is similar to those of eight other uborrow consortium members, which are discussed in the next section.

organization: stand-alone dei statements

of the websites that contained explicitly labeled dei statements, eight libraries dedicated an entire page to the dei statement (nine including the umn libraries update). examination of these webpages revealed similar page titles, with variance according to the terms included. some page titles only included diversity, while others included diversity, equity, and inclusion. the university of michigan was unique as it also included accessibility. the relative consistency across these titles contributes to a less frustrating and confusing user experience. clear and descriptive titles provide a positive experience for users accessing pages with an assistive screen reader. logistically, clear titles amplify page presence on searches conducted via google or other search engines. in addition to webpage titles, examination of the eight/nine pages revealed a relatively similar page organization and structure.
each page contained headings that included some or all of the terms diversity, equity, or inclusion. the pages were text heavy, with the university of nebraska and penn state university the only two whose pages included visual representations of diversity (i.e., images containing multi-racial groups). furthermore, the detail level of libraries’ dei statements was relatively consistent across the eight/nine webpages. while the page titles, organization, and detail of dei statements were similar, differences existed in the amount of additional dei content. for example, along with their dei statement, the university of maryland libraries’ diversity page defined diversity, an equitable environment, and inclusion. the university of michigan library followed their explicit dei statement with information relating the statement to the library’s collections, services, spaces, and people. other library webpages did not contain as much other on-page information. for example, rutgers university libraries links to various dei resources, which was another common trait (see figure 3). although clicking links requires additional steps to reach dei content, the presence of links is significant in consideration of lemke’s traversals.

figure 3. rutgers libraries’ diversity homepage with dei links.

of the many types, a common organizational traversal is what lemke terms cohesive, in which “each element is an instance of some general category, and therefore with some thematic and/or visual similarities to the others, and as we catenate them we are cumulating toward an exhaustive exploration of the category”.26 the links on the rutgers university libraries’ diversity page allow users to traverse the library’s dei content, along with institutional dei content, as several links direct users to diversity pages external to the library.
these links serve as calls to action and require users to click for more information. associated with dei content via the categorical connection, these links allow users to fully explore and expand upon the information found on the library’s dei statement page, allowing users to create their own meaning of library commitment to dei. user creation of meaning is in opposition to the library making this decision for the user, as when dei statements are placed on pages with other content, as is the case in four uborrow member websites.

organization: shared dei statements

unlike the libraries that dedicated a page to dei statements, variety exists in the page titles of the four libraries on which dei statements share pages with other content. these dei statements are available via the about section of the library’s website, and two of them are further couched on administration pages. the pages on which dei statements shared space exemplify mestre’s finding that dei content situated deeply within a website is difficult to locate.27 furthermore, the location of dei content is not entirely intuitive, making a user’s traversal to locate desired information less cohesive. michigan state university’s (msu) dei statement is found on the library’s strategic plan page. rather than in a single statement, dei content is spread across the library’s strategic plan, including an inclusivity statement; a vision statement; and diversity, equity, and inclusion strategic direction (see figure 4). also, unlike the pages singularly devoted to a dei statement, msu’s strategic plan page was comparatively static with no links to other library or institutional dei content.
the lack of links does not allow users to traverse msu’s site for dei content as easily due to the page’s static nature, making it difficult for users to “construct a traversal which is more than the sum of its parts.”28 in this way, msu constructs the meaning of their commitment to dei via the limitations and restrictions on users’ interaction opportunities with the page on which their dei content is situated.

figure 4. michigan state university libraries’ diversity content as part of strategic plan.

organization: homepage links

homepage links to dei statements were present on seven (58%) library homepages. when present, homepage links were located at two locations: in the menu or page footers. additionally, two levels of clarity existed regarding homepage links, as some sites contained an explicitly labeled link, while others required a two-step process to access the dei link. for example, the university of iowa and penn state university libraries each had a clearly labeled link to their dei statement available via a single click on their library homepages. contrastingly, michigan, maryland, nebraska, ohio state, and rutgers all required users to first navigate a menu bar to find a link to the library’s dei statement. this two-step process requires more time and effort, whereas direct links require one less step. however, the direct link on the university of iowa’s library homepage is located in the footer (see figure 5) and penn state university’s direct dei link is located near the bottom of the page, requiring users to scroll through entire pages. although requiring an extra step, libraries with a menu link at the top of the homepage, such as the ohio state university (see figure 6), do not require scrolling.
a tradeoff exists between page location and number of steps to locate a link to the library’s dei statement when a homepage link is present. regardless of the homepage location, the presence of links to dei statements provides relatively easy access, making a user’s traversal to these statements relatively effortless and straightforward.

figure 5. university of iowa libraries’ homepage dei statement footer link.

figure 6. the ohio state university libraries’ homepage dei statement menu link.

presentational aspect of dei statements

lemke defines presentational meanings as those that present some state of affairs, which are construed from connections among processes, relations, events, participants, and circumstances and are significant for institutional purposes.29 users see the product of these actions that result in public dei statements. the discussions, meetings, efforts, and decisions that contribute to dei statements on library websites are concealed. the presence of dei statements represents the hidden work necessary for their creation, making dei statement content the library’s presentation of commitment to diversity, equity, and inclusion.

presentation: vague language and diversity conceptualizations

examining the content of the 12/13 libraries with an explicit dei statement revealed these statements are frequently vague. many statements do not include specific language identifying what diverse means or who is in-/excluded.
for example, rutgers university libraries’ dei statement states, “the libraries advance and promote diversity in all its forms” without describing, defining, or providing diversity examples.30 additionally, rutgers libraries endeavors “to create a welcoming workplace that reflects and supports the many populations and programs of the university with which we engage [emphasis original].”31 again, no definition indicates who the many populations includes. similarly, vague language produced an inconsistency regarding to whom dei statements were directed, with many, but not all, statements including faculty and staff. indiana university’s statement is one that does, stating, “iu libraries esteems diversity of all kinds […] to support students from diverse socio-economic backgrounds and foster a global, diverse inclusive community… in addition, the libraries commits to diversifying its own staff to reflect a diversity of perspectives and backgrounds [emphasis original].”32 including library faculty and staff acknowledges the potential significance of having a diverse and representative workforce, but still vaguely addresses the issue. unlike many dei statements which vaguely conceptualize diversity, the university of maryland libraries includes in its definition of diversity “race, ethnicity, nationality, religion, socioeconomic status, education, marital status, language, age, gender, sexual orientation, cognitive or physical disability; and learning styles” while noting diversity is not limited to these categories.33 similarly, the university of iowa libraries “welcomes and serves all, including people of color from all nations, immigrants, people with disabilities, lgbtq, and the most vulnerable in our community.”34 while still broad, and with language to cover additional conceptions of diversity, these statements’ explicit mention of various groups is unique among uborrow members’ dei statements.
presentation: library-focused language

continuing the broad conceptualizations of diversity, the university of chicago libraries’ statement includes an inward focus, which asks library users to consider their own positions and backgrounds: “we encourage open and honest discussion, reflect on our assumptions, and actively seek viewpoints beyond our own … and respect the uniqueness that we each bring to our shared endeavors.”35 this statement asks library users to actively challenge their own assumptions, values, beliefs, and views. however, the statement does not include active language regarding the necessity to prepare for challenging and difficult conversations and interactions. furthermore, the general conceptualization of these interactions with diversity makes it difficult for individuals to prepare for concrete situations in which one may encounter challenging, uncomfortable, or difficult conditions. utilizing lemke’s presentational aspect of communication, which considers processes, relations, events, participants, and circumstances to create and present a state of affairs, demonstrates that uborrow member libraries are vague in their dei statements, which the university of illinois at urbana-champaign (uiuc) exemplifies in their recognition of “diversity as a constantly changing concept. it [diversity] is purposefully defined broadly as encompassing, but not limited to, individuals’ social, cultural, mental and physical differences.”36 dei statements, as representative of academic libraries, present these institutions as attuned to larger social issues and the difficulties of making sweeping, definitive statements regarding diversity, when the term itself is, as the uiuc statement indicates, evolving and contested. the challenges this creates for library
the challenges this creates for library information technology and libraries december 2021 diversity, equity & inclusion statements on academic library websites | ely 12 administrators, and the hidden work that contributes to the creation and presentation of dei statements, is invisible in the end product, yet informs the content of these public statements. orientational aspect of dei statements orientational meanings establish relations between those who are communicating. these meanings communicate point of view, attitudes, and values.37 dei statements demonstrate libraries’ willingness to engage with and address dei issues, as well as, in some cases, combating racism and discrimination. analyzing the content of these statements produces insights into how statements orient libraries to their audiences. the vague and general language of many library dei statements creates a sense of detachment between libraries and users. conceptualizing diversity using vague language in an exchange between library users and a library dei statement orients the library in an abstract, immaterial way. using vague, broad, and ill-defined language makes no concrete demands of users. additionally, many dei statements are written in library-centered language, which positions the libraries at the center and users as peripheral. for example, the university of nebraska’s statement begins, “the university libraries creates and fosters inclusive environments for teaching, learning, scholarship, creative expression and civic engagement.”38 in this instance, the onus is on the libraries and what they can do to address issues of diversity, equity, and inclusion. 
the statement continues, “libraries staff members are empowered to provide an array of library services, collections, and spaces to meet the diverse needs of students, faculty, and researchers.”39 again, the library self-promulgates their efforts to address dei issues and ignores users’ contributions to positive, inclusive, and equitable environments. the university of nebraska libraries’ statement is not unique in the use of library-centered language, as such language is common across uborrow members’ dei statements. less vague and more user-centered language would make library dei statements more humanizing and valuable, and would contribute to the inclusive environments these statements espouse.

orientation: anti-racism and social justice language

some libraries’ dei statements make explicit mention of larger social issues and actively position themselves as social justice advocates, particularly anti-racism. the university of wisconsin – madison libraries are “dedicated to the principles and practices of social justice, diversity and equality and … commit ourselves to doing our part to end the many forms of discrimination that plague our society.”40 the penn state university libraries’ statement includes a commitment to “disrupting racism, hate and bias whenever and wherever we encounter it.”41 the university of michigan library “actively work[s] to ensure that tenets of diversity and antiracism influence all aspects of our work.”42 these statements present the libraries as cognizant, responsible, and socially aware entities. in these statements, the libraries’ employment of social justice discourse demonstrates non-neutrality. the university of wisconsin – madison libraries’ statement recognizes its place within society and the continual legacy of discrimination which tangibly affects current students.
identifying social discrimination as a “plague” implies a solution via targeted, collective efforts to “further and enable the opportunities for education, benefit the good of the public and inform citizens.”43 similarly, the ohio state university libraries are guided by priorities “which facilitate, celebrate and honor diversity, inclusion, access and social justice.”44 embracing an active stance against social discrimination and positing the libraries as proponents of social justice utilizes the libraries’ dei statement as a tool to combat these injustices. semantically, dei statement text offers information to users. statement content demonstrates libraries’ willingness to address dei issues institutionally and within society. the text demonstrates the libraries’ desire to combat injustice and the importance they place in doing so. additionally, in linking dei statements to social justice issues, libraries make demands of users. while still employing library-centered language, these statements provide a call to action via their direct acknowledgement that the libraries’ actions are a part of larger, collective efforts in the continual struggle against social injustices.

lack of explicit dei statements

as the analysis shows, the ways in which academic libraries organize, present, and orient themselves via their dei statements contribute to the construction of institutional value of, and commitment to, diversity, equity, and inclusion. but what about libraries that do not have an explicit dei statement? in the united states context, given the attention to diversity, through black lives matter and other social movements advocating for social justice, it is surprising that four uborrow members do not have explicitly labeled dei statements on their websites.
orientationally, the absence of an explicit dei statement suggests a lack of concern and consideration on behalf of libraries, as well as seemingly being out of touch with broader social contexts in which racial disparities persist. a clear dei statement, however, is a single piece of a library’s online presence. academic libraries can organize and present dei content on their websites in other ways, as all uborrow members did, even if an explicit statement was lacking. for example, the purdue university libraries have a diversity, inclusion, racism and anti-racism resources library guide, which acts as a one-stop shop for dei-related material. additionally, this guide contains a statement from the dean of libraries, dated june 2, 2020, condemning and making a collective call to action to address systemic racism. given this statement, while acknowledging the bureaucratic mechanisms in place that may slow the creation of an explicit dei statement, the question remains: if “enough is enough” as the statement claims, why have the purdue libraries not taken swift action to expedite the bureaucratic process? purdue university libraries are working behind the scenes and have created a council on equity, inclusion and belonging, as well as creating a new strategic plan in which “edi [equity, diversity and inclusion] is much more prominent in the current draft of that plan than in previous ones.”45 similarly, northwestern university does not, as of this writing, have an explicit dei statement. 
however, minimal diversity language is present in a public-facing welcome message on the library’s about page stating, “your library serves the diversity of the northwestern community.”46 furthermore, minimal diversity language appears in the internal strategic plan, 2019–2021, which includes a commitment to “responding to the vibrant diversity of our campus community.”47 additionally, recent conversations regarding racism, diversity, and social justice among library leadership have spurred the creation of a formal edi program at the institutional level.48 examining the situation at northwestern university revealed a look into the hidden work required to create and present dei content and an explicit dei statement, demonstrating the institutional significance of the presentational aspect of communication.

discussion and implications

the descriptive analysis presented in this study provides a foundation for closer analysis and future research, with potential avenues suggested below. this analysis also illustrates issues with the way in which dei statements are presented on academic library websites, which, given the pervasive whiteness of academic librarianship, affects academic librarians, staff, and the students they serve. following lemke’s treatment of organizational meanings as primarily instrumental, the following section discusses presentational and orientational implications of dei statement content.49 academic libraries are an integral component of the institutions within which they are situated. their physical and digital spaces, services, and resources are critical to students’ academic success and faculty research. academic libraries also contribute to larger institutional dei initiatives.
while an examination of institutional dei statements is beyond the scope of this study, institutional mission and vision statements also address diversity, equity, and inclusion. although many institutions have implemented specific diversity statements, wilson, meyer, and mcneal identified diversity content on institutional websites as being limited.50 given the changing demographics of higher education in the united states, the significance of dei to academic institutions and libraries will continually increase. if the purpose of mission and diversity statements is to reflect institutional priorities, as wilson and colleagues argue, the presence, or lack thereof, and content of these statements indicates the extent to which institutions value diversity, equity, and inclusion.51

presentational implications of dei statements

in the context of the present study, that all uborrow member libraries’ websites engaged in some ways with dei content demonstrates the value they place on diversity, equity and inclusion. however, that only 12/13 of the 16 sites contained explicitly titled dei statements demonstrates more concerted effort is required if these libraries are to truly demonstrate their commitment. despite other dei language, northwestern university, a member of the 2020 acrl diversity alliance, does not have an explicit, public-facing dei statement, which demonstrates the many ways academic libraries are involved with diversity initiatives. while academic libraries may have internal policies that guide practice, that these policies, if they exist, are not public does not contribute to the construction and dissemination of the libraries’ message indicating their commitment to diversity, equity, and inclusion. the lack of a publicly facing statement, whether intentional or not, contributes to the message that the library is not fully committed to diversity, equity, and inclusion.
in this vein, further exploration of diversity content and statements, at the institutional and library levels, is necessary to expand upon the findings of the present study regarding the messages dei statements send. qualitative studies could investigate the working cultures of academic libraries and explore internal mechanisms that contribute to the creation of public-facing statements and how these mechanisms operate. lemke argues that presentational meanings are typically uncritical due to the presupposition of institutional hierarchies and roles, which minimize threats to the status quo, making this avenue especially fruitful from a critical or decolonizing framework.52 other opportunities for further research include quantitative content analysis of diversity statements, which could reveal specific words, terms, and phrases that institutions and academic libraries use to shed light on how these entities conceptualize dei. research examining users’ perspectives of academic library dei content is necessary to explore the ways in which libraries’ messages are received.

orientational implications of dei statements

examining uborrow members’ dei statements revealed the frequent employment of library-centered language. framing the statements in this way places responsibility to create inclusive, equitable and welcoming environments on academic libraries, librarians, and staff. if the onus is on academic libraries, as this dei statement language suggests, those who staff libraries are required to appropriately serve diverse students.
as such, practical considerations of staff training regarding cultural competence are of paramount importance, which the university of michigan recognizes as they “encourage all library staff to participate in diversity-focused professional development and training activities.”53 while training and professional development opportunities are of limited utility, as cultural competence, cultural humility, and a diversity mindset cannot be acquired in one-off sessions, setting a pervasive atmosphere establishes institutional library value of diversity, equity, and inclusion. furthermore, hiring and retaining staff representative of student demographics is critical as doing so is one way academic libraries can demonstrate evidence of their values placed on diversity. that librarianship has traditionally been a white profession, as 86.7% of ala members self-identified as white as of 2017 and 86.1% of higher education credentialed librarians were white as of 2009–2010, exacerbates the need for representative library staff.54 however, recruiting and hiring diverse staff is challenging as the number of visible minorities in academic librarianship has remained stagnant.55 retaining academic librarians and staff of color is a separate challenge, as institutional and library environments, expectations and research output are all explicit barriers, while internal pressure and time management constraints are implicit barriers.56 academic librarians and staff of color are subject to racial microaggressions perpetrated by unaware nonminority colleagues, an issue that permeates higher education, particularly at historically white institutions (hwis).57 these environments contribute to individual stress and fatigue for faculty of color.58 a history of what mehra and gray label white-ist trends in lis, an amalgamation of practices that symbolize “racist connotations and racism in lis that is part of its historical evolution and development in the united states” affects librarians and
staff of color.59 at the societal level, hate crimes are a continual issue in the united states.60 academic library dei statements were not created to directly address grand social issues. however, some dei statements included a social justice call to action. while not all dei statements contained such language, those that did not still made a commitment to supporting diversity. academic libraries’ dei statements identify the scope of available services and demonstrate libraries’ collective attempt to provide equitable spaces for all campus community members. while these statements occasionally align with institutional diversity statements, institutional responses to bias and discrimination provide insight into other ways institutions craft an identity.61 especially at hwis, these responses typically include demonstrating a professed commitment to dei, acknowledging actions to prevent future instances, establishing a protocol in the event an incident occurs, and addressing the issue and removing the institution from the perpetrators’ actions.62 academic library dei statements that simply state a commitment to diversity and inclusion without actively promoting change, which is lacking in the vague, library-centric language common to these statements, take a typical, though not emphatic, stance. this passive stance demonstrates the need for critical analysis of orientational meanings. such critical analysis allows for the examination and scrutiny of the actors and processes involved in dei statement creation, presentation, and messaging. such an examination offers an avenue to hold institutions accountable for their words and dei statements. future research that examines academic libraries’ responses to specific incidents of bias and discrimination could provide further insight into internal processes that lead to the public display of academic libraries as change agents.
additional research could examine individual academic librarians and staff to interrogate the congruences or dissimilarities of individual and institutional practices regarding engagement with dei initiatives.

conclusion

examination of uborrow members’ websites revealed that 12/13 of 16 sites contained explicitly labeled dei statements. although not all members’ sites contained an explicit statement, every library engaged with dei content in some way. among the 12/13 sites that contained an explicit dei statement, distinctions existed regarding statement organization. eight/nine libraries dedicated an entire page to their dei statement, while four members’ statements shared a page with other content. organizationally, the pages containing dei statements were similar with text-heavy pages common across the websites. presentationally, dei statements serve as publicly facing representations of university libraries. the most telling insight into the presentational aspect of communication was revealed in an analysis of the sites that did not contain explicit dei statements, as this analysis examined the hidden work that is necessary in dei statement creation. orientationally, vague and library-centric language distances academic libraries and positions them as abstract entities. those libraries whose dei statements employed social justice language made more concrete demands of users. while explicit dei statements comprise only a portion of academic library dei content, an analysis of these statements revealed the ways in which they contribute to academic libraries’ construction of value of, and commitment to, diversity, equity, and inclusion.
this analysis demonstrated how the presence, or lack thereof, of dei statements positions libraries as conscious entities operating within institutional and social contexts that both restrain and encourage promotion of diversity, equity, and inclusion. that the university of minnesota libraries updated their homepage to include a link to a newly constructed dei statement during the months between the first and second examination of uborrow consortium members’ websites in this study indicates the significance and value institutions place on dei initiatives. academic libraries, as entities that operate within institutions in the social context of historical racism, discrimination, and marginalization in the united states, are not immune to the consequences of these enduring legacies. despite current and ongoing efforts, this analysis revealed that much work and dedication is yet required in the continual engagement with dei initiatives.
appendix a: uborrow member institutions
university of chicago
university of illinois at urbana-champaign
indiana university
university of iowa
university of maryland
university of michigan
michigan state university
university of minnesota
university of nebraska – lincoln
northwestern university
ohio state university
penn state university
purdue university
rutgers university
university of wisconsin – madison
center for research libraries
appendix b: urls for dei pages from uborrow consortium websites
university of chicago: https://www.lib.uchicago.edu/about/thelibrary/
university of illinois at urbana-champaign: https://www.library.illinois.edu/about/administration-overview/
indiana university: https://libraries.indiana.edu/administration#panel-about
university of iowa: https://www.lib.uiowa.edu/about/diversity-equity-inclusion/
university of maryland: https://www.lib.umd.edu/about/deans-office/diversity
university of michigan: https://www.lib.umich.edu/about-us/about-library/diversity-equity-inclusion-and-accessibility
michigan state university: https://lib.msu.edu/strategic-plan/
university of minnesota: https://www.lib.umn.edu/about/inclusion
university of nebraska – lincoln: https://libraries.unl.edu/diversity
ohio state university: https://library.osu.edu/equity-diversity-inclusion
penn state university: https://libraries.psu.edu/about/diversity
rutgers university: https://www.libraries.rutgers.edu/diversity
university of wisconsin – madison: https://www.library.wisc.edu/diversity/
endnotes
1 “table 306.30 fall enrollment of u.s. residents in degree-granting postsecondary institutions, by race/ethnicity: selected years, 1976–2028,” national center for education statistics, last modified march 2019, https://nces.ed.gov/programs/digest/d18/tables/dt18_306.30.asp.
2 courtney mcdonald and heidi burkhardt, “library-authored web content and the need for content strategy,” information technology and libraries 38, no. 3 (2019): 8–21, https://doi.org/10.6017/ital.v38i3.11015; courtney mcdonald and heidi burkhardt, “web content strategy in practice within academic libraries,” information technology and libraries 40, no. 1 (2021): 52–98, https://doi.org/10.6017/ital.v40i1.12453.
3 library bill of rights, american library association, amended january 29, 2019, https://www.ala.org/advocacy/intfreedom/librarybill.
4 alice m. cruz, “intentional integration of diversity ideals in academic libraries: a literature review,” the journal of academic librarianship 45, no.
3 (2019): 220–27, https://doi.org/10.1016/j.acalib.2019.02.011; jenny lynne semenza, regina koury, and sandra shropshire, “diversity at work in academic libraries 2010–2015: an annotated bibliography,” collection building 36, no. 3 (2017): 89–95, https://doi.org/10.1108/cb-12-2016-0038.
5 acrl racial and ethnic diversity committee, “diversity standards: cultural competency for academic librarians,” college and research libraries news 73, no. 9 (2012): 551–61, https://doi.org/10.5860/crln.73.9.8835; “acrl plan for excellence,” american library association, revised november 2019, http://www.ala.org/acrl/aboutacrl/strategicplan/stratplan.
6 toni anaya and charlene maxey-harris, diversity and inclusion, spec kit 356 (washington, dc: association of research libraries, september 2017), https://doi.org/10.29242/spec.356.
7 american library association, “acrl, arl, odlos, and pla announce joint cultural competencies task force,” news release, may 18, 2020, https://www.ala.org/news/member-news/2020/05/acrl-arl-odlos-and-pla-announce-joint-cultural-competencies-task-force.
8 lori s. mestre, “visibility of diversity within association of research libraries websites,” the journal of academic librarianship 37, no. 2 (2011): 101–8, https://doi.org/10.1016/j.acalib.2011.02.001.
9 preston salisbury and matthew r. griffis, “academic library mission statements, web sites, and communicating purpose,” the journal of academic librarianship 40, no. 6 (2014): 592–96, https://doi.org/10.1016/j.acalib.2014.07.012.
10 linda r. wadas, “mission statements in academic libraries: a discourse analysis,” library management 38, no. 2/3 (2017): 108–16, https://doi.org/10.1108/lm-07-2016-0054.
11 salisbury and griffis, “academic library mission statements”; wadas, “mission statements in academic libraries.”
12 jeffery l. wilson, katrina a.
meyer, and larry mcneal, “mission and diversity statements: what they do and do not say,” innovative higher education 37 (2012): 125–39, https://doi.org/10.1007/s10755-011-9194-8.
13 laura saunders, “academic libraries’ strategic plans: top trends and under-recognized areas,” the journal of academic librarianship 41, no. 3 (2015): 285–91, https://doi.org/10.1016/j.acalib.2015.03.011.
14 saunders, “academic libraries’ strategic plans”; wadas, “mission statements in academic libraries.”
15 saunders, “academic libraries’ strategic plans.”
16 wilson, meyer, and mcneal, “mission and diversity statements”; saunders, “academic libraries’ strategic plans.”
17 “library borrowing,” big ten academic alliance, accessed november 5, 2020, https://www.btaa.org/library/reciprocal-borrowing.
18 klaus krippendorff, content analysis: an introduction to its methodology, 3rd ed. (los angeles, ca: sage, 2013), 49.
19 jay l. lemke, “travels in hypermodality,” visual communication 1, no. 3 (2002): 299–325, https://doi.org/10.1177/147035720200100303.
20 lemke, “travels in hypermodality.”
21 lemke, “travels in hypermodality.”
22 mestre, “visibility of diversity.”
23 krippendorff, content analysis; lemke, “travels in hypermodality,” 304–5.
24 lemke, “travels in hypermodality,” 304.
25 lemke, “travels in hypermodality,” 300–1.
26 lemke, “travels in hypermodality,” 318.
27 mestre, “visibility of diversity.”
28 lemke, “travels in hypermodality,” 318.
29 lemke, “travels in hypermodality,” 304.
30 “diversity, equity, and inclusion,” rutgers university libraries, accessed april 2, 2021, https://www.libraries.rutgers.edu/about-rutgers-university-libraries/diversity-equity-and-inclusion.
31 “diversity, equity, and inclusion,” rutgers university libraries.
32 “indiana university libraries diversity strategic plan,” indiana university libraries, accessed april 2, 2021, https://libraries.indiana.edu/strategicplan.
33 “diversity, equity, inclusion,” university of maryland libraries, accessed april 2, 2021, https://www.lib.umd.edu/about/deans-office/diversity.
34 “the university of iowa libraries’ commitment to diversity, equity, and inclusion,” iowa university libraries, accessed april 2, 2021, https://www.lib.uiowa.edu/about/diversity-equity-inclusion/.
35 “diversity, equity, and inclusion statement,” university of chicago library, accessed april 2, 2021, https://www.lib.uchicago.edu/about/thelibrary/.
36 “library diversity statement,” university of illinois library diversity committee, accessed april 2, 2021, https://www.library.illinois.edu/about/administration-overview/.
37 lemke, “travels in hypermodality,” 304.
38 “diversity mission statement,” university of nebraska-lincoln libraries, accessed april 2, 2021, https://libraries.unl.edu/diversity.
39 “diversity mission statement,” university of nebraska-lincoln libraries.
40 “our commitment to diversity and inclusion,” university of wisconsin–madison libraries, accessed april 2, 2021, https://www.library.wisc.edu/about/administration/commitment-to-diversity-and-inclusion/.
41 “libraries diversity, equity, inclusion, and accessibility (deia) commitment statement,” penn state university libraries, accessed april 2, 2021, https://libraries.psu.edu/about/diversity.
42 “diversity, equity, inclusion, and accessibility,” university of michigan library, accessed april 2, 2021, https://www.lib.umich.edu/about-us/about-library/diversity-equity-inclusion-and-accessibility.
43 “our commitment to diversity and inclusion,” university of wisconsin–madison libraries.
44 “diversity, equity, inclusion and accessibility (deia),” the ohio state university, university libraries, accessed april 2, 2021, https://library.osu.edu/equity-diversity-inclusion.
45 mark a. puente, associate dean for organizational development, inclusion and diversity, personal communication with the author, november 11, 2020.
46 “about,” northwestern university libraries, accessed april 2, 2021, https://www.library.northwestern.edu/about/index.html.
47 “strategic plan,” northwestern university libraries, accessed july 21, 2020, https://www.library.northwestern.edu/documents/about/2019-21-plan.pdf.
48 claire roccaforte, director of library marketing & communication, personal communication with the author, october 26, 2020.
49 lemke, “travels in hypermodality,” 304.
50 wilson, meyer, and mcneal, “mission and diversity statements.”
51 wilson, meyer, and mcneal, “mission and diversity statements.”
52 lemke, “travels in hypermodality.”
53 “diversity, equity, inclusion, and accessibility,” university of michigan library.
54 kathy rosa and kelsey henke, 2017 ala demographic study (chicago: ala office for research and statistics, 2017): 1–3, https://www.ala.org/tools/sites/ala.org.tools/files/content/draft%20of%20member%20demographics%20survey%2001-11-2017.pdf; diversity counts 2012 tables, (data from diversity counts study, chicago: american library association), https://www.ala.org/aboutala/sites/ala.org.aboutala/files/content/diversity/diversitycounts/diversitycountstables2012.pdf.
55 janice y. kung, k-lee fraser, and dee winn, “diversity initiatives to recruit and retain academic librarians: a systematic review,” college and research libraries 81, no. 1 (2020): 96–108, https://doi.org/10.5860/crl.81.1.96.
56 trevar riley-reid, “breaking down barriers: making it easier for academic librarians of color to stay,” the journal of academic librarianship 43, no. 5 (2017): 392–96, https://doi.org/10.1016/j.acalib.2017.06.017.
57 jaena alabi, “racial microaggressions in academic libraries: results from a survey of minority and non-minority librarians,” the journal of academic librarianship 41, no. 1 (2015): 47–53, https://doi.org/10.1016/j.acalib.2014.10.008; chavella t. pittman, “racial microaggressions: the narratives of african american faculty at a predominantly white university,” the journal of negro education 81, no. 1 (2012): 82–92, https://doi.org/10.7709/jnegroeducation.81.1.0082.
58 william a. smith, tara j. yosso, and daniel g. solorzano, “challenging racial battle fatigue on historically white campuses: a critical race examination of race-related stress,” in covert racism: theories, institutions, and experiences, ed. rodney d. coates (boston: brill, 2011): 211–37.
59 bharat mehra and laverne gray, “an ‘owning up’ of white-ist trends in lis to further real transformations,” library quarterly 90, no. 2 (2020): 189–239, https://doi.org/10.1086/707674.
60 “hate crime statistics, 2019,” federal bureau of investigation, https://ucr.fbi.gov/hate-crime/2019.
61 wadas, “mission statements in academic libraries.”
62 glyn hughes, “racial justice, hegemony, and bias incidents in u.s. higher education,” multicultural perspectives 15, no. 3 (2013): 126–32, https://doi.org/10.1080/15210960.2013.809301.
president’s message: imagination and structure in times of change
bohyun kim
information technology and libraries | december 2018 2
bohyun kim (bohyun.kim.ois@gmail.com) is lita president 2018-19 and chief technology officer & associate professor, university of rhode island libraries, kingston, ri.
in my last column, i talked about the discussion that lita had begun regarding forming a new division to achieve financial sustainability and more transparency, responsiveness, and agility. this proposed new division would merge lita with alcts (association for library collections and technical services) and llama (library leadership and management association). when this topic was brought up and discussed at an open meeting at the 2018 ala annual conference in new orleans, many members of these three divisions expressed interest and excitement. at the same time, there were many requests for more concrete details. you may recall that as a response to those requests, the steering committee, which consists of the presidents, presidents-elect, and executive directors of the three divisions, decided to form four working groups with the aim of providing more complete information about what the new division would look like. today, i am happy to report that the work of the steering committee and the four working groups is well underway. the operations working group that i have been chairing for the last two months submitted its recommendations on november 23. the activities working group finished its report on december 5. the budget and finance working group also submitted its second report. the communications working group continues to engage members of all three divisions by sharing new updates and soliciting opinions and suggestions.
most recently, it started gathering input and feedback on potential names for the new division.1 you can see the charges, member rosters, and current statuses of these four working groups on the ‘current information’ page of the ‘alcts/llama/lita alignment discussion’ community on the ala connect website (https://connect.ala.org/communities/allcommunities/all/all-current-information).2 to give you a glimpse of our work preparing for the proposed new division, i would like to share some of my experience leading the operations working group. the operations working group consisted of nine members, three from each division, in addition to myself as the chair and one staff liaison. we quickly became familiar with the organizational and membership structures of the three divisions. the three divisions are similar to one another in size, but they have slightly different structures. lita has 18 interest groups (ig), 25 committees, and 4 (current) task forces; llama has 7 communities of practice (cop) and 46 discussion groups / committees / task forces; alcts has 5 sections, 42 igs, and 61 committees (20 at the division level and 41 at the section level). all committees and task forces in lita are division-level, while alcts and llama have committees that are either division-level or section/cop-level. alcts is unique in that it elects section chairs, who serve on the division board alongside alcts directors-at-large. alcts also has a separate executive committee in addition to the board. llama has self-governed cops, which are formed with the board’s approval. among all three, lita has the flattest and simplest structure due to its intentional efforts in the past.
for example, there are neither sections nor communities of practice in lita, and the lita board eliminated the executive committee a few years ago. the steering committee of the three divisions agreed upon several guiding principles for the potential merger. these include (i) open, flexible, and straightforward member engagement, (ii) simplified and streamlined processes, and (iii) a governance and coordinating structure that engages members and staff in meaningful and productive work. the challenge is how to translate those guiding principles into a specific organizational structure, membership structure, and bylaws. clearly, some shuffling of existing sections, cops, and igs in the three divisions will be necessary to make the new division as effective, agile, and responsive as promised. however, when and how should such consolidation take place? furthermore, what kind of guidance should the new division provide for members to re-organize themselves into a new and better structure? these are not easy questions to answer. nor are they something that can be immediately answered. some changes may need to go through multiple stages before they are complete. this may concern some members. they may prefer all these questions to have definitive answers before they decide whether to support the proposed new division. people often assume that a change takes place after a big vision is formed, and then the change is executed by a clear plan that directly translates that vision into reality in an orderly fashion. however, that is rarely how a change takes place in reality. more often than not, a possible change builds up its own pressure, showing up in a variety of forms, on multiple fronts, and through many different people while getting stronger, until the idea of this change gains enough urgency.
finally, some vision of the change is crafted to give a form to that idea. the vision for a change also does not materialize in one fell swoop. it often begins with incomplete details and ideas that may even conflict with one another in its first iteration. it is up to all of us to sort them out and make them consistent, so that they become operational in the real world. recently, the steering committee reached an agreement regarding the final version of the mission, vision, and values of the proposed new division. i hope these resonate with our members and guide us well in navigating challenges ahead if the membership votes in favor of the proposal.
the new division’s mission: we connect library and information practitioners in all career stages and from all organization types with expertise, colleagues, and professional development to empower transformation in technology, collections, and leadership, and to advocate for access to information for all.
the new division’s vision: we shape the future of libraries and catalyze innovation across boundaries. the new division [name to be determined] amplifies diverse voices and advocates for equal and equitable access to information for all.
the new division’s values: shared and celebrated expertise; strategically chosen work that makes a difference; transparent, equitable, flexible, and inclusive structures; empowering framework for experimental and proven approaches; intentional amplification of diverse perspectives; expansive collaboration to become better together.
imagination and structure in times of change | kim 4 https://doi.org/10.6017/ital.v37i4.10850
in deciding on all operational and logistical details for the new division, the most important criteria will be whether a proposed change will advance the vision and mission of the new division and how well it aligns with the agreed-upon values and guiding principles.
the steering committee and the working groups are busy finalizing the details about the new division. those details will be first reviewed by the board of each division and then shared with the membership at the midwinter meeting for feedback. i did not anticipate that during my service as the lita president-elect and president, i would be leading a change as great as dissolving lita and forming a new division with two other divisions, alcts and llama. it has been an adventure filled with many surprises, difficulties, and challenges, to say the least. this adventure taught me a great deal about leading a change for an organization at a high level. when we move from the high-level vision of a change to the matter of details deep in the weeds, it is easy to lose sight of the original aspiration and goal that led us to the change in the first place. trying to determine as many logistical details as possible becomes tempting to those in a leadership role because we all want to reassure people in our organizations at a time of uncertainty and to make the transition smooth. however, creating a new division itself is a huge change at the highest level. it would be wrong to backtrack on the original goal to make the transition smooth. for it is the original goal that requires a transition, not vice versa. i believe those in a leadership role should accept that their most important work during the time of change is not to try to wrangle logistics at all levels but to keep things on track and moving in the direction of the original aspiration and goal. lita and the two other divisions have many talented and capable members who will be happy to lend a hand in developing new logistics. the responsibility of leaders is to create space where those people can achieve that freely and swiftly and to provide the right amount of framework and guidance.
i hope that all lita members and those associated and involved with lita see themselves in the vision, mission, and values of the new division, embrace changes from the lowest to the highest level, and work together towards making the new vision a reality.
1 you can participate in this process at https://connect.ala.org/communities/community-home/digestviewer/viewthread?groupid=109804&messagekey=625e8823-21e0-419c-ab2b-1cb4a82b8d09 and http://www.allourideas.org/newdivisionname.
2 this ‘current information’ page will be updated as the plans for the new division develop.
public libraries leading the way
google us! capital area district libraries gets noticed with google ads grant
sheryl cormicle knox and trenton m. smiley
information technology and libraries | march 2020 https://doi.org/10.6017/ital.v39i1.12089
sheryl cormicle knox (knoxs@cadl.org) is technology director for capital area district libraries. trenton m. smiley (smileyt@cadl.org) is marketing & communications director for capital area district libraries.
increased choices in the marketplace are forcing libraries to pay much more attention to how they market themselves. libraries can no longer simply employ an inward marketing approach that speaks to current users through printed materials and promotional signage plastered on the walls. furthermore, they cannot rely on occasional mentions by the local media as the primary driver of new users.
that’s why in 2016, capital area district libraries (cadl), a 13-branch library system in and around lansing, michigan, began using more digital tactics as a cost-effective way to increase our marketing reach and to have more control over promoting the right service, at the right time, to the right person. one example of these tactics is ad placement on the weather channel app. this placement allows ads about digital services like overdrive and hoopla to appear when certain weather conditions, such as a snowstorm, occur in the area. in 2017, while attending the library marketing and communications conference in dallas, our marketing and communications director had the good fortune of sitting in on a presentation by trey gordner and bill mott from koios (www.koios.co) on how to receive up to $10,000 of in-kind advertising every month from google ad grants (www.google.com/grants). during this presentation, koios offered participants a 60-day trial of their services to help secure the google ad grants and create a few starter campaigns. google ads are text-based and appear in the top section of google's search results, along with the ads of paying advertisers. nonprofits in the google ad grants program can set up various ad campaigns to promote whatever they like: the overall brand of the library, the collection, events, meeting room offerings, or any other product or service. the appearance of each google ad is triggered by keywords chosen for each campaign. after cadl's trial period expired, we decided to retain koios to oversee the google ad grants project. while the library has used google ads for the sharing of video, we had not done much with keyword advertising. so, we were excited to learn more about the process of using keywords and the funding available through the grant. we viewed this as a great new tool to add to our marketing toolbox.
it would help us achieve a few of our marketing goals: expanding our overall marketing reach and digital footprint by 50 percent; increasing the library’s digital advertisement budget by 300 percent (by using alternative funding); and promoting the right service at the right time.
getting started
koios coached us through the slalom course of obtaining accounts and setting them up. to secure the monthly ad grant, we first obtained a validation key from tech soup (www.techsoup.org), the nonprofit that makes technology accessible to other nonprofits and libraries. that, in turn, pre-qualified us for a google for nonprofits account. (at the time, we were able to get a validation token from our existing tech soup account, but koios currently recommends starting by registering a 501(c)(3) friends organization or library foundation with tech soup whenever possible.) after creating our google for nonprofits account, we used the same account username to create a google ads account. finally, to work efficiently with koios, we provided them access to our google analytics property (which we have configured to scrub patron identifying information) and our google tag manager account (with the ability to create tags that we in turn review and approve). if you are taking the do-it-yourself approach, google has a step-by-step google ad grants activation guide and extensive help online.
designing campaigns
spending money well is hard work, and that holds true with keyword search ads as well. there are some performance and ad quality requirements in the grant program that must be observed to retain your monthly allotment. understanding these guidelines and implementing campaigns that respect them, while working well enough to spend your grant allocation, requires study and patience.
again, we relied on koios to guide us. they helped us create campaigns and ad groups within those campaigns that were effective within the grant program.
figure 1. example of minecraft title keyword landing page created by koios.
in august 2018, we started with campaigns for general branding awareness that included ads aimed at people actively searching for local libraries and our core services. these ads funnel users to our homepage and our online card signup. they are configured to display only to searchers who are geographically located in our service area. this campaign has been grown and perfected over 18 months into one of our most successful campaigns, garnering over 2,300 impressions and 650 clicks in january 2020, yet it spends just $450 of our grant funds. another consistent performer for us has been our digital media campaign with ads targeting users searching for ebooks and audiobooks. by june 2019 we had grown our grant spend to $1,500 a month using 27 different campaigns. the game changer for us has been working with koios to create campaigns based on an export of marc records from our catalog. we worked with koios to massage this data into a very simple pseudo-catalog of landing pages based on item titles. the landing page is very simple and seo friendly so that it ranks well in the split-second ad auction competition that determines whether your ad will be displayed. it has cover images and clear calls to action, loads fast, is mobile friendly, and communicates the breadth of formats held by the library (see figure 1). clicking the item title or the borrow button sends users straight into our full catalog to get more information, request the item, or link to the digital version.
figure 2. a user search in google for “dad jokes” showing a catalog campaign ad. grant program ads are displayed below paid ads. the format of the ad may vary as well.
this version shows several extensions, like phone number, site links, and directions links.
figure 3. the landing page displayed to the searcher after they click on the ad and the resulting catalog page if the searcher clicks the borrow button.
in google ads, koios created 14 catalog campaigns out of the roughly 250,000 titles we sent them. each campaign has keywords (single words and phrases from titles) derived from roughly 18,000 titles ranked by how frequently they are used in google search. again, these ads are limited geographically to our service area. figures 2 and 3 illustrate what a google searcher in ingham county, michigan, potentially encounters when searching for “dad jokes”. since their inception in september 2019, these catalog campaigns have been top performers for us, generating clickthrough rates of 8-15% and a couple thousand additional ad clicks monthly, the aggregation of a small number of clicks on any one ad from our “long tail” of titles. we are now spending over $5,000 of our grant funds and garnering nearly 23,000 impressions and 3,000 ad clicks monthly.
results
in general, we find that our google ads have succeeded in drawing additional new visitors to our website. using our long-established google analytics implementation that measures visits to our website and catalog combined, we compared the third quarter of 2018, when we were ramping up our google ad grants campaigns, to the third quarter of 2019, after our catalog campaign was firmly established. the summary numbers are encouraging. the number of users is up 17%, and number of sessions is up 4%. within the overall rise in users, returning users are up 9%, but new users are up 25%. therefore, we are getting more of those coveted, elusive “non-library-users” to visit us online.
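as a rough check on the campaign figures quoted in this column, clickthrough rate is simply clicks divided by impressions. the following is a minimal python sketch, not part of cadl's or koios's tooling; the function name `clickthrough_rate` is hypothetical, and the inputs are the rounded figures given in the text (“over 2,300 impressions and 650 clicks” for the branding campaign in january 2020, and “nearly 23,000 impressions and 3,000 ad clicks” monthly across all campaigns), so the results are approximate.

```python
def clickthrough_rate(clicks: int, impressions: int) -> float:
    """Return clicks as a percentage of impressions."""
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    return 100.0 * clicks / impressions


# Rounded figures quoted in the column; the true monthly values will differ slightly.
branding_ctr = clickthrough_rate(650, 2_300)    # branding campaign, January 2020
overall_ctr = clickthrough_rate(3_000, 23_000)  # all campaigns combined, monthly

print(f"branding campaign: {branding_ctr:.1f}%")
print(f"all campaigns:     {overall_ctr:.1f}%")
```

on these rounded inputs, the combined rate comes to about 13%, inside the 8-15% range reported for the catalog campaigns, while the geographically targeted branding campaign comes to about 28%, roughly twice that.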
when comparing the behavior of new and returning visitors, we also see that the overall increase in sessions was achieved despite the headwind of a 4% decline in returning visitor sessions. however, are the new visitors engaging? perhaps the most tangible measure of engagement for a public library catalog is placing holds. we have a google analytics conversion goal that measures those holds. the rate of conversion on the hold goal among new visitors rose 7%, while dropping 13% among returning visitors. from other analysis, we know that our highly-engaged members are migrating to our mobile app and to digital formats, so the drop for returning users is explainable and the rise among new visitors is hopeful. we are working on ways to study these new visitors more closely so that we can discover and remove more of the barriers that keep them from becoming highly engaged members of their public library.
future plans
with the help of koios, new campaigns will be created to promote our blogs and podcasts. we will also link a campaign to our demco events database. finally, in partnership with koios, we will work with patron point to incorporate our automated email marketing system into google ad campaigns. we will add campaigns for pop-up ads that encourage library card signup through our online registration system. once someone signs up for a library card online, the system will trigger a welcome email that promotes some of our core services. this on-boarding set-up will also include an opportunity for the new cardholder to fill out a form to tailor content in future emails to their interests. through all these means, cadl leads the way in delivering the right service, at the right time, to the right person.
public libraries leading the way we can do it for free!
using freeware for online patron engagement karin suni and christopher a. brown information technology and libraries | march 2021 https://doi.org/10.6017/ital.v40i1.13257 karin suni (sunik@freelibrary.org) is curator, theatre collection, the free library of philadelphia. christopher a. brown (brownc@freelibrary.org) is curator, children’s literature research collection, the free library of philadelphia. © 2021. “public libraries leading the way” is a regular column spotlighting technology in public libraries. in the early weeks of the pandemic, the special collections division of the free library of philadelphia (https://freelibrary.org/) responded to the library’s call for fun and interactive online engagement. initially staff members released games and buzzfeed-inspired lists via various social media accounts to amuse patrons, distract from the lockdown, and provide educational programming. as the list of activities grew, we realized this content needed a more substantial home; the return on investment of time for the development and production of an online game to be released once on social media was not sufficient. activities and passive programming that took hours to create could easily fall victim to social media’s algorithms and be quickly buried in a patron’s feed. the free library’s official blog was an insufficient option because it promoted all library programming, and our goal was to highlight the value of our division and the materials housed within it. we resolved these issues by creating an online repository solely with freeware systems (https://bit.ly/funwithflpspeccoll). the repository provides a stable landing page wherein the special collections division content builds meaningful connections with patrons of all ages. this model can be readily adapted and is a valuable tool for library workers promoting their own online engagement. 
repository framework
it was clear that our division could not add to the burden of an overworked it staff by requesting support for digital engagement. we needed to seek external alternatives that would interest patrons and could be managed with limited training. before we began our search, we brainstormed a list of requirements:
• an inexpensive and user-friendly hosting platform
• a pleasing look and easy navigation
• the ability to be updated frequently and easily
• the flexibility to adapt and expand as our requirements change
our search led us to the google suite of products, specifically google sites and google drawings. google sites and google drawings integrated perfectly with each other, and we appreciated their usability and relative simplicity. once we selected the software, we knew we needed a list of best practices to guide the repository’s creation:
• to establish a visual connection with our official website, the repository would primarily use the free library’s branded color scheme.
• all thumbnails created would be square, allowing us to reuse the image as promotional material on different social media accounts.
• all members of the division can create content, but the ability to update and edit the repository would remain limited to ensure consistency.
these guidelines have proven effective. the color scheme and thumbnail rules formed a framework wherein we could work productively without “reinventing the wheel.” limiting administrative abilities has allowed us to maintain a controlled vocabulary within the repository, better unifying the content.
repository software
the google suite, specifically google sites, is advantageous for library workers looking to create professional-looking content quickly.
it is free with a google account and built-in templates allow users to build a fully functional website within a few hours with little-to-no design experience. as with all freeware, google sites has quirks. the foremost is that while there are options for customization, these options are finite. there are a limited number of layout, header, and font designs meaning that anyone using the software must temper their vision to fit within the confines of the program. google drawings is far more flexible, in part because it is a much simpler program. users familiar with software like powerpoint or ms paint have the ability to design images for headers, thumbnails, etc. two drawbacks we encountered with this freeware are the restrictions on image upload size (a consideration for our division given the archival files used in our digital collections) and the limited ability to create word art. for our division, the advantages of these software products outweigh their limitations. content framework the repository houses programming devised primarily with freeware. an early discovery was a suite of activities from flippity (https://www.flippity.net). designed for educational use, flippity provides templates for a variety of online activities including memory games, matching games, and board games. our primary focus has been on the first two, although we continue to explore new aspects of this suite as templates are added. flippity works with google sheets and can integrate images from google drawings. jigsaw planet (https://jigsawplanet.net/) has been used extensively by libraries and museums during the pandemic. it allows creators to easily turn images into puzzles that are played online, either on the site itself or through embedding the puzzle. the site allows registered users to access leaderboards, and it allows creators to track how many times puzzles have been played. 
in addition to the ease of use, the major benefit of jigsaw planet is that the patron can customize their experience by changing the number of pieces to fit their preferred level of difficulty.
the desire for audio and video content has surged over the last several months, and we have sought to meet that need through the use of a variety of software. in regard to video, youtube is not a new tool, but the majority of our pre-pandemic programs were not filmed. with the shift to crowdcast and zoom, we now have a library of online lectures and other events that have been uploaded to youtube and can be viewed repeatedly and at any time. with a dedicated home for this content, we have been inspired to seek out older videos of special collections programming across multiple channels and link them to the repository.
one of the newest additions to our offerings has been the podcast story search from special collections (http://bit.ly/flpstorysearch), which explores stories based on, inspired by, or connected to material artifacts. the podcast is recorded and edited using zencastr and audacity and is posted on anchor, which also distributes it to major listening apps.
in recent weeks, our division has added images, blog posts, and additional content for current and past exhibitions. this is the first formal exhibition compilation since the special collections division began in 2015, and we are delighted that it is available for the public to explore. the material is arranged using templates and tools available in google sites, allowing patrons to view image carousels, exhibition tags, and past programs.
the inclusion of this material marks a shift away from the repository functioning as a response to the need for pandemic-related content to a living history of our division and our work promoting the special collections of the free library. accessibility accessibility and equity of access lie at the core of library service. sadly, we were not initially focused on this point, and our content was not fully accessible, e.g., text was presented in thumbnails only which limited the use of screen readers to relay information. as the content expanded, we sought to make the space as inclusive as the freeware limits allowed. alternative text was added to images and information was not limited within thumbnails. this is an ongoing process, but one that is necessary to reach as many patrons as possible. analytics site visits and other statistics for a library’s online presence are always important, but especially so during the pandemic when restricted physical access has driven more patrons to online resources. our plan for capturing this information was two-pronged. first, we used bit.ly to create customized, trackable links for our content. these are used within the repository and on social media and in other online promotions. this has proven to increase repository traffic while providing information on how patrons discover our content. the statistics generated from bit.ly are only available for 30 days for free accounts, albeit in a rolling 30-day window. knowing this, we transcribe the statistics monthly into a spreadsheet to maintain a consistent account of patron access. our second prong is google analytics, a freeware option that only tracks data within the repository. google analytics connects a single google account to google sites, but the integration is seamless and the data remains available indefinitely. this provides a visual breakdown of statistics, including maps and graphs that are easily shared with other stakeholders. 
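the monthly transcription step described above can be sketched in a few lines of python. this is only an illustration under stated assumptions: the column names, the file name, and the click counts are hypothetical placeholders, and the column does not describe the layout of the authors' actual spreadsheet. the sketch appends one row per short link each month to a csv file, so the record outlives the 30-day window of a free account.

```python
import csv
from pathlib import Path

# Hypothetical column layout; the authors' real spreadsheet is not described.
FIELDS = ["month", "link", "clicks"]


def append_monthly_stats(csv_path, month, clicks_by_link):
    """Append one row per short link for the given month (e.g. "2021-01")."""
    path = Path(csv_path)
    # Write the header only when the file is new or empty.
    write_header = not path.exists() or path.stat().st_size == 0
    with path.open("a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        if write_header:
            writer.writeheader()
        for link, clicks in sorted(clicks_by_link.items()):
            writer.writerow({"month": month, "link": link, "clicks": clicks})


def total_clicks(csv_path):
    """Sum the recorded clicks per link across every month in the file."""
    totals = {}
    with Path(csv_path).open(newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            totals[row["link"]] = totals.get(row["link"], 0) + int(row["clicks"])
    return totals
```

a call such as `append_monthly_stats("stats.csv", "2021-01", {"funwithflpspeccoll": 10})` would be run once a month with figures copied by hand from the link dashboard; the count shown here is a made-up placeholder, not a reported result.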
by using both tools we are able to surmise who is visiting the repository, where they are finding the links, and which sections are popular with our patrons.
conclusion
the special collections repository was created in response to a growing need for online patron engagement during the early weeks of the pandemic. our division strove to engage the public with fun, educational programming and activities primarily using freeware. this has proven to be successful with the general public and members of our division. the statistics from the site have both informed content creation and engendered a better appreciation for the repository from our administration. as we move forward, the repository is evolving into a comprehensive collection of what the special collections division does and how we meet the need for patron engagement online and in person. it is a framework that can be used by library workers across a multitude of areas and specialties, housing activities from story times and passive programming to book clubs and lectures.
a regional serials program under national serials data program auspices: discussion paper prepared for ad hoc serials discussion group
audrey n. grosch: university of minnesota, minneapolis.
purpose of the program
a regionally organized program for serials bibliography is proposed because of the large volume of complex data needing control and the many purposes to which the data can be put in support of regional or local needs. the size of the data base comprising serials bibliography in the united states alone may exceed 2 million titles.
gregory's union list of serials represents the largest single source of controlled titles-450,000 in the third edition.1 at a minimum its successor publication, new serial titles (nst), contains 325,000 titles.2 therefore, some 775,000 titles are under control for identification, interlibrary transfer, and location purposes. the data base requirements for the isds/nsdp record comprise several dozen fields. when added together with other information now found in a marc serials record for cataloging purposes and when further coupled with explicit holdings information needed for regional networks, the file size would exceed 1 billion characters. therefore, the systems design basis as well as the functional purposes of such data would encourage us to explore a regionally organized serials program.
another even more overwhelming factor which gives support to a regional system is the mixture of rules applied in cataloging. the nsdp data establish conclusive identity via the key-title without a full bibliographic record. however, libraries will not suddenly drop their local practices or use of lc cataloging copy, affecting their internal arrangements of collections. therefore, the most prudent course would seem to be reconciliation of past practice with the new system and development of a machine-readable serials record specification which accommodates the requirements of the isds/nsdp and the cataloging rules of present libraries. a regionally organized program should be highly responsive to such a reconciliation.
journal of library automation vol. 6/4 december 1973
the regional serials program
within the framework of the isds and its respective national center (nsdp in the u.s.), each country will proceed to develop its serials program. the united states program must be organized with nsdp at its center. figure 1 is a schematic showing the relationships of nsdp and the other units within this proposed regional program.
this figure also shows the bi-directional communications flow between the various units.
[figure 1: schematic; recoverable labels are n.a.l., n.l.m., regional serials data centers, and local libraries]
fig. 1. a regional serials program organization and lines of communication.
the three national libraries originate bibliographic data for use in the nsdp system and also can continue to function as providers of cataloging data to the library community via cards, marc tapes, the national library of medicine's serline, and any new services of this type. marc, serline or other machine readable sources should ultimately be the method by which raw data from the national libraries would enter nsdp. the regional serials data centers would receive information from nsdp and also would provide certain kinds of data for the nsdp data base which would be nonduplicative of the national libraries input. local libraries would interface to their regional serials data center, supplying information for the region in a shared environment. hopefully, the regional centers could take over the functions now performed by the marc serials service, supplying products requested by the libraries locally and obviating the need of local libraries subscribing to marc serial tapes.
the serials environment may be organized into:
• a local library serials management component;
• a network serials management component; and,
• an international/national serials system component.
a regional program can address itself to these three facets of the environment, in the following manner. the local library, whatever its size or type, must develop some system for internal control of its serials collections.
this author feels that the local library should be free to adopt either nsdp or anglo-american cataloging rules for current serials but need not change its retrospective records unless it can really afford to, if such data became available in the future through the program herein proposed. ideally, such a conversion and uniformity has much to offer the library user, but costs would be too high for most libraries. also, small libraries and certain libraries, because of their physical conditions, may need to preserve differences. therefore, the local library can be urged to adopt nsdp/aacr as standard but cannot be forced to standardize because of the large retrospective conversion problem.
the local library can develop its serials system independently or through partial or full support through a network-the regional serials data center under this plan. independent of whether it chooses to use the regional serials data center for such services, its needs remain the same:
• to identify the serial in hand;
• to obtain cataloging copy for it;
• to service its subscriptions, claims, binding; and
• to produce some form of catalog showing its holdings and arranged to reflect its specific shelf arrangement.
the networking serials management component represents the development of union catalogs, wherein members of the network can:
• identify a serial;
• identify who holds a specific issue;
• provide interlibrary loan/photocopy service to obtain the actual document; and
• provide a way to consolidate fragmentary sets, eliminate unnecessary duplication or provide more copies when needed, and broaden subject coverage among the network.
union catalogs, document delivery services, and bibliographic reference assistance comprise the products that are used by the network component.
the international/national serials system component must be the vehicle to provide a uniform bibliographic description and local components. the issn and the nsdp record provide the means to this end.
the machine-readable record at the regional serials data center would comprise at least the key elements of the international record, i.e., key-title as supplied via nsdp from either national library or regional center input and the issn. beyond that, the regional center bibliographic data base should be structured to provide full bibliographic description according to aacr rules for current publications, accommodating the retrospective data as it is found in its region-at least until some national effort at conversion of superimposed records can be mounted. moreover, the regional center should be tailored to perform the functions that its local libraries deem important. this obviously will vary with the region and with time as regions will vary in size and their mix of libraries.
figure 2 enumerates the functions and responsibilities of the respective parts of the regionally organized serials program illustrated in figure 1. this list is not meant to be all inclusive as other functions could be recognized by other libraries or regions.
national serials data program
1. assign issn/key-title to titles reported via national libraries and regional serials data centers.
2. create and maintain data base, indexes of key-titles, issn's, etc.
3. create and maintain or accept surrogates from regional serials data centers.
4. maintain essential isds data elements, other isds elements.
5. maintain essential non-isds data elements or national library extensions via other data elements.
6. publish indexes to the data base at nsdp for use by regional serials data centers and national libraries.
7. transmit issn's and key-titles as assigned to the regional serials data centers and the national libraries.
8. carry out publisher relations to convince publishers of the need to use issn, etc. as well as foster some additional uniformity wherever possible.
national libraries
1. provide cataloging copy-surrogate for newly cataloged titles-using new or available mechanisms such as marc, serline or nst to nsdp for key-title/issn assignments.
2. provide subscription card/tape cataloging copy to local libraries until a dual definition national serials data program record can be developed and revised through the regional centers.
3. maintain national union list functions via nst, eventually coordinating with nsdp to provide key-title and issn entry points.
regional centers
1. create and maintain regional center serials data base reflecting holdings of libraries in the region.
2. forward new title bibliographic/surrogate data to nsdp for issn/key-title assignment if not processed by nsdp through national libraries or another regional center.
3. publish union catalogs or holdings in region for network library use.
4. process and provide machine/manual cataloging data for region use with marc, serline, nst type processing.
5. forward retrospective data to nsdp converted to their requirements for retrospective issn/key-title assignment and addition to the central store at nsdp.
6. develop local library services as required. for example:
a. catalog card production.
b. book catalog production.
c. oclc type serials check-in, claiming, binding, subscription system.
d. document/photocopy delivery system for interlibrary loan.
e. coordinated acquisitions program for new serials, added copies.
f. coordinate interlibrary transfers and consolidations of retrospective holdings.
7. communicate with other centers or nsdp to locate titles not supplied by local region.
local library
1. notify regional center of new titles, changes, corrections to maintain regional data base.
2. use local library services as deemed necessary from regional center.
3. participate in document delivery/resource sharing/set consolidation within the region network.
fig. 2. functions and responsibilities of the respective parts of a regionally organized serials program.
figure 3 shows the basic tasks and their current status with respect to developing such a program, provided that funding became available for at least a pilot regional center. the isds record specifications are presently available and implemented through the nsdp data base. the nsdp/university of minnesota contract for a feasibility study will determine the costs and required bibliographic and programming support to convert locally generated data bases, i.e., the minnesota union list of serials (muls) to nsdp requirements. the results of this feasibility study will determine the prospects for funding any proposals for actual conversion of local data bases such as muls. the current muls data base represents one model of a regional center data base, with the addition of issn and key-title as the links to the nsdp system and/or augmentation to provide other kinds of services to local libraries participating in muls.
creation of the system of regional centers would depend upon proposing and funding such a program based on the above work. establishment of a pilot regional center would be one manner in which such a plan could be tested, followed by further center establishment based on the results of the pilot program. with such a system in routine operation, the nsdp in its central role could focus its attention on the retrospective conversion-issn and key-title assignment possibly via some nationally coordinated cooperative venture among the regional centers or other contractors.
task and status
1. development of isds record specifications for nsdp and other centers. status: presently available.
2. development of software/systems to convert locally generated marc based serials data bases to nsdp requirements. status: feasibility study contracted.
3. conversion of locally generated marc based data bases to nsdp requirements and issn/key-title assignments for unique titles. status: future.
4. design of regional center data base record-basic data element specifications. status: future-model available in muls with addition of key-title/issn as linking fields between nsdp record and regional center record.
5. creation of regional centers-pilot center establishment, followed by other regional centers and full implementation of a regional plan. status: future.
6. possible retrospective serial title conversion. status: future-on a nationally coordinated basis.
fig. 3. requirements for establishment of a regionally organized serials program.
conclusion
obviously, any system for a serials program will have its problems. a regionally organized system would have greater responsiveness to local and networking needs than a large centralized program. moreover, certain technical problems of data base manipulation would be easier to solve under this organization. no attempt at greater specificity has been made here, as the purpose of this paper is to describe the nucleus of one way in which a serials program for the u.s. could be structured for maximum local library and networking benefit. let the discussion flow!
references
1. winifred gregory, union list of serials in libraries of the u.s. and canada (2d ed.; new york: wilson, 1943).
2. new serial titles (washington, d.c.: library of congress, 1950).
article
the current state and challenges in democratizing small museums’ collections online
avgoustinos avgousti and georgios papaioannou
information technology and libraries | march 2023 https://doi.org/10.6017/ital.v42i1.14099
avgoustinos avgousti (a.avgousti@cyi.ac.cy) is a researcher, the cyprus institute, cyprus. georgios papaioannou (gpapaioa@ionio.gr) is associate professor in museum studies and director of the museology research laboratory, ionian university, corfu, greece. © 2023.
abstract
this article focuses on the problematic democratization of small museum collections online in cyprus. while the web has enabled cultural heritage organizations to democratize information to diverse audiences, numerous small museums do not enjoy the fruits of this digital revolution; many of them cannot democratize their collections online. the current literature provides insight into small and large museums’ challenges worldwide. however, we do not have any knowledge concerning small cypriot museums. this article aims to fill this gap by raising the following research question: what is the current state of small museum collections online in cyprus, and what challenges do they face in democratizing their collections online? we present our empirical results from the interview summaries gathered from six small museums.
introduction
cultural heritage digitization and online accessibility offer an unprecedented opportunity to democratize museum collections. online collections, typically presented on institutional websites, represent the world’s culture and reflect an increasing trend toward a world where information is digitally preserved, stored, accessed, and disseminated instantaneously through a global and interconnected digital network. consumers search for information on the web, and the web has enabled cultural heritage institutions to democratize their collections online, yet most small museums have not benefited from this process and do not have their collections online. as a result, digital versions of small museum collections are largely inaccessible, meaning less access to information and knowledge. there is a clear need for small museums to remain relevant by publishing their collections online. small museums must move quickly into the digital world. current literature provides insights into the challenges they face worldwide. however, we do not have knowledge regarding the situation in cyprus.
this study aims to fill this gap by researching small museums in cyprus and asking the following research question: what is the current state of small museum collections online in cyprus, and what challenges do they face in democratizing their collections online?
what is a small museum?
museums are defined as small based on their annual budget and number of staff. the american association for state and local history (aaslh) defines museums as small if they have an annual budget of less than $250,000 and limited staff with multiple responsibilities. other factors such as the size of collections and the physical size of the museum could further categorize a museum as small. katz set the same budget and set the staff number at five or less.1 honeysett and falkowski put the budget at $300,000 and five or fewer employees.2 miller notes that the average small museum has just two full-time employees and a budget of less than $90,000.3 watson by contrast defines small museums as ones that grew out of the community they serve.4
for the purposes of this article, a small museum is one with more than one but fewer than five full-time employees, not including museum custodians. categorizing a museum based on its budget is difficult and contentious since often museum staff are funded by another body, such as a municipality.
literature review
cultural heritage institutions such as galleries, libraries, archives, and museums (glams) were among the first organizations to digitize information by creating databases whose access was granted locally to institutional cardholders (horan, 2013).
the process of digitization is of paramount importance, with museums eager to offer online access to their physical collections.5 online collections provide a range of opportunities, including the facilitation of knowledge sharing and the creation of a participatory environment that promotes information exchange.6 through their online presence, museums can present their collections to a global audience.7 the accessibility of digital knowledge opens the door for further knowledge to be generated and enhances the educational reach of cultural institutions.8 online collections create opportunities for small and geographically isolated museums to deliver learning opportunities to audiences around the world, something all museums should aim for.9 while larger museums have done well, smaller ones have not been as successful.10 much of past knowledge is stored in small museums, whose importance in preserving cultural heritage should not be underestimated.11 they sometimes add far more to social capital than larger national ones.12
though the need for museum collections online is recognized, there are limitations. if it was simple, every museum would be online.13 however, most small museums are not online.14 their collections remain digitally inaccessible to future generations.15 oberoi and arnold have gone so far as to maintain that information absent from the internet can be regarded as nonexistent.16 on the other hand, in rare cases where small museums have their collections online, they target human consumers.17 the information is stored in isolated data silos incompatible with automatic processing.
the challenge is to make collections discoverable via online search engines and metadata aggregators.18 the issue appears to have been ongoing for many years as gergatsoulis and lilis maintain that the web lacks semantic information and it has proved challenging to process such a massive set of interconnected data as mentioned 18 years ago.19 clearly, online collections must be understood and used efficiently both by humans and machines, because machine-consumable content will end up in human-consumable content.20 the current state and challenges small museums find it difficult to publish their collections online. most large museums have undergone a digital transformation, but few small ones have.21 the museum survey by tongue in 2017 showed that the number of museums planning to publish their collections online decreased from 40 percent in 2016 to 24 percent in 2017, although only 8 percent had already gone online by 2018.22 the survey in 2020 by network of european museum organisations (nemo) on digitization in european museums shows that an average of 20 percent of museum collections in europe as a whole are online, and the median is 10 percent. 
surprisingly, 43 percent are digitized but not online, meaning the public has access to less than half of the existing digital items.23 a report by flynn in 2018 reveals that most historical society collections are not accessible online.24 according to honeysett and falkowski, the majority of museums in their survey in the us have less than 10 percent of their collections online.25 according to a survey by axiell in 2017, only 21 percent of museums have a complete collection online, 27 percent more than half, 38 percent less than half, and 14 percent have no collections online.26 in 2020 beaudoin pointed out that approximately 32 percent of us art museums with holdings provide online collection systems that are openly available to the public, while 13 percent do not even have an institutional website.27
avgousti, papaioannou, and gouveia indicated that even if small museums manage to give online access to their collections, they are often stored in isolated data silos incompatible with automatic processing.28 the museum survey by vernon systems in 2016 showed that 82 percent of museums do not use any machine-consumable standards.29 furthermore, only 11.9 percent use dublin core as a metadata standard, 3.6 percent darwin core, 1.2 percent ead and 8.3 percent other. further, the existence of individual collections online, maintained by different organizations, brings challenges to the discoverability, sharing, and reuse of resources.30 metadata aggregation is a frequently utilized strategy in which centralized organizations, such as europeana,31 collect associated metadata to make resources more discoverable and usable.
why do we witness such low levels of online publishing in small museums? and why are online collections not in a format that is searchable and easy to find?
according to the relevant literature, small museums lack the resources and skilled staff to move to the digital age.

current obstacles

a key obstacle in the digitization of small museum collections is insufficient resources. large cultural heritage institutions have much greater access to funds.32 according to klimper, while the internet has had a tremendous impact on the democratization of european culture, insufficient financial resources remain a significant challenge for small museums.33 irina oberlander from the institute of cultural memory has pointed out that small and medium-sized museums with limited budgets are victims of the digital age.34 laine-zamojska stressed that small museums, which are often entirely run by volunteers, cannot afford to digitize or make their collections available to a wider audience.35 therefore, online access to cultural heritage in these small institutions is minimal. the nemo report in 2020 showed that insufficient staff is another major obstacle for museum digitization and online accessibility.36 small museums are understaffed.37 this is confirmed by gallery systems, who noted that small museums face their own set of collection challenges.38 with smaller team sizes and limited staff hours, it is difficult to operate. the museum survey by tongue in 2017 showed that 73 percent of museums did not have dedicated staff to manage online collections.39 this means that collection management is given to staff who already have a full job description. avgousti, papaioannou, and gouveia pointed out that museums do not usually hire experts to plan, develop, deploy, and maintain a digital collection, but delegate the task to museum staff who are often limited in technological skills,40 while wigodner and kearney mentioned that small museums typically have fewer (if any) employees devoted to web publishing.41 fewer employees often means a lack of skilled personnel.
in the aforementioned survey, no museum with fewer than 50 staff members reported employing a computer expert.42 additionally, honeysett and falkowski mention that two-thirds of museums have one or no it personnel.43 the same concern has also been observed in small university libraries, where 67 percent did not have an it expert.44 further, small museums do not have suitable technology, and in many cases, the staff is not technologically adept.45 klimper likewise affirms that the internet’s promise of providing access to european culture is hampered by a lack of technological skills.46 considerable expertise in semantic web technologies is needed to expose machine-consumable content to the “web of data.”47 finally, in-depth knowledge of modeling, along with programming skills, is also essential.

complexity of technology and metadata issues

the nemo report in 2020 showed that less than 20 percent of museum collections are online.48 as already mentioned, this may be attributed to the prerequisites of online collections, which include complex technology or the need for online platforms. additionally, avgousti, papaioannou, and gouveia pointed out that small museums do not have suitable technologies.49 within the discussion on the semantic web (also known as web 3.0, an extension of the world wide web that makes internet data machine-readable by applying standards), corlosquet stated that one of the significant challenges is getting semantic web data annotations to the end-user applications. if this is achieved, there will be faster adoption of the web of data.
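
to make the notion of machine-readable annotation concrete, the following is a minimal sketch, not a method drawn from the cited literature. it assumes a hypothetical catalogue record with the keys title, description, date, and museum, and serializes it as json-ld using the schema.org vocabulary (the creativework type and the name, description, datecreated, and publisher properties). the record values are invented for illustration.

```python
import json


def to_jsonld(record: dict) -> str:
    """Serialize a simple catalogue record as a schema.org JSON-LD string."""
    doc = {
        "@context": "https://schema.org",
        "@type": "CreativeWork",
        "name": record["title"],
    }
    # Optional fields are emitted only when the record supplies them.
    optional = {"description": "description", "date": "dateCreated"}
    for source_key, schema_key in optional.items():
        if record.get(source_key):
            doc[schema_key] = record[source_key]
    if record.get("museum"):
        doc["publisher"] = {"@type": "Organization", "name": record["museum"]}
    return json.dumps(doc, ensure_ascii=False, indent=2)


# Hypothetical record; not taken from any museum in the study.
example = {"title": "clay jug", "date": "1900", "museum": "example village museum"}
print(to_jsonld(example))
```

the resulting string could be embedded in a collection web page inside a script element of type application/ld+json, which is one common way of exposing such annotations to search engines alongside the human-readable page.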
moreover, while content management systems (cms) significantly aid the production of online content by end users, the problem of allowing the user to produce semantic web content remains unsolved.50 further, velios discusses the problem of understanding semantic web concepts concerning complex setups.51 such setups may be bewildering for those humanities scholars without a technical background. he mentions that the semantic web does not offer the necessary tools to accommodate data easily. vavliakis, karagiannis, and mitkas postulate that easily operated tools are also required for the mainstream use of the semantic web in the cultural heritage community.52 cultural heritage institutions are encouraged to start processing and publishing content with semantic technologies. still, the tools that can undertake such a considerable task continue to lack user-friendly features. daradimos, vassilakis, and katifori claim that small museums use content management systems to publish their collections online.53 however, using a general-purpose cms (e.g., drupal) comes with great difficulty, primarily due to the lack of technical information such as dublin core fields, as nontechnical staff cannot be expected to know how to install and configure appropriate modules within drupal to enable the entry and publication of this metadata.54 moreover, there has been little development of the current cmss regarding user-friendly tools targeting the implementation of semantic markup annotations. the integration of cmss with semantic web technologies will increase cultural heritage knowledge dissemination remarkably.
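
as an illustration of the kind of dublin core output that nontechnical staff would otherwise have to configure by hand, here is a small sketch under stated assumptions. it follows the dcmi convention of expressing dublin core terms as html link and meta elements with a dcterms prefix; the function name, the choice of five elements, and the sample record are hypothetical, and no particular cms module is implied.

```python
from html import escape

# A small, illustrative subset of Dublin Core terms (an assumption of this sketch).
DC_ELEMENTS = ("title", "creator", "date", "description", "identifier")


def dublin_core_meta(record: dict) -> str:
    """Render the Dublin Core fields present in a record as HTML head tags."""
    # The link element declares which vocabulary the DCTERMS prefix refers to.
    lines = ['<link rel="schema.DCTERMS" href="http://purl.org/dc/terms/">']
    for element in DC_ELEMENTS:
        value = record.get(element)
        if value:
            # Escape quotes and angle brackets so the value is safe in an attribute.
            content = escape(str(value), quote=True)
            lines.append(f'<meta name="DCTERMS.{element}" content="{content}">')
    return "\n".join(lines)


# Hypothetical object record, for illustration only.
print(dublin_core_meta({"title": 'clay "jug"', "date": "1900"}))
```

a publishing platform that generated such tags automatically from the fields staff already fill in would remove the need to install and configure metadata modules manually.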
further, the absence of robust and easily usable tools is considered a central challenge that continues to hinder the rapid adoption of the semantic web and linked data.55 antoniou and van harmelen explain that the semantic web’s adoption relies on developing new and straightforward tools.56 the semantic web also depends on the adoption of existing technology rather than on new scientific solutions. modern and easy-to-use tools will facilitate the semantic web’s adoption compared with what is currently available. however, only a small number of institutions use semantic technologies. tim berners-lee, the brains behind the semantic web, points out that the machine-readable web is always farther off compared to the human-readable web.57 in cases of large and well-funded organizations or museums like the bbc or the british museum, it is possible to work with semantic web technologies. on the other hand, small museums will have difficulties with the semantic web’s smooth implementation.58 it is important to emphasize that challenges related to the implementation of machine-consumable content by museums depend heavily on adopting existing technology rather than on scientific approval. as antoniou and van harmelen have underlined, the most significant needs are observed in the area of easily accessible tools aimed at nontechnical communities. the most significant technological progress will lead to a more advanced semantic web compared to what can be achieved today.59

methodology

data collection methods

interviews are regularly used in qualitative research for data collection.60 structured interviews lead to more specific answers, usually in a controlled environment.
in unstructured interviews, there are no questions set in advance, and the interview can be very broad, open, and exploratory. semistructured interviews fall in the middle, as they allow both a few specific questions to be addressed and space for extra information by deviating from the set questions. this is the main reason why they are one of the most popular and widely used methods of data collection.61 the interview type selected depends on the questions to be asked and the research method. the current research aims to develop a comprehensive understanding of a problem. therefore, semistructured interviews were the ideal tool, and an interview guide containing open-ended questions was developed.

selection of the sample

the researcher selected museums based on nonrandom criteria. nonprobability sampling techniques are often suitable for qualitative research. the aim of nonprobability sampling is not to test a hypothesis about a large population but to establish an initial understanding of a small community or a population under research. the current research targets small museums in cyprus. therefore, a nonprobability sampling method was used to select small museums. the small museums contacted were not always responsive. however, we managed to conduct interviews with six small museums in cyprus using the snowball sampling method, where the researcher asked the interviewee to refer other people for conducting future interviews.

sample size

in the current study, the sample population is homogeneous, meaning the population consists of small museums in cyprus. when the population is homogeneous, the sample size should be at least 4 to 12 cases. in cases of heterogeneous samples, for example in small museums from around the world, the sample size must be at least 12 to 30 cases. in more complex cases such as ethnographic or grounded theory studies, the sample size must be larger.
in our case, we started with two cases and continued until data saturation was achieved, the point in the research when no new information is discovered in data analysis.62 cyprus has 34 museums of which 10 are small (for this survey, defined as having one full-time and fewer than five total employees). we interviewed six of the 10 small museums and reached data saturation after interviewing the first four.

conducting the interviews

in preparation for the interviews, we contacted the interviewees by phone and email, informing them about the interview. information on the size of the staff was gathered by contacting the museum. while ten museums met the definition of small, only six agreed to participate in the research. first, a pilot test was conducted on two interviewees to identify any problems with the interview guide. based on this pilot test, we made changes and corrected mistakes. due to the covid-19 pandemic, interviews were conducted via internet-based technologies, mostly zoom, a video telephony software program, chosen because of its ability to record video. the interview length was about 20–25 minutes, and all participants had the option to choose greek or english as the interview language. due to the pandemic and logistic challenges, it took about six months to identify subjects and conduct the interviews.

results

this section discusses the empirical results extracted from the interview summaries. interviews were conducted in greek (both authors are native speakers of greek) and translated to english by the authors. under the major headings of our research subject, we present our findings concerning our research question.

the current state of collections online

our results indicate that most small museums in cyprus do not have an online presence.
two of the six museums (4 and 5) do not have a website. those that do have websites that have not been updated or supported for more than 15 years and therefore need replacement. here are two representative comments: “the museum has an old and simple website” (respondent 1); “[we have] a very old website that needs to be changed soon” (respondent 3). the two museums that do not have a website use or have used social media: “the museum uses facebook and instagram” (respondent 4); “[we] used to have a facebook page” (respondent 5). we discovered that five of the six museums do not have their collections online: “the museum does not have any of its collections online” (respondent 1); “no online collections” (respondent 4); “we do not have any collections online” (respondent 5). further, we learned that none of the museums use machine-consumable standards to achieve wider interoperability on the web: “the online collections are only in a human-readable format” (respondent 2); “we do not use any machine-readable” (respondent 3). however, museums understand the need and benefits of such solutions: “our goal is to have the online collection understandable by machines and share metadata online” (respondent 2).
we noticed that all museums are willing to give online access to their collections, complete or partial, and agree that the primary goal is to disseminate information: “the primary goal is to give access to museum collections for general use” (respondent 1); “to put it another way, to communicate information, knowledge to scholars and the general public” (respondent 2); “to reach as many people as possible and spread those collections online to a variety of audiences” (respondent 3); “the main reason that online collections exist is that it is the tool to reach more people and disseminate those collections online to diverse audiences, researchers, and the public alike, in other words to disseminate knowledge” (respondent 4); “to disseminate knowledge and information to more people such as students and researchers and the general public” (respondent 5). museums also view online collections as a marketing tool that can bring more people to the museum’s physical space: “[online collections] can work as a marketing tool, people that can view our collections online may visit the museum physical space” (respondent 4); “the main goal is to be found” (respondent 5); “tourists coming to cyprus can use the system and find out about our collections and the museum” (respondent 6). clearly, museums are eager to give online access to their collections. the goal is to disseminate information and attract more people to their physical premises. when asked about the goals of publishing machine-consumable content online, respondents identified findability as the most significant: “nowadays, people are using search engines to find the information they are looking for.
and since the information is not in a machine-readable format and understandable by search engines, it creates difficulties to be located online” (respondent 1); “[the goal is] to make the collections more findable” (respondent 2); “… to be easily findable by search engines on the internet” (respondent 3); “[to] increase wider findability of the collections over the web” (respondent 6). additionally, we discovered that some museums are not aware of the existence of machine-readable formats: “i am not aware of machine-readable data” (respondent 4); “the museum is not aware of any machine-readable standards for wider web interoperability” (respondent 5). it is evident that findability is the main goal in online content. but it is also clear that some museums are not aware of the existence of machine-readable standards and such technologies.

the current challenges of collections online

insufficient resources and the cost of existing solutions

our study shows that museums’ insufficient resources and the cost of existing solutions are the main obstacles in having their collections online. here are representative comments: “lack of money” (respondent 1); “we got offers from different companies; however, the costs of existing solutions were well above our budget and possibilities” (respondent 2); “the main obstacle related to giving online access to the museum collections is the cost … outsourcing this kind of work costs a lot of money that the museum does not have” (respondent 4); “of course is the cost” (respondent 5).
insufficient staff (time) and skilled staff (know-how)

according to our findings, staff limitations are another obstacle small museums face in providing online access to their collections: “the existing staff has so many other responsibilities mostly related to research and museum daily functions” (respondent 1); “populating all the material to a new system requires a lot of time and staff that the museum does not have” (respondent 2); “the museum’s limited staff” (respondent 4); “the limited staff of the museums is a problem” (respondent 6). further, interviewees shared that the lack of know-how is another obstacle to digitizing museum collections and making them accessible online: “we do not have the technical knowledge. of all of the staff members, no one has technical knowledge … this means that we must hire a person that has this kind of knowledge” (respondent 1); “we do not have a dedicated staff to work specifically for this function” (respondent 6).

complexity of technology (existing systems)

according to our research, the existing technological complexity is another major problem: “the lack of easy-to-use tools that we can use at the museum [is a problem]” (respondent 1); “creating a content model selecting all necessary fields is a very complex and time-consuming process” (respondent 2); “we need tools that are user-friendly, easy to use with nontechnical complexity without requiring a too specialized technical know-how” (respondent 3); “the technological complexity that is involved” (respondent 5); “hosting your own online collections due to the maintenance and technical knowledge is another issue that small museums are facing” (respondent 6).
insufficient infrastructure

our research revealed that the lack of technological infrastructure was an obstacle: “the lack of infrastructures … we cannot work with this kind of old infrastructure … we cannot work with a computer that is 20 years old, this is impossible … [we have] only one old computer that is connected to the internet” (respondent 1); “primary challenges related to technological infrastructure” (respondent 3); “the existing infrastructure of the museum, we have old computers” (respondent 4); “hosting your own online collections due to the maintenance and technical knowledge is another issue that small museums are facing in cyprus. this is why we use external platforms” (respondent 6).

not machine consumable

the complexity of technology was highlighted as the biggest challenge in publishing collections online in machine-consumable formats: “easy-to-use solutions” (respondent 1); “selection of the appropriate technology, there are so many standards for machine-readable data making the selection process extremely hard” (respondent 2); “the complexity of technology is the main obstacle” (respondent 3); “if the system we use can automatically create machine-consumable content this will help” (respondent 4); “the platform that publishes the collection human-consumable content can at the same time publish in machine-understandable content will solve the problem” (respondent 6). for some, machine consumption is not a priority: “it is not a first priority of the museum” (respondent 5); “the museum is not familiar with machine publishing” (respondent 6). the complexity of technology and the lack of easy-to-use tools are among the biggest obstacles to publishing machine-consumable content.

discussion and conclusions

existing online collections and/or museum resources should be researched further as they may not be completely digitized and accessible to different audiences online.
with one-third of small museums in cyprus providing access to their collections online, there are many opportunities to help small museums give access to their collections and thereby advance the democratization of knowledge. we discovered that a lack of resources and a lack of infrastructure are two significant challenges small museums in cyprus face in providing online access to their collections. our results show no museum partners with national institutions, such as universities or academic research centers. we assert that this collaboration can reduce costs and eliminate the need for infrastructure. at the same time, institutions, such as universities, usually have the technological know-how and can provide museums with new tools, free and open-source systems that focus on cypriot small museum needs. such tools, which can be found in our research, will help museums to drastically reduce the cost that is involved in buying such systems. moreover, we found that the lack of staff (time) is another challenge that prevents museums from having their collections online. we believe that developing new tools that can accelerate the process of generating, administering, maintaining, and uploading museum collections online will ease the demands on staff time. our research also uncovered that small museums in cyprus do not work with volunteers, as they have no time and resources to find and then train volunteers for museum work; we suggest that museums consider this option to address the lack of staff (time). additionally, we learned that museums lack specialized staff (know-how), another significant challenge that blocks museums from democratizing their online collections.
we anticipate that developing technology that requires less technical expertise will benefit small museums that do not have specialized staff (e.g., developers and information technology specialists). further, support from external bodies such as universities may help. on the other hand, there are platforms available that do not need specialized technical knowledge. however, we discovered that the complexity of existing technology impedes the publication of museum collections online. we hope that creating less complex technology will enable museums to use and publish their collections online in human- and machine-consumable formats. further, training of existing staff in new technologies is needed. to sum up, small museums in cyprus and the world need to invest in democratizing their collections online via digitizing, describing, and making their objects and collections available online. simple and turnkey solutions for publishing and describing digitized objects are required. there is a will; we keep researching towards finding the most suitable case-oriented and affordable ways.

acknowledgments

many thanks to all the interviewees from small museums in cyprus who opened their doors to our research. for ethical considerations, we keep institutions and interviewees anonymous.

endnotes

1 paul katz, “the quandaries of the small museum,” the journal of museum education 20, no. 1 (1995): 15–17, https://www.jstor.org/stable/40479486.
2 nik honeysett and julia falkowski, museum technology landscape 2018: discovery and findings (lyrasis: 2018), https://www.lyrasis.org/leadership/documents/lyrasis-museum-tech-landscape-report-2018.pdf.
3 eric miller, uche ogbuji, victoria mueller, and kathy macdougall, bibliographic framework as a web of data: linked data model and supporting services (washington, dc: library of congress, 2012), 42.
4 sheila watson, ed., museums and their communities (routledge: new york, 2007).
5 guido cimadomo, “documentation and dissemination of cultural heritage: current solutions and considerations about its digital implementation,” in 2013 digital heritage international congress (digitalheritage) (ieee: 2013), 555–62, https://doi.org/10.1109/digitalheritage.2013.6743796; s. sylaiou, f. liarokapis, p. patias, and o. georgoula, “virtual museums: first results of a survey on methods and tools” (paper presented at cipa 2005 xx symposium, 26 september–01 october 2005, torino, italy); rachel regelein, “a digital collections plan for the southwest seattle historical society” (unpublished master’s project, university of washington, 2019), https://www.washington.edu/museology/2019/11/13/a-digital-collections-plan-for-the-southwest-seattle-historical-society/; ion gil fuentetaja and maria economou, “studying the type of online access provided to museum collections” (2008), https://www.semanticscholar.org/paper/studying-the-type-of-online-access-provided-to-fuentetaja-economou/b44415e02b5fca204d79b481d325b66482461f41.
6 regelein, “a digital collections plan”; bernadette flynn, “making collections accessible” (federation of australian historical societies inc., january 2018), https://www.history.org.au/wp-content/uploads/2018/10/makingcollectionsaccessible.pdf; karol j. borowiecki and trilce navarrete, “digitization of heritage collections as indicator of innovation,” economics of innovation and new technology 26, no.
3 (2017): 227–46, https://doi.org/10.1080/10438599.2016.1164488; morgan schlesinger, “the museum wiki: a model for online collections in museums” (master’s project/capstone, university of san francisco, 2016), https://repository.usfca.edu/capstone/456; genevieve horan, “digital heritage: digitization of museum and archival collections” (research paper, master of public administration, political science department, southern illinois university, 2013), https://opensiuc.lib.siu.edu/gs_rp/374.
7 ilse harms and werner schweibenz, “evaluating the usability of a museum web site abstract” (2001), https://www.museumsandtheweb.com/mw2001/papers/schweibenz/schweibenz.html.
8 steen hvass, preface to the museum’s web users, a user survey of museum websites (heritage agency of denmark, 2010), https://slks.dk/fileadmin/publikationer/kulturarv/the_museum_s_web_users_2010.pdf;
monica bercigli, “dissemination strategies for cultural heritage: the case of the tomb of zechariah in jerusalem, israel,” heritage 2, no. 1 (march 2019): 306–14, https://doi.org/10.3390/heritage2010020; elena villaespesa and trilce navarrete, “museum collections on wikipedia: opening up to open data initiatives” (paper presented at mw19, the 23rd museweb conference, boston, ma, april 2–6, 2019), https://mw19.mwconf.org/paper/museum-collections-on-wikipedia-opening-up-to-open-data-initiatives/; gerald wayne clough, “democratization of knowledge through digitization in libraries, museums, and archives” (2020), https://smartech.gatech.edu/handle/1853/62423; gerald wayne clough, “democratization of knowledge through digitization in libraries & archives” (2020), video, https://smartech.gatech.edu/bitstream/handle/1853/62423/clough.mp4?sequence=5&isallowed=y; eva richani, georgios papaioannou, and christina banou, “emerging opportunities: the internet, marketing and museums,” in 20th international conference on circuits, systems, communications and computers (cscc 2016) 76, https://doi.org/10.1051/matecconf/20167602044.
9 lynsey martenstyn, “digital archives: making museum collections available to everyone,” culture professionals network, the guardian, may 3, 2013, https://www.theguardian.com/culture-professionals-network/culture-professionals-blog/2013/may/03/museum-archives-digital-online; shyam oberoi and kristen arnold, “new architectures for online collections and digitization” (paper presented at mw2015: museums and the web, chicago, il, april 8–11, 2015), https://mw2015.museumsandtheweb.com/paper/new-architectures-for-online-collections-and-digitization/.
10 barbara lejeune, “the effects of online catalogues in london and other museums: a study of an alternative way of access,” papers from the institute of archaeology 18, no. s1 (2007): 79–97, https://doi.org/10.5334/pia.289.
11 chryssoula bekiari, leda charami, martin doerr, christos georgis, and athina kritsotaki, “documenting cultural heritage in small museums” (paper presented in 2008 annual conference of cidoc), https://cidoc.mini.icom.museum/wp-content/uploads/sites/6/2018/12/25_papers.pdf; rolf däßler and ulf preuß, “digital preservation of cultural heritage for small institutions,” in digital cultural heritage, ed. horst kremers (springer international publishing, 2020), https://www.springerprofessional.de/en/digital-preservation-of-cultural-heritage-for-small-institutions/16842836.
12 penelope kelly, “managing digitization projects in a small museum” (master’s project, arts and administration program, university of oregon, march 2005), https://scholarsbank.uoregon.edu/xmlui/handle/1794/937.
13 kate taylor, “going digital not easy for cultural institutions,” the globe and mail, april 18, 2020, https://www-theglobeandmail-com.cdn.ampproject.org.
14 regelein, “a digital collections plan”; flynn, “making collections accessible”; susan wigodner and caitlin kearney, “who reviewed this?!
a survey on museum web publishing in 2018” (paper presented at mw18: museums and the web 2018, vancouver, canada, april 18–21, 2018), https://mw18.mwconf.org/paper/who-reviewed-this-a-survey-on-museum-web-publishing-in-2018/index.html; shyam oberoi and kristen arnold, “new architectures for online collections and digitization” (paper presented at mw2015: museums and the web, chicago, il, april 8–11,
2015), https://mw2015.museumsandtheweb.com/paper/new-architectures-for-online-collections-and-digitization/index.html.
15 flynn, “making collections accessible.”
16 oberoi and arnold, “new architectures.”
17 nuno freire, pável calado, and bruno martins, “availability of cultural heritage structured metadata in the world wide web” (paper presented at 22nd international conference on electronic publishing, june 2018), 11, https://www.researchgate.net/publication/325914185_availability_of_cultural_heritage_structured_metadata_in_the_world_wide_web.
18 flynn, “making collections accessible.”
19 manolis gergatsoulis and pantelis lilis, “multidimensional rdf,” in on the move to meaningful internet systems 2005: coopis, doa, and odbase (2005): 1188–1205, https://doi.org/10.1007/11575801_17.
20 cruce saunders, “content authoring for human and machine consumption.” [a], september 2, 2019, https://simplea.com/articles/content-authoring-for-human-and-machine.
21 alejandra garcia bittar, “is a digital strategy necessary in small museums?” museum and digital culture – pratt institute, november 11, 2018, https://museumsdigitalculture.prattsi.org/is-a-digital-strategy-necessary-in-small-museums-a72c1645e495.
22 charles tongue, “museum survey 2017,” vernon systems (blog), may 12, 2017, https://vernonsystems.com/museum-survey-2017/.
23 network of european museum organisations, “final report: digitisation and ipr in european museums” (network of european museum organisations, july 2020), https://www.ne-mo.org/fileadmin/dateien/public/publications/nemo_final_report_digitisation_and_ipr_in_european_museums_wg_07.2020.pdf; cf.
kelly, “managing digitization projects.” 24 flynn, “making collections accessible.” 25 honeysett and falkowski, “museum technology landscape.” 26 axiell, “museums accelerate implementation of digital strategies, making more content available online and on-site to improve visitor experiences,” axiell (blog), june 28, 2017, https://www.axiell.com/axiell-news/museums-accelerate-implementation-of-digitalstrategies-making-more-content-available-online-and-on-site-to-improve-visitor-experiences2/. https://mw18.mwconf.org/paper/who-reviewed-this-a-survey-on-museum-web-publishing-in-2018/index.html https://mw18.mwconf.org/paper/who-reviewed-this-a-survey-on-museum-web-publishing-in-2018/index.html https://www.researchgate.net/publication/325914185_availability_of_cultural_heritage_structured_metadata_in_the_world_wide_web https://www.researchgate.net/publication/325914185_availability_of_cultural_heritage_structured_metadata_in_the_world_wide_web https://doi.org/10.1007/11575801_17 https://simplea.com/articles/content-authoring-for-human-and-machine https://museumsdigitalculture.prattsi.org/is-a-digital-strategy-necessary-in-small-museums-a72c1645e495 https://museumsdigitalculture.prattsi.org/is-a-digital-strategy-necessary-in-small-museums-a72c1645e495 https://vernonsystems.com/museum-survey-2017/ https://www.ne-mo.org/fileadmin/dateien/public/publications/nemo_final_report_digitisation_and_ipr_in_european_museums_wg_07.2020.pdf https://www.ne-mo.org/fileadmin/dateien/public/publications/nemo_final_report_digitisation_and_ipr_in_european_museums_wg_07.2020.pdf https://www.ne-mo.org/fileadmin/dateien/public/publications/nemo_final_report_digitisation_and_ipr_in_european_museums_wg_07.2020.pdf https://www.axiell.com/axiell-news/museums-accelerate-implementation-of-digital-strategies-making-more-content-available-online-and-on-site-to-improve-visitor-experiences-2/ 
https://www.axiell.com/axiell-news/museums-accelerate-implementation-of-digital-strategies-making-more-content-available-online-and-on-site-to-improve-visitor-experiences-2/ https://www.axiell.com/axiell-news/museums-accelerate-implementation-of-digital-strategies-making-more-content-available-online-and-on-site-to-improve-visitor-experiences-2/ information technology and libraries march 2023 the current state and challenges in democratizing small museums’ collections online 13 avgousti and papaioannou 27 joan beaudoin, “art museum collections online: extending their reach” (paper presented at mw20, the 24th museweb conference, march 31–april 4, 2020), https://mw20.museweb.net/paper/art-museum-collections-online-extending-their-reach/. 28 avgoustinos avgousti, georgios papaioannou, and feliz ribeiro gouveia, “content dissemination from small-scale museum and archival collections: community reusable semantic metadata content models for digital humanities,” the code4lib journal, no. 43 (february 14, 2019), https://journal.code4lib.org/articles/14054. 29 “vernon systems – museum survey,” vernon systems (blog), may 19, 2016, https://vernonsystems.com/vernon-systems-museum-survey-data/. 30 nuno freire et al., “a survey of web technology for metadata aggregation in cultural heritage,” information services & use 37, no. 4 (2017): 425–36, https://doi.org/10.3233/isu-170859. 31 europeana: discover europe’s digital cultural heritage (website), accessed january 29, 2023, https://www.europeana.eu/en. 32 denis pitzalis, “3d and semantic web: new tools to document artefacts and to explore cultural heritage collections” (2013); denis pitzalis, “3d and semantic web: new tools to document artifacts and to explore cultural heritage collections. signal and image processing,” (phd diss. 
université pierre et marie curie, 2013); chryssoula bekiari et al., “documenting cultural heritage in small museums;” lejeune, “the effects of online catalogues in london and other museums: a study of an alternative way of access.” 33 paul klimper, “introduction to museums in the digital age,” in nemo 21st annual conference documentation, bukarest, romania, 2013, ed. julia pagel and kelly donahue, https://www.nemo.org/fileadmin/dateien/public/statements_and_news/nemo_21st_annual_conference_doc umentation.pdf. 34 network of euopean museum organisations, “final report”; dorel micle, “heritage networks and portals,” in museum and the internet. selected papers from the international summer course in buşteni, romania, 20th – 26th of september, 2004, ed. irina oberlander-târnoveanu (budapest: archaeolingua, 2008), 73-120; bittar, “is a digital strategy necessary in small museums?” 35 magdalena laine-zamojska, “virtual museum and small museums: vimuseo.fi project,” in museums and the web 2011: proceedings, ed. j. trant and d. beardman (toronto: archives and museum informatics, 2011), https://www.museumsandtheweb.com/mw2011/papers/virtual_museum_and_small_museu ms_vimuseofi_pro. 
36 network of european museum organisations, “final report.” 37 noah lenstra, “website development for small museums: a case study of the katherine dunham dynamic museum,” january 1, 2008, https://mw20.museweb.net/paper/art-museum-collections-online-extending-their-reach/ https://journal.code4lib.org/articles/14054 https://vernonsystems.com/vernon-systems-museum-survey-data/ https://doi.org/10.3233/isu-170859 https://www.europeana.eu/en https://www.ne-mo.org/fileadmin/dateien/public/statements_and_news/nemo_21st_annual_conference_documentation.pdf https://www.ne-mo.org/fileadmin/dateien/public/statements_and_news/nemo_21st_annual_conference_documentation.pdf https://www.ne-mo.org/fileadmin/dateien/public/statements_and_news/nemo_21st_annual_conference_documentation.pdf https://www.museumsandtheweb.com/mw2011/papers/virtual_museum_and_small_museums_vimuseofi_pro https://www.museumsandtheweb.com/mw2011/papers/virtual_museum_and_small_museums_vimuseofi_pro information technology and libraries march 2023 the current state and challenges in democratizing small museums’ collections online 14 avgousti and papaioannou https://www.academia.edu/2439955/website_development_for_small_museums_a_case_stud y_of_the_katherine_dunham_dynamic_museum. 38 “resources for small museums,” gallery systems (blog), accessed july 13, 2022, https://www.gallerysystems.com/online-tools-resources-small-museums/. 39 tongue, “museum survey 2017.” 40 avgousti, papaioannou, and ribeiro gouveia, “content dissemination.” 41 wigodner and kearney, “who reviewed this?!” 42 wigodner and kearney, “who reviewed this?!” 43 honeysett and falkowski, “museum technology landscape.” 44 jasmine hoover, “gaps in it and library services at small academic libraries in canada,” information technology and libraries 37, no. 4 (2018): 15–26, https://doi.org/10.6017/ital.v37i4.10596. 45 avgousti, papaioannou, and ribeiro gouveia, “content dissemination.” 46 klimper, introduction. 
47 vikas bhushan, shiv shakti ghosh, and sudipta biswas, “bridging the gap between cms and semantic web” (paper presented at naclin 2015 conference, kamataka, india), researchgate, 2016, https://www.researchgate.net/publication/307923260_bridging_the_gap_between_cms_and_ semantic_web. 48 network of european museum organisations, “final report.” 49 avgousti, papaioannou, and ribeiro gouveia, “content dissemination.” 50 stéphane jean joseph corlosquet, “bootstrapping the web of data with drupal” (master’s thesis, national university of ireland, galway, 2009), https://aic.ai.wu.ac.at/~polleres/supervised_theses/stephane_corlosquet_mseng_2009.pdf . 51 athanasios velios and aurelie martin, “off-the-shelf crm with drupal: a case study of documenting decorated papers,” international journal on digital libraries 18, no. 4 (2017): 321–31, https://doi.org/10.1007/s00799-016-0191-5. 52 konstantinos vavliakis, georgios karagiannis, and pericles mitkas, “semantic web in cultural heritage after 2020” (2012), https://www.semanticscholar.org/paper/semantic-web-incultural-heritage-after-2020-vavliakiskaragiannis/c69d14de020d5dedb9e76a173c94cc56cc254251. 
53 illias daradimos, costas vassilakis, and akrivi katifori, “a drupal cms module for managing museum collections” (2015), https://www.academia.edu/2439955/website_development_for_small_museums_a_case_study_of_the_katherine_dunham_dynamic_museum https://www.academia.edu/2439955/website_development_for_small_museums_a_case_study_of_the_katherine_dunham_dynamic_museum mailto:https://www.gallerysystems.com/online-tools-resources-small-museums/ https://doi.org/10.6017/ital.v37i4.10596 https://www.researchgate.net/publication/307923260_bridging_the_gap_between_cms_and_semantic_web https://www.researchgate.net/publication/307923260_bridging_the_gap_between_cms_and_semantic_web https://doi.org/10.1007/s00799-016-0191-5 https://www.semanticscholar.org/paper/semantic-web-in-cultural-heritage-after-2020-vavliakis-karagiannis/c69d14de020d5dedb9e76a173c94cc56cc254251 https://www.semanticscholar.org/paper/semantic-web-in-cultural-heritage-after-2020-vavliakis-karagiannis/c69d14de020d5dedb9e76a173c94cc56cc254251 https://www.semanticscholar.org/paper/semantic-web-in-cultural-heritage-after-2020-vavliakis-karagiannis/c69d14de020d5dedb9e76a173c94cc56cc254251 information technology and libraries march 2023 the current state and challenges in democratizing small museums’ collections online 15 avgousti and papaioannou https://www.academia.edu/29947679/a_drupal_cms_module_for_managing_museum_collect ions. 54 quinn dombrowski, drupal for humanists (college station: texas a&m university press, 2016.) 55 jennifer zaino, “2017 trends for semantic web and semantic technologies,” dataversity (blog), november 29, 2016, https://www.dataversity.net/2017-predictions-semantic-websemantic-technologies/. 56 grigoris antoniou and frank van harmelen, a semantic web primer, second edition (cambridge massachusetts and london, u.k.: the mit press, 2008). 
57 jackson joab, “tim berners-lee: machine-readable web still a ways off” gcn, october 30, 2009, https://gcn.com/articles/2009/10/30/berners-lee-semantic-web.aspx. 58 eric miller, uche ogbuji, victoria mueller, and kathy macdougall, “bibframe primer – bibliographic framework as a web of data: linked data model and supporting services” (november 2012): 42, https://www.researchgate.net/publication/280113409_bibframe_primer__bibliographic_framework_as_a_web_of_data_linked_data_model_and_supporting_services. 59 antoniou and van harmelen, a semantic web primer. 60 bryn farnsworth, “qualitative vs quantitative research – what is what?” imotions (blog), june 11, 2019, https://imotions.com/blog/qualitative-vs-quantitative-research/. 61 ceryn evans, “analysing semi-structured interviews using thematic analysis: exploring voluntary civic participation among adults,” in sage research methods datasets part 1 (sage publications, ltd., 2018), https://doi.org/10.4135/9781526439284. 62 sandra faulkner and stormy p. trotter, “data saturation,” in the international encyclopedia of communication research methods (american cancer society, 2017), 1–2, https://doi.org/10.1002/9781118901731.iecrm0060. 
articles

using augmented and virtual reality in information literacy instruction to reduce library anxiety in nontraditional and international students

angela sample

information technology and libraries | march 2020
https://doi.org/10.6017/ital.v39i1.11723

dr.
angela sample (asample@oru.edu) is head of access services, oral roberts university.

abstract

throughout its early years, the oral roberts university (oru) library held a place of pre-eminence on campus. oru’s founder envisioned the library as central to all academic function and scholarship. under the direction of the founding dean of learning resources, the library was an early pioneer in innovative technologies and methods. however, over time, as is the case with many academic libraries, the library’s reputation as an institution crucial to the academic work on campus diminished. a team of librarians is now engaged in programs aimed at repositioning the library as the university’s hub of learning. toward that goal, the library has long taught information literacy (il) to students and faculty through several traditional methods, including one-shot workshops and sessions tied to specific courses of study. now, in conjunction with disseminating augmented, virtual, and mixed reality (avmr) learning technologies, the library is redesigning instruction to align with various realities of higher education today, including uses of avmr in instruction and research and following best practices from research into serving (1) online learners; (2) international learners not accustomed to western higher-education practices; and (3) learners returning to university study after being away from higher education for some time or having changed disciplines of study. the library is innovating online tutorials targeted for nontraditional and international graduate students with various combinations of avmr, with the goal of diminishing library anxiety. numerous library and information science studies have shown a correlation between library anxiety and reduced library use, and library use has been linked to student learning, academic success, and retention.1 this paper focuses on il instruction methods under development by the library.
current indicators are encouraging as the library embarks on the redesign of il instruction and the early inclusion of avmr in il instruction for nontraditional and international students.

literature review

the patron approaches the reference desk, with eyes downcast. in a voice so soft that it is barely above a whisper, the patron mumbles, “is this where i can get help with research?” some variation on the above scenario is an occurrence long familiar to academic reference librarians. in 1986, mellon put a name to this nervousness of patrons; she called it library anxiety.2 since then, librarians have implemented various measures to help put patrons at ease and minimize their library anxiety. scholars have studied many of these measures aimed at reducing library anxiety, both to determine the efficacy of such interventions and to understand better the causes of library anxiety. this paper describes one library’s intervention: a virtual-reality tour of the library that lets students learn about some of the services available at the library prior to their initial visit, in an attempt to reduce some aspects of their library anxiety.

library anxiety

library and information science (lis) researchers have long recognized that anxiety related to libraries and research can have a detrimental effect on students. mizrachi described library anxiety as the feeling of being overwhelmed, intimidated, nervous, uncertain, or confused when using or contemplating use of the library and its resources to satisfy an information need.
it is a state-based anxiety that can result in misconceptions or misapplications of library resources, procrastination, and avoidance of library tasks.3 since mellon’s theoretical framing of library anxiety in 1986, researchers have studied a number of library-related anxieties, including research anxiety, information literacy anxiety, library technophobia, and computer anxiety. various studies have focused on different groups of students (freshmen, nontraditional students, and international students, to name a few) who may experience higher levels of library anxiety. another area that has been of interest to researchers is the study of the efficacy of various measures aimed at reducing the library anxiety of students.

causes and factors

researchers have found several causes of library anxiety. in her seminal article, mellon used a grounded theory approach to understand and “describe students’ fear of the library as library anxiety.”4 mellon noted most of the students in her study described their feelings as being lost in the library, which mellon stated “stemmed from four causes: (1) the size of the library; (2) a lack of knowledge about where things were located; (3) how to begin; and (4) what to do.”5 head and eisenberg also found a majority of students (84 percent) had difficulties in knowing where to begin.6 bostick and later jiao and onwuegbuzie named “five general antecedents of library anxiety . . . namely, barriers with staff, affective barriers, comfort with the library, knowledge of the library, and mechanical barriers.”7 barriers with staff are the feelings students have regarding the accessibility and approachability of library staff.8 affective barriers are students’ self-perceptions of their competence in using the library and library resources.
affective barriers arise from feelings of inadequacy and can be heightened by the perception that others possess library skills that they alone do not.9 comfort with the library deals with the student’s perception of the library as a “safe and comforting environment.”10 knowledge of the library is students’ knowledge of “where things are located and how to find their way around in the building.”11 mechanical barriers refer to students’ perception of the reliability of machines in the library (e.g., copiers, printers, computers).12 researchers focused on investigating the information-seeking behavior of students have identified stages of library anxiety. in her work, kuhlthau identified six stages of information seeking in which students may experience library anxiety: task initiation, topic selection, prefocus exploration, focus formulation, information collection, and search closure.13 in blundell’s presentation of her theoretical model of the academic information search process (aisp) of undergraduate millennial students (figure 1), she described the varying levels of anxiety students may feel throughout this process depending upon their success at finding needed information.14 anxiety at stage 2: development/refinement “ranges from mild to extreme, depending on the success of the student’s aisp in finding information he/she believes is appropriate for addressing the academic need.”15 at stage 3, “based on information located through the aisp in stages 1 & 2, [the] student either fulfills [the] academic need with minimal anxiety, refocuses aisp with mid to high-level anxiety, or abandons the academic need completely with high/extreme levels of anxiety.”16

figure 1. blundell aisp model.17

although blundell studied undergraduate millennial students’ information-seeking behaviors, the same behaviors may also be descriptive of other groups of students. blundell omitted anxiety at or prior to stage 1 when the assignment is received by the student. one reason for the omission of anxiety in blundell’s model at stage 1 may be a seemingly paradoxical finding by many researchers regarding students’ inflated belief in their research skills as compared to their actual level of information literacy (il) skills.18 students with a high self-assessment of their il skills may feel confident at the onset of research, only experiencing anxiety when encountering low success rates when searching for information or when experiencing information overload. however, many other students may experience anxiety at the onset of receiving an assignment, particularly on a topic about which they have little or no knowledge. others may experience anxiety if they realize they do not know where to look for information or how to use library research tools, or if they feel apprehension at the thought of asking for help from a librarian. for example, library anxiety can result from the requirements of the assignment; most professors require peer-reviewed sources. many new students do not know what a peer-reviewed source is, much less how to find one. indeed, many of the causes of library anxiety described in mellon’s and later jiao and onwuegbuzie’s work can be positioned throughout all six of kuhlthau’s and all three of blundell’s stages of information seeking and could explain some of the potential steps blundell noted in her model.

negative effects

in addition to the obvious discomfort students might feel, library anxiety, as with other forms of anxiety, can have a detrimental effect on students’ academic performance.
as mellon noted, “students become so anxious about having to gather information in a library for their research paper that they are unable to approach the problem logically or effectively.”19 the findings from jiao and onwuegbuzie’s numerous studies support the negative effect library anxiety can have on students’ academic performance in various ways, including research performance, research proposal writing, and study habits.20 research has also shown the link between higher levels of library anxiety and avoidance of the library.21 avoidance of the library could hinder students’ academic performance or retention; studies have linked library use to higher gpas and increased retention rates.22 other negative effects of library anxiety include the reluctance of students to ask for help from a librarian and the tendency to procrastinate until it is too late to do well on assignments. when library anxiety is at a level high enough to cause students to enter a panic mode, logical thinking, the ability to apply existing skills, and building or acquiring new skills can be impaired.

at-risk student groups

acknowledging the negative effects library anxiety can have on students’ academic performance, several studies have sought to determine whether particular demographic groups of students experience library anxiety at higher rates and which causes are most prevalent for a particular group. in one study conducted by jiao, onwuegbuzie, and lichtenstein, students who fell into the following groups tended to have the highest levels of library anxiety: “male, undergraduate, not speak english as their native language, have high levels of academic achievement, be employed either part- or full-time, and visit the library infrequently.”23 some studies have focused on learning more about the library anxiety of a particular group. some of the groups investigated include graduate, international, and nontraditional students.
still others have focused on possible racial differences in the prevalence of library anxiety. although a few studies have found library anxiety to be higher for undergraduate students than graduate students, one of the most often-studied groups at risk for library anxiety has been graduate students.24 these researchers have looked at a number of factors in relation to graduate students’ library anxiety. in an early study, they found graduate students with the preferred learning style of visual learners tend to have higher levels of library anxiety.25 in another study of graduate students, they examined the relation between library anxiety and trait anxiety, defined as “the relative stable proneness within each person to react to situations seen as stressful.”26 jiao and onwuegbuzie, together with bostick, investigated the potential relationship between race and library anxiety in 2004, a study they replicated in 2006. in both, the researchers found caucasian american graduate students reported higher levels of library anxiety than their african american counterparts.27 another group frequently examined in library anxiety studies is international students. mizrachi noted “studies involving international students in american universities consistently show their levels of library anxiety to be much higher than their american peers.”28 onwuegbuzie and jiao found international esl students “had higher levels of library anxiety associated with ‘barriers with staff,’ ‘affective barriers,’ and ‘mechanical barriers,’ and lower levels of library anxiety associated with ‘knowledge of the library’ than did native english speakers.”29 later, jiao and onwuegbuzie found the most prevalent causes of library anxiety for international students were mechanical barriers (library technology) as the greatest source, followed by affective barriers.30 in the more recent pilot study by lu and adkins, the greatest barriers for international students were affective and staff barriers, while mechanical barriers, such as technologies, were no longer a significant cause of anxiety for most.31 collins and veal found adult learners in their study had the highest degree of library anxiety pertaining to affective barriers.32 in their study, kwon, onwuegbuzie, and alexander revealed graduate students who had higher levels of library anxiety resulting from affective barriers and knowledge of the library had weaker critical-thinking skills, lower self-confidence, less inquisitiveness, and reduced systematicity (“less disposed toward organized, logical, focused, and attentive inquiry”).33 kwon found similar results in undergraduate students.34

interventions

recognizing the multiple causes and multidimensional aspects of library anxiety, librarians have devised a number of interventions aimed at addressing one or more of its causes. some of the means to address barriers with staff have focused on outreach, engaging library instruction, online presence, and other similar efforts to reach students and provide needed support for students’ research. librarians have used information literacy instruction (ili), reference desk consultations, and print and online guides to address library anxiety stemming from affective barriers, knowledge of the library, and even the mechanical barriers arising from lack of technology skills. a common intervention is ili, which several studies have found to have some success in reducing students’ library anxiety. bell explored students’ levels of library anxiety before and after a one-credit il course.35 platt and platt examined the efficacy of two 50-minute ili sessions, required of students enrolled in the research methods in psychology course, in reducing library anxiety and found “the greatest changes . . .
were related primarily to knowledge of what resources are available in the library and how to access them.”36 in contrast to the typical one-session il class, fleming-may, mays, and radom investigated a three-workshop instruction model and found it correlated with students’ increased confidence in using the library and lessened library anxiety.37 notwithstanding the benefits of library instruction sessions for students in relieving library anxiety, pellegrino found students were far more likely to ask a librarian for help when their instructor, rather than a librarian, encouraged or required them to do so.38 by familiarizing students with the location and arrangement of library services in the building, library orientations have been found to help relieve library anxiety.39 library orientations primarily aim to address one of the causes of library anxiety: a lack of knowledge of the library. these orientations often introduce students to various library staff, which may also help with the dimension of library anxiety due to barriers with staff. other interventions have been attempted with some success. martin and park found students were more apt to request assistance from the librarian if persuaded the consultation would save time.40 mcdaniel found in a study of graduate students that the use of peer mentors was effective in reducing affective barriers.41 robbins discussed the use of library events to help ease students’ anxiety, but found in the follow-up survey many students were unaware of the events.42 diprince et al. discussed ways the use of a print guide can help alleviate library anxiety.43

oru library

oral roberts university

the oru library serves the students, faculty, and staff of oral roberts university (oru). oru is a small, private, not-for-profit, liberal arts college located in tulsa, oklahoma.
founded in 1963 by oral roberts, oru has an enrollment of approximately 3,600 students. oru is an interdenominational christian institution focused on a whole-person education of spirit, mind, and body. oru offers more than 150 majors, minors, and pre-professional programs in a range of degree fields, including business, biology, engineering, nursing, ministry, and more.44

history

“the first building will be the library which is the core of the whole academic structure.”45
oral roberts (1962)

from the founding of oru, founder oral roberts had a vision of the library’s centrality to academics.46 this set a precedent early in the history of oru library of the importance of the library to the academic work of the students and faculty of oru. expanding on traditional views of the function of an academic library to serve mainly as the repository of books and articles, through the vision of early library administrators, oru library emerged as one of the early adopters of electronic technology with the dairs (dial access information retrieval system) computer.47 throughout the years, due to a number of factors, the oru library receded from the forefront of pre-eminence in academics on campus. library practices followed the general trend of academic libraries. the oru library continued to acquire needed materials (e.g., books, journals, access to databases). library instruction likewise kept up with current models of instruction. the typical method of instruction to undergraduates has been teaching one or two sessions to a class at the request of the instructor. largely through the efforts of the instruction librarian, il became a required component of undergraduate education at oru. with rare exceptions, undergraduate students at oru are required as a part of comp 102: composition ii to attend two sessions of an il course.
other forms of ili include workshops and sessions for undergraduates working on their senior papers and other sessions for graduate and postgraduate students, all typically at the request of the instructors of classes. with the new addition of augmented, virtual, and mixed reality (avmr) learning technologies, at the behest of their dean, oru librarians have begun to look at ways to incorporate these technologies into their classes and daily work. several oru instructors are using avmr technologies in their classes.48 to help prepare students for the use of these technologies in their classes, one oru instruction librarian has begun to introduce students to avmr technologies. other oru instruction librarians are exploring ways to use avmr technologies to create visualizations of library and research concepts, such as a 3d visualization of how boolean logic works in database searches. oru instruction librarians are also exploring ways to incorporate avmr technologies into a new program of online ili. although still in very early stages of planning, the proposed online ili will include a virtual tour of the library. this paper focuses on the implementation of, and early feedback from a formative assessment on, a virtual tour of the oru library.

oru modular students

in addition to traditional 15-week semesters, two colleges at oru offer graduate modular programs: the college of education and the college of theology and ministry. many of the students who enroll in these programs are nontraditional students who are returning to college after some time. several of these students work full-time jobs and have family obligations in addition to their academic work. often, these students are not local to the tulsa campus; several are us students who live out of state and many others are international students.
the modular classes offered by both programs can be a hybrid of online and modular format. the college of theology and ministry offers one-week courses on campus; the college of education offers two-and-a-half-day on-campus classes. modular classes are intensive due to the compressed nature of the curriculum. often, modular students are visiting campus for the first time and, in addition to locating their classes, are very busy with coursework. adding to these pressures, modular students may be using computer technologies in new ways. navigating the library’s resources is yet another stressor for many of these students. students who are not familiar with the operations of an academic library may not be aware of library services or how to access them.

the project

in january 2017, the global learning center opened on the campus of oru. one hallmark of this renovated structure is the integration of avmr technologies.49 although several professors from various disciplines and colleges on campus have implemented avmr in their curricula, students’ use of the facilities was somewhat lower than had been hoped. in the fall 2018 semester, the idea of creating a virtual tour of the oru library arose from a conversation between the author and a colleague, dan eller. eller described an online ili course he envisioned for oru’s graduate theology modular students. as a part of this course, he envisioned a virtual tour that could help students by reducing their library anxiety. early in 2019, oru’s associate vice president of technology and innovation, michael mathews, contacted dr. mark roberts, dean of learning resources (of which the oru library is a part), to propose making avmr learning technologies available through the oru library. dean roberts agreed and created an avmr team of library faculty to oversee this project.
in the spring 2019 semester, the oru it department sent one of their employees, stephen guzman, to work with the library’s avmr team to set up an avmr station and to work in the library to help make these new technologies available and known to oru students. in addition to helping the library’s avmr team begin other avmr projects, guzman volunteered to take the 360 images when he learned of the library’s desire to create a virtual tour. guzman also helped select the editing software, 3dvista, for which the library acquired a license. working with the 360 images guzman took and stitched together, the author used 3dvista to create a virtual tour of the library. this software allows elements to be added to the 360 images that make up the virtual tour, both to enhance the viewer’s experience and to provide information and hyperlinks to external webpages with more information. some of the elements added to the oru library virtual tour are hotspots that enable a viewer to move from one area to another, icons that present pop-up windows with more information, and other icons that link to the online profiles of various library faculty. icons are used consistently for the same functions throughout the tour. for example, icons with arrows allow the viewer to move from one location to another (figure 2), while icons with question marks displayed over library personnel (figure 3) open that person’s profile webpage when clicked. icons that contain the letter “i” open pop-up windows with information and related links. the tour begins outside the building so new visitors will be able to recognize the building when they arrive on campus (figure 4).
viewers can navigate through subsequent 360 images by clicking on the arrow icons, so that they virtually travel the same path they will follow to enter the library when on campus. there are two other options to navigate the tour. the viewer can click on the small icons of scenes displayed on the left side of the screen to move to another area. the floor plans displayed at the upper right of the screen have red dots indicating the location of various scenes and, when clicked, move the viewer to that scene.

figure 2. avmr station near the reference desk, oru library virtual tour.

other elements of the tour include small icons of the scenes on the left of the screen. beneath these icons are the names of the various areas. the title of the current scene appears in yellow lettering, providing information to help orient the viewer. small floorplans located in the upper right side of the screen offer additional information on the location of the area (figure 3). viewers can toggle these floorplans on and off. another feature supplying location information is the dropdown menu for the floorplans (the dark blue bar at the upper right of the screen), which shows the floor level of the building on which the area is located. in the lower right of the screen, an information icon is available with details on what behavior to expect when clicking on icons and a description of the various ways to navigate the tour.

figure 3. dean mark roberts near alexa and the self-checkout station at the circulation desk, oru library virtual tour.

figure 4. oru campus, oru library tour.

methodology

the aim of the virtual tour of the library is to reduce several dimensions of students’ library anxiety.
the primary goal of the tour is to reduce anxiety related to knowledge of the library by familiarizing students with images and information regarding the building prior to their arrival on campus. another aim is to reduce barriers with staff, which we address by providing information along with images of library faculty. affective barriers and mechanical barriers are two of the most prevalent causes of library anxiety, which the tour does not directly address. the hope is, however, that by minimizing anxiety stemming from knowledge of the building and barriers with staff, the tour will encourage students to consult with librarians, particularly as information on the variety of ways to contact librarians is included in the information pop-up window on the reference desk.

pre- and post-surveys

the pre- and post-surveys administered to students included 42 statements from bostick’s library anxiety scale. bostick’s library anxiety scale, developed in 1992, is a 5-point likert scale survey instrument that contains 43 statements. the pre-survey also contains demographic questions. the one statement omitted from bostick’s original survey was number 40, “the change machines are always out of order,” as the oru library does not have change machines.50 with the exception of the demographic questions, the post-survey duplicates the pre-survey, with the same 42 statements. although several researchers have adapted bostick’s library anxiety scale, such as blundell’s adaptation to add “elements related specifically to information technology (both hardware such as computers, and software such as online research databases),”51 for the purposes of this preliminary inquiry, the researcher decided to use the original questions from the library anxiety scale. the original statements were used because reduction of library anxiety stemming from information technology use was not a goal of this study.
administration of survey

a link to the pre-survey was posted on the homepage of the oru library. the author sent email invitations containing a link to the pre-survey to students enrolled in the june 2019 summer modular theology classes. the author met with groups of education modular students during the week they were on campus (june 24–30, 2019) to recruit participants. in a library session, another librarian encouraged her modular students to participate in the study. at the end of the pre-survey, participants were given a unique number, with instructions to note it, to be used to log in to the post-survey. the link to the virtual tour appeared on the final screen of the pre-survey. the link to the post-survey was provided on the same page as the virtual tour, allowing participants to navigate to the post-survey when desired. the surveys asked for no identifying information; however, the unique number provided on the pre-survey and entered by participants on the post-survey allowed the researcher to link each participant’s responses on the two surveys. once the results were downloaded, each participant’s pre- and post-survey responses were coded p1 through p7 to track any potential effects of the virtual tour on participants’ responses. because of the low rate of participation, formal statistical analyses were not applied to these findings. the results were examined in two ways. each participant’s pre- and post-survey responses were compared to determine whether responses changed from pre- to post-survey. the total number of responses on each point of the likert scale to each of the 42 statements was examined to determine trends in participants’ levels of library anxiety.
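the pairing and comparison steps described above can be sketched in python. this is a minimal illustration under stated assumptions, not the procedure the author used: the 1–5 coding of the likert scale, the `negatively_worded` flags, the function names, and the sample ids and answers are all hypothetical. the sketch assumes a change counts as positive when a response moves toward agreement on a positively worded statement or toward disagreement on a negatively worded one.

```python
from collections import Counter

# assumed numeric coding of the 5-point likert scale (not specified in the paper)
LIKERT = {
    "strongly disagree": 1,
    "disagree": 2,
    "undecided": 3,
    "agree": 4,
    "strongly agree": 5,
}


def classify_change(pre, post, negatively_worded):
    """Classify one participant's pre/post answers to one statement."""
    delta = LIKERT[post] - LIKERT[pre]
    if delta == 0:
        return "none"
    # on a negatively worded statement, moving toward disagreement is favorable
    favorable = delta < 0 if negatively_worded else delta > 0
    return "positive" if favorable else "negative"


def tally_changes(pre_by_id, post_by_id, negatively_worded):
    """Count positive and negative changes per statement."""
    counts = {}
    # only participants whose unique number appears in both surveys can be paired
    for pid in sorted(pre_by_id.keys() & post_by_id.keys()):
        for statement, pre_answer in pre_by_id[pid].items():
            change = classify_change(
                pre_answer, post_by_id[pid][statement], negatively_worded[statement]
            )
            if change != "none":
                counts.setdefault(statement, Counter())[change] += 1
    return counts


def distribution(responses_by_id, statement):
    """Count responses on each point of the likert scale for one statement."""
    return Counter(answers[statement] for answers in responses_by_id.values())


# hypothetical sample data, keyed by the unique number given on the pre-survey
COMFORT = "i feel comfortable using the library."
UNSURE = "i am unsure how to begin my research."
negatively_worded = {COMFORT: False, UNSURE: True}

pre = {
    101: {COMFORT: "agree", UNSURE: "agree"},
    102: {COMFORT: "undecided", UNSURE: "disagree"},
}
post = {
    101: {COMFORT: "strongly agree", UNSURE: "disagree"},
    102: {COMFORT: "undecided", UNSURE: "agree"},
    103: {COMFORT: "agree", UNSURE: "agree"},  # no matching pre-survey, so skipped
}

changes = tally_changes(pre, post, negatively_worded)
for statement, tally in changes.items():
    print(statement, "positive:", tally["positive"], "negative:", tally["negative"])
print(distribution(pre, COMFORT))
```

statements on which no paired participant changed an answer are simply absent from the tally, which mirrors the way the tables of changed statements list only statements that showed change.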
results

although approximately 100 students enrolled in either the graduate theology or graduate education modular classes visited the campus june 24–30, 2019, participation in this preliminary study was extremely low. to date only seven participants have completed both the pre- and post-surveys. the responses from this formative assessment will be used by the oru library to guide future iterations of the virtual library tour and its inclusion in ili. the following discusses initial findings from the pre- and post-surveys. most of the participants reported little or no discomfort or anxiety with using the library. all participants indicated they are us citizens, and all indicated some level of familiarity with the library. four reported they had often visited the library; three responded they had visited the library previously, but not often. of the seven participants, five indicated they are graduate students, one marked “other,” and one reported doctoral-student status. participants’ ages varied: one was 20–29, one 30–39, two 40–49, and three 50 years or over. the following describes the effect the virtual tour of the library had on participants’ responses. interestingly, one participant showed no change in responses from pre- to post-survey. note: bostick’s original categorization of the statements has been retained for all 42 of the statements on both instruments.

knowledge of the library

the principal aim of the virtual tour was to reduce library anxiety related to knowledge of the library by acquainting students with “where things are located and how to find their way around in the building.”52 bostick categorized 5 of the 42 statements as knowledge of the library. based on participants’ responses, there is some indication the tour did help acquaint students with the library. changes in participants’ responses after viewing the virtual tour were mostly positive, although on two statements, responses showed a negative trend (table 1).
table 1 shows the statements on which participants’ responses changed from pre- to post-survey. the number in the positive column indicates the number of participants whose responses to that statement changed favorably following the virtual tour. the number in the negative column shows the number of participants whose post-survey responses changed unfavorably.

statement                                       positive  negative
i don’t feel physically safe in the library.    1         1
i enjoy learning new things about the library.  3         1
i want to learn how to do my own research.      1
the library is a safe place.                    2
the library is an important part of my school.  2
totals                                          9         2

table 1. statements in knowledge of the library category that showed change on post-survey.

the number of strongly disagree responses to statements in this category was unchanged from pre- to post-survey. the only strongly disagree responses were five to the statement “i don't feel physically safe in the library.” taken together with the two responses of disagree to this statement, all the participants feel safe in the library. to the statement “the library is a safe place,” all seven participants answered either agree (five on pre-survey, four on post-survey) or strongly agree (two on pre-survey, three on post-survey) (figure 6). curiously, responses to “i enjoy learning new things about the library” changed from no responses of disagree on the pre-survey to one response of disagree on the post-survey. the other shift in the number of responses of disagree was on the statement “the library is an important part of my school” (two on pre-survey, one on post-survey), indicating a slight improvement (figure 5).

figure 5.
comparison of strongly disagree and disagree responses in knowledge of the library category.

none of the respondents replied undecided to the statements in this category, except to the statement “i enjoy learning new things about the library,” which received one undecided response on the pre-survey and none on the post-survey. the other change in this category was to the statement “the library is an important part of my school,” which moved from no responses of undecided on the pre-survey to one undecided on the post-survey. the respondents, for the most part, wanted to learn to do their own research, with five responses of agree or strongly agree on both the pre- and post-surveys. five of the participants felt the library is of importance (one agree and four strongly agree). six of the seven participants reported they enjoy learning new things about the library. the shift in responses from five agree and one strongly agree on the pre-survey to two agree and four strongly agree on the post-survey indicates the tour might have affected participants’ views on this statement (figure 6).

figure 6. comparison of strongly agree and agree responses in knowledge of the library category.

affective barriers

although reducing affective barriers was not a direct goal of the virtual tour, participants’ responses showed the most gains on the post-survey within this category. this seems to indicate that viewing the virtual tour improved students’ self-perceptions of their competence in using the library and library resources. of the 42 statements on each of the instruments, 12 are in bostick’s affective barriers category. the statements in table 2 are those on which participants’ responses changed from pre- to post-survey.
the numbers in the positive column indicate the number of participant responses that improved on the post-survey. a number in the negative column indicates the number of post-survey responses that moved in a negative direction.

statement                                                               positive  negative
a lot of the university is confusing to me.                             2
i am unsure how to begin my research.                                   2
i can never find things i need in the library.                          3
i don’t know what resources are available in the library.               2
i don't know what to do next when the book i need is not on the shelf.  1
i feel comfortable using the library.                                   3
i get confused trying to find my way around the library.                2
i’m embarrassed that i don’t know how to use the library.               1         1
the directions for using the computers are not clear.
totals                                                                  17        1

table 2. statements in affective barriers category that showed change on post-survey.

looking at the responses to statements in this category reveals some possible effects of the tour and some potential areas of library anxiety. the responses to “i don’t know what resources are available in the library” were split on the pre-survey, with three responses of agree, one of strongly agree, one undecided, one of strongly disagree, and two of disagree. the post-survey responses showed almost no change; the only change was one additional response of undecided, with no strongly agree responses (figures 7.1, 8.1, 9.1). these findings indicate more information on what sources are available to patrons may be needed on the virtual tour. most of the respondents indicated confidence about where to begin research. on both pre- and post-surveys, there were five responses of strongly disagree or disagree to the statement “i am unsure how to begin my research” (figure 7.1).
most indicated they feel confident in using the library, based on the responses to the statements “i’m embarrassed that i don’t know how to use the library,” “i feel comfortable using the library,” “i can never find things in the library,” and “i get confused trying to find my way around the library” (figures 7.1, 7.2, 9.1, 9.2). responses were equally positive to the statements “the library won't let me check out as many items as i need,” “a lot of the university is confusing to me,” “i don’t know what to do next when the book i need is not on the shelf,” and “i can’t find enough space in the library to study” (figures 7.1, 7.2, 8.1, 8.2, 9.1, 9.2). responses were divided on the statement “i feel like i’m bothering the reference librarian if i ask a question” (figures 8.2, 10.2). this finding needs further research to determine what is causing students to feel reluctant to ask the librarian for assistance.

figure 7.1. comparison of strongly disagree and disagree responses in affective barriers category.

figure 7.2. comparison of strongly disagree and disagree responses in affective barriers category.

figure 8.1. comparison of undecided responses in affective barriers category.

figure 8.2. comparison of undecided responses in affective barriers category.

figure 9.1. comparison of strongly agree and agree responses in affective barriers category.

figure 9.2. comparison of strongly agree and agree responses in affective barriers category.
mechanical barriers

although reducing mechanical barriers was not a goal of the study, there was positive change in participants’ responses to both statements in this category. it is unclear how the virtual tour might have caused the improvement in participants’ perception of the reliability of machines in the library.

statement                                      positive  negative
the computer printers are often out of paper.  1
the copy machines are usually out of order.    1
totals                                         2

table 3. statements in mechanical barriers category that showed change on post-survey.

in this category, on both the pre- and post-surveys, each statement received one strongly disagree response. no respondents replied agree or strongly agree to the statements in this category. responses of disagree to each statement increased by one, from one on the pre-survey to two on the post-survey. the number of undecided responses fell from five to four on the post-survey. as noted above, it is not clear what caused the change in responses.

barriers with staff

a secondary goal of the tour was to reduce barriers with staff, and thus library anxiety, by providing information along with images of library faculty. in doing so, this study sought to reduce the anxiety students may have regarding the accessibility and approachability of library staff. in this category, responses showed some positive effects of the virtual tour on how participants viewed library staff. however, responses exhibited the most variability in this category, with almost an equal number of responses being positive or negative after viewing the tour. the reasons for this variance are unclear.
in future studies, additional space for comments will be included on the surveys, and follow-up focus-group discussions may be held, to determine the causes of negative trends in responses. table 4 shows the statements within this category on which participants’ responses changed from pre- to post-survey. the number in the positive column indicates how many participant responses changed in a favorable direction on the post-survey. the number in the negative column indicates the number of participants whose post-survey responses moved in a negative direction. on the survey instruments, 12 of the 15 statements in bostick’s barriers with staff category showed changes in responses.

statement positive negative
i can always ask a librarian if i don’t know how to work a piece of equipment in the library. 1
i can’t get help in the library at the times i need it. 1 1
if i can’t find a book on the shelf the library staff will help me. 2 2
library staff don’t have time to help me. 1 1
the librarians are unapproachable. 2
the library is a comfortable place to study. 2
the library staff doesn’t care about students. 1 3
the library staff doesn’t listen to students. 1
the reference librarians are not approachable. 2
the reference librarians are unhelpful. 2
the reference librarians don’t have time to help me because they’re always busy doing something else. 1 1
there is often no one available in the library to help me. 2 1
totals 15 12

table 4. statements in barriers with staff category that showed change on post-survey.

the findings in this category, overall, were favorable. most participants feel the librarians and library staff care and are responsive and available to students. pre-survey responses indicated one or two of the participants felt librarians are unapproachable or unhelpful. post-survey responses reflected a positive change in participants’ views on librarians’ approachability and helpfulness.
participants also reported the library to be a comfortable study location and that the rules are reasonable (figures 10.1, 10.2, 11.1, 11.2, 12.1, 12.2).

figure 10.1. comparison of strongly disagree and disagree responses in barriers with staff category.

figure 10.2. comparison of strongly disagree and disagree responses in barriers with staff category.

figure 11.1. comparison of undecided responses in barriers with staff category.

figure 11.2. comparison of undecided responses in barriers with staff category.

figure 12.1. comparison of strongly agree and agree responses in barriers with staff category.

figure 12.2. comparison of strongly agree and agree responses in barriers with staff category.

comfort with the library

according to collins and veal, comfort with the library is students’ perceptions of the library as a “safe and comforting environment.”53 of the 42 statements, bostick placed 8 within this category, all of which showed some change in responses from pre-survey to post-survey. the changes reflected in this category were mostly positive, but it is unclear how the virtual tour might have influenced participants’ perceptions on statements such as “there is too much crime in the library” or “good instructions for using the library’s computers are available.” further investigation is needed to determine what may account for changes in perception on statements such as these. table 5 depicts the changes, both positive and negative, in participants’ responses on the statements in this category.
statement positive negative
good instructions for using the library’s computers are available. 2
i don’t understand the library’s overdue fines. 1 2
i feel comfortable in the library. 2
i feel safe in the library. 2 1
the library never has the materials i need. 1
the people who work at the circulation desk are helpful. 3 1
the reference librarians are unfriendly. 1
there is too much crime in the library. 1 2
totals 12 7

table 5. statements in comfort with the library category that showed change on post-survey.

the following bar graphs compare the responses on the pre-surveys to the post-survey responses within this category. as with other categories, responses were mostly favorable in this category (figures 13, 14, 15).

figure 13. comparison of strongly disagree and disagree responses in comfort with the library category.

figure 14. comparison of undecided responses in comfort with the library category.

figure 15. comparison of strongly agree and agree responses in comfort with the library category.

conclusion

the oru library has found the virtual tour to be of use in familiarizing students with the library. students who viewed the tour during its creation said they wished such a tour had been available when they began college and commented on the assistance the tour will provide new students. a limitation of this study is the low participation, with no participation from students from some of the groups that other studies have shown may have higher levels of library anxiety (e.g., new students, international students).
however, given the indications of positive effects of the virtual tour from our study results and anecdotal statements, we are encouraged that this tool will assist our students in reducing library anxiety, with the result that they will visit and use the library more often to their benefit. again, although participation was low, these results have also encouraged oru librarians to seek other ways to include avmr and other innovative technologies in our instruction, outreach, and services. the 360 virtual tour of the library is undergoing updates and additions to give students with disabilities information on access points and accessible restrooms. other projects underway include incorporating avmr in il sessions, the addition of a digital sandbox with various technologies and equipment including a vr station, and the addition of vr equipment in our designated faculty research room for use by university faculty to learn and teach students how to use avmr technologies. the response from students and faculty to these new services has been enthusiastic and suggests that the oru library is positively influencing and supporting the academic work of oru faculty and students.

recommended reading

varnum, kenneth j. beyond reality: augmented, virtual, and mixed reality in the library. chicago: ala editions, 2019.

elliott, christine, marie rose, and jolanda-pieta van arnhem. augmented and virtual reality in libraries. lanham, md: rowman & littlefield, 2018.

endnotes

1 anthony j. onwuegbuzie and qun g. jiao, “information search performance and research achievement: an empirical test of the anxiety-expectation mediation model of library anxiety,” journal of the american society for information science & technology 55, no. 1 (2004): 41–54, https://doi.org/10.1002/asi.10342; qun g. jiao and anthony j.
onwuegbuzie, “is library anxiety important?,” library review 48, no. 6 (1999), https://doi.org/10.1108/00242539910283732; qun g. jiao and anthony j. onwuegbuzie, library anxiety: the role of study habits (paper presented at the annual meeting of the mid-south educational research association (msera), bowling green, kentucky, november 15–17, 2000), http://files.eric.ed.gov/fulltext/ed448781.pdf.

2 constance a. mellon, “library anxiety: a grounded theory and its development,” college & research libraries 47, no. 2 (1986), https://doi.org/10.5860/crl_47_02_160; see also constance a. mellon, “library anxiety: a grounded theory and its development,” college & research libraries 76, no. 3 (2015), https://doi.org/10.5860/crl.76.3.276.

3 diane mizrachi, “library anxiety,” encyclopedia of library and information sciences (boca raton, fl: crc press, 2017): 2782.

4 mellon, “library anxiety,” (1986): 163; see also mellon, “library anxiety,” (2015): 280.

5 mellon, “library anxiety,” (1986): 162; see also mellon, “library anxiety,” (2015): 278.

6 alison j. head and michael b. eisenberg, truth be told: how college students evaluate and use information in the digital age: project information literacy progress report (university of washington's information school, 2010): 3.

7 sharon lee bostick, “the development and validation of the library anxiety scale,” (phd diss., wayne state university, 1992); qun g. jiao and anthony j. onwuegbuzie, “antecedents of library anxiety,” library quarterly 67, no. 4 (1997): 72, https://doi.org/10.1086/629972.
8 jiao and onwuegbuzie, “antecedents of library anxiety.”

9 mellon, “library anxiety” (1986); see also mellon, “library anxiety” (2015); see also constance a. mellon, “attitudes: the forgotten dimension in library instruction,” library journal 113, no. 14 (1988).

10 kathleen m. t. collins and robin e. veal, “off-campus adult learners’ levels of library anxiety as a predictor of attitudes toward the internet,” library & information science research 26, no. 1 (2004): 4, https://doi.org/10.1016/j.lisr.2003.11.002.

11 mizrachi, “library anxiety,” 2784.

12 anthony j. onwuegbuzie, “writing a research proposal: the role of library anxiety, statistics anxiety, and composition anxiety,” library & information science research 19, no. 1 (1997), https://doi.org/10.1016/s0740-8188(97)90003-7.

13 carol collier kuhlthau, “developing a model of the library search process: cognitive and affective aspects,” research quarterly 28, no. (winter 1988), https://www.jstor.org/stable/25828262; carol c. kuhlthau, “inside the search process: information seeking from the user’s perspective,” journal of the american society for information science 42, no. 5 (1991), https://doi.org/10.1002/(sici)1097-4571(199106)42:5<361::aid-asi6>3.0.co;2-%23.

14 shelley blundell, “documenting the information-seeking experience of remedial undergraduate students,” proceedings from the document academy 1, no. 1 (2014), https://doi.org/10.35492/docam/1/1/4.

15 blundell, “documenting the information-seeking experience,” 5.

16 blundell, “documenting the information-seeking experience,” 6.

17 used by permission of the author. retrieved from http://remedialundergraduateaisp.pbworks.com/w/file/88755941/modelrevised%20-%208.4.jpg.

18 blundell, “documenting the information-seeking experience”; melissa gross and don latham, “attaining information literacy: an investigation of the relationship between skill level, self estimates of skill, and library anxiety,” library & information science research 29, no.
3 (2007), https://doi.org/10.1016/j.lisr.2007.04.012; melissa gross and don latham, “undergraduate perceptions of information literacy: defining, attaining, and self-assessing skills,” college & research libraries 70, no. 4 (2009), https://doi.org/10.5860/0700336; melissa gross and don latham, “experiences with and perceptions of information: a phenomenographic study of first-year college students,” library quarterly 81, no. 2 (2011), https://doi.org/10.1086/658867; melissa gross, “the impact of low-level skills on https://doi.org/10.1086/629972 https://doi.org/https:/doi.org/10.1016/j.lisr.2003.11.002 https://doi.org/10.1016/s0740-8188(97)90003-7 https://www.jstor.org/stable/25828262 https://doi.org/10.1002/(sici)1097-4571(199106)42:5%3c361::aid-asi6%3e3.0.co;2-%23 https://doi.org/10.1002/(sici)1097-4571(199106)42:5%3c361::aid-asi6%3e3.0.co;2-%23 https://doi.org/10.35492/docam/1/1/4 http://remedialundergraduateaisp.pbworks.com/w/file/88755941/modelrevised%20-%208.4.jpg http://remedialundergraduateaisp.pbworks.com/w/file/88755941/modelrevised%20-%208.4.jpg https://doi.org/10.1016/j.lisr.2007.04.012 https://doi.org/10.5860/0700336 https://doi.org/10.1086/658867 information technology and libraries march 2020 using augmented and virtual reality in information literacy instruction | sample 26 information-seeking behavior: implications of competency theory for research and practice,” reference & user services quarterly (2005), https://www.jstor.org/stable/20864481. 19 mellon, “attitudes,” 138; jiao and onwuegbuzie, “antecedents of library anxiety.” 20 qun g. jiao and anthony j. onwuegbuzie, “perfectionism and library anxiety among graduate students,” journal of academic librarianship 24, no. 5 (1998), https://doi.org/10.1016/s00991333(98)90073-8; jiao and onwuegbuzie, “is library anxiety important?”; qun g. jiao and anthony j. 
onwuegbuzie, “library anxiety among international students” (paper presented at the annual meeting of the mid-south education research association point clear, alabama, november 17–19, 1999), https://eric.ed.gov/?id=ed437973; qun g. jiao and anthony j. onwuegbuzie, “self-perception and library anxiety: an empirical study,” library review 48, no. 3 (1999), https://doi.org/10.1108/00242539910270312; qun g. jiao and anthony j. onwuegbuzie, “identifying library anxiety through students’ learning-modality preferences,” library quarterly 69, no. 2 (1999), https://doi.org/10.1086/603054; qun g. jiao and anthony j. onwuegbuzie, library anxiety: the role of study habits; qun g. jiao and anthony j. onwuegbuzie, “library anxiety and characteristic strengths and weaknesses of graduate students’ study habits,” library review 50, no. 2 (2001), https://doi.org/10.1108/00242530110381118; qun g. jiao and anthony j. onwuegbuzie, “dimensions of library anxiety and social interdependence: implications for library services, ” library review 51, no. 2 (2002), https://doi.org/10.1108/00242530210418837; qun g. jiao and anthony j. onwuegbuzie, the relationship between library anxiety and reading ability (paper presented at the annual meeting of the mid-south educational research association, chattanooga, tennessee, november 6–8, 2002), https://eric.ed.gov/?id=ed478612; qun g. jiao and anthony j. onwuegbuzie, “reading ability as a predictor of library anxiety,” library review 52, no. 4 (2003), https://doi.org/10.1108/00242530310470720; anthony j. onwuegbuzie, and vicki l. waytowich, “the relationship between citation errors and library anxiety: an empirical study of doctoral students in education,” information processing & management 44, no. 2 (2008), https://doi.org/10.1016/j.ipm.2007.05.007; onwuegbuzie, “writing a research proposal”; anthony j. onwuegbuzie and qun g. 
jiao, “i’ll go to the library later: the relationship between academic procrastination and library anxiety,” college & research libraries 61, no. 1 (2000), https://doi.org/10.5860/crl.61.1.45; onwuegbuzie and jiao, “information search performance and research achievement”; anthony j. onwuegbuzie, qun g. jiao, and sharon l bostick, library anxiety: theory, research, and applications, vol. 1 (lanham, maryland: scarecrow press, 2004). 21 jiao and onwuegbuzie, “identifying library anxiety”; qun g. jiao, anthony j. onwuegbuzie, and art a. lichtenstein, “library anxiety: characteristics of ‘at-risk’ college students,” library & information science research 18, no. 2 (1996), https://doi.org/10.1016/s07408188(96)90017-1; nahyun kwon, “a mixed-methods investigation of the relationship between critical thinking and library anxiety among undergraduate students in their information search process,” college & research libraries 69, no. 2 (2008), https://doi.org/10.5860/crl.69.2.117; mellon, “attitudes.” 22 gaby haddow, “academic library use and student retention: a quantitative analysis,” library & information science research 35, no. 
2 (2013), https://www.jstor.org/stable/20864481 https://doi.org/10.1016/s0099-1333(98)90073-8 https://doi.org/10.1016/s0099-1333(98)90073-8 https://eric.ed.gov/?id=ed437973 https://doi.org/10.1108/00242539910270312 https://doi.org/10.1086/603054 https://doi.org/10.1108/00242530110381118 https://doi.org/10.1108/00242530210418837 https://eric.ed.gov/?id=ed478612 https://doi.org/10.1108/00242530310470720 https://doi.org/10.1016/j.ipm.2007.05.007 https://doi.org/10.5860/crl.61.1.45 https://doi.org/10.1016/s0740-8188(96)90017-1 https://doi.org/10.1016/s0740-8188(96)90017-1 https://doi.org/10.5860/crl.69.2.117 information technology and libraries march 2020 using augmented and virtual reality in information literacy instruction | sample 27 https://doi.org/https://doi.org/10.1016/j.lisr.2012.12.002; adam murray, ashley ireland, and jana hackathorn, “the value of academic libraries: library services as a predictor of student retention,” college & research libraries 77, no. 5 (2016), https://doi.org/10.5860/crl.77.5.631; krista m. soria, “factors predicting the importance of libraries and research activities for undergraduates,” journal of academic librarianship 39, no. 6 (2013), https://doi.org/10.1016/j.acalib.2013.08.017; krista m soria, jan fransen, and shane nackerud, “library use and undergraduate student outcomes: new evidence for students’ retention and academic success,” portal: libraries and the academy 13, no. 2 (2013), https://doi.org/10.1353/pla.2013.0010; krista m. soria, jan fransen, and shane nackerud, “stacks, serials, search engines, and students’ success: first-year undergraduate students’ library use, academic achievement, and retention,” journal of academic librarianship 40, no. 1 (2014), https://doi.org/10.1016/j.acalib.2013.12.002; krista m soria, jan fransen, and shane nackerud, “beyond books: the extended academic benefits of library use for firstyear college students,” college & research libraries 78, no. 
1 (2017), https://doi.org/10.5860/crl.78.1.8. 23 jiao, onwuegbuzie, and lichtenstein, “library anxiety,” 1. 24 jiao and onwuegbuzie, “identifying library anxiety”; see also bostick, “the development and validation”; barbara fister, julie gilbert, and amy ray fry, “aggregated interdisciplinary databases and the needs of undergraduate researchers,” portal: libraries and the academy 8, no. 3 (2008), https://doi.org/10.1353/pla.0.0003; mellon, “library anxiety”; jiao and onwuegbuzie, “perfectionism and library anxiety among graduate students”; jiao and onwuegbuzie, “is library anxiety important?”; jiao and onwuegbuzie, “library anxiety among international students”; jiao and onwuegbuzie, “self-perception and library anxiety: an empirical study”; jiao and onwuegbuzie, “identifying library anxiety through students’ learning-modality preferences”; jiao and onwuegbuzie, library anxiety: the role of study habits; jiao and onwuegbuzie, “library anxiety and characteristic strengths and weaknesses of graduate students’ study habits”; jiao and onwuegbuzie, “dimensions of library anxiety and social interdependence”; jiao and onwuegbuzie, the relationship between library anxiety and reading ability; jiao and onwuegbuzie, “reading ability as a predictor of library anxiety”; onwuegbuzie and waytowich, “the relationship between citation errors and library anxiety”; onwuegbuzie, “writing a research proposal”; onwuegbuzie and jiao, “i'll go to the library later”; onwuegbuzie and jiao, “information search performance and research achievement”; onwuegbuzie, jiao, and bostick, library anxiety: theory, research, and applications. 25 onwuegbuzie and jiao, “the relationship”; anthony onwuegbuzie and qun g. jiao, “understanding library-anxious graduate students,” library review 47, no. 4 (1998), https://doi.org/10.1108/00242539810212812. 26 jiao and onwuegbuzie, “is library anxiety important?” 27 qun g. jiao, anthony j. 
onwuegbuzie, and sharon l bostick, “racial differences in library anxiety among graduate students,” library review 53, no. 4 (2004), https://doi.org/10.1108/00242530410531857; qun g. jiao, anthony j. onwuegbuzie, and sharon l. bostick, “the relationship between race and library anxiety among graduate https://doi.org/https:/doi.org/10.1016/j.lisr.2012.12.002 https://doi.org/10.5860/crl.77.5.631 https://doi.org/10.1016/j.acalib.2013.08.017 https://doi.org/10.1353/pla.2013.0010 https://doi.org/10.1016/j.acalib.2013.12.002 https://doi.org/10.5860/crl.78.1.8 https://doi.org/10.1353/pla.0.0003 https://doi.org/10.1108/00242539810212812 https://doi.org/10.1108/00242530410531857 information technology and libraries march 2020 using augmented and virtual reality in information literacy instruction | sample 28 students: a replication study,” information processing & management 42, no. 3 (2006), https://doi.org/10.1016/j.ipm.2005.03.018. 28 mizrachi, “library anxiety,” 2784. 29 anthony j. onwuegbuzie and qun g. jiao, “academic library useage: a comparison of native and non-native english-speaking students,” australian library journal 46, no. 3 (1997): 263, https://doi.org/10.1080/00049670.1997.10755807; jiao and onwuegbuzie, “antecedents of library anxiety.” 30 jiao and onwuegbuzie, “library anxiety among international students.” 31 yunhui lu and denice adkins, “library anxiety among international graduate students,” proceedings of the american society for information science and technology 49, no. 1 (2012), https://doi.org/10.1002/meet.14504901319. 32 collins and veal, “off-campus adult.” 33 nahyun kwon, anthony j. onwuegbuzie, and linda alexander, “critical thinking disposition and library anxiety: affective domains on the space of information seeking and use in academic libraries,” college & research libraries 68, no. 3 (2007): 276, https://doi.org/10.5860/crl.68.3.268. 
34 kwon, “a mixed-methods investigation.” 35 judy carol bell, “student affect regarding library-based and web-based research before and after an information literacy course,” journal of librarianship & information science 43, no. 2 (2011), https://doi.org/10.1177/0961000610383634. 36 jessica platt and tyson l platt, “library anxiety among undergraduates enrolled in a research methods in psychology course,” behavioral & social sciences librarian 32, no. 4 (2013): 248, https://doi.org/10.1080/01639269.2013.841464. 37 rachel a. fleming-may, regina mays, and rachel radom, “‘i never had to use the library in high school’: a library instruction program for at-risk students,” portal: libraries and the academy 15, no. 3 (2015), https://doi.org/10.1353/pla.2015.0038. 38 catherine pellegrino, “does telling them to ask for help work?,” reference & user services quarterly 51, no. 3 (2012), https://doi.org/10.5860/rusq.51n3.272. 39 kathy christie anders, stephanie j. graves, and elizabeth german, “using student volunteers in library orientations,”practical academic librarianship: the international journal of the sla 6, no. 2 (2016): 17–30, http://hdl.handle.net/1969.1/166249. 40 pamela n. martin and lezlie park, “reference desk consultation assignment: an exploratory study of students’ perceptions of reference service,” reference & user services quarterly 49, no. 4 (2010), https://doi.org/10.5860/rusq.49n4.333. 
https://doi.org/10.1016/j.ipm.2005.03.018 https://doi.org/10.1080/00049670.1997.10755807 https://doi.org/10.1002/meet.14504901319 https://doi.org/10.5860/crl.68.3.268 https://doi.org/10.1177/0961000610383634 https://doi.org/10.1080/01639269.2013.841464 https://doi.org/10.1353/pla.2015.0038 https://doi.org/10.5860/rusq.51n3.272 http://hdl.handle.net/1969.1/166249 https://doi.org/10.5860/rusq.49n4.333 information technology and libraries march 2020 using augmented and virtual reality in information literacy instruction | sample 29 41 sarah mcdaniel, “library roles in advancing graduate peer-tutor agency and integrated academic literacies,” reference services review 46, no. 2 (2018), https://doi.org/10.1108/rsr-02-2018-0017. 42 elaine m. robbins, “breaking the ice: using non-traditional methods of student involvement to effect [sic] a welcoming college library environment,” southeastern librarian 62, no. 1 (2014), https://digitalcommons.kennesaw.edu/seln/vol62/iss1/5. 43 elizabeth diprince et al., “don’t panic!,” reference & user services quarterly 55, no. 4 (2016), https://doi.org/10.5860/rusq.55n4.283. 44 oral roberts university, “about oru,” (2019), https://www.oru.edu/admissions/undergraduate/. 45 oral roberts, our partnership with god [sound recording]. eighth world outreach, oral roberts evangelistic association, tulsa, ok: abundant life recordings, 1962). 46 oral roberts, our partnership. 47 margaret m. grubiak, “an architecture for the electronic church: oral roberts university in tulsa, oklahoma,” technology and culture 57, no. 2 (2016), https://doi.org/10.1353/tech.2016.0066. 48 stephanie hill, “oru receives innovation award,” press release, may 2, 2017, http://www.oru.edu/news/oru_news/20170502-glc-innovation-award.php?locale=en. 49 hill, “oru receives.” 50 bostick, “the development and validation,” 160. 51 blundell, “documenting the information-seeking experience,” 263. 52 mizrachi, “library anxiety,” 2784. 53 collins and robin e. 
providing bibliographic services from machine-readable data bases: the library's role
richard de gennaro: director of libraries, university of pennsylvania, philadelphia
libraries will play a key role in providing access to data bases, but not by subscribing to tape services and establishing local processing centers as is commonly assumed. high costs and the nature of the demand will make this approach unfeasible. it is more likely that the library's reference staff will develop the capability of serving as a broker between the local campus user and the various regional or specialized retail distribution centers which exist or will be established.
this brief paper will attempt to counter the widely held view that the larger research libraries will soon need to begin subscribing to the growing number of data bases in machine-readable form and providing current awareness and other services from them for their local users.* it will speculate on how this field might develop and will suggest a less expensive and more feasible strategy which libraries may use to gain access to these increasingly important bibliographic services.
the key question of who will pay for these new services, the user or the institution, will also be discussed.
* this paper was developed from a talk by the author on a panel entitled "library management of machine-readable data bases." the program was jointly sponsored by cola, isad, and acrl and took place at the ala conference in las vegas, june 24, 1973.
journal of library automation vol. 6/4 december 1973
while it is clearly outside the scope of this paper to review the state of the art of data base services, reference to a few key works and a brief introduction to the subject may be helpful. the most comprehensive and authoritative review of the state of the art of the field and its literature is the excellent chapter entitled "machine-readable bibliographic data bases" by marvin c. gechman in the 1972 volume of the annual review of information science and technology.1 a useful selection of readings is key papers on the use of computer-based bibliographic services edited by stella keenan and published jointly by the american society for information science and the national federation of abstracting and indexing services in 1973.2 a study of six university-based information systems made by the national bureau of standards is essential and contains in convenient form comparative and descriptive information about these pioneering centers which are sponsored by the national science foundation.3
some of the most useful and important data bases available are those that have been developed by the indexing and abstracting services as byproducts of their efforts to automate the production of their regular printed publications. like the publications, the tapes come in a wide variety of incompatible formats. among the important producers are: chemical abstracts service, biosciences information service, engineering index inc., american institute of physics, and the american geological institute.
ccm information corporation (pandex) and the institute for scientific information are two examples of major commercial suppliers. several of the scientific societies received substantial grants from the national science foundation and other sources in the 1960s for this automation effort, and it was generally expected that an important new market for the by-product tapes would develop among researchers in universities and in industry.
imaginative and forward-looking librarians and computer people at various universities applied for and received grants to establish centers where these new data tapes could be used to provide current awareness and retrospective search services to users. the national aeronautics and space administration established a network of regional dissemination centers at six universities, including the universities of connecticut, indiana, and new mexico, the north carolina science and technology research center, university of pittsburgh, and the university of southern california. the national science foundation has been supporting centers at the university of georgia, lehigh university, university of california at los angeles, ohio state university, and stanford university. other centers have been established at the illinois institute of technology research institute and the university of florida. it is worth noting that nearly all centers provide services free to their own institutional users and continue to be heavily subsidized. all seem eager to expand their markets to include paying customers from a larger region.
the latest entry into this field is the new england board of higher education's northeast academic science information center (nasic) sponsored by nsf. nasic's approach is basically different from the unitary centers that have been named. it will attempt to become a broker between the various existing centers and its own members, facilitating their access to existing services elsewhere.
it will serve a ten-state region and is expected, perhaps somewhat optimistically, to become self-supporting after the three-year grant period ends.
the number of data bases available in the united states is now over a hundred and is growing rapidly, apparently without benefit of firm standards. a parallel development is taking place in europe. as the number of available data bases increases, and as the activity at these centers expands, more and more librarians become interested in and concerned about how they are going to provide these new, important, and expensive services on their own campuses.
interest among librarians in data base services is running high. a session at the association of research libraries conference in the spring of 1973 was devoted to it, and a program at the annual meeting of the american library association in las vegas on the subject was jointly sponsored by the cola discussion group, the information science and automation division, and the association of college and research libraries. while this interest is commendable and should be stimulated, it is also important that it be tempered and put into perspective by a realistic consideration of some of the costs and problems involved in providing these services. this is what the remainder of this paper will attempt to do.
the title of the ala program was "library management of machine-readable reference data bases." implied in that title are two basic assumptions that are widely accepted: one is that libraries will play a key role in providing access to information in machine-readable data bases on their campuses. the other is that in order to provide this access they will have to acquire and maintain these data bases and develop the capability of searching and manipulating them for their local users. the first assumption is valid; libraries will be responsible for assisting users in gaining access to information in this new form.
the second assumption is highly questionable, if not invalid. it is extremely unlikely that many individual libraries will be able to afford to establish centers to acquire and process these machine-readable data bases. while it may appear that a straw man is being set up that can be easily demolished, the idea that academic libraries must and will begin acquiring and servicing many large and expensive data bases, and even statistical data banks, is still widely enough held that it ought to be put to rest.
how did this idea gain such currency? perhaps it was because the first available data bases were from the indexing and abstracting services and contained machine-readable versions of their printed indexes. since libraries subscribed to the printed editions, it followed that they should also subscribe to the tape editions. the same is true for the census tapes. libraries were the chief repositories for printed census publications, so it was natural to assume that they would have to subscribe to and make available the machine-readable census data as well. we now know better about the census tapes; the problem was simply beyond our resources, and they are being made available from specialized centers. a similar solution may well emerge for the bibliographical data tapes of the indexing and abstracting services.
to help put matters into perspective, it might be useful to review a few other ideas we had in the last two decades on how certain technological developments would be implemented in the library. take microfilm, for example. back in the 1950s when microfilm came of age for library use, many librarians thought that every major library would require its own laboratory where large quantities of film could be produced and processed under the direction of a new breed of librarian called a documentalist.
several major libraries did establish such laboratories for a time, but the only remaining ones of any significance are at the library of congress and a few other large libraries. most of the others were put out of business by the copying machine, the local service bureau, and commercial micropublishers, and the documentalists became information scientists.
library automation provides other interesting examples. many of us recall that in the 1960s it was a commonly held view that each major library would have to automate its operations, and that librarians would learn to master the computer that was soon to be installed in every library basement, or see themselves replaced by computer experts. as we all know, it did not happen that way. librarians will probably end up with computer terminals or minicomputers, with software packages supplied by library cooperatives or commercial vendors.
when the marc tapes were first made available, it was assumed (and this is what the marc i experiment was all about) that each library would have to subscribe to the tapes and design, implement, and operate its own system to use the data in its cataloging operations. again, it did not happen that way. marc data are being used by libraries, but indirectly through cooperative centers such as oclc, or through commercial vendors of card services such as information design or josten's, inc. individual libraries are not subscribing to marc tapes, as we had thought would be the case.
the point of citing these few examples is to suggest that it is extremely difficult in the early stage of a new technology to predict with any confidence how it will be introduced and implemented, and what effects it will have.
we seem to have a natural tendency first to try to cope with each new technological development on a do-it-yourself individual library level, and when experience teaches us that implementing the particular technology is more difficult and more expensive than we thought, we regroup and try a broader-based approach. this is approximately where we are with data base services; it is time for a broader-based approach.
again, it is unlikely that libraries will provide access to machine-readable data by setting up their own campus information centers to acquire and process data bases. anyone who takes the time to look at a list of data bases available and their annual subscription rates will understand that research library book budgets will not be large enough to cover these additional subscription costs. in fact, the subscriptions are only a minor element in the total cost of providing these services. the data bases must be cumulated and maintained. programs to manipulate and access them in their many nonstandard formats and contents must be written or adapted. the cost of administering and marketing the services and interfacing with the users will be high.
perhaps the most critical question to be answered is: will the individual user be charged for the services he uses or will the costs be absorbed by the university? the answer to that question will determine how and to what extent the machine-based services will be used in the future. if they are offered free, as are traditional library services, then one can assume with some confidence that a substantial demand for them will materialize.
this has in fact been the early experience of the centers at the university of georgia and ohio state and others where use has been totally subsidized by grant money.3,4 on the other hand, if the individual user is asked to pay for these services out of his own pocket or even out of departmental or grant funds, the market for them will be severely limited. it is extremely unlikely that large numbers of faculty and other researchers in universities will be seriously interested in becoming paying users of machine-based information services. the experience of c. c. parker at the university of southampton may prove to be typical.5 he reported a drop from forty-seven to five users of an sdi service after charges were introduced. it was not that the users could not pay the charges, but that they preferred to use their resources for other more important needs. the national library of medicine recently instituted user charges in the medline system in order to effect a needed reduction in the number of users.
the case for giving these services to users free is theoretically sound in the traditional library context, but there are practical difficulties. first, these services will be expensive and they will require a net addition to library budgets rather than a transfer from one activity to another; the prospects for such budget increases seem dim in the next few years. second, if the services are offered free, there will be no natural or automatic mechanism for controlling their use, and such control is essential to limit costs. once users get on a free subscription list they will tend to stay on it whether they actually use the products or not. this happens in many libraries where current accessions lists are regularly sent to faculty, most of whom discard them unread.
on the other hand, there is ample precedent for charging a modest fee for certain services in libraries. the best example is the almost universal charge for photocopies.
in those instances where libraries offered free copies, the service was abused and charges had to be reinstated.
it seems likely that a combination of institutional subsidy and individual charges will evolve as the dominant method of paying for machine-readable services. in order to recover some costs and prevent abuses, an appropriate system of charges will have to be instituted in spite of the logic of the argument for free services. incidentally, the case for free computer time in universities is perhaps equally valid, but it has never been accepted by the responsible budget officers.
regardless of who pays, these services will have to be advertised and marketed aggressively to reach the limited number of potential users on each campus. it will not be enough to announce their availability and wait for customers. but even the best salesman on the most research-oriented campus will probably fail to find enough users to justify the high costs of providing the extensive and diverse subject coverage that every university will require.
the solution, of course, lies in the establishment of a small number of comprehensive regional or even national information processing centers, possibly backed up by a much larger number of specialized centers or services for particular subject or mission-oriented fields such as physics, chemistry, medicine, pollution, urban studies, census data, etc.
libraries will play a key role in facilitating access to data bases by functioning as the interface or broker between the users on campus and these regional and special processing and distribution centers. this means that they must develop a new kind of information or data services librarian on their reference staffs whose function it will be to publicize these services and maintain extensive files of information on their scope, contents, cost, and availability.
these reference specialists will also guide users to the most appropriate services, help them to build and maintain their interest profiles, and provide assistance with the business aspects of dealing with vendors.** after an initial start-up period, this function should and doubtless will become a fully integrated part of the regular reference service, and the need for specialists will disappear as this knowledge becomes a part of every reference librarian's repertoire.
** the university of pennsylvania library recently established a data services office based on this concept with encouraging early results.
the available data base services fall into two main categories: off-line batch and on-line interactive services. the most commonly available up to now have been regular off-line current awareness (sdi) services based on an interest profile; these have been supplemented by occasional requests for retrospective searches of the older files. the results of these off-line searches are delivered to the subscriber by conventional mail. on-line services permit the user or the reference specialist to access a portion of the data base directly via terminals and telephone lines and perform the search in an interactive mode. some results are immediately displayed on the terminal and others are sent by mail.
the lockheed information retrieval service and systems development corporation have recently begun offering interactive searching with on-line computer terminals of a large selection of the most useful bibliographic data bases. with this capability commercially available from leased terminals on a fee-per-use basis, it will be difficult for a university or even some existing centers to justify subscribing to and maintaining these data bases for their own limited use. if lockheed, sdc, and other vendors can develop the market and operate these services at a profit, they may be able to satisfy a very substantial portion of the need for these new bibliographic services. medline, toxline, recon, and the new york times information bank provide other models for specialized and centralized interactive services.
some authorities assert that this trend toward on-line interactive searching will accelerate and eventually supersede tape searching.6 others argue that the cost of maintaining and searching on-line the really large data bases is prohibitive and will remain so for several years to come. it seems most likely to this author that the trend will be toward on-line systems covering a limited period of time, probably the latest three to five years, with supporting off-line services for retrospective searches. if this proves to be the case, libraries will find it practical and convenient to make terminals available at or near reference desks.
a close look at the several centers which now exist on individual campuses would probably show that they are heavily subsidized by grant or other outside funds, and that they are trying to expand to serve their states or even wider regions in order to achieve greater cost effectiveness. these centers deserve the credit that is always due pioneers. they are in the process of developing the patterns for providing these services in the future. one of the chief lessons they may have already taught us is that a single university, or even possibly a single state or region, is not a large enough market base upon which to build this activity. these centers will require a large volume of business to justify their high overhead and operating costs and they will seek and welcome additional paying customers.
to summarize and conclude, libraries will play a key role in providing access to machine-readable data bases, but they will generally not do it by acquiring and managing these data bases in local campus centers because of the high costs involved.
these high costs and the limited market will restrict the number of processing centers to several regional or even national centers, supplemented by a larger number of specialized discipline and mission-oriented services. many data bases and services will be available on a fee-for-service basis either through existing centers or directly from professional societies, government agencies, and commercial vendors with the library serving as facilitator or broker. it seems likely that a combination of institutional subsidies and individual charges will emerge as the pattern for paying for these new computer-based bibliographical services.

references

1. marvin c. gechman, "machine-readable bibliographic data bases," in annual review of information science and technology, v. 7 (washington, d.c.: asis, 1972), p.323-78.
2. stella keenan, ed., key papers on the use of computer-based bibliographic services (washington, d.c.: asis, 1973).
3. b. marron, and others, a study of six university-based information systems (washington, d.c.: national bureau of standards, 1973 [nbs technical note 781]).
4. james l. carmon, "a campus-based information center," special libraries 64:65-69 (feb. 1973).
5. c. c. parker, "the use of external current awareness services at southampton university," aslib proceedings 25:4-17 (jan. 1973).
6. m. cerville, l. d. higgins, and francis j. smith, "interactive reference retrieval in large files," information storage and retrieval 7:205-10 (dec. 1971).
34 information technology and libraries | march 2010

tagging: an organization scheme for the internet

marijke a. visser

how should the information on the internet be organized? this question and the possible solutions spark debates among people concerned with how we identify, classify, and retrieve internet content. this paper discusses the benefits and the controversies of using a tagging system to organize internet resources.
tagging refers to a classification system where individual internet users apply labels, or tags, to digital resources. tagging increased in popularity with the advent of web 2.0 applications that encourage interaction among users. as more information is available digitally, the challenge to find an organizational system scalable to the internet will continue to require forward thinking. trained to ensure access to a range of informational resources, librarians need to be concerned with access to internet content. librarians can play a pivotal role by advocating for a system that supports the user at the moment of need. tagging may just be the necessary system.

who will organize the information available on the internet? how will it be organized? does it need an organizational scheme at all? in 1998, thomas and griffin asked a similar question, “who will create the metadata for the internet?” in their article with the same name.1 ten years later, this question has grown beyond simply supplying metadata to assuring that at the moment of need, someone can retrieve the information necessary to answer their query. given new classification tools available on the internet, the time is right to reassess traditional models, such as controlled vocabularies and taxonomies, and contrast them with folksonomies to understand which approach is best suited for the future. this paper gives particular attention to delicious, a social networking tool for generating folksonomies. the amount of information available to anyone with an internet connection has increased in part because of the internet’s participatory nature. users add content in a variety of formats and through a variety of applications to personalize their web experience, thus making internet content transitory in nature and challenging to lock into place. the continual influx of new information is causing a rapid cultural shift, more rapid than many people are able to keep up with or anticipate.
conversations on a range of topics that take place using web technologies happen in real time. unless you are a participant in these conversations and debates using web-based communication tools, changes are passing you by. internet users in general have barely grasped the concept of web 2.0 and already the advanced “internet cognoscenti” write about web 3.0.2 regarding the organization and availability of internet content, librarians need to be ahead of the crowd as the voice who will assure content will be readily accessible to those that seek it. internet users actively participating in and shaping the online communities are, perhaps unintentionally, influencing how those who access information via the internet expect to be able to receive and use digital resources. librarians understand that the way information is organized is critical to its accessibility. they also understand the communities in which they operate. today, librarians need to be able to work seamlessly among the online communities, the resources they create, and the end user. as internet use evolves, librarians as information stakeholders should stay abreast of web 2.0 developments. by positioning themselves to lead the future of information organization, librarians will be able to select the best emerging web-based tools and applications, become familiar with their strengths, and leverage their usefulness to guide users in organizing internet content. shirky argues that the internet has allowed new communities to form. primarily online, these communities of internet users are capable of dramatically changing society both on- and offline. shirky contends that because of the internet, “group action just got easier.”3 according to shirky, we are now at the critical point where internet use, while dependent on technology, is actually no longer about the technology at all. the web today (web 2.0) is about participation.
“this [the internet] is a medium that is going to change society.”4 lessig points out that content creators are “writing in the socially, culturally relevant sense for the 21st century and to be able to engage in this writing is a measure of your literacy in the 21st century.”5 it is significant that creating content is no longer reserved for the internet cognoscenti. internet users with a variety of technological skills are participating in web 2.0 communities. information architects, web designers, librarians, business representatives, and any stakeholder dependent on accessing resources on the internet have a vested interest in how internet information is organized. not only does the architecture of participation inherent in the internet encourage completely new creative endeavors, it serves as a platform for individual voices as demonstrated in personal and organizationally sponsored blogs: lessig 2.0, boing boing, open access news, and others. these internet conversations contribute diverse viewpoints on a stage where, theoretically, anyone can access them. web 2.0 technologies challenge our understanding of what constitutes information and push policy makers to negotiate equitable internet-use policies for the public, the content creators, corporate interests, and the service providers. to maintain an open internet that serves the needs of all the players, those involved must embrace the opportunity for cultural growth the social web represents.

marijke a. visser (marijkea@gmail.com) is a library and information science graduate student at indiana university, indianapolis, and will be graduating may 2010. she is currently working for ala’s office for information and technology policy as an information technology policy analyst, where her area of focus includes telecommunications policy and how it affects access to information.
for users who access, create, and distribute digital content, information is anything but static; nor is using it the solitary endeavor of reading a book. its digital format makes it especially easy for people to manipulate it and shape it to create new works. people are sharing these new works via social technologies for others to then remix into yet more distinct creative work. communication is fundamentally altered by the ability to share content on the internet. today’s internet requires a reevaluation of how we define and organize information. the manner in which digital information is classified directly affects each user’s ability to access needed information to fully participate in twenty-first-century culture. new paradigms for talking about and classifying information that reflect the participatory internet are essential.

background

the controversy over organizing web-based information can be summed up comparing two perspectives represented by shirky and peterson. both authors address how information on the web can be most effectively organized. in her introduction, peterson states, “items that are different or strange can become a barrier to networking.”6 shirky maintains, “as the web has shown us, you can extract a surprising amount of value from big messy data sets.”7 briefly, in this instance ontology refers to the idea of defining where digital information can and should be located (virtually). folksonomy describes an organizational system where individuals determine the placement and categorization of digital information. both terms are discussed in detail below. although any organizational system necessitates talking about the relationship(s) among the materials being organized, the relationships can be classified in multiple ways. to organize a given set of entities, it is necessary to establish in what general domain they belong and in what ways they are related.
applying an ontological, or hierarchical, classification system to digital information raises several points to consider. first, there are no physical space restrictions on the internet, so relationships among digital resources do not need to be strictly identified. second, after recognizing that internet resources do not need the same classification standards as print material, librarians can begin to isolate the strengths of current nondigital systems that could be adapted to a system for the internet. third, librarians must be ready to eliminate current systems entirely if they fail to serve the needs of internet users. traditional systems for organizing information were developed prior to the information explosion on the internet. the internet’s unique platform for creating, storing, and disseminating information challenges pre–digital-age models. designing an organizational system for the internet that supports creative innovation and succeeds in providing access to the innovative work is paramount to moving the twenty-first-century culture forward.

assessing alternative models

controversy encourages scrutiny of alternative models. in understanding the options for organizing digital information, it is important to understand traditional classification models. smith discusses controlled vocabularies, taxonomies, and facets as three traditional methods for applying metadata to a resource. according to smith, a controlled vocabulary is an unambiguous system for managing the meanings of words. it links synonyms, allowing a search to retrieve information on the basis of the relationship between synonyms.8 taxonomies are hierarchical, controlled vocabularies that establish parent–child relationships between terms. a faceted classification system categorizes information using the distinct properties of that information.9 in such a system, information can exist in more than one place at a time.
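the three traditional models just described can be sketched in a few lines of python. this is only an illustrative sketch: the terms, records, and function names (`PREFERRED`, `PARENT`, `RECORDS`, `search_subject`, `search_facets`) are hypothetical and do not come from smith or from any real cataloging system.

```python
# Hypothetical sample data, invented for illustration only.

# Controlled vocabulary: every synonym points to one preferred term.
PREFERRED = {"car": "automobile", "auto": "automobile", "automobile": "automobile"}

# Taxonomy: a controlled vocabulary plus parent-child links between terms.
PARENT = {"automobile": "vehicle", "bicycle": "vehicle", "vehicle": None}

# Faceted records: each resource is described by several independent properties.
RECORDS = [
    {"title": "repair manual", "subject": "automobile", "format": "book", "decade": "1970s"},
    {"title": "road test film", "subject": "automobile", "format": "video", "decade": "1990s"},
    {"title": "touring guide", "subject": "bicycle", "format": "book", "decade": "1990s"},
]


def normalize(term):
    """Map a search term to its preferred form (unknown terms pass through)."""
    term = term.lower()
    return PREFERRED.get(term, term)


def ancestors(term):
    """Walk up the taxonomy from a term to the root, returning broader terms."""
    chain = []
    node = PARENT.get(normalize(term))
    while node is not None:
        chain.append(node)
        node = PARENT.get(node)
    return chain


def search_subject(term):
    """Synonym-aware subject search: 'car' and 'auto' retrieve the same items."""
    wanted = normalize(term)
    return [r["title"] for r in RECORDS if r["subject"] == wanted]


def search_facets(**facets):
    """Faceted search: a record matches if every requested facet value agrees."""
    return [r["title"] for r in RECORDS
            if all(r.get(name) == value for name, value in facets.items())]
```

in this sketch the same record is reachable through several facets at once (the touring guide appears under both format "book" and decade "1990s"), which is the sense in which faceted information "can exist in more than one place at a time."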
a faceted classification system is a precursor to the bottom-up system represented by folksonomic tagging. folksonomy, a term coined in 2004 by thomas vander wal, refers to a “user-created categorical structure development with an emergent thesaurus.”10 vander wal further separates the definition into two types: a narrow and a broad folksonomy.11 in a broad folksonomy, many people tag the same object with numerous tags or a combination of their own and others’ tags. in a narrow folksonomy, one or few people tag an object with primarily singular terms. internet searching represents a unique challenge to people wanting to organize its available information. search engines like yahoo! and google approach the chaotic mass of information using two different techniques. yahoo! created a directory similar to the file folder system with a set of predetermined categories that were intended to be universally useful. in so doing, the yahoo! developers made assumptions about how the general public would categorize and access information. the categories and subsequent subcategories were not necessarily logically linked in the eyes of the general public. the yahoo! directory expanded as internet content grew, but the digital folder system, like a taxonomy, required an expert to maintain. shirky notes the yahoo! model could not scale to the internet. there are too many possible links to be able to successfully stay within the confines of a hierarchical classification system. additionally, on the internet, the links are sufficient for access because if two items are linked at least once, the user has an entry point to retrieve either one or both items.12 a hierarchical system does not assure a successful internet search and it requires a user to comprehend the links determined by the managing expert.
in the google approach, developers acknowledged that the user with the query best understood the unique reasoning behind her search. the user therefore could best evaluate the information retrieved. according to shirky, the google model let go of the hierarchical file system because developers recognized effective searching cannot predetermine what the user wants. unlike yahoo!, google makes the links between the query and the resources after the user types in the search terms.13 trusting in the link system led google to understand and profit from letting the user filter the search results. to select the best organizational model for the internet it is critical to understand its emergent nature. a model that does not address the effects of web 2.0 on internet use and fails to capture participant-created content and tagging will not be successful. one approach to organizing digital resources has been for users to bookmark websites of personal interest. these bookmarks have been stored on the user’s computer, but newer models now combine the participatory web with saving, or tagging, websites. social bookmarking typifies the emergent web and the attraction of online networking. innovative and controversial, the folksonomy model brings to light numerous criteria necessary for a robust organizational system. a social bookmarking network, delicious is a tool for generating folksonomies. it combines a large amount of self-interest with the potential for an equal, if not greater, amount of social value. delicious users add metadata to resources on the internet by applying terms, or tags, to urls. users save these tagged websites to a personal library hosted on the delicious website. the default settings on delicious share a user’s library publicly, thus allowing other people—not limited to registered delicious account holders—to view any library. 
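the social bookmarking model described above (users apply tags to urls, save them to a personal library, and share that library publicly) can be sketched as a small data structure. this is a hedged illustration only: `BookmarkStore` and its methods are hypothetical names, and nothing here reflects how delicious itself is implemented.

```python
from collections import Counter, defaultdict


class BookmarkStore:
    """Toy model of a public social bookmarking service (illustrative only)."""

    def __init__(self):
        # user -> url -> set of tags that user applied to that url
        self.libraries = defaultdict(dict)

    def save(self, user, url, tags):
        """Add a url to a user's library with free-form, uncontrolled tags."""
        existing = self.libraries[user].setdefault(url, set())
        existing.update(tag.lower() for tag in tags)

    def library(self, user):
        """Any visitor may view any user's library (public by default)."""
        return {url: sorted(tags)
                for url, tags in self.libraries.get(user, {}).items()}

    def tag_counts(self, url):
        """Count how many users applied each tag to a url."""
        counts = Counter()
        for lib in self.libraries.values():
            counts.update(lib.get(url, ()))
        return counts

    def search(self, *tags):
        """Return urls carrying all given tags, most widely bookmarked first."""
        wanted = {tag.lower() for tag in tags}
        hits = Counter()
        for lib in self.libraries.values():
            for url, url_tags in lib.items():
                if wanted <= url_tags:
                    hits[url] += 1
        return [url for url, _ in hits.most_common()]
```

a short usage example, with made-up users and urls: after `store.save("ana", "http://example.org/a", ["folksonomy", "tagging"])` and `store.save("ben", "http://example.org/a", ["folksonomy", "web2.0"])`, the call `store.tag_counts("http://example.org/a")` reports "folksonomy" twice and the other tags once each. the overlapping tag is the shared point of access, while each unique tag remains an additional entry point.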
that the delicious developers understood how internet users would react to this type of interactive application is reflected in the popularity of delicious. delicious arrived on the scene in 2003, and in 2007 developers introduced a number of features to encourage further user collaboration. with a new look (going from the original del.icio.us to its current moniker, delicious) and more ways for users to retrieve and share resources, delicious had 3 million registered users and 100 million unique urls by 2007.14 the reputation of delicious has generated interest among people concerned with organizing the information available via the internet. how does the folksonomy or delicious model of open-ended tagging affect searching, information retrieving, and resource sharing? delicious, whose platform is heavily influenced by its users, operates with no hierarchical control over the vocabulary used as tags. this underscores the organization controversy. bottom-up tagging gives each person tagging an equal voice in the categorization scheme that develops through the user-generated tags. at the same time, it creates a chaotic information-retrieval system when compared to traditional controlled vocabularies, taxonomies, and other methods of applying metadata.15 a folksonomy follows no hierarchical scheme. every tag generated supplies personal meaning to the associated url and is equally weighted. there will be overlap in some of the tags users select, and that will be the point of access for different users. for the unique tags, each delicious user can choose to adopt or reject them for their personal tagging system. either way, the additional tags add possible future access points for the rest of the user community. the social usefulness of the tags grows organically in relationship to their adoption by the group. can the internet support an organizational system controlled by user-generated tags?
by the very nature of the participatory web, whose applications often get better with user input, the answer is yes. delicious and other social tagging systems are proving that their folksonomic approach is robust enough to satisfy the organizational needs of their users. defined by vander wal, a broad folksonomy is a classification system scalable to the internet.16 the problem with projecting already-existing search and classification strategies to the internet is that the internet is constantly evolving, and classic models are quickly overcome. even in the nonprint world of the internet, taxonomies and controlled vocabulary entail a commitment both from the entity wanting to organize the system and the users who will be accessing it. developing a taxonomy involves an expert, which requires an outlay of capital and, as in the case with yahoo!, a taxonomy is not necessarily what users are looking for. to be used effectively, taxonomies demand a certain amount of user finesse and complacency. the user must understand the general hierarchy and by default must suspend their own sense of category and subcategory if it does not mesh with the given system. the search model used by google, where the user does the filtering, has proved significantly more successful. google recognizes natural language, making it user friendly; however, it remains merely a search engine. it is successful at making links, but it leaves the user stranded without a means to organize search results beyond simple page rank. traditional hierarchical systems and search strategies like those of yahoo! and google neglect to take into account the tremendous popularity of the participatory web. successful web applications today support user interaction; to disregard this is naive and short-sighted.
in contrast to a simple page-rank results list or a hierarchical system, a delicious search provides the user with rich, multilayer results. figure 1 shows four of the first ten results of a delicious search for the term “folksonomy.” the articles by the four authors in the left column were tagged according to the diagram. two of the articles are peer-reviewed, and two are cited repeatedly by scholars researching tagging and the internet. in this example, three unique terms are used to tag those articles, and the other terms provide additional entry points for retrieval. further information available using delicious shows that the guy article was tagged by 1,323 users, the mathes article by 2,787 users, the shirky article by 4,383 users, and the peterson article by 579 users.17 from the basic delicious search, the user can combine terms to narrow the query as well as search what other users have tagged with those terms. similar to the card catalog, where a library patron would often unintentionally find a book title by browsing cards before or after the actual title she originally wanted, a delicious user can browse other users’ libraries, often finding additional pertinent resources. a user will retrieve a greater number of relevant and automatically filtered results than with an advanced google search. as an ancillary feature, once a delicious user finds an attractive tag stream (a series of tags by a particular user), they can opt to follow the user who created the tag stream, thereby increasing their personal resources. hence delicious is effective personally and socially. it emulates what internet users expect to be able to do with digital content: find interesting resources, personalize them, in this case with tags, and put them back out for others to use if they so choose. proponents of folksonomy recognize there are benefits to traditional taxonomies and controlled vocabulary systems.
shirky delineates two features of an organizational system and their characteristics, providing an example of when a hierarchical system can be successful (see table 1).18 these characteristics apply to situations using databases, journal articles, and dissertations as spelled out by peterson, for example.19 specific organizations with identifiable common terminology—for example, medical libraries—can also benefit from a traditional classification system. these domains are the antithesis of the domain represented by the web. the success of controlled vocabularies, taxonomies, and their resulting systems depends on broad user adoption. that, in combination with the cost of creating and implementing a controlled system, raises questions as to their utility and long-term viability for use on the web. though meant for longevity, a taxonomy fulfills a need at one fixed moment in time. a folksonomy is never static. taxonomies developed by experts have not yet been able to be extended adequately for the breadth and depth of internet resources. neither have traditional viewpoints been scaled to accept the challenges encountered in trying to organize the internet. folksonomy, like taxonomy, seeks to provide the information critical to the user at the moment of need. folksonomy, however, relies on users to create the links that will retrieve the desired results. doctorow puts forward three critiques of a hierarchical metadata system, emphasizing the inadequacies of applying traditional classification schemes to the digital stage: 1. there is not a “correct” way to categorize an idea. 2. competing interests cannot come to a consensus on a hierarchical vocabulary. 3. there is more than one way to describe something. doctorow elaborates: “requiring everyone to use the same vocabulary to describe their material denudes the cognitive landscape, enforces homogeneity in ideas.”20 the internet raises the level of participation to include innumerable voices. the astonishing thing is that it thrives on this participation. guy and tonkin address the “folksonomic flaw” by saying user-generated tags are by definition imprecise. they can be ambiguous, overly personal, misspelled, or contrived compound words. guy and tonkin suggest the need to improve tagging by educating the users or by improving the systems to encourage more accurate tagging.21 this, however, does not acknowledge that successful web 2.0 applications depend on the emergent wisdom of the user community. the systems permit organic evolution and continual improvement by user participation. a folksonomy evolves much the way a species does. unique or single-use tags have minimal social import and do not gain recognition. tags used by more than a few people reinforce their value and emerge as the more robust species.

figure 1. search results for “folksonomy” using delicious.

table 1. domains and their participants
domain to be organized      participants in the domain
small corpus                expert catalogers
formal categories           authoritative source of judgment
restricted entities         coordinated users
clear edges                 expert users

38 information technology and libraries | march 2010

conclusion

the benefits of the internet are accessible to a wide range of users. the rewards of participation are immediate, social, and exponential in scope. user-generated content and associated organization models support the internet’s unique ability to bring together unlikely social relationships that would not necessarily happen in another milieu.
to paraphrase shirky and lessig, people are participating in a moment of social and technological evolution that is altering traditional ways of thinking about information, thereby creating a break from traditional systems. folksonomic classification is part of that break. its utility grows organically as users add tagged content to the system. it is adaptive, and its strengths can be leveraged according to the needs of the group. while there are “folksonomic flaws” inherent in a bottom-up classification system, there is tremendous value in weighting individual voices equally. following the logic of web 2.0 technology, folksonomy will improve according to the input of the users. it is an organizational system that reflects the basic tenets of the emergent internet. it may be the only practical solution in a world of participatory content creation. shirky describes the internet by saying, “there is no shelf in the digital world.”22 classic organizational schemes like the dewey decimal system were created to organize resources prior to the advent of the internet. a hierarchical system was necessary because there was a physical limitation on where a resource could be located; a book can only exist in one place at one time. in the digital world, the shelf is simply not there. material can exist in many different places at once and can be retrieved through many avenues. a broad folksonomy supports a vibrant search strategy. it combines individual user input with that of the group. this relationship creates data sets inherently meaningful to the community of users seeking information on any given topic at any given moment. this is why a folksonomic approach to organizing information on the internet is successful. users are rewarded for their participation, and the system improves because of it. folksonomy mirrors and supports the evolution of the internet.
librarians, trained to be impartial and ethically bound to assure access to information, are the logical mediators among content creators, the architecture of the web, corporate interests, and policy makers. critical conversations are no longer happening only in traditional publications of the print world. they are happening with communication platforms like youtube, twitter, digg, and delicious. information organization is one issue on which librarians can be progressive. dedicated to making information available, librarians are in a unique position to take on challenges raised by the internet. as the profession experiments with the introduction of web 3.0, librarians need to position themselves between what is known and what has yet to evolve. librarians have always leveraged the interests and needs of their users to tailor their services to the individual entry point of every person who enters the library. because more and more resources are accessed via the internet, librarians will have to maintain a presence throughout the web if they are to continue to speak for the informational needs of their users. part of that presence necessitates an ability to adapt current models to the internet. more importantly, it requires recognition of when to forgo conventional service methods in favor of more innovative approaches. working in concert with the early adopters, corporate interests, and general internet users, librarians can promote a successful system for organizing internet resources. for the internet, folksonomic tagging is one solution that will assure users can retrieve information necessary to answer their queries. references and notes 1. charles f. thomas and linda s. griffin, “who will create the metadata for the internet?” first monday 3, no. 12 (dec. 1998). 2. web 2.0 is a fairly recent term, although now ubiquitous among people working in and around internet technologies. 
attributed to a conference held in 2004 between medialive international and o’reilly media, web 2.0 refers to the web as being a platform for harnessing the collective power of internet users interested in creating and sharing ideas and information without mediation from corporate, government, or other hierarchical policy influencers or regulators. web 3.0 is a much more fluid concept as of this writing. there are individuals who use it to refer to a semantic web where information is analyzed or processed by software designed specifically for computers to carry out the currently human-mediated activity of assigning meaning to information on a webpage. there are librarians involved with exploring virtual-world librarianship who refer to the 3d environment as web 3.0. the important point here is that what internet users now know as web 2.0 is in the process of being altered by individuals continually experimenting with and improving upon existing web applications. web 3.0 is the undefined future of the participatory internet. 3. clay shirky, “here comes everybody: the power of organizing without organizations” (presentation videocast, berkman center for internet & society, harvard university, cambridge, mass., 2008), http://cyber.law.harvard.edu/interactive/events/2008/02/shirky (accessed oct. 1, 2008). 4. ibid. 5. lawrence lessig, “early creative commons history, my version,” videocast, aug. 11, 2008, lessig 2.0, http://lessig.org/blog/2008/08/early_creative_commons_history.html (accessed aug. 13, 2008). 6. elaine peterson, “beneath the metadata: some philosophical problems with folksonomy,” d-lib magazine 12, no. 11 (2006), http://www.dlib.org/dlib/november06/peterson/11peterson.html (accessed sept. 8, 2008). 7.
clay shirky, “ontology is overrated: categories, links, and tags,” online posting, spring 2005, clay shirky’s writings about the internet, http://www.shirky.com/writings/ontology_overrated.html#mind_reading (accessed sept. 8, 2008). 8. gene smith, tagging: people-powered metadata for the social web (berkeley, calif.: new riders, 2008): 68. 9. ibid., 76. 10. thomas vander wal, “folksonomy,” online posting, feb. 7, 2007, vanderwal.net, http://www.vanderwal.net/folksonomy.html (accessed aug. 26, 2008). 11. thomas vander wal, “explaining and showing broad and narrow folksonomies,” online posting, feb. 21, 2005, personal infocloud, http://www.personalinfocloud.com/2005/02/explaining_and_.html (accessed aug. 29, 2008). 12. shirky, “ontology is overrated.” 13. ibid. 14. michael arrington, “exclusive: screen shots and feature overview of delicious 2.0 preview,” online posting, june 16, 2005, techcrunch, http://www.techcrunch.com/2007/09/06/exclusive-screen-shots-and-feature-overview-of-delicious-20-preview/ (accessed jan. 6, 2010). 15. smith, tagging, 67–93. 16. vander wal, “explaining and showing broad and narrow folksonomies.” 17. adam mathes, “folksonomies—cooperative classification and communication through shared metadata” (graduate paper, university of illinois urbana–champaign, dec. 2004); peterson, “beneath the metadata”; shirky, “ontology is overrated”; thomas and griffin, “who will create the metadata for the internet?” 18. shirky, “ontology is overrated.” 19. peterson, “beneath the metadata.” 20. cory doctorow, “metacrap: putting the torch to seven straw-men of the meta-utopia,” online posting, aug. 26, 2001, the well, http://www.well.com/~doctorow/metacrap.htm (accessed sept. 15, 2008). 21. marieke guy and emma tonkin, “folksonomies: tidying up tags?” d-lib magazine 12, no. 1 (2006), http://www.dlib.org/dlib/january06/guy/01guy.html (accessed sept. 8, 2008). 22. shirky, “ontology is overrated.”

a simulation model for purchasing duplicate copies in a library

w. y. arms: the open university, and t. p. walter: unilever limited. at the time this study was undertaken the authors were at the university of sussex.

provision of duplicate copies in a library requires knowledge of the demand for each title. since direct measurement of demand is difficult a simulation model has been developed to estimate the demand for a book from the number of times it has been loaned and hence to determine the number of copies required. special attention has been given to accurate calibration of the model.

introduction

a common difficulty in library management is deciding when to buy duplicate copies of a given book and how many copies to buy.
a typical research library has several hundred thousand different works; many are lightly used but all are potential candidates for duplication. the problem which we faced at sussex university was how to obtain reliable forecasts of the demand for each title and to translate this into a purchasing policy. at present sussex spends between £10,000 and £20,000 ($22,000-$44,000) per year on duplicate copies, and as the university grows this amount is increasing steadily. because of the large number of books in a library relatively little data are available about each title. records are kept of books on loan or removed from the library, but frequently these are the only routine data collected. few large libraries even manage inventory checks. we therefore looked for a system that could be implemented with the minimum of data collection, preferably one based on existing records. forecasts of demand if the demand for a particular book is known, it is possible, though not necessarily easy, to determine how many copies of that book are needed to achieve a specified level of service, such as a copy being available on 80 percent of the occasions that a reader requires the book. unfortunately demand cannot be measured directly, even retrospectively. records of the number of times that a book is issued from the library contain no information about how many times the book was used within the library, nor how many readers failed to find a copy and went away unsatisfied. since both these factors are extremely difficult to measure, one of the central parts of our work was to develop a method of estimating them from data readily available. to forecast demand two lines of approach seemed reasonable: subjective estimation based on faculty reading lists; and forecasts based on the number of loans in previous years.

74 journal of library automation vol. 7/2 june 1974
in the past, sussex library has made extensive use of reading lists provided by faculty to decide how many copies to buy of each title. as the books most in demand are those recommended for undergraduate courses this seemed a sensible approach, though the number of copies required is not obvious even if the demand is known. webster analysed the effectiveness of these lists in predicting demand for specific titles and evaluated the purchasing rule being used, one copy for every ten students taking a course.1 restricting his attention to books known to be in demand and marked in the catalog, he drew a random sample of 673 titles, about 4 percent of the books falling into this category. he compared the number of loans of each of these titles over a term with data from the reading lists supplied at the beginning of the term. as the library had made a special effort to obtain reading lists for all courses taught that term, he had data on the number and type of students taking each course, the importance given to each text, and the subject areas involved. yet despite a thorough analysis of these data webster was able to find very little relationship between observed demand and reading list information. his work shows that faculty at the university have remarkably little knowledge of the books that their students read. in the sample some books strongly recommended to large groups of students were hardly used and some of the most heavily used works appeared on no reading list. the results of this study are fascinating from an educational viewpoint but less satisfying as operational research. the failure of this approach led us to predict demand from records of the number of past loans. this divides into two parts: using the number of loans over a period to estimate what the total demand was during that period; and using this estimate of the demand in one period to forecast the demand in another. various evidence suggests that the latter is a sensible thing to do.
the main demand for heavily used books comes from undergraduate courses. most faculty are loyal in their reading habits, recommending books they know rather than new ones, and each course tends to be repeated year after year with a syllabus that changes only gradually. the use of past circulation to forecast future use is fundamental to a markov model of book usage developed by morse and elston and tested with data from the m.i.t. engineering library.2 for our work we have used the number of loans in a given term to predict the demand in the corresponding term a year later. estimating the total demand in a period from the number of loans in that period is more difficult. this requires a model of the circulation system. mathematical approach several attempts have been made to apply the methods of inventory control or queueing theory to the problem of buying duplicates. for example, grant has recently described an operational system using the simple rule that the number of copies required to satisfy 95 percent of the demand is $n(\mu_s + 2\sigma_s)/t$, where $n$ is the number of times that the book is issued during a period of $t$ days and $\mu_s$ and $\sigma_s$ are the mean and standard deviation of the time that each book is off the shelf when on loan.3 this type of approach has the advantage of being straightforward to use. periodically a simple computer program analyzes the circulation history of each book in the library and prints a list of books requiring duplication. however, the method suffers from difficulties both mathematical and practical. to obtain the simple mathematical expression given above, several simplifying assumptions have to be made. for example, the expression ignores use of a book within the library, and identifies demand in a period with the number of loans within that period. practical difficulties in arriving at a more exact mathematical expression are discussed in the next section.
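as a rough illustration of grant's rule quoted above, the following python sketch evaluates the expression for one title. the function name, the rounding up to a whole number of copies, and the sample figures are assumptions made for this example only; they are not taken from grant's paper or from the sussex data.

```python
import math


def copies_required(loans: int, period_days: float,
                    mean_off_shelf: float, sd_off_shelf: float) -> int:
    """Evaluate n * (mu_s + 2 * sigma_s) / t for one title.

    loans          -- n, number of issues during the period
    period_days    -- t, length of the period in days
    mean_off_shelf -- mu_s, mean days a copy is off the shelf per loan
    sd_off_shelf   -- sigma_s, standard deviation of that time

    Rounding up to a whole copy (and never below one) is an assumption
    of this sketch, not part of the quoted rule.
    """
    if period_days <= 0:
        raise ValueError("period_days must be positive")
    raw = loans * (mean_off_shelf + 2 * sd_off_shelf) / period_days
    return max(1, math.ceil(raw))


# Hypothetical title: 30 loans in a 70-day term, off the shelf for
# 5 days on average with a standard deviation of 2 days.
print(copies_required(30, 70, 5.0, 2.0))
```

as the text notes, the rule counts only recorded loans, so it takes no account of use within the library or of readers who left unsatisfied.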
difficulties in constructing a model the following are the main difficulties that we found in constructing a model, either mathematical or using simulation: 1. the most useful measure of the effectiveness of a duplication policy is satisfaction level, the proportion of readers who on approaching the shelves find a copy of the book there, but satisfaction level is almost impossible to measure directly since, although some unsatisfied readers ask that the book be held for them, most go away without comment. more or less equivalent is the percentage time on shelf, the proportion of time that at least one copy of the book is available. this can be measured directly, though a visit to the shelves is needed, and was found useful in validating our model. if the underlying demand is random these two measures of effectiveness have the same value. 2. use of books within the library is also difficult to measure. at sussex, as in most libraries, data are available only on the number of times that a book is lent out of the library. if a reader does not find a copy on the shelves or if he uses a book within the library but does not take it away then no record is generated. since various studies, notably that of fussler and simon, suggest that the amount of use within libraries often exceeds the number of loans recorded by a factor of three or more, if the number of loans is used to estimate demand a reasonable knowledge of within-library use is essential.4 3. the number of copies required to achieve a specified satisfaction level does not go up linearly with demand. since a reader is satisfied if he finds a single copy on the shelves, proportionately fewer duplicates are needed of the books most in demand. at sussex more than twenty copies are provided of several books and this nonlinearity is very noticeable. 4.
the demand for a title is erratic, changing from term to term, from week to week, and from day to day, even if the mean demand is constant. over a period such as a term three different effects might be expected: a background random demand independent of university courses; sudden peaks when a book is required for a course taken by several students; and feedback caused by previously unsatisfied readers returning. 5. the circulation of books is surprisingly complicated. at sussex some books are designated short term loan and can be borrowed for up to four days only; the remainder are long term loan books and can be borrowed for up to six weeks. circulation data show that the time for which a book is off the shelf is not the same as the period for which it is lent, but has a heavily skewed distribution. few books are returned until near the due date; just before the book is due back there is a peak when most books are returned but many become overdue and the tail of the distribution dies away slowly. simulation as these various factors seemed too complex to derive usable mathematical results, we decided to use computer simulation of the book circulation. simulation of book circulation is not new. in particular it has been used at lancaster university by mackenzie et al. to decide loan periods.5 their report includes a good description of the general approach. the object of our simulation was to model the circulation process so that we could study the relationship between three groups of parameters: 1. observed data: number of copies available; number of loans. 2. total underlying demand. 3. measures of effectiveness: satisfaction level; percentage time on shelf. the results obtained from any simulation are only as accurate as the values given to the variables used to calibrate the model.
as several of these values were not known at all accurately when the work was begun, special efforts were put into careful validation and calibration of the model. a separate study was made for a small sample of books, to compare the percentage time on shelf estimated by the simulation with the actual time for which a copy was available, found by looking at the shelves. the results of this study were used to check the amount of use within the library. by this means we were able to verify the simulation model and calibrate it to a highly satisfactory level of accuracy. description of program the basic layout of the simulation is shown in figure 1. this is a time advance model with a period of one day. the program has been coded in fortran and, running on the icl 1904a computer at sussex, takes about one second of machine time to simulate two years. this fast speed has enabled us to try a wide range of values for most parameters and to experiment with a variety of distributions of arrival times and book return dates. 1. satisfaction level at the beginning of each day the number of demands for that day is generated. the satisfaction level is taken as the proportion of these requests which can be satisfied from the books left on the shelf from the previous day and those returned during the simulated day. 2.
within-library use the proportion of use that takes place within the library was a key parameter in calibrating the model. the first version of the simulation program assumed a figure of 25 percent use within the library. this was based on a small survey of the type of books being studied, standard texts used for undergraduate courses. the weakness of this survey was that it used a count of those books that were left lying in the library at the end of the day and did not make sufficient allowance for books reshelved by readers or by library staff during the day. the validation experiment showed a consistent difference between predicted and observed percentage time on shelf which could be corrected by changing the value of the within-library use parameter to 60 percent. 3. distribution of demand two distributions of demand have been used, poisson arrivals with a specified mean, and a step demand superimposed on a poisson process. in both cases provision is made for a proportion of unsatisfied readers to return later. as the effect of this feedback is to introduce sharp peaks of demand, the two distributions have proved surprisingly similar in the results produced and most of the runs of the program have been done with random demand. a recent survey showed that 69 percent of readers who fail to find a book intend to return, but we do not know how many actually come back nor what the time interval is before they return.6

fig. 1. outline flowchart of simulation program (recoverable box labels: advance clock one day; add returned books; generate requests; generate return date).

the simulation proved to be insensitive to moderate changes of these parameters
period for which the book is off the shelf the simulation allows for a book to be borrowed within the library, in which case it is available again the next day, or to be lent from the library. if the book is lent, the return date is generated from one of two histograms which respectively refer to books available on short and long term loan. these histograms were derived from an analysis of all books returned during one week in autumn 1970, modified to reflect changes in the circulation system. validation experiment although the structure of the simulation is fairly straightforward several parameters used in the model have been estimated indirectly. validation of the model took two forms. firstly we ran the program with a wide range of values for the main parameters to see which most influence the results. secondly a small study was set up to measure the percentage time on shelf of a number of books. for each book, the actual availability was estimated by the simulation from the number of loans during the same period. twenty-eight books known to be in heavy demand were selected, half in physics and half in sociology. over a period of eight weeks the shelves were inspected once per day, at random times during the day, to see if a copy was available. the number of loans of each copy of each book during the period was noted and the library staff carried out a thorough check to determine whether any copies shown in the catalog had been lost, stolen, or had their loan category altered. the simulation was used to estimate the percentage time on shelf and this was plotted on a graph against the observed percentage. figure 2 shows the graph for the original values of the parameters. in this graph the x axis shows the percentage time on shelf predicted by the simulation; the y axis shows the percentage observed. if the model were perfect the points would lie near the line y = x, deviations being caused by y being a random variable. 
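the time advance model described under "description of program" can be sketched in python as follows. this is a minimal illustration under stated assumptions, not the authors' fortran program: the loan histogram values, the way the delay before an unsatisfied reader returns is drawn, the end-of-day check used for time on shelf, and all function and parameter names are hypothetical. the defaults of 60 percent within-library use and 25 percent of unsatisfied readers returning after about two days are the figures quoted in the text.

```python
import math
import random

# Hypothetical histogram of days off the shelf for a short term loan
# (days -> relative weight); the paper's histograms are not reproduced.
SHORT_LOAN_HISTOGRAM = {1: 1, 2: 2, 3: 4, 4: 8, 5: 3, 6: 1, 7: 1}


def poisson(rng: random.Random, mean: float) -> int:
    """Draw a Poisson variate by Knuth's method (adequate for small means)."""
    limit, k, p = math.exp(-mean), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k


def simulate(copies, mean_daily_demand, days=140, within_library=0.60,
             return_fraction=0.25, mean_return_delay=2.0, seed=1):
    """Simulate one title day by day and report loans and effectiveness."""
    rng = random.Random(seed)
    lengths = list(SHORT_LOAN_HISTOGRAM)
    weights = list(SHORT_LOAN_HISTOGRAM.values())
    on_shelf = copies
    due_back = {}     # day -> number of copies returned that day
    coming_back = {}  # day -> unsatisfied readers who try again that day
    requests = satisfied = loans = days_available = 0
    for day in range(days):
        on_shelf += due_back.pop(day, 0)  # add returned books
        todays = poisson(rng, mean_daily_demand) + coming_back.pop(day, 0)
        for _ in range(todays):
            requests += 1
            if on_shelf > 0:
                satisfied += 1
                on_shelf -= 1
                if rng.random() < within_library:
                    back = day + 1  # used in the library, back next day
                else:
                    loans += 1      # only these leave a loan record
                    back = day + rng.choices(lengths, weights)[0]
                due_back[back] = due_back.get(back, 0) + 1
            elif rng.random() < return_fraction:
                # Assumed delay distribution: at least one day, mean as given.
                delay = 1 + poisson(rng, max(0.0, mean_return_delay - 1.0))
                coming_back[day + delay] = coming_back.get(day + delay, 0) + 1
        if on_shelf > 0:  # assumed: availability checked at end of day
            days_available += 1
    return {
        "loans": loans,
        "requests": requests,
        "satisfaction_level": satisfied / requests if requests else 1.0,
        "time_on_shelf": days_available / days,
    }


for n_copies in (1, 2, 4):
    print(n_copies, simulate(n_copies, mean_daily_demand=1.5))
```

running the sketch over a range of demand values for a fixed number of copies gives the kind of relationship between recorded loans, underlying demand, and the two measures of effectiveness that the study examines.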
fig. 2. observed percentage time on shelf against predicted (25 percent use within library). axes: predicted availability (percent time on shelf) against observed availability (percent time on shelf).

fig. 3. observed percentage time on shelf against predicted (60 percent use within library) with 95 percent probability curves. axes: predicted availability (percent time on shelf) against observed availability (percent time on shelf).

the graph in figure 2 is clearly convex downwards showing a consistent error in the model, with these values of the parameters. knowing that the simulation is sensitive to the parameter giving the proportion of use that takes place within the library and that our estimate of its value was not precise, a series of graphs were prepared varying this parameter. figure 3 shows the same observations plotted against predictions assuming 60 percent use within the library, the value which best predicts the observations. this graph is much closer to being linear than figure 2. the next question is whether the nonlinearities in figure 3 are the type to be expected from y being a random variable. a very rough calculation helps to answer this question. if we make the dubious assumption that availability of a copy on a given day is independent of the days before and afterwards, then, for x given, y should be approximately normally distributed with mean $x$ and variance $x(1-x)/n$, where $n$ is the number of days in the study (forty). if this calculation were exact, 95 percent of the observations of y would lie within two standard deviations of x, but, since the assumption of independence is definitely false, we would expect the number of observations which fall within the range to be less than 95 percent. the curves $y = x \pm 2\{x(1-x)/n\}^{1/2}$ have been added to figure 3.
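the rough calculation above can be reproduced with a few lines of python. this sketch assumes that x and y are expressed as proportions between 0 and 1 rather than as percentages, and it clips the curves to that range; both choices, and the function names, are assumptions of the example.

```python
import math


def probability_band(x: float, n: int = 40) -> tuple:
    """Return (lower, upper) for y = x +/- 2 * sqrt(x * (1 - x) / n).

    x -- predicted proportion of time on shelf, between 0 and 1
    n -- number of daily observations (forty in the study)
    Clipping to the interval [0, 1] is an assumption of this sketch.
    """
    half_width = 2 * math.sqrt(x * (1 - x) / n)
    return max(0.0, x - half_width), min(1.0, x + half_width)


def within_band(predicted: float, observed: float, n: int = 40) -> bool:
    """True if an observed proportion lies between the two curves."""
    lower, upper = probability_band(predicted, n)
    return lower <= observed <= upper


# Hypothetical book: predicted on the shelf half the time.
print(probability_band(0.5))
print(within_band(0.5, 0.6), within_band(0.5, 0.7))
```

with forty observations the band is widest for books predicted to be available about half the time and narrows toward either extreme.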
two points lie well off all graphs and cannot be explained except as the result of books being stolen or lost during the period of the study. of the remaining twenty-six, all but three lie within the curves. this shows that the simulation model as finally calibrated gives a very reasonable description of the situation. operational experience the results of this simulation have been used by library staff since the middle of 1971, initially on an experimental basis. a two-stage process is involved. from the computer based circulation system can be found the number of times that each short term loan copy has been circulated. from these figures the library staff can estimate the demand for a title over a given period. once the demand has been estimated, the staff can use the simulation again to determine how many copies would have been required to achieve a specified satisfaction level, perhaps 80 percent. if fewer copies are held by the library, orders are placed for extra copies. at present these procedures are done manually using tables, but the possibility exists of modifying the computer system to identify those titles which need extra duplication. the actual decision to purchase needs to be made by library staff, who can take account of factors not included in the simulation, such as price and changes of undergraduate courses. conclusion although this work was carried out during 1971, we shall have little operational experience of the method in action until the computer circulation system is reorganized. in the past, different copies of the same book have been processed entirely independently, meaning that the total number of loans of a given title can only be found by manually adding up the number of loans of each copy. in the revised computer system this will be done automatically.
experience will probably show that the best procedure combines use of the simulation model with reading lists and the skill of a librarian. one possible feature of a computer based system is that it could automatically indicate which books appear to require duplication. the method used here would seem to apply equally well to other libraries. naturally the circulation patterns of other libraries are different, which means that a different simulation would be needed, but this work has shown that it is possible to calibrate a simulation accurately enough to examine the circulation of individual books. acknowledgments we would like to thank the many members of the university of sussex library staff who have helped at various stages, particularly p. t. stone who was closely involved throughout. references 1. p. f. webster, provision of duplicate copies in the university library, final year project report (university of sussex, 1971). 2. p. m. morse and c. r. elston, "a probability model for obsolescence," operations research 17:36-47 (1969). 3. r. s. grant, "predicting the need for multiple copies of books," journal of library automation 4:64-71 (june 1971). 4. h. h. fussler and j. l. simon, patterns in the use of books in large research libraries (chicago: univ. of chicago pr., 1969). 5. a. g. mackenzie et al., systems analysis of a university library. report to osti on project sl/52/02, 1969. 6. j. urquhart, private discussion, 1971.
the library of congress view on its relation to the ala marc advisory committee
henriette d. avram: marc development office, library of congress.
this paper is a statement of the library of congress' recommendation that a marc advisory committee be appointed within the present structure of the rtsd/isad/rasd committee on representation in machine-readable form of bibliographic information (marbi) and describes the library's proposed relation to such a committee.
the proposals and recommendations suggested were adopted by the marbi committee during its deliberations at ala midwinter, january 1974, and are now in effect.
journal of library automation vol. 7/2 june 1974
introduction during ala midwinter, january 1973, the library of congress (lc) suggested to the rtsd/isad/rasd committee on representation in machine-readable form of bibliographic information that a marc advisory committee be formed to work with the marc development office regarding changes made to the various marc formats. the primary interest of the committee would be the serial and monograph formats, though the committee should have interest in and responsibility for reviewing changes in any of the marc formats to insure that the integrity and compatibility of marc content designators are preserved. the marbi committee decided that it would be the marc advisory committee and asked that a paper be prepared proposing how such a committee would operate in relationship to the marc development office. prior to a discussion of marc changes, it appears appropriate to make certain basic statements regarding marc changes and the difficulties experienced by the marc development office in evaluating the significance of a change for the marc subscriber. it would be naive to assume, in a dynamic situation, that even in the best of all worlds a marc subscriber would never have to do any reprogramming. changes in procedures, changes in cataloging, experience in providing the knowledge for more efficient ways to process information, additional requirements from users, etc., have always been factors creating the need to both modify and/or expand an automated system. programming installations always require personnel to maintain ongoing systems. situations creating changes locally must exist and, likewise, they also exist at lc. staff of the marc development office give serious consideration to every proposed marc change and its impact on the marc subscribers.
however, it must be realized that it is not possible to evaluate fully the impact of each change because the significance of a change is directly dependent on the use made of the elements of the record and the programming techniques used by each subscriber. marc staff cannot possibly know the details of use and programming techniques and capabilities at every user installation. each marc subscriber evaluates a change in light of his operational requirements. since the uses made of the data are varied among users, there is rarely a consensus as to the pros and cons of a change. marc staff are aware of the expenses imposed by changes to software and have made an attempt to solicit preferences in some cases for one technique over another from marc subscribers when changes were required. in the case of the isbd implementation, ten replies were received from questions submitted to the then sixty-two marc users. the remainder of this paper describes what is included in the term "change," the various stimuli that initiate changes, and recommendations of how lc and the marc advisory committee should interact in regard to changes. the appendix summarizes in chart form the addenda to books: a marc format since the initiation of the marc service. an examination of the chart will reveal that the number and the types of changes have not been too significant. marc changes the term "change" is used throughout this paper in the broad sense, i.e., the term includes additions, modifications, and deletions of content data (in both fixed and variable fields) and content designators (tags, indicators, and subfield codes) made to the format as well as additions, modifications, and deletions made to the tape labels. the concern is with changes made to all records where applicable or groups of records but not with the correction or updating of individual records as part of the marc distribution service. changes as described above fall into several broad types: 1.
addition of new fields, indicators, or subfield codes to the format. 2. implementation of already defined but unused tags, indicators, subfield codes, or fixed fields. 3. modification of content data of fields (fixed and variable). 4. changes in style of content in records, e.g., punctuation. 5. cessation in use of existing fields, indicators, and subfield codes. the following paragraphs are divided into two sections. section "a" describes the stimulus for a change and the rationale for making it. section "b" describes the lc position regarding the change and, where applicable, a recommendation to the marc advisory committee. changes made to marc records may be divided into the following categories: category 1: changes resulting from a change in cataloging rules or systems. a. cataloging rules or systems fall into two distinct types: those made in consultation with ala (resources & technical services division/cataloging & classification section/descriptive cataloging committee), and those made by the subject cataloging division to the subject cataloging system without consultation with ala. lc follows aacr. since the marc record is the record used for lc bibliographical control as well as the source record for the lc printed card and lc book catalogs (for those items presently within the scope of marc), cataloging changes (descriptive and subject) are necessarily reflected in marc. if the cataloging change is such that the retrospective records can reasonably be modified by automated techniques, these records are modified to reflect the change. prior to marc, this updating could not be provided to subscribers to lc bibliographic products and is one of the advantages of a machine-readable service. it has the effect of maintaining a consistent data base for all marc users. b. changes made in cataloging rules or systems will be made by the appropriate agencies.
once changes in cataloging rules have been made by the ala (rtsd/ccs/dcc) committee, lc will consult with the marc advisory committee with respect to their implementation in those cases affecting the marc format.* wherever possible, depending upon resources available, the number of records affected, and the type of change, the retrospective files will be updated and made available in one of two ways: if the number of records is small (to be decided by lc), the records will be distributed as corrections through the normal channels of the marc distribution service. if the number of records is large, the records will be sold by the lc card division.
* format change is used in this context to mean a change affecting the tags, indicators, subfield codes, addition or deletion of fixed fields, or change to the leader.
category 2: changes made to satisfy a requirement of the library of congress. a. since lc uses the marc records for its own purposes, situations do arise in which lc has a requirement for a change. in most cases, lc feels that the change would also be beneficial to the users. under these circumstances lc has carefully evaluated the implication of the change to the marc subscribers and, in some cases, solicited their preferences and advice. b. if lc has a requirement to make a change to marc, the proposed change and the reason for the change will be referred to the marc advisory committee. the marc advisory committee will solicit opinions from marc users as to whether or not to include the change in the marc distribution service, and lc will abide by the committee's recommendation. if this decision is not to include the change, lc will implement the change only in its own data base.† category 3: changes made to satisfy subscribers' requests. a. subscribers sometimes request that a change be made to a marc record. where possible, within the limitation of lc resources, these requests are complied with.
lc, when considering such a request, has sought the opinion of the marc subscribers, and if sufficient numbers of users were interested in the change, the change was implemented. b. changes requested by subscribers will be evaluated by lc, and if considered possible to implement, the proposed change will be submitted by lc to the marc advisory committee to solicit opinions from marc users. if the committee recommends, lc will implement the change.
† an exception to this statement will be those changes to lc practice which must be reflected on cards and in the marc record and which cannot exist in optional form. an example of the above would be abolition of the check digit in the lc card number.
category 4: changes made to support international standardization. a. lc plays a significant role in international activities in the area of machine-readable cataloging records. much of the future expansion of marc depends upon standards in formats, data content, and cataloging. in all these activities, lc firmly supports aacr and current marc formats. occasionally, in order to arrive at complete agreement with agencies in other countries, it becomes necessary for all to compromise. however, in all cases lc does not agree to changes in cataloging rules until the recommendation has been approved by the appropriate ala committee. b. changes resulting from international meetings will fall principally into two areas: 1. cataloging-if the change required is the result of a change in cataloging rules and the ala (rtsd/ccs/dcc) has approved the aacr modifications, the marc change falls into category 1. 2. all other changes affecting the format-since lc is the agency in the u.s. that will exchange machine-readable bibliographic records with other national agencies, lc will consider these changes an internal lc requirement; therefore, they can be considered under the proposal described in category 2.
lc will submit the proposed changes to the marc advisory committee. category 5: changes made to expand the marc program to include additional services. a. if the marc service were static, changes to expand the service would not be possible. an example of an additional service is the cataloging in publication data available on marc tapes. since these cataloging data are available four to six months prior to the publication of the item, it was determined to be of value to marc subscribers and changes were made to the marc record to make these data available in machine-readable form. b. if a new service is under consideration at lc that will cause a change to marc records, e.g., cataloging in publication, lc will submit the proposal to the marc advisory committee for their action as described in category 2. other lc recommendations for the marc advisory committee 1. time frame for changes. in order to prevent consultation on changes from taking an inordinate length of time, lc proposes that the marc advisory committee be given two months to solicit comments from marc users, to arrive at a consensus, and to respond to proposed changes. if there is no response during that time, lc will implement the proposed change. lc will notify the marc subscribers two months prior to including the change in the marc distribution service. 2. consultation with the marc advisory committee. the marc development office will submit the recommendation for change and any other information required to evaluate the change to the marc advisory committee. the marc advisory committee will be responsible for submitting the proposal to the marc users and notifying the marc development office of the committee's recommendation. 3. test tapes. the marc advisory committee, on consultation with the marc development office, will consider the requirement for a test tape to reflect the change made to the marc record (the requirement for a test tape is dependent on the type of change made).
appendix a. addenda to books: a marc format
(columns in the original chart: stimulus for change; date; change; comments)

1. cataloging rules and cataloging system changes
   1972. change: u.s./gt. brit. changed to united states and great britain. comments: change made to facilitate machine filing.
   1972. change: isbd. comments: cataloging change based on an international agreement.
   1973. change: isbd, additional information.

2. subscribers requests
   1972. change: government publication code added to fixed field.

3. initiated at lc
   a. addition or deletion of fields
   1969. change: abolishment of 653, political jurisdiction (subject), and 750, proper name not capable of authorship. comments: these little-used fields proved difficult to define and of little value.
   1970. change: addition of encoding level to leader. comments: implemented for use for recon records.
   1970. change: addition of geographic area code field, tag 043. comments: this field has been widely used by lc and subscriber libraries.
   1971. change: addition of superintendent of documents field, tag 086. comments: information added to lc catalog cards (and thus to marc records) at the request of outside libraries.
   b. additions of indicators or subfields
   1971. change: addition of filing indicators. comments: information needed to allow lc to ignore initial articles in arranging its computer-produced book catalog.
   c. addition or change of codes or data to existing fields
   1972. change: addition of "q" subfield to fields for conferences entered under place. comments: subfield needed to enable lc to file conferences entered under place correctly.
   1969. change: code added to modified record indicator in fixed field to indicate shortened records.
   1969. change: code for phonodiscs added to illustration fixed field.
   1970. change: code added to modified record indicator in fixed field to indicate that the dashed-on entry on the original lc card was not carried in marc record.
   1971. change: "questionable condition" codes deleted from country of publication code.
   1971. change: geographic area code. guidelines for implementation modified slightly and 23 new codes added.
   1971. change: microfilm call numbers carried in lc call number field. comments: description of what such call numbers looked like.
   1971. change: abolished lc card number check digit. comments: numbers available using check digit too limited.
   d. explanations or corrections
   1970. change: use of "b" subfield with topical subjects (field 650) and geographic subjects (field 651). comments: subfield and its use inadvertently omitted from books: a marc format. it occurs rarely in marc records.
   1971. change: use of "revision date" as suffix to lc card number. comments: explanation of what this information means at lc and how subscribers use it.
   1971. change: indicators used with romanized title. comments: explanation of use of indicators with this field omitted from books: a marc format.
   e. changes to labels
   1972. change: change to label to reflect new computer system at lc.

4. national and international agreement
   1970. change: standard book number (9 digits) changed to international standard book number (10 digits) to conform to an international standard.
   1971. change: entry map added to leader to conform to national standard. comments: adoption of ansi z39 format for exchange of bibliographic information interchange.
   1971. change: change to label to conform to ansi standard.

5. new services at lc
   1969. change: changes to label and status codes for cumulated tapes. comments: to provide for cumulative quarterly and semiannual tapes.
   1971. change: cip records, addition of codes to encoding level and record status.
everyone’s invited: a website usability study involving multiple library stakeholders elena azadbakht, john blair, and lisa jones information technology and libraries | december 2017 34 elena azadbakht (elena.azadbakht@usm.edu) is health and nursing librarian and assistant professor, john blair (john.blair@usm.edu) is web services coordinator, and lisa jones (lisa.r.jones@usm.edu) is head of finance and information technology, university of southern mississippi, hattiesburg, mississippi. abstract this article describes a usability study of the university of southern mississippi libraries website conducted in early 2016. the study involved six participants from each of four key user groups— undergraduate students, graduate students, faculty, and library employees—and consisted of six typical library search tasks, such as finding a book and an article on a topic, locating a journal by title, and looking up hours of operation. library employees and graduate students completed the study’s tasks most successfully, whereas undergraduate students performed relatively simple searches and relied on the libraries’ discovery tool, primo. the study’s results displayed several problematic features that affected each user group, including library employees. these results increased internal buy-in for usability-related changes to the library website in a later redesign. introduction within the last decade, usability testing has become a common way for libraries to assess their websites. eager to gain a better understanding of how users experience our website, we assembled a two-person team and conducted the first usability study of the university of southern mississippi libraries website in february 2016. 
the web advisory committee, which is tasked with developing, maintaining, and enhancing the libraries’ online presence, wanted to determine if the content on the website was organized in a way that made sense to users and facilitated the efficient use of the libraries’ online resources. our usability study involved six participants from each of the following library user groups: undergraduate students, graduate students, faculty, and library employees. student and faculty participants represented several academic disciplines and departments. all of the library employees involved in the study work in public-facing roles. the web advisory committee and libraries’ administration wanted to know how these groups differ in their website use and whether they have difficulty with the same architecture or features. usability testing helped illuminate which aspects of the website’s design might be hindering users from accomplishing key tasks, thereby identifying where and how improvements needed to be made. we included library employees in this study to compare their approach to the website to that of other users in the hope of increasing internal stakeholders’ buy-in for recommendations resulting from this study. this article will discuss the usability study’s design, results, and recommendations as well as the implications of the study’s findings for similarly situated academic libraries. we will give special consideration to how the behavior of library employees compared to that of other groups.
https://doi.org/10.6017/ital.v36i4.9959
literature review the literature on library-website user experience and usability is extensive. in 2007, blummer conducted a literature review of research related to academic-library websites, including usability studies.
her article provides an overview of the goals and outcomes of early library-website usability studies. 1 more recent articles focus on a portion or aspect of a library’s website such as the homepage, federated search or discovery tool, or subject guides. fagan published an article in 2010 that reviews user studies of faceted browsing and outlines several best practices for designing studies that focus on next-generation catalogs or discovery tools. 2 other library-website studies have reported on the habits of user groups, with undergraduates being the most commonly studied constituent group. emde, morris, and claassen-wilson observed university of kansas faculty and graduate students’ use of the library website, which had been recently redesigned, including a new federated search tool. 3 many of the study’s participants gravitated toward the subject-specific resources they were familiar with and either missed or avoided using the website’s new features. when asked for their opinions on the federated search tool, several participants said that while it was not a tool they saw themselves using, they did see how it might be helpful for undergraduate students who were still new to research. the researchers also provided the participants with an article citation and asked them to locate it using the library’s website or online resources. while half the participants did use the website’s “e-journals” link, others were less successful. some who had the most difficulty “search[ed] for the journal title in a search box that was set up to search database titles.” 4 this led emde, morris, and claassen-wilson to observe that “locating journal articles from known citations is a difficult concept even for some advanced researchers.” turner’s 2011 article describes the result of a usability study at syracuse university library that included both students and library staff.
participants were asked to start at the library’s homepage and complete five tasks designed to emulate the types of searches a typical library user might perform, such as finding a specific book, a multimedia item, an article in the journal nature, and primary sources pertaining to a historic event. 5 when asked to find toni morrison’s beloved, most staff members used the library’s traditional online catalog whereas students almost always began their searches with the federated search tool located on the homepage. participants of both types were less successful at locating a primary source, although this task highlighted key differences in each group’s approach to searching the library website. since library staff were more familiar than students with the library’s collections and online search tools, they relied more on facets and limiters to narrow their searches, and some even began their searches by navigating to the library’s webpage for special collections. library staff tended to be more persistent; draw upon their greater knowledge of the library’s collections, website, and search tools; and use special syntax in their searches, like inverting an author’s first and last names. “library staff took more time, on average, to locate materials,” writes turner, because of their “interest in trying alternative strategies.” 6 students, on the other hand, usually included more detail than necessary in their search queries (such as adding a word related to the format they were searching for after their keywords) and could not always differentiate various types of catalog records, for example, the record for a book review and the record for the book itself.
turner concludes that the students’ mental models for searching online and their experiences with other web-search environments influence their expectations of how library search tools work and that library-website design should take these mental models into consideration. research on the search behaviors of students versus more experienced researchers or subject experts also has implications for library website design. two recent articles explore the different mental models or mindsets students bring to a search. the students in asher and duke’s 2012 study “generally treated all search boxes as the equivalent of a google search box” and used very simple keyword searches. 7 this tracked with holman’s 2010 study, which likewise found that the students she observed relied on simple search strategies and did not understand how search interfaces and systems are structured. 8 methods our research team consisted of the libraries’ health and nursing librarian and the web services coordinator. we worked closely with the head of finance and information technology in designing and running the usability study. a two-week period in mid-february 2016 was chosen for usability testing to avoid losing potential participants to midterms or spring break. we posted a call for participants to two university discussion lists, on the libraries website, and on social media (facebook and twitter). we also reached out directly to faculty in academic departments we regularly work with and emailed library employees directly. we directed nonlibrary participants to a web form on the libraries website to provide their name, contact information, university affiliation/class standing, and availability. the health and nursing librarian followed up with and scheduled participants on the basis of their availability. each student participant received a ten-dollar print card and each faculty participant received a ten-dollar starbucks gift card. 
to record the testing sessions, we needed a free or low-cost software option. since the libraries already had a subscription to screencast-o-matic to develop video tutorials, and the tool allows for simultaneous screen, audio, and video capture, we decided to use it to record all testing sessions. we also used a spare laptop with an embedded camera and microphone. the health and nursing librarian served as both facilitator and note-taker for most usability testing sessions. participants were given six tasks to complete. we encouraged participants to narrate as they completed each task. the sessions began with simple, secondary navigational questions like the following:
• how late is our main library open on a typical monday night?
• how could you contact a librarian for help?
• where would you find more information about services offered by the library?
next, we asked the participants to complete tasks designed to assess their ability to search for specific library resources and to illuminate any difficulty users might have navigating the website in the process. each of the three tasks focused on a particular library-resource type, including books, articles, and journals:
• find a book about rabbits.
• find an article about rabbits.
• check to see if we have a subscription/access to a journal called nature.
after the usability testing was complete, we reviewed the recordings and notes and coded them. for each task, we calculated time to completion and documented the various paths participants took to answer each question, noting any issues they encountered. we also compared the four user groups in our analysis. limitations although we controlled for user type (undergraduate, graduate, faculty, or library employee) in the recruitment of study participants, we did not screen by academic discipline.
doing so would have hindered our team’s ability to include enough graduate students and faculty members in the study, as nearly all the volunteers from these two groups were from humanities or social science fields. the results might have differed slightly had the study successfully managed to include more faculty from the so-called hard sciences and allied health fields. additionally, the order in which we asked participants to attempt the tasks might have affected how they approached some of the later tasks. if a participant chose to search for a book using the primo discovery tool, for example, they might be more inclined to use it to complete the next task (find an article) rather than navigate to a different online resource or tool. despite these limitations, usability testing has helped improve the website in key ways. we plan to correct for these limitations in future studies. results every group included a participant who failed to complete at least one of the six tasks. an adequate answer to each of the study’s six tasks can be found within one or two pages/clicks from the libraries homepage (figure 1). the average distance to a solution remained at about two page loads across all of the study’s participants, despite a few individual “website safaris.”
figure 1. university of southern mississippi libraries’ homepage.
graduate students tended to complete tasks the quickest and were generally as successful as library employees. they preferred to use primo for finding books but tended to favor the list of scholarly databases on the “articles & databases” page to find articles and journals. undergraduates were the second fastest group, but many struggled to complete one or more of the six tasks. they had the most trouble finding books and locating the journal by title. undergraduates generally performed simple searches and had trouble recovering from missteps.
they were heavy users of primo, relying on the discovery tool more than any other group. the other two user groups, faculty and library employees, were slower at completing tasks. of the two, faculty took the longest to complete any task and failed to complete tasks at a rate similar to that of undergraduates. like undergraduates, this group favored primo nearly as often. in contrast, library employees took almost as long as faculty to complete tasks but were much more successful. as a group, library employees demonstrated the different paths users could take to complete each task but favored those paths they identified as the “preferred” method for finding an item or resource over the fastest route.
the majority of study participants across all user groups had little trouble with the first three tasks. although most participants favored the less direct path to the libraries’ hours, missing the direct link at the top of the homepage (figure 2), they spent relatively little time on this task. likewise, virtually all participants took note of the links to our “ask-a-librarian” and “services” pages located in our homepage’s main navigation menu. this portion of the usability study alerted us to the need for a more prominent display of our opening hours on the homepage.
figure 2. link to “hours” from the homepage.
of the second set of tasks (find a book, find an article, and determine if we have access to nature), the first and last proved the most challenging for participants. one undergraduate was unable to complete the book task, and one faculty member took nearly eight minutes to do so, the longest time to completion of any task by any user in the study. primo was the most preferred method for finding a book.
although an option for searching our classic catalog (which uses innovative interfaces’ millennium integrated library system) is contained within a search widget on the homepage, primo is the default search option and therefore users’ default choice. interestingly, even after statements from some faculty such as “i don’t love primo,” “primo isn’t the best,” and “the [classic catalog] is better,” these participants proceeded to use primo to find a book. library employees were evenly split between primo and classic catalog.
one undergraduate student, one graduate student, and one library employee were unable to determine whether we have access to nature. this task was the most time consuming for library employees because there are multiple ways to approach this question and library employees tended to favor the most consistently successful yet most time-consuming options (e.g., searching within the classic catalog). lacking a clear option in the main navigation bar, the most popular path started with our “articles & databases” page, but the answer was most often successfully found using primo. several participants tried using the “search for databases” search box on the “articles & databases” page, which yielded no results because it searches only our database list. the search widget on the homepage that includes primo has an option for searching e-journals by title, as shown in figure 3. however, nearly all participants other than library employees missed this feature. participants from both the undergraduate and graduate student user groups had trouble with this task, including those who were ultimately successful. unfortunately, many of the undergraduates could not differentiate a journal from an article, and while graduate students were aware of the distinction, a few indicated that they were not used to the idea of finding articles from a specific journal.
figure 3. e-journals search tab.
when it came to finding articles, undergraduates, as well as several faculty and a few library employees, gravitated toward primo. others, particularly graduate students and library employees, opted to search a specific database, most often academic search premier or jstor. however, those who used primo to answer this question arrived at an answer two to three times faster because of the discovery tool’s accessibility from a search widget on the homepage. regardless of the tool or resource they used, most participants found a sufficient result or two.
common breakdowns
despite the clear label “search for databases,” at least one participant from each user group, including library employees, attempted to enter a book title, journal name, or keyword into the libguides’ database search tool on our “articles & databases” page (figure 4). some participants attempted this repeatedly despite getting no results. others did not try a search but stated, with confidence, that entering a journal, book, or article title into the “search for databases” field would yield a relevant result. a few participants also attempted this with the search box on our research guides (libguides) page, which searches only within the content of the libguides themselves.
across all groups, when not starting at the homepage, many participants had difficulty finding books because no clear menu option exists for finding books as it does for articles (our “articles & databases” page). this difficulty was compounded by many participants struggling to return to the libraries’ homepage from within the website’s subpages. those participants who were able to navigate back to the homepage were reminded of the primo search box located there and used it to search for books.
figure 4. “search for databases” box on the “articles & databases” page.
another breakdown was the “help & faq” page (figure 5).
participants who turned there for help at any point in the study spent a relatively long time trying to find a usable answer and often ended up more confused than before. in fact, only one in three participants managed to use “help & faq” successfully because the faq consists of many questions with answers on many different pages and subpages. this portion of the website had not been updated in several years and therefore the questions were not listed in order of frequency.
figure 5. the answer to the “how do i find books?” faq item leads to several subpages.
discussion
using the results of the study, we made several recommendations to the libraries’ web advisory committee and administration: (1) display our hours of operation on the homepage; (2) remove the search boxes from the “articles & databases” and “research guides” pages; (3) condense the “help & faq” pages; and (4) create a “find books” option on the homepage. all of these recommendations were taken into account during a recent redesign of the website. we also considered each user group’s performance and its implications for website design as well as instruction and outreach efforts.
first, our team suggested that the current day’s hours of operation be featured prominently on the website’s front page. despite “how late is our main library open on a typical monday night?” being one of two tasks that had a 100 percent completion rate, this change is easy to make, adds convenience, and addresses a long-voiced complaint. several participants expressed a desire to see this change implemented. moreover, this is something many of our peer libraries provide on their websites.
the team’s next recommendation was to remove the “find databases by title” search box from the “articles & databases” page. during the study, participants who had a particular database in mind opted to navigate directly to that database rather than search for it.
another such search box exists on the “research guides” page. although most of the participants did not encounter this search box during the study, those who did also mistook it for a general search tool. participants from all groups, especially undergraduate students, assumed that any search box on the libraries’ website was designed to search for and within resources like article databases and the online catalog, regardless of how the search box was labeled. given our findings, libraries with similar search boxes might also consider removing these from their websites.
another recommended change was to condense the “help & faq” section of the website considerably. the “help & faq” section was too large and unwieldy for participants to use successfully without becoming visibly frustrated, defeating its purpose. moreover, google analytics showed that only nine of the more than one hundred “help & faq” pages were used with any regularity. going forward, we will work to identify the roughly ten most important questions to feature in this section.
the final major recommendation was to consider adding a top-level menu item called “find books” that would provide users with a means to escape the depths of the site and direct them to primo or the classic catalog. when participants got stuck on the book-finding task, they looked for a parallel to the “articles & databases” menu option. a “getting started” page or libguide could take this idea a step further by also including brief, straightforward instructions on finding articles and journals by title. in effect, this option would be another way to condense and reinvent some of the topics originally addressed in the “help & faq” pages.
comparing each user group’s average performance helped illuminate the strengths and weaknesses of the website’s design.
we suspect that graduate students were the fastest and nearly the most successful group because they are early in their academic careers and doing a great deal of their own research (as compared to faculty). many of them are also responsible for teaching introductory courses and are working closely with first-year students who are just learning how to do research. faculty, because their research tends to be on narrower topics, were familiar with the specific resources and tools they use in their work but were less able to efficiently navigate the parts of the website with which they have less experience. moreover, individual faculty varied widely in their comfort level with technology, and this affected their ability to complete certain tasks.
conclusion
the results of our website usability study echo those found elsewhere in the literature. students approach library search interfaces as if they were google and generally conduct very simple searches. without knowledge of the libraries’ digital environment and without the research skills library employees possess, undergraduates in our study tended to favor the most direct route to the answer, if they could identify it. this group had the most trouble with library and academic terminology or concepts like the difference between an article and a journal. though not as quick as the graduate students, undergraduates completed tasks swiftly, mainly because of their reliance on the primo discovery tool. however, undergraduate students were less able to recover from missteps; more of them confused the “find databases by title” search tool for an article search tool than participants from any other group. since undergraduates compose the bulk of our user base and are the least experienced researchers, we decided to focus our redesign on solutions that will help them use the website more easily.
although all of the library employees in our study work in public-facing roles, not all of them provide regular research help or teach information literacy. since most of them are very familiar with our website and online resources, they approached the tasks more methodically and thoroughly than other participants. library employees tended to choose the search strategy or path to discovery that would yield the highest-quality result or they would demonstrate multiple ways of completing a given task, including any necessary workarounds.
the inclusion of library employees yielded the most powerful tool in our research team’s arsenal. holding this group’s “correct” methods side by side with equally valid methods of discovery helped shake loose rigid thinking, and the fact that some library employees were unable to complete certain tasks shocked all parties in attendance when we presented our findings to stakeholders. any potential argument that student, faculty, and staff missteps were the result of improper instruction and not of a usability issue was countered by evidence that the same missteps were sometimes made by library staff. not only was this an eye-opening revelation to our entire staff, it served as the evidence our team needed to break through entrenched resistance to making any changes. we were met with almost instant, even enthusiastic, buy-in to our redesign recommendations from the libraries’ administration. therefore, we highly recommend that other academic libraries consider including library staff as participants in their website usability studies.
references
1 barbara a. blummer, “a literature review of academic library web page studies,” journal of web librarianship 1, no. 1 (2007): 45–64, https://doi.org/10.1300/j502v01n01_04.
2 jody condit fagan, “usability studies of faceted browsing: a literature review,” information technology and libraries 29, no. 2 (2010): 58–66, https://ejournals.bc.edu/ojs/index.php/ital/article/view/3144/2758.
3 judith z.
emde, sara e. morris, and monica claassen-wilson, “testing an academic library website for usability with faculty and graduate students,” evidence based library and information practice 4, no. 4 (2009): 24–36, https://doi.org/10.18438/b8tk7q.
4 ibid., 30.
5 nancy b. turner, “librarians do it differently: comparative usability testing with students and library staff,” journal of web librarianship 5, no. 4 (2011): 286–98, https://doi.org/10.1080/19322909.2011.624428.
6 ibid., 295.
7 andrew d. asher and lynda m. duke, “searching for answers: student behavior at illinois western university,” in college libraries and student culture: what we now know (chicago: american library association, 2012), 77–78.
8 lucy holman, “millennial students’ mental models of search: implications for academic librarians and database developers,” journal of academic librarianship 37, no. 1 (2011): 21–23, https://doi.org/10.1016/j.acalib.2010.10.003.

selective dissemination of marc: a user evaluation
lorne r. buhr: murray memorial library, university of saskatchewan, saskatoon, saskatchewan
after outlining the terms of reference of an investigation of user reaction to the selective dissemination of marc records, a summary of the types of users is given. user response is analyzed and interpreted in the light of recent developments at the library of congress. implications for the future of sdi of marc in a university setting conclude the paper.
introduction
f. w.
lancaster (1968) in his detailed study of medlars makes the following statement, which has application to all sdi work: "in order to survive, a system must monitor itself, evaluate its performance, and upgrade it wherever possible." (1) since seldom operates in a fairly new field, sdi for current monographs, an evaluation is most important. to a great extent it must be made without reference to other systems since most of the operational sdi services deal with tape services in various fields of scientific journals, and although there are some parallels, there are numerous differences. whereas services such as can/sdi cater primarily to the natural and applied sciences, seldom opens up the possibilities for sdi in the humanities and social sciences.
the background to the seldom project at the university of saskatchewan has been outlined earlier by smith and mauerhoff (1971) and will not be repeated here. (2) after five months of operation a major questionnaire was sent out to each of 121 participants in the experimental seldom service. this questionnaire was based almost entirely on the one used by studer (1968) in his dissertation at indiana state university. (3) the general purpose of the study was to elicit user reaction to seldom, their evaluation of its usefulness, time necessary to scan the weekly output, suggestions regarding continuance of the service, etc. besides this general purpose, the gathering and analyzing of data on seldom will be useful to the library administration in determining the future of an sdi service of this nature. a separate cost study is being prepared in this connection.
material appearing in this paper was originally presented at the third annual meeting of the american society for information science (western canada chapter), banff, alberta, october 4, 1971.
journal of library automation vol. 5/1 march, 1972
several factors prompt a cautionary stance in assessing the value of an sdi system on the basis of one questionnaire:
(1) there is no control situation to which we can compare seldom, i.e., there was no systematic service for current awareness in the field prior to the advent of seldom. faculty and researchers were dependent on their ingenuity to ferret out information on new books which were pertinent to their field of research and instruction. seldom is therefore being compared to a conglomeration of ad hoc methods which may be as numerous as the individuals using them. therefore, we must be cautious or we will tend to say, "something in the field of current awareness is better than nothing," when we really do not know what that "nothing" is.
(2) although seldom had been operational for some twenty weeks when evaluation began, this is a relatively short period on which to base an assessment. on the other hand studer's evaluation was based on the experiences of thirty-nine users and covered only eight weekly runs against the marc tapes scheduled on an every other week basis.
(3) seldom was implemented without any study to determine the adequacy of the ad hoc approaches, to which i have already referred, or to assess the patterns of recommendation for purchase. it was assumed that there was a need for seldom and some of the response would indicate that this is a fairly valid assumption, since almost 90 percent of the respondents wanted the service continued. a random investigation in mid-august of 748 current orders in the acquisitions department for books with imprint of 1969 or later revealed that ninety-five, or about 12.7 percent, referred to seldom as the source of information for a particular recommendation to purchase. this may or may not be significant since there is no way of assessing whether these items would have been recommended anyway, only later perhaps.
one by-product of orders based on seldom information is that correct lc and isbn numbers are given and with the capabilities of the tesa-1 cataloging/acquisitions system such orders can be expedited more quickly and can also be cataloged sooner than non-marc materials, thus ostensibly getting the desired item to the requestor in less time than previously. seldom is valuable in our university setting, therefore, not only as a means of awareness of new items, but also in the actual retrieval of the item for the user, in this case through acquisition. our analysis, however, must be directed to the effectiveness of seldom as an awareness service, vis-a-vis the ad hoc approach.
user group
of 121 questionnaires sent out, seventy-seven or 63.5 percent were returned. six of these had to be rejected for the purposes of this study since either only a few questions had been answered or a general letter had been sent instead of answering the questionnaire. thus, the data presented in this study will be based on seventy-one completed questionnaires or 58.6 percent return. three additional verbal comments were made to the writer and thus we in fact heard from eighty or 66 percent of the users. the term "users" will designate the seventy-one who completed their questionnaires, although comments from the other nine individuals will also be referred to. the users have been grouped into three categories according to table 1.

table 1
i. library and information science
   a. on-campus    12
   b. off-campus   17
   total           29
ii. social sciences and humanities
   a. on-campus    15
   b. off-campus    2
   total           17
iii. natural and applied sciences
   a. on-campus    23
   b. off-campus    2
   total           25

categorization was along fairly traditional lines, with category i being necessary because of the large number of people falling into this area.
the seventeen off-campus users coming under designation (i) represent the library schools in canada as well as librarians/information scientists in canada and the united states. the on-campus users are library department heads and heads of branch libraries. included in the social sciences and humanities are the fields of psychology, sociology, history, economics, english, commerce, classics, etc. the natural and applied sciences include all the health sciences plus physical education since the two profiles in that area are tending toward the health sciences. engineering, poultry science, physics, chemistry, biology, etc., are represented here.
observations
a sample of the questionnaire used appears on pp. 47-50 and includes a tally of the number of responses for each possible alternative answer to each question. in some cases the total number of replies for a question is less than seventy-one. this is explained by the fact that some questions on some questionnaires were not answered or were answered ambiguously so they could not be tallied.
generally speaking, users found seldom to be good to very good in providing sdi for new english monographs. 25.8 percent of the users found the lists very useful while 48.5 percent said they were useful. six users said the listings were inconsequential for their purposes; in several instances this may be due to poor profiling or profiling for a subject area in which little would appear on the marc data base. 23.6 percent of the users indicated that in most cases items of interest found on the seldom lists were previously not known to them. 45.8 percent said that "of interest" items were frequently new. 76 percent of the group believed that the proportion of "of interest" items which also were new was satisfactory, a percentage which speaks well for the currency and effectiveness of an sdi capability.
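the return figures reported under "user group" and the category totals in table 1 can be cross-checked with a few lines of arithmetic. the following is only a sketch for the reader's convenience, not part of the original study; the variable and function names are illustrative assumptions, while the counts themselves are taken from the text and table 1.

```python
# Counts taken from the "user group" section and table 1 of the paper.
# All names below are illustrative; only the numbers come from the source.
QUESTIONNAIRES_SENT = 121
RETURNED = 77
REJECTED = 6          # too few answers, or a general letter sent instead
VERBAL_COMMENTS = 3   # additional comments made to the writer

# Table 1: (on-campus, off-campus) users per category.
TABLE_1 = {
    "library and information science": (12, 17),
    "social sciences and humanities": (15, 2),
    "natural and applied sciences": (23, 2),
}


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


def summarize() -> dict:
    """Recompute the response figures quoted in the text."""
    completed = RETURNED - REJECTED
    heard_from = RETURNED + VERBAL_COMMENTS
    category_totals = {name: sum(counts) for name, counts in TABLE_1.items()}
    return {
        "completed": completed,
        "heard_from": heard_from,
        "category_totals": category_totals,
        "returned_pct": percent(RETURNED, QUESTIONNAIRES_SENT),
        "completed_pct": percent(completed, QUESTIONNAIRES_SENT),
        "heard_from_pct": percent(heard_from, QUESTIONNAIRES_SENT),
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(f"{key}: {value}")
```

the three category totals sum to the seventy-one completed questionnaires. recomputed to one decimal place, the return rates come out at 63.6 and 58.7 percent, a tenth of a point above the 63.5 and 58.6 percent printed in the text; the difference may simply reflect truncation rather than rounding in the original.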
one of the chief drawbacks for which sdi services are often cited is the absence of evaluative commentary or abstract material to accompany the citations. some tape services do provide either an abstract or a good number of descriptors, and this has proved to be an asset in helping the subscriber. seldom is based on the marc tapes which provide complete cataloging data but do not give either evaluations or a multiplicity of descriptors. (some indications are that the information now available in publishers' weekly might at some time in the future be added to the marc tapes.) interestingly enough, 83.5 percent of the users said the information included in the entries was adequate to determine whether an item was of interest or not. predictably, title, author/editor, and subject headings were the three indicators, in that order, which were found most useful in making evaluations. this is significant since titles in the humanities and some of the social sciences, particularly, are often not as specific in describing the contents of a work as are titles in the physical sciences.
63.5 percent of the users indicate that seldom information is used for recommending titles for acquisition by the library. as a result it is quite possible that purchasing in the areas covered by seldom profiles may increase and the tendency to broaden the collection should increase. unfortunately, no pattern of pre-seldom recommending for purchase is known. some instructors use the weekly printouts to keep current bibliographies on hand both for teaching purposes and for research purposes. since over half the users (55.8 percent) needed no more than ten minutes per week to scan the printouts, there is no indication that excessive time is taken up in the use of such an sdi service.
in reply to the question, "would you be willing to increase the number of irrelevant notices received in order to maximize the number of relevant ones?"
opinions were nearly balanced with 58 percent replying in the affirmative and 42 percent answering negatively. on the other hand, increases in the marc data base expected some time in 1972, when other roman alphabet language imprints and records for motion pictures and filmstrips are added, did not seem problematic, with only 25 percent of users asking that an upper limit be placed on the quantity of material retrieved by their profiles. numerous individuals (thirty) responded favorably to the prospect of wider language coverage by marc. on the other hand, several individuals commented that non-english output on seldom would not enhance the service for them, and this likely reflects language capabilities more than a lack of non-english material in their subject area.
the question regarding format brought interesting comments, especially from library personnel and off-campus librarians:
"computer type format is often confusing."
"a book designer should be consulted to improve the format."
"spacing could be improved to separate title and imprint information from subject headings and notes at foot of entry. would make scanning easier."
questions fourteen, nineteen, and twenty-one provide an overall summary of user reaction. 88.6 percent of users want the service to continue. overall value of seldom was rated "very high" by 11.3 percent, "high" by 33.8 percent, "medium" by 42.2 percent, and "low" by 12.7 percent. seldom served to demonstrate the possibility of sdi for monographs "amply" according to 36.6 percent of users, "adequately" to 50.6 percent of users, and "poorly" to 12.65 percent of users.
there was less certainty on how such a program should be administered or costed, particularly since a long-range cost study was not yet available. clearly those who were impressed with seldom's effectiveness and future possibilities wanted other faculty to have the same opportunities, yet they cautioned against a blanket service.
one comment sums this up best, "it should be available to anyone who has a perceived need for it-but require them to at least make the effort of setting up the profiles, etc."
many of the less than enthusiastic comments about seldom could be correlated with little or no user feedback to the search editor in order to improve relevancy and recall. user education in this regard is crucial in order that all users fully understand the possibilities and limitations of the sdi service. the success of any existing sdi service in the periodical literature has hinged on a good data base and up-to-date, specific profiling according to smith and lynch (1971). (4) the effectiveness of the profiling is a direct function of the ingenuity and persistence of the user and the profile editor.
discussion
this study has attempted to weigh the usefulness of an sdi service primarily with regard to its utility as a current awareness service. seldom, in order to be worthwhile, must either be faster or broader in its coverage than existing services. two comparisons readily arise out of the commentary of the users. some library science professors felt that the lc proofslip service was just as fast as seldom and thus there was no advantage in having the latter when the former was available. a study done at the university of chicago by payne and mcgee (1970) repudiates this argument fairly effectively. (5) findings at chicago show that marc is faster than the corresponding proofslips.
a number of users rely heavily on publishers' blurbs and prepublication notices and find that often books for which records appear on seldom are already on the library shelves. this observation is not altogether an indictment of seldom since another user observed that he appreciated being able to have the hard copy immediately; and in some cases he might not even have known about the item except for seldom.
some users mentioned that waiting for evaluative reviews could put one at least a year behind just in placing the order for the book, let alone receiving it.
seldom has the virtue of informing individuals of the existence of new books, but the delay in having the actual item might be problematic, so one question was directed to this consideration. some people felt that it was at least worth something to know that a book existed even if one could not consult it immediately. numerous complaints were aired regarding the slowness of obtaining items ordered through a library's acquisitions department. in fact one user said this slowness meant he had to purchase personal copies of items he wanted/needed. as indicated earlier in the introduction, the tesa-1 acquisitions-cataloging routine at the university of saskatchewan library does have the capability to speed up actual receipt of books by the patron.
a recent development at the library of congress has definite implications for the future of seldom and any other marc-based sdi programs. the cip (cataloging in publication) program initiated this summer means that lc will now be able to make available cataloging information, except for collation, for books about to be published, at a time factor of up to six weeks before publication. such marc records will have a special tag designating them as cip material. furthermore, cip records will appear only on marc; the number predicted is 10,000 for the first year and 30,000 by the third year, a figure which would include all american imprints. (6) marc-oklahoma has already surveyed the subscribers to its sdi project to determine whether they would prefer to receive both cip marc records and regular marc records or only one of the two categories. users preferred to receive both types of information and appropriate changes have been made to the oklahoma sdi programs.
(7) beginning with september, marc cip records will appear and present information on books thirty to forty-five days before they are published.
several library personnel appreciated the usefulness of seldom as an outreach service of the university library into the academic community. they see seldom as a public relations tool. numerous efforts are at the present time being made by librarians to alert individuals to materials in their several fields of interest, and seldom can play an important role in providing an active dissemination of information on a systematic basis. this is the direction in which we need to move so that our role becomes both that of a collector of information and a disseminator of information. special librarians have been doing this kind of thing for years and seldom allows for specialized service to a larger user group.
implications and conclusions
1. an sdi service based on marc can be helpful in building a balanced library collection depending on the efforts of faculty and/or bibliographers in setting up their profiles and maintaining them. the article by ayres (1971) is particularly good on this aspect. (8) the parameters of the marc data base must constantly be kept in mind, just as the constraints of the ad hoc methods must be considered in any comparisons. publishers' blurbs in journals have the limitation of not systematically covering all the publications in a given subject area; book reviews tend to appear too late to allow users to receive current information on new books; seldom corrects the first shortcoming at the expense of not having the evaluations appearing in book reviews. on the other hand marc tapes do represent the cataloging of books in the english language by one of the largest national libraries in the world, and thus provide a coverage which is hard to duplicate by any one other alerting service.
2.
comments, especially from users in the social sciences and humanities, indicate that an sdi system for new monographs has greater pertinence in their area than perhaps in the natural and applied sciences, simply because of the nature of research done in the two areas. a recent study by j. l. stewart (1970) substantiates this factor for the field of political science. (9) his detailed analysis of the patterns of citing in the writings appearing in a collective work in political science indicated that 75 percent of such citations were from monographs, leading him to the obvious conclusion that "monographs provide three times as much material as do journals" in the field of political science. by contrast, journals are likely more crucial for the fields of natural and applied science, and provide the key access point for vital information.

3. sdi of marc, most users felt, should demand a fair amount of effort on the part of users to assure that the service would obtain optimum return for money invested. a blanket service to all faculty would be wasteful, since many faculty would not have a perceived need for it and others would not use it enough if it was simply offered free to everyone. comments tended to favor making contact through the departmental library representative and channeling weekly printouts through this individual. a cost study will help determine whether it is economically feasible to operate seldom in an academic setting with at least 100 users. if current subscription costs for sdi services such as those offered by can/sdi of the national science library, ottawa, can be maintained, and early indications are that they can, a cost of $100 per profile per year may be feasible, bringing the annual expenditure for 100 users to $10,000.
a chief variable which makes effective costing difficult is the variation in the number of records appearing on each weekly tape, and this is a variable which can only be dealt with by prediction on the basis of the number of records on past tapes.

4. seldom has the virtue of adding a major role of dissemination of information to libraries which up until now have primarily operated as storers of information.

822 33~ shakespeare william fleay. frederick gard. 1831-1909. shakespeare manual. new york.ams press<1970> xxiii. 312 p. 19 cm. lc 76-130621 pr2895 p1002 en 01 tw 000 wt 000 s r0252 fc leng 822.33 isbn 0~0~02~08~

seldom evaluation questionnaire

1. what is your feeling about the sdi lists as a source for finding out about the existence of newly published works in your fields of interest? would you say that the lists provided a source which was:
   (a) very useful: 18
   (b) useful: 34
   (c) moderately useful: 12
   (d) inconsequential: 6

2. do you feel that the sdi lists brought to your attention works of interest which are not generally cited by other sources that you use to learn of new publications?
   (a) many works: 10
   (b) some works: 39
   (c) a few works: 19
   (d) none: 2

3. how would you characterize your feeling about the relative proportions of the items "of interest" (relevant items) and "those not of interest" (irrelevant items) included in the sdi lists?
   (a) the proportion of relevant items in the lists was satisfactory: 57
   (b) the proportion of irrelevant items in the lists was too high: 13

4. it is inevitable that some "not-of-interest" items are included in the sdi lists. was the inclusion of irrelevant notices bothersome to you?
   (a) yes: 6
   (b) no: 65
   reasons:

5. on the other hand, it is possible that for any given search run, some relevant items in the file are missed.
the chance of relevant items being missed can generally be minimized by certain search adjustments, but with a resulting increase in irrelevant notices. would you be willing to increase the number of irrelevant notices received in order to maximize the number of relevant ones?
   (a) yes: 40
   (b) no: 29
   reasons:

6. the sdi lists notified you of an average of __ items per list which you judged to be "of interest." on a purely quantitative basis, would you say that this number was satisfactory, or for some reason too small or too large?
   (a) satisfactory: 48
   (b) too small: 16
   (c) too large: 1

7. when the input to the marc file is increased, your sdi output would also likely increase. do you feel that you would like to be able to set some arbitrary upper limit on the quantity of items included in each sdi list even at the risk of missing a number of relevant items?
   (a) yes: 17
   (b) no: 51
   if yes, maximum number: ______
   reasons:

8. the sdi lists alerted you to a number of items which you judged to be "of interest." would you say that "of interest" items were new to you?
   (a) in most cases: 17
   (b) frequently: 33
   (c) occasionally: 17
   (d) seldom: 5

9. do you feel that the proportion of items "of interest" which were also "new" to you was:
   (a) satisfactory: 54
   (b) too low: 17

10. would you say that, in general, information given for the entries in the sdi lists is adequate to judge whether an item is or is not of interest to you?
   (a) yes: 58
   (b) no: 10

11. what elements of the entry did you most often find useful in making evaluations?
   (a) author/editor: 38
   (b) title: 55
   (c) publisher: 9
   (d) series note: 4
   (e) subject headings: 35
   (f) classification numbers: 8
   (g) other (please specify): 1

12. what is the primary use to which you put the sdi information?
   (a) recommendation for library acquisition: 51
   (b) personal purchase of item: 12
   (c) other (please specify): 15

13.
if your recommendation originates the library order for a publication, it will be some time before the work is available; and even if already on order, most of the publications included in your lists were probably too new to be available from the library at the same time you received the list. do you feel that this diminishes the value of the sdi service?
   (a) significantly: 2
   (b) somewhat: [illegible]
   (c) negligibly: [illegible]
   for what reasons?

14. a potential value of sdi service, based on the large volume of newly published works cataloged by and for the library of congress, is to bring together in one list timely notices for those works in the file which correspond to your several fields of interest. do you feel that the experimental sdi service demonstrated this capacity?
   (a) amply: 26
   (b) adequately: 36
   (c) poorly: 9

15. is the format of the sdi notices satisfactory?
   (a) yes: 61
   (b) no: 9
   if not, what format would you suggest?

16. is the distribution schedule of once a week satisfactory?
   (a) yes: 71
   (b) no: 0

17. on the average, how much time would you estimate it took to examine an sdi list? roughly, in minutes:
   (a) 5 minutes: 23
   (b) 5-10 minutes: 16
   (c) 10 minutes: 9
   (d) 10-15 minutes: 11
   (e) 15 minutes: 5
   (f) 15-20 minutes: 1
   (g) 20 minutes: 5

18. a possible by-product of this sdi service is the building up of a cumulative marc tape file which can be searched in various ways by computer. would you make use of such a file?
   (a) yes: 40
   (b) no: 18
   if no, for what purposes?

19. judging from your total experience with the sdi service, would you characterize its overall value to you as:
   (a) very high: [illegible]
   (b) high: 24
   (c) medium: 30
   (d) low: 9

20. the marc file at present represents english monographs cataloged by the library of congress on a week-by-week basis. sometime in 1972, the library of congress will begin to add some non-english monographs to the marc file.
keeping in mind the forthcoming expanded marc file on which future sdi service would be based, do you feel that its value to you would then be:
   (a) increased: 30
   (b) the same: 33
   (c) less: 7

21. do you personally want this sdi service to be continued?
   (a) yes: [illegible]
   (b) no: 3
   (c) it doesn't matter: 5

22. do you feel that this sdi service should be offered to the entire faculty?
   (a) yes: 42
   (b) no: 14
   reasons:

23. do you feel that this sdi service should appropriately be made available by the university, i.e., that the university should organize and administer the service?
   (a) yes: 36
   (b) no: 5
   (c) don't know: 23

24. do you feel that the university alone should pay for this faculty sdi service?
   (a) yes: 30
   (b) no: 6
   (c) don't know: 25

25. optional: general comments, pros and cons, elucidation of above replies, attitudes, suggestions, etc., concerning the sdi service.

40 information technology and libraries | march 2010

mary kurtz

dublin core, dspace, and a brief analysis of three university repositories

this paper provides an overview of dublin core (dc) and dspace together with an examination of the institutional repositories of three public research universities. the universities all use dc and dspace to create and manage their repositories. i drew a sampling of records from each repository and examined them for metadata quality using the criteria of completeness, accuracy, and consistency. i also examined the quality of records with reference to the methods of educating repository users. one repository used librarians to oversee the archiving process, while the other two employed two different strategies as part of the self-archiving process. the librarian-overseen archive had the most complete and accurate records for dspace entries.

the last quarter of the twentieth century has seen the birth, evolution, and explosive proliferation of a bewildering variety of new data types and formats.
digital text and images, audio and video files, spreadsheets, websites, interactive databases, rss feeds, streaming live video, computer programs, and macros are merely a few examples of the kinds of data that can now be found on the web and elsewhere. these new data forms do not always conform to conventional cataloging formats. in an attempt to bring some sort of order from chaos, the concept of metadata (literally “data about data”) arose. metadata is, according to ala, “structured, encoded data that describe characteristics of information-bearing entities to aid in the identification, discovery, assessment, and management of the described entities.”1 metadata is an attempt to capture the contextual information surrounding a datum. the enriching contextual information assists the data user to understand how to use the original datum. metadata also attempts to bridge the semantic gap between machine users of data and human users of the same data.

dublin core

dublin core (dc) is a metadata schema that arose from an invitational workshop sponsored by the online computer library center (oclc) in 1995. “dublin” refers to the location of this original meeting in dublin, ohio, and “core” refers to the fact that dc is a set of metadata elements that are basic, but expandable. dc draws upon concepts from many disciplines, including librarianship, computer science, and archival preservation. the standards and definitions of the dc element sets have been developed and refined by the dublin core metadata initiative (dcmi) with an eye to interoperability. dcmi maintains a website (http://dublincore.org/documents/dces/) that hosts the current definitions of all the dc elements and their properties. dc is a set of fifteen basic elements plus three additional elements. all elements are both optional and repeatable. the basic dc elements are:

1. title
2. creator
3. subject
4. description
5. publisher
6. contributor
7. date
8. type
9. format
10. identifier
11. source
12. language
13. relation
14. coverage
15. rights

the additional dc elements are:

16. audience
17. provenance
18. rights holder

dc allows for element refinements (or subfields) that narrow the meaning of an element, making it more specific. the use of these refinements is not required. dc also allows for the addition of nonstandard elements for local use.

mary kurtz (mhkurtz@gmail.com) is a june 2009 graduate of drexel university’s school of information technology. she also holds a bs in secondary education from the university of scranton and an ma in english from the university of illinois at urbana-champaign. currently, kurtz volunteers her time in technical services/cataloging at simms library at albuquerque academy and in corporate archives at lovelace respiratory research institute (www.lrri.org), where she is using dspace to manage a diverse collection of historical photographs and scientific publications.

dspace

dspace is an open-source software package that provides management tools for digital assets. it is frequently used to create and manage institutional repositories. first released in 2002, dspace is a joint development effort of hewlett packard (hp) labs and the massachusetts institute of technology (mit). today, dspace’s future is guided by a loose grouping of interested developers called the dspace committers group, whose members currently include hp labs, mit, oclc, the university of cambridge, the university of edinburgh, the australian national university, and texas a&m university. dspace version 1.3 was released in 2005 and the newest version, dspace 1.5, was released in march 2008. more than one thousand institutions around the world use dspace, including public and private colleges and universities and a variety of not-for-profit corporations. dc is at the heart of dspace.
although dspace can be customized to a limited extent, the basic and qualified elements of dc and their refinements form dspace’s backbone.2

how dspace works: a contributor’s perspective

dspace is designed for use by “metadata naive” contributors. this is a conscious design choice made by its developers and in keeping with the philosophy of inclusion for institutional repositories. dspace was developed for use by a wide variety of contributors with a wide range of metadata and bibliographic skills. dspace simplifies the metadata markup process by using terminology that is different from dc standards and by automating the production of element fields and xml/html code. dspace has four hierarchical levels of users: users, contributors, community administrators, and network/systems administrators. the user is a member of the general public who will retrieve information from the repository via browsing the database or conducting structured searches for specific information. the contributor is an individual who wishes to add their own work to the database. to become a contributor, one must be approved by a dspace community administrator and receive a password. a contributor may create, upload, and (depending upon the privileges bestowed by the community administrator) edit or remove informational records. their editing and removal privileges are restricted to their own records. a community administrator has oversight within their specialized area of dspace and accordingly has more privileges within the system than a contributor. a community administrator may create, upload, edit, and remove records, but also can edit and remove all records available within the community’s area of the database. additionally, the community administrator has access to some metadata about the repository’s records that is not available to users and contributors and has the power to approve requests to become contributors and grant upload access to the database.
lastly, the community administrator sets the rights policy for all materials included in the database and writes the statement of rights that every contributor must agree to with every record upload. the network/systems administrator is not involved with database content, focusing rather on software maintenance and code customization. when a dspace contributor wishes to create a new record, the software walks them through the process. dspace presents seven screens in sequence that ask for specific information to be entered via check buttons, fill-in textboxes, and sliders. at the end of this process, the contributor must electronically sign an acceptance of the statement of rights. because dspace’s software attempts to simplify the metadata-creation process for contributors, its terminology is different from dc’s. dspace uses more common terms that are familiar to a wider variety of individuals. for example, dspace asks the contributor to list an “author” for the work, not a “creator” or a “contributor.” in fact, those terms appear nowhere in any dspace. instead, dspace takes the text entered in the author textbox and maps it to a dc element, something that has profound implications if the mapping does not follow expected dc definitions. likewise, dspace does not use “subject” when asking the contributor to describe their material. instead, dspace asks the contributor to list keywords. text entered into the keyword field is then mapped into the subject element. while this seems like a reasonable path, it does have some interesting implications for how the subject element is interpreted and used by contributors. dc’s metadata elements are all optional. this is not true in dspace. dspace has both mandatory and automatic elements in its records. because of this, data records created in dspace look different from data records created in dc.
these mandatory, automatic, and default fields affect the fill frequency of certain dc elements, with all of these elements having 100 percent participation. in dspace, the title element is mandatory; that is, it is a required element. the software will not allow the contributor to proceed if the title text box is left empty. as a consequence, all dspace records will have 100 percent participation in the title element. dspace has seven automatic elements, that is, element fields that are created by the software without any need for contributor input. three are date elements, two are format elements, one is an identifier, and one is provenance. dspace automatically records the time of each record’s creation in machine-readable form. when the record is uploaded into the database, this timestamp is entered into three element fields: dc.date.available, dc.date.accessioned, and dc.date.issued. therefore dspace records have 100 percent participation in the date element. for previously published materials, a separate screen asks for the original publication date, which is then placed in the dc.date.issued element. like title, the original date of publication is a mandatory field, and failure to enter a meaningful numerical date into the textbox will halt the creation of a record. in a similar manner, dspace “reads” the kind of file the contributor is uploading to the database. dspace automatically records the size and type (.doc, .jpg, .pdf, etc.) of the file or files. this data is automatically entered into dc.format.mimetype and dc.format.extent. like date, all dspace records will have 100 percent participation in the format element. likewise, dspace automatically assigns a location identifier when a record is uploaded to the database. this information is recorded as a uri and placed in the identifier element. all dspace records have a dc.identifier.uri field. the final automatic element is provenance.
at the time of record creation, dspace records the identity of the contributor (derived from the sign-in identity and password) and places this information into a dc.provenance element field. this information becomes a permanent part of the dspace record; however, this field is hidden from users. typically only community and network/systems administrators may view provenance information. still, like date, format, and identifier elements, dspace records have automatic 100 percent participation in provenance. because of the design of dspace’s software, all dspace-created records will have a combination of both contributor-created and dspace-created metadata. all dspace records can be edited. during record creation, the contributor may at any time move backward through his record to alter information. once the record has been finished and the statement of rights signed, the completed record moves into the community administrator’s workflow. once the record has entered the workflow, the community administrator is able to view the record with all the metadata tags attached and make changes using dspace’s editing tools. however, depending on the local practices and the volume of records passing through the administrator’s workflow, the administrator may simply upload records without first reviewing them. a record may also be edited after it has been uploaded, with any changes being uploaded into the database at the end of the editing process. in editing a record after it has been uploaded, the contributor, providing he has been granted the appropriate privileges, is able to see all the metadata elements that are attached to the record. calling up the editing tools at this point allows the contributor or administrator to make significant changes to the elements and their qualifiers, something that is not possible during the record’s creation.
when using the editing tools, the simplified contributor interface disappears, and the metadata element fields are labeled with their dc names. the contributor or administrator may remove metadata tags and the information they contain and add new ones by selecting the appropriate metadata element and qualifier from a slider. for example, during the editing process, the contributor or administrator may choose to create dc.contributor.editor or dc.subject.lcsh options, something not possible during the record-creation process. in the examination of the dspace records from our three repositories, dspace’s shaping influence on element participation and metadata quality will be clearly seen.

the repositories

dspace is principally used by academic and corporate nonprofit agencies to create and manage their institutional repositories. for this study, i selected three academic institutions that shared similar characteristics (large, public, research-based universities) but which had differing approaches to how they managed their metadata-quality issues. the university of new mexico (unm) dspace repository (dspaceunm) holds a wide-ranging set of records, including materials from the university’s faculty and administration, the law school, the anderson school of business administration, and the medical school, as well as materials from a number of tangentially related university entities like the western water policy review advisory commission, new mexico water trust board, and governor richardson’s task force on ethic reform. at the time of the initial research for this paper (spring 2008), dspaceunm provided little easily accessible on-site education for contributors about the dspace record-creation process. what was offered (a set of eight general information files) was buried deep inside the library community. a contributor would have to know the files existed to find them. by summer 2009, this had changed. dspaceunm had a new homepage layout.
there is now a link to “help sheets and promotional materials” at the top center of the homepage. this link leads to the previously difficult-to-find help files. the content of the help files, however, remains largely unchanged. they discuss community creation, copyrights, administrative workflow for community creation, a list of supported formats, a statement of dspaceunm’s privacy policy, and a list of required, encouraged, and not required elements for each new record created. for the most part, dspaceunm help sheets do not attempt to educate the contributor in issues of metadata quality. there is no discussion of dc terminology, no attempt to refer the contributor to a thesaurus or controlled vocabulary list, nor any explanation of the record-creation or editing process. this lack of contributor education may be explained in part because dspaceunm requires all new records to be reviewed by a subject area librarian as part of the dspace community workflow. thus any contributor errors, in theory, ought to be caught and corrected before being uploaded to the database.

the university of washington (uw) dspace repository (researchworks at the university of washington) hosts a narrower set of records than dspaceunm, with the materials limited to those contributed by the university’s faculty, students, and staff, plus materials from the uw’s archives and uw’s school of public and community health. in 2008, researchworks was self-archiving. most contributors were expected to use dspace to create and upload their records. there is no indication in the publicly available information about the record creation workflow as to whether record reviews were conducted before record upload. the help link on the researchworks homepage brought contributors to a set of screen-by-screen instructions on how to use dspace’s software to create and upload a record.
the step-through did not include instructions on how to edit a record once it had been created. no explanation of the meanings or definitions of the various dc elements was included in the help files. there also were no suggestions about the use of a controlled vocabulary or a thesaurus for subject headings. by 2009, this link had disappeared and the associated contributor education materials with it.

the knowledge bank at ohio state university (osu) is the third repository examined for this paper. osu’s repository hosts more than thirty communities, all of which are associated with various academic departments or special university programs. like researchworks at uw, osu’s repository appears to be self-archiving with no clear policy statement as to whether a record is reviewed before it is uploaded to the repository’s database. osu makes a strong effort to educate its contributors. on the upper-left of the knowledge bank homepage is a slider link that brings the contributor (or any user) to several important and useful sources of repository information: about knowledge bank, faqs, policies, video upload procedures, community set-up form, describing your resources, and knowledge bank licensing agreement. the existence and use of metadata in knowledge bank are explicitly mentioned in the faq and policies areas, together with an explanation of what metadata is and how metadata is used (faq), and a list of supported metadata elements (policies). the describe your resources section gives extended definitions of each dspace-available dc metadata element and provides examples of appropriate metadata-element use. knowledge bank provides the most comprehensive contributor education information of any of the three repositories examined. it does not use a controlled vocabulary list for subject headings, and it does not offer a thesaurus.

data and analysis

i chose twenty randomly selected full records from each repository.
no more than one record was taken from any one collection to gather a broad sampling from each repository. i examined each record for the quality of its metadata. metadata quality is a semantically slippery term. park, in the spring 2009 special metadata issue of cataloging and classification quarterly, suggested that the most commonly accepted criteria for metadata quality are completeness, accuracy, and consistency.3 those criteria will be applied in this analysis. for the purpose of this paper, i define completeness as the fill rate for key metadata elements. because the purpose of metadata is to identify the record and to assist in the user’s search process, the key elements are title, contributor/creator, subject, and description.abstract, all contributor-generated fields. i chose these elements because these are the fields that the dspace software uses when someone conducts an unrestricted search. table 1 shows the fill rate for the title element is 100 percent for all three repositories. this is to be expected because, as noted above, title is a mandatory field. the fill rate for contributor/creator is likewise high: 16 of 20 (80 percent) for unm, 19 of 20 (95 percent) for uw, and 19 of 20 (95 percent) for osu. (osu’s fill rates for creator and contributor were summed because osu uses different definitions for creator and contributor element fields than do unm or uw. this discrepancy will be discussed in greater depth in the consistency of metadata terminology below.) the fill rate for subject was more variable. unm’s subject fill rate was 100 percent, while uw’s was 55 percent, and osu’s was 40 percent. the fill rate for the description.abstract subfield was 12 of 20 (60 percent) at unm, 15 of 20 (75 percent) at uw, and 8 of 20 (40 percent) at osu. (see appendix a for a complete list of metadata elements and subfields used by each of the three repositories.)
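the fill-rate arithmetic can be reproduced with a short sketch. it assumes a sample of twenty records per repository, as stated in the text, and takes the subject counts (20, 11, and 8) from the reported percentages; the dictionary layout and the fill_rate helper are illustrative names, not part of the study or of dspace.

```python
# Counts of sampled records (out of 20 per repository) in which each key
# element was filled, as reported in the text. Layout and names are illustrative.
SAMPLE_SIZE = 20

filled = {
    "title": {"unm": 20, "uw": 20, "osu": 20},
    "contributor/creator": {"unm": 16, "uw": 19, "osu": 19},
    "subject": {"unm": 20, "uw": 11, "osu": 8},
    "description.abstract": {"unm": 12, "uw": 15, "osu": 8},
}


def fill_rate(count, sample_size=SAMPLE_SIZE):
    """Return the percentage of sampled records in which an element is filled."""
    return 100 * count / sample_size


# Fill rate per element and repository.
rates = {
    element: {repo: fill_rate(n) for repo, n in counts.items()}
    for element, counts in filled.items()
}

for element, by_repo in rates.items():
    print(element, by_repo)
```

laid out this way, the two elements for which one repository falls below 50 percent (subject and description.abstract at osu) are easy to pick out.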
the relatively low fill rate (below 50 percent) at the osu knowledge bank in both subject and description.abstract suggests a lack of completeness in that repository’s records. accuracy in metadata quality is the essential “correctness” of a record. correctness issues in a record range from data-entry issues (typos, misspellings, and inconsistent date formats) to the correct application of metadata definitions and data overlaps.4 accuracy is perhaps the most difficult of the metadata quality criteria to judge. local practices vary widely, and dc allows for the creation of custom metadata tags for local use. additionally, there is long-standing debate and confusion about the definitions of metadata elements even among librarians and information professionals.5 because of this, only the most egregious of accuracy errors were considered for this paper. all three repositories had at least one record that contained one or more inaccurate metadata fields; two of them had four or more inaccurate records. inaccurate records included a wide variety of accuracy errors, including poor subject information (no matter how loosely one defines a subject heading, “the” is not an accurate descriptor); mutually contradictory metadata (one record contained two different language tags, although only one applied to the content); and one record in which the abstract was significantly longer than, and only tangentially related to, the file it described. additionally, records showed confusion over contributor versus creator elements. in a few records, contributors entered duplicate information into both element fields. this observation supports park and childress’s findings that there is widespread confusion over these elements.6 among the most problematic records in terms of accuracy were those contained in uw’s early buddhist manuscripts project.
this collection, which has been removed from public access since the original data was drawn for this paper, contained numerous ambiguous, contradictory, and inaccurate metadata elements.7 while contributor-generated subject headings were specifically not examined for this paper, it must be noted that there was a wide variation in the level of detail and vocabulary used to describe records. no community within any of the repositories had specific rules for the generation of keyword descriptors for records, and the lack of guidance shows. consistency can be defined as the homogeneity of formats, definitions, and use of dc elements within the records. this consistency, or uniformity, of data is important because it promotes basic semantic interoperability. consistency both inside the repository itself and with other repositories makes the repository easier to use and provides the user with higher quality information. all three repositories showed 100 percent consistency in dspace-generated elements. dspace’s automated creation of date and format fields provided reliably consistent records in those element fields. dspace’s automatic formatting of personal names in the dc.contributor.author and dc.creator fields also provided excellent internal consistency. however, the metadata elements were much less consistent for contributor-generated information. inconsistency within the subject element is where most problems occurred. personal names used as subject headings and capitalization within subject headings both proved to be particular issues. dspace alphabetizes subject headings according to the first letter of the free text entered in the keyword box. thus the same name entered in different formats (first name first or last name first) generates different subject-heading listings. the same is true for capitalization. any difference in capitalization of any word within the free-text entry generates a separate subject heading.
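the effect described above can be illustrated with a minimal sketch. it assumes that headings are distinguished by exact string comparison of the free text, which is what the observed behavior suggests; the sample keywords are invented, and normalize_heading is a hypothetical helper, not a dspace function. its single-comma rule for inverted names is a rough heuristic that would misread headings such as "water policy, new mexico", which is one reason authority control is usually preferred to string rewriting.

```python
from collections import defaultdict


def normalize_heading(keyword):
    """Hypothetical normalizer: collapse whitespace, lowercase, and turn
    'last, first' into 'first last' when exactly one comma is present."""
    text = " ".join(keyword.split()).lower()
    if text.count(",") == 1:
        last, first = (part.strip() for part in text.split(","))
        text = f"{first} {last}"
    return text


# Invented free-text keyword entries of the kind a contributor might type.
keywords = [
    "Shakespeare, William",
    "William Shakespeare",
    "william shakespeare",
    "Water Policy",
    "water policy",
]

# Exact-string matching: every variant becomes its own subject heading.
exact_headings = sorted(set(keywords))

# Grouping on the normalized form collapses the variants.
grouped = defaultdict(list)
for kw in keywords:
    grouped[normalize_heading(kw)].append(kw)

print(len(exact_headings), "headings under exact matching")
print(len(grouped), "headings after normalization")
```

in this invented sample, five distinct strings describe only two subjects, which is the kind of fragmentation a browsing user would encounter in the subject listings.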
another field where consistency was an issue was dc.description.sponsorship. sponsorship is a problem because different communities, even different collections within the same community, use the field to hold different information. some collections used the sponsorship field to hold the name of a thesis or dissertation advisor. some collections used sponsorship to list the funding agency or underwriter for a project being documented inside the record. some collections used sponsorship to acknowledge the donation of the physical materials documented by the record. while all of these are valid uses of the field, they are not the same thing and do not hold the same meaning for the user.

the largest consistency issue, however, came from a comparison of repository policies regarding element use and definition. unaltered dspace software maps contributor-generated information entered into the author textbox during the record-creation process into the dc.contributor.author field. however, osu’s dspace software has been altered so that the dc.contributor.author field does not exist. instead, text entered into the author textbox during the record-creation process maps to dc.creator. although both uses are correct, this choice does create a significant difference in element definitions. osu’s dspace author fields are no longer congruent with other dspace author fields.

table 1. metadata fields and their frequencies

element       univ. of n.m.   univ. of wash.   ohio state univ.
title         20              20               20
creator       0               0                16
subject       20              11               8
description   12              16               17
publisher     4               4                8
contributor   16              19               3
date          20              20               20
type          20              20               20
identifier    20              20               20
source        0               0                0
language      20              20               20
relation      3               1                6
coverage      2               0                0
rights        2               0                0
provenance    **              **               **

**provenance tags are not visible to public users

dc, dspace, and a brief analysis of three university repositories | kurtz 45

conclusions

dspace was created as a repository management tool.
by streamlining the record creation workflow and partially automating the creation of metadata, dspace’s developers hoped to make institutional repositories more useful and functional while at the same time providing an improved experience for both users and contributors. in this, dspace has been partially successful. dspace has made it easier for the “metadata naive” contributor to create records. and, in some ways, dspace has improved the quality of repository metadata. its automatically generated fields ensure better consistency in those elements and subfields. its mandatory fields guarantee 100 percent fill rates in some elements, and this contributes to an increase in metadata completeness. however, dspace still relies heavily on contributor-generated data to fill most of the dc elements, and it is in these contributor-generated fields that most of the metadata quality issues arise. nonmandatory fields are skipped, leading to incomplete records. data entry errors, a lack of authority control over subject headings, and confusion over element definitions can lead to poor metadata accuracy. a lack of enforced, uniform naming and capitalization conventions leads to metadata inconsistency, as do localized and individual differences in the application of metadata element definitions. while most of the records examined in this small survey could be characterized as “acceptable” to “good,” some are abysmal.

to reduce the inconsistency of the dspace records, the three universities have tried differing approaches. only unm’s required record review by a subject area librarian before upload seems to have made any significant impact on metadata quality. unm has a 100 percent fill rate for subject elements in its records, while uw and osu do not. this is not to say that unm’s process is perfect and that poor records do not get into the system; they do (see appendix b for an example).
but it appears that for now, the intermediary intervention of a librarian during the record-creation process is an improvement over self-archiving, even with education, by contributors.

references and notes

1. association of library collections & technical services, committee on cataloging: description & access, task force on metadata, “final report,” june 16, 2000, http://www.libraries.psu.edu/tas/jca/ccda/tf-meta6.html (accessed mar. 10, 2007).
2. a voluntary (and therefore less-than-complete) list of current dspace users can be found at http://www.dspace.org/index.php?option=com_content&task=view&id=596&itemid=180. further specific information about dspace, including technical specifications, training materials, licensing, and a user wiki, can be found at http://www.dspace.org/index.php?option=com_content&task=blogcategory&id=44&itemid=125.
3. jung-ran park, “metadata quality in digital repositories: a survey of the current state of the art,” cataloging & classification quarterly 47, no. 3 (2009): 213–28.
4. sarah currier et al., “quality assurance for digital learning object repositories: issues for the metadata creation process,” alt-j: research in learning technology 12, no. 1 (2004): 5–20.
5. jung-ran park and eric childress, “dc metadata semantics: an analysis of the perspectives of informational professionals,” journal of information science 20, no. 10 (2009): 1–13.
6. ibid.
7. for a fuller discussion of the collection’s problems and challenges in using both dspace and dc, see kathleen forsythe et al., university of washington early buddhist manuscripts project in dspace (paper presented at dc-2003, seattle, wash., sept. 28–oct. 2, 2003), http://dc2003.ischool.washington.edu/archive-03/03forsythe.pdf (accessed mar. 10, 2007).

46 information technology and libraries | march 2010

appendix a.
a list of the most commonly used qualifiers in each repository

university of new mexico
dc.date.issued (20)
dc.date.accessioned (20)
dc.date.available (20)
dc.format.mimetype (20)
dc.format.extent (20)
dc.identifier.uri (20)
dc.contributor.author (15)
dc.description.abstract (12)
dc.identifier.citation (6)
dc.description.sponsorship (4)
dc.subject.mesh (2)
dc.contributor.other (2)
dc.description.sponsor (1)
dc.date.created (1)
dc.relation.isbasedon (1)
dc.relation.ispartof (1)
dc.coverage.temporal (1)
dc.coverage.spatial (1)
dc.contributor.other (1)

university of washington
dc.date.accessioned (20)
dc.date.available (20)
dc.date.issued (20)
dc.format.mimetype (20)
dc.format.extent (20)
dc.identifier.uri (20)
dc.contributor.author (18)
dc.description.abstract (15)
dc.identifier.citation (4)
dc.identifier.issn (4)
dc.description.sponsorship (1)
dc.contributor.corporateauthor (1)
dc.contributor.illustrator (1)
dc.relation.ispartof (1)

ohio state university
dc.date.issued (20)
dc.date.available (20)
dc.date.accessioned (20)
dc.format.mimetype (20)
dc.format.extent (20)
dc.identifier.uri (20)
dc.description.abstract (8)
dc.identifier.citation (4)
dc.subject.lcsh (4)
dc.relation.ispartof (4)
dc.description.sponsorship (3)
dc.identifier.other (2)
dc.contributor.editor (2)
dc.contribtor.advisor (1)
dc.identifier.issn (1)
dc.description.duration (1)
dc.relation.isformatof (1)
dc.description.statementofresponsibility (1)
dc.description.tableofcontents (1)

appendix b. sample record

dc.identifier.uri http://hdl.handle.net/1928/3571
dc.description.abstract president schmidly’s charge for the creation of a north golf course community advisory board.
dc.format.extent 17301 bytes
dc.format.mimetype application/pdf
dc.language.iso en_us
dc.subject president
dc.subject schmidly
dc.subject north
dc.subject golf
dc.subject course
dc.subject community
dc.subject advisory
dc.subject board
dc.subject charge
dc.title community_advisory_board_charge
dc.type other

20 information technology and libraries | june 2008

an assessment of student satisfaction with a circulating laptop service

louise feldmann, lindsey wess, and tom moothart

since may of 2000, colorado state university’s (csu) morgan library has provided a laptop computer lending service. in five years the service had expanded from 20 to 172 laptops. although the service was deemed a success, users complained about slow laptop startups, lost data, and lost wireless connections. in the fall of 2005, the program was formally assessed using a customer satisfaction survey. this paper discusses the results of the survey and changes made to the service based on user feedback.

colorado state university (csu) is a land-grant institution located in fort collins, colorado. the csu libraries consist of the morgan library, the main library on the central campus; the veterinary teaching branch hospital library at the veterinary hospital campus; and the atmospheric branch library at the foothills campus. in 1997, morgan library completed a major renovation and expansion which provided a designated space for public desktop computers in an information commons environment. the library called this space the electronic information center (eic). due to the popularity of the eic, and with the intent of expanding computer access without expanding the computer lab, library staff began to explore the implementation of a laptop checkout service in 2000. library staff used heather lyle’s (1999) article “circulating laptop computers at west virginia university” as a guide in planning the service.
development funds were used to purchase twenty laptop computers, and the 3com corporation donated fifteen wireless network access points. the laptops were to be used in morgan library on a wireless network maintained by the library technology services department. these computers were to be circulated from the loan desk, the same desk used to check out books. although the building is open to the public, use of the laptops was limited to university students and staff and for library in-house use only. all the public desktop computers and laptops use microsoft windows and microsoft office. maintaining the security of the libraries’ network and students’ personal data in a wireless environment was paramount. to maintain a secure computing environment and present a standardized computing experience in the library, an application of windows xp group policies was used. currently, the laptop software is updated at least every semester using symantec ghost. ghost copies a standardized image to every laptop even when the library owns a variety of computer models from the same manufacturer. additionally, due to concerns over wireless computer security, morgan library implemented cisco’s virtual private network (vpn) in 2004. the laptop service was launched in may 2000. more than 22,000 laptop transactions occurred in the initial year. since its inception, the use of the morgan library laptop service and the number of laptops available for checkout has steadily grown. using student technology funds, the service had grown to 172 laptops and ten presentation kits consisting of a laptop, projector, and a portable screen. circulation during the fall 2005 semester totaled 30,626 laptops and 102 presentation kits. in fiscal year 2005, 66,552 laptops and presentation kits were checked out. based on the high circulation statistics and anecdotal evidence, the service appeared to be successful. 
although morgan library replaced laptops every three years and upgraded the wireless network, laptop support staff noted that users complained of slow laptop startups, lost data, and lost wireless connections. the researchers also noted that large numbers of users queued at the circulation desk at 5:00 p.m. even though large numbers of desktop computers were available in the eic. a customer service satisfaction survey was developed to assess the service and test library staff’s assumptions about the service. csu had a student population of 25,616 students at the time of the survey.

louise feldmann (louise.feldmann@colostate.edu) is the business and economics librarian at colorado state university libraries. she serves as the college liaison librarian to the college of business. lindsey wess (lindsey.wess@colostate.edu) coordinates assistive technology services and manages the information desk and the electronic information center at colorado state university libraries. tom moothart (tmoothar@library.colostate.edu) is the coordinator of on-site services at colorado state university libraries.

literature review

much of the published literature discussing laptop services focuses on the implementation of laptop lending programs and was published from 2001 to 2003, when many libraries were beginning this service (allmang 2003; block 2001; dugan 2001; myers 2001; oddy 2002; vaughan and burnes 2002; williams 2003). these articles primarily address how to handle start-up technological, staffing, and maintenance issues. they have minimal discussion of the service post-implementation. researchers who have surveyed users of university laptop lending services include direnzo (2002), lyle (1999), jordy (1998), block (2001), oddy (2002), and monash university’s caulfield library (2004). direnzo, from the university of akron, only briefly discusses a survey conducted there, with some information about additional software added as a result of user comments. lyle from west virginia university discusses the percentage of respondents to particular questions such as what applications were used, problems encountered, and overall satisfaction with the service. jordy’s report provides in-depth analysis of the survey results from the university of north carolina at chapel hill, but the focus of his survey is on the laptop service’s impact on library employee work flow. monash university’s caulfield library survey focuses on wireless access and awareness of the program by patrons. other survey results found on university library web sites include southern new hampshire university library (west 2005) and murray state university library (2002). additionally, the monmouth university library web site (2003) provides discussion and written analysis of a survey they conducted prior to implementation of their service, a survey which was used to gather information and assess patron needs in order to aid in the construction and planning of their service.

from the survey results discussed in the literature and posted on web sites, overall comments from users are very consistent with one another. most users indicate that they use a loaned laptop computer rather than a desktop computer for privacy and portability (lyle 1999; oddy 2002; west 2005). in addition, the responses from patrons are overwhelmingly positive and users appreciated having the service made available (lyle 1999; jordy 1998; west 2005). both west virginia university and the university of north carolina at chapel hill surveys found that 98 percent of respondents would check out a laptop again (lyle 1999; jordy 1998). southern new hampshire university’s survey indicated that 88 percent of those responding would check one out again (west 2005).
many respondents stated that a primary drawback of using the laptops was the slowness of connectivity (lyle 1999; monash 2004; murray state 2002). the primary use of the laptops, reported in the surveys, was microsoft word (lyle 1999; jordy 1998; oddy 2002).

there is a lack of published literature regarding laptop lending customer satisfaction surveys and analysis. this could be due to the relative newness of many programs, the lack of university libraries that provide laptops, or the reliance on circulation statistics solely to assess the program. articles that discuss circulation and usage statistics as an assessment indicator to judge the popularity of their programs include direnzo (2002), dugan (2001), and vaughan and burnes (2002). based on high circulation statistics and positive anecdotal evidence, it may appear that library users are pleased with laptop programs, and perhaps there has been a hesitation to survey users on a program that is perceived by those in the library as successful.

results

with the strong emphasis on assessment at colorado state university, it was decided to formally survey laptop users on their satisfaction with the program. the survey was distributed by the access services staff when the laptops were checked out from october 28, 2005, to november 28, 2005. this was a voluntary survey and the respondents were asked to complete one survey. users returned 173 completed surveys. undergraduates are the predominant audience for the laptop service; of the 173 returned surveys, 160 identified themselves as undergraduates. as shown in table 1, the responses indicated that the library has a core of regular laptop users, with 33 percent using the laptops at least daily and 82 percent using the laptops at least weekly. only 3 percent indicated that they were using a laptop for the first time. many laptop users also utilized the eic with 67 percent responding that they use the information commons at least weekly (see table 2).
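the cumulative figures quoted above (33 percent, 82 percent, and 67 percent) are sums of the response shares reported in tables 1 and 2. a short sketch of that arithmetic follows; the percentages are taken from the two tables, while the variable and function names are illustrative assumptions.

```python
# Response shares (percent) as reported in table 1 (library laptop use)
# and table 2 (library PC use).
laptop_use = {
    "more than once a day": 3,
    "daily": 30,
    "weekly": 49,
    "monthly": 15,
    "my first time": 3,
}
pc_use = {
    "more than once a day": 3,
    "daily": 20,
    "weekly": 44,
    "monthly": 20,
    "never": 13,
}


def share_at_least(shares, levels):
    """Sum the shares for every listed frequency level."""
    return sum(shares[level] for level in levels)


daily_or_more = ["more than once a day", "daily"]
weekly_or_more = daily_or_more + ["weekly"]

laptop_daily = share_at_least(laptop_use, daily_or_more)    # 3 + 30
laptop_weekly = share_at_least(laptop_use, weekly_or_more)  # 3 + 30 + 49
pc_weekly = share_at_least(pc_use, weekly_or_more)          # 3 + 20 + 44

print(laptop_daily, laptop_weekly, pc_weekly)
```

the three sums come to 33, 82, and 67, matching the figures in the text.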
table 1. how often do you use a library laptop?

frequency               percentage
more than once a day    3%
daily                   30%
weekly                  49%
monthly                 15%
my first time           3%
n=172

table 2. how often do you use a library pc?

frequency               percentage
more than once a day    3%
daily                   20%
weekly                  44%
monthly                 20%
never                   13%
n=169

the laptops were initially purchased with the intent that they would be used to support student team projects. presentation kits with a laptop, projector, and portable screen were an extension of this idea and were also made available for checkout. surprisingly, only 15 percent of the respondents noted that they were using the laptop with a group.

during evenings, it was observed by staff that students were regularly queuing and waiting for a laptop even though pcs were available in the library computer lab. figure 1 shows hourly use statistics for the desktop and laptop public computers. the usage of the desktop computer drops in the late afternoon, just as the use of the laptop computer increases. students were asked why they chose a laptop rather than a library pc and were allowed to choose from multiple answers. as can be seen in table 3, most students noted the advantages of portability and privacy. five respondents wrote in the “other” category that they were able to work better in quieter areas, and ten mentioned that the computer lab workspace is limited. the dense use of space in the library computer lab has been noted by morgan library staff and students. the desktop surrounding each library pc only provides about three feet of workspace. one respondent explained the choice of laptop over pc was because “i can take it to a table and spread out my notes vs. on a library pc.” for many users, the desktops are too crowded to spread research material, and the eic is too noisy for contemplative thought. as can be noted from the use statistics, the public laptop program has been a very popular library service.
prior to the survey, the perception of the morgan library staff was that students were waiting in the evening for extended periods of time for a laptop. when the library expanded the laptop pool from 20 in 2000 to 172 in 2005, it had seemingly no effect on reducing the number of students waiting to use them. as can be seen in table 4, when asked how long they had waited for a laptop, 74 percent of the students said they had access to a laptop immediately, and 15 percent waited less than a minute. the survey was administered during the second busiest time of the year for the library, the month before thanksgiving break. in the open comments, one respondent stated that it was possible to wait forty-five minutes to an hour for a laptop and another noted that “during finals weeks it is almost impossible to get one.” even with the limited waiting time recorded by the respondents, when asked how the library could improve the laptop service, many respondents requested that more laptops be purchased to decrease the wait. the library is struggling to determine the appropriate number of laptops to have available during peak use periods to reduce or eliminate wait times.

figure 1. computer use statistics for may 1, 2006. (chart not reproduced: percentage, 0% to 100%, by time of day from 7:30 am to 11:30 pm, for desktop computers and checkout laptops.)

table 3. why did you choose to use a laptop rather than a library pc?

response                                                 number
portability                                              41
privacy                                                  12
easier to work with a group                              7
portability and privacy                                  54
portability and easier to work with a group              10
portability, privacy, and easier to work with a group    12
the library laptops are more problematic than the library desktop computers to support. the laptops are more fragile than the desktop computers and have the added complication of connecting to the wireless network. every morning the morgan library’s technology staff retrieves non-functioning laptops; library technicians regularly retrieve lost data due to malfunctioning laptops and unsophisticated computer users. the addition of the virtual private network (vpn) connection to the laptop startup script files has slowed the boot-up to the wireless network. an effort has been made to ameliorate wireless “dead zones,” but users still complain of being dropped from the wireless network. with these problems in mind, users were asked about the technical complications they have experienced with the library laptops. the survey responses in tables 5 and 6 indicate a much lower percentage of users reporting technical problems than was anticipated. the technical staff’s large volume of technical calls may reflect the volume of users rather than systematic problems with the laptop service. surprisingly, 79 percent of the users reported rarely or never returning a non-functioning laptop. in addition, the library technicians have reported that no problems have been found on some of the laptops returned for repair. some of the returned computers may be due to frustration with the slow connection to the wireless network. forty-five percent of respondents reported at least occasionally having problems connecting to the wireless network. from the inception of the laptop program, the library has experienced problems with the wireless technology. from its original fifteen wireless access points to its current twenty-nine, the library has struggled to meet the demand of additional library laptops and users’ personal laptops. many written comments on the surveys complained about the slow connection speed of the wireless network such as, “find a way to make the boot-up process faster. 
i need to wait about five minutes for it to be totally booted and ready to use.” even with the slow connection to the wireless network, 41 percent of students responding to the survey rated their satisfaction with the library’s laptop service as excellent and 49 percent rated their satisfaction as good (see table 7).

table 4. how long did you wait before you were able to check out your laptop?

response                  percentage
i did not wait            74%
less than one minute      15%
one to four minutes       11%
five to ten minutes       2%
more than ten minutes     0%
n=171

table 5. how often have you experienced problems saving files, connecting to the wireless network, or had a laptop that locked up or crashed?

frequency       saving files    wireless connection    locked up or crashed
often           <1%             5%                     <1%
occasionally    8%              40%                    17%
rarely          33%             32%                    35%
never           58%             24%                    49%
n=              165             165                    163

table 6. how often have you returned a library laptop that was not working properly?

frequency       percentage
often           4%
occasionally    18%
rarely          30%
never           49%
n=165

discussion

even with 90 percent of our users rating the laptop service as good or excellent, the survey noted some problems that needed attention. the morgan library laptops seamlessly connect to a wireless network through a login script when the computer is turned on. a new script was written to allow the connection and authentication to the cisco virtual private network (vpn) client. during testing it was found that some laptops took as long as ten minutes to connect to the wireless network, which resulted in numerous survey respondents commenting on our slow wireless network. to help correct this problem, the library’s network staff changed each laptop’s user profile from a mandatory roaming profile to a local profile and simplified the login script. the laptops connected faster to the wireless network with the new script, but they still did not meet the students’ expectations.
in the fall of 2006, the library network staff moved the laptops from vpn to wi-fi protected access (wpa) wireless security, and laptop login time to the wireless network dropped to under two minutes. the number of customer complaints dropped dramatically after implementing wpa. additional access points were purchased to improve connectivity in morgan library’s wireless “dead zones.” in january 2006, the university’s central computing services audited the wireless network after continued wireless connectivity complaints. the audit recommended reconfiguring the access points channel assignments. in many cases it was found that the same channel had been assigned to access points adjacent to each other, ultimately compromising laptop connectivity. the audit also discovered noise interference on the wireless network from a 2.4-ghz cordless phone used by the loan desk staff. the phone was replaced with a 5.8-ghz one, which has resulted in fewer dropped connections near the loan desk. supporting almost 200 laptops has introduced several problems in the library. the morgan library building was not designed to support the use of large numbers of laptops. because it is impractical for the loan desk to charge nearly 200 laptop batteries throughout the day, laptops available for checkout must be connected to electrical outlets. these are seldom near study tables, and students are forced to crawl underneath tables to locate power or stretch adapter cords across aisles. a space plan for the morgan library is being developed that will increase the number of outlets near study tables. in the meantime, 100 power strips were added to tables used heavily by laptop users. the loan desk staff is very efficient at circulating, but has less success at troubleshooting technical problems. when the laptop service was first implemented, large numbers of laptops were not available due to servicing reasons. 
the public laptop downtime was lowered by hiring additional library technology students. a one-day onsite repair service agreement was purchased from the manufacturer which resulted in many equipment repairs being completed within 48 hours. in order to reduce the downtime further, a plan to replace some loan desk student workers with library technology students is being evaluated. the technology students will be able to troubleshoot connectivity and hardware problems with the users when they return the defective computers to the loan desk. if a computer needs additional service, it can be handled immediately, which will allow more laptops for checkout since fewer will be removed for repair. when the laptop service was first envisioned, it was seen as a great service for those working in groups. as can be seen in table 3, very few students are using the laptops in a group setting. in survey written comments, students emphasize that they enjoy the portability and privacy enabled by using a laptop. the morgan library eic is cramped and noisy, with the configuration allowing very little room for students to spread out research materials and notes for writing. the morgan library space plan takes these issues into consideration and recommends reconfiguring the eic to lessen the noise and provide writing space near computers. this is intended to improve the student library experience and encourage students to use the desktop computers during the evenings when lines form for the laptops. in order to decrease the current laptop queue at the loan desk, more laptops will be added. as a result of survey comments requesting apple computers, five mac powerbooks were added to the library’s laptop fleet. in addition, as morgan library adds more checkout laptops and the number of students arriving on campus with wireless laptops increases, the wireless infrastructure will need to be upgraded. upgrading the wireless access points to standard 802.11g has been implemented. 
updating each laptop with a new hard drive image has become problematic as the number of laptops has increased. the wireless network capacity is not large enough for the ghost software to transmit the image to multiple laptops, and so each laptop must be physically attached to the library network. initially, when library technology services attempted imaging many laptops at once, it took six to eight hours and required up to eight staff members. this method of large-scale laptop imaging was so network intensive that it had to be performed when the library was closed to avoid disrupting public internet use. now imaging the laptop fleet is done piecemeal, twenty to thirty laptops at a time, in order to minimize complications with the ghost process and multicasting through the network switches. due to the staff time required, laptop software is not updated as often as the users would like. technological solutions continue to be investigated that will decrease the labor and network intensity of imaging.

table 7. please rate your satisfaction with the laptop service.

response     percentage
excellent    41%
good         49%
neutral      7%
poor         2%
very poor    <1%
n=166

conclusion

the morgan library laptop service was established in 2000 and has been a very popular addition to the library’s services. as an example of its popularity, in fiscal year 2005 the laptops circulated 66,552 times. student government continues to support the use of student technology fees to support and expand the fleet of laptops. this survey was an attempt to assess users’ perceptions of the service and identify areas that need improvement. the survey found that students rarely wait more than a few minutes for a laptop, and in open-ended survey questions, students noted that they waited for computers only during peak use periods.
while relatively few survey respondents experienced technical difficulties with the laptops and wireless network, slow wireless connection time was a concern that students noted in the open comments section of the survey. overall, the students gave the laptop service a very high rating. when asked to suggest improvements to the service, many respondents recommended purchasing more laptops.

the libraries made several changes to improve the laptop service based on survey responses. changes have been made to the login script files, wireless network, and security protocol to speed and stabilize the wireless connection process. additional wireless access points will be added to the building and all access points will be upgraded to the 802.11g standard. in addition, five mac powerbooks have been added to the fleet of windows-based laptops. the library continues to investigate new service models to circulate and maintain the laptops.

works cited

allmang, nancy. 2003. our plan for a wireless loan service. computers in libraries 23, no. 3: 20–25.
block, karla j. 2001. laptops for loan: the experience of a multilibrary project. journal of interlibrary loan, document delivery, and information supply 12, no. 1: 1–12.
direnzo, susan. 2002. a wireless laptop-lending program: the university of akron experience. technical services quarterly 20, no. 2: 1–12.
dugan, robert e. 2001. managing laptops and the wireless network at the mildred f. sawyer library. journal of academic librarianship 27, no. 4: 295–298.
jordy, matthew l. 1998. the impact of user support needs on a large academic workflow as a result of a laptop check-out program. master’s thesis, university of north carolina.
lyle, heather. 1999. circulating laptop computers at west virginia university. information outlook 3, no. 11: 30–32.
myers, penelope. 2001. laptop rental program, temple university libraries. journal of interlibrary loan, document delivery, and information supply 12, no. 1: 35–40.
monash university caulfield library. 2004. laptop users and wireless network survey. www.its.monash.edu.au/staff/networks/wireless/review/caul-lapandnetsurvey.pdf (accessed june 8, 2005).
monmouth university. 2003. testing the wireless waters: a survey of potential users before the implementation of a wireless notebook computer lending program in an academic library. http://bluehawk.monmouth.edu/~hholden/wwl/wireless_survey_results.html (accessed june 8, 2005).
murray state university. 2002. library laptop computer usage survey results. www.murraystate.edu/msml/laptopsurv.htm (accessed june 8, 2005).
oddy, elizabeth carley. 2002. laptops for loan. library and information update 1, no. 4: 54–55.
vaughn, james b., and brett burnes. 2002. bringing them in and checking them out: laptop use in the modern academic library. information technology and libraries 21, no. 2: 52–62.
west, carol. 2005. librarians pleased with results of student survey. southern new hampshire university. www.snhu.edu/3174/asp (accessed june 8, 2005).
williams, joe. 2003. taming the wireless frontier: pdas, tablets, and laptops at home on the range. computers in libraries 23, no. 3: 10–12, 62–64.
an evidence-based review of academic web search engines, 2014-2016: implications for librarians’ practice and research agenda
jody condit fagan
an evidence-based review of academic web search engines, 2014-2016 | fagan | https://doi.org/10.6017/ital.v36i2.9718
abstract
academic web search engines have become central to scholarly research. while the fitness of google scholar for research purposes has been examined repeatedly, microsoft academic and google books have not received much attention. recent studies have much to tell us about google scholar’s coverage of the sciences and its utility for evaluating researcher impact. but other aspects have been understudied, such as coverage of the arts and humanities, books, and non-western, non-english publications.
user research has also tapered off. a small number of articles hint at the opportunity for librarians to become expert advisors concerning scholarly communication made possible or enhanced by these platforms. this article seeks to summarize research concerning google scholar, google books, and microsoft academic from the past three years with a mind to informing practice and setting a research agenda. selected literature from earlier time periods is included to illuminate key findings and to help shape the proposed research agenda, especially in understudied areas.
introduction
recent pew internet surveys indicate an overwhelming majority of american adults see themselves as lifelong learners who like to “gather as much information as [they] can” when they encounter something unfamiliar (horrigan 2016). although significant barriers to access remain, the open access movement and search engine giants have made full text more available than ever.1 the general public may not begin with an academic search engine, but google may direct them to google scholar or google books. within academia, students and faculty rely heavily on academic web search engines (especially google scholar) for research; among academic researchers in high-income areas, academic search engines recently surpassed abstracts & indexes as a starting place for research (inger and gardner 2016, 85, fig. 4). given these trends, academic librarians have a professional obligation to understand the role of academic web search engines as part of the research process.
jody condit fagan (faganjc@jmu.edu) is professor and director of technology, james madison university, harrisonburg, va.
1 khabsa and giles estimate “almost 1 in 4 of web accessible scholarly documents are freely and publicly available” (2014, 5).
two recent events also point to the need for a review of research.
legal decisions in 2016 confirmed google’s right to make copies of books for its index without paying or even obtaining permission from copyright holders, solidifying the company’s opportunity to shape the online experience with respect to books. meanwhile, microsoft rebooted its academic web search engine, now called microsoft academic. at the same time, information scientists, librarians, and other academics conducted research into the performance and utility of academic web search engines. this article seeks to review the last three years of research concerning academic web search engines, make recommendations related to the practice of librarianship, and propose a research agenda.
methodology
a literature review was conducted to find articles, conference presentations, and books about the use or utility of google books, google scholar, and microsoft academic for scholarly use, including comparisons with other search tools. because of the pace of technological change, the focus was on recent studies (2014 through 2016, inclusive).
a search was conducted on “google books” in ebsco’s library and information science and technology abstracts (lista) on december 19, 2016, limited to 2014-2016. of the 46 results found, most were related to legal activity. only four items related to the tool’s use for research. these four titles were entered into google scholar to look for citing references, but no additional relevant citations were found. in the relevant articles found, the literature reviews testified to the general lack of studies of google books as a research tool (abrizah and thelwall 2014; weiss 2016) with a few exceptions concerning early reviews of metadata, scanning, and coverage problems (weiss 2016). a search on “google books” in combination with the terms “evaluation or review or comparison” was also submitted to jmu’s discovery service,2 limited to 2014-2016.
forty-nine items were found and from these, three relevant citations were added; these were also entered into google scholar to look for citing references. however, no additional relevant citations were found. thus, a total of seven citations from 2014-2016 were found with relevant information concerning google books. earlier citations from the articles’ bibliographies were also reviewed when research was based on previous work, and to inform the development of a fuller research agenda.
a search on “microsoft academic” in lista on february 3, 2017, netted fourteen citations from 2014-2016. only seven seemed to focus on evaluation of the tool for research purposes. a search on “microsoft academic” in combination with terms “evaluation or review or comparison” was also submitted to jmu’s discovery service, limited to 2014-2016. eighteen items were found but no additional citations were added, either because they had already been found or were not relevant. the seven titles found in lista were searched in google scholar for citing references; four additional relevant citations were found, plus a paper relevant to google scholar not previously discovered (weideman 2015). thus, a total of eleven citations were found with relevant information for this review concerning microsoft academic. because of this small number, several articles prior to 2014 were included in this review for historical context.
2 jmu’s version of ebsco discovery service contained 453,754,281 items at the time of writing and is carefully vetted to contain items of curricular relevance to the jmu community (fagan and gaines 2016).
information technology and libraries | june 2017
an initial search was performed on “google scholar” in lista on november 19, 2016, limited to 2014-2016. this netted 159 results, of which 24 items were relevant.
a search on “google scholar” in combination with terms “evaluation or review or comparison” was also submitted to jmu’s discovery tool limited to 2014-2016, and eleven relevant citations were added. items older than 2014 that were repeatedly cited or that formed the basis of recent research were retrieved for historical context. finally, relevant articles were submitted to google scholar, which netted an additional 41 relevant citations. altogether, 70 citations were found to articles with relevant information for this review concerning google scholar in 2014-2016. readers interested in literature reviews covering google scholar studies prior to 2014 are directed to gray et al. (2012), erb and sica (2015), and harzing and alakangas (2016b).
findings
google books
google books (https://books.google.com) contains about 30 million books, approaching the library of congress’s 37 million, but far shy of google’s estimate of 130 million books in existence (wu 2015), which google intends to continue indexing (jackson 2010). content in google books includes publisher-supplied, self-published, and author-supplied content (harper 2016) as well as the results of the famous google books library project. started in december 2004 as the “google print” project,3 the project involved over 40 libraries digitizing works from their collections, with google indexing and performing ocr to make them available in google books (weiss 2016; mays 2015).
scholars have noted many errors with google books metadata, including misspellings, inaccurate dates, and inaccurate subject classifications (harper 2016; weiss 2016). google does not release information about the database’s coverage, including which books are indexed or which libraries’ collections are included (abrizah and thelwall 2014). researchers have suggested the database covers mostly u.s. and english-language books (abrizah and thelwall 2014; weiss 2016). the conveniences of google books include limits by the type of book availability (e.g.
free ebooks vs. google e-books), document type, and date. the detail view of a book allows magnification, hyperlinked tables of contents, buying and “find in a library” options, “my library,” and user history (whitmer 2015). google books also offers textbook rental (harper 2016) and limited print-on-demand services for out-of-print books (mays 2015; boumenot 2015).
in april 2016, the supreme court declined to hear an appeal, leaving in place the ruling that affirmed google’s right to make copies for its index without paying or even obtaining permission from copyright holders (authors guild 2016; los angeles times 2016). scanning of library books and “snippet view” was deemed fair use: “the purpose of the copying is highly transformative, the public display of text is limited, and the revelations do not provide a significant market substitute for the protected aspects of the originals” (u.s. court of appeals for the second circuit 2015).
3 https://www.google.com/googlebooks/about/history.html
literature concerning high-level implications of google books suggests the tool is having a profound effect on research and scholarship. the tool has been credited for serving as “a huge laboratory” for indexing, interpretation, working with document image repositories, and other activities (jones 2010). at the same time, the academic community has expressed concerns about google books’ effects on social justice and how its full-text search capability may change the very nature of discovery (hoffmann 2014; hoffmann 2016; szpiech 2014). one study found that books are far more prevalently cited in wikipedia than are research articles (kousha and thelwall 2017). yet investigations of google books’ coverage and utility as a research tool seem to be sorely lacking. as weiss noted, “no critical studies seem to exist on the effect that google books might have on the contemporary reference experience” (weiss 2016, 293).
furthermore, no information was found concerning how many users are taking advantage of google books; the tool was noticeably absent from surveys such as inger and gardner's (2016) and from research centers such as the pew internet research project.
in a largely descriptive review, harper (2016) bemoaned google books’ lack of integration with link resolvers and discovery tools, and judged it lacking in relevant material for the health sciences, because so much of the content is older. she also noted the majority of books scanned are in english, which could skew scholarship. the uneven representation of non-english languages in google books was also lamented by weiss, who noted an “underrepresentation of spanish and overestimation of french and german (or even japanese for that matter)” especially as compared to the number of spanish speakers in the united states (weiss 2016, 286-306).
whitmer (2015) and mays (2015) provided practical information about how google books can be used as a reference tool. whitmer presented major google books features and challenged librarians to teach google books during library instruction. mays conducted a cursory search on the 1871 chicago fire and described the primary documents she retrieved as “pure gold,” including records of city council meetings, notes from insurance companies, reports from relief societies, church sermons on the fire, and personal memoirs (mays 2015, 22). mays also described google books as a godsend to genealogists for finding local records (e.g. police departments, labor unions, public schools). in her experience, the geographic regions surrounding the forty participating google books library project libraries are “better represented than other areas” (mays 2015, 25). mays concluded, “its poor indexing and search capabilities are overshadowed by the ease of its fulltext search capabilities and the wonderful ephemera that enriches its holdings far beyond mere ‘books’” (mays 2015, 26).
abrizah and thelwall (2014) investigated whether google books and google scholar provided “good impact data for books published in non-western countries.” they used a comprehensive list of arts, humanities, and social sciences books (n=1,357) from the five main university presses in malaysia 1961-2013. they found only 23% of the books were cited in google books4 and 37% in google scholar (2502). the overlap was small: only 15% were cited in both google scholar and google books. english-language books were more likely to be cited in google books; 40% of english language books were cited versus 16% malay. examining the top 20 books cited in google books, the researchers found them to be mostly written in english (95% in google books vs 29% in the sample), and published by university of malaysia press (60% in google books vs 26% in the sample) (2505). the authors concluded that due to the low overlap between google scholar and google books, searching both engines was required to find the most citations to academic books.
kousha and thelwall (2015; 2011) compared google books with thomson reuters book citation index (bkci) to examine its suitability for scholarly impact assessment and found google books to have a clear advantage over bkci in the total number of citations found within the arts and humanities, but not for the social sciences or sciences. they advised combining results from bkci with google books when performing research impact assessment for the arts and humanities and social sciences, but not using google books for the sciences, “because of the lower regard for books among scientists and the lower proportion of google books citations compared to bkci citations for science and medicine” (kousha and thelwall 2015, 317).
microsoft academic
microsoft academic (https://academic.microsoft.com) is an entirely new software product as of 2016.
therefore, the studies cited prior to 2016 refer to entirely different search engines than the one currently available. however, a historical account of the tool and reviewers’ opinions was deemed helpful for informing a fuller picture of academic web search engines and pointing to a research agenda. microsoft academic was born as windows live academic in 2006 (carlson 2006), was renamed live search academic after a first year of struggle (jacsó 2008), and was scrapped two years later after the company recognized it did not have sufficient development support in the united states (jacsó 2011). microsoft asia research group launched a beta tool called libra in 2009, which redirected to the “microsoft academic search” service by 2011. early reviews of the 2011 edition of microsoft academic search were promising, although the tool clearly lacked the quantity of data searched by google scholar (jacsó 2011; hands 2012).
there were a few studies involving microsoft academic search in 2014. ortega and aguillo (2014) compared microsoft academic search and google scholar citations for research evaluation and concluded “microsoft academic search is better for disciplinary studies than for analyses at institutional and individual levels. on the other hand, google scholar citations is a good tool for individual assessment because it draws on a wider variety of documents and citations” (1155).
4 google books does not support citation searching; the researchers searched for the book title to manually find citations to a book.
as part of a comparative investigation of an automatic method for citation snowballing using microsoft academic search, choong et al. (2014) manually searched for a sample of 949 citations to journal or conference articles cited from 20 systematic reviews.
they found microsoft academic search contained 78% of the cited articles and noted its utility for testing automated methods due to its free api and no blocks to automated access. the researchers also tested their method against google scholar, but noted “computer-access restrictions prevented a robust comparison” (n.p.). also in 2014, orduna-malea et al. (2014) attempted a longitudinal study of disciplines, journals, and organizations in microsoft academic search only to find the database had not been updated since 2013. furthermore, they found the indexing to be incomplete and still in process, meaning microsoft academic search’s presentation of information about any particular publication, organization, or author was distorted.
despite this finding, microsoft academic search was included in two studies of scholar profiles. ortega (2015) compared scholar profiles across google scholar, microsoft academic search, researchgate, academia.edu, and mendeley, and found little overlap across the sites. ortega also found social and usage indicators did not consistently correlate with bibliometric indicators, except on the researchgate platform. social and usage indicators were “influenced by their own social sites,” while bibliometric indicators seemed more stable across all services (13). ward et al. (2015) still included microsoft academic search in their discussion of scholarly profiles as part of the social media network, noting microsoft academic search was painfully time-consuming to work with in terms of consolidating data, correcting items, and adding missing items.
in september 2016, hug et al. demonstrated the utility of the new microsoft academic api by conducting a comparative evaluation of normalized data from microsoft academic and scopus (hug, ochsner, and braendle 2016). they noted microsoft academic has “grown massively from 83 million publication records in 2015 to 140 million in 2016” (10).
the microsoft academic api offers rich, structured metadata with the exception of document type. they found all attributes containing text were normalized and that identifiers were available for all entities, including references, supporting bibliometricians’ needs for data retrieval, handling, and processing. in addition to the lack of document type, the researchers also found the “fields of study” to be too granular and dynamic, and their hierarchies incoherent. they also desired the ability to use the doi to build api requests. nevertheless, the advantages of microsoft academic’s metadata and api retrieval suggested to hug et al. that microsoft academic was superior to google scholar for calculating research impact indicators and bibliometrics in general.
in october 2016, harzing and alakangas compared publication and citation coverage of the new microsoft academic with google scholar, scopus, and web of science using a sample of 145 academics at the university of melbourne (harzing and alakangas 2016a) including observations from 20-40 faculty each in the humanities, social sciences, engineering, sciences, and life sciences. they discovered microsoft academic had improved substantially since their previous study (harzing 2016b), growing 9.6% for a comparison sample, versus growth of 1.4%, 2%, and 1.7% in google scholar, scopus, and web of science, respectively (n.p.). the researchers noted a few problems with data quality, “although the microsoft academic team have indicated they are working on a resolution” (n.p.). on average, the researchers found that microsoft academic found 59% as many citations as google scholar, 97% as many citations as scopus, and 108% as many citations as web of science. google scholar had the top counts for each disciplinary area, followed by scopus except in the social sciences and humanities, where microsoft academic ranked second.
the researchers explained that microsoft academic “only includes citation records if it can validate both citing and cited papers as credible,” as established through a machine-learning-based system, and discussed an emerging metric of “estimated citation count” also provided by microsoft academic. the researchers concluded that microsoft academic promises to be “an excellent alternative for citation analysis” and suggested microsoft should work to improve coverage of books and grey literature.
google scholar
google scholar was released in beta form in november 2004, and was expanded to include judicial case law in 2009. while google scholar has received much attention in academia, it seems to be regarded by google as a niche product: in 2011 google removed scholar from the list of top services and list of “more” services, relegating it to the “even more” list. in 2014, the scholar team consisted of just nine people (levy 2014). describing google scholar in an introductory manner is not helped by google’s vague documentation, which simply says it “includes scholarly articles from a wide variety of sources in all fields of research, all languages, all countries, and over all time periods.”5 the “wide variety of sources” includes “journal papers, conference papers, technical reports, or their drafts, dissertations, pre-prints, post-prints, or abstracts,” as well as court opinions and patents, but not “news or magazine articles, book reviews, and editorials.” books and dissertations uploaded to google book search are “automatically” included in scholar.
google says abstracts are key, noting “sites that show login pages, error pages, or bare bibliographic data without abstracts will not be considered for inclusion and may be removed from google scholar.”
studies of google scholar can be divided into three major categories of focus: investigating the coverage of google scholar; the use and utility of google scholar as part of the research process; and google scholar’s utility for bibliographic measurement, including evaluating the productivity of individual researchers and the impact of journals. there is some overlap across these categories, because studies of google scholar seem to involve three questions: 1) what is being searched? 2) how does the search function? and 3) to what extent can the user usefully accomplish her task?
the coverage of google scholar
scholars want to know what “scholarship” is covered by google scholar, but the documentation merely states that it indexes “papers, not journals”6 and challenges researchers to investigate google scholar’s coverage empirically despite google scholar’s notoriously challenging technical limitations. while some limitations of google scholar have been corrected over the years, longstanding logistical hurdles involved with studying google scholar’s coverage have been well-documented for over a decade (shultz 2007; bonato 2016; haddaway et al. 2015; levay et al. 2016), and include:
• search queries are limited to 256 characters
• not being able to retrieve more than 1,000 results
• not being able to display more than 20 results per page
• not being able to download batches of results (e.g. to load into citation management software)
• duplicate citations (beyond the multiple article “versions”), requiring manual screening
• retrieving different results with advanced and basic searches
• no designation of the format of items (e.g. conference papers)
• minimal sort options for results
• basic boolean operators only7
• illogical interpretation of boolean operators: the queries “esophagus or oesophagus” and “oesophagus or esophagus” return different numbers of results (boeker, vach, and motschall 2013)
• non-disclosure of the algorithm by which search results are sorted.
5 https://scholar.google.com/intl/en/scholar/inclusion.html
6 https://www.google.com/intl/en/scholar/help.html#coverage
additionally, one study reported experiencing an automated block to the researcher’s ip address after the export of approximately 180 citations or 180 individual searches (haddaway et al. 2015, 14). furthermore, the research excellence framework was unable to use google scholar to assess the quality of research in uk higher education institutions, because of researchers’ inability to agree with google on a “suitable process for bulk access to their citation information, due to arrangements that google scholar have in place with publishers” (research excellence framework 2013, 1562). such barriers can limit what can be studied and also cost researchers significant time in terms of downloading (prins et al. 2016) and cleaning citations (levay et al. 2016).
despite these hurdles, research activity analyzing the coverage of google scholar has continued in the past two years, often building off previous studies. this section will first discuss google scholar’s size and ranking, followed by its coverage of articles and citations, then its coverage of books, grey literature, and open access and institutional repositories.
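the practical effect of these limits on a coverage study can be sketched in a few lines of python. the sketch is illustrative only: it treats the 256-character, 1,000-result, and 20-per-page figures reported in the studies above as assumptions (google does not document them as guarantees), and the function names and the title-normalization rule are hypothetical choices, not a method used by any of the cited authors. it does not contact google scholar; it only checks a query's length, estimates how many result pages must be paged through by hand, and removes duplicate titles from citations a researcher has already collected.

```python
import math
import re

# Limits as reported in the literature reviewed above (assumptions, not a
# published contract from Google).
MAX_QUERY_CHARS = 256
MAX_RESULTS = 1000
RESULTS_PER_PAGE = 20


def check_query(query: str) -> bool:
    """Return True if the query fits within the reported character limit."""
    return len(query) <= MAX_QUERY_CHARS


def pages_needed(total_hits: int) -> int:
    """Number of result pages that can actually be viewed for a search."""
    retrievable = min(total_hits, MAX_RESULTS)
    return math.ceil(retrievable / RESULTS_PER_PAGE)


def normalize_title(title: str) -> str:
    """Lowercase a title and collapse punctuation and whitespace.

    Hypothetical matching rule chosen for illustration only.
    """
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def deduplicate(titles: list[str]) -> list[str]:
    """Keep the first occurrence of each title after normalization."""
    seen = set()
    unique = []
    for title in titles:
        key = normalize_title(title)
        if key not in seen:
            seen.add(key)
            unique.append(title)
    return unique
```

under these assumptions, a search reporting 5,000 hits would still expose only 1,000 records, or 50 pages of 20 results each, which is one reason the studies above describe downloading and cleaning citations as time-consuming.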
google scholar size and ranking
in a 2014 study, khabsa and giles estimated there were at least 114 million english-language scholarly documents on the web, of which google scholar had “nearly 100 million.” another study by orduna-malea, ayllón, martín-martín, and lópez-cózar (2015) estimated that the total number of documents indexed by google scholar, without any language restriction, was between 160 and 165 million. by comparison, in 2016 the author’s discovery tool contained about 168 million items in academic journals, conference materials, dissertations, and reviews.8
7 e.g., no nesting of logical subexpressions deeper than one level (boeker, vach, and motschall 2013) and no truncation operators.
google scholar’s presence in the information marketplace has influenced vendors to increase the discoverability of their content, including pushing for the display of abstracts and/or the first page of articles (levy 2014). proquest and gale indexes were added to google scholar in 2015 (quint 2016). martín-martín et al. (2016b) noted that google scholar’s agreements with big publishers come at a price: “the impossibility of offering an api,” which would support bibliometricians’ research (54).
google scholar’s results ranking “aims to rank documents the way researchers do, weighing the full text of each document, where it was published, who it was written by, as well as how often and how recently it has been cited in other scholarly literature.”9 martín-martín and his colleagues (2017, 159) conducted a large, longitudinal study of null query results in google scholar and found a strong correlation between result list ranking and times cited.
the influence of citations is so strong that when the researchers performed the same search process four months later, 14.7% of documents were missing in the second sample, leading them to conclude that even a change of one or two citations could lead to a document being excluded from or included in the top 1,000 results (157). using citation counts as a major part of the ranking algorithm has been hypothesized to produce the “matthew effect,” where “work that is already influential becomes even more widely known by virtue of being the first hit from a google scholar search, whereas possibly meritorious but obscure academic work is buried at the bottom” (antell et al. 2013, 281). google scholar has been shown to heavily bias its ranking toward english-language publications even when there are highly cited non-english publications in the result set, although selection of interface language may influence the ranking. martín-martín and his colleagues noted that google scholar seems to use the domain of the document’s hosting web site as a proxy for language, meaning that “some documents written in english but with their primary version hosted in non-anglophone countries’ web domains do appear in lower positions in spite of receiving a large number of citations” (martín-martín et al. 2017, 161). this effect is shown dramatically in figure 3 of their paper.
google scholar coverage: articles and citations
the coverage of articles, journals, and citations by google scholar has been commonly examined by using brute force methods to retrieve a sample of items from google scholar and possibly one or more of its competitors. (studies discussed in this section are listed in table 1.) the goal is usually to determine how well google scholar’s database compares to traditional research databases, usually in a specific field.
core methodology involves importing citations into software such as publish or perish (harzing 2016a), cleaning the data, then performing statistical tests, expert review, or both. haddaway (2015) and moed et al. (2016) have written articles specifically discussing methodological aspects.
8 the discovery tool does not contain all available metadata but has been carefully vetted (fagan and gaines 2016).
9 https://www.google.com/intl/en/scholar/about.html
recent studies repeatedly find that google scholar’s coverage meets or exceeds that of other search tools, regardless of what the target samples comprise, whether journals, articles, or citations (karlsson 2014; harzing 2014; harzing 2016b; harzing and alakangas 2016b; moed, bar-ilan, and halevi 2016; prins et al. 2016; wildgaard 2015; ciccone and vickery 2015). in only three studies did google scholar find fewer items, and the differences were minimal.10
science disciplines were the most studied in google scholar, including agriculture, astronomy, chemistry, computer science, ecology, environmental science, fisheries, geosciences, mathematics, medicine, molecular biology, oceanography, physics, and public health. social sciences studied include education (prins et al. 2016), economics (harzing 2014), geography (ştirbu et al. 2015, 322-329), information science (winter, zadpoor, and dodou 2014; harzing 2016b), and psychology (pitol and de groote 2014).
studies related to the arts or humanities 2014-2016 included an analysis of open access journals in music (testa 2016) and a comparison between google scholar and web of science for research evaluation within education, pedagogical sciences, and anthropology11 (prins et al. 2016). wildgaard (2015) and bornmann et al.
(2016) included samples of humanities scholars as part of bibliometric studies, but did not discuss disciplinary aspects related to coverage. prior to 2014, the only study found related to the arts and humanities compared google scholar with historical abstracts (kirkwood jr. and kirkwood 2011). google scholar’s coverage has been growing over time (meier and conkling 2008; harzing 2014; winter, zadpoor, and dodou 2014; bartol and mackiewicz-talarczyk 2015, 531; orduña-malea and delgado lópez-cózar 2014) with recent increases in older articles (winter, zadpoor, and dodou 2014; harzing and alakangas 2016b), leading some to question whether this supports the documented trend of increased citation of older literature (martín-martín et al. 2016c; varshney 2012). winter et al. noted that in 2005 web of science yielded more citations than google scholar for about two-thirds of their sample, but for the same sample in 2013, google scholar found more citations than web of science, with only 6.8% of citations not retrieved by google scholar (winter, zadpoor, and dodou 2014, 1560). the unique citations of web of science were “typically documents before the digital age and conference proceedings not available online” (winter, zadpoor, and dodou 2014, 1560). harzing and alakangas’s (2016b) large-scale longitudinal comparison of google scholar, scopus, and web of science suggested that google scholar’s retroactive expansion has stabilized and now all three databases are growing at similar rates. 10 for example, bramer, giustini, and kramer (2016a) found slightly more of their 4,795 references from systematic reviews in embase (97.5%) than in google scholar (97.2%). in testa (2016), the music database rilm indexed two more of the 84 oa journals than google scholar (which indexed at least one article from 93% of the journals). 
finally, in a study using citations to the most-cited article of all time as a sample, web of science found more citations than did google scholar (winter, zadpoor, and dodou 2014).
11 prins et al. classified anthropology as part of the humanities.

google scholar also seems to cover both the oldest and the most recent publications. unlike traditional abstracts and indexes, google scholar is not limited by starting year, so as publishers post tables of contents of their earliest journals online, google scholar discovers those sources (antell et al. 2013, 281). trapp reported the number of citations to a highly-cited physics paper after the first 11 days of publication to be 67 in web of science, 72 in scopus, and 462 in google scholar (2016, 4). in a study of 800 citations to nobelists in multiple fields, harzing found that “google scholar could effectively be 9–12 months ahead of web of science in terms of publication and citation coverage” (2013, 1073). an increasing proportion of journal articles in google scholar are freely available in full text. a large-scale, longitudinal study of highly-cited articles 1950-2013 found 40% of article citations in the sample were freely available in full text (martín-martín et al. 2014). another large-sample study found 61% of articles in their sample from 2004–2014 could be freely accessed (jamali and nabavi 2015). in both studies, nih.gov and researchgate were the top two full-text providers. google scholar’s coverage of major publisher content varies; having some coverage of a publisher does not imply all articles or journals from that publisher are covered. in a sample of 222 citations compared across google scholar, scopus, and web of science, google scholar contained all of the springer titles, as many elsevier titles as scopus, and the most articles by wolters kluwer and john wiley.
however, among the three databases, google scholar contained the fewest articles by bmj and nature (rothfus et al. 2016).

study | sample | results
(bartol and mackiewicz-talarczyk 2015) | documents retrieved in response to searches on crops and fibers in article titles, 1994-2013 (samples varied by crop) | google scholar returned more documents for each crop. for example, “hemp” retrieved 644 results in google scholar, 493 in scopus, and 318 in web of science; google scholar demonstrated higher yearly growth of records over time.
(bramer, giustini, and kramer 2016b) | references from a pool of systematic reviewer searches in medicine (n=4795) | google scholar found 97.2%, embase 97.5%, and medline 92.3% of all references; when using search strategies, embase retrieved 81.6%, medline 72.6%, and google scholar 72.8%.
(ciccone and vickery 2015) | based on 183 user searches randomly selected from ncsu libraries’ 2013 summon search logs (n=137) | no significant difference between the performance of google scholar, summon, and eds for known-item searches; “google scholar outperformed both discovery services for topical searches.”
(harzing 2014) | publications and citation metrics for 20 nobelists in chemistry, economics, medicine, physics, 2012-2013 (samples varied) | google scholar coverage is now “increasing at a stable rate” and provides “comprehensive coverage across a wide set of disciplines for articles published in the last four decades” (575).
(harzing 2016b) | citations from one researcher (n=126) | microsoft academic found all books and journal articles covered by google scholar; google scholar found 35 additional publications including book chapters, white papers, and conference papers.
(harzing and alakangas 2016a) | samples from (harzing and alakangas 2016b, 802) (samples varied by faculty) | google scholar provided higher “true” citation counts than microsoft academic but microsoft academic “estimated” citation counts were 12% higher than google scholar for life sciences and equivalent for the sciences.
(harzing and alakangas 2016b) | citations of the works of 145 faculty among 37 scholarly disciplines at the university of melbourne (samples varied by faculty) | for the top faculty member, google scholar had 519 total papers (compared with 309 in both web of science and scopus); google scholar had 16,507 citations (compared with 11,287 in web of science and 11,740 in scopus).
(hilbert et al. 2015) | documents published by 76 information scientists in german-speaking countries (n=1,017) | google scholar covered 63%, scopus 31%, bibsonomy 24%, mendeley 19%, web of science 15%, citeulike 8%.
(jamali and nabavi 2015) | items published between 2004 and 2014 (n=8,310) | 61% of articles were freely available; of these, 81% were publisher versions and 14% were pre-prints; researchgate was the top full-text source netting 10.5% of full-text sources, followed by ncbi.nlm.nih.gov (6.5%).
(karlsson 2014) | journals from ten different fields (n=30) | google scholar retrieved documents from all the selected journals; summon only retrieved documents from 14 out of 30 journals.
(lee et al. 2015) | journal articles housed in florida state university’s institutional repository (n=170) | metadata found in google for 46% of items and in google scholar for 75% of items; google scholar found 78% of available full text. google scholar found full text for six items with no full text in the ir.
(martín-martín et al. 2014) | items highly cited by google scholar (n=64,000) | 40% could be freely accessed using google scholar; nih.gov and researchgate were the top two full-text providers.
(moed, bar-ilan, and halevi 2016) | citations to 36 highly cited articles in 12 scientific-scholarly english-language journals (n=about 7,000) | 47% of sources were in both google scholar and scopus; 47% of sources were in google scholar only; 6% of sources were in scopus only; of the unique google scholar citations, sources were most often from google books, springer, ssrn, researchgate, acm digital library, arxiv, and aclweb.org.
(prins et al. 2016) | article citations in the field of education and pedagogies, and citations to 328 articles in anthropology (n=774) | google scholar found 22,887 citations in education & pedagogical science compared to web of science’s 8,870, and 8,092 in anthropology compared with web of science’s 1,097.
(ştirbu et al. 2015) | compared # of citations resulting from two geographical topic searches (samples varied) | google scholar found 2,732 geographical references whereas web of science found only 275, georef 97, and francis 45. for sedimentation, google scholar found 1,855 geographical references compared to web of science’s 606, georef’s 1,265, and francis’s 33; google scholar overlapped web of science by 67% and 82% for the two searches, and georef by 57% and 62%.
(testa 2016) | open access journals in music (n=84) | google scholar indexed at least one article from 93% of oa journals. rilm indexed two additional journals.
(wildgaard 2015) | publications from researchers in astronomy, environmental science, philosophy and public health (n=512) | publication count from web of science was 2-4 times lower for all disciplines than google scholar; citation count was up to 13 times lower in web of science than in google scholar.
(winter, zadpoor, and dodou 2014) | growth of citations to 2 classic articles (1995-2013) and 56 science and social science articles in google scholar, 2005-2013 (samples varied) | total citation counts 21% higher in web of science than google scholar for lowry (1951) but google scholar 17% higher than web of science for garfield (1955) and 102% higher for the 56 research articles; google scholar showed a significant retroactive expansion to all articles compared to negligible retroactive growth in web of science.

table 1. studies investigating google scholar’s coverage of journal articles and citations, 2014-2016.

google scholar coverage: books

many studies mentioned that books, including google books, are sometimes included in google scholar results. jamali and nabavi (2015) found 13% of their sample of 8,310 citations from google scholar were books, while martín-martín et al. (2014) had found that 18% of their sample of 64,000 citations from google scholar were books. within the field of anthropology, prins et al. (2016) found books to generate the most citation impact in google scholar (41% of books in their sample were cited in google scholar) compared to articles (21% of articles were cited in google scholar). in education, 31% of articles and 25% of books were cited by google scholar (3). abrizah and thelwall found only 37% of their sample of 1,357 arts, humanities, and social sciences books from the five main university presses in malaysia had been cited in google scholar (23% of the books had been cited in google books) (abrizah and thelwall 2014, 2502). the overlap was small: 15% had impact in both google scholar and google books. the authors concluded that due to the low overlap between google scholar and google books, searching both engines is required to find the most citations to academic books. english books were significantly more likely to be cited in google scholar (48% vs.
32%), as were edited books (53% vs. 36%). they surmised edited books’ citation advantage was due to the use of book chapters in social sciences. they found arts and humanities books more likely to be cited in google scholar than social sciences books (40% vs. 34%) (abrizah and thelwall 2014, 2503).

google scholar coverage: grey literature

grey literature refers to documents not published commercially, including theses, reports, conference papers, government information, and poster sessions. haddaway et al. (2015) was the only empirical study found focused on grey literature. they discovered that between 8% and 39% of full-text search results from google scholar were grey literature, with the greatest concentration of citations from grey literature on page 80 of results for full-text searches and page 35 for title searches. they concluded “the high proportion of grey literature that is missed by google scholar means it is not a viable alternative to hand searching for grey literature as a standalone tool” (2015, 14). for one of the systematic reviews in their sample, none of the 84 grey literature articles cited were found within the exported google scholar search results. the only other investigation of grey literature found was bonato (2016), who after conducting a very limited number of searches on one specific topic and a search for a known item, concluded google scholar to be “deficient.” in conclusion, despite much offhand praise for google scholar’s grey literature coverage (erb and sica 2015; antell et al. 2013), the topic has been little studied and when it has, grey literature results have not been prominent.

google scholar coverage: open access and institutional repository content

erb and sica touted google scholar’s access to “free content that might not be available through a library’s subscription services,” including open access journals and institutional repository coverage (2015, 48). recent research has dug deeper into both these content areas.
in general, oa articles have been shown to net more citations than non-oa articles, as koler-povh, južnic, and turk (2014) showed within the field of civil engineering. across their sample of 2,026 scholarly articles in 14 journals, all indexed in web of science, scopus, and google scholar, oa articles received an average of 43 citations while non-oa articles were cited 29 times (1039). google scholar did a better job discovering those citations; in google scholar the median of citations of oa articles was always higher than that for non-oa articles, whereas this was true in web of science for only 10 of the 14 journals and in scopus for 11 of the 14 journals (1040). similarly, chen (2014) found google scholar to index far more oa journals than scopus and web of science, especially “gold oa.”12 google scholar’s advantage should not be assumed across all disciplines, however; testa (2016) found both google scholar and rilm to provide good coverage of oa journals in music, with google scholar indexing at least one article from 93% of the 84 oa journals in the sample, but the bibliographic database rilm indexed two more oa journals than google scholar. google scholar indexing of repositories may be critical for success, but results vary by ir platform and whether the ir metadata has been structured according to google’s guidelines.
in a random sample from shodhganga, india’s central etd database, weideman (2015) found not one article had been indexed in full text by google scholar, although in many cases the metadata was indexed, leading the author to identify needed changes to the way shodhganga stores etds.13 likewise, chen (2014) found that neither google scholar nor google appears to index baidu wenku, a major full-text archive and social networking site in china similar to researchgate, and orduña-malea and lópez-cózar (2015) found that latin american repositories are not very visible in google or google scholar due to limitations of the description schemas chosen as well as search engine reliability. in yang’s (2016) study of texas tech’s dspace ir, google was the only search engine that indexed, discovered, or linked to pdf files supplemented with metadata; google scholar did not discover or provide links to the ir’s pdf files, and was less successful at discovering metadata. when google scholar is able to index ir content, it may be responsible for significant traffic. in a study of four major u.s. universities’ institutional repositories (three dspace, one contentdm) involving a dataset of 57,087 unique urls and 413,786 records, researchers found that 48%–66% of referrals came from google scholar (obrien et al. 2016, 870). the importance of google scholar in contrast to google was noted by lee et al. (2015), who conducted title searches on 170 journal articles housed in florida state university’s institutional repository (using bepress’s digital commons platform), 100 of which existed in full text in the ir. links to the ir were found in google results for 45.9% of the 170 items, and in google scholar for 74.7% of the 170 items. 
furthermore, google scholar linked to the full text for 78% of the 100 cases where full text was available, and even provided links to freely available full text for six items that did not have full text in the ir. however, the researchers also noted “relying on either google or google scholar individually cannot ensure full access to scholarly works housed in oa irs.” in their study, among the 104 fully open access items there was an overlap in results of only 57.5%; google provided links to 20 items not found with google scholar, and google scholar provided links to 25 items not found with google (lee et al. 2015, 15).

12 oa articles on publisher web sites, whether the journal itself is oa or not (chen 2014).
13 most notably, the need to store thesis documents as one pdf file instead of divided into multiple, separate files, to create html landing pages as per google’s recommendations, and to submit the addresses of these pages to google scholar.

google scholar results note the number of “versions” available for each item. in a study of 982 science article citations (including both oa and non-oa) in irs, pitol and de groote found 56% of citations had between four and nine google scholar versions (2014, 603). almost 90% of the citations shown were the publisher version, but of these, only 14.3% were freely available in full text on the publisher web site. meanwhile, 70% of the items had at least one free full-text version available through a “hidden” google scholar version. the author’s experience in retrieving full text for this review indicates this issue still exists, but research would be needed to formulate reliable recommendations for users.

use and utility of google scholar as part of the research process

studies were found concerning google scholar’s popularity with users and their reasons for preferring it (or not) over other tools.
another group of studies examined issues related to the utility of google scholar for research processes, including issues related to messy metadata. finally, a cluster of articles focused specifically on using google scholar for systematic reviews.

popularity and user preferences

several studies have shown google scholar to be well-known to scholarly communities. a survey of 3,500 scholars from 95 countries found that over 60% of scientists and engineers and over 70% of respondents in the social sciences, arts, and humanities were aware of google scholar and used it regularly (van noorden 2014). in a large-scale journal-reader survey, inger and gardner found that among academic researchers in high-income areas, academic search engines surpassed abstracts and indexes as a starting place for research (2016, 85, figure 4). in low-income areas, google use exceeded google scholar use for academic research. major library link resolver software offers reports of full-text requests broken down by referrer. inger and gardner (2016) showed a large variance across subjects for whether people prefer google or google scholar: “people in the social sciences, education, law, and business use google scholar more to find journal articles. however, people working in the humanities and religion and theology prefer to use google” (88). humanities scholar use of google over google scholar was also found by kemman et al. (2013); google, google images, google scholar, and youtube were used more than jstor or other library databases, even though humanities scholars’ trust in google and google scholar was lower. user research since 2014 concerning google scholar has focused on graduate students. results suggest google scholar is used regularly but the tool is only partially sufficient.
in their study of 20 engineering master’s students’ use of abstracts and indexes, johnson and simonsen (2015) found that half their sample had used google scholar the last time they located an article using specific search terms or criteria. google was the second most-used source at 20%, followed by abstracting and indexing services (15%). graduate students describe google scholar with nuance and refer to it as a specific part of their process. in bøyum and aabø’s (2015) interviews with eight phd business students and wu and chen’s (2014, 381) interviews with 32 graduate students drawn from multiple academic disciplines, the majority described using library databases and google scholar for different purposes depending on the context. graduate students in both studies were well aware of google scholar’s use for citation searching. bøyum and aabø’s (2015) subjects described library resources as more “academically robust” than google or google scholar. wu and chen’s (2014) interviewees praised google scholar for its wider coverage and convenience, but lamented the uncertain quality, sometimes inaccessible full text, too many results, lack of sorting function (document type or date), finding documents from different disciplines, and duplicate citations. google scholar was seen by their subjects as useful during early stages of information seeking. in contrast to general assumptions, more than half the students interviewed reported browsing more than 3 pages’ worth of google scholar results (wu and chen 2014, 381). about half of interviewees reported looking at cited documents to find more; however, students had mixed opinions about whether the citing documents turned out to be relevant. google scholar’s “my library” feature, introduced in 2013, now competes with other bibliographic citation management software.
in a survey of 344 (mostly graduate) students, conrad, leonard, and somerville found google scholar was the most-used tool (47%), followed by endnote (37%) and zotero (19%) (2015, 572). follow-up interviews with 13 of the students revealed that a few students used multiple tools. for example, one participant reported using endnote for sharing data with lab partners and others “across the community”; mendeley for her own personal thesis work, where she needs to “build a whole body of literature”; and google scholar citations for “quick reference lists that i may not need for a second or third time.”

messy metadata

many studies have suggested google scholar’s metadata is “messy.” although none in the period of study examined this phenomenon in conjunction with relative user performance, the issues found could affect scholarship. a 2016 study itemized the most common mistakes in google scholar resulting from its extraction process: 1) incorrect title identification; 2) missing or incorrectly assigned authors; 3) book reviews indexed as books; 4) failing to group versions of the same document, which inflates citation counts; 5) grouping different editions of books, which deflates citation counts; 6) attributing citations to documents that did not cite them, or missing citations that did; and 7) duplicate author profiles (martín-martín et al. 2016b). the authors concluded that “in an academic big data environment, these errors (which we deem affect less than 10% of the records in the database) are of no great consequence, and do not affect the core system performance significantly” (54). two of these issues have been studied specifically: duplicate citations and missing publication dates. the rate of duplicate citations in google scholar has ranged upwards of 2.93% (haddaway et al.
2015) and 5% (winter, zadpoor, and dodou 2014, 1562), which can be compared to a 0.05% duplicate citation rate in web of science (haddaway et al. 2015, 13). haddaway found the main reasons for duplication include “typographical errors, including punctuation and formatting differences; capitalization differences (google scholar only), incomplete titles, and the fact that google scholar scans citations within reference lists and may include those as well as the citing article” (2015, 13). the issue of missing publication dates varies greatly across samples. dates were found to be missing 9% of the time in winter et al.’s study, although it varied by publication type: 4% of journals, 15% of theses, and 41% of the unknown document types (winter, zadpoor, and dodou 2014, 1562). however, martín-martín et al. studied a sample of 32,680 highly-cited documents and found that web of science and google scholar agreed on publication dates 96.7% of the time, with an idiosyncratically large proportion of those mismatches in 2012 and 2013 (2017, 159).

utility for research processes

prior to 2014, studies such as asher, duke, and wilson (2012) evaluated google scholar’s utility as a general research tool, often in comparison with discovery tools. since 2014, the only such study found was namei and young’s comparison of summon, google scholar, and google using 299 known-item queries. they found google scholar and summon returned relevant results 74% of the time; google returned relevant results 91% of the time. for “scholarly formats,” they found summon returned relevant results 76% of the time; google scholar 79%; and google 91% (2015, 526-527). the remainder of studies in this category focused specifically on systematic reviews, perhaps because such reviews are so time-consuming. authors develop search strategies carefully, execute them in multiple databases, and document their search methods and results carefully.
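
the causes of duplication reported earlier in this section (punctuation, formatting, and capitalization differences) suggest a simple normalization step before counting duplicates in an exported result set. the python sketch below is illustrative only and is not the procedure used by haddaway et al. or winter et al.; the function names, the record fields (`title`, `year`), and the matching rule (identical normalized title and year) are all assumptions made for this example.

```python
import re
import unicodedata


def normalize_title(title: str) -> str:
    """Reduce a title to lowercase ASCII letters and digits separated by single spaces."""
    # Decompose accented characters, then drop the combining marks.
    text = unicodedata.normalize("NFKD", title)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = text.casefold()
    # Replace every run of punctuation or other non-alphanumerics with a space.
    text = re.sub(r"[^a-z0-9]+", " ", text)
    return " ".join(text.split())


def duplicate_rate(records):
    """Return the share of records whose (normalized title, year) was already seen.

    `records` is a list of dicts with hypothetical keys "title" and "year".
    """
    seen = set()
    duplicates = 0
    for rec in records:
        key = (normalize_title(rec["title"]), rec.get("year"))
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return duplicates / len(records) if records else 0.0


# Made-up records for illustration; not data from any study cited here.
sample = [
    {"title": "Google Scholar: A Review", "year": 2015},
    {"title": "google scholar - a review.", "year": 2015},
    {"title": "Another Paper", "year": 2014},
]
rate = duplicate_rate(sample)
```

matching on title and year alone can merge distinct works that happen to share a title, and titles in non-latin scripts are reduced to nothing by this normalization, so flagged records in a real export would still need manual review.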
some prestigious journals are beginning to require similar rigor for any original research article, not just systematic reviews (cals and kotz 2016). information provided by professional organizations about the use of google scholar for systematic reviews seems inconsistent: the cochrane handbook for systematic reviews of interventions lists google scholar among sources for searching, but none of the five “highlighted reviews” on the cochrane web site at the time of this article’s writing used google scholar in their methodologies. the uk organization national institute for health and care excellence’s manual (national institute for health and care excellence (nice)) only mentions google scholar in an appendix of search sources under “conference abstracts.” a study by gehanno et al. (2013) found google scholar contained 100% of the references from 29 systematic reviews, and suggested google scholar could be the first choice for systematic reviews or meta-analyses. this finding prompted a slew of follow-up studies in the next three years. an immediate response by giustini and boulos (2013) pointed out that systematic reviews are not performed by searching for article titles as with gehanno et al.’s method, but through search strategies. when they tried to replicate a systematic review’s topical search strategy in google scholar, the citations were not easily discovered. in addition, the authors were not able to find all the papers from a given systematic review even by title searching. haddaway et al. also found imperfect coverage: for one of the seven reviews examined, 31.5% of citations could not be found (2015, 11). haddaway also noted that special characters and fonts (as with chemical symbols) can cause poor matching when such characters are part of article titles.
recent literature concurs that it is still necessary to search multiple databases when conducting a systematic review, including abstracts and indexes, no matter how good google scholar’s coverage seems to be. no one database’s coverage is complete, including google scholar (thielen et al. 2016). practical recall of google scholar is exceptionally low due to the 1,000 result limit, yet at the same time, google scholar’s lack of precision is costly in terms of researchers’ time (bramer, giustini, and kramer 2016b; haddaway et al. 2015). the challenges limiting study of google scholar’s coverage also bedevil those wishing to use it for reviews, especially the 1,000 result retrieval limit, lack of batch export, and lack of exported abstracts (levay et al. 2016). additionally, google scholar’s changing content, unknown algorithm and updating practices, search inconsistencies, limited boolean functions, and 256-character query limit prevent the tool from accommodating the detailed, reproducible search methodologies required by systematic reviews (bonato 2016; haddaway et al. 2015; giustini and boulos 2013). bonato noted google scholar retrieved different results with advanced and basic searches; could not determine the format of items (e.g. conference papers); and found other inconsistent results.14 bonato also lamented the lack of any kind of document type limit. despite the limitations and logistical challenges, practitioners and scholars are finding solid reasons for including academic web search engines as part of most systematic review methodologies (cals and kotz 2016). stansfield et al. noted that “relevant literature for low- and middle-income countries, such as working and policy papers, is often not included in databases,” and that google scholar finds additional journal articles and grey literature not indexed in databases (2016, 191).
for eight systematic reviews by eppi-center, “over a quarter of relevant citations were found from websites and internet search engines” (stansfield, dickson, and bangpan 2016, 2). specific tools and practices have been recommended when using search engines within the context of systematic reviews. software is available to record search strategies and results (harzing and alakangas 2016b; haddaway 2015). haddaway suggests the use of snapshot tools (haddaway 2015) to record the first 1,000 google scholar records rather than the typical assessment of the first 50 search results as had been done in the past: “this change in practice could significantly improve both the transparency and coverage of systematic reviews, especially with respect to their grey literature components” (haddaway et al. 2015, 15). both haddaway (2015) and cochrane recommend that review authors print or save locally electronic copies of the full text or relevant details rather than bookmarking web sites, “in case the record of the trial is removed or altered at a later stage” (higgins and green 2011). new methods for searching, downloading, and integrating academic search engine results into review procedures using free software to increase transparency, repeatability, and efficiency have been proposed by haddaway and his colleagues (2015).

14 bonato (2016) found zero hits for conference papers when limiting by year 2015-2016, but found two papers presented at a 2015 meeting.

google scholar citations and metrics

google scholar citations and metrics are not academic search engines, but this article included them because these products are interwoven into the fabric of the google scholar database. google scholar citations, launched in late 2011 (martín-martín et al. 2016b, 12), groups citations by author, while google metrics (launch date uncertain) provides similar data for articles and journals.
readers interested in an in-depth literature review of google scholar citations for earlier years (2005-2012) are directed to (thelwall and kousha 2015b). in his comprehensive review of more recent literature about using google scholar citations for citation analysis, waltman (2016) described several themes. google scholar’s coverage of many fields is significantly broader than web of science and scopus, and this seems to be continuing to improve over time. however studies regularly report google scholar’s inaccuracies, content gaps, phantom data, easily manipulatable citation counts, lack of transparency, and limitations for empirical bibliometric studies. as discussed in the coverage section, google scholar’s citation database is competitive with other major databases such as web of science and has been growing dramatically in the last few years (winter, zadpoor, and dodou 2014; harzing and alakangas 2016b; harzing 2014) but has recently stabilized (harzing and alakangas 2016b). more and more studies are concluding that google scholar will report more comprehensive information about citation impact than web of science or scopus. across a sample of articles from many years of one science journal, trapp (2016) found the proportion of articles with zero citations was 37% for web of science, 29% for scopus, and 19% for google scholar. some of google scholar’s superiority for citation analysis in the social sciences and humanities is due to its inclusion of book content, software, and additional journals (prins et al. 2016; bornmann et al. 2016). bornmann et al. (2016) noted citations to all ten of a research institute’s ten books published in 2009 were found in google scholar, whereas web of science found citations for only two books. furthermore they found data in google scholar for 55 of the total of 71 of the institute’s book chapters. 
for the four conference proceedings they could identify in google scholar, there were 100 citations, of which 65 could be found in google scholar. the comparative success of google scholar for citation impact varies by discipline, however: levay et al. (2016) found web of science to be more reliable than google scholar, quicker for downloading results, and better for retrieving 100% of the most important publications in public health. despite google scholar’s growth, using all three major tools (scopus, web of science, and google scholar) still seems to be necessary for evaluating researcher productivity. rothfus et al. (2016) compared web of science, scopus, and google scholar citation counts for evaluating the impact of the canadian network of observational drug effect studies (cnodes), as represented by a sample of 222 citations from five articles. attempting to determine citation metrics for the cnodes research team yielded different results for every article when using the three tools. they found that “using three tools (web of science, scopus, google scholar) to determine citation metrics as indicators of research performance and impact provided varying results, with poor overall agreement among the three” (237). major academic libraries’ web sites often explain how to find one’s h-index in all three (suiter and moulaison 2015). researchers have also noted the disadvantages of google scholar for citation impact studies. google scholar is costly in terms of researcher time. levay et al. (2016) estimated the cost of “administering results” from web of science to be 4 hours versus 75 hours for google scholar. administering results includes using the search tool to search, download, and add records to bibliographic citation software, and removing duplicate citations. duplicate citations are often mentioned as a problem (prins et al.
2016), although moed (2016) suggested the double counting by google scholar would occur only if the level of analysis is on target sources, not if it is on target articles.15 downloaded citation samples can still suffer from double counts, however: harzing and alakangas described how cleaning “a fairly extreme case” in their study reduced the number of papers from 244 to 106 (2016b). google scholar also does not identify self-citations, which can dramatically influence the meaning of results (prins et al. 2016). furthermore, researchers have shown it is possible to corrupt google scholar citations by uploading obviously false documents (delgado lópez-cózar, robinson-garcía, and torres-salinas 2014). while the researchers noted traditional citation indexes can also be defrauded, google’s products are less transparent and abuses may not be easily detected. google did not respond to the research team when contacted and simply deleted the false documents to which it had been alerted without reporting the situation to the affected authors, and the researchers concluded: “this lack of transparency is the main obstacle when considering google scholar and its by-products for research evaluation purposes” (453).
15 “if a document is, for instance, first published in arxiv, and a next version later in a journal j, citations to the two versions are aggregated. in google scholar metrics, in which arxiv is included as a source, this document (assuming that its citation count exceed the h5 value of arxiv and journal j) is listed both under arxiv and under journal j, with the same, aggregate citation count” (moed 2016, 29).
because these disadvantages do not outweigh google scholar’s seemingly broader coverage, many articles investigate workarounds for using google scholar more effectively when evaluating research impact.
harzing and alakangas (2016b) recommend the hi index16, which is corrected for career length and co-authorship patterns, as the citation metric of choice for a fair comparison of google scholar with other tools. bornmann et al. (2016) investigated a method to normalize data and reduce errors when using google scholar data to evaluate citations in the social sciences and humanities. researcher profiles can also be used to find other scholars by topic. in a 2014 survey of researchers (n=8,554), dagienė and krapavickaitė found that 22% used a third-party service such as google scholar or microsoft academic to produce lists of their scholarly activities and 63% reported their scholarly record was freely available on the web (2016, 158, 161). google scholar ranked second only to microsoft word as the most frequently used software to maintain academic activity records (160). martín-martín et al. (2016b) examined 814 authors in the field of bibliometrics using google scholar citations, researcherid, researchgate, mendeley, and twitter. google scholar was the most used social research sharing platform, followed by researchgate, with researcherid gaining wider acceptance among authors deemed “core” to the field. only about one-third of the authors created a twitter profile, and many mendeley and researcherid profiles were found to be empty. the study found the distinctive advantages of google scholar academic profiles to be automatic updates and a high growth rate, with disadvantages of scarce quality control, metadata mistakes inherited from google scholar, and manipulability. overall, martín-martín and colleagues concluded that google scholar “should be the preferred source for relational and comparative analyses in which the emphasis is put on author clusters” (57). google scholar metrics provides citation information for articles and journals.
in a sample of 1,000 journals, orduña-malea and delgado lópez-cózar found that “despite all the technical and methodological problems,” google scholar metrics provides sound and reliable journal rankings (2014, 2365). google scholar metrics seems to be an annual publication; the 2016 edition contains 5,734 publications and 12 language rankings. russian, korean, polish, ukrainian, and indonesian were added in 2016, while italian and dutch were removed for unknown reasons (martín-martín et al. 2016a). researchers also found that many discussion papers and working papers were removed in 2016. english-language publications are broken into subject areas and disciplines. google scholar metrics often, but not always, creates separate entries for each language in which a journal is published. bibliometricians call for google scholar metrics to display the total number of documents published in the publications indexed and the total number of citations received: “these are the two essential parameters that make it possible to assess the reliability and accuracy of any bibliometric indicator” (13). adding country and language of publication and self-citation rates are among the other improvements listed by lópez-cózar and colleagues.
16 harzing and alakangas (2016b) define the hia as the hi norm/academic age. academic age refers to the number of years elapsed since first publication. to calculate hi norm, one divides the number of citations by the number of authors for that paper, and then calculates the h-index of the normalized citation count.
informing practice
the glaring lack of research related to the coverage of arts and humanities scholarship, limited research on book coverage, and relaunch of microsoft academic make it impossible to form a general recommendation regarding the use of academic web search engines for serious research.
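the hia calculation described in footnote 16 can be sketched in a few lines of python. this is a minimal illustration under stated assumptions, not the implementation used by harzing and alakangas or by any citation tool: the function names (`h_index`, `hi_norm`, `hia`), the input format (a list of citation-count and author-count pairs), and the sample numbers are all hypothetical.

```python
from typing import List, Tuple


def h_index(counts: List[float]) -> int:
    """Largest h such that h items each have a count of at least h."""
    ranked = sorted(counts, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def hi_norm(papers: List[Tuple[int, int]]) -> int:
    """h-index of per-paper citation counts divided by number of authors.

    Each paper is a hypothetical (citations, authors) pair.
    """
    normalized = [citations / authors for citations, authors in papers]
    return h_index(normalized)


def hia(papers: List[Tuple[int, int]], academic_age: int) -> float:
    """hi norm divided by academic age (years since first publication)."""
    if academic_age <= 0:
        raise ValueError("academic age must be a positive number of years")
    return hi_norm(papers) / academic_age


# Illustrative data only: (citations, authors) for five papers.
sample = [(10, 2), (8, 1), (6, 3), (3, 1), (1, 1)]
print(h_index([c for c, _ in sample]))  # ordinary h-index of the raw counts
print(hi_norm(sample))                  # h-index after dividing by author count
print(hia(sample, academic_age=6))      # hi norm per year of academic age
```

dividing each paper's citations by its author count before taking the h-index is what corrects for co-authorship patterns, and dividing the result by academic age is what corrects for career length.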
until the ambiguity of arts and humanities coverage is clarified, and until academic web search engines are transparent and stable, traditional bibliographic databases still seem essential for systematic reviews, citation analysis, and other rigorous literature search purposes. discipline-specific databases also have features such as controlled vocabulary, industry classification codes, and peer review indicators that make scholars more efficient and effective. nevertheless, the increasing relevance of academic search engines and solid coverage of sciences and social sciences make it essential for librarians to become expert with google scholar, google books, and microsoft academic. for some scholarly tasks, academic search engines may be superior: for example, when looking up doi numbers for this paper’s bibliography, the most efficient process seemed to be a google search on the article title plus the term “doi,” and the most likely site to display in the results was researchgate.17 librarians and scholars should champion these tools as an important part of an efficient, effective scholarly research process (walsh 2015), while also acknowledging the gaps in coverage, biases, metadata issues, and lack of features available in other databases. academic web search engines could form the centerpiece for instruction sessions surrounding the scholarly network, as shown by “cited by” features, author profiles, and full-text sources. traditional abstracts and indexes could then be presented on the basis of their strengths. at some point, explaining how to access full text will likely no longer focus on the link resolver but on the many possible document versions a user might encounter (e.g., pre-prints or editions of books) and how to make an informed choice.
in the meantime, even though web search engines and repositories may retrieve copious full text outside library subscriptions, college students should still be made aware of the library’s collections and services such as interlibrary loan. when considering google scholar’s weaknesses, it’s important to keep in mind chen’s observation that we may not have a tool available that does any better (antell et al. 2013). while google scholar may be biased toward english-language publications, so are many bibliographic databases. overall, google scholar seems to have increased the visibility of international research (bartol and mackiewicz-talarczyk 2015). while google scholar’s coverage of grey literature has been shown to be somewhat uneven (bonato 2016; haddaway et al. 2015), it seems to include more diversity among relevant document types than many abstracts and indexes (ştirbu et al. 2015; bartol and mackiewicz-talarczyk 2015). although the rigors of systematic reviews may contraindicate the tool’s use as a single source, it adds value to search results from other databases (bramer, giustini, and kramer 2016a). user preferences and priorities should also be taken into account; google scholar results have been said to contain “clutter,” but many researchers have found the noise in google scholar tolerable given its other benefits (ştirbu et al. 2015).
17 because the authority of researchgate is ambiguous, in such cases i then looked up the doi using google to find the publisher’s version. in some cases, the doi was not displayed on the publisher’s result page (e.g., https://muse.jhu.edu/article/197091).
google books purportedly contains about 30 million items, focused on u.s.-published and english-language books. but its coverage is hit-or-miss, surprising mays (2015) with an unexpected wealth of primary sources but disappointing harper (2016) with limited coverage of academic health sciences books.
recent court decisions have enabled google to continue progressing toward their goal of full-text indexing and making snippet views available for the google-estimated universe of 130 million books, which suggests its utility may increase. google books is not integrated with link resolvers or discovery tools but has been found useful for providing information about scholarly research impact, especially for the arts, humanities, and social sciences. as re-launched in 2016, microsoft academic shows real potential to compete with google scholar in coverage and utility for finding journal articles. as of february 2017 its index contains 120 million citations. in contrast to the mystery of google scholar’s black-box algorithms and restrictive limitations, microsoft academic uses an open-system approach and offers an api. microsoft academic appears to have less coverage of books and grey literature compared with google scholar. research is badly needed about the coverage and utility of both google books and microsoft academic. google scholar continues to evolve, launching a new algorithm for known-item searching in 201618 that appears to work very well. google scholar does not reveal how many items it searches but studies have suggested 160 million documents have been indexed. studies have shown the google scholar relevance algorithm to be heavily influenced by citation counts and language of publication. google scholar has been so heavily researched and is such a “black box” that more attention would seem to have diminishing returns, except in the area of coverage of and utility for arts and humanities research. librarians may find these takeaways useful for working with or teaching google scholar: • little is known about coverage of arts and humanities by google scholar. 
• recent studies repeatedly find that in the sciences and social sciences google scholar covers as much if not more than library databases, has more recent coverage, and frequently provides access to full text without the need for library subscriptions.
• although the number of studies is limited, google scholar seems excellent at retrieving known scholarly items compared with discovery tools.
• using proper accent marks in the title when searching for non-english language items appears to be important.
• finding full text for non-english journal articles may require searching google scholar in the original language.
• while google scholar may include results from google books, it appears both tools should be used rather than assuming google books will appear in google scholar.
• while google scholar does include grey literature, these results do not usually rank highly.
• google scholar and google must both be used to effectively search across institutional repository content.
• free full text may be buried underneath the “all x versions” links because the publisher’s web site is usually the dominant version presented to the user. the right-hand column links may help ameliorate this situation, but not reliably.
• google scholar is well-known in most academic communities and used regularly; however, it is seldom the only tool used, with scholars continuing to use other web search tools, library abstracts and indexes, and published web sites as well.
• experts in writing systematic reviews recommend google scholar be included as a search tool along with traditional abstracts and indexes, using software to record the search process and results.
• for evaluating research impact, google scholar may be superior to web of science or scopus, but using all three tools still seems necessary.
• as with any database, citation metadata should be verified against the publisher’s data; with google scholar, publication dates should receive deliberate attention.
• when google scholar covers some of a major publisher’s content, that does not imply it covers all of that publisher’s content.
• google scholar metrics appears to provide reliable journal rankings.
18 google scholar’s blog notes that in january 2016, a change was made so “scholar now automatically identifies queries that are likely to be looking for a specific paper”; technically speaking, “it tries hard to find the intended paper and a version that that particular user is able to read” (https://scholar.googleblog.com/).
research agenda
this review of the literature also provides direction for future research concerning academic web search engines. because this review focused on 2014-2016, researchers may need to review studies from earlier periods for methodological ideas and previous findings, noting that dramatic changes in search engine coverage and behavior can occur within only a few years.19
19 for example, ştirbu found that google scholar overlapped georef by 57% and 62% (ştirbu et al. 2015, 328), compared with a finding by neuhaus in 2006 where scholar overlapped with georef by 26% (2006, 133).
across the studies, some general best practices were observed. when comparing the coverage of academic web search engines, their utility for establishing research impact, or other bibliometric studies, researchers should strongly consider using software such as publish or perish and designing their research approach with previous methodologies in mind. information scientists have charted a set of clear disciplinary methods; there is no need to start from scratch. even when
performing a large-scale quantitative assessment such as that of kousha and thelwall (2015), manually examining and discussing a subset of the sample seems helpful for checking assumptions and for enhancing the meaning of the findings to the reader. some researchers examined the “top 20” or “top 10” results qualitatively (kousha and thelwall 2015), while others took a random sample from within their large-study sample (kousha, thelwall, and rezaie 2011).
academic search engines for arts and humanities research
research into the use of academic web search engines within arts and humanities fields is sorely needed. surveys show humanities scholars use both google and google scholar (inger and gardner 2016; kemman, kleppe, and scagliola 2013; van noorden 2014). during interviews of 20 historians by martin and quan-haase (2016) concerning serendipity, five mentioned google books and google scholar as important for recreating serendipity of the physical library online. almost all arts and humanities scholars search the internet for researchers and their activities, and commonly express the belief that having a complete list of research activities online improves public awareness (dagienė and krapavickaitė 2016). mays’s (2015) practical advice and the few recent studies on citation impact of google books for these disciplines point to the enormous potential for this tool’s use. articles describing opportunities for new online searching habits of humanities scholars have not always included google scholar (huistra and mellink 2016). wu and chen’s interviews with humanities graduate students suggested their behavior and preferences were different from science and technology students, doing more known-item searching and struggling with “semantically ambiguous keywords” that retrieved irrelevant results (2014, 381).
platform preferences seem to have a disciplinary aspect: hammarfelt’s (2014) investigation of altmetrics in the humanities suggests mendeley and twitter should be included along with google scholar when examining citation impact of humanities research, while a 2014 nature survey suggests researchgate is much less popular in the social sciences and humanities than in the sciences (van noorden 2014). in summary, arts and humanities scholars are active users of academic web search engines and related tools, but their preferences and behavior, and the relative success of google scholar as a research tool, cannot be inferred from the vast literature focused on the sciences. advice from librarians and scholars about the strengths and limitations of academic web search engines in these fields would be incredibly useful. specific examples of needed research, and related studies to reference for methodological ideas:
• similar to the studies that have been done in the sciences, how well do academic search engines cover the arts and humanities? an emphasis on formats important to the discipline would be important (prins et al. 2016).
• how does the quality of search results compare between academic search engines and traditional library databases for arts and humanities topics? to what extent can the user usefully accomplish her task (ruppel 2009)?
• to what extent do academic search engines support the research process for scholarship distinctive to arts and humanities disciplines (e.g., historiographies, review essays)?
• in academic search engines, how visible is the arts and humanities literature found in institutional repositories (pitol and de groote 2014)?
specific aspects of academic search engine coverage
this review suggests that broad studies of academic search engine coverage may have reached a saturation point.
however, specific aspects of coverage need additional investigation:
• grey literature: although google scholar’s inclusion of grey literature is frequently mentioned as valuable, empirical studies evaluating its coverage are scarce. additional research following the methodology of haddaway (2015) could investigate the bibliographies of literature other than systematic reviews, investigate various disciplines, or use a sample of valuable known items (similar to kousha, thelwall, and rezaie’s (2011) methodology for books).
• non-western, non-english language literature: for further investigation of the repeated finding of non-western, non-english language bias (abrizah and thelwall 2014; cavacini 2015), comparisons to library abstracts and indexes would be helpful for providing context. to what extent is this bias present in traditional research tools? hilbert et al. found the coverage of their sample increased for english language in both web of science and scopus, and “to a lesser extent” in google scholar (2015, 260).
• books: any investigations of book coverage in microsoft academic and google scholar would be welcome. very few 2014-2016 studies focused on books in google scholar, and even looking in earlier years turned up little research. georgas (2015) compared google with a federated search tool for finding books, so her study may be a useful reference. kousha et al. (2011) found three times as many citations in google scholar as in scopus to a sample of 1,000 academic books. the authors concluded “there are substantial numbers of citations to academic books from google books and google scholar, and it therefore may be possible to use these potential sources to help evaluate research in book-oriented disciplines” (kousha, thelwall, and rezaie 2011, 2157).
• institutional repositories: yang (2016) recommended that “librarians of digital resources conduct research on their local digital repositories, as the indexing effects and discovery rates on metadata or associated text files may be different case by case,” and the studies found from 2014-2016 show that ir platform and metadata schema dramatically affect discovery, with some irs nearly invisible (weideman 2015; chen 2014; orduña-malea and lópez-cózar 2015; yang 2016) and others somewhat findable by google scholar (lee et al. 2015; obrien et al. 2016). askey and arlitsch (2015) have explained how google scholar’s decisions regarding metadata schema can dramatically affect results.20 libraries that would like their institutional repositories to serve as social sharing platforms for research should consider conducting a study similar to that of martín-martín et al. (2016b). finally, a study of ir journal article visibility in academic web search engines could be extremely informative.
20 for example, google’s rejection of dublin core.
• full-text retrieval: the indexing coverage of academic search engines relates to the retrieval of full text, which is another area ripe for more research studies, especially in light of the impressive quantity of full text that can be retrieved without user authentication. johnson and simonsen (2015) found that more of the engineering students they surveyed obtained scholarly articles from a free download or by getting a pdf from a colleague at another institution than used the library’s subscription. meanwhile, libraries continue to pay for costly subscription resources. monitoring this situation is essential for strategic decision-making. quint (2016) and karlsson (2014) have suggested strategies for libraries and vendors to support broader access to subscription full text through creative licensing and per-item fee approaches.
institutional repositories have had mixed results in changing scholars’ habits (both contributors and searchers) but are demonstrably contributing to the presence of full text in the academic search engine experience. when will academic users find a good-enough selection of full-text articles that they no longer need the expanded full text paid for by their institutions? google books similarly to microsoft academic, google books as a search tool also needs dedicated research from librarians and information scientists about its coverage, utility, and/or adoption. a purposeful comparison with other large digital repositories such as hathitrust (https://www.hathitrust.org) would be a boon to practitioners and the public. while hathitrust is transparent about its coverage (https://www.hathitrust.org/statistics_visualizations), specific areas of google books’ coverage have been called into question. weiss (2016) suggested a gap in google books exists from about 1915-1965 “because many publishers either have let it fall out of print, or the book is orphaned and no one wants to go through the trouble of tracking down the copyright owners” and found that copies in google books “will likely be locked down and thus unreadable, or visible only as a snippet, at best” (303). has this situation changed since the court rulings concerning the legality of snippet view? longitudinal studies in the growth of google books similar to (harzing 2014) could illuminate this and other questions about google books’s ability to deliver content. uneven coverage of content types, geography, and language should be investigated. mays noted a possible geographical imbalance within the united states (mays 2015, 26). others noted significant language and international imbalances, and large disciplinary differences (weiss 2016; abrizah and thelwall 2014; kousha and thelwall 2015). 
weiss and others suggest google books’ coverage imbalance has enormous social implications: “google and other [massive digital libraries] have essentially canonized the books they have scanned and contribute to the marginalization of those left unscanned” (301). therefore, more holistic quantitative investigations of the types of information in google books and possible skewness would be welcome. finally, chen’s study (2012) comparing the coverage of google books and worldcat could be repeated to provide longitudinal information. the utility of google books for research purposes also needs further investigation. books are far more prevalently cited in wikipedia than are research articles (thelwall and kousha 2015a). examining samples of wikipedia articles’ citation lists for the prevalence of google books could reveal how dominant a force google books has become in that space. on a more philosophical level, investigating the ways google books might transform scholarly processes would be useful. szpiech (2014) considered how the google books version of a medieval manuscript transformed his relationship with texts, causing a rupture “produced by my new power to extract words and information from a text without being subject to its order, scale, or authority” (78). he hypothesized readers approach google books texts as consumers, rather than learners, whereby “the critical sense of the gestalt” is at risk of being forgotten (84). have other researchers experienced what he describes?
microsoft academic
given the stated openness of microsoft’s new academic web search engine,21 the closed nature of google scholar, and the promising findings of bibliometricians (harzing 2016b; harzing and alakangas 2016a), librarians and information scientists should embark on a thorough review of microsoft academic with enthusiasm similar to that with which they approached google scholar. the search engine’s coverage, utility for research, and suitability for bibliometric analysis22 all need to be examined. microsoft academic’s abilities for supporting scholarly social networking would also be of interest, perhaps using ward et al. (2015) as a theoretical groundwork. the tool’s coverage and utility for various disciplines and research purposes is a wide-open field for highly useful research.
21 microsoft’s faq says the company is “adopting an open approach in developing the service, and we invite community participation. we like to think what we have developed is a community property. as such, we are opening up our academic knowledge as a downloadable dataset” and offers the academic knowledge api (https://www.microsoft.com/cognitive-services/en-us/academic-knowledge-api).
22 see jacsó (2011) for methodology.
professional and instructional approaches based on user research
to inform instructional approaches, more study on user behavior is needed, perhaps repeating herrera’s (2011) study with google scholar and microsoft academic. in light of the recent focus on graduate students, research concerning the use of academic web search engines by undergraduates, community college students, high school students, and other groups would be welcome. using an interview or focus group generates exploratory findings that could be tested through surveys with a larger, more representative sample of the population of interest. studying searching behaviors has been common; can librarians design creative studies to investigate reading, engagement, and reflection when web search engines are used as part of the process? is there a way to study whether the “matthew effect” (antell et al. 2013, 281), the aging citation phenomenon (verstak et al. 2014; martín-martín et al. 2016a; davis and cochran 2015), or other epistemological hypotheses are influencing scholarship patterns? a bold study could be performed to examine differences in quality outcomes between samples of students using primarily academic search engines versus traditional library search tools. exploratory studies in this area could begin by surveying students about their use of search tools for research methods courses or asking them to record their research process in a journal, and correlating the findings with their grades on the final research product. three specific areas of user research needed are the use of scholarly social network platforms, researcher profiles, and the influence of these on scholarly collaboration and research (ward, bejarano, and dudás 2015, 178); the performance of google’s relatively new known-item search23 (compared with microsoft academic’s known-item search abilities); and searching in non-english languages. regarding the latter, albarillo’s (2016) method, which he applied to library databases, could be repeated with google scholar, microsoft academic, and google books. finally, to continue their strong track record as experts in navigating the landscape of digital scholarship, librarians need to research assumptions regarding best practices for scholarly logistics. for example, searching google for article titles plus the term “doi,” then scanning the results list for researchgate was found by this study’s author to most efficiently provide doi numbers, but is this a reliable approach? does researchgate have sufficient accuracy to be recommended as the optimal tool for this task?
what is the most efficient way for a scholar to locate full text for a citation? are academic search engines’ bibliographic citation management software export tools competitive with third-party commercial tools such as refworks? another area needing investigation is the visibility of links to free full text in google scholar. pitol and de groote found that 70 percent of the items in their study had at least one free full-text version available through a “hidden” google scholar version (2014, 603), and this author’s work on this review article indicates this problem still exists, but to what extent? also, when free full text exists in multiple repositories (e.g., researchgate, digital commons, academia.edu), which are the most trustworthy and practically useful for scholars? librarians should discuss the answers to these questions and be ready to provide expert advice to users.
23 google scholar’s blog notes that in january 2016, a change was made so “scholar now automatically identifies queries that are likely to be looking for a specific paper”; technically speaking, “it tries hard to find the intended paper and a version that that particular user is able to read” (https://scholar.googleblog.com/).
conclusion
with so many users opting to use academic web search engines for research, librarians need to investigate the performance of microsoft academic, google books, and of google scholar for the arts and humanities, and to re-think library services and collections in light of these tools’ strengths and limitations. the evolution of web indexing and increasing free access to full text should be monitored in conjunction with library collection development. to remain relevant to
modern researchers, librarians should continue to strengthen their knowledge of and expertise with public academic web search engines, full-text repositories, and scholarly networks.
bibliography
abrizah, a., and mike thelwall. 2014. "can the impact of nonwestern academic books be measured? an investigation of google books and google scholar for malaysia." journal of the association for information science & technology 65 (12): 2498-2508. https://doi.org/10.1002/asi.23145. albarillo, frans. 2016. "evaluating language functionality in library databases." international information & library review 48 (1): 1-10. https://doi.org/10.1080/10572317.2016.1146036. antell, karen, molly strothmann, xiaotian chen, and kevin o’kelly. 2013. "cross-examining google scholar." reference & user services quarterly 52 (4): 279-282. https://doi.org/10.5860/rusq.52n4.279. asher, andrew d., lynda m. duke, and suzanne wilson. 2012. "paths of discovery: comparing the search effectiveness of ebsco discovery service, summon, google scholar, and conventional library resources." college & research libraries 74 (5): 464-488. https://doi.org/10.5860/crl374. askey, dale, and kenning arlitsch. 2015. "heeding the signals: applying web best practices when google recommends." journal of library administration 55 (1): 49-59. https://doi.org/10.1080/01930826.2014.978685. authors guild. "authors guild v. google." accessed january 1, 2016, https://www.authorsguild.org/where-we-stand/authors-guild-v-google/. bartol, tomaž, and maria mackiewicz-talarczyk. 2015. "bibliometric analysis of publishing trends in fiber crops in google scholar, scopus, and web of science." journal of natural fibers 12 (6): 531. https://doi.org/10.1080/15440478.2014.972000. boeker, martin, werner vach, and edith motschall. 2013.
"google scholar as replacement for systematic literature searches: good relative recall and precision are not enough." bmc medical research methodology 13 (1): 1. bonato, sarah. 2016. "google scholar and scopus for finding gray literature publications." journal of the medical library association 104 (3): 252-254. https://doi.org/10.3163/15365050.104.3.021. bornmann, lutz, andreas thor, werner marx, and hermann schier. 2016. "the application of bibliometrics to research evaluation in the humanities and social sciences: an exploratory study using normalized google scholar data for the publications of a research institute." information technology and libraries | june 2017 39 journal of the association for information science & technology 67 (11): 2778-2789. https://doi.org/10.1002/asi.23627. boumenot, diane. "printing a book from google books." one rhode island family. last modified december 3, 2015, accessed january 1, 2017. https://onerhodeislandfamily.com/2015/12/03/printing-a-book-from-google-books/. bøyum, idunn, and svanhild aabø. 2015. "the information practices of business phd students." new library world 116 (3): 187-200. https://doi.org/10.1108/nlw-06-2014-0073. bramer, wichor m., dean giustini, and bianca m. r. kramer. 2016. "comparing the coverage, recall, and precision of searches for 120 systematic reviews in embase, medline, and google scholar: a prospective study." systematic reviews 5(39):1-7. https://doi.org/10.1186/s13643-016-0215-7. cals, j. w., and d. kotz. 2016. "literature review in biomedical research: useful search engines beyond pubmed." journal of clinical epidemiology 71: 115-117. https://doi.org/10.1016/j.jclinepi.2015.10.012. carlson, scott. 2006. "challenging google, microsoft unveils a search tool for scholarly articles." chronicle of higher education 52 (33). cavacini, antonio. 2015. "what is the best database for computer science journal articles?" scientometrics 102 (3): 2059-2071. https://doi.org/10.1007/s11192-014-1506-1. 
chen, xiaotian. 2012. "google books and worldcat: a comparison of their content." online information review 36 (4): 507-516. https://doi.org/10.1108/14684521211254031. ———. 2014. "open access in 2013: reaching the 50% milestone." serials review 40 (1): 21-27. https://doi.org/10.1080/00987913.2014.895556. choong, miew keen, filippo galgani, adam g. dunn, and guy tsafnat. 2014. "automatic evidence retrieval for systematic reviews." journal of medical internet research 16 (10): 1-1. https://doi.org/10.2196/jmir.3369. ciccone, karen, and john vickery. 2015. "summon, ebsco discovery service, and google scholar: a comparison of search performance using user queries." evidence based library & information practice 10 (1): 34-49. https://ejournals.library.ualberta.ca/index.php/eblip/article/view/23845. conrad, lettie y., elisabeth leonard, and mary m. somerville. 2015. "new pathways in scholarly discovery: understanding the next generation of researcher tools." paper presented at the association of college and research libraries annual conference, march 25-27, portland, or. https://pdfs.semanticscholar.org/3cb1/315476ccf9b443c01eb9b1d175ae3b0a5b4e.pdf. dagienė, eleonora, and danutė krapavickaitė. 2016. "how researchers manage their academic activities." learned publishing 29 (3): 155-163. https://doi.org/10.1002/leap.1030. davis, philip m., and angela cochran. 2015. "cited half-life of the journal literature." arxiv preprint arxiv:1504.07479. https://arxiv.org/abs/1504.07479. delgado lópez-cózar, emilio, nicolás robinson-garcía, and daniel torres-salinas. 2014. "the google scholar experiment: how to index false papers and manipulate bibliometric indicators." journal of the association for information science & technology 65 (3): 446-454. https://doi.org/10.1002/asi.23056. erb, brian, and rob sica. 2015.
"flagship database for literature searching or helpful auxiliary?" charleston advisor 17 (2): 47-50. https://doi.org/10.5260/chara.17.2.47. fagan, jody condit, and david gaines. 2016. "take charge of eds: vet your content." presentation to the ebsco users' group, boston, ma, may 10-11. gehanno, jean-françois, laetitia rollin, and stefan darmoni. 2013. "is the coverage of google scholar enough to be used alone for systematic reviews." bmc medical informatics and decision making 13 (1): 1. https://doi.org/10.1186/1472-6947-13-7. georgas, helen. 2015. "google vs. the library (part iii): assessing the quality of sources found by undergraduates." portal: libraries and the academy 15 (1): 133-161. https://doi.org/10.1353/pla.2015.0012. giustini, dean, and maged n. kamel boulos. 2013. "google scholar is not enough to be used alone for systematic reviews." online journal of public health informatics 5 (2). https://doi.org/10.5210/ojphi.v5i2.4623. gray, jerry e., michelle c. hamilton, alexandra hauser, margaret m. janz, justin p. peters, and fiona taggart. 2012. "scholarish: google scholar and its value to the sciences." issues in science and technology librarianship 70 (summer). https://doi.org/10.1002/asi.21372/full. haddaway, neal r. 2015. "the use of web-scraping software in searching for grey literature." grey journal 11 (3): 186-190. haddaway, neal robert, alexandra mary collins, deborah coughlin, and stuart kirk. 2015. "the role of google scholar in evidence reviews and its applicability to grey literature searching." plos one 10 (9): e0138237. https://doi.org/10.1371/journal.pone.0138237. hammarfelt, björn. 2014. "using altmetrics for assessing research impact in the humanities." scientometrics 101 (2): 1419-1430. https://doi.org/10.1007/s11192-014-1261-3. hands, africa. 2012. "microsoft academic search – http://academic.research.microsoft.com." technical services quarterly 29 (3): 251-252. https://doi.org/10.1080/07317131.2012.682026.
harper, sarah fletcher. 2016. "google books review." journal of electronic resources in medical libraries 13 (1): 2-7. https://doi.org/10.1080/15424065.2016.1142835. harzing, anne-wil. 2013. "a preliminary test of google scholar as a source for citation data: a longitudinal study of nobel prize winners." scientometrics 94 (3): 1057-1075. https://doi.org/10.1007/s11192-012-0777-7. ———. 2014. "a longitudinal study of google scholar coverage between 2012 and 2013." scientometrics 98 (1): 565-575. https://doi.org/10.1007/s11192-013-0975-y. ———. 2016a. publish or perish. vol. 5. http://www.harzing.com/resources/publish-or-perish. ———. 2016b. "microsoft academic (search): a phoenix arisen from the ashes?" scientometrics 108 (3): 1637-1647. https://doi.org/10.1007/s11192-016-2026-y. harzing, anne-wil, and satu alakangas. 2016a. "microsoft academic: is the phoenix getting wings?" scientometrics: 1-13. harzing, anne-wil, and satu alakangas. 2016b. "google scholar, scopus and the web of science: a longitudinal and cross-disciplinary comparison." scientometrics 106 (2): 787-804. https://doi.org/10.1007/s11192-015-1798-9. herrera, gail. 2011. "google scholar users and user behaviors: an exploratory study." college & research libraries 72 (4): 316-331. https://doi.org/10.5860/crl-125rl. higgins, julian, and s. green, eds. 2011. cochrane handbook for systematic reviews of interventions. version 5.1.0 ed.: the cochrane collaboration. http://handbook.cochrane.org/. hilbert, fee, julia barth, julia gremm, daniel gros, jessica haiter, maria henkel, wilhelm reinhardt, and wolfgang g. stock. 2015. "coverage of academic citation databases compared with coverage of scientific social media." online information review 39 (2): 255-264. https://doi.org/10.1108/oir-07-2014-0159. hoffmann, anna lauren. 2014. "google books as infrastructure of in/justice: towards a sociotechnical account of rawlsian justice, information, and technology."
theses and dissertations. paper 530. http://dc.uwm.edu/etd/530/. ———. 2016. "google books, libraries, and self-respect: information justice beyond distributions." the library 86 (1). https://doi.org/10.1086/684141. horrigan, john b. "lifelong learning and technology." pew research center, last modified march 22, 2016, accessed february 7, 2017, http://www.pewinternet.org/2016/03/22/lifelong-learning-and-technology/. hug, sven e., michael ochsner, and martin p. braendle. 2016. "citation analysis with microsoft academic." arxiv preprint arxiv:1609.05354. https://arxiv.org/abs/1609.05354. huistra, hieke, and bram mellink. 2016. "phrasing history: selecting sources in digital repositories." historical methods: a journal of quantitative and interdisciplinary history 49 (4): 220-229. https://doi.org/10.1093/llc/fqw002. inger, simon, and tracy gardner. 2016. "how readers discover content in scholarly publications." information services & use 36 (1): 81-97. https://doi.org/10.3233/isu-160800. jackson, joab. 2010. "google: 129 million different books have been published." pc world, august 6, 2010. http://www.pcworld.com/article/202803/google_129_million_different_books_have_been_published.html. jacsó, p. 2008. "live search academic." peter’s digital reference shelf, april. jacsó, péter. 2011. "the pros and cons of microsoft academic search from a bibliometric perspective." online information review 35 (6): 983-997. https://doi.org/10.1108/14684521111210788. jamali, hamid r., and majid nabavi. 2015. "open access and sources of full-text articles in google scholar in different subject fields." scientometrics 105 (3): 1635-1651. https://doi.org/10.1007/s11192-015-1642-2. johnson, paula c., and jennifer e. simonsen. 2015. "do engineering master's students know what they don't know?" library review 64 (1): 36-57. https://doi.org/10.1108/lr-05-2014-0052.
jones, edgar. 2010. "google books as a general research collection." library resources & technical services 54 (2): 77-89. https://doi.org/10.5860/lrts.54n2.77. karlsson, niklas. 2014. "the crossroads of academic electronic availability: how well does google scholar measure up against a university-based metadata system in 2014?" current science 107 (10): 1661-1665. http://www.currentscience.ac.in/volumes/107/10/1661.pdf. kemman, max, martijn kleppe, and stef scagliola. 2013. "just google it-digital research practices of humanities scholars." arxiv preprint arxiv:1309.2434. https://arxiv.org/abs/1309.2434. khabsa, madian, and c. lee giles. 2014. "the number of scholarly documents on the public web." plos one 9 (5). https://doi.org/10.1371/journal.pone.0093949. kirkwood jr., hal, and monica c. kirkwood. 2011. "historical research." online 35 (4): 28-32. koler-povh, teja, primož južnic, and goran turk. 2014. "impact of open access on citation of scholarly publications in the field of civil engineering." scientometrics 98 (2): 1033-1045. https://doi.org/10.1007/s11192-013-1101-x. kousha, kayvan, mike thelwall, and somayeh rezaie. 2011. "assessing the citation impact of books: the role of google books, google scholar, and scopus." journal of the american society for information science and technology 62 (11): 2147-2164. https://doi.org/10.1002/asi.21608. kousha, kayvan, and mike thelwall. 2017. "are wikipedia citations important evidence of the impact of scholarly articles and books?" journal of the association for information science and technology 68 (3): 762-779. https://doi.org/10.1002/asi.23694. kousha, kayvan, and mike thelwall. 2015. "an automatic method for extracting citations from google books." journal of the association for information science & technology 66 (2): 309-320. https://doi.org/10.1002/asi.23170. lee, jongwook, gary burnett, micah vandegrift, hoon baeg jung, and richard morris. 2015.
"availability and accessibility in an open access institutional repository: a case study." information research 20 (1): 334-349. levay, paul, nicola ainsworth, rachel kettle, and antony morgan. 2016. "identifying evidence for public health guidance: a comparison of citation searching with web of science and google scholar." research synthesis methods 7 (1): 34-45. https://doi.org/10.1002/jrsm.1158. levy, steven. "making the world’s problem solvers 10% more efficient." backchannel. last modified october 17, 2014, accessed january 14, 2016, https://medium.com/backchannel/the-gentleman-who-made-scholar-d71289d9a82d. los angeles times. 2016. "google, books and 'fair use'." los angeles times, april 19, 2016. http://www.latimes.com/opinion/editorials/la-ed-google-book-search-20160419-story.html. martin, kim, and anabel quan-haase. 2016. "the role of agency in historians’ experiences of serendipity in physical and digital information environments." journal of documentation 72 (6): 1008-1026. https://doi.org/10.1108/jd-11-2015-0144. martín-martín, alberto, juan manuel ayllón, enrique orduña-malea, and emilio delgado lópez-cózar. 2016a. "2016 google scholar metrics released: a matter of languages... and something else." arxiv preprint arxiv:1607.06260. https://arxiv.org/abs/1607.06260. martín-martín, alberto, enrique orduña-malea, juan m. ayllón, and emilio delgado lópez-cózar. 2016b. "the counting house: measuring those who count. presence of bibliometrics, scientometrics, informetrics, webometrics and altmetrics in the google scholar citations, researcherid, researchgate, mendeley & twitter." arxiv preprint arxiv:1602.02412. https://arxiv.org/abs/1602.02412. martín-martín, alberto, enrique orduña-malea, juan manuel ayllón, and emilio delgado lópez-cózar. 2014. "does google scholar contain all highly cited documents (1950-2013)?" arxiv preprint arxiv:1410.8464. https://arxiv.org/abs/1410.8464.
martín-martín, alberto, enrique orduña-malea, juan ayllón, and emilio delgado lópez-cózar. 2016c. "back to the past: on the shoulders of an academic search engine giant." scientometrics 107 (3): 1477-1487. https://doi.org/10.1007/s11192-016-1917-2. martín-martín, alberto, enrique orduña-malea, anne-wil harzing, and emilio delgado lópez-cózar. 2017. "can we use google scholar to identify highly-cited documents?" journal of informetrics 11 (1): 152-163. https://doi.org/10.1016/j.joi.2016.11.008. mays, dorothy a. 2015. "google books: far more than just books." public libraries 54 (5): 23-26. http://publiclibrariesonline.org/2015/10/far-more-than-just-books/. meier, john j., and thomas w. conkling. 2008. "google scholar’s coverage of the engineering literature: an empirical study." the journal of academic librarianship 34 (3): 196-201. https://doi.org/10.1016/j.acalib.2008.03.002. moed, henk f., judit bar-ilan, and gali halevi. 2016. "a new methodology for comparing google scholar and scopus." arxiv preprint arxiv:1512.05741. https://arxiv.org/abs/1512.05741. namei, elizabeth, and christal a. young. 2015. "measuring our relevancy: comparing results in a web-scale discovery tool, google & google scholar." paper presented at the association of college and research libraries annual conference, march 25-27, portland, or. http://www.ala.org/acrl/sites/ala.org.acrl/files/content/conferences/confsandpreconfs/2015/namei_young.pdf. national institute for health and care excellence (nice). "developing nice guidelines: the manual." last modified april 2016, accessed november 27, 2016. https://www.nice.org.uk/process/pmg20. neuhaus, chris, ellen neuhaus, alan asher, and clint wrede. 2006. "the depth and breadth of google scholar: an empirical study." portal: libraries and the academy 6 (2): 127-141. https://doi.org/10.1353/pla.2006.0026.
obrien, patrick, kenning arlitsch, leila sterman, jeff mixter, jonathan wheeler, and susan borda. 2016. "undercounting file downloads from institutional repositories." journal of library administration 56 (7): 854-874. https://doi.org/10.1080/01930826.2016.1216224. orduña-malea, enrique, and emilio delgado lópez-cózar. 2014. "google scholar metrics evolution: an analysis according to languages." scientometrics 98 (3): 2353-2367. https://doi.org/10.1007/s11192-013-1164-8. orduña-malea, enrique, and emilio delgado lópez-cózar. 2015. "the dark side of open access in google and google scholar: the case of latin-american repositories." scientometrics 102 (1): 829-846. https://doi.org/10.1007/s11192-014-1369-5. orduña-malea, enrique, alberto martín-martín, juan m. ayllon, and emilio delgado lópez-cózar. 2014. "the silent fading of an academic search engine: the case of microsoft academic search." online information review 38 (7): 936-953. https://doi.org/10.1108/oir-07-2014-0169. ortega, josé luis. 2015. "relationship between altmetric and bibliometric indicators across academic social sites: the case of csic's members." journal of informetrics 9 (1): 39-49. https://doi.org/10.1016/j.joi.2014.11.004. ortega, josé luis, and isidro f. aguillo. 2014. "microsoft academic search and google scholar citations: comparative analysis of author profiles." journal of the association for information science & technology 65 (6): 1149-1156. https://doi.org/10.1002/asi.23036. pitol, scott p., and sandra l. de groote. 2014. "google scholar versions: do more versions of an article mean greater impact?" library hi tech 32 (4): 594-611. https://doi.org/10.1108/lht-05-2014-0039. prins, ad a. m., rodrigo costas, thed n. van leeuwen, and paul f. wouters. 2016. "using google scholar in research evaluation of humanities and social science programs: a comparison with web of science data." research evaluation 25 (3): 264-270.
https://doi.org/10.1093/reseval/rvv049. quint, barbara. 2016. "find and fetch: completing the course." information today 33 (3): 17-17. rothfus, melissa, ingrid s. sketris, robyn traynor, melissa helwig, and samuel a. stewart. 2016. "measuring knowledge translation uptake using citation metrics: a case study of a pan-canadian network of pharmacoepidemiology researchers." science & technology libraries 35 (3): 228-240. https://doi.org/10.1080/0194262x.2016.1192008. ruppel, margie. 2009. "google scholar, social work abstracts (ebsco), and psycinfo (ebsco)." charleston advisor 10 (3): 5-11. shultz, m. 2007. "comparing test searches in pubmed and google scholar." journal of the medical library association: jmla 95 (4): 442-445. https://doi.org/10.3163/1536-5050.95.4.442. stansfield, claire, kelly dickson, and mukdarut bangpan. 2016. "exploring issues in the conduct of website searching and other online sources for systematic reviews: how can we be systematic?" systematic reviews 5 (1): 191. https://doi.org/10.1186/s13643-016-0371-9. ştirbu, simona, paul thirion, serge schmitz, gentiane haesbroeck, and ninfa greco. 2015. "the utility of google scholar when searching geographical literature: comparison with three commercial bibliographic databases." the journal of academic librarianship 41 (3): 322-329. https://doi.org/10.1016/j.acalib.2015.02.013. suiter, amy m., and heather lea moulaison. 2015. "supporting scholars: an analysis of academic library websites' documentation on metrics and impact." the journal of academic librarianship 41 (6): 814-820. https://doi.org/10.1016/j.acalib.2015.09.004. szpiech, ryan. 2014. "cracking the code: reflections on manuscripts in the age of digital books." digital philology: a journal of medieval cultures 3 (1): 75-100. https://doi.org/10.1353/dph.2014.0010. testa, matthew. 2016.
"availability and discoverability of open-access journals in music." music reference services quarterly 19 (1): 1-17. https://doi.org/10.1080/10588167.2016.1130386. thelwall, mike, and kayvan kousha. 2015b. "web indicators for research evaluation. part 1: citations and links to academic articles from the web." el profesional de la información 24 (5): 587-606. https://doi.org/10.3145/epi.2015.sep.08. thielen, frederick w., ghislaine van mastrigt, l. t. burgers, wichor m. bramer, marian h. j. m. majoie, sylvia m. a. a. evers, and jos kleijnen. 2016. "how to prepare a systematic review of economic evaluations for clinical practice guidelines: database selection and search strategy development (part 2/3)." expert review of pharmacoeconomics & outcomes research: 1-17. https://doi.org/10.1080/14737167.2016.1246962. trapp, jamie. 2016. "web of science, scopus, and google scholar citation rates: a case study of medical physics and biomedical engineering: what gets cited and what doesn't?" australasian physical & engineering sciences in medicine 39 (4): 817-823. https://doi.org/10.1007/s13246-016-0478-2. van noorden, r. 2014. "online collaboration: scientists and the social network." nature 512 (7513): 126-129. https://doi.org/10.1038/512126a. varshney, lav r. 2012. "the google effect in doctoral theses." scientometrics 92 (3): 785-793. https://doi.org/10.1007/s11192-012-0654-4. verstak, alex, anurag acharya, helder suzuki, sean henderson, mikhail iakhiaev, cliff chiung yu lin, and namit shetty. 2014. "on the shoulders of giants: the growing impact of older articles." arxiv preprint arxiv:1411.0275. https://arxiv.org/abs/1411.0275. walsh, andrew. 2015. "beyond "good" and "bad": google as a crucial component of information literacy." in the complete guide to using google in libraries, edited by carol smallwood, 3-12. new york: rowman & littlefield. waltman, ludo. 2016. "a review of the literature on citation impact indicators." journal of informetrics 10 (2): 365-391.
https://doi.org/10.1016/j.joi.2016.02.007. ward, judit, william bejarano, and anikó dudás. 2015. "scholarly social media profiles and libraries: a review." liber quarterly 24 (4): 174-204. https://doi.org/10.18352/lq.9958. weideman, melius. 2015. "etd visibility: a study on the exposure of indian etds to the google scholar crawler." paper presented at etd 2015: 18th international symposium on electronic theses and dissertations, new delhi, india, november 4-6. http://www.web-visibility.co.za/0168-conference-paper-2015-weideman-etd-theses-dissertation-india-googlescholar-crawler.pdf. weiss, andrew. 2016. "examining massive digital libraries (mdls) and their impact on reference services." reference librarian 57 (4): 286-306. https://doi.org/10.1080/02763877.2016.1145614. whitmer, susan. 2015. "google books: shamed by snobs, a resource for the rest of us." in the complete guide to using google in libraries, edited by carol smallwood, 241-250. new york: rowman & littlefield. wildgaard, lorna. 2015. "a comparison of 17 author-level bibliometric indicators for researchers in astronomy, environmental science, philosophy and public health in web of science and google scholar." scientometrics 104 (3): 873-906. https://doi.org/10.1007/s11192-015-1608-4. winter, joost, amir zadpoor, and dimitra dodou. 2014. "the expansion of google scholar versus web of science: a longitudinal study." scientometrics 98 (2): 1547-1565. https://doi.org/10.1007/s11192-013-1089-2. wu, tim. 2015. "whatever happened to google books?" the new yorker, september 11, 2015. wu, ming-der, and shih-chuan chen. 2014. "graduate students appreciate google scholar, but still find use for libraries." electronic library 32 (3): 375-389. https://doi.org/10.1108/el-08-2012-0102. yang, le. 2016. "making search engines notice: an exploratory study on discoverability of dspace metadata and pdf files." journal of web librarianship 10 (3): 147-160.
https://doi.org/10.1080/19322909.2016.1172539.

the “black box”: how students use a single search box to search for music materials
kirstin dougan
information technology and libraries | december 2018 81

kirstin dougan (dougan@illinois.edu) is head, music and performing arts library, university of illinois.

abstract

given the inherent challenges music materials present to systems and searchers (formats, title forms and languages, and the presence of additional metadata such as work numbers and keys), it is reasonable that those searching for music develop distinctive search habits compared to patrons in other subject areas. this study uses transaction log analysis of the music and performing arts module of a library’s federated discovery tool to determine how patrons search for music materials. it also makes a top-level comparison of searches done using other broadly defined subject disciplines’ modules in the same discovery tool. it seeks to determine, to the extent possible, whether users in each group have different search behaviors in this search environment. the study also looks more closely at searches in the music module to identify other search characteristics such as type of search conducted, use of advanced search techniques, and any other patterns of search behavior.

introduction

music materials have inherent qualities that present difficulties to the library systems that describe them and to the searchers who wish to find them. this can be exemplified in three main areas: formats, titles, and relationships. first, printed music comes in multiple formats, such as full scores, vocal scores, study scores, and parts, and in multiple editions, such as facsimiles, scholarly editions, and performing editions (of various caliber), with each format and edition serving a different purpose or need. related to this, but less problematic, is the variety of sound recording formats available.
second, issues resulting from titling practices abound in music, ranging from frequent use of foreign terms, not just in descriptive titles (l'oiseau de feu = zhar-ptitsa = the firebird = feuervogel), but in generic titles as translated by various publishers from different countries (symphony = sinfonie). additionally, musical works often have only generic genre titles enhanced by key and work number metadata, for example symphony no. 1 in c minor. third, music materials present a relationship issue best defined as “one-to-many.” musical works often have multiple sections or songs in them (an aria in an opera or a movement in a symphony), and a cd or a score anthology may contain multiple pieces of music.

given these three main challenges presented by music materials, it is possible that those searching for music develop distinctive search habits compared to patrons in other subject areas. this study uses transaction log analysis of the music and performing arts module of a library’s federated discovery tool to determine how patrons search for music materials. it also makes a top-level comparison of searches done using other broadly defined subject disciplines’ modules in the same discovery tool. it seeks to determine, to the extent possible, whether users in each group have different search behaviors in this search environment. the study also looks more closely at searches in the music module to identify other search characteristics such as type of search conducted, use of advanced search techniques, and any other patterns of search behavior.

the “black box” | dougan 82 https://doi.org/10.6017/ital.v37i4.10702

background

since fall 2007 the university of illinois library has had easy search (es), a locally developed search tool designed to aid users in finding results from multiple catalog, a&i, and reference targets quickly and simultaneously.
there is a “main” es on the library’s main gateway page that searches a variety of cross-disciplinary tools (see figure 1).

figure 1. gateway easy search.

on the gateway, users have the option of selecting one of the format tabs to narrow their search to books, articles, journals, or media. when the data for this study was gathered, the journals tab was not present. starting in 2010 many of the subject and branch libraries in the university library created their own es modules with target resources specific to the disciplinary areas they serve. search boxes for these es subject modules are often displayed right on the branch library’s home page, but users can also select these subject module options from the dropdown in the main es (see figure 2).

figure 2. gateway dropdown subject choices.

the mpal es interface, which was created in 2011, can be seen in figure 3 as it appears on the mpal homepage.

figure 3. mpal easy search interface.

es is a federated search tool and does not have a central index like most current discovery layer tools. rather, it utilizes broadcast search technologies to target different tools and search them directly. while the gateway es now uses a “bento box” layout to display selected citations from each target, in the first iterations of the tool and still today in the subject modules, users are simply presented with a list of hit counts in each of the target tools (see figures 4 and 5).

figure 4. mpal easy search display screen part 1.

figure 5. mpal easy search results screen part 2.

not shown in the screen captures are the results from various newspaper indexes and reference sources such as oxford music online and the international encyclopedia of dance.
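
the broadcast approach described above can be illustrated with a minimal sketch. it is not the easy search implementation, whose internals the article does not describe; the target names, the sample records, and the helpers make_fake_target and broadcast_search are all hypothetical, and each "target" is an in-memory stand-in for what would be a separate remote catalog, index, or reference source. the sketch only shows the general pattern: one query is sent to every target at the same time, with no central index, and the user gets back a hit count per target.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Optional


def make_fake_target(records: List[str]) -> Callable[[str], int]:
    """Build a hypothetical target: counts records containing every query term."""
    def search(query: str) -> int:
        terms = query.lower().split()
        return sum(all(term in record.lower() for term in terms) for record in records)
    return search


# Hypothetical targets; a real tool would query remote systems over the network.
TARGETS: Dict[str, Callable[[str], int]] = {
    "library catalog": make_fake_target([
        "the firebird: full score",
        "symphony no. 1 in c minor: study score",
        "the firebird suite (sound recording)",
    ]),
    "article index": make_fake_target([
        "stravinsky and the firebird",
        "analyzing the romantic symphony",
    ]),
    "reference source": make_fake_target([
        "firebird, the (ballet)",
    ]),
}


def broadcast_search(
    query: str,
    targets: Optional[Dict[str, Callable[[str], int]]] = None,
    timeout: float = 5.0,
) -> Dict[str, Optional[int]]:
    """Send one query to every target concurrently and collect hit counts.

    A target that fails or times out is reported as None rather than
    aborting the whole search.
    """
    targets = TARGETS if targets is None else targets
    results: Dict[str, Optional[int]] = {}
    with ThreadPoolExecutor(max_workers=max(1, len(targets))) as pool:
        futures = {name: pool.submit(fn, query) for name, fn in targets.items()}
        for name, future in futures.items():
            try:
                results[name] = future.result(timeout=timeout)
            except Exception:
                results[name] = None
    return results


if __name__ == "__main__":
    for name, hits in broadcast_search("firebird").items():
        print(f"{name}: {hits if hits is not None else 'unavailable'}")
```

because each target is searched directly, the slowest or least reliable target shapes the overall response, which is why the sketch reports a failed target as unavailable instead of failing the whole search.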
literature review

many studies have examined patron search behavior using transaction log analysis and other methods over the past few decades. since the appearance of google in 1998, and its vast impact on individuals’ expectations and search behavior, recent studies have looked at user search behavior in tools that initially present a single search box. additional studies have looked at discipline-specific searching behaviors.

general search studies and single search boxes

many advantages and disadvantages have been ascribed to tools with single search boxes (whether federated search tools or discovery layers), namely ease and convenience on the one hand, and the lack of precision possible in searching and overwhelming number of results on the other. two companion articles, boyd et al. and dyas-correira, written ten years apart, attempted to visit and revisit these issues.1 results and patron satisfaction can vary based on size of library and number of resources accessed by these tools. these types of tools will never be able to search and display everything and that problem is magnified by the number of resources a library has. holman, porter, and zimerman discovered in independent studies that undergraduates do not search very efficiently or effectively and find library tools difficult to use.2 avery and tracy also found this true for the es tool under discussion in this study: the generation of keywords by many students indicates they often struggled to identify alternative terminology that may have resulted in a more successful search . . . .
many students exhibited persistence in their searching, but the selection of search terms, sometimes compounded by spelling problems or problems in search string structure, likely did not yield the most relevant results.3 asher, duke, and wilson state in their study comparing student search strategies and success across a variety of library search tools and google that there were “strong patterns in the way students approached searches no matter which tool they used. . . . students treated almost every search box like a google search box, using simple keyword searches in 81.5 percent (679/829) of the searches observed.”4 dempsey and valenti note students’ infrequent use of limiters such as “peer-review” and “date” in eds, the high non-use and misuse rates of quotation marks, relatively low instances of repeated uncorrected spelling errors, and variance patterns in keyword usage.5 students like federated search tools and discovery layers because of their convenience and ease, as found in studies by armstrong, belliston, and williams et al.6 this is reiterated in asher et al., “despite the fact that they did not necessarily perform better on the research tasks in this study, students did prefer google and google-like discovery tools because they felt that they could get to full-text resources more quickly.”7 this one-box approach could hinder students, as described by swanson and green: the search box became an obstacle in other questions where it should not have been used. in some cases, the search box was viewed as an all-encompassing search of the entire site. several students searched for administrative information, research guides, and podcasts in this box.8 lown et al. also found that users hope to access a vast range of information via a single search box. “one lesson is that library search is about more than articles and the catalog.
about 23 percent of use of quicksearch took place outside either the catalog or articles modules, indicating that ncsu library users attempt to access a wide range of information from the single search box.”9

search and library use in different disciplines

in their study comparing a discovery layer and subject-specific tools, dahlen and hanson found “subject-specific indexing and abstracting databases still play an important role for libraries that have adopted discovery layers. discovery layers and subject-specific indexing and abstracting databases have different strengths and can complement each other within the suite of library resources.”10 they also observed things iterated by previous authors, chiefly that “not all students prefer discovery tools” and “the tools that students prefer may not be those that give them the best results.”11 in addition, they found that default configuration matters in terms of students’ success in and preference for a given tool. fu and thomes found that creating smaller discipline-specific subsets in discovery tools was beneficial to searchers by reducing results and increasing the results’ relevance.12

few studies investigate how music students search for music materials.
dougan found in her observational study of music students’ search behaviors that they have difficulty forming good searches; misuse quotation marks and other search elements; and at times struggle with finding music materials.13 mayer noted upper-class music students’ frustration with using library tools to find specific works of music, going so far as to state, “the music students agreed that both the discovery layer and the catalog are not effective for music-related searching, for any format.”14 clark and yeager found that students had an easier time searching for media items than music scores, and frequently struggled with search strategy revisions.15

there is more research on the larger information needs of disciplines and creating models for research behavior, and not necessarily specific search processes or constructions.16 whitmire, in her 2002 pre-google article, found that students majoring in the social sciences were engaged in information-seeking behaviors at a higher rate than students majoring in engineering.17 chrzastowski and joseph surveyed graduate students at the university of illinois at urbana–champaign and found that those in the life sciences, physical sciences, and engineering visited the libraries less often than students in other academic disciplines.18 students in the arts and humanities used the library more often than students in other disciplines. collins and stone report that although prior studies of users across different disciplines found that arts and humanities users do not account for the biggest users of library materials, their own survey found the opposite to be true.19 when looking at the various student populations in their study, musicians had the highest library usage in terms of items borrowed and almost the highest number of library visits.
music users in the study also showed high numbers of hours logged into the library e-resources and highest number of e-resources accessed compared to others in their discipline group (but not as much as other disciplines). however, they show a low number of pdf downloads and low number of e-resources accessed frequently.

information technology and libraries | december 2018 87

methodology

this study conducted quantitative analysis of easy search (es) data as a whole and from a selection of the subject modules, including the music and performing arts library (mpal) module, using data from the period june 20, 2014 through june 16, 2015. additional quantitative and qualitative analysis was conducted only on the mpal es transaction log data. data from the following subject modules were included in comparative analyses:

• funk agricultural, consumer and environmental sciences library (aces) (http://www.library.illinois.edu/funkaces/)
• grainger engineering library (http://search.grainger.illinois.edu/top/)
• history, philosophy, and newspaper library (hpnl) (http://www.library.illinois.edu/hpnl/)
• music and performing arts library (mpal) (http://www.library.illinois.edu/mpal/)
• social science, health, and education library (sshel) (http://www.library.illinois.edu/sshel/)
• undergraduate library (ugl) (http://www.library.illinois.edu/ugl/)

each of these libraries has a search box for es on its home page that is customized to the search targets identified as best for those subject areas by the subject librarians in that library. transaction log data on searches done in es is continuously compiled in a sql database and queries were written to determine certain quantitative measures. searches done in these various subject modules were isolated by a variable in the sql data that indicates whether the search was done in the main gateway es, in the main gateway es but using one of the subject dropdown choices, or from the subject es box directly from that library’s homepage.
searches in the six subject modules listed above and in the main es were assessed for the average number of searches per session and the average words per search.

further analysis of searches done in the mpal es module used 25,503 sessions conducted on mpal public computers from march 21, 2014 to june 21, 2015, which is a slightly longer timespan than used for the comparative analysis between subject es modules described above. to make this more manageable, only every tenth session was considered, meaning 2,550 sessions were analyzed out of the full set of mpal data. searches were sorted by session id number, which is assigned to each session when a new session is begun. this method kept all strings from one session together, whereas simply sorting by date and string id did not, since multiple sessions can occur simultaneously. a session is a series of user actions (searches and click-throughs) from the same workstation in which there is less than a twenty-minute pause between actions. if there are user actions from the same workstation after a twenty-minute pause, a new session is established. it is therefore possible that some of the sequential sessions were from the same user, but there is no easy way to determine that.
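the session rule and the one-in-ten sample described above can be sketched in python. this is a minimal illustration, not the study's actual sql queries or scripts: the (workstation, timestamp) tuple layout, the function names `assign_sessions` and `every_tenth`, and the choice to begin the sample at the tenth session are all assumptions made for the example.

```python
from datetime import datetime, timedelta

# Pause that closes a session, per the definition in the text.
SESSION_GAP = timedelta(minutes=20)


def assign_sessions(actions):
    """Label each action with a session id.

    `actions` is an iterable of (workstation_id, timestamp) pairs. This
    layout is hypothetical and is not the real es log schema.
    Returns a list of (workstation_id, timestamp, session_id) tuples in
    chronological order.
    """
    last_seen = {}   # workstation -> timestamp of its previous action
    current = {}     # workstation -> session id currently open there
    next_id = 0
    labeled = []
    for workstation, ts in sorted(actions, key=lambda a: a[1]):
        previous = last_seen.get(workstation)
        # A first action, or a pause of twenty minutes or more, opens a new session.
        if previous is None or ts - previous >= SESSION_GAP:
            next_id += 1
            current[workstation] = next_id
        last_seen[workstation] = ts
        labeled.append((workstation, ts, current[workstation]))
    return labeled


def every_tenth(session_ids):
    """Systematic sample: the 10th, 20th, 30th, ... session id in sorted order."""
    ordered = sorted(set(session_ids))
    return ordered[9::10]
```

because sessions on different workstations can overlap in time, the sketch tracks the last action per workstation rather than relying on global ordering. with 25,503 session ids, sampling from the tenth id onward yields 2,550 sessions, which matches the sample size reported here; whether the study began its sample at the first or the tenth session is not stated.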
the mpal data set was assessed using the following quantitative measures:

1) average number of searches per session and whether session contained
   a) a single search
   b) multiple searches for the same thing (either repeated exactly or varied)
   c) multiple strings searching for multiple things
2) average number of search terms per search
3) type of search by index (title/author/keyword) or other advanced search
4) use of boolean, quotation marks, parentheses, etc.
5) use of work or opus numbers or key indications
6) search indicating format (score, cd, etc.)

findings

comparing the data for searches done in the main es to some of the subject modules (see table 1) shows that the ugl es and the hpnl es have the fewest average searches per session, and the mpal es has the third highest average number of searches per session. the sciences tend to have higher average words per search string values, while mpal has the second lowest average number of words per search. this is not surprising given that the sciences tend to use a lot of journal literature and it is common for researchers to copy and paste such citations into es. in music, by contrast, as we will see later, keyword searches tend to focus on combinations of the composer’s name and words from the work title, occasionally with other terms added.

source | sessions | searches | average searches per session | average words per search
all es searches | 599,482 | 1,340,159 | 2.121 | 5.082 (note 20)
gateway only | | | |
gateway everything tab | 382,040 | 757,862 | 1.9837 | 5.255
gateway books tab | 71,007 | 136,724 | 1.9255 | 4.048
gateway articles tab | 57,169 | 107,893 | 1.887 | 6.35
gateway total | | 1,002,479 | |
all subject modules | | | |
departmental searches (incl. those from gateway dropdown) | 75,035 | 214,364 | 1.9288 |
searches done directly from subject library pages | | 144,283 | |
select subject modules (note 21) | | | |
agricultural, consumer and environmental sciences library | 2,732 | 5,221 | 1.911 | 4.07
engineering library | 32,018 | 68,146 | 2.128 | 5.092
history, philosophy, and newspaper library | 1,264 | 1,985 | 1.57 | 3.09
music and performing arts library (mpal) | 21,047 | 41,590 | 1.976 | 3.375
mpal data from march 21, 2014 to june 21, 2015 | 25,503 | 49,702 | 1.949 | 3.349
social science, health, and education library | 9,458 | 19,760 | 2.089 | 4.906
undergraduate library | 26,988 | 44,588 | 1.65 | 3.909

table 1. comparative search data from june 20, 2014 to june 16, 2015 (unless otherwise noted).

average number (and range) of searches per session

in looking at the searches done directly from the mpal homepage and from the gateway dropdown from march 21, 2014 through june 21, 2015, there were 25,503 sessions conducted in the mpal es that contained a total of 49,702 searches, resulting in an average of 1.949 searches per session. of the 2,550 mpal search sessions in the study sample, the majority (63.2 percent) consisted of one search.22 this means the patron conducted one search and then left es, presumably clicking into the library catalog or another tool that is a target in es to complete their research. sessions consisting of two to four searches account for 31 percent of sessions, while sessions involving five to nine searches only account for 5 percent of total sessions, and only 32 sessions, or fewer than 1 percent, consist of ten or more searches (see table 2).

searches per session | number of sessions | searches per session | number of sessions
1 | 1604 | 12 | 7
2 | 476 | 13 | 2
3 | 191 | 14 | 3
4 | 116 | 16 | 1
5 | 51 | 17 | 2
6 | 29 | 18 | 1
7 | 22 | 19 | 2
8 | 12 | 20 | 1
9 | 15 | 23 | 1
10 | 6 | 30 | 2
11 | 6 | total sessions | 2,550

table 2. searches per session.
sessions with multiple searches (n=946) were evaluated to see whether patrons were searching multiple times for the same thing (either with the same term[s] or with different terms), or whether they were searching for different things. five sessions that were clearly not music-related were removed from the sample. each session was categorized as “same/exact,” “same/different,” or “different.” at times, sessions might include several searches for the same thing using altered strings, in addition to searches for other things. those sessions were coded as “different.” for example:

crumb zodiac crumb georgy crumb georgy cromb korean music

there were 478 multi-search sessions (50.6 percent) in which patrons searched for different things within their session, 391 sessions (41.3 percent) in which patrons looked for the same thing with differing search strings, and 71 (7.5 percent) in which patrons reiterated the exact same search in each attempt. in the 71 sessions in which patrons used the same exact search multiple times, they averaged 2.25 searches. those sessions tagged as “same/exact” provide an opportunity to try to determine why patrons repeat the same search. common themes include: using too broad a search, searching in wrong place (non-performing-arts–related search), or repeatedly typing in the wrong info (e.g., typos or other errors) and not realizing their mistake.

in the 391 sessions in which patrons spent their session searching for the same thing with different search strings, they did so with an average of 2.96 searches. often the variation in the search string was a change in spelling or a minor change in the terms, but sometimes it involved the addition or subtraction of terms, such as starting with morley fitzewilliam virginalists and going to morley fitzewilliam.
in another example, we see how music metadata can prove challenging for searchers to format, such as when a patron started with schumann op.68 (without the necessary space between op. and 68), then progressed to album for the young, and finally schumann album for the young.

in the 478 sessions in which patrons searched for completely different things within their session, they did so with an average of 4.08 searches per session. in many cases, although the searches were for different items, they were related in some way, either by genre, instrument, or some other element, such as in this example:

microjazz color me jazz jamey aebersold play-along vandall jazz jazz piano pieces

but sometimes the searches were for very different things:

debussy voiles composition as problem mart humal composition as problem debussy ursatz

average number (and range) of terms per search

in looking at the approximately 4,900 searches included in the sample of 2,550 mpal sessions, without removing the small percentage of duplicate searches, two-term searches are the most common, followed by three-term searches; together they account for more than half of the searches (55.3 percent). one- and four-term searches are the next most common, together accounting for 25.5 percent of searches (see table 3). in 2012, regular es single-term searches were at almost 60 percent.23

number of terms in search string | instances | percentage (%)
1 | 605 | 12.4
2 | 1,559 | 31.8
3 | 1,149 | 23.5
4 | 642 | 13.1
5 | 400 | 8.2
6 | 196 | 4.0
7 | 100 | 2.0
8-15 | 216 | 4.5
16-57 | 29 | 0.6

table 3. words per search string.

longer search strings (8-15 terms) ranged from 74 to ten examples each, respectively, while searches with 16 to 20 terms ranged from 8 to 2 examples each, respectively. the following term counts each had only one example in the logs (25, 26, 31, 32, 36, 57).

single-term searches

types of single-term searches can be broken down into several categories (see table 4).
over half (58.4 percent) were searches for personal names or part/all of a work title. some names and work titles are so distinctive that a one-word search might in fact be successful (e.g., beyoncé, schwanengesang, newsies, or landowska). over a fifth (22.2 percent) were classified as “other or undetermined,” including publisher names, cities, or subject terms.

type of one-word search | number
personal name | 260
title | 93
instrument/genre | 51
tool/location/format | 51
call number/barcode/label number | 15
other/undetermined | 135

table 4. one-term search types.

in the tool/location/format category patrons searched for things such as: albums, images, dissertation, rilm, worldcat, jstor, and imslp. while rilm (abstracts of music literature) and worldcat can be found by a search in this tool because they will match on journal or database titles to which we subscribe, a search for imslp [international music score library project] only brings back mentions of imslp in rilm, etc. mpal links to imslp on its webpage, but neither imslp nor the library’s website are targets in es. when patrons only searched for a format, as in a session where a patron first searched for performances, then albums, and then audio cd [sic], it is difficult to know whether the patron expected to be led to a tool that only searched or listed recordings, whether they wanted a list of all of our recordings, or if some other logic was occurring. searchers also used this technique in multi-word searches, such as in the example george gershwin articles.

single-term searches in the “other/undetermined” category were a mix of subject terms like solfege, tuning, and spectralism. the patron could be trying to find materials related to these topics, examples of them (in the case of solfege), or definitions for them. they also included publisher or label terms such as rubank and puntamayo [sic], and even, on more than one occasion, urls and dois.
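the term counts behind table 3, and the single-term subset discussed above, could be produced with a simple whitespace tokenizer. the following python sketch is illustrative only: the study does not say how terms were counted, so splitting on whitespace is an assumption, as are the function names and the open-ended "16+" label for the longest bin.

```python
from collections import Counter


def term_count(query):
    """Number of whitespace-separated terms in a search string."""
    return len(query.split())


def bin_label(n):
    """Map a term count to bins modeled on table 3 (1 to 7, 8-15, 16+)."""
    if n <= 7:
        return str(n)
    return "8-15" if n <= 15 else "16+"


def term_distribution(queries):
    """Return {bin: (instances, percentage)}, skipping empty search strings."""
    counts = Counter(bin_label(term_count(q)) for q in queries if q.strip())
    total = sum(counts.values())
    return {b: (c, round(100 * c / total, 1)) for b, c in counts.items()}
```

a whitespace split treats a string such as op.68 as a single term and a pasted citation as many terms, so the counts reflect what the patron typed rather than how a search engine would tokenize it.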
two-term searches and names

the largest segment of the mpal data (31.8 percent) consists of two-term searches. the examples show that often a musical work can be easily sought based on the composer’s name and a word from the title, especially in cases where it is a common title but adding the composer’s name makes it unique (e.g., ligeti requiem). sometimes the patron only knows the work’s characteristics and not its proper name (e.g., lakme duet). patrons do attempt to search for topical material using only two words, and that is not likely enough for a good topic search in most cases, such as in the example mahler dying. sometimes phonetic spellings are employed such as woozy wick followed by woyzeck (which is both a play and a film with this spelling but could also potentially be a misspelling of berg’s opera wozzeck). another example is image cartier followed by images quartier.

personal names are frequently seen in two-term search strings. occasional use of foreign versions of names is observed, e.g., georgy crumb. it is difficult to know if these are typos or an artifact of our high international student population. as with any search that contains only a name, it is impossible to know whether the searcher was looking for materials by that individual or information about them. additionally, when current faculty names are searched, it is difficult to know whether patrons are looking for contact information for them, or scores or recordings by them. also observed in name searches is the phenomenon of patrons repeating their search with a change in order of names, such as bryan gilliam and then gilliam bryan. this occurs with other two-word searches as well, such as a change from introitus gubaidulina to gubaidulina introitus. switching the order of the words in a search no longer makes a difference in most search tools (although in some catalogs, of course, it was once required to formulate an author search as last name, first name).
there is still the occasional use of a comma in ln, fn searches here. echoing the results of an earlier study that asked students what data points they used in searching, only occasionally did searches in this data set incorporate specific performers combined with a particular piece or composer: franck mutter, or for a particular edition: idomeneo barenreiter.24 sometimes names/titles were combined with format, such as a session in which a patron searched for hedwig images and then hedwig photo. here it is hard to tell if they are looking for pictures of a fictional owl or images from productions of hedwig and the angry inch, or something else. names are also frequently combined with work numbers instead of title words, such as mozart k.395 and moscheles op.73. search strings in the “other/undetermined” category sometimes included what appears to be an author/date search, perhaps for an article, such as mccord 2006.

long search strings

on the other end of the spectrum, the vast majority of the ten-plus word string searches are for performing arts items, but some were in other subject areas. these long searches are often citations that have been copied and pasted, which can be discerned from the use of punctuation and capitalization, like “welded in a single mass”: memory and community in london’s concert halls during the first world war.25 it is very common in general gateway es searches to see an entire citation pasted in,26 but less common in the mpal module. searches such as this are often truncated through iteration to make the search more generic (see table 5). given easy search’s doi search recognition function, the longest version of this search would have worked had the doi been correct, but the correct doi number lacks the “.2” at the end (see table 5, query 1). the middle three searches (#s 2-4) failed because none of the a&i services that include this citation use hess, j.
for the author’s name, but instead use her full first name (juliet). other examples showed that even when patrons use the exact citation, their search might not be successful if the citation formatting did not match that of the database(s) in which the article was indexed.

query # | query string
1 | hess, j. (2014). radical musicking: towards a pedagogy of social change. music education research, 16(3), 229-250. doi: 10.1080/14613808.2014.909397.2).
2 | hess, j. (2014). radical musicking: towards a pedagogy of social change. music education research, 16(3), 229-250.
3 | hess, j. (2014). radical musicking: towards a pedagogy of social change.
4 | hess, j. radical musicking: towards a pedagogy of social change.
5 | radical musicking:

table 5. search truncation.

in some instances, searches were long because the patron included additional information such as in this example: bernstein, leonard. arranger: jack mason. title: west side story-selections (for symphonic "full" orchestra piano-conductor score). edition/publisher: hal leonard corporation. it is hard to tell if this was a copy and paste from another source such as a publisher catalog, or if the patron was trying to be very precise. in any case, this search was not successful, but would have been had the searcher omitted extraneous information such as the terms “arranger” and “edition/publisher.”

type of index search—title/author/keyword and adding subsets or tools

easy search does have an advanced search function with indexes for title and author, although it is rarely used by patrons. including repeated searches, searches done selecting the “title” index only numbered 207, or fewer than 10 percent of the sample. searches done selecting “author” were even scarcer, at 141 (5.5 percent). the remaining ~2,300 searches in the sample were conducted using the default keyword search.
occasionally there was a misuse of index searching, such as:

ti: js bach english suite
ti: scarlatti sonatas
ti: haydn cello concerto d

in these examples, composer name is included in a title index search. it is unclear whether searchers do not realize that they have selected something other than a keyword search, or whether people inherently think of the composer’s name as part of the title. later in this paper the phenomenon of searches using possessive name forms is discussed, which may be associated.

patrons have the option to start from the main library gateway and perform a search in es, and in the advanced search screen can choose other subject modules such as arts and humanities, life sciences, and so forth, and/or types of tools to cross-search (see figure 6). patrons chose the music and performing arts tool subset in 161 sessions.

figure 6. easy search advanced search screen.

the vast majority of the time (4,557 searches or 93 percent), patrons chose to start from the mpal es on the mpal homepage and do a basic search there, but 179 times patrons started from the mpal es and chose other subsets through the advanced search.27 given our large music education program, logically, some patrons made tool choices that included the music subset and the education and/or social science subsets. but sometimes patrons chose every or almost every option available across multiple unrelated subject areas, which likely made for a very unwieldy result set.

use of boolean operators, quotation marks, parentheses, truncation, etc.

as in most search tools, there are several ways in es to conduct more sophisticated searches. however, patrons do not employ these techniques often, in part because they don’t always have to.
in most older catalogs (including our classic voyager opac), searchers had to use boolean terms in capital letters, whereas in vufind and worldcat a boolean “and” is now implied between terms. in the 159 examples of boolean logic in the searches, “and” is the most common term used. interestingly, some researchers used plus signs instead of “and” (as they might in google), not just between individual words, but in between multi-word segments of the string (without employing quotation marks). however, the + sign, like “and,” is ignored/implied by es.

berg + warm die lufte
progressive studies for trumpet
progressive studies for trumpet + john miller
progressive studies for trumpet (john miller)
new orleans + bossa nova
johnny alf + brazil
dick farney + brazil
dick farney + booker pittman

in some cases, the use of boolean did not seem intentional, that is, the term “and” appears as part of a common phrase (especially for instrument combinations), such as in webern violin and piano. only a handful of the boolean searches included examples of “or” and “not,” which seemed to stem from a class assignment designed by a professor, as the search strings are all very similar. one set is below:

machaut not mass
machaut or mass
machaut and mass
machaut mass notre dame
machaut mass
machaut and mass

commas were sometimes seen to stand in for boolean operators in a sense, or at least to separate search concepts, like the plus signs above, but were not counted in the total uses of boolean terms cited above. they are ignored by es.
rachmaninoff, moment music
planet, holst
city noir, john adams
piazzolla, flute and marimba
mussorgsky, pictures at an exhibition, manfred schandert

searchers used quotation marks on occasion (n=162) to keep phrases together, and parentheses were also used in this manner eight times (although they are ignored by es), such as in these examples:

preludes and fugues (well-tempered clavier)
cohen chaconne (from partita in d minor, bwv 1004)

in some cases, searchers did not seem to grasp the function of quotation marks, as in this example: “snowforms" raymond murrey schafer, which was also observed by avery and tracy.28

truncation symbols can be another powerful tool in a searcher’s arsenal, but examples of their use in the transaction logs show that most searchers who attempt to use them do not understand them, such as in the examples doctor atomic?, boethius music*, and:

orchestra* history
history of the orchestra
orchest* history
orchestr* history
orchestra history
orchestral history

in fact, the current library catalog assists users by automatically applying truncation logic so that “symphony” returns results for “symphonies” and vice versa. it is doubtful that this is generally known among users and likely functions in a manner transparent to most of them.

work numbers and key indications

searching by music metadata elements such as work or opus numbers and key designations has always proved challenging in online search environments given that numbers and single letters can appear in other parts of the catalog record with different meanings (e.g., 1 part instead of symphony no. 1). added to this is the difficulty of describing items that contain multiple works: the item’s title might be “mozart’s complete symphonies” or “beethoven symphonies 1-6” without complete work details provided. nevertheless, 134 searches had some form of work number included, and 36 searches included a key indication.
fantasie in f# minor presto georg philipp telemann and concerto en ut mineur j.c. bach are further examples of why a work’s key is hard to search by, one because of the use of the french solfege syllable “ut” and one because it includes a sharp symbol (#).29 the difficulties this can cause often led searchers to try various permutations of their search.

mozart concerto g major sam franko;
mozart concerto k 283 sam franko;
scores;
mozart violin concerto g major;
mozart violin concerto g major sam franko;
mozart violin concerto;

sonata g major flute cpe bach sonata g major flute bach hamburger sonata flute cpe bach hamburger sonata hamburger sonata

it is counterintuitive to searchers that including specific details in their search string might not help, but that is in fact the case in many online catalogs. searchers often run into the question of how or if to include the work indicator (op., k., bwv, etc.), which can lead to a “misuse” of this extra data such as in mozart k501 and mahler symphony no.9 (no spaces).

another observation includes the use of what the author calls musicians’ shorthand. that is, those familiar with classical repertoire will know that examples such as sibelius 1 and mahler 5 are searches for symphonies even though they do not say so, but it will be harder, if not impossible, for the catalog to interpret that, leaving the searcher to sort through many extra results. in addition is the long-standing issue of whether to enter the number as “1”, “1st”, or “first” and whether the system can interpret these against the form of the number present in the catalog record.
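the spacing problem in strings such as op.68, k501, and no.9 is the kind of thing a search front end could smooth over before a query reaches the catalog. the python sketch below shows one way this might be done with a regular expression. it is a hypothetical helper, not a feature of es or vufind: the prefix list, the target form (“op. 68”, “k. 501”, “bwv 1004”), and the function names are assumptions chosen for illustration.

```python
import re

# Work-number prefixes mentioned in the text, plus "kv" and "no"; illustrative, not exhaustive.
WORK_NUMBER = re.compile(r"\b(op|k|kv|bwv|no)\.?\s*(\d+[a-z]?)\b", re.IGNORECASE)

# Prefixes conventionally written without a trailing period (an assumption).
NO_PERIOD = {"bwv", "kv"}


def normalize_work_numbers(query):
    """Rewrite work numbers as 'prefix. number', e.g. 'op.68' becomes 'op. 68'."""
    def fix(match):
        prefix = match.group(1).lower()
        sep = "" if prefix in NO_PERIOD else "."
        return f"{prefix}{sep} {match.group(2)}"
    return WORK_NUMBER.sub(fix, query)


def has_work_number(query):
    """True if the query contains a prefixed work number such as 'op. 73'."""
    return WORK_NUMBER.search(query) is not None
```

a pattern like this only catches numbers that carry an explicit prefix. musicians’ shorthand such as sibelius 1 is not detected, which mirrors the difficulty described above: the bare number gives a system nothing to anchor on.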
search by format or edition type

in forty-seven examples searchers used format terms in their searches, including score, vocal score, full score, dvd, performance recordings, albums, and audio cd as well as the following:

prokofiev romeo and juliet orchestra parts orchestra excerpts prokofiev romeo and juliet viola tosca harp part assassins cd saxophone article

in fifteen examples searchers searched for edition types including urtext, facsimile, critical edition, and complete works. in the latter case they occasionally used the word “complete” and the composer’s name, such as complete schumann or complete webern. unfortunately, this approach will often not be successful, because even though the term “complete works” is used colloquially by musicians, the titles of such editions are often something else (and often in a foreign language, such as “opera omnia”).

other observations on formulation of searches

searching by call numbers and recording label numbers

while some catalogs allow call number searches, our current instance of vufind does not have a call number index, and keyword searching for them only works in some instances.30 but while call number searching does not work well in vufind (i.e., it has to be done as a keyword search and not a call number index search like in voyager), it still works in es because it is searching by keywords. there were thirty-two examples of searches in mpal’s es where patrons used entire call numbers or the first part of a classification number to find related materials:

count basie biography
count basie ml 410
duke ellington ml 410
duke ellington bibiliography

it is also not unrealistic to think that patrons might want to search by a recording’s label number, since most catalogs provide search options for isbns and issns for print materials.
searchers attempted this in a handful of searches like lpo-0014,31 7.24356e+11,32 and 777337-2.33 unfortunately this information is not usually reflected in mpal’s catalog records.

common descriptions, natural language queries, genre queries, and context words

as mentioned already with the examples mahler 1 and complete works, patrons regularly search with terms and phrases that make sense to them or that are used colloquially when discussing music and sources, which may or may not be in the bibliographic record. additional examples in the data set include:

handel messiah critical edition
rodelinda in italian
mamma mia! book [for the text of a musical]
grove encyclopedia [the title of this is in fact “dictionary” not “encyclopedia”]
mgg sachteil [the abbreviation for musik in geschichte und gegenwart and the name for a section of it]
dance collection

the last example in the list is particularly intriguing. somewhat like the earlier search examples of performances and albums, one wonders if the patron hoped to find everything in that category and then be able to browse; however, it is hard to know what the searcher anticipated getting in return.

sometimes natural language queries appear, often in an attempt to find a smaller part of a larger work, such as the slow movement of brahms's first symphony, anonymous chant from vespers for christmas day, and chaconne (from partita in d minor, bwv 1004); or for things other than musical works, such as in reviews of stravinsky article by robert craft. another variation on natural language or colloquial searches is the use of the possessive form of composer names. although not common (23 examples), patrons do this when searching for composer and title of a work, e.g., verdi's requiem.
it seems unlikely that people do this when searching for books or other works, but musicians make works possessive to the composer, such as in the examples mendelssohn's violin concerto, to differentiate between pieces with the same form/generic title. in rare cases searchers used the term “by,” such as jeptha by carrissimi. genre searches such as south indian vocal music and hindustani classical music show that people may want to search the way they might in pandora or itunes, although it is possible this person was looking for secondary materials and not recordings or scores: pop female pop women pop contemporary pop searchers also exhibit a desire to find things by genre and instrument or voice type, such as soprano arias [which is ‘high voice’ in the lc subject heading], mozart satb sanctus, and baroque arias for medium voice. other examples include marimba literature, organ literature, and organ techniques. catalogs do not necessarily aid in these types of searches, even though they are natural constructions for users. sometimes searchers add context words to their search like they would in google in a way that will not necessarily help them in the catalog, such as daniel read composer. discussion even given the difficulties of searching for music materials, mpal patrons have embraced es—its module has almost as many searches as the undergraduate library’s, which serves a much larger population. it also has twice as many searches as the social science, health, and education library module, which also serves a much larger population than mpal. one of the possible reasons for this is the fact that mpal was an early adopter of developing an es subject module that could be searched from our homepage, which means our patrons have had longer to grow accustomed to using it. 
mpal has a lower average words-per-search ratio (3.375 or 3.349 depending on data set) than most other es modules, likely because there are more composer plus title keyword searches for musical works and not as many pasted article citation searches, which tend to be longer. this is supported by the comparison of the average number of words in searches done in the gateway books tab (4.048) vs. the gateway articles tab (6.35). in addition, although two- and three-word searches are most common, mpal has a significant number of single-word searches (12.4 percent). such searches can work in music, when there are unique titles like turandot and treemonisha that are unlikely to appear for more than one composer or as terms in other disciplines. by the same logic, single- or even two-word searches are unlikely to be effective in most other disciplines. at around seven words per search a transition in search patterns occurs. eight-word and longer search strings are almost always some version of a title of a book, article, chapter, or dissertation, etc., and strings with six words and fewer tend to be topical searches or combination composer/piece searches. other transaction log studies of es have shown that “title searching and results display—of journal titles, article titles, and book titles—is being heavily employed by users.”34 however, in music, where title alone may not be sufficient to identify and retrieve a musical work, searches with a combination of composer name and elements of the title and/or additional information will always be most prevalent.
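the words-per-search figures above can be derived from a transaction log with a few lines of code. the following is a minimal sketch, not the analysis script used in this study: it assumes each logged query is available as a plain string and that a word is any whitespace-delimited token. the function name `search_length_stats` is hypothetical, and the sample queries are taken from examples quoted elsewhere in this article.

```python
from collections import Counter


def search_length_stats(queries):
    """Return (mean words per search, share of single-word searches, histogram).

    Assumption: a "word" is any whitespace-delimited token; the study itself
    may have counted words differently.
    """
    lengths = [len(q.split()) for q in queries if q.strip()]
    if not lengths:
        raise ValueError("no non-empty queries supplied")
    mean_words = sum(lengths) / len(lengths)
    single_word_share = sum(1 for n in lengths if n == 1) / len(lengths)
    return mean_words, single_word_share, Counter(lengths)


# Hypothetical sample built from queries quoted in the article.
sample = [
    "turandot",
    "tosca harp part",
    "verdi's requiem",
    "count basie ml 410",
    "handel messiah critical edition",
]
mean_words, single_share, histogram = search_length_stats(sample)
print(f"mean words per search: {mean_words:.2f}")
print(f"single-word share: {single_share:.1%}")
print(sorted(histogram.items()))
```

run against a full log, the histogram would also make the transition at around seven words per search visible, since title-style strings cluster at the long end.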
search location appropriateness and context

even though discovery layers and federated search tools help with minimizing the number of silos and places in which scholars need to search, there are still issues with patrons attempting to use the es box to find things it is not designed to find.35 searchers see a box and search, without always understanding the context. this can happen on multiple levels. the mpal page clearly states that the mpal es box searches for arts-related things, but obviously patrons do not always see or comprehend this, even after they type in many queries that do not provide (good) results. this is likely related to the number of visitors to mpal from other disciplines who do not realize that there are various differently scoped versions of es. the following example could be a theatre set construction related search, which would work only moderately well in our tool. or, it may have been conducted by an architecture or structural engineering student, who would have better luck using a different es module. light weigh [sic] structures in architecture building research the evolving design vocabulary of fabric structure the engineering discipline of tent structures building research jan/feb 1972:22

it would be ideal if the system were smart enough to make suggestions: “you appear to need architecture resources. if you are not finding what you need, might we suggest tool x, y, or z?” while es does this to an extent when it can in the generic es, it does not do so in the subject modules, and in reality, can only go so far. it raises the question of whether we are doing patrons a disservice by offering pre-defined subject modules. while this approach has some benefits for most users, it also creates different challenges for some.
mpal’s es does not target all available relevant online tools and neither does the general es, so interdisciplinary researchers still need to be cautious of silos, even well-intentioned ones created by librarians or traditional ones created by vendors. it is difficult to inform patrons of this in one-box search settings; they see the box and are eager to get started without first having to read a lengthy set of instructions.

search location context is also important when patrons use es to try to find things that are described or linked on our website and not in es, such as for any of our named special collections. patrons also use es to find tools such as naxos, jstor, worldcat, and librarysource, some of which are targeted by es and some of which are not. es will at least provide a link to a tool, however (see figure 7).

figure 7. easy search post-search suggestion.

these particular tools are all also linked from the mpal website (in fact, naxos is linked further down the home page from the es box) and we also have a separate tool that enables one to search for databases and online journals by name. on some occasions, searchers used es to look for help using library tools, such as in the following example: rilm retrieval rilm using rilm

the library website, not the discovery layer, is a better tool for finding instructions, since help information is currently delivered via various libguides. however, this is not intuitive to patrons. on a related note, it is interesting to consider whether patrons searching for specific tools such as imslp expect to find results from non-library resources in our search layers, or if they simply do not differentiate in their minds what is an open tool and what is a library subscription tool.
patron knowledge level

many of the observations of this study are related to known-item searching, since a large percentage of people looking for music materials are looking for specific pieces of music. earlier studies show that it is difficult to search for something if you do not know what it is.36 this can be seen in examples like ombramaifu handel (should be ombra mai fu) or the interworkings of tennis (which was followed by the correct inner game of tennis). topical searches can be especially difficult in any subject when the patron does not quite know how to put what they want into words (or literally does not know the right words, especially in the case of our many patrons for whom english is not their first language).

qualtize musical tension spell change click: kw:qualitize musical tension quantize musical tension quantitative musical tension music motive similarity surveying musical form through melodic-motivic similarities a paradigmatic approach to extract the melodic structure of a musical piece inding subsequences of melodies in musical pieces spell change click: kw:finding subsequences melodies musical pieces similarity measures for melodies measures of musical tension measuring musical tension

this echoes head and eisenberg’s 2009 findings and dempsey and valenti’s 2016 findings.37

shortcomings of the easy search tool

this study helped illuminate some shortcomings in es. sometimes the search formulation changes from es to the target; for example, cramer preludes in es becomes all(cramer preludes) [a bound phrase] in one target, resulting in many fewer results than if the search had been done in the native interface. patrons may not realize this as they are searching.
in another case there were no results for danças folclóricas brasileiras e suas aplicações educativas, but removing the diacritics retrieves this title in our catalog, so it appears that diacritics do not function in es (at least when vufind is the target). this may not be apparent to searchers and hopefully can be addressed in the code.

further research

additional analysis could be done on this data set, including assessing whether searches were for known items or topics, and more specifically whether for articles, books, scores, or recordings. however, in many cases it is difficult to tell if a patron is looking for a score, recording, or information about a piece or composer. other research on es shows over half of searches (just over 58 percent in 2015) in the main es are for known items.38 this percentage is likely to be much higher in mpal’s es. with an enhanced data set it would also be possible to identify which target tools searchers are choosing most often.

conclusion

while many patrons (and librarians) are eager for a tool that can truly search everything, we are not there yet. some have tried to make music-specific interfaces for library catalogs, but this work is not widespread.39 perhaps because music students are often searching for things other than articles, it would be better to have one tool that searches the catalog and streaming media tools and one that only searches article indexes. some schools have taken this approach, configuring their discovery layer indexes to include article content but not the local catalog. this data set revealed several patron search behaviors that are not fully supported by library systems in all cases, but perhaps should be (e.g., use of + signs, searching by record label numbers or genre names/types of music/formats).
in some cases, this is an issue with the metadata standards in use and in others it is about needing more flexible search options based on the metadata that we already have. newcomer et al. discuss this in their article outlining music discovery requirements.40 tools like easy search and discovery layers solve some problems for users but can create others. dedicated library catalogs are still generally the best tools for finding scores and recordings in our physical (and some online) collections, but not all libraries offer that tool anymore, instead offering a discovery layer as the primary search tool. in those cases, serious consideration needs to be given to facets, the ability to limit by format, and especially the frbrization of items, which is particularly problematic for music. additionally, there is a continued need for targeted instruction for music library users, because not only are the tools used in libraries less than perfect, the inherent challenges in searching for music because of its formats and titles are aggravated by musicians’ use of shorthand and colloquialisms to describe music materials. endnotes 1 john boyd et al., “the one-box challenge: providing a federated search that benefits the research process,” serials review 32, no. 4 (december 2006): 247–54, https://doi.org/10.1016/j.serrev.2006.08.005; sharon dyas-correia et al., “’the one-box challenge: providing a federated search that benefits the research process’ revisited,” serials review 41, no. 4 (october-december 2015): 250–56, https://doi.org/10.1080/00987913.2015.1095581. 2 lucy holman, “millennial students’ mental models of search: implications for academic librarians and database developers,” journal of academic librarianship 37, no. 1 (january 2011): 19–27, https://doi.org/10.1016/j.acalib.2010.10.003; brandi porter, “millennial undergraduate research strategies in web and library information retrieval systems,” journal of web librarianship 5, no. 
4 (july-december 2011): 267–85, https://doi.org/10.1080/19322909.2011.623538; martin zimerman, “digital natives, searching behavior, and the library,” new library world 11, nos. 3/4 (2012): 174–201, https://doi.org/10.1108/03074801211218552. 3 susan avery and dan tracy, “using transaction log analysis to assess student search behavior in the library instruction classroom,” reference services review 42, no. 2 (june 2014): 332, https://doi.org/10.1108/rsr-08-2013-0044. 4 andrew asher, lynda m. duke, and suzanne wilson, “paths of discovery: comparing the search effectiveness of ebsco discovery service, summon, google scholar, and conventional library resources,” college & research libraries 74, no. 5 (september 2013): 473, https://doi.org/10.5860/crl-374. https://doi.org/10.1080/00987913.2015.1095581 https://doi.org/10.1016/j.acalib.2010.10.003 https://doi.org/10.1080/19322909.2011.623538 https://doi.org/10.1108/03074801211218552 https://doi.org/10.1108/rsr-08-2013-0044 https://doi.org/10.5860/crl-374 the “black box” | dougan 104 https://doi.org/10.6017/ital.v37i4.10702 5 megan dempsey and alyssa valenti, “student use of keywords and limiters in web-scale discovery searching,” journal of academic librarianship 42, no. 3 (may 2016): 203, https://doi.org/10.1016/j.acalib.2016.03.002. 6 annie r. armstrong, “student perceptions of federated searching vs. single database searching,” reference services review 37, no. 3 august 2009): 291–303, https://doi.org/10.1108/00907320910982785; c. jeffrey belliston, jared l. howland, and brian c. roberts, “undergraduate use of federated searching: a survey of preferences and perceptions of value-added functionality,” college & research libraries 68, no. 6 (november 2007): 472-86, https://doi.org/10.5860/crl.68.6.472; sarah d. williams, angela bonnell, and bruce stoffel, “student feedback on federated search use, satisfaction, and web presence: qualitative findings of focus groups,” reference and user services quarterly 49, no. 
2 (winter 2009): 131–39. 7 asher et al., “paths of discovery,” 476. 8 troy swanson and jeremy green, “why we are not google: lessons from a library web site usability study,” journal of academic librarianship 37, no. 3 (may 2011): 227, https://doi.org/10.1016/j.acalib.2011.02.014. 9 cory lown, tito sierra, and josh boyer, “how users search the library from a single search box,” college & research libraries 74, no. 3 (may 2013): 240, https://doi.org/10.5860/crl-321. 10 sarah dahlen and kathlene hanson, “preference vs. authority: a comparison of student searching in a subject-specific indexing and abstracting database and a customized discovery layer,” college & research libraries 78, no. 7 (november 2017), 892, https://doi.org/10.5860/crl.78.7.878. 11 ibid. 12 li fu and cynthia thomes, “implementing discipline-specific searches in ebsco discovery service,” new library world 115, nos. 3/4 (2014): 102–15, https://doi.org/10.1108/nlw-012014-0003. 13 kirstin dougan, “finding the right notes: an observational study of score and recording seeking behaviors of music students,” journal of academic librarianship 41, no. 1 (january 2015): 61–67, https://doi.org/10.1016/j.acalib.2014.09.013. 14 jennifer m. mayer, “serving the needs of performing arts students: a case study,” portal: libraries & the academy 15, no. 3 (july 2015): 416, https://doi.org/10.1353/pla.2015.0036. 15 joe clark and kristin yeager, “seek and you shall find? an observational study of music students’ library catalog search behavior,” journal of academic librarianship 44, no. 1 (january 2018): 105-12, https://doi.org/10.1016/j.acalib.2017.10.001. 16 christine d. brown, “straddling the humanities and social sciences: the research process of music scholars,” library & information science research 24, no. 
1 (march 2002): 73–94, https://doi.org/10.1016/s0740-8188(01)00105-0; stephann makri and claire warwick, https://doi.org/10.1016/j.acalib.2016.03.002 https://doi.org/10.1108/00907320910982785 https://doi.org/10.5860/crl.68.6.472 https://doi.org/10.1016/j.acalib.2011.02.014 https://doi.org/10.5860/crl-321 https://doi.org/10.5860/crl.78.7.878 https://doi.org/10.1108/nlw-01-2014-0003 https://doi.org/10.1108/nlw-01-2014-0003 https://doi.org/10.1016/j.acalib.2014.09.013 https://doi.org/10.1353/pla.2015.0036 https://doi.org/10.1016/j.acalib.2017.10.001 https://doi.org/10.1016/s0740-8188(01)00105-0 information technology and libraries | december 2018 105 “information for inspiration: understanding architects' information seeking and use behaviors to inform design,” journal of the american society for information science & technology 61, no. 9 (september 2010): 1,745-770, https://doi.org/10.1002/asi.21338; francesca marini, “archivists, librarians, and theatre research,” archivaria 63 (2007): 7–33; ann medaille, “creativity and craft: the information-seeking behavior of theatre artists,” journal of documentation 66, no. 3 (may 2010): 327–47, https://doi.org/10.1108/00220411011038430; marybeth meszaros, “a theatre scholarartist prepares: information behavior of the theatre researcher,” in advances in library administration and organization (v. 29), delmus e. williams and janine golden, eds. (bingley, uk: emerald group publishing limited, 2010): 185-217; bonnie reed and donald r. tanner, “information needs and library services for the fine arts faculty,” journal of academic librarianship 27, no. 3 (may 2001): 231, https://doi.org/10.1016/s0099-1333(01)00184-7; shannon robinson, “artists as scholars: the research behavior of dance faculty,” college & research libraries 77, no. 6 (november 2016): 779-94, https://doi.org/10.5860/crl.77.6.779. 
17 ethelene whitmire, “disciplinary differences and undergraduates’ information‐seeking behavior,” journal of the association for information science and technology 53 (june 2002): 631-38, https://doi.org/10.1002/asi.10123. 18 tina chrzastowski and lura joseph, “surveying graduate and professional students' perspectives on library services, facilities and collections at the university of illinois at urbana-champaign: does subject discipline continue to influence library use?,” issues in science & technology librarianship 45, no. 1 (winter 2006), https://doi.org/10.5062/f4dz068j. 19 ellen collins and graham stone, “understanding patterns of library use among undergraduate students from different disciplines,” evidence based library and information practice 9 (september 2014): 51–67, https://doi.org/10.18438/b8930k. 20 this is up from the 4.33 average reported by mischo in 2012 (164). 21 including direct from departmental webpage and via gateway es dropdown choices. 22 in mischo’s 2012 analysis of easy search logs, 52 percent of sessions had one string and 48 percent had two or more. by 2015, single-query sessions had risen to 57 percent (william mischo, et al., "the bento approach to library discovery: web-scale and beyond,” internet librarian international, october 21, 2015). 23 william h. mischo et al., “user search activities within an academic library gateway: implications for webscale discovery systems,” in planning and implementing resource discovery tools in academic libraries, ed. mary popp and diane dallis (hershey, pa: igi global, 2012), 163. 24 kirstin dougan, “information seeking behaviors of music students,” reference services review 40, no. 4 (november 2012): 563, https://doi.org/10.1108/00907321211277369. 
https://doi.org/10.1002/asi.21338 https://doi.org/10.1108/00220411011038430 https://doi.org/10.1016/s0099-1333(01)00184-7 https://doi.org/10.5860/crl.77.6.779 https://doi.org/10.1002/asi.10123 https://doi.org/10.5062/f4dz068j https://doi.org/10.18438/b8930k https://doi.org/10.1108/00907321211277369 the “black box” | dougan 106 https://doi.org/10.6017/ital.v37i4.10702 25 vanessa williams, “‘welded in a single mass’: memory and community in london’s concert halls during the first world war,” the journal of musicological research 33, nos. 1–3 (2014): 27–38. 26 mischo, “user search activities,” 162. 27 this echoes earlier research that shows most searchers use default settings and keyword searches. 28 avery and tracy, “using transaction logs,” 31. 29 barbara d. henigman and richard burbank, “online music symbol retrieval from the access angle,” information technology & libraries 14, 1 (march 1995): 5–16. 30 we still have to use our older voyager opac or the staff-side of voyager to effectively search by call number until we get a newer version of vufind. 31 symphony no. 4 in e flat “romantic” by anton bruckner, klaus tennstedt (conductor), london philharmonic orchestra. (performer). 32 this is mozart, “clarinet concerto in a, k. 622,” meyer/berlin philharmonic/abbado emi classics 57128; 7.24356e+11. 33 this is reich: sextet / piano phase / eight lines (griffiths kevin/ london steve reich ensemble/ the/ stephen wallace) (cpo: 777337-2)). 34 mischo, “user search activities,” 169. 35 this reinforces what lown and asher et al. found as cited in the literature review above. 36 kirstin dougan, “finding the right notes: an observational study of score and recording seeking behaviors of music students,” journal of academic librarianship 41, no. 1 (january 2015): 66. 
37 alison head and michael eisenberg, “finding context: what today’s college students say about conducting research in the digital age,” progress report (2009) (retrieved from http://projectinfolit.org/images/pdfs/pil_progressreport_2_2009.pdf); dempsey and valenti, “student use of keywords and limiters,” 2016. 38 william h. mischo et al., “the bento approach to library discovery: web-scale and beyond,” internet librarian international, october 21, 2015. 39 anke hofmann and barbara wiermann, “customizing music discovery services: experiences at the hochschule für musik und theater, leipzig,” music reference services quarterly 17, no. 2 (june 2014): 61–75, https://doi.org/10.1080/10588167.2014.904699; bob thomas, “creating a specialized music search interface in a traditional opac environment,” oclc systems & services 27, no. 3 (august 2011): 248–56, https://doi.org/10.1108/10650751111164588. 40 nara newcomer et al., “music discovery requirements: a guide to optimizing interfaces,” notes 69, no. 3 (march 2013): 494-524, https://doi.org/10.1353/not.2013.0017.
editor’s comments
bob gerrity
information technology and libraries | march 2013 1

with this issue, information technology and libraries (ital) begins its second year as an open-access, e-only publication. there have been a couple of technical hiccups related to the publication of back issues of ital previously only available in print: the publication system we’re using (open journal systems) treats the back issues as new content and automatically sends notifications to readers who have signed up to be notified when new content is available. we’re working to correct that glitch, but hope that the benefit of having the full ital archive online will outweigh the inconvenience of the extra e-mail notifications. overall though, ital continues to chug along and the wheels aren’t in danger of falling off any time soon. thanks go to mary taylor, the lita board, and the lita publications committee for supporting the move to the new model for ital.

readership this year appears to be healthy: the total download count for the thirty-three articles published in 2012 was 42,166, with 48,160 abstract views. unfortunately, we don’t have statistics about online use from previous years to compare with. the overall number of article downloads for 2012, for new and archival content, was 74,924.
we continue to add to the online archive: this month the first issues from march 1969 and march 1981 were added. if you haven’t taken the opportunity to look, the back issues offer an interesting reminder of the technology challenges our predecessors faced.

in this month’s issue, ital editorial board member patrick “tod” colegrove reflects on the emergence of the makerspace phenomenon in libraries, providing an overview of the makerspace landscape. lita member danielle becker and lauren yannotta describe the user-centered website redesign process used at the hunter college libraries. kathleen weessies and daniel dotson describe gis lite and provide examples of its use at the michigan state university libraries. vandana singh presents guidelines for adopting an open-source integrated library system, based on findings from interviews with staff at libraries that have adopted open-source systems. danijela boberić krstićev from the university of novi sad describes a software methodology enabling sharing of information between different library systems, using the z39.50 and sru protocols.

beginning with the june issue of ital, articles will be published individually as soon as they are ready. ital issues will still close on a quarterly basis, in march, june, september, and december. by publishing articles individually as they are ready, we hope to make ital content more timely and reduce the overall length of time for our peer-review and publication processes. suggestions and feedback are welcome, at the e-mail address below.

bob gerrity (r.gerrity@uq.edu.au) is university librarian, university of queensland, australia.
articles
near-field communication (nfc): an alternative to rfid in libraries
neeraj kumar singh
information technology and libraries | june 2020
https://doi.org/10.6017/ital.v39i2.11811
neeraj kumar singh (neerajkumar78ster@gmail.com), phd, is deputy librarian, panjab university, chandigarh, india.

abstract

libraries are the central agencies for the dissemination of knowledge. every library aspires to provide maximum opportunities to its users and ensure optimum utilization of available resources. hence, libraries have been seeking technological aids to improve their services. near-field communication (nfc) is a type of radio-frequency technology that allows electronic devices, such as computers, mobile phones, tags, and others, to exchange information wirelessly across a small distance. the aim of this paper is to explore nfc technology and its applications in the modern era. the paper discusses the potential use of nfc in the advancement of traditional library management systems.

introduction

similar to other identification technologies such as radio-frequency identification (rfid), barcodes, and qr codes, near-field communication (nfc) is a short-range (4–10 cm) wireless communication technology. nfc is based on the existing 13.56 mhz rfid contactless card standards, which have been established for several years and are used for payment, ticketing, electronic passports, and access control, among many other applications. data rates range from 106 to 424 kilobits per second. a few nfc devices are already capable of supporting up to 848 kilobits per second, a rate now being considered for inclusion in the nfc forum specifications.1 compared to other wireless communication technologies, nfc is designed for proximity or short-range communication, which provides a dedicated read zone and some inherent security. its 13.56 mhz frequency places it within the ism band, which is available worldwide.
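to put the data rates quoted above in perspective, the sketch below estimates how long a payload would take to transfer at 106, 424, and 848 kilobits per second. it is a rough, idealized calculation: it assumes 1 kilobit equals 1,000 bits and ignores protocol overhead, anticollision, and retransmission, so real transfers would be slower. the function name `transfer_time_ms` and the 1,024-byte payload are illustrative assumptions, not values taken from the nfc forum specifications.

```python
def transfer_time_ms(payload_bytes, rate_kbps):
    """Idealized transfer time in milliseconds.

    Assumptions: 1 kbit/s = 1,000 bit/s and no protocol overhead.
    """
    if rate_kbps <= 0:
        raise ValueError("rate_kbps must be positive")
    bits = payload_bytes * 8
    seconds = bits / (rate_kbps * 1000)
    return seconds * 1000


# Data rates mentioned in the text, in kilobits per second.
for rate in (106, 424, 848):
    print(f"{rate} kbit/s: {transfer_time_ms(1024, rate):.1f} ms for 1,024 bytes")
```

even at the lowest rate, a payload of this size moves in well under a tenth of a second, which is consistent with the tap-and-go interactions the paper goes on to describe.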
nfc communication is bi-directional, meaning that data can be exchanged in both directions, with a typical range of 4–10 cm depending on the antenna geometry and the output power.2 nfc is convenient and fast: the action is triggered automatically when a phone comes within 10 cm of the nfc tag, giving instant access to the content on the mobile device without a single click.3

rfid and nfc technologies are similar in that both use radio waves. both exchange data between electronic devices in active mode as well as in passive mode. in active mode, outgoing signals come from the device that has its own power source, whereas in passive mode the signals use the reflected energy received from the active signal. in rfid technology, radio waves can send information to receivers up to hundreds of meters away, depending on the frequency band used by the tag. if provided with a high amount of power, these signals can also be sent over extreme distances (e.g., in the case of airport radar). at large airports, radar typically controls traffic within a radius of 100 kilometers of the airport below an elevation of 25,000 feet. rfid is also used very often in tracking animals and vehicles.

in contrast, items like passports and payment cards should not be capable of long-distance transmissions because of the threat of theft of personal information or funds. nfc is designed to meet this need. nfc tags are small enough to fit inside devices and products, such as luggage, purses, packs, wallets, and clothing, and can be tracked. nfc technology has added security features that make it much more secure than the previously popular rfid equivalent, and it is difficult to steal information stored in it.
nfc has a short working range compared to other wireless technologies, so it can be widely used for payments, ticketing, and service admittance, and has thus proved to be a safer technology. it is because of this security feature that the technology is used in cellular phones to turn them into wallets.4 both rfid and nfc wireless technologies can operate in active and passive communication modes to exchange data between electronic devices. the main differences between nfc and rfid are:
• though both rfid and nfc use radio frequencies for communication, nfc can be said to be an extension of rfid technology. rfid has been in use for more than a decade, but nfc has emerged on the scene only recently.
• rfid has a wider range, whereas nfc has limited communication and operates only at close proximity. nfc typically has a range of a few centimeters.
• rfid can function at many frequencies, and many standards are in use, but nfc requires a fixed frequency of 13.56 mhz and some other fixed technical specifications to function properly.
• rfid technology can be used for applications that require wide-area signals, such as item tracking, automated toll collection on roads, and vehicle movement. nfc is appropriate for applications that carry sensitive data that needs to be kept secure, such as mobile payments and access control.
• rfid operates over long distances while exchanging data wirelessly, so it is not secure for applications that store personalized data. items using rfid are susceptible to various fraud attacks, such as data corruption. nfc’s short working range considerably reduces the risk of data theft, eavesdropping, and “man in the middle” attacks.
• nfc has the capability to communicate both ways and is thus suitable for advanced interactions such as card emulation and peer-to-peer sharing.
• a number of rfid tags can be scanned simultaneously, while only a single nfc tag can be scanned at a time.
how nfc works
the extended functionality of a traditional rfid system has led to the nfc forum. the nfc forum has defined three operating modes for nfc devices: reader/writer mode, peer-to-peer mode, and card-emulation mode (see figure 1). the nfc forum technical specifications for the different operating modes are based on iso/iec 18092 (nfc ip-1), jis x 6319-4, and iso/iec 14443. these specifications must be used to derive the full benefit from the capabilities of nfc technology. contactless smart card standards are referred to as nfc-a, nfc-b, and nfc-f in nfc forum specifications.5
figure 1. nfc operation modes6
reader/writer mode
in reader/writer mode (see figure 2), an nfc-enabled device is capable of reading nfc forum-mandated tag types, such as a tag embedded in an nfc smart poster. this mode allows nfc-enabled devices to read the information that is stored on nfc tags embedded in smart posters and displays. since these tags are relatively inexpensive, they provide a great marketing tool for companies.
figure 2. reader mode7
the reader/writer mode on the radio frequency interface is compliant with the nfc-a, nfc-b, and nfc-f schemes. examples of its use include reading timetables, tapping for special offers, and updating frequent flyer points.8
peer-to-peer mode
in peer-to-peer mode (see figure 3), both devices must be nfc-enabled in order to communicate with each other, exchange information, and share files. users of nfc-enabled devices can thus quickly share information and other files with a touch. as an example, users can exchange data such as digital photos or virtual business cards via bluetooth or wifi.
figure 3.
peer-to-peer mode9
peer-to-peer mode is based on the nfc forum’s logical link control protocol specification and is standardized in iso/iec 18092.
card-emulation mode
in card-emulation mode (see figure 4), an nfc device behaves like a contactless smart card so that users can perform transactions such as purchases, ticketing, and transit access control with just a touch. an nfc device may have the ability to emulate more than one card. in card-emulation mode, an nfc-enabled device communicates with an external reader much like a traditional contactless smart card. this allows contactless payments and ticketing by nfc-enabled devices without changing the existing infrastructure.
figure 4. card-emulation mode
by adding nfc to a contactless infrastructure, one can enable two-way communications. in the air transport sector, this could simplify many operations such as updating seat information while boarding or adding frequent flyer points while making a payment.10
nfc standards and specifications
the nfc specifications are defined by an industry organization called the nfc forum, which has nearly 200 member companies. the nfc forum was formed in 2004 with the objective of advancing the use of nfc technology. this was achieved by educating the market about nfc technology and developing specifications to ensure interoperability among devices and services. nfc forum members work together in task forces and working groups. as noted earlier, nfc technology is based on existing 13.56 mhz rfid standards and includes several protocols such as iso 14443 type a and type b, and jis x 6319-4 (a japanese industrial standard also known as sony felica). the iso 15693 standard, an additional 13.56 mhz protocol established in the market, is being integrated into the nfc specification by an nfc forum task force.
smartphones in the market are already supporting the iso 15693 protocol.11 these nfc specifications, and especially the specifications for the extended nfc functionalities, are in turn standardized by international standards organizations such as iso/iec, ecma, and etsi.12 initially, the rfid standards iso/iec 14443 a, iso/iec 14443 b, and jis x6319-4 were also declared to be nfc standards by different companies working in the field, such as nxp, infineon, and sony. the first ever nfc standard was ecma 340, based on the air interface of iso/iec 14443a and jis x6319-4. ecma 340 was subsequently adopted as iso/iec 18092. at the same time, major credit card companies like europay, mastercard, and visa introduced the emvco payment standard, which is based on iso/iec 14443 a and iso/iec 14443 b. these groups harmonised the over-the-air interfaces within the nfc forum. they are named nfc-a (iso/iec 14443 a based), nfc-b (iso/iec 14443 b based), and nfc-f (felica based).13
nfc tags
an nfc tag is a small microchip embedded in a sticker or wristband that can be read by mobile devices within range. information regarding the item is stored in these microchips.14 an nfc tag has the capability to send the information stored on it to nfc-enabled mobile phones. nfc tags can also perform various actions, such as changing the settings of handsets or even launching a website.15 tag memory capacity varies by the type of tag. for example, a tag may store a phone number or a url.16 the most common use of the nfc tag function on an object is mobile wallet payment processing, where the user swipes or flicks a mobile phone on an nfc tag to make a payment. google’s version of this system is google wallet.17
figure 5. a quick overview of the tag types18
applications of nfc
since it emerged as a standard technology in 2003, nfc technology has been implemented across multiple platforms in various ways.
the primary driving force behind nfc is its application in the commercial sector, where the implementation of the technology focuses on such areas as sales and marketing. many new and interesting applications are also emerging in other fields, such as education and healthcare. all of these may impact libraries, librarians, and library users, either by prompting adaptations to existing collections and services or inspiring innovation in our profession.19
• mobile payment: customers with nfc-enabled smartphones can link them to their bank accounts and pay by simply tapping the phone on an nfc-enabled point-of-sale terminal.20
• access and authentication: “keyless access” to restricted areas, cars, and other vehicles. one can imagine other potential uses of nfc in the future, such as controlling devices in the home.21
• transportation and ticketing: nfc-enabled phones can connect with an nfc-enabled kiosk to download a ticket, or the ticket can be sent directly to an nfc-enabled phone over the air (ota). the phone can then tap a reader to redeem that ticket and gain access.22
• mobile marketing: nfc tags can be embedded into indoor and outdoor signage. upon tapping a smartphone on an nfc-enabled smart poster, the customer can read a consumer review, visit a website, or even view a movie trailer.
• healthcare: nfc medical cards and bracelet tags can store relevant, up-to-date patient information like health history, allergies, infectious diseases, etc.
• gaming: nfc technology is the bridge between physical and digital games. players can tap their phones together to earn extra points, receive access to a new level, or get clues by using an nfc application.23
• inventory tracking, smart packaging, and shelf labels: nfc-tagged objects could provide a wide variety of information in different use environments.
nfc-enabled smartphones can be used to tap the tags to access book reviews and information about the book’s author and recommend the book to other readers. users could check out a book or add it to a wish list to check out at a later date. indeed, with nfc, library records and metadata could theoretically be stored on and retrieved from library physical holdings themselves, allowing a patron to tap a book or resource borrowed from the library to recall its title, author, and due date.24 applications of nfc in libraries: introducing the smart library some libraries are beginning to use nfc technology as an alternative to rfid. yusof et al. proposed a newly developed application called the smart library, or “s-library,” that has adopted the nfc technology.25 in the s-library, library users can perform many library transactions just by using their mobile smartphones with integrated nfc technology. the users of s-library are required to download and install an app in their compatible mobile phone. this app provides the user relevant and easy to use library functionality such as searching, borrowing, returning, and viewing their transaction records. in this s-library model the app is integrated with the library management software. the s-library app needs to be installed on the mobile device, and the mobile device requires an internet connection that will connect it to the lms. the s-library provides five major functionalities to the user: scan, search, borrow, return, and transaction history. in the scanning function, users can access the information of a book by simply touching their mobile phone to the nfc tag on the book. as soon as the phone touches the book, information regarding its title, author, contents, synopsis, etc. will automatically be displayed on the screen of the mobile device. users can search for books by entering keywords such as book title, author name, year, etc. through the borrowing function the app allows users to check out books of interest. 
the user just needs to touch their mobile phone to the nfc-tagged book to borrow it. the transaction is automatically stored in the lms database. the returning process is similar to the borrowing process: the user selects the return function on the menu and touches the mobile device to the book, and the returning transaction is automatically performed and stored in the lms database. however, it should be ensured that the book is physically returned to the library through the library’s nfc-enabled book drop system, and only then should the transaction be updated in the lms. the user can check the due date for the current transaction as well as their transaction history. the transaction history function allows the user to view the list of books that have been borrowed over time and their status.26
data transmission for nfc technology can be up to 848 kilobits/second, whereas the data transmission rate with rfid technology is 484 kilobits/second. taking advantage of this high data rate, the response time for s-library is also very fast. this is a huge improvement over rfid technology and especially over barcode technology, where the data transmission rate is variable, inconsistent, and dependent upon the quality of the barcodes. the second key advantage of s-library is that the time taken to read a tag (the communication time between a reader and an nfc-enabled device) is very short. the third advantage of nfc is its usability in comparison to the other two technologies. nfc technology is human-centric because it is intuitive and fast, and users can use it anywhere, anytime with their mobile phones. in rfid and barcode technology, usability is item-centric, as a person has to go to a specific device located in the library.27 most of the shortcomings of rfid and barcode technology have been overcome by the s-library.
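the borrow and return flow described above can be sketched as a small state model. this is a minimal illustration only, not the s-library implementation by yusof et al.: the class and method names (`LoanLedger`, `borrow`, `request_return`, `confirm_drop`), the 14-day loan period, and the in-memory dictionary standing in for the lms database are all hypothetical. it shows the one design point the text insists on, namely that a return is finalised only after the book drop confirms the physical copy.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from typing import Dict, List


@dataclass
class Loan:
    tag_id: str
    user_id: str
    due: date
    status: str = "on_loan"  # on_loan -> return_pending -> returned


@dataclass
class LoanLedger:
    """Hypothetical stand-in for the LMS database."""
    loan_days: int = 14  # assumed loan period, not from the source
    loans: Dict[str, Loan] = field(default_factory=dict)
    history: List[Loan] = field(default_factory=list)

    def borrow(self, user_id: str, tag_id: str, today: date) -> Loan:
        """User taps a tagged book with the borrow function selected."""
        if tag_id in self.loans:
            raise ValueError(f"{tag_id} is already on loan")
        loan = Loan(tag_id, user_id, today + timedelta(days=self.loan_days))
        self.loans[tag_id] = loan
        self.history.append(loan)
        return loan

    def request_return(self, user_id: str, tag_id: str) -> None:
        """User selects 'return' in the app and taps the book."""
        loan = self.loans.get(tag_id)
        if loan is None or loan.user_id != user_id:
            raise ValueError("no matching loan for this user")
        loan.status = "return_pending"

    def confirm_drop(self, tag_id: str) -> None:
        """Book drop reads the tag; only now is the loan closed."""
        loan = self.loans.get(tag_id)
        if loan is None or loan.status != "return_pending":
            raise ValueError("no pending return for this tag")
        loan.status = "returned"
        del self.loans[tag_id]

    def transactions(self, user_id: str) -> List[Loan]:
        """Transaction history for one user, oldest first."""
        return [loan for loan in self.history if loan.user_id == user_id]
```

keeping `request_return` and `confirm_drop` as separate steps means a tap in the app alone never clears a loan; the record stays open until the physical item is seen at the book drop.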
with barcode technology, the quality of barcodes, printing clarity, print contrast ratio, and the low level of security were all challenges. rfid technology had many drawbacks, such as a lack of common rfid standards, security vulnerabilities, and reader and tag collision, which happens when multiple tags are energized by the rfid tag reader simultaneously and reflect their respective signals back to the reader at the same time. because nfc is touch based, it presents a viable alternative for library users that overcomes these weaknesses of the older technologies. yusof et al. found many advantages to s-library: faster book borrowing; time saved for the user as well as the library staff; a connection that can be initialised in less than a second; no configuration required on the mobile device; and higher usability ratings and security.28 however, there are also some limitations of s-library. first, device compatibility is an issue, because s-library presently supports only the android platform. second, as the s-library application only supports up to a 10-centimeter range, coverage is an issue.
mobile payments
nfc technology can be used for several library functions such as making payments, paying library fines, purchasing tickets to library events, or donating to the library. users may also be able to use their digital wallet to pay for photocopying, printing, scanning, etc. with the future requirements of nfc technology in mind, libraries should enquire about the possibility of adding nfc payment capabilities to existing hardware and when purchasing new machines. already, bibliotheca’s smartserv 1000 self-serve kiosk, introduced in september 2013, includes nfc as a payment option. nfc integration by other library automation companies will also be worth monitoring in the future.29
library access and authentication
nfc-enabled devices can be used to access the library and authenticate users.
these capabilities suggest that nfc technology may play an important role in the next generation of identity management systems. of particular interest in this context are several applications of nfc in two-factor authentication, which generally combines a traditional password or other digital credential with a physical, nfc-enabled component. for example, an authentication system could require the user to type in a fixed password in addition to tapping an nfc-enabled phone, identity card, or ring to the device they are logging in to. ibm has demonstrated a two-factor authentication method for mobile payment in which a user first types in a password and then taps an nfc-enabled credit card, issued by their bank, to their nfc-enabled smartphone. libraries could investigate similar access and authentication applications for nfc, both for internal use (staff badges and keys) and for public services. particularly if nfc mobile payment finally gains consumer traction, library patrons may begin to expect that they can use their nfc-enabled mobile devices to replace not just their credit cards but also their library cards. already, d-tech’s rfid air self check unit allows library patrons to log into their user accounts by tapping their nfc-enabled phone to the kiosk. the patron then uses the kiosk’s rfid reader to check out their library materials and receives a receipt via email or sms. beyond its application in circulation, nfc authentication can be applied to streamline access to other services and resources of the library.30 nfc-enabled devices could be used to reserve library spaces such as classrooms, auditoriums, community halls, digital media labs, and meeting rooms. library users could use nfc authentication to access digital library resources, such as databases, e-journals, e-book collections, and other digital collections.
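the two-factor pattern described above (something the user knows plus an nfc-enabled item the user holds) can be sketched as follows. this is an illustrative sketch under stated assumptions, not a description of the ibm or d-tech systems: the `enroll` and `verify` functions, the in-memory `_USERS` store, and the use of a tag identifier string as the second factor are hypothetical, and a real deployment would rely on a cryptographic challenge to the tag or secure element rather than a static identifier.

```python
import hashlib
import hmac
import os
from typing import Dict, Tuple

# user_id -> (salt, password hash, enrolled tag id); hypothetical in-memory store
_USERS: Dict[str, Tuple[bytes, bytes, str]] = {}


def _hash_password(password: str, salt: bytes) -> bytes:
    """Derive a salted password hash with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def enroll(user_id: str, password: str, tag_id: str) -> None:
    """Register a password and the identifier of the user's NFC item."""
    salt = os.urandom(16)
    _USERS[user_id] = (salt, _hash_password(password, salt), tag_id)


def verify(user_id: str, password: str, tapped_tag_id: str) -> bool:
    """Grant access only if BOTH the password and the tapped tag match."""
    record = _USERS.get(user_id)
    if record is None:
        return False
    salt, stored_hash, enrolled_tag = record
    # compare_digest avoids leaking match position through timing
    password_ok = hmac.compare_digest(stored_hash, _hash_password(password, salt))
    tag_ok = hmac.compare_digest(enrolled_tag.encode(), tapped_tag_id.encode())
    return password_ok and tag_ok
```

for example, after `enroll("patron1", "s3cret", "04:a2:2b:19")`, a call to `verify` succeeds only when the same password is supplied together with the enrolled tag identifier; either factor alone is rejected.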
nfc might allow libraries of all kinds to provide more convenient access and authentication options to users, though privacy and security considerations would certainly need to be addressed. nfc access and authentication will certainly have an impact on academic libraries. at universities where nfc access systems are deployed, student identification cards can be replaced with nfc-enabled mobile phones for after-hours services such as library building entry, wifi access, and printing, copying, and scanning services. the inconvenience of multiple logins can be eliminated. however, libraries will have to take responsibility for protecting student information and library resources with added security.31
promotion of library services
librarians can borrow ideas from commercial implementations of nfc-based marketing to enhance promotions for library resources, services, and events. as a first step, as kane and schneidewind suggested, nfc tags can complement several promotional uses of qr codes that have already been piloted or implemented in libraries.32 for promotional use, libraries can easily embed nfc tags in their new book displays, linking them to the bestseller list or current acquisitions lists in the library catalog or digital collections. similarly, if the reference book collection is tagged with nfc tags, it could be linked to the relevant digital collections of databases or e-books. nfc tags can be placed on library building doors or on library promotional material to share information such as library hours, opening days, schedules of events, membership rules, or floor plans for the building.
as an example, at the renison university college library in ontario, canada, visitors can tap an nfc-enabled “library smartcard” to retrieve a digital brochure of library services in a variety of formats, including pdf, epub, and mp3.33 to promote outreach programs and events, libraries can take advantage of nfc’s interactive capabilities instead of merely sharing links. as an example, libraries could use nfc tags on their event posters so that users can scan them to register for an event, save the event to their personal calendar, join the friends of the library program, or even download a library app. users could tap a smart poster promoting a virtual reference service to send a text message to a librarian. nfc-enabled promotional materials can engage users with library content even when they are outside of the library building itself. a brilliantly creative example was created by the field museum of chicago. it used nfc-enabled outdoor smart posters throughout the city to promote an exhibit on the 1893 world’s fair. the event posters depicted a personage from 1893 who invited the viewer to “see what they saw.” users could tap their nfc-enabled mobile device to the smart poster (or read a qr code) to download an app from the field museum that included 360° images of the fair as well as videos highlighting items in the exhibition.34
inventory control
the smart packaging use case brings forward a very important question for libraries that use rfid for inventory control: can existing rfid tags and infrastructure be leveraged to provide additional services to patrons with nfc-enabled mobile devices? the concept is not new; walsh envisioned using library rfid tags to store book recommendations or other digital information, which users could then access with a conveniently located rfid reader.
35 what nfc brings to walsh’s vision is that a dedicated rfid reader may no longer be necessary; a patron could use their own nfc-enabled smartphone to read a tag rather than taking the item to a special location to be read. indeed, with nfc, library records and metadata could theoretically be stored on and retrieved from library physical holdings themselves, allowing a patron to tap a book or resource borrowed from the library to recall its title, author, and due date. an exciting and immediate use for nfc in libraries is self-checkout: a patron browsing the stacks could tap an nfc-tagged book with their nfc-enabled phone to check it out without visiting the circulation desk or waiting in line.36
smart packaging
a sector close to librarians’ hearts is publishing, and several publishers have started testing smart packaging for books, using embedded nfc tags to share additional content with readers such as book reviews and reading lists. with digital extras, the concept of smart packaging has significant implications for libraries as a new opportunity to connect physical collections with digital ones (i.e., from books to digital media). one can envision that in the future, when a user taps an nfc-enabled library book, they will get access to relevant digital information, such as bibliographic information in a variety of citation formats, editorial reviews, the author’s biography, a projected rating for the book, and links to other similar information.
borrowing and returning books
one of a library’s key functions is circulating physical books from the library’s collections. due to the low cost of barcode technology, many libraries around the world use it for circulation management.
however, barcode technology has several constraints: it requires a line of sight to the barcode, it does not provide security for the library collection, it does not offer any benefit for collection management, and it makes it challenging for libraries to satisfy the increasing demands of their users, for example, reserving books that are out on loan or checking their transaction history. this leads to the need to implement a new technology to improve library circulation management, inventory, and the security of library collections. librarians are known as early adopters of technology and have started using rfid to provide circulation services in a more effective and efficient manner, to secure library collections, and to satisfy the increasing demands of users; for example, putting tags in books allows multiple books to be issued together by placing a stack of books near a reader.
recommendations
according to mchugh and yarmey, the implementation of nfc has been slow and unsteady, and they do not foresee an immediate implementation in libraries.37 however, they recommend that librarians learn about and prepare for nfc.
they recommend, for example, that librarians:
• follow the progress of research and scholarship on nfc and the commercial progress of nfc technology to better anticipate its adoption in your community;
• experiment with nfc technology and develop prototype applications for nfc use in the library;
• offer an informational workshop on nfc for users and library colleagues;
• ask the rfid vendor about tag compatibility with nfc and about rewriting the tags;
• monitor the progress of security and privacy aspects of nfc technology and educate users about these issues; develop or update your library security policy;
• allow patrons to “opt-in” to any nfc services at your library, providing other modes of communication where possible;
• develop and share best practices for nfc implementations; and
• support research on nfc in libraries via planning grants, research forums, and conference sessions.
conclusions
beyond the potential benefits of nfc, librarians should also be aware of and prepared for the privacy and security concerns that accompany the technology. user privacy is of the utmost concern. nfc involves users’ mobile devices generating, collecting, storing, and sharing a significant amount of personal data. several of these functions, particularly mobile payment, necessitate the exchange of highly confidential data, including but not limited to a user’s financial accounts and purchase history. spam may also be a concern, as unwanted content (e.g., advertisements, coupons, or adware) could be sent to users’ mobile devices without their consent. librarians should also use special caution when considering the implementation of nfc for library promotions or services. security is a significant concern and an active area of research, as many nfc implementations involve the exchange of sensitive financial or otherwise personal data.
an important concept in nfc security, particularly in the context of mobile payment, is the idea of a tamper-proof “secure element” as a basic protection for sensitive or confidential data such as account information and credentials for authentication.38 outside of continued standardization, the most effective measures for protecting nfc data transmissions are data encryption and the establishment of a secure channel between the sending and receiving devices (e.g., using a key agreement protocol and/or via ssl). for security concerns, as with privacy concerns, librarians have a crucial role to play in user education. there are important steps that individual users can and should take to protect their devices, e.g., setting a lock code for their device, knowing how to remotely wipe a stolen phone, and installing and regularly updating antivirus software. however, many users are unaware of the vulnerability of their mobile devices and often fail to enact even basic protections. by empowering objects and people to communicate with each other at a different level and establishing a “touch to share” paradigm, nfc technology has the potential to transform the information environment surrounding our libraries and fundamentally alter the ways in which library patrons interact with information.
endnotes
1 doaa abdel-gaber and abdel-aleem ali, “near-field communication technology and its impact in smart university and digital library: comprehensive study,” journal of library and information sciences 3, no. 2 (december 2015): 43–77, https://doi.org/10.15640/jlis.v3n2a4.
2 “nfc technology discover what nfc is, and how to use it,” accessed march 17, 2019, https://www.unitag.io/nfc/what-is-nfc.
3 apuroop kalapala, “analysis of near field communication (nfc) and other short range mobile communication technologies” (project report, indian institute of technology, roorkee, 2013), accessed march 19, 2019, https://idrbt.ac.in/assets/alumni/pt-2013/apuroop%20kalapala_analysis%20of%20near%20field%20communication%20(nfc)%20and%20other%20short%20range%20mobile%20communication%20technologies_2013.pdf.
4 ed, “near field communication vs radio frequency identification,” accessed march 10, 2019, http://www.nfcnearfieldcommunication.org/radio-frequency.html.
5 “what it does,” nfc forum, accessed march 12, 2019, https://nfc-forum.org/what-is-nfc/what-it-does.
6 josé bravo et al., “m-health: lessons learned by m-experiences,” sensors 18, 1569 (2018): 1–27, https://doi.org/10.3390/s18051569.
7 vedat coskun, busra ozdenizci, and kerem ok, “the survey on near field communication,” sensors 15, no. 6 (2015): 13348–405, https://doi.org/10.3390/s150613348.
8 coskun, ozdenizci, and ok, “the survey on near field communication,” 13352.
9 coskun, ozdenizci, and ok, “the survey on near field communication.”
10 “how nfc works?,” cnrfid, accessed january 12, 2019, http://www.centrenational-rfid.com/how-nfc-works-article-133-gb-ruid-202.html.
11 coskun, ozdenizci, and ok, “the survey on near field communication,” 13352.
12 c. ruth, “nfc forum calls for breakthrough solutions for annual competition,” accessed march 21, 2019, https://nfc-forum.org/newsroom/nfc-forum-calls-for-breakthrough-solutions-for-annual-competition/.
13 m. roland, “near field communication (nfc) technology and measurements,” accessed may 12, 2019, https://cdn.rohdeschwarz.com/pws/dl_downloads/dl_application/application_notes/1ma182/1ma182_5e_nfc_white_paper.pdf.
14 roland, “near field communication (nfc) technology and measurements.”
15 “what is a near field communication tag (nfc tag)?,” techopedia, accessed may 27, 2019, https://www.techopedia.com/definition/28812/near-field-communication-tag-nfc-tag.
16 “what is meant by the nfc tag?,” quora, accessed july 12, 2019, https://www.quora.com/what-is-meant-by-the-nfc-tag.
17 s. profis, “everything you need to know about nfc and mobile payments,” accessed june 27, 2019, https://www.cnet.com/how-to/how-nfc-works-and-mobile-payments/.
18 “the 5 nfc tag types,” accessed march 24, 2019, https://www.dummies.com/consumer-electronics/5-nfc-tag-types/.
19 abdel-gaber and ali, “near-field communication technology and its impact in smart university and digital library,” 64–71.
20 iviane ramos de luna et al., “nfc technology acceptance for mobile payments: a brazilian perspective,” review of business management 19, no. 63 (2017): 82–103, https://doi.org/10.7819/rbgn.v0i0.2315.
21 rajiv, “applications and future of near field communication,” accessed march 14, 2019, https://www.rfpage.com/applications-near-field-communication-future/.
22 “nfc in public transport,” nfc forum, accessed april 12, 2019, http://www.smart-ticketing.org/downloads/papers/nfc_in_public_transport.pdf.
23 “gaming applications with rfid and nfc technology,” smarttech, accessed may 14, 2019, https://www.smarttec.com/en/applications/gaming.
24 sheli mchugh and kristen yarmey, “near field communication: recent developments and library implications,” synthesis lectures on emerging trends in librarianship 1, no. 1 (march 2014), 1–93.
25 m.k. yusof et al., “adoption of near field communication in s-library application for information science,” new library world 116, no. 11/12 (2015): 728–47, https://doi.org/10.1108/nlw-02-2015-0014.
26 yusof et al., “adoption of near field communication,” 734–36.
27 yusof et al., “adoption of near field communication,” 744.
28 yusof et al., “adoption of near field communication,” 745.
29 abdel-gaber and ali, “near-field communication technology and its impact in smart university and digital library,” 64.
30 mchugh and yarmey, “near field communication,” 27.
31 mchugh and yarmey, “near field communication,” 734.
32 danielle kane and jeff schneidewind, “qr codes as finding aides: linking electronic and print library resources,” public services quarterly 7, no. 3–4 (2011): 111–24, https://doi.org/10.1080/15228959.2011.623599.
33 mchugh and yarmey, “near field communication,” 31.
34 mchugh and yarmey, “near field communication,” 31.
35 andrew walsh, “blurring the boundaries between our physical and electronic libraries: location-aware technologies, qr codes and rfid tags,” the electronic library 29, no. 4 (2011): 429–37, https://doi.org/10.1108/02640471111156713.
36 projes roy and shailendra kumar, “application of rfid in shaheed rajguru college of applied sciences for women library, university of delhi, india: challenges and future prospects,” qualitative and quantitative methods in libraries 5, no. 1 (2016): 117–130, http://www.qqml-journal.net/index.php/qqml/article/view/310.
37 mchugh and yarmey, “near field communication,” 61–2.
38 garima jain and sanjeet dahiya, “nfc: advantages, limits and future scope,” international journal on cybernetics & informatics 4, no. 4 (2015): 1–12, https://doi.org/10.5121/ijci.2015.4401.
by now, most library and information technology association (lita) members and information technology and libraries (ital) readers know that 2006 is the fortieth anniversary of lita’s predecessor, the information science and automation division (isad) of the american library association (ala). and 2007 marks the fortieth birthday of ital, first published in 1967 as the journal of library automation (jola). i hope that members and readers know the vital role played by fred kilgour in the founding of the division and as jola’s founding editor. this issue marks the initiation of a two-volume celebration (volumes 25 and 26) of his role as founding editor by publishing what we hope are significant articles resulting from original research, the development of important and creative new systems, or explications of significant new technologies that will shape future information technologies. i have invited some of the authors of these articles to submit their manuscripts. others are being submitted in response to a call i published both in an earlier editorial and in a message to the lita-l discussion list. whether invited or submitted, they will receive the same double-blind refereeing that all ital articles undergo. the referees will not know which articles have been invited or submitted for this purpose.
the articles will, however, be so designated when they are published. volume 25 initiates a second landmark for ital. henceforth, ital will be published simultaneously in electronic and print versions. the electronic copy will be available to lita members and ital subscribers on the ala/lita web site. equally significantly, at the 2006 ala midwinter meeting in san antonio, the lita board of directors approved a second proposal from the lita publications committee. (the ital editor and editorial board report to the publications committee.) after six months, the electronic issues will be open to all, not restricted to members and subscribers. put simply, if you are a member or subscriber reading this issue in print, you may also read it and volume 25, number 1 (the march 2006 issue) on the web. when volume 25, number 3 is published in september 2006, the march issue on the web will be open for anyone to read. when the december issue is published, this june e-issue will be open to all. the web versions are to be published in both pdf and html versions. most ital articles now include urls. readers will be able to link to them. most figures and graphs submitted by authors are in color. from now on, these will be available to the readers of the e-copies. ala publishing allows authors to submit their articles to institutional repositories, and many authors now do so. authors will retain this option. some articles have been posted on other portals as well. martha yee’s outstanding june 2005 article on how to frbrize the opac appears not only on ucla’s repository site but also on the escholarship repository site of the university of california system, one of the few library-related articles on the site (http://repositories.cdlib.org/escholarhip). furthermore, on november 29, 2005, it was among the top ten most popular articles on the site. recently, dlist (http://dlist.sir.arizona.edu) at the university of arizona library received permission to include it. 
the decisions to allow simultaneous publication of print and electronic versions and to allow open access after six months were not made lightly. the lita board members carried on extensive electronic discussions among themselves and with nancy colyar, chair of the publications committee, and me. lita president pat mullin’s summary of those discussions was more than ten single-spaced pages. nancy and i also attended a meeting of the board in san antonio. publications and memberships are two chief sources of revenue for almost all professional associations. in two surveys in the past ten years, lita members have indicated they considered ital to be their most important membership benefit. lita membership fell this year, probably because of the recent dues increases by other divisions of ala. this decline was anticipated by lita’s leadership. i think both the ital editorial board and the lita leadership would love to take the additional pioneering step of making our journal a full open-access publication. however, legitimate concern was expressed that opening access after six months might lead to both a decrease in members and subscribers. a significant number of lita leaders said that their membership was based on lita programs, participation, and interaction with colleagues, not just ital. i hope that all lita members feel the same. i further hope that lita members will do everything they can to discourage their libraries from canceling their subscriptions. our financial health would be enhanced if all lita members took two other steps: participating in writing and encouraging the writing of significant articles, and encouraging your many library technology vendors to advertise in ital. fred kilgour and the other founders of our division were library information technology (it) pioneers. fred’s leadership helped make jola and now ital vital reading for library it professionals. 
i believe that by celebrating the lita/ital anniversaries with a reconfirmation of our practice of publishing articles of the highest quality and by making ital more accessible through electronic publication, we are reaffirming the scholarly and professional commitments first made by fred kilgour and his isad colleagues such a short forty years ago. john webb
john webb (jwebb@wsu.edu) is assistant director for systems and planning, washington state university libraries, pullman, and editor of information technology and libraries.
editorial: lita and ital: forty and still counting
privacy and user experience in 21st century library discovery shayna pekala information technology and libraries | june 2017 48
abstract
over the last decade, libraries have taken advantage of emerging technologies to provide new discovery tools to help users find information and resources more efficiently. in the wake of this technological shift in discovery, privacy has become an increasingly prominent and complex issue for libraries. the nature of the web, over which users interact with discovery tools, has substantially diminished the library’s ability to control patron privacy. the emergence of a data economy has led to a new wave of online tracking and surveillance, in which multiple third parties collect and share user data during the discovery process, making it much more difficult, if not impossible, for libraries to protect patron privacy. in addition, users are increasingly starting their searches with web search engines, diminishing the library’s control over privacy even further. while libraries have a legal and ethical responsibility to protect patron privacy, they are simultaneously challenged to meet evolving user needs for discovery.
in a world where “search” is synonymous with google, users increasingly expect their library discovery experience to mimic their experience using web search engines.1 however, web search engines rely on a drastically different set of privacy standards, as they strive to create tailored, personalized search results based on user data. libraries are seemingly forced to make a choice between delivering the discovery experience users expect and protecting user privacy. this paper explores the competing interests of privacy and user experience, and proposes possible strategies to address them in the future design of library discovery tools. introduction on march 23, 2017, the internet erupted with outrage in response to the results of a senate vote to roll back federal communications commission (fcc) rules prohibiting internet service providers (isps), such as comcast, verizon, and at&t, from selling customer web browsing histories and other usage data without customer permission. less than a week after the senate vote, the house followed suit and similarly voted in favor of rolling back the fcc rules, which were set to go into effect at the end of 2017.2 the repeal became official on april 3, 2017 when the president signed it into law.3 this decision by u.s. lawmakers serves as a reminder that today’s internet economy is a data economy, where personal data flows freely on the web, ready to be compiled and sold to the highest bidder. continuous online tracking and surveillance has become the new normal. shayna pekala (shayna.pekala@georgetown.edu) is discovery services librarian, georgetown university library, washington, dc. privacy and user experience in 21st century library discovery | pekala https://doi.org/10.6017/ital.v36i2.9817 49 isps are just one of the many players in the online tracking game. 
major web search engines, such as google, bing, and yahoo, also collect information about users’ search histories, among other personal information.4 by selling this data to advertisers, data brokers, and/or government agencies, these search engine companies are able to make a profit while providing the search engines themselves for “free.” in addition to profiting off of user data, web search engines also use it to enhance the user experience of their products. collecting and analyzing user data enables systems to learn user preferences, providing personalized search results that make it easier to navigate the ever-increasing sea of online information. the collection and sharing of user data that occurs on the open web is deeply troubling for libraries, whose professional ethics embody the values of privacy and intellectual freedom. a user’s search history contains information about a user’s thought process, and the monitoring of these thoughts inhibits intellectual inquiry.5 libraries, however, would be remiss to dismiss the success of web search engines and their use of data altogether. mit’s preliminary report on the future of libraries urges, “while the notion of ‘tracking’ any individual’s consumption patterns for research and educational materials is anathema to the core values of libraries...the opportunity to leverage emerging technologies and new methodologies for discovery should not be discounted.”6 this article examines the current landscape of library discovery, the competing interests of privacy and user experience at play, and proposes possible strategies to address them in the future design of library discovery tools. background library discovery in the digital age the advent of new technologies has drastically shaped the way libraries support information discovery. 
while users once relied on shelf-browsing and card catalogs to find library resources, libraries now provide access to a suite of online tools and interfaces that facilitate cross-collection searching and access to a wide range of materials. in an online environment, many paths to discovery are possible, with the open web playing a newfound and significant role. today’s library discovery tools fall into three categories: online catalogs (the patron interface of the integrated library system (ils)), discovery layers (a patron interface with enhanced functionality that is separate from an ils), and web-scale discovery tools (an enhanced patron interface that relies on a central index to bring together resources from the library catalog, subscription databases, and digital repositories).7 these tools are commonly integrated with a variety of external systems, including proxy servers, inter-library loan, subscription databases, individual publisher websites, and more. for the most part, libraries purchase discovery tools from third-party vendors. while some libraries use open source discovery layers, such as blacklight or vufind, there are currently no open source options for web-scale discovery tools.8 outside of the library, web search engines (e.g. google, bing, and yahoo), and targeted academic discovery products (e.g. google scholar, researchgate, and academia.edu) provide additional systems that enable discovery.9 in fact, web search engines, particularly google, play a significant role in the research process. both students and faculty use google in conjunction with library discovery tools. students typically use google at the beginning of the research process to get a better understanding of their topic and identify secondary search terms.
faculty, on the other hand, use google to find out how other scholars are thinking about a topic.10 unsurprisingly, google and google scholar provide the majority of content access to major content platforms.11
the data economy and online privacy concerns
in an information discovery environment that is primarily online, new threats to patron privacy emerge. in today’s economy, user data has become a global commodity. commercial businesses have recognized the value of data mining for marketing purposes. björn bloching et al. explain, “from cleverly aggregated data points, you can draw multiple conclusions that go right to the heart and mind of the customer.”12 along the same lines, the ability to collect and analyze user data is extremely valuable to government agencies for surveillance purposes, creating an additional data-driven market.13 the increasing value of user data has drastically expanded the business of online tracking. in her book, dragnet nation, journalist julia angwin outlines a detailed taxonomy of trackers, including various types of government, commercial, and individual trackers.14 in the online information discovery process, multiple parties collect user data at different points. consider the following scenario: a user executes a basic keyword search in google to access an openly available online resource. in the fifteen seconds it takes the user to get to that resource, information about the user’s search is collected by the internet service provider (isp), the web browser, the search engine, the website hosting the resource, and any third-party trackers embedded in the website. the search query, along with the user’s internet protocol (ip) address, becomes part of the data collector’s profile on the user.
in the future, the data collector can sell the user’s profile to a data broker, where it will be merged with profiles from other data collectors to create an even more detailed portrait of the user.15 the data broker, in turn, can sell the complete dataset to the government, law enforcement, commercial businesses, and even criminals. this creates serious privacy concerns, particularly since users have no legal right over how their data is bought and sold.16
privacy protection in libraries
libraries have deeply-rooted values in privacy and strong motivations to protect it. intellectual freedom, the foundation on which libraries are built, necessarily requires privacy. in its interpretation of the library bill of rights, the american library association (ala) explains, “in a library (physical or virtual), the right to privacy is the right to open inquiry without having the subject of one’s interest examined or scrutinized by others.”17 many studies support this idea, having found that people who are indiscriminately and secretly monitored censor their behavior and speech.18 libraries have both legal and ethical obligations to protect patron privacy. while there is no federal legislation that protects privacy in libraries, forty-eight states have regulations regarding the confidentiality of library records, though the extent of these protections varies by state.19 because these statutes were drafted before the widespread use of the internet, they are phrased in a way that addresses circulation records and does not specifically include or exclude internet use records (records with information on sites accessed by patrons) from these protections.
therefore, according to theresa chmara, libraries should not treat internet use records any differently than circulation records with respect to confidentiality.20 the library community has established many guiding documents that embody its ethical commitment to protecting patron privacy. the ala code of ethics states in its third principle, “we protect each library user's right to privacy and confidentiality with respect to information sought or received and resources consulted, borrowed, acquired or transmitted.”21 the international federation of library associations and institutions (ifla) code of ethics has more specific language about data sharing, stating, “the relationship between the library and the user is one of confidentiality and librarians and other information workers will take appropriate measures to ensure that user data is not shared beyond the original transaction.”22 the library community has also established practical guidelines for dealing with privacy issues in libraries, particularly those issues relating to digital privacy, including the ala privacy guidelines23 and the national information standards organization (niso) consensus principles on user’s digital privacy in library, publisher, and software-provider systems.24 additionally, the library freedom project was launched in 2015 as an educational resource to teach librarians about privacy threats, rights, and tools, and in 2017, the library and information technology association (lita) released a set of seven privacy checklists25 to help libraries implement the ala privacy guidelines. personalization of online systems while user data can be used for tracking and surveillance, it can also be used to improve the digital user experience of online systems through personalization. 
because the growth of the internet has made it increasingly difficult to navigate the continually growing sea of information online, researchers have put significant effort into designing interfaces, interaction methods, and systems that deliver adaptive and personalized experiences.26 ansgar koene et al. explain, “the basic concept behind personalization of on-line information services is to shield users from the risk of information overload, by pre-filtering search results based on a model of the user’s preferences… a perfect user model would…enable the service provider to perfectly predict the decision a user would make for any given choice.”27 the authors go on to describe three main flavors of personalization systems:
1. content-based systems, in which the system recommends items based on their similarity to items that the user expressed interest in;
2. collaborative-filtering systems, in which users are given recommendations for items that other users with similar tastes liked in the past; and
3. community-based systems, in which the system recommends items based on the preferences of the user’s friends.28
many popular consumer services, such as amazon.com, youtube, netflix, google, etc., have increased (and continue to increase) the level of personalization that they offer.29 one such service in the area of academic resource discovery is google scholar’s updates, which analyzes a user’s publication history in order to predict new publications of interest.30 libraries, in contrast, have favored privacy and have not pressed their developers and vendors to personalize their services, even though studies have shown that users expect library tools to mimic their experience using web search engines.31 some web-scale discovery services do, however, allow researchers to set personalization preferences, such as their field of study, and, according to roger schonfeld, it is likely that many researchers would benefit tremendously from increased personalization in discovery.32 in this vein, the american philosophical society library recently launched a new recommendation tool for archives and manuscripts that uses circulation data and user-supplied interests to drive recommendations.33
opportunities for user experience in library discovery
a major challenge in today’s online discovery environment is that the user is inhibited by an overwhelming number of results. this leads users to rely on relevance rankings and to fail to examine search results in depth. creating fine-tuned relevance ranking algorithms based on user behavior is one remedy to this problem, but it relies on the use of personal user data.34 however, there may be opportunities to facilitate data-driven discovery while maintaining the user’s anonymity that would be suitable for library (and other) discovery tools.
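one way to picture data-driven ranking that keeps the user anonymous is the short sketch below. it is a minimal illustration under stated assumptions, not a method taken from any of the cited authors or from a discovery vendor: the `Record` type, the `text_score` field, the `popularity_weight` value of 0.3, and the sample titles and counts are all hypothetical. the only usage signal it reads is an aggregate checkout count per item, so no patron identifier or search history is involved.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """A catalog record with a text-match score and an aggregate checkout count."""
    title: str
    text_score: float  # hypothetical relevance score supplied by the search index
    checkouts: int     # total checkouts for the item; carries no patron identifiers


def blended_score(record: Record, popularity_weight: float = 0.3) -> float:
    """Combine text relevance with log-damped aggregate popularity."""
    # log1p keeps very popular items from swamping the text match entirely.
    popularity = math.log1p(record.checkouts)
    return (1.0 - popularity_weight) * record.text_score + popularity_weight * popularity


def rank(records: list[Record], popularity_weight: float = 0.3) -> list[Record]:
    """Return records ordered from highest to lowest blended score."""
    return sorted(
        records,
        key=lambda r: blended_score(r, popularity_weight),
        reverse=True,
    )


# Made-up sample data for illustration only.
results = [
    Record("specialized monograph", text_score=2.0, checkouts=3),
    Record("introductory textbook", text_score=1.8, checkouts=400),
    Record("conference proceedings", text_score=1.5, checkouts=0),
]
ranked_titles = [r.title for r in rank(results)]
```

with these made-up numbers the heavily borrowed textbook moves ahead of the monograph that matched the query slightly better, while setting `popularity_weight` to 0 restores pure text-relevance order. whether a boost of this kind actually helps users would still have to be established through testing.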
irina trapido proposes that relevance ranking algorithms could be designed to leverage the popularity of a resource, as measured by its circulation statistics, or to rank popular or introductory materials higher than more specialized ones, helping users make sense of large result sets.35 michael schofield proposes “context-driven design” as an intermediary solution, whereby the user opts in to have the system infer context from neutral device or browser information, such as the time of day, business hours, weather, events, holidays, etc.36 jason clark describes a search prototype he built that applies these principles, but he questions whether these types of enhancements actually add value to users.37 rachel vacek cautions that personalization is not guaranteed to be useful or meaningful, and continuous user testing is key.38
discussion
there are several aspects to consider for the design of future library discovery tools. the integrated, complex nature of the web causes privacy to become compromised during the information discovery process. library discovery tools have been designed not to retain borrowing records, but have not yet evolved to mask user behavior, which is invaluable in today’s data economy. it is imperative that all types of library discovery tools have built-in functionality to protect patron privacy beyond borrowing records, while also enabling the ethical use of patron data to improve user experience. even if library discovery tools were to evolve so that they themselves were absolutely private (where no data were ever collected or shared), other online parties (isps, web browsers, advertisers, data brokers, etc.) would still have access to user data through other means, such as cookies and fingerprinting. the operating reality is such that privacy is not immediately and completely controllable by libraries.
laurie rinehart-thompson explains, “in the big picture, privacy is at the mercy of ethical and stewardship choices on the part of all information handlers.”39 while libraries alone cannot guarantee complete privacy for their patrons, they can and should mitigate privacy risks to the greatest extent possible. at the same time, ignoring altogether the benefits of using patron data to improve the discovery user experience may threaten the library’s viability in the age of google. roger schonfeld explains, “if systems exclude all personal data and use-related data, the resulting services will be one-dimensional and sterile. i consider it essential for libraries to deliver dynamic and personalized services to remain viable in today's environment; expectations are set by sophisticated social networks and commercial destinations.”40 libraries must find ways to keep up with greater industry trends while adhering to professional ethics.
recommendations
while libraries have traditionally shied away from collecting data about patron transactions, these conservative tendencies run counter to the library’s mission to provide outstanding user experience and the need to evolve in a rapidly changing information industry. as the profession adopts new technologies, ethical dilemmas tied to their use present themselves. while several library organizations have issued guidance for libraries about the role of user data in these new technologies, this does not go far enough. the niso privacy principles, for instance, acknowledge that they are merely “a starting point.”41 examining the substance of these guidelines is important for confronting the privacy challenges facing library discovery in the 21st century, but there are additional steps libraries can take to more fully address the competing interests of privacy and user experience in library discovery and in library technologies more generally.
holding third parties accountable
libraries are increasingly at the mercy of third parties when it comes to the development and design of library discovery tools. unfortunately, these third parties do not have the same ethical obligations to protect patron privacy that librarians do. in addition, the existing guidance for protecting user data in library technologies is directed towards librarians, not third party vendors. the library community must hold third parties accountable for the ethical design of library discovery tools. one strategy for doing this would be to develop a ranking or certification process for discovery tools based on a community set of standards. the development of hipaa-compliant records management systems in the medical field sets an example. because healthcare providers are required by law to guarantee the privacy of patient data,42 they must select electronic health records systems (erms) that have been certified by an office of the national coordinator for health information technology (onc)-authorized body.43 in order to be certified, the system must adhere to a set of criteria adopted by the department of health and human services,44 which includes privacy and security standards.45 another example is the consumer reports standard and testing program for consumer privacy and security, which is currently in development. consumer reports explains the reason for developing this new privacy standard, “if consumer reports and other public-interest organizations create a reasonable standard and let people know which products do the best job of meeting it, consumer pressure and choices can change the marketplace.”46 libraries could potentially adapt the consumer reports standards and rating system for library discovery tools and other library technologies.
engaging in ux research & design
libraries should not rely on third parties alone to address privacy and user experience requirements for library discovery tools. libraries are well-poised to become more involved in the design process itself by actively engaging in user experience research and design. the opportunities for “context-driven design” and personalization based on circulation and other anonymous data are promising for library discovery but require ample user testing to determine their usefulness. understanding which types of personalization features offer the most value while preserving privacy is key to accelerating the design of library discovery tools. the growth of user experience librarian jobs and the emergence of user experience teams and departments in libraries signal an increasing amount of user experience expertise in the field, which can be leveraged to investigate these important questions for library discovery.
illuminating the black box
when librarians adopt new discovery tools without fully understanding their underlying technologies and the data economy in which they operate, this does not serve users. librarians have ethical obligations that should require them to thoroughly understand how and when user data is captured by library discovery tools and other web technologies, and how this information is compiled and shared at a higher level. not only do librarians need to understand the technical aspects of discovery technologies, they also need to understand the related user experience benefits and privacy concerns and the resulting ethical implications. as technology continues to evolve, librarians should be required to engage in continued learning in these areas. such technology literacy skills could be incorporated in the curriculum of library and information science degree programs, as well as in ongoing professional development opportunities.
empowering library users
because information discovery in an online environment introduces new privacy risks, communication about this topic between librarians and patrons is paramount. librarians should proactively discuss with patrons the potential risks to their privacy when conducting research online, whether they are using the open web or library discovery tools. it is ultimately up to the patron to weigh their needs and preferences in order to decide which tools to use, but it is the librarian’s responsibility to empower patrons to be able to make these decisions in the first place.
conclusion
with the rollback of the fcc privacy rules that prohibited isps from selling customer search histories without customer permission, understanding digital privacy issues and taking action to protect patron privacy is more important than ever. while privacy and user experience are both necessary and important components of library discovery systems, their requirements are in direct conflict with each other. an absolutely private discovery experience would mean that no user data is ever collected during the search process, whereas a completely personalized discovery experience would mean that all user data is collected and utilized to inform the design and features of the system. it is essential for library discovery tools to have built-in functionality that protects patron privacy to the greatest extent possible and enables the ethical use of patron data to improve user experience. the library community must take action to address these requirements beyond establishing guidelines. holding third party providers to higher privacy standards is a starting point. in addition, librarians themselves need to engage in user experience research and design to discover and test the usefulness of possible intermediary solutions.
librarians must also become more educated as a profession on digital privacy issues and their ethical implications in order to educate patrons about their fundamental rights to privacy and empower them to make decisions about which discovery tools to use. collectively, these strategies enable libraries to address user needs, uphold professional ethics, and drive the future of library discovery.
references
1. irina trapido, “library discovery products: discovering user expectations through failure analysis,” information technology and libraries 35, no. 3 (2016): 9-23, https://doi.org/10.6017/ital.v35i3.9190. 2. brian fung, “the house just voted to wipe away the fcc’s landmark internet privacy protections,” the washington post, march 28, 2017, https://www.washingtonpost.com/news/the-switch/wp/2017/03/28/the-house-justvoted-to-wipe-out-the-fccs-landmark-internet-privacy-protections. 3. jon brodkin, “president trump delivers final blow to web browsing privacy rules,” ars technica, april 3, 2017, https://arstechnica.com/tech-policy/2017/04/trumps-signaturemakes-it-official-isp-privacy-rules-are-dead/. 4. nathan freed wessler, “how private is your online search history?” aclu free future (blog), https://www.aclu.org/blog/how-private-your-online-search-history. 5. julia angwin, dragnet nation (new york: times books, 2014), 41-42. 6. mit libraries, institute-wide task force on the future of libraries (2016), 12, https://assets.pubpub.org/abhksylo/futurelibrariesreport.pdf. 7. trapido, “library discovery products,” 10. 8. marshall breeding, “the future of library resource discovery,” niso white papers, niso, baltimore, md, 2015, 4, http://www.niso.org/apps/group_public/download.php/14487/future_library_resource_discovery.pdf. 9. christine wolff, alisa b. rod, and roger c. schonfeld, ithaka s+r us faculty survey 2015 (new york: ithaka s+r, 2016), 11, https://doi.org/10.18665/sr.277685. 10.
deirdre costello, “students and faculty research differently” (presentation, computers in libraries, washington, d.c., march 28, 2017), http://conferences.infotoday.com/documents/221/a103_costello.pdf. 11. roger c. schonfeld, meeting researchers where they start: streamlining access to scholarly resources (new york: ithaka s+r, 2015), https://doi.org/10.18665/sr.241038. 12. björn bloching, lars luck, and thomas ramge, in data we trust: how customer data is revolutionizing our economy (london: bloomsbury publishing, 2012), 65. 13. angwin, 21-36. 14. ibid., 32-33. 15. natasha singer, “mapping, and sharing, the consumer genome,” new york times, june 16, 2012, http://www.nytimes.com/2012/06/17/technology/acxiom-the-quiet-giant-ofconsumer-database-marketing.html. 16. lois beckett, “everything we know about what data brokers know about you,” propublica, june 13, 2014, https://www.propublica.org/article/everything-we-know-about-what-databrokers-know-about-you. 17. “an interpretation of the library bill of rights,” american library association, amended july 1, 2014, http://www.ala.org/advocacy/intfreedom/librarybill/interpretations/privacy. 18. angwin, dragnet nation, 41-42. 19. anne klinefelter, “privacy and library public services: or, i know what you read last summer,” legal references services quarterly 26, no. 1-2 (2007): 258-260, https://doi.org/10.1300/j113v26n01_13. 20. theresa chmara, privacy and confidentiality issues: guide for libraries and their lawyers (chicago: ala editions, 2009), 27-28. 21. “code of ethics of the american library association,” american library association, privacy and user experience in 21st century library discovery | pekala https://doi.org/10.6017/ital.v36i2.9817 57 amended january 22, 2008, http://www.ala.org/advocacy/proethics/codeofethics/codeethics. 22. 
“ifla code of ethics for librarians and other information workers,” international federation of library associations and institutions, august 12, 2012, http://www.ifla.org/news/ifla-code-of-ethics-for-librarians-and-other-informationworkers-full-version. 23. “privacy & surveillance,” american library association, approved 2015-2016, http://www.ala.org/advocacy/privacyconfidentiality. 24. national information standards organization, niso consensus principles on users’ digital privacy in library, publisher, and softwareprovider systems (niso privacy principles), published on december 10, 2015, http://www.niso.org/apps/group_public/download.php/15863/niso%20consensus%20pr inciples%20on%20users%92%20digital%20privacy.pdf. 25. “library privacy checklists,” library and information technology association, accessed march 7, 2017, http://www.ala.org/lita/advocacy. 26. panagiotis germanakos and marios belk, “personalization in the digital era,” in humancentred web adaptation and personalization: from theory to practice, (switzerland: springer international publishing switzerland, 2016), 16. 27. ansgar koene et al., “privacy concerns arising from internet service personalization filters,” acm sigcas computers and society 45, no. 3 (2015): 167. 28. ibid., 168. 29. ibid. 30. james connor, “scholar updates: making new connections,” google scholar blog, https://scholar.googleblog.com/2012/08/scholar-updates-making-new-connections.html. 31. schonfeld, meeting researchers where they start, 2. 32. roger c. schonfeld, does discovery still happen in the library?: roles and strategies for a shifting reality (new york: ithaka s+r, 2014), 10, https://doi.org/10.18665/sr.24914. 33. abigail shelton, “american philosophical society announces launch of pal, an innovative recommendation tool for research libraries,” american philosophical society, april 3, 2017, https://www.amphilsoc.org/press/pal. 34. trapido, “library discovery products,” 17. 35. ibid. 36. 
michael schofield, “does the best library web design eliminate choice?” libux, september information technology and libraries | june 2017 58 11, 2015, http://libux.co/best-library-web-design-eliminate-choice/. 37. jason a. clark, “anticipatory design: improving search ux using query analysis and machine cues,” weave: journal of library user experience 1, no. 4 (2016), https://doi.org/10.3998/weave.12535642.0001.402. 38. rachel vacek, “customizing discovery at michigan” (presentation, electronic resources & libraries, austin, tx, april 4, 2017), https://www.slideshare.net/vacekrae/customizingdiscovery-at-the-university-of-michigan. 39. laurie a. rinehart-thompson, beth m. hjort, and bonnie s. cassidy, “redefining the health information management privacy and security role,” perspectives in health information management 6 (2009): 4.s 40. marshall breeding, “perspectives on patron privacy and security,” computers in libraries 35, no. 5 (2015): 13. 41. national information standards organization, niso consensus principles. 42. joel jpc rodrigues, et al., “analysis of the security and privacy requirements of cloud-based electronic health records systems,” journal of medical internet research 15, no. 8 (2013), https://www.ncbi.nlm.nih.gov/pmc/articles/pmc3757992/. 43. office of the national coordinator for health information technology, guide to privacy and security of electronic health information, april 2015, https://www.healthit.gov/sites/default/files/pdf/privacy/privacy-and-security-guide.pdf. 44. office of the national coordinator for health information technology, “health it certification program overview,” january 30, 2016, https://www.healthit.gov/sites/default/files/publichealthitcertificationprogramovervie w_v1.1.pdf. 45. 
office of the national coordinator for health information technology, “2015 edition health information technology (health it) certification criteria, base electronic health record (ehr) definition, and onc health it certification program modifications final rule,” october 2015, https://www.healthit.gov/sites/default/files/factsheet_draft_2015-10-06.pdf. 46. consumer reports, “consumer reports to begin evaluating products, services for privacy and data security,” consumer reports, march 6, 2017, http://www.consumerreports.org/privacy/consumer-reports-to-begin-evaluatingproducts-services-for-privacy-and-data-security/. hitting the road towards a greater digital destination: evaluating and testing dams at university of houston libraries annie wu, santi thompson, rachel vacek, sean watkins, and andrew weidner information technology and libraries | june 2016 5 abstract since 2009, tens of thousands of rare and unique items have been made available online for research through the university of houston (uh) digital library. six years later, the uh libraries’ new digital initiatives call for a more dynamic digital repository infrastructure that is extensible, scalable, and interoperable. the uh libraries’ mission and the mandate of its strategic directions drives the pursuit of seamless access and expanded digital collections. to answer the calls for technological change, the uh libraries administration appointed a digital asset management system (dams) implementation task force to explore, evaluate, test, recommend, and implement a more robust digital asset management system. this article focuses on the task force’s dams selection activities: needs assessment, systems evaluation, and systems testing. the authors also describe the task force’s dams recommendation based on the evaluation and testing data analysis, a comparison of the advantages and disadvantages of each system, and system cost. 
finally, the authors outline their dams implementation strategy comprised of a phased rollout with the following stages: system installation, data migration, and interface development.

annie wu (awu@uh.edu) is head of metadata and digitization services, santi thompson (sathompson3@uh.edu) is head of repository services, rachel vacek (evacek@uh.edu) is head of web services, sean watkins (slwatkins@uh.edu) is web projects manager, and andrew weidner (ajweidner@uh.edu) is metadata services coordinator, university of houston libraries.

hitting the road towards a greater digital destination: evaluating and testing dams at university of houston libraries | wu et al. | doi:10.6017/ital.v35i2.9152 6

introduction
since the launch of the university of houston digital library (uhdl) in 2009, the uh libraries have made tens of thousands of rare and unique items available online for research using contentdm. as we began to explore and expand into new digital initiatives, we realized that the uh libraries’ digital aspirations require a more dynamic, flexible, scalable, and interoperable digital asset management system that can manage larger amounts of materials in a variety of formats. we plan to implement a new digital repository infrastructure that accommodates creative workflows and allows for the configuration of additional functionalities such as digital exhibits, data mining, cross-linking, geospatial visualization, and multimedia presentation. the new system will be designed with linked data in mind and will allow us to publish our digital collections as linked open data within the larger semantic web environment. the uh libraries strategic directions set forth a mandate for us to “work assiduously to expand our unique and comprehensive collections that support curricula and spotlight research.
we will pursue seamless access and expand digital collections to increase national recognition.”1 to fulfill the uh libraries’ mission and the mandate of our strategic directions, the uh libraries administration appointed a digital asset management system (dams) implementation task force to explore, evaluate, test, recommend, and implement a more robust digital asset management system that would provide multiple modes of access to the uh libraries’ unique collections and accommodate digital object production at a larger scale. the collaborative task force comprises librarians from four departments: metadata and digitization services (mds), web services, digital repository services, and special collections. the core charge of the task force is to:
• perform a needs assessment and build criteria and policies based on evaluation of the current system and requirements for the new dams
• research and explore dams on the market and identify the top three systems for beta testing in a development environment
• generate preliminary recommendations from stakeholders' comments and feedback
• coordinate installation of the new dams and finish data migration
• communicate the task force work to uh libraries colleagues

literature review
libraries have maintained dams for the publication of digitized surrogates of rare and unique materials for over two decades. during that time, information professionals have developed evaluation strategies for testing, comparing, and evaluating library dams software. reviewing these models and associated case studies provided insight into common practices for selecting systems and informed how the uh libraries dams implementation task force conducted its evaluation process.
one of the first publications of its kind, “a checklist for evaluating open source digital library software” by dion hoe-lian goh et al., presents a comprehensive list of criteria for library dams evaluation.2 the researchers developed twelve broad categories for testing (e.g., content management, metadata, and preservation) and generated a scoring system based on the assignment of a weight and a numeric value to each criterion.3 while the checklist was created to assist with the evaluation process, the authors note that an institution’s selection decision should be guided primarily by defining the scope of their digital library, the content being curated using the software, and the uses of the material.4 through their efforts, the authors created a rubric that can be utilized by other organizations when selecting a dams.

subsequent research projects have expanded upon the checklist evaluation model. in “choosing software for a digital library,” jody deridder outlines major issues that librarians should address when choosing dams software, including many of the hardware, technological, and metadata concerns that goh et al. identified.5 additionally, she emphasizes the need to account for personnel and service requirements with a variety of activities: usability testing and estimating associated costs; conducting a formal needs assessment to guide the evaluation process; and a tiered-testing approach, which calls upon evaluators to winnow the number of systems.6 by considering stakeholder needs, from users to library administrators, deridder’s contributions inform a more comprehensive dams evaluation process.

in addition to creating evaluation criteria, the literature on dams selection has also produced case studies that reflect real-world scenarios and identify use cases that help determine user needs and desires. in “evaluation of digital repository software at the national library of medicine,” jennifer l.
marill and edward c. luczak discuss the process that the national library of medicine (nlm) used to compare ten dams, both proprietary and open-source.7 echoing goh et al. and deridder, marill and luczak created broad categories for testing and developed a scoring system for comparing dams.8 additionally, marill and luczak enriched the evaluation process by implementing two testing phases: “initial testing of ten systems” and “in-depth testing of three systems.”9 this method allowed nlm to conduct extensive research on the most promising systems for their needs before selecting a dams to implement. the tiered approach appealed to the task force, and influenced how it conducted the evaluation process, because it balances efficiency and comprehensiveness. in another case study, dora wagner and kent gerber describe the collaborative process of selecting a dams across a consortium. in their article “building a shared digital collection: the experience of the cooperating libraries in consortium,”10 the authors emphasize additional criteria that are important for collaborating institutions: the ability to brand consortial products for local audiences; the flexibility to incorporate differing workflows for local administrators; and the shared responsibility of system maintenance and costs.11 while the uh libraries will not be managing a shared repository dams, the task force appreciated the article’s emphasis on maximizing customizations to improve the user experience. 
in “evaluation and usage scenarios of open source digital library and collection management tools,” georgios gkoumas and fotis lazarinis describe how they tested multiple open-source systems against typical library functions—such as acquisitions, cataloging, digital libraries, and digital preservation—to identify typical use cases for libraries.12 some of the use cases formulated by the researchers address digital platforms, including features related to supporting a diverse array of metadata schema and using a simple web interface for the management of digital assets.13 these use cases mirror local feature and functionality requests incorporated into the uh libraries’ evaluation criteria.

in “digital libraries: comparison of 10 software,” mathieu andro, emmanuelle asselin, and marc maisonneuve discuss a rubric they developed to compare six open-source platforms (invenio, greenstone, omeka, eprints, ori-oai, and dspace) and four proprietary platforms (mnesys, digitool, yoolib, and contentdm) around six core areas: document management, metadata, engine, interoperability, user management, and web 2.0.14 the authors note that each solution is “of good quality” and that institutions should consider a variety of factors when selecting a dams, including the “type of documents you will want to upload” and the “political criteria (open source or proprietary software)” desired by the institution.15 this article provided the uh libraries with additional factors to include in their evaluation criteria.
finally, heather gilbert and tyler mobley’s article “breaking up with contentdm: why and how one institution took the leap to open source,” provides a case study for a new trend: selecting a dams for migration from an existing system to a new one.16 the researchers cite several reasons for their need to select a new dams, primarily their current system’s limitations with searching and displaying content in the digital library.17 they evaluated alternatives and selected a suite of open-source tools, including fedora, drupal, and blacklight, which combine to make up their new dams.18 gilbert and mobley also reflect on the migration process and identify several hurdles they had to overcome, such as customizing the open-source tools to meet their localized needs and confronting inconsistent metadata quality.19 gilbert and mobley’s article most closely matches the scenario faced by the uh libraries.

our study adds to the limited literature on evaluating and selecting dams for migration in several ways. it demonstrates another model that other institutions can adapt to meet their specific needs. it identifies new factors for other institutions to take into account before or during their own migration process. finally, it adds to the body of evidence for a growing movement of libraries migrating from proprietary to open-source dams.

dams evaluation and analysis methodology

needs assessment
the dams implementation task force fulfilled the first part of its charge by conducting a needs assessment. the goal of the needs assessment was to collect the key requirements of stakeholders, identify future features of the new dams, and gather data in order to craft criteria for evaluation and testing in the next phase of its work.
the task force employed several techniques for information gathering during the needs assessment phase:
• identified stakeholders and held internal focus group interviews to identify system requirement needs and gaps
• reviewed scholarly literature on dams evaluation and migration
• researched peer/aspirational institutions
• reviewed national standards around dams
• determined both the current and the projected use of uhdl
• identified uhdl materials and users

task force members took detailed notes during each focus group interview session. the literature research on dams evaluation helped the task force to find articles with comprehensive dams evaluation criteria. the niso criteria for core types of entities in digital library collections were also listed and applied to the evaluation after reviewing the niso framework of guidance for building good digital collections.20 more than forty peer and aspirational institutions’ digital repositories were benchmarked to identify web site names, platform architecture, documentation, and user and system features. the task force analyzed the rich data gathered from needs assessment activities and built the dams evaluation criteria that prepared the task force for the next phase of evaluation.

evaluation, testing, and recommendation
the task force began its evaluation process by identifying twelve potential dams for consideration that were ultimately narrowed down to three systems for in-depth testing. using data from focus group interviews, literature reviews, and dams best practices, the group generated a list of benchmark criteria. these broad evaluation criteria covered features in categories of system functionality, content management, metadata, user interface, and search support. members of the task force researched dams documentation, product information, and related literature to score each system against the evaluation criteria.
table 1 contains the scores of the initial evaluation. from this process, five systems emerged with the highest scores:
● fedora (and, closely associated, fedora/hydra and fedora/islandora)
● collective access
● dspace
● rosetta
● contentdm

the task force eliminated collective access from the final systems for testing because of its limited functionality. it is based around archival content only, and is not widely deployed. the task force decided not to test contentdm because of the system’s known functionalities that we identified through firsthand experience. after the initial elimination process, fedora (including fedora/hydra and fedora/islandora), dspace, and rosetta remained for in-depth testing.

dams                  evaluation score*
fedora                27
fedora/hydra          26
fedora/islandora      26
collective access     24
dspace                24
rosetta               20
contentdm             20
trinity (ibase)       19
preservica            16
luna imaging          15
roda†                 6
invenio‡              5

* total possible score: 29.
† removed from evaluation because the system does not support dublin core metadata.
‡ removed from evaluation because the system does not support dublin core metadata.

table 1. evaluation scores of twelve dams using broad evaluation criteria

the task force then created detailed evaluation and testing criteria by drawing from the same sources used previously: focus groups, literature review, and best practices. while the broad evaluation focused on high-level functions, the detailed evaluation and testing criteria for the final three systems closely analyzed the specific features of each dams in eight categories:
● system environment and function
● administrative access
● content ingest and management
● metadata
● content access
● discoverability
● report and inquiry capabilities
● system support
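the short list of five systems can be re-derived from the table 1 scores. the following is a minimal sketch, not the task force's actual procedure: it assumes that the fedora variants are counted as one family (as the text groups them) and that ties keep their table order. the names `family` and `top_families` are hypothetical helpers introduced only for illustration.

```python
# Broad-evaluation scores copied from table 1; the maximum possible score is 29.
MAX_SCORE = 29
TABLE_1 = {
    "fedora": 27,
    "fedora/hydra": 26,
    "fedora/islandora": 26,
    "collective access": 24,
    "dspace": 24,
    "rosetta": 20,
    "contentdm": 20,
    "trinity (ibase)": 19,
    "preservica": 16,
    "luna imaging": 15,
    "roda": 6,
    "invenio": 5,
}


def family(name):
    """Hypothetical grouping rule: 'fedora/hydra' counts toward 'fedora'."""
    return name.split("/")[0]


def top_families(scores, n):
    """Return the n best-scoring families as (family, best score) pairs."""
    best = {}
    for name, score in scores.items():
        fam = family(name)
        best[fam] = max(best.get(fam, 0), score)
    # sorted() is stable, so tied families keep their table 1 order.
    return sorted(best.items(), key=lambda item: item[1], reverse=True)[:n]


if __name__ == "__main__":
    for fam, score in top_families(TABLE_1, 5):
        print(f"{fam}: {score}/{MAX_SCORE} ({score / MAX_SCORE:.0%})")
```

under these assumptions the five families returned match the list given above, with trinity (ibase) as the first system below the cut.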
prior to the in-depth testing of the final three systems, the task force researched timelines for system setup. rosetta’s timeline for system setup proved to be prohibitive. consequently, the task force eliminated rosetta from the testing pool and moved forward with fedora and dspace. to conduct the detailed evaluation, the task force scored the specific features under each category utilizing systems testing and documentation. a score range from zero to three (0 = none, 1 = low, 2 = moderate, 3 = high) was assigned for each feature evaluated. after evaluating all features, the score was tallied for each category. our testing revealed that fedora outperformed dspace in over half of the testing sections: content ingest and management, metadata, content access, discoverability, and report and inquiry capabilities. see table 2 for the tallied scores in each testing section.

testing sections                    dspace score    fedora score    possible score
system environment and testing      21              21              36
administrative access               15              12              18
content ingest and management       59              96              123
metadata                            32              43              51
content access                      14              18              18
discoverability                     46              84              114
report and inquiry capabilities     6               15              21
system support                      12              11              12
total score:                        205             300             393

table 2. scores of top two dams from testing using detailed evaluation criteria

after review of the testing results, the task force conducted a facilitated activity to summarize the advantages and disadvantages of each system.
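the tallies in table 2 can be checked with a few lines of code. this is a minimal sketch that only re-adds the published section scores; the per-feature 0 to 3 ratings behind each section are not given in the article, and the function names `totals` and `fedora_leads` are hypothetical.

```python
# (dspace score, fedora score, possible score) per testing section, from table 2.
TABLE_2 = {
    "system environment and testing": (21, 21, 36),
    "administrative access": (15, 12, 18),
    "content ingest and management": (59, 96, 123),
    "metadata": (32, 43, 51),
    "content access": (14, 18, 18),
    "discoverability": (46, 84, 114),
    "report and inquiry capabilities": (6, 15, 21),
    "system support": (12, 11, 12),
}


def totals(table):
    """Sum each column and return (dspace total, fedora total, possible total)."""
    dspace = sum(row[0] for row in table.values())
    fedora = sum(row[1] for row in table.values())
    possible = sum(row[2] for row in table.values())
    return dspace, fedora, possible


def fedora_leads(table):
    """List the sections in which fedora scored strictly higher than dspace."""
    return [name for name, (dspace, fedora, _) in table.items() if fedora > dspace]


if __name__ == "__main__":
    dspace_total, fedora_total, possible_total = totals(TABLE_2)
    print(f"dspace: {dspace_total}/{possible_total}")
    print(f"fedora: {fedora_total}/{possible_total}")
    print("fedora leads in:", ", ".join(fedora_leads(TABLE_2)))
```

the column sums agree with the total row of table 2, and fedora leads in five of the eight sections, which is the basis for the "over half" statement.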
based on this comparison, the dams task force recommended that the uh libraries implement a fedora/hydra repository architecture with the following course of action:
● adapt the uhdl user interface to fedora and re-evaluate it for possible improvements
● develop an administrative content management interface with the hydra framework
● migrate all uhdl content to a fedora repository

fedora/hydra advantages                    fedora/hydra disadvantages
open source                                steep learning curve
large development community                long setup time
linked data ready                          requires additional tools for discovery
modular design through api                 no standard model for multi-file objects
scalable, sustainable, and extensible
batch import/export of metadata
handles any file format

table 3. fedora/hydra advantages and disadvantages

the primary advantages of a dams based on fedora/hydra are: a large and active development community; a scalable and modular system that can grow quickly to accommodate large scale digitization; and a repository architecture based on linked data technologies. this last advantage, in particular, is unique among all systems evaluated, and will give the uh libraries the ability to publish our collections as linked open data. fedora 4 conforms to the world wide web consortium (w3c) recommendation for linked data platforms.21 the main disadvantage of a fedora/hydra system is the steep learning curve associated with designing metadata models and developing a customized software suite, which translates to a longer implementation time compared to off-the-shelf products. the uh libraries must allocate an appropriate amount of time and resources for planning, implementation, and staff training.
the long-term return on investment for this path will be a highly skilled technical staff with the ability to maintain and customize an open-source, standards-based repository architecture that can be expanded to support other uh libraries content such as geospatial data, research data, and institutional repository materials.

dspace advantages                                      dspace disadvantages
open source                                            flat file and metadata structure
easy installation / ready out of box                   limited reporting capabilities
existing familiarity through texas digital library     limited metadata features
user group / profile controls                          does not support linked data
metadata quality module                                limited api
batch import of objects                                not scalable / extensible
                                                       poor user interface

table 4. dspace advantages and disadvantages

the main advantages of dspace are ease of installation, familiarity of workflows, and additional functionality not found in contentdm.22 installation and migration to a dspace system would be relatively fast, and staff could quickly transition to new workflows because they are similar to contentdm. dspace also supports authentication and user roles that could be used to limit content to the uh community only. commercial add-on modules, although expensive, could be purchased to provide more sophisticated content management tools than are currently available with contentdm.

the disadvantages of a dspace system are the same long-term, systemic problems with the current contentdm repository. dspace uses a flat metadata structure, has a limited api, does not scale well, and is not customizable to the uh libraries’ needs. consultations with peers indicated that both contentdm and dspace institutions are exploring the more robust capabilities of fedora-based systems. migration of the digital collections in contentdm to a dspace repository would provide few, if any, long term benefits to the uh libraries.
of all the systems considered, implementation of a fedora/hydra repository aligns most clearly with the uh libraries strategic directions of attaining national recognition and improving access to our unique collections. the fedora and hydra communities are very active, with project management overseen by duraspace and hydra respectively.23,24 over the long term, a repository based on fedora/hydra will give the uh libraries a low cost, scalable, flexible, and interoperable platform for providing online access to our unique collections.

cost considerations
to balance the current digital collections production schedule with the demands of a timely implementation and migration, the task force identified the following investments as cost effective for fedora/hydra and dspace, respectively:

fedora/hydra
metadata librarian: annual salary
● manages daily metadata unit operations during implementation
● streamlines the migration process

dspace
metadata librarian: annual salary
● manages daily metadata unit operations during implementation
● streamlines the migration process
@mire modules: $41,500
● content delivery (3): $13,500
● metadata quality: $10,000
● image conversion suite: $9,000
● content & usage analysis: $9,000
● these modules require one-time fees to @mire that recur when upgrading to a new version of dspace

table 5. start-up costs associated with fedora/hydra and dspace

the task force determined that an investment in one librarian’s salary is the most cost-effective course of action. the new metadata librarian will manage daily operations of the metadata unit in metadata & digitization services while the metadata services coordinator, in close collaboration with the web projects manager, leads the dams implementation process.
in contrast to fedora, migration to dspace would require a substantial investment in third party software modules from @mire to deliver the best possible content management environment and user experience.

implementation strategies
the implementation of the new dams will occur in a phased rollout comprised of the following stages: system installation, data migration, and interface development. mds and web services will perform the majority of the work, in consultation with key stakeholders from special collections and other units. throughout this process, the dams implementation task force will consult with the digital preservation task force* to coordinate the preservation and access systems.

phase one (system installation)
● set up production and server environment
● rewrite uhdl front-end application for fedora/solr
● create metadata models
● coordinate workflows with digital preservation task force
● begin development of administrative hydra head for content management

phase two (data migration)
● formulate content migration strategy and schedule
● migrate test collections and document exceptions
● conduct the data migration
● create preservation metadata for migrated data
● continue development of the hydra administrative interface

phase three (interface development)
● reevaluate front-end user interface
● rewrite uhdl front end as a hydra head or update current front end
● establish interdepartmental production workflows
● refine administrative hydra head for content management

table 6. overview of dams phased implementation

phase one: system installation
during the first phase of dams implementation, web services and mds will work closely together to install an open-source repository software stack based on fedora, rewrite the current php front-end interface to provide public access to the data in the new system, and create metadata content models for the uhdl based on the portland common data model,25 in consultation with the coordinator of digital projects from special collections and other key stakeholders. the dams task force will consult with the digital preservation task force† to determine how closely the preservation and access systems will be integrated and at what points. the two groups will also jointly outline a dams migration strategy that aligns with the preservation system. web services and mds will collaborate on research and development of an administrative interface, based on the hydra framework, for day-to-day management of uhdl content.

* an appointed task force to create a digital preservation policy and identify strategies, actions, and tools needed to sustain long-term access to digital assets maintained by uh libraries.
† a working team at uh libraries that enforces the digital preservation policy and maintains the digital preservation system.

phase two: data migration
in the second phase, mds will migrate legacy content from contentdm to the new system and work with web services, special collections, and the architecture and art library to resolve any technical, metadata, or content problems that arise. the second phase will begin with the development of a strategy for completing the work in a timely fashion, followed by migration of representative sample collections to the new system to test and refine its capabilities.
after testing is complete, all legacy content will be migrated from contentdm to fedora, and preservation metadata for migrated collections will be created and archived. development work on the hydra administrative interface will also continue. after the data migration is complete, all new collections will be ingested into fedora/hydra, and the current contentdm installation will be retired. phase three: interface development in the final phase, web services will reevaluate the current front-end user interface (ui) for the uhdl by conducting user tests to better understand how and why users are visiting the uhdl. web services will also analyze web and system analytics and gather feedback from special collections and other stakeholders. depending on the outcome of this research, web services may create a new ui based on the hydra framework or choose to update the current front-end application with modifications or new features. web services and mds will also continue to develop or adopt tools for the management of uhdl content and work with special collections and the branch libraries to establish production workflows in the new system. continued development work on the front-end and administrative interfaces, for the life of the new digital asset management system, is both expected and desirable as we maintain and improve the uhdl infrastructure and contribute to the open source software community in line with the uh libraries strategic directions. ongoing: assessment, enhancement, training, and documenting throughout the transition process mds and web services will undergo extensive training in workshops and conferences to develop the skills necessary for developing and maintaining the new system. they will also establish and document workflows to ensure the long-term viability of the system. 
regular consultation with special collections, the branch libraries, and other stakeholders will be conducted to ensure that the new system satisfies the requirements of colleagues and patrons. ongoing activities will include:
● assessing service impact of new system
● user testing on ui
● regular system enhancements
● establishing new workflows
● creating and maintaining documentation
● training: conferences, webinars, workshops, etc.

conclusion

transitioning from contentdm to a fedora/hydra repository will place the uh libraries in a position to sustainably grow the amount of content in the uh digital library and customize the uhdl interfaces for a better user experience. publishing our data in a linked data platform will give the uh libraries the ability to more easily publish our data for the semantic web. in addition, the fedora/hydra architecture can be adapted to support a wide range of uh libraries projects, including a geospatial data portal, a research data repository, and a self-deposit institutional repository. over the long term, the return on investment for implementing an open-source repository architecture based on industry standard software will be: improved visibility of our unique collections on the web; expanded opportunities for aggregating our collections with high-profile repositories such as the digital public library of america; and increased national recognition for our digital projects and staff expertise.

references

1. “the university of houston libraries strategic directions, 2013–2016,” accessed july 22, 2015, http://info.lib.uh.edu/sites/default/files/docs/strategic-directions/2013-2016-libraries-strategic-directions-final.pdf.
2. dion hoe-lian goh et al., “a checklist for evaluating open source digital library software,” online information review 30, no. 4 (july 13, 2006): 360–79, doi:10.1108/14684520610686283.
3. ibid., 366.
4. ibid., 364.
5. jody l.
deridder, “choosing software for a digital library,” library hi tech news 24, no. 9 (2007): 19–21, doi:10.1108/07419050710874223. 6. ibid., 21. 7. jennifer l. marill and edward c. luczak, “evaluation of digital repository software at the national library of medicine,” d-lib magazine 15, no. 5/6 (may 2009), doi:10.1045/may2009marill. 8. ibid. 9. ibid. 10. dora wagner and kent gerber, “building a shared digital collection: the experience of the cooperating libraries in consortium,” college & undergraduate libraries 18, no. 2–3 (2011): 272–90, doi:10.1080/10691316.2011.577680. 11. ibid., 280–84. http://info.lib.uh.edu/sites/default/files/docs/strategic-directions/2013-2016-libraries-strategic-directions-final.pdf http://info.lib.uh.edu/sites/default/files/docs/strategic-directions/2013-2016-libraries-strategic-directions-final.pdf http://dx.doi.org/10.1108/14684520610686283 http://dx.doi.org/10.1108/07419050710874223 http://dx.doi.org/10.1045/may2009-marill http://dx.doi.org/10.1045/may2009-marill http://dx.doi.org/10.1080/10691316.2011.577680 hitting the road towards a greater digital destination: evaluating and testing dams at university of houston libraries | wu et al. | doi:10.6017/ital.v35i2.9152 18 12. georgios gkoumas and fotis lazarinis, “evaluation and usage scenarios of open source digital library and collection management tools,” program: electronic library and information systems 49, no. 3 (2015): 226–41, doi:10.1108/prog-09-2014-0070. 13. ibid., 238–39. 14. mathieu andro, emmanuelle asselin, and marc maisonneuve, “digital libraries: comparison of 10 software,” library collections, acquisitions, & technical services 36, no. 3–4 (2012): 79–83, doi:10.1016/j.lcats.2012.05.002. 15. ibid., 82. 16. heather gilbert and tyler mobley, “breaking up with contentdm: why and how one institution took the leap to open source,” code4lib journal, no. 20 (2013), http://journal.code4lib.org/articles/8327. 17. ibid. 18. ibid. 19. ibid. 20. 
niso framework working group with support from the institute of museum and library services, a framework of guidance for building good digital collections (baltimore, md: national information standards organization (niso), 2007).
21. “linked data platform 1.0,” w3c, accessed july 22, 2015, http://www.w3.org/tr/ldp/.
22. “dspace,” accessed july 22, 2015, http://www.dspace.org/.
23. “fedora repository home,” accessed july 22, 2015, https://wiki.duraspace.org/display/ff/fedora+repository+home.
24. “hydra project,” accessed july 22, 2015, http://projecthydra.org/.

editorial board thoughts: libraries as makerspace?
tod colegrove
information technology and libraries | march 2013

recently there has been tremendous interest in “makerspace” and its potential in libraries: from middle school and public libraries to academic and special libraries, the topic seems very much top of mind. a number of libraries across the country have been actively expanding makerspace within the physical library and exploring its impact; as head of one such library, i can report that reactions to the associated changes have been quite polarized.
those from the supported membership of the library have been uniformly positive, with new and established users as well as principal donors immediately recognizing and embracing its potential to enhance learning and catalyze innovation; interestingly, the minority of individuals that recoil at the idea have been either long-term librarians or library staff members. i suspect the polarization may be more a function of confusion over what makerspace actually is. this piece offers a brief overview of the landscape of makerspace—a glimpse into how its practice can dramatically enhance traditional library offerings, revitalizing the library as a center of learning. been happening for thousands of years . . . dale dougherty, founder of make magazine and maker faire, at the “maker monday” event of the 2013 american library association midwinter meeting framed the question simply, “whether making belongs in libraries or whether libraries can contribute to making.” more than one audience member may have been surprised when he continued, “it’s already been happening for hundreds of years—maybe thousands.”1 the o’reilly/darpa makerspace playbook describes the overall goals and concept of makerspace (emphasis added): “by helping schools and communities everywhere establish makerspaces, we expect to build your makerspace users' literacy in design, science, technology, engineering, art, and math. . . . we see making as a gateway to deeper engagement in science and engineering but also art and design. makerspaces share some aspects of the shop class, home economics class, the art studio and science lab. in effect, a makerspace is a physical mashup of these different places that allows projects to integrate these different kinds of skills.”2 building users’ literacies across multiple domains and a gateway to deeper engagement? surely these are core values of the library; one might even suspect that to some degree libraries have long been makerspace. 
a familiar example of maker activity in libraries might include digital media: still/video photography and audio mastering and remixing. youmedia network, funded by the macarthur patrick “tod” colegrove (pcolegrove@unr.edu), a lita member, is head of the delamare science & engineering library at the university of nevada, reno, nevada. mailto:pcolegrove@unr.edu editorial board thoughts: libraries as makerspace? | colegrove 3 institute through the institute of museum and library services, is a recent example of such effort aimed at creating transformative spaces; engaged in exploring, expressing, and creating with digital media, youth are encouraged to “hang out, mess around, and geek out.” a more pedestrian example is found in the support of users with first-time learning or refreshing of computer programming skills. as recently as the 1980s, the singular option the library had was to maintain a collection of print texts. through the 1990s and into the early 2000s, that support improved dramatically as publishers distributed code examples and ancillary documents on accompanying cd or dvd media, saving the reader the effort of manually typing in code examples. the associated collections grew rapidly, even as the overhead associated with the maintenance and weeding of a collection that was more and more rapidly obsoleted grew more. today, e-book versions combined with ready availability of computer workstations within the library, and the rapidly growing availability of web-based tutorials and support communities, render a potent combination that customers of the library can use to quickly acquire the ability to create or “make” custom applications. with the migration of the supporting print collections online, the library can contemplate further support in the physical spaces opened up. 
open working areas and whiteboard walls can further amplify the collaborative nature of such making; the library might even consider adding popular hardware development platforms to its collection of lendable technology, enabling those interested to check out a development kit rather than purchase on their own. after all, in a very real sense that is what libraries do—and have done, for thousands of years: buy sometimes expensive technology tailored to the needs and interest of the local community and make it available on a shared basis. makerspace: a continuum along with outreach opportunities, the exploration of how such examples can be extended to encompass more of the interests supported by the library is the essence of the maker movement in libraries. makerspace encompasses a continuum of activity that includes “co-working,” “hackerspace,” and “fab lab”; the common thread running through each is a focus on making rather than merely consuming. it is important to note that although the terms are often incorrectly used as if they were synonymous, in practice they are very different: for example, a fab lab is about fabrication. realized, it is a workshop designed around personal manufacture of physical items— typically equipped with computer controlled equipment such as laser cutters, multiple axis computer numerical controlled (cnc) milling machines, and 3d printers. in contrast, a “hackerspace” is more focused on computers and technology, attracting computer programmers and web designers, although interests begin to overlap significantly with the fab lab for those interested in robotics. co-working space is a natural evolution for participants of the hackerspace; a shared working environment offering much of the benefit of the social and collaborative aspects of the informal hackerspace, while maintaining a focus on work. 
as opposed to the hobbyist that might be attracted to a hackerspace, co-working space attracts independent contractors and professionals that may work from home. information technology and libraries | march 2013 4 it is important to note that it is entirely possible for a single makerspace to house all three subtypes and be part hackerspace, fab lab, and co-working space. can it be a library at the same time? to some extent, these activities are likely already ongoing within your library, albeit informally; by recognizing and embracing the passions driving those participating in the activity, the library can become central to the greater community of practice. serving the community’s needs more directly, opportunities for outreach will multiply even as it enables the library to develop a laser-sharp focus on the needs of that community. depending on constraints and the community of support, the library may also be well-served by forming collaborative ties with other local makerspace; having local partners can dramatically improve the options available to the library in day-to-day practice, and better inform the library as it takes well-chosen incremental steps. with hackerspace/co-working/fab lab resources aligned with the traditional resources of the library, engagement with one can lead naturally to the other in an explosion of innovation and creativity. 
renaissance

in addition to supporting the work of the solitary reader, “today's libraries are incubators, collaboratories, the modern equivalent of the seventeenth-century coffeehouse: part information market, part knowledge warehouse, with some workshop thrown in for good measure.”3 consider some of the transformative synergies that are already being realized in libraries experimenting with makerspace across the country:
• a child reading about robots able to go hands-on with robotics toolkits, even borrowing the kit for an extended period of time along with the book that piqued the interest; surely such access enables the child to develop a powerful sense of agency from early childhood, including a perception of self as being productive and much more than a consumer.
• students or researchers trying to understand or make sense of a chemical model or novel protein strand able not only to visualize and manipulate the subject on a two-dimensional screen, but also to print a real-world model relatively quickly and explore the subject tangibly from all angles.
• individuals synthesizing knowledge across disciplinary boundaries able to interact with members of communities of practice in a non-threatening environment; learning, developing, and testing ideas, and building rapid prototypes in software or physical media, with a librarian at the ready to assist with resources and dispense advice regarding intellectual property opportunities or concerns.

the american library association estimates that as of this printing there are approximately 121,169 libraries of all kinds in the united states today; if even a small percentage recognize and begin to realize the full impact that makerspace in the library can have, the future looks bright indeed.

references
1.
dale dougherty, “the new stacks: the maker movement comes to libraries” (presentation at the midwinter meeting of the american library association, seattle, washington, january 28, 2013). http://alamw13.ala.org/node/10004. 2. michele hlubinka et al., makerspace playbook, december 2012, accessed february 13, 2012, http://makerspace.com/playbook. 3. alex soojung-kim pang, "if libraries did not exist, it would be necessary to invent them," contemplative computing, february 6, 2012, http://www.contemplativecomputing.org/2012/02/if-libraries-did-not-exist-it-would-benecessary-to-invent-them.html. http://alamw13.ala.org/node/10004 http://makerspace.com/playbook http://www.contemplativecomputing.org/2012/02/if-libraries-did-not-exist-it-would-be-necessary-to-invent-them.html http://www.contemplativecomputing.org/2012/02/if-libraries-did-not-exist-it-would-be-necessary-to-invent-them.html editorial | truitt 3 marc truitteditorial i doubt that many of the blog people are in the habit of sustained reading of complex texts. —michael gorman, 2005 s o, three plus years after the fact, why am i opening with michael gorman’s unfortunate characterization of those he labeled “blog people”? i have no interest in reopening this debate, honestly! but the problem with generalizations, however unfair, is that at their heart there is just enough substance to make them “stick”—to give them a grain or two of credibility. gorman’s words struck a chord in me that existed before his charge and has continued to exist to this day. the substance in gorman’s words had little to do with these “blog people” as such; rather, my interest was piqued by the implications in his remark about how we all deal with “complex texts” and the “sustained reading” of the same. in a time of wide availability of full-text electronic articles, it has become so easy and tempting to cherry pick the odd phrase here or there, without study of the work as a whole. 
how has scholarship especially been changed by the ease with which we can reduce works to snippets without having considered their overall context? i’m not arguing that scholarly research and writing hasn’t always been at least in part about finding the perfect juicy quotation around which we then weave our own theses. many of us well recall the boxes of 3x5” citation and 5x8” quotation files that we or our patrons laboriously assembled through weeks, months, and years of detailed research. but if the style of compiling these files that i witnessed (and indeed did) is any guide, their existence was the product of precisely that “sustained reading of complex texts” of which gorman spoke. my vague, nagging sense is that what is changing is this style of approaching whole texts. i wondered then about how much scholarly research today is driven by keyword searches of digitized texts that then essentially produce “virtual quotation files” without our having had to struggle with their context in the whole of the original source text? fast forward three years. lately, several articles touching on our changing ways of interacting with resources have appeared in both scholarly and popular venues, and these have served to underline my sense that we are missing something because of our growing lack of engagement with whole texts. writing in the july/august issue of the atlantic monthly, nicholas carr asks “is google making us stupid?” drawing an analogy to the scene in the film 2001: a space odyssey, in which astronaut dave bowman disables supercomputer hal’s memory circuits, carr says i can feel it, too. over the past few years i’ve had an uncomfortable sense that someone, or something, has been tinkering with my brain, remapping the neural circuitry, reprogramming the memory. my mind isn’t going—so far as i can tell—but it’s changing. i’m not thinking the way i used to think. i can feel it most strongly when i’m reading. 
immersing myself in a book or a lengthy article used to be easy. my mind would get caught up in the narrative or the turns of the argument, and i’d spend hours strolling through long stretches of prose. that’s rarely the case anymore. now my concentration often starts to drift after two or three pages. i get fidgety, lose the thread, begin looking for something else to do. i feel as if i’m always dragging my wayward brain back to the text. the deep reading that used to come naturally has become a struggle.1 carr goes on to explain that “what the net seems to be doing is chipping away my capacity for concentration and contemplation. my mind now expects to take in information the way the net distributes it: in a swiftly moving stream of particles. once i was a scuba diver in the sea of words. now i zip along the surface like a guy on a jet ski.”2 carr’s nagging fear found similar expression among some tech-savvy participants of library online forums; one of the more interesting comments appeared on the web4lib electronic discussion list. in a discussion of the article, tim spalding of librarything observed that he himself had experienced what he dubbed “the google effect” and noted something is lost. . . . human culture often advances by externalizing pieces of our mental life—writing externalizes memory, calculators externalize arithmetic, maps, and now gps, externalize way-finding, etc. each shift changes the culture. and each shift comes with a cost. nobody memorizes texts anymore, nobody knows the times tables past ten or twelve and nobody can find their way home from the stars and the side of the tree the moss grows on.3 meanwhile, another article appeared on a closely related topic, this time in the journal science. james a. 
evans observed that, because “scientists and scholars tend to search electronically and follow hyperlinks rather than browse or peruse,” the easy availability of electronic resources was resulting in an “ironic change” for scientific marc truitt (marc.truitt@ualberta.ca) is associate director, bibliographic and information technology services, university of alberta libraries, edmonton, alberta, canada, and editor of ital. 4 information technology and libraries | september 2008 scholarship, in that as more journal issues came online, the articles referenced tended to be more recent, fewer journals and articles were cited, and more of those citations were to fewer journals and articles. the forced browsing of print archives may have stretched scientists and scholars to anchor findings deeply into past and present scholarship. searching online is more efficient and following hyperlinks quickly puts researchers in touch with prevailing opinion, but this may accelerate consensus and narrow the range of findings and ideas built upon.4 evans’s research highlights an additional irony: an unintended benefit to the scholarly process in the paperbased world was “poor indexing,” since it encouraged browsing through less relevant, older, or more marginal literature. this browsing had the effect of “facilitat[ing] broader comparisons and led researchers into the past. modern graduate education parallels this shift in publication—shorter in years, more specialized in scope, culminating less frequently in a true dissertation than an album of articles.”5 what is one to make of all of this? at the outset, i wish to state clearly that i am not some sort of anti e-text luddite. electronic texts are a fact of life, and are becoming moreso every day. even though they are in their infancy as a medium, they’ve already transformed the landscape of bibliographic access. my interest is not with the tool, but with the manner in which we are using it. 
i began by suggesting that i share with gorman a concern about how we increasingly engage with “complex texts” today. unlike him, though, my concern is not limited only to the so-called blog people (whomever they may be), but indeed, it includes all of us. with the explosion in easily accessible electronic texts, our ideas and habits concerning interaction with these texts are changing, sometimes in unintended ways. in a recent informal survey i conducted of my colleagues at work, i asked, “have you ever read an e-book (not just a journal article) from (virtual) cover to (virtual) cover?” for those whose answer was affirmative, i also asked, “how many such books have you read in their entirety?” out of twenty-odd responses, three individuals answered that yes, they had had occasion to read an entire e-book (for a total of six books among the three “yes” respondents, which seemed surprisingly high to me). of greater interest, though, were those who chose to question the premise of the survey, arguing that people don’t “read” e-books the way that they read paper ones. it does make one wonder, then, how amazon thinks it possesses a viable business model in the kindle e-book reader, for which it currently lists an astounding 140,000+ available e-books. clearly, some e-books are being read as whole texts, by some people, for some purposes. but i suspect that’s another story.6 carr and evans use slightly differing imagery to describe a similar phenomenon. 
carr closes with a reference back to the death of 2001’s hal, saying, “as we come to rely on computers to mediate our understanding of the world, it is our own intelligence that flattens into artificial intelligence.”7 evans, on the other hand, compares contemporary scientific researchers to newton and darwin, each of whom produced works that “not only were engaged in current debates, but wove their propositions into conversation with astronomers, geometers, and naturalists from centuries past.” twenty-first-century scientists and scholars, by contrast, are able because of readily available electronic resources “to frame and publish their arguments more efficiently, [but] they weave them into a more focused—and more narrow—past and present.” 8 perhaps the most succinct statement, though, comes from librarything’s tim spalding, who summarized the problem thusly: “we advance by becoming dumber.”9 an ital research and publishing opportunity for an inquisitive and enterprising scholar, perhaps? i’d welcome the manuscript! shameless plugs department. by the time you read this, we at ital will have launched our new blog, italica (http://ital-ica.blogspot.com). italica addresses a need we on the ital editorial board have long sensed; that is, an area for “letters to the editor,” updates to articles, supplementary materials we can’t work into the journal—you name it. one of the most important features of italica will be a forum for readers’ conversations with our authors: we’ll ask authors to host and monitor discussion for a period of time after publication so that you’ll then have a chance to interact with them. italica is currently a pilot project. for our first issue we will have begun with a discussion hosted by jennifer bowen, whose article “metadata to support next-generation library resource discovery: lessons from the extensible catalog, phase i” was published in the june 2008 issue of ital. 
for our second italica, we plan to expand coverage and discussion to include all articles and other features in the september issue you now have in hand. italica is sure to become a stimulating supplement to and forum for topics originating in ital. we look forward to seeing you there!

references and notes

extract. michael gorman, “revenge of the blog people!” library journal (feb. 15, 2005) www.libraryjournal.com/article/ca502009.html (accessed july 21, 2008).
1. nicholas carr, “is google making us stupid?” the atlantic monthly 301 (july/aug. 2008) www.theatlantic.com/doc/200807/google (accessed july 23, 2008).
2. ibid.
3. tim spalding, “re: ‘is google making us stupid? what the internet is doing to our brains,’” web4lib discussion list post, june 19, 2008, http://article.gmane.org/gmane.education.web4lib/12349 (accessed july 24, 2008).
4. james a. evans, “electronic publication and the narrowing of science and scholarship,” science (july 18, 2008) www.sciencemag.org/cgi/content/full/321/5887/395 (accessed july 24, 2008). emphasis added.
5. ibid.
6. as of 5:30pm (est), july 24, 2008, amazon’s website listed 145,591 “kindle books.” www.amazon.com/s/qid=1216934603/ref=sr_hi?ie=utf8&rs=154606011&bbn=154606011&rh=n%3a154606011&page=1.
7. carr, “is google making us stupid?”
8. evans, “electronic publication and the narrowing of science and scholarship.”
9. spalding, “re: ‘is google making us stupid?’”

name-title entry retrieval from a marc file

philip l. long, head, automated systems research and development, and frederick g. kilgour, director: ohio college library center, columbus, ohio

a test of validity of earlier findings on 3,3 search-key retrieval from an in-process file for retrieval from a marc file. probability of number of entries retrieved per reply is essentially the same for both files.
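the 3,3 search key and the per-reply probabilities named above can be sketched in a few lines. the python below is an illustration only, not the authors' program. it assumes that the key is the first three characters of the name followed by the first three characters of the title, that case, spaces, and punctuation are ignored, and that only english initial articles are dropped. the function names and the sample entries are hypothetical. the second function counts how many entries share each key and reports, for replies of one to five entries, the cumulative percentage of keys, treating every key as equally likely to be chosen.

```python
from collections import Counter

# Assumption: only English initial articles are removed from titles.
ENGLISH_ARTICLES = {"a", "an", "the"}


def search_key(name, title, n_name=3, n_title=3):
    """Build a truncated name-title key (3,3 by default).

    Assumed form: first n_name characters of the name, a comma, then the
    first n_title characters of the title after dropping an initial article.
    """
    words = title.lower().split()
    if words and words[0] in ENGLISH_ARTICLES:
        words = words[1:]
    name_part = "".join(ch for ch in name.lower() if ch.isalnum())[:n_name]
    title_part = "".join(ch for ch in "".join(words) if ch.isalnum())[:n_title]
    return f"{name_part},{title_part}"


def cumulative_reply_percentages(entries, max_replies=5):
    """Cumulative percentage of keys that retrieve at most 1..max_replies entries.

    Every distinct key is treated as equally likely to be chosen.
    """
    entries_per_key = Counter(search_key(name, title) for name, title in entries)
    keys_per_reply_size = Counter(entries_per_key.values())
    total_keys = len(entries_per_key)
    percentages = []
    running = 0
    for size in range(1, max_replies + 1):
        running += keys_per_reply_size.get(size, 0)
        percentages.append(round(100.0 * running / total_keys, 2))
    return percentages


# Hypothetical sample entries, invented for illustration only.
SAMPLE_ENTRIES = [
    ("Kilgour, Frederick", "The Library and Its Machines"),
    ("Kilgour, Frederick", "Library Automation Notes"),
    ("Long, Philip", "Name-Title Retrieval"),
    ("Avram, Henriette", "A MARC Primer"),
]

if __name__ == "__main__":
    for name, title in SAMPLE_ENTRIES:
        print(search_key(name, title), "<-", name, "/", title)
    print(cumulative_reply_percentages(SAMPLE_ENTRIES))
```

in this toy sample two entries collide on one key, so a reply of a single entry is expected for two of the three keys. a real file would be processed the same way, one name-title pair per entry.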
this study was undertaken to test the applicability of previous findings on retrieval of name-title entries from a technical processing system file (1) to retrieval from a marc file; the technique for retrieval employs truncated 3,3 search keys.

materials and methods

the study cited above employed a file of 132,808 name-title entries obtained from the yale university library's machine aided technical processing system. bibliographic control was not maintained for the generation of records in this file, with the result that the file contained errors that simulated errors in the requests library users put to catalogs. the marc file employed in the present study contains 121,588 name-title entries that are nearly error free. whereas the marc file possesses few records bearing foreign titles, the yale file has a significantly higher percentage of such titles, as would be expected for a large university library. initial articles were deleted in yale titles, but only english articles in marc titles because the language of foreign language titles is not identified in marc.

journal of library automation vol. 4/4 december, 1971

design of the program used to analyze the marc file was the same as that for the program employed in the previous study. however, the new program runs on a xerox data systems sigma 5 computer. the test employed the 3,3 search key to make possible comparison with previous results.

results

table 1 presents the percentage of time that up to five replies can be expected, assuming equal likelihood of key choice. inspection of the table reveals that there is no significant difference between the findings from the yale and the marc files.

table 1. probability of number of entries per reply using 3,3 search key

number of replies    cumulative probability percentage
                     yale file    marc file
1                    78.58        79.98
2                    92.75        93.28
3                    96.83        96.93
4                    98.40        98.26
5                    99.08        98.91

discussion

the same result was expected for the marc file that had been obtained earlier from the yale file. possible influences that might have led to different results were the existence of errors in the yale file, a significant proportion of foreign titles in the yale file as compared to the nearly all-english marc file, and the inability to mechanically delete the initial articles in the few foreign language marc titles. it is most unlikely that the effects of these differences are masking one another.

conclusion

the findings of a previous study on the effectiveness of retrieval of entries from a large bibliographic file (1) by use of a truncated 3,3 search key have been confirmed for a similarly large marc file.

reference

1. kilgour, frederick g.; long, philip l.; leiderman, eugene b.: "retrieval of bibliographic entries from a name-title catalog by use of truncated search keys," proceedings of the american society for information science, 7 (1970), 79-81.

information technology and libraries | september 2008

mylibrary: a digital library framework and toolkit

eric lease morgan

this article describes a digital library framework and toolkit called mylibrary. at its heart, mylibrary is designed to create relationships between information resources and people. to this end, mylibrary is made up of essentially four parts: (1) information resources, (2) patrons, (3) librarians, and (4) a set of locally defined, institution-specific facet/term combinations interconnecting the first three. on another level, mylibrary is a set of object-oriented perl modules intended to read and write to a specifically shaped relational database.
used in conjunction with other computer applications and tools, mylibrary provides a way to create and support digital library collections and services. librarians and developers can use mylibrary to create any number of digital library applications: full-text indexes to journal literature, a traditional library catalog complete with circulation, a database-driven website, an institutional repository, an image database, etc. the article describes each of these points in greater detail.

background and history

the term “mylibrary” was coined by keith morgan, doris sigl, and myself in 1997 when we worked in the department of digital library initiatives at the north carolina state university libraries. at that time it denoted a personalizable/customizable user interface to sets of library collections and services. it was a reaction to the then-popular portal applications called my netscape, my yahoo!, and my dejanews.1 in that form, mylibrary was a monolithic turnkey application. librarians were expected to use the administrative interface to organize information resources into three distinct groups: databases, electronic texts, and library links (services). each item in each group was expected to be associated with one or more discipline terms. patrons were expected to come to the system, register, select a discipline, and use the databases, texts, and library links to do library research.

patrons had three additional functions at their disposal. the first was the ability to add “personal” links—bookmarks to their favorite websites. second, they had the ability to select multiple disciplines and thus refine the number of resources associated with “their” page. finally, and to a small degree, patrons had the ability to change the graphic design of the page. because of these customizable features and its implementation at ncsu libraries, the system was officially called mylibrary@ncstate.
mylibrary@ncstate was packaged and distributed as open-source software, a newly coined term at that time. it was subsequently downloaded and installed in roughly two dozen libraries across the world. some of these libraries used it in exactly the manner it was designed, and some of them are still accessible today.2 other libraries used parts and pieces of the system to build their own applications. for example, the open university used only the underlying database structure.3 on the other hand, los alamos national laboratory used the mylibrary@ncstate concept and completely re-wrote the perl modules.4 more importantly, the concept of mylibrary—a user-driven, customizable interface to sets of library collections and services—became very popular. mylibrary-like applications sprang up all over the library landscape. these implementations did not use the perl modules and scripts written under the mylibrary@ncstate rubric, but they did organize content in an underlying database and allowed patrons to mix and match the content for their specific purposes.5

as a turnkey application, mylibrary@ncstate functioned correctly. it did not crash and it did not output invalid data. at the same time, mylibrary@ncstate did not fare very well when it came to usability tests. for example, gibbons describes how the usability of mylibrary was improved to meet the needs of course offerings.6 in another article, brantley describes how users had difficulty “understanding the discipline-specific nature” of mylibrary@ncstate.7 its installation process was nonstandard and therefore difficult to implement. as written, mylibrary@ncstate was difficult to extend and enhance, and thus it did not truly benefit from its open-source nature. data entry was tedious and for this reason its content was difficult to initialize and maintain. the idea of actively customizing a user interface was foreign to many users. people do not take an active role in customizing their user interfaces.
they accept the defaults or unconsciously expect the user interface to adapt to their needs.8 for all these reasons, mylibrary@ncstate’s popularity lasted about five years, but for many of the reasons outlined previously, the concept of mylibrary still seems viable. the balance of this article describes two things: (1) how the current implementation of mylibrary has evolved beyond the turnkey nature of mylibrary@ncstate, and (2) how the “new and improved” mylibrary has been and can be used to create a number of digital library applications.

eric lease morgan is head of the digital access and information architecture department, hesburgh libraries, university of notre dame, indiana.

mylibrary: a digital library framework and toolkit | morgan 13

mylibrary, relationships, and facet/term combinations

more than anything else, mylibrary is intended to provide a framework for creating relationships between information resources and people. most of the time these information resources are the traditional things of libraries such as books, journals, indexes, catalogs, manuscripts, and photographs. the people of mylibrary are patrons and librarians. relationships can be drawn between information resources and people through the use of facet/term combinations—a locally defined and institution-specific controlled vocabulary.

information resources and people can be described in similar fashions. resources, for example, are described with subjects. they are described according to their physical format and function. patrons and librarians focus much of their energies in specific subjects: “i am majoring in philosophy.” sometimes people focus their attention on specific formats: “i need a journal article on . . .” sometimes people are interested in particular functions: “i need a definition for . . .” people can belong to particular audiences and they might want to use audience-specific resources: “these resources are particularly useful for students in geog 203.”

in our increasingly networked environment, it is just as important to create relationships between people as it is to create relationships between information resources and patrons. librarians are not seen as the only authority on data and information. the opinions of one’s peers play an important role too. users want to read reviews, rank items according to various weights, and make decisions based on the thoughts of people like them. through facet/term combinations applied to users, this is possible. moreover, since users do not visit libraries as often as they used to, librarians need to figure out ways of staying in touch with their populations. by applying facet/term combinations to librarians as well as users, the librarians can know who their users are and users can easily identify subject experts.

intended for use as the framework for a controlled vocabulary, the facet/term combinations of mylibrary give the librarian and developer an opportunity to describe and relate the primary components of libraries—information resources and people. through these facet/term combinations, conceptual links can be created between information resources and users, between users and librarians, and between librarians and resources. after creating a set of facet/term combinations, the librarian and developer can address increasingly popular desires such as but not limited to:

- as a librarian, this is the set of resources i curate . . .
- because you are in this class, you might want to use . . .
- here is a list of all the encyclopedias on the topic of . . .
- here is a list of patrons who use the resources i curate . . .
- here is a list of the full-text article indexes . . .
- here is a list of articles on . . .
- the library owns the following special collections . . .
- these special collections can be used for this class . . .
- other people in this class have also used . . .
- other people like you have used . . .
- recommended resources for this subject are . . .
- resources for this subject are . . .
- the subject-specific librarian is . . .

to be able to address these issues, the librarian and the developer first create sets of facet/term combinations and then assign one or more of them to information resources, patrons, and/or librarians. after the assignments have been made, lists of relevant mylibrary objects (information resources or people) can be generated by specifying—“joining” in relational database parlance—facet/term combinations held in common between the objects. for example, if many information resources, patrons, and librarians were classified using a subjects/astronomy facet/term combination, then the librarian and developer can create a list of astronomy-related resources for patrons, a list of astronomy-interested patrons for librarians, and a list of astronomy-responsible librarians for patrons.

mylibrary facets and terms

mylibrary facets are intended to be the headings for very broad categories. mylibrary terms are expected to denote examples of the facets. facet/term combinations are expected but not required to be defined for every mylibrary implementation. every librarian and developer who uses mylibrary is expected to define his or her own set of facet/term combinations. in the form of a simplified entity-relationship diagram, figure 1 illustrates how the relationships between information resources and people are modeled in mylibrary.

an easy-to-understand facet might be formats denoting the physical manifestation of an information resource. terms associated with a formats facet might include books, manuscripts, journals, microforms, articles, maps, pictures, movies, or datasets. given just about any information resource, a formats facet/term combination can be assigned to it.
for example, a library that owns the encyclopaedia britannica might “catalog” it with the formats/books facet/term combination:

title—encyclopaedia britannica
facet/term—formats/books

another easy-to-understand facet might be called research tools, denoting things used to find data and information. example terms might include dictionaries, thesauri, manuals, journal indexes, library catalogs, internet indexes, encyclopedias, atlases, or almanacs. continuing with the example above, encyclopaedia britannica might have an additional facet/term combination assigned to it:

title—encyclopaedia britannica
facet/term—formats/books
facet/term—research tools/encyclopedias

an audience facet might be created to denote classes of users. in an academic library, possible terms might include freshmen, sophomores, juniors, seniors, graduate students, instructors, faculty, and staff. using a different information resource—say, dissertation abstracts—we might come up with a different set of facet/term combinations:

title—dissertation abstracts
facet/term—research tools/bibliographic indexes
facet/term—audiences/graduate students

using mylibrary’s facet/term combinations, it is almost trivial to create an authorities list. an authors facet can be created to denote the creators of works. specific names can be used as terms. similarly, there might be a need or desire to include genre headings. consequently, the adventures of huckleberry finn might be described like this:

title—the adventures of huckleberry finn
facet/term—audiences/adolescents
facet/term—authors/mark twain
facet/term—formats/books
facet/term—genre/coming of age stories
facet/term—genre/novels

mylibrary objects

facet/term combinations are used to describe and create relationships between mylibrary objects. these objects include information resources and people, and the people consist of users and librarians.
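the facet/term assignments and the “joining” described so far can be sketched in a few lines. what follows is a minimal illustration in python, not the mylibrary perl api: the class name, method names, and the patron entry are all hypothetical, and an in-memory dictionary stands in for the relational tables.

```python
from collections import defaultdict


class Catalog:
    """Hypothetical in-memory stand-in for facet/term assignments (not the mylibrary api)."""

    def __init__(self):
        # (facet, term) -> set of (kind, name) pairs carrying that combination
        self._by_combo = defaultdict(set)

    def assign(self, kind, name, facet, term):
        """Attach one facet/term combination to a resource, patron, or librarian."""
        self._by_combo[(facet, term)].add((kind, name))

    def find(self, kind, *combos):
        """Names of objects of `kind` that carry every given (facet, term) pair."""
        if not combos:
            return []
        matches = [
            {name for k, name in self._by_combo.get(combo, set()) if k == kind}
            for combo in combos
        ]
        return sorted(set.intersection(*matches))


catalog = Catalog()
catalog.assign("resource", "encyclopaedia britannica", "formats", "books")
catalog.assign("resource", "encyclopaedia britannica", "research tools", "encyclopedias")
catalog.assign("resource", "dissertation abstracts", "research tools", "bibliographic indexes")
catalog.assign("resource", "dissertation abstracts", "audiences", "graduate students")
catalog.assign("resource", "the adventures of huckleberry finn", "formats", "books")
# The patron below is made up purely for illustration.
catalog.assign("patron", "hypothetical patron", "audiences", "graduate students")

books = catalog.find("resource", ("formats", "books"))
grad_resources = catalog.find("resource", ("audiences", "graduate students"))
grad_patrons = catalog.find("patron", ("audiences", "graduate students"))
```

because resources and people share the same vocabulary, the same lookup that lists resources for an audience also lists the patrons in that audience, which is the relationship the framework is built around.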
the idea of facet/term combinations has been described above. this section describes the mylibrary objects—information resources and people—in greater detail.

information resources

information resources are the traditional information-carrying “things” of a library. typically they include books, journals, articles, manuscripts, indexes, catalogs, finding aids, etc. in order to organize and increase access to these materials, libraries systematically describe collections using rigorous cataloging procedures. with the advent of ubiquitous computing and the internet, at least two things have happened regarding the “things of a library.” first, they are increasingly less bibliographic in nature. while the number of books, journals, and articles is certainly not decreasing, the number of conference presentations, simulations, images, sounds, movies, and data sets is multiplying at an astounding rate. second, because of this additional content, the traditional rigorous cataloging procedures of librarianship do not scale to the amount of work that needs to be done.

dublin core metadata elements were created to address these problems. facet/term combinations form the foundation for creating simple but local controlled vocabularies. facet/term combinations plus dublin core metadata elements plus a number of other attributes brought along from mylibrary@ncstate for backwards compatibility are used to describe information resource objects in mylibrary.

attributes

a few things ought to be noted about some of the mylibrary attributes. first, many of the dublin core elements can be duplicated with facet/term combinations. the prime candidates are elements that can be expressed as database many-to-many relationships. the dublin core element called creator is an excellent example. any single information resource may have many creators, and any creator may be associated with many resources. librarians and developers who use mylibrary are able to place creator information in an attribute of a mylibrary resource object and/or in a facet/term combination. the former usage is similar to traditional library cataloging technique and consequently requires additional overhead for editing records. the application of facet/term combinations makes it much easier to maintain database integrity as well as create browsable lists.

figure 1. simplified mylibrary entity-relationship diagram. facets have a one-to-many relationship with terms. terms have a many-to-many relationship with resources, patrons, and librarians. after defining sets of facet/term combinations, the mylibrary api allows librarians and developers to build interconnections between resources, patrons, and librarians.

just like creators, subjects might be better implemented as facet/term combinations, and the mylibrary subject attribute might be used as a placeholder for keywords or non-controlled vocabulary terms. each mylibrary resource object might have multiple subjects. using the facet/term approach, this is no problem to implement. using the dublin core subject field approach, this is challenging, since the field is not repeatable. to circumvent this, librarians and developers are encouraged to delimit subject term values with predefined characters (such as “|”). upon indexing or display, the subject attribute can be parsed into multiple values.

identifiers

mylibrary resource objects possess three distinct types of identifiers, and each has its own explicit use. the first is the mylibrary resource identifier, which is a relational database key. it is non-assignable and non-editable by librarians or developers. it is an internal value used to maintain relational database integrity. the second type of mylibrary resource identifier is called the fkey, and it is used to denote a foreign key.
this attribute is primarily intended to contain the value of an identifier from a remote information system like the 001 field of a marc record. a better example includes the harvesting of records from oai data repositories. each record in each oai repository has an internet-wide unique identifier. this value is not a url, but usually a combination of characters and numbers analogous to the 001 field of a marc record. each repository may also implement a concept called “sets,” and each record might belong to multiple sets. when harvesting from a repository, the librarian and developer can save the oai identifier as an fkey value, and when the same record from an alternative set is discovered, the associated resource object can be updated instead of duplicated.

the third type of identifier consists of resource::location objects. they are primarily intended for but not limited to urls. unlike all of the other resource attributes, resource::location objects are intended to have many values because information resources have many locations. for example, a library might have a printed version of the adventures of huckleberry finn, and its location is denoted by a call number. a library might also have an electronic version, and its location is a url. an online bibliographic database might be located at a particular url, but its locally developed help text might be located at a different url.

each resource::location object has three qualities: (1) a key, (2) a type, and (3) a value. the key is an internal relational database identifier. the type is an institution-defined value denoting the kind of location. examples might include primary url, help text url, call number, local file name, and isbn or issn. the value is an example of the type, and, in the case of dublin core elements, might very well be the identifier. using mylibrary resource::location objects, single information resources can be displayed and multiple locations can be associated with them.
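the three identifier types can be pictured with a small data model. the sketch below is written in python rather than perl and does not reproduce the mylibrary modules; the class names, the `upsert` helper, the oai identifier, the url, and the call number are all invented for illustration. it shows an internal id that callers never assign, an fkey used to update rather than duplicate a harvested record, and a list of typed locations per resource.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Location:
    key: int    # internal identifier, stand-in for the database key
    type: str   # institution-defined kind, e.g. "primary url" or "call number"
    value: str  # the url, call number, file name, and so on


@dataclass
class Resource:
    resource_id: int             # internal key, assigned by the store only
    title: str
    fkey: Optional[str] = None   # identifier from the remote system
    locations: List[Location] = field(default_factory=list)


class Store:
    """Hypothetical in-memory stand-in for the resource tables."""

    def __init__(self):
        self._by_fkey: Dict[str, Resource] = {}
        self._next_resource_id = 1
        self._next_location_key = 1

    def upsert(self, fkey: str, title: str) -> Resource:
        """Update the resource with this fkey if it exists; otherwise create it."""
        existing = self._by_fkey.get(fkey)
        if existing is not None:
            existing.title = title
            return existing
        resource = Resource(self._next_resource_id, title, fkey)
        self._next_resource_id += 1
        self._by_fkey[fkey] = resource
        return resource

    def add_location(self, resource: Resource, type: str, value: str) -> Location:
        location = Location(self._next_location_key, type, value)
        self._next_location_key += 1
        resource.locations.append(location)
        return location

    def count(self) -> int:
        return len(self._by_fkey)


store = Store()
# The same (made-up) oai identifier seen twice, e.g. via two different sets.
first = store.upsert("oai:example.org:123", "an article")
again = store.upsert("oai:example.org:123", "an article (revised)")

huck = store.upsert("local:huck-finn", "the adventures of huckleberry finn")
store.add_location(huck, "call number", "XX 0000 (made-up call number)")
store.add_location(huck, "primary url", "https://example.org/huck-finn.txt")
```

keeping locations in their own typed list, rather than in a single url attribute, is what lets one record carry both the printed copy and the electronic copy.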
library services

think creatively regarding the definition of resource objects. think library services as well as books, journals, and databases. libraries are about more than collections. they are also about services applied against those collections. libraries want to promote their services just as much as they want to promote access to bibliographic indexes, special collections, and the wealth of monographs. these services include bibliographic and information literacy sessions, circulation services (such as interlibrary loan, item recalls, renewals, or document delivery), library tours, one-on-one reference consultations, and online chats.

each of these services has a title, a description, and probably a url where details can be read online. mylibrary resource objects provide a means to embody this information in a concise package. all that is missing are facet/term combinations to relate them to other information resources or people. consider an audience facet. putting things on reserve is something of interest to instructors. consider an audience term called instructors. assign an audience/instructors facet/term combination to instructions for putting things on reserve. things put on reserve are intended for use by students. again, consider assigning something like an audience/students facet/term combination to instructions for using the reserve book room.

people—patrons and librarians

mylibrary includes two types of objects representing people: patrons and librarians. like information resource objects, librarian and patron objects are characterized using a number of attributes plus facet/term combinations. on one level, the patron attributes are simple and rudimentary, including only things like first name, last name, username, password, e-mail address, url, and image. this type of information was explicitly designed to map to the foaf (friend of a friend) architecture in the hopes of future compatibility. patron objects also include attributes for things like last date visited and total number of visits. this information forms the basis for potential what’s new? functionality. the patron object also includes functionality to record personal links for bookmarking features. the mylibrary librarian object is even simpler than the patron object since it only includes attributes for name, e-mail address, and url.

just like the mylibrary information resource objects, both the patron objects and the librarian objects can be mapped to facet/term combinations. just as mla bibliography might be “cataloged” using a subjects/english literature facet/term combination, a patron or librarian object can be “cataloged” in the same way. once these sorts of relationships are established, recommendations can begin to take shape. once patrons start bookmarking and associating particular resources and services to their identity, the system can take the next step and address things such as “people like you also used . . . ” or “popular resources in this area are . . .” moreover, once facet/term combinations are associated with people, then relationships between people can be created and the system can answer statements such as “other people interested in this topic include . . .” or “the patrons who are interested in this subject are . . .”

establishing facet/term combinations for people is not as difficult as it may seem at first. in an academic library, much of this information can be gleaned from human resources data or the institution’s registrar office. libraries probably already get this information in one shape or another to populate their integrated library system circulation module. at the very least, this information includes a first name, a last name, and a unique institution identifier (possibly a username).
given this information, the librarian and developer could query the institution’s directory services to discover institutional department and/or major field of study. just as this information is loaded into the integrated library system to support borrowing, it can be loaded into a mylibrary instance. each department or major can then be mapped to facet/term combinations.

privacy is a real issue with the inclusion of patron information in a mylibrary instance. it should be taken very seriously. the use of mylibrary does not assume the inclusion of patron information; it is more than possible to use mylibrary and not have it contain any information about people. on the other hand, without this information a library prevents itself from providing the sort of services increasingly expected by its patrons. a discussion of the professional ethics of providing personalized services to library users in a computer-networked environment is beyond the scope of this article. each library must weigh for itself the strengths, weaknesses, advantages, and threats of using information about patrons to provide individualized services.

combining mylibrary with other “toolboxes”

as a framework or toolbox, mylibrary is intended to support only certain aspects of a digital library, namely, the collection of content, information about people, and a means of making relationships between them. mylibrary is not intended to be an “integrated library system.” it has no acquisitions module. it has no circulation module. it includes only the most basic functionality for searching. instead, librarians and developers are expected to combine mylibrary with other tools to fulfill these functions. for example, acquisitions functionality can be implemented by harvesting oai content.
by combining mylibrary with another set of perl modules called net::oai::harvester, librarians and developers can import oai-based content into a mylibrary instance.9 feed net::oai::harvester an oai root url, and it will systematically harvest remote metadata in any number of metadata formats. since dublin core metadata is required of all oai data repositories, and since mylibrary supports a one-to-one mapping to dublin core elements, it is trivial to create mylibrary resource objects based on each of the harvested records. appendix a illustrates a simple yet complete oai acquisitions application. it harvests journal article metadata from the directory of open access journals.

just about any bibliographic metadata format can be mapped to dublin core. examples include marc, marcxml, mods, ead, and tei. to get content in these forms into a mylibrary instance, the librarian and developer need to write a program reading bibliographic data, parsing out the desired information, and saving it to mylibrary. considering marc data, the venerable perl module called marc::record could be used to read and parse the data.10 the other data formats are xml-based, and a perl-based application supporting xslt or xpath could be used to read and parse the data. in all of these cases the content of the mylibrary instance should be considered brief and the fkey value might point to the original file on the local file system. such mylibrary resource objects are useful for syndication, search result displays, or browsable lists. if more detail is required, then the brief records can point to the full metadata through the fkey value.

mylibrary is not intended to support search. that is because search is best supported not by a database but by an indexer.11 there are myriad indexers available.
some of them include swish-e, kinosearch, zebra, and lucene.12 to search the content of a mylibrary instance, librarians and developers are expected to write reports against the instance and use them as the content for indexing. appendix b illustrates a rudimentary but complete program creating a kinosearch index against a mylibrary instance. once the index is created, librarians and developers are expected to write interfaces to search the index. appendix c illustrates one searching technique: get a query as input, search the index, return a record’s id value, look up the record in mylibrary, display.

in summary, mylibrary first defines a number of fundamental library objects (information resources, people, and a controlled vocabulary). it then supports a perl-based application programmer interface (api) for doing input/output against these objects. the input can be garnered from any number of streams—manual data entry, tab-delimited text files, marc or xml files, oai, etc. the output can be xml files, rss or atom feeds, oai, html subject pages, e-mail messages, or pdf files.

production and demonstration applications

a number of diverse applications have been created with mylibrary. some of them are production services. some of them are not fully developed and only exist to demonstrate the possibilities. this section briefly describes a number of them.

alex catalogue of electronic text

the alex catalogue of electronic text is a collection of just less than 14,000 public-domain documents from american literature, english literature, and western philosophy. much of the content comes from project gutenberg, but it also includes content from the defunct eris project of virginia tech and the internet wiretap archive. each mylibrary resource object includes as much dublin core data as possible. the description attribute of each mylibrary resource includes not an abstract of the electronic text, but an rdf/xml version of the original text. a report was written against the mylibrary instance that saves the rdf/xml to the local file system. these files were then indexed with an open-source indexer called zebra, and access to the index was provided through a web services–based protocol called sru (search/retrieve via url). consequently, the catalogue is full-text searchable as well as searchable via title, creator, and subject. the contents of the subject fields were computed by analyzing each document and extracting statistically significant words. the searchable interface supports a did you mean? service by comparing search terms to alternative spellings and a wordnet thesaurus. the catalogue’s title and creator browsable lists are static html files built by a script written against the underlying mylibrary instance. finally, links to all of the documents and their subjects have been uploaded to del.icio.us. to facilitate this, a script was written against the database extracting all the titles, their creators, and subjects (“tags”). these things were then sent to del.icio.us via a perl module implementing the del.icio.us api.

article index

the directory of open access journals includes an oai interface to its journal titles as well as some of its articles. the article index system harvested the article metadata and saved it to a mylibrary instance. along the way, journal titles and publishers were saved to underlying facet/term combinations and linked to each article. this enabled the creation of browsable lists via publisher and source. the content of the database was indexed using kinosearch and made accessible via a perl module written to implement sru. search results are displayed in a brief format. details are available via a simple asynchronous javascript and xml (ajax-y) link. appendixes a, b, and c illustrate the core of this application.
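the division of labor described here, in which a report is written against the database, the report is indexed, and a search returns ids that are then looked up in the database, can be sketched without any particular indexer. the python below is a toy stand-in for kinosearch and for the appendixes, not a reproduction of them; the three records, their titles and publishers, and the function names are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical records standing in for harvested article metadata.
records = {
    1: {"title": "open access and libraries", "publisher": "example press"},
    2: {"title": "digital library toolkits", "publisher": "sample society"},
    3: {"title": "library automation history", "publisher": "example press"},
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(records):
    """Write a 'report' against the records and index it: word -> set of record ids."""
    index = defaultdict(set)
    for record_id, record in records.items():
        for field_name in ("title", "publisher"):
            for word in tokenize(record[field_name]):
                index[word].add(record_id)
    return index


def search(index, query):
    """Return sorted ids of records containing every word of the query."""
    words = tokenize(query)
    if not words:
        return []
    return sorted(set.intersection(*(index.get(word, set()) for word in words)))


def display(records, ids):
    """Look each id up in the record store and format a brief display line."""
    return [f"{records[i]['title']} ({records[i]['publisher']})" for i in ids]


index = build_index(records)
hits = search(index, "library example")
lines = display(records, hits)
```

the index holds only words and ids; everything shown to the user comes from the lookup step, which is why the database and the indexer can be swapped independently.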
catholic research resources alliance

the catholic research resources alliance (crra) is a “portal” intended to highlight rare and unique materials of interest to catholic scholars. much of this content exists in archives. archives use an xml format called ead to describe their holdings. the crra provides a mechanism for ingesting these ead files, parsing out controlled vocabularies, populating facet/term combinations accordingly, full-text indexing the ead, and supporting a searchable/browsable interface to the entire content via sru. the crra also supports ingesting marc records as well as getting its input from online data-entry forms. reports are written against the underlying mylibrary instance allowing the crra’s content to be accessible via oai.

facebook

a facebook application has been written against the mylibrary data of the hesburgh libraries, university of notre dame’s database-driven website. after facebook users load the application into their profile, they are presented with a set of default recommended resources. the user then has the option to select a different set of resources based on subject terms presented in a pop-up menu. the resulting list of resources is then saved to the user’s profile pane, giving easy access to the pertinent databases and indexes of his or her selected subject.

library catalog

mylibrary has been used to create a demonstration library catalog. about 300,000 marc records were downloaded from the library of congress. a program was written that reads each marc record, crosswalks it to dublin core, and creates mylibrary resource objects accordingly. each marc record is saved as an individual file on the file system. the whole collection is indexed with kinosearch, and an sru interface provides access to the index. as search results are returned, the existence of isbn numbers is checked. if found, cover art and user reviews are retrieved and displayed from amazon. each record is displayed in a brief format, but links to a fully tagged format, as well as to marcxml and mods formats, are available. each record is also associated with a “get it for me” link. once clicked, the item is essentially checked out to the user. each user then has a “bookshelf” link displaying the items they have borrowed.

hesburgh libraries, university of notre dame’s database-driven website

the hesburgh libraries’ database-driven website is probably the most extensive mylibrary application in existence, and its primary purpose is to support the majority of the libraries’ website. the system begins with the integrated library system where much (but not all) of the library’s website content has been cataloged using traditional methods. each item in the catalog destined for the website has been flagged with a local note denoting such. each item’s description has also been enhanced with facet/term combinations. on a nightly basis, all of the items destined for the website are exported from the catalog as marc records. on a nightly basis, another script reads these records and updates a mylibrary instance. reports are written against the instance creating subject pages, format pages, tool pages, etc., complete with descriptions, recommendations, and links to associated librarians.

some information resources on the website are not deemed worthy of a record in the catalog. for these items, a manual data-entry form was created allowing bibliographers and subject specialist librarians to supplement the website’s content. these resources are seamlessly integrated into the website along with the resources from the catalog. to facilitate search, reports are written against the mylibrary instance and fed to swish-e. the resulting index is then supplemented with the content of static web pages to support search this site functionality.
using this database-driven and mylibrary-based system, the content of the libraries’ website has many fewer broken links because the links are all centrally maintained. the site also sports a common look and feel, making it easy for users to know where they are located in the system. this process also eliminates the need for selectors and subject specialist librarians to know any html. they can focus on content and the system can focus on presentation.

future directions and conclusion

the mylibrary modules work in the manner in which they were intended, and they continue to be distributed and supported as open-source software, but software is never complete. mylibrary is available from cpan (comprehensive perl archive network). it is supported by a website complete with voluminous documentation, sample applications, access to a cvs repository, blog commentaries, and a mailing list with about 150 subscribers.13

yet despite the support, use of mylibrary outside the university of notre dame has been underwhelming. i assume this is true because the number of perl programmers in libraries is shrinking as the number of other programming languages (php, python, ruby, java, etc.) grows. the modularity of the system may also be a factor, since most of the library profession cannot write a computer program and therefore will have a difficult time understanding how to put mylibrary into practical use. the idea of facet/term combinations used to describe information resources as well as people may be off-putting. finally, because mylibrary requires an underlying database to operate, the normal perl installation process (perl Makefile.PL; make; make test; make install) can only be done after a bit of pre-installation processing. this is possibly another impediment to adoption: the installation process is a bit unusual.

despite these issues, mylibrary works very well for the university of notre dame, and a number of improvements are planned.
first, the underlying database contains a table for user reviews, and a perl module needs to be written allowing input/output against this table. similarly, mylibrary presently includes a table for keeping track of how often a particular resource is used and by whom, but there is no module to update the table. future work will enhance this statistics table and implement the statistics module.

finally, and most importantly, work will be done to make it easy to do input/output against a mylibrary instance through a rest-ful (representational state transfer) interface. as defined by rest, this interface will exploit the four transfer methods of http (get, post, put, and delete) to retrieve, create, edit, and remove mylibrary objects from the underlying database. by exploiting rest-ful computing techniques, at least two things will be enabled. first, application programmers will be able to use their favorite computer language to maintain a mylibrary instance. there will be no need to know perl; rest is computer-language independent. second, through the use of rest-ful computing, mylibrary content will be more easily syndicated. for example, the output of a rest-ful mylibrary interface could be manifested in many flavors of xml. atom comes to mind, but an rdf/xml representation may be more expressive. the output of a rest-ful interface to mylibrary could also be manifested as a json (javascript object notation) data structure, making it easier to integrate mylibrary content in ajax-y interfaces.

mylibrary: a digital library framework and toolkit | morgan 19

as more and more library collections and services are manifested in a computer-networked environment, the need to provide these collections and services in new and different ways increases. mylibrary is an attempt to address this issue, and it has met with qualified success.
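the planned rest-ful mapping (get, post, put, and delete to retrieve, create, edit, and remove) can be illustrated with a small sketch. it is written in python to underline the point that rest is language independent; the class name `ResourceStore`, its `handle` method, and the in-memory dictionary standing in for a mylibrary instance are all hypothetical and are not part of the mylibrary distribution.

```python
import json


class ResourceStore:
    """hypothetical in-memory stand-in for a mylibrary instance."""

    def __init__(self):
        self._items = {}
        self._next_id = 1

    def handle(self, method, resource_id=None, body=None):
        """dispatch one http-style request to the matching operation."""
        if method == "GET":       # retrieve
            return self._items.get(resource_id)
        if method == "POST":      # create
            resource_id = self._next_id
            self._next_id += 1
            self._items[resource_id] = dict(body)
            return resource_id
        if method == "PUT":       # edit
            if resource_id not in self._items:
                raise KeyError(resource_id)
            self._items[resource_id] = dict(body)
            return resource_id
        if method == "DELETE":    # remove
            return self._items.pop(resource_id, None)
        raise ValueError("unsupported method: " + method)


store = ResourceStore()
rid = store.handle("POST", body={"name": "doaj article", "subject": "libraries"})
store.handle("PUT", rid, {"name": "doaj article", "subject": "open access"})
# serialize the stored resource as json, one of the output formats mentioned above
print(json.dumps(store.handle("GET", rid)))
```

a real interface would sit behind a web server and serialize resources as atom, rdf/xml, or json; the sketch shows only the method-to-operation mapping.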
acknowledgments

an enormous debt of gratitude goes to rob fox of the hesburgh libraries, university of notre dame for writing the bulk of the mylibrary perl modules. rob and i sat down together for a couple days in 2003 to learn about object-oriented perl programming techniques from ed summers (now working at the library of congress). we then coupled that experience with the needs and desires of the libraries to articulate and design mylibrary as it is today. while i wrote bits and pieces of the modules and used them to write many applications, rob was the person who really got his hands dirty.

references and notes

1. keith morgan and tripp reade, “pioneering portals: mylibrary@ncstate,” information technology and libraries 19, no. 4 (dec. 2000): 191–98.
2. the author has identified at least four mylibrary@ncstate implementations still up and running from across the world, including the wellington city libraries in new zealand, www.wcl.govt.nz/mylibrary (accessed feb. 19, 2008); the buswell library electronic access center of wheaton college, http://libweb.wheaton.edu/mylibrary (accessed feb. 19, 2008); the biblioteca mario rostoni at the universita carlo cattaneo, http://mylibrary.liuc.it/mylibrary (accessed feb. 19, 2008); and auburn university, http://mylibrary.auburn.edu (accessed feb. 19, 2008).
3. anne ramsden, james mcnulty, fiona durham, helen clough, and nicola dowson created myopenlibrary for the open university in the united kingdom. “myopenlibrary is an online personalised library system developed for open university students and staff. every individual user can have a virtual library ‘shelf’ or space which is tailored to meet their particular needs. the system is based on the mylibrary software originally developed at north carolina state university and now supported at notre dame university. the software has a simple basic interface, groups resources under clear headings, and provides a tick box facility for selecting and removing resources.
users sign in because it is a personalised service, but then they can customise the colour and settings of their page according to need, and if they are familiar with the internet, they add their own personal favourite links. there is a quick search facility for searching individual databases and internet search engines. the system is currently being used by 20 open university courses and this is expected to increase year on year. for more information see http://myopenlibrary.open.ac.uk/.” myopenlibrary includes 80,768 patrons (79% of the total student population of the open university), 111 disciplines, 12,731 e-books, 500 databases, and 38,708 journals. from personal correspondence between the author and james mcnulty (feb. 19, 2008).
4. “the lanl implementation of mylibrary @ lanl is an object oriented redesign of the mylibrary source code created by eric lease morgan of north carolina state university. the code was designed by two summer students andres monroy-hernandez and cesar ruiz-meraz from monterrey, mexico. the code is currently maintained by mariella di giacomo and ming yu.” from http://library.lanl.gov/lww/mylibweb.htm (accessed feb. 19, 2008).
5. a search against google for “mylibrary” returns myriad results, many of which are mylibrary-like applications and services. representative samples include mylib of malaysia’s national digital library, www.mylib.com.my (accessed feb. 19, 2008); my library of hennepin county library, www.hclib.org/pub/ipac/mylibrary.cfm (accessed feb. 19, 2008); and mylibrary of coastal carolina university, www.coastal.edu/library/mylibrary.html (accessed feb. 19, 2008).
6. susan gibbons, “building upon the mylibrary concept to better meet the information needs of college students,” d-lib magazine 9, no. 3 (mar. 2003), www.dlib.org/dlib/march03/gibbons/03gibbons.html (accessed feb. 19, 2008).
7. steve brantley, annie armstrong, and krystal m.
lewis, “usability testing of a customizable library web portal,” college and research libraries 67, no. 2 (mar. 2006): 146–63, www.ala.org/ala/acrl/acrlpubs/crljournal/backissues2006a/marcha/brantley06.pdf (accessed feb. 19, 2008).
8. udi manber, ash patel, and john robison, “experience with personalization on yahoo!” communications of the acm 43, no. 8 (aug. 2000): 35–39.
9. net::oai::harvester, http://search.cpan.org/dist/oai-harvester (accessed feb. 19, 2008).
10. marc::record, http://search.cpan.org/dist/marc-record (accessed feb. 19, 2008).
11. search is a function best supported by an indexer, not a relational database. relational databases are tools for organizing and maintaining data. through the process of normalization, relational databases store data unambiguously and efficiently. because relational databases store their information in tables, records, and fields, it is necessary to specify the tables, records, and fields when querying a database. this requires the user to know the structure of the database. moreover, standard relational databases do not support full-text searching or relevance-ranked output. indexers excel at search. given a stream of documents, indexers parse tokens (words) and associate them with document identifiers. searches against indexes return document identifiers and provide the means to retrieve the documents without the necessary knowledge of the index’s structure. indexers are weak at data maintenance. in a well-designed database, authority terms can be updated in a single location and reflected throughout the database. indexers do not support such functionality. databases and indexers are two sides of the same information retrieval coin. together they form the technological core of library automation.
12. there are a growing number of open-source indexers available on the web, including swish-e, http://swish-e.org (accessed feb.
19, 2008); kinosearch, www.kinosearch.com/kinosearch (accessed feb. 2008); zebra, http://indexdata.com/zebra (accessed feb. 19, 2008); and lucene, http://lucene.apache.org (accessed feb. 19, 2008).
13. the canonical home page for mylibrary version 3.x is http://mylibrary.library.nd.edu (accessed feb. 19, 2008).

appendix a

# harvest doaj articles into a mylibrary instance

# require
use MyLibrary::Core;
use Net::OAI::Harvester;

# define
use constant DOAJ => 'http://www.doaj.org/oai.article';  # the oai repository
MyLibrary::Config->instance( 'articles' );               # the mylibrary instance

# create a facet called formats
$facet = MyLibrary::Facet->new;
$facet->facet_name( 'formats' );
$facet->facet_note( 'types of physical items embodying information.' );
$facet->commit;
$formatid = $facet->facet_id;

# create an associated term called articles
$term = MyLibrary::Term->new;
$term->term_name( 'articles' );
$term->term_note( 'short, scholarly essays.' );
$term->facet_id( $formatid );
$term->commit;
$articleid = $term->term_id;

# create a location type called url
$location_type = MyLibrary::Resource::Location::Type->new;
$location_type->name( 'url' );
$location_type->description( 'the location of an internet resource.' );
$location_type->commit;
$location_type_id = $location_type->location_type_id;

# create a harvester and loop through each oai set
$harvester = Net::OAI::Harvester->new( 'baseURL' => DOAJ );
$sets = $harvester->listSets;
foreach ( $sets->setSpecs ) {

  # get each record in this set and process it
  $records = $harvester->listAllRecords( metadataPrefix => 'oai_dc', set => $_ );
  while ( $record = $records->next ) {

    # map the oai metadata to mylibrary attributes;
    # skip records lacking a publisher or a location
    $fkey      = $record->header->identifier;
    $metadata  = $record->metadata;
    $name      = $metadata->title;
    @creators  = $metadata->creator;
    $note      = $metadata->description;
    $publisher = $metadata->publisher;
    next if ( ! $publisher );
    $location  = $metadata->identifier;
    next if ( ! $location );
    $date      = $metadata->date;
    $source    = $metadata->source;
    @subjects  = $metadata->subject;

    # create and commit a mylibrary resource;
    # multiple creators and subjects are joined with pipes
    $resource = MyLibrary::Resource->new;
    $resource->fkey( $fkey );
    $resource->name( $name );
    $creator = '';
    foreach ( @creators ) { $creator .= "$_|" }
    $resource->creator( $creator );
    $resource->note( $note );
    $resource->publisher( $publisher );
    $resource->source( $source );
    $resource->date( $date );
    $subject = '';
    foreach ( @subjects ) { $subject .= "$_|" }
    $resource->subject( $subject );
    $resource->related_terms( new => [ $articleid ]);
    $resource->add_location( location => $location, location_type => $location_type_id );
    $resource->commit;

  }

}

# done
exit;

appendix b

# index mylibrary data with kinosearch

# require
use KinoSearch::InvIndexer;
use KinoSearch::Analysis::PolyAnalyzer;
use MyLibrary::Core;

# define
use constant INDEX => '../etc/index';       # location of the index
MyLibrary::Config->instance( 'articles' );  # mylibrary instance to use

# initialize the index
$analyzer   = KinoSearch::Analysis::PolyAnalyzer->new( language => 'en' );
$invindexer = KinoSearch::InvIndexer->new( invindex => INDEX, create => 1, analyzer => $analyzer );

# define the index's fields
$invindexer->spec_field( name => 'id' );
$invindexer->spec_field( name => 'title' );
$invindexer->spec_field( name => 'description' );
$invindexer->spec_field( name => 'source' );
$invindexer->spec_field( name => 'publisher' );
$invindexer->spec_field( name => 'subject' );
$invindexer->spec_field( name => 'creator' );

# get and process each resource
foreach ( MyLibrary::Resource->get_ids ) {

  # create, fill, and commit a document with content;
  # empty attributes are not indexed
  my $resource = MyLibrary::Resource->new( id => $_ );
  my $doc      = $invindexer->new_doc;
  $doc->set_value ( id          => $resource->id );
  $doc->set_value ( title       => $resource->name )      unless ( ! $resource->name );
  $doc->set_value ( source      => $resource->source )    unless ( ! $resource->source );
  $doc->set_value ( publisher   => $resource->publisher ) unless ( ! $resource->publisher );
  $doc->set_value ( subject     => $resource->subject )   unless ( ! $resource->subject );
  $doc->set_value ( creator     => $resource->creator )   unless ( ! $resource->creator );
  $doc->set_value ( description => $resource->note )      unless ( ! $resource->note );
  $invindexer->add_doc( $doc );

}

# optimize and done
$invindexer->finish( optimize => 1 );
exit;

appendix c

# search a kinosearch index and display content from mylibrary

# require
use KinoSearch::Searcher;
use KinoSearch::Analysis::PolyAnalyzer;
use MyLibrary::Core;

# define
use constant INDEX => '../etc/index';       # location of the index
MyLibrary::Config->instance( 'articles' );  # mylibrary instance to use

# get the query from the command line, or prompt for it
my $query = shift;
if ( ! $query ) {
  print "enter a query. ";
  chop ( $query = <STDIN> );
}

# open the index
$analyzer = KinoSearch::Analysis::PolyAnalyzer->new( language => 'en' );
$searcher = KinoSearch::Searcher->new( invindex => INDEX, analyzer => $analyzer );

# search
$hits = $searcher->search( qq( $query ));

# get the number of hits and display
$total_hits = $hits->total_hits;
print "your query ($query) found $total_hits record(s).\n\n";

# process each search result
while ( $hit = $hits->fetch_hit_hashref ) {

  # get the mylibrary resource
  $resource = MyLibrary::Resource->new( id => $hit->{ 'id' });

  # extract dublin core elements and display
  print "       id = " . $resource->id   . "\n";
  print "     name = " . $resource->name . "\n";
  print "     date = " . $resource->date . "\n";
  print "     note = " . $resource->note . "\n";
  print " creators = ";
  foreach ( split /\|/, $resource->creator ) { print "$_; " }
  print "\n";

  # get related terms and display
  @resource_terms = $resource->related_terms();
  print " term(s) = ";
  foreach ( @resource_terms ) {
    $term = MyLibrary::Term->new( id => $_ );
    print $term->term_name, " ($_)", '; ';
  }
  print "\n";

  # get locations (urls) and display
  @locations = $resource->resource_locations();
  print " location(s) = ";
  foreach ( @locations ) { print $_->location, "; " }
  print "\n\n";

}

# done
exit;

an algorithm for compaction of alphanumeric data

william d. schieber, george w. thomas: central library and documentation branch, international labour office, geneva, switzerland

description of a technique for compressing data to be placed in computer auxiliary storage. the technique operates on the principle of taking two alphabetic characters frequently used in combination and replacing them with one unused special character code. such one-for-two replacement has enabled the ilo to achieve a rate of compression of 43.5% on a data base of approximately 40,000 bibliographic records.

introduction

this paper describes a technique for compacting alphanumeric data of the type found in bibliographic records. the file used for experimentation is that of the central library and documentation branch of the international labour office, geneva, where approximately 40,000 bibliographic records are maintained on line for searches done by the library for its clients. work on the project was initiated in response to economic pressure to conserve direct-access storage space taken by this particularly large file.

in studying the problem of how to effect compaction, several alternatives were considered. the first was a recursive bit-pattern recognition technique of the type developed by demaine (1,2), which operates independently of the data to be compressed.
this approach was rejected because of the apparent complexity of the coding and decoding algorithms, and also because early analyses indicated that further development of the second type of approach might ultimately yield higher compression ratios.

the second type of approach involves the replacement, by shorter nondata strings, of longer character strings known to exist with a high frequency in the data. this technique is data dependent and requires an analysis of what is to be encoded. one such method is to separate words into their component parts (prefixes, stems, and suffixes) and to effect compression by replacing these components with shorter codes. there have been several successful algorithms for separating words into their components. salton (3) has done this in connection with his work on automatic indexing. resnikoff and dolby (4,5) have also examined the problem of word analysis in english for computational linguistics. although this method appears to be viable as the basis of a compaction scheme, it was here excluded because ilo data was in several languages. moreover, dolby and resnikoff's encoding and decoding routines require programs that perform extensive word analysis and dictionary look-up procedures that ilo was not in a position to develop.

the actual requirements observed were twofold: that the analysis of what strings were to be encoded be kept relatively simple, and that the encoding algorithm combine simplicity and speed, presumably by minimizing the amount of dictionary look-up required to encode and decode the selected string. one of the most straightforward examples of the use of this technique is the work done by snyderman and hunt (6) that involves replacement of two data characters by single unused computer codes. however, the algorithm used by them does not base the selection of these two-character pairs (called "digrams") on their frequency of occurrence in the data.
the technique described here is an attempt to improve and extend the concept by encoding digrams on the basis of frequency. the possibility of encoding longer character strings is also examined. three other related discussions of data compaction appear in papers by myers et al. (7) and by demaine and his colleagues (8,9).

200 journal of library automation vol. 4/4 december, 1971

the compression technique

the basic technique used to compact the data file specifies that the most-frequently occurring digrams be replaced by single unused special-character codes. on an eight-bit character machine of the type used, there are a total of 256 possible character codes (bytes). of this total only a small number are allocated to graphics (that is, characters which can be reproduced by the computer's printer). in addition, not all of the graphics provided for by the computer manufacturer appear in the user's data base. thus, of the total code set, a large portion may go unused. characters that are unallocated may be used to represent longer character strings. the most elementary form of substitution is the replacement of specific digrams. if these digrams can be selected on the basis of frequency, the compression ratio will be better than if selection is done independent of frequency. this requires a frequency count of all digrams appearing in the data, and a subsequent ranking in order of decreasing frequency. once the base character set is defined, and the digrams eligible for replacement are selected, the algorithm can be applied to any string of text.

the algorithm consists of two elements: encoding and decoding. in encoding, the string to be encoded is examined from left to right. the initial character is examined to determine if it is the first of any encodable digram. if it is not, it is moved unchanged to the output area. if it is a possible candidate, the following character is checked against a table to verify whether or not this character pair can be replaced.
if replacement can be effected, the code representing the digram is moved to the output area. if not, the algorithm then moves on to treat the second character in precisely the same way as the first. the algorithm continues, character-by-character, until the entire string has been encoded. following is a step-by-step description of the element.

1) load length of string into a counter.
2) set pointer to first character in string.
3) check to determine whether character pointed can occur in combination. if character does not occur in combination, point to next character and repeat step 3.
4) if character can occur in combination, check following character in a table of valid combinations with the first character. if the digram cannot be encoded, advance pointer to next character and return to step 3.
5) if the digram is codable, move preceding non-codable characters (if any) to output area, followed by the internal storage code for the digram.
6) decrease the string length counter by one, advance pointer two positions beyond current value and return to step 3.

in the following example assume that only three digrams are defined as codable: ab, bc and de. assume also that the clear text to be encoded is the six-character string abcdef. after encoding, the coded string would appear as:

ab   c   de   f
__   .   __   .

a horizontal line is used to represent a coded pair, a dot shows a single (non-combined) character. the encoded string above is of length four. note that although bc was defined as an encodable digram, it did not combine in the example above because the digram ab was already encoded as a pair. the characters c and f do not combine, so they remain uncoded. note also that if the digram ab had not been defined as codable, the resultant combination would have been different in this case:

a   bc   de   f
.   __   __   .

the decoding algorithm serves to expand a compressed string so that the record can be displayed or printed.
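the encoding steps and the abcdef example above can be sketched in a few lines of python, together with the matching expansion performed on decoding. this is a minimal illustration and not the authors' assembler implementation: the function names and the code values "\x80" to "\x82" are assumptions, and the sketch assumes that the substitute codes never occur in the clear text.

```python
def encode(text, digram_codes):
    """scan left to right, replacing each codable digram by its one-character code."""
    out = []
    i = 0
    while i < len(text):
        pair = text[i:i + 2]
        code = digram_codes.get(pair) if len(pair) == 2 else None
        if code is not None:
            out.append(code)      # codable digram: emit its code, skip two characters
            i += 2
        else:
            out.append(text[i])   # not codable here: copy the character unchanged
            i += 1
    return "".join(out)


def decode(coded, digram_codes):
    """expand every digram code back into its two characters."""
    expand = {code: pair for pair, code in digram_codes.items()}
    return "".join(expand.get(ch, ch) for ch in coded)


# hypothetical code assignments for the three codable digrams of the example
codes = {"ab": "\x80", "bc": "\x81", "de": "\x82"}
packed = encode("abcdef", codes)
print(len(packed))                          # 4
print(decode(packed, codes) == "abcdef")    # True
```

because the scan never backs up, bc is not combined once ab has consumed the b, which is the behaviour noted in the example; removing ab from `codes` yields a, bc, de, f instead.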
as in the encoding routines, decoding of the string goes from left to right. bytes in the source string are examined one by one. if the code represents a single character, the print code for that character is moved to the output string. if the code represents a digram, the digram is moved to the output string. decoding proceeds byte-by-byte as follows until end of string is reached:

1) load string length into counter.
2) set pointer to first byte in record.
3) test character. if the code represents a single character, point to next source byte and retest.
4) if the code represents a digram: move all bytes (if any) up to the coded digram; and move in the digram.
5) increase the length value by one, point to next source byte and continue with step 3.

application of the technique

the algorithm, when used on the data base of approximately 40,000 records, was found to yield 43.5% compaction. the file contains bibliographic records of the type shown in figure 1.

413.5 1970 70al350
warner m stone m
the data bank society - organizations, computers and social freedom. london, george allen and unwin, <1970>. 244 p. charts.
/social research/ into the potential threat to privacy and freedom (/human right/s) through the misuse of /data bank/s examines /computer/ based /information retrieval/, the impact of computer technology on branches of the /public administration/ and /health service/s in the /usa/ and the /uk/ and concludes that, in order to protect human dignity, the new powers must be kept in check. /bibliography/ pp. 236 to 242 and /reference/s.
engl

fig. 1. sample record from test file.

each record contains a bibliographic segment as well as a brief abstract containing descriptors placed between slashes for computer identification. a large amount of blank space appears on the printed version of these records; however, the uncoded machine readable copy does not contain blanks, except between words and as filler characters in the few fields defined as fixed-length.
the average length of a record is 535 characters (10). the valid graphics appearing in the data are shown in table 1, along with the percentage of occurrence of each character throughout the entire file.

table 1. single-character frequency

graphic  freq. %   graphic  freq. %   graphic  freq. %   graphic       freq. %   graphic      freq. %
␢        14.87     /        4.32      h        1.58      [illegible]   0.63      8            0.31
e         7.63     c        3.48      [illegible] 1.52   w             0.50      (            0.28
n         6.38     l        3.32      '        1.52      2             0.42      )            0.28
i         6.01     d        2.32      1        1.08      k             0.42      +            0.21
a         6.01     u        2.21      v        0.91      3             0.40      j            0.15
o         5.86     p        2.12      b        0.87      5             0.37      x            0.14
t         5.50     m        2.02      9        0.83      7             0.37      z            0.13
r         4.82     f        1.61      y        0.82      0             0.35      q            0.08
s         4.61     g        1.58      6        0.81      4             0.34      misc. spec.  0.01

as might be expected, the blank (␢) occurs most frequently in the data because of its use as a word separator. the slash occurs more frequently than is normal because of its special use as a descriptor delimiter. it should also be noted that the data contains no lower-case characters. this is advantageous to the algorithm for two reasons: it considerably lessens the total number of possible digram combinations, so that a larger proportion of the file is codable in the limited set chosen as codable pairs, and the absence of 26 graphics allows the inclusion of 26 additional coded pairs.

in the file used for compaction there are 58 valid graphics. allowing one character for special functions leaves 197 unallocated character codes (of a total of 256 possible). a digram frequency analysis was performed on the entire file and the digrams ranked in order of decreasing frequency. from this list the first 197 digrams were selected as those which were eligible for replacement by single-character codes. table 2 shows these "encodable" digrams arranged by lead character.

the algorithm was programmed in assembler language for use on an ibm 360/40 computer.
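the code budget and the frequency-based selection described above can be sketched as follows. the function names are hypothetical, and the sketch assumes that every adjacent (overlapping) character pair in a record is counted, a detail the text does not specify.

```python
from collections import Counter


def free_codes(graphics_in_use, reserved=1, code_space=256):
    """number of character codes left over for digram substitution."""
    return code_space - graphics_in_use - reserved


def select_digrams(records, limit):
    """rank all adjacent character pairs by frequency and keep the top `limit`."""
    counts = Counter()
    for record in records:
        # count every overlapping pair of neighbouring characters
        counts.update(record[i:i + 2] for i in range(len(record) - 1))
    return [pair for pair, _ in counts.most_common(limit)]


budget = free_codes(58)      # 58 valid graphics, one code reserved for special functions
print(budget)                # 197

sample = ["/data bank/s and /computer/ based /information retrieval/"]
print(select_digrams(sample, 5))
```

with the figures given in the text, 256 - 58 - 1 = 197 codes remain, which is why exactly 197 digrams were selected.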
the encoding element requires approximately 8,000 bytes of main storage; the decoding element requires approximately 2,000 bytes. in order to obtain data on the amount of computer time required to encode and decode the file, the following tests were performed.

to find the encoding time, the file was loaded from tape to disk. the tape copy of the file was uncoded, the disk copy compacted. loading time for 41,839 records was 52 minutes and 51 seconds. the same tape to disk operation without encoding took 28:08. the time difference (24:43) represents encoding time for 41,839 records, or .035 seconds per record.

a decoding test was done by unloading the previously coded disk file to tape. the time taken was 41:52, versus a time of 20:20 for unloading an uncompacted file. the time difference (21:32) represents decoding time for 41,839 records, or .031 seconds per record.

the compaction ratio, as indicated above, was 43.5 per cent. for purposes of comparison, the algorithm developed by snyderman and hunt (6) was tested and found to yield a compaction ratio of 32.5% when applied to the same data file.

table 2. most frequently occurring digrams

lead char.  eligible digrams
a           ab ac ad ag ai al am an ap ar as at a␢
b           bl bo
c           ca ce ch ci cl co ct cu c␢ c.
d           de di du d␢ d/
e           ea ec ed ef el em en ep er es et ev e␢ e/
f           fe fi fo fr f␢
g           ge gl gr g␢ g/
h           ha he hi ho h␢
i           ia ic ie il in io is it iv
l           la le li ll lo lu [illegible]
m           ma me mi mm mu m␢
n           na nc nd ne ng ni no ns nt n␢ n/
o           oc od of og ol om on op or ou ov o␢
p           pa pe pl po pr p.
r           ra re ri rk rn ro rs rt ru ry r␢ r/
s           sa se sl so sp ss st su s␢ s, s.
t           ta tc te th ti to tr ts tu ty t␢ t/
u           uc ud ul un ur us ut
v           va ve vi
w           wo
y           y␢ y/
␢           ␢a ␢b ␢c ␢d ␢e ␢g [illegible] [illegible] ␢m ␢n ␢o ␢p ␢r ␢s ␢t ␢u ␢w ␢␢ ␢/ [illegible] ␢(
1           19
/           /a /c /e [illegible] /l /m /p /r /s /t /␢ /,
,           ,␢
.           .␢
-           -␢
)           ),

possible extension of the algorithm

currently the compression technique encodes only pairs of characters. there might be good reason to extend the technique to the encoding of longer strings, provided a significantly higher compaction ratio could be achieved without undue increase in processing time. one could consider encoding trigrams, quadrigrams, and up to n-grams. the english word "the", for example, may occur often enough in the data to make it worth coding.

the arguments against encoding longer strings are several. prime among these is the difficulty of deciding what is to be encoded. doing an analysis of digrams is a relatively straightforward affair, whereas an analysis of trigrams and longer strings is considerably more costly, because there are more combinations. furthermore, if longer strings are to be encoded, the algorithms for encoding and decoding become more complex and time-consuming to employ.

one approach to this type of extension is to take a particular type of character string, namely a word, and to encode certain words which appear frequently. a test of this technique was made to encode particular words in the data: descriptors. all descriptors (about 1200 in number) appear specially marked by slashes in the abstract field of the record. each descriptor (including the slashes) was replaced by a two-character code. after replacement, the normal compaction algorithm was applied to the record. a compaction ratio of 56.4% was obtained when encoding a small sample of twenty records (10,777 characters).

the specific difficulty anticipated in this extension is the amount of either processing time or storage space which the decoding routines would require. if the look-up table for the actual descriptor values were to be located on disk, the time to retrieve and decode each record might be rather long.
on the other hand, if the look-up table were to be in main storage at the time of processing, its size might exclude the ability to do anything else, particularly when on-line retrieval is done in an extremely limited amount of main storage area. a partial solution to this problem might be to keep the look-up tables for the most frequently occurring terms in main storage and the others on disk. at present further analysis is being done to determine the value of this approach.

conclusions

the compaction algorithm performs relatively efficiently given the type of data used in text data base (i.e., data without lower case alphabetics, having a limited number of special characters, in primarily english text). the times for decoding individual records (.031 sec/record) indicate that on a normal print or terminal display operation, no noticeable increase in access time will be incurred.

however, several types of problems are encountered when treating other kinds of data. since the algorithm works on the basis of replacing the most-frequently occurring n-grams by single-byte codes, the compaction ratio is dependent on the number of codes that can be "freed up" for n-gram representation. the more codes that can be reallocated to n-grams, the better the compaction. data which would pose complications to the algorithm, as currently defined, can be separated for discussion as follows:

1) data containing both upper and lower case characters (as well as a limited set of special characters), and
2) data which might possibly contain a wide variety of little-used special graphics.

if lower-case characters are used, a possible way to encode data using this technique is to harken back to the time-honored method of representing lower-case with upper-case codes, and upper-case characters by their value, preceded by a single shift code (e.g., #ACCESS for Access).
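the shift-code convention just described can be sketched as below. the choice of "#" as the shift code follows the example in the text; the function names are hypothetical, and the sketch assumes that "#" does not otherwise occur in the data.

```python
SHIFT = "#"  # shift code, assumed absent from the clear text


def fold_case(text):
    """store lower-case letters as upper-case codes; prefix true capitals with SHIFT."""
    out = []
    for ch in text:
        if ch.isupper():
            out.append(SHIFT + ch)
        else:
            out.append(ch.upper())
    return "".join(out)


def unfold_case(coded):
    """reverse fold_case: a character after SHIFT stays upper-case, all others are lowered."""
    out = []
    shifted = False
    for ch in coded:
        if ch == SHIFT:
            shifted = True
        elif shifted:
            out.append(ch)
            shifted = False
        else:
            out.append(ch.lower())
    return "".join(out)


print(fold_case("Access"))       # #ACCESS
print(unfold_case("#ACCESS"))    # Access
```

after folding, the data again uses a single case, so the digram table can be built exactly as for the upper-case-only file, at the cost of one extra byte per capital letter.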
the shift code blank character digram would undoubtedly figure relatively high on the frequency list, making it eligible as an encodable digram.

the second problem occurs when one attempts to compact data having a large set of graphics. a good example of this is bibliographic data containing a wide variety of little-used characters of the type now being provided for in the marc tapes (11) issued by the u. s. library of congress (such as the icelandic thorn). normally representation of these graphics is done by allocating as many codes as required from the possible 256-code set. since the compaction ratio is dependent on the number of unallocated internal codes, a possible solution to this dilemma might be to represent little-used graphics by multi-byte codes which would free the codes for representation of frequently occurring n-grams.

further, it is noticeable that the more homogeneous the data the higher the compression ratio. this means that data all in one language will encode better than data in many languages. there is, unfortunately, no ready solution to this problem, given the constraints of this algorithm. in dealing with heterogeneous data one must be prepared to accept a lower compression factor.

without doubt to be able to effect a savings of around 40% for storage space is significant. the price for this ability is computer processing time, and the more complex the encoding and decoding routines, the more time is required. there is a calculable break-even point at which it becomes economically more attractive to buy x amount of additional storage space than to spend the equivalent cost on data compaction. yet at the present cost of direct-access storage, compaction may be a possible solution for organizations with large data files.

references

1. marron, b. a.; demaine, p. a. d.: "automatic data compression," communications of the acm, 10 (november 1967), 711-715.
2. demaine, p. a. d.; kloss, k.; marron, b. a.: the solid system iii: alphanumeric compression. (washington, d. c.: national bureau of standards, 1967). (technical note 413).
3. salton, g.: automatic information organization and retrieval (new york: mcgraw-hill, 1968).
4. resnikoff, h. l.; dolby, j. l.: "the nature of affixing in written english," mechanical translation, 8 (march 1965), 84-89.
5. resnikoff, h. l.; dolby, j. l.: "the nature of affixing in written english," mechanical translation, 9 (june 1966), 23-33.
6. snyderman, martin; hunt, bernard: "the myriad virtues of text compaction," datamation (december 1, 1970), 36-40.
7. myers, w.; townsend, m.; townsend, t.: "data compression by hardware or software," datamation (april 1966), 39-43.
8. demaine, p. a. d.; kloss, k.; marron, b. a.: the solid system ii. numeric compression. (washington, d. c.: national bureau of standards, 1967). (technical note 413).
9. demaine, p. a. d.; marron, b. a.: "the solid system i. a method for organizing and searching files." in schecter, g. (ed.): information retrieval-a critical view. (washington, d. c.: thompson book co., 1967).
10. schieber, w.: isis (integrated scientific information system; a general description of an approach to computerized bibliographical control). (geneva: international labour office, 1971).
11. books: a marc format; specification of magnetic tapes containing monographic catalog records in the marc ii format. (washington, d. c.: library of congress, information systems office, 1970.)

of the people, for the people: digital literature resource knowledge recommendation based on user cognition

wen lou, hui wang, and jiangen he

information technology and libraries | september 2018 66

wen lou (wlou@infor.ecnu.edu.cn) is an assistant professor in the faculty of economics and management, east china normal university.
hui wang (1830233606@qq.com) is a graduate student in the faculty of economics and management, east china normal university. jiangen he (jiangen.he@drexel.edu) is a doctoral student in the college of computing and informatics, drexel university. abstract we attempt to improve user satisfaction with the effects of retrieval results and visual appearance by employing users’ own information. user feedback on digital platforms has been proven to be one type of user cognition. through conducting a digital literature resource organization model based on user cognition, our proposal improves both the content and presentation of retrieval systems. this paper takes powell's city of books as an example to describe the construction process of a knowledge network. the model consists of two parts. in the unstructured data part, synopses and reviews were recorded as representatives of user cognition. to build the resource category, linguistic and semantic analyses were used to analyze the concepts and the relationships among them. in the structural data part, the metadata of every book was linked with each other by informetrics relationships. the semantic resource was constructed to assist with building the knowledge network. we conducted a mock-up to compare the new category and knowledge-recommendation system with the current retrieval system. thirty-nine subjects examined our mock-up and highly valued the differences we made for the improvements in retrieval and appearance. knowledge recommendation based on user cognition was tested to be positive based on user feedback. there could be more research objects for digital resource knowledge recommendations based on user cognition. introduction the concept of user cognition originates in cognitive psychology. 
this concept principally explores the human cognition process through information-processing methods.1 the concept characterizes a process in which a user obtains unknown information and knowledge through acquired information. as information-science workers, we may explore the psychological activities of users by analyzing their cognitive processes when they are using information services.2 a knowledge-recommendation service based on user cognition has become essential since it emphasizes facilitating collaborations between humans and computers and promotes the participation of users, which ultimately improves user satisfaction.

a knowledge-recommendation system is based on a combination of information organization, a retrieval system, and knowledge visualization.3 however, when exploring digital online literature resources, it is difficult to quickly and precisely find what we want because of problems in information organization and retrieval. most search results are displayed only as a one-by-one list view.

https://doi.org/10.6017/ital.v37i3.10060

thus, adding visualization techniques to an interface could improve user satisfaction. furthermore, the retrieval system and visualizations rely on information organization. only if information is well organized can the retrieval system and visualization be useful. therefore, we attempt to improve retrieval efficiency by proposing a digital literature resource organization model based on user cognition to improve both the content and presentation of retrieval systems. taking powell’s city of books as an example, this paper proposes user feedback as first-hand user information. we will focus on (1) resource organization based on user cognition and (2) new formats for search results based on knowledge recommendations.
we will purposefully employ data from users’ own information and give knowledge back to users in accordance with the quote “of the people, for the people.”

related work

user cognition and measurement

user cognition usually consists of a series of processes, including feeling, noticing, temporary memory, learning, thinking, and long-term memory.4 feeling and noticing are at a lower level, while learning, thinking, and memory are comparatively higher. researchers have so far tried to identify user cognition processes by analyzing user needs. there are four levels of user needs according to ma and yang5 (see figure 1). in turn, user interests normally reflect potential user needs. users who retrieve information on their own show feeling needs. users who give feedback show expression needs. users who ask questions show knowledge needs, which is the highest level.

the methods to quantify user cognition require visible and measurable variables. existing studies have commonly used website log analysis or user surveys. website log analysis has been proven to be a solid data source to record and analyze both user interests and information needs.6 user surveys, including online questionnaires and face-to-face interviews, have been widely used to comprehend user feelings and user satisfaction.7 user surveys generally measure two kinds of relationship: between users and digital services and between users and the digital community.8 with a survey, we can make the most of statistics and assessment studies to analyze user satisfaction about an array of standards and systems of existing service platforms, service environments, service quality, and service personnel, which provides some references and suggestions for future study of user experience quality, platform elements, interaction process, and more.9 however, neither log data nor surveys can obtain first-hand user information in real-life settings.
eye tracking and the concept-map method can be used to understand user behavior in the course of user testing.10 however, these approaches are difficult to adapt to a large group of users. therefore, a linguistic-oriented review analysis has become an increasingly important method. user content, including reviews and tags, could be analyzed through text mining and become valuable data sources to learn users’ preferences for products and services in the areas of electronic commerce and digital libraries.11 this type of data has been called “more than words.”12

figure 1. understanding user cognition by analyzing user needs.

user-oriented knowledge service model

the user-oriented service model includes user demand, user cognition, and user information behavior. a service model based on user demand chiefly concentrates on the motives, habits, regularities, and purposes of user demand to identify the model of user demand so that the appropriate service is adopted.13 service models based on user cognition attach importance to the process of user cognition, the influence that users are facing,14 and the change of library information services under the effects of a series of cognitive processes (such as feeling, receiving, memorizing, and thinking).15 a service model based on user information behavior focuses on interactive behavior in the process of library information services that users participate in, such as interactions with academic librarians, knowledge platforms,16 and others. studies have paid more attention to the pre-process of the user-oriented service model, which analyzes information habits and user behaviors.17 studies have also proposed frameworks of knowledge services, design innovations,18 or personalized systems and frames of the knowledge service model, but they have not succeeded in implementing or performing user testing.
knowledge service system construction

most studies of knowledge service system construction are in business areas. numerous studies have explored knowledge-innovation systems for product services.19 cheung et al. proposed a knowledge system to improve customer service.20 vitharana, jain, and zahedi composed a knowledge repository to enhance the knowledge-analysis skills of business consultants.21 from the angle of user demand, zhou analyzed the elements of service-platform construction and found that crucial platforms should serve knowledge service system construction.22 scholars proposed basic models for knowledge management and knowledge sharing, but they did not simulate their applications.23

knowledge management from the library-science perspective is very different from that in the business area. library knowledge management usually refers to a digital library, especially a personal digital library.24 others explore and attempt to construct a personalized knowledge service system,25 while fewer studies about system designs are based on the results of user surveys in accordance with documented surveys. we rarely see a user-feedback study combined with the method of using users’ own knowledge. users themselves know what they desire. if user-oriented studies separate the system design from user-needs analysis or the other way around, the studies may miss the purpose. therefore, we propose a resource-organization method based on users’ own knowledge to close the distance between the users and the system.

resource-organization model based on user cognition

there are normally two ways to construct a category system. one method gathers experts to determine categories and assign content to them; the category system comes first and the content second. the other method is to derive a category tree from the content itself, as we propose in this paper.
in this way, the content takes priority over the categorization system. in this paper, we focus on this second way to organize resources and index content. resource organization requires a series of steps, including information processing, extraction, and organization. figure 2 shows the resource-organization model based on user cognition. this model fits the needs of digital resources with comments and reviews. the model has two interrelated parts. one is for indexing the content, and the other is for knowledge recommendations.

for the first part, the model integrates all the comments and reviews of all literature in an area or the whole resource. the core concepts and the relationships among the concepts are extracted through natural language processing. the relationship between two concepts is either subordination or correlation. a triple consists of two core concepts and their relationship. the triple set includes all triples. next, all books are indexed by taxonomy in the new category system. however, the indexing of every book is not based upon the traditional method, which is to manually determine each category by reading the literature. we use a method based on the books’ content. while extracting the core concepts from all books, we also extract the core concepts from every individual book by the same semantic-analysis methods and build up triples for that book. then the triples of this book can be matched against the triple set in the new category system. once a triple in a single book yields a maximum matching value, the core concepts in the triple set will be indexed as the keywords of the book. a few examples of the matching process will be discussed in the empirical study (in the section “indexing books”).

the first part is about comments and reviews, which are unstructured data. the second part is to make use of structural data in the bibliography to build a semantic network.
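the matching step in the first part of the model can be sketched in a few lines of python. this is only an illustration under assumptions that the paper does not spell out: the "matching value" is taken to be a simple count of exact triple matches per parent class, the triple set is represented as a dictionary from triple to parent class, and the function name index_book and all sample triples are hypothetical.

```python
from collections import Counter

# A triple is (concept, relationship, concept). The mapping from triple to
# parent class is an assumed representation of the category triple set.
Triple = tuple[str, str, str]


def index_book(book_triples: list[Triple],
               category_triples: dict[Triple, str]) -> list[tuple[str, int]]:
    """Count, per parent class, how many of a book's triples occur in the
    category triple set; classes are returned best match first."""
    hits: Counter[str] = Counter()
    for triple in book_triples:
        parent = category_triples.get(triple)
        if parent is not None:       # unmatched triples are simply ignored
            hits[parent] += 1
    return hits.most_common()


if __name__ == "__main__":
    # Illustrative data only, not taken from the study's triple set.
    category_triples = {
        ("grain", "direct part-whole", "ingredient"): "ingredient",
        ("nut", "direct part-whole", "ingredient"): "ingredient",
        ("vegetarian", "related", "health"): "natural",
    }
    book_triples = [
        ("grain", "direct part-whole", "ingredient"),
        ("nut", "direct part-whole", "ingredient"),
        ("vegetarian", "related", "health"),
        ("picnic", "related", "festive"),   # no counterpart in the triple set
    ]
    print(index_book(book_triples, category_triples))
```

counting exact matches is the simplest possible scoring rule; a weighted or fuzzy match would fit the same interface.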
structural data, including titles, keywords, authors, and publishers, is stored separately. we calculate the informetrics relationships among the entities. the relationships can be among different entities, such as between one author and another or between an author and a publisher. then two entities and their relationship compose a triple. the components in triples are linked to each other, which makes them semantic resources. furthermore, the keywords in structural data are not the original keywords before the new category system but are the modified keywords. finally, the reindexed resources (books in the new category) and semantic resources (the triples from structural data) are both used to build the knowledge network.

figure 2. resource-organization model based on user cognition.

however, why is it important to use both unstructured data and structural data? the reason is to capture the entire content of a literature resource. neither of them can fully represent the whole semantics of a literature resource. structural data lacks subjective content, and unstructured data lacks basic information. thus, a full semantic network can be built using both kinds of data.

resource-organization experiment

object selection

located in portland, oregon, powell’s city of books (hereafter referred to as “book city”) is one of the largest bookstores in the united states, with 200 million books in its inventory. book city caught our eyes for four reasons. (1) the comments and reviews of books on book city’s website are well constructed and plentiful. the national geographic channel established it as one of the ten best bookstores in the world.26 atlantis books, pendulo, and munro’s books are also on the list. among these bookstores, only book city and munro’s books have indexed the information of comments and reviews.
since user reviews are fundamental to this study, we restricted ourselves to bookstores that provided user reviews. (2) we excluded libraries because literature resources have been well organized in libraries. it might not be necessary to reorganize them according to user cognition. however, we can put this topic in the future study. (3) book city is a typical online bookstore that also has a physical bookstore. unlike amazon, book city, indigo, barnes & noble, and munro’s books have physical bookstores. however, they all have technological limitations on retrieval-system and taxonomical construction compared to amazon. thus, it is necessary to investigate these bookstores’ online systems and optimize them. (4) the location was geographically convenient to the researchers. the authors are more familiar with book city than other bookstores. moreover, we plan on conducting a face-to-face interview for the user study. it is doable only if the authors can get to the bookstore and the users who live there. in all, we choose book city as a representative object. data collection and processing on december 22, 2015, we randomly selected the field “cooking and food” and downloaded bibliographic data for 462 new and old books that included title, picture, synopsis and review, isbn, publication date, author, and keywords. in our previous work we described how metadata for all kinds of literature can be categorized into one of three types: structural data, semistructural data, and unstructured data.27 (see table 1). title, isbn, date, publisher, and author are classified as structural data. titles can be seen as structural data or unstructured data depending on the need. titles will be considered as an indivisible entity in this paper as titles need to retain their original meanings. keywords are considered as semistructural data for two reasons: (1) normally one book is indexed with multiple keywords, which are natural language; and (2) keywords are separated by punctuation. 
each keyword can individually exist with its own meaning. however, in the current category system, keywords are the names of categories and subcategories. since we are about to reorganize the category system, the current keywords will not be included in the following steps. we use the field “synopsis and review” in the downloaded bibliographic records as the source of user cognition. synopses and reviews are classified as unstructured data. all synopses and reviews of a single book are first incorporated into one paragraph, since some books contain more than one review. structural data will be stored for constructing a knowledge network. unstructured data will be part-of-speech tagged and word segmented by the stanford segmenter. all the books’ metadata are stored into the defined three data types and separate fields. each field is linked by the isbn as the primary key.

category organization

first, the frequencies of words in all books are separately calculated after word segmenting so that core concepts are identified by the frequencies of words. in total, 29,370 words appeared 43,675 times, after excluding stop words. the 206 words in the sample that occurred more than 105 times appeared 34,944 times. this subset was defined as the core words according to the pareto principle.

table 1. data sample.

| field | content | data type |
|---|---|---|
| title | a modern way to eat: 200+ satisfying vegetarian recipes | structural data |
| isbn | 9781607748038 | structural data |
| date | 04/21/2015 | structural data |
| publisher | ten speed press | structural data |
| author | anna jones | structural data |
| kwds | cooking and food-vegetarian and natural | semistructural data |
| synopsis and review | a beautifully photographed and modern vegetarian cookbook packed with quick, healthy, and fresh recipes that explore the full breadth of vegetarian ingredients (grains, nuts, seeds, and seasonal vegetables) from jamie oliver's london-based food stylist and writer anna jones. how we want to eat is changing. more and more people cook without meat several nights a week and are constantly seeking to . . . | unstructured data |

we are inspired by zhang et al., who described a linguistic-keywords-extraction method by defining multiple kinds of relationships among words.28 the relationships include direct relationship, indirect relationship, part-whole relationship, and related relationship.

• direct relationship. two core words have a relationship directly to each other.
• indirect relationship. two core words are related and linked by another word as a media.
• part-whole relationship. the “is a” relation. one core word belongs to the other. it is the most common relationship in context.
• related relationship. two core words have no relationships but they both appear in a large context.

the first two relationships can be mixed with the second two relationships. for instance, a part-whole relationship can have either a direct relationship or an indirect relationship.

for this study, we combined every two core words into pairs for analysis. for example, the sentence “a picnic is a great escape from our day-to-day and a chance to turn a meal into something more festive and memorable” would result in several core-word pairs, including “picnic” and “meal,” “picnic” and “festive,” and “meal” and “festive.” for “picnic” and “meal,” there is an obvious part-whole relationship in this context. we observed all their relationships in all books and determined their relationship as a direct part-whole relationship because 67 percent of their relationships are part-whole relationship, 80 percent are direct relationship, and others are related relationship. this is the case when two core words are in the same sentence. for two words in different sentences but within one context, we define the words’ relationship as a sentence relationship.
for example, “ingredient” and “meat” in one review in table 1 have an indirect relationship because they are connected by other core concepts between them. therefore, the relationship between “ingredient” and “meat” is an indirect part-whole one in this context. for other cases, two concepts are either related if they appear in the same context or are not related if they do not appear in the same review. thus, all pairs of concepts are calculated and stored as semantic triples.

figure 3. parts of a modified category in “cooking and food” based on user cognition.

the next step is to build up a category tree (figure 3). a direct part-whole relationship is that between a parent class and child class. an indirect part-whole relationship is the relationship between a parent class and a grandchild class. a related relationship is the relationship between sibling classes.

compared to the modified category system (figure 3), the current hierarchical category system (figure 4) has two major issues. first, some categories’ names are duplicated. for example, the child class “by ingredient” contains “fruit,” “fruits and vegetables,” and “fruits, vegetables, and nuts.” second, there are categories without semantic meaning, such as “oversized books.” these two problems lead to disorderly indexing and the retrieval of many irrelevant results. for example, the system asks you to refine your search first if you type one word in the search box. however, the refinement is confusing because parent classes and child classes are mixed. taking a search for “diet” books as an example, the system suggests you refine your search from five subcategories of “diet and nutrition” under three different parent classes. the modified category system, by contrast, avoids duplicated keywords. furthermore, the hierarchical system based on users’ comments maintains meaning.

figure 4. parts of current category system in “cooking and food.”

indexing books

we found that the list of keywords was confusing due to the inefficiency of the previous category system. it is necessary to re-index the keywords of each book based on the modified category system. we rely on a data-oriented indexing process. the method to detect the core concepts of each book is the same as that for all books in section 4.3 (“category organization”). taking the book a modern way to eat as an example, triples are extracted from the book, including “grain-direct part whole-ingredient,” “nut-direct part whole-ingredient,” “vegetarian-related-health,” and so on. using all triples of the book to match with the triples set from all books in section 4.3, we index this book to categories by the best match parent class. in this case, 5 out of 9 triples of a modern way to eat are matched with the parent class “ingredient.” another two are matched with “natural” and “technique,” and the other two cannot correctly match with the triples set. then, a modern way to eat will be indexed with “cooking and food-ingredient,” “cooking and food-natural,” and “cooking and food-technique.”

4.5 semantic-resource construction

the semantic resource is constructed based on structural data that was prepared at the beginning. the informetrics method (specifically co-word analysis) will be used to extract the precise relationship among the bibliography of books, as we previously proposed.29 we combine all structural data and build co-word matrices between each title, publisher, date, author, and keyword. for example, the author “anna jones” co-occurred with many keywords to varying degrees. the author co-occurred with the keyword “natural” four times and “person” seven times.
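the co-word counting behind such a matrix can be sketched as follows. this is a minimal illustration rather than the authors' implementation: it assumes each bibliographic record is a dictionary whose fields hold lists of values, it counts one co-occurrence per record in which both values appear, and the function name cooccurrence and the sample records are hypothetical.

```python
from collections import Counter
from itertools import product


def cooccurrence(records: list[dict[str, list[str]]],
                 field_a: str, field_b: str) -> Counter:
    """Count, over all records, how often a value of field_a appears in the
    same record as a value of field_b (at most once per record)."""
    counts: Counter = Counter()
    for record in records:
        values_a = set(record.get(field_a, []))
        values_b = set(record.get(field_b, []))
        for a, b in product(values_a, values_b):
            counts[(a, b)] += 1
    return counts


if __name__ == "__main__":
    # Illustrative records only, not the study's data.
    records = [
        {"author": ["anna jones"], "keyword": ["natural", "person"]},
        {"author": ["anna jones"], "keyword": ["person"]},
        {"author": ["william davis"], "keyword": ["natural"]},
    ]
    counts = cooccurrence(records, "author", "keyword")
    print(counts[("anna jones", "person")])
    print(counts[("anna jones", "natural")])
```

the resulting counts are the raw degrees of co-occurrence; turning them into labeled relationship strengths would require thresholds that are not given here.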
according to qiu and lou, the precise relationship needs to be divided by the threshold and formatted as literal words.30 therefore, among the degrees of all relationships between “anna jones” and other keywords, the relationship between “anna jones” and “natural” is highly correlated, and the relationship between “anna jones” and “person” is extremely correlated. triples are composed of two concepts and their relationships. then a semantic resource is finally constructed that could be used for knowledge retrieval.

figure 5. an example of the knowledge network.

once the semantic resource is ready, the knowledge network is presentable. we adopted d3.js to display the knowledge network (figure 5). the net view automatically exhibits several books related to the author william davis, who is placed in a conspicuous position on the screen. the force-directed map is rearranged when users drag any book with the mouse, and that book becomes the visible center of the other books. the network can connect with the database and the website.

5. user-experience study on knowledge display and recommendation

there are two common ways to evaluate a retrieval system. one is to test the statistical results, such as recall and precision. the other is a user study. since our aim is “of the people, for the people,” we chose to conduct two user-experience studies rather than rely on statistical results. in this way, we can learn what users suggest and how they comment on our approach.

user-experience study design

in february 2016, with the help of friends, we recruited volunteers by posting fliers in portland, oregon. fifty volunteers contacted us. thirty-nine responses were received by the end of march 2016 because the other eleven volunteers were not able to enroll in the electronic test.
since we needed to test the feasibility of both the new indexing category and the knowledge recommendation, we divided the user study into two parts: the comparison of the simple retrieval and the knowledge recommendation. first, we requested permission to use the data source and website frame from book city. however, we could not construct a new website for book city due to intellectual-property issues. therefore, we constructed a practical mock-up to guide users to simulate a retrieval experiment. following the procedure of the user experience design, we chose mockingbot (https://mockingbot.com) as the mock-up builder. mockingbot allows the demo users to experience a vivid system that will be developed later. the mock-up supports every tag that can be linked with other pages so that subjects could click on the mock-up just as they would on a real website. the demo is expected to help us (1) examine whether our changes would meet the users’ satisfaction and (2) gather information for a better design. then we performed face-to-face, user-guided interviews in which subjects first gained experience with the previous retrieval system and then compared it with our results. we concurrently recorded the answers and scores of users’ feedback. in the following sections, we will describe the interview process and present the feedback results.

study 1: comparison of simple retrieval

first, subjects were asked to search related books written by “michael pollan” at powells.com (figure 6). as such, all subjects used the search box based on their instincts. then they were asked to find a new hardcover copy of a book named cooked: a natural history of transformation. we paid attention to the ways that subjects located the target. only five of them used keyboard shortcuts to find the target. however, thirteen subjects stated their concerns regarding the absence of refinement options.
furthermore, we noticed that six subjects swept (moused over) the refinement area and then decided to continue eye screening. in the meantime, we recorded the time they spent looking for the item. after they found the target, all subjects gave us a score from one to ten that represented their satisfaction with the current retrieval system.

figure 6. screenshot of retrieval results in the current system.

in the comparison experiment, we placed our mock-up in front of subjects and conducted the same test as above. in the mock-up, we used the basic frame of the retrieval system but reframed the refinement area. in the new refinement area (figure 7), we added an optional box with refinement keywords in the left column to narrow the search scope. the logic of the refined keywords comes from the indexing category, as we mentioned in the section “indexing books.” “michael pollan” was indexed in six categories: “biographies,” “children’s books,” “cooking and food,” “engineering manufactures,” “hobby and leisure,” and “gardening.” thus, when subjects clicked the “cooking and food” category, they could refine the results to only twelve books rather than the seventy books in the current system. users can obtain accurate retrieval results faster. after the subjects completed their tasks, they gave us a score from one to ten representing their satisfaction with the modified retrieval system.

figure 7. refinement results in the modified category-system mock-up.

study 2: knowledge recommendation

in this experiment, we conducted two tests for two functions on knowledge visualization. one tested the preferences for the net view, and the other tested the preferences for the individual recommendation.
for the net view, we guided subjects to search for “william davis” in the mock-up and reminded them to click the net view button after the system returned a list view. then, the subjects could see the net view results in figure 5. we recorded the scores that they gave for the net view. as for the recommendation on individual books, we adopted multiple layers of associated retrieval results for every book. users could click on one book and another related book would show in a new tab window. we asked subjects to conduct a new search for “william davis.” then they could browse the website and freely click on any book. once they clicked on davis’s book wheat belly: lose the wheat, lose the weight, and find your path back to health, the first recommendation results popped up (figure 8). the recommendation results about wheat in the field of “grain and bread” showed up, including good to the grain: baking with whole grain flours and the bread baker’s apprentice: mastering the art of extraordinary bread. others about health and losing weight also showed up, such as paleo lunches and breakfasts on the go. all related books appeared because the first book is about both wheat and a healthy diet. a new window showing relevant authors and titles would pop up if the mouse glided over any picture. we asked the subjects about their thoughts on the new recommendation format and recorded the scores.

figure 8. an example of knowledge recommendation.

users’ feedback

overall, knowledge organization and retrieval received a positive response (tables 2 and 3). first, subjects complained about the inefficiency of the current retrieval system in that it took so long to find one book without using shortcut keys (ctrl-f). three quarters of them were not satisfied with the original search style due to the length of the search time. however, 67 percent of the subjects gave a score of eight points or more for the refined search results of our new system.
only two of them thought that it was useless since they were the two users who only took ten seconds to target the exact result. second, 67 percent and 74 percent of the subjects, respectively, thought that the knowledge recommendation and net view were useful and gave them six points or more. however, five subjects gave scores of one point because they maintained that it was not necessary to build a new viewer system.

table 2. the time to find the exact result in the current system.

answers                   # of users
fewer than 10 seconds     2
10 to 30 seconds          4
30 seconds to 1 minute    12
more than 1 minute        21

table 3. statistics of quantitative questions in the questionnaire.

                                          score
questions                                 10   9   8   7   6   5   4   3   2   1   total
satisfied with original results            0   0   0   0   1   9  14   9   4   2   39
preference of refined results              2  10  14   6   5   0   0   0   0   0   37
preference of results in net view          1   8  10   6   4   1   2   3   1   3   39
preference of knowledge recommendation     3   6   4   8   5   6   0   3   1   2   38

during the interview, subjects who gave scores of more than eight points spoke positively about the vivid visualization of the retrieval results, using words such as “innovative” and “creative.” for instance, user 11 said, “bravo changes for powell, that’d be the most innovative experience for the locals.” among the subjects who gave scores of more than six points, the comments were mostly “interesting idea.” for instance, user 17 commented, “this is an interesting idea to explore my knowledge. i had no idea powell could do such an improvement.” some users offered suggestions to improve the system. for example, user 12 suggested that the system was not comprehensive enough to confidently assess whether the modified category system was better than the previous system. user 25 (a possible professional) was very concerned about the recall efficiency since the system might use many matching algorithms.
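the percentages quoted in this section can be cross-checked against table 3. the following is a minimal sketch, not the authors’ analysis: it assumes that every share is taken over all 39 subjects and that the thresholds are inclusive (“eight or more,” “six or more,” “four or fewer”). these assumptions are ours; they are chosen because they reproduce the figures in the text. the names `TABLE_3`, `PANEL_SIZE`, and the helper functions are hypothetical.

```python
# score counts copied from table 3, ordered from score 10 down to score 1
SCORES = list(range(10, 0, -1))
TABLE_3 = {
    "original": [0, 0, 0, 0, 1, 9, 14, 9, 4, 2],
    "refined": [2, 10, 14, 6, 5, 0, 0, 0, 0, 0],
    "net_view": [1, 8, 10, 6, 4, 1, 2, 3, 1, 3],
    "recommendation": [3, 6, 4, 8, 5, 6, 0, 3, 1, 2],
}
PANEL_SIZE = 39  # assumption: shares are computed over all 39 subjects


def count_at_least(counts, threshold):
    """number of answers with a score >= threshold."""
    return sum(n for score, n in zip(SCORES, counts) if score >= threshold)


def count_at_most(counts, threshold):
    """number of answers with a score <= threshold."""
    return sum(n for score, n in zip(SCORES, counts) if score <= threshold)


def percent_of_panel(count):
    """share of the whole panel, rounded to a whole percent."""
    return round(100 * count / PANEL_SIZE)


def mean_score(counts):
    """average score over the subjects who answered the question."""
    return sum(score * n for score, n in zip(SCORES, counts)) / sum(counts)


if __name__ == "__main__":
    print(percent_of_panel(count_at_least(TABLE_3["refined"], 8)))
    print(percent_of_panel(count_at_least(TABLE_3["recommendation"], 6)))
    print(percent_of_panel(count_at_least(TABLE_3["net_view"], 6)))
    print(percent_of_panel(count_at_most(TABLE_3["original"], 4)))
```

under these assumptions the four printed values correspond to the 67 percent, 67 percent, 74 percent, and roughly three-quarters figures reported above. `mean_score` is included only as a convenience for comparing the four questions; the article itself does not report means.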
discussion and conclusion

in this paper, we proposed a digital literature resource organization model based on user cognition. this model aims to let users exert their subjective initiative. we noticed a clear difference between the previous category system and the new system based on user cognition. our aim, which was “of the people, for the people,” was fulfilled. taking powell’s city of books as an example, we described how to construct a knowledge network based on user cognition. the user experience study showed that this network supports an improved presentation of digital-resource knowledge recommendation and knowledge retrieval. although user cognition includes many other processes of user behavior, we used only its literal expression. this turned out to be a positive and feasible way to reveal users’ cognition.

we find that there is much more room to extend digital resource knowledge recommendation based on user cognition. for one, in this paper we took only the familiar book city as a study object and books as experiment objects, and we observed favorable effects, which suggests that the digital resource knowledge link could be applied to physical libraries and bookstores or to other types of literature. even though libraries have well-developed taxonomy systems, these can be compared with or combined with new ideas. for another, users favor visual effects and user functions. the results show promise for improving book city’s website or even other digital platforms. the next study will address how to optimize the retrieval algorithm and reduce time costs.

acknowledgements

we thank carolyn mckay and powell’s city of books for their great help with questionnaire networking, and all participants for their feedback. this work was supported by the national social science foundation of china [grant number 17ctq025].
public library computer waiting queues: alternatives to the first-come-first-served strategy

stuart williamson

abstract

this paper summarizes the results of a simulation of alternative queuing strategies for a public library computer sign-up system. using computer usage data gathered from a public library, the performance of these various queuing strategies is compared in terms of the distribution of user wait times. the consequences of partitioning a pool of public computers are illustrated as are the potential benefits of prioritizing users in the waiting queue according to the amount of computer time they desire.

introduction

many of us at public libraries are all too familiar with the scene: a crowd of customers huddled around the library entrance in the morning, anxiously waiting for the doors to open to begin a race for the computers. from this point on, the wait for a computer at some libraries, such as the one we will examine, can hover near thirty minutes on busy days and peak at an hour or more. such long waiting times are a common source of frustration for both customers and staff. by far the most effective solution to this problem is to install more public computers at your library. of course, when the space or money runs out, this may no longer be possible. another approach is to reduce the length or number of sessions each customer is allowed.
unfortunately, reducing session length can make completion of many important tasks difficult, whereas restricting the number of sessions per day can result in customers upset over being unable to use idle computers.1 finally, faced with daunting wait times, libraries eager to make their computers accessible to more people may be tempted to partition their waiting queue by installing separate fifteen-minute “express” computers. a primary focus of this paper is to illustrate how partitioning the pool of public computers can significantly increase waiting times. additionally, several alternative queuing strategies are presented for providing express-like computer access without increasing overall waiting times.

stuart williamson (swilliamson@metrolibrary.org) is researcher, metropolitan library system, oklahoma city, oklahoma. information technology and libraries | june 2012

we often take for granted the notion that first-come-first-served (fcfs) is a basic principle of fairness. “i was here first” is an intuitive claim that we understand from an early age. however, the inefficiency present in a strictly fcfs queue is implicitly acknowledged when we courteously invite a person with only a few items to bypass our overflowing grocery cart to proceed ahead in the check-out line. most of us would agree to wait an additional few minutes rather than delay someone else for a much greater length of time. when express lanes are present, they formalize this process by essentially allowing customers needing help for only a short period of time to cut in line. these line cuts are masked by the establishment of separate dedicated lines, i.e., the queue is partitioned into express and non-express lines.
one question addressed by this article is “is there a middle ground?” in other words, how might a library system set up its computer waiting queue to achieve express-lane type service without splitting the set of public internet computers into partitions that operate separately and in parallel? several such strategies are presented here along with the results of how each performed in a computer simulation using actual customer usage data from a public library.

strategies

queuing systems are heavily researched in a number of disciplines, particularly computer science and operations research. the complexity and sheer number of different queuing models can present a formidable barrier to library professionals. this is because, in the absence of real-world data, it is often necessary to analyze a queuing system mathematically by approximating its key features with an applicable probability distribution. unfortunately, applying these distributions entails adopting their underlying assumptions as well as any additional assumptions involved in calculating the input parameters. for instance, the poisson distribution (used to approximate customer arrival rates) requires that the expected arrival rate be uniform across all time intervals, an assumption which is clearly violated when school lets out and teenagers suddenly swarm the computers.2 even if we can account for such discrepancies, there remains the difficulty of estimating the correct arrival rate parameter for each discrete time interval being analyzed.

fortunately, many libraries now use automated computer sign-up systems which provide access to vast amounts of real-world data. with realistic data, it is possible to simulate various queuing strategies, a few of which will be analyzed in this article. a computer simulation using real-world data provides a good picture of the practical implications of any queuing strategy we care to devise without the need for complex models.
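to make the idea of replaying recorded sign-ups concrete, here is a minimal sketch of a first-come-first-served replay over a single pool of computers. it is an illustration under stated assumptions, not the simulation used in this article: the function name `replay_fcfs` and the sample sessions are hypothetical, and log-in delays and abandoned sign-ups are ignored.

```python
import heapq


def replay_fcfs(sessions, n_computers):
    """replay recorded sessions under fcfs with one shared pool.

    sessions: iterable of (arrival_minute, used_minutes) pairs.
    returns the wait in minutes for each session, in arrival order.
    """
    # min-heap holding the minute at which each computer next becomes free
    free_at = [0.0] * n_computers
    heapq.heapify(free_at)
    waits = []
    for arrival, used in sorted(sessions):
        earliest_free = heapq.heappop(free_at)
        start = max(arrival, earliest_free)  # wait only if no computer is idle
        waits.append(start - arrival)
        heapq.heappush(free_at, start + used)
    return waits


if __name__ == "__main__":
    # hypothetical data: (arrival minute, minutes of computer time used)
    sample = [(0, 60), (0, 60), (10, 15), (20, 30)]
    print(replay_fcfs(sample, n_computers=2))
    print(replay_fcfs(sample, n_computers=3))
```

because no distributional assumption is needed, the same recorded sessions can be replayed with a different number of computers or a different selection rule, and the resulting wait times compared directly.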
as is often the case, designing a waiting queue strategy involves striking a balance among competing factors. for instance, one way of reducing waiting times involves breaking with the fcfs rule and allowing users in one category to cut in front of other users. how many cuts are acceptable? does the shorter wait time for users in one category justify the longer waits in another? there are no right answers to these questions. while simulating a strategy can provide a realistic picture of its results in terms of waiting times, evaluating which strategy’s results are preferable for a particular library must be done on a case-by-case basis.

in addition to the standard fcfs strategy with a single pool of computers and the same fcfs strategy implemented with one computer removed from the pool to serve as a dedicated fifteen-minute express computer (referred to as fcfs-15), we will consider for comparison three other well-known alternative queuing strategies: shortest-job-first (sjf), highest-response-ratio-next (hrrn), and a variant of shortest-job-first (sjf-fb) which employs a feedback mechanism to restrict the number of times a given user may be bypassed in the queue.3 the three alternative strategies all require advance knowledge or estimation of how long each particular computer session will last. in our case, this means customers would need to indicate how long a session they desire upon first signing up for a computer. although any number of minutes could in principle be requested, we will limit the sign-up options to four categories in fifteen-minute intervals: fifteen minutes, thirty minutes, forty-five minutes, and sixty minutes. each session will then be initially categorized into one of four priority classes (p1, p2, p3, and p4) accordingly. as the data will show, customers selecting shorter sessions are given a higher priority in the queue and will thus have a shorter expected waiting time.
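the mapping from a requested session length to a priority class can be sketched in a few lines. this is an illustration only: the text fixes the four sign-up options, while the rule of rounding other values up to the next fifteen-minute option and capping at sixty minutes is our assumption, and `priority_class` is a hypothetical helper name.

```python
import math

SESSION_OPTIONS = (15, 30, 45, 60)  # sign-up options in minutes, from the text


def priority_class(requested_minutes):
    """return 1 (p1, highest priority) through 4 (p4, lowest priority)."""
    if requested_minutes <= 0:
        raise ValueError("requested session length must be positive")
    # assumption: round up to the next 15-minute option, capped at 60 minutes
    capped = min(requested_minutes, SESSION_OPTIONS[-1])
    return math.ceil(capped / SESSION_OPTIONS[0])


if __name__ == "__main__":
    for minutes in SESSION_OPTIONS:
        print(minutes, "minutes -> p" + str(priority_class(minutes)))
```

keeping the class as a plain number makes it easy to compare users in a queue and, for a feedback variant, to adjust a user’s priority in fractional steps.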
it should be noted that relying on users to choose their own session length presents its own set of problems. it is often difficult to estimate how much time will be required to accomplish a given set of tasks online. however, users face a similar difficulty in deciding whether to opt for a dedicated fifteen-minute computer under the fcfs-15 system. the trade-off between use time and wait time should provide an incentive for some users to self-ration their computer use, placing an additional downward pressure on wait times. however, user adaptations in response to various queuing strategies are outside the scope of this analysis and will not be considered further.

the shortest-job-first (sjf) strategy functions by simply selecting from the queue the user in the highest priority class. the amount of time spent waiting by each user is only considered as a tie breaker among users occupying the same priority class. our results demonstrate that the sjf strategy is generally best for minimizing overall average waiting time as well as for getting customers needing the least amount of computer time online the fastest. the main drawbacks of this strategy are that these gains come at the expense of more line cuts and higher average and maximum waiting times for the lowest priority users, those needing the longest sessions (sixty minutes). there is no limit to how many times a user can be passed over in the queue. in theory, this means that such a user could be continually bypassed and never be assigned a computer during the day.

the sjf-fb strategy is a variant of sjf with the addition of a feedback mechanism that increases the priority of users each time they are bypassed in line. for instance, if a user signs up for a sixty-minute session, he/she is initially assigned a priority of 4. suppose that shortly after, another user signs up for a thirty-minute session and is assigned a priority of 2. the next available computer will be assigned to the user with priority 2.
the bypassed user’s priority will now be bumped up by a set interval. in this simulation an interval of 0.5 is used so the bypassed user’s new priority becomes 3.5. as a result, users beginning with a priority of 4 will reach the highest priority of 1 after being bypassed six times and will not be bypassed further. this effectively caps at six the number of times a user can be bypassed.

the final alternative strategy, highest-response-ratio-next (hrrn), is a balance between fcfs and sjf. it considers both the arrival time and requested session length when assigning a priority to each user in the queue. each time a user is selected from the queue, the response ratio is recalculated for all users. the user with the highest response ratio is selected and assigned the open computer. the formula for response ratio is:

\[ \text{response ratio} = \frac{\text{waiting time} + \text{requested session length}}{\text{requested session length}} \]

this allows users with a shorter session request to cut in line, but only up to a point. even customers requesting the longest possible session move up in priority as they wait, just at a slower pace. this method produces the same benefits and drawbacks as the sjf strategy; but the effects of both are moderated, and the possibility of unbounded waiting is eliminated. still, although the expected number of cuts will be lower using hrrn than with sjf, there is no limit on how many times a user may be passed over in the queue.

the response ratio formula can be generalized by scaling the importance of the waiting time factor. for instance, in the modified response ratio below, increasing values of x > 1 will cause the strategy to more resemble fcfs, and decreasing values of 0 < x < 1 will more resemble sjf.

\[ \text{response ratio} = \frac{(\text{waiting time})^{x} + \text{requested session length}}{\text{requested session length}} \]

one could experiment with different values of x to find a desired balance between the number of line cuts and the impact on average waiting times for customers in the various priority classes. this won’t be pursued here, and x will be assumed to be 1.
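the selection rules described in this section can be sketched side by side. this is a minimal illustration under stated assumptions, not the code used for the simulation: the `Waiter` record, the function names, and the tie-breaking choices are hypothetical, and the exponent form of the generalized response ratio is our reading of the description of x (a plain multiplier on waiting time would leave the ordering of users unchanged).

```python
from dataclasses import dataclass


@dataclass
class Waiter:
    name: str
    arrival: float   # minute of sign-up
    requested: int   # requested session length in minutes
    priority: float  # 1 = highest priority, 4 = lowest
    cuts: int = 0    # times this user has been passed over


def response_ratio(w, now, x=1.0):
    """(waiting time ** x + requested length) / requested length."""
    return ((now - w.arrival) ** x + w.requested) / w.requested


def pick_fcfs(queue, now):
    return min(queue, key=lambda w: w.arrival)


def pick_sjf(queue, now):
    # best priority class first; earlier arrival breaks ties
    return min(queue, key=lambda w: (w.priority, w.arrival))


def pick_hrrn(queue, now, x=1.0):
    # highest response ratio first; earlier arrival breaks ties (assumption)
    return max(queue, key=lambda w: (response_ratio(w, now, x), -w.arrival))


def serve_next(queue, now, pick, feedback_step=0.0):
    """remove and return the next user; record cuts for bypassed users.

    feedback_step=0.5 with pick_sjf gives the sjf-fb behavior described above.
    """
    chosen = pick(queue, now)
    queue.remove(chosen)
    for w in queue:
        if w.arrival < chosen.arrival:  # w signed up earlier but was skipped
            w.cuts += 1
            w.priority = max(1.0, w.priority - feedback_step)
    return chosen


if __name__ == "__main__":
    a = Waiter("a", arrival=0, requested=60, priority=4)
    b = Waiter("b", arrival=5, requested=30, priority=2)
    queue = [a, b]
    print(serve_next(queue, now=10, pick=pick_sjf, feedback_step=0.5).name)
    print(a.priority, a.cuts)
```

in the usage example, the thirty-minute request is served first and the bypassed sixty-minute user moves from priority 4 to 3.5, matching the worked example in the text. passing `pick_fcfs` or `pick_hrrn` to `serve_next` swaps the rule without changing the bookkeeping for line cuts.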
methodology

the data used in this simulation come from the metropolitan library system’s southern oaks library in oklahoma city. this library has eighteen public internet computers that customers can sign up for using proprietary software developed by jimmy welch, deputy executive director/technology for the metropolitan library system. the waiting queue employs the first-come-first-served (fcfs) strategy. customers are allotted an initial session of up to sixty minutes but may extend their session in thirty-minute increments so long as the waiting queue is empty. repeat customers are also allowed to sign up for additional thirty-minute sessions during the day, provided that no user currently in the queue has been waiting for more than ten minutes (an indication that demand for computers is currently high). anonymous usage data gathered by the system in august 2010 were compiled to produce the information about each customer session shown in table 1.

table 1. session data (units in minutes)

the information about each session required for the simulation includes the time at which the user arrived to sign up for a computer, the number of minutes it took the user to log in once assigned a computer, how many minutes of computer time were used, whether this was the user’s first or a subsequent session for the day, and finally, whether the user gave up waiting and abandoned his/her place in the queue. users are given eight minutes to log in once a computer station is assigned to them before they are considered to have abandoned the queue. once these data have been gathered, the computer simulation runs by iterating through each second the library is open. as user sign-up times are encountered in the data, they are added to the waiting queue. when a computer becomes available, a user is selected from the queue using the strategy being simulated and assigned to the open computer.
the customer occupies the computer for the length of time given by their associated log-in delay and session length. when this time expires, customers are removed from their computer and the information recorded during their time spent in the waiting queue is logged.

results

there were 7,403 sign-ups for the computers at the southern oaks library in august 2010. each of these requests is assigned a priority class based on the length of the session as detailed in table 2. the intended session length of users choosing to abandon the queue is unknown. abandoned sign-ups are assigned a priority class randomly in proportion to the overall distribution of priority classes in the data so as not to introduce any systematic bias into the results. even though their actual session length is zero, these users participate in the queue and cause the computer eventually assigned to them to sit idle for eight minutes until it is re-assigned. customers signing up for a subsequent session during the day are always assigned the lowest priority class (p4) regardless of their requested session length. this is a policy decision to not give priority to users who have already received a computer session for the day.

table 2. assignment of priority classes

figure 1 displays the average waiting time for each priority class during the simulation (bars) along with the total number of sessions initially assigned to each class (line). it is immediately obvious from the chart that each alternative strategy excels at reducing the average wait for high priority (p1) users. also observe how removing one computer from the pool to serve exclusively as a fifteen-minute computer drastically increases the fcfs-15 average wait times in the other priority classes. clearly, removing one (or more) computers from the pool to serve as dedicated fifteen-minute stations is a poor strategy here for all but the 519 users in class p1.
losing just one of the eighteen available computers nearly doubles the average wait for the remaining 6,884 users in the other priority classes.

figure 1. average user wait minutes by priority class

by contrast, note that the reduced average wait times for the highest priority users in class p1 persist in classes p2 and p3 for the non-fcfs strategies. the sjf strategy produces the most dramatic reductions for the 2,164 users not in class p4. however, for the 5,239 users in class p4, the sjf strategy produced an average wait time that was 2.1 minutes longer than the purely fcfs strategy. the hrrn strategy achieves lesser wait time reductions than sjf in the higher priority classes, but hrrn increased the average wait for users in class p4 by only 0.7 minutes relative to fcfs. the average wait using the sjf-fb strategy falls in between that of sjf and hrrn for each priority class while guaranteeing users will be cut at most six times.

an examination of the maximum wait times for each priority class in figure 2 illustrates how the express lane itself can be a bottleneck. even with a dedicated fifteen-minute express computer under the fcfs-15 strategy, at least one user would have waited over half an hour to use a computer for fifteen minutes or less. in all but the highest priority class (p2 through p4), the fcfs-15 strategy again performs poorly with at least one user in each of these classes waiting over ninety minutes for a computer.

figure 2. maximum user wait minutes by priority class

capping the number of times a user may be passed over in the queue under the sjf-fb strategy makes it less likely that members of classes p2 and p3 will be able to take advantage of their higher priority to cut in front of users in class p4 during periods of peak demand. as a result, the sjf-fb maximum wait times for classes p2 and p3 are similar to those under the fcfs strategy.
this was not the case in the breakdown of sjf-fb average waiting times across priority classes in figure 1.

table 3 breaks down waiting times for each queuing strategy according to the overall percentage of users waiting no more than the given number of minutes. here we see the effects of each strategy on the system as a whole, instead of by priority class. notice that the overall average wait times for the non-fcfs strategies are lower than those of fcfs. this indicates that the total reduction in waiting times for high-priority users exceeds the additional time spent waiting by users in class p-4. in other words, these strategies are globally more efficient than fcfs. notice, too, in table 3 that the non-fcfs strategies achieve significant reductions in the median wait time compared with fcfs.

table 3. distribution of wait times by strategy

after demonstrating the impact that breaking the first-come-first-served rule can have on waiting times, it is important to examine the line cuts that are associated with each of these strategies. line cuts are recorded by each user in the simulation while waiting in the queue. each time a user is selected from the queue and assigned a computer, remaining users who arrived prior to the one just selected note having been skipped over. by the time they are assigned a computer, users have recorded the total number of times they were passed over in the queue.

figure 3. cumulative distribution of line cuts by queuing strategy

figure 3 displays the cumulative percentage of users experiencing no more than the listed number of cuts for each non-fcfs strategy. the majority of users are not passed over at all under these strategies. however, there is a small minority of users that will be repeatedly cut in line.
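the line-cut bookkeeping described above can be sketched in a few lines of python. this is a minimal illustration under stated assumptions, not the simulation code used for the study: the names `Waiter`, `assign_next`, `drain`, `shortest_first`, and `first_come` are hypothetical, and the shortest-session rule stands in for whichever non-fcfs selection rule is being simulated.

```python
from dataclasses import dataclass


@dataclass
class Waiter:
    name: str
    arrival: int      # arrival order in the queue (smaller = earlier)
    session: int      # requested session length in minutes (illustrative)
    cuts: int = 0     # number of times this user has been passed over


def assign_next(queue, pick):
    """Remove the user chosen by `pick`; earlier arrivals left behind record a cut."""
    chosen = pick(queue)
    queue.remove(chosen)
    for waiter in queue:
        if waiter.arrival < chosen.arrival:
            waiter.cuts += 1
    return chosen


def drain(queue, pick):
    """Assign computers one at a time until the queue is empty."""
    served = []
    while queue:
        served.append(assign_next(queue, pick))
    return served


def shortest_first(queue):
    # Hypothetical stand-in for a shortest-job-first rule; ties go to the earlier arrival.
    return min(queue, key=lambda w: (w.session, w.arrival))


def first_come(queue):
    # Plain first-come-first-served: nobody is ever passed over.
    return min(queue, key=lambda w: w.arrival)
```

with three waiting users requesting 60, 15, and 30 minutes in that arrival order, `drain(queue, shortest_first)` serves the second and third users first, so the first user ends with two recorded cuts; under `first_come` every count stays at zero.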
for instance, in our simulation, one unfortunate individual was passed over in the queue sixteen times under the sjf strategy. this user waited ninety-one minutes using this strategy as opposed to only fifty-nine minutes under the familiar fcfs waiting queue. most customers would become upset upon seeing a string of sixteen people jump over them in the queue and get on a computer while they are enduring such a long wait. the hrrn strategy caused a maximum of nine cuts to an individual in this simulation. this user waited seventy-three minutes under hrrn versus only fifty-five minutes using fcfs. extreme examples such as those above are the exception. under the hrrn and sjf-fb strategies, 99% of users were passed over at most four times while waiting in the queue.

conclusion

we have examined the simulation of several queuing strategies using a single month of computer usage data from the southern oaks library. the relative performance difference between queuing strategies will depend on the supply and demand of computers at any given location. clearly, at libraries with plenty of public computers for which customers seldom have to wait, the choice of queuing strategy is inconsequential. however, for libraries struggling with waiting times on par with those examined here, the choice can have a substantial impact.

in general, however, these simulation results demonstrate the ability of non-fcfs queuing strategies to significantly lower waiting times for certain classes of users without partitioning the pool of computers. these reductions in waiting times come at the cost of allowing high priority users to essentially cut in line. this causes slightly longer wait times for low priority users, but overall average and median wait times see a small reduction. of course, for some customers, being passed over in line even once is intolerable.
furthermore, creating a system to implement an alternative queuing strategy may present obstacles of its own. however, if the need to provide for quick, short-term computer access is pressing enough for a library to create a separate pool of “express” computers, then one of the non-fcfs queuing strategies discussed in this paper may be a viable alternative. at the very least, the fcfs-15 simulation results should give one pause before resorting to designated “express” and “nonexpress” computers in an attempt to remedy unacceptable customer waiting times.

acknowledgments

the author would like to thank the metropolitan library system, kay bauman, jimmy welch, sudarshan dhall, and bo kinney for their support and assistance with this paper as well as tracey thompson and tim spindle for their excellent review and recommendations.

references

1. j. d. slone, “the impact of time constraints on internet and web use,” journal of the american society for information science and technology 58 (2007): 508–17.
2. william mendenhall and terry sincich, statistics for engineering and the sciences (upper saddle river, nj: prentice-hall, 2006), 151–54.
3. abraham silberschatz, peter baer galvin, and greg gagne, operating system concepts (hoboken, nj: wiley, 2009), 188–200.
generating collaborative systems for digital libraries | hilera et al. 195

josé r. hilera, carmen pagés, j. javier martínez, j. antonio gutiérrez, and luis de-marcos

an evolutive process to convert glossaries into ontologies

dictionary, the outcome will be limited by the richness of the definition of terms included in that dictionary. it would be what is normally called a “lightweight” ontology,6 which could later be converted into a “heavyweight” ontology by implementing, in the form of axioms, knowledge not contained in the dictionary.
this paper describes the process of creating a lightweight ontology of the domain of software engineering, starting from the ieee standard glossary of software engineering terminology.7

■■ ontologies, the semantic web, and libraries

within the field of librarianship, ontologies are already being used as alternative tools to traditional controlled vocabularies. this may be observed particularly within the realm of digital libraries, although, as krause asserts, objections to their use have often been raised by the digital library community.8 one of the core objections is the difficulty of creating ontologies as compared to other vocabularies such as taxonomies or thesauri. nonetheless, the semantic richness of an ontology offers a wide range of possibilities concerning indexing and searching of library documents.

the term ontology (used in philosophy to refer to the “theory about existence”) has been adopted by the artificial intelligence research community to define a categorization of a knowledge domain in a shared and agreed form, based on concepts and relationships, which may be formally represented in a computer readable and usable format. the term has been widely employed since 2001, when berners-lee et al. envisaged the semantic web, which aims to turn the information stored on the web into knowledge by transforming data stored in every webpage into a common scheme accepted in a specific domain.9 to accomplish that task, knowledge must be represented in an agreed-upon and reusable computer-readable format. to do this, machines will require access to structured collections of information and to formalisms which are based on mathematical logic that permits higher levels of automatic processing.

technologies for the semantic web have been developed by the world wide web consortium (w3c). the most relevant technologies are rdf (resource description

this paper describes a method to generate ontologies from glossaries of terms.
the proposed method presupposes an evolutionary life cycle based on successive transformations of the original glossary that lead to products of intermediate knowledge representation (dictionary, taxonomy, and thesaurus). these products are characterized by an increase in semantic expressiveness in comparison to the product obtained in the previous transformation, with the ontology as the end product. although this method has been applied to produce an ontology from the “ieee standard glossary of software engineering terminology,” it could be applied to any glossary of any knowledge domain to generate an ontology that may be used to index or search for information resources and documents stored in libraries or on the semantic web.

from the point of view of their expressiveness or semantic richness, knowledge representation tools can be classified at four levels: at the basic level (level 0), to which dictionaries belong, tools include definitions of concepts without formal semantic primitives; at the taxonomies level (level 1), tools include a vocabulary, implicit or explicit, as well as descriptions of specialized relationships between concepts; at the thesauri level (level 2), tools further include lexical (synonymy, hyperonymy, etc.) and equivalence relationships; and at the reference models level (level 3), tools combine the previous relationships with other more complex relationships between concepts to completely represent a certain knowledge domain.1 ontologies belong at this last level.

according to the hierarchic classification above, knowledge representation tools of a particular level add semantic expressiveness to those in the lower levels in such a way that a dictionary or glossary of terms might develop into a taxonomy or a thesaurus, and later into an ontology.
there are a variety of comparative studies of these tools,2 as well as varying proposals for systematically generating ontologies from lower-level knowledge representation systems, especially from descriptor thesauri.3 this paper proposes a process for generating a terminological ontology from a dictionary of a specific knowledge domain.4 given the definition offered by neches et al. (“an ontology is an instrument that defines the basic terms and relations comprising the vocabulary of a topic area as well as the rules for combining terms and relations to define extensions to the vocabulary”),5 it is evident that the ontology creation process will be easier if there is a vocabulary to be extended than if it is developed from scratch. if the developed ontology is based exclusively on the

josé r. hilera (jose.hilera@uah.es) is professor, carmen pagés (carmina.pages@uah.es) is assistant professor, j. javier martínez (josej.martinez@uah.es) is professor, j. antonio gutiérrez (jantonio.gutierrez@uah.es) is assistant professor, and luis de-marcos (luis.demarcos@uah.es) is professor, department of computer science, faculty of librarianship and documentation, university of alcalá, madrid, spain.

196 information technology and libraries | december 2010

configuration management; data types; errors, faults, and failures; evaluation techniques; instruction types; language types; libraries; microprogramming; operating systems; quality attributes; software documentation; software and system testing; software architecture; software development process; software development techniques; and software tools.15

in the glossary, entries are arranged alphabetically. an entry may consist of a single word, such as “software,” a phrase, such as “test case,” or an acronym, such as “cm.” if a term has more than one definition, the definitions are numbered. in most cases, noun definitions are given first, followed by verb and adjective definitions as applicable.
examples, notes, and illustrations have been added to clarify selected definitions. cross-references are used to show a term’s relations with other terms in the dictionary: “contrast with” refers to a term with an opposite or substantially different meaning; “syn” refers to a synonymous term; “see also” refers to a related term; and “see” refers to a preferred term or to a term where the desired definition can be found. figure 2 shows an example of one of the definitions of the glossary terms. note that definitions can also include

framework),10 which defines a common data model to specify metadata, and owl (web ontology language),11 which is a new markup language for publishing and sharing data using web ontologies.

more recently, the w3c has presented a proposal for a new rdf-based markup system that will be especially useful in the context of libraries. it is called skos (simple knowledge organization system), and it provides a model for expressing the basic structure and content of concept schemes, such as thesauri, classification schemes, subject heading lists, taxonomies, folksonomies, and other similar types of controlled vocabularies.12

the emergence of the semantic web has created great interest within librarianship because of the new possibilities it offers in the areas of publication of bibliographical data and development of better indexes and better displays than those that we have now in ils opacs.13 for that reason, it is important to strive for semantic interoperability between the different vocabularies that may be used in libraries’ indexing and search systems, and to have compatible vocabularies (dictionaries, taxonomies, thesauri, ontologies, etc.) based on a shared standard like rdf.

there are, at the present time, several proposals for using knowledge organization systems as alternatives to controlled vocabularies.
for example, folksonomies, though originating within the web context, have been proposed by different authors for use within libraries “as a powerful, flexible tool for increasing the user-friendliness and interactivity of public library catalogs.”14 authors argue that the best approach would be to create interoperable controlled vocabularies using shared and agreed-upon glossaries and dictionaries from different domains as a departure point, and then to complete evolutive processes aimed at semantic extension to create ontologies, which could then be combined with other ontologies used in information systems running in both conventional and digital libraries for indexing as well as for supporting document searches. there are examples of glossaries that have been transformed into ontologies, such as the cambridge healthtech institute’s “pharmaceutical ontologies glossary and taxonomy” (http://www.genomicglossaries.com/content/ontologies.asp), which is an “evolving terminology for emerging technologies.”

■■ ieee standard glossary of software engineering terminology

to demonstrate our proposed method, we will use a real glossary belonging to the computer science field, although it is possible to use any other. the glossary, available in electronic format (pdf), defines approximately 1,300 terms in the domain of software engineering (figure 1). topics include addressing; assembling, compiling, linking, loading; computer performance evaluation;

figure 1. cover of the glossary document

4. define the classes and the class hierarchy
5. define the properties of classes (slots)
6. define the facets of the slots
7. create instances

as outlined in the introduction, the ontology developed using our method is a terminological one. therefore we can ignore the first two steps in noy and mcguinness’s process, as the concepts of the ontology coincide with the terms of the glossary used.
any ontology development process must take into account the basic stages of the life cycle, but the way of organizing the stages can be different in different methods. in our case, since the ontology has a terminological character, we have established an incremental development process that supposes the natural evolution of the glossary from its original format (dictionary or vocabulary format) into an ontology. the proposed life cycle establishes a series of steps or phases that will result in intermediate knowledge representation tools, with the final product, the ontology, being the most semantically rich (figure 4). therefore this is a product-driven process, in which the aim of every step is to obtain an intermediate product useful on its own. the intermediate products and the final examples associated with the described concept. in the resulting ontology, the examples were included as instances of the corresponding class. in figure 2, it can be seen that the definition refers to another glossary on programming languages (std 610.13), which is a part of the series of dictionaries related to computer science (“ieee std 610,” figure 3). other glossaries which are mentioned in relation to some references about term definitions are 610.1, 610.5, 610.7, 610.8, and 610.9. to avoid redundant definitions and possible inconsistencies, links must be implemented between ontologies developed from those glossaries that include common concepts. the ontology generation process presented in this paper is meant to allow for integration with other ontologies that will be developed in the future from the other glossaries. in addition to the explicit references to other terms within the glossary and to terms from other glossaries, the textual definition of a concept also has implicit references to other terms. 
for example, from the phrase “provides features designed to facilitate expression of data structures” included in the definition of the term high order language (figure 2), it is possible to determine that there is an implicit relationship between this term and the term data structure, also included in the glossary. these relationships have been considered in establishing the properties of the concepts in the developed ontology.
■■ ontology development process
many ontology development methods presuppose a life cycle and suggest technologies to apply during the process of developing an ontology.16 the method described by noy and mcguinness is helpful when beginning this process for the first time.17 they establish a seven-step process:
1. determine the domain and scope of the ontology
2. consider reusing existing ontologies
3. enumerate important terms in the ontology
figure 2. example of term definition in the ieee glossary
high order language (hol). a programming language that requires little knowledge of the computer on which a program will run, can be translated into several different machine languages, allows symbolic naming of operations and addresses, provides features designed to facilitate expression of data structures and program logic, and usually results in several machine instructions for each program statement. examples include ada, cobol, fortran, algol, pascal. syn: high level language; higher order language; third generation language. contrast with: assembly language; fifth generation language; fourth generation language; machine language. note: specific languages are defined in p610.13
figure 3. ieee computer science glossaries
610—standard dictionary of computer terminology
610.1—standard glossary of mathematics of computing terminology
610.2—standard glossary of computer applications terminology
610.3—standard glossary of modeling and simulation terminology
610.4—standard glossary of image processing terminology
610.5—standard glossary of data management terminology
610.6—standard glossary of computer graphics terminology
610.7—standard glossary of computer networking terminology
610.8—standard glossary of artificial intelligence terminology
610.9—standard glossary of computer security and privacy terminology
610.10—standard glossary of computer hardware terminology
610.11—standard glossary of theory of computation terminology
610.12—standard glossary of software engineering terminology
610.13—standard glossary of computer languages terminology
198 information technology and libraries | december 2010
because some terms in the ieee glossary of software engineering terminology have several meanings (up to five in some cases), during dictionary development we decided to create a separate concept (class) for each meaning of a term, appending a number to the concept name to differentiate them. for example, there are five different definitions for the term test, which is why there are five concepts (test1–test5), corresponding to the five meanings of the term: (1) an activity in which a system or component is executed under specified conditions, the results are observed or recorded, and an evaluation is made of some aspect of the system or component; (2) to conduct an activity as in (1); (3) a set of one or more test cases; (4) a set of one or more test procedures; (5) a set of one or more test cases and procedures.
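the numbering scheme described above can be sketched in python as follows. this is a minimal illustration under stated assumptions, not the authors' implementation: the function name class_names and the lowercase, space-free form of the class names are choices made only for this example.

```python
from collections import defaultdict


def class_names(entries):
    """Map (term, definition) pairs to class names.

    A term with a single definition keeps its plain name; a term with
    several definitions gets one numbered class per meaning.
    """
    by_term = defaultdict(list)
    for term, definition in entries:
        by_term[term].append(definition)

    named = {}
    for term, definitions in by_term.items():
        # Assumed naming convention: lowercase, spaces removed.
        base = "".join(term.lower().split())
        if len(definitions) == 1:
            named[base] = definitions[0]
        else:
            for number, definition in enumerate(definitions, start=1):
                named[f"{base}{number}"] = definition
    return named
```

with the five definitions of test as input, the sketch yields the keys test1 to test5, while a term with one definition, such as high order language, keeps an unnumbered name.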
taxonomy the proposed lifecycle establishes a stage for the conversion of a dictionary into a taxonomy, understanding taxonomy as an instrument of concepts categorization, product are a dictionary, which has a formal and computer processed structure, with the terms and their definitions in xml format; a taxonomy, which reflects the hierarchic relationships between the terms; a thesaurus, which includes other relationships between the terms (for example, the synonymy relationship); and, finally, the ontology, which will include the hierarchy, the basic relationships of the thesaurus, new and more complex semantic relationships, and restrictions in form of axioms expressed using description logics.18 the following paragraphs describe the way each of these products is obtained. dictionary the first step of the proposed development process consists of the creation of a dictionary in xml format with all the terms included in the ieee standard glossary of software engineering terminology and their related definitions. this activity is particularly mechanical and does not need human intervention as it is basically a transformation of the glossary from its original format (pdf) into a format better suited to the development process. all formats considered for the dictionary are based on xml, and specifically on rdf and rdf schema. in the end, we decided to work with the standards daml+oil and owl,19 though we are not opposed to working with other languages, such as skos or xmi,20 in the future. (in the latter case, it would be possible to model the intermediate products and the ontology in uml graphic models stored in xml files.)21 in our project, the design and implementation of all products has been made using an ontology editor. we have used oiled (with oilviz plugin) as editor, both because of its simplicity and because it allows the exportation to owl and daml formats. 
however, with future maintenance and testing in mind, we decided to use protégé (with owl plugin) in the last step of the process, because this is a more flexible environment with extensible modules that integrate more functionality such as ontology annotation, evaluation, middleware service, query and inference, etc. figure 5 shows the dictionary entry for “high order language,” which appears in figure 2. note that the dictionary includes only owl:class (or daml:class) to mark the term; rdfs:label to indicate the term name; and rdfs:comment to provide the definition included in the original glossary.
figure 4. ontology development process
figure 5. example of dictionary entry (highorderlanguage)
example, when analyzing the definition of the term compiler: “(is) a computer program that translates programs expressed in a high order language into their machine language equivalent,” it is possible to deduce that compiler is a subconcept of computer program, which is also included in the glossary.) in addition to the lexical or syntactic analysis, it is necessary for an expert in the domain to perform a semantic analysis to complete the development of the taxonomy. the hierarchical relationships among the concepts are implemented using rdfs:subclassof, regardless of whether the taxonomy is implemented in owl or daml format, since both languages specify this type of relationship in the same way. figure 6 shows an example of a hierarchical relationship included in the definition of the concept pictured in figure 5.
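the two lexical cues used to propose hierarchical relationships (a term name that contains another term name, and a definition that opens with an implicit “is a”) can be sketched as follows. this is a hedged illustration rather than the software used in the project: the function name is hypothetical, the first cue is narrowed to names that end with another term name, and every definition in the sample glossary except the one for compiler is an invented placeholder. the output is only a list of candidates that a domain expert would still need to review.

```python
import re


def candidate_subclass_pairs(glossary):
    """Return a set of (narrower, broader) candidate pairs."""
    # Try longer names first so that the most specific match wins.
    terms = sorted(glossary, key=len, reverse=True)
    pairs = set()
    for term, definition in glossary.items():
        # Cue 1: the term name ends with the name of another term.
        for other in terms:
            if other != term and term.endswith(" " + other):
                pairs.add((term, other))
                break
        # Cue 2: the definition opens with "(is) a/an <other term>".
        opening = re.sub(r"^(is\s+)?(a|an)\s+", "", definition.strip().lower())
        for other in terms:
            if other != term and re.match(re.escape(other) + r"\b", opening):
                pairs.add((term, other))
                break
    return pairs


# Sample data: only the compiler definition comes from the glossary text
# quoted in the article; the other definitions are placeholders.
sample_glossary = {
    "language": "a systematic means of communicating ideas",
    "programming language": "a language used to express computer programs",
    "computer program": "a combination of instructions and data definitions",
    "compiler": "a computer program that translates programs expressed in "
                "a high order language into their machine language equivalent",
}
```

each resulting pair would then be written into the taxonomy as an rdfs:subclassof statement once an expert has confirmed it.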
thesaurus according to the international organization for standardization (iso), a thesaurus is “the vocabulary of a controlled indexing language, formally organized in order to make explicit the a priori relations between concepts (for example ‘broader’ and ‘narrower’).”25 this definition establishes the lexical units and the semantic relationships between these units as the elements that constitute a thesaurus. the following is a sample of the lexical units: ■■ descriptors (also called “preferred terms”): the terms used consistently when indexing to represent a concept that can be in documents or in queries to these documents. the iso standard introduces the option of adding a definition or an application note to every term to establish explicitly the chosen meaning. this note is identified by the abbreviation sn (scope note), as shown in figure 7. ■■ non-descriptors (“non-preferred terms”): the synonyms or quasi-synonyms of a preferred term. a nonpreferred term is not assigned to documents submitted to an indexing process, but is provided as an entry point in a thesaurus to point to the appropriate descriptor. usually the descriptors are written in capital letters and the nondescriptors in small letters. ■■ compound descriptors: the terms used to represent complex concepts and groups of descriptors, which allow for the structuring of large numbers of thesaurus descriptors into subsets called micro-thesauri. in addition to lexical units, other fundamental elements of a thesaurus are semantic relationships between these units. the more common relationships between lexical units are the following: ■■ equivalence: the relationship between the descriptors and the nondescriptors (synonymous and that is, as a systematical classification in a traditional way. 
as gilchrist states, there is no consensus on the meaning of terms like taxonomy, thesaurus, or ontology.22 in addition, much work in the field of ontologies has been done without taking advantage of similar work performed in the fields of linguistics and library science.23 this situation is changing because of the increasing publication of works that relate the development of ontologies to the development of “classic” terminological tools (vocabularies, taxonomies, and thesauri). this paper emphasizes the importance and usefulness of the intermediate products created at each stage of the evolutive process from glossary to ontology. the end product of the initial stage is a dictionary expressed as xml. the next stage in the evolutive process (figure 4) is the transformation of that dictionary into a taxonomy through the addition of hierarchical relationships between concepts. to do this, it is necessary to undertake a lexical-semantic analysis of the original glossary. this can be done in a semiautomatic way by applying natural language processing (nlp) techniques, such as those recommended by morales-del-castillo et al.,24 for creating thesauri. the basic processing sequence in linguistic engineering comprises the following steps: (1) incorporate the original documents (in our case the dictionary obtained in the previous stage) into the information system; (2) identify the language in which they are written, distinguishing independent words; (3) “understand” the processed material at the appropriate level; (4) use this understanding to transform, search, or translate data; (5) produce the new media required to present the produced outcomes; and finally, (6) present the final outcome to human users by means of the most appropriate peripheral device—screen, speakers, printer, etc. an important aspect of this process is natural language comprehension.
for that reason, several different kinds of programs are employed, including lemmatizers (which implement stemming algorithms to extract the lexeme or root of a word), morphologic analyzers (which glean sentence information from their constituent elements: morphemes, words, and parts of speech), syntactic analyzers (which group sentence constituents to extract elements larger than words), and semantic models (which represent language semantics in terms of concepts and their relations, using abstraction, logical reasoning, organization and data structuring capabilities). from the information in the software engineering dictionary and from a lexical analysis of it, it is possible to determine a hierarchical relationship when the name of a term contains the name of another one (for example, the term language and the terms programming language and hardware design language), or when expressions such as “is a” linked to the name of another term included in the glossary appear in the text of the term definition. (for
indicating that high order language relates to both assembly and machine languages.
the life cycle proposed in this paper (figure 4) includes a third step or phase that transforms the taxonomy obtained in the previous phase into a thesaurus through the incorporation of relationships between the concepts that complement the hierarchical relations included in the taxonomy. basically, we have to add two types of relationships, equivalence and associative, represented in standard thesauri by uf (and use) and rt, respectively. we will continue using xml to implement this new product. there are different ways of implementing a thesaurus using a language based on xml. for example, matthews et al. proposed a standard rdf format,26 whereas hall created an ontology in daml.27 in both cases, the authors modeled the general structure of
quasi-synonymous).
iso establishes that the abbreviation uf (used for) precedes the nondescriptors linked to a descriptor; and the abbreviation use is used in the opposite case. for example, a thesaurus developed from the ieee glossary might include a descriptor “high order language” and an equivalence relationship with a nondescriptor “high level language” (figure 7). ■■ hierarchical: a relationship between two descriptors. in the thesaurus one of these descriptors has been defined as superior to the other one. there are no hierarchical relationships between nondescriptors, nor between nondescriptors and descriptors. a descriptor can have no lower descriptors or several of them, and no higher descriptors or several of them. according to the iso standard, hierarchy is expressed by means of the abbreviations bt (broader term), to indicate the generic or higher descriptors, and nt (narrower term), to indicate the specific or lower descriptors. the term at the head of the hierarchy to which a term belongs can be included, using the abbreviation tt (top term). figure 7 presents these hierarchical relationships. ■■ associative: a reciprocal relationship that is established between terms that are neither equivalent nor hierarchical, but are semantically or conceptually associated to such an extent that the link between them should be made explicit in the controlled vocabulary on the grounds that it may suggest additional terms for use in indexing or retrieval. it is generally indicated by the abbreviation rt (related term). there are no associative relationships between nondescriptors and descriptors, or between descriptors already linked by a hierarchical relation. it is possible to establish associative relationships between descriptors belonging to the same or different category. the associative relationships can be of very different types. for example, they can represent causality, instrumentation, location, similarity, origin, action, etc. figure 7 shows two associative relations, .. 
..
high order language (descriptor)
  sn a programming language that...
  uf high level language (no-descriptor)
  uf third generation language (no-descriptor)
  tt language
  bt programming language
  nt object oriented language
  nt declarative language
  rt assembly language (contrast with)
  rt machine language (contrast with)
..
high level language
  use high order language
..
third generation language
  use high order language
..
figure 7. fragment of a thesaurus entry
figure 6. example of taxonomy entry
terms. for example: . or using the glossary notation: .
■■ the rest of the associative relationships (rt) that were included in the thesaurus correspond to the cross-references of the type “contrast with” and “see also” that appear explicitly in the ieee glossary.
■■ neither compound descriptors nor groups of descriptors have been implemented because there is no such structure in the glossary.
ontology
ding and foo state that “ontology promotes standardization and reusability of information representation through identifying common and shared knowledge. ontology adds values to traditional thesauri through deeper semantics in digital objects, both conceptually, relationally and machine understandably.”29 this semantic richness may imply deeper hierarchical levels, richer relationships between concepts, the definition of axioms or inference rules, etc. the final stage of the evolutive process is the transformation of the thesaurus created in the previous stage into an ontology. this is achieved through the addition of one or more of the basic elements of semantic complexity that differentiate ontologies from other knowledge representation standards (such as dictionaries, taxonomies, and thesauri). for example:
■■ semantic relationships between the concepts (classes) of the thesaurus have been added as properties or ontology slots.
■■ axioms of classes and axioms of properties.
these are restriction rules that are declared to be satisfied by elements of ontology. for example, to establish disjunctive classes ( ), have been defined, and quantification restrictions (existential or universal) and cardinality restrictions in the relationships have been implemented as properties. software based on techniques of linguistic analysis has been developed to facilitate the establishment of the properties and restrictions. this software analyzes the definition text for each of the more than 1,500 glossary terms (in thesaurus format), isolating those words that a thesaurus from classes (rdf:class or daml:class) and properties (rdf:property or daml:objectproperty). in the first case they proposed five classes: thesaurusobject, concept, topconcept, term, scopenote; and several properties to implement the relations, like hasscopenote (sn), isindicatedby, preferredterm, usedfor (uf), conceptrelation, broaderconcept (bt), narrowerconcept (nt), topofhierarchy (tt) and isrelatedto (rt). recently the w3c has developed the skos specification, created to define knowledge organization schemes. in the case of thesauri, skos includes specific tags, such as skos:concept, skos:scopenote (sn), skos:broader (bt), skos:narrower (nt), skos:related (rt), etc., that are equivalent to those listed in the previous paragraph. our specification does not make any statement about the formal relationship between the class of skos concept schemes and the class of owl ontologies, which will allow different design patterns to be explored for using skos in combination with owl. although any of the above-mentioned formats could be used to implement the thesaurus, given that the endproduct of our process is to be an ontology, our proposal is that the product to be generated during this phase should have a format compatible with the final ontology and with the previous taxonomy. 
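as an aside on the skos tags listed above, the correspondence between the iso abbreviations and skos properties can be sketched as a small lookup that turns one thesaurus entry into triples. this is an illustrative sketch only: the function name to_triples and the tuple representation are hypothetical, the mapping of uf to skos:altLabel is an assumption that the text does not state, and tt is left unmapped.

```python
# ISO abbreviation -> SKOS property. The sn, bt, nt, and rt rows follow the
# correspondences named in the text; the uf row is an assumption.
SKOS_FOR = {
    "sn": "skos:scopeNote",
    "bt": "skos:broader",
    "nt": "skos:narrower",
    "rt": "skos:related",
    "uf": "skos:altLabel",
}


def to_triples(descriptor, relations):
    """Turn one thesaurus entry into (subject, predicate, object) tuples.

    `relations` is a list of (abbreviation, value) pairs. Abbreviations
    without a mapping (for example tt) are skipped in this sketch.
    """
    triples = [(descriptor, "rdf:type", "skos:Concept")]
    for abbreviation, value in relations:
        predicate = SKOS_FOR.get(abbreviation)
        if predicate is not None:
            triples.append((descriptor, predicate, value))
    return triples
```

applied to the high order language entry of figure 7, the bt line becomes a skos:broader statement pointing to programming language, and each rt line becomes a skos:related statement.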
therefore a minimal number of changes will be carried out on the product created in the previous step, resulting in a knowledge representation tool similar to a thesaurus. that tool does not need to be modified during the following (final) phase of transformation into an ontology. nevertheless, if for some reason it is necessary to have the thesaurus in one of the other formats (such as skos), it is possible to apply a simple xslt transformation to the product. another option would be to integrate a thesaurus ontology, such as the one proposed by hall,28 with the ontology representing the ieee glossary. in the thesaurus implementation carried out in our project, the following limitations have been considered:
■■ only the hierarchical relationships implemented in the taxonomy have been considered. these include relationships of type “is-a,” that is, generalization relationships or type–subset relationships. relationships that can be included in the thesaurus marked with tt, bt, and nt, like relations of type “part of” (that is, partitive relationships) have not been considered. instead of considering them as hierarchical relationships, the final ontology includes the possibility of describing classes as a union of classes.
■■ the relationships of synonymy (uf and use) used to model the cross-references in the ieee glossary (“syn” and “see,” respectively) were implemented as equivalent terms, that is, as equivalence axioms between classes (owl:equivalentclass or daml:sameclassas), with inverse properties to reflect the preference of the
match the name of other glossary terms (or a word in the definition text of other glossary terms). the isolated words will then be candidates for a relationship between both of them. (figure 8 shows the candidate properties obtained from the software engineering glossary.) the user then has the option of creating relationships with the identified candidate words.
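the matching step described here, in which a definition is scanned for the names of other glossary terms, can be sketched as follows. this is a simplified illustration and not the software developed in the project: the function name is hypothetical, plural forms are handled only by tolerating a trailing “s,” and the definitions used in the usage note are shortened or invented.

```python
import re


def mentioned_terms(glossary):
    """For each term, list the other glossary terms named in its definition."""
    candidates = {}
    for term, definition in glossary.items():
        text = definition.lower()
        found = []
        for other in glossary:
            if other == term:
                continue
            # Whole-phrase match that tolerates a plain plural "s"
            # (a naive assumption made for this sketch).
            if re.search(r"\b" + re.escape(other) + r"s?\b", text):
                found.append(other)
        candidates[term] = found
    return candidates
```

for a glossary in which the definition of high order language contains the phrase “expression of data structures,” the sketch lists data structure as a candidate for that term, matching the implicit relationship discussed earlier. the user would still decide which candidates become properties.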
the user must indicate, for every relationship to be created, the restriction type that it represents as well as existential or universal quantification or cardinality (minimum or maximum). after confirming this information, the program updates the file containing the ontology (owl or daml), adding the property to the class that represents the processed term. figure 9 shows an example of the definition of two properties and their application to the class highorderlanguage: a property express with existential quantification over the class datastructure to indicate that a language must represent at least one data structure; and a property translateto of universal type to indicate that any high-level language is translated into machine language (machinelanguage).
■■ results, conclusions, and future work
the existence of ontologies of specific knowledge domains (software engineering in this case) facilitates the process of finding resources about this discipline on the semantic web and in digital libraries, as well as the reuse of learning objects of the same domain stored in repositories available on the web.30 when a new resource is indexed in a library catalog, a new record that conforms to the ontology conceptual data model may be included. it will be necessary to assign its properties according to the concept definition included in the ontology. the user may later execute semantic queries: the search system traverses the ontology to identify the concept in which the user is interested and then launches a wider query that includes the resources indexed under that concept. ontologies, like the one that has been “evolved,” may also be used in an open way to index and search for resources on the web. in that case, however, semantic search engines such as swoogle (http://swoogle.umbc.edu/) are required in place of traditional syntactic search engines, such as google. the creation of a complete ontology of a knowledge domain is a complex task.
in the case of the domain presented in this paper, that of software engineering, although there have been initiatives toward ontology creation that have yielded publications by renowned authors in the field,31 a complete ontology has yet to be created and published. this paper has described a process for developing a modest but complete ontology from a glossary of terminology, both in owl format and daml+oil format, accept access accomplish account achieve adapt add adjust advance affect aggregate aid allocate allow allow symbolic naming alter analyze apply approach approve arrangement arrive assign assigned by assume avoid await begin break bring broke down builds call called by can be can be input can be used as can operate in cannot be usedas carry out cause change characterize combine communicate compare comply comprise conduct conform consist constrain construct contain contains no contribute control convert copy correct correspond count create debugs decompiles decomposedinto decrease define degree delineate denote depend depict describe design designate detect determine develop development direct disable disassembles display distribute divide document employ enable encapsulate encounter ensure enter establish estimate establish evaluate examine exchange execute after execute in executes expand express express as extract facilitate fetch fill follow fulfil generate give give partial given constrain govern have have associated have met have no hold identify identify request ignore implement imply improve incapacitate include incorporate increase indicate inform initiate insert install intend interact with interprets interrelate investigate invokes is is a defect in is a form of is a method of is a mode of is a part is a part of is a sequence is a sequenceof is a technique is a techniqueof is a type is a type of is ability is activated by is adjusted by is applied to is based is called by is composed is contained is contained in is establish is established is 
executed after is executed by is incorrect is independent of is manifest is measured in is not is not subdivided in is part is part of is performed by is performed on is portion is process by is produce by is produce in is ratio is represented by is the output is the result of is translated by is type is used is used in isolate know link list load locate maintain make make up may be measure meet mix modify monitors move no contain no execute no relate no use not be connected not erase not fill not have not involve not involving not translate not use occur occur in occur in a operate operatewith optimize order output parses pas pass test perform permit permitexecute permit the execution pertaining place preclude predict prepare prescribe present present for prevent preventaccessto process produce produce no propose provide rank reads realize receive reconstruct records recovery refine reflect reformat relate relation release relocates remove repair replace represent request require reserve reside restore restructure result resume retain retest returncontrolto reviews satisfy schedule send server set share show shutdown specify store store in structure submission of supervise supports suppress suspend swap synchronize take terminate test there are no through throughout transfer transform translate transmit treat through understand update use use in use to utilize value verify work in writes
figure 8. candidate properties obtained from the linguistic analysis of the software engineering glossary
to each term.) we defined 324 properties or relationships between these classes. these are based on a semiautomated linguistic analysis of the glossary content (for example, allow, convert, execute, operatewith, produces, translate, transform, utilize, workin, etc.), which will be refined in future versions.
the authors’ aim is to use this ontology, which we have called ontoglose (ontology glossary software engineering), to unify the vocabulary. ontoglose will be used in a more ambitious project, whose purpose is the development of a complete ontology in software engineering from the swebok guide.32 although this paper has focused on this ontology, the method that has been described may be used to generate an ontology from any dictionary. the flexibility that owl permits for ontology description, along with its compatibility with other rdf-based metadata languages, makes possible interoperability between ontologies and between ontologies and other controlled vocabularies and allows for the building of merged representations of multiple knowledge domains. these representations may eventually be used in libraries and repositories to index and search for any kind of resource, not only those related to the original field. ■■ acknowledgments this research is co-funded by the spanish ministry of industry, tourism and commerce profit program (grant tsi-020100-2008-23). the authors also want to acknowledge support from the tifyc research group at the university of alcala. references and notes 1. m. dörr et al., state of the art in content standards (amsterdam: ontoweb consortium, 2001). 2. d. soergel, “the rise of ontologies or the reinvention of classification,” journal of the american society for information science 50, no. 12 (1999): 1119–20; a. gilchrist, “thesauri, taxonomies and ontologies—an etymological note,” journal of documentation 59, no. 1 (2003): 7–18. 3. b. j. wielinga et al., “from thesaurus to ontology,” proceedings of the 1st international conference on knowledge capture (new york: acm, 2001): 194–201: j. qin and s. paling, “converting a controlled vocabulary into an ontology: the case of gem,” information research 6 (2001): 2. 4. 
according to van heijst, schreiber, and wielinga, ontologies can be classified as terminological ontologies, information ontologies, and knowledge modeling ontologies; terminological ontologies specify the terms that are used to represent knowledge in the domain of discourse, and they are in use principally to unify vocabulary in a certain domain. g. van heijst, a. t.
which is ready to use in the semantic web. as described at the opening of this article, our aim has been to create a lightweight ontology as a first version, which will later be improved by including more axioms and relationships that increase its semantic expressiveness. we have tried to make this first version as tailored as possible to the initial glossary, knowing that later versions will be improved by others who might take on the work. such improvements will increase the ontology’s utility, but will make it a less faithful representation of the ieee glossary from which it was derived. the ontology we have developed includes 1,521 classes that correspond to the same number of concepts represented in the ieee glossary. (included in this number are the different meanings that the glossary assigns
figure 9. example of ontology entry
20. w3c, skos; object management group, xml metadata interchange (xmi), 2003, http://www.omg.org/technology/documents/formal/xmi.htm (accessed oct. 5, 2009).
21. uml (unified modeling language) is a standardized general-purpose modeling language (http://www.uml.org). nowadays, different uml plugins for ontology editors exist. these plugins allow working with uml graphic models. also, it is possible to realize the uml models with a case tool, to export them to xml format, and to transform them to the ontology format (for example, owl) using an xslt sheet, such as the one published in d. gasevic, “umltoowl: converter from uml to owl,” http://www.sfu.ca/~dgasevic/projects/umltoowl/ (accessed oct. 5, 2009).
22.
gilchrist, “thesauri, taxonomies and ontologies.”
23. soergel, “the rise of ontologies or the reinvention of classification.”
24. j. m. morales-del-castillo et al., “a semantic model of selective dissemination of information for digital libraries,” information technology & libraries 28, no. 1 (2009): 22–31.
25. international standards organization, iso 2788:1986 documentation—guidelines for the establishment and development of monolingual thesauri (geneve: international standards organization, 1986).
26. b. m. matthews, k. miller, and m. d. wilson, “a thesaurus interchange format in rdf,” 2002, http://www.w3c.rl.ac.uk/swad/thes_links.htm (accessed feb. 10, 2009).
27. m. hall, “call thesaurus ontology in daml,” dynamics research corporation, 2001, http://orlando.drc.com/daml/ontology/call-thesaurus (accessed oct. 5, 2009).
28. ibid.
29. y. ding and s. foo, “ontology research and development. part 1—a review of ontology generation,” journal of information science 28, no. 2 (2002): 123–36. see also b. h. kwasnik, “the role of classification in knowledge representation and discovery,” library trends 48 (1999): 22–47.
30. s. otón et al., “service oriented architecture for the implementation of distributed repositories of learning objects,” international journal of innovative computing, information & control (2010), forthcoming.
31. o. mendes and a. abran, “software engineering ontology: a development methodology,” metrics news 9 (2004): 68–76; c. calero, f. ruiz, and m. piattini, ontologies for software engineering and software technology (berlin: springer, 2006).
32. ieee, guide to the software engineering body of knowledge (swebok) (los alamitos, calif.: ieee computer society, 2004), http://www.swebok.org (accessed oct. 5, 2009).
schreiber, and b. j. wielinga, “using explicit ontologies in kbs development,” international journal of human & computer studies 46, no. 2/3 (1996): 183–292.
5. r.
neches et al., “enabling technology for knowledge sharing,” ai magazine 12, no. 3 (1991): 36–56. 6. o. corcho, f. fernández-lópez, and a. gómez-pérez, “methodologies, tools and languages for buildings ontologies. where is their meeting point?” data & knowledge engineering 46, no. 1 (2003): 41–64. 7. intitute of electrical and electronics engineers (ieee), ieee std 610.12-1990(r2002): ieee standard glossary of software engineering terminology (reaffirmed 2002) (new york: ieee, 2002). 8. j. krause, “semantic heterogeneity: comparing new semantic web approaches with those of digital libraries,” library review 57, no. 3 (2008): 235–48. 9. t. berners-lee, j. hendler, and o. lassila, “the semantic web,” scientific american 284, no. 5 (2001): 34–43. 10. world wide web consortium (w3c), resource description framework (rdf): concepts and abstract syntax, w3c recommendation 10 february 2004, http://www.w3.org/tr/rdf-concepts/ (accessed oct. 5, 2009). 11. world wide web consortium (w3c), web ontology language (owl), 2004, http://www.w3.org/2004/owl (accessed oct. 5, 2009). 12. world wide web consortium (w3c), skos simple knowledge organization system, 2009, http://www.w3.org/ tr/2009/rec-skos-reference-20090818/ (accessed oct. 5, 2009). 13. m. m. yee, “can bibliographic data be put directly onto the semantic web?” information technology & libraries 28, no. 2 (2009): 55-80. 14. l. f. spiteri, “the structure and form of folksonomy tags: the road to the public library catalog,” information technology & libraries 26, no. 3 (2007): 13–25. 15. corcho, fernández-lópez, and gómez-pérez, “methodologies, tools and languages for buildings ontologies.” 16. ieee, ieee std 610.12-1990(r2002). 17. n. f. noy and d. l. mcguinness, “ontology development 101: a guide to creating your first ontology,” 2001, stanford university, http://www-ksl.stanford.edu/people/dlm/ papers/ontology-tutorial-noy-mcguinness.pdf (accessed sept 10, 2010). 18. d. 
baader et al., the description logic handbook (cambridge: cambridge univ. pr., 2003). 19. world wide web consortium, daml+oil reference description, 2001, http://www.w3.org/tr/daml+oil-reference (accessed oct. 5, 2009); w3c, owl. editorial | truitt 163 ■■ the space in between in my opinion, ital has an identity crisis. it seems to try in many ways to be scholarly like jasist, but lita simply isn’t as formal a group as asist. on the other end of the spectrum, code4lib is very dynamic, informal and community-driven. ital kind of flops around awkwardly in the space in between. —comment by a respondent to ital’s reader survey, december 2009 last december and january, you, the readers of information technology and libraries were invited to participate in a survey aimed at helping us to learn your likes and dislikes about ital, and where you’d like to see this journal go in terms of several important questions. the responses provide rich food for reflection about ital, its readers, what we do well and what we don’t, and our future directions. indeed, we’re still digesting and discussing them, nearly a year after the survey. i’d like to use some of my editorial space in this issue to introduce, provide an overview, and highlight a few of the most interesting results. i strongly encourage you to access the full survey results, which i’ve posted to our weblog italica (http:// ital-ica.blogspot.com/); i further invite you to post your own thoughts there about the survey results and their meaning. we ran the survey from mid-december to mid-january. a few responses trickled in as late as mid-february. the survey invitation was sent to the 2,614 lita personal members; nonmembers and ital subscribers (most of whom are institutions) were excluded. we ultimately received 320 responses—including two from individuals who confessed that they were not actually lita members—for a response rate of 12.24 percent. 
thus the findings reported below reflect the views of those who chose to respond to the survey. the response rate, while not optimal, is not far from the 15 percent that i understand lita usually expects for its surveys. as you may guess, not all respondents answered all questions, which accounts for some small discrepancies in the numbers reported.

who are we?

in analyzing the survey responses, one of the first things one notices is the range and diversity of ital’s reader base, and by extension, of lita’s membership. the largest groups of subscribers identify themselves either as traditional systems librarians (58, or 18.2 percent) or web services/development librarians (31, or 9.7 percent), with a further cohort of 7.2 percent (23) composed of those working with electronic resources or digital projects. but more than 20 percent (71) come from the ranks of library directors and associate directors. nearly 15 percent (47) identify their focus as being in the areas of reference, cataloguing, acquisitions, or collection development. see figure 1. the bottom line is that more than a third of our readers are coming from areas outside of library it.

a couple of other demographic items:
■■ while nearly six in ten respondents (182, or 57.6 percent) work in academic libraries, that still leaves a sizable number (134, or 42.3 percent) who don’t. more than 14 percent (45) of the total 316 respondents come from the public library sector.
■■ nearly half (152, or 48.3 percent) of our readers indicated that they have been with lita for five years or fewer. note that this does not necessarily indicate the age or number of years of service of the respondents, but it’s probably a rough indicator. still, i confess that this was something of a surprise to me, as i expected larger numbers of long-time members. and how do the numbers shake out for us old geezers? the 6–10 and greater-than-15-years cohorts each composed about 20 percent of those responding; interestingly, only 11.4 percent (36) answered that they’d been lita members for between 11 and 15 years. assuming that these numbers are an accurate reflection of lita’s membership, i can’t help but wonder about the explanation for this anomaly. see figure 2.

figure 1. professional position of lita members
18.2% (58)  systems librarian (includes responsibility for ils, servers, workstat...
16.7% (53)  other (please specify)
12.9% (41)  library director
9.7% (31)  web services/development librarian
9.4% (30)  deputy/associate/assistant director
7.9% (25)  reference services librarian
6.3% (20)  cataloging librarian
4.4% (14)  consortium/network/vendor librarian
4.1% (13)  electronic resources librarian
3.1% (10)  digital projects/digitization librarian
2.5% (8)  student
2.2% (7)  teaching faculty
0.9% (3)  computing professional (non-mls)
0.6% (2)  resource sharing librarian
0.6% (2)  acquisitions/collection development librarian
0.3% (1)  other library staff (non-mls)

figure 2. years of lita membership
48.3% (152)  5 years or less
11.4% (36)  11–15 years
19.7% (62) and 20.0% (63)  6–10 years; more than 15 years

how are we doing?

question 4 on the survey asked readers to respond to several statements:

“it is important to me that articles in ital are peer-reviewed.” more than 75 percent (241, or 77.2 percent) answered that they either “agreed” or “strongly agreed.”

“ital is timely.” more than seven in ten respondents (228, or 73.0 percent) either “agreed” or “strongly agreed” that ital is timely. only 27 (8.7 percent) disagreed. because ital is a technology-focused journal, where time-to-publication is always a sensitive issue, i expected more dissatisfaction on this question (and no, that doesn’t mean that i don’t worry about the nine percent who believe we’re too slow out of the gate).

marc truitt
editorial: the space in between, or, why ital matters
marc truitt (marc.truitt@ualberta.ca) is associate university librarian, bibliographic and information technology services, university of alberta libraries, edmonton, alberta, canada, and editor of ital.

“i use information from ital in my work and/or i find it intellectually stimulating.” by a nearly identical margin to that regarding timeliness, ital readers (226, or 72.7 percent) either “agreed” or “strongly agreed” that they use ital in their work or find its contents stimulating.

“ital is an important benefit of lita membership.” an overwhelming majority (248, or 79.78 percent) of respondents either “agreed” or “strongly agreed” with this statement.1 this perception clearly emerges again in responses to the questions about whether readers would drop their lita membership if we produced an electronic-only or open-access ital (see below).

where should we be going?

several questions sought your input about different options for ital as we move forward. question 7, for example, asked you to rank how frequently you access ital content via several channels, with the choices being “print copy received via membership,” “print copy received by your institution/library,” “electronic copy from the ital website,” or “electronic copy accessed via an aggregator service to which your institution/library subscribes (e.g., ebsco).” the choice most frequently accessed was the print copy received via membership, at 81.1 percent (228).

question 8 asked about your preferences in terms of ital’s publication model. of the 307 responses, 60.6 percent (186) indicated a preference for continuance of the present arrangement, whereby we publish both paper and electronic versions simultaneously. four in ten respondents preferred that ital move to publication in electronic version only.2 of those who favored continued availability of paper, the great majority (159, or 83.2 percent) indicated in question 9 that they simply preferred reading ital in paper. those who advocate moving to electronic-only do so for more mixed reasons (question 10), the most popular being cost-effectiveness, timeliness, and the environmental friendliness of electronic publication.

a final question in this section asked that you respond to the statement “if ital were to become an electronic-only publication i would continue as a dues-paying member of lita.” while a reassuring 89.8 percent (273) of you answered in the affirmative, 9.5 percent (29) indicated that you would likely quit lita, with narrative explanations that clearly underscore the belief that ital—especially a paper ital—is viewed by many as an important benefit of membership. the following comments are typical:
■■ “lita membership would carry no benefits for me.”
■■ “dues should decrease, though.” [from a respondent who indicated he or she would retain lita membership]
■■ “ital is the major benefit to me as we don’t have funds for me to attend lita meetings or training sessions.”
■■ “the paper journal is really the only membership benefit i use regularly.”
■■ “actually my answer is more, ‘i don’t know.’ i really question the value of my lita membership. ital is at least some tangible benefit i receive. quite honestly, i don’t know that there really are other benefits of lita membership.”

question 12 asked whether ital should continue with its current delayed open-access model (i.e., the latest two issues embargoed for non-lita members), or go completely open-access. by a three-to-two margin, readers favored moving to an open-access model for all issues. in the following question that asked whether respondents would continue or terminate lita membership were ital to move to a completely open-access publication model, the results were remarkably similar to those for the question linking print availability to lita membership, with the narrative comments again suggesting much the same underlying reasoning.

in sum, the results suggest to me more satisfaction with ital than i might have anticipated; at the same time, i’ve only scratched the surface in my comments here. the narrative answers in particular—which i have touched on in only the most cursory fashion—have many things to say about ital’s “place,” suggestions for future articles, and a host of other worthy ideas. there is as well the whole area of crosstabbing: some of the questions, when analyzed with reference to the demographic answers in the beginning of the survey, may highlight entirely new aspects of the data. who, for instance, favors continuance of a paper ital, and who prefers electronic-only?

but to come back to that reader’s comment about ital and “the space in between” that i used to frame this discussion (indeed, this entire column): to me, the demographic responses—which clearly show ital has a substantial readership outside of library it—suggest that that “space in between” is precisely where ital should be. we may or may not occupy that space “awkwardly,” and there is always room for improvement, although i hope we do better than “flop around”! the results make clear that ital’s readers—who would be you!—encompass the spectrum from the tech-savvy early-career reader of code4lib journal (electronic-only, of course!) to the library administrator who satisfies her need for technology information by taking her paper copy of ital along when traveling. elsewhere on that continuum, there are reference librarians and catalogers wondering what’s new in library technology, and a traditional systems librarian pondering whether there is an open-source discovery solution out there that might breathe some new life into his lipstick-on-a-pig ils. somewhere else there’s a library blogger who fends off bouts of insomnia by reading “wonky” ital papers in the wee hours of the morning. and that ain’t the half of it, as they say.

in short—in terms of readers, interests, and preferences—“the space in between” is a pretty big niche for ital to serve. we celebrate it. and we’ll keep trying our best to serve it well.

■■ departures

as i write these lines in late september, it’s been a sad few weeks for those of us in the ital family. in mid-august, former ital editor jim kopp passed away following a battle with cancer. last week, dan marmion—jim’s successor as editor (1999–2004)—and a dear friend to many of us on the current ital editorial board—also left us, the victim of a malignant brain tumor. i never met jim, but lita president karen starr eulogized him in a posting to lita-l on august 16, 2010.3 i noted dan’s retirement due to illness in this space in march.4

i first met dan in the spring of 2000, when he arrived at notre dame as the new associate director for information systems and digital access (i think the position was differently titled then) and, incidentally, my new boss. dan arrived only six weeks after my own start there. things at notre dame were unsettled at the time: the libraries had only the year before successfully implemented exlibris’ aleph500 ils, the first north american site to do so. while exlibris moved on to implementations at mcgill and the university of iowa, we at notre dame struggled with the challenges of supporting and upgrading a system then new to the north american market. it was not always easy or smooth, but throughout, dan always maintained an unflappable and collegial manner with exlibris staff and a quiet but supportive demeanor toward those of us who worked for him. i wish i could say that i understood and appreciated this better at the time, but i can’t. i still had some growing ahead of me—i’m sure that i still do.

dan was there for me again as an enthusiastic reference when i moved on, first to the university of houston in 2003 and then to the university of alberta three years later. in these jobs i’d like to think i’ve come to understand a bit better the complex challenges faced by senior managers in large research libraries; in the process, i know i’ve come to appreciate dan’s quiet, knowledgeable, and hands-off style with department managers. it is one i’ve tried (not always successfully) to cultivate.

while i was still at notre dame, dan invited me to join the editorial board of information technology and libraries, a group which over the years has come to include many “friends of dan,” including judith carter (quite possibly the world’s finest managing editor), andy boze (ital’s webmaster), and mark dehmlow. while dan left ital in 2004, i think that he left the journal a wonderful and lasting legacy in these extremely capable and dedicated folks.

my fondest memories of dan concern our shared passion for model trains. i remember visiting a train show in south bend with him a couple of times, and our last time together (at the ala midwinter meeting in denver two years ago) was capped by a snowy trek with exlibris’ carl grant, another model train enthusiast, to the mecca of model railroading, caboose hobbies. three boys off to see their toys—oh, exquisite bliss!

i don’t know whether ital or its predecessor jola have ever reprinted an editorial, but while searching the archives to find something that would honor both jim and dan, i found a piece that i hope speaks eloquently of their contributions and to ital’s reason for being. dan’s editorial, “why is ital important?” originally published in our june 2002 issue, appears again immediately following this column. i think its message and the views expressed therein by jim and dan remain as valid today as they were in 2002. they also may help to frame my comments concerning our reader survey in the previous section.

farewell, jim and dan. you will both be sorely missed.

notes and references
1. a number of narrative answers to the survey make it clear that ital readers who are lita members perceive a link between membership and receiving the journal. many of them appear to infer that a portion of their lita dues, then, are earmarked for the publication and mailing of ital. sadly, this is not the case. in years past, ital’s income from advertising paid the bills and even generated additional revenue for lita coffers. today, the shoe is on the other foot because of declining advertising revenue, but ital is still expected to pay its own way, which it has failed to do in recent years. but to those who reasonably believe that some portion of their dues is dedicated to the support of ital, well, t’ain’t so. bothered by this? complain to the lita board.
2. as a point of comparison, consider the following results from the 2000 ital reader survey. respondents were asked to rank several publishing options on a scale of 1 to 3 (with 1 = most preferred option and 3 = least preferred option):
ital should be published simultaneously as a print-on-paper journal and an electronic journal (n = 284): 1 = 169 (59.5%); 2 = 93 (32.7%); 3 = 22 (7.7%)
ital should be published in an electronic form only (n = 293): 1 = 55 (18.8%); 2 = 61 (20.8%); 3 = 177 (60.4%)
in other words, then as now, about 60% of readers preferred paper and electronic to electronic-only.
3. karen starr, “fw: [libs-or] jim kopp: celebration of life,” online posting, aug. 16, 2010, lita-l, http://lists.ala.org/sympa/arc/lita-l/2010-08/msg00079.html (accessed sept. 29, 2010).
4. marc truitt, “dan marmion,” information technology & libraries 29 (mar. 2010): 4, http://www.ala.org/ala/mgrps/divs/lita/ital/292010/2901mar/editorial_pdf.cfm (accessed sept. 29, 2010).

site license initiatives in the united kingdom: the psli and nesli experience
jacqueline borin
information technology and libraries; mar 2000; 19, 1; proquest pg. 42

this article examines the development of site licensing within the united kingdom higher education community. in particular, it looks at how the pressure to make better use of dwindling fiscal resources led to the conclusion that information technology and its exploitation was necessary in order to create an effective library service. these conclusions, reached in the follett report of 1993, led to the establishment of a pilot site license initiative and then a national electronic site license initiative. the focus of this article is these initiatives and the issues they faced, which included off-site access, definition of a site and, perhaps most importantly, the unbundling of print and electronic journals.

increased competition for institution funding around the world has resulted in an erosion of library funding. in the united states, state universities are receiving a decreasing portion of their funds from the state while private universities are forced to limit tuition increases due to outside market forces. in the united kingdom the entitlement to free higher education is currently under attack and losing ground. today's economic pressures are requiring individual libraries to make better use of their fiscal resources while the emphasis moves from being a repository for information to providing access to information.
jacqueline borin (jborin@csusm.edu) is coordinator of reference and electronic resources, library and information services, california state university, san marcos.

as in the united states, the use of consortia for cost sharing in the united kingdom is becoming imperative as producers produce more electronic materials and make them available in full-text formats. consortia, while originally formed to cooperate on interlibrary loans and union catalogs, have recently taken on a new role, driven by financial expediency, in negotiating electronic licenses for their members, and the percentage of vendor contracts with consortia is rising. academic libraries cannot afford the prevalent pricing model that asks for the current print price plus an electronic surcharge plus projected inflation surcharges; group purchasing power therefore allows higher education institutions to leverage the money they have and to provide resources that would otherwise be unavailable. advantages for the vendor include one negotiator and one technical person for the consortia as a whole. in addition, the use of consortia provides greater leverage in pushing for stable archiving and for retaining the principles of fair use within the electronic environment, as well as reminding publishers of the need for flexible and multiple economic models to deal with the diverse needs and funding structures of consortia.1

during the spring of 1998, while visiting academic libraries in the united kingdom, i looked at an existing initiative within the uk higher education community, the pilot site license initiative (psli), which had begun as a response to the follett report and to rising journal prices. at the time the three-year initiative was nearing its end and its successor, the national electronic site license initiative (nesli), was already the topic of much discussion.
history

the concept of site licensing in the united kingdom higher education community had been established since 1988 by the combined higher education software team (chest), based at the university of bath. chest has negotiated site licenses with software suppliers and some large database producers through two different methods. either the supplier sells a national license to chest, which passes it on to the individual institution, or chest sells licenses to the institution on the supplier's behalf and passes the fees on to them (see figure 1). chest works closely with national information services and systems (niss). niss provides a focal point for the uk education and research communities to access information resources. niss's web service, the niss information gateway, provides a host for chest information such as ebsco masterfile and oclc netfirst.

most chest agreements are institution-wide site licenses that allow for all noncommercial use of the product, normally for five years to allow for incorporation into the curriculum. once an institution signs up it is committed for the full term of the agreement. chest is not in the business of either evaluating products or differentiating among competing suppliers. evaluations and purchase decisions are left up to the individual institutions.2 chest does set up and support e-mail discussion lists for each agreement so that users can discuss features and problems of the product among themselves. they also send out electronic news bulletins to provide advance warning of forthcoming agreements and to assess level of interest in future agreements.

chest operates in a similar manner to many library consortia in the united states. the major difference is that it sells to higher education institutions as a whole, so the products it sells include not only databases but also, for example, software programs. this is also beginning to change in
this is also beginning to change in reproduced with permission of the copyright owner. further reproduction prohibited without permission. the united states. a recent article in the chronicle of higher education mentions that institutions will not stop with library databases, "in the future we'll be negotiating site licenses for software and all sorts of things . . . not just databases."3 although chest is substantially self-funding it is strongly supported (as is niss) by the joint information systems committee (jisc) of the higher education funding councils of england (hefce). the majority of public funding for higher education funding in the united kingdom is funneled through the hefcs (one each for england, scotland, wales, and northern ireland). one of the jisc committees, the information services subcommittee (issc), which in 1997 became part of the committee for electronic information (cei) defined principles for the delivery of content. 4 they were: • free at the point of use; • subscriptions not transaction based; • lowest common denominator; • universality; • commonality of interfaces and • mass instruction. i follett report in 1993 an investigation into how to deal with the pressures on library resources caused by the rapid expansion of student numbers and the worldwide explosion in academic knowledge and information was undertaken by the joint funding council 's libraries review group, chaired by sir brian follett. this investigation resulted in the follett report. one of the key conclusions of the report was "the exploitation of it is essential to create the effective higher education and public research establishments software, data , training needs ! chest © chest (university of bath) 1996 figure 1. chest diagram chest deals , chest offers negotiations software , data, training materials t it product suppliers library service of the future ." 
the review group recommended that as a starting point "a pilot initiative between a small number of institutions and a similar number of publishing houses should be sponsored by the funding councils to demonstrate in practical terms how material can be handled and distributed electronically." 5 as a consequence £15 million was allocated to an electronic libraries program, managed by jisc on behalf of hefce. the electronic libraries program was to "engage the higher education community in developing and shaping th e implementation of the electronic library." 6 this project provided a body of electronic resources and services for uk higher education and influenced a cultural shift towards the acceptance and use of electronic resources instead of more traditional information storage and access methods. psli in may 1995 a pilot site license initiative subsidized by the funding councils was set up to : • test if the site license concept could provide wider access to journals for those in the academic community; • see if it would allow more flexibility in the use of scholarly material ; • test the methods for dissemination of scholarly material to the higher education sector in a variety of formats ; • test legal models for a national site license program; and • explore the possibility for increased value for money from scholarly journals.7 sixty-five publishers were invited by hefce to participate for three years commencing january 1, 1996. hefce was also responsible through jisc for the funding of the elib program, but no formal links were established between the elib project and communications i borin 43 reproduced with permission of the copyright owner. further reproduction prohibited without permission. the psli. 8 the final selection of four companies included academic press ltd., blackwell publishers ltd., blackwell science ltd., and iop publishing ltd. 
the publishers agreed to offer print journals to higher education institutions for discounts of between 30 and 40 percent over the three-year period as well as electronic access as available. originally the electronic journals were supposed to be the subsidiary component of the agreement but by the end of the agreement they had become the major focus. the psli achieved almost 100 percent take-up among the higher education institutions due to the anticipated savings through the program.9 hefce did not specify how the publishers were to deliver their content. iopp hosted the journals on their own server, for example, while academic press linked their ideal server to the journals online service at the university of bath. one of the key provisions of the site license was the unlimited rights of authorized users to make photocopies (including their use within course packs) of the journals. academic press and iopp provided full-text access to all their journals while blackwell and blackwell science only allowed reading of full text where a print subscription existed. an integral part of the psli was that the funding from hefce to the higher education institutions was top sliced to support the discounted price offered to the institutions.

several assessments of the initiative were made and a final evaluation of the pilot was concluded at the end of 1997. initial surveys indicated subscription savings through the program (averaging approximately £11,800 per annum) and the first report of the evaluation team showed a wide level of support for the project despite major problems with lack of communication in a timely manner.10 the team recommended an extension of the psli to include more publishers and more emphasis on electronic delivery. one concern that was raised was ease of access: students had to know which system a journal they required was on, and this was not easily discernible or user friendly.
evaluations by focus groups showed users wanted one single access point to all electronic journals.11 also unresolved was the need for one consistent interface to the electronic journals and a solution to the archiving issue.

at the end of the psli, hefce handed the next phase over to jisc. in the fall of 1997 jisc announced that a nesli would be set up and a new steering group was established. nesli was to be an electronic-only scheme and the invitation to tender went out at the end of 1997 with a decision to be made mid-1998.

national electronic site license initiative

nesli, a three-year jisc funded program, began on january 1, 1999, although the "official" launch was held at the british library on june 15, 1999. it is an initiative to deliver a national electronic journal service to the united kingdom higher education and research community (approximately 180 institutions) and is a successor program to the pilot site license initiative (psli). in may 1998 jisc appointed a consortium of swets and zeitlinger and manchester computing (university of manchester) to act as a managing agent (swets and blackwell ltd. announced in june 1999 their intention to combine swets subscription service and blackwell's information services, the two subscription agency services). the managing agent represents the higher education institutions in negotiations with publishers, manages delivery of the electronic material through a single web interface and oversees day-to-day operation of the program including the handling of subscriptions.12 the managing agent also encourages the widespread acceptance by publishers of a standard model site license, one of the objectives of this being to reduce the number and diversity of site definitions used by publishers.
other important provisions of the model site license addressed the issues of walk-in use by clients and the need for publishers to provide access to material previously subscribed to when a subscription is cancelled. the subscription model is currently the prevalent option, although the managing agent is also working towards a pay-per-view option.13 priority has been given to publishers who had been involved in the psli and to those publishers participating in swetsnet, the delivery mechanism for the nesli. swetsnet is an electronic journal aggregation service that offers access to and management of internet journals. its search engine allows searching and browsing through titles from all publishers with links to the full-text articles.

nesli is not a mandatory initiative; the higher education institutions can choose whether to participate in proposals and can pursue their own arrangements individually or through their own consortiums if they wish. while psli was basically a print-based initiative limited to a small number of publishers and funded via top slicing, nesli is an electronic initiative aimed at involving many more publishers. it is designed to be self-funding, although it did receive some start-up funding. although it is an electronic initiative, proposals that include print will be considered, as it is still not easy to separate print and electronic materials.14

the initiative addresses the most effective use, access, and purchase of electronic journals in the academic library community. its aims include:
• access control for on-site and remote users;
• cost;
• definition of a site;
• archiving; and
• unbundling print from electronic.

reproduced with permission of the copyright owner. further reproduction prohibited without permission.

access to swetsnet, the delivery mechanism for journals included in nesli, has now been supplemented by the option of athens authentication.
athens, an authentication system developed by niss, provides individuals affiliated with higher education institutions a single username and password for all electronic services they have permission to access. athens is linked to swetsnet to ensure access for off-site, remote, and distance learners who do not have a fixed ip address. this supplements swetsnet's ip address authentication, which does not allow for individual access to toc and sdi alerting. a help desk is available for all nesli users through the university of manchester.

the definition of a site is being addressed by the nesli model site license, which tries to standardize site definitions (including access from places where authorized users work or study, such as homes and residence halls). the license also covers interlibrary loan (supplying an authorized user of another library with a single paper copy of an electronic original of an individual document); walk-in users; access to subscribed material in perpetuity (it provides for an archive to be made of the licensed material, with access to the archive permissible after termination of the license); and inclusion of material in course packs. jisc's nesli steering group approved the model nesli site license on may 11, 1999, for use by the nesli managing agent.15 the managing agent asks publishers to accept the model license with as few alterations as possible.

during the term of the initiative the managing agent will be working on additional value-added services. these include links from key indexing and abstracting services, provision of access via z39.50, linking from library opacs, creation of catalog records, and assessing a model for e-journal delivery via subject clusters. in particular, they have begun to look at the technical issues concerned with providing marc records for all electronic journals included in nesli offers.
additionally they will be looking at solutions for longer-term archiving of electronic journals to provide a comfort level for librarians purchasing electronic-only copies.16

two offers that have been made under the nesli umbrella so far are from blackwell science for 130 electronic journals and johns hopkins university press for 46 electronic titles. most recently two additional vendors have been added to the list. elsevier has made a proposal to deliver full-text content via the publisher's sciencedirect platform, which includes the full text of more than 1,000 elsevier science journals along with those of other publishers. a total of more than 3,800 journals would be included in the service.17 mcb university press, an independent niche publisher, is offering access to 114 full-text journals and secondary information in the area of management through its emerald intelligence + fulltext service.

similarly, here in the united states, california state university (csu) put out for competitive tender a contract for the building of a customized database of 1,200+ electronic journals based on the print titles subscribed to by 15 or more of the 22 campuses: the journal access core collection (jacc). the journals will be made available via pharos, a new unified information access system for the csu. like ohiolink, a consortium of 74 ohio libraries, it will provide a common interface to electronic journals for students and faculty and will facilitate the development of distance learning programs.18 by unbundling the journals, libraries will no longer be required to pay for journals they do not want or need, leading to moderate price savings. additional savings can be realized through the lowering of overhead costs achieved by system-wide purchasing of core resources.
other issues being addressed within the jacc rfp included archiving and perpetual access to journal articles the university system has paid for, availability of e-journals in multiple formats, interlibrary loan of electronic documents, currency of content, and cost value at the journal-title level.19 currently 500 core journals are being provided under the jacc by ebsco information services, and the csu plans on expanding those offerings.

communications | borin

conclusion

as we move into the next millennium, library consortia will continue to work together with vendors to further customize journal offerings. however, it is still far too early to say whether nesli will be successful or whether it will succeed in getting the publishing industry to accept the model site license. if it is to work within the higher education community, it will depend greatly on the flexibility and willingness of the publishers of scholarly journals. it has made a start by developing a license that sets a wider definition of a site and that deals realistically with the question of off-site access. by encouraging the unbundling of electronic and print subscriptions, nesli allows services to be tailored to specific needs of the information community, but it remains to be seen how many publishers are prepared to accept unbundled deals at this stage. also, as technology stabilizes and libraries acquire increasingly larger electronic collections, we will not be able to rely on license negotiations as the only way to influence pricing, access, and distribution. an additional problem that remains unaddressed by either psli or nesli is the pressure on academics to publish in traditional journals and the corresponding rise in scholarly journal prices.
nesli neither encourages nor hinders changes in scholarly communication, and therefore the question of restructuring the scholarly communication process remains.20

references and notes

1. barbara mcfadden and arnold hirshon, "hanging together to avoid hanging separately: opportunities for academic libraries and consortia," information technology and libraries 17, no. 1 (march 1998): 36. see also international coalition of library consortia, "statement of current perspective and preferred practices for the selection and purchase of electronic information," information technology and libraries 17, no. 1 (march 1998): 45.
2. martin s. white, "from psli to nesli: site licensing for electronic journals," new review of academic librarianship 3 (1997): 139-50. see also chest. chest: software, data, and information for education (1996).
3. thomas j. deloughry, "library consortia save members money on electronic materials," the chronicle of higher education (feb. 9, 1996): a21.
4. information services subcommittee, "principles for the delivery of content." accessed nov. 17, 1999, www.jisc.ac.uk/pub97/nl_97.html#issc.
5. joint funding council's libraries review group, the follett report (dec. 1993). accessed nov. 20, 1999, www.niss.ac.uk/education/hefc/follett/report/.
6. john kirriemuir, "background of the elib programme." accessed nov. 21, 1999, www.ukoln.ac.uk/services.elib/background/history.html.
7. psli evaluation team, "uk pilot site license initiative: a progress report," serials 10, no. 1 (1997): 17-20.
8. white, "from psli to nesli," 149.
9. tony kidd, "electronic journals: their introduction and exploitation in academic libraries in the uk," serials review 24, no. 1 (1998): 7-14.
10. jill taylor roe, "united we save, divided we spend: current purchasing trends in serials acquisitions in the uk academic sector," serials review 24, no. 1 (1998).
11. psli evaluation team, "uk pilot site license initiative," 17-20.
12.
beverly friedgood, "the uk national site licensing initiative," serials 11, no. 1 (1998): 37-39.
13. university of manchester and swets & zeitlinger, nesli: national electronic site license initiative (1999). accessed nov. 21, 1999, www.nesli.ac.uk/.
14. nesli brochure, "further information for librarians." accessed nov. 21, 1999, www.nesli.ac.uk/nesli-librarians-leaflet.html.
15. a copy of the model site license is available on the nesli web site. accessed nov. 22, 1999, www.nesli.ac.uk/mode1license8.html.
16. albert prior, "nesli progress through collaboration," learned publishing 12, no. 1 (1999).
17. science direct. accessed nov. 24, 1999, www.sciencedirect.com.
18. declan butler, "the writing is on the web for science journals in print," nature 397 (jan. 21, 1998).
19. the journal access core collection request for proposal. accessed nov. 22, 1999, www.calstate.edu/tier3/cs+p/rfp_ifb/980160/980160.pdf.
20. frederick j. friend, "uk pilot site license initiative: is it guiding libraries away from disaster on the rocks of price rises?" serials 9, no. 2 (1996): 129-33.

a low-cost library database solution

mark england, lura joseph, and nem w. schlecht

two locally created databases are made available to the world via the web using an inexpensive but highly functional search engine created in-house. the technology consists of a microcomputer running unix to serve relational databases. cgi forms created using the programming language perl offer flexible interface designs for database users and database maintainers.

many libraries maintain indexes to local collections or resources and create databases or bibliographies concerning subjects of local or regional interest. these local resource indexes are of great value to researchers. the web provides an inexpensive means for broadly disseminating these indexes.
for example, kilcullen has described a nonsearchable, web-based newspaper index that uses microsoft access 97.1 jacso has written about the use of java applets to publish small directories and bibliographies.2 sturr has discussed the use of wais software to provide searchable online indexes.3 many of the web-based local databases and search interfaces currently used by libraries may:
• have problems with functionality;
• lack provisions for efficient searching;
• be based on unreliable software;
• be based on software and hardware that is expensive to purchase or implement;
• be difficult for patrons to use; and
• be difficult for staff to maintain.

after trying several alternatives, staff members at the north dakota state university libraries have implemented an inexpensive but highly functional and reliable solution. we are now providing searchable indexes on the web using a microcomputer running unix to serve relational databases. cgi forms created at the north dakota state university libraries using the programming language perl offer flexible interface designs for database users and database maintainers. this article describes how we have implemented ...

mark england (england@badlands.nodak.edu) is assistant director, lura joseph (ljoseph@badlands.nodak.edu) is physical sciences librarian, and nem w. schlecht (schlecht@plains.nodak.edu) is a systems administrator at the north dakota state university libraries, fargo, north dakota.

this article discusses structural, systems, and other types of bias that arise in matching new records to large databases. the focus is databases for bibliographic utilities, but other related database concerns will be discussed. problems of satisfying a “match” with sufficient flexibility and rigor in an environment of imperfect data are presented, and sources of unintentional variance are discussed.

editor’s note: this article was submitted in honor of the fortieth anniversaries of lita and ital.

sameness is a sometime thing.
libraries and other information-intensive organizations have long faced the problem of large collections of records growing incrementally. computerized records in a networked environment have encouraged the recognition that duplicate records pose a serious threat to efficient information retrieval. yet what constitutes a duplicate record may be neither exact nor completely predictable. levels of discernment are required to permit matches on records that do not differ significantly and records that do.

initial definitions

matching is defined as the process by which additions to a large database are screened and compared with existing database records. ideally, this process of matching ensures that duplicates are not added, nor erroneous replacements made of record pairs that are not really equivalent.

oclc (online computer library center, inc.) is a nonprofit organization serving member libraries and related institutions throughout the world. it is the chief database capital of the organization, and it is “owned” in a sense by the member libraries worldwide that use and contribute to it. at this writing, it contains over seventy-three million records.

this discussion focuses chiefly on oclc’s extended worldcat (xwc), though many of the issues are common to other bibliographic databases. examples of these include the research libraries group’s research libraries information network (rlin) database, pica (a european cooperative of libraries headquartered in the netherlands), and other union catalogs. the literature will demonstrate that the problems described exist in many if not most large bibliographic databases.

the database contents are representations or surrogates of the objects in shared collections. individual records in xwc are complex bibliographic representations of physical or virtual objects: books, films, urls, maps, slides, and much more.
each of these records consists of metadata, i.e., “structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource”1 (appendix a). the records use an xml variation of the marc communications format.2 for example, a record for a book might typically contain such fields for author, title, publisher, and date, and many more in addition. the representation of any one object can be quite complex, containing scores of fields and subfields. such a record may be quite brief, or several thousand characters long. the depth and richness of the records varies enormously. they may describe materials in more than 450 languages. this is a database against which millions of searches and millions of records are processed each month.

why is matching a challenge? two records describing the same intellectual creation or work (e.g., shakespeare’s othello) can vary by physical form and other attributes. two records describing both the same work and exactly the same form can differ from each other if the records were created under different rules of record description (cataloging). two records intended to describe the same object can vary unintentionally if typographical or other entry errors are present in one or both. thus sorting out significant from insignificant differences is critical. an example of the challenges of developing matching software in the metadata capture project is described elsewhere.3

the scope of misinformation is limited to information storage and retrieval, and specifically to comparison of incoming records to candidate matches in the database. the authors define misinformation as follows:
1. anything that can cause two database records, i.e., representations of different items, to be mistaken as representations of the same item. these can lead to inappropriate merging or updates.
2. the effect of techniques or processes of search that can obscure distinctions in differing items.
3. any case where matching misses an appropriate match due to nonsignificant differences in two records that really represent the same item.

note that disinformation (the intentional effort to misrepresent) is not considered in scope for this discussion. the assumption is that cooperation is in the interests of all parties contributing to a shared database. we do not assume that all institutions sharing the database have the same goals.

misinformation and bias in metadata processing: matching in large databases

gail thornburg and w. michael oskins

gail thornburg (thornbug@oclc.org) has taught at the university of maryland and the university of illinois, and served as an adjunct professor at kent state university, and as a senior-level software engineer at oclc. w. michael oskins (oskins@oclc.org) has worked as a developer and researcher at oclc for twenty years.

information technology and libraries | june 2007

what is bias? bias can be defined as factors in the creation or processing of database records that feed on misinformation or missing information, and skew characterizations of the database records in question.

context: matching and bias

how are matching and bias related to each other? the growth of a database is in part a function of the matching process. if matching is not tuned correctly, the database can grow or change in nonoptimal ways. another way to look at the problem is to consider the goal of success in searching, and the need to know when to stop. human beings recognize that failure to find the best information for a given problem may be costly. finding the best information when less would suffice may also be costly. systems need to know this. for a large shared database, hundreds of thousands of records may be processed in a day; the system must be as efficient as possible. what are some costs?
fail to match when one should, and duplicates may proliferate in the database. match badly, and there is risk of merging multiple records that do not represent the same item. a system of matching can fail in more than one way. balance is needed.
1. searches, which are based on data in the incoming record, may be too precise to find legitimate matches. loosen the criteria too much, and the search may return too many records to compare.
2. once retrieved, candidate matches are evaluated. compare candidates too narrowly, and records with insignificant differences will be rejected. fail to take note of salient differences between incoming record and database record, and the match will be wrong, undetected, and potentially hard to detect in the future.

the goals vary in different matching projects. for some projects, setting “holdings,” the indication that a member library owns a copy of something, is the main goal of the processing. this does not involve adding, replacing, or merging database records. for other projects, the goal is to update the database, either by replacing matched records, merging multiple duplicate records into one, or by adding new records if no match is found in the database. for the latter, bad matching could compromise database contents.

background

hickey and rypka provide a good review of the problems of identifying duplicates and the implications for matching software.4 their study notes concerns from a variety of library networks including that of the university of toronto (utlas), washington library network (wln), and research libraries group (rlin). they also reference studies on duplicate detection in the illinois statewide bibliographic database and at oak ridge national laboratories.
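the two failure points described in the context section (a search that is too precise or too loose, followed by a comparison that is too narrow or too lax) can be illustrated with a minimal python sketch. this is not oclc's matching software: the `Record` fields, the word-overlap search, and the `min_shared` threshold are all hypothetical simplifications chosen only to show how the two stages interact.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """A drastically simplified bibliographic record (hypothetical fields)."""
    title: str
    publisher: str
    year: int


def title_key(title: str) -> frozenset[str]:
    """Reduce a title to a set of lowercase words for loose retrieval."""
    return frozenset(title.lower().split())


def search(incoming: Record, database: list[Record], min_shared: int) -> list[Record]:
    """Stage 1: retrieve candidates sharing at least `min_shared` title words.

    A high threshold makes the search too precise (legitimate matches are
    never retrieved); a low one returns more candidates to compare.
    """
    key = title_key(incoming.title)
    return [r for r in database if len(key & title_key(r.title)) >= min_shared]


def is_match(incoming: Record, candidate: Record) -> bool:
    """Stage 2: compare one candidate field by field.

    Case differences are treated as insignificant; differences in title
    words, publisher, or year are treated as salient.
    """
    return (
        title_key(incoming.title) == title_key(candidate.title)
        and incoming.publisher.lower() == candidate.publisher.lower()
        and incoming.year == candidate.year
    )


def match(incoming: Record, database: list[Record], min_shared: int = 1) -> list[Record]:
    """Run both stages and return the records judged to be the same item."""
    return [c for c in search(incoming, database, min_shared) if is_match(incoming, c)]


database = [
    Record("Othello", "Penguin", 1968),
    Record("Othello the Moor of Venice", "Penguin", 1968),
    Record("The Tragedy of Othello", "Oxford", 2006),
]
incoming = Record("othello", "penguin", 1968)

loose = match(incoming, database, min_shared=1)
tight = match(incoming, database, min_shared=2)
```

with `min_shared=1` all three records are retrieved and the comparison stage keeps only the first; with `min_shared=2` the one-word incoming title retrieves nothing, so a legitimate match is missed and a duplicate would be added.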
background discussion of broader misinformation issues in shared library catalogs can be found in bade’s paper.5 a good, though dated, review of duplicate record problems can be found in the o’neill, rogers, and oskins article.6 the authors discuss their analysis of differences in records that are similar but not identical, and which elements caused failure to match two records for the same item. for example, when there was only one differing element in a pair, they found that element was most often publication date. their study shows the difficulties for experts to determine with certainty that a bibliographic record is for the same item.

problems of typographical errors in shared bibliographic records come under discussion by beall and kafadar.7 their study of copy cataloging errors found only 35.8 percent were corrected later by libraries, though the ordinary assumption is that copy cataloging will be updated when more information is available for an item. pollock and zamora report on a spelling error detection project at chemical abstracts service (cas) and characterize the types of errors they found.8 chemical abstracts databases are among the most searched databases in the world. cas is usually characterized as a set of sources with considerable depth and breadth. of the four most common typographical errors they describe, errors of omission are most common, with insertion second, substitution third, and transposition fourth. over 90 percent of the errors they found were single letter errors. this is in agreement with the findings of o’neill and aluri, though the databases were substantially different.9 another study on moving-image materials focuses on problems of near-equivalents in cataloging.10 yee suggests that cataloging practice tends to lead to making too many separate records for near equivalents.
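the four single-error types reported by pollock and zamora can be made concrete with a short python sketch. the function below is an illustrative assumption, not the method used in any of the cited studies: it labels the difference between an intended string and a typed string when exactly one such error separates them, and returns nothing otherwise.

```python
from typing import Optional


def classify_single_error(intended: str, typed: str) -> Optional[str]:
    """Label a single typographical error, or return None if the strings
    are identical or differ by more than one error of these four types."""
    if intended == typed:
        return None
    n, m = len(intended), len(typed)

    if m == n - 1:
        # Omission: dropping one character from the intended text gives the typo.
        for i in range(n):
            if intended[:i] + intended[i + 1:] == typed:
                return "omission"
    elif m == n + 1:
        # Insertion: dropping one character from the typo gives the intended text.
        for i in range(m):
            if typed[:i] + typed[i + 1:] == intended:
                return "insertion"
    elif m == n:
        diffs = [i for i in range(n) if intended[i] != typed[i]]
        if len(diffs) == 1:
            return "substitution"
        if (
            len(diffs) == 2
            and diffs[1] == diffs[0] + 1
            and intended[diffs[0]] == typed[diffs[1]]
            and intended[diffs[1]] == typed[diffs[0]]
        ):
            # Two adjacent characters were swapped.
            return "transposition"
    return None


examples = {
    "libary": classify_single_error("library", "libary"),
    "librarry": classify_single_error("library", "librarry"),
    "libraty": classify_single_error("library", "libraty"),
    "lirbary": classify_single_error("library", "lirbary"),
}
```

a check of this kind only works when the intended form is known. `classify_single_error("form", "from")` reports a transposition, but a system that sees only "from" has no way to tell that a real word was mistyped.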
owen gingerich provides insight into the use of holdings information in oclc and other bibliographic utilities such as rlin for scholarly research in locating early editions of copernicus’ de revolutionibus.11 among other sources, he used holdings information in multiple bibliographic utilities to help in collecting a census of copies of de revolutionibus, and plotting its movements through europe in the sixteenth century. his article highlights the importance of distinguishing very similar items for scholarly research.

shedenhelm and burk discuss the introduction of vendor records into oclc’s worldcat database.12 their results indicate that these minimal-level records increase the duplication rate within the database and can be costly to upgrade. (see further discussion in the section change in contributor characteristics below.) one problem in analysis of sources of mismatch in previous studies is that there is no good way to detect and characterize typos that form real words. jasco reviews studies characterizing types and sources of errors.13 sheila intner compares the quality issues in the databases of oclc and the research libraries group (rlg) and finds the issues similar.14 intner used matched samples of records from both worldcat and rlin to list and compare types of errors in the records. she noted that while the perception at that time was that rlin had higher-quality cataloging, the differences found were not statistically significant.

jeffrey beall, while focusing in his study on the full-text online database jstor, notes the commonality of problems in metadata quality.15 in addition, he discusses the special quality problems in a database of scanned images. the scanning software itself may introduce typographical errors. like xwc, the database changes rapidly.
o’neill and visine-goetz present a survey of quality control issues in online databases.16 their sections on duplicate detection and on matching algorithms illustrate the commonalities of these problems in a variety of shared cataloging databases. they cite variation in title as the most common reason for failure to identify a duplicate record that should match. variations in publisher, names, and pagination were noted as common. lei zeng presents a study of chinese language records in the oclc and rlin databases.17 zeng discusses quality problems including (1) format errors such as field and subfield tagging and incorrect punctuation; (2) content errors such as missing fields and internal record inconsistencies; and (3) editing and inputting errors such as spacing and misspelling. part 2 of her study presents the results of the prototype rule-based system developed to catch such errors.18 while the author refrains from comparing the quality of oclc and rlin chinese language catalog records, the discussion makes clear that the quality issues are common to a number of online databases. more work is needed on quality and accuracy of shared records in non-roman scripts, or in other languages transliterated to roman script.

types of bias to be considered

specific factors that may tend to bias an attempt to match one record to another include:
1. violated expectations: system software expects data it does not receive, or data received is not well formed.
2. temporal bias: changes in rules and philosophies of record creation over time.
3. design bias: choices in layout of the records, which favor one type of record representation at the expense of another.
4. judgment calls: distinctions introduced in record representations due to differing but legitimate variation in expert judgment. oclc is a multinational cooperative and there is no universal set of standards and rules for creating database records.
rules of cataloging most widely used are not absolutely prescriptive and are designed to allow local deviation to meet local needs.19
5. structural bias: process and systems bias. this category reflects internal influences, inherent in the automatic processing, storage, and retrieval of large numbers of records.
6. growth of the database environment: whether in raw numbers of records, numbers of specific formats, numbers of foreign languages, or other characteristics that may affect efficient location and comparison of records.
7. changes in contributor characteristics: in the goals or focus of institutions that contribute to the database.

violated expectations

data may not conform to expectations. expectations about the nature of records in the databases are frequently violated. what seem to be good rules for matching may not work well if the incoming data is not well formed, or simply not constructed as expected. biasing sources in the incoming data include the following:
1. typographical errors occur in titles and other parts of the record. anywhere the software has to parse text, an entry error (or even correction of an entry error by a later update) could confound matching. this could confound both (a) query execution and (b) candidate comparisons. basically the system expects textual data such as the name of a title or publisher to be correct, and machine-based efforts to detect errors in data are expensive to run. spelling detection techniques can compensate in some ways for data problems, but will not identify cases of real-word errors. see kukich for a survey of spelling error, real-word, and context-dependent techniques.20
2. there is also the issue of real word differences in similar text strings. an automated system with programmed fault tolerance may wrongly equate the publisher name “mila” with “mela” when they are distinct publishers.
equivalence tables can cross-reference known variations on well-known publisher names, but cannot predict merges and other organizational changes. or consider author names: are “john smith” and “jon smith” the same? this is a major problem with automated authority control where context clues may not be trustworthy.
3. errors of formatting of variable fields in the metadata contribute to false mismatch. the rules for data entry in the marc record are complex and have changed over time. erroneous placement or coding of subfields poses challenges for identification of relevant data. the software must be fault tolerant wherever possible. changes in the format of the data itself in these fields/subfields may further complicate record comparisons. isbns (international standard book numbers) and lccns (library of congress control numbers) have both changed format in the recent past.
4. errors occur in the fields that indicate format of the information. in bibliographic records, format information is used to derive the overall type of material being described: book, url, dvd, and so on. errors in the data in combination can generate an incorrect material type for the record.
5. language of cataloging: this comparison has in the past caused inappropriate mismatches. the requirements in the new matching aimed to address this.
6. language in formation of queries: marc records frequently are a mixture of languages. as has been seen in other projects with intensive comparison of text, overlap in languages has the potential to confuse comparisons of short strings of text.21 the assumption made here is that the use of all possible syllables contained in the title should tend to mitigate language problems.
nothing short of semantic analysis by the software is likely to solve such a problem, and contextual approaches to detection have had most success (in the production environment) in carefully controlled cases. matching overall must be generic in its problem solving techniques.

temporal bias

large databases developed over time have their contents influenced by changes in standards for record creation, changes in contributor perception of the role of the database, and changes in technology to be described. changes may include the following:
1. description level, e.g., changes such as book or electronic book. these have evolved from format- to content-based descriptions that transcend format. over time, the cataloging rules for describing formats have changed. thus a format description created earlier might inadvertently “mismatch” the newer description of exactly the same item. for example, the rules for describing a book on a cd originally emphasized the cd format, whereas now, the emphasis might be shifted to focus on the intellectual content, the fact that it is a book.
2. the role of the database, once perceived as chiefly repository or even backup source for a given library, has become a shared resource with responsibilities to a community larger than any one library.
3. over time, the use of the database may change. (this is further discussed in the section on growth of the environment later.) searching has to satisfy the reference function of the database, but matching as a process also relies on searching, and its goals are different.
4. varied standards worldwide challenge cooperation. while u.s. libraries usually follow aacr2 and use the marc21 communications format, other parts of the world may use unimarc and country-specific cataloging rules.
for instance, the pica bibliotekssystem, which hosts the dutch union catalog, used the prussian cataloging rules, which tended to focus on title entries.22 the switch to the rak was made by the early nineties.23
5. some libraries may not use any form of marc but submit a spreadsheet that is then converted to marc. there is some potential for ambiguities in those conversions due to lack of 1:1 correspondence of parts.
6. even within a country, standards change over time, so that “correct” cataloging in one decade may not match that in a later period. neither is wrong, in its own temporal context, but each results in different metadata being created to describe the same item. intner points out that oclc’s database was initiated a full decade before rlg implemented rlin, and rlin started almost the same time as the aacr2 publication.24 thus rlin had many fewer pre-aacr2 records in its database, while worldcat had many more preexisting records to try to match with the newer aacr2 forms.
7. objects referenced in the database may change over time. for instance, a record describing an electronic resource may point to a location no longer valid for that resource.
8. vendor records are created as advance advertising, but there is no guarantee the records will be updated later. estimating the time before updates occur is impossible.
9. records themselves change over time as they are copied, derived, and migrated into other systems. they may be enhanced or corrected in any system where they reside. so when they return to the originating database, they may have been transformed so far as to be unrecognizable as representations of the same item. this problem is not unique to xwc; it is a challenge for any shared database where export of records and reentry is likely.
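the change in isbn format noted earlier gives a concrete case of a purely temporal mismatch. the following is a minimal sketch, not the method used by the xwc matching software, of normalizing isbns so that a ten-digit number in an older record and a thirteen-digit number in a newer record compare as equal. the function names are hypothetical. the conversion rule (prefix 978, then a check digit recomputed with alternating weights 1 and 3) is the published isbn-13 rule. the sketch assumes the ten-digit input is well formed and does not verify its original check digit.

```python
def isbn10_to_isbn13(isbn10: str) -> str:
    """Convert an ISBN-10 to ISBN-13 (hypothetical helper, for illustration)."""
    # Keep only digits and a possible trailing 'X' check character.
    chars = [c for c in isbn10.upper() if c.isdigit() or c == "X"]
    if len(chars) != 10:
        raise ValueError(f"not a 10-character ISBN: {isbn10!r}")
    # The old check character is discarded; only the first nine digits carry over.
    core = "978" + "".join(chars[:9])
    if not core.isdigit():
        raise ValueError(f"unexpected 'X' inside ISBN body: {isbn10!r}")
    # ISBN-13 check digit: weights alternate 1, 3, 1, 3, ... over 12 digits.
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(core))
    check = (10 - total % 10) % 10
    return core + str(check)


def normalize_isbn(raw: str) -> str:
    """Strip punctuation and return a 13-character key for comparison."""
    cleaned = "".join(c for c in raw.upper() if c.isdigit() or c == "X")
    if len(cleaned) == 10:
        return isbn10_to_isbn13(cleaned)
    if len(cleaned) == 13:
        return cleaned
    raise ValueError(f"unrecognized ISBN length: {raw!r}")


def same_isbn(a: str, b: str) -> bool:
    """True when two ISBN strings normalize to the same 13-character key."""
    return normalize_isbn(a) == normalize_isbn(b)


if __name__ == "__main__":
    print(isbn10_to_isbn13("0-306-40615-2"))
    print(same_isbn("0-306-40615-2", "978-0-306-40615-7"))
```

normalizing both keys before comparison removes one source of false mismatch, but it cannot help when the number itself was assigned inconsistently or reused for more than one item.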
design bias the title, author, publisher, place of publication, and other elements of a record, designed in a time when most of the contents of a library were books, may not appear as clear or usable for other forms of informa­ tion, such as web sites or software. there is a risk to any design of a representation for an object, that it may favor distinctions in one format over another. or representations imported from other schemes may lose distinctions in the crosswalk from one scheme to another. a crosswalk is a mechanism for the mapping of data elements/content from one metadata scheme to another. dublin core and marc are just two examples of schemes used by library professionals. software exists to convert dublin core metadata to marc for­ mat, but the process of converting less complex data to a scheme of more structured data has inevitable limita­ tions. for instance, dublin core has “subject” while marc has dozens of ways to indicate subject, each with a different kind of designation for subject aspects of an item.25 see discussion in beall.26 libraries commonly exchange or purchase records from external sources to reduce the volume or costs of in­house cataloging. if an institution harvests metadata from multiple sources, there can be varying structures, content standards, and overall quality, all of which can make record compari­ sons error prone. while library and information science professionals have been creating metadata in the form of catalog records for a long time, the wider community of digital repositories may be outside the lis commu­ nity, and have varied understanding of the need for consistent representations of data. robertson discusses the challenges of metadata creation outside the library community.27 museums and archives may take a dif­ ferent view of what quality standards in metadata are. for example, for a museum, extensive detail about the provenance of an object is necessary. 
archives often record information at the collection level rather than the object level; for example, a box of miscellaneous papers, as opposed to a record for each of the papers within the box. educators need to describe resources such as learning objects. a learning object is any entity, digital or nondigital, which can be used, reused, or referenced during technology-supported learning.28 for these objects a metadata record using the ieee lom standard may be used.29 while this is as complex as a marc record, it has less bibliographic description and more focus on description of the nature and use of the learning object. in short, for one type of institution the notion of appropriate granularity of description may be too detailed or too vague for the needs of another type of institution.

judgment calls

two persons creating independent records for the same item exercise judgment in describing what is most important about the object. one may say it is a book with an accompanying cd, another may say it is software on a cd, accompanied by a book of documentation. another example of legitimate variation is the choice of use of ellipses […] to leave out parts of long titles in a metadata description. one record creator may list the whole title, another may list only the first part followed by the mark of ellipsis to indicate abbreviation of the lengthy title. either is correct, but the two may not match each other without special techniques. see appendix b for the perils of ellipsis handling.

the form of name of a publisher, given other occurrences of a publisher name in a record, may be abbreviated. for instance, in one place the corporate author who is also the publisher might be listed in the author field as “department of health and human services” and then abbreviated, or not, in the publisher area as “the department.”

note that there are limitations inherent to the validation of any system of matching, in that human reviewers may not be able to determine whether two representations in fact describe the same item.

structural bias

1. process bias refers to any features of the software which at run-time may change the way matching is carried out, whether by shortening or lengthening the analysis, or otherwise branching the logical flow. this can arise from many sources, including but not limited to the following factors.
a. there is need for efficient processing of large numbers of incoming records. this can force an emphasis on speedy matching. that is, matching not required to replace records tends to be optimized to stop searching/matching as early as is reasonable. in the case where unique key searching finds a single match to an incoming record, it is fairly easy for the software to “justify” stopping. if there are multiple matches found, more analysis may be needed before the decision to stop matching can be made. over time the number of records processed has increased enormously.
b. matching needs to exploit “unique” keys to speed searching, yet these may not prove to be unique. though agreements are in place for use of numeric keys such as isbns, creation of these keys is not under the control of any one organization.
c. problems arise when brief records are compared with fuller records. comparisons may be biased inadvertently towards false matches. such sparseness of data has been identified as a problem in rlin matching as well as in xwc.
d. at the same time there is bias toward less generic titles in matching.
requirements of sys­ tem throughput mandate an upper limit on the size of result set that the matching software will even attempt to analyze. this upper limit could tend to discriminate against effective retrieval of generic titles. matching will reject very large results sets of searches. so the query that has fewer title terms may tend to retrieve too much. titles such as “proceedings” or “bulletin” may be difficult to match if insufficient other informa­ tion is present in the record for the query to use. ironically this can mean addition of more generic titles to the database, since what is there is in effect less findable. e. transparency can contribute to bias in that, for each layer of transparency a layer of opacity may be added, when information is filtered out from a user’s view. that user may be a human or an application. openurl access to “appropriate copy” is an example from the standards world. the complexity of choosing among multiple online copies has become known as the “appro­ priate copy” problem. there are a number of instances where more than one legitimate copy of an electronic article may exist, such as mir­ roring or aggregator databases. it is essentially a problem of where and how to introduce localiza­ tion into the linking process.30 appropriateness reflects the user’s context, e.g., location, license agreements in place, cost, and other factors. 2. systems bias. what is this, really? the database can be seen as “agent.” the weight of its own mass may affect efforts to use its contents. a. for maintainers of large database systems, the goals of database repository and search engine may be somewhat at odds. yet librarians do make use of the database as reference source. b. search strategies for the software that acts as a user of the database is necessarily developed and optimized at a certain point in time. yet a river of new information flows into this data­ base. 1. 
if the numbers of types of entries in various database indexes grows nonproportion­ ally, search strategies that worked well in the past could potentially fall “out of tune” with the database contents. see growth of the environment section below. 2. change in proportions of languages in the database may render an application’s use of stopword lists less effective. 3. if changes in technology or practice result in new forms of material being described in the database, the software searches using material type as a limiter may not work properly. the software is using abstractions provided by the database, and they need to be kept synchronized. c. automated query construction presents its own problems. the use of boolean searching [term a and term b and term c] is quite restrictive in the sense that there is no “halfway” or flex for a record being included in a set of candidates. matching starts with the most specific search to avoid too­high numbers of records retrieved, and all it can do is drop or rearrange terms from a query in the effort to broaden the results. d. disconnects in metadata object creation/revision are another problem. links can point to broken uris (uniform resource identifiers). controlled vocabularies can drift or expand. even more confusing, a uri that is not broken may point to content which has changed to the point where the metadata no longer describes the item it once did. at one extreme, bruce and hillmann describe the curious case of citation of judicial opinions, for which a record of the opinion may be created as much as eighteen months before the volume with the official citation is printed, and thus the official citation cannot be created.31 e. expectations for creation of metadata play a role as well. traditional cataloging has generally had an expectation that most metadata is being cre­ ated once and reused. 
yet current practice may be more iterative, and must be, if such problems as records with broken internet uris are to be avoided.
f. loss of synchronization can subvert processing. note that other elements of metadata may become divorced or out of synch with the original target/purpose. the prefix to an isbn was originally intended to describe the publisher, but is now an unreliable discriminator. numeric keys intended to identify items uniquely can retrieve multiple items, if the scheme for assigning them is not applied consistently. in the worst case, meaningful data elements may become so corrupted as to be useless for record retrieval or even comparison of two records.
g. ownership issues can detract from optimal database management. member institutions’ perceptions of ownership of individual records can conflict with the goals of efficient search and retrieval. members may resist the idea of a “better” record being merged with a “lesser” one. so systems have ways of ranking records by source or contents with the general goal of trying to avoid losing information, but with the specific effect of refraining from actions that might be enriching in a given case.

growth of the database environment

a shared database can grow in unpredictable ways. a change in the relative proportions of different types of materials or topical coverage can render once-effective searches ineffective due to large result sets. an example of this is the number of internet-related entries in xwc. a search such as “dog” restricted to “internet-related” entries in 1995 retrieved thirty-four hits. this might be a manageable number. but in 2005, 225 entries were in the result set. similarly with subject headings, one search on “computer animation” retrieved fourteen hits in 1980, and 342 in 2005.
in both cases the result sets grew from manageable to “too large” over time. the increase in the number of foreign language entries in a database can cause problems. just determining what language an entry is in can be difficult, and records may contain multiple languages. also, such languages as chinese, japanese, and korean can overlap. chinese syllables such as: “a, an, to, no, jan, ka, jun, lung, sung, i, lo, la, le, so, sun, juan,” seen out of context might be chinese or any one of several other languages. determining appropriate handling of stopwords and other rules for effective title matching becomes more complex as more languages populate the database.

changes in contributor characteristics

copy cataloging practices in an institution can affect xwc indirectly. an institution previously oriented to fixing downloaded records may adopt a policy of refraining from changing downloaded records. historical independence of libraries is one illustration. prior to the 1970s, most libraries did not share their cataloging with other libraries. many institutions, especially smaller ones, were outside the loop and did things their own way. they used what rules they felt were useful, if they used any rules at all. later they converted sparse and poorly formed data into marc records and sent them to oclc for matching, perhaps in an effort to get back a more complete and useful record. yet the matching process is not always able to distinguish or interpret these local dialects. changes in specialization of cataloging staff at an institution, or cutbacks in staff can lead to reduced facility in providing original cataloging. outsourcing of cataloging work can affect handling of specialized materials as well.

the introduction of vendor records and their characteristics has been noted by shedenhelm and burk.32 as they note, these records are very brief bibliographic records originally designed to advertise an item for sale by the vendor. these minimal level records have a relatively high degree of duplication with existing records (37.5 percent in their study) and because of their sparseness can increase the cost of cataloging. changes in the proportion of contributors who create records in non-marc formats such as dublin core can affect the completeness of bibliographic entries. the use of such formats, meant to facilitate the entry of bibliographic materials, does come with a cost.

group cataloging is a process whereby smaller libraries can join a larger set of institutions in order to reduce costs and facilitate cataloging. this larger group then contributes to oclc’s database as an entity. the growth of group cataloging has resulted in the addition of more records from smaller libraries, which may in the future have an effect on searching/matching in xwc worldcat overall.

internationalization may be a factor as well. the marc format is an anglo-based format with english-language-based documentation. rapid international growth thrusts a broader range of traditions into a marc/oclc world. the role of character sets is heightened as the database grows. a cyrillic record may not be confidently matched to a transliterated record for the same item. although worldcat has a long history with cjk records, marc and worldcat are not yet accustomed to a wide repertoire of character sets. now, however, xwc is an environment in which expanding character coverage is possible, and likely.

future research

- we need more systematic study of the types of errors/omissions encountered in marc record creation.
- how can the process of matching accommodate objects that change over time?
- how does the conversion from new metadata schemes affect matching to marc records? does it help to know in what format a record arrived, or under what rules it was created?
- how can we address sparseness in vendor records or legal citations? how can we deal with other advance publication issues?
- how do changes in philosophy of the database affect the integrity of the matching process?

conclusions

in this review we have seen that characterizing metadata at a high level is difficult. challenges for adding to a large, complex database include some of the following:
- rules for expert creation of metadata inevitably change over time.
- the object of the metadata itself may change, more often than may be convenient.
- comparisons of briefer records to records that are more elaborate descriptions can have pitfalls. search and comparison strategies for such record pairs are challenged by the need to have matching algorithms that work for every scenario.
- changes within the database may themselves contribute to exacerbation of matching problems if duplicates are added too often, or records are merged that actually represent different contents. because of the risk, policies for merging and replacing records tend to be conservative, but this does not always favor the greatest efficiency in database processing.
- changes in the membership sharing a database are likely to affect its shape and searchability.
- newer schemes of metadata representation are likely to challenge existing algorithms for determining matches.

references

1. national information standards organization, understanding metadata (bethesda, md.: niso pr., 2004), 1. http://www.niso.org/standards/resources/understandingmetadata.pdf (accessed feb. 26, 2006).
2. library of congress, “marc 21 concise format for bibliographic data (2002).” http://www.loc.gov/marc/bibliographic/ecbdhome.html (accessed nov. 20, 2004).
3. gail thornburg, “matching: discrimination, misinformation, and sudden death,” informing science conference, flagstaff, ariz., june 2005.
4. thomas b. hickey and david j. rypka, “automatic detection of duplicate monographic records,” journal of library automation 12, no.
2 (june 1979): 125–42. 5. david bade, “the creation and persistence of misinfor­ mation in shared library catalogs,” occasional paper no. 211, (graduate school of library and information science, univer­ sity of illinois at urbana–champaign, apr. 2002). 6. edward t. o’neill, sally a. rogers, and w. michael oskins, “characteristics of duplicate records in oclc’s online union catalog,” library resources and technical services 37, no.1 (1993): 59–71. 7. jeffrey beal and karen kafadar, “the effectiveness of copy cataloging at eliminating typographical errors in shared bibliographic records,” library resources & technical services 48, no. 2 (apr. 2004): 92–101. 8. j. j. pollock and a. zamora, “collection and characteriza­ tion of spelling errors in scientific and scholarly text,” journal of the american society for information science 34, no. 1 (1983): 51–58. 9. edward t. o’neill and rao aluri, “a method for cor­ recting typographical errors in subject headings in oclc records,” research report # oclc/opr/rr­80/3 (1980). 10. martha m. yee, “manifestations and year­equivalents: theory, with special attention to moving­image materials,” library resources and technical services 38, no. 3 (1995): 227–55. 11. owen gingerich, “researching the book nobody read: the de revolutionibus of nicolaus copernicus,” the papers of the bibliographical society of america 99, no. 4 (2005): 484–504. 12. laura d. shedenhelm and bartley a. burk, “book vendor records in the oclc database: boon or bane?” library resources and technical services 45, no. 1 (2001): 10–19. 13. peter jasco, “content evaluation of databases,” in annual review of information science and technology, vol. 32 (medford, n.j.: information today, inc., for the american society for infor­ mation science, 1997), 231–67. 14. sheila intner, “quality in bibliographic databases: an analysis of member­controlled cataloging of oclc and rlin,” advances in library administration and organization 8 (1989): 1–24. 15. 
jeffrey beall, “metadata and data quality problems in the digital library,” journal of digital information 6, no. 3 (2005): 10–11. 16. edward t. o’neill and diane vizine­goetz, “quality control in online databases,” annual review of information science and technology 23 (washington, d.c.: american society for information science, 1988). 17. lei zeng, “quality control of chinese­language records using a rule­based data validation system. part 1: an evalua­ tion of the quality of chinese­language records in the oclc oluc database,” cataloging and classification quarterly 16, no. 4 (1993): 25–66 18. lei zeng, “quality control of chinese­language records using a rule­based data validation system. part 2: a study of a rule­based data validation system for online chinese cata­ loging,” cataloging and classification quarterly 18, no. 1 (1993): 3–26. 19. anglo-american cataloguing rules, 2nd ed., 2002 rev. (chi­ cago: ala, 2002). 20. karen kukich, “techniques for automatically correct­ ing words in text,” acm computing surveys 24, no. 4 (1992): 377–439. 21. gail thornburg, “the syllables in the haystack: techni­ cal challenges of non­chinese in a wade­giles to pinyin con­ version,” information technology and libraries 21, no. 3 (2002): 120–26. 22. hartmut walravens, “serials cataloguing in germany: the historical development,” cataloging and classification quarterly 35, no. 3/4 (2003): 541–51; instruktionen für die alphabetischen kataloge der preuszischen bibliotheken vom 10. mai 1899. 2 ausg. in der fassung vom 10. august 1908 (berlin: behrend & co., 1909). 23. richard greene, e­mail message to author, nov. 13, 2006; regeln für die alphabetische katalogisierung: rak / irmgard bou­ vier (wiesbaden, germany: l. reichert, 1980, c1977). 24. intner, “quality in bibliographic databases.” 25. richard greene, e­mail message to author, feb. 27, 2006. 26. beall, “metadata and data quality problems in the digital library.” 27. r. 
john robertson, “metadata quality: implications for library and information science professionals,” library review 54, no. 5 (2005): 295–300. public libraries and internet access | jaeger, bertot, mcclure, and rodriguez 23misinformation and bias in metadata processing | thornburg and oskins 23 28. ieee. learning technology standards committee, “wg12: learning objects metadata.” http://ltsc.ieee.org/wg12 (accessed feb. 26, 2006). 29. ibid. 30. orien beit­arie et al., “linking to the appropriate copy: report of a doi­based prototype,” d-lib 7, no. 9 (sept. 2001). 31. thomas r. bruce and diane i. hillmann,“the continuum of metadata quality: defining, expressing, exploiting,” in metadata in practice (chicago: ala, 2004), 238–56. 32. shedenhelm and burk, “book vendor records in the oclc database.” 24 information technology and libraries | june 200724 information technology and libraries | june 2007 appendix a. sample cdfrecord record from the xwc database cgm 7a 27681290 vf bcahru mr baaafu 920714r19551952fr 092 mleng 92513007 dlcamim dlc lp5921u.s. copyright office xxu mr vbe 6360­6361 (viewing copy) fgb 5643­5647 (ref print) fpa 0621­0625 (master­ pos) othello (motion picture : welles) the tragedy of othello­­ the moor of venice / a mercury production, [films marceau?] ; directed, produced, and written by orson welles. u.s. ; [morocco?] france :films marceau,1952 ; [morocco?: :s.n., 1952?] ;united states : united artists,1955. 2 videocassettes of 2 (ca. 92 min.) :sd., b&w ; 3/4 in. viewing copy. 10 reels of 10 on 5 (ca. 8280 ft.) :sd., b&w ; 35 mm. ref print. 10 reels of 10 on 5 (ca. 8280 ft.) :sd., b&w ; 35 mm. masterpos. copyright: orson welles; 19sep52; lp5921. reference sources cited below and m/b/rs preliminary cataloging card list title as othello. photography, anchisi brizzi, g.r. aldo, george fanto ; film editors, john shepridge, jean sacha, renzo lucidi, william morton ; music, francesco lavagnino, alberto barberis. 
orson welles, suzanne cloutier, micheál macliamóir, robert coote. director, producer, and writer credits taken from focus on orson welles, p. 205. lc has u.s. reissue copy.dlc new york times,9/15/55. an adaptation of the play by william shakespeare. reference sources used: new york times, 9/15/55; international motion picture almanac, 1956, p. 329; focus on orson welles, p. 205-206; monthly film bulletin, v. 23, no. 267, p. 44; index de la cinématographie française, 1952, p. 496. received: 5/26/87 from lc video lab;viewing copy; preservation, made from ref print, paperwork in acq: copyright--material movement form file, lwo 21635; copyright collection. received: 12/2/64; ref print;copyright deposit; copyright collection. received: 5/70; masterpos;gift; afi theatre collection. othello (fictitious character)drama. plays. mim features. mim welles, orson, 1915-direction, production,writing, cast. cloutier, suzanne,1927-cast. mac liammóir, micheál, 1899-1978,cast. coote, robert,1909-1982,cast. copyright collection (library of congress)dlc afi theatre collection (library of congress)dlc othello.

appendix b. the perils of judging near matches

a. challenges of handling ellipses in titles thought to be similar

incoming title: general explanation of tax legislation enacted in ... / prepared by the staff of the joint committee on taxation
match: general explanation of tax legislation enacted in the 104th congress prepared by the staff of the joint committee on taxation

incoming title: general explanation of tax legislation enacted in ... / prepared by the staff of the joint committee on taxation
match: general explanation of tax legislation enacted in the 106th congress prepared by the staff of the joint committee on taxation

incoming title: general explanation of tax legislation enacted in ... / prepared by the staff of the joint committee on taxation
match: general explanation of tax legislation enacted in the 107th congress prepared by the staff of the joint committee on taxation

incoming title: general explanation of tax legislation enacted in ... / prepared by the staff of the joint committee on taxation
match: general explanation of tax legislation enacted in the 108th congress prepared by the staff of the joint committee on taxation

b. partial matches in names which might represent the same publisher

publisher comparison is challenging in an environment where organizations are regularly merged or acquired by other organizations. there is no real authority control for publishers that would help cataloguers decide on a preferred form. when governmental organizations are added to the mix, the challenges increase. below are some examples of non-matching text of publisher names in records, which might or might not be considered the same by a human expert. (the publisher names have been normalized.)

1. publisher name may be partially or differently recorded in two records

incoming publisher: konzeptstudien kantonale planungsgruppe
match: kantonale planungsgruppe konzeptstudien (word order different)

incoming publisher: institut francais proche orient
match: institut francais darcheologie proche orient

incoming publisher: u s dept of commerce national oceanic and atmospheric administration national environmental satellite data and information service
match: national oceanic and atmospheric administration

2. publisher name may have changed due to acquisition by another organization

incoming publisher: pearson prentice hall
match: prentice hall

incoming publisher: uxl
match: uxl thomson gale

incoming publisher: thomson arco
match: arco thomson learning

3.
one record may show “publisher” which is actually a government distributing agency or clearinghouse such as the u.s. government printing office or national technical information service (ntis), while the candidate match shows the actual government agency. these can be almost impossible to evaluate.

incoming publisher: u s congressional service
match: supt g p o (here the distributor is the government printing office, listed as the publisher)

incoming publisher: u s dept of commerce national oceanic and atmospheric administration national environmental satellite data and information service
match: national oceanic and atmospheric administration

incoming publisher: u s gpo
match: u s fish and wildlife service

4. the publisher in a record may start with or end with the publisher in the second record. should it be called a match?

good:

incoming publisher: trotta
match: editorial trotta

incoming publisher: wiley
match: john wiley

questionable?

incoming publisher: prentice hall
match: prentice hall regents canada

incoming publisher: geuthner
match: orientaliste geuthner

incoming publisher: oxford
match: distributed royal affairs oxford

incoming publisher: pan union general secretariat organization states
match: social science section cultural affairs pan union

44 information technology and libraries | march 2011

jennifer emanuel

usability of the vufind next-generation online catalog

vufind incorporates many of the interactive web and social media technologies that the public uses online, including features from online booksellers and commercial search engines. the vufind search page is simple, containing only a single search box and a dropdown menu that gives users the option to search all fields or to search by title, author, subject, or isbn/issn (see figure 1). to combine searches using boolean logic or to limit to a particular language or format, the user must use the advanced search feature (see figure 2).
the recordresults page displays results vertically, with each result containing basic item information, such as title, author, call number, location, item availability, and a graphical icon displaying the material’s format. the results page also has a column on the right side displaying “facets,” which are links that allow a user to refine their search and browse results using catalog data contained within the result set (see figure 3). vufind also contains a variety of web 2.0 features, such as the ability to tag items, create a list of favorite items, leave comments about an item, cite an item, and links to google book previews and extensive author biographies data mined from the internet. corresponding to the beginning of the vufind trial at uiuc, the university library purchased reviews, synopses, and cover images from syndetic solutions to further enhance both vufind and the existing webvoyage catalog. an additional appealing aspect of vufind was its speed; the carli installation of webvoyage is slow to load and is prone to time out while conducting searches. the uiuc library first provided vufind (http:// www.library.illinois.edu/vufind) at the beginning of the 2008 fall semester and expected it to be trialed through the end of the spring semester 2009. use statistics show that throughout the fall semester (september through december), there were approximately six thousand unique visitors each month, producing a total of more than thirty-eight thousand visits. spring statistics show use averaging more than ten thousand visitors a month, an increase most likely from word-of-mouth. librarians at both uiuc and carli were interested in what users thought about vufind, especially in relation to the usability of the interface. with this in mind, the library launched several forms of assessment during the spring semester. 
the first was a quantitative survey based on yale’s vufind usability testing.3 the second was a more extensive qualitative usability test that had users conducting sample searches in the interface and telling the facilitator their opinions. this article will discuss the hands-on usability portion of this study. survey responses that support the results presented herein will be reported in a separate venue. while this article only discusses vufind at a single institution, it does offer a generalized view of next-generation catalogs and how library users use such a catalog compared to a traditional online catalog. the vufind open–source, next-generation catalog system was implemented by the consortium of academic and research libraries in illinois as an alternative to the webvoyage opac system. the university of illinois at urbana-champaign began offering vufind alongside webvoyage in 2009 as an experiment in next-generation catalogs. using a faceted search discovery interface, it offered numerous improvements to the uiuc catalog and focused on limiting results after searching rather than limiting searches up front. library users have praised vufind for its web 2.0 feel and features. however, there are issues, particularly with catalog data. vufind is an open–source, next-generation catalog overlay system developed by villanova university library that was released to the public as beta in 2007 and version 1.0 in 2008.1 as of july 2009, four institutions had implemented vufind as a primary catalog interface, and many more are either beta testing or internally testing it.2 more information about vufind, including the technical requirements and compatible opacs, is available on the project website (http://www.vufind.org). in illinois, the state consortium of academic and research libraries in illinois (carli) released a beta installation of vufind in 2008 on top of its webvoyage catalog database.
the carli installation of vufind is a base installation with minor customizations to the carli catalog environment. some libraries in illinois utilize vufind as an alternative to their online catalog, including the university of illinois at urbana-champaign (uiuc), which currently advertises vufind as a more user-friendly and faster version of the library catalog. as a part of the evaluation of next-generation catalog systems, uiuc decided to conduct hands-on usability testing during the spring of 2009. the carli catalog environment is very complex and comprises 153 member libraries throughout illinois, ranging from tiny academic libraries to the very large uiuc library. currently, 76 libraries use a centrally managed webvoyage system referred to as i-share. i-share is composed of a union catalog containing holdings of all 76 libraries as well as individual institution catalogs. library users heavily use the union catalog because of a strong culture of sharing materials between member institutions. carli’s vufind installation uses the records of the entire union catalog, but has library-specific views. each of these views is unique to the member library, but each library uses the same interface to view records throughout i-share. jennifer emanuel (emanuelj@illinois.edu) is digital services and reference librarian, university of illinois at urbana-champaign. not simply find them.6 as a result, the past five years have been filled with commercial opac providers releasing next-generation library interfaces that overlay existing library catalog information and require an up-front investment by libraries to improve search capabilities.
as these systems are inherently commercial and require a significant investment of capital, several open–source, next-generation catalog projects have emerged, such as vufind, blacklight, scriblio, and the extensible catalog project.7 these interfaces are often developed at one institution with their users in mind and then modified and adapted by other institutions to meet local needs. however, because they can be locally customized, libraries with significant technical expertise can have a unique interface that commercial vendors cannot compete against. one cannot discuss next-generation catalogs without mentioning the metadata that underlie opac systems. some librarians view the interface as only part of the problem of library catalogs and point to cataloging and metadata practices as the larger underlying problem. many librarians view traditional cataloging using machine-readable cataloging (marc), which has been used since the 1960s, as outdated because it was developed with nearly fifty-year-old technology in mind.8 however, because marc is so common and allows cataloging with a fine degree of granularity, current opac systems still utilize it. librarians have developed additional cataloging standards, such as dublin core (dc), metadata object description schema (mods), and functional requirements for bibliographic records (frbr), but none of these have achieved widespread adoption for cataloging printed materials. newly developed catalog projects, such as extensible catalog, are beginning to integrate these new metadata schemas, but currently others continue to use marc.9 many librarians also advocate to integrate folksonomy, or user tagging, into library catalogs. 
folksonomy is used by many library websites, most notably flickr, delicious, and librarything, each of which stores user-submitted content that is tagged with self-selected keywords that allow for easy retrieval and discovery.10 vufind integrates tagging into individual item records ■■ literature review librarians have complained about the usability of online catalogs since they were first created.4 when amazon.com became the go-to site for books and book information in the early 2000s, librarians and their users began to harshly criticize both opac interfaces and metadata standards.5 ever since north carolina state university announced a partnership with the commercial-search corporation endeca in 2006, librarians have been interested in the next generation of library catalogs and, more broadly, discovery systems designed to help users discover library materials, figure 1. vufind default search figure 2. vufind advanced search figure 3. facets in vufind searching the library’s online catalog and were eager to see changes made to it. the test used was developed from a statewide usability test of different catalog interfaces used in illinois. the test was adapted using the same sample searches, but was customized to the features and uses of vufind (see appendix). the vufind test was similar to the original test to allow a comparison of other catalog interfaces to vufind for internal evaluation purposes. i designed the test to allow subjects to perform a progressively complicated series of sample searches using the catalog while the moderator pointed out various features of the catalog interface. subjects were also asked what they thought about the search result sets and their opinions of the interface and navigation; they also were asked to perform specific tasks using vufind. the tasks were common library-catalog tasks using topics familiar to undergraduate-level students.
the tasks ranged from a keyword search for “global warming” to a more complicated search for a specific compact disc by the artist prince. the tasks also included using the features associated with creating and using an account with vufind, such as adding tags and creating a favorite items list. through completing the test, subjects got an overview of vufind and were then asked to draw conclusions about their experience and compare it to other library catalogs they have used. the tests were performed in a small meeting room with one workstation set up with an install of the morae software, a microphone, and a web camera. morae is a very powerful software program developed by techsmith that records the screen on which the user is interacting with an interface, as well as environmental audio and video. although the study did not utilize all the features of the morae software, it was invaluable to the researcher to be able to review the entire testing experience with the same detail as when the test actually occurred in person. the study was carried out with the researcher sitting next to the workstation asking subjects to perform a task from the script while morae recorded all of their actions. once all fifteen subjects completed the test, the researcher watched the resulting videos and coded the answers into various themes on the basis of both broad subject categories and individual question answers. the researcher then gathered the codes into categories and used them to further analyze and gain insight into both the useful features of and problems with the vufind interface. ■■ analysis participants generally liked vufind and preferred it to the current webvoyage system. when asked to choose which catalog they would rather use, only one person, a faculty member, stated he would still use webvoyage. this faculty but does not pull tags from other sources; rather, users must tag items individually. 
additionally, next-generation catalogs offer a search mechanism that focuses on discovery rather than simply searching for library materials. users, accustomed to new ways of searching both on the internet and through commercial library indexing and abstracting databases, now search in a fundamentally different style than they did when opacs first became a part of library services. the online catalog is now just one of many tools that library users use to locate information and now covers fewer resources than it did ten to fifteen years ago. library users are now accustomed to using a single search box, such as with google; they also use nonlibrary online tools to find information about books and no longer view library catalogs as the primary place to look for books.11 as users are no longer accustomed to using the controlled language and particular searching methods of library catalogs because they have moved to discovering materials online, libraries must adapt to new ways of obtaining information and focus not on teaching users how to locate library materials, but on giving them the tools to discover on their own.12 vufind is one option among many in the genre of next-generation or discovery-catalog tools. ■■ methods the study employed fifteen subjects who participated in individual, hands-on usability test sessions lasting an average of thirty minutes. i recruited volunteers through several methods, including posting to a university faculty and staff e-mail discussion list, an e-mail discussion list aimed at graduate students, and flyers in the undergraduate library. all means of recruitment stated that the library sought volunteer subjects to perform a variety of sample searches in a possible new library catalog interface. i also informed subjects that there was a gift card as a thank you for their time. all subjects had to sign a human subjects statement of informed consent approved by the university of illinois institutional review board.
i sought a diverse sample, and therefore accepted the first five volunteers from the following pools: faculty and staff, graduate students, and undergraduate students. i felt that these three user groups were distinct enough to warrant having separate pools. the number of five users in each group was chosen because of jakob nielsen’s statement that five users will find 85 percent of usability problems and that fifteen users will discover all usability problems.13 although i did not specifically aim to recruit a diverse sample, the sample showed a large diversity in areas including age, library experience, and academic discipline. all subjects stated they had some experience usability of the vufind next-generation online catalog | emanuel 47 though there were questions as to how results were deemed relevant to the search statement as well as how they were ranked. participants were then asked to look at the right sidebar of the results page, which contains the facets. most users did not understand the term “facets,” with faculty and staff understanding the term more than graduate and undergraduate students did. one faculty member who understood the term facet noted that “facets are like a diamond with different sides or ways of viewing something.” however, when asked what term would be better to call the limiting options other than facet, several users suggested either calling the facets “categories” or renaming the column “refine search,” “narrow search,” or “sort your search.” participants were then asked to find how to see results for other i-share libraries. only two faculty members found i-share results quickly, and just half of the remaining participants were able to find the option at all. when asked what would make that option easier to find, most said they liked the wording, but the option needed to stand out more, perhaps with a different colored link or bolder type. 
two users thought having the location integrated as a facet would be the most useful way of seeing it. participants, however, quickly took to using the facets, as they were asked to use the climate change search results to find an electronic book published in 2008. no user had problems with this task, and several remarked that using facets was a lot easier than limiting to format and year before searching. the next task for participants was to open and examine a single record within their original climate change results (see figures 4 and 5). participants liked the layout, including the cover image with some brief title information, and a tabbed bar below showing additional information, such as more detailed description, holdings information, a table of contents, reviews, comments, and a link to request the item. several users remarked that they liked having information contained under tabs, but vufind organized each tab as a new webpage that made going back to previous tabs or the results page cumbersome. the only problem users had with the information contained within the tabs was the “staff view,” which contained the marc record information. most users looked at the marc record with confusion, including one graduate student who said, “if the staff view is of no use to the user, why even have it there?” one other useful feature that individual records in vufind contain is a link to an overlay window containing the full citation information for the item in both apa and mla formats. users were able to find this “cite this” link and liked having that information available. however, several participants noted that citation information would be much more beneficial if it could be easily exported to refworks or other bibliographic software. 
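the export request above can be illustrated with a minimal sketch. it assumes ris, a tagged plain-text format that reference managers commonly import, as the exchange format; the function name, the dictionary keys, and the sample record are hypothetical and are not taken from vufind or from the study.

```python
def to_ris(record):
    """Serialize a dict of citation fields into an RIS-style text block.

    `record` is a hypothetical structure used only for this sketch:
    keys "type", "authors" (list), "title", "publisher", "year".
    """
    # Every RIS record opens with a type tag; BOOK is assumed as the default.
    lines = ["TY  - " + record.get("type", "BOOK")]
    # One AU line per author, in the order given.
    for author in record.get("authors", []):
        lines.append("AU  - " + author)
    # Optional single-valued fields are written only when present.
    for tag, key in (("TI", "title"), ("PB", "publisher"), ("PY", "year")):
        value = record.get(key)
        if value:
            lines.append(f"{tag}  - {value}")
    # ER closes the record.
    lines.append("ER  - ")
    return "\n".join(lines)


# Hypothetical sample record, not drawn from the catalog described here.
sample = {
    "authors": ["doe, jane"],
    "title": "an example title on climate change",
    "year": 2008,
}
print(to_ris(sample))
```

a catalog could offer text like this as a download link next to the existing “cite this” overlay; how the fields would be populated from a catalog record is left open here.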
the next several searches used progressively higher-level member thought most of his searches were too advanced for the vufind interface and needed options that vufind did not have, such as limiting a search to an individual library or call number searching. this user did, however, specify that vufind would be easier to use for a fast and simple search. other users all responded very favorably to vufind, liking it better than any other online catalog they have used, with most stating that they wanted it as a permanent addition to the library. the most common responses to vufind were that the layout is easier on the eyes and displayed data much better than the webvoyage catalog; there were no comments about actual search results. several users stated that it was nice to be able to do a broad search and then have all limiting options presented to them as facets, allowing users to both limit after searching and letting them browse through a large number of search results. one user, an undergraduate student, stated she liked vufind because it “was new” and she always wants to try out new things on the internet. the first section of the usability test asked users to examine both the basic and advanced search options. users easily recognized how the interface functioned and liked having a single search box as the basic interface, noting that it looked more like a web search engine. they also recognized all of the dropdown menu options and agreed that the options included what they most often searched. however, four users wanted a keyword search. even though there is not a keyword search in webvoyage and there is an “all fields” menu option, participants seemed to think of the one box search universally as a keyword search and wanted that to be the default search option. 
one participant, an international graduate student, remarked that keyword is more understood by international students than the “all fields” search because, internationally, a field is not a search field but a scholarly field such as education or engineering. in the advanced search, all users thought the search options were clear and liked having icons to depict the various media formats. however, two users did remark that it would be useful to be able to limit by year on the advanced search page. the advanced search also is where the user can select one of seven languages, all of which are considered western languages, including latin and russian. two users, both international graduate students, stated that more languages would be beneficial, especially asian and more slavic languages. the university of illinois has separate libraries for asian and slavic materials, and these two participants said it would be useful to have search options that include the languages served by the libraries. the first task that participants were asked to do was an “all fields” search for “climate change.” they were instructed to look at the results page and an individual record to give feedback as to how they liked the layout and what they thought of the search results. upon looking at the results, all participants thought they were relevant, 48 information technology and libraries | march 2011 to items in which james joyce is the author, no participant had any problems, though several pointed out that there were three facets using his name—joyce, james; joyce, james avery; and joyce, j. a.—because of inconsistencies in cataloging (see figure 6). participants were next asked to search for an audio recording by the artist prince using the basic (single) search box. most participants did an “all fields” search for prince and attempted to use the facets to limit by a particular format. 
all but one was confident that they achieved the proper result, but there was confusion about the format. some participants were confused as to what format an audio recording was because the corresponding facet was for a music recording. a couple of users thought “audio recording” could be a spoken-word recording. most participants preferred that the format facets be more concrete toward a single actual physical format, such as a record, cassette, or a compact disc (see figure 7). physical formats appeared to resonate more with users than the broad cataloging term of “music recording.” a more specific format type (i.e., compact disc) is contained in the call number and should be straightforward to pull out as a facet. it appears vufind pulls the format information from marc field 245 subfield $h for medium rather than the call number (which at illinois can specify the format) or the 300 physical description field or another field such as a notes field that some institutions may use to specify the exact format. however, when participants were asked to further use facets to find prince’s first album, 1978’s for you, limitations with vufind became more apparent. each participant used a different method to search for this album, and none actually found the item either locally or in i-share, though the item has multiple copies available in both locations. most participants tried initially limiting by date because they were given that information. however, vufind’s facets focus on eras rather than specific years, which participants stated was frustrating as many items can fall under a broad era. also, the era facets brought up many more eras than one would consider an audio research skills and showed problems with both vufind and the catalog record data. the first search asked participants to do an “all fields” search for james joyce. 
all were able to complete the search, but there was notable confusion as to which records were written by james joyce and which were items about him. about half of the first-page results for this search did not list an author on the results page. vufind appears to pull the author field on the results page from the 100 field in the marc record, so if the 700 field is used instead for an editor, this information is not displayed on the results page. individual records do substitute the 700 field if the 100 field is not present, but this should also be the case on the initial results screen as well. several users thought it was strange that the results page often did not list the author, but an author was listed in the individual record. additionally, when asked to use the facets to limit figure 4. results set figure 5. record display figure 6. author facet figure 7. format facet usability of the vufind next-generation online catalog | emanuel 49 about both the reviews and comments that could be seen in the various records participants were asked to examine. many of the participants wanted more information as to where the reviews came from because this information was not clear. they also wanted to know whether the reviews or comments from catalog users had any type of moderation by a librarian. for the most part, participants liked having reviews inside the catalog records, but they liked having a summary even more. several users, all graduate students, expressed concern about the objectiveness of having reviews in the catalog, especially because it was not clear who did the review and feared that reviews may interject some bias that had no place in a library catalog record. one of these participants stated, “if i wanted reviews, i would just go to amazon. 
i don’t expect reviews, which can be subjective, to be in a library catalog—that is too commercial.” several undergraduate participants stated that reviews helped them decide whether the book was something that would be useful to them. the final task of the usability test asked participants to create an account with vufind because it is not connected to our user database. most users had no problems finishing this task, though they found some problems with the interface. first, it was not clear that users had to create an account and could not log in with their library number as they did in the library’s opac. second, the default field asks users for their barcode, which is not a term used at uiuc (users are assigned a library number). once logged in, participants were satisfied with the menu options and how their account information was displayed. finally, participants were asked, while logged in, to search for a favorite book and add it to their favorites list. all users liked the favorites-list feature, and many already knew of ways they could use it, but several wished they could create multiple lists and have the ability to arrange lists in folders. ■■ discussion participants thought favorably of the vufind interface and would use it again. they liked the layout of information much more than the current webvoyage interface and thought it was much easier to look at. they also had many comments that the color scheme (yellow and grey) was easier than the blues of the primary library opac. vufind also had more visual elements, such as cover images and icons representing format types that participants also commented on favorably. when asked to compare vufind to both the webvoyage catalog and amazon, only one participant indicated a preference for amazon, while the rest preferred vufind. 
the user who specified amazon, a faculty member, stated that that was where he always started searching for books; he would then search for specific titles in the recording, such as the 15th century. granted, the 15th century probably brings up music that originated in that era, not recorded then, but participants wanted the date to correspond to when an item was initially published or released. it appears that vufind pulls the era facet information from the subject headings and ignores the copyright or issue year. to users, the era facets are not useful for most of their search needs; users would rather limit by copyright or the original date of issue. another search that further highlighted problems searching for multimedia in vufind is the title search participants did for gone with the wind. everyone thought this search brought up relevant results, but when asked to determine whether the uiuc library had a copy of the dvd, many users expressed confusion. once again, the confusion was based on the inability to limit to a specific format. participants could use the facets to limit to a film or video, but not to a specific format. several participants stated that they needed specific formats because when they are doing a comparable search, they only want to find dvds. however, because all film formats are linked together under “film/video,” they must go into individual records and examine the call number to determine the exact format. most participants stated clearly that “dvd” needed to be its own format facet and that entering a record to find the format required too much effort. participants also expressed frustration that the call number was the only place to determine specific format and believed that this information should be contained in the brief item information and not buried in the tabbed areas. the frustrations with the lack of specific formats also were evident when participants were asked to do an advanced search for a dvd on public speaking.
all users initially thought the advanced search limiter for film/video was sufficient when they first looked at the advanced search options. however, when presented with an actual search (“public speaking”), they found that there should be more options and specific format choices up-front within the advanced search. another search that participants conducted was an author search for jack london. they then used the facets to find the book white fang. this search was chosen because the resulting records are mostly for older materials that often do not contain a lot of the additional information that newer records contain. participants looked at a specific record and then were asked what they thought of the information that was displayed. most answered that they would like as much information as you can give them, but were accepting of missing information. several participants stated that most people already know this book and thus did not need additional information. however, when pressed as to what information they would like added to the record, several users stated a summary would be the most useful. additionally, several users asked for more information 50 information technology and libraries | march 2011 the simplicity of the favorites listing feature, the difficulty of linking to other i-share library holdings, and the difficulties in using the facet categories. ■■ implications i intend to continue to perform similar usability tests on next-generation catalogs on a trial basis to examine one aspect regarding the future of online catalogs at uiuc. uiuc is looking at various catalog interfaces, of which vufind is one option, to see which best meets the needs of our users. users stated multiple times during testing that they find the current webvoyage interface to be very frustrating and will accept nearly anything that is an improvement, even if the new interface has some usability issues. 
vufind is not perfect for all searches, as shown by a lack of a call number search and the limitations in searching for multimedia options, but it does provide a more intuitive interface for most patrons. the future of vufind at uiuc is still open. development is currently stalled because of a lack of developer updates and internal staffing constraints both at uiuc and carli. however, because vufind is open–source, and the only ongoing cost is that of server maintenance, both carli and the library are continuing to display it as an option for searching the catalog. both carli and uiuc are closely examining other options for catalog interfaces that would provide patrons with a better search experience, but they have taken no further action either to permanently adopt vufind or to demo other options. despite its limitations, vufind is still a viable option for libraries with substantial technology expertise that are interested in a next-generation catalog interface at a low price. although it does have limitations, it has a better out-of-the-box interface than traditional opacs and should be considered alongside commercial options for any library thinking of adopting a catalog interface overlay. this usability test focused on one institution’s installation of vufind, which may or may not apply to other installations and other institutional needs. it would be interesting to study an installation of vufind at a smaller, nonresearch institution, where users have different searching needs and expectations related to a library’s opac. references 1. john houser, “the vufind implementation at villanova university,” library hi tech 27, no. 1 (2009): 96–105. 2. vufind, “vufind: about,” http://www.vufind.org/about.php (accessed sept. 10, 2009). 3. kathleen bauer, “yale university vufind test—undergraduates,” http://www.library.yale.edu/libepub/usability/studies/summary_undergraduate.doc (accessed mar. 20, 2010). library catalog to check availability.
other participants who made comments about amazon stated that it was commercial and more about marketing materials, while the library catalog just provided the basic information needed to evaluate materials without attempting to sell them to you. several participants also stated they checked amazon for book information, but generally did not like it because of its commercial nature; because vufind provides much of the same information as amazon, they will use vufind first in the future. participants also thought amazon was for a popular and not scholarly audience, making it not useful for academic purposes. most users did not have much to say about the webvoyage opac, except it was overwhelming, had too many words on the result screen, and was not pleasantly visual. participants were also asked to look at vufind, amazon, and webvoyage from a visual preference. again, participants believed that vufind had the best layout. they liked that vufind had a very clean and uncluttered interface and that the colors were few and easy on the eye. they also commented about the visuals contained (cover art and icons) in the records and the vertical orientation of vufind (webvoyage has a horizontal orientation) to display records. they also liked how the facets were displayed, though two users thought they would be better situated on the left side of the results because they scan websites from the left to the right. the one thing that was mentioned several times was vufind’s lack of the star rating system that amazon uses to quickly rate an item. participants thought such a system might be better than reviews because it allows users to quickly scan through the item and not have to read through multiple reviews. when asked to rate the ease of use for vufind, with 1 being easy and 5 being difficult, participants rated it an average of 1.92. faculty rated the ease at 1.6, graduate students at 1.75, and undergraduates at 2.8. 
undergraduates were more likely to get frustrated at media searching and thought that some of the facets related to media items were confusing, which they used to explain their lower scores. however, when asked if they would rather use vufind over the current library catalog (webvoyage), all but one participant enthusiastically stated they would use vufind. most users stated that although vufind was not perfect, it was still much better than the other library catalog because of the better layout, visuals, and ability to limit results. the only user who specified they would still rather use the webvoyage catalog believed it had more options for advanced search, such as call number searching, which vufind lacked. there are, however, several changes that could make vufind more useful to our users that came out of usability testing. some of these are easy to implement on a local level, and others would improve the base build of vufind. a number of issues arose from usability testing, but the largest issues are the lack of refworks integration, 9. jennifer bowen, “metadata to support next-generation library resource discovery: lessons from the extensible catalog, phase 1,” information technology & libraries 27, no. 2 (2008): 6–19. 10. tom steele, “the new cooperative cataloging,” library hi tech 27, no. 1 (2009): 68–77. 11. ian rowlands and david nicholas, “understanding information behaviour: how do students and faculty find books?” journal of academic librarianship 34, no. 1 (2008): 3–15. 12. ja mi and cathy weng, “revitalizing the library opac: interface, searching, and display challenges,” information technology & libraries 27, no. 1 (2008): 5–22. 13. jakob nielsen, “why you only need to test with 5 users,” http://www.useit.com/alertbox/20000319.html (accessed mar. 20, 2010). 4.
christine borgman, “why are online catalogs still hard to use?” journal of the american society for information science 47, no. 7 (1996): 493–503.
5. georgia briscoe, karen selden, and cheryl rae nyberg, “the catalog versus the home page: best practices for connecting to online resources,” law library journal 95, no. 2 (2003): 151–74.
6. kristin antelman, emily lynema, and andrew k. pace, “toward a twenty-first century library catalog,” information technology & libraries 25, no. 3 (2006): 128–39.
7. marshall breeding, “library technology guides: discovery layer interfaces,” http://www.librarytechnology.org/discovery.pl?sid=20100322930450439 (accessed mar. 2010).
8. karen m. spicher, “the development of the marc format,” cataloging & classification quarterly 21, no. 3/4 (1996): 75–90.

appendix. vufind usability study logging sheets

i. the look and feel of vufind a. basic screen (the vufind main page) 1) is it obvious what to do? yes _____ no _____; what were you trying to do? 2) open the drop-down box, examine the options. do you recognize these options? yes _____ no _____ some _____ (if some, find out what the patron was expecting and get suggestions for improvement). comments: b. click on the advanced search option; take a minute to allow the participants to look around the screen 1) examine each of the advanced search options a) are the advanced search options clear? yes _____ no _____ b) are the advanced search options helpful? yes _____ no _____ 2) examine the limits fields, open the drop-down menu boxes a) are the limits clearly identified? yes _____ no _____ b) are the pictures helpful? yes _____ no _____ c) are the drop-down menu box options clear? yes _____ no _____ comments: ii. (back to the) basic search field a. enter the phrase climate change (search all fields) and examine the search results 1) do the records retrieved appear to be relevant to your search statement? yes _____ no _____ don’t know _____ 2) what information would you like to see in the record?
how should it be displayed? 3) examine the right sidebar. are the “facets” clear? yes _____no _____some, not all _____ 4) if you want to view items from other libraries in your search results, can you find the option? yes _____no _____ 5) can you find an electronic book published in 2008? yes _____no _____don’t know _____ comments: b. click on the first book record in the original climate change search results 1) is information about the book clearly represented? yes _____ no _____ 2) is it clear where to find item? yes _____ no _____ 3) look at the tags. do you understand what this feature is? yes _____ no _____ comments: c. look at the brief item information provided on the screen 1) is the information displayed useful in determining the scope and content of the item? yes _____no _____ 2) are the topics in the record useful for finding additional information on the topic? yes _____no _____ comments: d. click on each button below the brief record information 1) is this information useful? yes _____ no _____ 2) are the names for the tabs accurate? what should they be named? e. can you easily determine where the item is located and how to request it? yes _____no _____ comments: f. go back to the basic search box and enter the author james joyce (all fields) as a new search 1) is it easy to distinguish items by james joyce from items about james joyce? yes _____no _____ 2) using the facets, can you find only titles with james joyce as author? yes _____no _____ 3) can you find out how to cite an item? yes _____ no _____ comments: 52 information technology and libraries | march 2011 g. now try to find an audio recording by the artist prince using basic search were you successful? yes _____no _____ h. find the earliest prince recording ( “for you”; 1978). is it in the local collection? yes _____ no _____ if not, can you get a copy? comments: iii. in the advanced search screen: a. use the title drop down to find the item: gone with the wind 1) were you successful? 
yes _____ no _____ not sure _____ 2) can you locate a dvd of the same title? yes _____ no _____ 3) are copies of the dvd available in the university of illinois library? yes _____ no _____ comments: b. use the author drop down in the advanced search to locate titles by: jack london using the facets, find and open the record for the jack london novel, white fang. explore each of the: description, holdings, and comments tabs: 1) is this information useful? yes _____ no _____ 2) would you change the names of the tabs or the information on them? 3) other than your local library copy of white fang, can you find copies at other libraries? yes _____ no _____ comments: c. using the advanced search, find a dvd on public speaking (hint: use the limit box to select the film/video format) are there instructional videos in the university of illinois library? yes _____ no _____ 1) identify the author that’s responsible for one of the dvds 2) can you easily find other works by this author? yes _____ no _____ comments: iv. exploring the account features: a. click on login in the upper right corner of the page. on the next page, create an account. is it clear how to create an account? yes _____ no _____ b. once you have your account and are logged in to vufind, look at the menu on the right hand side. is it clear what each of the menu items are? yes _____ no _____ c. while still logged in, do a search for your favorite book and add it to your favorites list. is this tool useful, would you consider using it? yes _____ no _____ comments: v. comparing vufind to other resources: a. open three browser windows (this is easiest in firefox by entering ctrl-t for each new window) with 1) your library catalog 2) vufind 3) amazon.com enter global warming in each website in the basic search window of each. based on your initial reactions, which service appears the best for most of your uses? library catalog _____ vufind _____ amazon _____ comments: c. 
do you have a preference in the display formats? library catalog _____ vufind _____ amazon _____ comments: debriefing now that you have used vufind, how would you rate it on a scale from 1–5, from easy to confusing to use? comments? how does it compare to other library catalogs you’ve used? if vufind and your home library catalog were available side-by-side, which would you use first? why? are you familiar with any of these other products: aquabrowser _____ googlebooks _____ microsoft live search _____ librarything _____ amazon.com _____ other preferred service _____ that’s it! thank you for participating in our usability study. you will be receiving one other survey through email; we appreciate your opinions on the vufind product.

communications

wikidata: from “an” identifier to “the” identifier
theo van veen
information technology and libraries | june 2019 72
theo van veen (theovanveen@gmail.com) is researcher (retired), koninklijke bibliotheek.

abstract

library catalogues may be connected to the linked data cloud through various types of thesauri. for name authority thesauri in particular i would like to suggest a fundamental break with the current distributed linked data paradigm: to make a transition from a multitude of different identifiers to using a single, universal identifier for all relevant named entities, in the form of the wikidata identifier. wikidata (https://wikidata.org) seems to be evolving into a major authority hub that is lowering barriers to accessing the web of data for everyone. using the wikidata identifier of notable entities as a common identifier for connecting resources has significant benefits compared to traversing the ever-growing linked data cloud. when the use of wikidata reaches a critical mass, for some institutions, wikidata could even serve as an authority control mechanism.
introduction

library catalogs, at national as well as institutional levels, make use of thesauri for authority control of named entities, such as persons, locations, and events. authority records in thesauri contain information to distinguish between entities with the same name, combine pseudonyms and name variants for a single entity, and offer additional contextual information. links to a thesaurus from within a catalog often take the form of an authority control number and serve as identifiers for an entity within the scope of the catalog. authority records in a catalog can be part of the linked data cloud when they include links to thesauri such as viaf (https://viaf.org/), isni (http://www.isni.org/), or orcid (https://orcid.org/). however, using different identifier systems can lead to having many identifiers for a single entity. a single identifier system, not restricted to the library world and bibliographic metadata, could facilitate globally unique identifiers for each authority and therefore improve discovery of resources within a catalog. the need for reconciliation of identifiers has been pointed out before.1 what is now being suggested is to use the wikidata identifier as “the” identifier. wikidata is not domain specific, has a large user community, and offers appropriate apis for linking to its data. it provides access to a wealth of entity properties, it links to more than 2,000 other knowledge bases, it is used by google, and the number of organisations that link to wikidata is growing rapidly.2 the idea of using wikidata as an authority linking hub was recently proposed by joachim neubert.3 but why not go one step further and bring the wikidata identifier to the surface directly as “the” resource identifier, or official authority record? this has been argued before, and the implications of this argument will be considered in more detail in the remainder of this article.4

figure 1. from linking everything to everything to linking directly to wikidata.

figure 1 illustrates the differences between a few possible situations that should be distinguished. on the left, the “everything links to everything” situation shows wikidata as one of the many hubs in the linked data cloud. in the middle, the “wikidata as authority hub” situation is shown, where name authorities are linked to wikidata. on the right is the arrangement proposed in this article, where library systems and other systems for which this may apply share wikidata as a common identifier mechanism. of course, there is a need for systems that feed wikidata with trusted information and provide wikidata with a backlink to a rich resource description for entities. in practice, however, many backlinks do not provide rich additional information, and in such cases a direct link to wikidata would be sufficient for the identification of entities. figure 2 shows these two situations and other possible variations by means of dashed lines, i.e., systems that feed wikidata but use the wikidata identifier as resource identifier for the outside world vs. systems that link directly to wikidata but keep a local thesaurus for administrative purposes. it is certainly not the intention to encourage institutions to give up their own resource descriptions or resource identifiers locally, especially not when they are an original or rich source of information about an entity. a distinction can be made between the url of the description of an entity and the url of the entity itself. when following the url of a real-world entity in a browser, it is good practice to redirect to the corresponding description of the entity.
this is known as the “httprange-14” issue.5 this article will not go into any detail about this distinction other than to note that it makes sense to have a single global identifier for an entity while accepting different descriptions of that entity linked from various sources. wikidata | van veen 74 https://doi.org/10.6017/ital.v38i2.10886 figure 2. feeding properties connecting collections to wikidata (left) and direct linking to wikidata using resource identifier (right). the dashed lines show additional connecting possibilities. the motivating use case the idea of using the wikidata identifier as a universal identifier was born at the research department of the national library of the netherlands (kb) while working on a project aimed at automatically enriching newspaper articles with links to knowledge bases for named entities occurring in the text.6 these links include the wikidata identifier and, where available, the dutch and english dbpedia (http://dbpedia.org) identifiers, the viaf number, the geonames number (http://geonames.org), the kb thesaurus record number, and the identifier used by the parliamentary documentation centre (https://www.parlementairdocumentatiecentrum.nl/). the identifying parts of these links are indexed along with the article text in order to enable semantic search, including search based on wikidata properties. for demonstration purposes the enriched “newspapers+” collection was made available through the kb research portal, which gives access to most of the regular kb collections (figure 3). 7 in the newspaper project, linked named entities in search results are clickable to obtain more information. 
as most users are not expected to know sparql, the query language for the semantic web, the system offers a user-friendly method for semantic search: a query string entered between square brackets, for example “[roman emperor]”, is expanded by a “best guess” sparql query in wikidata, in this case resulting in entities having the property “position held=roman emperor”. these in turn are used to search for articles containing one or more mentions of a roman emperor, even if the text “roman emperor” is not present in the article. in another example, when a user searches for the term “[beatles]”, the “best guess” search yields articles mentioning entities with the property “member of=the beatles”. for ambiguous items, as in the case of “guernica”, which can be the place in spain or picasso’s painting, the one with the highest number of occurrences in the newspapers is selected by default, but the user may select another one. for the default or selected item, the user can select a specific property from a list of wikidata properties available for that specific item. the possibilities of this semantic search functionality may inspire others to use the wikidata identifier for globally known entities in other systems as well.

figure 3. screenshot of the kb research portal with a newspaper article as result of searching “[architect=willem dudok]”. the results are articles about buildings of which willem dudok is the architect. the name of the building meeting the query [architect=willem dudok] is highlighted.

usage scenarios

two usage scenarios can be considered in more detail: (1) manually following links between wikidata descriptions and other resource descriptions, and (2) having the system perform a federated sparql query to automatically bring up linked entities.
in the first scenario, in which resource identifiers link to wikidata, the user can follow the link to all resource descriptions having a backlink in wikidata. but why would a user follow such a link? reasons may include wanting more or context-specific information about the entity, or a desire to search in another system for objects mentioning a specific entity. in the latter case, the information behind the backlink should provide a url to search for the entity, or the backlink should be the search url itself. wikidata provides the possibility to specify various uri templates. these can be used to specify a link for searching objects mentioning the entity, rather than just showing a thesaurus entry. when the backlink does not provide extra information or a way to search for the entity, the backlink is almost useless. thus, when systems provide resource links to wikidata they give users access to a wealth of information about an entity in the web of data and, potentially, to objects mentioning a specific entity. some systems only provide backlinks from wikidata to their resource descriptions but not the other way around. users of such systems cannot easily benefit from these links.

the second scenario of a federated sparql query applies when searching objects in one system based on properties coming from other systems. formulating such a sparql query is not easy because doing so requires a lot of knowledge about the linked data cloud. the alternative is to put the complete linked data cloud in a unified (triple store) database. the technology of linked data fragments might solve the performance and scaling issues but not the complexity.8 using a central knowledge base like wikidata could reduce complexity for the most common situation of searching objects in other systems using properties from wikidata.
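to make the second scenario more concrete, the following python sketch shows one way a system might turn a property/value pair into a sparql query for wikidata and then reuse the resulting identifiers in a conventional local search. it is only an illustration under stated assumptions: the function names are hypothetical, the local search url is a placeholder rather than a real service, and no network request is made. the property p463 (“member of”) and the items q1299 (the beatles) and q937 (albert einstein) are existing wikidata identifiers; the sample result merely follows the standard sparql json results layout and is not real query output.

```python
import re
from urllib.parse import urlencode

# Public Wikidata Query Service endpoint (shown for context; no request is sent here).
WIKIDATA_SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"


def build_property_query(property_id: str, value_qid: str, limit: int = 100) -> str:
    """Build a SPARQL query for entities whose property_id has the value value_qid."""
    if not re.fullmatch(r"P[1-9]\d*", property_id):
        raise ValueError(f"not a Wikidata property id: {property_id!r}")
    if not re.fullmatch(r"Q[1-9]\d*", value_qid):
        raise ValueError(f"not a Wikidata item id: {value_qid!r}")
    # wdt: and wd: are prefixes predefined by the Wikidata Query Service.
    return (
        f"SELECT ?item WHERE {{ ?item wdt:{property_id} wd:{value_qid} . }} "
        f"LIMIT {limit}"
    )


def extract_qids(sparql_json: dict) -> list[str]:
    """Pull Q-numbers out of a result in the standard SPARQL JSON results format."""
    bindings = sparql_json["results"]["bindings"]
    # Entity URIs end in the Q-number, e.g. http://www.wikidata.org/entity/Q937.
    return [b["item"]["value"].rsplit("/", 1)[-1] for b in bindings]


def local_search_url(base_url: str, qids: list[str]) -> str:
    """Turn Q-numbers into a search URL for a hypothetical local system."""
    return f"{base_url}?{urlencode([('id', qid) for qid in qids])}"


if __name__ == "__main__":
    # "member of" (P463) = The Beatles (Q1299), as in the "[beatles]" example.
    print(build_property_query("P463", "Q1299"))

    # Hand-written sample in SPARQL JSON results layout, not real query output.
    sample = {
        "results": {
            "bindings": [
                {"item": {"type": "uri", "value": "http://www.wikidata.org/entity/Q937"}}
            ]
        }
    }
    print(local_search_url("https://whatever.local/wdsearch", extract_qids(sample)))
```

the point of the design is that the local system never needs a sparql endpoint of its own: it only has to index wikidata identifiers and accept them as search keys, while all property-based reasoning is delegated to wikidata.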
this use case requires these systems to take the user’s query and automatically formulate a sparql search. many systems that are linked to wikidata do not support sparql at all, or support it only in a way that is not intended for the average user. those systems can still let users benefit from wikidata by offering a simple add-on that searches wikidata for entities meeting some criteria and then uses their identifiers for a conventional search in the local system, as shown for the case of the historical newspapers. these two use cases illustrate how the use of a wikidata identifier can lower the barrier to accessing information about an entity and to finding objects related to an entity by minimizing the number of hubs, the required knowledge, and the required technology. this is achieved by linking resources to wikidata and, even more so, by making objects searchable by means of the wikidata identifier.

advantages of using the wikidata identifier as universal identifier

summarizing the above, a number of significant advantages of using the wikidata identifier as universal identifier can be seen. these include:
• using the wikidata identifier as resource identifier makes wikidata the first hub. applications therefore have in the first instance to deal with only one description model. from there, it is easy to navigate further: most information is only “one hub away,” so less prior knowledge is required to link from one source to another.
• wikidata identifiers can be used for federated search based on properties in wikidata, so there is less need to know how to access properties in other resource descriptions.
• wikidata identifiers facilitate generating “just in case” links to systems having the wikidata identifier indexed.
• complicated sparql queries using wikidata as primary source for properties can be shared and reused more easily compared to a situation with many diverse sources for properties.
• wikidata offers many tools and apis for accessing and processing data.
• some libraries and similar institutions may even decide to use wikidata directly for authority control when it reaches a critical mass, relieving them from maintaining a local thesaurus.

implementation

institutions can gradually adopt the use of wikidata identifiers without needing to make radical changes in their local infrastructure. a simple first step is automatically generating links to wikidata in the presentation of an object or to the object description to provide contextual information and navigation options. as a next step, the wikidata q-number of an entity could be indexed along with the descriptions containing it, so these objects become findable via a wikidata identifier search, e.g. of the form:

https://whatever.local/wdsearch?id=q937

the wikidata identifier could then be used in conventional as well as federated searches for a resource, regardless of the exact spelling of a resource name. a search may be refined using wikidata properties without further requirements with respect to local infrastructures. institutions having a sparql endpoint can allow for a federated sparql query for combining local data with data from wikidata. as sparql is not easy for the end user, this requires a user interface that formulates the sparql query so that the user does not need to know sparql. those institutions willing to start using the wikidata identifier as resource identifier can unify references in their bibliographic records.
currently, for example, a reference to albert einstein, in a simplified, rdf-like (https://www.w3.org/rdf/) xml fragment in a bibliographic record, could look quite different for different institutions, e.g.:

albert einstein albert einstein albert einstein albert einstein

if the wikidata identifier is used as resource identifier, this could for all institutions become the same:

albert einstein

in this case it becomes easy to navigate the web, to create common bookmarklets, and to provide additional functionality using the wikidata identifier.

cataloguing process and criteria for new wikidata entries

for institutions that decide to link their entities directly to wikidata, their catalog software would have to be configured to support wikidata lookups. catalogers would not have to know about linked data or rdf to create links to wikidata; they would simply have to query wikidata and select the appropriate entry to link. the cataloging software would then add the selected identifier to the record being edited. if a query in wikidata does not yield any results, the item would first have to be created by the cataloger. creating a new item using the wikidata user interface (figure 4) is straightforward: create an account, add a new item, and add statements (fields) and values.

figure 4. data entry screen for entering a new item in wikidata.

catalogers must be aware of some rules when creating items. wikidata editors may delete items that fall under one of wikidata’s exclusion criteria, such as vandalism, empty descriptions, broken links, etc. in addition, the item must refer to an instance of a clearly identifiable conceptual or material “notable” entity. notable means that the item must be mentioned by at least one reliable, third-party published source. here, common sense is required: being mentioned in a telephone book or a newspaper does not in itself establish notability.
entities that are not notable enough to be entered into wikidata would then remain identified by a link to a local or other thesaurus.

possible objections to wikidata as authority control mechanism

although it is, at least at present, not the intention of this article to propose the use of wikidata as the primary local authority control mechanism, some institutions may nonetheless consider the opportunity to do so. several objections to this idea should be noted, including:

1) institutions may consider themselves authoritative sources of information, and may therefore want to keep control over “their” thesaurus. the idea that the greater community can make changes to “their” thesaurus may not be tenable to them. quality control and error detection certainly are important issues, but experts from outside the library can sometimes provide more and better information about a resource than cataloguing professionals. for misuse and erroneous input, the community can be relied on and trusted to correct and add to wikidata entries. information that is critical for local usage, such as access control, may still be managed locally. despite possible objections to using wikidata for universal authority control, national libraries and other institutions can work together with wikidata to share responsibility for maintaining the resource, to optimize and harmonize the shared use of wikidata, and to maintain validity and authority. this might imply a more rigorous quality control.

2) existing systems like viaf and isni currently contain more persons than wikidata, so why use wikidata? viaf and isni are domain specific and are more restrictive with respect to updates of their content and the availability of tools and apis. in wikidata both viaf and isni are just one hub away, and for internal use the viaf and isni identifiers remain available.
the question here is whether there will be a moment when wikidata reaches a critical mass and supersedes viaf and isni.

3) there may be disagreement about a certain entity, especially when it concerns political events or persons whose role is perceived differently by different political parties. wikidata contains neutral properties. the properties that may contain subjective qualifications or might suffer bias are mostly behind the backlinks, like the abstract in wikipedia. a fundamental difference between wikipedia and wikidata is that wikipedia doesn’t have to be consistent across languages. wikidata is much more structured and therefore more useful for semantic applications. it doesn’t allow for the different nuances in descriptions that wikipedia articles do, and therefore wikidata doesn’t reflect different opinions in descriptions and is less subject to bias.9 furthermore, the cataloguing practices in libraries are subject to bias and subjectivity too. perception and political view may, for example, be reflected in some subject headings and may also change over time.10 it is debatable whether a cataloger is more neutral and less biased than a larger user community. although the use and acceptance of wikipedia as a true source of information may be arguable, in the light of the current “fake news” discussion it is extremely important to guard the correctness of information in wikipedia. in this context it is interesting to note that “according to a study in nature, the correctness of wikipedia articles is comparable to the encyclopaedia britannica, and a study by ibm researchers found that vandalism is repaired extremely quickly.”11

4) some objections have to do with the discussion of “centralization versus decentralization.” some institutions may not want a central system that is perceived as having control over their local data.
the idea of using wikidata as a common authority control mechanism is not that different from the use of any other thesaurus or identifier framework like isbn, issn, etc., except for its use of a central resource description.

5) what if wikidata disappears? there are solutions in terms of mirrors and a local copy of wikidata. moreover, national libraries and other, similar institutions that are already responsible for long-term preservation of digital content can take responsibility for keeping wikidata alive to maximize its viability.

conclusion

reconciliation of linked data identifiers in general, and using the wikidata identifier as universal identifier in particular, has been shown to have many advantages. libraries and similar institutions can gradually start using the wikidata identifier without needing to make radical changes in their local database infrastructure. when wikidata reaches a critical mass, libraries and similar institutions may want to switch to using wikidata identifiers as the default resource identifiers or authority records. however, given the enormous growth already taking place in the number of collections that link entities to wikidata, we might end up in a situation where the perception is that “if an item is not in wikidata, it doesn’t exist,” which would stimulate adding more items to wikidata and make local descriptions less relevant. from a strategic point of view, decision makers considering wikidata may ask: “why do we have a local thesaurus when we already have wikidata?” the next question, then, will probably not be “should we go this way?” but rather “when should we go this way and start using the wikidata identifier as the identifier?”

references

1 robert sanderson, “the linked data snowball and why we need reconciliation,” slideshare, apr. 4, 2016, https://www.slideshare.net/azaroth42/linked-data-snowball-or-why-we-need-reconciliation.
2 karen smith-yoshimura, “the rise of wikidata as a linked data source,” hanging together, aug. 6, 2018, http://hangingtogether.org/?p=6775.
3 joachim neubert, “wikidata as a linking hub for knowledge organization systems? integrating an authority mapping into wikidata and learning lessons for kos mappings,” in proceedings of the 17th european networked knowledge organization systems workshop, 2017, 14–25, http://ceur-ws.org/vol-1937/paper2.pdf.
4 theo van veen, “wikidata as universal library thesaurus,” presented oct. 2017 at wikidatacon 2017, berlin, https://www.youtube.com/watch?v=1_nxkbncohm.
5 “httprange-14,” wikipedia, accessed mar. 15, 2019, https://en.wikipedia.org/wiki/httprange-14.
6 theo van veen et al., “linking named entities in dutch historical newspapers,” in metadata and semantics research, mtsr 2016, ed. emmanouel garoufallou (cham: springer, 2016), 205–10, https://doi.org/10.1007/978-3-319-49157-8_18.
7 video demonstration of “kb research portal,” kb | national library of the netherlands, http://www.kbresearch.nl/xportal, accessed apr. 26, 2019, https://www.youtube.com/watch?v=j5mcem-hemg.
8 ruben verborgh, “linked data fragments: query the web of data on web-scale by moving intelligence from servers to clients,” accessed mar. 15, 2019, http://linkeddatafragments.org/.
9 mark graham, “the problem with wikidata,” apr. 6, 2012, https://www.theatlantic.com/technology/archive/2012/04/the-problem-with-wikidata/255564/.
10 candise branum, “the myth of library neutrality,” may 15, 2014, https://candisebranum.wordpress.com/2014/05/15/the-myth-of-library-neutrality/.
11 “the reliability of wikipedia,” wikipedia, accessed mar. 15, 2019, https://en.wikipedia.org/wiki/reliability_of_wikipedia.

2007 is ital’s 40th volume. my 40th birthday was the occasion of a great deal of bizarre behavior by my work colleagues, who booby-trapped my office. i do not like cake but love radishes.
my birthday “cake” at work was a cheese ball decorated with forty radishes stuck on toothpicks. since i didn’t have to blow them out, i ate them, all forty. ital’s fortieth is no time for such shenanigans. rather it is a time for reflection, celebration, and memoriam. fred kilgour, the founding editor of the journal of library automation (jola), ital’s original title, died last summer. in planning for the 40th anniversaries of lita in 2006 and ital in 2007, the editorial board and i wanted to honor fred as founding editor. i called him and invited him to submit an article of his choosing. he thanked me but graciously declined. he was busy writing his memoirs and said that he needed to conserve his strength for that task. to honor him as founding editor, i have invited a number of authors to submit articles describing their research or their seminal thoughts on our profession. readers have, i hope, seen those articles that are so designated by notes. i have also invited all lita members to submit such articles in previous editorials and in a posting to lita-l. several articles have resulted from these invitations. this being the first issue of the 2007 volume, it is neither too late for me to reissue an invitation, nor too late for you lita members and ital readers to respond with articles that commemorate our fortieth. i’m old enough to know that it is a cliché to proclaim “there has never been a more exciting time to be a librarian.” it was so when volume 1 of jola appeared in 1967. it is so today. let us together peruse the tables of contents (tocs) of the first two issues.

vol. 1, no. 1
ned c. morris, “computer based acquisitions system at texas a&i university”; richard d. johnson, “a book catalog at stanford”; robert wedgeworth, “brown university library fund accounting system”; richard e. chapin and dale h.
pretzer, “comparative costs of converting shelf list records to machine readable form”; richard de gennaro, “the development and administration of automated systems in academic libraries” vol. 1, no. 2 lawrence auld, “automated book order and circulation control procedures at the oakland university library”; donald v. black, “creation of computer input in an expanded character set”; frederick c. kilgour, “costs of library catalog cards produced by computer”; r. a. kennedy, “bell laboratories’ library real­time loan system (bellrel)” four things are immediately striking about those titles. their authors described computer­based solutions and systems for big issues facing libraries forty years ago. second, those problems were all administrative, i.e., they involved using computers to increase the productivity of major operations performed by librarians and library staff. to paraphrase an oft­cited goal, they were systems designed to attempt to control the rate of rise of library costs of operations—to improve the efficiency and effec­ tiveness of internal library processes. therefore third, they were not systems for library users per se. and fourth, they were harbingers of success. global cooperative cataloging and well­integrated library systems have revolutionized our operations. we are devoting relatively more resources to direct services than we did forty years ago. i do not mean that no thoughts or efforts were being devoted to improved user services. when these articles were published, lockheed and the system development corporation (sdc) were in the process of developing the first commercially successful, general online database search systems. in fact, forty years ago, in a former life, as it were, i was present at what i believe was the first trans­ continental online information search, from a teletype machine in sdc’s office in dayton, ohio, to a computer at its santa monica headquarters. 
(aside to readers: as an impatient young man, i was struck less by the “magic” of the event than by an observation that i expressed on the spot: the response time was horrible, unacceptable. i opined that no one would put up with such a wait. i narrowly escaped with my scalp intact.) the national library of medicine (nlm) was perfecting the medical literature analysis and retrieval system (medlars), medline’s (medlars online’s) predecessor. selective dissemination of information (sdi) services were already being provided using batch processes. computers generated a myriad of printed article and technical report indexes.

we’ve come a long way in forty years. an article in the current issue describes what librarians need to know about “facebook.” increasingly, in information-rich societies, our students and others want and need their information technology on the run.

the first five paragraphs of this editorial were composed three weeks ago using the word processor on my palm treo 650 whilst i sat in medical-center waiting and examining rooms in portland, oregon. i downloaded the tocs of jola to my home desktop computer in vancouver, washington, two weeks ago. yesterday, i extracted the paragraphs from my palm to my desktop, and saved that document and the tocs on a universal serial bus (usb) key. today, i combined them in a new document on my laptop and keyed the remaining paragraphs in my room at an inn on a pier jutting into commencement bay in tacoma on southern puget sound. i sought inspiration from the view out my window of the water and the fall color, from old crow medicine show on my ipod, and from early sixties beyond the fringe skits on my treo.

fred kilgour was committed to delivering information to users when and where they wanted it. libraries must solve that challenge today, and i am confident that we shall.

editorial: reflections on forty
john webb
john webb (jwebb@wsu.edu) is a librarian emeritus, washington state university, and editor of information technology and libraries.
information technology and libraries | march 2007

12. if you answered “yes” to question 11, please describe how facebook could be considered an academic endeavor.
______________________________________________
13. please check all answers that best describe what effect, if any, use of facebook in the library has had on library services and operations?
☐ has increased patron traffic
☐ has increased patron use of computers
☐ has created computer access problems for patrons
☐ has created bandwidth problems or slowed down internet access
☐ has generated complaints from other patrons
☐ annoys library faculty and staff
☐ interests library faculty and staff
☐ has generated discussion among library faculty and staff about facebook
14. is privacy a concern you have about students using facebook in the library?
☐ yes ☐ no ☐ not sure
please list any observations, concerns, or opinions you have regarding facebook use in libraries.

technology skills in the workplace: information professionals’ current use and future aspirations
monica maceli and john j. burke
information technology and libraries | december 2016 35

abstract
information technology serves as an essential tool for today’s information professional, and ongoing research is needed to assess the technological directions of the field over time. this paper presents the results of a survey of the technologies used by library and information science practitioners, with attention to the combinations of technologies employed and the technology skills that practitioners wish to learn.
the most common technologies employed were email, office productivity tools, web browsers, library catalog and database-searching tools, and printers, with programming topping the list of most-desired technology skills to learn. similar technology usage patterns were observed for early and later-career practitioners. findings also suggested the relative rarity of emerging technologies, such as the makerspace, in current practice.

introduction
over the past several decades, technology has rapidly moved from a specialized set of tools to an indispensable element of the library and information science (lis) workplace, and today it is woven throughout all aspects of librarianship and the information professions. information professionals engage with technology in traditional ways, such as working with integrated library systems, and in new innovative activities, such as mobile-app development or the creation of makerspaces.1 the vital role of technology has motivated a growing body of research literature, exploring the application of technology tools in the workplace, as well as within lis education, to effectively prepare tech-savvy practitioners. such work is instrumental to the progression of the field, and with the rapidly-changing technological landscape, requires ongoing attention from the research community.

one of the most valuable perspectives in such research is that of the current practitioner. understanding current information professionals’ technology use can help in understanding the role and shape of the lis field, provide a baseline for related research efforts, and suggest future directions. the practitioner perspective is also valuable in separating the hype that often surrounds emerging technologies from the reality of their use and application within the lis field.

this paper presents the results of a survey of lis practitioners, oriented toward understanding the participants’ current technology use and future technology aspirations. the guiding research questions for this work are as follows:
1. what combinations of technology skillsets do lis practitioners commonly use?
2. what combinations of technology skillsets do lis practitioners desire to learn?
3. what technology skillsets do newer lis practitioners use and desire to learn as compared to those with ten-plus years of experience in the field?

monica maceli (mmaceli@pratt.edu) is assistant professor, school of information, pratt institute, new york. john j. burke (burkejj@miamioh.edu) is library director and principal librarian, gardner-harvey library, miami university middletown, middletown, ohio.
technology skills in the workplace: information professionals’ current use and future aspirations | maceli and burke | https://doi.org/10.6017/ital.v35i4.9540 36

literature review
the growth and increasing diversity of technologies used in library settings have been matched by a desire to explore how these technologies impact expectations for lis practitioner skill sets. triumph and beile examined the academic library job market in 2011 by describing the required qualifications for 957 positions posted on the ala joblist and arl job announcements websites.2 the authors also compared their results with similar studies conducted in 1996 and 1988 to see if they could track changes in requirements over a twenty-three-year period. they found that the number of distinct job titles increased in each survey because of the addition of new technologies to the library work environment that require positions focused on handling them. the comparison also found that computer skills as a position requirement increased by 100 percent between 1988 and 2011, with 55 percent of 2011 announcements requiring them.
looking more deeply at the technology requirements specifically, mathews and pardue conducted a content analysis of 620 job ads from the ala joblist to identify skills required in those positions.3 the top technology competencies required were web development, project management, systems development, systems applications, networking, and programming languages. they found a significant overlap of librarian skill sets with those of it professionals, particularly in the areas of web development, project management, and information systems. riley-huff and rholes found that the most commonly sought technology-related job titles were systems/automation librarian, digital librarian, emerging and instructional technology librarian, web services/development librarian, and electronic resources librarian.4 a few years later, maceli added to this list with newly popular technology-related titles, including emerging technologies librarian, metadata librarian, and user experience/architect librarian.5

beyond examining which specific technologies librarians should be able to use, researchers have also pondered whether a list of skills is even possible to create. crawford synthesized a series of blog posts from various authors to discuss which technology skills are essential and which are too specialized to serve as minimum technology requirements for librarians.6 he questioned whether universal skill sets should be established given the variety of tasks within libraries and the unique backgrounds of each library worker. crawford also questioned the expectation that every librarian will have a broad array of technology skills from programming to video editing to game design and device troubleshooting. partridge et al.
reported on a series of focus groups held with 76 librarians that examined the skills required for members of the profession, especially those addressing technology.7 in the questions they asked the focus groups, the authors focused on the term “library 2.0” and attempted to gather suggestions on skills that current and future librarians need to assist users. they concluded that the groups identified that a change in attitudes by librarians was more important to future library service than the acquisition of skills with specific technology tools. importance was given to librarians’ abilities to stay aware of technological changes, be resilient and reflective in the face of them, and to communicate regularly and clearly with the members of their communities. another area examined in the studies is where the acquisition of technology skills should and does happen for librarians. riley-huff and rholes reported on a dual approach to measure librarians’ preparation for performing technology-related tasks.8 the authors assessed course offerings for lis programs to see if they included sufficient technology preparation for new graduates to succeed in the workplace. they then surveyed lis practitioners and administrators to learn how they acquired their skills and how difficult it is to find candidates with enough technology preparation for library positions. their findings suggest that while lis programs offer many technology courses, they lack standardization, and graduates of any program cannot be expected to have a broad education in library technologies. further research confirmed this troubling lack of consistency in technology-related curricula. 
singh and mehra assessed a variety of stakeholders, including students, employers, educators, and professional organizations, finding widespread concern about the coverage of technology topics in lis curricula.9 despite inconsistencies between individual programs, several studies provided a holistic view of the popular technology offerings within lis curricula. programs commonly offered one or more introductory technology courses, as well as courses in database design and development, web design and development, digital libraries, systems analysis, and metadata.10,11,12 as researchers have emphasized from a variety of perspectives, new graduates could not realistically be expected to know every technology with application to the field of information.13 there was widespread acknowledgement that learning in this area can, and must, continue in a lifelong fashion throughout one’s career. riley-huff and rholes reported that lis practitioners saw their own experiences involving continuing skill development on the job, both before and after taking on a technology role.14 however, literature going back many decades suggests that the increasing need for continuing education in information technology has generally not been matched by increasing organizational support for these ventures. 
numerous deterrents to continuing technology education were noted, including lack of time,15 organizational climate, and the perception of one’s age.16 while studies in this area have primarily focused on mls-level positions, jones reported on academic library support staff members and their perceptions of technology use over a ten-year period and found that increased technology responsibilities added to workloads and increased workplace stress.17 respondents noted that increasing use of technology in their libraries has increased their individual workloads along with the range of responsibilities that they hold.

method
to build an understanding of the research questions stated above, which focus on the technologies currently used by information professionals and those they desired to learn, we designed and administered a thirteen-question anonymous survey (see appendix) to the subscribers of thirty library-focused electronic discussion groups between february 25 and march 13, 2015. the groups were chosen to target respondents employed in multiple types of libraries (academic, public, school, and special) with a wide array of roles in their libraries (public services librarians, systems staff members, catalogers, and so on). we solicited respondents with an email sent to the groups asking for their participation in the survey and with the promise to post initial results to the same groups. the survey included closed and open-ended questions oriented toward understanding current technology use and future aspirations as well as capturing demographics useful in interpreting and generalizing the results.
the survey questions have been previously used and iteratively expanded over time by the second author, first in the fall of 2008, then spring of 2012, with summative results presented in the last three editions of the neal-schuman library technology companion. we obtained a total of 2,216 responses to the question, “which of the following technologies or technology skills are you expected to use in your job on a regular basis?” of these respondents, 1,488 (67 percent) answered the question regarding technologies they would like to learn: “what technology skill would you like to learn to help you do your job better?” we conducted basic reporting of response frequency for closed questions to assess and report the demographics of the respondents. to analyze the open-ended survey question results in greater depth, we conducted a textual analysis using the r statistical package (https://www.r-project.org/). we used the tm (text mining) package in r (http://cran.r-project.org/package=tm) to calculate term frequencies and correlations, generate plots, and cluster terms.

results
the following section will first present an overview of survey responses and respondents, and then explore results as related to the three stated research questions. the lis practitioners who responded to the survey reported that their libraries are located in forty us states, eight canadian provinces, and forty-three other countries. academic libraries were the most common type of library represented, followed by public, school, special, and other (see table 1).

library type    number of respondents    percentage of all respondents
academic        1,206                    54.4
public          545                      24.6
school          266                      12
special         138                      6.2
other           61                       2.8
table 1. the types of libraries in which survey respondents work

respondents also provided their highest level of education.
a total of 77 percent of responding lis practitioners have earned a library-related or other master’s degrees, dual master’s degrees, or doctoral degrees. from these reported levels of education, it is likely that more respondents are in librarian positions than in library support staff positions. however, individuals with master’s degrees serve in various roles in library organizations, so the percentage of graduate degree holders may not map exactly to the percentage of individuals in positions that require those degrees. significantly fewer respondents (16 percent) reported holding a high school diploma, some college credit, an associate degree, or a bachelor’s degree as their highest level of education. another aspect we measured in the survey was tasks that respondents performed on a regular basis. the range of tasks provided in the survey allowed for a clearer analysis of job responsibilities than broad categories of library work such as “public services” or “technical services.” some respondents appeared to be employed in solo librarian environments where they are performing several roles. even respondents who might have more focused job titles such as “reference librarian” or “cataloger” may be performing tasks that overlap traditional roles and categories of library work. the tasks offered in the survey and the responses to each are shown in table 2. 
task                              number of respondents    percentage of respondents
reference                         1,404                    63.4
instruction                       1,296                    58.5
collection development            1,260                    56.9
circulation                       917                      41.4
cataloging                        905                      40.8
electronic resource management    835                      37.7
acquisitions                      789                      35.6
user experience                   775                      35
library administration            769                      34.7
outreach                          758                      34.2
marketing/public relations        722                      32.6
library/it systems                672                      30.3
periodicals/serials               659                      29.7
media/audiovisuals                566                      25.5
interlibrary loan                 518                      23.4
distance library services         474                      21.4
archives/special collections      437                      19
other                             209                      9.4
table 2. tasks performed on a regular basis by survey respondents

while public services-related activities lead the list, with reference, instruction, collection development, and circulation as the top four task areas, technical services-related activities are well represented; the next three in rank are cataloging, electronic resource management, and acquisitions. the overall list of tasks shows the diversity of work lis practitioners engage in, as each respondent chose an average of six tasks. the results also suggest that the survey respondents are well acquainted with a wide variety of library work rather than only having experience in a few areas, making their uses of technology more representative of the broader library world.

the survey also questioned the barriers lis practitioners face as they try to add more technology to their libraries, and 2,161 respondents replied to the question, “which of the following are barriers to new technology adoption in your library?” financial considerations proved to be the most common barrier, with “budget” chosen by 80.7 percent of respondents, followed by “lack of staff time” (62.4 percent), “lack of staff with appropriate skill sets” (48.5 percent), and “administrative restrictions” (36.7 percent).
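the percentages in table 2 and the statement that each respondent chose an average of six tasks can be rechecked with simple arithmetic. the following python sketch is not part of the original study. it copies the counts from table 2 and assumes that the 2,216 survey responses are the denominator for every row; the names `task_counts`, `percentage`, and `mean_tasks` are illustrative only.

```python
# Counts copied from table 2. N is an assumption: the 2,216 responses
# reported for the survey are taken as the denominator for every row.
N = 2216

task_counts = {
    "reference": 1404,
    "instruction": 1296,
    "collection development": 1260,
    "circulation": 917,
    "cataloging": 905,
    "electronic resource management": 835,
    "acquisitions": 789,
    "user experience": 775,
    "library administration": 769,
    "outreach": 758,
    "marketing/public relations": 722,
    "library/it systems": 672,
    "periodicals/serials": 659,
    "media/audiovisuals": 566,
    "interlibrary loan": 518,
    "distance library services": 474,
    "archives/special collections": 437,
    "other": 209,
}


def percentage(count, total=N):
    """Share of respondents selecting a task, rounded to one decimal place."""
    return round(100 * count / total, 1)


# Every selection counts once, so the sum divided by the number of
# respondents gives the mean number of tasks chosen per respondent.
total_selections = sum(task_counts.values())
mean_tasks = total_selections / N

if __name__ == "__main__":
    for task, count in task_counts.items():
        print(f"{task:32s}{count:>7,d}{percentage(count):>7.1f}")
    print(f"mean tasks per respondent: {mean_tasks:.2f}")
```

under that assumption the mean comes out at a little over six tasks per respondent, in line with the text. the recomputed share for archives/special collections is 19.7 percent rather than the printed 19, which may reflect truncation in the source table.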
what combinations of technology skillsets do lis practitioners commonly use?
responses from survey question 8, “which of the following technologies or technology skills are you expected to use in your job on a regular basis?,” were analyzed to build an understanding of this research question. a total of 2,216 responses to this question were received. survey respondents were asked to select from a detailed list of technologies/skills (visible in question 8 of the appendix) that they regularly used. the top answers respondents chose for this question were: email, word processing, web browser, library catalog (public side), and library database searching. the full list of the top twenty-five technology skills and tools used is detailed in figure 1, with the list of the bottom fifteen technology skills used presented in figure 2.

[figure 1. top twenty-five technology skills/tools used by respondents (n = 2,216). bar chart; horizontal axis: number of respondents, 0 to 2,000. skills shown: email; word processing; web browser; library catalog public side; library database searching; spreadsheets; printers; web searching; teaching others to use technology; presentation software; windows os; laptops; scanners; library management system staff side; downloadable ebooks; web based ebook collections; cloud based storage; technology troubleshooting; teaching using technology; online instructional materials/products; tablets; web video conferencing; educational copyright knowledge; library website creation or management; cloud-based productivity apps.]

[figure 2. bottom fifteen technology skills/tools used by respondents (n = 2,216). bar chart; horizontal axis: number of respondents, 0 to 2,000. skills shown: mac os; audio recording and editing; technology equipment installation; computer programming or coding; assistive adaptive technology; rfid; chromebooks; network management; server management; statistical analysis software; makerspace technologies; linux; 3d printers; augmented reality; virtual reality.]

text analysis techniques were then used to determine the frequent combinations of technology skills used in practice. first, a clustering approach was taken to visualize the most popular technologies that were commonly used in combination (figure 3). clustering helps in organizing and categorizing a large dataset when the categories are not known in advance, and, when plotted in a dendrogram chart, assists in visualizing these commonly co-occurring terms. the authors numbered the clusters identified in figure 3 for ease of reference. from left to right, the first cluster focuses on communication and educational tools, the second emphasizes devices and software, the third contains web and multimedia creation tools, the fourth contains office productivity and public-facing information retrieval tools, and the fifth cluster has a diverse collection of responsibilities including systems-oriented responsibilities (from operating systems to specific hardware devices), working with ebooks, teaching with technology, and teaching technology to others.

[figure 3. cluster analysis of most frequent technology skills used in practice, with red outlines on each numbered cluster]

notably, the list of top skills used (figure 1) falls more on the end-user side of technology; skills more oriented toward systems work (e.g., linux, server management, computer programming, or coding) were less frequently mentioned, and several were among the lowest reported (figure 2). of the 2,216 respondents, 15 percent used programming or coding skills regularly in their job (which is of interest as programming or coding was the skill most desired to learn by respondents; this will be discussed further in the context of the next research question).
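the clustering of co-occurring skills described above can be illustrated with a small python sketch. it is not the authors' procedure, which used the tm package in r. the toy `responses` are hypothetical, and the choice of jaccard distance with average linkage is an assumption made for illustration, since the paper does not state which distance or linkage was used.

```python
from itertools import combinations

# Hypothetical toy data: each set lists the skills one respondent ticked.
responses = [
    {"email", "word processing", "web browser"},
    {"email", "word processing", "web browser", "spreadsheets"},
    {"email", "word processing", "spreadsheets"},
    {"linux", "server management", "network management"},
    {"linux", "server management", "computer programming or coding"},
    {"email", "web browser", "linux", "server management"},
]


def respondents_using(skill):
    """Indexes of the respondents who ticked the given skill."""
    return {i for i, ticked in enumerate(responses) if skill in ticked}


def jaccard_distance(a, b):
    """1 - |A & B| / |A | B| over the respondents using each skill."""
    users_a, users_b = respondents_using(a), respondents_using(b)
    union = users_a | users_b
    return 1.0 - len(users_a & users_b) / len(union) if union else 0.0


def average_linkage(cluster_a, cluster_b):
    """Mean pairwise distance between the skills of two clusters."""
    pairs = [(a, b) for a in cluster_a for b in cluster_b]
    return sum(jaccard_distance(a, b) for a, b in pairs) / len(pairs)


def cluster(skills, k):
    """Agglomerative clustering: merge the closest pair until k clusters remain."""
    clusters = [frozenset([s]) for s in sorted(skills)]
    while len(clusters) > k:
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: average_linkage(*pair))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 | c2]
    return clusters


all_skills = set().union(*responses)
groups = cluster(all_skills, k=2)

if __name__ == "__main__":
    for group in groups:
        print(sorted(group))
```

on this toy data the two remaining clusters separate the end-user tools from the systems-oriented skills, which mirrors the kind of grouping a dendrogram makes visible. recording the order of the merges, rather than stopping at `k` clusters, would give the information needed to draw the dendrogram itself.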
plotting the correlations between the more advanced technology skillsets can provide a picture of the work such systems-oriented positions are commonly responsible for, particularly as they are less well represented in the responses as a whole. figure 4 plots the correlated terms for those tasked with “server management.” it is fair to assume someone with such responsibilities falls on the highly technical end of the spectrum.

[figure 4. terms correlated with “server management,” indicating commonly co-occurring workplace technologies for highly-technical positions]

the more common task of “library website creation or management,” which fell to those with a broad level of technological expertise, had numerous correlated terms. figure 5 shows a wide array of technology tools and responsibilities.

[figure 5. terms correlated with “library website creation or management,” indicating commonly co-occurring technologies used on the job]

lastly, teaching with technology and teaching technology to others are long-standing responsibilities of librarians and library staff. the following plot (figure 6) presents the skills correlated with “teaching others to use technology.”

[figure 6. terms correlated with “teaching others to use technology,” indicating commonly co-occurring technologies used on the job]

what combinations of technology skillsets do lis practitioners desire to learn?
we analyzed responses to survey question 10, “what technology skill would you like to learn to help you do your job better?,” to explore this research question. as summarized in burke18 and consistent with the prior year’s findings, coding or programming remained the most desired technology skillset, mentioned by 19 percent of respondents.
the raw text analysis yielded a fuller list of the top terms mentioned by participants (table 3 and visualized in figure 7).

technology term                                    number of respondents    percentage of respondents
coding or programming (combined for reporting)     292                      19.59
web                                                178                      11.96
software                                           158                      10.62
video                                              112                      7.53
apps                                               106                      7.12
editing                                            105                      7.06
design                                             85                       5.71
database                                           76                       5.11
table 3. terms mentioned by 5 percent or more of survey respondents

[figure 7. wordcloud of responses to “what technology skill would you like to learn to help you do your job better?”]

we then explored the deeper context of responses and individually analyzed responses specific to the more popular technology desires. first, we assessed the responses mentioning the desire to learn coding or programming. of these responses, the most common specific technologies mentioned were html, python, css, javascript, ruby, and sql, listed in decreasing order of interest. although most participants did not describe what they would like to do with their desired coding or programming skills, of those that did, the responses indicated interest in
● becoming more empowered to solve their own technology problems (e.g., “i would like to learn the [programming languages] so i don't have to rely on others to help with our website,” “i’m one of the most tech-skilled people at my library, but i’d like to be able to build more of my own tools and manage systems without needing someone from it or outside support.”);
● improving communication with it (e.g., “how to speak code, to aid in communication with it,” “to better identify problems and work with it to fix them”);
● creating novel tools and improving system interoperability (e.g., “coding for app and api creation”); and
● bringing new technologies to their library and patrons (e.g., “coding so that i can incorporate a hackerspace in my library”).

next, we took a clustering approach to visualize the terms commonly desired in combination. figure 8 describes the clustered terms that we found within the programming or coding responses. the terms “programming” and “coding” form a distinct cluster to the right of the diagram, indicating that many responses contained only those two terms.

[figure 8. clustering of terms present in responses indicating the desire to learn coding or programming]

the remaining portion of the diagram begins to illustrate the specific technologies mentioned for those respondents that answered in greater detail or expanded on their general answer of programming or coding. other related desired technology-skill areas become apparent: database management, html and css (as well as the more general “web design,” which appeared in the top terms in table 3), php and javascript, python and sql, and xml creation, among others. the bulleted list presented above illustrates some of the potential applications participants envisioned these skills being useful in, but the majority did not provide this level of detail in their response.

editing was another prominent term that appeared across participant responses and was largely meant in the context of video editing. because of the vagueness of the term “editing,” a closer look was necessary to determine other technology desires. looking at terms highly correlated with “editing” revealed both video and photo editing to be important to respondents.
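the two steps used on the open-ended answers, counting how many respondents mention a term and finding the terms correlated with a given term, can be sketched in python as follows. this is an illustration, not the authors' r/tm code. the `answers` are invented, and pearson correlation between binary "term present" indicators is an assumed stand-in for whatever association measure the original analysis applied.

```python
import math
import re

# Hypothetical free-text answers, invented for illustration only.
answers = [
    "video editing",
    "photo editing and video editing",
    "coding",
    "python coding",
    "video editing software",
    "web design",
    "coding and web design",
    "photo editing",
]


def tokens(text):
    """Distinct lowercase word tokens in one answer."""
    return set(re.findall(r"[a-z]+", text.lower()))


docs = [tokens(answer) for answer in answers]


def share_mentioning(term):
    """Percentage of answers that mention the term at least once."""
    return 100 * sum(term in doc for doc in docs) / len(docs)


def correlation(term_a, term_b):
    """Pearson correlation between two binary 'term present' indicators."""
    x = [int(term_a in doc) for doc in docs]
    y = [int(term_b in doc) for doc in docs]
    n = len(docs)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0


def associated_terms(term, threshold=0.3):
    """Terms whose correlation with `term` meets the threshold, strongest first."""
    vocabulary = set().union(*docs) - {term}
    scored = [(other, round(correlation(term, other), 2)) for other in vocabulary]
    kept = [pair for pair in scored if pair[1] >= threshold]
    return sorted(kept, key=lambda pair: (-pair[1], pair[0]))


if __name__ == "__main__":
    print(share_mentioning("coding"))
    print(associated_terms("editing"))
```

in the toy answers, "video" and "photo" are the terms most strongly associated with "editing", which is the pattern the paragraph above reports for the real responses. the sketch keeps common words such as "and"; a fuller analysis would normally remove stop words first.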
several of the topappearing terms were used more generally: “database” and mobile “apps” were mentioned without specifying the technology tool or scenario of use, such that a more contextual analysis could not be conducted. these responses can be particularly difficult to interpret as the term “databases” can have a technical meaning (e.g., working with sql) or it can refer to the use of library databases from an end user perspective. information technology and libraries | december 2016 49 what technology skillsets do newer lis practitioners use and desire to learn as compared to those with ten-plus years experience in the field? of the 2,216 survey responses, 877 stated they had worked in libraries for ten or fewer years. we analyzed these responses separately from the remaining 1,334 respondents who had worked in libraries for more than ten years. of this group, 644 had worked in libraries for twenty-plus years (figure 9). a handful of participants did not answer the question and were omitted from the analysis. figure 9. number of survey responses falling into the various categories for number of years working in libraries the top technology skills used in the workplace did not differ significantly between the different groups. the top skills, as discussed earlier and presented in figure 1, were well represented and similarly ordered. a few small percentage points of difference were noted in a handful of the top skills (figure 10). those newer to the field were slightly more likely to teach others to use technology, use cloud-based storage, and use cloud-based productivity apps. more experienced practitioners regularly used the library management system (on the staff side) more than those that were newer to the field. 0 100 200 300 400 500 600 700 0-2 3-5 6-10 11-15 16-20 21+ technology skills in the workplace: information professionals’ current use and future aspirations | maceli and burke | https://doi.org/10.6017/ital.v35i4.9540 50 figure 10. 
top twenty-five technology skills used by respondents in the zero to ten years’ experience (dark blue) and eleven-plus years’ experience (light blue) groups (skills shown: email, word processing, web browser, library catalog public side, library database searching, spreadsheets, web searching, printers, teaching others to use technology, presentation software, windows os, laptops, scanners, downloadable ebooks, cloud-based storage, library management system staff side, web-based ebook collections, technology troubleshooting, teaching using technology, online instructional materials/products, cloud-based productivity apps, tablets, web video conferencing, library website creation or management, educational copyright knowledge; axis: percentage of respondents, 0% to 100%). for the question regarding technologies they would like to learn, 69 percent of the participants with zero to ten years’ experience answered the question, compared to a slightly smaller 65 percent of the participants with more than ten years’ experience. top terms for both groups were very similar, including coding or programming, software, web, video, design, and editing. these terms were not dissimilar to the responses taken as a whole (table 3), indicating that respondents were generally interested in learning the same sorts of technology skills regardless of how long they had been in the field. a few noticeable differences between the two groups emerged. the most popular skill, coding or programming, was mentioned by 28 percent of the respondents with zero to ten years’ experience and by 15 percent of the respondents with eleven-plus years’ experience. there was slightly more interest (by a few percentage points) in databases, design, python, and ruby in the zero to ten years’ experience group.
a closer look at the year ranges within the group with ten or fewer years of experience revealed that those with three to five years of experience were most likely to be interested in learning coding or programming skills. figure 11. percentage of respondents interested in learning coding or programming in the groups with ten or fewer years’ experience (groups shown: 0-2, 3-5, and 6-10 years; axis: 0% to 35%). of the participants that answered the question at all, several stated that there were no technology skills they would need or like to learn for their position, either because they were comfortable with their existing skills or were simply open to learning more as needed (but nothing specific came to mind). combined with those who did not answer the question (and so presumably did not have a particular technology they were interested in learning), 28 percent of the zero to ten years’ experience group and 31 percent of the eleven-plus years’ experience group did not have any technologies that they desired to learn at the moment. discussion as detailed earlier, the most common technologies employed by lis practitioners were email, office productivity tools, web browsers, library catalog and database searching tools, and printers. generally similar technology usage patterns were observed for early- and later-career practitioners, and programming topped the list of most-desired technology skills to learn. the cluster analysis presented in figure 3 suggests that a relatively small percentage of practitioners have technology-intensive roles that would require skills such as programming, working with databases, systems administration, etc. rather, the cluster analysis showed common technology skillsets focused on the end-user side of technology tools.
in fact, most of the top ten skills used—email, office productivity tools (word processing, spreadsheets and presentation software), web browsers, library catalog and database searching, printers, and teaching others to use technology—are fairly nontechnical in nature. a potential exception is that of teaching technology. figure 6 suggests that teaching others to use technology entails several hardware devices (for example, laptops, tablets, smartphones, and scanners) as well as online and digital resources, such as ebooks. however, most of the popular skills used would be considered baseline skills for information workers in any domain. as suggested by tennant, programming and other advanced technical skills do not necessarily need to be a core skill for all information professionals, but knowledge of the potential applications and possibilities of such tools is required.19 this idea was echoed by partridge et al., whose findings emphasized the need for awareness and resilience in tackling new technological developments.20 these skills alone would obviously be too little for lis practitioners explicitly seeking a high-tech role, as discussed in maceli.21 however, further research directed toward exploring the mental models and general technological understanding of information professionals would be helpful in understanding the true level of practitioner engagement with technology, to complement the list of relatively low-tech tools employed. programming has been a skill of great interest within the information professions for many years and the respondents’ enthusiasm and desire to learn in this area was readily apparent from the survey results, with nearly 20 percent of participants citing either “programming” or “coding” as a skill they desired to learn. in the context of their current responsibilities, 15 percent of respondents overall mentioned “computer programming or coding” as a regular technological skill they employed (figure 2). 
there was a slight difference between the librarians with fewer than eleven years of experience (19 percent coded regularly) and those with eleven or more years of experience (13 percent). within the years-of-experience divisions, the newer practitioners were more interested in learning programming, with the peak of interest at three to five years in the workplace (figure 11). the relatively low interest or need to learn programming in the newest practitioners potentially indicates a hopeful finding: that their degree program was sufficient preparation for the early years of their career. prior research would contradict this finding. for example, choi and rasmussen’s 2006 survey found that, in the workplace, librarians frequently felt unprepared in their knowledge of programming and scripting languages.22 in the intervening years, curricula have shifted to more heavily emphasize technology skills, including web development and other topics covering programming,23 perhaps better preparing early-career practitioners. overall, programming remains a popular skill in continuing education opportunities as well as in job listings,24 which aligns well with the respondents’ strong interest in this area. the skills commonly co-occurring with programming in practice included working with linux, database software, managing servers, and webpage creation (figure 4). taken as a whole, these skills indicate job responsibilities falling toward the systems side, with webpage creation a skill that bridged intensely technical and more user-focused work (as also evident in figure 4). this indicates that, though programming may be perceived as highly desirable for communicating and extending systems, as a formal job responsibility it may still fall to a relatively small number of information professionals in any significant manner.
makerspace technologies and their implementation possibilities within libraries have garnered a great deal of excitement and interest in recent years, with much literature highlighting innovative projects in this area (such as american library association25 and bagley26). fourie and meyer provided an overview of the existing makerspace literature, finding that most research efforts focus on the needs and construction of the physical space.27 given the general popularity of the topic (as detailed in moorefield-lang),28 it is interesting to note that such technologies were infrequently mentioned by survey participants, both by those desiring to learn these tools and by those who were currently using them. the least frequently used skills (figure 2) included makerspace technologies, 3d printers, and augmented and virtual reality. only a small number of respondents currently used this mix of makerspace-oriented and emerging technologies, and only 3 percent of respondents mentioned interest in learning makerspace-related skills. despite many research efforts exploring the particulars of unique makerspaces in a case-study approach (for example, moorefield-lang),29 little data exists on the total number of makerspaces within libraries, and the skillset is largely absent from prior research describing lis curriculum and job listings. this makes it difficult to determine whether the low number of participants that reported working with makerspace technologies is reflective of the small number of such spaces in existence or simply that few practitioners are assigned to work in this area, no matter their popularity. in either case, these findings provide a useful baseline with which to track the growth of makerspace offerings over time and librarian involvement in such intensely technological work. despite the interest and clear willingness to learn and use technology, several workplace challenges became apparent from participant responses.
as prior research explored (notably riley-huff and rholes),30 practitioners assumed they would be continually learning and building skills on the job throughout their career to stay current technologically. as described in the earlier results section, many participants mentioned that, although they were highly willing and able to learn, the necessary organizational resources were lacking. as one participant noted, “i’d like to learn anything but the biggest problem seems to be budget (time and monetary).” several participants expressed feeling overwhelmed with their current workload. new learning opportunities, technological or otherwise, were simply not feasible. although the survey results indicated that practitioners of all ages were roughly equally interested in learning new technologies, a handful of responses mentioned that ageist issues were creating barriers. though few, these respondents described being dismissed as technologists because of their age. these themes have long been noted in the large body of continuing-education-related literature going back several decades. stone’s study ranked lack of time as the top deterrent to professional development for librarians, and it appears little has changed.31 chan and auster noted that organizational climate and the perception of one’s age may impair the pursuit of professional development, among other impediments.32 however, research has noted a generally strong drive in older librarians to continue their education; long and applegate found a preference in later-career librarians for learning outlets provided by formal library schools and related professional organizations, but a lower interest in generally popular topics such as programming.33 these findings were consistent with the participant responses gathered in this survey.
finally, as detailed in the results section, a significant percentage of respondents (33 percent) did not answer the question regarding what technologies they would like to learn. as is a common limitation of survey research, it is difficult to know what the respondents’ intention was in not answering the question: are they comfortable with their current technology skills? do they lack the time or interest to pursue further technology education? of those that did answer, many did not specify their intended use of the technologies they desired to learn, so a deeper exploration of what technologies lis practitioners desire to learn, and why, would be of value as well. these questions are worth pursuing in more depth through further research efforts. conclusion this study provides a broad view into the technologies that lis practitioners currently use and desire to learn, across a variety of types of libraries, through an analysis of survey responses. despite a marked enthusiasm toward using and learning technology, respondents described serious organizational limitations impairing their ability to grow in these areas. the lis practitioners surveyed have interested patrons, see technology as part of their mission, and are not satisfied with the current state of affairs, but they seem to lack money, time, skills, and a willing library administration. though respondents expressed a great deal of interest in more advanced technology topics, such as programming, the majority typically engaged with technology on an end-user level, with a minority engaged in deeply technical work. this study suggests future work in exploring information professionals’ conceptual understanding of and attitudes toward technology, and a deeper look at the reasoning behind those who did not express a desire to learn new technologies. references 1.
marshall breeding, “library technology: the next generation,” computers in libraries 33, no. 8 (2013): 16–18, http://librarytechnology.org/repository/item.pl?id=18554. 2. therese f. triumph and penny m. beile, “the trending academic library job market: an analysis of library position announcements from 2011 with comparisons to 1996 and 1988,” college & research libraries 76, no. 6 (2015): 716–39, https://doi.org/10.5860/crl.76.6.716. 3. janie m. mathews and harold pardue, “the presence of it skill sets on librarian position announcements,” college & research libraries 70, no. 3 (2009): 250–57, https://doi.org/10.5860/crl.70.3.250. 4. debra a. riley-huff and julia m. rholes, “librarians and technology skill acquisition: issues and perspectives,” information technology and libraries 30, no. 3 (2011): 129–40, https://doi.org/10.6017/ital.v30i3.1770. 5. monica maceli, “creating tomorrow’s technologists: contrasting information technology curriculum in north american library and information science graduate programs against code4lib job listings,” journal of education for library and information science 56, no. 3 (2015): 198–212, https://doi.org/10.12783/issn.2328-2967/56/3/3. 6. walt crawford, “making it work perspective: techno and techmusts,” cites and insights 8, no. 4 (2008): 23–28. 7. helen partridge et al., “the contemporary librarian: skills, knowledge and attributes required in a world of emerging technologies,” library & information science research 32, no. 4 (2010): 265–71, https://doi.org/10.1016/j.lisr.2010.07.001. 8. riley-huff and rholes, “librarians and technology skill acquisition.” 9. vandana singh and bharat mehra, “strengths and weaknesses of the information technology curriculum in library and information science graduate programs,” journal of librarianship and information science 45, no. 3 (2013): 219–31, https://doi.org/10.1177/0961000612448206. 10. riley-huff and rholes, “librarians and technology skill acquisition.” 11.
sharon hu, “technology impacts on curriculum of library and information science (lis)—a united states (us) perspective,” libres: library & information science research electronic journal 23, no. 2 (2013): 1–9, http://www.libres-ejournal.info/1033/. 12. singh and mehra, “strengths and weaknesses of the information technology curriculum.” 13. see, for example, crawford, “making it work perspective”; partridge et al., “the contemporary librarian.” 14. riley-huff and rholes, “librarians and technology skill acquisition.” 15. elizabeth w. stone, factors related to the professional development of librarians (metuchen, nj: scarecrow, 1969). 16. donna c. chan and ethel auster, “factors contributing to the professional development of reference librarians,” library & information science research 25, no. 3 (2004): 265–86, https://doi.org/10.1016/s0740-8188(03)00030-6. 17. dorothy e. jones, “ten years later: support staff perceptions and opinions on technology in the workplace,” library trends 47, no. 4 (1999): 711–45. 18. john j. burke, the neal-schuman library technology companion: a basic guide for library staff, 5th edition (new york: neal-schuman, 2016). 19. roy tennant, “the digital librarian shortage,” library journal 127, no. 5 (2002): 32. 20. partridge et al., “the contemporary librarian.” 21. monica maceli, “what technology skills do developers need? a text analysis of job listings in library and information science (lis) from jobs.code4lib.org,” information technology and libraries 34, no. 3 (2015): 8–21, https://doi.org/10.6017/ital.v34i3.5893. 22. youngok choi and edie rasmussen, “what is needed to educate future digital librarians: a study of current practice and staffing patterns in academic and research libraries,” d-lib magazine 12, no. 9 (2006), http://www.dlib.org/dlib/september06/choi/09choi.html.
23. see, for example, maceli, “creating tomorrow’s technologists.” 24. elías tzoc and john millard, “technical skills for new digital librarians,” library hi tech news 28, no. 8 (2011): 11–15, https://doi.org/10.1108/07419051111187851. 25. american library association, “manufacturing makerspaces,” american libraries 44, no. 1/2 (2013), https://americanlibrariesmagazine.org/2013/02/06/manufacturing-makerspaces/. 26. caitlin a. bagley, makerspaces: top trailblazing projects, a lita guide (chicago: american library association, 2014). 27. ina fourie and anika meyer, “what to make of makerspaces: tools and diy only or is there an interconnected information resources space?,” library hi tech 33, no. 4 (2015): 519–25, https://doi.org/10.1108/lht-09-2015-0092. 28. heather moorefield-lang, “change in the making: makerspaces and the ever-changing landscape of libraries,” techtrends 59, no. 3 (2015): 107–12, https://doi.org/10.1007/s11528-015-0860-z. 29. heather moorefield-lang, “makers in the library: case studies of 3d printers and maker spaces in library settings,” library hi tech 32, no. 4 (2014): 583–93, https://doi.org/10.1108/lht-06-2014-0056. 30. riley-huff and rholes, “librarians and technology skill acquisition.” 31. stone, factors related to the professional development of librarians. 32. chan and auster, “factors contributing to the professional development of reference librarians.” 33. chris e. long and rachel applegate, “bridging the gap in digital library continuing education: how librarians who were not ‘born digital’ are keeping up,” library leadership & management 22, no. 4 (2008), https://journals.tdl.org/llm/index.php/llm/article/view/1744. appendix. survey questions 1. what type of library do you work in? 2.
where is your library located (state/province/country)? 3. what is your job title? 4. what is your highest level of education? 5. which of the following methods have you used to learn about technologies and how to use them? please mark all that apply. • articles • as part of a degree i earned • books • coworkers • face-to-face credit courses • face-to-face training sessions • library patrons • online credit courses • online training sessions (webinars, etc.) • practice and experiment on my own • web resources i regularly check (sites, blogs, twitter, etc.) • web searching • other: 6. which of the following skill areas are part of your responsibilities? please mark all that apply. • acquisitions • archives/special collections • cataloging • circulation • collection development • distance library services • electronic resource management • instruction • interlibrary loan information technology and libraries | december 2016 59 • library administration • library it/systems • marketing/public relations • media/audiovisuals • outreach • periodicals/serials • reference • user experience • other: 7. how long have you worked in libraries? • 0–2 years • 3–5 years • 6–10 years • 11–15 years • 16–20 years • 21 or more years 8. which of the following technologies or technology skills are you expected to use in your job on a regular basis? please mark all that apply • assistive/adaptive technology • audio recording and editing • augmented reality (google glass, etc.) • blogging • cameras (still, video, etc.) • chromebooks • cloud-based productivity apps (google apps, office 365, etc.) • cloud-based storage (google drive, dropbox, icloud, onedrive, etc.) • computer programming or coding • computer security and privacy knowledge • database creation/editing software (ms access, etc.) • dedicated e-readers (kindle, nook, etc.) 
• digital projectors technology skills in the workplace: information professionals’ current use and future aspirations | maceli and burke | https://doi.org/10.6017/ital.v35i4.9540 60 • discovery layer/service/system • downloadable e-books • educational copyright knowledge • e-mail • facebook • fax machine • image editing software (photoshop, etc.) • laptops • learning management system (lms) or virtual learning environment (vle) • library catalog (public side) • library database searching • library management system (staff side) • library website creation or management • linux • mac operating system • makerspace technologies (laser cutters, cnc machines, arduinos, etc.) • mobile apps • network management • online instructional materials/products (libguides, tutorials, screencasts, etc.) • presentation software (ms powerpoint, prezi, google slides, etc.) • printers (public or staff) • rfid (radio frequency identification) • scanners and similar devices • server management • smart boards/interactive whiteboards • smartphones (iphone, android, etc.) • software installation • spreadsheets (ms excel, google sheets, etc.) • statistical analysis software (sas, spss, etc.) • tablets (ipad, surface, kindle fire, etc.) • teaching others to use technology information technology and libraries | december 2016 61 • teaching using technology (instruction sessions, workshops, etc.) • technology equipment installation • technology purchase decision-making • technology troubleshooting • texting, chatting, or instant messaging • 3d printers • twitter • using a web browser • video recording and editing • virtual reality (oculus rift, etc.) • virtual reference (text, chat, im, etc.) • word processing (ms word, google docs, etc.) • web-based e-book collections • web conferencing/video conferencing (webex, google hangouts, goto meeting, etc.) • webpage creation • web searching • windows operating system • other: 9. 
which of the following are barriers to new technology adoption in your library? please mark all that apply. • administrative restrictions • budget • lack of fit with library mission • lack of patron interest • lack of staff time • lack of staff with appropriate skill sets • satisfaction with amount of available technology • other: 10. what technology skill would you like to learn to help you do your job better? 11. what technologies do you help patrons with the most? 12. what technology item do you circulate the most? 13. what technology or technology skill would you most like to see added to your library?
190 information technology and libraries | december 2011 from static and stale to dynamic and collaborative: the drupal difference editor’s note: this paper is adapted from a presentation given at the 2010 lita forum. in 2009, the university library of the university of california, santa cruz, moved from a static, dreamweaver- and html-created website to an entirely new database-driven website using the open-source content management system (cms) drupal. this article will describe the interdisciplinary approach the project team took for this large-scale transition process, with a focus on user testing, information architecture planning, user analytics, data gathering, and change management. we examine new approaches implemented for group-authoring of resources and the challenges presented by collaboration and crowdsourcing in an academic environment. we also discuss the impact on librarians and staff changing to this new paradigm of website design and development and the training support provided. we present our process for testing, staging, and publishing new content and describe the modules used to build dynamic subject- and course-guide displays.
finally, we provide a list of resources and modules for beginning and intermediate drupal users. why change was needed our old library website was created using static html and its organizational structure evolved to mirror the administrative structure of the library. the vocabulary we used was very library-centric and, though useful to library staff, could be confusing to patrons. like many larger, older websites, we had accumulated a number of redundant and defunct pages. many of these pages had not been updated for years, had inconsistent naming conventions, or outdated page design. the catalyst for updating our web presence was predicated on several things. with more than one million visits per year and more than two million page views, our old servers were no longer able to handle this load, and we were about to begin a major project to replace our server hardware. in addition, we anticipated participating in an upcoming transition to a new campuswide website template. we saw this moment of change as an opportunity to revitalize the library website’s entire structure and reorganize it with a more user-centric approach to the menus and vocabulary. to do this, we decided to move away from dreamweaver and the static html approach to web design and instead choose a cms that would provide a more flexible and innovative interface. choosing drupal we had done research on commercial and open-source solutions and were leaning toward drupal as our cms. many academic departments at our campus were going through a similar process of website redesign and had already explored the cms options and had chosen drupal. this helped move us toward choosing drupal and taking advantage of a growing developer community on campus. two of the largest units on campus both chose drupal as their cms and have since been great partners for collaboration and peer support. drupal is a free, open-source cms (or content management framework) written in php with a mysql database backing it up. 
it is a small application of core modules with thousands of add-on modules available to increase functionality. drupal also has a very strong developer community and has been adopted by a growing number of libraries. we have found it to be very open and fluid, which is both a blessing and curse. for any one problem there can be dozens of differing solutions and modules to resolve it. the transition team the library created a core website implementation team consisting of a librarian project manager/developer, a web designer from the it department, and two librarian developers. the core team was supported by a server administrator and an it analyst. the it staff supported the technical aspects of drupal installation, backup, and maintenance. the librarian developers planned the content migration and managed the user interface design, layout, content, scope, and architecture. they needed to know the basics of how drupal works and needed to have much more access to the inner workings of drupal (e.g., modules, user permissions, etc.) than staff. the librarians also would train library staff, so needed to be able to teach and develop documentation and tailor instruction to specific staff needs. everyone who participated in the implementation team had many other competing responsibilities. the librarian developers had other projects and traditional duties such as collection development and reference services, so learning drupal and creating this new website was a part-time project and had to be integrated into existing workloads. tutorial ann hubble, deborah a. murphy, and susan chesley perry ann hubble (ahubble@ucsc.edu) is science librarian, deborah a. murphy (damurphy@ucsc.edu) is emerging technologies librarian, and susan chesley perry (chesley@ucsc.edu) is head of digital initiatives, university of california, santa cruz. 
selecting a web content management system for an academic library website | hubble, murphy, and perry 191from static and stale to dynamic and collaborative: the drupal difference | hubble, murphy, and perry 191 and often eccentric organizational structures that were no longer meaningful. the previous website had accumulated pages that were a bit more freewheeling in design with a lack of consistent navigation and “look and feel.” adding another layer of complexity, our website changeover took place during a period of great organizational change and a severe budget crisis. surprisingly, what seemed at first a major drawback was actually somewhat helpful. with fewer people spread thinner and doing more work, there was less need to feel in control of individual empires, leading to more cooperation during the changeover. staff learning styles vary, and no one approach to drupal training will work for everyone, so we brought many of the lessons we have learned in our bibliographic instruction sessions to our staff training. for example, we focused training on repetition, reassurance, and patience, ensuring it was an active process with hands-on participation as well as a lecture or demonstration. we provided ample time for questions and invited staff to bring their own projects to work on during training sessions. though some staff only needed to learn a few applications within drupal to perform their jobs, most needed specialized instruction to do some departmental-specific task or action that now had a very different interface. we supplemented our large group by drop-in training sessions with specialized departmental sessions, custom-made documentation, individual hands-on training, e-mail updates on system changes, and regular presentations of new system features. not everyone will become a “born again” drupalista, but everyone should at least feel that they can get their work done using drupal. 
drupal has also meant changes not only in the way content is added to the website, but also in how we handle revisions and updates. in the past, we had a very siloed initially from the increasing interest in drupal at the campus level. they attended a two-day intensive drupal training course from a company called lullabot, which provided an in-depth technical foundation for our initial drupal installation. this level of technical training and content was not appropriate for the other librarian developers on our team. however, a more detailed, midlevel training would have benefited the librarian developers and moved the project forward at a faster pace. these librarian developers learned using a combination of resources, including free online content that covers core drupal skills, combined with a few carefully chosen professional in-person consultations and online training packages and books. drupal is not a static environment, so after the initial training there was still a need for regular updates and refresitishers. our transition team joined the drupal4lib discussion list and consulted with library colleagues using drupal in the northern california area. drupalcon conferences as well as online users groups were excellent places not only to learn but also to make contact with vendors and other developers. several of these resources are listed in the accompanying bibliography in appendix b. staff training by far our largest group of library drupal users was the fifty-plus library staff content contributors who were faced with learning a new approach to web development. drupal’s successful implementation was ultimately dependent on ensuring library staff would be able to create, edit, and manage thousands of library webpages using this new cms. this was a change for everyone in the library, not just a few. 
the new website meant leaving behind the comfort of routines created over the years, elaborate designs that had been developed, and various idiosyncratic transition planning with the goal of making our new site user-centered, we wanted to make data-driven decisions about design rather than what had ultimately devolved into the practice of decisions based on politics and committee negotiations. to that end, we took several approaches to gathering user data. we inventoried our current site and gathered usage statistics based on website analytics. we met with small campus focus groups who answered questions about library site searching. we created personas for user categories based on profiles of average users (e.g. first-year students, graduate students, faculty, community users, etc.). based on this data, we drafted web interface wireframes and began user testing. drupal implementation also included developing a safe and effective means of moving from a testing environment to a final, public production site. this deployment process is a crucial component of ensuring that we could both test new features and still provide a stable environment for our users. after extensive discussions and revisions we developed a process to experiment with new modules and themes in a way that does not overwrite our existing public site. the deployment process goes from development to staging to production. it is critical to be able to determine that a new module or update will not negatively affect the database. the process we follow from our sandbox site to our production site is described in more detail in appendix a. transition team training we had three types of drupal users within the library: system administrators, developers, and staff (the primary content editors); each group had its own training needs. 
the library project manager, web designer, and it systems administrators benefited 192 information technology and libraries | december 2011 appropriate for this particular subject. each tab can also be customized to display whichever records pertain to this subject area. how our dynamic displays were built cck (content construction kit) module content used in both the “article databases and research tools” and “subject guides” displays is held within a special record type we created using the content construction kit (cck) module. we called this special record, or content type, online resource. we defined fields within the online resource record to hold information about individual resources we want to either display on our website or keep track of internally. the fields we defined include the resource name, url (sometimes multiple urls), description, type of information (article database, dictionary, encyclopedia, etc.), and subject discipline. figure 3 shows what a portion of the online resource record for a particular database changes, it’s updated in a central record and immediately reflected in displays throughout the site. not only is it less work to update information, but we also can provide resources in more varied combinations and make them more findable for our users. figure 1 shows how the dynamically created “article databases and research tools” list appears to a user browsing our website. the default display lists these resources in alphabetical order. the user can display the same group of records sorted by other criteria just by clicking on the appropriate tab. if the “by subject” tab is selected, the resources are displayed under subject headings. selecting the “by type” tab lists the resources by resource types, such as dictionaries and encyclopedias, citation style guides, etc. our subject guides are also created using the same components used to build the “article databases and research tools” lists. 
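as a rough illustration of the "input once, display many ways" idea behind the three tabs described above, the sketch below keeps each resource in a single record and derives the a–z, by-subject, and by-type listings from it. this is plain python, not drupal code; the record fields, function names, and sample resources are hypothetical and stand in for the cck fields, taxonomy terms, and views displays discussed in the text.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class OnlineResource:
    """One centrally maintained record (hypothetical field names)."""
    name: str
    url: str
    info_type: str        # e.g. "article database", "dictionary"
    subjects: tuple = ()  # subject terms assigned to the resource


def a_to_z(resources):
    """Default display: resources in alphabetical order by name."""
    return sorted(resources, key=lambda r: r.name.lower())


def by_subject(resources):
    """'By subject' display: resource names grouped under subject headings."""
    groups = defaultdict(list)
    for resource in a_to_z(resources):
        for subject in resource.subjects:
            groups[subject].append(resource.name)
    return dict(sorted(groups.items()))


def by_type(resources):
    """'By type' display: resource names grouped by type of information."""
    groups = defaultdict(list)
    for resource in a_to_z(resources):
        groups[resource.info_type].append(resource.name)
    return dict(sorted(groups.items()))


# Hypothetical sample data; every display is derived from the same records.
resources = [
    OnlineResource("Sample Index", "https://example.org/index",
                   "article database", ("history", "art")),
    OnlineResource("Example Dictionary", "https://example.org/dict",
                   "dictionary", ("art",)),
]
```

because every listing is computed from the same records, changing a url or subject term in one record changes all three displays at once, which is the maintenance benefit the authors describe.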
figure 2 shows a portion of one of our subject guides. like the previous example, this portion of the guide is created dynamically, displaying only records permissions environment that limited editing to only those given specific permissions. we now have role-based ownership where everyone can edit everything, so that we do not have to maintain a detailed list of who does what. initial concern that someone could write over or accidentally delete pages was somewhat remedied by the drupal revisions history feature, which assists with version control. there have been a few pages where ownership is an issue, and we are still in the process of developing a system to ensure that pages are updated when there is no specific individual linked to a page. dynamic displays: article databases and subject guides as part of the move to drupal, we wanted to take advantage of the new environment to redesign some of the more specialized portions of our site. in particular, we hoped that drupal’s dynamic displays would help us to keep our site more current with less effort. with this in mind, we chose to focus on two of the most heavily used resources: our list of article databases and our library subject guides. we planned to transform these static, high-maintenance html pages into dynamic, easily maintained, and easily generated resources. we used a number of drupal modules to develop the library’s website, and these are described in more detail in appendix c. to redesign our list of article databases and our library subject guides, we relied heavily on three important modules: cck (content construction kit), views, and taxonomy. the interaction of these three modules is key to building dynamically created webpages. once these modules are configured, information is input just once. drupal does the work of pulling the right information from each resource to create dynamic displays. if information, such as a url figure 1. 
dynamic display: article databases and research tools and not programmers. we found that drupal was very different from anything we had used before and had a very steep learning curve. if we could start over, we would invest much more time in lessons learned learning drupal takes time our implementation team consisted predominantly of librarians and list of defined fields looks like behind the scenes to the librarian web developer. some fields within the online resource record rely on a little further customization. the “type of information” field is defined via an allowed values list. figure 4 shows a portion of the values list we have defined for this particular field. the “subject, discipline, topic” field (figure 3) incorporates a taxonomy list that we first created using the taxonomy module. this taxonomy vocabulary allows us to later sort the resources dynamically in both the “article databases and research tools” (figure 1) and “subject guides” displays (figure 2). taxonomy module figure 5 shows the list of subject terms we created using the taxonomy module. terms are easily added, edited, and organized via this taxonomy display, available only to the web developers. views module–putting it all together to define how the online resource records are displayed to the user (figure 1), we use the views module. views allows us to define, sort, and filter these records for display. figure 6 shows what the “article databases and research tools” view of figure 1 looks like to the web developer. notice that “a–z,” “subjects,” and “by type” are listed in a box on the left side of the page. each of these tabs corresponds to a tab on the page that displays to the user. 
in this case, “a–z” is bold and is the active tab currently being defined for this display. display settings such as the record type used, number of records to display per page, specific fields to display, type of sorting, and url path for the webpage are defined here. figure 2. dynamic display: dynamic portion of a subject guide figure 3. cck module: online resource record: manage fields 194 information technology and libraries | december 2011 learning drupal basics and getting a better grasp of how drupal works as a cms. the architecture and database-driven paradigm of our new drupal site is a significantly different environment from our previous website’s html-designed pages and directory-and-folder organization. of particular importance for our site were three core modules: cck, views, and taxonomy. becoming proficient with these modules was a challenge, and we can’t emphasize enough the importance of good, basic training on their use. start small: identify small parts to bring over initially, the thought of moving our old website to drupal seemed insurmountable. bringing over static html pages was straightforward, but portions of the website (such as converting our database of online resources) took more intensive planning. the entire process became more manageable when we divided up the site and focused on drupalizing small parts at a time. this way we could focus on learning enough drupal to make these portions of the site work without being overwhelmed. project management software: document & share what you’ve done if we were to transition an entire website again we would recommend using some type of project management software before starting. none of the implementation team worked on this site full time. this project was added to our other full-time workload providing reference services, collection planning, teaching, digital projects, etc. during our project we tried several free products but were not satisfied with any of them. 
we felt that finding the right project management package could have made the website transition process much figure 4. cck module: allowed values list (type of information) figure 5. taxonomy module: subjects list selecting a web content management system for an academic library website | hubble, murphy, and perry 195from static and stale to dynamic and collaborative: the drupal difference | hubble, murphy, and perry 195 and the library is now in a much better position for future website design transitions, a process that will be much easier with so much less static content to migrate. for example, the look and feel of our entire website can be transformed by reconfiguring a few options within drupal. ultimately, the transition of the library website to a drupal environment was a very good thing, and we are glad we did it. it was difficult and messy at times, but our website is now more flexible, agile, adaptable, and better poised for change. epilogue since this article was submitted, the uc santa cruz university library website has moved to an entirely new campus theme. we note that having a drupal-based cms greatly aided this transition process. personas for librarians and content contributors and done more usability testing for non-developers. we found that training and teaching library staff the architecture and databasedriven paradigm of the new drupal culture has been a challenge and we still have varying levels of buy-in. conclusion we now have a consistent look and feel to our site, though there are still many things yet to do. now that we are more comfortable using drupal, we can focus on creating more dynamic content, such as staff lists, adding sidebars to pages, and so on. increasing the number of dynamically created pages will mean a more up-to-date site in general. though group authoring within the library is still a challenge, we continue to find ways to encourage collaboration. easier. 
documenting and sharing how we created elements of the site helped us replicate complex components and allowed us to collaborate more easily on various projects. test, test, test testing the website as we developed it was a crucial component of our work. modules also can interact with other modules in unpredictable ways, so we ultimately found that loading new modules on our sandbox site, a mirror of the library website, was a crucial step in determining compatibility as well as functionality with our existing site (appendix a). it’s essential to practice using a live site without bringing the real production website down. focus on essential modules: cck, views, taxonomy images, wysiwyg editors drupal comes with a set of core modules plus an ever-increasing number of specialized contributed modules. finding and installing the right contributed module that fits a particular need can sometimes be difficult. there are often myriad modules that can solve a problem. it takes time to find and test each one to see if it will actually function as needed, and not all modules work well with one another. focusing on the essential drupal core modules plus cck, views, and taxonomy will help reduce unnecessary development frustrations. staff are important though we created many personas for faculty, students, and community users, we should have created figure 6. views module: article databases and research tools view appendix a. website deployment process created by bryn kanar and sue chesley perry appendix b. drupal resources for getting started ■■ american library association. 
“drupal4lib interest group (lita library & information technology association).” http://connect.ala.org/node/71787 (accessed march 18, 2011).
■■ american library association. “showcase: database pages & research guides using drupal.” http://connect.ala.org/node/98546 (accessed march 18, 2011).
■■ austin, andy, and christopher harris. “drupal in libraries.” library technology reports 44, no. 4 (2008).
■■ byron, angela, addison berry, nathan haug, jeff eaton, james walker, and jeff robbins. using drupal: choosing and configuring modules to build dynamic websites. sebastopol, ca: o'reilly, 2008.
■■ drupal. “drupal.org.” http://drupal.org/ (accessed march 18, 2011).
■■ drupal dojo. “drupal dojo.” http://drupaldojo.com/ (accessed march 18, 2011).
■■ drupal modules. “search, rate, and review drupal modules.” http://drupalmodules.com/ (accessed march 18, 2011).
■■ “drupalconsf san francisco – april 19-21, 2010.” http://sf2010.drupal.org/conference/sessions (accessed march 18, 2011).
■■ drupalib. “drupalib: a place for library drupalers to hang out.” http://drupalib.interoperating.info/ (accessed march 18, 2011).
■■ gotdrupal.com. “gotdrupal: once you've got it, you're addicted!” http://gotdrupal.com (accessed march 18, 2011).
■■ groups.drupal. “libraries.” http://groups.drupal.org/libraries (accessed march 18, 2011).
■■ groups.drupal. “list of libraries using drupal.” http://groups.drupal.org/libraries/libraries (accessed march 18, 2011).
■■ “is this site built with drupal?” http://www.isthissitebuiltwithdrupal.com/ (accessed march 18, 2011).
■■ learn by the drop. “learn by the drop: a place to learn drupal.” http://learnbythedrop.com/ (accessed march 18, 2011).
■■ “lullabot.” http://lullabot.com (accessed march 18, 2011).
■■ mastering drupal. “drupal screencasts.” http://www.masteringdrupal.com/videos (accessed march 18, 2011).
■■ slideshare. “drupal resources for libraries, sarah houghton-jan.” http://www.slideshare.net/librarianinblack/drupal-resources-2982935 (accessed march 18, 2011).
■■ slideshare. “introduction to drupal for libraries, laura solomon.” http://www.slideshare.net/oplin/intro-to-drupal-for-libraries (accessed march 18, 2011).
■■ sunrainproductions. “drupalcampla 2009 views demystified.” http://www.sunrainproductions.com/drupalcampla/views-demystified (accessed march 18, 2011).
appendix c. selected drupal modules used on the ucsc library site
■■ administration menu—adds a top menu bar for authenticated users with common administration tasks
■■ cck—allows you to add new content types, for example the online resources content type for a–z list
■■ ckeditor—wysiwyg editor
■■ google analytics—adds google javascript tracking code to all of our site's pages
■■ google cse—allows us to use google as the site search
■■ imce—image-uploading module, also allows you to create subdirectories within the image directory
■■ image cache—allows you to pre-set sizes for images
■■ ldap integration—links user authentication to the library’s ldap server
■■ mollom—spam filter and image captcha (part of spam control)
■■ nice menus—allows drop-down/right/left expandable menus
■■ nodeblock—allows you to specify a content type as being a block, which allows content creators to edit the block text and title without having to access the block administration page
■■ pathauto—automatically generates path aliases for various kinds of content (nodes, categories, users)
■■ printer-friendly, e-mail and pdf versions—allows you to configure any type of page to display links for print, e-mail, and pdf
■■ rules—allows site administrators to define conditionally executed actions based on occurring events; we use it to send email when new content is created and to hide some content fields from selected user roles
■■ taxonomy—enables us to assign subjects and other categories to content; the url paths and views use taxonomy
■■ 
webform—enables quick creation of forms and questionnaires
library services navigation: improving the online user experience
brian rennick
information technology and libraries | march 2019 14
brian rennick (brian_rennick@byu.edu) is aul for library it, brigham young university.
abstract
while the discoverability of traditional information resources is often the focus of library website design, there is also a need to help users find other services such as equipment, study rooms, and programs. a recent assessment of the brigham young university library website identified nearly two hundred services. many of these service descriptions were buried deep in the site, making them difficult to locate. this article will describe a web application that was developed to improve service discovery and to help ensure the accuracy and maintainability of service information on an academic library website.
introduction
the brigham young university library released a new version of its website in 2014. multiple usability studies were conducted to inform the design of the new site. during these studies, the web designers observed that when a user did not see what they were looking for on the homepage, they were likely to click on the “services” link as the next best option. the word services appeared to be an effective catch-all term. web designers asked themselves, “what is a library service?” they concluded that a library service could be anything public-facing that meets the needs of a user. using this broad definition, services could include:
• library materials—both digital and physical (e.g. books, dvds)
• material services (e.g. course reserve, interlibrary loan)
• equipment and technology (e.g. computers, cameras, tripods)
• help and guidance (e.g. research assistance, computer assistance)
• locations (e.g. group study rooms, classrooms, help desks)
• programs (e.g. 
friends of the library, lectures) because libraries offer so many diverse services, structuring a website to effectively promote them all brings many challenges. for instance, a common approach to presenting library services on a website is to have a menu that lists a few of the most popular or important services. the last menu item will normally be a link to a web page for “other services” that provides a more comprehensive service list. such an all-inclusive listing of library services on a single web page can easily lead to information overload for users. where do services belong in a library website’s information architecture? determining the one correct path is not easy because there are multiple valid ways to organize services into web pages. services could be arranged by department, service category, user group (undergraduates, graduates, faculty, visitors, alumni), or any number of other ways. an ideal system would allow users to follow the path that makes the most sense to them. information technology and libraries | march 2019 15 user expectations for a single (google-like) search box add to the challenges for service listings.1 a single search box, also known as a metasearch system, web-scale discovery service, or federated search, combines search results from multiple library sources. a study at the university of colorado found that users expected to locate services by entering keywords into the single search box on the library’s homepage.2 for example, the users attempted to search for “interlibrary loan” and “chat with a librarian” using the single search box. it is unrealistic to expect all users to follow a specific series of links in order to find the one correct path to information about a service when they are accustomed to google-style searching. even when a user manages to locate the correct web page where a service is described, the pertinent information can still be difficult to pinpoint when service descriptions are buried in paragraphs. 
users need to be able to quickly perform a visual scan of a web page to locate service information. kozak and hartley suggest that “bulleted lists are easier to read, easier to search and easier to remember than continuous prose.”3 the ongoing maintenance of service listings poses another significant challenge. for large academic libraries, up-to-date service information is difficult to maintain because it is typically scattered throughout a website. each department may have its own set of web pages and service listings. department pages created and maintained by different individuals end up with inconsistent design, organization, and voice. services that are common to multiple departments will have duplicate listings with different descriptions. maintenance of accurate information becomes an issue as services change; tracking down all of the references to a discontinued or modified service requires extensive searching of the website. literature review studies and commentaries regarding the information architecture of academic library websites have been covered extensively in the literature.4 a few articles specifically address the way that library services are organized on websites. library services are a significant component of academic library website content. clyde studied one hundred library websites from thirteen countries in order to compare common features and to determine some of the purposes for a library website.5 purposes for the sites varied. some focused on providing information about the library and its services while others functioned more like a portal, providing links to internet resources. cohen and still developed a list of core content for academic library websites by examining pages from university and two-year college sites.6 they organized the content into categories: library information, reference, research, instruction, and functionalities. 
liu surveyed arl libraries to get an overview of the state of web page development.7 the subsequent spec kit identifies services commonly found on academic library websites. yang and dalal studied a random sample of academic library websites to see which web–based reference services were offered and how they were presented.8 they also examined the differing terminology used to describe the services. the choice of terminology used on library websites impacts the findability of services. dewey compared academic websites from thirteen member libraries of a consortium to determine how findable service links were on the sites.9 the service links used in the evaluation covered “access, reference, information, and user education” categories. the study measured the number of clicks from the homepage that were required to find information about a service. dewey found library services navigation | rennick 16 https://doi.org/10.6017/ital.v38i1.10844 inconsistent use of terminology used to describe library services from one site to another. dewey posited that extensive use of library jargon could, in a sense, hide links from users. the overall conclusion was that the websites contained “too much information poorly placed.” a study of an academic library website by mcgillis and toms also found that participants struggled with terminology when attempting to locate services.10 the website reflected “traditional library structures” instead of using categories that were meaningful to users. the decision on where to place library services on a website is an important step in the design process. as part of their proposal to establish a benchmarking program for academic library websites, hightower, shih, and tilghman created classifications for the web pages they studied.11 library services were assigned to the “directional” category instead of representing a separate category. 
vaughan described a history of changes to an academic website that took place from 1996–2000.12 an interesting change was that, after multiple redesigns, the web designers combined two categories into a single “library services” category in order to simplify top level navigation on the home page. comeaux studied thirty-seven academic library websites to see how design elements evolved between 2012 and 2015.13 a portion of the study compiled terms used as navigation labels. the term “about” was the most common navigation label followed by “services” as the second most common. use of the term “services” as a main navigation label increased in popularity from 2012 to 2015. several researchers suggest organizing library services into web pages or portals that target different audiences. gullikson et al. studied usability issues related to the information architecture of an academic website and discovered that study participants followed different paths in their attempts to locate service information on the site.14 some users found items easily while others were unsuccessful. menu labels were not universally understood. the researchers identified a need for multiple access points to information in order to accommodate different mental models. they suggested employing multiple information organizational schemes, such as categorizing links by function, frequency of use, and target user group. adams and cassner analyzed the websites of arl libraries to see how services for distance education students and faculty were presented.15 they recommend strategies for helping distance students navigate the website, including maintaining a web page designed specifically for distance students that avoided jargon and clearly described services. 
detlor and lewis envisioned academic library websites as “sophisticated guidance systems which support users across a wide spectrum of information seeking behaviors—from goal-directed search to wayward browsing.”16 they reviewed arl library websites to see which important features were present or absent. their coding methodology was adopted by gardner, juricek, and xu in their study of how library web pages can meet the needs of campus faculty.17 liu proposed a conceptual model for an improved academic library website that would be organized into portals designed for specific user groups, such as undergraduates, faculty, or visitors.18 some of the arl websites studied by the researcher already implemented portals by user group. a more recent approach for locating library services has been to include website search results when using the single search from the homepage. for example, the north carolina state libraries website includes library-wide site search results when using the single search.19 the wayne state university libraries single search displays results from a university-wide site search.20 information technology and libraries | march 2019 17 an influential report produced by andrew pace provides practical advice for designing library websites.21 in the report, pace described the library services that should be included on a site and stressed that website design affects the discoverability and delivery of these services: “whether requiring minimal maintenance or constant upkeep, the extensibility of the design and flexibility of a site’s architecture ultimately saves the library time, money, hassle, and user frustration.”22 the web application described in this article aims to achieve these goals in terms of service discoverability and website maintainability. 
a services web application
in an effort to tackle the challenges of services navigation and maintenance, the brigham young university library developed a web application for organizing services that allows multiple routes to service information. the application, known internally as “services,” was built using django, an open-source python web framework. the application incorporates a comprehensive list of library services and a map of service relationships. each service is assigned one or more categories, locations, and service areas within the application:
• categories and subcategories—broad groupings of services (e.g., research help, for faculty, printing and copying)
• locations—physical or virtual places within the library where services can be found (e.g., help desks, rooms)
• service areas—library departments or other organizational units that offer services (e.g., humanities, special collections)
services can have multiple categories, locations, and service areas and some service areas have multiple locations within the library (see figure 1). service information can also include links to related services. these links facilitate the serendipitous discovery of additional services (see figure 2). service information is stored in a relational database that joins connected entities together. an html template is used to format service information from the database in order to generate web pages for each of the services. maintaining the data in this manner ensures that changes made to service information in the database flow through to all of the associated web pages. adding or modifying entries automatically triggers the generation of new html for only the impacted services. generating static content by using triggers keeps the web pages up-to-date without the performance hit of real-time dynamic page generation.
figure 1. 
sample map illustrating relationships between services (on the left side) and service area locations (on the right side).
figure 2. sample map of how related service web pages are linked.
user scenarios
the following examples of navigation paths typify how the web application can help users locate services. in each case there are multiple alternative paths that could be followed to find the same information.
scenario 1. a student is looking for a computer that has music notation software installed. clicking the “services” link on the library homepage leads to a summary of library services. the student clicks the “public computers” link found under the “featured services” heading and is presented with detailed information about the computers. in the bullet points listed in the “overview” section there is a link to “see the list of software available on these computers.” following this link the student is able to learn that the desired software is available in the library’s music and dance media lab.
scenario 2. while visiting a web page for the faculty delivery service, a professor notices a link to the category “for faculty.” following the link leads to a page that highlights some of the library services provided exclusively to campus faculty. the professor clicks the link “faculty expedited book orders” and is taken to a web page that describes the service and provides an online form for requesting a book.
scenario 3. a student would like to borrow a camera for a class project. entering “digital cameras” into the main search box on the library homepage produces a link to “digital cameras (dslr)” listed under the “library services” heading at the top of the search results. following the link leads to a web page with information about the library’s digital camera offerings. 
the web page provides links to related services, including the library’s video production studio. the student decides to reserve the studio instead of checking out a camera.

anatomy of a services web page

each service web page is divided into sections to help users quickly find the type of information they seek. each section represents an information module with a specific purpose and an identifying design; the sections are color coded and displayed in a consistent order on each page. this helps users to find the same kind of information in the same place on every service page. major sections include:

• title
• description
• keywords
• hours
• location
• contact
• overview
• call to action
• frequently asked questions
• additional resources
• related services
• categories

a few of the sections require an explanation. the hours, location, and contact sections are links located directly below the title and description. clicking these links displays the section content. the overview section is intended to provide brief bullet points near the top of the web page so that users can quickly scan the most important information about the service. the call to action section follows these bullet points and contains one or more links to web applications that facilitate use of the service. examples of calls to action include:

• place a hold
• reserve a group study room
• register for an advanced writing class
• submit an interlibrary loan request

most of the sections are optional since not all sections apply to every service. the services web pages can also include raw html that is embedded in a section in order to provide unique formatting for those services that do not neatly fit the standard layout. for example, the public computers page includes a section that displays the current availability of computers for each floor of the library.
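
the relational storage and the sectioned page layout described so far can be sketched in a few lines of python. this is a minimal illustration under stated assumptions, not the library’s django code: the table names, column names, section names, and the `build_demo_db` and `render_service_page` helpers are all hypothetical, and an in-memory sqlite database stands in for whatever database the application actually uses.

```python
import sqlite3
from html import escape

# Hypothetical schema; the article does not publish the real one.
SCHEMA = """
CREATE TABLE service (
    id INTEGER PRIMARY KEY, slug TEXT UNIQUE, name TEXT,
    description TEXT, overview TEXT, faq TEXT
);
CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
CREATE TABLE service_category (
    service_id INTEGER REFERENCES service(id),
    category_id INTEGER REFERENCES category(id)
);
"""

# Optional sections are always emitted in this fixed order.
SECTION_ORDER = ["overview", "faq", "categories"]


def build_demo_db():
    """Create an in-memory database holding one made-up service."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO service VALUES (1, 'public-computers', 'public computers', "
        "'computers for general use', 'available on every floor', NULL)"
    )
    conn.executemany("INSERT INTO category VALUES (?, ?)",
                     [(1, "research help"), (2, "printing and copying")])
    conn.executemany("INSERT INTO service_category VALUES (?, ?)",
                     [(1, 1), (1, 2)])
    return conn


def render_service_page(conn, slug):
    """Build the HTML for one service, skipping sections with no content."""
    row = conn.execute(
        "SELECT id, name, description, overview, faq "
        "FROM service WHERE slug = ?", (slug,)
    ).fetchone()
    if row is None:
        raise KeyError(slug)
    service_id, name, description, overview, faq = row
    # Join through the link table so one service can carry many categories.
    categories = [r[0] for r in conn.execute(
        "SELECT c.name FROM category c "
        "JOIN service_category sc ON sc.category_id = c.id "
        "WHERE sc.service_id = ? ORDER BY c.name", (service_id,))]
    sections = {"overview": overview, "faq": faq,
                "categories": ", ".join(categories)}
    parts = [f"<title>{escape(name)}</title>",
             f"<h1>{escape(name)}</h1>",
             f"<p>{escape(description)}</p>"]
    for key in SECTION_ORDER:
        if sections[key]:  # optional sections are omitted when empty
            parts.append(
                f'<section class="{key}">{escape(sections[key])}</section>')
    return "\n".join(parts)
```

because each page is rebuilt from database rows, renaming a category once and regenerating the affected pages updates every page that lists it, which is the kind of synchronization the application relies on.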
the look and feel of services web pages can be extended to other pages on the library website. library departments have web pages that provide information about personnel, mission, location, and services offered. some of these pages have been converted to a format that resembles the services layout in an effort to add cohesiveness to the library website. the department pages have sections similar to services pages such as hours, location, contact information, and an overview with bullet points. the pages can automatically display links to all of the services available in the department. because department pages are part of the services application and are connected to services through a relational database, changes to service information remain in sync across the entire website. this helps alleviate the problem of out-of-date department web pages.

searching for services

services can be located by submitting a query in a search box or by following links found on the main services web page. the services search engine matches words from the query with words found in a service name or associated tags. each service is tagged with keywords, phrases, or synonyms to increase the likelihood of successful searching. users may not be familiar with library jargon and will search for services using a variety of terms. it is impossible to name library services in a way that is understood by everyone, especially since academic library services target both students and faculty. a study on library services and user-centered language found that: “the choices of the graduate students did not always mirror those of the faculty. this highlights the inherent challenge of marketing services—the target audiences for the same service can have very different opinions and preferences.”23 services can have multi-word phrases assigned in addition to individual keywords. for example, the data management service has the following synonyms assigned: data curation, data management plan, and dmp.
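
the word-matching behavior described here can be illustrated with a small python sketch. it is an assumption-laden example rather than the application’s search code: the `SERVICES` list, the `tokenize` and `search_services` functions, the scoring rule (count of shared words), and the camera keywords are all hypothetical; only the data management synonyms come from the text.

```python
import re

# Hypothetical in-memory service list; the real data lives in the database.
SERVICES = [
    {"name": "data management", "slug": "datamanagement",
     "keywords": ["data curation", "data management plan", "dmp"]},
    {"name": "digital cameras (dslr)", "slug": "digital-cameras",
     "keywords": ["camera", "photography"]},  # made-up keywords
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def search_services(query, services=SERVICES):
    """Return slugs of services whose name or keywords share words with the query."""
    query_words = set(tokenize(query))
    hits = []
    for service in services:
        searchable = set(tokenize(service["name"]))
        for keyword in service["keywords"]:
            # Multi-word phrases contribute each of their words.
            searchable.update(tokenize(keyword))
        score = len(query_words & searchable)
        if score:
            hits.append((score, service["slug"]))
    # Best match first; ties are broken alphabetically by slug.
    hits.sort(key=lambda hit: (-hit[0], hit[1]))
    return [slug for _, slug in hits]
```

with this sketch, a query for the abbreviation "dmp" finds the data management service even though the abbreviation does not appear in the service name, which is the purpose of assigning synonyms.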
new keywords and phrases can be identified by reviewing search queries in the system log files and by conducting usability studies.

figure 3. the interlibrary loan service web page.

in addition to using a search box on the services web pages, users can search for services using the single search box on the library’s homepage. the single search box returns a link to matching services as part of search results when the search engine recognizes services keywords in a query. the services application has an api that makes keywords and other service information available to the single search box application.

figure 4. search for a service from the single search box on the library’s homepage.

figure 5. json results from the services api.

{
  "status": 200,
  "results": [
    {
      "url": "https://lib.byu.edu/services/datamanagement/",
      "type": "service",
      "name": "data management",
      "slug": "datamanagement",
      "description": "through our institutional repository scholarsarchive, faculty can store research data. this is particularly useful for faculty who must develop data management plans for research projects funded by grants.",
      "keywords": ["data curation", "dmp", "data management plan", "data storage", "open access"]
    }
  ],
  "total": 1,
  "query": "dmp"
}

to facilitate browsing, services are organized into three groups on the services web page: featured services, categories, and service areas. the featured services group highlights the most commonly sought-after services. categories are organized by the type of service or the target audience. the service areas group directs users to services available in library departments or units. the services web page does not list every service but instead directs users to web pages based on categories or service areas that list individual services.

the services search feature can also include links to non-services. for example, library policies are not services yet users occasionally search for them on the services page (the library website posts policy documents on the about page). in order to minimize user frustration with searching, links to non-services are included in search results so that users can be redirected to the desired pages.

to help with optimization for external search engines such as google, each services page has a user-friendly url that clearly identifies the service. for example, the 3d printer service has the url https://lib.byu.edu/services/3d-printers/. each web page also includes the service name in an embedded html title tag.

conclusion

adopting a broad view of what represents a service has altered the library’s approach to the information architecture of the website. the services web application offers several innovations for improving library service discoverability and maintenance including:

• standardized organization of service information
• attaching keywords/aliases to service descriptions
• an api for integration with the single search box on the homepage
• links to related services
• generation of web pages from a relational database

usability tests were conducted throughout the development of the services application. follow-up assessments are planned for the future in order to verify that the application works as expected and to identify potential adjustments to the design. the services application shows promise as an effective tool for facilitating the discovery of services and increasing the reliability and uniformity of service information.

acknowledgements

the author gratefully acknowledges the contributions of grant zabriskie for the original concept and design of the services application and ben crowder for the implementation.
references

1 cory lown, tito sierra, and josh boyer, “how users search the library from a single search box,” college & research libraries 74, no. 3 (may 2013): 227–41, https://doi.org/10.5860/crl-321.
2 rice majors, “comparative user experiences of next-generation catalogue interfaces,” library trends 61, no. 1 (summer 2012): 186–207, https://doi.org/10.1353/lib.2012.0029.
3 marcin kozak and james hartley, “writing the conclusions: how do bullet-points help?” journal of information science 37, no. 2 (feb. 2011): 221–24, https://doi.org/10.1177/0165551511399333.
4 barbara a. blummer, “a literature review of academic library web page studies,” journal of web librarianship 1, no. 1 (2007): 45–64, https://doi.org/10.1300/j502v01n01_04; galina letnikova, “usability testing of academic library web sites: a selective annotated bibliography,” internet reference services quarterly 8, no. 4 (2004): 53–68, https://doi.org/10.1300/j136v08n04_04.
5 laurel a. clyde, “the library as information provider: the home page,” the electronic library 14, no. 6 (dec. 1996): 549–58, https://doi.org/10.1108/eb045522.
6 laura b. cohen and julie m. still, “a comparison of research university and two-year college library web sites: content, functionality, and form,” college & research libraries 60, no. 3 (1999): 275–89, https://doi.org/10.5860/crl.60.3.275.
7 yaping peter liu, “web page development and management: a spec kit,” association of research libraries (1999): https://hdl.handle.net/2027/mdp.39015042087232.
8 sharon q. yang and heather a. dalal, “delivering virtual reference services on the web: an investigation into the current practice by academic libraries,” journal of academic librarianship 41, no. 1 (2015): 68–86, https://doi.org/10.1016/j.acalib.2014.10.003.
9 barbara i. dewey, “in search of services: analyzing the findability of links on cic university libraries’ web pages,” information technology and libraries 18, no. 4 (1999): 210–13, http://www.ala.org/sites/ala.org.acrl/files/content/conferences/pdf/dewey99.pdf.
10 louise mcgillis and elaine g. toms, “usability of the academic library web site: implications for design,” college & research libraries 62, no. 4 (july 2001): 355–67, https://doi.org/10.5860/crl.62.4.355.
11 christy hightower, julie shih, and adam tilghman, “recommendations for benchmarking web site usage among academic libraries,” college & research libraries 59, no. 1 (jan. 1998): 61–79, https://crl.acrl.org/index.php/crl/article/viewfile/15182/16628.
12 jason vaughan, “three iterations of an academic library web site,” information technology and libraries 20, no. 2 (june 2001): 81–92, https://search.proquest.com/docview/215832160.
13 david j. comeaux, “web design trends in academic libraries—a longitudinal study,” journal of web librarianship 11, no. 1 (2017): 1–15, https://doi.org/10.1080/19322909.2016.1230031.
14 shelly gullikson et al., “the impact of information architecture on academic web site usability,” the electronic library 17, no. 5 (oct. 1999): 293–304, https://doi.org/10.1108/02640479910330714.
15 kate e. adams and mary cassner, “content and design of academic library web sites for distance learners: an analysis of arl libraries,” journal of library administration 37, no. 1/2 (2002): 3–13, https://doi.org/10.1300/j111v37n01_02.
16 brian detlor and vivian lewis, “academic library web sites: current practice and future directions,” journal of academic librarianship 32, no. 3 (may 2006): 251–58, https://doi.org/10.1016/j.acalib.2006.02.007.
17 susan j. gardner, john eric juricek, and f. grace xu, “an analysis of academic library web pages for faculty,” journal of academic librarianship 34, no. 1 (jan. 2008): 6–24, https://doi.org/10.1016/j.acalib.2007.11.006.
18 shu liu, “engaging users: the future of academic library web sites,” college & research libraries 69, no. 1 (jan. 2008): 6–27, https://doi.org/10.5860/crl.69.1.6.
19 kevin beswick, “quicksearch,” north carolina state university libraries, accessed nov. 28, 2018, https://www.lib.ncsu.edu/projects/quicksearch.
20 cole hudson and graham hukill, “one-to-many: building a single-search interface for disparate resources,” in exploring discovery: the front door to your library’s licensed and digitized content, ed. kenneth j. varnum (chicago: ala editions, 2016), 141–53, http://digitalcommons.wayne.edu/libsp/114.
21 andrew k. pace, “optimizing library web services: a usability approach,” library technology reports 38, no. 2 (mar./apr. 2002): 1–87, https://doi.org/10.5860/ltr.38n2.
22 ibid.
23 allison r. benedetti, “promoting library services with user-centered language,” portal: libraries and the academy 17, no. 2 (apr. 2017): 217–34, https://doi.org/10.1353/pla.2017.0013.

exploratory subject searching in library catalogs: reclaiming the vision

julia bauder and emma lange

information technology and libraries | june 2015 92

abstract

librarians have had innovative ideas for ways to use subject and classification data to provide an improved online search experience for decades, yet after thirty-plus years of improvements in our online catalogs, users continue to struggle with narrowing down their subject searches to provide manageable lists containing only relevant results. this article reports on one attempt to rectify that situation by radically reenvisioning the library catalog interface, enabling users to interact with and explore their search results in a profoundly different way. this new interface gives users the option of viewing a graphical overview of their results, grouped by discipline and subject.
results are depicted as a two-level treemap, which gives users a visual representation of the disciplinary perspectives (as represented by the main classes of the library of congress classification) and topics (as represented by elements of the library of congress subject headings) included in the results.

introduction

reading library literature from the early days of the opac era is simultaneously inspiring and depressing. the enthusiasm that some librarians felt in those days about the new possibilities that were being opened by online catalogs is infectious. elaine svenonius envisioned a catalog that could interactively guide users from a broad single-word search to the specific topic in which they were really interested.1 pauline cochrane conceived of a catalog that could group results on similar aspects of a given subject, showing the user a “systematic outline” of what was available on the subject and allowing the user to narrow their search easily.2 marcia bates even pondered whether “any indexing/access apparatus that does not stimulate, intrigue, and give pleasure in the hunt is defective,” since “people enjoy exploring knowledge, particularly if they can pursue mental associations in the same way they do in their minds. . . . should that not also carry over into enjoying exploring an apparatus that reflects knowledge, that suggests paths not thought of, and that shows relationships between topics that are surprising?”3 however, looking back thirty years later, it is dispiriting to consider how many of these visions have not yet been realized.
the following article reports on one attempt to rectify that situation by radically reenvisioning the library catalog interface, enabling users to interact with and explore their search results in a profoundly different way. the idea is to give users the option of viewing a graphical overview of their results, grouped by discipline and subject. this was achieved by modifying a vufind-based discovery layer to allow users to choose between a traditional, list-based view of their search results and a visualized view. in the visualized view, results are depicted as a two-level treemap, which gives users a visual representation of the disciplinary perspectives (as represented by the main classes of the library of congress classification [lcc]) and topics (as represented by elements of the library of congress subject headings [lcsh]) included in the results. an example of this visualized view can be seen in figure 1.

julia bauder (bauderj@grinnell.edu) is social studies and data services librarian, and emma lange (langemm@grinnell.edu) is an undergraduate student and former library intern, grinnell college, grinnell, iowa.

exploratory subject searching in library catalogs: reclaiming the vision | bauder and lange doi: 10.6017/ital.v34i2.5888 93

figure 1. visualization of the results for a search for “climate change.”

subsequent sections of this paper summarize the library-science and computer-science literature that provides the theoretical justification for this project, explain how the visualizations are created, and report on the results of usability testing of the visual interface with faculty, academic staff, and undergraduate students.
literature review

exploratory subject searching in library catalogs

since charles ammi cutter published his rules for a printed dictionary catalogue in 1876, most library catalogs have been premised on the idea that users have a very good idea of what they are looking for before they begin to interact with the catalog.4 in this classic view, users are either conducting known-item searches (they know the titles or the author of the books they want to find) or they know the exact subject on which they are interested in finding books. yet research has shown that known-item searches are only about half of catalog searches,5 and that users often have a very difficult time expressing their information needs with enough detail to construct a specific subject search. instead, much of the time, users approach the catalog with only a vaguely formulated information need and an even vaguer sense of what words to type into the catalog to get the resources that would solve their information need.6

even in the earliest days of the opac era, librarians were aware of this problem. some of them, including elaine svenonius and pauline cochrane, speculated about better use of subject and classification data to try to help users who enter too-short, overly broad searches focus their results on the information that they truly want.
one of cochrane’s many ideas on this topic was to use subject and classification data “to present a systematic outline of a subject,” which would let users see all of the different aspects of that subject, as reflected in the library’s classification system and subject headings, and the various locations where those materials could be found in the library.7 svenonius suggested using library classifications to help narrow users’ searches to appropriate areas of the catalog. for example, she suggests, if a user enters “freedom” as a search term, the system might be programmed to present to the user contexts in which “freedom” is used in the dewey decimal classification, such as “freedom of choice” or “freedom of the press.” once the user selects one of these phrases, svenonius continued, the system could present the user with additional contextual information, again allow the user to specify which context is desired, and then guide the user to the exact call number range for information on the topic.
she concluded, “thus by contextualizing vague words, such as freedom, within perspective hierarchies, the computer might guide a user from an ineptly or imprecisely articulated search request to one that is quite specific.”8

ideas such as these had little impact on the design of production library catalogs until the late 1990s, when a dutch company, medialab solutions, began developing aquabrowser, which features a word cloud composed of synonyms and other words related to the search term and allows users to refocus their search by clicking on these words.9 aquabrowser became available in the united states in the mid-2000s, shortly before north carolina state university launched its endeca-based catalog in 2006.10

while aquabrowser’s word cloud is certainly visually striking, the feature that these and most of the subsequent “next-generation” library catalogs implement that has had the most impact on search behavior is faceting. facets, while not as sophisticated as the systems envisioned by svenonius and cochrane, are partial solutions to the problems they lay out. facets can serve to give users a high-level overview of what is available on a topic, based on classification, format, period, or other factors. they can also help guide a user from an impossibly broad search to a more focused one.
various studies have shown that faceted interfaces are effective at helping users narrow their searches, as well as helping them discover more relevant materials than they did when performing similar tasks on nonfaceted interfaces.11 however, studies have also shown that users can become overwhelmed by the number and variety of facets available and the number of options shown under each facet.12

visual interfaces to document corpora

when librarians were pondering how to create a better online library catalog, computer scientists were investigating the broader problem of helping users to navigate and search large databases and collections of documents effectively. visual interfaces have been one of the methods computer scientists have investigated for providing user-friendly navigation, with perhaps the most prominent early advocate for visual interfaces being ben shneiderman.13 in recent years, shneiderman and other researchers have built and tested various types of experimental visual interfaces for different forms of information-seeking.14 however, with a few exceptions, most of these visual interfaces have remained in a laboratory rather than a production setting.15 with the exception of the “date slider,” a common interface feature that displays a bar graph showing dates related to the search results and allows users to slide handles to include or exclude times from their search results, few current document search systems present users with any kind of visual interface.

method

the grinnell college libraries use vufind, open-source software originally developed at villanova university as a discovery layer to use over a traditional ils.
vufind in turn makes use of apache solr, a powerful open-source indexing and search platform, and solrmarc, code developed within the library community that facilitates indexing marc records into solr. using solrmarc, marc fields and subfields are mapped to various fields in the solr index; for example, the contents of marc field 020, subfield a, and field 773, subfield z, are both mapped to a solr index field called “isbn.” more than fifty solr fields are populated in our index. our visualization system was built on top of vufind’s solr index and visualizes data taken directly from the index.

the visualizations are created in javascript using the d3.js visualization library, and they are designed to implement shneiderman’s visual information seeking mantra: “overview first, zoom and filter, then details-on-demand.”16 the goal was to give users the option of viewing a graphical overview of their results, grouped by disciplinary perspective and topic, and then allow them to zoom in on the results from specific perspectives or on specific topics. once they have used the interactive visualization to narrow their search, they can choose to see a traditional list of results with full bibliographic details about the items. this would, ideally, provide a version of the systematic outline that cochrane envisioned. it should also support users as they attempt to narrow down their search results and focus on a specific aspect of their chosen subject without overwhelming them with long lists of results or of facets.
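
the marc-to-solr field mapping described earlier in this section can be illustrated with a short python sketch. this is not solrmarc’s configuration syntax or code; the `FIELD_MAP` dictionary, the simplified record layout (a list of tag and subfield pairs), the `index_record` function, and the sample values are assumptions made for illustration. only the isbn mapping (field 020 subfield a and field 773 subfield z) comes from the text; the topic mapping is a guess added to show a second index field.

```python
# Illustrative mapping only: index field -> list of (MARC tag, subfield code).
FIELD_MAP = {
    "isbn": [("020", "a"), ("773", "z")],  # mapping stated in the article
    "topic": [("650", "a")],               # assumed for illustration
}


def index_record(marc_fields, field_map=FIELD_MAP):
    """Collect subfield values from a simplified MARC record into index fields.

    marc_fields is a list of (tag, [(subfield_code, value), ...]) pairs.
    Index fields with no matching values are left out of the result.
    """
    doc = {}
    for index_field, sources in field_map.items():
        values = []
        for tag, code in sources:
            for field_tag, subfields in marc_fields:
                if field_tag != tag:
                    continue
                for sub_code, value in subfields:
                    if sub_code == code:
                        values.append(value)
        if values:
            doc[index_field] = values
    return doc


# A made-up record with an ISBN in 020$a and another in 773$z.
sample_record = [
    ("020", [("a", "9780262033848")]),
    ("245", [("a", "a sample title")]),
    ("773", [("t", "host item"), ("z", "9780262533058")]),
]
sample_doc = index_record(sample_record)
```

in this sketch both isbn values end up in the same index field, so a search on either number would retrieve the record, which is the practical effect of mapping several marc locations to one solr field.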
currently, we are visualizing values of two fields, one containing the first letter of the items’ library of congress classification (lcc) numbers and the other containing elements of the items’ library of congress subject headings (lcsh). this data is visualized as a two-level treemap.17 first, large boxes are drawn representing the number of items matching the search within each letter of the lcc. within the largest of these boxes, smaller boxes are drawn showing the most common elements of the subject headings for items matching that search within that lcc main class. less common subject heading elements are combined into an additional small box, labeled “x more topics”; clicking on that box zooms in so that users only see results from one lcc main class, and it displays all of the lcsh headings applied to items in that group. similarly, users can click on any of the smaller lcc boxes, which do not contain lcsh boxes in the original visualization, to zoom in on that lcc main class and see the lcsh subject headings for it. both the large and the small boxes are sized to represent what proportion of the results were in that lcc main class or had that lcsh subject heading.

this is easier to explain with a concrete example. let’s say a student were to search for “climate change” and click on the option to visualize the results. you can see what this looks like in figure 1. instead of seeing a list of nearly two thousand books, the student now sees a visual representation of the disciplinary perspectives (as represented by the main classes of the lcc) and topics (as represented by elements of the lcsh) included in the results.
users could click to zoom in on any main class within the lcc to see all of the topics covered by books in that class, as in figure 2, where the student has zoomed in on “s – agriculture.” or users could click on any topic facet to see a traditional results list of books with that topic facet in that main class. at any zoom level, users could choose to return to the traditional results list by clicking on the “list results” option.18

we launched this feature in our catalog midway through the spring 2014 semester. formal usability testing was completed with five advanced undergraduates, three staff, and two faculty members in the summer of 2014. (see appendix a for the outline of the usability test.) one first-year student completed usability testing in the fall 2014 semester. the usability study asked participants to complete a set list of nine specific, predetermined tasks. some tasks involved the use of now-standard catalog features, such as saving results to a list and emailing results to oneself, while about half of the tasks involved navigation of the visualization tool, which was entirely new to the participants. each participant received the same tasks and testing experience regardless of their status as a student, faculty, or staff, and each academic division was represented among the participants.

figure 2. visualization of the results for a search for “climate change,” filtered to show only results with library of congress classification numbers starting with s.
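
the two-level grouping behind the treemap can be sketched in python (the production visualization is written in javascript with d3.js, so this is an illustration of the data shaping only). the `build_treemap` function, the item layout, the `top_topics` parameter, and the sample items are hypothetical. for simplicity, this sketch lists topics under every lcc class, whereas the interface described above initially draws topic boxes only within the largest classes.

```python
from collections import Counter, defaultdict


def build_treemap(items, top_topics=2):
    """Group items by LCC main class, then by subject-heading element.

    Each item is a dict with an "lcc" call number and a list of "topics".
    Returns a nested name/value/children structure of the general kind
    that treemap layouts consume.
    """
    by_class = defaultdict(list)
    for item in items:
        # The first letter of the call number is the LCC main class.
        by_class[item["lcc"][0].upper()].append(item)

    children = []
    for letter, members in by_class.items():
        topic_counts = Counter(t for m in members for t in m["topics"])
        ranked = sorted(topic_counts.items(), key=lambda kv: (-kv[1], kv[0]))
        topic_nodes = [{"name": name, "value": count}
                       for name, count in ranked[:top_topics]]
        rest = ranked[top_topics:]
        if rest:
            # Less common topics are rolled up into one "x more topics" box.
            topic_nodes.append({"name": f"{len(rest)} more topics",
                                "value": sum(count for _, count in rest)})
        children.append({"name": letter, "value": len(members),
                         "children": topic_nodes})

    # Largest classes first, so box size follows the share of results.
    children.sort(key=lambda node: (-node["value"], node["name"]))
    return {"name": "results", "children": children}


# Made-up search results for illustration.
sample_items = [
    {"lcc": "S589.7", "topics": ["crops and climate", "agriculture"]},
    {"lcc": "S600", "topics": ["agriculture", "soils"]},
    {"lcc": "QC981", "topics": ["climatic changes"]},
]
sample_tree = build_treemap(sample_items, top_topics=2)
```

zooming in on one class, as in figure 2, would correspond to calling the same grouping on only that class’s items with a larger `top_topics` value so that every topic gets its own box.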
results

usability testing revealed no major obstacles in the way of users’ ability to navigate the visualization feature; the visualized search results were quickly deciphered by the participants with the assistance of the context set by the study’s outlined tasks. familiarity with library catalogs in general, and the grinnell college libraries catalog in particular, showed no marked impact on users’ performance. no particular user group performed as an outlier with regard to users’ general ability to complete tasks or the time required to do so.

the most common issue to arise during the session concerned the visualization’s truncated text, which appears in the far left column of results when the descriptor text contains too many characters for the space allocated. (an example of this truncated text can be seen in figure 1.) the subject boxes appearing in the furthest left column contain the fewest results, and therefore receive the least space within the visualization. this limited space sometimes results in truncated text. the full text can be viewed by hovering over the truncated text box, but few users discovered this capability. another common concern involved a participant’s ability to switch their search results from the default list view to the visualized view. all participants were capable of selecting the “visualize these results” button required to produce the visualization, but a handful of participants expressed that they feared they would not find that option if they were not prompted to do so.
participants remarked that the visualization initially appeared daunting, but they quickly became comfortable navigating the results. most participants, including staff, stated that they found the tool useful and intended to use it in the future during the course of their typical work at the college.

conclusion

librarians have had innovative ideas for ways to use subject and classification data to provide an improved online search experience for decades, yet after thirty-plus years of improvements in online catalogs, users continue to struggle with narrowing down their searches to produce manageable lists containing only relevant results.19 computer scientists have been advocating for interfaces to support visual information-seeking since the 1980s. finally, hardware and software have improved to the point where many of these ideas can be implemented feasibly, even by relatively small libraries. now is the time to put some of them into production and see how well they work for library users. the particular visualizations reported in this article may or may not be the best possible visualizations of bibliographic data, but we will never know which of these ideas might prove to be the revolution that library discovery interfaces need until we try them.

appendix a. usability testing instrument

introductory questions

before we look at the site, i’d like to ask you just a few quick questions.

- have you searched for materials using the grinnell college libraries’ website before? if so, what for and when?
(for students only: could you please estimate how many research projects you’ve done at grinnell college using the library catalog?)

in the grinnell college libraries, we’re testing out a new tool in our catalog that presents search results in a different way than you are used to. now i’m going to read you a short explanation of why we created this tool and what we hope the tool will do for you before we start the test.

research is a conversation: a scholar reads writings by other scholars in the field, then enters into dialogue with them in his or her own writing. most of the time, these conversations happen within the boundaries of a single discipline, such as chemistry, sociology, or art history, even when many disciplines are discussing similar topics. but when you do a search in a library catalog, writings that are part of many different conversations are all jumbled together in the results. it’s like being thrown into one big room where all of these scholars, from all of these different disciplines, are talking over each other all at once. our new visualization tool aims to help you sort all of these writings into the separate conversations in which they originated.

scenarios

now i am going to ask you to try doing some specific tasks using 3search. you should read the instructions aloud for all tasks individually prior to beginning each. and again, as much as possible, it will help us if you can try to think out loud as you go along.

please begin by reading the first scenario aloud and then begin the first scenario. if you are unsure whether you finished the task or not, please ask me. i can confirm if the task has been completed.
once you are done with scenario 1, please continue onto scenario 2 by reading it aloud and then beginning the task. continue this process until all scenarios are finished. if you cannot complete a task, please be honest and try to explain briefly why you were unsuccessful and continue to the next.

1. pretend that you are writing a paper about issues related to privacy and the internet. do a search in 3search with the words “privacy internet.”
2. please select the first worldcat result and attempt to determine whether you have access to the full text of this book. if not, please indicate where you could request the full text through the interlibrary loan service.
3. go back to your initial search results. please choose “explore these results” of the ebsco database results. choose an article. if you have unlimited texting, have the article’s information texted to your cell phone. then, add the article to a new list for future reference throughout this project.
4. go back to your initial search results. for grinnell college’s collections results, click on the “explore these results” link. then click on the “visualize results” link to visualize the results. which disciplines appear to have the greatest interest in this topic?
5. when privacy and the internet are discussed in the context of law, what are some of the topics that are frequently covered in these discussions?
6. one specific topic you are considering is the legal issues around libel and slander on the internet. how many resources do the libraries have on that specific topic?
7. click on “q – science,” to see the results authored by theoretical computer scientists. based on these results, what are some of the topics that are frequently covered in their discussions when these computer scientists discuss privacy and the internet?
8. pretend that you are writing this paper for a computer science class and you are supposed to address your topic from a computer science perspective. please narrow your results to only show results that are in the format of a book. based on this new visualization, what might be some good topics to consider?
9. add one of these books to the list you created in step 3. please email all of the items on this list to yourself.

debriefing

thank you. that is it for the computer tasks. i have a few quick questions for you now that you have gotten a chance to use the site.

1. what do you think about 3search? is it something that you would use? why or why not?
2. what is your favorite thing about 3search?
3. what is your least favorite thing about 3search?
4. did you find the visualization function useful? why or why not?
5. do you have any recommendations for changes to the way this site looks or works?

references

1. elaine svenonius, “use of classification in online retrieval,” library resources & technical services 27, no. 1 (1983): 76–80, http://alcts.ala.org/lrts/lrtsv25no1.pdf.
2. pauline a. cochrane, “subject access—free or controlled? the case of papua new guinea,” in redesign of catalogs and indexes for improved online subject access: selected papers of pauline a. cochrane (phoenix: oryx, 1985), 275. previously published in online public access to library files: conference proceedings: the proceedings of a conference held at the university of bath, 3–5 september 1984 (oxford: elsevier, 1985).
3. marcia bates, “subject access in online catalogs: a design model,” journal of the american society for information science 37, no. 6 (1986): 363, http://dx.doi.org/10.1002/(sici)1097-4571(198611)37:6<357::aid-asi1>3.0.co;2-h.
4. charles ammi cutter, rules for a printed dictionary catalog (washington, dc: government printing office, 1876).
5. david ward, jim hahn, and kirsten feist, “autocomplete as a research tool: a study on providing search suggestions,” information technology & libraries 31, no. 4 (2012): 6–19, http://dx.doi.org/10.6017/ital.v31i4.1930; suzanne chapman et al., “manually classifying user search queries on an academic library web site,” journal of web librarianship 7 (2013): 401–21, http://dx.doi.org/10.1080/19322909.2013.842096.
6. n. j. belkin, r. n. oddy, and h. m. brooks, “ask for information retrieval: part i. background and theory,” journal of documentation (1982): 61–71, http://dx.doi.org/10.1108/eb026722; christine borgman, “why are online catalogs still hard to use?,” journal of the american society for information science (1996): 493–503, http://dx.doi.org/10.1002/(sici)1097-4571(199607)47:7<493::aid-asi3>3.0.co;2-p; karen markey, “the online library catalog: paradise lost and paradise regained?,” d-lib magazine 13, no. 1/2 (2007), http://www.dlib.org/dlib/january07/markey/01markey.html.
7. cochrane, “subject access—free or controlled?,” 275.
8. svenonius, “use of classification in online retrieval,” 78–79.
9. jasper kaizer and anthony hodge, “aquabrowser library: search, discover, refine,” library hi tech news (december 2005): 9–12, http://dx.doi.org/10.1108/07419050510644329.
10. kristen antelman, emily lynema, and andrew pace, “toward a twenty-first century library catalog,” information technology & libraries 25, no. 3 (2006): 128–39, http://dx.doi.org/10.6017/ital.v25i3.3342.
11. tod olson, “utility of a faceted catalog for scholarly research,” library hi tech (2007): 550–61, http://dx.doi.org/10.1108/07378830710840509; jody condit fagan, “usability studies of faceted browsing: a literature review,” information technology and libraries 29, no. 2 (2010): 58–66, http://dx.doi.org/10.6017/ital.v29i2.3144.
12. kathleen bauer, “yale university library vufind test—undergraduates,” november 11, 2008, accessed september 9, 2014, http://www.library.yale.edu/usability/studies/summary_undergraduate.doc.
13. see, for example, ben shneiderman, “the future of interactive systems and the emergence of direct manipulation,” behaviour & information technology 1 (1982): 237–56, http://dx.doi.org/10.1080/01449298208914450; ben shneiderman, “dynamic queries for visual information seeking,” ieee software 11 (1994): 70–77, http://dx.doi.org/10.1109/52.329404.
14. see, for example, aleks aris et al., “visual overviews for discovering key papers and influences across research fronts,” journal of the american society for information science & technology 60 (2009): 2219–28, http://dx.doi.org/10.1002/asi.v60:11; furu wei et al., “tiara: a visual exploratory text analytic system,” in proceedings of the 16th acm sigkdd international conference on knowledge discovery and data mining (washington, dc: acm, 2010), 153–62, http://dx.doi.org/10.1145/1835804.1835827; cody dunne, ben shneiderman, robert gove, judith klavans, and bonnie dorr, “rapid understanding of scientific paper collections: integrating statistics, text analysis, and visualization,” journal of the american society for information science & technology 63 (2012): 2351–69, http://dx.doi.org/10.1002/asi.22652.
15. the most notable exception is carrot2 (http://search.carrot2.org), a search tool that will automatically cluster web search results and display visualizations of those clusters.
16. ben shneiderman, “the eyes have it: a task by data type taxonomy for information visualizations,” september 1996, accessed april 27, 2014, http://drum.lib.umd.edu/bitstream/1903/5784/1/tr_96-66.pdf.
17. ben shneiderman, “treemaps for space-constrained visualization of hierarchies: including the history of treemap research at the university of maryland,” institute for systems research, accessed october 6, 2014, http://www.cs.umd.edu/hcil/treemap-history.
18. to explore this feature in our catalog, go to https://libweb.grinnell.edu/vufind/search/home, do a search, and click on the “visualize results” link in the upper right.
19. a recent project information literacy report found that the two aspects of research that first-year students found most difficult were “coming up with keywords to narrow down searches” and “filtering and sorting through irrelevant results from online searches.” alison j. head, learning the ropes: how freshmen conduct course research once they enter college (project information literacy, december 5, 2013), http://projectinfolit.org/images/pdfs/pil_2013_freshmenstudy_fullreport.pdf, 15.

usability study of a library’s mobile website: an example from portland state university

kimberly d. pendell and michael s. bowman

usability study of a library’s mobile website | pendell and bowman 45

abstract

to discover how a newly developed library mobile website performed across a variety of devices, the authors used a hybrid field and laboratory methodology to conduct a usability test of the website. twelve student participants were recruited and selected according to phone type. results revealed a wide array of errors attributed to site design, wireless network connections, as well as phone hardware and software. this study provides an example methodology for testing library mobile websites, identifies issues associated with mobile websites, and provides recommendations for improving the user experience.

introduction

mobile websites are swiftly becoming a new access point for library services and resources. these websites are significantly different from full websites, particularly in terms of the user interface and available mobile-friendly functions. in addition, users interact with a mobile website on a variety of smartphones or other internet-capable mobile devices, all with differing hardware and software.
it is commonly considered a best practice to perform usability tests prior to the launch of a new website in order to assess its user friendliness, yet examples of applying this practice to new library mobile websites are rare. considering the variability of user experiences in the mobile environment, usability testing of mobile websites is an important step in the development process. this study is an example of how usability testing may be performed on a library mobile website. the results provided us with new insights on the experience of our target users.

in the fall of 2010, with the rapid growth of smartphone ownership nationwide, especially among college students, portland state university (psu) library decided to develop a mobile library website for its campus community. the library’s lead programmer and a student employee developed a test version of the website. this version of the website included library hours, location information, a local catalog search, library account access for viewing and renewing checked-out items, and access to reference services. it also included a “find a computer” feature displaying the availability of workstations in the library’s two computer labs.

kimberly d. pendell (kpendell@pdx.edu) is social sciences librarian, assistant professor, and michael s. bowman (bowman@pdx.edu) is interim assistant university librarian for public services, associate professor, portland state university library, portland, oregon.

information technology & libraries | june 2012 46

the basic architecture and design of the site were modeled on other existing academic library mobile websites that were appealing to the development team. the top-level navigation of the mobile website largely mirrored the full library website, utilizing the same language as the website when possible. the mobile website was built to be compatible with webkit, the dominant smartphone layout engine.
use of javascript on the website was minimized due to the varying levels of support for it on different smartphones, and flash was avoided entirely.

figure 1. home page of library mobile website, test version

we formed a mobile website team to further evaluate the test website and prepare it for launch. three out of four team members owned smartphones, either an iphone 3gs or an iphone 4. we soon began questioning how the mobile website would work on other types of phones, recognizing that hardware and software differences would likely impact user experience of the mobile website. performing a formal usability test using a variety of internet-capable phones quickly became a priority.

we decided to conduct a usability test for the new mobile website in order to answer the question: how user-friendly and effective is the new library mobile website on students’ various mobile devices?

literature review

smartphones, mobile websites, and mobile applications have dominated the technology landscape in the last few years. smartphone ownership has steadily increased, and a large percentage of smartphone owners regularly use their phone to access the internet. the pew research center reports that 52 percent of americans aged 18–29 own smartphones, and 81 percent of this population use their smartphone to access the internet or e-mail on a typical day. additionally, 42 percent of this population uses a smartphone as their primary online access point.1 the 2010 ecar study of undergraduate students and information technology found that 62.7 percent of undergraduate students own internet-capable handheld devices, an increase of 11.5 percent from 2009.
the 2010 survey also showed that an additional 11.3 percent of students intended to purchase an internet-capable handheld device within the next year.2 in this environment, academic libraries have been scrambling to address the proliferation of student-owned mobile devices; thus the number of mobile library websites is growing. the library success wiki, which tracks libraries with mobile websites, shows a 66 percent increase in the number of academic libraries in the united states and canada with mobile websites from august 2010 to august 2011.3

we reviewed articles about mobile websites in the professional library science literature and found that mobile website usability testing is only briefly mentioned. in their summary of current mobile technologies and mobile library website development, bridges, rempel, and griggs state that “user testing should be part of any web application development plan. you can apply the same types of evaluation techniques used in non-mobile applications to ensure a usable interface.”4 in a previous article, the same authors also note that not accounting for other types of mobile users is easy to do but leaves a potentially large audience for a mobile website “out in the cold.”5 more recently, seeholzer and salem found the usability aspect of mobile website development to be in need of further research.6

usability evaluation techniques for a mobile website are similar to those for a full website, but the variety of smartphones and internet-capable feature phones immediately complicates standard usability testing practices. the mobile device landscape is fraught with variables that can have a significant impact on the user experience of a mobile website. factors like small screen size, processing power, wireless or data plan connection, and on-screen keyboards or other data entry methods contribute to user experience and impact usability testing.
zhang and adipat note that mobile devices themselves, due to their unique, heterogeneous characteristics and physical constraints, may play a much more influential role in usability testing of mobile applications than desktop computers do in usability testing of desktop applications. therefore real mobile devices should be used whenever possible.7

one strategy for usability testing on mobile devices is to identify device “families” by similar operating systems or other characteristics, then perform a test of the website. for example, griggs, bridges, and rempel found representative models of device families at a local retailer, where they tested the site on the display phones. the authors also recommend “hallway usability testing,” an impromptu test with a volunteer.8

zhang and adipat go on to outline two methodologies for formal mobile application usability testing: field studies and laboratory experiments. the benefit of a mobile usability field study is the preservation of the mobile environment in which tasks are normally performed. however, data collection is challenging in field studies, requiring the participant to reliably and consistently self-report data. in contrast, the benefit of a laboratory study is that researchers have more control over the test session and data collection method. laboratory usability tests lend themselves to screen capture or video recording, allowing researchers more comprehensive data regarding the participant’s performance on predetermined tasks.9 however, billi and others point out that there is no general agreement in the literature about the significance or usefulness of the difference between laboratory and field testing of mobile applications.10

one compromise between field studies and laboratory experiments is the use of a smartphone emulator: an emulator mimics the smartphone interface on a desktop computer and is recordable via screen capture.
however, desktop emulators mask some usability problems that impact smartphones, such as an unstable wireless connection or limited bandwidth.11

for recording test sessions of users working directly with mobile devices, jakob nielsen, the well-known usability expert, briefly mentions the use of a document camera.12 in another usability test of a mobile application, loizides and buchanan also used a document camera with recording capabilities to effectively record users working with a mobile device.13

usability attributes are metrics that help assess the user-friendliness of a website. in their review of empirical mobile usability studies, coursaris and kim present the three most commonly used measures in mobile usability testing:

efficiency: degree to which the product is enabling the tasks to be performed in a quick, effective and economical manner or is hindering performance;
effectiveness: accuracy and completeness with which specified users achieved specified goals in particular environment;
satisfaction: the degree to which a product is giving contentment or making the user satisfied.14

the authors present these measures in an overall framework of “contextual usability” constructed with the four variables of user, task, environment, and technology. an important note is the authors’ use of technology rather than focusing solely on the product; this subtle difference acknowledges that the user interacts not only with a product, but also other factors closely associated with the product, such as wireless connectivity.15

a participant proceeding through a predetermined task scenario is helpful in assessing site efficiency and effectiveness by measuring the error rate and time spent on a task. user satisfaction may be gauged by the participant’s expression of satisfaction, confusion, or frustration while performing the tasks. measurement of user satisfaction may also be supplemented by a post-test survey.
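
the efficiency and effectiveness measures described above can be tallied from per-task observations. the following python sketch is illustrative only: the `TaskObservation` record, its field names, and the sample values are hypothetical and are not drawn from any of the studies cited. it assumes that effectiveness is summarized as the share of attempts completed, efficiency as the mean time on completed attempts, and error rate as errors observed per attempt.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TaskObservation:
    """one participant's attempt at one task (hypothetical record layout)."""
    participant: str
    task: str
    seconds: float    # time spent on the task
    errors: int       # number of errors observed during the attempt
    completed: bool   # whether the participant finished the task


def summarize(observations):
    """summarize effectiveness, efficiency, and error rate for each task."""
    by_task = {}
    for obs in observations:
        by_task.setdefault(obs.task, []).append(obs)

    summary = {}
    for task, attempts in by_task.items():
        done = [a for a in attempts if a.completed]
        summary[task] = {
            # effectiveness: share of attempts that ended in success
            "completion_rate": len(done) / len(attempts),
            # efficiency: mean time on task, over successful attempts only
            "mean_seconds": mean(a.seconds for a in done) if done else None,
            # error rate: errors observed per attempt
            "errors_per_attempt": sum(a.errors for a in attempts) / len(attempts),
        }
    return summary


# made-up sample data, for illustration only
sample = [
    TaskObservation("p1", "find hours", 20.0, 0, True),
    TaskObservation("p2", "find hours", 40.0, 1, True),
    TaskObservation("p1", "chat", 90.0, 2, False),
    TaskObservation("p2", "chat", 60.0, 1, True),
]

report = summarize(sample)
for task, stats in report.items():
    print(task, stats)
```

satisfaction is left out of the sketch on purpose, since it is gauged from participant comments and survey responses rather than from counts and timings.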
returning to general evaluation techniques, mobile website usability testing employs task scenarios, post-test surveys, and data analysis methods similar to those used in full site testing. general guides such as the handbook of usability testing by rubin and chisnell and george’s user-centered library websites: usability evaluation methods provide helpful information on designing task scenarios, how to facilitate a test, post-test survey ideas, and methods of analysis.16 another common data collection method in usability testing is the think aloud protocol, as it allows researchers to more fully understand the user experience. participants are instructed to talk about what they are thinking as they use the site; for example, expressing uncertainty of what option to select, frustration with poorly designed data entry fields, or satisfaction with easily understood navigation. examples of the think aloud protocol can also be found in mobile website usability testing.17

method

while effective usability testing normally relies on five to eight participants, we decided a larger number of participants would be needed in order to capture the behavior of the site on a variety of devices. therefore, we recruited twelve participants to accommodate a balanced variety of smartphone brands and models. based on average market share, we aimed to test the website on four iphones, four android phones, and four other types of smartphones or internet-capable mobile devices (e.g., blackberry, windows phones). all study participants were university students, the primary target audience of the mobile website.

we used three methods to recruit participants: a post to the library’s facebook page, a news item on the library’s home page, and two dozen flyers posted around campus. each form of recruitment described an opportunity for students to spend less than thirty minutes helping the library test its new mobile website.
also, participants would receive a $10 coffee shop gift card as an incentive. a project-specific e-mail address served as the initial contact point for students to volunteer. we instructed volunteers to indicate their phone type in their e-mail; this information was used to select and contact the students with the desired variety of mobile devices. if a scheduled participant did not come to the test appointment, another student with the same or similar type of phone was contacted and scheduled. no other demographic data or screening was used to select participants, aside from a minimum age requirement of eighteen years old.

we employed a hybrid field and laboratory test protocol, which allowed us to test the mobile website on students’ native devices while in a laboratory setting that we could efficiently manage and schedule. participants used their own phone for the test without any adjustment to their existing operating preferences, similar to field testing methodology. however, we used a controlled environment in order to facilitate the test session and create recordings for data analysis. a library conference room served as our laboratory, and a document camera with video recording capability was used to record the session. the document camera was placed on an audio/visual cart and the participants chose to either stand or sit while holding their phones under the camera. the document camera recorded the phone screen, the participant’s hands, and the audio of the session. the video feed was available through the room projector as well, which helped us monitor image quality of the recordings.

figure 2. video still from test session recording

the test session consisted of two parts: the completion of five tasks using participants’ phones on our test website recorded under the document camera, and a post-test survey.
participants were read an introduction and instructions from a script in order to decrease variation in test protocol and our influence as the facilitators. we also performed a walk-through of the testing session prior to administering it to ensure the script was clearly worded and easy to understand.

we developed our test scenarios and tasks according to five functional objectives for the library mobile website:

1. participants can find library hours for a given day in the week.
2. participants can perform a known title search in the catalog and check for item status.
3. participants can use my account to view checked out books.18
4. participants can use chat reference.
5. participants can effectively search for a scholarly article using the mobile version of ebscohost academic search complete.

prior to beginning the test, we encouraged participants to use the “think aloud” protocol while performing tasks. we also instructed them to move between tasks however they would naturally in order to capture user behavior when navigating from one part of the site to another.

the post-test survey provided us with additional data and user reactions to the site. users were asked to rate the site’s appearance, ease of use, and how frequently they might use the different website features (e.g., renewing a checked out item). the survey was administered directly after the task scenario portion of the test in order to take advantage of the users’ recent experience with the website.

we evaluated the test sessions utilizing the measures of efficiency, effectiveness, and satisfaction. in this study, we assessed efficiency as time spent performing the task and effectiveness as success or failure in completing the task. we observed errors and categorized them as either a user error or site error.
each error was also categorized as minor, major, or fatal: minor errors were easily identified and corrected by the user; major errors caused a notable delay, but the user was able to correct and complete the task; fatal errors prevented the user from completing the task. to assess user satisfaction, we took note of user comments as they performed tasks, and we also referred to their ratings and comments on the post-test survey.

before analyzing the test recordings, we normalized our scoring behavior by performing a sample test session with a library staff member unfamiliar with the mobile website. we scored the sample recording separately and then met to discuss, clarify, and agree upon each error category. each of the twelve test sessions was viewed and scored independently. once this process was completed, we discussed our scoring of each test session video, combining our data and observations. we analyzed the combined data by looking for both common and unique errors for each usability task across the variety of smartphones tested.

to protect participants’ confidentiality, each video file and post-test survey was labeled only with the test number and device type. prior to beginning the study, all recruitment methods, informed consent, methodology, tasks, and post-test survey were approved by the portland state university human subjects research and review committee.

findings

our recruitment efforts were successful, with even a few same-day responses from the announcement posted on the library’s facebook page. some students also indicated that they had seen the recruitment flyers on campus. a total of fifty-two students volunteered to participate; twelve students were successfully contacted, scheduled, and tested. the distribution of the twelve participants and their types of phones is shown in table 1.
number of participants | operating system | phone model
4 | android | htc droid incredible 2; motorola droid; htc mytouch 3g slide; motorola cliq 2
3 | ios | iphone 3gs
2 | blackberry | blackberry 9630; blackberry curve
1 | windows phone 7 | windows phone 7
1 | webos | palm pixi
1 | other | windows kin 2 feature phone (a phone with internet capability, running kinos)

table 1. test participants by smartphone operating system and model

usability task scenarios

all test participants quickly and successfully completed the first task, finding the library hours for sunday. the second task was to find a book in the library catalog and report whether the book was available for check out. nine participants completed this task; the windows phone 7 and the two blackberry phones presented a fatal system error when working with our mobile catalog software, mobilecat. these participants were able to perform a search but were not able to view a full item record, blocking them from seeing the item’s availability and completing the task. this task also revealed one minor error for iphone users: the iphone displayed the item’s ten digit isbn as a phone number, complete with touch-to-call button.

many users took more time than anticipated when asked to search for a book. the video recordings captured participants slowly scrolling through the menu before choosing “search psuonly catalog.” a few participants expressed their hesitation verbally:

● “maybe not the catalog? i don't know. yeah i guess that would be the one.”
● “i don't look for books on this site anyway...my lack of knowledge more than anything else.”
● “search psu library catalog i'm assuming?”

the blackberry curve participant did not recognize the catalog option and selected “databases & articles” to search for a book. she was guided back to the catalog after her unsuccessful search in ebscohost.

we observed an additional delay in searching for a book when using the catalog interface.
the catalog search included a pull down menu of collections options. the collections menu was included by the site developers because it is present in the full website version of the local catalog. users tended to explore the menu looking for a selection that would be helpful in performing the task; however, they abandoned the menu, occasionally expressing additional confusion.

figure 3. catalog search with additional “collections” menu

the next task was to log into a library account and view checked out items. all participants were successful with this task, but frequent minor user errors were observed, all of them misspellings or numerical entry errors. most participants self-corrected before submitting the login; however, one participant submitted a misspelled user name and promptly received an error message from the site. participants were also instructed to log out of the account. after clicking “logout,” one participant observed, “huh, it goes to the login screen. i assume i'm logged out, though it doesn't say so.”

the fourth task scenario involved using the library’s chat reference service via the mobile website. the chat reference service is provided via open source software in cooperation with l-net, the oregon statewide service. usability testing demonstrated that the chat reference service did not perform well on a variety of phones. also, a significant problem arose when participants attempted to access chat reference via the university’s unsecured wireless network. because the chat reference service is managed by a third-party host, three participants were confronted with a non-mobile friendly authentication screen (see discussion of the local wireless environment below). as this was an unexpected event in testing, participants were given the option to authenticate or abandon the task.
all three participants who arrived at this point chose to move ahead with authentication during the test session.

once the chat interface was available to participants, other system errors were discovered. only three out of twelve participants successfully sent and received a chat message. only one participant (htc droid incredible) experienced an error-free chat transaction. various problems encountered included:
· unresponsive or slow to respond buttons,
· text fields unresponsive to data entry,
· unusually long page loading time,
· non-mobile-friendly error message upon attempting to exit, and
· non-mobile-friendly “leave a message” webpage.

another finding from this task is that participants expressed concern regarding communication delays during the chat reference task. if the librarians staffing the chat service are busy with other users, a new incoming user is placed in a queue. after waiting in the chat queue for forty seconds, one participant commented, “probably if i was on the bus and it took this long, i would leave a message.” being in a controlled environment, participants looked to the facilitator as a guide for how long to remain in the chat queue, distorting the indication of how long users would wait for a chat reference transaction in the field environment.

figure 4. chat reference queue

the last task scenario asked participants to use the mobile version of ebscohost’s academic search complete. our test instance of this database generally performed well with android phones and less well with webos phones or iphones. android participants successfully accessed, searched, and viewed results in the database. iphone users experienced delays in initiating text entry, three consecutive touches being consistently necessary to activate typing in the search field.
our feature phone participant with a windows kin 2 was unable to use ebscohost because the phone’s browser, internet explorer 6, is not supported by the ebscohost mobile website. the palm pixi participant also experienced difficulty with very long page loading times, two security certificate notifications (not present on other tests), and our ezproxy authentication page. with all these obstacles, the palm pixi participant abandoned the task. another participant, blackberry 9630, also abandoned the task due to slow page loading. a secondary objective of our ebscohost search task was to observe if participants explored ebscohost’s “search options” in order to limit results to scholarly articles. our task scenario asked participants to find a scholarly article on global warming. only one participant explored the ebscohost interface, successfully identified the “search options” menu, and limited the results to “scholarly (peer reviewed) articles.” another participant included the words “peer reviewed” with “global warming” in the search field in an attempt to add the limit. a third expressed the need to limit to scholarly articles but was unable to discover how to do so. of the remaining seven participants who searched academic search complete for the topic “global warming” none expressed concern or awareness of the scholarly limit in academic search complete. it is unclear whether this was a product of the interface design, users’ lack of knowledge regarding limiting their search to scholarly sources, or if our task scenario was simply too vague. though participants’ wireless configurations, or lack thereof, was not formally part of the usability test, we quickly discovered that this variable had a significant impact on the user’s experience of the mobile website. in the introductory script and informed consent we recommended to participants that they connect to the university’s wireless network to avoid data charges. 
however, we did not explicitly instruct users to connect to the secure network. most participants chose to connect to the unencrypted wireless network and appeared to be unaware of the encrypted network (psu and psu secure respectively). using the unencrypted network led to authentication requirements at two different points in the test: using the chat service and searching academic search complete. other users who were unfamiliar with adding a wireless network to their phone used their cellular network connection. these participants were asked to authenticate only when accessing ebscohost’s academic search complete (see table 2). participants expressed surprise at the appearance of an authentication request when performing different tasks, particularly while connected to the on-campus university wireless network. the required data entry in a non-mobile friendly authentication screen, and the added page loading time, created an obstacle for the participant to overcome in order to complete the task. notably, three participants also explained that they did not know how to find and add a wireless network to their phone.

internet connection | library mobile website | chat reference | ebscohost
on campus, unencrypted wireless | no authentication required | authentication required | authentication required
on campus, encrypted wireless | no authentication required | no authentication required | no authentication required
on campus, cellular network | no authentication required | no authentication required | authentication required
off campus, any mode | no authentication required | no authentication required | authentication required

table 2. authentication requirements based on type of internet connection and resource.

post-test survey

each participant completed a post-test survey that asked them to rate the mobile website’s appearance and ease of use.
the survey also asked participants to rank how frequently they were likely to use specific features of the website such as search for books and ask for help on a rating scale of more than weekly, weekly, monthly, less than monthly, and never. participants were also invited to add general comments about the website.

the mobile website’s overall appearance and ease of use were highly rated by all participants. the straightforward design of the mobile website’s homepage also garnered praise in the comment section of the post-test survey. comments regarding the site’s design included: “very simple to navigate,” and “the simple homepage is perfect! also, i love that the site rotates sideways with my phone.”

for many of the features listed on the survey participants selected an almost even distribution across the frequency of use rating scale. however, two features were ranked as having potential for very high use. nine out of twelve participants said they would search for articles weekly or more than weekly. eight out of twelve participants said they would use the “find a computer” function weekly or more than weekly. two participants additionally wrote in comments that “find a computer” was “very important” and would be used “every day.” at the other end of the scale, our menu option “directions” was ranked as having a potential frequency of use of never, with the exception of one participant marking less than monthly.

discussion

usability testing of the library’s mobile website provided the team with valuable information, leading us to implement important changes before the site was launched. we quickly decided on a few changes, while others involved longer discussion. the collections menu was removed from the catalog search; this menu distracted and confused users with options that were not useful in a general search.
“directions” was moved from a top level navigation element to a clickable link in the site footer. also, the need for a mobile version of the library’s ezproxy authentication page was clearly documented and has since been created and implemented. however, the team was very pleased with the praise for the overall appearance of the website and its ease of use, especially considering the significant difficulties some participants faced when completing specific tasks. the “find a computer” feature of the mobile website was very popular with test participants. the potential popularity among users is perhaps a reflection of overcrowded computer labs across campus and the continued need students have for desktop computing. unfortunately, “find a computer” has been temporarily removed from the site due to changes in computer laboratory tracking software at the campus it level. we hope to soon again have access to the workstation data for the library’s two computer labs in order to develop a new version of this feature. the hesitation participants displayed when selecting the catalog option in order to search for a book was remarkable for its pervasiveness. it’s possible that the term “catalog” has declined in use to the point of not being recognizable to some users, and it is not used to describe the search on the homepage of the library’s full website. in fact, we had originally planned to name the catalog search option with a more active and descriptive phrase, such as “find books and more,” which is used on the library’s full website. however, the full library website employs worldcat local, allowing users to make consortial and interlibrary loan requests. in contrast, the mobile website catalog reflects only our local holdings and does not support the request functionality. the team decided not to potentially confuse users further regarding the functionality of the different catalogs by giving them the same descriptive title. 
if worldcat local’s beta mobile catalog increases in stability and functionality, we will abandon mobilecat and provide the same request options on the mobile website as on the full website.

we discussed removing the chat service option from the “ask us” page. during usability testing, it was demonstrated that users would too frequently have poor experiences using this service due to slow page loads on most phones, the unpredictable responsiveness of text entry fields and buttons, and the wait time for a librarian to begin the chat. also, it could be that waiting in a virtual queue on a mobile device is particularly unappealing because the user is blocked from completing other tasks simultaneously. the library recently implemented a new text reference service, and this service was added to the mobile website. the text reference service is an asynchronous, non-web-based service that is less likely to pose usability problems similar to those found with the chat service. this reflects the difference between applications developed for desktop computing, such as web-based instant messaging, versus a technology that is specifically related to the mobile phone environment, like text messaging. however, tablet device users complicate matters since they might use the full desktop website or the mobile website; for this reason, chat reference is still part of the mobile website.

participants’ interest in accessing and searching databases was notable. during the task, many participants expressed positive reactions to the availability of the ebscohost database. the post-test survey results demonstrated a strong interest in searching for articles via the mobile website, with participants giving their potential frequency of use as weekly or more than weekly.
this evidence supports the previous user focus group results of seeholzer and salem.19 students are interested in accessing research databases on their mobile devices, despite the likely limitations of performing advanced searches and downloading files. therefore, the team decided to include ebscohost’s academic search complete along with eight other mobile-friendly databases in the live version of the website launched after the usability test.

figure 5. home page of the library mobile website, updated

the new library mobile website was launched in the first week of fall 2011 quarter classes. in the first full week there were 569 visits to the site. site analytics for the first week also showed that our distribution of smartphone models in usability testing was fairly well matched with the users of the website, though we underestimated the number of iphone users: 64 percent of visits were from apple ios users, 28 percent from android users, 0.7 percent from blackberry users, and the remainder from a mix of users with alternative mobile browsers and desktop browsers.

usability testing with participants’ native smartphones and wireless connectivity revealed issues which would have been absent in a laboratory test that employed a mobile device emulator and a stable network connection. the complications introduced by the encrypted and unencrypted campus wireless networks, and cellular network connections, revealed some of the many variables users might experience outside of a controlled setting. ultimately, the variety of options for connecting to the internet from a smartphone, in combination with the authentication requirements of licensed library resources, potentially adds obstacles for users.
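the interaction between connection type and authentication summarized in table 2 can be written down as a small lookup. the python sketch below is illustrative only: the dictionary name and the function `auth_prompts` are assumptions introduced here, not part of the site's code, and the values simply restate table 2.

```python
# Table 2 restated as a lookup: True means the resource prompts for authentication.
# The names AUTH_REQUIRED and auth_prompts are hypothetical, chosen for this sketch.
AUTH_REQUIRED = {
    "on campus, unencrypted wireless": {
        "library mobile website": False,
        "chat reference": True,
        "ebscohost": True,
    },
    "on campus, encrypted wireless": {
        "library mobile website": False,
        "chat reference": False,
        "ebscohost": False,
    },
    "on campus, cellular network": {
        "library mobile website": False,
        "chat reference": False,
        "ebscohost": True,
    },
    "off campus, any mode": {
        "library mobile website": False,
        "chat reference": False,
        "ebscohost": True,
    },
}


def auth_prompts(connection):
    """Return, in alphabetical order, the resources that prompt for authentication."""
    resources = AUTH_REQUIRED[connection]
    return sorted(name for name, needed in resources.items() if needed)


if __name__ == "__main__":
    for connection in AUTH_REQUIRED:
        print(connection, "->", auth_prompts(connection))
```

laid out this way, the encrypted campus network stands out as the only connection on which none of the three resources interrupts the user with a login screen.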
general recommendations for mobile library websites that emerged from our usability test include: · users appreciate simple, streamlined navigation and clearly worded labels; · error message pages and other supplemental pages linked from the mobile website pages should be identified and mobile-friendly versions created; · recognize that how users connect to the mobile website is related to their experience using the site; · anticipate problems with third-party services (which often cannot be solved locally). additionally, system responses to user actions are important; for example, provide a “you have successfully logged out” message and an indicator that a catalog search is in progress. it is possible that users are even more likely to abandon tasks in a mobile environment than in a desktop environment if they perceive the site to be unresponsive. as test facilitators, we experienced three primary difficulties in keeping the testing sessions consistent. the unexpectedly poor performance of the mobile website on some devices required us to communicate with participants about when a task could be abandoned. for example, after one participant made three unsuccessful attempts at entering text data in the chat service interface, she was directed to move ahead to the next task. such instances of multiple unsuccessful attempts were considered to be fatal system errors. however, under these circumstances, it is difficult to know whether our test facilitation led participants to spend more or less time than they normally would attempting a task. secondly, the issue of system authentication led to unexpected variation in testing. some participants proceeded through these obstacles, while others either opted out or had significant enough technical difficulties that the task was deemed a fatal error. again, it is unclear how the average user would deal with this situation in the field. 
some users might leave an activity if an obstacle appears too cumbersome; others might proceed. finally, participants demonstrated a wide range in their willingness to “think aloud.” in retrospect, as facilitators, we should have provided an example of the method before beginning the test; perhaps doing so would have encouraged the participants to speak more freely. the relatively simple nature of most of the test tasks may have also contributed to this problem as participants seemed reluctant to say something that might be considered too obvious.

another limitation of our study is that the participants were a convenience sample of volunteers selected by phone type. though our selection was based loosely on market share of different smartphone brands, a preliminary investigation into the mobile device market of our target population would have been helpful to establish what devices would be most important to test.

additional usability testing on more complex library related tasks, such as advanced searching in a database, or downloading and viewing files, is recommended for further research. also of interest would be a study of user willingness to proceed past obstacles like authentication requirements and non-mobile friendly pages in the field.

conclusion

we began our study questioning whether or not different smartphone hardware and operating systems would impact the user experience of our library’s new mobile website. usability testing confirmed that the type of smartphone does have an impact on the user experience, occasionally significantly so. by testing the site on a range of devices, we observed a wide variation of successful and unsuccessful experiences with our mobile website.
the wide variety of phones and mobile devices in use makes developing a mobile website that perfectly serves all of them difficult; there is likely to always be a segment of users who experience difficulties with any given mobile website. however, usability testing data and developer awareness of potential problems will generate positive changes to mobile websites and alleviate frustration for many users down the road.

references and notes

1. aaron smith, “35% of american adults own a smartphone: one quarter of smartphone owners use their phone for most of their online browsing,” pew research center, june 15, 2011, http://pewinternet.org/~/media//files/reports/2011/pip_smartphones.pdf (accessed oct. 13, 2011).
2. shannon d. smith and judith b. caruso, the ecar study of undergraduate students and information technology, 2010, educause, 2010, 41, http://net.educause.edu/ir/library/pdf/ers1006/rs/ers1006w.pdf (accessed sept. 12, 2011); shannon d. smith, gail salaway, and judith b. caruso, the ecar study of undergraduate students and information technology, 2009, educause, 2009, 49, http://www.educause.edu/resources/theecarstudyofundergraduatestu/187215 (accessed sept. 12, 2011).
3. a comparison count of u.s. and canadian academic libraries with active mobile websites, wiki page versions, august 2010 (56 listed) and august 2011 (84 listed). library success: a best practices wiki, “m-libraries: libraries offering mobile interfaces or applications,” http://libsuccess.org/index.php?title=m-libraries (accessed sept. 7, 2011).
4. laurie m. bridges, hannah gascho rempel, and kim griggs, “making the case for a fully mobile library web site: from floor maps to the catalog,” reference services review 38, no. 2 (2010): 317, doi:10.1108/00907321011045061.
5. kim griggs, laurie m. bridges, and hannah gascho rempel, “library/mobile: tips on designing and developing mobile web sites,” code4lib journal no. 8 (2009), under “content adaptation techniques,” http://journal.code4lib.org/articles/2055 (accessed sept. 7, 2011).
6. jamie seeholzer and joseph a. salem jr., “library on the go: a focus group study of the mobile web and the academic library,” college & research libraries 72, no. 1 (2011): 19.
7. dongsong zhang and boonlit adipat, “challenges, methodologies, and issues in the usability testing of mobile applications,” international journal of human-computer interaction 18, no. 3 (2005): 302, doi:10.1207/s15327590ijhc1803_3.
8. griggs, bridges, and rempel, “library/mobile.”
9. zhang and adipat, “challenges, methodologies,” 303–4.
10. billi et al., “a unified methodology for the evaluation of accessibility and usability of mobile applications,” universal access in the information society 9, no. 4 (2010): 340, doi:10.1007/s10209-009-0180-1.
11. zhang and adipat, “challenges, methodologies,” 302.
12. jakob nielsen, “mobile usability,” alertbox, september 26, 2011, www.useit.com/alertbox/mobile-usability.html (accessed sept. 28, 2011).
13. fernando loizides and george buchanan, “performing document triage on small screen devices. part 1: structured documents,” in iiix ’10: proceeding of the third symposium on information interaction in context, ed. nicholas j. belkin and diane kelly (new york: acm, 2010), 342, doi:10.1145/1840784.1840836.
14. constantinos k. coursaris and dan j. kim, “a qualitative review of empirical mobile usability studies” (presentation, twelfth americas conference on information systems, acapulco, mexico, august 4–6, 2006), 4, http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.83.4082&rep=rep1&type=pdf (accessed sept. 7, 2011).
15. ibid., 2.
16. jeffrey rubin and dana chisnell, handbook of usability testing: how to plan, design, and conduct effective tests, 2nd ed. (indianapolis, in: wiley, 2008); carole a. george, user-centered library web sites: usability evaluation methods (cambridge: chandos, 2008).
17. ronan hegarty and judith wusteman, “evaluating ebscohost mobile,” library hi tech 29, no. 2 (2011): 323–25, doi:10.1108/07378831111138198; robert c. wu et al., “usability of a mobile electronic medical record prototype: a verbal protocol analysis,” informatics for health & social care 33, no. 2 (2008): 141–42, doi:10.1080/17538150802127223.
18. in order to protect participants’ confidentiality a dummy library user account was created; the user name and password for the account were provided to the participant at the test session.
19. seeholzer and salem, “library on the go,” 14.

tending a wild garden: library web design for persons with disabilities | vandenbark 23

r. todd vandenbark

tending a wild garden: library web design for persons with disabilities

nearly one-fifth of americans have some form of disability, and accessibility guidelines and standards that apply to libraries are complicated, unclear, and difficult to achieve. understanding how persons with disabilities access web-based content is critical to accessible design. recent research supports the use of a database-driven model for library web development. existing technologies offer a variety of tools to meet disabled patrons’ needs, and resources exist to assist library professionals in obtaining and evaluating product accessibility information from vendors.
librarians in charge of technology can best serve these patrons by proactively updating and adapting services as assistive technologies improve.

in march 2007, eighty-two countries signed the united nations’ convention on the rights of persons with disabilities, including canada, the european community, and the united states. the convention’s purpose was “to promote, protect and ensure the full and equal enjoyment of all human rights and fundamental freedoms by all persons with disabilities, and to promote respect for their inherent dignity.”1 among the many proscriptions for assuring respect and equal treatment of people with disabilities (pwd) under the law, signatories agreed to take appropriate measures:

(g) to promote access for persons with disabilities to new information and communications technologies and systems, including the internet; and
(h) to promote the design, development, production and distribution of accessible information and communications technologies and systems at an early stage, so that these technologies and systems become accessible at minimum cost.

in addition, the convention seeks to guarantee equal access to information by doing the following:

(c) urging private entities that provide services to the general public, including through the internet, to provide information and services in accessible and usable formats for persons with disabilities; and
(d) encouraging the mass media, including providers of information through the internet, to make their services accessible to persons with disabilities.2

because the internet and its design standards are evolving at a dizzying rate, it is difficult to create websites that are both cutting-edge and standards-compliant. this paper evaluates the challenge of web design as it relates to individuals with disabilities, exploring current standards, and offering recommendations for accessible development. examining the provision of it for this demographic is vital because according to the u.s.
census bureau, the u.s. public includes about 51.2 million noninstitutionalized people living with disabilities, 32.5 million of whom are severely disabled. this means that nearly one-fifth of the u.s. public faces some physical, mental, sensory, or other functional impairment (18 percent in 2002).3 because a library’s mandate is to make its resources accessible to everyone, it is important to attend to the special challenges faced by patrons with disabilities and to offer appropriate services with those special needs in mind.

current u.s. regulations, standards, and guidelines

in 1990 congress enacted the americans with disabilities act (ada), the first comprehensive legislation mandating equal treatment under the law for pwd. the ada prohibits discrimination against pwd in employment, public services, public accommodations, and in telecommunications. title ii of the ada mandates that all state governments, local governments, and public agencies provide access for pwd to all of their activities, services, and programs. since school, public, and academic libraries are under the purview of title ii, they must “furnish auxiliary aids and services when necessary to ensure effective communication.”4 though predating widespread use of the internet, the law’s intent points toward the adoption and adaptation of appropriate technologies to allow persons with a variety of disabilities to access electronic resources in a way that is most effective for them.

changes to section 508 of the 1973 rehabilitation act enacted in 1998 and 2000 introduced the first standards for “accessible information technology recognized by the federal government.”5 many state and local governments have since passed laws applying the standards of section 508 to government agencies and related services.
according to the access board, the independent federal agency charged with assuring compliance with a variety of laws regarding services to pwd, information and communication technology (ict) includes any equipment or interconnected system or subsystem of equipment, that is used in the creation, conversion, or duplication of data or information. the term electronic and information technology includes, but is not limited to, telecommunications products (such as telephones), information kiosks and transaction machines, world wide web sites, multimedia, and office equipment such as copiers and fax machines.6

r. todd vandenbark (todd.vandenbark@utah.edu) is web services librarian, eccles health sciences library, university of utah, salt lake city.

24 information technology and libraries | march 2010

the access board further specifies guidelines for “web-based intranet and internet information and applications,” which are directly relevant to the provision of such services in libraries.7 what follows is a detailed examination of these standards with examples to assist in understanding and implementation.

(a) a text equivalent for every non-text element shall be provided. assistive technology cannot yet describe what pictures and other images look like; they require meaningful text-based information associated with each picture. if an image directs the user to do something, the associated text must explain the purpose and meaning of the image. this way, someone who cannot see the screen can understand and navigate the page successfully. this is generally accomplished by using the “alt” attribute, which holds a short description, and the “longdesc” attribute for images. however, these aids also can clutter a page when not used properly. the current versions of the most popular screen-reader software do not limit the amount of “alt” text they can read.
however, freedom scientific’s jaws 6.x divides the “alt” attribute into distinct chunks of 125 characters each (excluding spaces) and reads them separately as if they were separate graphics.8 this can be confusing to the end user. longer content can be put into a separate text file and the file linked to using the “longdesc” attribute. when a page contains audio or video files, a text alternative needs to be provided. for audio files such as interviews, lectures, and podcasts, a link to a transcript of the audio file must be immediately available. for video clips such as those on youtube, captions must accompany the clip. (b) equivalent alternatives for any multimedia presentation shall be synchronized with the presentation. this means that captions for video must be real-time and synchronized with the actions in the video, not contained solely in a separate transcript. (c) web pages shall be designed so that all information conveyed with color is also available without color, for example from context or markup. while color can be used, it cannot be the sole source or indicator of information. imagine an educational website offering a story problem presented in black and green print, and the answer to the problem could be deciphered using only the green letters. this would be inaccessible to students who have certain forms of color-blindness as well as those who use screen-reader software. (d) documents shall be organized so they are readable without requiring an associated style sheet. the introduction of cascading style sheets (css) can improve accessibility because they allow the separation of presentation from content. however, not all browsers fully support css, so webpages need to be designed so any browser can read them accurately. the content needs to be organized so that it can be read and understood with css formatting turned off. 
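subparagraph (a) lends itself to a simple automated check. the python sketch below, which does not come from the access board or from any screen-reader vendor, uses the standard library's html.parser to list images that have no “alt” attribute and images whose “alt” text runs past 125 non-space characters without a “longdesc” to carry the longer description. the class name, function name, and sample markup are hypothetical.

```python
from html.parser import HTMLParser

# Chunk size reported above for JAWS 6.x: 125 characters, excluding spaces.
CHUNK_LIMIT = 125


class AltTextAudit(HTMLParser):
    """Collect <img> tags whose text equivalent is missing or overly long."""

    def __init__(self):
        super().__init__()
        self.missing = []  # src of images with no alt attribute at all
        self.long = []     # src of images with long alt text and no longdesc

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attr = dict(attrs)
        src = attr.get("src", "(no src)")
        alt = attr.get("alt")
        if alt is None:
            self.missing.append(src)
        elif len(alt.replace(" ", "")) > CHUNK_LIMIT and "longdesc" not in attr:
            self.long.append(src)


def audit_images(html):
    """Return (images missing alt, images with long alt and no longdesc)."""
    parser = AltTextAudit()
    parser.feed(html)
    parser.close()
    return parser.missing, parser.long


# Hypothetical page fragment used only to exercise the check.
sample = (
    '<img src="hours.png" alt="library hours">'
    '<img src="logo.png">'
    '<img src="map.png" alt="' + "x" * 130 + '">'
)

if __name__ == "__main__":
    print(audit_images(sample))
```

an empty “alt” value is deliberately not reported, since the sketch only looks for images that lack the attribute entirely.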
(e) redundant text links shall be provided for each active region of a server-side image map, and (f) client-side image maps shall be provided instead of server-side image maps except where the regions cannot be defined with an available geometric shape. an image map can be thought of as a geometrically defined and arranged group of links to other content on a site. a clickable map of the fifty u.s. states is an example of a functioning image map. a server-side image map would appear to a screen reader only as a set of coordinates, whereas client-side maps can include information about where the link leads through “alt” text. the best practice is to use only client-side image maps and to make sure the “alt” text is descriptive and meaningful.

(g) row and column headers shall be identified for data tables, and (h) markup shall be used to associate data cells and header cells for data tables that have two or more logical levels of row or column headers. correct table coding is critical. each table should use the “summary” attribute of the “table” tag to provide a meaningful description of its content and arrangement. headers should be coded using the table header (“th”) tag, and its “scope” attribute should specify whether the header applies to a row (“row”) or a column (“col”). if the table’s content is complex, it may be necessary to provide an alternative presentation of the information. it is best to rely on css for page layout, taking into consideration the directions in subparagraph (d) above.

(i) frames shall be titled with text that facilitates frame identification and navigation. frames are a deprecated feature of html, and their use should be avoided in favor of css layout.

(j) pages shall be designed to avoid causing the screen to flicker with a frequency greater than 2 hz and lower than 55 hz. lights with flicker rates in this range can trigger epileptic seizures. blinking or flashing elements on a webpage should be avoided until browsers provide the user with the ability to control flickering.

tending a wild garden: library web design for persons with disabilities | vandenbark 25

(k) a text-only page, with equivalent information or functionality, shall be provided to make a web site comply with the provisions of this part, when compliance cannot be accomplished any other way. the content of the text-only page shall be updated whenever the primary page changes. complex content that is entirely visual in nature may require a separate text-only page, such as a page showing the english alphabet in american sign language. this requirement also serves as a stopgap measure for existing sites that require reworking for accessibility. some consider this to be the web’s version of separate-but-equal services, and it should be avoided.9 offering a text-only alternative site can increase the sense of exclusion that pwd already feel. also, such versions of a website tend not to be equivalent to the parent site, leaving out promotions or advertisements. finally, maintaining a text-only version increases the workload of web development staff, making it more costly than creating a single, fully accessible site in the first place.
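the table guidance under (g) and (h) above can be spot-checked in a similar way. the following is a minimal sketch under stated assumptions, not a full validator: it uses python's standard html.parser module to report tables that lack a “summary” attribute, header cells whose “scope” is missing or not one of the values html defines (“row”, “col”, “rowgroup”, “colgroup”), and tables with no “th” cells at all. the class name, function name, issue wording, and sample table are hypothetical.

```python
from html.parser import HTMLParser

# Values HTML defines for the scope attribute of a header cell.
VALID_SCOPES = ("row", "col", "rowgroup", "colgroup")


class TableHeaderAudit(HTMLParser):
    """Collect issues with the summary and header markup of HTML tables."""

    def __init__(self):
        super().__init__()
        self.issues = []
        self._header_counts = []  # stack: one <th> counter per open <table>

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "table":
            self._header_counts.append(0)
            if not attributes.get("summary"):
                self.issues.append("table without a summary attribute")
        elif tag == "th" and self._header_counts:
            self._header_counts[-1] += 1
            scope = (attributes.get("scope") or "").lower()
            if scope not in VALID_SCOPES:
                self.issues.append("th without a valid scope attribute")

    def handle_endtag(self, tag):
        if tag == "table" and self._header_counts:
            if self._header_counts.pop() == 0:
                self.issues.append("table without any th header cells")


def audit_tables(html):
    """Return the list of issues for all tables in an HTML string."""
    parser = TableHeaderAudit()
    parser.feed(html)
    parser.close()
    return parser.issues


# Hypothetical sample: a data table with one unscoped header, then a layout table.
sample = """
<table summary="opening hours by branch and weekday">
  <tr><th scope="col">branch</th><th scope="col">monday</th></tr>
  <tr><th>main</th><td>9-5</td></tr>
</table>
<table><tr><td>layout only</td></tr></table>
"""
for issue in audit_tables(sample):
    print(issue)
```

a table with no header cells at all is often a layout table, which the text above recommends replacing with css.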
(l) when pages utilize scripting languages to display content, or to create interface elements, the information provided by the script shall be identified with functional text that can be read by assistive technology. scripting languages such as javascript allow for more interactive content on a page while reducing the number of times the computer screen needs to be refreshed. if functional text is not available, the screen reader attempts to read the script’s code, which outputs as a meaningless jumble of characters. using redundant text links avoids this result.

(m) when a web page requires that an applet, plug-in, or other application be present on the client system to interpret page content, the page must provide a link to a plug-in or applet that complies with [subpart b: technical standards] §1194.22(a) through (i). web developers need to ascertain whether a given plug-in or applet is accessible before requiring their webpage’s visitors to use it. when using applications such as quicktime or realaudio, it is important to provide an accessible link on the same page that will allow users to install the necessary plug-in.

(n) when electronic forms are designed to be completed on-line, the form shall allow people using assistive technology to access information, field elements, and functionality required for completion and submission of the form, including all directions and cues. if scripts used in the completion of the form are inaccessible, an alternative method of completing the form must be made immediately available. each element of a form needs to be labeled properly using the